6. Inlines
Inlines are parsed sequentially from the beginning of the character stream to the end (left to right, in left-to-right languages). Thus, for example, in
`hi`lo`
.
<p><code>hi</code>lo`</p>
hi is parsed as code, leaving the backtick at the end as a literal backtick.
6.1. Code spans
A backtick string is a string of one or more backtick characters (`) that is neither preceded nor followed by a backtick.
A code span begins with a backtick string and ends with a backtick string of equal length. The contents of the code span are the characters between these two backtick strings, normalized in the following ways:
First, [line endings] are converted to [spaces].
If the resulting string both begins and ends with a [space] character, but does not consist entirely of [space] characters, a single [space] character is removed from the front and back. This allows you to include code that begins or ends with backtick characters, which must be separated by whitespace from the opening or closing backtick strings.
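The two normalization steps above can be sketched in Python. This is a minimal illustration, not a reference implementation; the function name `normalize_code_span` is hypothetical, and the input is assumed to be the raw text found between the opening and closing backtick strings.

```python
def normalize_code_span(raw: str) -> str:
    """Normalize the raw text between two matching backtick strings."""
    # Step 1: every line ending becomes a single space.
    text = raw.replace("\r\n", " ").replace("\r", " ").replace("\n", " ")
    # Step 2: strip exactly one space from each side, but only when both
    # sides have one and the content is not made up solely of spaces.
    if (
        len(text) >= 2
        and text[0] == " "
        and text[-1] == " "
        and text.strip(" ") != ""
    ):
        text = text[1:-1]
    return text
```

Only the ASCII space is stripped, so a string padded with non-breaking spaces comes back unchanged, and interior spaces are never collapsed.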
This is a simple code span:
foo
.
foo
Here two backticks are used, because the code contains a backtick. This example also illustrates stripping of a single leading and trailing space:
foo ` bar
.
foo ` bar
This example shows the motivation for stripping leading and trailing spaces:
``
.
``
Note that only one space is stripped:
``
.
``
The stripping only happens if the space is on both sides of the string:
a
.
a
Only [spaces], and not [unicode whitespace] in general, are stripped in this way:
b
.
b
No stripping occurs if the code span contains only spaces:
.
[Line endings] are treated like spaces:
foo bar baz
.
foo bar baz
foo
.
foo
Interior spaces are not collapsed:
foo bar baz
.
foo bar baz
Note that browsers will typically collapse consecutive spaces when rendering <code> elements, so it is recommended that the following CSS be used:
code{white-space: pre-wrap;}
Note that backslash escapes do not work in code spans. All backslashes are treated literally:
foo\
bar`
.
foo\
bar`
Backslash escapes are never needed, because one can always choose a string of n backtick characters as delimiters, where the code does not contain any strings of exactly n backtick characters.
foo`bar
.
foo`bar
foo `` bar
.
foo `` bar
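The claim that a suitable delimiter length always exists can be made concrete. The sketch below picks the smallest n such that the code contains no backtick string of exactly n backticks. The helper name `wrap_in_code_span` is hypothetical, and the code is assumed to be nonempty.

```python
import re


def wrap_in_code_span(code: str) -> str:
    """Wrap code in backtick delimiters that cannot clash with its content."""
    # Lengths of every maximal run of backticks inside the code.
    run_lengths = {len(run) for run in re.findall(r"`+", code)}
    n = 1
    while n in run_lengths:
        n += 1
    fence = "`" * n
    # Pad with one space per side when the code touches a delimiter with a
    # backtick, or when its own leading and trailing spaces would be stripped.
    spaces_would_be_stripped = (
        code.startswith(" ") and code.endswith(" ") and code.strip(" ") != ""
    )
    needs_pad = code.startswith("`") or code.endswith("`") or spaces_would_be_stripped
    pad = " " if needs_pad else ""
    return f"{fence}{pad}{code}{pad}{fence}"
```

The padding relies on the rule that a single leading and trailing space is removed from the contents of a code span.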
Code span backticks have higher precedence than any other inline constructs except HTML tags and autolinks. Thus, for example, this is not parsed as emphasized text, since the second * is part of a code span:
*foo*
.
*foo*
And this is not parsed as a link:
[not a link](/foo
)
.
[not a link](/foo
)
Code spans, HTML tags, and autolinks have the same precedence. Thus, this is code:
<a href="
">`
.
<a href="
">`
But this is an HTML tag:
And this is code:
<http://foo.bar.
baz>`
.
<http://foo.bar.
baz>`
But this is an autolink:
When a backtick string is not closed by a matching backtick string, we just have literal backticks:
```foo`` .
```foo``
`foo .
`foo
The following case also illustrates the need for opening and closing backtick strings to be equal in length:
`foobar
.
`foobar
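The matching of opening and closing backtick strings can be sketched as a small scanner. This is a simplified illustration under stated assumptions: the function name `find_first_code_span` is hypothetical, backslash-escaped backticks and the competing precedence of HTML tags and autolinks are ignored, and the returned contents are not yet normalized.

```python
import re


def find_first_code_span(text: str):
    """Return (start, end, raw_contents) of the first code span, or None."""
    # Each maximal run of backticks is one backtick string.
    runs = [(m.start(), m.end()) for m in re.finditer(r"`+", text)]
    for i, (open_start, open_end) in enumerate(runs):
        length = open_end - open_start
        # The closer is the next backtick string of exactly the same length.
        for close_start, close_end in runs[i + 1:]:
            if close_end - close_start == length:
                return open_start, close_end, text[open_end:close_start]
        # No closer found: this run stays literal, so try the next opener.
    return None
```

An opener without a closer of equal length is skipped, which is why a lone leading backtick ends up as literal text.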
6.2. Emphasis and strong emphasis
John Gruber’s original Markdown syntax description says:
Markdown treats asterisks (*) and underscores (_) as indicators of emphasis. Text wrapped with one * or _ will be wrapped with an HTML <em> tag; double *’s or _’s will be wrapped with an HTML <strong> tag.
This is enough for most users, but these rules leave much undecided, especially when it comes to nested emphasis. The original Markdown.pl test suite makes it clear that triple *** and ___ delimiters can be used for strong emphasis, and most implementations have also allowed the following patterns:
***strong emph***
***strong** in emph*
***emph* in strong**
**in strong *emph***
*in emph **strong***
The following patterns are less widely supported, but the intent is clear and they are useful (especially in contexts like bibliography entries):
*emph *with emph* in it*
**strong **with strong** in it**
Many implementations have also restricted intraword emphasis to the * forms, to avoid unwanted emphasis in words containing internal underscores. (It is best practice to put these in code spans, but users often do not.)
internal emphasis: foo*bar*baz
no emphasis: foo_bar_baz
The rules given below capture all of these patterns, while allowing for efficient parsing strategies that do not backtrack.
First, some definitions. A delimiter run is either a sequence of one or more * characters that is not preceded or followed by a non-backslash-escaped * character, or a sequence of one or more _ characters that is not preceded or followed by a non-backslash-escaped _ character.
A left-flanking delimiter run is a [delimiter run] that is (1) not followed by [Unicode whitespace], and either (2a) not followed by a [Unicode punctuation character], or (2b) followed by a [Unicode punctuation character] and preceded by [Unicode whitespace] or a [Unicode punctuation character]. For purposes of this definition, the beginning and the end of the line count as Unicode whitespace.
A right-flanking delimiter run is a [delimiter run] that is (1) not preceded by [Unicode whitespace], and either (2a) not preceded by a [Unicode punctuation character], or (2b) preceded by a [Unicode punctuation character] and followed by [Unicode whitespace] or a [Unicode punctuation character]. For purposes of this definition, the beginning and the end of the line count as Unicode whitespace.
Here are some examples of delimiter runs.
left-flanking but not right-flanking:
***abc _abc **"abc" _"abc"
right-flanking but not left-flanking:
abc*** abc_ "abc"** "abc"_
Both left and right-flanking:
abc***def "abc"_"def"
Neither left nor right-flanking:
abc *** def a _ b
(The idea of distinguishing left-flanking and right-flanking delimiter runs based on the character before and the character after comes from Roopesh Chander’s vfmd. vfmd uses the terminology “emphasis indicator string” instead of “delimiter run,” and its rules for distinguishing left- and right-flanking runs are a bit more complex than the ones given here.)
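The two flanking definitions translate almost directly into code. The sketch below is illustrative only: the function names are hypothetical, Unicode whitespace is taken to be tab, line feed, form feed, carriage return, and any character in category Zs, and Unicode punctuation is approximated (as an assumption) by ASCII punctuation plus the Unicode P categories.

```python
import string
import unicodedata


def is_unicode_whitespace(ch: str) -> bool:
    # An empty string stands for the beginning or end of the line.
    return ch == "" or ch in "\t\n\f\r" or unicodedata.category(ch) == "Zs"


def is_unicode_punctuation(ch: str) -> bool:
    # Assumption: ASCII punctuation plus Unicode categories starting with P.
    return ch != "" and (
        ch in string.punctuation or unicodedata.category(ch).startswith("P")
    )


def flanking(text: str, start: int, end: int):
    """Return (left_flanking, right_flanking) for the run text[start:end]."""
    before = text[start - 1] if start > 0 else ""
    after = text[end] if end < len(text) else ""
    left = not is_unicode_whitespace(after) and (
        not is_unicode_punctuation(after)
        or is_unicode_whitespace(before)
        or is_unicode_punctuation(before)
    )
    right = not is_unicode_whitespace(before) and (
        not is_unicode_punctuation(before)
        or is_unicode_whitespace(after)
        or is_unicode_punctuation(after)
    )
    return left, right
```

Applied to the delimiter-run examples listed above, `***abc` comes out left-flanking only, `abc***def` both, and `abc *** def` neither.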
The following rules define emphasis and strong emphasis:
1. A single * character can open emphasis iff (if and only if) it is part of a [left-flanking delimiter run].
2. A single _ character [can open emphasis] iff it is part of a [left-flanking delimiter run] and either (a) not part of a [right-flanking delimiter run] or (b) part of a [right-flanking delimiter run] preceded by a [Unicode punctuation character].
3. A single * character can close emphasis iff it is part of a [right-flanking delimiter run].
4. A single _ character [can close emphasis] iff it is part of a [right-flanking delimiter run] and either (a) not part of a [left-flanking delimiter run] or (b) part of a [left-flanking delimiter run] followed by a [Unicode punctuation character].
5. A double ** can open strong emphasis iff it is part of a [left-flanking delimiter run].
6. A double __ [can open strong emphasis] iff it is part of a [left-flanking delimiter run] and either (a) not part of a [right-flanking delimiter run] or (b) part of a [right-flanking delimiter run] preceded by a [Unicode punctuation character].
7. A double ** can close strong emphasis iff it is part of a [right-flanking delimiter run].
8. A double __ [can close strong emphasis] iff it is part of a [right-flanking delimiter run] and either (a) not part of a [left-flanking delimiter run] or (b) part of a [left-flanking delimiter run] followed by a [Unicode punctuation character].
9. Emphasis begins with a delimiter that [can open emphasis] and ends with a delimiter that [can close emphasis], and that uses the same character (_ or *) as the opening delimiter. The opening and closing delimiters must belong to separate [delimiter runs]. If one of the delimiters can both open and close emphasis, then the sum of the lengths of the delimiter runs containing the opening and closing delimiters must not be a multiple of 3 unless both lengths are multiples of 3.
10. Strong emphasis begins with a delimiter that [can open strong emphasis] and ends with a delimiter that [can close strong emphasis], and that uses the same character (_ or *) as the opening delimiter. The opening and closing delimiters must belong to separate [delimiter runs]. If one of the delimiters can both open and close strong emphasis, then the sum of the lengths of the delimiter runs containing the opening and closing delimiters must not be a multiple of 3 unless both lengths are multiples of 3.
11. A literal * character cannot occur at the beginning or end of *-delimited emphasis or **-delimited strong emphasis, unless it is backslash-escaped.
12. A literal _ character cannot occur at the beginning or end of _-delimited emphasis or __-delimited strong emphasis, unless it is backslash-escaped.
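The open and close conditions, together with the multiple-of-3 restriction, can be expressed as three small predicates. This is a sketch under stated assumptions: the function names are hypothetical, and the flanking and punctuation facts about each delimiter run are assumed to have been computed already and passed in as booleans.

```python
def can_open(ch: str, left: bool, right: bool, punct_before: bool) -> bool:
    """Whether a run of ch ('*' or '_') can open (strong) emphasis."""
    if ch == "*":
        return left
    # For '_': left-flanking, and not right-flanking unless preceded by punctuation.
    return left and (not right or punct_before)


def can_close(ch: str, left: bool, right: bool, punct_after: bool) -> bool:
    """Whether a run of ch ('*' or '_') can close (strong) emphasis."""
    if ch == "*":
        return right
    # For '_': right-flanking, and not left-flanking unless followed by punctuation.
    return right and (not left or punct_after)


def lengths_allow_match(
    opener_len: int, closer_len: int, opener_both: bool, closer_both: bool
) -> bool:
    """Apply the multiple-of-3 restriction to a candidate opener/closer pair.

    opener_both / closer_both say whether that delimiter can both open and close.
    """
    if opener_both or closer_both:
        total_is_multiple = (opener_len + closer_len) % 3 == 0
        both_are_multiples = opener_len % 3 == 0 and closer_len % 3 == 0
        if total_is_multiple and not both_are_multiples:
            return False
    return True
```

For instance, the first underscore in foo_bar_ is both left- and right-flanking and is preceded by a letter, so `can_open` rejects it, while the same character preceded by punctuation is accepted.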
Where rules 1–12 above are compatible with multiple parsings, the following principles resolve ambiguity:
13. The number of nestings should be minimized. Thus, for example, an interpretation <strong>...</strong> is always preferred to <em><em>...</em></em>.
14. An interpretation <em><strong>...</strong></em> is always preferred to <strong><em>...</em></strong>.
15. When two potential emphasis or strong emphasis spans overlap, so that the second begins before the first ends and ends after the first ends, the first takes precedence. Thus, for example, *foo _bar* baz_ is parsed as <em>foo _bar</em> baz_ rather than *foo <em>bar* baz</em>.
16. When there are two potential emphasis or strong emphasis spans with the same closing delimiter, the shorter one (the one that opens later) takes precedence. Thus, for example, **foo **bar baz** is parsed as **foo <strong>bar baz</strong> rather than <strong>foo **bar baz</strong>.
17. Inline code spans, links, images, and HTML tags group more tightly than emphasis. So, when there is a choice between an interpretation that contains one of these elements and one that does not, the former always wins. Thus, for example, *[foo*](bar) is parsed as *<a href="bar">foo*</a> rather than as <em>[foo</em>](bar).
These rules can be illustrated through a series of examples.
Rule 1:
*foo bar* .
<p><em>foo bar</em></p>
This is not emphasis, because the opening * is followed by whitespace, and hence not part of a [left-flanking delimiter run]:
a * foo bar* .
a * foo bar*
This is not emphasis, because the opening * is preceded by an alphanumeric and followed by punctuation, and hence not part of a [left-flanking delimiter run]:
a*"foo"* .
a*"foo"*
Unicode nonbreaking spaces count as whitespace, too:
* a * .
* a *
Intraword emphasis with * is permitted:
foobar .
foobar
5678 .
5678
Rule 2:
_foo bar_ .
<p><em>foo bar</em></p>
This is not emphasis, because the opening _ is followed by whitespace:
_ foo bar_ .
_ foo bar_
This is not emphasis, because the opening _ is preceded by an alphanumeric and followed by punctuation:
a_"foo"_ .
a_"foo"_
Emphasis with _ is not allowed inside words:
foo_bar_ .
foo_bar_
5_6_78 .
5_6_78
пристаням_стремятся_ .
пристаням_стремятся_
Here _ does not generate emphasis, because the first delimiter run is right-flanking and the second left-flanking:
aa_"bb"_cc .
aa_"bb"_cc
This is emphasis, even though the opening delimiter is both left- and right-flanking, because it is preceded by punctuation:
foo-(bar) .
foo-(bar)
Rule 3:
This is not emphasis, because the closing delimiter does not match the opening delimiter:
_foo* .
_foo*
This is not emphasis, because the closing * is preceded by whitespace:
*foo bar * .
*foo bar *
A line ending also counts as whitespace:
*foo bar * .
*foo bar *
This is not emphasis, because the second * is preceded by punctuation and followed by an alphanumeric (hence it is not part of a [right-flanking delimiter run]):
*(*foo) .
*(*foo)
The point of this restriction is more easily appreciated with this example:
(foo) .
(foo)
Intraword emphasis with * is allowed:
foobar .
foobar
Rule 4:
This is not emphasis, because the closing _ is preceded by whitespace:
_foo bar _ .
_foo bar _
This is not emphasis, because the second _ is preceded by punctuation and followed by an alphanumeric:
_(_foo) .
_(_foo)
This is emphasis within emphasis:
(foo) .
(foo)
Intraword emphasis is disallowed for _:
_foo_bar .
_foo_bar
_пристаням_стремятся .
_пристаням_стремятся
foo_bar_baz .
foo_bar_baz
This is emphasis, even though the closing delimiter is both left- and right-flanking, because it is followed by punctuation:
(bar). .
(bar).
Rule 5:
**foo bar** .
<p><strong>foo bar</strong></p>
This is not strong emphasis, because the opening delimiter is followed by whitespace:
** foo bar** .
** foo bar**
This is not strong emphasis, because the opening ** is preceded by an alphanumeric and followed by punctuation, and hence not part of a [left-flanking delimiter run]:
a**"foo"** .
a**"foo"**
Intraword strong emphasis with ** is permitted:
foobar .
foobar
Rule 6:
__foo bar__ .
<p><strong>foo bar</strong></p>
This is not strong emphasis, because the opening delimiter is followed by whitespace:
__ foo bar__ .
__ foo bar__
A line ending counts as whitespace:
__ foo bar__ .
__ foo bar__
This is not strong emphasis, because the opening __ is preceded by an alphanumeric and followed by punctuation:
a__"foo"__ .
a__"foo"__
Intraword strong emphasis is forbidden with __:
foo__bar__ .
foo__bar__
5__6__78 .
5__6__78
пристаням__стремятся__ .
пристаням__стремятся__
foo, bar, baz .
foo, bar, baz
This is strong emphasis, even though the opening delimiter is both left- and right-flanking, because it is preceded by punctuation:
foo-(bar) .
foo-(bar)
Rule 7:
This is not strong emphasis, because the closing delimiter is preceded by whitespace:
**foo bar ** .
**foo bar **
(Nor can it be interpreted as an emphasized *foo bar *, because of Rule 11.)
This is not strong emphasis, because the second ** is preceded by punctuation and followed by an alphanumeric:
**(**foo) .
**(**foo)
The point of this restriction is more easily appreciated with these examples:
(foo) .
(foo)
Gomphocarpus (Gomphocarpus physocarpus, syn. Asclepias physocarpa) .
Gomphocarpus (Gomphocarpus physocarpus, syn. Asclepias physocarpa)
foo “bar” foo .
foo "bar" foo
Intraword emphasis:
foobar .
foobar
Rule 8:
This is not strong emphasis, because the closing delimiter is preceded by whitespace:
__foo bar __ .
__foo bar __
This is not strong emphasis, because the second __ is preceded by punctuation and followed by an alphanumeric:
__(__foo) .
__(__foo)
The point of this restriction is more easily appreciated with this example:
(foo) .
(foo)
Intraword strong emphasis is forbidden with __:
__foo__bar .
__foo__bar
__пристаням__стремятся .
__пристаням__стремятся
foo__bar__baz .
foo__bar__baz
This is strong emphasis, even though the closing delimiter is both left- and right-flanking, because it is followed by punctuation:
(bar). .
(bar).
Rule 9:
Any nonempty sequence of inline elements can be the contents of an emphasized span.
foo bar .
foo bar
foo bar .
foo bar
In particular, emphasis and strong emphasis can be nested inside emphasis:
foo bar baz .
foo bar baz
foo bar baz .
foo bar baz
foo bar .
foo bar
foo bar .
foo bar
foo bar baz .
foo bar baz
*foo**bar**baz* .
<p><em>foo<strong>bar</strong>baz</em></p>
Note that in the preceding case, the interpretation <p><em>foo</em><em>bar<em></em>baz</em></p> is precluded by the condition that a delimiter that can both open and close (like the * after foo) cannot form emphasis if the sum of the lengths of the delimiter runs containing the opening and closing delimiters is a multiple of 3 unless both lengths are multiples of 3.
For the same reason, we don’t get two consecutive emphasis sections in this example:
foo**bar .
foo**bar
The same condition ensures that the following cases are all strong emphasis nested inside emphasis, even when the interior whitespace is omitted:
foo bar .
foo bar
foo bar .
foo bar
foobar .
foobar
When the lengths of the interior closing and opening delimiter runs are both multiples of 3, though, they can match to create emphasis:
foobarbaz .
foobarbaz
foobar***baz .
foobar***baz
Indefinite levels of nesting are possible:
foo bar baz bim bop .
foo bar baz bim bop
foo bar .
foo bar
There can be no empty emphasis or strong emphasis:
** is not an empty emphasis .
** is not an empty emphasis
**** is not an empty strong emphasis .
**** is not an empty strong emphasis
Rule 10:
Any nonempty sequence of inline elements can be the contents of a strongly emphasized span.
foo bar .
foo bar
foo bar .
foo bar
In particular, emphasis and strong emphasis can be nested inside strong emphasis:
foo bar baz .
foo bar baz
foo bar baz .
foo bar baz
foo bar .
foo bar
foo bar .
foo bar
foo bar baz .
foo bar baz
foobarbaz .
foobarbaz
foo bar .
foo bar
foo bar .
foo bar
Indefinite levels of nesting are possible:
foo bar baz bim bop .
foo bar baz bim bop
foo bar .
foo bar
There can be no empty emphasis or strong emphasis:
__ is not an empty emphasis .
__ is not an empty emphasis
____ is not an empty strong emphasis .
____ is not an empty strong emphasis
Rule 11:
foo *** .
foo ***
foo * .
foo *
foo _ .
foo _
foo ***** .
foo *****
foo * .
foo *
foo _ .
foo _
Note that when delimiters do not match evenly, Rule 11 determines that the excess literal * characters will appear outside of the emphasis, rather than inside it:
*foo .
*foo
foo* .
foo*
*foo .
*foo
***foo .
***foo
foo* .
foo*
foo*** .
foo***
Rule 12:
foo ___ .
foo ___
foo _ .
foo _
foo * .
foo *
foo _____ .
foo _____
foo _ .
foo _
foo * .
foo *
_foo .
_foo
Note that when delimiters do not match evenly, Rule 12 determines that the excess literal _ characters will appear outside of the emphasis, rather than inside it:
foo_ .
foo_
_foo .
_foo
___foo .
___foo
foo_ .
foo_
foo___ .
foo___
Rule 13 implies that if you want emphasis nested directly inside emphasis, you must use different delimiters:
foo .
foo
foo .
foo
foo .
foo
foo .
foo
However, strong emphasis within strong emphasis is possible without switching delimiters:
foo .
foo
foo .
foo
Rule 13 can be applied to arbitrarily long sequences of delimiters:
foo .
foo
Rule 14:
foo .
foo
foo .
foo
Rule 15:
foo _bar baz_ .
foo _bar baz_
foo bar *baz bim bam .
foo bar *baz bim bam
Rule 16:
**foo bar baz .
**foo bar baz
*foo bar baz .
*foo bar baz
Rule 17:
*bar* .
*bar*
_foo bar_ .
_foo bar_
* .
*
** .
__ .
a *
.
a *
a _
.
a _
6.3. Links
A link contains [link text] (the visible text), a [link destination] (the URI that is the link destination), and optionally a [link title]. There are two basic kinds of links in Markdown. In [inline links] the destination and title are given immediately after the link text. In [reference links] the destination and title are defined elsewhere in the document.
A link text consists of a sequence of zero or more inline elements enclosed by square brackets ([ and ]). The following rules apply:
- Links may not contain other links, at any level of nesting. If multiple otherwise valid link definitions appear nested inside each other, the inner-most definition is used.
- Brackets are allowed in the [link text] only if (a) they are backslash-escaped or (b) they appear as a matched pair of brackets, with an open bracket [, a sequence of zero or more inlines, and a close bracket ].
- Backtick [code spans], [autolinks], and raw [HTML tags] bind more tightly than the brackets in link text. Thus, for example, [foo`]` could not be a link text, since the second ] is part of a code span.
- The brackets in link text bind more tightly than markers for [emphasis and strong emphasis]. Thus, for example, *[foo*](url) is a link.
A link destination consists of either
- a sequence of zero or more characters between an opening < and a closing > that contains no line endings or unescaped < or > characters, or
- a nonempty sequence of characters that does not start with <, does not include [ASCII control characters] or [space] characters, and includes parentheses only if (a) they are backslash-escaped or (b) they are part of a balanced pair of unescaped parentheses. (Implementations may impose limits on parentheses nesting to avoid performance issues, but at least three levels of nesting should be supported.)
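The second form of link destination (the one without pointy brackets) can be checked with a single pass that tracks parenthesis depth. This is a sketch only: the function name `is_bare_link_destination` is hypothetical, it validates a candidate string that has already been isolated rather than finding where a destination ends, and it imposes no nesting limit.

```python
import string


def is_bare_link_destination(dest: str) -> bool:
    """Check a candidate destination that is not enclosed in < and >."""
    if not dest or dest.startswith("<"):
        return False
    depth = 0
    i = 0
    while i < len(dest):
        ch = dest[i]
        # A backslash escapes a following ASCII punctuation character,
        # which includes ( and ), so escaped parentheses are not counted.
        if ch == "\\" and i + 1 < len(dest) and dest[i + 1] in string.punctuation:
            i += 2
            continue
        # Spaces and ASCII control characters are not allowed.
        if ch == " " or ord(ch) < 0x20 or ord(ch) == 0x7F:
            return False
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False  # closing parenthesis without an opener
        i += 1
    return depth == 0
```

A real parser would more likely stop scanning at the first unbalanced closing parenthesis and treat it as the end of the inline link, so this predicate is best read as a statement of the constraint rather than a parsing strategy.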
A link title consists of either
- a sequence of zero or more characters between straight double-quote characters ("), including a " character only if it is backslash-escaped, or
- a sequence of zero or more characters between straight single-quote characters ('), including a ' character only if it is backslash-escaped, or
- a sequence of zero or more characters between matching parentheses ((...)), including a ( or ) character only if it is backslash-escaped.
Although [link titles] may span multiple lines, they may not contain a [blank line].
An inline link consists of a [link text] followed immediately by a left parenthesis (, an optional [link destination], an optional [link title], and a right parenthesis ). These four components may be separated by spaces, tabs, and up to one line ending. If both [link destination] and [link title] are present, they must be separated by spaces, tabs, and up to one line ending.
The link’s text consists of the inlines contained in the [link text] (excluding the enclosing square brackets). The link’s URI consists of the link destination, excluding enclosing <...> if present, with backslash-escapes in effect as described above. The link’s title consists of the link title, excluding its enclosing delimiters, with backslash-escapes in effect as described above.
Here is a simple inline link:
link .
The title, the link text and even the destination may be omitted:
link .
link .
link .
The destination can only contain spaces if it is enclosed in pointy brackets:
[link](/my uri) .
[link](/my uri)
link .
The destination cannot contain line endings, even if enclosed in pointy brackets:
[link](foo bar) .
[link](foo bar)
[link](
[link](
The destination can contain ) if it is enclosed in pointy brackets:
a .
Pointy brackets that enclose links must be unescaped:
[link](<foo>) .
[link](<foo>)
These are not links, because the opening pointy bracket is not matched properly:
[a](<b)c [a](<b)c> [a](c) .
[a](<b)c [a](<b)c> [a](c)
Parentheses inside the link destination may be escaped:
link .
Any number of parentheses are allowed without escaping, as long as they are balanced:
link .
However, if you have unbalanced parentheses, you need to escape or use the <...> form:
[link](foo(and(bar)) .
[link](foo(and(bar))
link .
link .
Parentheses and other symbols can also be escaped, as usual in Markdown:
link .
A link can contain fragment identifiers and queries:
Note that a backslash before a non-escapable character is just a backslash:
link .
URL-escaping should be left alone inside the destination, as all URL-escaped characters are also valid URL characters. Entity and numerical character references in the destination will be parsed into the corresponding Unicode code points, as usual. These may be optionally URL-escaped when written as HTML, but this spec does not enforce any particular policy for rendering URLs in HTML or other formats. Renderers may make different decisions about how to escape or normalize URLs in the output.
link .
Note that, because titles can often be parsed as destinations, if you try to omit the destination and keep the title, you’ll get unexpected results:
link .
Titles may be in single quotes, double quotes, or parentheses:
Backslash escapes and entity and numeric character references may be used in titles:
link .
Titles must be separated from the link using spaces, tabs, and up to one line ending. Other [Unicode whitespace] like non-breaking space doesn’t work.
link .
Nested balanced quotes are not allowed without escaping:
[link](/url "title "and" title") .
[link](/url "title "and" title")
But it is easy to work around this by using a different quote type:
link .
(Note: Markdown.pl did allow double quotes inside a double-quoted title, and its test suite included a test demonstrating this. But it is hard to see a good rationale for the extra complexity this brings, since there are already many ways (backslash escaping, entity and numeric character references, or using a different quote type for the enclosing title) to write titles containing double quotes. Markdown.pl’s handling of titles has a number of other strange features. For example, it allows single-quoted titles in inline links, but not reference links. And, in reference links but not inline links, it allows a title to begin with " and end with ). Markdown.pl 1.0.1 even allows titles with no closing quotation mark, though 1.0.2b8 does not. It seems preferable to adopt a simple, rational rule that works the same way in inline links and link reference definitions.)
Spaces, tabs, and up to one line ending are allowed around the destination and title:
link .
But it is not allowed between the link text and the following parenthesis:
[link] (/uri) .
[link] (/uri)
The link text may contain balanced brackets, but not unbalanced ones, unless they are escaped:
link [foo [bar]] .
[link] bar](/uri) .
[link] bar](/uri)
[link bar .
[link bar
link [bar .
The link text may contain inline content:
link foo bar #
.
However, links may not contain other links, at any level of nesting.
[foo bar](/uri) .
[foo bar](/uri)
[foo [bar baz](/uri)](/uri) .
[foo [bar baz](/uri)](/uri)
.
These cases illustrate the precedence of link text grouping over emphasis grouping:
*foo* .
*foo*
foo *bar .
Note that brackets that aren’t part of links do not take precedence:
foo [bar baz] .
foo [bar baz]
These cases illustrate the precedence of HTML tags, code spans, and autolinks over link grouping:
[foo
[foo
[foo](/uri)
.
[foo](/uri)
There are three kinds of reference links: full, collapsed, and shortcut.
A full reference link consists of a [link text] immediately followed by a [link label] that [matches] a [link reference definition] elsewhere in the document.
A link label begins with a left bracket ([) and ends with the first right bracket (]) that is not backslash-escaped. Between these brackets there must be at least one character that is not a space, tab, or line ending. Unescaped square bracket characters are not allowed inside the opening and closing square brackets of [link labels]. A link label can have at most 999 characters inside the square brackets.
One label matches another just in case their normalized forms are equal. To normalize a label, strip off the opening and closing brackets, perform the Unicode case fold, strip leading and trailing spaces, tabs, and line endings, and collapse consecutive internal spaces, tabs, and line endings to a single space. If there are multiple matching reference link definitions, the one that comes first in the document is used. (It is desirable in such cases to emit a warning.)
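Label normalization and the first-definition-wins rule can be sketched as follows. The names `normalize_label` and `build_reference_map` are hypothetical, and the sketch assumes that Python's `str.casefold()` is an acceptable stand-in for the Unicode case fold.

```python
import re


def normalize_label(label: str) -> str:
    """Normalize a link label such as '[Foo  Bar]' for matching."""
    inner = label[1:-1]                    # strip the opening and closing brackets
    folded = inner.casefold()              # Unicode case fold
    stripped = folded.strip(" \t\r\n")     # strip leading and trailing whitespace
    # Collapse internal runs of spaces, tabs, and line endings to one space.
    return re.sub(r"[ \t\r\n]+", " ", stripped)


def build_reference_map(definitions):
    """Map normalized labels to destinations; the first definition wins."""
    refs = {}
    for label, destination in definitions:
        refs.setdefault(normalize_label(label), destination)
    return refs
```

Because matching works on the normalized string and not on parsed inline content, a label containing a backslash does not match the same label written without it.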
The link’s URI and title are provided by the matching [link reference definition].
Here is a simple example:
The rules for the [link text] are the same as with [inline links]. Thus:
The link text may contain balanced brackets, but not unbalanced ones, unless they are escaped:
The link text may contain inline content:
However, links may not contain other links, at any level of nesting.
(In the examples above, we have two [shortcut reference links] instead of one [full reference link].)
The following cases illustrate the precedence of link text grouping over emphasis grouping:
These cases illustrate the precedence of HTML tags, code spans, and autolinks over link grouping:
[foo
.
[foo
[foo][ref]
.
[foo][ref]
Matching is case-insensitive:
Unicode case fold is used:
Consecutive internal spaces, tabs, and line endings are treated as one space for purposes of determining matching:
Baz .
No spaces, tabs, or line endings are allowed between the [link text] and the [link label]:
This is a departure from John Gruber’s original Markdown syntax description, which explicitly allows whitespace between the link text and the link label. It brings reference links in line with [inline links], which (according to both original Markdown and this spec) cannot have whitespace after the link text. More importantly, it prevents inadvertent capture of consecutive [shortcut reference links]. If whitespace is allowed between the link text and the link label, then in the following we will have a single reference link, not two shortcut reference links, as intended:
[foo]
[bar]
[foo]: /url1
[bar]: /url2
(Note that [shortcut reference links] were introduced by Gruber himself in a beta version of Markdown.pl, but never included in the official syntax description. Without shortcut reference links, it is harmless to allow space between the link text and link label; but once shortcut references are introduced, it is too dangerous to allow this, as it frequently leads to unintended results.)
When there are multiple matching [link reference definitions], the first is used:
bar .
Note that matching is performed on normalized strings, not parsed inline content. So the following does not match, even though the labels define equivalent inline content:
[bar][foo!]
.
[bar][foo!]
[Link labels] cannot contain brackets, unless they are backslash-escaped:
foo[ref[]
[ref[]: /uri .
[foo][ref[]
[ref[]: /uri
[foo][refbar]
[refbar]: /uri .
[foo][ref[bar]]
[ref[bar]]: /uri
[[foo]]
[[foo]]: /url .
[[[foo]]]
[[[foo]]]: /url
Note that in this example ] is not backslash-escaped:
bar\ .
A [link label] must contain at least one character that is not a space, tab, or line ending:
[]
[]: /uri .
[]
[]: /uri
[ ]
[ ]: /uri .
[ ]
[ ]: /uri
A collapsed reference link consists of a [link label] that [matches] a [link reference definition] elsewhere in the document, followed by the string []. The contents of the first link label are parsed as inlines, which are used as the link’s text. The link’s URI and title are provided by the matching reference link definition. Thus, [foo][] is equivalent to [foo][foo].
The link labels are case-insensitive:
As with full reference links, spaces, tabs, or line endings are not allowed between the two sets of brackets:
A shortcut reference link consists of a [link label] that [matches] a [link reference definition] elsewhere in the document and is not followed by [] or a link label. The contents of the first link label are parsed as inlines, which are used as the link’s text. The link’s URI and title are provided by the matching link reference definition. Thus, [foo] is equivalent to [foo][].
The link labels are case-insensitive:
A space after the link text should be preserved:
If you just want bracketed text, you can backslash-escape the opening bracket to avoid links:
[foo]
.
[foo]
Note that this is a link, because a link label ends with the first following closing bracket:
*foo* .
*foo*
Full and collapsed references take precedence over shortcut references:
Inline links also take precedence:
In the following case [bar][baz] is parsed as a reference, [foo] as normal text:
Here, though, [foo][bar] is parsed as a reference, since [bar] is defined:
Here [foo] is not parsed as a shortcut reference, because it is followed by a link label (even though [bar] is not defined):
6.4. Images
Syntax for images is like the syntax for links, with one difference. Instead of [link text], we have an image description. The rules for this are the same as for [link text], except that (a) an image description starts with ![ rather than [, and (b) an image description may contain links.
An image description has inline elements as its contents. When an image is rendered to HTML, this is standardly used as the image’s alt attribute.
![foo](/url "title")
<p><img src="/url" alt="foo" title="title" /></p>
![foo *bar*]
[foo *bar*]: train.jpg "train & tracks"
<p><img src="train.jpg" alt="foo bar" title="train & tracks" /></p>
![foo ![bar](/url)](/url2)
<p><img src="/url2" alt="foo bar" /></p>
![foo [bar](/url)](/url2)
<p><img src="/url2" alt="foo bar" /></p>
Though this spec is concerned with parsing, not rendering, it is recommended that in rendering to HTML, only the plain string content of the [image description] be used. Note that in the above example, the alt attribute’s value is foo bar, not foo [bar](/url) or foo <a href="/url">bar</a>. Only the plain string content is rendered, without formatting.
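Extracting the plain string content of an image description amounts to a recursive walk that keeps text and discards formatting. The sketch below assumes a hypothetical inline tree in which a node is either a plain string or a `(kind, children)` tuple; real parsers use their own node types.

```python
def plain_text(nodes) -> str:
    """Concatenate the text of inline nodes, ignoring all formatting."""
    parts = []
    for node in nodes:
        if isinstance(node, str):
            parts.append(node)
        else:
            # Hypothetical node shape: ("emph", [...]), ("link", [...]), ...
            _kind, children = node
            parts.append(plain_text(children))
    return "".join(parts)
```

With this shape, a description holding the text "foo " followed by a link whose text is "bar" yields the alt text "foo bar", whether the inner element is a link, an image, or emphasis.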
![foo *bar*][]
[foo *bar*]: train.jpg "train & tracks"
<p><img src="train.jpg" alt="foo bar" title="train &amp; tracks" /></p>
![foo *bar*][foobar]
[FOOBAR]: train.jpg "train & tracks"
<p><img src="train.jpg" alt="foo bar" title="train &amp; tracks" /></p>
![foo](train.jpg)
<p><img src="train.jpg" alt="foo" /></p>
My ![foo bar](/path/to/train.jpg "title" )
<p>My <img src="/path/to/train.jpg" alt="foo bar" title="title" /></p>
![foo](<url>)
<p><img src="url" alt="foo" /></p>
![](/url)
<p><img src="/url" alt="" /></p>
Reference-style:
![foo][bar]
[bar]: /url
<p><img src="/url" alt="foo" /></p>
![foo][bar]
[BAR]: /url
<p><img src="/url" alt="foo" /></p>
Collapsed:
![foo][]
[foo]: /url "title"
<p><img src="/url" alt="foo" title="title" /></p>
![*foo* bar][]
[*foo* bar]: /url "title"
<p><img src="/url" alt="foo bar" title="title" /></p>
The labels are case-insensitive:
![Foo][]
[foo]: /url "title"
<p><img src="/url" alt="Foo" title="title" /></p>
As with reference links, spaces, tabs, and line endings are not allowed between the two sets of brackets:
![foo]
[]
[foo]: /url "title"
<p><img src="/url" alt="foo" title="title" />
[]</p>
Shortcut:
![foo]
[foo]: /url "title"
<p><img src="/url" alt="foo" title="title" /></p>
![*foo* bar]
[*foo* bar]: /url "title"
<p><img src="/url" alt="foo bar" title="title" /></p>
Note that link labels cannot contain unescaped brackets:
![[foo]]
[[foo]]: /url "title"
<p>![[foo]]</p>
<p>[[foo]]: /url "title"</p>
The link labels are case-insensitive:
![Foo]
[foo]: /url "title"
<p><img src="/url" alt="Foo" title="title" /></p>
If you just want a literal ! followed by bracketed text, you can backslash-escape the opening [:
!\[foo]
[foo]: /url "title"
<p>![foo]</p>
If you want a link after a literal !, backslash-escape the !:
\![foo]
[foo]: /url "title"
<p>!<a href="/url" title="title">foo</a></p>
6.5. Autolinks
Autolinks are absolute URIs and email addresses inside < and >. They are parsed as links, with the URL or email address as the link label.
A URI autolink consists of <, followed by an [absolute URI] followed by >. It is parsed as a link to the URI, with the URI as the link’s label.
An absolute URI, for these purposes, consists of a [scheme] followed by a colon (:) followed by zero or more characters other than [ASCII control characters][ASCII control character], [space], <, and >. If the URI includes these characters, they must be percent-encoded (e.g. %20 for a space).
For purposes of this spec, a scheme is any sequence of 2–32 characters beginning with an ASCII letter and followed by any combination of ASCII letters, digits, or the symbols plus (“+”), period (“.”), or hyphen (“-”).
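The two definitions above translate fairly directly into a regular expression. The following is a minimal sketch, not a reference implementation; the names `SCHEME`, `URI_AUTOLINK`, and `match_uri_autolink` are illustrative, and ASCII control characters are assumed to be U+0000 through U+001F plus U+007F:

```python
import re

# Scheme: one ASCII letter followed by 1-31 letters, digits, "+", ".", or "-"
# (2-32 characters in total).
SCHEME = r"[A-Za-z][A-Za-z0-9+.\-]{1,31}"

# Absolute URI: scheme, colon, then zero or more characters other than
# ASCII control characters, space, "<", and ">".
URI_AUTOLINK = re.compile(r"<(" + SCHEME + r":[^\x00-\x1f\x7f <>]*)>")


def match_uri_autolink(text):
    """Return the URI if `text` is exactly one URI autolink, else None."""
    m = URI_AUTOLINK.fullmatch(text)
    return m.group(1) if m else None
```

Note that the check is purely syntactic: it accepts unregistered schemes and does not validate the URI any further.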
Here are some valid autolinks:
<http://foo.bar.baz>
<p><a href="http://foo.bar.baz">http://foo.bar.baz</a></p>
<http://foo.bar.baz/test?q=hello&id=22&boolean>
<p><a href="http://foo.bar.baz/test?q=hello&amp;id=22&amp;boolean">http://foo.bar.baz/test?q=hello&amp;id=22&amp;boolean</a></p>
<irc://foo.bar:2233/baz>
<p><a href="irc://foo.bar:2233/baz">irc://foo.bar:2233/baz</a></p>
Uppercase is also fine:
<MAILTO:FOO@BAR.BAZ>
<p><a href="MAILTO:FOO@BAR.BAZ">MAILTO:FOO@BAR.BAZ</a></p>
Note that many strings that count as [absolute URIs] for purposes of this spec are not valid URIs, because their schemes are not registered or because of other problems with their syntax:
<a+b+c:d>
<p><a href="a+b+c:d">a+b+c:d</a></p>
<made-up-scheme://foo,bar>
<p><a href="made-up-scheme://foo,bar">made-up-scheme://foo,bar</a></p>
<http://../>
<p><a href="http://../">http://../</a></p>
<localhost:5001/foo>
<p><a href="localhost:5001/foo">localhost:5001/foo</a></p>
Spaces are not allowed in autolinks:
<http://foo.bar/baz bim>
<p>&lt;http://foo.bar/baz bim&gt;</p>
Backslash-escapes do not work inside autolinks:
<http://example.com/\[\>
<p><a href="http://example.com/%5C%5B%5C">http://example.com/\[\</a></p>
An email autolink consists of <, followed by an [email address], followed by >. The link’s label is the email address, and the URL is mailto: followed by the email address.
An email address, for these purposes, is anything that matches the non-normative regex from the HTML5 spec:
/^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?
(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/
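As a rough illustration, the regex above can be applied as follows. The helper name `email_autolink` and its return shape are assumptions made for this sketch; the pattern itself is the one quoted above, with the `^`/`$` anchors replaced by `fullmatch`:

```python
import re

# The non-normative HTML5 email pattern quoted above, split over three lines.
EMAIL = re.compile(
    r"[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+"
    r"@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?"
    r"(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*"
)


def email_autolink(text):
    """Return (url, label) if `text` is exactly one email autolink, else None."""
    if len(text) >= 2 and text[0] == "<" and text[-1] == ">":
        address = text[1:-1]
        if EMAIL.fullmatch(address):
            # The label is the address; the URL is "mailto:" plus the address.
            return ("mailto:" + address, address)
    return None
```

Because a backslash is not in the allowed character set, an address containing a backslash escape simply fails to match.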
Examples of email autolinks:
<foo@bar.example.com>
<p><a href="mailto:foo@bar.example.com">foo@bar.example.com</a></p>
<foo+special@Bar.baz-bar0.com>
<p><a href="mailto:foo+special@Bar.baz-bar0.com">foo+special@Bar.baz-bar0.com</a></p>
Backslash-escapes do not work inside email autolinks:
<foo\+@bar.example.com>
<p>&lt;foo+@bar.example.com&gt;</p>
These are not autolinks:
<>
<p>&lt;&gt;</p>
< http://foo.bar >
<p>&lt; http://foo.bar &gt;</p>
<m:abc>
<p>&lt;m:abc&gt;</p>
<foo.bar.baz>
<p>&lt;foo.bar.baz&gt;</p>
http://example.com
<p>http://example.com</p>
foo@bar.example.com
<p>foo@bar.example.com</p>
6.6. Raw HTML
Text between < and > that looks like an HTML tag is parsed as a raw HTML tag and will be rendered in HTML without escaping.
Tag and attribute names are not limited to current HTML tags,
so custom tags (and even, say, DocBook tags) may be used.
Here is the grammar for tags:
A tag name consists of an ASCII letter followed by zero or more ASCII letters, digits, or hyphens (-).
An attribute consists of spaces, tabs, and up to one line ending, an [attribute name], and an optional [attribute value specification].
An attribute name consists of an ASCII letter, _, or :, followed by zero or more ASCII letters, digits, _, ., :, or -. (Note: This is the XML specification restricted to ASCII. HTML5 is laxer.)
An attribute value specification consists of optional spaces, tabs, and up to one line ending, a = character, optional spaces, tabs, and up to one line ending, and an [attribute value].
An attribute value consists of an [unquoted attribute value], a [single-quoted attribute value], or a [double-quoted attribute value].
An unquoted attribute value is a nonempty string of characters not including spaces, tabs, line endings, ", ', =, <, >, or `.
A single-quoted attribute value consists of ', zero or more characters not including ', and a final '.
A double-quoted attribute value consists of ", zero or more characters not including ", and a final ".
An open tag consists of a < character, a [tag name], zero or more [attributes], optional spaces, tabs, and up to one line ending, an optional / character, and a > character.
A closing tag consists of the string </, a [tag name], optional spaces, tabs, and up to one line ending, and the character >.
An HTML comment consists of <!-- + text + -->, where text does not start with > or ->, does not end with -, and does not contain --. (See the HTML5 spec.)
A processing instruction consists of the string <?, a string of characters not including the string ?>, and the string ?>.
A declaration consists of the string <!, an ASCII letter, zero or more characters not including the character >, and the character >.
A CDATA section consists of the string <![CDATA[, a string of characters not including the string ]]>, and the string ]]>.
An HTML tag consists of an [open tag], a [closing tag], an [HTML comment], a [processing instruction], a [declaration], or a [CDATA section].
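The grammar above can be assembled piece by piece into one regular expression. This is a sketch under stated assumptions rather than a normative pattern: all names are illustrative, and "spaces, tabs, and up to one line ending" is approximated by a run of whitespace characters, on the assumption that the input is the inline text of a single paragraph (which cannot contain a blank line):

```python
import re

# Assumption: input is one paragraph's inline text, so a whitespace run
# contains at most one line ending.
WS = r"[ \t\r\n]"
TAG_NAME = r"[A-Za-z][A-Za-z0-9-]*"
ATTR_NAME = r"[A-Za-z_:][A-Za-z0-9_.:-]*"
UNQUOTED = r"[^ \t\r\n\"'=<>`]+"
SINGLE_QUOTED = r"'[^']*'"
DOUBLE_QUOTED = r'"[^"]*"'
ATTR_VALUE = f"(?:{UNQUOTED}|{SINGLE_QUOTED}|{DOUBLE_QUOTED})"
ATTR_VALUE_SPEC = f"(?:{WS}*={WS}*{ATTR_VALUE})"
ATTRIBUTE = f"(?:{WS}+{ATTR_NAME}{ATTR_VALUE_SPEC}?)"

OPEN_TAG = f"<{TAG_NAME}{ATTRIBUTE}*{WS}*/?>"
CLOSING_TAG = f"</{TAG_NAME}{WS}*>"
# Text must not start with ">" or "->", end with "-", or contain "--".
COMMENT = r"<!---->|<!--(?:-?[^>-])(?:-?[^-])*-->"
PROCESSING_INSTRUCTION = r"<\?.*?\?>"
DECLARATION = r"<![A-Za-z][^>]*>"
CDATA = r"<!\[CDATA\[.*?\]\]>"

HTML_TAG = re.compile(
    "|".join(
        f"(?:{part})"
        for part in (OPEN_TAG, CLOSING_TAG, COMMENT,
                     PROCESSING_INSTRUCTION, DECLARATION, CDATA)
    ),
    re.DOTALL,  # let "." cross line endings in PIs and CDATA sections
)


def raw_html_at(text, pos=0):
    """Return the HTML tag starting at `pos`, or None if there is none."""
    m = HTML_TAG.match(text, pos)
    return m.group(0) if m else None
```

A caller would try `raw_html_at` whenever it meets a `<` and, on a `None` result, treat the `<` as literal text.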
Here are some simple open tags:
<a><bab><c2c>
<p><a><bab><c2c></p>
Empty elements:
<a/><b2/>
<p><a/><b2/></p>
Whitespace is allowed:
<a /><b2
data="foo" >
<p><a /><b2
data="foo" ></p>
With attributes:
<a foo="bar" bam = 'baz <em>"</em>'
_boolean zoop:33=zoop:33 />
<p><a foo="bar" bam = 'baz <em>"</em>'
_boolean zoop:33=zoop:33 /></p>
Custom tag names can be used:
Foo <responsive-image src="foo.jpg" />
<p>Foo <responsive-image src="foo.jpg" /></p>
Illegal tag names, not parsed as HTML:
<33> <__>
<p>&lt;33&gt; &lt;__&gt;</p>
Illegal attribute names:
<a h*#ref="hi">
<p>&lt;a h*#ref=&quot;hi&quot;&gt;</p>
Illegal attribute values:
<a href="hi'> <a href=hi'>
<p>&lt;a href=&quot;hi'&gt; &lt;a href=hi'&gt;</p>
Illegal whitespace:
< a><
foo><bar/ >
<foo bar=baz
bim!bop />
<p>&lt; a&gt;&lt;
foo&gt;&lt;bar/ &gt;
&lt;foo bar=baz
bim!bop /&gt;</p>
Missing whitespace:
<a href='bar'title=title>
<p>&lt;a href='bar'title=title&gt;</p>
Closing tags:
</a></foo >
<p></a></foo ></p>
Illegal attributes in closing tag:
</a href="foo">
<p>&lt;/a href=&quot;foo&quot;&gt;</p>
Comments:
foo <!-- this is a
comment - with hyphen -->
<p>foo <!-- this is a
comment - with hyphen --></p>
foo <!-- not a comment -- two hyphens -->
<p>foo &lt;!-- not a comment -- two hyphens --&gt;</p>
Not comments:
foo <!--> foo -->
foo <!-- foo--->
<p>foo &lt;!--&gt; foo --&gt;</p>
<p>foo &lt;!-- foo---&gt;</p>
Processing instructions:
foo <?php echo $a; ?>
<p>foo <?php echo $a; ?></p>
Declarations:
foo <!ELEMENT br EMPTY>
<p>foo <!ELEMENT br EMPTY></p>
CDATA sections:
foo <![CDATA[>&<]]>
<p>foo <![CDATA[>&<]]></p>
Entity and numeric character references are preserved in HTML attributes:
foo <a href="&ouml;">
<p>foo <a href="&ouml;"></p>
Backslash escapes do not work in HTML attributes:
foo <a href="\*">
<p>foo <a href="\*"></p>
<a href="\"">
<p>&lt;a href=&quot;&quot;&quot;&gt;</p>
6.7. Hard line breaks
A line ending (not in a code span or HTML tag) that is preceded by two or more spaces and does not occur at the end of a block is parsed as a hard line break (rendered in HTML as a <br /> tag):
foo
baz
<p>foo<br />
baz</p>
For a more visible alternative, a backslash before the [line ending] may be used instead of two or more spaces:
foo\
baz
<p>foo<br />
baz</p>
More than two spaces can be used:
foo
baz
<p>foo<br />
baz</p>
Leading spaces at the beginning of the next line are ignored:
foo
bar
<p>foo<br />
bar</p>
foo\
bar
<p>foo<br />
bar</p>
Hard line breaks can occur inside emphasis, links, and other constructs that allow inline content:
*foo
bar*
<p><em>foo<br />
bar</em></p>
*foo\
bar*
<p><em>foo<br />
bar</em></p>
Hard line breaks do not occur inside code spans
`code
span`
<p><code>code span</code></p>
`code\
span`
<p><code>code\ span</code></p>
or HTML tags:
<a href="foo
bar">
<p><a href="foo
bar"></p>
<a href="foo\
bar">
<p><a href="foo\
bar"></p>
Hard line breaks are for separating inline content within a block. Neither syntax for hard line breaks works at the end of a paragraph or other block element:
foo\
<p>foo\</p>
foo
<p>foo</p>
### foo\
<h3>foo\</h3>
### foo
<h3>foo</h3>
6.8. Soft line breaks
A regular line ending (not in a code span or HTML tag) that is not preceded by two or more spaces or a backslash is parsed as a softbreak. (A soft line break may be rendered in HTML either as a [line ending] or as a space. The result will be the same in browsers. In the examples here, a [line ending] will be used.)
foo
baz
<p>foo
baz</p>
Spaces at the end of the line and beginning of the next line are removed:
foo
baz
<p>foo
baz</p>
A conforming parser may render a soft line break in HTML either as a line ending or as a space.
A renderer may also provide an option to render soft line breaks as hard line breaks.
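The rules for hard and soft line breaks can be summarized in a small classifier. This is a simplified sketch: the function name and return values are illustrative, the line is assumed not to end inside a code span or HTML tag, and a trailing backslash is treated as a break marker only when it is not itself backslash-escaped (that is, when the run of trailing backslashes has odd length):

```python
def classify_line_ending(line: str, ends_block: bool = False) -> str:
    """Classify the line ending that follows `line`.

    Returns "hard", "soft", or "none" (no break at the end of a block).
    Assumes the line ending is not inside a code span or HTML tag.
    """
    if ends_block:
        # Neither hard-break syntax works at the end of a block.
        return "none"
    if line.endswith("  "):
        # Two or more trailing spaces.
        return "hard"
    trailing_backslashes = len(line) - len(line.rstrip("\\"))
    if trailing_backslashes % 2 == 1:
        # An unescaped backslash directly before the line ending.
        return "hard"
    return "soft"
```

A renderer that offers the option mentioned above could map "soft" to the same output as "hard".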
6.9. Textual content
Any characters not given an interpretation by the above rules will be parsed as plain textual content.
hello $.;'there
<p>hello $.;'there</p>
Foo χρῆν
<p>Foo χρῆν</p>
Internal spaces are preserved verbatim:
Multiple     spaces
<p>Multiple     spaces</p>