- Commit: d4711bb865a17dcefb3b0907c0d452ef49c33c16
- Parent: ca83398c7aed70a73b010a6ce9366bac90b7c32d
- Author: John MacFarlane <jgm@berkeley.edu>
- Date
Update spec.txt.
My personal build of CMark ✏️
Update spec.txt.
1 file changed, 3432 insertions, 3422 deletions
Status | File Name | N° Changes | Insertions | Deletions |
Modified | test/spec.txt | 6854 | 3432 | 3422 |
diff --git a/test/spec.txt b/test/spec.txt @@ -326,6 +326,9 @@ A [space](@) is `U+0020`. A [non-whitespace character](@) is any character that is not a [whitespace character]. +An [ASCII control character](@) is a character between `U+0000–1F` (both +including) or `U+007F`. + An [ASCII punctuation character](@) is `!`, `"`, `#`, `$`, `%`, `&`, `'`, `(`, `)`, `*`, `+`, `,`, `-`, `.`, `/` (U+0021–2F), @@ -478,3903 +481,3653 @@ bar For security reasons, the Unicode character `U+0000` must be replaced with the REPLACEMENT CHARACTER (`U+FFFD`). -# Blocks and inlines - -We can think of a document as a sequence of -[blocks](@)---structural elements like paragraphs, block -quotations, lists, headings, rules, and code blocks. Some blocks (like -block quotes and list items) contain other blocks; others (like -headings and paragraphs) contain [inline](@) content---text, -links, emphasized text, images, code spans, and so on. -## Precedence +## Backslash escapes -Indicators of block structure always take precedence over indicators -of inline structure. So, for example, the following is a list with -two items, not a list with one item containing a code span: +Any ASCII punctuation character may be backslash-escaped: ```````````````````````````````` example -- `one -- two` +\!\"\#\$\%\&\'\(\)\*\+\,\-\.\/\:\;\<\=\>\?\@\[\\\]\^\_\`\{\|\}\~ . -<ul> -<li>`one</li> -<li>two`</li> -</ul> +<p>!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~</p> ```````````````````````````````` -This means that parsing can proceed in two steps: first, the block -structure of the document can be discerned; second, text lines inside -paragraphs, headings, and other block constructs can be parsed for inline -structure. The second step requires information about link reference -definitions that will be available only at the end of the first -step. 
Note that the first step requires processing lines in sequence, -but the second can be parallelized, since the inline parsing of -one block element does not affect the inline parsing of any other. - -## Container blocks and leaf blocks - -We can divide blocks into two types: -[container blocks](@), -which can contain other blocks, and [leaf blocks](@), -which cannot. - -# Leaf blocks +Backslashes before other characters are treated as literal +backslashes: -This section describes the different kinds of leaf block that make up a -Markdown document. +```````````````````````````````` example +\→\A\a\ \3\φ\« +. +<p>\→\A\a\ \3\φ\«</p> +```````````````````````````````` -## Thematic breaks -A line consisting of 0-3 spaces of indentation, followed by a sequence -of three or more matching `-`, `_`, or `*` characters, each followed -optionally by any number of spaces or tabs, forms a -[thematic break](@). +Escaped characters are treated as regular characters and do +not have their usual Markdown meanings: ```````````````````````````````` example -*** ---- -___ +\*not emphasized* +\<br/> not a tag +\[not a link](/foo) +\`not code` +1\. not a list +\* not a list +\# not a heading +\[foo]: /url "not a reference" +\ö not a character entity . -<hr /> -<hr /> -<hr /> +<p>*not emphasized* +<br/> not a tag +[not a link](/foo) +`not code` +1. not a list +* not a list +# not a heading +[foo]: /url "not a reference" +&ouml; not a character entity</p> ```````````````````````````````` -Wrong characters: +If a backslash is itself escaped, the following character is not: ```````````````````````````````` example -+++ +\\*emphasis* . -<p>+++</p> +<p>\<em>emphasis</em></p> ```````````````````````````````` +A backslash at the end of the line is a [hard line break]: + ```````````````````````````````` example -=== +foo\ +bar . 
-<p>===</p> +<p>foo<br /> +bar</p> ```````````````````````````````` -Not enough characters: +Backslash escapes do not work in code blocks, code spans, autolinks, or +raw HTML: ```````````````````````````````` example --- -** -__ +`` \[\` `` . -<p>-- -** -__</p> +<p><code>\[\`</code></p> ```````````````````````````````` -One to three spaces indent are allowed: - ```````````````````````````````` example - *** - *** - *** + \[\] . -<hr /> -<hr /> -<hr /> +<pre><code>\[\] +</code></pre> ```````````````````````````````` -Four spaces is too many: - ```````````````````````````````` example - *** +~~~ +\[\] +~~~ . -<pre><code>*** +<pre><code>\[\] </code></pre> ```````````````````````````````` ```````````````````````````````` example -Foo - *** +<http://example.com?find=\*> . -<p>Foo -***</p> +<p><a href="http://example.com?find=%5C*">http://example.com?find=\*</a></p> ```````````````````````````````` -More than three characters may be used: - ```````````````````````````````` example -_____________________________________ +<a href="/bar\/)"> . -<hr /> +<a href="/bar\/)"> ```````````````````````````````` -Spaces are allowed between the characters: +But they work in all other contexts, including URLs and link titles, +link references, and [info strings] in [fenced code blocks]: ```````````````````````````````` example - - - - +[foo](/bar\* "ti\*tle") . -<hr /> +<p><a href="/bar*" title="ti*tle">foo</a></p> ```````````````````````````````` ```````````````````````````````` example - ** * ** * ** * ** +[foo] + +[foo]: /bar\* "ti\*tle" . -<hr /> +<p><a href="/bar*" title="ti*tle">foo</a></p> ```````````````````````````````` ```````````````````````````````` example -- - - - +``` foo\+bar +foo +``` . -<hr /> +<pre><code class="language-foo+bar">foo +</code></pre> ```````````````````````````````` -Spaces are allowed at the end: +## Entity and numeric character references -```````````````````````````````` example -- - - - -. 
-<hr /> -```````````````````````````````` +Valid HTML entity references and numeric character references +can be used in place of the corresponding Unicode character, +with the following exceptions: +- Entity and character references are not recognized in code + blocks and code spans. -However, no other characters may occur in the line: +- Entity and character references cannot stand in place of + special characters that define structural elements in + CommonMark. For example, although `*` can be used + in place of a literal `*` character, `*` cannot replace + `*` in emphasis delimiters, bullet list markers, or thematic + breaks. -```````````````````````````````` example -_ _ _ _ a +Conforming CommonMark parsers need not store information about +whether a particular character was represented in the source +using a Unicode character or an entity reference. -a------ +[Entity references](@) consist of `&` + any of the valid +HTML5 entity names + `;`. The +document <https://html.spec.whatwg.org/entities.json> +is used as an authoritative source for the valid entity +references and their corresponding code points. ----a--- +```````````````````````````````` example + & © Æ Ď +¾ ℋ ⅆ +∲ ≧̸ . -<p>_ _ _ _ a</p> -<p>a------</p> -<p>---a---</p> +<p> & © Æ Ď +¾ ℋ ⅆ +∲ ≧̸</p> ```````````````````````````````` -It is required that all of the [non-whitespace characters] be the same. -So, this is not a thematic break: +[Decimal numeric character +references](@) +consist of `&#` + a string of 1--7 arabic digits + `;`. A +numeric character reference is parsed as the corresponding +Unicode character. Invalid Unicode code points will be replaced by +the REPLACEMENT CHARACTER (`U+FFFD`). For security reasons, +the code point `U+0000` will also be replaced by `U+FFFD`. ```````````````````````````````` example - *-* +# Ӓ Ϡ � . 
-<p><em>-</em></p> +<p># Ӓ Ϡ �</p> ```````````````````````````````` -Thematic breaks do not need blank lines before or after: +[Hexadecimal numeric character +references](@) consist of `&#` + +either `X` or `x` + a string of 1-6 hexadecimal digits + `;`. +They too are parsed as the corresponding Unicode character (this +time specified with a hexadecimal numeral instead of decimal). ```````````````````````````````` example -- foo -*** -- bar +" ആ ಫ . -<ul> -<li>foo</li> -</ul> -<hr /> -<ul> -<li>bar</li> -</ul> +<p>" ആ ಫ</p> ```````````````````````````````` -Thematic breaks can interrupt a paragraph: +Here are some nonentities: ```````````````````````````````` example -Foo -*** -bar +  &x; &#; &#x; +� +&#abcdef0; +&ThisIsNotDefined; &hi?; . -<p>Foo</p> -<hr /> -<p>bar</p> +<p>&nbsp &x; &#; &#x; +&#87654321; +&#abcdef0; +&ThisIsNotDefined; &hi?;</p> ```````````````````````````````` -If a line of dashes that meets the above conditions for being a -thematic break could also be interpreted as the underline of a [setext -heading], the interpretation as a -[setext heading] takes precedence. Thus, for example, -this is a setext heading, not a paragraph followed by a thematic break: +Although HTML5 does accept some entity references +without a trailing semicolon (such as `©`), these are not +recognized here, because it makes the grammar too ambiguous: ```````````````````````````````` example -Foo ---- -bar +© . -<h2>Foo</h2> -<p>bar</p> +<p>&copy</p> ```````````````````````````````` -When both a thematic break and a list item are possible -interpretations of a line, the thematic break takes precedence: +Strings that are not on the list of HTML5 named entities are not +recognized as entity references either: ```````````````````````````````` example -* Foo -* * * -* Bar +&MadeUpEntity; . 
-<ul> -<li>Foo</li> -</ul> -<hr /> -<ul> -<li>Bar</li> -</ul> +<p>&MadeUpEntity;</p> ```````````````````````````````` -If you want a thematic break in a list item, use a different bullet: +Entity and numeric character references are recognized in any +context besides code spans or code blocks, including +URLs, [link titles], and [fenced code block][] [info strings]: ```````````````````````````````` example -- Foo -- * * * +<a href="öö.html"> . -<ul> -<li>Foo</li> -<li> -<hr /> -</li> -</ul> +<a href="öö.html"> ```````````````````````````````` -## ATX headings - -An [ATX heading](@) -consists of a string of characters, parsed as inline content, between an -opening sequence of 1--6 unescaped `#` characters and an optional -closing sequence of any number of unescaped `#` characters. -The opening sequence of `#` characters must be followed by a -[space] or by the end of line. The optional closing sequence of `#`s must be -preceded by a [space] and may be followed by spaces only. The opening -`#` character may be indented 0-3 spaces. The raw contents of the -heading are stripped of leading and trailing spaces before being parsed -as inline content. The heading level is equal to the number of `#` -characters in the opening sequence. - -Simple headings: - ```````````````````````````````` example -# foo -## foo -### foo -#### foo -##### foo -###### foo +[foo](/föö "föö") . -<h1>foo</h1> -<h2>foo</h2> -<h3>foo</h3> -<h4>foo</h4> -<h5>foo</h5> -<h6>foo</h6> +<p><a href="/f%C3%B6%C3%B6" title="föö">foo</a></p> ```````````````````````````````` -More than six `#` characters is not a heading: - ```````````````````````````````` example -####### foo +[foo] + +[foo]: /föö "föö" . -<p>####### foo</p> +<p><a href="/f%C3%B6%C3%B6" title="föö">foo</a></p> ```````````````````````````````` -At least one space is required between the `#` characters and the -heading's contents, unless the heading is empty. Note that many -implementations currently do not require the space. 
However, the -space was required by the -[original ATX implementation](http://www.aaronsw.com/2002/atx/atx.py), -and it helps prevent things like the following from being parsed as -headings: - ```````````````````````````````` example -#5 bolt - -#hashtag +``` föö +foo +``` . -<p>#5 bolt</p> -<p>#hashtag</p> +<pre><code class="language-föö">foo +</code></pre> ```````````````````````````````` -This is not a heading, because the first `#` is escaped: +Entity and numeric character references are treated as literal +text in code spans and code blocks: ```````````````````````````````` example -\## foo +`föö` . -<p>## foo</p> +<p><code>f&ouml;&ouml;</code></p> ```````````````````````````````` -Contents are parsed as inlines: - ```````````````````````````````` example -# foo *bar* \*baz\* + föfö . -<h1>foo <em>bar</em> *baz*</h1> +<pre><code>f&ouml;f&ouml; +</code></pre> ```````````````````````````````` -Leading and trailing [whitespace] is ignored in parsing inline content: +Entity and numeric character references cannot be used +in place of symbols indicating structure in CommonMark +documents. ```````````````````````````````` example -# foo +*foo* +*foo* . -<h1>foo</h1> +<p>*foo* +<em>foo</em></p> ```````````````````````````````` - -One to three spaces indentation are allowed: - ```````````````````````````````` example - ### foo - ## foo - # foo +* foo + +* foo . -<h3>foo</h3> -<h2>foo</h2> -<h1>foo</h1> +<p>* foo</p> +<ul> +<li>foo</li> +</ul> ```````````````````````````````` +```````````````````````````````` example +foo bar +. +<p>foo -Four spaces are too much: +bar</p> +```````````````````````````````` ```````````````````````````````` example - # foo +	foo . -<pre><code># foo -</code></pre> +<p>→foo</p> ```````````````````````````````` ```````````````````````````````` example -foo - # bar +[a](url "tit") . 
-<p>foo -# bar</p> +<p>[a](url "tit")</p> ```````````````````````````````` -A closing sequence of `#` characters is optional: -```````````````````````````````` example -## foo ## - ### bar ### -. -<h2>foo</h2> -<h3>bar</h3> -```````````````````````````````` +# Blocks and inlines +We can think of a document as a sequence of +[blocks](@)---structural elements like paragraphs, block +quotations, lists, headings, rules, and code blocks. Some blocks (like +block quotes and list items) contain other blocks; others (like +headings and paragraphs) contain [inline](@) content---text, +links, emphasized text, images, code spans, and so on. -It need not be the same length as the opening sequence: +## Precedence + +Indicators of block structure always take precedence over indicators +of inline structure. So, for example, the following is a list with +two items, not a list with one item containing a code span: ```````````````````````````````` example -# foo ################################## -##### foo ## +- `one +- two` . -<h1>foo</h1> -<h5>foo</h5> +<ul> +<li>`one</li> +<li>two`</li> +</ul> ```````````````````````````````` -Spaces are allowed after the closing sequence: +This means that parsing can proceed in two steps: first, the block +structure of the document can be discerned; second, text lines inside +paragraphs, headings, and other block constructs can be parsed for inline +structure. The second step requires information about link reference +definitions that will be available only at the end of the first +step. Note that the first step requires processing lines in sequence, +but the second can be parallelized, since the inline parsing of +one block element does not affect the inline parsing of any other. + +## Container blocks and leaf blocks + +We can divide blocks into two types: +[container blocks](@), +which can contain other blocks, and [leaf blocks](@), +which cannot. 
+ +# Leaf blocks + +This section describes the different kinds of leaf block that make up a +Markdown document. + +## Thematic breaks + +A line consisting of 0-3 spaces of indentation, followed by a sequence +of three or more matching `-`, `_`, or `*` characters, each followed +optionally by any number of spaces or tabs, forms a +[thematic break](@). ```````````````````````````````` example -### foo ### +*** +--- +___ . -<h3>foo</h3> +<hr /> +<hr /> +<hr /> ```````````````````````````````` -A sequence of `#` characters with anything but [spaces] following it -is not a closing sequence, but counts as part of the contents of the -heading: +Wrong characters: ```````````````````````````````` example -### foo ### b ++++ . -<h3>foo ### b</h3> +<p>+++</p> ```````````````````````````````` -The closing sequence must be preceded by a space: - ```````````````````````````````` example -# foo# +=== . -<h1>foo#</h1> +<p>===</p> ```````````````````````````````` -Backslash-escaped `#` characters do not count as part -of the closing sequence: +Not enough characters: ```````````````````````````````` example -### foo \### -## foo #\## -# foo \# +-- +** +__ . -<h3>foo ###</h3> -<h2>foo ###</h2> -<h1>foo #</h1> +<p>-- +** +__</p> ```````````````````````````````` -ATX headings need not be separated from surrounding content by blank -lines, and they can interrupt paragraphs: +One to three spaces indent are allowed: ```````````````````````````````` example -**** -## foo -**** + *** + *** + *** . <hr /> -<h2>foo</h2> +<hr /> <hr /> ```````````````````````````````` +Four spaces is too many: + ```````````````````````````````` example -Foo bar -# baz -Bar foo + *** . -<p>Foo bar</p> -<h1>baz</h1> -<p>Bar foo</p> +<pre><code>*** +</code></pre> ```````````````````````````````` -ATX headings can be empty: - ```````````````````````````````` example -## -# -### ### +Foo + *** . 
-<h2></h2> -<h1></h1> -<h3></h3> +<p>Foo +***</p> ```````````````````````````````` -## Setext headings - -A [setext heading](@) consists of one or more -lines of text, each containing at least one [non-whitespace -character], with no more than 3 spaces indentation, followed by -a [setext heading underline]. The lines of text must be such -that, were they not followed by the setext heading underline, -they would be interpreted as a paragraph: they cannot be -interpretable as a [code fence], [ATX heading][ATX headings], -[block quote][block quotes], [thematic break][thematic breaks], -[list item][list items], or [HTML block][HTML blocks]. - -A [setext heading underline](@) is a sequence of -`=` characters or a sequence of `-` characters, with no more than 3 -spaces indentation and any number of trailing spaces. If a line -containing a single `-` can be interpreted as an -empty [list items], it should be interpreted this way -and not as a [setext heading underline]. - -The heading is a level 1 heading if `=` characters are used in -the [setext heading underline], and a level 2 heading if `-` -characters are used. The contents of the heading are the result -of parsing the preceding lines of text as CommonMark inline -content. - -In general, a setext heading need not be preceded or followed by a -blank line. However, it cannot interrupt a paragraph, so when a -setext heading comes after a paragraph, a blank line is needed between -them. - -Simple examples: +More than three characters may be used: ```````````````````````````````` example -Foo *bar* -========= - -Foo *bar* ---------- +_____________________________________ . -<h1>Foo <em>bar</em></h1> -<h2>Foo <em>bar</em></h2> +<hr /> ```````````````````````````````` -The content of the header may span more than one line: +Spaces are allowed between the characters: ```````````````````````````````` example -Foo *bar -baz* -==== + - - - . 
-<h1>Foo <em>bar -baz</em></h1> +<hr /> ```````````````````````````````` -The contents are the result of parsing the headings's raw -content as inlines. The heading's raw content is formed by -concatenating the lines and removing initial and final -[whitespace]. ```````````````````````````````` example - Foo *bar -baz*→ -==== + ** * ** * ** * ** . -<h1>Foo <em>bar -baz</em></h1> +<hr /> ```````````````````````````````` -The underlining can be any length: - ```````````````````````````````` example -Foo -------------------------- - -Foo -= +- - - - . -<h2>Foo</h2> -<h1>Foo</h1> +<hr /> ```````````````````````````````` -The heading content can be indented up to three spaces, and need -not line up with the underlining: +Spaces are allowed at the end: ```````````````````````````````` example - Foo ---- - - Foo ------ - - Foo - === +- - - - . -<h2>Foo</h2> -<h2>Foo</h2> -<h1>Foo</h1> +<hr /> ```````````````````````````````` -Four spaces indent is too much: +However, no other characters may occur in the line: ```````````````````````````````` example - Foo - --- +_ _ _ _ a - Foo ---- -. -<pre><code>Foo ---- +a------ -Foo -</code></pre> -<hr /> +---a--- +. +<p>_ _ _ _ a</p> +<p>a------</p> +<p>---a---</p> ```````````````````````````````` -The setext heading underline can be indented up to three spaces, and -may have trailing spaces: +It is required that all of the [non-whitespace characters] be the same. +So, this is not a thematic break: ```````````````````````````````` example -Foo - ---- + *-* . -<h2>Foo</h2> +<p><em>-</em></p> ```````````````````````````````` -Four spaces is too much: +Thematic breaks do not need blank lines before or after: ```````````````````````````````` example -Foo - --- +- foo +*** +- bar . 
-<p>Foo ----</p> +<ul> +<li>foo</li> +</ul> +<hr /> +<ul> +<li>bar</li> +</ul> ```````````````````````````````` -The setext heading underline cannot contain internal spaces: +Thematic breaks can interrupt a paragraph: ```````````````````````````````` example Foo -= = - -Foo ---- - +*** +bar . -<p>Foo -= =</p> <p>Foo</p> <hr /> +<p>bar</p> ```````````````````````````````` -Trailing spaces in the content line do not cause a line break: +If a line of dashes that meets the above conditions for being a +thematic break could also be interpreted as the underline of a [setext +heading], the interpretation as a +[setext heading] takes precedence. Thus, for example, +this is a setext heading, not a paragraph followed by a thematic break: ```````````````````````````````` example -Foo ------ +Foo +--- +bar . <h2>Foo</h2> +<p>bar</p> ```````````````````````````````` -Nor does a backslash at the end: +When both a thematic break and a list item are possible +interpretations of a line, the thematic break takes precedence: ```````````````````````````````` example -Foo\ ----- +* Foo +* * * +* Bar . -<h2>Foo\</h2> +<ul> +<li>Foo</li> +</ul> +<hr /> +<ul> +<li>Bar</li> +</ul> ```````````````````````````````` -Since indicators of block structure take precedence over -indicators of inline structure, the following are setext headings: +If you want a thematic break in a list item, use a different bullet: ```````````````````````````````` example -`Foo ----- -` - -<a title="a lot ---- -of dashes"/> +- Foo +- * * * . -<h2>`Foo</h2> -<p>`</p> -<h2><a title="a lot</h2> -<p>of dashes"/></p> +<ul> +<li>Foo</li> +<li> +<hr /> +</li> +</ul> ```````````````````````````````` -The setext heading underline cannot be a [lazy continuation -line] in a list item or block quote: +## ATX headings -```````````````````````````````` example -> Foo ---- -. 
-<blockquote> -<p>Foo</p> -</blockquote> -<hr /> -```````````````````````````````` +An [ATX heading](@) +consists of a string of characters, parsed as inline content, between an +opening sequence of 1--6 unescaped `#` characters and an optional +closing sequence of any number of unescaped `#` characters. +The opening sequence of `#` characters must be followed by a +[space] or by the end of line. The optional closing sequence of `#`s must be +preceded by a [space] and may be followed by spaces only. The opening +`#` character may be indented 0-3 spaces. The raw contents of the +heading are stripped of leading and trailing spaces before being parsed +as inline content. The heading level is equal to the number of `#` +characters in the opening sequence. +Simple headings: ```````````````````````````````` example -> foo -bar -=== +# foo +## foo +### foo +#### foo +##### foo +###### foo . -<blockquote> -<p>foo -bar -===</p> -</blockquote> +<h1>foo</h1> +<h2>foo</h2> +<h3>foo</h3> +<h4>foo</h4> +<h5>foo</h5> +<h6>foo</h6> ```````````````````````````````` -```````````````````````````````` example -- Foo ---- +More than six `#` characters is not a heading: + +```````````````````````````````` example +####### foo . -<ul> -<li>Foo</li> -</ul> -<hr /> +<p>####### foo</p> ```````````````````````````````` -A blank line is needed between a paragraph and a following -setext heading, since otherwise the paragraph becomes part -of the heading's content: +At least one space is required between the `#` characters and the +heading's contents, unless the heading is empty. Note that many +implementations currently do not require the space. However, the +space was required by the +[original ATX implementation](http://www.aaronsw.com/2002/atx/atx.py), +and it helps prevent things like the following from being parsed as +headings: ```````````````````````````````` example -Foo -Bar ---- +#5 bolt + +#hashtag . 
-<h2>Foo -Bar</h2> +<p>#5 bolt</p> +<p>#hashtag</p> ```````````````````````````````` -But in general a blank line is not required before or after -setext headings: +This is not a heading, because the first `#` is escaped: ```````````````````````````````` example ---- -Foo ---- -Bar ---- -Baz +\## foo . -<hr /> -<h2>Foo</h2> -<h2>Bar</h2> -<p>Baz</p> +<p>## foo</p> ```````````````````````````````` -Setext headings cannot be empty: +Contents are parsed as inlines: ```````````````````````````````` example - -==== +# foo *bar* \*baz\* . -<p>====</p> +<h1>foo <em>bar</em> *baz*</h1> ```````````````````````````````` -Setext heading text lines must not be interpretable as block -constructs other than paragraphs. So, the line of dashes -in these examples gets interpreted as a thematic break: +Leading and trailing [whitespace] is ignored in parsing inline content: ```````````````````````````````` example ---- ---- +# foo . -<hr /> -<hr /> +<h1>foo</h1> ```````````````````````````````` +One to three spaces indentation are allowed: + ```````````````````````````````` example -- foo ------ + ### foo + ## foo + # foo . -<ul> -<li>foo</li> -</ul> -<hr /> +<h3>foo</h3> +<h2>foo</h2> +<h1>foo</h1> ```````````````````````````````` +Four spaces are too much: + ```````````````````````````````` example - foo ---- + # foo . -<pre><code>foo +<pre><code># foo </code></pre> -<hr /> ```````````````````````````````` ```````````````````````````````` example -> foo ------ +foo + # bar . -<blockquote> -<p>foo</p> -</blockquote> -<hr /> +<p>foo +# bar</p> ```````````````````````````````` -If you want a heading with `> foo` as its literal text, you can -use backslash escapes: +A closing sequence of `#` characters is optional: ```````````````````````````````` example -\> foo ------- +## foo ## + ### bar ### . 
-<h2>> foo</h2> +<h2>foo</h2> +<h3>bar</h3> ```````````````````````````````` -**Compatibility note:** Most existing Markdown implementations -do not allow the text of setext headings to span multiple lines. -But there is no consensus about how to interpret - -``` markdown -Foo -bar ---- -baz -``` +It need not be the same length as the opening sequence: -One can find four different interpretations: +```````````````````````````````` example +# foo ################################## +##### foo ## +. +<h1>foo</h1> +<h5>foo</h5> +```````````````````````````````` -1. paragraph "Foo", heading "bar", paragraph "baz" -2. paragraph "Foo bar", thematic break, paragraph "baz" -3. paragraph "Foo bar --- baz" -4. heading "Foo bar", paragraph "baz" -We find interpretation 4 most natural, and interpretation 4 -increases the expressive power of CommonMark, by allowing -multiline headings. Authors who want interpretation 1 can -put a blank line after the first paragraph: +Spaces are allowed after the closing sequence: ```````````````````````````````` example -Foo - -bar ---- -baz +### foo ### . -<p>Foo</p> -<h2>bar</h2> -<p>baz</p> +<h3>foo</h3> ```````````````````````````````` -Authors who want interpretation 2 can put blank lines around -the thematic break, +A sequence of `#` characters with anything but [spaces] following it +is not a closing sequence, but counts as part of the contents of the +heading: ```````````````````````````````` example -Foo -bar - ---- - -baz +### foo ### b . -<p>Foo -bar</p> -<hr /> -<p>baz</p> +<h3>foo ### b</h3> ```````````````````````````````` -or use a thematic break that cannot count as a [setext heading -underline], such as +The closing sequence must be preceded by a space: ```````````````````````````````` example -Foo -bar -* * * -baz +# foo# . 
-<p>Foo -bar</p> -<hr /> -<p>baz</p> +<h1>foo#</h1> ```````````````````````````````` -Authors who want interpretation 3 can use backslash escapes: +Backslash-escaped `#` characters do not count as part +of the closing sequence: ```````````````````````````````` example -Foo -bar -\--- -baz +### foo \### +## foo #\## +# foo \# . -<p>Foo -bar ---- -baz</p> +<h3>foo ###</h3> +<h2>foo ###</h2> +<h1>foo #</h1> ```````````````````````````````` -## Indented code blocks - -An [indented code block](@) is composed of one or more -[indented chunks] separated by blank lines. -An [indented chunk](@) is a sequence of non-blank lines, -each indented four or more spaces. The contents of the code block are -the literal contents of the lines, including trailing -[line endings], minus four spaces of indentation. -An indented code block has no [info string]. - -An indented code block cannot interrupt a paragraph, so there must be -a blank line between a paragraph and a following indented code block. -(A blank line is not needed, however, between a code block and a following -paragraph.) +ATX headings need not be separated from surrounding content by blank +lines, and they can interrupt paragraphs: ```````````````````````````````` example - a simple - indented code block +**** +## foo +**** . -<pre><code>a simple - indented code block -</code></pre> +<hr /> +<h2>foo</h2> +<hr /> ```````````````````````````````` -If there is any ambiguity between an interpretation of indentation -as a code block and as indicating that material belongs to a [list -item][list items], the list item interpretation takes precedence: - ```````````````````````````````` example - - foo - - bar +Foo bar +# baz +Bar foo . -<ul> -<li> -<p>foo</p> -<p>bar</p> -</li> -</ul> +<p>Foo bar</p> +<h1>baz</h1> +<p>Bar foo</p> ```````````````````````````````` -```````````````````````````````` example -1. foo +ATX headings can be empty: - - bar +```````````````````````````````` example +## +# +### ### . 
-<ol> -<li> -<p>foo</p> -<ul> -<li>bar</li> -</ul> -</li> -</ol> +<h2></h2> +<h1></h1> +<h3></h3> ```````````````````````````````` +## Setext headings -The contents of a code block are literal text, and do not get parsed -as Markdown: - -```````````````````````````````` example - <a/> - *hi* +A [setext heading](@) consists of one or more +lines of text, each containing at least one [non-whitespace +character], with no more than 3 spaces indentation, followed by +a [setext heading underline]. The lines of text must be such +that, were they not followed by the setext heading underline, +they would be interpreted as a paragraph: they cannot be +interpretable as a [code fence], [ATX heading][ATX headings], +[block quote][block quotes], [thematic break][thematic breaks], +[list item][list items], or [HTML block][HTML blocks]. - - one -. -<pre><code><a/> -*hi* +A [setext heading underline](@) is a sequence of +`=` characters or a sequence of `-` characters, with no more than 3 +spaces indentation and any number of trailing spaces. If a line +containing a single `-` can be interpreted as an +empty [list items], it should be interpreted this way +and not as a [setext heading underline]. -- one -</code></pre> -```````````````````````````````` +The heading is a level 1 heading if `=` characters are used in +the [setext heading underline], and a level 2 heading if `-` +characters are used. The contents of the heading are the result +of parsing the preceding lines of text as CommonMark inline +content. +In general, a setext heading need not be preceded or followed by a +blank line. However, it cannot interrupt a paragraph, so when a +setext heading comes after a paragraph, a blank line is needed between +them. -Here we have three chunks separated by blank lines: +Simple examples: ```````````````````````````````` example - chunk1 +Foo *bar* +========= - chunk2 - - - - chunk3 +Foo *bar* +--------- . 
-<pre><code>chunk1 - -chunk2 +<h1>Foo <em>bar</em></h1> +<h2>Foo <em>bar</em></h2> +```````````````````````````````` +The content of the header may span more than one line: -chunk3 -</code></pre> +```````````````````````````````` example +Foo *bar +baz* +==== +. +<h1>Foo <em>bar +baz</em></h1> ```````````````````````````````` - -Any initial spaces beyond four will be included in the content, even -in interior blank lines: +The contents are the result of parsing the headings's raw +content as inlines. The heading's raw content is formed by +concatenating the lines and removing initial and final +[whitespace]. ```````````````````````````````` example - chunk1 - - chunk2 + Foo *bar +baz*→ +==== . -<pre><code>chunk1 - - chunk2 -</code></pre> +<h1>Foo <em>bar +baz</em></h1> ```````````````````````````````` -An indented code block cannot interrupt a paragraph. (This -allows hanging indents and the like.) +The underlining can be any length: ```````````````````````````````` example Foo - bar +------------------------- +Foo += . -<p>Foo -bar</p> +<h2>Foo</h2> +<h1>Foo</h1> ```````````````````````````````` -However, any non-blank line with fewer than four leading spaces ends -the code block immediately. So a paragraph may occur immediately -after indented code: +The heading content can be indented up to three spaces, and need +not line up with the underlining: ```````````````````````````````` example - foo -bar + Foo +--- + + Foo +----- + + Foo + === . -<pre><code>foo -</code></pre> -<p>bar</p> +<h2>Foo</h2> +<h2>Foo</h2> +<h1>Foo</h1> ```````````````````````````````` -And indented code can occur immediately before and after other kinds of -blocks: +Four spaces indent is too much: ```````````````````````````````` example -# Heading - foo -Heading ------- - foo ----- + Foo + --- + + Foo +--- . 
-<h1>Heading</h1> -<pre><code>foo -</code></pre> -<h2>Heading</h2> -<pre><code>foo +<pre><code>Foo +--- + +Foo </code></pre> <hr /> ```````````````````````````````` -The first line can be indented more than four spaces: +The setext heading underline can be indented up to three spaces, and +may have trailing spaces: ```````````````````````````````` example - foo - bar +Foo + ---- . -<pre><code> foo -bar -</code></pre> +<h2>Foo</h2> ```````````````````````````````` -Blank lines preceding or following an indented code block -are not included in it: +Four spaces is too much: ```````````````````````````````` example - - - foo - - +Foo + --- . -<pre><code>foo -</code></pre> +<p>Foo +---</p> ```````````````````````````````` -Trailing spaces are included in the code block's content: +The setext heading underline cannot contain internal spaces: ```````````````````````````````` example - foo +Foo += = + +Foo +--- - . -<pre><code>foo -</code></pre> +<p>Foo += =</p> +<p>Foo</p> +<hr /> ```````````````````````````````` +Trailing spaces in the content line do not cause a line break: -## Fenced code blocks - -A [code fence](@) is a sequence -of at least three consecutive backtick characters (`` ` ``) or -tildes (`~`). (Tildes and backticks cannot be mixed.) -A [fenced code block](@) -begins with a code fence, indented no more than three spaces. - -The line with the opening code fence may optionally contain some text -following the code fence; this is trimmed of leading and trailing -whitespace and called the [info string](@). If the [info string] comes -after a backtick fence, it may not contain any backtick -characters. (The reason for this restriction is that otherwise -some inline code would be incorrectly interpreted as the -beginning of a fenced code block.) +```````````````````````````````` example +Foo +----- +. 
+<h2>Foo</h2> +```````````````````````````````` -The content of the code block consists of all subsequent lines, until -a closing [code fence] of the same type as the code block -began with (backticks or tildes), and with at least as many backticks -or tildes as the opening code fence. If the leading code fence is -indented N spaces, then up to N spaces of indentation are removed from -each line of the content (if present). (If a content line is not -indented, it is preserved unchanged. If it is indented less than N -spaces, all of the indentation is removed.) -The closing code fence may be indented up to three spaces, and may be -followed only by spaces, which are ignored. If the end of the -containing block (or document) is reached and no closing code fence -has been found, the code block contains all of the lines after the -opening code fence until the end of the containing block (or -document). (An alternative spec would require backtracking in the -event that a closing code fence is not found. But this makes parsing -much less efficient, and there seems to be no real down side to the -behavior described here.) +Nor does a backslash at the end: -A fenced code block may interrupt a paragraph, and does not require -a blank line either before or after. +```````````````````````````````` example +Foo\ +---- +. +<h2>Foo\</h2> +```````````````````````````````` -The content of a code fence is treated as literal text, not parsed -as inlines. The first word of the [info string] is typically used to -specify the language of the code sample, and rendered in the `class` -attribute of the `code` tag. However, this spec does not mandate any -particular treatment of the [info string]. -Here is a simple example with backticks: +Since indicators of block structure take precedence over +indicators of inline structure, the following are setext headings: ```````````````````````````````` example -``` -< - > -``` +`Foo +---- +` + +<a title="a lot +--- +of dashes"/> . 
-<pre><code>< - > -</code></pre> +<h2>`Foo</h2> +<p>`</p> +<h2><a title="a lot</h2> +<p>of dashes"/></p> ```````````````````````````````` -With tildes: +The setext heading underline cannot be a [lazy continuation +line] in a list item or block quote: ```````````````````````````````` example -~~~ -< - > -~~~ +> Foo +--- . -<pre><code>< - > -</code></pre> +<blockquote> +<p>Foo</p> +</blockquote> +<hr /> ```````````````````````````````` -Fewer than three backticks is not enough: ```````````````````````````````` example -`` -foo -`` +> foo +bar +=== . -<p><code>foo</code></p> +<blockquote> +<p>foo +bar +===</p> +</blockquote> ```````````````````````````````` -The closing code fence must use the same character as the opening -fence: ```````````````````````````````` example -``` -aaa -~~~ -``` +- Foo +--- . -<pre><code>aaa -~~~ -</code></pre> +<ul> +<li>Foo</li> +</ul> +<hr /> ```````````````````````````````` +A blank line is needed between a paragraph and a following +setext heading, since otherwise the paragraph becomes part +of the heading's content: + ```````````````````````````````` example -~~~ -aaa -``` -~~~ +Foo +Bar +--- . -<pre><code>aaa -``` -</code></pre> +<h2>Foo +Bar</h2> ```````````````````````````````` -The closing code fence must be at least as long as the opening fence: +But in general a blank line is not required before or after +setext headings: ```````````````````````````````` example -```` -aaa -``` -`````` +--- +Foo +--- +Bar +--- +Baz . -<pre><code>aaa -``` -</code></pre> +<hr /> +<h2>Foo</h2> +<h2>Bar</h2> +<p>Baz</p> ```````````````````````````````` +Setext headings cannot be empty: + ```````````````````````````````` example -~~~~ -aaa -~~~ -~~~~ + +==== . 
-<pre><code>aaa -~~~ -</code></pre> +<p>====</p> ```````````````````````````````` -Unclosed code blocks are closed by the end of the document -(or the enclosing [block quote][block quotes] or [list item][list items]): +Setext heading text lines must not be interpretable as block +constructs other than paragraphs. So, the line of dashes +in these examples gets interpreted as a thematic break: ```````````````````````````````` example -``` +--- +--- . -<pre><code></code></pre> +<hr /> +<hr /> ```````````````````````````````` ```````````````````````````````` example -````` +- foo +----- +. +<ul> +<li>foo</li> +</ul> +<hr /> +```````````````````````````````` -``` -aaa + +```````````````````````````````` example + foo +--- . -<pre><code> -``` -aaa +<pre><code>foo </code></pre> +<hr /> ```````````````````````````````` ```````````````````````````````` example -> ``` -> aaa - -bbb +> foo +----- . <blockquote> -<pre><code>aaa -</code></pre> +<p>foo</p> </blockquote> -<p>bbb</p> +<hr /> ```````````````````````````````` -A code block can have all empty lines as its content: +If you want a heading with `> foo` as its literal text, you can +use backslash escapes: ```````````````````````````````` example -``` - - -``` +\> foo +------ . -<pre><code> - -</code></pre> +<h2>> foo</h2> ```````````````````````````````` -A code block can be empty: +**Compatibility note:** Most existing Markdown implementations +do not allow the text of setext headings to span multiple lines. +But there is no consensus about how to interpret -```````````````````````````````` example -``` +``` markdown +Foo +bar +--- +baz ``` -. -<pre><code></code></pre> -```````````````````````````````` +One can find four different interpretations: -Fences can be indented. If the opening fence is indented, -content lines will have equivalent opening indentation removed, -if present: +1. paragraph "Foo", heading "bar", paragraph "baz" +2. paragraph "Foo bar", thematic break, paragraph "baz" +3. 
paragraph "Foo bar --- baz" +4. heading "Foo bar", paragraph "baz" + +We find interpretation 4 most natural, and interpretation 4 +increases the expressive power of CommonMark, by allowing +multiline headings. Authors who want interpretation 1 can +put a blank line after the first paragraph: ```````````````````````````````` example - ``` - aaa -aaa -``` +Foo + +bar +--- +baz . -<pre><code>aaa -aaa -</code></pre> +<p>Foo</p> +<h2>bar</h2> +<p>baz</p> ```````````````````````````````` +Authors who want interpretation 2 can put blank lines around +the thematic break, + ```````````````````````````````` example - ``` -aaa - aaa -aaa - ``` -. -<pre><code>aaa -aaa -aaa -</code></pre> -```````````````````````````````` +Foo +bar +--- -```````````````````````````````` example - ``` - aaa - aaa - aaa - ``` +baz . -<pre><code>aaa - aaa -aaa -</code></pre> +<p>Foo +bar</p> +<hr /> +<p>baz</p> ```````````````````````````````` -Four spaces indentation produces an indented code block: +or use a thematic break that cannot count as a [setext heading +underline], such as ```````````````````````````````` example - ``` - aaa - ``` +Foo +bar +* * * +baz . -<pre><code>``` -aaa -``` -</code></pre> +<p>Foo +bar</p> +<hr /> +<p>baz</p> ```````````````````````````````` -Closing fences may be indented by 0-3 spaces, and their indentation -need not match that of the opening fence: +Authors who want interpretation 3 can use backslash escapes: ```````````````````````````````` example -``` -aaa - ``` +Foo +bar +\--- +baz . -<pre><code>aaa -</code></pre> +<p>Foo +bar +--- +baz</p> ```````````````````````````````` -```````````````````````````````` example - ``` -aaa - ``` -. -<pre><code>aaa -</code></pre> -```````````````````````````````` +## Indented code blocks +An [indented code block](@) is composed of one or more +[indented chunks] separated by blank lines. +An [indented chunk](@) is a sequence of non-blank lines, +each indented four or more spaces. 
The contents of the code block are +the literal contents of the lines, including trailing +[line endings], minus four spaces of indentation. +An indented code block has no [info string]. -This is not a closing fence, because it is indented 4 spaces: +An indented code block cannot interrupt a paragraph, so there must be +a blank line between a paragraph and a following indented code block. +(A blank line is not needed, however, between a code block and a following +paragraph.) ```````````````````````````````` example -``` -aaa - ``` + a simple + indented code block . -<pre><code>aaa - ``` +<pre><code>a simple + indented code block </code></pre> ```````````````````````````````` - -Code fences (opening and closing) cannot contain internal spaces: +If there is any ambiguity between an interpretation of indentation +as a code block and as indicating that material belongs to a [list +item][list items], the list item interpretation takes precedence: ```````````````````````````````` example -``` ``` -aaa + - foo + + bar . -<p><code> </code> -aaa</p> +<ul> +<li> +<p>foo</p> +<p>bar</p> +</li> +</ul> ```````````````````````````````` ```````````````````````````````` example -~~~~~~ -aaa -~~~ ~~ +1. foo + + - bar . -<pre><code>aaa -~~~ ~~ -</code></pre> +<ol> +<li> +<p>foo</p> +<ul> +<li>bar</li> +</ul> +</li> +</ol> ```````````````````````````````` -Fenced code blocks can interrupt paragraphs, and can be followed -directly by paragraphs, without a blank line between: + +The contents of a code block are literal text, and do not get parsed +as Markdown: ```````````````````````````````` example -foo -``` -bar -``` -baz + <a/> + *hi* + + - one . 
-<p>foo</p> -<pre><code>bar +<pre><code><a/> +*hi* + +- one </code></pre> -<p>baz</p> ```````````````````````````````` -Other blocks can also occur before and after fenced code blocks -without an intervening blank line: +Here we have three chunks separated by blank lines: ```````````````````````````````` example -foo ---- -~~~ -bar -~~~ -# baz + chunk1 + + chunk2 + + + + chunk3 . -<h2>foo</h2> -<pre><code>bar +<pre><code>chunk1 + +chunk2 + + + +chunk3 </code></pre> -<h1>baz</h1> ```````````````````````````````` -An [info string] can be provided after the opening code fence. -Although this spec doesn't mandate any particular treatment of -the info string, the first word is typically used to specify -the language of the code block. In HTML output, the language is -normally indicated by adding a class to the `code` element consisting -of `language-` followed by the language name. +Any initial spaces beyond four will be included in the content, even +in interior blank lines: ```````````````````````````````` example -```ruby -def foo(x) - return 3 -end -``` + chunk1 + + chunk2 . -<pre><code class="language-ruby">def foo(x) - return 3 -end +<pre><code>chunk1 + + chunk2 </code></pre> ```````````````````````````````` +An indented code block cannot interrupt a paragraph. (This +allows hanging indents and the like.) + ```````````````````````````````` example -~~~~ ruby startline=3 $%@#$ -def foo(x) - return 3 -end -~~~~~~~ +Foo + bar + . -<pre><code class="language-ruby">def foo(x) - return 3 -end -</code></pre> +<p>Foo +bar</p> ```````````````````````````````` +However, any non-blank line with fewer than four leading spaces ends +the code block immediately. So a paragraph may occur immediately +after indented code: + ```````````````````````````````` example -````; -```` + foo +bar . 
-<pre><code class="language-;"></code></pre> +<pre><code>foo +</code></pre> +<p>bar</p> ```````````````````````````````` -[Info strings] for backtick code blocks cannot contain backticks: +And indented code can occur immediately before and after other kinds of +blocks: ```````````````````````````````` example -``` aa ``` -foo +# Heading + foo +Heading +------ + foo +---- . -<p><code>aa</code> -foo</p> +<h1>Heading</h1> +<pre><code>foo +</code></pre> +<h2>Heading</h2> +<pre><code>foo +</code></pre> +<hr /> ```````````````````````````````` -[Info strings] for tilde code blocks can contain backticks and tildes: +The first line can be indented more than four spaces: ```````````````````````````````` example -~~~ aa ``` ~~~ -foo -~~~ + foo + bar . -<pre><code class="language-aa">foo +<pre><code> foo +bar </code></pre> ```````````````````````````````` -Closing code fences cannot have [info strings]: +Blank lines preceding or following an indented code block +are not included in it: ```````````````````````````````` example -``` -``` aaa -``` + + + foo + + . -<pre><code>``` aaa +<pre><code>foo </code></pre> ```````````````````````````````` +Trailing spaces are included in the code block's content: -## HTML blocks +```````````````````````````````` example + foo +. +<pre><code>foo +</code></pre> +```````````````````````````````` -An [HTML block](@) is a group of lines that is treated -as raw HTML (and will not be escaped in HTML output). -There are seven kinds of [HTML block], which can be defined by their -start and end conditions. The block begins with a line that meets a -[start condition](@) (after up to three spaces optional indentation). -It ends with the first subsequent line that meets a matching [end -condition](@), or the last line of the document, or the last line of -the [container block](#container-blocks) containing the current HTML -block, if no line is encountered that meets the [end condition]. 
If -the first line meets both the [start condition] and the [end -condition], the block will contain just that line. -1. **Start condition:** line begins with the string `<script`, -`<pre`, or `<style` (case-insensitive), followed by whitespace, -the string `>`, or the end of the line.\ -**End condition:** line contains an end tag -`</script>`, `</pre>`, or `</style>` (case-insensitive; it -need not match the start tag). +## Fenced code blocks -2. **Start condition:** line begins with the string `<!--`.\ -**End condition:** line contains the string `-->`. +A [code fence](@) is a sequence +of at least three consecutive backtick characters (`` ` ``) or +tildes (`~`). (Tildes and backticks cannot be mixed.) +A [fenced code block](@) +begins with a code fence, indented no more than three spaces. -3. **Start condition:** line begins with the string `<?`.\ -**End condition:** line contains the string `?>`. +The line with the opening code fence may optionally contain some text +following the code fence; this is trimmed of leading and trailing +whitespace and called the [info string](@). If the [info string] comes +after a backtick fence, it may not contain any backtick +characters. (The reason for this restriction is that otherwise +some inline code would be incorrectly interpreted as the +beginning of a fenced code block.) -4. **Start condition:** line begins with the string `<!` -followed by an uppercase ASCII letter.\ -**End condition:** line contains the character `>`. +The content of the code block consists of all subsequent lines, until +a closing [code fence] of the same type as the code block +began with (backticks or tildes), and with at least as many backticks +or tildes as the opening code fence. If the leading code fence is +indented N spaces, then up to N spaces of indentation are removed from +each line of the content (if present). (If a content line is not +indented, it is preserved unchanged. 
If it is indented less than N +spaces, all of the indentation is removed.) -5. **Start condition:** line begins with the string -`<![CDATA[`.\ -**End condition:** line contains the string `]]>`. - -6. **Start condition:** line begins the string `<` or `</` -followed by one of the strings (case-insensitive) `address`, -`article`, `aside`, `base`, `basefont`, `blockquote`, `body`, -`caption`, `center`, `col`, `colgroup`, `dd`, `details`, `dialog`, -`dir`, `div`, `dl`, `dt`, `fieldset`, `figcaption`, `figure`, -`footer`, `form`, `frame`, `frameset`, -`h1`, `h2`, `h3`, `h4`, `h5`, `h6`, `head`, `header`, `hr`, -`html`, `iframe`, `legend`, `li`, `link`, `main`, `menu`, `menuitem`, -`nav`, `noframes`, `ol`, `optgroup`, `option`, `p`, `param`, -`section`, `source`, `summary`, `table`, `tbody`, `td`, -`tfoot`, `th`, `thead`, `title`, `tr`, `track`, `ul`, followed -by [whitespace], the end of the line, the string `>`, or -the string `/>`.\ -**End condition:** line is followed by a [blank line]. +The closing code fence may be indented up to three spaces, and may be +followed only by spaces, which are ignored. If the end of the +containing block (or document) is reached and no closing code fence +has been found, the code block contains all of the lines after the +opening code fence until the end of the containing block (or +document). (An alternative spec would require backtracking in the +event that a closing code fence is not found. But this makes parsing +much less efficient, and there seems to be no real down side to the +behavior described here.) -7. **Start condition:** line begins with a complete [open tag] -(with any [tag name] other than `script`, -`style`, or `pre`) or a complete [closing tag], -followed only by [whitespace] or the end of the line.\ -**End condition:** line is followed by a [blank line]. +A fenced code block may interrupt a paragraph, and does not require +a blank line either before or after. 
-HTML blocks continue until they are closed by their appropriate -[end condition], or the last line of the document or other [container -block](#container-blocks). This means any HTML **within an HTML -block** that might otherwise be recognised as a start condition will -be ignored by the parser and passed through as-is, without changing -the parser's state. +The content of a code fence is treated as literal text, not parsed +as inlines. The first word of the [info string] is typically used to +specify the language of the code sample, and rendered in the `class` +attribute of the `code` tag. However, this spec does not mandate any +particular treatment of the [info string]. -For instance, `<pre>` within a HTML block started by `<table>` will not affect -the parser state; as the HTML block was started in by start condition 6, it -will end at any blank line. This can be surprising: +Here is a simple example with backticks: ```````````````````````````````` example -<table><tr><td> -<pre> -**Hello**, - -_world_. -</pre> -</td></tr></table> +``` +< + > +``` . -<table><tr><td> -<pre> -**Hello**, -<p><em>world</em>. -</pre></p> -</td></tr></table> +<pre><code>< + > +</code></pre> ```````````````````````````````` -In this case, the HTML block is terminated by the newline — the `**Hello**` -text remains verbatim — and regular parsing resumes, with a paragraph, -emphasised `world` and inline and block HTML following. - -All types of [HTML blocks] except type 7 may interrupt -a paragraph. Blocks of type 7 may not interrupt a paragraph. -(This restriction is intended to prevent unwanted interpretation -of long tags inside a wrapped paragraph as starting HTML blocks.) -Some simple examples follow. Here are some basic HTML blocks -of type 6: +With tildes: ```````````````````````````````` example -<table> - <tr> - <td> - hi - </td> - </tr> -</table> - -okay. +~~~ +< + > +~~~ . 
-<table> - <tr> - <td> - hi - </td> - </tr> -</table> -<p>okay.</p> +<pre><code>< + > +</code></pre> ```````````````````````````````` +Fewer than three backticks is not enough: ```````````````````````````````` example - <div> - *hello* - <foo><a> +`` +foo +`` . - <div> - *hello* - <foo><a> +<p><code>foo</code></p> ```````````````````````````````` - -A block can also start with a closing tag: +The closing code fence must use the same character as the opening +fence: ```````````````````````````````` example -</div> -*foo* +``` +aaa +~~~ +``` . -</div> -*foo* +<pre><code>aaa +~~~ +</code></pre> ```````````````````````````````` -Here we have two HTML blocks with a Markdown paragraph between them: - ```````````````````````````````` example -<DIV CLASS="foo"> - -*Markdown* - -</DIV> +~~~ +aaa +``` +~~~ . -<DIV CLASS="foo"> -<p><em>Markdown</em></p> -</DIV> +<pre><code>aaa +``` +</code></pre> ```````````````````````````````` -The tag on the first line can be partial, as long -as it is split where there would be whitespace: +The closing code fence must be at least as long as the opening fence: ```````````````````````````````` example -<div id="foo" - class="bar"> -</div> +```` +aaa +``` +`````` . -<div id="foo" - class="bar"> -</div> +<pre><code>aaa +``` +</code></pre> ```````````````````````````````` ```````````````````````````````` example -<div id="foo" class="bar - baz"> -</div> +~~~~ +aaa +~~~ +~~~~ . -<div id="foo" class="bar - baz"> -</div> +<pre><code>aaa +~~~ +</code></pre> ```````````````````````````````` -An open tag need not be closed: -```````````````````````````````` example -<div> -*foo* +Unclosed code blocks are closed by the end of the document +(or the enclosing [block quote][block quotes] or [list item][list items]): -*bar* +```````````````````````````````` example +``` . 
-<div> -*foo* -<p><em>bar</em></p> +<pre><code></code></pre> ```````````````````````````````` - -A partial tag need not even be completed (garbage -in, garbage out): - ```````````````````````````````` example -<div id="foo" -*hi* +````` + +``` +aaa . -<div id="foo" -*hi* +<pre><code> +``` +aaa +</code></pre> ```````````````````````````````` ```````````````````````````````` example -<div class -foo +> ``` +> aaa + +bbb . -<div class -foo +<blockquote> +<pre><code>aaa +</code></pre> +</blockquote> +<p>bbb</p> ```````````````````````````````` -The initial tag doesn't even need to be a valid -tag, as long as it starts like one: +A code block can have all empty lines as its content: ```````````````````````````````` example -<div *???-&&&-<--- -*foo* +``` + + +``` . -<div *???-&&&-<--- -*foo* +<pre><code> + +</code></pre> ```````````````````````````````` -In type 6 blocks, the initial tag need not be on a line by -itself: +A code block can be empty: ```````````````````````````````` example -<div><a href="bar">*foo*</a></div> +``` +``` . -<div><a href="bar">*foo*</a></div> +<pre><code></code></pre> ```````````````````````````````` +Fences can be indented. If the opening fence is indented, +content lines will have equivalent opening indentation removed, +if present: + ```````````````````````````````` example -<table><tr><td> -foo -</td></tr></table> + ``` + aaa +aaa +``` . -<table><tr><td> -foo -</td></tr></table> +<pre><code>aaa +aaa +</code></pre> ```````````````````````````````` -Everything until the next blank line or end of document -gets included in the HTML block. So, in the following -example, what looks like a Markdown code block -is actually part of the HTML block, which continues until a blank -line or the end of the document is reached: - ```````````````````````````````` example -<div></div> -``` c -int x = 33; -``` + ``` +aaa + aaa +aaa + ``` . 
-<div></div> -``` c -int x = 33; -``` +<pre><code>aaa +aaa +aaa +</code></pre> ```````````````````````````````` -To start an [HTML block] with a tag that is *not* in the -list of block-level tags in (6), you must put the tag by -itself on the first line (and it must be complete): - ```````````````````````````````` example -<a href="foo"> -*bar* -</a> + ``` + aaa + aaa + aaa + ``` . -<a href="foo"> -*bar* -</a> +<pre><code>aaa + aaa +aaa +</code></pre> ```````````````````````````````` -In type 7 blocks, the [tag name] can be anything: +Four spaces indentation produces an indented code block: ```````````````````````````````` example -<Warning> -*bar* -</Warning> + ``` + aaa + ``` . -<Warning> -*bar* -</Warning> +<pre><code>``` +aaa +``` +</code></pre> ```````````````````````````````` +Closing fences may be indented by 0-3 spaces, and their indentation +need not match that of the opening fence: + ```````````````````````````````` example -<i class="foo"> -*bar* -</i> +``` +aaa + ``` . -<i class="foo"> -*bar* -</i> +<pre><code>aaa +</code></pre> ```````````````````````````````` ```````````````````````````````` example -</ins> -*bar* + ``` +aaa + ``` . -</ins> -*bar* +<pre><code>aaa +</code></pre> ```````````````````````````````` -These rules are designed to allow us to work with tags that -can function as either block-level or inline-level tags. -The `<del>` tag is a nice example. We can surround content with -`<del>` tags in three different ways. In this case, we get a raw -HTML block, because the `<del>` tag is on a line by itself: +This is not a closing fence, because it is indented 4 spaces: ```````````````````````````````` example -<del> -*foo* -</del> +``` +aaa + ``` . -<del> -*foo* -</del> +<pre><code>aaa + ``` +</code></pre> ```````````````````````````````` -In this case, we get a raw HTML block that just includes -the `<del>` tag (because it ends with the following blank -line). 
So the contents get interpreted as CommonMark: + +Code fences (opening and closing) cannot contain internal spaces: ```````````````````````````````` example -<del> +``` ``` +aaa +. +<p><code> </code> +aaa</p> +```````````````````````````````` -*foo* -</del> +```````````````````````````````` example +~~~~~~ +aaa +~~~ ~~ . -<del> -<p><em>foo</em></p> -</del> +<pre><code>aaa +~~~ ~~ +</code></pre> ```````````````````````````````` -Finally, in this case, the `<del>` tags are interpreted -as [raw HTML] *inside* the CommonMark paragraph. (Because -the tag is not on a line by itself, we get inline HTML -rather than an [HTML block].) +Fenced code blocks can interrupt paragraphs, and can be followed +directly by paragraphs, without a blank line between: ```````````````````````````````` example -<del>*foo*</del> +foo +``` +bar +``` +baz . -<p><del><em>foo</em></del></p> +<p>foo</p> +<pre><code>bar +</code></pre> +<p>baz</p> ```````````````````````````````` -HTML tags designed to contain literal content -(`script`, `style`, `pre`), comments, processing instructions, -and declarations are treated somewhat differently. -Instead of ending at the first blank line, these blocks -end at the first line containing a corresponding end tag. -As a result, these blocks can contain blank lines: - -A pre tag (type 1): +Other blocks can also occur before and after fenced code blocks +without an intervening blank line: ```````````````````````````````` example -<pre language="haskell"><code> -import Text.HTML.TagSoup - -main :: IO () -main = print $ parseTags tags -</code></pre> -okay +foo +--- +~~~ +bar +~~~ +# baz . -<pre language="haskell"><code> -import Text.HTML.TagSoup - -main :: IO () -main = print $ parseTags tags +<h2>foo</h2> +<pre><code>bar </code></pre> -<p>okay</p> +<h1>baz</h1> ```````````````````````````````` -A script tag (type 1): +An [info string] can be provided after the opening code fence. 
+Although this spec doesn't mandate any particular treatment of +the info string, the first word is typically used to specify +the language of the code block. In HTML output, the language is +normally indicated by adding a class to the `code` element consisting +of `language-` followed by the language name. ```````````````````````````````` example -<script type="text/javascript"> -// JavaScript example - -document.getElementById("demo").innerHTML = "Hello JavaScript!"; -</script> -okay +```ruby +def foo(x) + return 3 +end +``` . -<script type="text/javascript"> -// JavaScript example - -document.getElementById("demo").innerHTML = "Hello JavaScript!"; -</script> -<p>okay</p> +<pre><code class="language-ruby">def foo(x) + return 3 +end +</code></pre> ```````````````````````````````` -A style tag (type 1): - ```````````````````````````````` example -<style - type="text/css"> -h1 {color:red;} - -p {color:blue;} -</style> -okay +~~~~ ruby startline=3 $%@#$ +def foo(x) + return 3 +end +~~~~~~~ . -<style - type="text/css"> -h1 {color:red;} - -p {color:blue;} -</style> -<p>okay</p> +<pre><code class="language-ruby">def foo(x) + return 3 +end +</code></pre> ```````````````````````````````` -If there is no matching end tag, the block will end at the -end of the document (or the enclosing [block quote][block quotes] -or [list item][list items]): - ```````````````````````````````` example -<style - type="text/css"> - -foo +````; +```` . -<style - type="text/css"> - -foo +<pre><code class="language-;"></code></pre> ```````````````````````````````` -```````````````````````````````` example -> <div> -> foo +[Info strings] for backtick code blocks cannot contain backticks: -bar -. -<blockquote> -<div> +```````````````````````````````` example +``` aa ``` foo -</blockquote> -<p>bar</p> +. 
+<p><code>aa</code> +foo</p> ```````````````````````````````` +[Info strings] for tilde code blocks can contain backticks and tildes: + ```````````````````````````````` example -- <div> -- foo +~~~ aa ``` ~~~ +foo +~~~ . -<ul> -<li> -<div> -</li> -<li>foo</li> -</ul> +<pre><code class="language-aa">foo +</code></pre> ```````````````````````````````` -The end tag can occur on the same line as the start tag: +Closing code fences cannot have [info strings]: ```````````````````````````````` example -<style>p{color:red;}</style> -*foo* +``` +``` aaa +``` . -<style>p{color:red;}</style> -<p><em>foo</em></p> +<pre><code>``` aaa +</code></pre> ```````````````````````````````` -```````````````````````````````` example -<!-- foo -->*bar* -*baz* -. -<!-- foo -->*bar* -<p><em>baz</em></p> -```````````````````````````````` +## HTML blocks -Note that anything on the last line after the -end tag will be included in the [HTML block]: +An [HTML block](@) is a group of lines that is treated +as raw HTML (and will not be escaped in HTML output). -```````````````````````````````` example -<script> -foo -</script>1. *bar* -. -<script> -foo -</script>1. *bar* -```````````````````````````````` +There are seven kinds of [HTML block], which can be defined by their +start and end conditions. The block begins with a line that meets a +[start condition](@) (after up to three spaces optional indentation). +It ends with the first subsequent line that meets a matching [end +condition](@), or the last line of the document, or the last line of +the [container block](#container-blocks) containing the current HTML +block, if no line is encountered that meets the [end condition]. If +the first line meets both the [start condition] and the [end +condition], the block will contain just that line. +1. 
**Start condition:** line begins with the string `<script`, +`<pre`, or `<style` (case-insensitive), followed by whitespace, +the string `>`, or the end of the line.\ +**End condition:** line contains an end tag +`</script>`, `</pre>`, or `</style>` (case-insensitive; it +need not match the start tag). -A comment (type 2): +2. **Start condition:** line begins with the string `<!--`.\ +**End condition:** line contains the string `-->`. -```````````````````````````````` example -<!-- Foo +3. **Start condition:** line begins with the string `<?`.\ +**End condition:** line contains the string `?>`. -bar - baz --> -okay -. -<!-- Foo +4. **Start condition:** line begins with the string `<!` +followed by an ASCII letter.\ +**End condition:** line contains the character `>`. -bar - baz --> -<p>okay</p> -```````````````````````````````` +5. **Start condition:** line begins with the string +`<![CDATA[`.\ +**End condition:** line contains the string `]]>`. + +6. **Start condition:** line begins the string `<` or `</` +followed by one of the strings (case-insensitive) `address`, +`article`, `aside`, `base`, `basefont`, `blockquote`, `body`, +`caption`, `center`, `col`, `colgroup`, `dd`, `details`, `dialog`, +`dir`, `div`, `dl`, `dt`, `fieldset`, `figcaption`, `figure`, +`footer`, `form`, `frame`, `frameset`, +`h1`, `h2`, `h3`, `h4`, `h5`, `h6`, `head`, `header`, `hr`, +`html`, `iframe`, `legend`, `li`, `link`, `main`, `menu`, `menuitem`, +`nav`, `noframes`, `ol`, `optgroup`, `option`, `p`, `param`, +`section`, `source`, `summary`, `table`, `tbody`, `td`, +`tfoot`, `th`, `thead`, `title`, `tr`, `track`, `ul`, followed +by [whitespace], the end of the line, the string `>`, or +the string `/>`.\ +**End condition:** line is followed by a [blank line]. +7. 
**Start condition:** line begins with a complete [open tag] +(with any [tag name] other than `script`, +`style`, or `pre`) or a complete [closing tag], +followed only by [whitespace] or the end of the line.\ +**End condition:** line is followed by a [blank line]. +HTML blocks continue until they are closed by their appropriate +[end condition], or the last line of the document or other [container +block](#container-blocks). This means any HTML **within an HTML +block** that might otherwise be recognised as a start condition will +be ignored by the parser and passed through as-is, without changing +the parser's state. -A processing instruction (type 3): +For instance, `<pre>` within a HTML block started by `<table>` will not affect +the parser state; as the HTML block was started in by start condition 6, it +will end at any blank line. This can be surprising: ```````````````````````````````` example -<?php - - echo '>'; +<table><tr><td> +<pre> +**Hello**, -?> -okay +_world_. +</pre> +</td></tr></table> . -<?php - - echo '>'; - -?> -<p>okay</p> +<table><tr><td> +<pre> +**Hello**, +<p><em>world</em>. +</pre></p> +</td></tr></table> ```````````````````````````````` +In this case, the HTML block is terminated by the newline — the `**Hello**` +text remains verbatim — and regular parsing resumes, with a paragraph, +emphasised `world` and inline and block HTML following. -A declaration (type 4): +All types of [HTML blocks] except type 7 may interrupt +a paragraph. Blocks of type 7 may not interrupt a paragraph. +(This restriction is intended to prevent unwanted interpretation +of long tags inside a wrapped paragraph as starting HTML blocks.) + +Some simple examples follow. Here are some basic HTML blocks +of type 6: ```````````````````````````````` example -<!DOCTYPE html> +<table> + <tr> + <td> + hi + </td> + </tr> +</table> + +okay. . 
-<!DOCTYPE html> +<table> + <tr> + <td> + hi + </td> + </tr> +</table> +<p>okay.</p> ```````````````````````````````` -CDATA (type 5): - ```````````````````````````````` example -<![CDATA[ -function matchwo(a,b) -{ - if (a < b && a < 0) then { - return 1; - - } else { - - return 0; - } -} -]]> -okay + <div> + *hello* + <foo><a> . -<![CDATA[ -function matchwo(a,b) -{ - if (a < b && a < 0) then { - return 1; - - } else { - - return 0; - } -} -]]> -<p>okay</p> + <div> + *hello* + <foo><a> ```````````````````````````````` -The opening tag can be indented 1-3 spaces, but not 4: +A block can also start with a closing tag: ```````````````````````````````` example - <!-- foo --> - - <!-- foo --> +</div> +*foo* . - <!-- foo --> -<pre><code><!-- foo --> -</code></pre> +</div> +*foo* ```````````````````````````````` +Here we have two HTML blocks with a Markdown paragraph between them: + ```````````````````````````````` example - <div> +<DIV CLASS="foo"> - <div> +*Markdown* + +</DIV> . - <div> -<pre><code><div> -</code></pre> +<DIV CLASS="foo"> +<p><em>Markdown</em></p> +</DIV> ```````````````````````````````` -An HTML block of types 1--6 can interrupt a paragraph, and need not be -preceded by a blank line. +The tag on the first line can be partial, as long +as it is split where there would be whitespace: ```````````````````````````````` example -Foo -<div> -bar +<div id="foo" + class="bar"> </div> . -<p>Foo</p> -<div> -bar +<div id="foo" + class="bar"> </div> ```````````````````````````````` -However, a following blank line is needed, except at the end of -a document, and except for blocks of types 1--5, [above][HTML -block]: - ```````````````````````````````` example -<div> -bar +<div id="foo" class="bar + baz"> </div> -*foo* . 
-<div> -bar +<div id="foo" class="bar + baz"> </div> -*foo* ```````````````````````````````` -HTML blocks of type 7 cannot interrupt a paragraph: - +An open tag need not be closed: ```````````````````````````````` example -Foo -<a href="bar"> -baz +<div> +*foo* + +*bar* . -<p>Foo -<a href="bar"> -baz</p> +<div> +*foo* +<p><em>bar</em></p> ```````````````````````````````` -This rule differs from John Gruber's original Markdown syntax -specification, which says: - -> The only restrictions are that block-level HTML elements — -> e.g. `<div>`, `<table>`, `<pre>`, `<p>`, etc. — must be separated from -> surrounding content by blank lines, and the start and end tags of the -> block should not be indented with tabs or spaces. - -In some ways Gruber's rule is more restrictive than the one given -here: - -- It requires that an HTML block be preceded by a blank line. -- It does not allow the start tag to be indented. -- It requires a matching end tag, which it also does not allow to - be indented. - -Most Markdown implementations (including some of Gruber's own) do not -respect all of these restrictions. - -There is one respect, however, in which Gruber's rule is more liberal -than the one given here, since it allows blank lines to occur inside -an HTML block. There are two reasons for disallowing them here. -First, it removes the need to parse balanced tags, which is -expensive and can require backtracking from the end of the document -if no matching end tag is found. Second, it provides a very simple -and flexible way of including Markdown content inside HTML tags: -simply separate the Markdown from the HTML using blank lines: -Compare: +A partial tag need not even be completed (garbage +in, garbage out): ```````````````````````````````` example -<div> - -*Emphasized* text. - -</div> +<div id="foo" +*hi* . 
-<div> -<p><em>Emphasized</em> text.</p> -</div> +<div id="foo" +*hi* ```````````````````````````````` ```````````````````````````````` example -<div> -*Emphasized* text. -</div> +<div class +foo . -<div> -*Emphasized* text. -</div> +<div class +foo ```````````````````````````````` -Some Markdown implementations have adopted a convention of -interpreting content inside tags as text if the open tag has -the attribute `markdown=1`. The rule given above seems a simpler and -more elegant way of achieving the same expressive power, which is also -much simpler to parse. - -The main potential drawback is that one can no longer paste HTML -blocks into Markdown documents with 100% reliability. However, -*in most cases* this will work fine, because the blank lines in -HTML are usually followed by HTML block tags. For example: +The initial tag doesn't even need to be a valid +tag, as long as it starts like one: ```````````````````````````````` example -<table> - -<tr> - -<td> -Hi -</td> - -</tr> - -</table> +<div *???-&&&-<--- +*foo* . -<table> -<tr> -<td> -Hi -</td> -</tr> -</table> +<div *???-&&&-<--- +*foo* ```````````````````````````````` -There are problems, however, if the inner tags are indented -*and* separated by spaces, as then they will be interpreted as -an indented code block: +In type 6 blocks, the initial tag need not be on a line by +itself: ```````````````````````````````` example -<table> - - <tr> - - <td> - Hi - </td> +<div><a href="bar">*foo*</a></div> +. +<div><a href="bar">*foo*</a></div> +```````````````````````````````` - </tr> -</table> +```````````````````````````````` example +<table><tr><td> +foo +</td></tr></table> . -<table> - <tr> -<pre><code><td> - Hi -</td> -</code></pre> - </tr> -</table> +<table><tr><td> +foo +</td></tr></table> ```````````````````````````````` -Fortunately, blank lines are usually not necessary and can be -deleted. 
The exception is inside `<pre>` tags, but as described -[above][HTML blocks], raw HTML blocks starting with `<pre>` -*can* contain blank lines. +Everything until the next blank line or end of document +gets included in the HTML block. So, in the following +example, what looks like a Markdown code block +is actually part of the HTML block, which continues until a blank +line or the end of the document is reached: -## Link reference definitions +```````````````````````````````` example +<div></div> +``` c +int x = 33; +``` +. +<div></div> +``` c +int x = 33; +``` +```````````````````````````````` -A [link reference definition](@) -consists of a [link label], indented up to three spaces, followed -by a colon (`:`), optional [whitespace] (including up to one -[line ending]), a [link destination], -optional [whitespace] (including up to one -[line ending]), and an optional [link -title], which if it is present must be separated -from the [link destination] by [whitespace]. -No further [non-whitespace characters] may occur on the line. -A [link reference definition] -does not correspond to a structural element of a document. Instead, it -defines a label which can be used in [reference links] -and reference-style [images] elsewhere in the document. [Link -reference definitions] can come either before or after the links that use -them. +To start an [HTML block] with a tag that is *not* in the +list of block-level tags in (6), you must put the tag by +itself on the first line (and it must be complete): ```````````````````````````````` example -[foo]: /url "title" - -[foo] +<a href="foo"> +*bar* +</a> . -<p><a href="/url" title="title">foo</a></p> +<a href="foo"> +*bar* +</a> ```````````````````````````````` -```````````````````````````````` example - [foo]: - /url - 'the title' +In type 7 blocks, the [tag name] can be anything: -[foo] +```````````````````````````````` example +<Warning> +*bar* +</Warning> . 
-<p><a href="/url" title="the title">foo</a></p> +<Warning> +*bar* +</Warning> ```````````````````````````````` ```````````````````````````````` example -[Foo*bar\]]:my_(url) 'title (with parens)' - -[Foo*bar\]] +<i class="foo"> +*bar* +</i> . -<p><a href="my_(url)" title="title (with parens)">Foo*bar]</a></p> +<i class="foo"> +*bar* +</i> ```````````````````````````````` ```````````````````````````````` example -[Foo bar]: -<my url> -'title' - -[Foo bar] +</ins> +*bar* . -<p><a href="my%20url" title="title">Foo bar</a></p> +</ins> +*bar* ```````````````````````````````` -The title may extend over multiple lines: +These rules are designed to allow us to work with tags that +can function as either block-level or inline-level tags. +The `<del>` tag is a nice example. We can surround content with +`<del>` tags in three different ways. In this case, we get a raw +HTML block, because the `<del>` tag is on a line by itself: ```````````````````````````````` example -[foo]: /url ' -title -line1 -line2 -' - -[foo] +<del> +*foo* +</del> . -<p><a href="/url" title=" -title -line1 -line2 -">foo</a></p> +<del> +*foo* +</del> ```````````````````````````````` -However, it may not contain a [blank line]: +In this case, we get a raw HTML block that just includes +the `<del>` tag (because it ends with the following blank +line). So the contents get interpreted as CommonMark: ```````````````````````````````` example -[foo]: /url 'title +<del> -with blank line' +*foo* -[foo] +</del> . -<p>[foo]: /url 'title</p> -<p>with blank line'</p> -<p>[foo]</p> +<del> +<p><em>foo</em></p> +</del> ```````````````````````````````` -The title may be omitted: - -```````````````````````````````` example -[foo]: -/url +Finally, in this case, the `<del>` tags are interpreted +as [raw HTML] *inside* the CommonMark paragraph. (Because +the tag is not on a line by itself, we get inline HTML +rather than an [HTML block].) -[foo] +```````````````````````````````` example +<del>*foo*</del> . 
-<p><a href="/url">foo</a></p> +<p><del><em>foo</em></del></p> ```````````````````````````````` -The link destination may not be omitted: +HTML tags designed to contain literal content +(`script`, `style`, `pre`), comments, processing instructions, +and declarations are treated somewhat differently. +Instead of ending at the first blank line, these blocks +end at the first line containing a corresponding end tag. +As a result, these blocks can contain blank lines: + +A pre tag (type 1): ```````````````````````````````` example -[foo]: +<pre language="haskell"><code> +import Text.HTML.TagSoup -[foo] +main :: IO () +main = print $ parseTags tags +</code></pre> +okay . -<p>[foo]:</p> -<p>[foo]</p> +<pre language="haskell"><code> +import Text.HTML.TagSoup + +main :: IO () +main = print $ parseTags tags +</code></pre> +<p>okay</p> ```````````````````````````````` - However, an empty link destination may be specified using - angle brackets: + +A script tag (type 1): ```````````````````````````````` example -[foo]: <> +<script type="text/javascript"> +// JavaScript example -[foo] +document.getElementById("demo").innerHTML = "Hello JavaScript!"; +</script> +okay . -<p><a href="">foo</a></p> +<script type="text/javascript"> +// JavaScript example + +document.getElementById("demo").innerHTML = "Hello JavaScript!"; +</script> +<p>okay</p> ```````````````````````````````` -The title must be separated from the link destination by -whitespace: + +A style tag (type 1): ```````````````````````````````` example -[foo]: <bar>(baz) +<style + type="text/css"> +h1 {color:red;} -[foo] +p {color:blue;} +</style> +okay . 
-<p>[foo]: <bar>(baz)</p> -<p>[foo]</p> +<style + type="text/css"> +h1 {color:red;} + +p {color:blue;} +</style> +<p>okay</p> ```````````````````````````````` -Both title and destination can contain backslash escapes -and literal backslashes: +If there is no matching end tag, the block will end at the +end of the document (or the enclosing [block quote][block quotes] +or [list item][list items]): ```````````````````````````````` example -[foo]: /url\bar\*baz "foo\"bar\baz" +<style + type="text/css"> -[foo] +foo . -<p><a href="/url%5Cbar*baz" title="foo"bar\baz">foo</a></p> -```````````````````````````````` +<style + type="text/css"> +foo +```````````````````````````````` -A link can come before its corresponding definition: ```````````````````````````````` example -[foo] +> <div> +> foo -[foo]: url +bar . -<p><a href="url">foo</a></p> +<blockquote> +<div> +foo +</blockquote> +<p>bar</p> ```````````````````````````````` -If there are several matching definitions, the first one takes -precedence: - ```````````````````````````````` example -[foo] - -[foo]: first -[foo]: second +- <div> +- foo . -<p><a href="first">foo</a></p> +<ul> +<li> +<div> +</li> +<li>foo</li> +</ul> ```````````````````````````````` -As noted in the section on [Links], matching of labels is -case-insensitive (see [matches]). +The end tag can occur on the same line as the start tag: ```````````````````````````````` example -[FOO]: /url - -[Foo] +<style>p{color:red;}</style> +*foo* . -<p><a href="/url">Foo</a></p> +<style>p{color:red;}</style> +<p><em>foo</em></p> ```````````````````````````````` ```````````````````````````````` example -[ΑΓΩ]: /φου - -[αγω] +<!-- foo -->*bar* +*baz* . -<p><a href="/%CF%86%CE%BF%CF%85">αγω</a></p> +<!-- foo -->*bar* +<p><em>baz</em></p> ```````````````````````````````` -Here is a link reference definition with no corresponding link. -It contributes nothing to the document. 
+Note that anything on the last line after the +end tag will be included in the [HTML block]: ```````````````````````````````` example -[foo]: /url +<script> +foo +</script>1. *bar* . +<script> +foo +</script>1. *bar* ```````````````````````````````` -Here is another one: +A comment (type 2): ```````````````````````````````` example -[ -foo -]: /url +<!-- Foo + bar + baz --> +okay . -<p>bar</p> +<!-- Foo + +bar + baz --> +<p>okay</p> ```````````````````````````````` -This is not a link reference definition, because there are -[non-whitespace characters] after the title: + +A processing instruction (type 3): ```````````````````````````````` example -[foo]: /url "title" ok +<?php + + echo '>'; + +?> +okay . -<p>[foo]: /url "title" ok</p> +<?php + + echo '>'; + +?> +<p>okay</p> ```````````````````````````````` -This is a link reference definition, but it has no title: +A declaration (type 4): ```````````````````````````````` example -[foo]: /url -"title" ok +<!DOCTYPE html> . -<p>"title" ok</p> +<!DOCTYPE html> ```````````````````````````````` -This is not a link reference definition, because it is indented -four spaces: +CDATA (type 5): ```````````````````````````````` example - [foo]: /url "title" +<![CDATA[ +function matchwo(a,b) +{ + if (a < b && a < 0) then { + return 1; -[foo] + } else { + + return 0; + } +} +]]> +okay . -<pre><code>[foo]: /url "title" -</code></pre> -<p>[foo]</p> +<![CDATA[ +function matchwo(a,b) +{ + if (a < b && a < 0) then { + return 1; + + } else { + + return 0; + } +} +]]> +<p>okay</p> ```````````````````````````````` -This is not a link reference definition, because it occurs inside -a code block: +The opening tag can be indented 1-3 spaces, but not 4: ```````````````````````````````` example -``` -[foo]: /url -``` + <!-- foo --> -[foo] + <!-- foo --> . -<pre><code>[foo]: /url + <!-- foo --> +<pre><code><!-- foo --> </code></pre> -<p>[foo]</p> ```````````````````````````````` -A [link reference definition] cannot interrupt a paragraph. 
- ```````````````````````````````` example -Foo -[bar]: /baz + <div> -[bar] + <div> . -<p>Foo -[bar]: /baz</p> -<p>[bar]</p> + <div> +<pre><code><div> +</code></pre> ```````````````````````````````` -However, it can directly follow other block elements, such as headings -and thematic breaks, and it need not be followed by a blank line. +An HTML block of types 1--6 can interrupt a paragraph, and need not be +preceded by a blank line. ```````````````````````````````` example -# [Foo] -[foo]: /url -> bar -. -<h1><a href="/url">Foo</a></h1> -<blockquote> -<p>bar</p> -</blockquote> +Foo +<div> +bar +</div> +. +<p>Foo</p> +<div> +bar +</div> ```````````````````````````````` + +However, a following blank line is needed, except at the end of +a document, and except for blocks of types 1--5, [above][HTML +block]: + ```````````````````````````````` example -[foo]: /url +<div> bar -=== -[foo] +</div> +*foo* . -<h1>bar</h1> -<p><a href="/url">foo</a></p> +<div> +bar +</div> +*foo* ```````````````````````````````` + +HTML blocks of type 7 cannot interrupt a paragraph: + ```````````````````````````````` example -[foo]: /url -=== -[foo] +Foo +<a href="bar"> +baz . -<p>=== -<a href="/url">foo</a></p> +<p>Foo +<a href="bar"> +baz</p> ```````````````````````````````` -Several [link reference definitions] -can occur one after another, without intervening blank lines. +This rule differs from John Gruber's original Markdown syntax +specification, which says: -```````````````````````````````` example -[foo]: /foo-url "foo" -[bar]: /bar-url - "bar" -[baz]: /baz-url +> The only restrictions are that block-level HTML elements — +> e.g. `<div>`, `<table>`, `<pre>`, `<p>`, etc. — must be separated from +> surrounding content by blank lines, and the start and end tags of the +> block should not be indented with tabs or spaces. -[foo], -[bar], -[baz] -. 
-<p><a href="/foo-url" title="foo">foo</a>, -<a href="/bar-url" title="bar">bar</a>, -<a href="/baz-url">baz</a></p> -```````````````````````````````` +In some ways Gruber's rule is more restrictive than the one given +here: +- It requires that an HTML block be preceded by a blank line. +- It does not allow the start tag to be indented. +- It requires a matching end tag, which it also does not allow to + be indented. -[Link reference definitions] can occur -inside block containers, like lists and block quotations. They -affect the entire document, not just the container in which they -are defined: +Most Markdown implementations (including some of Gruber's own) do not +respect all of these restrictions. + +There is one respect, however, in which Gruber's rule is more liberal +than the one given here, since it allows blank lines to occur inside +an HTML block. There are two reasons for disallowing them here. +First, it removes the need to parse balanced tags, which is +expensive and can require backtracking from the end of the document +if no matching end tag is found. Second, it provides a very simple +and flexible way of including Markdown content inside HTML tags: +simply separate the Markdown from the HTML using blank lines: + +Compare: ```````````````````````````````` example -[foo] +<div> -> [foo]: /url +*Emphasized* text. + +</div> . -<p><a href="/url">foo</a></p> -<blockquote> -</blockquote> +<div> +<p><em>Emphasized</em> text.</p> +</div> ```````````````````````````````` -Whether something is a [link reference definition] is -independent of whether the link reference it defines is -used in the document. Thus, for example, the following -document contains just a link reference definition, and -no visible content: - ```````````````````````````````` example -[foo]: /url +<div> +*Emphasized* text. +</div> . +<div> +*Emphasized* text. 
+</div> ```````````````````````````````` -## Paragraphs - -A sequence of non-blank lines that cannot be interpreted as other -kinds of blocks forms a [paragraph](@). -The contents of the paragraph are the result of parsing the -paragraph's raw content as inlines. The paragraph's raw content -is formed by concatenating the lines and removing initial and final -[whitespace]. +Some Markdown implementations have adopted a convention of +interpreting content inside tags as text if the open tag has +the attribute `markdown=1`. The rule given above seems a simpler and +more elegant way of achieving the same expressive power, which is also +much simpler to parse. -A simple example with two paragraphs: +The main potential drawback is that one can no longer paste HTML +blocks into Markdown documents with 100% reliability. However, +*in most cases* this will work fine, because the blank lines in +HTML are usually followed by HTML block tags. For example: ```````````````````````````````` example -aaa - -bbb -. -<p>aaa</p> -<p>bbb</p> -```````````````````````````````` +<table> +<tr> -Paragraphs can contain multiple lines, but no blank lines: +<td> +Hi +</td> -```````````````````````````````` example -aaa -bbb +</tr> -ccc -ddd +</table> . -<p>aaa -bbb</p> -<p>ccc -ddd</p> +<table> +<tr> +<td> +Hi +</td> +</tr> +</table> ```````````````````````````````` -Multiple blank lines between paragraph have no effect: +There are problems, however, if the inner tags are indented +*and* separated by spaces, as then they will be interpreted as +an indented code block: ```````````````````````````````` example -aaa +<table> + <tr> -bbb + <td> + Hi + </td> + + </tr> + +</table> . -<p>aaa</p> -<p>bbb</p> +<table> + <tr> +<pre><code><td> + Hi +</td> +</code></pre> + </tr> +</table> ```````````````````````````````` -Leading spaces are skipped: +Fortunately, blank lines are usually not necessary and can be +deleted. 
The exception is inside `<pre>` tags, but as described +[above][HTML blocks], raw HTML blocks starting with `<pre>` +*can* contain blank lines. + +## Link reference definitions + +A [link reference definition](@) +consists of a [link label], indented up to three spaces, followed +by a colon (`:`), optional [whitespace] (including up to one +[line ending]), a [link destination], +optional [whitespace] (including up to one +[line ending]), and an optional [link +title], which if it is present must be separated +from the [link destination] by [whitespace]. +No further [non-whitespace characters] may occur on the line. + +A [link reference definition] +does not correspond to a structural element of a document. Instead, it +defines a label which can be used in [reference links] +and reference-style [images] elsewhere in the document. [Link +reference definitions] can come either before or after the links that use +them. ```````````````````````````````` example - aaa - bbb +[foo]: /url "title" + +[foo] . -<p>aaa -bbb</p> +<p><a href="/url" title="title">foo</a></p> ```````````````````````````````` -Lines after the first may be indented any amount, since indented -code blocks cannot interrupt paragraphs. - ```````````````````````````````` example -aaa - bbb - ccc + [foo]: + /url + 'the title' + +[foo] . -<p>aaa -bbb -ccc</p> +<p><a href="/url" title="the title">foo</a></p> ```````````````````````````````` -However, the first line may be indented at most three spaces, -or an indented code block will be triggered: - ```````````````````````````````` example - aaa -bbb +[Foo*bar\]]:my_(url) 'title (with parens)' + +[Foo*bar\]] . -<p>aaa -bbb</p> +<p><a href="my_(url)" title="title (with parens)">Foo*bar]</a></p> ```````````````````````````````` ```````````````````````````````` example - aaa -bbb +[Foo bar]: +<my url> +'title' + +[Foo bar] . 
-<pre><code>aaa -</code></pre> -<p>bbb</p> +<p><a href="my%20url" title="title">Foo bar</a></p> ```````````````````````````````` -Final spaces are stripped before inline parsing, so a paragraph -that ends with two or more spaces will not end with a [hard line -break]: +The title may extend over multiple lines: ```````````````````````````````` example -aaa -bbb +[foo]: /url ' +title +line1 +line2 +' + +[foo] . -<p>aaa<br /> -bbb</p> +<p><a href="/url" title=" +title +line1 +line2 +">foo</a></p> ```````````````````````````````` -## Blank lines - -[Blank lines] between block-level elements are ignored, -except for the role they play in determining whether a [list] -is [tight] or [loose]. - -Blank lines at the beginning and end of the document are also ignored. +However, it may not contain a [blank line]: ```````````````````````````````` example - - -aaa - +[foo]: /url 'title -# aaa +with blank line' - +[foo] . -<p>aaa</p> -<h1>aaa</h1> +<p>[foo]: /url 'title</p> +<p>with blank line'</p> +<p>[foo]</p> ```````````````````````````````` +The title may be omitted: -# Container blocks - -A [container block](#container-blocks) is a block that has other -blocks as its contents. There are two basic kinds of container blocks: -[block quotes] and [list items]. -[Lists] are meta-containers for [list items]. - -We define the syntax for container blocks recursively. The general -form of the definition is: - -> If X is a sequence of blocks, then the result of -> transforming X in such-and-such a way is a container of type Y -> with these blocks as its content. +```````````````````````````````` example +[foo]: +/url -So, we explain what counts as a block quote or list item by explaining -how these can be *generated* from their contents. This should suffice -to define the syntax, although it does not give a recipe for *parsing* -these constructions. (A recipe is provided below in the section entitled -[A parsing strategy](#appendix-a-parsing-strategy).) +[foo] +. 
+<p><a href="/url">foo</a></p> +```````````````````````````````` -## Block quotes -A [block quote marker](@) -consists of 0-3 spaces of initial indent, plus (a) the character `>` together -with a following space, or (b) a single character `>` not followed by a space. +The link destination may not be omitted: -The following rules define [block quotes]: +```````````````````````````````` example +[foo]: -1. **Basic case.** If a string of lines *Ls* constitute a sequence - of blocks *Bs*, then the result of prepending a [block quote - marker] to the beginning of each line in *Ls* - is a [block quote](#block-quotes) containing *Bs*. +[foo] +. +<p>[foo]:</p> +<p>[foo]</p> +```````````````````````````````` -2. **Laziness.** If a string of lines *Ls* constitute a [block - quote](#block-quotes) with contents *Bs*, then the result of deleting - the initial [block quote marker] from one or - more lines in which the next [non-whitespace character] after the [block - quote marker] is [paragraph continuation - text] is a block quote with *Bs* as its content. - [Paragraph continuation text](@) is text - that will be parsed as part of the content of a paragraph, but does - not occur at the beginning of the paragraph. + However, an empty link destination may be specified using + angle brackets: -3. **Consecutiveness.** A document cannot contain two [block - quotes] in a row unless there is a [blank line] between them. +```````````````````````````````` example +[foo]: <> -Nothing else counts as a [block quote](#block-quotes). +[foo] +. +<p><a href="">foo</a></p> +```````````````````````````````` -Here is a simple example: +The title must be separated from the link destination by +whitespace: ```````````````````````````````` example -> # Foo -> bar -> baz +[foo]: <bar>(baz) + +[foo] . 
-<blockquote> -<h1>Foo</h1> -<p>bar -baz</p> -</blockquote> +<p>[foo]: <bar>(baz)</p> +<p>[foo]</p> ```````````````````````````````` -The spaces after the `>` characters can be omitted: +Both title and destination can contain backslash escapes +and literal backslashes: ```````````````````````````````` example -># Foo ->bar -> baz +[foo]: /url\bar\*baz "foo\"bar\baz" + +[foo] . -<blockquote> -<h1>Foo</h1> -<p>bar -baz</p> -</blockquote> +<p><a href="/url%5Cbar*baz" title="foo"bar\baz">foo</a></p> ```````````````````````````````` -The `>` characters can be indented 1-3 spaces: +A link can come before its corresponding definition: ```````````````````````````````` example - > # Foo - > bar - > baz +[foo] + +[foo]: url . -<blockquote> -<h1>Foo</h1> -<p>bar -baz</p> -</blockquote> +<p><a href="url">foo</a></p> ```````````````````````````````` -Four spaces gives us a code block: +If there are several matching definitions, the first one takes +precedence: ```````````````````````````````` example - > # Foo - > bar - > baz +[foo] + +[foo]: first +[foo]: second . -<pre><code>> # Foo -> bar -> baz -</code></pre> +<p><a href="first">foo</a></p> ```````````````````````````````` -The Laziness clause allows us to omit the `>` before -[paragraph continuation text]: +As noted in the section on [Links], matching of labels is +case-insensitive (see [matches]). ```````````````````````````````` example -> # Foo -> bar -baz +[FOO]: /url + +[Foo] . -<blockquote> -<h1>Foo</h1> -<p>bar -baz</p> -</blockquote> +<p><a href="/url">Foo</a></p> ```````````````````````````````` -A block quote can contain some lazy and some non-lazy -continuation lines: - ```````````````````````````````` example -> bar -baz -> foo +[ΑΓΩ]: /φου + +[αγω] . -<blockquote> -<p>bar -baz -foo</p> -</blockquote> +<p><a href="/%CF%86%CE%BF%CF%85">αγω</a></p> ```````````````````````````````` -Laziness only applies to lines that would have been continuations of -paragraphs had they been prepended with [block quote markers]. 
-For example, the `> ` cannot be omitted in the second line of - -``` markdown -> foo -> --- -``` - -without changing the meaning: +Here is a link reference definition with no corresponding link. +It contributes nothing to the document. ```````````````````````````````` example -> foo ---- +[foo]: /url . -<blockquote> -<p>foo</p> -</blockquote> -<hr /> ```````````````````````````````` -Similarly, if we omit the `> ` in the second line of - -``` markdown -> - foo -> - bar -``` - -then the block quote ends after the first line: +Here is another one: ```````````````````````````````` example -> - foo -- bar +[ +foo +]: /url +bar . -<blockquote> -<ul> -<li>foo</li> -</ul> -</blockquote> -<ul> -<li>bar</li> -</ul> +<p>bar</p> ```````````````````````````````` -For the same reason, we can't omit the `> ` in front of -subsequent lines of an indented or fenced code block: +This is not a link reference definition, because there are +[non-whitespace characters] after the title: ```````````````````````````````` example -> foo - bar +[foo]: /url "title" ok . -<blockquote> -<pre><code>foo -</code></pre> -</blockquote> -<pre><code>bar -</code></pre> +<p>[foo]: /url "title" ok</p> ```````````````````````````````` +This is a link reference definition, but it has no title: + ```````````````````````````````` example -> ``` -foo -``` +[foo]: /url +"title" ok . -<blockquote> -<pre><code></code></pre> -</blockquote> -<p>foo</p> -<pre><code></code></pre> +<p>"title" ok</p> ```````````````````````````````` -Note that in the following case, we have a [lazy -continuation line]: +This is not a link reference definition, because it is indented +four spaces: ```````````````````````````````` example -> foo - - bar + [foo]: /url "title" + +[foo] . 
-<blockquote> -<p>foo -- bar</p> -</blockquote> +<pre><code>[foo]: /url "title" +</code></pre> +<p>[foo]</p> ```````````````````````````````` -To see why, note that in +This is not a link reference definition, because it occurs inside +a code block: -```markdown -> foo -> - bar +```````````````````````````````` example +``` +[foo]: /url ``` -the `- bar` is indented too far to start a list, and can't -be an indented code block because indented code blocks cannot -interrupt paragraphs, so it is [paragraph continuation text]. +[foo] +. +<pre><code>[foo]: /url +</code></pre> +<p>[foo]</p> +```````````````````````````````` -A block quote can be empty: + +A [link reference definition] cannot interrupt a paragraph. ```````````````````````````````` example -> +Foo +[bar]: /baz + +[bar] . -<blockquote> -</blockquote> +<p>Foo +[bar]: /baz</p> +<p>[bar]</p> ```````````````````````````````` +However, it can directly follow other block elements, such as headings +and thematic breaks, and it need not be followed by a blank line. + ```````````````````````````````` example -> -> -> +# [Foo] +[foo]: /url +> bar . +<h1><a href="/url">Foo</a></h1> <blockquote> +<p>bar</p> </blockquote> ```````````````````````````````` - -A block quote can have initial or final blank lines: +```````````````````````````````` example +[foo]: /url +bar +=== +[foo] +. +<h1>bar</h1> +<p><a href="/url">foo</a></p> +```````````````````````````````` ```````````````````````````````` example -> -> foo -> +[foo]: /url +=== +[foo] . -<blockquote> -<p>foo</p> -</blockquote> +<p>=== +<a href="/url">foo</a></p> ```````````````````````````````` -A blank line always separates block quotes: +Several [link reference definitions] +can occur one after another, without intervening blank lines. ```````````````````````````````` example -> foo +[foo]: /foo-url "foo" +[bar]: /bar-url + "bar" +[baz]: /baz-url -> bar +[foo], +[bar], +[baz] . 
-<blockquote> -<p>foo</p> -</blockquote> -<blockquote> -<p>bar</p> -</blockquote> +<p><a href="/foo-url" title="foo">foo</a>, +<a href="/bar-url" title="bar">bar</a>, +<a href="/baz-url">baz</a></p> ```````````````````````````````` -(Most current Markdown implementations, including John Gruber's -original `Markdown.pl`, will parse this example as a single block quote -with two paragraphs. But it seems better to allow the author to decide -whether two block quotes or one are wanted.) - -Consecutiveness means that if we put these block quotes together, -we get a single block quote: +[Link reference definitions] can occur +inside block containers, like lists and block quotations. They +affect the entire document, not just the container in which they +are defined: ```````````````````````````````` example -> foo -> bar +[foo] + +> [foo]: /url . +<p><a href="/url">foo</a></p> <blockquote> -<p>foo -bar</p> </blockquote> ```````````````````````````````` -To get a block quote with two paragraphs, use: +Whether something is a [link reference definition] is +independent of whether the link reference it defines is +used in the document. Thus, for example, the following +document contains just a link reference definition, and +no visible content: ```````````````````````````````` example -> foo -> -> bar +[foo]: /url . -<blockquote> -<p>foo</p> -<p>bar</p> -</blockquote> ```````````````````````````````` -Block quotes can interrupt paragraphs: +## Paragraphs + +A sequence of non-blank lines that cannot be interpreted as other +kinds of blocks forms a [paragraph](@). +The contents of the paragraph are the result of parsing the +paragraph's raw content as inlines. The paragraph's raw content +is formed by concatenating the lines and removing initial and final +[whitespace]. + +A simple example with two paragraphs: ```````````````````````````````` example -foo -> bar +aaa + +bbb . 
-<p>foo</p> -<blockquote> -<p>bar</p> -</blockquote> +<p>aaa</p> +<p>bbb</p> ```````````````````````````````` -In general, blank lines are not needed before or after block -quotes: +Paragraphs can contain multiple lines, but no blank lines: ```````````````````````````````` example -> aaa -*** -> bbb +aaa +bbb + +ccc +ddd . -<blockquote> -<p>aaa</p> -</blockquote> -<hr /> -<blockquote> -<p>bbb</p> -</blockquote> +<p>aaa +bbb</p> +<p>ccc +ddd</p> ```````````````````````````````` -However, because of laziness, a blank line is needed between -a block quote and a following paragraph: +Multiple blank lines between paragraphs have no effect: ```````````````````````````````` example -> bar -baz +aaa + + +bbb . -<blockquote> -<p>bar -baz</p> -</blockquote> +<p>aaa</p> +<p>bbb</p> ```````````````````````````````` -```````````````````````````````` example -> bar +Leading spaces are skipped: -baz +```````````````````````````````` example + aaa + bbb . -<blockquote> -<p>bar</p> -</blockquote> -<p>baz</p> +<p>aaa +bbb</p> ```````````````````````````````` +Lines after the first may be indented any amount, since indented +code blocks cannot interrupt paragraphs. + ```````````````````````````````` example -> bar -> -baz +aaa + bbb + ccc . -<blockquote> -<p>bar</p> -</blockquote> -<p>baz</p> +<p>aaa +bbb +ccc</p> ```````````````````````````````` -It is a consequence of the Laziness rule that any number -of initial `>`s may be omitted on a continuation line of a -nested block quote: +However, the first line may be indented at most three spaces, +or an indented code block will be triggered: ```````````````````````````````` example -> > > foo -bar + aaa +bbb . -<blockquote> -<blockquote> -<blockquote> -<p>foo -bar</p> -</blockquote> -</blockquote> -</blockquote> +<p>aaa +bbb</p> ```````````````````````````````` ```````````````````````````````` example ->>> foo -> bar ->>baz + aaa +bbb . 
-<blockquote> -<blockquote> -<blockquote> -<p>foo -bar -baz</p> -</blockquote> -</blockquote> -</blockquote> +<pre><code>aaa +</code></pre> +<p>bbb</p> ```````````````````````````````` -When including an indented code block in a block quote, -remember that the [block quote marker] includes -both the `>` and a following space. So *five spaces* are needed after -the `>`: +Final spaces are stripped before inline parsing, so a paragraph +that ends with two or more spaces will not end with a [hard line +break]: ```````````````````````````````` example -> code - -> not code +aaa +bbb . -<blockquote> -<pre><code>code -</code></pre> -</blockquote> -<blockquote> -<p>not code</p> -</blockquote> +<p>aaa<br /> +bbb</p> ```````````````````````````````` +## Blank lines -## List items +[Blank lines] between block-level elements are ignored, +except for the role they play in determining whether a [list] +is [tight] or [loose]. -A [list marker](@) is a -[bullet list marker] or an [ordered list marker]. +Blank lines at the beginning and end of the document are also ignored. -A [bullet list marker](@) -is a `-`, `+`, or `*` character. +```````````````````````````````` example + -An [ordered list marker](@) -is a sequence of 1--9 arabic digits (`0-9`), followed by either a -`.` character or a `)` character. (The reason for the length -limit is that with 10 digits we start seeing integer overflows -in some browsers.) +aaa + -The following rules define [list items]: +# aaa -1. **Basic case.** If a sequence of lines *Ls* constitute a sequence of - blocks *Bs* starting with a [non-whitespace character], and *M* is a - list marker of width *W* followed by 1 ≤ *N* ≤ 4 spaces, then the result - of prepending *M* and the following spaces to the first line of - *Ls*, and indenting subsequent lines of *Ls* by *W + N* spaces, is a - list item with *Bs* as its contents. The type of the list item - (bullet or ordered) is determined by the type of its list marker. 
- If the list item is ordered, then it is also assigned a start - number, based on the ordered list marker. + +. +<p>aaa</p> +<h1>aaa</h1> +```````````````````````````````` - Exceptions: - 1. When the first list item in a [list] interrupts - a paragraph---that is, when it starts on a line that would - otherwise count as [paragraph continuation text]---then (a) - the lines *Ls* must not begin with a blank line, and (b) if - the list item is ordered, the start number must be 1. - 2. If any line is a [thematic break][thematic breaks] then - that line is not a list item. -For example, let *Ls* be the lines +# Container blocks -```````````````````````````````` example -A paragraph -with two lines. +A [container block](#container-blocks) is a block that has other +blocks as its contents. There are two basic kinds of container blocks: +[block quotes] and [list items]. +[Lists] are meta-containers for [list items]. - indented code +We define the syntax for container blocks recursively. The general +form of the definition is: -> A block quote. -. -<p>A paragraph -with two lines.</p> -<pre><code>indented code -</code></pre> -<blockquote> -<p>A block quote.</p> -</blockquote> -```````````````````````````````` +> If X is a sequence of blocks, then the result of +> transforming X in such-and-such a way is a container of type Y +> with these blocks as its content. +So, we explain what counts as a block quote or list item by explaining +how these can be *generated* from their contents. This should suffice +to define the syntax, although it does not give a recipe for *parsing* +these constructions. (A recipe is provided below in the section entitled +[A parsing strategy](#appendix-a-parsing-strategy).) -And let *M* be the marker `1.`, and *N* = 2. Then rule #1 says -that the following is an ordered list item with start number 1, -and the same contents as *Ls*: +## Block quotes -```````````````````````````````` example -1. A paragraph - with two lines. 
+A [block quote marker](@) +consists of 0-3 spaces of initial indent, plus (a) the character `>` together +with a following space, or (b) a single character `>` not followed by a space. - indented code +The following rules define [block quotes]: - > A block quote. -. -<ol> -<li> -<p>A paragraph -with two lines.</p> -<pre><code>indented code -</code></pre> -<blockquote> -<p>A block quote.</p> -</blockquote> -</li> -</ol> -```````````````````````````````` +1. **Basic case.** If a string of lines *Ls* constitute a sequence + of blocks *Bs*, then the result of prepending a [block quote + marker] to the beginning of each line in *Ls* + is a [block quote](#block-quotes) containing *Bs*. +2. **Laziness.** If a string of lines *Ls* constitute a [block + quote](#block-quotes) with contents *Bs*, then the result of deleting + the initial [block quote marker] from one or + more lines in which the next [non-whitespace character] after the [block + quote marker] is [paragraph continuation + text] is a block quote with *Bs* as its content. + [Paragraph continuation text](@) is text + that will be parsed as part of the content of a paragraph, but does + not occur at the beginning of the paragraph. -The most important thing to notice is that the position of -the text after the list marker determines how much indentation -is needed in subsequent blocks in the list item. If the list -marker takes up two spaces, and there are three spaces between -the list marker and the next [non-whitespace character], then blocks -must be indented five spaces in order to fall under the list -item. +3. **Consecutiveness.** A document cannot contain two [block + quotes] in a row unless there is a [blank line] between them. -Here are some examples showing how far content must be indented to be -put under the list item: +Nothing else counts as a [block quote](#block-quotes). 
-```````````````````````````````` example -- one +Here is a simple example: - two +```````````````````````````````` example +> # Foo +> bar +> baz . -<ul> -<li>one</li> -</ul> -<p>two</p> +<blockquote> +<h1>Foo</h1> +<p>bar +baz</p> +</blockquote> ```````````````````````````````` -```````````````````````````````` example -- one +The spaces after the `>` characters can be omitted: - two +```````````````````````````````` example +># Foo +>bar +> baz . -<ul> -<li> -<p>one</p> -<p>two</p> -</li> -</ul> +<blockquote> +<h1>Foo</h1> +<p>bar +baz</p> +</blockquote> ```````````````````````````````` -```````````````````````````````` example - - one +The `>` characters can be indented 1-3 spaces: - two +```````````````````````````````` example + > # Foo + > bar + > baz . -<ul> -<li>one</li> -</ul> -<pre><code> two -</code></pre> +<blockquote> +<h1>Foo</h1> +<p>bar +baz</p> +</blockquote> ```````````````````````````````` -```````````````````````````````` example - - one +Four spaces gives us a code block: - two +```````````````````````````````` example + > # Foo + > bar + > baz . -<ul> -<li> -<p>one</p> -<p>two</p> -</li> -</ul> +<pre><code>> # Foo +> bar +> baz +</code></pre> ```````````````````````````````` -It is tempting to think of this in terms of columns: the continuation -blocks must be indented at least to the column of the first -[non-whitespace character] after the list marker. However, that is not quite right. -The spaces after the list marker determine how much relative indentation -is needed. Which column this indentation reaches will depend on -how the list item is embedded in other constructions, as shown by -this example: +The Laziness clause allows us to omit the `>` before +[paragraph continuation text]: ```````````````````````````````` example - > > 1. one ->> ->> two +> # Foo +> bar +baz . 
<blockquote> -<blockquote> -<ol> -<li> -<p>one</p> -<p>two</p> -</li> -</ol> -</blockquote> +<h1>Foo</h1> +<p>bar +baz</p> </blockquote> ```````````````````````````````` -Here `two` occurs in the same column as the list marker `1.`, -but is actually contained in the list item, because there is -sufficient indentation after the last containing blockquote marker. - -The converse is also possible. In the following example, the word `two` -occurs far to the right of the initial text of the list item, `one`, but -it is not considered part of the list item, because it is not indented -far enough past the blockquote marker: +A block quote can contain some lazy and some non-lazy +continuation lines: ```````````````````````````````` example ->>- one ->> - > > two +> bar +baz +> foo . <blockquote> -<blockquote> -<ul> -<li>one</li> -</ul> -<p>two</p> -</blockquote> +<p>bar +baz +foo</p> </blockquote> ```````````````````````````````` -Note that at least one space is needed between the list marker and -any following content, so these are not list items: +Laziness only applies to lines that would have been continuations of +paragraphs had they been prepended with [block quote markers]. +For example, the `> ` cannot be omitted in the second line of -```````````````````````````````` example --one +``` markdown +> foo +> --- +``` -2.two +without changing the meaning: + +```````````````````````````````` example +> foo +--- . -<p>-one</p> -<p>2.two</p> +<blockquote> +<p>foo</p> +</blockquote> +<hr /> ```````````````````````````````` -A list item may contain blocks that are separated by more than -one blank line. +Similarly, if we omit the `> ` in the second line of -```````````````````````````````` example -- foo +``` markdown +> - foo +> - bar +``` +then the block quote ends after the first line: - bar +```````````````````````````````` example +> - foo +- bar . 
+<blockquote> <ul> -<li> -<p>foo</p> -<p>bar</p> -</li> +<li>foo</li> +</ul> +</blockquote> +<ul> +<li>bar</li> </ul> ```````````````````````````````` -A list item may contain any kind of block: +For the same reason, we can't omit the `> ` in front of +subsequent lines of an indented or fenced code block: ```````````````````````````````` example -1. foo - - ``` +> foo bar - ``` - - baz - - > bam . -<ol> -<li> -<p>foo</p> +<blockquote> +<pre><code>foo +</code></pre> +</blockquote> <pre><code>bar </code></pre> -<p>baz</p> +```````````````````````````````` + + +```````````````````````````````` example +> ``` +foo +``` +. <blockquote> -<p>bam</p> +<pre><code></code></pre> </blockquote> -</li> -</ol> +<p>foo</p> +<pre><code></code></pre> ```````````````````````````````` -A list item that contains an indented code block will preserve -empty lines within the code block verbatim. +Note that in the following case, we have a [lazy +continuation line]: ```````````````````````````````` example -- Foo - - bar +> foo + - bar +. +<blockquote> +<p>foo +- bar</p> +</blockquote> +```````````````````````````````` - baz -. -<ul> -<li> -<p>Foo</p> -<pre><code>bar +To see why, note that in +```markdown +> foo +> - bar +``` -baz -</code></pre> -</li> -</ul> -```````````````````````````````` +the `- bar` is indented too far to start a list, and can't +be an indented code block because indented code blocks cannot +interrupt paragraphs, so it is [paragraph continuation text]. -Note that ordered list start numbers must be nine digits or less: +A block quote can be empty: ```````````````````````````````` example -123456789. ok +> . -<ol start="123456789"> -<li>ok</li> -</ol> +<blockquote> +</blockquote> ```````````````````````````````` ```````````````````````````````` example -1234567890. not ok +> +> +> . -<p>1234567890. 
not ok</p> +<blockquote> +</blockquote> ```````````````````````````````` -A start number may begin with 0s: +A block quote can have initial or final blank lines: ```````````````````````````````` example -0. ok +> +> foo +> . -<ol start="0"> -<li>ok</li> -</ol> +<blockquote> +<p>foo</p> +</blockquote> ```````````````````````````````` +A blank line always separates block quotes: + ```````````````````````````````` example -003. ok +> foo + +> bar . -<ol start="3"> -<li>ok</li> -</ol> +<blockquote> +<p>foo</p> +</blockquote> +<blockquote> +<p>bar</p> +</blockquote> ```````````````````````````````` -A start number may not be negative: +(Most current Markdown implementations, including John Gruber's +original `Markdown.pl`, will parse this example as a single block quote +with two paragraphs. But it seems better to allow the author to decide +whether two block quotes or one are wanted.) + +Consecutiveness means that if we put these block quotes together, +we get a single block quote: ```````````````````````````````` example --1. not ok +> foo +> bar . -<p>-1. not ok</p> +<blockquote> +<p>foo +bar</p> +</blockquote> ```````````````````````````````` - -2. **Item starting with indented code.** If a sequence of lines *Ls* - constitute a sequence of blocks *Bs* starting with an indented code - block, and *M* is a list marker of width *W* followed by - one space, then the result of prepending *M* and the following - space to the first line of *Ls*, and indenting subsequent lines of - *Ls* by *W + 1* spaces, is a list item with *Bs* as its contents. - If a line is empty, then it need not be indented. The type of the - list item (bullet or ordered) is determined by the type of its list - marker. If the list item is ordered, then it is also assigned a - start number, based on the ordered list marker. - -An indented code block will have to be indented four spaces beyond -the edge of the region where text will be included in the list item. 
-In the following case that is 6 spaces: +To get a block quote with two paragraphs, use: ```````````````````````````````` example -- foo - - bar +> foo +> +> bar . -<ul> -<li> +<blockquote> <p>foo</p> -<pre><code>bar -</code></pre> -</li> -</ul> +<p>bar</p> +</blockquote> ```````````````````````````````` -And in this case it is 11 spaces: +Block quotes can interrupt paragraphs: ```````````````````````````````` example - 10. foo - - bar +foo +> bar . -<ol start="10"> -<li> <p>foo</p> -<pre><code>bar -</code></pre> -</li> -</ol> +<blockquote> +<p>bar</p> +</blockquote> ```````````````````````````````` -If the *first* block in the list item is an indented code block, -then by rule #2, the contents must be indented *one* space after the -list marker: +In general, blank lines are not needed before or after block +quotes: ```````````````````````````````` example - indented code - -paragraph - - more code -. -<pre><code>indented code -</code></pre> -<p>paragraph</p> -<pre><code>more code -</code></pre> -```````````````````````````````` - - -```````````````````````````````` example -1. indented code - - paragraph - - more code -. -<ol> -<li> -<pre><code>indented code -</code></pre> -<p>paragraph</p> -<pre><code>more code -</code></pre> -</li> -</ol> -```````````````````````````````` - - -Note that an additional space indent is interpreted as space -inside the code block: - -```````````````````````````````` example -1. indented code - - paragraph - - more code +> aaa +*** +> bbb . -<ol> -<li> -<pre><code> indented code -</code></pre> -<p>paragraph</p> -<pre><code>more code -</code></pre> -</li> -</ol> +<blockquote> +<p>aaa</p> +</blockquote> +<hr /> +<blockquote> +<p>bbb</p> +</blockquote> ```````````````````````````````` -Note that rules #1 and #2 only apply to two cases: (a) cases -in which the lines to be included in a list item begin with a -[non-whitespace character], and (b) cases in which -they begin with an indented code -block. 
In a case like the following, where the first block begins with -a three-space indent, the rules do not allow us to form a list item by -indenting the whole thing and prepending a list marker: +However, because of laziness, a blank line is needed between +a block quote and a following paragraph: ```````````````````````````````` example - foo - -bar +> bar +baz . -<p>foo</p> -<p>bar</p> +<blockquote> +<p>bar +baz</p> +</blockquote> ```````````````````````````````` ```````````````````````````````` example -- foo +> bar - bar +baz . -<ul> -<li>foo</li> -</ul> +<blockquote> <p>bar</p> +</blockquote> +<p>baz</p> ```````````````````````````````` -This is not a significant restriction, because when a block begins -with 1-3 spaces indent, the indentation can always be removed without -a change in interpretation, allowing rule #1 to be applied. So, in -the above case: - ```````````````````````````````` example -- foo - - bar +> bar +> +baz . -<ul> -<li> -<p>foo</p> +<blockquote> <p>bar</p> -</li> -</ul> +</blockquote> +<p>baz</p> ```````````````````````````````` -3. **Item starting with a blank line.** If a sequence of lines *Ls* - starting with a single [blank line] constitute a (possibly empty) - sequence of blocks *Bs*, not separated from each other by more than - one blank line, and *M* is a list marker of width *W*, - then the result of prepending *M* to the first line of *Ls*, and - indenting subsequent lines of *Ls* by *W + 1* spaces, is a list - item with *Bs* as its contents. - If a line is empty, then it need not be indented. The type of the - list item (bullet or ordered) is determined by the type of its list - marker. If the list item is ordered, then it is also assigned a - start number, based on the ordered list marker. 
- -Here are some list items that start with a blank line but are not empty: +It is a consequence of the Laziness rule that any number +of initial `>`s may be omitted on a continuation line of a +nested block quote: ```````````````````````````````` example -- - foo -- - ``` - bar - ``` -- - baz +> > > foo +bar . -<ul> -<li>foo</li> -<li> -<pre><code>bar -</code></pre> -</li> -<li> -<pre><code>baz -</code></pre> -</li> -</ul> +<blockquote> +<blockquote> +<blockquote> +<p>foo +bar</p> +</blockquote> +</blockquote> +</blockquote> ```````````````````````````````` -When the list item starts with a blank line, the number of spaces -following the list marker doesn't change the required indentation: ```````````````````````````````` example -- - foo +>>> foo +> bar +>>baz . -<ul> -<li>foo</li> -</ul> +<blockquote> +<blockquote> +<blockquote> +<p>foo +bar +baz</p> +</blockquote> +</blockquote> +</blockquote> ```````````````````````````````` -A list item can begin with at most one blank line. -In the following example, `foo` is not part of the list -item: +When including an indented code block in a block quote, +remember that the [block quote marker] includes +both the `>` and a following space. So *five spaces* are needed after +the `>`: ```````````````````````````````` example -- +> code - foo +> not code . -<ul> -<li></li> -</ul> -<p>foo</p> +<blockquote> +<pre><code>code +</code></pre> +</blockquote> +<blockquote> +<p>not code</p> +</blockquote> ```````````````````````````````` -Here is an empty bullet list item: -```````````````````````````````` example -- foo -- -- bar -. -<ul> -<li>foo</li> -<li></li> -<li>bar</li> -</ul> -```````````````````````````````` +## List items +A [list marker](@) is a +[bullet list marker] or an [ordered list marker]. -It does not matter whether there are spaces following the [list marker]: +A [bullet list marker](@) +is a `-`, `+`, or `*` character. -```````````````````````````````` example -- foo -- -- bar -. 
-<ul> -<li>foo</li> -<li></li> -<li>bar</li> -</ul> -```````````````````````````````` +An [ordered list marker](@) +is a sequence of 1--9 arabic digits (`0-9`), followed by either a +`.` character or a `)` character. (The reason for the length +limit is that with 10 digits we start seeing integer overflows +in some browsers.) +The following rules define [list items]: -Here is an empty ordered list item: +1. **Basic case.** If a sequence of lines *Ls* constitute a sequence of + blocks *Bs* starting with a [non-whitespace character], and *M* is a + list marker of width *W* followed by 1 ≤ *N* ≤ 4 spaces, then the result + of prepending *M* and the following spaces to the first line of + *Ls*, and indenting subsequent lines of *Ls* by *W + N* spaces, is a + list item with *Bs* as its contents. The type of the list item + (bullet or ordered) is determined by the type of its list marker. + If the list item is ordered, then it is also assigned a start + number, based on the ordered list marker. -```````````````````````````````` example -1. foo -2. -3. bar -. -<ol> -<li>foo</li> -<li></li> -<li>bar</li> -</ol> -```````````````````````````````` + Exceptions: + 1. When the first list item in a [list] interrupts + a paragraph---that is, when it starts on a line that would + otherwise count as [paragraph continuation text]---then (a) + the lines *Ls* must not begin with a blank line, and (b) if + the list item is ordered, the start number must be 1. + 2. If any line is a [thematic break][thematic breaks] then + that line is not a list item. -A list may start or end with an empty list item: +For example, let *Ls* be the lines ```````````````````````````````` example -* -. -<ul> -<li></li> -</ul> -```````````````````````````````` - -However, an empty list item cannot interrupt a paragraph: - -```````````````````````````````` example -foo -* - -foo -1. -. -<p>foo -*</p> -<p>foo -1.</p> -```````````````````````````````` - - -4. 
**Indentation.** If a sequence of lines *Ls* constitutes a list item - according to rule #1, #2, or #3, then the result of indenting each line - of *Ls* by 1-3 spaces (the same for each line) also constitutes a - list item with the same contents and attributes. If a line is - empty, then it need not be indented. - -Indented one space: - -```````````````````````````````` example - 1. A paragraph - with two lines. +A paragraph +with two lines. - indented code + indented code - > A block quote. +> A block quote. . -<ol> -<li> <p>A paragraph with two lines.</p> <pre><code>indented code @@ -4382,20 +4135,20 @@ with two lines.</p> <blockquote> <p>A block quote.</p> </blockquote> -</li> -</ol> ```````````````````````````````` -Indented two spaces: +And let *M* be the marker `1.`, and *N* = 2. Then rule #1 says +that the following is an ordered list item with start number 1, +and the same contents as *Ls*: ```````````````````````````````` example - 1. A paragraph - with two lines. +1. A paragraph + with two lines. - indented code + indented code - > A block quote. + > A block quote. . <ol> <li> @@ -4411,658 +4164,750 @@ with two lines.</p> ```````````````````````````````` -Indented three spaces: +The most important thing to notice is that the position of +the text after the list marker determines how much indentation +is needed in subsequent blocks in the list item. If the list +marker takes up two spaces, and there are three spaces between +the list marker and the next [non-whitespace character], then blocks +must be indented five spaces in order to fall under the list +item. -```````````````````````````````` example - 1. A paragraph - with two lines. +Here are some examples showing how far content must be indented to be +put under the list item: - indented code +```````````````````````````````` example +- one - > A block quote. + two . 
-<ol> -<li> -<p>A paragraph -with two lines.</p> -<pre><code>indented code -</code></pre> -<blockquote> -<p>A block quote.</p> -</blockquote> -</li> -</ol> +<ul> +<li>one</li> +</ul> +<p>two</p> ```````````````````````````````` -Four spaces indent gives a code block: - ```````````````````````````````` example - 1. A paragraph - with two lines. - - indented code +- one - > A block quote. + two . -<pre><code>1. A paragraph - with two lines. - - indented code - - > A block quote. -</code></pre> +<ul> +<li> +<p>one</p> +<p>two</p> +</li> +</ul> ```````````````````````````````` - -5. **Laziness.** If a string of lines *Ls* constitute a [list - item](#list-items) with contents *Bs*, then the result of deleting - some or all of the indentation from one or more lines in which the - next [non-whitespace character] after the indentation is - [paragraph continuation text] is a - list item with the same contents and attributes. The unindented - lines are called - [lazy continuation line](@)s. - -Here is an example with [lazy continuation lines]: - ```````````````````````````````` example - 1. A paragraph -with two lines. - - indented code + - one - > A block quote. + two . -<ol> -<li> -<p>A paragraph -with two lines.</p> -<pre><code>indented code +<ul> +<li>one</li> +</ul> +<pre><code> two </code></pre> -<blockquote> -<p>A block quote.</p> -</blockquote> -</li> -</ol> ```````````````````````````````` -Indentation can be partially deleted: - ```````````````````````````````` example - 1. A paragraph - with two lines. + - one + + two . -<ol> -<li>A paragraph -with two lines.</li> -</ol> +<ul> +<li> +<p>one</p> +<p>two</p> +</li> +</ul> ```````````````````````````````` -These examples show how laziness can work in nested structures: +It is tempting to think of this in terms of columns: the continuation +blocks must be indented at least to the column of the first +[non-whitespace character] after the list marker. However, that is not quite right. 
+The spaces after the list marker determine how much relative indentation +is needed. Which column this indentation reaches will depend on +how the list item is embedded in other constructions, as shown by +this example: ```````````````````````````````` example -> 1. > Blockquote -continued here. + > > 1. one +>> +>> two . <blockquote> +<blockquote> <ol> <li> -<blockquote> -<p>Blockquote -continued here.</p> -</blockquote> +<p>one</p> +<p>two</p> </li> </ol> </blockquote> +</blockquote> ```````````````````````````````` +Here `two` occurs in the same column as the list marker `1.`, +but is actually contained in the list item, because there is +sufficient indentation after the last containing blockquote marker. + +The converse is also possible. In the following example, the word `two` +occurs far to the right of the initial text of the list item, `one`, but +it is not considered part of the list item, because it is not indented +far enough past the blockquote marker: + ```````````````````````````````` example -> 1. > Blockquote -> continued here. +>>- one +>> + > > two . <blockquote> -<ol> -<li> <blockquote> -<p>Blockquote -continued here.</p> +<ul> +<li>one</li> +</ul> +<p>two</p> </blockquote> -</li> -</ol> </blockquote> ```````````````````````````````` +Note that at least one space is needed between the list marker and +any following content, so these are not list items: + +```````````````````````````````` example +-one -6. **That's all.** Nothing that is not counted as a list item by rules - #1--5 counts as a [list item](#list-items). +2.two +. +<p>-one</p> +<p>2.two</p> +```````````````````````````````` -The rules for sublists follow from the general rules -[above][List items]. A sublist must be indented the same number -of spaces a paragraph would need to be in order to be included -in the list item. -So, in this case we need two spaces indent: +A list item may contain blocks that are separated by more than +one blank line. 
```````````````````````````````` example - foo - - bar - - baz - - boo + + + bar . <ul> -<li>foo -<ul> -<li>bar -<ul> -<li>baz -<ul> -<li>boo</li> -</ul> -</li> -</ul> -</li> -</ul> +<li> +<p>foo</p> +<p>bar</p> </li> </ul> ```````````````````````````````` -One is not enough: +A list item may contain any kind of block: ```````````````````````````````` example -- foo - - bar - - baz - - boo -. -<ul> -<li>foo</li> -<li>bar</li> -<li>baz</li> -<li>boo</li> -</ul> -```````````````````````````````` +1. foo + ``` + bar + ``` -Here we need four, because the list marker is wider: + baz -```````````````````````````````` example -10) foo - - bar + > bam . -<ol start="10"> -<li>foo -<ul> -<li>bar</li> -</ul> +<ol> +<li> +<p>foo</p> +<pre><code>bar +</code></pre> +<p>baz</p> +<blockquote> +<p>bam</p> +</blockquote> </li> </ol> ```````````````````````````````` -Three is not enough: +A list item that contains an indented code block will preserve +empty lines within the code block verbatim. ```````````````````````````````` example -10) foo - - bar -. -<ol start="10"> -<li>foo</li> -</ol> -<ul> -<li>bar</li> -</ul> -```````````````````````````````` +- Foo + bar -A list may be the first block in a list item: -```````````````````````````````` example -- - foo + baz . <ul> <li> -<ul> -<li>foo</li> -</ul> +<p>Foo</p> +<pre><code>bar + + +baz +</code></pre> </li> </ul> ```````````````````````````````` +Note that ordered list start numbers must be nine digits or less: ```````````````````````````````` example -1. - 2. foo +123456789. ok . -<ol> -<li> -<ul> -<li> -<ol start="2"> -<li>foo</li> -</ol> -</li> -</ul> -</li> +<ol start="123456789"> +<li>ok</li> </ol> ```````````````````````````````` -A list item can contain a heading: - ```````````````````````````````` example -- # Foo -- Bar - --- - baz +1234567890. not ok . -<ul> -<li> -<h1>Foo</h1> -</li> -<li> -<h2>Bar</h2> -baz</li> -</ul> +<p>1234567890. 
not ok</p> ```````````````````````````````` -### Motivation - -John Gruber's Markdown spec says the following about list items: - -1. "List markers typically start at the left margin, but may be indented - by up to three spaces. List markers must be followed by one or more - spaces or a tab." +A start number may begin with 0s: -2. "To make lists look nice, you can wrap items with hanging indents.... - But if you don't want to, you don't have to." +```````````````````````````````` example +0. ok +. +<ol start="0"> +<li>ok</li> +</ol> +```````````````````````````````` -3. "List items may consist of multiple paragraphs. Each subsequent - paragraph in a list item must be indented by either 4 spaces or one - tab." -4. "It looks nice if you indent every line of the subsequent paragraphs, - but here again, Markdown will allow you to be lazy." +```````````````````````````````` example +003. ok +. +<ol start="3"> +<li>ok</li> +</ol> +```````````````````````````````` -5. "To put a blockquote within a list item, the blockquote's `>` - delimiters need to be indented." -6. "To put a code block within a list item, the code block needs to be - indented twice — 8 spaces or two tabs." +A start number may not be negative: -These rules specify that a paragraph under a list item must be indented -four spaces (presumably, from the left margin, rather than the start of -the list marker, but this is not said), and that code under a list item -must be indented eight spaces instead of the usual four. They also say -that a block quote must be indented, but not by how much; however, the -example given has four spaces indentation. Although nothing is said -about other kinds of block-level content, it is certainly reasonable to -infer that *all* block elements under a list item, including other -lists, must be indented four spaces. This principle has been called the -*four-space rule*. +```````````````````````````````` example +-1. not ok +. +<p>-1. 
not ok</p> +```````````````````````````````` -The four-space rule is clear and principled, and if the reference -implementation `Markdown.pl` had followed it, it probably would have -become the standard. However, `Markdown.pl` allowed paragraphs and -sublists to start with only two spaces indentation, at least on the -outer level. Worse, its behavior was inconsistent: a sublist of an -outer-level list needed two spaces indentation, but a sublist of this -sublist needed three spaces. It is not surprising, then, that different -implementations of Markdown have developed very different rules for -determining what comes under a list item. (Pandoc and python-Markdown, -for example, stuck with Gruber's syntax description and the four-space -rule, while discount, redcarpet, marked, PHP Markdown, and others -followed `Markdown.pl`'s behavior more closely.) -Unfortunately, given the divergences between implementations, there -is no way to give a spec for list items that will be guaranteed not -to break any existing documents. However, the spec given here should -correctly handle lists formatted with either the four-space rule or -the more forgiving `Markdown.pl` behavior, provided they are laid out -in a way that is natural for a human to read. -The strategy here is to let the width and indentation of the list marker -determine the indentation necessary for blocks to fall under the list -item, rather than having a fixed and arbitrary number. The writer can -think of the body of the list item as a unit which gets indented to the -right enough to fit the list marker (and any indentation on the list -marker). (The laziness rule, #5, then allows continuation lines to be -unindented if needed.) +2. 
**Item starting with indented code.** If a sequence of lines *Ls* + constitute a sequence of blocks *Bs* starting with an indented code + block, and *M* is a list marker of width *W* followed by + one space, then the result of prepending *M* and the following + space to the first line of *Ls*, and indenting subsequent lines of + *Ls* by *W + 1* spaces, is a list item with *Bs* as its contents. + If a line is empty, then it need not be indented. The type of the + list item (bullet or ordered) is determined by the type of its list + marker. If the list item is ordered, then it is also assigned a + start number, based on the ordered list marker. -This rule is superior, we claim, to any rule requiring a fixed level of -indentation from the margin. The four-space rule is clear but -unnatural. It is quite unintuitive that +An indented code block will have to be indented four spaces beyond +the edge of the region where text will be included in the list item. +In the following case that is 6 spaces: -``` markdown +```````````````````````````````` example - foo - bar - - - baz -``` - -should be parsed as two lists with an intervening paragraph, - -``` html -<ul> -<li>foo</li> -</ul> -<p>bar</p> -<ul> -<li>baz</li> -</ul> -``` - -as the four-space rule demands, rather than a single list, - -``` html + bar +. <ul> <li> <p>foo</p> -<p>bar</p> -<ul> -<li>baz</li> -</ul> +<pre><code>bar +</code></pre> </li> </ul> -``` - -The choice of four spaces is arbitrary. It can be learned, but it is -not likely to be guessed, and it trips up beginners regularly. - -Would it help to adopt a two-space rule? The problem is that such -a rule, together with the rule allowing 1--3 spaces indentation of the -initial list marker, allows text that is indented *less than* the -original list marker to be included in the list item. 
For example, -`Markdown.pl` parses +```````````````````````````````` -``` markdown - - one - two -``` +And in this case it is 11 spaces: -as a single list item, with `two` a continuation paragraph: +```````````````````````````````` example + 10. foo -``` html -<ul> + bar +. +<ol start="10"> <li> -<p>one</p> -<p>two</p> +<p>foo</p> +<pre><code>bar +</code></pre> </li> -</ul> -``` +</ol> +```````````````````````````````` -and similarly -``` markdown -> - one -> -> two -``` +If the *first* block in the list item is an indented code block, +then by rule #2, the contents must be indented *one* space after the +list marker: -as +```````````````````````````````` example + indented code -``` html -<blockquote> -<ul> -<li> -<p>one</p> -<p>two</p> +paragraph + + more code +. +<pre><code>indented code +</code></pre> +<p>paragraph</p> +<pre><code>more code +</code></pre> +```````````````````````````````` + + +```````````````````````````````` example +1. indented code + + paragraph + + more code +. +<ol> +<li> +<pre><code>indented code +</code></pre> +<p>paragraph</p> +<pre><code>more code +</code></pre> </li> +</ol> +```````````````````````````````` + + +Note that an additional space indent is interpreted as space +inside the code block: + +```````````````````````````````` example +1. indented code + + paragraph + + more code +. +<ol> +<li> +<pre><code> indented code +</code></pre> +<p>paragraph</p> +<pre><code>more code +</code></pre> +</li> +</ol> +```````````````````````````````` + + +Note that rules #1 and #2 only apply to two cases: (a) cases +in which the lines to be included in a list item begin with a +[non-whitespace character], and (b) cases in which +they begin with an indented code +block. In a case like the following, where the first block begins with +a three-space indent, the rules do not allow us to form a list item by +indenting the whole thing and prepending a list marker: + +```````````````````````````````` example + foo + +bar +. 
+<p>foo</p> +<p>bar</p> +```````````````````````````````` + + +```````````````````````````````` example +- foo + + bar +. +<ul> +<li>foo</li> </ul> -</blockquote> -``` +<p>bar</p> +```````````````````````````````` -This is extremely unintuitive. -Rather than requiring a fixed indent from the margin, we could require -a fixed indent (say, two spaces, or even one space) from the list marker (which -may itself be indented). This proposal would remove the last anomaly -discussed. Unlike the spec presented above, it would count the following -as a list item with a subparagraph, even though the paragraph `bar` -is not indented as far as the first paragraph `foo`: +This is not a significant restriction, because when a block begins +with 1-3 spaces indent, the indentation can always be removed without +a change in interpretation, allowing rule #1 to be applied. So, in +the above case: -``` markdown - 10. foo +```````````````````````````````` example +- foo - bar -``` + bar +. +<ul> +<li> +<p>foo</p> +<p>bar</p> +</li> +</ul> +```````````````````````````````` -Arguably this text does read like a list item with `bar` as a subparagraph, -which may count in favor of the proposal. However, on this proposal indented -code would have to be indented six spaces after the list marker. And this -would break a lot of existing Markdown, which has the pattern: -``` markdown -1. foo +3. **Item starting with a blank line.** If a sequence of lines *Ls* + starting with a single [blank line] constitute a (possibly empty) + sequence of blocks *Bs*, not separated from each other by more than + one blank line, and *M* is a list marker of width *W*, + then the result of prepending *M* to the first line of *Ls*, and + indenting subsequent lines of *Ls* by *W + 1* spaces, is a list + item with *Bs* as its contents. + If a line is empty, then it need not be indented. The type of the + list item (bullet or ordered) is determined by the type of its list + marker. 
If the list item is ordered, then it is also assigned a + start number, based on the ordered list marker. - indented code -``` +Here are some list items that start with a blank line but are not empty: -where the code is indented eight spaces. The spec above, by contrast, will -parse this text as expected, since the code block's indentation is measured -from the beginning of `foo`. +```````````````````````````````` example +- + foo +- + ``` + bar + ``` +- + baz +. +<ul> +<li>foo</li> +<li> +<pre><code>bar +</code></pre> +</li> +<li> +<pre><code>baz +</code></pre> +</li> +</ul> +```````````````````````````````` -The one case that needs special treatment is a list item that *starts* -with indented code. How much indentation is required in that case, since -we don't have a "first paragraph" to measure from? Rule #2 simply stipulates -that in such cases, we require one space indentation from the list marker -(and then the normal four spaces for the indented code). This will match the -four-space rule in cases where the list marker plus its initial indentation -takes four spaces (a common case), but diverge in other cases. +When the list item starts with a blank line, the number of spaces +following the list marker doesn't change the required indentation: -## Lists +```````````````````````````````` example +- + foo +. +<ul> +<li>foo</li> +</ul> +```````````````````````````````` -A [list](@) is a sequence of one or more -list items [of the same type]. The list items -may be separated by any number of blank lines. -Two list items are [of the same type](@) -if they begin with a [list marker] of the same type. -Two list markers are of the -same type if (a) they are bullet list markers using the same character -(`-`, `+`, or `*`) or (b) they are ordered list numbers with the same -delimiter (either `.` or `)`). +A list item can begin with at most one blank line. 
+In the following example, `foo` is not part of the list +item: -A list is an [ordered list](@) -if its constituent list items begin with -[ordered list markers], and a -[bullet list](@) if its constituent list -items begin with [bullet list markers]. +```````````````````````````````` example +- -The [start number](@) -of an [ordered list] is determined by the list number of -its initial list item. The numbers of subsequent list items are -disregarded. + foo +. +<ul> +<li></li> +</ul> +<p>foo</p> +```````````````````````````````` -A list is [loose](@) if any of its constituent -list items are separated by blank lines, or if any of its constituent -list items directly contain two block-level elements with a blank line -between them. Otherwise a list is [tight](@). -(The difference in HTML output is that paragraphs in a loose list are -wrapped in `<p>` tags, while paragraphs in a tight list are not.) -Changing the bullet or ordered list delimiter starts a new list: +Here is an empty bullet list item: ```````````````````````````````` example - foo +- - bar -+ baz . <ul> <li>foo</li> +<li></li> <li>bar</li> </ul> +```````````````````````````````` + + +It does not matter whether there are spaces following the [list marker]: + +```````````````````````````````` example +- foo +- +- bar +. <ul> -<li>baz</li> +<li>foo</li> +<li></li> +<li>bar</li> </ul> ```````````````````````````````` +Here is an empty ordered list item: + ```````````````````````````````` example 1. foo -2. bar -3) baz +2. +3. bar . <ol> <li>foo</li> +<li></li> <li>bar</li> </ol> -<ol start="3"> -<li>baz</li> -</ol> ```````````````````````````````` -In CommonMark, a list can interrupt a paragraph. That is, -no blank line is needed to separate a paragraph from a following -list: +A list may start or end with an empty list item: ```````````````````````````````` example -Foo -- bar -- baz +* . 
-<p>Foo</p> <ul> -<li>bar</li> -<li>baz</li> +<li></li> </ul> ```````````````````````````````` -`Markdown.pl` does not allow this, through fear of triggering a list -via a numeral in a hard-wrapped line: +However, an empty list item cannot interrupt a paragraph: -``` markdown -The number of windows in my house is -14. The number of doors is 6. -``` +```````````````````````````````` example +foo +* -Oddly, though, `Markdown.pl` *does* allow a blockquote to -interrupt a paragraph, even though the same considerations might -apply. +foo +1. +. +<p>foo +*</p> +<p>foo +1.</p> +```````````````````````````````` -In CommonMark, we do allow lists to interrupt paragraphs, for -two reasons. First, it is natural and not uncommon for people -to start lists without blank lines: -``` markdown -I need to buy -- new shoes -- a coat -- a plane ticket -``` +4. **Indentation.** If a sequence of lines *Ls* constitutes a list item + according to rule #1, #2, or #3, then the result of indenting each line + of *Ls* by 1-3 spaces (the same for each line) also constitutes a + list item with the same contents and attributes. If a line is + empty, then it need not be indented. -Second, we are attracted to a +Indented one space: -> [principle of uniformity](@): -> if a chunk of text has a certain -> meaning, it will continue to have the same meaning when put into a -> container block (such as a list item or blockquote). +```````````````````````````````` example + 1. A paragraph + with two lines. -(Indeed, the spec for [list items] and [block quotes] presupposes -this principle.) This principle implies that if + indented code -``` markdown - * I need to buy - - new shoes - - a coat - - a plane ticket -``` + > A block quote. +. 
+<ol> +<li> +<p>A paragraph +with two lines.</p> +<pre><code>indented code +</code></pre> +<blockquote> +<p>A block quote.</p> +</blockquote> +</li> +</ol> +```````````````````````````````` -is a list item containing a paragraph followed by a nested sublist, -as all Markdown implementations agree it is (though the paragraph -may be rendered without `<p>` tags, since the list is "tight"), -then -``` markdown -I need to buy -- new shoes -- a coat -- a plane ticket -``` +Indented two spaces: -by itself should be a paragraph followed by a nested sublist. +```````````````````````````````` example + 1. A paragraph + with two lines. -Since it is well established Markdown practice to allow lists to -interrupt paragraphs inside list items, the [principle of -uniformity] requires us to allow this outside list items as -well. ([reStructuredText](http://docutils.sourceforge.net/rst.html) -takes a different approach, requiring blank lines before lists -even inside other list items.) + indented code + + > A block quote. +. +<ol> +<li> +<p>A paragraph +with two lines.</p> +<pre><code>indented code +</code></pre> +<blockquote> +<p>A block quote.</p> +</blockquote> +</li> +</ol> +```````````````````````````````` + + +Indented three spaces: + +```````````````````````````````` example + 1. A paragraph + with two lines. + + indented code + + > A block quote. +. +<ol> +<li> +<p>A paragraph +with two lines.</p> +<pre><code>indented code +</code></pre> +<blockquote> +<p>A block quote.</p> +</blockquote> +</li> +</ol> +```````````````````````````````` + + +Four spaces indent gives a code block: + +```````````````````````````````` example + 1. A paragraph + with two lines. + + indented code + + > A block quote. +. +<pre><code>1. A paragraph + with two lines. + + indented code + + > A block quote. +</code></pre> +```````````````````````````````` + + + +5. 
**Laziness.** If a string of lines *Ls* constitute a [list + item](#list-items) with contents *Bs*, then the result of deleting + some or all of the indentation from one or more lines in which the + next [non-whitespace character] after the indentation is + [paragraph continuation text] is a + list item with the same contents and attributes. The unindented + lines are called + [lazy continuation line](@)s. -In order to solve of unwanted lists in paragraphs with -hard-wrapped numerals, we allow only lists starting with `1` to -interrupt paragraphs. Thus, +Here is an example with [lazy continuation lines]: ```````````````````````````````` example -The number of windows in my house is -14. The number of doors is 6. -. -<p>The number of windows in my house is -14. The number of doors is 6.</p> -```````````````````````````````` + 1. A paragraph +with two lines. -We may still get an unintended result in cases like + indented code -```````````````````````````````` example -The number of windows in my house is -1. The number of doors is 6. + > A block quote. . -<p>The number of windows in my house is</p> <ol> -<li>The number of doors is 6.</li> +<li> +<p>A paragraph +with two lines.</p> +<pre><code>indented code +</code></pre> +<blockquote> +<p>A block quote.</p> +</blockquote> +</li> </ol> ```````````````````````````````` -but this rule should prevent most spurious list captures. -There can be any number of blank lines between items: +Indentation can be partially deleted: ```````````````````````````````` example -- foo + 1. A paragraph + with two lines. +. +<ol> +<li>A paragraph +with two lines.</li> +</ol> +```````````````````````````````` -- bar +These examples show how laziness can work in nested structures: -- baz +```````````````````````````````` example +> 1. > Blockquote +continued here. . 
-<ul> -<li> -<p>foo</p> -</li> +<blockquote> +<ol> <li> -<p>bar</p> +<blockquote> +<p>Blockquote +continued here.</p> +</blockquote> </li> +</ol> +</blockquote> +```````````````````````````````` + + +```````````````````````````````` example +> 1. > Blockquote +> continued here. +. +<blockquote> +<ol> <li> -<p>baz</p> +<blockquote> +<p>Blockquote +continued here.</p> +</blockquote> </li> -</ul> +</ol> +</blockquote> ```````````````````````````````` + + +6. **That's all.** Nothing that is not counted as a list item by rules + #1--5 counts as a [list item](#list-items). + +The rules for sublists follow from the general rules +[above][List items]. A sublist must be indented the same number +of spaces a paragraph would need to be in order to be included +in the list item. + +So, in this case we need two spaces indent: + ```````````````````````````````` example - foo - bar - baz - - - bim + - boo . <ul> <li>foo <ul> <li>bar <ul> -<li> -<p>baz</p> -<p>bim</p> +<li>baz +<ul> +<li>boo</li> +</ul> </li> </ul> </li> @@ -5072,778 +4917,938 @@ There can be any number of blank lines between items: ```````````````````````````````` -To separate consecutive lists of the same type, or to separate a -list from an indented code block that would otherwise be parsed -as a subparagraph of the final list item, you can insert a blank HTML -comment: +One is not enough: ```````````````````````````````` example - foo -- bar - -<!-- --> - -- baz -- bim + - bar + - baz + - boo . <ul> <li>foo</li> <li>bar</li> -</ul> -<!-- --> -<ul> <li>baz</li> -<li>bim</li> -</ul> -```````````````````````````````` - - -```````````````````````````````` example -- foo - - notcode - -- foo - -<!-- --> - - code -. -<ul> -<li> -<p>foo</p> -<p>notcode</p> -</li> -<li> -<p>foo</p> -</li> -</ul> -<!-- --> -<pre><code>code -</code></pre> -```````````````````````````````` - - -List items need not be indented to the same level. 
The following -list items will be treated as items at the same list level, -since none is indented enough to belong to the previous list -item: - -```````````````````````````````` example -- a - - b - - c - - d - - e - - f -- g -. -<ul> -<li>a</li> -<li>b</li> -<li>c</li> -<li>d</li> -<li>e</li> -<li>f</li> -<li>g</li> +<li>boo</li> </ul> ```````````````````````````````` -```````````````````````````````` example -1. a - - 2. b - - 3. c -. -<ol> -<li> -<p>a</p> -</li> -<li> -<p>b</p> -</li> -<li> -<p>c</p> -</li> -</ol> -```````````````````````````````` - -Note, however, that list items may not be indented more than -three spaces. Here `- e` is treated as a paragraph continuation -line, because it is indented more than three spaces: +Here we need four, because the list marker is wider: ```````````````````````````````` example -- a - - b - - c - - d - - e +10) foo + - bar . +<ol start="10"> +<li>foo <ul> -<li>a</li> -<li>b</li> -<li>c</li> -<li>d -- e</li> +<li>bar</li> </ul> -```````````````````````````````` - -And here, `3. c` is treated as an indented code block, -because it is indented four spaces and preceded by a -blank line. - -```````````````````````````````` example -1. a - - 2. b - - 3. c -. -<ol> -<li> -<p>a</p> -</li> -<li> -<p>b</p> </li> </ol> -<pre><code>3. c -</code></pre> ```````````````````````````````` -This is a loose list, because there is a blank line between -two of the list items: +Three is not enough: ```````````````````````````````` example -- a -- b - -- c +10) foo + - bar . -<ul> -<li> -<p>a</p> -</li> -<li> -<p>b</p> -</li> -<li> -<p>c</p> -</li> +<ol start="10"> +<li>foo</li> +</ol> +<ul> +<li>bar</li> </ul> ```````````````````````````````` -So is this, with an empty second item: +A list may be the first block in a list item: ```````````````````````````````` example -* a -* - -* c +- - foo .
<ul> <li> -<p>a</p> -</li> -<li></li> -<li> -<p>c</p> +<ul> +<li>foo</li> +</ul> </li> </ul> ```````````````````````````````` -These are loose lists, even though there is no space between the items, -because one of the items directly contains two block-level elements -with a blank line between them: - ```````````````````````````````` example -- a -- b - - c -- d +1. - 2. foo . -<ul> -<li> -<p>a</p> -</li> +<ol> <li> -<p>b</p> -<p>c</p> -</li> +<ul> <li> -<p>d</p> +<ol start="2"> +<li>foo</li> +</ol> </li> </ul> +</li> +</ol> ```````````````````````````````` -```````````````````````````````` example -- a -- b +A list item can contain a heading: - [ref]: /url -- d +```````````````````````````````` example +- # Foo +- Bar + --- + baz . <ul> <li> -<p>a</p> -</li> -<li> -<p>b</p> +<h1>Foo</h1> </li> <li> -<p>d</p> -</li> +<h2>Bar</h2> +baz</li> </ul> ```````````````````````````````` -This is a tight list, because the blank lines are in a code block: +### Motivation -```````````````````````````````` example -- a -- ``` - b +John Gruber's Markdown spec says the following about list items: + +1. "List markers typically start at the left margin, but may be indented + by up to three spaces. List markers must be followed by one or more + spaces or a tab." +2. "To make lists look nice, you can wrap items with hanging indents.... + But if you don't want to, you don't have to." - ``` -- c -. +3. "List items may consist of multiple paragraphs. Each subsequent + paragraph in a list item must be indented by either 4 spaces or one + tab." + +4. "It looks nice if you indent every line of the subsequent paragraphs, + but here again, Markdown will allow you to be lazy." + +5. "To put a blockquote within a list item, the blockquote's `>` + delimiters need to be indented." + +6. "To put a code block within a list item, the code block needs to be + indented twice — 8 spaces or two tabs." 
+ +These rules specify that a paragraph under a list item must be indented +four spaces (presumably, from the left margin, rather than the start of +the list marker, but this is not said), and that code under a list item +must be indented eight spaces instead of the usual four. They also say +that a block quote must be indented, but not by how much; however, the +example given has four spaces indentation. Although nothing is said +about other kinds of block-level content, it is certainly reasonable to +infer that *all* block elements under a list item, including other +lists, must be indented four spaces. This principle has been called the +*four-space rule*. + +The four-space rule is clear and principled, and if the reference +implementation `Markdown.pl` had followed it, it probably would have +become the standard. However, `Markdown.pl` allowed paragraphs and +sublists to start with only two spaces indentation, at least on the +outer level. Worse, its behavior was inconsistent: a sublist of an +outer-level list needed two spaces indentation, but a sublist of this +sublist needed three spaces. It is not surprising, then, that different +implementations of Markdown have developed very different rules for +determining what comes under a list item. (Pandoc and python-Markdown, +for example, stuck with Gruber's syntax description and the four-space +rule, while discount, redcarpet, marked, PHP Markdown, and others +followed `Markdown.pl`'s behavior more closely.) + +Unfortunately, given the divergences between implementations, there +is no way to give a spec for list items that will be guaranteed not +to break any existing documents. However, the spec given here should +correctly handle lists formatted with either the four-space rule or +the more forgiving `Markdown.pl` behavior, provided they are laid out +in a way that is natural for a human to read. 
+ +The strategy here is to let the width and indentation of the list marker +determine the indentation necessary for blocks to fall under the list +item, rather than having a fixed and arbitrary number. The writer can +think of the body of the list item as a unit which gets indented to the +right enough to fit the list marker (and any indentation on the list +marker). (The laziness rule, #5, then allows continuation lines to be +unindented if needed.) + +This rule is superior, we claim, to any rule requiring a fixed level of +indentation from the margin. The four-space rule is clear but +unnatural. It is quite unintuitive that + +``` markdown +- foo + + bar + + - baz +``` + +should be parsed as two lists with an intervening paragraph, + +``` html <ul> -<li>a</li> -<li> -<pre><code>b +<li>foo</li> +</ul> +<p>bar</p> +<ul> +<li>baz</li> +</ul> +``` +as the four-space rule demands, rather than a single list, -</code></pre> +``` html +<ul> +<li> +<p>foo</p> +<p>bar</p> +<ul> +<li>baz</li> +</ul> </li> -<li>c</li> </ul> -```````````````````````````````` +``` +The choice of four spaces is arbitrary. It can be learned, but it is +not likely to be guessed, and it trips up beginners regularly. -This is a tight list, because the blank line is between two -paragraphs of a sublist. So the sublist is loose while -the outer list is tight: +Would it help to adopt a two-space rule? The problem is that such +a rule, together with the rule allowing 1--3 spaces indentation of the +initial list marker, allows text that is indented *less than* the +original list marker to be included in the list item. For example, +`Markdown.pl` parses -```````````````````````````````` example -- a - - b +``` markdown + - one - c -- d -. 
-<ul> -<li>a + two +``` + +as a single list item, with `two` a continuation paragraph: + +``` html <ul> <li> -<p>b</p> -<p>c</p> +<p>one</p> +<p>two</p> </li> </ul> +``` + +and similarly + +``` markdown +> - one +> +> two +``` + +as + +``` html +<blockquote> +<ul> +<li> +<p>one</p> +<p>two</p> </li> -<li>d</li> </ul> -```````````````````````````````` +</blockquote> +``` + +This is extremely unintuitive. + +Rather than requiring a fixed indent from the margin, we could require +a fixed indent (say, two spaces, or even one space) from the list marker (which +may itself be indented). This proposal would remove the last anomaly +discussed. Unlike the spec presented above, it would count the following +as a list item with a subparagraph, even though the paragraph `bar` +is not indented as far as the first paragraph `foo`: + +``` markdown + 10. foo + + bar +``` + +Arguably this text does read like a list item with `bar` as a subparagraph, +which may count in favor of the proposal. However, on this proposal indented +code would have to be indented six spaces after the list marker. And this +would break a lot of existing Markdown, which has the pattern: + +``` markdown +1. foo + + indented code +``` + +where the code is indented eight spaces. The spec above, by contrast, will +parse this text as expected, since the code block's indentation is measured +from the beginning of `foo`. +The one case that needs special treatment is a list item that *starts* +with indented code. How much indentation is required in that case, since +we don't have a "first paragraph" to measure from? Rule #2 simply stipulates +that in such cases, we require one space indentation from the list marker +(and then the normal four spaces for the indented code). This will match the +four-space rule in cases where the list marker plus its initial indentation +takes four spaces (a common case), but diverge in other cases. 
-This is a tight list, because the blank line is inside the -block quote: +## Lists -```````````````````````````````` example -* a - > b - > -* c -. -<ul> -<li>a -<blockquote> -<p>b</p> -</blockquote> -</li> -<li>c</li> -</ul> -```````````````````````````````` +A [list](@) is a sequence of one or more +list items [of the same type]. The list items +may be separated by any number of blank lines. +Two list items are [of the same type](@) +if they begin with a [list marker] of the same type. +Two list markers are of the +same type if (a) they are bullet list markers using the same character +(`-`, `+`, or `*`) or (b) they are ordered list numbers with the same +delimiter (either `.` or `)`). -This list is tight, because the consecutive block elements -are not separated by blank lines: +A list is an [ordered list](@) +if its constituent list items begin with +[ordered list markers], and a +[bullet list](@) if its constituent list +items begin with [bullet list markers]. -```````````````````````````````` example -- a - > b - ``` - c - ``` -- d -. -<ul> -<li>a -<blockquote> -<p>b</p> -</blockquote> -<pre><code>c -</code></pre> -</li> -<li>d</li> -</ul> -```````````````````````````````` +The [start number](@) +of an [ordered list] is determined by the list number of +its initial list item. The numbers of subsequent list items are +disregarded. +A list is [loose](@) if any of its constituent +list items are separated by blank lines, or if any of its constituent +list items directly contain two block-level elements with a blank line +between them. Otherwise a list is [tight](@). +(The difference in HTML output is that paragraphs in a loose list are +wrapped in `<p>` tags, while paragraphs in a tight list are not.) -A single-paragraph list is tight: +Changing the bullet or ordered list delimiter starts a new list: ```````````````````````````````` example -- a +- foo +- bar ++ baz . 
<ul> -<li>a</li> +<li>foo</li> +<li>bar</li> </ul> -```````````````````````````````` - - -```````````````````````````````` example -- a - - b -. -<ul> -<li>a <ul> -<li>b</li> -</ul> -</li> +<li>baz</li> </ul> ```````````````````````````````` -This list is loose, because of the blank line between the -two block elements in the list item: - ```````````````````````````````` example -1. ``` - foo - ``` - - bar +1. foo +2. bar +3) baz . <ol> -<li> -<pre><code>foo -</code></pre> -<p>bar</p> -</li> +<li>foo</li> +<li>bar</li> +</ol> +<ol start="3"> +<li>baz</li> </ol> ```````````````````````````````` -Here the outer list is loose, the inner list tight: +In CommonMark, a list can interrupt a paragraph. That is, +no blank line is needed to separate a paragraph from a following +list: ```````````````````````````````` example -* foo - * bar - - baz +Foo +- bar +- baz . -<ul> -<li> -<p>foo</p> +<p>Foo</p> <ul> <li>bar</li> -</ul> -<p>baz</p> -</li> +<li>baz</li> </ul> ```````````````````````````````` +`Markdown.pl` does not allow this, through fear of triggering a list +via a numeral in a hard-wrapped line: -```````````````````````````````` example -- a - - b - - c +``` markdown +The number of windows in my house is +14. The number of doors is 6. +``` -- d - - e - - f -. -<ul> -<li> -<p>a</p> -<ul> -<li>b</li> -<li>c</li> -</ul> -</li> -<li> -<p>d</p> -<ul> -<li>e</li> -<li>f</li> -</ul> -</li> -</ul> -```````````````````````````````` +Oddly, though, `Markdown.pl` *does* allow a blockquote to +interrupt a paragraph, even though the same considerations might +apply. +In CommonMark, we do allow lists to interrupt paragraphs, for +two reasons. First, it is natural and not uncommon for people +to start lists without blank lines: -# Inlines +``` markdown +I need to buy +- new shoes +- a coat +- a plane ticket +``` -Inlines are parsed sequentially from the beginning of the character -stream to the end (left to right, in left-to-right languages). 
-Thus, for example, in +Second, we are attracted to a -```````````````````````````````` example -`hi`lo` -. -<p><code>hi</code>lo`</p> -```````````````````````````````` +> [principle of uniformity](@): +> if a chunk of text has a certain +> meaning, it will continue to have the same meaning when put into a +> container block (such as a list item or blockquote). -`hi` is parsed as code, leaving the backtick at the end as a literal -backtick. +(Indeed, the spec for [list items] and [block quotes] presupposes +this principle.) This principle implies that if +``` markdown + * I need to buy + - new shoes + - a coat + - a plane ticket +``` -## Backslash escapes +is a list item containing a paragraph followed by a nested sublist, +as all Markdown implementations agree it is (though the paragraph +may be rendered without `<p>` tags, since the list is "tight"), +then -Any ASCII punctuation character may be backslash-escaped: +``` markdown +I need to buy +- new shoes +- a coat +- a plane ticket +``` -```````````````````````````````` example -\!\"\#\$\%\&\'\(\)\*\+\,\-\.\/\:\;\<\=\>\?\@\[\\\]\^\_\`\{\|\}\~ -. -<p>!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~</p> -```````````````````````````````` +by itself should be a paragraph followed by a nested sublist. +Since it is well established Markdown practice to allow lists to +interrupt paragraphs inside list items, the [principle of +uniformity] requires us to allow this outside list items as +well. ([reStructuredText](http://docutils.sourceforge.net/rst.html) +takes a different approach, requiring blank lines before lists +even inside other list items.) -Backslashes before other characters are treated as literal -backslashes: +In order to solve of unwanted lists in paragraphs with +hard-wrapped numerals, we allow only lists starting with `1` to +interrupt paragraphs. Thus, ```````````````````````````````` example -\→\A\a\ \3\φ\« +The number of windows in my house is +14. The number of doors is 6. . 
-<p>\→\A\a\ \3\φ\«</p> +<p>The number of windows in my house is +14. The number of doors is 6.</p> ```````````````````````````````` - -Escaped characters are treated as regular characters and do -not have their usual Markdown meanings: +We may still get an unintended result in cases like ```````````````````````````````` example -\*not emphasized* -\<br/> not a tag -\[not a link](/foo) -\`not code` -1\. not a list -\* not a list -\# not a heading -\[foo]: /url "not a reference" -\ö not a character entity +The number of windows in my house is +1. The number of doors is 6. . -<p>*not emphasized* -<br/> not a tag -[not a link](/foo) -`not code` -1. not a list -* not a list -# not a heading -[foo]: /url "not a reference" -&ouml; not a character entity</p> +<p>The number of windows in my house is</p> +<ol> +<li>The number of doors is 6.</li> +</ol> ```````````````````````````````` +but this rule should prevent most spurious list captures. -If a backslash is itself escaped, the following character is not: +There can be any number of blank lines between items: ```````````````````````````````` example -\\*emphasis* -. -<p>\<em>emphasis</em></p> -```````````````````````````````` +- foo +- bar -A backslash at the end of the line is a [hard line break]: -```````````````````````````````` example -foo\ -bar +- baz . -<p>foo<br /> -bar</p> +<ul> +<li> +<p>foo</p> +</li> +<li> +<p>bar</p> +</li> +<li> +<p>baz</p> +</li> +</ul> ```````````````````````````````` +```````````````````````````````` example +- foo + - bar + - baz -Backslash escapes do not work in code blocks, code spans, autolinks, or -raw HTML: -```````````````````````````````` example -`` \[\` `` + bim . 
-<p><code>\[\`</code></p> +<ul> +<li>foo +<ul> +<li>bar +<ul> +<li> +<p>baz</p> +<p>bim</p> +</li> +</ul> +</li> +</ul> +</li> +</ul> ```````````````````````````````` +To separate consecutive lists of the same type, or to separate a +list from an indented code block that would otherwise be parsed +as a subparagraph of the final list item, you can insert a blank HTML +comment: + ```````````````````````````````` example - \[\] -. -<pre><code>\[\] -</code></pre> -```````````````````````````````` +- foo +- bar +<!-- --> -```````````````````````````````` example -~~~ -\[\] -~~~ +- baz +- bim . -<pre><code>\[\] -</code></pre> +<ul> +<li>foo</li> +<li>bar</li> +</ul> +<!-- --> +<ul> +<li>baz</li> +<li>bim</li> +</ul> ```````````````````````````````` ```````````````````````````````` example -<http://example.com?find=\*> -. -<p><a href="http://example.com?find=%5C*">http://example.com?find=\*</a></p> -```````````````````````````````` +- foo + + notcode +- foo -```````````````````````````````` example -<a href="/bar\/)"> +<!-- --> + + code . -<a href="/bar\/)"> +<ul> +<li> +<p>foo</p> +<p>notcode</p> +</li> +<li> +<p>foo</p> +</li> +</ul> +<!-- --> +<pre><code>code +</code></pre> ```````````````````````````````` -But they work in all other contexts, including URLs and link titles, -link references, and [info strings] in [fenced code blocks]: +List items need not be indented to the same level. The following +list items will be treated as items at the same list level, +since none is indented enough to belong to the previous list +item: ```````````````````````````````` example -[foo](/bar\* "ti\*tle") +- a + - b + - c + - d + - e + - f +- g . -<p><a href="/bar*" title="ti*tle">foo</a></p> +<ul> +<li>a</li> +<li>b</li> +<li>c</li> +<li>d</li> +<li>e</li> +<li>f</li> +<li>g</li> +</ul> ```````````````````````````````` ```````````````````````````````` example -[foo] +1. a -[foo]: /bar\* "ti\*tle" + 2. b + + 3. c . 
-<p><a href="/bar*" title="ti*tle">foo</a></p> +<ol> +<li> +<p>a</p> +</li> +<li> +<p>b</p> +</li> +<li> +<p>c</p> +</li> +</ol> ```````````````````````````````` +Note, however, that list items may not be indented more than +three spaces. Here `- e` is treated as a paragraph continuation +line, because it is indented more than three spaces: ```````````````````````````````` example -``` foo\+bar -foo -``` +- a + - b + - c + - d + - e . -<pre><code class="language-foo+bar">foo -</code></pre> +<ul> +<li>a</li> +<li>b</li> +<li>c</li> +<li>d +- e</li> +</ul> ```````````````````````````````` +And here, `3. c` is treated as in indented code block, +because it is indented four spaces and preceded by a +blank line. +```````````````````````````````` example +1. a -## Entity and numeric character references - -Valid HTML entity references and numeric character references -can be used in place of the corresponding Unicode character, -with the following exceptions: - -- Entity and character references are not recognized in code - blocks and code spans. + 2. b -- Entity and character references cannot stand in place of - special characters that define structural elements in - CommonMark. For example, although `*` can be used - in place of a literal `*` character, `*` cannot replace - `*` in emphasis delimiters, bullet list markers, or thematic - breaks. + 3. c +. +<ol> +<li> +<p>a</p> +</li> +<li> +<p>b</p> +</li> +</ol> +<pre><code>3. c +</code></pre> +```````````````````````````````` -Conforming CommonMark parsers need not store information about -whether a particular character was represented in the source -using a Unicode character or an entity reference. -[Entity references](@) consist of `&` + any of the valid -HTML5 entity names + `;`. The -document <https://html.spec.whatwg.org/multipage/entities.json> -is used as an authoritative source for the valid entity -references and their corresponding code points. 
+This is a loose list, because there is a blank line between +two of the list items: ```````````````````````````````` example - & © Æ Ď -¾ ℋ ⅆ -∲ ≧̸ +- a +- b + +- c . -<p> & © Æ Ď -¾ ℋ ⅆ -∲ ≧̸</p> +<ul> +<li> +<p>a</p> +</li> +<li> +<p>b</p> +</li> +<li> +<p>c</p> +</li> +</ul> ```````````````````````````````` -[Decimal numeric character -references](@) -consist of `&#` + a string of 1--7 arabic digits + `;`. A -numeric character reference is parsed as the corresponding -Unicode character. Invalid Unicode code points will be replaced by -the REPLACEMENT CHARACTER (`U+FFFD`). For security reasons, -the code point `U+0000` will also be replaced by `U+FFFD`. +So is this, with a empty second item: ```````````````````````````````` example -# Ӓ Ϡ � +* a +* + +* c . -<p># Ӓ Ϡ �</p> +<ul> +<li> +<p>a</p> +</li> +<li></li> +<li> +<p>c</p> +</li> +</ul> ```````````````````````````````` -[Hexadecimal numeric character -references](@) consist of `&#` + -either `X` or `x` + a string of 1-6 hexadecimal digits + `;`. -They too are parsed as the corresponding Unicode character (this -time specified with a hexadecimal numeral instead of decimal). +These are loose lists, even though there is no space between the items, +because one of the items directly contains two block-level elements +with a blank line between them: ```````````````````````````````` example -" ആ ಫ +- a +- b + + c +- d . -<p>" ആ ಫ</p> +<ul> +<li> +<p>a</p> +</li> +<li> +<p>b</p> +<p>c</p> +</li> +<li> +<p>d</p> +</li> +</ul> ```````````````````````````````` -Here are some nonentities: - ```````````````````````````````` example -  &x; &#; &#x; -� -&#abcdef0; -&ThisIsNotDefined; &hi?; +- a +- b + + [ref]: /url +- d . 
-<p>&nbsp &x; &#; &#x; -&#987654321; -&#abcdef0; -&ThisIsNotDefined; &hi?;</p> +<ul> +<li> +<p>a</p> +</li> +<li> +<p>b</p> +</li> +<li> +<p>d</p> +</li> +</ul> ```````````````````````````````` -Although HTML5 does accept some entity references -without a trailing semicolon (such as `©`), these are not -recognized here, because it makes the grammar too ambiguous: +This is a tight list, because the blank lines are in a code block: ```````````````````````````````` example -© +- a +- ``` + b + + + ``` +- c . -<p>&copy</p> +<ul> +<li>a</li> +<li> +<pre><code>b + + +</code></pre> +</li> +<li>c</li> +</ul> ```````````````````````````````` -Strings that are not on the list of HTML5 named entities are not -recognized as entity references either: +This is a tight list, because the blank line is between two +paragraphs of a sublist. So the sublist is loose while +the outer list is tight: ```````````````````````````````` example -&MadeUpEntity; +- a + - b + + c +- d . -<p>&MadeUpEntity;</p> +<ul> +<li>a +<ul> +<li> +<p>b</p> +<p>c</p> +</li> +</ul> +</li> +<li>d</li> +</ul> ```````````````````````````````` -Entity and numeric character references are recognized in any -context besides code spans or code blocks, including -URLs, [link titles], and [fenced code block][] [info strings]: +This is a tight list, because the blank line is inside the +block quote: ```````````````````````````````` example -<a href="öö.html"> +* a + > b + > +* c . -<a href="öö.html"> +<ul> +<li>a +<blockquote> +<p>b</p> +</blockquote> +</li> +<li>c</li> +</ul> ```````````````````````````````` +This list is tight, because the consecutive block elements +are not separated by blank lines: + ```````````````````````````````` example -[foo](/föö "föö") +- a + > b + ``` + c + ``` +- d . 
-<p><a href="/f%C3%B6%C3%B6" title="föö">foo</a></p> +<ul> +<li>a +<blockquote> +<p>b</p> +</blockquote> +<pre><code>c +</code></pre> +</li> +<li>d</li> +</ul> ```````````````````````````````` -```````````````````````````````` example -[foo] +A single-paragraph list is tight: -[foo]: /föö "föö" +```````````````````````````````` example +- a . -<p><a href="/f%C3%B6%C3%B6" title="föö">foo</a></p> +<ul> +<li>a</li> +</ul> ```````````````````````````````` ```````````````````````````````` example -``` föö -foo -``` +- a + - b . -<pre><code class="language-föö">foo -</code></pre> +<ul> +<li>a +<ul> +<li>b</li> +</ul> +</li> +</ul> ```````````````````````````````` -Entity and numeric character references are treated as literal -text in code spans and code blocks: +This list is loose, because of the blank line between the +two block elements in the list item: ```````````````````````````````` example -`föö` -. -<p><code>f&ouml;&ouml;</code></p> -```````````````````````````````` - +1. ``` + foo + ``` -```````````````````````````````` example - föfö + bar . -<pre><code>f&ouml;f&ouml; +<ol> +<li> +<pre><code>foo </code></pre> +<p>bar</p> +</li> +</ol> ```````````````````````````````` -Entity and numeric character references cannot be used -in place of symbols indicating structure in CommonMark -documents. +Here the outer list is loose, the inner list tight: ```````````````````````````````` example -*foo* -*foo* +* foo + * bar + + baz . -<p>*foo* -<em>foo</em></p> +<ul> +<li> +<p>foo</p> +<ul> +<li>bar</li> +</ul> +<p>baz</p> +</li> +</ul> ```````````````````````````````` + ```````````````````````````````` example -* foo +- a + - b + - c -* foo +- d + - e + - f . -<p>* foo</p> <ul> -<li>foo</li> +<li> +<p>a</p> +<ul> +<li>b</li> +<li>c</li> +</ul> +</li> +<li> +<p>d</p> +<ul> +<li>e</li> +<li>f</li> +</ul> +</li> </ul> ```````````````````````````````` -```````````````````````````````` example -foo bar -. 
-<p>foo -bar</p> -```````````````````````````````` +# Inlines + +Inlines are parsed sequentially from the beginning of the character +stream to the end (left to right, in left-to-right languages). +Thus, for example, in ```````````````````````````````` example -	foo +`hi`lo` . -<p>→foo</p> +<p><code>hi</code>lo`</p> ```````````````````````````````` +`hi` is parsed as code, leaving the backtick at the end as a literal +backtick. -```````````````````````````````` example -[a](url "tit") -. -<p>[a](url "tit")</p> -```````````````````````````````` ## Code spans @@ -7461,10 +7466,11 @@ A [link destination](@) consists of either closing `>` that contains no line breaks or unescaped `<` or `>` characters, or -- a nonempty sequence of characters that does not start with - `<`, does not include ASCII space or control characters, and - includes parentheses only if (a) they are backslash-escaped or - (b) they are part of a balanced pair of unescaped parentheses. +- a nonempty sequence of characters that does not start with `<`, + does not include [ASCII control characters][ASCII control character] + or [whitespace][], and includes parentheses only if (a) they are + backslash-escaped or (b) they are part of a balanced pair of + unescaped parentheses. (Implementations may impose limits on parentheses nesting to avoid performance issues, but at least three levels of nesting should be supported.) @@ -7616,6 +7622,13 @@ However, if you have unbalanced parentheses, you need to escape or use the `<...>` form: ```````````````````````````````` example +[link](foo(and(bar)) +. +<p>[link](foo(and(bar))</p> +```````````````````````````````` + + +```````````````````````````````` example [link](foo\(and\(bar\)) . <p><a href="foo(and(bar)">link</a></p> @@ -7923,9 +7936,8 @@ perform the *Unicode case fold*, strip leading and trailing matching reference link definitions, the one that comes first in the document is used. (It is desirable in such cases to emit a warning.) 
-The contents of the first link label are parsed as inlines, which are -used as the link's text. The link's URI and title are provided by the -matching [link reference definition]. +The link's URI and title are provided by the matching [link +reference definition]. Here is a simple example: @@ -8018,11 +8030,11 @@ emphasis grouping: ```````````````````````````````` example -[foo *bar][ref] +[foo *bar][ref]* [ref]: /uri . -<p><a href="/uri">foo *bar</a></p> +<p><a href="/uri">foo *bar</a>*</p> ```````````````````````````````` @@ -8070,11 +8082,11 @@ Matching is case-insensitive: Unicode case fold is used: ```````````````````````````````` example -[Толпой][Толпой] is a Russian word. +[ẞ] -[ТОЛПОЙ]: /url +[SS]: /url . -<p><a href="/url">Толпой</a> is a Russian word.</p> +<p><a href="/url">ẞ</a></p> ```````````````````````````````` @@ -8707,9 +8719,9 @@ a link to the URI, with the URI as the link's label. An [absolute URI](@), for these purposes, consists of a [scheme] followed by a colon (`:`) -followed by zero or more characters other than ASCII -[whitespace] and control characters, `<`, and `>`. If -the URI includes these characters, they must be percent-encoded +followed by zero or more characters other than [ASCII control +characters][ASCII control character] or [whitespace][], `<`, and `>`. +If the URI includes these characters, they must be percent-encoded (e.g. `%20` for a space). For purposes of this spec, a [scheme](@) is any sequence @@ -8942,10 +8954,8 @@ consists of the string `<?`, a string of characters not including the string `?>`, and the string `?>`. -A [declaration](@) consists of the -string `<!`, a name consisting of one or more uppercase ASCII letters, -[whitespace], a string of characters not including the -character `>`, and the character `>`. +A [declaration](@) consists of the string `<!`, an ASCII letter, zero or more +characters not including the character `>`, and the character `>`.
A [CDATA section](@) consists of the string `<![CDATA[`, a string of characters not including the string @@ -9444,7 +9454,7 @@ blocks. But we cannot close unmatched blocks yet, because we may have a blocks, we look for new block starts (e.g. `>` for a block quote). If we encounter a new block start, we close any blocks unmatched in step 1 before creating the new block as a child of the last -matched block. +matched container block. 3. Finally, we look at the remainder of the line (after block markers like `>`, list markers, and indentation have been consumed).