cmark
My personal build of CMark
spec.txt (204505B)
---
title: CommonMark Spec
author: John MacFarlane
version: 0.29
date: '2019-04-06'
license: '[CC-BY-SA 4.0](http://creativecommons.org/licenses/by-sa/4.0/)'
...

# Introduction

## What is Markdown?

Markdown is a plain text format for writing structured documents,
based on conventions for indicating formatting in email
and usenet posts. It was developed by John Gruber (with
help from Aaron Swartz) and released in 2004 in the form of a
[syntax description](http://daringfireball.net/projects/markdown/syntax)
and a Perl script (`Markdown.pl`) for converting Markdown to
HTML. In the next decade, dozens of implementations were
developed in many languages. Some extended the original
Markdown syntax with conventions for footnotes, tables, and
other document elements. Some allowed Markdown documents to be
rendered in formats other than HTML. Websites like Reddit,
StackOverflow, and GitHub had millions of people using Markdown.
And Markdown started to be used beyond the web, to author books,
articles, slide shows, letters, and lecture notes.

What distinguishes Markdown from many other lightweight markup
syntaxes, which are often easier to write, is its readability.
As Gruber writes:

> The overriding design goal for Markdown's formatting syntax is
> to make it as readable as possible. The idea is that a
> Markdown-formatted document should be publishable as-is, as
> plain text, without looking like it's been marked up with tags
> or formatting instructions.
> (<http://daringfireball.net/projects/markdown/>)

The point can be illustrated by comparing a sample of
[AsciiDoc](http://www.methods.co.nz/asciidoc/) with
an equivalent sample of Markdown. Here is a sample of
AsciiDoc from the AsciiDoc manual:

```
1. List item one.
+
List item one continued with a second paragraph followed by an
Indented block.
+
.................
$ ls *.sh
$ mv *.sh ~/tmp
.................
+
List item continued with a third paragraph.

2. List item two continued with an open block.
+
--
This paragraph is part of the preceding list item.

a. This list is nested and does not require explicit item
continuation.
+
This paragraph is part of the preceding list item.

b. List item b.

This paragraph belongs to item two of the outer list.
--
```

And here is the equivalent in Markdown:
```
1.  List item one.

    List item one continued with a second paragraph followed by an
    Indented block.

        $ ls *.sh
        $ mv *.sh ~/tmp

    List item continued with a third paragraph.

2.  List item two continued with an open block.

    This paragraph is part of the preceding list item.

    1. This list is nested and does not require explicit item continuation.

       This paragraph is part of the preceding list item.

    2. List item b.

    This paragraph belongs to item two of the outer list.
```

The AsciiDoc version is, arguably, easier to write. You don't need
to worry about indentation. But the Markdown version is much easier
to read. The nesting of list items is apparent to the eye in the
source, not just in the processed document.

## Why is a spec needed?

John Gruber's [canonical description of Markdown's
syntax](http://daringfireball.net/projects/markdown/syntax)
does not specify the syntax unambiguously. Here are some examples of
questions it does not answer:

1.  How much indentation is needed for a sublist? The spec says that
    continuation paragraphs need to be indented four spaces, but is
    not fully explicit about sublists. It is natural to think that
    they, too, must be indented four spaces, but `Markdown.pl` does
    not require that.
This is hardly a "corner case," and divergences 115 between implementations on this issue often lead to surprises for 116 users in real documents. (See [this comment by John 117 Gruber](http://article.gmane.org/gmane.text.markdown.general/1997).) 118 119 2. Is a blank line needed before a block quote or heading? 120 Most implementations do not require the blank line. However, 121 this can lead to unexpected results in hard-wrapped text, and 122 also to ambiguities in parsing (note that some implementations 123 put the heading inside the blockquote, while others do not). 124 (John Gruber has also spoken [in favor of requiring the blank 125 lines](http://article.gmane.org/gmane.text.markdown.general/2146).) 126 127 3. Is a blank line needed before an indented code block? 128 (`Markdown.pl` requires it, but this is not mentioned in the 129 documentation, and some implementations do not require it.) 130 131 ``` markdown 132 paragraph 133 code? 134 ``` 135 136 4. What is the exact rule for determining when list items get 137 wrapped in `<p>` tags? Can a list be partially "loose" and partially 138 "tight"? What should we do with a list like this? 139 140 ``` markdown 141 1. one 142 143 2. two 144 3. three 145 ``` 146 147 Or this? 148 149 ``` markdown 150 1. one 151 - a 152 153 - b 154 2. two 155 ``` 156 157 (There are some relevant comments by John Gruber 158 [here](http://article.gmane.org/gmane.text.markdown.general/2554).) 159 160 5. Can list markers be indented? Can ordered list markers be right-aligned? 161 162 ``` markdown 163 8. item 1 164 9. item 2 165 10. item 2a 166 ``` 167 168 6. Is this one list with a thematic break in its second item, 169 or two lists separated by a thematic break? 170 171 ``` markdown 172 * a 173 * * * * * 174 * b 175 ``` 176 177 7. When list markers change from numbers to bullets, do we have 178 two lists or one? (The Markdown syntax description suggests two, 179 but the perl scripts and many other implementations produce one.) 
180 181 ``` markdown 182 1. fee 183 2. fie 184 - foe 185 - fum 186 ``` 187 188 8. What are the precedence rules for the markers of inline structure? 189 For example, is the following a valid link, or does the code span 190 take precedence ? 191 192 ``` markdown 193 [a backtick (`)](/url) and [another backtick (`)](/url). 194 ``` 195 196 9. What are the precedence rules for markers of emphasis and strong 197 emphasis? For example, how should the following be parsed? 198 199 ``` markdown 200 *foo *bar* baz* 201 ``` 202 203 10. What are the precedence rules between block-level and inline-level 204 structure? For example, how should the following be parsed? 205 206 ``` markdown 207 - `a long code span can contain a hyphen like this 208 - and it can screw things up` 209 ``` 210 211 11. Can list items include section headings? (`Markdown.pl` does not 212 allow this, but does allow blockquotes to include headings.) 213 214 ``` markdown 215 - # Heading 216 ``` 217 218 12. Can list items be empty? 219 220 ``` markdown 221 * a 222 * 223 * b 224 ``` 225 226 13. Can link references be defined inside block quotes or list items? 227 228 ``` markdown 229 > Blockquote [foo]. 230 > 231 > [foo]: /url 232 ``` 233 234 14. If there are multiple definitions for the same reference, which takes 235 precedence? 236 237 ``` markdown 238 [foo]: /url1 239 [foo]: /url2 240 241 [foo][] 242 ``` 243 244 In the absence of a spec, early implementers consulted `Markdown.pl` 245 to resolve these ambiguities. But `Markdown.pl` was quite buggy, and 246 gave manifestly bad results in many cases, so it was not a 247 satisfactory replacement for a spec. 248 249 Because there is no unambiguous spec, implementations have diverged 250 considerably. As a result, users are often surprised to find that 251 a document that renders one way on one system (say, a GitHub wiki) 252 renders differently on another (say, converting to docbook using 253 pandoc). 
To make matters worse, because nothing in Markdown counts
as a "syntax error," the divergence often isn't discovered right away.

## About this document

This document attempts to specify Markdown syntax unambiguously.
It contains many examples with side-by-side Markdown and
HTML. These are intended to double as conformance tests. An
accompanying script `spec_tests.py` can be used to run the tests
against any Markdown program:

    python test/spec_tests.py --spec spec.txt --program PROGRAM

Since this document describes how Markdown is to be parsed into
an abstract syntax tree, it would have made sense to use an abstract
representation of the syntax tree instead of HTML. But HTML is capable
of representing the structural distinctions we need to make, and the
choice of HTML for the tests makes it possible to run the tests against
an implementation without writing an abstract syntax tree renderer.

This document is generated from a text file, `spec.txt`, written
in Markdown with a small extension for the side-by-side tests.
The script `tools/makespec.py` can be used to convert `spec.txt` into
HTML or CommonMark (which can then be converted into other formats).

In the examples, the `→` character is used to represent tabs.

# Preliminaries

## Characters and lines

Any sequence of [characters] is a valid CommonMark
document.

A [character](@) is a Unicode code point. Although some
code points (for example, combining accents) do not correspond to
characters in an intuitive sense, all code points count as characters
for purposes of this spec.

This spec does not specify an encoding; it thinks of lines as composed
of [characters] rather than bytes. A conforming parser may be limited
to a certain encoding.
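
As a rough illustration of the difference between characters (code points)
and bytes, the following Python sketch counts both for a short string. It is
not part of the spec; the sample string and the choice of UTF-8 are
assumptions made for the example only.

```python
# Illustration only: a "character" in the sense used here is a Unicode
# code point, independent of how the document happens to be encoded.
text = "e\u0301"  # 'e' followed by U+0301 COMBINING ACUTE ACCENT

# One entry per code point, written in the U+XXXX notation used in this spec.
code_points = [f"U+{ord(ch):04X}" for ch in text]

# Byte length in one particular encoding (UTF-8, chosen arbitrarily).
utf8_length = len(text.encode("utf-8"))

print(len(text))     # number of code points
print(code_points)
print(utf8_length)   # number of bytes in UTF-8
```

The combining accent counts as a character of its own, even though a reader
would see a single accented letter.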

A [line](@) is a sequence of zero or more [characters]
other than line feed (`U+000A`) or carriage return (`U+000D`),
followed by a [line ending] or by the end of file.

A [line ending](@) is a line feed (`U+000A`), a carriage return
(`U+000D`) not followed by a line feed, or a carriage return and a
following line feed.

A line containing no characters, or a line containing only spaces
(`U+0020`) or tabs (`U+0009`), is called a [blank line](@).

The following definitions of character classes will be used in this spec:

A [Unicode whitespace character](@) is
any code point in the Unicode `Zs` general category, or a tab (`U+0009`),
line feed (`U+000A`), form feed (`U+000C`), or carriage return (`U+000D`).

[Unicode whitespace](@) is a sequence of one or more
[Unicode whitespace characters].

A [tab](@) is `U+0009`.

A [space](@) is `U+0020`.

An [ASCII control character](@) is a character in the range `U+0000–1F`
(inclusive) or `U+007F`.

An [ASCII punctuation character](@)
is `!`, `"`, `#`, `$`, `%`, `&`, `'`, `(`, `)`,
`*`, `+`, `,`, `-`, `.`, `/` (U+0021–2F),
`:`, `;`, `<`, `=`, `>`, `?`, `@` (U+003A–0040),
`[`, `\`, `]`, `^`, `_`, `` ` `` (U+005B–0060),
`{`, `|`, `}`, or `~` (U+007B–007E).

A [Unicode punctuation character](@) is an [ASCII
punctuation character] or anything in
the general Unicode categories `Pc`, `Pd`, `Pe`, `Pf`, `Pi`, `Po`, or `Ps`.

## Tabs

Tabs in lines are not expanded to [spaces]. However,
in contexts where spaces help to define block structure,
tabs behave as if they were replaced by spaces with a tab stop
of 4 characters.

Thus, for example, a tab can be used instead of four spaces
in an indented code block. (Note, however, that internal
tabs are passed through as literal tabs, not expanded to
spaces.)
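
The tab-stop rule can be sketched in a few lines of Python. This is a minimal
illustration, not how any particular implementation does it; the names
`TAB_STOP` and `indent_width` are hypothetical. It measures the column
reached by a line's leading spaces and tabs without rewriting the line, so
the tabs themselves stay literal.

```python
TAB_STOP = 4  # tab stop width stated in the text above


def indent_width(line: str) -> int:
    """Return the column reached after the leading spaces and tabs of `line`."""
    column = 0
    for ch in line:
        if ch == " ":
            column += 1
        elif ch == "\t":
            # A tab advances to the next multiple of TAB_STOP.
            column += TAB_STOP - (column % TAB_STOP)
        else:
            break
    return column


print(indent_width("\tfoo"))     # 4
print(indent_width("  \tfoo"))   # 4
print(indent_width("    foo"))   # 4
print(indent_width(" \t\tfoo"))  # 8
```

Two spaces followed by a tab reach the same column as a single tab, because
the tab only fills the distance to the next tab stop.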
345 346 ```````````````````````````````` example 347 →foo→baz→→bim 348 . 349 <pre><code>foo→baz→→bim 350 </code></pre> 351 ```````````````````````````````` 352 353 ```````````````````````````````` example 354 →foo→baz→→bim 355 . 356 <pre><code>foo→baz→→bim 357 </code></pre> 358 ```````````````````````````````` 359 360 ```````````````````````````````` example 361 a→a 362 ὐ→a 363 . 364 <pre><code>a→a 365 ὐ→a 366 </code></pre> 367 ```````````````````````````````` 368 369 In the following example, a continuation paragraph of a list 370 item is indented with a tab; this has exactly the same effect 371 as indentation with four spaces would: 372 373 ```````````````````````````````` example 374 - foo 375 376 →bar 377 . 378 <ul> 379 <li> 380 <p>foo</p> 381 <p>bar</p> 382 </li> 383 </ul> 384 ```````````````````````````````` 385 386 ```````````````````````````````` example 387 - foo 388 389 →→bar 390 . 391 <ul> 392 <li> 393 <p>foo</p> 394 <pre><code> bar 395 </code></pre> 396 </li> 397 </ul> 398 ```````````````````````````````` 399 400 Normally the `>` that begins a block quote may be followed 401 optionally by a space, which is not considered part of the 402 content. In the following case `>` is followed by a tab, 403 which is treated as if it were expanded into three spaces. 404 Since one of these spaces is considered part of the 405 delimiter, `foo` is considered to be indented six spaces 406 inside the block quote context, so we get an indented 407 code block starting with two spaces. 408 409 ```````````````````````````````` example 410 >→→foo 411 . 412 <blockquote> 413 <pre><code> foo 414 </code></pre> 415 </blockquote> 416 ```````````````````````````````` 417 418 ```````````````````````````````` example 419 -→→foo 420 . 421 <ul> 422 <li> 423 <pre><code> foo 424 </code></pre> 425 </li> 426 </ul> 427 ```````````````````````````````` 428 429 430 ```````````````````````````````` example 431 foo 432 →bar 433 . 
434 <pre><code>foo 435 bar 436 </code></pre> 437 ```````````````````````````````` 438 439 ```````````````````````````````` example 440 - foo 441 - bar 442 → - baz 443 . 444 <ul> 445 <li>foo 446 <ul> 447 <li>bar 448 <ul> 449 <li>baz</li> 450 </ul> 451 </li> 452 </ul> 453 </li> 454 </ul> 455 ```````````````````````````````` 456 457 ```````````````````````````````` example 458 #→Foo 459 . 460 <h1>Foo</h1> 461 ```````````````````````````````` 462 463 ```````````````````````````````` example 464 *→*→*→ 465 . 466 <hr /> 467 ```````````````````````````````` 468 469 470 ## Insecure characters 471 472 For security reasons, the Unicode character `U+0000` must be replaced 473 with the REPLACEMENT CHARACTER (`U+FFFD`). 474 475 476 ## Backslash escapes 477 478 Any ASCII punctuation character may be backslash-escaped: 479 480 ```````````````````````````````` example 481 \!\"\#\$\%\&\'\(\)\*\+\,\-\.\/\:\;\<\=\>\?\@\[\\\]\^\_\`\{\|\}\~ 482 . 483 <p>!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~</p> 484 ```````````````````````````````` 485 486 487 Backslashes before other characters are treated as literal 488 backslashes: 489 490 ```````````````````````````````` example 491 \→\A\a\ \3\φ\« 492 . 493 <p>\→\A\a\ \3\φ\«</p> 494 ```````````````````````````````` 495 496 497 Escaped characters are treated as regular characters and do 498 not have their usual Markdown meanings: 499 500 ```````````````````````````````` example 501 \*not emphasized* 502 \<br/> not a tag 503 \[not a link](/foo) 504 \`not code` 505 1\. not a list 506 \* not a list 507 \# not a heading 508 \[foo]: /url "not a reference" 509 \ö not a character entity 510 . 511 <p>*not emphasized* 512 <br/> not a tag 513 [not a link](/foo) 514 `not code` 515 1. 
not a list 516 * not a list 517 # not a heading 518 [foo]: /url "not a reference" 519 &ouml; not a character entity</p> 520 ```````````````````````````````` 521 522 523 If a backslash is itself escaped, the following character is not: 524 525 ```````````````````````````````` example 526 \\*emphasis* 527 . 528 <p>\<em>emphasis</em></p> 529 ```````````````````````````````` 530 531 532 A backslash at the end of the line is a [hard line break]: 533 534 ```````````````````````````````` example 535 foo\ 536 bar 537 . 538 <p>foo<br /> 539 bar</p> 540 ```````````````````````````````` 541 542 543 Backslash escapes do not work in code blocks, code spans, autolinks, or 544 raw HTML: 545 546 ```````````````````````````````` example 547 `` \[\` `` 548 . 549 <p><code>\[\`</code></p> 550 ```````````````````````````````` 551 552 553 ```````````````````````````````` example 554 \[\] 555 . 556 <pre><code>\[\] 557 </code></pre> 558 ```````````````````````````````` 559 560 561 ```````````````````````````````` example 562 ~~~ 563 \[\] 564 ~~~ 565 . 566 <pre><code>\[\] 567 </code></pre> 568 ```````````````````````````````` 569 570 571 ```````````````````````````````` example 572 <http://example.com?find=\*> 573 . 574 <p><a href="http://example.com?find=%5C*">http://example.com?find=\*</a></p> 575 ```````````````````````````````` 576 577 578 ```````````````````````````````` example 579 <a href="/bar\/)"> 580 . 581 <a href="/bar\/)"> 582 ```````````````````````````````` 583 584 585 But they work in all other contexts, including URLs and link titles, 586 link references, and [info strings] in [fenced code blocks]: 587 588 ```````````````````````````````` example 589 [foo](/bar\* "ti\*tle") 590 . 591 <p><a href="/bar*" title="ti*tle">foo</a></p> 592 ```````````````````````````````` 593 594 595 ```````````````````````````````` example 596 [foo] 597 598 [foo]: /bar\* "ti\*tle" 599 . 
600 <p><a href="/bar*" title="ti*tle">foo</a></p> 601 ```````````````````````````````` 602 603 604 ```````````````````````````````` example 605 ``` foo\+bar 606 foo 607 ``` 608 . 609 <pre><code class="language-foo+bar">foo 610 </code></pre> 611 ```````````````````````````````` 612 613 614 ## Entity and numeric character references 615 616 Valid HTML entity references and numeric character references 617 can be used in place of the corresponding Unicode character, 618 with the following exceptions: 619 620 - Entity and character references are not recognized in code 621 blocks and code spans. 622 623 - Entity and character references cannot stand in place of 624 special characters that define structural elements in 625 CommonMark. For example, although `*` can be used 626 in place of a literal `*` character, `*` cannot replace 627 `*` in emphasis delimiters, bullet list markers, or thematic 628 breaks. 629 630 Conforming CommonMark parsers need not store information about 631 whether a particular character was represented in the source 632 using a Unicode character or an entity reference. 633 634 [Entity references](@) consist of `&` + any of the valid 635 HTML5 entity names + `;`. The 636 document <https://html.spec.whatwg.org/entities.json> 637 is used as an authoritative source for the valid entity 638 references and their corresponding code points. 639 640 ```````````````````````````````` example 641 & © Æ Ď 642 ¾ ℋ ⅆ 643 ∲ ≧̸ 644 . 645 <p> & © Æ Ď 646 ¾ ℋ ⅆ 647 ∲ ≧̸</p> 648 ```````````````````````````````` 649 650 651 [Decimal numeric character 652 references](@) 653 consist of `&#` + a string of 1--7 arabic digits + `;`. A 654 numeric character reference is parsed as the corresponding 655 Unicode character. Invalid Unicode code points will be replaced by 656 the REPLACEMENT CHARACTER (`U+FFFD`). For security reasons, 657 the code point `U+0000` will also be replaced by `U+FFFD`. 658 659 ```````````````````````````````` example 660 # Ӓ Ϡ � 661 . 
662 <p># Ӓ Ϡ �</p> 663 ```````````````````````````````` 664 665 666 [Hexadecimal numeric character 667 references](@) consist of `&#` + 668 either `X` or `x` + a string of 1-6 hexadecimal digits + `;`. 669 They too are parsed as the corresponding Unicode character (this 670 time specified with a hexadecimal numeral instead of decimal). 671 672 ```````````````````````````````` example 673 " ആ ಫ 674 . 675 <p>" ആ ಫ</p> 676 ```````````````````````````````` 677 678 679 Here are some nonentities: 680 681 ```````````````````````````````` example 682   &x; &#; &#x; 683 � 684 &#abcdef0; 685 &ThisIsNotDefined; &hi?; 686 . 687 <p>&nbsp &x; &#; &#x; 688 &#87654321; 689 &#abcdef0; 690 &ThisIsNotDefined; &hi?;</p> 691 ```````````````````````````````` 692 693 694 Although HTML5 does accept some entity references 695 without a trailing semicolon (such as `©`), these are not 696 recognized here, because it makes the grammar too ambiguous: 697 698 ```````````````````````````````` example 699 © 700 . 701 <p>&copy</p> 702 ```````````````````````````````` 703 704 705 Strings that are not on the list of HTML5 named entities are not 706 recognized as entity references either: 707 708 ```````````````````````````````` example 709 &MadeUpEntity; 710 . 711 <p>&MadeUpEntity;</p> 712 ```````````````````````````````` 713 714 715 Entity and numeric character references are recognized in any 716 context besides code spans or code blocks, including 717 URLs, [link titles], and [fenced code block][] [info strings]: 718 719 ```````````````````````````````` example 720 <a href="öö.html"> 721 . 722 <a href="öö.html"> 723 ```````````````````````````````` 724 725 726 ```````````````````````````````` example 727 [foo](/föö "föö") 728 . 729 <p><a href="/f%C3%B6%C3%B6" title="föö">foo</a></p> 730 ```````````````````````````````` 731 732 733 ```````````````````````````````` example 734 [foo] 735 736 [foo]: /föö "föö" 737 . 
738 <p><a href="/f%C3%B6%C3%B6" title="föö">foo</a></p> 739 ```````````````````````````````` 740 741 742 ```````````````````````````````` example 743 ``` föö 744 foo 745 ``` 746 . 747 <pre><code class="language-föö">foo 748 </code></pre> 749 ```````````````````````````````` 750 751 752 Entity and numeric character references are treated as literal 753 text in code spans and code blocks: 754 755 ```````````````````````````````` example 756 `föö` 757 . 758 <p><code>f&ouml;&ouml;</code></p> 759 ```````````````````````````````` 760 761 762 ```````````````````````````````` example 763 föfö 764 . 765 <pre><code>f&ouml;f&ouml; 766 </code></pre> 767 ```````````````````````````````` 768 769 770 Entity and numeric character references cannot be used 771 in place of symbols indicating structure in CommonMark 772 documents. 773 774 ```````````````````````````````` example 775 *foo* 776 *foo* 777 . 778 <p>*foo* 779 <em>foo</em></p> 780 ```````````````````````````````` 781 782 ```````````````````````````````` example 783 * foo 784 785 * foo 786 . 787 <p>* foo</p> 788 <ul> 789 <li>foo</li> 790 </ul> 791 ```````````````````````````````` 792 793 ```````````````````````````````` example 794 foo bar 795 . 796 <p>foo 797 798 bar</p> 799 ```````````````````````````````` 800 801 ```````````````````````````````` example 802 	foo 803 . 804 <p>→foo</p> 805 ```````````````````````````````` 806 807 808 ```````````````````````````````` example 809 [a](url "tit") 810 . 811 <p>[a](url "tit")</p> 812 ```````````````````````````````` 813 814 815 816 # Blocks and inlines 817 818 We can think of a document as a sequence of 819 [blocks](@)---structural elements like paragraphs, block 820 quotations, lists, headings, rules, and code blocks. Some blocks (like 821 block quotes and list items) contain other blocks; others (like 822 headings and paragraphs) contain [inline](@) content---text, 823 links, emphasized text, images, code spans, and so on. 
824 825 ## Precedence 826 827 Indicators of block structure always take precedence over indicators 828 of inline structure. So, for example, the following is a list with 829 two items, not a list with one item containing a code span: 830 831 ```````````````````````````````` example 832 - `one 833 - two` 834 . 835 <ul> 836 <li>`one</li> 837 <li>two`</li> 838 </ul> 839 ```````````````````````````````` 840 841 842 This means that parsing can proceed in two steps: first, the block 843 structure of the document can be discerned; second, text lines inside 844 paragraphs, headings, and other block constructs can be parsed for inline 845 structure. The second step requires information about link reference 846 definitions that will be available only at the end of the first 847 step. Note that the first step requires processing lines in sequence, 848 but the second can be parallelized, since the inline parsing of 849 one block element does not affect the inline parsing of any other. 850 851 ## Container blocks and leaf blocks 852 853 We can divide blocks into two types: 854 [container blocks](@), 855 which can contain other blocks, and [leaf blocks](@), 856 which cannot. 857 858 # Leaf blocks 859 860 This section describes the different kinds of leaf block that make up a 861 Markdown document. 862 863 ## Thematic breaks 864 865 A line consisting of optionally up to three spaces of indentation, followed by a 866 sequence of three or more matching `-`, `_`, or `*` characters, each followed 867 optionally by any number of spaces or tabs, forms a 868 [thematic break](@). 869 870 ```````````````````````````````` example 871 *** 872 --- 873 ___ 874 . 875 <hr /> 876 <hr /> 877 <hr /> 878 ```````````````````````````````` 879 880 881 Wrong characters: 882 883 ```````````````````````````````` example 884 +++ 885 . 886 <p>+++</p> 887 ```````````````````````````````` 888 889 890 ```````````````````````````````` example 891 === 892 . 
893 <p>===</p> 894 ```````````````````````````````` 895 896 897 Not enough characters: 898 899 ```````````````````````````````` example 900 -- 901 ** 902 __ 903 . 904 <p>-- 905 ** 906 __</p> 907 ```````````````````````````````` 908 909 910 Up to three spaces of indentation are allowed: 911 912 ```````````````````````````````` example 913 *** 914 *** 915 *** 916 . 917 <hr /> 918 <hr /> 919 <hr /> 920 ```````````````````````````````` 921 922 923 Four spaces of indentation is too many: 924 925 ```````````````````````````````` example 926 *** 927 . 928 <pre><code>*** 929 </code></pre> 930 ```````````````````````````````` 931 932 933 ```````````````````````````````` example 934 Foo 935 *** 936 . 937 <p>Foo 938 ***</p> 939 ```````````````````````````````` 940 941 942 More than three characters may be used: 943 944 ```````````````````````````````` example 945 _____________________________________ 946 . 947 <hr /> 948 ```````````````````````````````` 949 950 951 Spaces and tabs are allowed between the characters: 952 953 ```````````````````````````````` example 954 - - - 955 . 956 <hr /> 957 ```````````````````````````````` 958 959 960 ```````````````````````````````` example 961 ** * ** * ** * ** 962 . 963 <hr /> 964 ```````````````````````````````` 965 966 967 ```````````````````````````````` example 968 - - - - 969 . 970 <hr /> 971 ```````````````````````````````` 972 973 974 Spaces and tabs are allowed at the end: 975 976 ```````````````````````````````` example 977 - - - - 978 . 979 <hr /> 980 ```````````````````````````````` 981 982 983 However, no other characters may occur in the line: 984 985 ```````````````````````````````` example 986 _ _ _ _ a 987 988 a------ 989 990 ---a--- 991 . 992 <p>_ _ _ _ a</p> 993 <p>a------</p> 994 <p>---a---</p> 995 ```````````````````````````````` 996 997 998 It is required that all of the characters other than spaces or tabs be the same. 
999 So, this is not a thematic break: 1000 1001 ```````````````````````````````` example 1002 *-* 1003 . 1004 <p><em>-</em></p> 1005 ```````````````````````````````` 1006 1007 1008 Thematic breaks do not need blank lines before or after: 1009 1010 ```````````````````````````````` example 1011 - foo 1012 *** 1013 - bar 1014 . 1015 <ul> 1016 <li>foo</li> 1017 </ul> 1018 <hr /> 1019 <ul> 1020 <li>bar</li> 1021 </ul> 1022 ```````````````````````````````` 1023 1024 1025 Thematic breaks can interrupt a paragraph: 1026 1027 ```````````````````````````````` example 1028 Foo 1029 *** 1030 bar 1031 . 1032 <p>Foo</p> 1033 <hr /> 1034 <p>bar</p> 1035 ```````````````````````````````` 1036 1037 1038 If a line of dashes that meets the above conditions for being a 1039 thematic break could also be interpreted as the underline of a [setext 1040 heading], the interpretation as a 1041 [setext heading] takes precedence. Thus, for example, 1042 this is a setext heading, not a paragraph followed by a thematic break: 1043 1044 ```````````````````````````````` example 1045 Foo 1046 --- 1047 bar 1048 . 1049 <h2>Foo</h2> 1050 <p>bar</p> 1051 ```````````````````````````````` 1052 1053 1054 When both a thematic break and a list item are possible 1055 interpretations of a line, the thematic break takes precedence: 1056 1057 ```````````````````````````````` example 1058 * Foo 1059 * * * 1060 * Bar 1061 . 1062 <ul> 1063 <li>Foo</li> 1064 </ul> 1065 <hr /> 1066 <ul> 1067 <li>Bar</li> 1068 </ul> 1069 ```````````````````````````````` 1070 1071 1072 If you want a thematic break in a list item, use a different bullet: 1073 1074 ```````````````````````````````` example 1075 - Foo 1076 - * * * 1077 . 
1078 <ul> 1079 <li>Foo</li> 1080 <li> 1081 <hr /> 1082 </li> 1083 </ul> 1084 ```````````````````````````````` 1085 1086 1087 ## ATX headings 1088 1089 An [ATX heading](@) 1090 consists of a string of characters, parsed as inline content, between an 1091 opening sequence of 1--6 unescaped `#` characters and an optional 1092 closing sequence of any number of unescaped `#` characters. 1093 The opening sequence of `#` characters must be followed by spaces or tabs, or 1094 by the end of line. The optional closing sequence of `#`s must be preceded by 1095 spaces or tabs and may be followed by spaces or tabs only. The opening 1096 `#` character may be preceded by up to three spaces of indentation. The raw 1097 contents of the heading are stripped of leading and trailing space or tabs 1098 before being parsed as inline content. The heading level is equal to the number 1099 of `#` characters in the opening sequence. 1100 1101 Simple headings: 1102 1103 ```````````````````````````````` example 1104 # foo 1105 ## foo 1106 ### foo 1107 #### foo 1108 ##### foo 1109 ###### foo 1110 . 1111 <h1>foo</h1> 1112 <h2>foo</h2> 1113 <h3>foo</h3> 1114 <h4>foo</h4> 1115 <h5>foo</h5> 1116 <h6>foo</h6> 1117 ```````````````````````````````` 1118 1119 1120 More than six `#` characters is not a heading: 1121 1122 ```````````````````````````````` example 1123 ####### foo 1124 . 1125 <p>####### foo</p> 1126 ```````````````````````````````` 1127 1128 1129 At least one space or tab is required between the `#` characters and the 1130 heading's contents, unless the heading is empty. Note that many 1131 implementations currently do not require the space. However, the 1132 space was required by the 1133 [original ATX implementation](http://www.aaronsw.com/2002/atx/atx.py), 1134 and it helps prevent things like the following from being parsed as 1135 headings: 1136 1137 ```````````````````````````````` example 1138 #5 bolt 1139 1140 #hashtag 1141 . 
1142 <p>#5 bolt</p> 1143 <p>#hashtag</p> 1144 ```````````````````````````````` 1145 1146 1147 This is not a heading, because the first `#` is escaped: 1148 1149 ```````````````````````````````` example 1150 \## foo 1151 . 1152 <p>## foo</p> 1153 ```````````````````````````````` 1154 1155 1156 Contents are parsed as inlines: 1157 1158 ```````````````````````````````` example 1159 # foo *bar* \*baz\* 1160 . 1161 <h1>foo <em>bar</em> *baz*</h1> 1162 ```````````````````````````````` 1163 1164 1165 Leading and trailing spaces or tabs are ignored in parsing inline content: 1166 1167 ```````````````````````````````` example 1168 # foo 1169 . 1170 <h1>foo</h1> 1171 ```````````````````````````````` 1172 1173 1174 Up to three spaces of indentation are allowed: 1175 1176 ```````````````````````````````` example 1177 ### foo 1178 ## foo 1179 # foo 1180 . 1181 <h3>foo</h3> 1182 <h2>foo</h2> 1183 <h1>foo</h1> 1184 ```````````````````````````````` 1185 1186 1187 Four spaces of indentation is too many: 1188 1189 ```````````````````````````````` example 1190 # foo 1191 . 1192 <pre><code># foo 1193 </code></pre> 1194 ```````````````````````````````` 1195 1196 1197 ```````````````````````````````` example 1198 foo 1199 # bar 1200 . 1201 <p>foo 1202 # bar</p> 1203 ```````````````````````````````` 1204 1205 1206 A closing sequence of `#` characters is optional: 1207 1208 ```````````````````````````````` example 1209 ## foo ## 1210 ### bar ### 1211 . 1212 <h2>foo</h2> 1213 <h3>bar</h3> 1214 ```````````````````````````````` 1215 1216 1217 It need not be the same length as the opening sequence: 1218 1219 ```````````````````````````````` example 1220 # foo ################################## 1221 ##### foo ## 1222 . 1223 <h1>foo</h1> 1224 <h5>foo</h5> 1225 ```````````````````````````````` 1226 1227 1228 Spaces or tabs are allowed after the closing sequence: 1229 1230 ```````````````````````````````` example 1231 ### foo ### 1232 . 
1233 <h3>foo</h3> 1234 ```````````````````````````````` 1235 1236 1237 A sequence of `#` characters with anything but spaces or tabs following it 1238 is not a closing sequence, but counts as part of the contents of the 1239 heading: 1240 1241 ```````````````````````````````` example 1242 ### foo ### b 1243 . 1244 <h3>foo ### b</h3> 1245 ```````````````````````````````` 1246 1247 1248 The closing sequence must be preceded by a space or tab: 1249 1250 ```````````````````````````````` example 1251 # foo# 1252 . 1253 <h1>foo#</h1> 1254 ```````````````````````````````` 1255 1256 1257 Backslash-escaped `#` characters do not count as part 1258 of the closing sequence: 1259 1260 ```````````````````````````````` example 1261 ### foo \### 1262 ## foo #\## 1263 # foo \# 1264 . 1265 <h3>foo ###</h3> 1266 <h2>foo ###</h2> 1267 <h1>foo #</h1> 1268 ```````````````````````````````` 1269 1270 1271 ATX headings need not be separated from surrounding content by blank 1272 lines, and they can interrupt paragraphs: 1273 1274 ```````````````````````````````` example 1275 **** 1276 ## foo 1277 **** 1278 . 1279 <hr /> 1280 <h2>foo</h2> 1281 <hr /> 1282 ```````````````````````````````` 1283 1284 1285 ```````````````````````````````` example 1286 Foo bar 1287 # baz 1288 Bar foo 1289 . 1290 <p>Foo bar</p> 1291 <h1>baz</h1> 1292 <p>Bar foo</p> 1293 ```````````````````````````````` 1294 1295 1296 ATX headings can be empty: 1297 1298 ```````````````````````````````` example 1299 ## 1300 # 1301 ### ### 1302 . 1303 <h2></h2> 1304 <h1></h1> 1305 <h3></h3> 1306 ```````````````````````````````` 1307 1308 1309 ## Setext headings 1310 1311 A [setext heading](@) consists of one or more 1312 lines of text, not interrupted by a blank line, of which the first line does not 1313 have more than 3 spaces of indentation, followed by 1314 a [setext heading underline]. 
The lines of text must be such
that, were they not followed by the setext heading underline,
they would be interpreted as a paragraph: they cannot be
interpretable as a [code fence], [ATX heading][ATX headings],
[block quote][block quotes], [thematic break][thematic breaks],
[list item][list items], or [HTML block][HTML blocks].

A [setext heading underline](@) is a sequence of
`=` characters or a sequence of `-` characters, with no more than 3
spaces of indentation and any number of trailing spaces or tabs. If a line
containing a single `-` can be interpreted as an
empty [list item][list items], it should be interpreted this way
and not as a [setext heading underline].

The heading is a level 1 heading if `=` characters are used in
the [setext heading underline], and a level 2 heading if `-`
characters are used. The contents of the heading are the result
of parsing the preceding lines of text as CommonMark inline
content.

In general, a setext heading need not be preceded or followed by a
blank line. However, it cannot interrupt a paragraph, so when a
setext heading comes after a paragraph, a blank line is needed between
them.

Simple examples:

```````````````````````````````` example
Foo *bar*
=========

Foo *bar*
---------
.
<h1>Foo <em>bar</em></h1>
<h2>Foo <em>bar</em></h2>
````````````````````````````````


The content of the heading may span more than one line:

```````````````````````````````` example
Foo *bar
baz*
====
.
<h1>Foo <em>bar
baz</em></h1>
````````````````````````````````

The contents are the result of parsing the heading's raw
content as inlines. The heading's raw content is formed by
concatenating the lines and removing initial and final
spaces or tabs.
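
The rules above for recognizing an underline and forming the raw content
can be sketched in Python. This is only an illustration under stated
assumptions, not part of the spec and not a description of how any
particular implementation works. The names `setext_underline_level` and
`setext_raw_content` are hypothetical. The sketch looks at lines in
isolation: it does not check whether a lone `-` would be an empty list
item, nor whether the preceding lines would really form a paragraph.

```python
import re

# Assumed helper pattern: up to three spaces of indentation, a run made
# only of `=` or only of `-`, then any number of trailing spaces or tabs.
_UNDERLINE = re.compile(r"^ {0,3}(=+|-+)[ \t]*$")


def setext_underline_level(line):
    """Return 1 for an `=` underline, 2 for a `-` underline, else None."""
    match = _UNDERLINE.match(line)
    if match is None:
        return None
    return 1 if match.group(1)[0] == "=" else 2


def setext_raw_content(lines):
    """Concatenate the text lines, then strip initial and final spaces or tabs."""
    return "\n".join(lines).strip(" \t")
```

A real parser would call something like `setext_underline_level` only
after deciding that the preceding lines are paragraph text, since block
structure is resolved before inline content.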
1368 1369 ```````````````````````````````` example 1370 Foo *bar 1371 baz*→ 1372 ==== 1373 . 1374 <h1>Foo <em>bar 1375 baz</em></h1> 1376 ```````````````````````````````` 1377 1378 1379 The underlining can be any length: 1380 1381 ```````````````````````````````` example 1382 Foo 1383 ------------------------- 1384 1385 Foo 1386 = 1387 . 1388 <h2>Foo</h2> 1389 <h1>Foo</h1> 1390 ```````````````````````````````` 1391 1392 1393 The heading content can be preceded by up to three spaces of indentation, and 1394 need not line up with the underlining: 1395 1396 ```````````````````````````````` example 1397 Foo 1398 --- 1399 1400 Foo 1401 ----- 1402 1403 Foo 1404 === 1405 . 1406 <h2>Foo</h2> 1407 <h2>Foo</h2> 1408 <h1>Foo</h1> 1409 ```````````````````````````````` 1410 1411 1412 Four spaces of indentation is too many: 1413 1414 ```````````````````````````````` example 1415 Foo 1416 --- 1417 1418 Foo 1419 --- 1420 . 1421 <pre><code>Foo 1422 --- 1423 1424 Foo 1425 </code></pre> 1426 <hr /> 1427 ```````````````````````````````` 1428 1429 1430 The setext heading underline can be preceded by up to three spaces of 1431 indentation, and may have trailing spaces or tabs: 1432 1433 ```````````````````````````````` example 1434 Foo 1435 ---- 1436 . 1437 <h2>Foo</h2> 1438 ```````````````````````````````` 1439 1440 1441 Four spaces of indentation is too many: 1442 1443 ```````````````````````````````` example 1444 Foo 1445 --- 1446 . 1447 <p>Foo 1448 ---</p> 1449 ```````````````````````````````` 1450 1451 1452 The setext heading underline cannot contain internal spaces or tabs: 1453 1454 ```````````````````````````````` example 1455 Foo 1456 = = 1457 1458 Foo 1459 --- - 1460 . 1461 <p>Foo 1462 = =</p> 1463 <p>Foo</p> 1464 <hr /> 1465 ```````````````````````````````` 1466 1467 1468 Trailing spaces or tabs in the content line do not cause a hard line break: 1469 1470 ```````````````````````````````` example 1471 Foo 1472 ----- 1473 . 
1474 <h2>Foo</h2> 1475 ```````````````````````````````` 1476 1477 1478 Nor does a backslash at the end: 1479 1480 ```````````````````````````````` example 1481 Foo\ 1482 ---- 1483 . 1484 <h2>Foo\</h2> 1485 ```````````````````````````````` 1486 1487 1488 Since indicators of block structure take precedence over 1489 indicators of inline structure, the following are setext headings: 1490 1491 ```````````````````````````````` example 1492 `Foo 1493 ---- 1494 ` 1495 1496 <a title="a lot 1497 --- 1498 of dashes"/> 1499 . 1500 <h2>`Foo</h2> 1501 <p>`</p> 1502 <h2><a title="a lot</h2> 1503 <p>of dashes"/></p> 1504 ```````````````````````````````` 1505 1506 1507 The setext heading underline cannot be a [lazy continuation 1508 line] in a list item or block quote: 1509 1510 ```````````````````````````````` example 1511 > Foo 1512 --- 1513 . 1514 <blockquote> 1515 <p>Foo</p> 1516 </blockquote> 1517 <hr /> 1518 ```````````````````````````````` 1519 1520 1521 ```````````````````````````````` example 1522 > foo 1523 bar 1524 === 1525 . 1526 <blockquote> 1527 <p>foo 1528 bar 1529 ===</p> 1530 </blockquote> 1531 ```````````````````````````````` 1532 1533 1534 ```````````````````````````````` example 1535 - Foo 1536 --- 1537 . 1538 <ul> 1539 <li>Foo</li> 1540 </ul> 1541 <hr /> 1542 ```````````````````````````````` 1543 1544 1545 A blank line is needed between a paragraph and a following 1546 setext heading, since otherwise the paragraph becomes part 1547 of the heading's content: 1548 1549 ```````````````````````````````` example 1550 Foo 1551 Bar 1552 --- 1553 . 1554 <h2>Foo 1555 Bar</h2> 1556 ```````````````````````````````` 1557 1558 1559 But in general a blank line is not required before or after 1560 setext headings: 1561 1562 ```````````````````````````````` example 1563 --- 1564 Foo 1565 --- 1566 Bar 1567 --- 1568 Baz 1569 . 
1570 <hr /> 1571 <h2>Foo</h2> 1572 <h2>Bar</h2> 1573 <p>Baz</p> 1574 ```````````````````````````````` 1575 1576 1577 Setext headings cannot be empty: 1578 1579 ```````````````````````````````` example 1580 1581 ==== 1582 . 1583 <p>====</p> 1584 ```````````````````````````````` 1585 1586 1587 Setext heading text lines must not be interpretable as block 1588 constructs other than paragraphs. So, the line of dashes 1589 in these examples gets interpreted as a thematic break: 1590 1591 ```````````````````````````````` example 1592 --- 1593 --- 1594 . 1595 <hr /> 1596 <hr /> 1597 ```````````````````````````````` 1598 1599 1600 ```````````````````````````````` example 1601 - foo 1602 ----- 1603 . 1604 <ul> 1605 <li>foo</li> 1606 </ul> 1607 <hr /> 1608 ```````````````````````````````` 1609 1610 1611 ```````````````````````````````` example 1612 foo 1613 --- 1614 . 1615 <pre><code>foo 1616 </code></pre> 1617 <hr /> 1618 ```````````````````````````````` 1619 1620 1621 ```````````````````````````````` example 1622 > foo 1623 ----- 1624 . 1625 <blockquote> 1626 <p>foo</p> 1627 </blockquote> 1628 <hr /> 1629 ```````````````````````````````` 1630 1631 1632 If you want a heading with `> foo` as its literal text, you can 1633 use backslash escapes: 1634 1635 ```````````````````````````````` example 1636 \> foo 1637 ------ 1638 . 1639 <h2>> foo</h2> 1640 ```````````````````````````````` 1641 1642 1643 **Compatibility note:** Most existing Markdown implementations 1644 do not allow the text of setext headings to span multiple lines. 1645 But there is no consensus about how to interpret 1646 1647 ``` markdown 1648 Foo 1649 bar 1650 --- 1651 baz 1652 ``` 1653 1654 One can find four different interpretations: 1655 1656 1. paragraph "Foo", heading "bar", paragraph "baz" 1657 2. paragraph "Foo bar", thematic break, paragraph "baz" 1658 3. paragraph "Foo bar --- baz" 1659 4. 
heading "Foo bar", paragraph "baz" 1660 1661 We find interpretation 4 most natural, and interpretation 4 1662 increases the expressive power of CommonMark, by allowing 1663 multiline headings. Authors who want interpretation 1 can 1664 put a blank line after the first paragraph: 1665 1666 ```````````````````````````````` example 1667 Foo 1668 1669 bar 1670 --- 1671 baz 1672 . 1673 <p>Foo</p> 1674 <h2>bar</h2> 1675 <p>baz</p> 1676 ```````````````````````````````` 1677 1678 1679 Authors who want interpretation 2 can put blank lines around 1680 the thematic break, 1681 1682 ```````````````````````````````` example 1683 Foo 1684 bar 1685 1686 --- 1687 1688 baz 1689 . 1690 <p>Foo 1691 bar</p> 1692 <hr /> 1693 <p>baz</p> 1694 ```````````````````````````````` 1695 1696 1697 or use a thematic break that cannot count as a [setext heading 1698 underline], such as 1699 1700 ```````````````````````````````` example 1701 Foo 1702 bar 1703 * * * 1704 baz 1705 . 1706 <p>Foo 1707 bar</p> 1708 <hr /> 1709 <p>baz</p> 1710 ```````````````````````````````` 1711 1712 1713 Authors who want interpretation 3 can use backslash escapes: 1714 1715 ```````````````````````````````` example 1716 Foo 1717 bar 1718 \--- 1719 baz 1720 . 1721 <p>Foo 1722 bar 1723 --- 1724 baz</p> 1725 ```````````````````````````````` 1726 1727 1728 ## Indented code blocks 1729 1730 An [indented code block](@) is composed of one or more 1731 [indented chunks] separated by blank lines. 1732 An [indented chunk](@) is a sequence of non-blank lines, 1733 each preceded by four or more spaces of indentation. The contents of the code 1734 block are the literal contents of the lines, including trailing 1735 [line endings], minus four spaces of indentation. 1736 An indented code block has no [info string]. 1737 1738 An indented code block cannot interrupt a paragraph, so there must be 1739 a blank line between a paragraph and a following indented code block. 
1740 (A blank line is not needed, however, between a code block and a following 1741 paragraph.) 1742 1743 ```````````````````````````````` example 1744 a simple 1745 indented code block 1746 . 1747 <pre><code>a simple 1748 indented code block 1749 </code></pre> 1750 ```````````````````````````````` 1751 1752 1753 If there is any ambiguity between an interpretation of indentation 1754 as a code block and as indicating that material belongs to a [list 1755 item][list items], the list item interpretation takes precedence: 1756 1757 ```````````````````````````````` example 1758 - foo 1759 1760 bar 1761 . 1762 <ul> 1763 <li> 1764 <p>foo</p> 1765 <p>bar</p> 1766 </li> 1767 </ul> 1768 ```````````````````````````````` 1769 1770 1771 ```````````````````````````````` example 1772 1. foo 1773 1774 - bar 1775 . 1776 <ol> 1777 <li> 1778 <p>foo</p> 1779 <ul> 1780 <li>bar</li> 1781 </ul> 1782 </li> 1783 </ol> 1784 ```````````````````````````````` 1785 1786 1787 1788 The contents of a code block are literal text, and do not get parsed 1789 as Markdown: 1790 1791 ```````````````````````````````` example 1792 <a/> 1793 *hi* 1794 1795 - one 1796 . 1797 <pre><code><a/> 1798 *hi* 1799 1800 - one 1801 </code></pre> 1802 ```````````````````````````````` 1803 1804 1805 Here we have three chunks separated by blank lines: 1806 1807 ```````````````````````````````` example 1808 chunk1 1809 1810 chunk2 1811 1812 1813 1814 chunk3 1815 . 1816 <pre><code>chunk1 1817 1818 chunk2 1819 1820 1821 1822 chunk3 1823 </code></pre> 1824 ```````````````````````````````` 1825 1826 1827 Any initial spaces or tabs beyond four spaces of indentation will be included in 1828 the content, even in interior blank lines: 1829 1830 ```````````````````````````````` example 1831 chunk1 1832 1833 chunk2 1834 . 1835 <pre><code>chunk1 1836 1837 chunk2 1838 </code></pre> 1839 ```````````````````````````````` 1840 1841 1842 An indented code block cannot interrupt a paragraph. 
(This 1843 allows hanging indents and the like.) 1844 1845 ```````````````````````````````` example 1846 Foo 1847 bar 1848 1849 . 1850 <p>Foo 1851 bar</p> 1852 ```````````````````````````````` 1853 1854 1855 However, any non-blank line with fewer than four spaces of indentation ends 1856 the code block immediately. So a paragraph may occur immediately 1857 after indented code: 1858 1859 ```````````````````````````````` example 1860 foo 1861 bar 1862 . 1863 <pre><code>foo 1864 </code></pre> 1865 <p>bar</p> 1866 ```````````````````````````````` 1867 1868 1869 And indented code can occur immediately before and after other kinds of 1870 blocks: 1871 1872 ```````````````````````````````` example 1873 # Heading 1874 foo 1875 Heading 1876 ------ 1877 foo 1878 ---- 1879 . 1880 <h1>Heading</h1> 1881 <pre><code>foo 1882 </code></pre> 1883 <h2>Heading</h2> 1884 <pre><code>foo 1885 </code></pre> 1886 <hr /> 1887 ```````````````````````````````` 1888 1889 1890 The first line can be preceded by more than four spaces of indentation: 1891 1892 ```````````````````````````````` example 1893 foo 1894 bar 1895 . 1896 <pre><code> foo 1897 bar 1898 </code></pre> 1899 ```````````````````````````````` 1900 1901 1902 Blank lines preceding or following an indented code block 1903 are not included in it: 1904 1905 ```````````````````````````````` example 1906 1907 1908 foo 1909 1910 1911 . 1912 <pre><code>foo 1913 </code></pre> 1914 ```````````````````````````````` 1915 1916 1917 Trailing spaces or tabs are included in the code block's content: 1918 1919 ```````````````````````````````` example 1920 foo 1921 . 1922 <pre><code>foo 1923 </code></pre> 1924 ```````````````````````````````` 1925 1926 1927 1928 ## Fenced code blocks 1929 1930 A [code fence](@) is a sequence 1931 of at least three consecutive backtick characters (`` ` ``) or 1932 tildes (`~`). (Tildes and backticks cannot be mixed.) 
1933 A [fenced code block](@) 1934 begins with a code fence, preceded by up to three spaces of indentation. 1935 1936 The line with the opening code fence may optionally contain some text 1937 following the code fence; this is trimmed of leading and trailing 1938 spaces or tabs and called the [info string](@). If the [info string] comes 1939 after a backtick fence, it may not contain any backtick 1940 characters. (The reason for this restriction is that otherwise 1941 some inline code would be incorrectly interpreted as the 1942 beginning of a fenced code block.) 1943 1944 The content of the code block consists of all subsequent lines, until 1945 a closing [code fence] of the same type as the code block 1946 began with (backticks or tildes), and with at least as many backticks 1947 or tildes as the opening code fence. If the leading code fence is 1948 preceded by N spaces of indentation, then up to N spaces of indentation are 1949 removed from each line of the content (if present). (If a content line is not 1950 indented, it is preserved unchanged. If it is indented N spaces or less, all 1951 of the indentation is removed.) 1952 1953 The closing code fence may be preceded by up to three spaces of indentation, and 1954 may be followed only by spaces or tabs, which are ignored. If the end of the 1955 containing block (or document) is reached and no closing code fence 1956 has been found, the code block contains all of the lines after the 1957 opening code fence until the end of the containing block (or 1958 document). (An alternative spec would require backtracking in the 1959 event that a closing code fence is not found. But this makes parsing 1960 much less efficient, and there seems to be no real down side to the 1961 behavior described here.) 1962 1963 A fenced code block may interrupt a paragraph, and does not require 1964 a blank line either before or after. 1965 1966 The content of a code fence is treated as literal text, not parsed 1967 as inlines. 
The first word of the [info string] is typically used to 1968 specify the language of the code sample, and rendered in the `class` 1969 attribute of the `code` tag. However, this spec does not mandate any 1970 particular treatment of the [info string]. 1971 1972 Here is a simple example with backticks: 1973 1974 ```````````````````````````````` example 1975 ``` 1976 < 1977 > 1978 ``` 1979 . 1980 <pre><code>< 1981 > 1982 </code></pre> 1983 ```````````````````````````````` 1984 1985 1986 With tildes: 1987 1988 ```````````````````````````````` example 1989 ~~~ 1990 < 1991 > 1992 ~~~ 1993 . 1994 <pre><code>< 1995 > 1996 </code></pre> 1997 ```````````````````````````````` 1998 1999 Fewer than three backticks is not enough: 2000 2001 ```````````````````````````````` example 2002 `` 2003 foo 2004 `` 2005 . 2006 <p><code>foo</code></p> 2007 ```````````````````````````````` 2008 2009 The closing code fence must use the same character as the opening 2010 fence: 2011 2012 ```````````````````````````````` example 2013 ``` 2014 aaa 2015 ~~~ 2016 ``` 2017 . 2018 <pre><code>aaa 2019 ~~~ 2020 </code></pre> 2021 ```````````````````````````````` 2022 2023 2024 ```````````````````````````````` example 2025 ~~~ 2026 aaa 2027 ``` 2028 ~~~ 2029 . 2030 <pre><code>aaa 2031 ``` 2032 </code></pre> 2033 ```````````````````````````````` 2034 2035 2036 The closing code fence must be at least as long as the opening fence: 2037 2038 ```````````````````````````````` example 2039 ```` 2040 aaa 2041 ``` 2042 `````` 2043 . 2044 <pre><code>aaa 2045 ``` 2046 </code></pre> 2047 ```````````````````````````````` 2048 2049 2050 ```````````````````````````````` example 2051 ~~~~ 2052 aaa 2053 ~~~ 2054 ~~~~ 2055 . 
2056 <pre><code>aaa 2057 ~~~ 2058 </code></pre> 2059 ```````````````````````````````` 2060 2061 2062 Unclosed code blocks are closed by the end of the document 2063 (or the enclosing [block quote][block quotes] or [list item][list items]): 2064 2065 ```````````````````````````````` example 2066 ``` 2067 . 2068 <pre><code></code></pre> 2069 ```````````````````````````````` 2070 2071 2072 ```````````````````````````````` example 2073 ````` 2074 2075 ``` 2076 aaa 2077 . 2078 <pre><code> 2079 ``` 2080 aaa 2081 </code></pre> 2082 ```````````````````````````````` 2083 2084 2085 ```````````````````````````````` example 2086 > ``` 2087 > aaa 2088 2089 bbb 2090 . 2091 <blockquote> 2092 <pre><code>aaa 2093 </code></pre> 2094 </blockquote> 2095 <p>bbb</p> 2096 ```````````````````````````````` 2097 2098 2099 A code block can have all empty lines as its content: 2100 2101 ```````````````````````````````` example 2102 ``` 2103 2104 2105 ``` 2106 . 2107 <pre><code> 2108 2109 </code></pre> 2110 ```````````````````````````````` 2111 2112 2113 A code block can be empty: 2114 2115 ```````````````````````````````` example 2116 ``` 2117 ``` 2118 . 2119 <pre><code></code></pre> 2120 ```````````````````````````````` 2121 2122 2123 Fences can be indented. If the opening fence is indented, 2124 content lines will have equivalent opening indentation removed, 2125 if present: 2126 2127 ```````````````````````````````` example 2128 ``` 2129 aaa 2130 aaa 2131 ``` 2132 . 2133 <pre><code>aaa 2134 aaa 2135 </code></pre> 2136 ```````````````````````````````` 2137 2138 2139 ```````````````````````````````` example 2140 ``` 2141 aaa 2142 aaa 2143 aaa 2144 ``` 2145 . 2146 <pre><code>aaa 2147 aaa 2148 aaa 2149 </code></pre> 2150 ```````````````````````````````` 2151 2152 2153 ```````````````````````````````` example 2154 ``` 2155 aaa 2156 aaa 2157 aaa 2158 ``` 2159 . 
2160 <pre><code>aaa 2161 aaa 2162 aaa 2163 </code></pre> 2164 ```````````````````````````````` 2165 2166 2167 Four spaces of indentation is too many: 2168 2169 ```````````````````````````````` example 2170 ``` 2171 aaa 2172 ``` 2173 . 2174 <pre><code>``` 2175 aaa 2176 ``` 2177 </code></pre> 2178 ```````````````````````````````` 2179 2180 2181 Closing fences may be preceded by up to three spaces of indentation, and their 2182 indentation need not match that of the opening fence: 2183 2184 ```````````````````````````````` example 2185 ``` 2186 aaa 2187 ``` 2188 . 2189 <pre><code>aaa 2190 </code></pre> 2191 ```````````````````````````````` 2192 2193 2194 ```````````````````````````````` example 2195 ``` 2196 aaa 2197 ``` 2198 . 2199 <pre><code>aaa 2200 </code></pre> 2201 ```````````````````````````````` 2202 2203 2204 This is not a closing fence, because it is indented 4 spaces: 2205 2206 ```````````````````````````````` example 2207 ``` 2208 aaa 2209 ``` 2210 . 2211 <pre><code>aaa 2212 ``` 2213 </code></pre> 2214 ```````````````````````````````` 2215 2216 2217 2218 Code fences (opening and closing) cannot contain internal spaces or tabs: 2219 2220 ```````````````````````````````` example 2221 ``` ``` 2222 aaa 2223 . 2224 <p><code> </code> 2225 aaa</p> 2226 ```````````````````````````````` 2227 2228 2229 ```````````````````````````````` example 2230 ~~~~~~ 2231 aaa 2232 ~~~ ~~ 2233 . 2234 <pre><code>aaa 2235 ~~~ ~~ 2236 </code></pre> 2237 ```````````````````````````````` 2238 2239 2240 Fenced code blocks can interrupt paragraphs, and can be followed 2241 directly by paragraphs, without a blank line between: 2242 2243 ```````````````````````````````` example 2244 foo 2245 ``` 2246 bar 2247 ``` 2248 baz 2249 . 
2250 <p>foo</p> 2251 <pre><code>bar 2252 </code></pre> 2253 <p>baz</p> 2254 ```````````````````````````````` 2255 2256 2257 Other blocks can also occur before and after fenced code blocks 2258 without an intervening blank line: 2259 2260 ```````````````````````````````` example 2261 foo 2262 --- 2263 ~~~ 2264 bar 2265 ~~~ 2266 # baz 2267 . 2268 <h2>foo</h2> 2269 <pre><code>bar 2270 </code></pre> 2271 <h1>baz</h1> 2272 ```````````````````````````````` 2273 2274 2275 An [info string] can be provided after the opening code fence. 2276 Although this spec doesn't mandate any particular treatment of 2277 the info string, the first word is typically used to specify 2278 the language of the code block. In HTML output, the language is 2279 normally indicated by adding a class to the `code` element consisting 2280 of `language-` followed by the language name. 2281 2282 ```````````````````````````````` example 2283 ```ruby 2284 def foo(x) 2285 return 3 2286 end 2287 ``` 2288 . 2289 <pre><code class="language-ruby">def foo(x) 2290 return 3 2291 end 2292 </code></pre> 2293 ```````````````````````````````` 2294 2295 2296 ```````````````````````````````` example 2297 ~~~~ ruby startline=3 $%@#$ 2298 def foo(x) 2299 return 3 2300 end 2301 ~~~~~~~ 2302 . 2303 <pre><code class="language-ruby">def foo(x) 2304 return 3 2305 end 2306 </code></pre> 2307 ```````````````````````````````` 2308 2309 2310 ```````````````````````````````` example 2311 ````; 2312 ```` 2313 . 2314 <pre><code class="language-;"></code></pre> 2315 ```````````````````````````````` 2316 2317 2318 [Info strings] for backtick code blocks cannot contain backticks: 2319 2320 ```````````````````````````````` example 2321 ``` aa ``` 2322 foo 2323 . 2324 <p><code>aa</code> 2325 foo</p> 2326 ```````````````````````````````` 2327 2328 2329 [Info strings] for tilde code blocks can contain backticks and tildes: 2330 2331 ```````````````````````````````` example 2332 ~~~ aa ``` ~~~ 2333 foo 2334 ~~~ 2335 . 
2336 <pre><code class="language-aa">foo 2337 </code></pre> 2338 ```````````````````````````````` 2339 2340 2341 Closing code fences cannot have [info strings]: 2342 2343 ```````````````````````````````` example 2344 ``` 2345 ``` aaa 2346 ``` 2347 . 2348 <pre><code>``` aaa 2349 </code></pre> 2350 ```````````````````````````````` 2351 2352 2353 2354 ## HTML blocks 2355 2356 An [HTML block](@) is a group of lines that is treated 2357 as raw HTML (and will not be escaped in HTML output). 2358 2359 There are seven kinds of [HTML block], which can be defined by their 2360 start and end conditions. The block begins with a line that meets a 2361 [start condition](@) (after up to three optional spaces of indentation). 2362 It ends with the first subsequent line that meets a matching [end 2363 condition](@), or the last line of the document, or the last line of 2364 the [container block](#container-blocks) containing the current HTML 2365 block, if no line is encountered that meets the [end condition]. If 2366 the first line meets both the [start condition] and the [end 2367 condition], the block will contain just that line. 2368 2369 1. **Start condition:** line begins with the string `<script`, 2370 `<pre`, `<textarea`, or `<style` (case-insensitive), followed by a space, 2371 a tab, the string `>`, or the end of the line.\ 2372 **End condition:** line contains an end tag 2373 `</script>`, `</pre>`, `</textarea>`, or `</style>` (case-insensitive; it 2374 need not match the start tag). 2375 2376 2. **Start condition:** line begins with the string `<!--`.\ 2377 **End condition:** line contains the string `-->`. 2378 2379 3. **Start condition:** line begins with the string `<?`.\ 2380 **End condition:** line contains the string `?>`. 2381 2382 4. **Start condition:** line begins with the string `<!` 2383 followed by an ASCII letter.\ 2384 **End condition:** line contains the character `>`. 2385 2386 5. 
    **Start condition:** line begins with the string
    `<![CDATA[`.\
    **End condition:** line contains the string `]]>`.

6.  **Start condition:** line begins with the string `<` or `</`
    followed by one of the strings (case-insensitive) `address`,
    `article`, `aside`, `base`, `basefont`, `blockquote`, `body`,
    `caption`, `center`, `col`, `colgroup`, `dd`, `details`, `dialog`,
    `dir`, `div`, `dl`, `dt`, `fieldset`, `figcaption`, `figure`,
    `footer`, `form`, `frame`, `frameset`,
    `h1`, `h2`, `h3`, `h4`, `h5`, `h6`, `head`, `header`, `hr`,
    `html`, `iframe`, `legend`, `li`, `link`, `main`, `menu`, `menuitem`,
    `nav`, `noframes`, `ol`, `optgroup`, `option`, `p`, `param`,
    `section`, `source`, `summary`, `table`, `tbody`, `td`,
    `tfoot`, `th`, `thead`, `title`, `tr`, `track`, `ul`, followed
    by a space, a tab, the end of the line, the string `>`, or
    the string `/>`.\
    **End condition:** line is followed by a [blank line].

7.  **Start condition:** line begins with a complete [open tag]
    (with any [tag name] other than `script`,
    `style`, `pre`, or `textarea`) or a complete [closing tag],
    followed only by a space, a tab, or the end of the line.\
    **End condition:** line is followed by a [blank line].

HTML blocks continue until they are closed by their appropriate
[end condition], or the last line of the document or other [container
block](#container-blocks). This means any HTML **within an HTML
block** that might otherwise be recognised as a start condition will
be ignored by the parser and passed through as-is, without changing
the parser's state.

For instance, `<pre>` within an HTML block started by `<table>` will not affect
the parser state; as the HTML block was started by start condition 6, it
will end at any blank line.
This can be surprising: 2421 2422 ```````````````````````````````` example 2423 <table><tr><td> 2424 <pre> 2425 **Hello**, 2426 2427 _world_. 2428 </pre> 2429 </td></tr></table> 2430 . 2431 <table><tr><td> 2432 <pre> 2433 **Hello**, 2434 <p><em>world</em>. 2435 </pre></p> 2436 </td></tr></table> 2437 ```````````````````````````````` 2438 2439 In this case, the HTML block is terminated by the blank line — the `**Hello**` 2440 text remains verbatim — and regular parsing resumes, with a paragraph, 2441 emphasised `world` and inline and block HTML following. 2442 2443 All types of [HTML blocks] except type 7 may interrupt 2444 a paragraph. Blocks of type 7 may not interrupt a paragraph. 2445 (This restriction is intended to prevent unwanted interpretation 2446 of long tags inside a wrapped paragraph as starting HTML blocks.) 2447 2448 Some simple examples follow. Here are some basic HTML blocks 2449 of type 6: 2450 2451 ```````````````````````````````` example 2452 <table> 2453 <tr> 2454 <td> 2455 hi 2456 </td> 2457 </tr> 2458 </table> 2459 2460 okay. 2461 . 2462 <table> 2463 <tr> 2464 <td> 2465 hi 2466 </td> 2467 </tr> 2468 </table> 2469 <p>okay.</p> 2470 ```````````````````````````````` 2471 2472 2473 ```````````````````````````````` example 2474 <div> 2475 *hello* 2476 <foo><a> 2477 . 2478 <div> 2479 *hello* 2480 <foo><a> 2481 ```````````````````````````````` 2482 2483 2484 A block can also start with a closing tag: 2485 2486 ```````````````````````````````` example 2487 </div> 2488 *foo* 2489 . 2490 </div> 2491 *foo* 2492 ```````````````````````````````` 2493 2494 2495 Here we have two HTML blocks with a Markdown paragraph between them: 2496 2497 ```````````````````````````````` example 2498 <DIV CLASS="foo"> 2499 2500 *Markdown* 2501 2502 </DIV> 2503 . 
2504 <DIV CLASS="foo"> 2505 <p><em>Markdown</em></p> 2506 </DIV> 2507 ```````````````````````````````` 2508 2509 2510 The tag on the first line can be partial, as long 2511 as it is split where there would be whitespace: 2512 2513 ```````````````````````````````` example 2514 <div id="foo" 2515 class="bar"> 2516 </div> 2517 . 2518 <div id="foo" 2519 class="bar"> 2520 </div> 2521 ```````````````````````````````` 2522 2523 2524 ```````````````````````````````` example 2525 <div id="foo" class="bar 2526 baz"> 2527 </div> 2528 . 2529 <div id="foo" class="bar 2530 baz"> 2531 </div> 2532 ```````````````````````````````` 2533 2534 2535 An open tag need not be closed: 2536 ```````````````````````````````` example 2537 <div> 2538 *foo* 2539 2540 *bar* 2541 . 2542 <div> 2543 *foo* 2544 <p><em>bar</em></p> 2545 ```````````````````````````````` 2546 2547 2548 2549 A partial tag need not even be completed (garbage 2550 in, garbage out): 2551 2552 ```````````````````````````````` example 2553 <div id="foo" 2554 *hi* 2555 . 2556 <div id="foo" 2557 *hi* 2558 ```````````````````````````````` 2559 2560 2561 ```````````````````````````````` example 2562 <div class 2563 foo 2564 . 2565 <div class 2566 foo 2567 ```````````````````````````````` 2568 2569 2570 The initial tag doesn't even need to be a valid 2571 tag, as long as it starts like one: 2572 2573 ```````````````````````````````` example 2574 <div *???-&&&-<--- 2575 *foo* 2576 . 2577 <div *???-&&&-<--- 2578 *foo* 2579 ```````````````````````````````` 2580 2581 2582 In type 6 blocks, the initial tag need not be on a line by 2583 itself: 2584 2585 ```````````````````````````````` example 2586 <div><a href="bar">*foo*</a></div> 2587 . 2588 <div><a href="bar">*foo*</a></div> 2589 ```````````````````````````````` 2590 2591 2592 ```````````````````````````````` example 2593 <table><tr><td> 2594 foo 2595 </td></tr></table> 2596 . 
2597 <table><tr><td> 2598 foo 2599 </td></tr></table> 2600 ```````````````````````````````` 2601 2602 2603 Everything until the next blank line or end of document 2604 gets included in the HTML block. So, in the following 2605 example, what looks like a Markdown code block 2606 is actually part of the HTML block, which continues until a blank 2607 line or the end of the document is reached: 2608 2609 ```````````````````````````````` example 2610 <div></div> 2611 ``` c 2612 int x = 33; 2613 ``` 2614 . 2615 <div></div> 2616 ``` c 2617 int x = 33; 2618 ``` 2619 ```````````````````````````````` 2620 2621 2622 To start an [HTML block] with a tag that is *not* in the 2623 list of block-level tags in (6), you must put the tag by 2624 itself on the first line (and it must be complete): 2625 2626 ```````````````````````````````` example 2627 <a href="foo"> 2628 *bar* 2629 </a> 2630 . 2631 <a href="foo"> 2632 *bar* 2633 </a> 2634 ```````````````````````````````` 2635 2636 2637 In type 7 blocks, the [tag name] can be anything: 2638 2639 ```````````````````````````````` example 2640 <Warning> 2641 *bar* 2642 </Warning> 2643 . 2644 <Warning> 2645 *bar* 2646 </Warning> 2647 ```````````````````````````````` 2648 2649 2650 ```````````````````````````````` example 2651 <i class="foo"> 2652 *bar* 2653 </i> 2654 . 2655 <i class="foo"> 2656 *bar* 2657 </i> 2658 ```````````````````````````````` 2659 2660 2661 ```````````````````````````````` example 2662 </ins> 2663 *bar* 2664 . 2665 </ins> 2666 *bar* 2667 ```````````````````````````````` 2668 2669 2670 These rules are designed to allow us to work with tags that 2671 can function as either block-level or inline-level tags. 2672 The `<del>` tag is a nice example. We can surround content with 2673 `<del>` tags in three different ways. In this case, we get a raw 2674 HTML block, because the `<del>` tag is on a line by itself: 2675 2676 ```````````````````````````````` example 2677 <del> 2678 *foo* 2679 </del> 2680 . 
2681 <del> 2682 *foo* 2683 </del> 2684 ```````````````````````````````` 2685 2686 2687 In this case, we get a raw HTML block that just includes 2688 the `<del>` tag (because it ends with the following blank 2689 line). So the contents get interpreted as CommonMark: 2690 2691 ```````````````````````````````` example 2692 <del> 2693 2694 *foo* 2695 2696 </del> 2697 . 2698 <del> 2699 <p><em>foo</em></p> 2700 </del> 2701 ```````````````````````````````` 2702 2703 2704 Finally, in this case, the `<del>` tags are interpreted 2705 as [raw HTML] *inside* the CommonMark paragraph. (Because 2706 the tag is not on a line by itself, we get inline HTML 2707 rather than an [HTML block].) 2708 2709 ```````````````````````````````` example 2710 <del>*foo*</del> 2711 . 2712 <p><del><em>foo</em></del></p> 2713 ```````````````````````````````` 2714 2715 2716 HTML tags designed to contain literal content 2717 (`script`, `style`, `pre`), comments, processing instructions, 2718 and declarations are treated somewhat differently. 2719 Instead of ending at the first blank line, these blocks 2720 end at the first line containing a corresponding end tag. 2721 As a result, these blocks can contain blank lines: 2722 2723 A pre tag (type 1): 2724 2725 ```````````````````````````````` example 2726 <pre language="haskell"><code> 2727 import Text.HTML.TagSoup 2728 2729 main :: IO () 2730 main = print $ parseTags tags 2731 </code></pre> 2732 okay 2733 . 2734 <pre language="haskell"><code> 2735 import Text.HTML.TagSoup 2736 2737 main :: IO () 2738 main = print $ parseTags tags 2739 </code></pre> 2740 <p>okay</p> 2741 ```````````````````````````````` 2742 2743 2744 A script tag (type 1): 2745 2746 ```````````````````````````````` example 2747 <script type="text/javascript"> 2748 // JavaScript example 2749 2750 document.getElementById("demo").innerHTML = "Hello JavaScript!"; 2751 </script> 2752 okay 2753 . 
2754 <script type="text/javascript"> 2755 // JavaScript example 2756 2757 document.getElementById("demo").innerHTML = "Hello JavaScript!"; 2758 </script> 2759 <p>okay</p> 2760 ```````````````````````````````` 2761 2762 2763 A textarea tag (type 1): 2764 2765 ```````````````````````````````` example 2766 <textarea> 2767 2768 *foo* 2769 2770 _bar_ 2771 2772 </textarea> 2773 . 2774 <textarea> 2775 2776 *foo* 2777 2778 _bar_ 2779 2780 </textarea> 2781 ```````````````````````````````` 2782 2783 A style tag (type 1): 2784 2785 ```````````````````````````````` example 2786 <style 2787 type="text/css"> 2788 h1 {color:red;} 2789 2790 p {color:blue;} 2791 </style> 2792 okay 2793 . 2794 <style 2795 type="text/css"> 2796 h1 {color:red;} 2797 2798 p {color:blue;} 2799 </style> 2800 <p>okay</p> 2801 ```````````````````````````````` 2802 2803 2804 If there is no matching end tag, the block will end at the 2805 end of the document (or the enclosing [block quote][block quotes] 2806 or [list item][list items]): 2807 2808 ```````````````````````````````` example 2809 <style 2810 type="text/css"> 2811 2812 foo 2813 . 2814 <style 2815 type="text/css"> 2816 2817 foo 2818 ```````````````````````````````` 2819 2820 2821 ```````````````````````````````` example 2822 > <div> 2823 > foo 2824 2825 bar 2826 . 2827 <blockquote> 2828 <div> 2829 foo 2830 </blockquote> 2831 <p>bar</p> 2832 ```````````````````````````````` 2833 2834 2835 ```````````````````````````````` example 2836 - <div> 2837 - foo 2838 . 2839 <ul> 2840 <li> 2841 <div> 2842 </li> 2843 <li>foo</li> 2844 </ul> 2845 ```````````````````````````````` 2846 2847 2848 The end tag can occur on the same line as the start tag: 2849 2850 ```````````````````````````````` example 2851 <style>p{color:red;}</style> 2852 *foo* 2853 . 2854 <style>p{color:red;}</style> 2855 <p><em>foo</em></p> 2856 ```````````````````````````````` 2857 2858 2859 ```````````````````````````````` example 2860 <!-- foo -->*bar* 2861 *baz* 2862 . 
2863 <!-- foo -->*bar* 2864 <p><em>baz</em></p> 2865 ```````````````````````````````` 2866 2867 2868 Note that anything on the last line after the 2869 end tag will be included in the [HTML block]: 2870 2871 ```````````````````````````````` example 2872 <script> 2873 foo 2874 </script>1. *bar* 2875 . 2876 <script> 2877 foo 2878 </script>1. *bar* 2879 ```````````````````````````````` 2880 2881 2882 A comment (type 2): 2883 2884 ```````````````````````````````` example 2885 <!-- Foo 2886 2887 bar 2888 baz --> 2889 okay 2890 . 2891 <!-- Foo 2892 2893 bar 2894 baz --> 2895 <p>okay</p> 2896 ```````````````````````````````` 2897 2898 2899 2900 A processing instruction (type 3): 2901 2902 ```````````````````````````````` example 2903 <?php 2904 2905 echo '>'; 2906 2907 ?> 2908 okay 2909 . 2910 <?php 2911 2912 echo '>'; 2913 2914 ?> 2915 <p>okay</p> 2916 ```````````````````````````````` 2917 2918 2919 A declaration (type 4): 2920 2921 ```````````````````````````````` example 2922 <!DOCTYPE html> 2923 . 2924 <!DOCTYPE html> 2925 ```````````````````````````````` 2926 2927 2928 CDATA (type 5): 2929 2930 ```````````````````````````````` example 2931 <![CDATA[ 2932 function matchwo(a,b) 2933 { 2934 if (a < b && a < 0) then { 2935 return 1; 2936 2937 } else { 2938 2939 return 0; 2940 } 2941 } 2942 ]]> 2943 okay 2944 . 2945 <![CDATA[ 2946 function matchwo(a,b) 2947 { 2948 if (a < b && a < 0) then { 2949 return 1; 2950 2951 } else { 2952 2953 return 0; 2954 } 2955 } 2956 ]]> 2957 <p>okay</p> 2958 ```````````````````````````````` 2959 2960 2961 The opening tag can be preceded by up to three spaces of indentation, but not 2962 four: 2963 2964 ```````````````````````````````` example 2965 <!-- foo --> 2966 2967 <!-- foo --> 2968 . 2969 <!-- foo --> 2970 <pre><code><!-- foo --> 2971 </code></pre> 2972 ```````````````````````````````` 2973 2974 2975 ```````````````````````````````` example 2976 <div> 2977 2978 <div> 2979 . 
2980 <div> 2981 <pre><code><div> 2982 </code></pre> 2983 ```````````````````````````````` 2984 2985 2986 An HTML block of types 1--6 can interrupt a paragraph, and need not be 2987 preceded by a blank line. 2988 2989 ```````````````````````````````` example 2990 Foo 2991 <div> 2992 bar 2993 </div> 2994 . 2995 <p>Foo</p> 2996 <div> 2997 bar 2998 </div> 2999 ```````````````````````````````` 3000 3001 3002 However, a following blank line is needed, except at the end of 3003 a document, and except for blocks of types 1--5, [above][HTML 3004 block]: 3005 3006 ```````````````````````````````` example 3007 <div> 3008 bar 3009 </div> 3010 *foo* 3011 . 3012 <div> 3013 bar 3014 </div> 3015 *foo* 3016 ```````````````````````````````` 3017 3018 3019 HTML blocks of type 7 cannot interrupt a paragraph: 3020 3021 ```````````````````````````````` example 3022 Foo 3023 <a href="bar"> 3024 baz 3025 . 3026 <p>Foo 3027 <a href="bar"> 3028 baz</p> 3029 ```````````````````````````````` 3030 3031 3032 This rule differs from John Gruber's original Markdown syntax 3033 specification, which says: 3034 3035 > The only restrictions are that block-level HTML elements — 3036 > e.g. `<div>`, `<table>`, `<pre>`, `<p>`, etc. — must be separated from 3037 > surrounding content by blank lines, and the start and end tags of the 3038 > block should not be indented with spaces or tabs. 3039 3040 In some ways Gruber's rule is more restrictive than the one given 3041 here: 3042 3043 - It requires that an HTML block be preceded by a blank line. 3044 - It does not allow the start tag to be indented. 3045 - It requires a matching end tag, which it also does not allow to 3046 be indented. 3047 3048 Most Markdown implementations (including some of Gruber's own) do not 3049 respect all of these restrictions. 3050 3051 There is one respect, however, in which Gruber's rule is more liberal 3052 than the one given here, since it allows blank lines to occur inside 3053 an HTML block. 
There are two reasons for disallowing them here. 3054 First, it removes the need to parse balanced tags, which is 3055 expensive and can require backtracking from the end of the document 3056 if no matching end tag is found. Second, it provides a very simple 3057 and flexible way of including Markdown content inside HTML tags: 3058 simply separate the Markdown from the HTML using blank lines: 3059 3060 Compare: 3061 3062 ```````````````````````````````` example 3063 <div> 3064 3065 *Emphasized* text. 3066 3067 </div> 3068 . 3069 <div> 3070 <p><em>Emphasized</em> text.</p> 3071 </div> 3072 ```````````````````````````````` 3073 3074 3075 ```````````````````````````````` example 3076 <div> 3077 *Emphasized* text. 3078 </div> 3079 . 3080 <div> 3081 *Emphasized* text. 3082 </div> 3083 ```````````````````````````````` 3084 3085 3086 Some Markdown implementations have adopted a convention of 3087 interpreting content inside tags as text if the open tag has 3088 the attribute `markdown=1`. The rule given above seems a simpler and 3089 more elegant way of achieving the same expressive power, which is also 3090 much simpler to parse. 3091 3092 The main potential drawback is that one can no longer paste HTML 3093 blocks into Markdown documents with 100% reliability. However, 3094 *in most cases* this will work fine, because the blank lines in 3095 HTML are usually followed by HTML block tags. For example: 3096 3097 ```````````````````````````````` example 3098 <table> 3099 3100 <tr> 3101 3102 <td> 3103 Hi 3104 </td> 3105 3106 </tr> 3107 3108 </table> 3109 . 
3110 <table> 3111 <tr> 3112 <td> 3113 Hi 3114 </td> 3115 </tr> 3116 </table> 3117 ```````````````````````````````` 3118 3119 3120 There are problems, however, if the inner tags are indented 3121 *and* separated by blank lines, as then they will be interpreted as 3122 an indented code block: 3123 3124 ```````````````````````````````` example 3125 <table> 3126 3127 <tr> 3128 3129 <td> 3130 Hi 3131 </td> 3132 3133 </tr> 3134 3135 </table> 3136 . 3137 <table> 3138 <tr> 3139 <pre><code><td> 3140 Hi 3141 </td> 3142 </code></pre> 3143 </tr> 3144 </table> 3145 ```````````````````````````````` 3146 3147 3148 Fortunately, blank lines are usually not necessary and can be 3149 deleted. The exception is inside `<pre>` tags, but as described 3150 [above][HTML blocks], raw HTML blocks starting with `<pre>` 3151 *can* contain blank lines. 3152 3153 ## Link reference definitions 3154 3155 A [link reference definition](@) 3156 consists of a [link label], optionally preceded by up to three spaces of 3157 indentation, followed 3158 by a colon (`:`), optional spaces or tabs (including up to one 3159 [line ending]), a [link destination], 3160 optional spaces or tabs (including up to one 3161 [line ending]), and an optional [link 3162 title], which if it is present must be separated 3163 from the [link destination] by spaces or tabs. 3164 No further character may occur. 3165 3166 A [link reference definition] 3167 does not correspond to a structural element of a document. Instead, it 3168 defines a label which can be used in [reference links] 3169 and reference-style [images] elsewhere in the document. [Link 3170 reference definitions] can come either before or after the links that use 3171 them. 3172 3173 ```````````````````````````````` example 3174 [foo]: /url "title" 3175 3176 [foo] 3177 .
3178 <p><a href="/url" title="title">foo</a></p> 3179 ```````````````````````````````` 3180 3181 3182 ```````````````````````````````` example 3183 [foo]: 3184 /url 3185 'the title' 3186 3187 [foo] 3188 . 3189 <p><a href="/url" title="the title">foo</a></p> 3190 ```````````````````````````````` 3191 3192 3193 ```````````````````````````````` example 3194 [Foo*bar\]]:my_(url) 'title (with parens)' 3195 3196 [Foo*bar\]] 3197 . 3198 <p><a href="my_(url)" title="title (with parens)">Foo*bar]</a></p> 3199 ```````````````````````````````` 3200 3201 3202 ```````````````````````````````` example 3203 [Foo bar]: 3204 <my url> 3205 'title' 3206 3207 [Foo bar] 3208 . 3209 <p><a href="my%20url" title="title">Foo bar</a></p> 3210 ```````````````````````````````` 3211 3212 3213 The title may extend over multiple lines: 3214 3215 ```````````````````````````````` example 3216 [foo]: /url ' 3217 title 3218 line1 3219 line2 3220 ' 3221 3222 [foo] 3223 . 3224 <p><a href="/url" title=" 3225 title 3226 line1 3227 line2 3228 ">foo</a></p> 3229 ```````````````````````````````` 3230 3231 3232 However, it may not contain a [blank line]: 3233 3234 ```````````````````````````````` example 3235 [foo]: /url 'title 3236 3237 with blank line' 3238 3239 [foo] 3240 . 3241 <p>[foo]: /url 'title</p> 3242 <p>with blank line'</p> 3243 <p>[foo]</p> 3244 ```````````````````````````````` 3245 3246 3247 The title may be omitted: 3248 3249 ```````````````````````````````` example 3250 [foo]: 3251 /url 3252 3253 [foo] 3254 . 3255 <p><a href="/url">foo</a></p> 3256 ```````````````````````````````` 3257 3258 3259 The link destination may not be omitted: 3260 3261 ```````````````````````````````` example 3262 [foo]: 3263 3264 [foo] 3265 . 3266 <p>[foo]:</p> 3267 <p>[foo]</p> 3268 ```````````````````````````````` 3269 3270 However, an empty link destination may be specified using 3271 angle brackets: 3272 3273 ```````````````````````````````` example 3274 [foo]: <> 3275 3276 [foo] 3277 . 
3278 <p><a href="">foo</a></p> 3279 ```````````````````````````````` 3280 3281 The title must be separated from the link destination by 3282 spaces or tabs: 3283 3284 ```````````````````````````````` example 3285 [foo]: <bar>(baz) 3286 3287 [foo] 3288 . 3289 <p>[foo]: <bar>(baz)</p> 3290 <p>[foo]</p> 3291 ```````````````````````````````` 3292 3293 3294 Both title and destination can contain backslash escapes 3295 and literal backslashes: 3296 3297 ```````````````````````````````` example 3298 [foo]: /url\bar\*baz "foo\"bar\baz" 3299 3300 [foo] 3301 . 3302 <p><a href="/url%5Cbar*baz" title="foo"bar\baz">foo</a></p> 3303 ```````````````````````````````` 3304 3305 3306 A link can come before its corresponding definition: 3307 3308 ```````````````````````````````` example 3309 [foo] 3310 3311 [foo]: url 3312 . 3313 <p><a href="url">foo</a></p> 3314 ```````````````````````````````` 3315 3316 3317 If there are several matching definitions, the first one takes 3318 precedence: 3319 3320 ```````````````````````````````` example 3321 [foo] 3322 3323 [foo]: first 3324 [foo]: second 3325 . 3326 <p><a href="first">foo</a></p> 3327 ```````````````````````````````` 3328 3329 3330 As noted in the section on [Links], matching of labels is 3331 case-insensitive (see [matches]). 3332 3333 ```````````````````````````````` example 3334 [FOO]: /url 3335 3336 [Foo] 3337 . 3338 <p><a href="/url">Foo</a></p> 3339 ```````````````````````````````` 3340 3341 3342 ```````````````````````````````` example 3343 [ΑΓΩ]: /φου 3344 3345 [αγω] 3346 . 3347 <p><a href="/%CF%86%CE%BF%CF%85">αγω</a></p> 3348 ```````````````````````````````` 3349 3350 3351 Here is a link reference definition with no corresponding link. 3352 It contributes nothing to the document. 3353 3354 ```````````````````````````````` example 3355 [foo]: /url 3356 . 3357 ```````````````````````````````` 3358 3359 3360 Here is another one: 3361 3362 ```````````````````````````````` example 3363 [ 3364 foo 3365 ]: /url 3366 bar 3367 . 
3368 <p>bar</p> 3369 ```````````````````````````````` 3370 3371 3372 This is not a link reference definition, because there are 3373 characters other than spaces or tabs after the title: 3374 3375 ```````````````````````````````` example 3376 [foo]: /url "title" ok 3377 . 3378 <p>[foo]: /url "title" ok</p> 3379 ```````````````````````````````` 3380 3381 3382 This is a link reference definition, but it has no title: 3383 3384 ```````````````````````````````` example 3385 [foo]: /url 3386 "title" ok 3387 . 3388 <p>"title" ok</p> 3389 ```````````````````````````````` 3390 3391 3392 This is not a link reference definition, because it is indented 3393 four spaces: 3394 3395 ```````````````````````````````` example 3396 [foo]: /url "title" 3397 3398 [foo] 3399 . 3400 <pre><code>[foo]: /url "title" 3401 </code></pre> 3402 <p>[foo]</p> 3403 ```````````````````````````````` 3404 3405 3406 This is not a link reference definition, because it occurs inside 3407 a code block: 3408 3409 ```````````````````````````````` example 3410 ``` 3411 [foo]: /url 3412 ``` 3413 3414 [foo] 3415 . 3416 <pre><code>[foo]: /url 3417 </code></pre> 3418 <p>[foo]</p> 3419 ```````````````````````````````` 3420 3421 3422 A [link reference definition] cannot interrupt a paragraph. 3423 3424 ```````````````````````````````` example 3425 Foo 3426 [bar]: /baz 3427 3428 [bar] 3429 . 3430 <p>Foo 3431 [bar]: /baz</p> 3432 <p>[bar]</p> 3433 ```````````````````````````````` 3434 3435 3436 However, it can directly follow other block elements, such as headings 3437 and thematic breaks, and it need not be followed by a blank line. 3438 3439 ```````````````````````````````` example 3440 # [Foo] 3441 [foo]: /url 3442 > bar 3443 . 3444 <h1><a href="/url">Foo</a></h1> 3445 <blockquote> 3446 <p>bar</p> 3447 </blockquote> 3448 ```````````````````````````````` 3449 3450 ```````````````````````````````` example 3451 [foo]: /url 3452 bar 3453 === 3454 [foo] 3455 . 
3456 <h1>bar</h1> 3457 <p><a href="/url">foo</a></p> 3458 ```````````````````````````````` 3459 3460 ```````````````````````````````` example 3461 [foo]: /url 3462 === 3463 [foo] 3464 . 3465 <p>=== 3466 <a href="/url">foo</a></p> 3467 ```````````````````````````````` 3468 3469 3470 Several [link reference definitions] 3471 can occur one after another, without intervening blank lines. 3472 3473 ```````````````````````````````` example 3474 [foo]: /foo-url "foo" 3475 [bar]: /bar-url 3476 "bar" 3477 [baz]: /baz-url 3478 3479 [foo], 3480 [bar], 3481 [baz] 3482 . 3483 <p><a href="/foo-url" title="foo">foo</a>, 3484 <a href="/bar-url" title="bar">bar</a>, 3485 <a href="/baz-url">baz</a></p> 3486 ```````````````````````````````` 3487 3488 3489 [Link reference definitions] can occur 3490 inside block containers, like lists and block quotations. They 3491 affect the entire document, not just the container in which they 3492 are defined: 3493 3494 ```````````````````````````````` example 3495 [foo] 3496 3497 > [foo]: /url 3498 . 3499 <p><a href="/url">foo</a></p> 3500 <blockquote> 3501 </blockquote> 3502 ```````````````````````````````` 3503 3504 3505 Whether something is a [link reference definition] is 3506 independent of whether the link reference it defines is 3507 used in the document. Thus, for example, the following 3508 document contains just a link reference definition, and 3509 no visible content: 3510 3511 ```````````````````````````````` example 3512 [foo]: /url 3513 . 3514 ```````````````````````````````` 3515 3516 3517 ## Paragraphs 3518 3519 A sequence of non-blank lines that cannot be interpreted as other 3520 kinds of blocks forms a [paragraph](@). 3521 The contents of the paragraph are the result of parsing the 3522 paragraph's raw content as inlines. The paragraph's raw content 3523 is formed by concatenating the lines and removing initial and final 3524 spaces or tabs. 
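One reading of the raw-content rule above can be sketched in Python. This is a minimal illustration, not part of the specification: the function name `paragraph_raw_content` is hypothetical, and the sketch assumes that leading spaces or tabs are removed from every line while trailing ones are removed only at the very end of the paragraph, so that trailing spaces on interior lines remain available to inline parsing (for instance, to form a hard line break).

``` python
def paragraph_raw_content(lines):
    """Join paragraph lines into the raw content handed to inline parsing.

    Assumption (hypothetical helper, not the spec's wording): leading
    spaces/tabs are dropped on each line; trailing spaces/tabs are
    dropped only after the final line.
    """
    stripped = [line.lstrip(" \t") for line in lines]
    return "\n".join(stripped).rstrip(" \t")


if __name__ == "__main__":
    # Leading indentation on either line does not survive.
    print(repr(paragraph_raw_content(["  aaa", " bbb"])))
    # Interior trailing spaces are kept; final ones are removed.
    print(repr(paragraph_raw_content(["aaa     ", "bbb     "])))
```

The sketch deliberately does nothing about deciding which lines belong to the paragraph; it only models the concatenation and stripping step.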
3525 3526 A simple example with two paragraphs: 3527 3528 ```````````````````````````````` example 3529 aaa 3530 3531 bbb 3532 . 3533 <p>aaa</p> 3534 <p>bbb</p> 3535 ```````````````````````````````` 3536 3537 3538 Paragraphs can contain multiple lines, but no blank lines: 3539 3540 ```````````````````````````````` example 3541 aaa 3542 bbb 3543 3544 ccc 3545 ddd 3546 . 3547 <p>aaa 3548 bbb</p> 3549 <p>ccc 3550 ddd</p> 3551 ```````````````````````````````` 3552 3553 3554 Multiple blank lines between paragraphs have no effect: 3555 3556 ```````````````````````````````` example 3557 aaa 3558 3559 3560 bbb 3561 . 3562 <p>aaa</p> 3563 <p>bbb</p> 3564 ```````````````````````````````` 3565 3566 3567 Leading spaces or tabs are skipped: 3568 3569 ```````````````````````````````` example 3570 aaa 3571 bbb 3572 . 3573 <p>aaa 3574 bbb</p> 3575 ```````````````````````````````` 3576 3577 3578 Lines after the first may be indented any amount, since indented 3579 code blocks cannot interrupt paragraphs. 3580 3581 ```````````````````````````````` example 3582 aaa 3583 bbb 3584 ccc 3585 . 3586 <p>aaa 3587 bbb 3588 ccc</p> 3589 ```````````````````````````````` 3590 3591 3592 However, the first line may be preceded by up to three spaces of indentation. 3593 Four spaces of indentation is too many: 3594 3595 ```````````````````````````````` example 3596 aaa 3597 bbb 3598 . 3599 <p>aaa 3600 bbb</p> 3601 ```````````````````````````````` 3602 3603 3604 ```````````````````````````````` example 3605 aaa 3606 bbb 3607 . 3608 <pre><code>aaa 3609 </code></pre> 3610 <p>bbb</p> 3611 ```````````````````````````````` 3612 3613 3614 Final spaces or tabs are stripped before inline parsing, so a paragraph 3615 that ends with two or more spaces will not end with a [hard line 3616 break]: 3617 3618 ```````````````````````````````` example 3619 aaa 3620 bbb 3621 . 
3622 <p>aaa<br /> 3623 bbb</p> 3624 ```````````````````````````````` 3625 3626 3627 ## Blank lines 3628 3629 [Blank lines] between block-level elements are ignored, 3630 except for the role they play in determining whether a [list] 3631 is [tight] or [loose]. 3632 3633 Blank lines at the beginning and end of the document are also ignored. 3634 3635 ```````````````````````````````` example 3636 3637 3638 aaa 3639 3640 3641 # aaa 3642 3643 3644 . 3645 <p>aaa</p> 3646 <h1>aaa</h1> 3647 ```````````````````````````````` 3648 3649 3650 3651 # Container blocks 3652 3653 A [container block](#container-blocks) is a block that has other 3654 blocks as its contents. There are two basic kinds of container blocks: 3655 [block quotes] and [list items]. 3656 [Lists] are meta-containers for [list items]. 3657 3658 We define the syntax for container blocks recursively. The general 3659 form of the definition is: 3660 3661 > If X is a sequence of blocks, then the result of 3662 > transforming X in such-and-such a way is a container of type Y 3663 > with these blocks as its content. 3664 3665 So, we explain what counts as a block quote or list item by explaining 3666 how these can be *generated* from their contents. This should suffice 3667 to define the syntax, although it does not give a recipe for *parsing* 3668 these constructions. (A recipe is provided below in the section entitled 3669 [A parsing strategy](#appendix-a-parsing-strategy).) 3670 3671 ## Block quotes 3672 3673 A [block quote marker](@), 3674 optionally preceded by up to three spaces of indentation, 3675 consists of (a) the character `>` together with a following space of 3676 indentation, or (b) a single character `>` not followed by a space of 3677 indentation. 3678 3679 The following rules define [block quotes]: 3680 3681 1. 
**Basic case.** If a string of lines *Ls* constitute a sequence 3682 of blocks *Bs*, then the result of prepending a [block quote 3683 marker] to the beginning of each line in *Ls* 3684 is a [block quote](#block-quotes) containing *Bs*. 3685 3686 2. **Laziness.** If a string of lines *Ls* constitute a [block 3687 quote](#block-quotes) with contents *Bs*, then the result of deleting 3688 the initial [block quote marker] from one or 3689 more lines in which the next character other than a space or tab after the 3690 [block quote marker] is [paragraph continuation 3691 text] is a block quote with *Bs* as its content. 3692 [Paragraph continuation text](@) is text 3693 that will be parsed as part of the content of a paragraph, but does 3694 not occur at the beginning of the paragraph. 3695 3696 3. **Consecutiveness.** A document cannot contain two [block 3697 quotes] in a row unless there is a [blank line] between them. 3698 3699 Nothing else counts as a [block quote](#block-quotes). 3700 3701 Here is a simple example: 3702 3703 ```````````````````````````````` example 3704 > # Foo 3705 > bar 3706 > baz 3707 . 3708 <blockquote> 3709 <h1>Foo</h1> 3710 <p>bar 3711 baz</p> 3712 </blockquote> 3713 ```````````````````````````````` 3714 3715 3716 The space or tab after the `>` characters can be omitted: 3717 3718 ```````````````````````````````` example 3719 ># Foo 3720 >bar 3721 > baz 3722 . 3723 <blockquote> 3724 <h1>Foo</h1> 3725 <p>bar 3726 baz</p> 3727 </blockquote> 3728 ```````````````````````````````` 3729 3730 3731 The `>` characters can be preceded by up to three spaces of indentation: 3732 3733 ```````````````````````````````` example 3734 > # Foo 3735 > bar 3736 > baz 3737 . 3738 <blockquote> 3739 <h1>Foo</h1> 3740 <p>bar 3741 baz</p> 3742 </blockquote> 3743 ```````````````````````````````` 3744 3745 3746 Four spaces of indentation is too many: 3747 3748 ```````````````````````````````` example 3749 > # Foo 3750 > bar 3751 > baz 3752 . 
3753 <pre><code>> # Foo 3754 > bar 3755 > baz 3756 </code></pre> 3757 ```````````````````````````````` 3758 3759 3760 The Laziness clause allows us to omit the `>` before 3761 [paragraph continuation text]: 3762 3763 ```````````````````````````````` example 3764 > # Foo 3765 > bar 3766 baz 3767 . 3768 <blockquote> 3769 <h1>Foo</h1> 3770 <p>bar 3771 baz</p> 3772 </blockquote> 3773 ```````````````````````````````` 3774 3775 3776 A block quote can contain some lazy and some non-lazy 3777 continuation lines: 3778 3779 ```````````````````````````````` example 3780 > bar 3781 baz 3782 > foo 3783 . 3784 <blockquote> 3785 <p>bar 3786 baz 3787 foo</p> 3788 </blockquote> 3789 ```````````````````````````````` 3790 3791 3792 Laziness only applies to lines that would have been continuations of 3793 paragraphs had they been prepended with [block quote markers]. 3794 For example, the `> ` cannot be omitted in the second line of 3795 3796 ``` markdown 3797 > foo 3798 > --- 3799 ``` 3800 3801 without changing the meaning: 3802 3803 ```````````````````````````````` example 3804 > foo 3805 --- 3806 . 3807 <blockquote> 3808 <p>foo</p> 3809 </blockquote> 3810 <hr /> 3811 ```````````````````````````````` 3812 3813 3814 Similarly, if we omit the `> ` in the second line of 3815 3816 ``` markdown 3817 > - foo 3818 > - bar 3819 ``` 3820 3821 then the block quote ends after the first line: 3822 3823 ```````````````````````````````` example 3824 > - foo 3825 - bar 3826 . 3827 <blockquote> 3828 <ul> 3829 <li>foo</li> 3830 </ul> 3831 </blockquote> 3832 <ul> 3833 <li>bar</li> 3834 </ul> 3835 ```````````````````````````````` 3836 3837 3838 For the same reason, we can't omit the `> ` in front of 3839 subsequent lines of an indented or fenced code block: 3840 3841 ```````````````````````````````` example 3842 > foo 3843 bar 3844 . 
3845 <blockquote> 3846 <pre><code>foo 3847 </code></pre> 3848 </blockquote> 3849 <pre><code>bar 3850 </code></pre> 3851 ```````````````````````````````` 3852 3853 3854 ```````````````````````````````` example 3855 > ``` 3856 foo 3857 ``` 3858 . 3859 <blockquote> 3860 <pre><code></code></pre> 3861 </blockquote> 3862 <p>foo</p> 3863 <pre><code></code></pre> 3864 ```````````````````````````````` 3865 3866 3867 Note that in the following case, we have a [lazy 3868 continuation line]: 3869 3870 ```````````````````````````````` example 3871 > foo 3872 - bar 3873 . 3874 <blockquote> 3875 <p>foo 3876 - bar</p> 3877 </blockquote> 3878 ```````````````````````````````` 3879 3880 3881 To see why, note that in 3882 3883 ```markdown 3884 > foo 3885 > - bar 3886 ``` 3887 3888 the `- bar` is indented too far to start a list, and can't 3889 be an indented code block because indented code blocks cannot 3890 interrupt paragraphs, so it is [paragraph continuation text]. 3891 3892 A block quote can be empty: 3893 3894 ```````````````````````````````` example 3895 > 3896 . 3897 <blockquote> 3898 </blockquote> 3899 ```````````````````````````````` 3900 3901 3902 ```````````````````````````````` example 3903 > 3904 > 3905 > 3906 . 3907 <blockquote> 3908 </blockquote> 3909 ```````````````````````````````` 3910 3911 3912 A block quote can have initial or final blank lines: 3913 3914 ```````````````````````````````` example 3915 > 3916 > foo 3917 > 3918 . 3919 <blockquote> 3920 <p>foo</p> 3921 </blockquote> 3922 ```````````````````````````````` 3923 3924 3925 A blank line always separates block quotes: 3926 3927 ```````````````````````````````` example 3928 > foo 3929 3930 > bar 3931 . 
3932 <blockquote> 3933 <p>foo</p> 3934 </blockquote> 3935 <blockquote> 3936 <p>bar</p> 3937 </blockquote> 3938 ```````````````````````````````` 3939 3940 3941 (Most current Markdown implementations, including John Gruber's 3942 original `Markdown.pl`, will parse this example as a single block quote 3943 with two paragraphs. But it seems better to allow the author to decide 3944 whether two block quotes or one are wanted.) 3945 3946 Consecutiveness means that if we put these block quotes together, 3947 we get a single block quote: 3948 3949 ```````````````````````````````` example 3950 > foo 3951 > bar 3952 . 3953 <blockquote> 3954 <p>foo 3955 bar</p> 3956 </blockquote> 3957 ```````````````````````````````` 3958 3959 3960 To get a block quote with two paragraphs, use: 3961 3962 ```````````````````````````````` example 3963 > foo 3964 > 3965 > bar 3966 . 3967 <blockquote> 3968 <p>foo</p> 3969 <p>bar</p> 3970 </blockquote> 3971 ```````````````````````````````` 3972 3973 3974 Block quotes can interrupt paragraphs: 3975 3976 ```````````````````````````````` example 3977 foo 3978 > bar 3979 . 3980 <p>foo</p> 3981 <blockquote> 3982 <p>bar</p> 3983 </blockquote> 3984 ```````````````````````````````` 3985 3986 3987 In general, blank lines are not needed before or after block 3988 quotes: 3989 3990 ```````````````````````````````` example 3991 > aaa 3992 *** 3993 > bbb 3994 . 3995 <blockquote> 3996 <p>aaa</p> 3997 </blockquote> 3998 <hr /> 3999 <blockquote> 4000 <p>bbb</p> 4001 </blockquote> 4002 ```````````````````````````````` 4003 4004 4005 However, because of laziness, a blank line is needed between 4006 a block quote and a following paragraph: 4007 4008 ```````````````````````````````` example 4009 > bar 4010 baz 4011 . 4012 <blockquote> 4013 <p>bar 4014 baz</p> 4015 </blockquote> 4016 ```````````````````````````````` 4017 4018 4019 ```````````````````````````````` example 4020 > bar 4021 4022 baz 4023 . 
4024 <blockquote> 4025 <p>bar</p> 4026 </blockquote> 4027 <p>baz</p> 4028 ```````````````````````````````` 4029 4030 4031 ```````````````````````````````` example 4032 > bar 4033 > 4034 baz 4035 . 4036 <blockquote> 4037 <p>bar</p> 4038 </blockquote> 4039 <p>baz</p> 4040 ```````````````````````````````` 4041 4042 4043 It is a consequence of the Laziness rule that any number 4044 of initial `>`s may be omitted on a continuation line of a 4045 nested block quote: 4046 4047 ```````````````````````````````` example 4048 > > > foo 4049 bar 4050 . 4051 <blockquote> 4052 <blockquote> 4053 <blockquote> 4054 <p>foo 4055 bar</p> 4056 </blockquote> 4057 </blockquote> 4058 </blockquote> 4059 ```````````````````````````````` 4060 4061 4062 ```````````````````````````````` example 4063 >>> foo 4064 > bar 4065 >>baz 4066 . 4067 <blockquote> 4068 <blockquote> 4069 <blockquote> 4070 <p>foo 4071 bar 4072 baz</p> 4073 </blockquote> 4074 </blockquote> 4075 </blockquote> 4076 ```````````````````````````````` 4077 4078 4079 When including an indented code block in a block quote, 4080 remember that the [block quote marker] includes 4081 both the `>` and a following space of indentation. So *five spaces* are needed 4082 after the `>`: 4083 4084 ```````````````````````````````` example 4085 > code 4086 4087 > not code 4088 . 4089 <blockquote> 4090 <pre><code>code 4091 </code></pre> 4092 </blockquote> 4093 <blockquote> 4094 <p>not code</p> 4095 </blockquote> 4096 ```````````````````````````````` 4097 4098 4099 4100 ## List items 4101 4102 A [list marker](@) is a 4103 [bullet list marker] or an [ordered list marker]. 4104 4105 A [bullet list marker](@) 4106 is a `-`, `+`, or `*` character. 4107 4108 An [ordered list marker](@) 4109 is a sequence of 1--9 arabic digits (`0-9`), followed by either a 4110 `.` character or a `)` character. (The reason for the length 4111 limit is that with 10 digits we start seeing integer overflows 4112 in some browsers.) 
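The marker definitions above can be sketched as a small recognizer. This is a minimal illustration in Python, not the reference implementation: the names `BULLET_MARKER`, `ORDERED_MARKER`, and `classify_list_marker` are hypothetical, and the sketch looks only at the marker itself, not at the spaces or content that must follow it.

```python
import re

# Hypothetical names; the patterns follow the definitions above.
# A bullet list marker is a single `-`, `+`, or `*` character.
BULLET_MARKER = re.compile(r"[-+*]")
# An ordered list marker is 1 to 9 ASCII digits followed by `.` or `)`.
ORDERED_MARKER = re.compile(r"([0-9]{1,9})([.)])")


def classify_list_marker(text):
    """Classify `text` as a list marker.

    Returns ("bullet", char) for a bullet list marker,
    ("ordered", start_number, delimiter) for an ordered list marker,
    and None when `text` is not a list marker at all.
    """
    if BULLET_MARKER.fullmatch(text):
        return ("bullet", text)
    match = ORDERED_MARKER.fullmatch(text)
    if match:
        # int() drops leading zeros, so "003." yields start number 3.
        return ("ordered", int(match.group(1)), match.group(2))
    return None
```

Under these assumptions, `classify_list_marker("003.")` gives an ordered marker with start number 3, while a ten-digit sequence such as `1234567890.` is rejected because it exceeds the nine-digit limit.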
4113 4114 The following rules define [list items]: 4115 4116 1. **Basic case.** If a sequence of lines *Ls* constitute a sequence of 4117 blocks *Bs* starting with a character other than a space or tab, and *M* is 4118 a list marker of width *W* followed by 1 ≤ *N* ≤ 4 spaces of indentation, 4119 then the result of prepending *M* and the following spaces to the first line 4120 of *Ls*, and indenting subsequent lines of *Ls* by *W + N* spaces, is a 4121 list item with *Bs* as its contents. The type of the list item 4122 (bullet or ordered) is determined by the type of its list marker. 4123 If the list item is ordered, then it is also assigned a start 4124 number, based on the ordered list marker. 4125 4126 Exceptions: 4127 4128 1. When the first list item in a [list] interrupts 4129 a paragraph---that is, when it starts on a line that would 4130 otherwise count as [paragraph continuation text]---then (a) 4131 the lines *Ls* must not begin with a blank line, and (b) if 4132 the list item is ordered, the start number must be 1. 4133 2. If any line is a [thematic break][thematic breaks] then 4134 that line is not a list item. 4135 4136 For example, let *Ls* be the lines 4137 4138 ```````````````````````````````` example 4139 A paragraph 4140 with two lines. 4141 4142 indented code 4143 4144 > A block quote. 4145 . 4146 <p>A paragraph 4147 with two lines.</p> 4148 <pre><code>indented code 4149 </code></pre> 4150 <blockquote> 4151 <p>A block quote.</p> 4152 </blockquote> 4153 ```````````````````````````````` 4154 4155 4156 And let *M* be the marker `1.`, and *N* = 2. Then rule #1 says 4157 that the following is an ordered list item with start number 1, 4158 and the same contents as *Ls*: 4159 4160 ```````````````````````````````` example 4161 1. A paragraph 4162 with two lines. 4163 4164 indented code 4165 4166 > A block quote. 4167 .
4168 <ol> 4169 <li> 4170 <p>A paragraph 4171 with two lines.</p> 4172 <pre><code>indented code 4173 </code></pre> 4174 <blockquote> 4175 <p>A block quote.</p> 4176 </blockquote> 4177 </li> 4178 </ol> 4179 ```````````````````````````````` 4180 4181 4182 The most important thing to notice is that the position of 4183 the text after the list marker determines how much indentation 4184 is needed in subsequent blocks in the list item. If the list 4185 marker takes up two spaces of indentation, and there are three spaces between 4186 the list marker and the next character other than a space or tab, then blocks 4187 must be indented five spaces in order to fall under the list 4188 item. 4189 4190 Here are some examples showing how far content must be indented to be 4191 put under the list item: 4192 4193 ```````````````````````````````` example 4194 - one 4195 4196 two 4197 . 4198 <ul> 4199 <li>one</li> 4200 </ul> 4201 <p>two</p> 4202 ```````````````````````````````` 4203 4204 4205 ```````````````````````````````` example 4206 - one 4207 4208 two 4209 . 4210 <ul> 4211 <li> 4212 <p>one</p> 4213 <p>two</p> 4214 </li> 4215 </ul> 4216 ```````````````````````````````` 4217 4218 4219 ```````````````````````````````` example 4220 - one 4221 4222 two 4223 . 4224 <ul> 4225 <li>one</li> 4226 </ul> 4227 <pre><code> two 4228 </code></pre> 4229 ```````````````````````````````` 4230 4231 4232 ```````````````````````````````` example 4233 - one 4234 4235 two 4236 . 4237 <ul> 4238 <li> 4239 <p>one</p> 4240 <p>two</p> 4241 </li> 4242 </ul> 4243 ```````````````````````````````` 4244 4245 4246 It is tempting to think of this in terms of columns: the continuation 4247 blocks must be indented at least to the column of the first character other than 4248 a space or tab after the list marker. However, that is not quite right. 4249 The spaces of indentation after the list marker determine how much relative 4250 indentation is needed. 
Which column this indentation reaches will depend on 4251 how the list item is embedded in other constructions, as shown by 4252 this example: 4253 4254 ```````````````````````````````` example 4255 > > 1. one 4256 >> 4257 >> two 4258 . 4259 <blockquote> 4260 <blockquote> 4261 <ol> 4262 <li> 4263 <p>one</p> 4264 <p>two</p> 4265 </li> 4266 </ol> 4267 </blockquote> 4268 </blockquote> 4269 ```````````````````````````````` 4270 4271 4272 Here `two` occurs in the same column as the list marker `1.`, 4273 but is actually contained in the list item, because there is 4274 sufficient indentation after the last containing blockquote marker. 4275 4276 The converse is also possible. In the following example, the word `two` 4277 occurs far to the right of the initial text of the list item, `one`, but 4278 it is not considered part of the list item, because it is not indented 4279 far enough past the blockquote marker: 4280 4281 ```````````````````````````````` example 4282 >>- one 4283 >> 4284 > > two 4285 . 4286 <blockquote> 4287 <blockquote> 4288 <ul> 4289 <li>one</li> 4290 </ul> 4291 <p>two</p> 4292 </blockquote> 4293 </blockquote> 4294 ```````````````````````````````` 4295 4296 4297 Note that at least one space or tab is needed between the list marker and 4298 any following content, so these are not list items: 4299 4300 ```````````````````````````````` example 4301 -one 4302 4303 2.two 4304 . 4305 <p>-one</p> 4306 <p>2.two</p> 4307 ```````````````````````````````` 4308 4309 4310 A list item may contain blocks that are separated by more than 4311 one blank line. 4312 4313 ```````````````````````````````` example 4314 - foo 4315 4316 4317 bar 4318 . 4319 <ul> 4320 <li> 4321 <p>foo</p> 4322 <p>bar</p> 4323 </li> 4324 </ul> 4325 ```````````````````````````````` 4326 4327 4328 A list item may contain any kind of block: 4329 4330 ```````````````````````````````` example 4331 1. foo 4332 4333 ``` 4334 bar 4335 ``` 4336 4337 baz 4338 4339 > bam 4340 . 
4341 <ol> 4342 <li> 4343 <p>foo</p> 4344 <pre><code>bar 4345 </code></pre> 4346 <p>baz</p> 4347 <blockquote> 4348 <p>bam</p> 4349 </blockquote> 4350 </li> 4351 </ol> 4352 ```````````````````````````````` 4353 4354 4355 A list item that contains an indented code block will preserve 4356 empty lines within the code block verbatim. 4357 4358 ```````````````````````````````` example 4359 - Foo 4360 4361 bar 4362 4363 4364 baz 4365 . 4366 <ul> 4367 <li> 4368 <p>Foo</p> 4369 <pre><code>bar 4370 4371 4372 baz 4373 </code></pre> 4374 </li> 4375 </ul> 4376 ```````````````````````````````` 4377 4378 Note that ordered list start numbers must be nine digits or less: 4379 4380 ```````````````````````````````` example 4381 123456789. ok 4382 . 4383 <ol start="123456789"> 4384 <li>ok</li> 4385 </ol> 4386 ```````````````````````````````` 4387 4388 4389 ```````````````````````````````` example 4390 1234567890. not ok 4391 . 4392 <p>1234567890. not ok</p> 4393 ```````````````````````````````` 4394 4395 4396 A start number may begin with 0s: 4397 4398 ```````````````````````````````` example 4399 0. ok 4400 . 4401 <ol start="0"> 4402 <li>ok</li> 4403 </ol> 4404 ```````````````````````````````` 4405 4406 4407 ```````````````````````````````` example 4408 003. ok 4409 . 4410 <ol start="3"> 4411 <li>ok</li> 4412 </ol> 4413 ```````````````````````````````` 4414 4415 4416 A start number may not be negative: 4417 4418 ```````````````````````````````` example 4419 -1. not ok 4420 . 4421 <p>-1. not ok</p> 4422 ```````````````````````````````` 4423 4424 4425 4426 2. **Item starting with indented code.** If a sequence of lines *Ls* 4427 constitute a sequence of blocks *Bs* starting with an indented code 4428 block, and *M* is a list marker of width *W* followed by 4429 one space of indentation, then the result of prepending *M* and the 4430 following space to the first line of *Ls*, and indenting subsequent lines 4431 of *Ls* by *W + 1* spaces, is a list item with *Bs* as its contents. 
4432 If a line is empty, then it need not be indented. The type of the 4433 list item (bullet or ordered) is determined by the type of its list 4434 marker. If the list item is ordered, then it is also assigned a 4435 start number, based on the ordered list marker. 4436 4437 An indented code block will have to be preceded by four spaces of indentation 4438 beyond the edge of the region where text will be included in the list item. 4439 In the following case that is 6 spaces: 4440 4441 ```````````````````````````````` example 4442 - foo 4443 4444 bar 4445 . 4446 <ul> 4447 <li> 4448 <p>foo</p> 4449 <pre><code>bar 4450 </code></pre> 4451 </li> 4452 </ul> 4453 ```````````````````````````````` 4454 4455 4456 And in this case it is 11 spaces: 4457 4458 ```````````````````````````````` example 4459 10. foo 4460 4461 bar 4462 . 4463 <ol start="10"> 4464 <li> 4465 <p>foo</p> 4466 <pre><code>bar 4467 </code></pre> 4468 </li> 4469 </ol> 4470 ```````````````````````````````` 4471 4472 4473 If the *first* block in the list item is an indented code block, 4474 then by rule #2, the contents must be preceded by *one* space of indentation 4475 after the list marker: 4476 4477 ```````````````````````````````` example 4478 indented code 4479 4480 paragraph 4481 4482 more code 4483 . 4484 <pre><code>indented code 4485 </code></pre> 4486 <p>paragraph</p> 4487 <pre><code>more code 4488 </code></pre> 4489 ```````````````````````````````` 4490 4491 4492 ```````````````````````````````` example 4493 1. indented code 4494 4495 paragraph 4496 4497 more code 4498 . 4499 <ol> 4500 <li> 4501 <pre><code>indented code 4502 </code></pre> 4503 <p>paragraph</p> 4504 <pre><code>more code 4505 </code></pre> 4506 </li> 4507 </ol> 4508 ```````````````````````````````` 4509 4510 4511 Note that an additional space of indentation is interpreted as space 4512 inside the code block: 4513 4514 ```````````````````````````````` example 4515 1. indented code 4516 4517 paragraph 4518 4519 more code 4520 . 
4521 <ol> 4522 <li> 4523 <pre><code> indented code 4524 </code></pre> 4525 <p>paragraph</p> 4526 <pre><code>more code 4527 </code></pre> 4528 </li> 4529 </ol> 4530 ```````````````````````````````` 4531 4532 4533 Note that rules #1 and #2 only apply to two cases: (a) cases 4534 in which the lines to be included in a list item begin with a 4535 character other than a space or tab, and (b) cases in which 4536 they begin with an indented code 4537 block. In a case like the following, where the first block begins with 4538 three spaces of indentation, the rules do not allow us to form a list item by 4539 indenting the whole thing and prepending a list marker: 4540 4541 ```````````````````````````````` example 4542 foo 4543 4544 bar 4545 . 4546 <p>foo</p> 4547 <p>bar</p> 4548 ```````````````````````````````` 4549 4550 4551 ```````````````````````````````` example 4552 - foo 4553 4554 bar 4555 . 4556 <ul> 4557 <li>foo</li> 4558 </ul> 4559 <p>bar</p> 4560 ```````````````````````````````` 4561 4562 4563 This is not a significant restriction, because when a block is preceded by up to 4564 three spaces of indentation, the indentation can always be removed without 4565 a change in interpretation, allowing rule #1 to be applied. So, in 4566 the above case: 4567 4568 ```````````````````````````````` example 4569 - foo 4570 4571 bar 4572 . 4573 <ul> 4574 <li> 4575 <p>foo</p> 4576 <p>bar</p> 4577 </li> 4578 </ul> 4579 ```````````````````````````````` 4580 4581 4582 3. **Item starting with a blank line.** If a sequence of lines *Ls* 4583 starting with a single [blank line] constitute a (possibly empty) 4584 sequence of blocks *Bs*, and *M* is a list marker of width *W*, 4585 then the result of prepending *M* to the first line of *Ls*, and 4586 preceding subsequent lines of *Ls* by *W + 1* spaces of indentation, is a 4587 list item with *Bs* as its contents. 4588 If a line is empty, then it need not be indented.
The type of the 4589 list item (bullet or ordered) is determined by the type of its list 4590 marker. If the list item is ordered, then it is also assigned a 4591 start number, based on the ordered list marker. 4592 4593 Here are some list items that start with a blank line but are not empty: 4594 4595 ```````````````````````````````` example 4596 - 4597 foo 4598 - 4599 ``` 4600 bar 4601 ``` 4602 - 4603 baz 4604 . 4605 <ul> 4606 <li>foo</li> 4607 <li> 4608 <pre><code>bar 4609 </code></pre> 4610 </li> 4611 <li> 4612 <pre><code>baz 4613 </code></pre> 4614 </li> 4615 </ul> 4616 ```````````````````````````````` 4617 4618 When the list item starts with a blank line, the number of spaces 4619 following the list marker doesn't change the required indentation: 4620 4621 ```````````````````````````````` example 4622 - 4623 foo 4624 . 4625 <ul> 4626 <li>foo</li> 4627 </ul> 4628 ```````````````````````````````` 4629 4630 4631 A list item can begin with at most one blank line. 4632 In the following example, `foo` is not part of the list 4633 item: 4634 4635 ```````````````````````````````` example 4636 - 4637 4638 foo 4639 . 4640 <ul> 4641 <li></li> 4642 </ul> 4643 <p>foo</p> 4644 ```````````````````````````````` 4645 4646 4647 Here is an empty bullet list item: 4648 4649 ```````````````````````````````` example 4650 - foo 4651 - 4652 - bar 4653 . 4654 <ul> 4655 <li>foo</li> 4656 <li></li> 4657 <li>bar</li> 4658 </ul> 4659 ```````````````````````````````` 4660 4661 4662 It does not matter whether there are spaces or tabs following the [list marker]: 4663 4664 ```````````````````````````````` example 4665 - foo 4666 - 4667 - bar 4668 . 4669 <ul> 4670 <li>foo</li> 4671 <li></li> 4672 <li>bar</li> 4673 </ul> 4674 ```````````````````````````````` 4675 4676 4677 Here is an empty ordered list item: 4678 4679 ```````````````````````````````` example 4680 1. foo 4681 2. 4682 3. bar 4683 . 
4684 <ol> 4685 <li>foo</li> 4686 <li></li> 4687 <li>bar</li> 4688 </ol> 4689 ```````````````````````````````` 4690 4691 4692 A list may start or end with an empty list item: 4693 4694 ```````````````````````````````` example 4695 * 4696 . 4697 <ul> 4698 <li></li> 4699 </ul> 4700 ```````````````````````````````` 4701 4702 However, an empty list item cannot interrupt a paragraph: 4703 4704 ```````````````````````````````` example 4705 foo 4706 * 4707 4708 foo 4709 1. 4710 . 4711 <p>foo 4712 *</p> 4713 <p>foo 4714 1.</p> 4715 ```````````````````````````````` 4716 4717 4718 4. **Indentation.** If a sequence of lines *Ls* constitutes a list item 4719 according to rule #1, #2, or #3, then the result of preceding each line 4720 of *Ls* by up to three spaces of indentation (the same for each line) also 4721 constitutes a list item with the same contents and attributes. If a line is 4722 empty, then it need not be indented. 4723 4724 Indented one space: 4725 4726 ```````````````````````````````` example 4727 1. A paragraph 4728 with two lines. 4729 4730 indented code 4731 4732 > A block quote. 4733 . 4734 <ol> 4735 <li> 4736 <p>A paragraph 4737 with two lines.</p> 4738 <pre><code>indented code 4739 </code></pre> 4740 <blockquote> 4741 <p>A block quote.</p> 4742 </blockquote> 4743 </li> 4744 </ol> 4745 ```````````````````````````````` 4746 4747 4748 Indented two spaces: 4749 4750 ```````````````````````````````` example 4751 1. A paragraph 4752 with two lines. 4753 4754 indented code 4755 4756 > A block quote. 4757 . 4758 <ol> 4759 <li> 4760 <p>A paragraph 4761 with two lines.</p> 4762 <pre><code>indented code 4763 </code></pre> 4764 <blockquote> 4765 <p>A block quote.</p> 4766 </blockquote> 4767 </li> 4768 </ol> 4769 ```````````````````````````````` 4770 4771 4772 Indented three spaces: 4773 4774 ```````````````````````````````` example 4775 1. A paragraph 4776 with two lines. 4777 4778 indented code 4779 4780 > A block quote. 4781 . 
4782 <ol> 4783 <li> 4784 <p>A paragraph 4785 with two lines.</p> 4786 <pre><code>indented code 4787 </code></pre> 4788 <blockquote> 4789 <p>A block quote.</p> 4790 </blockquote> 4791 </li> 4792 </ol> 4793 ```````````````````````````````` 4794 4795 4796 Four spaces indent gives a code block: 4797 4798 ```````````````````````````````` example 4799 1. A paragraph 4800 with two lines. 4801 4802 indented code 4803 4804 > A block quote. 4805 . 4806 <pre><code>1. A paragraph 4807 with two lines. 4808 4809 indented code 4810 4811 > A block quote. 4812 </code></pre> 4813 ```````````````````````````````` 4814 4815 4816 4817 5. **Laziness.** If a string of lines *Ls* constitute a [list 4818 item](#list-items) with contents *Bs*, then the result of deleting 4819 some or all of the indentation from one or more lines in which the 4820 next character other than a space or tab after the indentation is 4821 [paragraph continuation text] is a 4822 list item with the same contents and attributes. The unindented 4823 lines are called 4824 [lazy continuation line](@)s. 4825 4826 Here is an example with [lazy continuation lines]: 4827 4828 ```````````````````````````````` example 4829 1. A paragraph 4830 with two lines. 4831 4832 indented code 4833 4834 > A block quote. 4835 . 4836 <ol> 4837 <li> 4838 <p>A paragraph 4839 with two lines.</p> 4840 <pre><code>indented code 4841 </code></pre> 4842 <blockquote> 4843 <p>A block quote.</p> 4844 </blockquote> 4845 </li> 4846 </ol> 4847 ```````````````````````````````` 4848 4849 4850 Indentation can be partially deleted: 4851 4852 ```````````````````````````````` example 4853 1. A paragraph 4854 with two lines. 4855 . 4856 <ol> 4857 <li>A paragraph 4858 with two lines.</li> 4859 </ol> 4860 ```````````````````````````````` 4861 4862 4863 These examples show how laziness can work in nested structures: 4864 4865 ```````````````````````````````` example 4866 > 1. > Blockquote 4867 continued here. 4868 . 
4869 <blockquote> 4870 <ol> 4871 <li> 4872 <blockquote> 4873 <p>Blockquote 4874 continued here.</p> 4875 </blockquote> 4876 </li> 4877 </ol> 4878 </blockquote> 4879 ```````````````````````````````` 4880 4881 4882 ```````````````````````````````` example 4883 > 1. > Blockquote 4884 > continued here. 4885 . 4886 <blockquote> 4887 <ol> 4888 <li> 4889 <blockquote> 4890 <p>Blockquote 4891 continued here.</p> 4892 </blockquote> 4893 </li> 4894 </ol> 4895 </blockquote> 4896 ```````````````````````````````` 4897 4898 4899 4900 6. **That's all.** Nothing that is not counted as a list item by rules 4901 #1--5 counts as a [list item](#list-items). 4902 4903 The rules for sublists follow from the general rules 4904 [above][List items]. A sublist must be indented the same number 4905 of spaces of indentation a paragraph would need to be in order to be included 4906 in the list item. 4907 4908 So, in this case we need two spaces indent: 4909 4910 ```````````````````````````````` example 4911 - foo 4912 - bar 4913 - baz 4914 - boo 4915 . 4916 <ul> 4917 <li>foo 4918 <ul> 4919 <li>bar 4920 <ul> 4921 <li>baz 4922 <ul> 4923 <li>boo</li> 4924 </ul> 4925 </li> 4926 </ul> 4927 </li> 4928 </ul> 4929 </li> 4930 </ul> 4931 ```````````````````````````````` 4932 4933 4934 One is not enough: 4935 4936 ```````````````````````````````` example 4937 - foo 4938 - bar 4939 - baz 4940 - boo 4941 . 4942 <ul> 4943 <li>foo</li> 4944 <li>bar</li> 4945 <li>baz</li> 4946 <li>boo</li> 4947 </ul> 4948 ```````````````````````````````` 4949 4950 4951 Here we need four, because the list marker is wider: 4952 4953 ```````````````````````````````` example 4954 10) foo 4955 - bar 4956 . 4957 <ol start="10"> 4958 <li>foo 4959 <ul> 4960 <li>bar</li> 4961 </ul> 4962 </li> 4963 </ol> 4964 ```````````````````````````````` 4965 4966 4967 Three is not enough: 4968 4969 ```````````````````````````````` example 4970 10) foo 4971 - bar 4972 . 
4973 <ol start="10"> 4974 <li>foo</li> 4975 </ol> 4976 <ul> 4977 <li>bar</li> 4978 </ul> 4979 ```````````````````````````````` 4980 4981 4982 A list may be the first block in a list item: 4983 4984 ```````````````````````````````` example 4985 - - foo 4986 . 4987 <ul> 4988 <li> 4989 <ul> 4990 <li>foo</li> 4991 </ul> 4992 </li> 4993 </ul> 4994 ```````````````````````````````` 4995 4996 4997 ```````````````````````````````` example 4998 1. - 2. foo 4999 . 5000 <ol> 5001 <li> 5002 <ul> 5003 <li> 5004 <ol start="2"> 5005 <li>foo</li> 5006 </ol> 5007 </li> 5008 </ul> 5009 </li> 5010 </ol> 5011 ```````````````````````````````` 5012 5013 5014 A list item can contain a heading: 5015 5016 ```````````````````````````````` example 5017 - # Foo 5018 - Bar 5019 --- 5020 baz 5021 . 5022 <ul> 5023 <li> 5024 <h1>Foo</h1> 5025 </li> 5026 <li> 5027 <h2>Bar</h2> 5028 baz</li> 5029 </ul> 5030 ```````````````````````````````` 5031 5032 5033 ### Motivation 5034 5035 John Gruber's Markdown spec says the following about list items: 5036 5037 1. "List markers typically start at the left margin, but may be indented 5038 by up to three spaces. List markers must be followed by one or more 5039 spaces or a tab." 5040 5041 2. "To make lists look nice, you can wrap items with hanging indents.... 5042 But if you don't want to, you don't have to." 5043 5044 3. "List items may consist of multiple paragraphs. Each subsequent 5045 paragraph in a list item must be indented by either 4 spaces or one 5046 tab." 5047 5048 4. "It looks nice if you indent every line of the subsequent paragraphs, 5049 but here again, Markdown will allow you to be lazy." 5050 5051 5. "To put a blockquote within a list item, the blockquote's `>` 5052 delimiters need to be indented." 5053 5054 6. "To put a code block within a list item, the code block needs to be 5055 indented twice — 8 spaces or two tabs." 
5056 5057 These rules specify that a paragraph under a list item must be indented 5058 four spaces (presumably, from the left margin, rather than the start of 5059 the list marker, but this is not said), and that code under a list item 5060 must be indented eight spaces instead of the usual four. They also say 5061 that a block quote must be indented, but not by how much; however, the 5062 example given has four spaces indentation. Although nothing is said 5063 about other kinds of block-level content, it is certainly reasonable to 5064 infer that *all* block elements under a list item, including other 5065 lists, must be indented four spaces. This principle has been called the 5066 *four-space rule*. 5067 5068 The four-space rule is clear and principled, and if the reference 5069 implementation `Markdown.pl` had followed it, it probably would have 5070 become the standard. However, `Markdown.pl` allowed paragraphs and 5071 sublists to start with only two spaces indentation, at least on the 5072 outer level. Worse, its behavior was inconsistent: a sublist of an 5073 outer-level list needed two spaces indentation, but a sublist of this 5074 sublist needed three spaces. It is not surprising, then, that different 5075 implementations of Markdown have developed very different rules for 5076 determining what comes under a list item. (Pandoc and python-Markdown, 5077 for example, stuck with Gruber's syntax description and the four-space 5078 rule, while discount, redcarpet, marked, PHP Markdown, and others 5079 followed `Markdown.pl`'s behavior more closely.) 5080 5081 Unfortunately, given the divergences between implementations, there 5082 is no way to give a spec for list items that will be guaranteed not 5083 to break any existing documents. 
However, the spec given here should 5084 correctly handle lists formatted with either the four-space rule or 5085 the more forgiving `Markdown.pl` behavior, provided they are laid out 5086 in a way that is natural for a human to read. 5087 5088 The strategy here is to let the width and indentation of the list marker 5089 determine the indentation necessary for blocks to fall under the list 5090 item, rather than having a fixed and arbitrary number. The writer can 5091 think of the body of the list item as a unit which gets indented to the 5092 right enough to fit the list marker (and any indentation on the list 5093 marker). (The laziness rule, #5, then allows continuation lines to be 5094 unindented if needed.) 5095 5096 This rule is superior, we claim, to any rule requiring a fixed level of 5097 indentation from the margin. The four-space rule is clear but 5098 unnatural. It is quite unintuitive that 5099 5100 ``` markdown 5101 - foo 5102 5103 bar 5104 5105 - baz 5106 ``` 5107 5108 should be parsed as two lists with an intervening paragraph, 5109 5110 ``` html 5111 <ul> 5112 <li>foo</li> 5113 </ul> 5114 <p>bar</p> 5115 <ul> 5116 <li>baz</li> 5117 </ul> 5118 ``` 5119 5120 as the four-space rule demands, rather than a single list, 5121 5122 ``` html 5123 <ul> 5124 <li> 5125 <p>foo</p> 5126 <p>bar</p> 5127 <ul> 5128 <li>baz</li> 5129 </ul> 5130 </li> 5131 </ul> 5132 ``` 5133 5134 The choice of four spaces is arbitrary. It can be learned, but it is 5135 not likely to be guessed, and it trips up beginners regularly. 5136 5137 Would it help to adopt a two-space rule? The problem is that such 5138 a rule, together with the rule allowing up to three spaces of indentation for 5139 the initial list marker, allows text that is indented *less than* the 5140 original list marker to be included in the list item. 
For example, 5141 `Markdown.pl` parses 5142 5143 ``` markdown 5144 - one 5145 5146 two 5147 ``` 5148 5149 as a single list item, with `two` a continuation paragraph: 5150 5151 ``` html 5152 <ul> 5153 <li> 5154 <p>one</p> 5155 <p>two</p> 5156 </li> 5157 </ul> 5158 ``` 5159 5160 and similarly 5161 5162 ``` markdown 5163 > - one 5164 > 5165 > two 5166 ``` 5167 5168 as 5169 5170 ``` html 5171 <blockquote> 5172 <ul> 5173 <li> 5174 <p>one</p> 5175 <p>two</p> 5176 </li> 5177 </ul> 5178 </blockquote> 5179 ``` 5180 5181 This is extremely unintuitive. 5182 5183 Rather than requiring a fixed indent from the margin, we could require 5184 a fixed indent (say, two spaces, or even one space) from the list marker (which 5185 may itself be indented). This proposal would remove the last anomaly 5186 discussed. Unlike the spec presented above, it would count the following 5187 as a list item with a subparagraph, even though the paragraph `bar` 5188 is not indented as far as the first paragraph `foo`: 5189 5190 ``` markdown 5191 10. foo 5192 5193 bar 5194 ``` 5195 5196 Arguably this text does read like a list item with `bar` as a subparagraph, 5197 which may count in favor of the proposal. However, on this proposal indented 5198 code would have to be indented six spaces after the list marker. And this 5199 would break a lot of existing Markdown, which has the pattern: 5200 5201 ``` markdown 5202 1. foo 5203 5204 indented code 5205 ``` 5206 5207 where the code is indented eight spaces. The spec above, by contrast, will 5208 parse this text as expected, since the code block's indentation is measured 5209 from the beginning of `foo`. 5210 5211 The one case that needs special treatment is a list item that *starts* 5212 with indented code. How much indentation is required in that case, since 5213 we don't have a "first paragraph" to measure from? 
Rule #2 simply stipulates 5214 that in such cases, we require one space indentation from the list marker 5215 (and then the normal four spaces for the indented code). This will match the 5216 four-space rule in cases where the list marker plus its initial indentation 5217 takes four spaces (a common case), but diverge in other cases. 5218 5219 ## Lists 5220 5221 A [list](@) is a sequence of one or more 5222 list items [of the same type]. The list items 5223 may be separated by any number of blank lines. 5224 5225 Two list items are [of the same type](@) 5226 if they begin with a [list marker] of the same type. 5227 Two list markers are of the 5228 same type if (a) they are bullet list markers using the same character 5229 (`-`, `+`, or `*`) or (b) they are ordered list numbers with the same 5230 delimiter (either `.` or `)`). 5231 5232 A list is an [ordered list](@) 5233 if its constituent list items begin with 5234 [ordered list markers], and a 5235 [bullet list](@) if its constituent list 5236 items begin with [bullet list markers]. 5237 5238 The [start number](@) 5239 of an [ordered list] is determined by the list number of 5240 its initial list item. The numbers of subsequent list items are 5241 disregarded. 5242 5243 A list is [loose](@) if any of its constituent 5244 list items are separated by blank lines, or if any of its constituent 5245 list items directly contain two block-level elements with a blank line 5246 between them. Otherwise a list is [tight](@). 5247 (The difference in HTML output is that paragraphs in a loose list are 5248 wrapped in `<p>` tags, while paragraphs in a tight list are not.) 5249 5250 Changing the bullet or ordered list delimiter starts a new list: 5251 5252 ```````````````````````````````` example 5253 - foo 5254 - bar 5255 + baz 5256 . 
5257 <ul> 5258 <li>foo</li> 5259 <li>bar</li> 5260 </ul> 5261 <ul> 5262 <li>baz</li> 5263 </ul> 5264 ```````````````````````````````` 5265 5266 5267 ```````````````````````````````` example 5268 1. foo 5269 2. bar 5270 3) baz 5271 . 5272 <ol> 5273 <li>foo</li> 5274 <li>bar</li> 5275 </ol> 5276 <ol start="3"> 5277 <li>baz</li> 5278 </ol> 5279 ```````````````````````````````` 5280 5281 5282 In CommonMark, a list can interrupt a paragraph. That is, 5283 no blank line is needed to separate a paragraph from a following 5284 list: 5285 5286 ```````````````````````````````` example 5287 Foo 5288 - bar 5289 - baz 5290 . 5291 <p>Foo</p> 5292 <ul> 5293 <li>bar</li> 5294 <li>baz</li> 5295 </ul> 5296 ```````````````````````````````` 5297 5298 `Markdown.pl` does not allow this, through fear of triggering a list 5299 via a numeral in a hard-wrapped line: 5300 5301 ``` markdown 5302 The number of windows in my house is 5303 14. The number of doors is 6. 5304 ``` 5305 5306 Oddly, though, `Markdown.pl` *does* allow a blockquote to 5307 interrupt a paragraph, even though the same considerations might 5308 apply. 5309 5310 In CommonMark, we do allow lists to interrupt paragraphs, for 5311 two reasons. First, it is natural and not uncommon for people 5312 to start lists without blank lines: 5313 5314 ``` markdown 5315 I need to buy 5316 - new shoes 5317 - a coat 5318 - a plane ticket 5319 ``` 5320 5321 Second, we are attracted to a 5322 5323 > [principle of uniformity](@): 5324 > if a chunk of text has a certain 5325 > meaning, it will continue to have the same meaning when put into a 5326 > container block (such as a list item or blockquote). 5327 5328 (Indeed, the spec for [list items] and [block quotes] presupposes 5329 this principle.) 
This principle implies that if 5330 5331 ``` markdown 5332 * I need to buy 5333 - new shoes 5334 - a coat 5335 - a plane ticket 5336 ``` 5337 5338 is a list item containing a paragraph followed by a nested sublist, 5339 as all Markdown implementations agree it is (though the paragraph 5340 may be rendered without `<p>` tags, since the list is "tight"), 5341 then 5342 5343 ``` markdown 5344 I need to buy 5345 - new shoes 5346 - a coat 5347 - a plane ticket 5348 ``` 5349 5350 by itself should be a paragraph followed by a nested sublist. 5351 5352 Since it is well-established Markdown practice to allow lists to 5353 interrupt paragraphs inside list items, the [principle of 5354 uniformity] requires us to allow this outside list items as 5355 well. ([reStructuredText](http://docutils.sourceforge.net/rst.html) 5356 takes a different approach, requiring blank lines before lists 5357 even inside other list items.) 5358 5359 In order to solve the problem of unwanted lists in paragraphs with 5360 hard-wrapped numerals, we allow only lists starting with `1` to 5361 interrupt paragraphs. Thus, 5362 5363 ```````````````````````````````` example 5364 The number of windows in my house is 5365 14. The number of doors is 6. 5366 . 5367 <p>The number of windows in my house is 5368 14. The number of doors is 6.</p> 5369 ```````````````````````````````` 5370 5371 We may still get an unintended result in cases like 5372 5373 ```````````````````````````````` example 5374 The number of windows in my house is 5375 1. The number of doors is 6. 5376 . 5377 <p>The number of windows in my house is</p> 5378 <ol> 5379 <li>The number of doors is 6.</li> 5380 </ol> 5381 ```````````````````````````````` 5382 5383 but this rule should prevent most spurious list captures. 5384 5385 There can be any number of blank lines between items: 5386 5387 ```````````````````````````````` example 5388 - foo 5389 5390 - bar 5391 5392 5393 - baz 5394 . 
5395 <ul> 5396 <li> 5397 <p>foo</p> 5398 </li> 5399 <li> 5400 <p>bar</p> 5401 </li> 5402 <li> 5403 <p>baz</p> 5404 </li> 5405 </ul> 5406 ```````````````````````````````` 5407 5408 ```````````````````````````````` example 5409 - foo 5410 - bar 5411 - baz 5412 5413 5414 bim 5415 . 5416 <ul> 5417 <li>foo 5418 <ul> 5419 <li>bar 5420 <ul> 5421 <li> 5422 <p>baz</p> 5423 <p>bim</p> 5424 </li> 5425 </ul> 5426 </li> 5427 </ul> 5428 </li> 5429 </ul> 5430 ```````````````````````````````` 5431 5432 5433 To separate consecutive lists of the same type, or to separate a 5434 list from an indented code block that would otherwise be parsed 5435 as a subparagraph of the final list item, you can insert a blank HTML 5436 comment: 5437 5438 ```````````````````````````````` example 5439 - foo 5440 - bar 5441 5442 <!-- --> 5443 5444 - baz 5445 - bim 5446 . 5447 <ul> 5448 <li>foo</li> 5449 <li>bar</li> 5450 </ul> 5451 <!-- --> 5452 <ul> 5453 <li>baz</li> 5454 <li>bim</li> 5455 </ul> 5456 ```````````````````````````````` 5457 5458 5459 ```````````````````````````````` example 5460 - foo 5461 5462 notcode 5463 5464 - foo 5465 5466 <!-- --> 5467 5468 code 5469 . 5470 <ul> 5471 <li> 5472 <p>foo</p> 5473 <p>notcode</p> 5474 </li> 5475 <li> 5476 <p>foo</p> 5477 </li> 5478 </ul> 5479 <!-- --> 5480 <pre><code>code 5481 </code></pre> 5482 ```````````````````````````````` 5483 5484 5485 List items need not be indented to the same level. The following 5486 list items will be treated as items at the same list level, 5487 since none is indented enough to belong to the previous list 5488 item: 5489 5490 ```````````````````````````````` example 5491 - a 5492 - b 5493 - c 5494 - d 5495 - e 5496 - f 5497 - g 5498 . 5499 <ul> 5500 <li>a</li> 5501 <li>b</li> 5502 <li>c</li> 5503 <li>d</li> 5504 <li>e</li> 5505 <li>f</li> 5506 <li>g</li> 5507 </ul> 5508 ```````````````````````````````` 5509 5510 5511 ```````````````````````````````` example 5512 1. a 5513 5514 2. b 5515 5516 3. c 5517 . 
5518 <ol> 5519 <li> 5520 <p>a</p> 5521 </li> 5522 <li> 5523 <p>b</p> 5524 </li> 5525 <li> 5526 <p>c</p> 5527 </li> 5528 </ol> 5529 ```````````````````````````````` 5530 5531 Note, however, that list items may not be preceded by more than 5532 three spaces of indentation. Here `- e` is treated as a paragraph continuation 5533 line, because it is indented more than three spaces: 5534 5535 ```````````````````````````````` example 5536 - a 5537 - b 5538 - c 5539 - d 5540 - e 5541 . 5542 <ul> 5543 <li>a</li> 5544 <li>b</li> 5545 <li>c</li> 5546 <li>d 5547 - e</li> 5548 </ul> 5549 ```````````````````````````````` 5550 5551 And here, `3. c` is treated as an indented code block, 5552 because it is indented four spaces and preceded by a 5553 blank line: 5554 5555 ```````````````````````````````` example 5556 1. a 5557 5558 2. b 5559 5560 3. c 5561 . 5562 <ol> 5563 <li> 5564 <p>a</p> 5565 </li> 5566 <li> 5567 <p>b</p> 5568 </li> 5569 </ol> 5570 <pre><code>3. c 5571 </code></pre> 5572 ```````````````````````````````` 5573 5574 5575 This is a loose list, because there is a blank line between 5576 two of the list items: 5577 5578 ```````````````````````````````` example 5579 - a 5580 - b 5581 5582 - c 5583 . 5584 <ul> 5585 <li> 5586 <p>a</p> 5587 </li> 5588 <li> 5589 <p>b</p> 5590 </li> 5591 <li> 5592 <p>c</p> 5593 </li> 5594 </ul> 5595 ```````````````````````````````` 5596 5597 5598 So is this, with an empty second item: 5599 5600 ```````````````````````````````` example 5601 * a 5602 * 5603 5604 * c 5605 . 5606 <ul> 5607 <li> 5608 <p>a</p> 5609 </li> 5610 <li></li> 5611 <li> 5612 <p>c</p> 5613 </li> 5614 </ul> 5615 ```````````````````````````````` 5616 5617 5618 These are loose lists, even though there are no blank lines between the items, 5619 because one of the items directly contains two block-level elements 5620 with a blank line between them: 5621 5622 ```````````````````````````````` example 5623 - a 5624 - b 5625 5626 c 5627 - d 5628 . 
5629 <ul> 5630 <li> 5631 <p>a</p> 5632 </li> 5633 <li> 5634 <p>b</p> 5635 <p>c</p> 5636 </li> 5637 <li> 5638 <p>d</p> 5639 </li> 5640 </ul> 5641 ```````````````````````````````` 5642 5643 5644 ```````````````````````````````` example 5645 - a 5646 - b 5647 5648 [ref]: /url 5649 - d 5650 . 5651 <ul> 5652 <li> 5653 <p>a</p> 5654 </li> 5655 <li> 5656 <p>b</p> 5657 </li> 5658 <li> 5659 <p>d</p> 5660 </li> 5661 </ul> 5662 ```````````````````````````````` 5663 5664 5665 This is a tight list, because the blank lines are in a code block: 5666 5667 ```````````````````````````````` example 5668 - a 5669 - ``` 5670 b 5671 5672 5673 ``` 5674 - c 5675 . 5676 <ul> 5677 <li>a</li> 5678 <li> 5679 <pre><code>b 5680 5681 5682 </code></pre> 5683 </li> 5684 <li>c</li> 5685 </ul> 5686 ```````````````````````````````` 5687 5688 5689 This is a tight list, because the blank line is between two 5690 paragraphs of a sublist. So the sublist is loose while 5691 the outer list is tight: 5692 5693 ```````````````````````````````` example 5694 - a 5695 - b 5696 5697 c 5698 - d 5699 . 5700 <ul> 5701 <li>a 5702 <ul> 5703 <li> 5704 <p>b</p> 5705 <p>c</p> 5706 </li> 5707 </ul> 5708 </li> 5709 <li>d</li> 5710 </ul> 5711 ```````````````````````````````` 5712 5713 5714 This is a tight list, because the blank line is inside the 5715 block quote: 5716 5717 ```````````````````````````````` example 5718 * a 5719 > b 5720 > 5721 * c 5722 . 5723 <ul> 5724 <li>a 5725 <blockquote> 5726 <p>b</p> 5727 </blockquote> 5728 </li> 5729 <li>c</li> 5730 </ul> 5731 ```````````````````````````````` 5732 5733 5734 This list is tight, because the consecutive block elements 5735 are not separated by blank lines: 5736 5737 ```````````````````````````````` example 5738 - a 5739 > b 5740 ``` 5741 c 5742 ``` 5743 - d 5744 . 
5745 <ul> 5746 <li>a 5747 <blockquote> 5748 <p>b</p> 5749 </blockquote> 5750 <pre><code>c 5751 </code></pre> 5752 </li> 5753 <li>d</li> 5754 </ul> 5755 ```````````````````````````````` 5756 5757 5758 A single-paragraph list is tight: 5759 5760 ```````````````````````````````` example 5761 - a 5762 . 5763 <ul> 5764 <li>a</li> 5765 </ul> 5766 ```````````````````````````````` 5767 5768 5769 ```````````````````````````````` example 5770 - a 5771 - b 5772 . 5773 <ul> 5774 <li>a 5775 <ul> 5776 <li>b</li> 5777 </ul> 5778 </li> 5779 </ul> 5780 ```````````````````````````````` 5781 5782 5783 This list is loose, because of the blank line between the 5784 two block elements in the list item: 5785 5786 ```````````````````````````````` example 5787 1. ``` 5788 foo 5789 ``` 5790 5791 bar 5792 . 5793 <ol> 5794 <li> 5795 <pre><code>foo 5796 </code></pre> 5797 <p>bar</p> 5798 </li> 5799 </ol> 5800 ```````````````````````````````` 5801 5802 5803 Here the outer list is loose, the inner list tight: 5804 5805 ```````````````````````````````` example 5806 * foo 5807 * bar 5808 5809 baz 5810 . 5811 <ul> 5812 <li> 5813 <p>foo</p> 5814 <ul> 5815 <li>bar</li> 5816 </ul> 5817 <p>baz</p> 5818 </li> 5819 </ul> 5820 ```````````````````````````````` 5821 5822 5823 ```````````````````````````````` example 5824 - a 5825 - b 5826 - c 5827 5828 - d 5829 - e 5830 - f 5831 . 5832 <ul> 5833 <li> 5834 <p>a</p> 5835 <ul> 5836 <li>b</li> 5837 <li>c</li> 5838 </ul> 5839 </li> 5840 <li> 5841 <p>d</p> 5842 <ul> 5843 <li>e</li> 5844 <li>f</li> 5845 </ul> 5846 </li> 5847 </ul> 5848 ```````````````````````````````` 5849 5850 5851 # Inlines 5852 5853 Inlines are parsed sequentially from the beginning of the character 5854 stream to the end (left to right, in left-to-right languages). 5855 Thus, for example, in 5856 5857 ```````````````````````````````` example 5858 `hi`lo` 5859 . 
5860 <p><code>hi</code>lo`</p> 5861 ```````````````````````````````` 5862 5863 `hi` is parsed as code, leaving the backtick at the end as a literal 5864 backtick. 5865 5866 5867 5868 ## Code spans 5869 5870 A [backtick string](@) 5871 is a string of one or more backtick characters (`` ` ``) that is neither 5872 preceded nor followed by a backtick. 5873 5874 A [code span](@) begins with a backtick string and ends with 5875 a backtick string of equal length. The contents of the code span are 5876 the characters between these two backtick strings, normalized in the 5877 following ways: 5878 5879 - First, [line endings] are converted to [spaces]. 5880 - If the resulting string both begins *and* ends with a [space] 5881 character, but does not consist entirely of [space] 5882 characters, a single [space] character is removed from the 5883 front and back. This allows you to include code that begins 5884 or ends with backtick characters, which must be separated by 5885 whitespace from the opening or closing backtick strings. 5886 5887 This is a simple code span: 5888 5889 ```````````````````````````````` example 5890 `foo` 5891 . 5892 <p><code>foo</code></p> 5893 ```````````````````````````````` 5894 5895 5896 Here two backticks are used, because the code contains a backtick. 5897 This example also illustrates stripping of a single leading and 5898 trailing space: 5899 5900 ```````````````````````````````` example 5901 `` foo ` bar `` 5902 . 5903 <p><code>foo ` bar</code></p> 5904 ```````````````````````````````` 5905 5906 5907 This example shows the motivation for stripping leading and trailing 5908 spaces: 5909 5910 ```````````````````````````````` example 5911 ` `` ` 5912 . 5913 <p><code>``</code></p> 5914 ```````````````````````````````` 5915 5916 Note that only *one* space is stripped: 5917 5918 ```````````````````````````````` example 5919 ` `` ` 5920 . 
5921 <p><code> `` </code></p> 5922 ```````````````````````````````` 5923 5924 The stripping only happens if the space is on both 5925 sides of the string: 5926 5927 ```````````````````````````````` example 5928 ` a` 5929 . 5930 <p><code> a</code></p> 5931 ```````````````````````````````` 5932 5933 Only [spaces], and not [unicode whitespace] in general, are 5934 stripped in this way: 5935 5936 ```````````````````````````````` example 5937 ` b ` 5938 . 5939 <p><code> b </code></p> 5940 ```````````````````````````````` 5941 5942 No stripping occurs if the code span contains only spaces: 5943 5944 ```````````````````````````````` example 5945 ` ` 5946 ` ` 5947 . 5948 <p><code> </code> 5949 <code> </code></p> 5950 ```````````````````````````````` 5951 5952 5953 [Line endings] are treated like spaces: 5954 5955 ```````````````````````````````` example 5956 `` 5957 foo 5958 bar 5959 baz 5960 `` 5961 . 5962 <p><code>foo bar baz</code></p> 5963 ```````````````````````````````` 5964 5965 ```````````````````````````````` example 5966 `` 5967 foo 5968 `` 5969 . 5970 <p><code>foo </code></p> 5971 ```````````````````````````````` 5972 5973 5974 Interior spaces are not collapsed: 5975 5976 ```````````````````````````````` example 5977 `foo bar 5978 baz` 5979 . 5980 <p><code>foo bar baz</code></p> 5981 ```````````````````````````````` 5982 5983 Note that browsers will typically collapse consecutive spaces 5984 when rendering `<code>` elements, so it is recommended that 5985 the following CSS be used: 5986 5987 code{white-space: pre-wrap;} 5988 5989 5990 Note that backslash escapes do not work in code spans. All backslashes 5991 are treated literally: 5992 5993 ```````````````````````````````` example 5994 `foo\`bar` 5995 . 
5996 <p><code>foo\</code>bar`</p> 5997 ```````````````````````````````` 5998 5999 6000 Backslash escapes are never needed, because one can always choose a 6001 string of *n* backtick characters as delimiters, where the code does 6002 not contain any strings of exactly *n* backtick characters. 6003 6004 ```````````````````````````````` example 6005 ``foo`bar`` 6006 . 6007 <p><code>foo`bar</code></p> 6008 ```````````````````````````````` 6009 6010 ```````````````````````````````` example 6011 ` foo `` bar ` 6012 . 6013 <p><code>foo `` bar</code></p> 6014 ```````````````````````````````` 6015 6016 6017 Code span backticks have higher precedence than any other inline 6018 constructs except HTML tags and autolinks. Thus, for example, this is 6019 not parsed as emphasized text, since the second `*` is part of a code 6020 span: 6021 6022 ```````````````````````````````` example 6023 *foo`*` 6024 . 6025 <p>*foo<code>*</code></p> 6026 ```````````````````````````````` 6027 6028 6029 And this is not parsed as a link: 6030 6031 ```````````````````````````````` example 6032 [not a `link](/foo`) 6033 . 6034 <p>[not a <code>link](/foo</code>)</p> 6035 ```````````````````````````````` 6036 6037 6038 Code spans, HTML tags, and autolinks have the same precedence. 6039 Thus, this is code: 6040 6041 ```````````````````````````````` example 6042 `<a href="`">` 6043 . 6044 <p><code><a href="</code>">`</p> 6045 ```````````````````````````````` 6046 6047 6048 But this is an HTML tag: 6049 6050 ```````````````````````````````` example 6051 <a href="`">` 6052 . 6053 <p><a href="`">`</p> 6054 ```````````````````````````````` 6055 6056 6057 And this is code: 6058 6059 ```````````````````````````````` example 6060 `<http://foo.bar.`baz>` 6061 . 6062 <p><code><http://foo.bar.</code>baz>`</p> 6063 ```````````````````````````````` 6064 6065 6066 But this is an autolink: 6067 6068 ```````````````````````````````` example 6069 <http://foo.bar.`baz>` 6070 . 
6071 <p><a href="http://foo.bar.%60baz">http://foo.bar.`baz</a>`</p> 6072 ```````````````````````````````` 6073 6074 6075 When a backtick string is not closed by a matching backtick string, 6076 we just have literal backticks: 6077 6078 ```````````````````````````````` example 6079 ```foo`` 6080 . 6081 <p>```foo``</p> 6082 ```````````````````````````````` 6083 6084 6085 ```````````````````````````````` example 6086 `foo 6087 . 6088 <p>`foo</p> 6089 ```````````````````````````````` 6090 6091 The following case also illustrates the need for opening and 6092 closing backtick strings to be equal in length: 6093 6094 ```````````````````````````````` example 6095 `foo``bar`` 6096 . 6097 <p>`foo<code>bar</code></p> 6098 ```````````````````````````````` 6099 6100 6101 ## Emphasis and strong emphasis 6102 6103 John Gruber's original [Markdown syntax 6104 description](http://daringfireball.net/projects/markdown/syntax#em) says: 6105 6106 > Markdown treats asterisks (`*`) and underscores (`_`) as indicators of 6107 > emphasis. Text wrapped with one `*` or `_` will be wrapped with an HTML 6108 > `<em>` tag; double `*`'s or `_`'s will be wrapped with an HTML `<strong>` 6109 > tag. 6110 6111 This is enough for most users, but these rules leave much undecided, 6112 especially when it comes to nested emphasis. 
The original 6113 `Markdown.pl` test suite makes it clear that triple `***` and 6114 `___` delimiters can be used for strong emphasis, and most 6115 implementations have also allowed the following patterns: 6116 6117 ``` markdown 6118 ***strong emph*** 6119 ***strong** in emph* 6120 ***emph* in strong** 6121 **in strong *emph*** 6122 *in emph **strong*** 6123 ``` 6124 6125 The following patterns are less widely supported, but the intent 6126 is clear and they are useful (especially in contexts like bibliography 6127 entries): 6128 6129 ``` markdown 6130 *emph *with emph* in it* 6131 **strong **with strong** in it** 6132 ``` 6133 6134 Many implementations have also restricted intraword emphasis to 6135 the `*` forms, to avoid unwanted emphasis in words containing 6136 internal underscores. (It is best practice to put these in code 6137 spans, but users often do not.) 6138 6139 ``` markdown 6140 internal emphasis: foo*bar*baz 6141 no emphasis: foo_bar_baz 6142 ``` 6143 6144 The rules given below capture all of these patterns, while allowing 6145 for efficient parsing strategies that do not backtrack. 6146 6147 First, some definitions. A [delimiter run](@) is either 6148 a sequence of one or more `*` characters that is not preceded or 6149 followed by a non-backslash-escaped `*` character, or a sequence 6150 of one or more `_` characters that is not preceded or followed by 6151 a non-backslash-escaped `_` character. 6152 6153 A [left-flanking delimiter run](@) is 6154 a [delimiter run] that is (1) not followed by [Unicode whitespace], 6155 and either (2a) not followed by a [Unicode punctuation character], or 6156 (2b) followed by a [Unicode punctuation character] and 6157 preceded by [Unicode whitespace] or a [Unicode punctuation character]. 6158 For purposes of this definition, the beginning and the end of 6159 the line count as Unicode whitespace. 
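The left-flanking test just defined can be sketched in Python. This is only an illustrative sketch, not the reference implementation: the helper names `is_unicode_whitespace`, `is_unicode_punctuation`, and `is_left_flanking` are hypothetical, and the two character classes are approximated with `unicodedata` general categories (an assumption about how [Unicode whitespace] and [Unicode punctuation character] are defined elsewhere in the spec). The empty string stands for the beginning or end of the line.

```python
import string
import unicodedata


def is_unicode_whitespace(ch):
    # Assumption: category Zs plus tab, line feed, form feed, carriage return.
    # The empty string models the beginning or end of the line.
    return ch == "" or ch in "\t\n\f\r" or unicodedata.category(ch) == "Zs"


def is_unicode_punctuation(ch):
    # Assumption: ASCII punctuation or any Unicode "P*" general category.
    return ch != "" and (
        ch in string.punctuation or unicodedata.category(ch).startswith("P")
    )


def is_left_flanking(text, start, end):
    """Return True if the delimiter run text[start:end] is left-flanking."""
    before = text[start - 1] if start > 0 else ""
    after = text[end] if end < len(text) else ""
    if is_unicode_whitespace(after):       # condition (1) fails
        return False
    if not is_unicode_punctuation(after):  # condition (2a) holds
        return True
    # condition (2b): followed by punctuation, so the preceding
    # character must be whitespace or punctuation
    return is_unicode_whitespace(before) or is_unicode_punctuation(before)
```

For instance, the run in `***abc` passes this test, whereas the first `*` in `a*"foo"*` does not, since it is followed by punctuation and preceded by an alphanumeric.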
6160 6161 A [right-flanking delimiter run](@) is 6162 a [delimiter run] that is (1) not preceded by [Unicode whitespace], 6163 and either (2a) not preceded by a [Unicode punctuation character], or 6164 (2b) preceded by a [Unicode punctuation character] and 6165 followed by [Unicode whitespace] or a [Unicode punctuation character]. 6166 For purposes of this definition, the beginning and the end of 6167 the line count as Unicode whitespace. 6168 6169 Here are some examples of delimiter runs. 6170 6171 - left-flanking but not right-flanking: 6172 6173 ``` 6174 ***abc 6175 _abc 6176 **"abc" 6177 _"abc" 6178 ``` 6179 6180 - right-flanking but not left-flanking: 6181 6182 ``` 6183 abc*** 6184 abc_ 6185 "abc"** 6186 "abc"_ 6187 ``` 6188 6189 - Both left and right-flanking: 6190 6191 ``` 6192 abc***def 6193 "abc"_"def" 6194 ``` 6195 6196 - Neither left nor right-flanking: 6197 6198 ``` 6199 abc *** def 6200 a _ b 6201 ``` 6202 6203 (The idea of distinguishing left-flanking and right-flanking 6204 delimiter runs based on the character before and the character 6205 after comes from Roopesh Chander's 6206 [vfmd](http://www.vfmd.org/vfmd-spec/specification/#procedure-for-identifying-emphasis-tags). 6207 vfmd uses the terminology "emphasis indicator string" instead of "delimiter 6208 run," and its rules for distinguishing left- and right-flanking runs 6209 are a bit more complex than the ones given here.) 6210 6211 The following rules define emphasis and strong emphasis: 6212 6213 1. A single `*` character [can open emphasis](@) 6214 iff (if and only if) it is part of a [left-flanking delimiter run]. 6215 6216 2. A single `_` character [can open emphasis] iff 6217 it is part of a [left-flanking delimiter run] 6218 and either (a) not part of a [right-flanking delimiter run] 6219 or (b) part of a [right-flanking delimiter run] 6220 preceded by a [Unicode punctuation character]. 6221 6222 3. 
A single `*` character [can close emphasis](@) 6223 iff it is part of a [right-flanking delimiter run]. 6224 6225 4. A single `_` character [can close emphasis] iff 6226 it is part of a [right-flanking delimiter run] 6227 and either (a) not part of a [left-flanking delimiter run] 6228 or (b) part of a [left-flanking delimiter run] 6229 followed by a [Unicode punctuation character]. 6230 6231 5. A double `**` [can open strong emphasis](@) 6232 iff it is part of a [left-flanking delimiter run]. 6233 6234 6. A double `__` [can open strong emphasis] iff 6235 it is part of a [left-flanking delimiter run] 6236 and either (a) not part of a [right-flanking delimiter run] 6237 or (b) part of a [right-flanking delimiter run] 6238 preceded by a [Unicode punctuation character]. 6239 6240 7. A double `**` [can close strong emphasis](@) 6241 iff it is part of a [right-flanking delimiter run]. 6242 6243 8. A double `__` [can close strong emphasis] iff 6244 it is part of a [right-flanking delimiter run] 6245 and either (a) not part of a [left-flanking delimiter run] 6246 or (b) part of a [left-flanking delimiter run] 6247 followed by a [Unicode punctuation character]. 6248 6249 9. Emphasis begins with a delimiter that [can open emphasis] and ends 6250 with a delimiter that [can close emphasis], and that uses the same 6251 character (`_` or `*`) as the opening delimiter. The 6252 opening and closing delimiters must belong to separate 6253 [delimiter runs]. If one of the delimiters can both 6254 open and close emphasis, then the sum of the lengths of the 6255 delimiter runs containing the opening and closing delimiters 6256 must not be a multiple of 3 unless both lengths are 6257 multiples of 3. 6258 6259 10. Strong emphasis begins with a delimiter that 6260 [can open strong emphasis] and ends with a delimiter that 6261 [can close strong emphasis], and that uses the same character 6262 (`_` or `*`) as the opening delimiter. 
The 6263 opening and closing delimiters must belong to separate 6264 [delimiter runs]. If one of the delimiters can both open 6265 and close strong emphasis, then the sum of the lengths of 6266 the delimiter runs containing the opening and closing 6267 delimiters must not be a multiple of 3 unless both lengths 6268 are multiples of 3. 6269 6270 11. A literal `*` character cannot occur at the beginning or end of 6271 `*`-delimited emphasis or `**`-delimited strong emphasis, unless it 6272 is backslash-escaped. 6273 6274 12. A literal `_` character cannot occur at the beginning or end of 6275 `_`-delimited emphasis or `__`-delimited strong emphasis, unless it 6276 is backslash-escaped. 6277 6278 Where rules 1--12 above are compatible with multiple parsings, 6279 the following principles resolve ambiguity: 6280 6281 13. The number of nestings should be minimized. Thus, for example, 6282 an interpretation `<strong>...</strong>` is always preferred to 6283 `<em><em>...</em></em>`. 6284 6285 14. An interpretation `<em><strong>...</strong></em>` is always 6286 preferred to `<strong><em>...</em></strong>`. 6287 6288 15. When two potential emphasis or strong emphasis spans overlap, 6289 so that the second begins before the first ends and ends after 6290 the first ends, the first takes precedence. Thus, for example, 6291 `*foo _bar* baz_` is parsed as `<em>foo _bar</em> baz_` rather 6292 than `*foo <em>bar* baz</em>`. 6293 6294 16. When there are two potential emphasis or strong emphasis spans 6295 with the same closing delimiter, the shorter one (the one that 6296 opens later) takes precedence. Thus, for example, 6297 `**foo **bar baz**` is parsed as `**foo <strong>bar baz</strong>` 6298 rather than `<strong>foo **bar baz</strong>`. 6299 6300 17. Inline code spans, links, images, and HTML tags group more tightly 6301 than emphasis. 
So, when there is a choice between an interpretation 6302 that contains one of these elements and one that does not, the 6303 former always wins. Thus, for example, `*[foo*](bar)` is 6304 parsed as `*<a href="bar">foo*</a>` rather than as 6305 `<em>[foo</em>](bar)`. 6306 6307 These rules can be illustrated through a series of examples. 6308 6309 Rule 1: 6310 6311 ```````````````````````````````` example 6312 *foo bar* 6313 . 6314 <p><em>foo bar</em></p> 6315 ```````````````````````````````` 6316 6317 6318 This is not emphasis, because the opening `*` is followed by 6319 whitespace, and hence not part of a [left-flanking delimiter run]: 6320 6321 ```````````````````````````````` example 6322 a * foo bar* 6323 . 6324 <p>a * foo bar*</p> 6325 ```````````````````````````````` 6326 6327 6328 This is not emphasis, because the opening `*` is preceded 6329 by an alphanumeric and followed by punctuation, and hence 6330 not part of a [left-flanking delimiter run]: 6331 6332 ```````````````````````````````` example 6333 a*"foo"* 6334 . 6335 <p>a*"foo"*</p> 6336 ```````````````````````````````` 6337 6338 6339 Unicode nonbreaking spaces count as whitespace, too: 6340 6341 ```````````````````````````````` example 6342 * a * 6343 . 6344 <p>* a *</p> 6345 ```````````````````````````````` 6346 6347 6348 Intraword emphasis with `*` is permitted: 6349 6350 ```````````````````````````````` example 6351 foo*bar* 6352 . 6353 <p>foo<em>bar</em></p> 6354 ```````````````````````````````` 6355 6356 6357 ```````````````````````````````` example 6358 5*6*78 6359 . 6360 <p>5<em>6</em>78</p> 6361 ```````````````````````````````` 6362 6363 6364 Rule 2: 6365 6366 ```````````````````````````````` example 6367 _foo bar_ 6368 . 6369 <p><em>foo bar</em></p> 6370 ```````````````````````````````` 6371 6372 6373 This is not emphasis, because the opening `_` is followed by 6374 whitespace: 6375 6376 ```````````````````````````````` example 6377 _ foo bar_ 6378 . 
6379 <p>_ foo bar_</p> 6380 ```````````````````````````````` 6381 6382 6383 This is not emphasis, because the opening `_` is preceded 6384 by an alphanumeric and followed by punctuation: 6385 6386 ```````````````````````````````` example 6387 a_"foo"_ 6388 . 6389 <p>a_"foo"_</p> 6390 ```````````````````````````````` 6391 6392 6393 Emphasis with `_` is not allowed inside words: 6394 6395 ```````````````````````````````` example 6396 foo_bar_ 6397 . 6398 <p>foo_bar_</p> 6399 ```````````````````````````````` 6400 6401 6402 ```````````````````````````````` example 6403 5_6_78 6404 . 6405 <p>5_6_78</p> 6406 ```````````````````````````````` 6407 6408 6409 ```````````````````````````````` example 6410 пристаням_стремятся_ 6411 . 6412 <p>пристаням_стремятся_</p> 6413 ```````````````````````````````` 6414 6415 6416 Here `_` does not generate emphasis, because the first delimiter run 6417 is right-flanking and the second left-flanking: 6418 6419 ```````````````````````````````` example 6420 aa_"bb"_cc 6421 . 6422 <p>aa_"bb"_cc</p> 6423 ```````````````````````````````` 6424 6425 6426 This is emphasis, even though the opening delimiter is 6427 both left- and right-flanking, because it is preceded by 6428 punctuation: 6429 6430 ```````````````````````````````` example 6431 foo-_(bar)_ 6432 . 6433 <p>foo-<em>(bar)</em></p> 6434 ```````````````````````````````` 6435 6436 6437 Rule 3: 6438 6439 This is not emphasis, because the closing delimiter does 6440 not match the opening delimiter: 6441 6442 ```````````````````````````````` example 6443 _foo* 6444 . 6445 <p>_foo*</p> 6446 ```````````````````````````````` 6447 6448 6449 This is not emphasis, because the closing `*` is preceded by 6450 whitespace: 6451 6452 ```````````````````````````````` example 6453 *foo bar * 6454 . 6455 <p>*foo bar *</p> 6456 ```````````````````````````````` 6457 6458 6459 A line ending also counts as whitespace: 6460 6461 ```````````````````````````````` example 6462 *foo bar 6463 * 6464 . 
6465 <p>*foo bar 6466 *</p> 6467 ```````````````````````````````` 6468 6469 6470 This is not emphasis, because the second `*` is 6471 preceded by punctuation and followed by an alphanumeric 6472 (hence it is not part of a [right-flanking delimiter run]): 6473 6474 ```````````````````````````````` example 6475 *(*foo) 6476 . 6477 <p>*(*foo)</p> 6478 ```````````````````````````````` 6479 6480 6481 The point of this restriction is more easily appreciated 6482 with this example: 6483 6484 ```````````````````````````````` example 6485 *(*foo*)* 6486 . 6487 <p><em>(<em>foo</em>)</em></p> 6488 ```````````````````````````````` 6489 6490 6491 Intraword emphasis with `*` is allowed: 6492 6493 ```````````````````````````````` example 6494 *foo*bar 6495 . 6496 <p><em>foo</em>bar</p> 6497 ```````````````````````````````` 6498 6499 6500 6501 Rule 4: 6502 6503 This is not emphasis, because the closing `_` is preceded by 6504 whitespace: 6505 6506 ```````````````````````````````` example 6507 _foo bar _ 6508 . 6509 <p>_foo bar _</p> 6510 ```````````````````````````````` 6511 6512 6513 This is not emphasis, because the second `_` is 6514 preceded by punctuation and followed by an alphanumeric: 6515 6516 ```````````````````````````````` example 6517 _(_foo) 6518 . 6519 <p>_(_foo)</p> 6520 ```````````````````````````````` 6521 6522 6523 This is emphasis within emphasis: 6524 6525 ```````````````````````````````` example 6526 _(_foo_)_ 6527 . 6528 <p><em>(<em>foo</em>)</em></p> 6529 ```````````````````````````````` 6530 6531 6532 Intraword emphasis is disallowed for `_`: 6533 6534 ```````````````````````````````` example 6535 _foo_bar 6536 . 6537 <p>_foo_bar</p> 6538 ```````````````````````````````` 6539 6540 6541 ```````````````````````````````` example 6542 _пристаням_стремятся 6543 . 6544 <p>_пристаням_стремятся</p> 6545 ```````````````````````````````` 6546 6547 6548 ```````````````````````````````` example 6549 _foo_bar_baz_ 6550 . 
6551 <p><em>foo_bar_baz</em></p> 6552 ```````````````````````````````` 6553 6554 6555 This is emphasis, even though the closing delimiter is 6556 both left- and right-flanking, because it is followed by 6557 punctuation: 6558 6559 ```````````````````````````````` example 6560 _(bar)_. 6561 . 6562 <p><em>(bar)</em>.</p> 6563 ```````````````````````````````` 6564 6565 6566 Rule 5: 6567 6568 ```````````````````````````````` example 6569 **foo bar** 6570 . 6571 <p><strong>foo bar</strong></p> 6572 ```````````````````````````````` 6573 6574 6575 This is not strong emphasis, because the opening delimiter is 6576 followed by whitespace: 6577 6578 ```````````````````````````````` example 6579 ** foo bar** 6580 . 6581 <p>** foo bar**</p> 6582 ```````````````````````````````` 6583 6584 6585 This is not strong emphasis, because the opening `**` is preceded 6586 by an alphanumeric and followed by punctuation, and hence 6587 not part of a [left-flanking delimiter run]: 6588 6589 ```````````````````````````````` example 6590 a**"foo"** 6591 . 6592 <p>a**"foo"**</p> 6593 ```````````````````````````````` 6594 6595 6596 Intraword strong emphasis with `**` is permitted: 6597 6598 ```````````````````````````````` example 6599 foo**bar** 6600 . 6601 <p>foo<strong>bar</strong></p> 6602 ```````````````````````````````` 6603 6604 6605 Rule 6: 6606 6607 ```````````````````````````````` example 6608 __foo bar__ 6609 . 6610 <p><strong>foo bar</strong></p> 6611 ```````````````````````````````` 6612 6613 6614 This is not strong emphasis, because the opening delimiter is 6615 followed by whitespace: 6616 6617 ```````````````````````````````` example 6618 __ foo bar__ 6619 . 6620 <p>__ foo bar__</p> 6621 ```````````````````````````````` 6622 6623 6624 A line ending counts as whitespace: 6625 ```````````````````````````````` example 6626 __ 6627 foo bar__ 6628 . 
6629 <p>__ 6630 foo bar__</p> 6631 ```````````````````````````````` 6632 6633 6634 This is not strong emphasis, because the opening `__` is preceded 6635 by an alphanumeric and followed by punctuation: 6636 6637 ```````````````````````````````` example 6638 a__"foo"__ 6639 . 6640 <p>a__"foo"__</p> 6641 ```````````````````````````````` 6642 6643 6644 Intraword strong emphasis is forbidden with `__`: 6645 6646 ```````````````````````````````` example 6647 foo__bar__ 6648 . 6649 <p>foo__bar__</p> 6650 ```````````````````````````````` 6651 6652 6653 ```````````````````````````````` example 6654 5__6__78 6655 . 6656 <p>5__6__78</p> 6657 ```````````````````````````````` 6658 6659 6660 ```````````````````````````````` example 6661 пристаням__стремятся__ 6662 . 6663 <p>пристаням__стремятся__</p> 6664 ```````````````````````````````` 6665 6666 6667 ```````````````````````````````` example 6668 __foo, __bar__, baz__ 6669 . 6670 <p><strong>foo, <strong>bar</strong>, baz</strong></p> 6671 ```````````````````````````````` 6672 6673 6674 This is strong emphasis, even though the opening delimiter is 6675 both left- and right-flanking, because it is preceded by 6676 punctuation: 6677 6678 ```````````````````````````````` example 6679 foo-__(bar)__ 6680 . 6681 <p>foo-<strong>(bar)</strong></p> 6682 ```````````````````````````````` 6683 6684 6685 6686 Rule 7: 6687 6688 This is not strong emphasis, because the closing delimiter is preceded 6689 by whitespace: 6690 6691 ```````````````````````````````` example 6692 **foo bar ** 6693 . 6694 <p>**foo bar **</p> 6695 ```````````````````````````````` 6696 6697 6698 (Nor can it be interpreted as an emphasized `*foo bar *`, because of 6699 Rule 11.) 6700 6701 This is not strong emphasis, because the second `**` is 6702 preceded by punctuation and followed by an alphanumeric: 6703 6704 ```````````````````````````````` example 6705 **(**foo) 6706 . 
6707 <p>**(**foo)</p> 6708 ```````````````````````````````` 6709 6710 6711 The point of this restriction is more easily appreciated 6712 with these examples: 6713 6714 ```````````````````````````````` example 6715 *(**foo**)* 6716 . 6717 <p><em>(<strong>foo</strong>)</em></p> 6718 ```````````````````````````````` 6719 6720 6721 ```````````````````````````````` example 6722 **Gomphocarpus (*Gomphocarpus physocarpus*, syn. 6723 *Asclepias physocarpa*)** 6724 . 6725 <p><strong>Gomphocarpus (<em>Gomphocarpus physocarpus</em>, syn. 6726 <em>Asclepias physocarpa</em>)</strong></p> 6727 ```````````````````````````````` 6728 6729 6730 ```````````````````````````````` example 6731 **foo "*bar*" foo** 6732 . 6733 <p><strong>foo "<em>bar</em>" foo</strong></p> 6734 ```````````````````````````````` 6735 6736 6737 Intraword emphasis: 6738 6739 ```````````````````````````````` example 6740 **foo**bar 6741 . 6742 <p><strong>foo</strong>bar</p> 6743 ```````````````````````````````` 6744 6745 6746 Rule 8: 6747 6748 This is not strong emphasis, because the closing delimiter is 6749 preceded by whitespace: 6750 6751 ```````````````````````````````` example 6752 __foo bar __ 6753 . 6754 <p>__foo bar __</p> 6755 ```````````````````````````````` 6756 6757 6758 This is not strong emphasis, because the second `__` is 6759 preceded by punctuation and followed by an alphanumeric: 6760 6761 ```````````````````````````````` example 6762 __(__foo) 6763 . 6764 <p>__(__foo)</p> 6765 ```````````````````````````````` 6766 6767 6768 The point of this restriction is more easily appreciated 6769 with this example: 6770 6771 ```````````````````````````````` example 6772 _(__foo__)_ 6773 . 6774 <p><em>(<strong>foo</strong>)</em></p> 6775 ```````````````````````````````` 6776 6777 6778 Intraword strong emphasis is forbidden with `__`: 6779 6780 ```````````````````````````````` example 6781 __foo__bar 6782 . 
6783 <p>__foo__bar</p> 6784 ```````````````````````````````` 6785 6786 6787 ```````````````````````````````` example 6788 __пристаням__стремятся 6789 . 6790 <p>__пристаням__стремятся</p> 6791 ```````````````````````````````` 6792 6793 6794 ```````````````````````````````` example 6795 __foo__bar__baz__ 6796 . 6797 <p><strong>foo__bar__baz</strong></p> 6798 ```````````````````````````````` 6799 6800 6801 This is strong emphasis, even though the closing delimiter is 6802 both left- and right-flanking, because it is followed by 6803 punctuation: 6804 6805 ```````````````````````````````` example 6806 __(bar)__. 6807 . 6808 <p><strong>(bar)</strong>.</p> 6809 ```````````````````````````````` 6810 6811 6812 Rule 9: 6813 6814 Any nonempty sequence of inline elements can be the contents of an 6815 emphasized span. 6816 6817 ```````````````````````````````` example 6818 *foo [bar](/url)* 6819 . 6820 <p><em>foo <a href="/url">bar</a></em></p> 6821 ```````````````````````````````` 6822 6823 6824 ```````````````````````````````` example 6825 *foo 6826 bar* 6827 . 6828 <p><em>foo 6829 bar</em></p> 6830 ```````````````````````````````` 6831 6832 6833 In particular, emphasis and strong emphasis can be nested 6834 inside emphasis: 6835 6836 ```````````````````````````````` example 6837 _foo __bar__ baz_ 6838 . 6839 <p><em>foo <strong>bar</strong> baz</em></p> 6840 ```````````````````````````````` 6841 6842 6843 ```````````````````````````````` example 6844 _foo _bar_ baz_ 6845 . 6846 <p><em>foo <em>bar</em> baz</em></p> 6847 ```````````````````````````````` 6848 6849 6850 ```````````````````````````````` example 6851 __foo_ bar_ 6852 . 6853 <p><em><em>foo</em> bar</em></p> 6854 ```````````````````````````````` 6855 6856 6857 ```````````````````````````````` example 6858 *foo *bar** 6859 . 6860 <p><em>foo <em>bar</em></em></p> 6861 ```````````````````````````````` 6862 6863 6864 ```````````````````````````````` example 6865 *foo **bar** baz* 6866 . 
6867 <p><em>foo <strong>bar</strong> baz</em></p> 6868 ```````````````````````````````` 6869 6870 ```````````````````````````````` example 6871 *foo**bar**baz* 6872 . 6873 <p><em>foo<strong>bar</strong>baz</em></p> 6874 ```````````````````````````````` 6875 6876 Note that in the preceding case, the interpretation 6877 6878 ``` markdown 6879 <p><em>foo</em><em>bar<em></em>baz</em></p> 6880 ``` 6881 6882 6883 is precluded by the condition that a delimiter that 6884 can both open and close (like the `*` after `foo`) 6885 cannot form emphasis if the sum of the lengths of 6886 the delimiter runs containing the opening and 6887 closing delimiters is a multiple of 3 unless 6888 both lengths are multiples of 3. 6889 6890 6891 For the same reason, we don't get two consecutive 6892 emphasis sections in this example: 6893 6894 ```````````````````````````````` example 6895 *foo**bar* 6896 . 6897 <p><em>foo**bar</em></p> 6898 ```````````````````````````````` 6899 6900 6901 The same condition ensures that the following 6902 cases are all strong emphasis nested inside 6903 emphasis, even when the interior whitespace is 6904 omitted: 6905 6906 6907 ```````````````````````````````` example 6908 ***foo** bar* 6909 . 6910 <p><em><strong>foo</strong> bar</em></p> 6911 ```````````````````````````````` 6912 6913 6914 ```````````````````````````````` example 6915 *foo **bar*** 6916 . 6917 <p><em>foo <strong>bar</strong></em></p> 6918 ```````````````````````````````` 6919 6920 6921 ```````````````````````````````` example 6922 *foo**bar*** 6923 . 6924 <p><em>foo<strong>bar</strong></em></p> 6925 ```````````````````````````````` 6926 6927 6928 When the lengths of the interior closing and opening 6929 delimiter runs are *both* multiples of 3, though, 6930 they can match to create emphasis: 6931 6932 ```````````````````````````````` example 6933 foo***bar***baz 6934 . 
6935 <p>foo<em><strong>bar</strong></em>baz</p> 6936 ```````````````````````````````` 6937 6938 ```````````````````````````````` example 6939 foo******bar*********baz 6940 . 6941 <p>foo<strong><strong><strong>bar</strong></strong></strong>***baz</p> 6942 ```````````````````````````````` 6943 6944 6945 Indefinite levels of nesting are possible: 6946 6947 ```````````````````````````````` example 6948 *foo **bar *baz* bim** bop* 6949 . 6950 <p><em>foo <strong>bar <em>baz</em> bim</strong> bop</em></p> 6951 ```````````````````````````````` 6952 6953 6954 ```````````````````````````````` example 6955 *foo [*bar*](/url)* 6956 . 6957 <p><em>foo <a href="/url"><em>bar</em></a></em></p> 6958 ```````````````````````````````` 6959 6960 6961 There can be no empty emphasis or strong emphasis: 6962 6963 ```````````````````````````````` example 6964 ** is not an empty emphasis 6965 . 6966 <p>** is not an empty emphasis</p> 6967 ```````````````````````````````` 6968 6969 6970 ```````````````````````````````` example 6971 **** is not an empty strong emphasis 6972 . 6973 <p>**** is not an empty strong emphasis</p> 6974 ```````````````````````````````` 6975 6976 6977 6978 Rule 10: 6979 6980 Any nonempty sequence of inline elements can be the contents of a 6981 strongly emphasized span. 6982 6983 ```````````````````````````````` example 6984 **foo [bar](/url)** 6985 . 6986 <p><strong>foo <a href="/url">bar</a></strong></p> 6987 ```````````````````````````````` 6988 6989 6990 ```````````````````````````````` example 6991 **foo 6992 bar** 6993 . 6994 <p><strong>foo 6995 bar</strong></p> 6996 ```````````````````````````````` 6997 6998 6999 In particular, emphasis and strong emphasis can be nested 7000 inside strong emphasis: 7001 7002 ```````````````````````````````` example 7003 __foo _bar_ baz__ 7004 . 7005 <p><strong>foo <em>bar</em> baz</strong></p> 7006 ```````````````````````````````` 7007 7008 7009 ```````````````````````````````` example 7010 __foo __bar__ baz__ 7011 . 
7012 <p><strong>foo <strong>bar</strong> baz</strong></p> 7013 ```````````````````````````````` 7014 7015 7016 ```````````````````````````````` example 7017 ____foo__ bar__ 7018 . 7019 <p><strong><strong>foo</strong> bar</strong></p> 7020 ```````````````````````````````` 7021 7022 7023 ```````````````````````````````` example 7024 **foo **bar**** 7025 . 7026 <p><strong>foo <strong>bar</strong></strong></p> 7027 ```````````````````````````````` 7028 7029 7030 ```````````````````````````````` example 7031 **foo *bar* baz** 7032 . 7033 <p><strong>foo <em>bar</em> baz</strong></p> 7034 ```````````````````````````````` 7035 7036 7037 ```````````````````````````````` example 7038 **foo*bar*baz** 7039 . 7040 <p><strong>foo<em>bar</em>baz</strong></p> 7041 ```````````````````````````````` 7042 7043 7044 ```````````````````````````````` example 7045 ***foo* bar** 7046 . 7047 <p><strong><em>foo</em> bar</strong></p> 7048 ```````````````````````````````` 7049 7050 7051 ```````````````````````````````` example 7052 **foo *bar*** 7053 . 7054 <p><strong>foo <em>bar</em></strong></p> 7055 ```````````````````````````````` 7056 7057 7058 Indefinite levels of nesting are possible: 7059 7060 ```````````````````````````````` example 7061 **foo *bar **baz** 7062 bim* bop** 7063 . 7064 <p><strong>foo <em>bar <strong>baz</strong> 7065 bim</em> bop</strong></p> 7066 ```````````````````````````````` 7067 7068 7069 ```````````````````````````````` example 7070 **foo [*bar*](/url)** 7071 . 7072 <p><strong>foo <a href="/url"><em>bar</em></a></strong></p> 7073 ```````````````````````````````` 7074 7075 7076 There can be no empty emphasis or strong emphasis: 7077 7078 ```````````````````````````````` example 7079 __ is not an empty emphasis 7080 . 7081 <p>__ is not an empty emphasis</p> 7082 ```````````````````````````````` 7083 7084 7085 ```````````````````````````````` example 7086 ____ is not an empty strong emphasis 7087 . 
7088 <p>____ is not an empty strong emphasis</p> 7089 ```````````````````````````````` 7090 7091 7092 7093 Rule 11: 7094 7095 ```````````````````````````````` example 7096 foo *** 7097 . 7098 <p>foo ***</p> 7099 ```````````````````````````````` 7100 7101 7102 ```````````````````````````````` example 7103 foo *\** 7104 . 7105 <p>foo <em>*</em></p> 7106 ```````````````````````````````` 7107 7108 7109 ```````````````````````````````` example 7110 foo *_* 7111 . 7112 <p>foo <em>_</em></p> 7113 ```````````````````````````````` 7114 7115 7116 ```````````````````````````````` example 7117 foo ***** 7118 . 7119 <p>foo *****</p> 7120 ```````````````````````````````` 7121 7122 7123 ```````````````````````````````` example 7124 foo **\*** 7125 . 7126 <p>foo <strong>*</strong></p> 7127 ```````````````````````````````` 7128 7129 7130 ```````````````````````````````` example 7131 foo **_** 7132 . 7133 <p>foo <strong>_</strong></p> 7134 ```````````````````````````````` 7135 7136 7137 Note that when delimiters do not match evenly, Rule 11 determines 7138 that the excess literal `*` characters will appear outside of the 7139 emphasis, rather than inside it: 7140 7141 ```````````````````````````````` example 7142 **foo* 7143 . 7144 <p>*<em>foo</em></p> 7145 ```````````````````````````````` 7146 7147 7148 ```````````````````````````````` example 7149 *foo** 7150 . 7151 <p><em>foo</em>*</p> 7152 ```````````````````````````````` 7153 7154 7155 ```````````````````````````````` example 7156 ***foo** 7157 . 7158 <p>*<strong>foo</strong></p> 7159 ```````````````````````````````` 7160 7161 7162 ```````````````````````````````` example 7163 ****foo* 7164 . 7165 <p>***<em>foo</em></p> 7166 ```````````````````````````````` 7167 7168 7169 ```````````````````````````````` example 7170 **foo*** 7171 . 7172 <p><strong>foo</strong>*</p> 7173 ```````````````````````````````` 7174 7175 7176 ```````````````````````````````` example 7177 *foo**** 7178 . 
7179 <p><em>foo</em>***</p> 7180 ```````````````````````````````` 7181 7182 7183 7184 Rule 12: 7185 7186 ```````````````````````````````` example 7187 foo ___ 7188 . 7189 <p>foo ___</p> 7190 ```````````````````````````````` 7191 7192 7193 ```````````````````````````````` example 7194 foo _\__ 7195 . 7196 <p>foo <em>_</em></p> 7197 ```````````````````````````````` 7198 7199 7200 ```````````````````````````````` example 7201 foo _*_ 7202 . 7203 <p>foo <em>*</em></p> 7204 ```````````````````````````````` 7205 7206 7207 ```````````````````````````````` example 7208 foo _____ 7209 . 7210 <p>foo _____</p> 7211 ```````````````````````````````` 7212 7213 7214 ```````````````````````````````` example 7215 foo __\___ 7216 . 7217 <p>foo <strong>_</strong></p> 7218 ```````````````````````````````` 7219 7220 7221 ```````````````````````````````` example 7222 foo __*__ 7223 . 7224 <p>foo <strong>*</strong></p> 7225 ```````````````````````````````` 7226 7227 7228 ```````````````````````````````` example 7229 __foo_ 7230 . 7231 <p>_<em>foo</em></p> 7232 ```````````````````````````````` 7233 7234 7235 Note that when delimiters do not match evenly, Rule 12 determines 7236 that the excess literal `_` characters will appear outside of the 7237 emphasis, rather than inside it: 7238 7239 ```````````````````````````````` example 7240 _foo__ 7241 . 7242 <p><em>foo</em>_</p> 7243 ```````````````````````````````` 7244 7245 7246 ```````````````````````````````` example 7247 ___foo__ 7248 . 7249 <p>_<strong>foo</strong></p> 7250 ```````````````````````````````` 7251 7252 7253 ```````````````````````````````` example 7254 ____foo_ 7255 . 7256 <p>___<em>foo</em></p> 7257 ```````````````````````````````` 7258 7259 7260 ```````````````````````````````` example 7261 __foo___ 7262 . 7263 <p><strong>foo</strong>_</p> 7264 ```````````````````````````````` 7265 7266 7267 ```````````````````````````````` example 7268 _foo____ 7269 . 
7270 <p><em>foo</em>___</p> 7271 ```````````````````````````````` 7272 7273 7274 Rule 13 implies that if you want emphasis nested directly inside 7275 emphasis, you must use different delimiters: 7276 7277 ```````````````````````````````` example 7278 **foo** 7279 . 7280 <p><strong>foo</strong></p> 7281 ```````````````````````````````` 7282 7283 7284 ```````````````````````````````` example 7285 *_foo_* 7286 . 7287 <p><em><em>foo</em></em></p> 7288 ```````````````````````````````` 7289 7290 7291 ```````````````````````````````` example 7292 __foo__ 7293 . 7294 <p><strong>foo</strong></p> 7295 ```````````````````````````````` 7296 7297 7298 ```````````````````````````````` example 7299 _*foo*_ 7300 . 7301 <p><em><em>foo</em></em></p> 7302 ```````````````````````````````` 7303 7304 7305 However, strong emphasis within strong emphasis is possible without 7306 switching delimiters: 7307 7308 ```````````````````````````````` example 7309 ****foo**** 7310 . 7311 <p><strong><strong>foo</strong></strong></p> 7312 ```````````````````````````````` 7313 7314 7315 ```````````````````````````````` example 7316 ____foo____ 7317 . 7318 <p><strong><strong>foo</strong></strong></p> 7319 ```````````````````````````````` 7320 7321 7322 7323 Rule 13 can be applied to arbitrarily long sequences of 7324 delimiters: 7325 7326 ```````````````````````````````` example 7327 ******foo****** 7328 . 7329 <p><strong><strong><strong>foo</strong></strong></strong></p> 7330 ```````````````````````````````` 7331 7332 7333 Rule 14: 7334 7335 ```````````````````````````````` example 7336 ***foo*** 7337 . 7338 <p><em><strong>foo</strong></em></p> 7339 ```````````````````````````````` 7340 7341 7342 ```````````````````````````````` example 7343 _____foo_____ 7344 . 7345 <p><em><strong><strong>foo</strong></strong></em></p> 7346 ```````````````````````````````` 7347 7348 7349 Rule 15: 7350 7351 ```````````````````````````````` example 7352 *foo _bar* baz_ 7353 . 
7354 <p><em>foo _bar</em> baz_</p> 7355 ```````````````````````````````` 7356 7357 7358 ```````````````````````````````` example 7359 *foo __bar *baz bim__ bam* 7360 . 7361 <p><em>foo <strong>bar *baz bim</strong> bam</em></p> 7362 ```````````````````````````````` 7363 7364 7365 Rule 16: 7366 7367 ```````````````````````````````` example 7368 **foo **bar baz** 7369 . 7370 <p>**foo <strong>bar baz</strong></p> 7371 ```````````````````````````````` 7372 7373 7374 ```````````````````````````````` example 7375 *foo *bar baz* 7376 . 7377 <p>*foo <em>bar baz</em></p> 7378 ```````````````````````````````` 7379 7380 7381 Rule 17: 7382 7383 ```````````````````````````````` example 7384 *[bar*](/url) 7385 . 7386 <p>*<a href="/url">bar*</a></p> 7387 ```````````````````````````````` 7388 7389 7390 ```````````````````````````````` example 7391 _foo [bar_](/url) 7392 . 7393 <p>_foo <a href="/url">bar_</a></p> 7394 ```````````````````````````````` 7395 7396 7397 ```````````````````````````````` example 7398 *<img src="foo" title="*"/> 7399 . 7400 <p>*<img src="foo" title="*"/></p> 7401 ```````````````````````````````` 7402 7403 7404 ```````````````````````````````` example 7405 **<a href="**"> 7406 . 7407 <p>**<a href="**"></p> 7408 ```````````````````````````````` 7409 7410 7411 ```````````````````````````````` example 7412 __<a href="__"> 7413 . 7414 <p>__<a href="__"></p> 7415 ```````````````````````````````` 7416 7417 7418 ```````````````````````````````` example 7419 *a `*`* 7420 . 7421 <p><em>a <code>*</code></em></p> 7422 ```````````````````````````````` 7423 7424 7425 ```````````````````````````````` example 7426 _a `_`_ 7427 . 7428 <p><em>a <code>_</code></em></p> 7429 ```````````````````````````````` 7430 7431 7432 ```````````````````````````````` example 7433 **a<http://foo.bar/?q=**> 7434 . 
7435 <p>**a<a href="http://foo.bar/?q=**">http://foo.bar/?q=**</a></p> 7436 ```````````````````````````````` 7437 7438 7439 ```````````````````````````````` example 7440 __a<http://foo.bar/?q=__> 7441 . 7442 <p>__a<a href="http://foo.bar/?q=__">http://foo.bar/?q=__</a></p> 7443 ```````````````````````````````` 7444 7445 7446 7447 ## Links 7448 7449 A link contains [link text] (the visible text), a [link destination] 7450 (the URI that is the link destination), and optionally a [link title]. 7451 There are two basic kinds of links in Markdown. In [inline links] the 7452 destination and title are given immediately after the link text. In 7453 [reference links] the destination and title are defined elsewhere in 7454 the document. 7455 7456 A [link text](@) consists of a sequence of zero or more 7457 inline elements enclosed by square brackets (`[` and `]`). The 7458 following rules apply: 7459 7460 - Links may not contain other links, at any level of nesting. If 7461 multiple otherwise valid link definitions appear nested inside each 7462 other, the inner-most definition is used. 7463 7464 - Brackets are allowed in the [link text] only if (a) they 7465 are backslash-escaped or (b) they appear as a matched pair of brackets, 7466 with an open bracket `[`, a sequence of zero or more inlines, and 7467 a close bracket `]`. 7468 7469 - Backtick [code spans], [autolinks], and raw [HTML tags] bind more tightly 7470 than the brackets in link text. Thus, for example, 7471 `` [foo`]` `` could not be a link text, since the second `]` 7472 is part of a code span. 7473 7474 - The brackets in link text bind more tightly than markers for 7475 [emphasis and strong emphasis]. Thus, for example, `*[foo*](url)` is a link. 
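The bracket rule for link text stated above can be illustrated with a small check. This is only a sketch under simplifying assumptions: the function name `brackets_balanced` is hypothetical, and the check deliberately ignores code spans, autolinks, and raw HTML tags, which bind more tightly than brackets and would have to be recognized first in a real parser.

```python
def brackets_balanced(text: str) -> bool:
    """Return True if every unescaped '[' or ']' in `text` belongs to a matched pair.

    Hypothetical helper for illustration only. It does not account for code
    spans, autolinks, or raw HTML, which take precedence over brackets.
    """
    depth = 0
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            # Skip the backslash and the character it precedes, so that an
            # escaped bracket is never counted.
            i += 2
            continue
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
            if depth < 0:
                # A close bracket with no matching open bracket.
                return False
        i += 1
    # Any open bracket left unclosed makes the text unbalanced.
    return depth == 0
```

Under these assumptions, `link [foo [bar]]` and `link \[bar` pass the check, while `link] bar` and `link [bar` do not.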
7476 7477 A [link destination](@) consists of either 7478 7479 - a sequence of zero or more characters between an opening `<` and a 7480 closing `>` that contains no line endings or unescaped 7481 `<` or `>` characters, or 7482 7483 - a nonempty sequence of characters that does not start with `<`, 7484 does not include [ASCII control characters][ASCII control character] 7485 or [space] character, and includes parentheses only if (a) they are 7486 backslash-escaped or (b) they are part of a balanced pair of 7487 unescaped parentheses. 7488 (Implementations may impose limits on parentheses nesting to 7489 avoid performance issues, but at least three levels of nesting 7490 should be supported.) 7491 7492 A [link title](@) consists of either 7493 7494 - a sequence of zero or more characters between straight double-quote 7495 characters (`"`), including a `"` character only if it is 7496 backslash-escaped, or 7497 7498 - a sequence of zero or more characters between straight single-quote 7499 characters (`'`), including a `'` character only if it is 7500 backslash-escaped, or 7501 7502 - a sequence of zero or more characters between matching parentheses 7503 (`(...)`), including a `(` or `)` character only if it is 7504 backslash-escaped. 7505 7506 Although [link titles] may span multiple lines, they may not contain 7507 a [blank line]. 7508 7509 An [inline link](@) consists of a [link text] followed immediately 7510 by a left parenthesis `(`, an optional [link destination], an optional 7511 [link title], and a right parenthesis `)`. 7512 These four components may be separated by spaces, tabs, and up to one line 7513 ending. 7514 If both [link destination] and [link title] are present, they *must* be 7515 separated by spaces, tabs, and up to one line ending. 7516 7517 The link's text consists of the inlines contained 7518 in the [link text] (excluding the enclosing square brackets). 
7519 The link's URI consists of the link destination, excluding enclosing 7520 `<...>` if present, with backslash-escapes in effect as described 7521 above. The link's title consists of the link title, excluding its 7522 enclosing delimiters, with backslash-escapes in effect as described 7523 above. 7524 7525 Here is a simple inline link: 7526 7527 ```````````````````````````````` example 7528 [link](/uri "title") 7529 . 7530 <p><a href="/uri" title="title">link</a></p> 7531 ```````````````````````````````` 7532 7533 7534 The title, the link text and even 7535 the destination may be omitted: 7536 7537 ```````````````````````````````` example 7538 [link](/uri) 7539 . 7540 <p><a href="/uri">link</a></p> 7541 ```````````````````````````````` 7542 7543 ```````````````````````````````` example 7544 [](./target.md) 7545 . 7546 <p><a href="./target.md"></a></p> 7547 ```````````````````````````````` 7548 7549 7550 ```````````````````````````````` example 7551 [link]() 7552 . 7553 <p><a href="">link</a></p> 7554 ```````````````````````````````` 7555 7556 7557 ```````````````````````````````` example 7558 [link](<>) 7559 . 7560 <p><a href="">link</a></p> 7561 ```````````````````````````````` 7562 7563 7564 ```````````````````````````````` example 7565 []() 7566 . 7567 <p><a href=""></a></p> 7568 ```````````````````````````````` 7569 7570 The destination can only contain spaces if it is 7571 enclosed in pointy brackets: 7572 7573 ```````````````````````````````` example 7574 [link](/my uri) 7575 . 7576 <p>[link](/my uri)</p> 7577 ```````````````````````````````` 7578 7579 ```````````````````````````````` example 7580 [link](</my uri>) 7581 . 7582 <p><a href="/my%20uri">link</a></p> 7583 ```````````````````````````````` 7584 7585 The destination cannot contain line endings, 7586 even if enclosed in pointy brackets: 7587 7588 ```````````````````````````````` example 7589 [link](foo 7590 bar) 7591 . 
7592 <p>[link](foo 7593 bar)</p> 7594 ```````````````````````````````` 7595 7596 ```````````````````````````````` example 7597 [link](<foo 7598 bar>) 7599 . 7600 <p>[link](<foo 7601 bar>)</p> 7602 ```````````````````````````````` 7603 7604 The destination can contain `)` if it is enclosed 7605 in pointy brackets: 7606 7607 ```````````````````````````````` example 7608 [a](<b)c>) 7609 . 7610 <p><a href="b)c">a</a></p> 7611 ```````````````````````````````` 7612 7613 Pointy brackets that enclose links must be unescaped: 7614 7615 ```````````````````````````````` example 7616 [link](<foo\>) 7617 . 7618 <p>[link](&lt;foo&gt;)</p> 7619 ```````````````````````````````` 7620 7621 These are not links, because the opening pointy bracket 7622 is not matched properly: 7623 7624 ```````````````````````````````` example 7625 [a](<b)c 7626 [a](<b)c> 7627 [a](<b>c) 7628 . 7629 <p>[a](&lt;b)c 7630 [a](&lt;b)c&gt; 7631 [a](<b>c)</p> 7632 ```````````````````````````````` 7633 7634 Parentheses inside the link destination may be escaped: 7635 7636 ```````````````````````````````` example 7637 [link](\(foo\)) 7638 . 7639 <p><a href="(foo)">link</a></p> 7640 ```````````````````````````````` 7641 7642 Any number of parentheses are allowed without escaping, as long as they are 7643 balanced: 7644 7645 ```````````````````````````````` example 7646 [link](foo(and(bar))) 7647 . 7648 <p><a href="foo(and(bar))">link</a></p> 7649 ```````````````````````````````` 7650 7651 However, if you have unbalanced parentheses, you need to escape or use the 7652 `<...>` form: 7653 7654 ```````````````````````````````` example 7655 [link](foo(and(bar)) 7656 . 7657 <p>[link](foo(and(bar))</p> 7658 ```````````````````````````````` 7659 7660 7661 ```````````````````````````````` example 7662 [link](foo\(and\(bar\)) 7663 . 7664 <p><a href="foo(and(bar)">link</a></p> 7665 ```````````````````````````````` 7666 7667 7668 ```````````````````````````````` example 7669 [link](<foo(and(bar)>) 7670 . 
7671 <p><a href="foo(and(bar)">link</a></p> 7672 ```````````````````````````````` 7673 7674 7675 Parentheses and other symbols can also be escaped, as usual 7676 in Markdown: 7677 7678 ```````````````````````````````` example 7679 [link](foo\)\:) 7680 . 7681 <p><a href="foo):">link</a></p> 7682 ```````````````````````````````` 7683 7684 7685 A link can contain fragment identifiers and queries: 7686 7687 ```````````````````````````````` example 7688 [link](#fragment) 7689 7690 [link](http://example.com#fragment) 7691 7692 [link](http://example.com?foo=3#frag) 7693 . 7694 <p><a href="#fragment">link</a></p> 7695 <p><a href="http://example.com#fragment">link</a></p> 7696 <p><a href="http://example.com?foo=3#frag">link</a></p> 7697 ```````````````````````````````` 7698 7699 7700 Note that a backslash before a non-escapable character is 7701 just a backslash: 7702 7703 ```````````````````````````````` example 7704 [link](foo\bar) 7705 . 7706 <p><a href="foo%5Cbar">link</a></p> 7707 ```````````````````````````````` 7708 7709 7710 URL-escaping should be left alone inside the destination, as all 7711 URL-escaped characters are also valid URL characters. Entity and 7712 numerical character references in the destination will be parsed 7713 into the corresponding Unicode code points, as usual. These may 7714 be optionally URL-escaped when written as HTML, but this spec 7715 does not enforce any particular policy for rendering URLs in 7716 HTML or other formats. Renderers may make different decisions 7717 about how to escape or normalize URLs in the output. 7718 7719 ```````````````````````````````` example 7720 [link](foo%20bä) 7721 . 7722 <p><a href="foo%20b%C3%A4">link</a></p> 7723 ```````````````````````````````` 7724 7725 7726 Note that, because titles can often be parsed as destinations, 7727 if you try to omit the destination and keep the title, you'll 7728 get unexpected results: 7729 7730 ```````````````````````````````` example 7731 [link]("title") 7732 . 
7733 <p><a href="%22title%22">link</a></p> 7734 ```````````````````````````````` 7735 7736 7737 Titles may be in single quotes, double quotes, or parentheses: 7738 7739 ```````````````````````````````` example 7740 [link](/url "title") 7741 [link](/url 'title') 7742 [link](/url (title)) 7743 . 7744 <p><a href="/url" title="title">link</a> 7745 <a href="/url" title="title">link</a> 7746 <a href="/url" title="title">link</a></p> 7747 ```````````````````````````````` 7748 7749 7750 Backslash escapes and entity and numeric character references 7751 may be used in titles: 7752 7753 ```````````````````````````````` example 7754 [link](/url "title \"&quot;") 7755 . 7756 <p><a href="/url" title="title &quot;&quot;">link</a></p> 7757 ```````````````````````````````` 7758 7759 7760 Titles must be separated from the link using spaces, tabs, and up to one line 7761 ending. 7762 Other [Unicode whitespace] like non-breaking space doesn't work. 7763 7764 ```````````````````````````````` example 7765 [link](/url "title") 7766 . 7767 <p><a href="/url%C2%A0%22title%22">link</a></p> 7768 ```````````````````````````````` 7769 7770 7771 Nested balanced quotes are not allowed without escaping: 7772 7773 ```````````````````````````````` example 7774 [link](/url "title "and" title") 7775 . 7776 <p>[link](/url "title "and" title")</p> 7777 ```````````````````````````````` 7778 7779 7780 But it is easy to work around this by using a different quote type: 7781 7782 ```````````````````````````````` example 7783 [link](/url 'title "and" title') 7784 . 7785 <p><a href="/url" title="title &quot;and&quot; title">link</a></p> 7786 ```````````````````````````````` 7787 7788 7789 (Note: `Markdown.pl` did allow double quotes inside a double-quoted 7790 title, and its test suite included a test demonstrating this. 
7791 But it is hard to see a good rationale for the extra complexity this 7792 brings, since there are already many ways---backslash escaping, 7793 entity and numeric character references, or using a different 7794 quote type for the enclosing title---to write titles containing 7795 double quotes. `Markdown.pl`'s handling of titles has a number 7796 of other strange features. For example, it allows single-quoted 7797 titles in inline links, but not reference links. And, in 7798 reference links but not inline links, it allows a title to begin 7799 with `"` and end with `)`. `Markdown.pl` 1.0.1 even allows 7800 titles with no closing quotation mark, though 1.0.2b8 does not. 7801 It seems preferable to adopt a simple, rational rule that works 7802 the same way in inline links and link reference definitions.) 7803 7804 Spaces, tabs, and up to one line ending are allowed around the destination and 7805 title: 7806 7807 ```````````````````````````````` example 7808 [link]( /uri 7809 "title" ) 7810 . 7811 <p><a href="/uri" title="title">link</a></p> 7812 ```````````````````````````````` 7813 7814 7815 But this whitespace is not allowed between the link text and the 7816 following parenthesis: 7817 7818 ```````````````````````````````` example 7819 [link] (/uri) 7820 . 7821 <p>[link] (/uri)</p> 7822 ```````````````````````````````` 7823 7824 7825 The link text may contain balanced brackets, but not unbalanced ones, 7826 unless they are escaped: 7827 7828 ```````````````````````````````` example 7829 [link [foo [bar]]](/uri) 7830 . 7831 <p><a href="/uri">link [foo [bar]]</a></p> 7832 ```````````````````````````````` 7833 7834 7835 ```````````````````````````````` example 7836 [link] bar](/uri) 7837 . 7838 <p>[link] bar](/uri)</p> 7839 ```````````````````````````````` 7840 7841 7842 ```````````````````````````````` example 7843 [link [bar](/uri) 7844 . 
7845 <p>[link <a href="/uri">bar</a></p> 7846 ```````````````````````````````` 7847 7848 7849 ```````````````````````````````` example 7850 [link \[bar](/uri) 7851 . 7852 <p><a href="/uri">link [bar</a></p> 7853 ```````````````````````````````` 7854 7855 7856 The link text may contain inline content: 7857 7858 ```````````````````````````````` example 7859 [link *foo **bar** `#`*](/uri) 7860 . 7861 <p><a href="/uri">link <em>foo <strong>bar</strong> <code>#</code></em></a></p> 7862 ```````````````````````````````` 7863 7864 7865 ```````````````````````````````` example 7866 [![moon](moon.jpg)](/uri) 7867 . 7868 <p><a href="/uri"><img src="moon.jpg" alt="moon" /></a></p> 7869 ```````````````````````````````` 7870 7871 7872 However, links may not contain other links, at any level of nesting. 7873 7874 ```````````````````````````````` example 7875 [foo [bar](/uri)](/uri) 7876 . 7877 <p>[foo <a href="/uri">bar</a>](/uri)</p> 7878 ```````````````````````````````` 7879 7880 7881 ```````````````````````````````` example 7882 [foo *[bar [baz](/uri)](/uri)*](/uri) 7883 . 7884 <p>[foo <em>[bar <a href="/uri">baz</a>](/uri)</em>](/uri)</p> 7885 ```````````````````````````````` 7886 7887 7888 ```````````````````````````````` example 7889 ![[[foo](uri1)](uri2)](uri3) 7890 . 7891 <p><img src="uri3" alt="[foo](uri2)" /></p> 7892 ```````````````````````````````` 7893 7894 7895 These cases illustrate the precedence of link text grouping over 7896 emphasis grouping: 7897 7898 ```````````````````````````````` example 7899 *[foo*](/uri) 7900 . 7901 <p>*<a href="/uri">foo*</a></p> 7902 ```````````````````````````````` 7903 7904 7905 ```````````````````````````````` example 7906 [foo *bar](baz*) 7907 . 7908 <p><a href="baz*">foo *bar</a></p> 7909 ```````````````````````````````` 7910 7911 7912 Note that brackets that *aren't* part of links do not take 7913 precedence: 7914 7915 ```````````````````````````````` example 7916 *foo [bar* baz] 7917 . 
<p><em>foo [bar</em> baz]</p>
````````````````````````````````


These cases illustrate the precedence of HTML tags, code spans,
and autolinks over link grouping:

```````````````````````````````` example
[foo <bar attr="](baz)">
.
<p>[foo <bar attr="](baz)"></p>
````````````````````````````````


```````````````````````````````` example
[foo`](/uri)`
.
<p>[foo<code>](/uri)</code></p>
````````````````````````````````


```````````````````````````````` example
[foo<http://example.com/?search=](uri)>
.
<p>[foo<a href="http://example.com/?search=%5D(uri)">http://example.com/?search=](uri)</a></p>
````````````````````````````````


There are three kinds of [reference link](@)s:
[full](#full-reference-link), [collapsed](#collapsed-reference-link),
and [shortcut](#shortcut-reference-link).

A [full reference link](@)
consists of a [link text] immediately followed by a [link label]
that [matches] a [link reference definition] elsewhere in the document.

A [link label](@) begins with a left bracket (`[`) and ends
with the first right bracket (`]`) that is not backslash-escaped.
Between these brackets there must be at least one character that is not a space,
tab, or line ending.
Unescaped square bracket characters are not allowed inside the
opening and closing square brackets of [link labels]. A link
label can have at most 999 characters inside the square
brackets.

One label [matches](@)
another just in case their normalized forms are equal. To normalize a
label, strip off the opening and closing brackets,
perform the *Unicode case fold*, strip leading and trailing
spaces, tabs, and line endings, and collapse consecutive internal
spaces, tabs, and line endings to a single space.
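
As a rough illustration of the normalization steps just described, here is a
minimal Python sketch. The helper names `normalize_label` and `labels_match`
are hypothetical and belong to no particular implementation, and Python's
`str.casefold` is assumed to be an adequate stand-in for the Unicode case fold.

```python
import re

# Only spaces, tabs, and line endings count as whitespace here,
# so the broader `\s` class is deliberately avoided.
_WS_RUN = re.compile(r"[ \t\r\n]+")


def normalize_label(label: str) -> str:
    """Normalize a bracketed link label such as '[Foo  bar]' (hypothetical helper)."""
    inner = label[1:-1]                   # strip the opening and closing brackets
    folded = inner.casefold()             # Unicode case fold, not plain lower-casing
    trimmed = folded.strip(" \t\r\n")     # strip leading and trailing whitespace
    return _WS_RUN.sub(" ", trimmed)      # collapse internal runs to a single space


def labels_match(a: str, b: str) -> bool:
    """Two labels match when their normalized forms are equal."""
    return normalize_label(a) == normalize_label(b)
```

Under these assumptions `[BaR]` matches `[bar]` and `[ẞ]` matches `[SS]`,
whereas `[foo\!]` does not match `[foo!]`, because normalization works on the
raw label string and leaves the backslash in place.
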
If there are multiple
matching reference link definitions, the one that comes first in the
document is used. (It is desirable in such cases to emit a warning.)

The link's URI and title are provided by the matching [link
reference definition].

Here is a simple example:

```````````````````````````````` example
[foo][bar]

[bar]: /url "title"
.
<p><a href="/url" title="title">foo</a></p>
````````````````````````````````


The rules for the [link text] are the same as with
[inline links]. Thus:

The link text may contain balanced brackets, but not unbalanced ones,
unless they are escaped:

```````````````````````````````` example
[link [foo [bar]]][ref]

[ref]: /uri
.
<p><a href="/uri">link [foo [bar]]</a></p>
````````````````````````````````


```````````````````````````````` example
[link \[bar][ref]

[ref]: /uri
.
<p><a href="/uri">link [bar</a></p>
````````````````````````````````


The link text may contain inline content:

```````````````````````````````` example
[link *foo **bar** `#`*][ref]

[ref]: /uri
.
<p><a href="/uri">link <em>foo <strong>bar</strong> <code>#</code></em></a></p>
````````````````````````````````


```````````````````````````````` example
[![moon](moon.jpg)][ref]

[ref]: /uri
.
<p><a href="/uri"><img src="moon.jpg" alt="moon" /></a></p>
````````````````````````````````


However, links may not contain other links, at any level of nesting.

```````````````````````````````` example
[foo [bar](/uri)][ref]

[ref]: /uri
.
<p>[foo <a href="/uri">bar</a>]<a href="/uri">ref</a></p>
````````````````````````````````


```````````````````````````````` example
[foo *bar [baz][ref]*][ref]

[ref]: /uri
.
<p>[foo <em>bar <a href="/uri">baz</a></em>]<a href="/uri">ref</a></p>
````````````````````````````````


(In the examples above, we have two [shortcut reference links]
instead of one [full reference link].)

The following cases illustrate the precedence of link text grouping over
emphasis grouping:

```````````````````````````````` example
*[foo*][ref]

[ref]: /uri
.
<p>*<a href="/uri">foo*</a></p>
````````````````````````````````


```````````````````````````````` example
[foo *bar][ref]*

[ref]: /uri
.
<p><a href="/uri">foo *bar</a>*</p>
````````````````````````````````


These cases illustrate the precedence of HTML tags, code spans,
and autolinks over link grouping:

```````````````````````````````` example
[foo <bar attr="][ref]">

[ref]: /uri
.
<p>[foo <bar attr="][ref]"></p>
````````````````````````````````


```````````````````````````````` example
[foo`][ref]`

[ref]: /uri
.
<p>[foo<code>][ref]</code></p>
````````````````````````````````


```````````````````````````````` example
[foo<http://example.com/?search=][ref]>

[ref]: /uri
.
<p>[foo<a href="http://example.com/?search=%5D%5Bref%5D">http://example.com/?search=][ref]</a></p>
````````````````````````````````


Matching is case-insensitive:

```````````````````````````````` example
[foo][BaR]

[bar]: /url "title"
.
<p><a href="/url" title="title">foo</a></p>
````````````````````````````````


Unicode case fold is used:

```````````````````````````````` example
[ẞ]

[SS]: /url
.
<p><a href="/url">ẞ</a></p>
````````````````````````````````


Consecutive internal spaces, tabs, and line endings are treated as one space for
purposes of determining matching:

```````````````````````````````` example
[Foo
bar]: /url

[Baz][Foo bar]
.
<p><a href="/url">Baz</a></p>
````````````````````````````````


No spaces, tabs, or line endings are allowed between the [link text] and the
[link label]:

```````````````````````````````` example
[foo] [bar]

[bar]: /url "title"
.
<p>[foo] <a href="/url" title="title">bar</a></p>
````````````````````````````````


```````````````````````````````` example
[foo]
[bar]

[bar]: /url "title"
.
<p>[foo]
<a href="/url" title="title">bar</a></p>
````````````````````````````````


This is a departure from John Gruber's original Markdown syntax
description, which explicitly allows whitespace between the link
text and the link label. It brings reference links in line with
[inline links], which (according to both original Markdown and
this spec) cannot have whitespace after the link text. More
importantly, it prevents inadvertent capture of consecutive
[shortcut reference links].
If whitespace were allowed between the
link text and the link label, then in the following we would have
a single reference link, not two shortcut reference links, as
intended:

``` markdown
[foo]
[bar]

[foo]: /url1
[bar]: /url2
```

(Note that [shortcut reference links] were introduced by Gruber
himself in a beta version of `Markdown.pl`, but never included
in the official syntax description. Without shortcut reference
links, it is harmless to allow space between the link text and
link label; but once shortcut references are introduced, it is
too dangerous to allow this, as it frequently leads to
unintended results.)

When there are multiple matching [link reference definitions],
the first is used:

```````````````````````````````` example
[foo]: /url1

[foo]: /url2

[bar][foo]
.
<p><a href="/url1">bar</a></p>
````````````````````````````````


Note that matching is performed on normalized strings, not parsed
inline content. So the following does not match, even though the
labels define equivalent inline content:

```````````````````````````````` example
[bar][foo\!]

[foo!]: /url
.
<p>[bar][foo!]</p>
````````````````````````````````


[Link labels] cannot contain brackets, unless they are
backslash-escaped:

```````````````````````````````` example
[foo][ref[]

[ref[]: /uri
.
<p>[foo][ref[]</p>
<p>[ref[]: /uri</p>
````````````````````````````````


```````````````````````````````` example
[foo][ref[bar]]

[ref[bar]]: /uri
.
<p>[foo][ref[bar]]</p>
<p>[ref[bar]]: /uri</p>
````````````````````````````````


```````````````````````````````` example
[[[foo]]]

[[[foo]]]: /url
.
<p>[[[foo]]]</p>
<p>[[[foo]]]: /url</p>
````````````````````````````````


```````````````````````````````` example
[foo][ref\[]

[ref\[]: /uri
.
<p><a href="/uri">foo</a></p>
````````````````````````````````


Note that in this example `]` is not backslash-escaped:

```````````````````````````````` example
[bar\\]: /uri

[bar\\]
.
<p><a href="/uri">bar\</a></p>
````````````````````````````````


A [link label] must contain at least one character that is not a space, tab, or
line ending:

```````````````````````````````` example
[]

[]: /uri
.
<p>[]</p>
<p>[]: /uri</p>
````````````````````````````````


```````````````````````````````` example
[
]

[
]: /uri
.
<p>[
]</p>
<p>[
]: /uri</p>
````````````````````````````````


A [collapsed reference link](@)
consists of a [link label] that [matches] a
[link reference definition] elsewhere in the
document, followed by the string `[]`.
The contents of the first link label are parsed as inlines,
which are used as the link's text. The link's URI and title are
provided by the matching reference link definition. Thus,
`[foo][]` is equivalent to `[foo][foo]`.

```````````````````````````````` example
[foo][]

[foo]: /url "title"
.
<p><a href="/url" title="title">foo</a></p>
````````````````````````````````


```````````````````````````````` example
[*foo* bar][]

[*foo* bar]: /url "title"
.
<p><a href="/url" title="title"><em>foo</em> bar</a></p>
````````````````````````````````


The link labels are case-insensitive:

```````````````````````````````` example
[Foo][]

[foo]: /url "title"
.
<p><a href="/url" title="title">Foo</a></p>
````````````````````````````````



As with full reference links, spaces, tabs, or line endings are not
allowed between the two sets of brackets:

```````````````````````````````` example
[foo]
[]

[foo]: /url "title"
.
<p><a href="/url" title="title">foo</a>
[]</p>
````````````````````````````````


A [shortcut reference link](@)
consists of a [link label] that [matches] a
[link reference definition] elsewhere in the
document and is not followed by `[]` or a link label.
The contents of the first link label are parsed as inlines,
which are used as the link's text. The link's URI and title
are provided by the matching link reference definition.
Thus, `[foo]` is equivalent to `[foo][]`.

```````````````````````````````` example
[foo]

[foo]: /url "title"
.
<p><a href="/url" title="title">foo</a></p>
````````````````````````````````


```````````````````````````````` example
[*foo* bar]

[*foo* bar]: /url "title"
.
<p><a href="/url" title="title"><em>foo</em> bar</a></p>
````````````````````````````````


```````````````````````````````` example
[[*foo* bar]]

[*foo* bar]: /url "title"
.
<p>[<a href="/url" title="title"><em>foo</em> bar</a>]</p>
````````````````````````````````


```````````````````````````````` example
[[bar [foo]

[foo]: /url
.
<p>[[bar <a href="/url">foo</a></p>
````````````````````````````````


The link labels are case-insensitive:

```````````````````````````````` example
[Foo]

[foo]: /url "title"
.
<p><a href="/url" title="title">Foo</a></p>
````````````````````````````````


A space after the link text should be preserved:

```````````````````````````````` example
[foo] bar

[foo]: /url
.
<p><a href="/url">foo</a> bar</p>
````````````````````````````````


If you just want bracketed text, you can backslash-escape the
opening bracket to avoid links:

```````````````````````````````` example
\[foo]

[foo]: /url "title"
.
<p>[foo]</p>
````````````````````````````````


Note that this is a link, because a link label ends with the first
following closing bracket:

```````````````````````````````` example
[foo*]: /url

*[foo*]
.
<p>*<a href="/url">foo*</a></p>
````````````````````````````````


Full and collapsed references take precedence over shortcut
references:

```````````````````````````````` example
[foo][bar]

[foo]: /url1
[bar]: /url2
.
<p><a href="/url2">foo</a></p>
````````````````````````````````

```````````````````````````````` example
[foo][]

[foo]: /url1
.
<p><a href="/url1">foo</a></p>
````````````````````````````````

Inline links also take precedence:

```````````````````````````````` example
[foo]()

[foo]: /url1
.
<p><a href="">foo</a></p>
````````````````````````````````

```````````````````````````````` example
[foo](not a link)

[foo]: /url1
.
<p><a href="/url1">foo</a>(not a link)</p>
````````````````````````````````

In the following case `[bar][baz]` is parsed as a reference,
`[foo]` as normal text:

```````````````````````````````` example
[foo][bar][baz]

[baz]: /url
.
<p>[foo]<a href="/url">bar</a></p>
````````````````````````````````


Here, though, `[foo][bar]` is parsed as a reference, since
`[bar]` is defined:

```````````````````````````````` example
[foo][bar][baz]

[baz]: /url1
[bar]: /url2
.
<p><a href="/url2">foo</a><a href="/url1">baz</a></p>
````````````````````````````````


Here `[foo]` is not parsed as a shortcut reference, because it
is followed by a link label (even though `[bar]` is not defined):

```````````````````````````````` example
[foo][bar][baz]

[baz]: /url1
[foo]: /url2
.
<p>[foo]<a href="/url1">bar</a></p>
````````````````````````````````



## Images

Syntax for images is like the syntax for links, with one
difference. Instead of [link text], we have an
[image description](@). The rules for this are the
same as for [link text], except that (a) an
image description starts with `![` rather than `[`, and
(b) an image description may contain links.
An image description has inline elements
as its contents. When an image is rendered to HTML,
this is standardly used as the image's `alt` attribute.

```````````````````````````````` example
![foo](/url "title")
.
<p><img src="/url" alt="foo" title="title" /></p>
````````````````````````````````


```````````````````````````````` example
![foo *bar*]

[foo *bar*]: train.jpg "train & tracks"
.
<p><img src="train.jpg" alt="foo bar" title="train &amp; tracks" /></p>
````````````````````````````````


```````````````````````````````` example
![foo ![bar](/url)](/url2)
.
<p><img src="/url2" alt="foo bar" /></p>
````````````````````````````````


```````````````````````````````` example
![foo [bar](/url)](/url2)
.
<p><img src="/url2" alt="foo bar" /></p>
````````````````````````````````


Though this spec is concerned with parsing, not rendering, it is
recommended that in rendering to HTML, only the plain string content
of the [image description] be used. Note that in
the above example, the alt attribute's value is `foo bar`, not `foo
[bar](/url)` or `foo <a href="/url">bar</a>`. Only the plain string
content is rendered, without formatting.

```````````````````````````````` example
![foo *bar*][]

[foo *bar*]: train.jpg "train & tracks"
.
<p><img src="train.jpg" alt="foo bar" title="train &amp; tracks" /></p>
````````````````````````````````


```````````````````````````````` example
![foo *bar*][foobar]

[FOOBAR]: train.jpg "train & tracks"
.
<p><img src="train.jpg" alt="foo bar" title="train &amp; tracks" /></p>
````````````````````````````````


```````````````````````````````` example
![foo](train.jpg)
.
<p><img src="train.jpg" alt="foo" /></p>
````````````````````````````````


```````````````````````````````` example
My ![foo bar](/path/to/train.jpg "title" )
.
<p>My <img src="/path/to/train.jpg" alt="foo bar" title="title" /></p>
````````````````````````````````


```````````````````````````````` example
![foo](<url>)
.
<p><img src="url" alt="foo" /></p>
````````````````````````````````


```````````````````````````````` example
![](/url)
.
<p><img src="/url" alt="" /></p>
````````````````````````````````


Reference-style:

```````````````````````````````` example
![foo][bar]

[bar]: /url
.
<p><img src="/url" alt="foo" /></p>
````````````````````````````````


```````````````````````````````` example
![foo][bar]

[BAR]: /url
.
<p><img src="/url" alt="foo" /></p>
````````````````````````````````


Collapsed:

```````````````````````````````` example
![foo][]

[foo]: /url "title"
.
<p><img src="/url" alt="foo" title="title" /></p>
````````````````````````````````


```````````````````````````````` example
![*foo* bar][]

[*foo* bar]: /url "title"
.
<p><img src="/url" alt="foo bar" title="title" /></p>
````````````````````````````````


The labels are case-insensitive:

```````````````````````````````` example
![Foo][]

[foo]: /url "title"
.
<p><img src="/url" alt="Foo" title="title" /></p>
````````````````````````````````


As with reference links, spaces, tabs, and line endings are not allowed
between the two sets of brackets:

```````````````````````````````` example
![foo]
[]

[foo]: /url "title"
.
<p><img src="/url" alt="foo" title="title" />
[]</p>
````````````````````````````````


Shortcut:

```````````````````````````````` example
![foo]

[foo]: /url "title"
.
<p><img src="/url" alt="foo" title="title" /></p>
````````````````````````````````


```````````````````````````````` example
![*foo* bar]

[*foo* bar]: /url "title"
.
<p><img src="/url" alt="foo bar" title="title" /></p>
````````````````````````````````


Note that link labels cannot contain unescaped brackets:

```````````````````````````````` example
![[foo]]

[[foo]]: /url "title"
.
<p>![[foo]]</p>
<p>[[foo]]: /url &quot;title&quot;</p>
````````````````````````````````


The link labels are case-insensitive:

```````````````````````````````` example
![Foo]

[foo]: /url "title"
.
<p><img src="/url" alt="Foo" title="title" /></p>
````````````````````````````````


If you just want a literal `!` followed by bracketed text, you can
backslash-escape the opening `[`:

```````````````````````````````` example
!\[foo]

[foo]: /url "title"
.
<p>![foo]</p>
````````````````````````````````


If you want a link after a literal `!`, backslash-escape the
`!`:

```````````````````````````````` example
\![foo]

[foo]: /url "title"
.
<p>!<a href="/url" title="title">foo</a></p>
````````````````````````````````


## Autolinks

[Autolink](@)s are absolute URIs and email addresses inside
`<` and `>`. They are parsed as links, with the URL or email address
as the link label.

A [URI autolink](@) consists of `<`, followed by an
[absolute URI] followed by `>`. It is parsed as
a link to the URI, with the URI as the link's label.

An [absolute URI](@),
for these purposes, consists of a [scheme] followed by a colon (`:`)
followed by zero or more characters other than [ASCII control
characters][ASCII control character], [space], `<`, and `>`.
If the URI includes these characters, they must be percent-encoded
(e.g. `%20` for a space).

For purposes of this spec, a [scheme](@) is any sequence
of 2--32 characters beginning with an ASCII letter and followed
by any combination of ASCII letters, digits, or the symbols plus
("+"), period ("."), or hyphen ("-").

Here are some valid autolinks:

```````````````````````````````` example
<http://foo.bar.baz>
.
<p><a href="http://foo.bar.baz">http://foo.bar.baz</a></p>
````````````````````````````````


```````````````````````````````` example
<http://foo.bar.baz/test?q=hello&id=22&boolean>
.
<p><a href="http://foo.bar.baz/test?q=hello&amp;id=22&amp;boolean">http://foo.bar.baz/test?q=hello&amp;id=22&amp;boolean</a></p>
````````````````````````````````


```````````````````````````````` example
<irc://foo.bar:2233/baz>
.
<p><a href="irc://foo.bar:2233/baz">irc://foo.bar:2233/baz</a></p>
````````````````````````````````


Uppercase is also fine:

```````````````````````````````` example
<MAILTO:FOO@BAR.BAZ>
.
<p><a href="MAILTO:FOO@BAR.BAZ">MAILTO:FOO@BAR.BAZ</a></p>
````````````````````````````````


Note that many strings that count as [absolute URIs] for
purposes of this spec are not valid URIs, because their
schemes are not registered or because of other problems
with their syntax:

```````````````````````````````` example
<a+b+c:d>
.
<p><a href="a+b+c:d">a+b+c:d</a></p>
````````````````````````````````


```````````````````````````````` example
<made-up-scheme://foo,bar>
.
<p><a href="made-up-scheme://foo,bar">made-up-scheme://foo,bar</a></p>
````````````````````````````````


```````````````````````````````` example
<http://../>
.
<p><a href="http://../">http://../</a></p>
````````````````````````````````


```````````````````````````````` example
<localhost:5001/foo>
.
<p><a href="localhost:5001/foo">localhost:5001/foo</a></p>
````````````````````````````````


Spaces are not allowed in autolinks:

```````````````````````````````` example
<http://foo.bar/baz bim>
.
<p>&lt;http://foo.bar/baz bim&gt;</p>
````````````````````````````````


Backslash-escapes do not work inside autolinks:

```````````````````````````````` example
<http://example.com/\[\>
.
<p><a href="http://example.com/%5C%5B%5C">http://example.com/\[\</a></p>
````````````````````````````````


An [email autolink](@)
consists of `<`, followed by an [email address],
followed by `>`. The link's label is the email address,
and the URL is `mailto:` followed by the email address.

An [email address](@),
for these purposes, is anything that matches
the [non-normative regex from the HTML5
spec](https://html.spec.whatwg.org/multipage/forms.html#e-mail-state-(type=email)):

    /^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?
    (?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/

Examples of email autolinks:

```````````````````````````````` example
<foo@bar.example.com>
.
<p><a href="mailto:foo@bar.example.com">foo@bar.example.com</a></p>
````````````````````````````````


```````````````````````````````` example
<foo+special@Bar.baz-bar0.com>
.
<p><a href="mailto:foo+special@Bar.baz-bar0.com">foo+special@Bar.baz-bar0.com</a></p>
````````````````````````````````


Backslash-escapes do not work inside email autolinks:

```````````````````````````````` example
<foo\+@bar.example.com>
.
<p>&lt;foo+@bar.example.com&gt;</p>
````````````````````````````````


These are not autolinks:

```````````````````````````````` example
<>
.
<p>&lt;&gt;</p>
````````````````````````````````


```````````````````````````````` example
< http://foo.bar >
.
<p>&lt; http://foo.bar &gt;</p>
````````````````````````````````


```````````````````````````````` example
<m:abc>
.
<p>&lt;m:abc&gt;</p>
````````````````````````````````


```````````````````````````````` example
<foo.bar.baz>
.
<p>&lt;foo.bar.baz&gt;</p>
````````````````````````````````


```````````````````````````````` example
http://example.com
.
<p>http://example.com</p>
````````````````````````````````


```````````````````````````````` example
foo@bar.example.com
.
<p>foo@bar.example.com</p>
````````````````````````````````


## Raw HTML

Text between `<` and `>` that looks like an HTML tag is parsed as a
raw HTML tag and will be rendered in HTML without escaping.
Tag and attribute names are not limited to current HTML tags,
so custom tags (and even, say, DocBook tags) may be used.

Here is the grammar for tags:

A [tag name](@) consists of an ASCII letter
followed by zero or more ASCII letters, digits, or
hyphens (`-`).

An [attribute](@) consists of spaces, tabs, and up to one line ending,
an [attribute name], and an optional
[attribute value specification].

An [attribute name](@)
consists of an ASCII letter, `_`, or `:`, followed by zero or more ASCII
letters, digits, `_`, `.`, `:`, or `-`. (Note: This is the XML
specification restricted to ASCII. HTML5 is laxer.)

An [attribute value specification](@)
consists of optional spaces, tabs, and up to one line ending,
a `=` character, optional spaces, tabs, and up to one line ending,
and an [attribute value].

An [attribute value](@)
consists of an [unquoted attribute value],
a [single-quoted attribute value], or a [double-quoted attribute value].

An [unquoted attribute value](@)
is a nonempty string of characters not
including spaces, tabs, line endings, `"`, `'`, `=`, `<`, `>`, or `` ` ``.

A [single-quoted attribute value](@)
consists of `'`, zero or more
characters not including `'`, and a final `'`.
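
The lexical definitions given so far (tag name, attribute name, unquoted and
single-quoted attribute values) translate fairly directly into regular
expressions. The following Python sketch is only illustrative: the constant
names and the `conforms` helper are hypothetical, and the sketch covers just
these four definitions, not the full tag grammar.

```python
import re

# Each pattern mirrors one prose definition; they are meant to be used
# with fullmatch() so that the whole candidate string must conform.
TAG_NAME = re.compile(r"[A-Za-z][A-Za-z0-9-]*")
ATTRIBUTE_NAME = re.compile(r"[A-Za-z_:][A-Za-z0-9_.:-]*")
# Nonempty run excluding spaces, tabs, line endings, ", ', =, <, >, and `.
UNQUOTED_VALUE = re.compile(r"""[^ \t\r\n"'=<>`]+""")
# A ', zero or more characters other than ', and a final '.
SINGLE_QUOTED_VALUE = re.compile(r"'[^']*'")


def conforms(pattern: re.Pattern, text: str) -> bool:
    """Return True if `text` as a whole matches `pattern` (hypothetical helper)."""
    return pattern.fullmatch(text) is not None
```

For instance, `conforms(TAG_NAME, "responsive-image")` holds, while
`conforms(TAG_NAME, "33")` and `conforms(ATTRIBUTE_NAME, "h*#ref")` do not.
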

A [double-quoted attribute value](@)
consists of `"`, zero or more
characters not including `"`, and a final `"`.

An [open tag](@) consists of a `<` character, a [tag name],
zero or more [attributes], optional spaces, tabs, and up to one line ending,
an optional `/` character, and a `>` character.

A [closing tag](@) consists of the string `</`, a
[tag name], optional spaces, tabs, and up to one line ending, and the character
`>`.

An [HTML comment](@) consists of `<!--` + *text* + `-->`,
where *text* does not start with `>` or `->`, does not end with `-`,
and does not contain `--`. (See the
[HTML5 spec](http://www.w3.org/TR/html5/syntax.html#comments).)

A [processing instruction](@)
consists of the string `<?`, a string
of characters not including the string `?>`, and the string
`?>`.

A [declaration](@) consists of the string `<!`, an ASCII letter, zero or more
characters not including the character `>`, and the character `>`.

A [CDATA section](@) consists of
the string `<![CDATA[`, a string of characters not including the string
`]]>`, and the string `]]>`.

An [HTML tag](@) consists of an [open tag], a [closing tag],
an [HTML comment], a [processing instruction], a [declaration],
or a [CDATA section].

Here are some simple open tags:

```````````````````````````````` example
<a><bab><c2c>
.
<p><a><bab><c2c></p>
````````````````````````````````


Empty elements:

```````````````````````````````` example
<a/><b2/>
.
<p><a/><b2/></p>
````````````````````````````````


Whitespace is allowed:

```````````````````````````````` example
<a /><b2
data="foo" >
.
<p><a /><b2
data="foo" ></p>
````````````````````````````````


With attributes:

```````````````````````````````` example
<a foo="bar" bam = 'baz <em>"</em>'
_boolean zoop:33=zoop:33 />
.
<p><a foo="bar" bam = 'baz <em>"</em>'
_boolean zoop:33=zoop:33 /></p>
````````````````````````````````


Custom tag names can be used:

```````````````````````````````` example
Foo <responsive-image src="foo.jpg" />
.
<p>Foo <responsive-image src="foo.jpg" /></p>
````````````````````````````````


Illegal tag names, not parsed as HTML:

```````````````````````````````` example
<33> <__>
.
<p>&lt;33&gt; &lt;__&gt;</p>
````````````````````````````````


Illegal attribute names:

```````````````````````````````` example
<a h*#ref="hi">
.
<p>&lt;a h*#ref=&quot;hi&quot;&gt;</p>
````````````````````````````````


Illegal attribute values:

```````````````````````````````` example
<a href="hi'> <a href=hi'>
.
<p>&lt;a href=&quot;hi'&gt; &lt;a href=hi'&gt;</p>
````````````````````````````````


Illegal whitespace:

```````````````````````````````` example
< a><
foo><bar/ >
<foo bar=baz
bim!bop />
.
<p>&lt; a&gt;&lt;
foo&gt;&lt;bar/ &gt;
&lt;foo bar=baz
bim!bop /&gt;</p>
````````````````````````````````


Missing whitespace:

```````````````````````````````` example
<a href='bar'title=title>
.
<p>&lt;a href='bar'title=title&gt;</p>
````````````````````````````````


Closing tags:

```````````````````````````````` example
</a></foo >
.
<p></a></foo ></p>
````````````````````````````````


Illegal attributes in closing tag:

```````````````````````````````` example
</a href="foo">
.
<p>&lt;/a href=&quot;foo&quot;&gt;</p>
````````````````````````````````


Comments:

```````````````````````````````` example
foo <!-- this is a
comment - with hyphen -->
.
<p>foo <!-- this is a
comment - with hyphen --></p>
````````````````````````````````


```````````````````````````````` example
foo <!-- not a comment -- two hyphens -->
.
<p>foo &lt;!-- not a comment -- two hyphens --&gt;</p>
````````````````````````````````


Not comments:

```````````````````````````````` example
foo <!--> foo -->

foo <!-- foo--->
.
<p>foo &lt;!--&gt; foo --&gt;</p>
<p>foo &lt;!-- foo---&gt;</p>
````````````````````````````````


Processing instructions:

```````````````````````````````` example
foo <?php echo $a; ?>
.
<p>foo <?php echo $a; ?></p>
````````````````````````````````


Declarations:

```````````````````````````````` example
foo <!ELEMENT br EMPTY>
.
<p>foo <!ELEMENT br EMPTY></p>
````````````````````````````````


CDATA sections:

```````````````````````````````` example
foo <![CDATA[>&<]]>
.
<p>foo <![CDATA[>&<]]></p>
````````````````````````````````


Entity and numeric character references are preserved in HTML
attributes:

```````````````````````````````` example
foo <a href="&ouml;">
.
<p>foo <a href="&ouml;"></p>
````````````````````````````````


Backslash escapes do not work in HTML attributes:

```````````````````````````````` example
foo <a href="\*">
.
<p>foo <a href="\*"></p>
````````````````````````````````


```````````````````````````````` example
<a href="\"">
.
<p>&lt;a href=&quot;&quot;&quot;&gt;</p>
````````````````````````````````


## Hard line breaks

A line ending (not in a code span or HTML tag) that is preceded
by two or more spaces and does not occur at the end of a block
is parsed as a [hard line break](@) (rendered
in HTML as a `<br />` tag):

```````````````````````````````` example
foo  
baz
.
<p>foo<br />
baz</p>
````````````````````````````````


For a more visible alternative, a backslash before the
[line ending] may be used instead of two or more spaces:

```````````````````````````````` example
foo\
baz
.
<p>foo<br />
baz</p>
````````````````````````````````


More than two spaces can be used:

```````````````````````````````` example
foo       
baz
.
<p>foo<br />
baz</p>
````````````````````````````````


Leading spaces at the beginning of the next line are ignored:

```````````````````````````````` example
foo  
     bar
.
<p>foo<br />
bar</p>
````````````````````````````````


```````````````````````````````` example
foo\
     bar
.
<p>foo<br />
bar</p>
````````````````````````````````


Hard line breaks can occur inside emphasis, links, and other constructs
that allow inline content:

```````````````````````````````` example
*foo  
bar*
.
<p><em>foo<br />
bar</em></p>
````````````````````````````````


```````````````````````````````` example
*foo\
bar*
.
<p><em>foo<br />
bar</em></p>
````````````````````````````````


Hard line breaks do not occur inside code spans

```````````````````````````````` example
`code  
span`
.
<p><code>code   span</code></p>
````````````````````````````````


```````````````````````````````` example
`code\
span`
.
<p><code>code\ span</code></p>
````````````````````````````````


or HTML tags:

```````````````````````````````` example
<a href="foo  
bar">
.
<p><a href="foo  
bar"></p>
````````````````````````````````


```````````````````````````````` example
<a href="foo\
bar">
.
<p><a href="foo\
bar"></p>
````````````````````````````````


Hard line breaks are for separating inline content within a block.
Neither syntax for hard line breaks works at the end of a paragraph or
other block element:

```````````````````````````````` example
foo\
.
<p>foo\</p>
````````````````````````````````


```````````````````````````````` example
foo  
.
<p>foo</p>
````````````````````````````````


```````````````````````````````` example
### foo\
.
<h3>foo\</h3>
````````````````````````````````


```````````````````````````````` example
### foo  
.
<h3>foo</h3>
````````````````````````````````


## Soft line breaks

A regular line ending (not in a code span or HTML tag) that is not
preceded by two or more spaces or a backslash is parsed as a
[softbreak](@). (A soft line break may be rendered in HTML either as a
[line ending] or as a space. The result will be the same in
browsers. In the examples here, a [line ending] will be used.)

```````````````````````````````` example
foo
baz
.
<p>foo
baz</p>
````````````````````````````````


Spaces at the end of the line and beginning of the next line are
removed:

```````````````````````````````` example
foo 
 baz
.
<p>foo
baz</p>
````````````````````````````````


A conforming parser may render a soft line break in HTML either as a
line ending or as a space.

A renderer may also provide an option to render soft line breaks
as hard line breaks.

## Textual content

Any characters not given an interpretation by the above rules will
be parsed as plain textual content.

```````````````````````````````` example
hello $.;'there
.
<p>hello $.;'there</p>
````````````````````````````````


```````````````````````````````` example
Foo χρῆν
.
<p>Foo χρῆν</p>
````````````````````````````````


Internal spaces are preserved verbatim:

```````````````````````````````` example
Multiple     spaces
.
<p>Multiple     spaces</p>
````````````````````````````````


<!-- END TESTS -->

# Appendix: A parsing strategy

In this appendix we describe some features of the parsing strategy
used in the CommonMark reference implementations.

## Overview

Parsing has two phases:

1.  In the first phase, lines of input are consumed and the block
    structure of the document---its division into paragraphs, block quotes,
    list items, and so on---is constructed. Text is assigned to these
    blocks but not parsed. Link reference definitions are parsed and a
    map of links is constructed.

2.  In the second phase, the raw text contents of paragraphs and headings
    are parsed into sequences of Markdown inline elements (strings,
    code spans, links, emphasis, and so on), using the map of link
    references constructed in phase 1.

At each point in processing, the document is represented as a tree of
**blocks**. The root of the tree is a `document` block. The `document`
may have any number of other blocks as **children**.
These children
may, in turn, have other blocks as children. The last child of a block
is normally considered **open**, meaning that subsequent lines of input
can alter its contents. (Blocks that are not open are **closed**.)
Here, for example, is a possible document tree, with the open blocks
marked by arrows:

``` tree
-> document
  -> block_quote
       paragraph
         "Lorem ipsum dolor\nsit amet."
    -> list (type=bullet tight=true bullet_char=-)
         list_item
           paragraph
             "Qui *quodsi iracundia*"
      -> list_item
        -> paragraph
             "aliquando id"
```

## Phase 1: block structure

Each line that is processed has an effect on this tree. The line is
analyzed and, depending on its contents, the document may be altered
in one or more of the following ways:

1. One or more open blocks may be closed.
2. One or more new blocks may be created as children of the
   last open block.
3. Text may be added to the last (deepest) open block remaining
   on the tree.

Once a line has been incorporated into the tree in this way,
it can be discarded, so input can be read in a stream.

For each line, we follow this procedure:

1. First we iterate through the open blocks, starting with the
root document, and descending through last children down to the last
open block. Each block imposes a condition that the line must satisfy
if the block is to remain open. For example, a block quote requires a
`>` character. A paragraph requires a non-blank line.
In this phase we may match all or just some of the open
blocks. But we cannot close unmatched blocks yet, because we may have a
[lazy continuation line].

2. Next, after consuming the continuation markers for existing
blocks, we look for new block starts (e.g. `>` for a block quote).
If we encounter a new block start, we close any blocks unmatched
in step 1 before creating the new block as a child of the last
matched container block.

3. Finally, we look at the remainder of the line (after block
markers like `>`, list markers, and indentation have been consumed).
This is text that can be incorporated into the last open
block (a paragraph, code block, heading, or raw HTML).

Setext headings are formed when we see a line of a paragraph
that is a [setext heading underline].

Reference link definitions are detected when a paragraph is closed;
the accumulated text lines are parsed to see if they begin with
one or more reference link definitions. Any remainder becomes a
normal paragraph.

We can see how this works by considering how the tree above is
generated by four lines of Markdown:

``` markdown
> Lorem ipsum dolor
sit amet.
> - Qui *quodsi iracundia*
> - aliquando id
```

At the outset, our document model is just

``` tree
-> document
```

The first line of our text,

``` markdown
> Lorem ipsum dolor
```

causes a `block_quote` block to be created as a child of our
open `document` block, and a `paragraph` block as a child of
the `block_quote`. Then the text is added to the last open
block, the `paragraph`:

``` tree
-> document
  -> block_quote
    -> paragraph
         "Lorem ipsum dolor"
```

The next line,

``` markdown
sit amet.
```

is a "lazy continuation" of the open `paragraph`, so it gets added
to the paragraph's text:

``` tree
-> document
  -> block_quote
    -> paragraph
         "Lorem ipsum dolor\nsit amet."
```

The third line,

``` markdown
> - Qui *quodsi iracundia*
```

causes the `paragraph` block to be closed, and a new `list` block
opened as a child of the `block_quote`. A `list_item` is also
added as a child of the `list`, and a `paragraph` as a child of
the `list_item`. The text is then added to the new `paragraph`:

``` tree
-> document
  -> block_quote
       paragraph
         "Lorem ipsum dolor\nsit amet."
    -> list (type=bullet tight=true bullet_char=-)
      -> list_item
        -> paragraph
             "Qui *quodsi iracundia*"
```

The fourth line,

``` markdown
> - aliquando id
```

causes the `list_item` (and its child the `paragraph`) to be closed,
and a new `list_item` opened up as child of the `list`. A `paragraph`
is added as a child of the new `list_item`, to contain the text.
We thus obtain the final tree:

``` tree
-> document
  -> block_quote
       paragraph
         "Lorem ipsum dolor\nsit amet."
    -> list (type=bullet tight=true bullet_char=-)
         list_item
           paragraph
             "Qui *quodsi iracundia*"
      -> list_item
        -> paragraph
             "aliquando id"
```

## Phase 2: inline structure

Once all of the input has been parsed, all open blocks are closed.

We then "walk the tree," visiting every node, and parse raw
string contents of paragraphs and headings as inlines. At this
point we have seen all the link reference definitions, so we can
resolve reference links as we go.

``` tree
document
  block_quote
    paragraph
      str "Lorem ipsum dolor"
      softbreak
      str "sit amet."
    list (type=bullet tight=true bullet_char=-)
      list_item
        paragraph
          str "Qui "
          emph
            str "quodsi iracundia"
      list_item
        paragraph
          str "aliquando id"
```

Notice how the [line ending] in the first paragraph has
been parsed as a `softbreak`, and the asterisks in the first list item
have become an `emph`.

### An algorithm for parsing nested emphasis and links

By far the trickiest part of inline parsing is handling emphasis,
strong emphasis, links, and images. This is done using the following
algorithm.

When we're parsing inlines and we hit either

- a run of `*` or `_` characters, or
- a `[` or `![`

we insert a text node with these symbols as its literal content, and we
add a pointer to this text node to the [delimiter stack](@).

The [delimiter stack] is a doubly linked list. Each
element contains a pointer to a text node, plus information about

- the type of delimiter (`[`, `![`, `*`, `_`)
- the number of delimiters,
- whether the delimiter is "active" (all are active to start), and
- whether the delimiter is a potential opener, a potential closer,
  or both (which depends on what sort of characters precede
  and follow the delimiters).

When we hit a `]` character, we call the *look for link or image*
procedure (see below).

When we hit the end of the input, we call the *process emphasis*
procedure (see below), with `stack_bottom` = NULL.

#### *look for link or image*

Starting at the top of the delimiter stack, we look backwards
through the stack for an opening `[` or `![` delimiter.

- If we don't find one, we return a literal text node `]`.

- If we do find one, but it's not *active*, we remove the inactive
  delimiter from the stack, and return a literal text node `]`.

- If we find one and it's active, then we parse ahead to see if
  we have an inline link/image, reference link/image, collapsed reference
  link/image, or shortcut reference link/image.

  + If we don't, then we remove the opening delimiter from the
    delimiter stack and return a literal text node `]`.

  + If we do, then

    * We return a link or image node whose children are the inlines
      after the text node pointed to by the opening delimiter.

    * We run *process emphasis* on these inlines, with the `[` opener
      as `stack_bottom`.

    * We remove the opening delimiter.

    * If we have a link (and not an image), we also set all
      `[` delimiters before the opening delimiter to *inactive*. (This
      will prevent us from getting links within links.)

#### *process emphasis*

Parameter `stack_bottom` sets a lower bound to how far we
descend in the [delimiter stack]. If it is NULL, we can
go all the way to the bottom. Otherwise, we stop before
visiting `stack_bottom`.

Let `current_position` point to the element on the [delimiter stack]
just above `stack_bottom` (or the first element if `stack_bottom`
is NULL).

We keep track of the `openers_bottom` for each delimiter
type (`*`, `_`) and each length of the closing delimiter run
(modulo 3). Initialize this to `stack_bottom`.

Then we repeat the following until we run out of potential
closers:

- Move `current_position` forward in the delimiter stack (if needed)
  until we find the first potential closer with delimiter `*` or `_`.
  (This will be the potential closer closest
  to the beginning of the input -- the first one in parse order.)

- Now, look back in the stack (staying above `stack_bottom` and
  the `openers_bottom` for this delimiter type) for the
  first matching potential opener ("matching" means same delimiter).

- If one is found:

  + Figure out whether we have emphasis or strong emphasis:
    if both closer and opener spans have length >= 2, we have
    strong, otherwise regular.

  + Insert an emph or strong emph node accordingly, after
    the text node corresponding to the opener.

  + Remove any delimiters between the opener and closer from
    the delimiter stack.

  + Remove 1 (for regular emph) or 2 (for strong emph) delimiters
    from the opening and closing text nodes. If they become empty
    as a result, remove them and remove the corresponding element
    of the delimiter stack. If the closing node is removed, reset
    `current_position` to the next element in the stack.

- If none is found:

  + Set `openers_bottom` to the element before `current_position`.
    (We know that there are no openers for this kind of closer up to and
    including this point, so this puts a lower bound on future searches.)

  + If the closer at `current_position` is not a potential opener,
    remove it from the delimiter stack (since we know it can't
    be a closer either).

  + Advance `current_position` to the next element in the stack.

After we're done, we remove all delimiters above `stack_bottom` from the
delimiter stack.
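
The *process emphasis* loop can be sketched in Python as follows. This is
a simplified illustration under stated assumptions, not the reference
implementation. The names `Delim`, `literal`, and `process_emphasis` are
hypothetical; a Python list stands in for the doubly linked delimiter
stack; `stack_bottom` is taken to be NULL; the `openers_bottom` bookkeeping
is omitted; "matching" means only "same delimiter character", as in the
description above; and HTML strings are emitted directly instead of
`emph` and `strong` nodes.

``` python
from dataclasses import dataclass


@dataclass
class Delim:
    """A run of emphasis delimiters (hypothetical type for this sketch)."""
    char: str        # "*" or "_"
    count: int       # delimiters still unused in this run
    can_open: bool   # potential opener
    can_close: bool  # potential closer


def literal(node):
    """Render a node as text: strings as-is, unused delimiters literally."""
    return node if isinstance(node, str) else node.char * node.count


def process_emphasis(nodes):
    """Resolve emphasis in a list of str and Delim items; return HTML."""
    nodes = list(nodes)
    i = 0  # plays the role of current_position
    while i < len(nodes):
        closer = nodes[i]
        if not (isinstance(closer, Delim) and closer.can_close):
            i += 1
            continue
        # Look back for the nearest potential opener with the same delimiter.
        j = i - 1
        while j >= 0:
            cand = nodes[j]
            if isinstance(cand, Delim) and cand.can_open and cand.char == closer.char:
                break
            j -= 1
        if j < 0:
            # No opener: a closer that cannot also open becomes literal text.
            if not closer.can_open:
                nodes[i] = literal(closer)
            i += 1
            continue
        opener = nodes[j]
        used = 2 if opener.count >= 2 and closer.count >= 2 else 1
        tag = "strong" if used == 2 else "em"
        # Delimiters between opener and closer leave the stack as literal text.
        inner = "".join(literal(n) for n in nodes[j + 1:i])
        nodes[j + 1:i] = [f"<{tag}>{inner}</{tag}>"]
        opener.count -= used
        closer.count -= used
        i = j + 2  # the closer now sits right after the wrapped node
        if opener.count == 0:
            del nodes[j]
            i -= 1
        if closer.count == 0:
            del nodes[i]  # i now names the next element
    return "".join(literal(n) for n in nodes)


# Delimiter runs for the input "*foo **bar** baz*", flags assigned by hand.
tokens = [Delim("*", 1, True, False), "foo ", Delim("*", 2, True, False), "bar",
          Delim("*", 2, False, True), " baz", Delim("*", 1, False, True)]
print(process_emphasis(tokens))  # <em>foo <strong>bar</strong> baz</em>
```

A closer that still has delimiters left after a match is examined again
without advancing, which is how a run such as `***` yields both a strong
and a regular emphasis around the same content.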