Markua Spec

Version 0.29 (2020-05-28)
This formal specification of Markua was developed at Leanpub, is written by Peter Armstrong, is based on the CommonMark Spec by John MacFarlane, and is licensed under the Creative Commons license CC-BY-SA 4.0.
CommonMark Spec by John MacFarlane is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

1 Introduction

1.1 The magical typewriter (M)

Imagine you owned a magical typewriter.

When you used this magical typewriter, you wrote with fewer distractions. You didn’t just write faster, you wrote better.

With your magical typewriter, you never worried about layout. The book formatted itself.

You could hit a key on your magical typewriter to create an ebook from your manuscript with one click.

All ebook formats would be created, and they’d all look good. You’d have PDF for computers, MOBI for Kindle, and EPUB for everywhere else. The book would look great on phones.

With your magical typewriter, you could publish your book before it was even done, and get feedback from readers all over the world. You could automatically share book updates with them. You would press one key on your magical typewriter to publish a new version, and all your readers would have it instantly.

With your magical typewriter, you could easily compare your current manuscript to any other version of your manuscript that had ever existed.

If you decided to make a print book, you could press a key on your magical typewriter to get a print-ready PDF. All you would need to do is add a cover.

If you wanted to work with a designer, you could press a different key to generate InDesign. They could then use this as a starting point for producing a beautiful print book.

Or, if you wanted to work with a publisher, you could press a different key to get a Word document.

Your magical typewriter could even transform your completed book manuscript into a course that anyone in the world could take. All you’d need to do is to add some quizzes and exercises and then press a key for your magical typewriter to publish a massive open online course (MOOC) for you. The quizzes and exercises would mark themselves, and students would get certificates based on how well they did.

With your magical typewriter, you’d only have to do one thing:

Write.

Wouldn’t it be great if such a magical typewriter existed?

It does. At Leanpub, we’re building it.

But there’s one requirement for this magical typewriter to exist: a simple, coherent, open, free, plain text format for a book or course manuscript.

This simple format will be the basis for the magical typewriter.

This simple format is called Markua.

This is its specification.

1.2 For Leanpub authors (M)

There are currently two ways to write in plain text on Leanpub. These are their user manuals:

  1. Markua (preferred)
  2. Leanpub Flavoured Markdown (LFM)

If you’re a Leanpub author, you should probably go read one of those manuals instead of this specification.

Specifically, not all Markua features are supported on Leanpub. To learn what works and what does not work, you need to see the manual.

1.3 What is this document? (M)

This is the Markua Spec, written by Peter Armstrong.

The Markua Spec is based on the CommonMark Spec, whose author is John MacFarlane.

This document also incorporates the specification of two of the five GitHub Flavored Markdown (GFM) extensions from the GFM Spec.

Like the CommonMark Spec and the GFM Spec, the Markua Spec is licensed under the Creative Commons license CC-BY-SA 4.0.

This document has two types of intended readers:

  1. authors with advanced questions about Markua syntax
  2. developers implementing Markua Processors

This Introduction contains a number of introductory sections about Markua and its relationship to Markdown, CommonMark and GFM. It also contains sections which discuss Leanpub, to minimize confusion for Leanpub authors who encounter this document.

If you’re going to read this document, you really should read the Introduction, even if you normally skip introductions.

1.4 How to write a novel in Markua (M)

The Markua specification is long. However, if you’re an author, the amount you need to learn to write in Markua is actually very small.

This example shows everything you need to know to write a novel in Markua:


# Chapter One

This is a paragraph. You just write.

Blank lines separate paragraphs. This is *italic text*.

This is another paragraph. You can manually wrap your paragraphs
however you want. Single newlines function like single spaces by
default.

* * *

That was a thematic break. "This is in quotes."

# Chapter Two

This is a paragraph in a new chapter.

By the way, everything in the above example is identical in the following Markdown dialects:

In the simple case, Markua is just a mapping from Markdown concepts to book concepts. This is deliberate.

1.5 How to write almost any book in Markua (M)

The longer example below shows everything you need to know to write almost any type of book, such as a computer programming book, in Markua. This example also serves as a tutorial which explains most of Markua’s important syntax in its content. Some parts of this syntax are just standard Markdown or CommonMark syntax, and other parts are Markua extensions.

# Paragraphs

This is a paragraph. You just write.

Blank lines separate paragraphs. This is *italic text* and **this is bold**.

This is another paragraph. You can manually wrap your paragraphs
however you want. Single newlines function like single spaces by
default.

* * *

That was a thematic break. "This is in quotes."

# Lists

Here's a numbered list (called an "ordered list", even though all lists are
ordered):

1. foo
2. bar
3. baz

Here's a bulleted list (called an "unordered list", for irony):

* foo
* bar
* baz

You can even have definition lists!

term 1
: definition 1a
: definition 1b

term 2
: definition 2

# Tables

You can also use tables, which work best for numeric tabular data
involving a small number of columns containing small numbers:

| Central Bank | Rate      |
|--------------|-----------|
| JPY          | -0.10%    |
| EUR          |  0.00%    |
| USD          |  0.00%    |
| CAD          |  0.25%    |

Note that definition lists are preferred to tables for most use cases,
since typing text in a table quickly gets annoying.

# Headings

Markua supports both of Markdown's heading styles.

The preferred style, called atx headers, has the following meaning in Markua:

```
{class: part}
# Part

This is a paragraph.

# Chapter

This is a paragraph.

## Section

This is a paragraph.

### Sub-section

This is a paragraph.

#### Sub-sub-section

This is a paragraph.

##### Sub-sub-sub-section

This is a paragraph.

###### Sub-sub-sub-sub-section

This is a paragraph.
```

Note the use of three backticks in the above example, to treat the Markua like
inline code.

The other style of headers, called Setext headers, has the following headings:

```
{class: part}
Part
====

This is a paragraph.

Chapter
=======

This is a paragraph.

Section
-------

This is a paragraph.
```

Setext headers look nice, but only if you're only using chapters and sections.
If you want to add sub-sections (or lower), you'll be using atx headers for at
least some of your headers. My advice is to just use atx headers all the time.
(The `{class: part}` attribute list on a chapter header to make a part header
does actually work with Setext headers, but it's really ugly.)

Finally, note that while it is confusing and ugly to mix and match using Setext
and atx headers for chapters and sections in the same document, you can do it.
However, please don't.

# Images

You can add images like this:

![](mac.jpg)

You can specify the alt text and figure caption like this:

![alt text for accessibility](image.png "Figure Caption")

An example of the difference between alt text and a figure caption is:

![a red apple](mac.jpg "The Original Mac")

You can also set the alt text and/or the figure caption in an attribute list:

{alt: "a red apple", caption: "The Original Mac"}
![](mac.jpg)

Attribute lists are one of Markua's additions to Markdown, and are discussed
later in this specification.

If you wish, you can configure a Markua Processor to convert alt text into a
figure caption if no separate figure caption is provided. (A figure caption is
just as accessible as alt text, so this is fine.) Global settings are discussed
next.

# Global Settings

Various global settings can be specified at the start of a Markua document to
affect the behaviour of the Markua Processor. For example, you can set the
`alt-caption` setting to `all` to make images use the alt text as the figure
caption instead of as alt text:

```
{
alt-caption: all
}

# Chapter One

This is an apple with a figure caption but no alt text:

![The Original McIntosh, a red apple](mac.jpg)
```

This gets you the terseness of inserting images as figures in LFM, in a general
way. Note that if the `alt-caption` global setting is set to `text` (the
default) this behaviour only happens for text-based resources like code samples.
If the `alt-caption` is set to `none`, it never happens.

The generalization of the Markdown image syntax into resources, and the addition
of global settings, are two more of Markua's additions to Markdown. These are
also both discussed later in this specification.

One important type of resource in Markua is code samples, which is discussed
next.

# Code Samples

The image syntax is the inspiration for the syntax for external code samples:

![](hello.rb "Hello World in Ruby")

Just like with images, you can also use an attribute list for the caption:

{caption: "Hello World in Ruby"}
![](hello.rb)

Note that you can also specify the figure caption as the alt text, as long as
the global setting of `alt-caption` is not set to `none`:

![Hello World in Ruby](hello.rb)

You can also have inline code samples, which can only have a caption using an
attribute list:

{caption: "Hello World in Ruby"}
```ruby
puts "hello"
```

You can also include single lines of code like `puts "hello"` in paragraphs
using backticks.

# Other Stuff

Note that you can easily add math `d = v_i t + \frac{1}{2} a t^2`$ as well,
either inline in a paragraph or as a figure, using LaTeX math as the format.

> Blockquotes are really easy too.
> --Peter Armstrong, *Markua Spec*

Markua has lots of features not discussed in this example. Read the manual or
the rest of the spec!
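
The `alt-caption` behaviour described in the longer example above can be sketched in Python. This is only an illustration of the three values (`all`, `text`, `none`) as the example describes them, not a normative algorithm. The function name, its signature, and the `kind` labels are hypothetical, and which resource kinds count as "text-based" is an assumption made here.

```python
from typing import Optional, Tuple

# Assumption: the resource kinds treated as text-based. The example above
# only names code samples explicitly.
TEXT_KINDS = frozenset({"code"})


def resolve_alt_and_caption(
    alt: Optional[str],
    caption: Optional[str],
    kind: str,
    alt_caption: str = "text",  # "text" is the default per the example above
) -> Tuple[Optional[str], Optional[str]]:
    """Return (alt, caption) after applying the alt-caption global setting."""
    if alt_caption not in ("all", "text", "none"):
        raise ValueError(f"unknown alt-caption value: {alt_caption!r}")

    # A separate figure caption always wins; nothing to promote without alt text.
    if caption is not None or alt is None:
        return alt, caption

    promote = alt_caption == "all" or (
        alt_caption == "text" and kind in TEXT_KINDS
    )
    if promote:
        # The alt text is used as the caption instead of as alt text.
        return None, alt
    return alt, None
```

With `alt_caption="all"`, the image from the example ends up with a figure caption and no alt text. With the default `text`, only the code sample's alt text is promoted.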

1.6 Markua: Markdown for books and courses (M)

Markua, pronounced “mar-coo-ah”, is Markdown for books and courses.

Markua maps Markdown syntax to book concepts, and then adds some new syntax and concepts of its own.

Markua documents can be automatically transformed into every popular type of ebook format. The computer programs which do this transformation are called Markua Processors. These programs understand both Markua syntax and how to generate the various output formats. An example Markua Processor is Leanpub: Leanpub can output PDF, EPUB, MOBI and HTML from the same Markua document, and can even output print-ready PDFs and InDesign files from it as well.

Markua has been developed with extensive real-world testing and feedback. Markua has been used by Leanpub authors for years, to create both books and online courses.

Markua was started at Leanpub in 2014, and benefited from the years of lessons that Leanpub had learned from Leanpub Flavoured Markdown (LFM). Markua is the successor to LFM. We have been iterating on our Markua support for many years.

Markua’s three main contributions are the following:

  1. The mapping of Markdown headings (h1, h2, h3, etc.) to book structures (parts, chapters, sections, sub-sections, etc.), which provides the ability for Markua Processors like Leanpub to produce an ebook or online course from a Markua manuscript with one click.
  2. The unified resource and attribute list syntax, which lets Markua handle audio, code, math and video in the same way that it does images, and which supports inline, local and web resource locations.
  3. The mapping of book structures to courses, and the creation of a plain-text based microformat for course quizzes and exercises, which supports automated marking and automated production of everything which is required for a MOOC.
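
The first contribution, the heading-to-structure mapping, can be illustrated with a small Python sketch. The structure names come from the atx heading example earlier in this Introduction. The function name and the dictionary-based attribute list are hypothetical, and treating `{class: part}` as meaningful only on level-1 headings is an assumption based on that example.

```python
from typing import Dict, Optional

# Heading level -> book structure, as listed in the atx heading example.
STRUCTURE_BY_LEVEL = {
    1: "chapter",
    2: "section",
    3: "sub-section",
    4: "sub-sub-section",
    5: "sub-sub-sub-section",
    6: "sub-sub-sub-sub-section",
}


def book_structure(level: int, attrs: Optional[Dict[str, str]] = None) -> str:
    """Map a Markdown heading level (1-6) to the book structure it denotes."""
    if level not in STRUCTURE_BY_LEVEL:
        raise ValueError(f"heading level must be 1-6, got {level}")
    # A chapter heading carrying {class: part} becomes a part heading.
    if level == 1 and (attrs or {}).get("class") == "part":
        return "part"
    return STRUCTURE_BY_LEVEL[level]
```

For example, `book_structure(1)` gives `"chapter"`, while `book_structure(1, {"class": "part"})` gives `"part"`.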

If you have written something (say blog posts or lecture notes) in Markdown, you can use a Markua Processor, such as Leanpub, to turn it into an ebook or course with one click. Then, as you go down the path of enhancing the manuscript and adding things which only make sense in books or courses, this process will feel like decorating, not converting.

The goal is for writers who are familiar with Markdown to feel that with Markua, Markdown just grew an understanding of book and course concepts.

1.7 Why the name “Markua”? (M)

When I set out to specify Markua, I realized I needed a name. I wanted a name that conveyed the love that I have for Markdown while not implying endorsement by John Gruber in any way. I also did not want a name which referenced Leanpub: Markua is a standalone specification with its own identity, which anyone (including Leanpub competitors) can freely implement. Finally, I was on vacation in Hawaii when I named Markua, and I wanted something that sounded happy, friendly and almost Hawaiian. (Yes, I’m aware that there is no r in Hawaiian.) I also wanted a name that had its .com domain name available, and that was short and spellable, for branding purposes. The Markua name had all these properties.

1.8 What is Leanpub Flavoured Markdown (LFM)? (M)

Leanpub Flavoured Markdown (LFM) was Leanpub’s original attempt to extend Markdown to support book authoring. It started in 2011 and evolved with feedback from Leanpub authors until 2014, when Markua was launched.

Leanpub books can currently be written in LFM or Markua. Leanpub courses can currently only be written in Markua, however.

LFM is the predecessor of Markua. LFM is specified in the Leanpub Flavoured Markdown Spec.

Leanpub authors should consult the LFM manual instead of the LFM Spec, since it provides a more accurate description of Leanpub’s LFM support.

The reason for this divergence is simple: the LFM Spec is based on the CommonMark Spec, but LFM is not based on CommonMark. (CommonMark and the CommonMark Spec did not even exist when LFM was completed.)

Leanpub is working on a version (“Spec Mode”) of its LFM support which will be based on CommonMark. When this is complete, Leanpub’s existing LFM Support will continue to be supported unchanged for many years, and will be rebranded as “LFM Quirks Mode”.

1.9 How does Leanpub’s current Markua support match the Markua Spec? (M)

Markua is not yet fully supported on Leanpub.

Even though the Markua Spec is based on the CommonMark Spec, Leanpub’s current Markua support is not based on CommonMark. Markua and CommonMark were both launched in 2014, within a few months of each other. Leanpub’s Markua implementation was based on its implementation of LFM, which started in 2011.

Leanpub is working on a version (“Spec Mode”) of its Markua support which will be based on CommonMark. When this is complete, Leanpub’s existing Markua Support will continue to be supported unchanged for many years, and will be rebranded as “Markua Quirks Mode”.

1.10 Converting between LFM and Markua (M)

The launch of Spec Mode for both LFM and Markua on Leanpub will give authors of existing Leanpub books and courses the following choice for each of their books or courses:

  1. Leave their book or course in Quirks Mode, for unchanged behaviour and performance.
  2. Switch their book or course to Spec Mode, for faster book or course generation and better compliance with the LFM or Markua specs.
  3. Click a button to save a version of their existing manuscript and then automatically convert their LFM manuscript into a Markua manuscript.

To be clear, conversions from LFM to Markua will only be done in Spec Mode for both LFM and Markua. (The conversion will be between LFM Spec Mode and Markua Spec Mode.)

1.11 Differences between LFM and Markua (M)

When considering converting between LFM and Markua (both in Spec Mode), it’s important to highlight all the differences between Markua and LFM.

The following are the top 10 differences between Markua and LFM. These differences apply to LFM as currently implemented in Leanpub (in Quirks Mode), and they will also exist in LFM Spec Mode once it exists.

  1. In LFM, parts are created with -# Part. In Markua, parts are created with an attribute list on an atx or Setext heading.
  2. In LFM, there is a special syntax for inserting code samples: <<[Optional Title](code/some_code_file.rb). In Markua, however, code is just a resource, and the LFM syntax is not supported.
  3. In LFM, to mark code as added or removed, the magic words were leanpub-start-insert, leanpub-end-insert, leanpub-start-delete and leanpub-end-delete. In Markua, the magic words are markua-start-insert, markua-end-insert, markua-start-delete and markua-end-delete.
  4. In LFM, there is a special syntax for inserting math: {$$}...{/$$}. This looks nice to people who like LaTeX, and looks like nothing else in Markdown. In Markua, however, math is just another resource, and that LaTeX-inspired syntax for wrapping math resources is not supported.
  5. In LFM, there are G> “generic boxes”. In Markua, these are replaced with blurbs (B>).
  6. LFM had the C> syntax to center text, but we didn’t have the same effect on generic boxes, and blurbs did not exist. In Markua, a C> syntax is just syntactic sugar for a centered blurb, for greater consistency. Because of this, the blurb also gets the ability to be centered by adding a {class: center} attribute.
  7. LFM had {rtl} and {ltr} directives. These are not supported in Markua, and neither is a {dir} attribute in general: any given language is either a left-to-right or a right-to-left language, so specifying the language with the lang document setting and the {lang: ___} directive is sufficient.
  8. LFM used Sample.txt to define the sample content. Markua moves the definition of what constitutes sample content into a {sample: true} attribute on parts, chapters and sections. So, in Markua, inclusion in the sample is at the content level, not the file level. This helps avoid a number of bugs that could happen with including at the file level, if a file did not clearly break at a chapter boundary. (So, in Leanpub, the Sample.txt approach is not supported for books which use Markua.)
  9. LFM used Book.txt to define the book content. In Markua, the way that the list of manuscript files is defined is considered out of scope of the Markua specification itself. (Leanpub still uses the Book.txt approach to specify which files are in a book, but other Markua Processors could use other approaches, or could just support parsing single files or input streams.)
  10. Most importantly, a number of features (courses, index entries, smart crosslinks, etc.) exist in Markua Spec Mode that do not exist in LFM Spec Mode. Compare the two specs for the full details.

To emphasize point #10: the differences between Markua Spec Mode and LFM Spec Mode are not just incompatibilities: Markua Spec Mode has many more features than LFM Spec Mode. This is why Leanpub will support conversion from LFM Spec Mode to Markua Spec Mode, but not vice-versa.

1.12 Structure of this document (M)

The Markua Spec is a strange hybrid of the CommonMark Spec, the GFM Spec and the specification of the Markua extensions.

The starting point is the CommonMark Spec. Almost the entire CommonMark Spec is preserved unchanged.

Besides the numerous extensions which were added, the following are the changes to the CommonMark Spec:

  1. The “HTML blocks” section is removed from the Leaf blocks chapter.
  2. The “Raw HTML” section is removed from the Inlines chapter.
  3. Since the Metadata and Quizzes and exercises chapters are new, the numbering of some chapters is changed.
  4. The metadata at the top of the spec itself is changed.

The reason that HTML tags are not treated as HTML is discussed in the No HTML Blocks section.

Next, two of the five GFM extensions are added. These are discussed below.

Finally, many new sections about Markua are added. These are all labeled “(M)” for “(Markua extension)”. There are so many of these extensions that using (Markua extension) everywhere added too much clutter.

So, for any given chapter or section of this document, it is either:

  1. written by John MacFarlane about CommonMark
  2. written by GitHub employees about GFM, and ending with “(GFM)”
  3. written by Peter Armstrong about Markua, and ending with “(M)”

To be clear, every section which does not end with (M) or (GFM) is from the original CommonMark Spec, and was written by John MacFarlane.

Together, these sections combine to specify Markua, since Markua is CommonMark, minus HTML blocks and Raw HTML, plus some GFM extensions, plus (many!) Markua extensions.

While this hybrid approach of creating the Markua specification is somewhat odd, it has a number of significant benefits:

  1. The CommonMark Spec is very good, and is licensed under the Creative Commons license CC-BY-SA 4.0. Not starting with the CommonMark Spec would be an exercise in vanity.
  2. The CommonMark Spec is the de facto standard specification of Markdown. Any specification not based on CommonMark will need to be compared to CommonMark, so the most helpful thing to do is to just start with CommonMark.
  3. Writing this specification as a series of extensions to CommonMark will enable people who are implementing it to just start with a compliant CommonMark implementation and modify it appropriately.
  4. Defining all Markua changes as extensions to CommonMark or deletions from CommonMark makes it abundantly clear whether something that Markua does originates in CommonMark, or whether it was added by GFM or by Markua. This is potentially helpful for authors, since it will show them what will work in other Markdown dialects, and what won’t.
  5. It means that updating this document to keep it current with the CommonMark and GFM specs will be very easy.

1.13 GitHub Flavored Markdown (GFM) extension status (M)

Note that while Markua is a strict superset of CommonMark, it is not a strict superset of the GitHub Flavored Markdown (GFM) Spec.

The GFM Spec defines five extensions to CommonMark. In the GFM Spec, these extensions have “(extension)” in their title.

Two of these five GFM extensions are included in the Markua Spec:

  1. Tables (GFM)
  2. Strikethrough (GFM)

Their content is unchanged, but their titles are renamed to say “(GFM)” instead of “(extension)”, to more clearly specify their provenance.

The other three GFM extensions are omitted:

  1. Task list items is omitted since Markua is for books and courses, not todo lists.
  2. Disallowed Raw HTML is omitted since Markua goes farther and converts all raw HTML into text.
  3. Autolinks (GFM) is omitted, because the extra work to create a link is literally typing two characters (< and >), so the added complexity both in parsing and in documentation is not worth it. (It does make sense for, say, a discussion forum. It does not make sense for books and courses.)

1.14 About the HTML output (M)

The HTML mapping defined in this specification is NOT a complete specification of the HTML documents produced by Markua Processors.

Also, the HTML mapping defined in the specification is NOT considered to be canonical.

For example, consider the following Markua:

![a red apple](mac.jpg "The Original Mac")

A Markua Processor can produce HTML that looks like this:

<p><img src="mac.jpg" alt="a red apple" title="The Original Mac" /></p>

(That is what was produced at https://spec.commonmark.org/dingus/ on 2020-05-27, for example, and it is valid Markua as well as valid CommonMark.)

A Markua Processor can also, however, produce HTML that looks like this:

<p>
  <figure>
    <img src="mac.jpg"
         alt="a red apple">
    <figcaption>The Original Mac</figcaption>
  </figure>
</p>

Both approaches are completely legitimate, and have pros and cons.

So, why show any HTML at all? There are two reasons:

  1. The CommonMark Spec shows HTML, and the Markua Spec tries to only be additive to the CommonMark Spec.
  2. Showing HTML helps Markua Processors to test their parsers.

Finally, besides not being canonical, the HTML is not complete. It is only a specification of the parts of the HTML mapping where there is less need to have flexibility on the part of the Markua Processor.

The HTML mapping should be thought of as specifying HTML fragments, not HTML documents. In all of the examples, a Markua Processor may add more HTML before and after the relevant content, as well as changing the relevant content itself.

If a Markua Processor wishes to test that it produces the correct HTML, it should test that the HTML produced contains the specified output, not that it is identical to the specified output.
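
A containment test of this kind might look like the following Python sketch. The helper name is hypothetical. Collapsing runs of whitespace before comparing is an assumption added here for robustness and is not something this specification requires.

```python
def contains_expected_html(actual: str, expected: str) -> bool:
    """Return True if the expected HTML fragment appears in the produced HTML.

    Runs of whitespace are collapsed to single spaces on both sides first
    (an assumption of this sketch), then a plain substring test is used.
    """
    def normalize(html: str) -> str:
        return " ".join(html.split())

    return normalize(expected) in normalize(actual)


# The specified fragment for the apple image example above.
expected = '<p><img src="mac.jpg" alt="a red apple" title="The Original Mac" /></p>'

# A processor is free to wrap the fragment in additional HTML.
produced = "<html><body>\n" + expected + "\n</body></html>"

assert contains_expected_html(produced, expected)
```

This checks containment only. It does not attempt to recognize structurally different but equally legitimate output, such as the `<figure>` variant shown above.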

Every example uses the default Markua document settings, unless otherwise specified. Any custom settings are specified in a document settings list at the top of the example.

1.15 What is Markdown?

Markdown is a plain text format for writing structured documents, based on conventions for indicating formatting in email and usenet posts. It was developed by John Gruber (with help from Aaron Swartz) and released in 2004 in the form of a syntax description and a Perl script (Markdown.pl) for converting Markdown to HTML. In the next decade, dozens of implementations were developed in many languages. Some extended the original Markdown syntax with conventions for footnotes, tables, and other document elements. Some allowed Markdown documents to be rendered in formats other than HTML. Websites like Reddit, StackOverflow, and GitHub had millions of people using Markdown. And Markdown started to be used beyond the web, to author books, articles, slide shows, letters, and lecture notes.

What distinguishes Markdown from many other lightweight markup syntaxes, which are often easier to write, is its readability. As Gruber writes:

The overriding design goal for Markdown’s formatting syntax is to make it as readable as possible. The idea is that a Markdown-formatted document should be publishable as-is, as plain text, without looking like it’s been marked up with tags or formatting instructions. (http://daringfireball.net/projects/markdown/)

The point can be illustrated by comparing a sample of AsciiDoc with an equivalent sample of Markdown. Here is a sample of AsciiDoc from the AsciiDoc manual:

1. List item one.
+
List item one continued with a second paragraph followed by an
Indented block.
+
.................
$ ls *.sh
$ mv *.sh ~/tmp
.................
+
List item continued with a third paragraph.

2. List item two continued with an open block.
+
--
This paragraph is part of the preceding list item.

a. This list is nested and does not require explicit item
continuation.
+
This paragraph is part of the preceding list item.

b. List item b.

This paragraph belongs to item two of the outer list.
--

And here is the equivalent in Markdown:

1.  List item one.

    List item one continued with a second paragraph followed by an
    Indented block.

        $ ls *.sh
        $ mv *.sh ~/tmp

    List item continued with a third paragraph.

2.  List item two continued with an open block.

    This paragraph is part of the preceding list item.

    1. This list is nested and does not require explicit item continuation.

       This paragraph is part of the preceding list item.

    2. List item b.

    This paragraph belongs to item two of the outer list.

The AsciiDoc version is, arguably, easier to write. You don’t need to worry about indentation. But the Markdown version is much easier to read. The nesting of list items is apparent to the eye in the source, not just in the processed document.

1.16 Why is a spec needed?

John Gruber’s canonical description of Markdown’s syntax does not specify the syntax unambiguously. Here are some examples of questions it does not answer:

  1. How much indentation is needed for a sublist? The spec says that continuation paragraphs need to be indented four spaces, but is not fully explicit about sublists. It is natural to think that they, too, must be indented four spaces, but Markdown.pl does not require that. This is hardly a “corner case,” and divergences between implementations on this issue often lead to surprises for users in real documents. (See this comment by John Gruber.)

  2. Is a blank line needed before a block quote or heading? Most implementations do not require the blank line. However, this can lead to unexpected results in hard-wrapped text, and also to ambiguities in parsing (note that some implementations put the heading inside the blockquote, while others do not). (John Gruber has also spoken in favor of requiring the blank lines.)

  3. Is a blank line needed before an indented code block? (Markdown.pl requires it, but this is not mentioned in the documentation, and some implementations do not require it.)

    paragraph
        code?
    
  4. What is the exact rule for determining when list items get wrapped in <p> tags? Can a list be partially “loose” and partially “tight”? What should we do with a list like this?

    1. one
    
    2. two
    3. three
    

    Or this?

    1.  one
        - a
    
        - b
    2.  two
    

    (There are some relevant comments by John Gruber here.)

  5. Can list markers be indented? Can ordered list markers be right-aligned?

     8. item 1
     9. item 2
    10. item 2a
    
  6. Is this one list with a thematic break in its second item, or two lists separated by a thematic break?

    * a
    * * * * *
    * b
    
  7. When list markers change from numbers to bullets, do we have two lists or one? (The Markdown syntax description suggests two, but the perl scripts and many other implementations produce one.)

    1. fee
    2. fie
    -  foe
    -  fum
    
  8. What are the precedence rules for the markers of inline structure? For example, is the following a valid link, or does the code span take precedence?

    [a backtick (`)](/url) and [another backtick (`)](/url).
    
  9. What are the precedence rules for markers of emphasis and strong emphasis? For example, how should the following be parsed?

    *foo *bar* baz*
    
  10. What are the precedence rules between block-level and inline-level structure? For example, how should the following be parsed?

    - `a long code span can contain a hyphen like this
      - and it can screw things up`
    
  11. Can list items include section headings? (Markdown.pl does not allow this, but does allow blockquotes to include headings.)

    - # Heading
    
  12. Can list items be empty?

    * a
    *
    * b
    
  13. Can link references be defined inside block quotes or list items?

    > Blockquote [foo].
    >
    > [foo]: /url
    
  14. If there are multiple definitions for the same reference, which takes precedence?

    [foo]: /url1
    [foo]: /url2
    
    [foo][]
    

In the absence of a spec, early implementers consulted Markdown.pl to resolve these ambiguities. But Markdown.pl was quite buggy, and gave manifestly bad results in many cases, so it was not a satisfactory replacement for a spec.

Because there is no unambiguous spec, implementations have diverged considerably. As a result, users are often surprised to find that a document that renders one way on one system (say, a GitHub wiki) renders differently on another (say, converting to docbook using pandoc). To make matters worse, because nothing in Markdown counts as a “syntax error,” the divergence often isn’t discovered right away.

1.17 About this document

This document attempts to specify Markdown syntax unambiguously. It contains many examples with side-by-side Markdown and HTML. These are intended to double as conformance tests. An accompanying script spec_tests.py can be used to run the tests against any Markdown program:

python test/spec_tests.py --spec spec.txt --program PROGRAM

Since this document describes how Markdown is to be parsed into an abstract syntax tree, it would have made sense to use an abstract representation of the syntax tree instead of HTML. But HTML is capable of representing the structural distinctions we need to make, and the choice of HTML for the tests makes it possible to run the tests against an implementation without writing an abstract syntax tree renderer.

This document is generated from a text file, spec.txt, written in Markdown with a small extension for the side-by-side tests. The script tools/makespec.py can be used to convert spec.txt into HTML or CommonMark (which can then be converted into other formats).

In the examples, the → character is used to represent tabs.

2Preliminaries

2.1Characters and lines

Any sequence of characters is a valid CommonMark document.

A character is a Unicode code point. Although some code points (for example, combining accents) do not correspond to characters in an intuitive sense, all code points count as characters for purposes of this spec.

This spec does not specify an encoding; it thinks of lines as composed of characters rather than bytes. A conforming parser may be limited to a certain encoding.

A line is a sequence of zero or more characters other than newline (U+000A) or carriage return (U+000D), followed by a line ending or by the end of file.

A line ending is a newline (U+000A), a carriage return (U+000D) not followed by a newline, or a carriage return and a following newline.

A line containing no characters, or a line containing only spaces (U+0020) or tabs (U+0009), is called a blank line.
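The definitions of line, line ending, and blank line above can be sketched in a few lines of Python. This is only an illustration, not part of the spec: the function names `split_lines` and `is_blank` are made up for this example, and treating an empty input as zero lines is an assumption.

```python
import re

# A line ending is CR+LF, a lone CR, or a lone LF. CR+LF is listed first
# so that it is consumed as a single line ending.
LINE_ENDING = re.compile(r"\r\n|\r|\n")

def split_lines(text):
    """Split text into lines, dropping the line endings themselves."""
    lines = LINE_ENDING.split(text)
    # A final line ending terminates the last line; it does not start a new one.
    # (Assumption: an empty input is treated as having no lines.)
    if lines[-1] == "":
        lines.pop()
    return lines

def is_blank(line):
    """A blank line has no characters, or only spaces (U+0020) and tabs (U+0009)."""
    return all(ch in " \t" for ch in line)
```

For instance, `split_lines("a\r\nb\rc\n")` yields three lines, one for each of the three kinds of line ending.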

The following definitions of character classes will be used in this spec:

A whitespace character is a space (U+0020), tab (U+0009), newline (U+000A), line tabulation (U+000B), form feed (U+000C), or carriage return (U+000D).

Whitespace is a sequence of one or more whitespace characters.

A Unicode whitespace character is any code point in the Unicode Zs general category, or a tab (U+0009), carriage return (U+000D), newline (U+000A), or form feed (U+000C).

Unicode whitespace is a sequence of one or more Unicode whitespace characters.

A space is U+0020.

A non-whitespace character is any character that is not a whitespace character.

An ASCII control character is a character in the range U+0000–1F (inclusive) or U+007F.

An ASCII punctuation character is !, ", #, $, %, &, ', (, ), *, +, ,, -, ., / (U+0021–2F), :, ;, <, =, >, ?, @ (U+003A–0040), [, \, ], ^, _, ` (U+005B–0060), {, |, }, or ~ (U+007B–007E).

A punctuation character is an ASCII punctuation character or anything in the general Unicode categories Pc, Pd, Pe, Pf, Pi, Po, or Ps.
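The character classes defined above map fairly directly onto Python's `unicodedata` module. The following is a minimal sketch; the function names are hypothetical, and the results for non-ASCII characters depend on the Unicode version that ships with the Python interpreter.

```python
import unicodedata

# Space, tab, newline, line tabulation, form feed, carriage return.
WHITESPACE = {"\u0020", "\u0009", "\u000A", "\u000B", "\u000C", "\u000D"}
ASCII_PUNCTUATION = set("!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~")
PUNCTUATION_CATEGORIES = {"Pc", "Pd", "Pe", "Pf", "Pi", "Po", "Ps"}

def is_whitespace(ch):
    return ch in WHITESPACE

def is_unicode_whitespace(ch):
    # Zs general category, plus tab, carriage return, newline and form feed.
    return unicodedata.category(ch) == "Zs" or ch in "\t\r\n\f"

def is_ascii_control(ch):
    return ord(ch) <= 0x1F or ord(ch) == 0x7F

def is_punctuation(ch):
    return (ch in ASCII_PUNCTUATION
            or unicodedata.category(ch) in PUNCTUATION_CATEGORIES)
```

Note that the two whitespace classes are not nested: line tabulation (U+000B) is a whitespace character but not a Unicode whitespace character, while a no-break space (U+00A0) is the reverse.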

2.2Tabs

Tabs in lines are not expanded to spaces. However, in contexts where whitespace helps to define block structure, tabs behave as if they were replaced by spaces with a tab stop of 4 characters.

Thus, for example, a tab can be used instead of four spaces in an indented code block. (Note, however, that internal tabs are passed through as literal tabs, not expanded to spaces.)

Example 1
→foo→baz→→bim
<pre><code>foo→baz→→bim
</code></pre>
Example 2
  →foo→baz→→bim
<pre><code>foo→baz→→bim
</code></pre>
Example 3
    a→a
    ὐ→a
<pre><code>a→a
ὐ→a
</code></pre>

In the following example, a continuation paragraph of a list item is indented with a tab; this has exactly the same effect as indentation with four spaces would:

Example 4
  - foo

→bar
<ul>
<li>
<p>foo</p>
<p>bar</p>
</li>
</ul>
Example 5
- foo

→→bar
<ul>
<li>
<p>foo</p>
<pre><code>  bar
</code></pre>
</li>
</ul>

Normally the > that begins a block quote may be followed optionally by a space, which is not considered part of the content. In the following case > is followed by a tab, which is treated as if it were expanded into three spaces. Since one of these spaces is considered part of the delimiter, foo is considered to be indented six spaces inside the block quote context, so we get an indented code block starting with two spaces.

Example 6
>→→foo
<blockquote>
<pre><code>  foo
</code></pre>
</blockquote>
Example 7
-→→foo
<ul>
<li>
<pre><code>  foo
</code></pre>
</li>
</ul>
Example 8
    foo
→bar
<pre><code>foo
bar
</code></pre>
Example 9
 - foo
   - bar
→ - baz
<ul>
<li>foo
<ul>
<li>bar
<ul>
<li>baz</li>
</ul>
</li>
</ul>
</li>
</ul>
Example 10
#→Foo
<h1>Foo</h1>
Example 11
*→*→*→
<hr />
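The tab-stop rule used throughout this section can be sketched as a small column-counting function. This is an illustration under stated assumptions, not a reference implementation: `leading_indent` and its `start_column` parameter are names invented for this example.

```python
TAB_STOP = 4

def leading_indent(line, start_column=0):
    """Measure the leading spaces and tabs of `line`.

    Returns (columns, chars): the width in columns, with each tab advancing
    to the next multiple of TAB_STOP, and the number of characters consumed.
    `start_column` is the column at which `line` begins, which matters when
    a marker such as > or - has already been consumed.
    """
    column = start_column
    chars = 0
    for ch in line:
        if ch == " ":
            column += 1
        elif ch == "\t":
            column += TAB_STOP - (column % TAB_STOP)
        else:
            break
        chars += 1
    return column - start_column, chars
```

In Example 6, the text after the > starts at column 1, so `leading_indent("\t\tfoo", start_column=1)` measures 7 columns: the first tab counts as three spaces and the second as four. One of those columns belongs to the block quote delimiter, which leaves the six columns of indentation described above.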

2.3Insecure characters

For security reasons, the Unicode character U+0000 must be replaced with the REPLACEMENT CHARACTER (U+FFFD).

2.4Backslash escapes

Any ASCII punctuation character may be backslash-escaped:

Example 12
\!\"\#\$\%\&\'\(\)\*\+\,\-\.\/\:\;\<\=\>\?\@\[\\\]\^\_\`\{\|\}\~
<p>!&quot;#$%&amp;'()*+,-./:;&lt;=&gt;?@[\]^_`{|}~</p>

Backslashes before other characters are treated as literal backslashes:

Example 13
\→\A\a\ \3\φ\«
<p>\→\A\a\ \3\φ\«</p>

Escaped characters are treated as regular characters and do not have their usual Markdown meanings:

Example 14
\*not emphasized*
\<br/> not a tag
\[not a link](/foo)
\`not code`
1\. not a list
\* not a list
\# not a heading
\[foo]: /url "not a reference"
\&ouml; not a character entity
<p>*not emphasized*
&lt;br/&gt; not a tag
[not a link](/foo)
`not code`
1. not a list
* not a list
# not a heading
[foo]: /url &quot;not a reference&quot;
&amp;ouml; not a character entity</p>

If a backslash is itself escaped, the following character is not:

Example 15
\\*emphasis*
<p>\<em>emphasis</em></p>

A backslash at the end of the line is a hard line break:

Example 16
foo\
bar
<p>foo<br />
bar</p>

Backslash escapes do not work in code blocks, code spans, autolinks, or raw HTML:

Example 17
`` \[\` ``
<p><code>\[\`</code></p>
Example 18
    \[\]
<pre><code>\[\]
</code></pre>
Example 19
~~~
\[\]
~~~
<pre><code>\[\]
</code></pre>
Example 20
<http://example.com?find=\*>
<p><a href="http://example.com?find=%5C*">http://example.com?find=\*</a></p>
Example 21
<a href="/bar\/)">
<a href="/bar\/)">

But they work in all other contexts, including URLs and link titles, link references, and info strings in fenced code blocks:

Example 22
[foo](/bar\* "ti\*tle")
<p><a href="/bar*" title="ti*tle">foo</a></p>
Example 23
[foo]

[foo]: /bar\* "ti\*tle"
<p><a href="/bar*" title="ti*tle">foo</a></p>
Example 24
``` foo\+bar
foo
```
<pre><code class="language-foo+bar">foo
</code></pre>
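The core escaping rule in this section (a backslash before an ASCII punctuation character yields that character, and any other backslash is literal) can be sketched with one regular expression. The sketch below is deliberately context-free: it does not know about code spans, code blocks, autolinks, raw HTML, or hard line breaks, and the name `unescape` is hypothetical.

```python
import re

ASCII_PUNCTUATION = "!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~"

# A backslash followed by exactly one ASCII punctuation character.
ESCAPE = re.compile("\\\\([" + re.escape(ASCII_PUNCTUATION) + "])")

def unescape(text):
    """Replace each backslash escape with the character it escapes."""
    return ESCAPE.sub(r"\1", text)
```

Because the substitution scans left to right without revisiting its own output, an escaped backslash consumes both backslashes and leaves the following character unescaped, as in Example 15.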

2.5Entity and numeric character references

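Valid HTML entity references and numeric character references can be used in place of the corresponding Unicode character, with two exceptions: they are not recognized in code spans and code blocks, and they cannot be used in place of symbols indicating structure in CommonMark documents.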

Conforming CommonMark parsers need not store information about whether a particular character was represented in the source using a Unicode character or an entity reference.

Entity references consist of & + any of the valid HTML5 entity names + ;. The document https://html.spec.whatwg.org/entities.json is used as an authoritative source for the valid entity references and their corresponding code points.

Example 25
&nbsp; &amp; &copy; &AElig; &Dcaron;
&frac34; &HilbertSpace; &DifferentialD;
&ClockwiseContourIntegral; &ngE;
<p>  &amp; © Æ Ď
¾ ℋ ⅆ
∲ ≧̸</p>

Decimal numeric character references consist of &# + a string of 1–7 arabic digits + ;. A numeric character reference is parsed as the corresponding Unicode character. Invalid Unicode code points will be replaced by the REPLACEMENT CHARACTER (U+FFFD). For security reasons, the code point U+0000 will also be replaced by U+FFFD.

Example 26
&#35; &#1234; &#992; &#0;
<p># Ӓ Ϡ �</p>

Hexadecimal numeric character references consist of &# + either X or x + a string of 1-6 hexadecimal digits + ;. They too are parsed as the corresponding Unicode character (this time specified with a hexadecimal numeral instead of decimal).

Example 27
&#X22; &#XD06; &#xcab;
<p>&quot; ആ ಫ</p>

Here are some nonentities:

Example 28
&nbsp &x; &#; &#x;
&#87654321;
&#abcdef0;
&ThisIsNotDefined; &hi?;
<p>&amp;nbsp &amp;x; &amp;#; &amp;#x;
&amp;#87654321;
&amp;#abcdef0;
&amp;ThisIsNotDefined; &amp;hi?;</p>

Although HTML5 does accept some entity references without a trailing semicolon (such as &copy), these are not recognized here, because it makes the grammar too ambiguous:

Example 29
&copy
<p>&amp;copy</p>

Strings that are not on the list of HTML5 named entities are not recognized as entity references either:

Example 30
&MadeUpEntity;
<p>&amp;MadeUpEntity;</p>
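The recognition rules above can be sketched in Python. Two assumptions are worth stating loudly: the sketch uses the `html5` table from Python's standard `html.entities` module as a stand-in for entities.json, and it treats surrogate code points and values above U+10FFFF as the "invalid Unicode code points" the text mentions. The function names are hypothetical, and the sketch ignores context such as code spans.

```python
import re
from html.entities import html5

REPLACEMENT = "\uFFFD"

# Decimal (1-7 digits), hexadecimal (1-6 digits), or named reference.
# The trailing semicolon is required in every case.
REFERENCE = re.compile(
    r"&(?:#([0-9]{1,7})|#[xX]([0-9a-fA-F]{1,6})|([A-Za-z][A-Za-z0-9]*));"
)

def _from_code_point(cp):
    # U+0000 and invalid code points become the REPLACEMENT CHARACTER.
    # (Assumption: surrogates and values above U+10FFFF count as invalid.)
    if cp == 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        return REPLACEMENT
    return chr(cp)

def decode_references(text):
    def replace(match):
        decimal, hexadecimal, name = match.groups()
        if decimal is not None:
            return _from_code_point(int(decimal))
        if hexadecimal is not None:
            return _from_code_point(int(hexadecimal, 16))
        # Names that are not in the table are left as literal text.
        return html5.get(name + ";", match.group(0))
    return REFERENCE.sub(replace, text)
```

Anything the pattern does not match, such as `&copy` without a semicolon or a decimal reference with eight digits, passes through unchanged and would later be escaped as ordinary text.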

Entity and numeric character references are recognized in any context besides code spans or code blocks, including URLs, link titles, and fenced code block info strings:

Example 31
<a href="&ouml;&ouml;.html">
<a href="&ouml;&ouml;.html">
Example 32
[foo](/f&ouml;&ouml; "f&ouml;&ouml;")
<p><a href="/f%C3%B6%C3%B6" title="föö">foo</a></p>
Example 33
[foo]

[foo]: /f&ouml;&ouml; "f&ouml;&ouml;"
<p><a href="/f%C3%B6%C3%B6" title="föö">foo</a></p>
Example 34
``` f&ouml;&ouml;
foo
```
<pre><code class="language-föö">foo
</code></pre>

Entity and numeric character references are treated as literal text in code spans and code blocks:

Example 35
`f&ouml;&ouml;`
<p><code>f&amp;ouml;&amp;ouml;</code></p>
Example 36
    f&ouml;f&ouml;
<pre><code>f&amp;ouml;f&amp;ouml;
</code></pre>

Entity and numeric character references cannot be used in place of symbols indicating structure in CommonMark documents.

Example 37
&#42;foo&#42;
*foo*
<p>*foo*
<em>foo</em></p>
Example 38
&#42; foo

* foo
<p>* foo</p>
<ul>
<li>foo</li>
</ul>
Example 39
foo&#10;&#10;bar
<p>foo

bar</p>
Example 40
&#9;foo
<p>→foo</p>
Example 41
[a](url &quot;tit&quot;)
<p>[a](url &quot;tit&quot;)</p>

2.6Entity and numeric character references (M)

In PDF and other non-HTML-based output formats, a Markua Processor may choose to do the following for HTML entity and numeric character references:

  1. Convert it to the correct character in the font which it is using for body text.
  2. Convert it to a missing-character glyph, which is typically a box that looks like tofu.
  3. Convert it into an image and insert the image. For example, when encountering &#x1F602; (the HTML entity reference to the Unicode code point in hexadecimal of the “face with tears of joy” emoji), a Markua processor can just insert an image of a smiling face with tears of joy.

In HTML and in HTML-based output formats such as EPUB, the entity or numeric character reference must be output instead.

In output formats such as MOBI which are typically derived from EPUB, a Markua Processor may either output the entity or numeric character reference or it may choose to use one of the approaches used for non-HTML output formats.

2.7Emoji (M)

In 2015, emoji fully arrived. The 2015 Oxford Dictionaries Word of the Year was, in fact, the Face with Tears of Joy emoji. You may think of it as a smiling face with tears of joy, but you can also think of it as &#x1F602;, which is the HTML entity reference to its Unicode code point (in hexadecimal).

However, Unicode characters aren’t the only way to do emoji. Another popular syntax for emoji is :emoji_name: – that is a colon (:), followed by the underscore-separated name of the emoji, followed by a colon.

Markua supports both the HTML character reference approach for emoji as well as the :emoji_name: syntax.

Which emoji are supported is up to the Markua Processor.

One recommended list of emoji names is at http://www.webpagefx.com/tools/emoji-cheat-sheet/.

The preferred thing to do when a Markua Processor recognizes an emoji is to replace it with the Unicode code point HTML entity reference, assuming that those are also handled correctly:

Example 42
This makes me &#x1F602;.

This makes me :joy:.
<p>This makes me &#x1F602;.</p>
<p>This makes me &#x1F602;.</p>

However, it is also acceptable for a Markua Processor to just output the “tofu” characters for the HTML entity references while still supporting the :emoji_name: style emoji. Presumably with this approach, either a set of images or an emoji font will be used for the supported emoji.

Finally, if a Markua Processor supports Font Awesome icons using the :emoji_name: syntax, the icon names must keep the fa- prefix that they always come with. So, the Leanpub logo in Font Awesome would be output as :fa-leanpub:.
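The preferred handling of the :emoji_name: syntax, replacing a recognized name with the corresponding numeric character reference, can be sketched as follows. The table `EMOJI_CODE_POINTS` is a hypothetical one-entry stand-in for whatever list of emoji a Markua Processor chooses to support, and the function name is invented for this example.

```python
import re

# Hypothetical table: a real processor would load its own list of names.
EMOJI_CODE_POINTS = {"joy": 0x1F602}

# A colon, an emoji name, and a closing colon.
EMOJI_NAME = re.compile(r":([a-z0-9_+-]+):")

def replace_emoji_names(text):
    """Replace recognized :emoji_name: sequences with hex character references."""
    def replace(match):
        code_point = EMOJI_CODE_POINTS.get(match.group(1))
        if code_point is None:
            # Unrecognized names (including fa- icon names here) are left alone.
            return match.group(0)
        return f"&#x{code_point:X};"
    return EMOJI_NAME.sub(replace, text)
```

With this table, `replace_emoji_names("This makes me :joy:.")` produces the same text as writing the character reference directly.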

3Blocks and inlines

We can think of a document as a sequence of blocks—structural elements like paragraphs, block quotations, lists, headings, rules, and code blocks. Some blocks (like block quotes and list items) contain other blocks; others (like headings and paragraphs) contain inline content—text, links, emphasized text, images, code spans, and so on.

3.1Precedence

Indicators of block structure always take precedence over indicators of inline structure. So, for example, the following is a list with two items, not a list with one item containing a code span:

Example 43
- `one
- two`
<ul>
<li>`one</li>
<li>two`</li>
</ul>

This means that parsing can proceed in two steps: first, the block structure of the document can be discerned; second, text lines inside paragraphs, headings, and other block constructs can be parsed for inline structure. The second step requires information about link reference definitions that will be available only at the end of the first step. Note that the first step requires processing lines in sequence, but the second can be parallelized, since the inline parsing of one block element does not affect the inline parsing of any other.

3.2Container blocks and leaf blocks

We can divide blocks into two types: container blocks, which can contain other blocks, and leaf blocks, which cannot.

4Metadata (M)

Markua adds a number of extensions to support metadata in a consistent way. These include attributes (on everything), settings (at the beginning of a document) and directives (at certain points throughout the document).

All metadata is enclosed in curly braces {}.

Metadata is not actually output in the Markua document itself. Instead, it changes the behaviour of the Markua Processor, sometimes dramatically.

4.1Attributes (M)

Attributes are used to do everything from specifying the language of code blocks to adding ids for crosslinking, and even to support extensions to Markua.

4.1.1Attribute List Format

An attribute list is one or more comma-separated key-value pairs:

{one: v1, two: "v2", three: 'v3!', four: true, five: 0, six: 3.14, seven: a b}

Note that you can skip the space between the colon and the value: both {format: ruby} and {format:ruby} work. However, for consistency I recommend always adding a space.

Note that attribute values can be in no quotes, in double quotes (") or single quotes ('). Whenever an attribute contains spaces, using either double or single quotes is preferred to no quotes, but you can get away with using no quotes as long as the attribute value does not contain a comma.

The choice of double or single quotes is mostly personal taste. However, inside double quotes, a double quote must be backslash-escaped (\"); inside single quotes, a single quote must be backslash-escaped (\'). So, if your attribute value has a lot of double quotes, then it’s more convenient to wrap it in single quotes, and vice-versa.

Regardless of whether quotes are used, leading and trailing spaces are removed from all attribute values, but internal spaces within the attribute values are preserved.

An attribute list can be inserted into a Markua document in one of three ways:

  1. Immediately above a block element (e.g. heading, figure, aside, blurb, quiz, etc.), with one newline (not a blank line) separating it from the block element.
  2. Immediately after a span element (e.g. a word, italicized phrase, etc.) in normal paragraphs and in similarly-simple contexts, with no spaces separating it from the span element.
  3. On a line by itself, with one blank line above and below it. In this format, the attribute list contains directives.

One common use of attributes is to add id attributes to headings:

{id: foo}
# Chapter Foo

foo
<h1 id="foo" class="chapter">Chapter Foo</h1>
<p>foo</p>

You can also use an {#id} syntactic sugar:

{#bar}
# Chapter Bar

bar
<h1 id="bar" class="chapter">Chapter Bar</h1>
<p>bar</p>

Another common use of attributes is to add id attributes to span elements:

Here [is lorem]{id: lorem}.

This is ipsum{#ipsum}.
<p>Here <span id="lorem">is lorem</span>.</p>

<p>This is <span id="ipsum">ipsum</span>.</p>

If there is an error in the syntax of an attribute list, or if the Markua Processor does not support an attribute list in a given context, it should just ignore the attribute list and add an appropriate error.

Any line outside of a code resource which starts with an opening curly brace { and ends with a closing curly brace } is assumed to be an attribute list, and will not be output by a Markua Processor. If you want to start a line with a literal opening curly brace { you need to preface it with a backslash (\).

You cannot add attribute lists inside headers:

# Chapter Foo{id: foo}

foo
<h1 class="chapter">Chapter Foo</h1>
<p>foo</p>

4.1.2Attribute Keys

The keys of attributes must consist exclusively of lowercase letters, hyphens (-) and underscores (_). Uppercase letters are not permitted in attribute keys: a Markua Processor must treat uppercase letters in attribute keys as an error.

If a key is duplicated in an attribute list, the first key value is used and subsequent ones are ignored. A Markua Processor should add a warning to its list of warnings, which is not part of the output itself.

4.1.3Attribute Values

All attributes are text. Markua Processors should interpret text values of “true” and “false” as representing true and false. Quotes, by which I mean double quotes (") not single quotes ('), are optional for attribute values, and are only needed if the attribute value contains whitespace or special characters.

If a text attribute value contains a quote, it must be “escaped” with a backslash: e.g. {title: "\"Fresh\" Fish"}

4.1.4id Attributes

As previously discussed, there is special syntactic sugar for ids: {#foo} is equivalent to {id: foo}. However, ids are just attributes.
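The attribute list rules described so far (comma-separated pairs, optional quotes with backslash-escaped quote characters, trimmed values, lowercase keys, first duplicate wins, and the {#id} sugar) can be sketched as a small parser. This is a sketch under assumptions, not how any particular Markua Processor works: the function names are hypothetical, a quote character is assumed to appear only as a delimiter or in escaped form, {#id} is assumed to be the only item in its list, and a syntax error raises ValueError where a real processor would ignore the list and record an error.

```python
import re

KEY = re.compile(r"[a-z_-]+")  # lowercase letters, hyphens, underscores

def _split_pairs(body):
    """Split on commas that are not inside a quoted value."""
    parts, buf, quote, i = [], [], None, 0
    while i < len(body):
        ch = body[i]
        if quote and ch == "\\" and i + 1 < len(body):
            buf.append(ch + body[i + 1])  # keep the escape for _unquote
            i += 2
            continue
        if quote:
            if ch == quote:
                quote = None
        elif ch in "\"'":
            quote = ch
        elif ch == ",":
            parts.append("".join(buf))
            buf = []
            i += 1
            continue
        buf.append(ch)
        i += 1
    parts.append("".join(buf))
    return [part for part in parts if part.strip()]

def _unquote(raw):
    """Strip matching quotes, unescape the quote character, trim spaces."""
    if len(raw) >= 2 and raw[0] in "\"'" and raw[-1] == raw[0]:
        quote = raw[0]
        raw = raw[1:-1].replace("\\" + quote, quote)
    return raw.strip()

def parse_attribute_list(text):
    """Return (attributes, warnings) for a string like '{id: foo, class: bar}'."""
    text = text.strip()
    if not (text.startswith("{") and text.endswith("}")):
        raise ValueError("attribute list must be enclosed in curly braces")
    body = text[1:-1].strip()
    if body.startswith("#"):
        return {"id": body[1:].strip()}, []  # {#foo} is sugar for {id: foo}
    attributes, warnings = {}, []
    for pair in _split_pairs(body):
        key, colon, raw = pair.partition(":")  # only the first colon separates
        key = key.strip()
        if not colon or not KEY.fullmatch(key):
            raise ValueError(f"invalid attribute: {pair.strip()!r}")
        if key in attributes:
            warnings.append(f"duplicate key {key!r} ignored")
        else:
            attributes[key] = _unquote(raw.strip())
    return attributes, warnings
```

All values come back as text, so `four: true` yields the string "true"; interpreting it as a boolean is left to the caller.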

4.1.5title Attributes

Markua headings (part, chapter, section, sub-section, etc.) and figures can all have title attributes specified in an attribute list. This is text which overrides what is displayed for the heading or figure in the table of contents. For a heading, it is analogous to the title attribute on a resource inserted as a figure, which specifies the text to use for the figure in the appropriate list of figures (e.g. List of Illustrations, Table of Tables, etc.). If a heading does not have a title attribute, the text of the heading itself is used–which is quite often exactly what is desired. Use of a title attribute is always optional; it’s only used when the default behaviour of using the heading text (or the title attribute for a resource) is not appropriate, say if it’s too long.

4.1.6class Attributes

One of the attributes which is supported in every attribute list is the class attribute. This allows a Markua Processor to do smart things about formatting.

For example, if a chapter has a {class: bibliography} added to it, then the Markua Processor can then use this information to change the indentation of paragraphs within it to be hanging paragraphs.

4.1.7Conditional Inclusion Attributes on Headings

Markua headings (and only headings) may have various attributes which specify which output formats their content (of the part, chapter, section, sub-section, etc.) should be included in. If a given attribute is not present, its default value is the one specified by the nearest ancestor heading. If no such attribute is present at a top-level heading, the default is the default value that Markua defines for the attribute.

full : true or false. The default is true. If true, include this heading and its content (including nested sections, subsections, etc.) in the full book or course, including the PDF, EPUB and/or MOBI versions and the web version that is being generated. If false, omit it. Setting this to false is an easy way to “comment out” a section of your book or course.

sample : true or false. The default is false. If true, include the content under this heading (including nested sections, subsections, etc.) in the sample of the book or course that is being generated. If false, omit it. Since the default is false, by default a sample book or course is empty. Note that this attribute just governs the inclusion of the content, not the heading itself. A Markua Processor may choose to omit all headings with sample: false (either explicitly set or defaulted to false) or it may choose to include every heading in the sample version of a book or course, in order to produce a representative Table of Contents. In a case such as this where sample is false, the Markua Processor may output special content inside the chapter, section or subsection to indicate that the content itself is being omitted from the sample. This attribute applies to both the book version (PDF, EPUB and/or MOBI) and the web version of the sample book or course.

Note that specifying either of these attributes in a nested section overrides any value inherited from its ancestors, or from the default. This way, you can include a chapter in the sample, except for a specific section of the chapter.

Example:

{sample: true}
# Chapter One

This is included in the sample.

## Section One

This is included in the sample since it is contained in a chapter which is.

{sample: false}
## Section Two

This is *not* included in the sample since it is explicitly excluded, despite
the fact that the chapter is in the sample.

{sample: true}
## Section Three

This is included in the sample. This is redundant since it's in a chapter which
is.

{full: false, sample: true}
# Buy the book!

What you read was just a sample. Why not buy the full book?

# Chapter Two

This has the default values, so it is included in the book or course, but is
excluded from the sample.

To be clear: ALL conditional inclusion attributes ONLY have meaning when used as an attribute list on headings. You can only say {sample: true} immediately above a heading. You can’t have a blank line below it (otherwise it would be a directive, and not be valid) and you can’t attach it to anything other than a heading (like a paragraph, figure, etc.).
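The inheritance rule for full and sample can be sketched by walking the headings in document order and keeping a stack of ancestors. The input shape (a list of level, title, and explicit attribute triples) and the name `resolve_inclusion` are assumptions made for this example; the attribute values are shown as Python booleans for brevity.

```python
# Defaults defined above: full is true, sample is false.
DEFAULTS = {"full": True, "sample": False}

def resolve_inclusion(headings):
    """Compute effective full/sample values for each heading.

    `headings` is a list of (level, title, explicit_attributes) tuples in
    document order, where level 1 is a chapter, level 2 a section, and so on.
    """
    ancestors = []  # stack of (level, effective_attributes)
    resolved = []
    for level, title, explicit in headings:
        # Pop headings that are not ancestors of the current one.
        while ancestors and ancestors[-1][0] >= level:
            ancestors.pop()
        inherited = ancestors[-1][1] if ancestors else DEFAULTS
        # Explicit values override anything inherited.
        effective = {**inherited, **explicit}
        ancestors.append((level, effective))
        resolved.append((title, effective))
    return resolved
```

Applied to the example above, Section One inherits sample: true from Chapter One, Section Two overrides it, and Chapter Two falls back to the defaults.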

4.1.8Extension attributes

Markua Processors may encounter attributes which they do not understand.

Whenever this happens, these attributes must be filtered from the output. A Markua Processor should function as though there is a whitelist of attributes which it permits for each element, and filter everything else.

Because of this, Markua attribute lists can contain any number of extension attributes. An extension attribute is an attribute which is not defined in the Markua specification. This is true whether the attributes are inserted in an attribute list attached to a span, block or even in free-floating directives.

Because of the fact that all unrecognized attributes are filtered, it is possible for a Markua Processor to add extension attributes that only it understands. This encourages competition in the Markua ecosystem, while ensuring that Markua implementations do not choke on Markua input which goes beyond their capabilities.

To be clear, Markua does not understand “HTML attributes”. It understands Markua attributes. Some Markua attributes, like class, happen to correspond to attributes in HTML. This is just coincidence, and many others do not.

For example, Leanpub supports an icon attribute on blurbs. If a different Markua processor does not support this attribute, there is no harm done: the attribute just has no effect.

Extension attributes go far beyond adding icons to blurbs: they allow for specialized uses of Markua. Since CSS is so powerful, with creative uses of custom attributes and custom CSS, Markua documents can be transformed. Some obvious uses of extension attributes include adding CSS classes which can then be styled to set fonts, adding custom types to figures for things like lemmas and theorems, etc.

This ensures that new attributes can be added to future versions of Markua without a negative effect on older Markua implementations. It also ensures that new versions of Markua can simply stop supporting attributes defined in this version of Markua without needing to specify anything special.

4.2Document settings (M)

The HTML which is output by a Markua Processor can be affected by certain global settings. These can be overridden by one or more document settings hashes.

The document settings hash or hashes must be at the start of a Markua document. Once something other than a document settings hash is encountered, no more document settings hashes can be defined.

The settings are newline-separated key-value pairs (key: value) which contain information about the document and about the desired behaviour of the Markua Processor.

Document information includes things like the title, subtitle, copyright, authors, etc. Markua Processor behaviour includes things like italicize-underlines, smart-punctuation, etc.

The format of a document settings hash is as follows: Every non-blank line in the document settings hash must contain a key, a colon :, and a value. Any whitespace at the beginning or end of the keys and values will be stripped. Only the first colon is important; subsequent colons on a line will just be part of the value.

There must be at least one blank line between the end of the settings block and the start of any other document content other than another document settings hash.

Here’s an example of a settings block at the start of a book:

{
italicize-underlines: false
soft-breaks: space
alt-caption: all
smart-punctuation: true
title: Markua Spec
authors: Peter Armstrong and John MacFarlane
}

# Chapter One

Lorem ipsum dolor...

The reason that more than one document settings hash is permitted is that a Markua Processor may also get settings from elsewhere, like a web interface. The most straightforward thing to do in such a case is to just add another document settings hash at the beginning of the file, and then let the Markua Processor sort out any conflicts.

In the case of a conflict between settings defined in a document and settings defined, say, in a web interface, the last setting definition should win. However, a Markua Processor may override settings whose values are unsupported or illegal. Also, a Markua processor should provide a warning if there are conflicts.

Here’s an example of a Markua document which could be generated if there is a settings block produced by a web interface, and then a settings block manually added at the start of a book:

{
italicize-underlines: false
soft-breaks: space
alt-caption: all
smart-punctuation: true
title: Markua Spec
authors: Peter Armstrong and John MacFarlane
}
{
italicize-underlines: true
}

# Chapter One

Lorem ipsum dolor...

Thankfully, the only thing that would see this would be a Markua Processor, since it’s gross. The author would just see their own (second) settings block; the first one would have been created from the values stored in a database somewhere.
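Reading one or more document settings hashes, with the last definition winning and a warning on conflict, can be sketched as follows. The sketch assumes each hash opens with a line containing only { and closes with a line containing only }, and that consecutive hashes are adjacent as in the example above; the name `parse_settings` is hypothetical.

```python
def parse_settings(document):
    """Return (settings, warnings, body) for a Markua document string."""
    lines = document.splitlines()
    settings, warnings = {}, []
    i = 0
    # Settings hashes are only recognized at the very start of the document.
    while i < len(lines) and lines[i].strip() == "{":
        i += 1
        while i < len(lines) and lines[i].strip() != "}":
            line = lines[i]
            if line.strip():  # blank lines inside a hash are skipped
                # Only the first colon matters; later colons belong to the value.
                key, colon, value = line.partition(":")
                if not colon:
                    raise ValueError(f"setting without a colon: {line!r}")
                key, value = key.strip(), value.strip()
                if key in settings and settings[key] != value:
                    warnings.append(f"conflicting values for {key!r}")
                settings[key] = value  # the last definition wins
            i += 1
        if i == len(lines):
            raise ValueError("unterminated document settings hash")
        i += 1  # skip the closing brace
    body = "\n".join(lines[i:])
    return settings, warnings, body
```

For the two-hash example above, this yields italicize-underlines set to true, along with one warning about the conflict.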

Note that each document setting either applies to books and courses, or to courses only.

Finally, Markua Processors can also add their own document settings, just as Markua Processors can understand their own extension attributes.

4.2.1Settings which apply to books and courses

The following settings apply to books and courses.

authors : The authors of the Markua document.

alt-caption : all, text or none. The default is text. This controls whether the alt text on a resource is removed and used instead as the figure caption if there is no figure caption explicitly set. If all, then if no caption is provided for a given resource, regardless of the resource type, remove the alt text and use it as the figure caption instead. (This setting is closest to the behaviour of LFM.) If text, then if no caption is provided for a given text-based resource (such as an external code sample), remove the alt text and use it as the figure caption instead. However, if the resource is not purely text-based, such as an image or a video, do not remove the alt text, and do not use it as the figure caption even if no figure caption is provided. (This is the default value. It behaves like Markdown for images and like LFM for text-based resource types.) If none, then never use the alt text as the figure caption, even for a text-based resource where no figure caption is provided. (This is how Markdown behaves.) Note that if the alt text is converted into the figure caption, it is NOT also provided as alt text. It would be very annoying to hear the same text spoken twice (once as the alt text and once as the figure caption) when using a screen reader.

contributors : Any contributors to the Markua document who are not authors.

copyright : The owner of the copyright.

default-code-language : The default language that code which is a local resource, web resource or inline resource inserted as a block with three backticks is interpreted as. The default value is guess, which means to guess at the code language based on the syntax encountered (or the file extension for external code samples), and attempt to syntax highlight appropriately. A good alternative is text, which means no syntax highlighting should be used, but the code should be in a monospaced font suitable for a programming language. Besides these options, you can specify a particular programming language used, such as ruby or java. If a Markua Processor does not recognize the programming language specified, it must format it as text. Finally, note that the value of this setting only affects local resources, web resources or inline resources inserted as a block with three backticks–it has no effect on code spans, or on inline resources inserted with tildes. The default language for inline resources delimited by three tildes is always text, but you can override the default on individual resources of course.

default-code-span-language : The default language that an inline code span resource is formatted as. The default is text. If you set this to ruby, it is the equivalent of setting {format: ruby} as an inline attribute list on every code span resource by default, e.g. `puts "hi"`{format: ruby}. If you need to override a specific resource to not be ruby, you would set {format: text}.

italicize-underlines : true or false. The default value is true. If italicize-underlines is true, then _this_ and *this* are both italic. If italicize-underlines is false, then _this_ is underlined while *this* is italic. This is discussed here.

lang : The ISO 639-3 three-character language code of the language that the Markua document is written in. The default is eng.

number-chapters : true or false. Default true. Whether the chapters are numbered.

number-figures : true or false. Default true. Whether resources are numbered when inserted as figures. Note that different types of resources (e.g. code listings, tables, images, etc.) have different numbering.

number-sections : true or false. Default false. Whether the sections (and sub-sections) are numbered.

restart-endnote-numbering : true or false. Default true. Whether the endnote numbering is restarted at the end of a chapter.

restart-footnote-numbering : true or false. Default true. Whether the footnote numbering is restarted at the end of a chapter.

soft-breaks : break, space or source. Default source. Whether single newlines produce a space (with space), a newline in the HTML source (with source) or a break tag (with break). See this section for a full description of this setting. If you have been writing in the Markua beta, you will want to set this to break. If you want the Markdown behaviour, leave it at the default of source.

smart-punctuation : If smart-punctuation is true, use smart punctuation. Straight double and single quotes will be rendered as curly quotes, depending on their position; -- will be rendered as an en-dash; --- will be rendered as an em-dash; and ... will be rendered as an ellipsis. If smart-punctuation is false, do not use smart punctuation.

title : The title of the Markua document.

validate : If validate is true, the UTF-8 characters in the Markua document are validated, replacing illegal sequences with U+FFFD. If validate is false, the UTF-8 characters in the Markua document are not validated. A Markua Processor may override this setting.

version : The version of the Markua document. This is primarily intended for use when producing analytics for courses. The default is 0.1.
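
The validate setting above can be illustrated with a minimal Python sketch. It is not a description of how any particular Markua Processor works: the function name load_markua_text is hypothetical, and the strict decoding used when validate is false is an assumption, since the spec only says the characters are not validated in that case.

```python
def load_markua_text(raw: bytes, validate: bool) -> str:
    """Decode manuscript bytes as UTF-8.

    With validate=True, each illegal byte sequence is replaced by U+FFFD.
    With validate=False this sketch decodes strictly and lets decoding
    errors propagate (an assumption; the spec does not define this case).
    """
    errors = "replace" if validate else "strict"
    return raw.decode("utf-8", errors=errors)
```

Calling load_markua_text(b"ab\xffcd", validate=True) yields the string "ab", then U+FFFD, then "cd", because the byte 0xFF can never start a valid UTF-8 sequence.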

4.2.2Settings which apply to courses only

The following settings are defined by Markua, but only make sense in the context of a course. So, a Markua Processor which is not generating a course can ignore them.

course-attempts : The number of allowed attempts on the course itself. The default is 1.

course-completeness : How “done” the course itself is. Expressed as a percentage, in decimal value. The default is 0. A value of 100 or 100.0 is the maximum. A Markua Processor may do something smart when a course which is 100% complete changes the id values of a quiz or quizzes, since this will affect current students.

default-exercise-show-answers : This can be all (the default), incorrect or none. It affects how answers are shown after an exercise is completed, say in a MOOC.

default-quiz-attempts : The default number of allowed attempts on a quiz. The default is 1. Note that for quizzes, a value of 0 means the quiz cannot be taken (yet). A value of -1 means the quiz has an unlimited number of attempts. Either of these is a legitimate value for default-quiz-attempts, and has the effect of changing the default value for all quizzes. Again, an exercise always has an unlimited number of attempts, so there is no corresponding default-exercise-attempts value.

default-quiz-auto-submit : true or false. The default is true. If true, an incomplete quiz is submitted when the time-limit is expired. If false, it is not. Either way, an incomplete quiz counts as an attempt.

default-quiz-show-answers : This can be all, incorrect (the default) or none. It affects how answers are shown after a quiz is completed, say in a MOOC.

default-quiz-use-result : best or latest. The default is best. Whether the best result on the quiz is used, or the latest one.

default-random-choice-order : true or false. The default is false. This sets the default behaviour of multiple choice questions. If true, the choices in the multiple choice question are randomly arranged; if false, they are presented in the order written.

default-random-question-order : true or false. The default is false. This sets the default behaviour of the quiz or exercise. If true, the questions are randomly arranged; if false, they are presented in the order written.
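
The attempt-counting and result-selection rules in the quiz settings above can be sketched in Python. This is only an illustration under stated assumptions: the function names can_attempt_quiz and quiz_result are hypothetical, and scores are assumed to be plain numbers listed in the order the attempts were made.

```python
def can_attempt_quiz(allowed_attempts: int, attempts_used: int) -> bool:
    """Return True if another attempt is allowed.

    -1 means unlimited attempts; 0 means the quiz cannot be taken (yet).
    """
    if allowed_attempts == -1:
        return True
    return attempts_used < allowed_attempts


def quiz_result(scores: list[float], use_result: str = "best") -> float:
    """Pick the score that counts, given scores in the order attempted."""
    if use_result == "best":
        return max(scores)
    if use_result == "latest":
        return scores[-1]
    raise ValueError(f"unknown use_result value: {use_result!r}")
```

With the default of one attempt, can_attempt_quiz(1, 0) is true and can_attempt_quiz(1, 1) is false. An incomplete quiz would still increment attempts_used, since it counts as an attempt either way.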

4.3Directives (M)

Directives are switches which affect the future behaviour of a Markua Processor.

The syntax for directives is simple: they are just contained in an attribute list. The only difference is that the attribute list is inserted on a line by itself, with one blank line above and below it.

A directive does not have any kind of “closing tag”: it simply remains in effect for the rest of the Markua document, or until the directive is overridden by another use of the same directive with a different value.

Here’s the directive syntax, for a nonexistent directive foo with a value bar:

some content

{foo: bar}

some content

There is syntactic sugar for boolean directives, where providing the key without a value is the equivalent of providing the key with the value true.

In the following example, the two {foo} directives are equivalent:

some content

{foo}

some content

{foo: true}

some content
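
As a rough illustration of the boolean syntactic sugar, the following Python sketch parses a single directive line into a key and a value. The name parse_directive is hypothetical, the sketch handles only one key per attribute list, and it does not check the blank lines required above and below a directive.

```python
import re

# One key, optionally followed by ": value", inside braces on its own line.
_DIRECTIVE = re.compile(r"^\{\s*([A-Za-z][\w-]*)\s*(?::\s*(.*?)\s*)?\}$")


def parse_directive(line: str):
    """Return (key, value) for a directive line, or None if it is not one."""
    m = _DIRECTIVE.match(line.strip())
    if m is None:
        return None
    key, value = m.group(1), m.group(2)
    if value is None:
        # Syntactic sugar: a bare key is equivalent to "key: true".
        return key, True
    if value in ("true", "false"):
        return key, value == "true"
    return key, value
```

Here parse_directive("{foo}") and parse_directive("{foo: true}") return the same pair, while an ordinary line such as "some content" returns None.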

4.4The lang directive (M)

A book has a global lang setting, which defaults to eng, for English.

However, many books are written in more than one language. Switching between languages is accomplished using the lang directive.

For example, say I’m writing a book about waka, renga and haiku. If I want to switch to Japanese to display a poem I can do so using a lang directive {lang: jpn} and then switch back to English using {lang: eng}:

lorem ipsum dolor

{lang: jpn}

古池や蛙飛び込む水の音

(松尾芭蕉)

{lang: eng}

foo bar baz

The value of the lang directive must be a valid ISO 639-3 language code.

The two main things that the lang directive affects are:

  1. the direction of the text (left-to-right or right-to-left)
  2. the font used for the language

Since English is the default language of a Markua document, all Markua Processors must support the eng value of the lang directive. Support for all other language codes is optional. If the language code given as the value of a lang directive is not supported or is unrecognized, a Markua Processor must interpret it as eng, switching the font accordingly and switching the direction to left-to-right. This way, a Markua Processor can add support for right-to-left languages such as Arabic ({lang: ara}) or Hebrew ({lang: heb}), and know that when a different language is specified the direction will switch to left-to-right even if the given language is not supported. There are a lot of languages, but most of them are left-to-right languages.
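
The fallback rule can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the name resolve_lang, the RTL_LANGS set (limited to the two right-to-left codes mentioned above) and the way supported codes are passed in are all assumptions.

```python
# Right-to-left codes this sketch knows about (only the two named in the text).
RTL_LANGS = {"ara", "heb"}


def resolve_lang(code: str, supported: set[str]) -> tuple[str, str]:
    """Return (effective_code, direction) for a lang directive value.

    "eng" is always treated as supported. Any unsupported or unrecognized
    code falls back to "eng", which also resets the direction to "ltr".
    """
    if code != "eng" and code not in supported:
        code = "eng"
    direction = "rtl" if code in RTL_LANGS else "ltr"
    return code, direction
```

A processor that supports only English would therefore resolve {lang: heb} to ("eng", "ltr"), whereas one that lists heb as supported would resolve it to ("heb", "rtl").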

4.5The pagebreak directive (M)

This is a nod to reality: sometimes an author will really, really want to insert a page break.

This is the pagebreak directive:

lorem ipsum dolor

{pagebreak}

foo bar baz

This is using the syntactic sugar version of the syntax, but the explicit true also works, of course:

lorem ipsum dolor

{pagebreak: true}

foo bar baz

Either way, the use of this directive is discouraged. Authors should spend their time writing, not formatting.

4.6Book section directives (M)

Most published books have three types of material in them: the front matter, the text (or “main matter”) and the back matter. (Yes, this applies to courses as well as books, since many Markua Processors like Leanpub can generate a book out of the course material.)

What authors write, the manuscript, is typically what goes into the text, or main matter, of the book. In style guidelines this is often called the text; in formats such as TeX and LaTeX it is called main matter. It’s what is typically numbered with Arabic numerals starting from 1.

There’s a bunch of other stuff (the Dedication, Epigraph, Table of Contents, Foreword, Preface, etc.) which can come before the main text of the book. The stuff before the main matter is called “front matter”. Some of the front matter comes before the Table of Contents and is not numbered; the rest of the front matter that comes after the Table of Contents gets numbered with Roman numerals.

There’s also a bunch of other stuff (appendices, the index, etc.) which can come after the main matter. This is called the “back matter”.

If Markua just relied on its headings support there would be no good way to accomplish the division of a manuscript into front matter, main matter and back matter. (We could try some convention about heading names, but that would be a highly objectionable, English-centric hack.)

So, this is where the book section directives come in.

There are two legal values of the section directive:

  1. mainmatter
  2. backmatter

With this approach, your book is essentially the following:

any frontmatter content, like a dedication, preface or introduction

{section: mainmatter}

the body of your book (the text or main matter)

{section: backmatter}

any back matter content, such as appendices

Note that there is no need for a {section: frontmatter} directive, ever.

The {section: mainmatter} and {section: backmatter} directives act as dividers, separating a book or course into logical groups of content.

While these book section directives are merely optional hints, there is a very strict rule about their use: each value (mainmatter, backmatter) of the section directive can only appear once in a document.

Now, note that the most minimal way to write a book is to use no section directives at all. With this approach, everything is in the text (main matter) of the book. Page numbering is in Arabic numerals, etc.

For authors who do not know about the section directive, this is what they are doing. Nothing bad or unexpected will happen: they can write their book, and it will look correct. Only when they go to add things like a preface or an appendix, and feel the need for different numbering, will they need to discover the section directives. Then, they can add them to their Markua document as appropriate.
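
The divider behaviour of the section directives can be sketched in Python. This is an illustration under assumptions: split_sections is a hypothetical name, the manuscript is given as a list of lines, and content that precedes a lone {section: backmatter} directive is assumed to be main matter, which the text above does not state explicitly.

```python
_SECTION_DIRECTIVES = {
    "{section: mainmatter}": "mainmatter",
    "{section: backmatter}": "backmatter",
}


def split_sections(lines: list[str]) -> dict[str, list[str]]:
    """Group manuscript lines into front, main and back matter."""
    result = {"frontmatter": [], "mainmatter": [], "backmatter": []}
    seen = set()
    leading = []      # content before the first section directive
    target = leading
    for line in lines:
        name = _SECTION_DIRECTIVES.get(line.strip())
        if name is None:
            target.append(line)
            continue
        if name in seen:
            # Each value of the section directive may appear only once.
            raise ValueError(f"{{section: {name}}} may appear only once")
        seen.add(name)
        target = result[name]
    # Leading content is front matter only if a mainmatter directive exists;
    # with no such directive, everything before back matter is main matter.
    key = "frontmatter" if "mainmatter" in seen else "mainmatter"
    result[key] = leading + result[key]
    return result
```

A manuscript with no section directives ends up entirely in the mainmatter list, matching the minimal case described above.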

4.7Insert directives (M)

With a book or course, certain types of content are created by the Markua Processor, or in metadata provided by the author. This content then needs to be positioned in the book or course.

While a Markua Processor can adopt sensible defaults, sometimes an author wants more fine-grained control over where this automatically-generated content is inserted. That’s what the insert directive is for.

Here’s an example book:

some front matter content

{insert: dedication}

more front matter content

{insert: table-of-contents}

even more front matter content

{section: mainmatter}

the main matter content

{section: backmatter}

some back matter content

{insert: index}

more back matter content

{insert: quiz-answers}

The insert directive can have any value. If a Markua Processor does not recognize it, does not support it in that location, or does not support it at all, it should log an error and ignore the directive.

The following are some values of the insert directive which it may benefit a Markua Processor to support in front matter, which is everything before the {section: mainmatter} directive:

Again, it only makes sense to support these directives if there is some way of actually getting the content to insert, such as a web interface.

Note that a Markua Processor can be really smart about certain things, if it wants. For example, in front matter (before a {section: mainmatter} directive), it can choose to not number the pages before the Table of Contents, but to number the pages after the Table of Contents. Then, the author can use the {insert: table-of-contents} directive to choose where in the front matter the Table of Contents is positioned. This will not only affect its position, it will also affect the numbering of content before and after it.

The following are some values of the insert directive which it may benefit a Markua Processor to support in back matter, which is everything after the {section: backmatter} directive:

The exercise-answers and quiz-answers directives are used to position the answers to any exercises or quizzes in the text of the Markua document.

If neither of these directives is present, a Markua Processor should position any exercise answers somewhere near the back of the book (in the back matter, if it exists). For quiz answers, on the other hand, a Markua Processor may do whatever it wants in terms of whether the quiz answers are included in the book, regardless of the presence or position of the quiz-answers directive.

For example, in Leanpub’s online courses, the quiz answers are only provided when quizzes are completed and automatically marked, and are never output in the material book for the course.

5Leaf blocks

This section describes the different kinds of leaf block that make up a Markdown document.

5.1Thematic breaks

A line consisting of 0-3 spaces of indentation, followed by a sequence of three or more matching -, _, or * characters, each followed optionally by any number of spaces or tabs, forms a thematic break.

Example 48
***
---
___
<hr />
<hr />
<hr />

Wrong characters:

Example 49
+++
<p>+++</p>
Example 50
===
<p>===</p>

Not enough characters:

Example 51
--
**
__
<p>--
**
__</p>

One to three spaces indent are allowed:

Example 52
 ***
  ***
   ***
<hr />
<hr />
<hr />

Four spaces is too many:

Example 53
    ***
<pre><code>***
</code></pre>
Example 54
Foo
    ***
<p>Foo
***</p>

More than three characters may be used:

Example 55
_____________________________________
<hr />

Spaces are allowed between the characters:

Example 56
 - - -
<hr />
Example 57
 **  * ** * ** * **
<hr />
Example 58
-     -      -      -
<hr />

Spaces are allowed at the end:

Example 59
- - - -    
<hr />

However, no other characters may occur in the line:

Example 60
_ _ _ _ a

a------

---a---
<p>_ _ _ _ a</p>
<p>a------</p>
<p>---a---</p>

It is required that all of the non-whitespace characters be the same. So, this is not a thematic break:

Example 61
 *-*
<p><em>-</em></p>

Thematic breaks do not need blank lines before or after:

Example 62
- foo
***
- bar
<ul>
<li>foo</li>
</ul>
<hr />
<ul>
<li>bar</li>
</ul>

Thematic breaks can interrupt a paragraph:

Example 63
Foo
***
bar
<p>Foo</p>
<hr />
<p>bar</p>

If a line of dashes that meets the above conditions for being a thematic break could also be interpreted as the underline of a setext heading, the interpretation as a setext heading takes precedence. Thus, for example, this is a setext heading, not a paragraph followed by a thematic break:

Example 64
Foo
---
bar
<h2>Foo</h2>
<p>bar</p>

When both a thematic break and a list item are possible interpretations of a line, the thematic break takes precedence:

Example 65
* Foo
* * *
* Bar
<ul>
<li>Foo</li>
</ul>
<hr />
<ul>
<li>Bar</li>
</ul>

If you want a thematic break in a list item, use a different bullet:

Example 66
- Foo
- * * *
<ul>
<li>Foo</li>
<li>
<hr />
</li>
</ul>
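
The single-line rule for thematic breaks can be approximated with a regular expression. The Python sketch below is illustrative only: is_thematic_break is a hypothetical name, and it looks at one line in isolation, so it does not model the setext heading precedence or the effect of an enclosing paragraph or list item.

```python
import re

# 0-3 spaces of indentation, then three or more matching -, _ or *
# characters, each optionally followed by spaces or tabs.
_THEMATIC_BREAK = re.compile(r"^ {0,3}([-_*])[ \t]*(?:\1[ \t]*){2,}$")


def is_thematic_break(line: str) -> bool:
    """Return True if the line, taken alone, has the form of a thematic break."""
    return _THEMATIC_BREAK.match(line.rstrip("\n")) is not None
```

The backreference \1 enforces that all non-whitespace characters are the same, which is why a line such as " *-*" is rejected.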

5.2Thematic break semantics (M)

Thematic breaks, as shown above, produce an <hr /> in HTML.

Thematic breaks are treated as scene breaks by Markua.

In other formats, such as PDF, a Markua Processor may format the thematic break in a way appropriate for a scene break in fiction.

There are no additional tests in this section, since this is just a note about semantics and about CSS.

5.2.1CSS Note

Since Markua bans all inline HTML except HTML comments, the only way to produce an <hr /> is using the above thematic break syntax. So, designers can assume that all <hr /> elements are thematic breaks, and style them in a way appropriate for a scene break in fiction.

5.3ATX headings

An ATX heading consists of a string of characters, parsed as inline content, between an opening sequence of 1–6 unescaped # characters and an optional closing sequence of any number of unescaped # characters. The opening sequence of # characters must be followed by a space or by the end of line. The optional closing sequence of #s must be preceded by a space and may be followed by spaces only. The opening # character may be indented 0-3 spaces. The raw contents of the heading are stripped of leading and trailing spaces before being parsed as inline content. The heading level is equal to the number of # characters in the opening sequence.

Simple headings:

Example 67
# foo
## foo
### foo
#### foo
##### foo
###### foo
<h1>foo</h1>
<h2>foo</h2>
<h3>foo</h3>
<h4>foo</h4>
<h5>foo</h5>
<h6>foo</h6>

More than six # characters is not a heading:

Example 68
####### foo
<p>####### foo</p>

At least one space is required between the # characters and the heading’s contents, unless the heading is empty. Note that many implementations currently do not require the space. However, the space was required by the original ATX implementation, and it helps prevent things like the following from being parsed as headings:

Example 69
#5 bolt

#hashtag
<p>#5 bolt</p>
<p>#hashtag</p>

This is not a heading, because the first # is escaped:

Example 70
\## foo
<p>## foo</p>

Contents are parsed as inlines:

Example 71
# foo *bar* \*baz\*
<h1>foo <em>bar</em> *baz*</h1>

Leading and trailing whitespace is ignored in parsing inline content:

Example 72
#                  foo                     
<h1>foo</h1>

One to three spaces indentation are allowed:

Example 73
 ### foo
  ## foo
   # foo
<h3>foo</h3>
<h2>foo</h2>
<h1>foo</h1>

Four spaces are too much:

Example 74
    # foo
<pre><code># foo
</code></pre>
Example 75
foo
    # bar
<p>foo
# bar</p>

A closing sequence of # characters is optional:

Example 76
## foo ##
  ###   bar    ###
<h2>foo</h2>
<h3>bar</h3>

It need not be the same length as the opening sequence:

Example 77
# foo ##################################
##### foo ##
<h1>foo</h1>
<h5>foo</h5>

Spaces are allowed after the closing sequence:

Example 78
### foo ###     
<h3>foo</h3>

A sequence of # characters with anything but spaces following it is not a closing sequence, but counts as part of the contents of the heading:

Example 79
### foo ### b
<h3>foo ### b</h3>

The closing sequence must be preceded by a space:

Example 80
# foo#
<h1>foo#</h1>

Backslash-escaped # characters do not count as part of the closing sequence:

Example 81
### foo \###
## foo #\##
# foo \#
<h3>foo ###</h3>
<h2>foo ###</h2>
<h1>foo #</h1>

ATX headings need not be separated from surrounding content by blank lines, and they can interrupt paragraphs:

Example 82
****
## foo
****
<hr />
<h2>foo</h2>
<hr />
Example 83
Foo bar
# baz
Bar foo
<p>Foo bar</p>
<h1>baz</h1>
<p>Bar foo</p>

ATX headings can be empty:

Example 84
## 
#
### ###
<h2></h2>
<h1></h1>
<h3></h3>
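
The ATX heading rules can be sketched as a small Python function. This is a simplified illustration: parse_atx_heading is a hypothetical name, the function examines one line in isolation, and it returns the raw heading content without parsing it as inline content, so backslash escapes are left in place.

```python
import re

# 0-3 spaces, 1-6 '#' characters, then either end of line or
# whitespace followed by the rest of the line.
_ATX_OPEN = re.compile(r"^ {0,3}(#{1,6})(?:[ \t]+(.*))?$")


def parse_atx_heading(line: str):
    """Return (level, raw_content) for an ATX heading line, else None."""
    m = _ATX_OPEN.match(line.rstrip("\n"))
    if m is None:
        return None
    level = len(m.group(1))
    raw = (m.group(2) or "").strip(" \t")
    if re.fullmatch(r"#+", raw):
        # The whole remainder is a closing sequence: an empty heading.
        raw = ""
    else:
        # Drop a closing sequence only if it is preceded by whitespace.
        raw = re.sub(r"[ \t]+#+$", "", raw)
    return level, raw
```

For instance, "## foo ##" gives (2, "foo"), while "# foo#" keeps the trailing character and gives (1, "foo#") because the closing sequence is not preceded by a space.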

5.4Setext headings

A setext heading consists of one or more lines of text, each containing at least one non-whitespace character, with no more than 3 spaces indentation, followed by a setext heading underline. The lines of text must be such that, were they not followed by the setext heading underline, they would be interpreted as a paragraph: they cannot be interpretable as a code fence, ATX heading, block quote, thematic break, list item, or HTML block.

A setext heading underline is a sequence of = characters or a sequence of - characters, with no more than 3 spaces indentation and any number of trailing spaces. If a line containing a single - can be interpreted as an empty list item, it should be interpreted this way and not as a setext heading underline.

The heading is a level 1 heading if = characters are used in the setext heading underline, and a level 2 heading if - characters are used. The contents of the heading are the result of parsing the preceding lines of text as CommonMark inline content.

In general, a setext heading need not be preceded or followed by a blank line. However, it cannot interrupt a paragraph, so when a setext heading comes after a paragraph, a blank line is needed between them.

Simple examples:

Example 85
Foo *bar*
=========

Foo *bar*
---------
<h1>Foo <em>bar</em></h1>
<h2>Foo <em>bar</em></h2>

The content of the header may span more than one line:

Example 86
Foo *bar
baz*
====
<h1>Foo <em>bar
baz</em></h1>

The contents are the result of parsing the heading’s raw content as inlines. The heading’s raw content is formed by concatenating the lines and removing initial and final whitespace.

Example 87
  Foo *bar
baz*→
====
<h1>Foo <em>bar
baz</em></h1>

The underlining can be any length:

Example 88
Foo
-------------------------

Foo
=
<h2>Foo</h2>
<h1>Foo</h1>

The heading content can be indented up to three spaces, and need not line up with the underlining:

Example 89
   Foo
---

  Foo
-----

  Foo
  ===
<h2>Foo</h2>
<h2>Foo</h2>
<h1>Foo</h1>

Four spaces indent is too much:

Example 90
    Foo
    ---

    Foo
---
<pre><code>Foo
---

Foo
</code></pre>
<hr />

The setext heading underline can be indented up to three spaces, and may have trailing spaces:

Example 91
Foo
   ----      
<h2>Foo</h2>

Four spaces is too much:

Example 92
Foo
    ---
<p>Foo
---</p>

The setext heading underline cannot contain internal spaces:

Example 93
Foo
= =

Foo
--- -
<p>Foo
= =</p>
<p>Foo</p>
<hr />
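
The shape of a setext heading underline, as exercised by the examples so far, can be captured in a short Python sketch. The name setext_underline_level is hypothetical, and the function checks only the underline line itself: it does not verify that the preceding lines form a paragraph, nor does it apply the rule about a single - being an empty list item.

```python
import re

# Up to 3 spaces, a run of '=' or a run of '-', then optional trailing
# spaces or tabs. Internal spaces are not allowed.
_SETEXT_UNDERLINE = re.compile(r"^ {0,3}(=+|-+)[ \t]*$")


def setext_underline_level(line: str):
    """Return 1 for an '=' underline, 2 for a '-' underline, else None."""
    m = _SETEXT_UNDERLINE.match(line.rstrip("\n"))
    if m is None:
        return None
    return 1 if m.group(1)[0] == "=" else 2
```

A line such as "--- -" returns None here; as Example 93 shows, it is then free to be interpreted as a thematic break instead.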

Trailing spaces in the content line do not cause a line break:

Example 94
Foo  
-----
<h2>Foo</h2>

Nor does a backslash at the end:

Example 95
Foo\
----
<h2>Foo\</h2>

Since indicators of block structure take precedence over indicators of inline structure, the following are setext headings:

Example 96
`Foo
----
`

<a title="a lot
---
of dashes"/>
<h2>`Foo</h2>
<p>`</p>
<h2>&lt;a title=&quot;a lot</h2>
<p>of dashes&quot;/&gt;</p>

The setext heading underline cannot be a lazy continuation line in a list item or block quote:

Example 97
> Foo
---
<blockquote>
<p>Foo</p>
</blockquote>
<hr />
Example 98
> foo
bar
===
<blockquote>
<p>foo
bar
===</p>
</blockquote>
Example 99
- Foo
---
<ul>
<li>Foo</li>
</ul>
<hr />

A blank line is needed between a paragraph and a following setext heading, since otherwise the paragraph becomes part of the heading’s content:

Example 100
Foo
Bar
---
<h2>Foo
Bar</h2>

But in general a blank line is not required before or after setext headings:

Example 101
---
Foo
---
Bar
---
Baz
<hr />
<h2>Foo</h2>
<h2>Bar</h2>
<p>Baz</p>

Setext headings cannot be empty:

Example 102

====
<p>====</p>

Setext heading text lines must not be interpretable as block constructs other than paragraphs. So, the line of dashes in these examples gets interpreted as a thematic break:

Example 103
---
---
<hr />
<hr />
Example 104
- foo
-----
<ul>
<li>foo</li>
</ul>
<hr />
Example 105
    foo
---
<pre><code>foo
</code></pre>
<hr />
Example 106
> foo
-----
<blockquote>
<p>foo</p>
</blockquote>
<hr />

If you want a heading with > foo as its literal text, you can use backslash escapes:

Example 107
\> foo
------
<h2>&gt; foo</h2>

Compatibility note: Most existing Markdown implementations do not allow the text of setext headings to span multiple lines. But there is no consensus about how to interpret

Foo
bar
---
baz

One can find four different interpretations:

  1. paragraph “Foo”, heading “bar”, paragraph “baz”
  2. paragraph “Foo bar”, thematic break, paragraph “baz”
  3. paragraph “Foo bar — baz”
  4. heading “Foo bar”, paragraph “baz”

We find interpretation 4 most natural, and interpretation 4 increases the expressive power of CommonMark, by allowing multiline headings. Authors who want interpretation 1 can put a blank line after the first paragraph:

Example 108
Foo

bar
---
baz
<p>Foo</p>
<h2>bar</h2>
<p>baz</p>

Authors who want interpretation 2 can put blank lines around the thematic break,

Example 109
Foo
bar

---

baz
<p>Foo
bar</p>
<hr />
<p>baz</p>

or use a thematic break that cannot count as a setext heading underline, such as

Example 110
Foo
bar
* * *
baz
<p>Foo
bar</p>
<hr />
<p>baz</p>

Authors who want interpretation 3 can use backslash escapes:

Example 111
Foo
bar
\---
baz
<p>Foo
bar
---
baz</p>

5.5Part headings (M)

As discussed, Markua uses a {class: part} attribute on an h1 heading to make a part heading from what would otherwise be a chapter heading.

In the HTML output, the only difference between a part and chapter heading on the heading itself is that there is a class="part" attribute shown. However, obviously, the part and chapter headings are treated differently elsewhere, such as in the table of contents of the book or the navigation structure of the course.

5.5.1Correct usage

Part heading, ATX headers:

{class: part}
# Part Heading

This is a paragraph after the start of a part.
<h1 class="part">Part Heading</h1>
<p>This is a paragraph after the start of a part.</p>

Part heading, Setext headers:

{class: part}
Part Heading
============

This is a paragraph after the start of a part.
<h1 class="part">Part Heading</h1>
<p>This is a paragraph after the start of a part.</p>

5.5.2Incorrect usage

A {class: part} attribute on any heading level other than h1 is ignored.

Ignored on ATX section headers:

{class: part}
## Section Heading

Foo
<h2>Section Heading</h2>
<p>Foo</p>

Ignored on Setext section headers:

{class: part}
Section Heading
---------------

Foo
<h2>Section Heading</h2>
<p>Foo</p>

5.6Resources (M)

Markua documents are written in plain text, either in one text file or multiple text files. However, modern books and courses sometimes contain more than text. Books and courses may embed many types of resources.

Resources vary in four different ways:

  1. Insertion Methods: Figure and Span
  2. Locations: Local, Web and Inline
  3. Types: audio, code, image, math, poetry, table and video
  4. Formats: mp3, ruby, png, latexmath, mp4, etc.

If you are familiar with Markdown, you will recognize that images, code and tables are being generalized here as just three of the seven types of resource, instead of being treated as special cases.

To be clear: the syntax for images, fenced code blocks, and tables is a superset of that specified by CommonMark and GFM. All images, fenced code blocks and tables supported by CommonMark and GFM work in Markua. Markua’s changes here are additions, not modifications. Markua’s resources are an extension, both conceptual and in features, of what was already in CommonMark and GFM.

Specifically, what resources do is three things:

  1. Add attribute lists and format specifiers to all resource types, including images, fenced code blocks and tables. Attribute lists can be used to specify a figure title and (if applicable) alt text, set an id, specify the format, and do other useful things.
  2. Add other types of resources (audio, poetry, math and video).
  3. Provide a consistent conceptual framework for thinking about all resources, instead of just seven different, unrelated things.

If you’re familiar with Markdown syntax, you’ll note that the syntax for local and web resources is similar to Markdown’s inline image insertion syntax, and that the syntax for inline resources is just the fenced code blocks syntax from CommonMark with the addition of format specifiers and attribute lists. Furthermore, the image syntax is the inspiration for the syntax of all local and web resources.

Note that this section is conceptual, and does not contain HTML mappings which serve as tests. Those are contained in subsequent sections about each resource type.

If any resource is missing during book or course generation, a Markua processor should insert something like the “missing image” often seen on web pages into the book or course.

5.6.1Resource insertion methods

Resources can be inserted either as figures or as spans.

5.6.1.1Figures

All resources can be inserted as figures. Figures are the normal case for how a resource is inserted. In many Markua documents, they will be most or all of the resources.

The syntax for a local or web resource inserted as a figure is as follows:

{key: value, comma: separated, optional: attribute_list}
![optional alt text](resource_path_or_url "Optional Figure Title")

The syntax for an inline resource inserted as a figure is as follows:

{key: value, comma: separated, optional: attribute_list}
```optional_format
inline resource content (default format is `guess` with backticks)
```

The number of backticks can be three or more. This is just the fenced code blocks syntax, which is discussed later.

You can also insert an inline resource as a figure using three or more tildes (~) as the delimiter:

{key: value, comma: separated, optional: attribute_list}
~~~optional_format
inline resource content (default format is `text` with tildes)
~~~
5.6.1.1.1Figure attributes

A figure can also have attributes. The supported attributes vary based on the type of resource, but all resources inserted as figures support the class, format, title and type attributes. (As such, these four attributes are discussed here, instead of on the specific resource types below.)

class : All attribute lists support a class attribute. When used in figures, this is the class of the figure. This can be used for styling, and it can also be used by Markua Processors which group figures by classes, such as theorems or lemmas.

format : This is the resource format. Different resource types have different legal values for format.

title : This is text which is shown near the figure, typically above or below it. A Markua Processor can choose where to position titles based on any criteria it chooses. For example, a Markua Processor can position all titles above or below figures, or use a different behaviour based on the resource format (e.g. table titles above, image titles below). Note that the figure title itself may contain the Markua text formatting specified earlier (e.g. bold, italic). This text is also displayed for the figure wherever the figure is listed (e.g. List of Illustrations, Table of Tables, etc.).

type : This is the resource type. This is usually inferred from the format instead of being specified.

Figures can have both alt text and a figure title. These are distinct things. We will discuss the figure title first and the alt text second.

5.6.1.1.2Figure titles

A resource which is inserted as a figure can have a figure title.

This title shows up in two places in the output:

  1. Near the resource, typically above or below it, per the preference of the Markua Processor.
  2. Based on the resource type, either in the List of Illustrations, List of Tables or Table of Figures, if they are generated for the book. This text should also be a crosslink to the title inserted near the figure itself.

The title for a figure can be provided in one of two ways:

  1. In quotes after the filename or URL of a local or web resource.
    ![](algorithm.rb "My Amazing Algorithm")
    
  2. In the attribute list above the resource.
    {title: "My Amazing Algorithm"}
    ![](algorithm.rb)
    

The first choice is clearly shorter, as well as more pleasant to write and to read.

Since inline resources do not use the bracket syntax, any titles must be added in the attribute list:

{title: "My Amazing Algorithm"}
```ruby
puts "hello world"
```
5.6.1.1.3Figure alt text

A figure can have alt text, as long as the type of resource being inserted supports alt text. The resource location does not matter: local, web and inline resources all support alt text.

The four types of resource which can have alt text when inserted as a figure are audio, image, math and video. Resources of type code, poetry and table do not support alt text, since they themselves are just text.

Alt text is text which is intended to take the place of the resource if the resource itself cannot be seen. In the case of images, the obvious use case is for readers with visual disabilities who are using a screen reader, but it also includes audiobooks and ebook readers which often do not support embedded images, audio and video, and which may have a hard time displaying math.

Here’s an example of good alt text:

![a red apple, possibly a McIntosh or Spartan](fruit.jpg)

You can also use an attribute list:

{alt: "a red apple, possibly a McIntosh or Spartan"}
![](fruit.jpg)

The alt text should not have the same content as the figure title, if the figure title is present. (Imagine the annoyance for someone with a visual disability having their screen reader read identical alt text and figure titles to them throughout an entire book!)

Instead, the alt text should be descriptive of the image content, while the figure title can be more creative. For example, a figure title may be “Washington Crossing the Delaware” and the alt text could be “Denzel Washington on a boat in a river.” Having good alt text would enable readers who cannot see the image to still get the joke which the figure title makes.

5.6.1.1.4Figure titles and alt text together

These are some examples of figures with both alt text and a figure title:

![a red apple](mac.jpg "The Original Mac")

{alt: "a red apple", title: "The Original Mac"}
![](mac.jpg)

{alt: "a blue circle", title: "Earth From Space (Simplified)"}
```!
<svg width="20" height="20">
  <circle cx="10" cy="10" r="9" fill="blue"/>
</svg>
```
5.6.1.1.5The figure attribute list takes precedence

It is always an error to specify an attribute both in the attribute list for a figure and in the syntactic sugar locations, either after the backticks or in the square brackets.

However, if this is done, then the value in the attribute list takes precedence.

In the following figure, the format is text not ruby:

{format: text}
```ruby
puts "hello world"
```

In the following figure, the alt text would be “foo” not “bar”:

{alt: foo}
![bar](foo.png)

In the following figure, the title text would be “foo” not “bar”:

{title: foo}
![](foo.png "bar")

Again, the Markua Processor should treat all of these as an error.
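
The precedence rule can be sketched as a merge in which the attribute list wins and every conflict is reported. This is an illustrative Python sketch only: the function name resolve_figure_attributes and the use of plain dictionaries for the attribute list and the syntactic sugar values are assumptions, not something Markua prescribes.

```python
def resolve_figure_attributes(attribute_list, sugar):
    """Merge attributes so that the attribute list takes precedence.

    `attribute_list` holds values from the {…} attribute list, `sugar` holds
    values from the syntactic sugar locations (after the backticks, in the
    square brackets, or in the quoted title). Returns the merged attributes
    and the sorted names of attributes that were specified in both places.
    """
    merged = dict(sugar)
    merged.update(attribute_list)  # attribute list values override sugar values
    conflicts = sorted(set(attribute_list) & set(sugar))
    return merged, conflicts
```

For the second example above, resolve_figure_attributes({"alt": "foo"}, {"alt": "bar"}) yields an alt value of "foo" and reports "alt" as a conflict, which a Markua Processor would surface as an error.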

5.6.1.2Spans

Inserting a resource as a span means to continue the block context which it is in, instead of starting a new leaf block for the resource.

There are three types of resources which can be inserted as spans:

  1. Single-line inline code resources.
  2. A math resource (regardless of location) without a title.
  3. A local or web image resource without a title.

An inline code resource can be inserted as a span using the following syntax:

backtick `resource`format backtick

The math formats, including the syntactic sugar for them, are discussed later.

A local image resource without a title can be inserted as expected:

This is an ![a red apple](mac.jpg) not a banana.

A web image resource without a title can be inserted as expected:

This is an ![a red apple](http://markua.com/mac.jpg) not a banana.

Note that inline image resources cannot be inserted as spans, since even SVG is so verbose it would make the resulting document very hard to read.

Also, note that if you wish to insert an image in the middle of a sentence, chances are you are trying to insert an emoji. You can do this with a span image, if you want exact control of what your emoji look like. However, note that emoji also have special support in Markua, described elsewhere in this specification.

Finally, if a title is present, the Markua Processor must treat the resource as having implicit newlines before and after it, so that it is handled as a figure. (Figure titles do not make sense inside of sentences.)

This is ![a red apple](mac.jpg "The Original Mac") not a banana.
5.6.1.2.1Span attribute lists can be used to set the format

When a resource is inserted as a span, the resource is inserted as part of the flow of text of a paragraph with no newlines before or after it. A span resource is intended to be extremely short and simple.

By default, the type of an inline resource is code and the format is guess. Math and SVG images override this using special syntaxes, discussed later.

However, to support rare use cases, any span can have an attribute list. The attribute list on a span is specified immediately after the closing backtick. Span attribute lists can be used to set the format explicitly.

For example, this resource would be formatted as Ruby code:

Yada yada `puts "hi"`{format: ruby} yada yada.

5.6.2Resource Locations

A resource is either considered a local, web or inline resource based on its location:

Local Resource : The resource is stored along with the manuscript–either in a resources directory on a local filesystem, or uploaded to the same web service where the Markua document is being written.

Web Resource : The resource is referred to via an http or https URL.

Inline Resource : The resource is defined right in the body of a Markua document.

5.6.2.1Local resources

If local resources are used, all local resources must be stored inside a resources directory, or one of its subdirectories. The resources directory is not part of the path to the resource.

Here’s how the paths to local resources work:

  1. An image called foo.jpg in the resources directory should be referred to as ![](foo.jpg), but can also be referred to as ![](resources/foo.jpg).
  2. An image called bar.png in a subdirectory images of the resources directory should be referred to as ![](images/bar.png), but can also be referred to as ![](resources/images/bar.png).
  3. For security reasons, leading slashes / and navigating upward (../) are not allowed: ![](/foo.jpg), ![](/images/bar.png) and ![](../foo.jpg) are all illegal.

The reason that paths can either include or omit the resources directory is simple: including it makes it a simple relative path, which means that Markdown-aware tools that support, say, external images will just work. However, omitting it is nice to type, so this is something which should be supported as well. And the reason the resources directory exists is to keep the Markua manuscript file(s) separate from the resources, to reduce clutter.

Nested directory trees work as well. A file called foo.rb in a ch1/examples/ruby directory tree inside the resources directory is referenced as ![](ch1/examples/ruby/foo.rb) or as ![](resources/ch1/examples/ruby/foo.rb).
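
The path rules above can be sketched as a small normalization function. This is a hedged Python illustration, not a required implementation: the name normalize_local_path is hypothetical, and treating a leading resources/ segment as always optional is an assumption (it would misread a subdirectory that is itself named resources).

```python
def normalize_local_path(path):
    """Return the path of a local resource relative to the resources directory.

    Raises ValueError for paths the rules above disallow.
    """
    if path.startswith("/"):
        raise ValueError("leading slashes are not allowed: " + path)
    parts = path.split("/")
    if ".." in parts:
        raise ValueError("navigating upward is not allowed: " + path)
    # The resources directory may be included or omitted by the author.
    if len(parts) > 1 and parts[0] == "resources":
        parts = parts[1:]
    return "/".join(parts)
```

With this sketch, images/bar.png and resources/images/bar.png normalize to the same path, while /foo.jpg and ../foo.jpg are rejected.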

Markua does not specify whether there are any subdirectories of the resources directory, or what their names are. Since any subdirectories have their names as part of the path to the resource, authors can do whatever they want. For example, you can create subdirectories of the resources directory for different types of resource, such as audio, code, images, etc., but you can also just put them all in the resources directory together. To be clear: the names of the directories have no meaning, and do not restrict the formats of what can go inside them.

If you are using a hosted service to write in Markua, this service can store resources wherever it wants. However, if they provide a download (say as a zip file) they should create the resources directory and provide the uploaded resources in that directory. If a nested structure is used, it should be exported that way–if a web service produces paths which reference images inside an images directory (e.g. as images/foo.png), then the zip file containing an export should contain a resources directory which contains an images subdirectory with the images.

5.6.2.2Web resources

If web resources are supported, both http: and https: resources should be supported.

Web resources are identified by the absolute URL of the resource on the internet.

5.6.2.3Inline resources

Inline resources can be of type code, image (of SVG format only), math (regardless of format), poetry, or table. Since an SVG image is just XML text, it can be contained inline in the text of a Markua document. This is not something that is true for binary resources like PNG or JPEG images or any type of audio or video file–these can only be local or web resources.

The syntax for SVG images is discussed later.

5.6.3Resource types

There are seven types of resources:

  1. audio
  2. code
  3. image
  4. math
  5. poetry
  6. table
  7. video

Each type of resource has a number of supported formats. Any of the seven resource types can be inserted as a local resource or web resource, and many of the resource types can also be inserted as an inline resource.

5.6.4Resource formats

Both the type and the format can be specified in an attribute list, by the respective type and format attributes.

The type and the format can also be inferred from the file extension and, in the case of web resources, the URL.

Markua Processors must interpret all unspecified file extensions as specifying a resource of type code with a format of guess, unless the resource is a web resource.

If the type and format are not specified and the resource is a web resource, the Markua Processor may use the domain to decide what type of resource to assume. For example, a domain of youtube.com may be assumed to be of type video, a domain of instagram.com may be assumed to be of type image, and a domain of github.com may be assumed to be of type code.

If the type is not specified in the attribute list, the format determines the type. The formats can either be specified by the format attribute or (in most cases) inferred from the file extension for local and web resources. (Inline resources obviously have no file extension, since they are contained in the body of a Markua document.)

As an author, all you typically do is provide the correct file extension for a local resource or set the format in the attribute list. Markua recognizes the format, and uses it to determine the type. If the format is unrecognized, then the resource is treated as a resource of type code and with a format of guess.

It’s important to emphasize that the type and format of a resource can be overridden using an attribute list. The file extensions just set the default type and format that are inferred.

In rare instances, it is useful to override the type and format which have been inferred by the Markua Processor based on the file extension of the resource. This is done by specifying a type and/or format in the attribute list of the resource.

The following sections list the various resource formats for each type of resource, along with the file extensions which make that format and type be inferred. Note that these file extensions are case-insensitive.

5.6.4.1Audio resource formats

5.6.4.2Image resource formats

5.6.4.3Math resource formats

Note that AsciiMath and LaTeX math are almost always contained as inline resources in a Markua document. They only have file extensions for completeness. Also, note that .tex is assumed to be LaTeX math, not a vanilla LaTeX file. There is a simple reason for this: Markua needs LaTeX math for math, but only needs to display LaTeX for people writing about LaTeX. So, if you have a LaTeX file with an extension of .tex that just contains LaTeX code and you want to display it, then add a {type: code, format: latex} attribute list.

Note that the assumption is that AsciiMath will almost always be used as an inline resource. So, the .asciimath file extension is deliberately verbose.

5.6.4.4Table resource formats

Note that tables are almost always contained as inline resources in a Markua document. They only have a file extension for completeness. The .md file extension is used, since the tables are formatted in a Markdown extension, and since Markdown and Markua files themselves are not inserted as resources.

5.6.4.5Poetry resource formats

Note that poetry is almost always contained as an inline resource in a Markua document. It only has a file extension for completeness. The .text file extension is used, since .txt was used for unformatted code, and since anything else would not have the appropriate behaviour for authors when trying to open the files locally by double-clicking.

5.6.4.6Video resource formats

Note that .mp4 is used for MP4 video, not MP4 AAC audio.

5.6.4.7Code resource formats

Note that most code languages are not specified, and are guessed at with a format of guess. A Markua Processor typically leaves this decision about the language recognized to the syntax highlighter, such as Pygments.
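
The inference rules in this section can be drawn together in a short sketch. The Python below is illustrative only and rests on several assumptions: the function name infer_type_and_format is hypothetical, the extension table contains only the extensions mentioned in the prose of this section, format names other than text and guess are left as None because they are not given here, and the domain table merely mirrors the examples that a Markua Processor may (not must) use. The rule that a specified format determines the type is not implemented.

```python
import os
from urllib.parse import urlparse

# Only the extensions named in the prose of this section; formats not stated there are None.
EXTENSION_TYPES = {
    ".txt": ("code", "text"),
    ".tex": ("math", None),
    ".asciimath": ("math", None),
    ".md": ("table", None),
    ".text": ("poetry", None),
    ".mp4": ("video", None),
}

# Optional domain hints, taken from the examples given for web resources.
DOMAIN_TYPES = {"youtube.com": "video", "instagram.com": "image", "github.com": "code"}

def infer_type_and_format(location, attrs=None):
    """Infer (type, format) for a local path or web URL, then apply overrides."""
    attrs = attrs or {}
    parsed = urlparse(location)
    is_web = parsed.scheme in ("http", "https")
    ext = os.path.splitext(parsed.path)[1].lower()  # extensions are case-insensitive
    if ext in EXTENSION_TYPES:
        rtype, rformat = EXTENSION_TYPES[ext]
    else:
        host = parsed.hostname or ""
        if host.startswith("www."):
            host = host[4:]
        # Unrecognized extensions fall back to code/guess; web resources may use the domain.
        rtype = DOMAIN_TYPES.get(host, "code") if is_web and not ext else "code"
        rformat = "guess" if rtype == "code" else None
    # Values in the attribute list override whatever was inferred.
    return attrs.get("type", rtype), attrs.get("format", rformat)
```

For instance, a .tex file that should be displayed as LaTeX source rather than rendered as math would be passed with the attributes type: code and format: latex, and the override would win.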

5.7Indented code blocks

An indented code block is composed of one or more indented chunks separated by blank lines. An indented chunk is a sequence of non-blank lines, each indented four or more spaces. The contents of the code block are the literal contents of the lines, including trailing line endings, minus four spaces of indentation. An indented code block has no info string.

An indented code block cannot interrupt a paragraph, so there must be a blank line between a paragraph and a following indented code block. (A blank line is not needed, however, between a code block and a following paragraph.)

Example 116
    a simple
      indented code block
<pre><code>a simple
  indented code block
</code></pre>

If there is any ambiguity between an interpretation of indentation as a code block and as indicating that material belongs to a list item, the list item interpretation takes precedence:

Example 117
  - foo

    bar
<ul>
<li>
<p>foo</p>
<p>bar</p>
</li>
</ul>
Example 118
1.  foo

    - bar
<ol>
<li>
<p>foo</p>
<ul>
<li>bar</li>
</ul>
</li>
</ol>

The contents of a code block are literal text, and do not get parsed as Markdown:

Example 119
    <a/>
    *hi*

    - one
<pre><code>&lt;a/&gt;
*hi*

- one
</code></pre>

Here we have three chunks separated by blank lines:

Example 120
    chunk1

    chunk2
  
 
 
    chunk3
<pre><code>chunk1

chunk2



chunk3
</code></pre>

Any initial spaces beyond four will be included in the content, even in interior blank lines:

Example 121
    chunk1
      
      chunk2
<pre><code>chunk1
  
  chunk2
</code></pre>

An indented code block cannot interrupt a paragraph. (This allows hanging indents and the like.)

Example 122
Foo
    bar
<p>Foo
bar</p>

However, any non-blank line with fewer than four leading spaces ends the code block immediately. So a paragraph may occur immediately after indented code:

Example 123
    foo
bar
<pre><code>foo
</code></pre>
<p>bar</p>

And indented code can occur immediately before and after other kinds of blocks:

Example 124
# Heading
    foo
Heading
------
    foo
----
<h1>Heading</h1>
<pre><code>foo
</code></pre>
<h2>Heading</h2>
<pre><code>foo
</code></pre>
<hr />

The first line can be indented more than four spaces:

Example 125
        foo
    bar
<pre><code>    foo
bar
</code></pre>

Blank lines preceding or following an indented code block are not included in it:

Example 126

    
    foo
    
<pre><code>foo
</code></pre>

Trailing spaces are included in the code block’s content:

Example 127
    foo  
<pre><code>foo  
</code></pre>

5.8Fenced code blocks

A code fence is a sequence of at least three consecutive backtick characters (`) or tildes (~). (Tildes and backticks cannot be mixed.) A fenced code block begins with a code fence, indented no more than three spaces.

The line with the opening code fence may optionally contain some text following the code fence; this is trimmed of leading and trailing whitespace and called the info string. If the info string comes after a backtick fence, it may not contain any backtick characters. (The reason for this restriction is that otherwise some inline code would be incorrectly interpreted as the beginning of a fenced code block.)

The content of the code block consists of all subsequent lines, until a closing code fence of the same type as the code block began with (backticks or tildes), and with at least as many backticks or tildes as the opening code fence. If the leading code fence is indented N spaces, then up to N spaces of indentation are removed from each line of the content (if present). (If a content line is not indented, it is preserved unchanged. If it is indented less than N spaces, all of the indentation is removed.)

The closing code fence may be indented up to three spaces, and may be followed only by spaces, which are ignored. If the end of the containing block (or document) is reached and no closing code fence has been found, the code block contains all of the lines after the opening code fence until the end of the containing block (or document). (An alternative spec would require backtracking in the event that a closing code fence is not found. But this makes parsing much less efficient, and there seems to be no real down side to the behavior described here.)

A fenced code block may interrupt a paragraph, and does not require a blank line either before or after.

The content of a code fence is treated as literal text, not parsed as inlines. The first word of the info string is typically used to specify the language of the code sample, and rendered in the class attribute of the code tag. However, this spec does not mandate any particular treatment of the info string.
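
The fence rules described above can be sketched with two regular expressions and a small helper. This is a minimal Python illustration rather than a reference parser: the function names are hypothetical, lines are assumed to be passed without their trailing line ending, and container blocks (block quotes, list items) are not handled.

```python
import re

# Up to three spaces of indentation, then at least three backticks or three tildes.
OPENING_FENCE = re.compile(r"^( {0,3})(`{3,}|~{3,})(.*)$")
CLOSING_FENCE = re.compile(r"^ {0,3}(`{3,}|~{3,}) *$")

def parse_opening_fence(line):
    """Return (indent, fence_char, fence_length, info_string) or None."""
    m = OPENING_FENCE.match(line)
    if m is None:
        return None
    indent, fence, rest = m.groups()
    info = rest.strip()
    # An info string after a backtick fence may not contain backticks.
    if fence[0] == "`" and "`" in info:
        return None
    return len(indent), fence[0], len(fence), info

def is_closing_fence(line, fence_char, fence_length):
    """True if `line` closes a block opened with fence_char repeated fence_length times."""
    m = CLOSING_FENCE.match(line)
    return bool(m) and m.group(1)[0] == fence_char and len(m.group(1)) >= fence_length

def strip_fence_indent(line, indent):
    """Remove up to `indent` leading spaces from a content line."""
    i = 0
    while i < indent and i < len(line) and line[i] == " ":
        i += 1
    return line[i:]
```

The examples that follow in this section exercise exactly these cases: same fence character, a closing fence at least as long as the opening one, no info string on the closing fence, and indentation removal relative to the opening fence.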

Here is a simple example with backticks:

Example 128
```
<
 >
```
<pre><code>&lt;
 &gt;
</code></pre>

With tildes:

Example 129
~~~
<
 >
~~~
<pre><code>&lt;
 &gt;
</code></pre>

Fewer than three backticks is not enough:

Example 130
``
foo
``
<p><code>foo</code></p>

The closing code fence must use the same character as the opening fence:

Example 131
```
aaa
~~~
```
<pre><code>aaa
~~~
</code></pre>
Example 132
~~~
aaa
```
~~~
<pre><code>aaa
```
</code></pre>

The closing code fence must be at least as long as the opening fence:

Example 133
````
aaa
```
``````
<pre><code>aaa
```
</code></pre>
Example 134
~~~~
aaa
~~~
~~~~
<pre><code>aaa
~~~
</code></pre>

Unclosed code blocks are closed by the end of the document (or the enclosing block quote or list item):

Example 135
```
<pre><code></code></pre>
Example 136
`````

```
aaa
<pre><code>
```
aaa
</code></pre>
Example 137
> ```
> aaa

bbb
<blockquote>
<pre><code>aaa
</code></pre>
</blockquote>
<p>bbb</p>

A code block can have all empty lines as its content:

Example 138
```

  
```
<pre><code>
  
</code></pre>

A code block can be empty:

Example 139
```
```
<pre><code></code></pre>

Fences can be indented. If the opening fence is indented, content lines will have equivalent opening indentation removed, if present:

Example 140
 ```
 aaa
aaa
```
<pre><code>aaa
aaa
</code></pre>
Example 141
  ```
aaa
  aaa
aaa
  ```
<pre><code>aaa
aaa
aaa
</code></pre>
Example 142
   ```
   aaa
    aaa
  aaa
   ```
<pre><code>aaa
 aaa
aaa
</code></pre>

Four spaces indentation produces an indented code block:

Example 143
    ```
    aaa
    ```
<pre><code>```
aaa
```
</code></pre>

Closing fences may be indented by 0-3 spaces, and their indentation need not match that of the opening fence:

Example 144
```
aaa
  ```
<pre><code>aaa
</code></pre>
Example 145
   ```
aaa
  ```
<pre><code>aaa
</code></pre>

This is not a closing fence, because it is indented 4 spaces:

Example 146
```
aaa
    ```
<pre><code>aaa
    ```
</code></pre>

Code fences (opening and closing) cannot contain internal spaces:

Example 147
``` ```
aaa
<p><code> </code>
aaa</p>
Example 148
~~~~~~
aaa
~~~ ~~
<pre><code>aaa
~~~ ~~
</code></pre>

Fenced code blocks can interrupt paragraphs, and can be followed directly by paragraphs, without a blank line between:

Example 149
foo
```
bar
```
baz
<p>foo</p>
<pre><code>bar
</code></pre>
<p>baz</p>

Other blocks can also occur before and after fenced code blocks without an intervening blank line:

Example 150
foo
---
~~~
bar
~~~
# baz
<h2>foo</h2>
<pre><code>bar
</code></pre>
<h1>baz</h1>

An info string can be provided after the opening code fence. Although this spec doesn’t mandate any particular treatment of the info string, the first word is typically used to specify the language of the code block. In HTML output, the language is normally indicated by adding a class to the code element consisting of language- followed by the language name.

Example 151
```ruby
def foo(x)
  return 3
end
```
<pre><code class="language-ruby">def foo(x)
  return 3
end
</code></pre>
Example 152
~~~~    ruby startline=3 $%@#$
def foo(x)
  return 3
end
~~~~~~~
<pre><code class="language-ruby">def foo(x)
  return 3
end
</code></pre>
Example 153
````;
````
<pre><code class="language-;"></code></pre>

Info strings for backtick code blocks cannot contain backticks:

Example 154
``` aa ```
foo
<p><code>aa</code>
foo</p>

Info strings for tilde code blocks can contain backticks and tildes:

Example 155
~~~ aa ``` ~~~
foo
~~~
<pre><code class="language-aa">foo
</code></pre>

Closing code fences cannot have info strings:

Example 156
```
``` aaa
```
<pre><code>``` aaa
</code></pre>

5.9Code resources (M)

Code can be a local, web or inline resource, just like any other resource, and the same resource syntax applies to code as to all other resources.

As discussed, code cannot have alt text. It’s just text. If any alt text is provided for a code resource, it is ignored.

Markua specifies only one specific file extension to be associated with a type of code: the .txt extension, which is for the format of text. However, Markua Processors must interpret all unspecified file extensions as specifying a resource of type code with a format of guess.

Regardless of whether syntax highlighting is supported and the programming language is detected, all code must be formatted as monospaced text by Markua Processors.

The text format means that no syntax highlighting is applied: the code is output as plain monospaced text.

The guess format is a request for the Markua Processor to guess at the programming language based on the file extension and/or the syntax of the code itself. Then, if the detected language corresponds to a particular programming language which the Markua Processor recognizes, and if the Markua Processor supports syntax highlighting, then it can format the resource as nicely syntax-highlighted code. Syntax highlighting is entirely optional in Markua Processors. If a Markua Processor does not support syntax highlighting, and/or if it cannot detect a matching supported programming language, then it must format the code as though the format was text–i.e. to format it as unformatted monospaced text.

Besides the text and guess values of the format attribute, you can also specify the programming language by setting the format attribute to a specific programming language. This is more reliable than guess. Unlike other resource types, Markua does not specify the complete set of the values of the format attribute–there are so many programming languages in the world, and new ones are added so frequently, that doing so would be impractical.

However, while a complete set of the values of the format attribute is not specified, Markua does specify the console value of the format attribute to indicate console input. A Markua Processor should format console input as such. (For example, Leanpub uses the open source Pygments library for its code formatting, and Pygments handles console format correctly, so Leanpub gets this for free.)

The default value of the format attribute for code is complex:

  1. For code which is inserted as a span (which is only supported with inline resources), the default format is text.
  2. For code which is inserted as a figure from an inline resource delimited by three tildes, the default format is text.
  3. For all other code, the default format is guess. This includes local and web resources inserted as figures, and code inserted as an inline figure using three backticks.

Note that the default format can be overridden by specifying it via an attribute list, or after the three backticks in syntactic sugar.

5.9.1Supported Attributes for Code

The following are the supported attributes for code resources, in addition to the class, format, title and type attributes which all resources support.

line-numbers : This determines whether the code sample shows line numbers. Legal values are true or false. The default value is false. Any value other than true is interpreted as false.

number-from : If line numbers are shown, this lets you override the starting number of the line numbers. The default value is 1.

crop-start : Sometimes it’s desirable to only show part of a code resource defined in an external file as the code example. The crop-start and crop-end attributes let you accomplish this. The crop-start attribute defines the line which represents the first line included from the resource. For example, {crop-start: 10, crop-end: 15, line-numbers: true, number-from: 10} ensures that lines 10-15 are shown and are numbered as lines 10-15. The default value is 1, which is the first line of the file.

crop-end : This attribute ends the range started with crop-start. The default value of crop-end is to be omitted, which is equivalent to specifying the last line of the file.
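
The interaction of these four attributes can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the function name render_code_lines is hypothetical, attribute values are assumed to arrive as strings in a dictionary, and returning (number, text) pairs is a choice made for the example, since Markua does not specify how line numbers are presented.

```python
def render_code_lines(source, attrs=None):
    """Apply crop-start, crop-end, line-numbers and number-from to `source`.

    Returns a list of (number, text) pairs; number is None when line numbers
    are not shown.
    """
    attrs = attrs or {}
    lines = source.splitlines()
    crop_start = int(attrs.get("crop-start", 1))        # default: first line
    crop_end = int(attrs.get("crop-end", len(lines)))   # default: last line
    selected = lines[crop_start - 1:crop_end]
    # Any value other than true is interpreted as false.
    if attrs.get("line-numbers") != "true":
        return [(None, text) for text in selected]
    number_from = int(attrs.get("number-from", 1))
    return [(number_from + i, text) for i, text in enumerate(selected)]
```

Note that number-from is independent of crop-start: cropping to lines 10-15 without setting number-from would number the output from 1, which is why the example attribute list above sets both.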

5.9.1.1Default Value of the format attribute in Inline Code Samples

The default value of the format attribute for a code resource inserted as a figure varies based on context.

If the code resource is a local or web resource, it defaults to guess.

If the code resource is an inline resource, the default varies based on the delimiter, and whether the code is inserted as a span or as a block.

With three backticks the default format is guess, and with three tildes, the default format is text. This way, you can vary the default without having to type an attribute list: if you want the code language guessed at, use backticks; if you don’t, use tildes. Of course, you can specify any attributes you wish with either delimiter, and specified attributes override default ones. The only reason there are different defaults is to make things easier to type. Programmers refer to such niceties as “syntactic sugar”.

The default format of block code resources inserted with three backticks can be overridden from guess to some other value by setting the default-code-language attribute on the entire Markua document. (This attribute has no effect on resources inserted with three tildes.) The default format of code resources inserted as spans with single backticks can be overridden from text to some other value by setting the default-code-span-language attribute on the entire Markua document.
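
The defaults described in this section can be summarized as a decision function. The Python below is a sketch only: the function name default_code_format and its parameters are hypothetical, document-level attributes are assumed to be available as a dictionary, and indented code blocks are deliberately left out because their default is not stated here.

```python
def default_code_format(location, insertion, fence=None, doc_attrs=None):
    """Return the default format for a code resource.

    location:  "local", "web" or "inline"
    insertion: "span" or "figure"
    fence:     "`" or "~" for inline figures, otherwise None
    doc_attrs: document-level attributes, if any
    """
    doc_attrs = doc_attrs or {}
    if insertion == "span":
        # Spans are only supported with inline resources; the default is text.
        return doc_attrs.get("default-code-span-language", "text")
    if location in ("local", "web"):
        return "guess"
    if fence == "~":
        # default-code-language has no effect on tilde fences.
        return "text"
    if fence == "`":
        return doc_attrs.get("default-code-language", "guess")
    raise ValueError("unsupported combination for this sketch")
```

An explicit format in an attribute list, or after the three backticks, would be applied after this default and would override it.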

5.9.2Local Code Resources

Local code resources can be inserted as a figure.

This first figure will be a type of code and a format of guess. A Markua Processor which associates .rb file extensions with Ruby code will treat this as Ruby code; a Markua Processor which has no association for .rb files will treat it as plain text:

Here's a paragraph before the figure.

![](hello.rb "Hello World in Ruby")

Here's a paragraph after the figure.

That is equivalent to:

Here's a paragraph before the figure.

{format: guess}
![](hello.rb "Hello World in Ruby")

Here's a paragraph after the figure.

If you don’t want to take chances you can do this:

Here's a paragraph before the figure.

{format: ruby}
![](hello.rb "Hello World in Ruby")

Here's a paragraph after the figure.

Note that the title is optional in all figures:

Here's a paragraph before the figure.

![](hello.rb)

Here's a paragraph after the figure.

5.9.3Web Code Resources

Web code resources function identically to how local code resources work, including the significance of file extensions. The only difference is that the files are on the web.

This will be a type of code and a format of guess, since .rb is not one of the file extensions which Markua specifies:

![](http://markua.com/hello.rb "Hello World in Ruby")

That is equivalent to:

{format: guess}
![](http://markua.com/hello.rb "Hello World in Ruby")

If you don’t want to take chances you can do this:

{format: ruby}
![](http://markua.com/hello.rb "Hello World in Ruby")

5.9.4Inline Code Resources

Inline code resources are the most flexible way to insert code. They are the only way to insert code as a span resource, and the most straightforward way to add short code examples as figures.

The great thing about inline code resources, either as spans or figures, is that they work the same way as they do in CommonMark and GFM, with small additions by Markua.

5.9.4.1No Attribute Lists or Format Specifiers on Indented Code Blocks

Indented code blocks are supported for compatibility with CommonMark and GFM. However, no attribute lists or format specifiers can be used. If you want to use them, use a fenced code block.

5.9.4.2Attribute Lists and Format Specifiers on Fenced Code Blocks

Fenced code blocks, discussed earlier, are how to insert inline code resources as figures. These can have attribute lists or format specifiers.

This will be a type of code and a format of guess since three backticks are used and since the format is not specified:

Some paragraph.

```
puts "hello"
```

Some paragraph.

That is equivalent to:

Some paragraph.

```guess
puts "hello"
```

Some paragraph.

If you don’t want to take chances you can do this to explicitly specify the format:

Some paragraph.

```ruby
puts "hello"
```

Some paragraph.

This Ruby code may be formatted as such if the Markua Processor understands ruby. If not, the ruby format will be ignored.

If you don’t like syntactic sugar you can do:

Some paragraph.

{format: ruby}
```
puts "hello"
```

Some paragraph.

If you want a figure title, you can add it to the attribute list with any of the above. For example:

Some paragraph.

{title: "Hello World in Ruby"}
```ruby
puts "hello"
```

Some paragraph.

Finally, if you want the code to definitely not get syntax highlighted, you can force format to be text in one of two ways.

First, you can set it explicitly:

Some paragraph.

```text
puts "hello"
```

Some paragraph.

Second, you can use three tildes instead of three backticks, since the default with tildes is text not guess:

Some paragraph.

~~~
puts "hello"
~~~

Some paragraph.

As discussed previously, console input and output should be formatted as such by a Markua Processor:

```console
$ git init
Initialized empty Git repository in /path/to/repo
```

Finally, it’s important to note that when you are writing about other inline formats, such as SVG or AsciiMath, what you are really doing is creating a code resource. This is shown in the sections below, which discuss SVG and AsciiMath, but this applies more broadly.

5.9.4.3Attribute Lists on Code Spans

Code spans are discussed later in the spec. While code spans are done just as in CommonMark and GFM, Markua adds one thing: attribute lists to specify the format. This serves as a hint to any syntax highlighter:

Hello World in Ruby is a simple `puts "hello world"`{format: ruby} statement.

5.9.5Marking Code as Added or Deleted

Markua supports marking code as added or deleted, which can be helpful if you are writing a computer programming book and want to indicate what code should be added to or removed from a larger program.

The way to do this is to add special comment lines to your code.

The magic words are markua-start-insert, markua-end-insert, markua-start-delete and markua-end-delete. Any line containing one of those words will be removed completely by a Markua Processor before being inserted into the output.

The Markua Processor will then be able to determine which code is being deleted or inserted, and format it accordingly. The recommended way for a Markua Processor to do this is to format inserted code in bold and deleted code in ~~strikethrough~~.

Finally, while syntax highlighting is optional in a Markua Processor, a Markua Processor which does support syntax highlighting is allowed to skip it when any of the special markua-* comments are present. Syntax highlighting may make it harder to notice the added and removed code, if they are formatted with bold and strikethrough respectively.
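
The handling of the magic words can be sketched as a single pass over the lines of a code resource. This Python sketch is illustrative only: the function name mark_changes and the "unchanged", "inserted" and "deleted" labels are hypothetical, and nested or overlapping regions are assumed not to occur, since their behaviour is not described here.

```python
def mark_changes(source):
    """Return (status, line) pairs with all markua-* marker lines removed."""
    status = "unchanged"
    marked = []
    for line in source.splitlines():
        # Any line containing a magic word is dropped from the output entirely.
        if "markua-start-insert" in line:
            status = "inserted"
        elif "markua-start-delete" in line:
            status = "deleted"
        elif "markua-end-insert" in line or "markua-end-delete" in line:
            status = "unchanged"
        else:
            marked.append((status, line))
    return marked
```

A Markua Processor could then render lines labelled "inserted" in bold and lines labelled "deleted" in strikethrough, following the recommendation above.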

5.9.6Line Wrapping in Code Resources

Code resources should have newlines added by the author to ensure that automatic line wrapping is not relied upon. Markua Processors may wrap lines to ensure that all code is visible on a page, and may add continuation characters (like the backslash \ character) in the output to indicate that a line has been automatically wrapped. However, adding a continuation character is not a requirement, nor is the choice of which continuation character is used.
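
As an illustration of automatic wrapping with a continuation character, here is a minimal Python sketch. The function name wrap_code_line, the fixed character width, and the hard break at a column boundary are all assumptions made for the example; a Markua Processor is free to wrap differently or to add no continuation character at all.

```python
def wrap_code_line(line, width, continuation="\\"):
    """Split `line` into pieces no wider than `width`.

    Every piece except the last ends with the continuation character.
    """
    if width <= len(continuation):
        raise ValueError("width must leave room for the continuation character")
    if len(line) <= width:
        return [line]
    chunk = width - len(continuation)  # reserve room for the continuation character
    pieces = [line[i:i + chunk] for i in range(0, len(line), chunk)]
    return [piece + continuation for piece in pieces[:-1]] + [pieces[-1]]
```

Because such wrapping can break code in the middle of a token, authors are better served by adding their own newlines, as recommended above.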

5.10Poetry (M)

Poetry can be a local, web or inline resource, just like any other resource, and the same resource syntax applies to poetry as to all other resources.

Poetry can only be inserted as a figure. It cannot be inserted as a span resource, like code can. (When you insert poetry in a sentence, nothing is needed except quotes, so this makes sense.)

Markua specifies only one specific file extension to be associated with a type of poetry: the .text extension.

Poetry is different from code resources in one important way: in poetry, you can use Markua text formatting (bold, italic, strikethrough, etc.) to format your text. This is true regardless of whether the poetry is formatted with a proportional or monospaced font.

5.10.1Supported attributes for poetry

The following are the supported attributes for poetry resources, in addition to the class, format, title and type attributes which all resources support.

monospaced : true or false. The default is false, except for a special case with inline poetry resources, discussed below. If true, the Markua Processor must use a monospaced font to output the poetry. If false, the Markua Processor may use whatever font (proportional or monospaced) it is configured to use to output poetry.

5.10.2Poetry, whitespace and newlines

Regardless of the value of the monospaced attribute, all whitespace inside a poetry resource is respected. This includes all spaces and newlines, regardless of whether they are leading, internal or trailing spaces, and regardless of the number of consecutive newlines.

(The whitespace is preserved, but it is rendered using either a proportional font or a monospaced one based on the value of the monospaced attribute.)

Poetry resources should have newlines added by the author to ensure that automatic line wrapping is not relied upon.

Markua Processors may wrap lines to ensure that all poetry is visible on a page, and may add continuation characters (like the backslash \ character) in the output to indicate that a line has been automatically wrapped by the Markua Processor.

Obviously, seeing a continuation character in poetry is terrible, so the author should consider this a mistake that needs to be fixed by manually line wrapping.

5.10.3Local poetry resources

Local poetry resources look like many other resources inserted as figures.

Here's a Shakespearean sonnet:

![](sonnet130.text "Sonnet 130")

Here's an E. E. Cummings poem:

{monospaced: true}
![](iwillbe.text "I Will Be")

If you don’t want to rely on the .text extension, you can specify the type and the monospaced attribute explicitly:

Here's a Shakespearean sonnet:

{type: poetry, monospaced: false}
![](sonnet130.text "Sonnet 130")

Here's an E. E. Cummings poem:

{type: poetry, monospaced: true}
![](iwillbe.text "I Will Be")

5.10.4Web poetry resources

Web poetry resources work identically to local poetry resources, including the significance of file extensions. The only difference is that the files are on the web.

Here's a Shakespearean sonnet:

![](http://markua.com/sonnet130.text "Sonnet 130")

Here's a paragraph after the figure.

You can also specify the type and the monospaced attribute explicitly:

Here's a Shakespearean sonnet:

{type: poetry, monospaced: false}
![](http://markua.com/sonnet130.text "Sonnet 130")

Here's a paragraph after the figure.

5.10.5Inline poetry resources

To create an inline poetry resource, you use the fenced code block syntax, and you either use an attribute list or one of two types of syntactic sugar.

& : This is equivalent to {type: poetry, monospaced: false}

* : This is equivalent to {type: poetry, monospaced: true}

Here’s a portion of Shakespearean Sonnet 130, using & to indicate that monospaced is false:

Foo

```&
I grant I never saw a goddess go;
My mistress when she walks treads on the ground.
    And yet, by heaven, I think my love as rare
    As any she belied with false compare.
```

Bar
<p>Foo</p>
<div class="poetry">
<p>I grant I never saw a goddess go;<br />
My mistress when she walks treads on the ground.<br />
&nbsp;&nbsp;&nbsp;&nbsp;And yet, by heaven, I think my love as rare<br />
&nbsp;&nbsp;&nbsp;&nbsp;As any she belied with false compare.</p>
</div>
<p>Bar</p>

Yes, manual indentation of the last two lines of the sonnet is done with &nbsp;, since that’s what they are: non-breaking spaces. Also, note that this sonnet didn’t use any formatting characters to make things bold or italic, since Shakespeare spent his time writing, not formatting. (If Shakespeare had used a computer, I’d like to think he would have written in Markua.)

Here’s a portion of a “poem” using * to indicate that monospaced is true:

Foo

```*
i
  *am*
      a        cat
            a        **l a z y**          cat

me   ow
```

Bar
<p>Foo</p>
<div class="poetry monospaced">
<p>i<br />
&nbsp;&nbsp;<em>am</em><br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;a&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;cat<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;a&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<strong>l a z y</strong>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;cat<br />
<br />
me&nbsp;&nbsp;&nbsp;ow</p>
</div>
<p>Bar</p>

No, the &nbsp; isn’t very poetic. Neither is my poem.
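
The whitespace handling shown in the two examples above can be sketched in a few lines of Python. This is only a rough illustration, not how any particular Markua Processor works: the names POETRY_SUGAR and render_inline_poetry are made up for this sketch, the rule "leading, trailing and repeated spaces become &nbsp;" is inferred from the example output, and Markua text formatting (such as *am*) is deliberately left unparsed.

```python
import html
import re

# Hypothetical table for the two kinds of syntactic sugar described above.
POETRY_SUGAR = {
    "&": {"type": "poetry", "monospaced": False},
    "*": {"type": "poetry", "monospaced": True},
}


def _nbsp(match):
    # One non-breaking space entity per space in the matched run.
    return "&nbsp;" * len(match.group(0))


def render_inline_poetry(info, body):
    """Render poetry text to HTML, preserving spaces and newlines.

    Assumption: leading spaces, trailing spaces and runs of two or more
    spaces become &nbsp; entities; single spaces between words are kept.
    Inline formatting such as *italic* is NOT handled in this sketch.
    """
    attrs = POETRY_SUGAR[info]
    lines = []
    for line in body.split("\n"):
        escaped = html.escape(line, quote=False)
        lines.append(re.sub(r"^ +| {2,}| +$", _nbsp, escaped))
    css = "poetry monospaced" if attrs["monospaced"] else "poetry"
    return '<div class="{}">\n<p>{}</p>\n</div>'.format(css, "<br />\n".join(lines))
```

Calling render_inline_poetry("&", ...) with the four sonnet lines should reproduce the div shown in the first example, since that example contains no formatting characters.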

5.10.6But what if I want every character taken literally?

If you want to type poetry where every character is taken exactly literally, use a code block with a format of text for that:

```text
*this* isn't italic
and **this** is not bold
* * * cherry blossoms * * *
```
<pre><code>*this* isn't italic
and **this** is not bold
* * * cherry blossoms * * *
</code></pre>

5.11No HTML Blocks (M)

Markdown and CommonMark support HTML blocks, which are groups of lines that are treated as raw HTML (and will not be escaped in HTML output).

Markua does not support HTML blocks. Instead it treats HTML as text to be rendered.

The way it does this in HTML is by replacing a leading < with the entity &lt;.

<h1>Hello</h1>

<p>Hello world!</p>
<p>&lt;h1>Hello&lt;/h1></p>

<p>&lt;p>Hello world!&lt;/p></p>
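
As a minimal sketch of the rule above, here is the replacement written in Python. The function name render_html_as_text is hypothetical, and the sketch follows only the one rule stated in this section (each < becomes &lt;); a real Markua Processor would presumably also need to deal with other special characters.

```python
def render_html_as_text(paragraph):
    """Treat HTML in the source as plain paragraph text.

    Only '<' is replaced, as described in this section; '>' is left alone.
    """
    return "<p>" + paragraph.replace("<", "&lt;") + "</p>"
```

With this sketch, the input line `<p>Hello world!</p>` comes out as a paragraph whose text is the literal tag, rather than as raw HTML.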

Markua and Markdown have different use cases. Markdown is a better way to write HTML; Markua is a better way to write a book or course.

HTML is just one possible output format of Markua, and other possible Markua output formats (such as LaTeX and PDF) are not based on HTML. If HTML blocks were supported, a Markua Processor would have to support parsing and meaningfully outputting all of HTML syntax as well as all of Markua syntax.

Since Markdown’s only output format target is HTML, it might as well support HTML blocks: generating HTML from HTML is as simple as passing the HTML through. From an implementation perspective, Markdown gets inline HTML support for free. This is not true in the case of Markua.

5.12Link reference definitions

A link reference definition consists of a link label, indented up to three spaces, followed by a colon (:), optional whitespace (including up to one line ending), a link destination, optional whitespace (including up to one line ending), and an optional link title, which if it is present must be separated from the link destination by whitespace. No further non-whitespace characters may occur on the line.

A link reference definition does not correspond to a structural element of a document. Instead, it defines a label which can be used in reference links and reference-style images elsewhere in the document. Link reference definitions can come either before or after the links that use them.

Example 161
[foo]: /url "title"

[foo]
<p><a href="/url" title="title">foo</a></p>
Example 162
   [foo]: 
      /url  
           'the title'  

[foo]
<p><a href="/url" title="the title">foo</a></p>
Example 163
[Foo*bar\]]:my_(url) 'title (with parens)'

[Foo*bar\]]
<p><a href="my_(url)" title="title (with parens)">Foo*bar]</a></p>
Example 164
[Foo bar]:
<my url>
'title'

[Foo bar]
<p><a href="my%20url" title="title">Foo bar</a></p>

The title may extend over multiple lines:

Example 165
[foo]: /url '
title
line1
line2
'

[foo]
<p><a href="/url" title="
title
line1
line2
">foo</a></p>

However, it may not contain a blank line:

Example 166
[foo]: /url 'title

with blank line'

[foo]
<p>[foo]: /url 'title</p>
<p>with blank line'</p>
<p>[foo]</p>

The title may be omitted:

Example 167
[foo]:
/url

[foo]
<p><a href="/url">foo</a></p>

The link destination may not be omitted:

Example 168
[foo]:

[foo]
<p>[foo]:</p>
<p>[foo]</p>

However, an empty link destination may be specified using angle brackets:

Example 169
[foo]: <>

[foo]
<p><a href="">foo</a></p>

The title must be separated from the link destination by whitespace:

Example 170
[foo]: <bar>(baz)

[foo]
<p>[foo]: <bar>(baz)</p>
<p>[foo]</p>

Both title and destination can contain backslash escapes and literal backslashes:

Example 171
[foo]: /url\bar\*baz "foo\"bar\baz"

[foo]
<p><a href="/url%5Cbar*baz" title="foo&quot;bar\baz">foo</a></p>

A link can come before its corresponding definition:

Example 172
[foo]

[foo]: url
<p><a href="url">foo</a></p>

If there are several matching definitions, the first one takes precedence:

Example 173
[foo]

[foo]: first
[foo]: second
<p><a href="first">foo</a></p>

As noted in the section on Links, matching of labels is case-insensitive (see matches).

Example 174
[FOO]: /url

[Foo]
<p><a href="/url">Foo</a></p>
Example 175
[ΑΓΩ]: /φου

[αγω]
<p><a href="/%CF%86%CE%BF%CF%85">αγω</a></p>
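
The "first definition wins" and case-insensitive matching rules can be sketched as a small lookup table. The function names here are illustrative only. The normalization used (case folding plus collapsing whitespace) follows the label matching rule in the CommonMark Spec; parsing the definitions themselves is out of scope for this sketch.

```python
def normalize_label(label):
    """Normalize a link label for matching: collapse whitespace, case fold."""
    return " ".join(label.split()).casefold()


def collect_definitions(pairs):
    """Build a lookup table from (label, destination) pairs.

    setdefault keeps the first destination seen for a label, so later
    duplicate definitions are ignored.
    """
    definitions = {}
    for label, destination in pairs:
        definitions.setdefault(normalize_label(label), destination)
    return definitions


def resolve(definitions, label):
    """Return the destination for a reference label, or None if undefined."""
    return definitions.get(normalize_label(label))
```

Because the table is built before any link is resolved, a reference can appear before or after its definition, as in Example 172.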

Here is a link reference definition with no corresponding link. It contributes nothing to the document.

Example 176
[foo]: /url

Here is another one:

Example 177
[
foo
]: /url
bar
<p>bar</p>

This is not a link reference definition, because there are non-whitespace characters after the title:

Example 178
[foo]: /url "title" ok
<p>[foo]: /url &quot;title&quot; ok</p>

This is a link reference definition, but it has no title:

Example 179
[foo]: /url
"title" ok
<p>&quot;title&quot; ok</p>

This is not a link reference definition, because it is indented four spaces:

Example 180
    [foo]: /url "title"

[foo]
<pre><code>[foo]: /url &quot;title&quot;
</code></pre>
<p>[foo]</p>

This is not a link reference definition, because it occurs inside a code block:

Example 181
```
[foo]: /url
```

[foo]
<pre><code>[foo]: /url
</code></pre>
<p>[foo]</p>

A link reference definition cannot interrupt a paragraph.

Example 182
Foo
[bar]: /baz

[bar]
<p>Foo
[bar]: /baz</p>
<p>[bar]</p>

However, it can directly follow other block elements, such as headings and thematic breaks, and it need not be followed by a blank line.

Example 183
# [Foo]
[foo]: /url
> bar
<h1><a href="/url">Foo</a></h1>
<blockquote>
<p>bar</p>
</blockquote>
Example 184
[foo]: /url
bar
===
[foo]
<h1>bar</h1>
<p><a href="/url">foo</a></p>
Example 185
[foo]: /url
===
[foo]
<p>===
<a href="/url">foo</a></p>

Several link reference definitions can occur one after another, without intervening blank lines.

Example 186
[foo]: /foo-url "foo"
[bar]: /bar-url
  "bar"
[baz]: /baz-url

[foo],
[bar],
[baz]
<p><a href="/foo-url" title="foo">foo</a>,
<a href="/bar-url" title="bar">bar</a>,
<a href="/baz-url">baz</a></p>

Link reference definitions can occur inside block containers, like lists and block quotations. They affect the entire document, not just the container in which they are defined:

Example 187
[foo]

> [foo]: /url
<p><a href="/url">foo</a></p>
<blockquote>
</blockquote>

Whether something is a link reference definition is independent of whether the link reference it defines is used in the document. Thus, for example, the following document contains just a link reference definition, and no visible content:

Example 188
[foo]: /url

5.13Paragraphs

A sequence of non-blank lines that cannot be interpreted as other kinds of blocks forms a paragraph. The contents of the paragraph are the result of parsing the paragraph’s raw content as inlines. The paragraph’s raw content is formed by concatenating the lines and removing initial and final whitespace.

A simple example with two paragraphs:

Example 189
aaa

bbb
<p>aaa</p>
<p>bbb</p>

Paragraphs can contain multiple lines, but no blank lines:

Example 190
aaa
bbb

ccc
ddd
<p>aaa
bbb</p>
<p>ccc
ddd</p>

Multiple blank lines between paragraphs have no effect:

Example 191
aaa


bbb
<p>aaa</p>
<p>bbb</p>

Leading spaces are skipped:

Example 192
  aaa
 bbb
<p>aaa
bbb</p>

Lines after the first may be indented any amount, since indented code blocks cannot interrupt paragraphs.

Example 193
aaa
             bbb
                                       ccc
<p>aaa
bbb
ccc</p>

However, the first line may be indented at most three spaces, or an indented code block will be triggered:

Example 194
   aaa
bbb
<p>aaa
bbb</p>
Example 195
    aaa
bbb
<pre><code>aaa
</code></pre>
<p>bbb</p>

Final spaces are stripped before inline parsing, so a paragraph that ends with two or more spaces will not end with a hard line break:

Example 196
aaa     
bbb     
<p>aaa<br />
bbb</p>
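
The way a paragraph's raw content is formed can be sketched as follows. The function name is hypothetical, and the sketch assumes the lines have already been identified as belonging to one paragraph (so the first line is indented at most three spaces).

```python
def paragraph_raw_content(lines):
    """Join paragraph lines, dropping leading whitespace on each line
    and whitespace at the very end of the paragraph.

    Trailing spaces on interior lines are kept, so inline parsing can
    still turn them into hard line breaks.
    """
    return "\n".join(line.lstrip() for line in lines).rstrip()
```

For Example 196, this keeps the spaces after aaa (which become a hard line break) but removes the ones after bbb.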

5.14Paragraph continuations (M)

In books, you can insert things like lists in the middle of paragraphs.

Ideally, this would be one paragraph. However, it’s not:

foo
  * lorem
  * ipsum
bar
<p>foo</p>
<ul>
<li>lorem</li>
<li>ipsum
bar</li>
</ul>

Here, you lose hard, because “bar” is considered to be part of the same list item as ipsum. I think this is a mistake, but it is more important to preserve Markdown compatibility than to fix this mistake.

Ideally, it would also be possible to insert a list inside a paragraph. However, it’s not:

foo

  * lorem
  * ipsum

bar
<p>foo</p>
<ul>
<li>lorem</li>
<li>ipsum</li>
</ul>
<p>bar</p>

The reason here is that HTML does not support inserting lists in paragraphs. Lists are always siblings to paragraphs, not contained within them.

There is no way to fix this at the HTML tag level, since this is just how HTML works. It would be possible to just construct a fake list with a bunch of CSS and break tags inside a paragraph, but that would be far worse.

From an HTML perspective, you cannot insert lists inside paragraphs. But if you are writing in Markua, you’re not just writing HTML. You can also be generating a book in PDF and other non-HTML formats, which do support lists inside paragraphs just fine.

So, there needs to be some way to indicate that the beginning of a paragraph after something like a list is actually a continuation of the previous paragraph.

Now, this could have been done by just not adding blank lines, only single newlines, but that would involve fighting against existing Markdown behaviour. Worse, it would introduce an incompatibility just to fix something that many people don’t consider broken.

For authors who want a list to appear to be inside a paragraph, Markua does have two solutions. First, you can just add a continued-para class to the subsequent paragraph, via an attribute list:

foo

  * lorem
  * ipsum

{class: continued-para}
bar
<p>foo</p>
<ul>
<li>lorem</li>
<li>ipsum</li>
</ul>
<p class="continued-para">bar</p>

It would then be up to the Markua Processor to use the appropriate CSS to make the paragraph with the continued-para class look like a continuation of the previous paragraph, such as by removing indentation or decreasing leading space.

However, if you are an author who cares about this feature, typing an attribute list whenever you want to continue a paragraph is really obnoxious. So, there needs to be some syntactic sugar.

Here it is:

foo

  * lorem
  * ipsum

^
bar
<p>foo</p>
<ul>
<li>lorem</li>
<li>ipsum</li>
</ul>
<p class="continued-para">bar</p>

A caret (^) on a line by itself above a paragraph functions as a {class: continued-para} attribute list for the following paragraph.
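
Since the caret is pure syntactic sugar, one way to think about it is as a rewrite step before parsing. This is only a sketch under assumptions: the function name is made up, the caret is matched only when it is the entire line, and code blocks (where a lone caret should be left alone) are not considered.

```python
def desugar_carets(lines):
    """Replace each line consisting solely of '^' with the equivalent
    attribute list. Lines that merely contain a caret are untouched."""
    return ["{class: continued-para}" if line == "^" else line for line in lines]
```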

Note that this works for more than just lists. In fact, it has nothing to do with lists, or even with images or other resources. It’s all about the indentation behaviour of a paragraph, typically following some other block. So, you can use this whenever you want:

foo

> a block quote

^
bar
<p>foo</p>
<blockquote>
<p>a block quote</p>
</blockquote>
<p class="continued-para">bar</p>

5.15Blank lines

Blank lines between block-level elements are ignored, except for the role they play in determining whether a list is tight or loose.

Blank lines at the beginning and end of the document are also ignored.

Example 202
  

aaa
  

# aaa

  
<p>aaa</p>
<h1>aaa</h1>

5.16Tables (GFM)

GFM enables the table extension, where an additional leaf block type is available.

A table is an arrangement of data with rows and columns, consisting of a single header row, a delimiter row separating the header from the data, and zero or more data rows.

Each row consists of cells containing arbitrary text, in which inlines are parsed, separated by pipes (|). A leading and trailing pipe is also recommended, both for clarity of reading and to avoid parsing ambiguity. Spaces between pipes and cell content are trimmed. Block-level elements cannot be inserted in a table.

The delimiter row consists of cells whose only content are hyphens (-), and optionally, a leading or trailing colon (:), or both, to indicate left, right, or center alignment respectively.

| foo | bar |
| --- | --- |
| baz | bim |
<table>
<thead>
<tr>
<th>foo</th>
<th>bar</th>
</tr>
</thead>
<tbody>
<tr>
<td>baz</td>
<td>bim</td>
</tr>
</tbody>
</table>

Cells in one column don’t need to match in length, though the table is easier to read if they do. Likewise, use of leading and trailing pipes may be inconsistent:

| abc | defghi |
:-: | -----------:
bar | baz
<table>
<thead>
<tr>
<th align="center">abc</th>
<th align="right">defghi</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center">bar</td>
<td align="right">baz</td>
</tr>
</tbody>
</table>
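
The delimiter row rule can be sketched with a small regular expression. The function name parse_alignments is illustrative, and the cell splitting here is deliberately naive (it ignores escaped pipes).

```python
import re

# A delimiter cell: optional colon, one or more hyphens, optional colon.
_DELIMITER_CELL = re.compile(r"^(:?)-+(:?)$")


def parse_alignments(delimiter_row):
    """Return one alignment per column ('left', 'right', 'center' or None),
    or None if the row is not a valid delimiter row."""
    cells = [c.strip() for c in delimiter_row.strip().strip("|").split("|")]
    alignments = []
    for cell in cells:
        match = _DELIMITER_CELL.match(cell)
        if match is None:
            return None
        left, right = match.group(1) == ":", match.group(2) == ":"
        if left and right:
            alignments.append("center")
        elif left:
            alignments.append("left")
        elif right:
            alignments.append("right")
        else:
            alignments.append(None)
    return alignments
```

A column with no colons gets no align attribute at all, which is why the first table example has plain th and td elements.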

Include a pipe in a cell’s content by escaping it, including inside other inline spans:

| f\|oo  |
| ------ |
| b `\|` az |
| b **\|** im |
<table>
<thead>
<tr>
<th>f|oo</th>
</tr>
</thead>
<tbody>
<tr>
<td>b <code>|</code> az</td>
</tr>
<tr>
<td>b <strong>|</strong> im</td>
</tr>
</tbody>
</table>
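
Splitting a row into cells while honouring escaped pipes might look something like this. It is a sketch with a hypothetical function name; it unescapes \| everywhere in the cell, which matches the example above, and it does not attempt any other inline parsing.

```python
import re


def split_cells(row):
    """Split a table row on unescaped pipes and unescape '\\|' in each cell."""
    row = row.strip()
    if row.startswith("|"):
        row = row[1:]
    # Drop a trailing pipe only if it is a real delimiter, not an escaped one.
    if row.endswith("|") and not row.endswith("\\|"):
        row = row[:-1]
    cells = re.split(r"(?<!\\)\|", row)
    return [cell.strip().replace("\\|", "|") for cell in cells]
```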

The table is broken at the first empty line, or beginning of another block-level structure:

| abc | def |
| --- | --- |
| bar | baz |
> bar
<table>
<thead>
<tr>
<th>abc</th>
<th>def</th>
</tr>
</thead>
<tbody>
<tr>
<td>bar</td>
<td>baz</td>
</tr>
</tbody>
</table>
<blockquote>
<p>bar</p>
</blockquote>
| abc | def |
| --- | --- |
| bar | baz |
bar

bar
<table>
<thead>
<tr>
<th>abc</th>
<th>def</th>
</tr>
</thead>
<tbody>
<tr>
<td>bar</td>
<td>baz</td>
</tr>
<tr>
<td>bar</td>
<td></td>
</tr>
</tbody>
</table>
<p>bar</p>

The header row must match the delimiter row in the number of cells. If not, a table will not be recognized:

| abc | def |
| --- |
| bar |
<p>| abc | def |
| --- |
| bar |</p>

The remainder of the table’s rows may vary in the number of cells. If a row has fewer cells than the header row, empty cells are inserted. If it has more, the excess cells are ignored:

| abc | def |
| --- | --- |
| bar |
| bar | baz | boo |
<table>
<thead>
<tr>
<th>abc</th>
<th>def</th>
</tr>
</thead>
<tbody>
<tr>
<td>bar</td>
<td></td>
</tr>
<tr>
<td>bar</td>
<td>baz</td>
</tr>
</tbody>
</table>
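
The two cell-count rules (header and delimiter must agree, body rows are padded or truncated) can be sketched as below. The function name build_table is hypothetical, and each row is assumed to have been split into a list of cells already.

```python
def build_table(header, delimiter, body_rows):
    """Return (header, body) with every body row the same width as the
    header, or None if the header and delimiter rows disagree."""
    if len(header) != len(delimiter):
        return None  # not recognized as a table
    width = len(header)
    # Pad short rows with empty cells, then cut long rows down to width.
    body = [(row + [""] * width)[:width] for row in body_rows]
    return header, body
```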

If there are no rows in the body, no <tbody> is generated in HTML output:

| abc | def |
| --- | --- |
<table>
<thead>
<tr>
<th>abc</th>
<th>def</th>
</tr>
</thead>
</table>

6Container blocks

A container block is a block that has other blocks as its contents. There are two basic kinds of container blocks: block quotes and list items. Lists are meta-containers for list items.

We define the syntax for container blocks recursively. The general form of the definition is:

If X is a sequence of blocks, then the result of transforming X in such-and-such a way is a container of type Y with these blocks as its content.

So, we explain what counts as a block quote or list item by explaining how these can be generated from their contents. This should suffice to define the syntax, although it does not give a recipe for parsing these constructions. (A recipe is provided below in the section entitled A parsing strategy.)

6.1Block quotes

A block quote marker consists of 0-3 spaces of initial indent, plus (a) the character > together with a following space, or (b) a single character > not followed by a space.

The following rules define block quotes:

  1. Basic case. If a string of lines Ls constitute a sequence of blocks Bs, then the result of prepending a block quote marker to the beginning of each line in Ls is a block quote containing Bs.

  2. Laziness. If a string of lines Ls constitute a block quote with contents Bs, then the result of deleting the initial block quote marker from one or more lines in which the next non-whitespace character after the block quote marker is paragraph continuation text is a block quote with Bs as its content. Paragraph continuation text is text that will be parsed as part of the content of a paragraph, but does not occur at the beginning of the paragraph.

  3. Consecutiveness. A document cannot contain two block quotes in a row unless there is a blank line between them.

Nothing else counts as a block quote.

Here is a simple example:

Example 211
> # Foo
> bar
> baz
<blockquote>
<h1>Foo</h1>
<p>bar
baz</p>
</blockquote>

The spaces after the > characters can be omitted:

Example 212
># Foo
>bar
> baz
<blockquote>
<h1>Foo</h1>
<p>bar
baz</p>
</blockquote>

The > characters can be indented 1-3 spaces:

Example 213
   > # Foo
   > bar
 > baz
<blockquote>
<h1>Foo</h1>
<p>bar
baz</p>
</blockquote>

Four spaces gives us a code block:

Example 214
    > # Foo
    > bar
    > baz
<pre><code>&gt; # Foo
&gt; bar
&gt; baz
</code></pre>
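
The block quote marker definition maps quite directly onto a regular expression. The following is a minimal sketch with an illustrative function name; it handles spaces only (tabs are ignored here) and does not implement the Laziness rule.

```python
import re

# 0-3 spaces of indent, then '>', then at most one following space.
_MARKER = re.compile(r"^ {0,3}> ?")


def strip_block_quote_marker(line):
    """Return the line with its block quote marker removed,
    or None if the line does not start with a marker."""
    match = _MARKER.match(line)
    if match is None:
        return None
    return line[match.end():]
```

Because the marker consumes at most one space after the >, any further spaces stay in the content, where they can still count towards an indented code block.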

The Laziness clause allows us to omit the > before paragraph continuation text:

Example 215
> # Foo
> bar
baz
<blockquote>
<h1>Foo</h1>
<p>bar
baz</p>
</blockquote>

A block quote can contain some lazy and some non-lazy continuation lines:

Example 216
> bar
baz
> foo
<blockquote>
<p>bar
baz
foo</p>
</blockquote>

Laziness only applies to lines that would have been continuations of paragraphs had they been prepended with block quote markers. For example, the > cannot be omitted in the second line of

> foo
> ---

without changing the meaning:

Example 217
> foo
---
<blockquote>
<p>foo</p>
</blockquote>
<hr />

Similarly, if we omit the > in the second line of

> - foo
> - bar

then the block quote ends after the first line:

Example 218
> - foo
- bar
<blockquote>
<ul>
<li>foo</li>
</ul>
</blockquote>
<ul>
<li>bar</li>
</ul>

For the same reason, we can’t omit the > in front of subsequent lines of an indented or fenced code block:

Example 219
>     foo
    bar
<blockquote>
<pre><code>foo
</code></pre>
</blockquote>
<pre><code>bar
</code></pre>
Example 220
> ```
foo
```
<blockquote>
<pre><code></code></pre>
</blockquote>
<p>foo</p>
<pre><code></code></pre>

Note that in the following case, we have a lazy continuation line:

Example 221
> foo
    - bar
<blockquote>
<p>foo
- bar</p>
</blockquote>

To see why, note that in

> foo
>     - bar

the - bar is indented too far to start a list, and can’t be an indented code block because indented code blocks cannot interrupt paragraphs, so it is paragraph continuation text.

A block quote can be empty:

Example 222
>
<blockquote>
</blockquote>
Example 223
>
>  
> 
<blockquote>
</blockquote>

A block quote can have initial or final blank lines:

Example 224
>
> foo
>  
<blockquote>
<p>foo</p>
</blockquote>

A blank line always separates block quotes:

Example 225
> foo

> bar
<blockquote>
<p>foo</p>
</blockquote>
<blockquote>
<p>bar</p>
</blockquote>

(Most current Markdown implementations, including John Gruber’s original Markdown.pl, will parse this example as a single block quote with two paragraphs. But it seems better to allow the author to decide whether two block quotes or one are wanted.)

Consecutiveness means that if we put these block quotes together, we get a single block quote:

Example 226
> foo
> bar
<blockquote>
<p>foo
bar</p>
</blockquote>

To get a block quote with two paragraphs, use:

Example 227
> foo
>
> bar
<blockquote>
<p>foo</p>
<p>bar</p>
</blockquote>

Block quotes can interrupt paragraphs:

Example 228
foo
> bar
<p>foo</p>
<blockquote>
<p>bar</p>
</blockquote>

In general, blank lines are not needed before or after block quotes:

Example 229
> aaa
***
> bbb
<blockquote>
<p>aaa</p>
</blockquote>
<hr />
<blockquote>
<p>bbb</p>
</blockquote>

However, because of laziness, a blank line is needed between a block quote and a following paragraph:

Example 230
> bar
baz
<blockquote>
<p>bar
baz</p>
</blockquote>
Example 231
> bar

baz
<blockquote>
<p>bar</p>
</blockquote>
<p>baz</p>
Example 232
> bar
>
baz
<blockquote>
<p>bar</p>
</blockquote>
<p>baz</p>

It is a consequence of the Laziness rule that any number of initial >s may be omitted on a continuation line of a nested block quote:

Example 233
> > > foo
bar
<blockquote>
<blockquote>
<blockquote>
<p>foo
bar</p>
</blockquote>
</blockquote>
</blockquote>
Example 234
>>> foo
> bar
>>baz
<blockquote>
<blockquote>
<blockquote>
<p>foo
bar
baz</p>
</blockquote>
</blockquote>
</blockquote>

When including an indented code block in a block quote, remember that the block quote marker includes both the > and a following space. So five spaces are needed after the >:

Example 235
>     code

>    not code
<blockquote>
<pre><code>code
</code></pre>
</blockquote>
<blockquote>
<p>not code</p>
</blockquote>

6.2Block quotes with curly braces (M)

Block quotes in Markua are created in one of two ways:

  1. By prefacing lines with > , i.e. a greater than character followed by a space. This was shown above.
  2. By wrapping the blockquote in {blockquote}{/blockquote}

Option #1 is preferable for short quotes; option #2 is easier on authors for really long quotes.

Like figures and tables, blockquotes can be inserted in the middle of a paragraph or as a sibling of it.

These are the two ways to make block quotes in Markua:

This is the first paragraph.

> This is a blockquote.
>
> You saw this above.

This is the second paragraph.

{blockquote}
This is also a blockquote.

It is good for longer quotes.
{/blockquote}

This is the third paragraph.
<p>This is the first paragraph.</p>
<blockquote>
<p>This is a blockquote.</p>
<p>You saw this above.</p>
</blockquote>
<p>This is the second paragraph.</p>
<blockquote>
<p>This is also a blockquote.</p>
<p>It is good for longer quotes.</p>
</blockquote>
<p>This is the third paragraph.</p>

If a blockquote contains headings, these headings may be formatted by a Markua Processor differently than normal headings. Finally, if a Markua Processor is automatically generating a Table of Contents from chapter and section headings, any headings inside blockquotes should be ignored.

6.3Asides (M)

Since Markua is for writing books and courses, including technical books and courses, it needs not just a syntax for blockquotes–it also needs a syntax for asides and for blurbs.

These syntaxes are very similar to the Markua syntax for blockquotes. Like blockquotes, any headings inside asides or blurbs do not show up in a Table of Contents (if one is present), and they can also be formatted differently by Markua Processors.

We will consider asides first.

Asides are typically short or long notes in books which are tangential to the main idea–sort of like footnotes, but in the body text itself. In technical books, quite often they are formatted in a box. Asides can span multiple pages.

The syntaxes for asides are very similar to blockquotes:

  1. By prefacing lines with A> , i.e. an A, then a greater than character (>), then a space.
  2. By wrapping the aside in {aside}{/aside}

Option #1 is preferable for short asides; option #2 is easier on authors for really long asides.

For consistency with blockquotes, asides can be siblings of paragraphs or nested in them.

Here’s a short aside:

A> This is a short aside.
<aside>
<p>This is a short aside.</p>
</aside>

Here’s a longer aside, which also contains a heading:

A> # A Longer Aside
A>
A> This is a longer aside.
A>
A> It can have multiple paragraphs.
<aside>
<h1>A Longer Aside</h1>
<p>This is a longer aside.</p>
<p>It can have multiple paragraphs.</p>
</aside>

Here’s the same longer aside using the {aside} syntax:

{aside}
# A Longer Aside

This is a longer aside.

It can have multiple paragraphs.
{/aside}
<aside>
<h1>A Longer Aside</h1>
<p>This is a longer aside.</p>
<p>It can have multiple paragraphs.</p>
</aside>

6.4Blurbs (M)

Blurbs are like asides, but shorter. A blurb is not intended to span multiple pages of output.

The syntaxes for blurbs are very similar to asides:

  1. By prefacing lines with B> , i.e. a B, then a greater than character (>), then a space.
  2. By wrapping the blurb in {blurb}{/blurb}

Here’s a short blurb:

B> This is a short blurb.
<aside class="blurb">
<p>This is a short blurb.</p>
</aside>

Here’s a longer blurb, which also contains a heading:

B> # A Longer Blurb
B>
B> This is a longer blurb.
B>
B> It can have multiple paragraphs.
<aside class="blurb">
<h1>A Longer Blurb</h1>
<p>This is a longer blurb.</p>
<p>It can have multiple paragraphs.</p>
</aside>

Here’s the same longer blurb using the {blurb} syntax:

{blurb}
# A Longer Blurb

This is a longer blurb.

It can have multiple paragraphs.
{/blurb}
<aside class="blurb">
<h1>A Longer Blurb</h1>
<p>This is a longer blurb.</p>
<p>It can have multiple paragraphs.</p>
</aside>
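
In the examples so far, the curly-brace form and the prefix form of blockquotes, asides and blurbs produce the same HTML. One way to picture that is as a rewrite from one form to the other. This is only a sketch under assumptions: the function name is made up, nesting is not handled, and the opening line must be exactly {blockquote}, {aside} or {blurb} with no attributes.

```python
# Prefix used by the line-based form of each curly-brace block.
_PREFIXES = {"blockquote": ">", "aside": "A>", "blurb": "B>"}


def braces_to_prefixes(lines):
    """Rewrite {name}...{/name} blocks into prefixed lines."""
    out = []
    prefix = None
    closing = None
    for line in lines:
        if prefix is None:
            is_tag = line.startswith("{") and line.endswith("}")
            name = line[1:-1] if is_tag else None
            if name in _PREFIXES:
                prefix, closing = _PREFIXES[name], "{/" + name + "}"
            else:
                out.append(line)
        elif line == closing:
            prefix = closing = None
        else:
            # Blank lines inside the block become a bare prefix.
            out.append(prefix + " " + line if line else prefix)
    return out
```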

6.5Blurb classes (M)

Blurbs also have support for an attribute list, which can contain a class attribute.

Markua has its origin in authoring computer programming books. In computer programming books, there are a number of blurb types which are a de facto standard: discussion, error, information, question, tip, warning and exercise.

These blurb types can be set on a blurb as its class attribute. A Markua Processor can optionally style these blurbs appropriately based on the class, for example by adding custom icons for each class of blurb.

Here’s how this looks with the B> syntax:

{class: warning}
B> This is a warning!
<aside class="blurb warning">
<p>This is a warning!</p>
</aside>

Here’s how this looks with the {blurb} syntax:

{blurb, class: warning}
This is a warning!
{/blurb}
<aside class="blurb warning">
<p>This is a warning!</p>
</aside>

The attribute list must either directly precede the B> with no blank line between it and the B>, or it must be combined with the {blurb} block opening. It is NOT legal Markua syntax to have an attribute list preceding a {blurb} block opening like this:

{class: warning}
{blurb}
That is not legal Markua...

6.6Syntactic sugar for blurb classes (M)

Having to constantly type {class: warning} in a computer programming book with a number of warnings would get tedious, as would any of the other blurb classes listed above.

So, Markua defines a standard shorthand syntax for these classes of blurbs. With this syntax, you use a different letter than B in the B>, to create a blurb with the appropriate class.

These are the syntactic sugar values you can use which have a heritage in computer programming books:

D> : {class: discussion}

E> : {class: error}

I> : {class: information}

Q> : {class: question}

T> : {class: tip}

W> : {class: warning}

X> : {class: exercise}

D> This is a discussion blurb.

{class: discussion}
B> This is a discussion blurb.

{blurb, class: discussion}
This is a discussion blurb.
{/blurb}
<aside class="blurb discussion">
<p>This is a discussion blurb.</p>
</aside>

<aside class="blurb discussion">
<p>This is a discussion blurb.</p>
</aside>

<aside class="blurb discussion">
<p>This is a discussion blurb.</p>
</aside>
E> This is an error blurb.

{class: error}
B> This is an error blurb.

{blurb, class: error}
This is an error blurb.
{/blurb}
<aside class="blurb error">
<p>This is an error blurb.</p>
</aside>

<aside class="blurb error">
<p>This is an error blurb.</p>
</aside>

<aside class="blurb error">
<p>This is an error blurb.</p>
</aside>
X> This is an exercise blurb.

{class: exercise}
B> This is an exercise blurb.

{blurb, class: exercise}
This is an exercise blurb.
{/blurb}
<aside class="blurb exercise">
<p>This is an exercise blurb.</p>
</aside>

<aside class="blurb exercise">
<p>This is an exercise blurb.</p>
</aside>

<aside class="blurb exercise">
<p>This is an exercise blurb.</p>
</aside>
I> This is an information blurb.

{class: information}
B> This is an information blurb.

{blurb, class: information}
This is an information blurb.
{/blurb}
<aside class="blurb information">
<p>This is an information blurb.</p>
</aside>

<aside class="blurb information">
<p>This is an information blurb.</p>
</aside>

<aside class="blurb information">
<p>This is an information blurb.</p>
</aside>
Q> This is a question blurb.

{class: question}
B> This is a question blurb.

{blurb, class: question}
This is a question blurb.
{/blurb}
<aside class="blurb question">
<p>This is a question blurb.</p>
</aside>

<aside class="blurb question">
<p>This is a question blurb.</p>
</aside>

<aside class="blurb question">
<p>This is a question blurb.</p>
</aside>
T> This is a tip blurb.

{class: tip}
B> This is a tip blurb.

{blurb, class: tip}
This is a tip blurb.
{/blurb}
<aside class="blurb tip">
<p>This is a tip blurb.</p>
</aside>

<aside class="blurb tip">
<p>This is a tip blurb.</p>
</aside>

<aside class="blurb tip">
<p>This is a tip blurb.</p>
</aside>
W> This is a warning blurb.

{class: warning}
B> This is a warning blurb.

{blurb, class: warning}
This is a warning blurb.
{/blurb}
<aside class="blurb warning">
<p>This is a warning blurb.</p>
</aside>

<aside class="blurb warning">
<p>This is a warning blurb.</p>
</aside>

<aside class="blurb warning">
<p>This is a warning blurb.</p>
</aside>

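Note that Q> and X> are a bit confusing: Q> creates a question blurb, and X> creates an exercise blurb, since E> is already used for error blurbs.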

Also note that nothing in this section defines what a Markua Processor must do with the given class of blurb, in terms of formatting. Leanpub, for example, uses it to add an appropriate icon from Font Awesome at the left of the blurb, but other Markua Processors are free to do something different.

Finally, note that specifying a class in metadata overrides what the syntactic sugar does, and is also an error:

{class: tip}
W> This is a tip blurb, not a warning blurb.
<aside class="blurb tip">
<p>This is a tip blurb, not a warning blurb.</p>
</aside>
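
The relationship between the syntactic sugar and the class attribute can be sketched in a few lines of Python. This is only an illustration, not part of the specification: the function names are hypothetical, only the four markers shown above are included, the assumption that a plain B> blurb gets the class "blurb" alone is ours, and the sketch does not report the error that the last example describes.

```python
# Sugar markers shown in this section; other markers are left out on purpose.
SUGAR_CLASSES = {
    "I": "information",
    "Q": "question",
    "T": "tip",
    "W": "warning",
}


def blurb_class(marker, attributes=None):
    """Return the HTML class value for a blurb line marker (hypothetical helper)."""
    attributes = attributes or {}
    # An explicit class attribute wins over the class implied by the sugar.
    name = attributes.get("class") or SUGAR_CLASSES.get(marker)
    # Assumption: a plain B> blurb with no class attribute is just "blurb".
    return f"blurb {name}" if name else "blurb"


def render_blurb(marker, text, attributes=None):
    """Render a one-paragraph blurb in the shape used by the examples above.

    The text is inserted as-is; no HTML escaping or inline parsing is done.
    """
    return (
        f'<aside class="{blurb_class(marker, attributes)}">\n'
        f"<p>{text}</p>\n"
        "</aside>"
    )
```

With these definitions, `W>` on its own and `{class: warning}` on a `B>` blurb produce the same class value, and a `{class: tip}` attribute on a `W>` blurb yields `blurb tip`.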

6.7Using blurbs to center text (M)

You can also use a blurb to center text.

The following two ways to do this are equivalent:

C> This is a centered blurb.

{class: center}
B> This is a centered blurb.
<aside class="blurb center">
<p>This is a centered blurb.</p>
</aside>
<aside class="blurb center">
<p>This is a centered blurb.</p>
</aside>

Using a blurb, in either of these two forms, is the only way to center text in Markua.

Unlike other blurb types which have their origin in technical books, centering text has a wide range of uses. So, it could have been thought of as something different than a blurb. However, in terms of its behaviour and the way it’s inserted, centered text is a blurb–whether it’s inserted via syntactic sugar or via a class attribute on a normal blurb element.

6.8Blurb icons with extension attributes (M)

Markua Processors must ignore any attributes which they do not understand.

Because of this, Markua attribute lists can contain any number of extension attributes. An extension attribute is an attribute which is not defined in the Markua specification, but which is understood by some specific Markua Processor.

As an example of an extension attribute, Leanpub adds an icon attribute to blurbs. Markua does not specify that a blurb must support an icon attribute, or what it would mean if it did. However, Leanpub understands an icon attribute to reference an icon from Font Awesome. The value of this attribute is assumed to be the name of an icon in Font Awesome, without the fa- prefix. So, in any Markua Processor, you can do this:

{icon: car}
B> You can't spell carbon without it!

{icon: leanpub}
B> Yes, we're in Font Awesome!

{icon: github}
B> So is GitHub, of course. Unicorns.

In Leanpub, each of these blurbs will be shown with the named Font Awesome icon; the first, for example, gets a nice icon of a car. In a Markua implementation that does not understand the icon attribute, nothing will be generated for that attribute: it will be functionally equivalent to the attribute not being present.
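
As a rough sketch of the "ignore what you do not understand" rule, a processor could filter an attribute list against the set of attributes it knows. Everything named here is hypothetical: the set of known attributes is an assumption for illustration, and `font_awesome_name` only shows the fa- prefix convention described above, not how Leanpub actually emits icons.

```python
# Assumption: a minimal processor that only understands the class attribute.
KNOWN_BLURB_ATTRIBUTES = frozenset({"class"})


def understood_attributes(attributes, known=KNOWN_BLURB_ATTRIBUTES):
    """Drop every attribute the processor does not understand."""
    return {key: value for key, value in attributes.items() if key in known}


def font_awesome_name(attributes):
    """Return the icon name with the fa- prefix added, or None if no icon is set."""
    icon = attributes.get("icon")
    return f"fa-{icon}" if icon else None
```

A processor that passes `known={"class", "icon"}` keeps the icon attribute; one that uses the default set behaves as if the attribute had never been written.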

6.9List items

A list marker is a bullet list marker or an ordered list marker.

A bullet list marker is a -, +, or * character.

An ordered list marker is a sequence of 1–9 arabic digits (0-9), followed by either a . character or a ) character. (The reason for the length limit is that with 10 digits we start seeing integer overflows in some browsers.)
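
The marker definitions above translate fairly directly into regular expressions. The following is a minimal sketch, with hypothetical names, that classifies a string which is supposed to be exactly one list marker; it says nothing yet about the spaces or content that follow the marker.

```python
import re

# A bullet list marker is a single -, + or * character.
BULLET_MARKER = re.compile(r"[-+*]")
# An ordered list marker is 1-9 ASCII digits followed by . or ).
# [0-9] is used instead of \d, because \d also matches non-ASCII digits.
ORDERED_MARKER = re.compile(r"([0-9]{1,9})([.)])")


def classify_marker(marker):
    """Return a description of the marker, or None if it is not a list marker."""
    if BULLET_MARKER.fullmatch(marker):
        return ("bullet", marker)
    match = ORDERED_MARKER.fullmatch(marker)
    if match:
        # The start number ignores leading zeros, so "003." starts at 3.
        return ("ordered", int(match.group(1)), match.group(2))
    return None
```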

The following rules define list items:

  1. Basic case. If a sequence of lines Ls constitute a sequence of blocks Bs starting with a non-whitespace character, and M is a list marker of width W followed by 1 ≤ N ≤ 4 spaces, then the result of prepending M and the following spaces to the first line of Ls, and indenting subsequent lines of Ls by W + N spaces, is a list item with Bs as its contents. The type of the list item (bullet or ordered) is determined by the type of its list marker. If the list item is ordered, then it is also assigned a start number, based on the ordered list marker.

    Exceptions:

    1. When the first list item in a list interrupts a paragraph—that is, when it starts on a line that would otherwise count as paragraph continuation text—then (a) the lines Ls must not begin with a blank line, and (b) if the list item is ordered, the start number must be 1.
    2. If any line is a thematic break then that line is not a list item.

For example, let Ls be the lines

Example 254
A paragraph
with two lines.

    indented code

> A block quote.
<p>A paragraph
with two lines.</p>
<pre><code>indented code
</code></pre>
<blockquote>
<p>A block quote.</p>
</blockquote>

And let M be the marker 1., and N = 2. Then rule #1 says that the following is an ordered list item with start number 1, and the same contents as Ls:

Example 255
1.  A paragraph
    with two lines.

        indented code

    > A block quote.
<ol>
<li>
<p>A paragraph
with two lines.</p>
<pre><code>indented code
</code></pre>
<blockquote>
<p>A block quote.</p>
</blockquote>
</li>
</ol>

The most important thing to notice is that the position of the text after the list marker determines how much indentation is needed in subsequent blocks in the list item. If the list marker takes up two spaces, and there are three spaces between the list marker and the next non-whitespace character, then blocks must be indented five spaces in order to fall under the list item.

Here are some examples showing how far content must be indented to be put under the list item:

Example 256
- one

 two
<ul>
<li>one</li>
</ul>
<p>two</p>
Example 257
- one

  two
<ul>
<li>
<p>one</p>
<p>two</p>
</li>
</ul>
Example 258
 -    one

     two
<ul>
<li>one</li>
</ul>
<pre><code> two
</code></pre>
Example 259
 -    one

      two
<ul>
<li>
<p>one</p>
<p>two</p>
</li>
</ul>

It is tempting to think of this in terms of columns: the continuation blocks must be indented at least to the column of the first non-whitespace character after the list marker. However, that is not quite right. The spaces after the list marker determine how much relative indentation is needed. Which column this indentation reaches will depend on how the list item is embedded in other constructions, as shown by this example:

Example 260
   > > 1.  one
>>
>>     two
<blockquote>
<blockquote>
<ol>
<li>
<p>one</p>
<p>two</p>
</li>
</ol>
</blockquote>
</blockquote>

Here two occurs in the same column as the list marker 1., but is actually contained in the list item, because there is sufficient indentation after the last containing blockquote marker.

The converse is also possible. In the following example, the word two occurs far to the right of the initial text of the list item, one, but it is not considered part of the list item, because it is not indented far enough past the blockquote marker:

Example 261
>>- one
>>
  >  > two
<blockquote>
<blockquote>
<ul>
<li>one</li>
</ul>
<p>two</p>
</blockquote>
</blockquote>

Note that at least one space is needed between the list marker and any following content, so these are not list items:

Example 262
-one

2.two
<p>-one</p>
<p>2.two</p>

A list item may contain blocks that are separated by more than one blank line.

Example 263
- foo


  bar
<ul>
<li>
<p>foo</p>
<p>bar</p>
</li>
</ul>

A list item may contain any kind of block:

Example 264
1.  foo

    ```
    bar
    ```

    baz

    > bam
<ol>
<li>
<p>foo</p>
<pre><code>bar
</code></pre>
<p>baz</p>
<blockquote>
<p>bam</p>
</blockquote>
</li>
</ol>

A list item that contains an indented code block will preserve empty lines within the code block verbatim.

Example 265
- Foo

      bar


      baz
<ul>
<li>
<p>Foo</p>
<pre><code>bar


baz
</code></pre>
</li>
</ul>

Note that ordered list start numbers must be nine digits or fewer:

Example 266
123456789. ok
<ol start="123456789">
<li>ok</li>
</ol>
Example 267
1234567890. not ok
<p>1234567890. not ok</p>

A start number may begin with 0s:

Example 268
0. ok
<ol start="0">
<li>ok</li>
</ol>
Example 269
003. ok
<ol start="3">
<li>ok</li>
</ol>

A start number may not be negative:

Example 270
-1. not ok
<p>-1. not ok</p>
  2. Item starting with indented code. If a sequence of lines Ls constitute a sequence of blocks Bs starting with an indented code block, and M is a list marker of width W followed by one space, then the result of prepending M and the following space to the first line of Ls, and indenting subsequent lines of Ls by W + 1 spaces, is a list item with Bs as its contents. If a line is empty, then it need not be indented. The type of the list item (bullet or ordered) is determined by the type of its list marker. If the list item is ordered, then it is also assigned a start number, based on the ordered list marker.

An indented code block will have to be indented four spaces beyond the edge of the region where text will be included in the list item. In the following case that is 6 spaces:

Example 271
- foo

      bar
<ul>
<li>
<p>foo</p>
<pre><code>bar
</code></pre>
</li>
</ul>

And in this case it is 11 spaces:

Example 272
  10.  foo

           bar
<ol start="10">
<li>
<p>foo</p>
<pre><code>bar
</code></pre>
</li>
</ol>

If the first block in the list item is an indented code block, then by rule #2, the contents must be indented one space after the list marker:

Example 273
    indented code

paragraph

    more code
<pre><code>indented code
</code></pre>
<p>paragraph</p>
<pre><code>more code
</code></pre>
Example 274
1.     indented code

   paragraph

       more code
<ol>
<li>
<pre><code>indented code
</code></pre>
<p>paragraph</p>
<pre><code>more code
</code></pre>
</li>
</ol>

Note that an additional space indent is interpreted as space inside the code block:

Example 275
1.      indented code

   paragraph

       more code
<ol>
<li>
<pre><code> indented code
</code></pre>
<p>paragraph</p>
<pre><code>more code
</code></pre>
</li>
</ol>

Note that rules #1 and #2 only apply to two cases: (a) cases in which the lines to be included in a list item begin with a non-whitespace character, and (b) cases in which they begin with an indented code block. In a case like the following, where the first block begins with a three-space indent, the rules do not allow us to form a list item by indenting the whole thing and prepending a list marker:

Example 276
   foo

bar
<p>foo</p>
<p>bar</p>
Example 277
-    foo

  bar
<ul>
<li>foo</li>
</ul>
<p>bar</p>

This is not a significant restriction, because when a block begins with 1-3 spaces indent, the indentation can always be removed without a change in interpretation, allowing rule #1 to be applied. So, in the above case:

Example 278
-  foo

   bar
<ul>
<li>
<p>foo</p>
<p>bar</p>
</li>
</ul>
  3. Item starting with a blank line. If a sequence of lines Ls starting with a single blank line constitute a (possibly empty) sequence of blocks Bs, and M is a list marker of width W, then the result of prepending M to the first line of Ls, and indenting subsequent lines of Ls by W + 1 spaces, is a list item with Bs as its contents. If a line is empty, then it need not be indented. The type of the list item (bullet or ordered) is determined by the type of its list marker. If the list item is ordered, then it is also assigned a start number, based on the ordered list marker.

Here are some list items that start with a blank line but are not empty:

Example 279
-
  foo
-
  ```
  bar
  ```
-
      baz
<ul>
<li>foo</li>
<li>
<pre><code>bar
</code></pre>
</li>
<li>
<pre><code>baz
</code></pre>
</li>
</ul>

When the list item starts with a blank line, the number of spaces following the list marker doesn’t change the required indentation:

Example 280
-   
  foo
<ul>
<li>foo</li>
</ul>
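
The indentation arithmetic of the first three rules can be summarised in a small function. This is a sketch under stated assumptions, not a parser: the name `content_indent` is hypothetical, only spaces are handled (no tabs), container blocks such as block quotes are assumed to have been stripped already, and the thematic break exception is not checked.

```python
import re

# Up to three spaces of indentation, a list marker, then any spaces and the rest.
LIST_ITEM_START = re.compile(r"^( {0,3})([-+*]|[0-9]{1,9}[.)])( *)(.*)$")


def content_indent(first_line):
    """Return how many spaces later blocks must be indented to fall under
    the list item that starts on first_line, or None if no item starts there."""
    match = LIST_ITEM_START.match(first_line)
    if match is None:
        return None
    lead, marker, spaces, rest = match.groups()
    if rest and not spaces:
        return None  # "-one" or "2.two": a space is needed after the marker
    if not rest:
        n = 1  # item starting with a blank line: W + 1, whatever the spacing
    elif len(spaces) >= 5:
        n = 1  # item starting with indented code: one space, the rest is code
    else:
        n = len(spaces)  # basic case: 1 to 4 spaces after the marker
    return len(lead) + len(marker) + n
```

For the line `  10.  foo` the function returns 7, so an indented code block inside that item needs 7 + 4 = 11 spaces, which matches the example above.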

A list item can begin with at most one blank line. In the following example, foo is not part of the list item:

Example 281
-

  foo
<ul>
<li></li>
</ul>
<p>foo</p>

Here is an empty bullet list item:

Example 282
- foo
-
- bar
<ul>
<li>foo</li>
<li></li>
<li>bar</li>
</ul>

It does not matter whether there are spaces following the list marker:

Example 283
- foo
-   
- bar
<ul>
<li>foo</li>
<li></li>
<li>bar</li>
</ul>

Here is an empty ordered list item:

Example 284
1. foo
2.
3. bar
<ol>
<li>foo</li>
<li></li>
<li>bar</li>
</ol>

A list may start or end with an empty list item:

Example 285
*
<ul>
<li></li>
</ul>

However, an empty list item cannot interrupt a paragraph:

Example 286
foo
*

foo
1.
<p>foo
*</p>
<p>foo
1.</p>
  4. Indentation. If a sequence of lines Ls constitutes a list item according to rule #1, #2, or #3, then the result of indenting each line of Ls by 1-3 spaces (the same for each line) also constitutes a list item with the same contents and attributes. If a line is empty, then it need not be indented.

Indented one space:

Example 287
 1.  A paragraph
     with two lines.

         indented code

     > A block quote.
<ol>
<li>
<p>A paragraph
with two lines.</p>
<pre><code>indented code
</code></pre>
<blockquote>
<p>A block quote.</p>
</blockquote>
</li>
</ol>

Indented two spaces:

Example 288
  1.  A paragraph
      with two lines.

          indented code

      > A block quote.
<ol>
<li>
<p>A paragraph
with two lines.</p>
<pre><code>indented code
</code></pre>
<blockquote>
<p>A block quote.</p>
</blockquote>
</li>
</ol>

Indented three spaces:

Example 289
   1.  A paragraph
       with two lines.

           indented code

       > A block quote.
<ol>
<li>
<p>A paragraph
with two lines.</p>
<pre><code>indented code
</code></pre>
<blockquote>
<p>A block quote.</p>
</blockquote>
</li>
</ol>

Four spaces indent gives a code block:

Example 290
    1.  A paragraph
        with two lines.

            indented code

        > A block quote.
<pre><code>1.  A paragraph
    with two lines.

        indented code

    &gt; A block quote.
</code></pre>
  5. Laziness. If a string of lines Ls constitute a list item with contents Bs, then the result of deleting some or all of the indentation from one or more lines in which the next non-whitespace character after the indentation is paragraph continuation text is a list item with the same contents and attributes. The unindented lines are called lazy continuation lines.

Here is an example with lazy continuation lines:

Example 291
  1.  A paragraph
with two lines.

          indented code

      > A block quote.
<ol>
<li>
<p>A paragraph
with two lines.</p>
<pre><code>indented code
</code></pre>
<blockquote>
<p>A block quote.</p>
</blockquote>
</li>
</ol>

Indentation can be partially deleted:

Example 292
  1.  A paragraph
    with two lines.
<ol>
<li>A paragraph
with two lines.</li>
</ol>

These examples show how laziness can work in nested structures:

Example 293
> 1. > Blockquote
continued here.
<blockquote>
<ol>
<li>
<blockquote>
<p>Blockquote
continued here.</p>
</blockquote>
</li>
</ol>
</blockquote>
Example 294
> 1. > Blockquote
> continued here.
<blockquote>
<ol>
<li>
<blockquote>
<p>Blockquote
continued here.</p>
</blockquote>
</li>
</ol>
</blockquote>
  6. That’s all. Nothing that is not counted as a list item by rules #1–5 counts as a list item.

The rules for sublists follow from the general rules above. A sublist must be indented the same number of spaces a paragraph would need to be in order to be included in the list item.

So, in this case we need two spaces indent:

Example 295
- foo
  - bar
    - baz
      - boo
<ul>
<li>foo
<ul>
<li>bar
<ul>
<li>baz
<ul>
<li>boo</li>
</ul>
</li>
</ul>
</li>
</ul>
</li>
</ul>

One is not enough:

Example 296
- foo
 - bar
  - baz
   - boo
<ul>
<li>foo</li>
<li>bar</li>
<li>baz</li>
<li>boo</li>
</ul>

Here we need four, because the list marker is wider:

Example 297
10) foo
    - bar
<ol start="10">
<li>foo
<ul>
<li>bar</li>
</ul>
</li>
</ol>

Three is not enough:

Example 298
10) foo
   - bar
<ol start="10">
<li>foo</li>
</ol>
<ul>
<li>bar</li>
</ul>

A list may be the first block in a list item:

Example 299
- - foo
<ul>
<li>
<ul>
<li>foo</li>
</ul>
</li>
</ul>
Example 300
1. - 2. foo
<ol>
<li>
<ul>
<li>
<ol start="2">
<li>foo</li>
</ol>
</li>
</ul>
</li>
</ol>

A list item can contain a heading:

Example 301
- # Foo
- Bar
  ---
  baz
<ul>
<li>
<h1>Foo</h1>
</li>
<li>
<h2>Bar</h2>
baz</li>
</ul>

6.9.1Motivation

John Gruber’s Markdown spec says the following about list items:

  1. “List markers typically start at the left margin, but may be indented by up to three spaces. List markers must be followed by one or more spaces or a tab.”

  2. “To make lists look nice, you can wrap items with hanging indents…. But if you don’t want to, you don’t have to.”

  3. “List items may consist of multiple paragraphs. Each subsequent paragraph in a list item must be indented by either 4 spaces or one tab.”

  4. “It looks nice if you indent every line of the subsequent paragraphs, but here again, Markdown will allow you to be lazy.”

  5. “To put a blockquote within a list item, the blockquote’s > delimiters need to be indented.”

  6. “To put a code block within a list item, the code block needs to be indented twice — 8 spaces or two tabs.”

These rules specify that a paragraph under a list item must be indented four spaces (presumably, from the left margin, rather than the start of the list marker, but this is not said), and that code under a list item must be indented eight spaces instead of the usual four. They also say that a block quote must be indented, but not by how much; however, the example given has four spaces indentation. Although nothing is said about other kinds of block-level content, it is certainly reasonable to infer that all block elements under a list item, including other lists, must be indented four spaces. This principle has been called the four-space rule.

The four-space rule is clear and principled, and if the reference implementation Markdown.pl had followed it, it probably would have become the standard. However, Markdown.pl allowed paragraphs and sublists to start with only two spaces indentation, at least on the outer level. Worse, its behavior was inconsistent: a sublist of an outer-level list needed two spaces indentation, but a sublist of this sublist needed three spaces. It is not surprising, then, that different implementations of Markdown have developed very different rules for determining what comes under a list item. (Pandoc and python-Markdown, for example, stuck with Gruber’s syntax description and the four-space rule, while discount, redcarpet, marked, PHP Markdown, and others followed Markdown.pl’s behavior more closely.)

Unfortunately, given the divergences between implementations, there is no way to give a spec for list items that will be guaranteed not to break any existing documents. However, the spec given here should correctly handle lists formatted with either the four-space rule or the more forgiving Markdown.pl behavior, provided they are laid out in a way that is natural for a human to read.

The strategy here is to let the width and indentation of the list marker determine the indentation necessary for blocks to fall under the list item, rather than having a fixed and arbitrary number. The writer can think of the body of the list item as a unit which gets indented to the right enough to fit the list marker (and any indentation on the list marker). (The laziness rule, #5, then allows continuation lines to be unindented if needed.)

This rule is superior, we claim, to any rule requiring a fixed level of indentation from the margin. The four-space rule is clear but unnatural. It is quite unintuitive that

- foo

  bar

  - baz

should be parsed as two lists with an intervening paragraph,

<ul>
<li>foo</li>
</ul>
<p>bar</p>
<ul>
<li>baz</li>
</ul>

as the four-space rule demands, rather than a single list,

<ul>
<li>
<p>foo</p>
<p>bar</p>
<ul>
<li>baz</li>
</ul>
</li>
</ul>

The choice of four spaces is arbitrary. It can be learned, but it is not likely to be guessed, and it trips up beginners regularly.

Would it help to adopt a two-space rule? The problem is that such a rule, together with the rule allowing 1–3 spaces indentation of the initial list marker, allows text that is indented less than the original list marker to be included in the list item. For example, Markdown.pl parses

   - one

  two

as a single list item, with two a continuation paragraph:

<ul>
<li>
<p>one</p>
<p>two</p>
</li>
</ul>

and similarly

>   - one
>
>  two

as

<blockquote>
<ul>
<li>
<p>one</p>
<p>two</p>
</li>
</ul>
</blockquote>

This is extremely unintuitive.

Rather than requiring a fixed indent from the margin, we could require a fixed indent (say, two spaces, or even one space) from the list marker (which may itself be indented). This proposal would remove the last anomaly discussed. Unlike the spec presented above, it would count the following as a list item with a subparagraph, even though the paragraph bar is not indented as far as the first paragraph foo:

 10. foo

   bar  

Arguably this text does read like a list item with bar as a subparagraph, which may count in favor of the proposal. However, on this proposal indented code would have to be indented six spaces after the list marker. And this would break a lot of existing Markdown, which has the pattern:

1.  foo

        indented code

where the code is indented eight spaces. The spec above, by contrast, will parse this text as expected, since the code block’s indentation is measured from the beginning of foo.

The one case that needs special treatment is a list item that starts with indented code. How much indentation is required in that case, since we don’t have a “first paragraph” to measure from? Rule #2 simply stipulates that in such cases, we require one space indentation from the list marker (and then the normal four spaces for the indented code). This will match the four-space rule in cases where the list marker plus its initial indentation takes four spaces (a common case), but diverge in other cases.

6.10Lists

A list is a sequence of one or more list items of the same type. The list items may be separated by any number of blank lines.

Two list items are of the same type if they begin with a list marker of the same type. Two list markers are of the same type if (a) they are bullet list markers using the same character (-, +, or *) or (b) they are ordered list numbers with the same delimiter (either . or )).

A list is an ordered list if its constituent list items begin with ordered list markers, and a bullet list if its constituent list items begin with bullet list markers.

The start number of an ordered list is determined by the list number of its initial list item. The numbers of subsequent list items are disregarded.

A list is loose if any of its constituent list items are separated by blank lines, or if any of its constituent list items directly contain two block-level elements with a blank line between them. Otherwise a list is tight. (The difference in HTML output is that paragraphs in a loose list are wrapped in <p> tags, while paragraphs in a tight list are not.)
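
The loose/tight distinction can be sketched over a very small data model. The `Item` class and its fields are hypothetical and assume that a parser has already recorded where blank lines fell; only blank lines directly between an item's own blocks count, not blank lines inside nested lists, block quotes, or code blocks.

```python
from dataclasses import dataclass


@dataclass
class Item:
    blocks: list  # the item's direct block-level children, in order
    gaps: list    # gaps[i] is True if a blank line separates blocks[i] and blocks[i + 1]
    blank_before_next: bool = False  # a blank line separates this item from the next one


def is_loose(items):
    """A list is loose if items are separated by blank lines, or if any item
    directly contains two blocks with a blank line between them."""
    between_items = any(item.blank_before_next for item in items[:-1])
    inside_an_item = any(any(item.gaps) for item in items)
    return between_items or inside_an_item


def render_paragraph(text, loose):
    """Paragraphs are wrapped in <p> tags only in a loose list."""
    return f"<p>{text}</p>" if loose else text
```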

Changing the bullet or ordered list delimiter starts a new list:

Example 302
- foo
- bar
+ baz
<ul>
<li>foo</li>
<li>bar</li>
</ul>
<ul>
<li>baz</li>
</ul>
Example 303
1. foo
2. bar
3) baz
<ol>
<li>foo</li>
<li>bar</li>
</ol>
<ol start="3">
<li>baz</li>
</ol>
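
One way to picture the "same type" rule is to group a run of consecutive list markers by their type. The sketch below uses hypothetical names and assumes that every string passed in is already a valid list marker.

```python
from itertools import groupby


def marker_type(marker):
    """Bullet markers are typed by their character, ordered markers by their delimiter."""
    if marker in ("-", "+", "*"):
        return ("bullet", marker)
    return ("ordered", marker[-1])


def split_into_lists(markers):
    """Split consecutive list markers into runs that belong to the same list."""
    return [list(run) for _, run in groupby(markers, key=marker_type)]
```

Note that the numbers themselves play no part in the grouping: `1.` and `2.` are the same type, while `2.` and `3)` are not.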

In CommonMark, a list can interrupt a paragraph. That is, no blank line is needed to separate a paragraph from a following list:

Example 304
Foo
- bar
- baz
<p>Foo</p>
<ul>
<li>bar</li>
<li>baz</li>
</ul>

Markdown.pl does not allow this, through fear of triggering a list via a numeral in a hard-wrapped line:

The number of windows in my house is
14.  The number of doors is 6.

Oddly, though, Markdown.pl does allow a blockquote to interrupt a paragraph, even though the same considerations might apply.

In CommonMark, we do allow lists to interrupt paragraphs, for two reasons. First, it is natural and not uncommon for people to start lists without blank lines:

I need to buy
- new shoes
- a coat
- a plane ticket

Second, we are attracted to a principle of uniformity: if a chunk of text has a certain meaning, it will continue to have the same meaning when put into a container block (such as a list item or blockquote).

(Indeed, the spec for list items and block quotes presupposes this principle.) This principle implies that if

  * I need to buy
    - new shoes
    - a coat
    - a plane ticket

is a list item containing a paragraph followed by a nested sublist, as all Markdown implementations agree it is (though the paragraph may be rendered without <p> tags, since the list is “tight”), then

I need to buy
- new shoes
- a coat
- a plane ticket

by itself should be a paragraph followed by a nested sublist.

Since it is well established Markdown practice to allow lists to interrupt paragraphs inside list items, the principle of uniformity requires us to allow this outside list items as well. (reStructuredText takes a different approach, requiring blank lines before lists even inside other list items.)

In order to solve the problem of unwanted lists in paragraphs with hard-wrapped numerals, we allow only lists starting with 1 to interrupt paragraphs. Thus,

Example 305
The number of windows in my house is
14.  The number of doors is 6.
<p>The number of windows in my house is
14.  The number of doors is 6.</p>

We may still get an unintended result in cases like

Example 306
The number of windows in my house is
1.  The number of doors is 6.
<p>The number of windows in my house is</p>
<ol>
<li>The number of doors is 6.</li>
</ol>

but this rule should prevent most spurious list captures.
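
The conditions under which a line may interrupt a paragraph as the start of a list can be sketched as follows. The function name is hypothetical, and the sketch deliberately ignores thematic breaks, tabs, and container blocks; it only combines the two restrictions stated in this specification: an empty list item cannot interrupt a paragraph, and an ordered list can do so only if it starts with 1.

```python
import re

# Up to three spaces, a list marker, then either nothing or spaces plus content.
LIST_ITEM_START = re.compile(r"^ {0,3}([-+*]|[0-9]{1,9}[.)])(?: +(.*))?$")


def can_interrupt_paragraph(line):
    """Return True if this line could start a list in the middle of a paragraph."""
    match = LIST_ITEM_START.match(line)
    if match is None:
        return False
    marker, content = match.groups()
    if not content or not content.strip():
        return False  # an empty list item cannot interrupt a paragraph
    if marker[0].isdigit():
        return int(marker[:-1]) == 1  # ordered lists must start with 1
    return True
```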

There can be any number of blank lines between items:

Example 307
- foo

- bar


- baz
<ul>
<li>
<p>foo</p>
</li>
<li>
<p>bar</p>
</li>
<li>
<p>baz</p>
</li>
</ul>
Example 308
- foo
  - bar
    - baz


      bim
<ul>
<li>foo
<ul>
<li>bar
<ul>
<li>
<p>baz</p>
<p>bim</p>
</li>
</ul>
</li>
</ul>
</li>
</ul>

To separate consecutive lists of the same type, or to separate a list from an indented code block that would otherwise be parsed as a subparagraph of the final list item, you can insert a blank HTML comment:

Example 309
- foo
- bar

<!-- -->

- baz
- bim
<ul>
<li>foo</li>
<li>bar</li>
</ul>
<!-- -->
<ul>
<li>baz</li>
<li>bim</li>
</ul>
Example 310
-   foo

    notcode

-   foo

<!-- -->

    code
<ul>
<li>
<p>foo</p>
<p>notcode</p>
</li>
<li>
<p>foo</p>
</li>
</ul>
<!-- -->
<pre><code>code
</code></pre>

List items need not be indented to the same level. The following list items will be treated as items at the same list level, since none is indented enough to belong to the previous list item:

Example 311
- a
 - b
  - c
   - d
  - e
 - f
- g
<ul>
<li>a</li>
<li>b</li>
<li>c</li>
<li>d</li>
<li>e</li>
<li>f</li>
<li>g</li>
</ul>
Example 312
1. a

  2. b

   3. c
<ol>
<li>
<p>a</p>
</li>
<li>
<p>b</p>
</li>
<li>
<p>c</p>
</li>
</ol>

Note, however, that list items may not be indented more than three spaces. Here - e is treated as a paragraph continuation line, because it is indented more than three spaces:

Example 313
- a
 - b
  - c
   - d
    - e
<ul>
<li>a</li>
<li>b</li>
<li>c</li>
<li>d
- e</li>
</ul>

And here, 3. c is treated as an indented code block, because it is indented four spaces and preceded by a blank line.

Example 314
1. a

  2. b

    3. c
<ol>
<li>
<p>a</p>
</li>
<li>
<p>b</p>
</li>
</ol>
<pre><code>3. c
</code></pre>

This is a loose list, because there is a blank line between two of the list items:

Example 315
- a
- b

- c
<ul>
<li>
<p>a</p>
</li>
<li>
<p>b</p>
</li>
<li>
<p>c</p>
</li>
</ul>

So is this, with an empty second item:

Example 316
* a
*

* c
<ul>
<li>
<p>a</p>
</li>
<li></li>
<li>
<p>c</p>
</li>
</ul>

These are loose lists, even though there is no space between the items, because one of the items directly contains two block-level elements with a blank line between them:

Example 317
- a
- b

  c
- d
<ul>
<li>
<p>a</p>
</li>
<li>
<p>b</p>
<p>c</p>
</li>
<li>
<p>d</p>
</li>
</ul>
Example 318
- a
- b

  [ref]: /url
- d
<ul>
<li>
<p>a</p>
</li>
<li>
<p>b</p>
</li>
<li>
<p>d</p>
</li>
</ul>

This is a tight list, because the blank lines are in a code block:

Example 319
- a
- ```
  b


  ```
- c
<ul>
<li>a</li>
<li>
<pre><code>b


</code></pre>
</li>
<li>c</li>
</ul>

This is a tight list, because the blank line is between two paragraphs of a sublist. So the sublist is loose while the outer list is tight:

Example 320
- a
  - b

    c
- d
<ul>
<li>a
<ul>
<li>
<p>b</p>
<p>c</p>
</li>
</ul>
</li>
<li>d</li>
</ul>

This is a tight list, because the blank line is inside the block quote:

Example 321
* a
  > b
  >
* c
<ul>
<li>a
<blockquote>
<p>b</p>
</blockquote>
</li>
<li>c</li>
</ul>

This list is tight, because the consecutive block elements are not separated by blank lines:

Example 322
- a
  > b
  ```
  c
  ```
- d
<ul>
<li>a
<blockquote>
<p>b</p>
</blockquote>
<pre><code>c
</code></pre>
</li>
<li>d</li>
</ul>

A single-paragraph list is tight:

Example 323
- a
<ul>
<li>a</li>
</ul>
Example 324
- a
  - b
<ul>
<li>a
<ul>
<li>b</li>
</ul>
</li>
</ul>

This list is loose, because of the blank line between the two block elements in the list item:

Example 325
1. ```
   foo
   ```

   bar
<ol>
<li>
<pre><code>foo
</code></pre>
<p>bar</p>
</li>
</ol>

Here the outer list is loose, the inner list tight:

Example 326
* foo
  * bar

  baz
<ul>
<li>
<p>foo</p>
<ul>
<li>bar</li>
</ul>
<p>baz</p>
</li>
</ul>
Example 327
- a
  - b
  - c

- d
  - e
  - f
<ul>
<li>
<p>a</p>
<ul>
<li>b</li>
<li>c</li>
</ul>
</li>
<li>
<p>d</p>
<ul>
<li>e</li>
<li>f</li>
</ul>
</li>
</ul>

6.11List attributes (M)

Lists in Markdown (as described by John Gruber and specified by CommonMark) have some pretty major issues. Here are two of the biggest ones:

This list gets rendered as 3,4,5, since Markdown does not have a way of specifying that a list is descending, and it does not infer it from the order by default:

before

3. foo
2. bar
1. baz

after
<p>before</p>
<ol start="3">
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

This looks like a list, but it is in fact a paragraph. There are no break tags after the foo, bar and baz, just soft breaks:

before

a. foo
b. bar
c. baz

after
<p>before</p>
<p>a. foo
b. bar
c. baz</p>
<p>after</p>

Again, both of these behaviours are as defined in the CommonMark spec.

Now, Markua can’t change this by default, without being incompatible with Markdown.

So what to do?

Markua adds the list attributes extension, to support being able to output all the needed types of list from the limited list support in Markdown. The way this is done is by giving lists an attribute list, much like how resources can have an attribute list. (Lists are not resources, however.)

6.11.1Supported Attributes for Lists

The following are the supported attributes for lists.

reversed : This determines whether the list is in ascending or descending order. Legal values are true or false. The default value is false, for ascending. Any value other than true is interpreted as false.

type : The type of the list marker to use. Valid choices are 1, A, a, I or i. A and a mean uppercase or lowercase letters; I or i mean uppercase or lowercase Roman numerals.

The use of these attributes is shown in the following examples.

6.11.2Lists with Attributes: Good Style

This list will be numbered a, b, c:

before

{type: a}
1. foo
2. bar
3. baz

after
<p>before</p>
<ol type="a">
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

This list will be numbered A, B, C:

before

{type: A}
1. foo
2. bar
3. baz

after
<p>before</p>
<ol type="A">
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

This list will be numbered c, d, e:

before

{type: a}
3. foo
4. bar
5. baz

after
<p>before</p>
<ol type="a" start="3">
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

This list will be numbered i, ii, iii:

before

{type: i}
1. foo
2. bar
3. baz

after
<p>before</p>
<ol type="i">
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

This list will be numbered I, II, III:

before

{type: I}
1. foo
2. bar
3. baz

after
<p>before</p>
<ol type="I">
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

This list will be numbered iii, iv, v:

before

{type: i}
3. foo
4. bar
5. baz

after
<p>before</p>
<ol type="i" start="3">
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

This list will be numbered 3, 2, 1:

before

{reversed: true}
3. foo
2. bar
1. baz

after
<p>before</p>
<ol start="3" reversed="true">
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

This list will be numbered c, b, a:

before

{type: a, reversed: true}
3. foo
2. bar
1. baz

after
<p>before</p>
<ol start="3" type="a" reversed="true">
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

This list will be numbered iii, ii, i:

before

{type: i, reversed: true}
3. foo
2. bar
1. baz

after
<p>before</p>
<ol start="3" type="i" reversed="true">
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

6.11.3Nested Lists with Attributes

Everything which has been said about attribute lists on lists applies to nested lists as well. This example shows a nested list of the form 1, a, i, * (the final nested list is an unordered list, not an ordered list):

before

1. lorem
   {type: a}
   1. foo
      {type: i}
      1. apple
         * Gala
         * Spartan
         * McIntosh
      2. orange
      3. banana
   2. bar
   3. baz
2. ipsum
3. dolor

after
<p>before</p>
<ol>
<li>lorem
<ol type="a">
<li>foo
<ol type="i">
<li>apple
<ul>
<li>Gala</li>
<li>Spartan</li>
<li>McIntosh</li>
</ul>
</li>
<li>orange</li>
<li>banana</li>
</ol>
</li>
<li>bar</li>
<li>baz</li>
</ol>
</li>
<li>ipsum</li>
<li>dolor</li>
</ol>
<p>after</p>

This nested list will be rendered somewhat like this:

before

1. lorem
   a. foo
        i. apple
           * Gala
           * Spartan
           * McIntosh
       ii. orange
      iii. banana
   b. bar
   c. baz
2. ipsum
3. dolor

after

6.11.4Lists with Attributes: Lazy Style

It is acceptable to just use the number of the first label, and rely on list behaviour.

This list will be numbered 1, 2, 3:

before

1. foo
1. bar
1. baz

after
<p>before</p>
<ol>
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

This list will be numbered 3, 4, 5:

before

3. foo
3. bar
3. baz

after
<p>before</p>
<ol start="3">
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

This list will be numbered c, d, e:

before

{type: a}
3. foo
3. bar
3. baz

after
<p>before</p>
<ol start="3" type="a">
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

This list will be numbered iii, iv, v:

before

{type: i}
3. foo
3. bar
3. baz

after
<p>before</p>
<ol start="3" type="i">
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

This list will be numbered 3, 2, 1:

before

{reversed: true}
3. foo
3. bar
3. baz

after
<p>before</p>
<ol start="3" reversed="true">
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

6.11.5Lists with Attributes: Redundant Style

This list will be numbered 1, 2, 3. This is just the default behaviour made explicit:

before

{type: 1, reversed: false}
1. foo
2. bar
3. baz

after
<p>before</p>
<ol>
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

6.11.6Lists with Attributes: Bad Style

This list will be numbered 3, 2, 1, regardless of the order of the numbers:

before

{reversed: true}
1. foo
2. bar
3. baz

after
<p>before</p>
<ol reversed="true">
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

This list will be numbered c, d, e, regardless of the order of the numbers:

before

{type: a}
3. foo
2. bar
1. baz

after
<p>before</p>
<ol start="3" type="a">
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

This list will be numbered 1, 0, -1:

before

{reversed: true}
1. foo
1. bar
1. baz

after
<p>before</p>
<ol reversed="true" start="1">
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

This list will be numbered i, 0, -1, since the Romans didn’t have the concept of zero or negative numbers (and thus neither do Roman numerals):

before

{type: i, reversed: true}
1. foo
1. bar
1. baz

after
<p>before</p>
<ol reversed="true" start="1" type="i">
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

This list will be numbered a, 0, -1, since there are no negative letters either:

before

{type: a, reversed: true}
1. foo
1. bar
1. baz

after
<p>before</p>
<ol reversed="true" start="1" type="a">
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>
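
The numbering behaviour shown in the examples of this section can be summarized with a small Python sketch. It is illustrative only and not part of the Markua Spec: the names list_labels, to_alpha and to_roman are hypothetical, and the fallback to plain decimal numbers for values below 1 is assumed from the i, 0, -1 and a, 0, -1 examples above.

```python
# Roman numeral symbols, largest first (lowercase, as for type: i).
ROMAN = [(1000, "m"), (900, "cm"), (500, "d"), (400, "cd"), (100, "c"),
         (90, "xc"), (50, "l"), (40, "xl"), (10, "x"), (9, "ix"),
         (5, "v"), (4, "iv"), (1, "i")]


def to_alpha(n):
    """Convert 1 -> a, 2 -> b, ..., 26 -> z, 27 -> aa."""
    letters = ""
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters = chr(ord("a") + rem) + letters
    return letters


def to_roman(n):
    """Convert a positive integer to a lowercase Roman numeral."""
    numeral = ""
    for value, symbol in ROMAN:
        while n >= value:
            numeral += symbol
            n -= value
    return numeral


def list_labels(count, marker_type="1", start=None, is_reversed=False):
    """Return the labels a reader would see for an ordered list.

    start mirrors the HTML start attribute; when it is absent, a
    reversed list counts down from the number of items.
    """
    if start is None:
        start = count if is_reversed else 1
    step = -1 if is_reversed else 1
    labels = []
    for number in range(start, start + step * count, step):
        if marker_type == "1" or number < 1:
            # Letters and Roman numerals have no zero or negatives,
            # so fall back to decimal (assumed from the examples).
            label = str(number)
        elif marker_type in ("a", "A"):
            label = to_alpha(number)
        else:
            label = to_roman(number)
        labels.append(label.upper() if marker_type in ("A", "I") else label)
    return labels
```

For example, list_labels(3, "i", start=1, is_reversed=True) produces the labels i, 0 and -1, matching the Roman numeral example in the Bad Style section.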

6.12Definition lists (M)

Definition lists are also supported in Markua. To define a definition list, use the following syntax:

term 1
: definition 1

term 2
: definition 2
<dl>
  <dt>term 1</dt>
  <dd>definition 1</dd>

  <dt>term 2</dt>
  <dd>definition 2</dd>
</dl>

There can be one to three spaces after the colon, or one tab.

Note that in HTML the terms produce <dt> elements (for “description term”) and the definitions produce <dd> elements (for “description details”). Since it is not certain that the definition list contains definitions, and since it is desirable for the syntax to remain simple, the <dd> element does not contain a <dfn> element.

There can only be one term per definition, but there can be multiple definitions for a term:

term 1
: definition 1a
: definition 1b

term 2
: definition 2
<dl>
  <dt>term 1</dt>
  <dd>definition 1a</dd>
  <dd>definition 1b</dd>

  <dt>term 2</dt>
  <dd>definition 2</dd>
</dl>

A single term definition list is a definition list, regardless of how many definitions for the term exist:

term
: definition
<dl>
  <dt>term</dt>
  <dd>definition</dd>
</dl>

Finally, each definition list item can contain newlines and multiple paragraphs. What you do here is indent the subsequent lines by the same amount of space as the initial line.

(If you do not indent the subsequent lines, then you’re ending the definition list and just starting a new paragraph.)

before

one
: foo
: bar

  baz

two
: foo

after
<p>before</p>
<dl>
  <dt>one</dt>
  <dd>foo</dd>
  <dd>
    <p>bar</p>
    <p>baz</p>
  </dd>

  <dt>two</dt>
  <dd>foo</dd>
</dl>
<p>after</p>

Note that single line breaks within the definitions (the <dd> not <dt>) are handled the same way as single line breaks within paragraphs, and are subject to the soft-breaks global setting which defaults to source for compatibility with Markdown.

before

one
: lorem
  ipsum

  dolor

  sit amet

two
: foo

after
<p>before</p>
<dl>
  <dt>one</dt>
  <dd>
    <p>lorem
    ipsum</p>
    <p>dolor</p>
    <p>sit amet</p>
  </dd>

  <dt>two</dt>
  <dd>foo</dd>
</dl>
<p>after</p>

Finally, with definition lists, one thing you often want to do is link to a specific definition, not just to the list itself. This is useful to do, since in a document with many definitions, it’s helpful if the reader scrolls to the right spot or opens to the right page.

To do this, just define a span id on the element itself, and then link to it.

foo{#foo}
: This is foo.

bar{#bar}
: This is bar.
<dl>
  <dt><span id="foo">foo</span></dt>
  <dd>This is foo.</dd>

  <dt><span id="bar">bar</span></dt>
  <dd>This is bar.</dd>
</dl>

Note that if you define an id attribute above the first definition list item, what you are doing is defining an id on the entire definition list. As such, this does not work on any subsequent list item: in the following definition list, the {#definitions} is the id of the definition list. It is not the id of foo.

{#definitions}
foo{#foo}
: This is foo.

bar{#bar}
: This is bar.
<dl id="definitions">
  <dt><span id="foo">foo</span></dt>
  <dd>This is foo.</dd>

  <dt><span id="bar">bar</span></dt>
  <dd>This is bar.</dd>
</dl>

To be clear, the following is not legal in Markua:

{#foo}
foo
: This is foo.

{#bar}
bar
: This is bar.
<dl id="foo">
  <dt>foo</dt>
  <dd>This is foo.</dd>

  <dt>bar</dt>
  <dd>This is bar.</dd>
</dl>

Here, the id above foo is actually the id of the entire definition list, and the id above bar is not legal. A Markua Processor should ignore this id, and add an error to the list of errors.

7Quizzes and exercises (M)

Markua was initially designed for the writing of books, but it has been extended to support creating courses. For example, Leanpub authors can click a button to create a massive open online course (MOOC), complete with automated marking, entirely from a Markua document.

To convert a Markua document from a book into a course, all you need to do is add quizzes and exercises.

Over the past decade, there has been a steady growth of interest in courses delivered over the internet at massive scale. These Massive Open Online Courses, or MOOCs, consist of essentially four things:

  1. Reading material
  2. Video or audio lectures
  3. Exercises, with answers provided to the student
  4. Quizzes, with answers used to automatically mark the quiz

It turns out the four things in this list all work perfectly in a Markua document. So, not only can Markua be used to easily create a textbook which includes video, audio, images and quizzes, it is also an amazingly simple and flexible way of creating a MOOC. A MOOC is essentially just a textbook which is executable, plus discussion forums and credentials. For example, Leanpub authors can click one button to create a massive open online course (MOOC), complete with automated marking for all the quizzes in the course, entirely from one Markua document.

Quizzes and exercises are essentially the same. The only difference is that quizzes are intended to be marked, and exercises are not. Because of their similarities, they are discussed here together.

Quizzes or exercises in a textbook consist of two things:

  1. Questions, typically in the chapter itself.
  2. Answers, typically at the back of the book.

The questions are placed in the chapter like any other block element, such as an aside or blurb. The answers are positioned at the back of the book, along with other elements like the index and appendices. The author can control where the answers are positioned using insert directives, discussed earlier.

There is only one syntax to create a quiz or exercise. For a quiz, it’s by wrapping the quiz in {quiz}{/quiz}; for an exercise, it’s by wrapping the exercise in {exercise}{/exercise}.

Here is a brief example of a quiz:

{quiz, id: quiz1}
? How many letters are in the word Markua?

a) 5
B) 6
c) 7

? How many unique letters are in the word Markua?

! 5
{/quiz}

This quiz contains two questions: a multiple-choice question where the correct answer is b, and a fill-in-the-blank question where the correct answer is 5. Quizzes and exercises have the same question types, discussed later.

With a quiz, the id attribute is required. This is so the identity of a quiz can be preserved across generations of a course.

Here is the same example, but as an exercise:

{exercise, id: exercise1}
? How many letters are in the word Markua?

a) 5
B) 6
c) 7

? How many unique letters are in the word Markua?

! 5
{/exercise}

Just as with a quiz, the id attribute is required on an exercise. This is so the identity of an exercise can be preserved across generations of a course.

A quiz or exercise can contain any Markua content, not just questions and answers. This is true regardless of whether the quiz or exercise is in a MOOC, an ebook or on paper. Note that video and audio resources don’t work so well on paper, however.

If a quiz or exercise starts with any type of heading immediately after the {quiz} or {exercise} line, this heading’s content should be considered the name of the quiz or exercise. This can be used in a list of quizzes or exercises produced by the Markua Processor. Typically the heading will be a chapter heading (# ), but section headings (## ) and lower headings are also supported. (The reason for this is that quizzes are sometimes top-level things, and other times are nested inside chapters, sections or sub-sections. Some course authors would correctly feel that the quiz should have the appropriate level of heading given its position in the document.)

Example:

{quiz, id: quiz2}
# Markua Quiz

Watch this [video](https://www.youtube.com/watch?time_continue=1&v=VOCYL-FNbr0)
of Peter explaining Markua.

? What year was that video from?

What year? Really? Did it really take that long? What was going on???

a) 2012
b) 2013
C) 2014
d) 2015

{words: 500}
? Why do you think the first version of the Markua spec took so long?

Look at the Leanpub website and read the [pricing
essay](https://leanpub.com/pricing).

! Answers could include "bootstrapped startup", the spec evolving, etc.

That's it for this quiz, and this MOOC!

**Thanks for taking my course!**
{/quiz}

There are four types of questions supported by Markua.

  1. Multiple Choice
  2. Multiple Selection
  3. Fill In The Blank
  4. Written

These types are not specified by a {type} attribute. Instead, they are inferred from properties of the answers or from other attributes of the question.

The exact way to create these types of questions is discussed further below.

7.1Quiz and exercise output (M)

A Markua Processor has many degrees of freedom in terms of whether, and how, to output quizzes and exercises. As such, no HTML mapping is shown here.

For example, when outputting an online course or MOOC, the Markua Processor can basically do whatever it wants. It can choose to output an ebook of the course material only, and put all quizzes and exercises only in the online version. Or it can choose to put all the quizzes only in the online version, and include the exercises in the ebook version as well. Or it can include the quizzes and exercises in the ebook version, but only include the answers for the exercises in the ebook version.

If, on the other hand, a Markua Processor is outputting a textbook, it may choose to output the quizzes in an entirely separate ebook or print-ready PDF, for use in physical classrooms.

If a Markua Processor does output a quiz or exercise, it needs to do so in a medium-appropriate way. This includes outputting multiple-choice questions without showing their answers, of course. The display of fill in the blank and essay questions varies greatly based on whether the question is in an online quiz or exercise, in an ebook or on paper.

Markua Processors are encouraged to be creative here.

However, there are some rules. If a given quiz or exercise is output by a Markua Processor in an ebook or a physical book, the following things must occur:

  1. The quiz or exercise must be output as questions-only, in the place in the document where the quiz was defined.
  2. The questions must be numbered sequentially, incrementing by 1 for each question.
  3. The questions must be numbered starting from the number specified by the start-at attribute if specified, or 1 if no start-at attribute is specified.
  4. The multiple choice options in any question must be converted into a set of choices which all look the same, so that the correct answer is not indicated. Converting all choice letters to lowercase is sufficient here.
  5. The answers, if provided, must be positioned somewhere separate from the questions, typically at the back of the book. This position can be controlled by the author using insert directives, discussed earlier.
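
Rules 2 to 4 above are mechanical enough to sketch in Python. This is a minimal illustration under stated assumptions, not a required implementation: the names question_numbers and neutral_choices are hypothetical, and the sketch only handles the plain a) / B) choice style, not the C) / m) / o) markers used with choose-answers.

```python
import re

# A choice line: one letter, a right parenthesis, then the choice text.
CHOICE = re.compile(r"^([A-Za-z])\)\s+(.*)$")


def question_numbers(question_count, start_at=1):
    """Number questions sequentially, beginning at start-at (default 1)."""
    return list(range(start_at, start_at + question_count))


def neutral_choices(choice_lines):
    """Lowercase every choice letter so the correct answer is not revealed."""
    neutral = []
    for line in choice_lines:
        match = CHOICE.match(line)
        if match is None:
            raise ValueError(f"not a choice line: {line!r}")
        letter, text = match.groups()
        neutral.append(f"{letter.lower()}) {text}")
    return neutral
```

With the quiz shown earlier, neutral_choices(["a) 5", "B) 6", "c) 7"]) yields a) 5, b) 6 and c) 7, so nothing in the output distinguishes the correct choice.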

7.2Quiz and exercise validation (M)

7.2.1An empty quiz or exercise is not an error

A quiz or exercise which contains no questions is not an error. Instead, if a Markua Processor encounters a quiz or exercise with no questions, it must filter the quiz or exercise from the output, optionally providing a warning to the author.

This lets authors create placeholders for quizzes or exercises in their courses before the quizzes or exercises are ready, which is potentially very useful in an in-progress course.

7.2.2A malformed quiz or exercise is an error

If a Markua Processor encounters a malformed quiz or exercise it must treat this as an error and not generate the output from the Markua document. Quizzes and exercises are not something that should ever be produced in a broken state.

However, it is also an error to parse quiz syntax outside a quiz or exercise block. A Markua Processor must not parse lines starting with ? or ! as representing questions or answers unless those are contained in a quiz or exercise block.

7.3Quiz and exercise attributes (M)

These are the supported attributes on quizzes and/or exercises:

attempts : The number of allowed attempts on a quiz. The default is defined by the value of default-quiz-attempts on the containing course, or 1 if this is not present. A value of 0 means the quiz cannot be taken (yet). A value of -1 means the quiz has an unlimited number of attempts. Since an exercise does not count toward the mark on a course, an exercise always has an unlimited number of attempts.

auto-submit : true or false. The default is true. If true, an incomplete quiz is submitted when the time limit expires. If false, it is not. Either way, an incomplete quiz counts as an attempt.

case-sensitive : true or false. The default is true. This sets the default behaviour of fill in the blank questions. If true, the fill in the blank question answers are case-sensitive. If false, they are not.

id : All Markua elements support an id attribute. The reason the id attribute is explicitly listed here is to emphasize that a Markua Processor may require an id attribute on a quiz or exercise. For example, Leanpub requires the id attribute on all quizzes, in order to determine the identity of quizzes when a course is being published in-progress. (As a student, you’d be pretty unhappy if you had to re-take an unchanged quiz simply because a professor published a new course version.)

mark-request-url : If omitted, all the quiz or exercise answers are defined in the Markua document. If present, some or all of the quiz or exercise questions are externally marked via an API defined at the URL specified in this attribute. When the quiz or exercise is completed, the entire quiz or exercise should be sent to the endpoint at the mark-request-url. Here’s how this works in Leanpub; other Markua Processors should presumably do something similar. When a quiz or exercise with this attribute defined is completed, its state is set to “pending”. Leanpub then sends the quiz or exercise data as a JSON payload to the marking endpoint. This JSON contains the following attributes: mark-response-url (which defines where responses are sent), quiz_id, quiz_version, quiz_hash (an SHA hash, for an automated quiz versioning approach which does not rely on the course author updating a version attribute) and quiz_results (an array of question data structures containing the question, the possible answers, the correct answer according to the Markua document, the answer provided by the student, and (if the question is markable) the mark determined automatically). The API endpoint should send the results of marking the quiz or exercise to the mark-response-url given in the payload. The format of the expected response payload should be defined by the Markua Processor. For Leanpub this is a JSON payload containing the question ids and the marks for each question. The mark is a decimal number of points between 0 and the maximum number of points for the question, inclusive. The decimal supports two decimal places, so you can get 3.75 out of 4, for example. The mark response must include marks for all questions that are unmarked. It may also include marks for questions that Leanpub marked, and those will override Leanpub’s marks. Finally, note that even though exercises are not worth points in a course, they can be marked, for the benefit of the student. So, since they can be marked, they can also be externally marked via a mark-request-url. However, the expectation is that this attribute will be used primarily by quizzes.

points : If present, this is the total number of points the quiz or exercise is worth. (This really only matters for quizzes, but is supported for exercises as well, in case a Markua Processor wishes to display the points on exercises to make them feel more real.) If points is not present, the worth of the quiz is determined by summing the points of the questions. (Questions are worth 1 point each if they have no points attribute.) If the quiz has a points attribute and its questions also have points attributes, the worth of each question in a larger course context is determined as follows: its points are the numerator, and the total points in the quiz or exercise is the denominator.

random-choice-order : true or false. The default is false. This sets the default behaviour of multiple choice questions. If true, the choices in the multiple choice question are randomly arranged; if false, they are presented in the order written.

random-question-order : true or false. The default is false. This sets the default behaviour of the quiz or exercise. If true, the questions are randomly arranged; if false, they are presented in the order written.

start-at : The start-at is the number of the first question. The default is 1. Any integer is permitted. Subsequent questions will have a number which is 1 higher than the previous question.

show-answers : This can be all, incorrect or none. It affects how answers are shown after a quiz or exercise is completed, say in a MOOC. For exercises, the default value is determined by the value of the default-exercise-show-answers global setting, with all being the default value of that attribute. For quizzes, the default value is determined by the value of the default-quiz-show-answers global setting, with incorrect being the default value of that attribute. Global settings are discussed later.

time-limit : The time limit to finish the quiz, once started. The format is XdYhZm. For example, 3 days, 6 hours and 45 minutes is expressed as 3d6h45m; 7 days is expressed as 7d. The default is 7d.

use-result : best or latest. Whether the best result on the quiz is used, or the latest one. The default is the value of default-quiz-use-result on the course.

version : The version of the quiz. This does not replace the function of the id; it’s more for use in analytics by the instructor. The default is 1.

As discussed above, there is no title attribute for a quiz. Instead, you can add a heading inside the quiz or exercise itself, using the normal Markua formatting for a chapter heading.
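
The XdYhZm format of the time-limit attribute described above is simple to parse. The following Python sketch shows one possible approach. The function name parse_time_limit is hypothetical, and the sketch assumes that each of the day, hour and minute parts is optional but must appear in that order, which is consistent with the 3d6h45m and 7d examples but is not stated explicitly by the spec.

```python
import re
from datetime import timedelta

# Optional days, hours and minutes, in that order (an assumption).
TIME_LIMIT = re.compile(r"^(?:(\d+)d)?(?:(\d+)h)?(?:(\d+)m)?$")


def parse_time_limit(value="7d"):
    """Parse a time-limit such as 3d6h45m; the default mirrors the spec (7d)."""
    match = TIME_LIMIT.match(value)
    if not value or match is None:
        raise ValueError(f"invalid time-limit: {value!r}")
    days, hours, minutes = (int(part or 0) for part in match.groups())
    return timedelta(days=days, hours=hours, minutes=minutes)
```

A Markua Processor could add the resulting duration to the time at which the quiz was started to obtain its deadline.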

7.4Multiple choice questions (M)

A multiple choice question has 2 or more answer choices, and 1 correct answer choice.

The correct answer choice has a capital letter before the right parenthesis; incorrect answer choices have lowercase letters before the right parenthesis.

Example:

? How many letters are in the word Markua?

a) 5
B) 6
c) 7

Obviously, when generating the question in the actual quiz or exercise, a Markua Processor must make all answer choices have the same type of letter. This is usually a lowercase letter, although either all lowercase or all uppercase letters would be fine.

Unless a choose-answers attribute is used, the multiple choice answers all must start from a or A, and must use a right-parenthesis after the a or A. Any line starting with a) or A) in a quiz is considered a set of multiple choice quiz answers, not an ordered list using a) or A) as a delimiter. If you want to put an ordered list in a quiz body, use periods for the delimiter.

A multiple choice question may also have a dynamic number of answer choices, including for the correct answer. This is done with the special choose-answers attribute, shown and explained below.

{choose-answers: 4}
? How many grams are in a pound?

C) 454
C) 453
m) 451
m) 1000
o) 100
o) 150
o) 200
o) 250
o) 300
o) 500

The choose-answers attribute specifies how many answer choices should be shown. This includes exactly one of the correct answers (indicated with C), all of the mandatory incorrect answers (indicated with m) and as many of the optional incorrect answers (indicated with o) as are needed for the question to have the total number of answers as indicated by the choose-answers attribute.

So, in the above example, either 453 or 454 will be shown, along with the mandatory incorrect answer choices 451 (a literary joke) and 1000 (a kilogram, not a pound) and one of the optional incorrect answers (100, 150, 200, 250, 300 or 500).

When a choose-answers attribute is used, the question will always have random-choice-order.
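
The selection described above can be sketched in Python as follows. This is only an illustration: the function name choose_answers and the (marker, text) input format are hypothetical, and the conditions under which the sketch raises an error are assumptions about what makes the attribute unsatisfiable, not a statement of the spec's error rules.

```python
import random


def choose_answers(choices, total, rng=None):
    """Pick the answer choices to show for a choose-answers question.

    choices is a list of (marker, text) pairs, where marker is
    "C" (correct), "m" (mandatory incorrect) or "o" (optional incorrect).
    """
    rng = rng or random.Random()
    correct = [text for marker, text in choices if marker == "C"]
    mandatory = [text for marker, text in choices if marker == "m"]
    optional = [text for marker, text in choices if marker == "o"]

    # One correct answer and every mandatory answer are always shown;
    # optional answers fill the remaining slots.
    needed = total - 1 - len(mandatory)
    if not correct or needed < 0 or needed > len(optional):
        raise ValueError("choose-answers cannot be satisfied by these choices")

    shown = [rng.choice(correct)] + mandatory + rng.sample(optional, needed)
    rng.shuffle(shown)  # choose-answers implies random-choice-order
    return shown
```

For the grams-in-a-pound question, calling this with total set to 4 returns four choices: either 453 or 454, both 451 and 1000, and one of the six optional answers, in random order.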

The following are errors in a question where a choose-answers attribute is used:

7.4.1Supported attributes on multiple choice questions

choose-answers : This is described above. If choose-answers is used, random-choice-order is forced to true.

points : The number of points the question is worth. This number can be 0 or higher. The default is 1.

random-choice-order : true or false. The default is false, unless choose-answers is used. This sets the behaviour of the specific multiple choice question. If true, the choices in the multiple choice question are randomly arranged; if false, they are presented in the order written. If this attribute is omitted, its value is determined by the random-choice-order attribute on the quiz itself, which defaults to false if absent.

7.5Multiple selection questions (M)

A multiple selection question has 2 or more answer choices, and 1 or more true answer choices.

(Leanpub authors: This question type is not yet supported in Leanpub.)

The true answer choices are specified with T; false answer choices are specified with F.

To get the question 100% correct, you must select ALL answers which are true (not just one of them) and NONE of the answers which are false. As discussed later, the mark attribute determines whether the marking is all-or-nothing (with the binary value) or whether partial credit can be earned (with the ratio value).

Regardless of the value of the mark attribute, to get any score for a multiple selection question, at least one answer choice must be selected. Leaving a multiple selection question blank always gets zero points, even if the mark attribute is ratio. Otherwise, if a test had nothing but multiple selection questions with ratio-marked questions, with half the answers true and half the answers false, you could get 50% on the test by just handing it in blank!

7.5.1Supported attributes on multiple selection questions

mark : binary or ratio. The default is binary. This sets the marking behaviour of the specific multiple selection question. If binary, the question is all or nothing: you either get full points for the question, or a 0 on the question. To get full points, you must select every true choice and none of the false choices. If ratio, you get a fractional score defined as (number of true choices selected + number of false choices not selected) / (total number of choices). This fraction is then multiplied by the points that the question is worth to compute the points that you received on the question. Note that regardless of whether the mark is binary or ratio, a question which is left blank is always completely wrong, earning no credit for the correctly-not-selected false choices.

points : The number of points the question is worth. This number can be 0 or higher. The default is 1. The mark attribute is used to determine how the points are computed. Again, regardless of whether the mark is binary or ratio, a question which is left blank always earns 0 points.

random-choice-order : true or false. The default is false. This sets the behaviour of the specific multiple selection question. If true, the choices in the multiple selection question are randomly arranged; if false, they are presented in the order written. If this attribute is omitted, its value is determined by the random-choice-order attribute on the quiz itself, which defaults to false if absent.

7.5.2Examples of multiple selection questions

? Which of these are animals?

F) apple
T) cow
F) lettuce

{mark: ratio, points: 2}
? Which of these are fruits?

T) apple
F) cow
F) lettuce
T) peach
T) pear
F) rice

Here the first question defaults to binary and 1 point; the second question is ratio and 2 points.

To get the first question correct, you must select cow, and neither of apple or lettuce.

To get this second question 100% correct, you must select all of apple, peach and pear, and none of cow, lettuce or rice. For this question, if the mark attribute had been binary, the only way to get any points for it would be to select apple, peach and pear, and to leave cow, lettuce and rice all unselected. However, the mark is not binary; it is ratio. So, if you selected apple, peach and rice, then the score would be computed as follows: ((2 correctly-selected true choices + 2 correctly-not-selected false choices) / 6 choices) * 2 points = (4/6) * 2 points = approximately 1.33 points. Since at least one choice was selected, you get just as much credit for correctly not selecting cow and lettuce as you do for correctly selecting apple and peach. If an instructor thinks this is too lenient, then binary is the choice for them!
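
The marking rules for multiple selection questions can be expressed as a short Python sketch. The function name mark_multiple_selection and its argument shapes are hypothetical; the logic follows the binary and ratio definitions given for the mark attribute, including the rule that a blank response always earns zero.

```python
def mark_multiple_selection(choices, selected, mark="binary", points=1):
    """Compute the points earned on a multiple selection question.

    choices maps each choice text to True (a T choice) or False (an F choice);
    selected is the set of choice texts the student selected.
    """
    if not selected:
        # A blank response always earns zero, even with ratio marking.
        return 0.0

    # A choice is handled correctly if it is true and selected,
    # or false and not selected.
    right = sum(1 for text, is_true in choices.items()
                if (text in selected) == is_true)

    if mark == "binary":
        return float(points) if right == len(choices) else 0.0
    if mark == "ratio":
        return points * right / len(choices)
    raise ValueError(f"unknown mark value: {mark!r}")


fruits = {"apple": True, "cow": False, "lettuce": False,
          "peach": True, "pear": True, "rice": False}

# Selecting apple, peach and rice: (2 + 2) / 6 * 2 points under ratio marking.
partial = mark_multiple_selection(fruits, {"apple", "peach", "rice"},
                                  mark="ratio", points=2)
```

Under binary marking the same response earns nothing, since pear was missed and rice was wrongly selected.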

7.6Fill in the blank questions (M)

A fill in the blank question consists of a question and a set of answers. Each answer is specified by !, an optional points value, a space, and then a semicolon-separated list of the acceptable values of that answer. Each answer value can be a text string (quoted or not) or a regular expression (regex). If a points value is not specified for an answer, the answer is worth full points.

Support for regular expression answer values is optional. However, a Markua Processor which supports regular expression marking must interpret any answer which starts with a forward slash (/) and ends with a forward slash followed by some word characters (e.g. i) as being a regular expression. Note that the particular format of the regular expression used is implementation-specific. For example, Leanpub uses Ruby regular expressions. Other Markua Processors could, for example, use Perl or JavaScript regular expressions.

Finally, note that you can separate regular expressions with semicolons, just like any other answer value. There’s no reason not to support this, and it may lead to simpler regular expressions. However, if you’re good at regular expressions, you can also combine them into one regular expression, of course.

Note that since a semicolon is used to separate answer values, to provide an actual semicolon as part of an answer value you must either put the answer value in quotes, use a backslash-escape \; or make the semicolon part of a regular expression.
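
To make the answer syntax above concrete, here is a hedged Python sketch of how an answer line might be split into its points value and answer values. The names split_values and parse_answer are hypothetical, and details the spec leaves open (such as whether quotes are kept, or how whitespace around values is treated) are assumptions of this sketch: it drops quotes, trims whitespace and keeps regular expressions verbatim.

```python
import re

# "!", an optional points value, whitespace, then the answer values.
ANSWER_LINE = re.compile(r"^!(\d*\.?\d+)?\s+(.*)$")
REGEX_VALUE = re.compile(r"^/.*/\w*$")


def split_values(text):
    """Split on semicolons that are not quoted, escaped or inside /regex/."""
    values, buf = [], []
    in_quotes = in_regex = False
    i = 0
    while i < len(text):
        ch = text[i]
        at_value_start = not "".join(buf).strip()
        if ch == "\\" and i + 1 < len(text):
            # Keep regex escapes verbatim; unescape \; and \" in text.
            buf.append(ch + text[i + 1] if in_regex else text[i + 1])
            i += 2
            continue
        if in_regex:
            if ch == "/":
                in_regex = False  # closing slash; flags may follow
            buf.append(ch)
        elif ch == '"':
            in_quotes = not in_quotes
        elif ch == "/" and at_value_start and not in_quotes:
            in_regex = True
            buf.append(ch)
        elif ch == ";" and not in_quotes:
            values.append("".join(buf).strip())
            buf = []
        else:
            buf.append(ch)
        i += 1
    values.append("".join(buf).strip())
    return [value for value in values if value]


def parse_answer(line):
    """Return (points or None, [(kind, value), ...]) for one answer line."""
    match = ANSWER_LINE.match(line)
    if match is None:
        raise ValueError(f"not an answer line: {line!r}")
    points = float(match.group(1)) if match.group(1) else None
    values = [("regex" if REGEX_VALUE.match(v) else "text", v)
              for v in split_values(match.group(2))]
    return points, values
```

A points value of None here stands for "worth full points", which the caller would resolve against the points attribute of the question.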

Examples:

? How many unique letters are in the word Markua?

! 5

? What's the global capital of investment banking?

! New York ; London

? What's the global capital of investment banking?

! "New York" ; "London"

? What's the global capital of investment banking?

! New York
! London

? What's the global capital of investment banking?

! "New York"
! "London"

{case-sensitive: false}
? What's pi?

! "The ratio of a circle's circumference to its diameter" ; 3.14 ... 3.1416 ;
an irrational number

{case-sensitive: false}
? What's pi?

! "The ratio of a circle's circumference to its diameter"
! 3.14 ... 3.1416
! an irrational number

? Where's the Eiffel Tower?

! /(Paris|France)/i

? Where's the Eiffel Tower?

! /Paris/i ; /France/i

{points: 2}
? Where's the Eiffel Tower?

! /Paris/i
! /France/i

{points: 2}
? Where's the Eiffel Tower?

!2 /Paris/
!1 /paris/i
!.5 /France/i

{points: 2}
? Where's the Eiffel Tower?

! /Paris/
!1 /paris/i
!.5 /France/i

{points: 2}
? Where's the global capital of investment banking?

!2 New York ; London
!1 USA ; UK

As shown by the answer ("The ratio of a circle's circumference to its diameter" ; 3.14 ... 3.1416 ; an irrational number), acceptable answer values in a fill in the blank question can be of completely different types, and numeric answer values can be expressed as ranges (min <= x <= max), expressed as min ... max. Also, this answer shows that quotes are optional around text strings. The reason to use quotes is for clarity, or to ensure that any semicolons used are treated as semicolons instead of as answer value delimiters. Semicolons inside quotes are just semicolons and do not need to be backslash-escaped. You do, however, need to backslash-escape a quote if you want it to be treated as a literal quote, instead of the start or end of a string.

If there are multiple answers to a fill in the blank question and an answer key is being output by the Markua Processor for use by human markers, the acceptable values should be clearly distinguished from each other. The recommendation is to use an unordered list of acceptable values, one per line, but there are no requirements here.

If you’re a programmer, you may wonder what number types and formats are supported. Are they integers? Floating point numbers? Can you use scientific notation? And how are the numbers formatted? Are they US or Canadian numbers (123,456.78) or EU numbers (123 456,78)? What if there’s a number that means something different in the US and EU–does 123,456 mean 123456 or 123.456?

These questions are made harder by the fact that Markua can be used to create human-marked paper quizzes as well as automatically-marked MOOCs. Rules strict enough to make the syntax unambiguous for MOOCs would prevent authors from specifying answers for human-marked paper quizzes with a tolerable amount of ambiguity. For this reason, the format of the answer values in a fill in the blank question is out of scope of the Markua Spec.

7.6.1Supported attributes on fill in the blank questions

points : The number of points the question is worth. This number can be 0 or higher. The default is 1. The answers must either not specify points (in which case they are worth the full value of points that the question is worth), or they must specify points between 0 and the points value.

case-sensitive : true or false. The default is true. This sets the behaviour of the specific fill in the blank question. If true, the fill in the blank question answer is case-sensitive. If false, it is not. In the case of multiple acceptable answer values, this attribute applies to all of them. Note that this only applies to text string answers, not to regular expressions. For a regular expression to be case-insensitive, you must end it with an i after the closing slash, as in /Paris/i.
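To make the interaction between points, case-sensitive and regular expression answers concrete, here is a minimal Python sketch. It is not a definitive implementation: the names `matches` and `score` are hypothetical, the sketch assumes that a regular expression may match anywhere in the response, and it assumes that when several answers match, the highest points value is awarded. Neither assumption is stated by the spec.

```python
import re
from typing import Optional

def matches(response: str, value: str, case_sensitive: bool) -> bool:
    """Check one response against one answer value (text or /regex/ or /regex/i)."""
    regex = re.fullmatch(r"/(.*)/(i?)", value)
    if regex:
        # The case-sensitive attribute does not apply to regular expressions;
        # only a trailing i makes them case-insensitive.
        flags = re.IGNORECASE if regex.group(2) else 0
        return re.search(regex.group(1), response, flags) is not None
    if case_sensitive:
        return response.strip() == value
    return response.strip().lower() == value.lower()

def score(response: str,
          answers: list[tuple[Optional[float], str]],
          question_points: float = 1,
          case_sensitive: bool = True) -> float:
    """Return the points earned; answers are (points or None, value) pairs."""
    best = 0.0
    for points, value in answers:
        if matches(response, value, case_sensitive):
            # An answer without points is worth the full value of the question.
            worth = question_points if points is None else points
            best = max(best, worth)
    return best
```

With the answers `! /Paris/`, `!1 /paris/i` and `!.5 /France/i` on a question worth 2 points, this sketch awards 2 for "Paris", 1 for "paris" and 0.5 for "france".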

7.7Written questions (M)

A written question corresponds to short answer, long answer or essay questions in traditional tests. It looks the same as a fill in the blank question, except the answer is optional.

A question is interpreted as a written question if any of the words, lines or pages attributes is specified, if no answer is provided, or if an answer is provided using the {answer} syntax discussed shortly.

Note that an answer may be provided in a written question. If this is done, the answer is not split into answer choices and values like a fill in the blank answer. Instead, the answer is essentially a “note to markers”, whether those markers are underpaid graduate students or unpaid AIs. Markua does not specify any microformat for this note to markers: it is just Markua text, kind of like a blockquote (but with each line starting with ! not >).

A Markua Processor generating a MOOC may handle written questions at its own discretion, including omitting them or giving them a points value of 0.

Examples:

{quiz, id: "quiz3"}

{pages: 2, points: 10}
? Why is 2019 like *1984*?

! If the student mentions Newspeak, give them an extra mark.

{words: 100, points: 2}
? Why is doubleplusungood the worst?

{lines: 10}
? Can you have multiple line answers in a written question?

! You bet you can!
!
! You can separate them with blank lines,
! and without blank lines.
!
! This is like a blockquote, which uses `>` characters.

Since a written question can have a long answer, typing that answer after a bunch of ! delimiters can be a pain. So, written questions (and only written questions) also support answers in the form of {answer} ... {/answer}, like this:

{quiz, points: 0, id: "quiz3"}

{lines: 10}
? Write a function in Ruby that takes an argument and returns three times that
argument.

{answer}
The most straightforward way to do this is like this, since the last value
evaluated in the function is returned:

```ruby
def triple(y)
  y*3
end
```

However, you can also explicitly say `return` if you like:

```ruby
def triple(y)
  return y*3
end
```
{/answer}
{/quiz}

7.7.1Supported attributes on written questions

Note that only one of words, lines or pages may be provided. Providing more than one is an error. Providing none of them means that the question is a fill in the blank question, not a written question.

points : The number of points the question is worth. This number can be 0 or higher. The default is 1. A Markua Processor generating a MOOC may override this. For example, when generating a MOOC, Leanpub overrides all written questions to have a points attribute of 0, regardless of what (if anything) is specified for this attribute by the author.

words : The maximum number of words the answer can be. Obviously this is more useful for online quizzes than paper ones.

lines : The maximum number of lines the answer can be. Obviously this is more useful for paper quizzes than online ones. On paper, this is most useful for a short or long answer question.

pages : The maximum number of pages the answer can be. Obviously this is more useful for paper quizzes than online ones. On paper, this is most useful for an essay question.
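The rule that only one of words, lines or pages may be provided can be checked with a few lines of Python. This is a sketch under stated assumptions: the function name `answer_size_limit` is hypothetical, and the attribute list is assumed to have already been parsed into a dictionary.

```python
from typing import Optional

SIZE_ATTRIBUTES = ("words", "lines", "pages")

def answer_size_limit(attributes: dict) -> Optional[tuple]:
    """Return (attribute name, maximum) for a question, or None if no limit is given."""
    present = [name for name in SIZE_ATTRIBUTES if name in attributes]
    if len(present) > 1:
        # Providing more than one of these attributes is an error.
        raise ValueError(
            "only one of words, lines or pages may be provided, got: "
            + ", ".join(present)
        )
    if not present:
        return None
    name = present[0]
    return name, attributes[name]
```

For example, the attribute list `{pages: 2, points: 10}` yields `("pages", 2)`, while `{words: 100, lines: 10}` raises an error.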

7.8Hints on questions (M)

Any question can contain a hint, regardless of question type or whether an answer is provided.

The hint starts on a line with a percent sign (%) followed by a single space. It must follow the question, but it can come either before or after the answer choices.

Hints can span multiple lines, if each line starts with a % sign. This is similar to how Markdown handles block quotes with >.

A Markua Processor can show the hint to students at appropriate times, such as when they get the answer wrong or leave it blank. The exact specifics are implementation-dependent. Hint support is optional in a Markua Processor. However, a Markua Processor that does not support hints must ignore them.

? What's 1 + 2?

% In a multiple choice question, if you're not sure, `c` is usually a good guess.

a) 1
b) 2
C) 3
d) 4

? Explain the meaning of *The Myth of Sisyphus*.

% Not the actual myth, the essay by Camus.
%
% For bonus points, speculate about Camus' life, death, and what he would have
% thought about autonomous cars.

? Where's the Eiffel Tower?

! /(Paris|France)/i

% The answer must contain the city and/or country.
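Collecting the hint from the lines of a question, as in the examples above, can be sketched as follows. The function name `extract_hint` is hypothetical, and treating a bare `%` line as a blank line inside the hint is an assumption based on the second example, not a stated rule.

```python
from typing import Optional

def extract_hint(question_lines: list[str]) -> Optional[str]:
    """Join the hint lines of one question, or return None if it has no hint."""
    hint_lines = []
    for line in question_lines:
        if line.startswith("% "):
            hint_lines.append(line[2:])  # drop the percent sign and the space
        elif line.rstrip() == "%":
            hint_lines.append("")  # assumed: a bare % is a blank line in the hint
    return "\n".join(hint_lines) if hint_lines else None
```

A processor without hint support would skip these same lines instead of collecting them.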

The hint can be a useful feature for a Markua Processor. There is a lot of discretion here for Markua Processors to compete on features.

For example, here’s how Leanpub handles hints:

  1. We will show the hint as a popup next to the question in the web and mobile quiz views.
  2. We will track if the student looked at the hint.
  3. We will not subtract points for looking at hints for Leanpub-marked questions.
  4. Whether the hint was used will be provided in the analytics, such as a CSV download that we make available to professors.
  5. We will also output the hints for exercises in the “material book” for a course, in a section near the end of the book. This will be linked to via crosslinks from each exercise.
  6. The answers page for a quiz or exercise will show the question, hint and answers for each question.

7.9Question alternates (M)

The fact that a Markua document can be used to create an online course or MOOC means that certain aspects of the syntax for quizzes and exercises are more robust than they would otherwise be. One example of this is question alternates.

In an online course or MOOC, some professors might not want every question the same, despite the fact that question order and answer order can be randomized. So, Markua supports question alternates, using a simple (if slightly ugly) syntax. Question alternates are only supported in quizzes, since they make no sense to include in exercises.

To create question alternates, every question in the quiz (not just those with alternates) must be numbered using a ?# syntax: a question mark followed by the number of the question, e.g. ?1, ?2, ?3. The numbers are sequential positive integers starting from 1: 1, 2, 3, etc.

The alternates are specified by providing the same number for multiple questions, e.g. ?1, ?1, ?1, ?2, ?3, ?4, ?4, ?5. When the actual quiz is given, only one of the questions for the given question number is used.

Note that only the first question with a given number may have a points attribute–since all other alternates must use the same points value, specifying it would be pointless.

The following is an example of a quiz which uses question alternates to ensure that students get randomly selected versions of questions 1 and 4. Also, since random-question-order: true is used, the actual position of the questions is randomized after the specific questions are selected from the alternates.

{quiz, id: "midterm", random-question-order: true}
?1 What's 2 + 2?

! 4

?1 What's 2.2 + 2.2?

! 4.4

?2 What's 3 + 3?

! 6

?3 What's 4 + 4?

! 8

{points: 2}
?4 What's 5 + 5?

! 10

?4 What's 6 + 6?

! 12

?5 What's 7 + 7?

! 14
{/quiz}

Note that the syntax for question alternates is very strict. Every question must have a number, and these numbers must be in ascending order (except for the alternates, which have the same number as each other).
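The selection step described above can be sketched in Python. This is an illustration only: the function name `select_questions` and the `(number, text)` input shape are hypothetical, and the sketch assumes the numbering has already been validated.

```python
import random

def select_questions(questions, rng, shuffle=False):
    """Pick one question per number; questions are (number, text) pairs in document order."""
    alternates = {}
    for number, text in questions:
        alternates.setdefault(number, []).append(text)
    # One question is drawn from the alternates for each question number.
    selected = [rng.choice(alternates[number]) for number in sorted(alternates)]
    if shuffle:
        rng.shuffle(selected)  # corresponds to random-question-order: true
    return selected
```

Passing a seeded `random.Random` instance makes the selection reproducible, which can help when debugging a quiz.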

Question alternates can also be grouped by a choose-questions attribute attached to the first question alternate. In this case, the Markua Processor must choose the specified number of questions, m, from the n alternates with that number (that is, n choose m). Note that in this scenario, the numbering after the alternates increases by m: for example, if a quiz starts with a group of alternates marked with choose-questions: 3, the next question is numbered 4, not 2. This ensures that the person constructing the quiz knows what they are doing, and saves them from having to keep track in a scenario where there are multiple questions with a choose-questions attribute.

{quiz, id: "midterm", random-question-order: true}

{choose-questions: 3}
?1 What's 2 + 2?

! 4

?1 What's 2.2 + 2.2?

! 4.4

?1 What's 3 + 3?

! 6

?1 What's 4 + 4?

! 8

?1 What's 5 + 5?

! 10

{points: 2}
?4 What's 6 + 6?

! 12

?5 What's 7 + 7?

! 14
{/quiz}

A Markua Processor must treat any error in the numbering of question alternates (and the questions which follow) as an error, and not generate the quiz if there is any error. This is preferable to a Markua Processor trying to guess at what the author meant, and trying to do the right thing. Fixing a syntax error takes a couple minutes of editing and a few minutes to publish the book or course again. However, fixing the consequences of a quiz being administered to hundreds–or thousands, or tens of thousands–of people with an incorrect number of questions, or with questions incorrectly used as alternates for each other, would be much more difficult.
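A strict numbering check of this kind could look like the following Python sketch. The function name `validate_numbering` and the `(number, choose)` input shape are hypothetical. The sketch also assumes that a choose-questions value must be at least 1 and no larger than the number of alternates, which follows from "n choose m" but is not spelled out.

```python
def validate_numbering(entries):
    """Validate question numbers and return how many questions a student is given.

    entries are (number, choose) pairs in document order; choose is the
    choose-questions value on that question, or None when it is absent.
    """
    expected = 1
    i = 0
    while i < len(entries):
        number, choose = entries[i]
        if number != expected:
            raise ValueError(f"expected question ?{expected}, found ?{number}")
        # Collect the run of alternates that share this number.
        j = i + 1
        while j < len(entries) and entries[j][0] == number:
            if entries[j][1] is not None:
                raise ValueError(f"choose-questions must be on the first ?{number}")
            j += 1
        picked = 1 if choose is None else choose
        if not 1 <= picked <= j - i:
            raise ValueError(f"cannot choose {picked} of {j - i} alternates for ?{number}")
        expected = number + picked  # numbering advances by the number chosen
        i = j
    return expected - 1
```

Both example quizzes above pass this check and give each student 5 questions; any gap or repeat in the numbering raises an error instead of being guessed at.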

8Inlines

Inlines are parsed sequentially from the beginning of the character stream to the end (left to right, in left-to-right languages). Thus, for example, in

Example 359
`hi`lo`
<p><code>hi</code>lo`</p>

hi is parsed as code, leaving the backtick at the end as a literal backtick.

8.1Character substitutions (M)

Markua documents can be written in UTF-8, so to produce any Unicode character, it is possible to just use the proper Unicode characters. However, in certain cases, it’s desirable for Markua to specify automatic replacement of certain combinations of characters with a Unicode replacement. If a Markua Processor encounters one of these combinations of characters outside of a code block, the Markua Processor must replace the combination of characters with the appropriate Unicode character in the output.

-- : To produce an em dash (—), what is thought of by non-typography people as a “dash” or a “long dash”, you can just type two hyphens (--) directly after a non-space character. You can also use the proper Unicode character, U+2014, of course. The following all produce em dashes: foo--bar, foo-- bar, foo--.

 -- : To produce a space followed by an en dash (–), or the kind of dash that’s wider than a hyphen but narrower than an em dash, you can just type a space, followed by two hyphens ( --). You can also use the proper Unicode character, U+2013, of course. The following both produce en dashes preceded by spaces: foo -- bar, foo --. (With foo -- bar, there’s a space before and after the en dash; with foo --, there’s no space after it, e.g. at the end of a paragraph.)

... : To produce a horizontal ellipsis (…), you can just type three periods (...). You can also use the proper Unicode character, U+2026, of course.
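The three substitutions can be sketched in Python as follows. This is a minimal illustration, not a complete implementation: the function name `substitute_characters` is hypothetical, the input is assumed to be text that contains no code blocks or code spans (a real Markua Processor must leave those untouched), and runs of three or more hyphens are not given any special treatment because the rules above do not cover them.

```python
import re

EN_DASH = "\u2013"
EM_DASH = "\u2014"
ELLIPSIS = "\u2026"

def substitute_characters(text: str) -> str:
    """Apply the dash and ellipsis substitutions to text known to contain no code."""
    # A space followed by two hyphens becomes a space followed by an en dash.
    text = text.replace(" --", " " + EN_DASH)
    # Two hyphens directly after a non-space character become an em dash.
    text = re.sub(r"(?<=\S)--", EM_DASH, text)
    # Three periods become a horizontal ellipsis.
    text = text.replace("...", ELLIPSIS)
    return text
```

The en dash rule runs first so that the hyphens in `foo -- bar` are consumed before the em dash rule is tried.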

8.1.1Optional Automatic Curly Quotes Outside of Code Blocks and Spans

A Markua Processor may replace the " and ' characters with the appropriate “curly quotes” at its discretion. This lets "typography" become “typography”, and it's become it’s as appropriate.

Note that this is an optional behaviour: a Markua Processor may support this fully, only in some output formats, or not at all.

Also, note that it is NEVER acceptable for a Markua Processor to do this, or any character substitution, to text inside a code block or code span. In almost all instances this would completely break the code. (If you wonder how I got curly quotes into the code spans for “typography” and it’s above, it’s because I pasted them into the manuscript that way. Just as a Markua Processor doesn’t make straight quotes curly in a code span, it doesn’t make curly quotes straight in a code span either.)

8.2Footnotes and endnotes (M)

Books often have footnotes and endnotes. So, Markua has them too.

8.2.1Footnotes

To add a footnote, you insert a footnote tag using square brackets, a caret and the tag, like this:

This has a footnote[^thenote].

Then, you define the footnote later in the document, using the same square brackets, caret and tag, followed by a colon, a space and the footnote definition:

[^thenote]: This is the footnote content.

If you wish to write multiple paragraphs in the footnote, you must indent the subsequent paragraphs by four spaces or one tab:

This has a footnote[^thenote].

Here is some unrelated text.

[^thenote]: This is the first paragraph of footnote content.

    This is the second paragraph of footnote content.

Here is some more unrelated text.

Whether the numbering of footnotes restarts every chapter is something that can be specified by the restart-footnote-numbering setting.

Markua does not specify how footnotes are output in HTML. A Markua Processor should output them somewhere, but the details are not specified. This is deliberate, in order to maximize implementation flexibility for Markua Processors.

8.2.2Endnotes

Sometimes endnotes are used instead of footnotes, but other times, these are in addition to footnotes. So, it makes sense for Markua to define separate syntaxes for both, rather than just defining one “footnote or endnote” syntax and letting the author pick whether the notes are footnotes or endnotes via a global setting.

To add an endnote, you insert an endnote tag using square brackets, two carets and the tag, like this:

This has an endnote[^^thenote].

Endnotes are like footnotes, but happier (^^).

Then, you define the endnote later in the document, using the same square brackets, two carets and tag, followed by a colon, a space and the endnote definition:

[^^thenote]: This is the endnote content.

Just as with footnotes, if you wish to write multiple paragraphs in an endnote, you must indent the subsequent paragraphs by four spaces or one tab.

Whether the numbering of endnotes restarts every chapter is something that can be specified by the restart-endnote-numbering setting.

Markua does not specify how endnotes are output in HTML. A Markua Processor should output them somewhere, but the details are not specified. This is deliberate, in order to maximize implementation flexibility for Markua Processors.

8.2.3Single reference to footnotes and endnotes

You can only refer to a footnote or endnote once. You can’t define a footnote or endnote in one place and refer to it multiple times in the same Markua document. If you wish to refer to a parenthetical piece of text from multiple places in a Markua document, the best approach is to put it in a section (or sub-section, sub-sub-section, etc.) or aside and refer to it from multiple places using a crosslink.
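A check for the single-reference rule might look like this Python sketch. The function name `repeated_note_references` is hypothetical, and the sketch makes simplifying assumptions: tags contain no `]` or `^` characters, and references inside code blocks or code spans are not excluded.

```python
import re
from collections import Counter

# A definition starts a line: [^tag]: or [^^tag]: followed by the content.
DEFINITION = re.compile(r"^\[\^{1,2}[^\]^]+\]: ")
# A reference is [^tag] (footnote) or [^^tag] (endnote) anywhere else.
REFERENCE = re.compile(r"\[(\^{1,2})([^\]^]+)\]")

def repeated_note_references(document: str) -> list[str]:
    """Return the footnote and endnote tags that are referenced more than once."""
    counts = Counter()
    for line in document.splitlines():
        definition = DEFINITION.match(line)
        if definition:
            line = line[definition.end():]  # skip the definition's own tag
        for carets, tag in REFERENCE.findall(line):
            counts[carets + tag] += 1
    return sorted(name for name, count in counts.items() if count > 1)
```

Footnotes and endnotes are counted separately, so `[^a]` and `[^^a]` in the same document do not conflict with each other.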

8.2.4Footnote and endnote support required for paragraphs only

A Markua Processor must support footnote and endnote references inserted in normal paragraph content. That is the only requirement.

Sometimes authors want to get creative with their footnotes and endnotes. Sometimes they want to add them in headings, or in footnotes or endnotes themselves. This latter style has been used on rare occasions, most notably by David Foster Wallace.

However, supporting footnotes and endnotes in places other than normal paragraph content greatly increases the burden on implementors of Markua Processors. As such, there is no requirement for a Markua Processor to support inserting a footnote or endnote anywhere other than in normal paragraph content.

Authors should not assume that a particular Markua Processor supports inserting a footnote or endnote anywhere other than in normal paragraph content unless its documentation specifically states that it does. For example, Leanpub only supports inserting footnotes or endnotes in normal paragraph content.

8.3Code spans

A backtick string is a string of one or more backtick characters (`) that is neither preceded nor followed by a backtick.

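A code span begins with a backtick string and ends with a backtick string of equal length. The contents of the code span are the characters between these two backtick strings, normalized in the following ways:

  1. First, line endings are converted to spaces.
  2. If the resulting string both begins and ends with a space character, but does not consist entirely of space characters, a single space character is removed from the front and back. This allows you to include code that begins or ends with backtick characters, which must be separated by whitespace from the opening or closing backtick strings.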

This is a simple code span:

Example 360
`foo`
<p><code>foo</code></p>

Here two backticks are used, because the code contains a backtick. This example also illustrates stripping of a single leading and trailing space:

Example 361
`` foo ` bar ``
<p><code>foo ` bar</code></p>

This example shows the motivation for stripping leading and trailing spaces:

Example 362
` `` `
<p><code>``</code></p>

Note that only one space is stripped:

Example 363
`  ``  `
<p><code> `` </code></p>

The stripping only happens if the space is on both sides of the string:

Example 364
` a`
<p><code> a</code></p>

Only spaces, and not unicode whitespace in general, are stripped in this way:

Example 365
` b `
<p><code> b </code></p>

No stripping occurs if the code span contains only spaces:

Example 366
` `
`  `
<p><code> </code>
<code>  </code></p>

Line endings are treated like spaces:

Example 367
``
foo
bar  
baz
``
<p><code>foo bar   baz</code></p>
Example 368
``
foo 
``
<p><code>foo </code></p>

Interior spaces are not collapsed:

Example 369
`foo   bar 
baz`
<p><code>foo   bar  baz</code></p>
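The behaviour shown in the examples above amounts to two steps: line endings become spaces, and one space is removed from each end when both ends have one and the content is not made up entirely of spaces. A minimal Python sketch, with the hypothetical function name `normalize_code_span`, applied to the raw text between the two backtick strings:

```python
def normalize_code_span(content: str) -> str:
    """Normalize the raw text found between the opening and closing backtick strings."""
    # Line endings are treated like spaces.
    content = content.replace("\r\n", " ").replace("\n", " ").replace("\r", " ")
    # Strip exactly one space from each end, but only if both ends have one
    # and the content is not made up entirely of spaces.
    if (len(content) >= 2 and content[0] == " " and content[-1] == " "
            and content.strip(" ") != ""):
        content = content[1:-1]
    return content
```

Interior spaces are left alone, so it is up to the output stage (for example, CSS) to keep them visible.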

Note that browsers will typically collapse consecutive spaces when rendering <code> elements, so it is recommended that the following CSS be used:

code{white-space: pre-wrap;}

Note that backslash escapes do not work in code spans. All backslashes are treated literally:

Example 370
`foo\`bar`
<p><code>foo\</code>bar`</p>

Backslash escapes are never needed, because one can always choose a string of n backtick characters as delimiters, where the code does not contain any strings of exactly n backtick characters.

Example 371
``foo`bar``
<p><code>foo`bar</code></p>
Example 372
` foo `` bar `
<p><code>foo `` bar</code></p>
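Going the other way, a tool that generates Markua can pick a delimiter length that does not occur in the code. The following Python sketch shows one way to do that; the function name `wrap_code_span` is hypothetical, and the sketch assumes the code contains no line endings.

```python
import re

def wrap_code_span(code: str) -> str:
    """Wrap code in the shortest backtick delimiters that cannot end the span early."""
    if code == "":
        raise ValueError("an empty code span cannot be written")
    # Lengths of every backtick string that occurs inside the code.
    run_lengths = {len(run) for run in re.findall(r"`+", code)}
    n = 1
    while n in run_lengths:
        n += 1
    fence = "`" * n
    # Pad with one space when the code touches a delimiter with a backtick, or
    # when stripping would otherwise remove spaces that belong to the code.
    needs_padding = (
        code.startswith("`")
        or code.endswith("`")
        or (code.startswith(" ") and code.endswith(" ") and code.strip(" ") != "")
    )
    if needs_padding:
        code = f" {code} "
    return f"{fence}{code}{fence}"
```

The padding relies on the rule that a single leading and trailing space is stripped from a code span, so the original code is what ends up in the output.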

Code span backticks have higher precedence than any other inline constructs except HTML tags and autolinks. Thus, for example, this is not parsed as emphasized text, since the second * is part of a code span:

Example 373
*foo`*`
<p>*foo<code>*</code></p>

And this is not parsed as a link:

Example 374
[not a `link](/foo`)
<p>[not a <code>link](/foo</code>)</p>

Code spans, HTML tags, and autolinks have the same precedence. Thus, this is code:

Example 375
`<a href="`">`
<p><code>&lt;a href=&quot;</code>&quot;&gt;`</p>

But this is an HTML tag:

Example 376
<a href="`">`
<p><a href="`">`</p>

And this is code:

Example 377
`<http://foo.bar.`baz>`
<p><code>&lt;http://foo.bar.</code>baz&gt;`</p>

But this is an autolink:

Example 378
<http://foo.bar.`baz>`
<p><a href="http://foo.bar.%60baz">http://foo.bar.`baz</a>`</p>

When a backtick string is not closed by a matching backtick string, we just have literal backticks:

Example 379
```foo``
<p>```foo``</p>
Example 380
`foo
<p>`foo</p>

The following case also illustrates the need for opening and closing backtick strings to be equal in length:

Example 381
`foo``bar``
<p>`foo<code>bar</code></p>
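The matching of opening and closing backtick strings can be sketched in Python as a left-to-right scan. This is an illustration only: the function name `find_code_spans` is hypothetical, it returns the raw, not yet normalized contents of each code span, and it ignores HTML tags and autolinks, which have the same precedence as code spans.

```python
import re

def find_code_spans(text: str) -> list[str]:
    """Return the raw contents of each code span found in one paragraph."""
    # Every maximal run of backticks is a backtick string.
    runs = [(m.start(), m.end()) for m in re.finditer(r"`+", text)]
    spans = []
    k = 0
    while k < len(runs):
        start, end = runs[k]
        length = end - start
        # The closer is the next backtick string of exactly the same length.
        closer = next(
            (j for j in range(k + 1, len(runs)) if runs[j][1] - runs[j][0] == length),
            None,
        )
        if closer is None:
            k += 1  # no match: these backticks are literal
        else:
            spans.append(text[end:runs[closer][0]])
            k = closer + 1  # continue scanning after the closing backtick string
    return spans
```

Because the scan never revisits earlier input, a backslash before a backtick has no effect on the result.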

8.4Emphasis and strong emphasis

John Gruber’s original Markdown syntax description says:

Markdown treats asterisks (*) and underscores (_) as indicators of emphasis. Text wrapped with one * or _ will be wrapped with an HTML <em> tag; double *’s or _’s will be wrapped with an HTML <strong> tag.

This is enough for most users, but these rules leave much undecided, especially when it comes to nested emphasis. The original Markdown.pl test suite makes it clear that triple *** and ___ delimiters can be used for strong emphasis, and most implementations have also allowed the following patterns:

***strong emph***
***strong** in emph*
***emph* in strong**
**in strong *emph***
*in emph **strong***

The following patterns are less widely supported, but the intent is clear and they are useful (especially in contexts like bibliography entries):

*emph *with emph* in it*
**strong **with strong** in it**

Many implementations have also restricted intraword emphasis to the * forms, to avoid unwanted emphasis in words containing internal underscores. (It is best practice to put these in code spans, but users often do not.)

internal emphasis: foo*bar*baz
no emphasis: foo_bar_baz

The rules given below capture all of these patterns, while allowing for efficient parsing strategies that do not backtrack.

First, some definitions. A delimiter run is either a sequence of one or more * characters that is not preceded or followed by a non-backslash-escaped * character, or a sequence of one or more _ characters that is not preceded or followed by a non-backslash-escaped _ character.

A left-flanking delimiter run is a delimiter run that is (1) not followed by Unicode whitespace, and either (2a) not followed by a punctuation character, or (2b) followed by a punctuation character and preceded by Unicode whitespace or a punctuation character. For purposes of this definition, the beginning and the end of the line count as Unicode whitespace.

A right-flanking delimiter run is a delimiter run that is (1) not preceded by Unicode whitespace, and either (2a) not preceded by a punctuation character, or (2b) preceded by a punctuation character and followed by Unicode whitespace or a punctuation character. For purposes of this definition, the beginning and the end of the line count as Unicode whitespace.

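Here are some examples of delimiter runs.

Left-flanking but not right-flanking: ***abc, _abc, **"abc", _"abc"

Right-flanking but not left-flanking: abc***, abc_, "abc"**, "abc"_

Both left- and right-flanking: abc***def, "abc"_"def"

Neither left- nor right-flanking: abc *** def, a _ b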

(The idea of distinguishing left-flanking and right-flanking delimiter runs based on the character before and the character after comes from Roopesh Chander’s vfmd. vfmd uses the terminology “emphasis indicator string” instead of “delimiter run,” and its rules for distinguishing left- and right-flanking runs are a bit more complex than the ones given here.)
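The two flanking definitions can be sketched in Python. The names `flanking` and `classify_run` are hypothetical, and the helpers approximate definitions given elsewhere in the spec: Unicode whitespace is taken to be tab, line feed, form feed, carriage return and the Zs category, and a punctuation character is taken to be ASCII punctuation or any character in a Unicode P category.

```python
import string
import unicodedata

def is_whitespace(ch):
    """None stands for the beginning or end of the line, which counts as whitespace."""
    return ch is None or ch in "\t\n\f\r" or unicodedata.category(ch) == "Zs"

def is_punctuation(ch):
    return ch is not None and (
        ch in string.punctuation or unicodedata.category(ch).startswith("P")
    )

def flanking(before, after):
    """Return (left_flanking, right_flanking) for a run between two characters."""
    left = not is_whitespace(after) and (
        not is_punctuation(after) or is_whitespace(before) or is_punctuation(before)
    )
    right = not is_whitespace(before) and (
        not is_punctuation(before) or is_whitespace(after) or is_punctuation(after)
    )
    return left, right

def classify_run(line, start, end):
    """Classify the delimiter run occupying line[start:end]."""
    before = line[start - 1] if start > 0 else None
    after = line[end] if end < len(line) else None
    return flanking(before, after)
```

For instance, the run in `abc***def` is both left- and right-flanking, while the run in `abc *** def` is neither.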

The following rules define emphasis and strong emphasis:

  1. A single * character can open emphasis iff (if and only if) it is part of a left-flanking delimiter run.

  2. A single _ character can open emphasis iff it is part of a left-flanking delimiter run and either (a) not part of a right-flanking delimiter run or (b) part of a right-flanking delimiter run preceded by punctuation.

  3. A single * character can close emphasis iff it is part of a right-flanking delimiter run.

  4. A single _ character can close emphasis iff it is part of a right-flanking delimiter run and either (a) not part of a left-flanking delimiter run or (b) part of a left-flanking delimiter run followed by punctuation.

  5. A double ** can open strong emphasis iff it is part of a left-flanking delimiter run.

  6. A double __ can open strong emphasis iff it is part of a left-flanking delimiter run and either (a) not part of a right-flanking delimiter run or (b) part of a right-flanking delimiter run preceded by punctuation.

  7. A double ** can close strong emphasis iff it is part of a right-flanking delimiter run.

  8. A double __ can close strong emphasis iff it is part of a right-flanking delimiter run and either (a) not part of a left-flanking delimiter run or (b) part of a left-flanking delimiter run followed by punctuation.

  9. Emphasis begins with a delimiter that can open emphasis and ends with a delimiter that can close emphasis, and that uses the same character (_ or *) as the opening delimiter. The opening and closing delimiters must belong to separate delimiter runs. If one of the delimiters can both open and close emphasis, then the sum of the lengths of the delimiter runs containing the opening and closing delimiters must not be a multiple of 3 unless both lengths are multiples of 3.

  10. Strong emphasis begins with a delimiter that can open strong emphasis and ends with a delimiter that can close strong emphasis, and that uses the same character (_ or *) as the opening delimiter. The opening and closing delimiters must belong to separate delimiter runs. If one of the delimiters can both open and close strong emphasis, then the sum of the lengths of the delimiter runs containing the opening and closing delimiters must not be a multiple of 3 unless both lengths are multiples of 3.

  11. A literal * character cannot occur at the beginning or end of *-delimited emphasis or **-delimited strong emphasis, unless it is backslash-escaped.

  12. A literal _ character cannot occur at the beginning or end of _-delimited emphasis or __-delimited strong emphasis, unless it is backslash-escaped.
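Rules 1 to 8 reduce to two small predicates once a delimiter run has been classified as left- or right-flanking. The following Python sketch uses the hypothetical names `can_open` and `can_close` and takes the flanking results as booleans. The same test applies to single and double delimiters, since rules 5 to 8 mirror rules 1 to 4.

```python
def can_open(char: str, left: bool, right: bool, preceded_by_punctuation: bool) -> bool:
    """Rules 1, 2, 5 and 6: may a run of * or _ open (strong) emphasis?"""
    if char == "*":
        return left
    # An underscore run must not sit inside a word, unless punctuation precedes it.
    return left and (not right or preceded_by_punctuation)

def can_close(char: str, left: bool, right: bool, followed_by_punctuation: bool) -> bool:
    """Rules 3, 4, 7 and 8: may a run of * or _ close (strong) emphasis?"""
    if char == "*":
        return right
    # An underscore run must not sit inside a word, unless punctuation follows it.
    return right and (not left or followed_by_punctuation)
```

The extra condition on `_` is what forbids intraword emphasis such as foo_bar_ while still allowing foo*bar*.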

Where rules 1–12 above are compatible with multiple parsings, the following principles resolve ambiguity:

  1. The number of nestings should be minimized. Thus, for example, an interpretation <strong>...</strong> is always preferred to <em><em>...</em></em>.

  2. An interpretation <em><strong>...</strong></em> is always preferred to <strong><em>...</em></strong>.

  3. When two potential emphasis or strong emphasis spans overlap, so that the second begins before the first ends and ends after the first ends, the first takes precedence. Thus, for example, *foo _bar* baz_ is parsed as <em>foo _bar</em> baz_ rather than *foo <em>bar* baz</em>.

  4. When there are two potential emphasis or strong emphasis spans with the same closing delimiter, the shorter one (the one that opens later) takes precedence. Thus, for example, **foo **bar baz** is parsed as **foo <strong>bar baz</strong> rather than <strong>foo **bar baz</strong>.

  5. Inline code spans, links, images, and HTML tags group more tightly than emphasis. So, when there is a choice between an interpretation that contains one of these elements and one that does not, the former always wins. Thus, for example, *[foo*](bar) is parsed as *<a href="bar">foo*</a> rather than as <em>[foo</em>](bar).

These rules can be illustrated through a series of examples.

Rule 1:

Example 382
*foo bar*
<p><em>foo bar</em></p>

This is not emphasis, because the opening * is followed by whitespace, and hence not part of a left-flanking delimiter run:

Example 383
a * foo bar*
<p>a * foo bar*</p>

This is not emphasis, because the opening * is preceded by an alphanumeric and followed by punctuation, and hence not part of a left-flanking delimiter run:

Example 384
a*"foo"*
<p>a*&quot;foo&quot;*</p>

Unicode nonbreaking spaces count as whitespace, too:

Example 385
* a *
<p>* a *</p>

Intraword emphasis with * is permitted:

Example 386
foo*bar*
<p>foo<em>bar</em></p>
Example 387
5*6*78
<p>5<em>6</em>78</p>

Rule 2:

Example 388
_foo bar_
<p><em>foo bar</em></p>

This is not emphasis, because the opening _ is followed by whitespace:

Example 389
_ foo bar_
<p>_ foo bar_</p>

This is not emphasis, because the opening _ is preceded by an alphanumeric and followed by punctuation:

Example 390
a_"foo"_
<p>a_&quot;foo&quot;_</p>

Emphasis with _ is not allowed inside words:

Example 391
foo_bar_
<p>foo_bar_</p>
Example 392
5_6_78
<p>5_6_78</p>
Example 393
пристаням_стремятся_
<p>пристаням_стремятся_</p>

Here _ does not generate emphasis, because the first delimiter run is right-flanking and the second left-flanking:

Example 394
aa_"bb"_cc
<p>aa_&quot;bb&quot;_cc</p>

This is emphasis, even though the opening delimiter is both left- and right-flanking, because it is preceded by punctuation:

Example 395
foo-_(bar)_
<p>foo-<em>(bar)</em></p>

Rule 3:

This is not emphasis, because the closing delimiter does not match the opening delimiter:

Example 396
_foo*
<p>_foo*</p>

This is not emphasis, because the closing * is preceded by whitespace:

Example 397
*foo bar *
<p>*foo bar *</p>

A newline also counts as whitespace:

Example 398
*foo bar
*
<p>*foo bar
*</p>

This is not emphasis, because the second * is preceded by punctuation and followed by an alphanumeric (hence it is not part of a right-flanking delimiter run):

Example 399
*(*foo)
<p>*(*foo)</p>

The point of this restriction is more easily appreciated with this example:

Example 400
*(*foo*)*
<p><em>(<em>foo</em>)</em></p>

Intraword emphasis with * is allowed:

Example 401
*foo*bar
<p><em>foo</em>bar</p>

Rule 4:

This is not emphasis, because the closing _ is preceded by whitespace:

Example 402
_foo bar _
<p>_foo bar _</p>

This is not emphasis, because the second _ is preceded by punctuation and followed by an alphanumeric:

Example 403
_(_foo)
<p>_(_foo)</p>

This is emphasis within emphasis:

Example 404
_(_foo_)_
<p><em>(<em>foo</em>)</em></p>

Intraword emphasis is disallowed for _:

Example 405
_foo_bar
<p>_foo_bar</p>
Example 406
_пристаням_стремятся
<p>_пристаням_стремятся</p>
Example 407
_foo_bar_baz_
<p><em>foo_bar_baz</em></p>

This is emphasis, even though the closing delimiter is both left- and right-flanking, because it is followed by punctuation:

Example 408
_(bar)_.
<p><em>(bar)</em>.</p>

Rule 5:

Example 409
**foo bar**
<p><strong>foo bar</strong></p>

This is not strong emphasis, because the opening delimiter is followed by whitespace:

Example 410
** foo bar**
<p>** foo bar**</p>

This is not strong emphasis, because the opening ** is preceded by an alphanumeric and followed by punctuation, and hence not part of a left-flanking delimiter run:

Example 411
a**"foo"**
<p>a**&quot;foo&quot;**</p>

Intraword strong emphasis with ** is permitted:

Example 412
foo**bar**
<p>foo<strong>bar</strong></p>

Rule 6:

Example 413
__foo bar__
<p><strong>foo bar</strong></p>

This is not strong emphasis, because the opening delimiter is followed by whitespace:

Example 414
__ foo bar__
<p>__ foo bar__</p>

A newline counts as whitespace:

Example 415
__
foo bar__
<p>__
foo bar__</p>

This is not strong emphasis, because the opening __ is preceded by an alphanumeric and followed by punctuation:

Example 416
a__"foo"__
<p>a__&quot;foo&quot;__</p>

Intraword strong emphasis is forbidden with __:

Example 417
foo__bar__
<p>foo__bar__</p>
Example 418
5__6__78
<p>5__6__78</p>
Example 419
пристаням__стремятся__
<p>пристаням__стремятся__</p>
Example 420
__foo, __bar__, baz__
<p><strong>foo, <strong>bar</strong>, baz</strong></p>

This is strong emphasis, even though the opening delimiter is both left- and right-flanking, because it is preceded by punctuation:

Example 421
foo-__(bar)__
<p>foo-<strong>(bar)</strong></p>

Rule 7:

This is not strong emphasis, because the closing delimiter is preceded by whitespace:

Example 422
**foo bar **
<p>**foo bar **</p>

(Nor can it be interpreted as an emphasized *foo bar *, because of Rule 11.)

This is not strong emphasis, because the second ** is preceded by punctuation and followed by an alphanumeric:

Example 423
**(**foo)
<p>**(**foo)</p>

The point of this restriction is more easily appreciated with these examples:

Example 424
*(**foo**)*
<p><em>(<strong>foo</strong>)</em></p>
Example 425
**Gomphocarpus (*Gomphocarpus physocarpus*, syn.
*Asclepias physocarpa*)**
<p><strong>Gomphocarpus (<em>Gomphocarpus physocarpus</em>, syn.
<em>Asclepias physocarpa</em>)</strong></p>
Example 426
**foo "*bar*" foo**
<p><strong>foo &quot;<em>bar</em>&quot; foo</strong></p>

Intraword emphasis:

Example 427
**foo**bar
<p><strong>foo</strong>bar</p>

Rule 8:

This is not strong emphasis, because the closing delimiter is preceded by whitespace:

Example 428
__foo bar __
<p>__foo bar __</p>

This is not strong emphasis, because the second __ is preceded by punctuation and followed by an alphanumeric:

Example 429
__(__foo)
<p>__(__foo)</p>

The point of this restriction is more easily appreciated with this example:

Example 430
_(__foo__)_
<p><em>(<strong>foo</strong>)</em></p>

Intraword strong emphasis is forbidden with __:

Example 431
__foo__bar
<p>__foo__bar</p>
Example 432
__пристаням__стремятся
<p>__пристаням__стремятся</p>
Example 433
__foo__bar__baz__
<p><strong>foo__bar__baz</strong></p>

This is strong emphasis, even though the closing delimiter is both left- and right-flanking, because it is followed by punctuation:

Example 434
__(bar)__.
<p><strong>(bar)</strong>.</p>

Rule 9:

Any nonempty sequence of inline elements can be the contents of an emphasized span.

Example 435
*foo [bar](/url)*
<p><em>foo <a href="/url">bar</a></em></p>
Example 436
*foo
bar*
<p><em>foo
bar</em></p>

In particular, emphasis and strong emphasis can be nested inside emphasis:

Example 437
_foo __bar__ baz_
<p><em>foo <strong>bar</strong> baz</em></p>
Example 438
_foo _bar_ baz_
<p><em>foo <em>bar</em> baz</em></p>
Example 439
__foo_ bar_
<p><em><em>foo</em> bar</em></p>
Example 440
*foo *bar**
<p><em>foo <em>bar</em></em></p>
Example 441
*foo **bar** baz*
<p><em>foo <strong>bar</strong> baz</em></p>
Example 442
*foo**bar**baz*
<p><em>foo<strong>bar</strong>baz</em></p>

Note that in the preceding case, the interpretation

<p><em>foo</em><em>bar<em></em>baz</em></p>

is precluded by the condition that a delimiter that can both open and close (like the * after foo) cannot form emphasis if the sum of the lengths of the delimiter runs containing the opening and closing delimiters is a multiple of 3 unless both lengths are multiples of 3.

For the same reason, we don’t get two consecutive emphasis sections in this example:

Example 443
*foo**bar*
<p><em>foo**bar</em></p>

The same condition ensures that the following cases are all strong emphasis nested inside emphasis, even when the interior spaces are omitted:

Example 444
***foo** bar*
<p><em><strong>foo</strong> bar</em></p>
Example 445
*foo **bar***
<p><em>foo <strong>bar</strong></em></p>
Example 446
*foo**bar***
<p><em>foo<strong>bar</strong></em></p>

When the lengths of the interior closing and opening delimiter runs are both multiples of 3, though, they can match to create emphasis:

Example 447
foo***bar***baz
<p>foo<em><strong>bar</strong></em>baz</p>
Example 448
foo******bar*********baz
<p>foo<strong><strong><strong>bar</strong></strong></strong>***baz</p>
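The "multiple of 3" condition used in the examples above can be written as a small predicate. This is a sketch under stated assumptions: the function name and its parameters are hypothetical, and the caller is assumed to already know the lengths of the two delimiter runs and whether either run can both open and close emphasis.

```python
def pairing_forbidden(opener_len: int, closer_len: int,
                      opener_can_close: bool, closer_can_open: bool) -> bool:
    """True if the "multiple of 3" condition stops these two runs from matching."""
    if not (opener_can_close or closer_can_open):
        # The condition only applies when one of the runs can both open and close.
        return False
    if (opener_len + closer_len) % 3 != 0:
        return False
    # A sum that is a multiple of 3 is only acceptable if both lengths are.
    return not (opener_len % 3 == 0 and closer_len % 3 == 0)
```

In `*foo**bar**baz*`, the leading `*` (length 1) and the `**` after foo (length 2, able to open and close) sum to 3, so they may not pair. In `foo***bar***baz`, both runs have length 3, so the pairing is allowed.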

Indefinite levels of nesting are possible:

Example 449
*foo **bar *baz* bim** bop*
<p><em>foo <strong>bar <em>baz</em> bim</strong> bop</em></p>
Example 450
*foo [*bar*](/url)*
<p><em>foo <a href="/url"><em>bar</em></a></em></p>

There can be no empty emphasis or strong emphasis:

Example 451
** is not an empty emphasis
<p>** is not an empty emphasis</p>
Example 452
**** is not an empty strong emphasis
<p>**** is not an empty strong emphasis</p>

Rule 10:

Any nonempty sequence of inline elements can be the contents of a strongly emphasized span.

Example 453
**foo [bar](/url)**
<p><strong>foo <a href="/url">bar</a></strong></p>
Example 454
**foo
bar**
<p><strong>foo
bar</strong></p>

In particular, emphasis and strong emphasis can be nested inside strong emphasis:

Example 455
__foo _bar_ baz__
<p><strong>foo <em>bar</em> baz</strong></p>
Example 456
__foo __bar__ baz__
<p><strong>foo <strong>bar</strong> baz</strong></p>
Example 457
____foo__ bar__
<p><strong><strong>foo</strong> bar</strong></p>
Example 458
**foo **bar****
<p><strong>foo <strong>bar</strong></strong></p>
Example 459
**foo *bar* baz**
<p><strong>foo <em>bar</em> baz</strong></p>
Example 460
**foo*bar*baz**
<p><strong>foo<em>bar</em>baz</strong></p>
Example 461
***foo* bar**
<p><strong><em>foo</em> bar</strong></p>
Example 462
**foo *bar***
<p><strong>foo <em>bar</em></strong></p>

Indefinite levels of nesting are possible:

Example 463
**foo *bar **baz**
bim* bop**
<p><strong>foo <em>bar <strong>baz</strong>
bim</em> bop</strong></p>
Example 464
**foo [*bar*](/url)**
<p><strong>foo <a href="/url"><em>bar</em></a></strong></p>

There can be no empty emphasis or strong emphasis:

Example 465
__ is not an empty emphasis
<p>__ is not an empty emphasis</p>
Example 466
____ is not an empty strong emphasis
<p>____ is not an empty strong emphasis</p>

Rule 11:

Example 467
foo ***
<p>foo ***</p>
Example 468
foo *\**
<p>foo <em>*</em></p>
Example 469
foo *_*
<p>foo <em>_</em></p>
Example 470
foo *****
<p>foo *****</p>
Example 471
foo **\***
<p>foo <strong>*</strong></p>
Example 472
foo **_**
<p>foo <strong>_</strong></p>

Note that when delimiters do not match evenly, Rule 11 determines that the excess literal * characters will appear outside of the emphasis, rather than inside it:

Example 473
**foo*
<p>*<em>foo</em></p>
Example 474
*foo**
<p><em>foo</em>*</p>
Example 475
***foo**
<p>*<strong>foo</strong></p>
Example 476
****foo*
<p>***<em>foo</em></p>
Example 477
**foo***
<p><strong>foo</strong>*</p>
Example 478
*foo****
<p><em>foo</em>***</p>

Rule 12:

Example 479
foo ___
<p>foo ___</p>
Example 480
foo _\__
<p>foo <em>_</em></p>
Example 481
foo _*_
<p>foo <em>*</em></p>
Example 482
foo _____
<p>foo _____</p>
Example 483
foo __\___
<p>foo <strong>_</strong></p>
Example 484
foo __*__
<p>foo <strong>*</strong></p>
Example 485
__foo_
<p>_<em>foo</em></p>

Note that when delimiters do not match evenly, Rule 12 determines that the excess literal _ characters will appear outside of the emphasis, rather than inside it:

Example 486
_foo__
<p><em>foo</em>_</p>
Example 487
___foo__
<p>_<strong>foo</strong></p>
Example 488
____foo_
<p>___<em>foo</em></p>
Example 489
__foo___
<p><strong>foo</strong>_</p>
Example 490
_foo____
<p><em>foo</em>___</p>

Rule 13 implies that if you want emphasis nested directly inside emphasis, you must use different delimiters:

Example 491
**foo**
<p><strong>foo</strong></p>
Example 492
*_foo_*
<p><em><em>foo</em></em></p>
Example 493
__foo__
<p><strong>foo</strong></p>
Example 494
_*foo*_
<p><em><em>foo</em></em></p>

However, strong emphasis within strong emphasis is possible without switching delimiters:

Example 495
****foo****
<p><strong><strong>foo</strong></strong></p>
Example 496
____foo____
<p><strong><strong>foo</strong></strong></p>

Rule 13 can be applied to arbitrarily long sequences of delimiters:

Example 497
******foo******
<p><strong><strong><strong>foo</strong></strong></strong></p>

Rule 14:

Example 498
***foo***
<p><em><strong>foo</strong></em></p>
Example 499
_____foo_____
<p><em><strong><strong>foo</strong></strong></em></p>

Rule 15:

Example 500
*foo _bar* baz_
<p><em>foo _bar</em> baz_</p>
Example 501
*foo __bar *baz bim__ bam*
<p><em>foo <strong>bar *baz bim</strong> bam</em></p>

Rule 16:

Example 502
**foo **bar baz**
<p>**foo <strong>bar baz</strong></p>
Example 503
*foo *bar baz*
<p>*foo <em>bar baz</em></p>

Rule 17:

Example 504
*[bar*](/url)
<p>*<a href="/url">bar*</a></p>
Example 505
_foo [bar_](/url)
<p>_foo <a href="/url">bar_</a></p>
Example 506
*<img src="foo" title="*"/>
<p>*<img src="foo" title="*"/></p>
Example 507
**<a href="**">
<p>**<a href="**"></p>
Example 508
__<a href="__">
<p>__<a href="__"></p>
Example 509
*a `*`*
<p><em>a <code>*</code></em></p>
Example 510
_a `_`_
<p><em>a <code>_</code></em></p>
Example 511
**a<http://foo.bar/?q=**>
<p>**a<a href="http://foo.bar/?q=**">http://foo.bar/?q=**</a></p>
Example 512
__a<http://foo.bar/?q=__>
<p>__a<a href="http://foo.bar/?q=__">http://foo.bar/?q=__</a></p>

8.5Underline (M)

In Markdown as defined by John Gruber, and in CommonMark and GFM, *one asterisk* and _one underscore_ both produce italics, and there is no way to produce an underline in Markdown except using inline HTML.

This is unfortunate, since underline is not always just a typewriter version of italics. In some languages and in some contexts, underlining serves a distinct, legitimate purpose.

Worse, Markua bans all inline HTML except HTML comments. So even the gross Markdown workaround to produce underlines with HTML does not work.

So, how to produce underline?

In Markua, *one asterisk* produces italics, and _one underscore_ can produce either italics or an underline based on the italicize-underlines global setting.

The global setting italicize-underlines can be true or false. The default is true, so that Markua functions the same way as Markdown by default.

This will be a bit surprising for new authors who are discovering Markua and have never heard of Markdown, such as people writing novels in Markua. However, it is my expectation that for the foreseeable future, the proportion of authors who discover Markdown first and then learn about Markua will be far greater. For these people, it would be a lot more surprising if all their text which they had italicized like _this_ suddenly became underlined instead!

Furthermore, if you’ve written in a certain way (like _this_ for italic) for years, your fingers essentially just do the right thing. I want this to still feel right for these people, not something that makes them think they need to go change a global setting. Almost nobody has spent years writing in Markua, so this is much less of an issue for them!

(Also, frankly, most novels just need italic, not underline, so this isn’t a big issue. While underline does have legitimate uses, they are more niche than italic.)

So, by default, italicize-underlines is true:

# Chapter One

stuff

foo _bar_ baz

lorem *ipsum* dolor

stuff
<h1>Chapter One</h1>
<p>stuff</p>
<p>foo <em>bar</em> baz</p>
<p>lorem <em>ipsum</em> dolor</p>
<p>stuff</p>

This can be made explicit by setting the italicize-underlines global setting to true:

{
italicize-underlines: true
}

# Chapter One

stuff

foo _bar_ baz

lorem *ipsum* dolor

stuff
<h1>Chapter One</h1>
<p>stuff</p>
<p>foo <em>bar</em> baz</p>
<p>lorem <em>ipsum</em> dolor</p>
<p>stuff</p>

Set the italicize-underlines global setting to false to produce underline:

{
italicize-underlines: false
}

# Chapter One

stuff

foo _bar_ baz

lorem *ipsum* dolor

stuff
<h1>Chapter One</h1>
<p>stuff</p>
<p>foo <u>bar</u> baz</p>
<p>lorem <em>ipsum</em> dolor</p>
<p>stuff</p>
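The effect of the italicize-underlines setting on single delimiters can be sketched as follows. This is a deliberately simplified illustration using regular expressions, not the full emphasis algorithm described in the rules above; the function name and the `italicize_underlines` parameter are hypothetical stand-ins for the global setting.

```python
import re

# Simplified patterns: one delimiter on each side, no whitespace just inside.
ASTERISK = re.compile(r"\*([^*\s](?:[^*]*[^*\s])?)\*")
UNDERSCORE = re.compile(r"(?<!\w)_([^_\s](?:[^_]*[^_\s])?)_(?!\w)")


def render_single_delimiters(text: str, italicize_underlines: bool = True) -> str:
    # Asterisks always produce italics; underscores depend on the setting.
    tag = "em" if italicize_underlines else "u"
    text = ASTERISK.sub(r"<em>\1</em>", text)
    return UNDERSCORE.sub(rf"<{tag}>\1</{tag}>", text)
```

With the default, `foo _bar_ baz` becomes `foo <em>bar</em> baz`. Passing `italicize_underlines=False` yields `foo <u>bar</u> baz`, while asterisks are unaffected.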

8.6Superscript and subscript (M)

To produce superscript like the 3 in 5³ = 125, surround it with carets like 5^3^ = 125.

Superscript: 5^3^ = 125
<p>Superscript: 5<sup>3</sup> = 125</p>

To produce subscript like the 2 in H₂O, surround it with single tildes like H~2~O.

Subscript: H~2~O
<p>Subscript: H<sub>2</sub>O</p>
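A minimal sketch of the two conversions, again using regular expressions. The function name is hypothetical, and the sketch assumes that the superscript or subscript text contains no whitespace, which this section does not state. The subscript pattern requires single tildes so that doubled tildes are left alone.

```python
import re

SUPERSCRIPT = re.compile(r"\^([^\^\s]+)\^")
# Single tildes only: a tilde next to another tilde is not a subscript marker.
SUBSCRIPT = re.compile(r"(?<!~)~([^~\s]+)~(?!~)")


def render_scripts(text: str) -> str:
    text = SUPERSCRIPT.sub(r"<sup>\1</sup>", text)
    return SUBSCRIPT.sub(r"<sub>\1</sub>", text)
```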

8.7Strikethrough (GFM)

Markua enables the strikethrough extension, where an additional emphasis type is available.

Strikethrough text is any text wrapped in two tildes (~).

Example 518
~~Hi~~ Hello, world!
<p><del>Hi</del> Hello, world!</p>

As with regular emphasis delimiters, a new paragraph will cause strikethrough parsing to cease:

Example 519
This ~~has a

new paragraph~~.
<p>This ~~has a</p>
<p>new paragraph~~.</p>
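The paragraph boundary behaviour can be illustrated with a short sketch that processes each paragraph separately. The function name is hypothetical and the pattern is a simplification that ignores interactions with other inline syntax.

```python
import re

# Two tildes on each side, with no whitespace directly inside the delimiters.
STRIKE = re.compile(r"~~(?=\S)(.+?)(?<=\S)~~", re.S)


def render_strikethrough(source: str) -> str:
    # Split on blank lines so that a match can never span two paragraphs.
    paragraphs = re.split(r"\n[ \t]*\n", source)
    return "\n\n".join(STRIKE.sub(r"<del>\1</del>", p) for p in paragraphs)
```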

A link contains link text (the visible text), a link destination (the URI that is the link destination), and optionally a link title. There are two basic kinds of links in Markdown. In inline links the destination and title are given immediately after the link text. In reference links the destination and title are defined elsewhere in the document.

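A link text consists of a sequence of zero or more inline elements enclosed by square brackets ([ and ]). The following rules apply:

  1. Links may not contain other links, at any level of nesting. If multiple otherwise valid link definitions appear nested inside each other, the inner-most definition is used.
  2. Brackets are allowed in the link text only if (a) they are backslash-escaped or (b) they appear as a matched pair of brackets, with an open bracket [, a sequence of zero or more inlines, and a close bracket ].
  3. Backtick code spans, autolinks, and raw HTML tags bind more tightly than the brackets in link text. Thus, for example, [foo`]` could not be a link text, since the second ] is part of a code span.
  4. The brackets in link text bind more tightly than markers for emphasis and strong emphasis. Thus, for example, *[foo*](url) is a link.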

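A link destination consists of either

  1. a sequence of zero or more characters between an opening < and a closing > that contains no line breaks or unescaped < or > characters, or
  2. a nonempty sequence of characters that does not start with <, does not include ASCII space or control characters, and includes parentheses only if (a) they are backslash-escaped or (b) they are part of a balanced pair of unescaped parentheses.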

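A link title consists of either

  1. a sequence of zero or more characters between straight double-quote characters ("), including a " character only if it is backslash-escaped, or
  2. a sequence of zero or more characters between straight single-quote characters ('), including a ' character only if it is backslash-escaped, or
  3. a sequence of zero or more characters between matching parentheses ((...)), including a ( or ) character only if it is backslash-escaped.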

Although link titles may span multiple lines, they may not contain a blank line.

An inline link consists of a link text followed immediately by a left parenthesis (, optional whitespace, an optional link destination, an optional link title separated from the link destination by whitespace, optional whitespace, and a right parenthesis ). The link’s text consists of the inlines contained in the link text (excluding the enclosing square brackets). The link’s URI consists of the link destination, excluding enclosing <...> if present, with backslash-escapes in effect as described above. The link’s title consists of the link title, excluding its enclosing delimiters, with backslash-escapes in effect as described above.

Here is a simple inline link:

Example 520
[link](/uri "title")
<p><a href="/uri" title="title">link</a></p>

The title, the link text and even the destination may be omitted:

Example 521
[link](/uri)
<p><a href="/uri">link</a></p>
Example 522
[](./target.md)
<p><a href="./target.md"></a></p>
Example 523
[link]()
<p><a href="">link</a></p>
Example 524
[link](<>)
<p><a href="">link</a></p>
Example 525
[]()
<p><a href=""></a></p>

The destination can only contain spaces if it is enclosed in pointy brackets:

Example 526
[link](/my uri)
<p>[link](/my uri)</p>
Example 527
[link](</my uri>)
<p><a href="/my%20uri">link</a></p>

The destination cannot contain line breaks, even if enclosed in pointy brackets:

Example 528
[link](foo
bar)
<p>[link](foo
bar)</p>
Example 529
[link](<foo
bar>)
<p>[link](<foo
bar>)</p>

The destination can contain ) if it is enclosed in pointy brackets:

Example 530
[a](<b)c>)
<p><a href="b)c">a</a></p>

Pointy brackets that enclose links must be unescaped:

Example 531
[link](<foo\>)
<p>[link](&lt;foo&gt;)</p>

These are not links, because the opening pointy bracket is not matched properly:

Example 532
[a](<b)c
[a](<b)c>
[a](<b>c)
<p>[a](&lt;b)c
[a](&lt;b)c&gt;
[a](<b>c)</p>

Parentheses inside the link destination may be escaped:

Example 533
[link](\(foo\))
<p><a href="(foo)">link</a></p>

Any number of parentheses are allowed without escaping, as long as they are balanced:

Example 534
[link](foo(and(bar)))
<p><a href="foo(and(bar))">link</a></p>

However, if you have unbalanced parentheses, you need to escape or use the <...> form:

Example 535
[link](foo(and(bar))
<p>[link](foo(and(bar))</p>
Example 536
[link](foo\(and\(bar\))
<p><a href="foo(and(bar)">link</a></p>
Example 537
[link](<foo(and(bar)>)
<p><a href="foo(and(bar)">link</a></p>
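The destination rules shown in the examples above (pointy brackets, backslash escapes and balanced parentheses) can be sketched as a small scanner. This is an illustrative sketch only: the function names are hypothetical, link titles are not handled, and the returned destination is the unescaped string before any URL-escaping a renderer might apply.

```python
import re
import string

ESCAPE = re.compile(r"\\([" + re.escape(string.punctuation) + r"])")


def is_escape(text: str, i: int) -> bool:
    return text[i] == "\\" and i + 1 < len(text) and text[i + 1] in string.punctuation


def skip_ws(text: str, i: int) -> int:
    while i < len(text) and text[i] in " \t\n\r\f\v":
        i += 1
    return i


def scan_destination(text: str, pos: int):
    """Return (raw_destination, next_pos), or None if no destination starts at pos."""
    if text.startswith("<", pos):
        i = pos + 1
        while i < len(text):
            if is_escape(text, i):
                i += 2
            elif text[i] in "<\n\r":
                return None
            elif text[i] == ">":
                return text[pos + 1:i], i + 1
            else:
                i += 1
        return None
    depth, i = 0, pos
    while i < len(text):
        ch = text[i]
        if is_escape(text, i):
            i += 2
            continue
        if ch == " " or ord(ch) < 0x20 or ch == "\x7f":
            break  # ASCII space or control character ends the destination
        if ch == "(":
            depth += 1
        elif ch == ")":
            if depth == 0:
                break  # this parenthesis closes the inline link itself
            depth -= 1
        i += 1
    if i == pos or depth != 0:
        return None
    return text[pos:i], i


def inline_destination(tail: str):
    """Destination of an inline link tail such as "(/uri)", or None if not a link."""
    if not tail.startswith("("):
        return None
    i = skip_ws(tail, 1)
    if tail.startswith(")", i):
        return ""  # the destination may be omitted
    scanned = scan_destination(tail, i)
    if scanned is None:
        return None
    raw, i = scanned
    if not tail.startswith(")", skip_ws(tail, i)):
        return None
    return ESCAPE.sub(r"\1", raw)
```

The scanner tracks nesting depth rather than counting parentheses up front, so the first unmatched `)` is treated as the end of the inline link.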

Parentheses and other symbols can also be escaped, as usual in Markdown:

Example 538
[link](foo\)\:)
<p><a href="foo):">link</a></p>

A link can contain fragment identifiers and queries:

Example 539
[link](#fragment)

[link](http://example.com#fragment)

[link](http://example.com?foo=3#frag)
<p><a href="#fragment">link</a></p>
<p><a href="http://example.com#fragment">link</a></p>
<p><a href="http://example.com?foo=3#frag">link</a></p>

Note that a backslash before a non-escapable character is just a backslash:

Example 540
[link](foo\bar)
<p><a href="foo%5Cbar">link</a></p>

URL-escaping should be left alone inside the destination, as all URL-escaped characters are also valid URL characters. Entity and numerical character references in the destination will be parsed into the corresponding Unicode code points, as usual. These may be optionally URL-escaped when written as HTML, but this spec does not enforce any particular policy for rendering URLs in HTML or other formats. Renderers may make different decisions about how to escape or normalize URLs in the output.

Example 541
[link](foo%20b&auml;)
<p><a href="foo%20b%C3%A4">link</a></p>

Note that, because titles can often be parsed as destinations, if you try to omit the destination and keep the title, you’ll get unexpected results:

Example 542
[link]("title")
<p><a href="%22title%22">link</a></p>

Titles may be in single quotes, double quotes, or parentheses:

Example 543
[link](/url "title")
[link](/url 'title')
[link](/url (title))
<p><a href="/url" title="title">link</a>
<a href="/url" title="title">link</a>
<a href="/url" title="title">link</a></p>

Backslash escapes and entity and numeric character references may be used in titles:

Example 544
[link](/url "title \"&quot;")
<p><a href="/url" title="title &quot;&quot;">link</a></p>

Titles must be separated from the link destination by whitespace. Other Unicode whitespace, like a non-breaking space, doesn’t work.

Example 545
[link](/url "title")
<p><a href="/url%C2%A0%22title%22">link</a></p>

Nested balanced quotes are not allowed without escaping:

Example 546
[link](/url "title "and" title")
<p>[link](/url &quot;title &quot;and&quot; title&quot;)</p>

But it is easy to work around this by using a different quote type:

Example 547
[link](/url 'title "and" title')
<p><a href="/url" title="title &quot;and&quot; title">link</a></p>

(Note: Markdown.pl did allow double quotes inside a double-quoted title, and its test suite included a test demonstrating this. But it is hard to see a good rationale for the extra complexity this brings, since there are already many ways—backslash escaping, entity and numeric character references, or using a different quote type for the enclosing title—to write titles containing double quotes. Markdown.pl’s handling of titles has a number of other strange features. For example, it allows single-quoted titles in inline links, but not reference links. And, in reference links but not inline links, it allows a title to begin with " and end with ). Markdown.pl 1.0.1 even allows titles with no closing quotation mark, though 1.0.2b8 does not. It seems preferable to adopt a simple, rational rule that works the same way in inline links and link reference definitions.)

Whitespace is allowed around the destination and title:

Example 548
[link](   /uri
  "title"  )
<p><a href="/uri" title="title">link</a></p>

But it is not allowed between the link text and the following parenthesis:

Example 549
[link] (/uri)
<p>[link] (/uri)</p>

The link text may contain balanced brackets, but not unbalanced ones, unless they are escaped:

Example 550
[link [foo [bar]]](/uri)
<p><a href="/uri">link [foo [bar]]</a></p>
Example 551
[link] bar](/uri)
<p>[link] bar](/uri)</p>
Example 552
[link [bar](/uri)
<p>[link <a href="/uri">bar</a></p>
Example 553
[link \[bar](/uri)
<p><a href="/uri">link [bar</a></p>

The link text may contain inline content:

Example 554
[link *foo **bar** `#`*](/uri)
<p><a href="/uri">link <em>foo <strong>bar</strong> <code>#</code></em></a></p>
Example 555
[![moon](moon.jpg)](/uri)
<p><a href="/uri"><img src="moon.jpg" alt="moon" /></a></p>

However, links may not contain other links, at any level of nesting.

Example 556
[foo [bar](/uri)](/uri)
<p>[foo <a href="/uri">bar</a>](/uri)</p>
Example 557
[foo *[bar [baz](/uri)](/uri)*](/uri)
<p>[foo <em>[bar <a href="/uri">baz</a>](/uri)</em>](/uri)</p>
Example 558
![[[foo](uri1)](uri2)](uri3)
<p><img src="uri3" alt="[foo](uri2)" /></p>

These cases illustrate the precedence of link text grouping over emphasis grouping:

Example 559
*[foo*](/uri)
<p>*<a href="/uri">foo*</a></p>
Example 560
[foo *bar](baz*)
<p><a href="baz*">foo *bar</a></p>

Note that brackets that aren’t part of links do not take precedence:

Example 561
*foo [bar* baz]
<p><em>foo [bar</em> baz]</p>

These cases illustrate the precedence of HTML tags, code spans, and autolinks over link grouping:

Example 562
[foo <bar attr="](baz)">
<p>[foo <bar attr="](baz)"></p>
Example 563
[foo`](/uri)`
<p>[foo<code>](/uri)</code></p>
Example 564
[foo<http://example.com/?search=](uri)>
<p>[foo<a href="http://example.com/?search=%5D(uri)">http://example.com/?search=](uri)</a></p>

There are three kinds of reference links: full, collapsed, and shortcut.

A full reference link consists of a link text immediately followed by a link label that matches a link reference definition elsewhere in the document.

A link label begins with a left bracket ([) and ends with the first right bracket (]) that is not backslash-escaped. Between these brackets there must be at least one non-whitespace character. Unescaped square bracket characters are not allowed inside the opening and closing square brackets of link labels. A link label can have at most 999 characters inside the square brackets.

One label matches another just in case their normalized forms are equal. To normalize a label, strip off the opening and closing brackets, perform the Unicode case fold, strip leading and trailing whitespace and collapse consecutive internal whitespace to a single space. If there are multiple matching reference link definitions, the one that comes first in the document is used. (It is desirable in such cases to emit a warning.)

The link’s URI and title are provided by the matching link reference definition.

Here is a simple example:

Example 565
[foo][bar]

[bar]: /url "title"
<p><a href="/url" title="title">foo</a></p>

The rules for the link text are the same as with inline links. Thus:

The link text may contain balanced brackets, but not unbalanced ones, unless they are escaped:

Example 566
[link [foo [bar]]][ref]

[ref]: /uri
<p><a href="/uri">link [foo [bar]]</a></p>
Example 567
[link \[bar][ref]

[ref]: /uri
<p><a href="/uri">link [bar</a></p>

The link text may contain inline content:

Example 568
[link *foo **bar** `#`*][ref]

[ref]: /uri
<p><a href="/uri">link <em>foo <strong>bar</strong> <code>#</code></em></a></p>
Example 569
[![moon](moon.jpg)][ref]

[ref]: /uri
<p><a href="/uri"><img src="moon.jpg" alt="moon" /></a></p>

However, links may not contain other links, at any level of nesting.

Example 570
[foo [bar](/uri)][ref]

[ref]: /uri
<p>[foo <a href="/uri">bar</a>]<a href="/uri">ref</a></p>
Example 571
[foo *bar [baz][ref]*][ref]

[ref]: /uri
<p>[foo <em>bar <a href="/uri">baz</a></em>]<a href="/uri">ref</a></p>

(In the examples above, we have two shortcut reference links instead of one full reference link.)

The following cases illustrate the precedence of link text grouping over emphasis grouping:

Example 572
*[foo*][ref]

[ref]: /uri
<p>*<a href="/uri">foo*</a></p>
Example 573
[foo *bar][ref]*

[ref]: /uri
<p><a href="/uri">foo *bar</a>*</p>

These cases illustrate the precedence of HTML tags, code spans, and autolinks over link grouping:

Example 574
[foo <bar attr="][ref]">

[ref]: /uri
<p>[foo <bar attr="][ref]"></p>
Example 575
[foo`][ref]`

[ref]: /uri
<p>[foo<code>][ref]</code></p>
Example 576
[foo<http://example.com/?search=][ref]>

[ref]: /uri
<p>[foo<a href="http://example.com/?search=%5D%5Bref%5D">http://example.com/?search=][ref]</a></p>

Matching is case-insensitive:

Example 577
[foo][BaR]

[bar]: /url "title"
<p><a href="/url" title="title">foo</a></p>

Unicode case fold is used:

Example 578
[ẞ]

[SS]: /url
<p><a href="/url">ẞ</a></p>

Consecutive internal whitespace is treated as one space for purposes of determining matching:

Example 579
[Foo
  bar]: /url

[Baz][Foo bar]
<p><a href="/url">Baz</a></p>

No whitespace is allowed between the link text and the link label:

Example 580
[foo] [bar]

[bar]: /url "title"
<p>[foo] <a href="/url" title="title">bar</a></p>
Example 581
[foo]
[bar]

[bar]: /url "title"
<p>[foo]
<a href="/url" title="title">bar</a></p>

This is a departure from John Gruber’s original Markdown syntax description, which explicitly allows whitespace between the link text and the link label. It brings reference links in line with inline links, which (according to both original Markdown and this spec) cannot have whitespace after the link text. More importantly, it prevents inadvertent capture of consecutive shortcut reference links. If whitespace is allowed between the link text and the link label, then in the following we will have a single reference link, not two shortcut reference links, as intended:

[foo]
[bar]

[foo]: /url1
[bar]: /url2

(Note that shortcut reference links were introduced by Gruber himself in a beta version of Markdown.pl, but never included in the official syntax description. Without shortcut reference links, it is harmless to allow space between the link text and link label; but once shortcut references are introduced, it is too dangerous to allow this, as it frequently leads to unintended results.)

When there are multiple matching link reference definitions, the first is used:

Example 582
[foo]: /url1

[foo]: /url2

[bar][foo]
<p><a href="/url1">bar</a></p>

Note that matching is performed on normalized strings, not parsed inline content. So the following does not match, even though the labels define equivalent inline content:

Example 583
[bar][foo\!]

[foo!]: /url
<p>[bar][foo!]</p>
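The label matching rules illustrated by the preceding examples can be sketched as follows. The function names and the dictionary used to hold definitions are hypothetical; the sketch relies on Python's `str.casefold` for the Unicode case fold and on `str.split` to trim and collapse whitespace.

```python
def normalize_label(label: str) -> str:
    """Normalize a link label such as "[Foo  bar]" for matching."""
    inner = label[1:-1]  # strip the opening and closing brackets
    # Case fold, strip leading/trailing whitespace, collapse internal whitespace.
    return " ".join(inner.casefold().split())


def add_definition(definitions: dict, label: str, destination: str) -> None:
    # When several definitions match, the first one in the document wins.
    definitions.setdefault(normalize_label(label), destination)
```

Because normalization works on the raw label string, `[foo\!]` and `[foo!]` normalize differently even though they would render as the same inline content.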

Link labels cannot contain brackets, unless they are backslash-escaped:

Example 584
[foo][ref[]

[ref[]: /uri
<p>[foo][ref[]</p>
<p>[ref[]: /uri</p>
Example 585
[foo][ref[bar]]

[ref[bar]]: /uri
<p>[foo][ref[bar]]</p>
<p>[ref[bar]]: /uri</p>
Example 586
[[[foo]]]

[[[foo]]]: /url
<p>[[[foo]]]</p>
<p>[[[foo]]]: /url</p>
Example 587
[foo][ref\[]

[ref\[]: /uri
<p><a href="/uri">foo</a></p>

Note that in this example ] is not backslash-escaped:

Example 588
[bar\\]: /uri

[bar\\]
<p><a href="/uri">bar\</a></p>

A link label must contain at least one non-whitespace character:

Example 589
[]

[]: /uri
<p>[]</p>
<p>[]: /uri</p>
Example 590
[
 ]

[
 ]: /uri
<p>[
]</p>
<p>[
]: /uri</p>

A collapsed reference link consists of a link label that matches a link reference definition elsewhere in the document, followed by the string []. The contents of the first link label are parsed as inlines, which are used as the link’s text. The link’s URI and title are provided by the matching reference link definition. Thus, [foo][] is equivalent to [foo][foo].

Example 591
[foo][]

[foo]: /url "title"
<p><a href="/url" title="title">foo</a></p>
Example 592
[*foo* bar][]

[*foo* bar]: /url "title"
<p><a href="/url" title="title"><em>foo</em> bar</a></p>

The link labels are case-insensitive:

Example 593
[Foo][]

[foo]: /url "title"
<p><a href="/url" title="title">Foo</a></p>

As with full reference links, whitespace is not allowed between the two sets of brackets:

Example 594
[foo] 
[]

[foo]: /url "title"
<p><a href="/url" title="title">foo</a>
[]</p>

A shortcut reference link consists of a link label that matches a link reference definition elsewhere in the document and is not followed by [] or a link label. The contents of the first link label are parsed as inlines, which are used as the link’s text. The link’s URI and title are provided by the matching link reference definition. Thus, [foo] is equivalent to [foo][].

Example 595
[foo]

[foo]: /url "title"
<p><a href="/url" title="title">foo</a></p>
Example 596
[*foo* bar]

[*foo* bar]: /url "title"
<p><a href="/url" title="title"><em>foo</em> bar</a></p>
Example 597
[[*foo* bar]]

[*foo* bar]: /url "title"
<p>[<a href="/url" title="title"><em>foo</em> bar</a>]</p>
Example 598
[[bar [foo]

[foo]: /url
<p>[[bar <a href="/url">foo</a></p>

The link labels are case-insensitive:

Example 599
[Foo]

[foo]: /url "title"
<p><a href="/url" title="title">Foo</a></p>

A space after the link text should be preserved:

Example 600
[foo] bar

[foo]: /url
<p><a href="/url">foo</a> bar</p>

If you just want bracketed text, you can backslash-escape the opening bracket to avoid links:

Example 601
\[foo]

[foo]: /url "title"
<p>[foo]</p>

Note that this is a link, because a link label ends with the first following closing bracket:

Example 602
[foo*]: /url

*[foo*]
<p>*<a href="/url">foo*</a></p>

Full and collapsed references take precedence over shortcut references:

Example 603
[foo][bar]

[foo]: /url1
[bar]: /url2
<p><a href="/url2">foo</a></p>
Example 604
[foo][]

[foo]: /url1
<p><a href="/url1">foo</a></p>

Inline links also take precedence:

Example 605
[foo]()

[foo]: /url1
<p><a href="">foo</a></p>
Example 606
[foo](not a link)

[foo]: /url1
<p><a href="/url1">foo</a>(not a link)</p>

In the following case [bar][baz] is parsed as a reference, [foo] as normal text:

Example 607
[foo][bar][baz]

[baz]: /url
<p>[foo]<a href="/url">bar</a></p>

Here, though, [foo][bar] is parsed as a reference, since [bar] is defined:

Example 608
[foo][bar][baz]

[baz]: /url1
[bar]: /url2
<p><a href="/url2">foo</a><a href="/url1">baz</a></p>

Here [foo] is not parsed as a shortcut reference, because it is followed by a link label (even though [bar] is not defined):

Example 609
[foo][bar][baz]

[baz]: /url1
[foo]: /url2
<p>[foo]<a href="/url1">bar</a></p>

There are two parts to making a crosslink.

  1. Define an id.
  2. Reference that id with a crosslink.

8.9.1Defining an id

There are two ways to define an id:

  1. Using an id attribute {id: some-id}
  2. Using a shorter “syntactic sugar” approach: {#some-id}

The shorter “syntactic sugar” approach is usually preferred. However, it can look a bit odd in an attribute list with other attributes in it. So, when other attributes are present in an attribute list, the {id: some-id} syntax is preferred.

In terms of the value of an id, it has some special restrictions:

  1. The first character in the id has to be a lowercase or uppercase letter, i.e. [a-zA-Z] if you think in regular expressions.
  2. Each remaining character in the id has to be a lowercase or uppercase letter, a digit, a hyphen (-) or an underscore (_).
  3. You can only define an id value once in an entire Markua document, even one that is split over multiple files.

These restrictions ensure that your ids can then be linked to by a crosslink from anywhere in the Markua document.

Also, note that id attributes need to be defined on either a block or span element. Finally, if an id is defined with an invalid name, the Markua Processor must ignore it and log an error.
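The restrictions on id values can be checked with a short sketch like the one below. The function name, the `seen` set and the `errors` list are hypothetical. Reporting a duplicate id as an error is an assumption made for illustration, since the text above only states that an id may be defined once.

```python
import re

# A letter first, then letters, digits, hyphens or underscores.
ID_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_-]*")


def register_id(seen: set, candidate: str, errors: list) -> bool:
    """Record `candidate` if it is a valid, unused id; otherwise log an error."""
    if ID_PATTERN.fullmatch(candidate) is None:
        errors.append(f"invalid id ignored: {candidate!r}")
        return False
    if candidate in seen:
        # Assumption: a repeated id is reported rather than silently replaced.
        errors.append(f"duplicate id: {candidate!r}")
        return False
    seen.add(candidate)
    return True
```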

8.9.1.1Defining an id on a block element

To define an id on a block element like a paragraph, figure, heading or even a definition list item, you simply stick the id definition on a line above the start of the block element. Note that exactly one newline must separate the attribute list from the block element. If two newlines are used, the attribute list will be interpreted as a directive, and the id won’t be correctly applied.

Here’s how to use the attribute list syntax to define an id attribute:

{id: some-id}
This is a paragraph with the id of `some-id`.

Here’s how to use the shorter “syntactic sugar” approach to define an id attribute:

{#some-id}
This is a paragraph with the id of `some-id`.

8.9.1.2Defining an id on a span element

To define an id on a span element you simply add the id definition immediately after the span element.

Here’s how to use the attribute list syntax to define an id attribute on a span element:

The word Markua{id: markua} has an id.

Leanpub is based in **Victoria, BC, Canada**{id: victoria}.

Here’s how to use the shorter “syntactic sugar” approach to define an id attribute:

The word Markua{#markua} has an id.

Leanpub is based in **Victoria, BC, Canada**{#victoria}.

Here’s how to define an id on a custom span:

The [quick sly fox]{#quick_sly} jumped over the lazy dogs.

If you want to define an id on a span while also defining other attributes like index entries, the id: syntax must be used in a full attribute list:

The [quick sly fox]{id: quick_sly, i: "Fox, Sly and Quick"} jumped over the lazy
dogs.

Regardless of how you defined the id, you then link to it to create a crosslink. To do this, you use the # character and the id in a link:

[link text](#some-id)

This syntax is intended to be reminiscent of HTML anchor tags.

Note that order of definition and use does not matter: crosslinks will work regardless of whether the id is defined before or after the use of it.

{id: id1}
This is a paragraph with an id of `id1`.

{#id2}
This is a paragraph with an id of `id2`.

This link to [Markua](#markua) is before the id definition.

The word Markua{id: markua} has an id of `markua`.

This link to [Markua](#markua) is after the id definition.

Leanpub is based in **Victoria, BC, Canada**{#victoria}.

The [quick sly fox]{#quick_sly} jumped over the lazy dogs.
<p id="id1">This is a paragraph with an id of <code>id1</code>.</p>
<p id="id2">This is a paragraph with an id of <code>id2</code>.</p>
<p>This link to <a href="#markua">Markua</a> is before the id definition.</p>
<p>The word <span id="markua">Markua</span> has an id of <code>markua</code>.</p>
<p>This link to <a href="#markua">Markua</a> is after the id definition.</p>
<p>Leanpub is based in <strong id="victoria">Victoria, BC, Canada</strong>.</p>
<p>The <span id="quick_sly">quick sly fox</span> jumped over the lazy dogs.</p>
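Because definition order does not matter, one way for a processor to resolve crosslinks is with two passes: collect every id first, then check each link. The following Python is only a rough sketch of that idea. The regular expressions and the function name are illustrative assumptions, not part of Markua, and they only recognize an id that comes first in its attribute list.

```python
import re

# Hypothetical patterns covering only the forms shown above:
# {id: some-id} and {#some-id} definitions, and [text](#some-id) links.
ID_DEF = re.compile(r"\{(?:id:\s*|#)([A-Za-z0-9_-]+)")
CROSSLINK = re.compile(r"\[[^\]]*\]\(#([A-Za-z0-9_-]+)\)")


def unresolved_crosslinks(text):
    """Return crosslink targets that have no matching id definition."""
    # Pass 1: collect every id, wherever it appears in the manuscript.
    defined = set(ID_DEF.findall(text))
    # Pass 2: check each crosslink against the complete set of ids.
    return [target for target in CROSSLINK.findall(text) if target not in defined]
```

Since all ids are gathered before any link is checked, a link that appears before its id definition resolves just as well as one that appears after it.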

Chapters, sections and figures with titles often have two useful properties for writers:

  1. A name which is often short and useful to reference.
  2. A number, if numbering is turned on.

Whether the numbers exist is determined by the number-chapters, number-parts and number-figures attributes.

In your writing, it’s often desirable to refer to these chapters, sections and figures from elsewhere in your book. When creating such a reference, it’s helpful to be able to reference the title and number of the chapter, section or figure.

A> Adding a reference to the page number would also be helpful in print, but it is obviously problematic in an ebook, since certain ebook formats have resizable text and since the page displayed by PDF readers is often different than the printed page number, because of front matter Roman numerals. So, Markua doesn’t provide any special magical syntax to refer to a page number, in order to discourage it–if Markua did support a reference to a page number, then the next logical step would be to only do so in certain output formats, and then writing Markua text would feel like programming. Instead of this nightmare, a really smart Markua Processor could hypothetically (and with the author’s permission) automatically add page number references to these references when producing a print-ready PDF.

Here’s how these references to titles and numbers work:

So, for “Figure 8.2: Anatomy of a Squirrel”, these are:

Note that in this example, “Anatomy of a Squirrel” was typed by the author, whereas “Figure 8.2: ” was generated by the Markua Processor. It does not matter; both can be referenced.

Also, note that regardless of section level, sections referenced in #d or #f are all called “Section” (not “Sub-Section”, “Sub-Sub-Section”, etc.).

The expectation is that #f will be used by authors who don’t mind verbosity, and #t and #n will be used by authors who prefer control and brevity. The #d is for very lazy authors who like saving keystrokes and/or who don’t know whether their publisher will call the code samples “Listing”, “Example” or some other word and want to protect themselves against extra work.

The code-sample-names, figure-names and table-names settings control the words used to name things.

Examples:

This is discussed in [section #n, #t](#crosslinks).

This is discussed in [#f](#crosslinks).

This is discussed in a [#d](#crosslinks) above.

See [chapter #n](#span-elements), which is the best chapter in this book.

This is in figure [#n](#fancy-diagram), arguably the fanciest diagram in this
document.

The only way to reference the title and numbers is in crosslinks. There’s no syntax to do so without creating a crosslink–a crosslink is more helpful, since it is clickable, and adding another syntax simply to be less helpful to the reader is a bad idea. Markua has enough syntax as it is.

Note that resources, chapters and sections all have implicit numbering. So, #n always works even if numbering is off. However, you will confuse readers if you refer to numbering they cannot see. Also, if numbering is off, the #f must not include either the #d or #n parts: it will be “Anatomy of a Squirrel” not “Figure 8.2: Anatomy of a Squirrel”.
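As a rough illustration of how #f, #n and #t might be expanded inside link text, here is a minimal Python sketch. The function and parameter names are hypothetical, and the "Section 8.10" values in the usage note are made up for the example. The sketch deliberately leaves #d untouched, and it only models the rule that #f collapses to the bare title when numbering is off.

```python
import re


def expand_reference(link_text, kind, number, title, numbering_on=True):
    """Expand #f, #n and #t in crosslink text (sketch; #d is left as-is)."""
    # With numbering on, #f looks like "Figure 8.2: Anatomy of a Squirrel".
    # With numbering off, #f is just the title.
    full = f"{kind} {number}: {title}" if numbering_on else title
    values = {"#f": full, "#n": number, "#t": title}
    # Replace in a single pass so that substituted text is never re-expanded.
    return re.sub(r"#[fnt]", lambda m: values[m.group(0)], link_text)
```

For example, `expand_reference("section #n, #t", "Section", "8.10", "Crosslinks")` produces `section 8.10, Crosslinks`, and #n still expands when numbering is off, which is why the text above warns about confusing readers.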

8.11Images

Syntax for images is like the syntax for links, with one difference. Instead of link text, we have an image description. The rules for this are the same as for link text, except that (a) an image description starts with ![ rather than [, and (b) an image description may contain links. An image description has inline elements as its contents. When an image is rendered to HTML, this is standardly used as the image’s alt attribute.

Example 611
![foo](/url "title")
<p><img src="/url" alt="foo" title="title" /></p>
Example 612
![foo *bar*]

[foo *bar*]: train.jpg "train & tracks"
<p><img src="train.jpg" alt="foo bar" title="train &amp; tracks" /></p>
Example 613
![foo ![bar](/url)](/url2)
<p><img src="/url2" alt="foo bar" /></p>
Example 614
![foo [bar](/url)](/url2)
<p><img src="/url2" alt="foo bar" /></p>

Though this spec is concerned with parsing, not rendering, it is recommended that in rendering to HTML, only the plain string content of the image description be used. Note that in the above example, the alt attribute’s value is foo bar, not foo [bar](/url) or foo <a href="/url">bar</a>. Only the plain string content is rendered, without formatting.

Example 615
![foo *bar*][]

[foo *bar*]: train.jpg "train & tracks"
<p><img src="train.jpg" alt="foo bar" title="train &amp; tracks" /></p>
Example 616
![foo *bar*][foobar]

[FOOBAR]: train.jpg "train & tracks"
<p><img src="train.jpg" alt="foo bar" title="train &amp; tracks" /></p>
Example 617
![foo](train.jpg)
<p><img src="train.jpg" alt="foo" /></p>
Example 618
My ![foo bar](/path/to/train.jpg  "title"   )
<p>My <img src="/path/to/train.jpg" alt="foo bar" title="title" /></p>
Example 619
![foo](<url>)
<p><img src="url" alt="foo" /></p>
Example 620
![](/url)
<p><img src="/url" alt="" /></p>

Reference-style:

Example 621
![foo][bar]

[bar]: /url
<p><img src="/url" alt="foo" /></p>
Example 622
![foo][bar]

[BAR]: /url
<p><img src="/url" alt="foo" /></p>

Collapsed:

Example 623
![foo][]

[foo]: /url "title"
<p><img src="/url" alt="foo" title="title" /></p>
Example 624
![*foo* bar][]

[*foo* bar]: /url "title"
<p><img src="/url" alt="foo bar" title="title" /></p>

The labels are case-insensitive:

Example 625
![Foo][]

[foo]: /url "title"
<p><img src="/url" alt="Foo" title="title" /></p>

As with reference links, whitespace is not allowed between the two sets of brackets:

Example 626
![foo] 
[]

[foo]: /url "title"
<p><img src="/url" alt="foo" title="title" />
[]</p>

Shortcut:

Example 627
![foo]

[foo]: /url "title"
<p><img src="/url" alt="foo" title="title" /></p>
Example 628
![*foo* bar]

[*foo* bar]: /url "title"
<p><img src="/url" alt="foo bar" title="title" /></p>

Note that link labels cannot contain unescaped brackets:

Example 629
![[foo]]

[[foo]]: /url "title"
<p>![[foo]]</p>
<p>[[foo]]: /url &quot;title&quot;</p>

The link labels are case-insensitive:

Example 630
![Foo]

[foo]: /url "title"
<p><img src="/url" alt="Foo" title="title" /></p>

If you just want a literal ! followed by bracketed text, you can backslash-escape the opening [:

Example 631
!\[foo]

[foo]: /url "title"
<p>![foo]</p>

If you want a link after a literal !, backslash-escape the !:

Example 632
\![foo]

[foo]: /url "title"
<p>!<a href="/url" title="title">foo</a></p>

8.12Image resource attributes (M)

As shown in the Images section, the syntax to insert an image in Markua is identical to that used by CommonMark and GFM. (That section is unchanged from the CommonMark spec.) It is, however, also consistent with the syntax that is used for other resource types.

As discussed in the Resources section, Markua uses the image syntax for all resource types. Markua also reinterprets images as being a resource. This is important for the following reason:

Resources can have attribute lists.

Therefore, since images are resources in Markua, images can have attribute lists. These attributes are used to do basic image formatting.

Most formatting in Markua is semantic–things like bold and italic, superscript, etc. This formatting is part of the manuscript and should not be changed by any publisher. The formatting for images, however, is some of the most non-semantic formatting in Markua. It’s stuff which can absolutely be changed with no effect on the meaning of the Markua document. Specifically, images can have width, height, align or float attributes, or can even be fullbleed.

These settings are so universal when applied to images that it would be too dogmatic for Markua to insist that the only way to format images was to add a class element, and then style the class of the image elsewhere. Markua is pragmatic: by providing a few basic ways to format images, it enables authors to make in-progress books look good enough to actually publish in-progress.

The following are the supported attributes for image resources, in addition to the class, format, title and type attributes which all resources support:

align : The align can be left, right or middle. The default is middle. Combined with width and height, this provides basic image formatting. When align is specified, text goes above and below the image, but the image and its surrounding whitespace occupy the entire width of the page. If you want the text to actually wrap around the image, use float instead. In terms of the specific values of align, a Markua Processor must interpret left as “on the left side of the page”, right as “on the right side of the page” and middle as “in the middle of the content area of the page, respecting margins” in all cases. Finally, note that inside and outside are not supported for align.

alt : The alt is the alt text, to be displayed when the image cannot be shown. This is provided in the figure attribute list or in the square brackets before the image, for local and web images. (Inline images don’t have the square brackets, so for an inline image, the alt text can only be provided via an attribute.) If it is provided in both places, the one in the attribute list wins.

float : The float can be left, right, inside or outside. The left and right attribute values mean the same thing as they do with align. When float is specified, text flows around the image. It is an error to specify both align and float. Note that middle is not supported for float, since Markua is not a good choice for the types of advanced layouts which flow text on both sides of an image. For that, you should use something like InDesign; this is why Markua Processors such as Leanpub can export InDesign files. Also, note that float supports inside or outside, but align does not. What a Markua Processor does with inside and outside is more complex. A Markua Processor must interpret inside as “near the spine” and outside as “away from the spine” in a print book. In an ebook, however, a Markua Processor has some flexibility in terms of the meaning of inside and outside: it can either interpret inside as “left” or it can interpret inside as “left if the language is left-to-right, and right if the language is right-to-left”. Similarly, a Markua Processor can either interpret outside as “right” or it can interpret outside as “right if the language is left-to-right, and left if the language is right-to-left”.

fullbleed : true or false. The default is false. If true, the Markua Processor should ensure that the image is the full size of the page, with no margins on the page. If false, the width, height and align attributes are used instead and margins are respected. A Markua Processor should do something fancy and tasteful with the title attribute for a fullbleed image.

height : The height of the image, in percentage of page content area height (respecting margins). This is specified as a number (integer or float) between 1 and 100 followed by a percentage sign (%).

width : The width of the image, in percentage of page content area width (respecting margins). This is specified as a number (integer or float) between 1 and 100 followed by a percentage sign (%).

Note that if only one of width and height is specified, the Markua Processor should scale the image proportionally if possible (again, respecting margins). If both width and height are specified, the Markua Processor should scale the image accordingly, ignoring the aspect ratio. (So, it’s almost always a bad idea to specify both width and height.)
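To make the width and height rules concrete, here is a minimal Python sketch of how a processor might parse the percentages and compute the rendered size. The function names and the pixel-style units are assumptions made for illustration. The sketch also does not clamp a proportionally scaled image that would overflow the other dimension, which a real processor respecting margins would need to handle.

```python
import re

PERCENT = re.compile(r"^(\d+(?:\.\d+)?)%$")


def parse_percent(value):
    """Parse a value such as 50%, "100%" or '12.5%' and return it as a float."""
    # Quoted and unquoted values are both accepted in attribute lists.
    match = PERCENT.match(value.strip().strip("\"'"))
    if not match:
        raise ValueError(f"not a percentage: {value!r}")
    number = float(match.group(1))
    if not 1 <= number <= 100:
        raise ValueError(f"percentage out of range: {value!r}")
    return number


def scaled_size(img_w, img_h, area_w, area_h, width=None, height=None):
    """Return (w, h) for an image placed in a content area of area_w x area_h."""
    if width and height:
        # Both given: scale each dimension independently, ignoring aspect ratio.
        return area_w * parse_percent(width) / 100, area_h * parse_percent(height) / 100
    if width:
        # Only width given: derive the height from the image's aspect ratio.
        w = area_w * parse_percent(width) / 100
        return w, w * img_h / img_w
    if height:
        # Only height given: derive the width from the image's aspect ratio.
        h = area_h * parse_percent(height) / 100
        return h * img_w / img_h, h
    return img_w, img_h
```

With a 400 by 200 image in a 600 by 800 content area, `width="50%"` keeps the 2:1 shape, while `width="100%"` together with `height="30%"` stretches it, which is why specifying both is rarely what you want.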

![foo](pie.jpg)

{alt: "foo"}
![](pie.jpg)

{alt: "foo"}
![clobbered by alt foo](pie.jpg)
<p><img src="pie.jpg" alt="foo" /></p>
<p><img src="pie.jpg" alt="foo" /></p>
<p><img src="pie.jpg" alt="foo" /></p>
![foo](pie.jpg "bar")

{alt: "foo"}
![](pie.jpg "bar")

{alt: "foo"}
![clobbered by alt foo](pie.jpg "bar")
<p><img src="pie.jpg" alt="foo" title="bar" /></p>
<p><img src="pie.jpg" alt="foo" title="bar" /></p>
<p><img src="pie.jpg" alt="foo" title="bar" /></p>

Since the other attributes are all presentational, there is no formal specification of the HTML which must result. So, here are some examples of usage instead, involving a very real and very delicious apple pie:

{fullbleed: true}
![a half-eaten apple pie](pie.jpg "It Tasted Even Better")

{width: 100%}
![a half-eaten apple pie](pie.jpg "It Tasted Even Better")

{width: "100%", height: '30%'}
![a half-eaten apple pie](pie.jpg "It Tasted Even Better")

{width: 50%, float: left}
![a half-eaten apple pie](pie.jpg "It Tasted Even Better")

{width: 50%, align: middle}
![a half-eaten apple pie](pie.jpg "It Tasted Even Better")
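The placement rules for align and float described earlier in this section can be sketched in Python as follows. This is an illustration under stated assumptions, not a reference implementation: the function names, the dictionary representation of an attribute list and the `language_aware` flag are all hypothetical. The second function covers only the ebook case, where a processor may choose either of the two permitted readings of inside and outside.

```python
ALIGN_VALUES = {"left", "right", "middle"}
FLOAT_VALUES = {"left", "right", "inside", "outside"}


def check_placement(attrs):
    """Raise ValueError if the align/float attributes break the rules."""
    if "align" in attrs and "float" in attrs:
        raise ValueError("it is an error to specify both align and float")
    if "align" in attrs and attrs["align"] not in ALIGN_VALUES:
        raise ValueError(f"unsupported align value: {attrs['align']!r}")
    if "float" in attrs and attrs["float"] not in FLOAT_VALUES:
        raise ValueError(f"unsupported float value: {attrs['float']!r}")


def ebook_float_side(value, rtl=False, language_aware=True):
    """Map a float value to 'left' or 'right' for an ebook."""
    if value in ("left", "right"):
        return value
    # Simple reading: inside means left, outside means right.
    side = "left" if value == "inside" else "right"
    # Language-aware reading: flip the side for right-to-left languages.
    if language_aware and rtl:
        side = "right" if side == "left" else "left"
    return side
```

A print-ready PDF would need a different mapping, since there inside must mean "near the spine" and the spine side depends on the page.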

8.13Inline SVG images (M)

Local and web resource locations are supported for any type of image; inline resource locations are supported for SVG images only.

To add an inline SVG image, you create a fenced code block, and then you indicate that instead of code, you are inserting a resource of type image and of format SVG.

There are two ways to do this. Here’s the verbose way:

{type: image, format: svg}
```
<svg width="20" height="20">
  <circle cx="10" cy="10" r="9" fill="blue"/>
</svg>
```

Here’s the syntactic sugar way:

```!
<svg width="20" height="20">
  <circle cx="10" cy="10" r="9" fill="blue"/>
</svg>
```

The ! after the opening code fence means that what follows is an SVG image, and that it should be rendered as an image, not displayed as the XML source of the SVG image.

Inline SVG images support all the normal image resource attributes:

{alt: "a blue circle", title: "Earth From Space (Simplified)"}
```!
<svg width="20" height="20">
  <circle cx="10" cy="10" r="9" fill="blue"/>
</svg>
```

You can use them with the verbose way as well:

{type: image, format: svg, alt: "foo", title: "bar"}
```
<svg width="20" height="20">
  <circle cx="10" cy="10" r="9" fill="blue"/>
</svg>
```

Note that when you are writing about SVG and want to display the SVG text, what you are really doing is creating a code resource. This is discussed below.

8.13.1Writing about SVG

If you want to write about the SVG format, and show the actual SVG source (instead of the image produced), it needs to be of a type of code, not image.

Now, you can just be lazy and not provide format or type attributes at all, since guessing when neither is present always produces a type of code.

```
<svg width="20" height="20">
  <circle cx="10" cy="10" r="9" fill="blue"/>
</svg>
```

However, you can also just specify both, either this way…

{type: code}
```svg
<svg width="20" height="20">
  <circle cx="10" cy="10" r="9" fill="blue"/>
</svg>
```

…or this way:

{type: code, format: svg}
```
<svg width="20" height="20">
  <circle cx="10" cy="10" r="9" fill="blue"/>
</svg>
```

…or this way:


```svg
<svg width="20" height="20">
  <circle cx="10" cy="10" r="9" fill="blue"/>
</svg>
```

Or you can just specify xml, since SVG is XML:

```xml
<svg width="20" height="20">
  <circle cx="10" cy="10" r="9" fill="blue"/>
</svg>
```

Or you can just specify text, if you don’t want syntax highlighting:

```text
<svg width="20" height="20">
  <circle cx="10" cy="10" r="9" fill="blue"/>
</svg>
```

Heck, you can even use tildes to do that:

~~~
<svg width="20" height="20">
  <circle cx="10" cy="10" r="9" fill="blue"/>
</svg>
~~~
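The guessing behavior walked through in this section can be summarized with a small Python sketch. It assumes the processor has already split a fenced block into the text after the opening fence (`info`) and a dictionary for the attribute list (`attrs`); both names, and the function itself, are hypothetical.

```python
def classify_svg_block(info="", attrs=None):
    """Return (type, format) for a fenced block that holds SVG source."""
    attrs = attrs or {}
    if info == "!":
        # Syntactic sugar: render the block as an SVG image.
        return ("image", "svg")
    # The format comes from the fence if present, else from the attribute list.
    fmt = info or attrs.get("format")
    # An explicit type wins; with no type at all, the guess is code.
    return (attrs.get("type", "code"), fmt)
```

Under this sketch, only the `!` sugar or an explicit `{type: image, format: svg}` produces an image; every other combination shown above is output as code.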

8.14Image locations and embedding (M)

Note that regardless of the image location, a Markua Processor can handle images in the following ways:

  1. Embed the image in the output format (such as a PDF or EPUB file)
  2. Place the image at some URL on the web, and load it from a URL in all cases

This is true regardless of the location of the source image (local, web or inline). The location of the image source has no effect on the location of the image in the output.

For example, a web image does not need to be loaded from the web every time; it can be inlined in the output by the Markua Processor.

This is also true for all of the types of images which are supported in Markua: GIF, PNG, JPEG, SVG and zipped SVG.

8.15Video resources (M)

The syntax to insert a video is the same compact and consistent syntax that is used for any resource. Local and web resource locations are supported for both video formats; inline resource locations for video are obviously not supported.

Unlike images where many images will accompany the manuscript itself, with video it’s expected that authors will be uploading their videos to sites such as YouTube first, and then reusing these videos in the contents of their Markua books. So, while video can be local or web video, web video will be much more prominent than web images in a Markua book.

Furthermore, unlike with web images where the format (and thus the type) are typically discoverable from the file extension in the URL, with web video it’s usually not. So, for web videos, there will typically be a {type: video} attribute list at a minimum. That said, if the type and format are not specified and the resource is a web resource, the Markua Processor may use the domain to decide what type of resource to assume. So YouTube videos may just work in some Markua Processors, like Leanpub, without specifying either the type or format.
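As an illustration of the fallback described above, a processor might guess the type of a web resource roughly as follows. This Python sketch is an assumption-laden example: the extension table, the domain list and the function name are hypothetical, and a real Markua Processor is free to use different rules.

```python
import posixpath
from urllib.parse import urlparse

# Hypothetical lookup tables; a real processor may recognize more.
EXTENSION_TYPES = {
    ".gif": "image", ".png": "image", ".jpg": "image", ".jpeg": "image",
    ".svg": "image", ".mp4": "video", ".webm": "video",
}
VIDEO_DOMAINS = {"youtube.com", "www.youtube.com"}


def guess_web_resource_type(url, attrs=None):
    """Guess the resource type for a web URL, or return None if unknown."""
    attrs = attrs or {}
    if "type" in attrs:
        # An explicit {type: ...} always wins over guessing.
        return attrs["type"]
    parsed = urlparse(url)
    extension = posixpath.splitext(parsed.path)[1].lower()
    if extension in EXTENSION_TYPES:
        return EXTENSION_TYPES[extension]
    if parsed.hostname in VIDEO_DOMAINS:
        # No usable extension: fall back to the domain.
        return "video"
    return None
```

A YouTube watch URL has no file extension in its path, so only the domain check lets it be treated as video without a `{type: video}` attribute list.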

The following types of videos are supported in Markua: MP4 and WebM.

We will discuss the supported and the default attributes for videos, and then show examples of videos being inserted for both local and web videos.

8.15.1Supported Attributes for Video

The following are the supported attributes for video resources, in addition to the class, format, title and type attributes which all resources support. Note that the class, height, title and width attributes apply to the poster image for the video–not to the video itself during playback.

align : The align can be left, right or middle. This applies to the poster image for the video, and works just like it does with images.

embed : true or false. If true, then you can actually embed the video file or reference it and play it. If false, then it’s from a site like YouTube which disallows this. In this case, this should function like a link to an external web page, but open an appropriate app (e.g. the YouTube app) instead of a browser. A Markua Processor can be smart about defaults, and parse the URL to set the value of the embed attribute.

float : The float can be left, right, inside or outside. This applies to the poster image for the video, and works just like it does with images. It is an error to specify both align and float.

height : This applies to the poster image for the video, and works just like it does with images.

poster : The poster is the URL or path to an image which should be shown instead of the video before the video is played. If a Markua Processor is outputting some format where it is known that video resources are not supported, it must choose the poster to use as a replacement for the video. Books are not just ebooks–books can also be printed on the fibers of trees that have been chopped down (“paper”), producing something called a “book”. These “books”, whether they are bound in a sturdy fashion (“hardcover books”) or a flimsy fashion (“paperback books”), have one thing in common with respect to embedded video: they do not support it. Finally, if a Markua Processor is embedding a web video from a web video service (e.g. YouTube) which has a known algorithm for the location of the poster images for its videos, the Markua Processor may choose to use that poster image if a poster attribute is not explicitly specified. However, if a poster attribute is specified, then it must be used by the Markua Processor.

poster-format : The format of the poster image, if it exists. If this attribute is not specified, the format of the poster is inferred from the poster image file extension. This attribute needs to exist only to support poster images which do not have file extensions.

transcript : The URL or path to a transcript of the video, which should be shown or dictated to people who cannot view video. This helps people with visual disabilities view course material, and helps with ADA compliance. For example, what Leanpub does with a transcript is as follows: a URL is linked to in the title (but in a way which does not affect the Table of Contents), and a path is assumed to be a path to a Markua file which is then used to produce a web page at a public URL, which then is also linked to in the title in a way which does not affect the Table of Contents. With the Markua files for transcripts, Leanpub only supports certain resource types to be included in the transcript files themselves. Specifically, images, video, audio are not supported, but code, poetry, math and tables are supported. When Leanpub generates a transcript, the URLs are publicly accessible but obscure. Identical transcripts should not make duplicate transcript files on the web, and using a new UUID every time would violate this. However, not changing the transcript URL when its content changed could lead to bad outcomes. So, the URL of a transcript on the web should be scoped to the book and affected both by its filename and its content. What Leanpub does is: (filename minus extension minus period plus hyphen plus the SHA hash).html. Including the filename eliminates collisions if filenames are unique but the hashes are not.

width : This applies to the poster image for the video, and works just like it does with images.
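The transcript naming scheme described under the transcript attribute can be sketched in a few lines of Python. The text only says "the SHA hash", so the use of SHA-256 here is an assumption, as is the function name.

```python
import hashlib
import os


def transcript_filename(path, content):
    """Build '<filename without extension>-<hash>.html' for a transcript.

    `content` is the transcript file's bytes. SHA-256 is assumed here;
    the spec text does not say which SHA variant is used.
    """
    stem = os.path.splitext(os.path.basename(path))[0]
    digest = hashlib.sha256(content).hexdigest()
    return f"{stem}-{digest}.html"
```

Identical transcripts map to the same name, so no duplicate files are created, while any change to the content produces a new name.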

In the following sections, please note that while the examples are shown with an HTML mapping, a Markua Processor has complete flexibility over how it handles the location of video resources and their display.

8.15.2Local Video

Here's a paragraph before the figure.

![](pie_eating_contest.webm "A Piece of Pie")

Here's a paragraph after the figure.
<p>Here's a paragraph before the figure.</p>
<div class="figure">
<video src="pie_eating_contest.webm"/>
<p class="title">A Piece of Pie</p>
</div>
<p>Here's a paragraph after the figure.</p>

8.15.3Web Video

Here's a paragraph before the figure.

{type: video, poster: http://img.youtube.com/vi/VOCYL-FNbr0/mqdefault.jpg}
![](https://www.youtube.com/watch?t=105&v=VOCYL-FNbr0 "Introducing Markua")

Here's a paragraph after the figure.
<p>Here's a paragraph before the figure.</p>
<div class="figure">
<video src="https://www.youtube.com/watch?t=105&v=VOCYL-FNbr0"
poster="http://img.youtube.com/vi/VOCYL-FNbr0/mqdefault.jpg"/>
<p class="title">Introducing Markua</p>
</div>
<p>Here's a paragraph after the figure.</p>

8.16Audio resources (M)

The syntax to insert an audio resource is the same compact and consistent syntax that is used for any resource. Local and web resource locations are supported for both audio formats; inline resource locations for audio are obviously not supported.

Just as with video, the audio support in ebooks and on the web is more varied than for images. With audio, there are MP3, AAC, Ogg and WAV formats all in widespread use, and there are a number of other formats with supporters. It’s entirely likely that many ebook readers won’t support any of them.

Unlike images where many images will accompany the manuscript itself, with audio it’s expected that authors will be uploading their audio files to various sites first, and then reusing these files in the contents of their Markua books. So, while audio can be local or web audio, web audio will be much more prominent than web images in a Markua book.

Furthermore, unlike with web images where the format (and thus the type) are typically discoverable from the file extension in the URL, with web audio it’s usually not. So, for web audio files, there will typically be a {type: audio} attribute list at a minimum. That said, if the type and format are not specified and the resource is a web resource, the Markua Processor may use the domain to decide what type of resource to assume. So YouTube audios may just work in some Markua Processors, like Leanpub, without specifying either the type or format.

The following types of audio resources are supported in Markua: MP3, AAC, WAV and Ogg Vorbis.

We will discuss the supported and the default attributes for audio files, and then show examples of audio being inserted for both local and web audio files.

8.16.1Supported Attributes for Audio

The following are the supported attributes for audio resources, in addition to the class, format, title and type attributes which all resources support.

transcript : The URL or path to a transcript of the audio, which should be shown to people who cannot hear audio. This helps people with auditory disabilities view course material, and helps with ADA compliance. Transcripts should be produced and handled in an identical way to video resources. For example, Leanpub does this. See the Video resources section above for more information.

In the following sections, please note that while the examples are shown with an HTML mapping, a Markua Processor has complete flexibility over how it handles the location of audio resources and their display.

8.16.2Local Audio

The full version of the talk is here:

![](talk.mp3 "Full Talk")
<p>The full version of the talk is here:</p>
<div class="figure">
<audio src="resources/talk.mp3"/>
<p class="title">Full Talk</p>
</div>

8.16.3Web Audio

The full version of the talk is here:

![](http://markua.com/talk.mp3 "Full Talk")
<p>The full version of the talk is here:</p>
<div class="figure">
<audio src="http://markua.com/talk.mp3"/>
<p class="title">Full Talk</p>
</div>

8.17Math (M)

Math can be a local, web or inline resource, just like any other resource, and the same resource syntax applies to math as to all other resources.

Markua does not specify how math is output in HTML. It can be an image, or it can use something like MathJax. If MathJax is available, it is recommended for a Markua Processor to convert AsciiMath into MathML, and then use MathJax to format it.

There are two formats of math resources supported by Markua:

  1. LaTeX math (latexmath or $)
  2. AsciiMath (asciimath or @)

8.17.1Supported Attributes for Math

The following is the supported attribute for math resources, in addition to the class, format, title and type attributes which all resources support:

alt : The alt is the alt text, to be displayed when the mathematical equations cannot be shown. The default alt text for math is “math”. This can be provided in the figure attribute list. This is primarily intended for Markua Processors that output math as images; there are no output requirements for the alt text. This attribute functions as it does for images. (In fact, a Markua Processor may choose to transform the math into an image, for maximum ebook reader compatibility.)

Note that for math, the format is the name of the syntax used to write the mathematical equations. There are two special types of format for math baked into Markua: latexmath for LaTeX math and asciimath for AsciiMath. If a Markua Processor encounters one of these formats, it must assume the type of the resource is math, not code.

8.17.2Local Math Resources

Local math resources can be inserted as a figure.

Here's a paragraph before the figure.

{format: latexmath}
![too large to fit in the alt text](fermat-proof.tex "Proof of Fermat's Last Theorem")

Here's a paragraph after the figure.

The alt text can also be set with an attribute list:

Here's a paragraph before the figure.

{format: latexmath, alt: "too large to fit in the alt text"}
![](fermat-proof.tex "Proof of Fermat's Last Theorem")

Here's a paragraph after the figure.

8.17.3Web Math Resources

This is identical to how local math resources work, including the significance of file extensions. The only difference is that the files are on the web.

Here's a paragraph before the figure.

{format: latexmath}
![too large to fit in the alt text](http://markua.com/fermat-proof.tex "Proof of Fermat's Last Theorem")

Here's a paragraph after the figure.

The alt text can also be set with an attribute list:

Here's a paragraph before the figure.

{format: latexmath, alt: "too large to fit in the alt text"}
![](http://markua.com/fermat-proof.tex "Proof of Fermat's Last Theorem")

Here's a paragraph after the figure.

8.17.4Inline Math Resources

Inline math resources are the most flexible way to insert math. They are the only way to insert math as a span resource, and the most straightforward way to add short math examples as figures. LaTeX math and AsciiMath can be inserted inline as a span or figure.

8.17.4.1Span

Being able to insert a math resource as a span is important, as it lets you write things like one of the kinematic equations `d = v_i t + \frac{1}{2} a t^2`$ inside sentences. This can be done with LaTeX math or AsciiMath.

To insert math as inline math, use a $ after the closing backtick for LaTeX math, an @ after the closing backtick for AsciiMath, or an attribute list specifying a format of latexmath or asciimath. If none of these is done, the content of the backticks is treated as code and is output verbatim as monospaced text.
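One possible way to tell these span forms apart is a single regular expression, sketched below in Python. The pattern and the function name are illustrative assumptions. The sketch recognizes only the `$` and `@` sugar and the two attribute-list forms, and treats every other backtick span as code.

```python
import re

# A backtick span, optionally followed by $, @ or a math attribute list.
INLINE = re.compile(
    r"`([^`]+)`"
    r"(?:([$@])"
    r"|\{\s*(?:type:\s*math,\s*)?format:\s*(latexmath|asciimath)\s*\})?"
)


def classify_spans(text):
    """Return (content, kind) pairs; kind is latexmath, asciimath or code."""
    results = []
    for match in INLINE.finditer(text):
        content, sugar, fmt = match.groups()
        if sugar == "$":
            kind = "latexmath"
        elif sugar == "@":
            kind = "asciimath"
        elif fmt:
            kind = fmt
        else:
            # No math marker: the span is ordinary code.
            kind = "code"
        results.append((content, kind))
    return results
```

Note that the marker must follow the closing backtick immediately, so a span followed by a space and then a `$` would still be classified as code in this sketch.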

8.17.4.1.1LaTeX math span

There is syntactic sugar for LaTeX math which is inserted as a span, using the $ character after the closing backtick:

Here's one of the kinematic equations `d = v_i t + \frac{1}{2} a t^2`$ inside a
sentence.

The $ character indicates the inline resource is LaTeX math.

If you don’t like syntactic sugar, you can also use {format: latexmath} or {type: math, format: latexmath} after the inline span resource:

Here's one of the kinematic equations
`d = v_i t + \frac{1}{2} a t^2`{format: latexmath}
inside a sentence.
8.17.4.1.2AsciiMath span

AsciiMath is a way of producing simple MathML equations, using about 1% of the typing. It’s more terse than LaTeX math.

There is syntactic sugar for AsciiMath which is inserted as a span, using the @ character after the closing backtick:

Here's one of the kinematic equations `d = v_i t + 1/2 at^2`@ inside a sentence.

The @ character indicates the inline resource is AsciiMath.

If you don’t like syntactic sugar, you can also use {format: asciimath} or {type: math, format: asciimath} after the inline span resource:

Here's one of the kinematic equations `d = v_i t + 1/2 at^2`{format: asciimath}
inside a sentence.

8.17.4.2Figure

LaTeX math and AsciiMath can be inserted inline as a figure.

Both let you produce mathematical equations like this:

\left|\sum_{i=1}^n a_ib_i\right|
\le
\left(\sum_{i=1}^n a_i^2\right)^{1/2}
\left(\sum_{i=1}^n b_i^2\right)^{1/2}
8.17.4.2.1LaTeX math figures

Here’s how you do this using LaTeX math…

Here’s the version with the syntactic sugar for the format after the backticks:

Here's a paragraph before the figure.

```$
\left|\sum_{i=1}^n a_ib_i\right|
\le
\left(\sum_{i=1}^n a_i^2\right)^{1/2}
\left(\sum_{i=1}^n b_i^2\right)^{1/2}
```

Here's a paragraph after the figure.

Here’s the same thing, with the full format after the backticks:

Here's a paragraph before the figure.

```latexmath
\left|\sum_{i=1}^n a_ib_i\right|
\le
\left(\sum_{i=1}^n a_i^2\right)^{1/2}
\left(\sum_{i=1}^n b_i^2\right)^{1/2}
```

Here's a paragraph after the figure.

Here’s the same thing again, with a full attribute list:

Here's a paragraph before the figure.

{format: latexmath}
```
\left|\sum_{i=1}^n a_ib_i\right|
\le
\left(\sum_{i=1}^n a_i^2\right)^{1/2}
\left(\sum_{i=1}^n b_i^2\right)^{1/2}
```

Here's a paragraph after the figure.
8.17.4.2.2AsciiMath figures

Here’s how you do this using AsciiMath…

Here’s the version with the syntactic sugar for the format after the backticks:

Here's a paragraph before the figure.

```@
abs(sum_(i=1)^n a_i b_i) <= (sum_(i=1)^n a_i^2)^(1/2) (sum_(i=1)^n b_i^2)^(1/2)
```

Here's a paragraph after the figure.

Here’s the same thing, with the full format after the backticks:

Here's a paragraph before the figure.

```asciimath
abs(sum_(i=1)^n a_i b_i) <= (sum_(i=1)^n a_i^2)^(1/2) (sum_(i=1)^n b_i^2)^(1/2)
```

Here's a paragraph after the figure.

Here’s the same thing again, with a full attribute list:

Here's a paragraph before the figure.

{format: asciimath}
```
abs(sum_(i=1)^n a_i b_i) <= (sum_(i=1)^n a_i^2)^(1/2) (sum_(i=1)^n b_i^2)^(1/2)
```

Here's a paragraph after the figure.

If you wonder why I’m a fan of AsciiMath: I actually got that right on the first try at the AsciiMath website.

Note that when you are writing about AsciiMath and want to display the AsciiMath text, what you are really doing is creating a code resource. This is essentially identical to what was shown earlier in writing about SVG images. It’s also shown below.

8.17.5Writing about Math

If you want to write about math, and show the actual code (instead of the formatted output), the resource needs to have a type of code, not math.

Now, you can just be lazy and not provide format or type attributes at all, since guessing when neither is present always produces a type of code.

This will be output as code:

```
abs(sum_(i=1)^n a_i b_i) <= (sum_(i=1)^n a_i^2)^(1/2) (sum_(i=1)^n b_i^2)^(1/2)
```

This isn’t just true for AsciiMath code; it’s true for any math. For example, this will be output as code, since there’s no $ or latexmath format specifier to indicate that it is LaTeX math:

```
\left|\sum_{i=1}^n a_ib_i\right|
\le
\left(\sum_{i=1}^n a_i^2\right)^{1/2}
\left(\sum_{i=1}^n b_i^2\right)^{1/2}
```

However, you can also specify the type of code, to be explicit:

{type: code}
```
abs(sum_(i=1)^n a_i b_i) <= (sum_(i=1)^n a_i^2)^(1/2) (sum_(i=1)^n b_i^2)^(1/2)
```

Since specifying a type overrides the type inferred by the format, you can even specify the format of the math being used, while still keeping it code:

{type: code, format: asciimath}
```
abs(sum_(i=1)^n a_i b_i) <= (sum_(i=1)^n a_i^2)^(1/2) (sum_(i=1)^n b_i^2)^(1/2)
```

This also works with the syntactic sugar with the full format name:

{type: code}
```asciimath
abs(sum_(i=1)^n a_i b_i) <= (sum_(i=1)^n a_i^2)^(1/2) (sum_(i=1)^n b_i^2)^(1/2)
```

This also works with the syntactic sugar with the format shortcut:

{type: code}
```@
abs(sum_(i=1)^n a_i b_i) <= (sum_(i=1)^n a_i^2)^(1/2) (sum_(i=1)^n b_i^2)^(1/2)
```

Now, chances are a Markua Processor will not be doing syntax highlighting on AsciiMath.

If you want to ensure that no syntax highlighting is done, you can just specify text:

```text
abs(sum_(i=1)^n a_i b_i) <= (sum_(i=1)^n a_i^2)^(1/2) (sum_(i=1)^n b_i^2)^(1/2)
```

Heck, you can even use tildes, since with tildes the format defaults to text, not guess:

~~~
abs(sum_(i=1)^n a_i b_i) <= (sum_(i=1)^n a_i^2)^(1/2) (sum_(i=1)^n b_i^2)^(1/2)
~~~
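
The rules in this section can be summarized in a few lines of code. What follows is only a sketch, not how any particular Markua Processor works: the function name resolve_resource and its parameters are hypothetical, and it only covers the cases discussed above (the $ and @ shortcuts, a full format name after the backticks, an attribute list, and tildes).

```python
# Formats that imply a type of math when no type is given explicitly.
MATH_FORMATS = {"latexmath", "asciimath"}
# Syntactic sugar after the opening fence: $ is LaTeX math, @ is AsciiMath.
SUGAR = {"$": "latexmath", "@": "asciimath"}


def resolve_resource(info="", attrs=None, fence="`"):
    """Return (type, format) for a fenced resource.

    info  -- the text after the opening fence ("", "$", "@", "asciimath", ...)
    attrs -- the attribute list above the fence, as a dict
    fence -- "`" for backticks, "~" for tildes
    """
    attrs = dict(attrs or {})
    fmt = attrs.get("format") or SUGAR.get(info, info) or None
    if fmt is None:
        # Backticks default to guess, tildes default to text.
        fmt = "text" if fence == "~" else "guess"
    rtype = attrs.get("type")
    if rtype is None:
        # An explicit type always wins; otherwise infer it from the format.
        rtype = "math" if fmt in MATH_FORMATS else "code"
    return rtype, fmt
```

With this sketch, resolve_resource("@") is a math resource, while resolve_resource("@", {"type": "code"}) keeps the asciimath format but is output as code.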

8.18Autolinks

Autolinks are absolute URIs and email addresses inside < and >. They are parsed as links, with the URL or email address as the link label.

A URI autolink consists of <, followed by an absolute URI followed by >. It is parsed as a link to the URI, with the URI as the link’s label.

An absolute URI, for these purposes, consists of a scheme followed by a colon (:) followed by zero or more characters other than ASCII control characters, whitespace, <, and >. If the URI includes these characters, they must be percent-encoded (e.g. %20 for a space).

For purposes of this spec, a scheme is any sequence of 2–32 characters beginning with an ASCII letter and followed by any combination of ASCII letters, digits, or the symbols plus ("+"), period ("."), or hyphen ("-").

Here are some valid autolinks:

Example 639
<http://foo.bar.baz>
<p><a href="http://foo.bar.baz">http://foo.bar.baz</a></p>
Example 640
<http://foo.bar.baz/test?q=hello&id=22&boolean>
<p><a href="http://foo.bar.baz/test?q=hello&amp;id=22&amp;boolean">http://foo.bar.baz/test?q=hello&amp;id=22&amp;boolean</a></p>
Example 641
<irc://foo.bar:2233/baz>
<p><a href="irc://foo.bar:2233/baz">irc://foo.bar:2233/baz</a></p>

Uppercase is also fine:

Example 642
<MAILTO:FOO@BAR.BAZ>
<p><a href="MAILTO:FOO@BAR.BAZ">MAILTO:FOO@BAR.BAZ</a></p>

Note that many strings that count as absolute URIs for purposes of this spec are not valid URIs, because their schemes are not registered or because of other problems with their syntax:

Example 643
<a+b+c:d>
<p><a href="a+b+c:d">a+b+c:d</a></p>
Example 644
<made-up-scheme://foo,bar>
<p><a href="made-up-scheme://foo,bar">made-up-scheme://foo,bar</a></p>
Example 645
<http://../>
<p><a href="http://../">http://../</a></p>
Example 646
<localhost:5001/foo>
<p><a href="localhost:5001/foo">localhost:5001/foo</a></p>

Spaces are not allowed in autolinks:

Example 647
<http://foo.bar/baz bim>
<p>&lt;http://foo.bar/baz bim&gt;</p>

Backslash-escapes do not work inside autolinks:

Example 648
<http://example.com/\[\>
<p><a href="http://example.com/%5C%5B%5C">http://example.com/\[\</a></p>
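
The URI autolink rules above can be sketched as a single regular expression. This is an illustration under assumptions, not the reference implementation: the names URI_AUTOLINK and render_uri_autolink are hypothetical, the input is assumed to be exactly one candidate autolink, and percent-encoding the href with urllib.parse.quote is just one way to reproduce the output shown in the backslash example.

```python
import re
from html import escape
from urllib.parse import quote

# Scheme: an ASCII letter followed by 1-31 letters, digits, "+", "." or "-"
# (2-32 characters in total), then a colon, then any characters other than
# ASCII control characters, space, "<" and ">".
URI_AUTOLINK = re.compile(r"<([A-Za-z][A-Za-z0-9+.\-]{1,31}:[^\x00-\x20\x7f<>]*)>")


def render_uri_autolink(text):
    """Return the HTML link for a URI autolink, or None if text is not one."""
    match = URI_AUTOLINK.fullmatch(text)
    if match is None:
        return None
    uri = match.group(1)
    # Backslashes are not escapes here; they are percent-encoded in the href.
    href = quote(uri, safe="/:?#@!$&'()*+,;=%-._~")
    return f'<a href="{escape(href)}">{escape(uri)}</a>'
```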

An email autolink consists of <, followed by an email address, followed by >. The link’s label is the email address, and the URL is mailto: followed by the email address.

An email address, for these purposes, is anything that matches the non-normative regex from the HTML5 spec:

/^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?
(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/

Examples of email autolinks:

Example 649
<foo@bar.example.com>
<p><a href="mailto:foo@bar.example.com">foo@bar.example.com</a></p>
Example 650
<foo+special@Bar.baz-bar0.com>
<p><a href="mailto:foo+special@Bar.baz-bar0.com">foo+special@Bar.baz-bar0.com</a></p>

Backslash-escapes do not work inside email autolinks:

Example 651
<foo\+@bar.example.com>
<p>&lt;foo+@bar.example.com&gt;</p>
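
Email autolinks can be sketched the same way, reusing the regex quoted above. The function name render_email_autolink is hypothetical, and the sketch assumes the input is exactly one candidate autolink.

```python
import re
from html import escape

# The non-normative email regex from the HTML5 spec, as quoted above.
EMAIL = re.compile(
    r"[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?"
    r"(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*"
)


def render_email_autolink(text):
    """Return the HTML link for an email autolink, or None if text is not one."""
    if len(text) < 2 or not (text.startswith("<") and text.endswith(">")):
        return None
    address = text[1:-1]
    if EMAIL.fullmatch(address) is None:
        return None
    # The label is the address; the URL is mailto: followed by the address.
    return f'<a href="mailto:{escape(address)}">{escape(address)}</a>'
```

Because a backslash is not one of the characters the regex allows, the backslash-escaped example above fails to match and is left as plain text.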

These are not autolinks:

Example 652
<>
<p>&lt;&gt;</p>
Example 653
< http://foo.bar >
<p>&lt; http://foo.bar &gt;</p>
Example 654
<m:abc>
<p>&lt;m:abc&gt;</p>
Example 655
<foo.bar.baz>
<p>&lt;foo.bar.baz&gt;</p>
Example 656
http://example.com
<p>http://example.com</p>
Example 657
foo@bar.example.com
<p>foo@bar.example.com</p>

8.19No Raw HTML (M)

Besides supporting HTML blocks, Markdown and CommonMark also support raw HTML as inlines, in which HTML tags inserted in a leaf block such as a paragraph will be rendered without escaping.

Just as Markua does not support HTML blocks, Markua also does not support raw HTML as inlines. The logic is the same, so the conclusion is the same.

8.20Hard line breaks

A line break (not in a code span or HTML tag) that is preceded by two or more spaces and does not occur at the end of a block is parsed as a hard line break (rendered in HTML as a <br /> tag):

Example 658
foo  
baz
<p>foo<br />
baz</p>

For a more visible alternative, a backslash before the line ending may be used instead of two spaces:

Example 659
foo\
baz
<p>foo<br />
baz</p>

More than two spaces can be used:

Example 660
foo       
baz
<p>foo<br />
baz</p>

Leading spaces at the beginning of the next line are ignored:

Example 661
foo  
     bar
<p>foo<br />
bar</p>
Example 662
foo\
     bar
<p>foo<br />
bar</p>

Line breaks can occur inside emphasis, links, and other constructs that allow inline content:

Example 663
*foo  
bar*
<p><em>foo<br />
bar</em></p>
Example 664
*foo\
bar*
<p><em>foo<br />
bar</em></p>

Line breaks do not occur inside code spans

Example 665
`code 
span`
<p><code>code  span</code></p>
Example 666
`code\
span`
<p><code>code\ span</code></p>

or HTML tags:

Example 667
<a href="foo  
bar">
<p><a href="foo  
bar"></p>
Example 668
<a href="foo\
bar">
<p><a href="foo\
bar"></p>

Hard line breaks are for separating inline content within a block. Neither syntax for hard line breaks works at the end of a paragraph or other block element:

Example 669
foo\
<p>foo\</p>
Example 670
foo  
<p>foo</p>
Example 671
### foo\
<h3>foo\</h3>
Example 672
### foo  
<h3>foo</h3>
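
The hard line break rules above can be sketched for the simple case of a paragraph of plain text. This is a deliberately simplified illustration: the function name render_hard_breaks is hypothetical, and the sketch ignores code spans, HTML tags, escaped backslashes, emphasis and HTML escaping.

```python
def render_hard_breaks(paragraph):
    """Render one paragraph of plain text, honoring hard line breaks."""
    lines = paragraph.split("\n")
    out = []
    for index, line in enumerate(lines):
        if index > 0:
            # Leading spaces at the beginning of the next line are ignored.
            line = line.lstrip(" ")
        if index == len(lines) - 1:
            # Neither syntax works at the end of a block: trailing spaces are
            # dropped and a trailing backslash stays literal.
            out.append(line.rstrip(" "))
        elif line.endswith("\\"):
            out.append(line[:-1] + "<br />\n")
        elif line.endswith("  "):
            out.append(line.rstrip(" ") + "<br />\n")
        else:
            # Anything else is a soft line break, kept as a line ending.
            out.append(line.rstrip(" ") + "\n")
    return "<p>" + "".join(out) + "</p>"
```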

8.21Soft line breaks

A regular line break (not in a code span or HTML tag) that is not preceded by two or more spaces or a backslash is parsed as a softbreak. (A softbreak may be rendered in HTML either as a line ending or as a space. The result will be the same in browsers. In the examples here, a line ending will be used.)

Example 673
foo
baz
<p>foo
baz</p>

Spaces at the end of the line and beginning of the next line are removed:

Example 674
foo 
 baz
<p>foo
baz</p>

A conforming parser may render a soft line break in HTML either as a line break or as a space.

A renderer may also provide an option to render soft line breaks as hard line breaks.

8.22Configurable soft line breaks (M)

There is a good argument to be made that single newlines at the end of a line should always be treated as a hard line break, not as a soft line break.

In ancient history, some text editors did not automatically wrap lines of text, so manual wrapping of plain text files was a good thing to do. However, for writing, automatic wrapping of paragraphs is essential for staying in the flow while writing, and for being able to edit your text without needing to re-wrap every line in a paragraph. This is one decision that even Microsoft Word gets right.

So, Markua used to treat single newlines at the end of a line as a hard line break, not as a soft line break.

However, compatibility with Markdown by default is more important than fixing things such as this which I consider to be mistakes, even though the Markdown two-space hack to force a hard line break makes me die a little inside every time I (don’t) see it. As I’ve grown older, I’ve become more willing to compromise.

So, Markua has a document setting called soft-breaks, which can be set to source, space or break.

The default value of soft-breaks is source, for compatibility with Markdown. However, most authors will want to set the value of soft-breaks to break, for compatibility with truth and beauty.

The default value of soft-breaks is source:

# Chapter One

foo
bar
<h1>Chapter One</h1>
<p>foo
bar</p>

This can be made explicit by setting the soft-breaks global setting to source:

{
soft-breaks: source
}

# Chapter One

foo
bar
<h1>Chapter One</h1>
<p>foo
bar</p>

This can be made shorter by setting the soft-breaks global setting to space:

{
soft-breaks: space
}

# Chapter One

foo
bar
<h1>Chapter One</h1>
<p>foo bar</p>

If you want your line breaks to actually, well, produce line breaks, you need to set the soft-breaks global setting to break:

{
soft-breaks: break
}

# Chapter One

foo
bar
<h1>Chapter One</h1>
<p>foo<br />
bar</p>

Finally, note that the soft-breaks global setting also applies to other paragraph-like contexts, such as definition list items.
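
The three values of soft-breaks differ only in what is output between the lines of a paragraph. Here is a minimal sketch of that idea; the names SEPARATORS and render_paragraph are hypothetical, and the sketch handles only plain lines of text.

```python
# What each soft-breaks value puts between two lines of a paragraph.
SEPARATORS = {
    "source": "\n",        # keep the line ending in the HTML source
    "space": " ",          # join the lines with a single space
    "break": "<br />\n",   # treat every soft line break as a hard one
}


def render_paragraph(lines, soft_breaks="source"):
    """Render the lines of one paragraph according to the soft-breaks setting."""
    if soft_breaks not in SEPARATORS:
        raise ValueError(f"unknown soft-breaks value: {soft_breaks!r}")
    # Spaces at the end of a line and the beginning of the next are removed.
    stripped = [line.strip(" ") for line in lines]
    return "<p>" + SEPARATORS[soft_breaks].join(stripped) + "</p>"
```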

8.23Span attribute lists (M)

Surrounding text in square brackets can be useful not just for giving it a URL to link to. If you wish to add attributes to an arbitrary span of text, you can create an arbitrary span of text using square brackets and then add an attribute list immediately afterward. You can use any attribute list on this span, and you can also just use the id syntactic sugar {#theid} on this span. The most common uses of this are to add id attributes or index entries. (Index entries are discussed later.)

Some text [then a span]{foo: lorem, bar: ipsum} and more text.

This [span has an id]{#hello}, so hooray!

This span [also has an id]{id: world}. 
<p>Some text <span foo="lorem" bar="ipsum">then a span</span> and more text.</p>
<p>This <span id="hello">span has an id</span>, so hooray!</p>
<p>This span <span id="world">also has an id</span>.</p>
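
The mapping from a span with an attribute list to HTML can be sketched with a regular expression. This is only an illustration under assumptions: the names SPAN_RE, parse_attribute_list and render_spans are hypothetical, attribute values are assumed to be unquoted and free of commas, and nothing else in the text (links, emphasis and so on) is parsed.

```python
import re
from html import escape

# A span in square brackets followed immediately by an attribute list.
# The span text must not start with a caret, since [^...] is a footnote.
SPAN_RE = re.compile(r"\[([^\]^][^\]]*)\]\{([^}]*)\}")


def parse_attribute_list(body):
    """Parse 'foo: lorem, bar: ipsum' or the '#theid' shortcut into a dict."""
    body = body.strip()
    if body.startswith("#"):
        return {"id": body[1:]}
    attrs = {}
    for pair in body.split(","):
        key, _, value = pair.partition(":")
        attrs[key.strip()] = value.strip()
    return attrs


def render_spans(text):
    """Replace every [span]{attributes} in text with an HTML span element."""
    def replace(match):
        attrs = parse_attribute_list(match.group(2))
        rendered = "".join(f' {key}="{escape(value)}"' for key, value in attrs.items())
        return f"<span{rendered}>{match.group(1)}</span>"
    return SPAN_RE.sub(replace, text)
```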

Note, however, that you cannot start a normal span with a caret (^): this creates a [^footnote] instead. (Footnotes are discussed below.)

8.24Index entries (M)

Markua supports adding index entries via the attribute list syntax. Index entries let authors or indexers produce professional-looking indexes in Markua books.

Index entries can be attached to either block or span elements using the same attribute list syntax. In fact, index entries can just be added as part of a larger attribute list.

The actual syntax of what the value of an index entry looks like is inspired by LaTeX.

The key of an index entry is i, for index.

In its simplest form, an index entry is simply {i: "hello"}. Like any attribute list, you don’t need a space between the colon and the quote: you can also do {i:"hello"}.

These are the various formats of an index entry:

{i: hello}
{i: "hello"}
{i: "Armstrong, Peter"}
{i: "Yahoo\!"}
{i: "*hello*"}
{i: "**hello**"}
{i: "hello!Peter"}
{i: "hello!*Peter*"}
{i: "hello!**Peter**"}
{i: "Peter|see{i:'hello'}"}
{i: "Jen|seealso{i:'Jenny'}"}

Here’s what they do:

{i: hello} : Adds an index entry for hello. If an index entry has no punctuation or formatting then it does not need quotes.

{i: "hello"} : Adds an index entry for hello. Quotes are always fine to use, even when not required.

{i: "Armstrong, Peter"} : Adds an index entry for Armstrong, Peter. The quotes themselves are never part of the entry. Their function is to allow things like exclamation marks and other punctuation to be added without fear, in case you don’t feel like learning which punctuation is safe.

{i: "Yahoo\!"} : Adds an index entry for Yahoo!. Note that the exclamation mark must be backslash-escaped because ! is a delimiter otherwise. The |, {, } and \ characters also must be backslash-escaped.

{i: "*hello*"} : Adds an index entry for hello, with hello in italics.

{i: "**hello**"} : Adds an index entry for hello, with hello in bold.

{i: "hello!Peter"} : Adds an index entry for Peter which is a sub-entry of hello.

{i: "hello!*Peter*"} : Adds an index entry for Peter (with Peter in emphasis) which is a sub-entry of hello. Note that this cannot be combined with a see or seealso (the | syntax).

{i: "hello!**Peter**"} : Adds an index entry for Peter (with Peter in strong emphasis) which is a sub-entry of hello. Note that this cannot be combined with a see or seealso (the | syntax).

{i: "Peter|see{i:'hello'}"} : Adds an index entry for Peter, which references the index entry hello with the equivalent of “see” in the language of the book. Note that this cannot be combined with a sub-entry (the ! syntax).

{i: "Jen|seealso{i:'Jenny'}"} : Adds an index entry for Jen, which references the index entry Jenny with the equivalent of “see also” in the language of the book. Note that this cannot be combined with a sub-entry (the ! syntax).

Index entries are case sensitive. For example, {i: "mark"} and {i: "Mark"} are distinct entries. (The first is for a result or a smudge, the second is a person’s name.)
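
To make the delimiters concrete, here is a sketch of how the value of an index entry (the part inside the quotes) could be split into its pieces. It is an illustration only: the names IndexEntry and parse_index_value are hypothetical, emphasis markers are left in the text as-is, and the shape of the see and seealso reference is assumed to be exactly as in the examples above.

```python
import re
from typing import NamedTuple, Optional


class IndexEntry(NamedTuple):
    entry: str
    sub_entry: Optional[str] = None
    ref_kind: Optional[str] = None    # "see" or "seealso"
    ref_target: Optional[str] = None


def _split_unescaped(text, delimiter):
    """Split text at the first delimiter that is not backslash-escaped."""
    i = 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text):
            i += 2  # skip the escaped character
            continue
        if text[i] == delimiter:
            return text[:i], text[i + 1:]
        i += 1
    return text, None


def _unescape(text):
    # The !, |, {, } and \ characters must be backslash-escaped.
    return re.sub(r"\\([!|{}\\])", r"\1", text)


def parse_index_value(value):
    head, ref = _split_unescaped(value, "|")
    ref_kind = ref_target = None
    if ref is not None:
        match = re.fullmatch(r"(seealso|see)\{i:\s*'([^']*)'\}", ref)
        if match is None:
            raise ValueError(f"malformed reference: {ref!r}")
        ref_kind, ref_target = match.group(1), match.group(2)
    entry, sub_entry = _split_unescaped(head, "!")
    if sub_entry is not None and ref is not None:
        raise ValueError("a sub-entry cannot be combined with see or seealso")
    if sub_entry is not None:
        sub_entry = _unescape(sub_entry)
    return IndexEntry(_unescape(entry), sub_entry, ref_kind, ref_target)
```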

To attach an index entry to the start of a block, put it on its own line above a block:

{i: "hello"}
I just came to say hello, hello, hello, hello

To attach an index entry to a word, just add the index entry after the word:

I just came to say hello{i: "hello"}, hello, hello, hello

To attach an index entry to a span element, just add the index entry after the span element:

The first program that a programmer writes in a language is usually
*Hello World*{i: "Hello World"}

Index entries can have commas and other punctuation (except colons) in their definition:

My wife read some book about a whale by Herman Melville{i: "Melville, Herman"}.

Multiple index entries can exist in a block, or even a sentence:

Supposedly the great-great-great-granduncle of the musician Moby{i: "Moby"} was
Herman Melville{i: "Melville, Herman"}, the author of a book about a
whale{i: "Moby-Dick; or, The Whale"}.

Note that adding index entries is best left until the author is done writing the book. At that time, ids like {#myid} can be converted to {id: myid, i: "blah"} if index entries are being added where ids already are.

Markua does not specify the HTML output of index entries, to maximize implementation flexibility.

8.25Concatenating files and blank lines (M)

A Markua document can be written in one file or multiple manuscript files. If a manuscript is written in multiple files, these files are concatenated together by the Markua Processor to produce one temporary manuscript file, and that one file is used as the input.

Importantly, in order to avoid a number of bugs, the files are not just concatenated together unchanged. They must be concatenated together by Markua Processors in the following way:

  1. All newlines at the beginning and the end of each file are removed.
  2. Two newlines are added between the end of each file and the beginning of the next file.

So, after this process, exactly one blank line separates the contents of each manuscript file. Note that because of this rule, a paragraph (or any other block element) cannot span multiple manuscript files.
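
The two steps above can be sketched in a few lines. The function name concatenate_manuscript is hypothetical, and stripping carriage returns along with newlines is an assumption made here so that files with Windows line endings behave the same way.

```python
def concatenate_manuscript(files):
    """Concatenate the contents of manuscript files, given as strings, in order."""
    # Step 1: remove all newlines at the beginning and the end of each file.
    trimmed = [content.strip("\r\n") for content in files]
    # Step 2: add two newlines between the end of each file and the start of
    # the next, so that exactly one blank line separates their contents.
    return "\n\n".join(trimmed)
```

Because the result does not depend on how many newlines each file ends with, an editor that strips blank lines at the end of files cannot change the output.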

To see why this approach is so important, consider the following single-file Markua document:

Example 680
# Chapter One

Lorem ipsum dolor.

# Chapter Two

Yada yada yada.
<h1 class="chapter">Chapter One</h1>
<p>Lorem ipsum dolor.</p>
<h1 class="chapter">Chapter Two</h1>
<p>Yada yada yada.</p>

Suppose instead a multiple-file approach was used, in which there were two files, ch1.txt and ch2.txt, with the following content.

ch1.txt:

Example 681
# Chapter One

Lorem ipsum dolor.
<h1 class="chapter">Chapter One</h1>
<p>Lorem ipsum dolor.</p>

ch2.txt:

Example 682
# Chapter Two

Yada yada yada.
<h1 class="chapter">Chapter Two</h1>
<p>Yada yada yada.</p>

If Markua did not add any newlines between files, then these files would produce the following incorrect manuscript:

Example 683
# Chapter One

Lorem ipsum dolor.# Chapter Two

Yada yada yada.
<h1 class="chapter">Chapter One</h1>
<p>Lorem ipsum dolor.# Chapter Two</p>
<p>Yada yada yada.</p>

If Markua only added one newline when concatenating, this would produce a correct manuscript (since headings end paragraphs), but one with possible bugs:

Example 684
# Chapter One

Lorem ipsum dolor.
# Chapter Two

Yada yada yada.
<h1 class="chapter">Chapter One</h1>
<p>Lorem ipsum dolor.</p>
<h1 class="chapter">Chapter Two</h1>
<p>Yada yada yada.</p>

Worse, since a number of text editors such as Emacs have a “strip blank lines at the end of files” setting, it would be possible to introduce such a bug if Markua simply relied on blank lines being added to the end of a file by the author.

So, because of the blank line rule, concatenating the files produces the same manuscript as the single-file manuscript above:

Example 685
# Chapter One

Lorem ipsum dolor.

# Chapter Two

Yada yada yada.
<h1 class="chapter">Chapter One</h1>
<p>Lorem ipsum dolor.</p>
<h1 class="chapter">Chapter Two</h1>
<p>Yada yada yada.</p>

8.26Textual content

Any characters not given an interpretation by the above rules will be parsed as plain textual content.

Example 686
hello $.;'there
<p>hello $.;'there</p>
Example 687
Foo χρῆν
<p>Foo χρῆν</p>

Internal spaces are preserved verbatim:

Example 688
Multiple     spaces
<p>Multiple     spaces</p>

Appendix: A parsing strategy

In this appendix we describe some features of the parsing strategy used in the CommonMark reference implementations.

Overview

Parsing has two phases:

  1. In the first phase, lines of input are consumed and the block structure of the document—its division into paragraphs, block quotes, list items, and so on—is constructed. Text is assigned to these blocks but not parsed. Link reference definitions are parsed and a map of links is constructed.

  2. In the second phase, the raw text contents of paragraphs and headings are parsed into sequences of Markdown inline elements (strings, code spans, links, emphasis, and so on), using the map of link references constructed in phase 1.

At each point in processing, the document is represented as a tree of blocks. The root of the tree is a document block. The document may have any number of other blocks as children. These children may, in turn, have other blocks as children. The last child of a block is normally considered open, meaning that subsequent lines of input can alter its contents. (Blocks that are not open are closed.) Here, for example, is a possible document tree, with the open blocks marked by arrows:

-> document
  -> block_quote
       paragraph
         "Lorem ipsum dolor\nsit amet."
    -> list (type=bullet tight=true bullet_char=-)
         list_item
           paragraph
             "Qui *quodsi iracundia*"
      -> list_item
        -> paragraph
             "aliquando id"

Phase 1: block structure

Each line that is processed has an effect on this tree. The line is analyzed and, depending on its contents, the document may be altered in one or more of the following ways:

  1. One or more open blocks may be closed.
  2. One or more new blocks may be created as children of the last open block.
  3. Text may be added to the last (deepest) open block remaining on the tree.

Once a line has been incorporated into the tree in this way, it can be discarded, so input can be read in a stream.
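
The bookkeeping of open and closed blocks can be sketched with a small tree type. This is not the reference implementation: the names Block, open_path and build_example are hypothetical, and the sketch models only the tree and its open/closed flags. The decisions about which blocks a line closes or creates are written out by hand for the four-line example discussed below.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Block:
    kind: str
    text: str = ""
    is_open: bool = True
    children: List["Block"] = field(default_factory=list)

    def close(self) -> None:
        """Close this block and any descendants that are still open."""
        for child in self.children:
            child.close()
        self.is_open = False

    def add_child(self, child: "Block") -> "Block":
        """Append a new open child; only the last child may stay open."""
        if self.children:
            self.children[-1].close()
        self.children.append(child)
        return child


def open_path(root: Block) -> List[Block]:
    """Descend through last children from the root to the deepest open block."""
    path = [root]
    while path[-1].children and path[-1].children[-1].is_open:
        path.append(path[-1].children[-1])
    return path


def build_example() -> Block:
    document = Block("document")
    # Line 1: "> Lorem ipsum dolor"
    quote = document.add_child(Block("block_quote"))
    quote.add_child(Block("paragraph", "Lorem ipsum dolor"))
    # Line 2: "sit amet." is added to the deepest open block.
    open_path(document)[-1].text += "\nsit amet."
    # Line 3: "> - Qui *quodsi iracundia*" closes the paragraph, opens a list.
    bullet_list = quote.add_child(Block("list"))
    item = bullet_list.add_child(Block("list_item"))
    item.add_child(Block("paragraph", "Qui *quodsi iracundia*"))
    # Line 4: "> - aliquando id" closes the first item and opens a second.
    item = bullet_list.add_child(Block("list_item"))
    item.add_child(Block("paragraph", "aliquando id"))
    return document
```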

For each line, we follow this procedure:

  1. First we iterate through the open blocks, starting with the root document, and descending through last children down to the last open block. Each block imposes a condition that the line must satisfy if the block is to remain open. For example, a block quote requires a > character. A paragraph requires a non-blank line. In this phase we may match all or just some of the open blocks. But we cannot close unmatched blocks yet, because we may have a lazy continuation line.

  2. Next, after consuming the continuation markers for existing blocks, we look for new block starts (e.g. > for a block quote). If we encounter a new block start, we close any blocks unmatched in step 1 before creating the new block as a child of the last matched container block.

  3. Finally, we look at the remainder of the line (after block markers like >, list markers, and indentation have been consumed). This is text that can be incorporated into the last open block (a paragraph, code block, heading, or raw HTML).

Setext headings are formed when we see a line of a paragraph that is a setext heading underline.

Reference link definitions are detected when a paragraph is closed; the accumulated text lines are parsed to see if they begin with one or more reference link definitions. Any remainder becomes a normal paragraph.

We can see how this works by considering how the tree above is generated by four lines of Markdown:

> Lorem ipsum dolor
sit amet.
> - Qui *quodsi iracundia*
> - aliquando id

At the outset, our document model is just

-> document

The first line of our text,

> Lorem ipsum dolor

causes a block_quote block to be created as a child of our open document block, and a paragraph block as a child of the block_quote. Then the text is added to the last open block, the paragraph:

-> document
  -> block_quote
    -> paragraph
         "Lorem ipsum dolor"

The next line,

sit amet.

is a “lazy continuation” of the open paragraph, so it gets added to the paragraph’s text:

-> document
  -> block_quote
    -> paragraph
         "Lorem ipsum dolor\nsit amet."

The third line,

> - Qui *quodsi iracundia*

causes the paragraph block to be closed, and a new list block opened as a child of the block_quote. A list_item is also added as a child of the list, and a paragraph as a child of the list_item. The text is then added to the new paragraph:

-> document
  -> block_quote
       paragraph
         "Lorem ipsum dolor\nsit amet."
    -> list (type=bullet tight=true bullet_char=-)
      -> list_item
        -> paragraph
             "Qui *quodsi iracundia*"

The fourth line,

> - aliquando id

causes the list_item (and its child the paragraph) to be closed, and a new list_item opened up as child of the list. A paragraph is added as a child of the new list_item, to contain the text. We thus obtain the final tree:

-> document
  -> block_quote
       paragraph
         "Lorem ipsum dolor\nsit amet."
    -> list (type=bullet tight=true bullet_char=-)
         list_item
           paragraph
             "Qui *quodsi iracundia*"
      -> list_item
        -> paragraph
             "aliquando id"

Phase 2: inline structure

Once all of the input has been parsed, all open blocks are closed.

We then “walk the tree,” visiting every node, and parse raw string contents of paragraphs and headings as inlines. At this point we have seen all the link reference definitions, so we can resolve reference links as we go.

document
  block_quote
    paragraph
      str "Lorem ipsum dolor"
      softbreak
      str "sit amet."
    list (type=bullet tight=true bullet_char=-)
      list_item
        paragraph
          str "Qui "
          emph
            str "quodsi iracundia"
      list_item
        paragraph
          str "aliquando id"

Notice how the line ending in the first paragraph has been parsed as a softbreak, and the asterisks in the first list item have become an emph.

By far the trickiest part of inline parsing is handling emphasis, strong emphasis, links, and images. This is done using the following algorithm.

When we’re parsing inlines and we hit either a run of * or _ characters, or a [ or ![, we insert a text node with these symbols as its literal content, and we add a pointer to this text node to the delimiter stack.

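The delimiter stack is a doubly linked list. Each element contains a pointer to a text node, plus information about the type of delimiter ([, ![, * or _), the number of delimiters, whether the delimiter is “active” (all are active to start), and whether the delimiter is a potential opener, a potential closer, or both (which depends on what sort of characters precede and follow the delimiters).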

When we hit a ] character, we call the look for link or image procedure (see below).

When we hit the end of the input, we call the process emphasis procedure (see below), with stack_bottom = NULL.

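look for link or image

Starting at the top of the delimiter stack, we look backwards through the stack for an opening [ or ![ delimiter.

  1. If we don’t find one, we return a literal text node ].
  2. If we do find one, but it’s not active, we remove the inactive delimiter from the stack, and return a literal text node ].
  3. If we find one and it’s active, then we parse ahead to see if we have an inline link/image, reference link/image, compact reference link/image, or shortcut reference link/image. If we don’t, then we remove the opening delimiter from the delimiter stack and return a literal text node ]. If we do, then we return a link or image node whose children are the inlines after the text node pointed to by the opening delimiter; we run process emphasis on these inlines, with the [ opener as stack_bottom; we remove the opening delimiter; and, if we have a link (and not an image), we also set all [ delimiters before the opening delimiter to inactive. (This will prevent us from getting links within links.)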

process emphasis

Parameter stack_bottom sets a lower bound to how far we descend in the delimiter stack. If it is NULL, we can go all the way to the bottom. Otherwise, we stop before visiting stack_bottom.

Let current_position point to the element on the delimiter stack just above stack_bottom (or the first element if stack_bottom is NULL).

We keep track of the openers_bottom for each delimiter type (*, _) and each length of the closing delimiter run (modulo 3). Initialize this to stack_bottom.

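Then we repeat the following until we run out of potential closers:

  1. Move current_position forward in the delimiter stack (if needed) until we find the first potential closer with delimiter * or _. (This will be the potential closer closest to the beginning of the input, that is, the first one in parse order.)
  2. Now, look back in the stack (staying above stack_bottom and the openers_bottom for this delimiter type) for the first matching potential opener (“matching” means same delimiter).
  3. If one is found: figure out whether we have emphasis or strong emphasis (if both closer and opener spans have length >= 2, we have strong, otherwise regular); insert an emph or strong emph node accordingly, after the text node corresponding to the opener; remove any delimiters between the opener and closer from the delimiter stack; and remove 1 (for regular emph) or 2 (for strong emph) delimiters from the opening and closing text nodes. If they become empty as a result, remove them and remove the corresponding element of the delimiter stack. If the closing node is removed, reset current_position to the next element in the stack.
  4. If none is found: set openers_bottom to the element before current_position (we know that there are no openers for this kind of closer up to and including this point, so this puts a lower bound on future searches); if the closer at current_position is not a potential opener, remove it from the delimiter stack (since we know it can’t be a closer either); and advance current_position to the next element in the stack.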

After we’re done, we remove all delimiters above stack_bottom from the delimiter stack.