Markua Spec

Version 0.30.35 (2024-02-28)
This formal specification of Markua was developed at Leanpub, is written by Peter Armstrong, is based on the CommonMark Spec (version 0.31.2) by John MacFarlane, and is licensed under the Creative Commons license CC-BY-SA 4.0.



NOTE: If you are a Leanpub author, there are two different versions of Markua used on Leanpub. The default version of Markua for books on Leanpub is Markua 0.30, which is documented in this specification, and which is currently in beta on Leanpub. The default version of Markua for courses on Leanpub is still Markua 0.10, which is documented in The Markua Manual. (You can still use Markua 0.10 and Leanpub Flavoured Markdown for books on Leanpub, of course.)



Creative
   Commons BY-SA
CommonMark Spec by John MacFarlane is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

1Introduction

1.1The magical typewriter (M)

Imagine you owned a magical typewriter.

When you used this magical typewriter, you wrote with fewer distractions. You didn’t just write faster, you wrote better.

With your magical typewriter, you never worried about layout. The book formatted itself.

You could hit a key on your magical typewriter to create an ebook from your manuscript with one click.

All ebook formats would be created, and they’d all look good. You’d have PDF for computers, and EPUB for everywhere else. The book would look great on phones.

With your magical typewriter, you could publish your book before it was even done, and get feedback from readers all over the world. You could automatically share book updates with them. You would press one key on your magical typewriter to publish a new version, and all your readers would have it instantly.

With your magical typewriter, you could easily compare your current manuscript to any other version of your manuscript that had ever existed.

If you decided to make a print book, you could press a key on your magical typewriter to get a print-ready PDF. All you would need to do is add a cover.

If you wanted to work with a designer, you could press a different key to generate InDesign. They could then use this as a starting point for producing a beautiful print book.

Or, if you wanted to work with a publisher, you could press a different key to get a Word document.

Your magical typewriter could even transform your completed book manuscript into a course that anyone in the world could take. All you’d need to do is to add some quizzes and exercises and then press a key for your magical typewriter to publish an online course for you. The quizzes and exercises would mark themselves, and students would get certificates based on how well they did.

With your magical typewriter, you’d only have to do one thing:

Write.

Wouldn’t it be great if such a magical typewriter existed?

It does. At Leanpub, we’re building it.

But there’s one requirement for this magical typewriter to exist: a simple, coherent, open, free, plain text format for a book or course manuscript.

This simple format will be the basis for the magical typewriter.

This simple format is called Markua.

This is its specification.

1.2Markua: Markdown for books and courses (M)

Markua, pronounced “mar-coo-ah”, is Markdown for books and courses.

Markua maps Markdown syntax to book concepts, and then adds some new syntax and concepts of its own.

Markua documents can be automatically transformed into every popular type of ebook format. The computer programs which do this transformation are called Markua Processors. These programs understand both Markua syntax and how to generate the various output formats. An example Markua Processor is Leanpub: Leanpub can output PDF, EPUB and HTML from the same Markua document, and can even output print-ready PDFs and InDesign files from them as well.

Markua has been developed with extensive real-world testing and feedback. Markua has been used by Leanpub authors for years, both to create books and online courses.

Markua was started at Leanpub in 2014, and benefited from the years of lessons that Leanpub had learned from Leanpub Flavoured Markdown (LFM). Markua is the successor to LFM. We have been iterating on our Markua support for many years.

1.3Markua’s five key additions to Markdown (M)

Markua’s five main additions to Markdown are the following:

  1. The mapping of Markdown headings (h1, h2, h3, etc.) to book structures (chapters, sections, sub-sections, etc.), which provides the ability for Markua Processors like Leanpub to produce an ebook from a Markua manuscript with one click.
  2. The unified resource syntax, which lets Markua handle audio, code, iframe, math, verbatim and video resources in the same way that it does images and tables, and which supports inline, local and web resource locations.
  3. The attribute list syntax, which lets Markua define properties for resources using the same syntax that is used for other document elements, such as part headings and index entries.
  4. A comprehensive definition of document settings, so that authors can use Markua to produce documents which are styled via global switches, rather than by tedious micromanagement.
  5. The mapping of book structures to online courses, and the creation of a plain-text microformat for course quizzes and exercises, which supports automated marking and automated production of everything which is required for an online course.

If you have written something (say blog posts or lecture notes) in Markdown, you can use a Markua Processor, such as Leanpub, to turn them into an ebook or course with one click. Then, as you go down the path of enhancing the manuscript and adding things which only make sense in books or courses, this process will feel like decorating, not converting.

The goal is for writers who are familiar with Markdown to feel that with Markua, Markdown just grew an understanding of book and course concepts.

1.4Markua’s one key subtraction from Markdown: raw HTML is removed (M)

Markua also has one key subtraction from Markdown: support for raw HTML, either as inlines or blocks, is completely removed.

Markdown and CommonMark support raw HTML. Markua does not support raw HTML.

Instead, in Markua all raw HTML elements are removed.

This is true both for raw HTML which is inserted as leaf blocks, and raw HTML which is inserted as inlines. Neither of these are supported by Markua.

In both cases, raw HTML must be completely removed by the Markua Processor.

As a side effect of this, since HTML comments are HTML, authors can use HTML comments (<!-- a comment -->) as comments in Markua: since these comments are HTML, they will be completely removed from the output (along with any other raw HTML tags).

Markua and Markdown have different use cases. Markdown is a better way to write HTML; Markua is a better way to write a book or course.

HTML is just one possible output format of Markua, and other possible Markua output formats (such as LaTeX and PDF) are not based on HTML. If HTML blocks were supported, a Markua Processor would have to support parsing and meaningfully outputting all of HTML syntax as well as all of Markua syntax.

Since Markdown’s only output format target is HTML, it might as well support HTML blocks: generating HTML from HTML is as simple as passing the HTML through. From an implementation perspective, Markdown gets inline HTML support for free. This is not true in the case of Markua.

Note that besides raw HTML, Markua does not remove anything else from Markdown. (As is discussed later, the Markua spec is written with the CommonMark spec as a starting point, so that you can verify this for yourself.)

1.5How to write a novel in Markua (M)

The Markua specification is long. However, if you’re an author, the amount you need to learn to write in Markua is actually very short.

This example shows everything you need to know to write a novel in Markua:


# Chapter One

This is a paragraph. You just write.

Blank lines separate paragraphs. This is *italic text*.

This is another paragraph. You can manually wrap your paragraphs
however you want. Single newlines function like single spaces by
default.

* * *

That was a thematic break, which is used as a "scene break" in a novel.

# Chapter Two

This is a paragraph in a new chapter.

By the way, everything in the above example is identical in the following Markdown dialects:

In the simple case, Markua is just a mapping from Markdown concepts to book concepts. This is deliberate.

1.6How to write almost any book in Markua (M)

The longer example below shows everything you need to know to write almost any type of book, such as a computer programming book, in Markua. This example also serves as a tutorial which explains most of Markua’s important syntax in its content. Some parts of this syntax are just standard Markdown or CommonMark syntax, and other parts are Markua extensions.

# Paragraphs

This is a paragraph. You just write.

Blank lines separate paragraphs. This is *italic text* and **this is bold**.

This is another paragraph. You can manually wrap your paragraphs
however you want. Single newlines function like single spaces by
default.

* * *

That was a thematic break, which is used as a "scene break" in a novel.

# Lists

Here's a numbered list (called an "ordered list", even though all lists are
ordered):

1. foo
2. bar
3. baz

Here's a bulleted list (called an "unordered list", for irony):

* foo
* bar
* baz

You can even have definition lists!

term 1
: definition 1a
: definition 1b

term 2
: definition 2

# Tables

You can also use tables, which work best for numeric tabular data
involving a small number of columns containing small numbers:

| Central Bank | Rate      |
|--------------|-----------|
| JPY          | -0.10%    |
| EUR          |  0.00%    |
| USD          |  0.00%    |
| CAD          |  0.25%    |

Note that definition lists are preferred to tables for most use cases,
since typing text in a table quickly gets annoying.

# Headings

Markua supports both of Markdown's heading styles.

The preferred style, called atx headers, has the following meaning in Markua:

```
{class: part}
# Part

This is a paragraph.

# Chapter

This is a paragraph.

## Section

This is a paragraph.

### Sub-section

This is a paragraph.

#### Sub-sub-section

This is a paragraph.

##### Sub-sub-sub-section

This is a paragraph.

###### Sub-sub-sub-sub-section

This is a paragraph.
```

Note the use of three backticks in the above example, to treat the Markua like
inline code (instead of actually like headers).

The other style of headers, called Setext headers, has the following headings:

```
{class: part}
Part
====

This is a paragraph.

Chapter
=======

This is a paragraph.

Section
-------

This is a paragraph.
```

Setext headers look nice, but only if you're only using chapters and sections.
If you want to add sub-sections (or lower), you'll be using atx headers for at
least some of your headers. My advice is to just use atx headers all the time.
(The `{class: part}` attribute list on a chapter header to make a part header
does actually work with Setext headers, but it's really ugly.)

Note that while it is confusing and ugly to mix and match using atx and Setext
headers for chapters and sections in the same document, you can do it. However,
please don't.

Finally, headers and their mapping to document structure concepts are discussed
later in the Document Structure section.

# Images

You can add images like this:

![](mac.jpg)

You can specify the alt text and figure title like this:

![alt text for accessibility](image.png "Figure Title")

An example of the difference between alt text and a figure title is:

![a red apple](mac.jpg "The Original Mac")

You can also set the alt text and/or the figure title in an attribute list:

{alt: "a red apple", title: "The Original Mac"}
![](mac.jpg)

Attribute lists are one of Markua's additions to Markdown, and are discussed
later in this specification.

Finally, you can set an image figure title using the alt text, if the
`alt-title` document setting is set to `all`:

![The Original Mac](mac.jpg)

The default value of `alt-title` is `all`, which means that all resources
can have their title specified in the alt text. There is also a value of
`text`, which means that only text-based resources like code can have their
title specified in the alt text. Finally, there is a value of `none`, which
means that no resources can have their title specified in the alt text. The
`alt-title` document setting is discussed
[in great detail here](#the-alt-title-document-setting-m-).

# Document Settings

Various document settings can be specified at the start of a Markua document to
affect the behavior of the Markua Processor. For example, if you want single
newlines to insert a forced line break in the PDF, EPUB and HTML output formats
(instead of the single space that results by default), set the `soft-breaks`
document setting to `break` instead of the default of `space`:

```
{
soft-breaks: break
}

# Chapter One

This will be on a line
and this will be on the next line.
```

This `soft-breaks` setting is so important it is discussed in its own
section below.

# Code Samples

The generalization of the Markdown image syntax into resources is one of
Markua's most important additions to Markdown. For example, the image syntax
is the inspiration for the syntax for external code samples:

![](hello.rb "Hello World in Ruby")

Just like with images, you can also use an attribute list for the title:

{title: "Hello World in Ruby"}
![](hello.rb)

Note that you can also specify the figure title as the alt text, as long as
the document setting of `alt-title` is not set to `none`:

![Hello World in Ruby](hello.rb)

You can also have inline code samples, which can only have a title using an
attribute list:

{title: "Hello World in Ruby"}
```ruby
puts "hello"
```

You can also include single lines of code like `puts "hello"` in paragraphs
using backticks.

# Other Stuff

Note that you can easily add math `d = v_i t + \frac{1}{2} a t^2`$ as well,
either inline in a paragraph or as a figure, using LaTeX math as the format.

> Blockquotes are really easy too.
> --Peter Armstrong, *Markua Spec*

Markua has lots of features not discussed in this example. Read the manual or
the rest of the spec!

1.7For Leanpub authors (M)

There are three ways to write in plain text on Leanpub. These are their user manuals:

  1. Leanpub Flavoured Markdown (LFM) - LFM was launched in 2011, and evolved until 2013. It is now obsolete, and is on long-term support.
  2. Markua 0.10 - Markua 0.10 was launched in 2014, and evolved until 2021. It is currently the only way to write courses on Leanpub.
  3. Markua 0.30 Beta - The Markua 0.30 beta was launched on Leanpub in 2021. Currently, Markua 0.30 is the default for new books on Leanpub. The user manual for Markua 0.30 is not yet stable, so for now you’re much better off reading this spec!

Not all features in the Markua spec are supported in the Markua 0.30 beta. However, the Markua 0.30 beta is good enough that it is the default choice for new Leanpub books!

You can still choose to use Leanpub’s Markua 0.10 support, of course. This is also true if you want to create courses. Leanpub’s Markua 0.30 implementation does not support courses yet.

You can also choose to use LFM, but unless you’re dealing with a legacy Leanpub book, there’s no good reason to do so. You’re much better off using the Markua 0.30 beta than LFM.

Markua 0.30 is compared to LFM and Markua 0.10 in the next sections below. These sections are primarily here for the benefit of Leanpub authors, but also to show the origin of how Markua 0.30 evolved.

1.8Markua 0.30 vs. Leanpub Flavoured Markdown (LFM) (M)

If you have used Leanpub Flavoured Markdown (LFM) in the past, you may be curious about the main differences between Markua 0.30 and LFM.

First, to emphasize the most obvious point: If you are writing a new book or course on Leanpub, you should absolutely choose Markua over LFM. For new books, you should choose Markua 0.30. For new courses, you should choose Markua 0.10 for now, since we do not support Markua 0.30 for courses yet.

There is no good reason to start any new project using LFM. The only reason we continue to support LFM on Leanpub is for updates to legacy books.

There are many differences between Markua 0.30 and LFM. This list highlights 10 of the more important ones…

  1. In LFM, parts are created with -# Part. In Markua, parts are created with a {class: part} attribute list above an atx (# Foo) or Setext (using ===) chapter heading.
  2. In LFM, there is a special syntax for inserting code samples: <<[Optional Title](code/some_code_file.rb). In Markua, however, code is just a resource, and the LFM syntax is not supported.
  3. In LFM, to mark code as added or removed, the magic words were leanpub-start-insert, leanpub-end-insert, leanpub-start-delete and leanpub-end-delete. In Markua, the magic words are markua-start-insert, markua-end-insert, markua-start-delete and markua-end-delete.
  4. In LFM, there is a special syntax for inserting math: {$$}...{/$$}. This looks nice to people who like LaTeX, and looks like nothing else in Markdown. In Markua, however, math is just another resource, and that LaTeX-inspired syntax for wrapping math resources is not supported.
  5. In LFM, there are G> “generic boxes”. In Markua, these are replaced with blurbs (B>).
  6. LFM had the C> syntax to center text, but we didn’t have the same effect on generic boxes, and blurbs did not exist. In Markua, a C> syntax is just syntactic sugar for a centered blurb, for greater consistency. Because of this, the blurb also gets the ability to be centered by adding a {class: center} attribute.
  7. LFM had {rtl} and {ltr} directives. These are not supported in Markua, and neither is a {dir} attribute in general: any given language is either a left-to-right or a right-to-left language, so specifying the language with the lang document setting and the {lang: ___} directive is sufficient.
  8. LFM used Sample.txt to define the sample content. Markua moves the definition of what constitutes sample content into a {sample: true} attribute on parts, chapters and sections. So, in Markua, inclusion in the sample is at the content level, not the file level. This helps avoid a number of bugs that could happen with including at the file level, if a file did not clearly break at a chapter boundary. (So, in Leanpub, the Sample.txt approach is not supported for books which use Markua.)
  9. LFM used Book.txt to define the book content. In Markua, the way that the list of manuscript files is defined is considered out of scope of the Markua specification itself. (Leanpub still uses the Book.txt approach to specify which files are in a book, but other Markua Processors could use other approaches, or could just support parsing single files or input streams.)
  10. Most importantly, a number of features (courses, index entries, smart crosslinks, etc.) exist in Markua and do not exist in LFM.

1.9Migrating a Leanpub Flavoured Markdown (LFM) Book to Markua 0.30 (M)

Migrating a Leanpub Flavoured Markdown (LFM) Book to Markua 0.30 is actually really easy.

Here are the steps you do:

  1. Move the images folder inside a resources folder.
  2. Convert the part syntax, if you use parts.
  3. Generate the book, look at the output, and make any other changes needed.

These are all described below.

1.9.1Move the images folder inside a resources folder

In LFM, images live here:

manuscript/images/some_image.png

To convert this to Markua 0.30, the simplest thing to do is just move the images folder inside the resources folder:

manuscript/resources/images/some_image.png

This way, your image paths will be unchanged (as they will both start with images/).

1.9.2Convert the part syntax, if you use parts

In LFM, parts are defined like this:

-# Some Part Title

To convert this to Markua 0.30, you do this:

{class: part}
# Some Part Title

1.9.3Generate the book, look at the output, and make any other changes needed

The above list of changes are the main ones, but there are other differences between LFM and Markua 0.30.

However, at this point, the best thing to do is to just generate the book using Markua 0.30 instead of LFM, and see what is produced. Then, consult that list and the LFM manual to learn more about how to make any other changes needed to finish the migration.

One thing that Markua 0.30 defines which is not done in LFM is the concept of document settings, which are a set of global switches which customize the output of your book. You will want to review the standard document settings section below to learn more.

1.10Markua 0.30 vs. Markua 0.10 (M)

If you are writing a new book on Leanpub, you should absolutely choose Markua 0.30 over Markua 0.10.

If you are writing a new course on Leanpub, you’re stuck with Markua 0.10 for a while longer: our Markua 0.30 support don’t support courses yet.

Leanpub’s Markua 0.10 support is fully documented in its own user manual. Everything documented in that manual should work as described for Markua 0.10 books and courses, and our Markua 0.10 version will continue to be supported for years to come.

Leanpub’s Markua 0.10 support is based on a private fork of an open source Ruby library. It is not based on CommonMark or Pandoc. So, there are a number of quirks to our Markua 0.10 support. If you are using Markua 0.10 on Leanpub, consult the user manual, NOT this spec.

1.11Migrating a Markua 0.10 Book to Markua 0.30 (M)

When I defined how Markua 0.10 worked, I was a number of years younger than I am now. I cared less about compatibility with Markdown defaults, and more about trying to fix what I thought were mistakes.

This was youthful hubris, which Markua 0.30 corrects.

What Markua 0.30 does is functions like Markdown (minus inline HTML, and then extended) by default, but also provides the ability to override behavior using global document settings. These document settings allow you to customize the behavior of an entire Markua document via global switches. This way, the behavior of a Markua 0.30 document can be changed in sweeping ways via these global settings.

If you are updating a large Markua 0.10 document to Markua 0.30, you will probably want to start by doing these three steps:

  1. Rename all caption attributes to title
  2. Preview the book
  3. Update any document settings

These are all described below.

1.11.1Rename all caption attributes to title

First, you will want to do a find-replace in your Markua manuscript to update all caption: attributes to be title:, since the caption attribute was renamed to title in Markua 0.30 for consistency.

Now, if you’re like most Leanpub authors who wrote in Markua 0.10, chances are you did not use any caption attributes, but instead relied on the alt-title behavior. The alt-title document setting is discussed in great detail here.

1.11.2Preview the book

Before making any more changes, you should preview your book to see what it looks like unchanged.

1.11.3Update any document settings

After previewing the book, you will want to review the PDF output and decide on the appropriate values of the document settings.

If you want maximum compatibility with Markua 0.10, you will want to override the soft-breaks and two-space-hack settings as follows:

  1. Set the soft-breaks setting to break
  2. Set the two-space-hack setting to false

Note that both of these settings are against the default behavior of Markdown, so unless you are already relying on that behaviour in Markua 0.10, you should consider leaving them at their default values.

You can change these settings on Leanpub in the Settings > Generation Settings section of the author app, or you can set the document settings at the top of the first manuscript file like this:

{
    soft-breaks: break
    two-space-hack: false
}

# Chapter One

Lorem ipsum dolor...

Make sure you add a blank line below the document settings, and also make sure you put the curly braces on lines by themselves!

Note that you only define this list of document settings once, at the start of your first file in your Markua manuscript.

Besides the specifics of the soft-breaks and the two-space-hack document settings, the big point is that regardless of whether you are upgrading a large Markua 0.10 document to Markua 0.30 or starting a new Markua 0.30 document, you can make sweeping changes to how a Markua 0.30 document functions with a little bit of metadata at the top.

I strongly recommend that you review all of the document settings to learn what choices are available, and decide which make sense for you. There’s no one true right answer here; it’s entirely based on personal preference and whether you are starting a new book or upgrading an existing one.

1.12Origin and structure of this document (M)

This is the Markua Spec, written by Peter Armstrong.

The Markua Spec is based on the CommonMark Spec, whose author is John MacFarlane.

This document also incorporates the specification of two of the five GitHub Flavored Markdown (GFM) extensions from the GFM Spec.

Like the CommonMark Spec and the GFM Spec, the Markua Spec is licensed under the Creative Commons license CC-BY-SA 4.0.

This document has two types of intended readers:

  1. authors with advanced questions about Markua syntax
  2. developers implementing Markua Processors

The Markua Spec is a strange hybrid of the CommonMark Spec, the GFM Spec and the specification of the Markua extensions.

The starting point is the CommonMark Spec. With the exception of the removal of the HTML Blocks and Raw HTML sections, the entire CommonMark Spec is preserved unchanged.

Next, two of the five GFM extensions are added. These are discussed below.

Finally, many new chapters or sections about Markua are added. These are all labeled with “(Markua extension)”.

So, for any given chapter or section of this document, it is either:

  1. written by John MacFarlane about CommonMark
  2. written by GitHub employees about GFM, and ending with “(GFM)”
  3. written by Peter Armstrong about Markua, and ending with “(M)”

To be clear: every chapter or section which does not end with (M) or (GFM) is from the original CommonMark Spec, and was written by John MacFarlane.

Together, these sections combine to specify Markua, since Markua is CommonMark, minus HTML blocks and Raw HTML, plus two GFM extensions, plus many Markua extensions.

While this hybrid approach of creating the Markua specification is somewhat odd, it has a number of significant benefits:

  1. The CommonMark Spec is very good, and is licensed under the Creative Commons license CC-BY-SA 4.0. Not starting with the CommonMark Spec would be an exercise in vanity.
  2. The CommonMark Spec is the defacto standard specification of Markdown. Any specification not based on CommonMark will need to be compared to CommonMark, so the most helpful thing to do is to just start with CommonMark.
  3. Writing this specification as a series of extensions to CommonMark will enable people who are implementing it to just start with a compliant CommonMark implementation and modify it appropriately.
  4. Defining all Markua changes as extensions to CommonMark or deletions from CommonMark makes it abundantly clear whether something that Markua does originates in CommonMark, or whether it was added by GFM or by Markua. This is potentially helpful for authors, since it will show them what will work in other Markdown dialects, and what won’t.
  5. It means that updating this document to keep it current with the CommonMark and GFM specs will be very easy.

In terms of the GFM extensions, the following two GFM extensions are included in the Markua Spec:

  1. Tables (GFM)
  2. Strikethrough (GFM)

Their content is unchanged, but their titles are renamed to say “(GFM extension)” instead of “(extension)”, to more clearly specify their provenance.

The other three GFM extensions are omitted:

  1. Task list items is omitted since Markua is for books and courses, not task lists.
  2. Disallowed Raw HTML is omitted since Markua goes farther and completely removes all raw HTML, both HTML added as HTML blocks or as raw HTML inlines.
  3. Autolinks is omitted, because the extra work to create a link is literally typing two characters (< and >), so the added complexity both in parsing and in documentation is not worth it. (It does make sense for, say, a discussion forum. It does not make sense for books and courses.)

1.13About the HTML output (M)

The HTML mapping defined in this specification is NOT a complete specification of the HTML documents produced by Markua Processors.

Also, the HTML mapping defined in the specification is NOT considered to be canonical.

For example, consider the following Markua:

![a red apple](mac.jpg "The Original Mac")

A Markua Processor can produce HTML that looks like this:

<p><img src="mac.jpg" alt="a red apple" title="The Original Mac" /></p>

(That is what was produced at https://spec.commonmark.org/dingus/ on 2020-05-27, for example, and it is valid Markua as well as valid CommonMark.)

A Markua Processor can also, however, produce HTML that looks like this:

<p>
  <figure>
    <img src="mac.jpg"
         alt="a red apple">
    <figcaption>The Original Mac</figcaption>
  </figure>
</p>

Both approaches are completely legitimate, and have pros and cons.

So, why show any HTML at all? There are two reasons:

  1. The CommonMark Spec shows HTML, and the Markua Spec tries to only be additive to the CommonMark Spec.
  2. Showing HTML helps Markua Processors to test their parsers.

Finally, besides not being canonical, the HTML is not complete. It is only a specification of the parts of the HTML mapping where there is less need to have flexibility on the part of the Markua Processor.

The HTML mapping should be thought of specifying HTML fragments, not HTML documents. In all of the examples, a Markua Processor may add more HTML before and after the relevant content, as well as changing the relevant content itself.

If a Markua Processor wishes to test that it produces the correct HTML, it should test that the HTML produced contains the specified output, not that it is identical to the specified output.

Every example uses the default Markua document settings, unless otherwise specified. Any custom settings are specified in a document settings list in the top of the example.

1.14Why the name “Markua”? (M)

When I set out to specify Markua, I realized I needed a name. I wanted a name that conveyed the love that I have for Markdown while not implying endorsement by John Gruber in any way. I also did not want a name which referenced Leanpub: Markua is a standalone specification with its own identity, which anyone (including Leanpub competitors) can freely implement. Finally, I was on vacation in Hawaii when I named Markua, and I wanted something that sounded happy, friendly and almost Hawaiian. (Yes, I’m aware that there is no r in Hawaiian.) I also wanted a name that had its .com domain name available, and that was short and spellable, for branding purposes. The Markua name had all these properties.

So, I named it Markua, registered the domain name, and filed a trademark. Markua and the Markua logo is a registered trademark of Ruboss Technology Corporation, the company which created, owns and operates Leanpub.

1.15What is Markdown?

Markdown is a plain text format for writing structured documents, based on conventions for indicating formatting in email and usenet posts. It was developed by John Gruber (with help from Aaron Swartz) and released in 2004 in the form of a syntax description and a Perl script (Markdown.pl) for converting Markdown to HTML. In the next decade, dozens of implementations were developed in many languages. Some extended the original Markdown syntax with conventions for footnotes, tables, and other document elements. Some allowed Markdown documents to be rendered in formats other than HTML. Websites like Reddit, StackOverflow, and GitHub had millions of people using Markdown. And Markdown started to be used beyond the web, to author books, articles, slide shows, letters, and lecture notes.

What distinguishes Markdown from many other lightweight markup syntaxes, which are often easier to write, is its readability. As Gruber writes:

The overriding design goal for Markdown’s formatting syntax is to make it as readable as possible. The idea is that a Markdown-formatted document should be publishable as-is, as plain text, without looking like it’s been marked up with tags or formatting instructions. (https://daringfireball.net/projects/markdown/)

The point can be illustrated by comparing a sample of AsciiDoc with an equivalent sample of Markdown. Here is a sample of AsciiDoc from the AsciiDoc manual:

1. List item one.
+
List item one continued with a second paragraph followed by an
Indented block.
+
.................
$ ls *.sh
$ mv *.sh ~/tmp
.................
+
List item continued with a third paragraph.

2. List item two continued with an open block.
+
--
This paragraph is part of the preceding list item.

a. This list is nested and does not require explicit item
continuation.
+
This paragraph is part of the preceding list item.

b. List item b.

This paragraph belongs to item two of the outer list.
--

And here is the equivalent in Markdown:

1.  List item one.

    List item one continued with a second paragraph followed by an
    Indented block.

        $ ls *.sh
        $ mv *.sh ~/tmp

    List item continued with a third paragraph.

2.  List item two continued with an open block.

    This paragraph is part of the preceding list item.

    1. This list is nested and does not require explicit item continuation.

       This paragraph is part of the preceding list item.

    2. List item b.

    This paragraph belongs to item two of the outer list.

The AsciiDoc version is, arguably, easier to write. You don’t need to worry about indentation. But the Markdown version is much easier to read. The nesting of list items is apparent to the eye in the source, not just in the processed document.

1.16Why is a spec needed?

John Gruber’s canonical description of Markdown’s syntax does not specify the syntax unambiguously. Here are some examples of questions it does not answer:

  1. How much indentation is needed for a sublist? The spec says that continuation paragraphs need to be indented four spaces, but is not fully explicit about sublists. It is natural to think that they, too, must be indented four spaces, but Markdown.pl does not require that. This is hardly a “corner case,” and divergences between implementations on this issue often lead to surprises for users in real documents. (See this comment by John Gruber.)

  2. Is a blank line needed before a block quote or heading? Most implementations do not require the blank line. However, this can lead to unexpected results in hard-wrapped text, and also to ambiguities in parsing (note that some implementations put the heading inside the blockquote, while others do not). (John Gruber has also spoken in favor of requiring the blank lines.)

  3. Is a blank line needed before an indented code block? (Markdown.pl requires it, but this is not mentioned in the documentation, and some implementations do not require it.)

    paragraph
        code?
    
  4. What is the exact rule for determining when list items get wrapped in <p> tags? Can a list be partially “loose” and partially “tight”? What should we do with a list like this?

    1. one
    
    2. two
    3. three
    

    Or this?

    1.  one
        - a
    
        - b
    2.  two
    

    (There are some relevant comments by John Gruber here.)

  5. Can list markers be indented? Can ordered list markers be right-aligned?

     8. item 1
     9. item 2
    10. item 2a
    
  6. Is this one list with a thematic break in its second item, or two lists separated by a thematic break?

    * a
    * * * * *
    * b
    
  7. When list markers change from numbers to bullets, do we have two lists or one? (The Markdown syntax description suggests two, but the perl scripts and many other implementations produce one.)

    1. fee
    2. fie
    -  foe
    -  fum
    
  8. What are the precedence rules for the markers of inline structure? For example, is the following a valid link, or does the code span take precedence ?

    [a backtick (`)](/url) and [another backtick (`)](/url).
    
  9. What are the precedence rules for markers of emphasis and strong emphasis? For example, how should the following be parsed?

    *foo *bar* baz*
    
  10. What are the precedence rules between block-level and inline-level structure? For example, how should the following be parsed?

    - `a long code span can contain a hyphen like this
      - and it can screw things up`
    
  11. Can list items include section headings? (Markdown.pl does not allow this, but does allow blockquotes to include headings.)

    - # Heading
    
  12. Can list items be empty?

    * a
    *
    * b
    
  13. Can link references be defined inside block quotes or list items?

    > Blockquote [foo].
    >
    > [foo]: /url
    
  14. If there are multiple definitions for the same reference, which takes precedence?

    [foo]: /url1
    [foo]: /url2
    
    [foo][]
    

In the absence of a spec, early implementers consulted Markdown.pl to resolve these ambiguities. But Markdown.pl was quite buggy, and gave manifestly bad results in many cases, so it was not a satisfactory replacement for a spec.

Because there is no unambiguous spec, implementations have diverged considerably. As a result, users are often surprised to find that a document that renders one way on one system (say, a GitHub wiki) renders differently on another (say, converting to docbook using pandoc). To make matters worse, because nothing in Markdown counts as a “syntax error,” the divergence often isn’t discovered right away.

1.17About this document

This document attempts to specify Markdown syntax unambiguously. It contains many examples with side-by-side Markdown and HTML. These are intended to double as conformance tests. An accompanying script spec_tests.py can be used to run the tests against any Markdown program:

python test/spec_tests.py --spec spec.txt --program PROGRAM

Since this document describes how Markdown is to be parsed into an abstract syntax tree, it would have made sense to use an abstract representation of the syntax tree instead of HTML. But HTML is capable of representing the structural distinctions we need to make, and the choice of HTML for the tests makes it possible to run the tests against an implementation without writing an abstract syntax tree renderer.

Note that not every feature of the HTML samples is mandated by the spec. For example, the spec says what counts as a link destination, but it doesn’t mandate that non-ASCII characters in the URL be percent-encoded. To use the automatic tests, implementers will need to provide a renderer that conforms to the expectations of the spec examples (percent-encoding non-ASCII characters in URLs). But a conforming implementation can use a different renderer and may choose not to percent-encode non-ASCII characters in URLs.

This document is generated from a text file, spec.txt, written in Markdown with a small extension for the side-by-side tests. The script tools/makespec.py can be used to convert spec.txt into HTML or CommonMark (which can then be converted into other formats).

In the examples, the character is used to represent tabs.

2Preliminaries

2.1Characters and lines

Any sequence of characters is a valid CommonMark document.

A character is a Unicode code point. Although some code points (for example, combining accents) do not correspond to characters in an intuitive sense, all code points count as characters for purposes of this spec.

This spec does not specify an encoding; it thinks of lines as composed of characters rather than bytes. A conforming parser may be limited to a certain encoding.

A line is a sequence of zero or more characters other than line feed (U+000A) or carriage return (U+000D), followed by a line ending or by the end of file.

A line ending is a line feed (U+000A), a carriage return (U+000D) not followed by a line feed, or a carriage return and a following line feed.

A line containing no characters, or a line containing only spaces (U+0020) or tabs (U+0009), is called a blank line.

The following definitions of character classes will be used in this spec:

A Unicode whitespace character is a character in the Unicode Zs general category, or a tab (U+0009), line feed (U+000A), form feed (U+000C), or carriage return (U+000D).

Unicode whitespace is a sequence of one or more Unicode whitespace characters.

A tab is U+0009.

A space is U+0020.

An ASCII control character is a character between U+0000–1F (both including) or U+007F.

An ASCII punctuation character is !, ", #, $, %, &, ', (, ), *, +, ,, -, ., / (U+0021–2F), :, ;, <, =, >, ?, @ (U+003A–0040), [, \, ], ^, _, ` (U+005B–0060), {, |, }, or ~ (U+007B–007E).

A Unicode punctuation character is a character in the Unicode P (puncuation) or S (symbol) general categories.

2.2Tabs

Tabs in lines are not expanded to spaces. However, in contexts where spaces help to define block structure, tabs behave as if they were replaced by spaces with a tab stop of 4 characters.

Thus, for example, a tab can be used instead of four spaces in an indented code block. (Note, however, that internal tabs are passed through as literal tabs, not expanded to spaces.)

Example 1
→foo→baz→→bim
<pre><code>foo→baz→→bim
</code></pre>
Example 2
  →foo→baz→→bim
<pre><code>foo→baz→→bim
</code></pre>
Example 3
    a→a
    ὐ→a
<pre><code>a→a
ὐ→a
</code></pre>

In the following example, a continuation paragraph of a list item is indented with a tab; this has exactly the same effect as indentation with four spaces would:

Example 4
  - foo

→bar
<ul>
<li>
<p>foo</p>
<p>bar</p>
</li>
</ul>
Example 5
- foo

→→bar
<ul>
<li>
<p>foo</p>
<pre><code>  bar
</code></pre>
</li>
</ul>

Normally the > that begins a block quote may be followed optionally by a space, which is not considered part of the content. In the following case > is followed by a tab, which is treated as if it were expanded into three spaces. Since one of these spaces is considered part of the delimiter, foo is considered to be indented six spaces inside the block quote context, so we get an indented code block starting with two spaces.

Example 6
>→→foo
<blockquote>
<pre><code>  foo
</code></pre>
</blockquote>
Example 7
-→→foo
<ul>
<li>
<pre><code>  foo
</code></pre>
</li>
</ul>
Example 8
    foo
→bar
<pre><code>foo
bar
</code></pre>
Example 9
 - foo
   - bar
→ - baz
<ul>
<li>foo
<ul>
<li>bar
<ul>
<li>baz</li>
</ul>
</li>
</ul>
</li>
</ul>
Example 10
#→Foo
<h1>Foo</h1>
Example 11
*→*→*→
<hr />

2.3Insecure characters

For security reasons, the Unicode character U+0000 must be replaced with the REPLACEMENT CHARACTER (U+FFFD).

2.4Backslash escapes

Any ASCII punctuation character may be backslash-escaped:

Example 12
\!\"\#\$\%\&\'\(\)\*\+\,\-\.\/\:\;\<\=\>\?\@\[\\\]\^\_\`\{\|\}\~
<p>!&quot;#$%&amp;'()*+,-./:;&lt;=&gt;?@[\]^_`{|}~</p>

Backslashes before other characters are treated as literal backslashes:

Example 13
\→\A\a\ \3\φ\«
<p>\→\A\a\ \3\φ\«</p>

Escaped characters are treated as regular characters and do not have their usual Markdown meanings:

Example 14
\*not emphasized*
\<br/> not a tag
\[not a link](/foo)
\`not code`
1\. not a list
\* not a list
\# not a heading
\[foo]: /url "not a reference"
\&ouml; not a character entity
<p>*not emphasized*
&lt;br/&gt; not a tag
[not a link](/foo)
`not code`
1. not a list
* not a list
# not a heading
[foo]: /url &quot;not a reference&quot;
&amp;ouml; not a character entity</p>

If a backslash is itself escaped, the following character is not:

Example 15
\\*emphasis*
<p>\<em>emphasis</em></p>

A backslash at the end of the line is a hard line break:

Example 16
foo\
bar
<p>foo<br />
bar</p>

Backslash escapes do not work in code blocks, code spans, autolinks, or raw HTML:

Example 17
`` \[\` ``
<p><code>\[\`</code></p>
Example 18
    \[\]
<pre><code>\[\]
</code></pre>
Example 19
~~~
\[\]
~~~
<pre><code>\[\]
</code></pre>
Example 20
<https://example.com?find=\*>
<p><a href="https://example.com?find=%5C*">https://example.com?find=\*</a></p>
Example 21
<a href="/bar\/)">
<a href="/bar\/)">

But they work in all other contexts, including URLs and link titles, link references, and info strings in fenced code blocks:

Example 22
[foo](/bar\* "ti\*tle")
<p><a href="/bar*" title="ti*tle">foo</a></p>
Example 23
[foo]

[foo]: /bar\* "ti\*tle"
<p><a href="/bar*" title="ti*tle">foo</a></p>
Example 24
``` foo\+bar
foo
```
<pre><code class="language-foo+bar">foo
</code></pre>

2.5Entity and numeric character references

Valid HTML entity references and numeric character references can be used in place of the corresponding Unicode character, with the following exceptions:

Conforming CommonMark parsers need not store information about whether a particular character was represented in the source using a Unicode character or an entity reference.

Entity references consist of & + any of the valid HTML5 entity names + ;. The document https://html.spec.whatwg.org/entities.json is used as an authoritative source for the valid entity references and their corresponding code points.

Example 25
&nbsp; &amp; &copy; &AElig; &Dcaron;
&frac34; &HilbertSpace; &DifferentialD;
&ClockwiseContourIntegral; &ngE;
<p>  &amp; © Æ Ď
¾  ⅆ
∲ ≧̸</p>

Decimal numeric character references consist of &# + a string of 1–7 arabic digits + ;. A numeric character reference is parsed as the corresponding Unicode character. Invalid Unicode code points will be replaced by the REPLACEMENT CHARACTER (U+FFFD). For security reasons, the code point U+0000 will also be replaced by U+FFFD.

Example 26
&#35; &#1234; &#992; &#0;
<p># Ӓ Ϡ �</p>

Hexadecimal numeric character references consist of &# + either X or x + a string of 1-6 hexadecimal digits + ;. They too are parsed as the corresponding Unicode character (this time specified with a hexadecimal numeral instead of decimal).

Example 27
&#X22; &#XD06; &#xcab;
<p>&quot;  ಫ</p>

Here are some nonentities:

Example 28
&nbsp &x; &#; &#x;
&#87654321;
&#abcdef0;
&ThisIsNotDefined; &hi?;
<p>&amp;nbsp &amp;x; &amp;#; &amp;#x;
&amp;#87654321;
&amp;#abcdef0;
&amp;ThisIsNotDefined; &amp;hi?;</p>

Although HTML5 does accept some entity references without a trailing semicolon (such as &copy), these are not recognized here, because it makes the grammar too ambiguous:

Example 29
&copy
<p>&amp;copy</p>

Strings that are not on the list of HTML5 named entities are not recognized as entity references either:

Example 30
&MadeUpEntity;
<p>&amp;MadeUpEntity;</p>

Entity and numeric character references are recognized in any context besides code spans or code blocks, including URLs, link titles, and fenced code block info strings:

Example 31
<a href="&ouml;&ouml;.html">
<a href="&ouml;&ouml;.html">
Example 32
[foo](/f&ouml;&ouml; "f&ouml;&ouml;")
<p><a href="/f%C3%B6%C3%B6" title="föö">foo</a></p>
Example 33
[foo]

[foo]: /f&ouml;&ouml; "f&ouml;&ouml;"
<p><a href="/f%C3%B6%C3%B6" title="föö">foo</a></p>
Example 34
``` f&ouml;&ouml;
foo
```
<pre><code class="language-föö">foo
</code></pre>

Entity and numeric character references are treated as literal text in code spans and code blocks:

Example 35
`f&ouml;&ouml;`
<p><code>f&amp;ouml;&amp;ouml;</code></p>
Example 36
    f&ouml;f&ouml;
<pre><code>f&amp;ouml;f&amp;ouml;
</code></pre>

Entity and numeric character references cannot be used in place of symbols indicating structure in CommonMark documents.

Example 37
&#42;foo&#42;
*foo*
<p>*foo*
<em>foo</em></p>
Example 38
&#42; foo

* foo
<p>* foo</p>
<ul>
<li>foo</li>
</ul>
Example 39
foo&#10;&#10;bar
<p>foo

bar</p>
Example 40
&#9;foo
<p>→foo</p>
Example 41
[a](url &quot;tit&quot;)
<p>[a](url &quot;tit&quot;)</p>

2.6Emoji (M)

In 2015, emoji fully arrived. The 2015 Oxford Dictionaries Word of the Year was, in fact, the Face with Tears of Joy emoji. You may think of it as a smiling face with tears of joy, but you can also can think of it as &#x1F602;, which is the HTML entity reference to its Unicode character (in hexadecimal).

However, Unicode characters aren’t the only way to do emoji. Another popular syntax for emoji is :emoji_name: – that is a colon (:), followed by the underscore-separated name of the emoji, followed by a colon.

Markua supports both the HTML character reference approach for emoji as well as the :emoji_name: syntax.

Which emoji are supported is up to the Markua Processor.

One recommended list of emoji names is at https://www.webfx.com/tools/emoji-cheat-sheet/.

The preferred thing to do when a Markua Processor recognizes an emoji is to replace it with the Unicode code point HTML entity reference, assuming that those are also handled correctly:

This makes me &#x1F602;.

This makes me :joy:.
```
<p>This makes me &#x1F602;.</p>
<p>This makes me &#x1F602;.</p>

However, it is also acceptable for a Markua Processor to just output the “tofu” characters for the HTML entity references while still supporting the :emoji_name: style emoji. Presumably with this approach, either a set of images or an emoji font will be used for the supported emoji.

Finally, if Font Awesome icons are going to be supported by a Markua Processor, using the :emoji_name: syntax, they must have their :fa- prefix which they always come with. So, the Leanpub logo in Font Awesome would be output as :fa-leanpub:.

3Document semantics (M)

As discussed in the introduction, Markua is an extension to Markdown. The goal is for Markua to feel as though Markdown just magically grew a knowledge of how to produce books and courses.

Recall that Markua’s five main additions to Markdown include the mapping of Markdown headings to book structures, the unified attribute list syntax and a comprehensive definition of document settings. These three key additions are discussed in this section. (The other two key additions, resources and online courses, are both big enough to need their own sections.

Markua adds a number of extensions to support metadata in a consistent way. These include attributes (on everything), settings (at the beginning of a document) and directives (at certain points throughout the document). All metadata is enclosed in curly braces {}.

Metadata is not actually output in the Markua document itself. Instead, it changes the behavior of the Markua Processor, sometimes dramatically.

First, however, it is important to start with how Markua headers and metadata are used to define document structure.

3.1How headers define document structure (M)

Markua is a way of writing books and courses. Books have things like chapters, sections and subsections. Courses typically call their chapters “modules”, but it’s the same idea. Sometimes books or courses organize their chapters or modules in parts.

Markdown defines two types of heading styles.

Markua’s first innovation is to map both of these heading styles to document structure concepts.

The preferred style, called atx headers, has the following meaning in Markua:

{class: part}
# Part

This is a paragraph.

# Chapter

This is a paragraph.

## Section

This is a paragraph.

### Sub-section

This is a paragraph.

#### Sub-sub-section

This is a paragraph.

##### Sub-sub-sub-section

This is a paragraph.

###### Sub-sub-sub-sub-section

This is a paragraph.

The other style of headers, called Setext headers, defines two levels of heading only. Here’s how those are used in Markua:

{class: part}
Part
====

This is a paragraph.

Chapter
=======

This is a paragraph.

Section
-------

This is a paragraph.

Setext headers look nice, but only if you’re only using chapters and sections. If you want to add sub-sections (or lower), you’ll be using atx headers for at least some of your headers. My advice is to just use atx headers all the time. (The {class: part} attribute list on a chapter header to make a part header does actually work with Setext headers, but it’s really ugly.)

Note that while it is confusing and ugly to mix and match using atx and Setext headers for chapters and sections in the same document, you can do it. However, please don’t.

Now, regardless of whether you are using the atx or Setext header style of header, the key point is that Markua is mapping the headers to document structure concepts. The header defines the name of the part, chapter, section, etc., and the content underneath it becomes the content of that given thing (until the next header is reached).

In the HTML output, the Part and Chapter heading (for atx or Setext headers) both produce an h1, the Section heading produces an h2, and the Sub-Section heading produces an h3. Lower levels go down to h6.

To be clear, these are the Markdown “atx” header mappings:

# : Used for a Chapter heading when there is no {class: part} attribute list, and used for a Part heading when there is. This produces an h1 in HTML.

## : Used for a Section heading. This produces an h2 in HTML.

### : Used for a Sub-Section heading. This produces an h3 in HTML.

#### : Used for a Sub-Sub-Section heading. This produces an h4 in HTML.

##### : Used for a Sub-Sub-Sub-Section heading. This produces an h5 in HTML.

###### : Used for a Sub-Sub-Sub-Sub-Section heading. This produces an h6 in HTML.

The reason that a Part heading and a Chapter heading both use the # symbol, and are only differentiated by the presence of a {class: part} attribute list, is that not all books have parts, so it would be awkward to start chapters with the second-level heading.

{class: part}
# Part

This is a paragraph.

# Chapter

This is a paragraph.

## Section

This is a paragraph.

### Sub-section

This is a paragraph.

#### Sub-sub-section

This is a paragraph.

##### Sub-sub-sub-section

This is a paragraph.

###### Sub-sub-sub-sub-section

This is a paragraph.
<h1 class="part">Part</h1>
<p>This is a paragraph.</p>
<h1>Chapter</h1>
<p>This is a paragraph.</p>
<h2>Section</h2>
<p>This is a paragraph.</p>
<h3>Sub-section</h3>
<p>This is a paragraph.</p>
<h4>Sub-sub-section</h4>
<p>This is a paragraph.</p>
<h5>Sub-sub-sub-section</h5>
<p>This is a paragraph.</p>
<h6>Sub-sub-sub-sub-section</h6>
<p>This is a paragraph.</p>

These are the Setext header mappings:

{class: part}
Part
====

This is a paragraph.

Chapter
-------

This is a paragraph.
<h1 class="part">Part</h1>
<p>This is a paragraph.</p>
<h1>Chapter</h1>
<p>This is a paragraph.</p>

In the HTML output, the only difference between a part and chapter heading on the heading itself is that there is a class="part" attribute shown. However, obviously, the part and chapter headings are treated differently elsewhere, such as in the table of contents of the book or the navigation structure of the course.

The {class: part} is an example of an attribute list, which is another of Markua’s main additions to Markdown. There can be no blank line between the attribute list and the header it is attached to. Attribute lists are discussed in the Metadata section later.

Note that adding a {class: part} attribute list to any other heading level than an h1 is ignored by Markua.

Here’s an example of this being incorrectly added to an atx section header, which will have no effect:

{class: part}
## Section Heading

Foo
<h2>Section Heading</h2>
<p>Foo</p>

Here’s an example of this being incorrectly added to a Setext section header, which will also have no effect:

{class: part}
Section Heading
---------------

Foo
<h2>Section Heading</h2>
<p>Foo</p>

3.2Attribute lists (M)

Attribute lists are used to define everything from specify the language of code blocks, add ids for crosslinking and even support extensions to Markua.

3.2.1Attribute list format

An attribute list is one or more key-value, comma-separated pairs:

{one: v1, two: "v2", three: 'v3!', four: true, five: 0, six: 3.14, seven: a b}

Note that you can skip the space between the colon and the value: the following {format: ruby} and {format:ruby} both work. However, for consistency I recommend always adding a space.

Note that attribute values can be in no quotes, in double quotes (") or single quotes ('). Whenever an attribute contains spaces, using either double or single quotes is preferred to no quotes, but you can get away with using no quotes as long as the attribute value does not contain a comma.

The choice of double or single quotes is mostly personal taste. However, inside double quotes, a double quote must be backslash-escaped (\"); inside single quotes, a single quote must be backslash-escaped (\'). So, if your attribute value has a lot of double quotes, then it’s more convenient to wrap it in single quotes, and vice-versa.

Regardless of whether quotes are used, leading and trailing spaces are removed from all attribute values, but internal spaces within the attribute values are preserved.

An attribute list can be inserted into a Markua document in one of three ways:

  1. Immediately above a block element (e.g. heading, figure, aside, blurb, quiz, etc.), with one newline (not a blank line) separating it from the block element.
  2. Immediately after a span element (e.g. a word, italicized phrase, etc.) in normal paragraphs and in similarly-simple contexts, with no spaces separating it from the span element.
  3. On a line by itself, with one blank line above and below it. In this format, the attribute list contains directives.

One common use of attributes is to add id attributes to headings:

{id: foo}
# Chapter Foo

foo
<h1 id="foo" class="chapter">Chapter Foo</h1>
<p>foo</p>

You can also use an {#id} syntactic sugar:

{#bar}
# Chapter Bar

bar
<h1 id="bar" class="chapter">Chapter Bar</h1>
<p>bar</p>

One common use of attributes is to add id attributes to span elements:

Here [is lorem]{id: lorem}.

This is ipsum{#ipsum}.
<p>Here <span id="lorem">is lorem</span>.</p>

<p>This is <span id="ipsum">ipsum</span>.</p>

If there is an error in the syntax of an attribute list, or if the Markua Processor does not support an attribute list in a given context, it should just ignore the attribute list and add an appropriate error.

Any line outside of a code resource which starts with an opening curly brace { and ends with a closing curly brace } is assumed to be an attribute list, and will not be output by a Markua Processor. If you want to start a line with a literal opening curly brace { you need to preface it with a backslash (\).

You cannot add attribute lists inside headers:

# Chapter Foo{id: foo}

foo
<h1 class="chapter">Chapter Foo</h1>
<p>foo</p>

3.2.2Attribute Keys

The keys of attributes must consist exclusively of lowercase letters, hyphens (-) and underscores (_). Uppercase letters are not permitted in attribute keys: a Markua Processor must treat uppercase letters in attribute keys as an error.

If a key is duplicated in an attribute list, the first key value is used and subsequent ones are ignored. A Markua Processor should add a warning in its list of warnings, which are not output in the output itself.

To be clear, the keys of attributes have nothing to do with HTML attributes, even though HTML is one of the output formats of Markua. Markua does not have any particular knowledge of HTML attributes; Markua understands Markua attributes. Some Markua attributes, like class, happen to have the same name as some attributes in HTML. This is just coincidence, and many others do not.

3.2.3Attribute Values

All attributes are text. Markua Processors should interpret text values of “true” and “false” as representing true and false. Quotes, by which I mean double quotes (") not single quotes ('), are optional for attribute values, and are only needed if the attribute value contains whitespace or special characters.

If a text attribute value contains a quote, it must be “escaped” with a backslash: e.g. {title: "\"Fresh\" Fish"}

3.2.4id Attributes

As previously discussed, there is special syntactic sugar for ids: {#foo} is equivalent to {id: foo}. However, ids are just attributes.

3.2.5title Attributes

Markua headings (part, chapter, section, sub-section, etc.) and figures can all have title attributes specified in an attribute list. This is text which overrides what is displayed for the heading or figure in the table of contents. For a heading, it is analogous to the title attribute on a resource inserted as a figure, which specifies the text to use for the figure in the appropriate list of figures (e.g. List of Illustrations, Table of Tables, etc.). If a heading does not have a title attribute, the text of the heading itself is used–which is quite often exactly what is desired. Use of a title attribute is always optional; it’s only used when the default behavior of using the heading text (or the title attribute for a resource) is not appropriate, say if it’s too long.

3.2.6class Attributes

One of the attributes which is supported in every attribute list is the class attribute. This allows a Markua Processor to do smart things about formatting.

For example, if a chapter has a {class: bibliography} added to it, then the Markua Processor can then use this information to change the indentation of paragraphs within it to be hanging paragraphs.

3.2.7Conditional Inclusion Attributes on Headings

Markua headings (and only headings) may have various attributes which specify which output formats their content (of the part, chapter, section, sub-section, etc.) should be included in. If the given attribute is not present, the default value of it is that specified by the nearest ancestor heading. If no such attribute is present at a top-level heading, the default is given by the default value for the attribute defined of Markua.

full : true or false. The default is true. If true, include this heading and its content (including nested sections, subsections, etc.) in the full book or course, including the PDF and EPUB versions and the web version that is being generated. If false, omit it. Setting this to false is an easy way to “comment out” a section of your book or course.

sample : true or false. The default is false. If true, include the content under this heading (including nested sections, subsections, etc.) in the sample of the book or course that is being generated. If false, omit it. Since the default is false, by default a sample book or course is empty. Note that this attribute just governs the inclusion of the content, not the heading itself. A Markua Processor may choose to omit all headings with sample: false (either explicitly set or defaulted to false) or it may choose to include every heading in the sample version of a book or course, in order to produce a representative Table of Contents. In a case such as this where sample is false, the Markua Processor may output special content inside the chapter, section or subsection to indicate that the content itself is being omitted from the sample. This attribute applies to both the book version (PDF and EPUB) and the web version of the sample book or course.

Note that specifying either of these attributes in a nested section overrides any value inherited from its ancestors, or from the default. This way, you can include a chapter in the sample, except for a specific section of the chapter.

Example:

{sample: true}
# Chapter One

This is included in the sample.

## Section One

This is included in the sample since it is contained in a chapter which is.

{sample: false}
## Section Two

This is *not* included in the sample since it is explicitly excluded, despite
the fact that the chapter is in the sample.

{sample: true}
## Section Three

This is included in the sample. This is redundant since it's in a chapter which
is.

{full: false, sample: true}
# Buy the book!

What you read was just a sample. Why not buy the full book?

# Chapter Two

This has the default values, so it is included in the book or course, but is
excluded from the sample.

To be clear: ALL conditional inclusion attributes ONLY have meaning when used as an attribute list on headings. You can only say {sample: true} immediately above a heading. You can’t have a blank line below it (otherwise it would be a directive, and not be valid) and you can’t attach it to anything other than a heading (like a paragraph, figure, etc.).

Finally, as discussed later, Markua supports “extension attributes”, to allow implementations to add attributes for purposes which are not in the scope of the Markua spec. One such attribute added by Leanpub is the {community: true} attribute, which functions like a {sample: true} attribute to produce a sample book. However, the {community: true} attribute results in Leanpub producing a sample book which is only available if the reader provides an email address for the download. This type of feature obviously does not belong in the Markua spec; the point, however, is that extension attributes make it so easy to add these types of features that it is almost as though they were in the spec.

3.2.8Span attribute lists

Surrounding text in square brackets can be useful not just for giving it a URL to link to. If you wish to add attributes to an arbitrary span of text, you can create an arbitrary span of text using square brackets and then add an attribute list immediately afterward. You can use any attribute list on this span, and you can also just use the id syntactic sugar {#theid} on this span. The most common uses of this are to add id attributes or index entries. (Index entries are discussed later.)

Some text [then a span]{foo: lorem, bar: ipsum} and more text.

This [span has an id]{#hello}, so hooray!

This span [also has an id]{id: world}. 

This span contains [古池や蛙飛び込む水の音]{lang: jpn} Japanese text.
<p>Some text <span foo="lorem" bar="ipsum">then a span</span> and more text.</p>
<p>This <span id="hello">span has an id</span>, so hooray!</p>
<p>This span <span id="world">also has an id</span>.</p>
<p>This span contains <span lang="jp">古池や蛙飛び込む水の音</span> Japanese text.</p>

Note, however, that you cannot start a normal span with a caret (^): this creates a [^footnote] instead. (Footnotes are discussed below.)

3.2.9Extension attributes

Markua Processors may encounter attributes which they do not understand.

Whenever this happens, these attributes must be filtered from the output. A Markua Processor should function as though there is a whitelist of attributes which it permits for each element, and filter everything else.

Because of this, Markua attribute lists can contain any number of extension attributes. An extension attribute is an attribute which is not defined in the Markua specification. This is true whether the attributes are inserted in an attribute list attached to a span, block or even in free-floating directives.

Because of the fact that all unrecognized attributes are filtered, it is possible for a Markua Processor to add extension attributes that only it understands. This encourages competition in the Markua ecosystem, while ensuring that Markua implementations do not choke on Markua input which goes beyond their capabilities.

For example, Leanpub supports an icon attribute on blurbs. If a different Markua processor does not support this attribute, there is no harm done: the attribute just has no effect.

Extension attributes go far beyond adding icons to blurbs: they allow for specialized uses of Markua. Since CSS is so powerful, with creative uses of custom attributes and custom CSS, Markua documents can be transformed. Some obvious uses of extension attributes include adding CSS classes which can then be styled to set fonts, etc.

This ensures that new attributes can be added to future versions of Markua without a negative effect on older Markua implementations. It also ensures that new versions of Markua can simply stop supporting attributes defined in this version of Markua without needing to specify anything special.

3.3Directives (M)

Directives are switches which affect the behavior of a Markua Processor.

The syntax for directives is {simple}. A directive is surrounded in curly braces, and should have a blank line above and below it.

Directives do not have a value. Instead, their presence is all that is required. Here’s the syntax for an imaginary directive foo:

some content

{foo}

some content

To be clear: a directive does NOT contain {key: value} pairs. That is an attribute list, not a directive.

The absence of key-value pairs is what makes a directive a directive. It is a keyword by itself, enclosed in curly braces on a line by itself. So, the following is also a directive, albeit a poorly-formatted one:

some content
{foo}
some content

The specific directives are discussed next.

3.3.1The pagebreak directive

Although authors are encouraged to not think about page breaks, this directive is a nod to reality: sometimes an author will really, really want to insert a page break.

This is done with the {pagebreak} directive:

lorem ipsum dolor

{pagebreak}

foo bar baz

Again, the use of this directive is discouraged. Authors should spend their time writing, not formatting.

3.3.2The mainmatter and backmatter directives

Most published books have three types of material in them: the front matter, the text (or “main matter”) and the back matter. (Yes, this applies to courses as well as books, since many Markua Processors like Leanpub can generate a book out of the course material.)

What authors write, the manuscript, is typically what goes into the text, or main matter, of the book. In style guidelines this is often called the text; in formats such as TeX and LaTeX it is called main matter. It’s what is typically numbered with Arabic numerals starting from 1. In Markua, the first page of the mainmatter will be numbered page 1, regardless of whether or not the page number will be shown on that page.

There’s a bunch of other stuff (the Dedication, Epigraph, Table of Contents, Foreword, Preface, etc.) which can come before the main text of the book. The stuff before the main matter is called “front matter”. The special pages that can be created by insertion directives are not numbered. The rest of the frontmatter follows the same page numbering rules as the main matter, except that numbering uses Roman numerals. The first page of the frontmatter will be numbered page i, regardless of whether or not the page number will show on that page.

There’s also a bunch of other stuff (appendices, the index, etc.) which can come after the main matter. This is called the “back matter”.

If Markua just relied on its headings support there would be no good way to accomplish the division of a manuscript into front matter, main matter and back matter. (We could try some convention about heading names, but that would be a highly objectionable, English-centric hack.)

So, this is where the mainmatter and backmatter directives come in. With this approach, your book is essentially the following:

any frontmatter content, like a dedication, preface or introduction

{mainmatter}

the body of your book (the text or main matter)

{backmatter}

any back matter content, such as appendices

Note that there is no need for a {frontmatter} directive, so it does not exist. A Markua Processor should ignore it if it is encountered, but a warning can be provided.

The {mainmatter} and {backmatter} directives act as dividers, separating a book or course into logical groups of content.

While these mainmatter and backmatter directives are merely optional hints, there is a very strict rule about their use: each of them ({mainmatter}, {backmatter}) can only appear once in a document.

Now, note that the most minimal way to write a book is to use no section directives at all. With this approach, everything is in the text (main matter) of the book. Page numbering is in Arabic numerals, etc.

For authors who do not know about the section directive, this is what they are doing. Nothing bad or unexpected will happen: they can write their book, and it will look correct. Only when they go to add things like a preface or an appendix, and feel the need for different numbering, will they need to discover the section directives. Then, they can add them to their Markua document as appropriate.

3.3.3The insertion directives (dedication, toc, etc.)

With a book or course, certain types of content gets created by the Markua Processor, or in metadata provided by the author. This content then needs to be positioned in the book or course.

While a Markua Processor can adopt sensible defaults, sometimes an author wants more fine-grained control over where this automatically-generated content or user-provided metadata content is inserted. That’s what the various insertion directives are for: to specify the positioning of where this content is inserted.

Here’s an example book:

some front matter content

{dedication}

more front matter content

{toc}

{figures}

even more front matter content

{mainmatter}

the main matter content

{backmatter}

some back matter content

{index}

more back matter content

{quiz-answers}

The following the insertion directives which can only occur in the front matter, which is everything before the {mainmatter} directive:

To be clear, these directives will not work if they are inserted after the {mainmatter} directive.

Also, note that there is no more {frontmatter} directive in Markua. It existed in Markua 0.10, but it is redundant: front matter is just the stuff which is comes before a {mainmatter} directive. (You should remove it when migrating a Markua 0.10 document to the version of Markua specified here, but if you forget, don’t worry: Leanpub will ignore the {frontmatter} directive, and other Markua processors should do so as well.)

Note that a Markua Processor can be smart about certain things, if it wants to. For example, it can choose to not number the pages before the Table of Contents, but to number the pages after the Table of Contents. Then, the author can use the {toc} directive to choose where in the front matter the Table of Contents is positioned. This will not only affect its position, it will also affect the numbering of content before and after it.

The following are the insertion directives which can occur in the back matter, which is everything after the {backmatter} directive:

The exercise-answers and quiz-answers directives are used to position the answers to any exercises or quizzes in the text of the Markua document.

If neither of these directives are present, a Markua Processor should position any exercise answers somewhere near the back of the book (in the back matter, if it exists). For quiz answers, on the other hand, a Markua Processor may do whatever it wants in terms of whether the quiz answers are included in the book, regardless of the presence or position of the quiz-answers directive.

For example, in Leanpub’s online courses, the quiz answers are only provided when quizzes are completed and automatically marked, and are never output in the material book for the course.

3.3.4The lang-* directives

There is one more set of directives, the lang-* directives. These all start with lang-.

The lang-* directives are important enough that they are discussed in the next section, which also discusses the lang attributie and document setting.

3.4The lang document setting, lang attribute and lang directives (M)

When generating books, knowing the language being used is very important. This enables a Markua Processor to do two things:

  1. Know which fonts to use (since the fonts to use should take into account the language of the text).
  2. Know the direction of the text (left-to-right or right-to-left).

Note that the way which default fonts are specified for each language in a Markua document is not defined in this specification. At a bare minimum, a Markua Processor should allow users to specify the font to use for each language.

For example, Leanpub defines standard themes for each supported language, which specify default fonts to use for titles, body text and code samples. Leanpub authors can also create custom themes, with their own font choices for each language, of course. Leanpub authors can even make these custom themes reusable, so that they can use one theme for all their books.

Anyway, there are three ways for an author to indicate what language is being used at any given time:

  1. The lang document setting, which sets the default language.
  2. The lang attribute, which temporarily overrides the default language for a particular span or block.
  3. The lang-* directives, which set a new default language. This will be the default language until another lang-* directive is encountered. (There is no directive called lang-* itself; what this refers to is a family of directives like lang-eng, lang-jpn, lang-zh-Hant, etc.)

In all three cases, the value being set is the language code of the language that the Markua document is written in. This language code is found in IS0 639-3, with two exceptions:

  1. Traditional Chinese is zh-Hant.
  2. Simplified Chinese is zh-Hans.

Since English is the default language of a Markua document, all Markua Processors must support the eng value. Support for all other language codes (or lang-* directives) is optional. If the language code given as the value of a lang-* directive is not supported or is unrecognized, it must be interpreted as eng and switch the font accordingly and switch the direction to left-to-right.

3.4.1The lang document setting

A Markua document has a global lang document setting, which sets the default language. All document settings have default values; the default value of the lang document setting is eng.

Document settings are discussed here.

For a example of the lang document setting in action, see the section “A detailed example” below.

3.4.2The lang attribute

A lang attribute can be applied to a particular block or span.

After that block or span concludes, the lang immediately reverts to the default language.

Here is some English text.

{lang: jpn}
古池や蛙飛び込む水の音

{lang: jpn}
春過ぎて夏来にけらし白妙の衣干すてふ天の香具山

This is some paragraph text which defaults to English, and it
contains [古池や蛙飛び込む水の音]{lang: jpn} a span which is in Japanese.

This is some paragraph text which still defaults to English.

This example used attribute lists on two paragraphs and on a span.

The default language of the document stayed as eng throughout, even though two paragraphs and one span had their language set to Japanese.

Having to specify the lang attribute list on every single paragraph which did not use the default language would get annoying very quickly! This is why the lang-* directives exist.

3.4.3The lang-* directives

NOTE FOR LEANPUB AUTHORS: The lang-* directives are not yet supported on Leanpub. We will add support for them sometime in Q1 2024, and remove this note (TODO) when this is done.

The lang-* directives set a new default language. This will be the default language until another lang-* directive is encountered.

This example shows how to use the lang-* directives to temporarily switch the default language of a Markua document:

Here is some English text.

The Markua Processor knows it is in English since there is no document settings
block, so the default document setting for `lang` is `eng`.

{lang-jpn}

古池や蛙飛び込む水の音

春過ぎて夏来にけらし白妙の衣干すてふ天の香具山

{lang-eng}

Here is some more English text.

For another example of the lang-* directives, see the section “A detailed example” below.

3.4.4A detailed example

The following example shows a number of uses of the lang document setting, the lang attribute, and the lang-* directives.

{
  lang: eng
}

# Chapter One

Note that there was a document settings block at the beginning of this Markua
document. It set the default language to `eng`. Since that is the default of
the `lang` document setting, that was redundant.

This is still in English. Now we will use the `lang-jpn` directive to switch
the default language to Japanese, so that we can output some Haiku
by 松尾芭蕉{lang: jpn} (Matsuo Bashō). (Note that this was a use of the `lang`
attribute on a span).

{lang-jpn}

古池や蛙飛び込む水の音

春過ぎて夏来にけらし白妙の衣干すてふ天の香具山

{lang-eng}

Now we are back in English, since we used the `lang-eng` directive again to
switch the default back to English.

Directives are the correct way to switch the default language for an extended
period of time, but sometimes you may want to just switch the language for a
particular block or span. We already saw how to use the `lang` attribute on a
span; here's how to use it on a block:

{lang: jpn}
夏草や兵どもが夢の跡

Now we are still in English, since the above was a `lang` attribute, not a
`lang-*` directive.

Finally, we'll use the `lang-*` directives again, to switch to Traditional
Chinese and then to Arabic for some more poetry.

{lang-zh-Hant}

花間一壺酒,獨酌無相親。
舉杯邀明月,對影成三人。

{lang-ara}

أَلا كُلُّ شَيْءٍ ما خَلا اللهَ باطِلُ
وَكُلُّ نَعِيمٍ لا مَحالَةَ زائِلُ

{lang-eng}

And with those poems by 李白{lang: zh-Hant} (Li Bai) and
by الْمُتَنَبِّي {lang: ara} (Al-Mutanabbi), we can get back to the spec!

Implementors of Markua Processers should strongly consider using this example as a test case!

3.5Paragraph continuations (M)

In books, you can insert things like lists in the middle of paragraphs.

Ideally, this would be one paragraph. However, it’s not:

foo
  * lorem
  * ipsum
bar
<p>foo</p>
<ul>
<li>lorem</li>
<li>ipsum
bar</li>
</ul>

Here, you lose hard, because “bar” is considered to be part of the same list item as ipsum. I think this is a mistake, but it is more important to preserve Markdown compatibility than to fix this mistake.

Ideally, it would also be possible to insert a list inside a paragraph. However, it’s not:

foo

  * lorem
  * ipsum

bar
<p>foo</p>
<ul>
<li>lorem</li>
<li>ipsum</li>
</ul>
<p>bar</p>

The reason here is that HTML does not support inserting lists in paragraphs. Lists are always siblings to paragraphs, not contained within them.

There is no way to fix this at the HTML tag level, since this is just how HTML works. It would be possible to just construct a fake list with a bunch of CSS and break tags inside a paragraph, but that would be far worse.

From an HTML perspective, you cannot insert lists inside paragraphs. But if you are writing in Markua, you’re not just writing HTML. You can also be generating a book in PDF and other non-HTML formats, which do support lists inside paragraphs just fine.

So, there needs to be some way to indicate that the beginning of a paragraph after something like a list is actually a continuation of the previous paragraph.

Now, this could have been done by just not adding blank lines, only single newlines, but that would involve fighting against existing Markdown behavior. Worse, it would introduce an incompatibility just to fix something that many people don’t consider broken.

For authors who want a list to appear to be inside a paragraph, Markua does have two solutions. First, you can just add a continued-para class to the subsequent paragraph, via an attribute list:

foo

  * lorem
  * ipsum

{class: continued-para}
bar
<p>foo</p>
<ul>
<li>lorem</li>
<li>ipsum</li>
</ul>
<p class="continued-para">bar</p>

It would then be up to the Markua Processor to use the appropriate CSS to make the paragraph with the continued-para class look like a continuation of the previous paragraph, such as by removing indentation or decreasing leading space.

(Attribute lists, discussed later, are one of the main contributions of Markua.)

However, if you are an author who cares about this feature, typing an attribute list whenever you want to continue a paragraph is really obnoxious. So, there needs to be some syntactic sugar, and there is, with the ^ symbol.

A caret (^) on a line by itself above a paragraph functions as a {class: continued-para} attribute list for the following paragraph:

foo

  * lorem
  * ipsum

^
bar
<p>foo</p>
<ul>
<li>lorem</li>
<li>ipsum</li>
</ul>
<p class="continued-para">bar</p>

Note that this works for more than just lists. In fact, it has nothing to do with lists, or even with images or other resources. It’s all about the indentation behavior of a paragraph, typically following some other block. So, you can use this whenever you want:

foo

> a block quote

^
bar
<p>foo</p>
<blockquote>
<p>a block quote</p>
</blockquote>
<p class="continued-para">bar</p>

3.6Concatenating manuscript files inserts blank lines between them (M)

A Markua document can be written in one file or multiple manuscript files. If a manuscript is written in multiple files, these files are concatenated together by the Markua Processor to produce one temporary manuscript file, and that one file is used as the input.

The way that the list of files that should be included in the book or course being generated from a Markua manuscript is defined is implementation-specific, and is not formally specified.

For example, Leanpub enables authors to write books and courses either in a web browser or in local files on an author’s computer. In the browser mode, the author chooses a file to edit via a web UI, which also allows the files to be reordered, etc. In the local files mode, the author creates a file called Book.txt which just lists the files in order. The reason that this is worth mentioning is that in both cases, what results is a simple list of files, and nothing more. All semantics about which content is included in, say, the full book or the sample book is determined by attribute lists, which are specified in the Markua specification.

Now, even though the way that the list of files is defined is not part of the Markua spec, the way that the files are concatenated together to produce the full manuscript is part of the Markua spec.

Importantly, in order to avoid a number of bugs, the files are not just concatenated together unchanged–they must be concatenated together by Markua Processors by adding two newlines (i.e. one blank line) between the end of each file and the beginning of the next file.

So, after this process, exactly one blank line separates the contents of each manuscript file. Note that because of this rule, a paragraph (or any other block element) cannot span multiple manuscript files.

To see why this approach is so important, consider the following single-file Markua document:

# Chapter One

Lorem ipsum dolor.

# Chapter Two

Yada yada yada.
<h1 class="chapter">Chapter One</h1>
<p>Lorem ipsum dolor.</p>
<h1 class="chapter">Chapter Two</h1>
<p>Yada yada yada.</p>

Suppose instead a multiple-file approach was used, in which there were two files, ch1.txt and ch2.txt, with the following content.

ch1.txt:

# Chapter One

Lorem ipsum dolor.
<h1 class="chapter">Chapter One</h1>
<p>Lorem ipsum dolor.</p>

ch2.txt:

# Chapter Two

Yada yada yada.
<h1 class="chapter">Chapter Two</h1>
<p>Yada yada yada.</p>

If Markua did not add any newlines between files, then these files would produce the following incorrect manuscript:

# Chapter One

Lorem ipsum dolor.# Chapter Two

Yada yada yada.
<h1 class="chapter">Chapter One</h1>
<p>Lorem ipsum dolor.#Chapter Two</p>
<p>Yada yada yada.</p>

If Markua only added one newline when concatenating, this would produce a correct manuscript (since headings end paragraphs), but one with possible bugs:

# Chapter One

Lorem ipsum dolor.
# Chapter Two

Yada yada yada.
<h1 class="chapter">Chapter One</h1>
<p>Lorem ipsum dolor.</p>
<h1 class="chapter">Chapter Two</h1>
<p>Yada yada yada.</p>

Worse, since a number of text editors such as Emacs have a “strip blank lines at the end of files” setting, it would be possible to introduce such a bug if Markua simply relied on blank lines being added to the end of a file by the author.

So, because of the blank line rule, concatenating the files produces the same manuscript as the single-file manuscript above:

# Chapter One

Lorem ipsum dolor.

# Chapter Two

Yada yada yada.
<h1 class="chapter">Chapter One</h1>
<p>Lorem ipsum dolor.</p>
<h1 class="chapter">Chapter Two</h1>
<p>Yada yada yada.</p>

4Document settings (M)

Markua’s philosophy is that non-semantic formatting is procrastination. Because of this, almost all formatting is done by global switches. These switches are called document settings, and they are defined once, at the beginning of the document. Document settings affect the formatting of the entire book or course.

Document settings are defined in one or more document settings hashes. The document settings hash or hashes must be at the start of a Markua document. Once something other than a document settings hash or a blank line is encountered, no more document settings hashes can be defined. (Document settings hashes can be separated with single or multiple newlines, but once something other than a document settings hash or blank line is encountered, no more document settings hashes can be defined.)

4.1Document settings are specified in simple key-value pairs or in JSON (M)

There are two ways to specify a document settings hash:

  1. As a newline-separated flat list of key-value pairs (key: value). With this format, every non-blank line in the document settings hash must contain a key, a colon :, and a value. Any whitespace at the beginning or end of the keys and values will be stripped.
  2. As JSON.

The Markua Spec only defines a short list of the standard document settings which all Markua Processors must support. In the spec, these settings are defined in a flat list of key: value pairs, but they can also be specified using JSON.

4.2The standard document settings (M)

The Markua Spec only defines the settings which all Markua Processors MUST support. It is expected that Markua Processors will define their own implementation-specific settings somehow, either as document settings which are in a flat list, as JSON, or both.

Since Markua defines a number of document settings, it is helpful to see what the defaults are in one list. These defaults are listed below.

{
  alt-title: all
  chapter-number-format: name
  default-code-language: guess
  figure-classes: all
  include-page-numbers-in-references: true
  italicize-underlines: true
  lang: eng
  list-figures: combined
  number-figures: sequential
  part-number-format: name
  restart-endnote-numbering: true
  restart-footnote-numbering: true
  section-number-format: name
  soft-breaks: space
  toc-chapter-number-format: name
  toc-part-number-format: name
  toc-section-number-format: name
  two-space-hack: true
}

What these settings all mean is explained in the rest of this section.

Note that the behavior of resources which are treated as figures is controlled by four related settings: alt-title, figure-classes, list-figures and number-figures. These need to be considered together: the behavior of these settings are tightly coupled.

alt-title : all, text or none. The default is all, for compatibility with Markua 0.10 and Leanpub Flavoured Markdown by default. This setting controls whether, in the cases where there is no figure title explicitly set, the alt text on a resource is removed and used as the figure title instead. This setting is so important it is discussed in its own section below.

chapter-number-format : category_number_name (e.g. Chapter 1: The Chapter Name), category_number (e.g. Chapter 1), number_name (e.g. 1 The Chapter Name), number (e.g. 1) or name (e.g. The Chapter Name). The default is name. This determines the format of the chapter heading in the body of the book. The format of the chapter heading in the table of contents is set by the related toc-chapter-number-format setting.

default-code-language : The default language that code which is a local resource, web resource or inline resource inserted as a block with three backticks is interpreted as. The default value is guess, which means to guess at the code language based on the syntax encountered (or the file extension for external code samples), and attempt to syntax highlight appropriately. A good alternative is text, which means no syntax highlighting should be used, but the code should be in a monospaced font suitable for a programming language. Besides these options, you can specify a particular programming language used, such as ruby or java. If a Markua Processor does not recognize the programming language specified, it must format it as text. Finally, note that the value of this setting only affects local resources, web resources or inline resources inserted as a block with three backticks–it has no effect on code spans, or on inline resources inserted with tildes. The default language for inline resources delimited by three tildes is always text, but you can override the default on individual resources of course. Finally, code spans do not support syntax highlighting, so there is no equivalent setting for the default language of code spans.

figure-classes : this is a comma-separated list of the figure classes which are included in the list(s) of figures near the beginning of the book or course. The default is all, which means that all figures are included, regardless of the value of their class attribute. An example of this setting is: figure-classes: equation, figure, image, listing. Note that whether the figures are listed, either grouped by their class or together in one big list, is controlled by the list-figures setting. Note that the numbering of figures is controlled by the number-figures setting.

include-page-numbers-in-references : true or false. The default value is true. When a smart crosslink to a figure reference is inserted with #f, this setting determines how the reference is displayed. If true (the default), the text includes the page number (e.g. “Figure 8.2: Anatomy of a Squirrel on page 42”) when inserted in a PDF; if false, it does not (e.g. “Figure 8.2: Anatomy of a Squirrel”. This setting has no effect on EPUB, which never includes page numbers in references (since the text is resizable).

italicize-underlines : true or false. The default value is true. If italicize-underlines is true, then _this_ and *this* are both italic. If italicize-underlines is false, then _this_ is underlined while *this* is italic. This is discussed here.

lang : The IS0 639-3 three character language code of the language that the Markua document is written in. The default is eng.

list-figures : combined, groups or none. The default value is combined. This setting controls whether figures are listed in one or more lists near the beginning of the book. Note that whether these figures are numbered is controlled by the number-figures setting. The list-figures setting respects the numbering decision specified there. This is what the values mean:

The combination of list-figures: groups and number-figures: chapter produces figures like Listing 1.1, Listing 1.2, Listing 4.1 and Image 1.1, Image 1.2, Image 2.1, Image 2.2, Image 4.1, etc. This is a good combination of document settings for books like computer programming books.

number-figures : chapter, none, or sequential. The default is sequential. If the value is chapter or sequential, then resources are numbered when they are inserted as figures. If the value is sequential, you get numbering like Figure 1, Figure 2, Figure 3, etc. If the value is chapter, you get numbering like Figure 1.1, Figure 1.2, Figure 4.1, etc. If the value is none, figures are not numbered.

part-number-format : category_number_name (e.g. Part 1: The Part Name), category_number (e.g. Part 1), number_name (e.g. 1 The Part Name), number (e.g. 1) or name (e.g. The Part Name). The default is name. This determines the format of the part heading in the body of the book. The format of the part heading in the table of contents is set by the related toc-part-number-format setting.

restart-endnote-numbering : true or false. Default true. Whether the endnote numbering is restarted at the end of a chapter.

restart-footnote-numbering : true or false. Default true. Whether the footnote numbering is restarted at the end of a chapter.

section-number-format : number_name (e.g. 1.1 The Section Name), number (e.g. 1.1) or name (e.g. The Section Name). The default is name. This determines the format of the section heading in the body of the book. The format of the section heading in the table of contents is set by the related toc-section-number-format setting.

soft-breaks : break or space. The default is space, for compatibility with Markdown. This setting determines the behavior of single newlines (i.e. a newline which is not followed by another newline) in the output. With space, single newlines in the source essentially produce a single space in the output (like Markdown and Leanpub Flavoured Markdown). With break, single newlines inside paragraphs in the Markua document essentially produce newlines in the output (like Markua 0.10). This setting is so important it is discussed in its own section below.

toc-chapter-number-format : category_number_name (e.g. Chapter 1: The Chapter Name), category_number (e.g. Chapter 1), number_name (e.g. 1. The Chapter Name), number (e.g. 1) or name (e.g. The Chapter Name). The default is name. This determines the format of the chapter heading in the table of contents of the book. The format of the chapter heading in the body is set by the related chapter-number-format setting.

toc-part-number-format : category_number_name (e.g. Part I: The Part Name), category_number (e.g. Part 1), number_name (e.g. I. The Part Name), number (e.g. I) or name (e.g. The Part Name). The default is name. This determines the format of the part heading in the body of the book. The format of the part heading in the table of contents is set by the related toc-part-number-format setting.

toc-section-number-format : number_name (e.g. 1.1 The Section Name), number (e.g. 1.1) or name (e.g. The Section Name). The default is name. This determines the format of the section heading in the body of the book. The format of the section heading in the table of contents is set by the related toc-section-number-format setting.

two-space-hack : true or false. The default is true, for compatibility with Markdown. If true, adding two spaces at the end of a line forces a hard line break. If you use this setting, ensure that your text editor does not strip trailing whitespace! This is probably the best choice if soft-breaks is set to space, or if you want maximum compatibility with Markdown. If false, spaces at the end of a line are ignored, and the single newline is treated as a single newline would be without those spaces, which is determined by the value of the soft-breaks setting. Since Markua allows the behavior of soft line breaks to be configured with the soft-breaks setting, the two-space-hack is not needed if the soft-breaks setting is break. However, even in this case, authors may wish for the two space hack to still function as it does in Markdown, in order to support existing Markdown text or in cases where there are multiple authors of a manuscript.

4.3The soft-breaks document setting (M)

As discussed above, the soft-breaks is one of the document settings that a Markua document can have. (Document settings are set at the beginning of a Markua document, or via a user interface such as a web page.)

The soft-breaks document setting can be break or space. The default is space, for compatibility with Markdown.

If you want the behavior of Leanpub Flavoured Markdown, you need to choose space.

If you want the behavior of Markua 0.10, you need to choose break.

This setting determines the behavior of single newlines (i.e. a newline which is not followed by another newline) in the output. This is what the values mean:

space : Single newlines inside paragraphs in the Markua source produce output which functions like a single space. In output such as PDF, a single space must be what is output. In output such as HTML, a single newline is output, since in HTML output, a single newline is equivalent to a single space in terms of the effect on presentation. So, this type of line break only is a line break in the HTML source, and in all output presentation it functions like a space–hence the name space. The space option is the default for compatibility with Markdown by default. If you want the behavior of Markua 0.10, choose break.

break : Single newlines inside paragraphs in the Markua document produce break tags in the HTML output and line breaks in non-HTML output like LaTeX or PDF. The reason this value is called break is to indicate that when you hit a newline to get a line break, you get a real line break. To be clear: the break attribute means to insert a forced linebreak whenever a single newline is encountered in the manuscript, regardless of whether there are two spaces at the end of the line. While this makes some intuitive sense, it is very much against the default behavior of Markdown. The break option gives you the behavior of Markua 0.10, in which all single newlines function as though they were using the “two space hack” to produce a hard line break in normal Markdown. (Whether the “two space hack” is enabled in a Markua document is determined by the value of the two-space-hack setting, discussed below.)

4.4The alt-title document setting (M)

As discussed above, the alt-title is one of the document settings that a Markua document can have. (Document settings are set at the beginning of a Markua document, or via a user interface such as a web page.)

The alt-title document setting controls whether, in the cases where there is no figure title explicitly set, the alt text on a resource is used as the figure title instead of as alt text.

The alt-title document setting can have the value of all, text or none. The default is all, for compatibility with Markua 0.10 and Leanpub Flavoured Markdown by default. These are the three choices, and what their values mean:

Using the alt text as the figure title is the most terse syntax, and may make sense if no alt text is needed. For example, valid use cases include:

If alt text is used as the figure title, it is NOT also provided as alt text, since it would be very annoying to hear the same text spoken twice (once as the alt text and once as the figure title) when using a screen reader.

Now, it is a debatable point whether using alt text as figure captions is a good idea. The main argument in favour of it is terseness: it is the most terse syntax, and the easiest to remember.

A second argument is that in some cases such as external code samples and other text-based resources, there is no conceivable use for alt text, so the way to add a title with the least syntax is to use the spot for the alt text.

A third argument in favour of it is aesthetics, since the alternative syntax in which a title is provided without any alt text looks like this:

![](someimage.png "A Figure Caption")

With this syntax, you have the [] for what looks like no real reason.

Now, the main argument against using alt text as figure captions is very straightforward: you are supposed to actually provide alt text, and the alt text should be different from the figure caption.

Alt text is supposed to be supposed to be descriptive (say to provide a literal description of an image for people with visual disabilities, or who are listening to the book as an audiobook), whereas good figure captions are terse (and often witty). These are different requirements, so they should have different text.

With this argument in mind, if it looks like a bit of an error to have empty square brackets, that’s a good thing: according to this argument, you should always provide alt text, and it should be different from the figure caption. Having missing alt text look like a bit of a mistake is actually correct.

Now, this is a very good argument, but as shown above, there are good counterarguments as well. So, the behavior of the alt text is determined by the alt-title document setting, and the default value is all (not text or none). The reason why the default is all is simple: there are over a decade of Leanpub books which do this, since they were written in Leanpub Flavoured Markdown or Markua 0.10:

![A Figure Caption](someimage.png)

Despite what one’s opinion of what the correct value of alt-title should be, the default value of alt-title is an acknowledgment of Markua’s history.

So, even though the default value of alt-title is all, the conclusion I’ve come to is that the default behavior of Markdown is best for image resources, and alt text should be used for descriptions, not figure captions, in the case of images. If you’re in this camp, the way to use alt text, and the best syntax when inserting an image, is to do the following:

![some descriptive alt text](someimage.png "A Figure Caption")

Note that you can use that syntax even with the default alt-title: all, since the explictly-defined title will take precedence over the alt text. The alt-title document setting only defines what to do when there is no title set.

To be clear, the alt-title document setting never makes a figure caption be used as the alt text, and it makes alt text be used as a figure caption if and only if no title attribute is provided, either in quotes or in the attribute list.

In these examples, the value of the alt-title document setting does not matter:

![This is always alt text, since a title is provided](someimage.png "A Figure Caption")

{title: "A Figure Caption"}
![This is always alt text, if a caption is provided](someimage.png)

{title: "This is ALWAYS a Figure Caption, Regardless of alt-title"}
![](someimage.png)

![](someimage.png "This is ALWAYS a Figure Caption, Regardless of alt-title")

Again, to be clear, the only thing that the alt-title document setting affects is whether the alt text is used as the figure caption instead of as alt text if there is no title attribute provided as the figure caption.

In these examples, the value of the alt-title document setting does matter:

![No title, so this will be the figure caption if alt-title is all](someimage.png)

![This will NOT be the figure caption if alt-title is text or none](someimage.png)

4.5Multiple document settings hashes are permitted (M)

A Markua document can define multiple document settings hashes.

There must be at least one blank line between the end of the settings block and the start of any other document content other than another document settings hash.

The reason that more than one document settings hash is permitted is that a Markua Processor may also get settings from elsewhere, like a web interface. The most straightforward thing to do in such a case is to just add another document settings hash at the beginning of the file, and then let the Markua Processor sort out any conflicts.

In the case of a conflict between settings defined in a document and settings defined, say, in a web interface, the last setting definition must win. However, a Markua Processor may override settings whose values are unsupported or illegal.

Here’s an example of a Markua document which could be generated if there is a settings block produced by a web interface, and then a settings block manually added at the start of a book:

{
  alt-title: all
  figure-classes: equation, figure, image, listing
  list-figures: groups
  number-figures: chapter
  soft-breaks: space
  two-space-hack: false
}

{
  italicize-underlines: true
}

# Chapter One

Lorem ipsum dolor...

Now, most tools will not output their settings as key-value pairs, but will instead output them as JSON. JSON is easy to output and to parse, since there are good JSON libraries for almost all popular programming languages.

So, here’s the same example, but with the automatically-generated set of document settings specified as JSON:

{
  "alt-title": "all",
  "figure-classes": [
    "equation",
    "figure",
    "image",
    "listing"
  ],
  "list-figures": "groups",
  "number-figures": chapter,
  "soft-breaks": "space",
  "two-space-hack": false
}

{
  italicize-underlines: true
}

# Chapter One

Lorem ipsum dolor...

Regardless of how the automatically-generated document settings are output, the author will typically only see the manually-defined document settings block. So, the formatting of the automatically-generated document settings doesn’t really matter, and JSON is the pragmatic choice.

4.6Custom document settings are encouraged (M)

Finally, Markua Processors can also add their own document settings, just as Markua Processors can understand their own extension attributes.

For these custom document settings, JSON is ideal, since JSON supports nested objects, arrays, and all the data types which are needed.

It is expected that Markua Processors such as Leanpub will add a bunch of their own custom document settings, typically in JSON, in order to control the formatting of all parts of a Markua document. This is encouraged.

It is deliberate that these document settings are not being formally-specified: doing so would actually serve to limit innovation. There are many ways to format a document, and Markua should not seek to define the one true way to do so.

5Resources (M)

Markua documents are written in plain text, either in one text file or multiple text files. However, modern books and courses sometimes contain more than text. Books and courses may embed many kinds of resources.

Resources vary in four different ways:

  1. Type
  2. Format
  3. Class
  4. Location

While resources are a very important addition that Markua makes to Markdown, they are exactly that: an addition. Resources provide an important conceptual framework, while remaining purely additive to standard Markdown. Specifically, all images, fenced code blocks and tables supported by CommonMark or GFM work in Markua unchanged.

What resources do is provide consistency, both in the conceptual understanding, in adding support for attribute lists, and specifying standard type, format and class attributes for all resources, including the resources such as images, fenced code blocks and tables which are also found in other Markdown dialects.

We have already discussed attribute lists earlier for things like part headings and conditional inclusion attributes, but by far the most important use of attribute lists in Markua is with resources. For resources, attribute lists can be used to specify a figure title and alt text, set an id, specify the format, and do other useful things.

While resources are an important addition that Markua makes to Markdown, they were not created in a vacuum. As you’ll see, the syntax for local and web resources is similar to Markdown’s inline image insertion syntax, and the syntax for inline resources is just the fenced code blocks syntax from CommonMark with the addition of format specifiers and attribute lists. Furthermore, the standard Markdown image syntax is the inspiration for the syntax of all local and web resources.

Before we consider how resources vary by type, format, class and location, we will consider what they have in common. Specifically, resources share the following things:

  1. Attributes and attribute list support
  2. Behavior as a figure or a span based on the presence of title
  3. Support for a title to be provided, either in the attribute list, in quotes or from alt text based on the alt-title document setting.

5.1Resource attributes (M)

Resources have attributes. Attributes can be specified in two ways:

  1. In attribute lists.
  2. In the syntax for the resource insertion.

As you will see soon, resources can be local, web or inline.

The syntax for a local or web resource inserted as a figure is as follows:

{key: value, comma: separated, optional: attribute_list}
![optional alt text](resource_path_or_url "Optional Figure Title")

The syntax for an inline resource inserted as a figure is as follows:

{key: value, comma: separated, optional: attribute_list}
```optional_format
inline resource content (default format is `guess` with backticks)
```

You can also insert an inline resource using three or more tildes (~) as the delimiter, instead of the more typical backticks (`):

{key: value, comma: separated, optional: attribute_list}
~~~optional_format
inline resource content (default format is `text` with tildes)
~~~

To be clear, the number of backticks or tildes can be three or more. This is just the fenced code blocks syntax, which is discussed later.

Finally, a local or web resource can also have attributes specified after the resource itself. This is a lot less common, and is discouraged, but it is documented for completeness:

![optional alt text](resource_path_or_url "Optional Figure Title"){key: value, comma: separated, optional: attribute_list}

This is gross, but the reason it is supported is since resources can be inserted in a sentence, and attribute lists (e.g. for index entries) can be added this way.

5.1.1The attribute list takes precedence

It is always an error to specify an attribute both in the attribute list for a resource and in the syntactic sugar locations, either after the backticks or in the square brackets.

However, if this is done, then the value in the attribute list takes precedence, and the Markua Processor should report the error to the author in a list of errors and warnings. (Errors and warnings must never be output by a Markua Processor in the book or course itself, but must always presented in a separate list for the author’s use only.)

In the following figure, the format is text not ruby:

{format: text}
```ruby
puts "hello world"
```

In the following figure, the alt text would be “foo” not “bar”:

{alt: foo}
![bar](foo.png)

In the following figure, the title text would be “foo” not “bar”:

{title: foo}
![](foo.png "bar")

Again, the Markua Processor should treat all of these as errors, and output them in a list of errors and warnings.

5.2Common resource attributes (M)

The supported attributes vary based on the type of resource, but all resources support the class, format, title and type attributes.

As you recall, resources vary in four ways:

  1. Type
  2. Format
  3. Class
  4. Location

The first three of these ways that resources vary are specified via the type, format and class attributes. These attributes are discussed briefly here, and in the sections below. The fourth way, location, is discussed in a later section below.

These are the attributes that resources all share:

class : All attribute lists, including those for resources, support a class attribute. When used on a resource, this is the class of the resource. This can be used for styling, and it can also be used by Markua Processors which group figures by classes. Note that there is no need to set a class of figure on a resource. That would be too verbose and error-prone. Instead, all resources with titles are treated as figures. This is discussed extensively later.

format : This is the resource format. Different resource types have different legal values for format. Specific resource types and formats are discussed throughout the rest of the spec.

title : If a resource has a title, it is a figure. This is discussed in the next section.

type : This is the resource type. This is usually inferred from the format instead of being specified.

5.3Resource titles make figures (M)

All resources with title attributes are figures.

This is true regardless of whether the title is specified in quotes, in an attribute list, or in the alt text (with the appropriate value of the alt-title document setting). In all cases, the resource is a figure.

To be clear, if alt-title is all (the default), the following three image resources are all figures, and should look identical:

![](mac.jpg "The Original Mac")

![The Original Mac](mac.jpg)

{title: "The Original Mac"}
![](mac.jpg)

The title is text which is shown near the figure, typically above or below it. A Markua Processor can choose where to position titles based on any criteria it chooses. For example, a Markua Processor can position all titles above or below figures, or use a different behavior based on the resource format (e.g. table titles above, image titles below).

Note that the figure title itself may contain Markua text formatting (e.g. bold, italic). This text is also displayed for the figure wherever the figure is listed (e.g. List of Illustrations, Table of Tables, etc.).

The title of a resource shows up in two places in the output:

  1. Near the resource, typically above or below it, per the preference of the Markua Processor.
  2. Based on the resource type, either in the List of Illustrations, List of Tables or Table of Figures, if they are generated for the book. This text should also be a crosslink to the title inserted near the figure itself.

The title of a resource used as a figure can be specified in three ways:

  1. In an attribute list:
    {title: "My Amazing Algorithm"}
    ![](algorithm.rb)
    
  2. In quotes after the filename or URL of a local or web resource: ![](resource_path_or_url "Optional Figure Title")
  3. In the alt text, assuming an appropriate value of the alt-title document setting. The alt-title document setting is discussed in great detail here.

Since inline resources do not use the bracket syntax, any titles must be added in the attribute list:

{title: "My Amazing Algorithm"}
```ruby
puts "hello world"
```

Note that an attribute list can also be provided after the resource, but this is discouraged since it is less readable:

![](algorithm.rb){title: "My Amazing Algorithm"}

If a resource has a title (provided by any of the above approaches), it is a figure.

If a resource does not have a title, it is inserted as a span. This is useful for images being added like an emoji, for example:

Here's a ![poop](poop.png) emoji.

Note that in this case, there is already a poop emoji, so this is wasteful as the emoji syntax can just be used instead. However, if you wanted to use an image of your favourite politician as a poop emoji, you wouldn’t have an emoji for that person built in, so this syntax would work in its stead.

Again, if a title was added, things would be different:

Here's a ![poop](poop.png "The Year 2022 So Far") emoji.

In this case, the resource would be a figure, and it would flow in the text as a figure.

But what if we wanted to specify image alignment for this resource? This could be done with an attribute list like this:

Here's a ![poop](poop.png "The Year 2022 So Far"){align: left} emoji.

Since the image resource has a title, it is being inserted as a figure, and it would flow in the layout like any other figure.

Specifically, this would be equivalent to the following:

Here's a
![poop](poop.png "The Year 2022 So Far"){align: left}
emoji.

This would also be equivalent to the following:

Here's a

![poop](poop.png "The Year 2022 So Far"){align: left}

{class: continued-para}
emoji.

This would also be equivalent to the following:

Here's a

![poop](poop.png "The Year 2022 So Far"){align: left}

^
emoji.

To be clear: inserting a figure inline in a sentence is equivalent to adding a single line break before and after it, which is also equivalent to adding a blank line before and after it with a class of continued-para on the paragraph after it, which is also equivalent to the syntactic sugar equivalent with a caret ^ on the paragraph after it.

However, this example is both contrived and terrible.

A much better way to write something like this is as follows, using the attribute list in its typical place above the resource:

Here's an emoji.

{align: left}
![poop](poop.png "The Year 2022 So Far")

Let's hope the year improves.

This is why almost all figures are added this way.

But just because that’s the convention, it does not make it the rule. That is semantically identical to the following:

Here's an emoji.

![poop](poop.png "The Year 2022 So Far"){align: left}

Let's hope the year improves.

Finally, if you are implementing a Markua Processor and are trying to determine the class of a resource, the following pseudocode algorithm can help:

What's the `class`?

if the class was explicitly set to something
  that's the class
elsif there is a title
  the class is `figure`
else
  the class is undefined
end

A resource with an undefined class has no title and is not a figure.

5.4Resource figure numbering (M)

NOTE FOR LEANPUB AUTHORS: The numbered attribute is not implemented in Leanpub yet. We plan to add this in 2024, after which point this note will be removed.

When a resource has a title, it is a figure.

By default, figures are numbered. This numbering is controlled by the number-figures document setting.

number-figures : chapter, none, or sequential. The default is sequential. If the value is chapter or sequential, then resources are numbered when they are inserted as figures. If the value is sequential, you get numbering like Figure 1, Figure 2, Figure 3, etc. If the value is chapter, you get numbering like Figure 1.1, Figure 1.2, Figure 4.1, etc. If the value is none, figures are not numbered.

If a resource does not have a title, it is not a figure, and it is not numbered.

If you do not want to number any of your figures, you can use the none setting of the number-figures document setting to have none of them numbered.

However, sometimes you may wish to number most of your figures, but not all of them. In other words, you may wish to give a resource a title (making it a figure), but not have that resource be numbered like a figure.

To do this, you can use the numbered: false attribute.

numbered : true or false. The default is true. This controls whether a particular resource is numbered. Note that if number-figures is none then no resources are numbered. The numbered attribute does not override this setting.

In other words, the numbered: false can force a figure to not be numbered (if figures are being numbered), but numbered: true cannot force a figure to be numbered if figures are not being numbered.

There is no reason to ever explicitly set numbered: true, since this is the default. The numbered attribute only exists so that it can be overridden to false, to explicitly turn off figure numbering for a particular figure.

Finally, if numbered is false on a figure, then there are the following consequences:

  1. The figure will not show up in any list of figures, if there is one.
  2. You cannot make certain kinds of smart crosslinks to it, since it has no number.

A resource without a number but with a title exists in a kind of limbo: it is not a full-fledged figure which is listed in a list of figures and can be easily referred to with smart crosslinks, but it’s also not not-a-figure: it is laid out like a figure, and can have a caption, etc.

5.5Resource alt text (M)

A resource can have alt text. The resource location does not matter: local, web and inline resources all support alt text.

Alt text is text which is intended to take the place of the resource if the resource itself cannot be seen. In the case of images, the obvious use case is for readers with visual disabilities who are using a screen reader, but it also includes audiobooks and ebook readers which often do not support embedded images, audio and video, and which may have a hard time displaying math.

Here’s an example of good alt text:

![a red apple, possibly a McIntosh or Spartan](fruit.jpg)

You can also use an attribute list:

{alt: "a red apple, possibly a McIntosh or Spartan"}
![](fruit.jpg)

The alt text should not have the same content as the resource figure title, if the resource title is present. (Imagine the annoyance for someone with a visual disability having their screen reader read identical alt text and figure titles to them throughout an entire book!)

Instead, the alt text should be descriptive of the image content, while the resource figure title can be more creative. For example, a figure title may be “Washington Crossing the Delaware” and the alt text could be “Denzel Washington on a boat in a river.” Having good alt text would enable readers who cannot see the image to still get the joke which the figure title makes.

Figure titles and alt text can, and should, be both added to resources. These are some examples of resources with both alt text and a figure title:

![a red apple](mac.jpg "The Original Mac")

{alt: "a red apple", title: "The Original Mac"}
![](mac.jpg)

{alt: "a blue circle", title: "Earth From Space (Simplified)"}
```!
<svg width="20" height="20">
  <circle cx="10" cy="10" r="9" fill="blue"/>
</svg>
```

5.6Resource types (M)

The type of a resource can be specified by the type attribute, or inferred from the format of the resource.

There are eight types of resources:

  1. audio
  2. code
  3. iframe
  4. image
  5. math
  6. table
  7. verbatim
  8. video

Each type of resource has a number of supported formats. Any of the resource types can be inserted as a local resource or web resource, and many of the resource types can also be inserted as an inline resource.

The code, image and table resource types are an extension and a generalization of what is already supported in Markdown; the audio, iframe, math, verbatim and video resource types are new.

Either way, each type of resource has a number of supported formats. Any of the resource types can be inserted as a local resource or web resource, and many of the resource types can also be inserted as an inline resource.

Resources are not parsed as plain textual content; instead, they are parsed according to the rules that govern the parsing of resources.

All of these types of resources are discussed later.

5.7Resource formats (M)

The Format is the format of the resource. It can be specified with the type attribute, or inferred from the file extension (discussed below). This is important information to help the Markua Processor handle the resource correctly.

Both the type and the format can be specified in an attribute list, by the respective type and format attributes.

The type and the format can also be inferred from the file extension and, in the case of web resources, the URL.

Markua Processors must interpret all unspecified file extensions as specifying a resource of type code with a format of guess, unless the resource is a web resource.

If the type and format are not specified and the resource is a web resource, the Markua Processor may use the domain to decide what type of resource to assume. For example, a domain of youtube.com may be assumed to be of type video, a domain of instagram.com may be assumed to be of type image, and a domain of github.com may be assumed to be of type code.

If the type is not specified in the attribute list, the format determines the type. The formats can either be specified by the format attribute or (in most cases) inferred from the file extension for local and web resources. (Inline resources obviously have no file extension, since they are contained in the body of a Markua document.)

As an author, all you typically do is provide the correct file extension for a local resource or set the format in the attribute list. Markua recognizes the format, and uses it to determine the type. If the format is unrecognized, then the resource is treated as a resource of type code and with a format of guess.

It’s important to emphasize that the type and format of a resource can be overridden using an attribute list. The file extensions just set the default type and format that are inferred.

In rare instances, it is useful to override the type and format which have been inferred by the Markua Processor based on the file extension of the resource. This is done by specifying a type and/or format in the attribute list of the resource.

5.8Resource classes (M)

As shown above, when a resource is inserted with a title, it is treated as a figure. Figures can show up in one of the lists of figures which can be generated near the start of the book or course.

If a resource has a title and thus is a figure, and if figures are listed in groups (if the list-figures document setting is groups), the default is to group by the type of the resource. However, the class attribute can be used to override which list the resource appears in.

5.9Resource locations (M)

The Location is the location of the resource, when encountered by the Markua Processor. There is no attribute for this; it is inferred from the location.

A resource is either considered a local, web or inline resource based on its location:

Local Resource : The resource is stored along with the manuscript–either in a resources directory on a local filesystem, or uploaded to the same web service where the Markua document is being written.

Web Resource : The resource is referred to via an http or https URL.

Inline Resource : The resource is defined right in the body of a Markua document.

5.9.1Local resources

If local resources are used, all local resources must be stored inside a resources directory, or one of its subdirectories. The resources directory is not part of the path to the resource.

Here’s how the paths to local resources work:

  1. An image called foo.jpg in the resources directory should be referred to as ![](foo.jpg), but can also be referred to as ![](resources/foo.jpg).
  2. An image called bar.png in a subdirectory images of the resources directory should be referred to as ![](images/bar.png), but can also be referred to as ![](resources/images/foo.jpg).
  3. For security reasons, leading slashes / and navigating upward (../) are not allowed: ![](/foo.jpg), ![](/images/bar.png) and ![](../foo.jpg) are all illegal.

The reason that paths can either include or omit the resources directory is simple: including it makes it a simple relative path, which means that Markdown-aware tools that support, say, external images will just work. However, omitting it is nice to type, so this is something which should be supported as well. And the reason the resources directory exists is to keep the Markua manuscript file(s) separate from the resources, to reduce clutter.

Nested directory trees work as well. A file called foo.rb in a ch1/examples/ruby directory tree inside the resources directory is referenced as ![](ch1/examples/ruby/foo.rb) or as ![](resources/ch1/examples/ruby/foo.rb).

Markua does not specify whether there are any subdirectories of the resources directory, or what their names are. Since any subdirectories have their names as part of the path to the resource, authors can do whatever they want. For example, you can create subdirectories of the resources directory for different types of resource, such as audio, code, images, etc., but you can also just put them all in the resources directory together. To be clear: the names of the directories have no meaning, and do not restrict the formats of what can go inside them.

If you are using a hosted service to write in Markua, this service can store resources wherever it wants. However, if they provide a download (say as a zip file) they should create the resources directory and provide the uploaded resources in that directory. If a nested structure is used, it should be exported that way–if a web service produces paths which reference images inside an images directory (e.g. as images/foo.png), then the zip file containing an export should contain a resources directory which contains an images subdirectory with the images.

5.9.2Web resources

If web resources are supported, both http: and https: resources should be supported.

Web resources are identified by the absolute URL of the resource on the internet.

5.9.3Inline resources

Certain types of resources can be inserted inline in a Markua document:

6Blocks and inlines

We can think of a document as a sequence of blocks—structural elements like paragraphs, block quotations, lists, headings, rules, and code blocks. Some blocks (like block quotes and list items) contain other blocks; others (like headings and paragraphs) contain inline content—text, links, emphasized text, images, code spans, and so on.

6.1Precedence

Indicators of block structure always take precedence over indicators of inline structure. So, for example, the following is a list with two items, not a list with one item containing a code span:

Example 63
- `one
- two`
<ul>
<li>`one</li>
<li>two`</li>
</ul>

This means that parsing can proceed in two steps: first, the block structure of the document can be discerned; second, text lines inside paragraphs, headings, and other block constructs can be parsed for inline structure. The second step requires information about link reference definitions that will be available only at the end of the first step. Note that the first step requires processing lines in sequence, but the second can be parallelized, since the inline parsing of one block element does not affect the inline parsing of any other.

6.2Container blocks and leaf blocks

We can divide blocks into two types: container blocks, which can contain other blocks, and leaf blocks, which cannot.

7Leaf blocks

This section describes the different kinds of leaf block that make up a Markdown document.

7.1Thematic breaks

A line consisting of optionally up to three spaces of indentation, followed by a sequence of three or more matching -, _, or * characters, each followed optionally by any number of spaces or tabs, forms a thematic break.

Example 64
***
---
___
<hr />
<hr />
<hr />

Wrong characters:

Example 65
+++
<p>+++</p>
Example 66
===
<p>===</p>

Not enough characters:

Example 67
--
**
__
<p>--
**
__</p>

Up to three spaces of indentation are allowed:

Example 68
 ***
  ***
   ***
<hr />
<hr />
<hr />

Four spaces of indentation is too many:

Example 69
    ***
<pre><code>***
</code></pre>
Example 70
Foo
    ***
<p>Foo
***</p>

More than three characters may be used:

Example 71
_____________________________________
<hr />

Spaces and tabs are allowed between the characters:

Example 72
 - - -
<hr />
Example 73
 **  * ** * ** * **
<hr />
Example 74
-     -      -      -
<hr />

Spaces and tabs are allowed at the end:

Example 75
- - - -    
<hr />

However, no other characters may occur in the line:

Example 76
_ _ _ _ a

a------

---a---
<p>_ _ _ _ a</p>
<p>a------</p>
<p>---a---</p>

It is required that all of the characters other than spaces or tabs be the same. So, this is not a thematic break:

Example 77
 *-*
<p><em>-</em></p>

Thematic breaks do not need blank lines before or after:

Example 78
- foo
***
- bar
<ul>
<li>foo</li>
</ul>
<hr />
<ul>
<li>bar</li>
</ul>

Thematic breaks can interrupt a paragraph:

Example 79
Foo
***
bar
<p>Foo</p>
<hr />
<p>bar</p>

If a line of dashes that meets the above conditions for being a thematic break could also be interpreted as the underline of a setext heading, the interpretation as a setext heading takes precedence. Thus, for example, this is a setext heading, not a paragraph followed by a thematic break:

Example 80
Foo
---
bar
<h2>Foo</h2>
<p>bar</p>

When both a thematic break and a list item are possible interpretations of a line, the thematic break takes precedence:

Example 81
* Foo
* * *
* Bar
<ul>
<li>Foo</li>
</ul>
<hr />
<ul>
<li>Bar</li>
</ul>

If you want a thematic break in a list item, use a different bullet:

Example 82
- Foo
- * * *
<ul>
<li>Foo</li>
<li>
<hr />
</li>
</ul>

7.2ATX headings

An ATX heading consists of a string of characters, parsed as inline content, between an opening sequence of 1–6 unescaped # characters and an optional closing sequence of any number of unescaped # characters. The opening sequence of # characters must be followed by spaces or tabs, or by the end of line. The optional closing sequence of #s must be preceded by spaces or tabs and may be followed by spaces or tabs only. The opening # character may be preceded by up to three spaces of indentation. The raw contents of the heading are stripped of leading and trailing space or tabs before being parsed as inline content. The heading level is equal to the number of # characters in the opening sequence.

Simple headings:

Example 83
# foo
## foo
### foo
#### foo
##### foo
###### foo
<h1>foo</h1>
<h2>foo</h2>
<h3>foo</h3>
<h4>foo</h4>
<h5>foo</h5>
<h6>foo</h6>

More than six # characters is not a heading:

Example 84
####### foo
<p>####### foo</p>

At least one space or tab is required between the # characters and the heading’s contents, unless the heading is empty. Note that many implementations currently do not require the space. However, the space was required by the original ATX implementation, and it helps prevent things like the following from being parsed as headings:

Example 85
#5 bolt

#hashtag
<p>#5 bolt</p>
<p>#hashtag</p>

This is not a heading, because the first # is escaped:

Example 86
\## foo
<p>## foo</p>

Contents are parsed as inlines:

Example 87
# foo *bar* \*baz\*
<h1>foo <em>bar</em> *baz*</h1>

Leading and trailing spaces or tabs are ignored in parsing inline content:

Example 88
#                  foo                     
<h1>foo</h1>

Up to three spaces of indentation are allowed:

Example 89
 ### foo
  ## foo
   # foo
<h3>foo</h3>
<h2>foo</h2>
<h1>foo</h1>

Four spaces of indentation is too many:

Example 90
    # foo
<pre><code># foo
</code></pre>
Example 91
foo
    # bar
<p>foo
# bar</p>

A closing sequence of # characters is optional:

Example 92
## foo ##
  ###   bar    ###
<h2>foo</h2>
<h3>bar</h3>

It need not be the same length as the opening sequence:

Example 93
# foo ##################################
##### foo ##
<h1>foo</h1>
<h5>foo</h5>

Spaces or tabs are allowed after the closing sequence:

Example 94
### foo ###     
<h3>foo</h3>

A sequence of # characters with anything but spaces or tabs following it is not a closing sequence, but counts as part of the contents of the heading:

Example 95
### foo ### b
<h3>foo ### b</h3>

The closing sequence must be preceded by a space or tab:

Example 96
# foo#
<h1>foo#</h1>

Backslash-escaped # characters do not count as part of the closing sequence:

Example 97
### foo \###
## foo #\##
# foo \#
<h3>foo ###</h3>
<h2>foo ###</h2>
<h1>foo #</h1>

ATX headings need not be separated from surrounding content by blank lines, and they can interrupt paragraphs:

Example 98
****
## foo
****
<hr />
<h2>foo</h2>
<hr />
Example 99
Foo bar
# baz
Bar foo
<p>Foo bar</p>
<h1>baz</h1>
<p>Bar foo</p>

ATX headings can be empty:

Example 100
## 
#
### ###
<h2></h2>
<h1></h1>
<h3></h3>

7.3Setext headings

A setext heading consists of one or more lines of text, not interrupted by a blank line, of which the first line does not have more than 3 spaces of indentation, followed by a setext heading underline. The lines of text must be such that, were they not followed by the setext heading underline, they would be interpreted as a paragraph: they cannot be interpretable as a code fence, ATX heading, block quote, thematic break, list item, or [HTML block][HTML blocks].

A setext heading underline is a sequence of = characters or a sequence of - characters, with no more than 3 spaces of indentation and any number of trailing spaces or tabs.

The heading is a level 1 heading if = characters are used in the setext heading underline, and a level 2 heading if - characters are used. The contents of the heading are the result of parsing the preceding lines of text as CommonMark inline content.

In general, a setext heading need not be preceded or followed by a blank line. However, it cannot interrupt a paragraph, so when a setext heading comes after a paragraph, a blank line is needed between them.

Simple examples:

Example 101
Foo *bar*
=========

Foo *bar*
---------
<h1>Foo <em>bar</em></h1>
<h2>Foo <em>bar</em></h2>

The content of the header may span more than one line:

Example 102
Foo *bar
baz*
====
<h1>Foo <em>bar
baz</em></h1>

The contents are the result of parsing the headings’s raw content as inlines. The heading’s raw content is formed by concatenating the lines and removing initial and final spaces or tabs.

Example 103
  Foo *bar
baz*→
====
<h1>Foo <em>bar
baz</em></h1>

The underlining can be any length:

Example 104
Foo
-------------------------

Foo
=
<h2>Foo</h2>
<h1>Foo</h1>

The heading content can be preceded by up to three spaces of indentation, and need not line up with the underlining:

Example 105
   Foo
---

  Foo
-----

  Foo
  ===
<h2>Foo</h2>
<h2>Foo</h2>
<h1>Foo</h1>

Four spaces of indentation is too many:

Example 106
    Foo
    ---

    Foo
---
<pre><code>Foo
---

Foo
</code></pre>
<hr />

The setext heading underline can be preceded by up to three spaces of indentation, and may have trailing spaces or tabs:

Example 107
Foo
   ----      
<h2>Foo</h2>

Four spaces of indentation is too many:

Example 108
Foo
    ---
<p>Foo
---</p>

The setext heading underline cannot contain internal spaces or tabs:

Example 109
Foo
= =

Foo
--- -
<p>Foo
= =</p>
<p>Foo</p>
<hr />

Trailing spaces or tabs in the content line do not cause a hard line break:

Example 110
Foo  
-----
<h2>Foo</h2>

Nor does a backslash at the end:

Example 111
Foo\
----
<h2>Foo\</h2>

Since indicators of block structure take precedence over indicators of inline structure, the following are setext headings:

Example 112
`Foo
----
`

<a title="a lot
---
of dashes"/>
<h2>`Foo</h2>
<p>`</p>
<h2>&lt;a title=&quot;a lot</h2>
<p>of dashes&quot;/&gt;</p>

The setext heading underline cannot be a lazy continuation line in a list item or block quote:

Example 113
> Foo
---
<blockquote>
<p>Foo</p>
</blockquote>
<hr />
Example 114
> foo
bar
===
<blockquote>
<p>foo
bar
===</p>
</blockquote>
Example 115
- Foo
---
<ul>
<li>Foo</li>
</ul>
<hr />

A blank line is needed between a paragraph and a following setext heading, since otherwise the paragraph becomes part of the heading’s content:

Example 116
Foo
Bar
---
<h2>Foo
Bar</h2>

But in general a blank line is not required before or after setext headings:

Example 117
---
Foo
---
Bar
---
Baz
<hr />
<h2>Foo</h2>
<h2>Bar</h2>
<p>Baz</p>

Setext headings cannot be empty:

Example 118

====
<p>====</p>

Setext heading text lines must not be interpretable as block constructs other than paragraphs. So, the line of dashes in these examples gets interpreted as a thematic break:

Example 119
---
---
<hr />
<hr />
Example 120
- foo
-----
<ul>
<li>foo</li>
</ul>
<hr />
Example 121
    foo
---
<pre><code>foo
</code></pre>
<hr />
Example 122
> foo
-----
<blockquote>
<p>foo</p>
</blockquote>
<hr />

If you want a heading with > foo as its literal text, you can use backslash escapes:

Example 123
\> foo
------
<h2>&gt; foo</h2>

Compatibility note: Most existing Markdown implementations do not allow the text of setext headings to span multiple lines. But there is no consensus about how to interpret

Foo
bar
---
baz

One can find four different interpretations:

  1. paragraph “Foo”, heading “bar”, paragraph “baz”
  2. paragraph “Foo bar”, thematic break, paragraph “baz”
  3. paragraph “Foo bar — baz”
  4. heading “Foo bar”, paragraph “baz”

We find interpretation 4 most natural, and interpretation 4 increases the expressive power of CommonMark, by allowing multiline headings. Authors who want interpretation 1 can put a blank line after the first paragraph:

Example 124
Foo

bar
---
baz
<p>Foo</p>
<h2>bar</h2>
<p>baz</p>

Authors who want interpretation 2 can put blank lines around the thematic break,

Example 125
Foo
bar

---

baz
<p>Foo
bar</p>
<hr />
<p>baz</p>

or use a thematic break that cannot count as a setext heading underline, such as

Example 126
Foo
bar
* * *
baz
<p>Foo
bar</p>
<hr />
<p>baz</p>

Authors who want interpretation 3 can use backslash escapes:

Example 127
Foo
bar
\---
baz
<p>Foo
bar
---
baz</p>

7.4Indented code blocks

An indented code block is composed of one or more indented chunks separated by blank lines. An indented chunk is a sequence of non-blank lines, each preceded by four or more spaces of indentation. The contents of the code block are the literal contents of the lines, including trailing line endings, minus four spaces of indentation. An indented code block has no info string.

An indented code block cannot interrupt a paragraph, so there must be a blank line between a paragraph and a following indented code block. (A blank line is not needed, however, between a code block and a following paragraph.)

Example 128
    a simple
      indented code block
<pre><code>a simple
  indented code block
</code></pre>

If there is any ambiguity between an interpretation of indentation as a code block and as indicating that material belongs to a list item, the list item interpretation takes precedence:

Example 129
  - foo

    bar
<ul>
<li>
<p>foo</p>
<p>bar</p>
</li>
</ul>
Example 130
1.  foo

    - bar
<ol>
<li>
<p>foo</p>
<ul>
<li>bar</li>
</ul>
</li>
</ol>

The contents of a code block are literal text, and do not get parsed as Markdown:

Example 131
    <a/>
    *hi*

    - one
<pre><code>&lt;a/&gt;
*hi*

- one
</code></pre>

Here we have three chunks separated by blank lines:

Example 132
    chunk1

    chunk2
  
 
 
    chunk3
<pre><code>chunk1

chunk2



chunk3
</code></pre>

Any initial spaces or tabs beyond four spaces of indentation will be included in the content, even in interior blank lines:

Example 133
    chunk1
      
      chunk2
<pre><code>chunk1
  
  chunk2
</code></pre>

An indented code block cannot interrupt a paragraph. (This allows hanging indents and the like.)

Example 134
Foo
    bar
<p>Foo
bar</p>

However, any non-blank line with fewer than four spaces of indentation ends the code block immediately. So a paragraph may occur immediately after indented code:

Example 135
    foo
bar
<pre><code>foo
</code></pre>
<p>bar</p>

And indented code can occur immediately before and after other kinds of blocks:

Example 136
# Heading
    foo
Heading
------
    foo
----
<h1>Heading</h1>
<pre><code>foo
</code></pre>
<h2>Heading</h2>
<pre><code>foo
</code></pre>
<hr />

The first line can be preceded by more than four spaces of indentation:

Example 137
        foo
    bar
<pre><code>    foo
bar
</code></pre>

Blank lines preceding or following an indented code block are not included in it:

Example 138

    
    foo
    
<pre><code>foo
</code></pre>

Trailing spaces or tabs are included in the code block’s content:

Example 139
    foo  
<pre><code>foo  
</code></pre>

7.5Fenced code blocks

A code fence is a sequence of at least three consecutive backtick characters (`) or tildes (~). (Tildes and backticks cannot be mixed.) A fenced code block begins with a code fence, preceded by up to three spaces of indentation.

The line with the opening code fence may optionally contain some text following the code fence; this is trimmed of leading and trailing spaces or tabs and called the info string. If the info string comes after a backtick fence, it may not contain any backtick characters. (The reason for this restriction is that otherwise some inline code would be incorrectly interpreted as the beginning of a fenced code block.)

The content of the code block consists of all subsequent lines, until a closing code fence of the same type as the code block began with (backticks or tildes), and with at least as many backticks or tildes as the opening code fence. If the leading code fence is preceded by N spaces of indentation, then up to N spaces of indentation are removed from each line of the content (if present). (If a content line is not indented, it is preserved unchanged. If it is indented N spaces or less, all of the indentation is removed.)

The closing code fence may be preceded by up to three spaces of indentation, and may be followed only by spaces or tabs, which are ignored. If the end of the containing block (or document) is reached and no closing code fence has been found, the code block contains all of the lines after the opening code fence until the end of the containing block (or document). (An alternative spec would require backtracking in the event that a closing code fence is not found. But this makes parsing much less efficient, and there seems to be no real down side to the behavior described here.)

A fenced code block may interrupt a paragraph, and does not require a blank line either before or after.

The content of a code fence is treated as literal text, not parsed as inlines. The first word of the info string is typically used to specify the language of the code sample, and rendered in the class attribute of the code tag. However, this spec does not mandate any particular treatment of the info string.

Here is a simple example with backticks:

Example 140
```
<
 >
```
<pre><code>&lt;
 &gt;
</code></pre>

With tildes:

Example 141
~~~
<
 >
~~~
<pre><code>&lt;
 &gt;
</code></pre>

Fewer than three backticks is not enough:

Example 142
``
foo
``
<p><code>foo</code></p>

The closing code fence must use the same character as the opening fence:

Example 143
```
aaa
~~~
```
<pre><code>aaa
~~~
</code></pre>
Example 144
~~~
aaa
```
~~~
<pre><code>aaa
```
</code></pre>

The closing code fence must be at least as long as the opening fence:

Example 145
````
aaa
```
``````
<pre><code>aaa
```
</code></pre>
Example 146
~~~~
aaa
~~~
~~~~
<pre><code>aaa
~~~
</code></pre>

Unclosed code blocks are closed by the end of the document (or the enclosing block quote or list item):

Example 147
```
<pre><code></code></pre>
Example 148
`````

```
aaa
<pre><code>
```
aaa
</code></pre>
Example 149
> ```
> aaa

bbb
<blockquote>
<pre><code>aaa
</code></pre>
</blockquote>
<p>bbb</p>

A code block can have all empty lines as its content:

Example 150
```

  
```
<pre><code>
  
</code></pre>

A code block can be empty:

Example 151
```
```
<pre><code></code></pre>

Fences can be indented. If the opening fence is indented, content lines will have equivalent opening indentation removed, if present:

Example 152
 ```
 aaa
aaa
```
<pre><code>aaa
aaa
</code></pre>
Example 153
  ```
aaa
  aaa
aaa
  ```
<pre><code>aaa
aaa
aaa
</code></pre>
Example 154
   ```
   aaa
    aaa
  aaa
   ```
<pre><code>aaa
 aaa
aaa
</code></pre>

Four spaces of indentation is too many:

Example 155
    ```
    aaa
    ```
<pre><code>```
aaa
```
</code></pre>

Closing fences may be preceded by up to three spaces of indentation, and their indentation need not match that of the opening fence:

Example 156
```
aaa
  ```
<pre><code>aaa
</code></pre>
Example 157
   ```
aaa
  ```
<pre><code>aaa
</code></pre>

This is not a closing fence, because it is indented 4 spaces:

Example 158
```
aaa
    ```
<pre><code>aaa
    ```
</code></pre>

Code fences (opening and closing) cannot contain internal spaces or tabs:

Example 159
``` ```
aaa
<p><code> </code>
aaa</p>
Example 160
~~~~~~
aaa
~~~ ~~
<pre><code>aaa
~~~ ~~
</code></pre>

Fenced code blocks can interrupt paragraphs, and can be followed directly by paragraphs, without a blank line between:

Example 161
foo
```
bar
```
baz
<p>foo</p>
<pre><code>bar
</code></pre>
<p>baz</p>

Other blocks can also occur before and after fenced code blocks without an intervening blank line:

Example 162
foo
---
~~~
bar
~~~
# baz
<h2>foo</h2>
<pre><code>bar
</code></pre>
<h1>baz</h1>

An info string can be provided after the opening code fence. Although this spec doesn’t mandate any particular treatment of the info string, the first word is typically used to specify the language of the code block. In HTML output, the language is normally indicated by adding a class to the code element consisting of language- followed by the language name.

Example 163
```ruby
def foo(x)
  return 3
end
```
<pre><code class="language-ruby">def foo(x)
  return 3
end
</code></pre>
Example 164
~~~~    ruby startline=3 $%@#$
def foo(x)
  return 3
end
~~~~~~~
<pre><code class="language-ruby">def foo(x)
  return 3
end
</code></pre>
Example 165
````;
````
<pre><code class="language-;"></code></pre>

Info strings for backtick code blocks cannot contain backticks:

Example 166
``` aa ```
foo
<p><code>aa</code>
foo</p>

Info strings for tilde code blocks can contain backticks and tildes:

Example 167
~~~ aa ``` ~~~
foo
~~~
<pre><code class="language-aa">foo
</code></pre>

Closing code fences cannot have info strings:

Example 168
```
``` aaa
```
<pre><code>``` aaa
</code></pre>

7.6Code resources (M)

Code can be a local, web or inline resource, just like any other resource, and the same resource syntax applies to code as to all other resources.

Code cannot have alt text. It’s just text. If any alt text is provided for a code resource, it is ignored.

Markua specifies only one specific file extension to be associated with a type of code: the .txt extension, which is for the format of text. However, Markua Processors must interpret all unspecified file extensions as specifying a resource of type code with a format of guess.

This way, the decision about the language recognized does not need to be done by the Markua Processor, but can instead be delegated to the syntax highlighter, such as Pygments.

Regardless of whether syntax highlighting is supported and the programming language is detected, all code must be formatted as monospaced text by Markua Processors.

The text format means to not do any syntax highlighting as well.

The guess format is a request for the Markua Processor to guess at the programming language based on the file extension and/or the syntax of the code itself. Then, if the detected language corresponds to a particular programming language which the Markua Processor recognizes, and if the Markua Processor supports syntax highlighting, then it can format the resource as nicely syntax-highlighted code. Syntax highlighting is entirely optional in Markua Processors. If a Markua Processor does not support syntax highlighting, and/or if it cannot detect a matching supported programming language, then it must format the code as though the format was text–i.e. to format it as unformatted monospaced text.

Besides the text and guess values of the format attribute, you can also specify the programming language by setting the format attribute to a specific programming language. This is more reliable than guess. Unlike other resource types, Markua does not specify the complete set of the values of the format attribute–there are so many programming languages in the world, and new ones are added so frequently, that doing so would be impractical.

However, while a complete set of the values of the format attribute is not specified, Markua does specify the console value of the format attribute to indicate console input. A Markua Processor should format console input as such. (For example, Leanpub uses the open source Pygments library for its code formatting, and Pygments handles console format correctly, so Leanpub gets this for free.)

The default value of the format attribute for code is complex:

  1. For code which is inserted as a span (which is only supported with inline resources), the default format is text.
  2. For code which is inserted as a figure which is inserted as an inline resource using three tildes, the default format is text.
  3. For all other code, the default format is guess. This includes local and web resources inserted as figures, and code inserted as an inline figure using three backticks.

Note that the default format can be overridden by specifying it via an attribute list, or after the three backticks in syntactic sugar.

7.6.1Supported Attributes for Code

The following are the supported attributes for code resources, in addition to the class, format, title and type attributes which all resources support.

line-numbers : This determines whether the code sample shows line numbers. Legal values are true or false. The default value is false. Any value other than true is interpreted as false.

number-from : If line numbers are shown, this lets you override the starting number of the line numbers. The default value is 1.

crop-start : Sometimes it’s desirable to only show part of a code resource defined in an external file as the code example. The crop-start and crop-end attributes let you accomplish this. The crop-start attribute defines the line which represents the first line included from the resource. For example, {crop-start: 10, crop-end: 15, line-numbers: true, number-from: 10} ensures that lines 10-15 are shown and are numbered as lines 10-15. The default value is 1, which is the first line of the file.

crop-end : This attribute ends the range started with crop-start. The default value of crop-end is to be omitted, which is equivalent to specifying the last line of the file.

7.6.1.1Default Value of the format attribute in Inline Code Samples

The default value of the format attribute for a code resource inserted as a figure varies based on context.

If the code resource is a local or web resource, it defaults to guess.

If the code resource is an inline resource, the default varies based on the delimiter, and whether the code is inserted as a span or as a block.

With three backticks the default format is guess, and with three tildes, the default format is text. This way, you can vary the default without having to type an attribute list: if you want the code language guessed at, use backticks; if you don’t, use tildes. Of course, you can specify any attributes you wish with either delimiter, and specified attributes override default ones. The only reason there are different defaults are to make things easier to type. Programmers refer to such niceties as “syntactic sugar”.

The default value of block code resources inserted with three backticks can be overridden from guess to some other value by setting by the default-code-language attribute on the entire Markua document. (This attribute has no effect on resources inserted with three tildes, or on code spans.)

7.6.2Local Code Resources

Local code resources can be inserted as a figure.

This first figure will be a type of code and a format of guess. A Markua Processor which associates .rb file extensions with Ruby code will treat this as Ruby code; a Markua Processor which has no association for .rb files will treat it as plain text:

Here's a paragraph before the figure.

![](hello.rb "Hello World in Ruby")

Here's a paragraph after the figure.

That is equivalent to:

Here's a paragraph before the figure.

{format: guess}
![](hello.rb "Hello World in Ruby")

Here's a paragraph after the figure.

If you don’t want to take chances you can do this:

Here's a paragraph before the figure.

{format: ruby}
![](hello.rb "Hello World in Ruby")

Here's a paragraph after the figure.

Note that the title is optional in all figures:

Here's a paragraph before the figure.

![](hello.rb)

Here's a paragraph after the figure.

7.6.3Web Code Resources

Web code resources function identically to how local code resources work, including the significance of file extensions. The only differences is that the files are on the web.

This will be a type of code and a format of guess since the file extension is not specified:

![](https://markua.com/hello.rb "Hello World in Ruby")

That is equivalent to:

{format: guess}
![](https://markua.com/hello.rb "Hello World in Ruby")

If you don’t want to take chances you can do this:

{format: ruby}
![](https://markua.com/hello.rb "Hello World in Ruby")

7.6.4Inline Code Resources

Inline code resources are the most flexible way to insert code. They are the only way to insert code as a span resource, and the most straightforward way to add short code examples as figures.

The great thing about inline code resources, either as spans or figures, is that they work the same way as they do in CommonMark and GFM, with small additions by Markua.

7.6.4.1No Attribute Lists or Format Specifiers on Indented Code Blocks

Indented code blocks are supported for compatibility with CommonMark and GFM. However, no attribute lists or format specifiers can be used. If you want to use them, use a fenced code block.

7.6.4.2Attribute Lists and Format Specifiers on Fenced Code Blocks

Fenced code blocks, discussed earlier, are how to insert inline code resources as figures. These can have attribute lists or format specifiers.

This will be a type of code and a format of guess since three backticks are used and since the format is not specified:

Some paragraph.

```
puts "hello"
```

Some paragraph.

That is equivalent to:

Some paragraph.

```guess
puts "hello"
```

Some paragraph.

If you don’t want to take chances you can do this to explicitly specify the format:

Some paragraph.

```ruby
puts "hello"
```

Some paragraph.

This Ruby code may be formatted as such if the Markua Processor understands ruby. If not, the ruby format will be ignored.

If you don’t like syntactic sugar you can do:

Some paragraph.

{format: ruby}
```
puts "hello"
```

Some paragraph.

If you want a figure title, you can add it to the attribute list with any of the above. For example:

Some paragraph.

{title: "Hello World in Ruby"}
```ruby
puts "hello"
```

Some paragraph.

Finally, if you want the code to definitely not get syntax highlighted, you can force format to be text in one of two ways.

First, you can set it explicitly:

Some paragraph.

```text
puts "hello"
```

Some paragraph.

Second, you can use three tildes instead of three backticks, since the default with tildes is text not guess:

Some paragraph.

~~~
puts "hello"
~~~

Some paragraph.

As discussed previously, console input and output should be formatted as such by a Markua Processor:

```console
$ git init
Initialized empty Git repository in /path/to/repo
```

Finally, it’s important to note that when you are writing about other inline formats, such as SVG or AsciiMath, what you are really doing is creating a code resource. This is shown in the sections below, which discuss SVG and AsciiMath, but this applies more broadly.

7.6.4.3No Attribute Lists on Code Spans

Code spans (like puts foo) are discussed later in the spec. Code spans work just as in CommonMark and GFM. Since Markua adds attribute lists to code blocks, you may wonder whether Markua adds attribute lists to code spans.

So, to be clear: Markua does NOT support attribute lists on code spans.

So, in an example like the following, the attribute list is simply ignored:

Hello World in Ruby is a simple `puts "hello world"`{format: ruby} statement.

The reason for this is simple: it does not make sense to run a syntax highlighter to format two or three words of code. If you want syntax highlighting, use a code block.

7.6.5Marking Code as Added or Deleted

Markua supports marking code as added or deleted, which can be helpful if you are writing a computer programming book and want to indicate what code should be added or removed to a larger program.

The way to do this is to add special comment lines to your code.

The magic words are markua-start-insert, markua-end-insert, markua-start-delete and markua-end-delete. Any line containing one of those words will be removed completely by a Markua Processor before being inserted into the output.

The Markua Processor will then be able to determine which code is being deleted or inserted, and format it accordingly. The recommended way for a Markua Processor to do this is to make code which is being inserted get bolded, and to make code which is getting deleted to be put in ~~strikethrough~~.

Finally, while syntax highlighting is optional in a Markua Processor, if a Markua Processor does support syntax highlighting it is allowed for the Markua Processor to not do any syntax highlighting when there is the presence of any of any special markua-* comments. Syntax highlighting may make it harder to notice the added and removed code, if they are formatted with bold and strikethrough respectively.

7.6.6Line Wrapping in Code Resources

Code resources should have newlines added by the author to ensure that automatic line wrapping is not relied upon. Markua Processors may wrap lines to ensure that all code is visible on a page, and may add continuation characters (like the backslash \ character) in the output to indicate that a line has been automatically wrapped. However, adding a continuation character is not a requirement, nor is the choice of which continuation character is used.

7.7Verbatim resources (M)

The verbatim resource type is used to preserve leading and internal whitespace while still supporting Markua text formatting (things like **bold** and *italic*).

While Markua text formatting IS supported inside a verbatim resource, Markua resources (including images) ARE NOT supported inside a verbatim resource. While this may seem arbitrary and overly restrictive, we need to prevent a slippery slope of expanded scope.

The following are the verbatim resource formats and the file extensions which choose them by default:

verbatim : verbatim text - .text

The primary use case is poetry, and the secondary use case is boilerplate legal text. If you don’t care about leading or internal whitespace, or about poetry or legalese, you can safely skip this section.

For a verbatim resource, leading whitespace, internal whitespace, and newlines must all be preserved verbatim as typed. This can be useful for certain types of poetry, such as indenting the last two lines of a sonnet.

To be clear:

Now, while the whitespace is preserved, it is rendered using either a proportional font or a monospaced one based on the value of the monospaced attribute.

The verbatim resource should have newlines added by the author to ensure that automatic line wrapping is not relied upon. Markua Processors may wrap lines to ensure that all the text is visible on a page, and may add continuation characters (like the backslash \ character) in the output to indicate that a line has been automatically wrapped by the Markua Processor. Obviously, seeing a continuation character is in poetry is terrible, so the author should consider this a mistake that needs to be fixed by manually line wrapping.

Verbatim resources can be a local, web or inline resource, just like any other resource, and the same resource syntax applies to Markua resources as to all other resources.

There are three main use cases for a Markua resource:

  1. Inserting a specific class of resource, such as a poem.
  2. Inserting boilerplate, such as a legal disclaimer, which may be needed to be used in many places and/or sourced from a third party.
  3. Controlling the behavior of fonts or whitespace for specific types of text, such as poetry.

The following are the supported attributes for verbatim resources, in addition to the class, format, title and type attributes which all resources support.

monospaced : true or false. The default is false. If true, the Markua Processor must use a monospaced font to output the text in the resource. If false, the Markua Processor may use whatever font (proportional or monospaced) it is configured to use to output Markua text. The value of true can be useful for certain types of poetry, for example.

Note that the standard class attribute should be used with a verbatim type resource to define a specific class of resource on verbatim resources which have titles, such as a poem. A List of Poems is a lot more poetic than a List of Verbatims.

Also note that the standard resource format attribute for a verbatim type resource must always be overridden to verbatim by a Markua Processor. Similarly, if a format of verbatim is specified, this must set the type to verbatim.

7.7.1Local Verbatim resources

Local and web resources with a .text extension default to being treated as a verbatim type of resource.

Here's a Shakespearean sonnet, which has a type of `verbatim` and a class of
`poem`:

{type: verbatim, class: poem}
![](sonnet130.txt "Sonnet 130")

Here's a Shakespearean sonnet, which has a class of `poem` and will default to
the `{type: verbatim}` because of the `.text` extension:

{class: poem}
![](sonnet130.text "Sonnet 130")

Here's a Shakespearean sonnet, which has no class specified and which will
default to the `{type: verbatim}` because of the `.text` extension:

![](sonnet130.text "Sonnet 130")

Here's some legal boilerplate, which has no class specified and which will
default to the `{type: verbatim}` because of the `.text` extension:

![](tos.text "Terms and Conditions")

Here's an E. E. Cummings poem:

{type: verbatim, class: poem, monospaced: true}
![](iwillbe.txt "I Will Be")

Here's an E. E. Cummings poem which has no class specified and which will
default to the `{type: verbatim}` because of the `.text` extension:

{monospaced: true}
![](iwillbe.text "I Will Be")

7.7.2Web Verbatim resources

Web Markua resources function identically to how local Markua resources work, including the significance of file extensions. The only difference is that the files are on the web.

Here's a Shakespearean sonnet, which has a type of `verbatim` and a class of
`poem`:

{type: verbatim, class: poem}
![](https://markua.com/sonnet130.txt "Sonnet 130")

Here's a Shakespearean sonnet, which has a class of `poem` and will default to
the `{type: verbatim}` because of the `.text` extension:

{class: poem}
![](https://markua.com/sonnet130.text "Sonnet 130")

Here's a Shakespearean sonnet, which has no class specified and which will
default to the `{type: verbatim}` because of the `.text` extension:

![](https://markua.com/sonnet130.text "Sonnet 130")

Here's some legal boilerplate, which has no class specified and which will
default to the `{type: verbatim}` because of the `.text` extension:

![](https://markua.com/tos.text "Terms and Conditions")

Here's an E. E. Cummings poem:

{type: verbatim, class: poem, monospaced: true}
![](https://markua.com/iwillbe.txt "I Will Be")

Here's an E. E. Cummings poem which has no class specified and which will
default to the `{type: verbatim}` because of the `.text` extension:

{monospaced: true}
![](https://markua.com/iwillbe.text "I Will Be")

7.7.3Inline Verbatim resources

Inline verbatim resources are the easiest way to insert verbatim resources.

These work just like fenced code blocks, discussed earlier, but with a verbatim format. (Recall from above that a verbatim format causes the type to also be verbatim.)

Here’s what this looks like with backticks:

Some paragraph.

```verbatim
I grant I never saw a goddess go;
My mistress when she walks treads on the ground.
    And yet, by heaven, I think my love as rare
    As any she belied with false compare.
```

Some paragraph.

You can also use tildes:

Some paragraph.

~~~verbatim
I grant I never saw a goddess go;
My mistress when she walks treads on the ground.
    And yet, by heaven, I think my love as rare
    As any she belied with false compare.
~~~

Some paragraph.

You can also use an attribute list:

Some paragraph.

{type: verbatim}
```
I grant I never saw a goddess go;
My mistress when she walks treads on the ground.
    And yet, by heaven, I think my love as rare
    As any she belied with false compare.
```

Some paragraph.

But the best choice is to use the & syntactic sugar, which means {type: verbatim}:

Some paragraph.

```&
I grant I never saw a goddess go;
My mistress when she walks treads on the ground.
    And yet, by heaven, I think my love as rare
    As any she belied with false compare.
```

Some paragraph.

To summarize, what a verbatim resource type means is…

  1. whitespace is preserved verbatim
  2. no automatic indenting is done
  3. basic Markua text formatting (bold, italic, etc.) is applied
  4. Markua resources (e.g. tables, images, etc.) CANNOT be nested inside it
  5. you can add {monospaced: true | false} (default false)
  6. it can be local, web, or inline resource
  7. the format is verbatim, which can be inferred from a .text extension
  8. the normal rules apply for the class attribute

7.7.4But what if I want every character taken literally?

If you want to type poetry where every character is taken exactly literally, instead of being potentially interpreted as Markua formatting, you need to use a code block with a format of text for that:

```text
*this* isn't italic
and **this** is not bold
* * * cherry blossoms * * *
```
<pre><code>*this* isn't italic
and **this** is not bold
* * * cherry blossoms * * *
</code></pre>

I’m no poet, but at least I know it.

7.8No HTML blocks (M)

As discussed, Markua does not support raw HTML. All raw HTML is removed. As a side-effect, this means HTML comments can still be used, since they (like all HTML) will be removed from the output.

A link reference definition consists of a link label, optionally preceded by up to three spaces of indentation, followed by a colon (:), optional spaces or tabs (including up to one line ending), a link destination, optional spaces or tabs (including up to one line ending), and an optional link title, which if it is present must be separated from the link destination by spaces or tabs. No further character may occur.

A link reference definition does not correspond to a structural element of a document. Instead, it defines a label which can be used in reference links and reference-style images elsewhere in the document. Link reference definitions can come either before or after the links that use them.

Example 170
[foo]: /url "title"

[foo]
<p><a href="/url" title="title">foo</a></p>
Example 171
   [foo]: 
      /url  
           'the title'  

[foo]
<p><a href="/url" title="the title">foo</a></p>
Example 172
[Foo*bar\]]:my_(url) 'title (with parens)'

[Foo*bar\]]
<p><a href="my_(url)" title="title (with parens)">Foo*bar]</a></p>
Example 173
[Foo bar]:
<my url>
'title'

[Foo bar]
<p><a href="my%20url" title="title">Foo bar</a></p>

The title may extend over multiple lines:

Example 174
[foo]: /url '
title
line1
line2
'

[foo]
<p><a href="/url" title="
title
line1
line2
">foo</a></p>

However, it may not contain a blank line:

Example 175
[foo]: /url 'title

with blank line'

[foo]
<p>[foo]: /url 'title</p>
<p>with blank line'</p>
<p>[foo]</p>

The title may be omitted:

Example 176
[foo]:
/url

[foo]
<p><a href="/url">foo</a></p>

The link destination may not be omitted:

Example 177
[foo]:

[foo]
<p>[foo]:</p>
<p>[foo]</p>

However, an empty link destination may be specified using angle brackets:

Example 178
[foo]: <>

[foo]
<p><a href="">foo</a></p>

The title must be separated from the link destination by spaces or tabs:

Example 179
[foo]: <bar>(baz)

[foo]
<p>[foo]: <bar>(baz)</p>
<p>[foo]</p>

Both title and destination can contain backslash escapes and literal backslashes:

Example 180
[foo]: /url\bar\*baz "foo\"bar\baz"

[foo]
<p><a href="/url%5Cbar*baz" title="foo&quot;bar\baz">foo</a></p>

A link can come before its corresponding definition:

Example 181
[foo]

[foo]: url
<p><a href="url">foo</a></p>

If there are several matching definitions, the first one takes precedence:

Example 182
[foo]

[foo]: first
[foo]: second
<p><a href="first">foo</a></p>

As noted in the section on Links, matching of labels is case-insensitive (see matches).

Example 183
[FOO]: /url

[Foo]
<p><a href="/url">Foo</a></p>
Example 184
[ΑΓΩ]: /φου

[αγω]
<p><a href="/%CF%86%CE%BF%CF%85">αγω</a></p>

Whether something is a link reference definition is independent of whether the link reference it defines is used in the document. Thus, for example, the following document contains just a link reference definition, and no visible content:

Example 185
[foo]: /url

Here is another one:

Example 186
[
foo
]: /url
bar
<p>bar</p>

This is not a link reference definition, because there are characters other than spaces or tabs after the title:

Example 187
[foo]: /url "title" ok
<p>[foo]: /url &quot;title&quot; ok</p>

This is a link reference definition, but it has no title:

Example 188
[foo]: /url
"title" ok
<p>&quot;title&quot; ok</p>

This is not a link reference definition, because it is indented four spaces:

Example 189
    [foo]: /url "title"

[foo]
<pre><code>[foo]: /url &quot;title&quot;
</code></pre>
<p>[foo]</p>

This is not a link reference definition, because it occurs inside a code block:

Example 190
```
[foo]: /url
```

[foo]
<pre><code>[foo]: /url
</code></pre>
<p>[foo]</p>

A link reference definition cannot interrupt a paragraph.

Example 191
Foo
[bar]: /baz

[bar]
<p>Foo
[bar]: /baz</p>
<p>[bar]</p>

However, it can directly follow other block elements, such as headings and thematic breaks, and it need not be followed by a blank line.

Example 192
# [Foo]
[foo]: /url
> bar
<h1><a href="/url">Foo</a></h1>
<blockquote>
<p>bar</p>
</blockquote>
Example 193
[foo]: /url
bar
===
[foo]
<h1>bar</h1>
<p><a href="/url">foo</a></p>
Example 194
[foo]: /url
===
[foo]
<p>===
<a href="/url">foo</a></p>

Several link reference definitions can occur one after another, without intervening blank lines.

Example 195
[foo]: /foo-url "foo"
[bar]: /bar-url
  "bar"
[baz]: /baz-url

[foo],
[bar],
[baz]
<p><a href="/foo-url" title="foo">foo</a>,
<a href="/bar-url" title="bar">bar</a>,
<a href="/baz-url">baz</a></p>

Link reference definitions can occur inside block containers, like lists and block quotations. They affect the entire document, not just the container in which they are defined:

Example 196
[foo]

> [foo]: /url
<p><a href="/url">foo</a></p>
<blockquote>
</blockquote>

7.10Paragraphs

A sequence of non-blank lines that cannot be interpreted as other kinds of blocks forms a paragraph. The contents of the paragraph are the result of parsing the paragraph’s raw content as inlines. The paragraph’s raw content is formed by concatenating the lines and removing initial and final spaces or tabs.

A simple example with two paragraphs:

Example 197
aaa

bbb
<p>aaa</p>
<p>bbb</p>

Paragraphs can contain multiple lines, but no blank lines:

Example 198
aaa
bbb

ccc
ddd
<p>aaa
bbb</p>
<p>ccc
ddd</p>

Multiple blank lines between paragraphs have no effect:

Example 199
aaa


bbb
<p>aaa</p>
<p>bbb</p>

Leading spaces or tabs are skipped:

Example 200
  aaa
 bbb
<p>aaa
bbb</p>

Lines after the first may be indented any amount, since indented code blocks cannot interrupt paragraphs.

Example 201
aaa
             bbb
                                       ccc
<p>aaa
bbb
ccc</p>

However, the first line may be preceded by up to three spaces of indentation. Four spaces of indentation is too many:

Example 202
   aaa
bbb
<p>aaa
bbb</p>
Example 203
    aaa
bbb
<pre><code>aaa
</code></pre>
<p>bbb</p>

Final spaces or tabs are stripped before inline parsing, so a paragraph that ends with two or more spaces will not end with a hard line break:

Example 204
aaa     
bbb     
<p>aaa<br />
bbb</p>

7.11Blank lines

Blank lines between block-level elements are ignored, except for the role they play in determining whether a list is tight or loose.

Blank lines at the beginning and end of the document are also ignored.

Example 205
  

aaa
  

# aaa

  
<p>aaa</p>
<h1>aaa</h1>

7.12Tables (GFM)

GFM enables the table extension, where an additional leaf block type is available.

A table is an arrangement of data with rows and columns, consisting of a single header row, a delimiter row separating the header from the data, and zero or more data rows.

Each row consists of cells containing arbitrary text, in which inlines are parsed, separated by pipes (|). A leading and trailing pipe is also recommended for clarity of reading, and if there’s otherwise parsing ambiguity. Spaces between pipes and cell content are trimmed. Block-level elements cannot be inserted in a table.

The delimiter row consists of cells whose only content are hyphens (-), and optionally, a leading or trailing colon (:), or both, to indicate left, right, or center alignment respectively.

| foo | bar |
| --- | --- |
| baz | bim |
<table>
<thead>
<tr>
<th>foo</th>
<th>bar</th>
</tr>
</thead>
<tbody>
<tr>
<td>baz</td>
<td>bim</td>
</tr>
</tbody>
</table>

Cells in one column don’t need to match length, though it’s easier to read if they are. Likewise, use of leading and trailing pipes may be inconsistent:

| abc | defghi |
:-: | -----------:
bar | baz
<table>
<thead>
<tr>
<th align="center">abc</th>
<th align="right">defghi</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center">bar</td>
<td align="right">baz</td>
</tr>
</tbody>
</table>

Include a pipe in a cell’s content by escaping it, including inside other inline spans:

| f\|oo  |
| ------ |
| b `\|` az |
| b **\|** im |
<table>
<thead>
<tr>
<th>f|oo</th>
</tr>
</thead>
<tbody>
<tr>
<td>b <code>|</code> az</td>
</tr>
<tr>
<td>b <strong>|</strong> im</td>
</tr>
</tbody>
</table>

The table is broken at the first empty line, or beginning of another block-level structure:

| abc | def |
| --- | --- |
| bar | baz |
> bar
<table>
<thead>
<tr>
<th>abc</th>
<th>def</th>
</tr>
</thead>
<tbody>
<tr>
<td>bar</td>
<td>baz</td>
</tr>
</tbody>
</table>
<blockquote>
<p>bar</p>
</blockquote>
| abc | def |
| --- | --- |
| bar | baz |
bar

bar
<table>
<thead>
<tr>
<th>abc</th>
<th>def</th>
</tr>
</thead>
<tbody>
<tr>
<td>bar</td>
<td>baz</td>
</tr>
<tr>
<td>bar</td>
<td></td>
</tr>
</tbody>
</table>
<p>bar</p>

The header row must match the delimiter row in the number of cells. If not, a table will not be recognized:

| abc | def |
| --- |
| bar |
<p>| abc | def |
| --- |
| bar |</p>

The remainder of the table’s rows may vary in the number of cells. If there are a number of cells fewer than the number of cells in the header row, empty cells are inserted. If there are greater, the excess is ignored:

| abc | def |
| --- | --- |
| bar |
| bar | baz | boo |
<table>
<thead>
<tr>
<th>abc</th>
<th>def</th>
</tr>
</thead>
<tbody>
<tr>
<td>bar</td>
<td></td>
</tr>
<tr>
<td>bar</td>
<td>baz</td>
</tr>
</tbody>
</table>

If there are no rows in the body, no <tbody> is generated in HTML output:

| abc | def |
| --- | --- |
<table>
<thead>
<tr>
<th>abc</th>
<th>def</th>
</tr>
</thead>
</table>

7.13Table resources and CSV-format tables (M)

Tables can be formatted either as Markua tables or, for tables which are local or web resources, as CSV-format tables. (CSV-format tables are not supported as inline resources, since they would result in too many false-positives.)

The type and format of a table can be specified in an attribute list as follows:

{type: table, format: markua}
![](some_markua_table.txt)

{type: table, format: csv}
![](some_markua_table.txt)

The default type of a csv format resource is table, so the second example could also be written like this:

{format: csv}
![](some_markua_table.txt)

However, the .csv file extension can also be used to provide the default format of csv, so this can also be written like this:

![](some_markua_table.csv)

Also, if no format is specified but the type is specified to table, the default is to assume a format of markua. So, the first example from above could also be written like this:

{type: table}
![](some_markua_table.txt)

As shown in the Tables section, the syntax to insert a table in Markua is the identical syntax to that used by GFM. (That section is unchanged from the GFM spec, which is built on top of the CommonMark spec.)

Just as Markua reinterprets Markdown concepts like images as being resources, Markua also reinterprets GFM tables as being a resource. This is important for two reasons, in the case of tables:

  1. Resources can have attribute lists.
  2. Resources can be inline, local or web resources.

So, Markua supports attribute list on tables, and tables which are inline, local or web resources:

Both local and web resources can be very useful for including data sourced from an external data source, as well as making them easier to keep correct.

The following are the supported attributes for table resources, in addition to the class, format, title and type attributes which all resources support:

align : The align can be left, right or middle. The default is middle. This works the same way it does for images. Note that it applies to the position of the entire table, not to the cells within it. In terms of the specific values of align, a Markua Processor must interpret left as “on the left side of the page”, right as “on the right side of the page” and middle as “in the middle of the content area of the page, respecting margins” in all cases. Finally, note that inside and outside are not supported for align, and that a Markua Processor may interpret center and/or centre as aliases for middle. (The value of middle was chosen because of the center vs. centre spelling issue.)

columns : Sometimes it’s desirable to only show part of a table resource defined in an external file. The rows and columns attributes let you accomplish this. The columns attribute can take a comma-separated list of values or ranges, such as 2, 4, 7..9, which would ensure that columns 2, 4, 7, 8 and 9 were shown.

column-widths : The column widths are a whitespace-separated list of numbers (integers and/or floats) and/or * symbols, from left to right, as a percentage of the total table width. In this attribute value, * means for the column to use the remaining space, equally divided between it and any other column with the * attribute. Some examples are: {column-widths: "10% 30% * 10%"}, {column-widths: "10% * 40% *"}, {column-widths: "10% 30% * 12.5%"}, {column-widths: "95% * *"}. The numbers used for the column-widths percentages must sum to exactly 100 (if only numbers are used), or to less than 100 (if there are any *s used). Every specified value must be at least 1, and every * must compute to at least 1. The number of values (numbers or *s) must match the number of columns. Like with the width attribute, the percentage sign (%) is required, to make it absolutely clear that these are not measurements in pixels or points.

footer : true, false or dynamic. The default is dynamic for markua format tables, and false for csv format tables. If dynamic, the existence of a footer is determined by whether the content of the table contains a footer. For a csv format table, true means that the last row is treated as a footer row, and false means that the last row is not treated as a footer row.

format : The format can be either markua (the default) or csv. The csv format can ony be used with local or web resources, to reduce false positives. CSV-format table resources are discussed in the next section.

header : true, false or dynamic. The default is dynamic for markua format tables, and true for csv format tables. If dynamic, the existence of a header is determined by whether the content of the table contains a header. For a csv format table, true means that the first row is treated as a header row, and false means that the first row is not treated as a header row.

rows : Sometimes it’s desirable to only show part of a table resource defined in an external file. The rows and columns attributes let you accomplish this. The rows attribute can take a comma-separated list of values or ranges, such as 1, 5, 13..18, 22, which would ensure that rows 1, 5, 13, 14, 15, 16, 17, 18 and 22 were shown.

width : The width of the table, in percentage of page content area width (respecting margins). This is specified as an number (integer or float) between 1 and 100 followed by a percentage sign (%). The quotes are optional. So, you can say {width: "70%"}, {width: 70%}, {width: "70.5%"} or {width: 70.5%}.

The examples below assume that a table which looks like the following (from the GFM table example above) is defined in a file called census.tbl (real census data would be too long for a good example):

| foo | bar |
| --- | --- |
| baz | bim |
<table>
<thead>
<tr>
<th>foo</th>
<th>bar</th>
</tr>
</thead>
<tbody>
<tr>
<td>baz</td>
<td>bim</td>
</tr>
</tbody>
</table>

With this table above defined in a file which is available at a local resource location, the following example results:

![](census.tbl "Canadian Census Data")
<figure>
  <table>
  <thead>
  <tr>
  <th>foo</th>
  <th>bar</th>
  </tr>
  </thead>
  <tbody>
  <tr>
  <td>baz</td>
  <td>bim</td>
  </tr>
  </tbody>
  </table>
  <figcaption>Canadian Census Data</figcaption>
</figure>

With this table above defined in a file which is available at a given web resource location, the identical example results:

![](https://markua.com/census.tbl "Canadian Census Data")
<figure>
  <table>
  <thead>
  <tr>
  <th>foo</th>
  <th>bar</th>
  </tr>
  </thead>
  <tbody>
  <tr>
  <td>baz</td>
  <td>bim</td>
  </tr>
  </tbody>
  </table>
  <figcaption>Canadian Census Data</figcaption>
</figure>

Table resources should be defined in a file with a .tbl extension. If the extension is different, then there must be a type: table attribute in the attribute list.

In the interest of brevity, some other examples of usage are below.

With the .tbl extension, the table type is inferred:

{title: "Canadian Census Data"}
![](census.tbl)

![](census.tbl "Canadian Census Data")

![Canadian Census Data](census.tbl)

{title: "Canadian Census Data", column-widths: "30% *"}
![](census.tbl)

{column-widths: "30% *"}
![](census.tbl "Canadian Census Data")

{column-widths: "30% *"}
![Canadian Census Data](census.tbl)

With a non-.tbl extension, the table type must be specified:

{type: table}
![](census.txt "Canadian Census Data")

{type: table, title: "Canadian Census Data"}
![](census.txt)

{type: table, title: "Canadian Census Data", column-widths: "30% *"}
![](census.txt)

Recall that the resource syntax is as follows:

{key: value, comma: separated, optional: attribute_list}
![optional alt text](resource_path_or_url "Optional Figure Title")

Note that there is some flexibility in the title attribute. If the alt-title document setting is text or all (the default) , then the alt text can be used as the title as well. If the alt-title is none, then it cannot. The alt-title document setting is discussed in great detail here.

So, assuming a value of text or all for a document setting, this also works:

![Canadian Census Data](census.tbl)

(This is arguably the nicest looking version, and since tables are a text resource there are no accessibility concerns about there being no alt text.)

This syntax can be used with an attribute list, of course:

{type: table}
![Canadian Census Data](census.txt)

Finally, if there is both alt text and a title attribute, the alt text is clobbered by the title:

{align: middle, width: 80%, title: "Canadian Census Data"}
![some alt text which gets clobbered by the title](census.tbl)

As discussed above, tables (i.e. resources of type table) which are local or web resources can have a csv format, instead of the default markua format.

For a csv-format table, the existence of a header and/or footer row cannot be inferred from the data, and must be specified in an attribute list. The default value of header is true, and the default value of footer is false.

Also, as discussed, a csv table can have rows and columns attributes, for easily presenting a subset of a larger CSV. If rows and columns are used with header of true, the header should be displayed by the Markua Processor.

Examples:

{type: table, format: csv}
![Canadian Census Data](census.csv)

{type: table, format: csv, title: "Canadian Census Data"}
![](census.csv)

{type: table, format: csv}
![](census.csv "Canadian Census Data")

{type: table, format: csv, header: true}
![Canadian Census Data](census.csv)

{type: table, format: csv, header: true, footer: true}
![Canadian Census Data](census.csv)

{type: table, format: csv, header: true, footer: true, rows: 1..5, columns: 2,3}
![Canadian Census Data](census.csv)

8Container blocks

A container block is a block that has other blocks as its contents. There are two basic kinds of container blocks: block quotes and list items. Lists are meta-containers for list items.

We define the syntax for container blocks recursively. The general form of the definition is:

If X is a sequence of blocks, then the result of transforming X in such-and-such a way is a container of type Y with these blocks as its content.

So, we explain what counts as a block quote or list item by explaining how these can be generated from their contents. This should suffice to define the syntax, although it does not give a recipe for parsing these constructions. (A recipe is provided below in the section entitled A parsing strategy.)

8.1Block quotes

A block quote marker, optionally preceded by up to three spaces of indentation, consists of (a) the character > together with a following space of indentation, or (b) a single character > not followed by a space of indentation.

The following rules define block quotes:

  1. Basic case. If a string of lines Ls constitute a sequence of blocks Bs, then the result of prepending a block quote marker to the beginning of each line in Ls is a block quote containing Bs.

  2. Laziness. If a string of lines Ls constitute a block quote with contents Bs, then the result of deleting the initial block quote marker from one or more lines in which the next character other than a space or tab after the block quote marker is paragraph continuation text is a block quote with Bs as its content. Paragraph continuation text is text that will be parsed as part of the content of a paragraph, but does not occur at the beginning of the paragraph.

  3. Consecutiveness. A document cannot contain two block quotes in a row unless there is a blank line between them.

Nothing else counts as a block quote.

Here is a simple example:

Example 217
> # Foo
> bar
> baz
<blockquote>
<h1>Foo</h1>
<p>bar
baz</p>
</blockquote>

The space or tab after the > characters can be omitted:

Example 218
># Foo
>bar
> baz
<blockquote>
<h1>Foo</h1>
<p>bar
baz</p>
</blockquote>

The > characters can be preceded by up to three spaces of indentation:

Example 219
   > # Foo
   > bar
 > baz
<blockquote>
<h1>Foo</h1>
<p>bar
baz</p>
</blockquote>

Four spaces of indentation is too many:

Example 220
    > # Foo
    > bar
    > baz
<pre><code>&gt; # Foo
&gt; bar
&gt; baz
</code></pre>

The Laziness clause allows us to omit the > before paragraph continuation text:

Example 221
> # Foo
> bar
baz
<blockquote>
<h1>Foo</h1>
<p>bar
baz</p>
</blockquote>

A block quote can contain some lazy and some non-lazy continuation lines:

Example 222
> bar
baz
> foo
<blockquote>
<p>bar
baz
foo</p>
</blockquote>

Laziness only applies to lines that would have been continuations of paragraphs had they been prepended with block quote markers. For example, the > cannot be omitted in the second line of

> foo
> ---

without changing the meaning:

Example 223
> foo
---
<blockquote>
<p>foo</p>
</blockquote>
<hr />

Similarly, if we omit the > in the second line of

> - foo
> - bar

then the block quote ends after the first line:

Example 224
> - foo
- bar
<blockquote>
<ul>
<li>foo</li>
</ul>
</blockquote>
<ul>
<li>bar</li>
</ul>

For the same reason, we can’t omit the > in front of subsequent lines of an indented or fenced code block:

Example 225
>     foo
    bar
<blockquote>
<pre><code>foo
</code></pre>
</blockquote>
<pre><code>bar
</code></pre>
Example 226
> ```
foo
```
<blockquote>
<pre><code></code></pre>
</blockquote>
<p>foo</p>
<pre><code></code></pre>

Note that in the following case, we have a lazy continuation line:

Example 227
> foo
    - bar
<blockquote>
<p>foo
- bar</p>
</blockquote>

To see why, note that in

> foo
>     - bar

the - bar is indented too far to start a list, and can’t be an indented code block because indented code blocks cannot interrupt paragraphs, so it is paragraph continuation text.

A block quote can be empty:

Example 228
>
<blockquote>
</blockquote>
Example 229
>
>  
> 
<blockquote>
</blockquote>

A block quote can have initial or final blank lines:

Example 230
>
> foo
>  
<blockquote>
<p>foo</p>
</blockquote>

A blank line always separates block quotes:

Example 231
> foo

> bar
<blockquote>
<p>foo</p>
</blockquote>
<blockquote>
<p>bar</p>
</blockquote>

(Most current Markdown implementations, including John Gruber’s original Markdown.pl, will parse this example as a single block quote with two paragraphs. But it seems better to allow the author to decide whether two block quotes or one are wanted.)

Consecutiveness means that if we put these block quotes together, we get a single block quote:

Example 232
> foo
> bar
<blockquote>
<p>foo
bar</p>
</blockquote>

To get a block quote with two paragraphs, use:

Example 233
> foo
>
> bar
<blockquote>
<p>foo</p>
<p>bar</p>
</blockquote>

Block quotes can interrupt paragraphs:

Example 234
foo
> bar
<p>foo</p>
<blockquote>
<p>bar</p>
</blockquote>

In general, blank lines are not needed before or after block quotes:

Example 235
> aaa
***
> bbb
<blockquote>
<p>aaa</p>
</blockquote>
<hr />
<blockquote>
<p>bbb</p>
</blockquote>

However, because of laziness, a blank line is needed between a block quote and a following paragraph:

Example 236
> bar
baz
<blockquote>
<p>bar
baz</p>
</blockquote>
Example 237
> bar

baz
<blockquote>
<p>bar</p>
</blockquote>
<p>baz</p>
Example 238
> bar
>
baz
<blockquote>
<p>bar</p>
</blockquote>
<p>baz</p>

It is a consequence of the Laziness rule that any number of initial >s may be omitted on a continuation line of a nested block quote:

Example 239
> > > foo
bar
<blockquote>
<blockquote>
<blockquote>
<p>foo
bar</p>
</blockquote>
</blockquote>
</blockquote>
Example 240
>>> foo
> bar
>>baz
<blockquote>
<blockquote>
<blockquote>
<p>foo
bar
baz</p>
</blockquote>
</blockquote>
</blockquote>

When including an indented code block in a block quote, remember that the block quote marker includes both the > and a following space of indentation. So five spaces are needed after the >:

Example 241
>     code

>    not code
<blockquote>
<pre><code>code
</code></pre>
</blockquote>
<blockquote>
<p>not code</p>
</blockquote>

8.2Block quotes with curly braces (M)

Block quotes in Markua are created in one of two ways:

  1. By prefacing lines with > , i.e. a greater than character followed by a space. This was shown above.
  2. By wrapping the blockquote in {blockquote}{/blockquote}

Option #1 is preferable for short quotes; option #2 is easier on authors for really long quotes.

Like figures and tables, blockquotes can be inserted in the middle of a paragraph or as a sibling of it.

These are the two ways to make block quotes in Markua:

This is the first paragraph.

> This is a blockquote.
>
> You saw this above.

This is the second paragraph.

{blockquote}
This is also a blockquote.

It is good for longer quotes.
{/blockquote}

This is the third paragraph.
<p>This is the first paragraph.</p>
<blockquote>
<p>This is a blockquote.</p>
<p>You saw this above.</p>
</blockquote>
<p>This is the second paragraph.</p>
<blockquote>
<p>This is also a blockquote.</p>
<p>It is good for longer quotes.</p>
</blockquote>
<p>This is the third paragraph.</p>

If a blockquote contains headings, these headings may be formatted by a Markua Processor differently than normal headings. Finally, if a Markua Processor is automatically generating a Table of Contents from chapter and section headings, any headings inside blockquotes should be ignored.

8.3List items

A list marker is a bullet list marker or an ordered list marker.

A bullet list marker is a -, +, or * character.

An ordered list marker is a sequence of 1–9 arabic digits (0-9), followed by either a . character or a ) character. (The reason for the length limit is that with 10 digits we start seeing integer overflows in some browsers.)

The following rules define list items:

  1. Basic case. If a sequence of lines Ls constitute a sequence of blocks Bs starting with a character other than a space or tab, and M is a list marker of width W followed by 1 ≤ N ≤ 4 spaces of indentation, then the result of prepending M and the following spaces to the first line of Ls, and indenting subsequent lines of Ls by W + N spaces, is a list item with Bs as its contents. The type of the list item (bullet or ordered) is determined by the type of its list marker. If the list item is ordered, then it is also assigned a start number, based on the ordered list marker.

    Exceptions:

    1. When the first list item in a list interrupts a paragraph—that is, when it starts on a line that would otherwise count as paragraph continuation text—then (a) the lines Ls must not begin with a blank line, and (b) if the list item is ordered, the start number must be 1.
    2. If any line is a thematic break then that line is not a list item.

For example, let Ls be the lines

Example 243
A paragraph
with two lines.

    indented code

> A block quote.
<p>A paragraph
with two lines.</p>
<pre><code>indented code
</code></pre>
<blockquote>
<p>A block quote.</p>
</blockquote>

And let M be the marker 1., and N = 2. Then rule #1 says that the following is an ordered list item with start number 1, and the same contents as Ls:

Example 244
1.  A paragraph
    with two lines.

        indented code

    > A block quote.
<ol>
<li>
<p>A paragraph
with two lines.</p>
<pre><code>indented code
</code></pre>
<blockquote>
<p>A block quote.</p>
</blockquote>
</li>
</ol>

The most important thing to notice is that the position of the text after the list marker determines how much indentation is needed in subsequent blocks in the list item. If the list marker takes up two spaces of indentation, and there are three spaces between the list marker and the next character other than a space or tab, then blocks must be indented five spaces in order to fall under the list item.

Here are some examples showing how far content must be indented to be put under the list item:

Example 245
- one

 two
<ul>
<li>one</li>
</ul>
<p>two</p>
Example 246
- one

  two
<ul>
<li>
<p>one</p>
<p>two</p>
</li>
</ul>
Example 247
 -    one

     two
<ul>
<li>one</li>
</ul>
<pre><code> two
</code></pre>
Example 248
 -    one

      two
<ul>
<li>
<p>one</p>
<p>two</p>
</li>
</ul>

It is tempting to think of this in terms of columns: the continuation blocks must be indented at least to the column of the first character other than a space or tab after the list marker. However, that is not quite right. The spaces of indentation after the list marker determine how much relative indentation is needed. Which column this indentation reaches will depend on how the list item is embedded in other constructions, as shown by this example:

Example 249
   > > 1.  one
>>
>>     two
<blockquote>
<blockquote>
<ol>
<li>
<p>one</p>
<p>two</p>
</li>
</ol>
</blockquote>
</blockquote>

Here two occurs in the same column as the list marker 1., but is actually contained in the list item, because there is sufficient indentation after the last containing blockquote marker.

The converse is also possible. In the following example, the word two occurs far to the right of the initial text of the list item, one, but it is not considered part of the list item, because it is not indented far enough past the blockquote marker:

Example 250
>>- one
>>
  >  > two
<blockquote>
<blockquote>
<ul>
<li>one</li>
</ul>
<p>two</p>
</blockquote>
</blockquote>

Note that at least one space or tab is needed between the list marker and any following content, so these are not list items:

Example 251
-one

2.two
<p>-one</p>
<p>2.two</p>

A list item may contain blocks that are separated by more than one blank line.

Example 252
- foo


  bar
<ul>
<li>
<p>foo</p>
<p>bar</p>
</li>
</ul>

A list item may contain any kind of block:

Example 253
1.  foo

    ```
    bar
    ```

    baz

    > bam
<ol>
<li>
<p>foo</p>
<pre><code>bar
</code></pre>
<p>baz</p>
<blockquote>
<p>bam</p>
</blockquote>
</li>
</ol>

A list item that contains an indented code block will preserve empty lines within the code block verbatim.

Example 254
- Foo

      bar


      baz
<ul>
<li>
<p>Foo</p>
<pre><code>bar


baz
</code></pre>
</li>
</ul>

Note that ordered list start numbers must be nine digits or less:

Example 255
123456789. ok
<ol start="123456789">
<li>ok</li>
</ol>
Example 256
1234567890. not ok
<p>1234567890. not ok</p>

A start number may begin with 0s:

Example 257
0. ok
<ol start="0">
<li>ok</li>
</ol>
Example 258
003. ok
<ol start="3">
<li>ok</li>
</ol>

A start number may not be negative:

Example 259
-1. not ok
<p>-1. not ok</p>
  1. Item starting with indented code. If a sequence of lines Ls constitute a sequence of blocks Bs starting with an indented code block, and M is a list marker of width W followed by one space of indentation, then the result of prepending M and the following space to the first line of Ls, and indenting subsequent lines of Ls by W + 1 spaces, is a list item with Bs as its contents. If a line is empty, then it need not be indented. The type of the list item (bullet or ordered) is determined by the type of its list marker. If the list item is ordered, then it is also assigned a start number, based on the ordered list marker.

An indented code block will have to be preceded by four spaces of indentation beyond the edge of the region where text will be included in the list item. In the following case that is 6 spaces:

Example 260
- foo

      bar
<ul>
<li>
<p>foo</p>
<pre><code>bar
</code></pre>
</li>
</ul>

And in this case it is 11 spaces:

Example 261
  10.  foo

           bar
<ol start="10">
<li>
<p>foo</p>
<pre><code>bar
</code></pre>
</li>
</ol>

If the first block in the list item is an indented code block, then by rule #2, the contents must be preceded by one space of indentation after the list marker:

Example 262
    indented code

paragraph

    more code
<pre><code>indented code
</code></pre>
<p>paragraph</p>
<pre><code>more code
</code></pre>
Example 263
1.     indented code

   paragraph

       more code
<ol>
<li>
<pre><code>indented code
</code></pre>
<p>paragraph</p>
<pre><code>more code
</code></pre>
</li>
</ol>

Note that an additional space of indentation is interpreted as space inside the code block:

Example 264
1.      indented code

   paragraph

       more code
<ol>
<li>
<pre><code> indented code
</code></pre>
<p>paragraph</p>
<pre><code>more code
</code></pre>
</li>
</ol>

Note that rules #1 and #2 only apply to two cases: (a) cases in which the lines to be included in a list item begin with a character other than a space or tab, and (b) cases in which they begin with an indented code block. In a case like the following, where the first block begins with three spaces of indentation, the rules do not allow us to form a list item by indenting the whole thing and prepending a list marker:

Example 265
   foo

bar
<p>foo</p>
<p>bar</p>
Example 266
-    foo

  bar
<ul>
<li>foo</li>
</ul>
<p>bar</p>

This is not a significant restriction, because when a block is preceded by up to three spaces of indentation, the indentation can always be removed without a change in interpretation, allowing rule #1 to be applied. So, in the above case:

Example 267
-  foo

   bar
<ul>
<li>
<p>foo</p>
<p>bar</p>
</li>
</ul>
  1. Item starting with a blank line. If a sequence of lines Ls starting with a single blank line constitute a (possibly empty) sequence of blocks Bs, and M is a list marker of width W, then the result of prepending M to the first line of Ls, and preceding subsequent lines of Ls by W + 1 spaces of indentation, is a list item with Bs as its contents. If a line is empty, then it need not be indented. The type of the list item (bullet or ordered) is determined by the type of its list marker. If the list item is ordered, then it is also assigned a start number, based on the ordered list marker.

Here are some list items that start with a blank line but are not empty:

Example 268
-
  foo
-
  ```
  bar
  ```
-
      baz
<ul>
<li>foo</li>
<li>
<pre><code>bar
</code></pre>
</li>
<li>
<pre><code>baz
</code></pre>
</li>
</ul>

When the list item starts with a blank line, the number of spaces following the list marker doesn’t change the required indentation:

Example 269
-   
  foo
<ul>
<li>foo</li>
</ul>

A list item can begin with at most one blank line. In the following example, foo is not part of the list item:

Example 270
-

  foo
<ul>
<li></li>
</ul>
<p>foo</p>

Here is an empty bullet list item:

Example 271
- foo
-
- bar
<ul>
<li>foo</li>
<li></li>
<li>bar</li>
</ul>

It does not matter whether there are spaces or tabs following the list marker:

Example 272
- foo
-   
- bar
<ul>
<li>foo</li>
<li></li>
<li>bar</li>
</ul>

Here is an empty ordered list item:

Example 273
1. foo
2.
3. bar
<ol>
<li>foo</li>
<li></li>
<li>bar</li>
</ol>

A list may start or end with an empty list item:

Example 274
*
<ul>
<li></li>
</ul>

However, an empty list item cannot interrupt a paragraph:

Example 275
foo
*

foo
1.
<p>foo
*</p>
<p>foo
1.</p>
  1. Indentation. If a sequence of lines Ls constitutes a list item according to rule #1, #2, or #3, then the result of preceding each line of Ls by up to three spaces of indentation (the same for each line) also constitutes a list item with the same contents and attributes. If a line is empty, then it need not be indented.

Indented one space:

Example 276
 1.  A paragraph
     with two lines.

         indented code

     > A block quote.
<ol>
<li>
<p>A paragraph
with two lines.</p>
<pre><code>indented code
</code></pre>
<blockquote>
<p>A block quote.</p>
</blockquote>
</li>
</ol>

Indented two spaces:

Example 277
  1.  A paragraph
      with two lines.

          indented code

      > A block quote.
<ol>
<li>
<p>A paragraph
with two lines.</p>
<pre><code>indented code
</code></pre>
<blockquote>
<p>A block quote.</p>
</blockquote>
</li>
</ol>

Indented three spaces:

Example 278
   1.  A paragraph
       with two lines.

           indented code

       > A block quote.
<ol>
<li>
<p>A paragraph
with two lines.</p>
<pre><code>indented code
</code></pre>
<blockquote>
<p>A block quote.</p>
</blockquote>
</li>
</ol>

Four spaces indent gives a code block:

Example 279
    1.  A paragraph
        with two lines.

            indented code

        > A block quote.
<pre><code>1.  A paragraph
    with two lines.

        indented code

    &gt; A block quote.
</code></pre>
  1. Laziness. If a string of lines Ls constitute a list item with contents Bs, then the result of deleting some or all of the indentation from one or more lines in which the next character other than a space or tab after the indentation is paragraph continuation text is a list item with the same contents and attributes. The unindented lines are called lazy continuation lines.

Here is an example with lazy continuation lines:

Example 280
  1.  A paragraph
with two lines.

          indented code

      > A block quote.
<ol>
<li>
<p>A paragraph
with two lines.</p>
<pre><code>indented code
</code></pre>
<blockquote>
<p>A block quote.</p>
</blockquote>
</li>
</ol>

Indentation can be partially deleted:

Example 281
  1.  A paragraph
    with two lines.
<ol>
<li>A paragraph
with two lines.</li>
</ol>

These examples show how laziness can work in nested structures:

Example 282
> 1. > Blockquote
continued here.
<blockquote>
<ol>
<li>
<blockquote>
<p>Blockquote
continued here.</p>
</blockquote>
</li>
</ol>
</blockquote>
Example 283
> 1. > Blockquote
> continued here.
<blockquote>
<ol>
<li>
<blockquote>
<p>Blockquote
continued here.</p>
</blockquote>
</li>
</ol>
</blockquote>
  1. That’s all. Nothing that is not counted as a list item by rules #1–5 counts as a list item.

The rules for sublists follow from the general rules above. A sublist must be indented the same number of spaces of indentation a paragraph would need to be in order to be included in the list item.

So, in this case we need two spaces indent:

Example 284
- foo
  - bar
    - baz
      - boo
<ul>
<li>foo
<ul>
<li>bar
<ul>
<li>baz
<ul>
<li>boo</li>
</ul>
</li>
</ul>
</li>
</ul>
</li>
</ul>

One is not enough:

Example 285
- foo
 - bar
  - baz
   - boo
<ul>
<li>foo</li>
<li>bar</li>
<li>baz</li>
<li>boo</li>
</ul>

Here we need four, because the list marker is wider:

Example 286
10) foo
    - bar
<ol start="10">
<li>foo
<ul>
<li>bar</li>
</ul>
</li>
</ol>

Three is not enough:

Example 287
10) foo
   - bar
<ol start="10">
<li>foo</li>
</ol>
<ul>
<li>bar</li>
</ul>

A list may be the first block in a list item:

Example 288
- - foo
<ul>
<li>
<ul>
<li>foo</li>
</ul>
</li>
</ul>
Example 289
1. - 2. foo
<ol>
<li>
<ul>
<li>
<ol start="2">
<li>foo</li>
</ol>
</li>
</ul>
</li>
</ol>

A list item can contain a heading:

Example 290
- # Foo
- Bar
  ---
  baz
<ul>
<li>
<h1>Foo</h1>
</li>
<li>
<h2>Bar</h2>
baz</li>
</ul>

8.3.1Motivation

John Gruber’s Markdown spec says the following about list items:

  1. “List markers typically start at the left margin, but may be indented by up to three spaces. List markers must be followed by one or more spaces or a tab.”

  2. “To make lists look nice, you can wrap items with hanging indents…. But if you don’t want to, you don’t have to.”

  3. “List items may consist of multiple paragraphs. Each subsequent paragraph in a list item must be indented by either 4 spaces or one tab.”

  4. “It looks nice if you indent every line of the subsequent paragraphs, but here again, Markdown will allow you to be lazy.”

  5. “To put a blockquote within a list item, the blockquote’s > delimiters need to be indented.”

  6. “To put a code block within a list item, the code block needs to be indented twice — 8 spaces or two tabs.”

These rules specify that a paragraph under a list item must be indented four spaces (presumably, from the left margin, rather than the start of the list marker, but this is not said), and that code under a list item must be indented eight spaces instead of the usual four. They also say that a block quote must be indented, but not by how much; however, the example given has four spaces indentation. Although nothing is said about other kinds of block-level content, it is certainly reasonable to infer that all block elements under a list item, including other lists, must be indented four spaces. This principle has been called the four-space rule.

The four-space rule is clear and principled, and if the reference implementation Markdown.pl had followed it, it probably would have become the standard. However, Markdown.pl allowed paragraphs and sublists to start with only two spaces indentation, at least on the outer level. Worse, its behavior was inconsistent: a sublist of an outer-level list needed two spaces indentation, but a sublist of this sublist needed three spaces. It is not surprising, then, that different implementations of Markdown have developed very different rules for determining what comes under a list item. (Pandoc and python-Markdown, for example, stuck with Gruber’s syntax description and the four-space rule, while discount, redcarpet, marked, PHP Markdown, and others followed Markdown.pl’s behavior more closely.)

Unfortunately, given the divergences between implementations, there is no way to give a spec for list items that will be guaranteed not to break any existing documents. However, the spec given here should correctly handle lists formatted with either the four-space rule or the more forgiving Markdown.pl behavior, provided they are laid out in a way that is natural for a human to read.

The strategy here is to let the width and indentation of the list marker determine the indentation necessary for blocks to fall under the list item, rather than having a fixed and arbitrary number. The writer can think of the body of the list item as a unit which gets indented to the right enough to fit the list marker (and any indentation on the list marker). (The laziness rule, #5, then allows continuation lines to be unindented if needed.)

This rule is superior, we claim, to any rule requiring a fixed level of indentation from the margin. The four-space rule is clear but unnatural. It is quite unintuitive that

- foo

  bar

  - baz

should be parsed as two lists with an intervening paragraph,

<ul>
<li>foo</li>
</ul>
<p>bar</p>
<ul>
<li>baz</li>
</ul>

as the four-space rule demands, rather than a single list,

<ul>
<li>
<p>foo</p>
<p>bar</p>
<ul>
<li>baz</li>
</ul>
</li>
</ul>

The choice of four spaces is arbitrary. It can be learned, but it is not likely to be guessed, and it trips up beginners regularly.

Would it help to adopt a two-space rule? The problem is that such a rule, together with the rule allowing up to three spaces of indentation for the initial list marker, allows text that is indented less than the original list marker to be included in the list item. For example, Markdown.pl parses

   - one

  two

as a single list item, with two a continuation paragraph:

<ul>
<li>
<p>one</p>
<p>two</p>
</li>
</ul>

and similarly

>   - one
>
>  two

as

<blockquote>
<ul>
<li>
<p>one</p>
<p>two</p>
</li>
</ul>
</blockquote>

This is extremely unintuitive.

Rather than requiring a fixed indent from the margin, we could require a fixed indent (say, two spaces, or even one space) from the list marker (which may itself be indented). This proposal would remove the last anomaly discussed. Unlike the spec presented above, it would count the following as a list item with a subparagraph, even though the paragraph bar is not indented as far as the first paragraph foo:

 10. foo

   bar  

Arguably this text does read like a list item with bar as a subparagraph, which may count in favor of the proposal. However, on this proposal indented code would have to be indented six spaces after the list marker. And this would break a lot of existing Markdown, which has the pattern:

1.  foo

        indented code

where the code is indented eight spaces. The spec above, by contrast, will parse this text as expected, since the code block’s indentation is measured from the beginning of foo.

The one case that needs special treatment is a list item that starts with indented code. How much indentation is required in that case, since we don’t have a “first paragraph” to measure from? Rule #2 simply stipulates that in such cases, we require one space indentation from the list marker (and then the normal four spaces for the indented code). This will match the four-space rule in cases where the list marker plus its initial indentation takes four spaces (a common case), but diverge in other cases.

8.4Lists

A list is a sequence of one or more list items of the same type. The list items may be separated by any number of blank lines.

Two list items are of the same type if they begin with a list marker of the same type. Two list markers are of the same type if (a) they are bullet list markers using the same character (-, +, or *) or (b) they are ordered list numbers with the same delimiter (either . or )).

A list is an ordered list if its constituent list items begin with ordered list markers, and a bullet list if its constituent list items begin with bullet list markers.

The start number of an ordered list is determined by the list number of its initial list item. The numbers of subsequent list items are disregarded.

A list is loose if any of its constituent list items are separated by blank lines, or if any of its constituent list items directly contain two block-level elements with a blank line between them. Otherwise a list is tight. (The difference in HTML output is that paragraphs in a loose list are wrapped in <p> tags, while paragraphs in a tight list are not.)

Changing the bullet or ordered list delimiter starts a new list:

Example 291
- foo
- bar
+ baz
<ul>
<li>foo</li>
<li>bar</li>
</ul>
<ul>
<li>baz</li>
</ul>
Example 292
1. foo
2. bar
3) baz
<ol>
<li>foo</li>
<li>bar</li>
</ol>
<ol start="3">
<li>baz</li>
</ol>

In CommonMark, a list can interrupt a paragraph. That is, no blank line is needed to separate a paragraph from a following list:

Example 293
Foo
- bar
- baz
<p>Foo</p>
<ul>
<li>bar</li>
<li>baz</li>
</ul>

Markdown.pl does not allow this, through fear of triggering a list via a numeral in a hard-wrapped line:

The number of windows in my house is
14.  The number of doors is 6.

Oddly, though, Markdown.pl does allow a blockquote to interrupt a paragraph, even though the same considerations might apply.

In CommonMark, we do allow lists to interrupt paragraphs, for two reasons. First, it is natural and not uncommon for people to start lists without blank lines:

I need to buy
- new shoes
- a coat
- a plane ticket

Second, we are attracted to a

principle of uniformity: if a chunk of text has a certain meaning, it will continue to have the same meaning when put into a container block (such as a list item or blockquote).

(Indeed, the spec for list items and block quotes presupposes this principle.) This principle implies that if

  * I need to buy
    - new shoes
    - a coat
    - a plane ticket

is a list item containing a paragraph followed by a nested sublist, as all Markdown implementations agree it is (though the paragraph may be rendered without <p> tags, since the list is “tight”), then

I need to buy
- new shoes
- a coat
- a plane ticket

by itself should be a paragraph followed by a nested sublist.

Since it is well established Markdown practice to allow lists to interrupt paragraphs inside list items, the principle of uniformity requires us to allow this outside list items as well. (reStructuredText takes a different approach, requiring blank lines before lists even inside other list items.)

In order to solve the problem of unwanted lists in paragraphs with hard-wrapped numerals, we allow only lists starting with 1 to interrupt paragraphs. Thus,

Example 294
The number of windows in my house is
14.  The number of doors is 6.
<p>The number of windows in my house is
14.  The number of doors is 6.</p>

We may still get an unintended result in cases like

Example 295
The number of windows in my house is
1.  The number of doors is 6.
<p>The number of windows in my house is</p>
<ol>
<li>The number of doors is 6.</li>
</ol>

but this rule should prevent most spurious list captures.

There can be any number of blank lines between items:

Example 296
- foo

- bar


- baz
<ul>
<li>
<p>foo</p>
</li>
<li>
<p>bar</p>
</li>
<li>
<p>baz</p>
</li>
</ul>
Example 297
- foo
  - bar
    - baz


      bim
<ul>
<li>foo
<ul>
<li>bar
<ul>
<li>
<p>baz</p>
<p>bim</p>
</li>
</ul>
</li>
</ul>
</li>
</ul>

To separate consecutive lists of the same type, or to separate a list from an indented code block that would otherwise be parsed as a subparagraph of the final list item, you can insert a blank HTML comment:

Example 298
- foo
- bar

<!-- -->

- baz
- bim
<ul>
<li>foo</li>
<li>bar</li>
</ul>
<!-- -->
<ul>
<li>baz</li>
<li>bim</li>
</ul>
Example 299
-   foo

    notcode

-   foo

<!-- -->

    code
<ul>
<li>
<p>foo</p>
<p>notcode</p>
</li>
<li>
<p>foo</p>
</li>
</ul>
<!-- -->
<pre><code>code
</code></pre>

List items need not be indented to the same level. The following list items will be treated as items at the same list level, since none is indented enough to belong to the previous list item:

Example 300
- a
 - b
  - c
   - d
  - e
 - f
- g
<ul>
<li>a</li>
<li>b</li>
<li>c</li>
<li>d</li>
<li>e</li>
<li>f</li>
<li>g</li>
</ul>
Example 301
1. a

  2. b

   3. c
<ol>
<li>
<p>a</p>
</li>
<li>
<p>b</p>
</li>
<li>
<p>c</p>
</li>
</ol>

Note, however, that list items may not be preceded by more than three spaces of indentation. Here - e is treated as a paragraph continuation line, because it is indented more than three spaces:

Example 302
- a
 - b
  - c
   - d
    - e
<ul>
<li>a</li>
<li>b</li>
<li>c</li>
<li>d
- e</li>
</ul>

And here, 3. c is treated as in indented code block, because it is indented four spaces and preceded by a blank line.

Example 303
1. a

  2. b

    3. c
<ol>
<li>
<p>a</p>
</li>
<li>
<p>b</p>
</li>
</ol>
<pre><code>3. c
</code></pre>

This is a loose list, because there is a blank line between two of the list items:

Example 304
- a
- b

- c
<ul>
<li>
<p>a</p>
</li>
<li>
<p>b</p>
</li>
<li>
<p>c</p>
</li>
</ul>

So is this, with a empty second item:

Example 305
* a
*

* c
<ul>
<li>
<p>a</p>
</li>
<li></li>
<li>
<p>c</p>
</li>
</ul>

These are loose lists, even though there are no blank lines between the items, because one of the items directly contains two block-level elements with a blank line between them:

Example 306
- a
- b

  c
- d
<ul>
<li>
<p>a</p>
</li>
<li>
<p>b</p>
<p>c</p>
</li>
<li>
<p>d</p>
</li>
</ul>
Example 307
- a
- b

  [ref]: /url
- d
<ul>
<li>
<p>a</p>
</li>
<li>
<p>b</p>
</li>
<li>
<p>d</p>
</li>
</ul>

This is a tight list, because the blank lines are in a code block:

Example 308
- a
- ```
  b


  ```
- c
<ul>
<li>a</li>
<li>
<pre><code>b


</code></pre>
</li>
<li>c</li>
</ul>

This is a tight list, because the blank line is between two paragraphs of a sublist. So the sublist is loose while the outer list is tight:

Example 309
- a
  - b

    c
- d
<ul>
<li>a
<ul>
<li>
<p>b</p>
<p>c</p>
</li>
</ul>
</li>
<li>d</li>
</ul>

This is a tight list, because the blank line is inside the block quote:

Example 310
* a
  > b
  >
* c
<ul>
<li>a
<blockquote>
<p>b</p>
</blockquote>
</li>
<li>c</li>
</ul>

This list is tight, because the consecutive block elements are not separated by blank lines:

Example 311
- a
  > b
  ```
  c
  ```
- d
<ul>
<li>a
<blockquote>
<p>b</p>
</blockquote>
<pre><code>c
</code></pre>
</li>
<li>d</li>
</ul>

A single-paragraph list is tight:

Example 312
- a
<ul>
<li>a</li>
</ul>
Example 313
- a
  - b
<ul>
<li>a
<ul>
<li>b</li>
</ul>
</li>
</ul>

This list is loose, because of the blank line between the two block elements in the list item:

Example 314
1. ```
   foo
   ```

   bar
<ol>
<li>
<pre><code>foo
</code></pre>
<p>bar</p>
</li>
</ol>

Here the outer list is loose, the inner list tight:

Example 315
* foo
  * bar

  baz
<ul>
<li>
<p>foo</p>
<ul>
<li>bar</li>
</ul>
<p>baz</p>
</li>
</ul>
Example 316
- a
  - b
  - c

- d
  - e
  - f
<ul>
<li>
<p>a</p>
<ul>
<li>b</li>
<li>c</li>
</ul>
</li>
<li>
<p>d</p>
<ul>
<li>e</li>
<li>f</li>
</ul>
</li>
</ul>

8.5List attributes (M)

By default, Markua keeps the behavior of lists to be identical to that in Markdown, as described by John Gruber and specified by CommonMark. The enhancements that Markua makes to lists are purely additive: if you use any CommonMark-compliant list in a Markua document, it will work exactly as it would in CommonMark.

Now, lists in CommonMark have some major limitations. Here are two of the biggest ones:

  1. You can’t output descending order lists.
  2. You can’t output letters or Roman numerals for labels in an ordered list.

This list gets rendered as 3,4,5 (not 3,2,1), since Markdown does not have a way of specifying that a list is descending, and it does not infer it from the order by default:

before

3. foo
2. bar
1. baz

after
<p>before</p>
<ol start="3">
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

This looks like a list, but it is in fact a paragraph. There are no break tags after the foo, bar and baz, just soft breaks:

before

a. foo
b. bar
c. baz

after
<p>before</p>
<p>a. foo
b. bar
c. baz</p>
<p>after</p>

Again, both of these behaviors are as defined in the CommonMark spec.

Now, Markua can’t change this by default, without being incompatibile with Markdown.

Also, CommonMark allows slight inconsistency in the amount of indentation of list items. I understand why, but this allows for documents which are incomprehensible and ugly. However, Markua must keep compatibility with CommonMark lists, so this is allowed.

So what to do?

Markua adds the list attributes extension.

Adding attributes to lists allows them to output all the needed types of list from the limited list support in CommonMark, without introducing breaking incompatibilities to CommonMark lists.

It is certainly possible to define a better list syntax than Markdown’s: I have previously done exactly that, specifying a syntax for complex lists which was superior to Markdown’s lists. Furthermore, Leanpub even implemented support for it! However, this syntax introduced incompatibilites with Markdown, and the tradeoff was not worth it. Markua’s list enhancements must be purely additive to CommonMark lists, not incompatible with them in any way. They cannot introduce any breaking changes to existing CommonMark lists, or authors will be justifiably unhappy.

The way the list attributes extension works is simple: it gives lists the ability to have an attribute list, much like how resources can have an attribute list. (Lists are not resources, however. That would not just be overkill, it would result in incompatibilites.)

8.5.1Supported Attributes for Lists

The following are the supported attributes for lists.

reversed : This determines whether the list is in ascending or descending order. Legal values are true or false. The default value is false, for ascending. Any value other than true is interpreted as false.

type : The type of the list marker to use. Valid choices are 1, A, a, I or i. A and a mean uppercase or lowercase letters; I or i mean uppercase or lowercase Roman numerals. Note that lists are NOT resources, so it is fine to use type for the list marker: there is no {type: list} resource, so there is no naming conflict.

delimiter : The type of the list delimiter to use, in output formats which allow custom list delimiters. Valid choices are "." or ")". The default value is inferred by the delimiter used: if you use periods . to delimit the list, then the delimiter is .; if you use closing parenthesis ) to delimit the list, then the delimiter is ). Note that the use of a closing parenthesis as a delimiter is not supported in all output formats. Specifically, HTML only supports using . for a list delimiter, so ) is not supported in HTML. And since EPUB is based on HTML, EPUB should not be expected to support the use of a right parenthesis for a delimiter. However, PDF and LaTeX are not subject to these limitations, and should support both . and ) as list delimiters.

The use of these attributes are shown in the following examples.

8.5.2Lists with Attributes: Good Style

This list will be numbered a, b, c, and will have the delimiter of .:

before

{type: a}
1. foo
2. bar
3. baz

after
<p>before</p>
<ol type="a">
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

This list will be numbered a, b, c, and will have the delimiter of .:

before

{type: a, delimiter: "."}
1. foo
2. bar
3. baz

after
<p>before</p>
<ol type="a">
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

This list will be numbered a, b, c, and will have the delimiter of ) in output formats that support it (e.g. in PDF but not in HTML):

before

{type: a}
1) foo
2) bar
3) baz

after
<p>before</p>
<ol type="a">
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

This list will be numbered a, b, c, and will have the delimiter of ) in output formats that support it (e.g. in PDF but not in HTML):

before

{type: a, delimiter: ")"}
1) foo
2) bar
3) baz

after
<p>before</p>
<ol type="a">
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

This list will be numbered A, B, C:

before

{type: A}
1. foo
2. bar
3. baz

after
<p>before</p>
<ol type="A">
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

This list will be numbered c, d, e:

before

{type: a}
3. foo
4. bar
5. baz

after
<p>before</p>
<ol type="a" start="3">
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

This list will be numbered i, ii, iii:

before

{type: i}
1. foo
2. bar
3. baz

after
<p>before</p>
<ol type="i">
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

This list will be numbered I, II, III:

before

{type: I}
1. foo
2. bar
3. baz

after
<p>before</p>
<ol type="i">
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

This list will be numbered iii, iv, v:

before

{type: i}
3. foo
4. bar
5. baz

after
<p>before</p>
<ol type="i" start="3">
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

This list will be numbered 3, 2, 1:

before

{reversed: true}
3. foo
2. bar
1. baz

after
<p>before</p>
<ol start="3" reversed="true">
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

This list will be numbered c, b, a:

before

{type: a, reversed: true}
3. foo
2. bar
1. baz

after
<p>before</p>
<ol start="3" type="a" reversed="true">
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

This list will be numbered iii, ii, i:

before

{type: i, reversed: true}
3. foo
2. bar
1. baz

after
<p>before</p>
<ol start="3" type="i" reversed="true">
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

8.5.3Nested Lists without Attributes

By default, nested lists which do not have attributes function as though they have a type attribute whose value is determined by the context in which the list is found.

By default, in order, the delimiter chosen from outermost to innermost nested numbered lists are:

1., a., i., 1), a), i)

before

1. lorem
   1. foo
      1. apple
         1. Gala
         2. Spartan
         3. McIntosh
      2. orange
      3. banana
   2. bar
   3. baz
2. ipsum
3. dolor

after
<p>before</p>
<ol>
<li>lorem
<ol type="a">
<li>foo
<ol type="i">
<li>apple
<ol>
<li>Gala</li>
<li>Spartan</li>
<li>McIntosh</li>
</ol>
</li>
<li>orange</li>
<li>banana</li>
</ol>
</li>
<li>bar</li>
<li>baz</li>
</ol>
</li>
<li>ipsum</li>
<li>dolor</li>
</ol>
<p>after</p>

This nested list will be rendered somewhat like this:

before

1. lorem
   a. foo
        i. apple
           1) Gala
           2) Spartan
           3) McIntosh
       ii. orange
      iii. banana
   b. bar
   c. baz
2. ipsum
3. dolor

after

Note that as discussed earlier in section 8.5.1, Supported Attributes for Lists, the support of the right parenthesis for a delimiter varies based on the output format. In cases where it is not supported, a period will be used instead.

8.5.4Nested Lists with Attributes

Everything which has been said about attribute lists on lists applies to nested lists as well. This example shows a nested list of the form 1, a, i, * (the final nested list is an unordered list, not an ordered list):

before

1. lorem
   {type: a}
   1. foo
      {type: i}
      1. apple
         * Gala
         * Spartan
         * McIntosh
      2. orange
      3. banana
   2. bar
   3. baz
2. ipsum
3. dolor

after
<p>before</p>
<ol>
<li>lorem
<ol type="a">
<li>foo
<ol type="i">
<li>apple
<ul>
<li>Gala</li>
<li>Spartan</li>
<li>McIntosh</li>
</ul>
</li>
<li>orange</li>
<li>banana</li>
</ol>
</li>
<li>bar</li>
<li>baz</li>
</ol>
</li>
<li>ipsum</li>
<li>dolor</li>
</ol>
<p>after</p>

This nested list will be rendered somewhat like this:

before

1. lorem
   a. foo
        i. apple
           * Gala
           * Spartan
           * McIntosh
       ii. orange
      iii. banana
   b. bar
   c. baz
2. ipsum
3. dolor

after

8.5.5Lists with Attributes: Lazy Style

It is acceptable to just use the number of the first label, and rely on list behavior.

This list will be numbered 1, 2, 3:

before

1. foo
1. bar
1. baz

after
<p>before</p>
<ol>
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

This list will be numbered 3, 4, 5:

before

3. foo
3. bar
3. baz

after
<p>before</p>
<ol start="3">
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

This list will be numbered c, d, e:

before

{type: a}
3. foo
3. bar
3. baz

after
<p>before</p>
<ol start="3" type="a">
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

This list will be numbered iii, iv, v:

before

{type: i}
3. foo
3. bar
3. baz

after
<p>before</p>
<ol start="3" type="i">
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

This list will be numbered 3, 2, 1:

before

{reversed: true}
3. foo
3. bar
3. baz

after
<p>before</p>
<ol start="3" reversed="true">
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

8.5.6Lists with Attributes: Redundant Style

This list will be numbered 1, 2, 3. This is just the default behavior made explicit:

before

{type: 1, reversed: false}
1. foo
2. bar
3. baz

after
<p>before</p>
<ol>
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

8.5.7Lists with Attributes: Bad Style

This list will be numbered a, b, c, and will have the delimiter of .. The reason that this is bad style is that the specified delimiter and the inferred delimiter do not match:

before

{type: a, delimiter: "."}
1) foo
2) bar
3) baz

after
<p>before</p>
<ol type="a">
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

This list will be numbered a, b, c, and will have the delimiter of ) in output formats that support it (e.g. in PDF but not in HTML). The reason that this is bad style is that the specified delimiter and the inferred delimiter do not match:

before

{type: a, delimiter: ")"}
1. foo
2. bar
3. baz

after
<p>before</p>
<ol type="a">
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

This list will be numbered 3, 2, 1, not 3, 4, 5. Only the first number matters since the reversed attribute is set:

before

{reversed: true}
3. foo
4. bar
5. baz

after
<p>before</p>
<ol reversed="true">
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

This list will be numbered 1, 0, -1, not 1, 2, 3. Only the first number matters since the reversed attribute is set:

before

{reversed: true}
1. foo
2. bar
3. baz

after
<p>before</p>
<ol reversed="true">
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

This list will be numbered c, d, e, regardless of the order of the numbers:

before

{type: a}
3. foo
2. bar
1. baz

after
<p>before</p>
<ol start="3" type="a">
<li>three</li>
<li>bar</li>
<li>one</li>
</ol>
<p>after</p>

This list will be numbered 1, 0, -1:

before

{reversed: true}
1. foo
1. bar
1. baz

after
<p>before</p>
<ol reversed="true" start="1">
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

This list will be numbered i, 0, -1, since the Romans didn’t have the concept of zero or negative numbers (and thus neither do Roman numerals):

before

{type: i, reversed: true}
1. foo
1. bar
1. baz

after
<p>before</p>
<ol reversed="true" start="1" type="i">
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

This list will be numbered a, 0, -1, since there are no negative letters either:

before

{type: a, reversed: true}
1. foo
1. bar
1. baz

after
<p>before</p>
<ol reversed="true" start="1" type="a">
<li>foo</li>
<li>bar</li>
<li>baz</li>
</ol>
<p>after</p>

8.6Definition lists (M)

Definition lists are also supported in Markua. To define a definition list, use the following syntax:

term 1
: definition 1

term 2
: definition 2
<dl>
  <dt>term 1</dt>
  <dd>definition 1</dd>

  <dt>term 1</dt>
  <dd>definition 1</dd>
</dl>

The term portion of a definition list item may only contain simple inline content such as text, code spans, images (which are not figures), etc. It may not contain resources or arbitrary block content such as figures. The definition portion of a definition list item may contain arbitrary block content.

There can be one to three spaces after the colon, or one tab.

Note that in HTML the terms produce <dt> elements (for “description term”) and the definitions produce <dd> elements (for “description details”). Since it is not certain that the definition list contains definitions, and since it is desirable for the syntax to remain simple, the <dd> element does not contain a <dfn> element.

There can only be one term per definition, but there can be multiple definitions for a term:

term 1
: definition 1a
: definition 1b

term 2
: definition 2
<dl>
  <dt>term 1</dt>
  <dd>definition 1a</dd>
  <dd>definition 1b</dd>

  <dt>term 2</dt>
  <dd>definition 2</dd>
</dl>

A single term definition list is a definition list, regardless of how many definitions for the term exist:

term
: definition
<dl>
  <dt>term</dt>
  <dd>definition</dd>
</dl>

Finally, each definition list item can contain newlines and multiple paragraphs. What you do here is indent the subsequent lines by the same amount of space as the initial line.

(If you do not indent the subsequent lines, then you’re ending the definition list and just starting a new paragraph.)

before

one
: foo
: bar

  baz

two
: foo

after
<p>before</p>
<dl>
  <dt>one</dt>
  <dd>foo</dd>
  <dd>
    <p>bar</p>
    <p>baz</p>
  </dd>

  <dt>two</dt>
  <dd>foo</dd>
</dl>
<p>after</p>

Note that single line breaks within the definitions (the <dd> not <dt>) are handled the same way as single line breaks within paragraphs, and are subject to the soft-breaks document setting which defaults to space for compatibility with Markdown.

before

one
: lorem
  ipsum

  dolor

  sit amet

two
: foo

after
<p>before</p>
<dl>
  <dt>one</dt>
  <dd>
    <p>lorem
    ipsum</p>
    <p>dolor</p>
    <p>sit amet</p>
  </dd>

  <dt>two</dt>
  <dd>foo</dd>
</dl>
<p>after</p>

Finally, with definition lists, you need to link to a specific definition, not just to the list itself. This is useful to do, since in a document with many definitions, it’s helpful if the reader scrolls to the right spot or opens to the right page.

To do this, just define a span id on the element itself, and then link to it.

foo{#foo}
: This is foo.

bar{#bar}
: This is bar.
<dl>
  <dt><span id="foo">foo</span></dt>
  <dd>This is foo.</dd>

  <dt><span id="bar">bar</span></dt>
  <dd>This is bar.</dd>
</dl>

8.7Asides (M)

Since Markua is for writing books and courses, including technical books and courses, it needs not just a syntax for blockquotes–it also needs a syntax for asides and for blurbs.

These syntaxes are very similar to the Markua syntax for blockquotes. Like blockquotes, any headings inside asides or blurbs do not show up in a Table of Contents (if one is present), and they can also be formatted differently by Markua Processors.

We will consider asides first.

Asides are typically short or long notes in books which are tangential to the main idea–sort of like footnotes, but in the body text itself. In technical books, quite often they are formatted in a box. Asides can span multiple pages.

The syntaxes for asides are very similar to blockquotes:

  1. By prefacing lines with A> , i.e. an A, then a greater than character (>), then a space.
  2. By wrapping the aside in {aside}{/aside}

Option #1 is preferable for short asides; option #2 is easier on authors for really long asides.

For consistency with blockquotes, asides can be siblings of paragraphs or nested in them.

Here’s a short aside:

A> This is a short aside.
<aside>
<p>This is a short aside.</p>
</aside>

Here’s a longer aside, which also contains a heading:

A> # A Longer Aside
A>
A> This is a longer aside.
A>
A> It can have multiple paragraphs.
<aside>
<h1>A Longer Aside</h1>
<p>This is a longer aside.</p>
<p>It can have multiple paragraphs.</p>
</aside>

Here’s the same longer aside using the {aside} syntax:

{aside}
# A Longer Aside

This is a longer aside.

It can have multiple paragraphs.
{/aside}
<aside>
<h1>A Longer Aside</h1>
<p>This is a longer aside.</p>
<p>It can have multiple paragraphs.</p>
</aside>

8.8Blurbs (M)

Blurbs are like asides, but shorter. A blurb is not intended to span multiple pages of output.

The syntaxes for blurbs are very similar to asides:

  1. By prefacing lines with B> , i.e. a B, then a greater than character (>), then a space.
  2. By wrapping the blurb in {blurb}{/blurb}

Here’s a short blurb:

B> This is a short blurb.
<aside class="blurb">
<p>This is a short blurb.</p>
</aside>

Here’s a longer blurb, which also contains a heading:

B> # A Longer Blurb
B>
B> This is a longer blurb.
B>
B> It can have multiple paragraphs.
<aside class="blurb">
<h1>A Longer Blurb</h1>
<p>This is a longer blurb.</p>
<p>It can have multiple paragraphs.</p>
</aside>

Here’s the same longer blurb using the {blurb} syntax:

{blurb}
# A Longer Blurb

This is a longer blurb.

It can have multiple paragraphs.
{/blurb}
<aside class="blurb">
<h1>A Longer Blurb</h1>
<p>This is a longer blurb.</p>
<p>It can have multiple paragraphs.</p>
</aside>

8.8.1Blurb classes

Blurbs also have support for an attribute list, which can contain a class attribute.

Markua has its origin in authoring computer programming books. In computer programming books, there are a number of blurb types which are a de facto standard:

These blurb types can be set on a blurb as its class attribute. A Markua Processor can optionally style these blurbs appropriately based on the class, for example by adding custom icons for each class of blurb.

Here’s how this looks with the B> syntax:

{class: warning}
B> This is a warning!
<aside class="blurb warning">
<p>This is a warning!</p>
</aside>

Here’s how this looks with the {blurb} syntax:

{blurb, class: warning}
This is a warning!
{/blurb}
<aside class="blurb warning">
<p>This is a warning!</p>
</aside>

The attribute list must either directly precede the B> with no blank line between it and the B>, or it must be combined with the {blurb} block opening. It is NOT legal Markua syntax to have an attribute list preceding a {blurb} block opening like this:

{class: warning}
{blurb}
That is not legal Markua...

8.8.2Syntactic sugar for blurb classes

Having to constantly type {class: warning} in a computer programming book with a number of warnings would get tedious, as would any of the other blurb classes listed above.

So, Markua defines a standard shorthand syntax for these classes of blurbs. With this syntax, you use a different letter than B in the B>, to create a blurb with the appropriate class.

These are the syntactic sugar values you can use which have a heritage in computer programming books:

D> : {class: discussion}

E> : {class: error}

I> : {class: information}

Q> : {class: question}

T> : {class: tip}

W> : {class: warning}

X> : {class: exercise}

D> This is a discussion blurb.

{class: discussion}
B> This is a discussion blurb.

{blurb, class: discussion}
This is a discussion blurb.
{/blurb}
<aside class="blurb discussion">
<p>This is a discussion blurb.</p>
</aside>

<aside class="blurb discussion">
<p>This is a discussion blurb.</p>
</aside>

<aside class="blurb discussion">
<p>This is a discussion blurb.</p>
</aside>
E> This is an error blurb.

{class: error}
B> This is an error blurb.

{blurb, class: error}
This is an error blurb.
{/blurb}
<aside class="blurb error">
<p>This is an error blurb.</p>
</aside>

<aside class="blurb error">
<p>This is an error blurb.</p>
</aside>

<aside class="blurb error">
<p>This is an error blurb.</p>
</aside>
X> This is an exercise blurb.

{class: exercise}
B> This is an exercise blurb.

{blurb, class: exercise}
This is an exercise blurb.
{/blurb}
<aside class="blurb exercise">
<p>This is an exercise blurb.</p>
</aside>

<aside class="blurb exercise">
<p>This is an exercise blurb.</p>
</aside>

<aside class="blurb exercise">
<p>This is an exercise blurb.</p>
</aside>
I> This is an information blurb.

{class: information}
B> This is an information blurb.

{blurb, class: information}
This is an information blurb.
{/blurb}
<aside class="blurb information">
<p>This is an information blurb.</p>
</aside>

<aside class="blurb information">
<p>This is an information blurb.</p>
</aside>

<aside class="blurb information">
<p>This is an information blurb.</p>
</aside>
Q> This is a question blurb.

{class: question}
B> This is a question blurb.

{blurb, class: question}
This is a question blurb.
{/blurb}
<aside class="blurb question">
<p>This is a question blurb.</p>
</aside>

<aside class="blurb question">
<p>This is a question blurb.</p>
</aside>

<aside class="blurb question">
<p>This is a question blurb.</p>
</aside>
T> This is a tip blurb.

{class: tip}
B> This is a tip blurb.

{blurb, class: tip}
This is a tip blurb.
{/blurb}
<aside class="blurb tip">
<p>This is a tip blurb.</p>
</aside>

<aside class="blurb tip">
<p>This is a tip blurb.</p>
</aside>

<aside class="blurb tip">
<p>This is a tip blurb.</p>
</aside>
W> This is a warning blurb.

{class: warning}
B> This is a warning blurb.

{blurb, class: warning}
This is a warning blurb.
{/blurb}
<aside class="blurb warning">
<p>This is a warning blurb.</p>
</aside>

<aside class="blurb warning">
<p>This is a warning blurb.</p>
</aside>

<aside class="blurb warning">
<p>This is a warning blurb.</p>
</aside>

Note that Q> and X> are a bit confusing:

Also note that nothing in this section defines what a Markua Processor must do with the given class of blurb, in terms of formatting. Leanpub, for example, uses it to add an appropriate icon from Font Awesome at the left of the blurb, but other Markua Processors are free to do something different.

Finally, note that specifying a class in metadata overrides what the syntactic sugar does, and is also an error:

{class: tip}
W> This is a tip blurb, not a warning blurb.
<aside class="blurb tip">
<p>This is a tip blurb, not a warning blurb.</p>
</aside>

8.8.3Using blurbs to center text

You can also use a blurb to center text.

The following two ways to do this are equivalent:

C> This is a centered blurb.

{class: center}
B> This is a centered blurb.
<aside class="blurb center">
<p>This is a centered blurb.</p>
</aside>
<aside class="blurb center">
<p>This is a centered blurb.</p>
</aside>

This is the only way to center text in Markua.

Unlike other blurb types which have their origin in technical books, centering text has a wide range of uses. So, it could have been thought of as something different than a blurb. However, in terms of its behavior and the way it’s inserted, centered text is a blurb–whether it’s inserted via syntactic sugar or via a class attribute on a normal blurb element.

8.8.4Blurb icons with extension attributes

Markua Processors must ignore any attributes which they do not understand.

Because of this, Markua attribute lists can contain any number of extension attributes. An extension attribute is an attribute which is not defined in the Markua specification, but which is understood by some specific Markua Processor.

As an example of an extension attribute, Leanpub adds an icon attribute to blurbs. Markua does not specify that a blurb must support an icon attribute, or what it would mean if it did. However, Leanpub understands an icon attribute to reference an icon from Font Awesome. The value of this attribute is assumed to be the name of an icon in Font Awesome, without the fa- prefix. So, in any Markua Processor, you can do this:

{icon: car}
B> You can't spell carbon without it!

{icon: leanpub}
B> Yes, we're in Font Awesome!

{icon: github}
B> So is GitHub, of course. Unicorns.

In Leanpub, this will produce a nice icon of a car, using the Font Awesome icon. In a Markua implementation that does not understand the icon attribute, nothing will be generated for that attribute–it will be functionally equivalent to the attribute not being present.

9Quizzes and exercises (M)

Markua was initially designed for the writing of books, but it has been extended to support creating courses. For example, Leanpub authors can click a button to create an online course, complete with automated marking, entirely from a Markua document.

To convert a Markua document from a book into a course, all you need to do is add quizzes and exercises.

Over the past decade, there has been a steady growth of interest in courses delivered over the internet at massive scale. These courses consist of essentially four things:

  1. Reading material
  2. Video or audio lectures
  3. Exercises, with answers provided to the student
  4. Quizzes, with answers used to automatically mark the quiz

It turns out the four things in this list all work perfectly in a Markua document. So, not only can Markua be used to easily create a textbook which includes video, audio, images and quizzes, it is also an amazingly simple and flexible way of creating an online course. An online course is essentially just a textbook which is executable, plus discussion forums and credentials. For example, Leanpub authors can click one button to create a course, complete with automated marking for all the quizzes in the course, entirely from one Markua document.

Quizzes and exercises are essentially the same. The only difference is that quizzes are intended to be marked, and exercises are not. Because of their similarities, they are discussed here together.

Quizzes or exercises in a textbook consist of two things:

  1. Questions, typically in the chapter itself.
  2. Answers, typically at the back of the book.

The questions in the chapter essentially are placed there like any other block element, such as an aside or blurb. The answers are positioned at the back of the book, along with other elements like the index and appendices. The specific location that they are positioned can be controlled by the author using insert directives, discussed earlier.

There is only one syntax to create a quiz or exercise. For a quiz, it’s by wrapping the quiz in {quiz}{/quiz}; for an exercise, it’s by wrapping the exercise in {exercise}{/exercise}.

Here is a brief example of a quiz:

{quiz, id: quiz1}
? How many letters are in the word Markua?

a) 5
B) 6
c) 7

? How many unique letters are in the word Markua?

! 5
{/quiz}

This quiz contains two questions: a multiple-choice question where the correct answer is b, and a fill-in-the-blank question where the correct answer is 5. Quizzes and exercises have the same question types, discussed later.

With a quiz, the id attribute is required. This is so the identity of a quiz can be preserved across generations of a course.

Here is the same example, but as an exercise:

{exercise, id: exercise1}
? How many letters are in the word Markua?

a) 5
B) 6
c) 7

? How many unique letters are in the word Markua?

! 5
{/exercise}

Just like with quiz, with an exercise the id attribute is required. This is so the identity of an exercise can be preserved across generations of a course.

A quiz or exercise can contain any Markua content, not just questions and answers. This is true regardless of whether the quiz or exercise is in an online course, an ebook or on paper. Note that video and audio resources don’t work so well on paper, however.

If a quiz or exercise starts with any type of heading immediately after the {quiz} or {exercise} line, this heading’s content should be considered the name of the quiz or exercise. This can be used in a list of quizzes or exercises produced by the Markua Processor. Typically the heading will be a chapter heading (# ), but section headings (## ) and lower headings also are supported. (The reason for this is that quizzes are sometimes top-level things, and other times are nested inside chapters, sections or sub-sections. Some course authors would correctly feel that the quiz should have the appropriate level of heading given their position in the document.

Example:

{quiz, id: quiz2}
# Markua Quiz

Watch this [video](https://www.youtube.com/watch?time_continue=1&v=VOCYL-FNbr0)
of Peter explaining Markua.

? What year was that video from?

What year? Really? Did it really take that long? What was going on???

a) 2012
b) 2013
C) 2014
d) 2015

{words: 500}
? Why do you think the first version of the Markua spec took so long?

Look at the Leanpub [website](https://leanpub.com/).

! Answers could include "bootstrapped startup", the spec evolving, etc.

That's it for this quiz, and this course!

**Thanks for taking my course!**
{/quiz}

There are four types of questions supported by Markua.

  1. Multiple Choice
  2. Multiple Selection
  3. Fill In The Blank
  4. Written

These types are not specified by a {type} attribute. Instead, they are inferred from properties of the answers or from other attributes of the question.

The exact way to create these types of questions is discussed further below.

9.1Quiz and exercise output (M)

A Markua Processor has many degrees of freedom in terms of whether, and how, to output quizzes and exercises. As such, no HTML mapping is shown here.

For example, when outputting an online course, the Markua Processor can basically do whatever it wants. It can choose to output an ebook of the course material only, and put all quizzes and exercises only in the online version. Or it can choose to put all the quizzes only in the online version, and include the exercises in the ebook version as well. Or it can include the quizzes and exercises in the ebook version, but only include the answers for the exercises in the ebook version.

If, on the other hand, a Markua Processor is outputting a textbook, it may choose to output the quizzes in an entirely separate ebook or print-ready PDF, for use in physical classrooms.

If a Markua Processor does output a quiz or exercise, it needs to do so in a medium-appropriate way. This includes outputting multiple-choice questions without showing their answers, of course. The display of fill in the blank and essay questions varies greatly based on whether the question is in an online quiz or exercise, in an ebook or on paper.

Markua Processors are encouraged to be creative here.

However, there are some rules. If a given quiz or exercise is output by a Markua Processor in an ebook or a physical book, the following things must occur:

  1. The quiz or exercise must be output as questions-only, in the place in the document where the quiz was defined.
  2. The questions must be numbered sequentially, incrementing by 1 for each question.
  3. The questions must be numbered starting from the number specified by the start-at attribute if specified, or 1 if no start-at attribute is specified.
  4. The multiple choice options in any question must be converted into a set of choices which all look the same, so that the correct answer is not indicated. Converting all choice letters to lowercase is sufficient here.
  5. The answers, if provided, must be positioned somewhere separate from the questions, typically at the back of the book. This position can be controlled by the author using insert directives, discussed earlier.

9.2Quiz and exercise validation (M)

9.2.1An empty quiz or exercise is not an error

A quiz or exercise which contains no questions is not an error. Instead, a if a Markua Processor encounters a quiz or exercise with no questions it must filter the quiz or exercise from the output, optionally providing a warning to the author.

This lets authors create placeholders for quizzes or exercises in their courses before the quizzes or exercises are ready, which is potentially very useful in an in-progress course.

9.2.2A malformed quiz or exercise is an error

If a Markua Processor encounters a malformed quiz or exercise it must treat this as an error and not generate the output from the Markua document. Quizzes and exercises are not something that should ever be produced in a broken state.

However, it is also an error to parse quiz syntax outside a quiz or exercise block. A Markua Processor must not parse lines starting with ? or ! as representing questions or answers unless those are contained in a quiz or exercise block.

9.3Quiz and exercise attributes (M)

These are the supported attributes on quizzes and/or exercises:

attempts : The number of allowed attempts on a quiz. The default is defined by the value of default-quiz-attempts on the containing course, or 1 if this is not present. A value of 0 means the quiz cannot be taken (yet). A value of -1 means the quiz has an unlimited number of attempts. Since an exercise does not count toward the mark on a course, an exercise always has an unlimited number of attempts.

auto-submit : true or false. The default is true. If true, an incomplete quiz is submitted when the time-limit is expired. If false, it is not. Either way, an incomplete quiz counts as an attempt.

case-sensitive : true or false. The default is true. This sets the default behavior of fill in the blank questions. If true, the fill in the blank question answers are case-sensitive. If false, they are not.

id : All Markua elements support an id attribute. The reason the id attribute is explicitly listed here is to emphasize that a Markua Processor may require an id attribute on a quiz or exercise. For example, Leanpub requires the id attribute on all quizzes, in order to determine the identity of quizzes when a course is being published in-progress. (As a student, you’d be pretty unhappy if you had to re-take an unchanged quiz simply because a professor published a new course version.)

mark-request-url : If omitted, all the quiz or exercise answers are defined in the Markua document. If present, some or all of the quiz or exercise questions are externally marked via an API defined at the URL specified in this attribute. When the quiz or exercise is completed, the entire quiz or exercise should be sent to the endpoint at the mark-request-url. Here’s how this works in Leanpub; other Markua Processors should presumably do something similar. When a quiz or exercise with this attribute defined is completed, its state is set to “pending”. Leanpub then sends the quiz or exercise data as a JSON payload to the marking endpoint. This JSON contains the following attributes: mark-response-url (which defines where responses are sent), quiz_id, quiz_version, quiz_hash (an SHA hash, for an automated quiz versioning approach which does not rely on the course author updating a version attribute), quiz_results (an array of question data structures containing the question, the possible answers, the correct answer according to the Markua document, the answer provided by the student, and (if the question is markable) the mark determined automatically). The API endpoint should send results to the mark-response-url. (Even though exercises are not worth points in a course, they can be marked, for the benefit of the student. However, the expectation is that this attribute will be used primarily by quizzes.) The mark-response-url in the payload contains URL to respond to with the results of the marking the quiz or exercise. The format of the expected payload should be defined by the Markua Processor. For Leanpub this is a JSON payload containing the question ids and the marks for each question. The mark is a decimal number of points between 0 and the maximum number of points for the question, inclusive. The decimal supports two decimal places, so you can get 3.75 out of 4, for example. The mark response must include marks for all questions that are unmarked. It may also include marks for questions that Leanpub marked, and those will override Leanpub’s marks. Finally, note that even though exercises are not worth points in a course, they can be marked, for the benefit of the student. So, since they can be marked, they can also be externally-marked via a mark-request-url. However, the expectation is that this attribute will be used primarily by quizzes.

points : If present, this is the total number of points the quiz or exercise is worth. (This really only matters for quizzes, but is supported for exercises as well, in case a Markua Processor wishes to display the points on exercises to make them feel more real.) If points is not present, the worth of the quiz is determined by summing the points of the questions. (Questions are worth 1 point each if they have no points attribute.) If the quiz has a points attribute and its questions also have points attributes, the worth of each question in a larger course context is determined as follows: its points are the numerator, and the total points in the quiz or exercise is the denominator.

random-choice-order : true or false. The default is false. This sets the default behavior of multiple choice questions. If true, the choices in the multiple choice question are randomly arranged; if false, they are presented in the order written.

random-question-order : true or false. The default is false. This sets the default behavior of the quiz or exercise. If true, the questions are randomly arranged; if false, they are presented in the order written.

start-at : The start-at is the number of the first question. The default is 1. Any integer is permitted. Subsequent questions will have a number which is 1 higher than the previous question.

show-answers : This can be all, incorrect or none. It affects how answers are shown after a quiz or exercise is completed, say in a course. For exercises, the default value is determined by the value of the default-exercise-show-answers document setting, with all being the default value of that attribute. For quizzes, the default value is determined by the value of the default-quiz-show-answers document setting, with incorrect being the default value of that attribute. Document settings are discussed later.

time-limit : The time limit to finish the quiz, once started. The format is XdYhZm. For example, 3 days, 6 hours and 45 minutes is expressed as 3d6h45m; 7 days is expressed as 7d. The default is 7d.

use-result : best or latest. Whether the best result on the quiz is used, or the latest one. The default is the value of default-quiz-use-result on the course.

version : The version of the quiz. This does not replace the function of the id; it’s more for use in analytics by the instructor. The default is 1.

As discussed above, there is no title or title attribute for a quiz–you can just add a heading inside the quiz or exercise itself, using the normal Markua formatting for a chapter heading.

9.4Multiple choice questions (M)

A multiple choice question has 2 or more answer choices, and 1 correct answer choice.

The correct answer choice is in capital letters before the parentheses; incorrect answer choices have lowercase letters before the parentheses.

Example:

? How many letters are in the word Markua?

a) 5
B) 6
c) 7

Obviously, when generating the question in the actual quiz or exercise, a Markua Processor must make all answer choices have the same type of letter. This is usually a lowercase letter, although either all lowercase or all uppercase letters would be fine.

Unless a choose-answers attribute is used, the multiple choice answers all must start from a or A, and must use a right-parenthesis after the a or A. Any line starting with a) ar A) in a quiz is considered a set of multiple choice quiz answers, not an ordered list using a) or A) as a delimiter. If you want to put an ordered list in a quiz body, use periods for the delimiter.

A multiple choice question may also have a dynamic number of answer choices, including for the correct answer. This done with the special choose-answers attribute, shown and explained below.

{choose-answers: 4}
? How many grams are in a pound?

C) 454
C) 453
m) 451
m) 1000
o) 100
o) 150
o) 200
o) 250
o) 300
o) 500

The choose-answers attribute specifies how many answer choices should be shown. This includes exactly one of the correct answers (indicated with C), all of the mandatory incorrect answers (indicated with m) and as many of the optional incorrect answers (indicated with o) as are needed for the question to have the total number of answers as indicated by the choose-answers attribute.

So, in the above example, either 453 or 454 will be shown, along with the mandatory incorrect answer choices 451 (a literary joke) and 1000 (a kilogram, not a pound) and one of the optional incorrect answers (100, 150, 200, 250, 300 or 500).

When a choose-answers attribute is used, the question will always have random-choice-order.

The following are errors in a question where a choose-answers attribute is used:

9.4.1Supported attributes on multiple choice questions

choose-answers : This is described above. If choose-answers is used, random-choice-order is forced to true.

points : The number of points the question is worth. This number can be 0 or higher. The default is 1.

random-choice-order : true or false. The default is false, unless choose-answers is used. This sets the behavior of the specific multiple choice question. If true, the choices in the multiple choice question are randomly arranged; if false, they are presented in the order written. If this attribute is omitted, its value is determined by the random-choice-order attribute on the quiz itself, which defaults to false if absent.

9.5Multiple selection questions (M)

A multiple selection question has 2 or more answer choices, and any number of true answer choices.

(Leanpub authors: This question type is not yet supported in Leanpub.)

The true answer choices are specified with T; false answer choices are specified with F.

To get the question 100% correct, you must select ALL answers which are true (not just one of them) and NONE of the answers which are false. As discussed later, the mark attribute determines whether the marking is all-or-nothing (with the binary value) or whether a partial credit can be earned (with the ratio) value.

Regardless of the value of the mark attribute, to get any score for a multiple selection question, at least one answer choice must be selected. Leaving a multiple selection question blank always gets zero points, even if the mark attribute is ratio. Otherwise, if a test had nothing but multiple selection questions with ratio-marked questions, with half the answers true and half the answers false, you could get 50% on the test by just handing it in blank!

9.5.1Supported attributes on multiple selection questions

mark : binary or ratio. The default is binary. This sets the marking behavior of the specific multiple selection question. If binary, the question is all or nothing: you either get full points for the question, or a 0 on the question. To get full points, you must select every true choice and none of the false choices. If ratio, you get a fractional score defined as (number of true choices selected + number of false choices not selected) / (total number of choices). This fraction is then multiplied by the points that the question is worth to compute the points that you received on the question. Note that regardless of whether the mark is binary or ratio, a question which is left blank is always completely wrong, earning no credit for the correctly-not-selected false choices.

points : The number of points the question is worth. This number can be 0 or higher. The default is 1. The mark attribute is used to determine how the points are computed. Again, regardless of whether the mark is binary or ratio, a question which is left blank always earns 0 points.

random-choice-order : true or false. The default is false. This sets the behavior of the specific multiple selection question. If true, the choices in the multiple choice question are randomly arranged; if false, they are presented in the order written. If this attribute is omitted, its value is determined by the random-choice-order attribute on the quiz itself, which defaults to false if absent.

9.5.2Examples of multiple selection questions

? Which of these are animals?

F) apple
T) cow
F) lettuce

? Which of these are animals?

F) apple
F) lettuce

? Which of these are animals?

T) cat
T) dog

{mark: ratio, points: 2}
? Which of these are fruits?

T) apple
F) cow
F) lettuce
T) peach
T) pear
F) rice

Here the first question defaults to binary and 1 point; the second question is ratio and 2 points.

To get the first question correct, you must select cow, and neither of apple or lettuce.

To get the second question correct, you must select nothing. This means that by default someone who is skipping the question gets it right, which some instructors may consider suboptimal. However, other instructors may consider it amusing or philosophically interesting.

To get the third question correct, you must select everything. This is equally controversial, but also valid.

To get this last question 100% correct, you must select all of apple, peach and pear, and none of cow, lettuce or rice. For this question, if the mark attribute had been binary, the only way to get any points for it would be to select apple, peach and pear, and to leave cow, lettuce and rice all unselected. However, the mark is not binary; it is ratio. So, if you selected apple, peach and rice, then the score would be computed as follows: ((2 correctly-selected true choices + 2 correctly-not-selected false choices) / 6 choices) * 2 points = (4/6) * 2 points = 1.5 points. Since at least one choice was selected, you get just as much credit for correctly not selecting cow and lettuce as you do for correctly selecting apple and peach. If an instructor thinks this is too lenient, then binary is the choice for them!

9.6Fill in the blank questions (M)

A fill in the blank question consists of a question and a set of answers. Each answer is specified by !, an optional points value, a space, and then a semicolon-separated list of the acceptable values of that answer. Each answer value can be a text string (quoted or not) or a regular expression (regex). If a points value is not specified for an answer, the answer is worth full points.

Support for regular expression answer values is optional. However, a Markua Processor which supports regular expression marking must interpret any answer which starts with a forward slash (/) and ends with a forward slash followed by some word characters (e.g. i) as being a regular expression. Note that the particular format of the regular expression used is implementation-specific. For example, Leanpub uses Ruby regular expressions. Other Markua Processors could, for example, use Perl or JavaScript regular expressions.

Finally, note that you can separate regular expressions with semicolons, just like any other answer value. There’s no reason not to support this, and it may lead to simpler regular expressions. However, if you’re good at regular expressions, you can also combine them into one regular expression, of course.

Note that since a semicolon is used to separate answer values, to provide an actual semicolon as part of an answer value you must either put the answer value in quotes, use a backslash-escape \; or make the semicolon part of a regular expression.

Examples:

? How many unique letters are in the word Markua?

! 5

? What's the global capital of investment banking?

! New York ; London

? What's the global capital of investment banking?

! "New York" ; "London"

? What's the global capital of investment banking?

! New York
! London

? What's the global capital of investment banking?

! "New York"
! "London"

{case-sensitive: false}
? What's pi?

! "The ratio of a circle's circumference to its diameter" ; 3.14 ... 3.1416 ;
an irrational number

{case-sensitive: false}
? What's pi?

! "The ratio of a circle's circumference to its diameter"
! 3.14 ... 3.1416
! an irrational number

? Where's the Eiffel Tower?

! /(Paris|France)/i

? Where's the Eiffel Tower?

! /Paris/i ; /France/i

{points: 2}
? Where's the Eiffel Tower?

! /Paris/i
! /France/i

{points: 2}
? Where's the Eiffel Tower?

!2 /Paris/
!1 /paris/i
!.5 /France/i

{points: 2}
? Where's the Eiffel Tower?

! /Paris/
!1 /paris/i
!.5 /France/i

{points: 2}
? Where's the global capital of investment banking?

!2 New York ; London
!1 USA ; UK

As shown by the answer ("The ratio of a circle's circumference to its diameter" ; 3.14 ... 3.1416; an irrational number), acceptable answer values in a fill in the blank question can be of completely different types, and numeric answer values can be expressed as ranges (min <= x <= max), expressed as min ... max. Also, this answer shows that quotes are optional around text strings. The reason to use quotes is for clarity, or to ensure that any semicolons used are treated as semicolons instead of as answer choice delimiters. Semicolons inside quotes are just semicolons and do not need to be backslash-escaped. You do, however, need to backslash-escape a quote if you want it to be treated as a literal quote, instead of the start or end of a string.

If there are multiple answers to a fill in the blank question and an answer key is being output by the Markua Processor for use by human markers, the acceptable values should be clearly distinguished from each other. The recommendation is to use an unordered list of acceptable values, one per line, but there are no requirements here.

If you’re a programmer, you may wonder what number types and formats are supported. Are they integers? Floating point numbers? Can you use scientific notation? And how are the numbers formatted? Are they US or Canadian numbers (123,456.78) or EU numbers (123 456,78)? What if there’s a number that means something different in the US and EU–does 123,456 mean 123456 or 123.456?

These questions are made worse by the fact that Markua can be used to create human-marked paper quizzes as well as automatically-marked courses. So, specifying rules which made the syntax unambiguous for online courses with automated marking would mean that the answers for human-marked paper quizzes could not be specified with a tolerable amount of ambiguity. So, the format of the answer values in a fill in the blank question is out of scope of the Markua Spec.

9.6.1Supported attributes on fill in the blank questions

points : The number of points the question is worth. This number can be 0 or higher. The default is the 1. The answers must either not specify points (in which case they are worth the full value of points that the question is worth), or they must specify points between 0 and the points value.

case-sensitive : true or false. The default is true. This sets the behavior of the specific fill in the blank question. If true, the fill in the blank question answer is case-sensitive. If false, it is not. In the case of multiple acceptable answer values, this attribute applies to all of them. Note that this only applies to text string answers, not to regular expressions. For a regular expression to be case-insensitive, you must end it with an i after the closing backtick.

9.7Written questions (M)

A written question corresponds to short answer, long answer or essay questions in traditional tests. It looks the same as a fill in the blank question, except the answer is optional.

A question is interpreted as a written question if either the words, lines or pages attribute is specified, if there is no answer provided, or if an answer is provided using the {answer} syntax discussed shortly.

Note that an answer may be provided in a written question. If this is done, the answer is not split into answer choices and values like a fill in the blank answer. Instead, the answer is essentially a “note to markers”, whether those markers are underpaid graduate students or unpaid AIs. Markua does not specify any microformat for this note to markers: it is just Markua text, kind of like a blockquote (but with each line starting with ! not >).

A Markua Processor generating an online course may handle written questions at its own discretion, including not including them or giving them a points value of 0.

Examples:

{quiz, id: "quiz3"}

{pages: 2, points: 10}
? Why is 2019 like *1984*?

! If the student mentions Newspeak, give them an extra mark.

{words: 100, points: 2}
? Why is doubleplusungood the worst?

{lines: 10}
? Can you have multiple line answers in a written question?

! You bet you can!
!
! You can separate them with blank lines,
! and without blank lines.
!
! This is like a blockquote, which uses `>` characters.

Since a written question can have long answers, this increases the likelihood that typing these answers after a bunch of ! delimiters will be a pain. So, written questions (and only written questions) also support answers in the form of {answer} ... {/answer}, like this:

{quiz, points: 0, id: "quiz3"}

{lines: 10}
? Write a function in Ruby that takes an argument and returns three times that
argument.

{answer}
The most straightforward way to do this is like this, since the last value
evaluated in the function is returned:

```ruby
def triple(y)
  y*3
end
```

However, you can also explicitly say `return` if you like:

```ruby
def triple(y)
  return y*3
end
```
{/answer}
{/quiz}

9.7.1Supported attributes on written questions

Note that only one of words, lines or pages may be provided. Providing more than one is an error. Providing none of them means that the question is a fill in the blank question, not a written question.

points : The number of points the question is worth. This number can be 0 or higher. The default is 1. A Markua Processor generating a course may override this. For example, when generating a course, Leanpub overrides all written questions to have a points attribute of 0, regardless of what (if anything) is specified for this attribute by the author.

words : The maximum number of words the answer can be. Obviously this is more useful for online quizzes than paper ones.

lines : The maximum number of lines the answer can be. Obviously this is more useful for paper quizzes than online ones. On paper, this is most useful for a short or long answer question.

pages : The maximum number of pages the answer can be. Obviously this is more useful for paper quizzes than online ones. On paper, this is most useful for an essay question.

9.8Hints on questions (M)

Any question can contain a hint, regardless of question type or whether an answer is provided.

The hint starts on a line with a percent sign (%) followed by a single space. It must follow the question, but it can come either before or after the answer choices.

Hints can span multiple lines, if each line starts with a % sign. This is similar to Markdown handles block quotes with >.

The hint can be used by a Markua processor to show to students at appropriate times, such as when they get the answer wrong or leave it blank. The exact specifics are implementation-dependent. Hint support is optional in a Markua Processor. However, if it’s not supported, it must be ignored.

? What's 1 + 2?

% In a multiple choice question, if you're not sure, `c` is usually a good guess.

a) 1
b) 2
C) 3
d) 4

? Explain the meaning of *The Myth of Sisyphus*.

% Not the actual myth, the essay by Camus.
%
% For bonus points, speculate about Camus' life, death, and what he would have
% thought about autonomous cars.

? Where's the Eiffel Tower?

! /(Paris|France)/i

% The answer must contain the city and/or country.

The hint can be a useful feature for a Markua Processor. There is a lot of discretion here for Markua Processors to compete on features.

For example, here’s how Leanpub handles hints:

  1. We will show the hint as a popup next to the question in the web and mobile quiz views.
  2. We will track if the student looked at the hint.
  3. We will not subtract points for looking at hints for Leanpub-marked questions.
  4. Whether the hint was used will be provided in the analytics, such as a CSV download that we make available to professors.
  5. We will also output the hints for exercises in the “material book” for a course, in a section near the end of the book. This will be linked to via crosslinks from each exercise.
  6. The answers page for a quiz or exercise will show the question, hint and answers for each question.

9.9Question alternates (M)

The fact that a Markua document can be used to create an online course means that certain aspects of the syntax for quizzes and exercises are more robust than they would otherwise. One example of this is question alternates.

In an online course, some professors might not want every question the same, despite the fact that question order and answer order can be randomized. So, Markua supports question alternates, using a simple (if slightly ugly) syntax. Question alternates are only supported in quizzes, since they make no sense to include in exercises.

To create question alternates, every question in the quiz (not just those with alternates) must be numbered sequentially, starting from 1, using a ?# syntax. This is a question mark followed by the number of the question, e.g. ?1, ?2, ?3. The questions in a quiz are numbered using sequential positive integers starting from 1: 1, 2, 3, etc.

The alternates are specified by providing the same number for multiple questions, e.g. ?1, ?1, ?1, ?2, ?3, ?4, ?4, ?5. When the actual quiz is given, only one of the questions for the given question number is used.

Note that only the first question with a given number may have a points attribute–since all other alternates must use the same points value, specifying it would be pointless.

The following is an example of a quiz which uses question alternates. This ensures that to ensure that students get randomly selected versions of questions 1 and 4. Also, since random-question-order: true is used, the actual position of the questions is randomized after the specific questions are selected from the alternates.

{quiz, id: "midterm", random-question-order: true}
?1 What's 2 + 2?

! 4

?1 What's 2.2 + 2.2?

! 4.4

?2 what's 3 + 3?

! 6

?3 What's 4 + 4?

! 8

{points: 2}
?4 What's 5 + 5?

! 10

?4 What's 6 + 6?

! 12

?5 What's 7 + 7?

! 14
{/quiz}

Note that the syntax for question alternates is very strict. Every question must have a number, and these numbers must be in ascending order (except for the alternates, which have the same number as each other).

Question alternates can also be grouped by a choose-questions attribute attached to the first question alternate. In this case, the Markua Processor must choose the number of questions m specified from the given alternates with that number n, or n choose m. Note that in this scenario, the numbering after the alternates increases by m: for example, if a quiz starts with a choose: 3, the next question is numbered 4, not 2. This ensures that the person constructing the quiz knows what they are doing, and saves them from having to keep track in a scenario where there are multiple questions with a choose-questions attribute.

{quiz, id: "midterm", random-question-order: true}

{choose-questions: 3}
?1 What's 2 + 2?

! 4

?1 What's 2.2 + 2.2?

! 4.4

?1 what's 3 + 3?

! 6

?1 What's 4 + 4?

! 8

?1 What's 5 + 5?

! 10

{points: 2}
?4 What's 6 + 6?

! 12

?5 What's 7 + 7?

! 14
{/quiz}

A Markua Processor must treat any error in the numbering of question alternates (and the questions which follow) as an error, and not generate the quiz if there is any error. This is preferable to a Markua Processor of trying to guess at what the author meant, and trying to do the right thing. Fixing a syntax error takes a couple minutes of editing and a few minutes to publish the book or course again. However, fixing the consequences of a quiz being administered to hundreds–or thousands, or tens of thousands–of people with an incorrect number of questions, or with questions incorrectly used as alternates for each other, would be much more difficult.

10Inlines

Inlines are parsed sequentially from the beginning of the character stream to the end (left to right, in left-to-right languages). Thus, for example, in

Example 370
`hi`lo`
<p><code>hi</code>lo`</p>

hi is parsed as code, leaving the backtick at the end as a literal backtick.

10.1Code spans

A backtick string is a string of one or more backtick characters (`) that is neither preceded nor followed by a backtick.

A code span begins with a backtick string and ends with a backtick string of equal length. The contents of the code span are the characters between these two backtick strings, normalized in the following ways:

This is a simple code span:

Example 371
`foo`
<p><code>foo</code></p>

Here two backticks are used, because the code contains a backtick. This example also illustrates stripping of a single leading and trailing space:

Example 372
`` foo ` bar ``
<p><code>foo ` bar</code></p>

This example shows the motivation for stripping leading and trailing spaces:

Example 373
` `` `
<p><code>``</code></p>

Note that only one space is stripped:

Example 374
`  ``  `
<p><code> `` </code></p>

The stripping only happens if the space is on both sides of the string:

Example 375
` a`
<p><code> a</code></p>

Only spaces, and not unicode whitespace in general, are stripped in this way:

Example 376
` b `
<p><code> b </code></p>

No stripping occurs if the code span contains only spaces:

Example 377
` `
`  `
<p><code> </code>
<code>  </code></p>

Line endings are treated like spaces:

Example 378
``
foo
bar  
baz
``
<p><code>foo bar   baz</code></p>
Example 379
``
foo 
``
<p><code>foo </code></p>

Interior spaces are not collapsed:

Example 380
`foo   bar 
baz`
<p><code>foo   bar  baz</code></p>

Note that browsers will typically collapse consecutive spaces when rendering <code> elements, so it is recommended that the following CSS be used:

code{white-space: pre-wrap;}

Note that backslash escapes do not work in code spans. All backslashes are treated literally:

Example 381
`foo\`bar`
<p><code>foo\</code>bar`</p>

Backslash escapes are never needed, because one can always choose a string of n backtick characters as delimiters, where the code does not contain any strings of exactly n backtick characters.

Example 382
``foo`bar``
<p><code>foo`bar</code></p>
Example 383
` foo `` bar `
<p><code>foo `` bar</code></p>

Code span backticks have higher precedence than any other inline constructs except HTML tags and autolinks. Thus, for example, this is not parsed as emphasized text, since the second * is part of a code span:

Example 384
*foo`*`
<p>*foo<code>*</code></p>

And this is not parsed as a link:

Example 385
[not a `link](/foo`)
<p>[not a <code>link](/foo</code>)</p>

Code spans, HTML tags, and autolinks have the same precedence. Thus, this is code:

Example 386
`<a href="`">`
<p><code>&lt;a href=&quot;</code>&quot;&gt;`</p>

But this is an HTML tag:

Example 387
<a href="`">`
<p><a href="`">`</p>

And this is code:

Example 388
`<https://foo.bar.`baz>`
<p><code>&lt;https://foo.bar.</code>baz&gt;`</p>

But this is an autolink:

Example 389
<https://foo.bar.`baz>`
<p><a href="https://foo.bar.%60baz">https://foo.bar.`baz</a>`</p>

When a backtick string is not closed by a matching backtick string, we just have literal backticks:

Example 390
```foo``
<p>```foo``</p>
Example 391
`foo
<p>`foo</p>

The following case also illustrates the need for opening and closing backtick strings to be equal in length:

Example 392
`foo``bar``
<p>`foo<code>bar</code></p>

10.2Emphasis and strong emphasis

John Gruber’s original Markdown syntax description says:

Markdown treats asterisks (*) and underscores (_) as indicators of emphasis. Text wrapped with one * or _ will be wrapped with an HTML <em> tag; double *’s or _’s will be wrapped with an HTML <strong> tag.

This is enough for most users, but these rules leave much undecided, especially when it comes to nested emphasis. The original Markdown.pl test suite makes it clear that triple *** and ___ delimiters can be used for strong emphasis, and most implementations have also allowed the following patterns:

***strong emph***
***strong** in emph*
***emph* in strong**
**in strong *emph***
*in emph **strong***

The following patterns are less widely supported, but the intent is clear and they are useful (especially in contexts like bibliography entries):

*emph *with emph* in it*
**strong **with strong** in it**

Many implementations have also restricted intraword emphasis to the * forms, to avoid unwanted emphasis in words containing internal underscores. (It is best practice to put these in code spans, but users often do not.)

internal emphasis: foo*bar*baz
no emphasis: foo_bar_baz

The rules given below capture all of these patterns, while allowing for efficient parsing strategies that do not backtrack.

First, some definitions. A delimiter run is either a sequence of one or more * characters that is not preceded or followed by a non-backslash-escaped * character, or a sequence of one or more _ characters that is not preceded or followed by a non-backslash-escaped _ character.

A left-flanking delimiter run is a delimiter run that is (1) not followed by Unicode whitespace, and either (2a) not followed by a Unicode punctuation character, or (2b) followed by a Unicode punctuation character and preceded by Unicode whitespace or a Unicode punctuation character. For purposes of this definition, the beginning and the end of the line count as Unicode whitespace.

A right-flanking delimiter run is a delimiter run that is (1) not preceded by Unicode whitespace, and either (2a) not preceded by a Unicode punctuation character, or (2b) preceded by a Unicode punctuation character and followed by Unicode whitespace or a Unicode punctuation character. For purposes of this definition, the beginning and the end of the line count as Unicode whitespace.

Here are some examples of delimiter runs.

(The idea of distinguishing left-flanking and right-flanking delimiter runs based on the character before and the character after comes from Roopesh Chander’s vfmd. vfmd uses the terminology “emphasis indicator string” instead of “delimiter run,” and its rules for distinguishing left- and right-flanking runs are a bit more complex than the ones given here.)

The following rules define emphasis and strong emphasis:

  1. A single * character can open emphasis iff (if and only if) it is part of a left-flanking delimiter run.

  2. A single _ character can open emphasis iff it is part of a left-flanking delimiter run and either (a) not part of a right-flanking delimiter run or (b) part of a right-flanking delimiter run preceded by a Unicode punctuation character.

  3. A single * character can close emphasis iff it is part of a right-flanking delimiter run.

  4. A single _ character can close emphasis iff it is part of a right-flanking delimiter run and either (a) not part of a left-flanking delimiter run or (b) part of a left-flanking delimiter run followed by a Unicode punctuation character.

  5. A double ** can open strong emphasis iff it is part of a left-flanking delimiter run.

  6. A double __ can open strong emphasis iff it is part of a left-flanking delimiter run and either (a) not part of a right-flanking delimiter run or (b) part of a right-flanking delimiter run preceded by a Unicode punctuation character.

  7. A double ** can close strong emphasis iff it is part of a right-flanking delimiter run.

  8. A double __ can close strong emphasis iff it is part of a right-flanking delimiter run and either (a) not part of a left-flanking delimiter run or (b) part of a left-flanking delimiter run followed by a Unicode punctuation character.

  9. Emphasis begins with a delimiter that can open emphasis and ends with a delimiter that can close emphasis, and that uses the same character (_ or *) as the opening delimiter. The opening and closing delimiters must belong to separate delimiter runs. If one of the delimiters can both open and close emphasis, then the sum of the lengths of the delimiter runs containing the opening and closing delimiters must not be a multiple of 3 unless both lengths are multiples of 3.

  10. Strong emphasis begins with a delimiter that can open strong emphasis and ends with a delimiter that can close strong emphasis, and that uses the same character (_ or *) as the opening delimiter. The opening and closing delimiters must belong to separate delimiter runs. If one of the delimiters can both open and close strong emphasis, then the sum of the lengths of the delimiter runs containing the opening and closing delimiters must not be a multiple of 3 unless both lengths are multiples of 3.

  11. A literal * character cannot occur at the beginning or end of *-delimited emphasis or **-delimited strong emphasis, unless it is backslash-escaped.

  12. A literal _ character cannot occur at the beginning or end of _-delimited emphasis or __-delimited strong emphasis, unless it is backslash-escaped.

Where rules 1–12 above are compatible with multiple parsings, the following principles resolve ambiguity:

  1. The number of nestings should be minimized. Thus, for example, an interpretation <strong>...</strong> is always preferred to <em><em>...</em></em>.

  2. An interpretation <em><strong>...</strong></em> is always preferred to <strong><em>...</em></strong>.

  3. When two potential emphasis or strong emphasis spans overlap, so that the second begins before the first ends and ends after the first ends, the first takes precedence. Thus, for example, *foo _bar* baz_ is parsed as <em>foo _bar</em> baz_ rather than *foo <em>bar* baz</em>.

  4. When there are two potential emphasis or strong emphasis spans with the same closing delimiter, the shorter one (the one that opens later) takes precedence. Thus, for example, **foo **bar baz** is parsed as **foo <strong>bar baz</strong> rather than <strong>foo **bar baz</strong>.

  5. Inline code spans, links, images, and HTML tags group more tightly than emphasis. So, when there is a choice between an interpretation that contains one of these elements and one that does not, the former always wins. Thus, for example, *[foo*](bar) is parsed as *<a href="bar">foo*</a> rather than as <em>[foo</em>](bar).

These rules can be illustrated through a series of examples.

Rule 1:

Example 393
*foo bar*
<p><em>foo bar</em></p>

This is not emphasis, because the opening * is followed by whitespace, and hence not part of a left-flanking delimiter run:

Example 394
a * foo bar*
<p>a * foo bar*</p>

This is not emphasis, because the opening * is preceded by an alphanumeric and followed by punctuation, and hence not part of a left-flanking delimiter run:

Example 395
a*"foo"*
<p>a*&quot;foo&quot;*</p>

Unicode nonbreaking spaces count as whitespace, too:

Example 396
* a *
<p>* a *</p>

Unicode symbols count as punctuation, too:

Example 397
*$*alpha.

*£*bravo.

*€*charlie.
<p>*$*alpha.</p>
<p>*£*bravo.</p>
<p>*€*charlie.</p>

Intraword emphasis with * is permitted:

Example 398
foo*bar*
<p>foo<em>bar</em></p>
Example 399
5*6*78
<p>5<em>6</em>78</p>

Rule 2:

Example 400
_foo bar_
<p><em>foo bar</em></p>

This is not emphasis, because the opening _ is followed by whitespace:

Example 401
_ foo bar_
<p>_ foo bar_</p>

This is not emphasis, because the opening _ is preceded by an alphanumeric and followed by punctuation:

Example 402
a_"foo"_
<p>a_&quot;foo&quot;_</p>

Emphasis with _ is not allowed inside words:

Example 403
foo_bar_
<p>foo_bar_</p>
Example 404
5_6_78
<p>5_6_78</p>
Example 405
пристаням_стремятся_
<p>пристаням_стремятся_</p>

Here _ does not generate emphasis, because the first delimiter run is right-flanking and the second left-flanking:

Example 406
aa_"bb"_cc
<p>aa_&quot;bb&quot;_cc</p>

This is emphasis, even though the opening delimiter is both left- and right-flanking, because it is preceded by punctuation:

Example 407
foo-_(bar)_
<p>foo-<em>(bar)</em></p>

Rule 3:

This is not emphasis, because the closing delimiter does not match the opening delimiter:

Example 408
_foo*
<p>_foo*</p>

This is not emphasis, because the closing * is preceded by whitespace:

Example 409
*foo bar *
<p>*foo bar *</p>

A line ending also counts as whitespace:

Example 410
*foo bar
*
<p>*foo bar
*</p>

This is not emphasis, because the second * is preceded by punctuation and followed by an alphanumeric (hence it is not part of a right-flanking delimiter run:

Example 411
*(*foo)
<p>*(*foo)</p>

The point of this restriction is more easily appreciated with this example:

Example 412
*(*foo*)*
<p><em>(<em>foo</em>)</em></p>

Intraword emphasis with * is allowed:

Example 413
*foo*bar
<p><em>foo</em>bar</p>

Rule 4:

This is not emphasis, because the closing _ is preceded by whitespace:

Example 414
_foo bar _
<p>_foo bar _</p>

This is not emphasis, because the second _ is preceded by punctuation and followed by an alphanumeric:

Example 415
_(_foo)
<p>_(_foo)</p>

This is emphasis within emphasis:

Example 416
_(_foo_)_
<p><em>(<em>foo</em>)</em></p>

Intraword emphasis is disallowed for _:

Example 417
_foo_bar
<p>_foo_bar</p>
Example 418
_пристаням_стремятся
<p>_пристаням_стремятся</p>
Example 419
_foo_bar_baz_
<p><em>foo_bar_baz</em></p>

This is emphasis, even though the closing delimiter is both left- and right-flanking, because it is followed by punctuation:

Example 420
_(bar)_.
<p><em>(bar)</em>.</p>

Rule 5:

Example 421
**foo bar**
<p><strong>foo bar</strong></p>

This is not strong emphasis, because the opening delimiter is followed by whitespace:

Example 422
** foo bar**
<p>** foo bar**</p>

This is not strong emphasis, because the opening ** is preceded by an alphanumeric and followed by punctuation, and hence not part of a left-flanking delimiter run:

Example 423
a**"foo"**
<p>a**&quot;foo&quot;**</p>

Intraword strong emphasis with ** is permitted:

Example 424
foo**bar**
<p>foo<strong>bar</strong></p>

Rule 6:

Example 425
__foo bar__
<p><strong>foo bar</strong></p>

This is not strong emphasis, because the opening delimiter is followed by whitespace:

Example 426
__ foo bar__
<p>__ foo bar__</p>

A line ending counts as whitespace:

Example 427
__
foo bar__
<p>__
foo bar__</p>

This is not strong emphasis, because the opening __ is preceded by an alphanumeric and followed by punctuation:

Example 428
a__"foo"__
<p>a__&quot;foo&quot;__</p>

Intraword strong emphasis is forbidden with __:

Example 429
foo__bar__
<p>foo__bar__</p>
Example 430
5__6__78
<p>5__6__78</p>
Example 431
пристаням__стремятся__
<p>пристаням__стремятся__</p>
Example 432
__foo, __bar__, baz__
<p><strong>foo, <strong>bar</strong>, baz</strong></p>

This is strong emphasis, even though the opening delimiter is both left- and right-flanking, because it is preceded by punctuation:

Example 433
foo-__(bar)__
<p>foo-<strong>(bar)</strong></p>

Rule 7:

This is not strong emphasis, because the closing delimiter is preceded by whitespace:

Example 434
**foo bar **
<p>**foo bar **</p>

(Nor can it be interpreted as an emphasized *foo bar *, because of Rule 11.)

This is not strong emphasis, because the second ** is preceded by punctuation and followed by an alphanumeric:

Example 435
**(**foo)
<p>**(**foo)</p>

The point of this restriction is more easily appreciated with these examples:

Example 436
*(**foo**)*
<p><em>(<strong>foo</strong>)</em></p>
Example 437
**Gomphocarpus (*Gomphocarpus physocarpus*, syn.
*Asclepias physocarpa*)**
<p><strong>Gomphocarpus (<em>Gomphocarpus physocarpus</em>, syn.
<em>Asclepias physocarpa</em>)</strong></p>
Example 438
**foo "*bar*" foo**
<p><strong>foo &quot;<em>bar</em>&quot; foo</strong></p>

Intraword emphasis:

Example 439
**foo**bar
<p><strong>foo</strong>bar</p>

Rule 8:

This is not strong emphasis, because the closing delimiter is preceded by whitespace:

Example 440
__foo bar __
<p>__foo bar __</p>

This is not strong emphasis, because the second __ is preceded by punctuation and followed by an alphanumeric:

Example 441
__(__foo)
<p>__(__foo)</p>

The point of this restriction is more easily appreciated with this example:

Example 442
_(__foo__)_
<p><em>(<strong>foo</strong>)</em></p>

Intraword strong emphasis is forbidden with __:

Example 443
__foo__bar
<p>__foo__bar</p>
Example 444
__пристаням__стремятся
<p>__пристаням__стремятся</p>
Example 445
__foo__bar__baz__
<p><strong>foo__bar__baz</strong></p>

This is strong emphasis, even though the closing delimiter is both left- and right-flanking, because it is followed by punctuation:

Example 446
__(bar)__.
<p><strong>(bar)</strong>.</p>

Rule 9:

Any nonempty sequence of inline elements can be the contents of an emphasized span.

Example 447
*foo [bar](/url)*
<p><em>foo <a href="/url">bar</a></em></p>
Example 448
*foo
bar*
<p><em>foo
bar</em></p>

In particular, emphasis and strong emphasis can be nested inside emphasis:

Example 449
_foo __bar__ baz_
<p><em>foo <strong>bar</strong> baz</em></p>
Example 450
_foo _bar_ baz_
<p><em>foo <em>bar</em> baz</em></p>
Example 451
__foo_ bar_
<p><em><em>foo</em> bar</em></p>
Example 452
*foo *bar**
<p><em>foo <em>bar</em></em></p>
Example 453
*foo **bar** baz*
<p><em>foo <strong>bar</strong> baz</em></p>
Example 454
*foo**bar**baz*
<p><em>foo<strong>bar</strong>baz</em></p>

Note that in the preceding case, the interpretation

<p><em>foo</em><em>bar<em></em>baz</em></p>

is precluded by the condition that a delimiter that can both open and close (like the * after foo) cannot form emphasis if the sum of the lengths of the delimiter runs containing the opening and closing delimiters is a multiple of 3 unless both lengths are multiples of 3.

For the same reason, we don’t get two consecutive emphasis sections in this example:

Example 455
*foo**bar*
<p><em>foo**bar</em></p>

The same condition ensures that the following cases are all strong emphasis nested inside emphasis, even when the interior whitespace is omitted:

Example 456
***foo** bar*
<p><em><strong>foo</strong> bar</em></p>
Example 457
*foo **bar***
<p><em>foo <strong>bar</strong></em></p>
Example 458
*foo**bar***
<p><em>foo<strong>bar</strong></em></p>

When the lengths of the interior closing and opening delimiter runs are both multiples of 3, though, they can match to create emphasis:

Example 459
foo***bar***baz
<p>foo<em><strong>bar</strong></em>baz</p>
Example 460
foo******bar*********baz
<p>foo<strong><strong><strong>bar</strong></strong></strong>***baz</p>

Indefinite levels of nesting are possible:

Example 461
*foo **bar *baz* bim** bop*
<p><em>foo <strong>bar <em>baz</em> bim</strong> bop</em></p>
Example 462
*foo [*bar*](/url)*
<p><em>foo <a href="/url"><em>bar</em></a></em></p>

There can be no empty emphasis or strong emphasis:

Example 463
** is not an empty emphasis
<p>** is not an empty emphasis</p>
Example 464
**** is not an empty strong emphasis
<p>**** is not an empty strong emphasis</p>

Rule 10:

Any nonempty sequence of inline elements can be the contents of an strongly emphasized span.

Example 465
**foo [bar](/url)**
<p><strong>foo <a href="/url">bar</a></strong></p>
Example 466
**foo
bar**
<p><strong>foo
bar</strong></p>

In particular, emphasis and strong emphasis can be nested inside strong emphasis:

Example 467
__foo _bar_ baz__
<p><strong>foo <em>bar</em> baz</strong></p>
Example 468
__foo __bar__ baz__
<p><strong>foo <strong>bar</strong> baz</strong></p>
Example 469
____foo__ bar__
<p><strong><strong>foo</strong> bar</strong></p>
Example 470
**foo **bar****
<p><strong>foo <strong>bar</strong></strong></p>
Example 471
**foo *bar* baz**
<p><strong>foo <em>bar</em> baz</strong></p>
Example 472
**foo*bar*baz**
<p><strong>foo<em>bar</em>baz</strong></p>
Example 473
***foo* bar**
<p><strong><em>foo</em> bar</strong></p>
Example 474
**foo *bar***
<p><strong>foo <em>bar</em></strong></p>

Indefinite levels of nesting are possible:

Example 475
**foo *bar **baz**
bim* bop**
<p><strong>foo <em>bar <strong>baz</strong>
bim</em> bop</strong></p>
Example 476
**foo [*bar*](/url)**
<p><strong>foo <a href="/url"><em>bar</em></a></strong></p>

There can be no empty emphasis or strong emphasis:

Example 477
__ is not an empty emphasis
<p>__ is not an empty emphasis</p>
Example 478
____ is not an empty strong emphasis
<p>____ is not an empty strong emphasis</p>

Rule 11:

Example 479
foo ***
<p>foo ***</p>
Example 480
foo *\**
<p>foo <em>*</em></p>
Example 481
foo *_*
<p>foo <em>_</em></p>
Example 482
foo *****
<p>foo *****</p>
Example 483
foo **\***
<p>foo <strong>*</strong></p>
Example 484
foo **_**
<p>foo <strong>_</strong></p>

Note that when delimiters do not match evenly, Rule 11 determines that the excess literal * characters will appear outside of the emphasis, rather than inside it:

Example 485
**foo*
<p>*<em>foo</em></p>
Example 486
*foo**
<p><em>foo</em>*</p>
Example 487
***foo**
<p>*<strong>foo</strong></p>
Example 488
****foo*
<p>***<em>foo</em></p>
Example 489
**foo***
<p><strong>foo</strong>*</p>
Example 490
*foo****
<p><em>foo</em>***</p>

Rule 12:

Example 491
foo ___
<p>foo ___</p>
Example 492
foo _\__
<p>foo <em>_</em></p>
Example 493
foo _*_
<p>foo <em>*</em></p>
Example 494
foo _____
<p>foo _____</p>
Example 495
foo __\___
<p>foo <strong>_</strong></p>
Example 496
foo __*__
<p>foo <strong>*</strong></p>
Example 497
__foo_
<p>_<em>foo</em></p>

Note that when delimiters do not match evenly, Rule 12 determines that the excess literal _ characters will appear outside of the emphasis, rather than inside it:

Example 498
_foo__
<p><em>foo</em>_</p>
Example 499
___foo__
<p>_<strong>foo</strong></p>
Example 500
____foo_
<p>___<em>foo</em></p>
Example 501
__foo___
<p><strong>foo</strong>_</p>
Example 502
_foo____
<p><em>foo</em>___</p>

Rule 13 implies that if you want emphasis nested directly inside emphasis, you must use different delimiters:

Example 503
**foo**
<p><strong>foo</strong></p>
Example 504
*_foo_*
<p><em><em>foo</em></em></p>
Example 505
__foo__
<p><strong>foo</strong></p>
Example 506
_*foo*_
<p><em><em>foo</em></em></p>

However, strong emphasis within strong emphasis is possible without switching delimiters:

Example 507
****foo****
<p><strong><strong>foo</strong></strong></p>
Example 508
____foo____
<p><strong><strong>foo</strong></strong></p>

Rule 13 can be applied to arbitrarily long sequences of delimiters:

Example 509
******foo******
<p><strong><strong><strong>foo</strong></strong></strong></p>

Rule 14:

Example 510
***foo***
<p><em><strong>foo</strong></em></p>
Example 511
_____foo_____
<p><em><strong><strong>foo</strong></strong></em></p>

Rule 15:

Example 512
*foo _bar* baz_
<p><em>foo _bar</em> baz_</p>
Example 513
*foo __bar *baz bim__ bam*
<p><em>foo <strong>bar *baz bim</strong> bam</em></p>

Rule 16:

Example 514
**foo **bar baz**
<p>**foo <strong>bar baz</strong></p>
Example 515
*foo *bar baz*
<p>*foo <em>bar baz</em></p>

Rule 17:

Example 516
*[bar*](/url)
<p>*<a href="/url">bar*</a></p>
Example 517
_foo [bar_](/url)
<p>_foo <a href="/url">bar_</a></p>
Example 518
*<img src="foo" title="*"/>
<p>*<img src="foo" title="*"/></p>
Example 519
**<a href="**">
<p>**<a href="**"></p>
Example 520
__<a href="__">
<p>__<a href="__"></p>
Example 521
*a `*`*
<p><em>a <code>*</code></em></p>
Example 522
_a `_`_
<p><em>a <code>_</code></em></p>
Example 523
**a<https://foo.bar/?q=**>
<p>**a<a href="https://foo.bar/?q=**">https://foo.bar/?q=**</a></p>
Example 524
__a<https://foo.bar/?q=__>
<p>__a<a href="https://foo.bar/?q=__">https://foo.bar/?q=__</a></p>

A link contains link text (the visible text), a link destination (the URI that is the link destination), and optionally a link title. There are two basic kinds of links in Markdown. In inline links the destination and title are given immediately after the link text. In reference links the destination and title are defined elsewhere in the document.

A link text consists of a sequence of zero or more inline elements enclosed by square brackets ([ and ]). The following rules apply:

A link destination consists of either

A link title consists of either

Although link titles may span multiple lines, they may not contain a blank line.

An inline link consists of a link text followed immediately by a left parenthesis (, an optional link destination, an optional link title, and a right parenthesis ). These four components may be separated by spaces, tabs, and up to one line ending. If both link destination and link title are present, they must be separated by spaces, tabs, and up to one line ending.

The link’s text consists of the inlines contained in the link text (excluding the enclosing square brackets). The link’s URI consists of the link destination, excluding enclosing <...> if present, with backslash-escapes in effect as described above. The link’s title consists of the link title, excluding its enclosing delimiters, with backslash-escapes in effect as described above.

Here is a simple inline link:

Example 525
[link](/uri "title")
<p><a href="/uri" title="title">link</a></p>

The title, the link text and even the destination may be omitted:

Example 526
[link](/uri)
<p><a href="/uri">link</a></p>
Example 527
[](./target.md)
<p><a href="./target.md"></a></p>
Example 528
[link]()
<p><a href="">link</a></p>
Example 529
[link](<>)
<p><a href="">link</a></p>
Example 530
[]()
<p><a href=""></a></p>

The destination can only contain spaces if it is enclosed in pointy brackets:

Example 531
[link](/my uri)
<p>[link](/my uri)</p>
Example 532
[link](</my uri>)
<p><a href="/my%20uri">link</a></p>

The destination cannot contain line endings, even if enclosed in pointy brackets:

Example 533
[link](foo
bar)
<p>[link](foo
bar)</p>
Example 534
[link](<foo
bar>)
<p>[link](<foo
bar>)</p>

The destination can contain ) if it is enclosed in pointy brackets:

Example 535
[a](<b)c>)
<p><a href="b)c">a</a></p>

Pointy brackets that enclose links must be unescaped:

Example 536
[link](<foo\>)
<p>[link](&lt;foo&gt;)</p>

These are not links, because the opening pointy bracket is not matched properly:

Example 537
[a](<b)c
[a](<b)c>
[a](<b>c)
<p>[a](&lt;b)c
[a](&lt;b)c&gt;
[a](<b>c)</p>

Parentheses inside the link destination may be escaped:

Example 538
[link](\(foo\))
<p><a href="(foo)">link</a></p>

Any number of parentheses are allowed without escaping, as long as they are balanced:

Example 539
[link](foo(and(bar)))
<p><a href="foo(and(bar))">link</a></p>

However, if you have unbalanced parentheses, you need to escape or use the <...> form:

Example 540
[link](foo(and(bar))
<p>[link](foo(and(bar))</p>
Example 541
[link](foo\(and\(bar\))
<p><a href="foo(and(bar)">link</a></p>
Example 542
[link](<foo(and(bar)>)
<p><a href="foo(and(bar)">link</a></p>

Parentheses and other symbols can also be escaped, as usual in Markdown:

Example 543
[link](foo\)\:)
<p><a href="foo):">link</a></p>

A link can contain fragment identifiers and queries:

Example 544
[link](#fragment)

[link](https://example.com#fragment)

[link](https://example.com?foo=3#frag)
<p><a href="#fragment">link</a></p>
<p><a href="https://example.com#fragment">link</a></p>
<p><a href="https://example.com?foo=3#frag">link</a></p>

Note that a backslash before a non-escapable character is just a backslash:

Example 545
[link](foo\bar)
<p><a href="foo%5Cbar">link</a></p>

URL-escaping should be left alone inside the destination, as all URL-escaped characters are also valid URL characters. Entity and numerical character references in the destination will be parsed into the corresponding Unicode code points, as usual. These may be optionally URL-escaped when written as HTML, but this spec does not enforce any particular policy for rendering URLs in HTML or other formats. Renderers may make different decisions about how to escape or normalize URLs in the output.

Example 546
[link](foo%20b&auml;)
<p><a href="foo%20b%C3%A4">link</a></p>

Note that, because titles can often be parsed as destinations, if you try to omit the destination and keep the title, you’ll get unexpected results:

Example 547
[link]("title")
<p><a href="%22title%22">link</a></p>

Titles may be in single quotes, double quotes, or parentheses:

Example 548
[link](/url "title")
[link](/url 'title')
[link](/url (title))
<p><a href="/url" title="title">link</a>
<a href="/url" title="title">link</a>
<a href="/url" title="title">link</a></p>

Backslash escapes and entity and numeric character references may be used in titles:

Example 549
[link](/url "title \"&quot;")
<p><a href="/url" title="title &quot;&quot;">link</a></p>

Titles must be separated from the link using spaces, tabs, and up to one line ending. Other Unicode whitespace like non-breaking space doesn’t work.

Example 550
[link](/url "title")
<p><a href="/url%C2%A0%22title%22">link</a></p>

Nested balanced quotes are not allowed without escaping:

Example 551
[link](/url "title "and" title")
<p>[link](/url &quot;title &quot;and&quot; title&quot;)</p>

But it is easy to work around this by using a different quote type:

Example 552
[link](/url 'title "and" title')
<p><a href="/url" title="title &quot;and&quot; title">link</a></p>

(Note: Markdown.pl did allow double quotes inside a double-quoted title, and its test suite included a test demonstrating this. But it is hard to see a good rationale for the extra complexity this brings, since there are already many ways—backslash escaping, entity and numeric character references, or using a different quote type for the enclosing title—to write titles containing double quotes. Markdown.pl’s handling of titles has a number of other strange features. For example, it allows single-quoted titles in inline links, but not reference links. And, in reference links but not inline links, it allows a title to begin with " and end with ). Markdown.pl 1.0.1 even allows titles with no closing quotation mark, though 1.0.2b8 does not. It seems preferable to adopt a simple, rational rule that works the same way in inline links and link reference definitions.)

Spaces, tabs, and up to one line ending is allowed around the destination and title:

Example 553
[link](   /uri
  "title"  )
<p><a href="/uri" title="title">link</a></p>

But it is not allowed between the link text and the following parenthesis:

Example 554
[link] (/uri)
<p>[link] (/uri)</p>

The link text may contain balanced brackets, but not unbalanced ones, unless they are escaped:

Example 555
[link [foo [bar]]](/uri)
<p><a href="/uri">link [foo [bar]]</a></p>
Example 556
[link] bar](/uri)
<p>[link] bar](/uri)</p>
Example 557
[link [bar](/uri)
<p>[link <a href="/uri">bar</a></p>
Example 558
[link \[bar](/uri)
<p><a href="/uri">link [bar</a></p>

The link text may contain inline content:

Example 559
[link *foo **bar** `#`*](/uri)
<p><a href="/uri">link <em>foo <strong>bar</strong> <code>#</code></em></a></p>
Example 560
[![moon](moon.jpg)](/uri)
<p><a href="/uri"><img src="moon.jpg" alt="moon" /></a></p>

However, links may not contain other links, at any level of nesting.

Example 561
[foo [bar](/uri)](/uri)
<p>[foo <a href="/uri">bar</a>](/uri)</p>
Example 562
[foo *[bar [baz](/uri)](/uri)*](/uri)
<p>[foo <em>[bar <a href="/uri">baz</a>](/uri)</em>](/uri)</p>
Example 563
![[[foo](uri1)](uri2)](uri3)
<p><img src="uri3" alt="[foo](uri2)" /></p>

These cases illustrate the precedence of link text grouping over emphasis grouping:

Example 564
*[foo*](/uri)
<p>*<a href="/uri">foo*</a></p>
Example 565
[foo *bar](baz*)
<p><a href="baz*">foo *bar</a></p>

Note that brackets that aren’t part of links do not take precedence:

Example 566
*foo [bar* baz]
<p><em>foo [bar</em> baz]</p>

These cases illustrate the precedence of HTML tags, code spans, and autolinks over link grouping:

Example 567
[foo <bar attr="](baz)">
<p>[foo <bar attr="](baz)"></p>
Example 568
[foo`](/uri)`
<p>[foo<code>](/uri)</code></p>
Example 569
[foo<https://example.com/?search=](uri)>
<p>[foo<a href="https://example.com/?search=%5D(uri)">https://example.com/?search=](uri)</a></p>

There are three kinds of reference links: full, collapsed, and shortcut.

A full reference link consists of a link text immediately followed by a link label that matches a link reference definition elsewhere in the document.

A link label begins with a left bracket ([) and ends with the first right bracket (]) that is not backslash-escaped. Between these brackets there must be at least one character that is not a space, tab, or line ending. Unescaped square bracket characters are not allowed inside the opening and closing square brackets of link labels. A link label can have at most 999 characters inside the square brackets.

One label matches another just in case their normalized forms are equal. To normalize a label, strip off the opening and closing brackets, perform the Unicode case fold, strip leading and trailing spaces, tabs, and line endings, and collapse consecutive internal spaces, tabs, and line endings to a single space. If there are multiple matching reference link definitions, the one that comes first in the document is used. (It is desirable in such cases to emit a warning.)

The link’s URI and title are provided by the matching link reference definition.

Here is a simple example:

Example 570
[foo][bar]

[bar]: /url "title"
<p><a href="/url" title="title">foo</a></p>

The rules for the link text are the same as with inline links. Thus:

The link text may contain balanced brackets, but not unbalanced ones, unless they are escaped:

Example 571
[link [foo [bar]]][ref]

[ref]: /uri
<p><a href="/uri">link [foo [bar]]</a></p>
Example 572
[link \[bar][ref]

[ref]: /uri
<p><a href="/uri">link [bar</a></p>

The link text may contain inline content:

Example 573
[link *foo **bar** `#`*][ref]

[ref]: /uri
<p><a href="/uri">link <em>foo <strong>bar</strong> <code>#</code></em></a></p>
Example 574
[![moon](moon.jpg)][ref]

[ref]: /uri
<p><a href="/uri"><img src="moon.jpg" alt="moon" /></a></p>

However, links may not contain other links, at any level of nesting.

Example 575
[foo [bar](/uri)][ref]

[ref]: /uri
<p>[foo <a href="/uri">bar</a>]<a href="/uri">ref</a></p>
Example 576
[foo *bar [baz][ref]*][ref]

[ref]: /uri
<p>[foo <em>bar <a href="/uri">baz</a></em>]<a href="/uri">ref</a></p>

(In the examples above, we have two shortcut reference links instead of one full reference link.)

The following cases illustrate the precedence of link text grouping over emphasis grouping:

Example 577
*[foo*][ref]

[ref]: /uri
<p>*<a href="/uri">foo*</a></p>
Example 578
[foo *bar][ref]*

[ref]: /uri
<p><a href="/uri">foo *bar</a>*</p>

These cases illustrate the precedence of HTML tags, code spans, and autolinks over link grouping:

Example 579
[foo <bar attr="][ref]">

[ref]: /uri
<p>[foo <bar attr="][ref]"></p>
Example 580
[foo`][ref]`

[ref]: /uri
<p>[foo<code>][ref]</code></p>
Example 581
[foo<https://example.com/?search=][ref]>

[ref]: /uri
<p>[foo<a href="https://example.com/?search=%5D%5Bref%5D">https://example.com/?search=][ref]</a></p>

Matching is case-insensitive:

Example 582
[foo][BaR]

[bar]: /url "title"
<p><a href="/url" title="title">foo</a></p>

Unicode case fold is used:

Example 583
[ẞ]

[SS]: /url
<p><a href="/url">ẞ</a></p>

Consecutive internal spaces, tabs, and line endings are treated as one space for purposes of determining matching:

Example 584
[Foo
  bar]: /url

[Baz][Foo bar]
<p><a href="/url">Baz</a></p>

No spaces, tabs, or line endings are allowed between the link text and the link label:

Example 585
[foo] [bar]

[bar]: /url "title"
<p>[foo] <a href="/url" title="title">bar</a></p>
Example 586
[foo]
[bar]

[bar]: /url "title"
<p>[foo]
<a href="/url" title="title">bar</a></p>

This is a departure from John Gruber’s original Markdown syntax description, which explicitly allows whitespace between the link text and the link label. It brings reference links in line with inline links, which (according to both original Markdown and this spec) cannot have whitespace after the link text. More importantly, it prevents inadvertent capture of consecutive shortcut reference links. If whitespace is allowed between the link text and the link label, then in the following we will have a single reference link, not two shortcut reference links, as intended:

[foo]
[bar]

[foo]: /url1
[bar]: /url2

(Note that shortcut reference links were introduced by Gruber himself in a beta version of Markdown.pl, but never included in the official syntax description. Without shortcut reference links, it is harmless to allow space between the link text and link label; but once shortcut references are introduced, it is too dangerous to allow this, as it frequently leads to unintended results.)

When there are multiple matching link reference definitions, the first is used:

Example 587
[foo]: /url1

[foo]: /url2

[bar][foo]
<p><a href="/url1">bar</a></p>

Note that matching is performed on normalized strings, not parsed inline content. So the following does not match, even though the labels define equivalent inline content:

Example 588
[bar][foo\!]

[foo!]: /url
<p>[bar][foo!]</p>

Link labels cannot contain brackets, unless they are backslash-escaped:

Example 589
[foo][ref[]

[ref[]: /uri
<p>[foo][ref[]</p>
<p>[ref[]: /uri</p>
Example 590
[foo][ref[bar]]

[ref[bar]]: /uri
<p>[foo][ref[bar]]</p>
<p>[ref[bar]]: /uri</p>
Example 591
[[[foo]]]

[[[foo]]]: /url
<p>[[[foo]]]</p>
<p>[[[foo]]]: /url</p>
Example 592
[foo][ref\[]

[ref\[]: /uri
<p><a href="/uri">foo</a></p>

Note that in this example ] is not backslash-escaped:

Example 593
[bar\\]: /uri

[bar\\]
<p><a href="/uri">bar\</a></p>

A link label must contain at least one character that is not a space, tab, or line ending:

Example 594
[]

[]: /uri
<p>[]</p>
<p>[]: /uri</p>
Example 595
[
 ]

[
 ]: /uri
<p>[
]</p>
<p>[
]: /uri</p>

A collapsed reference link consists of a link label that matches a link reference definition elsewhere in the document, followed by the string []. The contents of the link label are parsed as inlines, which are used as the link’s text. The link’s URI and title are provided by the matching reference link definition. Thus, [foo][] is equivalent to [foo][foo].

Example 596
[foo][]

[foo]: /url "title"
<p><a href="/url" title="title">foo</a></p>
Example 597
[*foo* bar][]

[*foo* bar]: /url "title"
<p><a href="/url" title="title"><em>foo</em> bar</a></p>

The link labels are case-insensitive:

Example 598
[Foo][]

[foo]: /url "title"
<p><a href="/url" title="title">Foo</a></p>

As with full reference links, spaces, tabs, or line endings are not allowed between the two sets of brackets:

Example 599
[foo] 
[]

[foo]: /url "title"
<p><a href="/url" title="title">foo</a>
[]</p>

A shortcut reference link consists of a link label that matches a link reference definition elsewhere in the document and is not followed by [] or a link label. The contents of the link label are parsed as inlines, which are used as the link’s text. The link’s URI and title are provided by the matching link reference definition. Thus, [foo] is equivalent to [foo][].

Example 600
[foo]

[foo]: /url "title"
<p><a href="/url" title="title">foo</a></p>
Example 601
[*foo* bar]

[*foo* bar]: /url "title"
<p><a href="/url" title="title"><em>foo</em> bar</a></p>
Example 602
[[*foo* bar]]

[*foo* bar]: /url "title"
<p>[<a href="/url" title="title"><em>foo</em> bar</a>]</p>
Example 603
[[bar [foo]

[foo]: /url
<p>[[bar <a href="/url">foo</a></p>

The link labels are case-insensitive:

Example 604
[Foo]

[foo]: /url "title"
<p><a href="/url" title="title">Foo</a></p>

A space after the link text should be preserved:

Example 605
[foo] bar

[foo]: /url
<p><a href="/url">foo</a> bar</p>

If you just want bracketed text, you can backslash-escape the opening bracket to avoid links:

Example 606
\[foo]

[foo]: /url "title"
<p>[foo]</p>

Note that this is a link, because a link label ends with the first following closing bracket:

Example 607
[foo*]: /url

*[foo*]
<p>*<a href="/url">foo*</a></p>

Full and compact references take precedence over shortcut references:

Example 608
[foo][bar]

[foo]: /url1
[bar]: /url2
<p><a href="/url2">foo</a></p>
Example 609
[foo][]

[foo]: /url1
<p><a href="/url1">foo</a></p>

Inline links also take precedence:

Example 610
[foo]()

[foo]: /url1
<p><a href="">foo</a></p>
Example 611
[foo](not a link)

[foo]: /url1
<p><a href="/url1">foo</a>(not a link)</p>

In the following case [bar][baz] is parsed as a reference, [foo] as normal text:

Example 612
[foo][bar][baz]

[baz]: /url
<p>[foo]<a href="/url">bar</a></p>

Here, though, [foo][bar] is parsed as a reference, since [bar] is defined:

Example 613
[foo][bar][baz]

[baz]: /url1
[bar]: /url2
<p><a href="/url2">foo</a><a href="/url1">baz</a></p>

Here [foo] is not parsed as a shortcut reference, because it is followed by a link label (even though [bar] is not defined):

Example 614
[foo][bar][baz]

[baz]: /url1
[foo]: /url2
<p>[foo]<a href="/url1">bar</a></p>

There are two parts to making a crosslink.

  1. Define an id.
  2. Reference that id with a crosslink.

10.4.1Defining an id

There are two ways to define an id:

  1. Using an id attribute {id: some-id}
  2. Using a shorter “syntactic sugar” approach: {#some-id}

The shorter “syntactic sugar” approach is usually preferred. However, it can look a bit odd in an attribute list with other attributes in it. So, when other attributes are present in an attribute list, the {id: some-id} syntax is preferred.

In terms of the value of an id, it has some special restrictions:

  1. The first character in the id has to be a lowercase or uppercase letter, i.e. [a-zA-Z] if you think in regular expressions.
  2. The remaining characters in the id have to be a lowercase or uppercase letter or a digit or a hyphen (-) or an underscore (_).
  3. You can only define an id value once in an entire Markua document, even one that is split over multiple files.

These restrictions ensure that your ids can then be linked to by a crosslink from anywhere in the Markua document.

Also, note that id attributes need to be defined on either a block or span element. Finally, if an id is defined with an invalid name, the Markua Processor must ignore it and log an error.

10.4.1.1Defining an id on a block element

To define an id on a block element like a paragraph, figure, heading or even a definition list item, you simply stick the id definition on a line above the start of the block element. Note that exactly one newline must separate the attribute list from the block element–if two newlines are used, the attribute list will be interpreted as a directive, and the id won’t be correctly applied.

Here’s how to use the attribute list syntax to define an id attribute:

{id: some-id}
This is a paragraph with the id of `some-id`.

Here’s how to use the shorter “syntactic sugar” approach to define an id attribute:

{#some-id}
This is a paragraph with the id of `some-id`.

10.4.1.2Defining an id on a span element

To define an id on a span element you simply add the id definition immediately after the span element.

Here’s how to use the attribute list syntax to define an id attribute on a span element:

The word Markua{id: markua} has an id.

Leanpub is based in **Victoria, BC, Canada**{id: victoria}.

Here’s how to use the shorter “syntactic sugar” approach to define an id attribute:

The word Markua{#markua} has an id.

Leanpub is based in **Victoria, BC, Canada**{#victoria}.

Here’s how to define an id on a custom span:

The [quick sly fox]{#quick_sly} jumped over the lazy dogs.

If you want to define an id on a span while also defining other attributes like index entries, the id: syntax must be used in a full attribute list:

The [quick sly fox]{id: quick_sly, i: "Fox, Sly and Quick"} jumped over the lazy
dogs.

Regardless of how you defined the id, you then link to it to create a crosslink. To do this, you use the # character and the id in a link:

[link text](#some-id)

This syntax is intended to be reminiscent of HTML anchor tags.

Note that order of definition and use does not matter: crosslinks will work regardless of whether the id is defined before or after the use of it.

{id: id1}
This is a paragraph with an id of `id1`.

{#id2}
This is a paragraph with an id of `id2`.

This link to [Markua](#markua) is before the id definition.

The word Markua{id: markua} has an id of `markua`.

This link to [Markua](#markua) is after the id definition.

Leanpub is based in **Victoria, BC, Canada**{#victoria}.

The [quick sly fox]{#quick_sly} jumped over the lazy dogs.
<p id="id1">This is a paragraph with an id of <code>id1</code>.</p>
<p id="id2">This is a paragraph with an id of <code>id2</code>.</p>
<p>This link to <a href="#markua">Markua</a> is before the id definition.</p>
<p>The word <span id="markua">Markua</span> has an id of <code>markua</code>.</p>
<p>This link to <a href="#markua">Markua</a> is after the id definition.</p>
<p>Leanpub is based in <strong id="victoria">Victoria, BC, Canada</strong>.
<p>The <span id="quick_sly">quick sly fox</span> jumped over the lazy dogs.</p>

Parts, chapters, sections and figures (with titles) often have two useful properties for writers:

  1. A name, which is often short and useful to reference.
  2. A number.

In your writing, it’s often desirable to refer to these figures from elsewhere in your book. When creating such a reference, it’s helpful to be able to reference the title and number of the part, chapter, section or figure.

It is also helpful (especially in a print book) to be able to reference a part, chapter, section or figure by its page number. This is a bit complex. Whether page numbers are output in smart crosslinks is determined by the include-page-numbers-in-references document setting, which defaults to true. It is also determined by the output format: page numbers are supported in PDF, but HTML and EPUB do not support outputting page numbers in references. For EPUB, static page numbers in smart crosslinks make no sense because EPUB files have resizable text (and thus the page numbers vary). For HTML, page numbers make no sense, period.

The simplest way to use smart crosslinks is to always use them the same way:

Some text [#f](#its-id) more text.

The #f inserts the full title of what is being crosslinked to, by the id referenced by (#its-id).

If you like simplicity, stop reading this section now. If you like complexity and fine-grained control, please continue!

There are actually a whole bunch of different ways te insert smart crosslinks, since you don’t always want to show the full title.

Here are a number of representative examples of how to use smart crosslinks. The actual meaning of all the references (with the # characters) is explained afterward:

This is discussed in [#f](#crosslinks).

This is discussed in [section #n, #t](#crosslinks).

This is discussed in a [#d](#crosslinks) above.

See [#f](#metadata), which is the best chapter in this book.

See [chapter #n](#metadata), which is the best chapter in this book.

See [#d #n](#metadata), which is the best chapter in this book.

See [#s](#metadata), which is the best chapter in this book.

See [#f](#span-elements), which is the best section in this book.

See [section #n](#span-elements), which is the best section in this book.

This is in [#f](#fancy-diagram), arguably the fanciest diagram in this
document.

This is in [#s](#fancy-diagram), arguably the fanciest diagram in this
document.

This is in figure [#n](#fancy-diagram), arguably the fanciest diagram in this
document.

Here’s how these references to titles and numbers work:

So, for a “Figure 8.2: Anatomy of a Squirrel” on page 42, these are:

Note that for a book which is in English:

In other languages, the Markua Processor should do the right thing when outputting #s and #f.

Note that in this example, “Anatomy of a Squirrel” was typed by the author, whereas “Figure 8.2: “ was generated by the Markua Processor. It does not matter; both can be referenced.

Also, note that regardless of section level, sections referenced in #d or #f are all called “Section” (not “Sub-Section”, “Sub-Sub-Section”, etc.)

The expectation is that #f will be used by authors who don’t mind verbosity, and #t and #n will be used by authors who prefer control and brevity. The #d is for very lazy authors who like saving keystrokes and/or who don’t know whether their publisher will call the code samples “Listing”, “Example” or some other word and want to protect themselves against extra work.

Finally, note that you can make a smart crosslink to any id, including that of a paragraph or span inside some chapter, section or sub-section. What the Markua Processor should do is the following:

So, for a “Figure 8.2: Anatomy of a Squirrel” on page 42, these are:

The code-sample-names, figure-names and table-names settings control the words used to name things.

The only way to reference the title and numbers is in crosslinks. There’s no syntax to do so without creating a crosslink–a crosslink is more helpful, since it is clickable, and adding another syntax simply to be less helpful to the reader is a bad idea. Markua has enough syntax as it is.

Finally, note that parts, chapters and sections and figures all have implicit numbering. So, #n always works even if numbering is off. However, you will confuse readers if you refer to numbering they cannot see. Also, if numbering is off, the #f must not include either the #d or #n parts: it will be “Anatomy of a Squirrel” not “Figure 8.2: Anatomy of a Squirrel”.

10.6Images

Syntax for images is like the syntax for links, with one difference. Instead of link text, we have an image description. The rules for this are the same as for link text, except that (a) an image description starts with ![ rather than [, and (b) an image description may contain links. An image description has inline elements as its contents. When an image is rendered to HTML, this is standardly used as the image’s alt attribute.

Example 616
![foo](/url "title")
<p><img src="/url" alt="foo" title="title" /></p>
Example 617
![foo *bar*]

[foo *bar*]: train.jpg "train & tracks"
<p><img src="train.jpg" alt="foo bar" title="train &amp; tracks" /></p>
Example 618
![foo ![bar](/url)](/url2)
<p><img src="/url2" alt="foo bar" /></p>
Example 619
![foo [bar](/url)](/url2)
<p><img src="/url2" alt="foo bar" /></p>

Though this spec is concerned with parsing, not rendering, it is recommended that in rendering to HTML, only the plain string content of the image description be used. Note that in the above example, the alt attribute’s value is foo bar, not foo [bar](/url) or foo <a href="/url">bar</a>. Only the plain string content is rendered, without formatting.

Example 620
![foo *bar*][]

[foo *bar*]: train.jpg "train & tracks"
<p><img src="train.jpg" alt="foo bar" title="train &amp; tracks" /></p>
Example 621
![foo *bar*][foobar]

[FOOBAR]: train.jpg "train & tracks"
<p><img src="train.jpg" alt="foo bar" title="train &amp; tracks" /></p>
Example 622
![foo](train.jpg)
<p><img src="train.jpg" alt="foo" /></p>
Example 623
My ![foo bar](/path/to/train.jpg  "title"   )
<p>My <img src="/path/to/train.jpg" alt="foo bar" title="title" /></p>
Example 624
![foo](<url>)
<p><img src="url" alt="foo" /></p>
Example 625
![](/url)
<p><img src="/url" alt="" /></p>

Reference-style:

Example 626
![foo][bar]

[bar]: /url
<p><img src="/url" alt="foo" /></p>
Example 627
![foo][bar]

[BAR]: /url
<p><img src="/url" alt="foo" /></p>

Collapsed:

Example 628
![foo][]

[foo]: /url "title"
<p><img src="/url" alt="foo" title="title" /></p>
Example 629
![*foo* bar][]

[*foo* bar]: /url "title"
<p><img src="/url" alt="foo bar" title="title" /></p>

The labels are case-insensitive:

Example 630
![Foo][]

[foo]: /url "title"
<p><img src="/url" alt="Foo" title="title" /></p>

As with reference links, spaces, tabs, and line endings, are not allowed between the two sets of brackets:

Example 631
![foo] 
[]

[foo]: /url "title"
<p><img src="/url" alt="foo" title="title" />
[]</p>

Shortcut:

Example 632
![foo]

[foo]: /url "title"
<p><img src="/url" alt="foo" title="title" /></p>
Example 633
![*foo* bar]

[*foo* bar]: /url "title"
<p><img src="/url" alt="foo bar" title="title" /></p>

Note that link labels cannot contain unescaped brackets:

Example 634
![[foo]]

[[foo]]: /url "title"
<p>![[foo]]</p>
<p>[[foo]]: /url &quot;title&quot;</p>

The link labels are case-insensitive:

Example 635
![Foo]

[foo]: /url "title"
<p><img src="/url" alt="Foo" title="title" /></p>

If you just want a literal ! followed by bracketed text, you can backslash-escape the opening [:

Example 636
!\[foo]

[foo]: /url "title"
<p>![foo]</p>

If you want a link after a literal !, backslash-escape the !:

Example 637
\![foo]

[foo]: /url "title"
<p>!<a href="/url" title="title">foo</a></p>

10.7Image resource attributes (M)

As shown in the Images section, the syntax to insert an image in Markua is the identical to that used by CommonMark and GFM. (That section is unchanged from the CommonMark spec.) It is, however, also consistent with the syntax that is used for other resource types.

The following are the image resource formats and the file extensions which choose them by default:

jpeg : JPEG image - .jpg or .jpeg

gif : GIF image - .gif

png : PNG image - .png

svg : SVG image - .svg

svgz : SVG image (zipped) - .svgz

Markua uses the image syntax for all resource types. Markua also reinterprets images as being a resource. This is important for the following reason:

Resources can have attribute lists.

Therefore, since images are resources in Markua, images can have attribute lists. These attributes are used to do basic image formatting.

Most formatting in Markua is semantic–things like bold and italic, superscript, etc. This formatting is part of the manuscript and should not be changed by any publisher. The formatting for images, however, is some of the most non-semantic formatting in Markua. It’s stuff which can absolutely be changed with no effect on the meaning of the Markua document. Specifically, images can have width, height, align or float attributes, or can even be fullbleed.

These settings are so universal when applied to images that it would be too dogmatic for Markua to insist that the only way to format images was to add a class element, and then style the class of the image elsewhere. Markua is pragmatic: by providing a few basic ways to format images, it enables authors to make in-progress books look good enough to actually publish in-progress.

The following are the supported attributes for image resources, in addition to the class, format, title and type attributes which all resources support:

align : The align can be left, right or middle. The default is middle. Combined with width and height, this provides basic image formatting. When align is specified, text goes above and below the image, but the image and its surrounding whitespace occupies the entire width of the page. If you want the text to actually wrap around the image, use float instead. In terms of the specific values of align, a Markua Processor must interpret left as “on the left side of the page”, right as “on the right side of the page” and middle as “in the middle of the content area of the page, respecting margins” in all cases. Finally, note that inside and outside are not supported for align.

alt : The alt is the alt text, to be displayed when the image cannot be shown. This is provided in the figure attribute list or in the square brackets before the image, for local and web images. (Inline images don’t have the square brackets, so for an inline image, the alt text can only be provided via an attribute.) If it is provided in both places, the one in the attribute list wins.

float : The float can be left, right, inside or outside. The left and right attribute values mean the same thing as they do with align. When float is specified, text flows around the image. It is an error to specify both align and float. Note that middle is not supported for float, since Markua is not a good choice for the types of advanced layouts which flow text on both sides of an image. For that, you should use something like InDesign; this is why Markua Processors such as Leanpub can export InDesign files. Also, note that float supports inside or outside, but align does not. What a Markua Processor does with inside and outside is more complex. A Markua Processor must interpret inside as “near the spine” and outside as “away from the spine” in a print book. In an ebook, however, a Markua Processor has some flexibility in terms of the meaning of inside and outside: it can either interpret inside as “left” or it can interpret inside as “left if the language is left-to-right, and right if the language is right-to-left”. Similarly, a Markua Processor can either interpret outside as “right” or it can interpret outside as “right if the language is left-to-right, and left if the language is right-to-left”.

fullbleed : true or false. The default is false. If true, the Markua Processor should ensure that the image is the full size of the page, with no margins on the page. If false, the width, height and align attributes are used instead and margins are respected. A Markua Processor should do something fancy and tasteful with the title attribute for a fullbleed image.

height : The height of the image, in percentage of page content area height (respecting margins). This is specified as a number (integer or float) between 1 and 100 followed by a percentage sign (%).

width : The width of the image, in percentage of page content area width (respecting margins). This is specified as an number (integer or float) between 1 and 100 followed by a percentage sign (%).

Note that if only one of width and height are specified, the Markua Processor should scale the image proportionally if possible (again, respecting margins). If both width and height are specified, the Markua Processor should scale the image accordingly, ignoring the aspect ratio. (So, it’s almost always a bad idea to specify both width and height.)

![foo](pie.jpg)

{alt: "foo"}
![](pie.jpg)

{alt: "foo"}
![clobbered by alt foo](pie.jpg)
<p><img src="pie.jpg" alt="foo" /></p>
<p><img src="pie.jpg" alt="foo" /></p>
<p><img src="pie.jpg" alt="foo" /></p>
![foo](pie.jpg "bar")

{alt: "foo"}
![](pie.jpg "bar")

{alt: "foo"}
![clobbered by alt foo](pie.jpg "bar")
<p><img src="pie.jpg" alt="foo" title="bar" /></p>
<p><img src="pie.jpg" alt="foo" title="bar" /></p>
<p><img src="pie.jpg" alt="foo" title="bar" /></p>

Since the other attributes are all presentational, there is no formal specification of the HTML which must result. So, here are some examples of usage instead, involving a very real and very delicious apple pie:

{fullbleed: true}
![a half-eaten apple pie](pie.jpg "It Tasted Even Better")

{width: 100%}
![a half-eaten apple pie](pie.jpg "It Tasted Even Better")

{width: "100%", height: '30%'}
![a half-eaten apple pie](pie.jpg "It Tasted Even Better")

{width: 50%, float: left}
![a half-eaten apple pie](pie.jpg "It Tasted Even Better")

{width: 50%, align: middle}
![a half-eaten apple pie](pie.jpg "It Tasted Even Better")

10.8Inline SVG images (M)

Local and web resource locations are supported for any type of image; inline resource locations are supported for SVG images only.

Since an SVG image is just XML text, it can be contained inline in the text of a Markua document. This is not true for binary resources like PNG or JPEG images or any type of audio or video file–these can only be local or web resources.

To add an inline SVG image, you create a fenced code block, and then you indicate that instead of code, you are inserting a resource of type image and of format SVG.

There are two ways to do this. Here’s the verbose way:

{type: image, format: svg}
```
<svg width="20" height="20">
  <circle cx="10" cy="10" r="9" fill="blue"/>
</svg>
```

Here’s the syntactic sugar way:

```!
<svg width="20" height="20">
  <circle cx="10" cy="10" r="9" fill="blue"/>
</svg>
```

The ! after the opening code fence means that what follows is an SVG image, and that it should be rendered as an image, not displayed as the XML source of the SVG image.

Inline SVG images support all the normal image resource attributes:

{alt: "a blue circle", title: "Earth From Space (Simplified)"}
```!
<svg width="20" height="20">
  <circle cx="10" cy="10" r="9" fill="blue"/>
</svg>
```

You can use them with the verbose way as well:

{type: image, format: svg, alt: "foo", title: "bar"}
```
<svg width="20" height="20">
  <circle cx="10" cy="10" r="9" fill="blue"/>
</svg>
```

Note that when you are writing about SVG and want to display the SVG text, what you are really doing is creating a code resource. This is discussed below.

10.8.1Writing about SVG

If you want to write about the SVG format, and show the actual SVG source (instead of the image produced), it needs to be of a format of code, not image.

Now, you can just be lazy and not provide format or type attributes at all, since guessing when neither is present always produces a type of code.

```
<svg width="20" height="20">
  <circle cx="10" cy="10" r="9" fill="blue"/>
</svg>
```

However, you can also just specify both, either this way…

{type: code}
```svg
<svg width="20" height="20">
  <circle cx="10" cy="10" r="9" fill="blue"/>
</svg>
```

…or this way:

{type: code, format: svg}
```
<svg width="20" height="20">
  <circle cx="10" cy="10" r="9" fill="blue"/>
</svg>
```

…or this way:


```svg
<svg width="20" height="20">
  <circle cx="10" cy="10" r="9" fill="blue"/>
</svg>
```

Or you can just specify xml, since SVG is xml:

```xml
<svg width="20" height="20">
  <circle cx="10" cy="10" r="9" fill="blue"/>
</svg>
```

Or you can just specify text, if you don’t want syntax highlighting:

```text
<svg width="20" height="20">
  <circle cx="10" cy="10" r="9" fill="blue"/>
</svg>
```

Heck, you can even use tildes to do that:

~~~
<svg width="20" height="20">
  <circle cx="10" cy="10" r="9" fill="blue"/>
</svg>
~~~

10.9Image locations and embedding (M)

Note that regardless of the image location, a Markua Processor can handle images in the following ways:

  1. Embed the image in the output format (such as a PDF or EPUB file)
  2. Place the image at some URL on the web, and load it from a URL in all cases

This is true regardless of the location of the source image (local, web or inline). The location of the image source has no effect on the location of the image in the output.

For example, a web image does not need to be loaded from the web every time; it can be inlined in the output by the Markua Processor.

This is also true for all of the types of images which are supported in Markua: GIF, PNG, JPEG, SVG and zipped SVG.

10.10Video resources (M)

Note that .mp4 is used for MP4 video, not MP4 AAC audio.

The following are the video resource formats and the file extensions which choose them by default:

mp4 : MP4 video - .mp4

webm : WebM video - .webm

The syntax to insert a video is the same compact and consistent syntax that is used for any resource. Local and web resource locations are supported for both video formats; inline resource locations for video are obviously not supported.

Unlike images where many images will accompany the manuscript itself, with video it’s expected that authors will be uploading their videos to sites such as YouTube first, and then reusing these videos in the contents of their Markua books. So, while video can be local or web video, web video will be much more prominent than web images in a Markua book.

Furthermore, unlike with web images where the format (and thus the type) are typically discoverable from the file extension in the URL, with web video it’s usually not. So, for web videos, there will typically be a {type: video} attribute list at a minimum. That said, if the type and format are not specified and the resource is a web resource, the Markua Processor may use the domain to decide what type of resource to assume. So YouTube videos may just work in some Markua Processors, like Leanpub, without specifying either the type or format.

The following types of videos are supported in Markua: MP4 and WebM.

We will discuss the supported and the default attributes for videos, and then show examples of videos being inserted for both local and web videos.

10.10.1Supported Attributes for Video

The following are the supported attributes for video resources, in addition to the class, format, title and type attributes which all resources support. Note that the class, height, title and width attributes apply to the poster image for the video–not to the video itself during playback.

align : The align can be left, right or middle. This applies to the poster image for the video, and works just like it does with images.

embed : true or false. If true, then you can actually embed the video file or reference it and play it. If false, then it’s from a site like YouTube which disallows this. In this case, this should function like a link to external web page, but open an appropriate app (e.g. the YouTube app) instead of a browser. A Markua Processor can be smart about defaults, and parse the URL to set the value of the embed attribute.

float : The float can be left, right, inside or outside. This applies to the poster image for the video, and works just like it does with images. It is an error to specify both align and float.

height : This applies to the poster image for the video, and works just like it does with images.

poster : The poster is the URL or path to an image which should be shown instead of the video before the video is played. If a Markua Processor is outputting some format where it is known that video resources are not supported, it must choose the poster to use as a replacement for the video. Books are not just ebooks–books can also be printed on the fibers of trees that have been chopped down (“paper”), producing something called a “book”. These “books”, whether they are bound in a sturdy fashion (“hardcover books”) or a flimsy fashion (“paperback books”), have one thing in common with respect to embedded video: they do not support it. Finally, if a Markua Processor is embedding a web video from a web video service (e.g. YouTube) which has a known algorithm for the location of the poster images for its videos, the Markua Processor may choose to use that poster image if a poster attribute is not explicitly specified. However, if a poster attribute is specified, then it must be used by the Markua Processor.

poster-format : The format of the poster image, if it exists. If this attribute is not specified, the format of the poster is inferred from the poster image file extension. This attribute needs to exist only to support poster images which do not have file extensions.

transcript : The URL or path to a transcript of the video, which should be shown or dictated to people who cannot view video. This helps people with visual disabilities view course material, and helps with ADA compliance. For example, what Leanpub does with a transcript is as follows: a URL is linked to in the title (but in a way which does not affect the Table of Contents), and a path is assumed to be a path to a Markua file which is then used to produce a web page at a public URL, which then is also linked to in the title in a way which does not affect the Table of Contents. With the Markua files for transcripts, Leanpub only supports certain resource types to be included in the transcript files themselves. Specifically, images, video, audio are not supported, but code, math and tables are supported. When Leanpub generates a transcript, the URLs are publicly accessible but obscure. Identical transcripts should not make duplicate transcript files on the web, and using a new UUID every time would violate this. However, not changing the transcript URL when its content changed could lead to bad outcomes. So, the URL of a transcript on the web should be scoped to the book and affected both by its filename and its content. What Leanpub does is: (filename minus extension minus period plus hyphen plus the SHA hash).html Including the filename eliminates collisions if filenames are unique but the hashes are not.

width : This applies to the poster image for the video, and works just like it does with images.

In the following sections, please note that while the examples are shown with an HTML mapping, please note that a Markua Processor has complete flexibility over how it handles the location of video resources and their display.

10.10.2Local Video

Here's a paragraph before the figure.

![](pie_eating_contest.webm "A Piece of Pie")

Here's a paragraph after the figure.
<p>Here's a paragraph before the figure.</p>
<div class="figure">
<video src="pie_eating_contest.webm"/>
<p class="title">A Piece of Pie</p>
</div>
<p>Here's a paragraph after the figure.</p>

10.10.3Web Video

Here's a paragraph before the figure.

{type: video, poster: https://img.youtube.com/vi/VOCYL-FNbr0/mqdefault.jpg}
![](https://www.youtube.com/watch?t=105&v=VOCYL-FNbr0 "Introducing Markua")

Here's a paragraph after the figure.
<p>Here's a paragraph before the figure.</p>
<div class="figure">
<video src="https://www.youtube.com/watch?t=105&v=VOCYL-FNbr0"
poster="https://img.youtube.com/vi/VOCYL-FNbr0/mqdefault.jpg"/>
<p class="title">Introducing Markua</p>
</div>
<p>Here's a paragraph after the figure.</p>

10.11IFrame resources (M)

The syntax to insert an IFrame is almost identical to that is used for video resources.

The idea about having a syntax to insert an IFrame is simple: since Markua does not support inline HTML, there is no implicit support for it. However, resources such as slide decks can be an important part of, say, courses which are created in Markua.

So, the resource syntax is used for yet another task.

The only resource location which is supported for an IFrame resource is the web resource location, for obvious reasons.

10.11.1Supported Attributes for iframe Resources

The following are the supported attributes for IFrame resources, in addition to the class, format, title and type attributes which all resources support.

align : The align can be left, right or middle.

height : This works just like it does with images.

poster : The poster is the URL or path to an image which should be shown instead of the IFrame. If a Markua Processor is outputting some format where it is known that IFrame resources are not supported, it must choose the poster to use as a replacement for the IFrame. This poster is then turned into a link to a web page displaying the IFrame content, either top-level or as an IFrame.

poster-format : The format of the poster image, if it exists. If this attribute is not specified, the format of the poster is inferred from the poster image file extension. This attribute needs to exist only to support poster images which do not have file extensions.

transcript : The URL or path to a transcript of the resource contained in the IFrame, which should be shown or dictated to people who cannot view the IFrame content. This helps people with visual disabilities view course material, and helps with ADA compliance. For example, what Leanpub does with an iframe resource transcript is the same as what it does for a video transcript.

width : This works just like it does with images.

In the following sections, please note that while the examples are shown with an HTML mapping, please note that a Markua Processor has complete flexibility over how it handles the location of video resources and their display.

Here's a paragraph before the figure.

{type: iframe, poster: https://markua.com/Markua.png, width: 300, height: 200}
![](https://markua.com/Markua.pdf "Introducing Markua")

Here's a paragraph after the figure.
<p>Here's a paragraph before the figure.</p>
<div class="figure">
<iframe title="Introducing Markua"
    width="300"
    height="200"
    src="https://markua.com/Markua.pdf">
<p class="title">Introducing Markua</p>
</iframe>
</div>
<p>Here's a paragraph after the figure.</p>

10.12Audio resources (M)

The following are the audio resource formats and the file extensions which choose them by default:

aac : AAC audio - .aac, .m4a

mp3 : MP3 audio - .mp3

ogg : Ogg Vorbis audio - .oga, .ogg

wav : WAV audio - .wav, .wave

The syntax to insert an audio resource is the same compact and consistent syntax that is used for any resource. Local and web resource locations are supported for both audio formats; inline resource locations for audio are obviously not supported.

Just as with video, the audio support in ebooks and on the web is more varied than for images. With audio, there are MP3, AAC, Ogg and WAV formats all in widespread use, and there are a number of other formats with supporters. It’s entirely likely that many ebook readers won’t support any of them.

Unlike images where many images will accompany the manuscript itself, with audio it’s expected that authors will be uploading their audio files to various sites first, and then reusing these files in the contents of their Markua books. So, while audio can be local or web audio, web audio will be much more prominent than web images in a Markua book.

Furthermore, unlike with web images where the format (and thus the type) are typically discoverable from the file extension in the URL, with web audio it’s usually not. So, for web audio files, there will typically be a {type: audio} attribute list at a minimum. That said, if the type and format are not specified and the resource is a web resource, the Markua Processor may use the domain to decide what type of resource to assume. So YouTube audios may just work in some Markua Processors, like Leanpub, without specifying either the type or format.

The following types of audio resources are supported in Markua: MP3, AAC, WAV and Ogg Vorbis.

We will discuss the supported and the default attributes for audio files, and then show examples of audio being inserted for both local and web audio files.

10.12.1Supported Attributes for Audio

The following are the supported attributes for audio resources, in addition to the class, format, title and type attributes which all resources support.

poster : The poster is the URL or path to an image (such as an album cover or a photo of the artist) which should be shown to accompany the audio resource. A Markua Processor may choose to make this poster be a link to the resource, as well as making its title be a link to the resource.

transcript : The URL or path to a transcript of the audio, which should be shown to people who cannot hear audio. This helps people with auditory disabilities view course material, and helps with ADA compliance. Transcripts should be produced and handled in an identical way to video resources. For example, Leanpub does this. See the Video resources section above for more information.

In the following sections, please note that while the examples are shown with an HTML mapping, please note that a Markua Processor has complete flexibility over how it handles the location of video resources and their display.

10.12.2Local Audio

The full version of the talk is here:

![](talk.mp3 "Full Talk")
<p>The full version of the talk is here:</p>
<div class="figure">
<audio src="resources/talk.mp3"/>
<p class="title">Full Talk</p>
</div>

10.12.3Web Audio

The full version of the talk is here:

![](https://markua.com/talk.mp3 "Full Talk")
<p>The full version of the talk is here:</p>
<div class="figure">
<audio src="https://markua.com/talk.mp3"/>
<p class="title">Full Talk</p>
</div>

10.13Math resources (M)

Math can be a local, web or inline resource, just like any other resource, and the same resource syntax applies to code as to all other resources.

Markua does not specify how math is output in HTML. It can be an SVG or PNG image, with or without transparency, or it can use something like MathJax. After much research, Leanpub actually settled on using PNGs with a white background, for maximum compatibility with browsers, older devices and dark mode. However, a different Markua Processor could maake a different decision, like transparent SVGs with a dynamic foreground color, or MathJax or MathML.

There are two formats of math resources supported by Markua:

  1. LaTeX math (latexmath or $)
  2. AsciiMath (asciimath or @)

10.13.1Math resource formats

The following are the math resource formats and the file extensions which choose them by default:

asciimath : AsciiMath math - .asciimath

latexmath : LaTeX math - .tex

Note that AsciiMath and LaTeX math are almost always contained as inline resources in a Markua document, not in external files as local or web resources.

Also, note that .tex is assumed to be LaTeX math, not a vanilla LaTeX file. There is a simple reason for this: Markua needs LaTeX math for math, but only needs to display LaTeX for people writing about LaTeX. So, if you have a LaTeX file with an extension of .tex that just contains LaTeX code and you want to display it, then add a {type: code, format: latex} attribute list.

Note that the assumption is that AsciiMath will almost always be used as an inline resource. So, the .asciimath file extension is deliberately verbose.

10.13.2Supported Attributes for Math

The following are the supported attribute for math resources, in addition to the class, format, title and type attributes which all resources support:

alt : The alt is the alt text, to be displayed when the mathematical equations cannot be shown. The default alt text for math is “math”. This can be provided in the figure attribute list. This is primarily intended for Markua Processors that output math as images; there are no output requirements for the alt text. This attribute functions as it does for images. (In fact, a Markua Processor may choose to transform the math into an image, for maximum ebook reader compatibility.)

Note that for math, the format is the name of the syntax used to write the mathematical equations. There are two special types of format for math baked into Markua: latexmath for LaTeX math and asciimath for AsciiMath. If a Markua Processor encounters one of these formats, it must assume the type of the resource is math, not code.

10.13.3Using bold and strikethrough formatting with math

Markua supports formatting math as bold or strikethrough.

Math can be **bolded** by being wrapped with two asterisks, and can be formatted in ~~strikethrough~~ by being wrapped with two tildes. Note that the format of the strikethrough line is controlled by the strikethrough-format document setting.

10.13.4Local math resources

Local math resources can be inserted as a figure.

Here's a paragraph before the figure.

{format: latexmath}
![too large to fit in the alt text](fermat-proof.tex "Proof of Fermat's Last Theorem")

Here's a paragraph after the figure.

The alt text can also be set with an attribute list:

Here's a paragraph before the figure.

{format: latexmath, alt: "too large to fit in the alt text"}
![](fermat-proof.tex "Proof of Fermat's Last Theorem")

Here's a paragraph after the figure.

10.13.5Web math resources

This is identical to how local math resources work, including the significance of file extensions. The only difference is that the files are on the web.

Here's a paragraph before the figure.

{format: latexmath}
![too large to fit in the alt text](https://markua.com/fermat-proof.tex "Proof of Fermat's Last Theorem")

Here's a paragraph after the figure.

The alt text can also be set with an attribute list:

Here's a paragraph before the figure.

{format: latexmath, alt: "too large to fit in the alt text"}
![](https://markua.com/fermat-proof.tex "Proof of Fermat's Last Theorem")

Here's a paragraph after the figure.

10.13.6Inline math resources

Inline math resources are the most flexible way to insert math. They are the only way to insert math as a span resource, and the most straightforward way to add short math examples as figures. LaTeX math and AsciiMath can be inserted inline as a span or figure.

10.13.6.1Span

Being able to insert a math resource as a span is important, as it lets you write things like one of the kinematic equations d = v_i t + \frac{1}{2} a t^2$ inside sentences. This can be done with LaTeX math or AsciiMath.

To insert math as inline math, use a $ after closing backtick for LaTeX math, an @ after the closing backtick for AsciiMath, or an attribute list specifying a format of latexmath or asciimath. If none of these is done, the content of the backticks is treated as code and is output verbatim as monospaced text.

10.13.6.1.1LaTeX math span

There is syntactic sugar for LaTeX math which is inserted as a span, using the $ character after the closing backtick:

Here's one of the kinematic equations `d = v_i t + \frac{1}{2} a t^2`$ inside a
sentence.

The $ character indicates the inline resource is LaTeX math.

If you don’t like syntactic sugar, you can also use {format: latexmath} or {type: math, format: latexmath} after the inline span resource:

Here's one of the kinematic equations
`d = v_i t + \frac{1}{2} a t^2`{format: latexmath}
inside a sentence.
10.13.6.1.2AsciiMath span

AsciiMath is a way of producing simple MathML equations, using about 1% of the typing. It’s more terse than LaTeX math.

There is syntactic sugar for AsciiMath which is inserted as a span, using the @ character after the closing backtick:

Here's one of the kinematic equations `d = v_i t + 1/2 at^2`@ inside a sentence.

The @ character indicates the inline resource is AsciiMath.

If you don’t like syntactic sugar, you can also use {format: asciimath} or {type: math, format: asciimath} after the inline span resource:

Here's one of the kinematic equations `d = v_i t + 1/2 at^2`{format: asciimath}
inside a sentence.

10.13.6.2Figure

LaTeX math and AsciiMath can be inserted inline as a figure.

10.13.6.2.1LaTeX math figures

Here’s how you do this using LaTeX math…

Here’s the version with the syntactic sugar for the format after the backticks:

Here's a paragraph before the figure.

```$
\left|\sum_{i=1}^n a_ib_i\right|
\le
\left(\sum_{i=1}^n a_i^2\right)^{1/2}
\left(\sum_{i=1}^n b_i^2\right)^{1/2}
```

Here's a paragraph after the figure.

Here’s the same thing, with the full format after the backticks:

Here's a paragraph before the figure.

```latexmath
\left|\sum_{i=1}^n a_ib_i\right|
\le
\left(\sum_{i=1}^n a_i^2\right)^{1/2}
\left(\sum_{i=1}^n b_i^2\right)^{1/2}
```

Here's a paragraph after the figure.

Here’s the same thing again, with a full attribute list:

Here's a paragraph before the figure.

{format: latexmath}
```
\left|\sum_{i=1}^n a_ib_i\right|
\le
\left(\sum_{i=1}^n a_i^2\right)^{1/2}
\left(\sum_{i=1}^n b_i^2\right)^{1/2}
```

Here's a paragraph after the figure.
10.13.6.2.2AsciiMath figures

Here’s how you do this using AsciiMath…

Here’s the version with the syntactic sugar for the format after the backticks:

Here's a paragraph before the figure.

```@
abs(sum_(i=1)^n a_i b_i) <= (sum_(i=1)^n a_i^2)^(1/2) (sum_(i=1)^n b_i^2)^(1/2)
```

Here's a paragraph after the figure.

Here’s the same thing, with the full format after the backticks:

Here's a paragraph before the figure.

```asciimath
abs(sum_(i=1)^n a_i b_i) <= (sum_(i=1)^n a_i^2)^(1/2) (sum_(i=1)^n b_i^2)^(1/2)
```

Here's a paragraph after the figure.

Here’s the same thing again, with a full attribute list:

Here's a paragraph before the figure.

{format: asciimath}
```
abs(sum_(i=1)^n a_i b_i) <= (sum_(i=1)^n a_i^2)^(1/2) (sum_(i=1)^n b_i^2)^(1/2)
```

Here's a paragraph after the figure.

If you wonder why I’m a fan of AsciiMath: I actually got that right on the first try at the AsciiMath website.

Note that when you are writing about AsciiMath and want to display the AsciiMath text, what you are really doing is creating a code resource. This is essentially identical to what was shown earlier in writing about SVG images. It’s also shown below.

10.13.7Writing about Math

If you want to write about math, and show the actual code (instead of the formatted output), it needs to be of a format of code, not math.

Now, you can just be lazy and not provide format or type attributes at all, since guessing when neither is present always produces a type of code.

This will be output as code:

```
abs(sum_(i=1)^n a_i b_i) <= (sum_(i=1)^n a_i^2)^(1/2) (sum_(i=1)^n b_i^2)^(1/2)
```

This isn’t just true for AsciiMath code; it’s true for any math. For example, this will be output as code, since there’s no $ or latexmath format specifier to indicate that it is LaTeX math:

```
\left|\sum_{i=1}^n a_ib_i\right|
\le
\left(\sum_{i=1}^n a_i^2\right)^{1/2}
\left(\sum_{i=1}^n b_i^2\right)^{1/2}
```

However, you can also specify the type of code, to be explicit:

{type: code}
```
abs(sum_(i=1)^n a_i b_i) <= (sum_(i=1)^n a_i^2)^(1/2) (sum_(i=1)^n b_i^2)^(1/2)
```

Since specifying a type overrides the type inferred by the format, you can even specify the format of the math being used, while still keeping it code:

{type: code, format: asciimath}
```
abs(sum_(i=1)^n a_i b_i) <= (sum_(i=1)^n a_i^2)^(1/2) (sum_(i=1)^n b_i^2)^(1/2)
```

This also works with the syntactic sugar with the full format name:

{type: code}
```asciimath
abs(sum_(i=1)^n a_i b_i) <= (sum_(i=1)^n a_i^2)^(1/2) (sum_(i=1)^n b_i^2)^(1/2)
```

This also works with the syntactic sugar with the format shortcut:

{type: code}
```@
abs(sum_(i=1)^n a_i b_i) <= (sum_(i=1)^n a_i^2)^(1/2) (sum_(i=1)^n b_i^2)^(1/2)
```

Now, chances are a Markua Processor will not be doing syntax highlighting on AsciiMath.

If you want to ensure that no syntax highlighting is done, you can just specify text:

```text
abs(sum_(i=1)^n a_i b_i) <= (sum_(i=1)^n a_i^2)^(1/2) (sum_(i=1)^n b_i^2)^(1/2)
```

Heck, you can even use tildes, since this defaults to text not guess:

~~~
abs(sum_(i=1)^n a_i b_i) <= (sum_(i=1)^n a_i^2)^(1/2) (sum_(i=1)^n b_i^2)^(1/2)
~~~

Autolinks are absolute URIs and email addresses inside < and >. They are parsed as links, with the URL or email address as the link label.

A URI autolink consists of <, followed by an absolute URI followed by >. It is parsed as a link to the URI, with the URI as the link’s label.

An absolute URI, for these purposes, consists of a scheme followed by a colon (:) followed by zero or more characters other than ASCII control characters, space, <, and >. If the URI includes these characters, they must be percent-encoded (e.g. %20 for a space).

For purposes of this spec, a scheme is any sequence of 2–32 characters beginning with an ASCII letter and followed by any combination of ASCII letters, digits, or the symbols plus (“+”), period (“.”), or hyphen (“-”).

Here are some valid autolinks:

Example 645
<http://foo.bar.baz>
<p><a href="http://foo.bar.baz">http://foo.bar.baz</a></p>
Example 646
<https://foo.bar.baz/test?q=hello&id=22&boolean>
<p><a href="https://foo.bar.baz/test?q=hello&amp;id=22&amp;boolean">https://foo.bar.baz/test?q=hello&amp;id=22&amp;boolean</a></p>
Example 647
<irc://foo.bar:2233/baz>
<p><a href="irc://foo.bar:2233/baz">irc://foo.bar:2233/baz</a></p>

Uppercase is also fine:

Example 648
<MAILTO:FOO@BAR.BAZ>
<p><a href="MAILTO:FOO@BAR.BAZ">MAILTO:FOO@BAR.BAZ</a></p>

Note that many strings that count as absolute URIs for purposes of this spec are not valid URIs, because their schemes are not registered or because of other problems with their syntax:

Example 649
<a+b+c:d>
<p><a href="a+b+c:d">a+b+c:d</a></p>
Example 650
<made-up-scheme://foo,bar>
<p><a href="made-up-scheme://foo,bar">made-up-scheme://foo,bar</a></p>
Example 651
<https://../>
<p><a href="https://../">https://../</a></p>
Example 652
<localhost:5001/foo>
<p><a href="localhost:5001/foo">localhost:5001/foo</a></p>

Spaces are not allowed in autolinks:

Example 653
<https://foo.bar/baz bim>
<p>&lt;https://foo.bar/baz bim&gt;</p>

Backslash-escapes do not work inside autolinks:

Example 654
<https://example.com/\[\>
<p><a href="https://example.com/%5C%5B%5C">https://example.com/\[\</a></p>

An email autolink consists of <, followed by an email address, followed by >. The link’s label is the email address, and the URL is mailto: followed by the email address.

An email address, for these purposes, is anything that matches the non-normative regex from the HTML5 spec:

/^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?
(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/

Examples of email autolinks:

Example 655
<foo@bar.example.com>
<p><a href="mailto:foo@bar.example.com">foo@bar.example.com</a></p>
Example 656
<foo+special@Bar.baz-bar0.com>
<p><a href="mailto:foo+special@Bar.baz-bar0.com">foo+special@Bar.baz-bar0.com</a></p>

Backslash-escapes do not work inside email autolinks:

Example 657
<foo\+@bar.example.com>
<p>&lt;foo+@bar.example.com&gt;</p>

These are not autolinks:

Example 658
<>
<p>&lt;&gt;</p>
Example 659
< https://foo.bar >
<p>&lt; https://foo.bar &gt;</p>
Example 660
<m:abc>
<p>&lt;m:abc&gt;</p>
Example 661
<foo.bar.baz>
<p>&lt;foo.bar.baz&gt;</p>
Example 662
https://example.com
<p>https://example.com</p>
Example 663
foo@bar.example.com
<p>foo@bar.example.com</p>

10.15No raw HTML (M)

As discussed, Markua does not support raw HTML. All raw HTML is removed. As a side-effect, this means HTML comments can still be used, since they (like all HTML) will be removed from the output.

10.16Hard line breaks

A line ending (not in a code span or HTML tag) that is preceded by two or more spaces and does not occur at the end of a block is parsed as a hard line break (rendered in HTML as a <br /> tag):

Example 664
foo  
baz
<p>foo<br />
baz</p>

For a more visible alternative, a backslash before the line ending may be used instead of two or more spaces:

Example 665
foo\
baz
<p>foo<br />
baz</p>

More than two spaces can be used:

Example 666
foo       
baz
<p>foo<br />
baz</p>

Leading spaces at the beginning of the next line are ignored:

Example 667
foo  
     bar
<p>foo<br />
bar</p>
Example 668
foo\
     bar
<p>foo<br />
bar</p>

Hard line breaks can occur inside emphasis, links, and other constructs that allow inline content:

Example 669
*foo  
bar*
<p><em>foo<br />
bar</em></p>
Example 670
*foo\
bar*
<p><em>foo<br />
bar</em></p>

Hard line breaks do not occur inside code spans

Example 671
`code  
span`
<p><code>code   span</code></p>
Example 672
`code\
span`
<p><code>code\ span</code></p>

or HTML tags:

Example 673
<a href="foo  
bar">
<p><a href="foo  
bar"></p>
Example 674
<a href="foo\
bar">
<p><a href="foo\
bar"></p>

Hard line breaks are for separating inline content within a block. Neither syntax for hard line breaks works at the end of a paragraph or other block element:

Example 675
foo\
<p>foo\</p>
Example 676
foo  
<p>foo</p>
Example 677
### foo\
<h3>foo\</h3>
Example 678
### foo  
<h3>foo</h3>

10.17Soft line breaks

A regular line ending (not in a code span or HTML tag) that is not preceded by two or more spaces or a backslash is parsed as a softbreak. (A soft line break may be rendered in HTML either as a line ending or as a space. The result will be the same in browsers. In the examples here, a line ending will be used.)

Example 679
foo
baz
<p>foo
baz</p>

Spaces at the end of the line and beginning of the next line are removed:

Example 680
foo 
 baz
<p>foo
baz</p>

A conforming parser may render a soft line break in HTML either as a line ending or as a space.

A renderer may also provide an option to render soft line breaks as hard line breaks.

10.18Character substitutions (M)

Markua documents can be written in UTF-8, so to produce any Unicode character, it is possible to just use the proper Unicode characters. However, in certain cases, it’s desirable for Markua to specify automatic replacement of certain combinations of characters with a Unicode replacement. If a Markua Processor encounters one of these combinations of characters outside of a code block, the Markua Processor must replace the combination of characters with the appropriate Unicode character in the output.

-- : To produce an em dash (—), what is thought of by non-typography people as a “dash” or a “long dash”, you can just type two hyphens (--) directly after a non-space character. You can also use the proper Unicode character, U+2014, of course. The following all produce em dashes: foo--bar, foo-- bar, foo--.

-- : To produce a space followed by an en dash (–), or the kind of dash that’s wider than a hyphen but narrower than an em dash, you can just type a space, followed by two hyphens ( --). You can also use the proper Unicode character, U+2013, of course. The following both produce en dashes preceded by spaces: foo -- bar, foo --. (With foo -- bar, there’s a space before and after the en dash; with foo --, there’s no space after it (e.g. at the end of a paragraph).

... : To produce a horizontal ellipsis (…), you can just type .... You can also use the proper Unicode character, U+2026, of course.

10.18.1Optional Automatic Curly Quotes Outside of Code Blocks and Spans

A Markua Processor may replace the " character with the appropriate “curly quote” at its discretion. This lets "typography" become “typography”, and it's become it’s as appropriate.

Note that this is an optional behavior: a Markua Processor may support this fully, only in some output formats, or not at all.

Also, note that it is NEVER acceptable for a Markua Processor to do this, or any character substitution, to text inside a code block or code span. In almost all instances this would completely break the code. (If you wonder how I got curly quotes into the code spans for “typography” and it’s above, it’s because I pasted them into the manuscript that way. Just as a Markua Processor doesn’t make straight quotes curly in a code span, it doesn’t make curly quotes straight in a code span either.)

10.19Footnotes and endnotes (M)

Books often have footnotes and endnotes. So, Markua has them too.

10.19.1Footnotes

To add a footnote, you insert a footnote tag using square brackets, a caret and the tag, like this:

This has a footnote[^thenote].

Then, you define the footnote later in the document, using the same square brackets, caret and tag, followed by a colon, a space and the footnote definition:

[^thenote]: This is the footnote content.

If you wish to write multiple paragraphs in the footnote, you must indent the subsequent paragraphs by four spaces or one tab:

This has a footnote[^thenote].

Here is some unrelated text.

[^thenote]: This is the first paragraph of footnote content.

    This is the second paragraph of footnote content.

Here is some more unrelated text.

Whether the numbering of footnotes restarts every chapter is something that can be specified by the restart-footnote-numbering setting.

Markua does not specify how footnotes are output in HTML. A Markua Processor should output them somewhere, but the details are not specified. This is deliberate, in order to maximize implementation flexibility for Markua Processors.

10.19.2Endnotes

Sometimes endnotes are used instead of footnotes, but other times, these are in addition to footnotes. So, it makes sense for Markua to define separate syntaxes for both, rather than just defining one “footnote or endnote” syntax and letting the author pick whether the notes are footnotes or endnotes via a document setting.

To add an endnote, you insert an endnote tag using square brackets, two carets and the tag, like this:

This has an endnote[^^thenote].

Endnotes are like footnotes, but happier (^^).

Then, you define the endnote later in the document, using the same square brackets, two carets and tag, followed by a colon, a space and the endnote definition:

[^^thenote]: This is the endnote content.

Just as with footnotes, if you wish to write multiple paragraphs in an endnote, you must indent the subsequent paragraphs by four spaces or one tab.

Whether the numbering of endnotes restarts every chapter is something that can be specified by the restart-endnote-numbering setting.

Markua does not specify how endnotes are output in HTML. A Markua Processor should output them somewhere, but the details are not specified. This is deliberate, in order to maximize implementation flexibility for Markua Processors.

10.19.3Single reference to footnotes and endnotes

You can only refer to a footnote or endnote once. You can’t define a footnote or endnote in one place and refer to it multiple times in the same Markua document. If you wish to refer to a parenthetical piece of text from multiple places in a Markua document, the best approach is to put it in a section (or sub-section, sub-sub-section, etc.) or aside and refer to it from multiple places using a crosslink.

10.19.4Footnote and endnote support is required for paragraphs only

A Markua Processor must support footnote and endnote references inserted in normal paragraph content. However, that’s it.

However, sometimes authors want to get creative with their footnotes and endnotes. Sometimes they want to add them in headings, or in footnotes or endnotes themselves. This latter style has been used on rare occasions, most notably by David Foster Wallace.

However, supporting inserting footnotes and endnotes in places other than normal paragraph content puts a hugely increased burden on implementors of Markua Processors. As such, there is no requirement for a Markua Processor to support inserting a footnote or endnote anywhere other than in normal paragraph content.

Authors should not assume that a particular Markua Processor supports inserting a footnote or endnote anywhere other than in normal paragraph content unless its documentation specifically states that it does. For example, Leanpub only supports inserting footnotes or endnotes in normal paragraph content.

10.20Underline (M)

In Markdown as defined by John Gruber, and in CommonMark and GFM, *one asterisk* and _one underscore_ both produce italics, and there is no way to produce an underline in Markdown except using inline HTML.

This is unfortunate, since underline is not always just a typewriter version of italics. In some languages and in some contexts, underlining serves a distinct, legitimate purpose.

Worse, Markua bans all inline HTML except HTML comments. So even the gross Markdown workaround to produce underlines with HTML does not work.

So, how to produce an actual underline?

In Markua, *one asterisk* produces italics, and _one underscore_ can produce either italics or an underline based on the italicize-underlines document setting.

The document setting italicize-underlines can be true or false. The default is true, so that Markua functions the same way as Markdown by default.

This will be a bit surprising for new authors who are discovering Markua and have never heard of Markdown, such as people writing novels in Markua. However, it is my expectation that for the foreseeable future, the proportion of authors who discover Markdown first and then learn about Markua will be far greater. For these people, it would be a lot more surprising if all their text which they had italicized like _this_ suddenly became underlined instead!

Furthermore, if you’ve written in a certain way (like _this_ for italic) for years, your fingers essentially just do the right thing. I want this to still feel right for these people, not something that makes them think they need to go change a document setting. Almost nobody has spent years writing in Markua, so this is much less of an issue for them!

(Also, frankly, most novels just need italic, not underline, so this isn’t a big issue. While underline does have legitimate uses, they are more niche than italic.)

So, by default, italicize-underlines is true:

# Chapter One

stuff

foo _bar_ baz

lorem *ipsum* dolor

stuff
<h1>Chapter One</h1>
<p>stuff</p>
<p>foo <em>bar</em> baz</p>
<p>lorem <em>ipsum</em> dolor</p>
<p>stuff</p>

This can be made explicit by setting the italicize-underlines document setting to true:

{
italicize-underlines: true
}

# Chapter One

stuff

foo _bar_ baz

lorem *ipsum* dolor

stuff
<h1>Chapter One</h1>
<p>stuff</p>
<p>foo <em>bar</em> baz</p>
<p>lorem <em>ipsum</em> dolor</p>
<p>stuff</p>

Set the italicize-underlines document setting to false to produce underline:

{
italicize-underlines: false
}

# Chapter One

stuff

foo _bar_ baz

lorem *ipsum* dolor

stuff
<h1>Chapter One</h1>
<p>stuff</p>
<p>foo <u>bar</u> baz</p>
<p>lorem <em>ipsum</em> dolor</p>
<p>stuff</p>

To produce text which is underlined as well as other bolded and/or italicized, you combine the underscores for underlining with the asterisks for bold and italics:

{
italicize-underlines: false
}

# Chapter One

stuff

This is _*italicized and underlined*_.

This is _**bolded and underlined**_.

This is _***bolded and italicized and underlined***_.

stuff
<h1>Chapter One</h1>
<p>stuff</p>
<p>This is <u><em>italicized and underlined</em></u>.</p>
<p>This is <u><strong>bolded and underlined</strong></u>.</p>
<p>This is <u><strong><em>bolded and italicized and underlined</em></strong></u>.</p>
<p>stuff</p>

10.21Superscript and subscript (M)

To produce superscript like the 3 in 5^3^ = 125, surround it with carets like 5^3^ = 125.

Superscript: 5^3^ = 125
<p>Superscript: 5<sup>3</sup> = 125</p>

To produce subscript like the 2 in H~2~O, surround it with single tildes like H~2~O.

Subscript: H~2~O
<p>Subscript: H<sub>2</sub>O</p>

10.22Strikethrough (GFM)

Markua enables the strikethrough extension, where an additional emphasis type is available.

Strikethrough text is any text wrapped in two tildes (~).

~~Hi~~ Hello, world!
<p><del>Hi</del> Hello, world!</p>

As with regular emphasis delimiters, a new paragraph will cause strikethrough parsing to cease:

This ~~has a

new paragraph~~.
<p>This ~~has a</p>
<p>new paragraph~~.</p>

10.23Index entries (M)

Markua supports adding index entries via the attribute list syntax. Index entries let authors or indexers produce professional-looking indexes in Markua books.

Index entries can either be attached to block or span elements using the same attribute list syntax. In fact, index entries can just be added as part of a larger attribute list.

The actual syntax of what the value of an index entry looks like is inspired by LaTeX.

The key of an index entry is i, for index.

In its simplest form, an index entry is simply {i: "hello"}. Like any attribute list, you don’t need a space between the colon and the quote–you can also do {i:"hello"}.

These are the various formats of an index entry:

{i: hello}
{i: "hello"}
{i: "Armstrong, Peter"}
{i: "Yahoo\!"}
{i: "*hello*"}
{i: "**hello**"}
{i: "hello!Peter"}
{i: "hello!*Peter*"}
{i: "hello!**Peter**"}
{i: "Peter|see{i:'hello'}"}
{i: "Jen|seealso{i:'Jenny'}"}

Here’s what they do:

{i: hello} : Adds an index entry for hello. If an index entry has no punctuation or formatting then it does not need quotes.

{i: "hello"} : Adds an index entry for hello. Quotes are always fine to use, even when not required.

{i: "Armstrong, Peter"} : Adds an index entry for Armstrong, Peter. The quotes are always omitted. Their function is to allow things like exclamation marks and other punctuation to be added without fear, in case you don’t feel like learning which punctuation is safe.

{i: "Yahoo\!"} : Adds an index entry for Yahoo!. Note that the exclamation mark must be backslash-escaped because ! is a delimiter otherwise. The |, {, } and \ characters also must be backslash-escaped.

{i: "*hello*"} : Adds an index entry for hello, with hello in italics.

{i: "**hello**"} : Adds an index entry for hello, with hello in bold.

{i: "hello!Peter"} : Adds an index entry for Peter which is a sub-entry of hello.

{i: "hello!*Peter*"} : Adds an index entry for Peter (with Peter in emphasis) which is a sub-entry of hello. Note that this cannot be combined with a see or seealso (the | syntax).

{i: "hello!**Peter**"} : Adds an index entry for Peter (with Peter in strong emphasis) which is a sub-entry of hello. Note that this cannot be combined with a see or seealso (the | syntax).

{i: "Peter|see{i:'hello'}"} : Adds an index entry for Peter, which references the index entry hello with the equivalent of “see” in the language of the book. Note that this cannot be combined with a sub-entry (the ! syntax).

{i: "Jen|seealso{i:'Jenny'}"} : Adds an index entry for Jen, which references the index entry Jenny with the equivalent of “see also” in the language of the book. Note that this cannot be combined with a sub-entry (the ! syntax).

Index entries are case sensitive. For example, {i: "mark"} and {i: "Mark"} are distinct entries. (The first is for a result or a smudge, the second is a person’s name.)

To attach an index entry to the start of a block, put it on its own line above a block:

{i: "hello"}
I just came to say hello, hello, hello, hello

To attach an index entry to a word, just add the index entry after the word:

I just came to say hello{i: "hello"}, hello, hello, hello

To attach an index entry to a span element, just add the index entry after the span element:

The first program that a programmer writes in a language is usually
*Hello World*{i: "Hello World"}

Index entries can have commas and other punctuation (except colons) in their definition:

My wife read some book about a whale by Herman Melville{i: "Melville, Herman"}.

Multiple index entries can exist in a block, or even a sentence:

Supposedly the great-great-great-granduncle of the musician Moby{i: "Moby"} was
Herman Melville{i: "Melville, Herman"}, the author of a book about a
whale{i: "Moby-Dick; or, The Whale"}.

Note that adding index entries is best left until the author is done writing the book. At that time, ids like {#myid} can be converted to {id: #myid, i: “blah”} if index entries are being added where ids already are.

Markua does not specify the HTML output of index entries, to maximize implementation flexibility.

Finally, please note that if you use an exclamation mark (!) symbol in an index entry, you must backslash-escape it. Otherwise, it would be intepreted as an attempt at creating a sub-entry. (This actually got me, when I tried to reference Hello! Flex 4, one of my books. To do this, I would have to write the entry as {i: "*Hello\! Flex 4*} not as {i: "*Hello! Flex 4*}.

10.24Syntactic sugar list (M)

Markua has a handful of syntactic sugar that it relies on to make life nicer for authors. Here’s a helpful summary list of the different syntactic sugar elements provided:

$ : When used following three backticks or tildes which start an inline resource, adds {type: math, format: latexmath} to the resource.

@ : When used following three backticks or tildes which start an inline resource, adds {type: math, format: asciimath} to the resource.

! : When used following three backticks or tildes which start an inline resource, adds {type: image, format: svg} to the resource.

& : When used following three backticks or tildes which start an inline resource, adds {type: verbatim} to the resource.

^ : When used as the first character on a line by itself above a paragraph, adds a {class: continued-para}

three backticks to start a resource : {format: guess}

three tildes to start a resource : {format: text}

10.25Textual content

Any characters not given an interpretation by the above rules will be parsed as plain textual content.

Example 689
hello $.;'there
<p>hello $.;'there</p>
Example 690
Foo χρῆν
<p>Foo χρῆν</p>

Internal spaces are preserved verbatim:

Example 691
Multiple     spaces
<p>Multiple     spaces</p>

Appendix: A parsing strategy

In this appendix we describe some features of the parsing strategy used in the CommonMark reference implementations.

Overview

Parsing has two phases:

  1. In the first phase, lines of input are consumed and the block structure of the document—its division into paragraphs, block quotes, list items, and so on—is constructed. Text is assigned to these blocks but not parsed. Link reference definitions are parsed and a map of links is constructed.

  2. In the second phase, the raw text contents of paragraphs and headings are parsed into sequences of Markdown inline elements (strings, code spans, links, emphasis, and so on), using the map of link references constructed in phase 1.

At each point in processing, the document is represented as a tree of blocks. The root of the tree is a document block. The document may have any number of other blocks as children. These children may, in turn, have other blocks as children. The last child of a block is normally considered open, meaning that subsequent lines of input can alter its contents. (Blocks that are not open are closed.) Here, for example, is a possible document tree, with the open blocks marked by arrows:

-> document
  -> block_quote
       paragraph
         "Lorem ipsum dolor\nsit amet."
    -> list (type=bullet tight=true bullet_char=-)
         list_item
           paragraph
             "Qui *quodsi iracundia*"
      -> list_item
        -> paragraph
             "aliquando id"

Phase 1: block structure

Each line that is processed has an effect on this tree. The line is analyzed and, depending on its contents, the document may be altered in one or more of the following ways:

  1. One or more open blocks may be closed.
  2. One or more new blocks may be created as children of the last open block.
  3. Text may be added to the last (deepest) open block remaining on the tree.

Once a line has been incorporated into the tree in this way, it can be discarded, so input can be read in a stream.

For each line, we follow this procedure:

  1. First we iterate through the open blocks, starting with the root document, and descending through last children down to the last open block. Each block imposes a condition that the line must satisfy if the block is to remain open. For example, a block quote requires a > character. A paragraph requires a non-blank line. In this phase we may match all or just some of the open blocks. But we cannot close unmatched blocks yet, because we may have a lazy continuation line.

  2. Next, after consuming the continuation markers for existing blocks, we look for new block starts (e.g. > for a block quote). If we encounter a new block start, we close any blocks unmatched in step 1 before creating the new block as a child of the last matched container block.

  3. Finally, we look at the remainder of the line (after block markers like >, list markers, and indentation have been consumed). This is text that can be incorporated into the last open block (a paragraph, code block, heading, or raw HTML).

Setext headings are formed when we see a line of a paragraph that is a setext heading underline.

Reference link definitions are detected when a paragraph is closed; the accumulated text lines are parsed to see if they begin with one or more reference link definitions. Any remainder becomes a normal paragraph.

We can see how this works by considering how the tree above is generated by four lines of Markdown:

> Lorem ipsum dolor
sit amet.
> - Qui *quodsi iracundia*
> - aliquando id

At the outset, our document model is just

-> document

The first line of our text,

> Lorem ipsum dolor

causes a block_quote block to be created as a child of our open document block, and a paragraph block as a child of the block_quote. Then the text is added to the last open block, the paragraph:

-> document
  -> block_quote
    -> paragraph
         "Lorem ipsum dolor"

The next line,

sit amet.

is a “lazy continuation” of the open paragraph, so it gets added to the paragraph’s text:

-> document
  -> block_quote
    -> paragraph
         "Lorem ipsum dolor\nsit amet."

The third line,

> - Qui *quodsi iracundia*

causes the paragraph block to be closed, and a new list block opened as a child of the block_quote. A list_item is also added as a child of the list, and a paragraph as a child of the list_item. The text is then added to the new paragraph:

-> document
  -> block_quote
       paragraph
         "Lorem ipsum dolor\nsit amet."
    -> list (type=bullet tight=true bullet_char=-)
      -> list_item
        -> paragraph
             "Qui *quodsi iracundia*"

The fourth line,

> - aliquando id

causes the list_item (and its child the paragraph) to be closed, and a new list_item opened up as child of the list. A paragraph is added as a child of the new list_item, to contain the text. We thus obtain the final tree:

-> document
  -> block_quote
       paragraph
         "Lorem ipsum dolor\nsit amet."
    -> list (type=bullet tight=true bullet_char=-)
         list_item
           paragraph
             "Qui *quodsi iracundia*"
      -> list_item
        -> paragraph
             "aliquando id"

Phase 2: inline structure

Once all of the input has been parsed, all open blocks are closed.

We then “walk the tree,” visiting every node, and parse raw string contents of paragraphs and headings as inlines. At this point we have seen all the link reference definitions, so we can resolve reference links as we go.

document
  block_quote
    paragraph
      str "Lorem ipsum dolor"
      softbreak
      str "sit amet."
    list (type=bullet tight=true bullet_char=-)
      list_item
        paragraph
          str "Qui "
          emph
            str "quodsi iracundia"
      list_item
        paragraph
          str "aliquando id"

Notice how the line ending in the first paragraph has been parsed as a softbreak, and the asterisks in the first list item have become an emph.

By far the trickiest part of inline parsing is handling emphasis, strong emphasis, links, and images. This is done using the following algorithm.

When we’re parsing inlines and we hit either

we insert a text node with these symbols as its literal content, and we add a pointer to this text node to the delimiter stack.

The delimiter stack is a doubly linked list. Each element contains a pointer to a text node, plus information about

When we hit a ] character, we call the look for link or image procedure (see below).

When we hit the end of the input, we call the process emphasis procedure (see below), with stack_bottom = NULL.

Starting at the top of the delimiter stack, we look backwards through the stack for an opening [ or ![ delimiter.

process emphasis

Parameter stack_bottom sets a lower bound to how far we descend in the delimiter stack. If it is NULL, we can go all the way to the bottom. Otherwise, we stop before visiting stack_bottom.

Let current_position point to the element on the delimiter stack just above stack_bottom (or the first element if stack_bottom is NULL).

We keep track of the openers_bottom for each delimiter type (*, _), indexed to the length of the closing delimiter run (modulo 3) and to whether the closing delimiter can also be an opener. Initialize this to stack_bottom.

Then we repeat the following until we run out of potential closers:

After we’re done, we remove all delimiters above stack_bottom from the delimiter stack.