SEP 13: Switch to a more reasonable source markup language
Markdown is far and away the dominant lightweight markup language for authoring documents that are later to be processed into HTML.
Unfortunately, Markdown is poorly specified. John Gruber gave a great gift to the world back in 2004, but its incomplete and inconsistent specification creates annoying edge cases and disagreements between parsers. It also arguably requires parsing to be non-local (state must be held throughout the parsing of a document).
In April 2018 John MacFarlane authored a brilliantly concise essay (Beyond Markdown) that offered a way out of both of these issues. He followed the essay up with the release of Djot, a well-specified and thoughtful alternative to Markdown.
Update: I have since adopted Djot as the markup for this site.
Djot's thoughtful design makes for a specification that is much, much easier to reason through, and avoids many of the pesky precedence interactions that complicate Markdown and CommonMark.
That said, Djot is not necessarily without its issues, and this evaluation by Vas should be thoroughly reviewed before going to the effort of extricating some 200,000+ words from the clutches of Markdown.
Several features specific to Markdown, or to python-markdown, would have to be accounted for or translated during any parser migration. Below is a no doubt non-exhaustive list:
- Djot requires that integrated HTML elements be marked up as such: see raw inline and raw block.
- Djot uses a backslash followed by a newline for a hard line break, in contrast to Markdown, which uses two trailing spaces ('  '). These will have to be translated.
- Both lists and definition lists are handled differently and will need checking.
- As above, but more broadly, blank lines between elements are strictly required and will need vetting during and after the switchover.
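
The hard line break translation is mechanical enough to script. Below is a minimal sketch in Python, assuming the source files use two or more trailing spaces for hard breaks and mark code with backtick or tilde fences. The function name `translate_hard_breaks` is hypothetical, and the sketch does not handle indented code blocks or inline code spans, so its output would still need review.

```python
def translate_hard_breaks(text: str) -> str:
    """Rewrite Markdown hard breaks (2+ trailing spaces) as backslash-newline."""
    lines = text.split("\n")
    in_fence = False
    # The final element has no following line, so it can never be a hard break.
    for i, line in enumerate(lines[:-1]):
        # Assumption: any line starting with ``` or ~~~ opens or closes a fence.
        if line.lstrip().startswith(("```", "~~~")):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        stripped = line.rstrip(" ")
        has_trailing_spaces = len(line) - len(stripped) >= 2
        next_line_has_text = bool(lines[i + 1].strip())
        # Trailing spaces only act as a hard break when the paragraph continues.
        if stripped.strip() and has_trailing_spaces and next_line_has_text:
            lines[i] = stripped + "\\"
    return "\n".join(lines)
```

Trailing spaces at the end of a paragraph are deliberately left alone: they do not produce a break in Markdown, so translating them would add a stray backslash.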
If I do make the transition, I would also propose that parsing be made extremely strict, i.e. any unrecognised syntax or precedence should cause a build to fail.
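
As a rough illustration of what a failing build could look like, here is a sketch of a pre-build check in Python. It is not a Djot parser: it only flags one of the cases listed above, a list that starts directly after a paragraph line with no blank line between them. The names `StrictMarkupError`, `find_unseparated_lists` and `check_strict` are hypothetical, and only the common `-`, `*`, `+`, `1.` and `1)` list markers are recognised.

```python
import re

# Assumption: only these common list markers are checked.
_LIST_ITEM = re.compile(r"^\s*(?:[-*+]|\d+[.)])\s+\S")


class StrictMarkupError(Exception):
    """Raised so that an uncaught error aborts the build."""


def find_unseparated_lists(text: str) -> list[int]:
    """Return 1-based line numbers where a list starts right after paragraph text."""
    problems = []
    in_fence = False
    state = None  # None after a blank line, else "para" or "list"
    for number, line in enumerate(text.split("\n"), start=1):
        if line.lstrip().startswith(("```", "~~~")):
            in_fence = not in_fence
            state = None
            continue
        if in_fence:
            continue
        if not line.strip():
            state = None
        elif _LIST_ITEM.match(line):
            if state == "para":
                problems.append(number)
            state = "list"
        elif state is None:
            state = "para"
    return problems


def check_strict(text: str) -> None:
    """Fail loudly if any list is not separated from the preceding text."""
    problems = find_unseparated_lists(text)
    if problems:
        raise StrictMarkupError(f"list not preceded by a blank line at lines {problems}")
```

Called from the build script on each source file, an uncaught `StrictMarkupError` would stop the build with a non-zero exit status. A real strict mode would more likely hook into the parser's own warnings than rely on line patterns like this.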