Mini Markdown

January 21, 2020

The first step is to create an input file to be used for testing:

This is a text paragraph.

# Heading level 1

This is another text paragraph.

- List item 1

- List item 2

- List item 3

This is still another text paragraph.

### Heading level 3

This is the last text paragraph.

I’ll write the program in Awk, which is rapidly becoming my second-favorite language, because it has a “paragraph mode” that is very useful. From the POSIX specification, in the definition of RS:

If RS is null, then records are separated by sequences consisting of a <newline> plus one or more blank lines, leading or trailing blank lines shall not result in empty records at the beginning or end of the input, and a <newline> shall always be a field separator, no matter what the value of FS is.
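As a rough illustration of that rule, here is a minimal Python sketch of paragraph-mode splitting. It is not how any Awk implementation works internally; the function name `split_records` is hypothetical, and the sketch assumes a blank line is a completely empty line (no spaces or tabs).

```python
import re

def split_records(text):
    """Split text into paragraph-mode records, each a list of lines (fields)."""
    # Leading and trailing blank lines must not produce empty records.
    stripped = text.strip("\n")
    if not stripped:
        return []
    # A newline plus one or more blank lines separates records.
    records = re.split(r"\n{2,}", stripped)
    # Inside a record, every newline separates fields.
    return [record.split("\n") for record in records]

sample = "\n\nfirst line\nsecond line\n\n\n- item\n\n"
for fields in split_records(sample):
    print(fields)
```

Each record comes back as a list of its lines, which mirrors how the first line of a paragraph is available as $1 in Awk when a newline is the field separator.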

It pays to know the dark corners of your language. Here’s the code:

BEGIN { FS = OFS = "\n"; RS = "" } # paragraph mode
$1 ~ /^[#]{1,6} / {
    if (inlist) { inlist = 0; print "</ul>" }
    len = index($1, " ") - 1
    print "<h" len ">" substr($1,len+2) "</h" len ">"
    next }
$1 ~ /^[-] / {
    if (! inlist) { inlist = 1; print "<ul>" }
    print "<li>" substr($1,3) "</li>"
    next }
{   if (inlist) { inlist = 0; print "</ul>" }
    print "<p>" $0 "</p>" }

Variable inlist keeps track of whether or not the input is currently in a list, and writes list headers and trailers as needed. Here’s the output:

<p>This is a text paragraph.</p>
<h1>Heading level 1</h1>
<p>This is another text paragraph.</p>
<ul>
<li>List item 1</li>
<li>List item 2</li>
<li>List item 3</li>
</ul>
<p>This is still another text paragraph.</p>
<h3>Heading level 3</h3>
<p>This is the last text paragraph.</p>

You can run the program at https://ideone.com/lat9aL.
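The list bookkeeping done by inlist can also be sketched in Python. This is a minimal illustration under my own assumptions, not a translation of the program above: the names `convert` and `in_list` are hypothetical, blocks are assumed to be separated by completely empty lines, and the final check that closes a list still open at the end of the input is an addition of the sketch.

```python
import re

def convert(text):
    """Translate blank-line-separated blocks into a list of HTML lines."""
    out = []
    in_list = False  # plays the role of the Awk variable inlist
    for block in re.split(r"\n{2,}", text.strip("\n")):
        if not block:
            continue  # empty input yields no blocks
        first = block.split("\n")[0]
        if first.startswith("- "):
            if not in_list:
                in_list = True
                out.append("<ul>")  # list header
            out.append("<li>" + first[2:] + "</li>")
            continue
        if in_list:
            in_list = False
            out.append("</ul>")  # list trailer
        # One to six '#' characters followed by a space mark a heading.
        heading = re.match(r"(#{1,6}) (.*)", first)
        if heading:
            level = len(heading.group(1))
            out.append(f"<h{level}>{heading.group(2)}</h{level}>")
        else:
            out.append("<p>" + block + "</p>")
    if in_list:
        out.append("</ul>")  # close a list left open at end of input
    return out

print("\n".join(convert("Intro.\n\n## Title\n\n- one\n\n- two\n")))
```

The flag is only ever flipped at the boundary between list and non-list blocks, so the header and trailer are each written once per list.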

Posted by programmingpraxis

Filed in Exercises

2 Responses to “Mini Markdown”

Daniel said

February 14, 2020 at 12:31 AM

Here’s a solution in Python.

@programmingpraxis, your solution seemingly does not add a closing </ul> for list items occurring as the last elements of the input text.

import os
import sys

assert len(sys.argv) == 2

with open(sys.argv[1]) as f:
    lines = [line for line in f.read().splitlines() if line]
snippets = []
for line in lines:
    if line.startswith('-'):
        if snippets and snippets[-1] == '  </ul>':
            snippets.pop()
        else:
            snippets.append('  <ul>')
        snippets.append('    <li>' + line[1:].strip() + '</li>')
        snippets.append('  </ul>')
    elif line.startswith('#'):
        # WARN: this approach emits <h7>, <h8>, etc.
        line_ = line.lstrip('#')
        level = len(line) - len(line_)
        snippets.append(f'  <h{level}>{line_.strip()}</h{level}>')
    else:
        snippets.append('  <p>' + line + '</p>')

print('<html>' + os.linesep + '<body>')
print(os.linesep.join(snippets))
print('</body>' + os.linesep + '</html>')

Example Usage:

$ python3.7 markdown.py input.txt
<html>
<body>
  <p>This is a text paragraph.</p>
  <h1>Heading level 1</h1>
  <p>This is another text paragraph.</p>
  <ul>
    <li>List item 1</li>
    <li>List item 2</li>
    <li>List item 3</li>
  </ul>
  <p>This is still another text paragraph.</p>
  <h3>Heading level 3</h3>
  <p>This is the last text paragraph.</p>
</body>
</html>

Daniel said

February 14, 2020 at 12:34 AM

Here’s my same comment included above, this time with HTML escaping to try to prevent dropped text.

@programmingpraxis, your solution seemingly does not add a closing </ul> for list items occurring as the last elements of the input text.
