Literate Programming

August 10, 2010

Literate programming is a style of programming invented by Donald Knuth that merges documentation and code in a single document, with code presented in an order that is conducive to the reader. Chunks of code can be written in any order; a program called tangle restructures the chunks into the order required by the compiler. Here is a short but complete example of a literate program, which you may recognize as the second program, after “hello world,” from K&R:

This program prints a fahrenheit/celsius conversion table.

<<*>>=
<< include standard headers >>
<< the main program >>

The only standard header required is stdio.h, which includes the printf function used by the program.

<< include standard headers >>=
#include <stdio.h>

The main program defines some variables, initializes them, then loops through the table printing output lines.

<< the main program >>=
main()
{
    << declare variables >>
    << initialize variables >>
    << loop through the table >>
}

Variables fahr and celsius hold the current temperatures. Variables lower, upper and step control the loop.

<< declare variables >>=
int fahr, celsius;
int lower, upper, step;

The loop control variables are initialized so that the table prints fahrenheit values from 0 to 300 in steps of 20, along with the corresponding celsius values.

<< initialize variables >>=
lower = 0;
upper = 300;
step = 20;

Temperatures are printed by a loop.

<< loop through the table >>=
fahr = lower;
while (fahr <= upper) {
    << calculate celsius and print one line >>
    fahr = fahr + step;
}

The celsius equivalent of a fahrenheit temperature is computed by the traditional formula C = 9/5 * (F-32). The two temperatures are printed separated by a tab character, each pair on a single line.

<< calculate celsius and print one line >>=
celsius = 5 * (fahr-32) / 9;
printf("%d\t%d\n", fahr, celsius);

This is a simple-minded literate programming system, and the form of the input file is correspondingly simple. Code chunks are introduced by a line beginning with double less-than signs and ending with double greater-than signs and an equals sign; there may be no white space at the beginning or end of the line. Code chunks are referenced on any line within another code chunk by surrounding the name of the chunk, which must exactly match the name given on the definition line, with double less-than and greater-than signs; there may be only one reference per line. A code chunk ends at the first blank line following its beginning, or at the end of the file, whichever comes sooner.

The tangle program collects all the code chunks, then performs depth-first search through the call-tree graph beginning with the top-level “*” chunk. Tangle is careful to preserve the formatting of the original, in case the programmer needs to look at its output. Tangle produces the following output from the example program shown above:

#include
main()
{
    int fahr, celsius;
    int lower, upper, step;
    lower = 0;
    upper = 300;
    step = 20;
    fahr = lower;
    while (fahr <= upper) {
        celsius = 5 * (fahr-32) / 9;
        printf("%d\t%d\n", fahr, celsius);
        fahr = fahr + step;
    }
}

This program is simple-minded for exposition, and doesn’t do justice to the literate programming concept. You’ll see a better example in the solution.

Your task is to write a program that tangles an input file and creates a program output. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Advertisement

Pages: 1 2