Contents: Permuted Table Of Contents

July 9, 2010

Here is our program, written in Awk as was the previous program:

export PRAXIS=/home/phil/praxis

awk ' # permuted table of contents (page "permuted")

BEGIN { FS = "\n"; RS = ""
        split("Jan:Feb:Mar:Apr:May:Jun:Jul:Aug:Sep:Oct:Nov:Dec",
              monthname, ":") }

$1 ~ /^number\t[1-9][0-9]*$/ {

    for (i in val) delete val[i]

    for (i=1; i<=NF; i++)
        if (split($i, f, /\t/) == 2)
            val[f[1]] = f[2]

    if (val["ptitle"] == "")
        val["ptitle"] = val["title"]

    if (val["pblurb"] == "")
        val["pblurb"] = val["blurb"]

    for (i in words) delete words[i]
    for (i in pwords) delete pwords[i]

    n = split(val["title"] ": " val["blurb"], words, / /)
    if (split(val["ptitle"] ": " val["pblurb"], pwords, / /) != n)
        print "error: unmatched word count at " val["number"] | "cat 1>&2"

    for (i=1; i<=n; i++) { # main loop to generate rotated permutations

        sortkey = suffix = ""
        for (j=i; j<=n; j++) {
            sortkey = sortkey " " pwords[j]
            suffix = suffix " " words[j] }
        sortkey = substr(sortkey,2); suffix = substr(suffix,2)

        prefix = ""
        for (j=1; j<i; j++)
            prefix = prefix " " words[j]
        prefix = substr(prefix,2)

        printf "%s\t<tr><td>%d</td><td>%02d %s %d</td>" \
                "<td align=\"right\">%s</td><td>%s</td>" \
                "<td><a href=\"/%d/%02d/%02d/%s/\">exercise</a> " \
                "<a href=\"/%d/%02d/%02d/%s/%d/\">solution</a> " \
                "<a href=\"http://programmingpraxis.codepad.org/" \
                "%s\">codepad</a></td></tr>\n",
               sortkey, val["number"], val["pubday"],
               monthname[val["pubmon"]], val["pubyear"],
               prefix ? prefix : "&nbsp;", suffix,
               val["pubyear"], val["pubmon"], val["pubday"],
               val["file"], val["pubyear"], val["pubmon"],
               val["pubday"], val["file"], val["soln"],
               val["codepad"] } }

' $PRAXIS/praxis.info |
awk '$0 !~ /^(a|an|and|as|at|but|by|for|from|his|in|it|my|of|on|or|our|per|that|the|to|which|with|A|An|And|As|At|But|By|For|From|His|In|It|My|Of|On|Or|Our|Per|That|The|To|Which|With) /' | sort -f | cut -f2 >$PRAXIS/pages/permuted

ed -s $PRAXIS/pages/permuted <<'EOF'
0a
<table cellpadding="10">
.
$a
</table>
.
w
q
EOF

The program is in three parts. The main Awk program produces the rotations, writing each table row prefixed by a sort key and a tab. The second part discards rotations that begin with a noise word (the second Awk program), sorts the remaining lines without regard to case, and removes the sort key field. Finally, an ed script adds the opening and closing table tags, because they can’t participate in the sort.
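
To make the rotate, filter, sort and cut steps concrete, here is a minimal sketch of the same pipeline on a single title. It is not the program above: the function names rotations and permute are hypothetical, the title is only an illustration, the noise-word list is trimmed to a handful of entries, and each line is plain text of the form "prefix | suffix" instead of an HTML table row.

```shell
# Hypothetical helper: emit every rotation of a title as
# "sortkey<TAB>prefix | suffix", one rotation per line.
rotations() {
    printf '%s\n' "$1" | awk '{
        for (i = 1; i <= NF; i++) {
            prefix = suffix = ""
            for (j = 1; j < i; j++)   prefix = prefix " " $j   # words before position i
            for (j = i; j <= NF; j++) suffix = suffix " " $j   # words from position i on
            prefix = substr(prefix, 2); suffix = substr(suffix, 2)
            printf "%s\t%s | %s\n", suffix, prefix, suffix     # the suffix doubles as sort key
        }
    }'
}

# Hypothetical driver: drop rotations that start with a noise word
# (abbreviated list), sort ignoring case, then cut away the sort key.
permute() {
    rotations "$1" |
    awk '$0 !~ /^(a|an|of|the|A|An|Of|The) /' |
    sort -f |
    cut -f2
}

permute "The Sieve Of Eratosthenes"
```

Of the four rotations, the two that begin with "The" and "Of" are discarded, and the remaining two are printed in sort-key order: "The Sieve Of | Eratosthenes" followed by "The | Sieve Of Eratosthenes".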

I chose Awk for this program because I’ve been using Awk for text munging programs like this for twenty-five years. But looking at the final program, I’m not sure that was a particularly wise choice. Perhaps we’ll revisit this program in some future exercise.

The code is collected at http://programmingpraxis.codepad.org/K5OCk9zv.
