Contents: Permuted Table Of Contents
July 9, 2010
Here is our program, written in Awk as was the previous program:
export PRAXIS=/home/phil/praxis
awk ' # permuted table of contents (page "permuted")
BEGIN { FS = "\n"; RS = ""
        split("Jan:Feb:Mar:Apr:May:Jun:Jul:Aug:Sep:Oct:Nov:Dec",
              monthname, ":") }
$1 ~ /^number\t[1-9][0-9]*$/ {
    for (i in val) delete val[i]
    for (i=1; i<=NF; i++)
        if (split($i, f, /\t/) == 2)
            val[f[1]] = f[2]
    if (val["ptitle"] == "")
        val["ptitle"] = val["title"]
    if (val["pblurb"] == "")
        val["pblurb"] = val["blurb"]
    for (i in words) delete words[i]
    for (i in pwords) delete pwords[i]
    n = split(val["title"] ": " val["blurb"], words, / /)
    if (split(val["ptitle"] ": " val["pblurb"], pwords, / /) != n)
        print "error: unmatched word count at " val["number"] | "cat 1>&2"
    for (i=1; i<=n; i++) { # main loop to generate rotated permutations
        sortkey = suffix = ""
        for (j=i; j<=n; j++) {
            sortkey = sortkey " " pwords[j]
            suffix = suffix " " words[j] }
        sortkey = substr(sortkey,2); suffix = substr(suffix,2)
        prefix = ""
        for (j=1; j<i; j++)
            prefix = prefix " " words[j]
        prefix = substr(prefix,2)
        printf "%s\t<tr><td>%d</td><td>%02d %s %d</td>" \
               "<td align=\"right\">%s</td><td>%s</td>" \
               "<td><a href=\"/%d/%02d/%02d/%s/\">exercise</a> " \
               "<a href=\"/%d/%02d/%02d/%s/%d/\">solution</a> " \
               "<a href=\"http://programmingpraxis.codepad.org/" \
               "%s\">codepad</a></td></tr>\n",
            sortkey, val["number"], val["pubday"],
            monthname[val["pubmon"]], val["pubyear"],
            prefix ? prefix : " ", suffix,
            val["pubyear"], val["pubmon"], val["pubday"],
            val["file"], val["pubyear"], val["pubmon"],
            val["pubday"], val["file"], val["soln"],
            val["codepad"] } }
' $PRAXIS/praxis.info |
awk '$0 !~ /^(a|an|and|as|at|but|by|for|from|his|in|it|my|of|on|or|our|per|that|the|to|which|with|A|An|And|As|At|But|By|For|From|His|In|It|My|Of|On|Or|Our|Per|That|The|To|Which|With) /' | sort -f | cut -f2 >$PRAXIS/pages/permuted
ed -s $PRAXIS/pages/permuted <<'EOF'
0a
<table cellpadding="10">
.
$a
</table>
.
w
q
EOF
The program is in three parts. The main Awk program produces rotations, storing the string destined for output to the table along with a sort key. Noise words are eliminated by the second Awk program. The remaining output is then sorted and the sort key field is removed. Finally, an ed script adds the beginning and ending table tags, because they can’t participate in the sort.
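To see the rotate, filter, sort, and cut steps in isolation, here is a minimal sketch that works on a single title instead of the praxis.info database. The helper names rotations and permute are hypothetical, the noise-word list is shortened, the sort key is simply the suffix, and a vertical bar stands in for the table cell boundary; none of that comes from the real program.

```shell
# Hypothetical helper: print each rotation of a title as
# "sortkey<TAB>prefix | suffix", one rotation per word.
rotations() {
    printf '%s\n' "$1" |
    awk '{
        for (i = 1; i <= NF; i++) {
            prefix = suffix = ""
            for (j = 1; j < i; j++)   prefix = prefix " " $j
            for (j = i; j <= NF; j++) suffix = suffix " " $j
            prefix = substr(prefix, 2); suffix = substr(suffix, 2)
            # in this sketch the sort key is just the suffix
            printf "%s\t%s | %s\n", suffix, prefix, suffix
        }
    }'
}

# Hypothetical helper: drop rotations that begin with a noise word
# (shortened list, assumed for illustration), sort ignoring case,
# then strip the sort key, leaving only the display text.
permute() {
    rotations "$1" |
    awk '$0 !~ /^(a|an|of|the|A|An|Of|The) /' |
    sort -f | cut -f2
}

permute "The Sieve of Eratosthenes"
```

The four rotations of the sample title shrink to two once the ones starting with "The" and "of" are dropped, and the output is "The Sieve of | Eratosthenes" followed by "The | Sieve of Eratosthenes".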
I chose Awk for this program because I’ve been using Awk for text munging programs like this for twenty-five years. But looking at the final program, I’m not sure that was a particularly wise choice. Perhaps we’ll revisit this program in some future exercise.
The code is collected at http://programmingpraxis.codepad.org/K5OCk9zv.