Rhyming Sort
December 8, 2020
I recently found on the internet a delightful paper by Kenneth Ward Church called Unix™ for Poets that provides an introduction to the Unix command line utilities for text processing. The paper must be ancient, because it gives the author’s email address at research.att.com, but I’ve never seen it before. One of the examples in the paper is rhyming sort, in which words are sorted from the right rather than the left; the left column below is in normal alphabetical order, the right column is in rhyming order:
falsely freely
fly sorely
freely surely
sorely falsely
surely fly
Church says:
“freely” comes before “sorely” because “yleerf” (“freely” spelled backwards) comes before “yleros” (“sorely” spelled backwards) in lexicographic order. Rhyming dictionaries are often used to help poets (and linguists who are interested in morphology).
Church solved the problem with a rev | sort | rev
pipeline.
Your task is to sort a file containing one word per line into rhyming order. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
Here is a “one-liner” using standard R7RS Scheme and a couple of well
known helpers that seems very close in spirit to the Unix pipeline
solution. In comparing it to the solution by @programmingpraxis, one
difference is that this one makes only Theta(n) string-reversals
instead of Theta(n log n). Based on some very quick tests on larger
inputs (e.g., /usr/share/dict/words on Debian GNU/Linux), the
difference can be quite significant (10x).
Output:
We might as well use some actual poetry for our test data, so here is some Chaucer. After sanitizing the text, reverse all words, sort with appropriate collation (to get, for example, ‘vertú’ in the right place – Unicode combining characters might be a problem), and reverse again before final output:
Klong
Adopting Chaucer’s sample text (Thanks, @matthew), I get the following:
A common lisp solution.