Rhyming Dictionary
April 24, 2012
One of the sites that I visit regularly is Reddit, and one of my favorite forums there is learnprogramming, which is often a source of ideas for the exercises that I write here. A recent question on learnprogramming came from a poet who wanted a rhyming dictionary. The poet says the dictionary must be the OED (won’t accept the CMU dictionary), isn’t a programmer but has a little bit of experience with HTML but expects to be able to write a rhyming dictionary program in a couple of days, doesn’t know that Java and JavaScript are two different languages, has already decided to use Python or Ruby, expects to be able to copy all the pronunciations from the OED without license, and seems to not have a clue what he really wants (“Basically I want to have multiple inputs (somewhere around a dozen) in different categories that when pooled together can give you what you’re looking for.” — whatever that means). The poet also has a foul mouth. Still, building a rhyming dictionary is a good exercise, and we can have some fun with it.
First we need a pronunciation dictionary. CMU has one, which we will use even though the poet found it “too choppy.” You’ll want the c06d.gz file, which after a brief commentary provides one word per line in the form PROGRAMMING P R OW1 G R AE2 M IH0 NG. The dictionary uses 39 phonemes, 15 vowels and 24 consonants, with three possible stresses for each vowel, indicated by a trailing number, 1 for primary stress, 2 for secondary stress, and 0 for unstressed; thus, there are 69 unique pronunciation symbols. We’ll say that two words rhyme if their final vowel (including stress) and any trailing consonants are identical.
Your task is to write a function that determines if two words rhyme and a function that takes one word and returns all the words that rhyme with it. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
Reblogged this on InspiredWeightloss!.
Python 2.7 version
Uses the pronunciation file to build two dictionaries:
The first one maps a word to its ending phenomes (from the primary stress to the end of the word);
The second one maps the ending phonemes to a list of words that have that ending.
do_rhyme() and find_rhymes() then use the dictionaries.
Ruby style, matching every item after a vowel and building a map of maps of arrays hash[vowel][consonant] = [words with these stuff] (if I understood the rules well :-))
i like the programming job done at http://www.prime-rhyme.com