The Daily Cryptogram
July 14, 2009
Many newspapers publish a cryptogram each day; for instance:
P OYUUAOEXYW YM AEFGAD, FZGPEAG JAATUL, MYC ENA AGFOPEXYW PWG AWSYLVAWE YM ENA DPHHL ZCYICPVVAC
The deciphered cryptogram is usually a quote from a famous author, often advice for daily life or a pithy saying. Cryptographically, these puzzles are monoalphabetic substitution ciphers, with each of the twenty-six plain-text letters systematically replaced by a cipher-text letter.
Your task is to write a program that decrypts the daily cryptogram, and use that program to read the message given above. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
[…] Praxis – The Daily Cryptogram By Remco Niemeijer Today’s Programming Praxis problem is an interesting one: we have to write a program to solve […]
First a small notification: I’ll be on vacation the next 4 weeks, so I won’t be able to post any solutions in that period. I should be back for puzzle 59.
My Haskell solution (see http://bonsaicode.wordpress.com/2009/07/14/programming-praxis-the-daily-cryptogram/ for a version with comments):
import Data.Char
import Data.List
import qualified Data.Map as M
import GHC.Exts
match :: M.Map Char String -> String -> String -> Bool
match _ [] [] = True
match k (c:cs) (p:ps) = M.findWithDefault [p] c k == [p] &&
match (M.insert c [p] k) cs ps
match _ _ _ = False
substitute :: M.Map Char String -> String -> String
substitute key = concatMap (\c -> M.findWithDefault [c] c key)
getKeys :: M.Map Char String -> [String] -> String -> [M.Map Char String]
getKeys k dict c = [M.fromList .
unionBy (\(a,x) (b,y) -> a == b || x == y) (M.assocs k) .
zip w $ map return p | w <- words c, let ps = filter (match k w) dict, length ps < 100, p <- ps] score :: M.Map Char String -> String -> Int
score k = negate . length . filter isLower . substitute k
findBestKeys :: Int -> [String] -> String -> [M.Map Char String]
findBestKeys n dict c = iterate (take 10 . sortWith (`score` c) .
concatMap (\k -> getKeys k dict c)) [M.empty] !! n
solve :: Int -> [String] -> String -> [String]
solve n dict c = map (`substitute` c) $ findBestKeys n dict $
filter (\x -> isAlpha x || isSpace x) c
crypto = “P OYUUAOEXYW YM AEFGAD, FZGPEAG JAATUL, MYC ENA ” ++
“AGFOPEXYW PWG AWSYLVAWE YM ENA DPHHL ZCYICPVVAC”
main :: IO ()
main = do dict <- fmap lines $ readFile "english-words.10" mapM_ print $ solve 7 dict crypto [/sourcecode]
Several years ago, I got involved in Simon Singh’s Code Challenge, and wrote a number of programs to try to crack classic ciphers. As part of this, I wrote a program which used a genetic algorithm to crack cryptograms. In many ways, it’s kind of crude: it doesn’t use a dictionary of english words, and ignored punctuation and word spacing. What it does use is a table containing the logarithm of the probability of each trigram for a corpus of text I analyzed (don’t remember what I used, maybe Sherlock Holmes or the Bible or something from Project Gutenberg). For longer cryptograms, it almost always works perfectly. It mucked up a couple of letters on this one, which is short and includes a fairly diverse set of trigrams, but the message was very easy to read even with these few mistakes in assigning low probability letters.
Here’s the Python source. For some reason, I can’t find the code which generated the pickled “corpus”, but it shouldn’t be too hard to figure out. Perhaps I’ll rewrite one during lunch. The code ia also strictly speaking not an “algorithm”, since doesn’t actually stop: it just keeps running, trying to maximize the score, and does nothing to detect stasis. But you might find it amusing.
[ EDIT: I put the word ‘python’ in single quotes in the sourcecode tag; without the single quotes, the tag is not recognized and the formatting is incorrect. I also fixed some ampersand-coded html characters. I hope this is correct now. ProgPrax ]
#!/u0/markv/my-python/bin/python
# _
# __ _ _ _ _ _ __| |_ ___ __ _ _ _ __ _ _ __
# / _| ‘_| || | ‘_ \ _/ _ \/ _` | ‘_/ _` | ‘ \
# \__|_| \_, | .__/\__\___/\__, |_| \__,_|_|_|_|
# |__/|_| |___/
#
# A simple program for automatically solving simple substitution ciphers
#
# Written by Mark VandeWettering
import pickle
import random
import string
import re
import sys
ldict = pickle.load(open(‘newcorpus.p’))
letters = “ABCDEFGHIJKLMNOPQRSTUVWXYZ”
sorig = open(sys.argv[1]).read().upper()
swork = re.sub(r”[^A-Z\!?.,’]+”, ” “, sorig)
swork = filter(lambda x : x in r”ABCDEFGHIJKLMNOPQRSTUVWXYZ!?.,’ “, swork)
print “SWORK: “, swork
POOLSIZE = 1000
pool = []
for p in range(POOLSIZE):
a = list(letters)
random.shuffle(a)
pool.append(”.join(a))
def calculatescore(x):
s = 0
for i in range(len(x)-3):
s = s + ldict.get(x[i:i+3], 0)
return s
class BogusError(Exception):
pass
def mutate(str):
if random.uniform(0, 1) < 0.01: a = list(letters) random.shuffle(a) return ''.join(a) s = list(str) c0 = random.choice(range(26)) c1 = random.choice(range(26)) s[c0], s[c1] = s[c1], s[c0] return ''.join(s) def crossover(pop, mom): pop = list(pop) mom = list(mom) c0 = random.randint(0, 13) c1 = random.randint(1, 13) c1 = c0 + c1 child = pop[:] for idx in range(c0, c1): idx2 = pop.index(mom[idx]) child[idx], child[idx2] = child[idx2], child[idx] child = ''.join(child) for c in string.uppercase: if c not in child: raise BogusError return child def sumpool(pool): total = 0 sums = [] for score, key in pool: total = total + score sums.append(total) return total, sums import bisect def pick(total, l): idx = bisect.bisect_right(l, random.randint(0, total)) return idx while True: newpool = [] for p in pool: twork = string.translate(swork, string.maketrans(letters, p)) score = calculatescore(twork) newpool.append((score, p)) newpool.sort(lambda x, y: cmp(y,x)) newpool = newpool[:POOLSIZE/2] # now, generate the distribution total, sums = sumpool(newpool) newpool = map(lambda x: x[1], newpool) torig = string.translate(sorig, string.maketrans(letters, newpool[0])) print "Best Decode score = %d" % sums[0] print torig for x in range(POOLSIZE/2): idx = pick(total, sums) if (random.uniform(0, 1) < 0.15): newpool.append(mutate(newpool[idx])) else: idx2 = pick(total, sums) newpool.append(crossover(newpool[idx], newpool[idx2])) pool = newpool [/sourcecode]
FYI, words 4 and 5 (AEFGAD and FZGPEAG) aren’t in /usr/share/dict/words on Mac. Only word 4 is missing on Ubuntu.