Beale’s Cipher
December 2, 2016
In the early 1800s, Thomas J. Beale mined a large quantity of gold and silver, secretly, some place in the American West, and brought the gold, silver, and some jewels purchased with the treasure to Virginia, where he buried it. He wrote three coded documents that described the location of the buried treasure, the nature of the treasure, and the names of the owners. He never came back to retrieve the treasure. Only the second of those documents has been decoded, and many people, even today, are scouring Bedford County, Virginia, looking for buried treasure. Or so the story goes.
Beale used a variant of a book cipher. He chose a long text as a key, numbered each of the words in the text sequentially, starting from 1, and formed a message by choosing, for each character of the plain text, a word from the key text whose initial letter is that character; the cipher text is the list of the sequence numbers of the chosen words. For instance, if the key text is “now is the time” and the plain text is “tin”, then both (3 2 1) and (4 2 1) are valid encipherments. If the key text is long, there will be many words to choose from for each letter, as we saw with the letter “t”, and the resulting cipher will be reasonably secure. Beale used the 1322 words of the Declaration of Independence as the key text for the second document; the key texts associated with the first and third documents are unknown.
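The numbering scheme described above can be sketched in a few lines of Python. This is only a minimal illustration, not the suggested solution: the names `build_tables`, `encipher` and `decipher` are made up for this sketch, and it assumes that words are separated by whitespace and that case is ignored.

```python
import random
from collections import defaultdict


def build_tables(key_text):
    """Number the key words from 1 and index them by initial letter."""
    by_letter = defaultdict(list)  # letter -> numbers of words starting with it
    by_number = {}                 # word number -> initial letter
    for number, word in enumerate(key_text.lower().split(), start=1):
        by_letter[word[0]].append(number)
        by_number[number] = word[0]
    return by_letter, by_number


def encipher(plain, by_letter, rng=random):
    """Replace each character by the number of a random key word with that initial."""
    numbers = []
    for ch in plain.lower():
        candidates = by_letter.get(ch)
        if not candidates:
            # Assumption of this sketch: refuse characters the key cannot cover.
            raise ValueError(f"no key word starts with {ch!r}")
        numbers.append(rng.choice(candidates))
    return numbers


def decipher(numbers, by_number):
    """Map each word number back to the initial letter of that key word."""
    return "".join(by_number[n] for n in numbers)


by_letter, by_number = build_tables("now is the time")
cipher = encipher("tin", by_letter)
print(cipher)                       # [3, 2, 1] or [4, 2, 1]
print(decipher(cipher, by_number))  # tin
```

Raising an error for a character that no key word starts with is a design choice of this sketch; a program could equally emit a placeholder number for such characters.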
Your task is to write programs that encipher and decipher messages using the Beale cipher; use them to decode the second document. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
This was my 2nd ever programming course assignment, and the one I enjoyed programming the most. Thanks for posting this.
In Python. The Beale cipher is apparently so complicated to use that Beale (probably) messed up the encoding. It is, of course, a daunting task to set up the encoding with a document of 1322 words without a computer. It is also not really known which version of the Declaration of Independence he used.
```python
from collections import defaultdict
from random import choice

# Hypothetical file names; the original comment does not show them.
DECLARATION = "declaration.txt"  # key text: the Declaration of Independence
LETTER = "letter2.txt"           # cipher text: numbers separated by ", "


def read_decl(list_of_words):
    """create encode and decode dictionaries
    the input is a sequence of lowercase words
    """
    E, D = defaultdict(list), {}
    for n, word in enumerate(list_of_words, 1):
        E[word[0]].append(n)   # letter -> all word numbers with that initial
        D[n] = word[0]         # word number -> initial letter
    return E, D


def decode(code, D):
    'input is list of ints - output is string'
    # Numbers outside the key text decode to "?"
    return "".join(D.get(n, "?") for n in code)


def encode(txt, E):
    'input is string - output is list of ints'
    txt = txt.lower()
    # Characters that no key word starts with are encoded as 0
    return [choice(E.get(c, [0])) for c in txt]


E, D = read_decl(open(DECLARATION).read().lower().split())
print(decode((int(n) for n in open(LETTER).read().split(", ")), D))
```

Sounds like a good excuse to play around with Unicode and ES6 a little more. ES6 has a number of features that make proper Unicode handling rather easier than in previous versions of JavaScript. Notably, strings now support the new iterator protocol, which lets us easily convert strings to arrays of single Unicode characters rather than having to deal with surrogate pairs. For testing this, we will use texts in the Gothic script, which uses the astral Unicode code points U+10330 to U+1034A and seems to be reasonably well supported by fonts and browsers. Extant Gothic texts are mainly religious, but there is one poem in Gothic, “Bagme Bloma”, written by J. R. R. Tolkien. Here we encrypt that poem, using as key text the Gothic version of the Lord’s Prayer. I couldn’t find the texts already in the Gothic script, so we start by converting from Latin transliterations.

```javascript
"use strict"

// The Lord's Prayer in Gothic, transliterated.
const text1 = [
  "atta unsar þu in himinam",
  "weihnai namo þein",
  "qimai þiudinassus þeins",
  "wairþai wilja þeins",
  "swe in himina jah ana airþai",
  "hlaif unsarana þana sinteinan gif uns himma daga",
  "jah aflet uns þatei skulans sijaima",
  "swaswe jah weis afletam þaim skulam unsaraim",
  "jah ni briggais uns in fraistubnjai",
  "ak lausei uns af þamma ubilin",
  "unte þeina ist þiudangardi jah mahts",
  "jah wulþus in aiwins"
];

// Tolkien's Bagme Bloma poem
const text2 = [
  "brunaim bairiþ bairka bogum",
  "laubans liubans liudandei",
  "gilwagroni glitmunjandei",
  "bagme bloma blauandei",
  "fagrafahsa liþulinþi",
  "fraujinondei fairguni",
  "wopjand windos wagjand lindos",
  "lutiþ limam laikandei",
  "slaihta raihta hweitarinda",
  "razda rodeiþ reirandei",
  "bandwa bairhta runa goda",
  "þiuda meina þiuþjandei",
  "andanahti milhmam neipiþ",
  "liuhteiþ liuhmam lauhmuni",
  "laubos liubai fliugand lausai",
  "tulgus triggwa standandei",
  "bairka baza beidiþ blaika",
  "fraujinondei fairguni"
];

// Gothic alphabet and the standard Latin transliteration
const gothic = "𐌰𐌱𐌲𐌳𐌴𐌵𐌶𐌷𐌸𐌹𐌺𐌻𐌼𐌽𐌾𐌿𐍀𐍁𐍂𐍃𐍄𐍅𐍆𐍇𐍈𐍉𐍊";
const latin = "abgdeqzhþiklmnjup*rstwfxƕo^";
const gchars = [...gothic] // iterator respects codepoints
const lchars = [...latin]
const charmap = new Map(lchars.map((c,i)=>[c,gchars[i]])) // zip
const convert = s => [...s].map(c=>charmap.get(c)||c).join("")
const keytext = text1.map(convert)

// Now construct the encoding tables
const encoder = new Map()
const decoder = new Map([[0,"."]]) // Unknown character
const allwords = [].concat(...keytext.map(s=>s.split(" ")))
allwords.forEach((w,i) => {
  const index = i+1
  const c = String.fromCodePoint(w.codePointAt(0))
  if (!encoder.has(c)) encoder.set(c,[])
  encoder.get(c).push(index)
  decoder.set(index,c)
})

// Pick a random key word number for a character, or 0 if none exists
const encodeone = c => {
  const a = encoder.get(c)
  if (a == undefined) return 0
  else return a[Math.floor(Math.random() * a.length)]
}
const encode = s => [...s].map(encodeone,s)
const decode = s => s.map(n => decoder.get(n)).join("")

const plaintext = text2.map(convert)
const ciphertext = plaintext.map(encode)
const plaintext2 = ciphertext.map(decode)
keytext.forEach(s=>console.log(s))
console.log()
plaintext.forEach(s=>console.log(s))
console.log()
plaintext2.forEach(s=>console.log(s))
```

Here’s the output with the key text, the plain text and the decrypted cipher text. We can see one of the disadvantages of Beale’s scheme: some of the letters do not occur as the first letter of any word in the key text, so they cannot be enciphered:
In Ruby
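```ruby
def encipher(key, text)
  words = key.scan(/\w+/)
  text.scan(/\w+/).flat_map do |word|
    word.chars.map do |char|
      # Pick a random key word that starts with this character and
      # return its 1-based position. Matching is case-sensitive, and
      # this raises NoMethodError if no key word starts with the character.
      words
        .each_with_index
        .select { |w, _| w.chars.first == char }
        .sample
        .last + 1
    end
  end
end

def decipher(key, cipher)
  # Map each 1-based word position to the first letter of that word
  char_map = Hash[
    key
      .scan(/\w+/)
      .each_with_index
      .map { |w, i| [i + 1, w[0]] }
  ]
  cipher.map { |char| char_map[char] }.join
end

encipher("now is the time", "tin")
# => [4, 2, 1] (or [3, 2, 1]; the choice is random)
decipher("now is the time", [4, 2, 1])
# => "tin"

key = "lorem ipsum dolor sit amet, consectetur adipisicing elit"
decipher(key, encipher(key, "lisa is ace"))
# => "lisaisace"
```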