Union Route Cipher
March 6, 2012
One of the most successful military ciphers in history was fielded by the Union armies during the American Civil War. It is known that the Confederacy never cracked the cipher; in fact, they even took to publishing intercepted messages in their newspapers, in the hope that someone could read them. The cipher was devised by Anson Stager, the general superintendent of the Western Union Telegraph Company, at the request of the governor of Ohio; later, General George McClellan, who was in charge of the armies of Ohio, introduced the cipher throughout the Union armies. You can read more about the cipher here and here.
The cipher works in two phases. First, some of the words with military significance are replaced by code words; for instance, attack could be replaced by tulip, and the phrase at dawn could be replaced by stripe, so the code tulip stripe means attack at dawn. The lexicon included names of people (Lincoln, various generals), places (Richmond, hilltop), times (Tuesday, 4:30pm, dawn), and actions (attack, reconnoiter); a single cleartext word could admit multiple code words, multiple cleartext words could be encoded as a single word, and even digits and punctuation had code words. The final version of the lexicon included 1608 codewords.
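The substitution phase can be sketched in Python as follows. This is only an illustration under stated assumptions: the names `LEXICON`, `encode_words`, and `decode_words` are hypothetical, only the tulip and stripe entries come from the text, and the longest-phrase-first matching rule is one possible convention rather than anything the original codebooks prescribe.

```python
import random

# Hypothetical lexicon: each cleartext phrase (a tuple of words) maps to a
# list of interchangeable code words. Only these two entries come from the
# text; a real codebook held many more.
LEXICON = {
    ("attack",): ["tulip"],
    ("at", "dawn"): ["stripe"],
}

def encode_words(words, lexicon, choose=random.choice):
    """Replace lexicon phrases with code words, trying longer phrases first."""
    longest = max(len(phrase) for phrase in lexicon)
    out = []
    i = 0
    while i < len(words):
        # Try the longest phrase that fits at position i, then shorter ones.
        for n in range(min(longest, len(words) - i), 0, -1):
            phrase = tuple(w.lower() for w in words[i:i + n])
            if phrase in lexicon:
                out.append(choose(lexicon[phrase]))
                i += n
                break
        else:
            # No phrase starts here, so the word passes through unchanged.
            out.append(words[i])
            i += 1
    return out

def decode_words(words, lexicon):
    """Map each code word back to the cleartext phrase it stands for."""
    reverse = {code: phrase
               for phrase, codes in lexicon.items()
               for code in codes}
    out = []
    for w in words:
        out.extend(reverse.get(w.lower(), (w,)))
    return out

coded = encode_words("attack at dawn".split(), LEXICON)
plain = decode_words(coded, LEXICON)
```

When a cleartext phrase has several code words, `choose` picks among them; passing a different function for `choose` stands in for the clerk's discretion.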
The second phase was a route transposition, and there were many variants. For instance, a route designated willow might call for six columns with words chosen in the order down column three, up column four, down column two, down column six, up column one, and down column five; nulls were added to pad out the last row, and an additional null was added in a seventh column.
Here’s an example: On June 1, 1863, President Lincoln sent the following telegram:
FOR COLONEL LUDLOW:
RICHARDSON AND BROWN, CORRESPONDENTS OF THE TRIBUNE,
CAPTURED AT VICKSBURG, ARE DETAINED AT RICHMOND.
PLEASE ASCERTAIN WHY THEY ARE DETAINED
AND GET THEM OFF IF YOU CAN.
LINCOLN.
The lexicon included certain codewords: VENUS for COLONEL, WAYLAND for CAPTURED, ODOR for VICKSBURG, NEPTUNE for RICHMOND, and ADAM for LINCOLN. An additional codeword NELLY gives the time of dispatch as 4:30PM. Thus the message becomes:
FOR VENUS LUDLOW:
RICHARDSON AND BROWN, CORRESPONDENTS OF THE TRIBUNE,
WAYLAND AT ODOR, ARE DETAINED AT NEPTUNE.
PLEASE ASCERTAIN WHY THEY ARE DETAINED
AND GET THEM OFF IF YOU CAN.
ADAM NELLY
Now the cipher clerk picks a route GUARD that calls for five columns in the order up column one, down column two, up column five, down column four, and up column three. He writes the message in five columns, the last row padded with nulls:
FOR VENUS LUDLOW RICHARDSON AND
BROWN CORRESPONDENTS OF THE TRIBUNE
WAYLAND AT ODOR ARE DETAINED
AT NEPTUNE PLEASE ASCERTAIN WHY
THEY ARE DETAINED AND GET
THEM OFF IF YOU CAN
ADAM NELLY THIS FILLS UP
Now the message is pulled off in column route order and nulls are added after each column:
up column one: ADAM THEM THEY AT WAYLAND BROWN FOR KISSING
down column two: VENUS CORRESPONDENTS AT NEPTUNE ARE OFF NELLY TURNING
up column five: UP CAN GET WHY DETAINED TRIBUNE AND TIMES
down column four: RICHARDSON THE ARE ASCERTAIN AND YOU FILLS BELLY
up column three: THIS IF DETAINED PLEASE ODOR OF LUDLOW COMMISSIONER
Then the final message is read off in order following the route indicator:
GUARD ADAM THEM THEY AT WAYLAND BROWN FOR KISSING VENUS CORRESPONDENTS AT NEPTUNE ARE OFF NELLY TURNING UP CAN GET WHY DETAINED TRIBUNE AND TIMES RICHARDSON THE ARE ASCERTAIN AND YOU FILLS BELLY THIS IF DETAINED PLEASE ODOR OF LUDLOW COMMISSIONER
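The transposition in this example can be reproduced with a short Python sketch. The representation of a route as (column, direction) pairs and the names `GUARD`, `route_encrypt`, `fillers`, and `column_nulls` are assumptions made for illustration; the words themselves are taken from the example above.

```python
# Route GUARD as described above, written as (column number, direction) pairs.
GUARD = [(1, "up"), (2, "down"), (5, "up"), (4, "down"), (3, "up")]

def route_encrypt(words, route, indicator, fillers, column_nulls):
    """Write words row by row into a grid, then read columns in route order.

    fillers pad out the last row; one entry of column_nulls is appended
    after each column as it is pulled off.
    """
    ncols = len(route)
    if len(column_nulls) != ncols:
        raise ValueError("need exactly one null per column")
    words = list(words)
    shortfall = -len(words) % ncols          # empty cells in the last row
    if shortfall > len(fillers):
        raise ValueError("not enough filler words to pad the last row")
    words += fillers[:shortfall]
    rows = [words[i:i + ncols] for i in range(0, len(words), ncols)]

    out = [indicator]
    for (col, direction), null in zip(route, column_nulls):
        column = [row[col - 1] for row in rows]   # top to bottom
        if direction == "up":
            column.reverse()
        out.extend(column)
        out.append(null)
    return out

message = (
    "FOR VENUS LUDLOW RICHARDSON AND BROWN CORRESPONDENTS OF THE TRIBUNE "
    "WAYLAND AT ODOR ARE DETAINED AT NEPTUNE PLEASE ASCERTAIN WHY "
    "THEY ARE DETAINED AND GET THEM OFF IF YOU CAN ADAM NELLY"
).split()

cipher = route_encrypt(
    message, GUARD, "GUARD",
    fillers=["THIS", "FILLS", "UP"],
    column_nulls=["KISSING", "TURNING", "TIMES", "BELLY", "COMMISSIONER"],
)
print(" ".join(cipher))
```

Keeping the fillers and column nulls as explicit arguments leaves their choice to the caller, much as it was left to the clerk.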
Deciphering simply reverses the process. The recipient looks up the route named by the first word, counts the remaining words, draws a grid of the proper size, fills in words in route order while discarding the null that ends each column, converts codewords to their plaintext equivalents, and reads the message.
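A minimal Python sketch of the deciphering steps, applied to the example message, might look like this. The `ROUTES` table and the function name `route_decrypt` are hypothetical, and the sketch assumes the convention of the example: the first word names the route and every column is followed by exactly one null.

```python
# Hypothetical route table; GUARD follows the description in the example.
ROUTES = {
    "GUARD": [(1, "up"), (2, "down"), (5, "up"), (4, "down"), (3, "up")],
}

def route_decrypt(cipher_words, routes):
    """Rebuild the grid from a route-enciphered message and read it by rows."""
    indicator, body = cipher_words[0], cipher_words[1:]
    route = routes[indicator]
    ncols = len(route)
    if len(body) % ncols != 0:
        raise ValueError("word count does not fit the route")
    height = len(body) // ncols - 1       # each column ends with one null
    grid = [[None] * ncols for _ in range(height)]
    for k, (col, direction) in enumerate(route):
        chunk = body[k * (height + 1):(k + 1) * (height + 1)]
        column = chunk[:-1]               # discard the trailing null
        if direction == "up":
            column = column[::-1]         # restore top-to-bottom order
        for r, word in enumerate(column):
            grid[r][col - 1] = word
    return [word for row in grid for word in row]

cipher = (
    "GUARD ADAM THEM THEY AT WAYLAND BROWN FOR KISSING "
    "VENUS CORRESPONDENTS AT NEPTUNE ARE OFF NELLY TURNING "
    "UP CAN GET WHY DETAINED TRIBUNE AND TIMES "
    "RICHARDSON THE ARE ASCERTAIN AND YOU FILLS BELLY "
    "THIS IF DETAINED PLEASE ODOR OF LUDLOW COMMISSIONER"
).split()

recovered = route_decrypt(cipher, ROUTES)
print(" ".join(recovered))
```

The recovered words still contain the codewords and the padding at the end of the last row; replacing codewords and recognizing the padding is left to the reader of the grid.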
The Union route cipher had the strong advantage that it worked with words rather than individual letters, making errors in transcription and telegraphy much less common. The disadvantage, of course, was that codebooks had to be painstakingly controlled; loss of a single codebook would give the enemy access to all your communications. But that didn’t happen, and the cipher remained secure throughout the war.
Your task is to write a program that encrypts and decrypts messages according to the Union route cipher; make your own conventions for the lexicon and routes. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
The first part of this is under-specified. Problems:
Numerals and punctuation: does the cipher try to distinguish between, say, “12 3-oz boxes” and “123-oz boxes” or between “He’s the best I’ve ever seen” (apostrophes) and “He ‘s the best I’ ve ever seen” (single quotation marks)?
Ambiguity: How should the program deal with an ambiguous lexicon? If “soldiers attack” encodes to “blue” and “attack at dawn” to “green”, is “soldiers attack at dawn” to be “blue at dawn” or “soldiers green”? If “Lincoln” is encoded as “kill yellow” and “orders cease-fire” is encoded as “orange”, then “Lincoln orders cease-fire” will be encoded as “kill yellow orange”. What if “yellow orange” is also the code for “prisoners”?
treeowl: You’re stretching.
It is clear that the cipher uses words as its tokens, so there is no ambiguity in parsing “12 3-oz boxes” or “He’s”. One plaintext word never expands to multiple ciphertext words, so your second concern is moot. And it’s a military cipher, so a plaintext like “He’s the best I’ve ever seen” is unlikely.
In that case, I’m struggling to understand “a single cleartext word could admit multiple code words” and “even digits and punctuation had code words.” I figured the former meant that a single cleartext word could be encoded as multiple code words, but I suppose it could mean it could produce one of several code words arbitrarily for the same plaintext. You may be right about the punctuation, but I’m certainly unclear on how else to interpret “digits”.
treeowl:
Your second interpretation of multiple code words is correct: Adam, fountain, umbrella, and marble could all be code words for Lincoln, with one of them chosen at the discretion of the cipher clerk. In the original code books, codewords were provided for things like commas and periods, though they were often ignored, as in our example. And codewords were provided for numbers — twelve might be Sally — because numbers were often meaningful. Codewords were also provided for dates and times, such as Nelly in our example. Follow the references in the exercise for more information.