Union Route Cipher

March 6, 2012

One of the most successful military ciphers in history was fielded by the Union armies during the American Civil War. The Confederacy never cracked the cipher; in fact, they took to publishing intercepted messages in their newspapers, in the hope that some reader could decipher them. The cipher was devised by Anson Stager, the general superintendent of the Western Union Telegraph Company, at the request of the governor of Ohio; later, General George McClellan, who was in charge of the armies of Ohio, introduced the cipher throughout the Union armies. You can read more about the cipher here and here.

The cipher works in two phases. First, some of the words with military significance are replaced by code words; for instance, attack could be replaced by tulip, and the phrase at dawn could be replaced by stripe, so the code tulip stripe means attack at dawn. The lexicon included names of people (Lincoln, various generals), places (Richmond, hilltop), times (Tuesday, 4:30pm, dawn), and actions (attack, reconnoiter); a single cleartext word could admit multiple code words, multiple cleartext words could be encoded as a single word, and even digits and punctuation had code words. The final version of the lexicon included 1608 codewords.
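The first phase can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the lexicon below holds just the two entries mentioned above, the names `LEXICON` and `encode_words` are made up for this sketch, and the longest-phrase-first matching rule is my own convention, since the description above does not say how overlapping phrases were resolved.

```python
# Hypothetical lexicon: keys are plaintext phrases (tuples of lowercase
# words), values are lists of interchangeable code words.
LEXICON = {
    ("attack",): ["tulip"],
    ("at", "dawn"): ["stripe"],
}

def encode_words(words, lexicon=LEXICON, choose=lambda options: options[0]):
    """Replace lexicon phrases with code words, trying longer phrases first.

    `choose` picks one code word when several are available; by default
    it takes the first (an assumption, a clerk could pick any of them).
    """
    longest = max(len(phrase) for phrase in lexicon)
    out, i = [], 0
    while i < len(words):
        for n in range(min(longest, len(words) - i), 0, -1):
            phrase = tuple(w.lower() for w in words[i:i + n])
            if phrase in lexicon:
                out.append(choose(lexicon[phrase]))
                i += n
                break
        else:
            out.append(words[i])  # not in the lexicon: leave it in the clear
            i += 1
    return out

print(encode_words("attack at dawn".split()))  # ['tulip', 'stripe']
```

Passing a different `choose` function (for example `random.choice`) would let the program vary the code word used for the same plaintext word.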

The second phase was a route transposition, and there were many variants. For instance, a route designated willow might call for six columns with words chosen in the order down column three, up column four, down column two, down column six, up column one, and down column five; nulls were added to pad out the last row, and an additional null was added in a seventh column.

Here’s an example: On June 1, 1863, President Lincoln sent the following telegram:

FOR COLONEL LUDLOW:
RICHARDSON AND BROWN, CORRESPONDENTS OF THE TRIBUNE,
CAPTURED AT VICKSBURG, ARE DETAINED AT RICHMOND.
PLEASE ASCERTAIN WHY THEY ARE DETAINED
AND GET THEM OFF IF YOU CAN.
LINCOLN.

The lexicon included certain codewords: VENUS for COLONEL, WAYLAND for CAPTURED, ODOR for VICKSBURG, NEPTUNE for RICHMOND, and ADAM for LINCOLN. An additional codeword NELLY gives the time of dispatch as 4:30PM. Thus the message becomes:

FOR VENUS LUDLOW:
RICHARDSON AND BROWN, CORRESPONDENTS OF THE TRIBUNE,
WAYLAND AT ODOR, ARE DETAINED AT NEPTUNE.
PLEASE ASCERTAIN WHY THEY ARE DETAINED
AND GET THEM OFF IF YOU CAN.
ADAM NELLY

Now the cipher clerk picks a route GUARD that calls for five columns in the order up column one, down column two, up column five, down column four, and up column three. He writes the message in five columns, the last row padded with nulls:

FOR     VENUS          LUDLOW   RICHARDSON AND
BROWN   CORRESPONDENTS OF       THE        TRIBUNE
WAYLAND AT             ODOR     ARE        DETAINED
AT      NEPTUNE        PLEASE   ASCERTAIN  WHY
THEY    ARE            DETAINED AND        GET
THEM    OFF            IF       YOU        CAN
ADAM    NELLY          THIS     FILLS      UP

Now the message is pulled off in column route order and a null is added after each column:

up column one    ADAM THEM THEY AT WAYLAND BROWN FOR           KISSING
down column two  VENUS CORRESPONDENTS AT NEPTUNE ARE OFF NELLY TURNING
up column five   UP CAN GET WHY DETAINED TRIBUNE AND           TIMES
down column four RICHARDSON THE ARE ASCERTAIN AND YOU FILLS    BELLY
up column three  THIS IF DETAINED PLEASE ODOR OF LUDLOW        COMMISSIONER

Then the final message is read off in order following the route indicator:

GUARD ADAM THEM THEY AT WAYLAND BROWN FOR KISSING VENUS CORRESPONDENTS AT NEPTUNE ARE OFF NELLY TURNING UP CAN GET WHY DETAINED TRIBUNE AND TIMES RICHARDSON THE ARE ASCERTAIN AND YOU FILLS BELLY THIS IF DETAINED PLEASE ODOR OF LUDLOW COMMISSIONER
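The transposition phase of this example can be reproduced with a short Python sketch. The representation of a route as a list of (column, direction) pairs and the names `GUARD` and `transpose` are conventions invented for this sketch, not part of the historical cipher; the padding words and the per-column nulls are the ones used in the example.

```python
# Route GUARD from the example: (column number, direction), columns 1-based.
GUARD = [(1, "up"), (2, "down"), (5, "up"), (4, "down"), (3, "up")]

def transpose(words, indicator, route, pad, column_nulls):
    """Write words row by row, then read the columns in route order.

    `pad` supplies the nulls that fill the last row; `column_nulls`
    supplies the one null appended after each column.
    """
    ncols = len(route)
    words = list(words)
    short = -len(words) % ncols            # empty cells in the last row
    if short > len(pad):
        raise ValueError("not enough padding nulls")
    words += pad[:short]
    rows = [words[i:i + ncols] for i in range(0, len(words), ncols)]
    out = [indicator]
    for (col, direction), null in zip(route, column_nulls):
        column = [row[col - 1] for row in rows]
        if direction == "up":
            column.reverse()
        out.extend(column)
        out.append(null)
    return out

message = ("FOR VENUS LUDLOW RICHARDSON AND BROWN CORRESPONDENTS OF THE "
           "TRIBUNE WAYLAND AT ODOR ARE DETAINED AT NEPTUNE PLEASE ASCERTAIN "
           "WHY THEY ARE DETAINED AND GET THEM OFF IF YOU CAN ADAM NELLY")
cipher = transpose(message.split(), "GUARD", GUARD,
                   pad=["THIS", "FILLS", "UP"],
                   column_nulls=["KISSING", "TURNING", "TIMES",
                                 "BELLY", "COMMISSIONER"])
print(" ".join(cipher))
```

Run as written, this prints the same final message shown above.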

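Deciphering simply reverses the process. The recipient uses the route indicator to identify the route, counts the words in the message, draws a grid of the proper size, fills in words in route order (discarding the null that follows each column), converts codewords to their plaintext equivalents, and reads the message, ignoring the nulls that pad the last row.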
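Here is a minimal Python sketch of that reversal for the example message. It assumes, as in the example, exactly one null after each column, so the grid height is the column chunk length minus one; the names `ROUTES`, `DECODE` and `untranspose` are invented for this sketch. The program cannot tell the padding nulls in the last row from real words, so they stay in the output for the reader to ignore.

```python
# Hypothetical route table and reverse lexicon, taken from the example.
ROUTES = {"GUARD": [(1, "up"), (2, "down"), (5, "up"), (4, "down"), (3, "up")]}
DECODE = {"VENUS": "COLONEL", "WAYLAND": "CAPTURED", "ODOR": "VICKSBURG",
          "NEPTUNE": "RICHMOND", "ADAM": "LINCOLN", "NELLY": "4:30PM"}

def untranspose(cipher_words):
    """Rebuild the grid from a route-transposed message and read it by rows."""
    route = ROUTES[cipher_words[0]]        # first word is the route indicator
    body = cipher_words[1:]
    ncols = len(route)
    chunk, rem = divmod(len(body), ncols)  # chunk = column height + one null
    if rem:
        raise ValueError("word count does not fit the route")
    nrows = chunk - 1
    grid = [[None] * ncols for _ in range(nrows)]
    for k, (col, direction) in enumerate(route):
        column = body[k * chunk:k * chunk + nrows]  # skip the trailing null
        if direction == "up":
            column = column[::-1]
        for r, word in enumerate(column):
            grid[r][col - 1] = word
    return [word for row in grid for word in row]

cipher = ("GUARD ADAM THEM THEY AT WAYLAND BROWN FOR KISSING VENUS "
          "CORRESPONDENTS AT NEPTUNE ARE OFF NELLY TURNING UP CAN GET WHY "
          "DETAINED TRIBUNE AND TIMES RICHARDSON THE ARE ASCERTAIN AND YOU "
          "FILLS BELLY THIS IF DETAINED PLEASE ODOR OF LUDLOW COMMISSIONER")
plain = [DECODE.get(w, w) for w in untranspose(cipher.split())]
print(" ".join(plain))
```

The output begins FOR COLONEL LUDLOW and ends LINCOLN 4:30PM THIS FILLS UP, where the last three words are the row padding.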

The Union route cipher had the strong advantage that it worked with words rather than individual letters, making errors in transcription and telegraphy much less common. The disadvantage, of course, was that codebooks had to be painstakingly controlled; loss of a single codebook would give the enemy access to all your communications. But that didn’t happen, and the cipher remained secure throughout the war.

Your task is to write a program that encrypts and decrypts messages according to the Union route cipher; make your own conventions for the lexicon and routes. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

5 Responses to “Union Route Cipher”

  1. sweetopiagirl said

    Reblogged this on Inspiredweightloss.

  2. treeowl said

    The first part of this is under-specified. Problems:
    Numerals and punctuation: does the cipher try to distinguish between, say, “12 3-oz boxes” and “123-oz boxes” or between “He’s the best I’ve ever seen” (apostrophes) and “He ‘s the best I’ ve ever seen” (single quotation marks)?
    Ambiguity: How should the program deal with an ambiguous lexicon? If “soldiers attack” encodes to “blue” and “attack at dawn” to “green”, is “soldiers attack at dawn” to be “blue at dawn” or “soldiers green”? If “Lincoln” is encoded as “kill yellow” and “orders cease-fire” is encoded as “orange”, then “Lincoln orders cease-fire” will be encoded as “kill yellow orange”. What if “yellow orange” is also the code for “prisoners”?

  3. programmingpraxis said

    treeowl: You’re stretching.

    It is clear that the cipher uses words as its tokens, so there is no ambiguity in parsing “12 3-oz boxes” or “He’s”. One plaintext word never expands to multiple ciphertext words, so your second concern is moot. And it’s a military cipher, so a plaintext like “He’s the best I’ve ever seen” is unlikely.

  4. treeowl said

    In that case, I’m struggling to understand “a single cleartext word could admit multiple code words” and “even digits and punctuation had code words.” I figured the former meant that a single cleartext word could be encoded as multiple code words, but I suppose it could mean it could produce one of several code words arbitrarily for the same plaintext. You may be right about the punctuation, but I’m certainly unclear on how else to interpret “digits”.

  5. programmingpraxis said

    treeowl:

    Your second interpretation of multiple code words is correct: Adam, fountain, umbrella, and marble could all be code words for Lincoln, with one of them chosen at the discretion of the cipher clerk. In the original code books, codewords were provided for things like commas and periods, though they were often ignored, as in our example. And codewords were provided for numbers — twelve might be Sally — because numbers were often meaningful. Codewords were also provided for dates and times, such as Nelly in our example. Follow the references in the exercise for more information.
