ADFGX and back again

I've been working on a small, simple tool called caesar that implements some of the old, basic ciphers like Caesar, Playfair, and Vigenère. These are lightweight and trivially broken with modern technology (and sometimes trivially broken without it). Nonetheless, I find them really fun to mess around with. caesar, the tool, is supposed to provide a work-free way to encrypt and decrypt using these classic, out-of-style ciphers.

The cipher I was trying to implement this past week is a WWI-era one called ADFGX. There's an extended version called ADFGVX, whose bigger 6×6 square makes room for digits and the letter J, but so far caesar is letters-only, so I figured ADFGX was a good place to start. ADFGX is really simple to work with by hand and takes advantage of a cool idea I will explain presently.

ADFGX takes two keys and consists of two steps. The first key is used to build what's called a Polybius square with a mixed alphabet. Essentially, we write the alphabet out into a square, but we start it off with a keyword (our first key). This results in a mixed-up alphabet. To fit the alphabet into a 5×5 square, we need to drop a letter (26 is not a perfect square), so typically I and J are merged and everyone named Joel has to go by Ioel for the foreseeable future. Here's a Polybius square without a key:

    1 2 3 4 5
    ---------
1 | A B C D E
2 | F G H I K
3 | L M N O P
4 | Q R S T U
5 | V W X Y Z

To make a square with a keyword, you start the square with the letters of the keyword (dropping any repeats, like the second P in “apple”) and then write the rest of the alphabet in order (skipping keyword letters, since they're already in the square). If our keyword was “apple”, it would look like this:

    1 2 3 4 5
    ---------
1 | A P L E B
2 | C D F G H
3 | I K M N O
4 | Q R S T U
5 | V W X Y Z
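Here's a minimal Python sketch of that construction. The function name `build_square` is made up for illustration (it isn't taken from caesar), and I'm assuming J is folded into I as described above:

```python
import string

def build_square(keyword: str) -> list[str]:
    """Return a 5x5 mixed-alphabet Polybius square as five 5-letter rows."""
    # Merge J into I so that exactly 25 letters remain.
    alphabet = string.ascii_uppercase.replace("J", "")
    seen = []
    # Keyword letters first, then the rest of the alphabet, skipping repeats.
    for ch in keyword.upper().replace("J", "I") + alphabet:
        if ch in alphabet and ch not in seen:
            seen.append(ch)
    return ["".join(seen[i:i + 5]) for i in range(0, 25, 5)]

for row in build_square("apple"):
    print(" ".join(row))
```

With the keyword “apple” this prints the same five rows as the square above, and an empty keyword gives the plain, unkeyed square.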

With a square like this, you can convert each letter of your message to a pair (row, column). “Hello” becomes 25 14 13 13 35. In ADFGX, the row and column headers are A D F G X instead of 1 2 3 4 5:

    A D F G X
    ---------
A | A P L E B
D | C D F G H
F | I K M N O
G | Q R S T U
X | V W X Y Z

So “Hello” passed through the first step of ADFGX (with key one being “apple”) would produce DX AG AF AF FX. This substitution by itself isn't great security-wise, since the pairs are as vulnerable to frequency analysis as any other one-to-one substitution cipher. So we come to step two of ADFGX.
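Step one can be sketched in a few lines of Python. This is only an illustration under my own assumptions: `substitute` and `HEADERS` are hypothetical names, J is treated as I, and anything that isn't a letter is silently dropped.

```python
import string

HEADERS = "ADFGX"  # row and column labels

def substitute(message: str, keyword: str) -> str:
    """Replace each letter with its (row, column) header pair."""
    alphabet = string.ascii_uppercase.replace("J", "")
    letters = [c for c in keyword.upper().replace("J", "I") + alphabet
               if c in alphabet]
    # dict.fromkeys drops repeats while keeping first-seen order,
    # giving the 25-letter square flattened into one string.
    square = "".join(dict.fromkeys(letters))
    pairs = []
    for ch in message.upper().replace("J", "I"):
        if ch in square:  # assumption: skip spaces, digits, punctuation
            row, col = divmod(square.index(ch), 5)
            pairs.append(HEADERS[row] + HEADERS[col])
    return " ".join(pairs)

print(substitute("Hello", "apple"))  # DX AG AF AF FX
```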

Step two requires a new key, the transposition key. The output from step one is written underneath the transposition key (in a normal, left-to-right, row-by-row way) and then the letters are read off column by column, taking the columns in alphabetical order of the key's letters. To continue our example, let's say our second key is “cat”:

C A T
-----
D X A
G A F
A F F
X

Reading off columns in alphabetical order (ACT), we would get XAF DGAX AFF. This step decouples the pairs that each letter was substituted with and that means frequency analysis won't reveal too much anymore. This is a super cool upgrade to just using a Polybius square. The Polybius square fractionates the original message, such that each letter becomes a pair. The transposition step muddles up those pairs. Fractionation + transposition is a killer duo.
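As a rough sketch, the transposition step looks like this in Python. The name `transpose` is mine, and the handling of repeated key letters (ties keep their left-to-right order) is an assumption, since a key like “cat” never runs into that case:

```python
def transpose(text: str, key: str) -> list[str]:
    """Write text row by row under key; return columns in alphabetical key order."""
    letters = text.replace(" ", "")
    # Column i holds every len(key)-th letter, starting at offset i.
    columns = [letters[i::len(key)] for i in range(len(key))]
    # sorted() is stable, so repeated key letters keep their original order
    # (an assumption; the example key has no repeats).
    order = sorted(range(len(key)), key=lambda i: key[i].upper())
    return [columns[i] for i in order]

print(" ".join(transpose("DX AG AF AF FX", "cat")))  # XAF DGAX AFF
```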

This is pretty much where Wikipedia and the other handful of sites I checked out leave off. Reversing the process to decrypt is pretty straightforward, for the most part. But I don't like this finished ciphertext (XAF DGAX AFF). I always like to group ciphertext letters into consistent sizes, like XAF DGA XAF F or XAFD GAXA FF. Left as-is, the ciphertext leaks the length of the second key: three groups means a three-letter key (this is called out on some websites as a “feature” of the cipher). I'm not sure how things were done a hundred years ago, but I'm hoping the letter groupings of the final output weren't heavily legislated.
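Regrouping is a one-liner, for what it's worth. A tiny Python sketch (`regroup` is a hypothetical helper name, not something from caesar):

```python
def regroup(ciphertext: str, size: int) -> str:
    """Strip the column spacing and re-split into fixed-size groups."""
    letters = ciphertext.replace(" ", "")
    return " ".join(letters[i:i + size] for i in range(0, len(letters), size))

print(regroup("XAF DGAX AFF", 3))  # XAF DGA XAF F
print(regroup("XAF DGAX AFF", 4))  # XAFD GAXA FF
```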

It's not very obvious how to get the letters back in their proper columns without knowing how long each column should be. I found one way of reversing the transposition step without having the column spacing but it involved a bunch of moving columns around and writing in all kinds of directions at once and that was just too complicated for me. I found another website that said to fill the transposition block with random a/d/f/g/x characters so you wouldn't have to worry about how many letters to put into each column when you're decrypting. Nothing grinds my gears like some indistinguishable, non-standardized padding.

Anyway, look no further. Here is my Simple Method of Properly Populating the Transposition Columns When The Ciphertext Isn't Grouped By Column™ (the full impetus for this blog post; everything else was introduction):

  1. Write out the keyword and, row by row, put one blank underneath for each letter of the ciphertext:

    C A T
    -----
    _ _ _
    _ _ _
    _ _ _
    _
    
  2. Starting from the beginning of the ciphertext, populate the columns in alphabetical order of the key letters, filling exactly as many blanks as each column has (column A has 3 blanks, so read the first 3 letters, XAF, into column A, etc.)

    C A T
    -----
    _ X _
    _ A _
    _ F _
    _
    
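  3. Repeat for the remaining columns (the next 4 letters, DGAX, go into column C, and the last 3, AFF, into column T), then read the grid row by row to get the step-one pairs back. Done.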
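The same steps as a Python sketch, in case that's easier to follow than blanks on paper. The name `untranspose` is made up for this post, and as before I'm assuming repeated key letters are ordered left to right:

```python
def untranspose(ciphertext: str, key: str) -> str:
    """Undo the columnar transposition, ignoring how the ciphertext is grouped."""
    letters = ciphertext.replace(" ", "")
    full_rows, extra = divmod(len(letters), len(key))
    # Step 1: the "blanks". The leftmost `extra` columns are one letter longer.
    heights = [full_rows + (1 if i < extra else 0) for i in range(len(key))]
    columns = [""] * len(key)
    # Steps 2 and 3: fill the columns in alphabetical order of the key letters.
    pos = 0
    for i in sorted(range(len(key)), key=lambda i: key[i].upper()):
        columns[i] = letters[pos:pos + heights[i]]
        pos += heights[i]
    # Read the grid row by row to recover the step-one output.
    rows = full_rows + (1 if extra else 0)
    return "".join(col[r] for r in range(rows) for col in columns if r < len(col))

print(untranspose("XAFD GAXA FF", "cat"))  # DXAGAFAFFX
```

The grouping of the input doesn't matter: XAF DGAX AFF, XAF DGA XAF F, and XAFD GAXA FF all come back as DXAGAFAFFX, which splits into the pairs DX AG AF AF FX.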

🙌🙌🙌