Alice sends messages to Bob, Carla, and Dante. She encrypts each message with a one-time pad, but she’s lazy and uses the same key for each message. In this section, we’ll show how Eve can decrypt all three messages.
Suppose the messages are and the key is . The ciphertexts are computed as . Eve computes
Similarly, she obtains and . The key has disappeared, and Eve’s task is to deduce from knowledge of , , . The following example shows some basic ideas of the method.
Let’s assume for simplicity that the messages are written in capital letters with spaces but with no other punctuation. The letters are converted to ASCII using
(the letters to are the numbers 65 through 90 written in binary, and space is 32 in binary; see Table 4.1).
The XORs of the messages are
Note that the first block of is 0000000
. This means that the first letter of is the same as the first letter of . But the observation that makes the biggest difference for us is that “space” has a 0 as its leading bit, while all the letters have 1 as the leading bit. Therefore, if the leading bit of an XOR block is 1, then it arises from a letter XORed with “space.”
For example, the third block of is 1100100
and the third block of is 1100001
. These can happen only from and , respectively. It follows easily that has “space” as its third entry, has as its third entry, and has as its third entry. Similarly, we obtain the 2nd, 7th, 8th, and 9th entries of each message.
The 5th entries cause a little more trouble: and tell us that there is “space” XORed with , and tells us that the 5th entries of and are equal, but we could have “space A A” or “A space space.” We need more information from surrounding letters to determine which it is.
To proceed, we use the fact that some letters are more common than others, and therefore certain XORs are more likely to arise from these than from others. For example, the block 0010001 is more likely to arise from than from . The most common letters in English are . Make a table of the XORs of each pair of these letters. If an XOR block occurs in this table, we guess that it comes from one of the pairs yielding this block. For example, 0001001
arises from and from , so we guess that the 11th block of comes from one of these two combinations. This might not be a correct guess, but we expect such guesses to be right often enough to yield a lot of information.
Rather than produce the whole table, we give the blocks that we need and the combinations of these frequent letters that produce them:
Let’s look at the next-to-last block of each XOR. In , we have 0010111
, which could come from or . In , we have 0000100
, which could come from or . In , we have 0010011
, which could come from . The only combination consistent with these guesses is for the next-to-last letters of , respectively. This type of reasoning, combined with the information obtained from the occurrence of spaces, yields the following progress (- indicates a space, * indicates a letter to be determined):
The first letter of is a one-letter word. The XOR block 0000001
gives us the possibilities so we guess starts with . The XORs tell us that and start with .
Now let’s look at the 12th letters. The block 0001111
for suggests . The block for suggests and . The block for suggests . These do not have a common solution, so we know that one of the less common letters occurs in at least one of the messages. But ends with , and a good guess is that this should be . If so, then the XOR information tells us that ends in SECRET
, and ends in TODAY
, both of which are words. So this looks like a good guess. Our progress so far is the following:
It is time to revisit the 5th letters. We already know that they are “space A A” or “A space space.” The first possibility requires to have two consecutive one-letter words, which seems unlikely. The second possibility means that starts with the two words , so we can guess this is
. The XOR information tells us that and have and , respectively. We now have all the letters:
Something seems to be wrong in the 6th column. These letters were deduced from the assumption that they all were common letters, and this must have been wrong. But at this point, we can make educated guesses. When we change to in , and make the corresponding changes in and required by the XORs, we get the final answer:
These techniques can also be used when there are only two messages, but progress is usually slower. More possible combinations must be tried, and more false deductions need to be corrected during the decryption. The process can be automated. See [Dawson-Nielsen].
The use of spaces and the restriction to capital letters made the decryption in the example easier. However, even if spaces are removed and more symbols are used, decryption is usually possible, though much more tedious.
During World War II, problems with production and distribution of one-time pads forced Russian embassies to reuse some keys. This was discovered, and part of the Venona Project by the U.S. Army’s Signal Intelligence Service (the predecessor to NSA) was dedicated to deciphering these messages. Information obtained this way revealed several examples of Russian espionage, for example, in the Manhattan Project’s development of atomic bombs, in the State Department, and in the White House.