Everyone knows that the one-time pad provides perfect secrecy. But what does this mean? In this section, we make this concept precise. Also, we know that it is very difficult in practice to produce a truly random key for a one-time pad. In the next section, we show quantitatively how biases in producing the key affect the security of the encryption.
In Section 20.4, we repeat some of the arguments of the present section and phrase them in terms of entropy and information theory.
The topics of this section and the next are part of the subject known as Provable Security. Rather than relying on intuition that a cryptosystem is secure, the goal is to isolate exactly what fundamental problems are the basis for its security. The result of the next section shows that the security of a one-time pad is based on the quality of the random number generator. In Section 10.5, we will show that the security of the ElGamal public key cryptosystem reduces to the difficulty of the Computational Diffie-Hellman problem, one of the fundamental problems related to discrete logarithms. In Section 12.3, we will use the Random Oracle Model to relate the security of a simple cryptosystem to the noninvertibility of a one-way function. Since these fundamental problems have been well studied, it is easier to gauge the security levels of the cryptosystems.
First, we need to define conditional probability. Let’s consider an example. We know that if it rains Saturday, then there is a reasonable chance that it will rain on Sunday. To make this more precise, we want to compute the probability that it rains on Sunday, given that it rains on Saturday. So we restrict our attention to only those situations where it rains on Saturday and count how often this happens over several years. Then we count how often it rains on both Saturday and Sunday. The ratio gives an estimate of the desired probability. If we call the event that it rains on Saturday and the event that it rains on Sunday, then the conditional probability of given is
where denotes the probability of the event . This formula can be used to define the conditional probability of one event given another for any two events and that have probabilities (we implicitly assume throughout this discussion that any probability that occurs in a denominator is nonzero).
Events and are independent if . For example, if Alice flips a fair coin, let be the event that the coin ends up Heads. If Bob rolls a fair six-sided die, let be the event that he rolls a 3. Then and . Since all 12 combinations of and are equally likely, , which equals . Therefore, and are independent.
If and are independent, then
which means that knowing that happens does not change the probability that happens. By reversing the steps in the above equation, we see that
An example of events that are not independent is the original example, where is the event that it rains on Saturday and is the event that it rains on Sunday, since . (Unfortunately, a widely used high school algebra text published around 2005 gave exactly one example of independent events: and .)
How does this relate to cryptography? In a cryptosystem, there is a set of possible keys. Let’s say we have keys. If we have a perfect random number generator to choose the keys, then the probability that the key is is . In this case we say that the key is chosen uniformly randomly. In any case, we assume that each key has a certain probability of being chosen. We also have various possible plaintexts and each one has a certain probability . These probably do not all have the same probability. For example, the message attack at noon is usually more probable than two plus two equals seven. Finally, each possible ciphertext has a probability .
We say that a cryptosystem has perfect secrecy if
for all possible plaintexts and all possible ciphertexts . In other words, knowledge of the ciphertext never changes the probability that a given plaintext occurs. This means that eavesdropping gives no advantage to Eve if she wants to guess the message.
We can now formalize what we claimed about the one-time pad.
If the key is chosen uniformly randomly, then the one-time pad has perfect secrecy.
Proof. We need to show that for each pair .
Let’s say that there are keys, each of which has probability . We start by showing that each possible ciphertext also has probability . Start with any plaintext . If is the ciphertext, then the key is . Therefore, the probability that is the ciphertext is the probability that is the key, namely , since all keys have this probability. Therefore, we have proved that
for each and .
We now combine the contributions from the various possibilities for . Note that if we sum over all possible , then
since this is the probability of the plaintext existing. Similarly, the event can be split into the disjoint sets , which yields
Applying Equation (4.1) twice yields
Since we have already proved that , we can multiply by to obtain
which says that the one-time pad has perfect secrecy.
One of the difficulties with using the one-time pad is that the number of possible keys is as least as large the number of possible messages. Unfortunately, this is required for perfect secrecy:
If a cryptosystem has perfect secrecy, then the number of possible keys is greater than or equal to the number of possible plaintexts.
Proof. Let be the number of possible plaintexts and let be the number of possible keys. Suppose . Let be a ciphertext. For each key , decrypt using the key . This gives possible plaintexts, and these are the only plaintexts that can encrypt to . Since , there is some plaintext that is not a decryption of . Therefore,
This contradicts the assumption that the system has perfect secrecy. Therefore, .
Suppose a parent goes to the pet store to buy a pet for a child’s birthday. The store sells 30 different pets with 3-letter names (ant, bat, cat, dog, eel, elk, ...). The parent chooses a pet at random, encrypts its name with a shift cipher, and sends the ciphertext to let the other parent know what has been bought. The child intercepts the message, which is ZBL. The child hopes the present is a dog. Since DOG is not a shift of ZBL, the child realizes that the conditional probability
and is disappointed. Since (because there are 30 equally likely possibilities), we have , so there is not perfect secrecy. This is because a given ciphertext has at most 26 possible corresponding plaintexts, so knowledge of the ciphertext restricts the possibilities for the decryption. Then the child realizes that YAK is the only possible shift of ZBL, so . This does not equal , so again we see that we don’t have perfect secrecy. But now the child is happy, being the only one in the neighborhood who will have a yak as a pet.