This key-exchange algorithm was invented by Whitfield Diffie and Martin Hellman in 1976, and it too was previously invented within GCHQ in Britain, this time by Malcolm Williamson in 1974.
Although it is impossible to encrypt any message directly in the Diffie-Hellman system, it is still useful for sending secret messages. In general terms, it can be thought of as working like this: two parties, sharing a public key, each think of a number. Both parties perform a particular type of hash on that number using the public key, and give the result to the other party. This is useful, because there is a function of one original number and the special hash that was calculated of the other number that is the same, regardless of which one of the numbers you have in its original form, and so the two parties can use that product as a conventional encryption key for communicating. Note that the number each party thinks of can also be called a secret key, and the hashed version as a public key.
The Diffie-Hellman method, like RSA, also uses modulus arithmetic, but here the modulus is just a prime number, which we will call P. Exponentiation is involved as well. But, while in RSA the encryption function applied to x is x to the d power, which is inverted as x to the e power, in Diffie-Hellman the function applied is a hash function, in effect, because it has no inverse: A to the x power, modulo P.
(Of course, with real number arithmetic, that has an inverse, the logarithm of x to the base A. And the inverse of x to the e power is the e root of x as well. Note that this does indicate one weakness of RSA: if e happens to be 3, and the encrypted message is a perfect cube, then the conventional cube root, which is easy to find, is also the cube root modulo M. This requires both a low exponent and a small number as the message, and is therefore easily avoided in practice.)
So, in Diffie-Hellman, the two parties each think of a secret random number, respectively x and y. Each transmits A to that power. So, one party knows x and A^y, and the other party knows y and A^x. Each party can calculate A^(x*y), since that is (A^y)^x and it is also (A^x)^y, but an eavesdropper, only knowing A^x and A^y, cannot do this.
A potential danger when Diffie-Hellman is used is that one could, for some values of the modulus P, choose values of A such that A^x has only a small number of possible values, no matter what x is, which would make it easy to find for a value of x that was equivalent to the original value of x. A prime modulus P satisfying the condition that P-1 has some large factors is required to avoid this problem.
One way to obtain a modulus that definitely satisfies this condition would be to choose a modulus P such that P-1 is equal to twice another prime number. This second, smaller prime number is known as a Sophie Germain prime, after a noted mathematician who was able to prove Fermat's Last Theorem for the case when the exponent was such a prime.
For Diffie-Hellman, choosing 2q+1 for q being some Sophie Germain prime may well be reasonable. For RSA, the condition that p-1 have some large prime factors is also useful to improve resistance to some forms of factoring, but since it is necessary for the two prime factors of the modulus in RSA to be unknown, the rarity of Sophie Germain primes may conflict with this. In Diffie-Hellman, of course, the one prime used is part of the public key.
The Key Exchange Algorithm (KEA) which was used with SKIPJACK in the Clipper chip, and which was declassified at the same time, uses Diffie-Hellman in an interesting manner which highlights some of its properties. As in the Digital Signature Algorithm, the prime modulus P is required to be one such that P-1 has a factor that is 160 bits long. Since P is 1024 bits long, (or 512 to 1024 bits long, in the case of the Digital Signature Standard) this appears to be less secure than the use of a prime derived from a Sophie Germain prime. Also, it is somewhat more cumbersome to use: where f is the 160-bit long prime factor of P-1, A cannot simply be chosen at random; instead, another number less than P-1, B, is chosen at random, and B^((P-1)/f) is used as A as long as it is not equal to 0 or 1 (modulo P, of course) and it also requires that each A^x used be tested to determine that A^(xf) is equal to 1 modulo P.
Although this may be less secure than the use of a Sophie Germain prime, it is certainly more secure than simply choosing P at random, and making no effort to avoid the problems that small factors of P-1 could cause.
The protocol involved in KEA is of more interest. The session key is derived by hashing
(x1*y2) (x2*y1) A + A
all calculated modulo P, where x1 and x2 came from the first party to the communication, and y1 and y2 came from the second.
If doing Diffie-Hellman once isn't good enough, what is the point of doing it twice? The important thing about KEA is the difference between where the x values and the y values came from.
The first user retains x1 as a persistent private key, and when A^x1 is presented as his public key, it is accompanied by a digital certificate. Similarly, the second user has a digital certificate for A^y1 as a public key.
On the other hand, x2 and y2 were random numbers generated at the time of the communication, and thus A^x2 and A^y2 do not have certificates corresponding to them. Thus, by involving the four parameters, a persistent and a temporary key from each user in the procedure, both users prove their identity with a digital certificate, but the resulting session key is something that is different for every message.
Since the x2 and y2 values are not persistent (they are sometimes called nonces), an attacker could only obtain them through access to the computers or other equipment of the parties to the communication at about the same time as the message with which they are associated is itself present in plaintext form on that equipment. Thus, unlike persistent private keys, nonces do not contribute to the security burden of key storage. This means that a passive attacker would need to compromise the persistent private keys of both parties (if that attacker could not also obtain one of the nonces) to read a message whose key was generated through KEA. This, however, was not likely to have been a major design consideration, because additional security against passive attacks could be gained by making the protocol more elaborate (for example, A^(x2*y2) could be used in addition in the generation of the session key); but this by itself would not increase security against an active attack. Although this technique could be combined with other measures that do combat active attacks, an attacker who could also partially compromise keys is not only likely to be able to mount an active attack of the "man-in-the-middle" type, but may even be able to tamper with the encryption hardware or software being used.
The technique for deriving an 80-bit key for SKIPJACK from the 1024-bit value A^(x1*y2)+A^(x2*y1) mod P, which value I will call k1, included for completeness, is as follows:
The 80 most significant bits of k1 are XORed with the 80-bit constant X'72F1A87E92824198AB0B', and the result of this will be called k2.
The 80 next most significant bits of k1 are split into 64 more significant bits and 16 less significant bits. The 64 more significant bits are encrypted twice in SKIPJACK with a key of k2. The 16 leftmost bits of the result of the first of these two encryptions is XORed with the 16 less significant bits. The concatenation of the 64-bit result of the double encryption with the 16-bit result of the XOR forms the 80 bit key that will actually be used for the conventional encryption, with SKIPJACK, of the actual message. The following diagram illustrates this process, and it is similar to a diagram included in the document describing SKIPJACK and KEA on the occasion of their declassification:
Since k1 is a number modulo P, where P is a prime that is 1024 bits long, its most significant digit is somewhat more likely, depending on the choice of P, to be a zero than a one, and this could be considered a slight weakness in choosing to use the most significant bits of k1 instead of some other part of it. Note also that the value of the 80-bit constant used, were it secret, as it was originally, would provide an additional layer of protection to the actual key used from the values exchanged. Also note that I am assuming that the convention for taking the 80 most significant bits and the 80 next most significant bits of k1 and submitting them to SKIPJACK in the form of binary bytes is a big-endian convention; anything else would be too confusing for me to attempt to explain.
Also note that the technique used for converting the 80 next most significant bits of k1 to the 80-bit key actually used, while suitable to its purpose here, is not a secure technique in general for increasing the block size of a block cipher.
Table of Contents