Cryptanalysis is hard work, requiring a willingness to endure many false starts, and a painstaking attention to detail. It requires intelligence to see subtle patterns in incomprehensible ciphertext.

Automated aids to cryptanalysis come in many forms. Some collected statistical information about ciphertexts, thus removing one bit of drudgery from human shoulders. Others, such as the Bombe used in attacking the German Enigma, or the DES cracker built by the Electronic Frontier Foundation, or the converted unit record equipment (punched card machines) which compared Japanese code messages to one another at various displacements to find messages with overlapping superencipherment groups, work by trying thousands, or millions, of possibilities, one after another.

Neither of these techniques is adequate to deal with many cipher systems, particularly modern ones. A well-designed cipher will not offer a simple opportunity to try different possibilities to find partial information about the key, and will have a key large enough to make trying every possible key hopeless. Nor is ordinary statistical information about the frequencies and contacts of bytes in the ciphertext likely to be much use.

Thus, approaches taken from the field of AI (*artificial
intelligence*) have been tried. In these approaches, it is
attempted to combine the speed of the computer with steps that
at least slightly move towards the skill and judgement of a
human cryptanalyst.

Because the individual bits of the subkeys in DES are actual bits taken from the 56-bit DES key, an approach like the following to recover a DES key must have occurred to many people.

Given a block of known plaintext, and its corresponding ciphertext, starting with a random 56-bit possible key, do the following:

- Encipher the known plaintext with that key, and with every one of the 56 other keys obtained by inverting one bit of that key.
- Compare the resulting ciphertext to the actual ciphertext.
- In those of the 56 cases where the flipped bit results in the ciphertext produced differing in fewer bits from the actual ciphertext than that produced by the original trial key, invert that bit of the trial key to obtain the next trial key.

This is a simple example of a hill-climbing algorithm, where the number of bits by which a trial encipherment differs from the actual ciphertext are a measure of one's (lack of) altitude.

It would, however, never work against DES. That is because
of the *avalanche property* of DES; changing a single
bit in a DES key results in every bit of the block being enciphered
being changed randomly after only a few rounds.

Thus, even attempting to improve the hill climbing algorithm above by, for each trial, enciphering the known plaintext for eight rounds with the trial key, and deciphering the actual ciphertext for eight rounds with the trial key, and then determining the number of bits by which these two results differed would not be enough to help.

Another idea would be to choose two rounds of DES, and by determining the input to those rounds by enciphering the known plaintext by the previous rounds, and the required output from those rounds by deciphering the actual ciphertext by the following rounds, examine the two 48-bit subkeys for the rounds, and, by examining the four possibilities for each group of 6 bits in those subkeys to produce the required change in each half of the block, find those which are consistent with the origin of those two subkeys from the original 56-bit key, and then try the resulting new 56-bit key or keys on the basis that it or they might be improvements over the preceding trial key.

A thesis by A. J. Bagnall described the ciphertext-only solution of some simple rotor machines by means of the technique of genetic programming.

Genetic programming is a method by which a computer
produces an answer to a question, or even a computer program to
perform a task, by mimicing the process of natural selection.
As noted in the thesis, and in the book *Artificial Life*
by Stephen Levy, this technique was originated by John
Holland in the mid-1960s, and his student David Goldberg was
one of the first to refine the technique so that it could be
used in practice with real problems of importance.

It can be thought of as a special case of the hill-climbing algorithm, in that a quantitative measure of how "warm" the computer is in approaching the desired solution is required.

Programs or answers must be in the form of a chain of
discrete elements, such that there is at least a reasonable
likelihood that a chain formed by taking one chain, and
replacing a span of elements within it by the corresponding
elements from another chain, will "make sense".
(This may be noted as the *second* major use of sex
for the purpose of obtaining the codes of foreign powers.)

Random mutations are also usually used, although genetic crossover has been found to be much more important.

Starting with a random selection of solutions, those that work best are retained, and used as the parents of the next generation of solutions to be tried. Often, this retention is also randomized, so that better solutions have a higher probability of being retained.

One type of mutation that happens in real life has not, to my knowledge, been used for genetic programming yet. Occasionally, plants and animals will increase the size of their genetic inheritance by duplicating part of it. Thus, a finite state machine could mutate by becoming a machine with twice as many states. It might be useful to make provision for this where a problem might be more complex to solve than initially realized.

[Next] [Up] [Previous] [Index]