[Next] [Up] [Previous] [Index]

Cryptanalysis, Almost by Aimlessly Thrashing About

Cryptanalysis is hard work, requiring a willingness to endure many false starts, and a painstaking attention to detail. It requires intelligence to see subtle patterns in incomprehensible ciphertext.

Automated aids to cryptanalysis come in many forms. Some collected statistical information about ciphertexts, thus removing one bit of drudgery from human shoulders. Others, such as the Bombe used in attacking the German Enigma, or the DES cracker built by the Electronic Frontier Foundation, or the converted unit record equipment (punched card machines) which compared Japanese code messages to one another at various displacements to find messages with overlapping superencipherment groups, work by trying thousands, or millions, of possibilities, one after another.

Neither of these techniques is adequate to deal with many cipher systems, particularly modern ones. A well-designed cipher will not offer a simple opportunity to try different possibilities to find partial information about the key, and will have a key large enough to make trying every possible key hopeless. Nor is ordinary statistical information about the frequencies and contacts of bytes in the ciphertext likely to be much use.

Thus, approaches taken from the field of AI (artificial intelligence) have been tried. In these approaches, it is attempted to combine the speed of the computer with steps that at least slightly move towards the skill and judgement of a human cryptanalyst.

Hill-climbing

Because the individual bits of the subkeys in DES are actual bits taken from the 56-bit DES key, an approach like the following to recover a DES key must have occurred to many people.

Given a block of known plaintext, and its corresponding ciphertext, starting with a random 56-bit possible key, do the following:

This is a simple example of a hill-climbing algorithm, where the number of bits by which a trial encipherment differs from the actual ciphertext are a measure of one's (lack of) altitude.

It would, however, never work against DES. That is because of the avalanche property of DES; changing a single bit in a DES key results in every bit of the block being enciphered being changed randomly after only a few rounds.

Thus, even attempting to improve the hill climbing algorithm above by, for each trial, enciphering the known plaintext for eight rounds with the trial key, and deciphering the actual ciphertext for eight rounds with the trial key, and then determining the number of bits by which these two results differed would not be enough to help.

Another idea would be to choose two rounds of DES, and by determining the input to those rounds by enciphering the known plaintext by the previous rounds, and the required output from those rounds by deciphering the actual ciphertext by the following rounds, examine the two 48-bit subkeys for the rounds, and, by examining the four possibilities for each group of 6 bits in those subkeys to produce the required change in each half of the block, find those which are consistent with the origin of those two subkeys from the original 56-bit key, and then try the resulting new 56-bit key or keys on the basis that it or they might be improvements over the preceding trial key.

Genetic Programming

A thesis by A. J. Bagnall described the ciphertext-only solution of some simple rotor machines by means of the technique of genetic programming.

Genetic programming is a method by which a computer produces an answer to a question, or even a computer program to perform a task, by mimicing the process of natural selection. As noted in the thesis, and in the book Artificial Life by Stephen Levy, this technique was originated by John Holland in the mid-1960s, and his student David Goldberg was one of the first to refine the technique so that it could be used in practice with real problems of importance.

It can be thought of as a special case of the hill-climbing algorithm, in that a quantitative measure of how "warm" the computer is in approaching the desired solution is required.

Programs or answers must be in the form of a chain of discrete elements, such that there is at least a reasonable likelihood that a chain formed by taking one chain, and replacing a span of elements within it by the corresponding elements from another chain, will "make sense". (This may be noted as the second major use of sex for the purpose of obtaining the codes of foreign powers.)

Random mutations are also usually used, although genetic crossover has been found to be much more important.

Starting with a random selection of solutions, those that work best are retained, and used as the parents of the next generation of solutions to be tried. Often, this retention is also randomized, so that better solutions have a higher probability of being retained.

One type of mutation that happens in real life has not, to my knowledge, been used for genetic programming yet. Occasionally, plants and animals will increase the size of their genetic inheritance by duplicating part of it. Thus, a finite state machine could mutate by becoming a machine with twice as many states. It might be useful to make provision for this where a problem might be more complex to solve than initially realized.


[Next] [Up] [Previous] [Index]

Next
Chapter Start
Table of Contents
Home Page