The idea behind Quadibloc 2002D is to have a cipher with the same kind of indirection as was used in Quadibloc 2002, but without the use of unbalanced Feistel rounds. An unbalanced Feistel round which is target-heavy potentially makes differential cryptanalysis easier, because the source is effectively repeated in the f-function output. An unbalanced Feistel round which is source-heavy slows diffusion. These problems are not necessarily all that serious in ciphers which are as elaborate as those in the Quadibloc series, but it would still be desirable to avoid them.
Using components from other Quadibloc ciphers, it is easy enough to design a combiner, following the same principle as the one in Quadibloc 2000, but operating on a 64-bit value instead of a 32-bit value:
This combiner takes a 128-bit input.
How might a 128-bit value be derived from the other 64-bit half of the 128-bit block? One obvious way would be to use the intermediate results from two conventional Quadibloc rounds, as shown at right:
And then, the remaining question is, what f-function do we use to modify the 128-bit value obtained from the left half before using it to alter the right half?
Since it is desirable to use something that is different from the operations used with the left and right halves of the block, it seems to me that a reasonable choice would be to use the operations from Quadibloc 2002A; a diffusion phase, a first key phase, a nonlinearity phase, a second key phase, and an extra diffusion phase.
S5 and S6 were used instead of S1 and S2 in the diagram for the conventional Quadibloc rounds for the left half in order to allow the reciprocal S-box derived from S1 used with Quadibloc 2002A to be used here, and S7 was used instead of S8 for the key-dependent S-box so that the two key-dependent S-boxes used in the diffusion phase could continue to be called S8 and S9. Note that the 128-bit keys used in the diagram, intended to show this operation as used as an f-function in the first round, are labelled LK3 and LK4. LK1 and LK2 are intended to be used with this same operation, but applied to the block as a whole, prior to the first regular round of Quadibloc 2002D. In this way, the original plaintext block will be divided between the left and right halves of the Feistel structure of the main part of the cipher, thereby, it is hoped, complicating analysis somewhat. A similar one-round version of Quadibloc 2002A will, of course, also be applied to the block on exit from the cipher.
After each regular round, except the last, the halves of the block are to be swapped. As the rounds of this cipher are complicated and slow, eight regular rounds, which should be enough to be adequate, are the normal number of rounds for Quadibloc 2002D. Sixteen rounds would, however, be ideal.
On entry and exit from the cipher, and as an f-function within its regular rounds, an operation equivalent to one-round Quadibloc 2002A is performed on a 128-bit input giving a 128-bit output.
This operation consists of a diffusion phase, a first key phase, a nonlinearity phase, a second key phase, and another diffusion phase.
The diffusion phase proceeds as follows:
The block is divided into pairs of bytes. In each pair, first the byte on the left has the element of S8 the byte on the right points to added to it modulo 256, and then the byte on the right has the element of S8 the byte on the left points to added to it modulo 256. In the absence of S8, this would essentially be a pseudo-Hadamard transform, as used in SAFER.
The bytes are next considered as belonging to groups of four, and the middle two bytes of each such group are swapped.
Then, considered as pairs of bytes again, first the byte on the right has the element of S8 the byte on the left points to subtracted from it modulo 256, and then the byte on the left has the element of S8 the byte on the right points to subtracted from it modulo 256. In the absence of S9, this would essentially be the inverse of the pseudo-Hadamard transform.
Since at this point, we have four groups of four bytes, each thoroughly mixed within itself, a byte transpose is performed, transposing the sixteen bytes of the block from the order
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
to the order
1 5 9 13 2 6 10 14 3 7 11 15 4 8 12 16
and thus placing one byte from each group of four into each new group of four.
Then the use of S8 in a pseudo-Hadamard transform and the use of S9 in an inverse pseudo-Hadamard transform is repeated.
Each key phase consists of modifying the block by XORing it with a 128-bit subkey.
The nonlinearity phase consists of replacing each byte of the block with its substitute in the S-box S1, where that S-box is created by taking the original S-box S1, as defined in the page entitled Euler's Constant and the Quadibloc S-boxes; the derived S-box is listed in the page describing Quadibloc 2002A.
The cipher consists of one such manipulation, using the first two long subkeys, followed by eight or sixteen regular rounds, each one using two more long subkeys and six 32-bit subkeys, with the halves of the block swapped after each regular round but the last, ending in another of the manipulations described above, using the last two long subkeys, LK19 and LK20 in the eight-round version, and LK35 and LK36 in the sixteen-round version.
In a regular round, the first operation that is performed is the encipherment of the left half of the block by means of two Feistel rounds resembling those of standard Quadibloc. Specifically, in the case of this cipher:
The left half of the block is divided into a left subblock of 32 bits and a right subblock of 32 bits. In the first of the two rounds, the left subblock is the input to the f-function, the first three short subkeys for the round are used, which are K1 through K3 for the first round, K7 through K9 for the second round, and so on, and the output of the f-function is applied to the right subblock by means of an XOR, modifying it. In the second of the two rounds, the right subblock is the input to the f-function, the second short subkeys for the round are used, which are K4 through K6 for the first round, increased by six for each successive round, and the output of the f-function is applied to the left subblock by means of an XOR, modifying it.
The f-function proceeds as follows:
The first 32-bit subkey used with the f-function is XORed with the input.
The four bytes of the result are replaced with their substitutes in S-boxes S5, S5, S5, and S6, respectively.
The bits of the result are transposed from the order:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
to the order
1 2 27 28 21 22 15 16 9 10 3 4 29 30 23 24 17 18 11 12 5 6 31 32 25 26 19 20 13 14 7 8
The value at this point is the first intermediate result of the f-function.
The second 32-bit subkey used with the f-function is XORed with the result.
The four bytes of the result are replaced with their substitutes in S-boxes S5, S6, S6, and S6, respectively.
The bits of the result are again subjected to the transposition previously described.
The value at this point is the second intermediate result of the f-function.
The third 32-bit subkey used with the f-function is XORed with the result.
The four bytes of the result are replaced with their substitutes in the key-dependent S-box S7.
The result is now the output from the f-function.
The intermediate results of the f-functions in the first and second Feistel rounds applied to the left half of the block form a 128-bit value by being taken in the following order, from left to right:
This 128-bit value is used as the input to the one-round Quadibloc 2002A operation described above, using the two long keys appropriate to the current round, LK3 and LK4 for the first regular round, increasing by two for each successive round. The output is then used to modify the right half of the block by being used as input to the combiner.
The combiner operates as follows:
The input to the combiner can be considered as consisting of eight sixteen-bit parts, numbered from 1 to 8 from left to right.
The combiner acts on the left half of the block. This 64-bit half of the block can also be considered as being divided into sixteen-bit parts, numbered from 1 to 4 from left to right.
The combiner operation consists of these steps:
The key material used by this cipher consists of:
The key must be a multiple of four bytes in length, and it must be at least eight bytes long.
Subkey material is to be generated in the following sequence:
Key-dependent S-boxes S7, S8, and S9 are produced by the method for generating permutations; key-dependent S-boxes S10 and S11 are merely filled directly with subkey material from beginning to end.
The subkey material is prepared from the key in the usual fashion, as initially established for Quadibloc XI, and as modified for the case of long keys for Quadibloc 2002:
Three strings of bytes of different length are produced from the key.
The first string consists of the key, followed by one byte containing the one's complement of the XOR of all the bytes of the key.
The second string consists of the one's complements of the bytes of the key in reverse order, with three bytes appended containing the following three quantities:
The third string consists of alternating bytes, taken from the bytes of the key in reverse order, and then from the bytes of the one's complement of the key, and then that string is followed by the one's complements of the first four bytes of the key.
Thus, if the key is:
128 64 32 16 8 4 2 1 1 2 3 4 5 6 7 8
then the strings generated from it are as follows:
First string: 128 64 32 16 8 4 2 1 1 2 3 4 5 6 7 8 8 Second string: 247 248 249 250 251 252 253 254 254 253 251 247 239 223 191 127 37 170 93 Third string: 8 127 7 191 6 223 5 239 4 247 3 251 2 253 1 254 1 254 2 253 4 252 8 251 16 250 32 249 64 248 128 247 127 191 223 239
Given that the length of the key is 4n, the lengths of the three strings are 4n+1, 4n+3, and 8n+4, and hence all three are relatively prime, since both 4n+1 and 4n+3 are odd, and 8n+4 is two times 4n+2.
Two buffers, each containing room for 256 bytes, are filled by generating bytes from the first and second strings by placing them in a nonlinear shift register.
The form of that shift register is shown in the following illustration, showing its precise form for the first string.
Five bytes are read from the string at each step. For the first string, they are, as shown in the diagram, the eighth-last, fifth-last, third-last, and second-last bytes and the last byte. For the second string, they are the eighth-last, seventh-last, fourth-last, and second-last bytes, and the last byte. For the third string, they are the twelfth-last, tenth-last, seventh-last, and fourth-last bytes, and the last byte.
Each time the shift register produces a byte, it does so as follows:
Both buffers contain 256 bytes. The first buffer, called buffer A, is filled with 256 successive bytes generated from the second string by means of the nonlinear shift register filled with the second string. The second buffer, called buffer B, is filled with 256 successive bytes generated from the first string by means of the nonlinear shift register filled with the first string.
Once the setup is complete, subkey material is generated one byte at a time, the first byte generated being the leftmost byte of subkey K1, and so on.
A subkey byte is generated as follows:
The following diagram attempts to illustrate the complete process of subkey byte generation:
Note that this procedure, since it exercises the two strings used to select bytes, rather than the string used to generate values, results in a small change in the key resulting in large changes in the subkeys from the very beginning.
To generate bijective S-boxes, which in the case of this cipher are the S-boxes S7, S8, and S9, the following procedure is used:
The resulting contents of buffer D are used as the key-dependent bijective S-box intended to be produced. Note that this is a procedure, introduced for Quadibloc X, is more straightforwards than the other two basic procedures used previously to produce S8 in other ciphers in this series.
This procedure, although it uses buffers A and B, leaves them undisturbed; thus, byte generation may continue after one S-box is produced.
The subkey generation procedure given above works well for keys from 64 bits to 1,024 bits in length. When the length of the key approaches 2,048 bits in length, however, the first two shift registers are no longer thoroughly mixed by filling the buffers initially.
Thus, the following modified subkey generation procedure is given for keys which are more than 1,024 bits long. A key must still be a multiple of 32 bits in length.
This method ensures that the key schedule is a function of the entire key.
While the method for subkey byte generation introduced for Quadibloc XI will normally produce a key schedule of high quality, there is one potential for difficulties which remains in it. The basis of subkey byte generation is three nonlinear shift registers. They are designed so that they have an incompressible state space; that is, since a nonlinear function of the block is XORed with a byte of the block, stepping the shift register is invertible, so each shift register state has exactly one predecessor as well as exactly one successor. This prevents a large proportion of states from eventually degenerating into a short cycle, but it does not prevent the existence of short cycles; there is always the risk that the initial state of the shift register might be part of a short cycle.
In this subkey byte generation method, two of the shift registers are used for selecting elements from 256-byte buffers, and only one shift register, the longest, produces the actual values which are placed in those buffers, and which are XORed together to produce the output subkey bytes.
Since producing a key schedule requires only a limited number of bytes, even if that number is in the thousands, short cycles are not a major risk. However, if bytes from the other shift registers also entered into the direct production of output subkey bytes, the robustness of the subkey byte generation process would be improved. In consequence, the alternate subkey byte generation procedure illustrated below is proposed:
Here, there are four buffers. In addition to the original buffers A and B, there are two additional buffers, C and D, which precede them. Initially, the 256 elements of buffer C, followed by the 256 elements of buffer A, are filled with bytes generated from the second string by its associated nonlinear shift register operation, and the 256 elements of buffer D, followed by the 256 elements of buffer B, are filled with bytes generated from the first string by its associated nonlinear shift register operation. These operations, and the initial values of the three strings, remain the same as in the original key schedule.
The positions of the buffers A through D in the diagram above are as follows:
C A D B
The generation of subkey bytes by this improved procedure proceeds as follows:
Thus, basically, in this improved subkey byte generation procedure, all three nonlinear shift registers are called upon to generate three bytes. For the first two shift registers, two of these bytes control two stages of shuffling of the bytes recieved from the third shift register, and a third is applied between those two stages of shuffling to the bytes shuffled by the other one of those two shift registers. The three bytes from the third shift register are the starting inputs to the two sets of shuffling stages, and a byte which is directly XORed to the XOR of the final outputs from the two sets of shuffling stages.
The Quadibloc 2002A rounds at the beginning and end of the cipher are inverted in the manner set forth for Quadibloc 2002A, by reversing the order of the subkeys and interchanging key-dependent S-boxes S8 and S9. Of course, no changes are made to the f-function in the regular round.
The basic principle of leaving f-functions unchanged, and performing Feistel rounds in reverse order, is used to invert the Quadibloc rounds operating on the left half of the block, and the combiner operating on the right half of the block, to form the inverse of a regular round.
One disadvantage of the abandonment of the unbalanced Feistel round in this design is that, since there are only 2^64 possible left halves of the block on entry to any round, and the subkeys for that round are fixed, in a sense, half of the f-function (the major one which is in the form of one-round Quadibloc 2002A) is wasted; in any given round, the 128-bit value which is its input can only have 2^64 possible values.
However, those possible values do not have a relationship which is simple in terms of the f-function; the subkeys are different in different rounds; and the same can be said of the expansion permutation in DES, which produces a 48-bit value which can have only 2^32 different values.
In addition to indirection, this cipher gains its strength from the use of five key-dependent S-boxes, which are generated by a key schedule which produces a thoroughly mixed output from the key, even if it is not quite as secure a key schedule as that of Blowfish.
Start of Section
Skip to Next Chapter
Skip to Next Section
Table of Contents