MARS is the IBM's entrant into the AES competition. As IBM is the company that designed DES, this alone makes this entry of particular interest. Note, incidentally, that as the AES process only requires companies to allow free use of the proposed algorithm if it is selected as the successful candidate, some AES candidates are protected by patents, both to retain control of the cipher itself if it is not selected, and to control the basic technology on which the cipher is based, so that licensing would still be required for larger variants of the cipher, for example. Among the entrants I have examined so far, this applies to RC6 and it had also applied to MARS, but IBM later made MARS available for licensing free of charge.
The document describing MARS, accessible at http://www.research.ibm.com/security/mars.html, notes that "little-endian" conventions, with the "first" byte of a 32-bit word being its least significant byte, are used in the cipher. Unlike the documentation of some of the other little-endian designs, the MARS documentation makes it completely clear and unambiguous which byte goes where.
As the little-endian case is difficult to describe, I am going to use big-endian conventions consistently in my description of MARS. Essentially, this means that the very first thing one does when starting MARS as I describe it is divide the 128-bit block of plaintext into four words of four bytes, and reverse the order of the four bytes in each word. The same has to be done on exit.
The overall structure of MARS is as follows:
First, key material is XORed with the whole block.
Then, eight rounds of a transformation similar to DES are applied, but that transformation is fixed, and without any part that is affected by a key. (Note that if not for the initial XOR of key material, this transformation would be a waste of time.)
Then, there are sixteen rounds (the last eight are called "reverse" rounds, with a rationale somewhat like the one seen in Skipjack) which constitute the "cryptographic core" of the cipher. One 32-bit word is used to modify the other three, by being split into three copies of itself, and subjected to various manipulations, including one multiplication by key material, one S-box lookup, and two data-dependent rotations.
Then we have another unkeyed transformation, and another XOR of key material.
There are 40 32-bit words of subkeys, which are generated by a kind of shift-register method from the key.
As its documentation points out, the design of MARS is oriented around having structures that are secure, but which can also be analyzed, so that it is possible to be confident of the security the cipher posesses. Hard-to-analyze structures that might offer more security, but which would be hard to be certain of, were specifically avoided. The unkeyed rounds of mixing at the start and end of the cipher were, in a way, an exception to that rule; this is why they were left unkeyed (although that may well seem bizarre and wasteful), to ensure that they didn't form, in some sense, a "real" exception to that rule. (It would seem to me that if the key schedule of MARS were very strong, i.e., like that of Blowfish, or if the key used for the outer rounds, presumably to XOR with the inputs to the S-boxes, were independent of the key for the rest of the cipher, there would also be no concern to prevent doing this. And the encrypting speed of MARS would not be affected significantly, either.)
The diagrams I will use here show the words of the block with the first word on the left, and with the most significant byte of each word on the left. Thus, before starting, and after finishing, the bytes of the block being enciphered must be transposed to the following order, if considered as numbered from 0 to 15:
3 2 1 0 7 6 5 4 11 10 9 8 15 14 13 12
to convert between little-endian to big-endian. With this conversion applied, the procedure I am describing in big-endian form will be correct.
The first step in MARS encipherment is to XOR the first four subkey words, K0 to K3, with the four words of the block. (Of course this can be thought of as the 128-bit XOR of a single 128-bit subkey, but this keeps the notation for subkeys consistent.)
The four bytes of the first word in the block, W0, are used as inputs to the two fixed S-boxes used by this cipher with eight bits of input and 32 bits of output. The second word is XORed with the word in S0 chosen by the least significant byte of the first word and then the word in S1 chosen by the third, or second least significant, byte of the first word is added to it. The word in S0 chosen by the second byte of the first word is added to the third word. The fourth word is XORed with the word in S1 chosen by the first byte of the first word.
In the first and fifth round of this, the fourth word is added to the first word, and in the second and sixth round of this, the second word is added to the first word. This additional complication is intended to protect against differential cryptanalysis.
Then the words are rotated, so that the old first word becomes the new fourth word, and the other words are moved to the position immediately preceding their old position.
This is shown clearly in the diagram of MARS that has been on the right side of this page above, which outlines the general structure of MARS.
Also on this diagram are the next sixteen rounds, which constitute the "cryptographic core" of MARS. The E function produces three 32-bit outputs from the first word of the block and two 32-bit subkeys. The 13-bit circular left shift shown as being applied to the first word would be duplicated inside the E-function if it really recieved only one input; that oversimplication is avoided in the following diagram,
which shows one of the forward core rounds (which I have chosen to label as type D in the first diagram for ease of reference) in detail.
The final eight rounds of MARS are the inverse, in the sense of the operations performed, of the first eight rounds of unkeyed mixing; subtraction replaces each addition, and the rounds are performed in reverse order. However, the direction in which the four words are rotated after each round is not reversed, so these rounds are not an exact inverse of the first four rounds in an overall sense.
S(0,1) in the diagram stands for an S-box with nine input bits and 32 output bits which is merely the concatenation of S0 and S1.
The S-boxes in MARS are as follows (no, I didn't type them in myself; I must gratefully acknowledge the C implementation of MARS by Brian Gladman as my source):
S-box 0: 09D0C479 28C8FFE0 84AA6C39 9DAD7287 7DFF9BE3 D4268361 C96DA1D4 7974CC93 85D0582E 2A4B5705 1CA16A62 C3BD279D 0F1F25E5 5160372F C695C1FB 4D7FF1E4 AE5F6BF4 0D72EE46 FF23DE8A B1CF8E83 F14902E2 3E981E42 8BF53EB6 7F4BF8AC 83631F83 25970205 76AFE784 3A7931D4 4F846450 5C64C3F6 210A5F18 C6986A26 28F4E826 3A60A81C D340A664 7EA820C4 526687C5 7EDDD12B 32A11D1D 9C9EF086 80F6E831 AB6F04AD 56FB9B53 8B2E095C B68556AE D2250B0D 294A7721 E21FB253 AE136749 E82AAE86 93365104 99404A66 78A784DC B69BA84B 04046793 23DB5C1E 46CAE1D6 2FE28134 5A223942 1863CD5B C190C6E3 07DFB846 6EB88816 2D0DCC4A A4CCAE59 3798670D CBFA9493 4F481D45 EAFC8CA8 DB1129D6 B0449E20 0F5407FB 6167D9A8 D1F45763 4DAA96C3 3BEC5958 ABABA014 B6CCD201 38D6279F 02682215 8F376CD5 092C237E BFC56593 32889D2C 854B3E95 05BB9B43 7DCD5DCD A02E926C FAE527E5 36A1C330 3412E1AE F257F462 3C4F1D71 30A2E809 68E5F551 9C61BA44 5DED0AB8 75CE09C8 9654F93E 698C0CCA 243CB3E4 2B062B97 0F3B8D9E 00E050DF FC5D6166 E35F9288 C079550D 0591AEE8 8E531E74 75FE3578 2F6D829A F60B21AE 95E8EB8D 6699486B 901D7D9B FD6D6E31 1090ACEF E0670DD8 DAB2E692 CD6D4365 E5393514 3AF345F0 6241FC4D 460DA3A3 7BCF3729 8BF1D1E0 14AAC070 1587ED55 3AFD7D3E D2F29E01 29A9D1F6 EFB10C53 CF3B870F B414935C 664465ED 024ACAC7 59A744C1 1D2936A7 DC580AA6 CF574CA8 040A7A10 6CD81807 8A98BE4C ACCEA063 C33E92B5 D1E0E03D B322517E 2092BD13 386B2C4A 52E8DD58 58656DFB 50820371 41811896 E337EF7E D39FB119 C97F0DF6 68FEA01B A150A6E5 55258962 EB6FF41B D7C9CD7A A619CD9E BCF09576 2672C073 F003FB3C 4AB7A50B 1484126A 487BA9B1 A64FC9C6 F6957D49 38B06A75 DD805FCD 63D094CF F51C999E 1AA4D343 B8495294 CE9F8E99 BFFCD770 C7C275CC 378453A7 7B21BE33 397F41BD 4E94D131 92CC1F98 5915EA51 99F861B7 C9980A88 1D74FD5F B0A495F8 614DEED0 B5778EEA 5941792D FA90C1F8 33F824B4 C4965372 3FF6D550 4CA5FEC0 8630E964 5B3FBBD6 7DA26A48 B203231A 04297514 2D639306 2EB13149 16A45272 532459A0 8E5F4872 F966C7D9 07128DC0 0D44DB62 AFC8D52D 06316131 D838E7CE 1BC41D00 3A2E8C0F EA83837E B984737D 13BA4891 C4F8B949 A6D6ACB3 A215CDCE 8359838B 6BD1AA31 F579DD52 21B93F93 F5176781 187DFDDE E94AEB76 2B38FD54 431DE1DA AB394825 9AD3048F DFEA32AA 659473E3 623F7863 F3346C59 AB3AB685 3346A90B 6B56443E C6DE01F8 8D421FC0 9B0ED10C 88F1A1E9 54C1F029 7DEAD57B 8D7BA426 4CF5178A 551A7CCA 1A9A5F08 FCD651B9 25605182 E11FC6C3 B6FD9676 337B3027 B7C8EB14 9E5FD030 S-box 1: 6B57E354 AD913CF7 7E16688D 58872A69 2C2FC7DF E389CCC6 30738DF1 0824A734 E1797A8B A4A8D57B 5B5D193B C8A8309B 73F9A978 73398D32 0F59573E E9DF2B03 E8A5B6C8 848D0704 98DF93C2 720A1DC3 684F259A 943BA848 A6370152 863B5EA3 D17B978B 6D9B58EF 0A700DD4 A73D36BF 8E6A0829 8695BC14 E35B3447 933AC568 8894B022 2F511C27 DDFBCC3C 006662B6 117C83FE 4E12B414 C2BCA766 3A2FEC10 F4562420 55792E2A 46F5D857 CEDA25CE C3601D3B 6C00AB46 EFAC9C28 B3C35047 611DFEE3 257C3207 FDD58482 3B14D84F 23BECB64 A075F3A3 088F8EAD 07ADF158 7796943C FACABF3D C09730CD F7679969 DA44E9ED 2C854C12 35935FA3 2F057D9F 690624F8 1CB0BAFD 7B0DBDC6 810F23BB FA929A1A 6D969A17 6742979B 74AC7D05 010E65C4 86A3D963 F907B5A0 D0042BD3 158D7D03 287A8255 BBA8366F 096EDC33 21916A7B 77B56B86 951622F9 A6C5E650 8CEA17D1 CD8C62BC A3D63433 358A68FD 0F9B9D3C D6AA295B FE33384A C000738E CD67EB2F E2EB6DC2 97338B02 06C9F246 419CF1AD 2B83C045 3723F18A CB5B3089 160BEAD7 5D494656 35F8A74B 1E4E6C9E 000399BD 67466880 B4174831 ACF423B2 CA815AB3 5A6395E7 302A67C5 8BDB446B 108F8FA4 10223EDA 92B8B48B 7F38D0EE AB2701D4 0262D415 AF224A30 B3D88ABA F8B2C3AF DAF7EF70 CC97D3B7 E9614B6C 2BAEBFF4 70F687CF 386C9156 CE092EE5 01E87DA6 6CE91E6A BB7BCC84 C7922C20 9D3B71FD 060E41C6 D7590F15 4E03BB47 183C198E 63EEB240 2DDBF49A 6D5CBA54 923750AF F9E14236 7838162B 59726C72 81B66760 BB2926C1 48A0CE0D A6C0496D AD43507B 718D496A 9DF057AF 44B1BDE6 054356DC DE7CED35 D51A138B 62088CC9 35830311 C96EFCA2 686F86EC 8E77CB68 63E1D6B8 C80F9778 79C491FD 1B4C67F2 72698D7D 5E368C31 F7D95E2E A1D3493F DCD9433E 896F1552 4BC4CA7A A6D1BAF4 A5A96DCC 0BEF8B46 A169FDA7 74DF40B7 4E208804 9A756607 038E87C8 20211E44 8B7AD4BF C6403F35 1848E36D 80BDB038 1E62891C 643D2107 BF04D6F8 21092C8C F644F389 0778404E 7B78ADB8 A2C52D53 42157ABE A2253E2E 7BF3F4AE 80F594F9 953194E7 77EB92ED B3816930 DA8D9336 BF447469 F26D9483 EE6FAED5 71371235 DE425F73 B4E59F43 7DBE2D4E 2D37B185 49DC9A63 98C39D98 1301C9A2 389B1BBF 0C18588D A421C1BA 7AA3865C 71E08558 3C5CFCAA 7D239CA4 0297D9DD D7DC2830 4B37802B 7428AB54 AEEE0347 4B3FBB85 692F2F08 134E578E 36D9E0BF AE8B5FCF EDB93ECF 2B27248E 170EB1EF 7DC57FD6 1E760F16 B1136601 864E1B9B D7EA7319 3AB871BD CFA4D76F E31BD782 0DBEB469 ABB96061 5370F85D FFB07E37 DA30D0FB EBC977B6 0B98B40F 3A4D0FE6 DF4FC26B 159CF22A C298D6E2 2B78EF6A 61A94AC0 AB561187 14EEA0F0 DF0D4164 19AF70EE
The key for MARS may be from 4 to 39 words in length. These words are considered to be little-endian in format, so the first byte of the key is the least significant byte of the first word.
To generate the key, we use an array of words which is considered to have subscripts ranging from -7 to 39. The first seven words of this array, numbered from -7 to -1, are filled with the first seven words in the S-box (or the first seven words in S-box zero).
Then, words 0 to 38 of the array are initialized in order with the following quantity:
The XOR of the following four items:
Word 39 of the array is loaded with the number of words in the external key.
At this point, we forget words -7 through -1 of the array, and treat the array as containing only words 0 through 39.
Then, seven times over, starting from word 1 of the array, add to each word of the array the S-box entry indicated by the most significant nine bytes of the preceding element of the array. Note that word 39 is considered as preceding word 0, the last word to be modified by this loop.
Finally, word i of this temporary array, for i from 0 to 39, becomes subkey (7 * i) modulo 40.
A proposed change to the definition of MARS is to replace this key generation procedure, which generates 40 subkey words in one step, with four instances of the following, each of which generates 10 subkey words:
To generate the key, we use an array of words which is considered to have subscripts ranging from 0 to 14. The key is placed in the array starting with word 0, the word in the array immediately following the last word of the key is filled with the number of words in the key, and the remainder of the array is filled with zeroes.
What follows is done four times, each time producing 10 words of subkey.
Then, words 0 to 14 of the array are initialized in order with the following quantity:
The XOR of the following five items:
Then, four times over, starting from word 1 of the array, add to each word of the array the S-box entry indicated by the most significant nine bits of the preceding element of the array. Note that word 14 is considered as preceding word 0, the last word to be modified by this loop.
Finally, word (4 * i) modulo 15 of this temporary array, for i from 0 to 9, becomes subkey i plus 10 times the iteration number (in the first iteration, considered iteration number zero, we generate subkeys 0 through 9, in the second iteration we generate subkeys 10 through 19, and so on).
Subkeys 5, 7, 9, 11, ... to 35 are used to multiply the first word of the block within the E function. These subkey values are modified if necessary to ensure that they are good values for multiplication.
The method of correcting these subkeys is somewhat involved, and goes as follows:
Call the original value of the subkey SR.
Let SM be SR or 3.
Let IX be SR and 3.
Create MA such that only those bits in MA corresponding to a run of ten or more bits in SM are ones, excluding the first and last bit of each run. (If MA is zero, SR does not need to be altered, so you can quit.)
Originally, the reference code implementing this procedure left the first bit of MA equal to one if the first bit of SM was zero. A proposed modification to MARS is to change this to conform to the written description.
Let MA be MA and FFFFFFFC - that is, set its last two bits to zero.
Use IX to select an element from this array (containing four values from the S-boxes):
0: A4A8D57B 1: 5B5D193B 2: C8A8309B 3: 73F9A978
then rotate the value right by the amount indicated by the element three places ahead
The proposed modification changes this to one place back. (Since only odd-numbered subkey words are modified, this will always lead to a word in the same group of 10 words as the one being modified.)
in the array of words of internal key, and then modify the current key word by XORing it with that result ANDed with MA.
As noted above, the unkeyed mixing rounds of MARS seemed to me to be somewhat wasteful. However, I was one of what seemed to be only a few people who liked MARS, thinking it one of the likeliest candidates to remain secure for a considerable time to come. Originally, I could not think of a good way to modify the key schedule to produce keys for the mixing rounds. Once the key schedule was modified, however, I saw a natural way to do this, which I described in a comment to Round 2 of the AES process. However, a second comment in which I made a correction to that proposal got garbled.
My proposal, as corrected, is:
Using the new key schedule, after generating subkeys 30 to 39, the process that is applied to the array of 15 words to generate a batch of 10 subkeys is to be performed once again. Once this is done, the contents of that array, the elements of which are known as T through T, are to be modified as follows:
T = T xor subkey 6 T = T xor subkey 8 T = T xor subkey 10 T = T xor subkey 12 T = T xor subkey 14 ... T = T xor subkey 26 T = T xor subkey 28 T = T xor subkey 1 T = T xor subkey 2 T = T xor subkey 3
The intent of this is to cause the contents of the array T to be a non-invertible function of the key. This is done by XORing the contents of T after an iteration of the invertible transformation function with values generated from it previous to that iteration.
The values chosen are all values used as subkey values, so that additional memory is not required to retain other information to be used for this purpose. For that reason, the subkeys that were modified in order to be used for the multiplication portion of the E function in the cryptographic core rounds of MARS are also avoided.
Then, two additional iterations of the process previously used to generate a set of 10 subkey words are performed, but from each of them only eight subkeys are now taken. Each subkey is used to provide four bytes to XOR to the values used to index into the MARS S-boxes in one of the mixing rounds.
Because there are now seven, rather than four, instances of the procedure to modify the contents of the array T, the number of the array word being calculated, when XORed with the contents of that word, should be shifted three bits left instead of two bits left.
In my proposal, I did not specify how to take eight subkeys from each of the last two key generation phases. I now suggest that, instead of choosing the subkeys based on (7 * i) mod 15, for each of these last two phases, the eight subkeys taken should simply be from words T, T, ... T, T, thus always using the last ones modified, and being different from the scheme used for the rest of MARS. However, if the subkeys used for the first group of mixing rounds are subkeys 40 to 47 in order, and for the second group 48 to 55 in order, the words taken from these rounds should be allocated to the subkeys as follows:
T T T T T T T T Iteration 6 48 46 50 44 52 42 54 40 Iteration 7 47 49 45 51 43 53 41 55
Table of Contents