
Error-Correcting Codes

After text has been compressed, and even more so after it has been encrypted, the resulting output is random in appearance. This makes it easy to make mistakes in handling it, and hard to notice errors if they are introduced through electrical noise on a communications link or any similar cause.

Thus, even if redundancy is removed as far as is possible before encryption by means of data compression, after encryption is complete, it becomes desirable to put redundancy back in, to give resistance to errors.

There are mathematical techniques which, given an arbitrary text, add more symbols to it in a way that gives resistance to a specific number of errors. Since these techniques depend only on the input text, if the ciphertext is used as input, this kind of added redundancy does not help the cryptanalyst, except by giving both him and the legitimate recipient a copy of the ciphertext that is more likely to be accurate.

Telegraph codes, for instance, were designed so that an error in a single letter of a codeword, or the switching of two adjacent letters in a codeword, would not result in another valid codeword, through the use of tables such as this:

AA AB AC AD AE AF AG AH AI AJ AK AL AM AN AO AP AQ AR AS AT AU AV AW AX AY AZ A* 
BB BC BD BE BF BG BH BI BJ BK BL BM BN BO BP BQ BR BS BT BU BV BW BX BY BZ B* BA 
CC CD CE CF CG CH CI CJ CK CL CM CN CO CP CQ CR CS CT CU CV CW CX CY CZ C* CA CB 
DD DE DF DG DH DI DJ DK DL DM DN DO DP DQ DR DS DT DU DV DW DX DY DZ D* DA DB DC 
EE EF EG EH EI EJ EK EL EM EN EO EP EQ ER ES ET EU EV EW EX EY EZ E* EA EB EC ED 
FF FG FH FI FJ FK FL FM FN FO FP FQ FR FS FT FU FV FW FX FY FZ F* FA FB FC FD FE 
GG GH GI GJ GK GL GM GN GO GP GQ GR GS GT GU GV GW GX GY GZ G* GA GB GC GD GE GF 
HH HI HJ HK HL HM HN HO HP HQ HR HS HT HU HV HW HX HY HZ H* HA HB HC HD HE HF HG 
II IJ IK IL IM IN IO IP IQ IR IS IT IU IV IW IX IY IZ I* IA IB IC ID IE IF IG IH 
JJ JK JL JM JN JO JP JQ JR JS JT JU JV JW JX JY JZ J* JA JB JC JD JE JF JG JH JI 
KK KL KM KN KO KP KQ KR KS KT KU KV KW KX KY KZ K* KA KB KC KD KE KF KG KH KI KJ 
LL LM LN LO LP LQ LR LS LT LU LV LW LX LY LZ L* LA LB LC LD LE LF LG LH LI LJ LK 
MM MN MO MP MQ MR MS MT MU MV MW MX MY MZ M* MA MB MC MD ME MF MG MH MI MJ MK ML 
NN NO NP NQ NR NS NT NU NV NW NX NY NZ N* NA NB NC ND NE NF NG NH NI NJ NK NL NM 
OO OP OQ OR OS OT OU OV OW OX OY OZ O* OA OB OC OD OE OF OG OH OI OJ OK OL OM ON 
PP PQ PR PS PT PU PV PW PX PY PZ P* PA PB PC PD PE PF PG PH PI PJ PK PL PM PN PO 
QQ QR QS QT QU QV QW QX QY QZ Q* QA QB QC QD QE QF QG QH QI QJ QK QL QM QN QO QP 
RR RS RT RU RV RW RX RY RZ R* RA RB RC RD RE RF RG RH RI RJ RK RL RM RN RO RP RQ 
SS ST SU SV SW SX SY SZ S* SA SB SC SD SE SF SG SH SI SJ SK SL SM SN SO SP SQ SR 
TT TU TV TW TX TY TZ T* TA TB TC TD TE TF TG TH TI TJ TK TL TM TN TO TP TQ TR TS 
UU UV UW UX UY UZ U* UA UB UC UD UE UF UG UH UI UJ UK UL UM UN UO UP UQ UR US UT 
VV VW VX VY VZ V* VA VB VC VD VE VF VG VH VI VJ VK VL VM VN VO VP VQ VR VS VT VU 
WW WX WY WZ W* WA WB WC WD WE WF WG WH WI WJ WK WL WM WN WO WP WQ WR WS WT WU WV 
XX XY XZ X* XA XB XC XD XE XF XG XH XI XJ XK XL XM XN XO XP XQ XR XS XT XU XV XW 
YY YZ Y* YA YB YC YD YE YF YG YH YI YJ YK YL YM YN YO YP YQ YR YS YT YU YV YW YX 
ZZ Z* ZA ZB ZC ZD ZE ZF ZG ZH ZI ZJ ZK ZL ZM ZN ZO ZP ZQ ZR ZS ZT ZU ZV ZW ZX ZY 

M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z  *  A  B  C  D  E  F  G  H  I  J  K  L    AA BB CC DD EE FF GG HH II JJ KK LL MM NN OO PP QQ RR SS TT UU VV WW XX YY ZZ 
N  O  P  Q  R  S  T  U  V  W  X  Y  Z  *  A  B  C  D  E  F  G  H  I  J  K  L  M    A* BA CB DC ED FE GF HG IH JI KJ LK ML NM ON PO QP RQ SR TS UT VU WV XW YX ZY 
O  P  Q  R  S  T  U  V  W  X  Y  Z  *  A  B  C  D  E  F  G  H  I  J  K  L  M  N    AZ B* CA DB EC FD GE HF IG JH KI LJ MK NL OM PN QO RP SQ TR US VT WU XV YW ZX 
P  Q  R  S  T  U  V  W  X  Y  Z  *  A  B  C  D  E  F  G  H  I  J  K  L  M  N  O    AY BZ C* DA EB FC GD HE IF JG KH LI MJ NK OL PM QN RO SP TQ UR VS WT XU YV ZW 
Q  R  S  T  U  V  W  X  Y  Z  *  A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P    AX BY CZ D* EA FB GC HD IE JF KG LH MI NJ OK PL QM RN SO TP UQ VR WS XT YU ZV 
R  S  T  U  V  W  X  Y  Z  *  A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q    AW BX CY DZ E* FA GB HC ID JE KF LG MH NI OJ PK QL RM SN TO UP VQ WR XS YT ZU 
S  T  U  V  W  X  Y  Z  *  A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R    AV BW CX DY EZ F* GA HB IC JD KE LF MG NH OI PJ QK RL SM TN UO VP WQ XR YS ZT 
T  U  V  W  X  Y  Z  *  A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S    AU BV CW DX EY FZ G* HA IB JC KD LE MF NG OH PI QJ RK SL TM UN VO WP XQ YR ZS 
U  V  W  X  Y  Z  *  A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T    AT BU CV DW EX FY GZ H* IA JB KC LD ME NF OG PH QI RJ SK TL UM VN WO XP YQ ZR 
V  W  X  Y  Z  *  A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U    AS BT CU DV EW FX GY HZ I* JA KB LC MD NE OF PG QH RI SJ TK UL VM WN XO YP ZQ 
W  X  Y  Z  *  A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V    AR BS CT DU EV FW GX HY IZ J* KA LB MC ND OE PF QG RH SI TJ UK VL WM XN YO ZP 
X  Y  Z  *  A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W    AQ BR CS DT EU FV GW HX IY JZ K* LA MB NC OD PE QF RG SH TI UJ VK WL XM YN ZO 
Y  Z  *  A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X    AP BQ CR DS ET FU GV HW IX JY KZ L* MA NB OC PD QE RF SG TH UI VJ WK XL YM ZN 
Z  *  A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y    AO BP CQ DR ES FT GU HV IW JX KY LZ M* NA OB PC QD RE SF TG UH VI WJ XK YL ZM 
*  A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z    AN BO CP DQ ER FS GT HU IV JW KX LY MZ N* OA PB QC RD SE TF UG VH WI XJ YK ZL 
A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z  *    AM BN CO DP EQ FR GS HT IU JV KW LX MY NZ O* PA QB RC SD TE UF VG WH XI YJ ZK 
B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z  *  A    AL BM CN DO EP FQ GR HS IT JU KV LW MX NY OZ P* QA RB SC TD UE VF WG XH YI ZJ 
C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z  *  A  B    AK BL CM DN EO FP GQ HR IS JT KU LV MW NX OY PZ Q* RA SB TC UD VE WF XG YH ZI 
D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z  *  A  B  C    AJ BK CL DM EN FO GP HQ IR JS KT LU MV NW OX PY QZ R* SA TB UC VD WE XF YG ZH 
E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z  *  A  B  C  D    AI BJ CK DL EM FN GO HP IQ JR KS LT MU NV OW PX QY RZ S* TA UB VC WD XE YF ZG 
F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z  *  A  B  C  D  E    AH BI CJ DK EL FM GN HO IP JQ KR LS MT NU OV PW QX RY SZ T* UA VB WC XD YE ZF 
G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z  *  A  B  C  D  E  F    AG BH CI DJ EK FL GM HN IO JP KQ LR MS NT OU PV QW RX SY TZ U* VA WB XC YD ZE 
H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z  *  A  B  C  D  E  F  G    AF BG CH DI EJ FK GL HM IN JO KP LQ MR NS OT PU QV RW SX TY UZ V* WA XB YC ZD 
I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z  *  A  B  C  D  E  F  G  H    AE BF CG DH EI FJ GK HL IM JN KO LP MQ NR OS PT QU RV SW TX UY VZ W* XA YB ZC 
J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z  *  A  B  C  D  E  F  G  H  I    AD BE CF DG EH FI GJ HK IL JM KN LO MP NQ OR PS QT RU SV TW UX VY WZ X* YA ZB 
K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z  *  A  B  C  D  E  F  G  H  I  J    AC BD CE DF EG FH GI HJ IK JL KM LN MO NP OQ PR QS RT SU TV UW VX WY XZ Y* ZA 
L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z  *  A  B  C  D  E  F  G  H  I  J  K    AB BC CD DE EF FG GH HI IJ JK KL LM MN NO OP PQ QR RS ST TU UV VW WX XY YZ Z* 

A miniature version of such a code construction table is shown in The Codebreakers by David Kahn, but here I've put the last two letters of the codeword in alphabetical order in the rows of the square on the lower right, since a code compiler would want to generate codewords in alphabetical order. The row of the top square, and the column of the square on the lower right, which contain digraphs beginning with *, are omitted, since those are unused if codewords from a 26-letter alphabet are wanted.

Valid codewords consist of two letters from the top, a letter from the lower left, and two letters on the lower right, such that the single letter shares the same column as the pair of letters from the top and the same row as the pair of letters from the lower right. The use of an extra dummy letter * is required to avoid codewords differing only by a swap of two adjacent letters, because there are an even number of letters in the alphabet. The middle square can start with any letter; here it is started with M to indicate that this feature can be used to produce codewords different from those in someone else's code.

Thus, the first few codewords in alphabetical order that can be generated from the square above are:

AAAAM AAABN AAACO AAADP AAAEQ AAAFR AAAGS AAAHT
AAAIU AAAJV AAAKW AAALX AAAMY AAANZ AAAPA AAAQB
AAARC AAASD AAATE AAAUF AAAVG AAAWH AAAXI AAAYJ
AAAZK
AABAL AABBM AABCN AABDO ...
AACAK AACBL AACCM AACDN ...
AADAJ AADBK AADCL AADDM

The underlying principle of the diagram above is simply, for a five-character codeword that can be represented as abcde, to keep a-b+c-d+e a constant value, modulo 27.
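As a concrete illustration, here is a short sketch in Python (the names are my own) verifying that codewords from the table share a constant alternating sum:

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ*"   # A=0 through Z=25, with *=26

def alternating_sum(word):
    # a - b + c - d + e, modulo 27, for a five-letter codeword abcde
    a, b, c, d, e = (ALPHABET.index(ch) for ch in word)
    return (a - b + c - d + e) % 27

# every codeword generated above keeps the constant 12, the value of M,
# the letter with which the middle square was started:
for word in ("AAAAM", "AAABN", "AAAZK", "AABAL", "AACAK", "AADAJ"):
    assert alternating_sum(word) == 12

# changing one letter always changes the sum, and swapping two adjacent
# letters changes it by twice their difference, never zero modulo odd 27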

The fact that this type of coding only protects against two characters being swapped when the alphabet has an odd number of letters is also the reason why the check digit for ISBN numbers (okay, that stands for International Standard Book Number, so I was redundant) can be an X in addition to a digit from 0 to 9: the check equation is worked modulo 11, an odd number, and an X stands for a check value of ten.
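For the ISBN-10 check digit itself, a minimal sketch in Python (the function name is mine):

def isbn10_check_digit(first_nine):
    # choose the tenth digit so that 1*d1 + 2*d2 + ... + 10*d10 is
    # divisible by 11; a check value of ten is written as the letter X
    total = sum(i * d for i, d in enumerate(first_nine, start=1))
    check = total % 11
    return "X" if check == 10 else str(check)

print(isbn10_check_digit([0, 3, 0, 6, 4, 0, 6, 1, 5]))   # prints 2, for 0-306-40615-2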

For example, given a string of bits, a single bit which contains the parity of that string of bits can be appended to it. In that case, if the result is transmitted, and exactly one bit is in error in what is received, the parity will be wrong, and the fact that an error has taken place will be detected.

Another simple example would be to transmit one's message three times. For every group of three corresponding bits from the three copies of the message, if exactly one error takes place in communications, that error can be corrected, because, of the three received copies of that bit, two will be correct, and only one will be wrong.
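Both of these simple schemes take only a few lines of Python (a sketch; the function names are my own):

def add_parity(bits):
    # append one bit making the total number of 1 bits even
    return bits + [sum(bits) % 2]

def parity_error(received):
    # True whenever an odd number of bits was flipped in transit
    return sum(received) % 2 != 0

def triplicate(bits):
    return [b for b in bits for _ in range(3)]

def majority_decode(received):
    # let the majority rule within each group of three received bits,
    # correcting any single error in the group
    return [int(sum(received[i:i + 3]) >= 2)
            for i in range(0, len(received), 3)]

word = add_parity([1, 0, 1, 1])
word[2] ^= 1                                   # one bit flipped in transit
assert parity_error(word)                      # the error is detected

sent = triplicate([1, 0, 1, 1])
sent[4] ^= 1                                   # one bit flipped in transit
assert majority_decode(sent) == [1, 0, 1, 1]   # and corrected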

Repeating a bit an odd number of times is a trivial example of a perfect code. A parity bit, and repetition an even number of times, are both examples of extended codes based on perfect codes: an extended code is formed from any code by adding a parity bit, and adding no error-checking bits at all to a block is also a trivial example of a perfect code.

If we confine ourselves to the even parity form of a parity bit, both these codes can be represented by matrices:

A parity bit:

1 0 0 0 0 0 0 0 1
0 1 0 0 0 0 0 0 1
0 0 1 0 0 0 0 0 1
0 0 0 1 0 0 0 0 1
0 0 0 0 1 0 0 0 1
0 0 0 0 0 1 0 0 1
0 0 0 0 0 0 1 0 1
0 0 0 0 0 0 0 1 1

Triple repetition:

1 1 1

In general, matrix multiplication works like this: the numbers which are the elements of the vector being multiplied are introduced from the left, starting with the first one at the top; each number introduced from the side is multiplied by each of the numbers in its row of the matrix; then, after all the products have been calculated, they are totalled down the columns to obtain the output.

For error-correcting codes illustrated by matrices, it is the bits of the message that are introduced from the left. Then, an AND is performed (rather than a multiplication) between them and the bits of the matrix, and the results are accumulated by an XOR going down the columns (rather than addition).

Illustrating this in the case of a parity bit applied to a nybble:

        1   0   0   0   1
  1 -->  1   0   0   0   1

        0   1   0   0   1
  0 -->  0   0   0   0   0

        0   0   1   0   1
  1 -->  0   0   1   0   1

        0   0   0   1   1
  1 -->  0   0   0   1   1
        -------------------
         1   0   1   1   1

As this is a code applied to a binary signal, all arithmetic is done modulo 2 (hence, 1+1+1 = 1 instead of 3).
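This form of encoding is easy to sketch in Python (a generic routine of my own devising, shown acting on the nybble-plus-parity matrix above):

def encode(bits, matrix):
    # introduce each message bit from the left, AND it with its row,
    # and accumulate the surviving rows down the columns with XOR
    out = [0] * len(matrix[0])
    for bit, row in zip(bits, matrix):
        if bit:
            out = [o ^ r for o, r in zip(out, row)]
    return out

nybble_parity = [[1, 0, 0, 0, 1],
                 [0, 1, 0, 0, 1],
                 [0, 0, 1, 0, 1],
                 [0, 0, 0, 1, 1]]

print(encode([1, 0, 1, 1], nybble_parity))   # [1, 0, 1, 1, 1], as shown above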

Just using parity bits can allow correcting errors. One way to do this is to have both a parity bit for each byte of data, and a parity byte that results from the XOR of all the data bytes. If there is one error, the byte it is in is indicated by the parity bit for that byte, and the bit it is in is indicated by which bit of the final parity byte shows the wrong parity.
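A sketch of this correction scheme in Python (the representation and names are my own):

def correct_single_error(data, row_parity, parity_byte):
    # data: the received data bytes, as integers from 0 to 255;
    # row_parity[i]: the even-parity bit computed for byte i by the sender;
    # parity_byte: the XOR of all the data bytes, as computed by the sender
    bad_rows = [i for i, b in enumerate(data)
                if bin(b).count("1") % 2 != row_parity[i]]
    syndrome = parity_byte
    for b in data:
        syndrome ^= b                  # a nonzero bit marks the bad column
    if len(bad_rows) == 1 and bin(syndrome).count("1") == 1:
        data[bad_rows[0]] ^= syndrome  # flip the single offending bit
    # other syndrome patterns indicate an error in the check information itself
    return data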

A simple way to obtain row and column parity for a sequence of characters on a computer would be to add three parity characters to a line of 49 7-bit characters, each parity character being accumulated by an XOR of the characters in the line, but with the accumulators being rotated by differing amounts as the line is taken in.

This would allow one single-bit error in a line to be found, and therefore corrected, as follows:

33333X3-----*------*------*------*------*------*- ==?
----2-13333*3*----2-1----2-1----2-1----2-1----2-1 ===
1--2---1--2---*33*3331--2---1--2---1--2---1--2--- ?==
-12-----12-----12----3**3333-12-----12-----12---- ===
-21-----21-----21-----21----3**3333-21-----21---- ===
2--1---2--1---2--1---2--1---2--1---*33*3332--1--- =?=
----1-2----1-2----1------1-2----1-2----1-23333*3* ===

The X is at the only point where the three lines of bits marked by 1, 2, and 3 all coincide. Note that the asterisks mark points where only two of them cross. Assuming that the shifts performed after each byte is XORed in take place even after the last byte, the question marks indicate where the three parity characters would show the wrong parity if there were an error at the point marked by the X.

It is easier to do this for 7-bit characters because 7 is an odd number, but it can also be done for 8-bit bytes, simply by rotating only one accumulator by one bit for each byte, and not rotating one accumulator at all, as follows:

33313333---1-------1-------1-------1-------1-------1-------1---- ==?
----1---3333*333----1-------1-------1-------1-------1-------1--- ===
22222*2222222*22*****X**22222*2222222*2222222*2222222*2222222*22 =?=
------1-------1-------1-333333*3------1-------1-------1-------1- ===
-------1-------1-------1-------13333333*-------1-------1-------1 ===
1-------1-------1-------1-------1-------*33333331-------1------- ?==
-1-------1-------1-------1-------1-------1------3*333333-1------ ===
--1-------1-------1-------1-------1-------1-------1-----33*33333 ===

However, approaches based only on using parity bits are unsophisticated, and more efficient ways of correcting and detecting multiple errors can be found if more sophisticated codes are used.

Another class of error-correcting codes is the Hamming codes. No longer trivial, these codes are also perfect, which means they add exactly enough redundancy to a group of input bits to detect and correct a certain number of errors.

Note that this perfection applies to one particular situation, where errors occur randomly among bits. Where there is a tendency for bits close together in a message to be in error together, this being called a burst error, other strategies are needed. One common way of dealing with this is simply to apply conventional error-correcting codes to widely-separated bits at equal intervals in a message. This, or the strategy of applying the error-correcting code first, and then spreading out the bits of each output group, is called interleaving.

The Hamming Codes

In a Hamming code, the extra parity bits that follow the group of bits being encoded are formed by allocating, to each bit position in the input, a distinct combination of the check bits in which two or more of them are equal to 1. Hence, changing one bit in the input always changes at least three bits in the word with error-correction added. Since no two rows in the error-checking part of the matrix are identical, changing two bits in the input still results in at least a three-bit change; thus, the number of bits of difference (or the Hamming distance) between possible outputs (valid codewords) is a minimum of three.

A parity bit can be added to a Hamming code; the result is no longer a perfect code, but it is still an optimal one, this time with at least a four-bit change resulting from any change in the input.

The Hamming distance between valid codewords tells us how much error-checking we have. If there is no error-checking added, each different input still gives a different output, for a distance of 1. If the distance is 2, one error can be detected, as is the case for the code that just adds one parity bit. If the distance is 3, as for repeating a bit three times, or the basic Hamming code without a parity bit added, one error can be corrected.

The error-correcting capabilities of an error-correcting code are given by the Hamming distance between codewords, but it is possible to choose how to make use of them. For example, if we repeat every bit three times, we can either choose to accept 011 as standing for 1, hence correcting one error when it occurs, but being fooled if two errors happen, or we can choose to discard it, accepting only 000 and 111. Thus, a Hamming distance of 3 can be used either to correct one error, or to detect two errors. A distance of 4, such as obtained by a Hamming code with a parity bit, or by repeating a bit four times, allows correcting one error and detecting two errors, or detecting up to three errors.

Some Hamming codes are:

1 0 0 0 0 1 1
0 1 0 0 1 0 1
0 0 1 0 1 1 0
0 0 0 1 1 1 1

1 0 0 0 0 0 0 0 0 0 0 0 0 1 1
0 1 0 0 0 0 0 0 0 0 0 0 1 0 1
0 0 1 0 0 0 0 0 0 0 0 0 1 1 0
0 0 0 1 0 0 0 0 0 0 0 1 0 0 1
0 0 0 0 1 0 0 0 0 0 0 1 0 1 0
0 0 0 0 0 1 0 0 0 0 0 1 1 0 0
0 0 0 0 0 0 1 0 0 0 0 0 1 1 1
0 0 0 0 0 0 0 1 0 0 0 1 0 1 1
0 0 0 0 0 0 0 0 1 0 0 1 1 0 1
0 0 0 0 0 0 0 0 0 1 0 1 1 1 0
0 0 0 0 0 0 0 0 0 0 1 1 1 1 1

1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0
0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1
0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0
0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0
0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 1
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 1 0 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 1 1 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 1 0 0 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 1 0 1 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 1 1 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 1 1 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 1 0 1 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 1 1 0 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 1 1 1 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1

Since a Hamming code, as shown, can only correct one error, decoding is simple: recompute the check bits from the data bits as received, and XOR them with the check bits as received, producing the syndrome. If the syndrome is zero, the block is assumed correct. If the syndrome matches the combination of check bits allocated to one of the data bits (that is, one of the rows of the error-checking part of the matrix), that data bit is inverted. Otherwise, the syndrome will contain only a single 1 bit, indicating that one of the check bits itself was in error, and the data bits are correct as received.

This, of course, assumes that the transmitted block actually was subjected to no more than one error. When a parity bit is added to the Hamming code, decoding is modified slightly: the overall parity of the received block is examined as well. If the parity is wrong, a single error is assumed, and it is corrected as above (when the syndrome is zero, it is the parity bit itself that is in error). If the parity is right, but the syndrome is not zero, a double error has been detected, which cannot be corrected.
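A sketch of the smallest of these codes in Python, with decoding as just described (the helper names are my own; the check-bit patterns are the last three columns of the first matrix above):

CHECK = [[0, 1, 1],
         [1, 0, 1],
         [1, 1, 0],
         [1, 1, 1]]

def hamming_encode(data):                # data: four bits
    check = [0, 0, 0]
    for bit, row in zip(data, CHECK):
        if bit:
            check = [c ^ r for c, r in zip(check, row)]
    return data + check

def hamming_decode(word):                # word: the seven received bits
    data, check = list(word[:4]), word[4:]
    syndrome = [a ^ b for a, b in zip(hamming_encode(data)[4:], check)]
    if syndrome in CHECK:                # matches one data bit's pattern
        data[CHECK.index(syndrome)] ^= 1
    # otherwise a check bit (or nothing) was in error: data is good as is
    return data

word = hamming_encode([1, 0, 1, 1])
word[1] ^= 1                             # a single error in transmission
assert hamming_decode(word) == [1, 0, 1, 1]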

These are just one particular form for each of the three Hamming codes shown. Rearranging the rows or columns, or certain other modifications, will not change the ability of the code to be used to detect or correct errors. Replacing one row of the matrix by the XOR of itself and another row is allowed, because that will result in the matrix generating exactly the same set of codewords, but they will be generated in some cases for different inputs.

As we have noted, the minimum Hamming distance between valid symbols is 3 for a Hamming code without parity bit, and 4 for a Hamming code with parity bit.

A notation for error correcting codes exists, where a [t, i, d] code is one which uses t-bit symbols, containing i bits of information, having a minimum Hamming distance of d between valid symbols. The sequence of Hamming codes is:

Without          With
Parity           Parity

[7,4,3]          [8,4,4]
[15,11,3]        [16,11,4]
[31,26,3]        [32,26,4]
[63,57,3]        [64,57,4]
[127,120,3]      [128,120,4]
[255,247,3]      [256,247,4]

  n    n           n  n
[2 -1,2 -n-1,3]  [2 ,2 -n-1,4]

As the total length of a symbol is 2^n for a code with parity, this means that even though the number of check bits is small, if one wishes to use a Hamming code for error checking in a computer memory whose word length is a power of two, one has just slightly gone past the limit for a given number of check bits.

Thus, for example, single error correction and double error detection for a 64-bit word requires eight check bits instead of seven. This is not too inconvenient a number to provide. Also, this inefficiency provides some freedom of choice to the designer.

The seven check bits in the [127,120,3] code without parity correspond, for each of the 120 data bits, to every possible combination of seven bits in which two or more of the bits are equal to one.

These combinations can be broken down by number of bits into the following categories:

Two bits:       21
Three bits:     35
Four bits:      35
Five bits:      21
Six bits:        7
Seven bits:      1

If one takes only the combinations with an even number of bits, there are 21 + 35 + 7 of them, or 63. This is, unfortunately, less than 64, but it still means one could have a code with a parity bit based on a subset of the Hamming code, where only one of the data bits determines whether the error-correcting portion should have the same parity, or the opposite parity, as the rest of the data bits.

As noted above, one can obtain an equivalent code from the matrix for a code by replacing a row with the XOR of another row with itself. Doing this for a code with a parity bit will mean the code no longer has an explicit parity bit.

The additional flexibility this provides allows one to create a [72,64,4] code which has the additional property that the check bits always have odd parity. In fact, obtaining that property is now trivial, in itself.

U. S. Patent 3,623,155 is for a code of this type, devised by Mu-Yue Hsiao at IBM, which was optimized with great effort for far more important properties as well: maximizing the percentage of the time when four-bit errors are detected as errors instead of leading to a valid symbol (which will be most of the time in any case); minimizing the percentage of the time when three-bit errors are assumed to be two-bit errors that can be corrected (this, unavoidably, will occur about half the time, but it can be worse); and allowing a reduction in the gate count, and in the maximum gate delay, required for an implementation. (It also had the property that the calculation of this error-correcting code could produce byte parity as a byproduct.)

This invention came after IBM had already produced at least two computer systems, the IBM 7030 (Stretch) and the IBM System/360 Model 85, which used a [72,64,4] code for memory error correction. It is far from the only attempt to improve upon the Hamming code for this purpose; more recently, another such code was devised by Shigeo Kaneda at Nippon Telegraph and Telephone. This code is called a SEC-DED-S4ED code, which means that, in addition to providing correction of a single bit error, and detection of a double-bit error, it will detect an error involving three or four bits, as long as they belong to the same 4-bit nybble of the 72-bit word fetched from memory. Anyone who has looked at a 72-pin memory stick, and found there were fewer than 36 chips on it, or a 168-pin DIMM, and found there were fewer than 72 chips on it, will appreciate the relevance of this property.

Intermediate between these two codes is one devised by D. C. Bossen at IBM at about the same time as Mu-Yue Hsiao developed his code; this one has the property of also correcting errors in two bits, if those two bits are adjacent.

The check bits for these three codes are shown below:

Mu-Yue Hsiao        D. C. Bossen        Shigeo Kaneda
 1 1 0 1 0 0 0 0     1 0 0 1 0 0 0 0     1 0 0 0 0 1 0 1
 1 1 0 1 1 1 0 0     0 1 1 1 0 0 0 0     0 1 1 0 1 0 1 1
 1 1 1 0 1 1 0 0     1 0 1 1 0 0 0 0     0 0 1 0 0 1 0 1
 1 1 1 0 0 0 0 0     0 1 1 0 0 0 0 0     1 0 0 1 1 1 1 0
 1 0 0 1 1 0 0 0     1 0 1 0 0 0 0 0     0 1 0 0 0 1 0 1
 1 0 0 1 0 1 0 0     0 1 0 1 0 0 0 0     1 0 0 1 1 0 1 1
 1 0 0 1 0 0 1 0     0 1 0 0 1 0 0 0     0 0 0 1 0 1 0 1
 1 0 0 1 0 0 0 1     1 1 0 0 0 1 0 0     0 1 1 0 1 1 1 0
 0 1 1 0 1 0 0 0     1 1 0 0 1 0 0 0     1 0 0 0 1 0 1 0
 0 1 1 0 0 1 0 0     1 0 0 0 0 1 0 0     0 1 1 0 0 1 1 1
 0 1 1 0 0 0 1 0     1 0 0 0 1 0 0 0     0 0 1 0 1 0 1 0
 0 1 1 0 0 0 0 1     0 1 0 0 0 1 0 0     1 0 0 1 1 1 0 1
 1 1 0 0 1 0 0 0     1 0 1 0 1 1 0 1     0 1 0 0 1 0 1 0
 1 1 0 0 0 1 0 0     0 1 0 1 1 0 1 1     1 0 0 1 0 1 1 1
 1 1 0 0 0 0 1 0     1 0 1 0 0 1 1 1     0 0 0 1 1 0 1 0
 1 1 0 0 0 0 0 1     0 1 0 1 1 1 1 0     0 1 1 0 1 1 0 1
 0 0 1 1 1 0 0 0     0 1 1 0 1 0 0 0     1 0 0 0 0 0 1 1
 0 0 1 1 0 1 0 0     1 1 0 1 0 1 0 0     0 1 0 0 1 1 1 1
 0 0 1 1 0 0 1 0     1 1 1 0 1 0 0 0     0 0 1 0 1 1 0 0
 0 0 1 1 0 0 0 1     1 0 0 1 0 1 0 0     0 0 0 1 1 1 1 1
 1 0 1 0 1 0 0 0     0 0 0 1 1 0 1 0     1 0 0 0 1 1 1 1
 1 0 1 0 0 1 0 0     0 0 1 1 0 1 0 1     0 1 0 0 0 0 1 1
 1 0 1 0 0 0 1 0     0 0 1 1 1 0 1 0     0 0 1 0 1 1 1 1
 1 0 1 0 0 0 0 1     0 0 1 0 0 1 0 1     0 0 0 1 1 1 0 0
 0 1 0 1 1 0 0 0     1 0 0 0 0 1 1 0     0 1 0 0 0 1 1 0
 0 1 0 1 0 1 0 0     0 1 0 0 1 1 0 1     1 0 0 1 1 0 0 0
 0 1 0 1 0 0 1 0     1 0 0 0 1 1 1 0     0 0 0 1 1 0 0 1
 0 1 0 1 0 0 0 1     0 1 0 0 1 0 0 1     0 1 1 0 0 0 1 0
 1 0 1 1 0 0 0 0     1 0 1 0 0 0 0 1     1 0 0 0 1 1 0 0
 1 0 1 1 0 0 1 1     0 1 0 1 0 0 1 1     0 1 0 0 1 1 0 0
 0 1 1 1 0 0 1 1     1 0 1 0 0 0 1 1     0 0 1 0 0 0 1 1
 0 1 1 1 0 0 0 0     0 1 0 1 0 0 1 0     0 0 0 1 0 0 1 1
 0 0 0 0 1 1 1 0     1 0 1 0 0 0 1 0     0 1 0 1 1 0 0 0
 1 1 0 0 1 1 1 0     0 1 0 1 0 0 0 1     1 0 1 1 0 1 1 0
 1 1 0 0 1 1 0 1     1 0 1 0 1 0 0 0     0 1 0 1 0 0 1 0
 0 0 0 0 1 1 0 1     0 1 0 1 0 1 0 0     1 1 1 0 1 0 0 1
 1 0 0 0 1 0 1 0     1 0 0 0 1 0 1 0     0 1 0 1 0 1 0 0
 0 1 0 0 1 0 1 0     0 1 0 0 0 1 0 1     1 0 1 1 1 0 0 1
 0 0 1 0 1 0 1 0     0 0 1 0 1 0 1 0     0 1 0 1 0 0 0 1
 0 0 0 1 1 0 1 0     0 0 0 1 0 1 0 1     1 1 1 0 0 1 1 0
 1 0 0 0 0 1 0 1     0 0 0 0 1 0 0 1     1 0 1 0 1 0 0 0
 0 1 0 0 0 1 0 1     0 0 0 0 0 1 1 1     0 1 1 1 0 1 1 0
 0 0 1 0 0 1 0 1     0 0 0 0 1 0 1 1     1 0 1 0 0 0 1 0
 0 0 0 1 0 1 0 1     0 0 0 0 0 1 1 0     1 1 0 1 1 0 0 1
 1 0 0 0 1 1 0 0     0 0 0 0 1 0 1 0     1 0 1 0 0 1 0 0
 0 1 0 0 1 1 0 0     0 0 0 0 0 1 0 1     0 1 1 1 1 0 0 1
 0 0 1 0 1 1 0 0     0 0 0 1 0 0 1 0     1 0 1 0 0 0 0 1
 0 0 0 1 1 1 0 0     0 0 1 1 0 0 0 1     1 1 0 1 0 1 1 0
 1 0 0 0 0 0 1 1     0 0 1 1 0 0 1 0     0 0 1 1 1 0 0 0
 0 1 0 0 0 0 1 1     0 0 1 0 0 0 0 1     1 1 1 1 0 1 0 0
 0 0 1 0 0 0 1 1     0 0 1 0 0 0 1 0     1 1 0 0 0 0 1 0
 0 0 0 1 0 0 1 1     0 0 0 1 0 0 0 1     1 1 1 1 0 0 0 1
 1 0 0 0 0 1 1 0     0 0 1 0 0 1 0 0     1 1 1 1 1 0 0 0
 0 1 0 0 0 1 1 0     0 0 0 1 1 1 0 0     0 0 1 1 0 1 0 0
 0 0 1 0 0 1 1 0     0 0 1 0 1 1 0 0     1 1 1 1 0 0 1 0
 0 0 0 1 0 1 1 0     0 0 0 1 1 0 0 0     1 1 0 0 0 0 0 1
 1 0 0 0 1 0 0 1     0 0 1 0 1 0 0 0     0 1 1 0 0 1 0 0
 0 1 0 0 1 0 0 1     0 0 0 1 0 1 0 0     1 0 0 0 1 0 0 1
 0 0 1 0 1 0 0 1     0 1 0 0 0 0 1 0     1 0 0 1 0 0 0 1
 0 0 0 1 1 0 0 1     1 1 0 0 0 0 0 1     0 0 1 0 0 1 1 0
 0 0 0 0 0 1 1 1     1 1 0 0 0 0 1 0     1 1 0 0 1 0 0 0
 0 0 1 1 0 1 1 1     1 0 0 0 0 0 0 1     1 1 0 0 0 1 0 0
 0 0 1 1 1 0 1 1     1 0 0 0 0 0 1 0     0 0 1 1 0 0 1 0
 0 0 0 0 1 0 1 1     0 1 0 0 0 0 0 1     0 0 1 1 0 0 0 1

The Golay Code

Only one other perfect code for binary signals, significantly distinct from those we have examined so far (the Hamming codes, a parity bit, and multiple repetition), is possible: the binary Golay code. It takes a 12-bit input, and adds 11 error-checking bits to it. Like the Hamming codes, it can have an extra parity-check bit added, and the code remains optimal; however, the term perfect is only applied to the Golay code and the Hamming codes without the parity bit, and to the trivial code involving an odd number of repetitions. There are also nonlinear codes which involve the same number of bits, and correct the same number of errors, as the Hamming codes, but which are distinct from them, and which are perfect; for practical purposes, there would be no reason to use them instead of the Hamming code, but their existence is still a result of great theoretical importance. And as for practical applications of nonlinear codes, the Preparata codes are examples of nonlinear codes whose performance cannot be equalled by that of any linear code.

The number of possible combinations of three or fewer bits in error over a block of 23 bits is the following:

No bits in error:
                                  1

One bit in error:
                                 23

Two bits in error:

 23 * 22
--------- = 23 * 11 =           253
  2 *  1

Three bits in error:

 23 * 22 * 21
-------------- = 23 * 11 * 7 = 1771
  3 *  2 *  1
                               ----
                    Total:     2048

and so the total, 2,048, is precisely the number of possible values of the 11 bits used for error checking.

As the Golay code uses a block of 23 bits, with 12 bits used for data (and therefore 2^12 possible valid codewords), and it corrects three errors, which means that the minimum Hamming distance between valid codewords is 7, it is referred to as a [23,12,7] code. Similarly, the code consisting of triple repetition is a [3,1,3] code, that of quintuple repetition is a [5,1,5] code, that of septuple repetition is a [7,1,7] code, and so on. The Hamming codes all have distance 3, and the three illustrated above are [7,4,3], [15,11,3], and [31,26,3] codes. In general, for N error-correcting bits, the corresponding Hamming code is a [(2^N)-1,(2^N)-N-1,3] code.

Perfect codes are only possible where a sufficiently symmetric arrangement of codewords exists, and other codes can only approach their effectiveness; thus, [7,2,5], [10,4,5], [15,8,5], and [31,22,5] codes, for example, which might be thought to exist from counting the possible combinations of two or fewer bits in error for words of various lengths, are not necessarily available.

A modified form of the Golay code with parity bit added, so that the parity bit is no longer explicitly visible, is shown in a book by two of the three co-authors of the Handbook of Applied Cryptography, and an equivalent form, modified by some rearrangement of rows and columns (to obtain a shifting of the cyclic 11 by 11 portion of the error-checking matrix, and to put the row and column of 1 bits in a more conventional position) is shown here:

1 0 0 0 0 0 0 0 0 0 0 0 1 0 1 1 0 1 1 1 0 0 0 1
0 1 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 1 1 0 0 0 1 1
0 0 1 0 0 0 0 0 0 0 0 0 1 1 0 1 1 1 0 0 0 1 0 1
0 0 0 1 0 0 0 0 0 0 0 0 1 0 1 1 1 0 0 0 1 0 1 1
0 0 0 0 1 0 0 0 0 0 0 0 0 1 1 1 0 0 0 1 0 1 1 1
0 0 0 0 0 1 0 0 0 0 0 0 1 1 1 0 0 0 1 0 1 1 0 1
0 0 0 0 0 0 1 0 0 0 0 0 1 1 0 0 0 1 0 1 1 0 1 1
0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 1 0 1 1 0 1 1 1
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 1 1 0 1 1 1 1
0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 1 1 0 1 1 1 0 1
0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 1 1 0 1 1 1 0 0 1
0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0

In this code, the minimum Hamming distance between valid codewords is eight; without the parity bit, it would be seven.

A distance of seven allows either correcting three errors, correcting two errors and detecting two errors, correcting one error and detecting four errors, or detecting six errors.

A distance of eight allows either correcting three errors and detecting one error, correcting two errors and detecting three errors, correcting one error and detecting five errors, or detecting seven errors.

Examine the error-checking portion, the second square half, of the matrix for the Golay code shown above. The right and bottom edges consist of all ones except for a zero where they meet.

The remaining 11 by 11 square consists of the sequence 10110111000 repeated in every row, but shifted one space to the left in each row. This sequence contains exactly six one bits, an even number.

The matrix is symmetric, hence unchanged when flipped around the diagonal running from the top left to the bottom right.

Hence, three facts together are significant: every row contains an odd number of 1 bits (seven); the last row, ANDed with any other row, produces a row with six one bits, an even number; and any two of the first 11 rows, when combined by an AND, produce a rotated version of one of the following strings, each containing three one bits:

10110111000 and 01101110001 = 00100110000 (3 bits)
10110111000 and 11011100010 = 10010100000 (3 bits)
10110111000 and 10111000101 = 10110000000 (3 bits)
10110111000 and 01110001011 = 00110001000 (3 bits)
10110111000 and 11100010110 = 10100010000 (3 bits)
10110111000 and 11000101101 = 10000101000 (3 bits)
10110111000 and 10001011011 = 10000011000 (3 bits)
10110111000 and 00010110111 = 00010110000 (3 bits)
10110111000 and 00101101110 = 00100101000 (3 bits)
10110111000 and 01011011100 = 00010011000 (3 bits)

to which must be added the single 1 bit which any two of those rows share in the final column, for a total of four, again an even number. All this means that, using modulo 2 arithmetic, the error-checking matrix in the code as represented here is its own inverse: the odd counts place ones along the diagonal of its square, and the even counts place zeroes everywhere else. (If it weren't for that extra zero, a different decoding matrix would be required, and a slightly more complicated version of the decoding procedure given below would be needed.) Because of the symmetry around the diagonal, this is true both in the usual convention for matrix multiplication (numbers go in from the top, and come out on the right) and the one used for error-correcting codes (numbers go in from the left, and come out on the bottom). For more information on matrix multiplication, see the section concerning the Hill Cipher.

This helps to make it practical to check a codeword formed by this code, and then transmitted, for errors. The following procedure will find the correct original 12-bit input if there were three or fewer errors in the transmitted 24-bit block, or it will fail if there were four errors. (With more than four errors, of course, it can be fooled.)

First, calculate the syndrome: encode the 12 data bits as received, and XOR the check bits this produces with the 12 check bits as received.

If the syndrome contains three or fewer 1 bits, all the errors were among the check bits, and the data bits are correct as received.

If the XOR of the syndrome with one of the rows of the error-checking part of the matrix contains two or fewer 1 bits, the data bit corresponding to that row is in error, and the remaining errors are among the check bits.

Otherwise, multiply the syndrome by the error-checking part of the matrix which, as we have just seen, is its own inverse. If the result contains three or fewer 1 bits, it indicates directly which data bits are in error, the check bits being correct as received; if instead the XOR of that result with one of the rows of the matrix contains two or fewer 1 bits, that XOR indicates which data bits are in error, and the check bit corresponding to that row is also in error.

If none of these tests succeeds, at least four errors have taken place, and the block cannot be corrected.
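Here is a sketch of that procedure in Python; the representation (each row of the error-checking part of the matrix above packed into a twelve-bit integer, most significant bit first) and the helper names are my own:

B = [int(bits, 2) for bits in (
    "101101110001", "011011100011", "110111000101", "101110001011",
    "011100010111", "111000101101", "110001011011", "100010110111",
    "000101101111", "001011011101", "010110111001", "111111111110")]

def times_B(v):
    # multiply a twelve-bit row vector by the matrix B over GF(2);
    # bit 11, the leftmost, of v corresponds to the first row
    out = 0
    for i in range(12):
        if (v >> (11 - i)) & 1:
            out ^= B[i]
    return out

def weight(v):
    return bin(v).count("1")

def golay_decode(data, check):
    # data, check: the two received 12-bit halves; returns the corrected
    # data bits, or None if four errors were detected
    s = times_B(data) ^ check                 # the syndrome
    if weight(s) <= 3:
        return data                           # errors only among check bits
    for i in range(12):
        if weight(s ^ B[i]) <= 2:
            return data ^ (1 << (11 - i))     # one data bit, plus check bits
    t = times_B(s)                            # B is its own inverse
    if weight(t) <= 3:
        return data ^ t                       # up to three data bits in error
    for i in range(12):
        if weight(t ^ B[i]) <= 2:
            return data ^ t ^ B[i]            # data bits, plus one check bit
    return None                               # four errors: detected only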

Of course, this involves counting the number of 1 bits in the XOR of the expected error-checking bits, or the expected data bits, with those actually received. This step can be time-consuming. One could speed it up by using a table with 4,096 entries, in which one could swiftly look up the number of one bits in a 12-bit word.

Of course, that could be made more manageable by using a table with 64 entries twice, once for each half of the word. But if one were reconciled to using a table with 4,096 entries, then one could decode the Golay code more swiftly.

The XOR of the expected error-checking bits with those actually received is called the syndrome of a received codeword. The entries in a table, indexed by the syndrome, corresponding to combinations with 0, 1, 2, or 3 bits equal to 1, could contain 0, indicating that all the errors, if there actually are three or fewer errors in the block, are in the error-checking portion of the codeword.

For the case of one error in the data portion, place the binary value 100000000000, indicating the first data bit is to be flipped, in all locations derived from XORing the first row of the error-checking part of the matrix with every combination of bits involving 0, 1, or 2 bits equal to 1, and then the binary value 010000000000 in all locations derived from XORing the second row of the error-checking part of the matrix with every combination of bits involving 0, 1, or 2 bits equal to 1, and so on.

Similarly, for two errors in the data portion, combine the XORs of two different rows in the error-checking portion of the matrix with either 0 or a single bit equal to 1 to produce the indexes for every combination of two bits in error in the data portion as indicated by those two rows in the error-checking portion of the matrix.

Finally, for three errors in the data portion, all combinations of three distinct rows in the error-checking portion of the matrix would index to the corresponding arrangement of three errors in the data portion.

The remaining entries in the table would indicate that there may be four errors, which cannot be corrected, only detected, so an invalid value would be placed in those portions of the table as a flag. Any value with more than three one bits could be tested for directly, so the entries do not need to be more than twelve bits long.
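A sketch in Python of filling in such a table, reusing B and the helpers from the sketch above (entries left at None serve as the invalid flag):

from itertools import combinations

table = [None] * 4096
for data_errors in range(4):                  # 0 to 3 errors among data bits
    for dpos in combinations(range(12), data_errors):
        e1 = sum(1 << p for p in dpos)
        base = times_B(e1)                    # this pattern's syndrome part
        for check_errors in range(4 - data_errors):
            for cpos in combinations(range(12), check_errors):
                e2 = sum(1 << p for p in cpos)
                table[base ^ e2] = e1         # syndrome -> data correction

# to decode: entry = table[times_B(data) ^ check]; if it is None, four or
# more errors were detected; otherwise the corrected data is data ^ entry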

Another perspective on the binary Golay Code, which I also found understandable, is contained in this thesis. In the form of the Golay Code given above, eleven rows are cyclic, and one special, in the error-checking matrix; in the form discussed in that thesis, all twelve rows are equivalent, but as they relate to the faces of a dodecahedron, there is no way to put them in a cyclic order.

The form of the Golay code discussed there is:

1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 1 1 1 1 1
0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 1 1 0 0 0 1 1 1 1
0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 1 1 1 0 0 1 1 1
0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 1 0 1 1 1 0 0 1 1
0 0 0 0 1 0 0 0 0 0 0 0 0 1 1 0 1 0 1 1 1 0 0 1
0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 1 0 1 0 1 1 1 0 1
0 0 0 0 0 0 1 0 0 0 0 0 1 0 1 1 1 0 1 0 1 1 0 0
0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 1 1 1 0 1 0 1 1 0
0 0 0 0 0 0 0 0 1 0 0 0 1 1 0 0 1 1 1 0 1 0 1 0
0 0 0 0 0 0 0 0 0 1 0 0 1 1 1 0 0 1 1 1 0 1 0 0
0 0 0 0 0 0 0 0 0 0 1 0 1 1 1 1 0 0 0 1 1 0 1 0
0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 0 0 0 0 0 1

Each of the twelve rows and columns of the error-checking part of the matrix corresponds to a face of the dodecahedron, and it contains a zero for every pair of faces that are next to one another. (A face is not considered to be next to itself.) This is still a Golay code, with the same error-correcting property of a Hamming distance of 8 between codewords, and not only is the error-checking matrix symmetric, but once again it is its own inverse as shown here. Because of the dodecahedral symmetry, once again, it is only necessary to AND one row of the matrix with the eleven other rows to establish that. For example, row 1 shares four bits with rows 2 to 11, and two bits with row 12.

Another form of the Golay Code, which is closely related to the one shown above, but with a minor rearrangement of some rows and columns, is discussed on this page on this site, where a diagram shows the relationship to the icosahedron (the dual of the dodecahedron) in a clear and explicit fashion.

But having an error-checking matrix which is its own inverse is not a necessary property of a Golay code; either example of a Golay code given here would still be a Golay code if the error-checking matrix were reflected left to right, since the error-checking properties would be identical, but then that matrix would not be its own inverse.

This site contains a link to a paper in DVI format giving eight different constructions of the binary Golay code. (Unfortunately, you may have difficulty in viewing documents in DVI format on your system.)

The following representation of the Golay code is the one actually used in MIL-STD-188-141A and FED-STD-1045, as well as being proposed for use in ITU H.223, and thus it may be widely used for purposes involving radio communications:

1 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 1 1 1 0 0 0 1 1
0 1 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0 1 0 0 1 0
0 0 1 0 0 0 0 0 0 0 0 0 1 1 0 1 0 0 1 0 1 0 1 1
0 0 0 1 0 0 0 0 0 0 0 0 1 1 0 0 0 1 1 1 0 1 1 0
0 0 0 0 1 0 0 0 0 0 0 0 1 1 0 0 1 1 0 1 1 0 0 1
0 0 0 0 0 1 0 0 0 0 0 0 0 1 1 0 0 1 1 0 1 1 0 1
0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 1 0 0 1 1 0 1 1 1
0 0 0 0 0 0 0 1 0 0 0 0 1 0 1 1 0 1 1 1 1 0 0 0
0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 1 1 0 1 1 1 1 0 0
0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 1 1 0 1 1 1 1 0
0 0 0 0 0 0 0 0 0 0 1 0 1 0 1 1 1 0 0 0 1 1 0 1
0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 1 1 1 0 0 0 1 1 1

The error-checking part of the matrix is not symmetric along the main (upper left to lower right) diagonal, and its transpose is shown as being used in decoding. Also, it is said to have been generated from the polynomial

 11  9  7  6  5
x  +x +x +x +x +x+1

or, according to another source, from the polynomial

 11  10  6  5  4  2
x  +x  +x +x +x +x +1

which correspond to the following sequence of bits, either in order or reversed:

1 0 1 0 1 1 1 0 0 0 1 1

which indeed is the first row of the error-correcting matrix.

If you look at the first eleven columns: when the eleventh column of a row is a 0, the next row is that row shifted one place to the right, with a 0 entering the first column. When the eleventh column is a 1, the next row is obtained by shifting the row one place to the right in the same way, and then XORing the result with 10101110001. So we are seeing powers of two modulo the polynomial

 11  10  6  5  4  2
x  +x  +x +x +x +x +1

with their bits reversed. If we proceed from the bottom up, we can use left shifts, but we are no longer starting with the polynomial itself.

Each column has seven ones in it, so when multiplied by its transpose, there will be a 1 in every position on the diagonal in the result. Any two distinct columns have either four or two ones in common (as I had to verify by brute force with a short BASIC program), and so the transpose of the error-checking part of the matrix is indeed also the inverse of that part.

This explains the contents of the twelfth column of the error-checking part of the matrix; the bit in it is a one if there are only six ones in the rest of the row, and a zero if there are seven, so it is an odd parity bit for the row.
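That brute-force verification takes only a few lines in Python (the rows below are the error-checking part of the matrix above):

ROWS = ("101011100011", "111110010010", "110100101011", "110001110110",
        "110011011001", "011001101101", "001100110111", "101101111000",
        "010110111100", "001011011110", "101110001101", "010111000111")

columns = ["".join(row[j] for row in ROWS) for j in range(12)]
for col in columns:
    assert col.count("1") == 7             # seven ones in every column
for i, a in enumerate(columns):
    for b in columns[i + 1:]:
        shared = sum(x == "1" == y for x, y in zip(a, b))
        assert shared in (2, 4)            # an even overlap in every case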

Despite the fact that each column and row contains seven ones, this error-checking matrix can't be produced simply by rearranging rows and columns of the one produced from the dodecahedron. This can be shown as follows: in the dodecahedral matrix, the columns corresponding to opposite faces can be identified, as they have no zeroes in the same row; two non-opposite faces must be adjacent to two other faces; and those two faces must each be adjacent to three faces other than the first two (those two faces may be adjacent to each other, if the two non-opposite faces were not adjacent), two from each group of which are opposite to two from the other group. If one starts with the first two columns of the matrix above, this is the point at which the correspondence breaks down.

Error-checking in this case involves the use of the inverse of the error-checking part of the matrix, but otherwise the algorithm is the same as the one given above.

A form of the Golay Code as a (23,12) code, without the parity bit added, is the following, based on the one from the original paper in which Marcel Golay of the Signal Corps described this error-correcting code in 1949.

                 A A A A B B B C C D
                 B C D E C D E D E E

1 0 0     0 0    0 0 0 0 1 1 1 1 1 1   A       1
0 1 0 ... 0 0    0 1 1 1 0 0 0 1 1 1   B       1
0 0 1     0 0    1 0 1 1 0 1 1 0 0 1   C       1
                 1 1 0 1 1 0 1 0 1 0   D       1
                 1 1 1 0 1 1 0 1 0 0   E       1

                 0 1 0 1 0 1 1 1 0 0   ABCED   1
                 0 1 1 0 1 0 1 0 0 1   ABDCE   1
                 0 0 1 1 1 1 0 0 1 0   ABEDC   1
                 1 0 1 0 0 0 1 1 1 0   ACBDE   1
                 1 0 0 1 1 0 0 1 0 1   ACEBD   1
0 0 0 ... 1 0    1 1 0 0 0 1 0 0 1 1   ADCBE   1

0 0 0 ... 0 1    1 1 1 1 1 1 1 1 1 1           0

On one side is the identity matrix, abbreviated in this diagram. Note that the column of all ones except for a zero at the bottom is part of the basic structure of the code, so the column representing the parity bit is hidden elsewhere in the preceding examples. (This seems astounding, as one would think that if one added a parity bit to this code as it stands, the result would be wasteful instead of perfect, given its similarity to that column.)

AB to DE are the various combinations of two different objects from a set of five. The first five rows represent when one of the five objects is not part of that pair of objects.

The next six rows represent when that pair of objects, or its reversal, is not present in one of the six distinct odd permutations of the five objects that exist when cyclic permutations are ignored. Thus, ABCED in the chart above stands for ABCED, BCEDA, CEDAB, EDABC, and DABCE, and so DA, the reversal of AD, is considered to be a part of it.

Then there is the row with all ones except for that single zero in the last column.

Through rearrangement of rows and columns, one can obtain from that version of the Golay Code, but with the parity bit added, the following version:

1 0 0 0 0 0 0 0 0 0 0 0   1 1 1 1 1 1   0 0 0 0   1   1   A

0 1 0 0 0 0 0 0 0 0 0 0   0 0 0 1 1 1   0 0 1 1   1   1   ABEDC
0 0 1 0 0 0 0 0 0 0 0 0   0 1 1 0 0 1   0 1 0 1   1   1   ABCED
0 0 0 1 0 0 0 0 0 0 0 0   1 0 1 0 1 0   0 1 1 0   1   1   ABDCE
0 0 0 0 1 0 0 0 0 0 0 0   1 1 0 0 1 0   1 0 0 1   1   1   ACEBD
0 0 0 0 0 1 0 0 0 0 0 0   0 1 1 1 0 0   1 0 1 0   1   1   ACBDE
0 0 0 0 0 0 1 0 0 0 0 0   1 0 0 1 0 1   1 1 0 0   1   1   ADCBE
0 0 0 0 0 0 0 1 0 0 0 0   1 1 0 1 0 0   0 1 1 1   1   1   B
0 0 0 0 0 0 0 0 1 0 0 0   1 0 1 0 0 1   1 0 1 1   1   1   C
0 0 0 0 0 0 0 0 0 1 0 0   0 0 1 1 1 0   1 1 0 1   1   1   D
0 0 0 0 0 0 0 0 0 0 1 0   0 1 0 0 1 1   1 1 1 0   1   1   E
0 0 0 0 0 0 0 0 0 0 0 1   1 1 1 1 1 1   1 1 1 1   0   1

                          D C B C B B   A A A A   s   p
                          E D E E C D   B C D E

Not only is the parity bit, the simplest error-correcting code, present explicitly, but also present is the Hamming code for eleven of the data bits. (Incidentally, I discovered, in preparing this, an error in my description of Golay's original version of his code above; I had originally failed to recognize that AD is contained within ADCBE.) Note also that the first six columns of the error-correcting part of the matrix, in addition to containing two occurrences of 111111, also contain ten of the twenty combinations possible within a three-of-six code, so here the Golay code is divided up into several older and less-sophisticated codes.

The presence of an explicit parity bit can be used to shrink the table with 4,096 entries used for decoding down to one with only 2,048 entries; this table would indicate, for the first 11 error-checking bits, the correction to the data bits which the discrepancy in them would imply, but what would also be noted is whether the error to be corrected would also change the parity, and how many bits are in error. If three bits are in error, and the parity is also not what would be expected, then an error can only be detected.

It should be possible to make use of the three-of-six code and the Hamming code portions as well to produce an algorithm for decoding that would be quite efficient while only involving relatively short tables.

The bits shown in bold show where even the AUTOSPEC code can be seen as embedded within it! There is a relationship between Golay codes and Hadamard codes as well.

It would actually be preferable to have the supplementary Golay bit as the first of the error correcting bits to produce a "conventional" version of the Golay code; it is placed as the second-last bit here in order that the AUTOSPEC code would be visible.

With the aid of the inverse of the error-correcting part of the matrix, one can decode this code using basically the same algorithm as used in the first case; but, in addition, for the common case of only a few errors, decoding can be made simpler by using only the Hamming error-correcting bits first.

The inverse of the error-correcting part of the matrix shown above is:

1 1 1 0 0 1 0 1 1 0 0 0
1 1 0 1 0 0 1 1 0 0 1 0
1 1 0 0 1 0 1 0 1 1 0 0
1 0 1 1 1 0 0 1 0 1 0 0
1 0 1 0 0 1 1 0 0 1 1 0
1 0 0 1 1 1 0 0 1 0 1 0
0 1 1 1 0 0 0 0 1 1 1 0
0 1 0 0 1 1 0 1 0 1 1 0
0 0 1 0 1 0 1 1 1 0 1 0
0 0 0 1 0 1 1 1 1 1 0 0
1 0 0 0 0 0 0 1 1 1 1 1
0 1 1 1 1 1 1 0 0 0 0 1

While it no longer has a parity bit, or a supplemental Golay bit, the three-of-six code and the Hamming error-correction bits are still visible in the inverse, which has the structure:

* g g g g g g h h h h *

Additional Error-Correcting Codes

There are a number of other codes used for error detection and correction. Examples of codes used in the same general manner as Hamming codes include Hadamard codes and Reed-Muller codes. These codes all add slightly more redundancy than is strictly required for the amount of error-checking they seek to attain.

Hadamard Codes

An example of a Hadamard code looks like this:

00000 | 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
00001 | 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
00010 | 0 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1
00011 | 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0
00100 | 0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1
00101 | 0 1 0 1 1 0 1 0 0 1 0 1 1 0 1 0
00110 | 0 0 1 1 1 1 0 0 0 0 1 1 1 1 0 0
00111 | 0 1 1 0 1 0 0 1 0 1 1 0 1 0 0 1
01000 | 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1
01001 | 0 1 0 1 0 1 0 1 1 0 1 0 1 0 1 0
01010 | 0 0 1 1 0 0 1 1 1 1 0 0 1 1 0 0
01011 | 0 1 1 0 0 1 1 0 1 0 0 1 1 0 0 1
01100 | 0 0 0 0 1 1 1 1 1 1 1 1 0 0 0 0
01101 | 0 1 0 1 1 0 1 0 1 0 1 0 0 1 0 1
01110 | 0 0 1 1 1 1 0 0 1 1 0 0 0 0 1 1
01111 | 0 1 1 0 1 0 0 1 1 0 0 1 0 1 1 0
10000 | 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
10001 | 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0
10010 | 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0 0
10011 | 1 0 0 1 1 0 0 1 1 0 0 1 1 0 0 1
10100 | 1 1 1 1 0 0 0 0 1 1 1 1 0 0 0 0
10101 | 1 0 1 0 0 1 0 1 1 0 1 0 0 1 0 1
10110 | 1 1 0 0 0 0 1 1 1 1 0 0 0 0 1 1
10111 | 1 0 0 1 0 1 1 0 1 0 0 1 0 1 1 0
11000 | 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0
11001 | 1 0 1 0 1 0 1 0 0 1 0 1 0 1 0 1
11010 | 1 1 0 0 1 1 0 0 0 0 1 1 0 0 1 1
11011 | 1 0 0 1 1 0 0 1 0 1 1 0 0 1 1 0
11100 | 1 1 1 1 0 0 0 0 0 0 0 0 1 1 1 1
11101 | 1 0 1 0 0 1 0 1 0 1 0 1 1 0 1 0
11110 | 1 1 0 0 0 0 1 1 0 0 1 1 1 1 0 0
11111 | 1 0 0 1 0 1 1 0 0 1 1 0 1 0 0 1

These codes are obtained by a procedure similar to that used to create fractal designs:

                       1 1   1 1
   1 1                 1 0   1 0
   1 0  - becomes -->
                       1 1   0 0
                       1 0   0 1

applied as many times as desired, and then the result has its complement appended to it. Mariner 9 used the code of this type that occupies a 32 by 64 matrix. This method only generates Hadamard codes of orders which are powers of 2. Hadamard codes and matrices of many other orders which are multiples of 4 are also known, and it is conjectured, but not yet proven, that one exists for every such order. These Hadamard codes are obtained by other, more difficult methods.
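In Python, the construction and the resulting table of codewords can be sketched as follows (my own formulation: for this construction, bit j of the codeword for a message m works out to the parity of m AND j, with the top message bit complementing the whole block):

def hadamard_codewords(k):
    # the 2^(k+1) codewords, each 2^k bits long, of the Hadamard code
    # built by applying the doubling procedure k times
    n = 1 << k
    words = [[bin(m & j).count("1") & 1 for j in range(n)]
             for m in range(n)]
    return words + [[b ^ 1 for b in w] for w in words]

code = hadamard_codewords(4)       # the 32-codeword table shown above
assert min(sum(a != b for a, b in zip(u, v))
           for i, u in enumerate(code) for v in code[i + 1:]) == 8
# any two of the 32 codewords differ in at least 8 of their 16 bits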

Note that here we are not creating a generator matrix for the code, but the actual table of codewords. Hadamard codes are useful when it is desired to construct a code that contains a large proportion of error correction; thus, for example, the code used by Mariner 9 used 32 bits of signal to represent only 6 bits of data.

Thus, the example above, which expands five bits to sixteen, is perhaps the smallest which exhibits the unique strength of this code in providing efficient codes which protect very strongly against errors.

In fact, however, the example above shows that the code is a linear code, so it can be expressed more compactly than the full table given above. The encoding matrix for the code is simply:

 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
 0 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1
 0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1
 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

thus, in general, the matrix for a Hadamard code is composed of the representations of the binary numbers in order, which encode all but the last bit of the block; the last bit, when present, inverts the entire output block.

One can turn a Hadamard code into a more conventional-looking error-correcting code by rearranging columns, and by XORing all the other rows into the last row, so as to cause the last row to have zeroes wherever the column contains exactly one other 1 (as well as wherever the number of ones elsewhere in the column is odd); the result, in the case of the Hadamard code for five data bits, could be something such as the following:

1 0 0 0 0 0 0 0 1 1 1 0 1 1 1 1
0 1 0 0 0 0 1 1 0 0 1 1 0 1 1 1
0 0 1 0 0 1 0 1 0 1 0 1 1 0 1 1
0 0 0 1 0 1 1 0 1 0 0 1 1 1 0 1
0 0 0 0 1 1 1 1 1 1 1 0 0 0 0 1

which contains part of a Hamming code turned sideways.

Note that only when the number of data bits is odd does this approach produce a true parity bit as part of the code.

More About the Golay Code

Incidentally, this representation of the Hadamard code for five data bits can also be found embedded in the conventionalized Golay code shown above:

1 0 0 0 0 0 0 0 0 0 0 0   1 1 1 1 1 1   0 0 0 0   1   1   A

0 1 0 0 0 0 0 0 0 0 0 0   0 0 0 1 1 1   0 0 1 1   1   1   ABEDC
0 0 1 0 0 0 0 0 0 0 0 0   0 1 1 0 0 1   0 1 0 1   1   1   ABCED
0 0 0 1 0 0 0 0 0 0 0 0   1 0 1 0 1 0   0 1 1 0   1   1   ABDCE
0 0 0 0 1 0 0 0 0 0 0 0   1 1 0 0 1 0   1 0 0 1   1   1   ACEBD
0 0 0 0 0 1 0 0 0 0 0 0   0 1 1 1 0 0   1 0 1 0   1   1   ACBDE
0 0 0 0 0 0 1 0 0 0 0 0   1 0 0 1 0 1   1 1 0 0   1   1   ADCBE
0 0 0 0 0 0 0 1 0 0 0 0   1 1 0 1 0 0   0 1 1 1   1   1   B
0 0 0 0 0 0 0 0 1 0 0 0   1 0 1 0 0 1   1 0 1 1   1   1   C
0 0 0 0 0 0 0 0 0 1 0 0   0 0 1 1 1 0   1 1 0 1   1   1   D
0 0 0 0 0 0 0 0 0 0 1 0   0 1 0 0 1 1   1 1 1 0   1   1   E
0 0 0 0 0 0 0 0 0 0 0 1   1 1 1 1 1 1   1 1 1 1   0   1

                          D C B C B B   A A A A   s   p
                          E D E E C D   B C D E

in this case, split up, and with some trivial rearrangement of rows and columns.

This illuminates something else about this representation of the Golay code, and suggests a further rearrangement of rows and columns to achieve the following:

1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1
0 1 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 1 0 0 0 1 1 1  ABEDC
0 0 1 0 0 0 0 0 0 0 0 0 1 0 1 1 1 0 0 0 1 0 1 1  ABCED
0 0 0 1 0 0 0 0 0 0 0 0 1 1 0 1 0 0 1 0 1 1 0 1  ABDCE
0 0 0 0 1 0 0 0 0 0 0 0 1 1 0 0 1 0 1 1 0 0 1 1  ACEBD
0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 1 1 1 0 1 0 1 0 1  ACBDE
0 0 0 0 0 0 1 0 0 0 0 0 1 0 1 0 0 1 1 1 1 0 0 1  ADCBE
0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 1 1 1 0 1 1 1 1  B
0 0 0 0 0 0 0 0 1 0 0 0 1 0 1 1 0 0 1 1 0 1 1 1  C
0 0 0 0 0 0 0 0 0 1 0 0 1 1 0 1 0 1 0 1 1 0 1 1  D
0 0 0 0 0 0 0 0 0 0 1 0 1 1 1 0 1 0 0 1 1 1 0 1  E
0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 0 0 0 0 1  A

                        s B B B C C D A A A A p
                          C D E D E E B C D E

in which the error-correcting part only just misses being its own transpose. In addition to the vector for the last data bit not being all ones in the parity-check area, to match the parity bit, note the structure of the highlighted six entries in the three-of-six code portion: the main diagonal is all ones, and except for the opposing diagonal of all zeroes, the other bits would all be inverted by transposition.

Here is an image of this representation of the Golay code, with color highlighting its symmetries and distinguishing its components:

Here is the inverse of the error-correcting part of this code:

 1  0  0  0  0  0  0  1  1  1  1  1
 0  0  1  0  0  1  1  0  0  1  1  1
 0  0  0  1  1  1  0  0  1  0  1  1
 0  1  0  0  1  0  1  0  1  1  0  1
 0  1  0  1  0  0  1  1  0  0  1  1
 0  0  1  1  1  0  0  1  0  1  0  1
 0  1  1  0  0  1  0  1  1  0  0  1
 0  1  1  1  0  0  0  0  1  1  1  0
 0  1  0  0  1  1  0  1  0  1  1  0
 0  0  1  0  1  0  1  1  1  0  1  0
 0  0  0  1  0  1  1  1  1  1  0  0
 1  1  1  1  1  1  1  0  0  0  0  0

Note how the Hamming portion stays in place, and the Hadamard portion is flipped vertically. And in the six-by-six square highlighted above, the two diagonals are inverted, and the rest of the bits remain unchanged.

So far, the error-correcting codes we have examined are designed around the assumption that single-bit errors are completely random and independent. In practice, of course, burst errors which affect several consecutive bits are more likely than errors involving the same number of bits in isolation.

One way of dealing with this is called interleaving, as was referred to briefly above. This is used, for example, in the encoding scheme for the Compact Disc. A very simple example of interleaving is used in the AUTOSPEC radio teletypewriter transmission format.

As seen previously, that format uses an error-correcting code with the matrix:

1 0 0 0 0 0 1 1 1 1
0 1 0 0 0 1 0 1 1 1
0 0 1 0 0 1 1 0 1 1
0 0 0 1 0 1 1 1 0 1
0 0 0 0 1 1 1 1 1 0

which has the easy-to-understand result that a five-bit character with odd parity is transformed to itself followed by its inverse, and a five-bit character with even parity is transformed to two repetitions of itself, so that the second five bits convey the parity of the character, while, in effect, every bit is transmitted twice.

In actual AUTOSPEC transmission, the ten-bit character codes are transmitted in ten-bit blocks with interleaving, so that for a given character, the first bit of the character's code is the first bit of one block, the second bit of the character's code is the second bit of the next block, and the third bit of the character's code is the third bit of the block after that, and so on.

Thus, a burst error ten bits long becomes ten single-bit errors in the codes of ten consecutive characters, and this code can correct single-bit errors.
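
A sketch of this interleaving, under the same assumptions as before (again my own code; real AUTOSPEC framing details are not modelled here):

def interleave(codes):
    # codes: ten-bit character codes; bit i of character j becomes
    # bit i of transmitted block j+i, so that the bits of one character
    # are spread over ten consecutive blocks.
    blocks = [[0] * 10 for _ in range(len(codes) + 9)]
    for j, code in enumerate(codes):
        for i, bit in enumerate(code):
            blocks[j + i][i] = bit
    return blocks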

The decoding algorithm is simple: by comparing the first five bits to the last five bits of a character, determine the parity of the data character. If the data bits have the wrong parity, then the values indicated by the error-correcting bits have priority; if they have the right parity, they are assumed correct.
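
In code, the rule might look like this sketch (my own rendering; where more than one error is detected, it simply gives up, a point taken up in the next paragraph):

def autospec_decode(received):
    # received: ten bits; the first five are data, the last five checks.
    data, check = received[:5], received[5:]
    diff = sum(d ^ c for d, c in zip(data, check))
    if diff <= 1:                         # halves (nearly) agree: even parity
        parity, from_check = 0, check
    elif diff >= 4:                       # halves (nearly) inverse: odd parity
        parity, from_check = 1, [1 - c for c in check]
    else:
        return None                       # two or more errors detected
    # A data half with the wrong parity is overruled by the check half.
    return data if sum(data) % 2 == parity else from_check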

If more than one error is found in either the error-correcting bits or in the data bits, however, some additional complexity is needed to handle the error in the best fashion.

Since 00000 has even parity, its error-correcting bits are also 00000; on the other hand, 11111, having odd parity, is likewise followed by 00000, and thus 11111 could be used by convention as a synchronization character (although 11100 00011, for example, is also a valid code, the idea is that only multiple repetitions of 11111 would be expected by the receiver).

Bose-Chaudhuri-Hocquenghem Codes

The Golay code and the Hadamard codes both produce as many check bits as data bits. Normally, for protecting an enciphered message in transit, only a small number of check bits applied to a group of data bits are desired. The Bose-Chaudhuri-Hocquenghem codes are a family of error-correcting codes that allow one to devise codes similar to the Hamming code, but with a higher proportion of check bits.

The sequence of Hamming codes, without a parity bit added, consists of codes of the type:

[2^n-1, 2^n-n-1, 3]

which can be used to correct one error, and the addition of a parity bit allows detecting two errors as well.

The primitive Bose-Chaudhuri-Hocquenghem codes can be of the types:

[2^n-1, 2^n-2n-1, 5]

[2^n-1, 2^n-3n-1, 7]

[2^n-1, 2^n-4n-1, 9]

and so on. As with the Hamming codes, a parity bit can be added, and if the number of bits one wishes to send in a block is not of the form 2^n-Xn-1, where X is the number of errors to be corrected, rows can be omitted. As the number of errors to be corrected increases, BCH codes may no longer be the optimum codes for correcting a given number of errors, but for small numbers of errors, they provide very good codes.

BCH codes are based on primitive polynomials for the Galois fields GF(2^n), which were discussed earlier in the sections on the Rijndael block cipher and on linear-feedback shift registers, and in a section of their own.

Because a BCH code is a cyclic code, when expressed in systematic form, in which parity-check bits are appended to the original data, it can be implemented, for encoding, in the same way as a cyclic redundancy check code; but since a cyclic redundancy check is simply used to reject an entire block if it is wrong, while a BCH code is used to correct errors, decoding a BCH code is more complicated.

We can compare the [31,26,3] Hamming code, in the form it has when produced as a case of the BCH code:

1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0
0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 1 0
0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 1
0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 1 1
0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 1
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 1
0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 1 1 0 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 1 1 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 1 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 1 0 1 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 1 0 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 1 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 1

to the [31,21,5] BCH code:

1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 1 1 0 1 0 0
0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 1 1 0 1 0
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 1 1 0 1
0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 1 0
0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 1
0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 0 0 0 1 0 0
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 0 0 0 1 0
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 0 0 0 1
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 1 1 1 0 0
0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 1 1 1 0
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 1 1 1
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 1 1 1 1
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 1 0 1 0 0 1 1
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 1 0 0 1 1 1 0 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 1 1 1 1 1 0 1 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 1 1 1 1 1 0 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 1 0 0 0 0 1 0 1 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 1 0 0 0 0 1 0 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 1 0 1 1 1 0 1 1 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 1 0 1 1 1 0 1 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 1 1 0 1 0 0 1

and the [31,16,7] BCH code:

1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 1 1 1 1 0 1 0 1 1 1
0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 1 0 0 0 0 1 1 1 1 0 0
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 1 0 0 0 0 1 1 1 1 0
0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 1 0 0 0 0 1 1 1 1
0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 1 1 0 1 0 1 0 0 0 0
0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 1 1 0 1 0 1 0 0 0
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 1 1 0 1 0 1 0 0
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 1 1 0 1 0 1 0
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 1 1 0 1 0 1
0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 1 0 1 1 0 0 0 1 1 0 1
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 1 0 0 1 0 1 0 0 0 1 0 0 0 1
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 1 1 0 1 0 1 0 1 0 1 1 1 1 1
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 1 1 1 1 0 1 0 1 1 1 1 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 1 1 1 1 0 1 0 1 1 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 1 1 1 1 0 1 0 1 1 1 1 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 1 1 1 1 0 1 0 1 1 1 1

all shown in the form they would have as CRC codes; thus, the modular polynomial is visible in the bottom row.
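
Encoding in this CRC-like systematic form is just polynomial division; here is a minimal sketch of my own, with bits packed into integers, using the generator polynomial x^5 + x^2 + 1 of the [31,26,3] code (its remainder for the last data bit, x^2 + 1, is the 00101 visible in that code's bottom row):

def poly_mod(dividend, divisor):
    # Polynomial division over GF(2): XOR the divisor in under each
    # leading one until the remainder's degree drops below the divisor's.
    while dividend.bit_length() >= divisor.bit_length():
        dividend ^= divisor << (dividend.bit_length() - divisor.bit_length())
    return dividend

def encode_cyclic(data, n, k, g):
    # Systematic encoding: the k data bits, shifted up by n-k places,
    # followed by the remainder modulo the generator polynomial.
    shifted = data << (n - k)
    return shifted | poly_mod(shifted, g)

# Setting only the last data bit reproduces the bottom row of the
# [31,26,3] matrix above:
print(format(encode_cyclic(1, 31, 26, 0b100101), '031b'))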

If one wished to devise a code for protecting a computer memory consisting of words that were 256 bits wide, a starting point would be the BCH codes with a block size of 511 bits. Given existing computer memories with one parity bit for eight data bits, we have already seen that when the width of a data word was increased to sixty-four bits, it was possible to use the [127,120,3] Hamming code, with seven check bits, as a starting point, adding parity to make a [128,120,4] code, and then removing data bits to make a [72,64,4] code. In the case of data words 256 bits wide, similarly, the [511,484,7] BCH code can be used as a starting point; add a parity bit, to make a [512,484,8] code which corrects three errors and detects four errors, and then remove data bits to make a [284,256,8] code. This leaves four additional bits available, if one is using existing memory modules that allow for a 12.5% overhead for error correction.

Improving on BCH

Starting from the [255,239,5] BCH code for data words 128 bits wide, if one wished to make an error-correcting code that corrected up to two errors, and detected up to three errors, one would have to add a parity bit to the sixteen check bits already present in the [255,239,5] code.

However, IBM managed to devise a DEC-TED code for 128-bit data words that not only required only sixteen check bits, but also had the same desirable property as the (72,64) code by Shigeo Kaneda described above, of being able to detect errors of four bits confined to a single aligned block of four bits.

As that is a (144,128) code, it can use the same standard 72-bit memory parts that are used for conventional SEC-DED error-correction coding, but with a considerably stronger resistance to soft errors.

This code was described in U. S. Patent 4,509,172.

Some b-Adjacent Codes

Primitive polynomials over Galois fields are very useful in the construction of error-correcting codes; the theory of BCH codes involves them, and they are also the basis of Reed-Solomon codes.

A primitive polynomial, as we have seen elsewhere on this site, can be used as the basis for a shift register with a maximal period.

The operation of a shift register can be expressed by a matrix, and such a matrix is known as the companion matrix of the primitive polynomial.

Thus, the primitive polynomial x^4 + x + 1, which generates GF(16), has the companion matrix:

0 0 0 1
1 0 0 1
0 1 0 0
0 0 1 0

and multiplication of a vector of four bits by that matrix shifts the bits one place left, with the leftmost bit wrapping around to the rightmost position, and a copy of it also XORed into the adjacent position, in accordance with x^4 = x + 1.

Since the shift register it simulates has maximal period, this means that each nonzero vector multiplied by that matrix produces a distinct output (so the matrix is invertible).

And that means that successive multiplications by that matrix will produce distinct outputs for a given input until all the possibilities have been generated.

This is a useful property for an error-correcting code.
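
This is easily checked; the following sketch, my own illustration, applies the companion matrix shown above to a nonzero vector repeatedly, and finds that it passes through all fifteen nonzero four-bit vectors before returning to its starting point:

T = [[0, 0, 0, 1],
     [1, 0, 0, 1],
     [0, 1, 0, 0],
     [0, 0, 1, 0]]

def apply_T(v):
    # Matrix times column vector over GF(2): AND along each row,
    # then XOR (here, sum modulo 2) the results.
    return [sum(t & x for t, x in zip(row, v)) % 2 for row in T]

v, seen = [0, 0, 0, 1], set()
for _ in range(15):
    seen.add(tuple(v))
    v = apply_T(v)
print(len(seen), v)   # 15 distinct states, and v is back to [0, 0, 0, 1]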

Error-correcting codes, such as the Hamming codes above, are usually illustrated in a form like this:

1 0 0 0 0 1 1
0 1 0 0 1 0 1
0 0 1 0 1 1 0
0 0 0 1 1 1 1

Input bits come in from the left, an AND is performed with the number in each cell, and then XOR is done down the columns to produce the output. So, here, the diagonal matrix indicates where the data bits are just copied, and the last three columns show how the check bits are generated.
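
In code, this convention amounts to a row-vector-times-matrix product over GF(2); a minimal sketch of my own for the matrix just shown:

G = [[1, 0, 0, 0, 0, 1, 1],
     [0, 1, 0, 0, 1, 0, 1],
     [0, 0, 1, 0, 1, 1, 0],
     [0, 0, 0, 1, 1, 1, 1]]

def encode(bits):
    # AND each input bit with its row, then XOR down the columns.
    out = [0] * 7
    for bit, row in zip(bits, G):
        if bit:
            out = [o ^ r for o, r in zip(out, row)]
    return out

print(encode([1, 0, 1, 1]))   # [1, 0, 1, 1, 0, 1, 0]: the data, then checks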

In normal matrix multiplication, though, the input comes in at the top, and the result goes out on the right. So sometimes just the columns used for the check bits are shown in ordinary matrix notation like this:

0 1 1 1
1 0 1 1
1 1 0 1

This makes it easier to express codes which are built up from matrices raised to powers.

Consider a code, described in this notation, with the structure:

I I       I    T T^2     T^14 T T^2     T^14
T T^2 ... T^14 I I   ... I    T T^2 ... T^14
T T^2     T^14 T T^2     T^14 I I       I

where T stands for the companion matrix for x^4 + x + 1 that we saw above, and I stands for the identity matrix.

This code produces 12 check bits for 168 data bits.

If we divide the code bits into aligned groups of four bits, then any error confined to just one of those groups - whether it involves one bit, two bits, three bits, or all four bits - can be corrected.

How would this be done?

If just one of the three groups of four bits in the check bits was in error, clearly the error is in the check bits, so they can be ignored, and the data bits taken as is.

For any error in the data bits that is confined to a single aligned group of four bits, the syndrome - the deviation of the check bits from their expected value based on the version of the data bits that is received - would consist of two nybbles that are the same, and one nybble that is different.

The nybble that is different represents the four bits of the error itself.

If the differing nybble is the first nybble of the syndrome, the error is somewhere in the first 56 data bits; if it is the second nybble, the error is in the second 56 data bits, and so on.

And, because of the maximal period property of the shift registers based on the primitive polynomial to which T is the companion matrix, the two nybbles that have the same data will have a different value for each of the fourteen nybbles in the group of 56 bits to which we have narrowed down the error.
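
The location procedure just described can be sketched as follows; this is entirely my own construction, with nybbles as four-bit integers, and mul_T standing in for the companion matrix by multiplying by a root of x^4 + x + 1 in GF(16):

def mul_T(v):
    # Multiply by x in GF(16), reducing by x^4 + x + 1; like the
    # companion matrix, this has period fifteen on nonzero values.
    v <<= 1
    if v & 0b10000:
        v ^= 0b10011
    return v

def locate(s1, s2, s3):
    # Two syndrome nybbles agree and one differs: the odd one out is
    # the error pattern e, its position names the group of 56 bits,
    # and the repeated value T^p(e) gives the position p in that group.
    syn = [s1, s2, s3]
    for group in range(3):
        a, b = [syn[i] for i in range(3) if i != group]
        if a == b and syn[group] != a:
            e = syn[group]
            if a == 0:
                return ('check', group, e)   # error in the check bits only
            v = e
            for p in range(1, 15):
                v = mul_T(v)
                if v == a:
                    return ('data', group, p, e)
    return None   # no error, or beyond this code's guarantee

print(locate(0b0001, 0b0010, 0b0010))   # ('data', 0, 1, 1)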

The code in this example is of the kind known as a b-Adjacent code. Related codes that are guaranteed to detect some kinds of errors in addition to correcting others will be described next, as they have been found to be very useful in dealing with real-world errors in large computer memories.

Instead of looking for two nybbles the same, one could of course look for two nybbles that are nonzero, and use a code with this structure:

I I       I    0 0       0    T T^2     T^14
T T^2 ... T^14 I I   ... I    0 0       0
0 0       0    T T^2     T^14 I I       I

which is even simpler; here, the first of the two nybbles that are nonzero indicates the error, and the second one indicates its location within a group of 56 bits.

While the DEC-TED-S4ED code described earlier, based on a BCH code, is a major improvement over the conventional Hamming-code-based ECC normally used in computers today, IBM later devised a further improvement in error-correcting codes, which is now usually used instead when a better code than a SEC-DED code is desired.

The S4ED feature - detecting when a 4-bit wide memory chip has failed - has been noted as highly desirable. Thus, it should not be surprising that an S4EC feature - actually correcting the errors resulting from the failure of a 4-bit wide memory chip - would be even more desirable.

Initially, IBM devised an S4EC-D4ED code that was a (76,64) code, requiring 12 check bits to protect 64 data bits. This was described in U. S. Patent 5,757,823. It was originally used with System/390 mainframes, and thus the use of special memory parts was less of an issue. Later, codes with the S4EC-D4ED property that were (144,128) codes were devised, making this form of error correction highly practical, and, indeed, it is now offered by several companies in the computer industry.

In 1982, a paper described one such (144,128) code, of the type known as a Kaneda-Fujiwara code, and it was built out of the same matrices that I noted in the example above; its structure was equivalent to:

I    I    I    I    I    I    I    I    0    0    0    0    0    0    0    0    
I    T    T^2  T^3  T^14 T^13 T^12 T^11 I    I    I    I    I    I    I    I    
T^14 T^13 T^12 T^11 I    T    T^2  T^3  I    T    T^2  T^3  T^14 T^13 T^12 T^11 
0    0    0    0    0    0    0    0    T^14 T^13 T^12 T^11 I    T    T^2  T^3  

T^14 T^13 T^12 T^11 I    T    T^2  T^3  I    T    T^2  T^3  T^14 T^13 T^12 T^11 
0    0    0    0    0    0    0    0    T^14 T^13 T^12 T^11 I    T    T^2  T^3  
I    I    I    I    I    I    I    I    0    0    0    0    0    0    0    0    
I    T    T^2  T^3  T^14 T^13 T^12 T^11 I    I    I    I    I    I    I    I    

although the illustration in the original paper had the check bits associated with the data bits in a somewhat different order. To avoid excessive width, the check bits were split into two parts; those for the first 64 data bits are described on the top four lines, and those for the last 64 data bits on the bottom four lines.

Newer memory modules use DRAM chips that are eight bits wide instead of four bits wide. It is still possible to correct errors in a single package, but it is no longer possible to guarantee to correct all double errors in a (144,128) code. Instead, it seems that a simple brute-force approach involving interleaving can be used; for example, one might begin a code with the following structure:

I 0 I   0       I   0   T 0 T^2 0       T^7 0
0 I 0   I       0   I   0 T 0   T^2     0   T^7
T 0 T^2 0   ... T^7 0   I 0 I   0   ... I   0
0 T 0   T^2     0   T^7 0 I 0   I       0   I

Since T, T^2, up through T^14 are all distinct, with T^15 being the identity matrix, it is immediately obvious that the first 112 bits of data are recoverable by a code of this form. But since it is possible that whichever 8-bit group is in error might have an error in only four bits, or the same error in the first four bits and the second four bits, it is not immediately clear how one could devise the rest of the code.

However, continuing like this:

0 T 0   T^2
I 0 I   0
T 0 T^2 0
0 I 0   I

might initially appear to work.

In the case where only one nybble is in error, the syndrome now has two nonzero nybbles that lie in opposite halves of its two bytes. That pattern is clearly distinguishable.

In the case where both nybbles have exactly the same error, the two nybbles in one half aren't the same kind of multiple of the nybbles in the other half.

But could there be cases where error correction would fail when the two nybbles in one byte have different errors? Say when one error happens to be a power of T times the other error?

If the second-last byte had an error of q in the first nybble, and Tq in the second nybble, then it is true that the fourth nybble of the syndrome would be T times the second nybble of the syndrome - just as it is for an error in the second nybble of the first byte.

However, the first nybble of the syndrome is also T times the third nybble of the syndrome, which is in the opposite direction.

So it does handle the case of two nybbles with different errors, unlike a continuation such as:

I 0
0 I
0 T
T 0

which obviously fails when both nybbles have the same error, or

I 0
0 I
T 0
0 T^7

which obviously fails when the error in one nybble is zero.

What about this possibility, which might be considered simpler:

I 0   I   0
0 I   0   I
0 T^7 0   T^6
T 0   T^2 0

This, also, obviously works in the two possibly pathological cases: if both nybbles have the same error, or if one has no error.

But suppose the error in the second nybble of the second-last byte is T^6 times the error in the first nybble of the second-last byte. Then the third nybble of the syndrome will be T times the first nybble of the syndrome, just as in the case of an error in the first nybble of the first byte.

But the fourth nybble is also T times the first nybble, and the second nybble is not the same as the first nybble, so this also is distinguishable. So it seems as though this continuation should also work.

A Minor Mystery

Incidentally, one might wonder why so many bits have to be added to a code to increase the minimum Hamming distance between codewords from an even number to the next higher odd number, but increasing from an odd number to the next even number requires only adding one parity bit, even when the code providing a minimum distance between codewords that is that odd number is perfect.

The following diagram was an attempt to exhibit the full structure of the simplest Hamming code, in all its seven-dimensional glory, so as to answer this; but I am afraid that it is not particularly clear, there not being enough sufficiently distinctive colors among the 256 Internet-safe colors for its purpose.

Recent Error-Correcting Codes

The field of error-correcting codes, like the field of data compression, is still a very active one, with new research being done and patents being issued on new methods.

Fire Codes

A class of error-correcting codes called "Fire codes" was devised by Philip Fire in 1959. One of the simplest of these could be used as a direct replacement for a parity bit, used with each character in many existing systems, providing error correction instead of just the detection of erroneous characters.

It worked like this:

Data   * - - - - - - - *    * - - * - - - - * - - *
bits   - * - - - - - - *    - * - - * - - - * - - *
       - - * - - - - - *    - - * - - * - - * - - *
       - - - * - - - - *    - - - * - - * - * - - *
       - - - - * - - - *    - - - - * - - * * - - *
       - - - - - * - - *    - - - - - * - - X - - *
       - - - - - - * - *    - - - - - - * - * * - *
       - - - - - - - * *    - - - - - - - * * - * *
Parity = = = = = = = = P    = = = = = = = = ! = = !

Each parity bit is the XOR of parity for the data byte with which it is associated and parity for staggered bits from the eight previous bytes. Essentially, it is a clever way to obtain the same results as are provided by the use of row and column parity bits. (After the last byte of data, one extra byte, with the extra parity bit, is needed to provide parity for all the remaining diagonals.)
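
As a rough sketch of the basic form, here is my own reconstruction from the diagram above; the exact stagger is a guess on my part, and the diagram admits a few equivalent indexing conventions:

def fire_checks(data):
    # data: a list of bytes, each a list of eight bits. Each check bit
    # is the parity of its own byte, XORed with one diagonally staggered
    # bit from each of the eight preceding bytes.
    checks = []
    for j, byte in enumerate(data):
        p = sum(byte) % 2                # parity of the associated byte
        for k in range(1, 9):            # diagonal through earlier bytes
            if j - k >= 0:
                p ^= data[j - k][k - 1]
        checks.append(p)
    return checks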

More complicated forms involved XORing these parity bits with the parity of another group of eight bits, staggered by being from bytes separated by a larger distance.

Turbo Codes

A more recent and exciting development in error-correcting codes is the class of codes known as "Turbo Codes". These were first presented by Berrou, Glavieux, and Thitimajshima in 1993.

As the first step in a Turbo Code, a copy of the data stream is divided into blocks of uniform length, and a known pseudorandom scrambling sequence is used to select at random a shuffle of the bits in each block.

The original data stream and the shuffled data stream are both fed separately into linear-feedback shift registers to produce a continuous stream of check bits.

A Turbo Code of rate 1/4 is obtained by sending all four sequences: the original data stream, the shuffled data stream, and the two sequences of check bits.

A Turbo Code of rate 1/3 is obtained by sending the original data stream and the two sequences of check bits.

A Turbo Code of rate 1/2 is obtained by using an independent pseudorandom scrambling sequence to select individual check bits from either of the two check bit sequences at random.
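
A greatly simplified sketch of the rate 1/3 arrangement follows; the two-stage recursive shift register and the fixed seed are toy stand-ins of my own devising, where a real Turbo Code would use carefully chosen recursive systematic convolutional encoders and interleavers:

import random

def check_stream(bits):
    # A toy recursive shift register producing one check bit per data bit.
    s1 = s2 = 0
    out = []
    for b in bits:
        fb = b ^ s1 ^ s2          # recursive feedback into the register
        out.append(fb ^ s2)       # the transmitted check bit
        s1, s2 = fb, s1
    return out

def turbo_rate_one_third(block, seed=1):
    order = list(range(len(block)))
    random.Random(seed).shuffle(order)     # the known pseudorandom shuffle
    shuffled = [block[i] for i in order]
    # Send the data once, plus both check-bit sequences.
    return block, check_stream(block), check_stream(shuffled)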

Since one could imagine circumstances where the random interleaving results in the data bits not being shuffled, or the selection of check bits at random provides parity for some data bits but not others, there is a temptation to modify the method somewhat from its ideal theoretical model.


[Next] [Up] [Previous] [Index]

Table of Contents
Home Page