The preceding section described a way of preparing the output from encipherment for transmission. There, the purpose was simply to allow the text to be transmitted properly.
Another form of ciphertext preparation, which can be considered an encipherment method in itself, is designed to conceal another secret: that there even is an encrypted message being stored or transmitted.
Many forms of steganography were devised in the era of paper-and-pencil encryption.
The first method listed above can lead to a stilted cover message that obviously has something wrong with it. Thus, improved forms of this method that involve concealing something subtler than a letter of the message in each word of the cover message have been devised.
As we saw in the section on fractionation, one can encode the letters of the alphabet as follows:
A 111 D 121 G 131 J 211 M 221 P 231 S 311 V 321 Y 331 B 112 E 122 H 132 K 212 N 222 Q 232 T 312 W 322 Z 332 C 113 F 123 I 133 L 213 O 223 R 233 U 313 X 323
and so convert a message composed of letters into one composed only of the digits 1 through 3. Then, a word of 1 or 4 or 7 syllables could stand for 1, a word of 2 or 5 syllables for 2, a word of 3 or 6 syllables for 3. This system, among others, is discussed in Helen Fouché Gaines' [Elementary] Cryptanalysis, which begins by discussing such systems, continues with transposition ciphers, and only afterwards introduces substitution ciphers. This is, in a way, a logical order for introducing the subject, since with each step the letters of the original message become less accessible. (But on the other hand, the case of simple substitution, because it is so familiar, is one that is attractive to discuss first.)
And, of course, a letter can also be converted into five digits which are all either 0 or 1. Long before Èmile Baudot, using a and b as his two symbols, Sir Francis Bacon discussed such a representation of the alphabet specifically for the purpose of steganography.
If we're discussing binary coding, we may as well address the computer era.
The fact that images can be usefully subjected to lossy compression methods has suggested that extra information could be concealed in them.
One popular format for storing pictures on a computer involves first determining 256 colors, each one expressed as three one-byte values for each of its Red, Green, and Blue components, which are optimal for rendering the image with as little change as possible, and then providing first a table of the 256 colors used, and then the image with one byte indicating the color from that table to be used for each pixel. The image in that form is compressed using a form of lossless compression.
A popular steganography program conceals a bit in every pixel by ordering the 256 colors so that similar colors are adjacent to each other in the table, and then modifying each pixel so that the least significant bit of the color code for that pixel has the right value.
For other forms of image encoding, one might wish to place one's message more directly in the image itself. One way of doing this might be as follows:
Compress the image using a lossy technique such as JPEG. Then, modify the original image so that the differences between it and the compressed image, on a pixel-by-pixel basis, contain the message.
Since the idea will be to send only the modified image, it will be necessary to check that the modified image, when compressed by the same technique, produces a compressed form which has the same differences from the modified image as the compressed form of the original image did. This requires a coding which does not disturb the compressed image excessively, and it requires a coding which is flexible, so that different ways of representing the hidden message are possible.
An example of the kind of coding that might work is noted below. 0 represents an unchanged pixel, + one that is made brighter, - one that is made less bright.
00000000 null 0000-00+ 00110 000-00+0 01110 0000+00- 01111 000+00-0 01111 000000-+ 00000 0000-0+0 01000 000-0+00 10000 000000+- 00001 0000+0-0 01001 000+0-00 10001 00000-0+ 00010 0000-+00 01010 000-+000 10010 00000+0- 00011 0000+-00 01011 000+-000 10011 00000-+0 00100 000-000+ 01100 00-0000+ 10100 00000+-0 00101 000+000- 01101 00+0000- 10101 0000--++ 00000000 000-0-++ 00000110 0000-+-+ 00000001 000-0+-+ 00000111 0000-++- 00000010 000-0++- 00001000 0000+--+ 00000011 000+0--+ 00001001 0000+-+- 00000100 000+0-+- 00001010 0000++-- 00000101 000+0+-- 00001100
The table is not complete. First, one can choose to leave all pixels undisturbed if necessary, and in that case one skips that group of eight pixels without encoding any data. If one can alter two pixels, and only combinations where the net alteration balances out are shown, then there are 56 possibilities, so five bits can be coded. If one can alter four pixels without altering the compressed form of the image, then there are 420 possibilities, so eight bits can be coded.
If JPEG compression does not work well to form a baseline image, one can simplify calculations by using a very simple lossy compression algorithm: averaging adjacent pixels in pairs.
Of the RGB components of an image, the Green component contributes the most to the brightness, and the Blue component the least, so using the blue part of the pixel to encode information makes sense.
Note that, due to the current interest in watermarking images, it is entirely likely that the technique described above may be covered by someone's patent at this time.
Table of Contents