Steganography

Explanation

Digital Steganography Basics

I am going to deal mainly with digital Steganography. Below is an explanation and discussion of how basic digital Steganography can be achieved. Those who prefer to see things in action can jump to the interactive example page, where an interactive example is shown if JavaScript is enabled in the browser.

The clearInfo will be the word Hi, and the coverData for this explanation will be a simple image 9 pixels by 1 pixel in size.

I am going to use a Steganography technique that causes a low amount of perceptible change in the coverData. An even lower amount of change would be to shift the value of a pixel by 1 in only one of the colour channels, though this would considerably increase the amount of coverData needed for a given amount of plainInfo, so I will change each channel by 1 or leave it as is.

The data will therefore be encoded with the coverData using a 1 or 0 approach (even or odd).

The data to be encoded will use a value range of 0 - 255 per character, which is 256 values, or 2^8 expressed as a power of 2. So, a binary representation of each character needs a maximum of 8 places. We have 2 characters, so we need a minimum of 16 places.

Each pixel in a raster image generally contains 3 pieces of information: the red, green, and blue values. With this technique each of those values carries 1 bit. To make things simpler I will encode each character over 3 pixels, which gives 9 slots, 1 more than is necessary to encode one ANSI character. The ninth slot can be used as an end of message identifier.

As there are two characters to encode, a total of 6 pixels is required. I will add another 3 pixels to demonstrate the static nature of the remaining pixels.
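The slot arithmetic above can be sketched in a few lines of Python. This is only a minimal sketch: the names CHANNELS_PER_PIXEL, PIXELS_PER_CHAR, BITS_PER_CHAR and pixels_needed are my own illustrative assumptions and do not come from any existing program.

```python
# Illustrative names only; they are assumptions made for this sketch.
CHANNELS_PER_PIXEL = 3   # red, green, blue
PIXELS_PER_CHAR = 3      # 3 pixels x 3 channels = 9 slots per character
BITS_PER_CHAR = 8        # values 0 - 255


def pixels_needed(message):
    """Return how many pixels this scheme needs to hold the message."""
    return len(message) * PIXELS_PER_CHAR


slots_per_char = CHANNELS_PER_PIXEL * PIXELS_PER_CHAR
spare_slots = slots_per_char - BITS_PER_CHAR  # the end of message slot

print(slots_per_char, spare_slots, pixels_needed("Hi"))  # 9 1 6
```

For the word Hi this gives 6 pixels, so the 9 pixel image leaves 3 pixels untouched.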

ClearInfo to PlainInfo

First we need to change clearInfo into plainInfo. In reality the word Hi, if stored digitally, is already in the plainInfo state and is just represented to us as H and i, but let's see what that internal representation looks like numerically.

In computing we use the term ordinal to refer to this internal numeric representation. H and i are from the ASCII set of symbols. I will use the Python interpreter to get these values, but nearly any programming language can be used to determine them. If Python is not handy, Perl could be used as well.

Python Interpreter

>>> ord('H')
72
>>> ord('i')
105

Perl command line

> perl -e 'print ord("H")."\n"'
72
> perl -e 'print ord("i")."\n"'
105

There are 128 ASCII characters, but most of the time we actually use an extension of ASCII called ANSI, which represents up to 256 characters.

256 as a power of 2 is 2^8. The largest value, 255, is expressed in binary as 8 ones (0 - 255 is 256 slots): 11111111.

H has an ANSI decimal value of 72. In base 2 (binary) this is 1001000, or 01001000 when padded to a length of 8.

i has an ANSI decimal value of 105. In binary this is 1101001, or 01101001 when padded to a length of 8.

ClearInfo and PlainInfo
Character ANSI Decimal ANSI Binary
Copyright Poised Solutions Ltd
H 72 01001000
i 105 01101001
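The table above can be reproduced with a short Python sketch. The function name to_plain_info is a hypothetical helper of my own, and the sketch assumes every character has an ordinal of 255 or less.

```python
def to_plain_info(clear_info):
    """Return (character, ordinal, 8-digit binary string) for each character.

    Assumes every ordinal fits in the range 0 - 255.
    """
    return [(ch, ord(ch), format(ord(ch), "08b")) for ch in clear_info]


for ch, decimal, binary in to_plain_info("Hi"):
    print(ch, decimal, binary)
# H 72 01001000
# i 105 01101001
```

The "08b" format pads the binary value with leading zeros to a length of 8, which is what the scheme relies on.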

The binary values are important: the encoding is done by taking each 0 or 1 and applying it to an evened out pixel channel value. This way the pixel information can be read back: if the value of a channel is even then the data is a 0, and if the value of the channel is odd then the data is a 1.
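Reading a bit back from a channel value comes down to a remainder check. A minimal sketch, where channel_bit is a hypothetical name chosen for illustration:

```python
def channel_bit(value):
    """Return 0 for an even channel value and 1 for an odd one."""
    return value % 2


print(channel_bit(254), channel_bit(255))   # 0 1
print([channel_bit(v) for v in (0, 1, 0)])  # [0, 1, 0]
```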

CoverData

The coverData is an image where the pixels are of the following Red Green Blue (RGB) values:

Pixel Values of the CoverData
Pixel Red Green Blue Colour
1 000 000 000 Black
2 255 000 000 Red
3 000 255 000 Green
4 000 000 255 Blue
5 255 255 000 Yellow
6 255 000 255 Pink
7 000 255 255 Cyan
8 255 255 255 White
9 128 000 255 Purple

An image made up of the pixels above looks like this:

coverData

This is quite small so I have enlarged the image 16 times.

coverData enlarged

Even the Payload Area

As this encoding technique uses the even or odd state of the channel value, the area in which the plainInfo will be encoded needs to be set to an even state.

0 is even, and the highest value, 255, is odd, so the algorithm should round down (floor) to the nearest even value. Rounding up would push 255 out of range.

The first six pixels will have their values evened out. This results in the coverData being changed to:

Pixel Values of an Evened Payload Area CoverData
Pixel Red Green Blue Colour
1 000 000 000 Black
2 254 000 000 Red
3 000 254 000 Green
4 000 000 254 Blue
5 254 254 000 Yellow
6 254 000 254 Pink
7 000 255 255 Cyan
8 255 255 255 White
9 128 000 255 Purple
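The evening step can be sketched in Python as follows. This is a sketch under my own assumptions: pixels are held as (red, green, blue) tuples, and the name even_payload is hypothetical rather than taken from any existing application.

```python
# The nine cover pixels from the first table, as (red, green, blue) tuples.
cover = [
    (0, 0, 0), (255, 0, 0), (0, 255, 0),
    (0, 0, 255), (255, 255, 0), (255, 0, 255),
    (0, 255, 255), (255, 255, 255), (128, 0, 255),
]


def even_payload(pixels, payload_pixels):
    """Floor every channel of the first payload_pixels pixels to an even value."""
    evened = [tuple(v - (v % 2) for v in px) for px in pixels[:payload_pixels]]
    # Pixels outside the payload area are passed through unchanged.
    return evened + list(pixels[payload_pixels:])


for px in even_payload(cover, 6):
    print(px)
```

Subtracting the remainder leaves even values alone and drops odd values by 1, so 255 becomes 254 and 0 stays 0.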

The evened out payload area coverData image looks like this:

coverData with even payload area

Here it is again enlarged 16 times.

coverData enlarged area

The image is virtually indistinguishable from the original, and this is the same level of change that the eventual StegData image will show from this one (or really from the original image itself).
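How small the change is can be checked numerically. The following sketch compares the six payload pixels before and after evening, using the values from the two tables; the variable names are illustrative only.

```python
original = [(0, 0, 0), (255, 0, 0), (0, 255, 0),
            (0, 0, 255), (255, 255, 0), (255, 0, 255)]
evened = [(0, 0, 0), (254, 0, 0), (0, 254, 0),
          (0, 0, 254), (254, 254, 0), (254, 0, 254)]

# Absolute difference for every channel of every payload pixel.
diffs = [abs(a - b)
         for before, after in zip(original, evened)
         for a, b in zip(before, after)]

print(max(diffs))                  # 1
print(sum(1 for d in diffs if d))  # 7 channels changed out of 18
```

No channel moves by more than 1 out of 255, which is why the eye cannot pick out the difference.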

Those who are observant will probably have already started to notice ways in which any particular image could be graded as to the possibility of Steganography being used with it. It has to be remembered, though, that this is just an example which deliberately simplifies the process to aid understanding. There are many ways to encode, and certain coverData is better than others. The real flaw in this particular encoding system, though, lies in the use of the stop indicator, which we will get to shortly.

Encoding PlainInfo into CoverData - Generating the StegData

With the payload area of the coverData now evened out, we can simply encode the value of Hi into the area.

To do this we look at the binary value of H, then i, and take each individual 0 or 1 in turn. If a 1 is present we add 1 to the pixel channel value (making it odd); if a 0 is present we leave the pixel channel value as is (even).

The 9th channel value of each group of 3 pixels is left even (remember they are even already), unless the last character is being encoded, in which case we add 1 to it (making it odd).

I have added the binary values of H and i plus the evenness of the channel value to the image table, and the resulting StegData image has channel values of:

Pixel Values of the StegData Image
Pixel Red Green Blue Colour PlainInfo Evenness
1 000 001 000 Black 010 EOE
2 254 001 000 Red 010 EOE
3 000 254 000 Green 00 EEE
4 000 001 255 Blue 011 EOO
5 254 255 000 Yellow 010 EOE
6 254 001 255 Pink 01 EOO
7 000 255 255 Cyan --- ---
8 255 255 255 White --- ---
9 128 000 255 Purple --- ---
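The encoding described above can be sketched in Python. This is a minimal sketch, not a definitive implementation: the function name encode is hypothetical, pixels are assumed to be (red, green, blue) tuples, the payload area is assumed to be evened already, and the image is assumed to hold at least 3 pixels per character.

```python
def encode(message, evened_pixels):
    """Hide message in pixels whose payload area is already even."""
    channels = [v for px in evened_pixels for v in px]  # flatten to one list
    for index, ch in enumerate(message):
        bits = format(ord(ch), "08b")
        # Ninth slot: 1 only on the last character, marking the end.
        bits += "1" if index == len(message) - 1 else "0"
        start = index * 9
        for offset, bit in enumerate(bits):
            channels[start + offset] += int(bit)  # even + 1 becomes odd
    # Regroup the flat list into (red, green, blue) tuples.
    return [tuple(channels[i:i + 3]) for i in range(0, len(channels), 3)]


evened = [
    (0, 0, 0), (254, 0, 0), (0, 254, 0),
    (0, 0, 254), (254, 254, 0), (254, 0, 254),
    (0, 255, 255), (255, 255, 255), (128, 0, 255),
]
for px in encode("Hi", evened):
    print(px)
```

Run against the evened pixels, the sketch reproduces the channel values in the StegData table, and the last 3 pixels come back unchanged.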

For easier comparison here is the PlainInfo table again:

ClearInfo and PlainInfo
Character ANSI Decimal ANSI Binary
H 72 01001000
i 105 01101001

The StegData image looks like this:

StegData

Again enlarged 16 times.

StegData enlarged area

To compare against the original, the two images are shown unlabelled, first at actual size and then enlarged 16 times.

If you wish to try guessing which is the original and which is the StegData image, there is a JavaScript Steganography guessing game you can play.

Decoding

Decoding is quite simple; in fact, I have already shown how to decode when encoding.

Look back at the encoding: the evenness column contains the data, where E is 0 and O is 1. Every ninth channel value is the end of message slot; when that value is odd, the message is at an end.

Decoding on evenness

EOEEOEEE/E EOOEOEEO/O
01001000   01101001

Remember to look at the evenness of the pixel channel values not the PlainInfo, but do note they are the same.

The only thing left to do is to convert the binary representation back into an ANSI character.

Python Interpreter

>>> chr(int('1001000', 2))
'H'
>>> chr(int('1101001', 2))
'i'
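The whole decoding step can be sketched as one Python function. As before this is a sketch under assumptions of my own: decode is a hypothetical name, and pixels are assumed to be (red, green, blue) tuples holding the StegData channel values from the table.

```python
def decode(pixels):
    """Read 9 channel values per character until the ninth one is odd."""
    channels = [v for px in pixels for v in px]
    message = ""
    for start in range(0, len(channels) - 8, 9):
        group = channels[start:start + 9]
        bits = "".join(str(v % 2) for v in group[:8])  # even -> 0, odd -> 1
        message += chr(int(bits, 2))
        if group[8] % 2 == 1:  # odd ninth slot marks the end of the message
            break
    return message


steg = [
    (0, 1, 0), (254, 1, 0), (0, 254, 0),
    (0, 1, 255), (254, 255, 0), (254, 1, 255),
    (0, 255, 255), (255, 255, 255), (128, 0, 255),
]
print(decode(steg))  # Hi
```

The loop stops at the sixth pixel, so the last 3 pixels are never read.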

Conclusion

This explanation was written mainly by hand; the only computed elements were the clearInfo to plainInfo ANSI values. Doing it by hand first is a good starting point for gaining insight into Steganography in the digital field. The real power of digital steganography, though, comes when a program is used to do the encoding and decoding. There is a source code walkthrough page that may be of interest to further understand Steganography.

Encoding often uses basic binary operations, which can appear quite magical. In essence, though, they are fast ways to do basic maths. A compiler or interpreter will often reduce an operation to a binary equivalent if one is spotted in the source, but not always, so it is useful to know these tricks. I have compiled a few tricks at a binary magick site, which could prove useful in understanding how Steganography and encryption algorithms work.
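As an illustration of the kind of binary operation meant here, the following Python sketch shows bitwise equivalents of the even and odd arithmetic used in this explanation. The function names are hypothetical and chosen only for this example.

```python
def hidden_bit(value):
    return value & 1                 # same result as value % 2


def force_even(value):
    return value & ~1                # clear the lowest bit (floor to even)


def force_odd(value):
    return value | 1                 # set the lowest bit


def bit_at(value, position):
    return (value >> position) & 1   # position 0 is the lowest bit


print(hidden_bit(255), force_even(255), force_odd(254))  # 1 254 255
print([bit_at(72, p) for p in range(7, -1, -1)])  # [0, 1, 0, 0, 1, 0, 0, 0]
```

The last line pulls the 8 binary places of H (72) out one at a time, highest place first.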

If you just want a basic Steganography application to encode messages into imagery, there is a basic application available at the download page. Please read the licence though.

The use of an odd value in the ninth slot as a stop indicator is an obvious weakness in the Steganography encoding algorithm above. To detect whether an image has a high chance of containing an encoded message, i.e. whether an image is a StegData image, an application could be designed to look at every ninth channel value of an image. If those values were of the nature E E E... O, the probability of such a pattern occurring by chance would be small enough to warrant a look. Even, if you forgive the pun, with the situation simply reversed to O O O... E, the chance of such a pattern occurring would again be small enough to arouse suspicion. So, good Steganography algorithms should be varied and give some weight to the pattern generation of their footprint. Of course, encryption should also be used to keep the PlainInfo as CipherData.
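A check of this kind could be sketched in Python as follows. The names leading_even_run and chance_by_accident are hypothetical, the input is assumed to be a flat list of channel values, and the probability estimate rests on a loud assumption: that in an untouched image each ninth value is even or odd with equal, independent chance, which real images need not satisfy.

```python
def leading_even_run(channels):
    """Count even ninth-slot values before the first odd one (None if no odd)."""
    run = 0
    for value in channels[8::9]:
        if value % 2 == 1:
            return run
        run += 1
    return None


def chance_by_accident(run):
    """Chance of `run` evens then an odd, ASSUMING random, independent parity."""
    return 0.5 ** (run + 1)


# Flat channel values of the nine pixel StegData example.
steg = [0, 1, 0, 254, 1, 0, 0, 254, 0,
        0, 1, 255, 254, 255, 0, 254, 1, 255,
        0, 255, 255, 255, 255, 255, 128, 0, 255]
print(leading_even_run(steg))  # 1

# A made-up 20 character payload: 19 even ninth slots, then one odd.
long_message = [0] * 9 * 19 + [0] * 8 + [1]
run = leading_even_run(long_message)
print(run, chance_by_accident(run))
```

For the two character example the run is only 1, which proves nothing; the untouched coverData in this explanation happens to give the same result. It is with longer messages that the pattern becomes too unlikely to be an accident.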


