Steganography: Hiding Data in Images
Steganography is the art and science of hiding messages, files and other data in such a way that no one except the recipient suspects they even exist. Jump to the tutorial further down to learn how to hide text or any file (even an executable) in a picture using the Linux and Windows command line.
Steganography has a long and interesting history. I’m just going to share two old examples as an introduction to the topic.
The first recorded example of steganography comes from Ancient Greece and was written down by Herodotus around 440 BC. Histiaeus shaved his slave’s head and tattooed a message on the scalp. When the hair regrew, the slave was sent to deliver it, and the recipient was told: “When my messenger comes, shave his head and look there.”
Another one I like is a message from a prisoner of war in 1941. He stitched messages in Morse code into a piece of needlework.
In case you were wondering, the messages are “fuck hitler” and “God Save the King”.
Images are the most common digital medium, but there are many others. You can hide data in music files, and if you have a lot of data, videos might be a good choice.
Hiding Data in Images
This procedure is tightly limited by size: you obviously can’t put 1 MB of information into a 4 KB picture. Putting 4 KB of information into that picture would totally ruin it, but with the right amount, the change will not be detectable by the human eye.
The simplest algorithm is to change the least significant bit of each pixel. A common way of saving pictures is to describe each pixel with a number. Some formats use 16 bits for each pixel, which represent its exact color. Changing the last bit of each of those won’t affect the pixel’s color enough to be noticeable. Only if we’re too greedy and try to hide too much data in a small medium might we ruin the picture. With one hidden bit per 16-bit pixel, the capacity is only one sixteenth of the uncompressed pixel data, so squeezing 20 KB of data into a 60 KB picture means overwriting several bits per pixel and will cause significant distortion.
This algorithm is easy to detect if you know what to look for, but there are many other algorithms to explore to suit your needs.
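The least-significant-bit idea described above can be sketched in a few lines of Python. This is only an illustration under some assumptions: the "image" is a plain sequence of raw channel bytes rather than a real picture file, and the function names embed_lsb and extract_lsb are made up for this example.

```python
def embed_lsb(pixels: bytes, secret: bytes) -> bytes:
    """Hide `secret` in the lowest bit of each byte of `pixels`."""
    # One cover byte carries one secret bit, so 8 cover bytes are needed per secret byte.
    if len(secret) * 8 > len(pixels):
        raise ValueError("cover is too small for this secret")
    out = bytearray(pixels)
    for i, byte in enumerate(secret):
        for bit in range(8):
            # Walk the bits of each secret byte from most to least significant.
            value = (byte >> (7 - bit)) & 1
            pos = i * 8 + bit
            # Clear the lowest bit of the cover byte, then set it to the secret bit.
            out[pos] = (out[pos] & 0xFE) | value
    return bytes(out)


def extract_lsb(pixels: bytes, length: int) -> bytes:
    """Read `length` bytes back out of the lowest bits of `pixels`."""
    out = bytearray()
    for i in range(length):
        byte = 0
        for bit in range(8):
            byte = (byte << 1) | (pixels[i * 8 + bit] & 1)
        out.append(byte)
    return bytes(out)


# A stand-in "image": 64 raw channel values (a real one would come from an image library).
cover = bytes(range(100, 164))
stego = embed_lsb(cover, b"hi")
recovered = extract_lsb(stego, 2)
# No cover value moves by more than 1, which is why the change is hard to see.
max_change = max(abs(a - b) for a, b in zip(cover, stego))
print(recovered, max_change)
```

Note that this only survives in formats that store pixel values exactly. Lossy compression such as JPEG does not preserve the lowest bits of raw pixel values, so a sketch like this would have to work on a lossless format.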
Linux and Windows Tutorial
Put whatever you’re hiding in one folder or choose just one file. I’m hiding a text file called secret in beach.jpg. The Linux tool is called steghide and it’s awesome. Its features include compression of the embedded data, encryption of the embedded data and automatic integrity checking using a checksum.
There are no restrictions on the type of secret.
Steghide keeps its changes to the image small enough that they aren’t visible; you can see more details below. The main point is that the secret has to fit: if it’s too big for the cover image, steghide reports that the cover file is too short instead of embedding it.
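For reference, here is a small sketch of how the steghide calls might be put together, written in Python so it can be scripted. The embed and extract commands and the -cf, -ef, -sf and -p options are steghide's own; the file name secret.txt, the passphrase and the helper function names are assumptions made up for this example.

```python
import shutil
import subprocess
from typing import List


def embed_command(cover: str, secret: str, passphrase: str) -> List[str]:
    # Shell equivalent: steghide embed -cf beach.jpg -ef secret.txt -p PASSPHRASE
    # Without an extra -sf option, steghide writes the result into the cover file itself.
    return ["steghide", "embed", "-cf", cover, "-ef", secret, "-p", passphrase]


def extract_command(stego: str, passphrase: str) -> List[str]:
    # Shell equivalent: steghide extract -sf beach.jpg -p PASSPHRASE
    return ["steghide", "extract", "-sf", stego, "-p", passphrase]


def run_if_installed(command: List[str]) -> bool:
    """Run the command only when steghide is available on this machine."""
    if shutil.which(command[0]) is None:
        print("steghide is not installed; would run:", " ".join(command))
        return False
    subprocess.run(command, check=True)
    return True


# Hypothetical file names and passphrase, for illustration only.
embed = embed_command("beach.jpg", "secret.txt", "my passphrase")
extract = extract_command("beach.jpg", "my passphrase")
print(" ".join(embed))
print(" ".join(extract))
```

Passing the passphrase with -p is convenient for scripts, but it can show up in the shell history and process list. If you leave -p out, steghide asks for the passphrase interactively.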
On Windows, the command line lets you add data to a picture without changing the picture data at all. The command below simply appends an archive to the end of the image file, so the file gets bigger but still displays the same.
Copy /b image.jpg + hide.rar output.jpg
This creates output.jpg, whose size is equal to the size of image.jpg plus the size of hide.rar. Double-clicking the picture opens the same picture, but a recipient who knows about it can open the file with an archive manager. This is handy, but not as cool as steghide’s automatic compression, encryption and awesome algorithm.
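To see what the copy /b trick does to the bytes, here is a rough Python equivalent. It is a sketch under a few assumptions: the "picture" is a handful of stand-in bytes rather than a real JPEG, a ZIP archive is used instead of RAR because Python's standard library can create one, and append_file is a name invented for this example.

```python
import os
import tempfile
import zipfile


def append_file(image_path: str, archive_path: str, output_path: str) -> None:
    """Write the image bytes followed by the archive bytes, like copy /b a + b out."""
    with open(output_path, "wb") as out:
        for path in (image_path, archive_path):
            with open(path, "rb") as src:
                out.write(src.read())


with tempfile.TemporaryDirectory() as tmp:
    image = os.path.join(tmp, "image.jpg")
    archive = os.path.join(tmp, "hide.zip")
    output = os.path.join(tmp, "output.jpg")

    # Stand-in bytes, not a real picture: a JPEG starts with FF D8 and ends with FF D9.
    with open(image, "wb") as f:
        f.write(b"\xff\xd8" + b"\x00" * 100 + b"\xff\xd9")
    with zipfile.ZipFile(archive, "w") as z:
        z.writestr("secret.txt", "meet at noon")

    append_file(image, archive, output)

    # The output is exactly as big as the two inputs together.
    sizes_add_up = os.path.getsize(output) == os.path.getsize(image) + os.path.getsize(archive)
    # The archive is still readable even though image bytes come before it.
    with zipfile.ZipFile(output) as z:
        hidden = z.read("secret.txt")

print(sizes_add_up, hidden)
```

This works because archive readers can typically locate the archive even when other bytes come before it, while an image viewer stops reading once the picture data ends.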
Steghide embedding algorithm
This section is quoted from the steghide man page:
The embedding algorithm roughly works as follows: At first, the secret data is compressed and encrypted. Then a sequence of positions of pixels in the cover file is created based on a pseudo-random number generator initialized with the passphrase (the secret data will be embedded in the pixels at these positions). Of these positions those that do not need to be changed (because they already contain the correct value by chance) are sorted out. Then a graph-theoretic matching algorithm finds pairs of positions such that exchanging their values has the effect of embedding the corresponding part of the secret data. If the algorithm cannot find any more such pairs all exchanges are actually performed. The pixels at the remaining positions (the positions that are not part of such a pair) are also modified to contain the embedded data (but this is done by overwriting them, not by exchanging them with other pixels). The fact that (most of) the embedding is done by exchanging pixel values implies that the first-order statistics (i.e. the number of times a color occurs in the picture) is not changed.
For audio files the algorithm is the same, except that audio samples are used instead of pixels.
The default encryption algorithm is Rijndael with a key size of 128 bits (which is AES – the advanced encryption standard) in the cipher block chaining mode. If you do not trust this combination for whatever reason feel free to choose another algorithm/mode combination (information about all possible algorithms and modes is displayed by the encinfo command). The checksum is calculated using the CRC32 algorithm.
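The point about first-order statistics can be shown with a toy example. This is not steghide's matching algorithm, just a sketch with made-up pixel values that compares exchanging two values with overwriting one.

```python
from collections import Counter

# Made-up pixel values for illustration.
pixels = [12, 13, 12, 40, 41, 13, 12, 41]

# Exchanging two values flips the low bit stored at both positions,
# but every value still occurs exactly as often as before.
swapped = list(pixels)
swapped[0], swapped[1] = swapped[1], swapped[0]

# Overwriting the low bit of one value changes how often values occur.
overwritten = list(pixels)
overwritten[0] = overwritten[0] ^ 1  # 12 becomes 13

same_after_swap = Counter(swapped) == Counter(pixels)
same_after_overwrite = Counter(overwritten) == Counter(pixels)
print(same_after_swap, same_after_overwrite)
```

After the swap, positions 0 and 1 carry different low bits than before, yet a histogram of the picture looks identical. After the overwrite, the histogram has shifted, which is the kind of trace a statistical check can pick up.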
Watermarks for Copyright Holders
Unlike encryption, steganography tries to hide the fact that there is a message at all. This key property is very useful not only for delivering messages, but for copyright holders too. Using these techniques, a seller can embed each buyer’s name (a watermark) into the digital product being sold, and when someone leaks it on the Internet, that buyer has some explaining to do.
If it’s done with a good algorithm, it’s extremely difficult to figure out that there is a hidden message, and even harder if you use unique pictures. It is recommended to use pictures you took yourself, because no one will have the original for comparison. Of course, compression and encryption are your friends as well.