Steganography – what are you actually seeing/listening to?

February 25, 2016

We come into contact with numerous digital files every day, but some of them might not be what they seem at first glance. Steganography, the practice of concealing a message inside a different cover message, predates the computer era. A sub-section of cryptography, it got its name in 1499, when a German Benedictine abbot wrote a treatise of the same name, but the practice dates back to ancient Greece (as its Greek-derived name suggests).

Nowadays, steganography is an important research topic in cyber-security and is closely tied to computer forensics. Concealing information within computer files allows sensitive or illegal data to travel undetected inside text, audio or image documents. The strategy relies on alterations so subtle that they usually go undetected unless specialists deliberately check for them.

The main trait of steganography, compared with cryptography in general, is its capacity to remain inconspicuous. Since the cover message does not look out of the ordinary in any way, it does not raise suspicion.

Methods and classification in steganography

Gary Kessler’s 2001 paper on steganography lists the following as established methods:

  • Hidden text within webpages;
  • Hidden documents that appear to be common OS files, such as Win system32 files;
  • Null ciphers;
  • Covert channels.
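
To make the null cipher entry above concrete, here is a minimal Python sketch. It assumes the simplest possible rule, in which the hidden message is spelled by the first letter of each word of the cover text; real null ciphers may use any agreed rule. The function name `decode_null_cipher` and the sample sentence are hypothetical, invented only for this illustration.

```python
def decode_null_cipher(cover_text: str) -> str:
    """Recover a message hidden as the first letter of every word.

    This is only one assumed rule; sender and recipient could agree on
    any other (every third letter, last letter of each line, and so on).
    """
    return "".join(word[0] for word in cover_text.split()).lower()


# Hypothetical cover text that reads as an ordinary sentence.
cover = "Meet every evening tonight; always travel near our old neighbour"
hidden = decode_null_cipher(cover)
print(hidden)
```

The cover sentence carries no obviously altered characters; the message exists only for a reader who knows which letters to pick.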

The same paper also lists the generic formula for steganography activities (cover_medium + hidden_data + stego_key = stego_medium), where the cover medium is the straightforward data as it appears to the eye/ear of any computer user.
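
The formula above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the method described in the paper: the cover medium is a plain byte string, the hidden data is written into the least significant bit of selected cover bytes, and the stego key is an integer that seeds a pseudo-random choice of positions. The names `embed`, `extract` and `_positions` are hypothetical.

```python
import random


def _positions(cover_len: int, n_bits: int, stego_key: int) -> list[int]:
    """Choose n_bits distinct cover positions, fully determined by the key."""
    if n_bits > cover_len:
        raise ValueError("cover medium too small for the hidden data")
    # Note: random.Random is not cryptographically secure; it only
    # illustrates the role of the key in this sketch.
    return random.Random(stego_key).sample(range(cover_len), n_bits)


def embed(cover_medium: bytes, hidden_data: bytes, stego_key: int) -> bytes:
    """cover_medium + hidden_data + stego_key -> stego_medium."""
    bits = [(byte >> i) & 1 for byte in hidden_data for i in range(7, -1, -1)]
    stego = bytearray(cover_medium)
    for pos, bit in zip(_positions(len(stego), len(bits), stego_key), bits):
        stego[pos] = (stego[pos] & 0xFE) | bit  # overwrite the lowest bit only
    return bytes(stego)


def extract(stego_medium: bytes, n_bytes: int, stego_key: int) -> bytes:
    """Read the hidden bytes back; requires the same key and the length."""
    positions = _positions(len(stego_medium), n_bytes * 8, stego_key)
    bits = [stego_medium[pos] & 1 for pos in positions]
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)


cover = bytes(range(256))  # stand-in for real image or audio samples
stego = embed(cover, b"hi", stego_key=1234)
print(extract(stego, 2, stego_key=1234))
```

Each cover byte changes by at most one unit, which is why the stego medium still looks like the cover medium to a casual observer. In this sketch the recipient must know the message length in advance; a practical tool would typically store it in a header.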

The hidden message itself can be plaintext, ciphertext or an image. Steganography can therefore be used alone or combined with supplementary encryption.

Steganography tools are currently available online, some of them for free, such as Steghide, Crypture, or rSteg. Others require a license, such as Matlab, which users can employ to encode secret images into a cover image.

Nevertheless, covert messages are most commonly associated with malicious entities, and law enforcement agencies train to identify and recover the real core message in such cases.

Another classification distinguishes four types of steganographic material: digital, social, network and printed.

In the digital version, the sender can choose to render some of the characters in a color similar to the background, to substitute Unicode characters for the expected ASCII characters, or to deliver the message via apparent errors: extra spaces, redundant markup or the non-printing Unicode characters Zero-Width Joiner (ZWJ) and Zero-Width Non-Joiner (ZWNJ).
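
The zero-width character technique can be sketched as follows. This is a minimal illustration, assuming an arbitrary convention in which ZWNJ (U+200C) stands for bit 0 and ZWJ (U+200D) for bit 1, and in which the invisible payload is simply appended to the cover text; the function names `hide` and `reveal` are hypothetical.

```python
ZWNJ = "\u200c"  # Zero-Width Non-Joiner, used here for bit 0 (assumed convention)
ZWJ = "\u200d"   # Zero-Width Joiner, used here for bit 1 (assumed convention)


def hide(cover_text: str, secret: str) -> str:
    """Append the secret to the cover text as invisible characters."""
    bits = "".join(f"{byte:08b}" for byte in secret.encode("utf-8"))
    payload = "".join(ZWJ if bit == "1" else ZWNJ for bit in bits)
    return cover_text + payload


def reveal(stego_text: str) -> str:
    """Collect the zero-width characters and turn them back into text."""
    bits = "".join(
        "1" if ch == ZWJ else "0" for ch in stego_text if ch in (ZWJ, ZWNJ)
    )
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


visible = "Nothing to see here."
stego = hide(visible, "key")
print(stego)                      # typically renders the same as the cover text
print(len(visible), len(stego))   # the lengths differ
print(reveal(stego))
```

Comparing string lengths, or searching for the two code points, is enough to expose this naive variant, which is one reason such messages are usually spread through the text rather than appended.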

Social steganography hides messages in the title, context, images or spelling errors of popular social media items, which yield an alternate meaning when decoded properly.

Network steganography comprises a broad array of methods and techniques: messages may be concealed in Voice-over-IP conversations, in unused header fields or in other corrupted network elements.

Printed steganography concerns the technique of formatting a text so it will only print partially and selectively, revealing the hidden message. The cover text can be preset to print as the real message or it might depend on a key for recovering the real text.

The types of steganography most commonly listed by online sources are:

  • Text-based steganography;
  • Audio steganography (least significant bit substitution, also called LSB manipulation, phase coding and echo hiding);
  • Image steganography (LSB manipulation applied to pixels, hiding data in image edges and other methods of embedding payload);
  • Video steganography (hiding information in video formats by altering discrete cosine transform (DCT) values);
  • Reference links steganography (using hyperlinks in texts without marking the words that contain hyperlinks);
  • Source code steganography (hiding information in the comments of a webpage’s source code, in various coding languages).

Efficiency factors in steganography

The paper quoted above also notes that the effectiveness of a steganographic method is measured by a few factors:

  • Robustness (the capacity to preserve the hidden message regardless of the transformations applied to the stego-medium);
  • Imperceptibility (the stego-medium’s capacity of not drawing attention);
  • Payload capacity (how much hidden information fits into the cover file);
  • Various specific functions/ratios: Peak Signal to Noise Ratio or PSNR (the ratio between the maximum possible power of a signal and the power of the corrupting noise that affects the fidelity of its representation); Mean Square Error or MSE (the average squared difference between the reference image and the distorted image) and Signal to Noise Ratio or SNR (the ratio between signal power and noise power).
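
The three ratios can be computed in a few lines of Python. The sketch below uses the standard textbook definitions, expressed in decibels for PSNR and SNR, rather than formulas taken from the paper. It assumes 8-bit samples (maximum value 255) supplied as flat lists of equal length; the function names and the sample values are hypothetical.

```python
import math


def mse(reference, distorted):
    """Mean Square Error: average squared difference between samples."""
    return sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)


def psnr(reference, distorted, max_value=255):
    """Peak Signal to Noise Ratio in dB; infinite for identical inputs."""
    error = mse(reference, distorted)
    if error == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / error)


def snr(reference, distorted):
    """Signal to Noise Ratio in dB: signal power over noise power."""
    noise = sum((r - d) ** 2 for r, d in zip(reference, distorted))
    signal = sum(r ** 2 for r in reference)
    return math.inf if noise == 0 else 10 * math.log10(signal / noise)


# Hypothetical pixel values: two samples had their lowest bit flipped.
reference = [100, 150, 200, 250]
distorted = [101, 150, 201, 250]
print(mse(reference, distorted))
print(round(psnr(reference, distorted), 2))
print(round(snr(reference, distorted), 2))
```

A higher PSNR means the stego-medium is closer to the cover medium, so embedding schemes that touch only the lowest bits tend to score well on imperceptibility.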

Steganalysis

Steganalysis is the activity of analyzing digital files to determine whether they hide secret messages and, if so, to investigate the methods employed, the content of the hidden messages and the identity of the sender and the recipients. It is usually performed by specialized law enforcement entities.

The challenges inherent to these activities are listed by an academic online source as follows:

  • The initial question of whether a digital file is concealing a secret message or not (easier when there are reference data streams available to study in comparison);
  • The encryption challenge involved when steganography is associated with encrypted secret messages that also need to be deciphered;
  • The challenge of purposely embedded irrelevant data, intended to make the decoding process more difficult;
  • The challenge of irretrievable hidden messages: when the message cannot be recovered, there is no proof and the file cannot be labeled malicious.

Further insights into steganalysis come from descriptions of the tools, of the types of attacks researchers usually encounter and of the detection methods. When examining a suspect computer, for example, researchers look for steganography software or tools, whose presence can provide relevant clues about what to search for next.

Steganography – an upgrade

When this field meets developments in Artificial Intelligence (AI), the concealment and detection mechanisms move to another level. Experiments with this type of upgrade are available online, and some mention the notion of highly undetectable steganography (HUGO).

Experiments with machine learning and neural networks take the concealed-message paradigm as previously known and make it more efficient. Machine learning software can produce better steganographic carrier messages, compress the hidden data better or allow a single embedding key to be used for various cover sources.

Multi-layered messages crafted by neural networks count as neural-based steganography. Exploring the possibilities of this kind of algorithm-based concealment, researchers have experimented with image and audio steganography.

Since the sheer number of samples that steganalysts must examine is among the challenges in this field, using AI to go through all the potential cover messages simplifies the researcher’s work. Known as “pooled steganalysis”, the practice of increasing the chances of finding hidden data by going through as many files as possible is better suited to machines than to humans, as long as the correct algorithm guides the search.