next up previous contents
Next: 8. Towards Coding in Up: Towards Linguistic Steganography: A Previous: 6. Lessons Learned   Contents

Subsections

7. Towards Secure and Robust Mixed-Radix Replacement-Coding

Secure and Robust Coding

7.1 Blocking Choice-Configurations

If we are not satisfied with establishing robustness implicitly, similarly to how Atallah did, error-correction quickly becomes an issue. Error-correcting codes would enable us to actively reconstruct data after receiving it incorrectly (as a result of Wendy, the active warden, attacking a steganogram by substituting words used for coding the secret) as long as there are not too many errors.

However, the automatic construction of error-correcting codes is traditionally studied only in the context of q-ary symmetric channelschannel@q-ary. Generally we can think of steganograms and covers as datagrams of a fixed length $n$ represented by $n$-tuples $s=( s_0,s_2,s_3,...,s_{n-1} )$. Datagrams transmitted by $q$-ary channels are restricted to choosing each $s_i$ from a fixed alphabet $S$ of cardinality $\vert S\vert=q$, so that $s_i \in S$, i.e. datagrams that can be transmitted over a $q$-ary channel are chosen from $S^n$.

However, this is not the case in lexical steganography, if we wish to code the data into word-replacements $s=( s_0,s_2,s_3,...,s_{n-1} )$, where each $s_i$ is chosen from a different synset with a different cardinality, so that $s_i \in S_i$. As opposed to $S^n$, we wish to choose our datagrams from $S_0 \times S_1 \times S_2 \times ... \times S_{n-1}$.

We will not deal with the construction of $q$-ary error-correcting codes in detail. For readers unfamiliar with the topic of error-correction, it might be helpful to think of the simplest form of error-correction which is a repetition-code. For example, in order to encode a value $0 \leq x < q$ for submission over a channel with $3$ elements, we could choose the configuration $(x,x,x)$ from $X_1 \times X_2 \times X_3$. If an error happens (i.e. Wendy changes a value $x$ to $y \neq x$), then $(x,y,x)$ might be incorrectly received. We can then correct it by assuming $(x,x,x)$ instead, because one substitution would suffice to replace $(x,x,x) \rightarrow (x,y,x)$, whereas two substitutions would be necessary to replace $(y,y,y) \rightarrow (x,y,x)$. Assuming that a channel is more likely to transmit each value correctly than erroneously (since Wendy is trying not to completely destroy the datagram), we can safely decode $(x,y,x)$ to $(x,x,x)$.

Figure: How word-choices are assigned to blocks.
\includegraphics[scale=.9]{img/block1.eps}

In order to make the well-known techniques for automatic construction of error-correcting codes for $q$-ary channels applicable in this scenario, we will compose the word-choices into blocks, and apply error-correction at the block-level rather than to word-choices. We will define a blocking $B$ as a set partition over the index-set for word-choices in the datagram. Formally,

\begin{eqnarray*}
&\bigcup_{P \in B}& P ~ = ~ \lbrace 0,1,...,n-1 \rbrace, \\
&\forall P,Q \in B: & ~ P \cap Q = \emptyset.
\end{eqnarray*}

That is, a blocking $B$ consists of several blocks $s'_{i'} \in B$ and every word-choice $s_i$ is assigned to exactly one block, so that blocks never share word-choices, and none remain unassigned. One can bring these blocks into an arbitrarily chosen order, and think of them as an $n'$-tuple $s'=( s'_0,s'_1,s'_2,...,s'_{n'-1} )$ where $n'$ is the number of blocks. This sequence of blocks can, itself, be seen as a sequence of elements each of which takes on a finite number of values, just as the sequence of word-choices. Each block $s'_{i'}$ can take on exactly

\begin{displaymath}
\vert S'_{i'}\vert = \prod_{i \in s'_{i'}} \vert S_i \vert
\end{displaymath}

distinct values. If, for example, a block $s'_d$ is assigned to three word-choices $s_a$, $s_b$ and $s_c$, we can see each of the word-choices $s_a \in S_a$, $s_b \in S_b$, and $s_c \in S_c$ as an information-carrying element $x \in X$ or we can see the whole block $s'_d \in S'_d$ where $S'_d = S_a \times S_b \times S_c$ as an information-carrying element $x \in X$.

Again, we will use the numeric properties of word-choices as digits of mixed-radix numbers for coding, by defining $v(s_i)$, the numeric value of a word-choice $s_i$, as $v(s_i)=x$, if and only if, $s_i$ is the $x$-th replaceable word in $S_i$ in alphabetic order. The correspondence between a block's numeric value $v(s'_{i'})$, and the values of each of the assigned word-choices can be defined by the numeric value of a mixed-radix number, the digits of which are represented by the word-choices assigned to the block. This property of a block to behave as an abstract element reduces the problem of finding a configuration for the word-choices $( v(s_0),v(s_1),v(s_2),...,v(s_{n-1}) )$ to finding a configuration for the blocks' numeric values $( v(s'_0),v(s'_1),v(s'_2),...,v(s'_{n'-1}) )$ and determining the values of the word-choices by expressing them as mixed-radix numbers.

Figure [*] shows this graphically. In this example, there are $n=5$ word-choices assigned to $n'=2$ blocks. The word-choices $s=( s_0,s_1,s_2,s_3,s_4 )$ are chosen from synsets $S_1= \lbrace a,b,c,d \rbrace$, $S_2= \lbrace x,y,z \rbrace$, $S_3= \lbrace p,q,r,s,t \rbrace$, $S_4= \lbrace k,l \rbrace$ and $S_5= \lbrace m,n,o,p \rbrace$.

We can think of this as a mixed-radix number

\begin{displaymath}
\left[
\begin{array}{ccccc}
v(s_0) & v(s_1) & v(s_2) & v(s_3) & v(s_4) \\
4 & 3 & 5 & 2 & 4 \\
\end{array}\right].
\end{displaymath}

Alternatively, we can think of the two blocks as a mixed-radix number

\begin{displaymath}
\left[
\begin{array}{cc}
v(s'_0) & v(s'_1) \\
20 & 24 \\
\end{array}\right]
\end{displaymath}

and reinterpret the digits $v(s'_0)$ and $v(s'_1)$ as the numeric values of the mixed-radix numbers established by the word-choices assigned to the blocks as

\begin{displaymath}
v(s'_0)=\left[
\begin{array}{cc}
v(s_0) & v(s_2) \\
4 & 5 \\
\end{array}\right]
\end{displaymath}

and

\begin{displaymath}
v(s'_1)=\left[
\begin{array}{ccc}
v(s_1) & v(s_3) & v(s_4) \\
3 & 2 & 4 \\
\end{array}\right].
\end{displaymath}

We already mentioned that the reason why we compose the word-choices into blocks is because we wish to make error-correcting codes for $q$-ary channels applicable in this scenario. Two methods to achieve this come to mind.

Figure: Blocking by Method I: Each block contains word-choices of equal capacity.
\includegraphics[scale=.9]{img/block2.eps}

Figure: Blocking by Method II: Each block has equal capacity.
\includegraphics[scale=.9]{img/block3.eps}

7.1.0.0.1 Method I

is to compose all word-choices of equal capacity into blocks. For all blocks $s'_{i'} \in B$ we have

\begin{displaymath}
\forall_{s_x,s_y \in s'_{i'}}:~ \vert S_x\vert = \vert S_y\vert+\varepsilon.
\end{displaymath}

This makes it possible to assume the smallest capacity $q=\min \vert S_{x}\vert$ and apply a $q$-ary error-correcting code within each block.

7.1.0.0.2 Method II

is to compose word-choices into blocks in such a way that the resulting blocks have the same capacity. That is, for all blocks $s'_{x'} \in B$ and $s'_{y'} \in B$,

\begin{displaymath}
\vert S'_{x'}\vert = \vert S'_{y'}\vert+\varepsilon.
\end{displaymath}

This will make it possible to assume the smallest capacity $q=\min \vert S'_{x'}\vert$ and apply a $q$-ary error-correcting code to the abstract values of the blocks.

Here $\varepsilon$ is a variable summand which allows for some deviation in the blocks' capacities. It is desireable to keep $\epsilon = \max \varepsilon$ as small as possible, however, it will be necessary to allow for some deviation, to allow for efficient assignment of word-choices to blocks, according to Method II, and to compensate for word-choices of capacities that appear only once for Method I. Figures [*] and [*] show examples of how the two methods would choose the blocks.

7.2 Some Elements of a Coding Scheme

Elements of a Coding Scheme

7.2.0.0.1 The word/block-choice hash:

Let $w(x)$ denote an element's value for use with the $q$-ary code (recall that an element $x$ chosen from $X$ can either be a word-choice $s_i \in S_i$ or a configuration of a block $s'_{i'} \in S'_{i'}$). By definition, a symbol chosen by a $q$-ary code can only take on $q$ distinct values, so we could define $w(x)$ as an integer in the range $0 \leq w(x) < q$. This is problematic unless $\epsilon = 0$, since we assume $q$ to be the minimum capacity of any element, so there will be elements which actually have a larger capacity, so that $0 \leq v(x) < \vert X\vert$, while $0 \leq w(x) < q$ for $q < \vert X\vert$. As a result, some configurations are never assigned by the code, which could be security-relevant (very much like the block-code that restricts synsets to specific sizes, resulting in undergeneration because of the possible replacements that remain unassigned by the code). A straightforward approach might be to let

\begin{displaymath}
v(x) \equiv w(x) ~ (\textrm{modulo } q')
\end{displaymath}

for a constant $q' \leq q$, so that, given a value $w(x)$ determined by the $q$-ary code, one could randomly choose a $v(x)$ that decodes to $w(x)$.

Figure: Splitting word-choices into atomic units.
\includegraphics[scale=.9]{img/block5.eps}

7.2.0.0.2 Prime-factorization of capacities:

Another improvement might be to split up the ``physical'' elements, given by the word-choices into ``virtual'' elements, atomic units forming the basis for the coding-process. The values for word-choices can be determined by the numeric value of a mixed-radix number, the digits of which are the atomic units. These are chosen in such a way, that their capacities are the prime-factors of the word-choice's capacity as shown in Figure [*]. For example, instead of one single choice $s_0$ from $S_0=\lbrace a, b, c, d \rbrace$ choosing from $\vert S_0\vert=4$ values, we can think of two choices $v_0$ and $v_1$ from $V_0 = \lbrace 0, 1 \rbrace$ and $V_1 = \lbrace 0, 1 \rbrace$. Since $\vert V_0 \times V_1\vert = \vert S_0\vert$ we can establish a one-to-one correspondence, and doing so is straightforward via the mixed-radix number

\begin{displaymath}
v(s_0)=\left[
\begin{array}{cc}
v(v_0) & v(v_1) \\
2 & 2 \\
\end{array}\right].
\end{displaymath}

This has three important advantages: First, for many error-correcting codes it is necessary for $q$ to be prime. Secondly, for Method I, we will be able to construct larger blocks. For example, if we have elements of capacities 6, 8 and 12, it will suffice to allocate two blocks, one with $q=2$ and one with $q=3$. Since the rate of redundancy an error-correcting code needs to introduce in order to correct a given number of errors saturates logarithmically with a raising number of elements, it will be preferable to represent a datagram of a given capacity by many elements, choosing each from a small number of distinct values, as opposed to few elements, choosing each from a larger number of values. Thirdly, it will be easier to keep $\epsilon$ small, since smaller elements allow to perform blocking in a more fine-grained way.

7.2.0.0.3 Random-assignment of blocking strategies:

Another problem that arises in the context of lexical steganography is of strategic nature. Suppose we constructed a stegosystem to use either Method I or Method II as a matter of design.

In response to a code, constructed with blocking by Method I, Wendy would pick out exactly one block to attack, since one block is sufficient for the data to be unrecoverable, and then attack as many units within this block as possible, maximizing the probability to succeed in making the whole block unrecoverable. In response to a code constructed with blocking by Method II, she would attack as many blocks as possible, maximizing the probability of breaking the block-error-correction, but would attack no more than one unit within each block since one element is sufficient for the whole block to be unrecoverable.

A simplistic approach to making sure Wendy cannot make use of such strategic considerations, is to make sure she doesn't know the blocking. A straightforward way of achieving this would be to have the encoder ``flip a coin'' to decide which strategy to use. That way, Wendy could not rely on any single strategy to be most successful. However, she will still benefit from choosing one of the above attack-strategies, since she knows it must be one of the two available strategies, or she could use a clever combination like picking one block, in which to attack many units and attacking only one unit in each of the remaining blocks, which would maximize her probability of succeeding over both strategies.

A more sophisticated approach might be to partition the sequence of units in two areas and handle one area by Method I and the other by Method II, so that Wendy never knows how the areas are composed. We could assign units to two areas, for example, by seeding a random-number generator with a secret key to generate a bitstring $b$ of length $n$. Random-number generators have the important property that they generate a bitstring $b$ in such a way that there are no statistically significant correlations between any two bits in the string $b_x$ and $b_y$. The value of $b_i$ could be used to decide upon a blocking-method to use. For example, a value $b_i = 0$ could be interpreted by the encoding/decoding-mechanisms as ``use Method I for unit $v_i$'' and $b_i = 1$ as ``use Method II for unit $v_i$'', as depicted in Figure [*].

Figure: Assigning Blocking-Methods to elements.
\includegraphics[scale=.9]{img/block4.eps}

Figure: An exemplaric coding-scheme.
\includegraphics[scale=.8]{img/block-ex.eps}

7.3 An Exemplaric Coding Scheme

Exemplaric Coding Scheme

We can think of a coding-scheme organized in six layers as shown in Figure [*].

1.
On the first layer, the word-choices $s_0,s_1,s_2,...,s_{10}$, from synsets $S_i$ are split up into atomic units $v_0, v_1, v_2,...,v_{22}$, where $v_j$ refers to the choice for an element from a corresponding set $V_j$. Several values $v_{j_1},v_{j_2},...,v_{j_n}$ determine the word-choices $s_i$, and their respective $V_{j_1},V_{j_2},...,V_{j_n}$ are chosen in such a way that $\vert V_{j_1} \times V_{j_2} \times ... \times V_{j_n}\vert = \vert S_i \vert$, so they provide for the right capacity and all $\vert V_{j_k}\vert$ are prime. In the simplest case, such a correspondence could be established by the value $v(s_0)$ of a mixed-radix number, the digits of which are $v(v_{j_1}),v(v_{j_2}),...,v(v_{j_n})$ where $0 \leq v(v_{j_k}) < \vert V_{j_k} \vert$. For example, the units $v_0$ and $v_1$ together make up a mixed-radix number, which is supposed to choose an $s_0$ from $S_0=\lbrace a, b, c, d \rbrace$. The numeric value of this mixed-radix number $v(s_0)$ will have to range from $0$ to $3$, because $\vert S_0\vert=4$. The two digits $v_0$ and $v_1$ that make up this mixed-radix number need to be chosen from a prime number of values. Since $\vert V_0\vert*\vert V_1\vert=2*2=4$ these units are chosen with $\vert V_0\vert=2$ and $\vert V_1\vert=2$.
2.
The second layer assigns all units to one of two areas, according to a bitstring $b$, generated by a random-number-generator from a secret key. In Figure [*] all elements $v_i$ where a corresponding bit $b_i$ from the bitstring $b$ is $0$, are assigned to Area I which is to be coded using Method I, and all elements with $b_i = 1$ are assigned to Area II which is to be coded using Method II.
3.
The third layer composes the units $v_0, v_1, v_2,...,v_{22}$ into blocks $s'_{0},s'_{1},s'_{2},...,s'_{6}$.
I.
In Area I, Method I composes the blocks in such a way that all the units in one block have the same capacity, for example $\vert V_6\vert+0=\vert V_{12}\vert+0=\vert V_{16}\vert+0=3$. Here $\epsilon = 0$.
II.
In Area II, Method II composes the blocks in such a way that the blocks themselves have the same capacity, for example $\vert S'_4\vert=\vert S'_5\vert+1=\vert S'_6\vert+1=16$. Here $\epsilon=1$.
4.
Layer four is where Method I applies its error-correction.
I.
In the simplest case, we could use a repetition-code, and make sure three adjacent units in Area I, like $w(v_0), w(v_1), w(v_5)$, are always assigned the same value, in this example the block-value $v(s'_0)$. (Recall that we denote values of elements, as chosen by an error-correcting code as $w(x)$). Since $\epsilon = 0$, $v(x)$ always coincides with $w(x)$.
II.
Method II does not carry out error-correction on this layer, so there is no error-correction involved in Area II. The values $v(v_2), v(v_3), v(v_4), v(v_7)$, are simply a mixed-radix representation of $v(s'_4)$.
5.
Layer five is where Method II applies its error-correction.
I.
As far as Area I is concerned, we can simply interpret the blocks' numeric values $v(s'_0),v(s'_1),v(s'_2),v(s'_3)$ as digits of a mixed-radix number to determine $v(\textit{secret}_{I})$
II.
Again we use a repetition code but this time we will apply it to the block-level, making sure that three adjacent blocks are always assigned the same value. For example $w(s'_4)=w(s'_5)=w(s'_6)=v(\textit{secret}_{II})$. Here the hash-function $v(v_i) \equiv w(v_i) ~ (\textrm{modulo } q)$ is used, to determine the values $v(s'_4)=v(s'_5)=v(s'_6)$ randomly. Since $\epsilon>0$, these values will not always coincide.
6.
Layer six is where we bring Area I and Area II together and compute $v(\textit{secret})$ by interpreting $v(\textit{secret}_I)$ and $v(\textit{secret}_{II})$ as digits of a mixed-radix number.

The repetition-code was used in the above setup only as a primitive example of error-correction. It has many disadvantageous side-effects. For example $v_{22}$ is left unassigned in Figure [*], and would have to be initialized randomly. The code is very inefficient, since it needs vast amounts of redundancy and corrects only a comparatively small number of errors. From a strategic point of view, it is not optimal either. Consider, for instance, the consequences of an error in word-choice $s_0$. It would result in altering both $v_0$ and $v_1$, leading the error-correction astray when reconstructing $s'_0$.

All of these limitations could be overcome by using more sophisticated codes, for example Hamming codes. However, we would like to draw attention to the multi-layered blocking-scheme, rather than to the details of error-correction. The blocking scheme introduced above, incorporates some degree of security, since it will not easily be possible to extract the secret from the word-choices without predicting the pseudorandom numbers assigning elements to coding-strategies. It also provides for robustness against some attacks, overcoming the problematic requirement of error-correcting codes to operate on $q$-ary channels.

Figure: Encoding the secret $255$.
\includegraphics[scale=.8]{img/block-ex1.eps}

Figure: Decoding the secret again, after incorrectly receiving the word-choices.
\includegraphics[scale=.8]{img/block-ex2.eps}

7.3.0.0.1 For example

, in the situation depicted in Figure [*], if we wanted to encode the value $v(\textit{secret})=255$ with word-choices, we could do so as depicted in Figure [*], by:
6.
rewriting $v(\textit{secret})=255=17*15+0*1$ as $v(\textit{secret}_I)=17$ and $v(\textit{secret}_{II})=0$, using mixed-radix conversion.
5-I.
rewriting $v(\textit{secret}_I)=17=1*2*2*3+0*2*3+1*3+2*1$ as $v(s'_0)=1, v(s'_1)=0, v(s'_2)=1$, and $v(s'_3)=2$, using mixed-radix conversion.
5-II.
determining a configuration for $w(s'_4), w(s'_5)$, and $w(s'_6)$ using an error-correcting code with $q=15$. In the simple case of a repetition code, we let $w(s'_4)=w(s'_5)=w(s'_6)=v(\textit{secret}_{II})=0$ and use a random-number generator to set $v(s'_4)=15$, $v(s'_5)=0$ and $v(s'_6)=0$. Note that $15 \equiv 0 ~ (\textrm{modulo } 15)$.
4-I.
determining a configuration for $w(v_0), w(v_1), w(v_5)$ using an error-correcting code with $q=2$. Again, using a repetition code, we have $w(v_0)=w(v_1)=w(v_5)=v(s'_1)=1$ and, as a result, $v(v_0)=v(v_1)=v(v_5)=v(s'_1)=1$. Analogously, we get
4-II.
rewriting
3.
bringing the elements from the order used for blocking back into the order, as assigned by layer 2. (This would not usually incorporate any actual processing, if we are working with references).
2.
bringing the elements from the order used for dividing them into two areas back into the original order, according to the mixed-radix-numbers assigned by the prime-factorization on layer 1 (again, not usually incorporating any processing).
1.
rewriting

Figure [*] shows the impact of an error. If Wendy tries to destroy the secret, by changing element $s_0$ from $\textsf{s}$ to $\textsf{r}$ and element $s_3$ from $\textsf{r}$ to $\textsf{o}$, we could still recover the secret.

The error propagates through all the levels of coding, until it is corrected. Since $v(s_1)$ is $2$, instead of $3$, $v(v_3)$ gets $0$, instead of $1$. This propagates down to the block-level (since this element happens to be handled by Method II), where the repetition code would expect $w(s'_4)=0,w(s'_5)=0,w(s'_6)=0$. The decoder now finds the values $w(s'_4)=10,w(s'_5)=0,w(s'_6)=0$ instead. Since $0$ was transmitted correctly twice, $w(s'_4)=10$ is more likely to be the element which is in error, so the decoder can assume $w(s'_4)=0$.

Similarly, the error at word-choice $s_3$ changes $v(v_6)$ from $2$ to $1$ and propagates only down to the unit-level (since this element happens to be handled by Method I) where the repetition code would expect $v(v_6)=2,v(v_{12})=2,v(v_{16})=2$. The decoder now finds the values $v(v_6)=1,v(v_{12})=2,v(v_{16})=2$ instead. Since $2$ was transmitted correctly twice, $v(v_6)=1$ is more likely to be the unit which is in error, so the decoder can assume $v(v_6)=2$, thereby correcting the error and successfully defending against Wendy's attack.


next up previous contents
Next: 8. Towards Coding in Up: Towards Linguistic Steganography: A Previous: 6. Lessons Learned   Contents
Richard Bergmair 2005-01-31