# Full Plaintext Recovery Attack on Broadcast RC4

Takanori Isobe, Toshihiro Ohigashi, Yuhei Watanabe, Masakatu Morii

## Abstract

This paper investigates the practical security of RC4 in the broadcast setting, where the same plaintext is encrypted with different user keys. We introduce several new biases in the initial (1st to 257th) bytes of the RC4 keystream, which are substantially stronger than known biases. Combining the new biases with the known ones, a cumulative list of strong biases in the first 257 bytes of the RC4 keystream is constructed. We demonstrate a plaintext recovery attack using our strong bias set of initial bytes by means of a computer experiment. Almost all of the first 257 bytes of the plaintext can be recovered, with probability more than 0.8, using only \(2^{32}\) ciphertexts encrypted by randomly-chosen keys. We also propose an efficient method to extract later bytes of the plaintext, after the 258th byte. The proposed method exploits our bias set of the first 257 bytes in conjunction with the digraph repetition bias proposed by Mantin at EUROCRYPT 2005, and sequentially recovers the later bytes of the plaintext after recovering the first 257 bytes. Once the possible candidates for the first 257 bytes are obtained by our bias set, the later bytes can be recovered from about \(2^{34}\) ciphertexts with probability close to 1.

### Keywords

RC4, Broadcast setting, Plaintext recovery attack, Bias, Experimentally-verified attack, SSL/TLS, Multi-session setting

## 1 Introduction

RC4, designed by Rivest in 1987, is one of the most widely used stream ciphers in the world. It is adopted in many software applications and standard protocols such as SSL/TLS, WEP, Microsoft Lotus and Oracle secure SQL. RC4 consists of a key scheduling algorithm (KSA) and a pseudo-random generation algorithm (PRGA). The KSA converts a user-provided variable-length key (typically, 5–32 bytes) into an initial state \(S\) consisting of a permutation of \(\{0, 1, 2, \ldots , N - 1\}\), where \(N\) is typically 256. The PRGA generates a keystream \(Z_1\), \(Z_2\), \(\ldots \), \(Z_r\), \(\ldots \) from \(S\), where \(r\) is a round number of the PRGA. \(Z_r\) is XOR-ed with the \(r\)-th plaintext byte \(P_r\) to obtain the ciphertext byte \(C_r\). The algorithm of RC4 is shown in Algorithm 1, where \(+\) denotes arithmetic addition modulo \(N\), \(\ell \) is the key length, and \(i\) and \(j\) are indices pointing to locations of \(S\). \(S[x]\) denotes the value of \(S\) at index \(x\).
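The KSA and PRGA described above can be sketched in a few lines of Python. This is a minimal illustration of the standard RC4 definition with \(N = 256\); the function names `ksa`, `prga` and `rc4_encrypt` are chosen here for illustration and do not come from the paper.

```python
def ksa(key, n=256):
    """Key scheduling: expand a variable-length key into a permutation S of 0..n-1."""
    s = list(range(n))
    j = 0
    for i in range(n):
        j = (j + s[i] + key[i % len(key)]) % n
        s[i], s[j] = s[j], s[i]
    return s


def prga(s0, length, n=256):
    """Pseudo-random generation: return the keystream bytes Z_1 .. Z_length."""
    s = list(s0)          # work on a copy so the caller keeps the initial state S_0
    i = j = 0
    keystream = []
    for _ in range(length):
        i = (i + 1) % n
        j = (j + s[i]) % n
        s[i], s[j] = s[j], s[i]
        keystream.append(s[(s[i] + s[j]) % n])
    return keystream


def rc4_encrypt(key, plaintext):
    """C_r = P_r xor Z_r; decryption is the same operation."""
    z = prga(ksa(key), len(plaintext))
    return bytes(p ^ k for p, k in zip(plaintext, z))
```

With the widely published test vector, `rc4_encrypt(b"Key", b"Plaintext").hex()` gives `bbf316e8d940af0ad3`, and applying `rc4_encrypt` a second time with the same key returns the plaintext.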

After the disclosure of its algorithm in 1994, RC4 has attracted intensive cryptanalytic efforts over the past 20 years. Distinguishing attacks, which attempt to distinguish an RC4 keystream from a random stream, were proposed in [3, 4, 8, 10, 11, 14, 16]. A state recovery attack, which recovers the full state instead of the user-provided key, was shown by Knudsen et al. [7] and improved by Maximov and Khovratovich [13]. Other types of attacks have also been proposed, e.g., a key collision attack [12], a keystream prediction attack [10] and key recovery attacks from a state [1, 15].

In FSE 2001, Mantin and Shamir presented an attack on RC4 in the broadcast setting where the same plaintext is encrypted with different user keys [11]. The Mantin-Shamir attack can extract the second byte of the plaintext from only \(\varOmega (N)\) ciphertexts encrypted with randomly-chosen different keys by exploiting a bias of \(Z_2\). Specifically, the event \(Z_2 = 0\) occurs with twice the expected probability of a random one. In FSE 2011, Maitra, Paul and Sen Gupta showed that \(Z_3, Z_4, \ldots , Z_{255}\) are also biased to 0 [8]. Then the bytes 3 to 255 can also be recovered in the broadcast setting, from \(\varOmega (N^3)\) ciphertexts.

These results leave the following questions open.

1. *Are the biases exploited in the previous attacks the strongest biases for the initial bytes 1 to 255?*
2. *While the previous results* [8, 11] *estimate only lower bounds* (\(\varOmega \)), *how many ciphertexts encrypted with different keys are actually required for a practical attack on broadcast RC4?*
3. *Is it possible to efficiently recover the later bytes of the plaintext, after byte 256?*

### 1.1 Our Contribution

In this paper, we provide answers to all the aforesaid questions. To begin with, we introduce a new bias regarding \(Z_1\), which is a conditional bias such that \(Z_1\) is biased to 0 when \(Z_2\) is \(0\). Using this bias in conjunction with the bias of \(Z_2 = 0\) [11], the first byte of a plaintext is extracted from \(\varOmega (N^{2})\) ciphertexts encrypted with different keys. Although the strong bias of the first byte, which is a negative bias towards zero, has already been pointed out in [6, 14], it requires \(\varOmega ({N}^3)\) ciphertexts to extract the first byte of the plaintext. Thus, the new conditional bias is very useful, because the number of ciphertexts required to recover the first byte is reduced by a factor of \(N/2\) compared with the straightforward method. Besides, we introduce new strong biases, i.e., \(Z_3 = 131\), \(Z_r = r\) for \(3 \le r \le 255\), and extended keylength-dependent biases such that \(Z_{x \cdot \ell } = -x \cdot \ell \) for \(x = 2, 3, \ldots , 7\) and \(\ell = 16\), which extend the keylength-dependent biases of [5], where only the parameter \(x=1\) is considered. These new biases are substantially stronger than the known biases of \(Z_r =0\) for certain bytes within \(Z_3, Z_4, \ldots , Z_{255}\). After providing theoretical considerations for these biases, we experimentally confirm their validity. Combining the new biases with known biases, we construct a cumulative list of the strongest known biases in \(Z_1, Z_2, \ldots , Z_{255}\). At the same time, we experimentally show two new biases of \(Z_{256}\) and \(Z_{257}\), and add these to our bias set. Note that the biases of \(Z_2, Z_3, \ldots , Z_{257}\) included in our bias set are the *strongest* biases amongst all *single* positive and negative biases of each byte when a 16-byte (128-bit) key is used.

We demonstrate a plaintext recovery attack using our bias set by a computer experiment, and estimate the number of required ciphertexts and the success probability when \(N =256\). Almost all of the first 257 bytes, \(P_1, P_2, \ldots , P_{257}\), can be extracted with probability more than 0.8 from \(2^{32}\) ciphertexts encrypted by randomly-chosen keys. Given \(2^{34}\) ciphertexts, all bytes of \(P_1, P_2, \ldots , P_{257}\) can be narrowed down to two candidates each with probability one. This is the first practical security evaluation of broadcast RC4 using all known biases of the cipher, and some new ones that we observe.

Finally, an efficient method to extract later bytes of the plaintext, namely bytes after \(P_{258}\), is given. It exploits our bias set of \(Z_{1}, Z_{2}, \ldots , Z_{257}\) in conjunction with the digraph repetition bias proposed by Mantin [10], and sequentially recovers bytes of the plaintext. Once the possible candidates for \(P_1, P_2, \ldots , P_{257}\) are obtained by our bias set, \(P_r\) \((r \ge 258)\) are recovered from about \(2^{34}\) ciphertexts with probability one. Since the digraph repetition bias is a long-term bias, which occurs in any keystream byte, our sequential method is expected to recover any plaintext byte using only ciphertexts produced by different randomly-chosen keys. We show that the first \(2^{50}\) bytes (\(\approx \) 1000 Tbytes) of the plaintext can be recovered from \(2^{34}\) ciphertexts with a probability of \(0.97170\).

Also, the broadcast setting can be converted into the multi-session setting of SSL/TLS, where the target plaintext blocks are repeatedly sent in the same position in the plaintexts in multiple sessions.

## 2 Known Attacks on Broadcast RC4

This section briefly reviews known attacks on RC4 in the broadcast setting where the same plaintext is encrypted with different randomly-chosen keys.

### 2.1 Mantin-Shamir (MS) Attack

Mantin and Shamir first presented a broadcast RC4 attack exploiting a bias of \(Z_2\) [11].

**Theorem 1**

[11]**.** Assume that the initial permutation \(S\) is randomly chosen from the set of all the possible permutations of \(\{0, 1, 2, \ldots , N-1\}\). Then the probability that the second output byte of RC4 is 0 is approximately \(\frac{2}{N}\).

This probability is estimated as \(\frac{2}{256}\) when \(N = 256\). Based on this bias, the broadcast RC4 attack is demonstrated by Theorems 2 and 3.

**Theorem 2**

[11]**.** Let \(X\) and \(Y\) be two distributions, and suppose that the event \(e\) happens in \(X\) with probability \(p\) and in \(Y\) with probability \(p \cdot (1+q)\). Then for small \(p\) and \(q\), \(\mathrm {O}(\frac{1}{p \cdot q^2})\) samples suffice to distinguish \(X\) from \(Y\) with a constant probability of success.

In this case, \(p\) and \(q\) are given as \(p = 1/N\) and \(q = 1\). The number of samples is about \(N\).

**Theorem 3**

[11]**.** Let \(P\) be a plaintext, and let \(C^{(1)}, C^{(2)}, \ldots , C^{(k)}\) be the RC4 encryptions of \(P\) under \(k\) uniformly distributed keys. Then, if \(k = \varOmega (N)\), the second byte of \(P\) can be reliably extracted from \(C^{(1)}, C^{(2)}, \ldots , C^{(k)}\).

According to the relation \(C^{(i)}_2= P^{(i)}_2 \oplus Z^{(i)}_2\), if \(Z^{(i)}_2 = 0\) holds, then \(C^{(i)}_2\) is the same as \(P^{(i)}_2\). From Theorem 1, \(Z_2 = 0\) occurs with twice the probability expected of a random stream. Thus, the most frequent byte amongst \(C_2^{(1)}, C_2^{(2)}, \ldots , C_2^{(k)}\) is likely to be \(P_2\) itself. When \(N = 256\), this requires more than \(2^8\) ciphertexts encrypted with randomly-chosen keys.
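The counting argument above can be tried out with a small simulation. The sketch below is illustrative only: the helper names are hypothetical, random 16-byte keys are assumed, and the sample size in the usage note (\(2^{14}\)) is deliberately larger than the \(2^8\) order of magnitude so that a toy run is reliable.

```python
import random
from collections import Counter


def second_keystream_byte(key):
    """Run the RC4 KSA and two PRGA rounds under `key`; return Z_2."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    i = j = 0
    z = 0
    for _ in range(2):
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        z = s[(s[i] + s[j]) % 256]
    return z


def recover_p2(p2, n_keys, seed=0):
    """Encrypt the byte p2 under n_keys random keys and guess it back
    as the most frequent second ciphertext byte (Z_2 = 0 bias)."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_keys):
        key = bytes(rng.randrange(256) for _ in range(16))  # assumed 16-byte keys
        counts[p2 ^ second_keystream_byte(key)] += 1        # C_2 = P_2 xor Z_2
    return counts.most_common(1)[0][0]
```

For example, `recover_p2(0x41, 1 << 14)` simulates \(2^{14}\) broadcast ciphertexts and returns the attacker's guess for \(P_2\).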

### 2.2 Maitra, Paul and Sen Gupta (MPS) Attack

Maitra, Paul and Sen Gupta showed that \(Z_3, Z_4, \ldots , Z_{255}\) are also biased to 0 [6, 8]. Although the MS attack assumes that the initial permutation \(S\) is random, the MPS attack exploits biases of \(S\) after the KSA [9]. Let \(S_r[x]\) be the value of \(S\) at index \(x\) after \(r\) rounds, where \(S_0\) is the initial state of RC4 after the KSA. The biases of the initial state of the PRGA are given as follows.

**Proposition 1**

**.**After the end of KSA, for \(0 \le u \le N - 1, 0 \le v \le N - 1\),

The probabilities of \(S_{r-1}[r]\) in the PRGA are given as follows.

**Theorem 4**

**.**For \( 3 \le r \le N-1\), the probability \(\mathrm{Pr}(S_{r-1}[r] = v)\) is approximately

Then, the bias of \(\mathrm{Pr}(Z_r = 0)\) is estimated as follows.

**Theorem 5**

**.**For \(3 \le r \le N-1\), \(\mathrm{Pr}(Z_r = 0)\) is approximately

Since the parameters \(p\) and \(q\) are given as \(p = 1/N\) and \(q = c_r/N\), the number of ciphertexts with different keys required for the extraction of \(P_3, P_4, \ldots , P_{255}\) is roughly estimated as \(\varOmega (N^3)\).
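As a quick check of both sample-count estimates, the bound \(\mathrm {O}(\frac{1}{p \cdot q^2})\) of Theorem 2 can be instantiated directly. Constant factors are dropped, and \(c_r\) is treated simply as the constant appearing in \(q = c_r/N\):

$$ \left. \frac{1}{p \cdot q^2} \right| _{p = \frac{1}{N},\, q = 1} = N, \qquad \left. \frac{1}{p \cdot q^2} \right| _{p = \frac{1}{N},\, q = \frac{c_r}{N}} = N \cdot \frac{N^2}{c_r^2} = \frac{N^3}{c_r^2}. $$

For \(N = 256\) these are of the order of \(2^{8}\) ciphertexts for the MS attack and \(2^{24}/c_r^2\) ciphertexts for the MPS attack.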

## 3 New Biases : Theory and Experiment

This section introduces four new biases in the keystream of RC4. To begin with, we prove a conditional bias of \(Z_1\) towards \(0\) when \(Z_2 = 0\). After that, we present new biases in the events, \(Z_3 = 131\), \(Z_r = r\), and extended keylength-dependent biases, which are substantially stronger than the known biases such as \(Z_r = 0\). Then, we construct a cumulative list of strong biases in \(Z_1, Z_2, \ldots , Z_{257}\) to mount an efficient plaintext recovery attack on broadcast RC4.

### 3.1 Bias of \(Z_1 = 0 | Z_2 = 0\)

A new conditional bias such that \(Z_1\) is biased to 0 when \(Z_2 = 0\) is given as Theorem 6.

**Theorem 6**

*Proof*

\(S_0[2] = 0\)

For \(i = 1\), if \(S_0[1]\) is \(1\), the index \(j\) is updated as \(j = S_0[i] = S_0[1] = 1\). Then the first output byte \(Z_1\) is expressed as follows (see Fig. 1),

$$ Z_1 = S_1[S_1[i] + S_1[j]] = S_1[S_1[1] + S_1[1]] = S_1[2] = S_0[2] = 0. $$

Assuming that \(Z_1 = 0\) holds with probability \(\frac{1}{N}\) when \(S_0[1] \ne 1\), the probability \(\mathrm{Pr}(Z_1 = 0|S_0[2] = 0)\) is estimated as

$$ \mathrm{Pr}(Z_1 = 0|S_0[2] = 0) = \mathrm{Pr}(S_0[1] = 1) + (1 - \mathrm{Pr}(S_0[1] = 1)) \cdot \frac{1}{N}. $$

\(S_0[2] \ne 0\)

Suppose that the event \(Z_1 = 0\) occurs with probability \(\frac{1}{N}\). Then \(\mathrm{Pr}(Z_1 = 0|S_0[2] \ne 0)\) is estimated as

$$ \mathrm{Pr}(Z_1 = 0|S_0[2] \ne 0) = \frac{1}{N}. $$
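The main event of the case \(S_0[2] = 0\) can be checked mechanically: whenever \(S_0[1] = 1\) and \(S_0[2] = 0\), the PRGA outputs \(Z_1 = 0\), and the second round then also outputs \(Z_2 = 0\). The following sketch builds random permutations with these two entries fixed and runs two PRGA rounds. The helper names are hypothetical and the states are sampled directly rather than produced by the KSA.

```python
import random


def prga_from_state(s0, length):
    """Run the RC4 PRGA from a given initial state S_0 and return Z_1..Z_length."""
    s = list(s0)
    i = j = 0
    out = []
    for _ in range(length):
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(s[(s[i] + s[j]) % 256])
    return out


def random_state_with(fixed, rng):
    """Random permutation of 0..255 with S[pos] = val for every (pos, val) in fixed."""
    free_positions = [p for p in range(256) if p not in fixed]
    free_values = [v for v in range(256) if v not in fixed.values()]
    rng.shuffle(free_values)
    s = [0] * 256
    for pos, val in fixed.items():
        s[pos] = val
    for pos, val in zip(free_positions, free_values):
        s[pos] = val
    return s


def check_main_event(trials=100, seed=0):
    """Return True if Z_1 = Z_2 = 0 for every sampled state with S_0[1]=1, S_0[2]=0."""
    rng = random.Random(seed)
    return all(
        prga_from_state(random_state_with({1: 1, 2: 0}, rng), 2) == [0, 0]
        for _ in range(trials)
    )
```

Calling `check_main_event()` exercises the identity \(Z_1 = S_1[2] = S_0[2] = 0\) on 100 sampled states; it says nothing about how often \(S_0[1] = 1\) occurs after the KSA, which is what determines the size of the bias.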

Note that we searched for conditional biases of a similar form in the first 256 bytes of the RC4 keystream. In particular, we checked the patterns \((Z_{r-a} = X | Z_r = Y)\) for \(0 \le X, Y \le 255\), \(2 \le r \le 256\), \(1 \le a \le 8\). However, no comparably strong bias was found in our experiment, although not all conditional biases were covered.

**Application to Broadcast RC4 attack.** Using this new conditional bias of \(Z_1 = 0 | Z_2 = 0\) in conjunction with the bias of \(Z_2 = 0\) [11], the first byte of the plaintext can be efficiently extracted, where \(N = 256 \). After \(2^{17}\) ciphertexts with randomly-chosen keys are collected, the following procedure is performed.

- Step 1.
Extract the second byte of the target plaintext, \(P_2\), from \(2^{8}\) ciphertexts [11].

- Step 2.
Find the ciphertexts for which \(Z_2 = 0 \) by computing \(C_2 \oplus P_2\). About \(2^{10} = 2^{17} \cdot 2/256\) ciphertexts matching this criterion are expected to be obtained.

- Step 3.
Regard the most frequent byte in the first byte \(C_1\) of these matching \(2^{10}\) ciphertexts as \(P_1\).
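The three steps above translate into a short filtering routine. This is a minimal sketch, assuming each ciphertext is a byte string whose first two bytes are \(C_1\) and \(C_2\); the function name is hypothetical, and for simplicity Step 1 uses all supplied ciphertexts rather than only \(2^8\) of them.

```python
from collections import Counter


def recover_first_two_bytes(ciphertexts):
    """Return the guesses (P_1, P_2) from encryptions of one plaintext under many keys."""
    ciphertexts = list(ciphertexts)
    # Step 1: the most frequent second byte is taken as P_2 (bias Z_2 = 0).
    p2 = Counter(c[1] for c in ciphertexts).most_common(1)[0][0]
    # Step 2: keep only ciphertexts with C_2 xor P_2 = 0, i.e. those with Z_2 = 0.
    matching = [c for c in ciphertexts if c[1] ^ p2 == 0]
    # Step 3: among those, the most frequent first byte is taken as P_1
    # (conditional bias Z_1 = 0 | Z_2 = 0).
    p1 = Counter(c[0] for c in matching).most_common(1)[0][0]
    return p1, p2
```

The conditioning in Step 2 is what distinguishes this from a plain frequency count on \(C_1\): a first byte that is most frequent overall but rarely appears together with \(C_2 = P_2\) is ignored.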

### 3.2 Bias of \(Z_3 = 131\)

A new bias of \(Z_3 = 131\), which is stronger than \(Z_3 = 0\) [6, 8], is given as Theorem 7.

**Theorem 7**

*Proof*

For an efficient plaintext recovery attack, \(Z_3 = 131\) should be used instead of \(Z_3 = 0\). When \(Z_3 = 131\) and \(Z_3 = 0\) are used jointly, two candidates for \(P_3\) remain. Thus, in order to identify the single correct value of \(P_3\), using only \(Z_3 = 131\) is more efficient.

### 3.3 Bias of \(Z_r = r\) for \(3 \le r \le N - 1\)

We also present a new bias in the event \(Z_r = r\) for \(3 \le r \le N - 1\), whose probabilities are very close to those of \(Z_r = 0\) [8], and the new biases are stronger than those of \(Z_r = 0\) in some rounds. Thus, for an efficient attack, we need to carefully consider which biases are stronger in each round. The probability of \(Z_r = r\) is given as Theorem 8.

**Theorem 8**

*Proof*

- Case 1 :
\(S_{r-1}[r] = 0 \wedge S_r[r] = r\)

- Case 2 :
\(S_{r-1}[r] = r \wedge S_r[r] = j_r - r \wedge j_r \ne r, r + r\)

- Case 3 :
\(S_{r-1}[r] \ne 0 \wedge S_r[r] = r - S_{r-1}[r] \)

- Case 4 :
\(S_{r-1}[r] \ne 0 \wedge S_r[r] = r \)

**Case 1 :**\(S_{r-1}[r] = 0 \wedge S_r[r] = r\)

**Case 2 :**\(S_{r-1}[r] = r \wedge S_r[r] = j_r - r \wedge j_r \ne r, r + r\)

The output is expressed as \(Z_r = S_r[S_r[r] + S_{r-1}[r] ] = S_r[j_r - r + r ] =S_r[j_r] = S_{r-1}[r] = r\) (see Fig. 4). Then, the probability of \(Z_r = r \) is one. Similar to Case 1, \(S_r[r]\) is assumed to be uniformly random.

**Case 3 :**\(S_{r-1}[r] \ne 0 \wedge S_r[r] = r - S_{r-1}[r]\)

**Case 4 :**\(S_{r-1}[r] \ne 0 \wedge S_r[r] = r \)

Here, \(p_{r-1, r}\) and \(p_{r-1, 0}\) are obtained from Theorem 4. Figure 5 shows the comparison of theoretical values and experimental values of \(Z_r = r\) for \(2^{40}\) randomly-chosen keys when \(N = 256\). Since the theoretical values do not exactly coincide with the experimental values, we do not claim that Theorem 8 completely proves this bias. We conjecture that several minor events are not covered by our approach. However, the order of the bias seems to be well matched. At the least, it can be said that the main event causing this bias has been identified.
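Case 2 of the proof states that \(Z_r = r\) holds with probability one under its conditions. This can be checked by running a single PRGA round on a crafted state \(S_{r-1}\). The sketch below is an illustration under stated assumptions: `j_prev` stands for \(j_{r-1}\), so that \(j_r = j_{r-1} + r\), and all names are hypothetical.

```python
import random


def prga_round(s, i, j):
    """One PRGA round on state s (modified in place); returns (i, j, output byte)."""
    i = (i + 1) % 256
    j = (j + s[i]) % 256
    s[i], s[j] = s[j], s[i]
    return i, j, s[(s[i] + s[j]) % 256]


def case2_state(r, j_prev, rng):
    """State S_{r-1} with S[r] = r and S[j_r] = j_r - r, where j_r = j_prev + r.
    Requires j_prev not in {0, r}, i.e. j_r not in {r, r + r}."""
    j_r = (j_prev + r) % 256
    fixed = {r: r, j_r: (j_r - r) % 256}
    free_positions = [p for p in range(256) if p not in fixed]
    free_values = [v for v in range(256) if v not in fixed.values()]
    rng.shuffle(free_values)
    s = [0] * 256
    for pos, val in fixed.items():
        s[pos] = val
    for pos, val in zip(free_positions, free_values):
        s[pos] = val
    return s


def case2_output(r, j_prev, seed=0):
    """Run round r on a Case 2 state and return Z_r."""
    s = case2_state(r, j_prev, random.Random(seed))
    _, _, z = prga_round(s, r - 1, j_prev)
    return z
```

For any admissible pair, for instance `case2_output(5, 77)`, the returned byte equals `r`, in line with \(Z_r = S_r[j_r] = S_{r-1}[r] = r\).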

### 3.4 Extended Keylength-Dependent Biases

Table 1. Experimental values of \(Z_r=-r\), \(Z_r=0\) and \(Z_r=r\)

\(r\) | \(\mathrm{Pr}(Z_r=-r)\) | \(\mathrm{Pr}(Z_r=0)\) | \(\mathrm{Pr}(Z_r=r)\) |
---|---|---|---|
16 | \(2^{-8} \cdot (1 + 2^{-4.811})\) | \(2^{-8} \cdot (1 + 2^{-7.714})\) | \(2^{-8} \cdot (1 + 2^{-7.762})\) |
32 | \(2^{-8} \cdot (1 + 2^{-5.383})\) | \(2^{-8} \cdot (1 + 2^{-7.880})\) | \(2^{-8} \cdot (1 + 2^{-7.991})\) |
48 | \(2^{-8} \cdot (1 + 2^{-5.938})\) | \(2^{-8} \cdot (1 + 2^{-8.043})\) | \(2^{-8} \cdot (1 + 2^{-8.350})\) |
64 | \(2^{-8} \cdot (1 + 2^{-6.496})\) | \(2^{-8} \cdot (1 + 2^{-8.244})\) | \(2^{-8} \cdot (1 + 2^{-8.664})\) |
80 | \(2^{-8} \cdot (1 + 2^{-7.224})\) | \(2^{-8} \cdot (1 + 2^{-8.407})\) | \(2^{-8} \cdot (1 + 2^{-9.052})\) |
96 | \(2^{-8} \cdot (1 + 2^{-7.911})\) | \(2^{-8} \cdot (1 + 2^{-8.577})\) | \(2^{-8} \cdot (1 + 2^{-9.351})\) |
112 | \(2^{-8} \cdot (1 + 2^{-8.666})\) | \(2^{-8} \cdot (1 + 2^{-8.747})\) | \(2^{-8} \cdot (1 + 2^{-9.732})\) |

The probability of these biases is given as Theorem 9 (the proof is in Appendix A).

**Theorem 9**

### 3.5 Cumulative Bias Set of First 257 Bytes

Table 2. Cumulative bias set of first 257 bytes

\(r\) | Strongest known bias of \(Z_{r}\) | Prob. (Theoretical) | Prob. (Experimental) |
---|---|---|---|
1 | \(Z_1 = 0 \mid Z_2 = 0\) (Our) | \(2^{-8} \cdot (1 + 2^{-1.009})\) | \(2^{-8} \cdot (1 + 2^{-1.036})\) |
2 | \(Z_2 = 0\) [11] | \(2^{-8} \cdot (1 + 2^{0})\) | \(2^{-8} \cdot (1 + 2^{0.002})\) |
3 | \(Z_3 = 131\) (Our) | \(2^{-8} \cdot (1 + 2^{-8.089})\) | \(2^{-8} \cdot (1 + 2^{-8.109})\) |
4 | \(Z_4 = 0\) [8] | \(2^{-8} \cdot (1 + 2^{-7.581})\) | \(2^{-8} \cdot (1 + 2^{-7.611})\) |
\(5\)–\(15\) | \(Z_r = r\) (Our) | max: \(2^{-8} \cdot (1 + 2^{-7.627})\); min: \(2^{-8} \cdot (1 + 2^{-7.737})\) | max: \(2^{-8} \cdot (1 + 2^{-7.335})\); min: \(2^{-8} \cdot (1 + 2^{-7.535})\) |
16 | \(Z_{16} = 240\) [5] | \(2^{-8} \cdot (1 + 2^{-4.841})\) | \(2^{-8} \cdot (1 + 2^{-4.811})\) |
\(17\)–\(31\) | \(Z_r = r\) (Our) | max: \(2^{-8} \cdot (1 + 2^{-7.759})\); min: \(2^{-8} \cdot (1 + 2^{-7.912})\) | max: \(2^{-8} \cdot (1 + 2^{-7.576})\); min: \(2^{-8} \cdot (1 + 2^{-7.839})\) |
32 | \(Z_{32} = 224\) (Our) | \(2^{-8} \cdot (1 + 2^{-5.404})\) | \(2^{-8} \cdot (1 + 2^{-5.383})\) |
\(33\)–\(47\) | \(Z_r = 0\) [8] | max: \(2^{-8} \cdot (1 + 2^{-7.897})\); min: \(2^{-8} \cdot (1 + 2^{-8.050})\) | max: \(2^{-8} \cdot (1 + 2^{-7.868})\); min: \(2^{-8} \cdot (1 + 2^{-8.039})\) |
48 | \(Z_{48} = 208\) (Our) | \(2^{-8} \cdot (1 + 2^{-5.981})\) | \(2^{-8} \cdot (1 + 2^{-5.938})\) |
\(49\)–\(63\) | \(Z_r = 0\) [8] | max: \(2^{-8} \cdot (1 + 2^{-8.072})\); min: \(2^{-8} \cdot (1 + 2^{-8.224})\) | max: \(2^{-8} \cdot (1 + 2^{-8.046})\); min: \(2^{-8} \cdot (1 + 2^{-8.238})\) |
64 | \(Z_{64} = 192\) (Our) | \(2^{-8} \cdot (1 + 2^{-6.576})\) | \(2^{-8} \cdot (1 + 2^{-6.496})\) |
\(65\)–\(79\) | \(Z_r = 0\) [8] | max: \(2^{-8} \cdot (1 + 2^{-8.246})\); min: \(2^{-8} \cdot (1 + 2^{-8.398})\) | max: \(2^{-8} \cdot (1 + 2^{-8.223})\); min: \(2^{-8} \cdot (1 + 2^{-8.376})\) |
80 | \(Z_{80} = 176\) (Our) | \(2^{-8} \cdot (1 + 2^{-7.192})\) | \(2^{-8} \cdot (1 + 2^{-7.224})\) |
\(81\)–\(95\) | \(Z_r = 0\) [8] | max: \(2^{-8} \cdot (1 + 2^{-8.420})\); min: \(2^{-8} \cdot (1 + 2^{-8.571})\) | max: \(2^{-8} \cdot (1 + 2^{-8.398})\); min: \(2^{-8} \cdot (1 + 2^{-8.565})\) |
96 | \(Z_{96} = 160\) (Our) | \(2^{-8} \cdot (1 + 2^{-7.831})\) | \(2^{-8} \cdot (1 + 2^{-7.911})\) |
\(97\)–\(111\) | \(Z_r = 0\) [8] | max: \(2^{-8} \cdot (1 + 2^{-8.592})\); min: \(2^{-8} \cdot (1 + 2^{-8.741})\) | max: \(2^{-8} \cdot (1 + 2^{-8.570})\); min: \(2^{-8} \cdot (1 + 2^{-8.722})\) |
112 | \(Z_{112} = 144\) (Our) | \(2^{-8} \cdot (1 + 2^{-8.500})\) | \(2^{-8} \cdot (1 + 2^{-8.666})\) |
\(113\)–\(255\) | \(Z_r = 0\) [8] | max: \(2^{-8} \cdot (1 + 2^{-8.763})\); min: \(2^{-8} \cdot (1 + 2^{-10.052})\) | max: \(2^{-8} \cdot (1 + 2^{-8.760})\); min: \(2^{-8} \cdot (1 + 2^{-10.041})\) |
256 | \(Z_{256} = 0\) (negative bias) (Our) | N/A | \(2^{-8} \cdot (1 - 2^{-9.407})\) |
257 | \(Z_{257} = 0\) (Our) | N/A | \(2^{-8} \cdot (1 + 2^{-9.531})\) |

For the first time, we propose a cumulative list of strongest known biases in the initial bytes of RC4 that can be exploited in a practical attack against the broadcast mode of the cipher.
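For use in an attack, the cumulative bias set can be encoded as a lookup from the byte position \(r\) to the biased keystream value. The following sketch transcribes the table of this section for a 16-byte key. The function name and the `(value, sign)` return convention are assumptions made here for illustration, with `sign = -1` marking the negative bias of \(Z_{256}\).

```python
def strongest_bias(r):
    """Return (value, sign) of the strongest known single bias of Z_r, 1 <= r <= 257,
    for a 16-byte key. sign is +1 for a positive bias and -1 for a negative one.
    For r = 1 the bias is conditional on Z_2 = 0."""
    if not 1 <= r <= 257:
        raise ValueError("bias set covers only r = 1..257")
    if r == 3:
        return 131, +1
    if r == 256:
        return 0, -1                      # Z_256 = 0 is less likely than uniform
    if r % 16 == 0 and r <= 112:
        return (-r) % 256, +1             # keylength-dependent: Z_r = -r (240, 224, ...)
    if 5 <= r <= 31:
        return r, +1                      # Z_r = r is strongest for 5..15 and 17..31
    return 0, +1                          # r = 1, 2, 4, remaining r up to 255, and 257
```

For instance, `strongest_bias(16)` returns `(240, 1)` and `strongest_bias(40)` returns `(0, 1)`, matching the corresponding rows of the table.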

## 4 Experimental Results of Plaintext Recovery Attack

- Step 1.
Randomly generate a target plaintext \(P\).

- Step 2.
Encrypt \(P\) with \(2^x\) randomly-chosen keys, and obtain \(2^x\) ciphertexts \(C\).

- Step 3.
Find the most frequent value at each byte position of the ciphertexts, and extract \(P_r\), assuming \(P_r = C_r \oplus Z_r\) where \(Z_r\) is the value of the keystream byte from our bias set.
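Step 3 can be sketched as a frequency count followed by one XOR per position. The code below is a minimal illustration, not the authors' experimental code: the function name and parameters are hypothetical, and using the least frequent ciphertext byte for a negative bias (as for \(Z_{256}\)) is an assumption of this sketch. The `n_candidates` parameter covers the variant that keeps the two most frequent bytes.

```python
from collections import Counter


def recover_byte(cipher_bytes, z, sign=+1, n_candidates=1):
    """Guess P_r from the r-th bytes of many ciphertexts.

    cipher_bytes: iterable of the values C_r observed across ciphertexts
    z:            biased keystream value for this position (from the bias set)
    sign:         +1 ranks by most frequent C_r, -1 by least frequent (assumed here)
    Returns the list of candidates P_r = C_r xor z, best first."""
    counts = Counter(cipher_bytes)
    ranked = sorted(range(256), key=lambda c: counts[c], reverse=(sign > 0))
    return [c ^ z for c in ranked[:n_candidates]]
```

As a usage example, if the value `0xC2` dominates the third ciphertext byte, `recover_byte(third_bytes, 131)` yields `[0x41]`, since \(\mathtt{0xC2} \oplus 131 = \mathtt{0x41}\).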

Figure 9 shows the success probability of extracting each byte \(P_r\) \((1 \le r \le 257)\) when \(2^{24}\), \(2^{28}\), \(2^{32}\), \(2^{35}\) ciphertexts are given. Note that the probability of a random guess is \(1/256 = 0.00390625\). Given \(2^{32}\) ciphertexts, all bytes of \(P_1, P_2, \ldots , P_{257}\) can be extracted with probability more than 0.5. In addition, most bytes can be extracted with probability more than 0.8. Also, the bytes having a stronger bias, such as \(P_1\), \(P_2\), \(P_{16}\), \(P_{32}\), \(P_{48}\), \(P_{64}\), are extracted from only \(2^{24}\) ciphertexts with high probability. However, even if \(2^{35}\) ciphertexts are given, the probability does not reach one for some bytes. We conjecture that for such bytes, the difference between the probability of the strongest known bias (as in our cumulative bias set) and that of the second strongest one is very small. Thus, more ciphertexts are required for an attack with probability one.

We additionally utilize the second most frequent byte in the ciphertexts for extracting plaintext bytes. In other words, two candidates are obtained by using the relation \(P_r = C_r \oplus Z_r\), where \(C_r\) are the most and second most frequent ciphertext bytes and \(Z_r\) is chosen from our bias set. This result is shown in Fig. 10, and its success probability is estimated as the probability that the guess for the correct plaintext byte is narrowed down to two possible candidates. Note that the probability of a random guess for such a scenario is \(2/256 = 0.0078125\). Given \(2^{34}\) ciphertexts, the correct value of each byte of \(P_1, P_2, \ldots , P_{257}\) is among the two candidates with probability one. In this case, although we cannot identify the correct byte of the plaintext, it is narrowed down to only two candidates. For the experiments of Figs. 9 and 10, it takes about one day on a single CPU core (Intel(R) Core(TM) i7 CPU 920@ 2.67 GHz) to obtain the result for one plaintext, where 256 plaintexts are used.

Figure 11 shows the number of plaintext bytes that are extracted with five times higher probability than that of a random guess, i.e., where the success probability is more than \(\frac{5}{256}\). Given \(2^{29}\) ciphertexts, all the plaintext bytes \(P_1, P_2, \ldots , P_{257}\) are guessed with much higher probability than random guesses.

## 5 How to Recover Bytes of the Plaintext After \(P_{258}\)

In this section, we propose an efficient method to recover later bytes of the plaintext, namely bytes after \(P_{258}\). The method using our bias in initial bytes is not directly applied to extract these bytes, because it exploits biases existing in only the initial keystream. For the extraction of the later bytes, a long-term bias, which occurs in any keystream bytes, is utilized. In particular, the digraph repetition bias (also called \(ABSAB\) bias) proposed by Mantin [10], which is the strongest known long-term bias, is used. Combining it with our cumulative bias set of \(Z_{1}, Z_{2}, \ldots , Z_{257}\), we can sequentially recover bytes of a plaintext, even after \(P_{258}\), given only the ciphertexts.

### 5.1 Best Known Long-Term Bias (\(ABSAB\) bias)

**Theorem 10**

[10]**.** For small values of G the probability of the pattern ABSAB in RC4 keystream, where S is a G-byte string, is \((1 + e^{(-4-8G)/N}/N) \cdot 1/N^2\).

For the enhancement of these biases, combining use of \(ABSAB\) biases with different \(G\) is considered by using the following lemma for the discrimination.

**Lemma 1**

[10]**.** Let X and Y be two distributions and suppose that the independent events {\(E_i\): \(1 \le i \le k\) } occur with probabilities \(p_X(E_i) = p_i\) in \(X\) and \(p_Y(E_i)=(1+b_i) \cdot p_i\) in Y. Then the discrimination \(D\) of the distributions is \(\sum _i p_i \cdot b_i^{2}\).

The number of required samples for distinguishing the biased distribution from the random distribution with probability of \(1-\alpha \) is given as the following lemma.

**Lemma 2**

[10]**.** The number of samples that is required for distinguishing two distributions that have discrimination \(D\) with success rate \(1-\alpha \) (for both directions) is \((1/D) \cdot (1- 2\alpha ) \cdot \log _2 \frac{1-\alpha }{\alpha }\).

This lemma shows that in the broadcast RC4 attack, given \(D\) and the number of samples \(N_{ciphertext}\), the success probability for distinguishing the distribution of correct candidate plaintext byte (the biased distribution) from the distribution of one wrong candidate of plaintext byte (a random distribution) is a constant. \(\mathrm{Pr}_{distinguish}\) denotes this probability.
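Theorem 10, Lemma 1 and Lemma 2 combine into a short numerical calculation. The sketch below is an illustration under one stated assumption: each gap \(G\) contributes a single event with \(p_i = 1/N^2\) and \(b_i = e^{(-4-8G)/N}/N\), and these events are summed for \(G = 0, 1, \ldots , G_{MAX}\). The function names are hypothetical.

```python
import math


def absab_discrimination(g_max, n=256):
    """Discrimination D = sum_i p_i * b_i^2 (Lemma 1) over gaps G = 0..g_max,
    with p_i = 1/N^2 and b_i = e^{(-4-8G)/N} / N taken from Theorem 10."""
    p = 1.0 / n ** 2
    return sum(p * (math.exp((-4 - 8 * g) / n) / n) ** 2 for g in range(g_max + 1))


def samples_needed(d, alpha):
    """Number of samples from Lemma 2 for success rate 1 - alpha."""
    return (1.0 / d) * (1 - 2 * alpha) * math.log2((1 - alpha) / alpha)
```

With `g_max = 63`, the value used later in Sect. 5.2, `math.log2(absab_discrimination(63))` comes out at about \(-28\), in line with the estimate \(D = 2^{-28.0}\) quoted there.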

### 5.2 Plaintext Recovery Method Using \(ABSAB\) Bias and Our Bias Set

However, these relations with different \(G\) cannot be combined in a straightforward way to enhance the biases, as is done in the distinguishing attack setting. When the value of \(G\) is different, the above equation is different even if \(r\) is properly chosen. For example, in the cases of (\(r\) and \(G=1\)) and (\(r+1\) and \(G=0\)), the right-hand sides of the equations are given as \((P_{r} ~||~ P_{r+1}) \oplus ( P_{r+3} ~||~ P_{r+4})\) and \((P_{r+1} ~||~ P_{r+2}) \oplus ( P_{r+3} ~||~ P_{r+4})\), respectively. Thus, because these equations with different \(G\) have to be used independently, we are not able to make efficient use of the \(ABSAB\) bias in the broadcast setting.

In order to get rid of this problem, we give a method that sequentially recovers the plaintext after \(P_{258}\) with the knowledge of pre-guessed plaintext bytes. For example, in the cases of (\(r\) and \(G=1\)) and (\(r+1\) and \(G=0\)), if \(P_{r}\), \(P_{r+1}\), and \(P_{r+2}\) are already known, two equations with respect to \((P_{r+3}~||~P_{r+4})\) are obtained by transposing \(P_{r}\), \(P_{r+1}\), and \(P_{r+2}\) to the left-hand side of each equation. Then, these equations with different \(G\) can be merged.

Suppose that \(P_1, P_2, \dots , P_{257}\) are guessed by our cumulative bias set of the initial bytes, where the success probability of finding these bytes is evaluated in Sect. 4. Then we aim to sequentially find \(P_r ~\mathrm{for}~ r = 258, 259, \dots , P_{MAX}\) by using \(ABSAB\) biases of \(G = 0, 1, \dots , G_{MAX}\). The detailed procedure is given as follows.

- Step 1.
Obtain \(C_{258-3-G_{MAX}},C_{258-2-G_{MAX}},\dots ,C_{P_{MAX}}\) in each ciphertext, and make frequency tables \(T_{count}[r][G]\) of \((C_{r-3-G} ~||~ C_{r-2-G}) \oplus ( C_{r-1} ~||~ C_{r})\) for all \(r=258, 259, \dots , P_{MAX}\) and \(G = 0, 1, \dots , G_{MAX}\), where \((C_{r-3-G} ~||~ C_{r-2-G})\)\(\oplus \)\(( C_{r-1} ~||~ C_{r}) = (P_{r-3-G} ~||~ P_{r-2-G}) \oplus ( P_{r-1} ~||~ P_{r})\) only if Eq. (1) holds.

- Step 2.
Set \(r=258\).

- Step 3.
Guess the value of \(P_{r}\).

- Step 3.1.
For \(G = 0, 1, \dots , G_{MAX}\), convert \(T_{count}[r][G]\) into a frequency table \(T_{marge}[r]\) of \((P_{r-1} ~||~P_{r})\) by using pre-guessed values of \(P_{r-3-G_{MAX}}\), \(\dots \), \(P_{r-2}\), and merge counter values of all tables.

- Step 3.2.
Make a frequency table \(T_{guess}[r]\) indexed by only \(P_{r}\) from \(T_{marge}[r]\) with knowledge of \(P_{r-1}\). More precisely, using a pre-guessed value of \(P_{r-1}\), only the entries of \(T_{marge}[r]\) corresponding to that value of \(P_{r-1}\) are taken into consideration. Finally, regard the most frequent value in table \(T_{guess}[r]\) as the correct \(P_{r}\).

- Step 4.
Increment \(r\). If \(r=P_{MAX} + 1\), terminate this algorithm. Otherwise, go to Step 3.
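Steps 1 to 4 can be sketched as follows. This is a toy illustration rather than the authors' implementation: positions are 1-indexed with an unused entry at index 0, the merged table is built directly in its reduced form \(T_{guess}[r]\), and the demo plants an artificially strong digraph repetition in random keystreams (not RC4 output) so that a few thousand samples suffice. All names are hypothetical.

```python
import random
from collections import Counter


def build_tables(ciphertexts, r_first, r_last, g_max):
    """Step 1: T_count[r][G] counts (C[r-3-G] || C[r-2-G]) xor (C[r-1] || C[r])."""
    tables = {(r, g): Counter()
              for r in range(r_first, r_last + 1) for g in range(g_max + 1)}
    for c in ciphertexts:
        for (r, g), table in tables.items():
            table[(c[r - 3 - g] ^ c[r - 1], c[r - 2 - g] ^ c[r])] += 1
    return tables


def recover_sequentially(tables, known, r_first, r_last, g_max):
    """Steps 2-4: `known` maps positions below r_first to pre-guessed plaintext bytes."""
    plain = dict(known)
    for r in range(r_first, r_last + 1):
        guess = Counter()                      # T_guess[r], indexed by candidate P_r
        for g in range(g_max + 1):
            first = plain[r - 3 - g] ^ plain[r - 1]   # fixed by already known bytes
            for cand in range(256):
                guess[cand] += tables[(r, g)][(first, plain[r - 2 - g] ^ cand)]
        plain[r] = guess.most_common(1)[0][0]  # most frequent candidate becomes P_r
    return plain


def demo(seed=1, n=2000, r_first=10, r_last=13, g_max=3):
    """Toy run on synthetic keystreams with one planted repetition per ciphertext."""
    rng = random.Random(seed)
    plaintext = [0] + [rng.randrange(256) for _ in range(r_last)]
    ciphertexts = []
    for _ in range(n):
        z = [0] + [rng.randrange(256) for _ in range(r_last)]
        r, g = rng.randint(r_first, r_last), rng.randint(0, g_max)
        z[r - 1], z[r] = z[r - 3 - g], z[r - 2 - g]   # planted ABSAB pattern, gap g
        ciphertexts.append([p ^ k for p, k in zip(plaintext, z)])
    known = {i: plaintext[i] for i in range(1, r_first)}
    tables = build_tables(ciphertexts, r_first, r_last, g_max)
    return plaintext, recover_sequentially(tables, known, r_first, r_last, g_max)
```

In the real attack the pattern occurs only with the small excess probability of Theorem 10, which is why on the order of \(2^{34}\) ciphertexts are needed; the toy run only demonstrates the bookkeeping of the tables.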

The bytes of the plaintext are correctly extracted from \(T_{marge}[r]\) only if the correct candidate is distinguished from the other \(N^2 - 1\) wrong candidate distributions. Assuming that wrong candidates are randomly distributed, the probability of correct extraction from \(T_{marge}[r]\) is estimated as \((\mathrm{Pr}_{distinguish})^{N^2 -1}\). In Step 3.2, our method converts \(T_{marge}[r]\) into \(T_{guess}[r]\) by using knowledge of \(P_{r-1}\), where \(T_{guess}[r]\) has \(N-1\) wrong candidates. This reduces the number of wrong candidates from \(N^2 -1\) to \(N-1\). The probability of correct extraction from \(T_{guess}[r]\) is then estimated as \((\mathrm{Pr}_{distinguish})^{N-1}\), which is \(1/(\mathrm{Pr}_{distinguish})^{N^2-N}\) times higher than that of \(T_{marge}[r]\). Therefore, the table reduction technique of Step 3.2 enables us to further optimize the attack.

**Experimental Results.** We perform practical experiments using our algorithm to find \(P_{258}\), \(P_{259}\), \(P_{260}\), and \(P_{261}\) (\(P_{MAX}=261\)). As the parameter of the \(ABSAB\) bias, \(G_{MAX} = 63\) is chosen, because the increase in \(D\) levels off around \(G_{MAX} = 63\). Then, \(D\) is estimated as \(D=2^{-28.0}\). The success probability of our algorithm for recovering \(P_r\) (\(r \ge 258\)) when \(2^{30}\) to \(2^{34}\) ciphertexts are given is shown in Table 3, where the number of tests is 256. Note that \(P_{1}, P_{2}, \dots , P_{257}\) are obtained by using our bias set (candidate one) with the success probability shown in Fig. 9. For this experiment, it takes about one week on a single CPU core (Intel(R) Core(TM) i7 CPU 920@ 2.67 GHz) to get the result for one plaintext, where 256 plaintexts are used.

Interestingly, given \(2^{34}\) ciphertexts, \(P_{258}\), \(P_{259}\), \(P_{260}\), and \(P_{261}\) can be recovered with probability one, while the success probability of some bytes in \(P_{1}, P_2, \dots , P_{257}\) is not one. Combining multiple biases allows us to suppress the negative effect of some incorrectly guessed values of \(P_{1}, P_2, \dots , P_{257}\). Although our experiment covers only the bytes up to \(P_{261}\), the success probability is expected not to change for later bytes, because the \(ABSAB\) bias is a long-term bias.

Let us discuss the success probability of extracting bytes after \(P_{262}\) when \(2^{34}\) ciphertexts are given. According to Lemma 2 and \(D=2^{-28.0}\), \(2^{34}\) ciphertexts allow us to distinguish an RC4 keystream from a random stream with a probability of \(\mathrm{Pr}_{distinguish} = 1 - 10^{-19}\). Then, assuming that wrong candidates are randomly distributed, the probability of correctly extracting the candidate from \((N-1)\) wrong candidates is estimated as \((\mathrm{Pr}_{distinguish})^{N - 1}\). Therefore, our method enables us to extract (\(257 + X\)) consecutive bytes of a plaintext with a probability of \(((\mathrm{Pr}_{distinguish})^{N - 1})^{X} = (\mathrm{Pr}_{distinguish})^{(N - 1) \cdot X}\). For instance, when \(X = 2^{40}\) and \(X = 2^{50}\), the success probabilities are estimated as \(0.99997\) and \(0.97170\), respectively.
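The two figures at the end of this paragraph can be reproduced numerically. Since \(1 - 10^{-19}\) rounds to exactly 1.0 in double precision, the sketch below evaluates \((1 - \alpha )^{(N-1) \cdot X}\) through `math.log1p`; the function name is chosen here for illustration.

```python
import math


def prob_all_recovered(x, alpha=1e-19, n=256):
    """(Pr_distinguish)^((N-1) * X) with Pr_distinguish = 1 - alpha.

    Computed as exp((N-1) * X * log(1 - alpha)), using log1p so that the tiny
    alpha is not lost to floating-point rounding."""
    return math.exp((n - 1) * x * math.log1p(-alpha))
```

Evaluating `prob_all_recovered(2 ** 40)` and `prob_all_recovered(2 ** 50)` gives values that round to \(0.99997\) and \(0.97170\), matching the estimates above.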

Table 3. Success probability of our algorithm for recovering \(P_r\) (\(r \ge 258\)).

\# of ciphertexts | \(P_{258}\) | \(P_{259}\) | \(P_{260}\) | \(P_{261}\) |
---|---|---|---|---|
\(2^{30}\) | 0.003906 | 0.003906 | 0.000000 | 0.000000 |
\(2^{31}\) | 0.039062 | 0.007812 | 0.003906 | 0.007812 |
\(2^{32}\) | 0.386719 | 0.152344 | 0.070312 | 0.027344 |
\(2^{33}\) | 0.964844 | 0.941406 | 0.921875 | 0.902344 |
\(2^{34}\) | 1.000000 | 1.000000 | 1.000000 | 1.000000 |

## 6 Conclusion

In this paper, we have evaluated the practical security of RC4 in the broadcast setting. After the introduction of four new biases of the keystream of RC4, i.e., the conditional bias of \(Z_1\), the biases of \(Z_3 = 131\) and \(Z_r = r\) for \(3 \le r \le 255\), and the extended keylength-dependent biases, a cumulative list of strongest known biases in \(Z_1, Z_2, \ldots , Z_{257}\) is given. Then, we demonstrate a practical plaintext recovery attack using our bias set by a computer experiment. As a result, most bytes of \(P_1, P_2, \ldots , P_{257}\) could be extracted with probability more than \(0.8\) using \(2^{32}\) ciphertexts encrypted by randomly-chosen keys. Finally, we have proposed an efficient method to extract bytes of plaintexts after \(P_{258}\). Our attack is able to recover any plaintext byte from only ciphertexts generated using different keys. For example, first \(2^{50}\) bytes of the plaintext are expected to be recovered from \(2^{34}\) ciphertexts with high probability.

Note that our attack on broadcast RC4, as proposed in this paper, relies on the sequential recovery of plaintext bytes, starting from the initial bytes. If the initial 256/512/768 bytes of the keystream are discarded by the protocol, as recommended for RC4 usage [14], our attack no longer works. However, widely-used protocols such as SSL/TLS do use the initial bytes of the keystream. For SSL/TLS, the broadcast setting is converted into the multi-session setting, where the target plaintext blocks are repeatedly sent in the same position of the plaintext in multiple SSL/TLS sessions [2].
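The countermeasure of discarding initial keystream bytes can be illustrated as follows. This is a minimal sketch of the standard RC4 KSA and PRGA with an added `drop` parameter; the function name `rc4_keystream` and its parameters are hypothetical names chosen for illustration, not an interface from any protocol or from our experiments.

```python
def rc4_keystream(key: bytes, length: int, drop: int = 0) -> bytes:
    """Return `length` RC4 keystream bytes after discarding the first `drop` bytes."""
    n = 256

    # KSA: derive the initial permutation S from the key.
    s = list(range(n))
    j = 0
    for i in range(n):
        j = (j + s[i] + key[i % len(key)]) % n
        s[i], s[j] = s[j], s[i]

    # PRGA: generate keystream bytes, skipping the first `drop` outputs.
    i = j = 0
    out = bytearray()
    for r in range(drop + length):
        i = (i + 1) % n
        j = (j + s[i]) % n
        s[i], s[j] = s[j], s[i]
        z = s[(s[i] + s[j]) % n]
        if r >= drop:
            out.append(z)
    return bytes(out)


if __name__ == "__main__":
    # Keystream that would be used when the initial 768 bytes are suppressed.
    print(rc4_keystream(b"Key", 8, drop=768).hex())
```

With `drop=0` the keystream begins at \(Z_1\), which is the situation the attack exploits; with `drop` set to 256, 512, or 768 the biased initial bytes are never XOR-ed with plaintext.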

Our evaluation reveals that broadcast RC4 is practically vulnerable to plaintext recovery attacks, since a moderate number of ciphertexts, i.e., \(2^{24}\) to \(2^{34}\) ciphertexts generated by different keys, leaks considerable information about the plaintext. Thus, RC4 cannot be recommended for encryption in the typical broadcast setting or in the multi-session setting of SSL/TLS.

## Notes

### Acknowledgments

We would like to thank Sourav Sen Gupta and the anonymous referees for their fruitful comments and suggestions. We would also like to thank Tubasa Tsukaune and Atsushi Nagao for insightful discussions. This work was supported in part by a Grant-in-Aid for Scientific Research (C) (KAKENHI 23560455) from the Japan Society for the Promotion of Science and by the Cryptography Research and Evaluation Committee (CRYPTREC).

### References

- 1.Biham, E., Carmeli, Y.: Efficient reconstruction of RC4 keys from internal states. In: Nyberg, K. (ed.) FSE 2008. LNCS, vol. 5086, pp. 270–288. Springer, Heidelberg (2008)
- 2.Canvel, B., Hiltgen, A.P., Vaudenay, S., Vuagnoux, M.: Password interception in a SSL/TLS channel. In: Boneh, D. (ed.) CRYPTO 2003. LNCS, vol. 2729, pp. 583–599. Springer, Heidelberg (2003)
- 3.Fluhrer, S.R., McGrew, D.A.: Statistical analysis of the alleged RC4 keystream generator. In: Schneier, B. (ed.) FSE 2000. LNCS, vol. 1978, p. 19. Springer, Heidelberg (2001)
- 4.Golić, J.D.: Linear statistical weakness of alleged RC4 keystream generator. In: Fumy, W. (ed.) EUROCRYPT 1997. LNCS, vol. 1233, pp. 226–238. Springer, Heidelberg (1997)
- 5.Sen Gupta, S., Maitra, S., Paul, G., Sarkar, S.: Proof of empirical RC4 biases and new key correlations. In: Miri, A., Vaudenay, S. (eds.) SAC 2011. LNCS, vol. 7118, pp. 151–168. Springer, Heidelberg (2012)
- 6.Sen Gupta, S., Maitra, S., Paul, G., Sarkar, S.: (Non-)random sequences from (Non-)random permutations - analysis of RC4 stream cipher. J. Cryptol. **27**(1), 67–108 (2014). http://dblp.uni-trier.de/rec/bibtex/journals/joc/GuptaMPS14
- 7.Knudsen, L.R., Meier, W., Preneel, B., Rijmen, V., Verdoolaege, S.: Analysis methods for (alleged) RC4. In: Ohta, K., Pei, D. (eds.) ASIACRYPT 1998. LNCS, vol. 1514, pp. 327–341. Springer, Heidelberg (1998)
- 8.Maitra, S., Paul, G., Sen Gupta, S.: Attack on broadcast RC4 revisited. In: Joux, A. (ed.) FSE 2011. LNCS, vol. 6733, pp. 199–217. Springer, Heidelberg (2011)
- 9.Mantin, I.: Analysis of the stream cipher RC4. Master’s Thesis, The Weizmann Institute of Science, Israel (2001). http://www.wisdom.weizmann.ac.il/itsik/RC4/rc4.html
- 10.Mantin, I.: Predicting and distinguishing attacks on RC4 keystream generator. In: Cramer, R. (ed.) EUROCRYPT 2005. LNCS, vol. 3494, pp. 491–506. Springer, Heidelberg (2005)
- 11.Mantin, I., Shamir, A.: A practical attack on broadcast RC4. In: Matsui, M. (ed.) FSE 2001. LNCS, vol. 2355, p. 152. Springer, Heidelberg (2002)
- 12.Matsui, M.: Key collisions of the RC4 stream cipher. In: Dunkelman, O. (ed.) FSE 2009. LNCS, vol. 5665, pp. 38–50. Springer, Heidelberg (2009)
- 13.Maximov, A., Khovratovich, D.: New state recovery attack on RC4. In: Wagner, D. (ed.) CRYPTO 2008. LNCS, vol. 5157, pp. 297–316. Springer, Heidelberg (2008)
- 14.Mironov, I.: (Not so) random shuffles of RC4. In: Yung, M. (ed.) CRYPTO 2002. LNCS, vol. 2442, pp. 304–319. Springer, Heidelberg (2002)
- 15.Paul, G., Maitra, S.: Permutation after RC4 key scheduling reveals the secret key. In: Adams, C., Miri, A., Wiener, M. (eds.) SAC 2007. LNCS, vol. 4876, pp. 360–377. Springer, Heidelberg (2007)
- 16.Paul, S., Preneel, B.: A new weakness in the RC4 keystream generator and an approach to improve the security of the cipher. In: Roy, B., Meier, W. (eds.) FSE 2004. LNCS, vol. 3017, pp. 245–259. Springer, Heidelberg (2004)
- 17.Sepehrdad, P., Vaudenay, S., Vuagnoux, M.: Discovery and exploitation of new biases in RC4. In: Biryukov, A., Gong, G., Stinson, D.R. (eds.) SAC 2010. LNCS, vol. 6544, pp. 74–91. Springer, Heidelberg (2011)