A Relations and Separations Between Security Notions
This section briefly gives the intuition for the proofs of the relations and separations of the security notions; details appear in the full version [15].
- \({\mathsf {IND{\text {-}}CPA}}\implies {\mathsf {CCI}}\): A (non-compressing) \({\mathsf {IND{\text {-}}CPA}}\)-secure symmetric-key encryption scheme provides indistinguishability of any pair of equal-length chosen messages, including messages involving a cookie. The proof proceeds by a hybrid argument, making the cookie used in each query made by the adversary to its \(E_1\) oracle independent of the secret bit b.
- \({\mathsf {CCI}}\not\Rightarrow {\mathsf {IND{\text {-}}CPA}}\): A degenerate scheme that uses a separating-secrets filter to extract the secret cookies and then encrypts the cookies, but not the non-cookie data, is \({\mathsf {CCI}}\)-secure but not \({\mathsf {IND{\text {-}}CPA}}\)-secure for the whole message.
- \({\mathsf {CCI}}\implies {\mathsf {RCI}}\): A straightforward simulation: an adversary who cannot distinguish between encryptions of equal-length cookies of its choosing also cannot distinguish between encryptions of randomly chosen equal-length cookies.
- \({\mathsf {RCI}}\not\Rightarrow {\mathsf {CCI}}\): A counterexample is constructed that uses a separating-secrets filter: an extra ciphertext component \(c_2\) is added, consisting of a point function applied to the separated secrets, where the point function is 1 on a single, publicly known cookie value z. With high probability, two randomly chosen cookies will not match z, so \(c_2\) carries no useful information and the scheme is \({\mathsf {RCI}}\)-secure. A \({\mathsf {CCI}}\) adversary, however, can choose one cookie that matches z and one that does not, so \(c_2\) allows it to distinguish the chosen cookies.
- \({\mathsf {RCI}}\implies {\mathsf {CR}}\): A straightforward simulation: an adversary who recovers a cookie given only ciphertexts easily distinguishes encryptions of cookies.
- \({\mathsf {CR}}\not\Rightarrow {\mathsf {RCI}}\): A counterexample is constructed: an extra ciphertext component \(c_2\) is added, consisting of a random oracle applied to the message. The adversary gets encryptions of \(m'\Vert ck\Vert m''\) for \(m',m''\) of its choice; without querying the random oracle on exactly \(m'\Vert ck\Vert m''\), \(c_2\) provides no information to the adversary, so the scheme is \({\mathsf {CR}}\)-secure. However, an \({\mathsf {RCI}}\) adversary can check the random oracle on the two given random cookies, so \(c_2\) allows distinguishing of the given random cookies.
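The point-function counterexample separating \({\mathsf {RCI}}\) from \({\mathsf {CCI}}\) can be illustrated with a toy model. The sketch below models only the extra component \(c_2\), not the encryption itself; the 16-byte cookie length, the all-zero value chosen for z, and all function names are assumptions made for illustration.

```python
import secrets

COOKIE_LEN = 16
# Hypothetical publicly known cookie value z (all zero bytes, chosen arbitrarily).
Z = bytes(COOKIE_LEN)


def point_function(cookie: bytes) -> int:
    """The extra component c_2: 1 on the single point z, 0 everywhere else."""
    return 1 if cookie == Z else 0


def cci_adversary_guess(c2: int) -> int:
    """Adversary that chose ck_0 = z and ck_1 != z: read the bit b off c_2."""
    return 0 if c2 == 1 else 1


# Chosen cookies: c_2 reveals which of the two cookies was encrypted.
ck0, ck1 = Z, b"\x01" * COOKIE_LEN
chosen_guesses = [cci_adversary_guess(point_function(ck)) for ck in (ck0, ck1)]

# Random cookies: c_2 is 0 except with probability 2^-128 per sample,
# so it carries essentially no information.
random_c2 = [point_function(secrets.token_bytes(COOKIE_LEN)) for _ in range(1000)]
```

With chosen cookies the guess is always correct, while for random cookies \(c_2\) is constant in practice, which is the gap the counterexample exploits.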
B Proof of \({\mathsf {CCI}}\) Security of Separating-Secrets Technique
Proof of Theorem 1
The proof proceeds in a sequence of games, using a hybrid approach. Each Game i proceeds as in the original \({\mathsf {CCI}}\) security experiment, except that the queries to \(E_1\) are answered as in Fig. 7. Let \(\mathrm {Adv}^{i}\) denote the probability that game i outputs 1.
Game 0. This is the original \({\mathsf {CCI}}\) security game for \({\varPi }\). By definition,
\(\mathrm {Adv}^{{\mathsf {CCI}}}_{{\varPi }\circ \mathrm {SS}_{f,{\varGamma }},\mathcal {CK}}(\mathcal {A}) = \mathrm {Adv}^{0}.\)
Transition from Game \((i-1)\) to Game i, \(1 \le i \le q\). Each hybrid transition changes how one query is answered; if the adversary’s behaviour differs because of the change in answering the query, we can construct a simulator \(\mathcal {B}_i\) that wins the \({\mathsf {IND{\text {-}}CPA}}\) game for \({\varPsi }\), as shown in Fig. 8. When the \({\mathsf {IND{\text {-}}CPA}}\) challenger uses \(b=0\), \(c^*\) is the encryption of the separating-secrets compression of \(m'\Vert ck_{\hat{b}}\Vert m''\), so \(\mathcal {B}_i\) is playing game \((i-1)\) with \(\mathcal {A}\). When the \({\mathsf {IND{\text {-}}CPA}}\) challenger uses \(b=1\), \(c^*\) is the encryption of the separating-secrets compression of \(m'\Vert ck_0\Vert m''\), so \(\mathcal {B}_i\) is playing game i with \(\mathcal {A}\). Since f is safe for \(\mathcal {CK}\), the separating-secrets compressions of \(m'\Vert ck_0\Vert m''\) and \(m'\Vert ck_1\Vert m''\) have the same length, and thus the pair of chosen messages given from the simulator in \(E_1\) to the \({\mathsf {IND{\text {-}}CPA}}\) challenger is valid according to the \({\mathsf {IND{\text {-}}CPA}}\) experiment. Thus, \( \left| \mathrm {Adv}^{i-1} - \mathrm {Adv}^{i} \right| \le \mathrm {Adv}^{{\mathsf {IND{\text {-}}CPA}}}_{{\varPsi }}(\mathcal {B}_i^\mathcal {A}). \)
Analysis of Game q. Since the adversary’s view is independent of b in Game q, we have \( \mathrm {Adv}^{q} = 0. \)
Conclusion. Combining the above results by the triangle inequality over the q hybrid transitions, together with \(\mathrm {Adv}^{q} = 0\), we have
\( \mathrm {Adv}^{{\mathsf {CCI}}}_{{\varPi }\circ \mathrm {SS}_{f,{\varGamma }},\mathcal {CK}}(\mathcal {A}) \le \sum _{i=1}^{q} \mathrm {Adv}^{{\mathsf {IND{\text {-}}CPA}}}_{{\varPsi }}(\mathcal {B}_i^\mathcal {A}) = q \cdot \mathrm {Adv}^{{\mathsf {IND{\text {-}}CPA}}}_{{\varPsi }}(\mathcal {B}^\mathcal {A}) \) (with a small abuse of notation in creating a single \(\mathcal {B}\) from the disparate \(\mathcal {B}_i\)). \(\square \)
C Analysis of Security of Fixed-Dictionary Technique
C.1 Probability Bounds, No Prefix/Suffix
In this section, we compute the amount of information given to the adversary from knowing the length of the compressed cookie, without any adversarially chosen prefix or suffix. This can be computed by calculating the amount of information given by knowing how many substrings of the cookie appear in the dictionary. For the analysis, we treat \(\mathcal {D}\) as a set of strings. Proofs for results in this section appear in the full version of the paper [15].
First we calculate the probability that a given string is a substring of a randomly chosen cookie.
Lemma 1
Let \(x \in {\varOmega }^w\) be a word, and let \(ck \mathop {\leftarrow }\limits ^{\$}{\varOmega }^{n} = \mathcal {CK}\) be a random string of n characters. Then \( \Pr (x \preceq ck) \le 1 - \left( 1 - \frac{1}{|{\varOmega }|^{w}}\right) ^{n-w+1}. \)
We now compute the probability that at least one of a set of given strings is a substring of a randomly chosen cookie:
Lemma 2
Let \(\mathcal {D}\subseteq {\varOmega }^w\) with \(|\mathcal {D}| = d\) be a dictionary of d words of w characters. Let \(ck \mathop {\leftarrow }\limits ^{\$}{\varOmega }^{n} = \mathcal {CK}\) be a random string of n characters. Then
$$\begin{aligned} \Pr (\exists x \in \mathcal {D}: x \preceq ck) \le d \left( 1 - \left( 1 - \frac{1}{|{\varOmega }|^{w}}\right) ^{n-w+1} \right) . \end{aligned}$$
Recall the definition of conditional entropy for random variables X and Y:
$$\begin{aligned} H(Y \mid X)&= \sum _{x \in {\mathrm {supp}}(X)} \Pr (X = x) H(Y \mid X=x) \\&= - \sum _{x \in {\mathrm {supp}}(X)} \Pr (X=x) \\&\quad \quad \cdot \sum _{y \in {\mathrm {supp}}(Y)} \Pr (Y=y \mid X=x) \log _2 \Pr (Y=y \mid X=x). \end{aligned}$$
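To make the definition concrete, the following minimal sketch evaluates \(H(Y \mid X)\) for a finite joint distribution. The toy distribution (a two-character cookie over the alphabet {a, b}, with the number of a's as the leaked quantity) and the function name are assumptions chosen for illustration only.

```python
import math
from collections import defaultdict


def conditional_entropy(joint):
    """H(Y | X) in bits, where joint maps (x, y) to Pr(X = x, Y = y)."""
    # Marginal distribution of X.
    p_x = defaultdict(float)
    for (x, _y), p in joint.items():
        p_x[x] += p
    # -sum_{x,y} Pr(x, y) * log2 Pr(y | x), which equals the double sum
    # in the definition because Pr(x, y) = Pr(x) * Pr(y | x).
    h = 0.0
    for (x, _y), p in joint.items():
        if p > 0:
            h -= p * math.log2(p / p_x[x])
    return h


# Toy example: Y is a uniform 2-character cookie, X is the number of 'a's in it.
cookies = ["aa", "ab", "ba", "bb"]
joint = {(ck.count("a"), ck): 1 / len(cookies) for ck in cookies}
h_toy = conditional_entropy(joint)
```

Here the leak pins down the cookie exactly when X is 0 or 2 and leaves one bit of uncertainty when X is 1, so the remaining entropy is 0.5 bits out of the original 2.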
We now bound the entropy that remains in the cookie given knowledge of the number of substrings of the cookie that appear in the dictionary:
Lemma 3
Fix \(\mathcal {D}\). Let \(\mathrm {\#SUB}(ck)\) denote the number of substrings of ck that appear in \(\mathcal {D}\). Suppose CK is a uniform random variable on \(\mathcal {CK}\). Then
$$\begin{aligned} H(CK \mid \mathrm {\#SUB}(CK))&\ge \left( 1 - d \left( 1 - \left( 1 - \frac{1}{|{\varOmega }|^{w}}\right) ^{n-w+1} \right) \right) \\&\quad \quad \cdot \log _2 \left( |\mathcal {CK}| - |\mathcal {CK}| \cdot d \left( 1 - \left( 1 - \frac{1}{|{\varOmega }|^{w}}\right) ^{n-w+1} \right) \right) \ . \end{aligned}$$
For example, if we have 16-byte cookies (\(\mathcal {CK}= \{{\mathtt {0x00}},\dots ,{\mathtt {0xFF}}\}^{16}\)), and the dictionary \(\mathcal {D}\) is a set of \(d=4096\) words of length \(w = 4\) bytes, then
$$\begin{aligned} H(CK \mid \mathrm {\#SUB}(CK)) \ge 127.998395. \end{aligned}$$
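The numerical value above can be reproduced by evaluating the right-hand side of Lemma 3 directly. The sketch below is one way to do so; the function names are assumptions, and `log1p`/`expm1` are used only to avoid floating-point cancellation when \(1/|{\varOmega }|^{w}\) is tiny.

```python
import math


def substring_prob_bound(d, w, n, alphabet_size):
    """Lemma 2 bound: d * (1 - (1 - |Omega|^-w)^(n-w+1))."""
    per_word = -math.expm1((n - w + 1) * math.log1p(-float(alphabet_size) ** -w))
    return d * per_word


def entropy_lower_bound(d, w, n, alphabet_size):
    """Right-hand side of Lemma 3, in bits."""
    p = substring_prob_bound(d, w, n, alphabet_size)
    if p >= 1:
        raise ValueError("probability bound is vacuous for these parameters")
    # log2(|CK| - |CK| * p) = n * log2|Omega| + log2(1 - p)
    log_term = n * math.log2(alphabet_size) + math.log1p(-p) / math.log(2)
    return (1 - p) * log_term


# 16-byte cookies, dictionary of 4096 words of 4 bytes each.
bound = entropy_lower_bound(d=4096, w=4, n=16, alphabet_size=256)
print(f"{bound:.6f}")
```

For these parameters the probability bound is roughly \(1.24 \times 10^{-5}\), so the entropy bound falls only about 0.0016 bits short of the full 128 bits.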
Concluding our analysis of the information given to the adversary without any adversarially chosen prefix or suffix, we give a bound on the amount of entropy about the cookie given the length of the compressed cookie:
Lemma 4
Fix \(\mathcal {D}\) with d words of length w over character set \({\varOmega }\). Denote the length of a cookie ck compressed with dictionary \(\mathcal {D}\) by \(\mathrm {COMPLEN}(ck) = |\mathrm {FD}_{\mathcal {D},w,\ell }.\mathrm {Comp}(ck)|\). Suppose CK is a uniform random variable on \(\mathcal {CK}\). Then
$$\begin{aligned}&H(CK \mid \mathrm {COMPLEN}(CK)) \ge H(CK \mid \mathrm {\#SUB}(CK)) \\&\quad \quad \ge \left( 1 - d \left( 1 - \left( 1 - \frac{1}{|{\varOmega }|^{w}}\right) ^{n-w+1} \right) \right) \\&\quad \quad \quad \quad \cdot \log _2 \left( |\mathcal {CK}| - |\mathcal {CK}| \cdot d \left( 1 - \left( 1 - \frac{1}{|{\varOmega }|^{w}}\right) ^{n-w+1} \right) \right) . \end{aligned}$$
Lemma 4 follows from the data processing inequality and Lemma 3.
C.2 Probability Bounds, Prefix/Suffix
Suppose CK is a uniform random variable on \(\mathcal {CK}= {\varOmega }^n\). We know that \(H(CK) = n \log _2(|{\varOmega }|)\). Trivially, \(H(CK \mid CK_1) = (n-1) \log _2(|{\varOmega }|)\), where \(CK_1\) is the first character of CK. Similarly, \(H(CK \mid CK_{1:a}) = (n-a) \log _2(|{\varOmega }|)\) and finally \(H(CK \mid CK_{1:a}, CK_{n-b:b}) = (n-a-b) \log _2(|{\varOmega }|)\).
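As a concrete instance of these identities, take 16-byte cookies (\(n = 16\), \(|{\varOmega }| = 256\)) and suppose that the first \(a = 3\) and the last \(b = 3\) characters are revealed; these values are chosen for illustration only:

$$\begin{aligned} H(CK)&= 16 \log _2 256 = 128 \text { bits}, \\ H(CK \mid CK_{1:3})&= (16-3) \log _2 256 = 104 \text { bits}, \\ H(CK \mid CK_{1:3}, CK_{13:3})&= (16-3-3) \log _2 256 = 80 \text { bits}. \end{aligned}$$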
Consider the following CRIME-like attack on the beginning of the cookie. Let \(\mathcal {D}\) be a dictionary with d words of length w over character set \({\varOmega }\). Let \(ck \in {\varOmega }^n\). Let \(O(\cdot )\) be an oracle that, upon input a of length \(w-m\), with \(1 \le m \le w-1\), returns 1 if and only if \(a \Vert ck_{1:m} \in \mathcal {D}\).
The CRIME-like attack works as follows:
1. For each \(x \in \mathcal {D}\), query \(x_{1:w-1}\) to the oracle. If a query for \(x_{1:w-1}\) returns 1, then it is known that \(ck_{1:1} \in Z_1 = \{ z : x_{1:w-1} \Vert z \in \mathcal {D}\}\). If no query returns 1, then return \(\emptyset \).
2. For \(m=2, \dots , w-1\): For each \(x \in \mathcal {D}\) such that \(x_{w-m} \in Z_{m-1}\), query \(x_{1:w-m}\) to the oracle. If a query for \(x_{1:w-m}\) returns 1, then it is known that \(ck_{1:m} \in Z_m = \{ z_1 z_2 \dots z_m : x_{1:w-m} \Vert z_1 z_2 \dots z_m \in \mathcal {D}\}\). If no query returns 1, then return \(Z_1, \dots , Z_{m-1}\).
3. Return \(Z_1,\dots , Z_{w-1}\).
A corresponding attack on the suffix is obvious.
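The prefix attack can be sketched in code as follows. This is a simplified illustration under stated assumptions, not a definitive implementation: the toy dictionary, the cookie, and the function names are hypothetical; the sketch queries every distinct prefix of the right length rather than filtering on \(Z_{m-1}\) (which only saves queries); and when several queries succeed for the same m it keeps the intersection of their candidate sets.

```python
def make_oracle(dictionary, ck):
    """O(a): returns 1 iff a || ck[:m] is a dictionary word, with m = w - len(a)."""
    w = len(next(iter(dictionary)))

    def oracle(a):
        m = w - len(a)
        return 1 if (1 <= m <= w - 1 and (a + ck[:m]) in dictionary) else 0

    return oracle


def crime_prefix(dictionary, oracle):
    """Return [Z_1, ..., Z_k]: candidate sets for ck[:1], ..., ck[:k]."""
    w = len(next(iter(dictionary)))
    z_sets = []
    for m in range(1, w):
        # Distinct (w - m)-character prefixes of dictionary words.
        prefixes = sorted({x[: w - m] for x in dictionary})
        z_m = None
        for a in prefixes:
            if oracle(a) == 1:
                # All m-character completions of a that give a dictionary word.
                candidates = {x[w - m:] for x in dictionary if x.startswith(a)}
                z_m = candidates if z_m is None else z_m & candidates
        if z_m is None:
            break  # no query returned 1 for this m: stop and return what we have
        z_sets.append(z_m)
    return z_sets


# Toy instance (hypothetical values): words of length w = 4.
dictionary = {"abcs", "abct", "abse", "absx", "asec", "zzzz"}
ck = "secret42"
z_sets = crime_prefix(dictionary, make_oracle(dictionary, ck))
```

In this toy instance each \(Z_m\) contains the true prefix `ck[:m]` along with other candidates, which illustrates that the attack narrows down, but does not necessarily determine, the first \(w-1\) characters.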
Let \(\mathrm {CRIMEpre}(ck)\) denote the output obtained from running the above prefix CRIME attack on ck, and let \(\mathrm {CRIMEsuf}(ck)\) denote the output from the corresponding suffix attack. Let \(\mathrm {CRIME}(ck) = (\mathrm {CRIMEpre}(ck), \mathrm {CRIMEsuf}(ck))\).
Noting that in the best case the CRIME attack allows the attacker to learn the first \(w-1\) and the last \(w-1\) characters of the cookie, some trivial lower bounds are:
$$\begin{aligned} H(CK_{1:w-1} \mid \mathrm {CRIME}(CK))&\ge 0 \\ H(CK_{n-w+1:w-1} \mid \mathrm {CRIME}(CK))&\ge 0 \end{aligned}$$
However, the CRIME attack provides no information about the remaining characters, so \(I(CK_{1:w-1}, CK_{w:n-w+1}) = 0\) and \(I(CK_{1:n-w+1}, CK_{n-w+1:w-1}) = 0\), and thus \(H(CK_{w:n-w+2} \mid \mathrm {CRIME}(CK), \mathrm {COMPLEN}(CK)) = H(CK_{w:n-w+2} \mid \mathrm {COMPLEN}(CK))\).
Finally, we have that
$$\begin{aligned}&H(CK \mid \mathrm {CRIME}(CK), \mathrm {COMPLEN}(CK)) \\&\quad \quad \ge H(CK_{1:w-1} \mid \mathrm {CRIMEpre}(CK)) + H(CK_{w:n-w+2} \mid \mathrm {COMPLEN}(CK)) \\&\quad \quad \quad \quad + H(CK_{n-w+1:w-1} \mid \mathrm {CRIMEsuf}(CK)) \\&\quad \quad \ge 0 + H(CK_{w:n-w+2} \mid \mathrm {COMPLEN}(CK)) + 0 \end{aligned}$$
and we can obtain a lower bound on \(H(CK_{w:n-w+2} \mid \mathrm {COMPLEN}(CK))\) using Lemma 4.