Chosen-Ciphertext Security from Subset Sum

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9614)


We construct a public-key encryption (PKE) scheme whose security is polynomial-time equivalent to the hardness of the Subset Sum problem. Our scheme achieves the standard notion of indistinguishability against chosen-ciphertext attacks (IND-CCA) and can be used to encrypt messages of arbitrary polynomial length, improving upon a previous construction by Lyubashevsky, Palacio, and Segev (TCC 2010) which achieved only the weaker notion of semantic security (IND-CPA) and whose concrete security decreases with the length of the message being encrypted.

At the core of our construction is a trapdoor technique which originates in the work of Micciancio and Peikert (Eurocrypt 2012).


Public-key cryptography Chosen-ciphertext security Subset Sum problem 

1 Introduction

Public-Key Encryption (PKE) is perhaps the most basic application of public-key cryptography [10]. Intuitively a PKE scheme allows Alice to encrypt a message \( M \) for Bob, given just Bob’s public key \( pk \); the received ciphertext C can be decrypted by Bob using the secret key \( sk \) corresponding to \( pk \).

Security of a PKE scheme can be formulated in different ways, depending on the assumed adversarial capabilities. The most basic and natural notion is that of indistinguishability against chosen-plaintext attacks (IND-CPA, a.k.a. semantic security) [14]; here we demand that a passive (computationally bounded) adversary only given \( pk \) should not be able to distinguish the encryption of two (adversarially chosen) messages \( M _0, M _1\).

Whilst already sufficient for some applications, IND-CPA security is not enough to deal with active adversaries. Hence, researchers have put forward stronger security notions. The de-facto standard notion of security for PKE is that of indistinguishability against chosen-ciphertext attacks [29] (IND-CCA), where we now demand that an active (computationally bounded) adversary given \( pk \) should not be able to distinguish the encryption of two (adversarially chosen) messages \( M _0, M _1\) even given access to an oracle decrypting arbitrarily chosen ciphertexts.1

By now we dispose of many PKE schemes satisfying IND-CCA security under a variety of assumptions, including factoring [15], decisional and computational Diffie-Hellman [6, 8], and learning parity with noise [18].

The Subset Sum assumption. Since its introduction, the Subset Sum problem has been considered a valid alternative to number-theoretic assumptions. In its basic computational version, the Subset Sum problem \(\mathrm {SS}(n,\mu )\) (parametrized by integers \(\mu \) and n) asks to find a secret vector \(\mathbf {s}\in \{0,1\}^n\) given a vector \(\mathbf {a}\in \mathbb Z_\mu ^n\) together with the target value \(T := \langle \mathbf {a}\cdot \mathbf {s}\rangle \) mod \({\mu }\), where both \(\mathbf {a}\) and \(\mathbf {s}\) are chosen uniformly at random, and \(\langle \cdot ,\cdot \rangle \) denotes the inner product. The hardness of \(\mathrm {SS}(n,\mu )\) depends on the so-called density, which is defined by the ratio \(\delta := n/\log \mu \). In case \(\delta < 1/n\) or \(\delta > n/\log ^2 n\), the problem can be solved in polynomial time [12, 13, 20, 21, 32]. In case \(\delta \) is o(1) or even as small as \(O(1/\log n)\), the problem is considered to be hard. The best classical algorithm for solving Subset Sum is due to [19], and takes sub-exponential time for solving instances with \(\delta = o(1)\) and time \(2^{(\ln 2/2 + o(1))n/\log \log n}\) for instances with \(\delta = O(1/\log n)\).

One nice feature of the Subset Sum problem is its believed hardness against quantum attacks. At the time of writing, the best quantum attack—due to Bernstein et al. [3]—on Subset Sum requires complexity \(2^{(0.241+o(1))n}\) to solve a random instance of the problem.

PKE from Subset Sum. The first PKE scheme based on the hardness of Subset Sum was constructed in the seminal work of Ajtai and Dwork [2], who presented a scheme whose semantic security is as hard to break as solving worst-case instances of a lattice problem called “the unique shortest vector problem” (uSVP). It is well known that Subset Sum can be reduced to uSVP [13, 20].

A disadvantage of the scheme in [2] (and its extensions [27, 30, 31]) is that they are based on Subset Sum only in an indirect way (i.e., via a non-tight reduction to uSVP). This limitation was overcome by the work of Lyubashevsky, Palacio, and Segev [22] that proposed a new PKE scheme achieving IND-CPA security with a simple and direct reduction to solving random instances of the Subset Sum problem.

More precisely, the security of the scheme in [22] is based on the assumption that a random instance \((\mathbf {a}, T)\) of the Subset Sum problem is indistinguishable from uniform. Such a decisional variant of the problem was shown to be equivalent to the above introduced computational version (i.e., to the task of recovering \(\mathbf {s}\)) by Impagliazzo and Naor [16].

1.1 Our Contributions and Techniques

The work of [22] left as an explicit open problem to construct a PKE scheme achieving IND-CCA security with a direct reduction to the hardness of Subset Sum.

Contributions. In this paper we present a new PKE scheme resolving the above open problem. Previous to our work, the only known PKE schemes with IND-CCA security from Subset Sum were the ones based on uSVP [27, 28] (which are not directly based on the hardness of Subset Sum). An additional advantage of our scheme is that it can be used to encrypt an arbitrary polynomial number of bits; this stands in sharp contrast with the scheme of [22], whose concrete security starts to decrease when encrypting messages of length longer than \(n\log n\) (where n is, as usual, the Subset Sum dimension).2 The theorem below summarizes our main result.

Theorem 1

(Main result, informal). For \(q = \varTheta (n^2\log ^6 n)\) there exists a PKE scheme with IND-CCA security based on the hardness of \(\mathrm {SS}(n,2^{n\log n})\).

Techniques. Our scheme (as the one of [22]) is based on the decisional variant of \(\mathrm {SS}(n,q^m)\), where q is a small integer and m is an integer. The main observation (also made in [22]) is that, in case \(\mu = q^m\), the target value \(T := \langle \mathbf {a}\cdot \mathbf {s}\rangle \) mod \({q^m}\) written in base q is equal to \(\mathbf {A}\mathbf {s} + e(\mathbf {A},\mathbf {s})\) where \(\mathbf {A}\in \mathbb Z_q^{m\times n}\) is a matrix whose i-th column corresponds to the i-th element of vector \(\mathbf {a}\) written in base q, and \(e(\mathbf {A},\mathbf {s})\) is a vector in \(\mathbb Z_q^m\) (function of \(\mathbf {A}\) and \(\mathbf {s}\)) which corresponds to the carries when performing “grade-school” addition. This particular structure resembles the structure of an instance of the learning with errors (LWE) problem [31], with the important difference that the noise term is “deterministic” and, in fact, completely determined by the matrix \(\mathbf {A}\) and the vector \(\mathbf {s}\).

We use the above similarity between LWE and Subset Sum to construct our new PKE scheme, using a trapdoor technique due to Micciancio and Peikert [24]. Essentially our scheme relies on a tag-based trapdoor function, where the trapdoor is associated with a hidden tag. Whenever the function is evaluated w.r.t. the hidden tag, the trapdoor disappears and the function is hard to invert; for all other tags the function can be inverted efficiently given the trapdoor. Using the leftover hash lemma, one can switch the hidden tag without the adversary noticing.

The above technique allows us to prove that our PKE scheme achieves a weaker (tag-based) CCA notion. This means that each ciphertext is associated with a tag \(\tau \), and in the security game the adversary has to commit in advance to the tag \(\tau ^*\) which will be associated with the challenge ciphertext.3 In the security proof we first switch the tag associated with the hidden trapdoor with the challenge tag (using the trapdoor technique outlined above). Now, the simulator is not able to decrypt a message related to the challenge tag which allows us to argue about indistinguishability of the PKE scheme.

It is well known that the above weak tag-based CCA notion can be generically enhanced to full-fledged IND-CCA security using a one-time signature scheme [17]. This allows us to conclude Theorem 1.

Efficiency. Let \(\ell \) be the length of the messages to be encrypted, and denote by n, q and m the parameters of the Subset Sum problem. The secret key of our PKE scheme consists of a binary matrix of dimension \(n\times m\); the public key consists of 3 matrices of elements in \(\mathbb Z_q\), with dimensions (respectively) \(m\times n\), \(n\times n\), and \(\ell \times n\). A ciphertext consists of 3 vectors of elements in \(\mathbb Z_q\), with dimensions (respectively) m, n, and \(\ell \).

1.2 Related Work

Pioneered by Merkle and Hellman [23], the first construction of PKE schemes based on Subset Sum were based on instances of the problem with special structure. All these constructions have been subsequently broken. (See [26] for a survey.)

In a seminal paper, Impagliazzo and Naor [16] presented constructions of universal one-way hash functions, pseudorandom generators and bit commitment schemes based on the hardness of random instances of Subset Sum.

Besides constructing PKE schemes, [22] additionally presents an oblivious transfer protocol with security against malicious senders and semi-honest receivers. The Subset Sum problem has also recently been used to solve the problem of outsourced pattern matching [11] in the cloud setting.

2 Preliminaries

For two distributions \(\mathcal {D}\) and \(\mathcal {D}'\) over \(\Omega \), \(\mathcal {D}(x)\) is the probability assigned to \(x\in \Omega \) and \(\Delta [\mathcal {D},\mathcal {D}']:=\frac{1}{2}\sum _{x\in \Omega }|\mathcal {D}(x)-\mathcal {D}'(x)|\) is the statistical distance between \(\mathcal {D}\) and \(\mathcal {D}'\). We denote with \(x\leftarrow X\) that x is sampled according to the distribution X. If X is a set, then this denotes that x is sampled uniformly at random from X. \(\lfloor \cdot \rceil _2:{\mathbb Z}_q\rightarrow {\mathbb Z}_2\) is the rounding function defined by \(\lfloor x \rceil _2:=\lfloor x\cdot \frac{2}{q} \rceil \).

Vectors and matrices are denoted in boldface. For two vectors \(\mathbf {u}\),\(\mathbf {v}\), with \(\mathbf {u} = (u_1,\ldots ,u_n)\) and \(\mathbf {v} = (v_1,\ldots ,v_n)\), the inner product between \(\mathbf {u}\) and \(\mathbf {v}\) is defined as \(\langle \mathbf {u},\mathbf {v}\rangle := \sum _{i=1}^n u_i\cdot v_i\). We represent elements in \(\mathbb Z_q\) by integers in the range \([-(q - 1)/2;(q - 1)/2]\). For an element \(v\in \mathbb Z_q\), its length, denoted by |v| is the absolute value of its representative in the range \([-(q - 1)/2;(q - 1)/2]\). For a vector \(\mathbf {v} = (v_1,\ldots ,v_n) \in \mathbb Z_q^n\) , we define \(\Vert \mathbf {v}\Vert _\infty := \max _{1\le i \le n}|v_i|\).

We say that a function \(\nu \) is negligible in the security parameter n, if it is asymptotically smaller than the inverse of any polynomial in n, i.e. \(\nu (n) = n^{-\omega (1)}\). An algorithm \(\mathsf {A}\) is probabilistic polynomial-time (PPT) if \(\mathsf {A}\) is randomized, and for any input \(x,r\in \{0,1\}^*\) the computation of \(\mathsf {A}(x;r)\) (i.e., \(\mathsf {A}\) with input x and random coins r) terminates in at most \(\mathrm {poly}(|x|)\) steps.

2.1 Subset Sum

Traditionally a Subset Sum \(\mathrm {SS}(n,\mu )\) instance is defined as \(\mathbf {a}:= (a_1,\ldots ,a_n)\) and a target \(T := \langle \mathbf {a},\mathbf {s}\rangle \) mod \({\mu }\), where the goal is to recover \(\mathbf {s}\in \{0,1\}^n\). For a modulus \(\mu =q^m\), Lyubashevsky, Palacio, and Segev [22] gave an alternative description which shows its similarities with the LWE problem more clearly. First they define matrix \(\mathbf {A}\in {\mathbb Z}_q^{m\times n}\), where \(a_{j,i}:=\big \lfloor \frac{a_i}{q^{j-1}}\big \rfloor \) mod q. Thus,
$$ \mathbf {A}\odot \mathbf {s}:=\sum _{i=1}^n s_i\cdot \left( \sum _{j=1}^m a_{j,i}q^{j-1}\right) \mathrm {mod}\,q^m =\sum _{i=1}^n s_i\cdot a_i \mathrm {mod}\, q^m=\langle \mathbf {a},\mathbf {s}\rangle \mathrm {mod}\,q^m $$
where \(\mathbf {s}\in \{0,1\}^n\). Notice that when \(\mathbf {A}\mathbf {s}\) is the matrix vector multiplication mod q, then \(\mathbf {A}\odot \mathbf {s}=\mathbf {A}\mathbf {s}+e(\mathbf {A},\mathbf {s})\) mod \(q \in \mathbb Z^{m}_q\). Here \(e(\mathbf {A},\mathbf {s})_1:=0\), and for \(1<j\le m\) the j-th component of \(e(\mathbf {A},\mathbf {s})\) is given by
$$ e(\mathbf {A},\mathbf {s})_j:= \left\lfloor \frac{\sum _{i=1}^{n} s_ia_{j-1,i} }{q}\right\rfloor +c_j\mod q, $$
for carry \(c_j\) which is recursively defined by \(c_2:=0\) and
$$ c_{j}:=\left\lfloor \frac{(\sum _{i=1}^{n} s_ia_{j-1,i}) \mathrm {mod}\,q +e(\mathbf {A},\mathbf {s})_{j-1}}{q}\right\rfloor \mod q. $$
Since \(c_j\) is small, and moreover it is the only part of \(e(\mathbf {A},\mathbf {s})_j\) which depends on \(e(\mathbf {A},\mathbf {s})_{j-1}\), one has that \(e(\mathbf {A},\mathbf {s})_j-c_j\) is bound by the Hoeffding bound. This implies an overall bound on \(e(\mathbf {A},\mathbf {s})_j\):

Lemma 1

([22] Lemma 3.3). For any \(n,m\in \mathbb N\) and \(\mathbf {s}\in \{0,1\}^n\), there exists a negligible function \(\nu :\mathbb N\rightarrow [0,1]\) such that
$${\mathop {\Pr }\limits _{\mathbf {A}\leftarrow \mathbb Z_q^{m\times n}}[\Vert e(\mathbf {A},\mathbf {s})\Vert _\infty \ge \sqrt{n}\log n] \le \nu (n).} $$

The main difference between Subset Sum and LWE is that error term \(e(\mathbf {A},\mathbf {s})\) is uniquely determined given \(\mathbf {A}\) and \(\mathbf {s}\) where as in case of LWE, error e is sampled from a discrete Gaussian distribution independent of \(\mathbf {A}\), \(\mathbf {s}\).

The Subset Sum assumption. A \(\mathrm {SS}(n, q^m)\) instance has the following distribution:
$$ \mathcal {D}_{\mathrm {SS}(n, q^m)} := \{(\mathbf {A}, \mathbf {A}\odot \mathbf {s}) \mid \mathbf {A}\leftarrow \mathbb Z_q^{m\times n}\text {, }\mathbf {s}\leftarrow \{0,1\}^n\}. $$
The challenge is to distinguish \(\mathcal {D}_{\mathrm {SS}(n, q^m)}\) from a uniform \((\mathbf {A}, \mathbf {b}) \in \mathbb Z^{m\times n}_q \times \mathbb Z^m_q\). The advantage of an algorithm \(\mathsf {A}\) in breaking the \(\mathrm {SS}(n, q^m)\) assumption is
$$ \mathbf {Adv}_{\mathrm {SS}(n, q^m)}(\mathsf {\mathsf {A}})=|\Pr [ \mathsf {A}(\mathbf {A}, \mathbf {b})=1]-\Pr [\mathsf {A}(\mathbf {A}', \mathbf {b}')=1]|, $$
where \((\mathbf {A}, \mathbf {b}) \leftarrow \mathcal {D}_{\mathrm {SS}(n, q^m)}\) and \((\mathbf {A}', \mathbf {b}') \leftarrow \mathbb Z^{m\times n}_q \times \mathbb Z^m_q\). It was shown by Impagliazzo and Naor [16] that this decisional version of Subset Sum is as hard as recovering the hidden vector \(\mathbf {s}\).

Re-randomizing Subset Sum. We use a technique introduced by Lyubashevsky [21] allowing to re-randomize a Subset Sum sample. This technique is based on the leftover hash lemma:

Lemma 2

(Leftover hash lemma). For \(2m\ge n+1+\omega (\log {n+1})/\log {q}\) and polynomial \(\ell \), there exists a negligible function \(\nu :\mathbb N \rightarrow [0,1]\) such that the statistical distance
$$ \Delta \big [(\mathbf {A},\mathbf {R}\mathbf {A},\mathbf {a},\mathbf {R}\mathbf {a}),(\mathbf {A},\mathbf {C},\mathbf {a},\mathbf {c}) \big ] \le \nu (n), $$
for \(\mathbf {A}\leftarrow \mathbb Z^{m\times n}_q\), \(\mathbf {R}\leftarrow [-\sqrt{q}/2,\sqrt{q}/2]^{\ell \times m}\), \(\mathbf {a}\leftarrow \mathbb Z^n_q\), \(\mathbf {C}\leftarrow \mathbb Z^{\ell \times n}_q\), \(\mathbf {c}\leftarrow \mathbb Z^\ell _q\).

A Subset Sum sample \((\mathbf {A}, \mathbf {b})\leftarrow \mathcal {D}_{\mathrm {SS}(n, q^m)}\) can now be re-randomized to \((\mathbf {R}\mathbf {A}, \mathbf {R}\mathbf {b})\) where \(\mathbf {R}\mathbf {A}\) is statistically close to uniform given \(\mathbf {A}, \mathbf {b}\) and \(\mathbf {R}\mathbf {b}\). Note that \((\mathbf {R}\mathbf {A}, \mathbf {R}\mathbf {b})\) is not \(\mathrm {SS}(n, q^m)\)-distributed anymore.

Given this re-randomization technique, we are able to construct a tag-based trapdoor function [24] and a PKE scheme whose hardness is independent of the amount of simultaneously encrypted bits. Of major significance is the fact that, after re-randomization, the noise is still bounded:

Lemma 3

([22] Lemma 3.4). For any \(n,m\in \mathbb N\), \(\mathbf {s}\in \{0,1\}^n\) and \(\mathbf {r}\in [-\sqrt{q}/2, \sqrt{q}/2]^m\), there exists a negligible function \(\nu :\mathbb N\rightarrow [0,1]\) such that
$$ {\mathop {\Pr }\limits _{\mathbf {A}\leftarrow \mathbb Z_q^{m\times n}}[\mathbf {r}\cdot e(\mathbf {A},\mathbf {s})\ge \sqrt{qnm}\log ^2 n+\sqrt{q}m] \le \nu (n).} $$

This bound will be crucial to show the correctness of our proposed PKE.

2.2 Tag-Based Encryption

The main motivation behind the concept of tag-based encryption (TBE) comes from the fact that it is possible to transform an identity-based encryption scheme into an IND-CCA secure PKE scheme [4, 5]. Kiltz [17] showed that these transformations already work starting from TBE.

A TBE scheme with tag-space \(\mathcal {T}\), message-space \(\mathcal {M}\), and security parameter n, consists of the following three PPT algorithms \(\mathrm {TBE}=(\mathsf {Gen}, \mathsf {Enc}, \mathsf {Dec})\).

  • \(\mathsf {Gen}(1^n)\): Outputs a secret key \( sk \) and a public key \( pk \).

  • \(\mathsf {Enc}( pk , \tau , M )\): Outputs a ciphertext C for \( M \in \mathcal {M}\), and tag \(\tau \) \(\in \) \(\mathcal {T}\).

  • \(\mathsf {Dec}( sk , \tau , C)\): Outputs the decrypted message \( M \) of ciphertext C with respect to tag \(\tau \in \mathcal {T}\), or an invalid symbol \(\bot \).

For correctness, we require that for any \(\tau , M \) and \(( sk , pk )\leftarrow \mathsf {Gen}(1^n)\):
$$\mathsf {Dec}( sk ,\tau ,\mathsf {Enc}( pk ,\tau , M ))= M $$
holds with overwhelming probability. As for security, we define the following selective-tag weak CCA game \(\mathrm {G}_{\mathrm {TBE}}\) [17]:
  1. 1.

    Adversary \(\mathsf {A}\) picks a tag \(\tau ^*\in \mathcal {T}\).

  2. 2.

    Run \(( sk , pk )\leftarrow \mathsf {Gen}(1^n)\). Adversary \(\mathsf {A}\) receives public key \( pk \) and gets permanent access to an oracle which outputs \(\mathsf {Dec}( sk , \tau , C)\) upon input requests of the form \(\mathsf {QueryDec}(C,\tau )\) for all \(\tau \ne \tau ^*\), and \(\perp \) otherwise.

  3. 3.

    \(\mathsf {A}\) chooses \( M _0\) and \( M _1\) from \(\mathcal {M}\) and receives \(C\leftarrow \mathsf {Enc}( pk ,\tau ^*, M _u)\) for \(u\leftarrow \{0,1\}\).

  4. 4.

    Finally \(\mathsf {A}\) outputs \(u'\) and \(\mathrm {G}_{\mathrm {TBE}}\) outputs 1 iff \(u'=u\).

The advantage of an adversary \(\mathsf {A}\) in game \(\mathrm {G}_{\mathrm {TBE}}\) is defined as
$$ \mathbf {Adv}_{\mathrm {TBE}}(\mathsf {A}) := \left| \Pr [\mathrm {G}_{\mathrm {TBE}}(\mathsf {A})=1]-\frac{1}{2}\right| , $$
and a \(\mathrm {TBE}\) scheme is called secure against selective-tag weak CCA adversaries, if for all PPT \(\mathsf {A}\) there exists a negligible function \(\nu :\mathbb N\rightarrow [0,1]\) such that \(\mathbf {Adv}_{\mathrm {TBE}}(\mathsf {A}) \le \nu (n)\).

Given an exponential tag-space, there is a transformation from a \(\mathrm {TBE}\) scheme satisfying the above notion to an IND-CCA secure PKE; the transformation requires a one-time signature scheme or a message authentication code plus a commitment [17].

We embed the tags in our proposed \(\mathrm {TBE}\) using a full-rank differences (FRD) encoding \(\mathcal H\) [1, 7]. This means that \(\mathcal H: \mathbb Z_2^n\rightarrow \mathbb Z_2^{n\times n}\), \(\tau \mapsto \mathbf H_\tau \) and \(\forall \tau \ne \tau ' \in \mathbb Z_2^n\) \(\mathbf H_\tau -\mathbf H_{\tau '}\) has full rank.

3 A Subset Sum Based TBE

For security parameter n, let \(q=\varTheta (n^2\log ^6 n)\), \(2\mid q\), and \(m=\varTheta (n)\) for appropriate constant factors. The following three algorithms describe our \(\mathrm {TBE}=(\mathsf {Gen},\mathsf {Enc},\mathsf {Dec})\) based on \(\mathrm {SS}(n, q^m)\) with tag space \(\mathcal {T}:={\mathbb Z}^n_2\setminus \{\mathbf {0}\}\) (where \(\mathbf {0}\) is the all-zero vector of length n) and message space \(\mathcal {M}:=\{0,1\}^{\ell }\).

  • \(\mathsf {Gen}(1^n)\): Sample \(\mathbf {R}\leftarrow \{0,1\}^{n\times m}\) and \(\mathbf {A}\leftarrow \mathbb Z^{m\times n}_q\), \(\mathbf {C}\leftarrow \mathbb Z^{\ell \times n}_q\). Define \(\mathbf {B}:=\mathbf {R}\mathbf {A}\). The private and public key are defined as
    $$ sk :=\mathbf {R}, \quad pk :=(\mathbf {A}, \mathbf {B}, \mathbf {C}) \in {\mathbb Z}_q^{m \times n}\times {\mathbb Z}_q^{n\times n} \times {\mathbb Z}_q^{\ell \times n}.$$
  • \(\mathsf {Enc}( pk , \tau , M )\): Pick \(\mathbf {R}'\leftarrow [-\sqrt{q}/2,\sqrt{q}/2]^{n\times m}\), \(\mathbf {R}''\leftarrow [-\sqrt{q}/2,\sqrt{q}/2]^{\ell \times m}\), \(\mathbf {s}\leftarrow \{0,1\}^n\) and define
    $$\begin{aligned} \mathbf {c}_0&:= \mathbf {A}\mathbf {s}+e(\mathbf {A},\mathbf {s})&\in {\mathbb Z}_q^m \\ \mathbf {c}_1&:= \left( \mathbf {B}+\frac{q}{2}\cdot \mathbf H_{\tau }\right) \mathbf {s}+\mathbf {R}'\cdot e(\mathbf {A},\mathbf {s})&\in {\mathbb Z}_q^n\\ \mathbf {c}_2&:= \mathbf {C}\mathbf {s} +\mathbf {R}''\cdot e(\mathbf {A},\mathbf {s})+\frac{q}{2}\cdot M&\in {\mathbb Z}_q^\ell \end{aligned}$$
    where \(\mathbf H_{\tau }\) is the matrix representation of \(\tau \).
  • \(\mathsf {Dec}( sk , \tau , C)\): Compute
    $$\begin{aligned} \hat{\mathbf {s}} :=&\left\lfloor \begin{pmatrix} \mathbf {R}&\mathbf {I} \end{pmatrix} \cdot \begin{pmatrix} -\mathbf {c}_0 \\ \mathbf {c}_{1} \end{pmatrix}\right\rceil _2. \end{aligned}$$
    and \(\mathbf {s}= \mathbf H_{\tau }^{-1}\hat{\mathbf {s}}\). If \(\mathbf {c}_0\ne \mathbf {A}\odot \mathbf {s}\) or \(\Vert \mathbf {c}_1-\left( \mathbf {B}+\frac{q}{2}\cdot \mathbf H_{\tau }\right) \mathbf {s}\Vert _\infty \ge \frac{q}{4}\) output \(\perp \). Otherwise output message \( M =\lfloor \mathbf {c}_2-\mathbf {C}\mathbf {s}\rceil _2\).

3.1 Correctness

The correctness of the scheme follows basically from the bounds on the noise of re-randomized Subset Sum instances. Given these bounds, the noise will be smaller than q / 4 such that it will be rounded away by the rounding function \(\lfloor \cdot \rceil _2\).

Theorem 2

(Correctness). Let \(q=O(n^2\log ^6 n)\), \(2\mid q\), \(m=\varTheta (n)\), and \(\ell \in O(n^c)\) for some constant c. Then for any \(\tau \in \mathbb Z_2^n\), \( M \in \{0,1\}^\ell \), there exists a negligible function \(\nu :\mathbb N\rightarrow [0,1]\) such that
$$ {\mathop {\Pr }\limits _{( sk , pk )\leftarrow \mathsf {Gen}(1^n)}[\mathsf {Dec}( sk ,\tau ,\mathsf {Enc}( pk ,\tau , M ))\ne M ] \le \nu (n).} $$


Given a ciphertext \(C=(\mathbf {c}_0,\mathbf {c}_1,\mathbf {c}_2)\), in case \(\mathsf {Dec}\) successfully reconstruct \(\mathbf {s}\), the decryption algorithm computes
$$ \lfloor \mathbf {c}_2-\mathbf {C}\mathbf {s}\rceil _2=\left\lfloor \mathbf {R}''\cdot e(\mathbf {A},\mathbf {s})+\frac{q}{2}\cdot M \right\rceil _2=\lfloor \mathbf {R}''\cdot e(\mathbf {A},\mathbf {s})\rceil _2+ M .$$
By Lemma 3, \(\Vert \mathbf {R}''\cdot e(\mathbf {A},\mathbf {s})\Vert _\infty \le \sqrt{qnm}\log ^2 n+\sqrt{q}m<q/4\) with overwhelming probability over \(\mathbf {A}\leftarrow \mathbb Z^{m\times n}_q\) (for appropriately chosen constants). Hence \(\lfloor \mathbf {R}''\cdot e(\mathbf {A},\mathbf {s})\rceil _2=\mathbf {0}\) and \(\mathsf {Dec}\) outputs \( M \).
For the same reason \(\lfloor (\mathbf {R}'-\mathbf {R})\cdot e(\mathbf {A},\mathbf {s})\rceil _2=\mathbf {0}\) holds with overwhelming probability over \(\mathbf {A}\leftarrow \mathbb Z^{m\times n}_q\) (for appropriately chosen constants). Therefore \(\mathsf {Dec}\) reconstructs
$$ \left\lfloor \begin{pmatrix} \mathbf {R}&\mathbf {I} \end{pmatrix} \cdot \begin{pmatrix} -\mathbf {c}_0 \\ \mathbf {c}_{1} \end{pmatrix}\right\rceil _2= \left\lfloor \frac{q}{2}\cdot \mathbf H_{\tau }\mathbf {s} + (\mathbf {R}'-\mathbf {R})e(\mathbf {A},\mathbf {s})\right\rceil _2= \left\lfloor \frac{q}{2}\cdot \mathbf H_{\tau }\mathbf {s}\right\rceil _2. $$
The coordinates of \(\mathbf H_{\tau }\) are in \({\mathbb Z}_2\), and as \(2\mid q\), we get \(\hat{\mathbf {s}}=\left\lfloor \frac{q}{2}\cdot \mathbf H_{\tau } \mathbf {s}\mod q\right\rceil _2=\mathbf H_{\tau }\mathbf {s}\mod 2\). This results in the correct reconstruction of \(\mathbf {s}=\mathbf H_{\tau }^{-1}\hat{\mathbf {s}}=\mathbf H_{\tau }^{-1}\mathbf H_{\tau }\mathbf {s}\).

3.2 Proof of Security

The intuition behind the security proof is that \(\mathbf {B}=\mathbf {R}\mathbf {A}\) is statistically indistinguishable from \(\mathbf {B}'=\mathbf {R}\mathbf {A}-\frac{q}{2}\mathbf H_{\tau ^*}\). But when \(\mathbf {B}'\) is used as part of the public key, ciphertexts with tag \(\tau ^*\) can not be decrypted using \(\mathsf {Dec}\) anymore. During the proof, we will show that there are ciphertexts for \(\tau ^*\) which are at least as hard to decrypt as solving \(\mathrm {SS}(n, q^m)\). Given any algorithm guessing the message encrypted in such a ciphertext and therefore breaking the security of \(\mathrm {TBE}\), there will be also an algorithm solving \(\mathrm {SS}(n, q^m)\).

Theorem 3

(CCA Security). Let \(q=\varTheta (n^2\log ^6 n)\), \(2\mid q\), and \(m=\varTheta (n)\) for appropriate constant factors. If the \(\mathrm {SS}(n, q^m)\) assumption holds (which corresponds to density \(\delta \in O(1/\log n)\)), then the proposed \(\mathrm {TBE}\) scheme is secure against \(\mathrm {selective}\)-\(\mathrm {tag}\) \(\mathrm {weak}\) \(\mathrm {CCA}\) adversaries. In particular, for every PPT algorithm \(\mathsf {A}\) there exist a PPT algorithm \(\mathsf {D}\) and a negligible function \(\nu :\mathbb N \rightarrow [0,1]\) such that:
$$ \mathbf {Adv}_{\mathrm {TBE}}(\mathsf {A})\le \mathbf {Adv}_{\mathrm {SS}(n, q^m)}(\mathsf {D}) + \nu (n). $$


We construct an algorithm \(\mathsf {D}\) which will distinguish \(\mathrm {SS}(n, q^m)\) from uniform invoking a successful adversary \(\mathsf {A}\) in game \(\mathrm {G}_{\mathrm {TBA}}\). If \(\mathsf {D}\) receives a \(\mathrm {SS}(n, q^m)\) instance \(\mathsf {D}\) will simulate game \(\mathrm {G}_{\mathrm {TBA}}\) and a successful \(\mathsf {A}\) will guess b correctly with probability \(\frac{1}{2} + \mathbf {Adv}_{\mathrm {TBE}}(\mathsf {A}) > \frac{1}{2}+\nu (n)\). When \(\mathsf {D}\) receives a uniform input, \(\mathsf {D}\) will simulate a game in which the challenge ciphertext is independent of message \( M _u\), and hence independent of u. Therefore guess \(u'\) of \(\mathsf {A}\) will be correct (i.e., \(u'=u\)) with probability \(\frac{1}{2}\).

In the following, we describe algorithm \(\mathsf {D}\) interacting with \(\mathsf {A}\) and afterwards we analyse its success probability.

  1. 1.

    \(\mathsf {D}\) receives a \(\mathrm {SS}(n, q^m)\) challenge \((\mathbf {A},\mathbf {b})\) and invokes \(\mathsf {A}\) which will send a tag \(\tau ^*\in \mathcal {T}\).

  2. 2.

    \(\mathsf {D}\) samples \(\mathbf {R}'\leftarrow [-\sqrt{q}/2,\sqrt{q}/2]^{n\times m}\), \(\mathbf {R}''\leftarrow [-\sqrt{q}/2,\sqrt{q}/2]^{\ell \times m}\) and sets \( pk =(\mathbf {A},\mathbf {B}:=\mathbf {R}'\mathbf {A}-\frac{q}{2}\mathbf H_{\tau ^*},\mathbf {C}:=\mathbf {R}''\mathbf {A})\) which is by Lemma 2 statistically close to the output distribution of public keys of \(\mathsf {Gen}\). The public key \( pk \) is given to \(\mathsf {A}\).

    Thus, \(\mathsf {D}\) uses \(\mathbf {R}'\) to respond to \(\mathsf {QueryDec}(C,\tau )\) queries as follows: If \(\tau =\tau ^*\) output \(\perp \). Otherwise \(\mathsf {D}\) uses \(\mathsf {Dec}(\mathbf {R}',\tau , C)\) to reconstruct:
    $$ \hat{\mathbf {s}}:=\left\lfloor \begin{pmatrix} \mathbf {R}'&\mathbf {I} \end{pmatrix} \cdot \begin{pmatrix} -\mathbf {c}_0 \\ \mathbf {c}_{1} \end{pmatrix}\right\rceil _2. $$
    For a properly distributed C,
    $$\begin{aligned} \hat{\mathbf {s}}=&\left\lfloor \left( \frac{q}{2}\cdot \mathbf H_{\tau }-\frac{q}{2}\mathbf H_{\tau ^*}\mod q\right) \mathbf {s}\right\rceil _2&\\ =&\left\lfloor \left( \frac{q}{2} \cdot (\mathbf H_{\tau }-\mathbf H_{\tau ^*}\mod 2)\right) \mathbf {s}\right\rceil _2&=(\mathbf H_{\tau }-\mathbf H_{\tau ^*})\mathbf {s}. \end{aligned}$$
    \(\mathbf {s}\) is reconstructed by computing \(\mathbf {s}=(\mathbf H_{\tau }-\mathbf H_{\tau ^*})^{-1}\hat{\mathbf {s}}\). If \(\mathbf {c}_0\ne \mathbf {A}\odot \mathbf {s}\) or \(\Vert \mathbf {c}_1-\left( \mathbf {B}+\frac{q}{2}\cdot \mathbf H_{\tau }\right) \mathbf {s}\Vert _\infty \ge \frac{q}{4}\) output \(\perp \). This ensures that the output of \(\mathsf {QueryDec}(C,\tau )\) is independent of \(\mathbf {R}'\) conditioned on \(\mathbf {B}=\mathbf {R}'\mathbf {A}-\frac{q}{2}\mathbf H_{\tau ^*}\) and C is a proper cipertext for randomness \(\mathbf {s}\). \(\mathsf {D}\) follows now the description of \(\mathsf {Dec}(\mathbf {R}',\tau , C)\) such that by Theorem 2 for all properly generated C and \(\tau \ne \tau ^*\) \(\mathsf {QueryDec}(C,\tau )\) outputs the correct message \( M \).
  3. 3.

    \(\mathsf {A}\) sends \( M _0\) and \( M _1\). Now \(\mathsf {D}\) samples \(u\leftarrow \{0,1\}\), sets \(C^*:=(\mathbf {b},\mathbf {R}'\mathbf {b},\mathbf {R}''\mathbf {b}+\frac{q}{2} M _u)\), and sends \(C^*\) to \(\mathsf {A}\).

  4. 4.

    Finally \(\mathsf {A}\) outputs \(u'\) and \(\mathsf {D}\) outputs 1 iff \(u'=u\).

When \(\mathbf {b}=\mathbf {A}\mathbf {s}+e(\mathbf {A},\mathbf {s})\), the challenge ciphertext \(C^*\) is a proper ciphertext for public key \( pk \) and randomness \(\mathbf {s}\):
$$\begin{aligned} \mathbf {c}_0&:=\mathbf {b}= \mathbf {A}\mathbf {s}+e(\mathbf {A},\mathbf {s}) \\ \mathbf {c}_1&:=\mathbf {R}'\mathbf {b}= \mathbf {R}'\mathbf {A}\mathbf {s}+\mathbf {R}' e(\mathbf {A},\mathbf {s})= \left( \mathbf {B}+\frac{q}{2}\cdot \mathbf H_{\tau }\right) \mathbf {s}+\mathbf {R}' e(\mathbf {A},\mathbf {s}) \\ \mathbf {c}_2&:=\mathbf {R}''\mathbf {b}+\frac{q}{2} M _u= \mathbf {R}''\mathbf {A}\mathbf {s} +\mathbf {R}'' e(\mathbf {A},\mathbf {s})+\frac{q}{2} M _u= \mathbf {C}\mathbf {s} +\mathbf {R}'' e(\mathbf {A},\mathbf {s})+\frac{q}{2} M _u. \end{aligned}$$
Note that, by Lemma 2, there is enough entropy in \(\mathbf {R}'\), \(\mathbf {R}''\) such that \(\mathbf {B}\), \(\mathbf {R}' e(\mathbf {A},\mathbf {s})\) and \(\mathbf {C}\), \(\mathbf {R}'' e(\mathbf {A},\mathbf {s})\) are independent. In this case \(\mathsf {B}\) outputs 1 with roughly probability \(\frac{1}{2}+\mathbf {Adv}_{\mathrm {TBE}}(\mathsf {A})\).
In the other case, i.e. when \(\mathbf {A},\mathbf {b}\leftarrow {\mathbb Z}_q^{m\times n}\times {\mathbb Z}_q^{m}\), we know that \(\mathbf {c}_2:=\mathbf {R}''\mathbf {b}+\frac{q}{2} M _u\) is uniform and independent of \(\mathbf {A},\mathbf {C},\mathbf {c}_0\) and \(\mathbf {c}_1\) by Lemma 2. Therefore \(C^*\) is independent of u and for any output \(u'\) of \(\mathsf {A}\):
$$ {\mathop {\Pr }\limits _{u\leftarrow \{0,1\}}[u=u']=\frac{1}{2}.} $$
Summing up, \(\mathsf {D}\) outputs 1 for a \(\mathrm {SS}(n, q^m)\) instance with roughly probability \(\frac{1}{2}+\mathbf {Adv}_{\mathrm {TBE}}(\mathsf {A})\), and it outputs 1 otherwise with probability \(\frac{1}{2}\). This implies
$$\mathbf {Adv}_{\mathrm {SS}(n, q^m)}(\mathsf {D})=\mathbf {Adv}_{\mathrm {TBE}}(\mathsf {A})-\nu (n),$$
for a negligible function \(\nu \), concluding the proof.

4 Conclusions and Open Problems

We presented a construction of a new PKE scheme with a simple and direct security proof based on the hardness of random instances of the Subset Sum problem. Our scheme achieves IND-CCA security and its concrete security does not depend on the length of the messages being encrypted. This resolves the main open problems from the previous work by Lyubashevsky, Palacio, and Segev [22].

Similarly to one of the constructions in [22], it is not hard to see that actually our PKE scheme achieves the stronger notion of IND-CCA security against non-adaptive leakage attacks.4 We leave it as an open problem to construct a PKE scheme with IND-CCA security against fully adaptive leakage attacks. An approach towards answering this question would be to construct a hash proof system [9] based on Subset Sum, as this would directly yield a leakage-resilient IND-CCA secure PKE [25].

It would also be interesting to construct PKE schemes with additional properties (always based on Subset Sum), such as circular security, key-dependent message security, and security against related-key attacks.


  1. 1.

    Clearly, the decryption oracle cannot be queried on the challenge ciphertext.

  2. 2.

    In particular, for message length \(n^2\) the scheme of [22] can be broken in polynomial time.

  3. 3.

    Decryption queries for the challenge tag \(\tau ^*\) are disallowed.

  4. 4.

    Since the latter notion is a very weak form of leakage resilience, we preferred to not work out the details.


  1. 1.
    Agrawal, S., Boneh, D., Boyen, X.: Efficient lattice (H)IBE in the standard model. In: Gilbert, H. (ed.) EUROCRYPT 2010. LNCS, vol. 6110, pp. 553–572. Springer, Heidelberg (2010)CrossRefGoogle Scholar
  2. 2.
    Ajtai, M., Dwork, C.: A public-key cryptosystem with worst-case/average-case equivalence. In: ACM STOC, pp. 284–293 (1997)Google Scholar
  3. 3.
    Bernstein, D.J., Jeffery, S., Lange, T., Meurer, A.: Quantum algorithms for the subset-sum problem. In: Gaborit, P. (ed.) PQCrypto 2013. LNCS, vol. 7932, pp. 16–33. Springer, Heidelberg (2013)CrossRefGoogle Scholar
  4. 4.
    Boneh, D., Canetti, R., Halevi, S., Katz, J.: Chosen-ciphertext security from identity-based encryption. SIAM J. Comput. 36(5), 1301–1328 (2007)MathSciNetCrossRefGoogle Scholar
  5. 5.
    Canetti, R., Halevi, S., Katz, J.: Chosen-ciphertext security from identity-based encryption. In: Cachin, C., Camenisch, J.L. (eds.) EUROCRYPT 2004. LNCS, vol. 3027, pp. 207–222. Springer, Heidelberg (2004)CrossRefGoogle Scholar
  6. 6.
    Cash, D., Kiltz, E., Shoup, V.: The twin Diffie-Hellman problem and applications. J. Cryptol. 22(4), 470–504 (2009)MathSciNetCrossRefGoogle Scholar
  7. 7.
    Cramer, R., Damgård, I.: On the amortized complexity of zero-knowledge protocols. In: Halevi, S. (ed.) CRYPTO 2009. LNCS, vol. 5677, pp. 177–191. Springer, Heidelberg (2009)CrossRefGoogle Scholar
  8. 8.
    Cramer, R., Shoup, V.: A practical public key cryptosystem provably secure against adaptive chosen ciphertext attack. In: Krawczyk, H. (ed.) CRYPTO 1998. LNCS, vol. 1462, pp. 13–25. Springer, Heidelberg (1998)Google Scholar
  9. 9.
    Cramer, R., Shoup, V.: Universal hash proofs and a paradigm for adaptive chosen ciphertext secure public-key encryption. In: Knudsen, L.R. (ed.) EUROCRYPT 2002. LNCS, vol. 2332, pp. 45–64. Springer, Heidelberg (2002)CrossRefGoogle Scholar
  10. 10.
    Diffie, W., Hellman, M.E.: New directions in cryptography. IEEE Trans. Inf. Theor. 22(6), 644–654 (1976)MathSciNetCrossRefGoogle Scholar
  11. 11.
    Faust, S., Hazay, C., Venturi, D.: Outsourced pattern matching. In: Fomin, F.V., Freivalds, R., Kwiatkowska, M., Peleg, D. (eds.) ICALP 2013, Part II. LNCS, vol. 7966, pp. 545–556. Springer, Heidelberg (2013)CrossRefGoogle Scholar
  12. 12.
    Flaxman, A.D., Przydatek, B.: Solving medium-density subset sum problems in expected polynomial time. In: Diekert, V., Durand, B. (eds.) STACS 2005. LNCS, vol. 3404, pp. 305–314. Springer, Heidelberg (2005)CrossRefGoogle Scholar
  13. 13.
    Frieze, A.M.: On the Lagarias-Odlyzko algorithm for the subset sum problem. SIAM J. Comput. 15(2), 536–539 (1986)MathSciNetCrossRefGoogle Scholar
  14. 14.
    Goldwasser, S., Micali, S.: Probabilistic encryption. J. Comput. Syst. Sci. 28(2), 270–299 (1984)MathSciNetCrossRefGoogle Scholar
  15. 15.
    Hofheinz, D., Kiltz, E., Shoup, V.: Practical chosen ciphertext secure encryption from factoring. J. Cryptol. 26(1), 102–118 (2013)MathSciNetCrossRefGoogle Scholar
  16. 16.
    Impagliazzo, R., Naor, M.: Efficient cryptographic schemes provably as secure as subset sum. J. Cryptol. 9(4), 199–216 (1996)MathSciNetCrossRefGoogle Scholar
  17. 17.
    Kiltz, E.: Chosen-ciphertext security from tag-based encryption. In: Halevi, S., Rabin, T. (eds.) TCC 2006. LNCS, vol. 3876, pp. 581–600. Springer, Heidelberg (2006)CrossRefGoogle Scholar
  18. 18.
    Kiltz, E., Masny, D., Pietrzak, K.: Simple chosen-ciphertext security from low-noise LPN. In: PKC, pp. 1–18. (2014)Google Scholar
  19. 19.
    Kirchner, P., Fouque, P.: An improved BKW algorithm for LWE with applications to cryptography and lattices. In: CRYPTO, pp. 43–62. (2015)Google Scholar
  20. 20.
    Lagarias, J.C., Odlyzko, A.M.: Solving low-density subset sum problems. J. ACM 32(1), 229–246 (1985)MathSciNetCrossRefGoogle Scholar
  21. 21.
    Lyubashevsky, V.: The parity problem in the presence of noise, decoding random linear codes, and the subset sum problem. In: Chekuri, C., Jansen, K., Rolim, J.D.P., Trevisan, L. (eds.) APPROX 2005 and RANDOM 2005. LNCS, vol. 3624, pp. 378–389. Springer, Heidelberg (2005)CrossRefGoogle Scholar
  22. 22.
    Lyubashevsky, V., Palacio, A., Segev, G.: Public-key cryptographic primitives provably as secure as subset sum. In: Micciancio, D. (ed.) TCC 2010. LNCS, vol. 5978, pp. 382–400. Springer, Heidelberg (2010)CrossRefGoogle Scholar
  23. 23.
    Merkle, R.C., Hellman, M.E.: Hiding information and signatures in trapdoor knapsacks. IEEE Trans. Inf. Theor. 24(5), 525–530 (1978)CrossRefGoogle Scholar
  24. 24.
    Micciancio, D., Peikert, C.: Trapdoors for lattices: simpler, tighter, faster, smaller. In: Pointcheval, D., Johansson, T. (eds.) EUROCRYPT 2012. LNCS, vol. 7237, pp. 700–718. Springer, Heidelberg (2012)CrossRefGoogle Scholar
  25. 25.
    Naor, M., Segev, G.: Public-key cryptosystems resilient to key leakage. SIAM J. Comput. 41(4), 772–814 (2012)MathSciNetCrossRefGoogle Scholar
  26. 26.
    Odlyzko, A.M.: The rise and fall of knapsack cryptosystems. In: Symposia of Applied Mathematics, pp. 75–88. (1990)Google Scholar
  27. 27.
    Peikert, C.: Public-key cryptosystems from the worst-case shortest vector problem. In: ACM STOC, pp. 333–342. (2009)Google Scholar
  28. 28.
    Peikert, C., Waters, B.: Lossy trapdoor functions and their applications. SIAM J. Comput. 40(6), 1803–1844 (2011)MathSciNetCrossRefGoogle Scholar
  29. 29.
    Rackoff, C., Simon, D.R.: Non-interactive zero-knowledge proof of knowledge and chosen ciphertext attack. In: Feigenbaum, J. (ed.) CRYPTO 1991. LNCS, vol. 576, pp. 433–444. Springer, Heidelberg (1992)Google Scholar
  30. 30.
    Regev, O.: New lattice based cryptographic constructions. In: ACM STOC, pp. 407–416. (2003)Google Scholar
  31. 31.
    Regev, O.: On lattices, learning with errors, random linear codes, and cryptography. J. ACM 56(6), 1–40 (2009)MathSciNetCrossRefGoogle Scholar
  32. 32.
    Shallue, A.: An improved multi-set algorithm for the dense subset sum problem. In: van der Poorten, A.J., Stein, A. (eds.) ANTS-VIII 2008. LNCS, vol. 5011, pp. 416–429. Springer, Heidelberg (2008)CrossRefGoogle Scholar

Copyright information

© International Association for Cryptologic Research 2016

Authors and Affiliations

  1. 1. Faculty of Mathematics, Horst-Görtz Institute for IT SecurityRuhr-Universität BochumBochumGermany
  2. 2.Department of Computer ScienceSapienza University of RomeRomeItaly

Personalised recommendations