Algebraic Restriction Codes and Their Applications

Consider the following problem: You have a device that is supposed to compute a linear combination of its inputs, which are taken from some finite field. However, the device may be faulty and compute arbitrary functions of its inputs. Is it possible to encode the inputs in such a way that only linear functions can be evaluated over the encodings? That is, learning an arbitrary function of the encodings should not reveal more information about the inputs than a linear combination. In this work, we introduce the notion of algebraic restriction codes (AR codes), which constrain adversaries who might compute any function to computing a linear function. Our main result is an information-theoretic construction of AR codes that restricts any class of functions with a bounded number of output bits to linear functions. Our construction relies on a seed which is not provided to the adversary. While interesting and natural on its own, we show an application of this notion in cryptography. In particular, we show that AR codes lead to the first construction of rate-1 oblivious transfer with statistical sender security from the Decisional Diffie–Hellman assumption, and the first-ever such construction that makes black-box use of cryptography. Previously, such protocols were known only from the LWE assumption, using non-black-box cryptographic techniques. We expect our new notion of AR codes to find further applications, e.g., in the context of non-malleability, in the future.


Introduction
In this work, we consider leakage problems of the following kind: Assume we have a device which takes an input x and is supposed to compute a function f(x) from a certain class of legitimate functions F. For concreteness, assume that the class F consists of functions computing linear combinations, e.g., f(x_1, x_2) = a_1 x_1 + a_2 x_2. However, the device might be faulty, and instead of computing f it might compute another function g. We want to find a way to encode x into an x̂ such that the following two properties hold:
• If the device correctly implements a linear function f, then we can efficiently decode the output ŷ to f(x).
• If, on the other hand, the device implements a non-linear function g, then the output g(x̂) does not reveal more information about x than f(x) for some linear function f.
First, note that this notion is trivially achievable if F includes the identity function, or in fact any invertible function, as in this case we can simulate g(x̂) from f(x) by first recovering x from f(x), encoding x into x̂, and finally evaluating g on x̂. For this reason, in this work we will focus on function classes F whose output length is smaller than their input length, such as the linear-combination functions mentioned above. In general, we will allow both the encoding and decoding procedures to depend on a secret seed, which is not given to the evaluating device/adversary.
It is worthwhile comparing the type of security this notion provides to tamper-resilient primitives such as non-malleable codes (NM codes) [1][2][3] and non-malleable extractors [4][5][6][7]. Such notions are geared towards prohibiting tampering altogether. Moreover, a central aspect of security for such notions is that the decoder tries to detect whether tampering has happened; indeed, the decoder plays a crucial role in modelling the security of non-malleable codes. In contrast, AR codes do and should allow manipulation by benign functions from the class F. Furthermore, we only require a decoder for correctness purposes, whereas security is defined independently of the decoder.
One motivation to study the above problem comes from cryptography, specifically secure computation, where this is, in fact, a natural scenario. Indeed, a typical blueprint for secure two-party computation [8] in two rounds proceeds as follows: One party, called the receiver, encrypts his input y under a homomorphic encryption scheme [9][10][11][12], obtaining a ciphertext c, and sends both the public key pk and the ciphertext c to the other party, called the sender. The sender, in possession of an input x, homomorphically performs a computation f on input x and ciphertext c, obtaining a ciphertext ĉ which encrypts f(x, y). The ciphertext ĉ is sent back to the receiver, who can then decrypt it to f(x, y).
For the case of a malicious receiver, the security of this blueprint breaks down completely: A malicious receiver can choose both the public key pk and the ciphertext c maliciously, i.e., they are generally not well-formed. Effectively, this means that the sender's homomorphic evaluation will result in some value f̃(x) (where f̃ will be specified by the receiver's protocol message) instead of an encryption of f(x, y). Critically, the value f̃(x) might reveal substantially more information about x than f(x, y) and compromise the sender's security.
Generally speaking, in this situation, there is no direct way for the sender to enforce which information about x the receiver obtains. A typical cryptographic solution for achieving malicious security involves using zero-knowledge proofs to enforce honest behavior for the receiver. This technique, however, is typically undesirable as it often leads to less efficient protocols (due to these tools using non-black-box techniques) and the need for several rounds of interaction or a trusted setup. We aim to upgrade such protocols to achieve security against malicious receivers without additional cryptographic machinery.
To see how algebraic restriction codes help in this scenario, consider the following. Upon receiving a public key pk and a ciphertext c from the receiver (who potentially generated them in a malicious way), the sender proceeds as follows. First, he encodes his own input x into x̂ using a suitable AR code with a fresh seed s. Next, the sender evaluates the function f(x̂, ·) homomorphically on the ciphertext c (which encrypts the receiver's input y), resulting in a ciphertext ĉ = Eval(pk, f(x̂, ·), c). For simplicity's sake, assume that the sender now sends ĉ and the seed s back to the receiver, who decrypts ĉ to ẑ = f(x̂, y) and uses the seed s to decode ẑ to his output z using the decoding algorithm of the AR code.
How can we argue that even a malicious receiver cannot learn more than the legitimate output z? Let us take a closer look at the computation which is actually performed on the encoding x̂. The output ciphertext ĉ is computed via ĉ = Eval(pk, f(x̂, ·), c). Thus, if we can ensure that the function g(x̂) = Eval(pk, f(x̂, ·), c) is in the class G which is restricted by the AR code, then security of the AR code guarantees that ĉ does not leak more than z = f(x, y) about x, irrespective of the choice of pk and c.

Our Results
In this work, we formalize the notion of algebraic restriction codes and provide constructions which restrict general function classes to linear functions over finite fields. Let G and F be two function classes. Roughly, a G-F AR code provides a way to encode an x in the domain of the functions in F into a codeword x̂ in the domain of the functions in G, in such a way that any function f ∈ F can still be evaluated on x̂ by evaluating a function f̂ ∈ G on x̂. Furthermore, given f̂(x̂) we can decode to f(x). Security-wise, we require that for any g ∈ G there exists a function f ∈ F such that g(x̂) can be simulated given only the legitimate output f(x). AR codes provide an information-theoretic interface to limit the capabilities of an unbounded adversary in protocols in which some weak restrictions (characterized by the class G) are already in place. In this way, AR codes allow us to harness simple structural restrictions of protocols to implement very strong security guarantees.
In this work we consider seeded AR codes, where both the encoding and decoding procedures of the AR code have access to a random seed s, which is not provided to the function g.
Our first construction of AR codes restricts general linear functions to linear combinations.
Theorem 1 (Formal: Theorem 4, Page 21) Let F_q be a finite field, let F be the class of functions F_q^k × F_q^k → F_q^k of the form (x, y) ↦ ax + y, and let G be the class of all linear functions F_q^n × F_q^n → F_q^n of the form (x, y) ↦ Ax + y. There exists a seeded AR code AR_1 which restricts G to F.
Our main contribution is a construction of seeded AR codes restricting arbitrary functions with bounded output length to linear combinations.
Theorem 2 (Formal: Theorem 5, Page 26) Let F_q be a finite field, let F be the class of functions F_q × F_q → F_q of the form (x, y) ↦ ax + by, and let G be the class of all functions F_q^n × F_q^n → {0, 1}^{1.5·n·log(q)}. There exists a seeded AR code AR_2 which restricts G to F.
We note that the constant 1.5 in the theorem is arbitrary and can in fact be replaced with any constant between 1 and 2.
The main ingredient of this construction is the following theorem, which may be of independent interest and which we discuss in greater detail. The theorem exhibits a new correlation-breaking property of the inner-product extractor.
In essence, it states that for a suitable parameter choice, if x_1, …, x_t are uniformly random vectors in a finite vector space and s is a random seed (in the same vector space), then anything that can be inferred about ⟨x_1, s⟩, …, ⟨x_t, s⟩ via a joint leak f(x_1, …, x_t) of bounded length can also be inferred from a linear combination ⟨Σ_i a_i x_i, s⟩; i.e., f(x_1, …, x_t) does not leak more than ⟨Σ_i a_i x_i, s⟩.
Theorem 3 (Formal: Theorem 5, Page 26) Let q be a prime power, let t, s be positive integers, let ε > 0, and let n = O(t + s/log(q) + (log(1/ε))/log(q)). Let x_1, …, x_t be uniform in F_q^n and let s be uniform in F_q^n and independent of the x_i. For any f : F_q^{tn} → {0, 1}^{n·log(q)+s}, there exist a simulator Sim and random variables a_1, …, a_t such that the joint distribution of (f(x_1, …, x_t), ⟨x_1, s⟩, …, ⟨x_t, s⟩) is ε-close to (Sim(Σ_i a_i u_i), u_1, …, u_t), where u_1, …, u_t are uniform and independent random variables in F_q, independent of (a_1, …, a_t).
One way to interpret the theorem is that the inner-product extractor breaks all correlations induced by a leak f(x_1, …, x_t), except linear ones. Recall that for our notion of AR codes it is crucial that linear relations are preserved.
We then demonstrate an application of AR codes in upgrading the security of oblivious transfer (OT) protocols while simultaneously achieving optimal communication, a question that had remained open due to seemingly insurmountable difficulties, explained later. Specifically, we obtain the first rate-1 OT protocol with statistical sender privacy from the decisional Diffie–Hellman (DDH) assumption. While our motivation to study AR codes is to construct efficient and high-rate statistically sender private OT protocols, we expect AR codes, and in particular the ideas used to construct them, to be useful in a broader sense.

Technical Outline
In what follows, we provide an informal overview of the techniques developed in this work.

Warmup: Algebraic Restriction Codes for General Linear Functions
Before discussing the ideas leading up to our main result, we first discuss the instructive case of AR codes restricting general linear functions to simple linear functions. Specifically, fix a finite field F_q and let G be the class of linear functions F_q^{2m} → F_q^m of the form g(x̂_1, x̂_2) = A x̂_1 + x̂_2, where A ∈ F_q^{m×m} is an arbitrary matrix. We want to restrict G to the class F consisting of linear functions F_q^{2n} → F_q^n of the form f(x_1, x_2) = a·x_1 + x_2, where a ∈ F_q is a scalar.
Our construction proceeds as follows. The seed s specifies a random matrix R ∈ F_q^{n×m}; such a matrix has full rank except with probability ≤ 2^{−(m−n)}. To encode a pair of input vectors x_1, x_2 ∈ F_q^n, the encoder samples uniformly random x̂_1, x̂_2 ∈ F_q^m such that R x̂_1 = x_1 and R x̂_2 = x_2, and outputs the codeword (x̂_1, x̂_2). To evaluate a scalar linear function given by a ∈ F_q on such a codeword, we (unsurprisingly) compute ŷ = a x̂_1 + x̂_2. To decode ŷ we compute y = R ŷ. Correctness of this AR code construction follows routinely, as the scalar a commutes with the matrix R: R ŷ = R(a x̂_1 + x̂_2) = a·R x̂_1 + R x̂_2 = a·x_1 + x_2.
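The encode/evaluate/decode cycle just described can be sketched in a few lines of code. The following is a minimal illustration over a toy prime field; the parameter values and all helper names (`rand_vec`, `solve_mod`, `encode`, `decode`) are ours, not from the paper. Encoding samples a uniform preimage of x under R by fixing a free tail and solving for the head via Gaussian elimination.

```python
import random

q, n, m = 101, 2, 8   # toy field F_q, message length n, codeword length m

def rand_vec(k): return [random.randrange(q) for _ in range(k)]

def mat_vec(M, v):
    return [sum(a * b for a, b in zip(row, v)) % q for row in M]

def solve_mod(A, b):
    """Solve A z = b over F_q by Gaussian elimination (A must be invertible)."""
    k = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(k):
        piv = next(r for r in range(c, k) if M[r][c])      # raises if singular
        M[c], M[piv] = M[piv], M[c]
        inv = pow(M[c][c], q - 2, q)
        M[c] = [v * inv % q for v in M[c]]
        for r in range(k):
            if r != c and M[r][c]:
                f = M[r][c]
                M[r] = [(vr - f * vc) % q for vr, vc in zip(M[r], M[c])]
    return [M[r][k] for r in range(k)]

# Seed: a random matrix R in F_q^{n x m} whose leading n x n block is invertible.
while True:
    R = [rand_vec(m) for _ in range(n)]
    try:
        solve_mod([row[:n] for row in R], [0] * n)
        break
    except StopIteration:
        pass

def encode(x):
    """Uniform xh in F_q^m with R xh = x: free tail, solved head."""
    tail = rand_vec(m - n)
    b = [(xi - sum(row[n + j] * tail[j] for j in range(m - n))) % q
         for xi, row in zip(x, R)]
    return solve_mod([row[:n] for row in R], b) + tail

def decode(yh): return mat_vec(R, yh)

x1, x2 = rand_vec(n), rand_vec(n)
a = 7                                      # scalar function f(x1, x2) = a*x1 + x2
yh = [(a * u + v) % q for u, v in zip(encode(x1), encode(x2))]
assert decode(yh) == [(a * u + v) % q for u, v in zip(x1, x2)]
```

The final assertion checks exactly the commutation R(a x̂_1 + x̂_2) = a x_1 + x_2 on which correctness rests.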
In this case it will also be more convenient to look at the problem from the angle of randomness extraction. Specifically, assume that x̂_1, x̂_2 ∈ F_q^m are chosen uniformly at random. We want to show that for any matrix A ∈ F_q^{m×m}, anything that can be learned about R x̂_1 and R x̂_2 from A x̂_1 + x̂_2 can also be learned from a·R x̂_1 + R x̂_2 for some a ∈ F_q.
How can we find such an a for any given A?
First notice that if x̂_1 happens to be an eigenvector of A with respect to an eigenvalue a_i, then it indeed holds that A x̂_1 + x̂_2 = a_i x̂_1 + x̂_2. Thus, a reasonable approach is to set the extracted scalar a ∈ F_q to one of the eigenvalues of A (or 0 if there are no eigenvalues). If the matrix A has several distinct eigenvalues a_i, we set a to be the eigenvalue whose eigenspace V_i has maximal dimension. Note that since the sum of the dimensions of all eigenspaces of A is at most m, there can be at most one eigenspace whose dimension is larger than m/2. Furthermore, the eigenvalue a_i corresponding to this eigenspace will necessarily be the extracted value a.
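The eigenvector observation can be sanity-checked in a couple of lines (toy values of ours, with a diagonal A so that the standard basis vectors are eigenvectors):

```python
q = 7
A = [[3, 0], [0, 5]]                  # diagonal over F_7: e1 is an eigenvector for a_i = 3
x1h, x2h = [1, 0], [2, 4]             # x1h is an eigenvector of A
Ax1 = [sum(a * x for a, x in zip(row, x1h)) % q for row in A]
# A x1h + x2h equals a_i x1h + x2h for the eigenvalue a_i = 3
assert [(u + v) % q for u, v in zip(Ax1, x2h)] == [(3 * u + v) % q for u, v in zip(x1h, x2h)]
```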
Rather than showing how to simulate ŷ = A x̂_1 + x̂_2 in general, in this sketch we only briefly argue the following special case. Namely, if all eigenspaces of A have dimension at most m/2, then with high probability over the choice of the random matrix R ∈ F_q^{n×m} it holds that x_1 = R x̂_1 and x_2 = R x̂_2 are uniform and independent of ŷ. Thus, assume that ŷ = A x̂_1 + x̂_2 were not independent of x_1 = R x̂_1 and x_2 = R x̂_2. Since these three variables are linear functions of the uniformly random x̂_1 and x̂_2, there must exist a non-zero linear relation given by vectors u, v ∈ F_q^n and w ∈ F_q^m such that uᵀx_1 + vᵀx_2 + wᵀŷ = 0 for all choices of x̂_1 and x̂_2. But this means that uᵀR + wᵀA = 0 and vᵀR + wᵀ = 0. Eliminating wᵀ, this simplifies to the equation uᵀR = vᵀR A.
We will now argue that for any such matrix A ∈ F_q^{m×m} (whose eigenspaces all have dimension ≤ m/2), with high probability over the choice of the random matrix R, such a relation given by (u, v) ≠ 0 does not exist. We take a union bound over all non-zero (u, v) and distinguish the following cases:
• If u and v are linearly independent, then uᵀR and vᵀR are uniformly random and independent (over the random choice of R). Thus the probability that uᵀR and vᵀR A collide is 1/q^m.
• If u and v are linearly dependent, then (say) u = αv. In this case uᵀR = vᵀR A is equivalent to α·vᵀR = vᵀR A, i.e., the uniformly random vector vᵀR is an eigenvector of the matrix A with respect to the eigenvalue α. However, since all eigenspaces of A have dimension at most m/2, the probability that vᵀR lands in one of the eigenspaces is bounded by m/q^{m/2}.
Since there are q^{2n} possible choices for the vectors u, v ∈ F_q^n, choosing m sufficiently large (e.g., m > 5n) implies that the probability that such u, v ∈ F_q^n exist is negligible. The full proof is provided in Sect. 6.

Algebraic Restriction Codes for Bounded Output Functions
We will now turn to algebraic restriction codes for arbitrary functions with bounded output length. Now let F_q be the finite field of size q, let G be the class of all functions F_q^{2n} → {0, 1}^{1.5·n·log(q)}, and let F be the class of linear functions F_q^2 → F_q, i.e., all functions of the form f(x_1, x_2) = a_1 x_1 + a_2 x_2 for some a_1, a_2 ∈ F_q. Our AR code construction follows naturally from the inner-product extractor. The seed consists of a random vector s ∈ F_q^n; to encode x_1, x_2 ∈ F_q we choose uniformly random x̂_1, x̂_2 ∈ F_q^n with ⟨x̂_1, s⟩ = x_1 and ⟨x̂_2, s⟩ = x_2. Likewise, to decode a value ŷ we compute y = ⟨ŷ, s⟩; correctness follows immediately as above. To show that this construction restricts G to F, we again take the extractor perspective. Thus, assume that x_1, x_2 ∈ F_q^n are distributed uniformly at random and let g : F_q^n × F_q^n → {0, 1}^{1.5·n·log(q)} be an arbitrary function.
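The inner-product-based encoding and decoding above can be sketched as follows (toy parameters and function names are ours). Encoding samples a uniform vector and then corrects one coordinate so that the inner product with the seed hits the message; decoding is a single inner product.

```python
import random

q, n = 101, 8

def rand_vec(k): return [random.randrange(q) for _ in range(k)]
def ip(u, v): return sum(a * b for a, b in zip(u, v)) % q

s = rand_vec(n)                   # seed; an all-zero s occurs only with probability q^-n

def encode(x):
    """Uniform xh in F_q^n conditioned on <xh, s> = x."""
    i = next(j for j in range(n) if s[j])       # a coordinate where s is invertible
    xh = rand_vec(n)
    xh[i] = (xh[i] + (x - ip(xh, s)) * pow(s[i], q - 2, q)) % q
    return xh

def decode(yh): return ip(yh, s)

x1, x2 = 17, 55
a1, a2 = 3, 9
yh = [(a1 * u + a2 * v) % q for u, v in zip(encode(x1), encode(x2))]
assert decode(yh) == (a1 * x1 + a2 * x2) % q    # linearity of the inner product
```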
We need to argue that for any g ∈ G there exist a_1, a_2 ∈ F_q such that g(x_1, x_2) can be simulated given y = a_1⟨x_1, s⟩ + a_2⟨x_2, s⟩, but no further information about ⟨x_1, s⟩ and ⟨x_2, s⟩. Our analysis distinguishes two cases.
• In the first case, both ⟨x_1, s⟩ and ⟨x_2, s⟩ are statistically close to uniform given g(x_1, x_2). In other words, g(x_1, x_2) contains no information about ⟨x_1, s⟩ and ⟨x_2, s⟩. We can simulate g(x_1, x_2) by choosing two independent fresh samples x_1', x_2' and computing g(x_1', x_2').
• In the second case, ⟨x_1, s⟩ and ⟨x_2, s⟩ are (jointly) statistically far from uniform given g(x_1, x_2). In this case we rely on a variant of the XOR lemma [13] to conclude that there must exist a_1, a_2 ∈ F_q such that a_1⟨x_1, s⟩ + a_2⟨x_2, s⟩ is also far from uniform given g(x_1, x_2). Roughly, the XOR lemma states that if two (correlated) random variables z_1, z_2 are such that for all a_1, a_2 ∈ F_q (not both zero) the combination a_1 z_1 + a_2 z_2 is statistically close to uniform, then (z_1, z_2) must be statistically close to uniform in F_q^2. Consequently, the existence of such a_1, a_2 ∈ F_q in our setting follows directly from the contrapositive of the XOR lemma. But this implies that a_1 x_1 + a_2 x_2 must have very low min-entropy given g(x_1, x_2): otherwise, the leftover hash lemma would imply that ⟨a_1 x_1 + a_2 x_2, s⟩ = a_1⟨x_1, s⟩ + a_2⟨x_2, s⟩ is close to uniform given g(x_1, x_2), in contradiction to the conclusion above. But this means that a_1 x_1 + a_2 x_2 is essentially fully specified by g(x_1, x_2); in other words, g(x_1, x_2) carries essentially the entire information about a_1 x_1 + a_2 x_2. Now recall that the bit size of g(x_1, x_2) is 1.5·n·log(q) bits, whereas the bit size of a_1 x_1 + a_2 x_2 is n·log(q) bits. Thus, there is essentially not enough room in g(x_1, x_2) to carry significant further information about x_1 or x_2. Again relying on the leftover hash lemma, we then conclude that given g(x_1, x_2), the inner products ⟨x_1, s⟩ and ⟨x_2, s⟩ are statistically close to uniform subject to a_1⟨x_1, s⟩ + a_2⟨x_2, s⟩ = y.
While this sketch captures the high-level ideas of our proof, the actual proof needs to overcome some additional technical challenges and relies on a careful partitioning argument. The proof can be found in Sect. 7.

From AR Codes to Efficient Oblivious Transfer
We display the usefulness of AR codes in cryptography by constructing a new oblivious transfer (OT) [14,15] protocol. OT is a protocol between two parties, a sender, who has a pair of messages (m_0, m_1), and a receiver, who has a bit b; at the end, the receiver learns m_b, while the sender learns nothing. OT is a central primitive of study in the field of secure computation: any multiparty functionality can be securely computed given a secure OT protocol [16,17]. In particular, statistically sender private (SSP) [18,19] 2-message OT has recently received a lot of attention due to its wide array of applications, such as statistical ZAPs [20,21] and maliciously circuit-private homomorphic encryption [22]. While the standard security definitions for OT are simulation-based (via efficient simulators), SSP OT settles for a weaker indistinguishability-based security notion for the receiver and an inefficient simulation notion for the sender. On the other hand, SSP OT can be realized in just two messages, without a setup and from standard assumptions, a regime in which no OT protocols with simulation-based security are known. In this work, we obtain the first OT protocol that simultaneously satisfies the following properties: (1) It is round-optimal (2 messages) and does not assume a trusted setup.
(2) It satisfies the notion of statistical sender privacy (and computational receiver privacy). That is, a receiver who may (potentially) choose her first-round message maliciously will be statistically oblivious to at least one of the two messages of the sender. (3) It achieves the optimal rate for information transfer (i.e., it is rate-1). (4) It makes only black-box use of cryptographic primitives, in the sense that our protocol does not depend on circuit-level implementations of the underlying primitives.
Prior to our work, no OT protocol was known that simultaneously satisfied all of the above properties, from any assumption. The only previous construction was based on LWE (using expensive fully homomorphic encryption techniques) and satisfies only the first three conditions, but not the last one (see Sect. 3). We obtain constructions that satisfy all the above conditions from DDH/LWE. Optimal-rate OT is an indispensable tool for realizing various MPC functionalities with sublinear communication [24]. As direct corollaries, we obtain two-message maliciously secure protocols for keyword search [24] and symmetric private information retrieval (PIR) [25] with statistical server privacy and with asymptotically optimal communication complexity from DDH/LWE. Our scheme is the first that makes only black-box use of cryptography, which we view as an important step towards the practical applicability of these protocols.

Packed ElGamal
Before delving into the description of our scheme, we recall the vectorized variant of the ElGamal encryption scheme [26]. Let G be an Abelian group of prime order p and let g be a generator of G. In the packed ElGamal scheme, a public key pk consists of group elements (g, h_1, …, h_n), and a ciphertext encrypting a vector m ∈ {0,1}^n consists of the n + 1 group elements (g^r, h_1^r·g^{m_1}, …, h_n^r·g^{m_n}) for a random r. If we disregard the need for efficient decryption, we can encrypt arbitrary Z_p^n vectors rather than just binary vectors. For such full-range plaintexts the rate of packed ElGamal, i.e., the ratio between plaintext size and ciphertext size, comes down to (1 − 1/(n + 1))·log(p)/λ, assuming a group element can be described using λ bits. If λ ≈ log(p), as is the case for dense groups, the rate approaches 1 for sufficiently large n. Finally, for a matrix X ∈ {0,1}^{n×k}, we encrypt X column-wise to obtain a ciphertext matrix C.
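The scheme can be sketched concretely over a toy group. Below, the order-53 subgroup of Z_107^* stands in for G, all names (`enc`, `dec`, the parameter values) are ours, and the brute-force discrete logarithm in decryption only works because plaintexts are bits; a real instantiation would use a cryptographically sized group.

```python
import random

p, q, g = 107, 53, 4              # g generates the order-53 subgroup of Z_107^*
n = 4                             # pack n bits per ciphertext

sk = [random.randrange(q) for _ in range(n)]
pk = (g, [pow(g, si, p) for si in sk])      # pk = (g, h_1, ..., h_n) with h_i = g^{s_i}

def enc(pk, m):                   # m in {0,1}^n  ->  n + 1 group elements
    gg, hs = pk
    r = random.randrange(1, q)
    return (pow(gg, r, p), [pow(h, r, p) * pow(gg, mi, p) % p for h, mi in zip(hs, m)])

def dec(sk, ct):
    d0, ds = ct
    # d_i / d0^{s_i} = g^{m_i}; comparing against 1 is a brute-force dlog for bits
    return [0 if d * pow(d0, q - si, p) % p == 1 else 1 for si, d in zip(sk, ds)]

m = [1, 0, 1, 1]
assert dec(sk, enc(pk, m)) == m
```

Note that the ciphertext consists of n + 1 group elements for n plaintext slots, matching the (1 − 1/(n + 1)) factor in the rate expression above.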

Homomorphism and Ciphertext Compression
Packed ElGamal supports two types of homomorphism. It is linearly homomorphic with respect to Z_p-linear combinations. Namely, if c is an encryption of a vector m ∈ Z_p^n and c' is an encryption of a vector m' ∈ Z_p^n, then for any α, β ∈ Z_p it holds that c'' = c^α · c'^β is a well-formed encryption of αm + βm' (again, disregarding the need for efficient decryption for large plaintexts). This routinely generalizes to arbitrary linear combinations: we can define a homomorphic evaluation algorithm Eval_1 which takes as input a public key pk, a ciphertext matrix C encrypting a matrix X ∈ Z_p^{n×m}, and vectors v, w, and outputs a ciphertext encrypting Xv + w; a second algorithm Eval_2 homomorphically multiplies an encrypted vector by a public matrix, yielding a ciphertext under a correspondingly modified public key. Finally, the packed ElGamal scheme supports ciphertext compression for bit encryptions [27]. There is an efficient algorithm Shrink which takes a ciphertext c = (d_0, d) and produces a compressed ciphertext c̃ = (d_0, K, b), where K is a (short) key and b ∈ {0,1}^n is a binary vector. Consequently, compressed ciphertexts are of size n + poly(λ) bits and therefore have rate 1 − poly(λ)/n, which approaches 1 for sufficiently large n (independent of the description size of group elements). Such compressed ciphertexts can then be decrypted using a special algorithm ShrinkDec, using the same secret key sk. Compressed ciphertexts generally do not support any further homomorphic operations, so ciphertext compression is performed after all homomorphic operations.
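The basic linear homomorphism c^α · c'^β can be checked directly on a single-slot toy ElGamal instance (same toy group as above; decryption to g^m suffices for the check, and all names are ours):

```python
import random

p, q, g = 107, 53, 4                  # order-53 subgroup of Z_107^*
s = random.randrange(q)
h = pow(g, s, p)

def enc(m):                            # message in the exponent
    r = random.randrange(1, q)
    return (pow(g, r, p), pow(h, r, p) * pow(g, m, p) % p)

def dec_exp(ct):                       # recovers g^m, enough to verify homomorphism
    d0, d1 = ct
    return d1 * pow(d0, q - s, p) % p

alpha, beta, m1, m2 = 7, 2, 3, 5
c1, c2 = enc(m1), enc(m2)
c = tuple(pow(a, alpha, p) * pow(b, beta, p) % p for a, b in zip(c1, c2))  # c1^a * c2^b
assert dec_exp(c) == pow(g, (alpha * m1 + beta * m2) % q, p)  # encrypts alpha*m1 + beta*m2
```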

Semi-Honest Rate-1 OT from Packed ElGamal
The packed ElGamal encryption scheme with ciphertext compression immediately gives rise to a semi-honestly secure OT protocol with download rate 1. Specifically, the receiver, whose choice bit is b, generates a key pair (pk, sk), encrypts the scalar matrix b·I in a ciphertext matrix C, and sends pk and C to the sender; the sender homomorphically computes and compresses an encryption of A(m_1 − m_0) + m_0 for the encrypted matrix A, which for an honest A = b·I is exactly m_b. However, note that the sender privacy of this protocol completely breaks down against malicious receivers. Specifically, a malicious receiver is not bound to encrypting the scalar matrix b·I, but could instead encrypt an arbitrary matrix A ∈ Z_p^{n×n}, thereby learning A(m_1 − m_0) + m_0 instead of m_b. By choosing, e.g., the block matrix A = (0 0; 0 I), the receiver could learn half of the bits of m_0 and half of the bits of m_1, thus breaking sender privacy.
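The attack is pure linear algebra and can be verified without any encryption (toy bit vectors of ours):

```python
n = 4
m0 = [1, 0, 0, 1]
m1 = [0, 1, 1, 1]
# Instead of b*I, the malicious receiver encrypts the block matrix A = [[0, 0], [0, I]]:
A = [[1 if (i == j and i >= n // 2) else 0 for j in range(n)] for i in range(n)]
# What the receiver decrypts: A (m1 - m0) + m0, computed over bits
z = [(sum(A[i][j] * (m1[j] - m0[j]) for j in range(n)) + m0[i]) % 2 for i in range(n)]
assert z[: n // 2] == m0[: n // 2]    # first half of m0 leaks ...
assert z[n // 2:] == m1[n // 2:]      # ... together with the second half of m1
```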

Malicious Security via AR Codes
Next we show how to make the above protocol statistically sender private against malicious receivers using AR codes. The protocol follows the same outline as above, except that the sender samples a seed R for an AR code and encodes its inputs into codewords x̂_1, x̂_2. Then it computes a ciphertext ĉ = Eval_1(pk, C, x̂_1, x̂_2). If the sender were to transmit this ciphertext directly, the rate of the scheme would degrade (due to the size of the encodings) and decryption would not be efficient, since ĉ contains an encoding ŷ ∈ Z_p^m. To deal with this issue, we observe that decoding ŷ to y via y = R ŷ is exactly the type of operation supported by the homomorphic evaluation Eval_2. Thus, we let the sender further compute c' = Eval_2(pk, ĉ, R). By homomorphic correctness of Eval_2, it holds that c' is an encryption of R ŷ = y = m_b ∈ {0, 1}^n under a modified public key pk' (which depends on R). Since c' encrypts a binary message, the sender can further use the ciphertext compression algorithm Shrink to shrink c' into a rate-1 ciphertext c̃. The sender now sends R and c̃ back to the receiver, who derives a key from sk and R, and uses it to decrypt c̃ via ShrinkDec.
If we were to do things naively, the protocol would still not achieve rate 1, since we also have to attach to the second OT message a potentially large matrix R. This can be resolved via a standard trick: by reusing the same matrix R in several parallel instances of the protocol, we can amortize the cost of sending it. Note that R can be reused as we only need to ensure that the matrix A does not depend on R. Thus, we have achieved a rate-1 protocol.
There is one subtle aspect that we need to address before declaring victory: the security of AR codes only guarantees that a malicious receiver learns at most a(m_1 − m_0) + m_0 for some a ∈ Z_p, rather than b(m_1 − m_0) + m_0 = m_b for a bit b ∈ {0, 1}. To address this last issue, we let the sender compute x̂_1 and x̂_2 as encodings of the messages masked with uniformly random vectors r_0 and r_1.
Consequently, instead of a(m_1 − m_0) + m_0, the ciphertext c' now encrypts a value f(x_1, x_2), and by the security of the AR code c' does not leak more information about x_1 and x_2 than f(x_1, x_2). Now, note that if a = 0, the result reveals m_0 together with a term r_1' = m_0 − r_1, which is uniformly random. On the other hand, if a = 1, the result reveals m_1 together with a term r_0' = m_1 + r_0, which is uniformly random. Finally, if a ∉ {0, 1}, the result is uniformly random, as the last term is uniformly random; i.e., if a ∉ {0, 1} the receiver learns nothing about m_0 and m_1. Thus, we can conclude that even for a malformed public key pk and ciphertext C, the view of the receiver can be simulated given at most one m_b, and statistical sender privacy follows.
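The paper's exact randomization is elided in this excerpt, but the case behavior can be illustrated with one concrete masking of our own devising: set x_1 = (r_0, m_1 − r_1) and x_2 = (m_0, r_1), so that a receiver extracting coefficient a learns a·x_1 + x_2. This reproduces the intended three cases (all names and the specific masking are ours):

```python
import random

p = 101
m0, m1 = 42, 7
r0, r1 = random.randrange(p), random.randrange(p)
x1 = (r0, (m1 - r1) % p)                  # encoded by the sender as xh_1
x2 = (m0, r1)                              # encoded by the sender as xh_2

def receiver_view(a):                      # a receiver extracting coefficient a sees a*x1 + x2
    return tuple((a * u + v) % p for u, v in zip(x1, x2))

assert receiver_view(0)[0] == m0           # a = 0: first half is m0, second half is uniform
assert receiver_view(1)[1] == m1           # a = 1: second half is m1, first half is masked by r0
# For a outside {0,1} both halves carry masks a*r0 and (1-a)*r1, hence are uniform.
```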

Back to Rate-1
Note that now the ciphertext c' is twice as long as before, which again ruins the rate of our scheme. However, in order to obtain a correct scheme, if a = 0 the receiver only needs to recover the first half z_0 of the encrypted vector, whereas if a = 1 she needs the second half z_1. Our final idea is to facilitate this by additionally using a rate-1 OT protocol OT = (OT_1, OT_2, OT_3) with semi-honest security (e.g., as given in [27]). We further use the fact that the packed ElGamal ciphertext c' can be written as (h, c̃_0, c̃_1), where h is the ciphertext header, c̃_0 is a rate-1 ciphertext encrypting z_0, and c̃_1 is a rate-1 ciphertext encrypting z_1 (both with respect to the header h). We modify the above protocol such that the receiver additionally includes a first message ot_1 computed using his choice bit b. Instead of sending both c̃_0 and c̃_1 to the receiver (which would ruin the rate), we compute the sender message ot_2 as ot_2 ← OT_2(ot_1, c̃_0, c̃_1) and send (h, ot_2) to the receiver. The receiver can now recover c̃_b from ot_2 and decrypt the ciphertext (h, c̃_b) as above. Note that the communication rate from sender to receiver is now 1, and that we do not require any form of sender security from the rate-1 OT. Finally, note that, as discussed above, the protocol can be made overall rate-1 by amortizing the size of the receiver's message (i.e., repeating the protocol in parallel for the same receiver message but independent blocks of the sender message).

Certified Versus Uncertified Groups
We conclude this overview by discussing two variants of groups in which we can implement the OT specified above. In certified groups, we can assume that G in fact implements a group of prime order p, even if maliciously chosen. In this setting, our simpler variant of AR codes suffices, since we are guaranteed that a malicious receiver can only obtain information of the form A x̂_1 + x̂_2 (for an arbitrarily chosen matrix A). In non-certified groups, the linearity of the group is no longer checkable by just looking at its description G. Here we can only appeal to the fact that we have a bound on the size of the output learned by the receiver, enforced by the fact that our OT achieves rate 1: the second OT message is too short to encode both x̂_1 and x̂_2. In this setting, we need the full power of bounded-output AR codes in order to show the statistical sender privacy of the above protocol.

Roadmap
We discuss some related works in Sect. 3. The preliminaries are provided in Sect. 4. We introduce algebraic restriction codes in Sect. 5. In Sect. 6 we show that canonic AR codes restrict general linear functions to simple linear functions. In Sect. 7 we show that canonic AR codes restrict output-bounded functions to simple linear combinations; the main result of this section is stated in Theorem 5. In Sect. 8 we provide our construction of rate-1 SSP OT from DDH, and we discuss novel applications in Sect. 9.

Related Work
A recent line of works [27] proposed a new approach to constructing semi-honest OT with a rate approaching 1. This framework can be instantiated from a wide range of standard assumptions, such as the DDH, QR and LWE problems. The core idea of this approach is to construct OT from a special type of packed linearly homomorphic encryption scheme which allows compressing ciphertexts after homomorphic evaluation. Pre-evaluation ciphertexts in such packed encryption schemes typically need to encrypt a structured plaintext containing redundant information to guarantee correctness of homomorphic evaluation. In the context of statistical sender privacy, this presents an issue, as a malicious receiver may deviate from the structure required by the protocol to (potentially) learn correlated information about m_0 and m_1.
Regarding the construction of SSP OT, all current schemes roughly follow one of three approaches sketched below.

The Two Keys Approach [18, 19, 28, 29]
In this construction blueprint, the receiver message ot_1 specifies two (correlated) public keys pk_0 and pk_1, under potentially different public-key encryption schemes. The sender's message ot_2 now consists of two ciphertexts c_0 = Enc(pk_0, m_0) and c_1 = Enc(pk_1, m_1). Statistical sender privacy is established by choosing the correlation between the keys pk_0 and pk_1 in such a way that one of these keys must be lossy, and that this is either directly enforced by the underlying structure or checkable by the sender. Here, lossiness means that either c_0 or c_1 loses information about its respective encrypted message. In group-based constructions following this paradigm [18,19,28], the sender must trust that the structure on which the encryption schemes are defined actually implements a group in order to be convinced that either pk_0 or pk_1 is lossy. We say that the group G must be a certified group. This is problematic if the group G is chosen by the receiver, as G could, e.g., have non-trivial subgroups which prevent lossiness.
Furthermore, note that since the sender's message ot_2 contains two ciphertexts, each of which should, from the sender's perspective, be potentially decryptable, this approach is inherently limited to rates below 1/2.

The Compactness Approach [30]
The second approach to construct SSP OT is based on high-rate OT. Specifically, assume we start with any two-round OT protocol with a (download) rate greater than 1/2, say, for the sake of simplicity, with rate close to 1. This means that the sender's message ot_2 is shorter than the concatenation of m_0 and m_1. But this means that, from an information-theoretic perspective, ot_2 must lose information about either m_0 or m_1. This lossiness can now be used to bootstrap statistical sender privacy as follows. The sender chooses two random messages r_0 and r_1 and uses them as his input to the OT. Moreover, he uses a randomness extractor to derive a key k_0 from r_0 and a key k_1 from r_1, respectively. Now the sender provides two one-time-pad encrypted ciphertexts c_0 = k_0 ⊕ m_0 and c_1 = k_1 ⊕ m_1 to the receiver. A receiver with choice bit b can then recover r_b from the OT, derive the key k_b via the randomness extractor, and obtain m_b by decrypting c_b.
To argue statistical sender privacy using this approach, we need to ensure that one of the keys k_0 or k_1 is uniformly random from a malicious receiver's perspective. Roughly speaking, by the discussion above, the second OT message ot_2 only needs to lose half of the information in r_0 or in r_1. Thus, in the worst case, the receiver could learn half of the information in each of r_0 and r_1 from ot_2. Consequently, we need a randomness extractor which produces a uniformly random output as long as its input has n/2 bits of min-entropy. Thus, we can prove statistical sender privacy only for messages of length smaller than n/2.
But in terms of communication efficiency, this means that we used a high-rate n-bit string OT to implement a string OT of length ≤ n/2, which means that the rate of the SSP OT we have constructed is less than 1/2. This is true without even taking into account the additional communication cost required to transmit the ciphertexts c_0 and c_1. Thus, this approach effectively trades high rate for statistical sender privacy, falling back to a lower rate. We conclude that this approach is also fundamentally stuck at rate 1/2.
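The bootstrap above can be sketched in a few lines. Here the inner high-rate OT is stubbed out (we simply hand the receiver r_b), and SHA-256 keyed with a public seed stands in for a seeded randomness extractor; both choices are simplifying assumptions for illustration only.

```python
import hashlib, os

def ext(seed: bytes, r: bytes) -> bytes:
    """Stand-in for a seeded randomness extractor (illustrative only)."""
    return hashlib.sha256(seed + r).digest()

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def sender(m0: bytes, m1: bytes, seed: bytes):
    # Random OT inputs r_0, r_1; only these go through the inner high-rate OT.
    r0, r1 = os.urandom(32), os.urandom(32)
    c0 = xor(ext(seed, r0), m0)   # one-time-pad ciphertexts sent in the clear
    c1 = xor(ext(seed, r1), m1)
    return (r0, r1), (c0, c1)

def receiver(b: int, r_b: bytes, cts, seed: bytes):
    """r_b is what the receiver obtained from the inner OT for choice bit b."""
    return xor(ext(seed, r_b), cts[b])

seed = os.urandom(16)
m0, m1 = b"A" * 32, b"B" * 32
(r0, r1), cts = sender(m0, m1, seed)
print(receiver(1, r1, cts, seed) == m1)   # True
```

Note how the message length is capped by the extractor's output length, which in turn is capped by the min-entropy guarantee on r_{1-b}; this is exactly where the rate-1/2 barrier appears.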

The Non Black-Box Approach [31, 32]
While the above discussion seems to imply that there might be an inherent barrier to achieving SSP OT with rate > 1/2, there is in fact a way to convert any SSP OT protocol into a rate-1 SSP OT protocol using sufficiently powerful tools. Specifically, using a rate-1 fully homomorphic encryption (FHE) scheme [31, 32], the receiver can delegate the decryption of ot_2 to the sender. In more detail, assume that OT_3(st, ot_2) is the decryption operation which is performed by the receiver at the end of the SSP OT protocol. By providing an FHE encryption FHE.Enc(st) of the OT receiver state st along with the first message ot_1, the receiver enables the sender to perform OT_3(st, ot_2) homomorphically, resulting in an FHE encryption c of the receiver's output m_b. Now the receiver merely has to decrypt c to recover m_b. In terms of rate, note that the OT sender message now merely consists of c, which is rate-1 as the FHE scheme is rate-1. Further note that this transformation does not harm SSP security, as from the sender's view the critical part of the protocol is over once ot_2 has been computed, i.e., for the sender, performing the homomorphic decryption is merely a post-processing operation. On the downside, this transformation uses quite heavy tools. In particular, it needs to make non-black-box use of the underlying SSP OT protocol by performing the OT_3 operation homomorphically.
In summary, to the best of our knowledge, all previous approaches to construct SSP OT are either fundamentally stuck at rate 1/2 or make non-black-box use of the underlying cryptographic machinery, making it prohibitively expensive to run such a protocol in practice.
Finally, we mention that if one is willing to settle for computational instead of statistical privacy for the sender, it is possible to build rate-1 OT using existing techniques by relying on super-polynomial hardness assumptions. The idea is that the parties first engage in a (low-rate) OT protocol OT_1, so that the receiver learns one of the two random PRG seeds (s_0, s_1) sampled by the sender. In parallel, the sender prepares two ciphertexts (ct_0 := PRG(s_0) ⊕ m_0, ct_1 := PRG(s_1) ⊕ m_1) for his two input messages (m_0, m_1), and communicates one of them to the receiver using a semi-honest rate-1 OT protocol. Even given both (ct_0, ct_1), the receiver cannot recover both m_0 and m_1, because OT_1 guarantees that at least one of the seeds remains computationally hidden from the receiver. The above protocol is rate-1 because the added communication of obliviously transferring (s_0, s_1) is independent of the size of m_0. The main drawback of the above protocol is that, since we do not rely on a trusted setup, we cannot extract the choice bit from the receiver in polynomial time, and hence we have to rely on complexity leveraging to establish sender security. In particular, the best we can guarantee is that a malicious computationally bounded receiver cannot compute both messages of the sender. This notion falls short of replacing rate-1 SSP OT in the aforementioned applications.
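A minimal sketch of this masking step, with SHAKE-128 standing in for the PRG and the low-rate OT for the seeds stubbed out; these instantiations are our own assumptions for illustration.

```python
import hashlib, os

def prg(seed: bytes, n: int) -> bytes:
    """SHAKE-128 as an illustrative stand-in for the PRG."""
    return hashlib.shake_128(seed).digest(n)

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

m0, m1 = os.urandom(10_000), os.urandom(10_000)  # long sender messages
s0, s1 = os.urandom(16), os.urandom(16)          # short PRG seeds for OT_1

ct0 = xor(prg(s0, len(m0)), m0)
ct1 = xor(prg(s1, len(m1)), m1)

# A receiver holding s0 (obtained from the low-rate OT) recovers m0:
print(xor(prg(s0, len(ct0)), ct0) == m0)         # True
# The added OT communication is |s0| + |s1| = 32 bytes, independent of |m0|.
```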

Preliminaries
We will denote finite fields of unspecified size by F, and for any prime power q we will denote the finite field of size q by F_q. We will use u_F to denote a uniform and independent random variable over F, and likewise u_{F^t} to denote a uniform and independent random variable over F^t.
Z denotes the integers and Z_q = Z/qZ the integers modulo q. Vectors are lowercase bold letters (e.g., a, b), while matrices are uppercase bold letters (e.g., A, B). For sets of functions we use italic capital letters (e.g., F, G). We also write [n] for {1, ..., n}.
For a cyclic group we use G and usually call its generator g. As a shorthand for the matrix of group elements (g^{M_{i,j}})_{i,j∈[n]} we write g^M, where M is a matrix from Z^{n×n}, and similarly for vectors and rectangular matrices. This allows for notation such as (g^M)^v = g^{Mv}, where g^M is an n × n matrix of group elements and v is a length-n vector of scalars.
We use span(M) to indicate the column span of matrix M and LKer(M) its left kernel.

Definition 1 (Computational Indistinguishability) Two random variables B and C are computationally indistinguishable if for every polynomial-time adversary A, the distinguishing advantage |Pr[A(B) = 1] − Pr[A(C) = 1]| is negligible in the security parameter. Sometimes we denote this by B ≈ C.

Statistical Measures
We introduce some standard concepts for statistical measures.
Definition 2 (Statistical Distance) We define the statistical distance between two discrete random variables A, B to be Δ(A; B) = (1/2) Σ_x |Pr[A = x] − Pr[B = x]|. We use Δ(A; B | C) as a shorthand for Δ((A, C); (B, C)). Sometimes we write A ≈_ε B instead of Δ(A; B) ≤ ε.
We call two random variables statistically indistinguishable if their statistical distance is negligible in the security parameter. We denote this as ≈ or ≈_s. Since ≈ by itself is ambiguous, we will make its meaning clear from the context. Definition 3 (Min-Entropy) We define the min-entropy of a random variable A to be H_∞(A) = −log(max_a Pr[A = a]). Definition 4 (Average Conditional Min-Entropy) We define the average conditional min-entropy of a random variable A given the random variable B as H̃_∞(A | B) = −log(E_{b←B}[max_a Pr[A = a | B = b]]). We will make use of the following simple lemma.
Lemma 1 (See e.g. [33]) Let F be a finite field and m > n be integers. A uniformly random matrix R ←$ F^{n×m} has full rank, except with probability 2^{−(m−n)}.
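Lemma 1 can be checked empirically over a small field; the Gaussian-elimination rank routine and the parameter choices below are illustrative.

```python
import random

def rank_mod_q(rows, q):
    """Rank of a matrix over F_q (q prime) via Gaussian elimination."""
    rows = [r[:] for r in rows]
    n, m = len(rows), len(rows[0])
    rank = 0
    for col in range(m):
        # find a pivot with a non-zero entry in this column
        piv = next((i for i in range(rank, n) if rows[i][col] % q), None)
        if piv is None:
            continue
        rows[rank], rows[piv] = rows[piv], rows[rank]
        inv = pow(rows[rank][col], -1, q)
        rows[rank] = [(v * inv) % q for v in rows[rank]]
        for i in range(n):
            if i != rank and rows[i][col]:
                f = rows[i][col]
                rows[i] = [(a - f * b) % q for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank

q, n, m, trials = 5, 3, 6, 1000
fails = sum(
    rank_mod_q([[random.randrange(q) for _ in range(m)] for _ in range(n)], q) < n
    for _ in range(trials)
)
print(fails / trials <= 2 ** -(m - n))   # empirical failure rate stays below the bound
```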
We will use the following variant of the leftover hash lemma.
Lemma 2 (Leftover Hash Lemma variant) Let r be uniform in F^n, and let l be a correlated random variable; then for any x with sufficient min-entropy given l, the inner product ⟨x, r⟩ is statistically close to uniform given (r, l), where ⟨·, ·⟩ is the inner product over F.
We will need the following simple lemma from [34]. Variants of this lemma have been used in the past to prove the security of various non-malleable code constructions (such as [2, 3]).
Lemma 3 Let S be some random variable distributed over a set S, and let S_1, ..., S_j be a partition of S. Let φ : S → T be some function, and let D_1, ..., D_j be some random variables over the set T. Assume that for all 1 ≤ i ≤ j, for some random variable D ∈ T such that for all d, Pr[D
The following is a fundamental property of statistical distance.

Lemma 4 For any, possibly randomized, function α: if Δ(A; B) ≤ ε, then Δ(α(A); α(B)) ≤ ε.
We will need the following lemma.
Lemma 5 Then, for any non-empty set A, we have the following.

Algebraic Restriction Codes
In this section, we will define our main technical tool: Algebraic Restriction Codes.An algebraic restriction code allows encoding a linear function so that any (suitably bounded) malicious evaluation algorithm cannot exfiltrate information that could not have been obtained via a valid evaluation of the function.We will use algebraic restriction codes as a powerful interface to achieve circuit privacy without sacrificing other crucial properties such as high rate.Algebraic restriction codes can be seen as a specific type of secret sharing which allows for certain homomorphic operations while inhibiting others.
In particular, algebraic restriction codes will become useful in striking a balance between seemingly conflicting goals: relying on additional structure to achieve advanced functionality while not making this additional structure a potential avenue for attacking function privacy. Generally, we allow AR codes to be seeded, i.e., all operations take as additional input a seed s. We now define algebraic restriction codes as follows: an AR code consists of algorithms Encode, Eval and Decode. In terms of correctness, we require that for all seeds s, all inputs x and all functions f ∈ F, Decode(s, Eval(Encode(s, x), f)) = f(x).
In terms of security we require that AR codes restrict a potentially larger class G of functions to F. Specifically, we require that for any malicious evaluation function g ∈ G that evaluating g on an encoding of an input x corresponds to an honest evaluation of a function f ∈ F on x.We formalize this via a simulation-based security notion.
Definition 6 (Restriction Security) We say that a code AR is G-F restriction secure if there exists a (randomized) extractor E, which takes as input a function g ∈ G and outputs a function f ∈ F and auxiliary information aux, and a simulator S, such that for every x and every function g ∈ G it holds that (s, g(Encode(s, x))) ≈ (s, S(s, aux, f(x))), where s is a uniformly random seed and (f, aux) ← E(g). Here, ≈ denotes either computational or statistical indistinguishability.
A crucial aspect of algebraic restriction codes will be the complexity of both evaluation and decoding.Specifically, we will be interested in algebraic restriction codes for which both Eval and Decode are linear functions.

Concatenating AR Codes
Concatenation is a powerful concept in coding theory, allowing one to combine properties of different codes. We will now briefly show that concatenating AR codes has the expected effect: if AR_1 restricts a class H to a class G and AR_2 restricts a class G' ⊇ G to another class F, then the code AR_3 obtained by first encoding with AR_2 and then with AR_1 restricts H to F.

Lemma 6
Let AR_1 be an AR code which restricts a class H to a class G. Let further AR_2 be an AR code which restricts the class G to a class F. Let AR_3 be the AR code obtained by first encoding with AR_2 and then with AR_1, i.e., AR_3.Encode_{s_1,s_2}(x) = AR_1.Encode_{s_1}(AR_2.Encode_{s_2}(x)). Then the code AR_3 restricts H to F.
Proof Let S_1 be the simulator for AR_1 and S_2 be the simulator for AR_2. We define the extractor E_3 in the canonical way via E_3(h) = (f, (aux_1, aux_2)), where (f, aux_2) = E_2(g) and (g, aux_1) = E_1(h). Furthermore, we define the simulator S_3 via S_3((aux_1, aux_2), y) = S_1(aux_1, S_2(aux_2, y)). Let h ∈ H be a tampering function; the claim then follows by applying the restriction security of AR_1 and then of AR_2.
While Lemma 6 provides a general concatenation theorem for AR codes, in our applications we will rely on a slight variant for specific function classes where AR_1 is an H-G AR code and AR_2 is a G'-F AR code for which the classes G and G' are not identical, but rather G is (after a suitable identification) a subclass of G'. In the following, we will identify the extension field F_{q^k} with the vector space F_q^k.
Lemma 7 Let F_q be a finite field and let F_{q^k} be its extension field of degree k. Let G be the class of functions F_{q^k} × F_{q^k} → F_{q^k} of the form (x, y) → ax + by (for a, b ∈ F_{q^k}), and let H be a class of functions containing G. Let G' be the class of functions F_q^k × F_q^k → F_q^k of the form (x, y) → Ax + y (for A ∈ F_q^{k×k}). Finally, let F be the class of functions F_q^n × F_q^n → F_q^n which are either of the form (x, y) → x or linear.
Proof Let E_1 and S_1 be the extractor and simulator for AR_1, and let E_2 and S_2 be the extractor and simulator for AR_2. We start by constructing the extractor E_3 for AR_3. On input h ∈ H, E_3 proceeds as follows:
• Compute (g, aux_1) ← E_1(h), and parse g as a function (x, y) → ax + by for a, b ∈ F_{q^k}.
• If b = 0, set f to be the function (x, y) → x and set aux_2 = ∅.
• Otherwise, let A, B ∈ F_q^{k×k} be the multiplication matrices corresponding to a, b ∈ F_{q^k} (notice that B is invertible as b ≠ 0) and set g' to be the corresponding function in G'.
Now the simulator S_3 is given as follows. On input (s = (s_1, s_2), aux_3, z), S_3 proceeds as follows:

Now fix a function h ∈ H and let (g, aux_1) ← E_1(h), where we parse g as a function (x, y) → ax + by for a, b ∈ F_{q^k}. We will distinguish two cases, b = 0 and b ≠ 0.
1. In the first case, conditioned on b = 0, it holds that
2. In the second case, conditioned on b ≠ 0, it holds that
Overall, we conclude that
which concludes the proof.

From Arbitrary Linear to Simple Linear Functions
In this section, we will show a simple construction of AR codes which constrain an adversary from arbitrary linear functions to simple linear functions. Specifically, we consider the following two classes of functions:
• The class F consists of all functions f : F_q^n × F_q^n → F_q^n of the form f(x, y) = ax + y, where a ∈ F_q.
• The class G consists of all functions g : F_q^m × F_q^m → F_q^m of the form g(x, y) = Ax + y, where A ∈ F_q^{m×m}.
Note that the functions in the class F have two degrees of freedom, whereas the functions in the class G have 2m² degrees of freedom.
Let the seed s = R ←$ F_q^{n×m} be a uniformly random matrix. The AR code AR_1 is given as follows: Encode(R, x) samples a uniformly random x̂ ∈ F_q^m with Rx̂ = x, honest evaluation applies the linear function to the encodings, and Decode(R, ŷ) outputs Rŷ.
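The following sketch instantiates AR_1 with small illustrative parameters: the encoder samples a uniform preimage of the input under the seed matrix R (as used in the proof of Theorem 4), and decoding multiplies by R. All parameter choices and the solver are illustrative assumptions.

```python
import random

q, n, m = 101, 2, 8      # illustrative parameters with q prime

def solve_random(R, x, q):
    """Sample a uniform solution xh of R @ xh = x over F_q (R full rank)."""
    rows, cols = len(R), len(R[0])
    A = [R[i][:] + [x[i] % q] for i in range(rows)]   # augmented matrix
    pivots, row = [], 0
    for col in range(cols):                           # reduced row echelon form
        piv = next((i for i in range(row, rows) if A[i][col] % q), None)
        if piv is None:
            continue
        A[row], A[piv] = A[piv], A[row]
        inv = pow(A[row][col], -1, q)
        A[row] = [(v * inv) % q for v in A[row]]
        for i in range(rows):
            if i != row and A[i][col]:
                f = A[i][col]
                A[i] = [(a - f * b) % q for a, b in zip(A[i], A[row])]
        pivots.append(col)
        row += 1
    free = [c for c in range(cols) if c not in pivots]
    sol = [0] * cols
    for c in free:                                    # uniform free variables
        sol[c] = random.randrange(q)
    for r, c in enumerate(pivots):
        sol[c] = (A[r][cols] - sum(A[r][j] * sol[j] for j in free)) % q
    return sol

def mat_vec(R, v, q):
    return [sum(a * b for a, b in zip(row, v)) % q for row in R]

R = [[random.randrange(q) for _ in range(m)] for _ in range(n)]  # the seed
x1, x2 = [3, 7], [1, 4]
xh1 = solve_random(R, x1, q)      # Encode(R, x1)
xh2 = solve_random(R, x2, q)      # Encode(R, x2)

a = 5                             # honest f(x, y) = a*x + y
yh = [(a * u + v) % q for u, v in zip(xh1, xh2)]             # Eval
decoded = mat_vec(R, yh, q)                                  # Decode
print(decoded == [(a * u + v) % q for u, v in zip(x1, x2)])  # True whenever R has full rank
```

Decoding is linear in the encoding, matching the requirement that Eval and Decode be linear functions.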
The technical core of this section is Lemma 8.

Lemma 8 Let q > 0 be a modulus and n > 0. Let A ∈ F_q^{m×m} be a square matrix. Let a ∈ F_q be the eigenvalue of A for which the dimension of the corresponding eigenspace V_a is maximal. Let x_1, x_2, u ←$ F_q^m be chosen uniformly at random. Let further R ←$ F_q^{n×m} be chosen uniformly at random. Given that m ≥ 2n + 2 + 2t, it holds that (Rx_1, Rx_2, Ax_1 + x_2) and (Rx_1, Rx_2, ax_1 + x_2 + (A − aI)u) are identically distributed (1), except with probability 2q^{−t} over the choice of R.
Using Lemma 8, we will establish the main result of this section, Theorem 4.

Theorem 4
Let F, G be the two classes defined above.The AR code AR 1 restricts G to F.

Proof of Theorem 4
Let g(x̂_1, x̂_2) = Ax̂_1 + x̂_2, and let a ∈ F_q be the eigenvalue of A for which the corresponding eigenspace has the largest dimension; if no non-zero eigenvalue exists, set a = 0. By Lemma 1, the matrix R has full rank, except with negligible probability 2^{−(m−n)}. By Lemma 8, for uniformly random x̂_1 and x̂_2, equation (2) holds, except with negligible probability over the choice of R. Thus, fix an R which both has full rank and for which (2) holds, and fix two vectors x_1, x_2 ∈ F_q^n. Since R has full rank, we can condition on Rx̂_1 = x_1 and Rx̂_2 = x_2 and obtain (3). This implies that for all but a negligible fraction of the R, we can simulate g(x̂_1, x̂_2) from y = ax_1 + x_2 by choosing a uniformly random ŷ with Rŷ = y and a uniformly random u ∈ F_q^m, and outputting z = ŷ + (A − a·I)u. By (3), it holds that z and g(x̂_1, x̂_2) are identically distributed.

Proof of Lemma 8
First note that, leaving out R, the left-hand side of (1) can be written as M_0 · (x_1, x_2), whereas the right-hand side of (1) (again leaving out R) can be written as M_1 · (x_1, x_2, u). Consequently, since x_1, x_2 and u are chosen uniformly at random from F_q^m, the two distributions on the left-hand side and the right-hand side are identically distributed if and only if the columns of M_0 and M_1 span the same space. First observe that span(M_0) ⊆ span(M_1). To show the other inclusion, note that span(M_1) ⊆ span(M_0) if and only if LKer(M_0) ⊆ LKer(M_1). Therefore, let (v_1, v_2, w) be a vector in LKer(M_0), i.e., it holds that v_1R + wA = 0 and v_2R + w = 0. This immediately implies that v_1R = v_2RA. We will show that this implies that v_2R is an eigenvector of A. As R is chosen uniformly from F_q^{n×m}, it holds by Lemma 1 that R has full rank, except with negligible probability 2^{−(m−n)}. Now recall that a is the eigenvalue of A with the eigenspace of highest dimension and recall that v_2RA = v_1R. We will show that this implies that v_2RA = a·v_2R, except with negligible probability over the choice of R. That is, we will show (4): for all v_1, v_2 ≠ 0, v_2RA = v_1R implies v_2RA = av_2R, except with negligible probability over the choice of R. From this it follows immediately that LKer(M_0) ⊆ LKer(M_1), as w = −v_2R and therefore w(A − aI) = −av_2R + av_2R = 0. We will establish (4) via a union bound over the v_1, v_2, and towards this goal we will distinguish two cases.
1. In the first case, v_1 and v_2 are linearly dependent, i.e., there exists an α ∈ F_q such that v_1 = αv_2. If α = a, then the probability of the event is 0. Thus, consider α ≠ a, and let V_α be the eigenspace of A corresponding to the eigenvalue α, where V_α = {0} if α is not an eigenvalue of A. Observe that the dimension of V_α must be at most m/2, as otherwise α would be the eigenvalue with the eigenspace of the largest dimension and therefore α = a. Consequently, the event happens with probability at most q^{−m/2}, as v_2R is distributed uniformly over F_q^m and the dimension of V_α is at most m/2. We further note that there are at most q^n choices for v_2 and q choices for α; thus, in this case there are q^{n+1} possible choices for the pair (v_1, v_2).
2. In the second case, v_1 and v_2 are linearly independent. In this case, v_1R and v_2R are distributed independently and uniformly at random. Consequently, the event happens with probability at most q^{−m}. Note that in this case there are fewer than q^{2n} choices for the pair (v_1, v_2).

We can conclude, via the union bound, that the probability that (4) is violated is at most q^{n+1−m/2} + q^{2n−m}. As m ≥ 2n + 2 + 2t, we can bound this probability by 2q^{−t}.

From Output-Bounded Functions to Linear Combinations
In this section, we will show that the AR code induced by the inner-product extractor restricts arbitrary functions of bounded output length to linear functions. Specifically, consider the following two classes of functions:
• The class F consists of all functions f : F_q^t → F_q of the form f(x_1, ..., x_t) = Σ_{i=1}^t a_i x_i, where a_1, ..., a_t ∈ F_q.
• The class G consists of all functions g : (F_q^n)^t → {0, 1}^{n log(q)+l} (for some l < n log(q)).
Let the seed s ←$ F_q^n be a uniformly random vector. The AR code AR_2 is given as follows.
Eval((x_1, ..., x_t), f): • Compute and output y ← Σ_{i=1}^t a_i x_i. Decode(s, y): • Compute and output ⟨y, s⟩.
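The following sketch instantiates AR_2 with small illustrative parameters; the Encode algorithm shown (sampling a uniform vector x with ⟨x, s⟩ = m, as in the setup of Corollary 1) is an assumption of this sketch.

```python
import random

q, n, t = 101, 8, 3     # illustrative parameters with q prime

def encode(s, m):
    """Sample x uniform in F_q^n subject to <x, s> = m (requires s != 0)."""
    i = next(j for j, v in enumerate(s) if v % q)   # a coordinate with s_i != 0
    x = [random.randrange(q) for _ in range(n)]
    x[i] = 0
    x[i] = ((m - sum(u * v for u, v in zip(x, s))) * pow(s[i], -1, q)) % q
    return x

def inner(x, s):
    return sum(u * v for u, v in zip(x, s)) % q

s = [random.randrange(q) for _ in range(n)]
while not any(s):                      # the seed must be non-zero
    s = [random.randrange(q) for _ in range(n)]

ms = [10, 20, 30]                      # inputs m_1, ..., m_t
xs = [encode(s, mi) for mi in ms]      # Encode each input under the seed s

a = [2, 3, 5]                          # honest f(x_1,...,x_t) = sum a_i x_i
y = [sum(a[i] * xs[i][j] for i in range(t)) % q for j in range(n)]   # Eval
print(inner(y, s) == sum(ai * mi for ai, mi in zip(a, ms)) % q)      # True
```

Decoding the honestly evaluated codeword with the inner product recovers exactly the linear combination Σ a_i m_i, as required by correctness.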
Restriction security of this construction follows immediately from Corollary 1 at the end of this section.

A Conditional XOR Lemma
The following is straightforward from a Markov-like argument.
Lemma 9 • For any ε > 0 and any correlated random variables X ∈ S and E, if Δ(X; u_S | E) ≤ ε, then for any δ > 0, with probability at least 1 − ε/δ over the choice of i ← E, Δ(X | E = i; u_S) ≤ δ. • For any δ > 0, if Δ(X | E = i; u_S) ≤ δ holds with probability at least p over the choice of i ← E, then Δ(X; u_S | E) ≤ δ + (1 − p).
Lemma 10 Let X ∈ S be a random variable for some set S. Assume that
Lemma 11 Let X ∈ S, Z ∈ T be correlated random variables for some sets S, T. Assume that (X,
The following is a variant of the well-known Vazirani XOR lemma. This was proved in [35] in the quantum setting.
Lemma 12 Let x = (x_1, ..., x_t) ∈ F^t be a random variable, and let E be some correlated random variable. Assume that for all α_1, ..., α_t ∈ F, not all zero,
Proof We start by choosing E and fixing it. By Lemma 9 and the union bound, we have that with probability at least 1 − p^t ε/δ over the choice of E, for all α_1, ..., α_t ∈ F, not all zero,
where the distribution of the x_i's is conditioned on the choice of E. Let x' = (x'_1, ..., x'_t) be i.i.d. as x conditioned on the choice of E. By Lemma 10, we have that for all α_1, ..., α_t ∈ F, not all zero,
Let a = (a_1, ..., a_t) be uniform in F^t and independent of x, x'. Then,
Thus,
Simplifying, we get
Using the inequality in Lemma 10, we get that
Recall that this is conditioned on the correct choice of E, which we have with probability at least 1 − p^t ε/δ. Using Lemma 9 with δ = p^{t/4}√ε, we have that
We remark here that if there is no side information E, then there is no union bound in the first step, and δ = ε, so that the statistical distance is 2p^{t/2}ε.

Combinatorial Simulator
We now prove our main technical result, which yields algebraic restriction codes for functions of bounded output length.
Theorem 5 Let q be a prime power, let n, t, s be positive integers and ε > 0 such that n log q − (9t + 3) log q − s − 2 log t − 28 ≥ 16 log(1/ε).
Let x_1, ..., x_t be uniform in F_q^n and let s be uniform in F_q^n and independent of the x_i. For any f : F_q^{tn} → {0, 1}^{n log q + s}, there exists a simulator Sim and random variables a_1, ..., a_t ∈ F_q such that
(s, f(x_1, ..., x_t), ⟨x_1, s⟩, ..., ⟨x_t, s⟩, a_1, ..., a_t) ≈_{2ε} (s, Sim(s, a_1, ..., a_t, Σ_{i=1}^t a_i u_i), u_1, ..., u_t, a_1, ..., a_t),
where u_1, ..., u_t are uniform and independent random variables in F_q, independent of (a_1, ..., a_t).
We will use the XOR lemma (Lemma 12) to prove this theorem. We will begin by showing that if we start with x_1, ..., x_t being uniform in any large enough set T, then there exists a large subset T' ⊆ T and some fixed a_1, ..., a_t in F_q such that, conditioned on (x_1, ..., x_t) being in this subset, the only information about ⟨x_1, s⟩, ..., ⟨x_t, s⟩ obtained by learning f(x_1, ..., x_t) and s is ⟨Σ_{i=1}^t a_i x_i, s⟩. More formally, Lemma 13 Let q be a prime power, let n, t, s be positive integers and ε > 0 such that n log q − (9t + 3) log q − s − 2 log t − 28 ≥ 16 log(1/ε).
Let T ⊆ F_q^{tn} be such that |T| ≥ ε · q^{tn}. For any f : F_q^{tn} → {0, 1}^{n log q + s}, there exist a_1, ..., a_t ∈ F_q, a non-empty set T' ⊆ T, and a simulator Sim, such that for the tuple (x_1, ..., x_t) distributed uniformly in T' and s uniformly and independently in F_q^n,
Δ(s, f(x_1, ..., x_t), ⟨x_1, s⟩, ..., ⟨x_t, s⟩ ; s, Sim(s,
where u_1, ..., u_t are uniform and independent random variables in F_q.
Proof Let x_1, ..., x_t be uniform in T. Consider the following cases.

CASE 1:
Δ(s, f(x_1, ..., x_t), ⟨x_1, s⟩, ..., ⟨x_t, s⟩ ; s, f(x_1, ..., x_t), u_1, ..., u_t) ≤ ε. In this case, let T_1 = T. The simulator Sim ignores the inputs and just samples s, x_1, ..., x_t according to the given input distribution, and outputs s, f(x_1, ..., x_t). Thus, the given statement implies the claim. Notice that since the simulator ignores the input, the above statement holds for any choice of a_1, ..., a_t. CASE 2: Δ(s, f(x_1, ..., x_t), ⟨x_1, s⟩, ..., ⟨x_t, s⟩ ; s, f(x_1, ..., x_t), u_1, ..., u_t) > ε. Lemma 12 shows that if all non-trivial linear combinations of the ⟨x_i, s⟩ are close to uniform given E = (s, f(x_1, ..., x_t)), then the joint distribution (⟨x_1, s⟩, ..., ⟨x_t, s⟩) is close to uniform given E. Applying the contrapositive, we get that there exist a_1, ..., a_t ∈ F_q, not all 0, such that ⟨Σ_{i=1}^t a_i x_i, s⟩ is far from uniform given E. Notice that this implies that there is a non-trivial correlation between ⟨Σ_{i=1}^t a_i x_i, s⟩ and (s, f(x_1, ..., x_t)). We will show that for this choice of a_1, ..., a_t, and an appropriate choice of the subset T', this correlation is essentially the only correlation between the joint distribution (⟨x_1, s⟩, ..., ⟨x_t, s⟩) and (s, f(x_1, ..., x_t)). By the Markov inequality, with probability at least ε²/(18q^{3t/2}) over the choice of y ← f(x_1, ..., x_t), it holds that
Let Y be the set of all y which satisfy the above. For all y ∈ Y, let T_y be the preimage of y under the function f, i.e., the set of all (x_1, ..., x_t) such that f(x_1, ..., x_t) = y.
We have that an element chosen uniformly at random from T is in T_y for some y ∈ Y with probability at least ε²/(18q^{3t/2}). This implies that
Let y be some element in Y. By the contrapositive of the leftover hash lemma (Lemma 2), we have that the min-entropy of Σ_{i=1}^t a_i x_i conditioned on f(x_1, ..., x_t) = y is at most log q + 2 log(1/ε'), where
This implies that for each y ∈ Y, there is a large number of elements (x_1, ..., x_t) ∈ T_y for which Σ_{i=1}^t a_i x_i is fixed. We now select only those elements from T_y which correspond to Σ_{i=1}^t a_i x_i being fixed. For each y ∈ Y, let φ(y) be the most frequently occurring value of Σ_{i=1}^t a_i x_i for (x_1, ..., x_t) ∈ T_y, and let
By inequality (6), we have that
Notice that for any (x_1, ..., x_t) in ∪_{y∈Y} T'_y, it holds that Σ_{i=1}^t a_i x_i is equal to φ(f(x_1, ..., x_t)), i.e., it is uniquely determined given f(x_1, ..., x_t).
Intuitively, since Σ_{i=1}^t a_i x_i carries roughly n log q bits of information, and it is a deterministic function of f(x_1, ..., x_t), we expect that f(x_1, ..., x_t) can be uniquely determined given (a little more than) s additional bits of information. We will now remove those elements for which a large number of bits is required to determine f(x_1, ..., x_t) given Σ_{i=1}^t a_i x_i. Let Y' be the set of all y such that
The total number of elements in Y \ Y' is at most the size of the image of f, i.e., q^n · 2^s. Hence the number of distinct values of φ(y) for y ∈ Y \ Y' is at most
Notice that for any element z ∈ F_q^n, there are at most q^{(t−1)n} values of (x_1, ..., x_t) such that Σ_{i=1}^t a_i x_i = z. Thus, the number of elements in ∪_{y∈φ^{−1}(z)} T'_y is at most
Thus,
We let the set T' be ∪_{y∈Y'} T'_y, and let x_1, x_2, ..., x_t be uniform in T'. We have the following two properties satisfied by (x_1, ..., x_t):
for some function ψ. Since not all a_1, ..., a_t are 0, we assume without loss of generality that a_t ≠ 0. Then, notice that
where we used inequality (7). Additionally, considering the additional leakage from ψ(x_1, ..., x_t), we get that
where we used the fact that the length of ψ(x_1, ..., x_t) is at most s + 14 + (9t/2 + 1) log q + 7 log(1/ε), and also the bound on n log q as given in the lemma statement. Restating with φ, ψ replaced by the function f, we get the following.
By the leftover hash lemma (Lemma 2), we have that
Again by the triangle inequality, we have that
Writing ⟨x_t, s⟩ as (1/a_t)(⟨Σ_{i=1}^t a_i x_i, s⟩ − ⟨Σ_{i=1}^{t−1} a_i x_i, s⟩) and applying Lemma 4, we have that (⟨x_1, s⟩, ..., ⟨x_t, s⟩, f(x_1, ..., x_t), s)
Corollary 1 Let q be a prime power, let n, t, s be positive integers and ε > 0 such that n log q − (25t + 3) log q − s − 2 log t − 60 ≥ 16 log(1/ε).
Let m_1, ..., m_t ∈ F_q. Let s be uniform in F_q^n and let x_1, ..., x_t be sampled uniformly in F_q^n conditioned on the event that for all i ∈ [t], ⟨x_i, s⟩ = m_i. For any f : F_q^{tn} → {0, 1}^{n log q + s}, there exists a simulator Sim and random variables a_1, ..., a_t ∈ F_q such that
(s, f(x_1, ..., x_t), a_1, ..., a_t) ≈_ε (s, Sim(s, a_1, ..., a_t, Σ_{i=1}^t a_i m_i), a_1, ..., a_t),
where u_1, ..., u_t are uniform and independent random variables in F_q, independent of (a_1, ..., a_t).
Proof Applying Lemma 5 to Theorem 5, and conditioning on u_1 = m_1, ..., u_t = m_t, we get that if

Rate-1 SSP OT from DDH
In this section, we discuss the standard definition of rate-1 statistical sender-private oblivious transfer (rate-1 SSP OT) and then go over our construction using algebraic restriction codes.We start by providing the necessary cryptographic definitions.

Decisional Diffie-Hellman Assumption
The assumptions below are stated with respect to a group-generator scheme, while most protocols just consider the group. This, however, is merely to simplify the notation in the protocols. Each protocol-participating party simply chooses the group G according to the publicly known group-generator scheme G and the security parameter λ, and proceeds as detailed in the protocol.
Definition 7 (DDH) Let G be a group-generator scheme which on input 1^λ outputs (G, p, g). The decisional Diffie-Hellman assumption holds for the group-generator scheme G if for all polynomial-time adversaries A,
|Pr[A(g, g^a, g^b, g^{ab}) = 1] − Pr[A(g, g^a, g^b, g^c) = 1]|
is negligible in λ for (G, p, g) ← G(1^λ) and uniformly random a, b, c ∈ Z_p.
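The two distributions in Definition 7 can be written out directly; the toy subgroup parameters below are far too small to be secure and serve only to illustrate the tuple shapes.

```python
import random

# Toy order-q subgroup of Z_p^* (p = 2q + 1); insecure illustrative parameters.
p, q, g = 983, 491, 4

a, b, c = (random.randrange(1, q) for _ in range(3))
ddh_tuple  = (g, pow(g, a, p), pow(g, b, p), pow(g, a * b % q, p))  # (g, g^a, g^b, g^ab)
rand_tuple = (g, pow(g, a, p), pow(g, b, p), pow(g, c, p))          # (g, g^a, g^b, g^c)

# Sanity check: the DDH tuple is internally consistent, (g^a)^b = g^{ab}.
print(pow(ddh_tuple[1], b, p) == ddh_tuple[3])   # True
```

The DDH assumption states that no efficient adversary can tell `ddh_tuple` from `rand_tuple` (for an appropriately large group).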

Public-Key Encryption Schemes
A public-key encryption scheme uses two keys, a public key pk and a secret key sk.
We use the public key to encrypt messages; the result is called a ciphertext. Without knowledge of the secret key, it is computationally infeasible to recover the message from the ciphertext. The secret key, however, enables its holder to reliably retrieve the message from the ciphertext.
Definition 8 (Public-Key Encryption) The following algorithms describe a public-key encryption scheme: KeyGen(1^λ): The key-generation algorithm takes the security parameter λ as input and outputs a key pair (pk, sk). Enc(pk, m): The encryption algorithm takes a public key pk and a message m as input and outputs a ciphertext c. Dec(sk, c): The decryption algorithm takes a secret key sk and a ciphertext c as input and outputs a message m. It is typically deterministic.
In the rest of the document, every encryption scheme will be public-key, so we will not mention this again.
The rate captures the size relation between a ciphertext and its corresponding plaintext.
Definition 11 (Rate) An encryption scheme (KeyGen, Enc, Dec) has rate ρ if there exists a polynomial μ such that for all security parameters λ, all possible outputs (pk, sk) of KeyGen(1^λ), and all messages m with |m| ≥ μ(λ), it holds that |m| / |Enc(pk, m)| ≥ ρ. We call an encryption scheme high-rate if it has a rate greater than 1/2, and we call it rate-1 if the rate ρ(λ) approaches 1 as λ → ∞.

Homomorphic Encryption
In homomorphic encryption, the decryption algorithm is a homomorphism: certain changes to a ciphertext change the underlying plaintext in a structured way. Definition 13 (Homomorphic Correctness) Let F be a set of functions and let f be an arbitrary element of F. An F-homomorphic encryption scheme (KeyGen, Enc, Eval, Dec) is correct if (KeyGen, Enc, Dec) is a correct encryption scheme and, for all messages m, security parameters λ, and (pk, sk) from the support of KeyGen(1^λ), decrypting the homomorphically evaluated ciphertext yields f(m).

Oblivious Transfer
Two-round oblivious transfer is a protocol in which a receiver encodes a choice bit b and transmits the encoding to a sender. The sender then responds to that transmission using its two messages m_0 and m_1. In the end, the receiver learns m_b but not m_{1−b}, and the sender learns nothing. Definition 14 (Oblivious Transfer) A (string) 1-out-of-2 OT consists of three algorithms: OT_1, OT_2, and OT_3.
OT_1(1^λ, b): Takes as input the security parameter λ ∈ N and a choice bit b ∈ {0, 1} and produces a request ot_1 and a state st. OT_2(ot_1, (m_0, m_1)): Uses the request ot_1 and the two sender inputs m_0, m_1 ∈ {0, 1}* of the same length to create a response ot_2. OT_3(ot_2, st): Computes a result y from the state st and the response ot_2.
We define correctness in the following.
Definition 15 (Correctness) An OT is correct if for all security parameters λ, bits b ∈ {0, 1}, and sender inputs m_0, m_1 ∈ {0, 1}*, it holds that OT_3(OT_2(ot_1, (m_0, m_1)), st) = m_b, where (ot_1, st) ← OT_1(1^λ, b). As standard for 2-round OT, we require that the bit of the receiver is hidden in an indistinguishability sense.
Definition 16 (Receiver Security) An OT is receiver secure if for all security parameters λ and PPT adversaries A, the requests for the two choice bits are computationally indistinguishable, i.e., OT_1(1^λ, 0) ≈ OT_1(1^λ, 1). We define (malicious) statistical sender privacy for 2-round OT.
Definition 17 (Statistical Sender Privacy) An OT is statistically sender private if there exists an unbounded simulator Sim such that for all requests ot_1 and sender inputs m_0, m_1 ∈ {0, 1}*, there is a bit b* such that OT_2(ot_1, (m_0, m_1)) ≈_s Sim(ot_1, m_{b*}). The (download) rate of an OT protocol captures how big the sender's response is in comparison to the size of a message m_0. Definition 18 (Rate) An OT (OT_1, OT_2, OT_3) has rate ρ if there exists a polynomial μ such that for all security parameters λ, all choice bits b, message lengths n > μ(λ), sender inputs m_0, m_1 ∈ {0, 1}^n, receiver outputs (ot_1, st) ← OT_1(1^λ, b) and sender outputs ot_2 ← OT_2(ot_1, (m_0, m_1)), it holds that |m_0| / |ot_2| ≥ ρ.

Packed ElGamal
A central component of our OT construction is the packed ElGamal encryption scheme, which we recall here for completeness. As discussed in the introduction, in the ElGamal encryption scheme [26] public keys are of the form (g, h) and messages m are encrypted as (g^r, h^r · m). In the packed ElGamal scheme, the same header g^r is shared across several payload slots h_i^r · m_i, effectively amortizing the cost of the header to encrypt an entire vector m. Now let G be a cyclic group of prime order p, and let g be a generator of G. In our description we provide a decryption algorithm which takes as additional input a matrix M ∈ Z_p^{m×n}, which is applied to the secret key before decryption. In this way, we achieve correctness for homomorphic operations across slots.
• Return the secret key sk = s and the public key pk = h.

Enc(pk, m ∈ {0, 1}^n): For a matrix X = (x_1, ..., x_m) ∈ {0, 1}^{n×m}, we overload encryption and denote Enc(pk, X) = (Enc(pk, x_1), ..., Enc(pk, x_m)). For two ciphertexts c_1 and c_2, we overload Eval_1 and denote by Eval_1(pk, c_1, c_2, −) the homomorphic computation of the difference of c_1 and c_2.

Homomorphic correctness of this scheme follows routinely. To analyze the rate of this scheme, note that plaintexts m ∈ G^n consist of n group elements, whereas ciphertexts consist of n + 1 group elements, i.e., there is an additive overhead of one group element, and the rate of the scheme is n/(n + 1). Thus the rate approaches 1 for growing n.
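For illustration, key generation, encryption, and decryption of packed ElGamal can be sketched over a toy subgroup (the order-509 subgroup of Z_1019^*; the parameters are assumptions for demonstration, real instantiations use cryptographically sized groups). Bits are encoded in the exponent, so decryption compares each unmasked slot against g^0 and g^1:

```python
import random

# Minimal sketch of packed ElGamal over a toy subgroup; NOT secure.
# One shared header g^r masks all n payload slots.
q, p, g = 1019, 509, 4      # 4 generates the order-509 subgroup of Z_1019^*

def keygen(n):
    s = [random.randrange(p) for _ in range(n)]   # one exponent per slot
    return s, [pow(g, si, q) for si in s]         # pk: h_i = g^{s_i}

def enc(h, m):                                    # m in {0,1}^n
    r = random.randrange(1, p)
    header = pow(g, r, q)
    payload = [pow(hi, r, q) * pow(g, mi, q) % q for hi, mi in zip(h, m)]
    return header, payload

def dec(s, ct):
    header, payload = ct
    # e_i / header^{s_i} = g^{m_i}; for bit messages compare against g^0 = 1
    return [0 if ei * pow(header, p - si, q) % q == 1 else 1
            for si, ei in zip(s, payload)]

s, h = keygen(4)
assert dec(s, enc(h, [1, 0, 1, 1])) == [1, 0, 1, 1]
```

The header is computed once per ciphertext, which is what drives the rate to n/(n + 1).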

Lemma 14
The packed ElGamal encryption scheme as described above is IND-CPA secure, assuming that the DDH problem is hard in the group G.
Proof IND-CPA security of the packed ElGamal scheme follows tightly (in n) from the decisional Diffie–Hellman assumption in a routine way: a DDH instance (g, h, g′, h′) can be rerandomized into a pair of vectors h and f such that h is distributed uniformly at random in G^n and the following holds for f: if the instance is a DDH tuple, then f = h^r, where r is the exponent with g′ = g^r.
In this case, setting c = (g′, f · m) (component-wise), c is a correctly distributed ciphertext for the public key pk = h and the message m. If, on the other hand, f is uniformly random, then c is also uniformly random and independent of m. It follows that an adversary with advantage ε against the IND-CPA security of packed ElGamal can be used to distinguish DDH with advantage ε.
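The rerandomization step above is the standard random self-reduction of DDH; a small sketch over the toy subgroup (parameters assumed for illustration, not secure) shows that each derived pair (h_i, f_i) = (h^a · g^b, h′^a · g′^b) satisfies f_i = h_i^r whenever the input is a DDH tuple:

```python
import random

# Random self-reduction of DDH: one instance (g, h, g', h') yields many
# fresh pairs (h_i, f_i). Toy subgroup parameters; NOT a secure group.
q, p, g = 1019, 509, 4

def rerandomize(h, gp, hp, n):
    pairs = []
    for _ in range(n):
        a, b = random.randrange(p), random.randrange(p)
        hi = pow(h, a, q) * pow(g, b, q) % q     # h_i = h^a * g^b
        fi = pow(hp, a, q) * pow(gp, b, q) % q   # f_i = h'^a * g'^b
        pairs.append((hi, fi))
    return pairs

x, r = 123, 77                                   # secret exponent, randomness
h, gp = pow(g, x, q), pow(g, r, q)
hp = pow(g, x * r, q)                            # DDH tuple: h' = g^{xr}
for hi, fi in rerandomize(h, gp, hp, 8):
    assert fi == pow(hi, r, q)                   # then f_i = h_i^r for all i
```

If instead h′ is uniform (h′ ≠ g^{xr}), the map (a, b) ↦ (h_i, f_i) is a bijection on the randomness, so each pair is uniform, matching the second branch of the proof.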
Before presenting our construction we recall a useful pair of algorithms that allow us to compress ciphertexts for the packed ElGamal encryption scheme.
Sketch Let T be a polynomial in the security parameter and let PRF : {0, 1}^λ × G → {0, 1}^τ, where τ ≈ log(λ), be a pseudorandom function. On input a ciphertext (c, (e_1, ..., e_n)), the compression algorithm Shrink samples keys K for the PRF until the following two conditions are simultaneously satisfied for all i ∈ [n]: (1) PRF(K, e_i/g) ≠ 0, and (2) there exists some 0 ≤ γ_i ≤ T with PRF(K, e_i · g^{γ_i}) = 0. The compressed ciphertext consists of K, the header c, and the bits δ_i = LSB(γ_i).
The compressed decryption algorithm ShrinkDec finds, for every i ∈ [n], the smallest γ_i such that PRF(K, c^{s_i} · g^{γ_i}) = 0 by exhaustive search, where sk = (s_1, ..., s_n).
Finally, it outputs M_i = δ_i ⊕ LSB(γ_i), where LSB denotes the least significant bit of an integer. Note that the scheme is correct with probability 1, since condition (1) ensures that there is no ambiguity in the decoding of the bit M_i. By setting the parameters appropriately, we can guarantee that K can always be found in polynomial time, except with negligible probability.
We can straightforwardly modify the algorithm ShrinkDec in the same way as we have modified the Dec algorithm above to support decryption of ciphertexts produced by Eval_2. Specifically, we modify it such that it takes as an additional input a matrix M and transforms the secret key sk before decrypting its input ciphertext.
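The Shrink/ShrinkDec pair can be sketched as follows; SHA-256 stands in for the PRF and the subgroup and parameter sizes are toy-sized assumptions, not the paper's concrete choices. Condition (1) rules out a PRF zero directly below e_i, so the upward search cannot stop one step too early:

```python
import hashlib, os, random

# Sketch of Shrink/ShrinkDec ciphertext compression; NOT secure.
q, p, g = 1019, 509, 4     # toy order-509 subgroup of Z_1019^*
TAU, T = 3, 200            # tau-bit PRF output, search bound

def prf(K, elem):
    return hashlib.sha256(K + elem.to_bytes(2, "big")).digest()[0] >> (8 - TAU)

def shrink(ct):
    header, payload = ct
    ginv = pow(g, p - 1, q)                       # g^{-1} in the subgroup
    while True:
        K = os.urandom(8)
        # (1) no PRF zero directly below any e_i (rules out ambiguity)
        if any(prf(K, ei * ginv % q) == 0 for ei in payload):
            continue
        deltas = []
        for ei in payload:
            # (2) a PRF zero must exist within T steps above e_i
            gamma = next((t for t in range(T)
                          if prf(K, ei * pow(g, t, q) % q) == 0), None)
            if gamma is None:
                break
            deltas.append(gamma & 1)              # delta_i = LSB(gamma_i)
        else:
            return K, header, deltas

def shrink_dec(sk, cct):
    K, header, deltas = cct
    bits = []
    for si, di in zip(sk, deltas):
        start = pow(header, si, q)                # c^{s_i}, searched upward
        gamma = next(t for t in range(T + 1)
                     if prf(K, start * pow(g, t, q) % q) == 0)
        bits.append(di ^ (gamma & 1))             # M_i = delta_i XOR LSB(gamma_i)
    return bits

random.seed(7)
sk = [random.randrange(p) for _ in range(4)]
pk = [pow(g, si, q) for si in sk]
r = random.randrange(1, p)
msg = [1, 0, 1, 1]
ct = (pow(g, r, q), [pow(hi, r, q) * pow(g, b, q) % q for hi, b in zip(pk, msg)])
assert shrink_dec(sk, shrink(ct)) == msg
```

If m_i = 1, the decryptor's search starts at e_i/g, which is nonzero by condition (1), so the first zero is found exactly one step later than in the m_i = 0 case, and the LSB of γ_i flips accordingly.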

Construction
We now provide our construction of rate-1 SSP OT from the packed ElGamal scheme and an additional receiver-secure rate-1 OT. Specifically, let (OT_1, OT_2, OT_3) be a receiver-secure rate-1 OT protocol. We also use the packed ElGamal encryption scheme with ciphertext compression discussed above. Finally, let AR = (AR.Encode, AR.Eval, AR.Decode) be an AR code with linear decoding, i.e., decoding of a codeword y proceeds by computing R_s · y for a matrix R_s ∈ Z_p^{2n×m} specified by a seed s.
Our OT protocol (OT*_1, OT*_2, OT*_3) is given as follows. In the following, we assume that the seed s is available to the receiver after the first message ot*_1 has been sent. Note that since the seed can be reused in an arbitrary number of parallel executions of the protocol, its size can be amortized and therefore does not affect the asymptotic rate of the protocol. We proceed by showing the computational receiver privacy of the resulting OT protocol.
Theorem 6 (Receiver Privacy) The scheme as described above is computationally receiver private, given that (OT 1 , OT 2 , OT 3 ) is a computationally receiver private OT and that the packed ElGamal scheme is IND-CPA secure.

Instantiating the AR Code
Finally, we show that our scheme satisfies statistical sender privacy. In the certified group setting, we can directly rely on the AR codes constructed in Sect. 6. In the uncertified group setting, we obtain the required codes by concatenating the AR codes of Sect. 7 over an extension field of Z_p with the AR codes of Sect. 6 via Lemma 7.

In the second case, Sim proceeds as follows, where we distinguish three subcases depending on the value of a ∈ Z_p.
• a = 0: In this case, the simulator queries the OT oracle on 0 to receive m_0, and then computes c = S(aux, (m_0, r_1)) for a uniformly random r_1.
• a = 1: This case is similar to the previous one, except that the bit is flipped. More precisely, the simulator queries the OT oracle on 1 and receives m_1, then computes c = S(aux, (r_0, m_1)) for a uniformly random r_0.
• a ∉ {0, 1}: c is computed by running S on a uniformly sampled r ←$ Z_p^{2n}.

The remainder of the algorithm proceeds exactly as in the definition of OT*_2. Note that in any of the above cases, the simulator queries the OT oracle at most once. Thus, all we need to argue is that the distribution of the simulated c is statistically close to the real one.
Our analysis distinguishes the same cases as Sim, i.e., whether the extracted f is of the form f(x_1, x_2) = x_1 or f(x_1, x_2) = a · x_1 + x_2.

1. In the first case, by the security of AR it holds that (s, c) ≈ (s, S(aux, x_1)) = (s, S(aux, r)). As the information in OT*_2 can be computed from s and c, we conclude that Sim faithfully simulates the sender message ot*_2.
2. We now turn to the second case, where f is of the form f(x_1, x_2) = a · x_1 + x_2. In the first two subcases it holds that a ∈ {0, 1}. To see why in these subcases the output of the simulator is correctly distributed, note that if a = 0, then r'_1 = m_0 − r_1 is distributed uniformly at random; likewise, if a = 1, then r'_0 = m_1 + r_0 is distributed uniformly at random. Consequently, the value c computed by Sim has the correct distribution, as it holds by the security of AR that (s, c) = (s, h(x_1, x_2)) ≈ (s, S(aux, f(x_1, x_2))).
As above, we conclude that Sim faithfully simulates the sender message ot*_2. This concludes the proof.

Rate-1
Finally, we argue that the scheme achieves rate 1. In the calculation we only consider, without loss of generality, the size of the second OT message (i.e., the download rate), since the size of the first message can always be amortized to increase the rate arbitrarily [27]. By Lemma 15, we have that for b ∈ {0, 1} it holds that |Shrink(pk, c)| = log(|G|) + λ + n = 2λ + n, which approaches n as n grows. Since the underlying OT scheme has rate 1, the size of ot_2 asymptotically equals n.

Set ots_i^{(0)} to an arbitrary value and continue the procedure of PIR_2 described above based on these values.
The proof of statistical security for PSim follows from that of OSim using an inductive argument.We omit the details.

Server's communication complexity
For r = r(λ), if the server has K = 2^k messages, each of r bits, then the server's communication complexity is r + poly(k, λ) bits (where poly is a fixed polynomial), achieving rate 1 for large enough r.
we choose a uniformly random r ←$ Z_p and set the ciphertext c to c = (d_0, d) = (g^r, h^r · g^m), where both the exponentiations and the group operations on vectors are component-wise. We call d_0 the header of the ciphertext and d = (d_1, ..., d_n) the payload of c; we further call d_1, ..., d_n the slots. To decrypt a ciphertext c, we compute m = dlog_g(d_0^{−x} · d).
The first type of homomorphism is within the slots: there is an algorithm Eval_1 which takes the public key pk, an encryption of a matrix X ∈ Z_p^{n×m}, and two vectors a ∈ Z_p^m and b ∈ Z_p^n, and outputs an encryption of Xa + b. By rerandomizing the resulting ciphertext this can be made function private, i.e., the output ciphertext leaks nothing about a and b beyond Xa + b. The second type of homomorphism supported by packed ElGamal is a limited type of homomorphism across the slots. Specifically, let c = (d_0, d) be an encryption of a message m ∈ Z_p^n and let M ∈ Z_p^{m×n} be a matrix. Then there is a homomorphic evaluation algorithm Eval_2 which takes the public key pk, the ciphertext c, and the matrix M and outputs a ciphertext c′ such that c′ encrypts the message m′ = Mm under a modified public key pk′ = g^{Mx}. Furthermore, if the decrypter knows the matrix M, it can derive the modified secret key sk′ = Mx and decrypt c′ to m′ (given that m′ ∈ {0, 1}^m).
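The within-slot homomorphism can be sketched concretely: from slotwise encryptions of the columns of X and public vectors a and b, one derives an encryption of Xa + b by multiplying headers and slots. The function names and toy subgroup below are illustrative assumptions:

```python
import random

# Sketch of Eval_1: homomorphism within the slots of packed ElGamal.
# Toy subgroup; NOT secure.
q, p, g = 1019, 509, 4

def keygen(n):
    s = [random.randrange(p) for _ in range(n)]
    return s, [pow(g, si, q) for si in s]

def enc_col(h, col):                       # packed encryption of one column
    r = random.randrange(1, p)
    return pow(g, r, q), [pow(hi, r, q) * pow(g, v, q) % q for hi, v in zip(h, col)]

def eval1(cts, a, b):
    # header: prod_j c_{j0}^{a_j};  slot i: g^{b_i} * prod_j e_{ji}^{a_j}
    header, slots = 1, [pow(g, bi, q) for bi in b]
    for (c0, e), aj in zip(cts, a):
        header = header * pow(c0, aj, q) % q
        slots = [si * pow(ei, aj, q) % q for si, ei in zip(slots, e)]
    return header, slots

def dec_exp(s, ct):                        # recovers g^{m_i} in every slot
    header, slots = ct
    return [ei * pow(header, p - si, q) % q for si, ei in zip(s, slots)]

X = [[1, 2], [3, 4], [5, 6]]               # X in Z_p^{3x2}
a, b = [7, 9], [1, 1, 1]
s, h = keygen(3)
cts = [enc_col(h, [row[j] for row in X]) for j in range(2)]
want = [(sum(X[i][j] * a[j] for j in range(2)) + b[i]) % p for i in range(3)]
assert dec_exp(s, eval1(cts, a, b)) == [pow(g, w, q) for w in want]
```

The result carries the randomness Σ_j a_j r_j in its header, which is why an extra rerandomization step is needed for function privacy.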
The receiver, on input the choice bit b, generates a key pair (pk, sk), encrypts the matrix b · I to a ciphertext matrix C, and sends ot_1 = (pk, C) to the sender. The sender, whose inputs are two strings m_0, m_1 ∈ {0, 1}^n, uses Eval_1 to homomorphically evaluate the function f(X) = X(m_1 − m_0) + m_0 on the ciphertext C, obtaining a ciphertext c. It then compresses c to a compressed ciphertext c̃ and sends ot_2 = c̃ back to the receiver, who can decrypt it to a value m using the ShrinkDec algorithm. By homomorphic correctness it holds that m = b · (m_1 − m_0) + m_0 = m_b.
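The selection identity at the heart of this step is plain arithmetic over Z_p and can be checked directly:

```python
# Evaluated on X = b*I, the sender's linear map f(X) = X(m1 - m0) + m0
# selects exactly m_b, componentwise over Z_p.
p = 509
m0, m1 = [3, 1, 4], [1, 5, 9]
for b in (0, 1):
    f = [(b * (y1 - y0) + y0) % p for y0, y1 in zip(m0, m1)]
    assert f == (m0 if b == 0 else m1)
```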

Definition 5
An algebraic restriction code consists of three algorithms Encode, Eval, and Decode with the following syntax.
• Encode(s, x): Takes as input a seed s and an input x, and outputs an encoding c.
• Eval(c, f): Takes as input an encoding c and a function f ∈ F, and outputs an encoding d.
• Decode(s, d): Takes as input a seed s and an encoding d, and outputs a value y.
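The following toy instantiation illustrates only this syntax and the correctness requirement for homogeneous linear functions f(x) = Σ_i a_i x_i over Z_p; the scalar-blinding scheme is an assumption made for demonstration and provides no restriction security whatsoever:

```python
import random

# Toy AR-code interface demo (Encode, Eval, Decode): the seed is a random
# nonzero scalar and decoding inverts it. Interface/correctness only.
p = 509

def Encode(s, x):
    return [s * xi % p for xi in x]          # blind every coordinate

def Eval(c, coeffs):
    return sum(a * ci for a, ci in zip(coeffs, c)) % p   # apply linear f

def Decode(s, d):
    return pow(s, p - 2, p) * d % p          # unblind with s^{-1} mod p

s = random.randrange(1, p)
x, f = [10, 20, 30], [2, 0, 5]
assert Decode(s, Eval(Encode(s, x), f)) == (2 * 10 + 5 * 30) % p
```

Correctness holds because f is linear, so f(s · x) = s · f(x); the actual constructions in this work must additionally restrict arbitrary bounded-output adversarial functions to linear ones.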

Definition 9 (Correctness) An encryption scheme (KeyGen, Enc, Dec) is correct if for all messages m and security parameters λ

Pr[Dec(sk, Enc(pk, m)) = m : (pk, sk) ← KeyGen(1^λ)] = 1.

The most popular notion of security for encryption schemes is IND-CPA security.

Definition 10 (IND-CPA Security) An encryption scheme (KeyGen, Enc, Dec) is IND-CPA secure if for all PPT adversary pairs (A_1, A_2) the advantage in the IND-CPA experiment is negligible in λ.

Definition 12 (Homomorphic Encryption) A homomorphic encryption scheme consists of the following four algorithms:
KeyGen(1^λ): The key-generation algorithm takes the security parameter λ as input and outputs a key pair (pk, sk).
Enc(pk, m): The encryption algorithm takes a public key pk and a message m as inputs and outputs a ciphertext c.
Eval(1^λ, pk, f, c_1, ..., c_n): The evaluation algorithm takes the security parameter λ, a public key pk, a description of a function f, and n ciphertexts c_1, ..., c_n, where n is the input size of f, and outputs a new ciphertext c.
Dec(sk, c): The decryption algorithm takes a secret key sk and a ciphertext c as input and outputs a message m; it is typically deterministic.

Eval_2(pk, c, M ∈ Z_p^{m×n}):
• Parse M = (m_ij).
• Parse c = (c, e) and e = (e_1, ..., e_n).
• For all i ∈ [m] compute d_i = ∏_{j=1}^n e_j^{m_ij}.
• Return the ciphertext c′ = (c, d), where d = (d_1, ..., d_m).
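These steps can be sketched directly; the toy subgroup and variable names are illustrative assumptions. Each output slot d_i = ∏_j e_j^{m_ij} decrypts under the modified key (Ms)_i to the message (Mx)_i:

```python
import random

# Sketch of Eval_2: mixing payload slots with a public matrix M.
# Toy subgroup; NOT secure.
q, p, g = 1019, 509, 4

def eval2(ct, M):
    c0, e = ct
    d = []
    for row in M:
        di = 1
        for mij, ej in zip(row, e):
            di = di * pow(ej, mij, q) % q    # d_i = prod_j e_j^{m_ij}
        d.append(di)
    return c0, d

random.seed(3)
s = [random.randrange(p) for _ in range(3)]        # original secret key
h = [pow(g, si, q) for si in s]
x = [1, 0, 1]                                      # plaintext in the exponent
r = random.randrange(1, p)
ct = (pow(g, r, q), [pow(hi, r, q) * pow(g, xi, q) % q for hi, xi in zip(h, x)])

M = [[1, 1, 0], [0, 1, 1]]
c0, d = eval2(ct, M)
for i, row in enumerate(M):
    ski = sum(mij * sj for mij, sj in zip(row, s)) % p   # modified key (M s)_i
    vi = sum(mij * xj for mij, xj in zip(row, x)) % p    # message (M x)_i
    assert d[i] * pow(c0, p - ski, q) % q == pow(g, vi, q)
```

This is exactly why the modified decryption algorithm takes M as an additional input: it needs Ms rather than s to remove the masks.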

Theorem 7 (Sender Privacy) The scheme as described above is statistically sender private, given that the AR code AR restricts functions of the form h(x_1, x_2) = Eval_1(pk, C, x_1, x_2) (for maliciously chosen pk, C) to linear functions of the form f(x_1, x_2) = a · x_1 + x_2 or f(x_1, x_2) = x_1.

Proof We first provide the description of the (unbounded) simulator Sim. Sim uses the extractor E and the simulator S of the AR code AR as subroutines. Now fix a maliciously chosen receiver message ot*_1 = (ot_1, pk, C). On input ot*_1, Sim proceeds as follows. It first defines the function h(x_1, x_2) = Eval_1(pk, C, x_1, x_2) and runs the extractor E on input h, which outputs a function f and auxiliary information aux. The simulator now distinguishes the following two cases:
1. The function f is of the form f(x_1, x_2) = x_1.
2. The function f is of the form f(x_1, x_2) = a · x_1 + x_2 for an a ∈ Z_p.
In the first case, the simulator computes c by running S on a uniformly sampled r ←$ Z_p^{2n}.