1 Introduction

Suppose Alice holds a set \(S_A\) and Bob a set \(S_B\). Private set intersection (PSI) is a cryptographic primitive that allows each party to learn the intersection \(S_A\cap S_B\) and nothing else. In particular, Alice gets no information about \(S_B\setminus S_A\) (and vice-versa). The problem has attracted a lot of attention through the years, with an extended line of work proposing solutions in a variety of different settings (e.g., [11,12,13, 15,16,17, 21, 25,26,27, 31,32,36]). Also, numerous applications have been proposed for PSI such as contact discovery, advertising, etc. (see for example [22] and references therein). More recently, PSI has also been proposed as a solution for private contact tracing (e.g., [2]).

Threshold PSI. In this work, we focus on a special setting of PSI called Threshold PSI. Here, the parties involved in the protocol learn the output if the size of the intersection between the input sets of the parties is very large, say larger than \(n-t\), where n is the size of the input sets and t is some threshold such that \(t\ll n\); otherwise, they learn nothing about the intersection. This is in contrast with standard PSI where the parties always get the intersection, no matter its size.

The main reason for considering this problem (apart from its numerous applications which we discuss next) is that the amount of communication needed is much smaller than for standard PSI: In particular, there are threshold PSI protocols whose communication complexity depends only on the threshold t and not on the size of the input sets as for standard PSI [17].

Despite its theoretical and practical appeal, there are just a few works that consider this problem [16, 17, 20], and just one of them achieves communication complexity independent of n [17], in the two party setting.

1.1 Applications of Threshold PSI

A wide range of applications have been suggested for threshold PSI in previous works, such as dating apps or biometric authentication mechanisms [17].

One of the most interesting applications for threshold PSI is its use in carpooling (or ridesharing) apps. Suppose two (or more) parties are using a carpooling app, which allows them to share a vehicle if their routes have a large intersection. However, due to privacy issues, they do not want to make their itinerary public. Threshold PSI solves this problem in a simple way [20]: The parties can engage in a threshold PSI protocol, learn the intersection of the routes and, if the intersection is large enough, share a vehicle. Otherwise, they learn nothing and their privacy is maintained.

PSI Using Threshold PSI. Most current protocols for threshold PSI (including ours) are split into two parts: i) a Cardinality Testing, where parties decide if the intersection is larger than \(n-t\); and ii) secure computation of the intersection of the input sets (which we refer to as the PSI part). The communication complexity of these two parts should depend only on the threshold t and not on the input sets’ size n.

Threshold PSI protocols of this form can be used to efficiently compute the intersection, even when no threshold on the intersection is known a priori by the parties, by doing an exponential search for the right threshold. In this case, parties can proceed as follows:

  1. Run a Cardinality Testing for some t (say \(t=1\)).

  2. If it succeeds, perform the PSI part. Else, run the Cardinality Testing again with the threshold doubled, that is, \(t\leftarrow 2t\).

  3. Repeat Step 2 until the Cardinality Testing succeeds for some threshold t and the set intersection is computed.

By following this blueprint, parties are sure that they overshoot the right threshold by a factor of at most 2. That is, if the intersection is larger than \(n-t'\), then the Cardinality Testing will succeed for t such that \(t\ge t'> t/2\). Thus, they can compute the intersection incurring only a factor-of-2 overhead over the best insecure protocol. In other words, PSI protocols can be computed with communication complexity depending on the size of the intersection, and not on the size of the sets.
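The doubling search above can be sketched in a few lines of Python. This is only an illustration: `cardinality_test` is a hypothetical, insecure stand-in for the Cardinality Testing (it sees both sets in the clear), the final intersection is computed in the clear as well, and the test uses the condition \(|S_A\cap S_B|\ge n-t\).

```python
from typing import Set, Tuple


def cardinality_test(s_a: Set[int], s_b: Set[int], t: int) -> bool:
    """Insecure stand-in: True iff |S_A ∩ S_B| >= n - t, with n = |S_A|."""
    return len(s_a & s_b) >= len(s_a) - t


def psi_by_doubling(s_a: Set[int], s_b: Set[int]) -> Tuple[Set[int], int]:
    """Double t until the cardinality test succeeds, then run the PSI part."""
    t = 1
    while not cardinality_test(s_a, s_b, t):
        t *= 2  # t <- 2t
    # Stand-in for the PSI part, whose cost would depend only on t.
    return s_a & s_b, t


if __name__ == "__main__":
    alice = set(range(100))
    bob = set(range(5, 105))
    intersection, t = psi_by_doubling(alice, bob)
    print(len(intersection), t)
```

In this toy run the smallest threshold that works is \(t'=5\), and the search stops at \(t=8\), so \(t\ge t'>t/2\) as claimed.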

This approach can be useful in scenarios where parties suspect that the intersection is large but they do not know exactly how large it is.

1.2 Our Contributions

In the following, N denotes the number of parties in a multi-party protocol and t is the threshold in a threshold PSI protocol. Below, we briefly describe our results.

Multi-party Cardinality Testing. We develop a new Cardinality Testing scheme that allows N parties to check if the intersection of their input sets, each having size n, is larger than \(n-t\) for some threshold \(t\ll n\). The protocol needs \(\tilde{\mathcal {O}}(Nt^2)\) bits of information to be exchanged.

Along the way, we develop new protocols to securely compute linear algebra related functions (such as computing the rank of an encrypted matrix, inverting an encrypted matrix or even solving an encrypted linear system). Our protocols build on ideas of previous works [24, 29], except that our protocols are specially crafted for the multi-party case. Technically, we rely heavily on Threshold Public-Key Encryption schemes which are additively homomorphic (such schemes can be constructed from DDH [14], DCR [30], or from several pairing assumptions [3, 4]) to perform linear operations.

Multi-party Threshold PSI. We then show how our Cardinality Testing protocol can be used to build a Threshold PSI protocol in the multi-party setting. Our construction achieves communication complexity of \(\tilde{\mathcal {O}}(Nt^2)\).

Concurrent Work. Recently, Ghosh and Simkin [18] updated their paper with a generalization to the multi-party case which is similar to the one presented in this paper in Sect. 4. However, they leave as a major open problem the design of a new Cardinality Testing that extends nicely to multiple parties, a problem on which we make relevant advances in this work.

In a concurrent work, Badrinarayanan et al. [1] also proposed new protocols for threshold PSI in the multi-party setting. Their results complement ours. In particular, they propose an FHE-based approach to solve the same problem as we do with a communication complexity of \(\mathcal {O}(Nt)\), where N is the number of parties and t is the threshold. However, we remark that the goal of our work was to reduce the assumptions needed for threshold PSI. They also propose a TPKE-based protocol that solves a slightly different problem: the parties learn the intersection if and only if the set difference among the sets is large, that is, \(|\left( \cup _{i=1}^N S_i\right) \setminus \left( \cap _{i=1}^N S_i\right) |\) is large (Footnote 1), which is denoted as in [1]. This protocol achieves communication complexity of \(\tilde{\mathcal {O}}(Nt)\). They achieve that result using completely different techniques from the ones used in this work. Namely, they noticed that computing the determinant of a Hankel matrix can be done in sublinear time in the size of the matrix. This implies that the cardinality testing of [17] can actually be realized in time \(\tilde{\mathcal {O}}(Nt)\).

1.3 Technical Outline

We now give a high-level overview of the techniques we use to achieve the results discussed above.

Threshold PSI: The Protocol of [17]. Consider two parties Alice and Bob, with their respective input sets \(S_A\) and \(S_B\) of size n. Suppose that they want to know the intersection \(S_A\cap S_B\) iff \(|S_A\cap S_B|\ge n-t\) for some threshold \(t \ll n\). To compute the intersection, both parties encode their sets into polynomials \(P_A(x)=\prod _i^n (x-a_i)\) and \(P_B(x)=\prod _i^n (x-b_i)\) over a large finite field \(\mathbb {F}\), where \(a_i\in S_A\) and \(b_i\in S_B\). The main observation of Ghosh and Simkin [17] is that set reconciliation techniques (developed by Minsky et al. [28]) can be applied in this scenario: if \(|S_A\cap S_B|\ge n-t\), then

$$\frac{P_A(x)}{P_B(x)}=\frac{P_{A\cap B}(x)}{P_{A\cap B}(x)}\frac{P_{A\setminus B}(x)}{P_{B\setminus A}(x)}=\frac{P_{A\setminus B}(x)}{P_{B\setminus A}(x)}$$

and, moreover, \(\deg P_{A\setminus B}=\deg P_{B\setminus A}\le t\). Hence, Alice and Bob just need to (securely) compute \(\mathcal {O}(t)\) evaluation points of the rational function \(P_A(x)/P_B(x)=P_{A\setminus B}(x)/P_{B\setminus A}(x)\) and, after interpolating over these points, Bob can recover the denominator (which reveals the intersection).

Of course, Bob should not be able to recover the numerator \(P_{A\setminus B}\), otherwise security is compromised. So, [17] used an Oblivious Linear Evaluation (OLE) scheme to mask the numerator with a random polynomial that hides \(P_{A\setminus B}\) from Bob.

This protocol is only secure if Alice and Bob are absolutely sure that \(|S_A\cap S_B|\ge n-t\). Otherwise, additional information could be leaked about the respective inputs. Consequently, Alice and Bob should perform a Cardinality Testing protocol, which reveals if \(|S_A\cap S_B|\ge n-t\) and nothing else.

Limitations of the Protocol when Extending to the Multi-party Setting. It turns out that the main source of inefficiency when extending the Ghosh and Simkin protocol to the multi-party setting is the Cardinality Testing they use. In [17], Alice and Bob encode their sets into polynomials \(Q_A(x)=\sum ^n_ix^{a_i}\) and \(Q_B(x)=\sum ^n_ix^{b_i}\), respectively, where \(a_i\in S_A\) and \(b_i\in S_B\). Then, they can check if \(\tilde{Q}(x)=Q_A(x)-Q_B(x)\) is a sparse polynomial. If it is, we conclude that the set \((S_A\cup S_B)\setminus (S_A\cap S_B)\) is small. By arranging \(\mathcal {O}(t)\) evaluations of the polynomial \(\tilde{Q}(x)\) in a Hankel matrix [19] and securely computing its determinant (via a generic secure linear algebra protocol from [24]), both parties can determine if \(|S_A\cap S_B|\ge n-t\). The total communication complexity of this protocol is \(\mathcal {O}(t^2)\) (Footnote 2).
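To make the Hankel-based sparsity check more concrete, here is a cleartext sketch in Python. It is not the protocol of [17]: nothing is encrypted, and the field size `P`, the evaluation base `U`, and the matrix dimension \(2t+2\) are assumptions chosen for this illustration. It relies on the fact that, for two sets of equal size, \(\tilde{Q}\) has exactly \(2(n-|S_A\cap S_B|)\) nonzero terms, and that a \(k\times k\) Hankel matrix built from the values \(\tilde{Q}(u^0),\tilde{Q}(u^1),\dots \) is singular whenever \(\tilde{Q}\) has fewer than k terms. When \(\tilde{Q}\) has more than k terms, the determinant is nonzero only with high probability over the choice of u.

```python
P = 2**31 - 1  # toy prime modulus (assumption)
U = 7          # evaluation base, assumed to have large multiplicative order mod P


def q_diff(s_a, s_b, x):
    """Evaluate Q_A(x) - Q_B(x) = sum_a x^a - sum_b x^b over F_P."""
    total = sum(pow(x, a, P) for a in s_a) - sum(pow(x, b, P) for b in s_b)
    return total % P


def det_mod_p(matrix):
    """Determinant over F_P via Gaussian elimination."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return 0
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            det = -det  # a row swap flips the sign
        det = det * m[c][c] % P
        inv = pow(m[c][c], -1, P)
        for r in range(c + 1, n):
            factor = m[r][c] * inv % P
            if factor:
                for k in range(c, n):
                    m[r][k] = (m[r][k] - factor * m[c][k]) % P
    return det % P


def hankel_cardinality_test(s_a, s_b, t):
    """True iff the (2t+2) x (2t+2) Hankel matrix of Q~ evaluations is singular."""
    k = 2 * t + 2
    h = [q_diff(s_a, s_b, pow(U, e, P)) for e in range(2 * k - 1)]
    hankel = [[h[i + j] for j in range(k)] for i in range(k)]
    return det_mod_p(hankel) == 0


if __name__ == "__main__":
    alice = set(range(1, 11))
    bob = set(range(1, 9)) | {11, 12}  # the sets differ in 2 elements each
    print(hankel_cardinality_test(alice, bob, 2))
    print(hankel_cardinality_test(alice, bob, 1))
```

The sketch uses \(4t+3\) evaluations, which is \(\mathcal {O}(t)\); the matrix has \(\mathcal {O}(t^2)\) entries, matching the order of the communication cost stated above.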

However, if we were to naively extend this approach to the multi-party setting, we would have N parties computing, say,

$$\tilde{Q}(x)=N Q_1(x)-Q_2(x)- \dots - Q_N(x)$$

which is a sparse polynomial only if N is small. Moreover, if we were to compute the sparsity of this polynomial using the same approach, we would have a protocol with communication complexity \(\mathcal {O}((Nt)^2)\).

Our Approach. Given the state of affairs presented in the previous section, it seems we need to take a different approach from the one of [17] if we want to design an efficient threshold PSI protocol for multiple parties.

Interlude: Secure Linear Algebra. Recall that in the setting of secure linear algebra (as in [29] and [24]), there are two parties, one holding an encryption of a matrix \(\mathbf {M}\) and the other one holding the corresponding secret key. Their goal is to compute an encryption of a (linear algebra related) function of the matrix \(\mathbf {M}\), such as the rank, the determinant of \(\mathbf {M}\), or, most importantly, find a solution \(\mathbf {x}\) for the linear system \(\mathbf {M}\mathbf {x}=\mathbf {y}\) where both \(\mathbf {M}\) and \(\mathbf {y}\) are encrypted. We can easily extend this problem to the multi-party case: Consider N parties, \(\mathsf {P}_1,\dots , \mathsf {P}_N\), each one holding a share of the secret key of a threshold PKE scheme. Additionally, \(\mathsf {P}_1\) has an encrypted matrix. The goal of all the parties is to compute an encryption of a (linear algebra related) function of the encrypted matrix.

We observe that the protocols for secure linear algebra presented in [24] can be extended to the multi-party setting by replacing the use of an (additively homomorphic) PKE and garbled circuits with an (additively homomorphic) threshold PKE (Footnote 3). Hence, our protocols allow N parties to solve a linear system of the form \(\mathbf {M}\mathbf {x}=\mathbf {y}\) under the hood of a threshold PKE scheme.

Cardinality Testing via Degree Test of a Rational Function. Consider again the encodings \(P_{S_i}(x)=\prod _j^n (x-a^{(i)}_j)\) where \(a^{(i)}_j\in S_i\), for N different sets, and the rational function (Footnote 4)

$$\frac{P_{S_1}+\dots + P_{S_N}}{P_{S_1}}=\frac{P_{S_1\setminus (\cap _{j=1}^N S_j)}+ \dots +P_{S_N\setminus (\cap _{j=1}^{N} S_j)}}{P_{S_1\setminus (\cap _{j=1}^N S_j)}}.$$

Note that, if the intersection \(\cap S_i \) is larger than \(n-t\), then \(\deg P_{S_1\setminus (\cap _{j=1}^N S_j)}=\dots =\deg P_{S_N\setminus (\cap _{j=1}^{N} S_j)}\le t\).
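The cancellation of the common factor \(P_{\cap _j S_j}\) can be checked numerically. The following Python sketch works in the clear over a toy prime field (the modulus `P` and all function names are assumptions made for this illustration) and compares the rational function built from the full sets with the one built from the sets with the intersection removed.

```python
P = 2**61 - 1  # toy prime modulus (assumption)


def eval_from_roots(roots, x):
    """Evaluate prod_{r in roots} (x - r) over F_P."""
    acc = 1
    for r in roots:
        acc = acc * (x - r) % P
    return acc


def rational_eval(sets, x):
    """Evaluate (P_{S_1} + ... + P_{S_N}) / P_{S_1} at x (x must not lie in S_1)."""
    num = sum(eval_from_roots(s, x) for s in sets) % P
    den = eval_from_roots(sets[0], x)
    return num * pow(den, -1, P) % P


def reduced_eval(sets, x):
    """Same rational function, built from the sets with the intersection removed."""
    inter = set.intersection(*sets)
    return rational_eval([s - inter for s in sets], x)


if __name__ == "__main__":
    sets = [{1, 2, 3, 4, 5}, {1, 2, 3, 4, 6}, {1, 2, 3, 7, 5}]
    for x in (10, 1000, 123456):
        print(rational_eval(sets, x) == reduced_eval(sets, x))
```

Here \(n=5\), the intersection has 3 elements, and each reduced polynomial has degree 2, so the rational function is determined by far fewer points than n would suggest.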

Therefore, the Cardinality Testing boils down to the following problem: Given a rational function \(f(x)=\tilde{P}_1(x)/\tilde{P}_2(x)\), can we securely decide if \(\deg \tilde{P}_1=\deg \tilde{P}_2\le t\) having access to \(\mathcal {O}(t)\) evaluation points of f(x)?

Our crucial observation is that, if we interpolate two rational functions \(f_V\) and \(f_W\) on two different support sets \(V=\{(v_i,f(v_i))\}\) and \(W=\{(w_i,f(w_i))\}\), each one of size 2t, then we have:

  1. \(f_V=f_W\) if \(\deg \tilde{P}_1=\deg \tilde{P}_2\le t\)

  2. \(f_V\ne f_W\) if \(\deg \tilde{P}_1=\deg \tilde{P}_2> t\)

except with negligible probability over the uniform choice of \(v_i,w_i\).

Moreover, interpolating a rational function can be reduced to solving a linear system of equations. Hence, by using the Secure Linear Algebra tools developed before, we can perform the degree test while revealing nothing beyond the output. In other words, we can decide if the size of the intersection is smaller than \(n-t\) while revealing no additional information about the parties’ input sets.
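As a rough illustration of the degree test, the Python sketch below performs it entirely in the clear: it interpolates a rational function with numerator and denominator of degree at most t on two support sets by solving a homogeneous linear system over a toy prime field, and then compares the two results by cross-multiplication. The modulus `P`, the helper names, and the use of \(2t+1\) points per support set are assumptions of this sketch; the actual protocol runs the same linear algebra under a threshold PKE.

```python
P = 2**61 - 1  # toy prime modulus (assumption)


def rational(num_roots, den_roots):
    """Return f(x) = prod(x - a) / prod(x - b) over F_P."""
    def f(x):
        num = den = 1
        for a in num_roots:
            num = num * (x - a) % P
        for b in den_roots:
            den = den * (x - b) % P
        return num * pow(den, -1, P) % P
    return f


def nullspace_vector(rows, ncols):
    """Return a nonzero z with rows * z = 0 over F_P (requires rank < ncols)."""
    rows = [r[:] for r in rows]
    pivots = []  # pivot column of each reduced row
    for c in range(ncols):
        r = len(pivots)
        piv = next((i for i in range(r, len(rows)) if rows[i][c]), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        inv = pow(rows[r][c], -1, P)
        rows[r] = [v * inv % P for v in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c]:
                factor = rows[i][c]
                rows[i] = [(a - factor * b) % P for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
    free = next(c for c in range(ncols) if c not in pivots)
    z = [0] * ncols
    z[free] = 1
    for i, c in enumerate(pivots):
        z[c] = -rows[i][free] % P
    return z


def interpolate(points, t):
    """Find (A, B), both of degree <= t and not both zero, with A(x) = y * B(x)."""
    rows = []
    for x, y in points:
        powers = [pow(x, j, P) for j in range(t + 1)]
        rows.append(powers + [(-y * pj) % P for pj in powers])
    z = nullspace_vector(rows, 2 * t + 2)
    return z[: t + 1], z[t + 1:]


def poly_mul(a, b):
    """Multiply two coefficient lists (lowest degree first) over F_P."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % P
    return out


def degree_test(f, t, xs_v, xs_w):
    """True iff interpolation on both support sets gives the same rational function."""
    a, b = interpolate([(x, f(x)) for x in xs_v], t)
    c, d = interpolate([(x, f(x)) for x in xs_w], t)
    return poly_mul(a, d) == poly_mul(c, b)  # A/B == C/D  <=>  A*D == C*B


if __name__ == "__main__":
    t = 2
    xs_v, xs_w = range(10, 15), range(20, 25)
    print(degree_test(rational({1, 2}, {3, 4}), t, xs_v, xs_w))
    print(degree_test(rational({1, 2, 5}, {3, 4, 6}), t, xs_v, xs_w))
```

Comparing by cross-multiplication avoids having to normalize the interpolated numerator and denominator. The evaluation points here are fixed for reproducibility; in the actual test they are sampled uniformly at random.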

Security of the Protocol. We prove security of our Cardinality Testing in the UC framework [7]. However, there is a subtle issue in our security proof. Namely, our secure linear algebra protocols cannot be proven UC-secure since the inputs are encrypted under a public key which, in the UC setting, needs to come from somewhere.

We solve this problem by using the Externalized UC framework [8]. In this framework, the secure linear algebra ideal functionalities all share a common setup which, in our case, is the public key (and the corresponding secret key shares). We prove security of our secure linear algebra protocols in this setting.

Since the secure linear algebra protocols are secure if they all share the same public key, then, in the Cardinality Testing, we just need to create this public key and share it over these functionalities. Thus, we prove standard UC-security of our Cardinality Testing.

Badrinarayanan et al. [1] also encounter the same problem as we did and they opted to not prove security of each subprotocol individually, but rather prove security only for their main protocol (where the public key is created and shared among these smaller protocols).

Multi-party PSI. Having developed a Cardinality Testing, we can now focus on securely computing the intersection. In fact, our protocol for computing the intersection can be seen as a generalization of the Ghosh and Simkin protocol [17]. Again, by encoding the sets as above (that is, \(P_{S_i}(x)=\prod _j^n (x-a^{(i)}_j)\) where \(a^{(i)}_j\in S_i\) and \(S_i\) is the set of party \(\mathsf {P}_i\)) and knowing that the intersection is larger than \(n-t\), parties can securely compute the rational function (Footnote 5) \((P_{S_1}+\dots + P_{S_N})/P_{S_1}\). By interpolating the rational function on any \(\mathcal {O}(t)\) points, party \(\mathsf {P}_1\) can recover the denominator and compute the intersection.

The main difference between our protocol and the one in [17] is that we replace the OLE calls used in [17] by a threshold additively homomorphic PKE scheme (which can be seen as the multi-party replacement of OLE).

1.4 Other Related Work

Oblivious Linear Algebra. Cramer and Damgård [9] proposed a constant-round protocol to securely solve a linear system of unknown rank over a finite field. Since they were mainly focused on round-optimality, the communication cost of their proposal is \(\varOmega (t^3)\) for \(\mathcal {O}(t^2)\) input size. Bouman et al. [5] recently constructed a secure linear algebra protocol for multiple parties, however they focused on computational complexity.

Other secure linear algebra schemes in the two-party setting were presented by Nissim and Weinreb in [29] and Kiltz et al. in [24]. In the following, consider (square) matrices of size t over a field \(\mathbb {F}\). These two works take different approaches: [29] obliviously solves linear algebra related problems directly via Gaussian elimination in \(\mathcal {O}(t^2)\) communication complexity. However, their approach has an error probability that decreases polynomially with t. In other words, the error probability is only sufficiently small when applied to linear systems with large matrices. In contrast, the error probability of [24] decreases polynomially with \(|\mathbb {F}|\), which is negligible when \(|\mathbb {F}|\) is of exponential size (Footnote 6).

2 Preliminaries

If S is a finite set, then \(x\leftarrow _{\$} S\) denotes an element x sampled from S according to a uniform distribution and |S| denotes the cardinality of S. If \(\mathcal {A}\) is an algorithm, \(y\leftarrow \mathcal {A}(x)\) denotes the output y after running \(\mathcal {A}\) on input x. For \(N\in \mathbb {N}\), we define \([N]=\{1,\dots , N\}\).

Given two distributions \(D_1,D_2\), we say that they are computationally indistinguishable, denoted as \(D_1\approx D_2\), if no probabilistic polynomial-time (PPT) algorithm is able to distinguish them.

Throughout this work, we denote the security parameter by \(\lambda \).

2.1 Threshold Public-Key Encryption

We present some ideal functionalities regarding threshold public-key encryption (TPKE) schemes. In the following, N is the number of parties.

Let \(\mathcal {F}_\mathsf {Gen}\) be the ideal functionality that distributes a secret share of the secret key and the corresponding public key. That is, on input \((\mathsf {sid},\mathsf {P}_i),\) \(\mathcal {F}_\mathsf {Gen}\) outputs to each party \(\mathsf {P}_i\) the public key together with that party's share of the secret key.

Moreover, we define the functionality \(\mathcal {F}_{\mathsf {DecZero}}\), which allows N parties, each of them holding a share of the secret key, to learn if a ciphertext is an encryption of 0 and nothing else. That is, \(\mathcal {F}_{\mathsf {DecZero}}\) receives as input a ciphertext c and the secret key shares of each of the parties. It outputs 0 if c decrypts to 0, and 1 otherwise. Note that these functionalities can be securely realized based on various PKE schemes such as ElGamal PKE or Paillier (Footnote 7) PKE [21].
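To give some intuition for what \(\mathcal {F}_\mathsf {Gen}\) and \(\mathcal {F}_{\mathsf {DecZero}}\) provide, the following Python sketch uses a toy exponential ElGamal scheme with an additively shared secret key. It is not the construction used in this work and is not secure as written: the group (`P`, `G`) is a hypothetical toy choice, all function names are assumptions, and a real zero-test would additionally re-randomize the ciphertext so that the partial decryptions reveal nothing beyond whether the plaintext is 0.

```python
import secrets

P = 2**127 - 1  # toy prime modulus; NOT a vetted cryptographic group (assumption)
G = 3           # toy base element (assumption)


def keygen(n_parties):
    """Toy F_Gen: additive shares of the secret key and the matching public key."""
    shares = [secrets.randbelow(P - 1) for _ in range(n_parties)]
    public_key = pow(G, sum(shares) % (P - 1), P)
    return public_key, shares


def encrypt(public_key, m):
    """Exponential ElGamal: the message m is encoded in the exponent of G."""
    r = secrets.randbelow(P - 1)
    return pow(G, r, P), pow(public_key, r, P) * pow(G, m, P) % P


def add(c, d):
    """Additive homomorphism: the result encrypts the sum of the two plaintexts."""
    return c[0] * d[0] % P, c[1] * d[1] % P


def partial_decrypt(share, c):
    """Each party raises the first ciphertext component to its key share."""
    return pow(c[0], share, P)


def dec_zero(shares, c):
    """Toy F_DecZero: output 0 if c encrypts 0, and 1 otherwise."""
    mask = 1
    for share in shares:
        mask = mask * partial_decrypt(share, c) % P
    return 0 if c[1] == mask else 1


if __name__ == "__main__":
    pk, shares = keygen(3)
    print(dec_zero(shares, add(encrypt(pk, 5), encrypt(pk, -5))))
    print(dec_zero(shares, add(encrypt(pk, 2), encrypt(pk, 3))))
```

The sketch needs all N shares to run the zero-test, which mirrors the setting where every party holds a share of the key and no strict subset of parties can decrypt.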

We also assume that the underlying TPKE (or plain PKE) is always additively homomorphic, unless stated otherwise (see Supplementary Material A.1).

2.2 UC Framework and Ideal Functionalities

In this work, we use the UC framework by Canetti [7] to analyze the security of our protocols (Footnote 8). Throughout this work, we only consider semi-honest adversaries, unless stated otherwise. We denote the underlying environment by \(\mathcal {Z}\). For a protocol \(\pi \) and a real-world adversary \(\mathcal {A}\), we denote the real-world ensemble by \(\mathsf {EXEC}_{\pi ,\mathcal {A},\mathcal {Z}}\). Similarly, for an ideal functionality \(\mathcal {F}\) and a simulator \(\mathsf {Sim}\), we denote the ideal-world ensemble by \(\mathsf {IDEAL}_{\mathcal {F},\mathsf {Sim},\mathcal {Z}}\).

Definition 1

We say that a protocol \(\pi \) UC-realizes \(\mathcal {F}\) if for every PPT adversary \(\mathcal {A}\) there is a PPT simulator \(\mathsf {Sim}\) such that for all PPT environments \(\mathcal {Z}\),

$$\mathsf {IDEAL}_{\mathcal {F},\mathsf {Sim},\mathcal {Z}}\approx \mathsf {EXEC}_{\pi ,\mathcal {A},\mathcal {Z}}$$

where \(\mathcal {F}\) is an ideal functionality.

In the following, we present some ideal functionalities that will be recurrent for the rest of the paper.

Multi-party Threshold Private Set Intersection. This ideal functionality implements the multi-party version of the functionality above. Here, each of the N parties input a set and they learn the intersection if and only if the intersection is large enough.

figure a

Externalized UC Protocol with Global Setup. We introduce a notion of protocol emulation from [8], called externalized UC emulation (EUC), which is a simplified version of UC with global setup (GUC).

Definition 2

(EUC-Emulation [8]). We say that \(\pi \) EUC-realizes \(\mathcal {F}\) with respect to shared functionality \(\bar{\mathcal {G}}\) (or, in shorthand, that \(\pi \) \(\bar{\mathcal {G}}\)-EUC-emulates \(\phi \)) if for any PPT adversary \(\mathcal {A}\) there exists a PPT adversary \(\mathsf {Sim}\) such that for any shared functionality \(\bar{\mathcal {G}}\), we have:

$$ \mathsf {IDEAL}^{\bar{\mathcal {G}}}_{\mathcal {F},\mathsf {Sim},\mathcal {Z}} \approx \mathsf {EXEC}^{\bar{\mathcal {G}}}_{\pi ,\mathcal {A},\mathcal {Z}} $$

Notice that the formalism implies that the shared functionality \(\bar{\mathcal {G}}\) exists both in the model for executing \(\pi \) and also in the model for executing the ideal protocol for \(\mathcal {F}\), \(\mathsf {IDEAL}_\mathcal {F}\).

We remark that the notion of \(\bar{\mathcal {G}}\)-EUC-emulation can be naturally extended to protocols that use several different shared functionalities (instead of only one).

2.3 Polynomials and Interpolation

We present a series of results that will be useful to analyze correctness and security of the protocols presented in this work.

The following lemma shows how we can mask a polynomial of degree less than t using a uniformly random polynomial.

Lemma 1

([25]). Let \(\mathbb {F}_p\) be a prime order field, P(x), Q(x) be two polynomials over \(\mathbb {F}_p\) such that \(\deg P =\deg Q=d\le t\) and \(\gcd (P,Q)=1\). Let \(R_1(x),R_2(x)\) be two polynomials sampled uniformly at random over \(\mathbb {F}_p\) such that \(\deg R_1=\deg R_2=t\). Then \(U(x)=P(x)R_1(x)+Q(x)R_2(x)\) is a uniformly random polynomial with \(\deg U\le 2t\).

Note that this result also applies to multiple polynomials as long as they do not share a common factor (we refer to Theorem 2 and Theorem 3 of [25] for more details).

We say that f is a rational function if \(f(x)=\frac{P(x)}{Q(x)}\) for two polynomials P and Q.

The next two lemmata show that we can recover a rational function via interpolation and that this function is unique.

Lemma 2

([28]). Let \(f(x)=P(x)/Q(x)\) be a rational function where \(\deg P(x)=m\) and \(\deg Q(x)=n\). Then f(x) can be uniquely recovered (up to constants) via interpolation from \(m+n+1\) points. In particular, if P(x) and Q(x) are monic, f(x) can be uniquely recovered from \(m+n\) points.

Lemma 3

([28]). Choose V to be a support set (Footnote 9) of cardinality \(m_1+m_2+1\). Then, there is a unique rational function \(f(x)=P(x)/Q(x)\) that can be interpolated from V, and P(x) has degree at most \(m_1\) and Q(x) has degree at most \(m_2\).

3 Oblivious Degree Test for Rational Functions

Suppose we have a rational function \(f(x)=P(x)/Q(x)\) where P(x) and Q(x) are two polynomials with the same degree. In this section, we present a protocol that allows several parties to check if \(\deg P(x)=\deg Q(x) \le t\) for some threshold \(t\in Z\). To this end, and inspired by the works of [24, 29], we present a multi-party protocol to obliviously solve a linear system \(\mathbf {M}\mathbf {x}=\mathbf {y}\) over a finite field \(\mathbb {F}\) with communication complexity \(O(t^2k\lambda N)\), where \(\mathbf {M}\in \mathbb {F}^{t\times t}\), \(\log |\mathbb {F}|=k\) and N is the number of parties involved in the protocol.

3.1 Oblivious Linear Algebra

In this section, we state the Secure Linear Algebra protocols that we need to build our degree test protocol. For the sake of brevity, the protocols are presented in Appendix B. These protocols all have the following form: There is a public key of a TPKE that encrypts a matrix \(\mathbf {M}\) and every party involved in the protocol has a share of the secret key.

Note that if we let parties \(\mathsf {P}_i\) input their encrypted matrix, then the ideal functionality \(\mathcal {F}\) has to know the secret key (by receiving secret key shares from all parties), otherwise \(\mathcal {F}\) cannot compute the corresponding function correctly. However, this will cause an unexpected problem in the security proof, as mentioned in our introduction and in [1]: The environment \(\mathcal {Z}\) will learn the secret key as well since it can choose inputs for all parties. We fix this by relying on the global UC framework, in which there exists a shared functionality \(\bar{\mathcal {G}}\) in charge of distributing key pairs (\(\mathcal {F}_\mathsf {Gen}\) from Sect. 2.1).

Oblivious Matrix Multiplication. We begin by presenting the ideal functionality for a multi-party protocol to jointly compute the product of two matrices, under a TPKE. The protocol is presented in Appendix B.1.

Ideal Functionality. The ideal functionality for oblivious matrix multiplication is presented below.

figure b

Securely Compute the Rank of a Matrix. We present the ideal functionality to obliviously compute the rank of an encrypted matrix. The protocol is presented in Appendix B.2.

Ideal Functionality. The ideal functionality of oblivious rank computation is defined below.

figure c

Oblivious Linear System Solver. We now show how N parties can securely solve a linear system using the multiplication protocol above. We follow the ideas from [24] to reduce the problem to minimal polynomials; the only difference is that we focus on the multi-party setting.

The protocol is presented in Appendix B.5. Informally, we evaluate an arithmetic circuit following the ideas of [10], and for the unary representation, a binary-conversion protocol [37] is required. All of the above protocols can be based on the Paillier cryptosystem.

Ideal Functionality. We give an ideal functionality of the oblivious linear system solver for the multi-party setting as follows.

figure d

3.2 Oblivious Degree Test

We now present the main protocol of this section and the one that will be used in the construction of threshold PSI. Given a rational function P(x)/Q(x) (for two polynomials P(x) and Q(x) with the same degree) and two support sets \(V_1,V_2\), the protocol allows us to test if the degree of the polynomials is less than some threshold t. Of course, we can do this using generic approaches like garbled circuits. However, we are interested in solutions with communication complexity depending on t (even when the degree of P(x) or Q(x) is much larger than t).

Ideal Functionality. The ideal functionality for degree test of rational functions is presented below.

figure e

Protocol. We present Protocol 1 for secure degree test, which we denote by \(\mathsf {secDT}\). The main idea of the protocol is to interpolate the rational function on two different support sets and check if the result is the same in both experiments.

Recall that interpolating a rational function boils down to solving a linear system. We can thus use the secure linear algebra tools developed above to allow the parties to securely solve this linear system.

Also recall that two rational functions \(C_v^{(1)}/C_v^{(2)}\) and \(C_w^{(1)}/C_w^{(2)}\) are equivalent if \(C_v^{(1)}C_w^{(2)}-C_w^{(1)} C_v^{(2)}=0\). Thus, in the end, parties just need to securely check if \(C_v^{(1)}C_w^{(2)}-C_w^{(1)} C_v^{(2)}\) is equal to 0.

figure f


Comments. Suppose that, for an interpolation point \(\alpha _i\), the rational function \(f(x)=P(x)/Q(x)\) is well-defined but \(Q(\alpha _i)=P(\alpha _i)=0\), so that we cannot compute \(f(\alpha _i)\) by division. In this case (Footnote 12), the parties evaluate \(\tilde{P}(x)=P(x)/(x-\alpha _i)\) and \(\tilde{Q}(x)=Q(x)/(x-\alpha _i)\) on \(\alpha _i\) and set \(f(\alpha _i)=\tilde{P}(\alpha _i)/\tilde{Q}(\alpha _i)\). These points are called tagged values, and this strategy is used in [28]. In more detail, instead of using the plain evaluation at \(\alpha _i\), we use a tagged pair \((s^{(1)}_i,s^{(2)}_i)\), where \(s^{(1)}_i=\tilde{P}(\alpha _i)\) and \(s^{(2)}_i=\tilde{Q}(\alpha _i)\). Correspondingly, the rows of the two interpolation systems that correspond to \(\alpha _i\) are replaced by rows built from the tagged pair.

Also, note that the protocol easily generalizes to rational functions \(f(x)=P(x)/Q(x)\) with \(\deg P\ne \deg Q\) (which is actually what we use in the following sections). We present the version where \(\deg P= \deg Q\) for simplicity. In fact, the case where \(\deg P\ne \deg Q\) can be reduced to the presented case by multiplying the lower-degree polynomial by a uniformly chosen R(x) of degree \(\max \{\deg P(x)-\deg Q(x), \deg Q(x)-\deg P(x)\}\).

Moreover, if \(t'>t\), where \(t'\) denotes the actual degree of P(x) and Q(x), the linear system for rational interpolation might be unsolvable. In this case, there is no solution, which means we cannot interpolate an appropriate rational function on the given support set. Therefore, the parties just return 0.

Analysis. We analyze correctness, security and communication complexity of the protocol. We begin the analysis with the following auxiliary lemma.

Lemma 4

Let \(\mathbb {F}\) be a field with \(|\mathbb {F}|=\omega (2^{\log \lambda })\). Let \(V=\{(v_i,f(v_i))|\forall i \in [1,2t+1]\}\) and \(W=\{(w_i,f(w_i))|\forall i \in [1,2t+1]\}\) be two support sets, each of them with \(2t+1\) elements over \(\mathbb {F}\), where the points \(v_i,w_i\) are chosen uniformly at random from \(\mathbb {F}\) and \(f(x):=\frac{P(x)}{Q(x)}\) is some unknown reduced rational function (i.e., P(x), Q(x) are co-prime) with \(\deg (P)=\deg (Q)=t'\) and \(t<t'\). We also require Q(x) to be monic (to fit in our application). Additionally, assume that \(Q(v_i)\ne 0\) and \(Q(w_i)\ne 0\) for every \(i\in [2t+1]\).

If we recover two rational functions \(f_V(x), f_W(x)\) by interpolation on V and W, respectively, then

$$\Pr \left[ f_V(x)=f_W(x)\right] \le \mathsf {negl}(\lambda )$$

over the choice of \(v_i,w_i\).

Proof

Let \(f_V(x)=A(x)/B(x)\) be the rational function recovered by rational interpolation over the support set V and let \(f(x)=P(x)/Q(x)\) be the rational function interpolated over any \(2t'+1\) interpolation points. We have that \(f_V(v_i)=f(v_i)\) for all \(i\in [2t+1]\) and hence

$$\frac{A(v_i)}{B(v_i)}=\frac{P(v_i)}{Q(v_i)}\Leftrightarrow A(v_i)Q(v_i)=P(v_i)B(v_i).$$

Since \(\gcd (P(x),Q(x))=1\), the polynomial \(\tilde{P}(x)=A(x)Q(x)-P(x)B(x)\) is different from the null polynomial (as \(\deg (P)=t'>t\ge \deg (A)\)). Moreover, \(v_i\) is a root of \(\tilde{P}(x)\), for all \(i\in [2t+1]\), and \(\deg \tilde{P}(x)\le t+t'\) (which means that \(\tilde{P}(x)\) has at most \(t+t'\) roots).

Analogously, let \(f_W=C(x)/D(x)\) be the rational function resulting from interpolating over the support set W and let \(\tilde{Q}(x)=C(x)Q(x)-D(x)P(x)\). We have that \(\tilde{Q}(w_i)=0\) for all \(i\in [2t+1]\). Hence, if \(f_V(x)=f_W(x)\), then we have that the points \(w_i\) are also roots of \(\tilde{P}(x)\). But, since the points \(w_i\) are chosen uniformly at random from \(\mathbb {F}\) (which is of exponential size when compared to \(t,t'\)), then there is a negligible probability that all \(w_i\)’s are roots of \(\tilde{P}(x)\).

Concretely,

$$\begin{aligned} \Pr \left[ f_V=f_W\right]&\le \Pr \left[ \tilde{P}(w_i)=0 \ \forall i\in [2t+1]\right] \\&= \prod _{i=1}^{2t+1} \Pr \left[ \tilde{P}(w_i)=0 \right] \le \left( \frac{\deg \tilde{ P}}{|\mathbb {F}|}\right) ^{2t+1} \end{aligned}$$

which is negligible for \(|\mathbb {F}|\in \omega (2^{\log \lambda })\).    \(\Box \)
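To get a feeling for how small this bound is, the following Python sketch evaluates its base-2 logarithm using \(\deg \tilde{P}\le t+t'\). The parameter values and the function name are illustrative assumptions only and do not come from the analysis above.

```python
import math


def log2_collision_bound(t, t_prime, field_bits):
    """log2 of ((t + t') / |F|)^(2t+1), assuming |F| = 2^field_bits."""
    return (2 * t + 1) * (math.log2(t + t_prime) - field_bits)


if __name__ == "__main__":
    # Hypothetical parameters: threshold 10, true degree 1000, 61-bit field.
    print(log2_collision_bound(10, 1000, 61))
```

Even for a moderate field size, the exponent \(2t+1\) drives the bound far below typical statistical security targets.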

Theorem 1 (Correctness)

The protocol \(\mathsf {secDT}\) is correct.

Proof

The protocol interpolates two rational functions from two different support sets. Then, it checks if the two interpolated rational functions are the same by computing

$$C_{v}^{(1)}(x)\cdot C_{w}^{(2)}(x)-C_{w}^{(1)}(x)\cdot C_{v}^{(2)}(x)$$

which should be equal to 0 if \(C_{v}^{(1)}(x)/ C_{v}^{(2)}(x)=C_{w}^{(1)}(x)/ C_{w}^{(2)}(x)\).

If \(t'\le t\), then by Lemma 3, a unique rational function can be recovered, and thus the final output of the algorithm should be 1. On the other hand, if \(t'>t\), the linear system is either unsolvable, or solvable but yielding two different solutions with overwhelming probability by Lemma 4. In this case, the protocol outputs 0.    \(\Box \)

Theorem 2

The protocol \(\mathsf {secDT}\) EUC-securely realizes \(\mathcal {F}_\mathsf {SDT}\) with shared ideal functionality \(\mathcal {F}_\mathsf {Gen}\) in the \((\mathcal {F}_\mathsf {ORank},\mathcal {F}_\mathsf {OMM},\) \(\mathcal {F}_\mathsf {OLS},\mathcal {F}_\mathsf {DecZero})\)-hybrid model against semi-honest adversaries corrupting at most \(N-1\) parties, given that \(\mathsf {TPKE}\) is IND-CPA.

Proof

(Sketch). The simulator sends the corrupted parties’ input to the ideal functionality and obtains the output (either 0 or 1). Then, it simulates the ideal functionalities \((\mathcal {F}_\mathsf {ORank},\mathcal {F}_\mathsf {OMM},\mathcal {F}_\mathsf {OLS},\mathcal {F}_\mathsf {DecZero})\) so that the output in the real-world execution is the same as in the ideal-world execution. In particular, the simulator is able to recover the secret key shares via \(\mathcal {F}_\mathsf {ORank},\mathcal {F}_\mathsf {OMM},\mathcal {F}_\mathsf {OLS}\) and, thus, simulate \(\mathcal {F}_\mathsf {DecZero}\) in the right way.

Indistinguishability of executions holds given that \(\mathsf {TPKE}\) is IND-CPA.    \(\Box \)

Communication Complexity. When we instantiate \(\mathcal {F}_\mathsf {OLS}\) with the protocol from the previous section, the communication complexity of \(\mathsf {secDT}\) is \(\mathcal {O}(Nt^2)\).

4 Multi-party Threshold Private Set Intersection

We present our protocol for Threshold PSI in the multi-party setting. Our protocol to privately compute the intersection can be seen as a generalization of the protocol of Ghosh and Simkin [17], where we replace the OLE by a TPKE (which fits better in a multi-party setting). The main difference between our protocol and theirs is in the cardinality test protocol used.

We begin by presenting the protocol to securely compute a cardinality test among N sets. Then, we plug everything together into a PSI protocol.

4.1 Secure Cardinality Testing

Ideal Functionality. The ideal functionality for Secure Cardinality Testing receives the sets from all the parties and outputs 1 if and only if the intersection between these sets is larger than some threshold. Else, no information is disclosed. The ideal functionality for multi-party cardinality testing is given as follows.

figure g

Protocol. We introduce our multi-party Protocol 2 (based on the degree test protocol). In the following, let \(\mathcal {F}_\mathsf {Gen}\) be the ideal functionality defined in Sect. 2.1 and \(\mathcal {F}_\mathsf {SDT}\) be the functionality defined in Sect. 3.2.

figure h

Analysis. We now proceed to the analysis of the protocol described above.

Lemma 5

Let \(P_1(x),\dots ,P_n(x)\in \mathbb {F}[x]\) be n characteristic polynomials of the same degree. If \(P_1(x),\dots ,P_n(x)\) are mutually relatively prime, then, for any j, \(P'(x)=\sum _{i=1}^n r_i\cdot P_i(x)\) and \(P_j(x)\) are relatively prime with all but negligible probability, where each \(r_i\) is a uniformly random element of \(\mathbb {F}\).

Proof

Suppose that \(P'(x)\) and \(P_j(x)\) have a common divisor. Since \(P_j(x)\) is a characteristic polynomial, we can take the common divisor to be \((x-s)\) for some s. Therefore, \(P'(s)=0\), that is, \(\sum _{i=1}^n r_i\cdot P_i(s) = 0\). However, since \(P_1(x),\dots ,P_n(x)\) are mutually relatively prime, the values \(P_i(s)\) cannot all be zero, so there exists at least one \(i^*\) such that \(P_{i^*}(s)\ne 0\). Moreover, since the \(r_i\) are all sampled uniformly from \(\mathbb {F}\), the weighted sum \(\sum _{i=1}^n r_i\cdot P_i(s)\) is nonzero with all but negligible probability. Therefore, \(P'(x)\) and \(P_j(x)\) share a common divisor only with negligible probability.    \(\Box \)
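The argument of Lemma 5 can be checked numerically on a toy instance. The sketch below is an illustration under stated assumptions: the modulus \(2^{61}-1\), the example sets, and the helper names are not from the paper. Because \(P_j(x)\) splits into distinct linear factors \((x-s)\) for \(s\in S_j\), the polynomials \(P'(x)\) and \(P_j(x)\) are relatively prime exactly when \(P'(s)\ne 0\) for every \(s\in S_j\), which is what the code tests.

```python
import random

P = 2**61 - 1  # a Mersenne prime, used here as a stand-in for a large field


def char_poly_eval(S, x, p=P):
    """Evaluate the characteristic polynomial prod_{s in S} (x - s) at x over F_p."""
    acc = 1
    for s in S:
        acc = acc * (x - s) % p
    return acc


def combo_eval(sets, r, x, p=P):
    """Evaluate P'(x) = sum_i r_i * P_i(x) over F_p."""
    return sum(ri * char_poly_eval(S, x, p) for ri, S in zip(r, sets)) % p


def coprime_with(sets, r, j, p=P):
    """True iff P' shares no root (hence no factor) with P_j."""
    return all(combo_eval(sets, r, s, p) != 0 for s in sets[j])


def random_coeffs(n, seed=None, p=P):
    """Sample the coefficients r_1, ..., r_n uniformly from F_p."""
    rng = random.Random(seed)
    return [rng.randrange(p) for _ in range(n)]


# No element lies in all three sets, so the P_i are mutually relatively prime.
sets = [{1, 2}, {2, 3}, {1, 3}]
good = coprime_with(sets, [5, 7, 11], 0)  # generic coefficients
bad = coprime_with(sets, [5, 0, 11], 0)   # r_2 = 0 makes s = 1 a common root
```

The failing case requires a specific coefficient to vanish, which for uniformly random coefficients happens only with probability about \(1/|\mathbb {F}|\) per root.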

Theorem 3 (Correctness)

The protocol \(\mathsf {MPCT}\) described above is correct.

Proof

Note that the encryptions \(d^{(j)}\) computed by party \(\mathsf {P}_1\) are equal to

Also, observe that

$$\begin{aligned} \frac{\sum _{i=1}^N r_i\cdot P_i(\alpha _j)}{P_1(\alpha _j)}&=\frac{P_{\cap _i S_i}(\alpha _j)\cdot \sum _{i=1}^{N} r_i\cdot P_{S_i\setminus \left( \cap _{k\ne i} S_k\right) }(\alpha _j)}{P_{\cap _i S_i}(\alpha _j)\cdot P_{S_1\setminus \left( \cap _{k\ne 1} S_k\right) }(\alpha _j)} \\&=\frac{\sum _{i=1}^{N} r_i\cdot P_{S_i\setminus \left( \cap _{k\ne i} S_k\right) }(\alpha _j)}{ P_{S_1\setminus \left( \cap _{k\ne 1} S_k\right) }(\alpha _j)}, \end{aligned}$$

in this way, we make the numerator and denominator relatively prime except with negligible probability by Lemma 5.

Observe that \(\deg \sum _{i=1}^{N} r_i\cdot P_{S_i\setminus \left( \cap _{k\ne i} S_k\right) }(x) \le t\) and \(\deg P_{S_1\setminus \left( \cap _{k\ne 1} S_k\right) }(x)\le t\) if and only if \(|S_\cap |\ge n-t\). Hence, by the correctness of \(\mathcal {F}_{\mathsf {SDT}}\), the protocol outputs 1 if \(|S_\cap |\ge n-t\), and 0 otherwise.    \(\Box \)
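The cancellation of the common factor \(P_{\cap _i S_i}\) in this proof can be verified on a small example. The following Python sketch works in the clear over the toy field \(\mathbb {F}_{101}\); the sets, coefficients, evaluation point, and helper names are illustrative assumptions, and the evaluation point is assumed to lie outside \(S_1\) so that the denominator is invertible.

```python
def char_eval(S, x, p):
    """Evaluate prod_{s in S} (x - s) at x over F_p."""
    acc = 1
    for s in S:
        acc = acc * (x - s) % p
    return acc


def full_ratio(sets, r, alpha, p):
    """(sum_i r_i * P_i(alpha)) / P_1(alpha) over F_p."""
    num = sum(ri * char_eval(S, alpha, p) for ri, S in zip(r, sets)) % p
    den = char_eval(sets[0], alpha, p)
    return num * pow(den, -1, p) % p  # modular inverse, Python 3.8+


def reduced_ratio(sets, r, alpha, p):
    """Same ratio after cancelling the characteristic polynomial of the intersection."""
    common = set.intersection(*sets)
    num = sum(ri * char_eval(S - common, alpha, p) for ri, S in zip(r, sets)) % p
    den = char_eval(sets[0] - common, alpha, p)
    return num * pow(den, -1, p) % p


def residual_degree(sets):
    """Largest degree among the reduced polynomials, i.e. max_i |S_i minus intersection|."""
    common = set.intersection(*sets)
    return max(len(S - common) for S in sets)


# n = 4 and the intersection {1, 2, 3} has size n - t for t = 1.
sets = [{1, 2, 3, 4}, {1, 2, 3, 5}, {1, 2, 3, 6}]
lhs = full_ratio(sets, [2, 3, 5], 10, 101)
rhs = reduced_ratio(sets, [2, 3, 5], 10, 101)
```

Here both ratios agree and the residual degree equals t, matching the condition under which the degree test outputs 1.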

Theorem 4

The protocol \(\mathsf {MPCT}\) securely realizes functionality \(\mathcal {F}_{\mathsf {MPCT}}\) in the \((\mathcal {F}_{\mathsf {Gen}},\mathcal {F}_{\mathsf {SDT}})\)-hybrid model against any semi-honest adversaries corrupting up to \(N-1\) parties, given that \(\mathsf {TPKE}\) is IND-CPA.

Proof

Assume that the adversary is corrupting \(N-k\) parties in the protocol, for \(k=1,\dots , N-1\). The simulator creates the secret keys and the public key of a threshold PKE in the setup phase while simulating \(\mathcal {F}_\mathsf {Gen}\) and distributes the secret keys among the parties. The simulator \(\mathsf {Sim}\) takes the inputs of the corrupted parties (which are sets of size n, say \(S_{i_1},\dots ,S_{i_{N-k}}\)) and sends them to the ideal functionality \(\mathcal {F}_\mathsf {MPCT}\). It receives the output b from the ideal functionality. If \(b=0\), the simulator samples k sets uniformly at random subject to \(|\cap _{i=1}^N S_i|< n-t\) and proceeds with the simulation as the honest parties would. If \(b=1\), the simulator samples k sets uniformly at random subject to \(|\cap _{i=1}^N S_i|\ge n-t\) and proceeds with the simulation as the honest parties would. Note that it can simulate the ideal functionality \(\mathcal {F}_\mathsf {SDT}\) since it knows all the secret keys of the threshold PKE.

Indistinguishability of executions follows immediately from the IND-CPA property of the underlying threshold PKE scheme.    \(\Box \)

Communication Complexity. When we instantiate \(\mathcal {F}_\mathsf {SDT}\) with the protocol from the previous section, each party broadcasts \(\tilde{\mathcal {O}}(t^2)\). Hence, the total communication complexity is \(\tilde{\mathcal {O}}(Nt^2)\), assuming a broadcast channel.

4.2 Multi-party Threshold Private Set Intersection Protocol

In this section, we extend the protocol of Ghosh and Simkin [17] to the multi-party setting using TPKE. We make use of the cardinality test designed above to obtain Protocol 3.

figure i

Analysis. We now proceed to the analysis of the protocol described above. We start by analyzing the correctness of the protocol and then its security.

Theorem 5 (Correctness)

The protocol \(\mathsf {MTPSI}\) is correct.

Proof

Assume that \(|S_1\setminus \left( \cap _{i=2}^N S_i\right) |\le t\) (note that this condition is guaranteed after resorting to the functionality \(\mathcal {F}_{\mathsf {MPCT}}\) in the first step of the protocol). After the execution of the protocol, party \(\mathsf {P}_1\) obtains the points \(V^{(j)}=\sum _{i=1}^{N} P_i(\alpha _j)\cdot R_i(\alpha _j)\). Then,

$$\begin{aligned} \tilde{V}^ {(j)}&=\frac{V^{(j)}}{P_1(\alpha _j)} =\frac{\sum _{i=1}^{N} P_i(\alpha _j)\cdot R_i(\alpha _j)}{P_1(\alpha _j)} \\&=\frac{P_{\cap _i S_i}(\alpha _j)\cdot \sum _{i=1}^{N} P_{S_i\setminus \left( \cap _{k\ne i} S_k\right) }(\alpha _j)\cdot R_i(\alpha _j)}{P_{\cap _i S_i}(\alpha _j)\cdot P_{S_1\setminus \left( \cap _{k\ne 1} S_k\right) }(\alpha _j)} \\&=\frac{\sum _{i=1}^{N} P_{S_i\setminus \left( \cap _{k\ne i} S_k\right) }(\alpha _j)\cdot R_i(\alpha _j)}{ P_{S_1\setminus \left( \cap _{k\ne 1} S_k\right) }(\alpha _j)} . \end{aligned}$$

Since \(\mathsf {P}_1\) has \(3t+1\) evaluations of the rational function above, it can interpolate a rational function to recover the polynomial \(P_{S_1\setminus \left( \cap _{k\ne 1} S_k\right) }\). This is possible because of Lemma 2 and the fact that

$$\deg \left( \sum _{i=1}^{N} P_{S_i\setminus \left( \cap _{k\ne i} S_k\right) }(x)\cdot R_i(x) \right) \le 2t \quad \text { and } \quad \deg \left( P_{S_1\setminus \left( \cap _{k\ne 1} S_k\right) }(x)\right) \le t.$$

Having computed the polynomial \(P_{S_1\setminus \left( \cap _{k\ne 1} S_k\right) }\), party \(\mathsf {P}_1\) can compute the intersection because the roots of this polynomial are exactly the elements in \(S_1\setminus \left( \cap _{k\ne 1} S_k\right) \).    \(\Box \)
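The last step of this proof, recovering the intersection from the polynomial \(P_{S_1\setminus \left( \cap _{k\ne 1} S_k\right) }\), can be sketched as follows. The sketch assumes that rational interpolation has already produced the polynomial (possibly up to a nonzero scalar, which does not change its roots); the helper names and the toy field \(\mathbb {F}_{101}\) are illustrative assumptions.

```python
def char_poly(S, p):
    """Coefficients (lowest degree first) of prod_{s in S} (x - s) over F_p."""
    coeffs = [1]
    for s in S:
        nxt = [0] * (len(coeffs) + 1)
        for i, c in enumerate(coeffs):
            nxt[i] = (nxt[i] - s * c) % p      # contribution of -s * c * x^i
            nxt[i + 1] = (nxt[i + 1] + c) % p  # contribution of c * x^(i+1)
        coeffs = nxt
    return coeffs


def eval_poly(coeffs, x, p):
    """Horner evaluation over F_p."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc


def intersection_from_difference(S1, diff_poly, p):
    """Keep the elements of S1 that are not roots of the recovered polynomial."""
    return {s for s in S1 if eval_poly(diff_poly, s, p) != 0}


# P_1 holds S1 = {1, 2, 3, 4}; the recovered polynomial is (x - 4).
diff = char_poly({4}, 101)
result = intersection_from_difference({1, 2, 3, 4}, diff, 101)
```

Evaluating the recovered polynomial on the elements of \(S_1\) avoids general root finding, since all candidate roots are already known to \(\mathsf {P}_1\).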

Theorem 6

The protocol \(\mathsf {MTPSI}\) securely realizes functionality \(\mathcal {F}_{\mathsf {MTPSI}}\) in the \((\mathcal {F}_{\mathsf {Gen}},\mathcal {F}_{\mathsf {MPCT}})\)-hybrid model against any semi-honest adversary corrupting up to \(N-1\) parties.

Proof

Let \(\mathcal {A}\) be an adversary corrupting up to k parties involved in the protocol, for any \(k\in [N-1]\). Let \(\mathsf {P}_{i_1}, \dots , \mathsf {P}_{i_{k}}\) be the corrupted parties. The simulator \(\mathsf {Sim}\) works as follows:

  1. It sends the inputs of the corrupted parties, \(S_{i_1},\dots , S_{i_k}\), to the ideal functionality \(\mathcal {F}_{\mathsf {MTPSI}}\). \(\mathsf {Sim}\) either receives \(\perp \) or \(\cap _i S_i\) from the ideal functionality \(\mathcal {F}_{\mathsf {MTPSI}}\).

  2. \(\mathsf {Sim}\) waits for \(\mathcal {A}\) to send the corrupted parties’ inputs to the ideal functionality \(\mathcal {F}_{\mathsf {MPCT}}\). If \(\mathsf {Sim}\) has received \(\perp \) from \(\mathcal {F}_{\mathsf {MTPSI}}\), then \(\mathsf {Sim}\) leaks 0 to \(\mathcal {A}\) (and \(\mathcal {Z}\)) and terminates the protocol. Else, \(\mathsf {Sim}\) leaks 1 and continues.

  3. \(\mathsf {Sim}\) waits for \(\mathcal {A}\) to send a request \((\mathsf {sid},\mathsf {request}_{i_j})\) for each of the corrupted parties (that is, for \(j\in [k]\)) to \(\mathcal {F}_\mathsf {Gen}\). Upon receiving such requests, \(\mathsf {Sim}\) generates and returns the corresponding output of \(\mathcal {F}_\mathsf {Gen}\) for each of the requests.

  4. For each party \(\mathsf {P}_{\ell }\) such that \(\ell \ne i_j\) (where \(j\in [k]\)), \(\mathsf {Sim}\) picks a random polynomial \(U_\ell (x)\) of degree \(n-|\cap _i S_i|+t\) and sends , where \(R_\ell (x)\) is chosen uniformly at random such that \(\deg R_\ell (x)=t\). From now on, \(\mathsf {Sim}\) simulates the dummy parties as in the protocol.

We now argue that the simulation and the real-world execution are indistinguishable from the point of view of any environment \(\mathcal {Z}\). In the real-world execution, party \(\mathsf {P}_1\) obtains the polynomial

$$V(x)=P_{\cap _i S_i}(x)\cdot \sum _{i=1}^{N} P_{S_i\setminus \left( \cap _{k\ne i} S_k\right) }(x)\cdot R_i(x)$$

evaluated in \(3t+1\) points. Assume that \(\mathsf {P}_1\) is corrupted by \(\mathcal {A}\). Even in this case, there is an index \(\ell \) for which \(\mathcal {A}\) does not know the polynomial \(R_\ell (x)\). More precisely, we have that

$$V(x)=P_{\cap _i S_i}(x)\cdot \left( \left( \sum _{i\ne \ell } P_{S_i\setminus \left( \cap _{k\ne i} S_k\right) }(x)\cdot R_i(x)\right) + P_{S_\ell \setminus \left( \cap _{k\ne \ell } S_k\right) }(x)\cdot R_\ell (x)\right) .$$

First, note that

$$\begin{aligned}\deg \left( \sum _{i\ne \ell } P_{S_i\setminus \left( \cap _{k\ne i} S_k\right) }(x)\cdot R_i(x)\right)&= \deg P_{S_\ell \setminus \left( \cap _{k\ne \ell } S_k\right) }(x)\cdot R_\ell (x)\\&=n-|\cap _i S_i|+t\le 2t.\end{aligned}$$

Moreover, we have for any \(i\in [N]\) that \(\deg P_{S_i\setminus \left( \cap _{k\ne i} S_k\right) }\le t\), \(\deg R_i(x)=t\) and \(\gcd \left( P_{S_i\setminus \left( \cap _{k\ne i} S_k\right) }, P_{S_j\setminus \left( \cap _{k\ne j} S_k\right) }\right) =1\) for any \(j\ne i\). Hence, by Lemma 1, we can build a sequence of hybrids where we replace V(x) by the polynomial \(V'(x)=P_{\cap _i S_i}(x)\cdot U(x)\), where \(\deg U(x)=n-|\cap _i S_i|+t\), as in the ideal-world execution. Indistinguishability of executions follows.    \(\Box \)

Communication Complexity. When we instantiate the ideal functionality \(\mathcal {F}_{\mathsf {MPCT}}\) with the protocol from the previous section, the scheme has communication complexity \(\tilde{\mathcal {O}}(Nt^2)\).