1 Introduction

Today’s world is witnessing a visible transition from offline services to a heavy dependence on online platforms for banking, socializing, healthcare, and more. This leads to an increased user presence online, which leaves a trail of online activity and personal data across the Internet. The availability of such user-specific data opens up possibilities for its misuse. For instance, considerable concern has been raised regarding advertisement service providers such as Google and Facebook breaching user privacy for targeted advertisement services [38]. In the process of providing enhanced targeted advertisement services, service providers allegedly learn more information about their users than they are entitled to (e.g. a user’s shopping activity, browsing history) from various data collection entities. These entities collect user data via website cookies, loyalty cards, etc. [72]. While such targeted advertisements offer a personalized online experience, they may come at the cost of revealing unauthorized user data to these service providers. A similar challenge is encountered in the healthcare sector. Collaborative analysis among healthcare institutes over patient data is known to facilitate better diagnosis and improved treatment. However, laws such as the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA), which restrict the sharing of patient records, hinder such collaborations, thereby re-emphasizing the need for mechanisms that enable privacy-preserving computations.

Such privacy-preserving computations can be facilitated via several privacy-enhancing technologies such as homomorphic encryption [18, 47], differential privacy [42], and secure multiparty computation [12, 49, 98]. We focus on secure multiparty computation (MPC), which has been a cornerstone of recent research, showcasing its effectiveness in various applications such as privacy-preserving machine learning [63, 75, 94], secure collaborative analytics [82], and secure genome matching [6, 86]. Essentially, it offers a solution to the potential privacy issues that arise in collaborative computation scenarios such as the targeted advertisement setting described earlier.

MPC allows mutually distrusting parties to perform computations on their private inputs such that they learn nothing beyond the output of the computation. The distrust among the parties is captured by the notion of a centralized adversary, which is said to corrupt up to t out of the n participating parties. Depending on its behaviour, the adversary is categorized as either semi-honest or malicious [48]. A semi-honest adversary models the scenario where corrupt parties follow the protocol but may try to learn additional information from the messages they observe, whereas under the stronger notion of malicious corruption the corrupt parties may deviate arbitrarily from the protocol.

MPC with an honest majority, where only a minority of the parties are corrupt, enables the construction of efficient protocols for multiple parties [2, 13, 15, 35, 54, 82]. The recent concretely efficient protocols operating over rings have only considered a small number of parties [26, 32, 63, 73, 75, 81, 93, 94], which restricts the number of corruptions to at most one. Although the small-party setting has found application in the outsourced computation paradigm too, the multiparty setting is a better fit for real-world deployments due to its resilience to a higher number of corruptions (up to \(t < n/2\)). Thus, for larger n, the number of corruptions that can be tolerated is also higher, thereby increasing the trust in the system. Moreover, the multiparty setting allows for privacy-conscious computations even in a non-outsourced deployment scenario, such as providing targeted advertisement services (described in Fig. 1 and elaborated below), when outsourcing the computation is not feasible or preferable. Hence, to design efficient protocols, we focus on honest-majority multiparty computation.

Fig. 1

Use case for privacy-conscious solutions

Use Case. Consider the scenario of targeted advertisement services depicted in Fig. 1a. Typically, data collection entities track a user’s online activities via website cookies while the user browses the Internet. Also known as cookie profiling, such data collection allows the entities to create a “profile” for each user, which may contain information such as browsing habits, gender, marital status, and age. These profiles can facilitate targeted advertisements via specialized algorithms, which are leveraged by advertisement service providers such as Google and Facebook. While such services offer a personalized experience, they come at the expense of users’ private data being revealed to the service providers. A feasible solution (Fig. 1b) instead is to place a solution box at the interface between these service providers and the data collection entities such that it provides mechanisms to ensure the privacy of user data while also facilitating the required computations over the same (to provide targeted advertisement services). MPC, being a technology that supports privacy-preserving computations, lends itself well to such tasks. Instead of the data collection entities directly revealing the user data to the advertisement service providers, they can engage in an instance of an MPC protocol which securely runs the required algorithm on the user data while maintaining its privacy. Moreover, such a computation does not require the data collection entities to reveal their data to each other, thus offering a viable solution. Furthermore, as studied in [55], the effectiveness of targeted advertisements can greatly benefit from the use of machine learning algorithms. In particular, neural networks and, more recently, graph neural networks [28, 69, 79, 97, 99] have shown the potential to better analyse the data available via user profiles, in turn allowing for a refined personalized experience. We thus focus on protocols for securely evaluating standard neural networks such as VGG16 [91] (a deep neural network) and graph neural networks, and provide benchmarks for the same in Sect. 6.

1.1 Related Work

Despite the interest in MPC for a small population [2, 4, 5, 20, 25,26,27, 32, 45, 63, 64, 81, 94], MPC protocols for an arbitrary number of parties (n) have been studied extensively [3, 7, 9, 11, 13, 15, 17, 19, 21, 35, 44, 51, 54, 85] in the honest-majority (\(t<n/2\)) as well as the dishonest-majority (\(t<n\)) setting, where t denotes the maximum number of corruptions tolerated. We restrict the related work to MPC protocols in the honest-majority setting, which is the focus of this work. In the honest-majority setting, protocols can be categorized as working over fields [35, 46, 52, 54] or over rings [7, 13, 15, 17, 44]. The field-based protocols, which mostly rely on the Shamir secret sharing scheme [88], have the advantage of a share size that is linear in the number of parties. On the other hand, ring-based protocols have proven to be more efficient in practice since they can leverage CPU optimizations [14, 33, 36, 40, 89]. In the following, we cover the field-based protocols first, followed by the ring-based ones.

Field-based protocols: In the semi-honest case, [35, 46] provide MPC protocols over fields in the information-theoretic setting. ATLAS [52] further improves upon the communication complexity of [35] in the information-theoretic setting from 12t field elements to 8t field elements per multiplication gate. ATLAS [52] also provides another protocol variant, which improves the round complexity of [35] by 2\(\times \) but requires slightly higher communication of 9t field elements. In the computational setting, the two protocol variants in ATLAS [52] roughly require communication of 4t and 5t field elements, respectively. The work of [44] demonstrates MPC protocols in the computational setting in the preprocessing model with malicious security. We observe that the semi-honest protocol derived from [44] requires communicating 2t elements in the online and 3t elements in the preprocessing phase.

In the malicious setting, the semi-honest protocol of [35] has served as the basis for obtaining malicious security for free (i.e. an amortized communication cost of 3t field elements per multiplication gate) in the computational setting [17] as well as in the information-theoretic setting [52, 54]. These works follow the approach of executing a semi-honest protocol, followed by a verification phase to check the correctness of multiplications, which involves heavy polynomial interpolation. As mentioned earlier, the work of [44] focuses on maliciously secure protocols for the honest-majority setting in the preprocessing model. Their protocol relies on an instantiation of [54] in the preprocessing phase that requires communicating 3t field elements, while requiring another 3t field elements of communication in the online phase. However, their protocol requires a consistency check after each level of multiplications, which introduces a depth-dependent overhead in communication complexity. The absence of this check results in a privacy breach, as described in [53] and elaborated in Sect. 4.

Ring-based protocols: Operating over rings is challenging because, unlike in fields, not every element has a multiplicative inverse. One way to work over rings is to adapt a field-based protocol, but this can be computationally intensive due to the use of an extension field [1, 44]. Another option is to use replicated secret sharing (RSS) [57], which allows direct operation over a ring without the need for extensions. However, the replication causes the share size to grow exponentially in the number of parties, although this approach can be more efficient when the number of parties is constant.

The work of [15] shows how the honest-majority semi-honest field-based MPC protocol of [35] can be optimized to work over rings using RSS. Operating in the computational setting using a one-time setup for correlated randomness, this optimized version of [35] has a communication cost of 3t ring elements per multiplication gate. We will refer to this optimized honest-majority semi-honest protocol given in [15] (see section 7.4, page 63 of ePrint version) as DN07\({}^{\star }\). This protocol forms the state-of-the-art semi-honest protocol for honest-majority in the computational setting over rings and uses RSS. The work of [7, 13] also provides semi-honest MPC protocols over rings in the computational setting, which require each party to communicate roughly t elements per multiplication gate, resulting in quadratic communication in the number of parties. The work of [44], as described earlier, showcases how their field protocols can be extended to work over rings using Galois ring extensions. The semi-honest protocol derived from their maliciously secure variant requires communicating 2t and 3t extended ring elements in the online and preprocessing phases, respectively. In the malicious setting, [44] suffers from the privacy breach over rings as well (see Sect. 4 for details). Further, both [15, 17] provide protocols over rings. However, they rely on computationally heavy zero-knowledge machinery where expensive polynomial interpolation operations are carried out in the online phase.

1.2 Towards Practically Efficient Protocols

Before stating our contributions, we elaborate on the choices made in this work towards designing a practically efficient protocol for n parties, tolerating at most \(t < n/2\) corruptions.

  1.

    Preprocessing paradigm. With the goal of attaining as fast a response time as possible, the protocols are cast in the preprocessing paradigm [8, 26, 29, 34, 36, 37, 59,60,61, 81, 83]. Here, expensive data-independent computations are carried out in a preprocessing phase, thereby making way for a fast and efficient data-dependent online phase. We thus focus on improving the online phase without hampering the overall protocol complexity. Such a paradigm is apt for applications like ML-as-a-Service, where the same function is executed several times and is known beforehand.

  2.

    Algebraic structure. Although operating over fields allows one to use techniques like Shamir secret sharing [88], where the number of shares is linear in the number of parties, we note that field operations bring in an efficiency overhead [36, 89]. This is because operating over fields requires reliance on external libraries and overloading basic operations such as addition and multiplication, since computer architectures are designed to operate over 32- or 64-bit rings. Thus, in an attempt to further enhance efficiency by utilizing the underlying CPU architecture, several protocols work over rings [26, 63, 64, 73, 75, 94]. We follow this approach and design MPC protocols operating over the ring \(\mathbb {Z}_{2^{\ell }}\) and rely on replicated secret sharing (RSS). Note that the usage of RSS inherently results in an exponential blow-up in the number of shares for an arbitrary number of parties. Hence, it is well suited for practically oriented scenarios comprising a constant number of parties [15, 17], to which we restrict ourselves when benchmarking our protocols.

  3.

    Masked evaluation. To make our protocols efficient in the preprocessing paradigm, we use the masked evaluation paradigm on top of replicated secret sharing. Here, the secret data are masked with a random value, and the mask is RSS-shared. The computation is then carried out on the publicly held masked values and the shared masks. This technique was first introduced in the context of circuit garbling schemes (see [71, 96]) and was then adapted to secret sharing-based protocols in the dishonest-majority setting (see [10, 58]). It was later applied to small-population honest-majority settings such as [25, 32, 50, 63, 81] and [64] to aid in the development of practically efficient protocols.

  4.

    Adversarial strategy. Based on the deployment scenario, different levels of security may be desired. While semi-honest security suffices for several applications as shown in [5, 6, 24, 25, 66, 77, 86, 92], malicious security is always desirable. When considering malicious security, different notions can be attained: (i) security with abort, where the adversary may obtain the output while depriving honest parties of it; (ii) fairness, where the adversary obtains the output only if the honest parties do; and (iii) guaranteed output delivery, where honest parties always obtain the output. Thus, to cater to different scenarios, our protocols are designed to provide semi-honest security and malicious security with fairness, where each security goal has its merit.

  5.

    Monetary cost. To reduce the operational costs in the online phase, several recent works [26, 63, 64, 81] reduce the number of (online) computing parties. This is useful in long computations such as those involved in privacy-preserving machine learning (PPML) applications, which span several days or even weeks. Reducing the number of online parties is especially advantageous for protocols deployed in the secure outsourced computation (SOC) setting since one has to pay for the uptime of every hired server. Shutting down even a single server significantly helps in reducing the monetary cost [64, 74] of the system. We thus focus on ensuring the participation of a minimal number of parties during the online computation in our protocols. This is achieved for the first time in the multiparty (malicious) setting. Specifically, all the protocols for the semi-honest setting in our framework benefit from using only \(t+1\) parties in the online phase. The protocols in the malicious setting also enjoy this benefit, except that the remaining t parties are required to come online for a short verification phase at the end. The reduction in online parties improves the operational cost of the framework by almost 50\(\%\). This is unlike prior works [15, 17, 35, 52, 54], which require active participation from all parties throughout the computation.

1.3 Our Contributions

We begin with a quick overview of the contributions of this work, followed by the details.

  • We construct an n-party semi-honest protocol, tolerating at most \(t < n/2\) corruptions, in the preprocessing paradigm, which offers an improved online phase compared to the (optimized) protocol DN07\({}^{\star }\) [15] without inflating the total cost. Moreover, our protocol reduces the number of active parties in the online phase, thereby improving the system’s operational cost.

  • We extend our semi-honest protocol to the malicious setting, while retaining the benefit of a reduced number of online parties for the majority of the computation. Compared to the state-of-the-art protocol of [44], we offer the stronger security guarantee of fairness and at least a \(2\times \) improvement in round complexity, achieved via a one-time verification at the end of the protocol evaluation.

  • We provide support for 3- and 4-input multiplication at the same online complexity as that of 2-input multiplication. In addition to improving the communication cost over the approach of sequential multiplications, multi-input multiplication offers a \(2\times \) improvement in round complexity, which is beneficial for high-latency networks. Moreover, the approach can be extended to an arbitrary number of inputs while retaining the same online communication, albeit requiring exponential communication in the preprocessing phase [80].

  • We design building blocks for a range of applications such as deep neural networks, graph neural networks, genome sequence matching, and biometric matching. When the applications are benchmarked, our semi-honest protocol witnesses a saving of up to \(69\%\) in monetary cost and has \(3.5\times \) to \(4.6\times \) improvements in online run-time and throughput over DN07\({}^{\star }\). Interestingly, our maliciously secure protocols outperform the semi-honest protocol of DN07\({}^{\star }\) in terms of online run-time and throughput for the applications under consideration, achieving the goal of fast online phase.

We now elaborate on the contributions and highlight the technical details and novelty of our work.

Fig. 2

Hierarchy of primitives in our 3-tier framework

Our protocol suite follows a 3-tier architecture (Fig. 2) to attain the final goal of privacy-conscious computations. The first tier comprises fundamental primitives such as input sharing, reconstruction, multiplication (with truncation), and multi-input multiplication. The second tier includes building blocks such as dot product, matrix multiplication, conversions between the Boolean and arithmetic worlds, comparison, equality, and nonlinear activation functions, as required in the applications considered. Finally, the third tier comprises the applications. Our main contributions lie in Tier I and are detailed below. Going ahead, we use ‘multiparty protocols’ to mean honest-majority n-party protocols that tolerate \(t>1\) corruptions and thus do not include the tailor-made protocols in the 3- and 4-party setting.

Tier I—MPC protocols Our goal is to design protocols with a fast online phase. Thus, working over \(\mathbb {Z}_{2^{\ell }}\) and relying on RSS, we design a semi-honest MPC protocol in the computational setting assuming a one-time shared-key setup for correlated randomness.

Note that the straightforward extension of the semi-honest multiplication protocol of DN07\({}^{\star }\) to the preprocessing model, which can also be derived from the recent work of [44], incurs a communication of 3t elements in the preprocessing phase while communicating 2t elements in the online phase. This amounts to a \(1.6\times \) overhead in the total cost over DN07\({}^{\star }\). Our contribution lies in ensuring a fast online phase without inflating the total communication cost of the protocol. Specifically, our protocol requires communicating only 2t ring elements in the online phase and t in the preprocessing phase for a multiplication gate. Thus, in the honest-majority multiparty setting over rings, we are the first to achieve a communication cost of 2t in the online phase (unlike 3t in the prior works [35, 46]), without incurring any overhead in the total cost, i.e. our total cost still matches that of the best known (optimized) semi-honest honest-majority protocol [35, 46].

We extend our protocol to provide malicious security with fairness at the cost of additionally communicating t elements in the online phase and 2t in the preprocessing phase. Although the (abort) protocol of [44] has the same communication as our maliciously secure protocol, we achieve the stronger security notion of fairness. Moreover, [44] requires an additional round of communication for consistency checks after each level, the absence of which results in a privacy breach (described in [53] and elaborated in Sect. 4) and necessitates participation from all parties. By relying on a variant of RSS, our protocol avoids the consistency check after each level of circuit evaluation and ensures privacy. Notably, we only require participation from all parties for a one-time verification at the end of the evaluation, thus reducing the number of rounds by d, where d denotes the circuit depth.

3- and 4-input multiplications Following [64, 78, 80], to reduce the online communication cost and round complexity, we design protocols that enable the multiplication of 3 and 4 inputs in a single shot. Compared to the naive approach of performing sequential multiplications to multiply 3 or 4 inputs, the multi-input multiplication protocol enjoys the benefit of having the same online phase complexity as that of the 2-input multiplication protocol. This brings in a \(2\times \) improvement in the online round complexity and improves the online communication cost. Support for multi-input multiplication enables the usage of optimized adder circuits [80] for secure comparison and Boolean addition, thereby resulting in a faster online phase. The recent work of [52] also proposes a method to improve the round complexity of circuit evaluation by evaluating all gates in two consecutive layers of a circuit in parallel. We observe that their method can be viewed as a variant of multi-input multiplication with 3 and 4 inputs. Thus, our protocols need not be limited to facilitating faster comparisons and Boolean additions alone (as described above), but can be used to reduce the round and communication complexity of any general circuit evaluation. Note that [52] only improves the round complexity (by \(2\times \)) without inflating the communication cost when compared to DN07\({}^{\star }\). We, however, focus on improving the round complexity (by \(2\times \)) as well as the communication of the online phase by trading off an increase in the preprocessing cost.

Tier II—Building blocks We design efficient protocols for several building blocks in semi-honest and malicious settings, which are stepping stones for Tier III applications. These are extensions from the small-party setting [63, 75, 80, 81], and we provide the details in Sect. 5.1 (semi-honest) and Sect. 5.2 (malicious).

Tier III—Applications To showcase the practicality of our framework and the improvements of our protocols, we benchmark a range of applications: neural networks (NN), including the popular deep NN VGG16 [91], graph neural networks, genome sequence matching, and biometric matching, all of which are considered for the first time in the n-party honest-majority setting. We benchmark the applications in the WAN setting using Google Cloud instances. As mentioned, owing to the inherent restrictions of RSS and keeping the focus on practical scenarios, we showcase the performance of our protocols for \(n = 5, 7\), and 9 and compare with the state-of-the-art semi-honest protocol DN07\({}^{\star }\).

  1.

    Deep neural networks. We benchmark inference phases of deep neural networks such as LeNet [67] and VGG16 [91]. We observe savings of up to \(69\%\) in monetary cost, and improvements of up to \(4.3\times \) in online run-time and throughput, in comparison with DN07\({}^{\star }\).

  2.

    Graph neural network. We benchmark the inference phase of graph neural network [39, 90] on MNIST [68] data set. In comparison with DN07\({}^{\star }\), our protocol improves up to \(3.5\times \) in online run-time and sees up to \(15\%\) savings in monetary cost.

  3.

    Genome sequence matching. We demonstrate an efficient protocol for similar sequence queries (SSQ), which can be used to perform secure genome matching. Our protocol is based on the protocol of [86] which works for 2 parties and uses an edit distance approximation [6]. We extend and optimize the protocol for the multiparty setting. In comparison with DN07\({}^{\star }\), we witness improvements of up to \(4\times \) in online run-time and throughput, and savings of \(66\%\) in monetary cost.

  4.

    Biometric matching. We propose efficient protocols for computing Euclidean distance (ED), which forms the basis for performing biometric matching. Continuing the trend, we witness a \(4.6\times \) improvement in online run-time and throughput over DN07\({}^{\star }\), and savings of up to \(85\%\) in monetary cost.

2 Preliminaries

Our protocols are designed over the ring \(\mathbb {Z}_{2^{\ell }}\) and follow the (function-dependent) preprocessing paradigm to enable a fast online phase. For the machine learning applications considered in this work, we use the fixed-point arithmetic (FPA) [22, 23, 63, 75, 77] representation to operate over decimal values. Here, a decimal value is represented as an \(\ell \)-bit integer in signed 2’s complement representation. The most significant bit (\(\textsf {msb} \)) represents the sign bit, and the \(\textsf {d} \) least significant bits are reserved for the fractional part. The \(\ell \)-bit integer is then treated as an element of \(\mathbb {Z}_{2^{\ell }}\), and operations are performed modulo \(2^{\ell }\). We let \(\ell = 64\), \(\textsf {d} = 13\), with \(\ell - \textsf {d} - 1\) bits for the integer part.
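To make the representation concrete, the following minimal sketch (in Python, with helper names of our choosing) encodes and decodes values under the above FPA convention with \(\ell = 64\) and \(\textsf {d} = 13\); it is illustrative only and not part of the protocol description.

```python
# Minimal sketch of the FPA encoding above (ell = 64, d = 13); illustrative only.
ELL, D = 64, 13
MOD = 1 << ELL

def fpa_encode(x: float) -> int:
    """Encode a decimal value as an ell-bit integer in signed 2's complement."""
    return round(x * (1 << D)) % MOD

def fpa_decode(v: int) -> float:
    """Interpret an ell-bit ring element as a signed fixed-point value."""
    if v >= MOD // 2:            # msb set: negative value in 2's complement
        v -= MOD
    return v / (1 << D)

# Ring addition of encoded values preserves the FPA semantics.
a, b = fpa_encode(3.25), fpa_encode(-1.5)
assert fpa_decode((a + b) % MOD) == 1.75
```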

Shared-key setup \(\mathcal {F}_{\textsf {setup} }\) [5, 63, 75, 81] enables the establishment of common random keys for a pseudo-random function (PRF) F among the parties. This aids in non-interactively generating correlated randomness. Here, \(F: \{0, 1\}^{\kappa } \times \{0, 1\}^{\kappa } \rightarrow X\) is a secure PRF, with co-domain X being \(\mathbb {Z}_{2^{\ell }}\). The semi-honest functionality, \(\mathcal {F}_{\textsf {setup} }\), appears in Fig. 3. The functionality for the malicious case is similar, except that the adversary now has the capability to \(\texttt{abort}\).

To sample a random value \(\textsf {r} \in \mathbb {Z}_{2^{\ell }}\) among a set of \(t+1\) parties \({\mathcal {T}}= \{P_1, \ldots , P_{t+1}\}\) non-interactively, each \(P_i \in {\mathcal {T}}\) invokes \(F_{k_{{\mathcal {T}}}}(id_{{\mathcal {T}}})\) and obtains \(\textsf {r} \). Here, \(id_{{\mathcal {T}}}\) denotes a counter maintained by the parties in \({\mathcal {T}}\) and is updated after every PRF invocation. The key used for sampling is implicit from the context, i.e. from the identities of the parties that sample.
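As an illustration of this non-interactive sampling, the sketch below instantiates the PRF F with HMAC-SHA256 and reduces its output modulo \(2^{\ell }\); the concrete PRF choice and key encoding here are our own assumptions for the sake of the example and need not match the implementation used in our benchmarks.

```python
# Sketch of non-interactive correlated randomness from a shared PRF key.
# HMAC-SHA256 plays the role of the PRF F purely for illustration.
import hmac, hashlib

ELL = 64
MOD = 1 << ELL

def prf(key: bytes, counter: int) -> int:
    """F_k(id): map a counter to a pseudorandom element of Z_{2^ell}."""
    digest = hmac.new(key, counter.to_bytes(16, "big"), hashlib.sha256).digest()
    return int.from_bytes(digest[:ELL // 8], "big") % MOD

# Every party in T holds the same key k_T and counter id_T, so all of them
# derive the same r without any interaction; the counter is then incremented.
k_T, id_T = b"key established during setup", 0
r = prf(k_T, id_T)
id_T += 1
assert r == prf(k_T, 0)      # any other party in T recomputes the identical value
```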

Fig. 3

Ideal functionality for shared-key setup

Collision-Resistant Hash Function A family of hash functions [84] \(\{\textsf {H} : \mathcal {K} \times \textsf {M} \rightarrow \mathcal {Y} \}\) is said to be collision resistant if for all PPT adversaries \(\mathcal {A}\), given the hash function \(\textsf {H} _k\) for \(k \in _R \mathcal {K}\), the following holds: \(\textsf {Pr} [(x, x^{\prime }) \leftarrow \mathcal {A}(k): (x \ne x^{\prime }) \wedge \textsf {H} _k(x) = \textsf {H} _k(x^{\prime })] = \textsf {negl} (\kappa )\), where \(x, x^{\prime } \in \{0,1\}^{m}\), \(m = \textsf {poly} (\kappa )\), and \(\kappa \) is the security parameter.

Commitment Scheme Let \(\textsf {Com} (x)\) denote the commitment of a value x [76]. The commitment scheme \(\textsf {Com} (x)\) possesses two properties: hiding and binding. The former ensures the privacy of the value x given its commitment \(\textsf {Com} (x)\), while the latter prevents a corrupt party from opening the commitment to a different value \(x' \ne x\).
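For intuition only, the sketch below shows one folklore hash-based instantiation of \(\textsf {Com} (x)\) (hiding and binding when the hash is modelled as a random oracle); it is an illustrative stand-in and not the scheme of [76].

```python
# Illustrative hash-based commitment; a stand-in only, not the scheme of [76].
import secrets, hashlib

def commit(x: bytes):
    """Com(x): commit to x with fresh randomness; returns (commitment, opening)."""
    r = secrets.token_bytes(32)
    return hashlib.sha256(x + r).digest(), r

def open_check(com: bytes, x: bytes, r: bytes) -> bool:
    """Verify an opening (x, r) against a commitment."""
    return hashlib.sha256(x + r).digest() == com

com, opening = commit(b"secret value")
assert open_check(com, b"secret value", opening)
assert not open_check(com, b"another value", opening)   # binding in practice
```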

Our System Model This work considers both semi-honest and malicious adversarial models, with a static adversary corrupting at most \(t<n/2\) parties. For the rest of the paper, we assume maximal corruption in this setting and thus \(n = 2t + 1\). The security of our constructions is proved in the standalone, simulation-based security model of MPC, using the real-world/ideal-world simulation paradigm [70] against a computationally bounded adversary, and the details are provided in Sect. 7. Let \(\mathcal {P}= \{P_1, P_2, \ldots , P_n\}\) denote the set of n parties, which are connected by pair-wise private and authentic channels in a synchronous network. Additionally, our fair reconstruction protocol in the malicious setting relies on a broadcast channel, which can be instantiated using an existing broadcast protocol such as [41]. The set \(\mathcal {E}= \{P_1, P_2, \ldots , P_{t+1}\}\), termed the evaluator set, comprises parties that are active during the online phase. The set \(\mathcal {D}= \{P_{t+2}, P_{t+3}, \ldots , P_n\}\), termed the helper set, comprises parties which help in the preprocessing phase, and in the online verification in the malicious setting. Parties agree on a \(P_{\textsf {king} }\in \mathcal {E}\). Without loss of generality, let \(P_{\textsf {king} }= P_{t+1}\).

Sharing semantics We use the following sharing semantics, based on RSS and additive sharing schemes, which facilitate a fast online phase.

  • \(\langle \cdot \rangle \)-sharing: This denotes the replicated secret sharing (RSS) of a value with threshold t. A value \(\textsf {a} \in \mathbb {Z}_{2^{\ell }}\) is said to be RSS-shared with threshold t if for every subset \({\mathcal {T}}\subset \mathcal {P}\) of \(n-t\) parties there exists \(\langle \textsf {a} \rangle _{{\mathcal {T}}} \in \mathbb {Z}_{2^{\ell }}\) possessed by all \(P_i \in {\mathcal {T}}\) such that \(\textsf {a} = \sum _{{\mathcal {T}}} \langle \textsf {a} \rangle _{{\mathcal {T}}}\). Alternatively, for every set of t parties, the residual \(h = n-t\) parties forming the set \({\mathcal {T}}\) hold the share \(\langle \textsf {a} \rangle _{{\mathcal {T}}}\). Let \({\mathcal {T}}_1, {\mathcal {T}}_2, \ldots , {\mathcal {T}}_{\textsf {q} } \subset \mathcal {P}\) be the distinct subsets of size h, where \(\textsf {q} = \left( {\begin{array}{c}n\\ h\end{array}}\right) \) represents the total number of shares. Since \(P_i\) belongs to \(\left( {\begin{array}{c}n-1\\ h-1\end{array}}\right) \) such sets, it holds a tuple of \(\left( {\begin{array}{c}n-1\\ h-1\end{array}}\right) \) shares, \(\{\langle \textsf {a} \rangle _{{\mathcal {T}}}\}\). We denote this tuple of shares that it possesses as \(\langle \textsf {a} \rangle _i\).

  • \(\left[ \cdot \right] \)-sharing: A value \(\textsf {a} \in \mathbb {Z}_{2^{\ell }}\) is said to be \(\left[ \cdot \right] \)-shared (additively shared) among parties in \(\mathcal {P}\) if \(P_i \in \mathcal {P}\) possesses \(\left[ \textsf {a} \right] _i \in \mathbb {Z}_{2^{\ell }}\) such that \(\textsf {a} = \left[ \textsf {a} \right] _1 + \left[ \textsf {a} \right] _2 + \ldots + \left[ \textsf {a} \right] _n\).

  • \({}^{{\mathcal {T}}}{\left[ \cdot \right] }\)-sharing: A value \(\textsf {a} \in \mathbb {Z}_{2^{\ell }}\) is said to be \({}^{{\mathcal {T}}}{\left[ \cdot \right] }\)-shared among \(t+1\) parties in \({\mathcal {T}}\), if each \(P_i \in {\mathcal {T}}\) holds \({}^{{\mathcal {T}}}{\left[ \textsf {a} \right] }_i\) such that \(\textsf {a} = \sum _{P_i \in {\mathcal {T}}} {}^{{\mathcal {T}}}{\left[ \textsf {a} \right] }_{i}\). We refer to this sharing scheme as \((t+1)\)-additive sharing and use \({}^{\mathcal {E}}{\left[ \textsf {a} \right] }\) to denote such a sharing among parties in \(\mathcal {E}\).

  • \(\langle \!\langle \cdot \rangle \!\rangle \)-sharing: A value \(\textsf {a} \in \mathbb {Z}_{2^{\ell }}\) is said to be \(\langle \!\langle \cdot \rangle \!\rangle \)-shared in the semi-honest setting if there exist values \(\lambda _{\textsf {a} }, \textsf {m} _{\textsf {a} } \in \mathbb {Z}_{2^{\ell }}\) such that \(\textsf {m} _{\textsf {a} } = \textsf {a} + \lambda _{\textsf {a} }\) where \(\lambda _{\textsf {a} }\) is \(\langle \cdot \rangle \)-shared among \(\mathcal {P}\) and every \(P_i \in \mathcal {E}\) holds \(\textsf {m} _{\textsf {a} }\). We denote the shares of \(P_i \in \mathcal {D}\) by \(\langle \!\langle \textsf {a} \rangle \!\rangle _{i} = \langle \lambda _{\textsf {a} } \rangle _{i}\) and that of \(P_i \in \mathcal {E}\) as \(\langle \!\langle \textsf {a} \rangle \!\rangle _{i} = (\textsf {m} _{\textsf {a} }, \langle \lambda _{\textsf {a} } \rangle _{i})\). In the malicious setting, \(\textsf {m} _{\textsf {a} }\) is held by all parties, and \(\langle \!\langle \textsf {a} \rangle \!\rangle _{i} = (\textsf {m} _{\textsf {a} }, \langle \lambda _{\textsf {a} } \rangle _{i})\) for all \(P_i \in \mathcal {P}\).

It is trivial to see that all the sharing schemes mentioned above are linear. This allows parties to compute linear operations, such as addition and multiplication with constants, locally. The Boolean world operates over \(\mathbb {Z}_{2}\), and we denote the corresponding Boolean sharing with a superscript B. Notations are summarized in Table 1.
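The following sketch (our own illustration over \(\mathbb {Z}_{2^{\ell }}\) with \(n = 2t+1\); all names are chosen for the example) mirrors the \(\langle \cdot \rangle \)- and \(\langle \!\langle \cdot \rangle \!\rangle \)-sharing semantics above, and checks both reconstruction and the fact that any t parties miss at least one share of the mask.

```python
# Sketch of <.>-sharing (RSS, threshold t) and <<.>>-sharing over Z_{2^ell}.
from itertools import combinations
import secrets

ELL = 64
MOD = 1 << ELL

def rss_share(a: int, n: int, t: int) -> dict:
    """<a>: one share per subset of size h = n - t, all shares summing to a."""
    subsets = list(combinations(range(n), n - t))
    shares = {T: secrets.randbelow(MOD) for T in subsets[:-1]}
    shares[subsets[-1]] = (a - sum(shares.values())) % MOD
    return shares

def masked_share(a: int, n: int, t: int):
    """<<a>>: masked value m_a = a + lambda_a plus an RSS-shared mask lambda_a."""
    lam = secrets.randbelow(MOD)
    return (a + lam) % MOD, rss_share(lam, n, t)

n, t = 5, 2
m_a, lam_shares = masked_share(42, n, t)
# reconstruction: subtract the reconstructed mask from the masked value
assert (m_a - sum(lam_shares.values())) % MOD == 42
# privacy intuition: any t parties miss at least one share of lambda_a
for corrupt in combinations(range(n), t):
    assert any(not set(T) & set(corrupt) for T in lam_shares)
```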

Table 1 Notations used in this work

Helper primitives We use the primitives described in Table 2 from the literature [15, 17, 30, 81] in our protocols, and their details are presented next. The Boolean variants of corresponding primitives are denoted with a superscript B.

Table 2 Description of helper primitives—all are non-interactive, except \({\Pi }_{\textsf {agree} }\)
  (1)

    \({\Pi }_{[0]}\rightarrow \left[ 0\right] \) (Fig. 4): To generate \(\left[ \cdot \right] \)-shares of 0, each party non-interactively samples two values, each with one of its neighbouring parties. A party’s share of 0 is defined as the difference of these two values.

  (2)

    \({\Pi }_{\textsf {rand} }\rightarrow \langle \textsf {r} \rangle \) (Fig. 5): To generate \(\langle \cdot \rangle \)-shares of a random \(\textsf {r} \in \mathbb {Z}_{2^{\ell }}\), every set of \(t+1\) parties non-interactively sample a random value using keys established during the setup phase and define \(\textsf {r} \) to be the sum of these values.

  (3)

    \({\Pi }_{\textsf {pRand} }(P_s) \rightarrow \langle \textsf {r} \rangle \) (Fig. 6): This protocol generates \(\langle \cdot \rangle \)-shares of a random value \(\textsf {r} \) such that \(P_s\) learns all the shares. Every set of \(t+1\) parties non-interactively samples a random value together with \(P_s\), using the keys established (for every set of \(t+2\) parties) during the setup phase.

  (4)

    \({\Pi }_{\tiny {\cdot \rightarrow \langle \!\langle \cdot \rangle \!\rangle }}(\textsf {a} ) \rightarrow \langle \!\langle \textsf {a} \rangle \!\rangle \): This protocol generates \(\langle \!\langle \textsf {a} \rangle \!\rangle \) when \(\textsf {a} \in \mathbb {Z}_{2^{\ell }}\) is held by at least \(t+1\) parties, say parties in \(\mathcal {E}\). For this, \(P_i \in \mathcal {E}\) sets \(\textsf {m} _{\textsf {a} } = \textsf {a} \) and \(\langle \cdot \rangle \)-shares of \(\lambda _{\textsf {a} }\) as 0. To generate \(\langle \!\langle \textsf {a} \rangle \!\rangle \) in the malicious case where all parties hold \(\textsf {a} \), we let parties set \(\textsf {m} _{\textsf {a} } = \textsf {a} \) and shares of \(\lambda _{\textsf {a} }\) as 0.

  (5)

    \({\Pi }_{\tiny {\langle \cdot \rangle \rightarrow {}^{{\mathcal {T}}}{\left[ \cdot \right] }}}(\langle \textsf {a} \rangle ) \rightarrow {}^{{\mathcal {T}}}{\left[ \textsf {a} \right] }\) (Fig. 7): This protocol enables parties in \({\mathcal {T}}= \{E_1, E_2, \ldots , E_{t+1} \}\) to generate \({}^{{\mathcal {T}}}{\left[ \textsf {a} \right] }\) from \(\langle \textsf {a} \rangle \). To generate \({}^{{\mathcal {T}}}{\left[ \textsf {a} \right] }_i\), the idea is to sum up the shares in \(\langle \textsf {a} \rangle _{{\mathcal {T}}_1}, \ldots , \langle \textsf {a} \rangle _{{\mathcal {T}}_{\textsf {q} }}\), while ensuring that every share is accounted for and no share is incorporated more than once. Concretely, for share \(\langle \textsf {a} \rangle _{{\mathcal {T}}_j}\) held by parties in \({\mathcal {T}}_j\) for \(j \in \{1, \ldots , \textsf {q} \} \), \(E_i \in {\mathcal {T}}_j\) incorporates \(\langle \textsf {a} \rangle _{{\mathcal {T}}_j}\) in its share \({}^{{\mathcal {T}}}{\left[ \textsf {a} \right] }_i\) if \(E_i\) has the least index in \({\mathcal {T}}_j\). An illustrative sketch of this conversion, together with primitive (11), is given after this list.

  (6)

    \({\Pi }_{\tiny {\langle \cdot \rangle \rightarrow \left[ \cdot \right] }}(\langle \textsf {a} \rangle ) \rightarrow \left[ \textsf {a} \right] \): \(\langle \cdot \rangle \)-share can be converted to \(\left[ \cdot \right] \)-share following similar procedure as \({\Pi }_{\tiny {\langle \cdot \rangle \rightarrow {}^{{\mathcal {T}}}{\left[ \cdot \right] }}}\) and is denoted as \({\Pi }_{\tiny {\langle \cdot \rangle \rightarrow \left[ \cdot \right] }}(\langle \textsf {a} \rangle )\). We omit the details due to similarity.

  (7)

    \({\Pi }_{\tiny {\langle \!\langle \cdot \rangle \!\rangle \rightarrow {}^{{\mathcal {T}}}{\left[ \cdot \right] }}}(\langle \!\langle \textsf {a} \rangle \!\rangle ) \rightarrow {}^{{\mathcal {T}}}{\left[ \textsf {a} \right] }\): Parties in \({\mathcal {T}}\) invoke \({\Pi }_{\tiny {\langle \cdot \rangle \rightarrow {}^{{\mathcal {T}}}{\left[ \cdot \right] }}}\) on \(-\lambda _{\textsf {a} }\) to generate \({}^{{\mathcal {T}}}{\left[ -\lambda _{\textsf {a} } \right] }\), followed by a designated \(P_i \in {\mathcal {T}}\) that holds \(\textsf {m} _{\textsf {a} }\) setting \({}^{{\mathcal {T}}}{\left[ \textsf {a} \right] }_i = \textsf {m} _{\textsf {a} } + {}^{\mathcal {E}}{\left[ -\lambda _{\textsf {a} } \right] }_i\).

  (8)

    \({\Pi }_{\tiny {\langle \!\langle \cdot \rangle \!\rangle \rightarrow \left[ \cdot \right] }}(\langle \!\langle \textsf {a} \rangle \!\rangle ) \rightarrow \left[ \textsf {a} \right] \): \(\left[ \textsf {a} \right] \) can be generated from \(\langle \!\langle \textsf {a} \rangle \!\rangle \) similar to \({\Pi }_{\tiny {\langle \!\langle \cdot \rangle \!\rangle \rightarrow {}^{{\mathcal {T}}}{\left[ \cdot \right] }}}\) and is denoted as \({\Pi }_{\tiny {\langle \!\langle \cdot \rangle \!\rangle \rightarrow \left[ \cdot \right] }}(\langle \!\langle \textsf {a} \rangle \!\rangle )\).

  (9)

    \({\Pi }_{\tiny {\langle \cdot \rangle \rightarrow \langle \!\langle \cdot \rangle \!\rangle }}(\langle \textsf {a} \rangle ) \rightarrow \langle \!\langle \textsf {a} \rangle \!\rangle \): To convert \(\langle \textsf {a} \rangle \), to \(\langle \!\langle \textsf {a} \rangle \!\rangle \), set \(\textsf {m} _{\textsf {a} } = 0\) and set \(\langle \lambda _{\textsf {a} } \rangle = - \langle \textsf {a} \rangle \).

  (10)

    \({\Pi }_{\tiny {\langle \!\langle \cdot \rangle \!\rangle \rightarrow \langle \cdot \rangle }}(\langle \!\langle \textsf {a} \rangle \!\rangle ) \rightarrow \langle \textsf {a} \rangle \): To convert \(\langle \!\langle \textsf {a} \rangle \!\rangle \) to \(\langle \textsf {a} \rangle \), set \(\langle \textsf {a} \rangle _{{\mathcal {T}}_j} = - \langle \lambda _{\textsf {a} } \rangle _{{\mathcal {T}}_j}\) for \(j \in \{1, \ldots , \textsf {q} -1\}\) and \(\langle \textsf {a} \rangle _{{\mathcal {T}}_{\textsf {q} }} = \textsf {m} _{\textsf {a} } - \langle \lambda _{\textsf {a} } \rangle _{{\mathcal {T}}_{\textsf {q} }}\), where \({\mathcal {T}}_{\textsf {q} } = \mathcal {E}\).

  (11)

    \({\Pi }_{\tiny {\langle \cdot \rangle \cdot \langle \cdot \rangle \rightarrow \left[ \cdot \right] }}(\langle \textsf {a} \rangle , \langle \textsf {b} \rangle ) \rightarrow \left[ \textsf {a} \textsf {b} \right] \) (Fig. 8): Given \(\langle \textsf {a} \rangle , \langle \textsf {b} \rangle \), parties non-interactively compute \(\left[ \textsf {a} \textsf {b} \right] \) as follows. Observe that \(\left[ \textsf {a} \textsf {b} \right] = \sum _{j=1}^{\textsf {q} } \left[ \langle \textsf {a} \rangle _{{\mathcal {T}}_j} \textsf {b} \right] \). To generate \(\left[ \langle \textsf {a} \rangle _{{\mathcal {T}}_j} \textsf {b} \right] \), the idea is to generate \({}^{{\mathcal {T}}_j}{\left[ \langle \textsf {a} \rangle _{{\mathcal {T}}_j} \textsf {b} \right] }\) and perform a conversion. Parties in \({\mathcal {T}}_j\) generate \({}^{{\mathcal {T}}_j}{\left[ \langle \textsf {a} \rangle _{{\mathcal {T}}_j} \textsf {b} \right] }\) as \({}^{{\mathcal {T}}_j}{\left[ \langle \textsf {a} \rangle _{{\mathcal {T}}_j} \textsf {b} \right] } = \left( \langle \textsf {a} \rangle _{{\mathcal {T}}_j} \right) \cdot \left( {}^{{\mathcal {T}}_j}{\left[ \textsf {b} \right] } \right) \). To obtain \(\left[ \langle \textsf {a} \rangle _{{\mathcal {T}}_j} \textsf {b} \right] \) from \({}^{{\mathcal {T}}_j}{\left[ \langle \textsf {a} \rangle _{{\mathcal {T}}_j} \textsf {b} \right] }\), \(P_i \in \mathcal {P}\) sets \(\left[ \langle \textsf {a} \rangle _{{\mathcal {T}}_j} \textsf {b} \right] _i = {}^{{\mathcal {T}}_j}{\left[ \langle \textsf {a} \rangle _{{\mathcal {T}}_j} \textsf {b} \right] }_i\) if \(P_i \in {\mathcal {T}}_j\) and \(\left[ \langle \textsf {a} \rangle _{{\mathcal {T}}_j} \textsf {b} \right] _i =0\), otherwise.

  (12)

    \({\Pi }_{\textsf {agree} }(\mathcal {P}, \{\vec {\textsf {v} _1}, \ldots , \vec {\textsf {v} _n}\}) \rightarrow \texttt{continue}/ \texttt{abort}\): Allows parties to check if they hold the same set of values \(\vec {\textsf {v} } = (\textsf {v} _1, \ldots , \textsf {v} _m)\), where parties \(\texttt{continue}\) if the values are the same, and \(\texttt{abort}\) otherwise. We denote the version of \(\vec {\textsf {v} }\) held by \(P_i \in \mathcal {P}\) as \(\vec {\textsf {v} _i}\). To check the consistency of \(\vec {\textsf {v} }\), parties compute a hash \(h = \textsf {H} (\textsf {v} _1 ||\ldots ||\textsf {v} _m)\) of the concatenation of all values \(\textsf {v} _1, \ldots , \textsf {v} _m\) and exchange \(h\) among themselves. If any party receives inconsistent hashes, it \(\texttt{abort}\)s; else it \(\texttt{continue}\)s.

  (13)

    \({\Pi }_{\langle \cdot \rangle }(P_s, \textsf {a} ) \rightarrow \langle \textsf {a} \rangle \): To enable \(P_s\) to generate \(\langle \textsf {a} \rangle \), parties generate \(\langle \textsf {a} \rangle _{{\mathcal {T}}_j}\) for \(j \in \{1, \ldots , \textsf {q} -1 \}\) using \({\Pi }_{\textsf {pRand} }\), with \(P_s\) learning \(\langle \textsf {a} \rangle _{{\mathcal {T}}_j}\) (i.e. \(\langle \textsf {a} \rangle _{{\mathcal {T}}_j}\) are sampled using common key among \(t+2\) parties). \(P_s\) sets \(\langle \textsf {a} \rangle _{{\mathcal {T}}_ {\textsf {q} }} = \textsf {a} - \sum _{j=1}^{\textsf {q} -1} \langle \textsf {a} \rangle _{{\mathcal {T}}_j}\) and sends \(\langle \textsf {a} \rangle _{{\mathcal {T}}_{\textsf {q} }}\) to parties in \({\mathcal {T}}_{\textsf {q} }\). For malicious case, this is followed by invoking \({\Pi }_{\textsf {agree} }(\mathcal {P}, \{\langle \textsf {a} \rangle _{{\mathcal {T}}_{\textsf {q} }}\})\) to check consistency of the values sent by \(P_s\).
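The sketch below (our own illustration with \(n = 5\), \(t = 2\); names and indexing are chosen for the example) mirrors primitives (5) and (11): converting an RSS sharing into a \((t+1)\)-additive sharing by letting the least-indexed holder claim each share, and locally turning two RSS sharings into an additive sharing of their product.

```python
# Sketch of helper primitives (5) and (11) over Z_{2^ell}; illustrative only.
from itertools import combinations
import secrets

MOD = 1 << 64

def rss_share(v, n, t):
    """Per-party views of an RSS sharing <v> with threshold t."""
    subsets = list(combinations(range(n), n - t))
    sh = {T: secrets.randbelow(MOD) for T in subsets[:-1]}
    sh[subsets[-1]] = (v - sum(sh.values())) % MOD
    return {i: {T: s for T, s in sh.items() if i in T} for i in range(n)}

def rss_to_additive(view, T):
    """Primitive (5): each RSS share is claimed by the least-indexed member of T
    holding it, yielding an additive sharing of the value among the parties in T."""
    add, seen = {i: 0 for i in T}, set()
    for i in sorted(T):
        for T_j, s in view[i].items():
            if T_j not in seen:
                add[i] = (add[i] + s) % MOD
                seen.add(T_j)
    return add

def local_rss_product(view_a, view_b, n, t):
    """Primitive (11): [ab] = sum_j [<a>_{T_j} * b]; parties in T_j scale their
    ^{T_j}[b]-shares by the common share <a>_{T_j}, all without interaction."""
    add = {i: 0 for i in range(n)}
    for T_j in combinations(range(n), n - t):
        a_Tj = view_a[T_j[0]][T_j]               # share held by every party in T_j
        b_add = rss_to_additive(view_b, T_j)     # ^{T_j}[b]
        for i in T_j:
            add[i] = (add[i] + a_Tj * b_add[i]) % MOD
    return add

n, t = 5, 2
va, vb = rss_share(7, n, t), rss_share(9, n, t)
assert sum(rss_to_additive(va, range(t + 1)).values()) % MOD == 7
assert sum(local_rss_product(va, vb, n, t).values()) % MOD == 63
```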

Fig. 4

Generating \(\left[ \cdot \right] \)-shares of 0

Fig. 5

Generating \(\langle \cdot \rangle \)-shares of a random value

Fig. 6

Generating \(\langle \cdot \rangle \)-shares of a random value along with \(P_s\)

Fig. 7

Conversion from \(\langle \cdot \rangle \)-share to \({}^{{\mathcal {T}}}{\left[ \cdot \right] }\)-share

Fig. 8

\(\langle \textsf {a} \rangle , \langle \textsf {b} \rangle \) to \(\left[ \textsf {a} \textsf {b} \right] \)

3 MPClan Protocol

The ideal functionality \(\mathcal {F}_{\textsf {n\text{- }}PC }\) for evaluating function f in the n-party setting with semi-honest security appears in Fig. 9. Details of its instantiation over the ring \(\mathbb {Z}_{2^{\ell }}\) that comprises three phases—input sharing, evaluation (linear operations and multiplication), and output reconstruction—appear next.

Fig. 9

Semi-honest: ideal functionality for function f

Input sharing and output reconstruction To enable \(P_s \in \mathcal {P}\) to \(\langle \!\langle \cdot \rangle \!\rangle \)-share a value \(\textsf {v} \in \mathbb {Z}_{2^{\ell }}\), parties first non-interactively sample \(\langle \cdot \rangle \)-shares of \(\lambda _{\textsf {v} }\), relying on the shared-key setup, such that \(P_s\) learns all these shares in clear (via \({\Pi }_{\textsf {pRand} }\)). This enables \(P_s\) to compute and send \(\textsf {m} _{\textsf {v} } = \textsf {v} + \lambda _{\textsf {v} }\) to parties in \(\mathcal {E}\), thereby generating \(\langle \!\langle \textsf {v} \rangle \!\rangle \). The protocol for input sharing appears in Fig. 10.

Fig. 10

Semi-honest: input sharing protocol

To reconstruct \(\textsf {v} \) towards all parties given \(\langle \!\langle \textsf {v} \rangle \!\rangle \), observe that parties in \(\mathcal {E}\) possess sufficient shares to facilitate the same. Elaborately, parties in \(\mathcal {E}\) can non-interactively generate additive shares, \({}^{\mathcal {E}}{\left[ \textsf {v} \right] }\), among themselves (via \(\varPi _{\tiny {\langle \!\langle \cdot \rangle \!\rangle \rightarrow {}^{\mathcal {E}}{\left[ \cdot \right] }}}\)). These parties can then send their additive shares to \(P_{\textsf {king} }\), who computes and sends \(\textsf {v} \) to all parties. Reconstruction towards a single party, say \(P_s\), can proceed similarly except that the protocol terminates after parties in \(\mathcal {E}\) send their additive shares of \(\textsf {v} \) to \(P_{\textsf {king} }= P_s\), who then computes \(\textsf {v} \).

Evaluation Evaluation comprises linear operations, namely addition and multiplication with a public constant, and nonlinear operations such as multiplication. Parties can compute linear operations non-interactively owing to the linearity of the \(\langle \!\langle \cdot \rangle \!\rangle \)-sharing. Concretely, given \(\langle \!\langle \textsf {a} \rangle \!\rangle , \langle \!\langle \textsf {b} \rangle \!\rangle \) and public constants \(\textsf {c} _1, \textsf {c} _2\), parties can non-interactively compute \(\langle \!\langle \textsf {c} _1 \textsf {a} + \textsf {c} _2 \textsf {b} \rangle \!\rangle \) as \(\textsf {c} _1 \langle \!\langle \textsf {a} \rangle \!\rangle + \textsf {c} _2 \langle \!\langle \textsf {b} \rangle \!\rangle \).

Fig. 11

Steps of semi-honest multiplication protocol

To compute \(\langle \!\langle \cdot \rangle \!\rangle \)-shares for nonlinear operations such as multiplication, say \(\textsf {z} = \textsf {a} \textsf {b} \) given \(\langle \!\langle \textsf {a} \rangle \!\rangle , \langle \!\langle \textsf {b} \rangle \!\rangle \), parties proceed as follows. At a high level, the approach is to enable generation of \(\langle \!\langle \textsf {z} - \textsf {r} \rangle \!\rangle \) and \(\langle \!\langle \textsf {r} \rangle \!\rangle \) for a random \(\textsf {r} \in \mathbb {Z}_{2^{\ell }}\), which enables parties to non-interactively compute \(\langle \!\langle \textsf {z} \rangle \!\rangle = \langle \!\langle \textsf {z} - \textsf {r} \rangle \!\rangle + \langle \!\langle \textsf {r} \rangle \!\rangle \). Observe that \(\langle \!\langle \textsf {r} \rangle \!\rangle \) can be generated non-interactively by locally sampling each of its shares. To generate \(\langle \!\langle \textsf {z} - \textsf {r} \rangle \!\rangle \), we let parties in \(\mathcal {E}\) obtain \(\textsf {z} - \textsf {r} \), following which \(\langle \!\langle \textsf {z} - \textsf {r} \rangle \!\rangle \) can be generated non-interactively. (This is achieved via \({\Pi }_{\tiny {\cdot \rightarrow \langle \!\langle \cdot \rangle \!\rangle }}\) where all parties set their shares of \(\langle \lambda _{\textsf {z} - \textsf {r} } \rangle \) as 0, and parties in \(\mathcal {E}\) set \(\textsf {m} _{\textsf {z} - \textsf {r} } = \textsf {z} - \textsf {r} \).) Observe that \(\textsf {z} \) remains private while revealing \(\textsf {z} - \textsf {r} \) to parties in \(\mathcal {E}\) since \(\textsf {r} \) is a random mask not known to the adversary.

To enable parties in \(\mathcal {E}\) to obtain \(\textsf {z} - \textsf {r} \), we let \(\textsf {z} - \textsf {r} = \textsf {D} + \textsf {E} \), where \(\textsf {D} \) is additively shared among parties in \(\mathcal {D}\), while \(\textsf {E} \) is additively shared among parties in \(\mathcal {E}\). (\(\textsf {D} , \textsf {E} \) are defined in the following paragraphs.) Thus, to reconstruct \(\textsf {z} - \textsf {r} \) towards parties in \(\mathcal {E}\), parties send their respective additive shares of \(\textsf {D} \) or \(\textsf {E} \) towards \(P_{\textsf {king} }\in \mathcal {P}\). \(P_{\textsf {king} }\) reconstructs \(\textsf {D} , \textsf {E} \) and sends \(\textsf {z} - \textsf {r} = \textsf {D} + \textsf {E} \) to parties in \(\mathcal {E}\). Elaborately, as seen in [25, 64], \(\textsf {z} - \textsf {r} \) can be computed as

$$\begin{aligned} \textsf {z} - \textsf {r}&= \textsf {a} \textsf {b} - \textsf {r} = \left( \textsf {m} _{\textsf {a} } - \lambda _{\textsf {a} } \right) \left( \textsf {m} _{\textsf {b} } - \lambda _{\textsf {b} } \right) - \textsf {r} = {\text {M}}_{\textsf {a} \textsf {b} } -\textsf {m} _{\textsf {a} } \lambda _{\textsf {b} } - \textsf {m} _{\textsf {b} } \lambda _{\textsf {a} } + \varLambda _{\textsf {a} \textsf {b} } - \textsf {r} \nonumber \\&= \underbrace{{\text {M}}_{\textsf {a} \textsf {b} } -\textsf {m} _{\textsf {a} } \lambda _{\textsf {b} } - \textsf {m} _{\textsf {b} } \lambda _{\textsf {a} } + (\varLambda _{\textsf {a} \textsf {b} } - \textsf {r} )_{\mathcal {E}}}_{\textsf {E} } + \underbrace{(\varLambda _{\textsf {a} \textsf {b} } - \textsf {r} )_{\mathcal {D}}}_{\textsf {D} } \end{aligned}$$
(1)

where \(\varLambda _{\textsf {a} \textsf {b} } - \textsf {r} = (\varLambda _{\textsf {a} \textsf {b} } - \textsf {r} )_{\mathcal {D}} + (\varLambda _{\textsf {a} \textsf {b} } - \textsf {r} )_{\mathcal {E}}\). The multiplication protocol \(\varPi _{\textsf {mult} }\) (Fig. 12) is detailed next, and its schematic representation is provided in Fig. 11.
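As a quick sanity check of Eq. (1), the following snippet (illustrative only, with \(\ell = 64\); variable names mirror the equation) verifies the decomposition numerically for random masks.

```python
# Numeric check of Eq. (1): z - r = M_ab - m_a*lambda_b - m_b*lambda_a + Lambda_ab - r.
import secrets

MOD = 1 << 64
a, b = 1234, 5678
lam_a, lam_b, r = (secrets.randbelow(MOD) for _ in range(3))
m_a, m_b = (a + lam_a) % MOD, (b + lam_b) % MOD        # masked values
M_ab, Lam_ab = m_a * m_b % MOD, lam_a * lam_b % MOD

z_minus_r = (M_ab - m_a * lam_b - m_b * lam_a + Lam_ab - r) % MOD
assert z_minus_r == (a * b - r) % MOD                  # equals ab - r, as claimed
```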

Fig. 12

Semi-honest: multiplication protocol

  • Step 1: Parties non-interactively generate \(\langle \textsf {r} \rangle \) by locally sampling each of its shares (via \({\Pi }_{\textsf {rand} }\)). Parties locally compute \(\left[ \textsf {r} \right] \) and \(\langle \!\langle \textsf {r} \rangle \!\rangle \) from \(\langle \textsf {r} \rangle \) using \({\Pi }_{\tiny {\langle \cdot \rangle \rightarrow \left[ \cdot \right] }}\) and \({\Pi }_{\tiny {\langle \cdot \rangle \rightarrow \langle \!\langle \cdot \rangle \!\rangle }}\), respectively. Looking ahead, \(\left[ \textsf {r} \right] \) aids in generating additive shares of \(\textsf {D} , \textsf {E} \), while \(\langle \!\langle \textsf {r} \rangle \!\rangle \) aids in computing \(\langle \!\langle \textsf {z} \rangle \!\rangle \) from \(\langle \!\langle \textsf {z} - \textsf {r} \rangle \!\rangle \).

  • Step 2: This step involves computing additive shares of \(\varLambda _{\textsf {a} \textsf {b} } - \textsf {r} \) among all parties. For this, parties non-interactively generate \(\left[ \varLambda _{\textsf {a} \textsf {b} }\right] \) from \(\langle \lambda _{\textsf {a} } \rangle , \langle \lambda _{\textsf {b} } \rangle \) (via \({\Pi }_{\tiny {\langle \cdot \rangle \cdot \langle \cdot \rangle \rightarrow \left[ \cdot \right] }}\)). \(P_i \in \mathcal {P}\) sets its additive share of \(\varLambda _{\textsf {a} \textsf {b} } - \textsf {r} \) as \(\left[ \varLambda _{\textsf {a} \textsf {b} } - \textsf {r} \right] _i = \left[ \varLambda _{\textsf {a} \textsf {b} }\right] _i - \left[ \textsf {r} \right] _i\). Observe that the shares \(\left[ \varLambda _{\textsf {a} \textsf {b} } - \textsf {r} \right] _i\) of \(P_i \in \mathcal {D}\) define the additive shares of \(\textsf {D} = (\varLambda _{\textsf {a} \textsf {b} } - \textsf {r} )_{\mathcal {D}}\) among parties in \(\mathcal {D}\). Similarly, the shares \(\left[ \varLambda _{\textsf {a} \textsf {b} } - \textsf {r} \right] _i\) of \(P_i \in \mathcal {E}\) define the additive shares of \((\varLambda _{\textsf {a} \textsf {b} } - \textsf {r} )_{\mathcal {E}}\) among parties in \(\mathcal {E}\) (i.e. \({}^{\mathcal {E}}{\left[ (\varLambda _{\textsf {a} \textsf {b} } - r)_{\mathcal {E}} \right] }\)).

  • Step 3: Parties in \(\mathcal {E}\) generate additive shares of \(\lambda _{\textsf {a} }, \lambda _{{\textsf {b} }}\) among themselves (\({}^{\mathcal {E}}{\left[ \cdot \right] }\)-shares, via \(\Pi _{\tiny {\langle \cdot \rangle \rightarrow {}^{\mathcal {E}}{\left[ \cdot \right] }}}\)). Looking ahead, \({}^{\mathcal {E}}{\left[ \lambda _{\textsf {a} } \right] }, {}^{\mathcal {E}}{\left[ \lambda _{\textsf {b} } \right] }\) aid in generating additive shares of \(\textsf {E} \) among \(\mathcal {E}\).

  • Step 4: Parties in \(\mathcal {D}\) send their additive shares of \(\textsf {D} \) (as defined in step 2) to \(P_{\textsf {king} }\), who reconstructs \(\textsf {D} \).

  • Step 5: \(P_i \in \mathcal {E}\setminus \{P_{\textsf {king} }\}\) non-interactively generates additive share, \({}^{\mathcal {E}}{\left[ \textsf {E} \right] }_i\), of \(\textsf {E} \) among parties in \(\mathcal {E}\) as \({}^{\mathcal {E}}{\left[ \textsf {E} \right] }_i = - \textsf {m} _{\textsf {a} } {}^{\mathcal {E}}{\left[ \lambda _{\textsf {b} } \right] }_i - \textsf {m} _{\textsf {b} } {}^{\mathcal {E}}{\left[ \lambda _{\textsf {a} } \right] }_i + {}^{\mathcal {E}}{\left[ (\varLambda _{\textsf {a} \textsf {b} } - \textsf {r} )_{\mathcal {E}} \right] }_i\). Note that it suffices for only one designated party in \(\mathcal {E}\) to add \({\text {M}}_{\textsf {a} \textsf {b} }\) in its share of \({}^{\mathcal {E}}{\left[ \textsf {E} \right] }\), and without loss of generality we let this designated party be \(P_{\textsf {king} }\). For \(P_{\textsf {king} }= P_{t+1}\) in our case, \({}^{\mathcal {E}}{\left[ \textsf {E} \right] }_{t+1} = {\text {M}}_{\textsf {a} \textsf {b} } - \textsf {m} _{\textsf {a} } {}^{\mathcal {E}}{\left[ \lambda _{\textsf {b} } \right] }_{t+1} - \textsf {m} _{\textsf {b} } {}^{\mathcal {E}}{\left[ \lambda _{\textsf {a} } \right] }_{t+1} + {}^{\mathcal {E}}{\left[ (\varLambda _{\textsf {a} \textsf {b} } - \textsf {r} )_{\mathcal {E}} \right] }_{t+1}\). Parties send their additive shares of \(\textsf {E} \) to \(P_{\textsf {king} }\), who reconstructs \(\textsf {E} \) and sends \(\textsf {z} - \textsf {r} = \textsf {D} + \textsf {E} \) to parties in \(\mathcal {E}\).

  • Step 6: Parties non-interactively generate \(\langle \!\langle \textsf {z} - \textsf {r} \rangle \!\rangle \) (via \({\Pi }_{\tiny {\cdot \rightarrow \langle \!\langle \cdot \rangle \!\rangle }}\)) as explained earlier. Using \(\langle \!\langle \textsf {r} \rangle \!\rangle \) generated in step 1, parties compute \(\langle \!\langle \textsf {z} \rangle \!\rangle = \langle \!\langle \textsf {z} - \textsf {r} \rangle \!\rangle + \langle \!\langle \textsf {r} \rangle \!\rangle \), as required.
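Putting the above steps together, the following end-to-end sketch (our own cleartext simulation for \(n = 5\), \(t = 2\); message passing is modelled by local arithmetic, and the RSS helpers repeat the earlier sketches for self-containment) checks that the value opened towards \(\mathcal {E}\) indeed equals \(\textsf {z} - \textsf {r} = \textsf {a} \textsf {b} - \textsf {r} \).

```python
# Cleartext simulation of the semi-honest multiplication (Fig. 12); illustrative only.
from itertools import combinations
import secrets

MOD = 1 << 64

def rss_share(v, n, t):
    subsets = list(combinations(range(n), n - t))
    sh = {T: secrets.randbelow(MOD) for T in subsets[:-1]}
    sh[subsets[-1]] = (v - sum(sh.values())) % MOD
    return {i: {T: s for T, s in sh.items() if i in T} for i in range(n)}

def rss_to_additive(view, T):                      # Pi_{<.> -> ^T[.]}
    add, seen = {i: 0 for i in T}, set()
    for i in sorted(T):
        for T_j, s in view[i].items():
            if T_j not in seen:
                add[i] = (add[i] + s) % MOD
                seen.add(T_j)
    return add

def local_rss_product(va, vb, n, t):               # Pi_{<.>.<.> -> [.]}
    add = {i: 0 for i in range(n)}
    for T_j in combinations(range(n), n - t):
        b_add = rss_to_additive(vb, T_j)
        for i in T_j:
            add[i] = (add[i] + va[T_j[0]][T_j] * b_add[i]) % MOD
    return add

n, t = 5, 2
E_parties, D_parties, king = range(t + 1), range(t + 1, n), t   # P_king = P_{t+1}
a, b = 111, 222
lam_a, lam_b, r = (secrets.randbelow(MOD) for _ in range(3))
m_a, m_b = (a + lam_a) % MOD, (b + lam_b) % MOD    # masked values held by E
v_la, v_lb, v_r = (rss_share(x, n, t) for x in (lam_a, lam_b, r))

# preprocessing (steps 1-4): [Lambda_ab - r], ^E[lam_a], ^E[lam_b], and D to P_king
lam_ab, r_add = local_rss_product(v_la, v_lb, n, t), rss_to_additive(v_r, range(n))
lam_ab_r = {i: (lam_ab[i] - r_add[i]) % MOD for i in range(n)}
e_la, e_lb = rss_to_additive(v_la, E_parties), rss_to_additive(v_lb, E_parties)
D = sum(lam_ab_r[i] for i in D_parties) % MOD      # helpers' shares, summed by P_king

# online (step 5): evaluators send their ^E[E]-shares to P_king, who opens z - r
M_ab = m_a * m_b % MOD
e_E = {i: (-m_a * e_lb[i] - m_b * e_la[i] + lam_ab_r[i]) % MOD for i in E_parties}
e_E[king] = (e_E[king] + M_ab) % MOD               # only P_king adds M_ab
z_minus_r = (D + sum(e_E.values())) % MOD

# step 6: <<z>> = <<z - r>> + <<r>>; here we only check the opened value
assert (z_minus_r + r) % MOD == (a * b) % MOD
```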

Lemma 1

Protocol \(\varPi _{\textsf {mult} }\) (Fig. 12) incurs a communication of t elements in the preprocessing phase and 2t elements in 2 rounds in the online phase for multiplication when \(\textsf {isTr} = 0\).

Analysis: Observe that the communication towards \(P_{\textsf {king} }\) in steps 4 and 5 can be performed in parallel, resulting in an overall round complexity of two. Further, a communication of t elements is required in step 4 and 2t elements in step 5 (since \(P_{\textsf {king} }\in \mathcal {E}\)), giving a total communication of 3t ring elements. This complexity matches that of DN07\({}^{\star }\). However, our sharing semantics enables us to push some of the steps mentioned above to a preprocessing phase, resulting in a fast online phase, which is non-trivial to achieve in the case of DN07\({}^{\star }\). Elaborately, observe that since \(\textsf {r} , \lambda _{\textsf {a} }, \lambda _{\textsf {b} }\) are independent of the input (owing to our sharing semantics), the computation involving these terms, i.e. steps 1 to 4, can be moved to a preprocessing phase. This improves the online communication complexity by slashing the inward communication towards \(P_{\textsf {king} }\) by half. Thus, the online phase requires only 2t ring elements of communication while offloading t elements of communication to the preprocessing phase.

Note that a straightforward extension of the semi-honest multiplication of DN07\({}^{\star }\) to the preprocessing model, which can be derived from [44], does not provide an efficient solution. Although such a protocol has the same online complexity (2t elements) as our online phase, it inflates the overall communication cost by a factor of \(1.6\times \) over DN07\({}^{\star }\). Elaborately, the online communication cost of 2t elements can be attained by appropriately defining the sharing semantics and using the \(P_{\textsf {king} }\) approach, similar to our protocol. However, this requires parties to generate the sharing of \(\varLambda _{\textsf {a} \textsf {b} } = \lambda _{\textsf {a} }\cdot \lambda _{\textsf {b} }\) from the shares of \(\lambda _{\textsf {a} }\) and \(\lambda _{\textsf {b} }\) during the preprocessing phase, which amounts to a full-fledged multiplication and incurs a cost of 3t elements. This yields a protocol with a total cost of 5t elements, in comparison with the 3t cost of the all-online DN07\({}^{\star }\) protocol. Thus, departing from this approach, the novelty of our protocol lies in leveraging the interplay between the sharing semantics and the communication pattern among the parties, redesigning the latter so that the total cost of 3t elements does not change.

Furthermore, our protocol design allows parties in \(\mathcal {D}\) to remain shut off in the online phase, thereby reducing the system’s operational load. This is because parties in \(\mathcal {D}\) only contribute towards the computation of \(\textsf {D} \), which can be completed in the preprocessing phase. However, the preprocessing phase becomes function-dependent due to linear gates, for which the \(\lambda _{}\) values of the output wires cannot be chosen randomly. Concretely, if \(\textsf {c} \) is the output of a linear gate, say addition, with inputs \(\textsf {a} , \textsf {b} \), then \(\lambda _{\textsf {c} }\) cannot be chosen randomly and must be defined as \(\lambda _{\textsf {c} } = \lambda _{\textsf {a} } + \lambda _{\textsf {b} }\).

The complete semi-honest secure MPC protocol, \(\varPi _{\textsf {MPC} }^{\textsf {sh} }\), evaluating a function \(f(\cdot )\) appears in Fig. 13.

Fig. 13

Semi-honest: the complete MPC protocol

Online-only mode: Note that in instances where the function description is not known beforehand, our protocol can be run as an all-online protocol with a cost matching that of DN07\({}^{\star }\). This can be achieved in two ways. The first approach is to execute the steps described above as is, with the two sends towards \(P_{\textsf {king} }\) performed simultaneously. The second approach is to begin by performing the steps comprising the preprocessing phase, followed by the online phase steps. Although the second approach requires an additional round at the beginning to perform the preprocessing steps, it has the advantage that, after completing the preprocessing phase, parties in \(\mathcal {D}\) can be shut down. This reduces the operational cost of the system.

Incorporating truncation To deal with decimal values that arise in several applications, including the ones considered in this work, we operate on the fixed-point arithmetic (FPA) representation [22, 23], as described in Sect. 2. In this case, performing multiplication, \(\textsf {z} = \textsf {a} \textsf {b} \), results in increasing the number of fractional bits in the result of multiplication, \(\textsf {z} \), from \(\textsf {d} \) to \(2\textsf {d} \). To retain FPA semantics, it is required to truncate \(\textsf {z} \) by \(\textsf {d} \) bits, i.e. compute \({\textsf {z} }^{\textsf {d} } = \textsf {z} / 2^{\textsf {d} }\). For this, we extend the probabilistic truncation technique of [63, 64, 75] proposed in the small-party domain to the n-party setting. Given \((\textsf {r} , {\textsf {r} }^{\textsf {d} })\)-pair, with \({\textsf {r} }^{\textsf {d} } = \textsf {r} /2^{\textsf {d} }\), the truncated value of \(\textsf {z} \) can be obtained as \({\textsf {z} }^{\textsf {d} } = {(\textsf {z} - \textsf {r} )}^{\textsf {d} } + {\textsf {r} }^{\textsf {d} }\). Accuracy and correctness of this method follow from [73, 75].
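Before introducing the functionality that generates the pair, the following toy check (plain Python; names ours) illustrates the identity \({\textsf {z} }^{\textsf {d} } = {(\textsf {z} - \textsf {r} )}^{\textsf {d} } + {\textsf {r} }^{\textsf {d} }\) under the simplifying assumption that \(\textsf {z} - \textsf {r} \) does not wrap around the ring; the cited works analyse the wrap-around and the one-bit rounding error.

```python
# Toy check of the (r, r^d) truncation identity, ignoring ring wrap-around.
import random

L, d = 64, 13
for _ in range(10_000):
    z = random.randrange(1 << (L - 1))      # "small" value
    r = random.randrange(z + 1)             # toy choice so that z - r >= 0 (no wrap)
    r_d = r >> d                            # r^d = floor(r / 2^d)
    z_d = ((z - r) >> d) + r_d              # (z - r) is public in the protocol
    assert z_d in ((z >> d), (z >> d) - 1)  # correct up to a one-bit rounding error
```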

Fig. 14

Ideal functionality \(\mathcal {F}_{\textsf {TrGen} }\)

Our multiplication protocol can be modified to additionally perform truncation by incorporating the following two changes—(i) generate \(\langle \!\langle {\textsf {r} }^{\textsf {d} } \rangle \!\rangle \) in step , and (ii) compute \(\langle \!\langle {\textsf {z} }^{\textsf {d} } \rangle \!\rangle = \langle \!\langle {(\textsf {z} - \textsf {r} )}^{\textsf {d} } \rangle \!\rangle + \langle \!\langle {\textsf {r} }^{\textsf {d} } \rangle \!\rangle \), instead, in step . For (i), we rely on the ideal functionality \(\mathcal {F}_{\textsf {TrGen} }\) (Fig. 14) for computing \(\langle \!\langle \textsf {r} \rangle \!\rangle , \langle \!\langle {\textsf {r} }^{\textsf {d} } \rangle \!\rangle \). \(\mathcal {F}_{\textsf {TrGen} }\) can be instantiated using an appropriate MPC protocol, which is used as a black box in our multiplication. Thus, improvements in the MPC protocol that realizes \(\mathcal {F}_{\textsf {TrGen} }\) are directly inherited by our multiplication protocol. In our work, we instantiate \(\mathcal {F}_{\textsf {TrGen} }\) using \(\varPi _{\textsf {dsBits} }\) (Fig. 15), a slightly modified version of the doubly shared random bit generation protocol of [33], adapted to our n-party setting. Concretely, \(\varPi _{\textsf {dsBits} }\) generates \(\ell \) doubly shared random bits instead of a single bit, as done in the protocol of [33]. Here, a doubly shared random bit is a bit that is both arithmetic and Boolean shared. With respect to (ii), observe that it is a local operation, and hence performing truncation does not incur any additional overhead in the online phase. The details of \(\varPi _{\textsf {dsBits} }\), which follows from the protocol of [33], are presented next.

Truncation—Instantiating \(\mathcal {F}_{\textsf {TrGen} }\) We rely on a modified version of the doubly shared random bit (a bit that is arithmetic as well as Boolean shared) generation protocol of [33], extended to our n-party setting, to generate \(\langle \!\langle \textsf {r} \rangle \!\rangle , \langle \!\langle {\textsf {r} }^{\textsf {d} } \rangle \!\rangle \) as required to perform truncation. Here, \({\textsf {r} }^{\textsf {d} }\) represents the truncated (by d bits) version of \(\textsf {r} \in \mathbb {Z}_{2^{\ell }}\). The resulting protocol is referred to as \(\varPi _{\textsf {dsBits} }\) (Fig. 15).

Fig. 15

Semi-honest: doubly shared bits

At a high level, the generation of doubly shared bits relies on the property that, over fields, every nonzero quadratic residue has exactly two square roots, \(\pm \textsf {c} \). The work of [33], operating over rings, shows that something similar holds over rings as well. Concretely, according to lemma 4.1 of [33]: if \(\textsf {a} \) is such that \(\textsf {a} ^{2} \equiv _{\ell } 1\), then \(\textsf {a} \) is congruent mod \(2^{\ell }\) to one of \(1, -1, -1+2^{\ell -1}, 1+2^{\ell -1}\). The doubly shared bit generation protocol of [33] thus proceeds as follows. Sample a random odd \(\textsf {a} \in \mathbb {Z}_{2^{\ell +2}}\), compute and open \(\textsf {a} ^{2}\), and compute its smallest root \(\textsf {c} \) mod \(2^{\ell +2}\). Since \((\textsf {c} ^{-1} \textsf {a} )^{2} \equiv _{\ell +2} 1\), lemma 4.1 of [33] gives \(\textsf {c} ^{-1} \textsf {a} \in \{\pm 1, \pm 1 + 2^{\ell +1}\}\). That is, \(\textsf {c} ^{-1} \textsf {a} \) is congruent to \(\pm 1\) modulo \(2^{\ell + 1}\). Thus, \(\textsf {d} = \textsf {c} ^{-1} \textsf {a} + 1\) is congruent to 0 or 2 modulo \(2^{\ell +1}\) with equal probability. Hence, setting \(\textsf {b} = \textsf {d} /2\) outputs bit \(\textsf {b} = 0\) or bit \(\textsf {b} = 1\) with equal probability. Observe that the computation has to be performed over \(\mathbb {Z}_{2^{\ell +2}}\). Hence, in the protocol description, we use \(\ell +2\) in the superscript to distinguish shares of \(\textsf {x} \) over \(\mathbb {Z}_{2^{\ell +2}}\) from its shares over \(\mathbb {Z}_{2^{\ell }}\).
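The following plaintext sanity check (Python; all secret sharing stripped away, helper logic ours) exercises the square-root trick on a small ring.

```python
# Plaintext check of the square-root trick behind the doubly shared bit protocol.
import random

ell = 16
M = 1 << (ell + 2)                       # the trick operates modulo 2^(ell+2)

counts = {0: 0, 1: 0}
for _ in range(20_000):
    a = 2 * random.randrange(1 << (ell + 1)) + 1     # random odd a in Z_{2^(ell+2)}
    # The four square roots of a^2 mod 2^(ell+2) are {+-a, +-a + 2^(ell+1)}.
    roots = {a % M, (-a) % M, (a + (1 << (ell + 1))) % M, (-a + (1 << (ell + 1))) % M}
    c = min(roots)                       # smallest root, computable from the opened a^2
    d = (pow(c, -1, M) * a + 1) % M      # lemma 4.1: congruent to 0 or 2 mod 2^(ell+1)
    assert d % (1 << (ell + 1)) in (0, 2)
    b = (d // 2) % (1 << ell)            # the resulting random bit
    assert b in (0, 1)
    counts[b] += 1
print(counts)                            # roughly 50/50
```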

The main change in \(\varPi _{\textsf {dsBits} }\) compared to the protocol of [33] is that, to generate \(\langle \!\langle \textsf {r} \rangle \!\rangle , \langle \!\langle {\textsf {r} }^{\textsf {d} } \rangle \!\rangle \), \(\varPi _{\textsf {dsBits} }\) generates \(\ell \) random doubly shared bits \(\textsf {b} _0, \ldots , \textsf {b} _{\ell -1} \in \mathbb {Z}_{2}\) instead of a single one; it composes all \(\ell \) bits to obtain \(\textsf {r} \) and composes the higher \(\ell - d\) bits to obtain \({\textsf {r} }^{\textsf {d} }\), as follows.

$$\begin{aligned} \left( \langle \!\langle \textsf {r} \rangle \!\rangle , \langle \!\langle {\textsf {r} }^{\textsf {d} } \rangle \!\rangle \right) = \left( \sum _{i=0}^{\ell -1} 2^i \langle \!\langle \textsf {b} _i^\textsf{R} \rangle \!\rangle , \sum _{i=d}^{\ell -1} 2^{i-d} \langle \!\langle \textsf {b} _i^\textsf{R} \rangle \!\rangle \right) \end{aligned}$$
(2)
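A quick sanity check of Eq. (2) in plain Python (notation ours): composing the higher \(\ell - d\) bits indeed yields \(\textsf {r} \) shifted right by d.

```python
# Check that the two compositions in Eq. (2) are consistent.
import random

l, d = 64, 13
bits = [random.randrange(2) for _ in range(l)]
r = sum(b_i << i for i, b_i in enumerate(bits))
r_d = sum(b_i << (i - d) for i, b_i in enumerate(bits) if i >= d)
assert r_d == r >> d                     # matches r^d = r / 2^d
```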

Looking ahead, \(\varPi _{\textsf {dsBits} }\) can also be used to generate only a single doubly shared random bit, which finds use in other building blocks such as bit to arithmetic conversion and arithmetic to Boolean conversion. Thus, to distinguish the case when \((\langle \!\langle \textsf {r} \rangle \!\rangle , \langle \!\langle {\textsf {r} }^{\textsf {d} } \rangle \!\rangle )\) has to be generated from the case when only a single doubly shared bit is needed, \(\varPi _{\textsf {dsBits} }\) takes a bit \(\textsf {isTr} \) as input and outputs a doubly shared bit \((\langle \!\langle \textsf {b} ^\textsf{R} \rangle \!\rangle , \langle \!\langle \textsf {b} \rangle \!\rangle ^{\textbf{B}})\) if \(\textsf {isTr} = 0\), and \((\langle \!\langle \textsf {r} \rangle \!\rangle , \langle \!\langle {\textsf {r} }^{\textsf {d} } \rangle \!\rangle )\) otherwise. The protocol appears in Fig. 15.

A final thing to note is that the computation in \(\varPi _{\textsf {dsBits} }\) proceeds over secret-shared data. Thus, to generate shares of the doubly shared bit \(\textsf {b} \), one should be able to divide each share of \(\textsf {d} \) by 2, which necessitates \(\textsf {d} \) and its shares to be even. This holds true since \(\langle \textsf {d} \rangle ^{\ell +2} = \textsf {c} ^{-1} \langle \textsf {a} \rangle ^{\ell +2} + 1 = \textsf {c} ^{-1} \left( 2\langle \textsf {u} \rangle ^{\ell +2} + 1 \right) + 1 = 2 \textsf {c} ^{-1} \langle \textsf {u} \rangle ^{\ell +2} + \textsf {c} ^{-1} + 1\). Here, \(2 \textsf {c} ^{-1} \langle \textsf {u} \rangle ^{\ell +2}\) is even due to multiplication by 2, while \(\textsf {c} ^{-1} + 1\) is even since \(\textsf {c} ^{-1}\) is odd by definition.

Dot product Given \(\langle \!\langle \cdot \rangle \!\rangle \)-shares of vectors \(\vec {x}\) and \(\vec {y}\) (of equal size), dot product outputs \(\langle \!\langle \textsf {z} \rangle \!\rangle \) where \(\textsf {z} = \vec {x} \odot \vec {y}\) and \(\odot \) denotes the dot product operation. The design of our multiplication protocol enables easy extension to support dot product computation without incurring any overhead. Concretely, similar to multiplication,

$$\begin{aligned} \textsf {z} - \textsf {r} = \vec {x} \odot \vec {y} - \textsf {r} = \sum _{k} \left( \textsf {m} _{\textsf {x} _k} - \lambda _{\textsf {x} _k}\right) \left( \textsf {m} _{\textsf {y} _k} - \lambda _{\textsf {y} _k}\right) - \textsf {r} \end{aligned}$$
(3)

Each summand of \(\textsf {z} - \textsf {r} \) consists of product terms that can be generated exactly as in the multiplication protocol; these summands are then locally added before sending a single value towards \(P_{\textsf {king} }\). The formal protocol details appear in Fig. 16. Looking ahead, for matrix multiplication, each element of the resultant matrix can be computed via a dot product.
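As a rough illustration of why the online cost is independent of the vector length, the following Python sketch (a simplification in which the entire masked term is additively shared among \(\mathcal {E}\) only; names ours) sums all product terms locally so that each party in \(\mathcal {E}\) sends a single element to \(P_{\textsf {king} }\).

```python
# Toy dot-product reconstruction over Z_{2^64}: one element per party towards
# P_king, regardless of the vector length.
import random

L = 64
MOD = 1 << L
t, n_elems = 3, 8

def additive_shares(v, k):
    s = [random.randrange(MOD) for _ in range(k - 1)]
    return s + [(v - sum(s)) % MOD]

x = [random.randrange(MOD) for _ in range(n_elems)]
y = [random.randrange(MOD) for _ in range(n_elems)]
lam_x = [random.randrange(MOD) for _ in range(n_elems)]
lam_y = [random.randrange(MOD) for _ in range(n_elems)]
r = random.randrange(MOD)

m_x = [(x[k] + lam_x[k]) % MOD for k in range(n_elems)]
m_y = [(y[k] + lam_y[k]) % MOD for k in range(n_elems)]

# Preprocessed (input-independent) additive shares among E.
lam_x_E = [additive_shares(v, t + 1) for v in lam_x]
lam_y_E = [additive_shares(v, t + 1) for v in lam_y]
LamR_E = additive_shares((sum(lam_x[k] * lam_y[k] for k in range(n_elems)) - r) % MOD,
                         t + 1)

# Online: each party in E locally sums its product terms and sends ONE element.
E_shares = []
for i in range(t + 1):
    acc = LamR_E[i]
    for k in range(n_elems):
        acc -= m_x[k] * lam_y_E[k][i] + m_y[k] * lam_x_E[k][i]
    E_shares.append(acc % MOD)
E_shares[0] = (E_shares[0] + sum(m_x[k] * m_y[k] for k in range(n_elems))) % MOD
z_minus_r = sum(E_shares) % MOD                     # reconstructed by P_king
assert z_minus_r == (sum(xi * yi for xi, yi in zip(x, y)) - r) % MOD
```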

Fig. 16

Semi-honest: dot product protocol

Multi-input multiplication 3-input and 4-input multiplication protocols have showcased their wide applicability in improving the online phase complexity [64, 78, 80]. Concretely, computing \(\textsf {z} = \textsf {a} \textsf {b} \textsf {c} \) (3-input) or \(\textsf {z} = \textsf {a} \textsf {b} \textsf {c} \textsf {d} \) (4-input) naively requires at least two sequential invocations of a 2-input multiplication protocol in the online phase. Instead, a 3-input or 4-input multiplication protocol, respectively, enables performing this computation with the same online complexity as that of a single 2-input multiplication. Thus, we design 3-input and 4-input multiplication protocols by extending the techniques of [64, 80] to the n-party setting. Designing these protocols requires modifications in the preprocessing steps. Consider 3-input multiplication (Fig. 18), where the goal is to generate the \(\langle \!\langle \cdot \rangle \!\rangle \)-sharing of \(\textsf {z} = \textsf {a} \textsf {b} \textsf {c} \) given \(\langle \!\langle \textsf {a} \rangle \!\rangle , \langle \!\langle \textsf {b} \rangle \!\rangle , \langle \!\langle \textsf {c} \rangle \!\rangle \). Note that

$$\begin{aligned} \textsf {z} - \textsf {r}&= \textsf {a} \textsf {b} \textsf {c} - \textsf {r} = (\textsf {m} _{\textsf {a} } - \lambda _{\textsf {a} })(\textsf {m} _{\textsf {b} } - \lambda _{\textsf {b} })(\textsf {m} _{\textsf {c} } - \lambda _{\textsf {c} }) - \textsf {r} \\&= {\text {M}}_{\textsf {a} \textsf {b} \textsf {c} } - {\text {M}}_{\textsf {a} \textsf {c} } \lambda _{\textsf {b} } - {\text {M}}_{\textsf {b} \textsf {c} } \lambda _{\textsf {a} } - {\text {M}}_{\textsf {a} \textsf {b} } \lambda _{\textsf {c} } + \textsf {m} _{\textsf {a} } \varLambda _{\textsf {b} \textsf {c} } + \textsf {m} _{\textsf {b} } \varLambda _{\textsf {a} \textsf {c} } + \textsf {m} _{\textsf {c} } \varLambda _{\textsf {a} \textsf {b} } - \varLambda _{\textsf {a} \textsf {b} \textsf {c} } - \textsf {r} \end{aligned}$$
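A quick plaintext check of this expansion over \(\mathbb {Z}_{2^{64}}\) (Python; names ours):

```python
# Numeric check of the 3-input expansion z - r = (m_a - la)(m_b - lb)(m_c - lc) - r.
import random

MOD = 1 << 64
rnd = lambda: random.randrange(MOD)
a, b, c, r = rnd(), rnd(), rnd(), rnd()
la, lb, lc = rnd(), rnd(), rnd()
ma, mb, mc = (a + la) % MOD, (b + lb) % MOD, (c + lc) % MOD

lhs = (a * b * c - r) % MOD
rhs = (ma * mb * mc
       - ma * mc * lb - mb * mc * la - ma * mb * lc
       + ma * lb * lc + mb * la * lc + mc * la * lb
       - la * lb * lc - r) % MOD
assert lhs == rhs
```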

We follow an approach closely related to 2-input multiplication, with the difference that parties additionally need to generate the additive sharing of \(\varLambda _{\textsf {b} \textsf {c} }, \varLambda _{\textsf {a} \textsf {c} }\) and \(\varLambda _{\textsf {a} \textsf {b} \textsf {c} }\) during preprocessing. Given these sharings, parties proceed with a similar online phase as in \(\varPi _{\textsf {mult} }\) to compute the 3-input multiplication without inflating the online cost. Specifically, the following steps are performed in the preprocessing phase.

  • For generating \({}^{\mathcal {E}}{\left[ \varLambda _{\textsf {a} \textsf {c} } \right] }, {}^{\mathcal {E}}{\left[ \varLambda _{\textsf {b} \textsf {c} } \right] }\), parties first compute the respective additive sharings ( \(\left[ \cdot \right] \)) using \(\langle \lambda _{\textsf {a} } \rangle , \langle \lambda _{\textsf {b} } \rangle \) and \(\langle \lambda _{\textsf {c} } \rangle \) (via two invocations of \({\Pi }_{\tiny {\langle \cdot \rangle \cdot \langle \cdot \rangle \rightarrow \left[ \cdot \right] }}\)). Following this, parties in \(\mathcal {D}\) communicate their shares of \(\left[ \varLambda _{\textsf {a} \textsf {c} }\right] \) and \(\left[ \varLambda _{\textsf {b} \textsf {c} }\right] \) to \(P_{\textsf {king} }\), each masked with a random \(\left[ \cdot \right] \)-sharing of 0 (generated using \({\Pi }_{[0]}\)). This establishes \({}^{\mathcal {E}}{\left[ \varLambda _{\textsf {a} \textsf {c} } \right] }, {}^{\mathcal {E}}{\left[ \varLambda _{\textsf {b} \textsf {c} } \right] }\) among parties in \(\mathcal {E}\).

  • For generating \({}^{\mathcal {E}}{\left[ \varLambda _{\textsf {a} \textsf {b} } \right] }\), a slightly different approach is taken where parties first generate \(\langle \varLambda _{\textsf {a} \textsf {b} } \rangle \) using \(\langle \lambda _{\textsf {a} } \rangle , \langle \lambda _{\textsf {b} } \rangle \) (as explained later), followed by non-interactively generating \({}^{\mathcal {E}}{\left[ \varLambda _{\textsf {a} \textsf {b} } \right] }\) (via \({\Pi }_{\tiny {\langle \cdot \rangle \rightarrow {}^{{\mathcal {T}}}{\left[ \cdot \right] }}}\)). The reason for generating \(\langle \varLambda _{\textsf {a} \textsf {b} } \rangle \) (instead of directly generating \({}^{\mathcal {E}}{\left[ \varLambda _{\textsf {a} \textsf {b} } \right] }\)) is to facilitate generation of \({}^{\mathcal {E}}{\left[ \varLambda _{\textsf {a} \textsf {b} \textsf {c} } + \textsf {r} \right] }\) from \(\langle \varLambda _{\textsf {a} \textsf {b} } \rangle \), \(\langle \lambda _{\textsf {c} } \rangle \) and \(\left[ \textsf {r} \right] \), which closely follows the preprocessing phase of the 2-input multiplication. Specifically, parties can generate \(\left[ \varLambda _{\textsf {a} \textsf {b} \textsf {c} }\right] \) using \({\Pi }_{\tiny {\langle \cdot \rangle \cdot \langle \cdot \rangle \rightarrow \left[ \cdot \right] }}\) on \(\langle \varLambda _{\textsf {a} \textsf {b} } \rangle \), \(\langle \lambda _{\textsf {c} } \rangle \), followed by parties in \(\mathcal {D}\) communicating their \(\left[ \varLambda _{\textsf {a} \textsf {b} \textsf {c} }\right] \) shares, masked with the \(\left[ \cdot \right] \)-sharing of a random \(\textsf {r} \), to \(P_{\textsf {king} }\). This generates the \({}^{\mathcal {E}}{\left[ \varLambda _{\textsf {a} \textsf {b} \textsf {c} } + \textsf {r} \right] }\)-sharing required during the online phase.

  • Regarding generation of \(\langle \varLambda _{\textsf {a} \textsf {b} } \rangle \), all parties generate \(\langle \cdot \rangle \)-sharing of a random \(\gamma \in \mathbb {Z}_{2^{\ell }}\) non-interactively and convert it to \(\left[ \gamma \right] \). Parties then compute \(\left[ \varLambda _{\textsf {a} \textsf {b} } + \gamma \right] \) by computing \(\left[ \varLambda _{\textsf {a} \textsf {b} }\right] \) from \( \langle \lambda _{\textsf {a} } \rangle , \langle \lambda _{\textsf {b} } \rangle \) followed by summing it up with \(\left[ \gamma \right] \). Parties reconstruct this value towards \(P_{\textsf {king} }\), who then generates \(\langle \varLambda _{\textsf {a} \textsf {b} } + \gamma \rangle \), from which parties compute \(\langle \varLambda _{\textsf {a} \textsf {b} } \rangle = \langle \varLambda _{\textsf {a} \textsf {b} } + \gamma \rangle - \langle \gamma \rangle \), and thereby \({}^{\mathcal {E}}{\left[ \varLambda _{\textsf {a} \textsf {b} } \right] }\) by invoking \(\Pi _{\tiny {\langle \cdot \rangle \rightarrow {}^{\mathcal {E}}{\left[ \cdot \right] }}}\).

Table 3 Semi-honest: communication and round complexity for multi-input multiplications

Similarly, for 4-input multiplication, parties need to generate the additive sharing of \(\varLambda _{\textsf {a} \textsf {d} }, \varLambda _{\textsf {b} \textsf {d} }, \varLambda _{\textsf {c} \textsf {d} }, \varLambda _{\textsf {a} \textsf {b} \textsf {d} }, \varLambda _{\textsf {a} \textsf {c} \textsf {d} }, \varLambda _{\textsf {b} \textsf {c} \textsf {d} }, \varLambda _{\textsf {a} \textsf {b} \textsf {c} \textsf {d} }\) in addition to those required in the case of 3-input multiplication. Specifically, the generation of \({}^{\mathcal {E}}{\left[ \cdot \right] }\)-shares (additive shares) of \(\varLambda _{\textsf {a} \textsf {c} }, \varLambda _{\textsf {a} \textsf {d} }, \varLambda _{\textsf {b} \textsf {c} }, \varLambda _{\textsf {b} \textsf {d} }\) proceeds similar to the generation of \({}^{\mathcal {E}}{\left[ \varLambda _{\textsf {a} \textsf {c} } \right] }\) in \(\varPi _{\textsf {3-mult} }\). The generation of \({}^{\mathcal {E}}{\left[ \cdot \right] }\)-shares of \(\varLambda _{\textsf {a} \textsf {b} }, \varLambda _{\textsf {c} \textsf {d} }\) is carried out by first generating their \(\langle \cdot \rangle \)-shares. This enables the generation of \({}^{\mathcal {E}}{\left[ \cdot \right] }\)-shares of \(\varLambda _{\textsf {a} \textsf {b} \textsf {c} }, \varLambda _{\textsf {a} \textsf {b} \textsf {d} }, \varLambda _{\textsf {a} \textsf {c} \textsf {d} }, \varLambda _{\textsf {b} \textsf {c} \textsf {d} }\) following steps similar to the generation of \({}^{\mathcal {E}}{\left[ \varLambda _{\textsf {a} \textsf {c} } \right] }\) in \(\varPi _{\textsf {3-mult} }\). Finally, \({}^{\mathcal {E}}{\left[ \varLambda _{\textsf {a} \textsf {b} \textsf {c} \textsf {d} } - \textsf {r} \right] }\) is generated similar to \({}^{\mathcal {E}}{\left[ \varLambda _{\textsf {a} \textsf {b} \textsf {c} } + \textsf {r} \right] }\) in \(\varPi _{\textsf {3-mult} }\). We omit the formal details of the 4-input multiplication protocol, \(\varPi _{\textsf {4-mult} }\), as it closely follows \(\varPi _{\textsf {3-mult} }\). Table 3 compares the cost of computing \(\textsf {z} = \textsf {a} \textsf {b} \textsf {c} \) via sequential 2-input multiplications versus a single 3-input multiplication, and of computing \(\textsf {z} = \textsf {a} \textsf {b} \textsf {c} \textsf {d} \) via 2-input versus 4-input multiplications.

Fig. 17

4-input multiplication

The recent work of [52] provides a method to reduce the round complexity of circuit evaluation. They group (distinct) consecutive layers of the circuit into pairs and evaluate all gates in the two layers of a pair in parallel. Consider a multiplication gate with inputs \(\textsf {x} , \textsf {y} \) (obtained as output from a previous layer) and output \(\textsf {z} \). Their approach considers three cases: (i) neither \(\textsf {x} \) nor \(\textsf {y} \) is the output of a multiplication gate, (ii) exactly one among \(\textsf {x} ,\textsf {y} \) is the output of a multiplication gate, and (iii) both \(\textsf {x} , \textsf {y} \) are outputs of multiplication gates. We observe that cases (ii) and (iii) in their approach resemble multi-input multiplication, which allows evaluating the second layer of multiplication (\(\textsf {z} = \textsf {x} \cdot \textsf {y} \)) non-interactively, thereby saving rounds. For instance, consider a 2-layer sub-circuit as in Fig. 17, where \(\textsf {x} = \textsf {a} \cdot \textsf {b} , \textsf {y} = \textsf {c} \cdot \textsf {d} \) are outputs of multiplication gates that are fed as input to a multiplication gate in the next level. The approach of [52] allows computation of \(\textsf {z} = (\textsf {a} \cdot \textsf {b} ) \cdot (\textsf {c} \cdot \textsf {d} )\) in a single shot, which is equivalent to computing \(\textsf {z} \) via a 4-input multiplication in our case. Similarly, when only one of the inputs (either \(\textsf {x} \) or \(\textsf {y} \)) is the output of a multiplication, the computation of \(\textsf {z} = \textsf {x} \cdot \textsf {y} \) resembles a 3-input multiplication. Thus, cases (i), (ii), (iii) correspond to 2-input, 3-input, and 4-input multiplication, respectively, in our work, and these suffice to reduce the round complexity of any circuit evaluation by half. Hence, we restrict our focus to 3- and 4-input multiplication, although our technique can be generalized to gates with arbitrarily large fan-in.

Fig. 18

Semi-honest: 3-input multiplication protocol

4 Extending to Malicious Security

The ideal functionality \(\mathcal {F}_{\textsf {n\text{- }}PC }\) for evaluating a function f in the n-party setting while providing malicious security (with fairness) appears in Fig. 19.

The input sharing and output reconstruction protocols for the malicious setting can be obtained efficiently from the semi-honest protocol following standard approaches [44, 63, 81]. However, the same cannot be said about multiplication. Note that although a maliciously secure multiplication protocol can be achieved by compiling our semi-honest protocol using compiler techniques such as [2, 17], the resulting protocol has an expensive online phase. For instance, using the compiler of [2] yields a protocol that requires computation over extended rings and communication of 4t extended ring elements in the online phase. This is not favourable compared to working over plain rings, especially in the online phase. Further, compilers such as those in [17] require heavy computational machinery, such as zero-knowledge proofs in the online phase, which is also not desirable. Thus, to attain a computation- and communication-efficient online phase, departing from the aforementioned compiler-based approaches, we design a maliciously secure multiplication protocol that requires communicating 3t ring elements in each phase. It is worth noting that we can do this while retaining the benefit of requiring only \(t+1\) parties in the online phase (for most of the computation). The remaining t parties are required to come online only for a short one-time verification phase, which is deferred to the end of the computation. Deferring verification may result in a privacy breach [53]; however, we describe later why this breach does not arise in our protocol. With this background, we begin by describing the input sharing and output reconstruction protocols, and then discuss the challenges encountered in obtaining a maliciously secure multiplication protocol, together with their resolutions.

Fig. 19

Malicious: ideal functionality for evaluating function f with fairness

Input sharing This protocol is similar to the semi-honest one, where to enable \(P_s\) to generate \(\langle \!\langle \textsf {a} \rangle \!\rangle \), parties generate \(\langle \lambda _{\textsf {a} } \rangle \) such that \(P_s\) learns \(\lambda _{\textsf {a} }\), followed by \(P_s\) sending the masked value \(\textsf {m} _{\textsf {a} } = \textsf {a} + \lambda _{\textsf {a} }\) to all. However, note that a corrupt \(P_s\) can cause inconsistency among the honest parties by sending different masked values. To ensure the same value is received by all, parties perform a hash-based consistency check, denoted by \({\Pi }_{\textsf {agree} }\) (Sect. 2), where each party sends a hash of the received masked value(s) to every other party and \(\texttt{abort}\)s if it receives inconsistent hashes. Note that this check for all the inputs can be combined, thereby amortizing the cost. The formal protocol appears in Fig. 20.

Fig. 20

Malicious: input sharing protocol

Reconstruction To reconstruct a \(\langle \!\langle \cdot \rangle \!\rangle \)-shared value \(\textsf {a} \) towards \(P_s \in \mathcal {P}\), observe that each share that \(P_s\) misses is held by \(t+1\) other parties. Each of these parties sends the missing share to \(P_s\). If the received values for a share are consistent, \(P_s\) uses this value to perform reconstruction, and \(\texttt{abort}\)s otherwise. As an optimization, when reconstructing several values, one party can send the missing share while the other t parties send only its hash.

Fairness is a stronger security notion than security with abort: during reconstruction, either all parties learn the output or none do. For fair reconstruction, we extend the techniques in [81] to the n-party setting, where commitments are generated on each share of the mask of the output \(\textsf {z} \) (required to reconstruct \(\textsf {z} \)) by the \(t+1\) parties holding that share in the preprocessing phase.

During the online phase, these commitments are opened towards the respective parties if all the parties are alive (did not \(\texttt{abort}\)). Since each share of the mask is held by \(t+1\) parties and there is at least one honest party among every set of \(t+1\) parties, it is guaranteed that parties will obtain the correct opening for the commitment of the missing share from the honest party, and all honest parties can reconstruct the output. Else, if the adversary misbehaved at some step during the protocol, none of the honest parties will share the opening information and none will obtain the output. Note that to determine if all parties are alive, each party broadcasts a bit \(\textsf {alive} = 1\), where the broadcast can be realized on point-to-point channels using a broadcast protocol such as that of [41]. The formal protocol \(\varPi _{\textsf {Rec} }^{\textsf {fair} }\) appears in Fig. 21.
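A minimal sketch of the commit-then-open step (Python), using a hash-based commitment as a stand-in; the paper does not fix this instantiation, and all names are ours.

```python
# Hash-based commit/open used as a placeholder commitment scheme.
import hashlib, os

def commit(value: bytes):
    """Returns (commitment, opening) for the given value."""
    opening = os.urandom(32)
    return hashlib.sha256(opening + value).hexdigest(), opening

def verify(com: str, opening: bytes, value: bytes) -> bool:
    return com == hashlib.sha256(opening + value).hexdigest()

# Preprocessing: each of the t+1 holders of a mask share commits to it.
mask_share = os.urandom(8)                    # one share of the output mask
com, opening = commit(mask_share)

# Online: openings are released only if every party broadcast alive = 1; since at
# least one of the t+1 holders is honest, a correct opening is always available.
all_alive = True
if all_alive:
    assert verify(com, opening, mask_share)
```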

Fig. 21

Malicious: fair reconstruction protocol

Multiplication To enable generation of \(\langle \!\langle \textsf {z} \rangle \!\rangle = \langle \!\langle \textsf {a} \textsf {b} \rangle \!\rangle \) from \(\langle \!\langle \textsf {a} \rangle \!\rangle \) and \(\langle \!\langle \textsf {b} \rangle \!\rangle \), we retain the high-level ideas from the semi-honest protocol. Our task reduces to \(\textsf {(i)} \) generating additive shares of \(\varLambda _{\textsf {a} \textsf {b} }\) among parties in \(\mathcal {E}\) (i.e. \({}^{\mathcal {E}}{\left[ \varLambda _{\textsf {a} \textsf {b} } \right] }\)) given \(\langle \lambda _{\textsf {a} } \rangle \) and \(\langle \lambda _{\textsf {b} } \rangle \), in the preprocessing phase, and \(\textsf {(ii)} \) reconstructing \(\textsf {z} - \textsf {r} \) in the online phase. Given \(\textsf {(i)} \), computing \({}^{\mathcal {E}}{\left[ \textsf {z} - \textsf {r} \right] }\) in the online phase is a local operation. Given \(\textsf {(ii)} \), parties can invoke \(\varPi _{\tiny {\cdot \rightarrow \langle \!\langle \cdot \rangle \!\rangle }}\) to generate \(\langle \!\langle \textsf {z} - \textsf {r} \rangle \!\rangle \) and compute \(\langle \!\langle \textsf {z} \rangle \!\rangle = \langle \!\langle \textsf {z} - \textsf {r} \rangle \!\rangle + \langle \!\langle \textsf {r} \rangle \!\rangle \), where \(\langle \!\langle \textsf {r} \rangle \!\rangle \) is generated in the preprocessing phase, as discussed in the semi-honest case.

Fig. 22

Ideal functionality \(\mathcal {F}_{\textsf {MulPre} }\)

For task \(\textsf {(i)} \), our idea for the semi-honest case, of making parties in \(\mathcal {D}\) send their shares to \(P_{\textsf {king} }\), does not work in the presence of a malicious adversary. To address this, we make black-box use of a maliciously secure multiplication protocol, abstracted as a functionality \(\mathcal {F}_{\textsf {MulPre} }\) in Fig. 22, that computes \(\langle \varLambda _{\textsf {a} \textsf {b} } \rangle \) from \(\langle \lambda _{\textsf {a} } \rangle , \langle \lambda _{\textsf {b} } \rangle \). In this work, we instantiate \(\mathcal {F}_{\textsf {MulPre} }\) with the state-of-the-art multiplication protocol of [17] that provides \(\texttt{abort}\) security and requires 3t elements of (amortized) communication. Note that although the protocol of [17] relies on zero-knowledge proofs, this computation is carried out in the preprocessing phase of our multiplication protocol. Moreover, since preprocessing is done for many instances in one shot, the zero-knowledge proof can benefit from amortization. The parties then invoke \(\Pi _{\tiny {\langle \cdot \rangle \rightarrow {}^{\mathcal {E}}{\left[ \cdot \right] }}}\) to obtain \({}^{\mathcal {E}}{\left[ \varLambda _{\textsf {a} \textsf {b} } \right] }\) from \(\langle \varLambda _{\textsf {a} \textsf {b} } \rangle \). Looking ahead, \(\langle \varLambda _{\textsf {a} \textsf {b} } \rangle \) also aids in performing the online verification check.

For task \(\textsf {(ii)} \), in the online phase, we retain the idea of parties in \(\mathcal {E}\) optimistically reconstructing \(\textsf {z} - \textsf {r} \) from their additive shares (\({}^{\mathcal {E}}{\left[ \cdot \right] }\)-shares) to ensure that only the parties in \(\mathcal {E}\) remain active for most of the computation. Moreover, this optimistic reconstruction requires only \(\mathcal {O}(t)\) elements of communication rather than the \(\mathcal {O}(t^{2})\) required for reconstruction from \(\langle \cdot \rangle \)-shares (which is what will be used later for verification, albeit only for a single such reconstruction). Thus, similar to the semi-honest protocol, parties in \(\mathcal {E}\) optimistically reconstruct \(\textsf {z} - \textsf {r} \) towards \(P_{\textsf {king} }\), who further sends the reconstructed value to the parties in \(\mathcal {E}\). In the malicious setting, this approach requires additional care since a malicious party may send a wrong \({}^{\mathcal {E}}{\left[ \cdot \right] }\)-share of \(\textsf {z} - \textsf {r} \) to \(P_{\textsf {king} }\), or a malicious \(P_{\textsf {king} }\) may send an incorrectly reconstructed (or inconsistent) \(\textsf {z} - \textsf {r} \) to the parties. To account for these behaviours, the protocol is augmented with a short one-off verification phase to verify the consistency and correctness of \(\textsf {z} - \textsf {r} \). This phase is executed at the end of the protocol and requires the presence of all parties, and hence the possession of \(\textsf {z} - \textsf {r} \) by all. This is in contrast to the semi-honest protocol, where \(\textsf {z} - \textsf {r} \) is given only to parties in \(\mathcal {E}\). To keep \(\mathcal {D}\) disengaged for most of the online phase, sending \(\textsf {z} - \textsf {r} \) to them is deferred till the end of the protocol. This send is a one-off and can be combined for all multiplication gates. Details of the verification protocol \(\varPi _{\textsf {Vrfy} }\) (Fig. 23) are given next.

Fig. 23

Malicious: verification protocol for all multiplication gates

Verification comprises two checks—a consistency check to first verify that \(P_{\textsf {king} }\) has indeed sent the same \(\textsf {z} - \textsf {r} \) to all the parties, followed by a correctness check to verify the correctness of \(\textsf {z} - \textsf {r} \). For the former, parties perform a hash-based consistency check of \(\textsf {z} - \textsf {r} \) and abort in case of any inconsistency. If \(\textsf {z} - \textsf {r} \) is consistent, parties verify its correctness. The high-level idea for verifying correctness is to robustly reconstruct \(\textsf {z} - \textsf {r} \), but now from its \(\langle \cdot \rangle \)-shares (which can be computed given \(\langle \lambda _{\textsf {a} } \rangle , \langle \lambda _{\textsf {b} } \rangle , \langle \varLambda _{\textsf {a} \textsf {b} } \rangle \) that are generated in the preprocessing phase). Parties can then verify if this reconstructed value equals the value received from \(P_{\textsf {king} }\). Concretely, this is equivalent to robustly reconstructing \(\langle \varOmega \rangle = \langle \textsf {z} - \textsf {r} - ({\text {M}}_{\textsf {a} \textsf {b} } -\textsf {m} _{\textsf {a} } \lambda _{\textsf {b} } - \textsf {m} _{\textsf {b} } \lambda _{\textsf {a} } + \varLambda _{\textsf {a} \textsf {b} } - \textsf {r} ) \rangle \), where \(\textsf {z} - \textsf {r} \) is the value received from \(P_{\textsf {king} }\), and verifying if \(\varOmega = 0\). For robust reconstruction of \(\langle \varOmega \rangle \), every party sends its \(\langle \cdot \rangle \)-share to every other party who misses this share, and \(\texttt{abort}\)s in case of inconsistencies in the received values. Elaborately, reconstruction of \(\varOmega \) towards \(P_s \in \mathcal {P}\) proceeds as follows. For each missing \(\langle \cdot \rangle \)-share of \(\varOmega \) at \(P_s\), each of the \(t+1\) parties holding this share sends it to \(P_s\). \(P_s\) uses this share for reconstruction if all the \(t+1\) received values are consistent, else it \(\texttt{abort}\)s. The presence of at least one honest party among the \(t+1\) guarantees that inconsistency, if any, can be detected. Since each share in \(\langle \varOmega \rangle \) is held by \(t+1\) parties, comprising at least one honest party, any cheating by up to t corrupt parties is guaranteed to be detected. Since reconstruction should happen towards at least \(t+1\) parties, communicating a missing share towards all these \(t+1\) parties requires \(\mathcal {O}(t^2)\) communication in total, and there are \(m = \left( {\begin{array}{c}n\\ h\end{array}}\right) - \left( {\begin{array}{c}n-1\\ h-1\end{array}}\right) \) such missing shares. Note that the cost of this reconstruction can be optimized using standard techniques [2, 27], where the correctness of \(\textsf {z} - \textsf {r} \) for several multiplication gates can be verified with a single reconstruction, by reconstructing a random linear combination of the \(\varOmega \) values of these gates and verifying equality with 0. Thus, only one robust reconstruction from \(\langle \cdot \rangle \)-shares is required for several multiplication gates, and its cost gets amortized across the gates being verified.
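As a plaintext illustration of the batched check (Python; a simplification with no secret sharing, names ours), the per-gate \(\varOmega \) below is zero for honest values, and a single random linear combination covers many gates; the ring-specific caveat is discussed next.

```python
# Batched correctness check: Omega per gate, then one random linear combination.
import random

MOD = 1 << 64
rnd = lambda: random.randrange(MOD)

def omega(received, m_a, m_b, lam_a, lam_b, r):
    expected = (m_a * m_b - m_a * lam_b - m_b * lam_a + lam_a * lam_b - r) % MOD
    return (received - expected) % MOD

gates = []
for _ in range(100):
    a, b, lam_a, lam_b, r = rnd(), rnd(), rnd(), rnd(), rnd()
    m_a, m_b = (a + lam_a) % MOD, (b + lam_b) % MOD
    received = (a * b - r) % MOD            # what an honest P_king would send
    gates.append((received, m_a, m_b, lam_a, lam_b, r))

theta = [rnd() for _ in gates]
batched = sum(th * omega(*g) for th, g in zip(theta, gates)) % MOD
assert batched == 0                          # an honest run passes the check
```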

It is worth noting that this random linear combination technique does not trivially work over rings. This is due to the existence of zero divisors, because of which a nonzero error may vanish in the linear combination with probability 1/2 (the cheating probability of the adversary) [2]. Hence, to obtain the desired security, the verification check is repeated \(\kappa \) times, where \(\kappa \) is the security parameter. This bounds the cheating probability of the adversary by \(1/2^{\kappa }\). Another approach is to perform the verification over extended rings [15, 16]. Specifically, verification operations are carried out over the ring \(\mathbb {Z}_{2^{\ell }}[x]/f(x)\) of all polynomials with coefficients in \(\mathbb {Z}_{2^{\ell }}\) modulo a degree-d polynomial f(x) that is irreducible over \(\mathbb {Z}_{2}\). Each element of \(\mathbb {Z}_{2^{\ell }}\) is lifted to a polynomial in \(\mathbb {Z}_{2^{\ell }}[x]/f(x)\), which increases the communication required to perform verification by a factor of d.
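A quick numeric illustration of the zero-divisor issue (Python; toy values ours): the nonzero error \(2^{\ell -1}\) is annihilated by a random coefficient with probability 1/2, so a single check misses it half the time, which is exactly why the check is repeated \(\kappa \) times or lifted to an extended ring.

```python
# Estimate how often a single random coefficient kills the error delta = 2^(l-1).
import random

L = 64
MOD = 1 << L
delta = 1 << (L - 1)                         # nonzero, but a zero divisor in Z_{2^L}

trials = 100_000
missed = sum(1 for _ in range(trials)
             if (random.randrange(MOD) * delta) % MOD == 0)
print(missed / trials)                       # ~0.5: theta * delta = 0 whenever theta is even
```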

Fig. 24

Malicious: multiplication protocol

The maliciously secure multiplication protocol (see Fig. 24) can be broken down into the following:

  • Preprocessing phase, which involves generation of \(\langle \varLambda _{\textsf {a} \textsf {b} } \rangle \) by invoking \(\mathcal {F}_{\textsf {MulPre} }\). Malicious behaviour, if any, will be caught by \(\mathcal {F}_{\textsf {MulPre} }\). \(\langle \varLambda _{\textsf {a} \textsf {b} } \rangle \) is non-interactively converted into \({}^{\mathcal {E}}{\left[ \cdot \right] }\)-shares of \(\varLambda _{\textsf {a} \textsf {b} }\). \({}^{\mathcal {E}}{\left[ \lambda _{\textsf {a} } \right] }, {}^{\mathcal {E}}{\left[ \lambda _{\textsf {b} } \right] }\) are also generated non-interactively.

  • Generation of \({}^{\mathcal {E}}{\left[ \cdot \right] }\)-shares of \(\lambda _{\textsf {a} }, \lambda _{\textsf {b} }, \varLambda _{\textsf {a} \textsf {b} }\) during preprocessing enables computation of \({}^{\mathcal {E}}{\left[ \textsf {z} - \textsf {r} \right] }\) in the online phase, and thereby reconstruction of \(\textsf {z} - \textsf {r} \) via \(P_{\textsf {king} }\). The crucial point to note here is that this requires the presence of only parties in \(\mathcal {E}\) in the online phase. This is followed by non-interactive generation of \(\langle \!\langle \textsf {z} - \textsf {r} \rangle \!\rangle \) from which \(\langle \!\langle \textsf {z} \rangle \!\rangle \) is computed as \(\langle \!\langle \textsf {z} \rangle \!\rangle = \langle \!\langle \textsf {z} - \textsf {r} \rangle \!\rangle + \langle \!\langle \textsf {r} \rangle \!\rangle \), where \(\langle \!\langle \textsf {r} \rangle \!\rangle \) is generated during preprocessing.

  • Finally, to catch any malicious behaviour in the online phase, the correctness of the generated \(\langle \!\langle \textsf {z} \rangle \!\rangle \) is checked in the verification phase, simultaneously for all \(\textsf {z} \) that are outputs of multiplication gates. This is done by invoking \(\varPi _{\textsf {Vrfy} }\). Note that before this verification begins, \(P_{\textsf {king} }\) sends the \(\textsf {z} - \textsf {r} \) values corresponding to all multiplication gates to parties in \(\mathcal {D}\) in a single shot.

As pointed out in [53], deferring the correctness check to later may result in a privacy breach when using a sharing scheme that allows for redundancy (such as RSS or Shamir sharing). We next discuss this breach and explain how it is overcome in our case. We begin with explaining the attack that a malicious adversary can launch if reconstruction towards \(P_{\textsf {king} }\) is performed by relying on RSS (or Shamir sharing), naively. Consider a circuit with two sequential multiplication gates with the output of the first gate, say \(\textsf {a} \), going as input to the second gate. Let \(\textsf {b} \) denote the other input to the second multiplication gate and \(\textsf {z} \) denote its output. In a \(P_{\textsf {king} }\)-based approach for multiplication, t parties send their respective (RSS/Shamir) share of a masked value to \(P_{\textsf {king} }\). In particular, for the first multiplication gate in the circuit mentioned above, t parties send their corresponding share of \(\textsf {a} - \textsf {r} _{\textsf {a} }\) to \(P_{\textsf {king} }\), who reconstructs it and sends it back to all. Delaying the verification allows a malicious \(P_{\textsf {king} }\) to send an inconsistent value of \(\textsf {a} - \textsf {r} _{\textsf {a} }\) to the parties, using which it can learn the private input \(\textsf {b} \), as follows. Suppose \(P_{\textsf {king} }\) sends the correct \(\textsf {a} - \textsf {r} _{\textsf {a} }\) to all but one out of the remaining t online parties, to which it sends \(\textsf {a} - \textsf {r} _{\textsf {a} } + \delta \). Owing to this, for the next multiplication gate \(P_{\textsf {king} }\) receives the shares of \(\textsf {z} - \textsf {r} _{\textsf {z} }\) from the former \(t-1\) parties and a share of \((\textsf {a} + \delta ) \textsf {b} - \textsf {r} _{\textsf {z} } = \textsf {z} + \delta \textsf {b} - \textsf {r} _{\textsf {z} }\) from the latter party. Having obtained these and additionally using the shares of \(\textsf {z} - \textsf {r} _{\textsf {z} }\) and \(\textsf {z} + \delta \textsf {b} - \textsf {r} _{\textsf {z} }\) corresponding to the t corrupt parties including itself, a malicious \(P_{\textsf {king} }\) can reconstruct \(\textsf {z} - \textsf {r} _{\textsf {z} }\) as well as \(\textsf {z} + \delta \textsf {b} - \textsf {r} _{\textsf {z} }\), thus learning \(\textsf {b} \) in clear. The crux of this attack lies in the fact that a malicious adversary corrupting t parties including \(P_{\textsf {king} }\) already possesses t shares each of \(\textsf {z} - \textsf {r} _{\textsf {z} }\) and \(\textsf {z} + \delta \textsf {b} - \textsf {r} _{\textsf {z} } \). Thus, an additional share of these obtained from the online parties allows it to carry out the attack successfully. However, this attack does not hold when working with additive (\({}^{\mathcal {E}}{\left[ \cdot \right] }\)) sharing, which is what prevents our protocol from falling prey to this attack.
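The arithmetic behind this leakage can be seen in a short check (Python; toy values ours): from both reconstructions the adversary learns \(\delta \textsf {b} \), and hence \(\textsf {b} \) whenever it picks an odd (invertible) \(\delta \).

```python
# How the two reconstructions leak b when delta is invertible in Z_{2^64}.
MOD = 1 << 64
a, b, r_z, delta = 11, 123456789, 987654321, 3        # delta odd, hence invertible
honest = (a * b - r_z) % MOD                           # z - r_z
poisoned = ((a + delta) * b - r_z) % MOD               # z + delta*b - r_z
recovered_b = ((poisoned - honest) * pow(delta, -1, MOD)) % MOD
assert recovered_b == b
```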

Elaborately, recall that in our protocol, during reconstruction towards \(P_{\textsf {king} }\), any redundancy due to \(\langle \!\langle \cdot \rangle \!\rangle \)-sharing is eliminated with parties switching to \({}^{\mathcal {E}}{\left[ \cdot \right] }\)-sharing (additive sharing among parties in \(\mathcal {E}\)). Due to this, even if \(P_{\textsf {king} }\) sends inconsistent values to the parties, the \({}^{\mathcal {E}}{\left[ \cdot \right] }\)-share of \(\textsf {z} - \textsf {r} _{\textsf {z} }\) or \(\textsf {z} + \delta \textsf {b} - \textsf {r} _{\textsf {z} }\) that it receives corresponds to an additive share defined with respect to parties in \(\mathcal {E}\). Hence, this additionally received additive share cannot be combined with the shares held by the t corrupt parties to perform the reconstruction. Thus, the earlier strategy of \(P_{\textsf {king} }\) of using these additional shares in conjunction with the t corrupt shares to reconstruct \(\textsf {z} - \textsf {r} _{\textsf {z} }\) and \(\textsf {z} + \delta \textsf {b} - \textsf {r} _{\textsf {z} }\) does not hold. The primary reason which prevents the attack is the elimination of redundancy in the sharing scheme by switching to \((t+1)\)-out-of-\((t+1)\) additive sharing (\({}^{\mathcal {E}}{\left[ \cdot \right] }\)-sharing) for the set of parties in \(\mathcal {E}\), which is known to withstand this attack [53]. However, this privacy breach persists in the protocol of [44].

Fig. 25

Malicious: the complete MPC protocol

Discussion about [44]: The above attack can be circumvented by making \(P_{\textsf {king} }\) broadcast the reconstructed value to all the parties, as discussed in [44]. To further optimize the protocol by requiring only \(t+1\) parties to be active in the online phase, they rely on broadcast with abort, which comprises two phases—(i) send: where \(P_{\textsf {king} }\) sends the value to the recipients, and (ii) verification: where the recipients exchange a hash of the received value among themselves and abort in case of inconsistency. However, for amortization, they defer the verification (even with respect to broadcast) to the end of the protocol, thus making their protocol susceptible to the aforementioned attack. We observe that one fix is to perform the verification with respect to broadcast after each level in the circuit. This, however, requires all the parties to be online. An optimization is to let only the \(t+1\) parties in the online phase perform this verification after each level, thereby allowing the remaining t parties to be shut off. Specifically, this involves performing verification where the online parties exchange the hash of the received value and abort in case of inconsistency. When the remaining t (offline) parties come online towards the end of the protocol for verifying the correctness of the multiplication gates, this verification should be preceded by first verifying the consistency of the values broadcast by \(P_{\textsf {king} }\) to the offline parties (which involves the participation of all n parties). Since the online phase involves broadcasting the reconstructed value to t other online parties, this amounts to an exchange of \(\mathcal {O}(t^2)\) hashes after each level, thereby incurring a circuit-depth-dependent overhead in both the communication cost and the rounds. For the communication cost to get amortized, the circuit must have \(\mathcal {O}(t^2)\) gates at each level. However, the overhead in terms of the number of rounds persists.

Lemma 2

Protocol \(\varPi _{\textsf {mult} }^{\textsf {M} }\) (Fig. 24) incurs a communication of 3t elements in the preprocessing phase and 3t elements in 2 rounds in the online phase for multiplication when \(\textsf {isTr} = 0\).

The complete MPC protocol The maliciously secure MPC protocol, \(\varPi _{\textsf {MPC} }^{\textsf {mal} }\), evaluating a function \(f(\cdot )\) appears in Fig. 25.

Multiplication with truncation Similar to the semi-honest protocol, truncation can be incorporated in the malicious multiplication as well without inflating the online communication. For this, we rely on maliciously secure ideal functionality, \(\mathcal {F}_{\textsf {TrGen} }^{\textsf {M} }\) (Fig. 26), to generate the \(\langle \!\langle \cdot \rangle \!\rangle \)-shares of \((\textsf {r} , {\textsf {r} }^{\textsf {d} })\).

Fig. 26

Ideal functionality \(\mathcal {F}_{\textsf {TrGen} }^{\textsf {M} }\)

\(\mathcal {F}_{\textsf {TrGen} }^{\textsf {M} }\) (Fig. 26) can be realized using the maliciously secure variant of \(\varPi _{\textsf {dsBits} }\) (Fig. 15), denoted as \(\varPi _{\textsf {dsBits} }^{\textsf {M} }\) [33]. This protocol is similar to the semi-honest one except for the following differences to account for malicious behaviour. The \(\langle \cdot \rangle \)-shares of \(\textsf {e} _i = \textsf {a} _i^{2}\) are generated by invoking \(\varPi _{\textsf {multPre} }\) instead of relying on \({\Pi }_{\tiny {\langle \cdot \rangle \cdot \langle \cdot \rangle \rightarrow \left[ \cdot \right] }}\). This ensures the generation of correct \(\langle \cdot \rangle \)-shares of \(\textsf {e} _i\), and malicious behaviour, if any, leads to an \(\texttt{abort}\). Following this, \(\textsf {e} _i\) is either correctly reconstructed towards all parties or the parties \(\texttt{abort}\). This ensures that an adversary cannot cause the reconstruction of an incorrect \(\textsf {e} _i\). Concretely, for reconstruction, similar to multiplication, every party sends its \(\langle \cdot \rangle \)-share to every other party, and \(\texttt{abort}\)s in case of inconsistencies in the received values. The rest of the protocol steps (which are non-interactive) remain unchanged, and hence, a formal protocol description is omitted.

Dot product To generate \(\langle \!\langle \textsf {z} \rangle \!\rangle \) for \(\textsf {z} = \vec {\textsf {x} } \odot \vec {\textsf {y} }\), where \(\vec {\textsf {x} }\) and \(\vec {\textsf {y} }\) are \(\langle \!\langle \cdot \rangle \!\rangle \)-shared vectors (of equal size), protocol \(\varPi _{\textsf {dp} }^{\textsf {M} }\) proceeds similar to the semi-honest variant \(\varPi _{\textsf {dp} }\) (Fig. 16). During the preprocessing phase, parties in \(\mathcal {E}\) obtain \({}^{\mathcal {E}}{\left[ \cdot \right] }\)-shares of \(\varLambda _{\vec {\textsf {x} }\odot \vec {\textsf {y} }}\) and of \(\lambda _{\textsf {x} _k}, \lambda _{\textsf {y} _k}\) for every element index k. Although the latter can be computed by parties locally with an invocation of \(\Pi _{\tiny {\langle \cdot \rangle \rightarrow {}^{\mathcal {E}}{\left[ \cdot \right] }}}\) (Fig. 7), computation of the former differs significantly from the semi-honest protocol. For this, we extend the ideas from SWIFT [63] and generate \(\langle \varLambda _{\vec {\textsf {x} }\odot \vec {\textsf {y} }} \rangle \) by executing a maliciously secure dot product protocol \(\varPi _{\textsf {dotPre} }\) over \(\langle \cdot \rangle \)-shares (abstracted as a functionality \(\mathcal {F}_{\textsf {DotPPre} }\) in Fig. 27). Specifically, parties invoke \(\varPi _{\textsf {dotPre} }\) on the \(\langle \cdot \rangle \)-shares of \(\lambda _{\textsf {x} _k}\) and \(\lambda _{\textsf {y} _k}\) to compute \(\langle \varLambda _{\vec {\textsf {x} }\odot \vec {\textsf {y} }} \rangle \), followed by an invocation of \(\Pi _{\tiny {\langle \cdot \rangle \rightarrow {}^{\mathcal {E}}{\left[ \cdot \right] }}}\) to obtain \({}^{\mathcal {E}}{\left[ \varLambda _{\vec {\textsf {x} }\odot \vec {\textsf {y} }} \right] }\). Having computed the necessary preprocessing data, the online phase proceeds similarly to the semi-honest protocol \(\varPi _{\textsf {dp} }\) (Fig. 16), where parties reconstruct \(\textsf {z} - \textsf {r} \) via \(P_{\textsf {king} }\) as per Eq. (3). To account for misbehaviour, the protocol is augmented with a verification phase similar to that in malicious multiplication.

Observe that a trivial realization of \(\mathcal {F}_{\textsf {DotPPre} }\) reduces to one multiplication per vector element. However, we extend the ideas from [16, 17, 63] and rely on a distributed zero-knowledge proof [17] to eliminate the vector-size dependency in the preprocessing phase. Concretely, we instantiate \(\mathcal {F}_{\textsf {DotPPre} }\) using a semi-honest dot product protocol [54] whose cost matches that of semi-honest multiplication [35] (and thus is independent of the vector size), followed by a verification phase to verify the correctness of the dot product computation. For the verification, we extend the verification technique for multiplication in [17] to verify the correctness of the dot product, such that the cost of verification can be amortized away over multiple dot products, thereby resulting in vector-size-independent preprocessing.

Fig. 27

Ideal functionality for \(\varPi _{\textsf {dotPre} }\)

Elaborately, the semi-honest dot product protocol in [54] takes as input \(\langle \vec {x} \rangle , \langle \vec {y} \rangle \), where \(\vec {x}, \vec {y}\) are vectors (of equal size), and outputs \(\langle \textsf {z} \rangle = \langle \vec {x} \odot \vec {y} \rangle \). For this, parties invoke \({\Pi }_{\tiny {\langle \cdot \rangle \cdot \langle \cdot \rangle \rightarrow \left[ \cdot \right] }}\) on each element in \(\vec {x}, \vec {y}\) and sum these up to generate \(\left[ \rho \right] = \left[ \vec {x} \odot \vec {y}\right] \). These shares are randomized by summing with \(\left[ \textsf {r} \right] \) (converted from \(\langle \textsf {r} \rangle \)) for a random \(\textsf {r} \), and the sum \(\textsf {z} + \textsf {r} = (\vec {x} \odot \vec {y} ) + \textsf {r} \) is reconstructed towards \(P_{\textsf {king} }\), who sends the reconstructed \(\textsf {z} + \textsf {r} \) to parties in \(\mathcal {E}\). All parties then non-interactively generate \(\langle \textsf {z} + \textsf {r} \rangle \) by setting one of its shares to \(\textsf {z} + \textsf {r} \) and the others to 0. Given \(\langle \textsf {z} + \textsf {r} \rangle , \langle \textsf {r} \rangle \), parties can compute \(\langle \textsf {z} \rangle = \langle \textsf {z} + \textsf {r} \rangle - \langle \textsf {r} \rangle \). Observe that communication of \(\left[ \textsf {z} + \textsf {r} \right] \) to \(P_{\textsf {king} }\) requires 2t elements, while communicating \(\textsf {z} + \textsf {r} \) to parties in \(\mathcal {E}\) requires t elements, resulting in a cost of 3t elements, matching that of semi-honest multiplication [35]. The correctness of m dot product triples \((\vec {x_1}, \vec {y_1}, \textsf {z} _1), \ldots , (\vec {x_m}, \vec {y_m}, \textsf {z} _m)\) can be verified by taking a random linear combination,

$$\begin{aligned} \beta = \sum _{k=1}^{m} \theta _k \left( \textsf {z} _k - \vec {x}_{k} \odot \vec {y}_{k} \right) \end{aligned}$$

where \(\{\theta _k\}_{k=1}^{m}\) is randomly chosen by all the parties, and checking if \(\beta = 0\). Given \(\langle \cdot \rangle \)-shares of \(\vec {x}_{k}, \vec {y}_{k}, \textsf {z} _k\) for \(k \in \{1, \ldots , m\}\), parties can compute an additive share ( \(\left[ \cdot \right] \)-share) of \(\beta \) by invoking \({\Pi }_{\tiny {\langle \cdot \rangle \cdot \langle \cdot \rangle \rightarrow \left[ \cdot \right] }}\). However, since \(\left[ \cdot \right] \)-sharing does not allow for robust reconstruction, the approach is to generate \(\langle \beta \rangle \) and then robustly reconstruct it and check equality with 0. To generate \(\langle \beta \rangle \), parties first \(\langle \cdot \rangle \)-share (via \({\Pi }_{\langle \cdot \rangle }\), Sect. 2) their \(\left[ \cdot \right] \)-share of

$$\begin{aligned} \psi = \sum _{k=1}^{m} \theta _k \left( \vec {x}_{k} \odot \vec {y}_{k} \right) \end{aligned}$$

Let \(\psi ^i\) denote the \(\left[ \cdot \right] \)-share of \(\psi \) held by \(P_i\). Given \(\langle \psi ^i \rangle \) for \(i \in \{1, \ldots , n\}\), parties can compute

$$\begin{aligned} \langle \beta \rangle = \sum _{k=1}^{m} \theta _k \cdot \langle \textsf {z} _k \rangle - \sum _{i=1}^{n} \langle \psi ^i \rangle \end{aligned}$$

and reconstruct \(\beta \). It is, however, required to ensure that every party \(P_i\) \(\langle \cdot \rangle \)-shares the correct \(\psi ^i\). To check the correctness of \(\psi ^{i}\), parties need to verify if

$$\begin{aligned} \psi ^i = \sum _{k=1}^{m} \sum _{j} \theta _k \cdot \textsf {x} ^{i}_{kj} \textsf {y} ^{i}_{kj} \end{aligned}$$
(4)

where \(\textsf {x} ^{i}_{kj}, \textsf {y} ^{i}_{kj}\) denote the \(\langle \cdot \rangle \)-shares of \(\textsf {x} _{kj}, \textsf {y} _{kj}\) held by \(P_i\). Note that, following along the lines of \({\Pi }_{\tiny {\cdot \rightarrow \langle \!\langle \cdot \rangle \!\rangle }}\), parties can generate the \(\langle \cdot \rangle \)-shares of \(\textsf {x} ^{i}_{kj}, \textsf {y} ^{i}_{kj}\) from the \(\langle \cdot \rangle \)-shares of \(\textsf {x} _{kj}, \textsf {y} _{kj}\) non-interactively. Now, setting \(\textsf {a} _{kj} = \theta _k \textsf {x} ^{i}_{kj}, \textsf {b} _{kj} = \textsf {y} ^{i}_{kj}, \textsf {c} = \psi ^i\), for \(k \in \{1, \ldots , m \}\), Eq. (4) can be rewritten as

$$\begin{aligned} \textsf {c} = \sum _{k=1}^{m} \sum _{j} \textsf {a} _{kj} \cdot \textsf {b} _{kj} \end{aligned}$$
(5)

The correctness of Eq. (5) can be verified by invoking \(\mathcal {F}^{\texttt{abort}}_{\textsf {proveDeg2Rel} }\) (see section 3 of [17] for the definition and its instantiation), which takes as input \(\langle \cdot \rangle \)-shares of \({\tilde{\textsf {a} }}_{l}, {\tilde{\textsf {b} }}_{l}, \textsf {c} \) for , which are known in clear to party \(P_i\), and verifies if Eq. (5) holds. The protocol realizing \(\mathcal {F}^{\texttt{abort}}_{\textsf {proveDeg2Rel} }\) for all n parties requires communicating extended ring elements per party. Further, since steps other than \(\mathcal {F}^{\texttt{abort}}_{\textsf {proveDeg2Rel} }\) require sharing and reconstructing one element, it adds a small constant cost, resulting in the communication cost for verifying m dot products for vector size being extended ring elements per party.

Multi-input multiplication This protocol is similar to its semi-honest counterpart, with the difference that the preprocessing phase relies on invoking \(\mathcal {F}_{\textsf {MulPre} }\) for generating the required multiplicative terms. At a high level, the malicious variant of the multi-input multiplication protocol can be viewed as an amalgamation of the semi-honest multi-input multiplication and the malicious multiplication protocol. For the case of 3-input multiplication, recall that the semi-honest protocol to compute \(\langle \!\langle \textsf {z} \rangle \!\rangle \) given \(\langle \!\langle \textsf {a} \rangle \!\rangle , \langle \!\langle \textsf {b} \rangle \!\rangle \) and \(\langle \!\langle \textsf {c} \rangle \!\rangle \), where \(\textsf {z} = \textsf {a} \textsf {b} \textsf {c} \), requires parties to obtain \({}^{\mathcal {E}}{\left[ \varLambda _{\textsf {a} \textsf {b} } \right] }, {}^{\mathcal {E}}{\left[ \varLambda _{\textsf {a} \textsf {c} } \right] }, {}^{\mathcal {E}}{\left[ \varLambda _{\textsf {b} \textsf {c} } \right] }\) and \({}^{\mathcal {E}}{\left[ \varLambda _{\textsf {a} \textsf {b} \textsf {c} } \right] }\) in the preprocessing phase, which are then used to reconstruct \(\textsf {m} _{\textsf {z} }\) in the online phase.

Since parties in \(\mathcal {E}\) are required to hold the correct \({}^{\mathcal {E}}{\left[ \cdot \right] }\)-sharings before the online phase begins, as in the case of multiplication, the techniques from the semi-honest protocol fail in this setting. Hence, our protocol uses 4 instances of \(\mathcal {F}_{\textsf {MulPre} }\) in the preprocessing phase, one each to compute \(\langle \varLambda _{\textsf {a} \textsf {b} } \rangle , \langle \varLambda _{\textsf {a} \textsf {c} } \rangle , \langle \varLambda _{\textsf {b} \textsf {c} } \rangle \) and \(\langle \varLambda _{\textsf {a} \textsf {b} \textsf {c} } \rangle \). Each of these \(\langle \cdot \rangle \)-sharings is further converted to a \({}^{\mathcal {E}}{\left[ \cdot \right] }\)-sharing using \(\Pi _{\tiny {\langle \cdot \rangle \rightarrow {}^{\mathcal {E}}{\left[ \cdot \right] }}}\) to ensure active participation of only \(t+1\) parties in the online phase for the reconstruction of \(\textsf {z} - \textsf {r} \). Further, to detect malicious behaviour during the reconstruction of \(\textsf {z} - \textsf {r} \), a verification check similar to that of the multiplication protocol is performed, and parties \(\texttt{abort}\) if the check fails. For 4-input multiplication, parties obtain the \(\langle \!\langle \cdot \rangle \!\rangle \)-sharing of \(\textsf {z} = \textsf {a} \textsf {b} \textsf {c} \textsf {d} \) using \(\textsf {z} - \textsf {r} = (\textsf {m} _{\textsf {a} } - \lambda _{\textsf {a} })(\textsf {m} _{\textsf {b} } - \lambda _{\textsf {b} })(\textsf {m} _{\textsf {c} } - \lambda _{\textsf {c} })(\textsf {m} _{\textsf {d} } - \lambda _{\textsf {d} }) - \textsf {r} \). The protocol proceeds in a similar manner as the 3-input case by delegating the computation of the product terms to the preprocessing phase.
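To see why exactly these four cross terms are needed, the following clear-text sympy sketch (illustrative only; we write l_a for \(\lambda _{\textsf {a} }\) and assume \(\varLambda _{\textsf {a} \textsf {b} } = \lambda _{\textsf {a} }\lambda _{\textsf {b} }\) and so on) expands the masked 3-input product evaluated in the online phase.

```python
# Expanding (m_a - l_a)(m_b - l_b)(m_c - l_c) shows which preprocessed terms the
# online phase consumes: the degree-2 monomials l_a*l_b, l_a*l_c, l_b*l_c
# (Lambda_ab, Lambda_ac, Lambda_bc) and the degree-3 monomial l_a*l_b*l_c
# (Lambda_abc); every other monomial involves the public masked values m_a, m_b, m_c.
import sympy as sp

m_a, m_b, m_c, l_a, l_b, l_c = sp.symbols('m_a m_b m_c l_a l_b l_c')
print(sp.expand((m_a - l_a) * (m_b - l_b) * (m_c - l_c)))
```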

5 Building Blocks

For completeness, we discuss the building blocks used in our framework. These blocks are known from the literature [64, 80] and we show how these can be extended to the n-party setting.

5.1 Semi-Honest Building Blocks

Bit to arithmetic Given Boolean shares \(\langle \!\langle \textsf {b} \rangle \!\rangle ^{\textbf{B}}\) of bit \(\textsf {b} \), protocol \({\Pi }_{\textsf {bit2A} }\) generates its arithmetic shares, \(\langle \!\langle \textsf {b} ^\textsf{R} \rangle \!\rangle \), over \(\mathbb {Z}_{2^{\ell }}\) (Fig. 28). Here, \(\textsf {b} ^\textsf{R}\) denotes the arithmetic value of \(\textsf {b} \) over the ring \(\mathbb {Z}_{2^{\ell }}\). The approach is to generate a randomized version \(\zeta = \textsf {b} \oplus \textsf {r} \) of \(\textsf {b} \), and then recover arithmetic shares of \(\textsf {b} \) by computing the arithmetic equivalent of the XOR \(\textsf {b} = \zeta \oplus \textsf {r} \). Specifically, the arithmetic equivalent of \(\textsf {x} \oplus \textsf {y} \) is given as \(\textsf {x} ^\textsf{R} + \textsf {y} ^\textsf{R} - 2 \textsf {x} ^\textsf{R} \textsf {y} ^\textsf{R}\).
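As a quick sanity check, the sketch below verifies this XOR identity on clear values (our own illustration, assuming \(\ell = 64\) as in the benchmarks); the protocol applies the same identity to the public \(\zeta \) and the shared \(\textsf {r} \).

```python
# For bits x, y: (x XOR y) equals x + y - 2*x*y over the integers, and hence mod 2^l.
MOD = 1 << 64  # l = 64 assumed

for x in (0, 1):
    for y in (0, 1):
        assert (x ^ y) == (x + y - 2 * x * y) % MOD

# bit2A uses this with x = zeta (public) and y = r (shared):
# b = zeta XOR r = zeta + r - 2*zeta*r, which is linear in r once zeta is public.
```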

Fig. 28 Semi-honest: bit to arithmetic

Bit injection This protocol, denoted as \(\varPi _{\textsf {BitInj} }\), facilitates generation of \(\langle \!\langle \textsf {b} ^\textsf{R} \cdot \textsf {v} \rangle \!\rangle \) given \(\langle \!\langle \textsf {b} \rangle \!\rangle ^{\textbf{B}}, \langle \!\langle \textsf {v} \rangle \!\rangle \) for \(\textsf {b} \in \mathbb {Z}_{2^{}}\) and \(\textsf {v} \in \mathbb {Z}_{2^{\ell }}\). As seen in [64],

$$\begin{aligned} \textsf {b} ^\textsf{R} \textsf {v}&= (\textsf {m} _{\textsf {b} } \oplus \lambda _{\textsf {b} })^\textsf{R} (\textsf {m} _{\textsf {v} } - \lambda _{{\textsf {v} }}) = \textsf {m} _{\textsf {b} }^\textsf{R} \textsf {m} _{\textsf {v} } - \textsf {m} _{\textsf {b} }^\textsf{R} \lambda _{\textsf {v} } + (2 \textsf {m} _{\textsf {b} }^\textsf{R} - 1)(\lambda _{\textsf {b} }^\textsf{R} \lambda _{\textsf {v} } - \textsf {m} _{\textsf {v} } \lambda _{\textsf {b} }^\textsf{R}) \end{aligned}$$

Given \({}^{\mathcal {E}}{\left[ \cdot \right] }\)-shares of \(\lambda _{\textsf {v} }, \lambda _{\textsf {b} }^\textsf{R}, \lambda _{\textsf {b} }^\textsf{R} \lambda _{\textsf {v} }, \textsf {r} \), together with \(\langle \!\langle \textsf {r} \rangle \!\rangle \) where \(\textsf {r} \in \mathbb {Z}_{2^{\ell }}\), and the knowledge that \(\textsf {m} _{\textsf {v} }, \textsf {m} _{\textsf {b} }^\textsf{R}\) is held by all parties in \(\mathcal {E}\), parties can non-interactively compute \({}^{\mathcal {E}}{\left[ \textsf {b} ^\textsf{R} \textsf {v} + \textsf {r} \right] }\), reconstruct it via \(P_{\textsf {king} }\) and generate \(\langle \!\langle \textsf {b} ^\textsf{R} \textsf {v} + \textsf {r} \rangle \!\rangle \). \(\langle \!\langle \textsf {b} ^\textsf{R} \textsf {v} \rangle \!\rangle \) can then be computed as \(\langle \!\langle \textsf {b} ^\textsf{R} \textsf {v} \rangle \!\rangle = \langle \!\langle \textsf {b} ^\textsf{R} \textsf {v} + \textsf {r} \rangle \!\rangle - \langle \!\langle \textsf {r} \rangle \!\rangle \). To facilitate this, in the preprocessing phase parties generate \({}^{\mathcal {E}}{\left[ \cdot \right] }\)-shares of \(\textsf {r} , \lambda _{\textsf {v} }, \lambda _{\textsf {b} }^\textsf{R}, \lambda _{\textsf {b} }^\textsf{R} \lambda _{\textsf {v} }\), and \(\langle \!\langle \textsf {r} \rangle \!\rangle \). Here, \({}^{\mathcal {E}}{\left[ \textsf {r} \right] }, {}^{\mathcal {E}}{\left[ \lambda _{\textsf {v} } \right] }\) and \(\langle \!\langle \textsf {r} \rangle \!\rangle \) are generated as in the preprocessing of multiplication, and \({}^{\mathcal {E}}{\left[ \lambda _{\textsf {b} }^\textsf{R} \right] }\) is generated via \({\Pi }_{\textsf {bit2A} }\) followed by invoking \(\varPi _{\tiny {\langle \!\langle \cdot \rangle \!\rangle \rightarrow {}^{\mathcal {E}}{\left[ \cdot \right] }}}\). Following this, \({}^{\mathcal {E}}{\left[ \lambda _{\textsf {b} }^\textsf{R} \lambda _{\textsf {v} } \right] }\) is generated as done in the preprocessing of multiplication.
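A clear-value numeric check of the above bit-injection identity (our own sketch, with \(\ell = 64\) assumed) is given below.

```python
# Checks b^R * v = m_b^R*m_v - m_b^R*lam_v + (2*m_b^R - 1)*(lam_b^R*lam_v - m_v*lam_b^R),
# where b = m_b XOR lam_b and v = m_v - lam_v, all arithmetic over Z_{2^l}.
import random

MOD = 1 << 64  # l = 64 assumed

for _ in range(1000):
    b, lam_b = random.randint(0, 1), random.randint(0, 1)
    m_b = b ^ lam_b
    v, lam_v = random.randrange(MOD), random.randrange(MOD)
    m_v = (v + lam_v) % MOD
    rhs = (m_b * m_v - m_b * lam_v
           + (2 * m_b - 1) * (lam_b * lam_v - m_v * lam_b)) % MOD
    assert rhs == (b * v) % MOD
```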

Fig. 29 Semi-honest: arithmetic to Boolean

Arithmetic to Boolean sharing Extending the techniques from [64], protocol \({\Pi }_{\textsf {A2B} }\) generates \(\langle \!\langle \textsf {x} \rangle \!\rangle ^{\textbf{B}}\) from \(\langle \!\langle \textsf {x} \rangle \!\rangle \) for \(\textsf {x} \in \mathbb {Z}_{2^{\ell }}\). For this, given arithmetic and Boolean shares of \(\textsf {r} \in \mathbb {Z}_{2^{\ell }}\), Boolean shares of \(\textsf {x} \) are computed as \((\textsf {x} +\textsf {r} ) - \textsf {r} \) by evaluating a parallel prefix adder (PPA) circuit [75, 80]. The PPA circuit takes as input two Boolean values (\(\textsf {x} +\textsf {r} \), \(-\textsf {r} \) in this case) and outputs their sum. The protocol appears in Fig. 29. Looking ahead, \({\Pi }_{\textsf {A2B} }\) is used in the preprocessing phase in the applications considered. Hence, we rely on the PPA circuit from [75] as it provides a good trade-off between rounds and communication as opposed to the circuit from [80] which is optimized to provide a fast online phase at the expense of a higher preprocessing cost (yielding a higher total cost than [75]).
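The sketch below illustrates, on clear values, the arithmetic that the PPA circuit performs bit-wise on Boolean shares (our own sketch; \(\ell = 64\) assumed).

```python
import random

L = 64
MOD = 1 << L

x = random.randrange(MOD)
r = random.randrange(MOD)
# The PPA circuit adds the Boolean-shared values (x + r) mod 2^l and (-r) mod 2^l;
# their l-bit sum is x mod 2^l, so its Boolean sharing is a Boolean sharing of x.
assert ((x + r) % MOD + (-r) % MOD) % MOD == x
```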

Boolean to arithmetic sharing This protocol generates \(\langle \!\langle \textsf {x} \rangle \!\rangle \) from \(\langle \!\langle \textsf {x} \rangle \!\rangle ^{\textbf{B}}\) where \(\textsf {x} \in \mathbb {Z}_{2^{\ell }}\). Inspired from [63, 64], observe that \(\textsf {x} = \sum _{i=0}^{\ell -1} 2^i (\textsf {x} [i])^\textsf{R}\). Thus, we invoke \({\Pi }_{\textsf {bit2A} }\) on \(\textsf {x} [i]\) for \(i \in \{0, \ldots , \ell -1 \}\) to generate \(\langle \!\langle \textsf {x} [i]^\textsf{R} \rangle \!\rangle \) followed by locally combining it as per the above equation to generate \(\langle \!\langle \textsf {x} \rangle \!\rangle \). Optimizations in [64] carry forward to our setting as well.
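On clear values, the recombination step amounts to the following trivial sketch (shown with \(\ell = 8\) for readability; in the protocol, each \(\langle \!\langle \textsf {x} [i]^\textsf{R} \rangle \!\rangle \) is combined share-wise).

```python
x = 0b10110101  # an 8-bit example value
bits = [(x >> i) & 1 for i in range(8)]                    # x[i] for i = 0..7
assert sum((1 << i) * b for i, b in enumerate(bits)) == x  # x = sum_i 2^i * x[i]^R
```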

Comparison To compare \(\textsf {x} , \textsf {y} \in \mathbb {Z}_{2^{\ell }}\) in FPA, we extend the technique of [26, 63, 64, 75, 80, 81], where checking \(\textsf {x} < \textsf {y} \) is equivalent to checking if the most significant bit (\(\textsf {msb} \)) of \(\textsf {v} = \textsf {x} - \textsf {y} \) is 1. To extract the \(\textsf {msb} \) from \(\langle \!\langle \textsf {v} \rangle \!\rangle \), we rely on \(\varPi _{\textsf {bitext} }\) which takes as input \(\langle \!\langle \textsf {v} \rangle \!\rangle \) and outputs the \(\langle \!\langle \cdot \rangle \!\rangle ^{\textbf{B}}\)-share of the \(\textsf {msb} \) of \(\textsf {v} \), denoted as \(\langle \!\langle \textsf {msb} (\textsf {v} ) \rangle \!\rangle ^{\textbf{B}}\). The optimized bit extraction circuit from [80] is used for computing the \(\textsf {msb} \) whose inputs are two \(\langle \!\langle \cdot \rangle \!\rangle ^{\textbf{B}}\)-shared values and output is the \(\langle \!\langle \cdot \rangle \!\rangle ^{\textbf{B}}\)-shared \(\textsf {msb} \) of the sum of these two inputs. Observe that, given \(\langle \!\langle \textsf {v} \rangle \!\rangle \), \(\textsf {v} \) can be written as \(\textsf {v} = \textsf {m} _{\textsf {v} } - \lambda _{\textsf {v} }\), and hence \(\langle \!\langle \cdot \rangle \!\rangle ^{\textbf{B}}\)-shares of \(\textsf {m} _{\textsf {v} }\) and \(\lambda _{\textsf {v} }\) constitute the two inputs to the circuit. While \(\langle \!\langle \textsf {m} _{\textsf {v} } \rangle \!\rangle ^{\textbf{B}}\) can be generated non-interactively by invoking \(\varPi _{\tiny {\cdot \rightarrow \langle \!\langle \cdot \rangle \!\rangle }}^{\textbf{B}}\) in the online phase, \(\langle \!\langle \lambda _{\textsf {v} } \rangle \!\rangle ^{\textbf{B}}\) is generated by performing an arithmetic to Boolean conversion in the preprocessing phase. Evaluation of bit extraction circuit then gives \(\langle \!\langle \textsf {msb} (\textsf {v} ) \rangle \!\rangle ^{\textbf{B}}\).
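The sketch below checks, on clear two's-complement values, the msb rule that the comparison relies on (assuming, as usual in FPA, that \(\textsf {x} - \textsf {y} \) stays within the signed range; \(\ell = 64\)).

```python
L = 64
MOD = 1 << L

def msb(v):
    return (v >> (L - 1)) & 1

def signed(a):
    return a - MOD if a >= MOD // 2 else a

for x, y in [(3, 7), (7, 3), (5, 5), ((-4) % MOD, 9)]:
    v = (x - y) % MOD
    assert (signed(x) < signed(y)) == (msb(v) == 1)
```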

Equality check Given \(\langle \!\langle \cdot \rangle \!\rangle \)-shared \(\textsf {x} , \textsf {y} \in \mathbb {Z}_{2^{\ell }}\), this protocol outputs a \(\langle \!\langle \cdot \rangle \!\rangle ^{\textbf{B}}\)-shared bit, which is set to 1 if \(\textsf {x} = \textsf {y} \), and 0 otherwise. The approach is to obtain the bit decomposition of \(\textsf {v} = \textsf {x} -\textsf {y} \) via \({\Pi }_{\textsf {A2B} }\) and check if all bits of \(\textsf {v} \) are 0. For this, parties non-interactively obtain the 1’s complement of the bits of \(\textsf {v} \), denoted as \({\bar{\textsf {v} }}\), by setting the corresponding \(\textsf {m} _{{\bar{\textsf {v} }}} = 1 \oplus \textsf {m} _{\textsf {v} }\) and \(\lambda _{{\bar{\textsf {v} }}} = \lambda _{\textsf {v} }\). Parties then compute an AND of all the bits in \({\bar{\textsf {v} }}\) following the standard tree-based approach, where we use 4-input multiplication to save on rounds and communication. If \(\textsf {v} = 0\), the AND outputs 1; otherwise, it outputs 0. The protocol appears in Fig. 30.
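On clear values, the check reduces to the following (our own sketch; in the protocol, the bit complement is local and the AND is evaluated via a tree of 4-input multiplications over Boolean shares).

```python
L = 64
MOD = 1 << L

def eq_bit(x, y):
    v = (x - y) % MOD
    out = 1
    for i in range(L):
        out &= 1 ^ ((v >> i) & 1)  # AND of the complemented bits of v
    return out                     # 1 iff every bit of v is 0, i.e. x == y

assert eq_bit(42, 42) == 1 and eq_bit(42, 43) == 0
```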

Fig. 30 Semi-honest: equality check protocol

Maxpool/Minpool Maxpool allows parties to compute the \(\langle \!\langle \cdot \rangle \!\rangle \)-share of the maximum value \(\textsf {x} _{\max }\) among a vector of values. For this, we proceed along the lines of [64]. Observe that the maximum among two values \(\textsf {x} _i, \textsf {x} _j\) can be computed by first using the secure comparison protocol to obtain \(\langle \!\langle \textsf {b} \rangle \!\rangle ^{\textbf{B}}\) such that \(\textsf {b} = 0\) if \(\textsf {x} _i \ge \textsf {x} _j\) and 1 otherwise. Following this, parties can compute \(\textsf {b} (\textsf {x} _j - \textsf {x} _i) +\textsf {x} _i\) using the bit injection protocol, to obtain the maximum value as the output. To compute the maximum among a vector of values, parties follow the standard binary-tree-based approach where consecutive pairs of values are compared level by level. We refer to the resulting protocol as \(\Pi _{\max }\). A protocol \(\Pi _{\min }\) for minpool can be worked out similarly.
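The clear-text sketch below mirrors this tree evaluation; in the protocol, the bit \(\textsf {b} \) comes from secure comparison and the select step \(\textsf {b} (\textsf {x} _j - \textsf {x} _i) + \textsf {x} _i\) from bit injection.

```python
def tree_max(values):
    level = list(values)
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level) - 1, 2):
            x_i, x_j = level[i], level[i + 1]
            b = 1 if x_i < x_j else 0          # secure comparison in the protocol
            nxt.append(b * (x_j - x_i) + x_i)  # bit injection in the protocol
        if len(level) % 2:
            nxt.append(level[-1])              # odd element moves up unchanged
        level = nxt
    return level[0]

assert tree_max([3, 9, 1, 7, 5]) == 9
```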

ReLU The ReLU function, \(\textsf {ReLU} (\textsf {v} ) = \textsf {max} (0, \textsf {v} )\), can be written as \(\textsf {ReLU} (\textsf {v} ) = {\overline{\textsf {b} }} \cdot \textsf {v} \), where bit \(\textsf {b} = 1\) if \(\textsf {v} < 0\) and 0 otherwise. Here \({\overline{\textsf {b} }}\) denotes the complement of \(\textsf {b} \). Given \(\langle \!\langle \textsf {v} \rangle \!\rangle \), parties invoke \(\varPi _{\textsf {bitext} }\) on \(\langle \!\langle \textsf {v} \rangle \!\rangle \) to obtain \(\langle \!\langle \textsf {b} \rangle \!\rangle ^{\textbf{B}}\). The \(\langle \!\langle \cdot \rangle \!\rangle ^{\textbf{B}}\)-sharing of \({\overline{\textsf {b} }}\) is then computed, non-interactively, by setting \(\textsf {m} _{{\overline{\textsf {b} }}} = 1 \oplus \textsf {m} _{\textsf {b} }\). Given \(\langle \!\langle {\overline{\textsf {b} }} \rangle \!\rangle ^{\textbf{B}}\) and \(\langle \!\langle \textsf {v} \rangle \!\rangle \), \(\textsf {ReLU} \) can be computed using \(\varPi _{\textsf {BitInj} }\).

5.2 Malicious Blocks

Note that the malicious variants of the building blocks, such as bit to arithmetic, Boolean to arithmetic, and arithmetic to Boolean conversion, bit extraction, secure comparison, secure equality check, ReLU, maxpool, and convolutions, follow along similar lines to the semi-honest protocols, with the difference that the underlying protocols are replaced with their maliciously secure variants. Moreover, for steps that involve opening values via \(P_{\textsf {king} }\), the reconstructed values are sent to all parties and are accompanied by a verification check similar to the one in the multiplication protocol.

5.3 Communication Cost

Table 4 summarizes the complexities of the various protocols, in the semi-honest as well as the malicious settings, discussed so far. Specifically, we tabulate the preprocessing communication cost and the online communication cost. We also report the online round complexity of the designed protocols.

Table 4 Communication and round complexity of protocols: semi-honest and malicious

6 Applications and Benchmarks

To evaluate the performance of our protocols, we benchmark some of the popular applications such as deep neural networks (NN), graph neural networks (GNN), similar sequence queries (SSQ), and biometric matching where MPC is used to achieve privacy. While these applications have been looked at in the small-party setting [6, 63, 75, 77, 80, 86, 90, 94], we believe the n-party setting is a better fit for reasons described in the introduction. To the best of our knowledge, we are the first to benchmark these in the multiparty honest-majority setting for more than four parties.

Benchmark environment The performance of our protocols is analysed using a prototype implementation building over the ENCRYPTO library [31] in C++17. We chose a 64-bit ring (\(\mathbb {Z}_{2^{64}}\)) for our arithmetic world, and the operations over the extended ring were carried out using the NTL library. Since the correctness and accuracy of the applications considered in the secure computation setting are already established, our benchmark aims to demonstrate our protocols’ performance rather than to provide a fully functional implementation. Moreover, we believe that incorporating state-of-the-art code optimizations like GPU-assisted computing can enhance the efficiency of our protocols, which is left as future work. Since there is no defined way to capture an adversary’s misbehaviour, following standard practice [32, 63, 75], we benchmark honest executions of the protocols, which also include the steps performed for verification in the malicious case. We use multi-threading, wherever possible, to facilitate efficient computation and communication among the parties. The parties in the computation are emulated using Google Cloud (n1-standard-64 instances, 2.0 GHz Intel Xeon Skylake, 64 vCPUs, 240 GB RAM) with machines located in East Australia, South Asia, South East Asia, and West Europe. All our experiments are run for 5, 7, and 9 parties. We would like to note that our protocols can be scaled to a larger number of parties; however, recall that reliance on RSS results in a share size that grows with the number of parties. Further, we note that the performance of our semi-honest protocol in the special setting of \(t=1\) is on par with tailor-made protocols such as [25]. For the malicious setting, note that customized protocols with \(t=1\) such as [26, 63, 64] are tailor-made for their setting and hence are more efficient. Specifically, they benefit from a single online round of interaction per multiplication gate, as opposed to two in our case, while having the same communication cost. We estimate this will roughly double the latency of our malicious protocol in this setting of \(t=1\). While a single-round protocol can be designed for the multiparty case, the two-round protocol ensures that the communication complexity remains linear in the number of parties, as opposed to quadratic for the former. Since our focus is on attaining protocols that tolerate \(t>1\), we omit providing performance comparisons for these customized protocols in the \(t=1\) setting.

Fig. 31 Round-trip time (\(\textsf {rtt} \))

Benchmark parameters We report the run-time and communication of the online phase and total (= preprocessing + online). Note that the reported costs only consider the evaluation phase and do not account for the cost of input sharing and output reconstruction phases (because the latter phases amount to a one-time cost). Hence, for the malicious setting, the reported numbers do not account for the cost of broadcast required for the fair reconstruction. To capture the effect of online round complexity and communication in one go, we also report the throughput (\(\textsf {TP} \) [5, 63, 75]) of the online phase. \(\textsf {TP} \) denotes the number of operations that can be performed in one minute. Finally, when deployed in the outsourced setting, one pays the price for the communication and uptime of the hired servers. To demonstrate how our protocols fare in this scenario, we additionally report the monetary cost (Cost) [64, 74] for the applications considered. This cost is estimated using Google Cloud Platform [87] pricing, where 1 GB and 1 hour of usage cost USD 0.08 and USD 3.04, respectively.
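For concreteness, the sketch below shows how such a monetary cost can be estimated from per-party communication and uptime under the stated pricing; the exact accounting used here (per-party data plus per-party uptime, summed over all hired servers) is our assumption, and the example numbers are hypothetical.

```python
def monetary_cost_usd(comm_gb_per_party, runtime_hours, n_parties,
                      price_per_gb=0.08, price_per_hour=3.04):
    # data communicated plus uptime, charged per hired server
    return (comm_gb_per_party * price_per_gb + runtime_hours * price_per_hour) * n_parties

# Hypothetical example: 9 parties, 0.5 GB sent per party, 36 seconds of uptime.
print(round(monetary_cost_usd(0.5, 36 / 3600, 9), 2))
```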

6.1 Comparison with DN07\({}^{\star }\)

In this section, we benchmark our semi-honest and malicious protocols over synthetic circuits comprising one million multiplications with varying depths of 1, 100, and 1000 and compare against the optimized ring variant of DN07\({}^{\star }\) [15]. The gates are distributed equally across each level in the circuit.

Communication The communication cost for 1 million multiplications is tabulated in Table 5 for the 5, 7, and 9 party settings. As can be observed, the online phase of our semi-honest protocol enjoys the benefits of pushing \(33\%\) communication to a preprocessing phase compared to DN07\({}^{\star }\). The observed values corroborate the claimed improvement in the online complexity of our protocol. Our malicious protocol retains the online communication cost of DN07\({}^{\star }\) while incurring a similar overhead in the preprocessing phase.

Table 5 Communication (Preprocessing, Online) in MB for 1 million multiplications

Note that pushing the communication to the preprocessing phase has several benefits. First, communication with respect to several instances can happen in a single shot and leverage the benefit of serialization. Second, with respect to resource-constrained devices such as mobile phones, the preprocessing communication can occur whenever they have access to a high-bandwidth Wi-Fi network (for instance, when the device is at home overnight). These benefits facilitate a fast online phase, as observed, that may happen over a low-bandwidth network.

Table 6 Latency in seconds (Preprocessing, Online) for varying depth (d) circuits with 1 million multiplications for \(n=7\)

Run-time The time taken to evaluate circuits of different depths appears in Table 6. Since the run-time across the 5, 7, and 9 party settings varies only within a range of 0.5 seconds, we report values for the 7-party setting alone in Table 6. With respect to the online run-time, our semi-honest protocol is expected to perform similarly to DN07\({}^{\star }\). However, DN07\({}^{\star }\) demonstrates around 1.5\(\times \) higher run-time. This difference can be attributed to the asymmetry in the \(\textsf {rtt} \) among parties, and it vanishes when benchmarking over a symmetric \(\textsf {rtt} \) setting. Compared to the semi-honest protocol, the malicious variant incurs a minimal overhead of less than one second in the online run-time due to the one-time verification phase. The overhead is higher for the overall run-time: concretely, it is around 10 s and is due to the distributed zero-knowledge proof computation in the preprocessing phase. Note that this overhead is independent of the circuit depth and gets amortized for deeper circuits, as evident from Table 6 (depth 1 vs. 1000).

Fig. 32 Monetary cost (in USD) for evaluating circuits (1000 instances) of various depths (d) for \(n=9\) parties. The values are reported in \(\log _2\) scale

Monetary cost Another key highlight of our protocols is their improved monetary cost, as evident from Fig. 32. Concretely, for 9 parties (semi-honest), we observe a saving of 17\(\%\) over DN07\({}^{\star }\) for a depth-1 circuit, and it increases up to 72\(\%\) for circuits with depth 1000. This is primarily due to the reduction in the number of online parties over DN07\({}^{\star }\). Comparing our semi-honest and malicious variants, the latter has an overhead of 8\(\times \) for a depth-1 circuit, and it reduces to 1.14\(\times \) for a depth-1000 circuit. This is justified because the verification cost is amortized for deeper circuits, as mentioned earlier. Interestingly, our malicious variant outperforms even the semi-honest DN07\({}^{\star }\) upon reaching circuit depths of 100 and above. A similar analysis holds in the symmetric \(\textsf {rtt} \) setting as well, where the saving is up to \(56\%\) (for \(d = 1000\)).

Online throughput (\(\textsf {TP} \)): Owing to the asymmetric \(\textsf {rtt} \) as described earlier, our semi-honest variant witnesses an improvement of up to \(1.78 \times \) in \(\textsf {TP} \) (for a single execution) over DN07\({}^{\star }\), which vanishes in the symmetric \(\textsf {rtt} \) setting. However, recall that our protocol requires only \(t+1\) active parties in the online phase, which leaves several channels among the parties underutilized. Hence, we can leverage a load-balancing technique where parties’ roles are interchanged across various parallel executions. For instance, one approach is to make every party act as \(P_{\textsf {king} }\), i.e. in 5PC, in one execution, \(P_{\textsf {king} }= P_1, \mathcal {E}= \{P_1, P_2, P_3\}, \mathcal {D}= \{P_4, P_5\}\), while in another execution \(P_{\textsf {king} }= P_2, \mathcal {E}= \{P_2, P_3, P_4\}, \mathcal {D}= \{P_5, P_1\}\), and so on. To analyse the effect of load balancing, we performed experiments with similar \(\textsf {rtt} \) among the parties and observed a 1.5\(\times \) improvement in our semi-honest variant over DN07\({}^{\star }\). This is justified as we communicate over four channels among the parties as opposed to six in DN07\({}^{\star }\). We note that while enhancing the security from semi-honest to malicious, we observe a significant drop in \(\textsf {TP} \), which is about 3\(\times \) for the depth-1 circuit. This is primarily due to the increased run-time owing to the verification in the online phase in the malicious setting. However, this drop tends to zero for deeper circuits (as the verification cost gets amortized), making the online phase of our maliciously secure protocol on par with the semi-honest one.
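One simple way to derive such role assignments is a round-robin rotation over the parties, sketched below for 5PC (our own illustrative scheme; the protocol itself does not prescribe a particular rotation).

```python
def role_assignments(n=5, t=2):
    parties = list(range(1, n + 1))
    for k in range(n):
        E = [parties[(k + i) % n] for i in range(t + 1)]  # P_king plus t further online parties
        D = [p for p in parties if p not in E]            # the remaining t parties
        yield {'P_king': parties[k], 'E': E, 'D': D}

for roles in role_assignments():
    print(roles)
# e.g. {'P_king': 1, 'E': [1, 2, 3], 'D': [4, 5]}, {'P_king': 2, 'E': [2, 3, 4], 'D': [1, 5]}, ...
```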

6.2 Deep Neural Networks (DNN) and Graph Neural Networks (GNN)

We begin by discussing the architectural details of the neural networks and graph neural networks that have been benchmarked in this work.

Neural networks We benchmark three different neural networks (NN) [75, 81, 94] with an increasing number of parameters: (i) NN-1: a 3-layer fully connected network with ReLU activation after each layer, as considered in [63, 75, 77, 81], (ii) NN-2: the LeNet [67] architecture, which contains two convolutional layers and two fully connected layers with ReLU activation after each layer, and a maxpool operation after the convolutional layers, and (iii) NN-3: the VGG16 [91] architecture, which comprises 16 layers in total, including fully connected, convolutional, ReLU activation, and maxpool layers. The last two NNs were considered in [94]. We benchmark the inference phase of the above NNs, which comprises computing activation matrices, followed by applying an activation function or pooling operation, depending on the network architecture. NN-1 and NN-2 are benchmarked over the MNIST dataset [68], while NN-3 is benchmarked using the CIFAR-10 dataset [65].

Graph neural networks The goal of spectral-based GNNs [39, 62] is to learn a function of signals \(\vec {\textsf {x} _1}, \ldots , \vec {\textsf {x} _m}\), each of length n, on a graph \(\textsf {G} = (\textsf {V} , \textsf {E} , {\textbf {M}})\), where \(\textsf {V} \) is the set of n vertices of the graph, \(\textsf {E} \) is the set of edges and \({\textbf {M}}\) is the graph description in terms of an \(n \times n\) adjacency matrix. The jth component of every signal \(\vec {\textsf {x} _i}\) corresponds to the jth node of the graph. Training data are used to compute the graph description \({\textbf {M}}\), which is common to all signals considered.

The approximation of graph filters using a truncated expansion in terms of Chebyshev polynomials was put forth in [39]. Chebyshev polynomials are recursively defined as follows:

$$\begin{aligned} T_k(x) = {\left\{ \begin{array}{ll} 1 &{} \text {if } k = 0\\ x &{} \text {if } k = 1\\ 2x T_{k-1}(x) - T_{k-2}(x) &{} \text {otherwise} \end{array}\right. } \end{aligned}$$

The inference phase for an \(n \times c\) signal matrix \({\textbf {X}}\) with f feature maps, where c denotes the dimension of the feature vector for each node, with K-localized filter matrices \(\varTheta _k\) can be performed as \({\textbf {Y}} = \sum _{k=0}^{K-1} T_k(\tilde{{\textbf {L}}}) {\textbf {X}} \varTheta _k\). Here, \(\tilde{{\textbf {L}}} = \frac{2}{\lambda _{max}} \cdot {\textbf {L}} - {\textbf {I}}\), where \(\lambda _{max}\) is the largest eigenvalue of the normalized graph Laplacian \({\textbf {L}}\), \({\textbf {Y}}\) is an \(n \times f\)-dimensional matrix, and the trainable parameter \(\varTheta _k\) for the kth term is of dimension \(c \times f\).
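A clear-text NumPy sketch of this inference step (our own, with toy random inputs) is given below; in the secure evaluation, the matrix products become shared matrix multiplications.

```python
import numpy as np

def gnn_layer(L_tilde, X, Thetas):
    # Y = sum_{k=0}^{K-1} T_k(L_tilde) @ X @ Theta_k, with the Chebyshev recursion
    # T_0 = I, T_1 = L_tilde, T_k = 2*L_tilde*T_{k-1} - T_{k-2}.
    n = L_tilde.shape[0]
    T_prev, T_curr = np.eye(n), L_tilde
    Y = T_prev @ X @ Thetas[0]
    if len(Thetas) > 1:
        Y = Y + T_curr @ X @ Thetas[1]
    for k in range(2, len(Thetas)):
        T_prev, T_curr = T_curr, 2 * L_tilde @ T_curr - T_prev
        Y = Y + T_curr @ X @ Thetas[k]
    return Y

# Toy shapes matching the MNIST GNN: n = 784 nodes, c = 1 feature, f = 32 maps, K = 5.
n, c, f, K = 784, 1, 32, 5
Y = gnn_layer(np.random.rand(n, n), np.random.rand(n, c), [np.random.rand(c, f) for _ in range(K)])
print(Y.shape)  # (784, 32)
```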

We use the simplified architecture of [39] given in [90]. The GNN architecture in the latter uses one graph convolution layer without pooling operation instead of the original model with two graph convolution layers, each of which is followed by a pooling operation. Further, K is set to 5 instead of 25. This architecture is shown to achieve an accuracy of more than \(99\%\) on MNIST classification in [90]. The GNN architecture [90] is as follows.

  • Graph convolution layer:

    • Input: \(T_k(\tilde{{\textbf {L}}})\) with dimensions \(784 \times 784\), \(\varTheta _k\) with dimensions \(1 \times 32\), for \(k \in \{0, \ldots , K-1\}\), and \(28 \times 28\) image transformed into a vector \(\vec {{\textsf {x} }}\) of dimension 784.

    • Output: \(\sum _{k=0}^{K-1} T_k(\tilde{{\textbf {L}}}) \vec {\textsf {x} } \varTheta _k\) with dimensions \(784 \times 32\).

  • ReLU activation: Calculates the ReLU for each input.

  • Fully connected layer (FC): with 10 nodes.

Analysis To analyse the improvement of our protocols, we also benchmark (semi-honest) DN07\({}^{\star }\) for the applications by adapting our building blocks to their setting. The semi-honest benchmarks for the different NNs and GNN appear in Table 7, while the malicious ones appear in Table 8. Figure 33 gives a pictorial view of the trends observed while comparing the semi-honest variants, which are described next.

Fig. 33 Comparison for GNN and deep NN between our semi-honest protocol and DN07\({}^{\star }\) (values plotted are logarithmic in base 2)

We incur a very minimal overhead in the run-time of our protocols when moving from five to nine parties over all the networks considered. Hence, we use \(\pm \delta \) to denote this variation in the table. The trends witnessed in synthetic circuit benchmarks (Sect. 6.1) carry forward to neural networks as well due to reasons discussed previously. For instance, the improvement in the online run-time for our semi-honest variant is up to 4.3\(\times \) over DN07\({}^{\star }\). The effect of reduced run-time and improved communication results in a significant improvement in online throughput of our protocol over DN07\({}^{\star }\). Concretely, the gain ranges up to 4.3\(\times \). Further, the improved run-time coupled with the reduced number of online parties for our case brings in a saving of up to 69\(\%\) in monetary cost for NN-1. However, the improvement drops to 33\(\%\) for deep network NN-3. The reduction in savings is due to improved run-time getting nullified by increased communication from NN-1 to NN-3, making communication the dominant factor in determining monetary cost.

Table 7 Semi-honest: benchmarks for deep NN and GNN

Observe that, unlike the case in synthetic circuits (Table 5), the total communication here is an order of magnitude higher. This is primarily due to the higher communication cost incurred for performing the truncation operation, specifically the generation of the doubly shared bits (\(\varPi _{\textsf {dsBits} }\)) in the preprocessing phase. It is worth noting that \(\varPi _{\textsf {dsBits} }\) is used as a black box, and an improved instantiation would lower the communication. Similar trends are observed for GNN as well, where the online run-time of DN07\({}^{\star }\) is up to 3.5\(\times \) higher than our semi-honest protocol. This is reflected in the throughput, where we gain up to 3.4\(\times \). Further, we observe savings of up to 15\(\%\) in monetary cost due to the reduced number of active parties and the lower run-time.

Table 8 Malicious: benchmarks for deep NN and GNN

Compared to our semi-honest variant for evaluating NNs, the malicious variant incurs a 2\(\times \) higher online communication cost for NN-1 and NN-2. However, this difference narrows for deeper NNs, with the communication being 1.5\(\times \) for NN-3. The drop in the difference can be attributed to the one-time cost of verification required in the malicious variant, which gets amortized over deeper circuits. Due to the same reason, in comparison with the semi-honest case, the malicious variant has an overhead of around 1 second in the online run-time, which in turn is reflected in the reduced throughput. Similar to the semi-honest evaluation of NNs, the overall communication is an order of magnitude higher than the online communication due to the cost incurred for truncation during preprocessing. Also, analogous to the trend observed for synthetic circuits, the overhead in overall run-time is approximately 11 seconds owing to the distributed zero-knowledge proof verification required in the preprocessing phase. For GNN, the trend closely follows that of NN-3, where the malicious variant incurs 1.5\(\times \) higher communication than its semi-honest counterpart.

6.3 Genome Sequence Matching

Given a genome sequence as a query, genome matching aims to identify the most similar sequence from a database of sequences. This task is also known as similar sequence query (SSQ). An SSQ algorithm on two sequences s and q requires the computation of edit distance (ED), which quantifies how different two sequences are by identifying the minimum number of additions, deletions, and substitutions needed to transform one sequence to the other. To compute the ED, we extend the (2-party) protocol from [86] which builds on top of the approximation from [6], to the n-party setting. We proceed to describe the high-level idea of the approximation algorithm for ED computation for a query sequence \(\textsf {q} \) against a database of sequences \(\{\textsf {s} _1, \ldots , \textsf {s} _m\}\).

Fig. 34 Similar sequence queries

The ED approximation algorithm has a non-interactive phase, during which the database owner with the sequences \(\textsf {s} _1, \ldots , \textsf {s} _m\), generates a look-up table (\(\textsf {LUT} \)) for each sequence. These \(\textsf {LUT} \)s are then secret-shared among all the parties. To generate the \(\textsf {LUT} \), the sequences in the database are aligned with respect to a common reference genome sequence (using the Wagner–Fischer algorithm [95]) and divided into blocks of a fixed, predetermined size. Based on the most frequently occurring block sequences in the database, an \(\textsf {LUT} \) is constructed consisting of these block values and their distance from each other. Specifically, for a database of \(m\) sequences \(\{\textsf {s} _1, \ldots , \textsf {s} _m\}\), each of length \(\omega \) blocks, an \(\textsf {LUT} _i\) is constructed for each \(\textsf {s} _i\). Each \(\textsf {LUT} \) has \(m\) columns, one corresponding to each \(\textsf {s} _i\) in the database, and \(\omega \) rows, one corresponding to each block of a sequence, where \(\textsf {LUT} _\textsf {s} [i][j]\) corresponds to the ED between block i of the sequence \(\textsf {s} \) and \(\textsf {s} _j\). This completes the non-interactive phase of the ED approximation algorithm.

Fig. 35 Edit distance between query \(\textsf {q} \) and sequence \(\textsf {s} \) with respect to a database of \(m\) sequences and \(\omega \) blocks

Given the \(\textsf {LUT} \)s, when a new query \(\textsf {q} \) has to be processed, its ED must be computed to every sequence \(\textsf {s} \) in the database. For this, similar to the non-interactive phase, the query is first aligned with the reference sequence and broken down into blocks of the same fixed size. Then, the ith block from the query is matched with the ith block of each sequence in the \(\textsf {LUT} \) for a sequence \(\textsf {s} \). If the block values match, then the precomputed distance is taken as the output for that block; otherwise, the output is taken to be 0. Finally, the resultant sum of distances over all the blocks is taken to be the approximated ED between \(\textsf {q} \) and the sequence \(\textsf {s} \). Computing the ED to all such sequences \(\textsf {s} \) in the database then allows the identification of the most similar sequence for the query using the minpool operation. Algorithms for ED computation between two sequences and for SSQ appear in Figs. 35 and 34, respectively, where accuracy and correctness follow from [6]. Since the generation of \(\textsf {LUT} \)s happens non-interactively, we only focus on the computation of ED with respect to the new query \(\textsf {q} \), which requires interaction, and benchmark the same.
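The following clear-text sketch captures our reading of this block-wise approximation (variable names and the exact LUT layout are illustrative); in the protocol, the match test is the secure equality check, the select-and-sum is a local linear combination of shares, and the final step is \(\Pi _{\min }\).

```python
def approx_ed(query_blocks, lut_blocks, lut_dists):
    """Approximate ED(q, s) for one database sequence s.
    query_blocks[i]: i-th block of the aligned query q.
    lut_blocks[i][j]: i-th block value stored for database sequence s_j (the LUT keys).
    lut_dists[i][j]:  precomputed ED for block i between s and s_j."""
    total = 0
    for i, qb in enumerate(query_blocks):
        # on a match the precomputed distance is selected; otherwise the block contributes 0
        total += sum(d for bv, d in zip(lut_blocks[i], lut_dists[i]) if qb == bv)
    return total

def most_similar(query_blocks, luts):
    # luts: one (lut_blocks, lut_dists) pair per database sequence; minpool picks the winner
    eds = [approx_ed(query_blocks, b, d) for b, d in luts]
    return min(range(len(eds)), key=eds.__getitem__)
```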

The benchmarks for genome sequence matching appear in Table 9. Following [86], we consider three cases with different numbers of sequences in the database (\(m\)) and different block lengths (\(\omega \)). We witness similar trends here, where our semi-honest protocol has improvements of up to 4\(\times \) in both online run-time and throughput over DN07\({}^{\star }\). Our malicious variant incurs a minimal overhead in the range of 5–6\(\%\) in online run-time and total communication over the semi-honest counterpart.

Table 9 Benchmarks for genome sequence matching

For the monetary cost (Fig. 36), our semi-honest protocol has up to 66\(\%\) saving over DN07\({}^{\star }\), and the malicious variant has around 42\(\%\)-54\(\%\) overhead over the semi-honest counterpart.

Fig. 36 Monetary cost for SSQ evaluation for varying number of sequences and block lengths ((1000, 25), (2000, 30), (4000, 35)) for \(n=9\) parties. Costs for 1000 instances are reported in USD

6.4 Biometric Matching

We extend support for biometric matching, which finds application in many real-world tasks such as face recognition [43] and fingerprint matching [56]. Given a database of m biometric samples \((\vec {s}_1, \ldots , \vec {s}_m)\), each of a fixed size, and a user holding its sample \(\vec {u}\), the goal of biometric matching is to identify the sample from the database that is “closest” to \(\vec {u}\). The notion of “closeness” can be formalized by various distance metrics, of which Euclidean distance (\({\textbf {EuD}}_{}\)) is the most widely used. Following the general trend, we reduce our biometric matching problem to that of finding the sample from the database which has the least \({\textbf {EuD}}_{}\) to the user’s sample \(\vec {u}\).

Table 10 Benchmarks for biometric matching

We follow [77, 80], where the \({\textbf {EuD}}_{}\) between two vectors \(\vec {x}, \vec {y}\) of the same length is given as

$$\begin{aligned} {\textbf {EuD}}_{}(\vec {x}, \vec {y}) = \sum _{j} \textsf {z} _j \cdot \textsf {z} _j \end{aligned}$$

(6)

where \(\vec {z} = \vec {x} - \vec {y}\).

To achieve this goal of performing biometric matching securely, each \(\vec {s}_i\) for \(i \in \{1, \ldots , m\}\) in the database is \(\langle \!\langle \cdot \rangle \!\rangle \)-shared among the n parties participating in the computation. Specifically, each component of \(\vec {s}_{i}\) is \(\langle \!\langle \cdot \rangle \!\rangle \)-shared among all the parties. Similarly, the user also \(\langle \!\langle \cdot \rangle \!\rangle \)-shares its sample \(\vec {u}\). The parties compute a \(\langle \!\langle \cdot \rangle \!\rangle \)-shared distance vector \({\textbf {DV}}\) of size m, where the ith component corresponds to the \({\textbf {EuD}}_{}\) between \(\vec {u}\) and \(\vec {s}_i\). For this, each party locally obtains \(\langle \!\langle \vec {z}_i \rangle \!\rangle = \langle \!\langle \vec {s}_i \rangle \!\rangle - \langle \!\langle \vec {u} \rangle \!\rangle \) and computes \(\langle \!\langle {\textbf {DV}}_i \rangle \!\rangle \) according to Eq. (6) using the dot product operation. The final step is then to identify the minimum of these m components of \({\textbf {DV}}\), which can be performed using the protocol \(\Pi _{\min }\) for the minpool operation.
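A clear-text sketch of this pipeline (our own) is given below; in the protocol, the subtraction is local, each distance is a single dot-product invocation, and the argmin is realized by \(\Pi _{\min }\).

```python
import numpy as np

def closest_sample(samples, u):
    # DV_i = EuD(u, s_i) = z_i . z_i with z_i = s_i - u, as in Eq. (6)
    dv = [float(np.dot(s - u, s - u)) for s in samples]
    return int(np.argmin(dv))  # minpool (Pi_min) in the protocol

db = [np.array([1.0, 2.0, 3.0]), np.array([4.0, 0.0, 1.0]), np.array([1.5, 2.5, 2.5])]
print(closest_sample(db, np.array([1.2, 2.1, 2.9])))  # 0
```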

The benchmarks for biometric matching appear in Table 10 for a varying number of database samples. As is evident from Table 10, our semi-honest protocol witnesses a 4.6\(\times \) improvement over DN07\({}^{\star }\) in both online run-time and throughput. Further, in terms of monetary cost, we observe a saving of around 85\(\%\). With respect to our maliciously secure protocol, we incur a minimal overhead of around 9.5\(\%\) in terms of total communication and around 4\(\%\) in online throughput over our semi-honest variant. We note that our malicious variant outperforms semi-honest DN07\({}^{\star }\) in both online run-time and throughput, thereby achieving our goal of a fast online phase.

7 Security Proofs

Security proofs are given in the real-world/ideal-world simulation-based paradigm [70]. Let \(\mathcal {A}^{\textsf {sh} }, \mathcal {A}^{\textsf {mal} }\) denote the real-world semi-honest and malicious adversary, respectively, corrupting at most t parties in \(\mathcal {P}\); the set of corrupt parties is denoted by \(\mathcal {C}\). Let \(\mathcal {S}^{\textsf {sh} }, \mathcal {S}^{\textsf {mal} }\) denote the corresponding ideal-world semi-honest and malicious adversary, respectively. Security proofs are given in the \(\{\mathcal {F}_{\textsf {setup} }, \mathcal {F}_{\textsf {TrGen} }\}\)-hybrid model (and the \(\{\mathcal {F}_{\textsf {Broadcast} }, \mathcal {F}_{\textsf {TrGen} }^{\textsf {M} }, \mathcal {F}_{\textsf {MulPre} }, \mathcal {F}_{\textsf {DotPPre} }\}\)-hybrid model for the malicious setting). For modularity, we provide simulation steps for each protocol separately.

7.1 Semi-Honest Security

The following is the strategy for simulating the computation of function f (represented by a circuit \(\textsf {ckt} \)). The simulator \(\mathcal {S}^{\textsf {sh} }\) knows the input and output of the adversary \(\mathcal {A}^{\textsf {sh} }\) and sets the inputs of the honest parties to be 0. \(\mathcal {S}^{\textsf {sh} }\) emulates \(\mathcal {F}_{\textsf {setup} }\) and gives the respective keys to \(\mathcal {A}^{\textsf {sh} }\). Knowing all the inputs and randomness, \(\mathcal {S}^{\textsf {sh} }\) can compute all the intermediate values for each building block in the clear. Thus, \(\mathcal {S}^{\textsf {sh} }\) proceeds to simulate each building block in topological order using the aforementioned values (input and output of \(\mathcal {A}^{\textsf {sh} }\), randomness and intermediate values). We provide the simulation steps for each of the sub-protocols separately for modularity. When carried out in the respective order, these steps result in the simulation steps for the entire computation. To distinguish the simulators for various protocols, we use the corresponding protocol name as the subscript of \(\mathcal {S}^{\textsf {sh} }\).

Sharing and Reconstruction Simulation for input sharing (Fig. 10) and reconstruction appears in Figs. 37 and 38, respectively.

Fig. 37 Semi-honest: simulation for the input sharing protocol \(\varPi _{\textsf {Sh} }\) by \(P_s\)

Fig. 38 Semi-honest: simulation for the reconstruction protocol towards all the parties

Multiplication Simulation steps for multiplication (Fig. 12) are provided in Fig. 39.

Fig. 39 Semi-honest: simulation for the multiplication protocol \(\varPi _{\textsf {mult} }\)

Observe that the adversary’s view in the simulation is indistinguishable from its view in the real world since it only receives random values in each step of the protocol.

Other building blocks Simulation steps for the remaining building blocks can be obtained analogously by simulating the steps for the respective underlying protocols in their order of invocations.

Complete MPC protocol Simulation for the complete semi-honest MPC protocol \(\varPi _{\textsf {MPC} }^{\textsf {sh} }\) (Fig. 13) appears in Fig. 40.

Fig. 40 Semi-honest: simulation for the complete MPC protocol \(\varPi _{\textsf {MPC} }^{\textsf {sh} }\)

Theorem 1

Protocol \(\varPi _{\textsf {MPC} }^{\textsf {sh} }\) (Fig. 13) realizes \(\mathcal {F}_{\textsf {n\text{- }}PC }\) (Fig. 9) with computational security against a semi-honest adversary \(\mathcal {A}^{\textsf {sh} }\) in the \(\{\mathcal {F}_{\textsf {setup} }, \mathcal {F}_{\textsf {TrGen} }\}\)-hybrid model.

Proof

We prove that the adversary’s view in the simulation is indistinguishable from its view in the real world via a sequence of hybrids.

Hybrid\(_{\textbf{0}}\): Execution of protocol \(\varPi _{\textsf {MPC} }^{\textsf {sh} }\) in the real world.

Hybrid\(_{\textbf{1}}\): In this hybrid, the execution of \(\varPi _{\textsf {Sh} }\) is replaced by the simulation of \(\mathcal {S}^{\textsf {sh} }_\textsf{Sh}\). The two hybrids differ only in the case of inputs \(\textsf {x} _s\) of each honest \(P_s\). Note that for the input of an honest \(P_s\), the adversary’s view consists of \((\beta _{\textsf {x} _s},\langle \lambda _{\textsf {x} _s} \rangle _i)\) for each \(P_i \in \mathcal {C}\). Of these, \(\langle \lambda _{\textsf {x} _s} \rangle _i\) consists of random values selected using the shared-key setup among parties and hence is indistinguishable in both the hybrids. Moreover, \(\beta _{\textsf {x} _s}\) remains a random value from the adversary’s view in both the hybrids due to the share \(\langle \lambda _{\textsf {x} _s} \rangle _{{\mathcal {T}}}\), where \({\mathcal {T}}\subseteq \mathcal {P}{\setminus } \mathcal {C}\), which is unknown to the adversary. Hence, the distributions of Hybrid\(_{\textbf{0}}\) and Hybrid\(_{\textbf{1}}\) are indistinguishable.

Hybrid\(_{\textbf{2}}\): In this hybrid, the execution of \(\varPi _{\textsf {mult} }\) is replaced with the simulation of \(\mathcal {S}^{\textsf {sh} }_\textsf{mult}\) for all the multiplication gates. The adversary’s view here may consist of the reconstructed value \(\textsf {z} - \textsf {r} \) if some corrupt party belongs to \(\mathcal {E}\). However, note that it remains a random value from the adversary’s view in both the hybrids due to the randomly chosen \(\textsf {r} \). Moreover, \(\textsf {r} \) is unknown to the adversary due to the common share held by \(n-t\) honest parties. Hence, the distributions of Hybrid\(_{\textbf{1}}\) and Hybrid\(_{\textbf{2}}\) are indistinguishable.

Hybrid\(_{\textbf{3}}\): In this hybrid, the reconstruction protocol is replaced with the simulation of \(\mathcal {S}^{\textsf {sh} }_\textsf{Rec}\). Note that this is exactly the execution in the ideal world. The transcript of a corrupt party for an output wire \(\textsf {a} \) consists of shares of \(\langle \lambda _{\textsf {a} } \rangle \) and \(\textsf {a} \). As described in \(\mathcal {S}^{\textsf {sh} }_\textsf{Rec}\), the simulator obtains the output wire value \(\textsf {a} \) from the functionality and adjusts the shares of \(\lambda _{\textsf {a} }\) held only by the honest parties to ensure a sharing that is consistent with the output \(\textsf {a} \). Since \(\lambda _{\textsf {a} }\) is random and unknown to the adversary, \(\textsf {m} _{\textsf {a} }\) is also random. Hence, the adversary’s view is indistinguishable in both these executions.

Thus, we conclude that the view of the adversary is indistinguishable in Hybrid\(_{\textbf{0}}\), which is the execution of the protocol in the real world, and Hybrid\(_{\textbf{3}}\), which corresponds to the execution in the ideal world. \(\square \)

7.2 Malicious Security

The following is the strategy for simulating the computation of function f (represented by a circuit \(\textsf {ckt} \)). The simulator emulates \(\mathcal {F}_{\textsf {setup} }\) and gives the respective keys to the malicious adversary, \(\mathcal {A}^{\textsf {mal} }\). This is followed by the input sharing phase in which \(\mathcal {S}^{\textsf {mal} }\) extracts the input of \(\mathcal {A}^{\textsf {mal} }\), using the known keys, and sets the inputs of the honest parties to be 0. Knowing all the inputs, \(\mathcal {S}^{\textsf {mal} }\) can compute all the intermediate values for each building block in the clear. \(\mathcal {S}^{\textsf {mal} }\) proceeds to simulate each building block in topological order using the aforementioned values (inputs of \(\mathcal {A}^{\textsf {mal} }\), intermediate values). Finally, depending on whether \(\mathcal {A}^{\textsf {mal} }\) misbehaved, which \(\mathcal {S}^{\textsf {mal} }\) can detect using the aforementioned information, \(\mathcal {S}^{\textsf {mal} }\) invokes \(\mathcal {F}_{\textsf {n\text{- }}PC }^{\textsf {mal} }\) to obtain the function output and forwards it to \(\mathcal {A}^{\textsf {mal} }\). As before, we provide the simulation steps for each of the sub-protocols separately for modularity. When carried out in the respective order, these steps result in the simulation steps for the entire computation. To distinguish the simulators for various protocols, the corresponding protocol name appears as the subscript of \(\mathcal {S}^{\textsf {mal} }\).

Sharing Simulation for input sharing appears in Fig. 41.

Fig. 41 Malicious: simulation for the input sharing protocol \(\varPi _{\textsf {Sh} }^{\textsf {M} }\) by \(P_s\)

Reconstruction Simulation for reconstruction (with fairness) appears in Fig. 42.

Fig. 42 Malicious: simulation for the fair reconstruction protocol \(\varPi _{\textsf {Rec} }^{\textsf {fair} }\) towards all the parties

Multiplication Simulation steps for multiplication (Fig. 24) are provided in Fig. 43.

Observe that since \(\mathcal {A}^{\textsf {mal} }\) sees random shares in both the real-world protocol and in the simulation, indistinguishability of the simulation follows.

Fig. 43 Malicious: simulation for the multiplication protocol \(\varPi _{\textsf {mult} }^{\textsf {M} }\)

Other building blocks Simulations for the remaining building blocks can be obtained analogously and using the steps for the underlying protocols.

Complete MPC protocol Simulation for the complete malicious MPC protocol \(\varPi _{\textsf {MPC} }^{\textsf {mal} }\) (Fig. 25) appears in Fig. 44.

Fig. 44 Malicious: simulation for the complete MPC protocol \(\varPi _{\textsf {MPC} }^{\textsf {mal} }\)

Theorem 2

Protocol \(\varPi _{\textsf {MPC} }^{\textsf {mal} }\) (Fig. 25) realizes \(\mathcal {F}_{\textsf {n\text{- }}PC }^{\textsf {mal} }\) (Fig. 19) with computational security against a malicious adversary \(\mathcal {A}^{\textsf {mal} }\) in the \(\{\mathcal {F}_{\textsf {Broadcast} }, \mathcal {F}_{\textsf {setup} }, \mathcal {F}_{\textsf {MulPre} }, \mathcal {F}_{\textsf {TrGen} }^{\textsf {M} }\}\)-hybrid model.

Proof

We prove that the adversary’s view in the simulation is indistinguishable from its view in the real world via a sequence of hybrids.

Hybrid\(_{\textbf{0}}\): Execution of protocol \(\varPi _{\textsf {MPC} }^{\textsf {mal} }\) in the real world.

Hybrid\(_{\textbf{1}}\): In this hybrid, the execution of \(\varPi _{\textsf {Sh} }^{\textsf {M} }\) is replaced by the simulation of \(\mathcal {S}^{\textsf {mal} }_\textsf{Sh}\). Similar to the semi-honest protocol, the two hybrids differ only in the case of inputs \(\textsf {x} _s\) of each honest \(P_s\). Note that for the input of an honest \(P_s\), the adversary’s view consists of \((\beta _{\textsf {x} _s},\langle \lambda _{\textsf {x} _s} \rangle _i)\) for each \(P_i \in \mathcal {C}\). Of these, \(\langle \lambda _{\textsf {x} _s} \rangle _i\) consists of random values selected using the shared-key setup among parties and hence is indistinguishable in both the hybrids. Moreover, \(\beta _{\textsf {x} _s}\) remains a random value from the adversary’s view in both the hybrids due to the share \(\langle \lambda _{\textsf {x} _s} \rangle _{{\mathcal {T}}}\), where \({\mathcal {T}}\subseteq \mathcal {P}{\setminus } \mathcal {C}\), which is unknown to the adversary. Hence, the distributions of Hybrid\(_{\textbf{0}}\) and Hybrid\(_{\textbf{1}}\) are indistinguishable.

Hybrid\(_{\textbf{2}}\): In this hybrid, the execution of \(\varPi _{\textsf {mult} }^{\textsf {M} }\) is replaced with the simulation of \(\mathcal {S}^{\textsf {mal} }_\textsf{mult}\) for all the multiplication gates. The adversary’s view here may consist of the reconstructed value \(\textsf {z} - \textsf {r} \). For multiplication gates in the last layer of the circuit, the simulation differs from the gates in the other layers only for the reconstructed value \(\textsf {z} - \textsf {r} \). Specifically, the simulator receives the output wire value of the multiplication gate from the functionality \(\mathcal {F}_{\textsf {n\text{- }}PC }\) and adjusts \(\textsf {z} - \textsf {r} \) (by adjusting shares held by honest parties) accordingly. However, note that it remains a random value from the adversary’s view in both the hybrids due to the randomly chosen \(\textsf {r} \). Moreover, \(\textsf {r} \) is unknown to the adversary due to the common share held by \(n-t\) honest parties in \({\mathcal {T}}\subseteq \mathcal {P}\setminus \mathcal {C}\). Hence, the distributions of Hybrid\(_{\textbf{1}}\) and Hybrid\(_{\textbf{2}}\) are indistinguishable.

Hybrid\(_{\textbf{3}}\): In this hybrid, the fair reconstruction protocol \(\varPi _{\textsf {Rec} }^{\textsf {fair} }\) is replaced with the simulation of \(\mathcal {S}^{\textsf {mal} }_\textsf{fairRec}\). Note that this is exactly the execution in the ideal world. The transcript of a corrupt party for an output wire \(\textsf {a} \) consists of shares of \(\langle \lambda _{\textsf {a} } \rangle \) and \(\textsf {a} \). Assume without loss of generality that the output wire is the output of a multiplication gate. As described earlier, the simulator obtains the output wire value \(\textsf {a} \) from the functionality and adjusts the shares of \(\textsf {m} _{\textsf {a} }\) held only by the honest parties to ensure a sharing that is consistent with the output \(\textsf {a} \). Since \(\lambda _{\textsf {a} }\) is random and unknown to the adversary, \(\textsf {m} _{\textsf {a} }\) is also random. Therefore, the adversary’s view is indistinguishable in both these executions.

Thus, we conclude that the view of the adversary is indistinguishable in Hybrid\(_{\textbf{0}}\), which is the execution of the protocol in the real world, and Hybrid\(_{\textbf{3}}\), which corresponds to the execution in the ideal world. \(\square \)

7.3 Conclusion

This work improves the practical efficiency of n-party honest-majority protocols using the preprocessing paradigm. While our first construction achieves a fast online phase compared to the semi-honest protocol of DN07\({}^{\star }\), the second enhances security by tolerating malicious adversaries with minimal overhead in the online phase. A major highlight of both constructions is that only about half of the parties need to participate actively in the online phase. This reduction in the number of online parties translates into monetary benefits in real-world deployments.