1 Introduction

Over recent years, the demand for secure computation has garnered widespread attention for the exploitation of big data and outsourced computation, while preserving user privacy. Homomorphic encryption (HE) is a major secure computation scheme [1]; it is a public key cryptosystem that can perform additions and/or multiplications over ciphertexts via homomorphic evaluations. Since Gentry’s breakthrough work in 2009 [2], HE has received widespread attention. For instance, HE has been applied in privacy-preserving statistical processing [3] and machine learning [4,5,6,7,8] applications involving the data provided by various parties. Furthermore, HE-based secure computation has been gaining increasing importance for the realization of machine learning applications with preserved privacy, owing to the remarkable developments in machine learning techniques over recent years.

The implementation cost of HE is largely dependent on the supported secret operations. Fully HE (FHE) supports both addition and multiplication, and it can perform the homomorphic evaluation of any polynomial function. However, it incurs considerably large implementation costs owing to the key/ciphertext size and computational complexity. Particularly, FHE requires a computationally expensive procedure called bootstrapping after evaluating multiplication(s), which is considered as a major bottleneck for FHE. By contrast, somewhat HE (SHE) can be implemented with a significantly smaller cost than FHE. However, it only supports the homomorphic evaluation of polynomial functions with lower degrees (i.e., multiplicative depth), which, in turn, limits its practical applicability. Thus, reducing the costs of FHE/SHE is necessary for broader applications.

A high computational accuracy is not a critical requirement in many real-world applications. Approximate computations with an acceptable accuracy degradation are commonly deployed in such applications. Such computations include rounding off in floating-point arithmetic [commonly deployed in central processing units (CPUs)] and probabilistic algorithms. A study conducted in 2017 proposed an HE called homomorphic encryption for arithmetic involving approximate numbers (HEAAN, or CKKS scheme [9]); this approach can homomorphically evaluate the rounding of plaintext over ciphertext. The CKKS has been widely employed in many applications, such as privacy-preserving machine learning and oblivious inference [10], owing to its high efficiency. Nevertheless, to the best of our knowledge, an HE scheme that can perform probabilistic arithmetic over ciphertext has not been reported thus far, and probabilistic algorithms are expected to improve the HE efficiency, similar to that when using the CKKS.

This paper proposes an HE for stochastic computing (HESC), which supports both probabilistic addition and multiplication, based on stochastic computing (SC) [11]. SC is a probabilistic arithmetic system, where numbers are represented as probabilities, and additions and multiplications are performed using random numbers. SC has been utilized and investigated in the domain of low-power digital circuit design, and its advantage has been shown in some practical applications such as digital filters [12, 13]. Recently, SC has also been employed for hardware implementations of neural network (NN) inference, the performance of which is sometimes shown to be superior to that of binary/floating-point representations (e.g., [14,15,16,17]). The basic HESC integrates the additive/multiplicative HE (or SHE) with SC and can homomorphically evaluate both stochastic addition and multiplication, without any bootstrapping. This implies that the HESC can be implemented with a low computational cost, equivalent or comparable to that of the combined additive/multiplicative HE or SHE, while exhibiting better arithmetic flexibility. One major drawback of the HESC, however, is that any plaintext obtained through homomorphic evaluations includes noise, owing to the SC. Essentially, the HESC is useful for applications where such noise is tolerable.

In this paper, the basic construction of the HESC and its homomorphic evaluation are first presented. Subsequently, an HESC construction based on lattice-based cryptography and a new stochastic addition method are described. The cost of HESCs is evaluated through prototype implementations, using some typical HEs, including the CKKS. The HESC is further applied to the evaluation of certain polynomial functions and to an oblivious inference using neural networks (NNs) to classify the Iris flower dataset. The results indicate that the HESC can achieve sufficiently high accuracy with lower computational costs, as compared with an equivalent CKKS-based NN.

Remark 1

HESC involves noisy decrypted plaintext: The result determined by decrypting a ciphertext obtained via homomorphic evaluation is only approximately equal to the result of the corresponding function evaluation. In this regard, HESC does not satisfy the correctness property of the standard HE. This paper argues, however, that HESC is a useful primitive that can nevertheless be efficiently instantiated and that offers several advantages over conventional techniques, especially in contexts where the inputs and outputs are approximate in nature anyway. In the future, HESC can be applied for the development of privacy enhancing techniques and applied cryptography.

2 Background

2.1 Homomorphic encryption

HE belongs to a class of cryptographic schemes that employ a polynomial-time algorithm for the homomorphic evaluation of addition and/or multiplication operations over ciphertexts. In general, the scheme is a tuple of algorithms defined as follows [18]:

  • \(\textsf {HE}.\textsf {KeyGen}(1^\lambda )\rightarrow (\textsf {pk}, \textsf {sk})\): For a given security parameter, \(\lambda \), a public key, \(\textsf {pk}\), and a private key, \(\textsf {sk}\), are generated.

  • \(\textsf {HE}.\textsf {Enc}_\textsf {pk}(m)\rightarrow c\): For a given public key, \(\textsf {pk}\), and a plaintext, m, a ciphertext, c, is generated as output.

  • \(\textsf {HE}.\textsf {Eval}_\textsf {pk}(f,c,c')\rightarrow \textsf {HE}.\textsf {Enc}(f(c,c'))\): For \(\textsf {pk}\), two ciphertexts, c and \(c'\), and a function, f, a ciphertext of the evaluation result of \(f(c,c^{\prime })\) is generated as output.

  • \(\textsf {HE}.\textsf {Dec}_\textsf {sk}(c)\rightarrow m\): For a given ciphertext, c, and the private key, \(\textsf {sk}\), corresponding to \(\textsf {pk}\), the plaintext, m, is output.

Whereas \(\textsf {HE}.\textsf {KeyGen}\), \(\textsf {HE}.\textsf {Enc}\), and \(\textsf {HE}.\textsf {Dec}\) are common algorithms in any public key cryptosystem, \(\textsf {HE}.\textsf {Eval}\) homomorphically executes the evaluation function, f, over the ciphertexts. The existing HE schemes are roughly classified into three categories based on their executable homomorphic operations [18].

  (i) Additive/multiplicative HE only supports homomorphic addition or multiplication. Typical examples of additive HEs include the Goldwasser–Micali (GM) encryption [19] and lifted-(EC)ElGamal encryption [20,21,22,23]. The RSA [24] and ElGamal [20] encryptions are representative multiplicative HEs.

  (ii) The SHE can homomorphically evaluate addition and a limited number of multiplications. This is because the SHE utilizes noise for its encryption, which increases after the evaluation of each multiplication. Thus, the implementation cost of the SHE increases considerably if a large number of multiplication operations are required.

  (iii) The FHE can perform both addition and multiplication over ciphertext and homomorphically evaluate any polynomial function. The most common FHE schemes can be categorized into FHEW [25, 26]-type, BGV/BFV [27, 28]-type, and CKKS [9]-type [29] schemes. The basic concept involves converting a ciphertext with noise into another one with less noise. This conversion is termed bootstrapping, and it is used to overcome the constraint on the number of multiplications in the SHE. However, this approach is computationally expensive and limits the applicability of the FHE.

Although several SHE and FHE applications can realize finite-field or fixed-point arithmetic over ciphertexts, the CKKS can efficiently support real and complex-number arithmetic [9]. The key feature of the CKKS is that it can homomorphically evaluate the rounding of plaintext, which significantly improves the implementation efficiency of the SHE/FHE, as compared with other conventional schemes where the plaintext size exponentially increases with the number of multiplications. Essentially, the CKKS achieves high implementation efficiency based on approximate computing (i.e., rounding). However, the CKKS still requires bootstrapping to perform a large number of multiplications [30].

2.2 Stochastic computing

SC is a probabilistic arithmetic system that was developed in the domain of digital circuit design [11]. It employs a specific rational number representation, called a stochastic number, in which a value is represented by the occurrence probability of “1” in an L-bit sequence. There are two typical types of stochastic number representations: unipolar (UP) and inverted bipolar (IBP). An L-bit UP stochastic number \(X_\text {UP}\) represents the rational number \(\text {HW}(X)/L\), where \(\text {HW}(X)\) denotes the Hamming weight of X. UP stochastic numbers represent rational numbers in the range of [0, 1], with a resolution of 1/L. An L-bit IBP stochastic number \(X_\text {IBP}\) represents the rational number \(1-2\mathrm{HW}(X)/L\) in the range of \([-1,1]\), with a resolution of 2/L. All bit sequences with the same Hamming weight represent an identical rational number. For example, the 4-bit stochastic numbers 0001, 0010, 0100, and 1000 all represent the same rational number, namely 1/4 for UP and 1/2 for IBP. The transformation from binary to stochastic numbers is performed randomly.
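The two representations can be illustrated with a minimal Python sketch. The function names `b2s` and `s2b` mirror the B2S/S2B transformations used later in the paper, but their signatures are hypothetical. The sketch also assumes one particular randomization, in which the Hamming weight is fixed exactly and only the positions of the 1s are shuffled; sampling each bit independently would be an equally valid choice.

```python
import random

def b2s(value, length, encoding="UP", rng=random):
    """Convert a number to a random stochastic bit sequence (illustrative only)."""
    if encoding == "UP":        # value in [0, 1], value = HW / L
        p_one = value
    elif encoding == "IBP":     # value in [-1, 1], value = 1 - 2 * HW / L
        p_one = (1 - value) / 2
    else:
        raise ValueError("encoding must be 'UP' or 'IBP'")
    ones = round(p_one * length)              # target Hamming weight
    bits = [1] * ones + [0] * (length - ones)
    rng.shuffle(bits)                         # assumption: only positions are random
    return bits

def s2b(bits, encoding="UP"):
    """Recover the represented rational number from the Hamming weight."""
    hw = sum(bits)
    if encoding == "UP":
        return hw / len(bits)
    return 1 - 2 * hw / len(bits)

# Any 4-bit sequence with a single 1 represents 1/4 in UP and 1/2 in IBP.
print(s2b([0, 0, 0, 1], "UP"), s2b([0, 0, 0, 1], "IBP"))
```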

One major advantage of SC is that multiplication and addition can be performed using only L-bit logic gates and multiplexers, respectively. Let A and B be L-bit stochastic numbers, where \(a_i\) and \(b_i\) are their ith bits, respectively. The stochastic multiplication, \(G = AB\), is given by the bit-parallel AND of A and B for UP (i.e., \(g_i = a_ib_i\)) and by their bit-parallel XOR for IBP (i.e., \(g_i = a_i\oplus b_i\)), where \(g_i\) is the ith bit of G. The stochastic addition, \(D = A+B\), is implemented with a multiplexer that randomly selects \(a_i\) or \(b_i\) for the ith bit of D (denoted by \(d_i\)). If \(a_i\) (or \(b_i\)) is selected with a probability of 1/2, the stochastic addition becomes a normalized addition \((A+B)/2\). These features enable very lightweight circuit designs, as an SC adder/multiplier has a logic depth of only one, independently of the bit length of the stochastic number. In the extreme case, an SC adder/multiplier can be implemented with only one logic gate if the logic operation is performed serially, or with very low latency if it is performed in parallel. In fact, this lightweight feature of SC arithmetic is exploited in the domain of digital circuit design for applications such as digital filters [12, 13] and NN inferences (e.g., [14,15,16,17]). Owing to its probabilistic nature, an SC computation result is valid only as an expected value, which indicates that SC results always contain noise. To ensure that this noise remains within an acceptable range, the stochastic number length must be determined appropriately based on the application.
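The bit-level operations described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a circuit model: the helper names are hypothetical, and each bit is assumed to be sampled independently (Bernoulli), so the results match the exact product or normalized sum only in expectation.

```python
import random

def sc_mul_up(a, b):
    """UP multiplication: bit-parallel AND."""
    return [x & y for x, y in zip(a, b)]

def sc_mul_ibp(a, b):
    """IBP multiplication: bit-parallel XOR."""
    return [x ^ y for x, y in zip(a, b)]

def sc_add_mux(a, b, rng):
    """Normalized addition (A + B) / 2: pick a_i or b_i with probability 1/2."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]

def ibp_bits(value, length, rng):
    """Assumption: independent bits, each 1 with probability (1 - value) / 2."""
    p_one = (1 - value) / 2
    return [1 if rng.random() < p_one else 0 for _ in range(length)]

def ibp_value(bits):
    return 1 - 2 * sum(bits) / len(bits)

rng = random.Random(0)
a = ibp_bits(0.5, 4096, rng)
b = ibp_bits(-0.4, 4096, rng)
print(ibp_value(sc_mul_ibp(a, b)))       # close to 0.5 * (-0.4) = -0.2
print(ibp_value(sc_add_mux(a, b, rng)))  # close to (0.5 - 0.4) / 2 = 0.05
```

The printed values fluctuate from run to run (here the seed is fixed), which is the SC noise that the stochastic number length has to keep within an acceptable range.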

3 Proposed scheme

3.1 Basic concept and construction

The basic concept behind the HESC is that stochastic addition can be realized by multiplexing inputs without any arithmetic operations, whereas the homomorphic evaluation of stochastic multiplication is realized by either homomorphic addition or multiplication of the underlying HE.

The HESC involves the binary-stochastic number transformation (B2S) of plaintext, encryption, and decryption via additive/multiplicative HE (or SHE) used for homomorphic stochastic multiplication, and stochastic-binary number transformation (S2B) of the decrypted plaintext.

Herein, the ciphertext of the HESC is represented by a sequence of blocks, each of which is the encrypted result of one bit of the stochastic number representing the plaintext. The key length of the HESC is equal to that of the underlying HE; the ciphertext length and computational cost are proportional to the stochastic number length. The HESC does not require any bootstrapping because a single homomorphic operation of the underlying HE (either addition or multiplication) suffices to realize the homomorphic evaluation of both the stochastic addition and multiplication of the HESC.

Encryption

The encryption Algorithm 1 \(\textsf {HESC}.\textsf {Enc}\) uses a public key \(\textsf {pk}\), where \(\textsf {HE}.\textsf {Enc}_\textsf {pk}\) is the encryption with the underlying HE with \(\textsf {pk}\). First, the plaintext, M (\(M \in [0, 1]\) for UP and \(M \in [-1, 1]\) for IBP), is converted to a stochastic number by B2S. A stochastic number can be easily produced from a binary number by a random number generator. Each bit of the stochastic number is then separately encrypted, i.e., \(\textsf {HE}.\textsf {Enc}_\textsf {pk}\) is performed L times to encrypt all the bits.

Decryption

In the decryption Algorithm 2, \(\textsf {HE}.\textsf {Dec}_\textsf {sk}\) denotes the decryption of the underlying HE with the private key, \(\textsf {sk}\). The HESC decryption follows the inverse procedure of the HESC encryption. Each ciphertext block in the HESC ciphertext is decrypted by \(\textsf {HE}.\textsf {Dec}_\textsf {sk}\) to acquire a stochastic number of the plaintext. Lastly, \(\textsf {S2B}\) provides the result of the operation.

Homomorphic evaluation

In the homomorphic evaluation Algorithms 3 and 4 for stochastic addition and multiplication, respectively, \(\textsf {HE}.\textsf {Eval}\) denotes the homomorphic evaluation of the underlying HE (i.e., homomorphic addition and multiplication for IBP and UP, respectively). The homomorphism of the underlying HE is employed for homomorphic stochastic multiplication (i.e., \(\textsf {HE}.\textsf {Eval}\)). One of the two input blocks is randomly selected for the homomorphic stochastic addition. In Algorithm 3, a stochastic number, S, with a Hamming weight corresponding to a selection signal, A, is generated by an external random number generator. The ith block of the addition result is then selected based on \(s_i\) (i.e., the ith bit of S), as described in Sect. 2.2. In Algorithm 4, \(\textsf {HE}.\textsf {Eval}\) is performed L times to obtain the resulting HESC ciphertext R as \(r_i = \textsf {HE}.\textsf {Eval}(c_i, c'_i)\), where \(c_i\) and \(c'_i\) denote the ith blocks of the two inputs (i.e., HESC ciphertexts), C and \(C'\), respectively, and \(r_i\) denotes the ith block of R.

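[Algorithms 1–4 (figures a–d): HESC encryption, HESC decryption, homomorphic stochastic addition, and homomorphic stochastic multiplication]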
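The encryption, decryption, and homomorphic evaluation procedures described in this section can be sketched with a toy Goldwasser–Micali (GM) instance as the underlying additive HE, whose homomorphic XOR corresponds to IBP multiplication. Everything below is an assumption made for illustration: the tiny fixed primes are insecure, the function names are hypothetical, bits are sampled independently in B2S, and the addition always uses a selection probability of 1/2. It is not the authors' implementation.

```python
import math
import random

# Toy GM parameters (insecure, illustration only).
P, Q = 499, 547      # both primes are 3 mod 4, so -1 is a non-residue mod each
N = P * Q            # public modulus
X = N - 1            # public non-residue with Jacobi symbol +1

def gm_enc(bit, rng):
    """Encrypt one bit as y^2 * X^bit mod N for a random y coprime to N."""
    while True:
        y = rng.randrange(2, N)
        if math.gcd(y, N) == 1:
            break
    return (y * y * pow(X, bit, N)) % N

def gm_dec(c):
    """Bit is 0 iff c is a quadratic residue mod P (Euler's criterion)."""
    return 0 if pow(c, (P - 1) // 2, P) == 1 else 1

def hesc_enc(value, length, rng):
    """B2S in IBP, then encrypt every bit separately (L calls to the HE)."""
    p_one = (1 - value) / 2
    bits = [1 if rng.random() < p_one else 0 for _ in range(length)]
    return [gm_enc(b, rng) for b in bits]

def hesc_dec(blocks):
    """Decrypt every block, then S2B in IBP."""
    bits = [gm_dec(c) for c in blocks]
    return 1 - 2 * sum(bits) / len(bits)

def hesc_mul(c, c_prime):
    """Block-wise homomorphic XOR, i.e., IBP stochastic multiplication."""
    return [(x * y) % N for x, y in zip(c, c_prime)]

def hesc_add(c, c_prime, rng):
    """Block-wise random selection, i.e., normalized stochastic addition."""
    return [x if rng.random() < 0.5 else y for x, y in zip(c, c_prime)]

rng = random.Random(1)
ca, cb = hesc_enc(0.5, 2048, rng), hesc_enc(-0.5, 2048, rng)
print(hesc_dec(hesc_mul(ca, cb)))       # close to -0.25
print(hesc_dec(hesc_add(ca, cb, rng)))  # close to 0.0
```

Note that the addition touches no ciphertext arithmetic at all, and the multiplication uses only the single homomorphic operation that GM provides, which is why no bootstrapping is involved.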

3.2 HESC with lattice-based cryptography

3.2.1 Basic concept

The underlying concept is that certain HEs, which are based on lattice-based cryptography (with a plaintext packing scheme [31]) such as BFV [27] and CKKS [9], can encode a vector (or polynomial) into one ciphertext block. They can process its homomorphic evaluation at once, as shown in CryptoNets [4]; this process is called single instruction multiple data processing.

During the HESC encryption, the \(\textsf {B2S}\) result is given by an L-bit stochastic number (i.e., an L-dimensional vector). If the HE encrypts an n-dimensional vector in one block, the HESC encryption of a stochastic number of length L is completed with L/n calls to \(\textsf {HE}.\textsf {Enc}\). Consequently, the number of HE ciphertext blocks is reduced to L/n. Therefore, lattice-based cryptography reduces the computational cost and ciphertext size of the HESC. Such vectorized homomorphic evaluation is also beneficial for homomorphic stochastic multiplication.
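As a small illustration of the block count, the following Python sketch splits an L-bit stochastic number into vectors of at most n slots, one per packed ciphertext. The helper name `split_into_blocks` is hypothetical, and the sketch rounds up when L is not a multiple of n, a case the text does not discuss.

```python
def split_into_blocks(bits, slots):
    """Group a stochastic number into vectors of at most `slots` bits each."""
    return [bits[i:i + slots] for i in range(0, len(bits), slots)]

bits = [0] * 8192
print(len(split_into_blocks(bits, 2048)))  # 4 packed blocks instead of 8192 single-bit blocks
```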

3.2.2 Stochastic addition compatible with lattice-based HEs

The classic homomorphic stochastic addition is no longer directly applicable under the packed encoding explained earlier because the homomorphic evaluation of random multiplexing in a bit-wise manner cannot be applied to such packed ciphertexts. More precisely, the random selection of ciphertext blocks does not correspond to the conventional stochastic addition, which is a random selection of bits. However, the stochastic addition can still be evaluated using a plaintext-ciphertext multiplication in some lattice-based schemes, including BFV. Let \(A^1\) and \(A^2\) be two stochastic numbers to be added. We generate a random bit string S used for stochastic addition, and let \({\bar{S}}\) be its complement. The stochastic addition is evaluated as \(A = SA^1 + {\bar{S}}A^2\), where \(SA^1\) and \({\bar{S}}A^2\) are computed using a plaintext-ciphertext multiplication. As such a multiplication is not expensive, stochastic additions can be evaluated even for the lattice-based HESC.
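The masked form \(A = SA^1 + {\bar{S}}A^2\) can be checked on plain vectors with the Python sketch below. Here the lists `a1` and `a2` merely stand in for the slot vectors of two packed ciphertexts, and the slot-wise products stand in for plaintext-ciphertext multiplications; no actual encryption is performed, and the function name is hypothetical.

```python
import random

def masked_add(a1, a2, rng):
    """Slot-wise S * A1 + (1 - S) * A2 with a fresh random mask S."""
    s = [rng.randrange(2) for _ in a1]      # random selection mask
    s_bar = [1 - x for x in s]              # its complement
    return [si * x + sbi * y for si, sbi, x, y in zip(s, s_bar, a1, a2)]

rng = random.Random(0)
a1 = [1, 0, 1, 1, 0, 0, 1, 0]
a2 = [0, 1, 1, 0, 0, 1, 0, 1]
print(masked_add(a1, a2, rng))  # every slot equals the matching slot of a1 or a2
```

Because exactly one of \(s_i\) and \({\bar{s}}_i\) is 1 in every slot, the result is the same random multiplexing as in the conventional addition, expressed with operations that a packed scheme supports.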

In addition, a new addition method for stochastic numbers is presented, which improves the precision at the cost of an increase in ciphertext length: the two stochastic numbers are simply concatenated. Let \(A = (a_1, a_2, \dots , a_{L})\) and \(B = (b_1, b_2, \dots , b_{L})\) be the input stochastic numbers. In the new method, the normalized sum of A and B (i.e., \(D = (A+B)/2\)) is given as \(D = A \parallel B = (a_1, a_2, \dots , a_{L}, b_1, b_2, \dots , b_{L})\). The resulting D represents a rational number within the range of [0, 1] for UP or \([-1, 1]\) for IBP, with a resolution of 1/(2L) for UP or 1/L for IBP, at the expense of the stochastic number length (i.e., 2L). This addition is feasible even for packed ciphertexts because it can be realized by concatenating ciphertexts. Although the ciphertext length increases with each addition, the new method exhibits the following features: (i) applicability to packed ciphertexts and (ii) no additional noise/error during addition.
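A minimal Python sketch of the concatenation-based addition follows; the function names are hypothetical. Unlike multiplexer-based addition, the decoded result equals the normalized sum of the decoded inputs exactly, since every input bit is kept.

```python
def concat_add(*numbers):
    """Normalized sum of equal-length stochastic numbers by concatenation."""
    out = []
    for bits in numbers:
        out.extend(bits)
    return out

def ibp_value(bits):
    return 1 - 2 * sum(bits) / len(bits)

a = [0, 0, 0, 1]   # IBP value 0.5
b = [1, 1, 1, 0]   # IBP value -0.5
d = concat_add(a, b)
print(len(d), ibp_value(d))  # 8 bits, value (0.5 - 0.5) / 2 = 0.0
```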

The concatenated stochastic addition is then formally validated as Proposition 1.

Proposition 1

Let \(A^1, A^2, \dots , A^f\) be f stochastic numbers in IBP with a length of L. Their concatenation \(A = A^1 \parallel A^2 \parallel \dots \parallel A^f\) is a valid stochastic sum of \(A^1, A^2, \dots , A^f\) with a variance of \(\frac{4}{L}\sum _{j=1}^fp^j(1-p^j)\).

Proof

Consider the normalized sum of f stochastic numbers \(A^1, \dots ,A^j, \dots ,A^f\) in IBP. Let \(a_i^j\) be a random variable representing the ith bit of the stochastic number \(A^j\) \((1\le i\le L)\). Then, \(A^j\) can be considered as a random variable given as follows:

$$\begin{aligned} A^j=1-2\frac{\sum _{i=1}^{L}a_i^j}{L}. \end{aligned}$$
(1)

Here, the sum of f stochastic numbers is expressed as follows:

$$\begin{aligned} A = \sum _{j=1}^f A^j=f-2\frac{\sum _{j=1}^{f}\sum _{i=1}^{L}(a_i^j)}{L}. \end{aligned}$$
(2)

Using the expected values, these sums are given as follows:

$$\begin{aligned} \mathbb {E} A = f-2\frac{\sum _{j=1}^{f}\sum _{i=1}^{L} \mathbb {E} (a_i^j)}{L} = f - 2 \sum _{j=1}^f p^j, \end{aligned}$$
(3)

where \(p^j = \mathbb {E} (a_i^j)\) is the probability that a bit of \(A^j\) is 1.

Based on the variance, the error is given as follows:

$$\begin{aligned} V\left[ A\right] = \mathbb {E} \left[ \left( A - \mathbb {E} A\right) ^2\right] = \frac{4}{L} \sum _{j=1}^f p^j (1-p^j). \end{aligned}$$
(4)

The standard deviation of the error is inversely proportional to the square root of the stochastic number length. The error after the addition decreases as the stochastic number length increases, which validates the concatenation-based sum as an SC addition. \(\square \)

Based on Eq. (2), the resulting sum is accurately normalized from the scaling coefficient. Therefore, the error after the addition is given by the sum of the errors of the input stochastic numbers; no error is added for the concatenated stochastic addition. This indicates that the HESC with the concatenated stochastic addition is advantageous over that with the conventional addition, if the stochastic number length of the resulting ciphertext is acceptable.
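The variance in Eq. (4) can be checked numerically with a short Monte Carlo sketch in Python. It assumes, as the proof implicitly does, that all bits are independent and that each bit of \(A^j\) is 1 with probability \(p^j\); the function names, the chosen probabilities, and the trial count are illustrative only.

```python
import random
import statistics

def ibp_sample(p_one, length, rng):
    """Draw one IBP stochastic number and return the value it represents."""
    ones = sum(1 for _ in range(length) if rng.random() < p_one)
    return 1 - 2 * ones / length

def predicted_variance(p_ones, length):
    """Right-hand side of Eq. (4)."""
    return 4 / length * sum(p * (1 - p) for p in p_ones)

def empirical_variance(p_ones, length, trials, rng):
    """Sample variance of A = sum_j A^j over many independent trials."""
    sums = [sum(ibp_sample(p, length, rng) for p in p_ones) for _ in range(trials)]
    return statistics.pvariance(sums)

rng = random.Random(0)
p_ones = [0.2, 0.5, 0.7]
print(predicted_variance(p_ones, 64))            # 4/64 * 0.62 = 0.03875
print(empirical_variance(p_ones, 64, 2000, rng)) # should be close to the prediction
```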

The resulting stochastic number has a bias derived from the concatenated bit position because each input stochastic number has a unique bias. Therefore, the subsequent operations after the concatenated addition must be carefully performed. Additionally, when using the HESC with CKKS, the result after the operation includes CKKS-derived errors along with the SC-derived errors. Consequently, the decrypted value in the HESC with CKKS is not necessarily an integer. Hence, the value must be rounded to the closest integer to accurately realize S2B after decryption.

Fig. 1 Comparison of the outline plots for \(f(x)=x^2+x+1\)

Fig. 2 Comparison of the average errors for \(f(x)= \sum _{i=0}^n x^i\) for \(n \le 10\). Average count means the number of trials to compute the resulting averaged value

Fig. 3 Comparison of the maximum errors for \(f(x)= \sum _{i=0}^n x^i\) for \(n \le 10\). Average count means the number of trials to compute the resulting averaged value

3.2.3 Experimental evaluation of SC additions

A polynomial function, \(f(x)=\sum _{i=0}^n x^i\), is evaluated by SC, and the resulting errors are analyzed to validate the effectiveness of the new addition method. The input stochastic number length is set to \(L=2048\) bits. Figure 1 shows the outline plots of f(x) evaluated using the conventional and new addition (i.e., concatenated stochastic addition) methods. Furthermore, Figs. 2 and 3 compare the mean and maximum errors of the two methods for \(n \le 10\). The input value is swept over \([-1, 1]\) in increments of 0.01 (i.e., 200 computations are plotted) to obtain the outline plots. The average and maximum errors are calculated from the difference between the outputs and the true values.

Figures 2 and 3 show that the error decreases as the number of averaged trials increases. For example, when averaging over 10 trials, the mean errors are reduced by approximately 70% for both methods, relative to the corresponding errors without averaging. The error of the stochastic operation results follows a binomial distribution; therefore, increasing the number of averaged trials suppresses the variance (i.e., the error). The values in Figs. 2 and 3 are experimental, and they may differ from trial to trial owing to the probabilistic nature of SC. Theoretically, the error is inversely proportional to the square root of the number of averaged trials.
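The reported reduction of approximately 70% is in line with this inverse-square-root behavior, as the following one-line Python check suggests. It assumes independent trials and an error that scales with the standard deviation of the averaged result; the function name is hypothetical.

```python
import math

def expected_error_reduction(k):
    """Relative error reduction when averaging k independent trials."""
    return 1 - 1 / math.sqrt(k)

print(round(expected_error_reduction(10), 3))  # 0.684
```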

The results also indicate that the errors in the new method are considerably smaller than those obtained by the conventional method. For example, the mean error of the new method is 58.7% smaller than that of the conventional method when averaging over 10 trials. This is because the resolution/accuracy of the new method is uncompromised after each stochastic addition, as shown in Eq. (4). The value computed via stochastic addition must be multiplied by a constant corresponding to the number of additions to obtain the correct value; however, in the conventional method, this produces a loss of resolution and accuracy. The concatenation-based method resolves this issue by extending the stochastic number length after addition. In addition, the growth of the error with increasing degree, n, is significantly suppressed by the proposed method compared to the conventional one. This feature would be useful for practical applications that involve functions of non-trivial degree, as demonstrated in Sect. 4. Thus, the effectiveness of the new addition method on such a polynomial can be confirmed.

Note that this experiment only compares the conventional and proposed methods; in practice, when the stochastic addition must be performed many times, the two can be combined by adaptively choosing between the conventional and concatenation-based SC additions to exploit the tradeoff between accuracy and ciphertext size. A methodology for designing circuits that use both stochastic additions and exploit this tradeoff is left for future work.

3.3 Improvement by reduction in ciphertext size

HESC schemes with lattice-based cryptography use concatenated stochastic addition, which increases the number of ciphertexts after additive evaluation. Further, the schemes perform stochastic multiplication by adding ciphertexts. The resulting plaintexts can be non-negative integers as in ordinary stochastic operations. In particular, the number of ciphertexts increases significantly when a large number of addition and multiplication operations are performed. As the number of ciphertexts increases, the resulting decryption cost also increases.

Fig. 4 Technique that can reduce the size of ciphertexts

To address the above limitation, we introduce an improvement technique for reducing the number of ciphertexts while maintaining the above stochastic operations. Figure 4 shows an overview of the technique, which entails fusion and separation parts before and after decryption, respectively. The fusion part packs several ciphertexts by weighted addition while the separation part separates the ciphertext package with the weights used. Even with the additional two parts, the reduction in ciphertext size can reduce the total computation time.

In the following, let \(\boxed {m}\) be a ciphertext whose plaintext has a maximum value of m. We first assume that the fused ciphertexts are not restricted to \(\boxed {1}\). The basic idea is to fuse ciphertexts as the digits of a base-W number, where W is an integer larger than the maximum value m. Let \(a_1\) and \(a_2\) be non-negative integers (i.e., the plaintexts of two ciphertexts) less than W. If \(A=a_1\cdot W+a_2\), then \(a_1\) and \(a_2\) can be separated as follows:

$$\begin{aligned} \lfloor A/W \rfloor &= a_1, \end{aligned}$$
(5)
$$\begin{aligned} A \bmod W &= a_2. \end{aligned}$$
(6)

Here, one constant multiplication and one addition are performed for generating the fused ciphertext A, which is easily computed in HE. Applying the above operations recursively, we can fuse multiple ciphertexts into a single fused ciphertext. That is, we can fuse N ciphertexts \(\boxed {m_1},\boxed {m_2}\dots \boxed {m_N}\) into \(\mu \) as follows:

$$\begin{aligned} \mu &= \boxed {m_1}\prod _{i=1}^{N-1}W_i + \boxed {m_2}\prod _{i=2}^{N-1}W_i \nonumber \\ &\quad + \dots +\boxed {m_{N-1}}W_{N-1}+\boxed {m_N}, \end{aligned}$$
(7)

where \(W_1,W_2,\dots ,W_i,\dots ,W_N\) are constants that are larger than the maximum values of the corresponding plaintexts \(m_1,m_2,\dots ,m_i,\dots ,m_N\), respectively. If \(W=W_i\) for all i, the ciphertext \(\mu \) after the fusion is given as follows:

$$\begin{aligned} \mu =\boxed {m_1}W^{N-1}+ \dots +\boxed {m_{N-1}}W+\boxed {m_N}. \end{aligned}$$
(8)

This fusion part has a restriction on the maximum number of fused ciphertexts owing to the increase in the plaintext space and noise after the operation. In Eq. (8), if \(m=m_i\) for all i and \(W=m+1\), then the value A after the fusion is given as follows:

$$\begin{aligned} A&=mW^{N-1}+\dots +mW+m\nonumber \\&=m\frac{W^N-1}{W-1}\nonumber \\&=W^N-1 . \end{aligned}$$
(9)

This makes it possible to estimate the maximum number of fused ciphertexts for a given plaintext space. This estimate does not account for the slight increase in noise with each addition. After decryption, the fused value is separated by recursively dividing by \(W_i\) and taking the remainder.
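The fusion and separation steps can be sketched on plain integers in Python, with a single weight W as in Eq. (8). In the scheme itself the fusion would act on ciphertexts through constant multiplications and homomorphic additions; here ordinary integers stand in for the plaintexts, and the function names `fuse` and `separate` are hypothetical.

```python
def fuse(values, w):
    """Pack values m_1..m_N (each < w) as the digits of a base-w integer."""
    mu = 0
    for m in values:
        if not 0 <= m < w:
            raise ValueError("each value must satisfy 0 <= m < w")
        mu = mu * w + m      # one constant multiplication and one addition
    return mu

def separate(mu, w, count):
    """Recover the digits after decryption by repeated division with remainder."""
    values = []
    for _ in range(count):
        mu, remainder = divmod(mu, w)
        values.append(remainder)
    values.reverse()         # digits were extracted from m_N back to m_1
    return values

mu = fuse([3, 0, 2, 1], 4)
print(mu, separate(mu, 4, 4))  # 201 [3, 0, 2, 1]
print(fuse([3, 3, 3], 4))      # 63 = 4**3 - 1, the maximum in Eq. (9)
```

The second call shows why the plaintext space bounds the number of fused ciphertexts: with N digits in base W, the fused value can reach \(W^N-1\).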

We then assume that the ciphertexts fused are only \(\boxed {1}\) (i.e., 0 or 1). To recover the binary number from the computed stochastic number, solely the number of “1”s in the decoded stochastic number sequence is required. When the decoded sequence is \(\{0, 1\}^N\), this is equivalent to obtaining the sum of elements. This means that the weight W should be 1 in the fusion. In this case, the fusion part is given solely by the addition of ciphertexts, and the increase in noise is extremely small. Therefore, compared with the case wherein ciphertexts fused are not only \(\boxed {1}\), we can merge more ciphertexts.
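For this binary case, a plain-integer Python sketch shows that fusing with W = 1 yields exactly the quantity that S2B needs. The helper names are hypothetical, the sum stands in for homomorphic additions of ciphertexts, and IBP decoding is assumed.

```python
def fuse_bits(bits):
    """Fusion with W = 1: plain addition, giving the number of 1s."""
    return sum(bits)

def ibp_from_count(ones, length):
    """S2B in IBP from the fused count alone."""
    return 1 - 2 * ones / length

bits = [0, 1, 0, 0, 1, 0, 0, 0]
print(ibp_from_count(fuse_bits(bits), len(bits)))  # 1 - 2 * 2 / 8 = 0.5
```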

4 Performance evaluation

In this section, the implementation performance of the HESC is evaluated by using several applications. The homomorphic evaluations of a polynomial function are first conducted for a typical application.

Such polynomial functions have been used as activation functions in oblivious inference protocols such as CryptoNets [4] because major nonlinear functions in a standard model (e.g., ReLU and Sigmoid) cannot be homomorphically evaluated over ciphertexts. Therefore, the performance of the homomorphic evaluation of such polynomial functions is an important benchmark. The HESC is then applied to the Iris classification for a more practical evaluation. This is a simpler dataset than the MNIST [32] and other datasets used by studies on oblivious inference using HE (CryptoNets [4], LoLa [6], Falcon [5], etc.), but it is very effective as a baseline.

In the following sections, all stochastic numbers are expressed in IBP.

Table 1 Experimental condition for performance evaluation of HESC instantiated with various HEs
Table 2 Computation time per bit of stochastic number (\(\upmu \)s)

4.1 Basic implementation and comparison

Firstly, the fundamental performance of the HESC is evaluated by implementing it with the typical HEs. To this end, the HESC schemes with three additive HEs: GM encryption [19], lifted-ElGamal [20, 21], and lifted-ECElGamal [22, 23, 36] are implemented. Additionally, BFV [27] and CKKS [9] are employed in the prototype HESCs with lattice-based cryptography, as described in Sect. 3.2, and their performance is then evaluated. The execution time is measured using an Intel Core i7-8665U (2.10 GHz) system with 16 GB of memory.

Table 1 lists the experimental conditions for the HE implementation, where the parameters are set to meet an equivalent security level (128 bits [37]) for each scheme. Table 2 compares the execution times of encryption, homomorphic evaluation, and decryption for the basic HESCs with the three additive HEs at the top. Among these, the HESC with the GM encryption is the fastest. This is because the plaintext space of the GM encryption is \(\mathbb {F}_2\), whereas those of the other two schemes are \(\mathbb {F}_p\) (where p is an odd prime). Thus, the GM encryption can be efficiently implemented as a basic HESC. Additionally, Table 2 compares the respective execution times for the HESCs with lattice-based schemes (i.e., BFV and CKKS) at the bottom. The HESCs with lattice-based cryptography require packing and unpacking operations that pack multiple plaintexts into a single ciphertext (i.e., packing [31], also known as encoding) at the beginning and unpack it at the end, respectively. However, the total execution time per bit is much smaller than that of the basic HESCs.

For example, if a 2048-bit stochastic number can be packed into a single ciphertext, only one \(\textsf {HE}.\textsf {Enc}\) is called to complete the encryption; however, 2048 \(\textsf {HE}.\textsf {Enc}\) calls are required for the basic HESCs without packing techniques. This advantage of reducing the number of function calls is also reflected in the subsequent homomorphic evaluation. A comparison between the BFV and CKKS shows that CKKS is advantageous in terms of the evaluation time required for realizing stochastic addition and multiplication, although the packing/unpacking (i.e., encoding/decoding) of CKKS takes a little longer due to its unique features such as the usage of floating-point representation [38].

Based on the above comparison results, the HESCs with BFV and CKKS are considered for the performance evaluation in the following section.

Fig. 5
figure 5

Outline plots of \(f(x)=x^2+x+1\): a HESC w/BFV, b HESC w/CKKS, and c CKKS

Fig. 6
figure 6

Outline plots of \(g(x)=x^3+x^2+x+1\): a HESC w/BFV, b HESC w/CKKS, and c CKKS

Fig. 7 Comparison of errors: (a) f(x) and (b) g(x)

4.2 Polynomial functions

Figures 5 and 6 show the evaluation results of the functions \(f(x)=x^2+x+1\) and \(g(x)=x^3+x^2+x+1\) obtained using “HESC with BFV” and “HESC with CKKS.” These target functions were chosen because, for a given polynomial degree, an all-one polynomial is the most demanding case: its evaluation requires the greatest number of two-input additions among the functions of that degree. Note that addition, rather than multiplication, is the critical operation for the HESC; therefore, the choice of coefficients does not have a large impact on the result. These figures also present the evaluation results of CKKS for comparison. Note that no averaging is performed for the plots. The figures show that the HESCs compute the polynomial functions approximately, with some errors due to the nature of SC.
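To give a feel for where the SC-derived errors come from, the following plaintext-only sketch evaluates \(f(x)=x^2+x+1\) with stochastic numbers. It rests on assumptions that are not confirmed by this section: bipolar encoding (a bit is 1 with probability \((x+1)/2\)), multiplication by bitwise XNOR of independent streams, and concatenation-based addition followed by rescaling with the number of terms. All function names are hypothetical, and no encryption is involved.

```python
import random

def b2s(x, length, rng):
    """Bipolar encoding (assumed): each bit is 1 with probability (x + 1) / 2."""
    p = (x + 1.0) / 2.0
    return [1 if rng.random() < p else 0 for _ in range(length)]

def s2b(bits):
    """Bipolar decoding: value = 2 * mean(bits) - 1."""
    return 2.0 * sum(bits) / len(bits) - 1.0

def sc_mul(a, b):
    """Bipolar stochastic multiplication: bitwise XNOR of independent streams."""
    return [1 - (u ^ v) for u, v in zip(a, b)]

def sc_add_concat(streams):
    """Concatenation-based addition: the decoded value is the mean of the inputs."""
    out = []
    for s in streams:
        out.extend(s)
    return out

def eval_f(x, length, rng):
    """Approximate f(x) = x^2 + x + 1 on plaintext stochastic numbers."""
    # x^2 needs two independent streams for x; a third one supplies the linear term.
    x1, x2, x3 = (b2s(x, length, rng) for _ in range(3))
    one = [1] * length                      # the constant +1 in bipolar encoding
    cat = sc_add_concat([sc_mul(x1, x2), x3, one])
    return 3.0 * s2b(cat)                   # undo the 1/3 scaling of the concatenation

rng = random.Random(0)
print(eval_f(0.5, 4096, rng))  # close to f(0.5) = 1.75, up to SC noise
```

In this sketch the output is three times as long as each input stream, which mirrors the growth in ciphertext count that concatenation-based addition causes in the encrypted setting.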

Figure 7 shows the mean and maximum errors of the three schemes, where the input value is taken from \([-1,1]\) with a resolution of 0.01. This means that the results for 201 input values were evaluated to plot a single error value. The mean and maximum errors were calculated from the absolute values of the differences between the calculated and true values. The horizontal axis represents the number of evaluations that were averaged. For example, \(201 \times 10\) results were evaluated in total when the number of averaged evaluations was 10. Figure 7 shows that, in the HESCs, the error decreases as the number of averaged evaluations increases. Although the error can be reduced by averaging over multiple evaluations, it cannot be completely removed.
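The measurement procedure described above can be sketched as follows. The helper names `input_grid` and `error_stats` are hypothetical, and `noisy_f` is only a stand-in for an HESC evaluation: it adds Gaussian noise to the true value, which is an assumption made for illustration and not the actual error distribution of the HESC.

```python
import random

def input_grid(lo=-1.0, hi=1.0, step=0.01):
    """Evenly spaced inputs; [-1, 1] with step 0.01 gives 201 points."""
    n = round((hi - lo) / step)
    return [lo + i * step for i in range(n + 1)]

def error_stats(evaluate, truth, xs, n_avg):
    """Mean and maximum absolute error after averaging n_avg evaluations per input."""
    errors = []
    for x in xs:
        estimate = sum(evaluate(x) for _ in range(n_avg)) / n_avg
        errors.append(abs(estimate - truth(x)))
    return sum(errors) / len(errors), max(errors)

def f(x):
    return x * x + x + 1.0

rng = random.Random(0)

def noisy_f(x):
    """Stand-in for a noisy evaluation (assumption: additive Gaussian noise)."""
    return f(x) + rng.gauss(0.0, 0.05)

xs = input_grid()
for n_avg in (1, 10, 100):
    mean_err, max_err = error_stats(noisy_f, f, xs, n_avg)
    print(n_avg, mean_err, max_err)
```

Each call to `error_stats` corresponds to one point on the horizontal axis of an error plot: it consumes `len(xs) * n_avg` evaluations and yields one mean error and one maximum error.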

Table 3 Computation time for polynomial functions (ms)

In particular, such averaging reduces the expected error only in proportion to the inverse square root of the number of samples, and not faster. Therefore, the evaluation result of the HESC is not as accurate as that of CKKS, which produces almost no error. This error must be considered when applying the HESC, as evaluated in the Iris classification below.
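The inverse-square-root behavior can be made explicit under a simplifying assumption that is not taken from the experiments above: a single stochastic number of length \(L\) in bipolar encoding, whose bits are independent and equal to 1 with probability \(p=(x+1)/2\). For the decoded estimate \(\hat{x}=2\bar{b}-1\), where \(\bar{b}\) is the mean of the bits,

\[ \mathrm{Var}[\hat{x}] = \frac{4p(1-p)}{L} = \frac{1-x^2}{L}, \qquad \sigma_K = \sqrt{\frac{1-x^2}{KL}}, \]

where \(\sigma_K\) is the standard deviation after averaging \(K\) independent evaluations. Under this assumption, reducing the error by a factor of 10 takes roughly 100 times as many samples. The same \(1/\sqrt{K}\) scaling holds for the average of any \(K\) independent, unbiased evaluations with finite variance, although the constant differs once stochastic additions and multiplications are involved.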

The error of g(x) in all the schemes is larger than that of f(x). This is because the degree of g(x) is higher than that of f(x), which increases the number of operations. Both the CKKS-derived and stochastic-derived errors increase with the increase in the depth of operations. “HESC with CKKS” contains both errors, but the effect of the CKKS-derived error is trivial due to the rounding process to the stochastic numbers during decoding. Therefore, no significant difference is observed in the calculation accuracy between “HESC with CKKS” and “HESC with BFV.”

Table 3 shows the evaluation times for f(x) and g(x). For comparison, the result of BFV is shown in addition to the three schemes described above. For both functions, the HESC is faster than CKKS, which is one of the fastest conventional schemes. This is because the HESC does not require homomorphic multiplication over ciphertexts, which is a major time-consuming procedure in HEs. Additionally, CKKS requires larger parameters to tolerate the noise introduced by multiplication, which also degrades its computational efficiency. HESC with CKKS is usually faster in evaluation than HESC with BFV. Since the HESC does not require homomorphic multiplication, the depth of operations for low-order polynomial functions does not significantly affect the evaluation time.

Figure 8 shows the total computation times of the above four schemes for polynomial functions of degree up to 10, given as \(\sum ^n_{i=0}x^i \ (2\le n\le 10)\). As the degree of the function increases, the multiplicative depth grows, and thus a longer computation time is required when evaluating with HE alone. In contrast, the computation time of the HESC did not increase much because it performs stochastic multiplication without using the multiplication of HE. For example, BFV and CKKS evaluated the degree-10 function in 162.1 ms and 60.9 ms, respectively, whereas HESC with CKKS evaluated the same function in only 6.5 ms (Footnote 3).

Fig. 8 Comparison of computation times in four schemes

4.3 NN oblivious inference

This section describes the effectiveness of the HESC through its application to an NN oblivious inference. The experimental setup is described as follows:

  • Iris flower dataset: This dataset consists of the lengths and widths of the petals and sepals of three different types of irises, as feature quantities. The training set contains 120 data elements, while the test set has 30 data elements.

  • NN model: It consists of two fully connected layers with a four-dimensional input and a three-dimensional output. There are four nodes in the middle layer, and the activation function is \(f(x)=x^2\).
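The model shape described in the setup above can be sketched in plaintext as follows. This is only an illustration of the 4-4-3 architecture with the square activation; the weights and biases below are hypothetical placeholder values, not the trained parameters used in the experiment, and no SC or encryption is involved.

```python
def dense(weights, bias, x):
    """Fully connected layer: y_j = sum_i weights[j][i] * x[i] + bias[j]."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def square(v):
    """Activation f(x) = x^2, applied element-wise."""
    return [t * t for t in v]

def infer(x, w1, b1, w2, b2):
    hidden = square(dense(w1, b1, x))   # 4 inputs -> 4 hidden nodes
    scores = dense(w2, b2, hidden)      # 4 hidden nodes -> 3 class scores
    label = max(range(len(scores)), key=scores.__getitem__)
    return label, scores

# Hypothetical parameters, chosen only so that the arithmetic is easy to follow.
w1 = [[0.5, 0.0, 0.0, 0.0],
      [0.0, 0.5, 0.0, 0.0],
      [0.0, 0.0, 0.5, 0.0],
      [0.0, 0.0, 0.0, 0.5]]
b1 = [0.0, 0.0, 0.0, 0.0]
w2 = [[1.0, 0.0, 0.0, 0.0],
      [0.0, 1.0, 0.0, 0.0],
      [0.0, 0.0, 0.0, 1.0]]
b2 = [0.0, 0.1, 0.0]

label, scores = infer([1.0, 0.0, -1.0, 0.5], w1, b1, w2, b2)
print(label, scores)
```

A polynomial activation such as \(x^2\) keeps the whole network expressible with additions and multiplications only, which is what allows it to be evaluated with SC operations.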

In this experiment, each parameter is normalized and clipped at \([-1,1]\) to use the SC. To create the model, we introduced Center Loss [39], which is one of the Deep Metric Learning methods. With this model, we can evaluate the similarity between input data and reduce the influence of SC errors because it can amplify the difference between the output values of the correct label and other labels.

Figure 9 shows the resulting inference accuracy, where the horizontal axis indicates the input stochastic number length used for the HESC. For reference, the model accuracy calculated by the floating-point arithmetic is shown as “float” in the figure, which is independent of the stochastic number length. The model is trained using common floating-point arithmetic, and the inference is performed on various SCs.

The legends “Conventional SC” and “Concatenated SC” correspond to the accuracies of the conventional SC and proposed SC with concatenation-based stochastic addition, respectively.

As mentioned in Sect. 3, the position-dependent biases must be considered after the concatenation-based addition. In order to address this issue, the activation function is calculated in an expanded form in “Concatenated SC,” such that the multiple-input stochastic addition is performed only once at the end of the computation. In this case, the length of the stochastic number (the number of ciphertexts) is already five times larger than that of the “conventional SC” at the input to the activation function. Consequently, the length of the stochastic number is 25 times longer after the activation function.

Figure 9 shows that both “Concatenated SC” and “Conventional SC” achieve successful inference, trading accuracy against the stochastic number length, and that “Concatenated SC” offers a higher accuracy. With an input word length of 1024 bits, the accuracy reaches a level almost identical to that of “float.” Furthermore, the stochastic number length (i.e., the number of ciphertexts) after the inference of “Concatenated SC” is 93.75 times the input length (4). However, the increase in the resulting stochastic number length depends on the model size and the method used for the computation. For example, in the second fully connected layer, the bias is only added at the end (i.e., it is simply appended in the HESC); therefore, it is unnecessary to perform B2S when encrypting the bias. By applying this concept, the resulting length is reduced to 75.75 times the input length (4).
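One way to account for the factors 5, 25, 93.75, and 75.75 quoted above is sketched below. It is offered as an assumed reading rather than the authors' stated derivation: every concatenation-based addition of k terms multiplies the length by k, the squared activation is expanded into products of all pairs of terms, and lengths are counted in units of one input stochastic number. The function name and its parameters are hypothetical.

```python
def concat_lengths(n_in=4, n_hidden=4, n_out=3, bias_appended=False):
    """Lengths in units of one input stochastic number (assumed accounting)."""
    pre_act = n_in + 1                  # 4 weighted inputs + bias -> 5
    post_act = pre_act ** 2             # expanded square: 5 * 5 = 25 product terms
    if bias_appended:
        # Bias attached as one extra unit instead of a full-length fifth term.
        per_output = n_hidden * post_act + 1
    else:
        # Bias treated as a fifth term of the same length as each hidden node.
        per_output = (n_hidden + 1) * post_act
    total = n_out * per_output
    return pre_act, post_act, per_output, total / n_in

print(concat_lengths())                     # (5, 25, 125, 93.75)
print(concat_lengths(bias_appended=True))   # (5, 25, 101, 75.75)
```

Under this reading, each of the three outputs consists of 125 units in the plain case and 101 units when the bias is simply appended, which gives the ratios 375/4 and 303/4.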

Here, the parameter at which the NN inference accuracy of SC is comparable to that of floating-point arithmetic is employed (i.e., L = 1024). The open-source cryptographic library SEAL-Python [35] is used. The HESC parameters are set to the smallest values available in the library such that a 1024-bit vector can be packed into a single ciphertext block, and the CKKS parameters are set to the smallest values such that the inference result can be accurately decrypted without any bootstrapping.

Fig. 9 Model accuracy of inference

Table 4 Computation time for inference (ms)

Table 4 shows the execution times per inference with “BFV,” “CKKS,” “HESC with BFV,” and “HESC with CKKS” (Footnote 4).

In a typical machine learning as a service (MLaaS) scenario, a network model is deployed on a server to provide inference services to client users. Here, it is assumed that each parameter of the model is already encrypted, and the client user encrypts the input data before sending them to the server for private inference. Note that “Enc” includes Packing and Encryption, and “Dec” includes Decryption and Unpacking. Additionally, the CKKS adopts the cutting-edge method to accelerate the homomorphic matrix multiplication [40].

The comparison indicates that the HESC performs the encryption and layer computations of the inference faster than CKKS because of its lower cost and smaller parameters. This is primarily because the HESC does not require homomorphic multiplication over ciphertexts, which reduces the size of the parameters and significantly contributes to the reduction in the computation time. For example, the first-layer computation is approximately 306 times faster in “HESC with BFV” and approximately 642 times faster in “HESC with CKKS” than in CKKS. Because the number of ciphertext blocks increases with each stochastic addition in the HESC, the second-layer computation and decryption require more time. The second-layer computation is only approximately 16 times faster in “HESC with BFV” and approximately 20 times faster in “HESC with CKKS” than in CKKS, and the HESC decryption is approximately 55 times slower. Nevertheless, the advantage of the HESCs in the total time can be confirmed.

HESC with CKKS takes a slightly longer time to perform the data conversion from the integer (i.e., stochastic number) to complex numbers in B2S and the rounding process in \(\textsf {S2B}\), when compared to that with BFV. However, HESC with CKKS is slightly faster in terms of the total time because of the faster homomorphic SC evaluations. Therefore, it can be confirmed that the HESC is superior in terms of the computation time, while there are some factors to be considered, such as the increase in the ciphertext length and the decrease in the operational accuracy. This result indicates that the HESC is suitable for solving classification problems with an acceptable degree of error. Particularly, HESC with CKKS produces CKKS-derived errors in addition to the SC-derived errors, but these errors do not significantly affect the inference results, as demonstrated in the successful inference for the iris dataset.

4.4 Effect of fusion and separation

In this section, we evaluate the effect of the improvement technique described in Sect. 3.3. The target operation is the oblivious inference for the same Iris dataset as in the previous section, and the HESC is constructed using BFV (i.e., HESC w/BFV). The value of W is set to 6 because the final output (i.e., plaintext) is at most 5 in the target. The parameters of BFV, namely the plaintext space and the noise budget (i.e., stochastic number length or order), limit the number of ciphertexts that can be merged in the fusion. In the previous section, the parameters of BFV were 12289 and 1024 for the plaintext space and order, respectively. In the experimental setup, the maximum value of N is set to 5, that is, the maximum number of fused ciphertexts is 6, according to Eq. 9. To obtain a sufficient noise budget, the evaluations were performed with an order of 2048 in the experiment.

Table 5 shows the evaluation results of the HESC w/ and w/o the improvement method. With the improvement method, the total number of ciphertexts, and hence the number of decryptions, is reduced, which results in a significant reduction in the computation time. In particular, with an order of 2048, the decryption time w/ the improvement method was approximately 5.6 times shorter, and the overall computation time approximately 2.2 times shorter, than that w/o the method. This is because the fusion reduced the number of ciphertexts from 101 to 18 in the experiment, a ratio of approximately 5.6 that agrees well with the measured reduction in decryption time. We can also see that the overall computation time w/ the improvement method is shorter than that w/o the improvement method with an order of 1024 (that is, HESC w/BFV or CKKS in Table 4). Figure 9 shows that the accuracy can be improved by increasing the order. Therefore, the results show that the improvement method can achieve a higher accuracy while reducing the computational cost.

Table 5 Evaluation of improvement technique (i.e., packing for reducing ciphertext size presented in Sect. 3.3) (\(\mathrm{ms}\))

4.5 Semantic security

An HESC ciphertext consists of a collection of ciphertexts of the underlying homomorphic encryption scheme (either with or without packing). As a result, it immediately follows from a standard hybrid argument that the HESC scheme is IND-CPA secure whenever the underlying homomorphic encryption scheme is IND-CPA secure, which is the case for all the schemes considered in this work: Goldwasser–Micali, both variants of ElGamal, and BFV. Although the CKKS-based HESC achieved a high performance in our experiment, the original CKKS does not provide semantic security in the semi-honest model [41]. Note that the purpose of our experiment is to evaluate and compare the fundamental and potential performance of HEs in the HESC context; in practice, some mitigation would be required for a secure use of the CKKS-based HESC.
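The hybrid argument mentioned above can be summarized as follows, as a sketch in notation introduced here (the symbols \(n\), \(\mathcal {A}\), and \(\mathcal {B}\) do not appear in the original text). If an HESC ciphertext consists of \(n\) independently generated ciphertexts of the underlying scheme \(\textsf {HE}\), then for every efficient adversary \(\mathcal {A}\) against the HESC there is an efficient adversary \(\mathcal {B}\) against \(\textsf {HE}\) such that

\[ \mathrm{Adv}^{\mathrm{IND\text{-}CPA}}_{\mathrm{HESC}}(\mathcal {A}) \le n \cdot \mathrm{Adv}^{\mathrm{IND\text{-}CPA}}_{\textsf {HE}}(\mathcal {B}). \]

The bound is obtained by replacing the \(n\) component ciphertexts of one challenge message with those of the other, one at a time; two neighboring hybrids differ in a single \(\textsf {HE}\) ciphertext, so distinguishing them is no easier than breaking \(\textsf {HE}\). Since \(n\) is polynomial in the security parameter, the advantage remains negligible.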

As a side note, the length of HESC ciphertexts can vary depending on the number of times concatenated stochastic addition is carried out, and the distribution of the noise also depends on the successive stochastic operations. In that sense, the HESC scheme is not a “function private” homomorphic encryption scheme (but the same is true of almost all practical SHE or FHE schemes, including BFV and CKKS). Since the homomorphic operations are carried out identically on all ciphertexts, these points have no bearing on semantic security or the confidentiality of plaintexts.

5 Conclusion

This paper presented the HESC, a new HE-based secure computation scheme based on SC. This HESC can perform both homomorphic stochastic additions and multiplications without bootstrapping, based on an underlying single-operation homomorphic encryption scheme (either additive or multiplicative). This is achieved at the cost of some noise being included in the decrypted plaintext. This paper also presented the constructions of basic and efficient HESCs, with the latter featuring lattice-based cryptography, which improved the implementation efficiency using plaintext packing, a new stochastic addition method, and ciphertext integration. Some of the HESC schemes were validated with typical HEs, and their application for low-degree polynomial functions was demonstrated, along with an oblivious inference using a neural network.

The validation results indicated that the HESC scheme could potentially mitigate a large part of the computational costs associated with conventional FHEs/SHEs. The HESC scheme can be further improved in terms of its usage in cryptography and related applications; detailed analyses of these improvements will be performed in future research. The development of hardware accelerators dedicated to HESCs is also a potential area of research interest. In addition, we plan to develop a methodology for optimal circuit design that combines the two stochastic addition methods to exploit the tradeoff between accuracy and ciphertext size.