1 Introduction

The goal of any privacy-critical application is to preserve the underlying privacy (user privacy, server privacy, or data privacy) with a guaranteed confidentiality primitive (i.e., an information-theoretic one).

Among user privacy-preserving techniques, Private Information Retrieval (PIR), introduced by Chor et al. [12, 13], is one of the most prominent, preserving both user privacy and data privacy. PIR, which is closely related to 1-out-of-n oblivious transfer, involves two communicating parties, a user and a server, in which the user privately reads a single bit from the server’s n-bit database. The basic goal of Chor et al. [12, 13] was to provide the highest confidentiality to the user’s interest (an index, a pattern, graph moves, etc.) for real-time privacy applications. Since then, comprehensive research has been carried out in several dimensions of PIR, including relaxing the privacy level from information-theoretic to a computationally bounded setting, reducing communication and computation overhead, reducing the number of rounds and the number of servers involved, and extending to private write.

One natural extension of the PIR protocol is Private Block Retrieval (PBR), in which the user privately reads a v-bit block (instead of a single bit) from the server’s u-block database. Based on the level of privacy, PIR protocols are broadly divided into two groups, information-theoretic PIR and computationally bounded PIR, as described below.

  • Information-theoretic PIR (itPIR): If a PIR protocol uses information-theoretically private queries with non-colluding replicated database servers, it is considered information-theoretic PIR (itPIR); user privacy is preserved through the information-theoretically private queries. Several information-theoretic schemes [3, 4, 8, 17, 19, 28, 32] and some PBR extensions [2, 5, 13, 14, 16, 21, 30, 31, 37] have concentrated on providing information-theoretic privacy using database replication.

  • Computationally bounded PIR (cPIR): If a PIR protocol assumes a computationally bounded database server, it is considered computationally bounded PIR (cPIR); privacy is preserved based on well-defined cryptographic intractability assumption(s). Most research on cPIR [6, 11, 20, 23, 25, 33] and [7, 9, 16, 22, 26, 29, 31, 35, 36, 38] has concentrated on using a single intractability assumption to preserve both user privacy and data privacy.

The existing single database PBR schemes (both itPBR and cPBR) have the following major problems.

  • Lack of sufficient itPIR approaches: Research has focused more on constructing efficient cPBR than itPBR in the single database setting. As a result, the user lacks an information-theoretic privacy guarantee in the single database setting.

  • Lack of independence between user and data privacy: Most existing cPBR schemes use a single intractability assumption (such as quadratic residuosity, Phi-hiding, lattices, or composite residuosity) to preserve both user privacy and data privacy. If a curious party breaks the underlying assumption, both are compromised without extra effort. For instance, the single database PIR protocol of Kushilevitz and Ostrovsky [25] relies on the well-known Quadratic Residuosity Assumption (QRA) to achieve both user privacy (through computationally intractable query inputs with quadratic residuosity properties) and data privacy (through quadratic residuosity ciphertexts), so breaking the QRA reveals both. Therefore, there is a strong need for a generic scheme with an efficient mapping from cPBR to itPBR, in which the underlying primitive of user privacy also maps from an intractability assumption to information-theoretic privacy. Note that the Kushilevitz and Ostrovsky scheme does not support an efficient mapping between cPBR and itPBR.

  • Lack of a generic framework that fulfills the above needs: No generic PBR framework exists that serves several privacy-critical applications (such as PBR, oblivious transfer, and asymmetric encryption). There is therefore a strong need for a generic PBR scheme that can efficiently transform between several PBR extensions, such as information-theoretic PBR, computationally bounded PBR, oblivious transfer, and asymmetric encryption.

This investigation raises the following natural question.

Is it possible to construct a generic single database Private Block Retrieval framework with reasonable performance that fulfills one or more privacy concerns (such as user privacy, data privacy, server privacy) of private block retrieval and oblivious transfer?

Our single database private block retrieval solution: We introduce new bit connection methods and QRA-based trapdoor functions for single database PBR, with the following results.

  • New quadratic residuosity based single bit injective and lossy trapdoor functions.

  • New bit connection methods (BCMs) called rail-shape and signal-shape to interconnect the proposed trapdoor functions with the aid of quadratic residuosity based injective trapdoor functions introduced by Freeman et al. [15].

  • The appropriate combination of the proposed bit connection methods and trapdoor functions serves as a generic framework to map between several PBR extensions such as information-theoretic PBR, computationally bounded PBR, oblivious transfer, and asymmetric encryption.

  • A new single database information-theoretic PBR (SitPBR) scheme using the combination of the proposed bit connection methods and trapdoor functions, with communication cost \(\mathcal {O}(u(v-2)+2u\log N)\) and computation cost \(\mathcal {O}(u(2v-2))\), where n = uv is the database size, u is the number of rows, v is the number of columns, and N is the RSA composite.

  • A new single database computationally bounded PBR (ScPBR) scheme with communication cost \(\mathcal {O}(u(v-2)+2u\log N)\) and computation cost \(\mathcal {O}(u(2v-2))\).

2 Related work

Many PIR schemes have also been extended to provide database security. Naor and Pinkas [34] first proposed the transformation of cPIR to OT with a small computation overhead. Gertner et al. [18] proposed a way of transforming cPIR to cOT with a small communication overhead; to achieve this, the system must add one more auxiliary database to store some random strings. Aiello et al. [1] proposed a method of converting cPIR to cOT with an additional initialization phase, which is not a component of the classic OT method; moreover, the security is slightly relaxed from the standard notion. Chang [10] proposed the first balanced computationally bounded OT scheme in the single database setting, but no method has been proposed to convert cOT to cPIR. Laur and Lipmaa [27] proposed a disclose-if-equal (DIE) protocol, which in turn supports cPIR to OT transformation. Because the protocol involves compulsory client encryption operations, it is not well suited to large database PIR operations. Kiayias et al. [24] proposed a rate-optimal cPIR-to-OT transformation, but the transformed OT scheme is only computationally secure on the server side; the authors therefore left an information-theoretically server-private optimal-rate OT protocol as an interesting open problem.

However, no substantial effort is visible to date on the transformation of cPIR to itPIR. It is generally hard to achieve sublinear communication in single database itPIR. Nevertheless, the single database setting is essential, because the multi-database setting provides a weaker privacy guarantee. This motivated the design of the proposed transformation scheme.

3 Preliminaries and notations

Let [1, u] denote taking all values from 1 to u, and let [u]\(\triangleq \{1,2,\ldots ,u\}\) denote taking any one value in the range from 1 to u. Let k denote the security parameter, and let \(N\xleftarrow {}\{0,1\}^{k}\) = PQ be the RSA composite modulus, where \(P\equiv \) 3 (mod 4) and \(Q\equiv \) 3 (mod 4). Let \(\mathbb {Z}^{+1}_{N}\) denote the set of all elements with Jacobi symbol 1. Let \(Q_{R}\) and \(\overline{Q}_{R}\) denote the sets of quadratic residues and quadratic non-residues, respectively. Let \(\langle a,b \rangle \) be a pair of two components in which \(a\in \mathbb {Z}^{+1}_{N}\) and b = \(\{i:i\in \{0,1\}\}\).

Correctness: In any given instance, user must be able to retrieve the correct desired bit.

User privacy: In any given instance, user interest (may be in terms of database index) should not be revealed to the server.

Information theoretic privacy: Even after receiving the user query and having unbounded computation power, server should not gain any (even partial) information about the database index.

Quadratic residuosity predicate (QRP): \(\forall x\in \mathbb {Z}^{*}_{N}\),

$$\begin{aligned} (QRP_{P,Q}(x)~or ~QRP(x)) =\left\{ \begin{array}{c l} 0&{}\quad ~If ~x\in Q_{R}\\ 1&{}\quad ~If ~x\in \overline{Q}_{R} \end{array}\right. \end{aligned}$$
(1)
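As an illustration of Eq. (1), the following minimal Python sketch evaluates the quadratic residuosity predicate for a party that knows the factors P and Q. The function names `legendre`, `jacobi_from_factors`, and `qrp` are hypothetical, and the sketch assumes that \(\overline{Q}_{R}\) contains every unit that is not a square modulo N. It uses Euler's criterion modulo each prime, with the toy parameters P = 23, Q = 17 that appear later in Sect. 5.

```python
from math import gcd


def legendre(a: int, p: int) -> int:
    """Legendre symbol of a modulo an odd prime p (Euler's criterion)."""
    r = pow(a % p, (p - 1) // 2, p)  # r is 0, 1, or p - 1
    return -1 if r == p - 1 else r


def jacobi_from_factors(x: int, p: int, q: int) -> int:
    """Jacobi symbol of x modulo N = p*q, computed from the two Legendre symbols."""
    return legendre(x, p) * legendre(x, q)


def qrp(x: int, p: int, q: int) -> int:
    """Return 0 if x is a quadratic residue modulo N = p*q, otherwise 1.

    Assumption of this sketch: every unit that is not a square counts as a
    non-residue, and x must be a unit modulo N.
    """
    if gcd(x, p * q) != 1:
        raise ValueError("x must be invertible modulo N")
    is_square = legendre(x, p) == 1 and legendre(x, q) == 1
    return 0 if is_square else 1


if __name__ == "__main__":
    P, Q = 23, 17  # toy primes, N = 391
    for x in (4, 40):
        print(x, jacobi_from_factors(x, P, Q), qrp(x, P, Q))
```

Here 4 is a perfect square and therefore a residue, whereas 40 has Jacobi symbol +1 but is not a square modulo either prime, which is exactly the case that is assumed hard to recognize without P and Q.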

Quadratic residuosity based lossy trapdoor function of Freeman et al. [15] (LTDF): For all \(\alpha \in \mathbb {Z}^{*}_{N}\), \(s\in \overline{Q}_{R}\) and \(r\in \mathbb {Z}^{-1}_{N}\), the lossy trapdoor function is (mod N)) such that j is equal to 1 if otherwise j is equal to 0. The value of h is equal to 1 if \(\alpha > N/2\) otherwise h is equal to 0. The respective inverse function is (mod N)). We use the alternative square root syntax as (mod N)).

4 Combination of new bit connection methods and trapdoor functions

We introduce novel combinations of the quadratic residuosity based trapdoor functions (Sect. 4.1) and the database bit connection methods (Sect. 4.2) that can be used as a generic framework for itPBR to/from cPBR transformations, as shown in Fig. 1. These combinations can address several privacy concerns such as user privacy, data privacy, and server privacy.

Fig. 1 A single database private block retrieval framework with itPBR to/from cPBR transformations

4.1 New quadratic residuosity based trapdoor functions

The construction is a 7-tuple (\(\mathcal {I}\), \(\mathcal {G}_{0}\), \(\mathcal {G}_{1}\), , , , ) consisting of the following functions.

  • Sampling an input (\(\mathcal {I}\)): The algorithm \(\mathcal {I}\) receives the input \(1^{k}\) and produces the large RSA composite N = PQ where P and Q are large distinct primes with \(P\equiv Q\equiv \) 3 (mod 4) or 1 (mod 4). Then chooses an “identically distributed” random \(x\in \mathbb {Z}_{N}^{+1}\). The input domain of the random input x is \(\mathbb {Z}_{N}^{+1}\).

  • Sampling a lossless injective function (\(\mathcal {G}_{0}\)): On receiving the composite N, the algorithm \(\mathcal {G}_{0}\) chooses a random such that the quadratic residuosity predicate of and must be different (i.e., QRP()\(\ne \) QRP()). The function parameters are \(\sigma \) = (N) and the trapdoor/private key is \(\tau \) = (PQ). Now it is clear that the injective function is defined over the domain \(\mathbb {Z}_{N}^{+1}\).

  • Sampling a lossy trapdoor function (\(\mathcal {G}_{1}\)): On receiving the composite N, the algorithm \(\mathcal {G}_{1}\) chooses a random such that the quadratic residuosity predicate of and must be equal (i.e., QRP()=QRP()).

  • Evaluation of trapdoor function of [15] ( ): The algorithm receives the input x and produces “h” value of x (as described in quadratic residuosity based lossy trapdoor function [15]) as trapdoor bit as follows.

    (2)
  • Inversion of trapdoor function of [15] ( ): Given the modular square \(x^{2}\) and “h” value of x, the algorithm obtains the input x as follows.

    (3)
  • Evaluation of lossless injective function (): The algorithm chooses a bit \(b\in \{0,1\}\). It then receives the function parameters, and evaluates the following.

    (4)
  • Inversion of lossless injective function (): Given the function parameters, trapdoor \(\tau \), trapdoor bit h and ciphertext y, the algorithm obtains both x and b as follows.

    (5)

    where 1 (mod N) and 1 (mod N).
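The input-sampling step \(\mathcal {I}\) above can be sketched in Python as rejection sampling over \(\mathbb {Z}_{N}^{+1}\). This is only an illustration under stated assumptions: the function names `jacobi` and `sample_input` are hypothetical, the modulus is the toy value N = 391 from Sect. 5, and Python's `random` module stands in for a cryptographically secure source of randomness. The Jacobi symbol is computed with the standard reciprocity-based algorithm, so no knowledge of P and Q is needed.

```python
import random


def jacobi(a: int, n: int) -> int:
    """Jacobi symbol (a/n) for odd n > 0, using quadratic reciprocity."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:  # pull out factors of two
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a  # reciprocity step
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def sample_input(n: int, rng: random.Random) -> int:
    """Draw x uniformly from the elements of Z_N with Jacobi symbol +1."""
    while True:
        x = rng.randrange(1, n)
        if jacobi(x, n) == 1:  # a symbol of +1 also implies gcd(x, n) = 1
            return x


if __name__ == "__main__":
    rng = random.Random()  # illustration only, not a secure generator
    print(sample_input(391, rng))
```

Because about half of the units modulo N have Jacobi symbol +1, the loop is expected to terminate after roughly two draws.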

4.2 New bit connection methods (BCMs)

We introduce new methods of interconnecting the database bits during PBR response creation on the server side as shown in Fig. 1. Based on the interconnectivity of the database bits, we classify the newly introduced bit connection methods as rail-shape and signal-shape as shown in Fig. 2.

Let the database be . Consider the following ordered subsets of

(6)

Fig. 2 New bit connection methods used to interconnect the proposed trapdoor functions

Fig. 3 Possible transformations in the existing and the proposed PBR schemes

Note that if the absolute difference between any two database indices of the underlying set is 1 then such set is used for rail-shape connection and if the absolute difference between any two database indices of the underlying set is 2 then such set is used for signal-shape connection. Therefore, it is now intuitive that the set is used for rail-shape connection and / are used for signal-shape connections.
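The classification rule above can be sketched in a few lines of Python. The function name `classify_connection` is hypothetical, and the sketch assumes that the "absolute difference between any two database indices" refers to consecutive elements of an ordered index set. The example index sets (all indices, even indices, odd indices of a block with v = 8) are illustrative choices, not taken from Eq. (6).

```python
from typing import Sequence


def classify_connection(indices: Sequence[int]) -> str:
    """Classify an ordered index set by the gap between consecutive indices.

    Assumption of this sketch: a gap of 1 everywhere means rail-shape,
    a gap of 2 everywhere means signal-shape, anything else is unclassified.
    """
    gaps = {abs(b - a) for a, b in zip(indices, indices[1:])}
    if gaps == {1}:
        return "rail-shape"
    if gaps == {2}:
        return "signal-shape"
    return "unclassified"


if __name__ == "__main__":
    v = 8
    all_indices = list(range(1, v + 1))       # 1, 2, ..., 8
    even_indices = list(range(2, v + 1, 2))   # 2, 4, 6, 8
    odd_indices = list(range(1, v + 1, 2))    # 1, 3, 5, 7
    for name, idx in (("all", all_indices), ("even", even_indices), ("odd", odd_indices)):
        print(name, idx, classify_connection(idx))
```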

The main advantage of using these BCMs in a single database PBR setting is as follows.

  • Most existing PBR schemes provide the whole database as input to their underlying trapdoor functions, as shown in Fig. 3a. Consequently, such a scheme yields exactly one type of PBR, either itPBR or cPBR. Moreover, while there is generally a way to transform an itPBR scheme into its cPBR version (i.e., Map(itPBR)\(\rightarrow \) cPBR), there is no way to transform a cPBR scheme into its itPBR version (i.e., Map( cPBR) \(\nrightarrow \)itPBR).

  • Does introducing distinct bit connection methods (rather than using the whole plaintext) help to achieve Map(itPBR)\(\rightarrow \)cPBR? Yes. It is possible to achieve both Map(itPBR)\(\rightarrow \)cPBR and Map(cPBR)\(\rightarrow \)itPBR using the combination of BCMs and the newly constructed trapdoor functions of Sect. 4.1, as shown in Fig. 3b. Therefore, the combination of BCMs and the newly constructed trapdoor functions serves as a framework to construct either itPBR or cPBR, thereby achieving Map(itPBR)\(\rightarrow \)cPBR and Map(cPBR)\(\rightarrow \) itPBR.

Fig. 4 The block response creation (RC) and response retrieval (RR) algorithms of the proposed scheme

5 A new single database information-theoretic private block retrieval scheme (SitPBR)

In this section, we introduce a new information-theoretic private block retrieval technique. At an abstract level, the proposed scheme is a 3-tuple (QG, RC, RR) involving two communicating parties, a user and a server. The user generates an information-theoretically private query from the input domain \(\mathbb {Z}_{N}^{+1}\) using the QG algorithm and sends it to the server. Using the query and the database , the server generates the response using the RC algorithm and sends it back to the user. Finally, the user retrieves the intended block privately using the RR algorithm. The detailed description of the proposed scheme is as follows.

Let the n = \(u\times v\) bit 2-dimensional matrix database with u rows and v columns be where , \(i\in [u]\). Each database block is further viewed as two subsets and where  = \(\{b_{i,2}, b_{i,4}, b_{i,6}, \ldots , b_{i,v}\}\) and . The idea is to form new bit connections using these subsets and to apply the proposed trapdoor functions of Sect. 4.1 recursively. The proposed algorithms are described in detail as follows.

  • Query generation (QG): (user generates) Generate (public key, private key) pair from the query input domain \(\mathbb {Z}_{N}^{+1}\) as follows. Generate the public key \(\sigma \) = (N,, and the private key \(\tau \) = (PQ) as described in the algorithm \(\mathcal {G}_{0}\). Also, generate an “identically distributed” random \(x\in \mathbb {Z}_{N}^{+1}\) as described in the algorithm \(\mathcal {I}\). Then, generate an information-theoretically private query \(\mathcal {Q}\) = (N, x) where \(\mathcal {Q}^{z}\) represents the z-th block query with public key components ().

  • Response creation (RC): (server generates) Using the information theoretic query \(\mathcal {Q}\) and the database , generate the response by executing the following.

    For all database block , \(z\in [1,u]\), using respective public key components , execute the following recursive function as described in the algorithm and obtain the intermediate ciphertext bits from each (as described in the algorithm ) and two final ciphertexts as follows.

    (7)

    where \(i\in [v,4]\), \(j\in [v\text {-1},3]\) and each is an injective function described in the algorithm .

    Finally, the database response would be . The pictorial representation of the block response creation process is given in Fig. 4.

  • Response retrieval (RR): (user generates) Using the response and the trapdoor \(\tau \), retrieve the required block \(w\in [u]\) (generally single block) as follows.

    (8)

    where \(i\in [\frac{v}{2}\text {-1},1]\) and each is the inverse of the injective function described in Eq. (7). The pictorial representation of the response retrieval process is given in Fig. 4.

Table 1 An illustrative example of the proposed scheme with two database blocks

A toy example: Let P = 23, Q = 17 and N = (\(P\cdot Q\))=391, , , x = 40. Therefore the common query for both the blocks is \(\mathcal {Q}\) = (391, \(\{\)82,10\(\}\),40). Let a 2-dimensional two block matrix database be where , \(u=2\) and v = 8. Therefore, and . Let us assume that the user is interested in the second block (i.e., w = 2). Note that the entire database response must be downloaded to retrieve any block. The complete illustrative example is given in Table 1.

6 A new single database computationally bounded private block retrieval scheme (ScPBR)

In this section, we introduce a new computationally bounded block retrieval technique using computationally intractable queries. The response creation and response retrieval algorithms are the same as in the SitPBR scheme. The query generation algorithm is described in detail as follows.

Generate a (public key, private key) pair from the query input domain \(\mathbb {Z}_{N}^{+1}\) as follows. Suppose the user is interested in the database block , \(w\in [u]\). Generate the first set of public key components , (from the \(\mathcal {G}_{0}\) algorithm) such that QRP()\(\ne \) QRP() and generate the second set of public key component pairs (), \(w\in [u]\) and \(w\ne z\), (from the \(\mathcal {G}_{1}\) algorithm) such that QRP()=QRP(). Note that these two sets of public key components are computationally intractable under the quadratic residuosity assumption. The public key is \(\sigma \) = (N), and the private key is \(\tau \) = (PQ). Also, generate a random \(x\in \mathbb {Z}_{N}^{+1}\). Then, generate a computationally intractable query \(\mathcal {Q}\) = (N, x) where \(x\in \mathbb {Z}^{+1}_{N}\), \(\mathcal {Q}^{i}\) represents the i-th block query with public key components ().

7 Transformation (or mapping) of SitPBR to/from ScPBR without affecting the basic setup

Most existing single database PBR schemes concentrate on constructing a single type of PBR, either itPBR or cPBR. But what if one wants to convert from one type to the other without changing the basic setup? Essentially, there should be a framework of techniques that provides both types and a transformation mechanism between them.

To provide such a generic framework, we have proposed a single database itPBR scheme in Sect. 5 and a single database cPBR scheme in Sect. 6. We now describe the transformation of one type into the other without changing the basic setup.

The transformation of the proposed SitPBR to/from ScPBR depends on the quadratic residuosity properties of the public key components. The following descriptions explain how to choose public key components with the appropriate properties in the proposed PBR.

  • Sampling function parameters for SitPBR (\(\mathcal {L}_{0}\)) : The algorithm \(\mathcal {L}_{0}\) chooses the identically distributed public key components from \(\mathbb {Z}^{+1}_{N}\) such that QRP()\(\ne \) QRP() (as described in the algorithm \(\mathcal {G}_{0}\)) during QG algorithm execution without altering the remaining algorithms.

  • Sampling function parameters for ScPBR (\(\mathcal {L}_{1}\)): The algorithm \(\mathcal {L}_{1}\) chooses both kinds of public key components from the \(\mathcal {G}_{0}\) and \(\mathcal {G}_{1}\) algorithms during QG algorithm execution such that both kinds of components are computationally indistinguishable. Note that choosing these public key components affects neither the remaining PBR algorithms nor the basic PBR setup.

  • Sampling function parameters for Map(SitPBR) \(\rightarrow \)ScPBR (\(\mathcal {M}_{0}\)): To map from the proposed SitPBR to ScPBR, choose the appropriate public key components from \(\mathcal {L}_{1}\) during QG algorithm execution and continue to execute the remaining PBR algorithms. Note that this mapping process is computationally indistinguishable.

  • Sampling function parameters for Map(ScPBR) \(\rightarrow \)SitPBR (\(\mathcal {M}_{1}\)): To map from the proposed ScPBR to SitPBR, choose the appropriate public key components from \(\mathcal {L}_{0}\) during QG algorithm execution and continue to execute the remaining PBR algorithms as usual. Note that this mapping process is also computationally indistinguishable.
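The four sampling rules above differ only in which quadratic residuosity relation each block's key pair satisfies. The following Python sketch illustrates that idea under several assumptions: the names `sample_pair` and `build_block_keys` are hypothetical, one key pair per database block is assumed (in line with the \((2u+2)\cdot \log N\) query size of Sect. 8), pairs are drawn by simple rejection sampling from \(\mathbb {Z}_{N}^{+1}\), and the toy primes P = 23, Q = 17 replace real RSA primes. It is not the authors' implementation.

```python
import random


def legendre(a: int, p: int) -> int:
    """Legendre symbol of a modulo an odd prime p (Euler's criterion)."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def qrp(x: int, p: int, q: int) -> int:
    """0 if x is a square modulo N = p*q, otherwise 1."""
    return 0 if legendre(x, p) == 1 and legendre(x, q) == 1 else 1


def sample_plus_one(p: int, q: int, rng: random.Random) -> int:
    """Random element of Z_N with Jacobi symbol +1 (the key holder knows p, q)."""
    while True:
        x = rng.randrange(1, p * q)
        if legendre(x, p) * legendre(x, q) == 1:
            return x


def sample_pair(p: int, q: int, rng: random.Random, equal_qrp: bool):
    """G1-style pair if equal_qrp is True (same QRP), G0-style pair otherwise."""
    while True:
        a, b = sample_plus_one(p, q, rng), sample_plus_one(p, q, rng)
        if (qrp(a, p, q) == qrp(b, p, q)) == equal_qrp:
            return a, b


def build_block_keys(u: int, p: int, q: int, rng: random.Random, mode: str, w=None):
    """One key pair per block z = 1..u.

    mode "SitPBR": every block gets a G0-style pair (different QRP).
    mode "ScPBR":  only the block of interest w gets a G0-style pair,
                   all other blocks get G1-style pairs (equal QRP).
    """
    if mode not in ("SitPBR", "ScPBR"):
        raise ValueError("unknown mode")
    keys = []
    for z in range(1, u + 1):
        equal = mode == "ScPBR" and z != w
        keys.append(sample_pair(p, q, rng, equal))
    return keys


if __name__ == "__main__":
    rng = random.Random()  # illustration only, not a secure generator
    print(build_block_keys(4, 23, 17, rng, "SitPBR"))
    print(build_block_keys(4, 23, 17, rng, "ScPBR", w=2))
```

Switching between the two modes changes only the key sampling, which mirrors the statement that the remaining PBR algorithms and the basic setup stay untouched.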

8 Performance evaluation

Privacy: The proposed scheme of Sect. 5 preserves user privacy against a curious server by generating information-theoretically private queries. If \(\mathcal {Q}_{1}\) =  (N\(x_{1}\)), \(\mathcal {Q}_{2}\) = (N\(x_{2}\)) are any two randomly generated queries in the QG algorithm, then selecting the public key components for all database blocks from the identically distributed domain guarantees perfect user privacy: the query components are chosen at random from an identically distributed domain, so the mutual information between any two queries is zero.

Fig. 5 Proposed privacy preserving big data access control models

The proposed scheme of Sect. 6 preserves user privacy against a curious server by generating computationally bounded queries. If \(\mathcal {Q}_{1}\) = (N\(x_{1}\)), \(\mathcal {Q}_{2}\) = (N\(x_{2}\)) are any two randomly generated queries in the QG algorithm, then the computationally indistinguishable selection of public key components for all database blocks guarantees computationally bounded user privacy. In other words, the quadratic residuosity properties of the public key components of \(\mathcal {Q}_{1}\) and \(\mathcal {Q}_{2}\) are computationally hidden from the curious server.

Both proposed schemes of Sects. 5 and 6 use the quadratic residuosity assumption to preserve data privacy against an intermediate adversary.

Communication and Computation: In the proposed schemes of Sects. 5 and 6, the user sends \(\mathcal {O}((2u+2)\cdot \log N)\) query bits to the server. The server sends \(\mathcal {O}(u(v-2)+2u\log N)\) response bits to the user, where u is the number of database rows, v is the number of database columns, and N is the composite modulus. Both proposed schemes are single-round PBR protocols that use only one request-response cycle, in which the user requests a database block and the server returns the response.

The RC algorithms of Sect. 5 and Sect. 6 execute uv lossless trapdoor functions and \(u(v-2)\) lossy trapdoor functions. Since each trapdoor function (either lossy or lossless) involves a single modular multiplication, the RC algorithm involves a total of \(u(2v-2)\) modular multiplications. On the other hand, the RR algorithms of Sects. 5 and 6 involve only (\(2v-2\)) modular multiplications plus \((v-2)\) modular square roots to retrieve the required block.
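To make the counts above concrete, the following Python sketch plugs parameters into the stated expressions. The function name `pbr_costs` is hypothetical, log N is interpreted as the bit length of N, and the constants hidden by the \(\mathcal {O}\)-notation are ignored, so the numbers are indicative only. The example uses the toy parameters u = 2, v = 8, N = 391 from Sect. 5.

```python
def pbr_costs(u: int, v: int, modulus: int) -> dict:
    """Evaluate the cost expressions of Sect. 8 for a u x v bit database.

    Assumptions of this sketch: log N is taken as modulus.bit_length(),
    and hidden constants of the O-notation are dropped.
    """
    log_n = modulus.bit_length()
    lossless = u * v          # lossless trapdoor function evaluations in RC
    lossy = u * (v - 2)       # lossy trapdoor function evaluations in RC
    return {
        "query_bits": (2 * u + 2) * log_n,
        "response_bits": u * (v - 2) + 2 * u * log_n,
        "rc_multiplications": lossless + lossy,  # equals u * (2v - 2)
        "rr_multiplications": 2 * v - 2,
        "rr_square_roots": v - 2,
    }


if __name__ == "__main__":
    for name, value in pbr_costs(u=2, v=8, modulus=391).items():
        print(f"{name}: {value}")
```

The sum of the lossless and lossy evaluations, uv + u(v-2), reproduces the total of u(2v-2) modular multiplications stated for the RC algorithm.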

9 Privacy preserving big data access control

We will extend our work to introduce a novel privacy preserving access control model in the Big Data information processing environment. The core idea is to store only the CCA secure ciphertext components of the proposed PBR schemes of Sect. 5 or Sect. 6 in Big Data storage and to download the stored information using one of the proposed PBR techniques in 2-party and 3-party scenarios, as shown in Fig. 5a, b. This idea covers many privacy-critical applications, such as healthcare, patent and stock search, email, social media, and private chat, which cannot be handled by the traditional Big Data information processing model alone.

In 2-party scenario, the proposed model consists of two communicating parties: Alice and Cloud in which Alice encrypts his/her data using his/her own public key \(\sigma \) using RC algorithm and stores one of the ciphertext components , \(\forall z\in [1,u]\) on Cloud (which maintains Big Data storage and processing) and keeps other ciphertext components , \(\forall z\in [1,u]\) with him/her. Whenever required, Alice directly downloads partial ciphertext component from Cloud or downloads using the proposed schemes of Sects. 5, 6 and decrypts his/her data using his/her own private key \(\tau \) using RR algorithm.

In 3-party scenario, the proposed model consists of three communicating parties: Alice, Bob and Cloud in which Alice encrypts his/her data using Bob’s public key \(\sigma \) using RC algorithm and stores one of the ciphertext components on Cloud (which maintains Big Data storage and processing) and sends other ciphertext component , \(\forall z\in [1,u]\) to Bob. Whenever required, Bob downloads a part of ciphertext components from Alice and downloads other ciphertext component from Cloud and decrypts Alice’s data using his/her private key \(\tau \) using RR algorithm.

10 Conclusion and future work

We have presented a new combination of trapdoor functions and bit connection methods to achieve novel single database information-theoretic and computationally bounded private block retrieval schemes and the mapping between them. Although the proposed schemes show reasonable performance relative to the current state of the art, future work will focus on other dimensions, such as a scalable and fault-tolerant multi-server PBR scheme for practical privacy-preserving Big Data access control applications.