Database Outsourcing with Hierarchical Authenticated Data Structures

  • Conference paper
Information Security and Cryptology -- ICISC 2013 (ICISC 2013)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 8565)

Abstract

In an outsourced database scheme, the data owner delegates the data management tasks to a remote service provider. At a later time, the remote service is supposed to answer any query on the database. The essential requirements are ensuring the data integrity and authenticity with efficient mechanisms. Current approaches employ authenticated data structures to store security information, generated by the client and used by the server, to compute proofs that show the answers to the queries are authentic. The existing solutions have shortcomings with multi-clause queries and duplicate values in a column.

We propose a hierarchical authenticated data structure for storing security information, which alleviates the mentioned problems. We provide a unified formal definition of a secure outsourced database scheme, and prove that our proposed scheme is secure according to this definition, which captures previously separate properties such as correctness, completeness, and freshness. The performance evaluation based on our prototype implementation confirms the efficiency of the proposed outsourced database scheme, showing more than 50% decrease in proof size and proof generation time compared to previous work, and about 1–20% communication overhead compared to the query result size.

Notes

  1. We can handle the heterogeneous case as well, but it complicates the presentation.

References

  1. Benaloh, J., de Mare, M.: One-way accumulators: a decentralized alternative to digital signatures. In: Helleseth, T. (ed.) EUROCRYPT 1993. LNCS, vol. 765, pp. 274–285. Springer, Heidelberg (1994)

  2. Celko, J.: Joe Celko’s Trees and Hierarchies in SQL for Smarties. Morgan Kaufmann, Washington (2004)

  3. Devanbu, P., Gertz, M., Martel, C., Stubblebine, S.: Authentic third-party data publication. In: Thuraisingham, B., van de Riet, R., Dittrich, K.R., Tari, Z. (eds.) Data and Application Security. IFIP, vol. 73, pp. 101–112. Springer, Heidelberg (2001)

  4. Di Battista, G., Palazzi, B.: Authenticated relational tables and authenticated skip lists. In: Barker, S., Ahn, G.-J. (eds.) Data and Applications Security 2007. LNCS, vol. 4602, pp. 31–46. Springer, Heidelberg (2007)

  5. Erway, C., Küpçü, A., Papamanthou, C., Tamassia, R.: Dynamic provable data possession. In: CCS’09, pp. 213–222. ACM (2009)

  6. Goodrich, M., Tamassia, R.: Efficient authenticated dictionaries with skip lists and commutative hashing. US Patent App, 10(416,015) (2000)

  7. Goodrich, M.T., Tamassia, R., Hasić, J.: An efficient dynamic and distributed cryptographic accumulator. In: Chan, A.H., Gligor, V.D. (eds.) ISC 2002. LNCS, vol. 2433, pp. 372–388. Springer, Heidelberg (2002)

  8. Goodrich, M.T., Tamassia, R., Triandopoulos, N.: Super-efficient verification of dynamic outsourced databases. In: Malkin, T. (ed.) CT-RSA 2008. LNCS, vol. 4964, pp. 407–424. Springer, Heidelberg (2008)

  9. Li, F., Hadjieleftheriou, M., Kollios, G., Reyzin, L.: Dynamic authenticated index structures for outsourced databases. In: ACM SIGMOD, pp. 121–132 (2006)

  10. Li, F., Hadjieleftheriou, M., Kollios, G., Reyzin, L.: Authenticated index structures for aggregation queries. TISSEC 13(4), 32 (2010)

  11. Martel, C., Nuckolls, G., Devanbu, P., Gertz, M., Kwong, A., Stubblebine, S.: A general model for authenticated data structures. Algorithmica 39(1), 21–41 (2004)

  12. Merkle, R.C.: A certified digital signature. In: Brassard, G. (ed.) CRYPTO 1989. LNCS, vol. 435, pp. 218–238. Springer, Heidelberg (1990)

  13. Mykletun, E., Narasimha, M., Tsudik, G.: Providing authentication and integrity in outsourced databases using merkle hash trees. UCI-SCONCE Technical report (2003)

  14. Narasimha, M., Tsudik, G.: Authentication of outsourced databases using signature aggregation and chaining. In: Li Lee, M., Tan, K.-L., Wuwongse, V. (eds.) DASFAA 2006. LNCS, vol. 3882, pp. 420–436. Springer, Heidelberg (2006)

  15. Nuckolls, G.: Verified query results from hybrid authentication trees. In: Jajodia, S., Wijesekera, D. (eds.) Data and Applications Security 2005. LNCS, vol. 3654, pp. 84–98. Springer, Heidelberg (2005)

  16. Palazzi, B.: Outsourced Storage Services: Authentication and Security Visualization. Ph.D. thesis, Roma Tre University (2009)

  17. Palazzi, B., Pizzonia, M., Pucacco, S.: Query racing: fast completeness certification of query results. In: Foresti, S., Jajodia, S. (eds.) Data and Applications Security and Privacy XXIV. LNCS, vol. 6166, pp. 177–192. Springer, Heidelberg (2010)

  18. Pang, H., Jain, A., Ramamritham, K., Tan, K.: Verifying completeness of relational query results in data publishing. In: ACM SIGMOD, pp. 407–418 (2005)

  19. Pang, H., Tan, K.: Authenticating query results in edge computing. In: International Conference on Data Engineering, pp. 560–571. IEEE (2004)

  20. Papamanthou, C., Tamassia, R.: Time and space efficient algorithms for two-party authenticated data structures. In: Qing, S., Imai, H., Wang, G. (eds.) ICICS 2007. LNCS, vol. 4861, pp. 1–15. Springer, Heidelberg (2007)

  21. Papamanthou, C., Tamassia, R., Triandopoulos, N.: Authenticated hash tables. In: CCS’08, pp. 437–448. ACM (2008)

  22. Tamassia, R.: Authenticated data structures. In: Di Battista, G., Zwick, U. (eds.) ESA 2003. LNCS, vol. 2832, pp. 2–5. Springer, Heidelberg (2003)

  23. Tamassia, R., Triandopoulos, N.: On the cost of authenticated data structures. Technical report, Center for Geometric Computing, Brown University (2003)

  24. Wang, J., Du, X.: Skip list based authenticated data structure in das paradigm. In: GCC’09, pp. 69–75. IEEE (2009)

  25. Yang, Y., Papadias, D., Papadopoulos, S., Kalnis, P.: Authenticated join processing in outsourced databases. In: ACM SIGMOD, pp. 5–18. ACM (2009)

Acknowledgements

The authors would like to acknowledge the support of TÜBİTAK, the Scientific and Technological Research Council of Turkey, under project numbers 111E019 and 112E115, as well as European Union COST Action IC1206. We also thank Ertem Esiner, Adilet Kachkeev, and Ozan Okumuşoğlu for their contributions during performance evaluation.

Corresponding author

Correspondence to Mohammad Etemad.

Appendices

A ADS Definitions and Security Analysis

Definition 2

An ADS scheme consists of three polynomial-time algorithms [20]:

  • \(\mathbf {KeyGen(1^k)} \rightarrow \mathbf {(sk, pk){:}}\) is a probabilistic algorithm executed by the client to generate a private and public key pair \((sk, pk)\) given the security parameter \(k\). The client then shares the public key \(pk\) with the server.

  • \(\mathbf {Certify(pk, cmd)} \rightarrow \mathbf {(ans,}\,\pi \mathbf {){:}}\) is run by the server to respond to a command issued by the client. The public key \(pk\) and the command \(cmd\) are given as input. If \(cmd\) is a query command, it outputs a verification proof \(\pi \) that enables the client to verify the validity of the answer \(ans\). If \(cmd\) is a modification command (insertion, update, or deletion), then \(ans\) is null, and \(\pi \) is a consistency proof enabling the client to update her local metadata.

  • \(\mathbf {Verify(sk, pk, cmd, ans,}\,\pi \mathbf {, st)} \rightarrow \mathbf {(\{accept, reject\}, st'){:}}\) is run by the client upon receipt of a response to verify it. The public and private keys \((pk, sk)\), the answer \(ans\), the proof \(\pi \), and the client’s current metadata \(st\) are given as input. It outputs an \(accept\) or \(reject\) based on the result of the verification. Moreover, if the command was a modification command and the proof is accepted, then the client updates her metadata accordingly (to \(st'\)).
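The interface above can be illustrated with a minimal sketch, assuming a static Merkle hash tree [12] as the underlying ADS. In this simplified setting no signing key is needed, so KeyGen is omitted and the client's metadata \(st\) is just the root digest; only query commands are covered. The names `build_levels`, `certify`, and `verify` and the hash encoding are illustrative assumptions, not the construction of [20].

```python
import hashlib

def H(tag: bytes, *parts: bytes) -> bytes:
    """Domain-separated SHA-256 over length-prefixed parts."""
    d = hashlib.sha256(tag)
    for p in parts:
        d.update(len(p).to_bytes(4, "big") + p)
    return d.digest()

def build_levels(items):
    """All tree levels, leaf hashes first, root level last."""
    level = [H(b"leaf", x) for x in items]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:              # pad odd levels by repeating the last node
            level.append(level[-1])
        level = [H(b"node", level[i], level[i + 1])
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def certify(items, levels, index):
    """Server side: answer a membership query with (ans, proof)."""
    ans, proof = items[index], []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling > index))
        index //= 2
    return ans, proof

def verify(st, ans, proof):
    """Client side: recompute the root from ans and compare it with st."""
    digest = H(b"leaf", ans)
    for sibling, sibling_is_right in proof:
        digest = (H(b"node", digest, sibling) if sibling_is_right
                  else H(b"node", sibling, digest))
    return digest == st

items = [b"alice", b"bob", b"carol"]
levels = build_levels(items)
st = levels[-1][0]                      # client metadata: the root digest
ans, proof = certify(items, levels, 2)
```

Here `verify(st, ans, proof)` accepts the honest answer, while the same proof paired with a different answer is rejected. Padding odd levels by duplication is a simplification that a production design would avoid.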

Definition 3

ADS correctness: For all valid proofs \(\pi \) and answers \(ans\) returned by the server in response to a command issued by the client, the verify algorithm accepts with overwhelming probability.

Definition 4

The ADS security game: Played between the challenger who acts as the client and the adversary who plays the role of the server.

  • \({\mathbf {Key~generation{:}}}\) The challenger runs \(KeyGen(1^k)\) to generate the private and public key pair \((sk, pk)\), and sends the public key \(pk\) to the adversary.

  • \({\mathbf {Setup{:}}}\) The adversary specifies a command \(cmd\), and sends it together with an answer \(ans\) and proof \(\pi \) to the challenger. The challenger runs the algorithm Verify, and notifies the adversary about the result. If the command was a modification command, and the proof is accepted, then the challenger applies the changes to her local metadata accordingly. The adversary can repeat this interaction polynomially many times. Call the latest version of the HADS, constructed using all the commands whose proofs verified, \(D\).

  • \({\mathbf {Challenge{:}}}\) The adversary specifies a command \(cmd\), an answer \(ans'\), and a proof \(\pi '\), and sends them all to the challenger. The adversary wins if the answer \(ans'\) is different from the result set of running \(cmd\) on \(D\), and \(cmd,ans',\pi '\) are accepted by the challenger.

Definition 5

Security of ADS: We say that the ADS is secure if no PPT adversary can win the ADS security game with non-negligible probability.
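Definitions 4 and 5 can be condensed into a single winning condition. As a sketch, write \(cmd(D)\) for the correct result of running \(cmd\) on \(D\) and \(\mathrm{negl}(k)\) for a negligible function of the security parameter (both notations are introduced here for illustration only). The ADS is then secure if, for every PPT adversary \(\mathcal {A}\) playing the game of Definition 4,

\[
\Pr \left[\, Verify(sk, pk, cmd, ans', \pi ', st) \rightarrow (accept, st') \;\wedge \; ans' \ne cmd(D) \,\right] \le \mathrm{negl}(k),
\]

where the probability is taken over the randomness of KeyGen and of \(\mathcal {A}\).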

Definition 6

An outsourced database scheme (ODB) consists of three probabilistic polynomial-time algorithms (OKeyGen, OCertify, OVerify) where:

  • \(\mathbf {OKeyGen(1^k)}\rightarrow \mathbf {(sk, pk){:}}\) is a probabilistic algorithm run by the client to generate a pair of secret and public keys \((sk, pk)\) given the security parameter \(k\). She keeps both keys, and shares only the public key with the server.

  • \(\mathbf {OCertify(pk, cmd)} \rightarrow \mathbf {(ans,}\,\pi \mathbf {){:}}\) is run by the server to respond to a command \(cmd\) issued by the client. It produces an answer \(ans\) and a proof \(\pi \) proving the authenticity of the answer. If the command is a modification command, the answer is empty, and the proof proves that the modification is done properly.

  • \(\mathbf {OVerify(pk, sk, cmd, ans, \pi , st) \rightarrow (\{accept, reject\}, st'){:}}\) is run by the client upon receipt of the answer \(ans\) and proof \(\pi \), which are verified using the public and private key pair. It outputs an ‘accept’ or ‘reject’ notification. If the command was a modification command and the verification result is ‘accept’, then the client updates her local metadata (to \(st'\)) according to the proof.

Definition 7

ODB security game: This game is similar to the ADS game (Definition 4), except that the corresponding algorithm names from the ODB scheme are used.

Definition 8

ODB Security: We say that an ODB scheme is secure if no PPT adversary can win the ODB security game with non-negligible probability.

Since the algorithm OCertify is used to execute both query and modification commands, the server utilizes it to generate and update the authentication information. It starts with an empty structure, and updates it according to the received modification commands (e.g., the SQL ‘Insert’ command).

Note that the ODB security game covers all previously separate guarantees: correctness, completeness, and freshness. This is because the game requires that no adversary can return a query answer together with a valid proof such that the returned answer differs from the answer that would have been produced by the actual database. If any one of the freshness, completeness, or correctness guarantees were violated, the adversary would win the game. Looking ahead, in our proofs, the challenger keeps a local copy of the database, and can thus detect whether or not the adversary succeeded. If the adversary succeeds, our reduction shows that we break some underlying security assumption.

Theorem 1

The ADS scheme is secure according to Definition 5.

Proof

Security has been proved separately for different schemes by different researchers. Papamanthou et al. [21] proved the security of authenticated hash tables, Goodrich et al. [7] proved the security of the ADS based on the RSA one-way accumulator [1], and Papamanthou and Tamassia [20] proved the security of ADSs constructed using an authenticated skip list or a red-black tree.

Theorem 2

Our HADS construction is secure according to Definition 5 (employing HADS algorithm names) if the underlying ADSs are secure.

Proof

We reduce the security of the HADS scheme to the security of the underlying ADSs. If a PPT adversary \(\mathcal {A}\) wins the HADS security game with non-negligible probability, we can use it to construct a PPT algorithm \(\mathcal {B}\) that breaks the security of at least one of the ADS schemes used, with non-negligible probability. \(\mathcal {B}\) acts as the server in the ADS game played with the ADS challenger \(\mathcal {C}\), and simultaneously plays the role of the challenger in the HADS game with the adversary \(\mathcal {A}\). She receives the public key of an ADS from \(\mathcal {C}\), and generates \(n-1\) pairs of ADS public and private keys herself. Then she guesses a position \(i\), places the received key in the \(i\)-th position, combines the \(n\) public keys into the public key of an \(n\)-level HADS, and sends it to \(\mathcal {A}\). During the setup phase, \(\mathcal {B}\) builds a local copy of the HADS for herself. Note that this is invisible to the adversary \(\mathcal {A}\), and thus does not affect his behavior. After the setup phase, \(\mathcal {A}\) selects a command, generates the answer and proof for the command, and sends them to \(\mathcal {B}\). For the adversary to win, the answer must differ from the real answer in at least one location while its sub-proof \(\pi _{i_j}\) still verifies. \(\mathcal {B}\) can find this location since she maintains a local copy. She then selects the related command, answer, and proof parts for the \(i\)-th position, and forwards them to \(\mathcal {C}\). If her guess of \(i\) was correct, then \(\mathcal {B}\) succeeds. If \(\mathcal {A}\) passes the verification with non-negligible probability \(p\), then \(\mathcal {B}\) passes the ADS verification with probability greater than or equal to \(p/n\).

Since we employ secure ADSs, \(p/n\) must be negligible, which implies that \(p\) is negligible, and hence, \(\mathcal {A}\) has negligible probability of winning the HADS game. Therefore, if the underlying ADSs are secure, then the HADS scheme is secure.
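The \(p/n\) bound can be spelled out as follows, under the assumption (implicit in the word "guess") that \(\mathcal {B}\) picks \(i\) uniformly from \(\{1,\dots ,n\}\) and that the view of \(\mathcal {A}\) is independent of \(i\). Let \(E_j\) denote the event (notation introduced here for illustration) that \(\mathcal {A}\) wins and \(j\) is the first level whose sub-proof verifies for an incorrect answer, so that the events \(E_j\) are disjoint and \(\sum _{j} \Pr [E_j] = p\). Then

\[
\Pr [\mathcal {B} \text{ wins}] \;\ge \; \sum _{j=1}^{n} \Pr [i = j]\,\Pr [E_j] \;=\; \frac{1}{n}\sum _{j=1}^{n} \Pr [E_j] \;=\; \frac{p}{n}.
\]

Because \(n\) is the number of levels (two in the construction of Appendix B), \(p/n\) is non-negligible whenever \(p\) is.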

Theorem 3

Our ODB scheme is secure according to Definition 8, provided that the underlying HADS scheme is secure.

Proof

We reduce the security of the ODB scheme to the security of the underlying HADSs. If a PPT adversary \(\mathcal {A}\) wins the ODB security game with non-negligible probability, we can use it to construct a PPT algorithm \(\mathcal {B}\) that breaks the security of the HADS scheme with non-negligible probability. \(\mathcal {B}\) acts as the server in the HADS game played with the HADS challenger \(\mathcal {C}\), and simultaneously plays the role of the challenger in the ODB game with the adversary \(\mathcal {A}\). She receives the public key of an HADS from \(\mathcal {C}\), and relays it to \(\mathcal {A}\) (note that the HADSs built for all searchable columns use the same key). During the setup phase, \(\mathcal {B}\) builds a local database for herself (which does not change the adversary’s view). After the setup phase, \(\mathcal {A}\) selects a query, generates the answer and proof for the query, and sends them to \(\mathcal {B}\). For the adversary to win, his answer must differ from the real answer in at least one location, yet come with a verifying proof. On receipt, \(\mathcal {B}\) selects the related command, answer, and proof parts for the location that differs from the real answer (she can find it since she maintains a local copy), and forwards them to \(\mathcal {C}\). If \(\mathcal {A}\) passes the ODB verification with non-negligible probability \(p\), then \(\mathcal {B}\) also passes the HADS verification (i.e., breaks HADS security) with non-negligible probability \(p\).

Since we employ a secure HADS, \(p\) must be negligible, which implies that the adversary has negligible probability of breaking ODB. Therefore, our ODB scheme is secure (and provides correctness, completeness, and freshness), if the underlying HADS is secure.

B Efficient ODB Construction

For each level in an HADS, an ADS can be chosen subject to the requirements of that level and the application. Our construction is a two-level HADS, each level having a special role and posing special considerations. We compare the existing ADSs and investigate their eligibility to be used in each level. We consider three classes of ADSs: logarithmic (e.g., authenticated skip list [5, 6]), sublinear (e.g., authenticated hash tables [21]), and linear (e.g., one-way accumulator [1]).

First level: This level stores the distinct values of a column, and generates the first part of the proof to be sent to the client. Proof generation is based on authenticated range queries, which implies that this level should use an ADS that preserves the order of the values it stores. One-way accumulators and hash tables do not support this property efficiently, and cannot be used for this level.

Therefore, we choose the authenticated skip list (alternatively, the Merkle hash tree) to be used in the first level. It requires \(O(\log (|C_i|))\) and \(O(\log (|C_i|) + |t|)\) time/size for the update and query proofs, respectively. There are \(|C_i|\) distinct values, on average, in the first level ADS (stored at leaves), therefore, the storage complexity is \(2|C_i|\), which is \(O(|C_i|)\).
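To make the role of order preservation concrete, the following is a minimal sketch of an authenticated range query, assuming a static Merkle hash tree over the sorted distinct values rather than an authenticated skip list. The server reveals the leaves in the range plus one boundary value on each side and replaces every other subtree by its hash, which gives a proof of size \(O(\log (|C_i|) + |t|)\). The sentinels \(\pm \infty \) and all function names are assumptions made for illustration.

```python
import hashlib
from bisect import bisect_left, bisect_right
from math import inf

def H(tag, *parts):
    d = hashlib.sha256(tag)
    for p in parts:
        d.update(len(p).to_bytes(4, "big") + p)
    return d.digest()

def leaf_hash(v):
    return H(b"leaf", repr(v).encode())

def build(vals):
    """Merkle tree over sorted values; each node remembers its min and max."""
    if len(vals) == 1:
        return {"hash": leaf_hash(vals[0]), "min": vals[0],
                "max": vals[0], "value": vals[0]}
    mid = len(vals) // 2
    left, right = build(vals[:mid]), build(vals[mid:])
    return {"hash": H(b"node", left["hash"], right["hash"]),
            "min": vals[0], "max": vals[-1], "left": left, "right": right}

def prove(node, a, b):
    """Reveal leaves with value in [a, b]; summarize other subtrees by hash."""
    if node["max"] < a or node["min"] > b:
        return ("H", node["hash"])
    if "value" in node:
        return ("L", node["value"])
    return ("N", prove(node["left"], a, b), prove(node["right"], a, b))

def certify_range(vals, tree, lo, hi):
    a = vals[bisect_left(vals, lo) - 1]    # closest stored value below lo
    b = vals[bisect_right(vals, hi)]       # closest stored value above hi
    return prove(tree, a, b)

def replay(proof, seen):
    """Recompute the root hash; record revealed leaves (None = hidden subtree)."""
    if proof[0] == "H":
        seen.append(None)
        return proof[1]
    if proof[0] == "L":
        seen.append(proof[1])
        return leaf_hash(proof[1])
    return H(b"node", replay(proof[1], seen), replay(proof[2], seen))

def verify_range(root, lo, hi, proof):
    """Return the values in [lo, hi], or None if the proof is rejected."""
    seen = []
    if replay(proof, seen) != root:
        return None
    while seen and seen[0] is None:
        seen.pop(0)
    while seen and seen[-1] is None:
        seen.pop()
    if len(seen) < 2 or None in seen:      # a hidden subtree inside the range
        return None
    if not (seen[0] < lo and seen[-1] > hi):
        return None
    inner = seen[1:-1]
    return inner if all(lo <= v <= hi for v in inner) else None

vals = [-inf, 10, 20, 30, 40, 50, inf]     # sorted distinct values with sentinels
tree = build(vals)
root = tree["hash"]
result = verify_range(root, 25, 35, certify_range(vals, tree, 25, 35))
```

The boundary values are what give completeness: a server that hides a matching leaf leaves a gap between revealed leaves, which `verify_range` rejects.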

Second level: This level stores the PK set of values in the first level, where the order of PKs is not a matter of importance (although it can be useful for comparing the PK sets of multiple clauses connected with AND). Thus, any ADS can be used with time/space trade-offs discussed below.
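The two-level layout can be sketched as follows, assuming Merkle hash trees at both levels: each first-level leaf binds a distinct column value to the digest of its PK set, so duplicate values in a column collapse into a single leaf. The helper names, the row layout, and the string encoding of values are hypothetical choices for illustration, not the prototype's implementation.

```python
import hashlib
from collections import defaultdict

def H(tag, *parts):
    d = hashlib.sha256(tag)
    for p in parts:
        d.update(len(p).to_bytes(4, "big") + p)
    return d.digest()

def _parents(level):
    """Pad an odd level, then hash adjacent pairs into the next level."""
    if len(level) % 2:
        level = level + [level[-1]]
    return level, [H(b"node", level[i], level[i + 1])
                   for i in range(0, len(level), 2)]

def merkle_root(leaf_hashes):
    level = list(leaf_hashes)
    while len(level) > 1:
        _, level = _parents(level)
    return level[0]

def merkle_path(leaf_hashes, index):
    path, level = [], list(leaf_hashes)
    while len(level) > 1:
        padded, parents = _parents(level)
        sibling = index ^ 1
        path.append((padded[sibling], sibling > index))
        level, index = parents, index // 2
    return path

def leaf_digest(value, pks):
    """First-level leaf: a distinct value bound to its second-level digest."""
    second_root = merkle_root([H(b"pk", str(pk).encode()) for pk in sorted(pks)])
    return H(b"leaf", str(value).encode(), second_root)

def build_hads(rows, column):
    pk_sets = defaultdict(set)
    for pk, row in rows.items():
        pk_sets[row[column]].add(pk)       # group PKs by distinct column value
    values = sorted(pk_sets)
    leaves = [leaf_digest(v, pk_sets[v]) for v in values]
    return pk_sets, values, leaves, merkle_root(leaves)

def certify_equal(pk_sets, values, leaves, v):
    """Server: the full PK set of v plus its first-level path."""
    return sorted(pk_sets[v]), merkle_path(leaves, values.index(v))

def verify_equal(root, v, pks, path):
    """Client: rebuild the second-level digest, then walk up the first level."""
    digest = leaf_digest(v, pks)
    for sibling, sibling_is_right in path:
        digest = (H(b"node", digest, sibling) if sibling_is_right
                  else H(b"node", sibling, digest))
    return digest == root

rows = {1: {"city": "Ankara"}, 2: {"city": "Izmir"},
        3: {"city": "Ankara"}, 4: {"city": "Bursa"}}
pk_sets, values, leaves, root = build_hads(rows, "city")
pks, path = certify_equal(pk_sets, values, leaves, "Ankara")
```

In this sketch the client always receives the whole PK set of a value; returning only a subset would additionally require second-level membership proofs, which is where the trade-offs discussed next come in.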

Accumulator: For each distinct value in the first level ADS, an accumulated value is computed using all values in its PK set, and is stored together with the value itself. For each PK value, a witness is computed which proves that it belongs to the specified PK set. If we need to select all PK values, it suffices to have only the accumulated value (not the witnesses) to check the integrity. However, if we want to select a subset of the PK values, then their witnesses are also required.

For each distinct value in the first level ADS, \(N/|C_i|\) PK values and witnesses should be computed and stored, on average, where \(N\) is the total number of records in the table. In total, \(2|C_i| + |C_i| \cdot N/|C_i| = 2|C_i| + N\) (which is \(O(|C_i| + N)\)) storage is required (including the \(2|C_i|\) space for the first level ADS).

A proof for each value is made up of two parts, one for the first level ADS (e.g., for authenticated skip list, a path from the leaf up to the root, which is \(O(\log |C_i|)\)), and the other is the accumulated value along with all the values in the PK set, which is \(N/|C_i|\) (the accumulated value is already included in the hash value stored at the corresponding leaf of the first level ADS). The client herself can check the validity of the PK set against the accumulated value. Therefore, for a result set of size \(t\), the asymptotic size of the verification object will be \((O(\log |C_i|) + (t|C_i|/N)(1+N/|C_i|))\simeq O(\log |C_i| + t)\).

The main problem with the accumulator is the cost of update: with each update, all witnesses should be updated, which is expensive.
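The update cost can be seen in a toy sketch of an RSA one-way accumulator [1]. It assumes that PK values have already been mapped to distinct primes, and it uses a deliberately tiny modulus and an arbitrary base; the parameters and function names are illustrative assumptions and offer no security.

```python
from math import prod

# Toy parameters: a real deployment needs a large RSA modulus of unknown factorization.
MODULUS = 1019 * 1187
BASE = 4

def accumulate(primes):
    """Accumulated value of a PK set whose members are encoded as primes."""
    return pow(BASE, prod(primes), MODULUS)

def witness(primes, x):
    """Witness for x: the accumulation of every other member."""
    return pow(BASE, prod(p for p in primes if p != x), MODULUS)

def verify_member(acc, x, wit):
    return pow(wit, x, MODULUS) == acc

def insert(acc, witnesses, y):
    """Adding y changes the accumulated value and every existing witness."""
    updated = {x: pow(w, y, MODULUS) for x, w in witnesses.items()}
    updated[y] = acc                        # the old accumulated value is y's witness
    return pow(acc, y, MODULUS), updated

pk_primes = [3, 5, 7, 11]
acc = accumulate(pk_primes)
witnesses = {x: witness(pk_primes, x) for x in pk_primes}
acc2, witnesses2 = insert(acc, witnesses, 13)
```

A single insertion performs one modular exponentiation per stored witness, which is the linear update cost noted above.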

Authenticated hash table: This is a sublinear membership scheme with constant query and verification time, making it an interesting scheme for clients with resource-constrained devices. It is a good choice if the data is static. For a leaf node storing \(v_i\), we put the PK set of \(v_i\) in an authenticated hash table, and store its root at the leaf node itself.

On average, \(N/|C_i|\) PK values are linked to each leaf node; therefore, we require \(O(|C_i| + (1+\epsilon )(N/|C_i|) \cdot |C_i|) = O(|C_i| + (1+\epsilon )N) \approx O(|C_i| + N)\) storage in total (including the \(O(|C_i|)\) space for the first level). Here \(0 < \epsilon < 1\) is a constant.

Table 1. A comparison of membership schemes for the second level where the first level is a logarithmic ADS. Proof size and verification time are given for one-dimensional queries. The \(s, t, t_1,\) and \( t_2\) denote the number of searchable columns in a table, size of the result set, and number of records in the first and second level ADSs, respectively.

The first level ADS proof is the same, but the authenticated hash table requires only constant proof size \(\epsilon \) [20], reaching \(O(\log |C_i| + 1)\) for one record, and \(O(\log |C_i| + t)\) for \(t\) records in the result set. Moreover, hash operations are much faster than accumulator operations using modular exponentiation.

Authenticated skip list: This is a membership scheme with logarithmic height and proof size. The second-level membership schemes are modified, and their proofs are generated, in the same way as for the first-level ADS.

Each node requires \(\approx 2(N/|C_i|)\) storage to store the PK set, therefore, \(2|C_i| + 2|C_i| \cdot N/|C_i| = 2(|C_i| + N)=O(|C_i| + N)\) storage is required to store a column (including the \(2|C_i|\) space for the first level ADS). The proof size and time for one value are both \(O(\log |C_i| + \log (N/|C_i|))=O(\log N)\), and for \(t\) values are \(O(\log |C_i| + t\log (N/|C_i|))\) and \(O(\log |C_i|+t)\), respectively.
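As a rough worked example, the storage expressions for the three second-level options can be evaluated for one column. The sketch below takes the terms inside the \(O(\cdot )\) bounds at face value and counts stored items, not bytes; the table size, the number of distinct values, and \(\epsilon = 0.5\) are arbitrary assumptions chosen only to show the arithmetic.

```python
def storage_estimates(n_records, n_distinct, eps=0.5):
    """Stored-item counts for one column, per second-level ADS choice."""
    first_level = 2 * n_distinct                     # 2|C_i| for the first level
    return {
        "accumulator": first_level + n_records,              # 2|C_i| + N
        "hash_table": n_distinct + (1 + eps) * n_records,    # |C_i| + (1 + eps)N
        "skip_list": 2 * (n_distinct + n_records),           # 2(|C_i| + N)
    }

# Hypothetical column: N = 1,000,000 records with |C_i| = 1,000 distinct values,
# i.e. N/|C_i| = 1,000 primary keys per distinct value on average.
estimates = storage_estimates(1_000_000, 1_000)
```

All three options stay within a small constant factor of \(N\), so the choice between them is driven mainly by update and proof costs.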

A comparison of these schemes is given in Table 1, where the first level is a logarithmic ADS and the second levels are shown in the table. The \(s, t, t_1,\) and \( t_2\) denote the number of searchable columns in a table, size of the result set, and number of records in the first and second level ADSs, respectively. Note however that unit operations in the accumulator are more costly than those in the others.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Etemad, M., Küpçü, A. (2014). Database Outsourcing with Hierarchical Authenticated Data Structures. In: Lee, HS., Han, DG. (eds) Information Security and Cryptology -- ICISC 2013. Lecture Notes in Computer Science, vol 8565. Springer, Cham. https://doi.org/10.1007/978-3-319-12160-4_23

  • DOI: https://doi.org/10.1007/978-3-319-12160-4_23

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-12159-8

  • Online ISBN: 978-3-319-12160-4

  • eBook Packages: Computer Science, Computer Science (R0)
