Database Outsourcing with Hierarchical Authenticated Data Structures

Etemad, Mohammad; Küpçü, Alptekin

doi:10.1007/978-3-319-12160-4_23

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 8565)

Included in the following conference series:

International Conference on Information Security and Cryptology

Abstract

In an outsourced database scheme, the data owner delegates the data management tasks to a remote service provider. At a later time, the remote service is supposed to answer any query on the database. The essential requirements are ensuring the data integrity and authenticity with efficient mechanisms. Current approaches employ authenticated data structures to store security information, generated by the client and used by the server, to compute proofs that show the answers to the queries are authentic. The existing solutions have shortcomings with multi-clause queries and duplicate values in a column.

We propose a hierarchical authenticated data structure for storing security information, which alleviates the mentioned problems. We provide a unified formal definition of a secure outsourced database scheme, and prove that our proposed scheme is secure according to this definition, which captures previously separate properties such as correctness, completeness, and freshness. The performance evaluation based on our prototype implementation confirms the efficiency of the proposed outsourced database scheme, showing a decrease of more than 50% in proof size and proof generation time compared to previous work, and about 1–20% communication overhead compared to the query result size.

Notes

1. We can handle the heterogeneous case as well, but it complicates the presentation.

References

1. Benaloh, J., de Mare, M.: One-way accumulators: a decentralized alternative to digital signatures. In: Helleseth, T. (ed.) EUROCRYPT 1993. LNCS, vol. 765, pp. 274–285. Springer, Heidelberg (1994)
2. Celko, J.: Joe Celko’s Trees and Hierarchies in SQL for Smarties. Morgan Kaufmann, Washington (2004)
3. Devanbu, P., Gertz, M., Martel, C., Stubblebine, S.: Authentic third-party data publication. In: Thuraisingham, B., van de Riet, R., Dittrich, K.R., Tari, Z. (eds.) Data and Application Security. IFIP, vol. 73, pp. 101–112. Springer, Heidelberg (2001)
4. Di Battista, G., Palazzi, B.: Authenticated relational tables and authenticated skip lists. In: Barker, S., Ahn, G.-J. (eds.) Data and Applications Security 2007. LNCS, vol. 4602, pp. 31–46. Springer, Heidelberg (2007)
5. Erway, C., Küpçü, A., Papamanthou, C., Tamassia, R.: Dynamic provable data possession. In: CCS’09, pp. 213–222. ACM (2009)
6. Goodrich, M., Tamassia, R.: Efficient authenticated dictionaries with skip lists and commutative hashing. US Patent App, 10(416,015) (2000)
7. Goodrich, M.T., Tamassia, R., Hasić, J.: An efficient dynamic and distributed cryptographic accumulator. In: Chan, A.H., Gligor, V.D. (eds.) ISC 2002. LNCS, vol. 2433, pp. 372–388. Springer, Heidelberg (2002)
8. Goodrich, M.T., Tamassia, R., Triandopoulos, N.: Super-efficient verification of dynamic outsourced databases. In: Malkin, T. (ed.) CT-RSA 2008. LNCS, vol. 4964, pp. 407–424. Springer, Heidelberg (2008)
9. Li, F., Hadjieleftheriou, M., Kollios, G., Reyzin, L.: Dynamic authenticated index structures for outsourced databases. In: ACM SIGMOD, pp. 121–132 (2006)
10. Li, F., Hadjieleftheriou, M., Kollios, G., Reyzin, L.: Authenticated index structures for aggregation queries. TISSEC 13(4), 32 (2010)
11. Martel, C., Nuckolls, G., Devanbu, P., Gertz, M., Kwong, A., Stubblebine, S.: A general model for authenticated data structures. Algorithmica 39(1), 21–41 (2004)
12. Merkle, R.C.: A certified digital signature. In: Brassard, G. (ed.) CRYPTO 1989. LNCS, vol. 435, pp. 218–238. Springer, Heidelberg (1990)
13. Mykletun, E., Narasimha, M., Tsudik, G.: Providing authentication and integrity in outsourced databases using Merkle hash trees. UCI-SCONCE Technical report (2003)
14. Narasimha, M., Tsudik, G.: Authentication of outsourced databases using signature aggregation and chaining. In: Li Lee, M., Tan, K.-L., Wuwongse, V. (eds.) DASFAA 2006. LNCS, vol. 3882, pp. 420–436. Springer, Heidelberg (2006)
15. Nuckolls, G.: Verified query results from hybrid authentication trees. In: Jajodia, S., Wijesekera, D. (eds.) Data and Applications Security 2005. LNCS, vol. 3654, pp. 84–98. Springer, Heidelberg (2005)
16. Palazzi, B.: Outsourced Storage Services: Authentication and Security Visualization. Ph.D. thesis, Roma Tre University (2009)
17. Palazzi, B., Pizzonia, M., Pucacco, S.: Query racing: fast completeness certification of query results. In: Foresti, S., Jajodia, S. (eds.) Data and Applications Security and Privacy XXIV. LNCS, vol. 6166, pp. 177–192. Springer, Heidelberg (2010)
18. Pang, H., Jain, A., Ramamritham, K., Tan, K.: Verifying completeness of relational query results in data publishing. In: ACM SIGMOD, pp. 407–418 (2005)
19. Pang, H., Tan, K.: Authenticating query results in edge computing. In: International Conference on Data Engineering, pp. 560–571. IEEE (2004)
20. Papamanthou, C., Tamassia, R.: Time and space efficient algorithms for two-party authenticated data structures. In: Qing, S., Imai, H., Wang, G. (eds.) ICICS 2007. LNCS, vol. 4861, pp. 1–15. Springer, Heidelberg (2007)
21. Papamanthou, C., Tamassia, R., Triandopoulos, N.: Authenticated hash tables. In: CCS’08, pp. 437–448. ACM (2008)
22. Tamassia, R.: Authenticated data structures. In: Di Battista, G., Zwick, U. (eds.) ESA 2003. LNCS, vol. 2832, pp. 2–5. Springer, Heidelberg (2003)
23. Tamassia, R., Triandopoulos, N.: On the cost of authenticated data structures. Technical report, Center for Geometric Computing, Brown University (2003)
24. Wang, J., Du, X.: Skip list based authenticated data structure in DAS paradigm. In: GCC’09, pp. 69–75. IEEE (2009)
25. Yang, Y., Papadias, D., Papadopoulos, S., Kalnis, P.: Authenticated join processing in outsourced databases. In: ACM SIGMOD, pp. 5–18. ACM (2009)

Acknowledgements

The authors would like to acknowledge the support of TÜBİTAK, the Scientific and Technological Research Council of Turkey, under project numbers 111E019 and 112E115, as well as European Union COST Action IC1206. We also thank Ertem Esiner, Adilet Kachkeev, and Ozan Okumuşoğlu for their contributions during performance evaluation.

Author information

Authors and Affiliations

Koç University, İstanbul, Turkey
Mohammad Etemad & Alptekin Küpçü

Corresponding author

Correspondence to Mohammad Etemad.

Editor information

Editors and Affiliations

EWHA Womans University, Seoul, Korea, Republic of (South Korea)
Hyang-Sook Lee
Kookmin University, Seoul, Korea, Republic of (South Korea)
Dong-Guk Han

Appendices

A ADS Definitions and Security Analysis

Definition 2

An ADS scheme consists of three polynomial-time algorithms [20]:

\(\mathbf {KeyGen(1^k)} \rightarrow \mathbf {(sk, pk){:}}\) is a probabilistic algorithm executed by the client to generate a private and public key pair \((sk, pk)\) given the security parameter \(k\). The client then shares the public key \(pk\) with the server.
\(\mathbf {Certify(pk, cmd)} \rightarrow \mathbf {(ans,}\,\pi \mathbf {){:}}\) is run by the server to respond to a command issued by the client. The public key \(pk\) and the command \(cmd\) are given as input. If \(cmd\) is a query command, it outputs the answer \(ans\) together with a verification proof \(\pi \) that enables the client to verify the validity of the answer. If \(cmd\) is a modification command (insertion, update, or deletion), then \(ans\) is null, and \(\pi \) is a consistency proof enabling the client to update her local metadata.
\(\mathbf {Verify(sk, pk, cmd, ans,}\,\pi \mathbf {, st)} \rightarrow \mathbf {(\{accept, reject\}, st'){:}}\) is run by the client upon receipt of a response to verify it. The public and private keys \((pk, sk)\), the command \(cmd\), the answer \(ans\), the proof \(\pi \), and the client’s current metadata \(st\) are given as input. It outputs \(accept\) or \(reject\) based on the result of the verification. Moreover, if the command was a modification command and the proof is accepted, then the client updates her metadata accordingly (to \(st'\)).
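The shape of Certify and Verify can be illustrated with a minimal sketch, assuming a hash-based ADS (a Merkle hash tree) in which no key pair is needed and the client's metadata \(st\) is just the root digest. The function names `h`, `build_levels`, `root_of`, `certify`, and `verify` are hypothetical and chosen for this illustration only; the sketch covers membership queries on a non-empty static list and is not the construction evaluated in the paper.

```python
import hashlib

def h(*parts: bytes) -> bytes:
    """SHA-256 over length-prefixed parts, so concatenations are unambiguous."""
    m = hashlib.sha256()
    for p in parts:
        m.update(len(p).to_bytes(4, "big"))
        m.update(p)
    return m.digest()

def build_levels(leaves):
    """Return every tree level, leaf hashes first; an unpaired node is promoted.
    Assumes a non-empty list of byte strings."""
    level = [h(b"leaf", x) for x in leaves]
    levels = [level]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                nxt.append(h(b"node", level[i], level[i + 1]))
            else:
                nxt.append(level[i])  # no sibling: carry the node up unchanged
        level = nxt
        levels.append(level)
    return levels

def root_of(leaves):
    """Client metadata st: the root digest of the tree."""
    return build_levels(leaves)[-1][0]

def certify(leaves, index):
    """Server side: return the answer and its proof (sibling hashes up to the root)."""
    proof = []
    i = index
    for level in build_levels(leaves)[:-1]:
        sib = i ^ 1
        if sib < len(level):
            # Record whether the sibling sits to the left of the current node.
            proof.append((sib < i, level[sib]))
        i //= 2
    return leaves[index], proof

def verify(st, ans, proof):
    """Client side: recompute the root from the answer and proof, compare with st."""
    acc = h(b"leaf", ans)
    for sibling_is_left, sib in proof:
        acc = h(b"node", sib, acc) if sibling_is_left else h(b"node", acc, sib)
    return acc == st
```

The proof contains at most one hash per tree level, which is where the logarithmic proof size of this class of ADSs comes from.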

Definition 3

ADS correctness: For all valid proofs \(\pi \) and answers \(ans\) returned by the server in response to a command issued by the client, the verify algorithm accepts with overwhelming probability.

Definition 4

The ADS security game is played between the challenger, who acts as the client, and the adversary, who plays the role of the server.

\({\mathbf {Key~generation{:}}}\) The challenger runs \(KeyGen(1^k)\) to generate the private and public key pair \((sk, pk)\), and sends the public key \(pk\) to the adversary.
\({\mathbf {Setup{:}}}\) The adversary specifies a command \(cmd\), and sends it together with an answer \(ans\) and proof \(\pi \) to the challenger. The challenger runs the algorithm Verify, and notifies the adversary about the result. If the command was a modification command, and the proof is accepted, then the challenger applies the changes to her local metadata accordingly. The adversary can repeat this interaction polynomially many times. Call the latest version of the ADS, constructed using all the commands whose proofs verified, \(D\).
\({\mathbf {Challenge{:}}}\) The adversary specifies a command \(cmd\), an answer \(ans'\), and a proof \(\pi '\), and sends them all to the challenger. The adversary wins if the answer \(ans'\) is different from the result set of running \(cmd\) on \(D\), and \((cmd, ans', \pi ')\) is accepted by the challenger.
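The winning condition of the challenge phase can be made concrete with a small sketch. It assumes a deliberately naive toy scheme in which the proof is the entire data set and the client's metadata is a hash of it; this scheme has linear proof size and only serves to show what "the adversary wins" means. All names (`digest`, `run`, `verify`, `adversary_wins`) are hypothetical.

```python
import hashlib
import json

def digest(db):
    """Client metadata st for the toy scheme: a hash of the sorted data set."""
    return hashlib.sha256(json.dumps(sorted(db)).encode()).hexdigest()

def run(cmd, db):
    """Honest execution of a range query cmd = ("range", lo, hi) on db."""
    _, lo, hi = cmd
    return sorted(x for x in db if lo <= x <= hi)

def verify(st, cmd, ans, proof):
    """Toy verifier: the proof is the full data set, checked against st."""
    return digest(proof) == st and ans == run(cmd, proof)

def adversary_wins(st, D, cmd, ans, proof):
    """Challenge phase: the proof verifies, yet the answer differs from cmd run on D."""
    return verify(st, cmd, ans, proof) and ans != run(cmd, D)
```

In this toy scheme an answer that omits a record, or a proof built from a different data set, is rejected by `verify`, so `adversary_wins` returns `False` unless the hash function is broken.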

Definition 5

Security of ADS: We say that the ADS is secure if no PPT adversary can win the ADS security game with non-negligible probability.

Definition 6

An outsourced database scheme (ODB) consists of three probabilistic polynomial-time algorithms (OKeyGen, OCertify, OVerify) where:

\(\mathbf {OKeyGen(1^k)}\rightarrow \mathbf {(sk, pk){:}}\) is a probabilistic algorithm run by the client to generate a pair of secret and public keys \((sk, pk)\) given the security parameter \(k\). She keeps both keys, and shares only the public key with the server.
\(\mathbf {OCertify(pk, cmd)} \rightarrow \mathbf {(ans,}\,\pi \mathbf {){:}}\) is run by the server to respond to a command \(cmd\) issued by the client. It produces an answer \(ans\) and a proof \(\pi \) proving the authenticity of the answer. If the command is a modification command, the answer is empty, and the proof proves that the modification is done properly.
\(\mathbf {OVerify(pk, sk, cmd, ans, \pi , st) \rightarrow (\{accept, reject\}, st'){:}}\) is run by the client upon receipt of the answer \(ans\) and proof \(\pi \), to be verified using the public and private key pair. It outputs an ‘accept’ or ‘reject’ notification. If the command was a modification command and the verification result is ‘accept’, then the client updates her local metadata (to \(st'\)) according to the proof.

Definition 7

ODB security game: This game is similar to the ADS game (Definition 4), except that the corresponding algorithm names from the ODB scheme are used.

Definition 8

ODB Security: We say that an ODB scheme is secure if no PPT adversary can win the ODB security game with non-negligible probability.

Since the algorithm OCertify is used to execute both query and modification commands, the server utilizes it to generate and update the authentication information. It starts with an empty structure, and updates it according to the received modification commands (e.g., the SQL ‘Insert’ command).

Note that the ODB security game covers all previously separate guarantees: correctness, completeness, and freshness. This is simply due to the fact that the game requires that no adversary can return a query answer together with a valid proof such that the returned answer is different from the answer that would have been produced by the actual database. If any one of the freshness, completeness, or correctness guarantees were violated, the adversary would win the game. Looking ahead, in our proofs, the challenger keeps a local copy of the database, and can detect whether or not the adversary succeeded. If the adversary succeeds, our reduction shows that we break some underlying security assumption.

Theorem 1

The ADS scheme is secure according to Definition 5.

Proof

Security has been proved separately for different schemes by different researchers. Papamanthou et al. [21] proved the security of the authenticated hash tables, Goodrich et al. [7] proved the security of the ADS based on the RSA one-way accumulator [1], and Papamanthou and Tamassia [20] proved the security of the ADSs constructed using an authenticated skip list or a red-black tree.

Theorem 2

Our HADS construction is secure according to Definition 5 (employing HADS algorithm names) if the underlying ADSs are secure.

Proof

We reduce the security of the HADS scheme to the security of the underlying ADSs. If a PPT adversary \(\mathcal {A}\) wins the HADS security game with non-negligible probability, we can use it to construct a PPT algorithm \(\mathcal {B}\) that breaks the security of at least one of the ADS schemes used, with non-negligible probability. \(\mathcal {B}\) acts as the server in the ADS game played with the ADS challenger \(\mathcal {C}\), and simultaneously plays the role of the challenger in the HADS game with the adversary \(\mathcal {A}\). She receives the public key of an ADS from \(\mathcal {C}\), and herself produces \(n-1\) pairs of ADS public and private keys. Then, she guesses a position \(i\) (chosen uniformly at random among the \(n\) levels), places the received key in the \(i^{th}\) position, combines the \(n\) public keys into the public key of an \(n\)-level HADS, and sends it to \(\mathcal {A}\). During the setup phase, \(\mathcal {B}\) builds a local copy of the HADS for herself. Note that this is invisible to the adversary \(\mathcal {A}\), and thus will not affect his behavior. After the setup phase, \(\mathcal {A}\) selects a command, generates the answer and proof for the command, and sends them to \(\mathcal {B}\). For the adversary to win, the answer must be different from the real answer in at least one location, and that location must come with a verifying sub-proof \(\pi _{i_j}\). \(\mathcal {B}\) can find it since she maintains a local copy. When \(\mathcal {B}\) receives them, she selects the related command, answer, and proof parts for the \(i^{th}\) position, and forwards them to \(\mathcal {C}\). If the guess of \(i\) was correct, then \(\mathcal {B}\) succeeds. If \(\mathcal {A}\) passes the verification with non-negligible probability \(p\), then \(\mathcal {B}\) passes the ADS verification with probability greater than or equal to \(p/n\).

Since we employ secure ADSs, \(p/n\) must be negligible, which implies that \(p\) is negligible, and hence, \(\mathcal {A}\) has negligible probability of winning the HADS game. Therefore, if the underlying ADSs are secure, then the HADS scheme is secure.
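The \(p/n\) bound can be spelled out as follows, under the assumption (implicit in the word "guess") that \(\mathcal {B}\) picks the position \(i\) uniformly at random and that this choice is hidden from \(\mathcal {A}\). Let \(i^{*}\) denote the first level at which the answer of \(\mathcal {A}\) differs from the real answer; this symbol is introduced here only for the illustration.

\[
\Pr [\mathcal {B} \text{ wins}] \;\ge \; \Pr [\mathcal {A} \text{ wins} \,\wedge \, i = i^{*}] \;=\; \Pr [\mathcal {A} \text{ wins}] \cdot \Pr [i = i^{*} \mid \mathcal {A} \text{ wins}] \;=\; p \cdot \frac{1}{n}
\]

The inequality is not necessarily tight, because the answer may differ at several levels and any of them would serve \(\mathcal {B}\). Since \(n\) is the number of levels of the HADS and is at most polynomial in \(k\), \(p/n\) being negligible forces \(p\) to be negligible.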

Theorem 3

Our ODB scheme is secure according to Definition 8, provided that the underlying HADS scheme is secure.

Proof

We reduce the security of the ODB scheme to the security of the underlying HADSs. If a PPT adversary \(\mathcal {A}\) wins the ODB security game with non-negligible probability, we can use it to construct a PPT algorithm \(\mathcal {B}\) that breaks the security of the HADS scheme with non-negligible probability. \(\mathcal {B}\) acts as the server in the HADS game played with the HADS challenger \(\mathcal {C}\), and simultaneously plays the role of the challenger in the ODB game with the adversary \(\mathcal {A}\). She receives the public key of an HADS from \(\mathcal {C}\), and relays it to \(\mathcal {A}\) (note that all HADSs built for each searchable column will use the same key). During the setup phase, \(\mathcal {B}\) builds a local database for herself (which does not change the adversary’s view). After the setup phase, \(\mathcal {A}\) selects a query, generates the answer and proof for the query, and sends them to \(\mathcal {B}\). For the adversary to win, his answer must be different from the real answer in at least one location, but with a verifying proof. On receipt, \(\mathcal {B}\) selects the related command, answer, and proof parts for the answer that differs from the real answer (she can find it since she maintains a local copy), and forwards them to \(\mathcal {C}\). If \(\mathcal {A}\) passes the ODB verification with non-negligible probability \(p\), then \(\mathcal {B}\) can also pass the HADS verification (i.e., break HADS security) with non-negligible probability \(p\).

Since we employ a secure HADS, \(p\) must be negligible, which implies that the adversary has negligible probability of breaking ODB. Therefore, our ODB scheme is secure (and provides correctness, completeness, and freshness), if the underlying HADS is secure.

B Efficient ODB Construction

For each level in an HADS, an ADS can be chosen subject to the requirements of that level and the application. Our construction is a two-level HADS, each level having a special role and posing special considerations. We compare the existing ADSs and investigate their eligibility to be used in each level. We consider three classes of ADSs: logarithmic (e.g., authenticated skip list [5, 6]), sublinear (e.g., authenticated hash tables [21]), and linear (e.g., one-way accumulator [1]).

First level: This level stores the distinct values of a column, and generates the first part of the proof to be sent to the client. Proof generation is based on authenticated range queries, which implies that this level should use an ADS that preserves the order of the values it stores. One-way accumulators and hash tables do not support this property efficiently, and cannot be used for this level.

Therefore, we choose the authenticated skip list (alternatively, the Merkle hash tree) to be used in the first level. It requires \(O(\log |C_i|)\) and \(O(\log |C_i| + t)\) time/size for the update and query proofs, respectively, where \(t\) is the size of the result set. There are \(|C_i|\) distinct values, on average, in the first-level ADS (stored at the leaves); therefore, the storage complexity is \(2|C_i|\), which is \(O(|C_i|)\).

Second level: This level stores the PK set of each value in the first level, where the order of the PKs does not matter (although it can be useful for comparing the PK sets of multiple clauses connected with AND). Thus, any ADS can be used, with the time/space trade-offs discussed below.
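The relation between the two levels can be sketched as follows, assuming for simplicity that both levels are plain Merkle hash trees and that each first-level leaf binds a distinct column value to the digest of its PK set. The helper names (`h`, `merkle_root`, `hads_digest`) and the row layout (dictionaries with a `pk` key) are hypothetical; the sketch computes only the digest the client would keep, not proofs.

```python
import hashlib
from collections import defaultdict

def h(*parts: bytes) -> bytes:
    """SHA-256 over length-prefixed parts."""
    m = hashlib.sha256()
    for p in parts:
        m.update(len(p).to_bytes(4, "big"))
        m.update(p)
    return m.digest()

def merkle_root(hashes):
    """Fold a list of hashes pairwise into a single root digest."""
    level = list(hashes)
    if not level:
        return h(b"empty")
    while len(level) > 1:
        level = [h(b"node", *level[i:i + 2]) for i in range(0, len(level), 2)]
    return level[0]

def hads_digest(rows, column):
    """Return (root digest, number of distinct values) for one searchable column."""
    pk_sets = defaultdict(list)
    for row in rows:
        pk_sets[row[column]].append(row["pk"])
    leaves = []
    for value in sorted(pk_sets):  # first level keeps distinct values in order
        # Second level: digest of the PK set of this value.
        second = merkle_root(
            [h(b"pk", str(pk).encode()) for pk in sorted(pk_sets[value])]
        )
        # The first-level leaf binds the value to its second-level digest.
        leaves.append(h(b"leaf", str(value).encode(), second))
    return merkle_root(leaves), len(pk_sets)
```

Because duplicates of a column value are collected into one PK set, the first level holds each value only once, however often it occurs in the table.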

Accumulator: For each distinct value in the first level ADS, an accumulated value is computed using all values in its PK set, and is stored together with the value itself. For each PK value, a witness is computed which proves that it belongs to the specified PK set. If we need to select all PK values, it suffices to have only the accumulated value (not the witnesses) to check the integrity. But if we want to select a subset of the PK values, then their witnesses are also required.

For each distinct value in the first level ADS, \(N/|C_i|\) PK values and witnesses should be computed and stored, on average, where \(N\) is the total number of records in the table. In total, \(2|C_i| + |C_i|*N/|C_i| = 2|C_i| + N\) (which is \(O(|C_i| + N)\)) storage is required (including the \(2|C_i|\) space for the first level ADS).

A proof for each value is made up of two parts, one for the first level ADS (e.g., for authenticated skip list, a path from the leaf up to the root, which is \(O(\log |C_i|)\)), and the other is the accumulated value along with all the values in the PK set, which is \(N/|C_i|\) (the accumulated value is already included in the hash value stored at the corresponding leaf of the first level ADS). The client herself can check the validity of the PK set against the accumulated value. Therefore, for a result set of size \(t\), the asymptotic size of the verification object will be \((O(\log |C_i|) + (t|C_i|/N)(1+N/|C_i|))\simeq O(\log |C_i| + t)\).
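The simplification in the last step can be made explicit. A result set of \(t\) records covers, on average, \(t|C_i|/N\) distinct values, and each of them contributes one accumulated value plus its \(N/|C_i|\) PK values:

\[
\frac{t|C_i|}{N}\left( 1 + \frac{N}{|C_i|}\right) \;=\; \frac{t|C_i|}{N} + t \;\le \; 2t ,
\]

where the inequality uses \(|C_i| \le N\) (there cannot be more distinct values than records). Adding the \(O(\log |C_i|)\) first-level part gives \(O(\log |C_i| + t)\).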

The main problem with the accumulator is the cost of update: with each update, all witnesses should be updated, which is expensive.
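The accumulated value, the witnesses, and the update cost can be illustrated with a toy RSA-style accumulator. This is a sketch under strong assumptions: the modulus is built from two tiny, hypothetical primes and is completely insecure, the PK values are taken to be small primes, and the names `accumulate`, `witness`, and `verify_member` are chosen for this illustration only.

```python
from math import prod

# Toy parameters for illustration only: a real deployment needs a large RSA modulus.
P, Q = 1019, 1031
N_MOD = P * Q
G = 3

def accumulate(pks):
    """Accumulated value of a PK set: G raised to the product of all elements."""
    return pow(G, prod(pks), N_MOD)

def witness(pks, x):
    """Witness for x: the accumulation of every element except x."""
    others = list(pks)
    others.remove(x)
    return pow(G, prod(others), N_MOD)

def verify_member(acc, x, w):
    """x belongs to the set if raising its witness to x gives the accumulated value."""
    return pow(w, x, N_MOD) == acc

pks = [5, 7, 11, 13]
acc = accumulate(pks)
w5 = witness(pks, 5)

# Inserting a new element changes the accumulated value ...
new_acc = pow(acc, 17, N_MOD)
# ... and every existing witness has to be raised to the new element as well.
new_w5 = pow(w5, 17, N_MOD)
```

After the insertion, `w5` no longer verifies against `new_acc`, while `new_w5` does. One such modular exponentiation per stored witness is the update cost referred to above.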

Authenticated hash table: This is a sublinear membership scheme with constant query and verification time, making it an interesting scheme for clients with resource-constrained devices. It is a good choice if the data is static. For a leaf node storing \(v_i\), we put the PK set of \(v_i\) in an authenticated hash table, and store its root at the leaf node itself.

On average, \(N/|C_i|\) PK values are linked to each leaf node; therefore, we require \(O(|C_i| + (1+\epsilon )(N/|C_i|) \cdot |C_i|) = O(|C_i| + (1+\epsilon )N) \approx O(|C_i| + N)\) storage in total (including the \(O(|C_i|)\) space for the first level). Here \(0 < \epsilon < 1\) is a constant.

Table 1. A comparison of membership schemes for the second level where the first level is a logarithmic ADS. Proof size and verification time are given for one-dimensional queries. Here \(s, t, t_1,\) and \( t_2\) denote the number of searchable columns in a table, the size of the result set, and the number of records in the first and second level ADSs, respectively.

The first level ADS proof is the same, but the authenticated hash table requires only constant proof size \(\epsilon \) [20], reaching \(O(\log |C_i|) + 1\) for one record, and \(O(\log |C_i|) + t\) for \(t\) records in the result set. Moreover, hash operations are much faster than accumulator operations using modular exponentiation.

Authenticated skip list: This is a membership scheme with logarithmic height and proof size. The way the second-level membership schemes are modified, and the way the proofs are generated, are the same as for the first-level ADS.

Each node requires \(\approx 2(N/|C_i|)\) storage to store the PK set, therefore, \(2|C_i| + 2|C_i|*N/|C_i| = 2(|C_i| + N)=O(|C_i| + N)\) storage is required to store a column (including the \(2|C_i|\) space for the first level ADS). The proof size and time for one value are both \(O(\log |C_i| + \log (N/|C_i|))=O(\log N)\), and for \(t\) values are \(O(\log |C_i| + t\log (N/|C_i|))\) and \(O(\log |C_i|+t)\), respectively.
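To compare the storage figures of the three second-level choices on concrete numbers, the formulas above can be evaluated directly. The sketch below counts stored items, not bytes. The hash-table case is given in the text only asymptotically, so using \(2|C_i|\) for its first level and a default \(\epsilon = 0.5\) are assumptions made for this illustration, as is the function name `storage_estimates`.

```python
def storage_estimates(n_records, n_distinct, eps=0.5):
    """Stored items per searchable column for each second-level choice.

    n_records is N, n_distinct is |C_i|; eps is the hash-table constant (0 < eps < 1).
    """
    first_level = 2 * n_distinct  # first-level ADS, as given for the skip list
    return {
        # 2|C_i| + N: one PK/witness entry per record
        "accumulator": first_level + n_records,
        # |C_i| + (1 + eps) N, with the first level taken as 2|C_i| (assumption)
        "hash_table": first_level + (1 + eps) * n_records,
        # 2(|C_i| + N): each second-level skip list stores about 2(N/|C_i|) items
        "skip_list": first_level + 2 * n_records,
    }

estimates = storage_estimates(n_records=1_000_000, n_distinct=1_000)
```

For one million records and one thousand distinct values, all three choices stay within a factor of two of each other, which matches the common \(O(|C_i| + N)\) bound; the differences between the schemes lie in proof size and update cost rather than storage.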

A comparison of these schemes is given in Table 1, where the first level is a logarithmic ADS and the second levels are shown in the table. The \(s, t, t_1,\) and \( t_2\) denote the number of searchable columns in a table, size of the result set, and number of records in the first and second level ADSs, respectively. Note however that unit operations in the accumulator are more costly than those in the others.

About this paper

Cite this paper

Etemad, M., Küpçü, A. (2014). Database Outsourcing with Hierarchical Authenticated Data Structures. In: Lee, HS., Han, DG. (eds) Information Security and Cryptology -- ICISC 2013. ICISC 2013. Lecture Notes in Computer Science, vol 8565. Springer, Cham. https://doi.org/10.1007/978-3-319-12160-4_23

DOI: https://doi.org/10.1007/978-3-319-12160-4_23
Published: 19 October 2014
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-12159-8
Online ISBN: 978-3-319-12160-4
eBook Packages: Computer Science, Computer Science (R0)
