Efficient and secure exact-match queries in outsourced databases

Abstract

Data management can now be outsourced to cloud service providers such as Amazon Web Services or IBM SmartCloud. This calls for encrypted data-representation schemes that still allow efficient query processing. State-of-the-art approaches are either overly expensive for exact-match queries in the worst case, or they do not ensure privacy if an adversary knows the data distribution. In this paper, we propose a new privacy approach without these shortcomings. It combines encryption, obfuscated indices, and data fragmentation. To speed up query processing, we propose three novel data-transformation and query-execution schemes. For two of these schemes, we prove that an adversary capable of solving any polynomial problem cannot determine whether any attribute values appear together in a tuple. Thus, with these two schemes, sensitive data is not linked to personally identifiable information. To evaluate our third scheme, we propose a measure that quantifies the risk of disclosure. We evaluate our approach on real-world folksonomy data. The evaluation shows that the average response time of exact-match queries over 15 million tuples is under one second on a conventional desktop PC.

Author information

Corresponding author

Correspondence to Clemens Heidinger.

About this article

Cite this article

Heidinger, C., Böhm, K., Buchmann, E. et al. Efficient and secure exact-match queries in outsourced databases. World Wide Web 18, 567–605 (2015). https://doi.org/10.1007/s11280-013-0270-0

Keywords

  • Outsourcing
  • Privacy
  • Security
  • Confidentiality
  • Query execution