A Universal Toolkit for Cryptographically Secure Privacy-Preserving Data Mining

  • Dan Bogdanov
  • Roman Jagomägis
  • Sven Laur
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7299)


The issue of potential data misuse rises whenever it is collected from several sources. In a common setting, a large database is either horizontally or vertically partitioned between multiple entities who want to find global trends from the data. Such tasks can be solved with secure multi-party computation (MPC) techniques. However, practitioners tend to consider such solutions inefficient. Furthermore, there are no established tools for applying secure multi-party computation in real-world applications. In this paper, we describe Sharemind—a toolkit, which allows data mining specialist with no cryptographic expertise to develop data mining algorithms with good security guarantees. We list the building blocks needed to deploy a privacy-preserving data mining application and explain the design decisions that make Sharemind applications efficient in practice. To validate the practical feasibility of our approach, we implemented and benchmarked four algorithms for frequent itemset mining.


Association Rule Memory Consumption Data Owner Cover Vector APRIORI Algorithm 
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.


  1. 1.
    Agrawal, R., Srikant, R.: Fast algorithms for mining association rules in large databases. In: Proc. of VLDB 1994, pp. 487–499. Morgan Kaufmann (1994)Google Scholar
  2. 2.
    Agrawal, S., Haritsa, J.R., Prakash, B.A.: FRAPP: a framework for high-accuracy privacy-preserving mining. Knowledge Discovery and Data Mining 18(1), 101–139 (2009)MathSciNetCrossRefGoogle Scholar
  3. 3.
    Bogdanov, D., Laur, S., Willemson, J.: Sharemind: A Framework for Fast Privacy-Preserving Computations. In: Jajodia, S., Lopez, J. (eds.) ESORICS 2008. LNCS, vol. 5283, pp. 192–206. Springer, Heidelberg (2008)CrossRefGoogle Scholar
  4. 4.
    Bogetoft, P., Damgård, I., Jakobsen, T., Nielsen, K., Pagter, J., Toft, T.: A Practical Implementation of Secure Auctions Based on Multiparty Integer Computation. In: Di Crescenzo, G., Rubin, A. (eds.) FC 2006. LNCS, vol. 4107, pp. 142–147. Springer, Heidelberg (2006)CrossRefGoogle Scholar
  5. 5.
    Bramer, M.: Principles of Data Mining. Springer (2007)Google Scholar
  6. 6.
    Brijs, T., Swinnen, G., Vanhoof, K., Wets, G.: Using association rules for product assortment decisions: A case study. In: Proc. of KDD 1999, pp. 254–260. ACM (1999)Google Scholar
  7. 7.
    Burkhart, M., Strasser, M., Many, D., Dimitropoulos, X.: SEPIA: privacy-preserving aggregation of multi-domain network events and statistics. In: Proc. of USENIX Security 2010, p. 15. USENIX Association (2010)Google Scholar
  8. 8.
    Chor, B., Kushilevitz, E.: A zero-one law for boolean privacy. In: Proc. of STOC 1989, pp. 62–72. ACM Press (1989)Google Scholar
  9. 9.
    Damgård, I., Ishai, Y.: Constant-Round Multiparty Computation Using a Black-Box Pseudorandom Generator. In: Shoup, V. (ed.) CRYPTO 2005. LNCS, vol. 3621, pp. 378–394. Springer, Heidelberg (2005)Google Scholar
  10. 10.
    Evfimievski, A.V., Srikant, R., Agrawal, R., Gehrke, J.: Privacy preserving mining of association rules. In: Proc. of KDD 2002, pp. 217–228 (2002)Google Scholar
  11. 11.
    Frank, A., Asuncion, A.: UCI machine learning repository (2010)Google Scholar
  12. 12.
    Geisler, M.: Cryptographic Protocols: Theory and Implementation. PhD thesis, Aarhus University (2010)Google Scholar
  13. 13.
    Goethals, B.: Frequent set mining. In: The Data Mining and Knowledge Discovery Handbook, ch. 17, pp. 377–397. Springer (2005)Google Scholar
  14. 14.
    Goethals, B., Laur, S., Lipmaa, H., Mielikäinen, T.: On Private Scalar Product Computation for Privacy-Preserving Data Mining. In: Park, C., Chee, S. (eds.) ICISC 2004. LNCS, vol. 3506, pp. 104–120. Springer, Heidelberg (2005)CrossRefGoogle Scholar
  15. 15.
    Kantarcioglu, M., Clifton, C.: Privacy-preserving distributed mining of association rules on horizontally partitioned data. IEEE Transactions on Knowledge and Data Engineering 16(9), 1026–1037 (2004)CrossRefGoogle Scholar
  16. 16.
    Mannila, H., Toivonen, H., Verkamo, A.I.: Efficient algorithms for discovering association rules. In: KDD Workshop, pp. 181–192 (1994)Google Scholar
  17. 17.
    Rizvi, S., Haritsa, J.R.: Maintaining data privacy in association rule mining. In: Proc. of VLDB 2002, pp. 682–693 (2002)Google Scholar
  18. 18.
    The Sharemind framework,
  19. 19.
    Shamir, A.: How to share a secret. Communications of the ACM 22(11), 612–613 (1979)MathSciNetzbMATHCrossRefGoogle Scholar
  20. 20.
    Toivonen, H.: Sampling large databases for association rules. In: Proc. of VLDB 1996, pp. 134–145. Morgan Kaufmann (1996)Google Scholar
  21. 21.
    Yang, Z., Wright, R.N., Subramaniam, H.: Experimental analysis of a privacy-preserving scalar product protocol. Computer Systems: Science & Engineering 21(1) (2006)Google Scholar
  22. 22.
    Zaki, M.J.: Scalable algorithms for association mining. IEEE Trans. Knowl. Data Eng. 12(3), 372–390 (2000)MathSciNetCrossRefGoogle Scholar
  23. 23.
    Zaki, M.J., Gouda, K.: Fast vertical mining using diffsets. In: Proc. of KDD 2003, pp. 326–335 (2003)Google Scholar

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Dan Bogdanov
    • 1
    • 2
  • Roman Jagomägis
    • 1
    • 2
  • Sven Laur
    • 2
  1. 1.AS CyberneticaTallinnEstonia
  2. 2.Institute of Computer ScienceUniversity of TartuTartuEstonia

Personalised recommendations