Large-Scale Multi-party Counting Set Intersection Using a Space Efficient Global Synopsis

  • Dimitrios Karapiperis
  • Dinusha Vatsalan
  • Vassilios S. Verykios
  • Peter Christen
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9050)


Privacy-preserving set intersection (PPSI) of very large data sets is increasingly required in many real-world application areas, including health care, national security, and law enforcement. Various techniques have been developed to address this problem, most of which rely on computationally expensive cryptographic techniques. Moreover, conventional data structures cannot efficiently provide count estimates for the elements of the intersection of very large data sets. We consider the problem of efficient PPSI by integrating sets from multiple (three or more) sources to create a global synopsis, which is the result of intersecting efficient data structures known as Count-Min sketches. This global synopsis also provides count estimates of the intersected elements. We propose two protocols for creating this global synopsis, based on homomorphic computations, a secure distributed summation scheme, and a symmetric noise addition technique. Experiments conducted on large synthetic and real data sets show the efficiency and accuracy of our protocols, while privacy is preserved under the Honest-but-Curious model.
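
To make the idea of a global synopsis built from Count-Min sketches more concrete, the following is a minimal, non-private sketch in Python. It is not the protocol proposed in the paper: the class name `CountMinSketch`, the helper `intersect`, the SHA-256 based row hashing, the default `depth` and `width`, and the cell-wise minimum used to combine the sketches are all assumptions chosen for illustration. The paper's protocols additionally rely on homomorphic computations, secure summation, and noise addition, none of which are shown here.

```python
import hashlib


class CountMinSketch:
    """Plain Count-Min sketch: depth rows of width counters each."""

    def __init__(self, depth=4, width=64):
        self.depth = depth
        self.width = width
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, row, item):
        # Hypothetical hashing choice: salt SHA-256 with the row number
        # so that each row behaves like a different hash function.
        digest = hashlib.sha256(f"{row}:{item}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._index(row, item)] += count

    def estimate(self, item):
        # The minimum over rows never underestimates the true count.
        return min(
            self.table[row][self._index(row, item)]
            for row in range(self.depth)
        )


def intersect(sketches):
    """Assumed combination rule: cell-wise minimum over all parties."""
    first = sketches[0]
    result = CountMinSketch(first.depth, first.width)
    for row in range(first.depth):
        for col in range(first.width):
            result.table[row][col] = min(s.table[row][col] for s in sketches)
    return result


# Three parties with made-up data; only "alice" appears in all of them.
parties = [
    {"alice": 3, "bob": 1},
    {"alice": 2, "carol": 5},
    {"alice": 4, "bob": 2},
]
sketches = []
for data in parties:
    sketch = CountMinSketch()
    for item, count in data.items():
        sketch.add(item, count)
    sketches.append(sketch)

global_synopsis = intersect(sketches)
print(global_synopsis.estimate("alice"))
```

Under this assumed rule, the combined estimate for an element equals the smallest of the per-party estimates, so an element held by every party gets an estimate of at least its smallest per-party count. An element missing from some party usually, but not always, gets an estimate of zero, because hash collisions can leave every one of its cells non-zero.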


Keywords: Hash Function, Oblivious Transfer, Cipher Text, Heavy Hitter, Count Estimate
These keywords were added by machine and not by the authors.





Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Dimitrios Karapiperis (1)
  • Dinusha Vatsalan (2)
  • Vassilios S. Verykios (1)
  • Peter Christen (2)
  1. School of Science and Technology, Hellenic Open University, Patras, Greece
  2. Research School of Computer Science, The Australian National University, Canberra, Australia
