Privacy Preserving Collaborative Agglomerative Hierarchical Clustering Construction

  • Conference paper
Information Systems Security and Privacy (ICISSP 2018)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 977)

Abstract

Information sharing among governments, companies, and individuals has created remarkable opportunities for knowledge-based decision making. The main challenge in collaborative data analysis, however, is protecting the privacy of sensitive data. In this study, we propose a general framework that serves as a secure tool for constructing any agglomerative hierarchical clustering algorithm over partitioned data. We assume that the data is distributed between two (or more) parties, either horizontally or vertically. For mutual benefit, the participating parties want to obtain the cluster structure of the whole data, but because of privacy concerns they are unwilling to share their original datasets. To this end, we propose general algorithms based on secure scalar product and secure Hamming distance that securely compute the criteria needed to build the cluster hierarchy. The proposed approach covers the private construction of all possible agglomerative hierarchical clustering algorithms on distributed datasets, for both numerical and categorical data.
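The role of the two building blocks named in the abstract can be illustrated with a minimal plaintext sketch. This is not the paper's protocol: it assumes squared Euclidean distance for numerical records and Hamming distance for categorical records, and every function name below (`plain_dot`, `horizontal_sq_distance`, `vertical_sq_distance`, `hamming_distance`) is hypothetical. The dot product is computed in the clear here only to show which quantity a secure scalar product protocol would have to supply.

```python
from typing import Callable, Sequence

Vector = Sequence[float]


def plain_dot(x: Vector, y: Vector) -> float:
    """Insecure stand-in (assumption) for a secure scalar product protocol."""
    return sum(a * b for a, b in zip(x, y))


def horizontal_sq_distance(
    x_a: Vector,
    y_b: Vector,
    scalar_product: Callable[[Vector, Vector], float] = plain_dot,
) -> float:
    """Squared Euclidean distance between a record of party A and a record of party B.

    Uses ||x - y||^2 = ||x||^2 + ||y||^2 - 2 * <x, y>. Each squared norm is
    local to one party; only the cross term <x, y> needs a joint computation.
    """
    return plain_dot(x_a, x_a) + plain_dot(y_b, y_b) - 2 * scalar_product(x_a, y_b)


def vertical_sq_distance(parts_x: Sequence[Vector], parts_y: Sequence[Vector]) -> float:
    """Squared Euclidean distance when each party holds some attributes of both records.

    parts_x[i] and parts_y[i] are party i's attribute slices of the two records.
    Each party computes its partial sum locally; the partial sums add up to the
    full squared distance.
    """
    local_sums = [
        sum((a - b) ** 2 for a, b in zip(px, py))
        for px, py in zip(parts_x, parts_y)
    ]
    return sum(local_sums)


def hamming_distance(u: Sequence[str], v: Sequence[str]) -> int:
    """Number of categorical attributes on which two records differ."""
    return sum(1 for a, b in zip(u, v) if a != b)


if __name__ == "__main__":
    # Horizontal partitioning: A holds record x, B holds record y.
    print(horizontal_sq_distance([1.0, 2.0], [4.0, 6.0]))
    # Vertical partitioning: A holds the first two attributes, B holds the third.
    print(vertical_sq_distance([[1.0, 2.0], [3.0]], [[4.0, 6.0], [3.0]]))
    # Categorical records compared attribute by attribute.
    print(hamming_distance(["red", "small", "round"], ["red", "large", "round"]))
```

Once pairwise distances are available, any linkage criterion (single, complete, average, and so on) can be evaluated on them. In a privacy-preserving setting, the `scalar_product` argument and the Hamming computation would be replaced by secure two-party protocols rather than the plain functions used here.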

Notes

  1. http://hdl.handle.net/10071/4097

Author information

Corresponding author

Correspondence to Mina Sheikhalishahi.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Sheikhalishahi, M., Hamidi, M., Martinelli, F. (2019). Privacy Preserving Collaborative Agglomerative Hierarchical Clustering Construction. In: Mori, P., Furnell, S., Camp, O. (eds) Information Systems Security and Privacy. ICISSP 2018. Communications in Computer and Information Science, vol 977. Springer, Cham. https://doi.org/10.1007/978-3-030-25109-3_14

  • DOI: https://doi.org/10.1007/978-3-030-25109-3_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-25108-6

  • Online ISBN: 978-3-030-25109-3

  • eBook Packages: Computer Science, Computer Science (R0)
