Abstract
This chapter presents a brief survey of Privacy-Preserving Data Mining (PPDM). The review of existing approaches is structured along a tentative taxonomy of the field. The main axes of this taxonomy specify what kind of data is being protected and how the data is owned (centralized or distributed). We comment on the relationship between PPDM and the prevention of discriminatory uses of data mining techniques. We conclude the chapter by discussing some of the emerging challenges facing PPDM as a field.
© 2013 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Matwin, S. (2013). Privacy-Preserving Data Mining Techniques: Survey and Challenges. In: Custers, B., Calders, T., Schermer, B., Zarsky, T. (eds) Discrimination and Privacy in the Information Society. Studies in Applied Philosophy, Epistemology and Rational Ethics, vol 3. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30487-3_11
DOI: https://doi.org/10.1007/978-3-642-30487-3_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-30486-6
Online ISBN: 978-3-642-30487-3
eBook Packages: Engineering, Engineering (R0)