A Survey of Quantification of Privacy Preserving Data Mining Algorithms

  • Elisa Bertino
  • Dan Lin
  • Wei Jiang
Part of the Advances in Database Systems book series (ADBS, volume 34)

The aim of privacy preserving data mining (PPDM) algorithms is to extract relevant knowledge from large amounts of data while at the same time protecting sensitive information. An important aspect in the design of such algorithms is the identification of suitable evaluation criteria and the development of related benchmarks. Recent research in the area has devoted much effort to determining a trade-off between the right to privacy and the need for knowledge discovery. It is often the case that no privacy preserving algorithm outperforms all the others on all possible criteria. Therefore, it is crucial to provide a comprehensive view of the metrics related to existing privacy preserving algorithms, so that we can gain insight into how to design more effective measurements and PPDM algorithms. In this chapter, we review and summarize existing criteria and metrics for evaluating privacy preserving techniques.

Keywords

Privacy metric 

Copyright information

© Springer Science+Business Media, LLC 2008

Authors and Affiliations

  • Elisa Bertino (1)
  • Dan Lin (1)
  • Wei Jiang (1)
  1. Department of Computer Science, Purdue University, West Lafayette, USA
