
Privacy Preserving Models of k-NN Algorithm

  • Bartosz Krawczyk
  • Michal Wozniak
Part of the Advances in Intelligent and Soft Computing book series (AINSC, volume 95)

Abstract

The paper focuses on the problem of preserving privacy in classification tasks. This issue is important for machine learning approaches that operate on distributed databases. Based on a study of existing work on privacy, we propose a new definition of privacy and a corresponding taxonomy. We use this taxonomy to create several modifications of the k-nearest neighbors classifier that are consistent with the proposed privacy levels. Their computational complexity is evaluated on the basis of computer experiments.
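To make the setting concrete, the following is a minimal sketch of k-NN classification over several data sites. It is not the authors' algorithm and does not reproduce any of the paper's privacy levels; it only illustrates one simple idea, assumed here for illustration, in which each site returns distances and labels of its own k nearest records instead of the raw feature vectors. The function names (local_candidates, distributed_knn) and the toy data are hypothetical. Distances and labels can still leak information, so this should be read as a weak form of privacy at best.

```python
import heapq
import math
from collections import Counter


def local_candidates(records, query, k):
    """Run at one data site: return (distance, label) for its k nearest records.

    Only distances and labels are returned; feature vectors stay at the site.
    """
    scored = [(math.dist(features, query), label) for features, label in records]
    return heapq.nsmallest(k, scored)


def distributed_knn(sites, query, k):
    """Run at the coordinator: merge per-site candidates and take a majority vote."""
    candidates = []
    for records in sites:
        candidates.extend(local_candidates(records, query, k))
    # The global k nearest neighbors are always among the per-site k nearest.
    nearest = heapq.nsmallest(k, candidates)
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Two hypothetical sites holding toy 2-D records as (features, label) pairs.
site_a = [((0.0, 0.0), "neg"), ((0.2, 0.1), "neg"), ((5.0, 5.0), "pos")]
site_b = [((4.8, 5.1), "pos"), ((5.2, 4.9), "pos"), ((0.1, 0.3), "neg")]

prediction = distributed_knn([site_a, site_b], query=(4.9, 5.0), k=3)
print(prediction)
```

Because every site sends at most k candidates, the coordinator handles at most k times the number of sites pairs per query, regardless of how many records each site stores.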

Keywords

privacy preserving data mining; distributed data mining; pattern recognition; k-NN; database security

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Bartosz Krawczyk (1)
  • Michal Wozniak (1)
  1. Department of Systems and Computer Networks, Wroclaw University of Technology, Wroclaw, Poland
