An Attacker’s View of Distance Preserving Maps for Privacy Preserving Data Mining

  • Kun Liu
  • Chris Giannella
  • Hillol Kargupta
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4213)


We examine the effectiveness of distance preserving transformations in privacy preserving data mining. These techniques are potentially very useful because some important data mining algorithms (e.g., distance-based clustering and k-nearest neighbor classification) can be efficiently applied to the transformed data and produce exactly the same results as if applied to the original data. However, the issue of how well the original data is hidden has, to our knowledge, not been carefully studied. We take a step in this direction by assuming the role of an attacker armed with two types of prior information regarding the original data. We examine how well the attacker can recover the original data from the transformed data and the prior information. Our results offer insight into the vulnerabilities of distance preserving transformations.
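The claim that distance-based algorithms give identical results on the transformed data can be illustrated with a minimal sketch. It assumes the simplest distance preserving map, a 2-D rotation about the origin; the sample points, the angle `theta`, and the helper names `rotate`, `pairwise_distances`, and `nearest_neighbor` are hypothetical and chosen only for illustration, not taken from the paper.

```python
import math
from itertools import combinations


def rotate(point, theta):
    """Rotate a 2-D point about the origin by theta radians (an orthogonal map)."""
    x, y = point
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y)


def pairwise_distances(points):
    """Euclidean distance for every unordered pair, in a fixed order."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]


def nearest_neighbor(points, i):
    """Index of the point closest to points[i], excluding i itself."""
    others = (j for j in range(len(points)) if j != i)
    return min(others, key=lambda j: math.dist(points[i], points[j]))


# Hypothetical "original" records and an arbitrary secret rotation angle.
original = [(1.0, 2.0), (4.0, 6.0), (-3.0, 0.5), (2.5, -1.0)]
theta = 0.7
transformed = [rotate(p, theta) for p in original]

# Pairwise distances agree up to floating-point rounding.
distances_preserved = all(
    math.isclose(a, b, rel_tol=1e-9)
    for a, b in zip(pairwise_distances(original), pairwise_distances(transformed))
)

# Every record keeps the same nearest neighbor after the transformation.
neighbors_preserved = all(
    nearest_neighbor(original, i) == nearest_neighbor(transformed, i)
    for i in range(len(original))
)

print(distances_preserved, neighbors_preserved)
```

The coordinates of each record change, yet every quantity a distance-based algorithm relies on stays the same. That is why such transformations are attractive for data mining, and also why it matters how much an attacker with prior information can recover.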


Keywords: Association Rule, Orthogonal Transformation, Distinct Eigenvalue, Data Perturbation, Privacy Breach



Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Kun Liu (1)
  • Chris Giannella (1)
  • Hillol Kargupta (1)
  1. Department of Computer Science and Electrical Engineering, University of Maryland Baltimore County, Baltimore, USA
