Privacy-Preserving Distributed k-Anonymity

  • Wei Jiang
  • Chris Clifton
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3654)


k-anonymity provides a measure of privacy protection by preventing re-identification of data to fewer than a group of k data items. While algorithms exist for producing k-anonymous data, they assume a single source that wants to publish data. This paper presents a k-anonymity protocol for data that is vertically partitioned between sites. A key contribution is a proof that the protocol preserves k-anonymity between the sites: while one site may have individually identifiable data, it learns nothing that violates k-anonymity with respect to the data at the other site. This is a fundamentally different definition of distributed privacy from that of Secure Multiparty Computation, and it provides a better match with both ethical and legal views of privacy.
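
To make the k-anonymity property concrete, the following minimal Python sketch checks whether every combination of quasi-identifier values occurs at least k times, first for one site's attributes and then for the joined, vertically partitioned table. It is an illustration only and not the protocol described in the paper: the join is done in the clear, whereas the paper's contribution is to reach a k-anonymous result without such disclosure. The function names, the record ids, and the sample values are all hypothetical.

```python
from collections import Counter

def is_k_anonymous(rows, k):
    """Return True if every quasi-identifier tuple in rows occurs at least k times."""
    counts = Counter(tuple(r) for r in rows)
    return all(c >= k for c in counts.values())

def join_partitions(site_a, site_b):
    """Join two vertical partitions on their shared record id.

    Assumption: the join is performed in the clear, purely for illustration.
    """
    shared_ids = sorted(site_a.keys() & site_b.keys())
    return [site_a[rid] + site_b[rid] for rid in shared_ids]

# Hypothetical generalized quasi-identifiers held by each site, keyed by record id.
site_a = {1: ("4790*",), 2: ("4790*",), 3: ("4791*",), 4: ("4791*",)}
site_b = {1: ("30-39",), 2: ("30-39",), 3: ("40-49",), 4: ("30-39",)}

joined = join_partitions(site_a, site_b)

print(is_k_anonymous(list(site_a.values()), 2))  # True: each ZIP prefix appears twice
print(is_k_anonymous(joined, 2))                 # False: two joined tuples are unique
```

In this toy data, site A's attributes are 2-anonymous on their own, but joining them with site B's attributes produces combinations that occur only once, which is why the anonymization has to account for both partitions together.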


k-anonymity, privacy, security



Copyright information

© IFIP International Federation for Information Processing 2005

Authors and Affiliations

  • Wei Jiang (1)
  • Chris Clifton (1)
  1. Department of Computer Science, Purdue University, West Lafayette, USA
