Improved Algorithms for Anonymization of Set-Valued Data

  • B. K. Tripathy
  • A. Jayaram Reddy
  • G. V. Manusha
  • G. S. Mohisin
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 177)


Data anonymization techniques enable the publication of detailed information while preserving the privacy of sensitive information in the data against a variety of attacks. Anonymized data describe a set of possible worlds that includes the original data. Generalization and suppression have been the most commonly used techniques for achieving anonymization. Some algorithms to protect privacy in the publication of set-valued data were developed by Terrovitis et al. [16]. The concept of k-anonymity was introduced by Samarati and Sweeney [15]; it requires that every tuple be identical to at least (k-1) other tuples. This concept was modified in [16] to introduce k^m-anonymity, which limits the effects of data dimensionality. Their approach relies on generalization instead of suppression. To handle this problem, they developed two heuristic algorithms, namely the DA-algorithm and the AA-algorithm. These algorithms provide near-optimal solutions in many cases. In this paper, we improve DA so that undesirable duplicates are not generated, and we use FP-growth to display the anonymized data. We illustrate the efficiency of our proposed algorithm through suitable examples.
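To make the k^m-anonymity notion concrete, the following is a minimal Python sketch, not the DA or AA algorithm of [16] and not the improved algorithm of this paper. It assumes the usual reading of the definition in [16]: every combination of at most m items that occurs in some transaction must occur in at least k transactions. The function names (`violating_itemsets`, `is_km_anonymous`, `generalize`), the toy transactions, and the generalization mapping are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def violating_itemsets(transactions, k, m):
    """Return every itemset of size <= m whose support is below k.

    Only itemsets that occur in at least one transaction are counted,
    so each reported support is between 1 and k - 1.
    """
    support = Counter()
    for t in transactions:
        items = sorted(set(t))  # ignore duplicates, fix an item order
        for size in range(1, m + 1):
            for combo in combinations(items, size):
                support[combo] += 1
    return {combo: n for combo, n in support.items() if n < k}


def is_km_anonymous(transactions, k, m):
    """True if no itemset of size <= m is supported by fewer than k transactions."""
    return not violating_itemsets(transactions, k, m)


def generalize(transactions, mapping):
    """Replace each item by its generalized value (assumed mapping);
    items absent from the mapping are kept unchanged."""
    return [sorted({mapping.get(i, i) for i in t}) for t in transactions]


# Toy set-valued data, invented for illustration only.
data = [
    ["a1", "b1"],
    ["a1", "b2"],
    ["a2", "b1"],
    ["a2", "b2"],
]

print(is_km_anonymous(data, k=2, m=1))  # single items only
print(is_km_anonymous(data, k=2, m=2))  # pairs of items as well
print(is_km_anonymous(generalize(data, {"b1": "B", "b2": "B"}), k=2, m=2))
```

In this toy data every single item appears twice, but every pair appears only once, so an attacker who knows two items of a transaction can single it out. Generalizing `b1` and `b2` to a common value `B` (a hypothetical hierarchy node) merges the pairs so that each is supported by two transactions, which is the kind of effect generalization-based anonymization aims for.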


k-anonymization, k^m-anonymization, direct anonymization, Apriori-based anonymization, set-valued data, count-tree




  1. Aggarwal, G., Feder, G., Kenthapadi, K., Khuller, S., Panigrahy, R., Thomas, D., Zhu, A.: Achieving Anonymity via Clustering. In: Proc. of ACM PODS, pp. 153–162 (2006)
  2. Aggarwal, G., Feder, G., Kenthapadi, K., Motwani, R., Panigrahy, R., Thomas, D., Zhu, A.: Approximation Algorithms for k-Anonymity. Journal of Privacy Technology (2005)
  3. Atzori, M., Bonchi, F., Giannotti, F., Pedreschi, D.: Anonymity Preserving Pattern Discovery. VLDB Journal (2008) (accepted for publication)
  4. Bayardo, R.J., Agrawal, R.: Data Privacy through Optimal k-Anonymization. In: Proc. of ICDE, pp. 217–228 (2005)
  5. Ghinita, G., Karras, F.P., Kalnis, P., Mamoulis, N.: Fast Data Anonymization with Low Information Loss. In: VLDB, pp. 758–769 (2007)
  6. Ghinita, G., Tao, Y., Kalnis, P.: On the Anonymization of Sparse High-Dimensional Data. In: Proceedings of ICDE (2008)
  7. Han, J., Pei, J., Yin, Y.: Mining Frequent Patterns without Candidate Generation. In: Proc. of ACM SIGMOD, pp. 1–12 (2000)
  8. Iyengar, V.S.: Transforming Data to Satisfy Privacy Constraints. In: Proceedings of SIGKDD, pp. 279–288 (2002)
  9. LeFevre, K., DeWitt, D.J., Ramakrishnan, R.: Incognito: Efficient Full-domain k-Anonymity. In: Proceedings of ACM SIGMOD, pp. 49–60 (2005)
  10. Li, N., Li, T., Venkatasubramanian, S.: t-Closeness: Privacy Beyond k-Anonymity and l-Diversity. In: Proceedings of ICDE, pp. 106–115 (2007)
  11. Machanavajjhala, A., Gehrke, J., Kifer, D., Venkitasubramaniam, S.: l-Diversity: Privacy Beyond k-Anonymity. In: Proceedings of ICDE (2006)
  12. Meyerson, A., Williams, R.: On the Complexity of Optimal k-Anonymity. In: Proceedings of ACM PODS, pp. 223–228 (2004)
  13. Park, H., Shim, K.: Approximate Algorithms for k-Anonymity. In: Proceedings of the ACM SIGMOD, pp. 67–78 (2007)
  14. Samarati, P.: Protecting Respondents' Identities in Microdata Release. IEEE TKDE 13(6), 1010–1027 (2001)
  15. Sweeney, L.: k-Anonymity: A Model for Protecting Privacy. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 10(5), 557–570 (2002)
  16. Terrovitis, M., Mamoulis, N., Kalnis, P.: Privacy Preserving Anonymization of Set-Valued Data. In: PVLDB 2008, Auckland, New Zealand, pp. 115–125 (2008)
  17. Tripathy, B.K., Devineni, H., Jayasri, K.J., Bhargava, M.: An Efficient Clustering Algorithm for l-Diversity. In: Proceedings of the International Conference on Advances and Emerging Trends in Computing Technologies, ICAET 2010, June 21-24, pp. 76–81. SRM University (2010)
  18. Tripathy, B.K., Panda, G.K., Kumaran, K.: A Rough Set Approach to Develop an Efficient l-Diversity Algorithm Based on Clustering. In: Proc. of the 2nd IIMA International Conference on Advanced Data Analysis, Business Analytics and Intelligence, January 8-9, p. 34 (2011)
  19. Tripathy, B.K., Panda, G.K., Kumaran, K.: A Fast l-Diversity Anonymisation Algorithm. In: Proc. of the Third International Conference on Computer Modelling and Simulation, ICCMS 2011, Mumbai, January 7-9, pp. V2-648–652 (2011)
  20. Tripathy, B.K., Maity, A., Ranajit, B., Chowdhuri, D.: A Fast p-Sensitive l-Diversity Anonymisation Algorithm. In: Proceedings of the RAICS IEEE Conference, Kerala, September 21-23, pp. 741–744 (2011)
  21. Xiao, X., Tao, Y.: Anatomy: Simple and Effective Privacy Preservation. In: Proceedings of VLDB, pp. 139–150 (2006)
  22. Zhang, Q., Koudas, N., Srivastava, D., Yu, T.: Aggregate Query Answering on Anonymised Tables. In: Proceedings of ICDE, pp. 116–125 (2007)

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • B. K. Tripathy (1)
  • A. Jayaram Reddy (2)
  • G. V. Manusha (2)
  • G. S. Mohisin (2)

  1. School of Computing Science and Engineering, VIT University, Vellore, India
  2. School of Information Technology and Engineering, VIT University, Vellore, India
