Frontiers of Computer Science, Volume 12, Issue 6, pp 1241–1254

M-generalization for multipurpose transactional data publication

  • Xianxian Li
  • Peipei Sui
  • Yan Bai
  • Li-E Wang
Research Article

Abstract

Collecting and sharing transactional data raises the challenge of preventing information leakage and privacy breaches while preserving high data utility. Data anonymization methods such as perturbation, generalization, and suppression have been proposed for privacy protection. However, many of these methods incur excessive information loss and cannot satisfy multipurpose utility requirements. In this paper, we propose a multidimensional generalization method that provides multipurpose optimization when anonymizing transactional data, in order to offer better data utility for different applications. Our methodology is based on bipartite graphs and combines attribute generalization, item grouping, and outlier perturbation. Experiments on real-life datasets show that our solution considerably improves data utility compared with existing algorithms.
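As a rough illustration of the ideas named in the abstract, the sketch below models a toy transaction set as a transaction-item bipartite graph, generalizes items through a one-level hierarchy, and reports the size of the smallest group of indistinguishable transactions. It is not the paper's M-generalization algorithm: the data, the hierarchy, and the function names (`bipartite_edges`, `generalize`, `smallest_group`) are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical toy data: transaction id -> set of purchased items.
transactions = {
    "t1": {"beer", "milk"},
    "t2": {"wine", "cheese"},
    "t3": {"beer", "cheese"},
    "t4": {"wine", "milk"},
    "t5": {"bread", "milk"},
    "t6": {"cake", "cheese"},
}

# Hypothetical one-level generalization hierarchy: item -> category.
hierarchy = {
    "beer": "alcohol", "wine": "alcohol",
    "milk": "dairy", "cheese": "dairy",
    "bread": "bakery", "cake": "bakery",
}


def bipartite_edges(txns):
    """Return the transaction-item bipartite graph as a sorted edge list."""
    return sorted((tid, item) for tid, items in txns.items() for item in items)


def generalize(txns, mapping):
    """Replace each item by its generalized value; unmapped items are kept."""
    return {tid: {mapping.get(i, i) for i in items} for tid, items in txns.items()}


def smallest_group(txns):
    """Size of the smallest set of transactions sharing an identical itemset."""
    groups = defaultdict(list)
    for tid, items in txns.items():
        groups[frozenset(items)].append(tid)
    return min(len(members) for members in groups.values())


if __name__ == "__main__":
    print(len(bipartite_edges(transactions)), "edges in the original graph")
    print("smallest group before:", smallest_group(transactions))
    print("smallest group after: ", smallest_group(generalize(transactions, hierarchy)))
```

In this toy setting every original transaction is unique, while the generalized transactions fall into groups of at least two. The price is coarser item information, which is the privacy-utility trade-off the abstract refers to.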

Keywords

anonymization, generalization, privacy protection, bipartite graph

Notes

Acknowledgements

This research was supported by the National Natural Science Foundation of China (Grant Nos. 61662008, 61272535, 61502111), Guangxi “Bagui Scholar” Teams for Innovation and Research Project, Guangxi Collaborative Innovation Center of Multi-source Information Integration and Intelligent Processing, Guangxi Natural Science Foundation (2015GXNSFBA139246, 2014GXNSFBA118288 and 2013GXNSFBA019263), and Guangxi Special Project of Science and Technology Base and Talents (AD16380008).

Supplementary material

11704_2016_6061_MOESM1_ESM.ppt (168 kb)
M-generalization for multi-purpose transactional data publication

Copyright information

© Higher Education Press and Springer-Verlag GmbH Germany, part of Springer Nature 2017

Authors and Affiliations

  1. School of Computer Science and Engineering, Beihang University, Beijing, China
  2. Guangxi Key Lab of Multi-source Information Mining & Security, Guilin, China
  3. Institute of Technology, University of Washington Tacoma, Tacoma, USA
