Privacy-preserving data processing has received considerable attention as a serious concern in data publishing and analysis. Privacy preservation often leads to information loss, so the goal is to minimize utility loss while still satisfying the privacy requirement. In this chapter, we systematically survey utility-based privacy preservation methods. We first briefly discuss privacy models and utility measures, and then review four recently proposed methods for utility-based privacy preservation.
We first introduce the utility-based anonymization method, which maximizes the quality of the anonymized data with respect to query answering and discernibility. We then introduce the top-down specialization (TDS) method and the progressive disclosure algorithm (PDA) for privacy preservation in classification problems. Finally, we introduce the anonymized marginal method, which publishes anonymized projections of a table to increase utility while satisfying the privacy requirement.
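The trade-off between privacy and utility described above can be illustrated with a minimal sketch. It is not any of the four surveyed methods; it only assumes k-anonymity as the privacy model and the discernibility penalty (the sum of squared equivalence-class sizes) as the utility measure. The toy table, the attribute names, and the helper functions are hypothetical and chosen for illustration.

```python
from collections import Counter

def equivalence_classes(rows, quasi_identifiers):
    """Count how many rows share each combination of quasi-identifier values."""
    return Counter(tuple(row[a] for a in quasi_identifiers) for row in rows)

def is_k_anonymous(rows, quasi_identifiers, k):
    """True if every equivalence class contains at least k rows."""
    classes = equivalence_classes(rows, quasi_identifiers)
    return all(size >= k for size in classes.values())

def discernibility_penalty(rows, quasi_identifiers):
    """Each row is charged the size of its class, so the total is the sum of
    squared class sizes. Lower means less information loss. Suppressed rows
    are not modeled in this sketch."""
    classes = equivalence_classes(rows, quasi_identifiers)
    return sum(size * size for size in classes.values())

def generalize_age(row, width):
    """Replace an exact age with a range of the given width (hypothetical helper)."""
    low = (row["age"] // width) * width
    return {**row, "age": f"{low}-{low + width - 1}"}

# Hypothetical toy table; "age" and "zip" are assumed to be the quasi-identifiers.
table = [
    {"age": 23, "zip": "53711", "disease": "flu"},
    {"age": 27, "zip": "53711", "disease": "cold"},
    {"age": 31, "zip": "53711", "disease": "flu"},
    {"age": 38, "zip": "53711", "disease": "asthma"},
]
QI = ["age", "zip"]

coarse = [generalize_age(r, 10) for r in table]   # 10-year age ranges
coarser = [generalize_age(r, 20) for r in table]  # 20-year age ranges

for name, rows in [("raw", table), ("coarse", coarse), ("coarser", coarser)]:
    print(name, is_k_anonymous(rows, QI, 2), discernibility_penalty(rows, QI))
```

On this toy table the raw data is not 2-anonymous (penalty 4), the 10-year generalization is 2-anonymous with penalty 8, and the 20-year generalization is also 2-anonymous but with penalty 16. Both generalizations meet the privacy requirement, so a utility-based method would prefer the one with the lower penalty.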
Copyright information
© 2008 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Hua, M., Pei, J. (2008). A Survey of Utility-based Privacy-Preserving Data Transformation Methods. In: Aggarwal, C.C., Yu, P.S. (eds) Privacy-Preserving Data Mining. Advances in Database Systems, vol 34. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-70992-5_9
DOI: https://doi.org/10.1007/978-0-387-70992-5_9
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-70991-8
Online ISBN: 978-0-387-70992-5
eBook Packages: Computer Science (R0)