Abstract
Data anonymization techniques based on the k-anonymity model have been the focus of intense research in the last few years. Although the k-anonymity model and the related techniques provide valuable solutions to data privacy, existing solutions are limited to static data releases (i.e., the entire dataset is assumed to be available at the time of release). While this may be acceptable in some applications, many databases today grow continuously, every day and even every hour. In such dynamic environments, existing techniques may suffer from poor data quality and/or vulnerability to inference. In this paper, we analyze various inference channels that may exist across multiple anonymized datasets and discuss how to avoid such inferences. We then present an approach to securely anonymizing a continuously growing dataset in an efficient manner while assuring high data quality.
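To make the notion of inference across releases concrete, the following is a minimal illustrative sketch in Python, not the approach proposed in the paper. It checks k-anonymity for two releases of a growing table and then compares them. The attribute names, the records, and the helper functions (`equivalence_class_sizes`, `is_k_anonymous`, `new_in_second`) are all assumptions made up for this example.

```python
from collections import Counter

def equivalence_class_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)

def is_k_anonymous(records, quasi_identifiers, k):
    """True if every quasi-identifier combination occurs at least k times."""
    sizes = equivalence_class_sizes(records, quasi_identifiers)
    return all(n >= k for n in sizes.values())

def new_in_second(first, second, attrs):
    """Multiset difference: records present in `second` but not in `first`."""
    before = Counter(tuple(r[a] for a in attrs) for r in first)
    after = Counter(tuple(r[a] for a in attrs) for r in second)
    return after - before  # Counter subtraction keeps only positive counts

# Hypothetical quasi-identifiers and sensitive attribute.
QI = ("age", "zip")
ALL = ("age", "zip", "diagnosis")

# Hypothetical first release, already generalized.
release_1 = [
    {"age": "20-29", "zip": "479**", "diagnosis": "flu"},
    {"age": "20-29", "zip": "479**", "diagnosis": "asthma"},
]

# Hypothetical second release, published after one new record arrived.
release_2 = release_1 + [
    {"age": "20-29", "zip": "479**", "diagnosis": "diabetes"},
]

# Each release satisfies 2-anonymity when examined on its own.
print(is_k_anonymous(release_1, QI, 2), is_k_anonymous(release_2, QI, 2))

# Comparing the two releases isolates the single newly added record.
diff = new_in_second(release_1, release_2, ALL)
print(diff)
```

In this toy setting, an observer who knows that one individual was added between the two releases can read that individual's sensitive value from the difference, even though each release is 2-anonymous in isolation. This is one simple instance of the kind of inference the abstract refers to.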
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Byun, JW., Sohn, Y., Bertino, E., Li, N. (2006). Secure Anonymization for Incremental Datasets. In: Jonker, W., Petković, M. (eds) Secure Data Management. SDM 2006. Lecture Notes in Computer Science, vol 4165. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11844662_4
DOI: https://doi.org/10.1007/11844662_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-38984-2
Online ISBN: 978-3-540-38987-3
eBook Packages: Computer Science, Computer Science (R0)