We consider the problem of releasing tables from a relational database containing personal records, while ensuring individual privacy and maintaining data integrity to the extent possible. One of the techniques proposed in the literature is k-anonymization. A release is considered k-anonymous if the information for each person contained in the release cannot be distinguished from at least k–1 other persons whose information also appears in the release. In the k-Anonymityproblem the objective is to minimally suppress cells in the table so as to ensure that the released version is k-anonymous. We show that the k-Anonymity problem is NP-hard even when the attribute values are ternary. On the positive side, we provide an O(k)-approximation algorithm for the problem. This improves upon the previous best-known O(klog k)-approximation. We also give improved positive results for the interesting cases with specific values of k — in particular, we give a 1.5-approximation algorithm for the special case of 2-Anonymity, and a 2-approximation algorithm for 3-Anonymity.
Unable to display preview. Download preview PDF.
- [AA01]Agrawal, D., Aggarwal, C.: On the design and quantification of privacy preserving datamining algorithms. In: Proc. of the ACM Symp. on Principles of Database Systems (2001)Google Scholar
- [AS00]Agrawal, R., Srikant, R.: Privacy-preserving data mining. In: Proc. of the ACM SIGMOD Intl. Conf. on Management of Data, pp. 439–450 (May 2000)Google Scholar
- [AST03]Agrawal, R., Srikant, R., Thomas, D.: Privacy preserving aggregates. Technical report, Stanford University (2003)Google Scholar
- [DN03]Dinur, I., Nissim, K.: Revealing information while preserving privacy. In: Proc. of the ACM Symp. on Principles of Database Systems, pp. 202–210 (2003)Google Scholar
- [DN04]Dwork, C., Nissim, K.: Privacy-preserving datamining on vertically partitioned databases. In: Franklin, M. (ed.) CRYPTO 2004. LNCS, vol. 3152, pp. 528–544. Springer, Heidelberg (2004)Google Scholar
- [EGS03]Evfimievski, A., Gehrke, J., Srikant, R.: Limiting privacy breaches in privacy preserving data mining. In: Proc. of the ACM Symp. on Principles of Database Systems (June 2003)Google Scholar
- [Eur98]European Union. Directive on Privacy Protection (October 1998)Google Scholar
- [MW04]Meyerson, A., Williams, R.: On the complexity of optimal k-anonymity. In: Proc. of the ACM Symp. on Principles of Database Systems (June 2004)Google Scholar
- [SS98]Samarati, P., Sweeney, L.: Generalizing data to provide anonymity when disclosing information (abstract). In: Proc. of the ACM Symp. on Principles of Database Systems, pp. 188 (1998)Google Scholar
- [Swe00]Sweeney, L.: Uniqueness of simple demographics in the U.S. population. In: LIDAP-WP4. Carnegie Mellon University, Laboratory for International Data Privacy, Pittsburgh, PA (2000)Google Scholar
- [Swe02]Sweeney, L.: k-Anonymity: A model for protecting privacy. International Journal on Uncertainty Fuzziness Knowledge-based Systems (June 2002)Google Scholar
- [Tim97]Time. The Death of Privacy (August 1997)Google Scholar