Fixed-parameter tractability of anonymizing data by suppressing entries
Article
First Online:
Abstract
A popular model for protecting privacy when person-specific data is released is k -anonymity. A dataset is k-anonymous if each record is identical to at least (k−1) other records in the dataset. The basic k-anonymization problem, which minimizes the number of dataset entries that must be suppressed to achieve k-anonymity, is NP-hard and hence not solvable both quickly and optimally in general. We apply parameterized complexity analysis to explore algorithmic options for restricted versions of this problem that occur in practice. We present the first fixed-parameter algorithms for this problem and identify key techniques that can be applied to this and other k-anonymization problems.
Keywords
Privacy Anonymization Parameterized complexity Fixed-parameter tractability KernelizationReferences
- Aggarwal G, Feder T, Kenthapadi K, Motwani R, Panigrahy R, Thomas D, Zhu A (2005) Approximation algorithms for k-anonymity. J Priv Technol, paper 20051120001 Google Scholar
- Brankovic L, Estivill-Castro V (1999) Privacy issues in knowledge discovery and data mining. In: Proceedings of Australian institute of computer ethics conference (AICEC99), pp 89–99 Google Scholar
- Brankovic L, Miller M, Horak P, Wrightson G (1997) Usability of compromise-free statistical databases. In: Proceedings of the ninth international conference on scientific and statistical database management (SSDBM 1997). IEEE Press, New York, pp 144–154 CrossRefGoogle Scholar
- Bonizzoni P, Della Vedova G, Dondi R (2007) Anonymizing binary tables is APX-hard. The Computing Research Repository (CoRR) 0707.0421. http://arxiv.org/abs/0707.0421
- Chaytor R (2006) Utility preserving k-anonymity. Technical report MUN-CS 2006-01, Dept Computer Science, Memorial University of Newfoundland Google Scholar
- Chaytor R (2007) Allowing privacy protection algorithms to jump out of local optimums: an ordered greed framework. In: Bonchi F et al. (eds) Proceedings of the 1st SIGKDD international workshop on privacy, security, and trust in KDD (PinKDD’07). LNCS, vol 4890. Springer, Berlin, pp 33–55 CrossRefGoogle Scholar
- Downey R, Fellows M (1999) Parameterized complexity. Springer, Berlin Google Scholar
- Er MC (1988) A fast algorithm for generating set partitions. Comput J 31:283–284 MATHCrossRefGoogle Scholar
- Fernau H (2004) Complexity of a {0,1}-matrix problem. Australasian J Comb 29:273–300 MATHMathSciNetGoogle Scholar
- Horak P, Brankovic L, Miller M (1999) A combinatorial problem in database security. Discrete Appl Math 91:119–126 MATHCrossRefMathSciNetGoogle Scholar
- Islam MZ, Brankovic L (2004) A framework for privacy preserving classification in data mining. In: Proceedings of the second workshop on Australasian information security, data mining and web intelligence, and software internationalisation (ACSW Frontiers 2004), pp 163–168 Google Scholar
- MacDonald (2005) personal communication Google Scholar
- Meyerson A, Williams R (2004) On the complexity of optimal k-anonymity. In: Proceedings of 23rd ACM symposium on principles of database systems (PODS’04), pp 223–228 Google Scholar
- Niedermeier R (2006) Invitation to fixed-parameter algorithms. Oxford University Press, Oxford MATHCrossRefGoogle Scholar
- Samarati P, Sweeney L (1998) Protecting privacy when disclosing information: k-anonymity and its enforcement through generalization and suppression. Technical report SRI-CSL-98-04, SRI International, Computer Science Laboratory Google Scholar
- Sweeney L (2002) Achieving k-anonymity privacy protection using generalization and suppression. Int J Uncertain Fuzziness Knowl-Based Syst 10(5):571–588 MATHCrossRefMathSciNetGoogle Scholar
- Wang K, Yu P, Chakraborty S (2004) Bottom-up generalization: a data mining solution to privacy protection. In: Proceedings of 4th IEEE international conference on data mining (ICDM’04), pp 249–256 Google Scholar
- Wareham T (1999) Systematic parameterized complexity analysis in computational phonology. PhD thesis, Dept Computer Science, University of Victoria Google Scholar
Copyright information
© Springer Science+Business Media, LLC 2009