Fixed-parameter tractability of anonymizing data by suppressing entries

  • Patricia A. Evans
  • H. Todd Wareham
  • Rhonda Chaytor

Abstract

A popular model for protecting privacy when person-specific data is released is k-anonymity. A dataset is k-anonymous if each record is identical to at least (k−1) other records in the dataset. The basic k-anonymization problem, which minimizes the number of dataset entries that must be suppressed to achieve k-anonymity, is NP-hard and hence not solvable both quickly and optimally in general. We apply parameterized complexity analysis to explore algorithmic options for restricted versions of this problem that occur in practice. We present the first fixed-parameter algorithms for this problem and identify key techniques that can be applied to this and other k-anonymization problems.
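
To make the definition concrete, the following is a minimal Python sketch of checking k-anonymity after entries have been suppressed. It is an illustration only, not the authors' algorithm: the `*` marker, the function names (`is_k_anonymous`, `suppress`, `suppression_cost`), and the sample table are all assumptions introduced here. It also assumes that a suppressed entry matches only another suppressed entry in the same column, so records must agree position by position after suppression.

```python
from collections import Counter

SUPPRESSED = "*"  # assumed marker for a suppressed entry (not from the paper)


def is_k_anonymous(rows, k):
    """Return True if every record is identical to at least k-1 other records."""
    counts = Counter(tuple(row) for row in rows)
    return all(count >= k for count in counts.values())


def suppress(rows, cells):
    """Return a copy of rows with each (row, column) position in cells suppressed."""
    out = [list(row) for row in rows]
    for i, j in cells:
        out[i][j] = SUPPRESSED
    return out


def suppression_cost(rows):
    """Count suppressed entries, the quantity the k-anonymization problem minimizes."""
    return sum(value == SUPPRESSED for row in rows for value in row)


# Hypothetical four-record table: sex, birth year, postal prefix.
table = [
    ["F", "1975", "A1B"],
    ["F", "1975", "A1C"],
    ["M", "1980", "E3B"],
    ["M", "1981", "E3B"],
]

# One valid choice of suppressions for k = 2 (not claimed to be optimal).
anonymized = suppress(table, [(0, 2), (1, 2), (2, 1), (3, 1)])

print(is_k_anonymous(table, 2))
print(is_k_anonymous(anonymized, 2))
print(suppression_cost(anonymized))
```

The check itself is cheap (one pass to count identical records); the hardness described in the abstract lies in choosing which entries to suppress so that the cost is as small as possible.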

Keywords

Privacy · Anonymization · Parameterized complexity · Fixed-parameter tractability · Kernelization

Copyright information

© Springer Science+Business Media, LLC 2009

Authors and Affiliations

  • Patricia A. Evans (1)
  • H. Todd Wareham (2)
  • Rhonda Chaytor (3)

  1. Faculty of Computer Science, University of New Brunswick, Fredericton, Canada
  2. Department of Computer Science, Memorial University, St. John’s, Canada
  3. School of Computing Science, Simon Fraser University, Vancouver, Canada
