Parameterized complexity of k-anonymity: hardness and tractability
The problem of publishing personal data without giving up privacy is becoming increasingly important. A precise formalization that has recently been proposed is k-anonymity, where the rows of a table are partitioned into clusters of size at least k and all rows in a cluster become the same tuple after the suppression of some entries. The natural optimization problem, where the goal is to minimize the number of suppressed entries, is hard even when the stored values are over a binary alphabet or the table consists of a bounded number of columns. In this paper we study how the complexity of the problem is influenced by different parameters. First we show that the problem is W[1]-hard when parameterized by the value of the solution (and k). Then we exhibit a fixed-parameter algorithm when the problem is parameterized by the number of columns and the number of different values in any column. Finally, we prove that k-anonymity is still APX-hard even when restricted to instances with 3 columns and k = 3.
Keywords: Anonymity; Fixed-parameter complexity; Approximation algorithms; Hardness
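To make the definition in the abstract concrete, the following minimal Python sketch evaluates one candidate solution: given a table and a partition of its rows into clusters of size at least k, it suppresses every column on which a cluster disagrees and counts the suppressed entries. The function name `anonymize`, the `*` placeholder, and the sample table are illustrative assumptions, not taken from the paper. The sketch only scores a given partition; finding a partition of minimum cost is the hard optimization problem the abstract refers to.

```python
from typing import List, Sequence, Tuple

SUPPRESSED = "*"  # placeholder for a suppressed entry (an assumption of this sketch)


def anonymize(
    table: Sequence[Sequence[str]],
    clusters: Sequence[Sequence[int]],
    k: int,
) -> Tuple[List[List[str]], int]:
    """Suppress entries so that all rows of each cluster become identical.

    `clusters` lists row indices and must partition the rows of `table`
    into groups of size at least `k`. Returns the anonymized table and
    the number of suppressed entries (the cost of this partition).
    """
    covered = sorted(i for cluster in clusters for i in cluster)
    if covered != list(range(len(table))):
        raise ValueError("clusters must partition the row indices")

    result = [list(row) for row in table]
    cost = 0
    for cluster in clusters:
        if len(cluster) < k:
            raise ValueError(f"cluster {list(cluster)} has fewer than {k} rows")
        rows = [table[i] for i in cluster]
        for col in range(len(rows[0])):
            # A column on which the cluster disagrees must be suppressed
            # in every row of that cluster.
            if len({row[col] for row in rows}) > 1:
                for i in cluster:
                    result[i][col] = SUPPRESSED
                cost += len(cluster)
    return result, cost


if __name__ == "__main__":
    # Hypothetical 4-row, 3-column table, anonymized with k = 2.
    sample = [
        ("a", "x", "1"),
        ("a", "y", "1"),
        ("b", "z", "2"),
        ("b", "z", "3"),
    ]
    anonymized, cost = anonymize(sample, [[0, 1], [2, 3]], k=2)
    for row in anonymized:
        print(row)
    print("suppressed entries:", cost)  # 4 for this partition
```

In this example each cluster disagrees on exactly one column, so two entries are suppressed per cluster. A different partition of the same rows, such as `[[0, 2], [1, 3]]`, would generally have a different cost, which is why the choice of clusters is what the optimization problem is about.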