Anonymizing binary and small tables is hard to approximate
The problem of publishing personal data without giving up privacy is becoming increasingly important. An interesting formalization recently proposed is k-anonymity. This approach requires that the rows of a table be clustered into sets of size at least k and that all the rows in a cluster become the same tuple after the suppression of some entries. The natural optimization problem, where the goal is to minimize the number of suppressed entries, is known to be NP-hard when the values are over a ternary alphabet, k=3, and the row length is unbounded. In this paper we give a lower bound on the approximation factor that any polynomial-time algorithm can achieve on two restrictions of the problem, namely (i) when the record values are over a binary alphabet and k=3, and (ii) when the records have length at most 8 and k=4, showing that these restrictions of the problem are APX-hard.
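To make the objective concrete, the following minimal sketch computes the suppression cost of a given clustering, assuming the usual reading of the definition above: in each cluster, every column on which the rows disagree must be suppressed in all rows of that cluster. The function name `suppression_cost`, the string encoding of rows, and the sample table are illustrative assumptions and are not taken from the paper.

```python
from typing import List, Sequence


def suppression_cost(rows: Sequence[str], clusters: List[List[int]], k: int) -> int:
    """Count suppressed entries for a clustering (illustrative helper, not from the paper).

    rows     -- equal-length strings, one character per attribute value
    clusters -- lists of row indices that together partition range(len(rows))
    k        -- minimum allowed cluster size
    """
    covered = sorted(i for cluster in clusters for i in cluster)
    if covered != list(range(len(rows))):
        raise ValueError("clusters must partition the row indices")

    total = 0
    for cluster in clusters:
        if len(cluster) < k:
            raise ValueError("every cluster needs at least k rows")
        members = [rows[i] for i in cluster]
        # A column must be suppressed in the whole cluster if its values differ.
        differing = sum(1 for column in zip(*members) if len(set(column)) > 1)
        total += differing * len(cluster)
    return total


# Hypothetical binary table with six rows of length 4, anonymized with k = 3.
table = ["0101", "0111", "0100", "1010", "1010", "1011"]
cost = suppression_cost(table, [[0, 1, 2], [3, 4, 5]], k=3)
```

In this example the first cluster disagrees on two columns and the second on one, so the clustering suppresses 2·3 + 1·3 = 9 entries. The optimization problem asks for the clustering that minimizes this count; the sketch only evaluates a given clustering and does not search for the best one.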