Journal of Combinatorial Optimization, Volume 22, Issue 1, pp 97–119

Anonymizing binary and small tables is hard to approximate

  • Paola Bonizzoni
  • Gianluca Della Vedova
  • Riccardo Dondi

DOI: 10.1007/s10878-009-9277-y

Cite this article as:
Bonizzoni, P., Della Vedova, G. & Dondi, R. J Comb Optim (2011) 22: 97. doi:10.1007/s10878-009-9277-y


The problem of publishing personal data without giving up privacy is becoming increasingly important. An interesting formalization recently proposed is k-anonymity. This approach requires that the rows in a table be clustered in sets of size at least k and that all the rows in a cluster become the same tuple after the suppression of some entries. The natural optimization problem, where the goal is to minimize the number of suppressed entries, is known to be NP-hard when the values are over a ternary alphabet, k=3, and the row length is unbounded. In this paper we give a lower bound on the approximation factor that any polynomial-time algorithm can achieve on two restrictions of the problem, namely (i) when the record values are over a binary alphabet and k=3, and (ii) when the records have length at most 8 and k=4, showing that these restrictions of the problem are APX-hard.
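To make the objective concrete, the following Python sketch computes the number of suppressed entries for a given clustering of the rows. It is an illustration only, not taken from the paper: the function name suppression_cost, the representation of rows as strings, and the sample table are all assumptions. It evaluates one fixed clustering and does not search for an optimal one, which is the part the paper shows to be hard to approximate.

```python
from typing import List, Sequence


def suppression_cost(rows: Sequence[str], clusters: List[List[int]], k: int) -> int:
    """Count the entries suppressed when each cluster collapses to one tuple.

    rows     -- table rows, all of the same length (hypothetical representation)
    clusters -- lists of row indices; together they must partition the rows
    k        -- minimum allowed cluster size
    """
    # Every row must belong to exactly one cluster.
    seen = sorted(i for cluster in clusters for i in cluster)
    if seen != list(range(len(rows))):
        raise ValueError("clusters must partition the row indices")

    total = 0
    for cluster in clusters:
        if len(cluster) < k:
            raise ValueError("every cluster needs at least k rows")
        members = [rows[i] for i in cluster]
        # A column on which the rows disagree must be suppressed
        # in every row of the cluster so that the rows become identical.
        disagreeing = sum(1 for column in zip(*members) if len(set(column)) > 1)
        total += disagreeing * len(cluster)
    return total


# Hypothetical binary table with six rows of length 4, and k = 3.
table = ["0011", "0010", "0110", "1100", "1101", "1001"]
print(suppression_cost(table, [[0, 1, 2], [3, 4, 5]], k=3))
```

In this sample, each of the two clusters disagrees on two columns, so two entries are suppressed in each of its three rows. Placing all six rows in a single cluster would instead force every column to be suppressed.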

Keywords: k-anonymity, APX-hardness, Computational complexity, Clustering

Copyright information

© Springer Science+Business Media, LLC 2009

Authors and Affiliations

  • Paola Bonizzoni (1)
  • Gianluca Della Vedova (2)
  • Riccardo Dondi (3)

  1. DISCo, Università degli Studi di Milano-Bicocca, Milano, Italy
  2. Dipartimento di Statistica, Università degli Studi di Milano-Bicocca, Milano, Italy
  3. Dipartimento di Scienze dei Linguaggi, della Comunicazione e degli Studi Culturali, Università degli Studi di Bergamo, Bergamo, Italy
