
On the Complexity of t-Closeness Anonymization and Related Problems

  • Conference paper
Database Systems for Advanced Applications (DASFAA 2013)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7825)

Abstract

An important issue in releasing individual data is protecting sensitive information from being leaked and maliciously exploited. Well-known privacy-preserving principles that aim to ensure both data privacy and data integrity, such as k-anonymity and l-diversity, have been studied extensively, both theoretically and empirically. Nonetheless, these widely adopted principles are still insufficient to prevent attribute disclosure if the attacker has partial knowledge of the overall distribution of the sensitive data. The t-closeness principle was proposed to address this weakness, and it has the additional benefit of supporting numerical sensitive attributes. However, in contrast to k-anonymity and l-diversity, the theoretical aspects of t-closeness have not yet been well investigated.

We present the first systematic theoretical study of the t-closeness principle under the commonly used attribute suppression model. We prove that for every constant t such that 0 ≤ t < 1, it is NP-hard to find an optimal t-closeness generalization of a given table. The proof consists of several reductions, each of which works for different values of t, and which together cover the full range. To complement this negative result, we also provide exact and fixed-parameter algorithms. Finally, we answer some open questions regarding the complexity of k-anonymity and l-diversity left in the literature.
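
To make the optimization problem concrete, the following is a minimal sketch and not the algorithm from the paper. It rests on several assumptions that this abstract does not state: suppression removes whole quasi-identifier columns, the cost is the number of suppressed columns, the sensitive attribute is categorical, and closeness is measured by the variational distance (half the L1 distance) between each group's sensitive-value distribution and the overall one. The names `is_t_close` and `min_suppression` are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def distribution(values):
    """Empirical distribution of a list of sensitive values."""
    counts = Counter(values)
    total = len(values)
    return {v: c / total for v, c in counts.items()}


def variational_distance(p, q):
    """Half the L1 distance between two discrete distributions (assumed metric)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def is_t_close(rows, suppressed, t):
    """Check t-closeness after suppressing the given column indices.

    rows: list of (quasi_identifier_tuple, sensitive_value) pairs.
    Rows that agree on all unsuppressed columns form one group.
    """
    overall = distribution([s for _, s in rows])
    groups = defaultdict(list)
    for qi, s in rows:
        key = tuple(v for i, v in enumerate(qi) if i not in suppressed)
        groups[key].append(s)
    return all(
        variational_distance(distribution(g), overall) <= t
        for g in groups.values()
    )


def min_suppression(rows, t):
    """Brute-force search for a smallest set of columns to suppress.

    Tries column subsets in order of increasing size, so the running time
    is exponential in the number of attributes.
    """
    m = len(rows[0][0])
    for size in range(m + 1):
        for cols in combinations(range(m), size):
            if is_t_close(rows, set(cols), t):
                return set(cols)
    # Not reached for t >= 0: suppressing every column yields a single
    # group whose distribution equals the overall one.
    return None
```

The search always terminates with an answer because suppressing all columns is trivially t-close. Its exponential running time in the number of attributes is in keeping with the NP-hardness result, although the exact and fixed-parameter algorithms mentioned above may be organized quite differently.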

This work was supported in part by the National Basic Research Program of China under Grants 2011CBA00300 and 2011CBA00301, and by the National Natural Science Foundation of China under Grants 61033001, 61061130540, and 61073174. The research of the second author was supported by the Research Grants Council of Hong Kong under grant 9041688 (CityU 124411).



Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Liang, H., Yuan, H. (2013). On the Complexity of t-Closeness Anonymization and Related Problems. In: Meng, W., Feng, L., Bressan, S., Winiwarter, W., Song, W. (eds) Database Systems for Advanced Applications. DASFAA 2013. Lecture Notes in Computer Science, vol 7825. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37487-6_26


  • DOI: https://doi.org/10.1007/978-3-642-37487-6_26

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-37486-9

  • Online ISBN: 978-3-642-37487-6

  • eBook Packages: Computer Science, Computer Science (R0)