On the Complexity of t-Closeness Anonymization and Related Problems

Liang, Hongyu; Yuan, Hao

doi:10.1007/978-3-642-37487-6_26

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7825)

Included in the following conference series:

International Conference on Database Systems for Advanced Applications

Abstract

An important issue in releasing individual data is protecting sensitive information from being leaked and maliciously exploited. Well-known privacy-preserving principles that aim to ensure both data privacy and data integrity, such as k-anonymity and l-diversity, have been extensively studied both theoretically and empirically. Nonetheless, these widely adopted principles are still insufficient to prevent attribute disclosure if the attacker has partial knowledge about the overall sensitive data distribution. The t-closeness principle was proposed to address this weakness, and it has the additional benefit of supporting numerical sensitive attributes. However, in contrast to k-anonymity and l-diversity, the theoretical aspects of t-closeness have not yet been well investigated.

We present the first systematic theoretical study of the t-closeness principle under the commonly used attribute suppression model. We prove that for every constant t such that 0 ≤ t < 1, it is NP-hard to find an optimal t-closeness generalization of a given table. The proof consists of several reductions, each of which works for different values of t and which together cover the full range. To complement this negative result, we also provide exact and fixed-parameter algorithms. Finally, we answer some open questions regarding the complexity of k-anonymity and l-diversity left in the literature.
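
To make the problem statement concrete, the following is a minimal Python sketch of what an optimal t-closeness generalization under attribute suppression might look like on a toy table. It is not the exact or fixed-parameter algorithm from the paper. Several choices here are assumptions made only for illustration: a row is a pair (quasi-identifier tuple, sensitive value), suppressing an attribute removes the whole column, the cost is the number of suppressed columns, and closeness is measured by the variational distance (which is what the Earth Mover's Distance reduces to when all ground distances are equal). The function names `is_t_close` and `min_suppression` are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def distribution(values):
    """Empirical distribution of a list of sensitive values."""
    counts = Counter(values)
    n = len(values)
    return {v: c / n for v, c in counts.items()}


def variational_distance(p, q):
    """Half the L1 distance between two distributions (assumed metric)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def is_t_close(rows, suppressed, t):
    """Check t-closeness after suppressing the given column indices.

    Rows that agree on all unsuppressed quasi-identifier columns form one
    group; every group's sensitive distribution must be within distance t
    of the distribution over the whole table.
    """
    overall = distribution([s for _, s in rows])
    groups = defaultdict(list)
    for qi, s in rows:
        key = tuple(v for i, v in enumerate(qi) if i not in suppressed)
        groups[key].append(s)
    # A small tolerance guards against floating-point rounding.
    return all(
        variational_distance(distribution(g), overall) <= t + 1e-12
        for g in groups.values()
    )


def min_suppression(rows, t):
    """Brute force: try column subsets in order of increasing size.

    Exponential in the number of columns, which is consistent with (but
    does not demonstrate) the NP-hardness result stated above.
    """
    m = len(rows[0][0])
    for size in range(m + 1):
        for cols in combinations(range(m), size):
            if is_t_close(rows, set(cols), t):
                return set(cols)
    return None  # unreachable: suppressing every column yields one group


# Toy table: two quasi-identifier columns and one sensitive value per row.
rows = [
    (("A", "x"), "flu"),
    (("A", "y"), "flu"),
    (("B", "x"), "cold"),
    (("B", "y"), "cold"),
]
print(min_suppression(rows, 0.25))
```

In this toy table, the first column determines the sensitive value, so it has to be suppressed; grouping by the second column alone leaves every group with the same distribution as the whole table. Suppressing all columns always satisfies the check, because the single remaining group is the table itself, so the search always terminates with an answer.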

This work was supported in part by the National Basic Research Program of China Grant 2011CBA00300, 2011CBA00301, the National Natural Science Foundation of China Grant 61033001, 61061130540, 61073174. The research of the second author was supported by the Research Grants Council of Hong Kong under grant 9041688 (CityU 124411).

Author information

Authors and Affiliations

Institute for Interdisciplinary Information Sciences (IIIS), Tsinghua University, Beijing, 100084, China
Hongyu Liang
Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong
Hao Yuan

Editor information

Editors and Affiliations

Department of Computer Science, Binghamton University, 13902, Binghamton, NY, USA
Weiyi Meng
Department of Computer Science and Technology, Tsinghua University, 100084, Beijing, China
Ling Feng
Department of Computer Science, National University of Singapore, 117417, Singapore
Stéphane Bressan
Research Group Data Analytics and Computing, University of Vienna, 1090, Vienna, Austria
Werner Winiwarter
School of Computer, Wuhan University, 430072, Wuhan, China
Wei Song

About this paper

Cite this paper

Liang, H., Yuan, H. (2013). On the Complexity of t-Closeness Anonymization and Related Problems. In: Meng, W., Feng, L., Bressan, S., Winiwarter, W., Song, W. (eds) Database Systems for Advanced Applications. DASFAA 2013. Lecture Notes in Computer Science, vol 7825. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37487-6_26

DOI: https://doi.org/10.1007/978-3-642-37487-6_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37486-9
Online ISBN: 978-3-642-37487-6
eBook Packages: Computer Science, Computer Science (R0)
