Pattern-Guided Data Anonymization and Clustering

Bredereck, Robert; Nichterlein, André; Niedermeier, Rolf; Philip, Geevarghese

doi:10.1007/978-3-642-22993-0_19


Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6907)

Included in the following conference series:

International Symposium on Mathematical Foundations of Computer Science

Abstract

A matrix M over a fixed alphabet is k-anonymous if every row in M has at least k − 1 identical copies in M. Making a matrix k-anonymous by replacing a minimum number of entries with an additional ⋆-symbol (called “suppressing entries”) is known to be NP-hard. This task arises in the context of privacy-preserving publishing. We propose and analyze the computational complexity of an enhanced anonymization model where the user of the k-anonymized data may additionally “guide” the selection of the candidate matrix entries to be suppressed. The basic idea is to express this by means of “pattern vectors” which are part of the input. This can also be interpreted as a sort of clustering process. It is motivated by the observation that the “value” of matrix entries may significantly differ, and losing one (by suppression) may be more harmful than losing the other, which again may very much depend on the intended use of the anonymized data. We show that already very basic special cases of our new model lead to NP-hard problems while others allow for (fixed-parameter) tractability results.
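The definitions in the abstract can be illustrated with a minimal Python sketch. It checks whether a matrix is k-anonymous and suppresses entries according to per-row patterns. The abstract does not give the formal definition of a pattern vector, so the sketch assumes a pattern is a Boolean mask with one flag per column, where `True` means "suppress this entry". The names `is_k_anonymous`, `suppress`, and `apply_patterns` are illustrative and do not come from the paper. The sketch only verifies a given assignment of patterns to rows; it does not search for an optimal one, which is the hard part of the problem.

```python
from collections import Counter

STAR = "⋆"  # the additional symbol that marks a suppressed entry


def is_k_anonymous(matrix, k):
    """Return True if every row has at least k - 1 identical copies in the matrix."""
    counts = Counter(tuple(row) for row in matrix)
    return all(count >= k for count in counts.values())


def suppress(row, pattern):
    """Replace the entries of `row` flagged True in `pattern` by STAR.

    Assumption: a pattern is a Boolean mask with one flag per column.
    """
    return tuple(STAR if flag else entry for entry, flag in zip(row, pattern))


def apply_patterns(matrix, assignment):
    """Suppress each row according to the pattern assigned to it."""
    return [suppress(row, pattern) for row, pattern in zip(matrix, assignment)]


# Toy instance over the alphabet {0, 1} with k = 2.
matrix = [
    (0, 1, 1),
    (0, 1, 0),
    (1, 0, 0),
    (1, 1, 0),
]

# Two hypothetical patterns supplied as part of the input.
hide_last = (False, False, True)
hide_middle = (False, True, False)

# One pattern per row: rows 0 and 1 lose their last entry,
# rows 2 and 3 lose their middle entry.
assignment = [hide_last, hide_last, hide_middle, hide_middle]
anonymized = apply_patterns(matrix, assignment)

print(is_k_anonymous(matrix, 2))      # original matrix: all rows distinct
print(is_k_anonymous(anonymized, 2))  # after suppression: rows pair up
```

Restricting suppression to a given set of patterns is what lets the user of the data state which entries may be given up. In this toy instance, no row ever loses its first column.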

Author information

Authors and Affiliations

Institut für Softwaretechnik und Theoretische Informatik, TU Berlin, Germany
Robert Bredereck, André Nichterlein & Rolf Niedermeier
The Institute of Mathematical Sciences, Chennai, India
Geevarghese Philip

Editor information

Editors and Affiliations

Institute of Informatics, University of Warsaw, ul. Banacha 2, 02-097, Warsaw, Poland
Filip Murlak & Piotr Sankowski

About this paper

Cite this paper

Bredereck, R., Nichterlein, A., Niedermeier, R., Philip, G. (2011). Pattern-Guided Data Anonymization and Clustering. In: Murlak, F., Sankowski, P. (eds) Mathematical Foundations of Computer Science 2011. MFCS 2011. Lecture Notes in Computer Science, vol 6907. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22993-0_19

DOI: https://doi.org/10.1007/978-3-642-22993-0_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-22992-3
Online ISBN: 978-3-642-22993-0
eBook Packages: Computer Science, Computer Science (R0)
