Improving the Utility of Differential Privacy via Univariate Microaggregation

Sánchez, David; Domingo-Ferrer, Josep; Martínez, Sergio

doi:10.1007/978-3-319-11257-2_11

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8744)

Included in the following conference series: International Conference on Privacy in Statistical Databases

Abstract

Differential privacy is a privacy model for anonymization that offers more robust privacy guarantees than previous models, such as k-anonymity and its extensions. However, it is often overlooked that the utility of differentially private outputs is quite limited, either because of the amount of noise that must be added to obtain them or because utility is preserved only for restricted types of queries. By contrast, k-anonymity-like anonymization offers general-purpose data releases that make no assumptions about the uses of the protected data. This paper proposes a mechanism to offer general-purpose differentially private data releases, with a specific focus on preserving the utility of the protected data. Our proposal relies on univariate microaggregation to reduce the amount of noise needed to satisfy differential privacy. The theoretical benefits of the proposal are illustrated in a practical setting.
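
The idea of averaging before adding noise can be illustrated with a minimal sketch. It is not the authors' algorithm: it sorts one numerical attribute, forms rank-ordered groups of at least k values, replaces each value by its group mean, and perturbs each mean with Laplace noise. The function names, the `width` parameter (the assumed size of the attribute domain) and the noise scale `width / (k * epsilon)` are assumptions made for illustration. The scale rests only on the standard fact that the mean of k values from an interval of that width moves by at most `width / k` when one value changes. The effect of a changed record on group membership, which a real sensitivity analysis must cover, is not modeled here, so the sketch is not a proven ε-differentially private calibration.

```python
import random


def microaggregate(values, k):
    """Group one attribute by rank into groups of at least k values.

    Returns (group_of, centroids): the group index of every record and
    the mean of every group. The last group absorbs any remainder.
    """
    n = len(values)
    if k < 1 or n < k:
        raise ValueError("need 1 <= k <= len(values)")
    order = sorted(range(n), key=lambda i: values[i])  # record indices by rank
    n_groups = n // k
    group_of = [0] * n
    centroids = []
    for g in range(n_groups):
        start = g * k
        end = n if g == n_groups - 1 else start + k
        members = order[start:end]
        centroids.append(sum(values[i] for i in members) / len(members))
        for i in members:
            group_of[i] = g
    return group_of, centroids


def laplace(scale, rng):
    """Draw Laplace(0, scale) noise as a scaled difference of two Exp(1) draws."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def noisy_release(values, k, epsilon, width, rng):
    """Replace every value by its group mean plus one Laplace draw per group.

    ASSUMPTION: the scale width / (k * epsilon) is illustrative only; it
    ignores how a changed record can alter group membership.
    """
    group_of, centroids = microaggregate(values, k)
    scale = width / (k * epsilon)
    noisy = [c + laplace(scale, rng) for c in centroids]
    return [noisy[g] for g in group_of]


if __name__ == "__main__":
    ages = [23, 45, 31, 52, 38, 27, 61, 49, 35]  # hypothetical data
    print(noisy_release(ages, k=3, epsilon=1.0, width=100.0, rng=random.Random(0)))
```

Under these assumptions, the noise scale for k = 1 equals the full domain width divided by ε, while larger groups shrink it by a factor of k at the cost of replacing individual values with group means.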

Author information

Authors and Affiliations

UNESCO Chair in Data Privacy, Department of Computer Engineering and Mathematics, Universitat Rovira i Virgili, Av. Països Catalans 26, E-43007, Tarragona, Catalonia
David Sánchez, Josep Domingo-Ferrer & Sergio Martínez

Editor information

Editors and Affiliations

Department of Computer Engineering and Mathematics, UNESCO Chair in Data Privacy, Universitat Rovira i Virgili, Av. Països Catalans 26, 43007, Tarragona, Catalonia
Josep Domingo-Ferrer

Cite this paper

Sánchez, D., Domingo-Ferrer, J., Martínez, S. (2014). Improving the Utility of Differential Privacy via Univariate Microaggregation. In: Domingo-Ferrer, J. (eds) Privacy in Statistical Databases. PSD 2014. Lecture Notes in Computer Science, vol 8744. Springer, Cham. https://doi.org/10.1007/978-3-319-11257-2_11

DOI: https://doi.org/10.1007/978-3-319-11257-2_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11256-5
Online ISBN: 978-3-319-11257-2
eBook Packages: Computer Science (R0)
