Differentially Private Data Sets Based on Microaggregation and Record Perturbation

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10571)

Abstract

We present an approach to generating differentially private data sets that consists of adding noise to a microaggregated version of the original data set. This idea has already been proposed in the literature as a way to reduce data sensitivity and hence the noise required to reach differential privacy. The novelty of our approach is that we treat the microaggregated data set as the target of protection, rather than protecting the original data set and viewing the microaggregated data set as a mere intermediate step. As a result, we avoid the complexities inherent in the insensitive microaggregation used in previous contributions, and we significantly improve the utility of the data. This claim is supported by theoretical and empirical utility comparisons between our approach and existing approaches.
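To make the two steps named in the abstract concrete, the following is a minimal sketch of microaggregation followed by record-level noise addition, and not the authors' algorithm. It assumes a single numerical attribute, fixed-size groups formed by sorting, and independent Laplace noise per record. The function names and the `scale` parameter are hypothetical. In particular, `scale` is left as a free input: calibrating it to a sensitivity and a privacy budget is the subject of the paper and is not reproduced here, so the sketch on its own gives no differential privacy guarantee.

```python
import random
from statistics import fmean


def microaggregate(values, k):
    """Sort the values, form groups of at least k, and replace each value
    by the mean (centroid) of its group. Illustrative univariate variant."""
    if k < 1 or len(values) < k:
        raise ValueError("need 1 <= k <= len(values)")
    order = sorted(range(len(values)), key=lambda i: values[i])
    n_groups = len(values) // k
    out = [0.0] * len(values)
    for g in range(n_groups):
        start = g * k
        # The last group absorbs leftover records so every group has >= k members.
        end = len(values) if g == n_groups - 1 else start + k
        idx = order[start:end]
        centroid = fmean([values[i] for i in idx])
        for i in idx:
            out[i] = centroid
    return out


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two i.i.d.
    exponential variables with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb(values, scale, rng):
    """Add independent Laplace noise to every record (assumed noise model)."""
    return [v + laplace_noise(scale, rng) for v in values]


rng = random.Random(0)                 # fixed seed only for reproducibility
ages = [23, 25, 31, 38, 40, 52, 57]    # made-up toy data
aggregated = microaggregate(ages, k=3)
released = perturb(aggregated, scale=1.0, rng=rng)  # scale is NOT calibrated
```

With k = 3, the seven toy values fall into one group of three and one group of four, so `aggregated` contains only two distinct values before noise is added. Larger groups pull records closer to shared centroids, which is the intuition behind using microaggregation to lower the amount of noise needed.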

Keywords

Anonymization, Differential privacy, Microaggregation, Privacy

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. UNESCO Chair in Data Privacy, Department of Computer Science and Mathematics, Universitat Rovira i Virgili, Tarragona, Catalonia, Spain
