Local dampening: differential privacy for non-numeric queries via local sensitivity

  • Special Issue Paper
The VLDB Journal

Abstract

Differential privacy is the state-of-the-art formal definition for data release under strong privacy guarantees. A variety of mechanisms have been proposed in the literature for releasing the output of numeric queries (e.g., the Laplace mechanism and the smooth sensitivity mechanism). These mechanisms guarantee differential privacy by adding noise to the query's true output. The amount of noise is calibrated by the global sensitivity or the local sensitivity of the query, both of which measure the impact of adding or removing an individual on the query's output. Mechanisms that use local sensitivity add less noise and, consequently, give more accurate answers. However, although there has been some work on generic mechanisms for releasing the output of non-numeric queries using global sensitivity (e.g., the exponential mechanism), the literature lacks generic mechanisms that use local sensitivity to reduce the noise in the output of non-numeric queries. In this work, we remedy this shortcoming and present the local dampening mechanism. We adapt the notion of local sensitivity to the non-numeric setting and leverage it to design a generic non-numeric mechanism. We provide theoretical comparisons to the exponential mechanism and show under which conditions the local dampening mechanism is more accurate. We illustrate the effectiveness of the local dampening mechanism by applying it to three diverse problems: (i) percentile selection, where we report the p-th element in the database; (ii) influential node analysis, where, given an influence metric, we release the top-k most influential nodes while preserving the privacy of the relationships between nodes in the network; and (iii) decision tree induction, where we provide a private adaptation of the ID3 algorithm to build decision trees from a given tabular dataset. Experimental evaluation shows that we can reduce the error for the percentile selection application by up to \(73\%\), reduce the use of privacy budget by 2 to 4 orders of magnitude for the influential node analysis application, and increase accuracy by up to \(12\%\) for decision tree induction when compared to global sensitivity-based approaches. Finally, to illustrate the scalability of our local dampening mechanism, we empirically evaluate its runtime performance for the influential node analysis problem and show sub-quadratic behavior.
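
For context, the global-sensitivity baseline named in the abstract, the exponential mechanism, can be sketched for the percentile selection problem as follows. This is a minimal illustration under stated assumptions, not the local dampening mechanism itself: the function names, the rank-based utility, and the global sensitivity bound of 1 are choices made here for the example and are not taken from the article.

```python
import math
import random


def percentile_utility(data, p):
    """Build a utility function for percentile selection (hypothetical helper).

    The utility of a candidate r is the negative distance between the rank
    of r in the data and the target rank p * len(data). Adding or removing
    one record changes this value by at most 1, so 1 is assumed here as a
    global sensitivity bound.
    """
    target = p * len(data)

    def utility(r):
        rank = sum(1 for x in data if x < r)
        return -abs(rank - target)

    return utility


def exponential_mechanism(candidates, utility, sensitivity, epsilon, rng=random):
    """Sample one candidate with probability proportional to
    exp(epsilon * utility(r) / (2 * sensitivity))."""
    scores = [utility(r) for r in candidates]
    top = max(scores)  # shift scores so the largest exponent is 0 (avoids overflow)
    weights = [math.exp(epsilon * (s - top) / (2.0 * sensitivity)) for s in scores]
    return rng.choices(candidates, weights=weights, k=1)[0]
```

A call such as `exponential_mechanism(data, percentile_utility(data, 0.5), 1.0, epsilon)` returns a noisy median: larger values of `epsilon` concentrate the output on candidates whose rank is close to the target, while smaller values spread it over the whole candidate set. The noise scale here depends only on the worst-case sensitivity, which is the limitation the abstract says local sensitivity is meant to address.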

Author information

Corresponding author

Correspondence to Victor A. E. Farias.

About this article

Cite this article

Farias, V.A.E., Brito, F.T., Flynn, C. et al. Local dampening: differential privacy for non-numeric queries via local sensitivity. The VLDB Journal 32, 1191–1214 (2023). https://doi.org/10.1007/s00778-022-00774-w

  • DOI: https://doi.org/10.1007/s00778-022-00774-w
