
Evaluating differentially private decision tree model over model inversion attack

  • Regular Contribution
  • Published in: International Journal of Information Security

Abstract

Machine learning techniques have been widely used and have shown remarkable performance in various fields. Along with the widespread adoption of machine learning, concerns about privacy violations have been raised. Recently, as privacy invasion attacks on machine learning models have been reported, research on privacy-preserving machine learning has gained momentum. In particular, in the field of differential privacy, a rigorous notion of privacy, various mechanisms have been proposed to preserve the privacy of machine learning models. However, little research has analyzed the relationship between the degree of privacy guaranteed and concrete privacy breach attacks. In this paper, we analyze how differentially private models fare against privacy breach attacks as the degree of privacy preservation varies, and we study how to set appropriate privacy parameters. In particular, we focus on the model inversion attack against decision trees and evaluate various differentially private decision tree algorithms under this attack. Our main finding from investigating the trade-off between data privacy and model utility is that well-designed differentially private algorithms can significantly mitigate a concrete privacy invasion attack while preserving model utility.
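To make the threat model concrete, here is a minimal Python sketch (our own illustration on hypothetical toy data, in the style of the model inversion attack of Fredrikson et al. that the paper evaluates, not code from the paper) of inverting a decision tree: the attacker knows every attribute of a target record except one sensitive attribute, observes the model's output, and guesses the candidate sensitive value the model finds most likely.

```python
# Sketch of a model inversion attack against a decision tree: enumerate the
# candidate values of the sensitive attribute and pick the one maximizing the
# model's confidence in the observed label, weighted by the attacker's prior.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Hypothetical toy data: column 0 plays the role of the sensitive attribute,
# columns 1-3 are auxiliary attributes the attacker already knows.
X = rng.integers(0, 3, size=(500, 4))
y = ((X[:, 0] + X[:, 1]) > 2).astype(int)  # label correlated with column 0

model = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X, y)

def invert(model, known_attrs, observed_label, candidates, prior):
    """Guess the sensitive value v maximizing P(label | v, known) * prior[v]."""
    scores = []
    for v in candidates:
        x = np.array([[v, *known_attrs]])
        confidence = model.predict_proba(x)[0][observed_label]
        scores.append(confidence * prior[v])
    return candidates[int(np.argmax(scores))]

# Attack one target record: the attacker knows attributes 1-3 and the label.
target, label = X[0], y[0]
guess = invert(model, target[1:], label, candidates=[0, 1, 2],
               prior={0: 1 / 3, 1: 1 / 3, 2: 1 / 3})
print("true sensitive value:", target[0], "| attacker's guess:", guess)
```

A differentially private training algorithm perturbs the statistics the tree is built from, which blunts exactly the confidence signal this loop exploits; the trade-off studied in the paper is how much the attack degrades, and how much utility survives, at each privacy budget.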


Notes

  1. It is assumed that the attacker knows the auxiliary information about the target, i.e., all attributes except the sensitive one.

  2. The attack can be extended to multiple sensitive attributes. Additionally, since the attacker can be a legitimate user of the machine learning service, it is assumed that the attacker knows the candidate values of each attribute.

  3. The gain ratio is obtained by dividing the information gain by the intrinsic value \(IV(A)=-\sum_{i \in A} \frac{n_i^A}{n'} \cdot \log \frac{n_i^A}{n'}\), where \(n'\) denotes the number of records in the current node; IV(A) is close to zero when \(n_i^A \approx n'\). In the differentially private scenario, the sensitivity must account for the worst case, in which the gain ratio can be infinite. Therefore, the gain ratio cannot be bounded; the sketch below makes this concrete.
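As a numeric illustration of this unboundedness (a sketch of our own under the definitions in Note 3, not code from the paper; the helper name `intrinsic_value` is ours), the following computes IV(A) as one attribute value comes to dominate the node. Even a small, fixed information gain then produces an arbitrarily large gain ratio, which is why its sensitivity cannot be bounded:

```python
# Sketch: the intrinsic value IV(A) collapses toward zero as one value of
# attribute A dominates the records in a node, so gain / IV(A) blows up.
import math

def intrinsic_value(counts):
    """IV(A) = -sum_i (n_i/n') * log2(n_i/n') over the values of attribute A."""
    n = sum(counts)
    return -sum((c / n) * math.log2(c / n) for c in counts if c > 0)

# From a balanced split toward a node where one value covers nearly all records.
for counts in [(500, 500), (999, 1), (999_999, 1)]:
    iv = intrinsic_value(counts)
    # Hold the information gain fixed at a small 0.01 bits; the ratio diverges.
    print(f"counts={counts}  IV={iv:.2e}  gain_ratio={0.01 / iv:.2f}")
```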


Acknowledgements

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Science and ICT (2019R1A2C1003146); by an Electronics and Telecommunications Research Institute (ETRI) grant funded by the Korean government (21ZR1300, Core Technology Research on Trust Data Connectome); and by a research grant from Kongju National University in 2021.

Author information


Corresponding author

Correspondence to Dowon Hong.

Ethics declarations

Conflict of interest

All authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Park, C., Hong, D. & Seo, C. Evaluating differentially private decision tree model over model inversion attack. Int. J. Inf. Secur. 21, 1–14 (2022). https://doi.org/10.1007/s10207-021-00564-5

