
Evaluating differentially private decision tree model over model inversion attack

  • Regular Contribution
  • Published in: International Journal of Information Security

Abstract

Machine learning techniques have been widely used and have shown remarkable performance in various fields. Along with the widespread adoption of machine learning, concerns about privacy violations have been raised. Recently, as privacy invasion attacks on machine learning models have been reported, research on privacy-preserving machine learning has gained momentum. In particular, in the field of differential privacy, a rigorous notion of privacy, various mechanisms have been proposed to preserve the privacy of machine learning models. However, little research has analyzed the relationship between the degree of privacy guaranteed and concrete privacy breach attacks. In this paper, we analyze how differentially private models fare against privacy breach attacks as the degree of privacy preservation varies, and we study how to set appropriate privacy parameters. In particular, we focus on the model inversion attack against decision trees and evaluate various differentially private decision tree algorithms under this attack. Our main finding from investigating the trade-off between data privacy and model utility is that well-designed differentially private algorithms can significantly mitigate a concrete privacy invasion attack while preserving model utility.
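To make the threat model concrete, here is a minimal Python sketch (our own illustration on hypothetical toy data, in the style of the model inversion attack of Fredrikson et al. that the paper evaluates, not code from the paper) of inverting a decision tree: the attacker knows every attribute of a target record except one sensitive attribute, observes the model's output, and guesses the candidate sensitive value the model finds most likely.

```python
# Sketch of a model inversion attack against a decision tree: enumerate the
# candidate values of the sensitive attribute and pick the one maximizing the
# model's confidence in the observed label, weighted by the attacker's prior.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Hypothetical toy data: column 0 plays the role of the sensitive attribute,
# columns 1-3 are auxiliary attributes the attacker already knows.
X = rng.integers(0, 3, size=(500, 4))
y = ((X[:, 0] + X[:, 1]) > 2).astype(int)  # label correlated with column 0

model = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X, y)

def invert(model, known_attrs, observed_label, candidates, prior):
    """Guess the sensitive value v maximizing P(label | v, known) * prior[v]."""
    scores = []
    for v in candidates:
        x = np.array([[v, *known_attrs]])
        confidence = model.predict_proba(x)[0][observed_label]
        scores.append(confidence * prior[v])
    return candidates[int(np.argmax(scores))]

# Attack one target record: the attacker knows attributes 1-3 and the label.
target, label = X[0], y[0]
guess = invert(model, target[1:], label, candidates=[0, 1, 2],
               prior={0: 1 / 3, 1: 1 / 3, 2: 1 / 3})
print("true sensitive value:", target[0], "| attacker's guess:", guess)
```

A differentially private training algorithm perturbs the statistics the tree is built from, which blunts exactly the confidence signal this loop exploits; the trade-off studied in the paper is how much the attack degrades, and how much utility survives, at each privacy budget.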


Notes

  1. It is assumed that the attacker knows the auxiliary information about the target, i.e., all attributes except the sensitive one.

  2. The attack can be extended to multiple sensitive attributes. Additionally, since the attacker can be a legitimate user of the machine learning service, it is assumed that the attacker knows the candidate values of each attribute.

  3. The gain ratio is obtained by dividing the information gain by the intrinsic value \(IV(A)=-\sum_{i \in A} \frac{n_i^A}{n'} \cdot \log \frac{n_i^A}{n'}\), where \(n'\) denotes the number of records in the current node; IV(A) is close to zero when \(n_i^A \approx n'\). In the differentially private scenario, the sensitivity must account for the worst case, in which the gain ratio can be infinite. Therefore, the gain ratio cannot be bounded; the sketch below makes this concrete.
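As a numeric illustration of this unboundedness (a sketch of our own under the definitions in Note 3, not code from the paper; the helper name `intrinsic_value` is ours), the following computes IV(A) as one attribute value comes to dominate the node. Even a small, fixed information gain then produces an arbitrarily large gain ratio, which is why its sensitivity cannot be bounded:

```python
# Sketch: the intrinsic value IV(A) collapses toward zero as one value of
# attribute A dominates the records in a node, so gain / IV(A) blows up.
import math

def intrinsic_value(counts):
    """IV(A) = -sum_i (n_i/n') * log2(n_i/n') over the values of attribute A."""
    n = sum(counts)
    return -sum((c / n) * math.log2(c / n) for c in counts if c > 0)

# From a balanced split toward a node where one value covers nearly all records.
for counts in [(500, 500), (999, 1), (999_999, 1)]:
    iv = intrinsic_value(counts)
    # Hold the information gain fixed at a small 0.01 bits; the ratio diverges.
    print(f"counts={counts}  IV={iv:.2e}  gain_ratio={0.01 / iv:.2f}")
```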


Acknowledgements

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Science and ICT (2019R1A2C1003146); by an Electronics and Telecommunications Research Institute (ETRI) grant funded by the Korean government (21ZR1300, Core Technology Research on Trust Data Connectome); and by a research grant from Kongju National University in 2021.

Author information


Corresponding author

Correspondence to Dowon Hong.

Ethics declarations

Conflict of interest

All authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Park, C., Hong, D. & Seo, C. Evaluating differentially private decision tree model over model inversion attack. Int. J. Inf. Secur. 21, 1–14 (2022). https://doi.org/10.1007/s10207-021-00564-5

