Abstract
This chapter focuses on a specific topic in modern machine learning: deep learning. First, we introduce a few fundamental aspects, including the perceptron and why multi-layer structures are needed, how convolutional neural networks extract features layer by layer, the back-propagation learning algorithm, and the functional layers of convolutional neural networks. Then, we focus on the safety and security vulnerabilities of deep learning, explaining uncertainty estimation, adversarial attacks on robustness, poisoning attacks, model stealing, membership inference, and model inversion. Unlike the traditional machine learning models discussed in the previous chapters, where the structure or the training algorithm of a model may be exploited in the design of an attack, most attacks on deep learning do not consider the internal structure of the model, although the gradient may be used in some cases. These black-box or grey-box attacks suggest that many attack algorithms for deep learning may also be applicable to traditional machine learning models, which explains why we frequently refer to the algorithms in this chapter when discussing safety threats to traditional machine learning models.
Exercises
Question 1
Consider the example dataset in Example 5.1 and compute the following expressions (to 2 decimal places):
- GainRatio(Humidity, PlayTennis) = 0.16
- GainRatio(Temperature, PlayTennis) = 0.03
- GainRatio(Wind, PlayTennis) = 0.05
- InfoGain(Humidity, PlayTennis) = 0.15
- InfoGain(Temperature, PlayTennis) = 0.03
- InfoGain(Wind, PlayTennis) = 0.05 □
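The quantities in Questions 1 and 2 can be computed with a short script. Below is a minimal sketch of entropy, information gain, and gain ratio; the helper names and the toy data are illustrative (the actual dataset from Example 5.1 is not reproduced here):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (base 2) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(attr, labels):
    """InfoGain(A, C): entropy reduction from splitting on attribute A."""
    n = len(labels)
    gain = entropy(labels)
    for v in set(attr):
        sub = [l for a, l in zip(attr, labels) if a == v]
        gain -= (len(sub) / n) * entropy(sub)
    return gain

def gain_ratio(attr, labels):
    """GainRatio(A, C) = InfoGain(A, C) / SplitInfo(A)."""
    split_info = entropy(attr)  # entropy of the attribute's own value distribution
    return info_gain(attr, labels) / split_info if split_info > 0 else 0.0

# toy check: an attribute that splits the labels perfectly
labels = ["Y", "Y", "N", "N"]
attr = ["High", "High", "Low", "Low"]
```

Running these helpers on the attribute columns of the dataset in Example 5.1 should reproduce the values listed above.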
Question 2
Consider the part of the Iris dataset in Table 10.5 and compute the following expressions (to 2 decimal places):
- GainRatio(SepalLength, IrisClass) =
- GainRatio(SepalWidth, IrisClass) =
- GainRatio(PetalLength, IrisClass) =
- GainRatio(PetalWidth, IrisClass) =
- InfoGain(SepalLength, IrisClass) =
- InfoGain(SepalWidth, IrisClass) =
- InfoGain(PetalLength, IrisClass) =
- InfoGain(PetalWidth, IrisClass) =
Question 3
Understand the basic idea of random forest by conducting research on the literature, and implement a random forest algorithm based on the decision tree algorithm to see whether random forest performs better than a single decision tree on the Iris dataset. □
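One possible starting point, assuming scikit-learn is available: bag a set of decision trees trained on bootstrap samples with per-tree feature subsampling, and combine them by majority vote. This is a sketch of the idea, not a full random forest implementation:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

def random_forest_predict(X_train, y_train, X_test, n_trees=25, seed=0):
    """Minimal random forest: bootstrap sampling plus per-tree feature
    subsampling, combined by majority vote."""
    rng = np.random.default_rng(seed)
    votes = []
    for _ in range(n_trees):
        idx = rng.integers(0, len(X_train), len(X_train))  # bootstrap sample
        tree = DecisionTreeClassifier(max_features="sqrt",
                                      random_state=int(rng.integers(1_000_000)))
        tree.fit(X_train[idx], y_train[idx])
        votes.append(tree.predict(X_test))
    votes = np.stack(votes)
    # majority vote over the ensemble, per test point
    return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
forest_acc = (random_forest_predict(X_tr, y_tr, X_te) == y_te).mean()
single_acc = (DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
              .predict(X_te) == y_te).mean()
```

On a dataset as easy as Iris both accuracies are typically high, so the gap between the ensemble and a single tree may be small; the contrast is clearer on noisier datasets.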
Question 4
Following the previous question, please give an adversarial attack algorithm for random forest. □
Question 5
Consider the four data samples in Example 7.1 (also provided in Table 10.6) and the mean square error. Given the following two functions:
- \(f_{{\mathbf {w}}_1}=2X_1+1X_2+20X_3-330\)
- \(f_{{\mathbf {w}}_2}=X_1-2X_2+23X_3-332\)
please answer the following questions:
1. Which model is better for linear regression?
2. Which model is better for linear classification, considering the 0-1 loss for \({\mathbf {y}}^T = (0, 1, 1, 0)\)?
3. Which model is better for logistic regression for \({\mathbf {y}}^T = (0, 1, 1, 0)\)?
4. According to the logistic regression of the first model, what is its prediction on a new input (181, 92, 12.4)? □
Answer
1. Because
$$\displaystyle \begin{aligned} \begin{array}{rll} f_{{\mathbf{w}}_1}({\mathbf{x}}_1)-y_1 = & 2*182+1*87+20*11.3-330-325 & = 22 \\ f_{{\mathbf{w}}_1}({\mathbf{x}}_2)-y_2 = & 2*189+1*92+20*12.3-330-344 & = 42 \\ f_{{\mathbf{w}}_1}({\mathbf{x}}_3)-y_3 = & 2*178+1*79+20*10.6-330-350 & = -33 \\ f_{{\mathbf{w}}_1}({\mathbf{x}}_4)-y_4 = & 2*183+1*90+20*12.7-330-320 & = 60 \\ f_{{\mathbf{w}}_2}({\mathbf{x}}_1)-y_1 = & 182-2*87+23*11.3-332-325 & = -389.1 \\ f_{{\mathbf{w}}_2}({\mathbf{x}}_2)-y_2 = & 189-2*92+23*12.3-332-344 & = -388.1 \\ f_{{\mathbf{w}}_2}({\mathbf{x}}_3)-y_3 = & 178-2*79+23*10.6-332-350 & = -418.2 \\ f_{{\mathbf{w}}_2}({\mathbf{x}}_4)-y_4 = & 183-2*90+23*12.7-332-320 & = -356.9 \\ \end{array} \end{aligned} $$
(10.89)
the model \(f_{{\mathbf {w}}_1}\) is better for linear regression, according to the loss function (Eq. (7.3));
2. Because
$$\displaystyle \begin{aligned} \begin{array}{cc} step(f_{{\mathbf{w}}_1}({\mathbf{x}}_1))=1\\ step(f_{{\mathbf{w}}_1}({\mathbf{x}}_2))=1\\ step(f_{{\mathbf{w}}_1}({\mathbf{x}}_3))=1\\ step(f_{{\mathbf{w}}_1}({\mathbf{x}}_4))=1\\ step(f_{{\mathbf{w}}_2}({\mathbf{x}}_1))=0\\ step(f_{{\mathbf{w}}_2}({\mathbf{x}}_2))=0\\ step(f_{{\mathbf{w}}_2}({\mathbf{x}}_3))=0\\ step(f_{{\mathbf{w}}_2}({\mathbf{x}}_4))=0\\ \end{array} \end{aligned} $$
(10.90)
we have that both models are the same, according to the loss function (Eq. (7.13));
3. Because
$$\displaystyle \begin{aligned} \begin{array}{rcl} y_1*\log(\sigma(f_{{\mathbf{w}}_1}({\mathbf{x}}_1)))+(1-y_1)\log((1-\sigma(f_{{\mathbf{w}}_1}({\mathbf{x}}_1))))&=&-M\\ y_2*\log(\sigma(f_{{\mathbf{w}}_1}({\mathbf{x}}_2)))+(1-y_2)\log((1-\sigma(f_{{\mathbf{w}}_1}({\mathbf{x}}_2))))&=&0\\ y_3*\log(\sigma(f_{{\mathbf{w}}_1}({\mathbf{x}}_3)))+(1-y_3)\log((1-\sigma(f_{{\mathbf{w}}_1}({\mathbf{x}}_3))))&=&0\\ y_4*\log(\sigma(f_{{\mathbf{w}}_1}({\mathbf{x}}_4)))+(1-y_4)\log((1-\sigma(f_{{\mathbf{w}}_1}({\mathbf{x}}_4))))&=&-M\\ y_1*\log(\sigma(f_{{\mathbf{w}}_2}({\mathbf{x}}_1)))+(1-y_1)\log((1-\sigma(f_{{\mathbf{w}}_2}({\mathbf{x}}_1))))&=&0\\ y_2*\log(\sigma(f_{{\mathbf{w}}_2}({\mathbf{x}}_2)))+(1-y_2)\log((1-\sigma(f_{{\mathbf{w}}_2}({\mathbf{x}}_2))))&=&-44.1\\ y_3*\log(\sigma(f_{{\mathbf{w}}_2}({\mathbf{x}}_3)))+(1-y_3)\log((1-\sigma(f_{{\mathbf{w}}_2}({\mathbf{x}}_3))))&=&-68.2\\ y_4*\log(\sigma(f_{{\mathbf{w}}_2}({\mathbf{x}}_4)))+(1-y_4)\log((1-\sigma(f_{{\mathbf{w}}_2}({\mathbf{x}}_4))))&=&-1.1\\ \end{array} \end{aligned} $$
(10.91)
where M represents a large number, so we have
$$\displaystyle \begin{aligned} \begin{array}{c} \hat{L}(f_{{\mathbf{w}}_1}) = -\frac{1}{4}(-M+0+0-M)=M/2\\ \hat{L}(f_{{\mathbf{w}}_2}) = -\frac{1}{4}(0-44.1-68.2-1.1)=28.35 \end{array} \end{aligned} $$
(10.92)
according to Eq. (7.18). Therefore, \(f_{{\mathbf {w}}_2}\) is better.
4. According to Eq. (7.16), we have
$$\displaystyle \begin{aligned} \begin{array}{c} P_{{\mathbf{w}}_1}(y=1|\mathbf{x})=\sigma(2*181+1*92+20*12.4-330)=1 \\ P_{{\mathbf{w}}_1}(y=0|\mathbf{x})=1-\sigma(2*181+1*92+20*12.4-330)=0 \end{array} \end{aligned} $$
(10.93)
Therefore, the prediction is 1. □
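The arithmetic in the residual computations above can be double-checked numerically. A small sketch using the four samples from Table 10.6:

```python
import numpy as np

# the four samples from Table 10.6 and the regression targets
X = np.array([[182, 87, 11.3],
              [189, 92, 12.3],
              [178, 79, 10.6],
              [183, 90, 12.7]])
y = np.array([325, 344, 350, 320])

f1 = X @ np.array([2, 1, 20]) - 330   # f_{w1} = 2*X1 + X2 + 20*X3 - 330
f2 = X @ np.array([1, -2, 23]) - 332  # f_{w2} = X1 - 2*X2 + 23*X3 - 332

mse1 = np.mean((f1 - y) ** 2)  # small residuals: 22, 42, -33, 60
mse2 = np.mean((f2 - y) ** 2)  # residuals around -400
```

The much smaller mean square error of `f1` confirms that \(f_{{\mathbf {w}}_1}\) is the better regression model.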
Question 6
Understand the basic idea of Bayesian linear regression by conducting research on the literature, and implement a Bayesian linear regression algorithm to compare its performance with linear regression. □
Question 7
Write a program implementing an adversarial attack on logistic regression. □
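One possible approach is a gradient-based attack in the style of FGSM: for logistic regression, the gradient of the cross-entropy loss with respect to the input is available in closed form. A minimal sketch with illustrative weights and input (the numbers are not from the book):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def attack_logistic(x, y, w, b, eps):
    """FGSM-style step: perturb x along the sign of the gradient of the
    cross-entropy loss with respect to the input."""
    grad_x = (sigmoid(w @ x + b) - y) * w  # closed-form input gradient
    return x + eps * np.sign(grad_x)

# illustrative model and input
w, b = np.array([2.0, -1.0]), 0.0
x, y = np.array([1.0, 0.5]), 1             # classified as 1: sigmoid(1.5) > 0.5
x_adv = attack_logistic(x, y, w, b, eps=1.0)
```

With this perturbation budget the decision flips: `sigmoid(w @ x_adv + b)` drops below 0.5, so the perturbed input is misclassified as 0.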
Question 8
Given the function \(f(x)=e^x/(1+e^x)\), how many critical points does it have? □
Answer
0 □
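This follows by direct differentiation: \(f\) is the logistic sigmoid, and its derivative is strictly positive, so it never vanishes:
$$\displaystyle f'(x) = \frac{e^x(1+e^x)-e^x\cdot e^x}{(1+e^x)^2} = \frac{e^x}{(1+e^x)^2} > 0 \quad \text{for all } x. $$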
Question 9
Given the function \(f(x_1,x_2)= 9x_1^2+3x_2+4\), how many critical points does it have? □
Answer
0, because there is no assignment to \(x_1\) and \(x_2\) that can make the gradient of \(f(x_1, x_2)\) equal to 0. □
Question 10
Consider the dataset in Table 10.7 and compute the following probabilities:
- P(Wealthy = Y) = 4/7
- P(Wealthy = N) = 3/7
- P(Gender = F|Wealthy = Y) = 3/4
- P(Gender = M|Wealthy = Y) = 1/4
- P(HrsWorked > 40.5|Wealthy = Y) = 1/4
- P(HrsWorked < 40.5|Wealthy = Y) = 3/4
- P(Gender = F|Wealthy = N) = 1/3
- P(Gender = M|Wealthy = N) = 2/3
- P(HrsWorked > 40.5|Wealthy = N) = 2/3
- P(HrsWorked < 40.5|Wealthy = N) = 1/3
Based on the above, classify a new instance (Gender = F, HrsWorked = 44) with the Naive Bayes algorithm. □
Answer
Because, for the new instance (Gender = F, HrsWorked = 44 > 40.5),
$$\displaystyle \begin{aligned} \begin{array}{rcl} P(Wealthy=Y)P(Gender=F|Wealthy=Y)P(HrsWorked>40.5|Wealthy=Y) &=& \frac{4}{7}\cdot\frac{3}{4}\cdot\frac{1}{4} = \frac{3}{28} \approx 0.11 \\ P(Wealthy=N)P(Gender=F|Wealthy=N)P(HrsWorked>40.5|Wealthy=N) &=& \frac{3}{7}\cdot\frac{1}{3}\cdot\frac{2}{3} = \frac{2}{21} \approx 0.10 \\ \end{array} \end{aligned} $$
we have that it will be predicted as Wealthy = Y according to Eq. (8.11). □
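The comparison can be verified with exact arithmetic; a small sketch using the probabilities listed in the question:

```python
from fractions import Fraction as F

# the probabilities from Question 10; HrsWorked = 44 falls in the > 40.5 bin
score_Y = F(4, 7) * F(3, 4) * F(1, 4)  # P(Y) * P(Gender=F|Y) * P(HrsWorked>40.5|Y)
score_N = F(3, 7) * F(1, 3) * F(2, 3)  # P(N) * P(Gender=F|N) * P(HrsWorked>40.5|N)
prediction = "Y" if score_Y > score_N else "N"
```

Since 3/28 > 2/21, the Naive Bayes prediction is Wealthy = Y.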
Question 11
Implement the perceptron learning algorithm in Sect. 10.1, and compare the obtained binary classifier with the one obtained from logistic regression. □
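A minimal sketch of the perceptron learning rule (the bias is folded into the weight vector, and the AND-gate data is illustrative; the exact update rule and notation in Sect. 10.1 may differ slightly):

```python
import numpy as np

def perceptron_train(X, y, lr=0.1, epochs=100):
    """Classic perceptron rule: w <- w + lr * (y - step(w . x)) * x."""
    Xb = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        for xi, yi in zip(Xb, y):
            pred = 1 if xi @ w > 0 else 0      # step activation
            w += lr * (yi - pred) * xi          # update only on mistakes
    return w

def perceptron_predict(X, w):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return (Xb @ w > 0).astype(int)

# linearly separable toy data (AND gate)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])
w = perceptron_train(X, y)
```

Because the AND data is linearly separable, the perceptron convergence theorem guarantees the algorithm finds a separating hyperplane; on the XOR data of Question 13 it will not converge to zero error.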
Question 12
Understand how the learning rate in the perceptron learning algorithm affects the learning results. Draw a curve showing how the accuracy changes with respect to the learning rate. □
Question 13
Use the perceptron learning algorithm to work with the XOR dataset in Example 10.3, and check the accuracy. □
Question 14
Assume that all weights in Fig. 10.5, i.e., the numbers 1 and -1, need to be learned. Can you adapt the perceptron learning algorithm to learn these weights? □
Question 15
Understand the equivalence between applying the matrix expression to compute the outputs for a whole dataset and computing the outputs for individual inputs. □
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Huang, X., Jin, G., Ruan, W. (2023). Deep Learning. In: Machine Learning Safety. Artificial Intelligence: Foundations, Theory, and Algorithms. Springer, Singapore. https://doi.org/10.1007/978-981-19-6814-3_10
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-6813-6
Online ISBN: 978-981-19-6814-3