
1 Introduction

Deep neural networks (DNNs) and convolutional neural networks (CNNs) currently enjoy great interest. They have become the state-of-the-art methods in many fields of machine learning and have been applied to various problems, including image recognition, speech recognition, and natural language processing [1].

In the area of pattern recognition, deep and convolutional neural networks have achieved several human-competitive results [2,3,4]. Given these results, the question arises whether these methods achieve capabilities similar to human vision, such as generalization. This paper deals with a property of machine learning models that demonstrates a difference. Consider a classifier and an image correctly classified by it as one class (for example, an image of a hand-written digit five). It is possible to change the image so slightly that to human eyes there is almost no difference, yet the classifier labels the image as something completely different (such as the digit zero).

This counter-intuitive property of neural networks was first described in [5]. It relates to the stability of a neural network with respect to small perturbations of its inputs. Such perturbed examples are known as adversarial examples. Adversarial examples differ only slightly from correctly classified examples drawn from the data distribution, but they are classified incorrectly by the classifier learned on the data. Not only are they classified incorrectly, they can often be classified as a class of our choice.

The vulnerability to adversarial examples is not limited to deep neural network models; it spreads through all machine learning methods, including shallow architectures (such as SVMs) and decision trees. Networks with local units, such as RBF networks, are known to be more robust to adversarial examples. In this paper we examine the use of RBF layers in a deep architecture to protect it from adversarial examples. We propose a new architecture obtained by stacking a deep architecture and an RBF network. We show that such a model is much less vulnerable to adversarial examples than the original model, while its accuracy remains almost the same.

This paper is organized as follows. First, in Sect. 2 we explain how adversarial examples work and review related work. Then, Sect. 3 introduces the new architecture. Section 4 deals with the results of our experiments. Finally, Sect. 5 concludes our paper.

2 Adversarial Examples and Related Work

Adversarial examples were first introduced in [5]. The paper shows that, given a trained network, it is possible to arbitrarily change the network's prediction by applying an imperceptible non-random perturbation to an input image. Such perturbations are found by optimizing the input to maximize the prediction error. The box-constrained Limited-memory Broyden-Fletcher-Goldfarb-Shanno algorithm (L-BFGS) is used for this optimization.

On some data sets, such as ImageNet, the adversarial examples are so close to the original examples that they are indistinguishable to the human eye. In addition, the authors state that adversarial examples are relatively robust and generalize between neural networks with different numbers of layers or activations, or trained on different subsets of the training data. In other words, if we use one neural network to generate a set of adversarial examples, these examples are also misclassified by another neural network, even when it was trained with different hyperparameters or on a different subset of examples.

Paper [6] suggests that it is the linear behaviour in high-dimensional spaces that is sufficient to cause adversarial examples (for example, a linear classifier exhibits this behaviour, too). The authors propose a fast method of generating adversarial examples by adding a small vector in the direction of the sign of the gradient.

Let us have a linear classifier and let x and \(\tilde{x} = x + \eta \) be input vectors. The classifier should assign x and \(\tilde{x}\) the same class as long as \( ||\eta ||_{\infty } \le \varepsilon \), where \(\varepsilon \) is the precision of the features.

Consider the dot product between weight vector w and input vector \(\tilde{x}\):

$$ w^{\top }\tilde{x} = w^{\top }x + w^{\top }\eta . $$

Adding \(\eta \) to the input vector increases the activation by \(w^{\top }\eta \). We can maximize this increase by choosing \(\eta = \varepsilon \,\text {sgn}(w)\). If n is the dimension of w and m is the average magnitude of its elements, the activation grows by \(\varepsilon mn\). Note that \(||\eta ||_{\infty }\) does not grow with n, but the change in activation caused by the perturbation \(\eta \) grows linearly with n. It is thus possible to make many infinitesimal changes to the input that add up to a large change of the activation. Therefore, a simple linear model can have adversarial examples if its input has sufficient dimensionality.
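
The effect can be illustrated with a few lines of NumPy; the dimensionality, the value of \(\varepsilon \), and the random weights are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 10_000                         # input dimensionality (illustrative)
eps = 0.01                         # per-feature perturbation bound (illustrative)
w = rng.normal(size=n)             # weights of a linear classifier
x = rng.normal(size=n)             # an input vector

eta = eps * np.sign(w)             # worst-case perturbation, ||eta||_inf == eps
shift = w @ eta                    # change of the activation: eps * sum(|w_i|)

print(np.max(np.abs(eta)))             # eps, independent of n
print(shift)                           # grows linearly with n
print(eps * np.mean(np.abs(w)) * n)    # the same value, written as eps * m * n
```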

The above observation can be generalized to nonlinear models [6]. Let \(\theta \) be the parameters of a model, x an input, y the target for x, and \(J(\theta , x, y)\) the cost function. Linearizing the cost function around the current value of \(\theta \), we obtain an optimal perturbation \( \eta = \varepsilon \;\text {sgn}(\nabla _x J(\theta , x, y)).\) This represents an efficient way of generating adversarial examples and is referred to as the fast gradient sign method (FGSM). See Fig. 1 for adversarial images crafted by FGSM on the MNIST data set [7] for a CNN.
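
For illustration, a minimal FGSM sketch in TensorFlow/Keras follows (this is not the CleverHans implementation used later in Sect. 4); `model` is assumed to be a trained Keras classifier with softmax outputs, and x, y a batch of inputs in [0, 1] and one-hot labels.

```python
import tensorflow as tf

loss_object = tf.keras.losses.CategoricalCrossentropy()

def fgsm(model, x, y, eps):
    """Return x + eps * sgn(grad_x J(theta, x, y)), clipped to the valid pixel range."""
    x = tf.convert_to_tensor(x)
    with tf.GradientTape() as tape:
        tape.watch(x)
        loss = loss_object(y, model(x))
    grad = tape.gradient(loss, x)       # gradient of the cost w.r.t. the input
    x_adv = x + eps * tf.sign(grad)     # one step in the direction of the sign
    return tf.clip_by_value(x_adv, 0.0, 1.0)
```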

Fig. 1. Original test examples and corresponding adversarial examples crafted by FGSM with \(\epsilon \) 0.2, 0.3, and 0.4.

Other results on fooling deep and convolutional networks can be found in [8]. This paper studies the generation of images that look like noise or regular patterns using evolutionary algorithms. To generate regular patterns the authors use compositional pattern-producing networks (CPPNs), which have a structure similar to neural networks. A CPPN takes pixel coordinates (x, y) as input and outputs a pixel value; its nodes compute functions such as Gaussian, sine, sigmoid, and linear. The CPPNs are evolved by evolutionary algorithms, and the resulting images are regular patterns that are classified with high confidence as objects from the training set.

In [9] another class of crafting algorithms is proposed, and in [10] a black-box strategy to adversarial attacks is described.

In our paper [11], we examine the vulnerability to adversarial examples across a variety of machine learning methods. We propose a genetic algorithm for generating adversarial examples. Although the evolutionary search for adversarial examples is slower than the techniques described in [5, 6], it enables us to obtain adversarial examples without access to the model's weights. Thus, we have a unified approach for a wide range of machine learning models, including not only neural networks but also support vector machine classifiers (SVMs), decision trees, and possibly others. The only thing this approach needs is the ability to query the classifier to evaluate a given example. See Fig. 2 for adversarial images crafted by our genetic algorithm for a CNN.
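
A simplified black-box sketch in the spirit of such a genetic attack is given below; it is not the exact algorithm from [11]. It only queries a hypothetical `predict_proba` function returning class probabilities for a batch, so no access to weights or gradients is needed, and all parameter values are illustrative.

```python
import numpy as np

def genetic_attack(predict_proba, x, target, pop_size=50, generations=200,
                   eps=0.3, mutation_rate=0.05, seed=0):
    """Evolve a perturbed copy of `x` (pixel values in [0, 1]) toward class `target`."""
    rng = np.random.default_rng(seed)
    pop = np.clip(x + rng.uniform(-eps, eps, size=(pop_size,) + x.shape), 0.0, 1.0)
    for _ in range(generations):
        fitness = predict_proba(pop)[:, target]        # confidence in the target class
        pop = pop[np.argsort(fitness)[::-1]]           # sort: best individuals first
        if np.argmax(predict_proba(pop[:1]), axis=1)[0] == target:
            return pop[0]                              # success: classified as target
        parents = pop[: pop_size // 2]
        # crossover: each child mixes the pixels of two randomly chosen parents
        idx = rng.integers(0, len(parents), size=(pop_size, 2))
        mask = rng.random((pop_size,) + x.shape) < 0.5
        children = np.where(mask, parents[idx[:, 0]], parents[idx[:, 1]])
        # mutation: re-perturb a small fraction of pixels around the original image
        mut = rng.random(children.shape) < mutation_rate
        children[mut] = np.clip(
            np.broadcast_to(x, children.shape)[mut]
            + rng.uniform(-eps, eps, size=mut.sum()), 0.0, 1.0)
        children[0] = pop[0]                           # elitism: keep the best individual
        pop = children
    return pop[0]                                      # best candidate found so far
```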

Fig. 2. Adversarial examples crafted by GA. Images on the first line are all classified as zero by the target CNN, images on the second line as one, etc.

The question of how to make neural networks robust to adversarial examples is dealt with in [12]. The authors tried several methods, from noise injection and Gaussian blurring, through the use of an autoencoder, to a method they call the deep contractive network (which adds to the cost function a regularization term penalizing large changes of activations with respect to changes of the input). However, these methods cure the adversarial examples only to some extent.

Another attempt to prevent adversarial examples is proposed in [13] and is based on distillation, i.e., training another network on the outputs produced by the target network.

3 Deep Networks with RBF Layers

RBF networks [14,15,16,17,18] are neural networks with one hidden layer of RBF units and a linear output layer.

By an RBF unit we mean a neuron with multiple real inputs \(\varvec{x}=(x_1,\ldots ,x_n)\) and one output y. Each unit is determined by an n-dimensional vector \(\varvec{c}\) called its centre. It can have an additional parameter \(\beta > 0\) that determines its width.

The output y is computed as:

$$\begin{aligned} y = \varphi (\xi ); \;\;\;\; \xi = \beta ||\varvec{x}-\varvec{c}|| \end{aligned}$$
(1)

where \(\varphi :{\mathbb R}\rightarrow {\mathbb R}\) is a suitable activation function, typically the Gaussian \(\varphi (z)=e^{-z^2}\).

Thus, the network computes the following function \(\varvec{f}:{\mathbb R}^n\rightarrow {\mathbb R}^m\):

$$\begin{aligned} f_s(\varvec{x}) = \sum _{j=1}^{h} w_{js}\varphi \left( \beta _j \parallel \varvec{x} - \varvec{c_j} \parallel \right) , \end{aligned}$$
(2)

where \(w_{js}\in {\mathbb R}\) are the output weights and \(f_s\) is the output of the s-th output unit.
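
For concreteness, the hidden RBF units of Eq. (2) can be sketched as a custom Keras layer; this is only an illustration under the definitions above, not the implementation [25] used in Sect. 4. The linear output layer of Eq. (2) then corresponds to a dense layer stacked on top of this layer.

```python
import tensorflow as tf

class RBFLayer(tf.keras.layers.Layer):
    """Hidden layer of Gaussian RBF units: exp(-(beta_j * ||x - c_j||)^2)."""

    def __init__(self, units, initial_beta=2.0, **kwargs):
        super().__init__(**kwargs)
        self.units = units
        self.initial_beta = initial_beta

    def build(self, input_shape):
        # centres c_j and widths beta_j are trainable parameters
        self.centers = self.add_weight(
            name="centers", shape=(self.units, int(input_shape[-1])),
            initializer=tf.keras.initializers.RandomUniform(0.0, 1.0),
            trainable=True)
        self.betas = self.add_weight(
            name="betas", shape=(self.units,),
            initializer=tf.keras.initializers.Constant(self.initial_beta),
            trainable=True)

    def call(self, inputs):
        diff = tf.expand_dims(inputs, 1) - self.centers    # (batch, units, dim)
        sq_dist = tf.reduce_sum(tf.square(diff), axis=-1)  # ||x - c_j||^2
        return tf.exp(-tf.square(self.betas) * sq_dist)    # Gaussian activation
```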

The history of RBF networks can be traced back to the 1980s, particularly to the study of interpolation problems in numerical analysis, where radial basis functions were first introduced in the solution of the real multivariate interpolation problem [19, 20].

The RBF networks benefit from a rich spectrum of learning possibilities. The study of these algorithms together with experimental results was also published in our papers [21, 22].

With the boom of deep learning, the popularity of RBF networks has faded. However, we show that they can bring advantages when combined with deep neural networks.

Fig. 3. Deep neural network architecture followed by RBF network.

We introduce a new deep architecture defined as the concatenation of a feedforward deep neural network and an RBF network (see Fig. 3). Let us have a deep neural network DN that realizes a function \(f_{DN}: {\mathbb R}^n \rightarrow {\mathbb R}^m\) and an RBF network RBF that realizes a function \(f_{RBF}: {\mathbb R}^m \rightarrow {\mathbb R}^m\). Feeding the outputs of DN to the inputs of RBF, we obtain a network implementing the function \(f: {\mathbb R}^n \rightarrow {\mathbb R}^m\), where

$$ f(\varvec{x}) = f_{RBF}(f_{DN}(\varvec{x})). $$

For classification tasks we can add a softmax activation function to the output layer of the RBF network.

The training procedure is the following:

  1. train the DN by any appropriate learning algorithm,

  2. set the centres of the RBF part randomly, drawn from the uniform distribution on (0, 1),

  3. set the parameters \(\beta \) to a constant value,

  4. initialize the weights of the RBF output layer to small random values,

  5. retrain the whole network DNRBF by backpropagation.

Since the DN part of the network is already trained, it is usually sufficient to train the whole stacked network for only a few epochs.
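
The procedure can be sketched in Keras as follows, using the hypothetical `RBFLayer` from the sketch above and a placeholder `build_dn()` for the deep network described in Sect. 4; `x_train` and `y_train` are assumed to be the MNIST training images and one-hot labels, and the number of RBF units is an assumption, as the paper does not fix it.

```python
import tensorflow as tf

num_classes = 10
rbf_units = 64        # hypothetical: the number of RBF units is not fixed by the procedure

# 1. train the deep network DN by any appropriate learning algorithm
dn = build_dn()       # placeholder for the MLP or CNN described in Sect. 4
dn.compile(optimizer="rmsprop", loss="categorical_crossentropy", metrics=["accuracy"])
dn.fit(x_train, y_train, epochs=20, batch_size=128)

# 2.-4. append the RBF network: centres drawn from U(0, 1), constant beta,
#       and small random weights in the linear (softmax) output layer
rbf_hidden = RBFLayer(units=rbf_units, initial_beta=2.0)
rbf_output = tf.keras.layers.Dense(
    num_classes, activation="softmax",
    kernel_initializer=tf.keras.initializers.RandomNormal(stddev=0.05))
dnrbf = tf.keras.Sequential([dn, rbf_hidden, rbf_output])

# 5. retrain the whole stacked network DNRBF by backpropagation for a few epochs
dnrbf.compile(optimizer="rmsprop", loss="categorical_crossentropy", metrics=["accuracy"])
dnrbf.fit(x_train, y_train, epochs=3, batch_size=128)
```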

4 Experimental Results

For our experiments we use the FGSM implemented in the CleverHans library [23]. To implement the deep neural networks we use Keras [24] and our own RBF layer implementation [25]. The scripts used for the experiments can be found at [26].

Table 1. Accuracies on legitimate test examples and adversarial examples for MLP and MLPRBF with various initial widths. Average accuracies over 30 runs of the learning algorithm.
Table 2. Accuracies on legitimate test examples and adversarial examples for CNN and CNNRBF with various initial widths. Average accuracies over 30 runs of the learning algorithm.
Fig. 4. Accuracies on legitimate and adversarial data for MLP and MLPRBF with various initial widths.

Fig. 5. Accuracies on legitimate and adversarial data for CNN and CNNRBF with various initial widths.

We have two target architectures: an MLP (two dense hidden layers with 512 ReLU units each and a dense output layer of 10 softmax units) and a CNN (two convolutional layers with 32 3\(\,\times \,\)3 filters, ReLU activations, a 2\(\,\times \,\)2 max pooling layer, a dense layer with 128 ReLU units, and a dense output layer of 10 softmax units).
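
These architectures can be reconstructed in Keras roughly as follows; this is a sketch in which the input shapes assume 28\(\,\times \,\)28 MNIST images and both convolutional layers are assumed to have 32 filters.

```python
import tensorflow as tf

def build_mlp():
    return tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation="relu"),
        tf.keras.layers.Dense(512, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])

def build_cnn():
    return tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, (3, 3), activation="relu", input_shape=(28, 28, 1)),
        tf.keras.layers.Conv2D(32, (3, 3), activation="relu"),
        tf.keras.layers.MaxPooling2D((2, 2)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
```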

These two architectures were trained 30 times by RMSProp, for 20 epochs (MLP) and 12 epochs (CNN). We obtained an average test accuracy of 98.35% for the MLP and 98.97% for the CNN, but only 1.95% (MLP) and 8.49% (CNN) on adversarial data crafted by FGSM from the test set.

To each of the 30 trained networks we added an RBF network and retrained the whole new network for 3 epochs. We found that the results depend on the parameters \(\beta \) of the Gaussians, so we tried several initial setups. The best results were obtained with an initial \(\beta \) of 2.0: on adversarial data, the accuracy was 89.21% for MLPRBF and 74.57% for CNNRBF. The complete results can be found in Tables 1 and 2 and Figs. 4 and 5. They show that adding an RBF network to a deep network may significantly decrease its vulnerability to adversarial examples.

In addition, Table 3 lists the average accuracies on adversarial data crafted by FGSM with different values of \(\epsilon \).

Table 3. Accuracies on adversarial data crafted by FGSM with different \(\epsilon \).

5 Conclusion

In this paper we dealt with the problem of adversarial examples. We have proposed a new deep architecture obtained by stacking a feedforward deep neural network and an RBF network. Only a few learning epochs are needed to retrain the whole stacked network and to reach an accuracy close to that of the original deep neural network. We have shown that the new stacked network is much less vulnerable to adversarial examples than the original one.