Abstract
Despite their impressive performance, deep convolutional neural networks (CNNs) have been shown to be sensitive to small adversarial perturbations. These nuisances, which one can barely notice, are powerful enough to fool sophisticated and well-performing classifiers, leading to absurd misclassification results. In this paper, we analyze the stability of state-of-the-art deep learning classification machines to adversarial perturbations, where we assume that the signals belong to the (possibly multilayer) sparse representation model. We start with convolutional sparsity and then proceed to its multilayered version, which is tightly connected to CNNs. Our analysis links the stability of the classification to noise with the underlying structure of the signal, quantified by the sparsity of its representation under a fixed dictionary. In addition, we offer similar stability theorems for two practical pursuit algorithms, which are posed as two different deep learning architectures: the layered thresholding and the layered basis pursuit. Our analysis establishes the better robustness of the latter to adversarial attacks. We corroborate these theoretical results with numerical experiments on three datasets: MNIST, CIFAR-10 and CIFAR-100.
Notes
Note that in this scheme, the number of iterations for each BP pursuit stage is implicit, hidden by the number of loops to apply. More on this is given in later sections.
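The implicit per-stage iteration count can be made explicit by unrolling each BP stage into a fixed number of ISTA iterations. The following is a minimal sketch under that interpretation, not the authors' implementation; all function names and parameters are illustrative:

```python
import numpy as np

def soft(v, t):
    """Elementwise soft thresholding."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(D, y, lam, n_iter=100):
    """One BP stage: min_g 0.5*||y - D g||_2^2 + lam*||g||_1, solved by ISTA."""
    L = np.linalg.norm(D, 2) ** 2      # Lipschitz constant of the data-fidelity gradient
    g = np.zeros(D.shape[1])
    for _ in range(n_iter):
        g = soft(g - D.T @ (D @ g - y) / L, lam / L)
    return g

def layered_bp(y, dictionaries, lams, n_iter=100):
    """Layered basis pursuit: feed each stage's code into the next pursuit."""
    codes, g = [], y
    for D, lam in zip(dictionaries, lams):
        g = ista(D, g, lam, n_iter)
        codes.append(g)
    return codes
```

With a fixed, small `n_iter`, each stage becomes a small unrolled network, so the number of loops per stage is an explicit hyperparameter rather than a hidden one.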
Locally bounded noise results exist for the CSC as well [22], and can be leveraged in a similar fashion.
References
Aberdam, A., Sulam, J., Elad, M.: Multi-layer sparse coding: the holistic way. SIAM J. Math. Data Sci. 1(1), 46–77 (2019)
Bibi, A., Ghanem, B., Koltun, V., Ranftl, R.: Deep layers as stochastic solvers. In: International Conference on Learning Representations (2019)
Bishop, C.: Neural Networks for Pattern Recognition. Oxford University Press, Oxford (1995)
Bredensteiner, E.J., Bennett, K.P.: Multicategory classification by support vector machines. In: Computational Optimization, pp. 53–79. Springer, Berlin (1999)
Candes, E.J.: The restricted isometry property and its implications for compressed sensing. C.R. Math. 346(9–10), 589–592 (2008)
Elad, M.: Sparse and Redundant Representations: From Theory to Applications in Signal and Image Processing, 1st edn. Springer, Berlin (2010)
Fawzi, A., Fawzi, H., Fawzi, O.: Adversarial vulnerability for any classifier. arXiv preprint arXiv:1802.08686 (2018)
Fawzi, A., Fawzi, O., Frossard, P.: Analysis of classifiers’ robustness to adversarial perturbations. Mach. Learn. 107(3), 481–508 (2018)
Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning, vol. 1. MIT Press, Cambridge (2016)
Goodfellow, I.J., Shlens, J., Szegedy, C.: Explaining and harnessing adversarial examples. ICLR (2015)
Gregor, K., LeCun, Y.: Learning fast approximations of sparse coding. In: Proceedings of the 27th International Conference on Machine Learning (ICML-10), pp. 399–406 (2010)
Krizhevsky, A., Nair, V., Hinton, G.: The CIFAR-10 dataset. online: http://www.cs.toronto.edu/kriz/cifar.html (2014)
Kurakin, A., Goodfellow, I., Bengio, S.: Adversarial examples in the physical world. arXiv preprint arXiv:1607.02533 (2016)
LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436–444 (2015)
LeCun, Y., Cortes, C., Burges, C.J.: MNIST handwritten digit database. AT&T Labs [Online]. Available: http://yann.lecun.com/exdb/mnist, 2 (2010)
Liao, F., Liang, M., Dong, Y., Pang, T., Zhu, J., Hu, X.: Defense against adversarial attacks using high-level representation guided denoiser. In: IEEE-CVPR (2018)
Liu, Y., Chen, X., Liu, C., Song, D.: Delving into transferable adversarial examples and black-box attacks. In: ICLR (2017)
Mahdizadehaghdam, S., Panahi, A., Krim, H., Dai, L.: Deep dictionary learning: a parametric network approach. arXiv preprint arXiv:1803.04022 (2018)
Mairal, J., Bach, F., Ponce, J.: Sparse modeling for image and vision processing. arXiv preprint arXiv:1411.3230 (2014)
Cisse, M., Bojanowski, P., Grave, E., Dauphin, Y., Usunier, N.: Parseval networks: improving robustness to adversarial examples. In: ICML (2017)
Papyan, V., Romano, Y., Elad, M.: Convolutional neural networks analyzed via convolutional sparse coding. J. Mach. Learn. Res. 18(83), 1–52 (2017)
Papyan, V., Sulam, J., Elad, M.: Working locally thinking globally: theoretical guarantees for convolutional sparse coding. IEEE Trans. Signal Process. 65(21), 5687–5701 (2017)
Sokolić, J., Giryes, R., Sapiro, G., Rodrigues, M.R.D.: Robust large margin deep neural networks. IEEE Trans. Signal Process. 65(16), 4265–4280 (2016)
Sulam, J., Aberdam, A., Beck, A., Elad, M.: On multi-layer basis pursuit, efficient algorithms and convolutional neural networks. IEEE Trans. Pattern Anal. Mach. Intell. (2019)
Sulam, J., Papyan, V., Romano, Y., Elad, M.: Multilayer convolutional sparse modeling: pursuit and dictionary learning. IEEE Trans. Signal Process. 66(15), 4090–4104 (2018)
Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Dumitru, E., Goodfellow, I., Fergus, R.: Intriguing properties of neural networks. In: ICLR (2014)
Zeiler, M.D., Krishnan, D., Taylor, G.W., Fergus, R.: Deconvolutional networks. In: IEEE-CVPR (2010)
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Y. Romano and A. Aberdam contributed equally to this work.
The research leading to these results has received funding from the Technion Hiroshi Fujiwara Cyber Security Research Center and the Israel Cyber Directorate, and from Israel Science Foundation (ISF) grant no. 335/18.
Y. R. thanks the Zuckerman Institute, ISEF Foundation and the Viterbi Fellowship, Technion, for supporting this research.
Appendices
Appendix A: Proof of Theorem 7: Stable Binary Classification of the CSC Model
Theorem 5
(Stable binary classification of the CSC model) Suppose we are given a CSC signal \( {\mathbf {X}}\), \( \Vert {\varvec{\Upgamma }}\Vert _{0,\infty }^{\scriptscriptstyle {{\mathbf {S}}}}\le k \), contaminated with perturbation \( {\mathbf {E}}\) to create the signal \( {\mathbf {Y}}= {\mathbf {X}}+ {\mathbf {E}}\), such that \( \Vert {\mathbf {E}}\Vert _{2} \le \epsilon \). Suppose further that \( {\mathcal {O}}_{{\mathcal {B}}}^* > 0 \) and denote by \({\hat{{\varvec{\Upgamma }}}}\) the solution of the \( {\text {P}_{0,\infty }^{{\varvec{{\mathcal {E}}}}}}\) problem. Assuming that \( \delta _{2k} < 1 - \left( \frac{2{\left\| {\mathbf {w}}\right\| _2}\epsilon }{{\mathcal {O}}_{{\mathcal {B}}}^*}\right) ^2, \) then \( sign(f({\mathbf {X}})) = sign(f({\mathbf {Y}}))\).
Considering the more conservative bound that relies on \( \mu ({\mathbf {D}}) \), and assuming that
$$\begin{aligned} k < \frac{1}{2} \left( 1 + \frac{1}{\mu ({\mathbf {D}})} \left( 1 - \left( \frac{2\Vert {\mathbf {w}}\Vert _2\epsilon }{{\mathcal {O}}_{{\mathcal {B}}}^*}\right) ^2 \right) \right) , \end{aligned}$$
then \( sign(f({\mathbf {X}})) = sign(f({\mathbf {Y}}))\).
Proof
Without loss of generality, consider the case where \( {\mathbf {w}}^T{\varvec{\Upgamma }}+ \omega > 0 \), i.e., the original signal \( {\mathbf {X}}\) is assigned to class \( y = 1 \). Our goal is to show that \( {\mathbf {w}}^T{\hat{{\varvec{\Upgamma }}}} + \omega > 0 \). We start by manipulating the latter expression as follows:
where the first inequality relies on the relation \( a + b \ge a - |b| \) for \( a > 0 \), and the last step follows from the Cauchy–Schwarz inequality. Using the SRIP [22] and the fact that both \( \Vert {\mathbf {Y}}- {\mathbf {D}}{\varvec{\Upgamma }}\Vert _2 \le \epsilon \) and \( \Vert {\mathbf {Y}}- {\mathbf {D}}{\hat{{\varvec{\Upgamma }}}}\Vert _2 \le \epsilon \), we get
$$\begin{aligned} (1-\delta _{2k}) \Vert {\varvec{\Upgamma }}- {\hat{{\varvec{\Upgamma }}}}\Vert _2^2 \le \Vert {\mathbf {D}}{\varvec{\Upgamma }}- {\mathbf {D}}{\hat{{\varvec{\Upgamma }}}}\Vert _2^2 \le 4\epsilon ^2. \end{aligned}$$
Thus,
$$\begin{aligned} \Vert {\varvec{\Upgamma }}- {\hat{{\varvec{\Upgamma }}}}\Vert _2 \le \frac{2\epsilon }{\sqrt{1-\delta _{2k}}}. \end{aligned}$$
Combining the above with Eq. (4) leads to (recall that y = 1):
Using the definition of the score of our classifier, satisfying
we get
We are now after the condition for \( {\mathcal {O}}_{{\mathcal {B}}}({\mathbf {Y}},y) > 0\), and so we require:
where we relied on the fact that \( {\mathcal {O}}_{{\mathcal {B}}}({\mathbf {X}},y) \ge {\mathcal {O}}_{{\mathcal {B}}}^* \). The above inequality leads to
$$\begin{aligned} \delta _{2k} < 1 - \left( \frac{2\Vert {\mathbf {w}}\Vert _2\epsilon }{{\mathcal {O}}_{{\mathcal {B}}}^*}\right) ^2. \end{aligned}$$
Next we turn to develop the condition that relies on \( \mu ({\mathbf {D}}) \). We shall use the relation between the SRIP and the mutual coherence [22], given by \( \delta _{2k} \le (2k-1)\mu ({\mathbf {D}}) \) for all \( k < \frac{1}{2} \left( 1 + \frac{1}{\mu ({\mathbf {D}})}\right) \). Plugging this bound into Eq. (5) results in
$$\begin{aligned} k < \frac{1}{2} \left( 1 + \frac{1}{\mu ({\mathbf {D}})} \left( 1 - \left( \frac{2\Vert {\mathbf {w}}\Vert _2\epsilon }{{\mathcal {O}}_{{\mathcal {B}}}^*}\right) ^2 \right) \right) , \end{aligned}$$
which completes our proof. \(\square \)
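A quick numerical sanity check of this result, under strong simplifying assumptions: a unitary dictionary, for which \( \delta _{2k} = 0 \) and the pursuit is exact, so the condition reduces to \( 2\Vert {\mathbf {w}}\Vert _2\epsilon < {\mathcal {O}}_{{\mathcal {B}}}^* \). Variable names are illustrative, not the paper's experimental setup:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64
# Unitary dictionary: delta_2k = 0, and the representation of any signal is
# recovered exactly by D^T, so ||Gamma - Gamma_hat||_2 = ||E||_2 <= eps.
D, _ = np.linalg.qr(rng.standard_normal((n, n)))
gamma = np.zeros(n)
support = rng.choice(n, size=5, replace=False)
gamma[support] = 3.0 + rng.standard_normal(5)
x = D @ gamma

w = rng.standard_normal(n)
omega = 0.0
margin = abs(w @ gamma + omega)   # plays the role of O_B^* for this one signal

# Choose eps so that the reduced condition 2*||w||_2*eps < margin holds.
eps = 0.9 * margin / (2 * np.linalg.norm(w))
e = rng.standard_normal(n)
e *= eps / np.linalg.norm(e)      # perturbation of norm exactly eps
y = x + e

gamma_hat = D.T @ y               # exact pursuit in the unitary case
assert np.sign(w @ gamma_hat + omega) == np.sign(w @ gamma + omega)
```

Here \( |{\mathbf {w}}^T({\hat{{\varvec{\Upgamma }}}}-{\varvec{\Upgamma }})| \le \Vert {\mathbf {w}}\Vert _2\epsilon < {\mathcal {O}}_{{\mathcal {B}}}^*/2 \), so the score cannot change sign, matching the theorem.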
Appendix B: Proof of Theorem 9: Stable Multi-class Classification of the CSC Model
Theorem 7
(Stable multi-class classification of the CSC model) Suppose we are given a CSC signal \( {\mathbf {X}}\), \( \Vert {\varvec{\Upgamma }}\Vert _{0,\infty }^{\scriptscriptstyle {{\mathbf {S}}}}\le k \), contaminated with perturbation \( {\mathbf {E}}\) to create the signal \( {\mathbf {Y}}= {\mathbf {X}}+ {\mathbf {E}}\), such that \( \Vert {\mathbf {E}}\Vert _{2} \le \epsilon \). Suppose further that \( f_u({\mathbf {X}}) = {\mathbf {w}}_u^T{\varvec{\Upgamma }}+ \omega _u \) correctly assigns \( {\mathbf {X}}\) to class \( y = u \). Suppose further that \( {\mathcal {O}}_{{\mathcal {M}}}^* > 0 \), and denote by \({\hat{{\varvec{\Upgamma }}}}\) the solution of the \( {\text {P}_0^{{\varvec{{\mathcal {E}}}}}}\) problem. Assuming that \( \delta _{2k} < 1 - \left( \frac{2\phi ({\mathbf {W}})\epsilon }{{\mathcal {O}}_{{\mathcal {M}}}^*}\right) ^2, \) then \( {\mathbf {Y}}\) will be assigned to the correct class.
Considering the more conservative bound that relies on \( \mu ({\mathbf {D}}) \) and assuming that
$$\begin{aligned} k < \frac{1}{2} \left( 1 + \frac{1}{\mu ({\mathbf {D}})} \left( 1 - \left( \frac{2\phi ({\mathbf {W}})\epsilon }{{\mathcal {O}}_{{\mathcal {M}}}^*}\right) ^2 \right) \right) , \end{aligned}$$
then \( {\mathbf {Y}}\) will be assigned to the correct class.
Proof
Given that \( f_u({\varvec{\Upgamma }}) = {\mathbf {w}}_u^T{\varvec{\Upgamma }}+ \omega _u > f_v({\varvec{\Upgamma }}) = {\mathbf {w}}_v^T{\varvec{\Upgamma }}+ \omega _v \) for all \( v \ne u \), i.e., \( {\mathbf {X}}\) belongs to class \( y = u \), we shall prove that \( f_u({\hat{{\varvec{\Upgamma }}}}) > f_v({\hat{{\varvec{\Upgamma }}}}) \) for all \( v \ne u \). Denoting \( \varDelta = {\hat{{\varvec{\Upgamma }}}} - {\varvec{\Upgamma }}\), we bound from below the difference \( f_u({\hat{{\varvec{\Upgamma }}}}) - f_v({\hat{{\varvec{\Upgamma }}}})\) as follows:
Similarly to the proof of Theorem 7, the first inequality holds since \( a + b \ge a - |b| \) for \( a = f_u({\varvec{\Upgamma }}) - f_v({\varvec{\Upgamma }}) > 0 \), and the last inequality relies on the Cauchy–Schwarz inequality. Relying on \( \phi ({\mathbf {W}}) \), which satisfies \( \Vert {\mathbf {w}}_u - {\mathbf {w}}_v\Vert _2 \le \phi ({\mathbf {W}}) \) for all \( v \ne u \),
and plugging \(\Vert \varDelta \Vert _2^2 \le \frac{4\epsilon ^2}{1-\delta _{2k}} \) into Eq. (6) we get
where the second to last inequality holds since \( f_u({\varvec{\Upgamma }}) - f_v({\varvec{\Upgamma }}) \ge {\mathcal {O}}_{{\mathcal {M}}}({{\mathbf {X}}},y)\), and the last inequality follows from the definition of \( {\mathcal {O}}_{{\mathcal {M}}}^* \). As such, we require the following inequality to hold:
$$\begin{aligned} {\mathcal {O}}_{{\mathcal {M}}}^* - \frac{2\phi ({\mathbf {W}})\epsilon }{\sqrt{1-\delta _{2k}}} > 0, \end{aligned}$$
which is equivalent to the condition \( \delta _{2k} < 1 - \left( \frac{2\phi ({\mathbf {W}})\epsilon }{{\mathcal {O}}_{{\mathcal {M}}}^*}\right) ^2 \) stated in the theorem.
Similarly to the binary setting, one can readily write the above in terms of \( \mu ({\mathbf {D}}) \). \(\square \)
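As a toy numerical illustration of the multi-class argument (not the paper's experiment; names are illustrative, and \( \phi ({\mathbf {W}}) \) is taken here as the largest pairwise row-difference norm, one quantity obeying the Cauchy–Schwarz step above): any code perturbation of norm below the margin divided by \( \phi ({\mathbf {W}}) \) leaves the argmax unchanged.

```python
import numpy as np

rng = np.random.default_rng(1)
m, c = 32, 4                          # code length, number of classes
W = rng.standard_normal((c, m))       # rows are the per-class weight vectors w_v
b = np.zeros(c)
gamma = rng.standard_normal(m)

scores = W @ gamma + b
u = int(np.argmax(scores))
margin = scores[u] - np.max(np.delete(scores, u))   # plays the role of O_M(X, y)

# phi(W): largest pairwise row-difference norm, bounding (w_u - w_v)^T delta
# via Cauchy-Schwarz.
phi = max(np.linalg.norm(W[i] - W[j])
          for i in range(c) for j in range(c) if i != j)

# Any code perturbation with ||delta||_2 < margin / phi keeps the argmax.
delta = rng.standard_normal(m)
delta *= 0.9 * margin / (phi * np.linalg.norm(delta))
assert int(np.argmax(W @ (gamma + delta) + b)) == u
```

For every \( v \ne u \), the score gap shrinks by at most \( \phi ({\mathbf {W}})\Vert \varDelta \Vert _2 < \) margin, so the winning class survives the perturbation.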
Appendix C: Proof of Theorem 12: Stable Binary Classification of the L-THR
Theorem 10
(Stable binary classification of the L-THR) Suppose we are given an ML-CSC signal \( {\mathbf {X}}\) contaminated with perturbation \( {\mathbf {E}}\) to create the signal \( {\mathbf {Y}}= {\mathbf {X}}+ {\mathbf {E}}\), such that \(\Vert {\mathbf {E}}\Vert _{2,\infty }^{\scriptscriptstyle {{\mathbf {P}}}}\le \epsilon _0\). Denote by \(|\varGamma _i^{\text {min}}|\) and \(|\varGamma _i^{\text {max}}|\) the lowest and highest entries in absolute value in the vector \({\varvec{\Upgamma }}_i\), respectively. Suppose further that \( {\mathcal {O}}_{{\mathcal {B}}}^* > 0 \) and let \(\{{\hat{{\varvec{\Upgamma }}}}_i\}_{i=1}^{K}\) be the set of solutions obtained by running the layered soft thresholding algorithm with thresholds \(\{\beta _i\}_{i=1}^{K}\), i.e., \({\hat{{\varvec{\Upgamma }}}}_i={\mathcal {S}}_{\beta _i}({\mathbf {D}}_i^T{\hat{{\varvec{\Upgamma }}}}_{i-1})\), where \( {\mathcal {S}}_{\beta _i} \) is the soft thresholding operator and \({\hat{{\varvec{\Upgamma }}}}_{0}={\mathbf {Y}}\). Assuming that \(\forall \ 1 \le i \le K\)
- a.
\(\Vert {\varvec{\Upgamma }}_i \Vert _{0,\infty }^{\scriptscriptstyle {{\mathbf {S}}}}< \frac{1}{2} \left( 1 + \frac{1}{\mu ({\mathbf {D}}_i)} \frac{ |\varGamma _i^{\text {min}}| }{ |\varGamma _i^{\text {max}}| } \right) - \frac{1}{\mu ({\mathbf {D}}_i)}\frac{ \epsilon _{i-1} }{|\varGamma _i^{\text {max}}|}\);
- b.
The threshold \(\beta _i\) is chosen according to
$$\begin{aligned} |{\varvec{\Upgamma }}_i^{\text {min}}| - C_i - \epsilon _{i-1}> \beta _i > K_i + \epsilon _{i-1}, \end{aligned}$$where
$$\begin{aligned} \begin{aligned} C_i= & {} ( \Vert {\varvec{\Upgamma }}_i \Vert _{0,\infty }^{\scriptscriptstyle {{\mathbf {S}}}}- 1 ) \mu ({\mathbf {D}}_i) |{\varvec{\Upgamma }}_i^{\text {max}}|, \\ K_i= & {} \Vert {\varvec{\Upgamma }}_i \Vert _{0,\infty }^{\scriptscriptstyle {{\mathbf {S}}}}\mu ({\mathbf {D}}_i) |{\varvec{\Upgamma }}_i^{\text {max}}|, \\ \epsilon _i= & {} \sqrt{ \Vert {\varvec{\Upgamma }}_{i} \Vert _{0,\infty }^{\scriptscriptstyle {{\mathbf {P}}}}} \ \Big ( \epsilon _{i-1} + C_i + \beta _i \Big ); \end{aligned} \end{aligned}$$and
- c.
\({\mathcal {O}}_{{\mathcal {B}}}^* > \Vert {\mathbf {w}}\Vert _2\sqrt{\Vert {\varvec{\Upgamma }}_K\Vert _0} \Big (\epsilon _{K-1} +C_K + \beta _K\Big )\),
then \( sign(f({\mathbf {Y}})) = sign(f({\mathbf {X}}))\).
Proof
Following Theorem 10 in [22], if assumptions (a)–(c) above hold \(\forall \ 1 \le i \le K\) then
- 1.
The support of the solution \({\hat{{\varvec{\Upgamma }}}}_i\) is equal to that of \({\varvec{\Upgamma }}_i\); and
- 2.
\(\Vert {\varvec{\Upgamma }}_i - {\hat{{\varvec{\Upgamma }}}}_i \Vert _{2,\infty }^{\scriptscriptstyle {{\mathbf {P}}}}\le \epsilon _i\), where \(\epsilon _i\) is defined above.
In particular, the last layer satisfies
$$\begin{aligned} \Vert {\varvec{\Upgamma }}_K - {\hat{{\varvec{\Upgamma }}}}_K \Vert _{\infty } \le \epsilon _{K-1} + C_K + \beta _K. \end{aligned}$$
Defining \( \varDelta = {\hat{{\varvec{\Upgamma }}}}_K - {\varvec{\Upgamma }}_K \), we get
$$\begin{aligned} \Vert \varDelta \Vert _2 \le \sqrt{\Vert \varDelta \Vert _0} \, \Vert \varDelta \Vert _\infty = \sqrt{\Vert {\varvec{\Upgamma }}_K\Vert _0} \, \Vert \varDelta \Vert _\infty \le \sqrt{\Vert {\varvec{\Upgamma }}_K\Vert _0} \ \Big ( \epsilon _{K-1} + C_K + \beta _K \Big ), \end{aligned}$$
where the equality relies on the successful recovery of the support. Having this upper bound on \( \Vert \varDelta \Vert _2 \), one can follow the transition from Eqs. (4) to (5) (see the proof of Theorem 7), leading to the following requirement for accurate classification:
Plugging Eq. (7) into the above expression results in the additional condition that ties the propagated error throughout the layers to the output margin, given by
$$\begin{aligned} {\mathcal {O}}_{{\mathcal {B}}}^* > \Vert {\mathbf {w}}\Vert _2\sqrt{\Vert {\varvec{\Upgamma }}_K\Vert _0} \Big (\epsilon _{K-1} + C_K + \beta _K\Big ). \end{aligned}$$
\(\square \)
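For reference, the L-THR forward pass analyzed in this appendix, \( {\hat{{\varvec{\Upgamma }}}}_i = {\mathcal {S}}_{\beta _i}({\mathbf {D}}_i^T {\hat{{\varvec{\Upgamma }}}}_{i-1}) \), admits a compact sketch. This is illustrative code, not the authors' implementation:

```python
import numpy as np

def soft_threshold(v, beta):
    """Soft thresholding operator: S_beta(v) = sign(v) * max(|v| - beta, 0)."""
    return np.sign(v) * np.maximum(np.abs(v) - beta, 0.0)

def layered_thresholding(y, dictionaries, betas):
    """Layered soft thresholding (L-THR): Gamma_i = S_{beta_i}(D_i^T Gamma_{i-1}),
    starting from Gamma_0 = y. Returns the estimates of all layers."""
    gamma = y
    estimates = []
    for D, beta in zip(dictionaries, betas):
        gamma = soft_threshold(D.T @ gamma, beta)
        estimates.append(gamma)
    return estimates
```

Each layer is a linear operator followed by a shrinkage nonlinearity, which is the structural analogy to a CNN forward pass exploited throughout the paper.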
Cite this article
Romano, Y., Aberdam, A., Sulam, J. et al. Adversarial Noise Attacks of Deep Learning Architectures: Stability Analysis via Sparse-Modeled Signals. J Math Imaging Vis 62, 313–327 (2020). https://doi.org/10.1007/s10851-019-00913-z