1 Introduction

The high predictive power of neural network classifiers makes them the method of choice to tackle challenging classification problems in many areas. However, questions regarding the robustness of their performance under slight input perturbations still remain open, severely limiting the applicability of deep neural network classifiers to sensitive tasks that require certification of the obtained results.

In recent years this issue has gained a lot of attention, resulting in a large variety of methods tackling tasks ranging from adversarial attacks and defenses against these to robustness verification and robust training. In this work we focus on robustness verification, that is, computing the distance from a given anchor point \(x^{0}\) in the input space to its closest adversarial, i.e. a point that is assigned a different class label by the network. This problem plays a fundamental role in understanding the behavior of deep classifiers and essentially provides the only reliable way to assess classifier robustness. Unfortunately, the problem is computationally hard. For deep classifiers with ReLU activation the verification problem can equivalently be reformulated as a mixed integer programming (MIP) task and was shown to be NP-complete by Katz et al. (2017), so no polynomial-time algorithm exists unless P = NP. Even worse, Weng et al. (2018) showed that an approximation of the minimum adversarial perturbation of a certain (high) quality cannot be found within polynomial time.

Related work There exist two streams of related work on robustness verification of deep ReLU classifiers. This categorization is based on whether they are solving the verification problem exactly or verifying a bound on the distance to the decision boundary (DtDB).

The first group of methods are exact verification approaches. As mentioned above, the verification task can be modeled using MIP techniques. Katz et al. (2017) present a modification of the simplex algorithm, based on satisfiability modulo theories (SMT), that can be used to solve the verification task exactly for smaller ReLU networks. Other approaches (Ehlers 2017) rely on SMT solvers when solving the described task. Bunel et al. (2018) provide an overview and comparison of those. Other exact methods (Dutta et al. 2018; Lomuscio and Maganti 2017; Tjeng et al. 2017) deploy MIP solvers together with presolving to find a tight formulation of the MIP problem, while Jordan et al. (2018) use an algorithm to find the largest ball around the anchor point that touches the decision boundary.

The second popular class of methods for verifying classifier robustness deals with verification of an \(\epsilon\)-neighborhood: given an anchor point \(x^{0}\) and an \(\epsilon >0\), the task is to verify whether an adversarial point exists within the \(\epsilon\)-neighborhood of \(x^{0}\), which is defined with respect to a certain norm in the input space. All existing methods relax the initial problem and require bounds on activation inputs in each layer. These bounds should be as tight as possible to ensure good final results. Raghunathan et al. (2018a, b) and Dvijotham et al. (2018, 2019) consider semidefinite (SDP) and linear (LP) problems as relaxations of the \(\epsilon\)-verification problem. Wong and Kolter (2018) replace ReLU constraints by linear constraints and consider the dual formulation of the obtained LP relaxation. Weng et al. (2018) present an approach that also uses linear functions (later extended to quadratic functions by Zhang et al. 2018) to deal with nonlinear activation functions and propagate the layer-wise output bounds until the final layer. Salman et al. (2019) provide a unifying framework for the approaches using neuron-wise relaxations of the activation functions and use the best possible convex relaxation. Finally, Hein and Andriushchenko (2017) and Tsuzuku et al. (2018) use the Lipschitz constant of the transformations within the classifier’s architecture.

Our approach belongs to the same group of the inexact verifiers, but deals with constructing lower bounds on DtDB without necessarily restricting admissible adversarial points to a given neighborhood. Croce et al. (2019) leverage the piecewise affine nature of the outputs of a ReLU classifier and compute lower bounds on DtDB by assuming that the classifier behaves globally the same way it does in the linear region around the given anchor point. The \(\epsilon\)-verification task is closely related to this problem since each \(\epsilon\)-neighborhood that is certified as adversarial-free immediately provides a lower bound on the minimal adversarial perturbation magnitude. It is also a common strategy for the \(\epsilon\)-verification methods to use a binary search or a Newton method on top of their algorithm to find the largest \(\epsilon\) such that the \(\epsilon\)-neighborhood around \(x^{0}\) is still successfully verified as robust.

Adversarial attacks Constructing misclassified examples that are close to the anchor point can be considered as a complementary research direction to robustness verification since each adversarial example by definition provides an upper bound on the DtDB. Many methods were proposed to construct such points (Szegedy et al. 2014; Goodfellow et al. 2015; Kurakin et al. 2016; Papernot et al. 2016; Madry et al. 2017; Carlini and Wagner 2017).

Robust training The question of how to actually train a robust classifier is closely related to robustness verification since the latter might allow us to construct some type of robust loss based on the insights from the verification procedure (Hein and Andriushchenko 2017; Madry et al. 2017; Wong and Kolter 2018; Raghunathan et al. 2018a; Tsuzuku et al. 2018; Wang et al. 2018; Croce et al. 2019). We leave this direction for future work.

1.1 Contributions

  1. We propose a novel relaxation of the DtDB problem in the form of a QP task allowing efficient computation of high quality lower bounds on the DtDB in \(l_{2}\)-norm with an extension to \(l_{\infty}\)-norm. We reach state-of-the-art performance for dense and convolutional networks compared to the bounds obtained from methods based on LP relaxations (CROWN by Zhang et al. 2018 and ConvAdv by Wong and Kolter 2018). Furthermore, our method performs much faster than methods based on SDP relaxations (Raghunathan et al. 2018b), while providing smaller lower bounds. This is a fundamental property due to the difference in computational complexity between SDP and QP tasks.

  2. Unlike \(\epsilon\)-verification techniques, we provide a lower bound on DtDB without an initial guess and without computing bounds for the neuron activation values in each layer. If additional information is present allowing the user to bound the distance to any admissible adversarial point from above, we incorporate these upper bounds in our formulation to verify larger regions around the anchor point. Such bounds have to be tight enough to verify non-trivial neighborhoods and play an important role in other relaxation techniques such as the SDP based approaches by Raghunathan et al. (2018b) and Dvijotham et al. (2019). We describe an efficient search method for pre-activation bounds resulting in larger verified regions based on sequential convex quadratic programming (QP).

  3. To analyze the gap in the optimal objective function value between the initial DtDB problem and our relaxation we establish a connection of DtDB’s dual problem to our QP task. It allows us to deconstruct this gap into two components. Moreover, we discuss how we improve the QP formulation to close the gap to DtDB and how we bound one of its components.

The remainder of this paper is organized as follows. In Sect. 2 we introduce the necessary notation. In Sect. 3.1 we formally define the problem of finding the smallest adversarial perturbation and in Sect. 3.2 introduce its QP relaxation QPRel. There we also formulate the dual DtDB problem as the best convex QP relaxation. In Sect. 3.3 we introduce additional linear constraints using bounds on the region of the admissible points around \(x^{0}\) and summarize our verification procedure. In Sect. 4 we compare our approach to the LP- and SDP-based competitors. We summarize our findings in Sect. 5 and discuss the directions for future work.

2 Notation and idea

We consider a neural network consisting of L linear transformations representing dense, convolutional, skip or average pooling layers and \(L-1\) ReLU activations (no ReLU after the last linear transformation). The number of neurons in layer l is denoted as \(n_{l}\) for \(l=0,\ldots ,L\), meaning that the data has \(n_{0}\) features and \(n_{L}\) classes. Furthermore, we present our analysis for the \(l_{2}\)-norm as perturbation measure since only a few available methods are applicable to this setting. To make our method comparable with the approach by Raghunathan et al. (2018b) we propose a generalization to the \(l_{\infty} \)-setting as well.

Given sample \(x^{0}\in {\mathbb {R}}^{n_{0}}\), weight matrices \(W^{l}\in {\mathbb {R}}^{n_{l}\times n_{l-1}}\), and bias vectors \(b^{l}\in {\mathbb {R}}^{n_{l}}\), we define the output of the ith neuron in the lth layer after the ReLU activation as

$$\begin{aligned}&x^{l}_{i} = \left[ W^{l}_{i}x^{l-1} + b^{l}_{i}\right] _{+}\text { and} \\&f_{i}(x^{0}) = x^{L}_{i} = W^{L}_{i}x^{L-1} + b^{L}_{i}, \end{aligned}$$
(1)

where \(\left[ x \right] _{+}\) is the positive part of x and \(f(x^{0})=x^{L}\) denotes the output of the complete forward pass through the network. We start with the observation that for each pair of scalars x and y the following holds (also used by Raghunathan et al. 2018b; Dvijotham et al. 2019 for \(\epsilon\)-verification).

$$x=\left[ y\right] _{+} \Longleftrightarrow x\ge 0,\quad x-y\ge 0,\quad x(x-y)=0.$$
(2)

This relation allows us to obtain an optimization problem with linear complementarity constraints.
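Relation (2) can be checked numerically. The following is a minimal sketch, not part of the method itself; the helper names `relu` and `satisfies_complementarity` and the sample values are assumptions made for illustration.

```python
import numpy as np

def relu(y):
    return np.maximum(y, 0.0)

def satisfies_complementarity(x, y, tol=1e-12):
    """Right-hand side of (2): x >= 0, x - y >= 0 and x * (x - y) = 0."""
    return bool(x >= -tol and x - y >= -tol and abs(x * (x - y)) <= tol)

rng = np.random.default_rng(0)
ys = rng.normal(size=1000)

# Forward direction: x = [y]_+ satisfies all three conditions.
forward_ok = all(satisfies_complementarity(relu(y), y) for y in ys)

# Without the product condition, points above the ReLU graph are admitted.
# This is the kind of slack a relaxation of the equality constraint allows.
x_slack, y_slack = 1.5, 1.0
slack_feasible = x_slack >= 0 and x_slack - y_slack >= 0
slack_is_relu = satisfies_complementarity(x_slack, y_slack)
```

The pair (1.5, 1.0) satisfies both inequalities but not the product condition, so it is not a ReLU input/output pair.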

3 Verification as an optimization task

3.1 Formulation of DtDB

For a given sample \(\tilde{x}^{0}\), pre-trained neural network f, predicted label \(\tilde{y}\) and adversarial label y we aim to find the closest point to \(\tilde{x}^{0}\) in \({\mathbb {R}}^{n_{0}}\) that has a larger or equal probability of being classified as y compared to the initial label. This task corresponds to the following optimization problem.

$$\min _{x^{0}\in {\mathbb {R}}^{n_{0}}} \Vert x^{0} - \tilde{x}^{0}\Vert ^{2},\quad \text { s.t. } (e_{\tilde{y}} - e_{y})^{T}f(x^{0}) \le 0,$$
(DtDB)

where \(e_{i}\) is the ith unit vector in \({\mathbb {R}}^{n_{L}}\) and \(\Vert x\Vert\) denotes the Euclidean norm of x. To compute the distance from \(\tilde{x}^{0}\) to the (full) decision boundary, one needs to compute the solution for all adversarial labels \(y=1,\ldots ,n_{L}\) except \(\tilde{y}\). Next we unfold the above optimization problem using (1), where x denotes a container with all variables \(x^{0},\ldots ,x^{L}\) and [L] is the set \(\{1,\ldots ,L\}\).

$$\begin{aligned}\min _{x\in {\mathbb {R}}^{n}} \Vert x^{0} - \tilde{x}^{0}\Vert ^{2},\quad \text { s.t. }& (e_{\tilde{y}} - e_{y})^{T}x^{L} \le 0, \quad x^{L} = W^{L}x^{L-1} + b^{L}\\&x^{l} = ReLU(W^{l}x^{l-1} + b^{l})\quad \text { for }l\in [L-1]. \end{aligned}$$

We apply (2) to reformulate the problem and eliminate \(x^{L}\), such that from now on \(n=n_{0}+\cdots +n_{L-1}\) and x contains only the remaining variables \(x^{0},\ldots ,x^{L-1}\).

$$\min _{x\in {\mathbb {R}}^{n}} \Vert x^{0} - \tilde{x}^{0}\Vert ^{2},\quad \text { s.t. } (e_{\tilde{y}} - e_{y})^{T}\left( W^{L}x^{L-1} + b^{L} \right) \le 0,$$
(DtDB)
$$\begin{aligned}&\left( x^{l} \right) ^{T}\left( x^{l} - \left( W^{l}x^{l-1} + b^{l}\right) \right) = 0\quad \text { for }l\in [L-1], \end{aligned}$$
(3)
$$\begin{aligned}&x^{l} - \left( W^{l}x^{l-1} + b^{l}\right) \ge 0,\quad x^{l} \ge 0\quad \text { for }l\in [L-1]. \end{aligned}$$
(4)

3.2 QP relaxation

To get rid of the quadratic equality constraints (3) we consider a Lagrangian relaxation of DtDB:

$$\begin{aligned}\min _{x\in {\mathbb {R}}^{n}} \Vert x^{0} - \tilde{x}^{0}\Vert ^{2} + c(x,\lambda ),\quad \text { s.t. } & (e_{\tilde{y}} - e_{y})^{T}\left( W^{L}x^{L-1} + b^{L} \right) \le 0, \\&x^{l} - \left( W^{l}x^{l-1} + b^{l}\right) \ge 0,\quad x^{l} \ge 0\quad \text { for }l\in [L-1], \end{aligned}$$
(QPRel)

where for arbitrary vectors \(x^{0}\in {\mathbb {R}}^{n_{0}},\ldots ,x^{L-1}\in {\mathbb {R}}^{n_{L-1}}\) and \(\lambda \in {\mathbb {R}}^{L-1}_{+}\) we define

$$c(x,\lambda ) := \sum _{l=1}^{L-1} \lambda _{{l}}\left( x^{l} \right) ^{T} \left( x^{l} - \left( W^{l}x^{l-1} + b^{l}\right) \right)$$
(5)

as the propagation gap. The obtained problem is indeed a QP with linear constraints. We need to clarify two questions. How does the problem QPRel help us with solving DtDB and how do we solve this problem itself efficiently?

3.2.1 QPRel vs. DtDB

QPRel returns a robust radius It follows directly from the definition of the Lagrangian relaxation QPRel that for arbitrary non-negative \(\lambda\) it holds that:

  • if x is feasible for DtDB we have \(c(x,\lambda )=0\), meaning that x equals the vector obtained by propagating \(x^{0}\) through the neural network as defined in (1),

  • if x is feasible for QPRel then \(c(x,\lambda )\ge 0\), meaning that there might be a slack between the true output of layer l when getting \(x^{0}\) as an input and the value of \(x^{l}\).
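The two cases above can be illustrated numerically. The following is a minimal sketch on a toy network with random weights; the helper names, the layer sizes and the values of \(\lambda\) are assumptions chosen for illustration only.

```python
import numpy as np

def forward_activations(x0, weights, biases):
    """Hidden activations x^1..x^{L-1} from (1); weights holds W^1..W^{L-1}."""
    xs = [np.asarray(x0, dtype=float)]
    for W, b in zip(weights, biases):
        xs.append(np.maximum(W @ xs[-1] + b, 0.0))
    return xs

def propagation_gap(xs, weights, biases, lam):
    """c(x, lambda) from (5) for xs = [x^0, ..., x^{L-1}]."""
    gap = 0.0
    for l, (W, b) in enumerate(zip(weights, biases), start=1):
        pre = W @ xs[l - 1] + b
        gap += lam[l - 1] * xs[l] @ (xs[l] - pre)
    return float(gap)

rng = np.random.default_rng(1)
sizes = [4, 5, 3]                       # n_0, n_1, n_2 (toy values)
weights = [rng.normal(size=(sizes[i + 1], sizes[i])) for i in range(2)]
biases = [rng.normal(size=sizes[i + 1]) for i in range(2)]
lam = np.array([0.7, 0.2])

# Case 1: x obtained by a forward pass, so the gap vanishes.
xs = forward_activations(rng.normal(size=4), weights, biases)
gap_exact = propagation_gap(xs, weights, biases, lam)

# Case 2: lift x^1 above its ReLU value, recompute the pre-activation of
# layer 2 from the lifted x^1 and lift x^2 as well, so that the linear
# constraints (4) still hold but the complementarity (3) does not.
xs_slack = [xs[0], xs[1] + 0.5]
pre2 = weights[1] @ xs_slack[1] + biases[1]
xs_slack.append(np.maximum(pre2, 0.0) + 0.1)
gap_slack = propagation_gap(xs_slack, weights, biases, lam)
```

Here `gap_exact` is zero while `gap_slack` is strictly positive, matching the interpretation of \(c(x,\lambda )\) as a penalty for deviating from exact propagation.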

In general the following holds for the relation between the solution of QPRel and DtDB (see Fig. 1). We include the proof of Lemma 1 and all other results in “Appendix B”.

Lemma 1

Denote the solution of QPRel by \(x_{\text{qp}}\) and the square root of its optimal objective value by \(d_{\text{qp}},\) and let d be the square root of the optimal objective value of DtDB. The following holds:

  1. \(d_{\text{qp}} \le d\) and when \(c( x_{\text{qp}}, \lambda ) = 0\) we have \(d_{\text{qp}} = d\) and \(x_{\text{qp}}\) is optimal for DtDB.

  2. \(d_{\text{qp}}\) is monotone with respect to \(\lambda ,\) that is for two non-negative \(\lambda ^{1}, \lambda ^{2}\) with \(\lambda ^{1} \le \lambda ^{2}\) elementwise it holds that \(d_{\text{qp}}(\lambda ^{1}) \le d_{\text{qp}}(\lambda ^{2})\).

Fig. 1 Setting of the optimal solutions \(\tilde{x}_{\text{adv}}\) for DtDB and \(x_{\text{qp}}\) for QPRel

The first result from Lemma 1 ensures that \(d_{\text{qp}}\) provides a radius of a certified region around the anchor point, whereas the second part indicates that we should choose \(\lambda\) as large as possible to bring our lower bound closer to the optimal value of DtDB. Unfortunately, as we show below, QPRel becomes non-convex for large values of \(\lambda\). While one could try to tackle a non-convex QP with proper optimization methods, we next derive conditions under which QPRel is guaranteed to be convex and can be solved efficiently.

Convexity of QPRel To look into the problem QPRel in more detail we introduce the Hessian \(M^{\lambda}\) (which is a constant matrix) of its objective function. Let \(E_{l}\in {\mathbb {R}}^{n_{l}\times n_{l}}\) be the identity matrix of the corresponding dimension and set \(\lambda _{0} = 1\). We define \(M^{\lambda} \in {\mathbb {R}}^{n\times n}\) as the symmetric block tridiagonal matrix with blocks \(M^{\lambda} _{l,l}=2\lambda _{{l}} E_{l} \text { and } M^{\lambda} _{l,l-1} = -\lambda _{{l}} W^{l}.\) Using this matrix we rewrite the objective function from QPRel as (see “Appendix B”, Lemma 4 for the proof and definition of the terms)

$$\min _{x\in {\mathbb {R}}^{n}} \frac{1}{2} x^{T} M^{\lambda} (W) x + x^{T} B_{1} (b, \lambda , \tilde{x}^{0}) + \Vert \tilde{x}^{0}\Vert ^{2}, \quad \text { s.t. } \bar{M}(W) x - B_{2}(b) \ge 0,$$
(6)

where \(B_{1}\) influences only the linear term and is therefore not relevant in this section. From this reformulation we clearly see that the matrix \(M^{\lambda}\) determines the (non-)convexity of the objective function. The following theorem provides sufficient and necessary conditions on \(\lambda\) depending on the weights \(W^{l}\) assuring that \(M^{\lambda}\) is positive semi-definite. This allows us to use off-the-shelf QP-solvers with excellent convergence properties.

Theorem 1

Let \(W^{1},\ldots ,W^{L-1}\) be the weights of a pre-trained neural network and \(\Vert W\Vert\) the spectral norm of an arbitrary matrix. Then the following two conditions for \(\lambda\) provide correspondingly a sufficient and a necessary criterion for the matrix \(M^{\lambda}\) to be positive semi-definite.

$$\begin{aligned}&\text {(suf. condition)} \quad \lambda _{{1}}\le \frac{2\lambda _{{0}}}{\Vert W^{1}\Vert ^{2}}\quad \text { and }\quad \lambda _{{l}}\le \frac{\lambda _{{l-1}}}{\Vert W^{l}\Vert ^{2}} \quad \text { for }l\ge 2, \end{aligned}$$
(7)
$$\begin{aligned}&\text {(nec. condition)} \quad \lambda _{{l}}\le \frac{4\lambda _{{l-1}}}{\Vert W^{l}\Vert ^{2}} \quad \text { for }l\ge 1. \end{aligned}$$
(8)

Furthermore, we define \(\underline{\lambda }\) and \(\bar{\lambda }\) that correspondingly satisfy conditions (7) and (8) with equality:

$$\underline{\lambda }_{l} = 2 \prod _{k=1}^{l} \frac{1}{\Vert W^{k}\Vert ^{2}}, \quad \bar{\lambda }_{l} = 4^{l}\prod _{k=1}^{l} \frac{1}{\Vert W^{k}\Vert ^{2}}.$$
(9)

Finally, in the case of a single hidden layer \(M^{\lambda }\) is positive semi-definite even for \(\lambda =\bar{\lambda }\) from (9).

We use (7), (8) and our previous results as guidelines for the choice of \(\lambda\). Since \(d_{\text{qp}}(\lambda )\) is monotone in the sense of Lemma 1 we perform a binary search between \(\underline{\lambda }\) and \(\bar{\lambda }\) to find the point closest to \(\bar{\lambda }\) (where QP is non-convex for networks with more than one hidden layer) such that the QP remains convex. We denote the obtained \(\lambda\) by \(\hat{\lambda }\). This preprocessing step does not considerably affect the runtime since checking whether a matrix is positive semi-definite is done efficiently by Cholesky decomposition. However, it significantly improves the final bounds compared to the bounds obtained when using \(\lambda =\underline{\lambda }\) from (7).

Note that this procedure has to be done once for a given classifier. \(\hat{\lambda }\) is then used to solve QPRel for all anchor points and adversarial labels. This is a significant computational advantage compared to SDP-based \(\epsilon\)-verification methods. For example, Dvijotham et al. (2019) include the dual multipliers as variables in a relaxation of the SDP problem that has to be solved for each combination of the anchor point, adversarial label and verified epsilon.
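The preprocessing step can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation used in the experiments: it bisects along the straight segment between \(\underline{\lambda }\) and \(\bar{\lambda }\) instead of searching in each \(\lambda _{l}\) separately, it adds a small jitter so that singular positive semi-definite matrices pass the Cholesky test, and all function names are hypothetical. Bisection along a segment is valid because \(M^{\lambda}\) is linear in \(\lambda\), so the set of \(\lambda\) with positive semi-definite \(M^{\lambda}\) is convex.

```python
import numpy as np

def build_hessian(weights, lam):
    """Block tridiagonal M^lambda; lam = [lambda_0, ..., lambda_{L-1}].

    weights = [W^1, ..., W^{L-1}] with W^l of shape (n_l, n_{l-1}).
    """
    sizes = [weights[0].shape[1]] + [W.shape[0] for W in weights]
    offsets = np.concatenate(([0], np.cumsum(sizes)))
    M = np.zeros((offsets[-1], offsets[-1]))
    for l, n_l in enumerate(sizes):
        s = slice(offsets[l], offsets[l + 1])
        M[s, s] = 2.0 * lam[l] * np.eye(n_l)
        if l >= 1:
            p = slice(offsets[l - 1], offsets[l])
            M[s, p] = -lam[l] * weights[l - 1]
            M[p, s] = -lam[l] * weights[l - 1].T
    return M

def lambda_bounds(weights):
    """Vectors from (9) with lambda_0 = 1 prepended."""
    inv_sq = np.array([1.0 / np.linalg.norm(W, 2) ** 2 for W in weights])
    prods = np.cumprod(inv_sq)
    ls = np.arange(1, len(weights) + 1)
    lower = np.concatenate(([1.0], 2.0 * prods))
    upper = np.concatenate(([1.0], 4.0 ** ls * prods))
    return lower, upper

def is_psd(M, jitter=1e-10):
    """PSD test via Cholesky; the jitter is an assumption of this sketch."""
    try:
        np.linalg.cholesky(M + jitter * np.eye(M.shape[0]))
        return True
    except np.linalg.LinAlgError:
        return False

def search_lambda(weights, steps=30):
    """Bisection on t in [0, 1] for lambda(t) = (1 - t) * lower + t * upper."""
    lower, upper = lambda_bounds(weights)
    if is_psd(build_hessian(weights, upper)):
        return upper
    lo, hi = 0.0, 1.0                   # lo: known PSD, hi: known not PSD
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        lam = (1.0 - mid) * lower + mid * upper
        if is_psd(build_hessian(weights, lam)):
            lo = mid
        else:
            hi = mid
    return (1.0 - lo) * lower + lo * upper

rng = np.random.default_rng(0)
weights = [rng.normal(size=(6, 4)), rng.normal(size=(5, 6)), rng.normal(size=(3, 5))]
lam_hat = search_lambda(weights)
```

The returned vector only depends on the weights, so it can be cached and reused for every anchor point and adversarial label.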

Relation to the dual of DtDB Since QPRel is a Lagrangian relaxation of the non-convex quadratically constrained QP DtDB, we unavoidably have a gap between their optimal objective values, but get a simpler problem to solve in return. To investigate and approximate the components of that gap, we look at the relation of DtDB and QPRel from the perspective of duality theory. A similar question was investigated by Salman et al. (2019) for the existing \(\epsilon\)-verification methods based on neuron-wise LP-relaxations. However, our method does not fall into this category because the relaxation happens jointly for all layers.

Note that our formulation of the DtDB problem contains quadratic equality constraints (3) and therefore has a non-convex admissible set. For the derivation of its dual problem we refer to the supplementary material (see “Appendix B”) and summarize here the most important result.

Theorem 2

Solving the Lagrange dual problem of the non-convex DtDB is equivalent to solving the problem

$$\max _{\lambda \in {\mathbb {R}}^{L-1}_{+}}\text {QPRel}(\lambda ) \text { s.t. }M^{\lambda} \text { is positive semi-definite,}$$

where we slightly redefine the notation and write \(\text {QPRel}(\lambda )\) for the optimal objective function value of QPRel for the corresponding \(\lambda\). We also denote \(\lambda ^{*}\) as the optimal value of \(\lambda\) for the above problem.

Now we are ready to formulate the result that provides a way to estimate how large the difference is between the optimal objective function value of QPRel for \(\hat{\lambda }\), constructed using Theorem 1, and that for the optimal \(\lambda ^{*}\). The latter is defined by Theorem 2 and would provide the best bound we can get when constraining ourselves to the convex QP relaxations.

Lemma 2

Denote \(\lambda ^{*}\) as the optimal \(\lambda\) defined in Theorem 2, \(\hat{\lambda }\) as \(\lambda\) we use for verification, \(\bar{\lambda }\) as defined in (9), \(c(x, \lambda )\) as the propagation gap defined in (5) and \(\hat{x}_{qp}\) as the solution of \(\text {QPRel}(\hat{\lambda })\). Then we get the following upper bound on the possible improvement of QPRel’s objective function for a \(\lambda\) value that is different from our \(\hat{\lambda }\):

$$\max _{\begin{array}{c} \lambda \ge 0 \\ M^{\lambda} \text { psd} \end{array}} \left( \text {QPRel}(\lambda ) - \text {QPRel}(\hat{\lambda })\right) =\text {QPRel}(\lambda ^{*}) - \text {QPRel}(\hat{\lambda }) \le c(\hat{x}_{qp}, \bar{\lambda } - \hat{\lambda }).$$

In summary, we have the following relation between the values defined above, where we add -P and -D to the problem name to denote its primal and dual forms respectively:

$$\text {DtDB-P} \ge \text {DtDB-D} = \text {QPRel} (\lambda ^{*}) \ge \text {QPRel}(\hat{\lambda }).$$

We have shown how to find a good \(\hat{\lambda }\) and are able to estimate the gap resulting in the second \(\ge\) sign as shown in Lemma 2. Additionally, in the next section we describe how to close the duality gap resulting in the first \(\ge\) sign by introducing additional constraints to the QPRel problem.

3.3 Improving bounds via additional linear constraints

The initial DtDB problem and its relaxation QPRel do not require bounds on pre-activation values \(W^{l}x^{l-1} + b^{l}\) frequently used in \(\epsilon\)-verification approaches. However, if available, these can improve our relaxation. That is, we can additionally bound the admissible set of QPRel by

$$\underline{a}^{l} \le W^{l}x^{l-1} + b^{l} \le \bar{a}^{l} \quad \text { for }l=1,\ldots ,L-1$$
(10)

given some bounds \(\underline{a}^{l}, \bar{a}^{l} \in {\mathbb {R}}^{n_{l}}\) for layer l. Moreover, we include the following linear constraint on each neuron i in layer l as also widely used in other verification methods for ReLU networks (Ehlers 2017; Wong and Kolter 2018; Dvijotham et al. 2019; Salman et al. 2019).

$$-\bar{a}^{l}_{i} (W^{l}x^{l-1} + b^{l})_{i} + (\bar{a}^{l}_{i} -\underline{a}^{l}_{i}) x^{l}_{i} \le -\bar{a}^{l}_{i} \underline{a}^{l}_{i}.$$
(11)

Note that constraints (10) and (11) are linear and therefore the new relaxation is still a QP.
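Constraint (11) is the chord through the points \((\underline{a}^{l}_{i}, 0)\) and \((\bar{a}^{l}_{i}, \bar{a}^{l}_{i})\) and bounds \(x^{l}_{i}\) from above. A minimal numerical check for a single neuron is sketched below; the bound values and the function name are assumptions for illustration, and the sketch assumes \(\underline{a}^{l}_{i}< 0 < \bar{a}^{l}_{i}\).

```python
import numpy as np

def relu_upper_constraint_holds(y, x, a_low, a_up, tol=1e-12):
    """Check (11) for one neuron: -a_up*y + (a_up - a_low)*x <= -a_up*a_low."""
    return -a_up * y + (a_up - a_low) * x <= -a_up * a_low + tol

a_low, a_up = -1.0, 2.0                 # hypothetical pre-activation bounds
ys = np.linspace(a_low, a_up, 301)

# Every true ReLU pair (y, [y]_+) with y inside the bounds satisfies (11).
relu_points_ok = all(
    relu_upper_constraint_holds(y, max(y, 0.0), a_low, a_up) for y in ys
)

# x = 1.5 at y = 0 satisfies x >= 0 and x >= y but lies above the chord,
# so the additional constraint removes it from the admissible set.
cut_off = not relu_upper_constraint_holds(0.0, 1.5, a_low, a_up)
```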

Before continuing the discussion of how we exploit these bounds, we first introduce the notion of a proper bound propagation mapping. We need this to ensure that the resulting solution of QPRel with these additional constraints is still a lower bound on DtDB. For a fixed anchor point and network weights consider a mapping from a bound in the input layer \(\gamma \in {\mathbb {R}}_{+}\) to the bounds \(\underline{a}^{l}(\gamma ), \bar{a}^{l}(\gamma )\in {\mathbb {R}}^{n_{l}}\). We call this mapping a proper bound propagation mapping if

  1. bounds are valid: for all \(x^{0}\) with \(\Vert \tilde{x}^{0} - x^{0}\Vert \le \gamma\) inequalities (10) hold for the corresponding pre-activation values in each layer as defined in (1), and

  2. bounds are monotone: for arbitrary \(\gamma _{1}\le \gamma _{2}\) in each hidden layer l of the network there holds \(\bar{a}^{l}(\gamma _{2})\ge \bar{a}^{l}(\gamma _{1})\ge \underline{a}^{l} (\gamma _{1})\ge \underline{a}^{l}(\gamma _{2})\).

In our experiments we deploy the bound propagation technique by Wong and Kolter (2018) to obtain bounds \(\underline{a}^{l}, \bar{a}^{l}\) since it satisfies these properties and is computationally efficient.

Lemma 3

When using a proper bound propagation mapping, the following holds for the square root of the optimal objective function value \(d_{\text{qp}}(\gamma )\) of QPRel (we drop the dependence on \(\lambda\) since it is now fixed) solved with the additional constraints (10) and (11) using pre-activation bounds \(\underline{a}^{l}(\gamma ), \bar{a}^{l}(\gamma )\).

  1. \(d_{\text{qp}}(\gamma _{1}) \ge d_{\text{qp}}(\gamma _{2})\) if \(\gamma _{1} \le \gamma _{2},\) i.e. \(d_{\text{qp}}(\gamma )\) is monotonically decreasing, where we say that \(d_{\text{qp}}(\gamma )=\infty\) if the corresponding QPRel with (10) and (11) is infeasible,

  2. if \(d_{\text{qp}}(\gamma ) \le \gamma\) then \(d_{\text{qp}}(\gamma )\) is a lower bound on DtDB (which might not be the case otherwise, see “Appendix B” for details).

Guided by the results of Lemma 3 we apply binary search to find the smallest \(\gamma\) that still provides us with a lower bound \(d_{\text{qp}}(\gamma )\) on the smallest adversarial perturbation (the smaller the value of \(\gamma\), the better the resulting bound). In each step we solve a convex QP and increase \(\gamma\) if QPRel is infeasible, that is, the current bounds \(\underline{a}^{l}(\gamma ), \bar{a}^{l}(\gamma )\) are too tight, or if \(d_{\text{qp}}(\gamma ) > \gamma\), since in this case we do not have a certificate for \(d_{\text{qp}}(\gamma )\) to be a valid lower bound on DtDB. Otherwise we set the current \(\gamma\) as the right boundary of the search interval and proceed with a smaller value of \(\gamma\). The whole procedure is summarized in Algorithm 1.

Algorithm 1 (figure not reproduced)
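The binary search over \(\gamma\) described in the text can be sketched as follows. This is an illustrative reading of the procedure, not a transcription of Algorithm 1: the callable `solve_qprel`, the initial interval \([0, \gamma _{\max }]\) and the interpretation of `n_gamma` as an iteration limit and `c_gamma` as an interval tolerance are all assumptions. The toy function `toy_d_qp` merely stands in for a QP solve.

```python
import math

def search_gamma(solve_qprel, gamma_max, n_gamma=10, c_gamma=1e-8):
    """Bisection over gamma.

    solve_qprel(gamma) is assumed to return d_qp(gamma), or math.inf if
    QPRel with constraints (10) and (11) is infeasible.
    """
    left, right = 0.0, gamma_max
    best = None                          # best certified lower bound so far
    for _ in range(n_gamma):
        if right - left <= c_gamma:
            break
        gamma = 0.5 * (left + right)
        d = solve_qprel(gamma)
        if math.isinf(d) or d > gamma:
            left = gamma                 # infeasible or no certificate: increase gamma
        else:
            right = gamma                # d <= gamma is certified: try a smaller gamma
            best = d if best is None else max(best, d)
    return best

def toy_d_qp(gamma):
    # Stand-in for a QP solve: infeasible for small gamma, decreasing afterwards.
    return math.inf if gamma < 0.2 else 1.0 / (1.0 + gamma)

bound = search_gamma(toy_d_qp, gamma_max=2.0, n_gamma=60, c_gamma=1e-9)
```

For the toy function the search converges to the point where \(d_{\text{qp}}(\gamma )=\gamma\). The function returns `None` if no certified bound is found within the given interval.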

3.4 \(l_{\infty}\)-Setting

For comparison with the SDP-based approach by Raghunathan et al. (2018b) we show how we apply our method to compute bounds on the distance to the closest adversarial measured using the \(l_{\infty}\)-norm. A straightforward way would be to modify the objective function accordingly. By introducing a new variable m representing \(\Vert x^{0} - \tilde{x}^{0}\Vert _{\infty} ^{2} = \max _{i} (x_{i}^{0} - \tilde{x}_{i}^{0})^{2}\) and \(n_{0}\) new quadratic constraints we get the following version of QPRel:

$$\begin{aligned}\min _{x\in {\mathbb {R}}^{n}, m\in {\mathbb {R}}} m + c(x, \lambda ) \text {, s.t. }& ( x_{i}^{0} - \tilde{x}_{i}^{0} )^{2} \le m, \quad i=1,\ldots ,n_{0}, \\&(e_{\tilde{y}} - e_{y})^{T}\left( W^{L}x^{L-1} + b^{L} \right) \le 0 ,\\&x^{l} - \left( W^{l}x^{l-1} + b^{l}\right) \ge 0,\quad x^{l} \ge 0\quad \text { for }l\in [L-1]. \end{aligned}$$

Note that the quadratic constraints do not harm the complexity since they describe a convex cone and can be handled by the QP-solvers. While this formulation is of a similar structure as the QPRel (quadratic objective as well as linear and quadratic constraints), the Hessian of the objective function is not positive semi-definite for any value of \(\lambda\). Since \(c(x, \lambda )\) is the only source of quadratic terms now (squared distance to the anchor point is now replaced by m), the new \(M^{\lambda}\) is of the same form as in (6), but with \(\lambda _{0}=0\). To see that we cannot affect the convexity of the objective function by the parameter \(\lambda\) anymore consider vector x with an arbitrary \(x^{0}\in {\mathbb {R}}^{n_{0}}\) as well as \(x^{1} = \alpha W^{1}x^{0}\) for some \(0<\alpha <1\) and \(x^{l}=0\) for \(l>1\). Then

$$\frac{1}{2}x^{T}M^{\lambda} x =\lambda _{1} \left( \Vert x^{1}\Vert ^{2} -\left( x^{1}\right) ^{T}W^{1}x^{0} \right) =\lambda _{1}(\alpha ^{2} - \alpha )\Vert W^{1}x^{0}\Vert ^{2}<0$$

meaning that \(M^{\lambda}\) cannot be positive semi-definite.

To overcome this issue, we utilize the new quadratic constraints. We return to a convex QP by considering the following problem with a positive \(\mu\).

$$\begin{aligned}\min _{x\in {\mathbb {R}}^{n}, m\in {\mathbb {R}}} m + c(x, \lambda ) +\mu \sum _{i=1}^{n_{0}}\left( (x_{i}^{0} - \tilde{x}_{i}^{0})^{2} - m \right) , \quad \text {s.t. } & ( x_{i}^{0} - \tilde{x}_{i}^{0} )^{2} \le m \quad \text {for }i=1,\ldots ,n_{0},\\&(e_{\tilde{y}} - e_{y})^{T}\left( W^{L}x^{L-1} + b^{L} \right) \le 0,\\&x^{l} - \left( W^{l}x^{l-1} + b^{l}\right) \ge 0, x^{l} \ge 0 \quad \text {for }l\in [L-1]. \end{aligned}$$

Clearly, for \(0<\mu \le n_{0}^{-1}\) the solution of this problem is a finite lower bound on DtDB with the \(l_{\infty}\)-norm. On the other hand, we are back in the setting of Theorem 1 with \(\lambda _{0} = \mu\), allowing us to use the same framework as before. In Sect. 4 we obtain the results in the \(l_{\infty}\)-setting by solving this problem with \(\mu =(2n_{0})^{-1}\).
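A short rearrangement, added here only as a reading aid, makes the role of \(\mu\) explicit. Collecting the terms in m in the objective gives

$$m + c(x,\lambda ) + \mu \sum _{i=1}^{n_{0}}\left( (x_{i}^{0} - \tilde{x}_{i}^{0})^{2} - m\right) = (1-\mu n_{0})\, m + \mu \Vert x^{0} - \tilde{x}^{0}\Vert ^{2} + c(x,\lambda ).$$

On the feasible set every summand \((x_{i}^{0} - \tilde{x}_{i}^{0})^{2} - m\) is non-positive, so the objective never exceeds \(m + c(x,\lambda )\). For \(\mu \le n_{0}^{-1}\) the coefficient of m is non-negative and, since \(m\ge 0\) and \(c(x,\lambda )\ge 0\) for feasible points, the objective is bounded below by zero. The term \(\mu \Vert x^{0} - \tilde{x}^{0}\Vert ^{2}\) contributes the block \(2\mu E_{0}\) to the Hessian, which corresponds to \(\lambda _{0}=\mu\). With \(\mu =(2n_{0})^{-1}\) the coefficient of m equals 1/2.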

4 Experiments

For each considered sample we apply the procedure described in Sect. 3.3, Algorithm 1 including tightening of the relaxation by introducing additional linear constraints (10) and (11). \(\hat{\lambda }\) is chosen for each classifier according to Theorem 1 and the discussion afterwards such that a relative accuracy of at least \(c_\lambda = 10^{-4}\) is achieved during the binary search in each \(\lambda _{l}\). For the values of other parameters in Algorithm 1 we choose for all tests \(c_\gamma = 10^{-8}\) and \(n_\gamma =10\). Other methods are tested with the default settings as provided in the corresponding repositories. For ConvAdv by Wong and Kolter (2018) we use the maximum of 200 iterations during Newton’s method for the networks D8, D8R, C, CR (see below) and 20 otherwise. To solve the QP tasks or verify that they are infeasible we use Gurobi (Gurobi Optimization 2018).

Datasets and classifiers The experiments are performed using the MNIST (LeCun et al. 1999) and Fashion-MNIST (Xiao et al. 2017) datasets as well as the tabular datasets IRIS (3 classes, 4 features) and WINE (2 classes, 12 features) from Dua and Graff (2017) scaled such that the feature values lie in [0, 1] interval. For each of the datasets we use the correctly classified samples from 120 train points to evaluate the verification approaches.

For classification we take ReLU networks consisting of dense and convolutional linear layers. The architectures we used for the image datasets are named D2, D4, D8 (dense networks containing 2, 4 and 8 hidden layers consisting of 50 neurons each with an exception for the last 4 layers in D8 that have 20 neurons each) and C. The network C consists of two convolutional layers with \(4\times 4\) windows, a stride of 2 as well as 16 and 32 output channels respectively, followed by two dense layers with input/output dimensions of 1568/100 and finally 100/10. We use network structures similar to those of Wong and Kolter (2018) to enable easier comparison. For each architecture we use normally trained classifiers as well as robustly trained ones (indicated by suffix R, e.g. CR) using the method by Wong and Kolter (2018) with \(\epsilon =1.58\) in \(l_{2}\)-setting and \(\epsilon =0.1\) in \(l_{\infty}\)-setting. For the tabular datasets we use a dense network with two hidden layers with 10 neurons each, called D2, and different \(\epsilon\) values in \(l_{2}\)-setting: 0.113 for IRIS and 0.195 for WINE (and the same \(\epsilon =0.1\) in \(l_{\infty}\)-setting). The weights as well as the project code are available at github.com/Aleksei-Kuvshinov/QPRel. In Table 1 we show the clean accuracy of the trained networks on the corresponding test sets.
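The dense input dimension of 1568 quoted for the convolutional network can be cross-checked with the usual convolution size formula. The sketch below assumes \(28\times 28\) input images and a padding of 1; the padding is not stated in the text and is chosen here only because it reproduces the quoted dimension.

```python
def conv_output_size(n_in, kernel, stride, padding):
    """Spatial output size of a convolution along one dimension."""
    return (n_in + 2 * padding - kernel) // stride + 1

side = 28                                  # assumed image side length
for _ in range(2):                         # two 4x4 convolutions with stride 2
    side = conv_output_size(side, kernel=4, stride=2, padding=1)

flat_features = 32 * side * side           # 32 output channels after the second convolution
```

Under these assumptions the spatial size shrinks from 28 to 14 to 7, and `flat_features` matches the input dimension of the first dense layer.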

Table 1 Clean accuracy

Competitors We compare our approach QPRel with the following verification methods: ConvAdv by Wong and Kolter (2018) based on the LP relaxation of ReLU constraints (we use its implementation supporting the \(l_{2}\)-norm by Croce et al. 2019), CROWN by Zhang et al. (2018), which is a layerwise bound propagation technique including performance boosting quadratic approximations and warm start (for dense networks only since its implementation did not support convolutional layers), and SDPRel by Raghunathan et al. (2018b) based on an SDP relaxation solved by MOSEK.

Metrics The results on MNIST and Fashion-MNIST for the \(l_{2}\)- and \(l_{\infty}\)-setting are shown in Tables 3 and 4 respectively. We show the results on the tabular data in Table 2. We run the methods for each of the considered samples and report the following metrics.

Table 2 Experiment results, tabular data

(1) AvgBound: the average value of the bounds obtained from QPRel and the corresponding competitor (the best value is marked bold if it is at least 5% larger than the worst one). To assess the impact of introducing additional linear constraints using a bound propagation method as described in Sect. 3.3, we report the lower bounds obtained by solving QPRel without constraints (10) and (11) in the last column, AvgBound (no BndProp), of Tables 3 and 4. (2) MedRelDiff to QPRel: the median of the relative difference between the bounds (e.g. QPRel minus CROWN, divided by CROWN). Positive values for the lower bounds mean that our bounds are better on average over the samples. (3) \(\epsilon\) to hit 50% LB-verified: the number of samples with an adversarial-free radius of \(\epsilon\) is monotonically decreasing in \(\epsilon\). Therefore, to assess the performance of a verification procedure like QPRel or CROWN, we report the smallest \(\epsilon\) such that exactly 50% of the samples are successfully verified. The larger this value, the better (the largest values are marked bold).
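The three metrics can be sketched as follows for per-sample lower bounds. The function names and the toy numbers are illustrative and not taken from the project code. For metric (3) the sketch assumes one particular reading, namely the largest \(\epsilon\) for which at least half of the samples are still verified; the exact tie-breaking convention is an assumption.

```python
from statistics import mean, median


def avg_bound(bounds):
    """(1) AvgBound: mean of the per-sample lower bounds."""
    return mean(bounds)


def med_rel_diff(qprel, competitor):
    """(2) MedRelDiff: median of (QPRel - competitor) / competitor."""
    return median([(q - c) / c for q, c in zip(qprel, competitor)])


def eps_to_hit_half(bounds):
    """(3) Assumed reading: largest eps verified for at least 50% of samples.

    A sample counts as verified at radius eps if its lower bound is >= eps.
    """
    ranked = sorted(bounds, reverse=True)
    k = -(-len(ranked) // 2)  # ceil(n / 2)
    return ranked[k - 1]


# Toy per-sample lower bounds (made-up numbers, for illustration only).
qprel = [1.0, 2.0, 3.0, 4.0]
competitor = [0.5, 2.0, 2.0, 2.0]

print(avg_bound(qprel), avg_bound(competitor))
print(med_rel_diff(qprel, competitor))
print(eps_to_hit_half(qprel), eps_to_hit_half(competitor))
```

A positive `med_rel_diff` means that QPRel gives the larger bound on at least half of the samples.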

Table 3 Experiment results, image data, \(l_{2}\)-setting
Table 4 Experiment results, image data, \(l_{\infty}\)-setting

\(l_{2}\)-setting, state-of-the-art bounds For all considered architectures the lower bounds computed by QPRel are on average tighter than those of the competitors (see Table 3, AvgBound and MedRelDiff), and for the networks with a smaller number of hidden layers even for most individual images. Naturally, this results in larger values of \(\epsilon\) to hit 50% LB-verified as well. It seems that the competitors tend to underestimate the robustness of the considered networks, especially if they were not trained robustly. For the normally trained convolutional network C on MNIST we were able to improve the competitor’s lower bounds by a factor of 2 on average. In contrast to other verification procedures that cannot easily verify networks that were not robustly trained, our method is applicable to normally trained networks as well.

While this improvement of the verifiable radius comes at higher computational cost (QPRel is about one order of magnitude slower than the LP-competitors) due to a fundamental difference in complexity of the LP- and QP-tasks, the average runtime per sample is still only seconds or less for the dense and multiple minutes for the convolutional networks. We present a detailed runtime comparison in “Appendix A”.

In the last column of Table 3, we report the lower bound obtained when solving QPRel without introducing additional constraints as described in Sect. 3.3. We observe that the relaxation becomes less tight for networks with more layers and for robustly trained networks. We suppose that when the number of layers L becomes larger, the binary search between \(\underline{\lambda }\) and \(\bar{\lambda }\) (see Theorem 1 and the discussion afterwards) in a higher dimensional space results in a point far from the optimal Lagrange multipliers. In particular, the last \(\underline{\lambda }_{L-1}\) and \(\bar{\lambda }_{L-1}\) defined in (9) become small, such that the gap between \(x^{L-1}\) and \(W^{L-1}x^{L-2}+b^{L-1}\) has only a very limited effect on the objective function of QPRel. That results in an undesired optimal solution of QPRel with a large propagation gap. By introducing additional linear constraints [especially (11)] we prohibit this behavior, since they bound the propagation gap for the set of feasible points. Overall, incorporating additional linear constraints by using bounds on the ReLU inputs has proven to significantly improve our relaxation and the resulting lower bounds.

\(l_{\infty}\)-setting, comparison with SDP-relaxations In order to compare our method with the work done by Raghunathan et al. (2018b) we generalize QPRel to the \(l_{\infty}\)-setting as described in Sect. 3.4. Note that the resulting relaxation is looser than the initial QPRel for the \(l_{2}\)-setting since we bound the \(l_{\infty}\)-distance from below to make the problem quadratic and convex. To compute the largest \(\epsilon\) such that the SDP verification succeeds we perform a binary search on the [0, 1] interval. Since this approach takes longer to run, we test it only on the networks D2 and D2R trained with \(\epsilon =0.1\) (MNIST data).
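The binary search for the largest verifiable \(\epsilon\) can be sketched as below. The callable `verifies` is a hypothetical stand-in for one run of an SDP verifier at a fixed \(\epsilon\); the tolerance and the toy threshold are assumptions chosen for illustration. The search relies on monotonicity: if verification succeeds for some \(\epsilon\), it is assumed to succeed for every smaller one.

```python
from typing import Callable


def largest_verified_eps(verifies: Callable[[float], bool],
                         lo: float = 0.0,
                         hi: float = 1.0,
                         tol: float = 1e-3) -> float:
    """Approximate the largest eps in [lo, hi] for which verifies(eps) holds.

    Assumes verifies(lo) is True and that verifies is monotone in eps.
    """
    if verifies(hi):
        return hi
    # Invariant: verification succeeds at lo and fails at hi.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if verifies(mid):
            lo = mid
        else:
            hi = mid
    return lo


def toy_oracle(eps: float) -> bool:
    """Stand-in for an SDP verification call (hypothetical threshold)."""
    return eps <= 0.137


eps_star = largest_verified_eps(toy_oracle)
print(eps_star)
```

Each iteration costs one full verification call, so the number of SDP solves grows with \(\log_2(1/\text{tol})\), which contributes to the long runtime of this comparison.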

In the \(l_{\infty}\)-setting our bounds are about 3 times smaller than the ones of SDPRel (see Table 4, MedRelDiff to QPRel), though they are computed three orders of magnitude faster (see “Appendix A”). This shows that the QP relaxation is less suited than the competitors for obtaining tight bounds in the \(l_{\infty}\)-setting, as already indicated by the arguments above and as expected from the nature of the quadratic relaxation, but it compensates for this with much better efficiency compared to SDPRel.

5 Conclusion and future work

We presented a novel approach to the problem of approximating the minimal adversarial perturbations for ReLU networks based on a convex QP relaxation of DtDB. We show that the lower bounds computed with QPRel allow certification of larger neighborhoods. Since convexity of the underlying QP determines the computational efficiency of our approach, we derive the necessary and sufficient conditions on the Lagrange multipliers. The obtained lower bounds in the \(l_{2}\)-setting show state-of-the-art results, allowing us to certify larger radii around the data samples as adversarial-free.

With our contribution we make a step towards robustness verification of deep ReLU-based classifiers. While the proposed theoretical framework is applicable to any linear transformations including dense, convolutional and average pooling layers as well as skip connections, it requires a different analysis when non-ReLU activation functions are used (except leaky ReLU). To be applicable to a wider class of networks, the approach should be generalized to popular architectures beyond ReLU activations. Last but not least, the excellent results that our method demonstrated for the verification task indicate an intriguing research direction toward robust training. Based on our certificates, the next step towards robust training would be an approach that uses the solution of QPRel to make an update step resulting in a larger certified neighborhood for the correctly classified samples. As our approach does not require a predefined \(\epsilon\), this additional regularization acts individually for each sample depending on its current robust neighborhood.