Reachable Sets of Classifiers and Regression Models: (Non-)Robustness Analysis and Robust Training

Neural networks achieve outstanding accuracy in classification and regression tasks. However, understanding their behavior remains an open challenge and raises questions concerning the robustness, explainability and reliability of their predictions. We answer these questions by computing reachable sets of neural networks, i.e. sets of outputs that result from continuous sets of inputs. We provide two efficient approaches that lead to over- and under-approximations of the reachable set. As we show, this principle is highly versatile. First, we use it to analyze and enhance the robustness properties of both classifiers and regression models, whereas existing works focus mainly on classification. Specifically, we verify (non-)robustness, propose a robust training procedure, and show that our approach outperforms adversarial attacks as well as state-of-the-art methods for verifying classifiers under non-norm-bound perturbations. Second, we provide techniques to distinguish between reliable and non-reliable predictions for unlabeled inputs, to quantify the influence of each input feature on a prediction, and to compute a feature ranking.


Introduction
Neural networks are widely used in classification and regression tasks. However, understanding their behavior remains an open challenge and raises questions concerning the robustness, reliability and explainability of their predictions. We address these issues by studying the principle of reachable sets of neural networks: Given a set of inputs, what is the set of outputs of the neural network?
Methods that compute an exact reachable set [35] are not feasible, even for tiny neural networks [18]. In this study, we aim to approximate the reachable set such that it can be computed for neural networks used on standard data sets. More specifically, we investigate this problem in the context of ReLU neural networks, which constitute the most widely used class of networks. To allow flexibility regarding inputs, we propagate a set of points defined by a zonotope through the neural network. As the ReLU operation can result in non-convex sets, we derive under-approximated or over-approximated output sets (see Figure 1). The resulting sets are used to analyze and enhance neural network properties (see Section 4).
Overall, our main contributions are: (i) two efficient approaches, RsO and RsU (Reachable set Over- and Under-approximation), for approximating the reachable set of a neural network; (ii) classification: techniques for applying RsO and RsU to (non-)robustness verification and robust training, with comparisons to attacks and state-of-the-art verification methods; (iii) regression: an approach for analyzing and enhancing the robustness properties of regression models; (iv) explainability/reliability: a method for distinguishing between reliable and non-reliable predictions, as well as a technique for quantifying and ranking features w.r.t. their influence on a prediction.

Related Work
Reachable Sets. Computing the exact reachable set of a neural network, as in [35], is not feasible even for tiny networks, as shown in [18]. Some techniques that approximate reachable sets, such as [24], cannot handle the common robustness definition. Most approaches that deal with the reachable sets of a neural network emerged from robustness verification. The study most closely related to our over-approximation approach RsO is [8]. Further developments of this technique include bounds [26,27,28,25]. In addition, set-based approaches are used for robust training [20,11]. Our work goes beyond the existing approaches. First, our over-approximations are (by construction) subsets of the ones computed in [8] and thus tighter. Second, in comparison to the improvements presented in [26,27,28], our approaches do not require bounds on the layer input. Third, in addition to over-approximations, we provide an approach to under-approximate the reachable set.
(Non-)Robustness Verification. Reachable sets are applicable to (non-)robustness verification (see Section 3). Other, expensive robustness verification methods are based on SMT solvers [15,5,3], mixed integer linear programming [33] or Lipschitz optimization [24]. One family of verification approaches searches for adversarial examples by solving the constrained optimization problem of finding a sample that is close to the input but labeled differently. The search space, i.e. an over-approximation of the reachable set of the neural network, is defined by the constraints. The distance of the samples to an input point is usually bounded by a norm that the optimization problem can deal with, such as the L∞-norm [34,22,1,15,29] or the L2-norm [13]. One drawback of these approaches is the strong norm-based restriction on the inputs. Our approaches can handle input sets equivalent to norms as well as input sets that couple features and thus allow complex perturbations, such as different brightness levels of pictures.
The complement of robustness verification are adversarial attacks, i.e. points close to an input that are assigned a different label. Adversarial attacks compute a single point within the reachable set, without explicitly computing the reachable set. There are various ways of designing attacks, one of the strongest being the projected gradient descent attack (PGD) [19]. In contrast to attacks, our RsU approach aims to find an entire set of predictions corresponding to an input set.
It should be noted that, all the above principles are designed for classification tasks. In contrast, our approach is naturally suited for regression as well. To further highlight the versatility of our method, we show how to apply it to explaining predictions and to distinguishing between reliable and non-reliable predictions.

Reachable Sets of Neural Networks
The reachable set O w.r.t. an input set I of a neural network f is its output set, i.e. O = f(I) = {f(x) | x ∈ I}.
Computing the exact reachable set of a neural network is challenging, as proving simple properties of a neural network is already an NP-complete problem [15]. Under-approximations Ô_u ⊆ O produce a set of points that can definitely be reached with respect to the input, while over-approximations cover all points that might possibly be reached, O ⊆ Ô_o (see Figure 1).
We propose approximating the reachable set by propagating the input set layer by layer through the neural network. In each layer, the input set is first subjected to the linear transformation defined by weights and biases. This linear transformation is computed exactly and efficiently for the zonotope-based set representation we exploit. Then, the ReLU activation function is applied. Since applying ReLU to a convex set can result in a non-convex set, we approximate its maximal convex subsets. Specifically, we propose an analytical solution for the over-approximations and an efficient linear-optimization formulation for the under-approximations.
Figure 2: Applying ReLU to a zonotope (blue) can result in a non-convex set (red) (a). We consider the subset located in each quadrant separately (here: two quadrants) and subject it to ReLU. The obtained set of sets under-approximates ReLU(Z) (green sets, b) or over-approximates it (orange sets, c).
Definition of Input Sets. Our approaches operate directly on sets and require an efficient and flexible set representation. For this, we use zonotopes, as they are closed under linear transformation and their G-representation provides a compact representation in high-dimensional spaces. Furthermore, they allow complex and realistic perturbations to be defined that couple input features such as different light conditions on pictures (in short: we go beyond simple and unrealistic norm constraints).
The G-representation of a D-dimensional zonotope Ẑ with n generators is defined by its center ĉ ∈ R^D and a matrix Ĝ ∈ R^{n×D} whose rows contain the generators ĝ_i. The set of all points within Ẑ is Ẑ = (ĉ | Ĝ) = {ĉ + Σ_{i=1}^n β_i ĝ_i | β_i ∈ [−1, 1]}.

Propagating Sets through ReLU Networks. In this paper we focus on ReLU neural networks, as they are not only widely used but also powerful. A neural network consists of a series of functional transformations, in which each layer l (of n_l neurons) receives the input x ∈ R^{n_{l−1}} and produces the output by first subjecting the input to a linear transformation defined by the weights W_l and bias b_l, and then applying ReLU. In the final layer, no activation function is applied, and the output stays in the logit space. Thus, starting with the input set Ẑ_0, a series of alternating operations is applied: Z_l = W_l Ẑ_{l−1} + b_l and Ẑ_l ≈ ReLU(Z_l), where Z_l denotes the set after the linear transformation, Ẑ_l denotes the set after the ReLU, and Z_L is the reachable set (output layer). Since zonotopes are closed under linear transformations, applying the weights and bias of layer l to the zonotope Ẑ_{l−1} = (ĉ_{l−1} | Ĝ_{l−1}) results in Z_l = (W_l ĉ_{l−1} + b_l | Ĝ_{l−1} W_l^T). Obtaining ReLU(Z_l) is challenging, as it may be a non-convex set, as illustrated in Figure 2a. It is inefficient to further propagate the non-convex set ReLU(Z_l) through the neural network. Therefore, our core idea is to approximate ReLU(Z_l) and use this approximation as the input to the next layer. More precisely, we propose two methods: RsO (reachable set over-approximation) and RsU (reachable set under-approximation). RsO obtains a superset of ReLU(Z_l) in each layer l, while RsU returns a subset. Using RsO within each layer ensures that no points are missed and that the output set captures all reachable points. Equivalently, applying RsU within each layer results in an output set that is a subset of the exact reachable set, i.e. contains the points that will definitely be reached.
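As a concrete illustration, the G-representation and the exact linear-layer step can be sketched in a few lines of NumPy (the class and method names are ours, not from the paper's implementation):

```python
import numpy as np

class Zonotope:
    """G-representation: center c of shape (D,) and generator matrix G of
    shape (n, D), with one generator per row, as in the paper."""
    def __init__(self, c, G):
        self.c = np.asarray(c, dtype=float)
        self.G = np.asarray(G, dtype=float)

    def affine(self, W, b):
        """Exact image under x -> W x + b (one layer's linear part).
        Generators are stored as rows, so they transform with W^T."""
        return Zonotope(W @ self.c + b, self.G @ W.T)

    def points(self, betas):
        """Point c + sum_i beta_i g_i for coefficients beta_i in [-1, 1]."""
        return self.c + np.asarray(betas, dtype=float) @ self.G
```

This exactness of the linear step is why zonotopes are attractive here: only the ReLU step requires approximation.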
Pseudocode for RsO, RsU and propagating a zonotope through the neural network is provided in the appendix (see Section 6.1).
Approximation of ReLU(Z). In the following, we describe how to approximate ReLU(Z) for a zonotope Z. To unclutter the notation, we omit the layer index l. The ReLU function maps points dimension-wise onto the maximum of themselves and zero. Considering dimension d results in three possible cases. Case 1: ∀p ∈ Z : p_d < 0, where the points are mapped to zero; Case 2: ∀p ∈ Z : p_d ≥ 0, where the points are mapped onto themselves; and Case 3, where some points are mapped to zero and others to themselves.
Case 3 causes the non-convexity of ReLU (see Figure 2a, 2nd dimension). We consider the three cases separately to approximate each maximal convex subset of ReLU(Z) by one zonotope. The three cases are distinguished by computing an index set for each case. These index sets can be efficiently obtained through the interval hull of Z [16], IH(Z) := [c − δg, c + δg] with δg = Σ_i |g_i|, where |·| denotes the element-wise absolute value. R_n contains the dimensions d with (c + δg)_d ≤ 0 (case 1), R_p contains the dimensions with (c − δg)_d ≥ 0 (case 2), and R contains the remaining dimensions (case 3).
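The interval hull and the three case index sets can be computed directly from the generator matrix; a minimal sketch (function names are illustrative):

```python
import numpy as np

def interval_hull(c, G):
    """IH(Z) = [c - dg, c + dg] with dg = sum_i |g_i| (row-wise abs sum)."""
    dg = np.abs(G).sum(axis=0)
    return c - dg, c + dg

def case_index_sets(c, G):
    """Split the dimensions into the three ReLU cases via the interval hull:
    R_n: entirely non-positive (case 1), R_p: entirely non-negative (case 2),
    R: mixed sign (case 3, the source of non-convexity)."""
    low, upp = interval_hull(c, G)
    R_n = np.where(upp <= 0)[0]
    R_p = np.where(low >= 0)[0]
    R = np.where((low < 0) & (upp > 0))[0]
    return R_n, R_p, R
```

On the zonotope of Example 1 below, this yields an empty R_n, R_p = {1st dimension} and R = {2nd dimension}.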
Projection of a Zonotope. Regarding the dimensions in R_n, ReLU maps each point of the zonotope to zero. Thus, we can safely project the whole zonotope Z = (c | G) to zero within these dimensions. Theorem 1. Let Z be an arbitrary zonotope and Z′ = Proj_{R_n}(Z); then ReLU(Z) = ReLU(Z′).
Proof (sketch). For every dimension d ∈ R_n, all points p ∈ Z satisfy p_d ≤ 0, so ReLU maps these coordinates to zero; projecting them to zero before applying ReLU therefore yields the same set, i.e. ReLU(Z′) = ReLU(Z). Projecting Z results in the more compact Z′ with no change to the output set. We overload notation, and Z denotes the projected zonotope in the following.
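Since the projection only zeroes out the affected coordinates of the center and of all generators, it is a one-liner in NumPy (illustrative naming):

```python
import numpy as np

def project_to_zero(c, G, dims):
    """Proj_R(Z): set the center and all generators to zero in the
    given dimensions (0-based indices)."""
    c2, G2 = c.copy(), G.copy()
    c2[list(dims)] = 0.0
    G2[:, list(dims)] = 0.0
    return c2, G2
```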
Computation of Quadrants that Contain a Subset S_k of Z. Next, we subdivide the projected zonotope Z into subsets, each located in one quadrant. The quadrants that contain points of Z are determined by index sets R_k, where each R_k is an element of the power set P(R) of R. Each index set R_k corresponds to the subset S_k = {p ∈ Z | p_d ≤ 0 ∀d ∈ R_k ∧ p_d ≥ 0 ∀d ∉ R_k}. Clearly, all S_k are convex and disjoint, and the union over all S_k results in Z. It is important to highlight that we never materialize the subsets S_k, as they are costly to compute. Our core idea is to approximate each S_k by a zonotope Ẑ_k. Subsequently, we project Ẑ_k in all dimensions of the corresponding R_k (see case 1), resulting in Proj_{R_k}(Ẑ_k). The obtained set of zonotopes is an approximation of ReLU(Z) and is the input for the next layer. The computation of R, R_k and the corresponding subsets S_k is illustrated in the following example. Example 1. Consider the zonotope Z = (c | G) (Figure 2), where c = (6, 1) and the generators are g_1 = (3, 0), g_2 = (2, 3) and g_3 = (0, 0.5). The lower bounds are (1, −2.5), the upper bounds are (11, 4.5). As all upper bounds are positive, we do not project any dimension. The index set for case 3 is R = {2}. We need to approximate all subsets S_k corresponding to R_k ∈ P(R); the empty set corresponds to the positive quadrant.
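The index sets R_k are simply the elements of the power set of R; a small helper makes Example 1 concrete (illustrative code, not the paper's implementation):

```python
from itertools import chain, combinations

def quadrant_index_sets(R):
    """Power set P(R): each element R_k selects the dimensions that are
    non-positive in the corresponding quadrant; the empty set corresponds
    to the all-non-negative (positive) quadrant."""
    R = list(R)
    return [set(s) for s in
            chain.from_iterable(combinations(R, r) for r in range(len(R) + 1))]
```

For Example 1 with R = {2} this yields the two index sets ∅ and {2}, i.e. the two quadrants visible in Figure 2.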
We capture each S k individually to keep the approximations as tight as possible. Theoretically, we could decrease the number of zonotopes by over-approximating several S k by one zonotope or by not considering small subsets S k in the case of under-approximation. We discuss such an extension that restricts the maximum number of zonotopes at the end of this section. This extension enables a balance between tightness of approximations and run-time, which is useful for larger neural networks. The approximation of S k can be either an over-approximation (RsO) or an under-approximation (RsU).
Over-approximation of ReLU(Z). Given S_k ⊆ Z, we aim to over-approximate S_k by Ẑ_k = (ĉ | Ĝ) (to unclutter the notation, we omit the index k w.r.t. center and generators). Our core idea is that if Ẑ_k is a tight over-approximation of S_k, the shape of Ẑ_k should resemble the shape of Z (see Figure 2c). As the shapes of two zonotopes are similar if their generators point in similar directions, we derive Ẑ_k from the generators of Z. More precisely, the generators of Ẑ_k are obtained by scaling each generator g_j of Z with a factor α_j ∈ [0, 1] and computing the shift of the center such that S_k ⊆ Ẑ_k ⊆ Z. Clearly, α_j = 1 fulfills this property, but results in ĝ_j = g_j and a loose over-approximation. Thus, we aim to minimize over α_j. Each scaling factor α_j for generator g_j is computed analytically (see Figure 3) by first computing an extreme point e of the zonotope. We start in e and test whether the generator g_j allows a point t_{j,d} outside the quadrant under consideration to be reached. If this is the case, g_j can be scaled down and Ẑ_k still over-approximates S_k. We compute the extreme points and scaling factors for each dimension d, resulting in α_{j,d}.
Regarding dimension d, g_j can be scaled by the factor α_{j,d}; if we scaled g_j by a larger factor, we would leave the quadrant corresponding to S_k with respect to dimension d, and a larger scaling factor is not necessary in order to over-approximate S_k. Thus, we minimize over the α_{j,d} to obtain the smallest α_j and the tightest over-approximation. Formally, ĝ_j and ĉ of the over-approximating zonotope Ẑ_k are given in Equation 4. Although the generators of Z are scaled down, the obtained zonotope Ẑ_k is an over-approximation of S_k for the respective quadrant (which we never compute explicitly). This is shown in Theorem 2 by using Lemmas 1 and 2.
Theorem 2. Let S_k ⊆ Z be the subset corresponding to index set R_k and let Ẑ_k be the zonotope with the center and generators as defined in Equation 4. Then S_k ⊆ Ẑ_k.

Theorem 2 follows from Lemma 1 and Lemma 2, which consider a zonotope Ẑ_k with the center and generators defined in Equation 4 and use the definitions introduced above. The proof of Lemma 2 is similar to the one of Lemma 1 and is not shown in detail; the differences are that for a point p ∈ Z with p_d ≤ 0 in the case d ∈ R_k and t^-_{j,d} > 0, we start with 0 ≥ p_d, use t^-_{j,d} instead of t^+_{j,d}, and the signs of the terms differ.

The subset S_k is located in one quadrant, and the corresponding R_k contains the dimensions that are mapped to zero by ReLU (case 1 on S_k). Thus, we project the over-approximation Ẑ_k in the dimensions d ∈ R_k as described above; this projection is exact.

Under-approximation of ReLU(Z). Finding a tight under-approximation of S_k turns out to be more challenging. We propose to tackle this by solving a constrained optimization problem in which we aim to find a zonotope Ẑ_k of maximum volume subject to the constraint Ẑ_k ⊆ S_k:

max_{Ẑ_k} Vol(Ẑ_k)  s.t.  Ẑ_k ⊆ S_k.    (14)

How can we instantiate Equation 14 to under-approximate S_k tightly and keep computations efficient?
We derive an efficient linear program by considering the same search domain as before. More precisely, we constrain the search space to zonotopes that are derived from the original zonotope Z by scaling its generators g_i by factors α_i ∈ [0, 1], i.e. ĝ_i = α_i g_i. As motivated before, this assumption is reasonable, since Ẑ_k and Z then have similar shapes.
Importantly, to ensure that we under-approximate a part of Z located in one quadrant, we add constraints that force the lower bound of the interval hull of Ẑ_k to be non-negative for the dimensions outside R_k, i.e. ĉ_d − Σ_i |ĝ_{i,d}| ≥ 0 ∀d ∉ R_k, and the upper bound to be non-positive for the dimensions in R_k, i.e. ĉ_d + Σ_i |ĝ_{i,d}| ≤ 0 ∀d ∈ R_k. Since the volume of the zonotope grows with the α_i, we instantiate the objective function by Σ_i α_i. Combining all of these considerations results in the following linear optimization problem:

max_α Σ_i α_i  s.t.  ĝ_i = α_i g_i, α_i ∈ [0, 1], ĉ_d − Σ_i |ĝ_{i,d}| ≥ 0 ∀d ∉ R_k, ĉ_d + Σ_i |ĝ_{i,d}| ≤ 0 ∀d ∈ R_k.

If the quadrant under consideration is empty (which can happen for many quadrants), the optimization problem is infeasible and we can safely ignore this quadrant. Since all points in Ẑ_k are non-positive w.r.t. the dimensions d ∈ R_k (case 1), we compute Proj_{R_k}(Ẑ_k) and obtain an under-approximation of ReLU(S_k). See Figure 2b for an illustration.
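The linear program can be sketched with scipy.optimize.linprog. Here we keep the center fixed at c, which is an assumption on our part (the paper's parametrization may also shift it); dimension indices are 0-based and the function name is ours:

```python
import numpy as np
from scipy.optimize import linprog

def underapprox_quadrant(c, G, R_k):
    """Sketch of the RsU linear program: scale generators g_i by
    alpha_i in [0, 1] so that the interval hull of the scaled zonotope
    stays inside the quadrant given by R_k (assumption: center fixed).
    Maximizes sum_i alpha_i; returns None if the quadrant is infeasible."""
    n, D = G.shape
    absG = np.abs(G)  # |g_{i,d}|, shape (n, D)
    # lower bound >= 0 for d not in R_k:  sum_i alpha_i |g_{i,d}| <= c_d
    # upper bound <= 0 for d in R_k:      sum_i alpha_i |g_{i,d}| <= -c_d
    sign = np.array([-1.0 if d in R_k else 1.0 for d in range(D)])
    res = linprog(-np.ones(n), A_ub=absG.T, b_ub=sign * c,
                  bounds=[(0.0, 1.0)] * n)
    if not res.success:
        return None
    return c, res.x[:, None] * G  # scaled zonotope (c | alpha_i g_i)
```

On Example 1 and the positive quadrant (R_k = ∅), the LP keeps g_1 at full length and shrinks g_2 the most, since g_2 drives the zonotope furthest below zero in the second dimension.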
Balancing approximation tightness and run-time. For large input sets, the number of convex subsets that define the reachable set of a neural network scales exponentially with the number of neurons. Let us consider zonotope Z = (c | G), c ∈ R D , G ∈ R n×D . In the worst case, Z consists of points that are spread over 2 D quadrants. RsO and RsU approximate each subset S k located in one quadrant by a separate zonotope Z k .
To balance approximation tightness and run-time, we extend RsO and RsU such that the number of zonotopes can be restricted by the user. The overall number of zonotopes (w.r.t. the whole neural network) is restricted by B, and the amplification is restricted by A. The amplification is the maximum number of zonotopes Z_k used to approximate ReLU(Z) w.r.t. one layer and one input zonotope Z. It is defined by the number of quadrants q that contain points of Z and can be computed as follows. First, we compute the interval hull of Z; with respect to dimension d, all points within Z lie in the interval [l_d^low, l_d^upp]. Second, we count the number R_n of dimensions d where l_d^low < 0 and l_d^upp > 0. The number of quadrants is q = 2^{R_n}.
Over-approximation: If q > A, we compute the interval hull of Z and restrict the intervals to their positive portion. Thus, we over-approximate ReLU(Z) by one zonotope instead of q zonotopes. If the overall number of zonotopes is larger than B, we estimate the size of each zonotope Z = (c | G) via its interval hull. The largest B − 1 zonotopes are kept, while the smaller ones are merged, i.e. we compute an over-approximation of their union by minimizing/maximizing over the lower/upper limits of their interval hulls. The resulting interval is transformed into the G-representation of a zonotope (center = midpoint of the interval, one generator per dimension with half the interval width). Under-approximation: In this case, we simply drop the smallest zonotopes if q > A or if the overall number of zonotopes is larger than B.
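The merge step can be sketched as follows: take the elementwise min/max of the interval-hull limits of all zonotopes to be merged, then convert the resulting box back into G-representation (illustrative code under this reading of the merge rule):

```python
import numpy as np

def merge_via_interval_hull(zonotopes):
    """Over-approximate the union of zonotopes (c, G) by one axis-aligned
    box: elementwise min/max over the interval-hull limits, converted back
    to G-representation (center = midpoint, one generator per dimension
    with half the interval width)."""
    lows, upps = [], []
    for c, G in zonotopes:
        dg = np.abs(G).sum(axis=0)
        lows.append(c - dg)
        upps.append(c + dg)
    low = np.min(lows, axis=0)
    upp = np.max(upps, axis=0)
    return (low + upp) / 2.0, np.diag((upp - low) / 2.0)
```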

Applications and Experiments
We highlight the versatility of our RsO and RsU approaches by describing several applications in classification and regression tasks. More specifically, we discuss (non-)robustness verification, robust training, quantification of feature importance and the distinction between reliable and non-reliable predictions. Furthermore, we analyze reachable sets of an autoencoder. Our input zonotopes capture three different shapes: cube (equivalent to an L∞-norm ball), box (with a different perturbation on each feature) and free (with coupling of features). We train feed-forward ReLU networks on standard data sets.
Experimental Setup. Our approaches are implemented in Python/PyTorch. We train feed-forward ReLU networks using stochastic gradient descent with cross-entropy loss (classifiers), Huber loss (regression models), mean-squared-error loss (autoencoder models) or robust loss functions (see the following sections) and early stopping. Experiments are carried out on popular data sets and neural network architectures (accuracy denotes the worst accuracy obtained for this data set by one of the specified architectures). Classifiers: Iris

Definition of Input Sets. Using zonotopes as input sets has the advantage that we are able to verify different kinds of perturbations. Here, the input set Ẑ = (ĉ | Ĝ) is defined by using an input data point x as the center ĉ and the following perturbations specified by the generator matrix Ĝ. Cube: Ẑ_cube is a hyper-cube whose shape is equivalent to the unit ball of the L∞-norm. As the allowed perturbation on each input feature is the same, the generator matrix is εI_d for different ε. Box: Ẑ_box is a so-called axis-aligned parallelotope (n-dimensional box). This shape allows different disturbances on each input feature, but it does not couple features. For this, we first compute a zonotope by using the eigenvectors that correspond to the d largest eigenvalues of the data set as generators. Ẑ_box is obtained by computing the interval hull of this zonotope and scaling its volume such that it is equivalent to the volume of Ẑ_cube for a given ε. Free: Ẑ_free is an arbitrary zonotope that enables disturbances to be coupled between input features, which cannot be captured by norms or intervals. This input zonotope is obtained by increasing/decreasing all feature values simultaneously by at most ε and additionally allowing a small, fixed perturbation δ on each feature. If the input is an image, this perturbation brightens/darkens all pixel values simultaneously: Ĝ = [δI_d , ε1].
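The cube- and free-shaped generator matrices are straightforward to construct; a sketch (helper names are ours):

```python
import numpy as np

def cube_generators(d, eps):
    """Hyper-cube input set, equivalent to an L-infinity ball of radius eps."""
    return eps * np.eye(d)

def free_generators(d, eps, delta):
    """'Free' zonotope G = [delta*I_d ; eps*1]: a small per-feature
    perturbation delta plus one coupled generator eps*(1,...,1) that
    brightens/darkens all features simultaneously (d + 1 generator rows)."""
    return np.vstack([delta * np.eye(d), eps * np.ones((1, d))])
```

The single coupled row is what makes the free shape inexpressible as a norm ball or interval: all features move together along that generator.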
For feature rankings, the following setting is used: to quantify the influence of feature f_1 on the prediction of x, we define a box-shaped input set Ẑ_{f1} = (x | G) around x that allows a perturbation δ on f_1 and a minimal perturbation ε (here: ε = 0.01) on all other features. More formally, we use a diagonal generator matrix G, where G_{1,1} = δ and G_{i,i} = ε ∀i ≠ 1.
Classification: (Non-)Robustness Verification. First, we evaluate the potential of reachable sets by using them for robustness/non-robustness verification, i.e. for studying how predictions of a classifier change when perturbing input instances. More precisely, we aim to analyze if predictions based on an input set map to the same class or if they vary. Formally, the set of predictions (classes) is P = {arg max c f (x) c |x ∈ I}, given input set I.
For verification, we compute a robustness score against each class. Let a be the predicted class and b ≠ a the class against which we quantify robustness. The least robust point p within the reachable set (output/logit space) is the one where its coordinate p_b is close to or larger than p_a. Based on these considerations, we define the robustness score against class b of reachable set R_S as

s_b(R_S) = min_{Z∈R_S} min_{p∈Z} (p_a − p_b) = min_{(c|G)∈R_S} (c_a − c_b − Σ_i |g_{i,a} − g_{i,b}|),    (15)

where Z ∈ R_S denotes the computed zonotopes and we use that the inner minimum is attained for β_i ∈ {−1, 1}, depending on the sign of g_{i,a} − g_{i,b}. Robustness certificates are obtained by computing the scores against all classes b ≠ a on the over-approximated reachable set R_SO. If all scores are positive, the robustness certificate holds, and all points from the input set are classified as class a. Non-robustness certificates are obtained by checking if there is a class b such that s_b on the under-approximated reachable set R_SU is negative. If this is the case, at least one point from the input set is categorized as class b.
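For a single output zonotope, the minimum of p_a − p_b has a closed form, since the worst case sets each β_i to ∓1 according to the sign of g_{i,a} − g_{i,b}; a sketch over a list of zonotopes (function name is ours):

```python
import numpy as np

def robustness_score(reachable_set, a, b):
    """Score of predicted class a against class b: minimum of p_a - p_b
    over all points p in every output zonotope (c, G).  Per zonotope the
    minimum is c_a - c_b - sum_i |g_{i,a} - g_{i,b}|, attained by choosing
    beta_i = -sign(g_{i,a} - g_{i,b})."""
    return min(c[a] - c[b] - np.abs(G[:, a] - G[:, b]).sum()
               for c, G in reachable_set)
```

A positive score on the over-approximated set certifies robustness against b; a negative score on the under-approximated set certifies non-robustness.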
There are three benefits to these scores. First, computing them is efficient (see Equation 15); moreover, the scores are fully differentiable w.r.t. the network parameters, enabling immediate robust training (see the later experiment). Second, the scores are applicable to class-specific verification (i.e. robust against class b_1, non-robust against b_2). Third, the scores allow a relative quantification of (non-)robustness: a reachable set with a high score is more robust than one with a low score.
We compare the performance of RsO on robustness verification with the state-of-the-art methods WK (Wong-Kolter) [34], dz (DeepZono) [26], dp (DeepPoly) [27], dr (RefineZono) [28] and es (exact approach) [35], which computes the exact reachable set (implementation [18]). RsU is compared with the success rate of FGSM attacks [9,32] and PGD attacks [19]. To handle the box setting, FGSM attacks are scaled such that the perturbed input is contained within the input zonotope. The PGD attack is projected onto the input zonotope in each step, i.e. extended to handle arbitrary input zonotopes. Figures 4 and 5 illustrate (non-)robustness verification on the cancer data set, MNIST, the iris data set and FashionMNIST for cube-, box- and free-shaped input zonotopes. For robustness verification, we measure the number of samples for which the scores against all non-target classes are positive. For non-robustness verification, we count the number of samples for which a negative score against some class exists. In the cube and box settings, RsO performs similarly to dr, dz and dp, while RsU is similar (cube) or slightly better (box) than PGD attacks. On arbitrary input zonotopes (free setting), RsO and RsU outperform both the state-of-the-art robustness verification approaches and PGD attacks. The run-time of RsO and RsU increases with the number of input features, the number of neurons in the neural network and the perturbation ε. The dependency on ε is due to the fact that large sets usually decompose into more convex subsets than smaller sets when subjected to ReLU, and so run-time increases with the size of the input set. Note that we compute the full reachable set of the neural network, which provides much more information than a binary (non-)robustness certificate. The other techniques, dz, dp and dr, are designed for robustness verification/attacks and do not return any further information; a run-time comparison is thus biased.
Still, for smaller ε and also for the free-shaped input, the absolute run-time of our methods is competitive. Since es [35,18], which computes the exact reachable set, requires too much time even for the smallest neural network architecture, a meaningful comparison was not possible. The exact approach es only ran on the smallest neural network (iris data set, one hidden layer of 4 neurons) and only for the smallest perturbations ε ∈ {0.001, 0.005, 0.01} (see Table 1). Note that es certifies 28 of the 29 samples as robust and 0 as non-robust, and rejects one sample for which it was not able to solve an underlying optimization problem. When running es on a cube-shaped input with perturbation ε = 0.02, it did not finish even after more than three days. This might be explained by the fact that es uses half-spaces to describe the reachable set: applying ReLU to sets described by half-spaces requires exponential time, and thus es is not feasible even for small neural networks. Consequently, the reachable set needs to be over-/under-approximated, as in our approach.
Classification: Class-Specific Verification. Robustness scores allow class-specific (non-)robustness verification in cases where distinguishing between classes is not equally important, e.g. in the tissue data set. The authors of the data set are of the opinion that distinguishing between classes 3, 4 and 5 (fibro-adenoma, mastopathy and glandular) is of minor importance, while it is crucial to distinguish these classes from class 1 (carcinoma). This is illustrated in Figure 6, left part, where classes 3, 4 and 5 are not robust against each other, while class 1 is robust against all other classes (plot: percentage of instances that are evaluated as (non-)robust; x-axis: ground-truth class, y-axis: class we test against). Thus, class-specific analysis allows classifiers to be evaluated more specifically, with a focus on crucial robustness properties. Furthermore, it allows us to draw conclusions about the concepts a neural network has learned (see Figure 6, right part, Fashion-MNIST with classes: 0 top, 1 trousers, 2 pullover, 3 dress, 4 coat, 5 sandal, 6 shirt, 7 sneaker, 8 bag, 9 boot). It is striking that class 2 (pullover) is less robust against classes of items of a similar shape (0 top, 4 coat) but robust against classes of items of different shapes (1 trousers, 3 dress, 5 sandal, 8 bag, 9 boot). This indicates that the neural network has extracted the shape and learned its importance for the classification decision.
Classification: Reliability of Predictions. Distinguishing between reliable (label 0) and non-reliable predictions (label 1) can be seen as a binary classification problem. Although a wrong prediction (w.r.t. the ground truth) can theoretically have a high robustness score, we observe that the robustness scores corresponding to wrongly predicted inputs are mostly negative or close to zero. Thus, we consider a prediction reliable if the corresponding robustness score (w.r.t. the predicted class) is larger than a positive threshold θ. This threshold θ is chosen such that it maximizes the number of correctly identified reliable/non-reliable samples on the validation set. Table 2 compares the performance of RsO with our proposed baseline approach that uses softmax scores to distinguish between reliable and non-reliable predictions.
Our comparison shows that, while softmax scores result in a slightly higher true-positive rate and overall accuracy, RsO provides a significantly higher true-negative rate. Thus, RsO identifies more non-reliable predictions than softmax scores. Furthermore, RsO provides a robustness certificate as well as an indicator for reliability.

Classification: Robust Training. The robustness scores as defined in Equation 15 are directly used in robust training by incorporating them into the loss function, e.g. as

L_rob = L_pred − I[pred = target] · min_{b≠a} s_b,

where L_pred is the cross-entropy loss and I[pred = target] = 1 for correctly classified inputs, otherwise 0. Note that the loss is fully differentiable w.r.t. the neural network weights (i.e. we can backpropagate through the zonotope construction), which makes it possible to train a model with enhanced robustness against any perturbation that can be described by an input zonotope. Figure 7 compares the robustness of models obtained by robust training (L_rob), retraining (warm-start with a normally trained model, further training with L_rob), normal training, and mixup (a robust training technique based on a convex combination of samples, see [37]). Robust training and retraining both improve the robustness of the resulting classifiers.

Regression: (Non-)Robustness Analysis and Robust Training. Obtaining robust neural networks is desirable in any task but has mainly been studied for classification. Classifiers are robust if an input x and all points in its neighborhood are assigned the same label. In regression tasks, there is no equivalent robustness definition, because outputs are continuous rather than categorical. Intuitively, however, regression models are robust if close inputs result in close outputs. Assume that inputs and outputs are standardized before training, such that all features are on an equal scale. The extension l_a of output feature a within the reachable set R_S quantifies robustness: the smaller l_a, the more robust the model.
The extension is defined by the two most distant points u and v within R_S w.r.t. dimension a: l_a = |max_{u∈R_S} u_a − min_{v∈R_S} v_a|. For input features, the extension l_in is defined equivalently on the input set. In the cube setting, l_in is the same for all input features.
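Computing the extension from a set of output zonotopes only requires their interval hulls; a sketch (illustrative naming):

```python
import numpy as np

def extension(reachable_set, a):
    """l_a = |max_u u_a - min_v v_a| over the reachable set, computed from
    the interval hulls of its zonotopes (c, G)."""
    upp = max(c[a] + np.abs(G[:, a]).sum() for c, G in reachable_set)
    low = min(c[a] - np.abs(G[:, a]).sum() for c, G in reachable_set)
    return upp - low
```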
If l_a ≤ l_in for all output features a, the regression model maps close inputs to close outputs and we consider it robust. We use this robustness definition to define a robust training objective based on the feature extension and a standard loss function L_val (e.g. Huber loss):

L_rob = L_val + Σ_a max(0, l_a − l_in).

If l_a is larger than l_in, the second term of L_rob is positive; otherwise it is zero. We compare four different training modes: normal (training with Huber loss), retrain (warm-start with a normally trained model and further training with L_rob), robust (training with L_rob), and mixup (a training technique that convexly combines inputs, see [37]). Figure 8 illustrates the training and robustness analysis based on the abalone data set (2 hidden layers, first row) and the airfoil data set (1 hidden layer, second row). While mixup seems to decrease the robustness of regression models, robust training and retraining result in smaller reachable sets and thus ensure that close inputs are mapped to close outputs. Hence, robust training and retraining both improve robustness properties without significantly reducing prediction accuracy.
Explainability: Feature Ranking for Classifiers & Regression Models. Reachable sets enable the importance of features to be quantified w.r.t. a model output. To quantify the influence of feature f_1, we define a box-shaped input set Ẑ_{f_1} with a large perturbation δ on f_1, while the perturbation of the remaining features is small. The size of the reachable set corresponding to Ẑ_{f_1} captures the variation in the predictions caused by varying f_1 and thus quantifies the influence of f_1. Since the exact size/volume of this reachable set is inefficient to compute [10], we approximate it using the interval hull.
Here, we use a scaled version of the volume that accounts for the dimensionality d of the zonotope: vol_scaled(Z) = vol(IH(Z))^{1/d}, where IH(Z) denotes the interval hull of Z. The volume of the reachable set is approximated by the sum of the interval-hull volumes of its subsets. Figure 9 illustrates the three most important features for samples of the wine data set computed by our RsO approach, in comparison with two other approaches: the integrated gradients method (igm) [31] and local interpretable model-agnostic explanations (lime) [23]. RsO identifies four possibilities for the most important feature: f_12 (blue, ≈ 30% of samples), f_9 (teal, ≈ 10%), f_7 (bright green, ≈ 50%) and f_6 (yellow, ≈ 20%). Igm identifies three of these possibilities, while lime identifies five. Overall, the rankings of RsO, igm and lime differ in complexity, i.e. in how many distinct features they select: the most (second-most/third-most) important feature identified by igm takes 2-3 different values, by lime 5-6, and by RsO 4-6. Consequently, the complexity of the feature ranking computed by RsO lies between those of igm and lime.
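The ranking procedure can be sketched on a toy one-layer ReLU model. The sketch below uses interval arithmetic as a (coarser) stand-in for the zonotope propagation of RsO; the weights, perturbation sizes and helper names are made up for illustration:

```python
import math

def propagate_box(W, b, lo, hi):
    """One layer y = ReLU(W x + b) via interval arithmetic -- a coarse
    over-approximation, standing in for the zonotope propagation of RsO."""
    out_lo, out_hi = [], []
    for w_row, b_i in zip(W, b):
        center = b_i + sum(w * (l + h) / 2 for w, l, h in zip(w_row, lo, hi))
        radius = sum(abs(w) * (h - l) / 2 for w, l, h in zip(w_row, lo, hi))
        out_lo.append(max(0.0, center - radius))
        out_hi.append(max(0.0, center + radius))
    return out_lo, out_hi

def scaled_volume(lo, hi):
    """Scaled interval-hull volume: (prod_a l_a)^(1/d)."""
    d = len(lo)
    return math.prod(h - l for l, h in zip(lo, hi)) ** (1.0 / d)

def feature_importance(W, b, x, delta=1.0, eps=0.01):
    """Perturb one feature by delta, the rest by eps, and score the
    feature by the scaled volume of the resulting reachable set."""
    scores = []
    for f in range(len(x)):
        lo = [xi - (delta if i == f else eps) for i, xi in enumerate(x)]
        hi = [xi + (delta if i == f else eps) for i, xi in enumerate(x)]
        scores.append(scaled_volume(*propagate_box(W, b, lo, hi)))
    return scores

# Toy 2-in/2-out layer with made-up weights; feature 0 has larger weights.
W, b = [[2.0, 0.1], [1.0, 0.2]], [1.0, 1.0]
scores = feature_importance(W, b, [0.0, 0.0])
ranking = sorted(range(2), key=lambda f: -scores[f])  # most important first
```

Since feature 0 carries the larger weights, perturbing it produces the larger reachable set, so it tops the ranking.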
Reachable Set Approximation: Analysis of the Limits. RsU and RsO approximate the reachable set of a ReLU network layer by layer. Within each layer, they compute a linear transformation (defined by the weights and biases) and approximate the outcome of applying ReLU by a set of convex subsets (zonotopes). The number of subsets required to approximate ReLU(Z) is the bottleneck of our approaches. In the worst case, applying ReLU to Z = (c | G), c ∈ R^D, G ∈ R^{n×D}, results in 2^D subsets/zonotopes. For a neural network with K layers of D_1, D_2, ..., D_K neurons, the reachable set approximation thus requires up to 2^{Σ_{k=1}^{K} D_k} subsets/zonotopes. Figure 10 illustrates that the run time of RsO increases linearly with the number of subsets and thus, in the worst case, exponentially with the number of neurons. Hence, the number of neurons limits the applicability of our approaches to large neural networks. To mitigate this, we propose an extension that restricts the amplification (the number of subsets per ReLU application) and the total number of zonotopes (see Section "Balancing approximation tightness and run time"), and which is applicable to larger neural networks. Results for this extension on robustness verification and for the analysis of autoencoders are presented in the next section and in the appendix (see Section 6.2).
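The counting argument can be made concrete: only a neuron whose pre-activation range straddles zero forces a case split, and each such neuron doubles the number of subsets, which gives the worst-case bound above. A small sketch (interval bounds stand in for the per-neuron ranges of a zonotope; the names are ours):

```python
def worst_case_subsets(layer_widths):
    """Worst-case bound 2^(D_1 + ... + D_K) on the number of zonotopes,
    reached if every neuron in every layer forces a ReLU case split."""
    return 2 ** sum(layer_widths)

def relu_splits(bounds):
    """Subsets needed for one layer: each neuron whose pre-activation
    interval straddles zero doubles the count; the others stay linear."""
    straddling = sum(1 for lo, hi in bounds if lo < 0.0 < hi)
    return 2 ** straddling

# Three neurons, only one with a sign change in its bounds -> 2 subsets.
n_layer = relu_splits([(-1.0, 1.0), (0.5, 2.0), (-3.0, -1.0)])
# Five hidden layers of 30 neurons -> up to 2^150 subsets overall.
n_total = worst_case_subsets([30, 30, 30, 30, 30])
```

In practice, many neurons are stably active or stably inactive over the input set, so the actual count is far below this bound; the extension discussed below caps it explicitly.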
Autoencoder Analysis. To illustrate the strength of our approach, we compare reachable sets obtained by RsO and RsU with a sampling-based set approximation. We approximate the reachable set of an autoencoder (three hidden layers, 60×30×60 neurons) with respect to a cube-shaped input set with ε = 0.001. RsO and RsU are restricted such that the maximum amplification of a zonotope is A = 100 and the overall number of zonotopes is less than or equal to B = 1000. As a simple baseline, we introduce a sampling approach that chooses 10^9 points among the vertices of the cube-shaped input set and computes the corresponding outputs. The set spanned by these 10^9 outputs is used to approximate the exact reachable set.
Figure 11: Analysis of an autoencoder (MNIST, cube, ε = 0.001). First row: input image, output image drawn from the reachable set approximated by RsO (second), RsU (third) and sampling (fourth). Second row: extension/size of the pixel range corresponding to the reachable sets computed by RsO (left), RsU (middle) and sampling (right). The smaller the ranges computed by RsO, and the larger the ranges computed by RsU or sampling, the better the performance.
Since we consider autoencoder models, the reachable set consists of pictures from the same space as the input. To visualize the properties of the reachable sets computed by RsO, by RsU and by the sampling approach, we draw example pictures from the reachable sets. Furthermore, we compute the extension/size of the range of each pixel based on the reachable set under consideration ( Figure 11).
Even though we restrict the number of convex subsets to 1000, RsO and RsU result in similar example pictures and similar extensions for each pixel (see Figure 11, second row and third row). This illustrates that our approximations are tight and close to the exact reachable set, since the exact reachable set is enclosed by the under- and over-approximation. In comparison to RsU, the sampling approach results in pixel extensions that are about two times smaller (i.e. worse) and example pictures that are too close to the image reconstructed from the original input. Thus, sampling 10^9 instances from the input set and computing the corresponding outputs still dramatically underestimates the exact reachable set. This shows that RsU outperforms the sampling approach, even if we restrict the overall number of zonotopes and the possible amplification. In conclusion, these results highlight that computing an upper bound (RsO) and a lower bound (RsU) on the reachable set of a neural network provides more information about the network's mapping than sampling.
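The underestimation effect of sampling can be reproduced at toy scale. The sketch below uses a made-up one-layer ReLU network and compares a sampling baseline against per-output interval bounds, which are exact for a single layer and here stand in for an over-approximation such as RsO:

```python
import random

def forward(W, b, x):
    """Tiny ReLU layer used as a stand-in network."""
    return [max(0.0, b_i + sum(w * xi for w, xi in zip(row, x)))
            for row, b_i in zip(W, b)]

def sampled_extension(W, b, x0, eps, n_samples=1000, seed=0):
    """Sampling baseline: draw points from the cube-shaped input set and
    take the per-output min/max of the sampled outputs."""
    rng = random.Random(seed)
    outs = [forward(W, b, [xi + rng.uniform(-eps, eps) for xi in x0])
            for _ in range(n_samples)]
    return [max(o[a] for o in outs) - min(o[a] for o in outs)
            for a in range(len(outs[0]))]

def interval_extension(W, b, x0, eps):
    """Per-output interval bounds; exact for a single layer, and a sound
    upper bound on any sampled extension."""
    ext = []
    for row, b_i in zip(W, b):
        c = b_i + sum(w * xi for w, xi in zip(row, x0))
        r = eps * sum(abs(w) for w in row)
        ext.append(max(0.0, c + r) - max(0.0, c - r))
    return ext

W, b = [[1.0, -2.0], [0.5, 0.5]], [0.2, -0.1]
lo_ext = sampled_extension(W, b, [0.0, 0.0], eps=0.5)
hi_ext = interval_extension(W, b, [0.0, 0.0], eps=0.5)
# Every sampled extension is bounded by the exact one: sampling can
# only under-estimate the reachable set.
```

Finitely many samples never reach the exact per-output bounds, mirroring (at much smaller scale) how 10^9 samples still underestimate the autoencoder's reachable set.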

Conclusion
We propose RsO and RsU, two efficient approaches for over- and under-approximating the reachable sets of ReLU networks. The approximated reachable sets are applicable to the analysis of neural network properties: we analyze and enhance the (non-)robustness properties of both classifiers and regression models. Our approach outperforms PGD attacks as well as state-of-the-art verification methods for classifiers with respect to non-norm-bound perturbations. Reachable sets provide more information than a binary robustness certificate. We use this information for class-specific verification, robustness quantification, robust training, distinguishing between reliable and non-reliable predictions, ranking features according to their influence on a prediction, and analyzing autoencoders.

Extension of RsO and RsU for Large(r) Neural Networks
The subsection "Reachable Set Approximation: Analysis of the Limits" (page 13) shows that the number of zonotopes required to approximate the reachable set may, in the worst case, increase exponentially with the number of neurons of the neural network. Thus, we propose an extension (see page 7: "Balancing approximation tightness and run-time") that allows us to restrict the total number of subsets (max. zono.) and the amplification (max. amp.). The maximum amplification is the maximum number of subsets used to approximate ReLU(Z) for a single zonotope Z subjected to ReLU, while the maximum number of zonotopes is the maximum number of subsets used to approximate the reachable set w.r.t. the whole neural network. This restriction makes it possible to use RsO for robustness verification of larger neural networks. To illustrate how these restrictions affect the tightness of our approximations, we compare the performance of RsO for different max. zono. and max. amp. values on a neural network with 5 hidden layers of 30 neurons on the FashionMNIST data set (see Figure 12). Without limitations, RsO might require up to 2^150 ≈ 1.43 · 10^45 subsets. Figure 12 illustrates that the number of robustness certificates increases with the maximum amplification (left plot). Furthermore, the number of robustness certificates increases with max. zono. up to 10,000, but choosing a larger max. zono. does not result in a further increase (right plot). Thus, to obtain tight approximations, max. amp. should be chosen as large as computationally feasible, while max. zono. should be chosen as small as possible but as large as necessary to reach maximum performance.
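One simple way such a budget can be enforced, sketched here with axis-aligned boxes in place of general zonotopes (the merging strategy and names are our illustration, not necessarily the paper's exact mechanism): once the number of subsets exceeds max. zono., surplus subsets are merged into their joint interval hull, trading tightness for a bounded count while preserving the over-approximation property.

```python
def interval_hull(boxes):
    """Joint interval hull of a list of boxes; each box is (lows, highs)."""
    dim = len(boxes[0][0])
    lows = [min(box[0][i] for box in boxes) for i in range(dim)]
    highs = [max(box[1][i] for box in boxes) for i in range(dim)]
    return (lows, highs)

def enforce_budget(boxes, max_zono):
    """Keep at most max_zono subsets: merge the surplus into one hull.
    The result still covers the union of the inputs, just less tightly."""
    if len(boxes) <= max_zono:
        return boxes
    kept, surplus = boxes[:max_zono - 1], boxes[max_zono - 1:]
    return kept + [interval_hull(surplus)]

# Three 1-D boxes, budget of two: the last two are merged into [2, 6].
boxes = [([0.0], [1.0]), ([2.0], [3.0]), ([5.0], [6.0])]
merged = enforce_budget(boxes, max_zono=2)
```

Because merging only ever enlarges the represented set, robustness certificates obtained under the budget remain valid; the trade-off is that tighter budgets may fail to certify inputs that an unrestricted run would verify, matching the saturation observed in Figure 12.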