Shared Certificates for Neural Network Verification

Existing neural network verifiers compute a proof that each input is handled correctly under a given perturbation by propagating a symbolic abstraction of reachable values at each layer. This process is repeated from scratch independently for each input (e.g., image) and perturbation (e.g., rotation), leading to an expensive overall proof effort when handling an entire dataset. In this work, we introduce a new method for reducing this verification cost without losing precision based on a key insight that abstractions obtained at intermediate layers for different inputs and perturbations can overlap or contain each other. Leveraging our insight, we introduce the general concept of shared certificates, enabling proof effort reuse across multiple inputs to reduce overall verification costs. We perform an extensive experimental evaluation to demonstrate the effectiveness of shared certificates in reducing the verification cost on a range of datasets and attack specifications on image classifiers including the popular patch and geometric perturbations. We release our implementation at https://github.com/eth-sri/proof-sharing.


Introduction
The success of neural networks across a wide range of application domains [23,33] has led to their widespread application and study. Despite this success, neural networks remain vulnerable to adversarial attacks [9,25], which raises concerns over their trustworthiness in safety-critical settings such as autonomous driving and medical devices. To overcome this barrier, formal verification of neural networks has been proposed as a key technology in the literature [44]. As a result, recent years have witnessed a growing interest in verifying critical safety properties of neural networks (e.g., fairness, robustness) [16,19,20,34,35,45,47], specified using pre- and postconditions over network inputs and outputs, respectively. Conceptually, existing verifiers propagate sets of inputs in the precondition, captured in symbolic form (e.g., convex sets), through the network, an expensive process that produces over-approximations of all possible values at intermediate layers.
The final abstraction of the output can then be used to check postconditions.
The key technical challenge all existing verifiers aim to address is speeding up and scaling the certification process, i.e., faster and more efficient propagation of symbolic shapes while reducing the overapproximation error.
This work: accelerating certification via proof sharing. In this work, we propose a new, complementary method for accelerating neural network verification based on the key observation that instead of treating each certification attempt in isolation, as existing verifiers do, we can reuse proof effort among multiple such attempts, thus obtaining significant overall speed-ups without losing precision. Fig. 1 illustrates both standard verification and the concept of proof sharing.
In standard verification, an input region $I_1(x)$ (orange square) is propagated from left to right, obtaining intermediate shapes at each intermediate layer (here the goal is to verify that all points in the input region are classified as "cat" by the neural network $N$). We observe that the abstraction obtained for a new region $I_2(x)$ (e.g., blue shapes) can be contained inside existing abstractions from $I_1(x)$, an effect we term proof subsumption. This effect can be observed both between abstractions obtained from different specifications (e.g., $\ell_\infty$ and adversarial patches) for the same data point and between proofs for the same property but different, yet semantically similar, inputs. Building on this observation, we introduce the notion of proof sharing via templates. Proof sharing works in two steps: first, we leverage abstractions from existing proofs in order to create templates, and second, we augment the verifier with these templates, stopping the expensive propagation at an intermediate layer as soon as the newly generated abstraction is included inside an existing template. Key technical ingredients to the effectiveness of our approach are fast template generation and inclusion checking techniques. We experimentally demonstrate that proof sharing can achieve significant speed-ups in challenging scenarios, including proving robustness to adversarial patches [11] and geometric perturbations [4], across different neural network architectures.

Main Contributions. Our key contributions are:
- An introduction and formalization of the concept of proof sharing in neural network verification: the idea that some proofs capture others (§3).
- A general framework leveraging the above concept, enabling proof effort reuse via proof templates (§4).
- A thorough experimental evaluation involving verification of neural network robustness against challenging adversarial patch and geometric perturbations, demonstrating that our methods can achieve proof match rates of up to 95% as well as provide non-trivial end-to-end certification speed-ups (§5).

Fig. 1: Visualization of neural network verification. The input regions $I_1(x)$, $I_2(x)$ are propagated layer by layer through a neural network $N$. The high-dimensional convex shapes are visualized in 2D. While initially $I_1(x)$ and $I_2(x)$ only slightly overlap, at layer $k$, $N_{1:k}(I_2(x))$ is fully contained in $N_{1:k}(I_1(x))$.

Background
Here we formally introduce the necessary background for proof sharing.
Neural Network. A neural network $N$ is a function $N \colon \mathbb{R}^{d_{in}} \to \mathbb{R}^{d_{out}}$, commonly built from individual layers, $N = N_L \circ N_{L-1} \circ \dots \circ N_1$. Throughout this text, we consider feed-forward neural networks, where each layer $N_i(x) = \max(Ax + b, 0)$ consists of an affine transformation ($Ax + b$) followed by a rectified linear unit (ReLU), which applies the max with 0 elementwise. A neural network classifying inputs into $c$ classes outputs $d_{out} := c$ scores, one for each class, and assigns the class with the highest score as the predicted one. While, as is common in the neural network verification literature, we use image classification as a proxy task, many other applications work analogously. Our approach also naturally extends to other types of neural networks if verifiers exist for these architectures. We discuss the challenges and limitations of such generalizations in §4.5. In the following, for $k < L$, we let $N_{1:k}$ denote the application of the first $k$ layers and $N_{k+1:L}$ the application of the last $L - k$ layers.
(Local) Neural Network Verification. Given a set of inputs and a postcondition $\psi$, the goal of neural network verification is to prove that $\psi$ holds over the output of the neural network corresponding to the given set of inputs. In this work, we focus on local verification, proving that $\psi$ holds for the network output for a given region $I(x) \subseteq \mathbb{R}^{d_{in}}$ formed around the input $x$. Formally, we state this as:
Problem 1 (Local neural network verification). For a region $I(x) \subseteq \mathbb{R}^{d_{in}}$, neural network $N$, and postcondition $\psi$, verify that $\forall z \in I(x).\ N(z) \models \psi$. We write $I(x) \models \psi$ as shorthand for this property.
Here, we restrict ourselves to verifiers based on abstract interpretation [12,16], as they achieve state-of-the-art precision and scalability [35,34]. Further, many other popular verifiers [43,47] can be formulated using abstract interpretation.
These verifiers propagate $I(x)$ symbolically through the network $N$ layer by layer using abstract transformers, which overapproximate the effect of applying the transformations defined in the different layers on symbolic shapes. The propagation yields an abstraction of the exact shape at each layer. The verifiers finally check if the abstracted output implies $\psi$. This is showcased in Fig. 1, where the input regions $I_1(x)$ and $I_2(x)$ are propagated layer by layer through $N$.
For a verifier $V$, we let $V(I(x), N)$ denote the abstraction obtained after the propagation of $I(x)$ through the network $N$. We declutter notation by overloading $N$ and writing $N(I(x))$ for the same abstraction if $V$ is clear from context, i.e., $V(I(x), N) = N(I(x))$.
We consider robustness verification, where the goal is to prove that the network classification does not change within an input region. A common input region is the $\ell_\infty$-bounded additive noise, defined as $I_\epsilon(x) := \{ z \mid \|z - x\|_\infty \le \epsilon \}$. Here, $\epsilon$ defines the size of the maximal perturbation to $x$. The postcondition $\psi$ denotes classification to the same class as $x$. Throughout this paper, we consider different instantiations for $I(x)$ but assume that $\psi$ denotes classification invariance (although other choices would work analogously). Due to this, we refer to $I(x)$ as input region and specification interchangeably. For example, in Fig. 1, the goal is to verify that all points contained in $N(I_1(x))$ are classified as "cat".
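The layer-by-layer propagation and the final check of $\psi$ can be sketched with the simplest abstract domain, per-dimension intervals (Box). This is a minimal illustration under stated assumptions, not the verifier used in this work: the helper names (`linf_box`, `propagate_box`, `is_robust`), the toy weights, and the clipping of inputs to [0, 1] are hypothetical choices made for the example.

```python
from typing import List, Sequence, Tuple

Box = Tuple[List[float], List[float]]                      # (lower bounds, upper bounds)
Layer = Tuple[Sequence[Sequence[float]], Sequence[float]]  # (A, b)


def linf_box(x: Sequence[float], eps: float) -> Box:
    """l_inf region around x, clipped to an assumed pixel range [0, 1]."""
    return ([max(0.0, v - eps) for v in x], [min(1.0, v + eps) for v in x])


def propagate_box(box: Box, layers: Sequence[Layer]) -> Box:
    """Sound interval propagation through layers of the form max(Ax + b, 0)."""
    lower, upper = box
    for A, b in layers:
        new_lower, new_upper = [], []
        for row, bias in zip(A, b):
            lo = hi = bias
            for a, l, u in zip(row, lower, upper):
                # A non-negative weight keeps the bound order; a negative one swaps it.
                lo += a * l if a >= 0 else a * u
                hi += a * u if a >= 0 else a * l
            new_lower.append(max(0.0, lo))  # ReLU is monotone: apply it to both bounds
            new_upper.append(max(0.0, hi))
        lower, upper = new_lower, new_upper
    return lower, upper


def is_robust(box_out: Box, label: int) -> bool:
    """Postcondition: the score of `label` provably exceeds every other score."""
    lower, upper = box_out
    return all(lower[label] > upper[j] for j in range(len(upper)) if j != label)


# Hypothetical two-layer toy network and input.
layers = [([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
          ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0])]
x = [0.8, 0.2]
small = is_robust(propagate_box(linf_box(x, 0.1), layers), label=0)
large = is_robust(propagate_box(linf_box(x, 0.4), layers), label=0)
```

In this toy setting the small region verifies and the large one does not, which reflects the loss of precision that grows with the size of the input region.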

Proof Sharing with Templates
Before introducing our framework for proof sharing, we further expand the motivating example discussed in Fig. 1.

Motivation: Proof Subsumption
As stated earlier, we empirically observed that for many input regions $I_i(x)$ and $I_j(x)$, the abstraction corresponding to one region at some intermediate layer $k$ contains that of another. Formally:
Definition 1 (Proof Subsumption). For specifications $I_i(x)$, $I_j(x)$, we say that the proof of $I_i(x)$ subsumes that of $I_j(x)$ if at some layer $k$, $N_{1:k}(I_j(x)) \subseteq N_{1:k}(I_i(x))$, which we denote as $I_j(x) \subseteq_{N,k} I_i(x)$.
While not formally required, particularly interesting are cases where proof subsumption occurs despite $I_j(x) \not\subseteq I_i(x)$. This form of proof subsumption is showcased in Fig. 1, where $I_1(x)$ and $I_2(x)$ have only a small overlap, yet $I_2(x) \subseteq_{N,k} I_1(x)$. For another example, consider a neural network $N$ trained as a hand-written digit classifier for the MNIST dataset [24] (example shown in Fig. 2) and the following two specifications:
- $\ell_\infty$-bounded perturbations: all the pixels in an input image can be changed independently and arbitrarily by a small amount.
- adversarial patches [11]: a $p \times p$ patch inside which the pixel intensity can vary arbitrarily is placed on an image at coordinates $(i, j)$, for which we write $I^{i,j}_{p \times p}$.
We showcase a patch in Fig. 2 and formally define them in §4.3.

Fig. 3: The abstraction obtained for $I_\epsilon(x)$ (blue) contains that for $I^{i,j}_{2 \times 2}(x)$ (orange) (projected to $d = 2$).
Clearly $I^{i,j}_{p \times p}(x) \not\subseteq I_\epsilon(x)$ (unless $\epsilon = 1$). In Table 1, we show that for a classifier (5 layers with 100 neurons each) we indeed observe proof subsumption. We report the accuracy, i.e., the rate of correct predictions on the unperturbed test data, as well as the certified accuracy, i.e., the rate of samples $x$ for which the prediction is correct and $I(x) \models \psi$ is verified, for $I_\epsilon$ with $\epsilon = 0.1$ and $0.2$ over the whole test set. We also show the percentage of $I^{i,j}_{2 \times 2}(x)$ whose abstraction is contained in that of $I_\epsilon(x)$ at layer $k$. To this end, we pick 1000 random $x$ for which $I_\epsilon(x)$ is verifiable and sample 2 $(i, j)$ pairs each. We utilize a Box domain verifier and a robustly trained network [26]. Fig. 3 shows a patch specification $I^{i,j}_{2 \times 2}(x)$ (in orange) contained in the $\ell_\infty$ specification $I_\epsilon$ (in blue), projected to 2 dimensions via PCA.
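Proof subsumption in the sense of Definition 1 can be made concrete with a small Box example. The sketch below is illustrative only: the two-pixel "image", the single-pixel patch, the layer weights, and the helper names are hypothetical and were chosen so that the patch region is not contained in the $\ell_\infty$ region at the input, while its abstraction is contained after one layer.

```python
from typing import List, Sequence, Tuple

Box = Tuple[List[float], List[float]]  # (lower bounds, upper bounds)


def relu_layer_box(box: Box, A: Sequence[Sequence[float]], b: Sequence[float]) -> Box:
    """Interval image of one layer max(Ax + b, 0)."""
    lower, upper = box
    out_lo, out_hi = [], []
    for row, bias in zip(A, b):
        lo = bias + sum(a * (l if a >= 0 else u) for a, l, u in zip(row, lower, upper))
        hi = bias + sum(a * (u if a >= 0 else l) for a, l, u in zip(row, lower, upper))
        out_lo.append(max(0.0, lo))
        out_hi.append(max(0.0, hi))
    return out_lo, out_hi


def box_contains(outer: Box, inner: Box) -> bool:
    """True if `inner` lies inside `outer` in every dimension."""
    return all(ol <= il and iu <= ou
               for ol, ou, il, iu in zip(outer[0], outer[1], inner[0], inner[1]))


# Hypothetical 2-pixel "image" x = (0.5, 0.5).
I_eps = ([0.3, 0.3], [0.7, 0.7])     # l_inf region with eps = 0.2
I_patch = ([0.0, 0.5], [1.0, 0.5])   # 1x1 "patch": pixel 0 is free in [0, 1]

A, b = [[0.2, 1.0], [-1.0, 0.0]], [0.0, -0.1]   # hypothetical first layer
S_eps = relu_layer_box(I_eps, A, b)
S_patch = relu_layer_box(I_patch, A, b)

subsumed_at_input = box_contains(I_eps, I_patch)     # patch is not inside the l_inf box
subsumed_at_layer_1 = box_contains(S_eps, S_patch)   # its abstraction is, after one layer
```

Here the first neuron down-weights the patched pixel and the second is switched off by the ReLU for both regions, so the patch abstraction ends up inside the $\ell_\infty$ abstraction.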

Reasons for Proof Subsumption
In Table 1, we observe that the rate of proof subsumption increases with larger $\epsilon$ and $k$. These observations give an intuition as to why we observe proof subsumption. First, as input regions pass through the neural network, in each layer the abstractions become more imprecise. While this fundamentally limits verification, it makes the subsumption of abstractions more probable. This effect increases when increasing $\epsilon$ for $I_\epsilon$. Second, and more fundamentally, we observed that semantically similar yet distinct image inputs, e.g., two similar-looking handwritten digits, have activation vectors that grow closer in $\ell_2$ norm as they pass through the layers of the neural network [23,37]. This effect is a consequence of the neural network distilling low-level information (e.g., individual pixel values) into high-level concepts (e.g., the classes of digits). As specifications (and their proofs) correspond to sets of concrete inputs, a similar effect may apply. We conjecture that these two effects drive the observed proof subsumption.

Proof Sharing with Templates
Leveraging this insight, we introduce the idea of proof sharing via templates, showcased in Fig. 4. We use an abstraction obtained from a robustness proof $N_{1:k}(I_1(x))$ at layer $k$ to create a template $T$. After ensuring that $T$ is verifiable, it can be used to shortcut the verification of other regions, e.g., of $I_2(x), \dots, I_5(x)$. Formally, we decompose proof sharing into two sub-problems: (i) the generation of proof templates and (ii) the matching of abstractions corresponding to other properties to these templates. For simplicity, here we only consider templates at a single layer $k$ of the neural network, and we show an extension to multiple layers in §4.3.

Fig. 4: Conceptualization of proof sharing with templates. In (a) we create a verifiable template $T$ (black-dashed border) from specification $N_{1:k}(I_1(x))$. When verifying new specifications $I_2, \dots, I_5$, shown in (b), we can shortcut the verification of all but $I_5$ by subsuming them in $T$.
Our goal is to construct a template $T$ at layer $k$ that implies the postcondition and captures abstractions at layer $k$ obtained from propagating several $I_i(x)$. As it is challenging to find a single $T$ that captures abstractions corresponding to many input regions, yet remains verifiable, we allow a set of templates $\mathcal{T}$. We state this formally as:
Problem 2 (Template Generation). For a given neural network $N$, input $x$, set of specifications $I_1, \dots, I_r$, layer $k$, and postcondition $\psi$, find a set of templates $\mathcal{T}$ with $|\mathcal{T}| \le m$ such that:
$$\begin{aligned} \arg\max_{\mathcal{T}} \;& \sum_{i=1}^{r} \Big[ \bigvee_{T \in \mathcal{T}} N_{1:k}(I_i(x)) \subseteq T \Big] \\ \text{s.t.} \;& \forall T \in \mathcal{T}.\; N_{k+1:L}(T) \models \psi \end{aligned} \tag{1}$$
Intuitively, Eq. (1) aims to find a set $\mathcal{T}$ of templates $T$ at layer $k$ such that the maximal number (via the sum) of specifications $I_1, \dots, I_r$ is contained in at least one template $T$ (via the disjunction), while ensuring that the individual $T$ are still verifiable (via the constraint on the second line). As neural network verification, required by the constraints of Eq. (1), is NP-complete [19], computing an exact solution to Problem 2 is computationally infeasible. Therefore, we compute an approximate solution to Eq. (1). In general, Problem 2 does not necessarily require that the templates $T$ are created from previous proofs. However, building on proof subsumption, as discussed in §3.1, in §4 we will infer the templates from previously obtained abstractions.
To leverage proof sharing once the templates $\mathcal{T}$ are obtained, we need to be able to match an abstraction $S = N_{1:k}(I(x))$, to be verified using proof transfer, to a template in $\mathcal{T}$:
Problem 3 (Template Matching). Given a set of templates $\mathcal{T}$ at layer $k$ of a neural network $N$, and a new input region $I(x)$, determine whether there exists a $T \in \mathcal{T}$ such that $S \subseteq T$, where $S = N_{1:k}(I(x))$.
Together, Problems 2 and 3 outline a general framework for proof sharing, permitting many instantiations. We note that Problems 2 and 3 present an inherent precision vs. speed trade-off: Problem 3 can be solved most efficiently for small values of $m = |\mathcal{T}|$ and simpler representations of $T$ (allowing faster checking of $S \subseteq T$), at the cost of lower proof matching rates. Alternatively, Eq. (1) can be maximized by large $m$ and $T$ represented by complex abstractions, thus attaining high precision but expensive template generation and matching.
Beyond proof sharing on the same input. In this section, we focused on proof sharing for different specifications of the same input $x$. However, we observed that proof sharing is even possible between specifications defined on different inputs $x$ and $x'$. To facilitate the use of templates in this setting, Eq. (1) in Problem 2 can be adapted to consider an input distribution. We provide an investigation along these lines in Appendix A.

Efficient Verification via Proof Sharing
We now consider an instantiation of proof sharing where we are given an input $x$ and properties $I_1, \dots, I_r$ to verify. Our general approach, based on Problems 2 and 3, is shown in Algorithm 1. In this section, we first discuss Algorithm 1 in general. We then describe the possible choices of abstract domains and their implications on the algorithm, followed by a discussion on template generation for two different specific problems. Finally, we conclude the section with a discussion on the conditions for effective proof sharing verification.
In Algorithm 1, we first create the set of templates $\mathcal{T}$ (Line 1, discussed shortly) and subsequently verify $I_1, \dots, I_r$ using $\mathcal{T}$. Here, we consider two, potentially identical, verifiers $V_T$ and $V_S$, where $V_T$ is used to create the templates $\mathcal{T}$ and $V_S$ is used to propagate input regions up to the template layer $k$. For each $I_i$, we propagate it up to layer $k$ (Line 4) to obtain $S = N_{1:k}(I_i(x))$ and check if we can match it to a template $T_j \in \mathcal{T}$ (Line 6) using an inclusion check. If a match is found, then we conclude that $N(I_i(x)) \models \psi$ and set the verification output $v_i$ to True. If this is not the case (Line 11), we verify $N(I_i(x)) \models \psi$ directly by checking $V_S(S, N_{k+1:L}) \models \psi$. If the template generation fails, we revert to verifying $I_i$ by applying $V_S$ in the usual way (omitted in Algorithm 1).
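The control flow described above can be sketched as follows. Algorithm 1 itself is not reproduced in this text, so the function below is one reading of the prose rather than the authors' implementation. The callables `propagate_to_k`, `verify_from_k`, and `contains` are hypothetical stand-ins for $V_S$ up to layer $k$, $V_S$ on the remaining layers, and the inclusion check; the toy abstractions are 1-D intervals.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

Spec = TypeVar("Spec")
Abs = TypeVar("Abs")


def verify_with_templates(
    specs: Sequence[Spec],
    templates: Sequence[Abs],
    propagate_to_k: Callable[[Spec], Abs],   # stands in for V_S(I_i(x), N_{1:k})
    verify_from_k: Callable[[Abs], bool],    # stands in for V_S(S, N_{k+1:L}) |= psi
    contains: Callable[[Abs, Abs], bool],    # contains(T, S) decides S ⊆ T
) -> List[bool]:
    """Return v_1, ..., v_r; the templates are assumed to be verified already."""
    results = []
    for spec in specs:
        S = propagate_to_k(spec)
        if any(contains(T, S) for T in templates):
            results.append(True)               # matched: skip layers k+1..L
        else:
            results.append(verify_from_k(S))   # fallback: finish the proof as usual
    return results


# Toy instantiation with 1-D intervals (lo, hi) as abstractions.
Interval = Tuple[float, float]
fallback_calls: List[Interval] = []


def toy_verify_from_k(S: Interval) -> bool:
    fallback_calls.append(S)
    return S[1] < 1.0            # hypothetical postcondition: stay below 1.0


results = verify_with_templates(
    specs=[(0.1, 0.5), (0.2, 0.9), (0.5, 1.2)],
    templates=[(0.0, 0.8)],      # a verified template, since 0.8 < 1.0
    propagate_to_k=lambda spec: spec,   # identity keeps the toy readable
    verify_from_k=toy_verify_from_k,
    contains=lambda T, S: T[0] <= S[0] and S[1] <= T[1],
)
```

In the toy run, the first specification is matched and never reaches the fallback, while the other two are propagated to the end as a conventional verifier would do.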
Soundness. As long as the templates $\mathcal{T}$ are sound, this procedure is sound, i.e., Algorithm 1 only returns $v_i = \text{True}$ if $\forall z \in I_i(x).\ N(z) \models \psi$ holds. Formally:
Theorem 1. Algorithm 1 is sound if $V_S$ is sound and $\forall T \in \mathcal{T}.\ N_{k+1:L}(T) \models \psi$.
This holds by the construction of the algorithm:
Proof. For a given $x$ and $I_i$, Algorithm 1 only claims $v_i = \text{True}$ if either the check in (i) Line 6 or (ii) Line 11 succeeds. Since $V_S$ is sound, we know that $\forall z \in I_i(x).\ N_{1:k}(z) \in S$. Therefore, in case (i), by our requirement on $\mathcal{T}$ as well as $S \subseteq T$, it follows that $\forall z \in I_i(x).\ N(z) \models \psi$. In case (ii), we execute Line 12 and the same property holds due to the soundness of $V_S$.
Importantly, Theorem 1 shows that the generation process of $\mathcal{T}$ does not affect the overall soundness as long as the set of templates $\mathcal{T}$ fulfills the condition in Theorem 1. In particular, this means that when solving Problem 2, it suffices to show that the side condition ($\forall T \in \mathcal{T}.\ N_{k+1:L}(T) \models \psi$) holds, while heuristically approximating the actual optimization criteria. We let $V_T$ denote the verifier used to ensure this property in gen_templates.
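Precision. We say a verifier $V_1$ is more precise than another verifier $V_2$ on $N$ if, out of a set of specifications, it can verify some that $V_2$ cannot.
Theorem 2. If $V_S(V_S(I(x), N_{1:k}), N_{k+1:L}) = V_S(I(x), N)$, then Algorithm 1 is at least as precise as $V_S$.
Proof. Even if the inclusion check in Line 6 fails, due to Line 12 we output $v_i = V_S(V_S(I_i(x), N_{1:k}), N_{k+1:L}) \models \psi$, which by our requirement equals $v_i = V_S(I_i(x), N) \models \psi$. Therefore, we have at least the precision of $V_S$.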
The required property holds for any verifier $V_S$ for which the abstractions of all network layers depend only on the abstractions from previous layers, and it is fulfilled for all verifiers considered in this paper. For verifiers $V_S$ that do not fulfill the required property, potential losses in precision can be remedied (at the cost of runtime) by using $V_S(I_i(x), N_{1:L})$ in Line 12. Interestingly, it is even possible to increase the precision of Algorithm 1 over $V_S$ by creating templates $\mathcal{T}$ that are verified with a more precise verifier $V_T$. However, in this discussion, we restrict ourselves to speed gains. We believe that obtaining precision gains requires instantiating our framework with a significantly different approach than that taken for improving speed, which is the main focus of our work. We leave this as an interesting item for future work.
Run-Time. Here, we aim to characterize the run-time of Algorithm 1 as well as its speed-up over conventional verification. For an input $x$ (keeping the other parameters fixed), the expected run time is
$$t_{PS} = t_T + r \left( t_S + t_\subseteq + (1 - \rho)\, t_\psi \right), \tag{2}$$
where $t_T$ is the expected time required to generate the templates at Line 1, $r$ is the number of specifications to be verified, $t_S$ is the expected time to compute $S$ (Line 4), $t_\subseteq$ is the time to check $S \subseteq T$ for $T \in \mathcal{T}$ until a match is found (Line 5 to Line 10), $\rho \in [0, 1]$ is the rate of specifications for which a matching template is found, and $t_\psi$ is the time required to check $\psi$ on the network output corresponding to $S$ (Line 12). This time is minimized if the individual expected run times $t_T$, $t_S$, $t_\psi$ are minimal and $\rho$ is large (i.e., close to 1). Unfortunately, computing the template match rate $\rho$ analytically is challenging and requires global reasoning over the neural network for all valid inputs, which are not clearly defined. However, our empirical analysis (in §5) shows that $\rho$ is higher when templates are created at later layers (as in §3.1).
To determine the speed-up compared to a baseline standard verifier, we make the simplifying assumption that there is a single verifier $V = V_S = V_T$ that has expected run-time $\nu$ for each layer. Thus, the expected run-time for the conventional verifier is $t_{BL} = rL\nu$. We have $t_T = \lambda m L \nu$, $t_S = k\nu$, $t_\psi = (L - k)\nu$, $t_\subseteq = \eta m$, and ultimately $t_{PS} = (\lambda m + r(1 - \rho))L\nu + r\rho k\nu + r\eta m$ for constants $\lambda \in \mathbb{R}_{>0}$, which indicates the overhead in generating one template over just verifying it, and $\eta \in \mathbb{R}_{>0}$, which denotes the time required to perform an inclusion check for one template. As this phrasing shows, Algorithm 1 has the same asymptotic runtime as the base verifier $V$. Further, this formulation allows us to write our expected speed-up as
$$\frac{t_{BL}}{t_{PS}} = \frac{r}{\lambda m + \eta r m / (L\nu) + r \rho k / L + r(1 - \rho)}.$$
This speed-up is maximized when $k$ is small compared to $L$, i.e., templates are placed early in the neural network, the matching rate $\rho$ is close to 1, and $m$, $\lambda$, $\eta$ are small, i.e., generation and matching are fast. Unfortunately, these requirements are at odds with each other: as we show in §5, higher $m$ leads to a higher matching rate $\rho$, and $\rho$ is naturally higher for templates later in the neural network (higher $k$). Thus, high speed-ups require careful hyper-parameter choices.
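The run-time model can be evaluated numerically to see how the parameters interact. The sketch below assumes the decomposition $t_{PS} = t_T + r(t_S + t_\subseteq + (1 - \rho) t_\psi)$, which is consistent with the speed-up expression above, and plugs in the per-layer terms from the single-verifier assumption. All parameter values are hypothetical and serve only to exercise the formulas; they are not measurements.

```python
def expected_runtimes(r, L, k, m, rho, nu, lam, eta):
    """Expected run time of the baseline and of proof sharing under the
    single-verifier assumption (run time nu per layer)."""
    t_T = lam * m * L * nu      # template generation
    t_S = k * nu                # propagate one specification up to layer k
    t_psi = (L - k) * nu        # finish an unmatched proof
    t_sub = eta * m             # inclusion checks against m templates
    t_BL = r * L * nu
    t_PS = t_T + r * (t_S + t_sub + (1.0 - rho) * t_psi)
    return t_BL, t_PS


def speed_up(r, L, k, m, rho, nu, lam, eta):
    """Closed-form ratio t_BL / t_PS."""
    return r / (lam * m + eta * r * m / (L * nu) + r * rho * k / L + r * (1.0 - rho))


# Hypothetical parameter values, chosen only to exercise the formulas.
params = dict(r=729, L=5, k=2, m=1, rho=0.9, nu=1.0, lam=10.0, eta=0.01)
t_BL, t_PS = expected_runtimes(**params)
ratio = speed_up(**params)
no_match_ratio = speed_up(**dict(params, rho=0.0))
```

With a match rate of zero the ratio drops below 1 in this model, since template generation and inclusion checks are then pure overhead.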
To showcase how we can achieve good templates as well as fast matching, we next discuss the choice of the abstract domain to be used in the propagation and the representation of the templates.Then we discuss the template generation procedure and instantiate it for the verification of robustness to adversarial patches and geometric perturbations.

Choice of Abstract Domain
To solve Problems 2 and 3 in a way that minimizes the expected runtime and maximizes the overall precision, the choice of abstract domain is crucial. Here we briefly review common choices of abstract domains for neural network verification and how they are suited to our problem. Geometrically, these domains can be thought of as a convex abstraction of the set of vectors representing reachable values at each layer of the neural network. We say that an abstraction $a_1$ is more precise than another abstraction $a_2$ if and only if $a_1 \subseteq a_2$, i.e., all points in $a_1$ occur in $a_2$. Similarly, we say that a domain is more precise than another if it can express all abstractions in the other domain.
For efficient proof sharing, we require a fast inclusion check $S \subseteq T$, which is challenging in our context due to the high dimensionality $d$ of the intermediate neural network layers. While we point the interested reader to [32] for a detailed discussion, we summarize the key results in Table 2. There, ✓ denotes feasibility, i.e., low polynomial runtime (usually $2d$ comparisons, sometimes with an additional matrix multiplication), and ✗ denotes infeasibility, e.g., exponential run time. If $T$ is a Box, all checks are simple, as it suffices to compute the outer bounding box of $S$ and compare the $2d$ constraints. If $T$ is a DP Polyhedra, these checks require a linear program (LP) to be solved. While the size of this LP permits a low theoretical time complexity in case $S$ is a Box or DP Polyhedra, in practice we consider calling an LP solver too expensive (denoted as (✓)). For Zonotopes, these checks are generally infeasible, as they require enumeration of the faces or corners, which is computationally expensive for large $d$ and P. While Zonotopes can be encoded as Polyhedra (but not necessarily DP Polyhedra) and the same LP inclusion check as for P could be used, the resulting LP would require exponentially many variables due to the previously mentioned enumeration. However, by placing constraints on the matrix $A$ in Eq. (3), these inclusion checks can be performed efficiently. The mapping of a Zonotope to such a restricted Zonotope is called order reduction via outer-approximation [32,21].
In particular, for a Zonotope $Z$ we consider the order reduction $\alpha_{\text{Box}}$ to its outer bounding box (where $A$ is diagonal) and note that other choices of $\alpha$ are possible (e.g., the reduction to affine transformations of a hyperbox).
For a general Zonotope $Z$, its outer bounding box $Z' = \alpha_{\text{Box}}(Z)$ can be easily obtained. The center of $Z'$ is $a$, the center of $Z$. The width $d \in \mathbb{R}^d_{\ge 0}$ is given as $d_i = \sum_j |A_{i,j}|$. $Z'$ can be represented as either a Box or a Zonotope (with $A = \operatorname{diag}(d)$). To check $S \subseteq Z'$ for a general Zonotope $S$, it suffices to check $\alpha_{\text{Box}}(S) \subseteq Z'$, which reduces to the simple inclusion check for boxes.
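The bounding-box reduction and the resulting inclusion check can be sketched in a few lines. The sketch assumes the common Zonotope form $z = a + A\varepsilon$ with $\varepsilon \in [-1, 1]^p$, under which the box radius in dimension $i$ is the sum of absolute generator entries in row $i$; the names `alpha_box` and `matches_template` and the numeric values are hypothetical.

```python
from typing import List, Sequence, Tuple

Zonotope = Tuple[Sequence[float], Sequence[Sequence[float]]]  # (center a, generators A)
Box = Tuple[List[float], List[float]]                          # (center, radius)


def alpha_box(z: Zonotope) -> Box:
    """Outer bounding box: same center, radius d_i = sum_j |A_ij|."""
    a, A = z
    return list(a), [sum(abs(v) for v in row) for row in A]


def matches_template(template: Box, s: Zonotope) -> bool:
    """Check S ⊆ T for a box template T by comparing alpha_box(S) with T."""
    (ct, rt), (cs, rs) = template, alpha_box(s)
    return all(c2 - r2 >= c1 - r1 and c2 + r2 <= c1 + r1
               for c1, r1, c2, r2 in zip(ct, rt, cs, rs))


# Hypothetical zonotope with two generators in two dimensions.
S = ([1.0, 2.0], [[1.0, 0.5], [0.0, -2.0]])
bounding = alpha_box(S)
wide_template = ([1.0, 2.0], [2.0, 2.0])
narrow_template = ([1.0, 2.0], [1.0, 3.0])
hit = matches_template(wide_template, S)
miss = matches_template(narrow_template, S)
```

The check needs only $2d$ comparisons once the row sums are computed, which is what makes Box templates attractive for matching.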
Based on the above discussion, we will use the Zonotope domain to represent all abstractions, and use verifiers $V_S = V_T$ that propagate these Zonotopes using the state-of-the-art DeepZ transformers [34]. To permit efficient inclusion checks, we apply $\alpha_{\text{Box}}$ on the resulting Zonotopes to obtain the Box templates $\mathcal{T}$, which we treat as a special case of Zonotopes.

Template Generation
We now discuss instantiations for gen_templates in Algorithm 1. Recall from §3.1 the idea of proof subsumption, i.e., that abstractions for some specifications contain abstractions for other specifications. Building on this, we relax Problem 2 in order to create $m$ templates $T_j$ from intermediate abstractions $N_{1:k}(\hat{I}_j(x))$ for some $\hat{I}_1, \dots, \hat{I}_m$. Note that the $\hat{I}_j$ are not necessarily directly related to the specifications $I_1, \dots, I_r$ that we want to verify. For a chosen layer $k$, input $x$, number of templates $m$, and verifiers $V_S$ and $V_T$, we optimize
$$\begin{aligned} \arg\max_{\hat{I}_1, \dots, \hat{I}_m} \;& \sum_{i=1}^{r} \Big[ \bigvee_{j=1}^{m} V_S(I_i(x), N_{1:k}) \subseteq T_j \Big] \\ \text{where} \;& T_j = \alpha_{\text{Box}}(V_T(\hat{I}_j(x), N_{1:k})) \\ \text{s.t.} \;& V_T(T_j, N_{k+1:L}) \models \psi \quad \text{for } j \in \{1, \dots, m\}. \end{aligned} \tag{4}$$
As originally in Problem 2 (Eq. (1)), we aim to find a set of templates such that the intermediate shapes at layer $k$ for most of the $r$ specifications are covered by at least one template $T$. In contrast to Eq. (1), we tie $T_j$ to the specifications $\hat{I}_j$. This alone does not make the problem easier to tackle. However, next, we will discuss how to generate application-specific parametric $\hat{I}_j$ and solve Eq. (4) by optimizing over their parameters, allowing us to solve template generation much more efficiently than in Eq. (1).

Robustness to Adversarial Patches
We now instantiate the above scheme in order to verify the robustness of image classifiers against adversarial patches [11]. Consider an attacker that is allowed to arbitrarily change any $p \times p$ patch of the image, as showcased earlier in Fig. 2. For such a patch over pixel positions $[i, i+p-1] \times [j, j+p-1]$, the corresponding perturbation is
$$I^{i,j}_{p \times p}(x) := \{ z \in [0, 1]^{h \times w} \mid \pi^C_{i,j}(z) = \pi^C_{i,j}(x) \},$$
where $h$ and $w$ denote the height and width of the input $x$. Here $\pi_{i,j}$ denotes the parts of the image affected by the patch, and $\pi^C_{i,j}$ its complement, i.e., the unaffected part of the image. To prove robustness for an arbitrarily placed $p \times p$ patch, however, one must consider the perturbation set $I_{p \times p}(x) := \bigcup_{i,j} I^{i,j}_{p \times p}(x)$.
To prove robustness for $I_{p \times p}$, existing approaches [11] separately verify $I^{i,j}_{p \times p}(x)$ for all $i \in \{1, \dots, h - p + 1\}$, $j \in \{1, \dots, w - p + 1\}$. For example, with $p = 2$ and a $28 \times 28$ MNIST image, this approach requires 729 individual proofs. Because the different proofs for $I_{p \times p}$ share similarities, this is an ideal candidate for proof sharing. We utilize Algorithm 1 and check $\bigwedge_i v_i$ at the end to speed up this process. For template generation, we solve Eq. (4) for $m$ templates with an input perturbation $\hat{I}_i$ per template.
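The number of individual proofs and the shape of a single patch region can be checked with a short sketch. It is illustrative only: positions are 0-indexed here (the text counts from 1), pixel values are assumed to lie in [0, 1], and the helper names and the uniform gray image are hypothetical.

```python
from typing import List, Tuple

Image = List[List[float]]


def patch_positions(h: int, w: int, p: int) -> List[Tuple[int, int]]:
    """All top-left corners (i, j) of a p x p patch, 0-indexed here."""
    return [(i, j) for i in range(h - p + 1) for j in range(w - p + 1)]


def patch_region(x: Image, i: int, j: int, p: int) -> Tuple[Image, Image]:
    """Per-pixel bounds: patch pixels range over [0, 1], the rest equal x."""
    lower = [row[:] for row in x]
    upper = [row[:] for row in x]
    for a in range(i, i + p):
        for b in range(j, j + p):
            lower[a][b], upper[a][b] = 0.0, 1.0
    return lower, upper


positions = patch_positions(28, 28, 2)
x = [[0.5] * 28 for _ in range(28)]   # hypothetical gray image
lower, upper = patch_region(x, *positions[0], 2)
free_pixels = sum(l != u for lo_row, up_row in zip(lower, upper)
                  for l, u in zip(lo_row, up_row))
```

For a $28 \times 28$ image and $p = 2$ there are $(28 - 2 + 1)^2 = 729$ positions, and each region leaves exactly $p^2 = 4$ pixels free.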
We empirically found (recall Table 1) that setting $\hat{I}_i$ to an $\ell_\infty$ region $I_{\epsilon_i}$ works particularly well to capture a majority of patch perturbations $I^{i,j}_{p \times p}$ at intermediate layers. Specifically, we found that setting $\epsilon_i$ to the maximally verifiable value for this input works particularly well.
To further increase the number of specifications contained in a set of templates $\mathcal{T}$, we use $m$ template perturbations of the form
$$\hat{I}_i(x) := \{ z \mid \|\mu_i(z) - \mu_i(x)\|_\infty \le \epsilon_i \,\wedge\, \mu^C_i(z) = \mu^C_i(x) \},$$
where $\mu_i$ denotes a subset of pixels of the input image and $\mu^C_i$ its complement, and we maximize $\epsilon_i$ in a best-effort manner. In particular, we consider $\mu_1, \dots, \mu_m$ such that they partition the set of pixels in the image (e.g., in Fig. 5).
As noted earlier, this generation procedure needs to be fast, yet obtain $\mathcal{T}$ to which many abstractions match, in order to obtain speed-ups. Thus, we consider small $m$ and fixed patterns $\mu_1, \dots, \mu_m$. For each $\hat{I}_i$, we aim to find the largest $\epsilon_i$ that can still be verified, in order to maximize the number of matches. Note that for $m = 1$, this is equivalent to the $\ell_\infty$ input perturbation $I_\epsilon$ with the maximally verifiable $\epsilon$ for the given image.
Concretely, we can perform a binary search over $\epsilon_i$ in order to find a large $\epsilon_i$ that still satisfies $N_{k+1:L}(\alpha_{\text{Box}}(N_{1:k}(\hat{I}_i))) \models \psi$. Verification with our chosen DeepZ Zonotopes is not monotonic in $\epsilon_i$ due to the non-monotonic transformers used for non-linearities (e.g., ReLU). This renders the application of binary search a best-effort approximation. As we do not require a formal maximum but rather aim to solve a surrogate for Problem 2, this still works well in practice. Further note that applying $\alpha_{\text{Box}}$ to templates introduces imprecision, i.e., $V_T$ might not be able to prove properties over templates that it could prove without the application of $\alpha_{\text{Box}}$. However, Theorem 2 (which only requires properties of $V_S$) still applies.
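The best-effort binary search can be sketched generically. The function below is a hedged illustration, not the search used in the implementation: `is_verifiable` is a hypothetical callback standing in for the check $N_{k+1:L}(\alpha_{\text{Box}}(N_{1:k}(\hat{I}_i))) \models \psi$, and the step count and upper bound are arbitrary choices.

```python
from typing import Callable, Optional


def largest_verifiable(is_verifiable: Callable[[float], bool],
                       upper: float = 1.0, steps: int = 20) -> Optional[float]:
    """Best-effort bisection for a large value in [0, upper] that verifies.

    Only values for which `is_verifiable` returned True are ever returned, so a
    non-monotonic verifier can cost optimality but never soundness.
    """
    lo, hi, best = 0.0, upper, None
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        if is_verifiable(mid):
            best, lo = mid, mid   # verified: try a larger region
        else:
            hi = mid              # failed: shrink the region
    return best


# Hypothetical monotone stand-in for the verifier: everything up to 0.3 verifies.
eps = largest_verifiable(lambda e: e <= 0.3)
never = largest_verifiable(lambda e: False)
```

Returning `None` when no candidate verifies corresponds to the case where template generation fails and verification falls back to the conventional procedure.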

Algorithm 2: Online Template Generation for Patches
Input: $x$, $N$, $\mu_1, \dots, \mu_m$, $K$, $\psi$, $V_T$
Result: Templates at multiple layers

We can extend this approach to obtain templates at multiple layers without a large increase in computational cost. With templates at multiple layers, we first try to match the propagated shape against the earliest template layer and, upon failure, propagate it further to the next, where we again attempt to match the template. In Algorithm 1, this means repeating the block from Line 4 to Line 10 for each template layer before going on to the check on Line 11.
The full template generation procedure is given in Algorithm 2. First, we perform a binary search over $\epsilon_i$ (Line 6) to find the largest $\epsilon_i$ for which the specification is verifiable. Then, for each layer $k$ in the set of layers $K$ at which we are creating templates, we create a box $T_k$ from the Zonotope. As this $T_k$ may not be verifiable, due to the imprecision added in $\alpha_{\text{Box}}$, we then perform another binary search for the largest scaling factor $\beta_k$ (Line 10), which is applied to the matrix $A$ in Eq. (3). We denote this operation as $\beta_k T_k$. We show an example for a single layer $k$ in Fig. 6. The blue area outlines the Zonotope found via Line 6, which is verifiable as it is fully on one side of the decision boundary (red, dashed). After applying $\alpha_{\text{Box}}$, however, the resulting box (orange) is not verifiable, as it crosses the decision boundary. By scaling it with $\beta_k$, the shape is verifiable again (green) and used as a template.
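The scaling step can be illustrated with a toy in which "verifiable" means that a box lies strictly on one side of a linear decision boundary $w \cdot z + c > 0$. This stand-in, the helper names, and the numbers are hypothetical; in the procedure described above the check would be performed by $V_T$ on the remaining layers.

```python
from typing import List, Optional, Sequence, Tuple


def scale_box(center: Sequence[float], radius: Sequence[float],
              beta: float) -> Tuple[List[float], List[float]]:
    """beta * T: shrink the box around its center (beta in [0, 1])."""
    return list(center), [beta * r for r in radius]


def box_on_positive_side(center, radius, w, c) -> bool:
    """Toy stand-in for verification: is w.z + c > 0 for every z in the box?"""
    worst = (sum(wi * ai for wi, ai in zip(w, center)) + c
             - sum(abs(wi) * ri for wi, ri in zip(w, radius)))
    return worst > 0.0


def largest_beta(center, radius, w, c, steps: int = 20) -> Optional[float]:
    """Bisection for a large scaling factor that keeps the box verifiable."""
    if box_on_positive_side(center, radius, w, c):
        return 1.0                      # the unscaled box already verifies
    lo, hi, best = 0.0, 1.0, None
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        if box_on_positive_side(*scale_box(center, radius, mid), w, c):
            best, lo = mid, mid
        else:
            hi = mid
    return best


# Hypothetical box that crosses the boundary z_0 - z_1 = 0 until scaled by < 0.5.
beta = largest_beta(center=[2.0, 1.0], radius=[1.0, 1.0], w=[1.0, -1.0], c=0.0)
```

In this toy the margin of the center is 1 and the box contributes up to $2\beta$, so any $\beta < 0.5$ verifies and the search converges to just below that value.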

Geometric Robustness
Geometric robustness verification [30,35,4,14] aims to verify the robustness of neural networks against geometric transformations such as image rotations or translations. These transformations typically include an interpolation operation. For example, consider the rotation $R_\gamma$ of an image by $\gamma \in \Gamma$ degrees for an interval $\Gamma$ (e.g., $\gamma \in [-5, 5]$), for which we consider the specification $I_\Gamma(x) := \{ R_\gamma(x) \mid \gamma \in \Gamma \}$. We note that, unlike $\ell_\infty$ and patch verification, the input regions for geometric transformations are non-linear and have no closed-form solutions. Thus, an overapproximation of the input region must be obtained [4]. For large $\Gamma$, the approximate input region $I_\Gamma(x)$ can be too coarse, resulting in imprecise verification. Hence, in order to assert $\psi$ on $I_\Gamma$, existing state-of-the-art approaches [4] split $\Gamma$ into $r$ smaller ranges $\Gamma_1, \dots, \Gamma_r$ and then verify the resulting $r$ specifications $(I_{\Gamma_i}, \psi)$ for $i \in 1, \dots, r$. These smaller perturbations share similarities, facilitating proof sharing. We instantiate our approach similarly to §4.3. A key difference to §4.3 is that while $x \in I^{i,j}_{p \times p}(x)$ for all $i, j$ in patches, here in general $x \notin I_{\Gamma_i}(x)$ for most $i$. Therefore, the individual perturbations $I_i(x)$ do not overlap. To account for this, we consider $m$ templates and split $\Gamma$ into $m$ equally sized chunks (unrelated to the $r$ splits), obtaining the angles $\gamma_1, \dots, \gamma_m$ at the center of each chunk. For $m$ templates, we then consider the perturbations $\hat{I}_i := I_{\epsilon_i}(R_{\gamma_i}(x))$, denoting the $\ell_\infty$ perturbation of size $\epsilon_i$ around the $\gamma_i$ degree rotated $x$. To find the template, we employ a procedure analogous to Algorithm 2.
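The two independent splits of $\Gamma$ (into $r$ ranges to verify and into $m$ chunks for templates) can be sketched as below. The values $\Gamma = [-5, 5]$, $r = 10$, and $m = 2$ as well as the helper names are hypothetical; the rotation $R_\gamma$ and the overapproximation of each $I_{\Gamma_i}$ are outside the scope of this sketch.

```python
from typing import List, Tuple


def split_range(lo: float, hi: float, n: int) -> List[Tuple[float, float]]:
    """Split [lo, hi] into n equally sized sub-ranges."""
    width = (hi - lo) / n
    return [(lo + i * width, lo + (i + 1) * width) for i in range(n)]


def chunk_centers(lo: float, hi: float, m: int) -> List[float]:
    """Angles gamma_1, ..., gamma_m at the center of each of the m chunks."""
    return [(a + b) / 2.0 for a, b in split_range(lo, hi, m)]


# Hypothetical setting: Gamma = [-5, 5] degrees, r = 10 proofs, m = 2 templates.
specs = split_range(-5.0, 5.0, 10)      # Gamma_1, ..., Gamma_r
centers = chunk_centers(-5.0, 5.0, 2)   # gamma_1, gamma_2
```

Each center $\gamma_i$ would then serve as the anchor of an $\ell_\infty$ template region around the rotated image, with $\epsilon_i$ chosen by the search described for patches.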

Requirements for Proof Sharing
Now, we discuss the requirements on the neural network N such that proof sharing via templates works well. For simplicity, we discuss simple per-dimension box bounds propagation for V S and V T . However, similar arguments can be made for more complex relational abstractions such as Zonotopes or Polyhedra.
In order for an abstraction S to match a template T , we need to show interval inclusion in each dimension. For a particular dimension i, this can occur in two ways: (i) when both S and T are just a point in that dimension and these points coincide, i.e., a S i = a T i , or (ii) when a S i ± d S i ⊆ a T i ± d T i . While, particularly in ReLU networks, the first case can occur after a ReLU layer sets values to zero, we focus our analysis here on the second case as it is more common. In this case, the width d T i of T in that dimension must be sufficient to cover S. Ignoring case (i) and letting supp(T ) denote the dimensions in which d T i > 0, we can pose supp(S) ⊆ supp(T ) as a necessary condition for inclusion. While it is in general hard to argue about the magnitudes of these values, this approach still provides an intuition. When starting from input specifications with supp(I) ⊈ supp( Î), supp(S) ⊆ supp(T ) can only occur if, during propagation through the neural network N 1:k , the mass in supp( Î) can "spread out" sufficiently to cover supp(S). In the fully connected neural networks that we discuss here, the matrices of linear layers provide this possibility. However, in networks that only read part of the input at a time, such as recurrent neural networks, or in convolutional neural networks, in which only locally neighboring inputs feed into the respective output in the next layer, these connections do not necessarily exist. This makes proof sharing hard until later layers of the neural network that regionally or globally pool information. As this increases the depth of the layer k at which proof transfer can be applied, it also decreases the potential speed-up of proof transfer. This could be alleviated by different ways of creating templates, which we plan to investigate in the future.
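The per-dimension inclusion test and the support condition can be made concrete with a small sketch. Boxes are represented here by a center vector a and a non-negative radius vector d, which is an assumed encoding chosen for illustration; the function names are hypothetical.

```python
def box_contained(a_s, d_s, a_t, d_t):
    """Check a_s[i] ± d_s[i] ⊆ a_t[i] ± d_t[i] in every dimension i.

    Case (i) is covered as well: if both radii are zero, the test
    reduces to requiring that the two points coincide.
    """
    return all(abs(cs - ct) + rs <= rt
               for cs, rs, ct, rt in zip(a_s, d_s, a_t, d_t))


def supp(d):
    """Dimensions in which the box has non-zero width."""
    return {i for i, r in enumerate(d) if r > 0}


# Template T and two candidate abstractions (toy values).
a_t, d_t = [0.0, 1.0], [1.0, 0.0]
s_ok = ([0.0, 1.0], [0.5, 0.0])    # narrower in dim 0, same point in dim 1
s_bad = ([0.0, 1.0], [0.5, 0.25])  # has width where T is only a point

matches_ok = box_contained(*s_ok, a_t, d_t)
matches_bad = box_contained(*s_bad, a_t, d_t)
```

In the second case the support condition already fails, since dimension 1 belongs to supp(S) but not to supp(T ), so no choice of radii in the other dimensions could make the match succeed.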

Experimental Evaluation
We now experimentally evaluate the effectiveness of our algorithms from §4.

Experimental Setup
We consider the verification of robustness to adversarial patch attacks and geometric transformations in §5.2 and §5.3, respectively. We define specifications on the first 100 test set images of each of the MNIST [24] and CIFAR-10 [22] ("CIFAR") datasets, as the overall runtime becomes high with repetitions and parameter variations. We use DeepZ [34] as the baseline verifier as well as for V S and V T . Throughout this section, we evaluate proof sharing for two networks on two common datasets: we use a seven-layer neural network with 200 neurons per layer ("7x200") and a nine-layer network with 500 neurons per layer ("9x500") for both the MNIST [24] and CIFAR [22] datasets, both utilizing ReLU activations. These architectures are similar to the fully-connected ones used in the ERAN and Mnistfc VNN-Comp categories [3].
For MNIST, we train for 100 epochs, enumerating all patch locations for each sample, and for CIFAR we train for 600 epochs with 10 random patch locations, as outlined in [11], with interval training [26,18]. On MNIST, the 7x200 and the 9x500 networks achieve a natural accuracy of 98.3% and 95.3%, respectively. For CIFAR, these values are 48.8% and 48.1%, respectively. Our implementation utilizes PyTorch [27] and is evaluated on Ubuntu 18.04 with an Intel Core i9-9900K CPU and 64 GB RAM. For all timing results, we provide the mean over three runs.

Robustness against adversarial patches
For MNIST, containing 28 × 28 images, as outlined in §4.3, 729 individual perturbations must be verified in order to show that an input is robust against 2 × 2 patch perturbations. The overall property can be verified for a given image only if all of them are verified. Similarly, for CIFAR, containing 32 × 32 color images, there are 961 individual perturbations (the patch is applied over all color channels). We now investigate the two main parameters of Algorithm 2: the masks µ 1 , . . ., µ m and the layers k ∈ K. We first study the impact of the layer k used for creating the template. To this end, we consider the 7x200 networks and use m = 1 (covering the whole image; equivalent to Îϵ ). Table 3 shows the corresponding template matching rates and the overall percentage of individual patches that can be verified ("patches verif."). (The overall percentage of images for which I 2×2 is true is reported as "verf." in Table 6.) Table 4 shows the corresponding verification times (including the template generation). We observe that many template matches can already be made at the second or third layer. As creating templates simultaneously at the second and third layer works well for both datasets, we utilize templates at these layers in further experiments.
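The perturbation counts quoted above follow from counting the positions at which a p × p patch fits into the image. A minimal check, with a hypothetical helper name:

```python
def num_patch_positions(height, width, p):
    """Number of positions at which a p x p patch fits into the image."""
    return (height - p + 1) * (width - p + 1)


mnist_2x2 = num_patch_positions(28, 28, 2)   # 27 * 27
cifar_2x2 = num_patch_positions(32, 32, 2)   # 31 * 31
mnist_3x3 = num_patch_positions(28, 28, 3)   # 26 * 26
```

The count does not depend on the number of color channels, since the patch is applied over all channels at once.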
Next, we investigate the impact of the pixel masks µ 1 , . . ., µ m . To this end, we consider three different settings, as showcased in Fig. 5 earlier: (i) the full image (ℓ ∞ -mask as before; m = 1), (ii) "center + border" (m = 2), where we consider the 6 × 6 center pixels as one group and all others as another, and (iii) the 2 × 2 grid (m = 4), where we split the image into equally sized quarters.
As we can see in Table 5, for higher m more patches can be matched to the templates, indicating that our optimization procedure is a good approximation to Problem 2, which only considers the number of templates matched. Yet, for m > 1, the increase in matching rate p does not offset the additional time spent on template generation and matching. Thus, m = 1 results in a better trade-off. This result highlights the trade-offs discussed throughout §3 and §4.
Based on this investigation, we now evaluate all networks and datasets in Table 6, using m = 1 and template generation at layers 2 and 3. In all cases, we obtain a speed-up between 1.2 and 2× over the baseline verifier. When going from 2 × 2 to 3 × 3 patches, speed-ups remain around 1.6 and 1.3 for the two datasets, respectively.
Comparison with theoretically achievable speed-up. Finally, we want to determine the maximal possible speed-up with proof sharing and see how much of this potential is realized by our method. To this end, we investigate the same setting and network as in Table 3. We let t BL and t P S denote the runtime of the base verifier without and with proof sharing, respectively. Similar to the discussion in §4, we can break down t P S into t T (template generation time), t S (time to propagate one input to layer k), t ⊆ (time to perform template matching) and t ψ (time to verify S if there is no match). Table 7 shows different ratios of these quantities. For all of them, we assume a perfect matching rate at layer k and calculate the achievable speed-up for patch verification on MNIST. Comparing the optimal and realized results, we see that at layers 3 and 4 our template generation algorithm, despite only approximately solving Problem 2, achieves near-optimal speed-up. By removing the time for template matching and template generation, we can see that, at deeper layers, speeding up t ⊆ and t T yields only diminishing returns.
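To build intuition for how these quantities interact, the following sketch evaluates a simplified cost model. The model is an assumption made for illustration and is not the decomposition from §4: it supposes that the baseline spends t S + t ψ on each of the n perturbations of an image, while proof sharing pays t T once per image and t S + t ⊆ per perturbation, plus t ψ only for the fraction 1 − ρ of perturbations that are not matched. The function and parameter names are hypothetical.

```python
def speedup(t_T, t_S, t_match, t_psi, rho, n):
    """Speed-up t_BL / t_PS under an assumed, simplified cost model.

    t_T:     template generation time (once per image)
    t_S:     time to propagate one perturbation up to layer k
    t_match: time for one template containment check
    t_psi:   time to finish verification from layer k onwards
    rho:     fraction of perturbations matched to a template
    n:       number of perturbations per image
    """
    t_bl = n * (t_S + t_psi)
    t_ps = t_T + n * (t_S + t_match + (1.0 - rho) * t_psi)
    return t_bl / t_ps


# Perfect matching with free template generation and matching (toy timings).
ideal = speedup(t_T=0.0, t_S=1.0, t_match=0.0, t_psi=3.0, rho=1.0, n=729)
# No matches at all: proof sharing degenerates to the baseline.
none = speedup(t_T=0.0, t_S=1.0, t_match=0.0, t_psi=3.0, rho=0.0, n=729)
```

Under this model the ideal speed-up is bounded by (t S + t ψ ) / t S , which shrinks as the template layer k moves deeper into the network and t S grows relative to t ψ .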

Robustness against geometric perturbations
For the verification of geometric perturbations, we take 100 images from the MNIST dataset and the 7x200 neural network from §5.2. In Table 8, we consider an input region with ±2° rotation, ±10% contrast and ±1% brightness change, inspired by [4]. To verify this region, similar to existing approaches [4], we split the rotation into r regions, each yielding a Box specification over the input. Here we use m = 1, i.e., a single template, with the largest verifiable ϵ found via binary search. We observe that as we increase r, both the verification rate and the speed-ups increase. Proof sharing enables significant speed-ups between 1.6 and 2.9×. Finally, we investigate the impact of the number of templates m. To this end, we consider a setting with a large parameter space: an input region generated by ±40° rotation with r = 200. In Table 9, we evaluate this for m templates obtained from the ℓ ∞ input perturbation around m equally spaced rotations, where we apply binary search to find ϵ i tailored to each template. Again, we observe that m > 1 allows more template matches. However, in this setting the relative increase is much larger than for patches, thus making m = 3 faster than m = 1.

Discussion
We have shown that proof sharing can achieve speed-ups over conventional execution. While the speed-up analysis (see §4 and Table 7) puts a ceiling on what is achievable in particular settings, we are optimistic that proof sharing can be an important tool for neural network robustness analysis. In particular, as the size of certifiable neural networks continues to grow, the potential for gains via proof sharing grows as well. Further, the idea of proof effort reuse can enable efficient verification of larger disjunctive specifications such as the patch or geometric examples considered here. Besides the immediately useful speed-ups, the concept of proof sharing is interesting in its own right and can provide insights into the learning mechanisms of neural networks.

Related Work
Here, we briefly discuss conceptually related work.
Incremental Model Checking. The field of model checking aims to show whether a formalized model, e.g., of software or hardware, adheres to a specification. As neural network verification can also be cast as model checking, we review incremental model checking techniques which utilize a similar idea to proof sharing: reuse partial previous computations when checking new models or specifications. Proof sharing has been applied for discovering and reusing lemmas when proving theorems for satisfiability [7], Linear Temporal Logic [8], and modal µ-calculus [36]. Similarly, caching solvers [38] for Satisfiability Modulo Theories cache obtained results or even the full models used to obtain the solution, with assignments for all variables, allowing for faster verification of subsequent queries. Program analysis tasks that deal with repeated similar inputs (e.g., individual commits in a software project) can leverage partial results [46], constraints [41], or precision information [5,6] from previous runs.
Proof Sharing Between Networks. In neural network verification, some approaches abstract the network to achieve speed-ups in verification. These simplifications are constructed in a way that the proof can be adapted for the original neural network [1,48]. Similarly, another family of approaches analyzes the difference between two closely related neural networks by utilizing their structural similarity [28,29]. Such approaches can be used to reuse analysis results between neural network modifications, e.g., fine-tuning [10,42].
In contrast to these works, we do not modify the neural network, but rather achieve speed-ups by only considering the relaxations obtained in the proofs. The approach of [42] additionally considers small changes to the input; however, these are much smaller than the difference in specification we consider here.

Conclusion
We introduced the novel concept of proof sharing in the context of neural network verification. We showed how to instantiate this concept, achieving speed-ups of up to 2 to 3× for patch verification and geometric verification. We believe that the ideas introduced in this work can serve as a solid foundation for exploring methods that effectively share proofs in neural network verification.

A Offline Proof Sharing
In contrast to the online setting in the main paper, here we discuss an offline setting where we generate templates offline on a training dataset T train and use them to speed up the verification process on unobserved data from a test set T test . Therefore, we consider a modified version of Problem 2 that, instead of Eq. (1), optimizes the arg max objective of Eq. (5), now with a fixed I but with x ∼ D drawn from the data distribution D. As is standard in machine learning, we assume that the training and test sets, T train and T test , are sampled from D. An important challenge in the offline setting is that the set of templates T must be general enough to generalize to unseen inputs from the test set. To this end, in Appendix A.1 we describe an offline-specific template generation algorithm for solving the optimization in Eq. (5). We note that once we obtain the set of templates T , we use the same procedure as in Algorithm 1 for utilizing the templates to speed up verification.

A.1 Template Generation on Training Data
We first outline the template generation process in general and then show an instantiation for ℓ ∞ -robustness verification in Appendix A.2.

Template Generation
In order to optimize Eq. (5) under our constraints, we attempt to find the set of templates T with |T | ≤ m that contains the most abstractions at layer k obtained from T train . While this is generally computationally hard, we approximate it with a clustering-based approach. To this end, we first compute the abstractions N 1:k (I(x)) for all x ∈ T train and then cluster and merge them. Subsequently, we further merge the obtained templates until we obtain a set T of m templates. We formalize the template generation procedure in Algorithm 3 and showcase it in Fig. 7.
Similarly to the online template generation discussed in §4.2, we rely on two different verifiers: the original verifier V S used to verify inputs, and the verifier V T used to verify our templates. We can choose V T to be more precise but slower than V S , as the run-time of V T does not impact the run-time of the inference procedure. We first (Line 1, Fig. 7a) compute the set V of abstractions computed by V S at layer k that can be verified. In theory, any other verifier could be used for this. Next, we cluster the abstractions in V into n groups {G i | i = 1, . . ., n} of similar abstractions using the function cluster shapes (which we will instantiate in Appendix A.2), showcased by different colors in Fig. 7b. For each group G i , we compute its convex hull T i via the join operator ⊔ D in domain D, which includes all its inputs. If we are able to show the post-condition ψ for T i , then we add the tuple (T i , G i ), i.e., the template along with all abstractions it covers, to the set H, shown in Fig. 7c (Line 7).
As depicted in Fig. 7d, we then attempt to merge pairs of these templates. To this end, we traverse the template pairs according to a priority queue Q ordered by a chosen distance d between them (Line 12). Here, we use the Euclidean distance between the centers of T i and T j for d(T i , T j ). In order to merge (T i , G i ) and (T j , G j ), we first compute the set of shapes G ′ = G i ∪ G j contained in either of them (Line 13) and then again compute the join T ′ = ⊔ D (G ′ ) (Line 14). We compute the join this way, as this is likely to result in a tighter overall shape than joining T i and T j . If T ′ can be verified, we replace (T i , G i ) and (T j , G j ) with the single pair (T ′ , G ′ ) in H and update Q accordingly (Lines 15 to 22). This procedure is repeated until no templates can be merged. Finally, T is obtained as the set of the m templates with the most associated abstractions (i.e., the largest |G|) in H.
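The grouping and merging steps described above can be sketched for the Box domain as follows. This is a simplified illustration under several assumptions, not a reproduction of Algorithm 3: the clustering is taken as given (`groups`), the callback `verifies` stands in for V T , boxes are (lower, upper) tuples, and the priority queue Q is replaced by re-sorting all pairs after every successful merge. All function names are hypothetical.

```python
import itertools
import math


def join(boxes):
    """Box join: the smallest box containing all given (lower, upper) boxes."""
    lowers, uppers = zip(*boxes)
    return (tuple(min(col) for col in zip(*lowers)),
            tuple(max(col) for col in zip(*uppers)))


def center(box):
    return [(lo + up) / 2 for lo, up in zip(*box)]


def merge_templates(groups, verifies, m):
    """Join each group, keep verifiable hulls, then greedily merge close pairs."""
    H = [(join(g), list(g)) for g in groups]
    H = [(t, g) for t, g in H if verifies(t)]
    merged = True
    while merged:
        merged = False
        # Stand-in for the priority queue: closest template centers first.
        pairs = sorted(
            itertools.combinations(range(len(H)), 2),
            key=lambda ij: math.dist(center(H[ij[0]][0]), center(H[ij[1]][0])))
        for i, j in pairs:
            g_new = H[i][1] + H[j][1]
            t_new = join(g_new)  # join the shapes, not the two templates
            if verifies(t_new):
                H = [h for k, h in enumerate(H) if k not in (i, j)]
                H.append((t_new, g_new))
                merged = True
                break
    # Keep the m templates that cover the most abstractions.
    H.sort(key=lambda h: len(h[1]), reverse=True)
    return [t for t, _ in H[:m]]


# Toy 1-D example: a template is "verifiable" if it is at most 3 wide.
b1, b2, b3 = ((0,), (1,)), ((1,), (2,)), ((10,), (11,))
templates = merge_templates([[b1], [b2], [b3]], lambda t: t[1][0] - t[0][0] <= 3, 1)
```

In the toy example the two neighboring boxes are merged into one template, while the distant box cannot be merged without exceeding the verifiable width.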

A.2 Dataset templates for ℓ ∞ robustness
We now instantiate the template generation algorithm in Appendix A.1 for speeding up ℓ ∞ -robustness verification I ϵ . As in §4, we rely on a verifier V S based on Zonotopes and represent the templates as Boxes (with possible additional half-space constraints as outlined below). For the verification of the templates (Lines 6 and 15), we perform exact verification via Mixed-Integer Linear Programming (MILP) [39] as the verifier V T . The box-encoded templates can be directly verified by the exact verifier. We note that since exact verification is strictly more precise than Zonotope propagation, the use of templates can potentially allow for higher certification rates than directly employing Zonotope propagation. While we did not observe this experimentally, it presents an interesting target for further investigation.
We instantiate the join ⊔ D with the join ⊔ B in the Box domain. For a set of Zonotopes G = {V 1 , . . ., V n }, we compute the bounding box α Box (V i ) for all Zonotopes and then compute the joined bounding Box (which again can be represented as a Zonotope).
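A minimal sketch of this Box join is given below. It assumes that a Zonotope is stored as a center vector a and a generator matrix A with one row per dimension, so that its points are a + Aη with η ∈ [−1, 1]^p; this encoding and the function names are assumptions made for illustration.

```python
def alpha_box(a, A):
    """Bounding box of the zonotope {a + A eta : eta in [-1, 1]^p}."""
    radius = [sum(abs(v) for v in row) for row in A]
    lower = [c - r for c, r in zip(a, radius)]
    upper = [c + r for c, r in zip(a, radius)]
    return lower, upper


def box_join(boxes):
    """Smallest box containing all given (lower, upper) boxes."""
    lowers, uppers = zip(*boxes)
    return ([min(col) for col in zip(*lowers)],
            [max(col) for col in zip(*uppers)])


def box_as_zonotope(lower, upper):
    """Encode a box as a zonotope with one axis-aligned generator per dimension."""
    a = [(lo + up) / 2 for lo, up in zip(lower, upper)]
    A = [[(up - lo) / 2 if i == j else 0.0 for j in range(len(a))]
         for i, (lo, up) in enumerate(zip(lower, upper))]
    return a, A


z1 = ([0, 0], [[1, 1], [0, 1]])   # bounding box [-2, 2] x [-1, 1]
z2 = ([3, 0], [[1, 0], [0, 2]])   # bounding box [2, 4] x [-2, 2]
joined = box_join([alpha_box(*z1), alpha_box(*z2)])
template = box_as_zonotope(*joined)
```

Because the join operates on bounding boxes only, any relational information between dimensions carried by the individual Zonotopes is dropped in the resulting template.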

Exact verification
We now briefly outline the properties of exact verification via MILP, as we require these in the following discussion. The framework from [39] proves classification to the correct label l by maximizing the error term $e = \max_{i \neq l} n_i - n_l$ and asserting that e < 0, where n denotes the output of the neural network (i.e., its logits) over the considered input region. If no counterexample to that assertion can be found, it certifies the specification; otherwise, it returns a set of counterexamples {z V,i } (concrete points in the input region) for which this error is maximal and which we utilize later. In both cases, we can access the value e of the error function.
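For a single concrete output vector, the error term can be evaluated directly, as in the sketch below. The exact verifier maximizes this quantity over the whole input region via MILP, which the sketch does not attempt; the function name is hypothetical.

```python
def error_term(logits, label):
    """e = max_{i != label} logits[i] - logits[label] for one concrete output."""
    return max(n for i, n in enumerate(logits) if i != label) - logits[label]


logits = [2.0, 5.0, 1.0]
e_correct = error_term(logits, label=1)  # class 1 wins, so e is negative
e_wrong = error_term(logits, label=0)    # class 1 beats class 0, so e is positive
```

A negative value at one point says only that this point is classified correctly; certification requires the maximum over the region to be negative.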
Shape clustering. Next, we describe how we instantiate the clustering method cluster shapes in this setting. We base cluster shapes on k-means clustering, for which we provide a similarity matrix computed as follows. For each pair of Zonotopes V i , V j in V, we compute the Box B i,j = ⊔ B ({V i , V j }), where ⊔ B denotes the Box join operator. We then set the distance between V i and V j to exp(e), where e ∈ R is the error obtained from exact verification when attempting to verify ψ for B i,j . To obtain a similarity matrix from these distances, we apply a constant shift embedding [31]. As invoking exact verification on each box B i,j is expensive, we only consider the t closest neighbors (in ℓ 2 distance between the Zonotope centers) and set all others to a maximal distance.
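The construction of the distance matrix with the nearest-neighbor restriction can be sketched as follows. The callback `pair_error` is a hypothetical stand-in for running the exact verifier on B i,j and returning e; the value of `max_dist`, the zero diagonal, and the symmetric treatment of neighbor pairs are assumptions. The subsequent constant shift embedding and the k-means step are not shown.

```python
import math


def distance_matrix(centers, pair_error, t, max_dist):
    """Pairwise distances exp(e), evaluated only for the t nearest neighbors."""
    n = len(centers)
    D = [[0.0 if i == j else max_dist for j in range(n)] for i in range(n)]
    done = set()
    for i in range(n):
        nearest = sorted((j for j in range(n) if j != i),
                         key=lambda j: math.dist(centers[i], centers[j]))[:t]
        for j in nearest:
            pair = (min(i, j), max(i, j))
            if pair not in done:  # call the expensive oracle once per pair
                done.add(pair)
                D[i][j] = D[j][i] = math.exp(pair_error(*pair))
    return D


# Toy example: three zonotope centers and a constant error of -1 for every pair.
D = distance_matrix([(0.0,), (1.0,), (10.0,)], lambda i, j: -1.0, t=1, max_dist=100.0)
```

Pairs whose joined box verifies with a large margin have a strongly negative e and therefore a distance close to zero, so they tend to end up in the same cluster.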
Half-space constraints. To allow for templates T covering more volume, i.e., those that allow us to optimize Eq. (5) further by containing more abstractions, we extend the template representation from Boxes to Boxes with additional half-space constraints, formally called Stars [40,2]. As the Star domain is more precise than the Box domain (by allowing us to cut away some of the box volume), using Stars enables us to generate templates with higher volume that are still verifiable by V T . Further, the Star domain allows efficient containment checks S ⊆ T , similarly to the Box domain. Formally, a Star B * over a Box B is denoted as:
$$B^*(C, c) := \{ z \in B \mid C z \le c \} \qquad (6)$$
Here, each half-space constraint is described by a hyperplane parameterized by C i,• and c i .
The containment check S ⊆ B * (C, c) between an abstraction S and the Star B * (C, c) consists of: (i) a containment check for the underlying box, S ⊆ B, and (ii) checking for each constraint C i • z ≤ c i whether maximizing the linear expression C i • z with respect to S yields an objective ≤ c i . For a Zonotope S as given in Eq. (3), we showed in §4.1 how to perform step (i) efficiently, and step (ii) can be performed efficiently by checking the condition $Ca + \sum_{j=1}^{p} |CA|_j \le c$, where $|CA|_j$ denotes the j-th column of the element-wise absolute value of CA. A Star encoded as in Eq. (6) can be directly verified using exact verification (MILP) by adding the half-space constraints as further LP constraints.
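The two-step containment check can be sketched as below, assuming a Zonotope with center a and generator matrix A (one row per dimension, p columns) and a Star given by box bounds together with a constraint matrix C and offsets c. The encoding and the function name are assumptions made for illustration.

```python
def zonotope_in_star(a, A, lower, upper, C, c):
    """Check {a + A eta : eta in [-1, 1]^p} ⊆ {z in [lower, upper] : C z <= c}."""
    d, p = len(a), len(A[0])
    # Step (i): the bounding box of the zonotope must lie inside the box.
    radius = [sum(abs(v) for v in row) for row in A]
    for ai, r, lo, up in zip(a, radius, lower, upper):
        if ai - r < lo or ai + r > up:
            return False
    # Step (ii): the maximum of C_i z over the zonotope must not exceed c_i.
    for row, bound in zip(C, c):
        offset = sum(w * ai for w, ai in zip(row, a))
        spread = sum(abs(sum(row[k] * A[k][j] for k in range(d)))
                     for j in range(p))
        if offset + spread > bound:
            return False
    return True


a, A = [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]   # the square [-1, 1]^2
lower, upper = [-2.0, -2.0], [2.0, 2.0]
loose = zonotope_in_star(a, A, lower, upper, [[1.0, 1.0]], [2.5])
tight = zonotope_in_star(a, A, lower, upper, [[1.0, 1.0]], [1.5])
```

In the example, the corner (1, 1) of the square attains x + y = 2, so the constraint with offset 2.5 is satisfied while the one with offset 1.5 is violated.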
Obtaining half-space constraints. In the template generation process, we utilize Boxes as before. However, whenever we fail to verify a template (e.g., Lines 6 and 15 in Algorithm 3), we attempt to add a half-space constraint. We repeat this up to n hs times, resulting in as many constraints. We leverage the exact verifier for obtaining half-space constraints. Recall that it either verifies a region or provides a set of counterexamples {z V,i }. Since we only add half-space constraints if the verification fails, we can utilize these counterexamples. In the following, we assume a single z V and derive a hyperplane that separates z V from the abstractions we are trying to verify. If there are multiple z V,i , we iterate over them and perform the described procedure for each z V,i that is not already cut by the hyperplane found for a previous counterexample. These hyperplanes directly yield the new constraints.
We showcase this in Fig. 8, where the green shaded area T shows the Box join over three abstractions, T = ⊔ B ({P 1 , P 2 , P 3 }). The individual P i , shown in blue, are Zonotopes that can be verified individually.
The verification of the green area fails, with the counterexample z V (red dot) shown in the top right corner. To find a hyperplane that separates z V from the rest of T , we consider the line from the center a (green dot) of T to the point z V and a hyperplane orthogonal to it (shown as the dashed line). Thus, we add a row i to the matrix C of the Star whose normal C i points along z V − a. To position the hyperplane along this line, we consider the maximal value c p attained by C i x for x in the verified area (P 1 , P 2 , P 3 ). Then, the constant c i of the new hyperplane is given by c i = κc p + (1 − κ)c z for a hyper-parameter κ and c z := C i z V .
A high κ puts the hyperplane close to z V and removes only little volume from the template, while a low κ puts it closer to a. Since z V is only the counterexample with the largest violation, but not necessarily the whole region preventing certification, the half-space constraint obtained from a high κ might not be sufficient to separate this region from T . Thus, in a subsequent iteration, another half-space constraint for the same region may be added. For low κ, fewer constraints are required, but more verifiable volume of T is lost.
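The derivation of a single half-space constraint can be sketched as follows. Several simplifications are assumptions: the verified shapes P i are given as boxes rather than Zonotopes, the normal C i is left unnormalized, and c p is taken as the maximum of C i x over the verified shapes. The interpolation is used exactly as printed in the text, c i = κc p + (1 − κ)c z . The function name is hypothetical.

```python
def separating_halfspace(a, z_v, verified_boxes, kappa):
    """Return (C_i, c_i) for a constraint C_i z <= c_i that cuts off z_v.

    a:              center of the template T
    z_v:            counterexample returned by the exact verifier
    verified_boxes: verified shapes as (lower, upper) boxes (simplification)
    kappa:          interpolation hyper-parameter, expected in (0, 1]
    """
    # Normal of the hyperplane: direction from the center towards z_v.
    C_i = [z - c for z, c in zip(z_v, a)]

    def box_max(lower, upper):
        # Maximum of C_i x over a box: pick the favorable bound per dimension.
        return sum(w * (up if w >= 0 else lo)
                   for w, lo, up in zip(C_i, lower, upper))

    c_p = max(box_max(lo, up) for lo, up in verified_boxes)
    c_z = sum(w * z for w, z in zip(C_i, z_v))
    # Interpolation as printed in the text.
    c_i = kappa * c_p + (1 - kappa) * c_z
    return C_i, c_i


boxes = [([-1, -1], [1, 0]), ([-1, 0], [0, 1])]
C_i, c_i = separating_halfspace([0, 0], [2, 2], boxes, kappa=0.5)
```

Whenever c p < c z and κ lies in (0, 1], the resulting threshold satisfies c p ≤ c i < c z , so the counterexample violates the new constraint while every verified shape still satisfies it.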

A.3 Template Expansion
To further improve the generalization of our templates from the training set to the test set, we introduce an operation called template expansion, outlined in Algorithm 4. We apply the template expansion to the result of the template generation presented in Algorithm 3 and use the resulting widened templates for our offline proof transfer algorithm. Algorithm 4 tries to expand each of the templates T i separately, by repeatedly scaling the template's associated Box by a diagonal scaling matrix D := diag(f 1 , . . ., f d ) (Line 9) until T i is no longer verifiable by the template verifier V T (Line 4). Here, f j ≥ 1 acts as a scaling factor for the j-th dimension of the template Boxes.

Fig. 8: The algorithm used to find half-space constraints, by cutting the counterexample z V from the template T with a hyperplane (C i , c i ). The normal of the hyperplane (specified by C i ) is given by the vector between a (the center of T ) and z V . The threshold c i is chosen such that the hyperplane removes z V but does not intersect any of the relaxations P 1 , P 2 , P 3 from which the template T was created.
Template Expansion for Stars. When we encode templates as Stars, the scaling matrix is only applied to the Star's underlying Box, but not to the half-space constraints, since these have already been selected to be close to the decision boundary. Thus, for the new extended template T i , we copy the constraints from T w i (Line 11). If the resulting template fails to verify, we generate up to n te hs additional constraints, in the same way as for Algorithm 3, and add them to T i (Line 13).
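For Box templates, the expansion loop can be sketched as follows. The sketch assumes that the scaling is applied to the radii around the Box center, that `verifies` stands in for V T , and that the number of iterations is capped; the handling of half-space constraints for Stars is omitted. The function name is hypothetical.

```python
def expand_box(lower, upper, factors, verifies, max_iters=10):
    """Widen a box per dimension until it stops verifying.

    Returns the last box for which `verifies` returned True; the input box
    is assumed to be verifiable.
    """
    best = (list(lower), list(upper))
    for _ in range(max_iters):
        lo, up = best
        center = [(l + u) / 2 for l, u in zip(lo, up)]
        radius = [(u - l) / 2 * f for l, u, f in zip(lo, up, factors)]
        candidate = ([c - r for c, r in zip(center, radius)],
                     [c + r for c, r in zip(center, radius)])
        if not verifies(*candidate):
            break            # keep the last verifiable template
        best = candidate
    return best


# Toy example: a 1-D box is "verifiable" while its upper bound is at most 5.
expanded = expand_box([-1.0], [1.0], [2.0], lambda lo, up: up[0] <= 5.0)
```

In the toy example the radius doubles from 1 to 2 to 4, and the next candidate with radius 8 is rejected, so the box with radius 4 is returned.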

A.4 Experimental Evaluation
In this section, we instantiate our offline proof transfer on the MNIST dataset. We consider templates in both the Box and Star domains, both with and without template expansion (Appendix A.3), and a 5x100 fully-connected network with ReLU activations. We generate templates individually for every label at both the third and fourth layer. For technical details, see the end of this section. We allow up to m = 25 templates for each combination. We experiment with two different values for ϵ: 0.05 and 0.1.
Table 10 shows the results. We provide the fraction of input regions that could be successfully matched to templates as well as the overall verification time. If an input cannot be matched with any of the templates, then we propagate the standard Zonotope abstraction through the rest of the network to verify it.
We observe that templates can subsume up to 57.6% of input regions in the test set with ϵ = 0.05, and up to 45.8% for the higher ϵ = 0.1, when expressed in the Box domain (at layer 4). Additionally enabling template expansion increases these rates to 59.0% and 47.7%, respectively. Combining templates at multiple layers gives more matches, as many inputs can be matched in the third layer, while unmatched ones can again be considered at the fourth layer, for a total of up to 60.6% and 49.2%, respectively. We observe that improvements in matching rate directly lead to speed-ups over standard verification. Additionally allowing half-space constraints, i.e., using Stars instead of Boxes as templates, allows us to increase the matching rate up to 65.2% and 54.5% for the two ϵ, respectively, when using TE. However, as checking matches for Stars is computationally more expensive, the resulting final verification time is slightly worse compared to Box templates. To summarize, these results highlight that with the algorithm outlined in Appendix A.2, a set of templates T can be obtained that generalizes remarkably well to new unseen input regions (e.g., up to 65.2% containment). While more precise abstractions such as Stars allow templates that capture a far higher rate of containment for new input regions, the added cost of their containment check makes the obtained speed-ups smaller. Finally, we see that Template Expansion (Appendix A.3) uniformly leads to a higher matching rate and higher speed-ups.
Technical Details. We use a feed-forward neural network with five linear layers of size 100 and ReLU activations, trained with DiffAI [26]. The network has an accuracy of 0.94 and a certified accuracy of 0.93 and 0.92 for ϵ = 0.1 and ϵ = 0.2, respectively.
For cluster proofs we set an initial cluster size depending on the number of verifiable images per label, such that the clusters contain 50 images on average. For the verification of a cluster's union, we allow up to n hs = 30 half-space constraints and set κ to 0.05. Taking a low value leads to a larger truncation, but reduces the number of half-space constraints, which speeds up the template generation as well as the containment check at inference. We use the same values for verifying unions after merging two clusters. For expanding the templates, we use up to 10 iterations, in each of which we widen by 5% in each dimension and then allow up to 10 hyperplanes to verify the expanded template. To avoid truncating previously verified volume, we increase κ linearly by 0.02 for each expansion step, starting with an initial κ of 0.4.

Table 10: Template matching rate and verification time t in seconds of the whole MNIST test set for the 5x100 network, using up to m templates per label and layer pair. The baseline verification takes 292.13 ± 1.77 and 291.80 ± 2.36 seconds for ϵ = 0.05 and ϵ = 0.10, respectively. +TE indicates the use of Template Expansion.

Fig. 7: Visualization of Algorithm 3. First, in (a), the abstractions at layer k for input regions in the training set are obtained and restricted to the verifiable ones (green). These are then clustered and their convex hulls in domain D are obtained. (b) shows different clusters in different colors. The convex hulls are then verified (c) and restricted to the verifiable ones (green). Finally, in (d), these regions are further merged, if possible, to obtain the set of templates.

Algorithm 4: Template expansion
Input: layer number k, templates T = {T i } m i=1 at layer k, scaling matrix D, verifier V T
Result: Set T w of expanded templates, |T w | = m
 1 for i ← 1 to m do
 2     T w i ← T i
 3     T i ← D • T i
 4     while V T (T i , N k+1:L ) ⊢ ψ do
 5         T w i ← T i
 6         if T i is Star then
 7             T i ← remove planes(T i )
 8         end
 9         T i ← D • T i
10         if T i is Star then
11             T i ← copy planes(T w i )
12             if V T (T i , N k+1:L ) ⊬ ψ then
13                 T i ← add planes(T i , n te hs )
   return T w = {T w i } m i=1

Table 1: Proof subsumption on a robust MNIST classifier with 94% accuracy. Verif. acc. denotes the percentage of verifiable inputs from the test set for ℓ ∞ -perturbations (I ϵ ).

Table 2: Feasibility of S ⊆ T for Box B, Zonotope Z (with order reduction) and DP Polyhedra P .

Table 3: Rate of I i,j 2×2 matched to templates T for I 2×2 patch verification for different combinations of template layers k, 7x200 networks, using m = 1 template.

Table 4: Average verification time in seconds per image for I 2×2 patches for different combinations of template layers k, 7x200 networks, using m = 1 template.

Table 5: I 2×2 patch verification with templates at the 2nd & 3rd layer of the 7x200 networks for different masks.

Table 6: I 2×2 patch verification with templates generated on the second and third layer using the ℓ ∞ -mask. Verification times are given for the baseline t BL and for applying proof sharing t P S in seconds per image.

Table 7: Speed-ups achievable in the setting of Table 3. t BL denotes the baseline.

Table 8: ±2° rotation, ±10% contrast and ±1% brightness change split into r perturbations on 100 MNIST images. Verification rate, rate of splits matched and verified along with the run time of Zonotope t BL and proof sharing t P S .

Table 9: ±40° rotation split into 200 perturbations evaluated on MNIST. The verification rate is just 15%, but 82.1% of individual splits can be verified.