Guided search for hybrid systems based on coarse-grained space abstractions

Hybrid systems represent an important and powerful formalism for modeling real-world applications such as embedded systems. A verification tool like SpaceEx is based on the exploration of a symbolic search space (the region space). As a verification tool, it is typically optimized towards proving the absence of errors. In some settings, e.g., when the verification tool is employed in a feedback-directed design cycle, one would like to have the option to call a version that is optimized towards finding an error trajectory in the region space. A recent approach in this direction is based on guided search. Guided search relies on a cost function that indicates which states are promising to be explored, and preferably explores more promising states first. In this paper, we propose an abstraction-based cost function based on coarse-grained space abstractions for guiding the reachability analysis. For this purpose, a suitable abstraction technique that exploits the flexible granularity of modern reachability analysis algorithms is introduced. The new cost function is an effective extension of pattern database approaches that have been successfully applied in other areas. The approach has been implemented in the SpaceEx model checker. The evaluation shows its practical potential.


Introduction
Hybrid systems are extended finite automata whose discrete states correspond to the various modes of continuous dynamics a system may exhibit, and whose transitions express the switching logic between these modes [1]. Hybrid systems have been used to model and to analyze various types of embedded systems [5,9,19,20,34,35,41]. A hybrid system is considered safe if a given set of bad states cannot be reached from the initial states. Hence, reachability analysis is a main concern for hybrid systems. Since the reachability analysis of hybrid systems is in general undecidable [1], modern reachability analysis tools such as SpaceEx [23] resort to semi-decision procedures based on over-approximation techniques [16,23]. In this paper, we explore the utility of guided search in order to improve the efficiency of such techniques.
Guided search is an approach that has recently found much attention for finding errors in large systems [14,30]. As suggested by the name, guided search performs a search in the state space of a given system. In contrast to standard search methods like breadth-first or depth-first search (DFS), the search is guided by a cost function that estimates the search effort to reach an error state from the current state. This information is exploited by preferably exploring states with lower estimated costs. If accurate cost functions are applied, the search effort can significantly be reduced compared to uninformed search. Obviously, the cost function therefore plays a key role within the setting of guided search, as it should be as accurate as possible on the one hand, and as cheap to compute as possible on the other. Cost functions that have been proposed in the literature are mostly based on abstractions of the original system. An important class of abstraction-based cost functions is based on pattern databases (PDBs). PDBs have originally been proposed in the area of Artificial Intelligence [17] and also have successfully been applied to model checking discrete and timed systems [29,30,37,42]. Roughly speaking, a PDB is a data structure that contains abstract states together with abstract cost values based on an abstraction of the original system. During the concrete search, concrete states s are mapped to corresponding abstract states in the PDB, and the corresponding abstract cost values are used to estimate the costs of s. Overall, PDBs have demonstrated to be powerful for finding errors in different formalisms. The open question is if guided search can be applied equally successfully to finding errors in hybrid systems.
A first approach in this direction [14] is to estimate the cost of a symbolic state based on the Euclidean distance from its continuous part to a given set of error states. This approach appears to be best suited for systems whose behavior is strongly influenced by the (continuous) differential equations. However, it suffers from the fact that discrete information like mode switches is completely ignored, which can lead to arbitrary degeneration of the search. To see this, consider the example presented in Fig. 1. It shows a simple hybrid system with one continuous variable which obeys the differential equationẋ = 1 in every location (differential equations are omitted in the figure). The error states are given by the locations l e1 , . . . , l en and invariants 0 ≤ x ≤ 8. In this example, the box-based distance heuristic wrongly explores l 1 l 2 l 3 l e1 . . . l en l 4 l 5 l 6 . . .
Fig. 1 A motivating example the whole lower branch first (where no error state is reachable) because it only relies on the continuous information given by the invariants. More precisely, for the box-based distance heuristic, the invariants suggest that the costs of the "lower" states are equal to 0, whereas the costs of the "upper" states are estimated to be equal to 4 (i. e., equal to the distance of the centers of the bounding boxes of the invariants).
To overcome these limitations, we introduce an abstraction-based cost function for hybrid systems which is motivated by PDBs. In contrast to the box-based approach based on Euclidean distances, this cost function is able to properly reflect the discrete part of the system. Compared to the "classical" discrete setting, the investigation of PDBs for hybrid systems becomes more difficult for several reasons. First, hybrid systems typically feature both discrete and continuous variables with complex dependencies and interactions. Therefore, the question arises how to compute a suitable (accurate) abstraction of the original system. Second, computations for symbolic successors and inclusion checks become more expensive than for discrete or timed systems-can these computations be performed or approximated efficiently to get an overall efficient PDB approach as well? In this paper, we provide answers to these questions, leading to an efficient guided search approach for hybrid systems. In particular, we introduce an abstraction technique leveraging properties of the set representations used in modern reachability algorithms. By simply using coarser parameters for the explicit representation, we obtain suitable and cheap coarse-grained space abstractions for the behaviors of a given hybrid system. Furthermore, we adapt the idea of partial PDBs, which has been originally proposed for solving discrete search problems [7], to the setting of hybrid systems in order to reduce the size and computation time of "classical" PDBs. Our implementation in the SpaceEx tool [23] shows the practical potential.
The remainder of the paper is organized as follows. After introducing the necessary background for this work in Sect. 2, we present our PDB approach for hybrid systems in Sect. 3. This is followed by a discussion about related work in Sect. 4. Afterwards, we present our experimental evaluation in Sect. 5. Finally, we conclude the paper in Sect. 6.

Preliminaries
In this section, we introduce the preliminaries that are needed for this work.

Notations
We consider models that can be represented by hybrid systems. A hybrid system is formally defined as follows. R n , -The initial condition, given by the constraint Init( ) ⊂ R n for each location , -For each location , a relation called Flow( ) over the variables and their derivatives. We assume Flow( ) to be of the forṁ where x(t) ∈ R n , A is a real-valued n × n matrix and U ⊆ R n is a closed and bounded convex set, -The discrete transition relation, given by a set Trans of discrete transitions; a discrete transition is formally defined as a tuple ( , g, ξ, ) defining -The source location and the target location , -The guard, given by a linear constraint g, -The update, given by an affine mapping ξ , and -The invariant Inv( ) ⊂ R n for each location .
The semantics of a hybrid system H is defined as follows. A state of H is a tuple ( , x), which consists of a location ∈ Loc and a point x ∈ R n . More formally, x is a valuation of the continuous variables in Var. For the following definitions, let T = [0, Δ] be an interval for some Δ ≥ 0. A trajectory of H from state s = ( , x) to state s = ( , x ) is defined by a tuple ρ = (L , X), where L : T → Loc and X : T → R n are functions that define for each time point in T the location and values of the continuous variables, respectively. Furthermore, we will use the following terminology for a given trajectory ρ. A sequence of time points where location switches happen in ρ is denoted by (τ i ) i=0...k ∈ T k+1 . In this case, we define the length of ρ as |τ | = k. Trajectories ρ = (L , X) (and the corresponding sequence (τ i ) i=0...k ) have to satisfy the following conditions: • τ 0 = 0, τ i < τ i+1 , and τ k = Δ -the sequence of switching points increases, starts with 0 and ends with Δ • L(0) = , X(0) = x, L(Δ) = , X(Δ) = x -the trajectory starts in s = ( , x) and ends in s = ( , x ) X(t) = AX(t) + u(t) holds and thus the continuous evolution is consistent with the differential equations of the corresponding location • ∀i ∀t ∈ [τ i , τ i+1 ) : X(t) ∈ Inv(L(τ i )) -the continuous evolution is consistent with the corresponding invariants every continuous transition is followed by a discrete one, X end (i) defines the values of continuous variables right before the discrete transition at the time moment τ i+1 , whereas X start (i) = X(τ i ) denotes the values of continuous variables right after the switch at the time moment τ i .
A state s is reachable from state s if there exists a trajectory from s to s .
In the following, we mostly refer to symbolic states. A symbolic state s = ( , R) is defined as a tuple, where ∈ Loc, and R is a convex and bounded set consisting of points x ∈ R n . The continuous part R of a symbolic state is also called region. The symbolic state space of H is called the region space. The initial set of states S init of H is defined as ( , Init( )). The reachable state space Reach(H) of H is defined as the set of symbolic states that are reachable from an initial state in S init , where the definition of reachability is extended accordingly for symbolic states.
In this paper, we assume there is a given set of symbolic bad states S bad that violate a given property. Our goal is to find a sequence of symbolic states which contains a trajectory from S init to a symbolic error state, where a symbolic error state s e has the property that there is a symbolic bad state in S bad that agrees with s e on the discrete part, and that has a non-empty intersection with s e on the continuous part. A trajectory that starts in a symbolic state s and leads to a symbolic error state is called an error trajectory ρ e (s).

Symbolic states representation
The representation of symbolic states plays a crucial role for the reachability analysis of hybrid systems. As outlined in the previous section, a symbolic state consists of a discrete location and a continuous region. The handling of continuous regions within the reachability analysis poses a special challenge as a number of operations on polyhedra (such as linear maps, Minkowski sum, and convex hull computation) need to be performed efficiently in practice. The LGG scenario [33] which is implemented in SpaceEx [23] relies on two main ingredients for this purpose: support functions [10] and template polyhedra. In the following, we will describe them in more detail.
The support function ρ R ( ) of a region R with respect to the direction ∈ R n is defined as follows: We can represent an arbitrary convex closed set R by using support functions in the following way: The representation based on support functions allows for efficiently computing all the above-mentioned polyhedra operations, hence reachability algorithms in turn benefit from this representation.
As the consideration of an infinite number of directions is clearly infeasible from the computational point of view, SpaceEx also makes use of a continuous set representation derived from the support functions: template polyhedra. In this setting, we predefine from the very beginning a set of directions taken into account in course of the reachability analysis. In other words, a user provides a set of directions D = { 1 , . . . , m } used for the reachability analysis. Based on D, the region R can be over-approximated by the following polyhedron: SpaceEx supports a number of predefined direction sets such as, box directions (directions parallel to axes; see In the rest of this section, we briefly recapitulate the computation of continuous successors for a given symbolic state, i.e., the states which are reachable according to the contin-  The LGG scenario computes the continuous successors only for a finite time horizon. Therefore, we use a timebounded version of the reachable region Reach t 1 ,t 2 (R) for a given starting region R ⊆ R n , dynamicsẋ(t) = Ax(t) + u(t), u(t) ∈ U (*) and a time interval [t 1 , t 2 ] ⊆ R ≥0 : x(τ ) is the solution of (*)}.
SpaceEx performs an over-approximating time-bounded reachability analysis of Reach 0,T (R), where T ∈ R ≥0 is a user-provided time horizon. In more detail, as the reachability analysis of hybrid systems is generally undecidable, SpaceEx over-approximates the successor regions by iteratively computing over-approximations based on discretizing the time up to the time horizon: First, the time interval [0, T ] is partitioned in a number of small time intervals is a user-provided sampling time. Second, given this partitioning, SpaceEx covers the exact reachability set with the sequence where Ω i defines the overapproximation of the states reachable within the time interval In other words, the following inclusion holds: The set Ω i+1 can be expressed in terms of the "predecessor set" Ω i by using a linear map and Minkowski sum. Therefore, we only need to provide a routine to compute Ω 0 which in turn can be done in two steps. First, we compute the convex hull of the union of the region R and its image at the moment T δ . Second, we observe that the continuous dynamics non-linearities can lead to some reachable states being outside of the computed convex hull. In order to account for this phenomena, we bloat the resulting convex hull to ensure the over-approximation. Clearly, a larger sampling time T δ makes a possibly larger bloating necessary, which worsens  To summarize, we observe that the adjustment of the template directions used in the support function representation and the sampling time in the continuous post operator crucially impacts the precision, i.e., the abstraction level, of the symbolic state representation. Clearly, an improved precision leads to an increased analysis time on the downside. Based on this representation, we will present an algorithm which leverages different abstraction levels to efficiently explore the region space.

Guided search
In this section, we introduce a guided search algorithm (Algorithm 1) along the lines of the reachability algorithm used by the current version of SpaceEx [23]. It works on the region space of a given hybrid system. The algorithm checks if a symbolic error state is reachable from a given set of initial symbolic states S init . As outlined above, we define a symbolic state s e in the region space of H to be a symbolic error state if there is a symbolic state s ∈ S bad such that s and s e agree on their discrete part, and the intersection of the regions of s and s e is not empty (in other words, the error states are defined with respect to the given set of bad states). Starting with the set of initial symbolic states from S init , the algorithm explores the region space of a given hybrid system by iteratively computing symbolic successor states until an error state is found, no more states remain to be considered, or a (given) maximum number of iterations i max is reached. The exploration of the region space is guided by the cost function such that symbolic states with lower cost values are considered first.
In the following, we provide a conceptual description of the algorithm using the following terminology. A symbolic state s is called a symbolic successor state of a symbolic state s if s is obtained from s by first computing the continuous successor of s (according to iteratively over-approximating the successor regions of s with sets Ω i as described in the previous section), and then by computing a discrete successor state of the resulting (intermediate) state. Therefore, for a given symbolic state s curr , the function continuousSuccessor (line 7) returns a symbolic state which is an over-approximation of the symbolic state reach-

Algorithm 1 A guided symbolic reachability algorithm
Require: Set of initial symbolic states S init , set of symbolic bad states S bad , cost function cost Ensure: Can a symbolic error state be reached from a symbolic state in S init ? 1: compute cost(s) for all s ∈ S init 2: Push (L waiting , {(s, cost(s)) | s ∈ S init }) 3: i := 0 4: while (L waiting = ∅ ∧ i < i max ) do 5: s curr := GetNext (L waiting ) 6: i := i + 1 7: s curr := continuousSuccessor(s curr ) 8: if s curr is a symbolic error state then 9: return "Error state reached" 10: end if 11: Push (L passed , s curr ) 12: S := discreteSuccessors(s curr ) 13: for all s ∈ S do 14: if s / ∈ L passed then 15: compute cost(s ) 16: Push (L waiting , (s , cost(s ))) 17: able from s curr within the given time horizon according to the continuous evolution. Accordingly, the function discrete-Successors (line 12) returns the symbolic states that are reachable due to the outgoing discrete transitions.
A symbolic state s is called explored if its symbolic successor states have been computed. A symbolic state s is called visited if s has been computed but not yet necessarily explored. To handle encountered states, the algorithm maintains the data structures L passed and L waiting . L passed is a list containing symbolic states that are already explored; this list is used to avoid exploring cycles in the region space. L waiting is a priority queue that contains visited symbolic states together with their cost values that are candidates to be explored next. The algorithm is initialized by computing the cost values for the initial symbolic states and pushing them accordingly into L waiting (lines 1 -2). The main loop iteratively considers a best symbolic state s curr from L waiting according to the cost function (line 5), computes its symbolic continuous successor state s curr (line 7), and checks if s curr is a symbolic error state (lines 8 -10). (Recall that s curr is defined as a symbolic error state if there is a symbolic bad state s ∈ S bad such that s and s curr agree on their discrete part, and the intersection of the regions of s and s curr is not empty.) If this is the case, the algorithm terminates. If this is not the case, then s curr is pushed into L passed (line 11). Finally, for the resulting symbolic state s curr , the symbolic discrete successor states are computed, prioritized, and pushed into L waiting if they have not been considered before (lines 12 -18). As a side remark, if a successor state s = l, R is not contained in L passed (line 14), but instead there is a symbolic state s = l, R ∈ L passed with R ⊂ R , then s is discarded as well because all transitions enabled in s have already been enabled in s which is already explored. Finally, the check if the given maximal number of iterations has been reached (line 4 and line 20) ensures termination, which would not be generally guaranteed otherwise (e. g., because of Zeno behavior).
Obviously, the search behavior of Algorithm 1 is crucially determined by the cost function that is applied. In the next section, we give a generic description of pattern database cost functions.

General framework of pattern databases
For a given system S, a pattern database (PDB) in the classical sense (i. e., in the sense PDBs have been considered for discrete and timed systems) is represented as a table-like data structure that contains abstract states together with abstract cost values. The PDB is used as a cost estimation function by mapping concrete states s to corresponding abstract states s # in the PDB, and using the abstract cost value of s # as an estimation of the cost value of s. The computation of a classical PDB is performed in three steps. First, a subset P of variables and automata of the original system S is selected. Such subsets P are called pattern. Second, based on P, an abstraction S # is computed that only keeps the variables occurring in P. Third, the entire state space of S # is computed and stored in the PDB. More precisely, all reachable abstract states together with their abstract cost values are enumerated and stored. The abstract cost value for an abstract state is defined as the shortest length of a trajectory from that state to an abstract error state. The resulting PDB of these three steps is used as the cost function during the execution of Algorithm 1; in other words, the PDB is computed prior to the actual model checking process, where the resulting PDB is used as an input for Algorithm 1.
A straight-forward adaptation of such classical PDBs to the area of hybrid systems is the following. For a given hybrid system H, compute an abstract system H # as the basis for the PDB, where H # is obtained from H by removing some of the variables in H (the pattern corresponds to the remaining variables in H # ). Based on H # , the PDB is represented by a data structure that contains abstract states together with corresponding cost values. The abstract states and cost values are obtained by a region space exploration of H # . The abstract cost value of an abstract state s # is defined as the length of a shortest found trajectory in H # from s # to an abstract error state. The PDB computes the cost function where s is a symbolic state, s # is a corresponding abstract state to s in the PDB, and cost # is the length of the corresponding trajectory from s # to an abstract error state as defined above.

Pattern databases for hybrid systems
In Sect. 2.4, we have described the general approach for computing and using a PDB for guiding the search. However, for hybrid systems, there are several challenges using the classical PDB approach. First, it is not clear how to effectively design and compute suitable abstractions H # for hybrid systems H with complex variable dependencies. Second, in Sect. 3.2, we address the general problem that the precomputation of a PDB is often quite expensive, where in many cases, only a small fraction of the PDB is actually needed for the search [24]. This is undesirable in general, and specifically becomes problematic in the context of hybrid systems because the reachability analysis in hybrid systems is typically much more expensive than, e. g., for discrete systems. In Sect. 3.2, we introduce a variant of partial PDBs for hybrid systems to address these problems.

Coarse-grained space abstractions
A general question in the context of PDBs is how to compute suitable abstractions of a given system. In particular, for hybrid systems where variables often have rather complex dependencies, projection abstractions based on removing variables (as done for classical PDBs) can be too coarse to achieve accurate heuristics. In this paper, we propose a simple, yet elegant alternative to the classical PDB approach to obtain a coarse grained and fast analysis: As described in Sect. 2.2, the LeGuernic-Girard (LGG) algorithm implemented in SpaceEx [23] uses support function representation (based on the chosen set of template directions) to compute and store over-approximations of the reachable states. Therefore, a reduced number of template directions and an increased sampling time results in an abstraction of the original region space in the sense that the dependency graph of the reachable abstract symbolic states is a discrete abstraction of the system.
The granularity of the resulting abstraction is directly correlated with the parameter selection: Choosing coarser parameters (fewer template directions, larger sampling time) in the reachability algorithm makes this abstraction coarser, whereas finer parameters lead to finer abstractions as well. In more detail, for a given set of template directions D and sampling time N , a subset D ⊂ D and a larger sampling time N > N induce coarse-grained space abstractions with respect to the abstractions obtained by D and N : the overapproximation of regions based on D and N are coarser than for D and N . As an example for template directions, consider again Figs. 2 and 3: the set of box directions in Fig. 2 is a coarse-grained space abstraction of the set of octagonal directions in Fig. 3. Similarly, as an example for the sampling time, consider again Figs. 4 and 5, where Fig. 4 shows a coarsegrained space abstraction based on increased sampling time of the regions in Fig. 5.
In the following, we apply coarse-grained space abstractions to obtain abstractions as the basis for pattern databases. This is a significant difference compared to classical PDB approaches (see Sect. 2.4): Instead of computing an explicit (projection) abstraction H # based on a subset of all variables, we keep all variables (and hence, the original system H), and instead choose a coarser exploration of the abstract region space of H to obtain the abstraction used for the PDB. (In practice, we apply unguided search provided by SpaceEx to compute this coarser abstraction.) As an additional difference to classical PDBs, we will apply a variant of partial PDBs, which are introduced in the next section.

Partial pattern databases
As already outlined, a general drawback of classical PDBs is the fact that their precomputation might become quite expensive. Even worse, in many cases, most of this precomputation time is often unnecessary because only a small fraction of the PDB is actually needed during the symbolic search in the region space [24]. One way that has been proposed in the literature to overcome this problem is to compute the PDB on demand: The so-called switchback search maintains a family of abstractions with increasing granularity; these abstractions are used to compute the PDB to guide the search in the next finer level [32].
In the following, we apply a variant of partial PDBs [7] based on coarse-grained space abstractions to address this problem: Instead of computing the whole abstract region space for a given abstraction, we restrict the abstract search to explore only a fraction of the abstract region space while focusing on those abstract states that are likely to be sufficient for the concrete search. In the following definition, we call an abstract state s # corresponding to state s if s and s # agree on their discrete part, the region of s is included in region of s # , and s # is an abstract state with minimal abstract costs that satisfies these requirements.

Definition 2 (Partial Pattern Database)
Let H be a hybrid system. A partial pattern database for H is a pattern database for H that contains only abstract state/cost value pairs for abstract states that are part of some trajectory of shortest length (in terms of number of location switches) from an initial state to some abstract error state. The partial pattern database computes the function where s, s # , and cost # are defined as above, and +∞ is a default value indicating that no corresponding abstract state to s exists.
Informally, a partial PDB for a hybrid system H exactly contains those abstract states that are explored on some shortest trajectory (instead of containing all abstract states of a complete abstract region space exploration to all abstract error states as it would be the case for a classical PDB). In other words, partial PDBs are incomplete in the sense that there might exist concrete states without any corresponding abstract states in the PDB. In such cases, the default value +∞ is returned with the intention that corresponding concrete states are only explored if no other states are available. Obviously, this might worsen the overall search guidance compared to the fully computed PDB. However, in special cases, a partial PDB is already sufficient to obtain the same cost function as obtained with the original PDB or even obtained with a perfect cost function (that allows for exploring the region space without backtracking to find an error state). For example, this is the case when only abstract states are excluded from which no abstract error state is reachable anyway. More generally, under the idealized assumption that the abstraction is fine enough such that no spurious behavior occurs on shortest possible error trajectories, the partial PDB already delivers the same search behavior as a perfect search algorithm that finds an error trajectory without backtracking.

Proposition 1
Let H be a hybrid system. Let n ∈ N 0 be the length of a shortest concrete error trajectory. If all shortest abstract error trajectories in H (obtained by a coarse-grained space abstraction to build a pattern database) correspond to concrete error trajectories of the same length, then guided search with Algorithm 1 and cost PP finds an error trajectory after n steps.
Proof By construction, the partial PDB contains exactly those symbolic abstract states that are part of shortest possible error trajectories. By assumption, these abstract states correspond to concrete states on concrete error trajectories of the same length. Hence, for every concrete state s on a shortest error trajectory, there is a corresponding entry in the partial PDB for all concrete successor states s of s that are part of a shortest concrete error trajectory, and cost PP (s ) = cost PP (s) − 1. In addition, for all concrete successor states s that are not part of a shortest concrete error trajectory, cost PP (s ) = ∞. Overall, the claim follows by an inductive argument: Let s 0 be an initial state such that cost PP (s 0 ) = n is minimal among the costs of all initial states, i. e., n is the length of a shortest concrete error trajectory. Furthermore, all concrete states on a shortest concrete error trajectory have a concrete successor state with a cost value decreased by one, whereas all other successor states have a cost value of infinity. Hence, Algorithm 1 with the cost PP function finds a concrete error trajectory within n steps.
Under the idealized assumptions of Proposition 1, it follows immediately that guided search applying the full PDB cannot improve over the partial PDB.
Corollary 1 Under the assumptions of Proposition 1, guided search with Algorithm 1 and cost P explores at least as many states as with cost PP .
Proposition 1 and Corollary 1 show that partial PDBs can provide effective search guidance in an idealized setting where the applied abstraction only introduces spurious behavior on non-relevant parts of the region space. Clearly, in practice, these assumptions will mostly not be satisfied for abstractions that are efficiently computable. However, we rather consider Proposition 1 as a proof of concept showing that the basic concept of partial PDBs is meaningful in our setting. (In our experimental analysis, we will show that partial PDBs yield an effective and efficient approach for a number of practical and challenging problems as well-we will come back to this point in Sect. 5.) Overall, we will see that although in case the requirements of Proposition 1 are not fulfilled, partial PDBs can still be a good heuristic choice that lead to cost functions that are efficiently computable and accurately guide the concrete search.

Discussion
Our pattern database approach for finding error states exploits abstractions in a different way than in common approaches for verification (see Sect. 4 for a discussion on related work). Most notably, the main focus of our abstraction is to provide the basis for the cost function to guide the search, rather than to prove correctness (although, under certain circumstances, it can be efficiently used for verification as well-we will come back to this point in the experiments section). As a short summary of the overall approach, we first compute a symbolic abstract region space (as described in Sect. 3.1), where the encountered symbolic abstract states s # are stored in a table together with the corresponding abstract cost values of s # . To avoid the (possibly costly) computation of an entire PDB, we only compute the PDB partially (as described in Sect. 3.2). This partial PDB is then used as the cost function of our guided reachability algorithm. As in many other approaches that apply abstraction techniques to reason about hybrid systems, the abstraction that is used for the PDB is supposed to accurately reflect the "important" behavior of the system, which results in accurate search guidance of the resulting cost function and hence, of our guided reachability algorithm.
An essential feature of the PDB-based cost function is the ability to reflect the continuous and the discrete part of the system. To make this more clear, consider again the motivating example from the introduction (Fig. 1). As we have discussed already, the box-based distance function first wrongly explores the whole lower branch of this system because no discrete information is used to guide the search. In contrast, a partial PDB is also able to reflect the discrete behavior of the system. In this example, the partial PDB consists of an abstract trajectory to the first reachable error state, which is already sufficient to guide the (concrete) region space exploration towards to first reachable error state as well. In particular, this example shows the advantage of partial PDBs compared to fully computed PDBs (recall that fully computed PDBs would include all error states, whereas the partial PDB only contains the trajectory to a shortest one). In general, our PDB approach is particularly well suited for hybrid systems with a non-trivial amount of discrete behavior. However, the continuous behavior is still considered according to our abstraction technique as introduced in Sect. 3.1. Overall, partial PDBs appear to be an accurate approach for guided search because they accurately balance the computation time for the cost function on the one hand, and lead to efficient and still accurately informed cost functions on the other hand.

Related work
Abstraction techniques for hybrid systems have been mostly considered in a verification setting, i. e., in a setting where the focus is on proving that a given set of bad states cannot be reached. For this purpose, abstractions have been applied in different ways. On the one hand, a number of approaches to abstract the regions of symbolic states within the reachability analysis have been suggested, including constraint polyhedra [22], ellipsoids [31], and orthogonal polyhedra [15]. In our paper, we use the support function representation [33]. These approaches have in common that the structure of the considered hybrid system is left intact. On the other hand, it also possible to abstract a hybrid automata structure. Alur et al. [2] suggest to use predicate abstraction for the hybrid systems analysis. In addition, Tiwari et al. [40] introduce a method based on the quantifier elimination decision procedure for real closed fields. Furthermore, Tiwari [39] investigates Lie derivatives and their application to the abstraction generation. Jha et al. [25] computes abstractions by removing some of the continuous variables. Finally, Bogomolov et al. [13] abstract hybrid systems by merging locations. The abstract dynamics is computed by eliminating the state variables and computing a convex hull. Our pattern database approach belongs to the first group outlined above as we exploit the parametrization of the symbolic region representation.
A prominent model checking approach for hybrid systems is based on counterexample-guided abstraction refinement (CEGAR) [3,4]. In a nutshell, CEGAR iteratively refines the considered abstraction until the abstraction is fine enough to prove or refute the property. Our PDB approach shares with CEGAR the general idea of using an abstraction to analyze a concrete system. However, in contrast to CEGAR, where abstract counterexamples have to be validated and possibly used in further abstraction refinement, abstractions for PDBs are never refined and only used as a heuristic to guide the search within the concrete automaton. In other words, in contrast to CEGAR, the accuracy of the abstraction influences the order in which concrete states are explored, and hence, the accuracy in turn influences the performance of the resulting model checking algorithm.
Therefore, a crucial difference lies in the fact that CEGAR does the search in the abstract space, replays the counterexample in the concrete space, and stops if the error trajectory cannot be followed. In contrast, our approach does the search in the concrete space and uses the PDBs for guidance, only. If an abstract trajectory cannot be followed, the search does not stop, but tries other branches until either a counterexample is found, or all trajectories have been exhausted. Due to this reason, our framework provides the same level of precision as the default SpaceEx reachability algorithm. The concretization of a symbolic path is known to be a highly non-trivial computational problem. A symbolic bad path found with our approach can be further concretized to the trajectory level using techniques from optimal control (see, e.g., the work by Zutshi et al. [43] for more details).
Considering more specialized techniques to find error states in faulty hybrid systems, Bhatia and Frazzoli [11] propose using rapidly exploring random trees (RRTs). In the context of hybrid systems, the objective of a basic RRTs approach is to efficiently cover the region space in an "equidistant" way in order to avoid getting stuck in some part of the region space. Recently, RRTs were extended by adding guidance of the input stimulus generation [18]. However, in contrast to our approach, RRTs approaches are based on numeric simulations, rather than symbolic executions. Applying PDBs to RRTs would be an interesting direction for future work. In a further approach, Plaku, Kavraki, and Vardi [36] propose to combine motion planning with discrete search for falsification of hybrid systems. The discrete search and continuous search components are intertwined in such a way that the discrete search extracts a high-level plan that is then used to guide the motion planning component. In a slightly different setting, Ratschan and Smaus [38] apply search to finding error states in hybrid systems that are deterministic. Hence, the search reduces to the problem of finding an accurate initial state.
SpaceEx [23] is a recently developed, yet already prominent model checking tool for hybrid systems. As suggested by the name, it explores the region space by applying (symbolic) search. The most related approach to this paper has recently been presented by Bogomolov et al. [14], who propose a cost function based on Euclidean distances of the regions of the current state and error states. The resulting guided search algorithm is implemented in SpaceEx and has demonstrated to achieve significant guidance and performance improvements compared to the uninformed search of SpaceEx. In contrast to the presented PDB approach of this paper, the Euclidean distances are solely based on the continuous part of the system, whereas PDBs are able to reflect both discrete and continuous parts.
Moreover, guided search has been applied to finding error states in a subclass of hybrid systems, namely to timed systems. In particular, PDBs have been investigated in this context [29,30,42]. In contrast to this paper, the PDB approaches for timed systems are "classical" PDB approaches, i. e., a subset of the available automata and variables are selected to compute a projection abstraction. To select this subset, Kupferschmid et al. [29] compute an abstract error trace and select the automata and variables that occur in transitions in this abstract trace. In contrast, Kupferschmid and Wehrle [30,42] start with the set of all automata and variables (i. e., with the complete system), and iteratively remove variables as long as the resulting projection abstraction is "precise enough" according to a certain quality measure. In both approaches, the entire PDB is computed, which is more expensive than the partial PDB approach proposed in this paper.

Evaluation
We have implemented cost PP in the SpaceEx tool [23] and evaluated it on a number of challenging benchmarks. The implementation and the benchmarks are available at http:// pub.ist.ac.at/~sbogomol/sttt2015. The experiments have been performed on a machine running with AMD Opteron 6174 processors. We set a time limit of 30 min per run. In the following, we report results for our PDB implementation of cost PP in SpaceEx. We compared cost PP with uninformed DFS as implemented in SpaceEx, and with the recently proposed box-based distance function [14] on several challenging benchmark problems. We compare the number of iterations of SpaceEx, the length of the error trajectory found as well as the overall search time (including the computation of the PDB for cost PP ) in seconds. In the following, we will shortly denote partial PDBs with PDBs.

Results for navigation benchmarks
As a first set of benchmarks, we consider a variant of the wellknown navigation benchmark [21]. This benchmark models an object moving on the plane which is divided into a grid of cells. The dynamics of the object's planar position in each cell is governed by the differential equationsẋ where v d stands for the targeted velocity in this location. Compared to the originally proposed navigation benchmark problem, we address a slightly more complex version with the following additional constraints. First, we add inputs allowing perturbation of object coordinates, i. e., the system of differential equations is extended to: Second, to make the search task even harder, the benchmark problems also feature obstacles between certain grid elements. This is particularly challenging because, in contrast to the original benchmark system, one can get stuck in a cell where no further transitions can be taken, and consequently, backtracking might become necessary. The size of the problem instances varies from 400 to 900 locations, and all instances feature 4 variables.
The results for the navigation benchmark problem class are provided in Table 1, where the best results are given in bold fonts with respect to the total runtime. The fraction of the total time to compute the PDB is given in parenthesis. As a general picture, they show that the precomputation time for the PDB mostly pays off in terms of guidance accuracy and overall runtime. Specifically, the overall runtime could (sometimes significantly) be reduced compared to uninformed search and also compared to the box-based heuristic. For example, in navigation instance 1, the precomputation for the PDB only needs around 3 s, leading to an overall runtime of around 28 s, compared to around 99 s with the box-based heuristic, and about 206 s with uninformed search. This search behavior for instance 1 is also visualized in Figs. 6, 7, and 8, showing the trajectories (i. e., the parts of the covered region space) with the different search approaches. We observe the following: While uninformed DFS explores quite a large number of unnecessary trajectories, the box-based heuristic already guides the search more accurately and finds an error state with much fewer detours. Considering the PDB approach, we observe that PDBs can guide the search even more accurately in the sense that no detours are explored at all, and hence, no backtracking is needed either. Furthermore, the covered parts of the region space is again much lower than both with uninformed search and the box-based heuristic. In addition, we observe that even the abstract run (shown in light gray) is already rather accurate, covering only little more of the region space than the concrete run. Overall, the PDB approach finds an accurate balance between the computation time and the accuracy of the resulting cost function.

Results for navigation benchmarks with additive vanishing perturbation
We consider another variant of the navigation benchmark with an additive vanishing perturbation (see, e.g., [28]). We use this variation for evaluating the scalability of our approach with respect to increased continuous complexity given a constant discrete complexity. For this, the benchmark is modified to model a vanishing perturbation w ∈ R p with increasing model order (i. e., p = 1, p = 2, etc.). In more detail, we extend the system of differential equations where u min ≤ u ≤ u max as before and A w ∈ R p× p is Hurwitz to ensure it is a vanishing perturbation. Table 2 presents results of the same scenarios evaluated in the earlier navigation benchmark for p = 2 additional vanishing perturbation variables (i. e., 2 additional state variables compared to the earlier navigation benchmark, yielding n = 6 continuous variables overall). We observe a similar picture as for the previous results: the PDB approach outperforms uninformed DFS and also the box-based heuristic in the majority of the problems. This is also reflected in Figs. 9, 10, and 11, which, respectively, show the corresponding reachable states in the second instance for the three approaches, respectively.
In addition, Fig. 12 presents the navigation benchmark instance 1 scaling the number of additional state variables from p = 1 through p = 8 (for a total of n = 5 through n = 12 continuous variables), while keeping all else constant, using a timeout of 30 min, and runs that exceeded the 30 min timeout are not plotted. We observe that also with increasing number of additional continuous variables, the runtime scalability of our PDB approach is considerably better compared to the box-based heuristic and uninformed DFS-even for p = 8 additional variables, PDBs is able to find an error state in less than 30 min. In contrast, both the uninformed DFS and box-based heuristic methods cannot find the error states in less than 30 min beyond p = 3 and p = 5 additional variables, respectively.

Results for satellite benchmarks
In this section, we consider benchmarks that result from hybridization. For a hybrid system H with non-linear continuous dynamics, hybridization is a technique for generating a hybridized hybrid automaton from H. The hybridized automaton has simpler continuous dynamics (usually affine or rectangular) that over-approximate the behavior of H [8], and can be analyzed by SpaceEx. For our evaluation, we consider benchmarks from this hybridization technique applied to non-linear satellite orbital dynamics [27], where two satellites orbit the earth with non-linear dynamics described by Kepler's laws. The orbits in three-dimensional space lie in a two-dimensional plane and may in general be any conic section, but we assume the orbits are periodic, and hence circular or elliptical. Fixing some orbital parameters (e.g., the orientations of the orbits in three-space), the states of the satellites in three-dimensional space x 1 , x 2 ∈ R 3 can be completely described in terms of their true anomalies (angular positions). Likewise, one can transform between the threedimensional state description and the angular position state description. The non-linear dynamics for the angular position is the semi-latus rectum of the ellipse, a i is the length of the semimajor axis of the ellipse, and 0 ≤ e i < 1 is the eccentricity of  the ellipse (if e i = 0, then the orbit is circular and p i simplifies to the radius of the circle). These dynamics are periodic with a period of 2π , so we consider the bounded subset [0, 2π ] 2 of the state-space R 2 , and add invariants and transitions to create a hybrid automaton ensuring ν i ∈ [0, 2π ]. For the benchmark cases evaluated, we fixed μ = 1 and varied p i and e i for several scenarios. For more details, we refer to the work of Johnson et al. [27]. The size of the problem instances varies from 36 to 1296 locations, and all instances feature 4 variables. The verification problem is conjunction avoidance, i.e., to determine whether there exists a trajectory where the satellites come too close to one another and may collide. Some of the benchmark instances considered are particularly challenging because they feature several sources of non-determinism, including several initial states and several bad states. As an additional source of non-determinism, some benchmarks model thrusting. A change in a satellite's orbit is usually accomplished by firing thrusters. This is usually modeled as an instantaneous change in the orbital parameters e i and a i . However, the angular position ν i in this new orbit does not, in general, equal the angular position in the original orbit, and a change of variables is necessary, which can be modeled by a reset of the ν i values when the thrusters are fired. The transitions introduced for thrusting add additional discrete non-determinism to the system. The results for the satellite benchmark class are provided in Table 3. In general, we observe a similar search behavior to what we have observed in the navigation problems: The precomputation of the PDB pays off in the sense that much better search behavior can be achieved, leading to a fewer number of iterations and a lower overall runtime. For example, in instance 5, the precomputation time for the PDB amounts to roughly 5 s, leading to an overall time of around 92 s for the concrete run. In contrast, uninformed search and the boxbased heuristic need around 426 and 272 s, respectively. The search behavior of the concrete and abstract run in instance 5 is also visualized in Figs. 13, 14, and 15. We observe that the part of the covered search space with our PDB approach is again lower compared to the box-based heuristic and uninformed search. Figure 15 again particularly shows the part of the search space that is covered by the abstract run (which can be performed efficiently due to our abstraction described in Sect. 3.1), showing that our PDB approach finds an accurate balance between the computation time and the accuracy of the resulting cost function.
Furthermore, we have also been able to effectively and efficiently prove the absence of errors in the instances 6 and 14, where the abstract run already revealed that no concrete error trajectory exists. As our abstraction is an overapproximation, we can safely conclude that no reachable error state in the concrete system exists either, and do not need to start the concrete search at all. Being able to efficiently  verify hybrid systems with PDBs is a significant advantage compared to the box-based heuristic.

Results for water tank benchmarks
This benchmark consists of variants of the tank benchmark [6,26]. The tank benchmark (see Fig. 16) consists of some N ∈ N tanks, where each tank i ∈ {1, . . . , N } loses volume x i at some constant flow rate v i , so tank i has dynam- One of the tanks is filled from an external inlet at some constant flow rate w, so it has dynamicsẋ i = w − v i , for a real constant w ≥ 0. In our variant, the volume lost by each tank simply vanishes and does not move from one tank to another. This benchmark class is qualitatively different than either the navigation or satellite benchmarks, as the discrete state space may be small.
The two variations we consider are complete and linear topologies with regard to the inlet tank choice. The inlet pipe w may be moved to some tank j with volume x i ≤ r i from some tank i, where: (a) j = i is any other tank for the complete topology, or (b) j ∈ {i + 1, i − 1} is an adjacent tank for the linear topology. The invariants in our variant of the benchmark are that the volumes of all tanks are non-negative: ∀i ∈ {1, . . . , N } : x i ≥ 0. We consider variants where the aggregate out flow rate equals the in flow rate, so the sum of the flow rates out of all tanks equals the inlet flow rate: Hence, the total volume is constant, so for all t ≥ 0: In these instances, the purpose of the inlet is to effectively move net volume between tanks, and the search problem is to find an appropriate order of such moves to reach a specific volume level in all of the N tanks.
The results for the water tank problem class are provided in Table 4. Again, the results are similar to the results in the navigation and the satellite benchmark classes: We observe that PDBs can help significantly in guiding the search towards error states. For example, comparing Figs. 17, 18, and 19, which each, respectively, show an execution of uninformed search, the box-based heuristic, and PDBs, we observe that our PDB-based approach is able to exploit the abstract run to more quickly find the correct sequence of tanks to fill to reach a certain region of the state-space (again, the light gray regions are covered in the abstract run only and can be computed efficiently). Generally, PDBs can particularly help for the water tank problems because of the non-determinism that occurs in this problem class (which is important to be resolved accurately, corresponding to the choice of which tanks to fill in which order). However, we also observe that in 4 cases, the overall runtime is higher than the runtime with the box-based heuristic. In these cases, the precomputation of the PDB does not pay off-we will discuss such cases in more detail below.

Results for heater benchmarks
This benchmark consists of variants of the heater benchmark [21]. In our variation, we consider three rooms with one heater. The automaton is modeled with four locations, consisting of no heaters on in any room, or the heater is on in one of the three rooms. The size of the problem instances have 4 locations, and all instances feature 3 temperature variables, 1 time variable, and 16 real constants. The temperature dynamics are linear and there is coupling between temperatures in different rooms. If the heater is on in a room, its temperature rate of change has a positive additive term c i , but otherwise does not, so the temperature may decrease (subject to the temperatures in different rooms). Specifically, for room 1 (and symmetrically rooms 2 and 3), if the heater is on, the dynamics are:ẋ 1 = b 1 (u−x 1 )+a 1,2 (x 2 −x 1 )+a 1,3 (x 3 −x 1 )+c 1 , but if the heater is off, the dynamics are the same except without the c 1 term. In our variant, the invariants specify only that the temperatures are all non-negative and bounded. The heater may be turned on in room i ∈ {1, 2, 3} if x i ≤ T on for some real threshold T on , and turned off if x i ≥ T off for some real threshold T off . There is non-determinism in choosing to turn off or on the heater once the threshold condition is met, and there is a potential delay in changing the state of the heater from off to on and vice-versa. The results for the heater benchmark are provided in Table 5. We observe that, unlike the results for the other benchmarks, the results for the heater are more diverse. While the PDB approach overall performs best in 7 out of 15 problem instances, it is somewhat slower than uninformed DFS in other 7 instances. Having a closer look, we observe that the error trajectories with DFS are found with equally many iterations by SpaceEx, and additionally, their length is the same compared to those found with the PDB approach. In such cases where the PDB cannot improve over the search behavior of DFS, DFS is naturally more efficient because of the PDB's computational overhead (in fact, the difference in search time is almost exactly due to this overhead). However, obtaining such an informed search behavior with DFS is rather based on having good luck, whereas PDBs provide a more principled approach to achieve this. Furthermore, despite the sometimes higher runtimes in this benchmark class, we observe that our PDB approach is able to solve one more problem than DFS, and three more problems than the box-based heuristic within our time limit of 30 min. In addition, similar to the satellite benchmarks, we have been able to effectively prove the absence of errors in two cases (heater instance 3 and instance 4).
Finally, for the last time, let us have a look at the covered region space by DFS, by the box-based heuristic and by PDBs in Figs. 20, 21, and 22, respectively. We observe that the concrete run with PDBs (indicated in dark gray) boils down to a small curve in this instance, whereas the other approaches cover a (much) larger fraction.

Runtimes of partial PDBs versus full PDBs
Considering the runtime to build the partial PDBs compared to computing full PDBs, we observed that strong reductions of several orders of magnitude can indeed be obtained. In particular, the computation of the full abstract state space sometimes exceeds our time bound of 30 min, whereas the partial PDBs can still be computed efficiently. This happens in the water tank problem class, where full PDBs could not be computed within 30 min in any instance, whereas partial PDBs could be computed within less than a minute in all of the 15 instances (less than 10 s in 9 of these instances). In this respect, we conclude that the notion of partial PDBs particularly makes the overall approach tractable on a larger class of problems. In cases where full PDBs can be computed within 30 min, the runtime can be significantly higher than with partial PDBs: For example, in the satellite domain instance 10, computing the full PDB needs around 175 s, compared to roughly 3 s for computing the partial PDB.

Discussion
We have observed that PDBs can provide more informed search behavior than uninformed search or than the boxbased heuristic. A potential problem is the computational overhead due to its precomputation time. We will discuss advantages and drawbacks of our PDB approach in this section.
As a general picture, we first observe that the number of iterations of SpaceEx and also the length of the found error trajectories are mostly at most as high with PDBs as with uninformed search and the box-based heuristic. In particular, our PDB approach could solve several problem instances where uninformed search and the box-based heuristic ran out of time. In some cases, the precomputation of the PDB does not pay off compared to DFS and the box-based heuristichowever, in such cases, the pure concrete search time with PDBs is still mostly similar to the pure search time of DFS and the box-based approach.
We further observe that the length of the trajectories found by the box-based heuristic and the PDB heuristic is often similar or equal, while the number of iterations is mostly decreased. This again shows that the search with the PDB approach is more focused than with the box-based heuristic in such cases, and less backtracking is needed. In particular, the box-based heuristic always tries to find a "direct" trajectory to an error state, while ignoring possible obstacles. Therefore, the search can get stuck in a dead-end state if there is an obstacle, and as a consequence, backtracking becomes necessary. Furthermore, the box-based heuristic can perform worse than the PDB if several bad states are present. In such cases, the box-based heuristic might "switch" between several bad states, whereas the better accuracy of the PDB heuristic better focuses the search towards one particular bad state. In problems that are structured more easily (e. g., where no "obstacles" exist and error states are reachable "straight ahead"), the box-based heuristic might yield better performance because the precomputation of the PDB does not pay off.
Finally, a general advantage of PDBs compared to the box-based heuristic which we did not discuss in detail so far is the broader applicability of PDBs. By definition, the boxbased heuristic estimates distances by computing Euclidean distances between the region of the current and the error state. However, in problems where error states are defined solely by (discrete) locations, there is no such error region, and the box-based distance heuristic is not effectively applicable. In contrast, PDBs are more general, and applicable for all kinds of error states.

Conclusion
We have explored the application of coarse-grained space abstractions to compute PDBs for hybrid systems. For a given safety property and hybrid system with linear dynamics in each location, we compute an abstraction by coarsening the over-approximation SpaceEx computes in its reachability analysis. The abstraction is used to construct a PDB, which contains abstract symbolic states together with their abstract error distances. These distances are used in guiding SpaceEx in the concrete search. Given a concrete symbolic state, the guiding heuristics returns the smallest distance to the error state of an enclosing abstract symbolic state. This distance is used to choose the most promising concrete symbolic successor. In our implementation, we have taken advantage of the SpaceEx parametrization support, and were able to report a significant speedup in counterexample detection and even for verification. Our new PDB support for SpaceEx can be seen as a non-trivial extension of our previous work on guided reachability analysis for hybrid systems where the discrete system structure was ignored completely [14]. For the future, it will be interesting to further refine and extend our approach by, e. g., considering even more fine grained abstraction techniques, or by combinations of several abstraction techniques and therefore, by combining several PDBs. We expect that this will lead to even more accurate cost functions and better model checking performance.