Quantifying conformance using the Skorokhod metric

The conformance testing problem for dynamical systems asks, given two dynamical models (e.g., as Simulink diagrams), whether their behaviors are “close” to each other. In the semi-formal approach to conformance testing, the two systems are simulated on a large set of tests, and a metric, defined on pairs of real-valued, real-timed trajectories, is used to determine a lower bound on the distance. We show how the Skorokhod metric on continuous dynamical systems can be used as the foundation for conformance testing of complex dynamical models. The Skorokhod metric allows for both state value mismatches and timing distortions, and is thus well suited for checking conformance between idealized models of dynamical systems and their implementations. We demonstrate the robustness of the metric by proving a transference theorem: trajectories close under the Skorokhod metric satisfy “close” logical properties in the timed linear time logic FLTL (Freeze LTL) containing a rich class of temporal and spatial constraint predicates involving time and value freeze variables. We provide efficient window-based streaming algorithms to compute the Skorokhod metric for both piecewise affine and piecewise constant traces, and use these as a basis for a conformance testing tool for Simulink. We experimentally demonstrate the effectiveness of our tool in finding discrepant behaviors on a set of control system benchmarks, including an industrial challenge problem.


Introduction
A fundamental question in model-based design is conformance testing: whether two models of a system are equivalent.For discrete systems, this question is well-studied [26,17,18,27], and there is a rich theory of process equivalences based on similarity and bisimilarity.For continuous and hybrid systems, however, the state of the art is somewhat unsatisfactory.While there is a straightforward generalization of process equivalences to the continuous case, in practice, equivalence notions such as bisimilarity are always too strong and most systems are not bisimilar.Since equivalence is a Boolean notion, one gets no additional information about the systems other than they are "not bisimilar," and even if two dynamical systems are bisimilar, they may still differ in many properties that are of control-theoretic interest.Thus, classical notions for equivalence and conformance have been of limited use in industrial practice.
In recent years, the notion of bisimulation has therefore been generalized to metrics on systems, which quantify the distance between them.For example, one approach is that of -bisimulation, which requires that the states of the two systems remain "close" forever (within an -ball), rather than coincide exactly.Under suitable stability assumptions on the dynamics, one can prove results about -bisimulation [15,16].Unfortunately, proving the pre-requisites for the existence ofbisimulations for complex dynamical models, or coming up with suitable and practically tractable bisimulation functions, is extremely difficult in practice.In addition, establishing -bisimulation requires full knowledge of the system dynamics making the scheme inapplicable where one system is an actual physical component with unknown mathmatical dynamics.Bisimulation notions have hence been of limited practical use.
Instead, a more pragmatic semi-formal approach has gained prominence in industrial practice.In this approach, the two systems are executed on the same input sequences and a metric on finite trajectories is used to evaluate the closeness of these trajectories.The key to this methodology is the selection of a good metric, with the following properties: -Transference.Closeness in the metric must translate to preserving interesting classes of logical and functional specifications between systems, and -Tractability.The metric should be efficiently computable.In addition, there is the more informal requirement of applicability: the metric should classify systems, that the engineers consider close, as being close, and conversely.
A number of metrics have been proposed recently.The simplest is a pointwise metric that computes the maximum pointwise difference between two trajectories, sometimes generalized to apply a constant time-shift to one trajectory [13].Unfortunately, for many practical models, two trajectories may be close only under variable time-shifts.This is the case, for example, for two dynamical models that may use different numerical integration techniques (e.g., fixed step versus adaptive step) or when some component in the implementation has some jitter.Thus, the pointwise metric spuriously report large distances for "close" models.More complicated hybrid distances have been proposed [1].The transference properties of these metrics w.r.t.common temporal logics for dynamical systems are not yet clear.
In this work we present a methodology for quantifying conformance between real-valued dynamical systems based on the Skorokhod metric [11].The Skorokhod metric allows for mismatches in both the trace values and in the timeline, and quantifies temporal and spatial variation of the system dynamics under a unifying framework.The distortion of the timeline is specified by a retiming function r which is a continuous bijective strictly increasing function from R + to R + .Using the retiming function, we obtain the retimed trace x (r(t)) from the original trace x(t).Intuitively, in the retimed trace x (r(t)), we see exactly the same values as before, in exactly the same order, but the time duration between two values might now be different than the corresponding duration in the original trace.The amount of distortion for the retiming r is given by sup t≥0 |r(t) − t|.Using retiming functions, the Skorokhod distance between two traces x and y is defined to be the least value over all possible retimings r of: where D is a pointwise metric on values.The Skorokhod distance thus incorporates two components: the first component quantifies the timing discrepancy of the timing distortion required to "match" two traces, and the second quantifies the value mismatch (in the metric space O) of the values under the timing distortion.The Skorokhod metric was introduced as a theoretical basis for defining the semantics of hybrid systems by providing an appropriate hybrid topology [8,7].We now demonstrate its usefulness in the context of conformance testing.
Transference.We show that the Skorokhod metric gives a robust quantification of system conformance by relating the metric to TLTL (timed LTL) enriched with (i) predicates of the form f (x 1 , . . ., x n ) ≥ 0, as in Signal Temporal Logic, for specifying constraints on trace values; and (ii) freeze quantifiers, as in TPTL [4], for specifying temporal constraints (freeze quantifiers can express more complex timing constraints than bounded timing constraints, e.g of MTL).This logic subsumes the MITL-based logic STL [13].We prove a transference theorem: flows (and propositional traces) which are close under the Skorokhod metric satisfy "close" TLTL formulae for a rich class of temporal and spatial predicates; where the untimed structure of the formulae remains unchanged, only the predicates are enlarged.Tractability.We improve on recent polynomial-time algorithms for the Skorokhod metric [23] by taking advantage of the fact that, in practice, only retimings that map the times in one trace to "close" times in the other are of interest.This enables us to obtain a streaming sliding-window based monitoring procedure which takes only O(W ) time per sample, where W is the window size (assuming the dimension n of the system to be a constant).
Usability.Using the Skorokhod distance checking procedure as a subroutine, we have implemented a Simulink toolbox for conformance testing.Our tool integrates with Simulink's model-based design flow for control systems, and provides a stochastic search-based approach to find inputs which maximize the Skorokhod distance between systems under these inputs.
We present three case studies from the control domain, including industrial challenge problems; our empirical evaluation shows that our tool computes sharp estimates of the conformance distance reasonably fast on each of them.Our input models were complex enough that more theoretically appealing techniques such as -bisimulation function generation could not be applied.In particular, we demonstrate how two models that only differ in the underlying ODE solver can nevertheless deviate enough to invalidate system requirements on settling time.
We conclude that the Skorokhod metric can be an effective foundation for semi-formal conformance testing for complex dynamical models.Proofs of the theorems are given in the accompanying technical report [REF].
Related Work.The work of [1,2] is closely related to ours.In it, robustness properties of hybrid state sequences are derived with respect to a trace metric which also quantifies temporal and spatial variations.Our work differs in the following ways.First, we guarantee robustness properties over flows rather than only over (discrete) sequences.Second, the Skorokhod metric is a stronger form of the (T, J, (τ, ))-closeness degree1,2 (for systems which do not have hybrid time); and allows us to give stronger robustness transference guarantees.The Skorokhod metric requires order preservation of the timeline, which the (T, J, (τ, ))-closeness function does not.Preservation of the timeline order allows us to (i) keep the untimed structure of the formulae the same (unlike in the transference theorem of [1]); (ii) show transference of a rich class of global timing constraints using freeze quantifiers (rather than only for the standard bounded time quantifiers of MTL/MITL).However, for implementations where the timeline order is not preserved, we have to settle for the less stronger guarantees provided by [1].The work of [13], in terms of robustness, deals mainly with spatial robustness of STL; the only temporal disturbances considered are constant time-shifts for the entire signal where the entire signal is moved to the past, or to the future by the same amount.The Skorokhod metric incorporates time-shifts which are variable along the timeline.

Preliminaries
Traces.A (finite) trace or a signal π : [T i , T e ] → O is a mapping from a finite closed interval [T i , T e ] of R + , with 0 ≤ T i < T e , to some topological space O.If O is a metric space, we refer to the associated metric as D O .The time-domain of π, denoted tdom(π) is the time domain [T i , T e ] over which it is defined.The time-duration of π, denoted as tlen(π), is sup (tdom(π)).The t-suffix of π for t ∈ tdom(π), denoted by π t , is the trace π restricted to the interval (tdom(π) ∩ [t, tlen(π)].We denote by π ↓T e the prefix trace obtained from π by restricting the domain to [T i , T e ] ⊆ tdom(π).

Systems. A (continuous-time) system
+ is the set of finite closed intervals of R + , transforms input traces π ip : [T i , T e ] → O ip into output traces π op : [T i , T e ] → O op (over the same time domain).We require that if A(π ip ) → π op , then for every min tdom(π) ≤ T e < max tdom(π), the system A maps π ip ↓T e to π op ↓T e .Thus, we only consider causal systems.Common examples of such systems are (causal) dynamical, and hybrid dynamical systems [6,28].Conformance.A system A conforms to the system A over an input trace π ip if A (π ip ) = A(π ip ), i.e. if the behavior of A on the input trace π ip is the same as that of A. The system A conforms to the system A over the input trace set Π ip if conformance holds for each input trace in Π ip .Given a metric D over input traces, and an input trace set Π ip , the quantitative conformance between A and A over Π ip is defined as the quantity sup π ip ∈Π ip D (A (π ip ) , A (π ip )) .If Π ip is the set of all input traces, this quantity is the distance between the two systems.Retimings.A retiming r : I → I , for closed intervals I, I of R + is an order-preserving (i.e.monotone) continuous bijective function from I to I ; thus if t < t then r(t) < r(t ).Let the class of retiming functions from I to I be denoted as R I →I , and let I be the identity retiming.Intuitively, retiming can be thought of as follows: imagine a stretchable and compressible timeline; a retiming of the original timeline gives a new timeline where some parts have been stretched, and some compressed, without the timeline having been broken.Given a trace π : The Skorokhod distance3 between the traces π() and π () is defined to be: Intuitively, the Skorokhod distance incorporates two components: the first component quantifies the timing discrepancy of the timing distortion required to "match" two traces, and the second quantifies the value mismatch (in the metric space O) of the values under the timing distortion.In the retimed trace π • r, we see exactly the same values as in π, in exactly the same order, but the times at which the value are seen can be different.Polygonal Traces.A polygonal trace π : I π → O where O is a vector space with the scalar field R is a continuous trace such that there exists a finite sequence min Polygonal traces are obtained when discrete-time traces are completed by linear interpolation.We remark that after retiming, the retimed trace π • r need not be piecewise linear (see e.g.[22]).Theorem 1 (Computing the Distance between Polygonal Traces [23]).Let π : I π → R n and π : I π → R n be two polygonal traces with m π and m π affine segments respectively.Let the Skorokhod distance between them (for the L 2 norm on R n ) be denoted as D S (π, π ).
2. Suppose we restrict retimings to be such that the i-th affine segment of π can only be matched to π affine segments i−W through i+W for all i, where W ≥ 1.Under this retiming restriction, we can determine, with a streaming algorithm, whether Let us denote by D W S (π, π ) the Skorokhod difference between π, π under the retiming restriction of the second part of Theorem 1, i.e., the value obtained by restricting the retimings in Equation 14 .The value D W S (π, π ) is an upper bound on D S (π, π ).In addition, for W < W , we have

Skorokhod Distance based Conformance Testing
In conformance testing, we test for the variance in behavior of two given systems A 1 and A 2 under the same input 5 .Given the same input, the two systems produce potentially differing output traces; the goal is to quantify this difference, and to determine an input signal that causes the corresponding output signals to exceed a user provided bound on the maximum tolerable output trace distance.
Algorithm 1: Algorithm to test if max Algorithm 1 is a standard optimization-guided testing algorithm in which we have used the Skorokhod distance between two output traces as the cost function.In such algorithms, it is common to define a finite parameterization of the input space, represented by the tuple (P, F, B), where P = {p 1 , . . ., p k } represents a set of parameters, F = {f 1 , . . ., f k } represents a finite set of basis functions from [0, T ] to R n , where T is some finite time-horizon, and for each p i ∈ P , there is a b i ∈ B that is a closed interval in R over which p i is assumed to take values.An input signal u is defined such that, for all t, u(t) = i p i • f i (t).A valid input signal has the property that for all i, p i ∈ b i .
In each step, the algorithm picks an input signal u and computes the Skorokhod distance between the corresponding outputs y 1 = A 1 (u) and y 2 = A 2 (u).Based on heuristics that rely on the current cost, and a possibly bounded history of costs, the procedure then picks a new value for u.For instance, in a gradient-ascent based procedure, the new value of u is chosen by estimating the local gradient in each direction in the input-parameter space, and then picking the direction that has the largest (positive) gradient.In our implementation, we use the Nelder-Mead (or nonlinear simplex) algorithm.
The algorithm terminates when a violation is found (i.e., a pair of inputs that exceed the userprovided Skorokhod distance bound), or when the number of iterations is exhausted.The Skorokhod distance bound δ is chosen based on engineering requirements, e.g., based on the maximum allowed weakening of the temporal logical properties that have been verified/tested on one system.Sampling and Polygonal Approximations.In practice, the output behaviors of the systems are observed with a sampling process, thus in implementations of Algorithm 1, entities y 1 and y 2 on lines 4, 5 are time-sampled output trace sequences, from which the Skorokhod distance algorithm of Theorem 1 constructs (continuous time) signals using linear interpolation.Given a timed trace sequence tseq, let [[tseq]] LI denote the continuous time trace obtained from tseq by linear interpolation.Let tseq π , tseq π be two corresponding samplings of the traces π, π .Since the Skorokhod distance is a metric, we have that If ∆ samerr is a bound on the distance between a trace, and an interpolated completion of its sampling, we have that Thus, in a sampling framework, a value of 2•∆ samerr needs to be added to the Skorokhod distance between the polygonal approximations.
Section 4 presents a theory of (quantifiable) transference of logical properties.Section 5 presents results on our implementation of Algorithm 1.We also discuss several case studies, providing rationale for choosing the appropriate δ value, and present results on the computation time and the conformance distance found.

Transference of Logical Properties
In this section, we demonstrate transference of logical properties.If two traces are at a distance of δ, and one trace satisfies a logical specification φ, we derive the "relaxation" needed (if any) in φ so that the other trace also satisfies this relaxed logical specification.The logic we use is a version of the timed linear time logic TLTL [4] (a timed version of the logic LTL).We show that the Skorokhod distance provides robust transference of specifications in this logic: if the Skorokhod distance between two traces is small, they satisfy close TLTL formulae.We first present the results in a propositional framework, and then extend to R n -valued spaces.

The Logic TLTL
Let P be a set of propositions.A propositional trace π over P is a trace where the topological space is 2 P , with the associated metric: D P (σ, σ ) = ∞ if σ = σ , and 0 otherwise for σ, σ ∈ 2 P .We restrict our attention to propositional traces with finite variability: we require that there exists a finite partition of tdom(π) into disjoint subintervals I 0 , I 1 , . . ., I m such that π is constant on each subinterval.The set of all timed propositional traces over P is denoted by Π(P).
Definition 2 (TLTL(F T ) Syntax).Given a set of propositions P, a set of (time) variables V T , and a set F T of functions from R l + to R, the formulae of TLTL(F T ) are defined by the following grammar.
x.φ where p ∈ P and x ∈ V T , and x = (x 1 , . . ., x l ) with x i ∈ V T for all 1 ≤ i ≤ l; f T ∈ F T is a real-valued function, and ∼ is one of {≤, <, ≥, >}.
We say that the variable x is bound in φ if φ is x.Ψ , otherwise it is free.The quantifier "x." is known as the freeze quantifier, and binds the variable x to the current time.A formula is closed if it has no free variables.Definition 3 (TLTL(F T ) Semantics).Let π : I → 2 P be a timed propositional trace, t 0 = min(I), and let E : V → I be the time environment mapping the variables in V to time values in I.The satisfaction of the timed sequence π with respect to the TLTL(F T ) formula φ in the time environment E is written as π |= E φ, and is defined inductively as follows (denoting t 0 = min tdom(π)).
agrees with E for all z = x, and maps x to t 0 ; A timed trace π is said to satisfy the closed formula φ (written as π |= φ) if there is some environment E such that π |= E φ.
The definition of additional temporal operators in terms of these base operators is standard: the "eventually" operator ♦φ stands for true U φ; and the "always" operator φ stands for ¬♦¬φ.TLTL(F T ) provides a richer framework than MTL [21] for expressing timing constraints as: (i) freeze quantifiers allow specification of constraints between distant contexts, which the bounded temporal operators in MTL cannot do; and (ii) the predicates f T () ∼ 0 for f T ∈ F T allow the specification of complex timing requirements not expressible in MTL.Example 1 (Freeze quantifiers; TLTL(F T ) subsumes MTL).Let F T be the set of two variable functions of the form f (x, y) = x − y + c where c is a rational constant.Then TLTL(F T ) subsumes MTL.The MTL formula Q U [a,b] R can be written as We explain the formula as follows.We assign the "current" time t x to the variable x, and some future time t y to the variable y.The values t x and t y are such that at time t y , we have R to be true, and moreover, at all times between t x and t y , we have Q ∨ R to be true.Furthermore, t y must be such that t y ∈ [t x + a, t x + b], which is specified by the term (y ≤ x + b) ∧ (y ≥ x + a).
Example 2 (Temporal Constraints).Suppose we want to express that whenever the event Q occurs, it must be followed by a response R, and then by S. In addition, we have the following timing requirement: if ε QR , ε RS , ε QS are the time delays between Q and R, between R, S, and between Q and S respectively, then: we must have ε 2 QR + ε 2 RS + ε 2 QS ≤ d for a given positive constant d.This can be written using freeze quantifiers as the TLTL formula φ:

Transference of TLTL Properties for Propositional Traces
We show in this section that if a timed propositional trace π satisfies a TLTL(F T ) formula φ, then any timed trace π that is at most δ distance away from π satisfies a slightly relaxed version of the formula φ, the degree of relaxation being governed by δ; and the variance of the functions in F T over the time interval containing the time domains of π and π .
Recall that the distance between two sets of propositions σ, σ is ∞ if σ = σ , and 0 if σ = σ .The distance between two propositional traces is defined to be the Skorokhod distance with the aforementioned metric on 2 P .
Next, we define relaxations of TLTL(F T )formulae.The relaxations are defined as a syntactic transformation on formulae which do not have negations, except on the propositions.Every TLTL(F T )formula can be expressed in this negation-normal form.To remove negations from the until operator, we use the waiting for operator, W , defined as: π |= E φ 1 W φ 2 iff either (1) π t |= E φ 1 for all t ∈ I; or (2) π t |= E φ 2 for some t ∈ I; and π t |= E φ 1 ∨ φ 2 for all t 0 ≤ t < t.It can be showed that every TLTL(F T ) formula can be rewritten using the W operator such that negations appear only over the propositions (the procedure is given in the Appendix).
Definition 4 (δ-relaxation of TLTL(F T ) formulae).Let φ be a TLTL(F T ) formula in which negations appear only on the propositional symbols.The δ relaxation of φ (for δ ≥ 0) over a closed interval J, denoted rx δ J (φ), is defined as: Thus, instead of comparing the f T () values to 0, we relax by comparing instead to ±K J f T (δ).The other cases recursively relax the subformulae.The functions K J f T (δ) define the maximal change in the value of f T that can occur when the input variables can vary by δ.The role of J is the above definition is to restrict the domain of the freeze quantifier variables to the time interval J (from R + ) in order to obtain the least possible relaxation on a given trace π (e.g.we do not care about the values of a function in F T outside of the domain tdom(π) of the trace).
Example 3 (δ-relaxation for Bounded Temporal Operators -MTL).We demonstrate how δ-relaxation operates on bounded time constraints through an example.Consider an This can be written as a TLTL formula, and relaxed using the rx δ R + function.The relaxed TLTL formula is again equivalent to an MTL formula, namely Q U Theorem 2 relaxes the freeze variables over the entire signal time-range I π,π ; it can be strengthened by relaxing over a smaller range: if π |= φ, and t 1 , . . ., t k are time-stamp assignments to the freeze variables x 1 , . . ., x k which witness π satisfying φ, then x i only needs to be relaxed over [t i − δ, t i + δ] rather than the larger interval I π,π .These smaller relaxation intervals for the freeze variables can be incorporated in Equation 2. We omit the details for ease of presentation.
Example 4. Recall Example 2, and the formula φ presented in it.Suppose a flow π satisfies φ; and let π be δ close to π under the Skorokhod metric (for propositional traces).Our robustness theorem ensures that (i) π will satisfy the same untimed formula Q → ♦ (R ∧ ♦S); and (ii) it gives a bound on how much the timing constraints need to be relaxed in φ in order to ensure satisfaction by π ; it states that π satisfies the following relaxed formula φ for every > 0: The constant d † is derived in the appendix.

Transference of TLTL properties for R n -valued Signals
A timed R n -valued trace π is a function from a closed interval In order to define the satisfaction of TLTL formulae over timed R n -valued sequences, we use booleanizing predicates µ : R n → B, as in STL [13], to transform R n -valued sequences in to timed propositional sequences.These predicates are part of the logical specification.In this work, we restrict our attention to traces and predicates such that each predicate varies only finitely often on the finite time traces under consideration.
Definition 5 (TLTL(F T , F S ) Syntax).Given a set of variables V T (the freeze variables), a set of ordered variables V S (the signal variables), and two sets F T , F S of functions, the formulae of TLTL(F T , F S ) are defined by the grammar: x.φ where x ∈ V T , and x = (x 1 , . . ., x l ) with x i ∈ V T for all 1 ≤ i ≤ l; y = (y 1 , . . ., y d ) with y j ∈ V S for all 1 ≤ j ≤ d; -V T and V S are disjoint; f T ∈ F T and f S ∈ F S are real-valued functions, and ∼ is ≤, <, ≥, or >.
The semantics of TLTL(F T , F S ) is straightforward and similar to the propositional case (Definition 3).The only new ingredients are the booleanizing predicates f S (y) ∼ 0: we define π |= E f S (y 1 , . . ., y d ) ∼ 0 iff f S (π j 1 [t 0 ], . . ., π j d [t 0 ]) ∼ 0 for any freeze variable environment E, where t 0 = min tdom(π), and y i is the j i -th variable in V S (i.e., y i refers to the j i -th dimension in the signal trace).We require that for a timed R n -valued trace π to satisfy φ, the arity of the functions in F S occurring in φ should not be more than n, that is, functions should not refer to dimensions greater than n for an R n trace.δ relaxation of TLTL(F T , F S ).Let J V S be a mapping from V S to closed intervals of R, thus J V S (z) denotes a sub-domain of z ∈ V S .The relaxation function rx δ J,J V S which operates on TLTL(F T , F S ) formulae is defined analogous to the relaxation function rx δ J in Definition 4. We omit the similar cases, and only present the new case for the predicates formed from F S (the full definition can be found in the appendix).
The functions K f S (δ) define the maximal change in the value of f S that can occur when the input variables can vary by δ over the intervals in J V S (z) and J.The role of J V S in the above definition is to restrict the domain of the signal variables in order to obtain the least possible relaxation bounds on the signal constraints; as was done in Definition 4 for the freeze variables.
Theorem 3 (Transference for R n -valued Traces).Let π, π be two R n -valued traces such the Skorokhod distance between them is less than δ for some finite δ.Let φ be a closed Theorem 3 can be strengthened similar to the strengthening mentioned for Theorem 2 by relaxing the variables over smaller intervals obtained from assignments to variables which witness π |= φ.
Example 5 (Spatial Constraints and Transference).Recall Example 2, suppose that the events Q, R, S are defined by the following predicates over real variables α 1 and α Let π satisfy this formula with these predicates, and let π be δ close to π, for a finite δ under the Skorokhod metric for R 2 .Our robustness theorem ensures that π will satisfy the relaxed formula where the relaxed predicates Q δ , R δ , S δ are defined as follows:

Experimental Evaluation
Skorokhod Distance Computation Benchmark.The Skorokhod distance is computed with the help of a streaming, sliding window monitoring routine which checks for a fixed δ whether the linear interpolations of two time-sampled traces are at most δ away from each other.The least such δ value is computed by binary search over the monitoring routine.The upper limit of the search range is set to the pointwise metric (i.e assuming the identity retiming) between the two traces.The traces to the Skorokhod procedure are pre-scaled, each dimension (and the time-stamp) is scaled by a different constant.The constants are chosen so that after scaling, one unit of deviation in one dimension is as undesirable as one unit of jitter in other dimensions.We next present a benchmark on the distance computing routine.Consider the hybrid dynamical system A 1 shown in Fig. 1.The system consists of two water tanks, each with an outlet from which water drains at a constant rate d j .Both tanks share a single inlet pipe that is switched between the tanks, filling only one tank at any given time at a constant inflow rate of i.When the water-level in tank j falls below level j , the pipe switches to fill it.The Fig. 1.System A1 used for benchmarking Skorokhod Distance computation.Inflow rate i, Drain rate d1 for tank 1 and d2 for tank 2 are all inputs to the system.
Table 1.Benchmarking the computation of D S (π1, π2), where π1 is a trace of system A1 described in Fig. 1, and π2 is a trace of system A2, which is A1 with an actuation delay.D2 is the naive pointwise distance.Both π1 and π2 contain equally spaced 2001 time points over a simulation horizon of 100 seconds.drain and inflow rates d 1 , d 2 and i are assumed to be inputs to the system.Now consider a version A 2 that incorporates an actuation delay that is a function of the inflow rate.This means that after the level drops to j for tank j, the inlet pipe starts filling it only after a finite time.A 1 and A 2 have the same initial water level.We perform a fixed number of simulations by systematically choosing drain and inflow rates d 1 , d 2 , i to generate traces (water-level vs. time) of both systems and compute their Skorokhod distance.We summarize the results in Table 1.

Window size
Recall that D S (the Skorokhod distance) computation involves a sequence of monitoring calls with different δ values picked by a bisection-search procedure.Thus, the total time to compute D S is the sum over the computation times for individual monitoring calls plus some bookkeeping.In Table 1, we make a distinction between the average time to monitor traces (given a δ value), and the average time to compute D S .There are an average of 6 monitoring calls per D S computation.We ran 64 simulations by choosing different input values, and then computing D S for increasing window sizes.As the window size increases, the average D S is seen to decrease; this is expected as a better match may be achieved in a larger window.The computation time is also seen to increase linearly, as postulated by Theorem 1. Finally, we see that the Skorokhod distance is less aggressive at classifying traces as distant (as shown by its lower overall numbers) than a simpler metric D 2 (defined as as the maximum of the pointwise L 2 norm6 ).We can see this discrepancy becomes more prominent with increased window size (because of better matches being available).Case Study: LQR-based Controller.The first case study is an example of an aircraft pitch control application taken from the openly accessible control tutorials for Matlab and Simulink [25].The authors describe a linear dynamical system of the form: ẋ = (A − BK)x + Bθ des .Here, x describes the vector of continuous state variables and θ des is the desired reference provided as an external input.One of the states in the x vector is the pitch angle θ, which is also the system output.The controller gain matrix K is computed using the linear quadratic regulator method [5], Table 2. Variation in Skorokhod Distance with changing sampling time for an aircraft pitch control system with an LQR-based controller.Time taken indicates the total time spent in computing the upper bound on the Skorokhod distance across all simulations.We scale the signals such that a time-jitter of 0.5 seconds, is treated the same as a value-difference of 0.08 radians, and the window size chosen is 150.The system is simulated for 5 seconds, with a variable-step solver.a standard technique from optimal control.We are interested in studying a digital implementation of the continuous-time controller obtained using the LQR method.To do so, we consider sampleddata control where the controller samples the plant output, computes, and provides the control input to the plant every ∆ seconds.To model sensor delay, we add a fixed delay element to the system; thus, the overall system now represents a delay-differential equation.Control engineers are typically interested in the step response of a system.In particular, quantities such as the overshoot/undershoot of the output signal (maximum positive/negative deviation from a reference value) and the settling time (time it takes for transient behaviors to converge to some small region around the reference value) are of interest.Given a settling time and overshoot for the first system, we would like the second system to display similar characteristics.We remark that both of these properties can be expressed in STL, see [19] for details.We quantify system conformance (and thereby adherence to requirements) in terms of the Skorokhod distance, or, in other words, maximum permitted time/space-jitter value δ.For this system, we know that at nominal conditions, the settling time is approximately 2.5 seconds, and that we can tolerate an increase in settling time of about 0.5 seconds.Thus, we chose a time-saling factor of 2 = 1 0.5 .We observe that the range of θ is about 0.4 radians, and specify an overshoot of 20% of this range as being permissible.Thus, we pick a scaling factor of 0.08 for the signal domain.In other words, Skorokhod distance δ = 1 corresponds to either a time-jitter of 0.5 seconds, or a space-discrepancy of 0.08 radians.

Controller
We summarize the results of conformance testing for different values of sampling time ∆ in Table 2.It is clear that the conformance of the systems decreases with increasing ∆ (which is to be expected).The time taken to compute the Skorokhod distance decreases with increasing ∆, as the number of time-points in the two traces decreases.Case Study: Air-Fuel Ratio Controller.In [19], the authors present three systems representing an air-fuel ratio (λ) controller for a gasoline engine, that regulate λ to a given reference value of λ ref = 14.7.Of interest to us are the second and the third systems.The former has a continuoustime plant model with highly nonlinear dynamics, and a discrete-time controller model.In [20], the authors present a version of this system where the controller is also continuous.We take this to be A 1 .The third system in [19] is a continuous-time closed-loop system where all the system differential equations have right-hand-sides that are polynomial approximations of the nonlinear dynamics in A 1 .We call this polynomial dynamical system A 2 .The rationale for these system versions is as follows: existing formal methods tools cannot reason about highly nonlinear dynamical systems, but tools such as Flow* [9], C2E2 [14], and CORA [3] demonstrate good capabilities for polynomial dynamical systems.Thus, the hope is to analyze the simpler systems instead.In [19], the authors comment that the system transformations are not accompanied by formal guarantees.By quantifying the difference in the system behaviors, we hope to show that if the system A 2 satisfies the temporal requirements ϕ presented in [19], then A satisfies a moderate relaxation of ϕ.We pick a scaling factor of 2 for the time domain, as a time-jitter of 0.5 seconds is the maximum deviation we wish to tolerate in the settling time, and pick 0.68 = 1 0.1 * λ ref as the scaling factor for λ (which corresponds to the worst case tolerated discrepancy in the overshoot).
The results of conformance testing for these systems are summarized in Table 3.In [12], the authors posed a challenge problem for conformance testing.In it, the authors reported that the original nonlinear system and the approximate polynomial system both satisfy the STL requirements specifying overshoot/undershoot and settling time.We, however, found an input that causes the outputs of the two systems to have a high Skorokhod distance.Thus, comparing the two systems by considering equi-satisfaction of a given set of STL requirements such as overshoot/undershoot and settling time may not always be sufficient, and our experiment indicates that the more nuanced Skorokhod metric may be a better measure of conformance.
Case Study: Engine Timing Model.The Simulink demo palette presented by the Mathworks [24] contains a system representing a four-cylinder spark ignition internal combustion engine based on a model by Crossley and Cook [10].This system is then enhanced by adding a proportional plus integral (P+I) control law.The integrator is used to adjust the steady-state throttle as the desired engine speed set-point changes, and the proportional term compensates for phase lag introduced by the integrator.In an actual implementation of such a system, such a P+I controller is implemented using a discrete-time integrator.Such integrator blocks are typically associated with a particular numerical integration technique, e.g., forward-Euler, backward-Euler, trapezoidal, etc.It is expected for different numerical techniques to produce slight variation in the results, and we wish to quantify the effect of using different numerical integrators in a closed-loop setting.We try to check if the user-provided bound of δ = 1.0 is satisfied by systems A 1 and A 2 , where A 1 is the original system provided at [24], while A 2 is a modified system that uses the backward Euler method to compute the discrete-time integral in the controller.We try to determine the input signal that leads to a violation of this δ bound, using a simulation-guided approach as described before.We scale the outputs in such a way that a value discrepancy of 1% of the the output range ( 1000) is equivalent to a time discrepancy of 0.1 seconds.These values are chosen to bias the search towards finding signals that have a small time jitter.This is an interesting scenario for this case study where the two systems are exactly equivalent except for the underlying numerical integration solver.We find the signal shown in Fig. 2, for which we find output traces with Skorokhod distance 1.04.The experiment uses 296 simulations and the total time taken to find the counterexample is 677 seconds.Fig. 2. Example of non-conformant behavior found using a simulation-guided optimization algorithm with the Skorokhod distance between system output trajectories as the cost function.

Conclusion
Metrics for comparing behaviors of dynamical systems which quantify both time and value distortions have heretofore been an object of mathematical inquiry, without enough attention being paid to computational aspects and connections to logical requirements.We argue that the Skorokhod metric provides a robust definition of conformance by proving transference of a rich class of temporal logic properties.We also demonstrate the computationally tractability of the metric for practical use by constructing a conformance testing tool in a simulation and optimization guided approach for finding and quantifying non-conformant behavior of dynamical systems.Pinpointing the source of trace deviations is necessary in many engineering applications; our tool allows for independent weighing of time and value-dimension distortions in order to achieve this objective.
and a retiming r : I → I π ; the function π • r is another trace from I to O. Definition 1 (Skorokhod Metric).Given a retiming r : I → I , let || r − I || sup be defined as || r − I || sup = sup t∈I | r(t) − t|.Given two traces π : I π → O and π : I π → O, where O is a metric space with the associated metric D O , and a retiming r : I π → I π , let π − π • r sup be defined as: π − π • r sup = sup t∈Iπ D O π(t) , π (r(t)) .

Table 3 .
Conformance testing for closed-loop A/F ratio controller at different engine speeds.We scale the signals such that 0.5 seconds of time-jitter is treated equivalent to 10% of the steady-state value (14.7) of the A/F ratio signal.The simulation traces correspond to a time horizon of 10 seconds, and the window size is 300.