Abstract
Data scientists often need to write programs to process predictions of machine learning models, such as object detections and trajectories in video data. However, writing such queries can be challenging due to the fuzzy nature of real-world data; in particular, they often include real-valued parameters that must be tuned by hand. We propose a novel framework called Quivr that synthesizes trajectory queries matching a given set of examples. To efficiently synthesize parameters, we introduce a novel technique for pruning the parameter space and a novel quantitative semantics that makes this more efficient. We evaluate Quivr on a benchmark of 17 tasks, including several from prior work, and show both that it can synthesize accurate queries for each task and that our optimizations substantially reduce synthesis time.
Appendices are available in the technical report [51].
1 Introduction
Over the past decade, deep neural networks (DNNs) have successfully solved challenging artificial intelligence problems [47, 70]. Abstractly, these models can be thought of as providing interfaces to real-world data—e.g., they can provide object classes [30, 47], detections [59, 60], and trajectories [10, 11, 83]. Then, these predictions are processed by programs, e.g., to identify driving patterns [5], events in TV broadcasts [28], or animal behaviors [67].
However, writing such programs can be challenging since they must still account for the fuzziness of real data. To do so, these programs typically include real-valued parameters that need to be manually tuned by the user. For example, consider a query over car trajectories designed to identify instances where one car turns in front of another. This query must capture the shape of the trajectory of both the turning car and the car crossing the intersection. In addition, the user must select the appropriate maximum duration from the first car changing lanes to the second car crossing the intersection. Even an expert would require significant experimentation to determine good parameter values; in our experience, it can take up to an hour to tune the parameters for a single query.
We focus on programs that query databases of trajectories output by an object tracker [5, 7, 8, 28, 40, 41, 42, 54]. Given a video, the tracker predicts the positions of objects in each frame (e.g., cars, people, or mice), as well as associations between detections of the same object across successive frames. Applications often require subsequent analysis of these trajectories. For example, in autonomous driving, when a risky scenario is encountered, engineers typically search for additional examples of that driving pattern to improve their planner [63, 64, 66]—e.g., cars driving too close [82] or stopping in the middle of the road [6]. Object tracking has also been used to track robots [58, 81], animals for behavioral analysis [12, 67, 75], and basketball players for sports analytics [67, 85].
We propose an algorithm for synthesizing queries over object trajectories given just a handful of input-output examples. A query takes as input a representation of a trajectory as a sequence of states (e.g., position, velocity, and acceleration) in successive frames of the video, and outputs whether the trajectory matches its semantics. Our query language is based on regular expressions—in particular, a query is a composition of a user-extensible set of predicates using the sequencing, conjunction, and iteration operators. For instance, trajectories might correspond to cars in a video; Fig. 1 shows a query for identifying cars turning at an intersection. As we discuss in Sect. 6, the full query language semantics is rich enough to subsume (variants of) Kleene algebras with tests (KAT) [46] and signal temporal logic (STL) [50]; however, such generality is seldom needed, so we use a pared-down query language that works well in practice.
Our algorithm performs enumerative search over the space of possible queries to identify ones that are consistent with the given examples. A key challenge in our setting is that our predicates have realvalued parameters that must also be synthesized. Thus, our strategy enumerates sketches, which are partial programs that only contain holes corresponding to realvalued parameters. For each sketch, we search over the space of realvalued parameters, while using an efficient pruning strategy to reduce the search space. At a high level, we use a quantitative semantics to directly compute “boundary parameters” at which a given example switches from being labeled positive to negative. Then, depending on the target label, we can prune the entire region of the search space on one side of these boundary parameters. We prove that this synthesis strategy comes with soundness and (partial) completeness guarantees.
We implement our approach in a system called Quivr. Our implementation focuses on videos from fixed-position cameras. While our language and synthesis algorithm are general, the predicates we design are tailored to specific settings. We evaluate Quivr on identifying driving patterns in traffic videos, including ones inspired by recent work on autonomous driving [63, 64, 66], on behavior detection in a dataset of mouse trajectories [72], and on a synthetic task from the temporal logic synthesis literature [44]. We demonstrate how both our parameter pruning strategies and our query evaluation optimizations lead to substantial reductions in the running time of our synthesizer.
In summary, our contributions are:

A language for querying object trajectories (Sect. 3) and an algorithm for synthesizing such queries from examples (Sect. 4).

An efficient parameter pruning approach based on a novel quantitative semantics (Sect. 4), yielding a \(5.0{\times }\) speedup over the state-of-the-art quantitative pruning technique from the temporal logic synthesis literature.

An implementation of our approach in Quivr, and an evaluation of Quivr on identifying driving behaviors in traffic camera video and mouse behaviors in a dataset of mouse trajectories (Sect. 5), demonstrating substantially better accuracy than neural network baselines.
2 Overview
We consider a hypothetical scenario where an engineer is designing a control algorithm for an autonomous car and would like to identify certain driving patterns in video data. We show how they can use our framework to synthesize a query to identify car trajectories that exhibit a given behavior.
Video Data. Traffic cameras are a rich source of driving behaviors [5, 13, 61]; one dataset used in our evaluation is YTStreams [7], which includes video from several such cameras. Figure 1(a) shows a single frame from such a video; we have used an object tracker [83] to identify all car trajectories (in red).
Predicates. Quivr assumes it is given a set of predicates that match portions of trajectories exhibiting behaviors of interest; during synthesis, it considers queries composed of these predicates. In Fig. 1(a), the engineer has manually annotated the lanes of interest in this video (black), to specify four InLaneK predicates that select trajectories of cars driving in each lane K visible in the video. Predicates may be configured by realvalued parameters. For example,
the query \(\langle \textsf {InLane1} \rangle \wedge \langle \textsf {DispLt}_{\theta } \rangle \) searches for trajectories where the car stays in lane 1 for a period of time and the car has a displacement at most \(\theta \) between the beginning and end of that period. Note that atomic predicates, like \(\langle \textsf {DispLt}_{\theta } \rangle \), can match multiple timesteps, whereas in formalisms like regular expressions and temporal logic, atomic predicates are over single timesteps. A key feature of our framework is that the set of available predicates is highly extensible, and the user can provide their own. See Sect. 5.1 for the predicates we use in our evaluation.
Synthesis. To specify a driving pattern, the engineer provides a small number of initial positive and negative examples of trajectories; then, Quivr synthesizes a query that correctly labels these examples. In Fig. 1(b), we show the result of executing the query shown, which is synthesized to identify left turns in the data. Often, there are multiple queries consistent with the initial examples. While it may be hard for users to sift through the video for positive examples, it is usually easy for them to label a given trajectory. Thus, to disambiguate, Quivr asks the user to label additional trajectories [19, 36, 62].
Multi-object Queries. So far, we have focused on queries that identify trajectories by processing each trajectory in isolation. A key feature of our framework is that users can express queries over multiple trajectories—for example,
This query says that car B is in lane 1 while car A changes from lane 2 to lane 1, and car A ends up in front of car B. Note that the predicates now include variables indicating which object they refer to, and the predicate \(\textsf {InFront}(A,B)\) refers to multiple objects. An example of a pair of trajectories selected by a multiobject query is shown in Fig. 2.
3 Query Language
We describe our query language for matching object trajectories in videos. Our system first preprocesses the video using an object tracker to obtain trajectories, which are sequences \(z=(x_0,x_1,...,x_{n-1})\) of states \(x_i\in \mathcal {X}\). Then, a query Q in our language maps each trajectory z to a value in \(\mathbb {B}=\{0,1\}\) indicating whether Q matches z. Our language is similar to both STL and KAT. One key difference is that predicates are over arbitrary subsequences of z rather than single states x. In the main paper, we consider a simpler language, but in Appendix A we show how it can be extended to subsume both STL and KAT.
Trajectories. We begin by describing the input to a query in our language, which is the representation of one or more concurrent object trajectories in a video.
Consider a space \(\mathcal {S}\) corresponding to a single object detection in a single video frame—e.g., \(s\in \mathcal {S}\subseteq \mathbb {R}^6\) might encode the 2D position, velocity, and acceleration of s in image coordinates. When considering m concurrent objects, let the space of states \(\mathcal {X} = \mathcal {S}^m\), and then a trajectory \(z\in \mathcal {Z} = \mathcal {X}^*\) is a sequence \(z=(x_0,x_1,...,x_{n-1})\) of states of length \(|z|=n\). We use the notation \(z_{i:j}=(z_i,z_{i+1},...,z_{j-1})\) to denote a subtrajectory of z.
Predicates. We assume a set of predicates \(\varPhi \) is given, where each predicate \(\varphi \in \varPhi \) matches trajectories \(z\in \mathcal {Z}\); we use \(\textsf {sat}_{\varphi }(z)\in \mathbb {B}=\{0,1\}\) to indicate that \(\varphi \) matches z. As discussed below, queries in our language compose these predicates to match more complex patterns.
Next, predicates in our language may have real-valued parameters that must be specified. We denote such a predicate \(\varphi \) with parameter \(\theta \in \mathbb {R}\) by \(\varphi _\theta \). To enable our synthesis algorithm to efficiently synthesize these real-valued parameters, we leverage the monotonicity of all such predicates we have used in our queries. In particular, we assume that the semantics of these predicates have the form
$$\textsf {sat}_{\varphi _\theta }(z) = \mathbbm {1}(\iota _\varphi (z)\ge \theta ),$$
where \(\iota _\varphi :\mathcal {Z}\rightarrow \mathbb {R}\) is a scoring function. We also assume that the range of \(\iota _\varphi \) is bounded (which can be achieved with a sigmoid function, if necessary). For example, for the predicate \(\textsf {DispLt}_{\theta }\), we have \(\iota _{\textsf {DispLt}}(z)=-\Vert z_0 - z_{n-1} \Vert \). Thus, \(\iota _{\textsf {DispLt}}(z)\ge -\theta \) says the total displacement is at most \(\theta \). We describe the predicates we include in Sect. 5.1; they can easily be extended.
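To make this interface concrete, here is a minimal Python sketch (our own illustration with hypothetical names, not Quivr's implementation) of a scored predicate; the sign convention follows the DispLt example, so that satisfaction is monotonically decreasing in the threshold:

```python
import math

def iota_disp_lt(z):
    """Score for DispLt: the negated displacement between the first and last
    states, so that iota(z) >= -theta holds iff the displacement is <= theta."""
    (x0, y0), (xn, yn) = z[0], z[-1]
    return -math.hypot(xn - x0, yn - y0)

def sat(iota, theta, z):
    """Satisfaction semantics of a parametric predicate: 1(iota(z) >= theta).
    This is monotonically decreasing in theta."""
    return 1 if iota(z) >= theta else 0

z = [(0.0, 0.0), (1.0, 0.0), (3.0, 4.0)]  # displacement from start to end: 5.0
assert sat(iota_disp_lt, -6.0, z) == 1    # "displacement at most 6" matches
assert sat(iota_disp_lt, -4.0, z) == 0    # "displacement at most 4" does not
```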
Syntax. The syntax of our language is
$$Q \,{:}{:}{=}\, \langle \varphi \rangle \mid Q\,;\,Q \mid Q\wedge Q \mid Q^k,$$
where \(Q^k=Q;Q;...;Q\) (k times). That is, the base case is a single predicate \(\varphi \), and queries can be composed using sequencing (\(Q\,;\,Q\)) and conjunction (\(Q\wedge Q\)). Operators for disjunction, negation, Kleene star, and STL’s “until” are discussed in Appendix A.2. We describe constraints imposed on our language during synthesis in Sect. 4.7.
Semantics. The satisfaction semantics of queries have type \(\llbracket \cdot \rrbracket : \mathcal {Q} \rightarrow \mathcal {Z} \rightarrow \mathbb {B}\), where \(\mathcal {Q}\) is the set of all queries in our language, \(\mathcal {Z}\) is the set of trajectories, and \(\mathbb {B}=\{0,1\}\). In particular, \(\llbracket Q \rrbracket (z)\in \mathbb {B}\) indicates whether the query Q matches trajectory z. The semantics are defined in Fig. 3. The base case of a single predicate \(\varphi \) checks whether \(\varphi \) matches z; conjunction \(Q_1 \wedge Q_2\) checks if both conjuncts match; and sequencing \(Q_1\,;\,Q_2\) checks if z can be split into \(z=z_{0:k}z_{k:n}\) in a way that \(Q_1\) matches \(z_{0:k}\) and \(Q_2\) matches \(z_{k:n}\). The semantics can be evaluated in time \(O(|Q| \cdot n^2)\).
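A direct reading of these semantics can be sketched as follows (our own Python, not Quivr's evaluator; the tuple encoding of queries is hypothetical). Memoizing each subquery on each subtrajectory \(z_{i:j}\) keeps the evaluation polynomial:

```python
def evaluate(Q, z):
    """Satisfaction semantics: Q is ('pred', sat_fn), ('and', Q1, Q2), or
    ('seq', Q1, Q2); returns 1 iff Q matches the whole trajectory z."""
    cache = {}
    def matches(q, i, j):
        key = (id(q), i, j)
        if key not in cache:
            if q[0] == 'pred':
                cache[key] = bool(q[1](z[i:j]))
            elif q[0] == 'and':
                cache[key] = matches(q[1], i, j) and matches(q[2], i, j)
            else:  # 'seq': try every split point z[i:j] = z[i:k] z[k:j]
                cache[key] = any(matches(q[1], i, k) and matches(q[2], k, j)
                                 for k in range(i, j + 1))
        return cache[key]
    return 1 if matches(Q, 0, len(z)) else 0

# Example: a predicate matching a single state with value >= 0.5,
# sequenced with itself.
at_least_half = ('pred', lambda seg: len(seg) == 1 and seg[0] >= 0.5)
Q = ('seq', at_least_half, at_least_half)
assert evaluate(Q, [0.6, 0.7]) == 1
assert evaluate(Q, [0.6, 0.3]) == 0
```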
4 Synthesis Algorithm
We describe our algorithm for synthesizing queries consistent with a given set of examples. It performs a syntax-guided enumerative search over the space of possible queries [3]. In more detail, it enumerates sketches, which are partial programs where only parameter values are missing. For each sketch, it uses a quantitative pruning strategy to compute the subset of the input parameters for which the resulting query is consistent with the given examples. A key contribution is how our algorithm uses quantitative semantics for quantitative pruning.
4.1 Problem Formulation
Partial Queries. A partial query is in the grammar
$$Q \,{:}{:}{=}\, \langle \varphi \rangle \mid Q\,;\,Q \mid Q\wedge Q \mid Q^k \mid \;?? \mid \langle \varphi _{??} \rangle $$
Note that there are two kinds of holes: (i) a predicate hole \(h=\;??\) that can be filled by a subquery Q, and (ii) a parameter hole \(h=\varphi _{??}\) that can be filled by a real value \(\theta _h\in \mathbb {R}\). We denote the predicate holes of Q by \(\mathcal {H}_{\varphi }(Q)\), the parameter holes by \(\mathcal {H}_{\theta }(Q)\), and let \(\mathcal {H}(Q)=\mathcal {H}_{\varphi }(Q)\cup \mathcal {H}_{\theta }(Q)\). A partial query Q is a sketch (denoted \(Q\in \mathcal {Q}_{\text {sketch}}\)) [71] if \(\mathcal {H}_{\varphi }(Q)=\varnothing \), and is complete (denoted \(Q\in \bar{\mathcal {Q}}\)) if \(\mathcal {H}(Q)=\varnothing \). For example, for \(Q=\langle \textsf {DispLt}_{??1} \rangle \wedge ??2\), we have \(\mathcal {H}_{\theta }(Q)=\{??1\}\) and \(\mathcal {H}_{\varphi }(Q)=\{??2\}\). (We label each hole \(h=\;??i\) with an identifier \(i\in \mathbb {N}\) to distinguish them.)
Refinements and Completions. Given query \(Q\in \mathcal {Q}\), predicate hole \(h\in \mathcal {H}_{\varphi }(Q)\), and production \(R=Q\rightarrow f(Q_1,...,Q_k)\), we can fill h with R (denoted \(Q'=\textsf {fill}(Q,h,R)\)) by replacing h with f(??1, ..., ??k), where each ??i is a fresh hole; similarly, we can fill a parameter hole \(h\in \mathcal {H}_{\theta }(Q)\) with a value \(\theta _h\in \mathbb {R}\). We call \(Q'\) a child of Q (denoted \(Q\rightarrow Q'\)). Next, we call \(Q''\) a refinement of Q (denoted \(Q\xrightarrow {*}Q''\)) if there exists a sequence \(Q\rightarrow ...\rightarrow Q''\); if furthermore \(Q''\in \bar{\mathcal {Q}}\), we say it is a completion of Q. For example, we have
Here, \(\langle \textsf {InLane1} \rangle \,;\,??3\) is a child (and refinement) of \(??2 \,;\,??3\) obtained by filling ??2 with \(Q\rightarrow \langle \textsf {InLane1} \rangle \).
Parameters. We let \(\theta \in \mathbb {R}^{\mathcal {H}_\theta (Q)}\) denote a choice of parameters for each \(h\in \mathcal {H}_\theta (Q)\), let \(\theta _h\in \varTheta _h\subseteq \mathbb {R}\) denote the parameter for hole h, and let \(Q_{\theta }\) denote the query obtained by filling each \(h\in \mathcal {H}_{\theta }(Q)\) with \(\theta _h\). Note that if \(Q\in \mathcal {Q}_{\text {sketch}}\), then \(Q_{\theta }\in \bar{\mathcal {Q}}\) is complete. For example, consider the sketch
This query has two holes, so its parameters are \(\theta \in \mathbb {R}^2\). If \(\theta = (3.2, 5.0)\), then \(\theta _{??1}=3.2\) is used to fill hole ??1 and \(\theta _{??2}=5.0\) is used to fill ??2.
Query Synthesis Problem. Given examples \(W\subseteq \mathcal {W}=\mathcal {Z}\times \mathbb {B}\), where \(\mathbb {B}=\{0,1\}\), our goal is to find a query \(Q\in \bar{\mathcal {Q}}\) that correctly labels these examples—i.e., that satisfies
$$\psi _W(Q) {:}{=}\bigwedge _{(z,y)\in W} \mathbbm {1}(\llbracket Q \rrbracket (z) = y).$$
Thus, \(\psi _W(Q)\) indicates whether Q is consistent with the labeled examples W. Our goal is to devise a synthesis algorithm that is sound and complete—i.e., it finds a query that satisfies \(\psi _W(Q) = 1\) if and only if one exists.
4.2 Algorithm Overview
Our algorithm enumerates sketches \(Q\in \mathcal {Q}_{\text {sketch}}\); for each one, it tries to compute parameter values \(\theta \) such that the completed query \(Q_{\theta }\) is consistent with W—i.e., \(\psi _W(Q_\theta )=1\). It can either stop once it has found a consistent query, or identify additional queries that are consistent with W. Algorithm 1 shows this high-level strategy—at each iteration, it selects a sketch Q, determines a region B of the parameter space containing consistent parameters \(\theta \in B\), and adds (Q, B) to a list of consistent queries that solve the synthesis problem.
The key challenge is searching over the space of continuous parameters \(\theta \) for a given sketch Q such that \(Q_\theta \) is consistent with W. For efficiency, we rely heavily on pruning the search space. At a high level, consider evaluating a single candidate parameter \(\theta \) on a single example \((z,y)\in W\)—i.e., check whether \(\llbracket Q_\theta \rrbracket (z)=y\). If this condition does not hold, then we can not only prune \(\theta \) from the search space, but also a significant fraction of additional candidates. For instance, suppose \(\llbracket Q_\theta \rrbracket (z)=1\) but \(y=0\); if \(\theta '\le \theta \) (in all components), then by a monotonicity property we prove for our semantics, we also have \(\llbracket Q_{\theta '}\rrbracket (z)=1\). Thus, we can also prune \(\theta '\).
Previous work has leveraged this property to prune the search space [49, 53, 78]. Using a strategy based on binary search, for a given example \((z,y)\in W\), we can identify “boundary” parameters \(\theta \) to accuracy \(\varepsilon \) in \(O(\log (1/\varepsilon ))\) steps—i.e., compute \(\theta \) for which \(\llbracket Q_{\theta -\vec {\varepsilon }}\rrbracket (z)=1\) and \(\llbracket Q_{\theta +\vec {\varepsilon }}\rrbracket (z)=0\).
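For a single parameter dimension, this binary-search strategy can be sketched as follows (our own code; `satisfies` stands for the monotonically decreasing map \(\theta \mapsto \llbracket Q_\theta \rrbracket (z)\)):

```python
def boundary_by_bisection(satisfies, lo, hi, eps=1e-6):
    """Find an approximate boundary parameter of a monotonically decreasing
    0/1 function: satisfies(lo) == 1 and satisfies(hi) == 0 initially, and
    the returned theta satisfies the query while theta + eps does not."""
    assert satisfies(lo) == 1 and satisfies(hi) == 0
    while hi - lo > eps:
        mid = (lo + hi) / 2
        if satisfies(mid) == 1:
            lo = mid            # boundary is at or above mid
        else:
            hi = mid            # boundary is below mid
    return lo

# Example: VelGt_theta on a single state with velocity 0.8.
sat = lambda theta: 1 if 0.8 >= theta else 0
b = boundary_by_bisection(sat, 0.0, 1.0)
assert abs(b - 0.8) < 1e-5
```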
Our algorithm avoids this binary search process, which can lead to a significant speedup in practice. The key idea is to devise a quantitative semantics for queries that directly computes \(\theta \); in fact, this quantitative semantics is closely related to robust temporal logic semantics, where the conjunction and disjunction of the satisfaction semantics are replaced with minimum and maximum, respectively.
4.3 Pruning with Boundary Parameters
We begin by describing how “boundary parameters” can be used to prune a portion of the search space over parameters. First, for any candidate parameters \(\theta \), we can prune parameters \(\theta '\le \theta \) (if \(\llbracket Q_\theta \rrbracket (z)=1\) and \(y=0\)) or \(\theta '\ge \theta \) (if \(\llbracket Q_\theta \rrbracket (z)=0\) and \(y=1\)). Pruned regions of the parameter space take the form of hyperrectangles, which we call boxes. For convenience, let \(\vec {\infty } {:}{=}(\infty , \ldots , \infty )\).
Definition 1
Given \(x, y \in \bar{\mathbb {R}}^d\), where \(\bar{\mathbb {R}}=\mathbb {R}\cup \{\pm \infty \}\), a box is an axis-aligned half-open hyperrectangle \(\lfloor x, y \rceil {:}{=}\{v \mid x_i < v_i \le y_i\}\subseteq \mathbb {R}^d\).
The key property ensuring that parameters prune boxes of the search space is that the semantics are monotonically decreasing in \(\theta \).
Lemma 1
Given sketch Q, trajectory z, and two candidate parameters \(\theta ,\theta '\in \mathbb {R}^d\) such that \(\theta \le \theta '\) componentwise, we have \(\llbracket Q_\theta \rrbracket (z)\ge \llbracket Q_{\theta '} \rrbracket (z)\).
The proof follows by structural induction on the query semantics: the base case follows since the semantics \(\mathbbm {1}(\iota _\varphi (z)\ge \theta _k)\) for predicates is monotonically decreasing in \(\theta _k\), and the inductive case follows since conjunction and disjunction are monotonically increasing in their inputs (so they are also monotonically decreasing in \(\theta _k\)). Below, we show how monotonicity ensures that we can prune whole regions of the search space if we find boundary parameters.
As an example, suppose we have two trajectories, \(z_0\) of a car driving quickly and then slowly, and \(z_1\) of a car driving slowly and then quickly, and that we are trying to synthesize a query for \(W=\{(z_0,0),(z_1,1)\}\). For simplicity, we assume both \(z_0=(0.9,0.6)\) and \(z_1=(0.5,0.8)\) have just two time steps each, with just a single component representing velocity. Furthermore, we assume there is just a single predicate \(\langle \textsf {VelGt}_{\theta } \rangle \) matching time steps where the velocity is at least \(\theta \), where \(\theta \) is a real-valued parameter. Since \(\langle \textsf {VelGt}_{\theta } \rangle \) matches single time steps, the satisfaction semantics is 0 except on trajectories of length 1, so:
$$\llbracket \langle \textsf {VelGt}_{\theta } \rangle \rrbracket (z) = \mathbbm {1}(|z| = 1) \wedge \mathbbm {1}(z_0 \ge \theta ).$$
Consider the sketch \(Q = \langle \textsf {VelGt}_{??1} \rangle \,;\, \langle \textsf {VelGt}_{??2} \rangle \). We can see that the candidate parameters (0.5, 0.6) satisfy \(\llbracket Q_{(0.5,0.6)} \rrbracket (z_1)=1\):
$$\llbracket Q_{(0.5,0.6)} \rrbracket (z_1) = \bigvee _{k=0}^{2} \llbracket \langle \textsf {VelGt}_{0.5} \rangle \rrbracket ((z_1)_{0:k}) \wedge \llbracket \langle \textsf {VelGt}_{0.6} \rangle \rrbracket ((z_1)_{k:2}) = \mathbbm {1}(0.5 \ge 0.5) \wedge \mathbbm {1}(0.8 \ge 0.6) = 1,$$
where the second equality holds because \(\langle \textsf {VelGt}_{\theta } \rangle \) matches only length-1 trajectories, so the \(k=0\) and \(k=2\) cases evaluate to 0. Since the semantics are monotonically decreasing, we have \(\llbracket Q_\theta \rrbracket (z_1)=1\) for any \(\theta \in \left\lfloor (-\infty , -\infty ), (0.5, 0.6) \right\rceil \).
Notice, however, that if we were to move any \(\vec {\varepsilon } > 0\) upward, we would have \(\llbracket Q_{(0.5 + \varepsilon _1, 0.6 + \varepsilon _2)} \rrbracket (z_1) = \mathbbm {1}(0.5 \ge 0.5 + \varepsilon _1) \wedge \mathbbm {1}(0.8 \ge 0.6 + \varepsilon _2) = 0\). So we know \(\llbracket Q_\theta \rrbracket (z_1)=0\) for any \(\theta \in \left\lfloor (0.5, 0.6), (\infty , \infty ) \right\rceil \). This is because (0.5, 0.6) lies on the boundary between \(\{\theta '\mid \llbracket Q_{\theta '} \rrbracket (z)=0\}\) and \(\{\theta '\mid \llbracket Q_{\theta '} \rrbracket (z)=1\}\). This boundary plays a key role in our algorithm.
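The boundary property claimed above is easy to check numerically (our own sketch; for this two-predicate sketch, \(Q_\theta \) matches the two-step trajectory exactly when each step clears its threshold):

```python
def Q(theta, z):
    """Q = VelGt_t1 ; VelGt_t2 on a two-step trajectory: each predicate
    matches exactly one step, so the only split that can succeed is k = 1."""
    t1, t2 = theta
    return 1 if len(z) == 2 and z[0] >= t1 and z[1] >= t2 else 0

z1 = (0.5, 0.8)
assert Q((0.5, 0.6), z1) == 1              # the boundary parameter matches
assert Q((0.4, 0.5), z1) == 1              # everything below it matches too
eps = 1e-9
assert Q((0.5 + eps, 0.6 + eps), z1) == 0  # any upward move fails
```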
Definition 2
Given a sketch Q with d parameter holes and a trajectory z, we say \(\theta \in \mathbb {R}^d \cup \{\bot , \top \}\) is a boundary parameter if one of the following holds:

\(\theta \in \mathbb {R}^d\) and \(\llbracket Q_\theta \rrbracket (z) = 1\), but \(\llbracket Q_{\theta '} \rrbracket (z) = 0\) for all \(\theta ' \in \left\lfloor \theta , \vec {\infty } \right\rceil \)

\(\theta = \bot \) and \(\llbracket Q_{\theta '} \rrbracket (z) = 0\) for all \(\theta ' \in \left\lfloor -\vec {\infty }, \vec {\infty } \right\rceil \)

\(\theta = \top \) and \(\llbracket Q_{\theta '} \rrbracket (z) = 1\) for all \(\theta ' \in \left\lfloor -\vec {\infty }, \vec {\infty } \right\rceil \)
In the first case, by monotonicity, we also have \(\llbracket Q_{\theta '} \rrbracket (z)=1\) for all \(\theta '\in \left\lfloor -\vec {\infty }, \theta \right\rceil \); thus, \(\theta \) lies on the boundary between parameters \(\theta '\) where \(Q_{\theta '}\) evaluates to 1 and those where it evaluates to 0. The second and third cases are where \(Q_{\theta '}\) always evaluates to 0 and 1, respectively.
Given a boundary parameter \(\theta \) for an example \((z,y)\in W\), we can prune \(\left\lfloor \theta , \vec {\infty } \right\rceil \) if \(y=1\) or \(\left\lfloor -\vec {\infty }, \theta \right\rceil \) if \(y=0\). Intuitively, boundary parameters provide optimal pruning along a fixed direction in the parameter space. Thus, our algorithm focuses on computing boundary parameters for pruning.
In Fig. 4(a), if \(\theta _1\) is a boundary parameter for \(z_1\), we know that the blue region satisfies \(z_1\), and thus is consistent with the label 1, while the red region does not satisfy \(z_1\), and thus is inconsistent with the label 1. Similarly, in Fig. 4(b), if \(\theta _0\) is a boundary parameter for \(z_0\), we know that the red region satisfies \(z_0\), and thus is inconsistent with the label 0, while the blue region does not satisfy \(z_0\), and thus is consistent with the label 0.
4.4 Pruning with Pairs of Boundary Parameters
To extend pruning to the entire dataset W, we could simply prune the union of the individual pruned regions for each \((z,y)\in W\). However, one important feature of our approach is that we can also establish regions of the parameter space where the parameters are guaranteed to be consistent with W. To formalize this idea, we introduce the concept of a “pruning pair”, which is a pair of boundary parameters which might allow us to find such a consistent region.
Definition 3
Given a sketch Q and a dataset W, a pair of boundary parameters \(\theta ^-, \theta ^+ \in \mathbb {R}^d \cup \{\bot , \top \}\) is a pruning pair for Q and W if all of the following hold:

\(\theta ^+\) is a boundary parameter for some \(z \in W^+\) and, for all \(z'\in W^+\) such that \(z' \ne z\), we have \(\llbracket Q_{\theta ^+} \rrbracket (z') = 1\).

\(\theta ^-\) is a boundary parameter for some \(z \in W^-\) and, for all \(z'\in W^-\) such that \(z' \ne z\), we have \(\llbracket Q_{\theta ^-} \rrbracket (z') = 0\).

\(\theta ^- < \theta ^+\) or \(\theta ^- \ge \theta ^+\).
If \(\theta ^- < \theta ^+\), the pruning pair \((\theta ^-, \theta ^+)\) is consistent, and inconsistent otherwise.
Our algorithm searches for pruning pairs along a fixed direction—i.e., it considers a curve \(L\subseteq \mathbb {R}^d\) and looks for the following pruning pair along L:
$$\theta ^+ = \sup \{\theta \in L \mid \forall (z,1)\in W.\; \llbracket Q_\theta \rrbracket (z)=1\}, \qquad \theta ^- = \inf \{\theta \in L \mid \forall (z,0)\in W.\; \llbracket Q_\theta \rrbracket (z)=0\}.$$
Intuitively, \(\theta ^+\) is the largest \(\theta \) that correctly classifies all positive examples, and conversely for \(\theta ^-\). We restrict to curves L that are monotonically increasing in all components, in which case the supremum and infimum are well defined since L comes with a total ordering (from its smallest point to its largest) that is consistent with the standard partial order on \(\mathbb {R}^d\). Then, \((\theta ^-,\theta ^+)\) form a pruning pair: since \(\theta ^+\) is a boundary parameter for z, if we take \(\theta ^+\) to be any larger, then we must have \(\llbracket Q_\theta \rrbracket (z)=0\) for some \(z\in W^+\), and similarly for \(\theta ^-\).
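Writing \(t_z\) for the position along L of the boundary parameter of example z, the pruning pair can be read off directly (a sketch in our own notation):

```python
def pruning_pair_positions(boundary_t, positives, negatives):
    """theta^+ is the largest point on L at which every positive example
    still matches, i.e., the minimum of the positives' boundary positions;
    dually, theta^- is the maximum of the negatives' boundary positions."""
    t_plus = min(boundary_t(z) for z in positives)
    t_minus = max(boundary_t(z) for z in negatives)
    return t_minus, t_plus

# Example with precomputed boundary positions along L:
t = {'pos1': 0.7, 'pos2': 0.5, 'neg1': 0.2, 'neg2': 0.3}
t_minus, t_plus = pruning_pair_positions(t.get, ['pos1', 'pos2'], ['neg1', 'neg2'])
assert (t_minus, t_plus) == (0.3, 0.5)  # consistent pair: t_minus < t_plus
```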
Given a curve L, we can compute an approximation to \(\theta ^+\) and \(\theta ^-\) via binary search. However, our algorithm avoids the need to do so by directly computing \(\theta ^+\) and \(\theta ^-\) using a quantitative semantics, which we describe in Sect. 4.6.
Figure 4(c) shows how the pair of boundary parameters \(\theta _0\) for \(z_0\) and \(\theta _1\) for \(z_1\) (where L is the diagonal line) prunes the parameter space. The blue region is guaranteed to be consistent with W, as it is the intersection of the region below \(\theta ^+\), which must satisfy \(\llbracket Q_\theta \rrbracket (z_1)=1\), and the region above \(\theta ^-\), which must satisfy \(\llbracket Q_\theta \rrbracket (z_0)=0\). Conversely, the red regions are inconsistent with either \(z_0\) or \(z_1\), and therefore with W. Thus, the red regions can be pruned, whereas the blue regions are solutions to our synthesis problem. Note that the red region is the union of the red regions in Fig. 4(a) and (b), whereas the blue region is the intersection of the blue regions in Fig. 4(a) and (b).
This pattern holds for any consistent pruning pair (\(\theta ^- < \theta ^+\)); if instead the pair is inconsistent (\(\theta ^- \ge \theta ^+\)), then the resulting pattern is illustrated in Fig. 4(d); in this case, we can prune the red regions as before, but there is no blue region of solutions. In general, for a d-dimensional parameter space, a pruning pair divides the parameter space into \(3^d\) boxes (i.e., for each dimension, the box can be below, in line with, or above the center box). The regions below \(\theta ^-\) and above \(\theta ^+\) can be pruned, and the region between \(\theta ^-\) and \(\theta ^+\) (if one exists) contains synthesis solutions. Precisely, it follows from the definitions and monotonicity that:
Lemma 2
Every \(\theta \in \left\lfloor -\vec {\infty }, \theta ^- \right\rceil \) and \(\theta \in \left\lfloor \theta ^+, \vec {\infty } \right\rceil \) is inconsistent with W, and every \(\theta \in \left\lfloor \theta ^-, \theta ^+ \right\rceil \) is consistent with W.
The remaining boxes need to be further analyzed by our algorithm.
4.5 Pruning Parameter Search Algorithm
Next, we describe Algorithm 2, which searches over the space of parameters to fill a sketch Q for a given dataset W. The algorithm uses a subroutine that takes a box and returns a pruning pair in that box, which we describe in Sect. 4.6. Given this subroutine, our algorithm maintains a worklist of “unknown” boxes (i.e., unknown whether parameters in these boxes are consistent or inconsistent with W). At each iteration, it pops a box from the worklist (in first-in-first-out order), uses the given subroutine to find a pruning pair inside that box, applies the pruning procedure described in the previous section, and then adds each new unknown box to the worklist.
For the last step, the current box b is divided into \(3^d\) smaller boxes. The box \(b_{\text {center}}{:}{=}\left\lfloor \min \{\theta ^-,\theta ^+\}, \max \{\theta ^-,\theta ^+\} \right\rceil \) is pruned (added to \(B_{\text {inc}}\)) if the pair \((\theta ^-,\theta ^+)\) is inconsistent, and contains solutions to the synthesis problem otherwise (added to \(B_{\text {con}}\)). The boxes \(b_{\text {lower}}=\left\lfloor -\vec {\infty }, \min \{\theta ^-,\theta ^+\} \right\rceil \) and \(b_{\text {upper}}=\left\lfloor \max \{\theta ^-,\theta ^+\}, \vec {\infty } \right\rceil \) are always pruned. The boxes \(b\in B_{\text {incomp}}\) are the remaining corners of b, and always have indeterminate consistency (added to \(B_{\text {unk}}\)). The remaining boxes \(b\in B_{\text {extra}}\) are indeterminate if \((\theta ^-,\theta ^+)\) is consistent, and inconsistent otherwise. In our example, if the first step of the algorithm yielded Fig. 4(d), then the second step might pop \(b_6\) and yield Fig. 4(e).
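The subdivision step can be sketched as follows (our own Python; for simplicity we mark every corner and edge box as unknown, whereas Algorithm 2 additionally prunes the \(B_{\text {extra}}\) boxes when the pair is inconsistent):

```python
from itertools import product

def subdivide(lo, hi, t_minus, t_plus):
    """Split box (lo, hi) at the pruning pair's componentwise min and max,
    yielding 3^d sub-boxes labeled 'pruned', 'consistent', or 'unknown'."""
    d = len(lo)
    a = tuple(min(m, p) for m, p in zip(t_minus, t_plus))
    b = tuple(max(m, p) for m, p in zip(t_minus, t_plus))
    pair_consistent = all(m < p for m, p in zip(t_minus, t_plus))
    boxes = []
    for sides in product(range(3), repeat=d):  # 0: below, 1: middle, 2: above
        box_lo = tuple((lo[i], a[i], b[i])[sides[i]] for i in range(d))
        box_hi = tuple((a[i], b[i], hi[i])[sides[i]] for i in range(d))
        if all(s == 0 for s in sides) or all(s == 2 for s in sides):
            label = 'pruned'      # b_lower and b_upper are always pruned
        elif all(s == 1 for s in sides):
            label = 'consistent' if pair_consistent else 'pruned'  # b_center
        else:
            label = 'unknown'     # corners and edges need further search
        boxes.append((box_lo, box_hi, label))
    return boxes

# 2-D example: 9 sub-boxes, one consistent center since t_minus < t_plus.
boxes = subdivide((0.0, 0.0), (1.0, 1.0), (0.2, 0.3), (0.6, 0.7))
assert len(boxes) == 9
assert sum(1 for *_, lab in boxes if lab == 'consistent') == 1
assert sum(1 for *_, lab in boxes if lab == 'pruned') == 2
```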
The following soundness result follows directly from Lemma 2.
Theorem 1
In Algorithm 2, every \(\theta \in B_{{\text {con}}}\) is consistent with W for Q, and every \(\theta \in B_{{\text {inc}}}\) is inconsistent.
In addition, the algorithm is complete for almost all parameters:
Theorem 2
The Lebesgue measure of \(\left\{ \theta \in b \mid b \in B_{{\text {unk}}}\right\} \rightarrow 0\) as the number of iterations \(N\rightarrow \infty \).
See Appendix D.1 for the proof. In other words, all parameters outside a subset of measure zero are eventually classified as consistent or inconsistent; intuitively, the parameters that may never be classified are the ones along the decision boundary. This result holds since at any search depth, the fraction of the parameter space pruned can be lowerbounded.
4.6 Computing Pruning Pairs via Quantitative Semantics
The pruning algorithm depends on the ability to compute, given a box b, a pruning pair \((\theta ^-, \theta ^+)\) on the restriction of the parameter space to b. Recall that \(\theta ^+\) must be a boundary parameter for some \(z^+ \in W^+\) and must satisfy \(\llbracket Q_{\theta ^+} \rrbracket (z)=1\) for all other \(z \in W^+\), and \(\theta ^-\) must be a boundary parameter for some \(z^- \in W^-\) and must satisfy \(\llbracket Q_{\theta ^-} \rrbracket (z)=0\) for all other \(z \in W^-\).
Given a box \(b=\left\lfloor \theta _{\text {min}}, \theta _{\text {max}} \right\rceil \), our algorithm takes \(L\subseteq \mathbb {R}^d\) to be the diagonal from \(\theta _{\text {min}}\) to \(\theta _{\text {max}}\) and computes the pruning pair along L. We can naïvely use binary search: for \(\theta ^+\), we search for the parameter along L where \(\bigwedge _{z \in W^+} \llbracket Q_\theta \rrbracket (z)\) transitions from 1 to 0, and similarly for \(\theta ^-\), where \(\bigwedge _{z \in W^-} \lnot \llbracket Q_\theta \rrbracket (z)\) transitions from 0 to 1.
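The binary-search baseline can be sketched as follows; this is an illustrative reconstruction, where `sat` is a hypothetical oracle standing in for evaluating the conjunction over examples at a given parameter, assumed to hold at \(\theta _{\text {min}}\), fail at \(\theta _{\text {max}}\), and flip exactly once along the diagonal.

```python
def boundary_by_bisection(sat, theta_min, theta_max, iters=30):
    """Binary search along the diagonal of [theta_min, theta_max] for the
    point where a monotone 0/1 oracle `sat` flips.

    Assumes sat holds at theta_min and fails at theta_max (hypothetical
    oracle; stands in for evaluating Q_theta on the examples). Returns
    t in [0, 1]: the flip point is theta_min + t * (theta_max - theta_min)."""
    lo_t, hi_t = 0.0, 1.0
    for _ in range(iters):
        mid = (lo_t + hi_t) / 2
        theta = [a + mid * (b - a) for a, b in zip(theta_min, theta_max)]
        if sat(theta):
            lo_t = mid   # still satisfied: the flip point is further up
        else:
            hi_t = mid   # violated: the flip point is below
    return (lo_t + hi_t) / 2
```

Each probe of `sat` evaluates the query on every example, which is what makes this baseline expensive relative to the direct computation below.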
Instead, by leveraging a quantitative semantics, we can directly compute \(\theta ^+\) and \(\theta ^-\), thereby reducing computation time substantially. Given a sketch Q, trajectory z, parameter \(v \in \mathbb {R}^d\), and positive vector \(u \in \mathbb {R}_{>0}^d\), we devise a quantitative semantics \(\llbracket Q \rrbracket ^{q}_{v,u}(z) \in \bar{\mathbb {R}}\) such that the parameter \(\llbracket Q \rrbracket ^{q}_{v,u}(z) \cdot u + v\) is a boundary parameter. Intuitively, this semantics computes, starting at v, how many u-sized steps must be taken to reach the boundary. (For the uses in our algorithm, the number of steps is always in [0, 1].) Then, for a box \(b = \left\lfloor \theta _{\text {min}}, \theta _{\text {max}} \right\rceil \), we can take \(v = \theta _{\text {min}}\) and \(u = \theta _{\text {max}}-\theta _{\text {min}}\), and compute \(\theta ^+ = \left( \min _{z \in W^+} \llbracket Q \rrbracket ^{q}_{v,u}(z)\right) \cdot u + v\) and \(\theta ^- = \left( \max _{z \in W^-} \llbracket Q \rrbracket ^{q}_{v,u}(z)\right) \cdot u + v\).
We define the quantitative semantics in Fig. 5. The base case of \(\varphi _{??}\) adjusts and rescales \(\iota _\varphi \) by v and u, and the other cases replace conjunction and disjunction in the satisfaction semantics with minimum and maximum. We have the following key result (where \(\infty \cdot u {:}{=}\top \), \(-\infty \cdot u {:}{=}\bot \), \(\top + v {:}{=}\top \), and \(\bot + v {:}{=}\bot \)):
Theorem 3
For a sketch Q, trajectory z, parameter \(v \in \mathbb {R}^d\), and positive vector \(u \in \mathbb {R}_{>0}^d\), we have that \(\llbracket Q \rrbracket ^{q}_{v,u}(z) \cdot u + v\) is a boundary parameter of z for Q.
See Appendix D.2 for the full proof. For intuition, consider \(\theta _{\text {min}}=\vec {0}\) and \(\theta _{\text {max}}=\vec {1}\) (i.e., the current box \(b\subseteq \mathbb {R}^d\) is the unit hypercube). Then, \(v=\vec {0}\) and \(u=\vec {1}\), so \(\llbracket Q \rrbracket ^{q}_{v,u}\) reduces to the standard max-min quantitative semantics for temporal logic [25].
Now, if we consider the satisfaction semantics of a base predicate \(\llbracket \varphi _{\theta _i} \rrbracket =\mathbbm {1}(\iota _{\varphi }(z)\ge \theta _i)\), then the value of \(\theta _i\) where the semantics flips is just \(\iota _\varphi (z)\). So any parameter with ith component \(\iota _\varphi (z)\) is a boundary parameter, and since L has the same slope in all dimensions, the boundary parameter along L is \(\iota _\varphi (z) \cdot \vec {1} + \vec {0} = \llbracket \varphi _{??i} \rrbracket ^{q}_{\vec {0},\vec {1}}(z) \cdot \vec {1} + \vec {0}\).
In the inductive cases, it suffices to show that we can replace conjunction and disjunction with minimum and maximum in the semantics. Since the satisfaction semantics is monotonically decreasing, as we move upward along L, at some point we will transition from 1 to 0. A conjunction becomes 0 when either conjunct becomes 0, so the transition will occur when we hit the first of the conjuncts’ transition points (their minimum). Dually, a disjunction becomes 0 when both disjuncts become 0, so we will transition at the last of the disjuncts’ transition points (their maximum).
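This min/max recursion can be sketched on a small formula AST. The tuple encoding and the predicate functions below are ours, purely for illustration; they are not the representation used by Quivr.

```python
def qsem(Q, z, v, u):
    """Quantitative semantics sketch: the number of u-sized steps from v
    (along direction u) to the satisfaction boundary of Q on trajectory z.

    Q is a nested tuple: ("pred", iota, i) for a base predicate iota(z) >= theta_i,
    ("and", Q1, Q2), or ("or", Q1, Q2). Conjunction becomes min (the first
    conjunct to fail as the parameter grows), disjunction becomes max (the
    last disjunct to fail)."""
    op = Q[0]
    if op == "pred":
        _, iota, i = Q
        return (iota(z) - v[i]) / u[i]   # rescale the raw feature by v and u
    left, right = qsem(Q[1], z, v, u), qsem(Q[2], z, v, u)
    return min(left, right) if op == "and" else max(left, right)
```

With \(v=\vec {0}\) and \(u=\vec {1}\), the base case returns \(\iota _\varphi (z)\) directly, recovering the unit-hypercube intuition above.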
Finally, the intuition behind u and v is that they “preprocess” the parameters so that we evaluate along the diagonal of the current box instead of \(\left\lfloor \vec {0}, \vec {1} \right\rceil \).
4.7 Implementation
We implement our approach in a system called Quivr. It begins by running Algorithm 1 on a small number of labeled examples.
Active Learning. With a small number of examples, there are typically many queries that are consistent with the labels yet disagree on the labels of the remaining data. To disambiguate, we use an active learning strategy: we ask the user to label specific trajectories that we choose, which are then added to our set of labeled examples. Queries that are inconsistent with the new label are discarded. The labeling process continues until the set of consistent queries agrees on the labels of all unlabeled data.
When choosing the trajectory \(z^*\) to query the user for next, we select the one on which the set of consistent queries C disagrees most—i.e., \(z^* = {{\,\mathrm{arg\,min}\,}}_{z} \left| p(z) - \frac{1}{2} \right| \), where \(p(z) = \frac{1}{|C|} \left| \left\{ Q \in C \mid \llbracket Q \rrbracket (z) = 1 \right\} \right| \) is the fraction of consistent queries that predict a positive label for trajectory z.
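A minimal sketch of this selection rule, with consistent queries represented as callables returning 0/1 (our encoding, not Quivr's):

```python
def select_query(unlabeled, consistent_queries):
    """Pick the unlabeled trajectory the consistent queries disagree on most,
    i.e. the one whose positive-vote fraction p(z) is closest to 1/2.

    Illustrative sketch; each element of `consistent_queries` stands in for
    evaluating one consistent query on a trajectory."""
    def p(z):
        votes = sum(1 for q in consistent_queries if q(z))
        return votes / len(consistent_queries)
    return min(unlabeled, key=lambda z: abs(p(z) - 0.5))
```

A label on the selected trajectory eliminates roughly half of the consistent queries, which is what makes the disagreement criterion informative.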
Search Implementation. In some cases, searching for consistent parameters may take a very long time. To improve performance, we impose a timeout: for each sketch, we pause search once either (i) we find some consistent box of parameters or (ii) we have exceeded 25 steps. In both cases, we save the sets of consistent, inconsistent, and unknown boxes. At each step of active learning, the newly labeled example may render previously consistent parameters inconsistent, so we mark all consistent boxes as unknown. We then resume search, again until (i) we find some consistent box (which may be the same one as before), or (ii) we again exceed 25 steps.
Note that while this timeout may cause us to query the user more often than is strictly necessary, it does not affect either the soundness or completeness of our approach, as we continue search after querying the user.
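The budgeted, resumable loop described above can be sketched as follows; this is illustrative only, with `step` standing in for one iteration of the pruning algorithm and the state dictionary being our encoding of the saved box sets.

```python
def resume_search(state, step, max_steps=25):
    """Run up to `max_steps` iterations of a resumable parameter search for
    one sketch (illustrative sketch, not Quivr's implementation).

    `state` holds the consistent / inconsistent / unknown box sets; `step`
    pops an unknown box and classifies it. We stop as soon as some consistent
    box exists or the budget is exhausted; the returned state can be saved
    and resumed after new labels arrive (with "con" boxes moved to "unk")."""
    for _ in range(max_steps):
        if state["con"]:       # (i) found a consistent box
            break
        if not state["unk"]:   # nothing left to refine
            break
        step(state)
    return state
```

Because the state survives across active-learning rounds, resuming after a new label repeats no work beyond re-checking the boxes that were marked unknown.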
Complete Query Selection. Active learning and evaluation of \(F_1\) scores (in Sect. 5) both require complete queries with specific parameters, rather than sketches with boxes of parameters. Since the set C of consistent queries is infinite, we instead use one query for each sketch that is known to have consistent parameters (sketches where search timed out are thus not included). For each such sketch, we pick the middle of the box of known-consistent parameters.
5 Evaluation
We demonstrate how our approach can be used to synthesize queries to solve interesting tasks: in particular, we show that (i) given just a few initial examples, it can synthesize queries that achieve good performance on a heldout test set, and (ii) our optimizations significantly reduce the synthesis time.
5.1 Experimental Setup
Datasets. We evaluate on two datasets of object trajectories: YTStreams [7], consisting of video and extracted object trajectories from fixed-position traffic cameras, and MABe22 [72], consisting of trajectories of up to three mice interacting in a laboratory setting. We also evaluate on a synthetic maritime surveillance task from the STL synthesis literature [44]. On YTStreams, we use two traffic cameras, one in Tokyo and one in Warsaw, and we consider single cars or pairs of cars. On MABe22, we consider pairs of mice. For the predicates used, see Table 1 for YTStreams, Appendix Table 5 for MABe22, and Appendix Table 6 for maritime surveillance.
Tasks. On YTStreams, we manually wrote 5 ground truth queries. Several queries apply to multiple configurations (e.g., different pairs of lanes), resulting in 10 queries total (tasks H–Q in Table 2). The real-valued parameters were chosen manually, by visually examining whether they selected the desired trajectories. These queries cover a wide range of behaviors; for instance, they capture behaviors such as human drivers making unprotected turns, an important challenge for autonomous cars [64], as well as cars trying to pass [66]. We show examples of trajectories selected by three of our multi-object queries in Fig. 6. MABe22 describes 9 queries for scientifically interesting mouse behavior. We implemented the 6 most complex to use as ground truth queries (tasks A–F in Appendix Table 7). The maritime surveillance task has trajectory labels and so does not need a ground truth query (task G).
Synthesis. For each task, we divide the set Z of all trajectories into a train set \(Z_\text {train}\) and a test set \(Z_\text {test}\), using trajectories in the first half of the video for training, and those in the second half for testing. We randomly sample a set of initial labeled examples W from \(Z_{\text {train}}\), with 2 samples being positive and 10 being negative, and then actively label 25 additional examples from \(Z_{{\text {train}}}\). For YTStreams and MABe22, labels are from the ground truth query. For tractability, we limit search to sketches with at most three predicates, at most two of which may have parameters. In most cases, this excludes the ground truth from the search space.
5.2 Accuracy of Synthesized Queries
We show that Quivr synthesizes accurate queries from just a few labeled examples. We evaluate the \(F_1\) score of the synthesized queries on \(Z_{\text {test}}\). Recall that our algorithm returns a list C of consistent queries; we report the median \(F_1\) score across \(Q\in C\).
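For concreteness, the reported metric can be computed as follows; this is a minimal sketch in which queries are callables returning 0/1 labels (our encoding, not Quivr's).

```python
from statistics import median

def f1(preds, labels):
    """F1 score from parallel 0/1 prediction and label lists."""
    tp = sum(p and l for p, l in zip(preds, labels))          # true positives
    fp = sum(p and not l for p, l in zip(preds, labels))      # false positives
    fn = sum(not p and l for p, l in zip(preds, labels))      # false negatives
    if tp == 0:
        return 0.0
    prec, rec = tp / (tp + fp), tp / (tp + fn)
    return 2 * prec * rec / (prec + rec)

def median_f1(queries, test_set, labels):
    """Median F1 across a list of consistent queries (illustrative)."""
    return median(f1([q(z) for z in test_set], labels) for q in queries)
```

Reporting the median rather than the best score avoids favorably cherry-picking among the consistent queries.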
Baselines. We compare to (i) an ablation where we replace our active learning strategy with one that labels a trajectory chosen uniformly at random from the remaining unlabeled training examples; (ii) an LSTM [16, 33] neural network; and (iii) a transformer neural network [26, 29, 77]. Because neural networks perform poorly on such small datasets, we pretrain the LSTM on an auxiliary task, namely, trajectory forecasting [43]. Then, we freeze the hidden representation of the learned LSTM and use it as features to train a logistic regression model on our labeled examples. The neural network baselines perform active learning by selecting, among the unlabeled trajectories, the one with the highest predicted probability of being positive.
Results. We show the \(F_1\) score of each of the 17 queries in Table 3 after 0, 5, 10, and 25 steps of active learning. After just 10 steps, our approach achieves an \(F_1\) score above 0.99 on 10 of the 17 queries, and after 25 steps, it yields an \(F_1\) score above 0.9 on all but 2 queries. Thus, Quivr is able to synthesize accurate queries with relatively little user input. The neural networks achieve poor performance, particularly on the more difficult queries.
5.3 Synthesis Running Time
Next, we show that quantitative pruning and using a GPU each significantly reduce synthesis time, evaluating total running time for 25 steps of active learning.
Ablations. We compare to two ablations: (i) using the binary search approach of [53] to find pruning pairs, rather than using our quantitative semantics, and (ii) evaluating the matrix semantics (Appendix A.1) on a CPU rather than a GPU.
Results. In Fig. 4, we report the running time of our algorithms on a CPU (\(2{\times }\) AMD EPYC 7402 24Core) and a GPU (\(1{\times }\) NVIDIA RTX A6000). For binary search, on average, the GPU is \(7.6{\times }\) faster than the CPU. On a GPU, using the quantitative semantics rather than binary search offers another \(5.0{\times }\) speedup.
6 Related Work
Monotonicity for Parameter Pruning. We build on [49] for our parameter pruning algorithm. Their approach has been applied to synthesizing STL formulas for sequence classification by first enumerating sketches and then using monotonicity to find parameters, similar to our binary search baseline [53]. We replace binary search with our novel strategy based on quantitative semantics, leading to a \(5.0{\times }\) speedup. There is also work building on [49] to create logically relevant distance metrics between trajectories by taking the Hausdorff distance between parameter satisfaction regions (which they call “validity domains”), with applications to clustering [78]. For logics like STL, our quantitative semantics could provide a speedup to their approach.
Synthesis of Temporal Logic Formulas. More broadly, there has been work synthesizing parameters in a variant of STL by discretizing the parameter space and then walking the satisfaction boundary [24]; in one dimension, their approach becomes binary search, inheriting its shortcomings. There has been work on synthesizing STL formulas that are satisfied by a closedloop control model [38], but they assume the ability to find counterexample traces for incorrect STL formulas, which is not applicable to our setting. Another approach is to synthesize parameters in STL formulas using gradientbased optimization [35] or stochastic optimization [45], but we found these methods to be ineffective in our setting, and they do not come with either soundness or completeness guarantees. There is work using decision trees to synthesize STL formulas [1, 14, 44, 48], but these operate on a restricted subset of STL, namely Boolean combinations of a fixed set of template formulas. This restriction prevents these approaches from synthesizing temporal structure, which is a key component of the queries in our domains. Finally, there has been work on active learning of STL formulae using decision trees [48], but it assumes the ability to query for equivalence between a particular STL formula and the ground truth, which is not possible in our setting.
Synthesizing Constants. There is work on synthesizing parameters of programs using counterexample-guided inductive synthesis and different theory solvers, including Fourier–Motzkin variable elimination and an SMT solver [2]. Though our synthesis objective can be encoded in the theory of linear arithmetic, it is extremely large, and we have found such solvers to be ineffective in practice.
Querying Video Data. There has been recent work on querying object detections and trajectories in video data [5, 7, 8, 28, 40,41,42, 54]. The main difference is our focus on synthesis; in addition, these approaches focus on SQL-like operators (e.g., select, inner-join, and group-by) over predefined predicates, which cannot capture compositions such as the sequencing and iteration operators in our language that are necessary for identifying more complex behaviors.
Neurosymbolic Models. There has been recent work on leveraging program synthesis in the context of machine learning. For instance, there has been work on using programs to represent high-level structure in images [21,22,23, 74, 84], for reinforcement learning [9, 34, 79, 80], and for querying websites [18]; in contrast, we use programs to classify trajectories. The most closely related work is on synthesizing functional programs operating over lists [67, 76]. Our language includes key constructs not included in their languages. Most importantly, we include sequencing; in their functional language, such an operator would need to be represented as a nested series of if-then-else operators. In addition, their language does not support predicates that match subsequences; while such a predicate could be added, none of their operators can compose such predicates.
Quantitative Synthesis. There has been work on program synthesis with quantitative properties—e.g., on synthesis for producing optimized code [37, 57, 65], for approximate computing [15, 52], for probabilistic programming [56], and for embedded control [17]. These approaches largely focus on search-based synthesis, either using constraint optimization [52], continuous optimization [17], enumerative search [15, 57], or stochastic search [37, 56, 65]. While we leverage ideas from this literature, our pruning strategy based on quantitative semantics is novel.
Quantitative Semantics. Our quantitative semantics is similar to the “robustness degree” [25] of a temporal logic formula. The difference is that, by adjusting the denotations of the base predicates, our quantitative semantics gives a parameter on the satisfaction boundary. More broadly, there has been work on quantitative semantics for temporal logic for robust constraint satisfaction [20, 25, 73], and to guide reinforcement learning [39]. There has been work on quantitative regular expressions (QREs) [4], though in general, QREs cannot be efficiently evaluated due to their nondeterminism, and our language is restricted to ensure efficient computation. There has been work on synthesizing QREs for network traffic classification [68], using binary search to compute decision thresholds. Similarly, there has been work using the Viterbi semiring to obtain quantitative semantics for Datalog programs [69], which they use in conjunction with gradient descent to learn the rules of the Datalog program. In contrast, we use our quantitative semantics to efficiently prune the parameter search space in a provably correct way. Finally, there has been work on using GPUs to evaluate regular expressions [55]; however, they focus on regular expressions over strings.
Query Languages. Our language is closely related to both signal temporal logic (STL) [50] and Kleene algebra with tests (KAT) [46]. In particular, it can straightforwardly be extended to subsume both (see Appendix A for details), and our pruning strategy applies to this extended language. The addition of Kleene star, required to subsume KAT, worsens the evaluation time. STL has been used to monitor safety requirements for autonomous vehicles [32]. Spatio-Temporal Perception Logic (STPL) is an extension of STL that supports spatial reasoning [31]. Many of its operators are monotone and thus would benefit from our algorithm. Scenic [27] is a DSL for creating static and dynamic driving scenes, but its focus is on generating scenes rather than querying for behaviors.
7 Conclusion
We have proposed a novel framework called Quivr for synthesizing queries over video trajectory data. Our language is similar to KAT and STL, but supports conjunction and sequencing over multi-step predicates. Given only a few examples, Quivr efficiently synthesizes trajectory queries consistent with those examples. A key contribution of our approach is the use of a quantitative semantics to prune the parameter search space, yielding a \(5.0{\times }\) speedup over the state-of-the-art. In our evaluation, we demonstrate that Quivr effectively synthesizes queries to identify interesting driving behaviors, and that our optimizations dramatically reduce synthesis time.
Notes
 1.
Quivr stands for QUery Induction for Video tRajectories.
References
Aasi, E., Vasile, C.I., Bahreinian, M., Belta, C.: Classification of time-series data using boosted decision trees. In: 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 1263–1268 (2022). https://doi.org/10.1109/IROS47612.2022.9982105
Abate, A., David, C., Kesseli, P., Kroening, D., Polgreen, E.: Counterexample guided inductive synthesis modulo theories. In: Chockler, H., Weissenbacher, G. (eds.) CAV 2018. LNCS, vol. 10981, pp. 270–288. Springer, Cham (2018). https://doi.org/10.1007/9783319961453_15
Alur, R., et al.: Syntaxguided synthesis. IEEE (2013)
Alur, R., Mamouras, K., Ulus, D.: Derivatives of quantitative regular expressions. In: Aceto, L., Bacci, G., Bacci, G., Ingólfsdóttir, A., Legay, A., Mardare, R. (eds.) Models, Algorithms, Logics and Tools. LNCS, vol. 10460, pp. 75–95. Springer, Cham (2017). https://doi.org/10.1007/9783319631219_4
Bastani, F., et al.: SkyQuery: optimizing video queries over UAVs (2019)
Bastani, F., et al.: SkyQuery: optimizing video queries over UAVs. In: Onward! (2021)
Bastani, F., et al.: MIRIS: fast object track queries in video. In: Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, pp. 1907–1921 (2020)
Bastani, F., Moll, O., Madden, S.: Vaas: video analytics at scale. Proc. VLDB Endow. 13(12), 2877–2880 (2020)
Bastani, O., Pu, Y., SolarLezama, A.: Verifiable reinforcement learning via policy extraction. In: Advances in Neural Information Processing Systems, pp. 2494–2504 (2018)
Bergmann, P., Meinhardt, T., Leal-Taixé, L.: Tracking without bells and whistles. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 941–951 (2019)
Bertinetto, L., Valmadre, J., Henriques, J.F., Vedaldi, A., Torr, P.H.S.: Fully-convolutional Siamese networks for object tracking. In: Hua, G., Jégou, H. (eds.) ECCV 2016. LNCS, vol. 9914, pp. 850–865. Springer, Cham (2016). https://doi.org/10.1007/9783319488813_56
Betke, M., Hirsh, D.E., Bagchi, A., Hristov, N.I., Makris, N.C., Kunz, T.H.: Tracking large variable numbers of objects in clutter. In: 2007 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8. IEEE (2007)
Bock, J., Krajewski, R., Moers, T., Runde, S., Vater, L., Eckstein, L.: The inD dataset: a drone dataset of naturalistic road user trajectories at German intersections. arXiv preprint arXiv:1911.07602 (2019)
Bombara, G., Belta, C.: Offline and online learning of signal temporal logic formulae using decision trees. ACM Trans. Cyber-Phys. Syst. 5(3) (2021). https://doi.org/10.1145/3433994
Bornholt, J., Torlak, E., Grossman, D., Ceze, L.: Optimizing synthesis with metasketches. In: Proceedings of the 43rd Annual ACM SIGPLANSIGACT Symposium on Principles of Programming Languages, pp. 775–788 (2016)
Chang, M.F., et al.: Argoverse: 3D tracking and forecasting with rich maps. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
Chaudhuri, S., Clochard, M., SolarLezama, A.: Bridging Boolean and quantitative synthesis using smoothed proof search. In: Proceedings of the 41st ACM SIGPLANSIGACT Symposium on Principles of Programming Languages, pp. 207–220 (2014)
Chen, Q., Lamoreaux, A., Wang, X., Durrett, G., Bastani, O., Dillig, I.: Web question answering with neurosymbolic program synthesis. In: Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation, pp. 328–343 (2021)
Dasgupta, S.: Analysis of a greedy active learning strategy. In: Advances in Neural Information Processing Systems, pp. 337–344 (2005)
Deshmukh, J.V., Donzé, A., Ghosh, S., Jin, X., Juniwal, G., Seshia, S.A.: Robust online monitoring of signal temporal logic. Formal Meth. Syst. Des. 51(1), 5–30 (2017). https://doi.org/10.1007/s1070301702867
Ellis, K., Nye, M., Pu, Y., Sosa, F., Tenenbaum, J., SolarLezama, A.: Write, execute, assess: program synthesis with a repl. In: Advances in Neural Information Processing Systems, pp. 9169–9178 (2019)
Ellis, K., Ritchie, D., SolarLezama, A., Tenenbaum, J.: Learning to infer graphics programs from handdrawn images. In: Advances in Neural Information Processing Systems, pp. 6059–6068 (2018)
Ellis, K., SolarLezama, A., Tenenbaum, J.: Unsupervised learning by program synthesis. In: Advances in Neural Information Processing Systems, pp. 973–981 (2015)
Ergurtuna, M., Gol, E.A.: An efficient formula synthesis method with past signal temporal logic. IFAC-PapersOnLine 52(11), 43–48 (2019). https://doi.org/10.1016/j.ifacol.2019.09.116. https://www.sciencedirect.com/science/article/pii/S2405896319307451
Fainekos, G.E., Pappas, G.J.: Robustness of temporal logic specifications for continuoustime signals. Theoret. Comput. Sci. 410(42), 4262–4291 (2009)
Franco, L., Placidi, L., Giuliari, F., Hasan, I., Cristani, M., Galasso, F.: Under the hood of transformer networks for trajectory forecasting. Pattern Recogn. 138, 109372 (2023). https://doi.org/10.1016/j.patcog.2023.109372. https://www.sciencedirect.com/science/article/pii/S0031320323000730
Fremont, D.J., Dreossi, T., Ghosh, S., Yue, X., SangiovanniVincentelli, A.L., Seshia, S.A.: Scenic: a language for scenario specification and scene generation. In: Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation, pp. 63–78 (2019)
Fu, D.Y., et al.: Rekall: specifying video events using compositions of spatiotemporal labels. arXiv preprint arXiv:1910.02993 (2019)
Giuliari, F., Hasan, I., Cristani, M., Galasso, F.: Transformer networks for trajectory forecasting (2020)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
Hekmatnejad, M., Hoxha, B., Deshmukh, J.V., Yang, Y., Fainekos, G.: Formalizing and evaluating requirements of perception systems for automated vehicles using spatiotemporal perception logic (2022)
Hekmatnejad, M., et al.: Encoding and monitoring responsibility sensitive safety rules for automated vehicles in signal temporal logic. In: Proceedings of the 17th ACMIEEE International Conference on Formal Methods and Models for System Design, MEMOCODE 2019. Association for Computing Machinery, New York (2019). https://doi.org/10.1145/3359986.3361203
Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
Inala, J.P., Bastani, O., Tavares, Z., SolarLezama, A.: Synthesizing programmatic policies that inductively generalize. In: International Conference on Learning Representations (2019)
Jha, S., Tiwari, A., Seshia, S.A., Sahai, T., Shankar, N.: TeLEx: learning signal temporal logic from positive examples using tightness metric. Formal Meth. Syst. Des. 54(3), 364–387 (2019). https://doi.org/10.1007/s10703019003321
Ji, R., Liang, J., Xiong, Y., Zhang, L., Hu, Z.: Question selection for interactive program synthesis. In: Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation, pp. 1143–1158 (2020)
Jia, Z., Thomas, J., Warszawski, T., Gao, M., Zaharia, M., Aiken, A.: Optimizing DNN computation with relaxed graph substitutions. In: Proceedings of the 2nd Conference on Systems and Machine Learning, SysML 2019 (2019)
Jin, X., Donzé, A., Deshmukh, J.V., Seshia, S.A.: Mining requirements from closedloop control models. IEEE Trans. Comput. Aided Des. Integr. Circ. Syst. 34(11), 1704–1717 (2015). https://doi.org/10.1109/TCAD.2015.2421907
Jothimurugan, K., Alur, R., Bastani, O.: A composable specification language for reinforcement learning tasks. In: Advances in Neural Information Processing Systems, pp. 13041–13051 (2019)
Kang, D., Bailis, P., Zaharia, M.: BlazeIt: optimizing declarative aggregation and limit queries for neural network-based video analytics. Proc. VLDB Endow. 13(4), 533–546 (2019)
Kang, D., Mathur, A., Veeramacheneni, T., Bailis, P., Zaharia, M.: Jointly optimizing preprocessing and inference for DNNbased visual analytics. arXiv preprint arXiv:2007.13005 (2020)
Kang, D., Raghavan, D., Bailis, P., Zaharia, M.: Model assertions for monitoring and improving ML models. arXiv preprint arXiv:2003.01668 (2020)
Kitani, K.M., Ziebart, B.D., Bagnell, J.A., Hebert, M.: Activity forecasting. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012. LNCS, vol. 7575, pp. 201–214. Springer, Heidelberg (2012). https://doi.org/10.1007/9783642337659_15
Kong, Z., Jones, A., Belta, C.: Temporal logics for learning and detection of anomalous behavior. IEEE Trans. Autom. Control 62(3), 1210–1222 (2017). https://doi.org/10.1109/TAC.2016.2585083
Kong, Z., Jones, A., Medina Ayala, A., Aydin Gol, E., Belta, C.: Temporal logic inference for classification and prediction from data. In: Proceedings of the 17th International Conference on Hybrid Systems: Computation and Control, HSCC 2014, pp. 273–282. Association for Computing Machinery, New York (2014). https://doi.org/10.1145/2562059.2562146
Kozen, D.: Kleene algebra with tests. ACM Trans. Program. Lang. Syst. (TOPLAS) 19(3), 427–443 (1997)
Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. Commun. ACM 60(6), 84–90 (2017)
Linard, A., Tumova, J.: Active learning of signal temporal logic specifications. In: 2020 IEEE 16th International Conference on Automation Science and Engineering (CASE), pp. 779–785 (2020). https://doi.org/10.1109/CASE48305.2020.9216778
Maler, O.: Learning monotone partitions of partially-ordered domains (Work in Progress), July 2017. Working paper or preprint. https://hal.science/hal01556243
Maler, O., Nickovic, D.: Monitoring temporal properties of continuous signals. In: Lakhnech, Y., Yovine, S. (eds.) FORMATS/FTRTFT  2004. LNCS, vol. 3253, pp. 152–166. Springer, Heidelberg (2004). https://doi.org/10.1007/9783540302063_12
Mell, S., Bastani, F., Zdancewic, S., Bastani, O.: Synthesizing trajectory queries from examples. Technical report, MSCIS2302, Department of Computer and Information Science, University of Pennsylvania, Philadelphia, Pennsylvania, July 2023
Misailovic, S., Carbin, M., Achour, S., Qi, Z., Rinard, M.C.: Chisel: reliability- and accuracy-aware optimization of approximate computational kernels. ACM SIGPLAN Not. 49(10), 309–328 (2014)
Mohammadinejad, S., Deshmukh, J.V., Puranic, A.G., Vazquez-Chanlatte, M., Donzé, A.: Interpretable classification of time-series data using efficient enumerative techniques. In: Proceedings of the 23rd International Conference on Hybrid Systems: Computation and Control, HSCC 2020. Association for Computing Machinery, New York (2020). https://doi.org/10.1145/3365365.3382218
Moll, O., Bastani, F., Madden, S., Stonebraker, M., Gadepally, V., Kraska, T.: ExSample: efficient searches on video repositories through adaptive sampling. arXiv preprint arXiv:2005.09141 (2020)
Naghmouchi, J., Scarpazza, D.P., Berekovic, M.: Small-ruleset regular expression matching on GPGPUs: quantitative performance analysis and optimization. In: Proceedings of the 24th ACM International Conference on Supercomputing, pp. 337–348 (2010)
Nori, A.V., Ozair, S., Rajamani, S.K., Vijaykeerthy, D.: Efficient synthesis of probabilistic programs. ACM SIGPLAN Not. 50(6), 208–217 (2015)
Phothilimthana, P.M., Thakur, A., Bodik, R., Dhurjati, D.: Scaling up superoptimization. In: Proceedings of the 21st International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 297–310 (2016)
Preiss, J.A., Hönig, W., Sukhatme, G.S., Ayanian, N.: Crazyswarm: a large nano-quadcopter swarm. In: 2017 IEEE International Conference on Robotics and Automation (ICRA), pp. 3299–3304. IEEE (2017)
Redmon, J., Farhadi, A.: YOLOv3: an incremental improvement. arXiv preprint arXiv:1804.02767 (2018)
Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: Advances in Neural Information Processing Systems, pp. 91–99 (2015)
Robicquet, A., Sadeghian, A., Alahi, A., Savarese, S.: Learning social etiquette: human trajectory understanding in crowded scenes. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9912, pp. 549–565. Springer, Cham (2016). https://doi.org/10.1007/9783319464848_33
Roy, N., McCallum, A.: Toward optimal active learning through sampling estimation of error reduction. In: International Conference on Machine Learning (2001)
Sadigh, D., Sastry, S.S., Seshia, S.A., Dragan, A.: Information gathering actions over human internal state. In: 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 66–73. IEEE (2016)
Sadigh, D., Sastry, S., Seshia, S.A., Dragan, A.D.: Planning for autonomous cars that leverage effects on human actions. In: Robotics: Science and Systems, vol. 2. Ann Arbor, MI, USA (2016)
Schkufza, E., Sharma, R., Aiken, A.: Stochastic superoptimization. ACM SIGARCH Comput. Arch. News 41(1), 305–316 (2013)
Schmerling, E., Leung, K., Vollprecht, W., Pavone, M.: Multimodal probabilistic modelbased planning for humanrobot interaction. In: 2018 IEEE International Conference on Robotics and Automation (ICRA), pp. 1–9. IEEE (2018)
Shah, A., Zhan, E., Sun, J., Verma, A., Yue, Y., Chaudhuri, S.: Learning differentiable programs with admissible neural heuristics. In: Advances in Neural Information Processing Systems, vol. 33 (2020)
Shi, L., Li, Y., Loo, B.T., Alur, R.: Network traffic classification by program synthesis. In: TACAS 2021. LNCS, vol. 12651, pp. 430–448. Springer, Cham (2021). https://doi.org/10.1007/9783030720162_23
Si, X., Raghothaman, M., Heo, K., Naik, M.: Synthesizing datalog programs using numerical relaxation. In: IJCAI (2019)
Silver, D., et al.: Mastering the game of Go with deep neural networks and tree search. Nature 529(7587), 484–489 (2016)
SolarLezama, A.: Program Synthesis by Sketching. University of California, Berkeley (2008)
Sun, J.J., et al.: The MABe22 benchmarks for representation learning of multiagent behavior. June 2022. https://doi.org/10.22002/D1.20186
Tabuada, P., Neider, D.: Robust linear temporal logic. In: Computer Science Logic 2016 (2016)
Tian, Y., et al.: Learning to infer and execute 3D shape programs. arXiv preprint arXiv:1901.02875 (2019)
Tweed, D., Calway, A.: Tracking multiple animals in wildlife footage. In: Object Recognition Supported by User Interaction for Service Robots, vol. 2, pp. 24–27. IEEE (2002)
Valkov, L., Chaudhari, D., Srivastava, A., Sutton, C., Chaudhuri, S.: HOUDINI: lifelong learning as program synthesis. In: Advances in Neural Information Processing Systems, pp. 8687–8698 (2018)
Vaswani, A., et al.: Attention is all you need. In: Guyon, I., Luxburg, U.V., et al. (eds.) Advances in Neural Information Processing Systems, vol. 30. Curran Associates, Inc. (2017). https://proceedings.neurips.cc/paper_files/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf
Vazquez-Chanlatte, M., Ghosh, S., Deshmukh, J.V., Sangiovanni-Vincentelli, A., Seshia, S.A.: Time-series learning using monotonic logical properties. In: Colombo, C., Leucker, M. (eds.) RV 2018. LNCS, vol. 11237, pp. 389–405. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-03769-7_22
Verma, A., Le, H., Yue, Y., Chaudhuri, S.: Imitation-projected programmatic reinforcement learning. In: Advances in Neural Information Processing Systems, pp. 15752–15763 (2019)
Verma, A., Murali, V., Singh, R., Kohli, P., Chaudhuri, S.: Programmatically interpretable reinforcement learning. In: ICML (2018)
Weinstein, A., Cho, A., Loianno, G., Kumar, V.: Visual inertial odometry swarm: an autonomous swarm of visionbased quadrotors. IEEE Robot. Autom. Lett. 3(3), 1801–1807 (2018)
Wishart, J., et al.: Driving safety performance assessment metrics for ADS-equipped vehicles. SAE Technical Paper 2(2020-01-1206) (2020)
Wojke, N., Bewley, A., Paulus, D.: Simple online and real-time tracking with a deep association metric. In: 2017 IEEE International Conference on Image Processing (ICIP), pp. 3645–3649. IEEE (2017)
Young, H., Bastani, O., Naik, M.: Learning neurosymbolic generative models via program synthesis. In: ICML (2019)
Zhan, E., Tseng, A., Yue, Y., Swaminathan, A., Hausknecht, M.: Learning calibratable policies using programmatic style-consistency. In: ICML (2020)
Acknowledgements
We thank the anonymous reviewers for their helpful feedback. This work was supported in part by NSF Award CCF-1910769, NSF Award CCF-1917852, and ARO Award W911NF-20-1-0080.
Rights and permissions
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Copyright information
© 2023 The Author(s)
Mell, S., Bastani, F., Zdancewic, S., Bastani, O. (2023). Synthesizing Trajectory Queries from Examples. In: Enea, C., Lal, A. (eds.) Computer Aided Verification. CAV 2023. Lecture Notes in Computer Science, vol. 13964. Springer, Cham. https://doi.org/10.1007/978-3-031-37706-8_23
Print ISBN: 978-3-031-37705-1
Online ISBN: 978-3-031-37706-8