1 Introduction

Model checking is one of the most successful techniques for the verification of software programs. It consists in the exhaustive verification of the mathematical model of a program against a specification of its desired behavior. The kind of properties that can be proved in this way depends both on the formalism employed to model the program, and on the one used to express the specification. The initial and most classical frameworks consist in the use of operational formalisms, such as Transition Systems and Finite State Automata (generally Büchi automata) for the model, and temporal logics such as Linear-time Temporal Logic (LTL), Computation-Tree Logic (CTL) and CTL* for the specification [24]. The success of such logics is due to their ease in reasoning about linear or branching sequences of events over time, by expressing liveness and safety properties, their conciseness with respect to automata, and the complexity of their model checking.

In this paper we consider linear-time temporal domains. LTL limits its set of expressible properties to the First-Order Logic (FOL) definable fragment of regular languages. This is quite restrictive when compared with the most popular abstract models of procedural programs, such as Pushdown Systems, Boolean Programs [10], and Recursive State Machines [3]. All such stack-based formalisms show behaviors that are expressible by means of Context-Free Languages (CFL), rather than regular ones. State and configuration reachability, fair computation problems, and model checking of regular specifications have been thoroughly studied for such formalisms [3, 4, 13, 17, 28, 30, 32, 40, 51, 55]. To expand the expressive power of specification languages too, [12, 14] augmented LTL with Presburger arithmetic constraints on the occurrences of states, obtaining a logic capable of even some context-sensitive specifications, but with only restricted decidable fragments. [41] introduced model checking of pushdown tree automata specifications on regular systems, and Dynamic Logic was extended to some limited classes of CFL [34]. Decision procedures for different kinds of regular constraints on stack contents have been given in [18, 29, 37].

A coherent approach came with the introduction of temporal logics based on Visibly Pushdown Languages (VPL) [7], a.k.a. Input-Driven Languages [47]. Such logics, namely CaRet [6] and its FO-complete successor NWTL [2], model the execution trace of a procedural program as a Nested Word [8], consisting in a linear ordering augmented with a one-to-one matching relation between function calls and returns. They are the first ones featuring temporal modalities that explicitly refer to the nesting structure of CFL [4]. This enables requirement specifications to include Hoare-style pre/post-conditions, stack-inspection properties, and more. A \(\mu \)-calculus based on VPL extends model checking to branching-time semantics in [5], while [16] introduces a temporal logic capturing the whole class of VPL. Timed extensions of CaRet are given in [15].

VPL too have their limitations. They are more general than Parenthesis Languages [46], but their matching relation is essentially constrained to be one-to-one [43]. This hinders their suitability to model processes in which a single event must be put in relation with multiple ones. Unfortunately, computer programs often present such behaviors: exceptions and continuations are single events that cause the termination (or re-instantiation) of multiple functions on the stack.

To reason about such behaviors, temporal logics based on Operator Precedence Languages (OPL) have been proposed [22]. OPL were initially introduced with the purpose of efficient parsing [31], a field in which they continue to offer useful applications [11]. They are capable of capturing the syntax of arithmetic expressions, and other constructs whose context-free structure is not immediately visible. The generality of the structure of their syntax trees is much greater than that of VPL, which are strictly included in OPL [25]. Nevertheless, they retain the same closure properties that make regular languages and VPL suitable for automata-theoretic model checking: OPL are closed under Boolean operations, concatenation, Kleene *, and language emptiness and inclusion are decidable [42]. They have been characterized by means of push-down automata, Monadic Second-Order Logic and, recently, by an extension of Regular Expressions [42, 44].

OPTL [22] is the first linear-time temporal logic for which a model checking procedure has been given on both finite and \(\omega \)-words of OPL. It enables reasoning on procedural programs with exceptions, expressing properties about whether a function can be terminated by an exception, or throw one, and also pre/post-conditions. NWTL can be translated into OPTL in linear time, thus the latter is capable of expressing all properties that can be formalized in CaRet and NWTL, and many more. [22] does not explore OPTL’s expressiveness further, and does not investigate the practical applicability of its model checking construction.

In this article, we introduce Precedence Oriented Temporal Logic (POTL), which redefines the syntax and semantics of OPTL to be much closer to the context-free structure of words. With POTL, it is much easier to navigate a word’s syntax tree, expressing requirements that are aware of its structure. From a more theoretical point of view, POTL is FO-complete whereas OPTL is not, so that CaRet, NWTL, OPTL and POTL constitute a strict hierarchy in terms of expressive power. Such a theoretical elaboration, however, is technically involved; thus, for length reasons, it is documented in a technical report [23].

In this paper, instead, we focus on the model-checking application of POTL. We provide a tableaux-construction procedure for model checking POTL, which yields nondeterministic automata of size at most singly exponential in the formula’s length, and is thus not asymptotically greater than that of LTL, NWTL and OPTL. We implemented such a procedure in a tool called POMC, which we evaluate on several interesting case studies. POMC’s performance is promising: almost all case studies are verified in seconds and with a reasonable memory consumption, with very few outliers. Such outliers are inevitable, due to the exponential complexity of the task.

The related work on tools is not as rich as the theoretical one. Tools and libraries such as VPAlib [48], VPAchecker [54], OpenNWA [27] and SymbolicAutomata [26] only implement operations such as union, intersection, universality/inclusion/emptiness check for Visibly Pushdown or Nested Word Automata, but have no model checking capabilities. PAL [19] uses nested-word based monitors to express program specifications, and a tool based on blast [36] implements its runtime monitoring and model checking. PAL follows the paradigm of program monitors, and is not—strictly speaking—a temporal logic. PTCaRet [52] is a past version of CaRet, and its runtime monitoring has been implemented in JavaMOP [20]. [49, 50] describe a tool for model checking programs against CaRet specifications. Since its purpose is malware detection, it targets program binaries directly by modeling them as Pushdown Systems. Unfortunately, this tool does not seem to be available online. To the best of our knowledge, POMC is the only publicly available tool for model-checking temporal logics capable of expressing context-free properties.

The paper is organized as follows: we give some background on OPL in Sect. 2, we introduce POTL in Sect. 3 and its model checking in Sect. 4, and we evaluate our prototype model checker in Sect. 5. Due to space constraints, we leave all formal proofs to a technical report [21].

2 Operator Precedence Languages

We assume some familiarity with classical formal language theory concepts such as context-free grammar, parsing, shift-reduce algorithm, syntax tree (ST) [33, 35]. Operator Precedence Languages (OPL) are usually defined through their generating grammars [31]; in this paper, however, we characterize them through their accepting automata [42] which are the natural way for stating equivalence properties with logic characterization, and for model checking. Readers not familiar with OPL may refer to [43] for more explanations on the following basic concepts; an explanatory example is also given at the end of this section.

Let \(\varSigma \) be a finite alphabet, and \(\varepsilon \) the empty string. We use a special symbol \(\# \not \in \varSigma \) to mark the beginning and the end of any string. An operator precedence matrix (OPM) M over \(\varSigma \) is a partial function \((\varSigma \cup \{\#\})^2 \rightarrow \{\lessdot , \doteq , \gtrdot \}\), that, for each ordered pair (a, b), defines the precedence relation (PR) M(a, b) holding between a and b. If the function is total we say that M is complete. We call the pair \((\varSigma , M)\) an operator precedence alphabet. Relations \(\lessdot , \doteq , \gtrdot \) are respectively named yields precedence, equal in precedence, and takes precedence. By convention, the initial # yields precedence, and other symbols take precedence on the ending #. If \(M(a,b) = \pi \), where \(\pi \in \{\lessdot , \doteq , \gtrdot \}\), we write \(a \mathrel {\pi }b\). For \(u,v \in \varSigma ^+\) we write \(u \mathrel {\pi }v\) if \(u = xa\) and \(v = by\) with \(a \mathrel {\pi }b\). The role of PR is to give structure to words: they can be seen as special and more concise parentheses, where e.g. one “closing” \(\gtrdot \) can match more than one “opening” \(\lessdot \). Despite their graphical appearance, PR are not ordering relations.
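As a concrete illustration of these definitions, the following is a minimal Haskell sketch of an OP alphabet (it is not taken from POMC; the symbol names anticipate the running example of Fig. 1): precedence relations, and an OPM as a partial function on ordered pairs of symbols, with the conventional entries for the delimiter #.

```haskell
-- Precedence relations: yields precedence, equal in precedence, takes precedence.
data Prec = Yields | Equal | Takes deriving (Eq, Show)

-- Terminal symbols of the running example, plus the delimiter #.
data Sym = Call | Ret | Han | Exc | Hash deriving (Eq, Show)

-- An OPM is a partial function on ordered pairs of symbols.
type OPM = Sym -> Sym -> Maybe Prec

-- Wrap an OPM with the delimiter convention: the initial # yields precedence
-- to every symbol, and every symbol takes precedence on the ending #.
withDelims :: OPM -> OPM
withDelims _ Hash Hash = Just Equal
withDelims _ Hash _    = Just Yields
withDelims _ _    Hash = Just Takes
withDelims m a    b    = m a b
```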

Definition 1

An operator precedence automaton (OPA) is a tuple \(\mathcal A = (\varSigma , M, Q, I, F, \delta ) \) where: \((\varSigma , M)\) is an operator precedence alphabet, Q is a finite set of states (disjoint from \(\varSigma \)), \(I \subseteq Q\) is the set of initial states, \(F \subseteq Q\) is the set of final states, \(\delta \subseteq Q \times ( \varSigma \cup Q) \times Q\) is the transition relation, which is the union of the three disjoint relations \(\delta _{{shift}}\subseteq Q \times \varSigma \times Q\), \(\delta _{{push}}\subseteq Q \times \varSigma \times Q\), and \(\delta _{{pop}}\subseteq Q \times Q \times Q\). An OPA is deterministic iff I is a singleton, and all three components of \(\delta \) are—possibly partial—functions.

To define the semantics of OPA, we need some new notation. Letters \(p, q, p_i, q_i, \dots \) denote states in Q. We write \(q_0 \xrightarrow {a} q_1\) for \((q_0, a, q_1) \in \delta _{{push}}\), \(q_0 \overset{a}{\dashrightarrow } q_1\) for \((q_0, a, q_1) \in \delta _{{shift}}\), \(q_0 \xRightarrow {q_2} q_1\) for \((q_0, q_2, q_1) \in \delta _{{pop}}\), and \(q_0 \rightsquigarrow ^{w} q_1\) if the automaton can read \(w \in \varSigma ^*\) going from \(q_0\) to \(q_1\). Let \(\varGamma = \varSigma \times Q\) and \(\varGamma ' = \varGamma \cup \{\bot \}\) be the stack alphabet; we denote symbols in \(\varGamma '\) as \([a,\ q]\) or \(\bot \). We set \(\mathop {smb}([a,\ q]) = a\), \(\mathop {smb}(\bot )=\#\), and \(\mathop {st}([a,\ q]) = q\). For a stack content \(\gamma = \gamma _n \dots \gamma _1 \bot \), with \(\gamma _i \in \varGamma \), \(n \ge 0\), we set \(\mathop {smb}(\gamma )= \mathop {smb}(\gamma _n)\) if \(n \ge 1\), \(\mathop {smb}(\gamma )= \#\) if \(n = 0\).

A configuration of an OPA is a triple \(c = \langle w, \ q, \ \gamma \rangle \), where \(w \in \varSigma ^*\#\), \(q \in Q\), and \(\gamma \in \varGamma ^*\bot \). A computation or run is a finite sequence of moves or transitions \(c_0 \vdash c_1 \vdash \dots \vdash c_n\). There are three kinds of moves, depending on the PR between the symbol on top of the stack and the next input symbol:

Push move: if \(\mathop {smb}(\gamma )\lessdot \ a\) then \(\langle ax, \ p, \ \gamma \rangle \vdash \langle x, \ q, \ [a, \ p]\gamma \rangle \), with \((p,a, q) \in \delta _{{push}}\);

Shift move: if \(a \doteq b\) then \(\langle bx, \ q, \ [a, \ p]\gamma \rangle \vdash \langle x, \ r, \ [b, \ p]\gamma \rangle \), with \((q,b,r) \in \delta _{{shift}}\);

Pop move: if \(a \gtrdot b\) then \(\langle bx, \ q, \ [a, \ p]\gamma \rangle \vdash \langle bx, \ r, \ \gamma \rangle \), with \((q, p, r) \in \delta _{{pop}}\).

Shift and pop moves are not performed when the stack contains only \(\bot \). Push moves put a new element on top of the stack consisting of the input symbol together with the current state of the OPA. Shift moves update the top element of the stack by changing its input symbol only. Pop moves remove the element on top of the stack, and update the state of the OPA according to \(\delta _{{pop}}\) on the basis of the current state of the OPA and the state of the removed stack symbol. They do not consume the input symbol, which is used only to establish the \(\gtrdot \) relation, remaining available for the next move. The OPA accepts the language \( L(\mathcal A) = \left\{ x \in \varSigma ^* \mid \langle x\#, \ q_I, \ \bot \rangle \vdash ^* \langle \#, \ q_F, \ \bot \rangle , q_I \in I, q_F \in F \right\} . \)
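The three kinds of moves translate directly into a one-step transition function. The sketch below is only an illustration of the semantics just described, not POMC code; the types of the previous sketch are repeated so that the fragment is self-contained, and the state type q is left abstract.

```haskell
data Prec = Yields | Equal | Takes deriving (Eq, Show)
data Sym  = Call | Ret | Han | Exc | Hash deriving (Eq, Show)
type OPM  = Sym -> Sym -> Maybe Prec

-- A (nondeterministic) OPA with states of type q.
data OPA q = OPA
  { opm        :: OPM
  , deltaPush  :: q -> Sym -> [q]
  , deltaShift :: q -> Sym -> [q]
  , deltaPop   :: q -> q   -> [q]
  }

-- Stack symbols are pairs [a, q]; the bottom symbol ⊥ is the empty list.
type Stack q = [(Sym, q)]

-- Topmost terminal symbol of the stack (# if the stack contains only ⊥).
smb :: Stack q -> Sym
smb []           = Hash
smb ((a, _) : _) = a

-- All moves enabled in configuration (remaining input, state, stack).
step :: OPA q -> ([Sym], q, Stack q) -> [([Sym], q, Stack q)]
step opa (b : bs, p, stack) =
  case opm opa (smb stack) b of
    Just Yields ->                                  -- push: store [b, p]
      [ (bs, q, (b, p) : stack) | q <- deltaPush opa p b ]
    Just Equal  -> case stack of                    -- shift: update the top symbol only
      (_, s) : rest -> [ (bs, q, (b, s) : rest) | q <- deltaShift opa p b ]
      []            -> []                           -- never performed on ⊥
    Just Takes  -> case stack of                    -- pop: do not consume the input
      (_, s) : rest -> [ (b : bs, q, rest) | q <- deltaPop opa p s ]
      []            -> []                           -- never performed on ⊥
    Nothing     -> []                               -- undefined PR: no move
step _ ([], _, _) = []
```

Acceptance then corresponds to iterating step from \(\langle x\#, \ q_I, \ \bot \rangle \) until a configuration \(\langle \#, \ q_F, \ \bot \rangle \) with \(q_F \in F\) is reached.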

We now introduce the concept of chain, which makes the connection between OP relations and context-free structure explicit, through brackets.

Definition 2

A simple chain \( {}^{c_0}[ c_1 c_2 \dots c_\ell ]{}^{c_{\ell +1}} \) is a string \(c_0 c_1 c_2 \dots c_\ell c_{\ell +1}\), such that: \(c_0, c_{\ell +1} \in \varSigma \cup \{\#\}\), \(c_i \in \varSigma \) for every \(i = 1,2, \dots \ell \) (\(\ell \ge 1\)), and \(c_0 \lessdot c_1 \doteq c_2 \dots c_{\ell -1} \doteq c_\ell \gtrdot c_{\ell +1}\). A composed chain is a string \(c_0 s_0 c_1 s_1 c_2 \dots c_\ell s_\ell c_{\ell +1}\), where \({}^{c_0}[ c_1 c_2 \dots c_\ell ]{}^{c_{\ell +1}}\) is a simple chain, and \(s_i \in \varSigma ^*\) is the empty string or is such that \({}^{c_i}[ s_i ]{}^{c_{i+1}}\) is a chain (simple or composed), for every \(i = 0,1, \dots , \ell \) (\(\ell \ge 1\)). Such a composed chain will be written as \({}^{c_0}[ s_0 c_1 s_1 c_2 \dots c_\ell s_\ell ]{}^{c_{\ell +1}}\). \(c_0\) (resp. \(c_{\ell +1}\)) is called its left (resp. right) context; all symbols between them form its body.

Fig. 1. OPM \(M_\mathbf {call}\) (left) and a string with chains shown by brackets (right).

A finite word w over \(\varSigma \) is compatible with an OPM M iff for each pair of letters c, d, consecutive in w, M(c, d) is defined and, for each substring x of \(\# w \#\) that is a chain of the form \(^a[y]^b\), M(a, b) is defined.

Chains can be identified through the traditional operator precedence parsing algorithm. We apply it to the sample word \(w_{ex} = \mathbf {call}\ \mathbf {han}\ \mathbf {call}\ \mathbf {call}\ \mathbf {exc}\ \mathbf {call}\ \mathbf {ret}\ \mathbf {ret}\), which is compatible with \(M_\mathbf {call}\) (for a more complete treatment, cf. [33, 43]). First, write all precedence relations between consecutive characters, according to \(M_{\mathbf {call}}\). Then, recognize all innermost patterns of the form \(a \lessdot c \doteq \dots \doteq c \gtrdot b\) as simple chains, and remove their bodies. Then, write the precedence relations between the left and right contexts of the removed body, a and b, and iterate this process until only ## remains. This procedure is applied to \(w_{ex}\) as follows:

$$ \begin{array}{l | l} 1 &{} \# \lessdot \mathbf {call}\lessdot \mathbf {han}\lessdot \mathbf {call}\lessdot \underline{\mathbf {call}} \gtrdot \mathbf {exc}\gtrdot \mathbf {call}\doteq \mathbf {ret}\gtrdot \mathbf {ret}\gtrdot \# \\ 2 &{} \# \lessdot \mathbf {call}\lessdot \mathbf {han}\lessdot \underline{\mathbf {call}} \gtrdot \mathbf {exc}\gtrdot \mathbf {call}\doteq \mathbf {ret}\gtrdot \mathbf {ret}\gtrdot \# \\ 3 &{} \# \lessdot \mathbf {call}\lessdot \underline{\mathbf {han}} \doteq \underline{\mathbf {exc}} \gtrdot \mathbf {call}\doteq \mathbf {ret}\gtrdot \mathbf {ret}\gtrdot \# \\ 4 &{} \# \lessdot \mathbf {call}\lessdot \underline{\mathbf {call}} \doteq \underline{\mathbf {ret}} \gtrdot \mathbf {ret}\gtrdot \# \\ 5 &{} \# \lessdot \underline{\mathbf {call}} \doteq \underline{\mathbf {ret}} \gtrdot \# \\ 6 &{} \# \doteq \# \\ \end{array} $$

The chain body removed in each step is underlined. In step 1, \({}^{\mathbf {call}}[ \underline{\mathbf {call}} ]{}^{\mathbf {exc}}\) is a simple chain, so its body \(\underline{\mathbf {call}}\) is removed. Then, in step 2 we recognize the simple chain \({}^{\mathbf {han}}[ \underline{\mathbf {call}} ]{}^{\mathbf {exc}}\), which means \({}^{\mathbf {han}}[ \mathbf {call}[\mathbf {call}] ]{}^{\mathbf {exc}}\), where \([\mathbf {call}]\) is the chain body removed in step 1, is a composed chain. This way, we recognize, e.g., \({}^{\mathbf {han}}[ \mathbf {call} ]{}^{\mathbf {exc}}\), \({}^{\mathbf {call}}[ \mathbf {han}\, \mathbf {exc} ]{}^{\mathbf {call}}\) as simple chains, and \({}^{\mathbf {han}}[ \mathbf {call}[ \mathbf {call}] ]{}^{\mathbf {exc}}\) and \({}^{\mathbf {call}}[ \mathbf {han}[ \mathbf {call}[ \mathbf {call}] ] \mathbf {exc} ]{}^{\mathbf {call}}\) as composed chains (with inner chain bodies enclosed in brackets). Figure 1 shows the structure of a longer version of \(w_{ex}\), which is an isomorphic representation of its ST as depicted in Fig. 4. Each chain corresponds to an internal node, and the fringe of the subtree rooted at it is the chain’s body.
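The reduction procedure just illustrated can be turned into a few lines of code. The following Haskell sketch is hypothetical (it is not POMC's parser) and encodes only the entries of \(M_\mathbf {call}\) that appear in the derivation above; it repeatedly locates an innermost \(\lessdot \dots \gtrdot \) pattern, deletes its body, and accepts when only the delimiters are left.

```haskell
data Prec = Yields | Equal | Takes deriving (Eq, Show)
data Sym  = Call | Ret | Han | Exc | Hash deriving (Eq, Show)

-- The entries of M_call used by the derivation of w_ex, plus the # conventions.
prec :: Sym -> Sym -> Maybe Prec
prec Hash Hash = Just Equal
prec Hash _    = Just Yields
prec _    Hash = Just Takes
prec Call Call = Just Yields
prec Call Han  = Just Yields
prec Call Ret  = Just Equal
prec Call Exc  = Just Takes
prec Han  Call = Just Yields
prec Han  Exc  = Just Equal
prec Exc  Call = Just Takes
prec Ret  Ret  = Just Takes
prec _    _    = Nothing

-- One step: find the leftmost ⋗, extend the handle to the left through ≐
-- relations up to the matching ⋖, and remove the handle (a chain body).
-- The input is assumed to be delimited by # and compatible with the OPM.
reduceOnce :: [Sym] -> Maybe [Sym]
reduceOnce w =
  case break (\(x, y) -> prec x y == Just Takes) (zip w (tail w)) of
    (_, [])      -> Nothing                        -- no ⋗ left: nothing to reduce
    (pre, _ : _) ->
      let j = length pre                           -- w!!j ⋗ w!!(j+1)
          i = lastYield j                          -- w!!i ⋖ w!!(i+1): left context
      in  Just (take (i + 1) w ++ drop (j + 1) w)  -- delete the chain body
  where
    lastYield k
      | prec (w !! (k - 1)) (w !! k) == Just Yields = k - 1
      | otherwise                                   = lastYield (k - 1)

-- Reduce until only the delimiters remain (undefined PRs are not checked here).
-- E.g., compatible [Call,Han,Call,Call,Exc,Call,Ret,Ret] == True.
compatible :: [Sym] -> Bool
compatible w = go (Hash : w ++ [Hash])
  where
    go [Hash, Hash] = True
    go v            = maybe False go (reduceOnce v)
```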

Let \(\mathcal A\) be an OPA. We call a support for the simple chain \( {}^{c_0}[ c_1 c_2 \dots c_\ell ]{}^{c_{\ell +1}} \) any path in \(\mathcal A\) of the form \(q_0 \xrightarrow {c_1} q_1 \overset{c_2}{\dashrightarrow } \dots \overset{c_\ell }{\dashrightarrow } q_\ell \xRightarrow {q_0} q_{\ell +1}\). The label of the last (and only) pop is exactly \(q_0\), i.e. the first state of the path; this pop is executed because of relation \(c_\ell \gtrdot c_{\ell +1}\). We call a support for the composed chain \({}^{c_0}[ s_0 c_1 s_1 c_2 \dots c_\ell s_\ell ]{}^{c_{\ell +1}}\) any path in \(\mathcal A\) of the form \(q_0 \rightsquigarrow ^{s_0} q'_0 \xrightarrow {c_1} q_1 \rightsquigarrow ^{s_1} q'_1 \overset{c_2}{\dashrightarrow } \dots \overset{c_\ell }{\dashrightarrow } q_\ell \rightsquigarrow ^{s_\ell } q'_\ell \xRightarrow {q'_0} q_{\ell +1}\) where, for every \(i = 0, 1, \dots , \ell \): if \(s_i \ne \epsilon \), then \(q_i \rightsquigarrow ^{s_i} q'_i\) is a support for the chain \({}^{c_i}[ s_i ]{}^{c_{i+1}}\), else \(q'_i = q_i\).

Chains fully determine the parsing structure of any OPA over \((\varSigma , M)\). If the OPA performs the computation \( \langle sb, q_i, [a, q_j] \gamma \rangle \vdash ^* \langle b, q_k, \gamma \rangle \), then \({}^{a}[ s ]{}^{b}\) is necessarily a chain over \((\varSigma , M)\), and there exists a support like the one above with \(s = s_0 c_1 \dots c_\ell s_\ell \) and \(q_{\ell +1} = q_k\). This corresponds to the parsing of the string \(s_0 c_1 \dots c_\ell s_{\ell }\) within the contexts a,b, which contains all information needed to build the subtree whose frontier is that string.

Consider the OPA \(\mathcal A(\varSigma , M) = ( \varSigma , M,\) \(\{q\}, \{q\}, \{q\}, \delta _{max} )\) where \(\delta _{max}(q,q) = q\), and \(\delta _{max}(q,c) = q\), \(\forall c \in \varSigma \). We call it the OP Max-Automaton over \(\varSigma , M\). For a max-automaton, each chain has a support. Since there is a chain \({}^{\#}[ s ]{}^{\#}\) for any string s compatible with M, a string is accepted by \(\mathcal A(\varSigma , M)\) iff it is compatible with M. If M is complete, each string is accepted by \(\mathcal A(\varSigma , M)\), which defines the universal language \(\varSigma ^*\) by assigning to any string the (unique) structure compatible with the OPM. With \(M_{\mathbf {call}}\) of Fig. 1, if we take e.g. the string \( \mathbf {ret}\ \mathbf {call}\ \mathbf {han}\), it is accepted by the max-automaton with structure \( \#[[\mathbf {ret}] \mathbf {call}[\mathbf {han}]] \#. \)

In conclusion, given an OP alphabet, the OPM M assigns a unique structure to any compatible string in \(\varSigma ^*\); unlike VPL, such a structure is not visible in the string, and must be built by means of a non-trivial parsing algorithm. An OPA defined on the OP alphabet selects an appropriate subset within the “universe” of strings compatible with M. For a more complete description of the OPL family and of its relations with other CFL we refer the reader to [43].

2.1 Operator Precedence \(\omega \)-Languages

All definitions regarding OPL are extended to infinite words in the usual way, but with a few distinctions. Given an OP alphabet \((\varSigma , M)\), an \(\omega \)-word \(w \in \varSigma ^\omega \) is compatible with M if every prefix of w is compatible with M. OP \(\omega \)-words are not terminated by the delimiter \(\#\). An \(\omega \)-word may contain never-ending chains of the form \(c_0 \lessdot c_1 \doteq c_2 \doteq \cdots \), where the \(\lessdot \) relation between \(c_0\) and \(c_1\) is never closed by a corresponding \(\gtrdot \). Such chains are called open chains and may be simple or composed. A composed open chain may contain both open and closed subchains. Of course, a closed chain cannot contain an open one. A terminal symbol \(a \in \varSigma \) is pending if it is part of the body of an open chain and of no closed chains.

OPA classes accepting the whole class of \(\omega \)OPL can be defined by augmenting Definition 1 with Büchi or Muller acceptance conditions [42]. In this paper, we only consider the former. The semantics of configurations, moves and infinite runs are defined as for finite OPA. For the acceptance condition, let \(\rho \) be a run on an \(\omega \)-word w. Define

$$ {\text {Inf}}(\rho ) = \{q \in Q \mid \text {there exist infinitely many positions }i \text { s.t. } \ \langle \beta _i, q, x_i \rangle \in \rho \} $$

as the set of states that occur infinitely often in \(\rho \). \(\rho \) is successful iff there exists a state \(q_f \in F\) such that \(q_f \in {\text {Inf}}(\rho )\). An \(\omega \)OPBA \(\mathcal {A}\) accepts \(w \in \varSigma ^\omega \) iff there is a successful run of \(\mathcal {A}\) on w. The \(\omega \)-language recognized by \(\mathcal {A}\) is \(L(\mathcal {A}) = \{w \in \varSigma ^\omega \mid \mathcal {A} \text { accepts } w \}\). Unlike OPA, \(\omega \)OPBA do not require the stack to be empty for word acceptance: when reading an open chain, the stack symbol pushed when the first character of the body of its underlying simple chain is read remains in the stack forever; it is at most updated by shift moves.

The most important closure properties of OPL are preserved by \(\omega \)OPL, which form a Boolean algebra and are closed under concatenation of an OPL with an \(\omega \)OPL [42]. The equivalence between deterministic and nondeterministic automata is lost in the infinite case, which is unsurprising, since it also happens for regular \(\omega \)-languages and \(\omega \)VPL.

2.2 Modeling Programs with OPA

For readers not familiar with OPL, we show how OPA can naturally model programming languages such as Java and C++. Given a set AP of atomic propositions describing events and states of the program, we use \(({\mathcal {P}(AP)}, M_{AP})\) as the OP alphabet. For convenience, we consider a partitioning of AP into a set of standard propositional labels (in round font), and structural labels (SL, in bold). SL define the OP structure of the word: \(M_{AP}\) is only defined for subsets of AP containing exactly one SL, so that given two SL \(\mathbf {l}_1, \mathbf {l}_2\), for any \(a, a', b, b' \in {\mathcal {P}(AP)}\) s.t. \(\mathbf {l}_1 \in a, a'\) and \(\mathbf {l}_2 \in b, b'\) we have \(M_{AP}(a,b) = M_{AP}(a',b')\). Hence, we define an OPM on the entire \({\mathcal {P}(AP)}\) by only giving the relations between SL, as we did for \(M_\mathbf {call}\). Figure 2 shows how to model a procedural program with an OPA. The OPA simulates the program’s behavior with respect to the stack, by expressing its execution traces with four event kinds: \(\mathbf {call}\) (resp. \(\mathbf {ret}\)) marks a procedure call (resp. return), \(\mathbf {han}\) the installation of an exception handler by a try statement, and \(\mathbf {exc}\) an exception being raised. OPM \(M_\mathbf {call}\) defines the context-free structure of the word, which is strictly linked with the programming language semantics: the \(\lessdot \) PR causes nesting (e.g., \(\mathbf {call}\)s can be nested into other \(\mathbf {call}\)s), and the \(\doteq \) PR implies a one-to-one relation, e.g. between a \(\mathbf {call}\) and the \(\mathbf {ret}\) of the same function, and a \(\mathbf {han}\) and the \(\mathbf {exc}\) it catches. Each OPA state represents a line in the source code. First, procedure \(\mathrm{p}_A\) is called by the program loader (M0), and \([\{\mathbf {call}, \mathrm{p}_A\}, \text {M0}]\) is pushed onto the stack, to track the program state before the \(\mathbf {call}\). Then, the try statement at line A0 of \(\mathrm{p}_A\) installs a handler. All subsequent calls to \(\mathrm{p}_B\) and \(\mathrm{p}_C\) push new stack symbols on top of the one pushed with \(\mathbf {han}\). \(\mathrm{p}_C\) may only call itself recursively, or throw an exception, but never return normally. This is reflected by \(\mathbf {exc}\) being the only transition leading from state C0 to the accepting state Mr, and \(\mathrm{p}_B\) and \(\mathrm{p}_C\) having no way to a normal \(\mathbf {ret}\). The OPA has a look-ahead of one input symbol, so when it encounters \(\mathbf {exc}\), it must pop all symbols in the stack, corresponding to active function frames, until it finds the one with \(\mathbf {han}\) in it, which cannot be popped because \(\mathbf {han}\doteq \mathbf {exc}\). Notice that such behavior cannot be modeled by Visibly Pushdown Automata or Nested Word Automata, because they need to read an input symbol for each pop move. Thus, \(\mathbf {han}\) protects the parent function from the exception. Since the state contained in \(\mathbf {han}\)’s stack symbol is A0, the execution resumes in the catch clause of \(\mathrm{p}_A\). \(\mathrm{p}_A\) then calls twice the error-handling function \(\mathrm{p}_{{Err}}\), which ends regularly both times, and returns. The string of Fig. 1 is accepted by this OPA.

Fig. 2. Example procedural program (top) and the derived OPA (bottom). ‘*’ implies a non-deterministic choice. Push, shift, pop moves are shown by, resp., solid, dashed and double arrows.

In this example, we only model the stack behavior for simplicity, but other statements, such as assignments, and other behaviors, such as continuations, could be modeled by a different choice of the OPA and OPM, and other aspects of the program’s state by appropriate abstractions [38].

Fig. 3. The string of Fig. 1 as an OP word. Chains are shown by edges joining their contexts. Standard atomic propositions are shown below SL: \(\mathrm{p}_l\) means a \(\mathbf {call}\) or a \(\mathbf {ret}\) is related to procedure \(\mathrm{p}_l\). First, procedure \(\mathrm{p}_A\) is called (pos. 1), and it installs a handler in pos. 2. Then, three procedures are called, and one (\(\mathrm{p}_C\)) throws an exception, which is caught by the handler. Two more functions are called and, finally, \(\mathrm{p}_A\) returns.

3 POTL: Syntax and Semantics

Given a finite set of atomic propositions AP, the syntax of POTL follows:

$$ \varphi := \mathrm{a} \mid \lnot \varphi \mid \varphi \vee \varphi \mid \varphi \wedge \varphi \mid \ocircle ^{t} \varphi \mid \circleddash ^{t} \varphi \mid \chi _F^{t} \varphi \mid \chi _P^{t} \varphi \mid \varphi \mathbin {\mathcal {U}^{t}_{\chi }} \varphi \mid \varphi \mathbin {\mathcal {S}^{t}_{\chi }} \varphi \mid \ocircle _H^{t} \varphi \mid \circleddash _H^{t} \varphi \mid \varphi \mathbin {\mathcal {U}^{t}_{H}} \varphi \mid \varphi \mathbin {\mathcal {S}^{t}_{H}} \varphi $$

where \(\mathrm{a} \in AP\), and \(t \in \{d, u\}\).
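As a data-structure view of this grammar, here is a hypothetical Haskell type (not POMC's actual representation) mirroring the POTL operators; Dir is the superscript \(t \in \{d, u\}\).

```haskell
-- Direction superscript t ∈ {d, u}.
data Dir = D | U deriving (Eq, Show)

-- POTL formulas over atomic propositions of type a.
data Formula a
  = Atom a
  | Not (Formula a)
  | Or  (Formula a) (Formula a)
  | And (Formula a) (Formula a)
  | Next      Dir (Formula a)             -- ◯^t
  | Back      Dir (Formula a)             -- ⊖^t
  | ChainNext Dir (Formula a)             -- χ_F^t
  | ChainBack Dir (Formula a)             -- χ_P^t
  | Until  Dir (Formula a) (Formula a)    -- summary until   U^t_χ
  | Since  Dir (Formula a) (Formula a)    -- summary since   S^t_χ
  | HNext  Dir (Formula a)                -- hierarchical next  ◯_H^t
  | HBack  Dir (Formula a)                -- hierarchical back  ⊖_H^t
  | HUntil Dir (Formula a) (Formula a)    -- hierarchical until U^t_H
  | HSince Dir (Formula a) (Formula a)    -- hierarchical since S^t_H
  deriving (Eq, Show)

-- E.g., the formula χ_F^d ret used later in Fig. 5:
chiFdRet :: Formula String
chiFdRet = ChainNext D (Atom "ret")
```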

The semantics of POTL is based on the word structure—also called OP word for short—\((U, M_{AP}, P)\), where \(U = \{0, 1, \dots , n, n+1\}\), with \(n \in \mathbb {N}\), is the set of word positions; \(P :U \rightarrow {\mathcal {P}(AP)}\) is a function associating each position in U with the set of atomic propositions holding in that position, with \(P(0) = P(n\,+\,1) = \{\#\}\). Given two positions i, j and a PR \(\pi \), we write \(i \mathrel {\pi }j\) to say \(P(i) \mathrel {\pi }P(j)\).

We define the chain relation \(\chi \subseteq U \times U\) so that \(\chi (i, j)\) holds between two positions i, j iff \(i < j-1\), and i and j are resp. the left and right contexts of the same chain. Because of composed chains, \(\chi \) is not necessarily one-to-one: it may also be one-to-many or many-to-one. Given \(i,j \in U\), relation \(\chi \) has the following properties:

  1. It never crosses itself: if \(\chi (i,j)\) and \(\chi (h,k)\), for any \(h,k \in U\), then we have \(i< h < j \implies k \le j\) and \(i< k < j \implies i \le h\).

  2. If \(\chi (i,j)\), then \(i \lessdot i+1\) and \(j-1 \gtrdot j\).

  3. There exists at most one single position h, called leftmost context of j, s.t. \(\chi (h,j)\) and \(h \lessdot j\) or \(h \doteq j\); for any k s.t. \(\chi (k,j)\) and \(k \gtrdot j\) we have \(k > h\).

  4. There exists at most one single position h, called rightmost context of i, s.t. \(\chi (i,h)\) and \(i \gtrdot h\) or \(i \doteq h\); for any k s.t. \(\chi (i,k)\) and \(i \lessdot k\) we have \(k < h\).

Property 4 says that when the chain relation is one-to-many, the contexts of the outermost chains are in the \(\doteq \) or \(\gtrdot \) relation, while the inner ones are in the \(\lessdot \) relation. Property 3 says that contexts of outermost many-to-one chains are in the \(\doteq \) or \(\lessdot \) relation, the inner ones being in the \(\gtrdot \) relation. In the ST, the right context j of a chain is at the same level as the left one i when \(i \doteq j\) (e.g., in Fig. 4, pos. 1 and 11), at a lower level when \(i \lessdot j\) (e.g., pos. 1 with 7, and 9), at a higher level if \(i \gtrdot j\) (e.g., pos. 3 and 4 with 6).

The truth of POTL formulas is defined w.r.t. a single word position. Let w be an OP word, and \(\mathrm{a} \in AP\). Then, for any position \(i \in U\) of w, we have \((w, i) \models \mathrm{a}\) if \(\mathrm{a} \in P(i)\). Operators such as \(\wedge \) and \(\lnot \) have the usual semantics from propositional logic. Next, while giving the formal semantics of POTL operators, we illustrate it by showing how it can be used to express properties on program execution traces, such as the one of Fig. 3.

Fig. 4. The ST corresponding to the word of Fig. 3. Dots are internal nodes.

a) Next/Back Operators. The downward next and back operators \(\ocircle ^d\) and \(\circleddash ^d\) are like their LTL counterparts, except they are true only if the next (resp. current) position is at a lower or equal ST level than the current (resp. preceding) one. The upward next and back, \(\ocircle ^u\) and \(\circleddash ^u\), are symmetric. Formally, \((w,i) \models \ocircle ^d\varphi \) iff \((w,i+1) \models \varphi \) and \(i \lessdot (i+1)\) or \(i \doteq (i+1)\), and \((w,i) \models \circleddash ^d\varphi \) iff \((w,i-1) \models \varphi \), and \((i-1) \lessdot i\) or \((i-1) \doteq i\). Substitute \(\lessdot \) with \(\gtrdot \) to obtain the semantics for \(\ocircle ^u\) and \(\circleddash ^u\). E.g., we can write \(\ocircle ^d\mathbf {call}\) to say that the next position is an inner call (it holds in pos. 2, 3, 4 of Fig. 3), \(\circleddash ^d\mathbf {call}\) to say that the previous position is a \(\mathbf {call}\), and the current is the first of the body of a function (pos. 2, 4, 5), or the \(\mathbf {ret}\) of an empty one (pos. 8, 10), and \(\circleddash ^u\mathbf {call}\) to say that the current position terminates an empty function frame (holds in 6, 8, 10). In pos. 2, \(\ocircle ^d\mathrm{p}_B\) holds, but \(\ocircle ^u\mathrm{p}_B\) does not.

b) Chain Next/Back Operators. The chain next and back operators \(\chi _F^{t}\) and \(\chi _P^{t}\), \(t \in \{d,u\}\), evaluate their argument respectively on future and past positions in the chain relation with the current one. The downward (resp. upward) variant only considers chains whose right context goes down (resp. up) in the ST. E.g., in pos. 1 of Fig. 3, \(\chi _F^{d}\mathrm{p}_{{Err}}\) holds because \(\chi (1,7)\) and \(\chi (1,9)\), meaning that \(\mathrm{p}_A\) calls \(\mathrm{p}_{{Err}}\) at least once. Formally, \((w,i) \models \chi _F^{d}\varphi \) iff there exists a position \(j > i\) such that \(\chi (i,j)\), \(i \lessdot j\) or \(i \doteq j\), and \((w,j) \models \varphi \). \((w,i) \models \chi _P^{d}\varphi \) iff there exists a position \(j < i\) such that \(\chi (j,i)\), \(j \lessdot i\) or \(j \doteq i\), and \((w,j) \models \varphi \). Replace \(\lessdot \) with \(\gtrdot \) for the upward versions. In Fig. 3, \(\chi _F^{u}\mathbf {exc}\) is true in \(\mathbf {call}\) positions whose procedure is terminated by an exception thrown by an inner procedure (e.g. pos. 3 and 4). \(\chi _P^{u}\mathbf {call}\) is true in \(\mathbf {exc}\) statements that terminate at least one procedure other than the one raising it, such as the one in pos. 6. \(\chi _F^{d}\mathbf {ret}\) and \(\chi _F^{u}\mathbf {ret}\) hold in \(\mathbf {call}\)s to non-empty procedures that terminate normally, and not due to an uncaught exception (e.g., pos. 1).
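Since the chain relation of a finite OP word can be represented explicitly, the chain next and back operators have a very direct evaluation. The sketch below is a hypothetical Haskell fragment (not POMC code) that hard-codes the χ relation of the word of Fig. 3 (the chain between the two delimiters is omitted), with position labels reconstructed from the descriptions of Figs. 2 and 3, and checks the example formulas of this paragraph.

```haskell
type Pos = Int
data Prec = Yields | Equal | Takes deriving (Eq, Show)

-- Labels of the word of Fig. 3 (positions 0 and 12 hold the delimiter #).
labels :: Pos -> [String]
labels i = maybe ["#"] id (lookup i tbl)
  where
    tbl = [ (1, ["call","pA"]),   (2, ["han"])
          , (3, ["call","pB"]),   (4, ["call","pC"]),  (5, ["call","pC"])
          , (6, ["exc"])
          , (7, ["call","pErr"]), (8, ["ret","pErr"])
          , (9, ["call","pErr"]), (10,["ret","pErr"]), (11,["ret","pA"]) ]

-- The chain relation χ of Fig. 3, with the PR holding between the two contexts.
chains :: [(Pos, Pos, Prec)]
chains = [ (1,7,Yields), (1,9,Yields), (1,11,Equal)
         , (2,6,Equal),  (3,6,Takes),  (4,6,Takes) ]

-- χ_F^d φ: some j with χ(i,j), i ⋖ j or i ≐ j, and φ holding at j.
chainNextD :: (Pos -> Bool) -> Pos -> Bool
chainNextD phi i = or [ phi j | (h, j, p) <- chains, h == i, p /= Takes ]

-- χ_F^u φ: as above, but with i ⋗ j or i ≐ j.
chainNextU :: (Pos -> Bool) -> Pos -> Bool
chainNextU phi i = or [ phi j | (h, j, p) <- chains, h == i, p /= Yields ]

-- χ_P^u φ: some j with χ(j,i), j ⋗ i or j ≐ i, and φ holding at j.
chainBackU :: (Pos -> Bool) -> Pos -> Bool
chainBackU phi i = or [ phi j | (j, h, p) <- chains, h == i, p /= Yields ]

-- Examples from the text:
-- chainNextD (elem "pErr" . labels) 1 == True   (χ(1,7) and χ(1,9): pA calls pErr)
-- chainNextU (elem "exc"  . labels) 3 == True   (pos. 3 is terminated by the exc)
-- chainBackU (elem "call" . labels) 6 == True   (the exc terminates pos. 3 and 4)
```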

c) Until/Since Operators. POTL has two kinds of until and since operators. They express properties on paths, which are sequences of positions obtained by iterating the different kinds of next or back operators. In general, a path of length \(n \in \mathbb {N}\) between \(i, j \in U\) is a sequence of positions \(i = i_1< i_2< \dots < i_n = j\). The until operator on a set of paths \(\varGamma \) is defined as follows: for any word w and position \(i \in U\), and for any two POTL formulas \(\varphi \) and \(\psi \), \((w, i) \models \varphi \mathbin {\mathcal {U}(\varGamma )} \psi \) iff there exist a position \(j \in U\), \(j \ge i\), and a path \(i_1< i_2< \dots < i_n\) between i and j in \(\varGamma \) such that \((w, i_k) \models \varphi \) for any \(1 \le k < n\), and \((w, i_n) \models \psi \). Since operators are defined symmetrically. Note that, depending on \(\varGamma \), a path from i to j may not exist. We define until/since operators by associating them with different sets of paths.

The summary until \(\psi \mathbin {\mathcal {U}^{t}_{\chi }} \theta \) (resp. since \(\psi \mathbin {\mathcal {S}^{t}_{\chi }} \theta \)) operator is obtained by inductively applying the \(\ocircle ^t\) and \(\chi _F^{t}\) (resp. \(\circleddash ^t\) and \(\chi _P^{t}\)) operators. It holds in a position in which either \(\theta \) holds, or \(\psi \) holds together with \(\ocircle ^t (\psi \mathbin {\mathcal {U}^{t}_{\chi }} \theta )\) (resp. \(\circleddash ^t (\psi \mathbin {\mathcal {S}^{t}_{\chi }} \theta )\)) or \(\chi _F^{t} (\psi \mathbin {\mathcal {U}^{t}_{\chi }} \theta )\) (resp. \(\chi _P^{t} (\psi \mathbin {\mathcal {S}^{t}_{\chi }} \theta )\)). It is an until operator on paths that can move not only between consecutive positions, but also between contexts of a chain, skipping its body. With the OPM of Fig. 1, this means skipping function bodies. The downward variants can move between positions at the same level in the ST (i.e., in the same simple chain body), or down in the nested chain structure. The upward ones remain at the same level, or move to higher levels of the ST.

Formula \(\top \mathbin {\mathcal {U}^{u}_{\chi }} \mathbf {exc}\) is true in positions contained in the frame of a function that is terminated by an exception. It is true in pos. 3 of Fig. 3 because of path 3-6, and false in pos. 1, because no path can enter the chain whose contexts are pos. 1 and 11. Formula \((\mathbf {call}\vee \mathbf {han}) \mathbin {\mathcal {U}^{d}_{\chi }} \mathbf {exc}\) is true in call positions whose function frame contains \(\mathbf {exc}\)s, but that are not necessarily terminated by one of them, such as the one in pos. 1 (with path 1-2-6).

We define Downward Summary Paths (DSP) as follows. Given an OP word w, and two positions \(i \le j\) in w, the DSP between i and j, if it exists, is a sequence of positions \(i = i_1< i_2< \dots < i_n = j\) such that, for each \(1 \le p < n\),

$$ i_{p+1} = {\left\{ \begin{array}{ll} k &{} \text {if }k = \max \{ h \mid h \le j \wedge \chi (i_p,h) \wedge (i_p \lessdot h \vee i_p \doteq h)\} \text {exists;} \\ i_p + 1 &{} \text {otherwise, if }i_p \lessdot (i_p + 1){ or}i_p \doteq (i_p + 1)\text {.} \end{array}\right. } $$

The Downward Summary (DS) until and since operators \(\mathbin {\mathcal {U}^{d}_{\chi }}\) and \(\mathbin {\mathcal {S}^{d}_{\chi }}\) use as \(\varGamma \) the set of DSP starting in the position in which they are evaluated. The definition for the upward counterparts is, again, obtained by substituting \(\lessdot \) with \(\gtrdot \). In Fig. 3, e.g., \(\mathbf {call} \mathbin {\mathcal {U}^{d}_{\chi }} (\mathbf {ret}\wedge \mathrm{p}_{{Err}})\) holds in pos. 1 because of paths 1-7-8 and 1-9-10, \((\mathbf {call}\vee \mathbf {exc}) \mathbin {\mathcal {S}^{u}_{\chi }} \mathrm{p}_B\) holds in pos. 7 because of path 3-6-7, and \((\mathbf {call}\vee \mathbf {exc}) \mathbin {\mathcal {U}^{u}_{\chi }} \mathbf {ret}\) holds in 3 because of path 3-6-7-8.
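The case distinction defining DSP translates directly into a successor function on positions. The following Haskell sketch is hypothetical: it abstracts the word as two functions (the PR between positions and the chain relation), and computes the downward summary path between two positions exactly as in the definition above.

```haskell
type Pos = Int
data Prec = Yields | Equal | Takes deriving (Eq, Show)

-- An OP word, abstracted by the PR between two positions and the χ relation.
data OPWord = OPWord
  { pr  :: Pos -> Pos -> Maybe Prec
  , chi :: Pos -> Pos -> Bool
  }

-- Successor of position ip on the DSP towards j, following the two cases above.
dspNext :: OPWord -> Pos -> Pos -> Maybe Pos
dspNext w j ip
  | not (null ks)                                      = Just (maximum ks)
  | pr w ip (ip + 1) `elem` [Just Yields, Just Equal]  = Just (ip + 1)
  | otherwise                                          = Nothing
  where
    -- candidate chain jumps: χ(ip,h) with ip ⋖ h or ip ≐ h, and h ≤ j
    ks = [ h | h <- [ip + 2 .. j], chi w ip h
             , pr w ip h `elem` [Just Yields, Just Equal] ]

-- The DSP between i and j, if it exists.
dsp :: OPWord -> Pos -> Pos -> Maybe [Pos]
dsp w i j
  | i == j    = Just [i]
  | i >  j    = Nothing
  | otherwise = do k    <- dspNext w j i
                   rest <- dsp w k j
                   return (i : rest)

-- On the word of Fig. 3 (with pr and chi filled in accordingly), one would get
-- dsp w 1 8 == Just [1,7,8] and dsp w 1 10 == Just [1,9,10].
```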

d) Hierarchical Operators. A single position may be the left or right context of multiple chains. The operators seen so far cannot take this fact into account, since they “forget” about a left context when they jump to the right one. Thus, we introduce the hierarchical next and back operators. The upward hierarchical next (resp. back), \(\ocircle _H^{u}\psi \) (resp. \(\circleddash _H^{u}\psi \)), is true iff the current position j is the right context of a chain whose left context is i, and \(\psi \) holds in the next (resp. previous) pos. \(j'\) that is the right context of i, with \(i \lessdot j, j'\). So, \(\ocircle _H^{u}\mathrm{p}_{{Err}}\) holds in pos. 7 of Fig. 3 because \(\mathrm{p}_{{Err}}\) holds in 9, and \(\circleddash _H^{u}\mathrm{p}_{{Err}}\) in 9 because \(\mathrm{p}_{{Err}}\) holds in 7. In the ST, \(\ocircle _H^{u}\) goes up between \(\mathbf {call}\)s to \(\mathrm{p}_{{Err}}\), while \(\circleddash _H^{u}\) goes down. Their downward counterparts behave symmetrically, and consider multiple inner chains sharing their right context. They are formally defined as:

  • \((w,i) \models \ocircle _H^{u}\varphi \) iff there exist a position \(h < i\) s.t. \(\chi (h,i)\) and \(h \lessdot i\) and a position \(j = \min \{ k \mid i < k \wedge \chi (h,k) \wedge h \lessdot k \}\) and \((w,j) \models \varphi \);

  • \((w,i) \models \circleddash _H^{u}\varphi \) iff there exist a position \(h < i\) s.t. \(\chi (h,i)\) and \(h \lessdot i\) and a position \(j = \max \{ k \mid k < i \wedge \chi (h,k) \wedge h \lessdot k \}\) and \((w,j) \models \varphi \);

  • \((w,i) \models \ocircle _H^{d}\varphi \) iff there exist a position \(h > i\) s.t. \(\chi (i,h)\) and \(i \gtrdot h\) and a position \(j = \min \{ k \mid i < k \wedge \chi (k,h) \wedge k \gtrdot h \}\) and \((w,j) \models \varphi \);

  • \((w,i) \models \circleddash _H^{d}\varphi \) iff there exist a position \(h > i\) s.t. \(\chi (i,h)\) and \(i \gtrdot h\) and a position \(j = \max \{ k \mid k < i \wedge \chi (k,h) \wedge k \gtrdot h \}\) and \((w,j) \models \varphi \).

In the ST of Fig. 4, \(\ocircle _H^{d}\) and \(\circleddash _H^{d}\) go down and up among \(\mathbf {call}\)s terminated by the same \(\mathbf {exc}\). For example, in pos. 3 \(\ocircle _H^{d}\mathrm{p}_C\) holds, because both pos. 3 and 4 are in the chain relation with 6. Similarly, in pos. 4 \(\circleddash _H^{d}\mathrm{p}_B\) holds. Note that these operators do not consider leftmost/rightmost contexts, so \(\ocircle _H^{u}\mathbf {ret}\) is false in pos. 9, as \(\mathbf {call}\doteq \mathbf {ret}\), and pos. 11 is the rightmost context of pos. 1.

The hierarchical until and since operators are defined by iterating these next and back operators. The upward hierarchical path (UHP) between i and j is a sequence of positions \(i = i_1< i_2< \dots < i_n = j\) such that there exists a position \(h < i\) such that for each \(1 \le p \le n\) we have \(\chi (h,i_p)\) and \(h \lessdot i_p\), and for each \(1 \le q < n\) there exists no position k such that \(i_q< k < i_{q+1}\) and \(\chi (h,k)\). The until and since operators based on the set of UHP starting in the position in which they are evaluated are denoted as \(\mathbin {\mathcal {U}^{u}_{H}}\) and \(\mathbin {\mathcal {S}^{u}_{H}}\). E.g., \(\mathbf {call} \mathbin {\mathcal {U}^{u}_{H}} \mathrm{p}_{{Err}}\) holds in pos. 7 because of the singleton path 7 and path 7-9, and \(\mathbf {call} \mathbin {\mathcal {S}^{u}_{H}} \mathrm{p}_{{Err}}\) in pos. 9 because of paths 9 and 7-9.

The downward hierarchical path (DHP) between i and j is a sequence of positions \(i = i_1< i_2< \dots < i_n = j\) such that there exists a position \(h > j\) such that for each \(1 \le p \le n\) we have \(\chi (i_p,h)\) and \(i_p \gtrdot h\), and for each \(1 \le q < n\) there exists no position k such that \(i_q< k < i_{q+1}\) and \(\chi (k,h)\). The until and since operators based on the set of DHP starting in the position in which they are evaluated are denoted as \(\mathbin {\mathcal {U}_H^d}\) and \({} \mathbin {\mathcal {S}_H^d} {}\). In Fig. 3, \({\mathbf {call}} \mathbin {\mathcal {U}_H^d} {\mathrm{p}_C}\) holds in pos. 3, and \({\mathbf {call}} \mathbin {\mathcal {S}_H^d} {\mathrm{p}_B}\) in pos. 4, both because of path 3-4.

The POTL until and since operators enjoy expansion laws similar to those of LTL. For instance, the summary until satisfies \(\psi \mathbin {\mathcal {U}^{t}_{\chi }} \theta \equiv \theta \vee \big (\psi \wedge (\ocircle ^{t}(\psi \mathbin {\mathcal {U}^{t}_{\chi }} \theta ) \vee \chi _{F}^{t}(\psi \mathbin {\mathcal {U}^{t}_{\chi }} \theta ))\big )\); the laws for the since and the hierarchical operators are symmetric.

3.1 Expressiveness of POTL

We first define some derived operators. For \(t \in \{d, u\}\), we define the downward/upward summary eventually as \(\Diamond ^{t} \varphi := \top \mathbin {\mathcal {U}^{t}_{\chi }} \varphi \), and the downward/upward summary globally as \(\square ^{t} \varphi := \lnot \Diamond ^{t} \lnot \varphi \). \(\Diamond ^{u} \varphi \) and \(\square ^{u} \varphi \) resp. say that \(\varphi \) holds in one or all positions in the path from the current position to the root of the ST. \(\Diamond ^{d} \varphi \) says that \(\varphi \) holds in at least one position in the current subtree, and \(\square ^{d} \varphi \) in all of them. E.g., if \(\square ^{d} (\lnot \mathrm{p}_A)\) holds in a \(\mathbf {call}\), it means that \(\mathrm{p}_A\) never holds in its whole function body, which is the subtree rooted next to the \(\mathbf {call}\).

In the technical report, we prove

Theorem 1

([23]). POTL = FOL with one free variable on OP words.

Equivalence to FOL on the relevant algebraic structure is a desirable feature of linear-time temporal logics, and it was proved for LTL [39] and NWTL [2]. It provides, in some sense, a theoretical assurance that the logic is sufficiently expressive. Moreover, NWTL \(\subset \) OPTL was proved in [22], and OPTL \(\subseteq \) POTL comes from Theorem 1 and the semantics of OPTL being expressible in FOL. In [23], we also prove that there exist POTL formulas not expressible in OPTL. Thus, we can claim CaRet [6] \(\subseteq \) NWTL \(\subset \) OPTL \(\subset \) POTL. One such formula, evaluated e.g. on a \(\mathbf {han}\) position with a matched \(\mathbf {exc}\), states that \(\mathrm{p}_A\) holds in one of the positions in the same subtree.

More importantly, POTL can express many useful requirements of procedural programs. To emphasize its potential for practical applications in automatic verification, we supply a few examples of typical program properties expressed as POTL formulas, not all of which are expressible in the logics mentioned above.

The LTL globally can be written as \(\square \psi := \lnot \Diamond ^{u} (\Diamond ^{d} \lnot \psi )\). The two nested eventually operators enumerate all future positions by going up and then down in any direction in the syntax tree: when negated, this means \(\lnot \psi \) may never hold. POTL can express Hoare-style pre/postconditions with formulae such as \(\square (\mathbf {call}\,\wedge \, \rho \implies \chi _F^{d}(\mathbf {ret}\,\wedge \, \theta ))\), where \(\rho \) is the precondition, and \(\theta \) is the postcondition.

Unlike NWTL, POTL can easily express properties related to exception handling and interrupt management [43]. E.g., the shortcut \({CallThr}(\psi ) := \ocircle ^u(\mathbf {exc}\,\wedge \, \psi ) \,\vee \, \chi _F^{u}(\mathbf {exc}\,\wedge \, \psi )\), evaluated in a \(\mathbf {call}\), states that the procedure currently started is terminated by an \(\mathbf {exc}\) in which \(\psi \) holds. So, \(\square (\mathbf {call}\,\wedge \,\rho \,\wedge \, {CallThr}(\top ) \implies {CallThr}(\theta ))\) means that if precondition \(\rho \) holds when a procedure is called, then postcondition \(\theta \) must hold if that procedure is terminated by an exception. In object-oriented programming languages, if \(\rho \equiv \theta \) is a class invariant asserting that a class instance’s state is valid, this formula expresses weak exception safety [1], and strong exception safety if \(\rho \) and \(\theta \) express particular states of the class instance. The no-throw guarantee can be stated with \(\square (\mathbf {call}\,\wedge \, \mathrm{p}_A \implies \lnot {CallThr}(\top ))\), meaning procedure \(\mathrm{p}_A\) is never interrupted by an exception.
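To make these patterns concrete, the following hypothetical Haskell fragment (independent from POMC's input language) encodes a small subset of POTL as a data type and writes the derived operators defined above, the CallThr shortcut, and the no-throw guarantee as combinators.

```haskell
data Dir = D | U deriving (Eq, Show)

-- The fragment of POTL needed for these properties.
data Formula
  = T | Atom String | Not Formula | And Formula Formula | Or Formula Formula
  | Next Dir Formula                      -- ◯^t
  | ChainNext Dir Formula                 -- χ_F^t
  | Until Dir Formula Formula             -- summary until U^t_χ
  deriving (Eq, Show)

impl :: Formula -> Formula -> Formula
impl a b = Or (Not a) b

-- The derived summary eventually and the LTL-like globally defined above.
ev :: Dir -> Formula -> Formula           -- ♢^t ψ = ⊤ U^t_χ ψ
ev t f = Until t T f

globally :: Formula -> Formula            -- □ψ = ¬♢^u ♢^d ¬ψ
globally f = Not (ev U (ev D (Not f)))

-- CallThr(ψ): the procedure called here is terminated by an exc where ψ holds.
callThr :: Formula -> Formula
callThr psi = Or (Next U (And (Atom "exc") psi))
                 (ChainNext U (And (Atom "exc") psi))

-- No-throw guarantee for pA: procedure pA is never interrupted by an exception.
noThrowPA :: Formula
noThrowPA = globally (impl (And (Atom "call") (Atom "pA"))
                           (Not (callThr T)))
```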

Stack inspection [29, 37], i.e. properties regarding the sequence of procedures active in the program’s stack at a certain point of its execution, is an important class of requirements that can be expressed with the shortcut \({Scall}(\varphi , \psi )\), which subsumes the call since of CaRet, as it also works with exceptions. E.g., \(\square \big ((\mathbf {call}\wedge \mathrm{p}_B \wedge {Scall}(\top , \mathrm{p}_A)) \implies {CallThr}(\top ) \big )\) means that whenever \(\mathrm{p}_B\) is executed and at least one instance of \(\mathrm{p}_A\) is on the stack, \(\mathrm{p}_B\) is terminated by an exception. The OPA of Fig. 2 satisfies this formula, because \(\mathrm{p}_B\) is called by \(\mathrm{p}_A\), and \(\mathrm{p}_C\) throws.

4 Model Checking

Given an OP alphabet \(({\mathcal {P}(AP)}, M_{AP})\), where AP is a finite set of atomic propositions, and a POTL formula \(\varphi \), we build an OPA \( \mathcal {A}_\varphi = ( {\mathcal {P}(AP)}, M_{AP}, Q, I, F, \delta ) \) that accepts models of \(\varphi \). The construction of \(\mathcal {A}_\varphi \) resembles the classical one for LTL and the ones for NWTL and OPTL, diverging from them significantly when dealing with temporal obligations that involve positions in the chain relation.

We first introduce \({{Cl}}({\varphi })\), the closure of \(\varphi \), containing all subformulas of \(\varphi \), and some auxiliary operators. The latter are needed to model-check chain next and back operators. For any PR \(\pi \in \{\lessdot , \doteq , \gtrdot \}\), we define them as follows: \((w,i) \models \chi _{F}^{\pi } \varphi \) iff there exists \(j > i\) such that \(\chi (i,j)\), \(i \mathrel {\pi }j\), and \((w,j) \models \varphi \); \((w,i) \models \chi _{P}^{\pi } \varphi \) iff there exists \(j < i\) such that \(\chi (j,i)\), \(j \mathrel {\pi }i\), and \((w,j) \models \varphi \).

\({{Cl}}({\varphi })\) is the smallest set such that, for \(t \in \{d, u\}\):

  1. \(\varphi \in {{Cl}}({\varphi })\),

  2. \(AP \subseteq {{Cl}}({\varphi })\),

  3. if \(\psi \in {{Cl}}({\varphi })\) and \(\psi \ne \lnot \theta \), then \(\lnot \psi \in {{Cl}}({\varphi })\) (we identify \(\lnot \lnot \psi \) with \(\psi \));

  4. if \(\lnot \psi \in {{Cl}}({\varphi })\), then \(\psi \in {{Cl}}({\varphi })\);

  5. if any of \(\psi \wedge \theta \) or \(\psi \vee \theta \) is in \({{Cl}}({\varphi })\), then \(\psi , \theta \in {{Cl}}({\varphi })\);

  6. if any of \(\ocircle ^t \psi \), \(\circleddash ^t \psi \), \(\chi _F^{t} \psi \), or \(\chi _P^{t} \psi \) is in \({{Cl}}({\varphi })\), then \(\psi \in {{Cl}}({\varphi })\);

  7. if \(\chi _F^{d}\psi \) (resp. \(\chi _F^{u}\psi \)) is in \({{Cl}}({\varphi })\), then \(\chi _{F}^{\lessdot } \psi \) (resp. \(\chi _{F}^{\gtrdot } \psi \)), \(\chi _{F}^{\doteq } \psi \), \(\chi _L\) are in it;

  8. if \(\chi _P^{d}\psi \) (resp. \(\chi _P^{u}\psi \)) is in \({{Cl}}({\varphi })\), then \(\chi _{P}^{\lessdot } \psi \) (resp. \(\chi _{P}^{\gtrdot } \psi \)), \(\chi _{P}^{\doteq } \psi \) are in it;

  9. if any of \(\psi \mathbin {\mathcal {U}^{t}_{\chi }} \theta \), \(\psi \mathbin {\mathcal {S}^{t}_{\chi }} \theta \), \(\psi \mathbin {\mathcal {U}^{t}_{H}} \theta \), or \(\psi \mathbin {\mathcal {S}^{t}_{H}} \theta \) is in \({{Cl}}({\varphi })\), then \(\psi , \theta \in {{Cl}}({\varphi })\);

  10. if \(\psi \mathbin {\mathcal {U}^{t}_{\chi }} \theta \in {{Cl}}({\varphi })\), then \( \ocircle ^{t}(\psi \mathbin {\mathcal {U}^{t}_{\chi }} \theta ), \chi _{F}^{t}(\psi \mathbin {\mathcal {U}^{t}_{\chi }} \theta ) \in {{Cl}}({\varphi }) \) (the rule for \(\psi \mathbin {\mathcal {S}^{t}_{\chi }} \theta \) is symmetric).

The set \({{Atoms}}({\varphi })\) contains all consistent subsets of \({{Cl}}({\varphi })\), i.e. all \(\varPhi \subseteq {{Cl}}({\varphi })\) s.t.

  • for every \(\psi \in {{Cl}}({\varphi })\), \(\psi \in \varPhi \) iff \(\lnot \psi \notin \varPhi \);

  • \(\psi \wedge \theta \in \varPhi \), iff \(\psi \in \varPhi \) and \(\theta \in \varPhi \);

  • \(\psi \vee \theta \in \varPhi \), iff \(\psi \in \varPhi \) or \(\theta \in \varPhi \), or both.

The consistency constraints on \({{Atoms}}({\varphi })\) will be augmented incrementally in the following, for each operator.
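The propositional part of this construction (rules 1-5 of the closure, and the consistency requirements on atoms) can be prototyped in a few lines. The Haskell sketch below is hypothetical and deliberately naive: it enumerates all subsets of the closure, which is exponential in \(|{{Cl}}({\varphi })|\), in line with the bound discussed in Sect. 4.2.

```haskell
import Data.List (nub, subsequences)

-- The propositional fragment of POTL, enough to illustrate Cl(φ) and Atoms(φ).
data Formula
  = Atom String | Not Formula | And Formula Formula | Or Formula Formula
  deriving (Eq, Show)

-- Rules 1-5 of the closure: subformulas and their (single) negations.
closure :: Formula -> [Formula]
closure phi = nub (concatMap withNeg (sub phi))
  where
    withNeg f@(Not _) = [f]            -- ¬¬ψ is identified with ψ
    withNeg f         = [f, Not f]
    sub f = f : case f of
      Not g   -> sub g
      And g h -> sub g ++ sub h
      Or  g h -> sub g ++ sub h
      Atom _  -> []

-- Atoms(φ): subsets of Cl(φ) consistent w.r.t. negation, ∧ and ∨.
atoms :: Formula -> [[Formula]]
atoms phi = filter consistent (subsequences cl)
  where
    cl = closure phi
    consistent a = all ok cl
      where
        member f       = f `elem` a
        ok f@(Not g)   = member f /= member g
        ok f@(And g h) = member f == (member g && member h)
        ok f@(Or  g h) = member f == (member g || member h)
        ok _           = True
```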

The set of states of \(\mathcal {A}_\varphi \) is \(Q = {{Atoms}}({\varphi })^2\), and its elements, which we denote with Greek capital letters, are of the form \(\varPhi = (\varPhi _c, \varPhi _p)\), where \(\varPhi _c\) is the set of formulas that hold in the current position, and \(\varPhi _p\) is the set of temporal obligations. The latter keep track of arguments of temporal operators that must be satisfied after a chain body, skipping it. The way they do so depends on the transition relation \(\delta \), which we also define incrementally. Each automaton state is associated to word positions. So, for \((\varPhi , a, \varPsi ) \in \delta _{push/shift}\), with \(\varPhi \in {{Atoms}}({\varphi })^2\) and \(a \in {\mathcal {P}(AP)}\), we have \(\varPhi _c \cap AP = a\) (by \(\varPhi _c \cap AP\) we mean the set of atomic propositions in \(\varPhi _c\)). Pop moves do not read input symbols, and the automaton remains at the same position when performing them: for any \((\varPhi , \varTheta , \varPsi ) \in \delta _{pop}\) we impose \(\varPhi _c = \varPsi _c\). The initial set I contains states of the form \((\varPhi _c, \varPhi _p)\), with \(\varphi \in \varPhi _c\), and the final set F states of the form \((\varPsi _c, \varPsi _p)\), s.t. \(\varPsi _c \,\cap \, AP = \{\#\}\) and \(\varPsi _c\) contains no future operators. We extend the construction to the most important operators, leaving the others and correctness proofs to [21].

Next/Back Operators. Let \((\varPhi , a, \varPsi ) \in \delta _{shift} \cup \delta _{push}\), with \(\varPhi , \varPsi \in {{Atoms}}({\varphi })^2\), \(a \in {\mathcal {P}(AP)}\), and let \(b = \varPsi _c \cap AP\): we have \(\ocircle ^d \psi \in \varPhi _c\) iff \(\psi \in \varPsi _c\) and either \(a \lessdot b\) or \(a \doteq b\). The constraints introduced for the \(\circleddash ^d\) operator are symmetric, and for their upward counterparts it suffices to replace \(\lessdot \) with \(\gtrdot \).

If \(\chi _F^{d}\psi \in {{Cl}}({\varphi })\), for each \(\varPhi \in {{Atoms}}({\varphi })^2\) we impose that \(\chi _F^{d}\psi \in \varPhi _c\) iff \(\chi _{F}^{\lessdot } \psi \in \varPhi _c\) or \(\chi _{F}^{\doteq } \psi \in \varPhi _c\). Analogous rules are defined for the upward and past chain operators. The auxiliary symbol \(\chi _L\) forces the current position to be the first one of a chain body. Let the current state of the OPA be \(\varPhi \in {{Atoms}}({\varphi })^2\): \(\chi _L \in \varPhi _p\) iff the next transition (i.e. the one reading the current position) is a push. Formally, if \((\varPhi , a, \varPsi ) \in \delta _{shift}\) or \((\varPhi , \varTheta , \varPsi ) \in \delta _{pop}\), for any \(\varPhi , \varTheta , \varPsi \) and a, then \(\chi _L \not \in \varPhi _p\). If \((\varPhi , a, \varPsi ) \in \delta _{push}\), then \(\chi _L \in \varPhi _p\). For any initial state \((\varPhi _c, \varPhi _p) \in I\), we have \(\chi _L \in \varPhi _p\) iff \(\# \not \in \varPhi _c\).

If \(\chi _{F}^{\doteq } \psi \in {{Cl}}({\varphi })\), its satisfaction is ensured by the following constraints on \(\delta \):

  1. Let \((\varPhi , a, \varPsi ) \in \delta _{push/shift}\): then \(\chi _{F}^{\doteq } \psi \in \varPhi _c\) iff \(\chi _{F}^{\doteq } \psi , \chi _L \in \varPsi _p\);

  2. let \((\varPhi , \varTheta , \varPsi ) \in \delta _{pop}\): then \(\chi _{F}^{\doteq } \psi \not \in \varPhi _p\), and \(\chi _{F}^{\doteq } \psi \in \varTheta _p\) iff \(\chi _{F}^{\doteq } \psi \in \varPsi _p\);

  3. let \((\varPhi , a, \varPsi ) \in \delta _{shift}\): then \(\chi _{F}^{\doteq } \psi \in \varPhi _p\) iff \(\psi \in \varPhi _c\).

If \(\chi _{F}^{\lessdot } \psi \in {{Cl}}({\varphi })\), \(\chi _{F}^{\lessdot } \psi \) is allowed in the pending part of initial states, and we add the following constraints:

  4. Let \((\varPhi , a, \varPsi ) \in \delta _{push/shift}\): then \(\chi _{F}^{\lessdot } \psi \in \varPhi _c\) iff \(\chi _{F}^{\lessdot } \psi , \chi _L \in \varPsi _p\);

  5. let \((\varPhi , \varTheta , \varPsi ) \in \delta _{pop}\): then \(\chi _{F}^{\lessdot } \psi \in \varTheta _p\) iff \(\chi _L \in \varPsi _p\), and either \(\chi _{F}^{\lessdot } \psi \in \varPsi _p\) or \(\psi \in \varPhi _c\).

Fig. 5. Example accepting run of the automaton for \(\chi _F^{d}\mathbf {ret}\).

We illustrate how the construction works for \(\chi _{F}^{\doteq }\) with the example of Fig. 5. The OPA starts in state \(\varPhi ^0\), with \(\chi _F^{d}\mathbf {ret}\in \varPhi ^0_c\), and guesses that \(\chi _F^{d}\) will be fulfilled by \(\chi _{F}^{\doteq }\), so \(\chi _{F}^{\doteq } \mathbf {ret}\in \varPhi ^0_c\). \(\mathbf {call}\) is read by a push move, resulting in state \(\varPhi ^1\). The OPA guesses the next move will be a push, so \(\chi _L \in \varPhi ^1_p\). By rule 1, we have \(\chi _{F}^{\doteq } \mathbf {ret}\in \varPhi ^1_p\). The last guess is immediately verified by the next push (step 2–3). Thus, the pending obligation for \(\chi _{F}^{\doteq } \mathbf {ret}\) is stored onto the stack in \(\varPhi ^1\). The OPA, then, reads \(\mathbf {exc}\) with a shift, and pops the stack symbol containing \(\varPhi ^1\) (step 4–5). By rule 2, the temporal obligation is resumed in the next state \(\varPhi ^4\), so \(\chi _{F}^{\doteq } \mathbf {ret}\in \varPhi ^4_p\). Finally, \(\mathbf {ret}\) is read by a shift which, by rule 3, may occur only if \(\mathbf {ret}\in \varPhi ^4_c\). Rule 3 verifies the guess that \(\chi _{F}^{\doteq } \mathbf {ret}\) holds in \(\varPhi _0\), and fulfills the temporal obligation contained in \(\varPhi ^4_p\), by preventing computations in which \(\mathbf {ret}\not \in \varPhi ^4_c\) from continuing. Had the next transition been a pop (e.g. because there was no \(\mathbf {ret}\) and \(\mathbf {call}\gtrdot \#\)), the run would have been blocked by rule 2, preventing the OPA from reaching an accepting state, and from emptying the stack.

Summary Until and Since. The construction for these operators is based on their expansion laws. For any \(\varPhi \in {{Atoms}}({\varphi })^2\), we have \(\psi \mathbin {\mathcal {U}^{t}_{\chi }} \theta \in \varPhi _c\), with \(t \in \{d, u\}\) being a direction, iff either: 1. \(\theta \in \varPhi _c\), 2. \(\ocircle ^{t}(\psi \mathbin {\mathcal {U}^{t}_{\chi }} \theta ), \psi \in \varPhi _c\), or 3. \(\chi _{F}^{t}(\psi \mathbin {\mathcal {U}^{t}_{\chi }} \theta ), \psi \in \varPhi _c\). The rules for since are symmetric.

Hierarchical Operators. For the hierarchical operators, we do not give an explicit OPA construction, but we rely on a translation into other POTL operators. For each hierarchical operator \(\eta \) in \(\varphi \), we add a propositional symbol \(\mathrm{q}_{(\eta )}\). The upward hierarchical operators consider the right contexts of chains sharing the same left context. To distinguish such positions, we define formula \( \gamma _{L,\eta } := \chi _{P}^{\lessdot } \big (\mathrm{q}_{(\eta )} \wedge \ocircle (\square \lnot \mathrm{q}_{(\eta )}) \wedge \circleddash (\boxminus \lnot \mathrm{q}_{(\eta )})\big ), \) where \(\square \) and \(\boxminus \) are as in Sect. 3.1. \(\ocircle \) and \(\circleddash \) are the LTL next and back operators, for which model checking can be done as for \(\ocircle ^d\) and \(\circleddash ^d\), but removing the restrictions on PR. \(\gamma _{L,\eta }\), evaluated on a position i, asserts that \(\mathrm{q}_{(\eta )}\) holds in the unique position h such that \(\chi (h, i)\) and \(h \lessdot i\). Thus, \(\mathrm{q}_{(\eta )}\) can be used to distinguish other positions j such that \(\chi (h, j)\) and \(h \lessdot j\), as \(\chi _{P}^{\lessdot } \mathrm{q}_{(\eta )}\) holds in them. The future upward hierarchical operators are then translated in terms of \(\mathrm{q}_{(\eta )}\) and \(\gamma _{L,\eta }\), the others being analogous.

4.1 Model Checking for \(\omega \)-Words

To perform model checking of a POTL formula \(\varphi \) on OP \(\omega \)-words, we build a generalized \(\omega \)OPBA \( \mathcal {A}_\varphi ^\omega = ( {\mathcal {P}(AP)}, M_{AP}, Q_\omega , I, \mathbf {F}, \delta ) \), where \(Q_\omega = {{Atoms}}({\varphi })^2 \,\times \, {\mathcal {P}({{Cl}}_{stack}({\varphi }))}\), which differs from the finite-word OPA only in the state set and the acceptance condition. As in [2], the generalized Büchi acceptance condition is a slight variation on the one shown in Sect. 2.1: \(\mathbf {F}\) is the set of sets of Büchi final states, and an \(\omega \)-word is accepted iff at least one state from each one of the sets contained in \(\mathbf {F}\) is visited infinitely often during the computation.

On finite words, the stack is empty at the end of every accepting computation, which implies the satisfaction of all temporal constraints tracked by the pending part of stack symbols. In \(\omega \)OPBAs, the stack may never become empty, and symbols with a non-empty pending part may remain in it indefinitely, never enforcing the satisfaction of the respective formulas. To overcome this issue, we use \({{Atoms}}({\varphi })^2 \times {\mathcal {P}({{Cl}}_{stack}({\varphi }))}\), with \({{Cl}}_{stack}({\varphi }) \subseteq {{Cl}}({\varphi })\), as the state set of the \(\omega \)OPBA. Such states have the form \(\varPhi = (\varPhi _c, \varPhi _p, \varPhi _s)\), where \(\varPhi _c\) and \(\varPhi _p\) have the same role as in the finite-word case, and \(\varPhi _s\) is the in-stack part of \(\varPhi \). All rules previously defined for \(\varPhi _c\) and \(\varPhi _p\) remain the same. \(\varPhi _s\) contains the elements of \({{Cl}}_{stack}({\varphi })\) contained in any symbol currently on the stack. \({{Cl}}_{stack}({\varphi })\) contains the formulas in \({{Cl}}({\varphi })\) that use the stack to ensure the satisfaction of future temporal requirements, namely all \(\chi _{F}^{\pi } \psi \in {{Cl}}({\varphi })\), with \(\pi \in \{\lessdot , \doteq , \gtrdot \}\). Thus, pending temporal obligations are moved from the stack to the \(\omega \)OPBA state, and they can be considered by the Büchi acceptance condition.

Suppose we want to model check \(\chi _{F}^{\doteq } \psi \). Formula \(\chi _{F}^{\doteq } \psi \) must be inserted into the in-stack part of the current state whenever a stack symbol containing it in its pending part is pushed. It must be kept in the in-stack part of the current state until the last stack symbol containing it in its pending part is popped, which marks the satisfaction of its temporal requirement. It is then possible to define an acceptance set \(F_{\chi _{F}^{\doteq } \psi } \in \mathbf {F}\) as the set of states not containing \(\chi _{F}^{\doteq } \psi \) in any of their parts. Figure 6 shows an \(\omega \)OPBA run of this kind. Notice that after step 7, \(\chi _{F}^{\doteq } \psi \) no longer appears in the in-stack part of any state, so the run is accepting.
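In code, such an acceptance set could be computed as follows (a minimal sketch; the state and formula types are illustrative assumptions, not POMC's API):

```haskell
import qualified Data.Set as Set

-- An omega-OPBA state: (current part, pending part, in-stack part).
type OmegaState f = (Set.Set f, Set.Set f, Set.Set f)

-- The acceptance set for a formula of Cl_stack, e.g. chi_F^{=.} psi:
-- all states in which the formula occurs in no component, so that
-- visiting such a state infinitely often guarantees the obligation is
-- not postponed forever.
acceptanceSetFor :: Ord f => f -> [OmegaState f] -> [OmegaState f]
acceptanceSetFor g states =
  [ st | st@(phiC, phiP, phiS) <- states
       , g `Set.notMember` phiC
       , g `Set.notMember` phiP
       , g `Set.notMember` phiS ]
```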

This construction is formalized as follows. Let \(\psi \in {{Cl}}_{stack}({\varphi })\). We add the following constraints on the transition relations (a code-level sketch of this bookkeeping is given after the list). For any \(\varPhi , \varTheta , \varPsi \in Q_\omega \) and \(a \in {\mathcal {P}(AP)}\):

6. let \((\varPhi , a, \varTheta ) \in \delta _{push}\): if \(\psi \in \varPhi _p\), then \(\psi \in \varTheta _s\);

7. let \((\varPhi , a, \varTheta ) \in \delta _{push} \cup \delta _{shift}\): if \(\psi \in \varPhi _s\), then \(\psi \in \varTheta _s\);

8. let \((\varPhi , \varTheta , \varPsi ) \in \delta _{pop}\): if \(\psi \in \varPhi _s\) and \(\psi \in \varTheta _s\), then \(\psi \in \varPsi _s\).
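The following Haskell fragment sketches one deterministic reading of rules 6–8; the names and types are our own illustration and do not reflect POMC's implementation:

```haskell
import qualified Data.Set as Set

-- Rules 6 and 7, push case: the obligations pending in the source state
-- (already restricted to members of Cl_stack) enter the in-stack part of
-- the target, together with what was already in-stack.
pushInStack :: Ord f => Set.Set f -> Set.Set f -> Set.Set f
pushInStack phiPstack phiS = phiPstack `Set.union` phiS

-- Rule 7, shift case: the in-stack part is carried over unchanged.
shiftInStack :: Set.Set f -> Set.Set f
shiftInStack phiS = phiS

-- Rule 8, pop case: an obligation remains in-stack only if it is
-- in-stack both in the current state and in the state stored in the
-- popped symbol; otherwise the last symbol carrying it has been popped.
popInStack :: Ord f => Set.Set f -> Set.Set f -> Set.Set f
popInStack phiS thetaS = phiS `Set.intersection` thetaS
```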

Fig. 6. Prefix of an accepting run of the automaton for \(\chi _F^{d}\mathbf {ret}\).

An acceptance condition for the summary until operators is also needed. For \(\psi \mathbin {\mathcal {U}^{u}_{\chi }} \theta \), we add an acceptance set such that for any \(\varPhi \) in it no formula of the form \(\chi _{F}^{\pi }(\psi \mathbin {\mathcal {U}^{u}_{\chi }} \theta )\) is in \(\varPhi _s\), and either \(\psi \mathbin {\mathcal {U}^{u}_{\chi }} \theta \notin \varPhi _c\) or \(\theta \in \varPhi _c\). The condition for \(\psi \mathbin {\mathcal {U}^{d}_{\chi }} \theta \) is symmetric.

4.2 Complexity

The set \({{Cl}}({\varphi })\) has size linear in \(|\varphi |\), the length of \(\varphi \), and \({{Atoms}}({\varphi })\) has size at most \(2^{|{{Cl}}({\varphi })|} = 2^{O(|\varphi |)}\). The state set of the finite-word OPA has size at most the square of this bound, and that of the \(\omega \)OPBA is bounded by its cube. Moreover, the use of the equivalences for the hierarchical operators causes only a linear increase in the length of \(\varphi \). Therefore,

Theorem 2

Given a POTL formula \(\varphi \), it is possible to build an OPA or an \(\omega \)OPBA \(\mathcal {A}_\varphi \) accepting the language denoted by \(\varphi \) with at most \(2^{O(|\varphi |)}\) states.
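Indeed, since \({{Cl}}_{stack}({\varphi }) \subseteq {{Cl}}({\varphi })\) and \(|{{Cl}}({\varphi })| = O(|\varphi |)\), the state counts can be spelled out as
$$ |{{Atoms}}({\varphi })^2| \le \big (2^{|{{Cl}}({\varphi })|}\big )^2 = 2^{O(|\varphi |)} \quad \text {and} \quad |Q_\omega | = |{{Atoms}}({\varphi })^2| \cdot 2^{|{{Cl}}_{stack}({\varphi })|} \le \big (2^{|{{Cl}}({\varphi })|}\big )^3 = 2^{O(|\varphi |)}. $$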

\(\mathcal {A}_\varphi \) can then be intersected [42] with an OPA/\(\omega \)OPBA modeling a program (e.g. Fig. 2), and emptiness can be decided with summarization techniques [4].

Table 1. Results of the evaluation. ‘# states’ refers to the OPA to be verified.

5 Experimental Evaluation

We implemented the OPA construction of Sect. 4 in an explicit-state model checking tool called POMC. The tool is written in Haskell [45], a purely functional, statically typed programming language with lazy evaluation. POMC checks OPA for emptiness by searching for a reachable accepting configuration, by means of a modified DFS of the transition relation. This algorithm, similar to the one in [9], exploits the fact that all transitions only depend on the topmost stack symbol, so reachability is actually computed on semi-configurations made of one state and one stack symbol. Each time a chain support is explored, its ending semi-configuration is saved and associated with the starting one, so that the next time the latter is reached, the support does not have to be re-explored. This allows the algorithm to cope with the cyclic behavior of OPA and to terminate after having explored the whole transition relation. Given a POTL specification \(\varphi \) and an OPA \(\mathcal {A}\) to be checked, POMC executes the reachability algorithm, generating the product between \(\mathcal {A}\) and the OPA for \(\lnot \varphi \) on-the-fly. The present prototype of POMC only supports finite-word model checking; its extension to \(\omega \)-languages is under development.
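The following Haskell sketch illustrates the kind of bookkeeping this entails; the types and names are our own illustration and do not reflect POMC's actual code:

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- A semi-configuration: a state together with the topmost stack symbol
-- (Nothing when the stack is empty).
type SemiConf state sym = (state, Maybe sym)

-- State of the search: the semi-configurations already visited, and the
-- summaries of fully explored chain supports, mapping the starting
-- semi-configuration of a support to those reached when it is closed.
data Search state sym = Search
  { visited   :: Set.Set (SemiConf state sym)
  , summaries :: Map.Map (SemiConf state sym) (Set.Set (SemiConf state sym))
  }

-- Record a fully explored support from sc to sc', so that when sc is
-- reached again the search can jump directly to sc' without
-- re-exploring the support.
addSummary :: (Ord state, Ord sym)
           => SemiConf state sym -> SemiConf state sym
           -> Search state sym -> Search state sym
addSummary sc sc' s =
  s { summaries = Map.insertWith Set.union sc (Set.singleton sc') (summaries s) }
```

Reaching a previously summarized semi-configuration then amounts to a map lookup rather than a re-exploration of the corresponding support.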

We checked several requirements on three case studies with POMC, and report the results in Table 1. Additional formulas we checked are shown in Table 2. These results can be reproduced through a publicly available artifact. The experiments were executed on a laptop with a 2.2 GHz Intel processor and 15 GiB of RAM, running Ubuntu GNU/Linux 20.04. In the tables, by “Total” memory we mean the maximum resident memory, including the Haskell runtime (which allocates 70 MiB by default), and by “MC only” the maximum memory used by model checking alone, as reported by the runtime. Since model checking is polynomial in OPA size and exponential in formula length, we focus on checking a variety of requirements, rather than large OPA.

Generic Procedural Program. We checked formula

$$ \square \big ((\mathbf {call}\wedge \mathrm{p}_B \wedge {Scall}(\top , \mathrm{p}_A)) \implies {CallThr}(\top ) \big ) $$

from Sect. 3.1 on the OPA of Fig. 2 (bench. 1), and also against two larger OPA (2, where the property does not hold, and 3, where it holds).

We also checked the largest of such OPA against a set of formulas devised to exercise all POTL operators. The results are reported in Table 2. All formulas are checked very quickly, with a single outlier that runs out of memory on this machine. We ran that instance on a machine with a 2.0 GHz AMD CPU and 512 GiB of RAM running Debian GNU/Linux 10, where it completed in 367 s with a memory occupancy of 16.3 GiB.

Stack Inspection. The security framework of the Java Development Kit (JDK) is based on stack inspection, i.e. the analysis of the contents of the program’s stack during the execution. The JDK provides method checkPermission(perm) from class AccessController, which searches the stack for frames of functions that have not been granted permission perm. If any are found, an exception is thrown. Such permission checks prevent the execution of privileged code by unauthorized parts of the program, but they must be placed in sensitive points manually. Failure to place them appropriately may cause the unauthorized execution of privileged code. An automated tool to check that no code can escape such checks is thus desirable. Any such tool would need the ability to model exceptions, as they are used to avoid code execution in case of security violations.

[37] explains such needs by providing an example Java program for managing a bank account. It allows the user to check the account balance and to withdraw money. To perform these tasks, the invoking program must have been granted permissions CanPay and Debit, respectively. We modeled this program as an OPA (4), and proved that it enforces such security measures effectively by checking it against a formula stating that the account balance cannot be read if some function on the stack lacks the CanPay permission (a similar formula checks the Debit permission).

Exception Safety. [53] is a tutorial on how to make exception-safe generic containers in C++. It presents two implementations of a generic stack data structure, parametric on the element type T. The first one is not exception-safe: if the constructor of T throws an exception during a pop action, the topmost element is removed but not returned, and it is lost. This violates the strong exception safety requirement that each operation be rolled back if an exception is thrown. The second version of the data structure satisfies this requirement.

While exception safety is, in general, undecidable, it is possible to prove the stronger requirement that each modification to the data structure is committed only once no more exceptions can be thrown. We modeled both versions as OPA, and checked this requirement with the following formula:

$$ \square (\mathbf {exc}\implies \lnot ((\circleddash ^u\mathtt {modified} \,\vee \, \chi _P^{u}\mathtt {modified}) \,\wedge \, \chi _P^{u}(\mathtt {Stack::push} \,\vee \, \mathtt {Stack::pop}))) $$

POMC successfully found a counterexample for the first implementation (5), and proved the safety of the second one (6).

Additionally, we proved that both implementations are exception neutral (7, 8), i.e. Stack functions do not block exceptions thrown by the underlying type T. This was accomplished by checking the following formula:

$$ \square (\mathbf {exc}\wedge \circleddash ^u\mathtt {T} \wedge \chi _P^{d}(\mathbf {han}\wedge \chi _P^{d}\mathtt {Stack}) \implies \chi _P^{d}\chi _P^{d}\chi _F^{u}\mathbf {exc}). $$
Table 2. Results of the additional experiments on OPA “generic larger”.

6 Conclusions

We introduced the temporal logic POTL, gave an automata-theoretic model checking procedure, and implemented it in a prototype tool. The results obtained in its experimental evaluation are promising. Additionally, POTL is proved to be FO-complete in a technical report [23]. We argue that the strong gain in expressive power w.r.t. previous approaches to model checking CFL, which comes without an increase in computational complexity, is worth the technicalities needed to achieve the present—and future—results.

In the evaluation, we used models directly encoded as OPA. To ease user interaction with our tool, we additionally implemented a new input format based on a simple procedural language with exceptions and Boolean variables, which is automatically translated into OPA. Moreover, we are currently working on the implementation of model checking for \(\omega \)-words, described in Sect. 4.1.

As a future research step, we plan to develop user-friendly domain-specific languages for specifications too, to prove that OP languages and logics are suitable in practice for program verification.