Abstract
There are two kinds of higher-order extensions of model checking: HORS model checking and HFL model checking. Whilst the former has been applied to automated verification of higher-order functional programs, applications of the latter have not been well studied. In the present paper, we show that various verification problems for functional programs, including may/must-reachability, trace properties, and linear-time temporal properties (and their negations), can be naturally reduced to (extended) HFL model checking. The reductions yield a sound and complete logical characterization of those program properties. Compared with the previous approaches based on HORS model checking, our approach provides a more uniform, streamlined method for higher-order program verification.
1 Introduction
There are two kinds of higher-order extensions of model checking in the literature: HORS model checking [16, 32] and HFL model checking [42]. The former is concerned with whether the tree generated by a given higher-order tree grammar called a higher-order recursion scheme (HORS) satisfies the property expressed by a given modal \(\mu\)-calculus formula (or a tree automaton), and the latter is concerned with whether a given finite state system satisfies the property expressed by a given formula of higher-order modal fixpoint logic (HFL), a higher-order extension of the modal \(\mu\)-calculus. Whilst HORS model checking has been applied to automated verification of higher-order functional programs [17, 18, 22, 26, 33, 41, 43], there have been few studies on applications of HFL model checking to program/system verification. Although HFL was introduced more than 10 years ago, we are only aware of applications to assume-guarantee reasoning [42] and process equivalence checking [28].
In the present paper, we show that various verification problems for higher-order functional programs can actually be reduced to (extended) HFL model checking in a rather natural manner. We briefly explain the idea of our reduction below. We translate a program to an HFL formula that says “the program has a valid behavior” (where the validity of a behavior depends on each verification problem). Thus, a program is actually mapped to a property, and a program property is mapped to a system to be verified; this has been partially inspired by the recent work of Kobayashi et al. [19], where HORS model checking problems have been translated to HFL model checking problems by switching the roles of models and properties.
For example, consider a simple program fragment \(\mathtt{read}(x); \mathtt{close}(x)\) that reads and then closes a file (pointer) \(x\). The transition system in Fig. 1 shows a valid access protocol to read-only files. Then, the property that a read operation is allowed in the current state can be expressed by a formula of the form \(\langle{\mathtt{read}}\rangle\varphi\), which says that the current state has a \(\mathtt{read}\)-transition, after which \(\varphi\) is satisfied. Thus, the program \(\mathtt{read}(x); \mathtt{close}(x)\) being valid is expressed as \(\langle{\mathtt{read}}\rangle\langle{\mathtt{close}}\rangle\mathbf{true}\), which is indeed satisfied by the initial state \(q_0\) of the transition system in Fig. 1. Here, we have just replaced the operations \(\mathtt{read}\) and \(\mathtt{close}\) of the program with the corresponding modal operators \(\langle{\mathtt{read}}\rangle\) and \(\langle{\mathtt{close}}\rangle\). We can also naturally deal with branches and recursions. For example, consider the program \(\mathtt{close}(x)\Box(\mathtt{read}(x); \mathtt{close}(x))\), where \(e_1\Box e_2\) represents a non-deterministic choice between \(e_1\) and \(e_2\). Then the property that the program always accesses \(x\) in a valid manner can be expressed by \((\langle{\mathtt{close}}\rangle\mathbf{true}) \wedge (\langle{\mathtt{read}}\rangle\langle{\mathtt{close}}\rangle\mathbf{true})\). Note that we have just replaced the non-deterministic branch with the logical conjunction, as we wish here to require that the program’s behavior is valid in both branches.
We can also deal with conditional branches if HFL is extended with predicates; \(\mathbf{if}\ b\ \mathbf{then}\ \mathtt{close}(x)\ \mathbf{else}\ {(\mathtt{read}(x);\mathtt{close}(x))}\) can be translated to \((b\Rightarrow \langle{\mathtt{close}}\rangle\mathbf{true}) \wedge (\lnot b\Rightarrow \langle{\mathtt{read}}\rangle\langle{\mathtt{close}}\rangle\mathbf{true})\). Let us also consider the recursive function \(f\) defined by:
\[ f\,x = \mathtt{close}(x)\Box(\mathtt{read}(x); \mathtt{read}(x); f\,x). \]
Then, the program \(f\,x\) being valid can be represented by using a (greatest) fixpoint formula:
\[ \nu F.(\langle{\mathtt{close}}\rangle\mathbf{true} \wedge \langle{\mathtt{read}}\rangle\langle{\mathtt{read}}\rangle F). \]
If the state \(q_0\) satisfies this formula (which is indeed the case), then we know that all the file accesses made by \(f\,x\) are valid. So far, we have used only modal \(\mu\)-calculus formulas. If we wish to express the validity of higher-order programs, we need HFL formulas; such examples are given later.
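The informal examples above can be checked mechanically on a small model. The following Python sketch is only an illustration: Fig. 1 is not reproduced in this text, so the transition set `TRANS` (a `read` loop on `q0` and a `close` step from `q0` to `q1`) is an assumption chosen to agree with the surrounding discussion, and the recursive formula is assumed to have the shape \(\nu F.(\langle{\mathtt{close}}\rangle\mathbf{true}\wedge\langle{\mathtt{read}}\rangle\langle{\mathtt{read}}\rangle F)\), i.e. "close, or read twice and repeat".

```python
# Assumed file-access protocol (Fig. 1 is not available here):
# q0 --read--> q0 and q0 --close--> q1.
STATES = frozenset({"q0", "q1"})
TRANS = {("q0", "read", "q0"), ("q0", "close", "q1")}
TRUE = STATES  # the formula "true" denotes the set of all states


def diamond(action, target):
    """States with at least one `action`-transition into `target` (<action>target)."""
    return frozenset(s for (s, a, t) in TRANS if a == action and t in target)


# <read><close>true: the program read(x); close(x)
seq = diamond("read", diamond("close", TRUE))

# (<close>true) /\ (<read><close>true): both branches of the choice must be valid
choice = diamond("close", TRUE) & seq


def gfp(f):
    """Greatest fixpoint of a monotone f on subsets of STATES, by iteration from the top."""
    x = STATES
    while f(x) != x:
        x = f(x)
    return x


# nu F. (<close>true /\ <read><read>F): the assumed translation of the recursive function
rec = gfp(lambda x: diamond("close", TRUE) & diamond("read", diamond("read", x)))

print(sorted(seq), sorted(choice), sorted(rec))
```

Under these assumptions all three formulas denote the set containing only `q0`, which matches the claim that the initial state satisfies them. Iterating from the full state set is sufficient here because the state space is finite and the function is monotone.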
We generalize the above idea and formalize reductions from various classes of verification problems for simply-typed higher-order functional programs with recursion, integers and non-determinism – including verification of may/must-reachability, trace properties, and linear-time temporal properties (and their negations) – to (extended) HFL model checking where HFL is extended with integer predicates, and prove soundness and completeness of the reductions. Extended HFL model checking problems obtained by the reductions are (necessarily) undecidable in general, but for finite-data programs (i.e., programs that consist of only functions and data from finite data domains such as Booleans), the reductions yield pure HFL model checking problems, which are decidable [42].
Our reductions provide sound and complete logical characterizations of the wide range of program properties mentioned above. Nice properties of the logical characterizations include: (i) (like verification conditions for Hoare triples,) once the logical characterization is obtained as an HFL formula, purely logical reasoning can be used to prove or disprove it (without further referring to the program semantics); for that purpose, one may use theorem provers with various degrees of automation, ranging from interactive ones like Coq, through semi-automated ones requiring some annotations, to fully automated ones (though the latter two are yet to be implemented), (ii) (unlike the standard verification condition generation for Hoare triples using invariant annotations) the logical characterization can automatically be computed, without any annotations, (iii) standard logical reasoning can be applied based on the semantics of formulas; for example, co-induction and induction can be used for proving \(\nu\)- and \(\mu\)-formulas respectively, and (iv) thanks to the completeness, the set of program properties characterizable by HFL formulas is closed under negation; for example, from a formula characterizing may-reachability, one can obtain a formula characterizing non-reachability by just taking the De Morgan dual.
Compared with previous approaches based on HORS model checking [18, 22, 26, 33, 37], our approach based on (extended) HFL model checking provides more uniform, streamlined methods for higher-order program verification. HORS model checking provides sound and complete verification methods for finite-data programs [17, 18], but for infinite-data programs, other techniques such as predicate abstraction [22] and program transformation [27, 31] had to be combined to obtain sound (but incomplete) reductions to HORS model checking. Furthermore, the techniques differed for each program property, such as reachability [22], termination [27], non-termination [26], fair termination [31], and fair non-termination [43]. In contrast, our reductions are sound and complete even for infinite-data programs. Although the obtained HFL model checking problems are undecidable in general, the reductions allow us to treat various program properties uniformly; all the verification problems boil down to the issue of how to prove \(\mu\)- and \(\nu\)-formulas (and as remarked above, we can use induction and co-induction to deal with them). Technically, our reduction to HFL model checking may actually be considered an extension of HORS model checking in the following sense. HORS model checking algorithms [21, 32] usually consist of two phases, one for computing a kind of higher-order “procedure summaries” in the form of variable profiles [32] or intersection types [21], and the other for nested least/greatest fixpoint computations. Our reduction from program verification to extended HFL model checking (the reduction given in Sect. 7, in particular) can be regarded as an extension of the first phase to deal with infinite data domains, where the problem for the second phase is expressed in the form of extended HFL model checking: see [23] for more details.
The rest of this paper is structured as follows. Section 2 introduces HFL extended with integer predicates and defines the HFL model checking problem. Section 3 informally demonstrates some examples of reductions from program verification problems to HFL model checking. Section 4 introduces a functional language used to formally discuss the reductions in later sections. Sections 5, 6, and 7 consider may/must-reachability, trace properties, and temporal properties respectively, and present (sound and complete) reductions from verification of those properties to HFL model checking. Section 8 discusses related work, and Sect. 9 concludes the paper. Proofs are found in an extended version [23].
2 (Extended) HFL
In this section, we introduce an extension of higher-order modal fixpoint logic (HFL) [42] with integer predicates (which we call HFL\(_{\mathbf{Z}}\); we often drop the subscript and write HFL, as in Sect. 1), and define the HFL\(_{\mathbf{Z}}\) model checking problem. The set of integers can actually be replaced by another infinite set \(X\) of data (like the set of natural numbers or the set of finite trees) to yield HFL\(_{X}\).
2.1 Syntax
For a map \(f\), we write \(dom(f)\) and \(codom(f)\) for the domain and codomain of \(f\) respectively. We write \(\mathbf{Z}\) for the set of integers, ranged over by the meta-variable \(n\) below. We assume a set \(\mathbf{Pred}\) of primitive predicates on integers, ranged over by \(p\). We write \(\mathtt{arity}(p)\) for the arity of \(p\). We assume that \(\mathbf{Pred}\) contains standard integer predicates such as \(=\) and \(<\), and also assume that, for each predicate \(p\in\mathbf{Pred}\), there also exists a predicate \(\lnot p\in\mathbf{Pred}\) such that, for any integers \(n_1,\ldots,n_k\), \(p(n_1,\ldots,n_k)\) holds if and only if \(\lnot p(n_1,\ldots,n_k)\) does not hold; thus, \(\lnot p(n_1,\ldots,n_k)\) should be parsed as \((\lnot p)(n_1,\ldots,n_k)\), but can semantically be interpreted as \(\lnot(p(n_1,\ldots,n_k))\).
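The syntax of HFL\(_{\mathbf{Z}}\) formulas is given by:
\[\begin{array}{rcl}
\varphi \text{ (formulas)} &::=& n \mid \varphi_1\mathbin{\mathtt{op}}\varphi_2 \mid \mathbf{true} \mid \mathbf{false} \mid p(\varphi_1,\ldots,\varphi_k) \mid \varphi_1\vee\varphi_2 \mid \varphi_1\wedge\varphi_2\\
&\mid& X \mid \langle{a}\rangle\varphi \mid [a]\varphi \mid \mu X^{\tau}.\varphi \mid \nu X^{\tau}.\varphi \mid \lambda X\mathbin{:}\sigma.\varphi \mid \varphi_1\;\varphi_2\\
\tau \text{ (types)} &::=& \bullet \mid \sigma\rightarrow\tau \qquad\quad \sigma \text{ (extended types)} ::= \tau \mid \mathtt{int}
\end{array}\]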
Here, \(\mathbin{\mathtt{op}}\) ranges over a set of binary operations on integers, such as \(+\), and \(X\) ranges over a denumerable set of variables. We have extended the original HFL [42] with integer expressions (\(n\) and \(\varphi_1\mathbin{\mathtt{op}}\varphi_2\)), and atomic formulas \(p(\varphi_1,\ldots,\varphi_k)\) on integers (here, the arguments of integer operations or predicates will be restricted to integer expressions by the type system introduced below). Following [16], we have omitted negations, as any formula can be transformed to an equivalent negation-free formula [30].
We explain the meaning of each formula informally; the formal semantics is given in Sect. 2.2. Like modal \(\mu\)-calculus [10, 25], each formula expresses a property of a labeled transition system. The first line of the syntax of formulas consists of the standard constructs of predicate logic. On the second line, as in the standard modal \(\mu\)-calculus, \(\langle{a}\rangle\varphi\) means that there exists an \(a\)-labeled transition to a state that satisfies \(\varphi\). The formula \([a]\varphi\) means that after any \(a\)-labeled transition, \(\varphi\) is satisfied. The formulas \(\mu X^{\tau}.\varphi\) and \(\nu X^{\tau}.\varphi\) represent the least and greatest fixpoints respectively (i.e., the least and greatest \(X\) such that \(X=\varphi\)); unlike the modal \(\mu\)-calculus, \(X\) may range over not only propositional variables but also higher-order predicate variables (of type \(\tau\)). The \(\lambda\)-abstractions \(\lambda X\mathbin{:}{\sigma}.\varphi\) and applications \(\varphi_1\;\varphi_2\) are used to manipulate higher-order predicates. We often omit type annotations in \(\mu X^{\tau}.\varphi\), \(\nu X^{\tau}.\varphi\) and \(\lambda X\mathbin{:}{\sigma}.\varphi\), and just write \(\mu X.\varphi\), \(\nu X.\varphi\) and \(\lambda X.\varphi\).
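To make the difference between the two modalities concrete, here is a minimal Python sketch that computes the sets of states satisfying \(\langle{a}\rangle\varphi\) and \([a]\varphi\) when \(\varphi\) is given as a set of states. The three-state transition system and the names `STATES`, `TRANS`, `diamond` and `box` are hypothetical and chosen only for illustration.

```python
# Hypothetical LTS: s0 has two a-successors, s1 has a b-loop, s2 is a dead end.
STATES = frozenset({"s0", "s1", "s2"})
TRANS = {("s0", "a", "s1"), ("s0", "a", "s2"), ("s1", "b", "s1")}


def diamond(action, target):
    """<action>target: some `action`-transition leads into `target`."""
    return frozenset(s for (s, a, t) in TRANS if a == action and t in target)


def box(action, target):
    """[action]target: every `action`-transition leads into `target`."""
    return frozenset(
        s for s in STATES
        if all(t in target for (src, a, t) in TRANS if src == s and a == action)
    )


phi = frozenset({"s1"})
print(sorted(diamond("a", phi)))  # states with an a-step into phi
print(sorted(box("a", phi)))      # states all of whose a-steps stay in phi
```

In this example `s0` satisfies \(\langle{a}\rangle\varphi\) but not \([a]\varphi\), because one of its `a`-successors lies outside \(\varphi\), whereas `s1` and `s2` satisfy \([a]\varphi\) vacuously since they have no `a`-transitions at all.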
Example 1
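Consider \(\varphi_{\mathtt{ab}}\,\varphi\) where \(\varphi_{\mathtt{ab}} = \mu X^{\bullet\rightarrow\bullet}.\lambda Y\mathbin{:}\bullet.Y\vee\langle{\mathtt{a}}\rangle(X(\langle{\mathtt{b}}\rangle Y))\). We can expand the formula as follows:
\[\begin{array}{rcl}
\varphi_{\mathtt{ab}}\,\varphi &=& (\lambda Y\mathbin{:}\bullet.Y\vee\langle{\mathtt{a}}\rangle(\varphi_{\mathtt{ab}}(\langle{\mathtt{b}}\rangle Y)))\,\varphi \;=\; \varphi\vee\langle{\mathtt{a}}\rangle(\varphi_{\mathtt{ab}}(\langle{\mathtt{b}}\rangle\varphi))\\
&=& \varphi\vee\langle{\mathtt{a}}\rangle(\langle{\mathtt{b}}\rangle\varphi\vee\langle{\mathtt{a}}\rangle(\varphi_{\mathtt{ab}}(\langle{\mathtt{b}}\rangle\langle{\mathtt{b}}\rangle\varphi))) \;=\; \cdots,
\end{array}\]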
and obtain \( \varphi \vee (\langle {\mathtt {a}}\rangle \langle {\mathtt {b}}\rangle \varphi ) \vee (\langle {\mathtt {a}}\rangle \langle {\mathtt {a}}\rangle \langle {\mathtt {b}}\rangle \langle {\mathtt {b}}\rangle \varphi ) \vee \cdots \). Thus, the formula means that there is a transition sequence of the form \(\mathtt {a}^n\mathtt {b}^n\) for some \(n\ge 0\) that leads to a state satisfying \(\varphi \).
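The higher-order fixpoint of Example 1 can also be computed directly on a finite model. The sketch below is a hedged illustration, not the algorithm of any cited tool: it represents an element of type \(\bullet\rightarrow\bullet\) as a Python dictionary from state sets to state sets and computes the least fixpoint by iterating from the bottom element \(\lambda Y.\emptyset\). The transition system and all function names are hypothetical.

```python
from itertools import combinations

# Hypothetical LTS: 0 --a--> 1 --b--> 2 and a c-loop on 2.
STATES = frozenset({0, 1, 2})
TRANS = {(0, "a", 1), (1, "b", 2), (2, "c", 2)}


def diamond(action, target):
    """States with an `action`-transition into `target`."""
    return frozenset(s for (s, a, t) in TRANS if a == action and t in target)


def all_subsets(universe):
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


SUBSETS = all_subsets(STATES)


def unfold(x):
    """One unfolding of the body: lambda Y. Y \/-ed with <a>(X(<b>Y)), with X := x."""
    return {y: y | diamond("a", x[diamond("b", y)]) for y in SUBSETS}


def lfp_phi_ab():
    """Least fixpoint at type prop -> prop, by iteration from lambda Y. emptyset."""
    x = {y: frozenset() for y in SUBSETS}
    while True:
        nxt = unfold(x)
        if nxt == x:
            return x
        x = nxt


phi_ab = lfp_phi_ab()
result = phi_ab[diamond("c", STATES)]  # phi_ab (<c>true)
print(sorted(result))
```

On this model the result contains state 2 (the case \(n=0\)) and state 0 (the path \(\mathtt{a}\mathtt{b}\), i.e. \(n=1\)), but not state 1, which has no \(\mathtt{a}\)-transition. The iteration terminates because the function space over a finite powerset is itself finite and each unfolding is monotone.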
Following [16], we exclude meaningless formulas such as \((\langle{a}\rangle\mathbf{true})+1\) by using a simple type system. The types \(\bullet\), \(\mathtt{int}\), and \(\sigma\rightarrow\tau\) describe propositions, integers, and (monotonic) functions from \(\sigma\) to \(\tau\), respectively. Note that the integer type \(\mathtt{int}\) may occur only in an argument position; this restriction is required to ensure that least and greatest fixpoints are well-defined. The typing rules for formulas are given in Fig. 2. In the figure, \(\varDelta\) denotes a type environment, which is a finite map from variables to (extended) types. Below we consider only well-typed formulas.
2.2 Semantics and HFL\(_{\mathbf {Z}}\) Model Checking
We now define the formal semantics of HFL\(_{\mathbf{Z}}\) formulas. A labeled transition system (LTS) is a quadruple \(\mathtt{L} = (U, A, \longrightarrow, \mathtt{s}_\mathtt{init})\), where \(U\) is a finite set of states, \(A\) is a finite set of actions, \({\longrightarrow} \subseteq U\times A\times U\) is a labeled transition relation, and \(\mathtt{s}_\mathtt{init}\in U\) is the initial state. We write \(\mathtt{s}_1{\mathop{\longrightarrow}\limits^{a}}\mathtt{s}_2\) when \((\mathtt{s}_1,a,\mathtt{s}_2)\in{\longrightarrow}\).
For an LTS \(\mathtt{L} = (U, A, \longrightarrow, \mathtt{s}_\mathtt{init})\) and an extended type \(\sigma\), we define the partially ordered set \((\mathcal{D}_{\mathtt{L},\sigma}, \sqsubseteq_{\mathtt{L},\sigma})\) inductively by:
Note that \((\mathcal{D}_{\mathtt{L},\tau}, \sqsubseteq_{\mathtt{L},\tau})\) forms a complete lattice (but \((\mathcal{D}_{\mathtt{L},\mathtt{int}},\sqsubseteq_{\mathtt{L},\mathtt{int}})\) does not). We write \(\bot_{\mathtt{L},\tau}\) and \(\top_{\mathtt{L},\tau}\) for the least and greatest elements of \(\mathcal{D}_{\mathtt{L},\tau}\) (which are \(\lambda\widetilde{x}.\emptyset\) and \(\lambda\widetilde{x}.U\)) respectively. We sometimes omit the subscript \(\mathtt{L}\) below. Let \(\llbracket\varDelta\rrbracket_{\mathtt{L}}\) be the set of functions (called valuations) that map \(X\) to an element of \(\mathcal{D}_{\mathtt{L},\sigma}\) for each \(X\mathbin{:}\sigma\in\varDelta\). For an HFL formula \(\varphi\) such that \(\varDelta\vdash_{\mathtt{H}}\varphi:\sigma\), we define \(\llbracket\varDelta\vdash_{\mathtt{H}}\varphi:\sigma\rrbracket_{\mathtt{L}}\) as a map from \(\llbracket\varDelta\rrbracket_{\mathtt{L}}\) to \(\mathcal{D}_{\mathtt{L},\sigma}\), by induction on the derivation of \(\varDelta\vdash_{\mathtt{H}}\varphi:\sigma\), as follows.
Here, \(\llbracket\mathbin{\mathtt{op}}\rrbracket\) denotes the binary function on integers represented by \(\mathbin{\mathtt{op}}\) and \(\llbracket p\rrbracket\) denotes the \(k\)-ary relation on integers represented by \(p\). The least/greatest fixpoint operators \(\mathbf{lfp}_{\mathtt{L},\tau}\) and \(\mathbf{gfp}_{\mathtt{L},\tau}\) are defined by \(\mathbf{lfp}_{\mathtt{L},\tau}(f) = \bigsqcap_{\mathtt{L},\tau}\{x\in\mathcal{D}_{\mathtt{L},\tau} \mid f(x)\sqsubseteq_{\mathtt{L},\tau} x\}\) and \(\mathbf{gfp}_{\mathtt{L},\tau}(f) = \bigsqcup_{\mathtt{L},\tau}\{x\in\mathcal{D}_{\mathtt{L},\tau} \mid x\sqsubseteq_{\mathtt{L},\tau} f(x)\}\). Here, \(\bigsqcup_{\mathtt{L},\tau}\) and \(\bigsqcap_{\mathtt{L},\tau}\) respectively denote the least upper bound and the greatest lower bound with respect to \(\sqsubseteq_{\mathtt{L},\tau}\). We often omit the subscript \(\mathtt{L}\) and write \(\llbracket\varDelta\vdash_{\mathtt{H}}\varphi:\sigma\rrbracket\) for \(\llbracket\varDelta\vdash_{\mathtt{H}}\varphi:\sigma\rrbracket_{\mathtt{L}}\). For a closed formula, i.e., a formula well-typed under the empty type environment \(\emptyset\), we often write \(\llbracket\varphi\rrbracket_{\mathtt{L}}\) or just \(\llbracket\varphi\rrbracket\) for \(\llbracket\emptyset\vdash_{\mathtt{H}}\varphi:\sigma\rrbracket_{\mathtt{L}}(\emptyset)\).
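The definitions of \(\mathbf{lfp}\) and \(\mathbf{gfp}\) above can be read off almost literally at type \(\bullet\), where the lattice is a powerset ordered by inclusion. The following Python sketch computes them by brute force as the meet of all pre-fixpoints and the join of all post-fixpoints. It is meant only to illustrate the definition on a tiny universe; the monotone function `f` is an arbitrary hypothetical example, and the enumeration is exponential in the size of the universe.

```python
from itertools import combinations


def powerset(universe):
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def lfp(f, universe):
    """Meet (intersection) of all x with f(x) included in x."""
    result = frozenset(universe)
    for x in powerset(universe):
        if f(x) <= x:
            result &= x
    return result


def gfp(f, universe):
    """Join (union) of all x with x included in f(x)."""
    result = frozenset()
    for x in powerset(universe):
        if x <= f(x):
            result |= x
    return result


U = frozenset({0, 1, 2, 3})


def f(x):
    """A hypothetical monotone function: always add 0, keep 3 if already present."""
    return frozenset({0}) | (x & frozenset({3}))


least, greatest = lfp(f, U), gfp(f, U)
print(sorted(least), sorted(greatest))
```

For this `f` the two fixpoints differ: the least one contains only 0, while the greatest one also retains 3, since nothing forces 3 out once it is assumed. This is the same distinction that separates \(\mu\)- and \(\nu\)-formulas.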
Example 2
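For the LTS \(\mathtt{L}_{file}\) in Fig. 1, we have:
\[ \llbracket\nu X^{\bullet}.(\langle{\mathtt{close}}\rangle\mathbf{true}\wedge\langle{\mathtt{read}}\rangle X)\rrbracket_{\mathtt{L}_{file}} = \{q_0\}. \]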
In fact, \(x=\{q_0\}\in \mathcal {D}_{\mathtt {L},\bullet }\) satisfies the equation: \(\llbracket X\mathbin {:}\bullet \vdash \langle {\mathtt {close}}\rangle \mathbf {true}\wedge \langle {\mathtt {read}}\rangle X:\bullet \rrbracket _{\mathtt {L}}(\{X\mapsto x\}) = x\), and \(x=\{q_0\}\in \mathcal {D}_{\mathtt {L},\bullet }\) is the greatest such element.
Consider the following LTS \(\mathtt {L}_1\):
and \(\varphi _{\mathtt {ab}}\,(\langle {c}\rangle \mathbf {true})\) where \(\varphi _{\mathtt {ab}}\) is the one introduced in Example 1. Then, \(\llbracket \varphi _{\mathtt {ab}}\,(\langle {c}\rangle \mathbf {true}) \rrbracket _{\mathtt {L}_1} = \{q_0,q_2\}\).
Definition 1
(HFL\(_{\mathbf{Z}}\) model checking). For a closed formula \(\varphi\) of type \(\bullet\), we write \(\mathtt{L}, \mathtt{s}\models\varphi\) if \(\mathtt{s}\in\llbracket\varphi\rrbracket_{\mathtt{L}}\), and write \(\mathtt{L}\models\varphi\) if \(\mathtt{s}_\mathtt{init}\in\llbracket\varphi\rrbracket_{\mathtt{L}}\). HFL\(_{\mathbf{Z}}\) model checking is the problem of, given \(\mathtt{L}\) and \(\varphi\), deciding whether \(\mathtt{L}\models\varphi\) holds.
The HFL\(_{\mathbf{Z}}\) model checking problem is undecidable, due to the presence of integers; in fact, the semantic domain \(\mathcal{D}_{\mathtt{L},\sigma}\) is not finite for \(\sigma\) that contains \(\mathtt{int}\). The undecidability is obtained as a corollary of the soundness and completeness of the reduction from the may-reachability problem to HFL model checking discussed in Sect. 5. For the fragment of pure HFL (i.e., HFL\(_{\mathbf{Z}}\) without integers, which we write HFL\(_{\emptyset}\) below), the model checking problem is decidable [42].
The order of an HFL\(_{\mathbf{Z}}\) model checking problem \(\mathtt{L}{\mathop{\models}\limits^{?}}\varphi\) is the highest order of types of subformulas of \(\varphi\), where the order of a type is defined by: \(\mathtt{order}(\bullet)=\mathtt{order}(\mathtt{int}) = 0\) and \(\mathtt{order}(\sigma\rightarrow\tau) = \max(\mathtt{order}(\sigma)+1,\mathtt{order}(\tau))\). The complexity of order-\(k\) HFL\(_{\emptyset}\) model checking is \(k\)-EXPTIME complete [1], but polynomial time in the size of HFL formulas under the assumption that the other parameters (the size of LTS and the largest size of types used in formulas) are fixed [19].
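The definition of \(\mathtt{order}\) is a straightforward structural recursion on types. A minimal Python sketch, with hypothetical class names `Base` and `Arrow` standing for the base types (\(\bullet\), \(\mathtt{int}\)) and function types, is:

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Base:
    name: str  # "prop" for the proposition type, "int" for integers


@dataclass(frozen=True)
class Arrow:
    arg: "Type"
    res: "Type"


Type = Union[Base, Arrow]

PROP = Base("prop")
INT = Base("int")


def order(t):
    """order(prop) = order(int) = 0; order(s -> t) = max(order(s) + 1, order(t))."""
    if isinstance(t, Base):
        return 0
    return max(order(t.arg) + 1, order(t.res))


print(order(Arrow(PROP, PROP)))              # type of phi_ab in Example 1
print(order(Arrow(Arrow(INT, PROP), PROP)))  # a predicate on integer predicates
```

Note that only nesting on the left of an arrow raises the order: a curried type such as \(\mathtt{int}\rightarrow\bullet\rightarrow\bullet\) still has order 1.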
Remark 1
Though we do not have quantifiers on integers as primitives, we can encode them using fixpoint operators. Given a formula \(\varphi:\mathtt{int}\rightarrow\bullet\), we can express \(\exists x\mathbin{:}\mathtt{int}.\varphi(x)\) and \(\forall x\mathbin{:}\mathtt{int}.\varphi(x)\) by \((\mu X^{\mathtt{int}\rightarrow\bullet}.\lambda x\mathbin{:}\mathtt{int}.\varphi(x)\vee X(x-1)\vee X(x+1))\,0\) and \((\nu X^{\mathtt{int}\rightarrow\bullet}.\lambda x\mathbin{:}\mathtt{int}.\varphi(x)\wedge X(x-1)\wedge X(x+1))\,0\) respectively.
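The encoding works because unfolding the fixpoint from \(0\) eventually visits every integer. The Python sketch below illustrates this with finite approximants; it is not a decision procedure (none exists in general), and the function names and the `depth` parameter are hypothetical. The \(k\)-th approximant of the \(\mu\)-formula is true exactly when a witness \(m\) with \(|m| < k\) exists, and dually the \(k\)-th approximant of the \(\nu\)-formula is false exactly when a counterexample with \(|m| < k\) exists.

```python
from functools import lru_cache


def exists_approx(phi, depth):
    """depth-th approximant of (mu X. lambda x. phi(x) or X(x-1) or X(x+1)) 0."""
    @lru_cache(maxsize=None)
    def x(k, n):
        if k == 0:
            return False  # bottom element: lambda x. false
        return phi(n) or x(k - 1, n - 1) or x(k - 1, n + 1)
    return x(depth, 0)


def forall_approx(phi, depth):
    """depth-th approximant of (nu X. lambda x. phi(x) and X(x-1) and X(x+1)) 0."""
    @lru_cache(maxsize=None)
    def x(k, n):
        if k == 0:
            return True  # top element: lambda x. true
        return phi(n) and x(k - 1, n - 1) and x(k - 1, n + 1)
    return x(depth, 0)


print(exists_approx(lambda n: n == -3, 3), exists_approx(lambda n: n == -3, 4))
print(forall_approx(lambda n: n != 5, 5), forall_approx(lambda n: n != 5, 6))
```

A `True` from `exists_approx` is conclusive, and so is a `False` from `forall_approx`; the other answers only say that no witness or counterexample was found within the explored range.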
2.3 HES
As in [19], we often write an HFL\(_{\mathbf {Z}}\) formula as a sequence of fixpoint equations, called a hierarchical equation system (HES).
Definition 2
An (extended) hierarchical equation system (HES) is a pair \((\mathcal {E},\varphi )\) where \(\mathcal {E}\) is a sequence of fixpoint equations, of the form: \( X_1^{\tau _1} =_{\alpha _1} \varphi _1; \cdots ; X_n^{\tau _n} =_{\alpha _n} \varphi _n \), where \(\alpha _i\in \{\mu ,\nu \}\). We assume that \(X_1\mathbin {:}\tau _1,\ldots ,X_n\mathbin {:}\tau _n \vdash _{\mathtt {H}}\varphi _i:\tau _i\) holds for each \(i\in \{1,\ldots ,n\}\), and that \(\varphi _1,\ldots ,\varphi _n,\varphi \) do not contain any fixpoint operators.
The HES \(\varPhi = (\mathcal {E}, \varphi )\) represents the HFL\(_{\mathbf {Z}}\) formula \( toHFL (\mathcal {E},\varphi )\) defined inductively by: \( toHFL (\epsilon , \varphi ) = \varphi \) and \( toHFL (\mathcal {E};X^\tau =_\alpha \varphi ', \varphi ) = toHFL ([\alpha X^\tau .\varphi '/X]\mathcal {E}, [\alpha X^\tau .\varphi '/X]\varphi )\). Conversely, every HFL\(_{\mathbf {Z}}\) formula can be easily converted to an equivalent HES. In the rest of the paper, we often represent an HFL\(_{\mathbf {Z}}\) formula in the form of HES, and just call it an HFL\(_{\mathbf {Z}}\) formula. We write \(\llbracket \varPhi \rrbracket \) for \(\llbracket toHFL (\varPhi ) \rrbracket \). An HES \( (X_1^{\tau _1} =_{\alpha _1} \varphi _1; \cdots ; X_n^{\tau _n} =_{\alpha _n} \varphi _n, \varphi )\) can be normalized to \( (X_0^{\tau _0} =_{\nu } \varphi ;X_1^{\tau _1} =_{\alpha _1} \varphi _1; \cdots ; X_n^{\tau _n} =_{\alpha _n} \varphi _n, X_0)\) where \(\tau _0\) is the type of \(\varphi \). Thus, we sometimes call just a sequence of equations \(X_0^{\tau _0} =_{\nu } \varphi ;X_1^{\tau _1} =_{\alpha _1} \varphi _1; \cdots ; X_n^{\tau _n} =_{\alpha _n} \varphi _n\) an HES, with the understanding that “the main formula” is the first variable \(X_0\). Also, we often write \(X^\tau \;x_1\,\cdots \, x_k =_\alpha \varphi \) for the equation \(X^\tau =_\alpha \lambda x_1.\cdots \lambda x_k.\varphi \). We often omit type annotations and just write \(X =_\alpha \varphi \) for \(X^\tau =_\alpha \varphi \).
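The inductive definition of \(toHFL\) can be prototyped in a few lines. The Python sketch below works on formulas represented as plain strings and performs the substitution \([\alpha X^\tau.\varphi'/X]\) textually, so it rests on two simplifying assumptions that are not part of the definition: variable names are single identifiers that are pairwise distinct (including \(\lambda\)-bound ones), and type annotations are omitted. The concrete string syntax (`mu`, `nu`, `<a>`, `|`) and the function names are hypothetical.

```python
import re


def substitute(text, var, term):
    """Replace every whole-word occurrence of `var` in `text` by `term`."""
    return re.sub(r"\b" + re.escape(var) + r"\b", lambda _m: term, text)


def to_hfl(equations, main):
    """equations: list of (variable, 'mu' or 'nu', body), in the order of the HES.

    Processes the last equation first, as in
    toHFL(E; X =_alpha phi', phi) = toHFL([alpha X.phi'/X]E, [alpha X.phi'/X]phi).
    """
    eqs = list(equations)
    while eqs:
        var, alpha, body = eqs.pop()
        term = f"({alpha} {var}.{body})"
        eqs = [(v, a, substitute(b, var, term)) for (v, a, b) in eqs]
        main = substitute(main, var, term)
    return main


hes = [("X", "nu", "Y"), ("Y", "mu", "<b>X | <a>Y")]
print(to_hfl(hes, "X"))
```

Textual substitution is safe here because, when the equation for a variable is eliminated, the remaining bodies contain binders only for variables that were eliminated earlier, so the variable being replaced is never shadowed. A full implementation would operate on abstract syntax trees instead of strings.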
Example 3
The formula \(\nu X.\mu Y.\langle{\mathtt{b}}\rangle X\vee\langle{\mathtt{a}}\rangle Y\) (which means that the current state has a transition sequence of the form \((\mathtt{a}^*\mathtt{b})^\omega\)) is expressed as the following HES:
\[ X =_{\nu} Y;\quad Y =_{\mu} \langle{\mathtt{b}}\rangle X\vee\langle{\mathtt{a}}\rangle Y. \]
3 Warming Up
To help readers get more familiar with HFL\(_{\mathbf{Z}}\) and the idea of reductions, we give here some variations of the examples of verification of file-accessing programs in Sect. 1, which are instances of the “resource usage verification problem” [15]. General reductions will be discussed in Sects. 5, 6 and 7, after the target language is set up in Sect. 4.
Consider the following OCaml-like program, which uses exceptions.
Here, * represents a non-deterministic boolean value. The function \(\texttt{readex}\) reads the file pointer \(x\), and then non-deterministically raises an end-of-file (Eof) exception. The main expression (on the third line) first opens file “foo”, calls f to read the file repeatedly, and closes the file upon an end-of-file exception. Suppose, as in the example of Sect. 1, we wish to verify that the file “foo” is accessed following the protocol in Fig. 1.
First, we can remove exceptions by representing an exception handler as a special continuation [6]:
Here, we have added to each function two parameters h and k, which represent an exception handler and a (normal) continuation respectively.
Let \(\varPhi \) be \((\mathcal {E},F\;\mathbf {true}\;(\lambda r.\langle {\mathtt {close}}\rangle \mathbf {true})\;(\lambda r.\mathbf {true}))\) where \(\mathcal {E}\) is:
Here, we have just replaced read/close operations with the modal operators \(\langle {\mathtt {read}}\rangle \) and \(\langle {\mathtt {close}}\rangle \), nondeterministic choice with a logical conjunction, and the unit value \((\,)\) with \(\mathbf {true}\). Then, \(\mathtt {L}_{ file }\models \varPhi \) if and only if the program performs only valid accesses to the file (e.g., it does not access the file after a close operation), where \(\mathtt {L}_{ file }\) is the LTS shown in Fig. 1. The correctness of the reduction can be informally understood by observing that there is a close correspondence between reductions of the program and those of the HFL formula above, and when the program reaches a read command \(\mathtt {read}\;x\), the corresponding formula is of the form \(\langle {\mathtt {read}}\rangle \cdots \), meaning that the read operation is valid in the current state; a similar condition holds also for close operations. We will present a general translation and prove its correctness in Sect. 6.
Let us consider another example, which uses integers:
Here, \(\mathtt {n}\) is an integer constant. The function \(\mathtt {f}\) reads \(\mathtt {x}\) \(\mathtt {y}\) times, and then calls the continuation \(\mathtt {k}\). Let \(\mathtt {L}'_{ file }\) be the LTS obtained by adding to \(\mathtt {L}_{ file }\) a new state \(q_2\) and the transition \(q_1{\mathop {\mathbin {\longrightarrow }}\limits ^{\mathtt {end}}}q_2\) (which intuitively means that a program is allowed to terminate in the state \(q_1\)), and let \(\varPhi '\) be \((\mathcal {E}',F\;n\; \mathbf {true}\; (\lambda r.\langle {\mathtt {end}}\rangle \mathbf {true}))\) where \(\mathcal {E}'\) is:
Here, \(p(\varphi _1,\ldots ,\varphi _k)\Rightarrow \varphi \) is an abbreviation of \(\lnot p(\varphi _1,\ldots ,\varphi _k)\vee \varphi \). Then, \(\mathtt {L}'_{ file } \models \varPhi '\) if and only if (i) the program performs only valid accesses to the file, (ii) it eventually terminates, and (iii) the file is closed when the program terminates. Notice the use of \(\mu \) instead of \(\nu \) above; by using \(\mu \), we can express liveness properties. The property \(\mathtt {L}'_{ file } \models \varPhi '\) indeed holds for \(n\ge 0\), but not for \(n<0\). In fact, \(F\;n\;x\;k\) is equivalent to \(\mathbf {false}\) for \(n<0\), and \(\langle {\mathtt {read}}\rangle ^n\langle {\mathtt {close}}\rangle (k\;\mathbf {true})\) for \(n\ge 0\).
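The claim that \(F\;n\;x\;k\) collapses to \(\mathbf{false}\) for \(n<0\) can be observed through the finite approximants of the least fixpoint. The Python sketch below is an illustration under explicit assumptions: the equation for \(F\) is taken to be \(F\;y\;x\;k =_\mu (y=0\Rightarrow\langle{\mathtt{close}}\rangle(k\;\mathbf{true}))\wedge(y\ne 0\Rightarrow\langle{\mathtt{read}}\rangle(F\;(y-1)\;x\;k))\), which is a hypothetical form chosen to agree with the closed form stated above, and the `read` and `close` transitions of \(\mathtt{L}'_{file}\) are assumed, since Fig. 1 is not reproduced here. The unused file argument \(x\) is dropped.

```python
# Assumed LTS L'_file: the read/close transitions are an assumption;
# the "end" transition from q1 to q2 is the one described in the text.
STATES = frozenset({"q0", "q1", "q2"})
TRANS = {("q0", "read", "q0"), ("q0", "close", "q1"), ("q1", "end", "q2")}


def diamond(action, target):
    """States with an `action`-transition into `target`."""
    return frozenset(s for (s, a, t) in TRANS if a == action and t in target)


def approx_F(depth, y, k_true):
    """depth-th approximant of the assumed mu-equation for F at argument y.

    k_true is the denotation of (k true), a set of states.
    """
    if depth == 0:
        return frozenset()               # bottom element: false
    if y == 0:
        return diamond("close", k_true)  # y = 0 branch: <close>(k true)
    return diamond("read", approx_F(depth - 1, y - 1, k_true))  # y != 0 branch


def holds(n, depth):
    """Does q0 satisfy the depth-th approximant of F n true (lambda r. <end>true)?"""
    k_true = diamond("end", STATES)
    return "q0" in approx_F(depth, n, k_true)


print(holds(3, 10), holds(3, 3), holds(-1, 50))
```

For \(n\ge 0\) the approximants stabilize once `depth` exceeds \(n\), whereas for \(n<0\) the counter never reaches zero, so every approximant is empty and the least fixpoint is \(\mathbf{false}\). With \(\nu\) in place of \(\mu\) the iteration would start from the full state set, and the non-terminating case would no longer be rejected.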
4 Target Language
This section sets up, as the target of program verification, a call-by-name higher-order functional language extended with events. The language is essentially the same as the one used by Watanabe et al. [43] for discussing fair non-termination.
4.1 Syntax and Typing
We assume a finite set \(\mathbf{Ev}\) of names called events, ranged over by \(a\), and a denumerable set of variables, ranged over by \(x,y,\ldots\). Events are used to express temporal properties of programs. We write \(\widetilde{x}\) (\(\widetilde{t}\), resp.) for a sequence of variables (terms, resp.), and write \(|\widetilde{x}|\) for the length of the sequence.
A program is a pair \((D, t)\) consisting of a set \(D\) of function definitions \( \{f_1\;\widetilde{x}_1 = t_1,\ldots ,f_n\;\widetilde{x}_n=t_n\}\) and a term \(t\). The set of terms, ranged over by \(t\), is defined by:
Here, \(n\) and \(p\) range over the sets of integers and integer predicates as in HFL formulas. The expression \(\mathbf{event}\ a; t\) raises an event \(a\), and then evaluates \(t\). Events are used to encode program properties of interest. For example, an assertion \(\mathbf{assert}(b)\) can be expressed as \(\mathbf{if}\ b\ \mathbf{then}\ (\,)\ \mathbf{else}\ {(\mathbf{event}\ \mathtt{fail}; \varOmega)}\), where \(\mathtt{fail}\) is an event that expresses an assertion failure and \(\varOmega\) is a non-terminating term. If program termination is of interest, one can insert “\(\mathbf{event}\ \mathtt{end}\)” at every termination point and check whether an \(\mathtt{end}\) event occurs. The expression \(t_1\Box t_2\) evaluates \(t_1\) or \(t_2\) in a non-deterministic manner; it can be used to model, e.g., unknown inputs from an environment. We use the meta-variable \(P\) for programs. When \(P= (D,t)\) with \(D= \{f_1\;\widetilde{x}_1 = t_1,\ldots,f_n\;\widetilde{x}_n=t_n\}\), we write \(\mathbf{funs}(P)\) for \(\{f_1,\ldots,f_n\}\) (i.e., the set of function names defined in \(P\)). Using \(\lambda\)-abstractions, we sometimes write \(f=\lambda\widetilde{x}.t\) for the function definition \(f\;\widetilde{x}=t\). We also regard \(D\) as a map from function names to terms, and write \(dom(D)\) for \(\{f_1,\ldots,f_n\}\) and \(D(f_i)\) for \(\lambda\widetilde{x}_i.t_i\).
Any program \((D,t)\) can be normalized to \((D\cup \{\mathbf {main}=t\},\mathbf {main})\) where \(\mathbf {main}\) is a name for the “main” function. We sometimes write just \(D\) for a program \((D,\mathbf {main})\), with the understanding that \(D\) contains a definition of \(\mathbf {main}\).
We restrict the syntax of expressions using a type system. The set of simple types, ranged over by \(\kappa \), is defined by:
The types \(\star\), \(\mathtt{int}\), and \(\eta\rightarrow\kappa\) describe the unit value, integers, and functions from \(\eta\) to \(\kappa\) respectively. Note that \(\mathtt{int}\) is allowed to occur only in argument positions. We defer typing rules to [23], as they are standard, except that we require that the right-hand side of each function definition must have type \(\star\); this restriction, as well as the restriction that \(\mathtt{int}\) occurs only in argument positions, does not lose generality, as those conditions can be ensured by applying CPS transformation. We consider below only well-typed programs.
4.2 Operational Semantics
We define the labeled transition relation \(t{\mathop{\longrightarrow}\limits^{\ell}}_{D}t'\), where \(\ell\) is either \(\epsilon\) or an event name, as the least relation closed under the rules in Fig. 3. We implicitly assume that the program \((D,t)\) is well-typed, and this assumption is maintained throughout reductions by the standard type preservation property. In the rules for if-expressions, \(\llbracket t'_i\rrbracket\) represents the integer value denoted by \(t'_i\); note that the well-typedness of \((D,t)\) guarantees that \(t'_i\) must be arithmetic expressions consisting of integers and integer operations; thus, \(\llbracket t'_i\rrbracket\) is well defined. We often omit the subscript \(D\) when it is clear from the context. We write \(t\mathbin{{\mathop{\longrightarrow}\limits^{\ell_1\cdots\ell_k}}{}^{*}_{D}}t'\) if \(t{\mathop{\longrightarrow}\limits^{\ell_1}}_{D}\cdots{\mathop{\longrightarrow}\limits^{\ell_k}}_{D}t'\). Here, \(\epsilon\) is treated as an empty sequence; thus, for example, we write \(t\mathbin{{\mathop{\longrightarrow}\limits^{ab}}{}^{*}_{D}}t'\) if \(t{\mathop{\longrightarrow}\limits^{a}}_{D}{\mathop{\longrightarrow}\limits^{\epsilon}}_{D}{\mathop{\longrightarrow}\limits^{b}}_{D}{\mathop{\longrightarrow}\limits^{\epsilon}}_{D}t'\).
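A labeled transition relation of this kind is easy to prototype for a fragment of the language. The Python sketch below handles only events, non-deterministic choice, the unit value and calls to argument-free functions. The rules it implements (an event step is labeled by the event, while choice and function unfolding are \(\epsilon\)-steps) are assumptions about Fig. 3, which is not reproduced here, and the tuple encoding of terms is hypothetical. The example program is the one modeled in Example 4, with the unused parameter of \(f\) dropped.

```python
def step(defs, term):
    """One-step transitions of `term` as a list of (label, successor); None stands for epsilon."""
    kind = term[0]
    if kind == "event":    # event a; t  --a-->  t
        return [(term[1], term[2])]
    if kind == "choice":   # t1 [] t2  --eps-->  t1 or t2
        return [(None, term[1]), (None, term[2])]
    if kind == "call":     # f  --eps-->  body of f
        return [(None, defs[term[1]])]
    return []              # the unit value has no transitions


def traces_up_to(defs, term, max_steps):
    """Event sequences of all configurations reachable in at most max_steps steps."""
    seen = set()
    frontier = [((), term)]
    for _ in range(max_steps + 1):
        nxt = []
        for trace, t in frontier:
            seen.add(trace)
            for label, succ in step(defs, t):
                nxt.append((trace if label is None else trace + (label,), succ))
        frontier = nxt
    return seen


UNIT = ("unit",)
# f = (event close; ()) [] (event read; event read; f)
DEFS = {"f": ("choice",
              ("event", "close", UNIT),
              ("event", "read", ("event", "read", ("call", "f"))))}

result = traces_up_to(DEFS, ("call", "f"), 7)
print(sorted(result))
```

Because \(\epsilon\)-labels are dropped when a trace is extended, the recorded sequences range over events only, in line with the convention that \(\epsilon\) is treated as an empty sequence. The bound `max_steps` is needed because the program also has an infinite run that reads forever.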
For a program \(P=(D,t_0)\), we define the set \(\mathbf {Traces}(P) (\subseteq \mathbf {Ev}^*\cup \mathbf {Ev}^\omega )\) of traces by:
Note that since the label \(\epsilon \) is regarded as an empty sequence, \(\ell _0\ell _1\ell _2 = aa\) if \(\ell _0=\ell _2=a\) and \(\ell _1=\epsilon \), and an element of \((\{\epsilon \}\cup \mathbf {Ev})^\omega \) is regarded as that of \(\mathbf {Ev}^*\cup \mathbf {Ev}^\omega \). We write \(\mathbf {FinTraces}(P)\) and \(\mathbf {InfTraces}(P)\) for \(\mathbf {Traces}(P)\cap \mathbf {Ev}^*\) and \(\mathbf {Traces}(P)\cap \mathbf {Ev}^\omega \) respectively. The set of full traces \(\mathbf {FullTraces}(D,t_0)(\subseteq \mathbf {Ev}^*\cup \mathbf {Ev}^\omega )\) is defined as:
Example 4
The last example in Sect. 1 is modeled as \(P_{ file } = (D, f\,(\,))\), where \(D = \{f\,x = (\mathbf {event}\ \mathtt {close}; (\,))\Box (\mathbf {event}\ \mathtt {read}; \mathbf {event}\ \mathtt {read}; f\,x)\}\). We have:
5 May/Must-Reachability Verification
Here we consider the following problems:

May-reachability: “Given a program \(P\) and an event \(a\), may \(P\) raise \(a\)?”

Must-reachability: “Given a program \(P\) and an event \(a\), must \(P\) raise \(a\)?”
Since we are interested in a particular event \(a\), we restrict here the event set \(\mathbf {Ev}\) to a singleton set of the form \(\{a\}\). Then, may-reachability is formalized as \(a\,{\mathop {\in }\limits ^{?}}\,\mathbf {Traces}(P)\), whereas must-reachability is formalized as “does every trace in \(\mathbf {FullTraces}(P)\) contain \(a\)?” We encode both problems into the validity of HFL\(_{\mathbf {Z}}\) formulas (without any modal operators \(\langle {a}\rangle \) or \([a]\)), or the HFL\(_{\mathbf {Z}}\) model checking of those formulas against a trivial model (which consists of a single state without any transitions). Since our reductions are sound and complete, the characterizations of their negations (non-reachability and may-non-reachability) can also be obtained immediately. Although these are the simplest classes of properties among those discussed in Sects. 5, 6 and 7, they are already large enough to accommodate many program properties discussed in the literature, including lack of assertion failures/uncaught exceptions [22] (which can be characterized as non-reachability; recall the encoding of assertions in Sect. 4), termination [27, 29] (characterized as must-reachability), and non-termination [26] (characterized as may-non-reachability).
5.1 May-Reachability
As in the examples in Sect. 3, we translate a program to a formula that says “the program may raise an event \(a\)” in a compositional manner. For example, \(\mathbf {event}\ a; t\) can be translated to \(\mathbf {true}\) (since the event will surely be raised immediately), and \(t_1\Box t_2\) can be translated to \(t_1^\dagger \vee t_2^\dagger \) where \(t_i^\dagger \) is the result of the translation of \(t_i\) (since only one of \(t_1\) and \(t_2\) needs to raise an event).
Definition 3
Let \(P=(D,t)\) be a program. \(\varPhi _{P,\textit{may}}\) is the HES \(({D}^{\dagger _{\textit{may}}}, {t}^{\dagger _{\textit{may}}})\), where \({D}^{\dagger _{\textit{may}}}\) and \({t}^{\dagger _{\textit{may}}}\) are defined by:
Note that, in the definition of \({D}^{\dagger _{\textit{may}}}\), the order of function definitions in \(D\) does not matter (i.e., the resulting HES is unique up to the semantic equality), since all the fixpoint variables are bound by \(\mu \).
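To make the compositional reading concrete, the following Python sketch evaluates the may-reachability translation for a hypothetical argument-free fragment, where every formula is just a Boolean. The clauses for events and for choice follow the informal description above; the clause mapping the unit value to false, the tuple encoding of terms, and the names `may_body` and `may_reach` are assumptions made for this illustration. Since all equations are bound by \(\mu \), the solution is computed by iterating from false until nothing changes.

```python
def may_body(term, env):
    """Truth value of the translated term under the valuation env."""
    kind = term[0]
    if kind == "unit":
        return False  # assumed clause: () raises no event
    if kind == "event":
        return True   # event a; t raises the event immediately
    if kind == "choice":
        # only one of the two branches needs to raise the event
        return may_body(term[1], env) or may_body(term[2], env)
    if kind == "call":
        return env[term[1]]
    raise ValueError("unknown term: %r" % (term,))

def may_reach(defs, main):
    """Least fixpoint of the equations, then the value of the main term."""
    env = {f: False for f in defs}  # start from false for mu-bound variables
    while True:
        new_env = {f: may_body(body, env) for f, body in defs.items()}
        if new_env == env:
            return may_body(main, env)
        env = new_env

UNIT = ("unit",)
LOOP = {"loop": ("call", "loop")}                                   # loop = loop
RETRY = {"f": ("choice", ("event", "a", UNIT), ("call", "f"))}      # f = (event a; ()) [] f
```

In this sketch `may_reach(LOOP, ("call", "loop"))` is false and `may_reach(RETRY, ("call", "f"))` is true. The iteration terminates because there are finitely many Boolean variables and each step is monotone; this does not carry over to programs with integer or function arguments.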
Example 5
Consider the program:
It is translated to the HES \(\varPhi _{ loop } = ( loop \;x=_\mu loop \;x, loop (\mathbf {true}))\). Since \( loop \equiv \mu loop .\lambda x. loop \;x\) is equivalent to \(\lambda x.\mathbf {false}\), \(\varPhi _{ loop }\) is equivalent to \(\mathbf {false}\). In fact, \(P_{ loop }\) never raises an event \(a\) (recall that our language is call-by-name).
Example 6
Consider the program \(P_{ sum }=(D_{ sum },\mathbf {main})\) where \(D_{ sum }\) is:
Here, \(n\) is some integer constant, and \(\mathbf {assert}(b)\) is the macro introduced in Sect. 4. We have used \(\lambda \)-abstractions for the sake of readability. The function \( sum \) is a CPS version of a function that computes the summation of integers from \(1\) to \(x\). The main function computes the sum \(r=1+\cdots + n\), and asserts \(r\ge n\). It is translated to the HES \(\varPhi _{P_{ sum },\textit{may}} = (\mathcal {E}_{ sum },\mathbf {main})\) where \(\mathcal {E}_{ sum }\) is:
Here, \(n\) is treated as a constant. Since the shape of the formula does not depend on the value of \(n\), the property “an assertion failure may occur for some \(n\)” can be expressed by \(\exists n.\varPhi _{P_{ sum },\textit{may}}\). \(\square \)
The following theorem states that \(\varPhi _{P,\textit{may}}\) is a complete characterization of the may-reachability of \(P\).
Theorem 1
Let \(P\) be a program. Then, \(a\in \mathbf {Traces}(P)\) if and only if \(\mathtt {L}_0 \models \varPhi _{P,\textit{may}}\) for \(\mathtt {L}_0 = (\{\mathtt {s}_\star \},\emptyset ,\emptyset ,\mathtt {s}_\star )\).
A proof of the theorem above is found in [23]. We only provide an outline. We first show the theorem for recursion-free programs and then lift it to arbitrary programs by using the continuity of functions represented in the fixpoint-free fragment of HFL\(_{\mathbf {Z}}\) formulas. To show the theorem for recursion-free programs, we define the reduction relation \(t\longrightarrow _{D}t'\) by:
Here, \(E\) ranges over the set of evaluation contexts given by \(E{:}{:}= [\,]\mid E\Box t\mid t\Box E\mid \mathbf {event}\ a; E\). The reduction relation differs from the labeled transition relation given in Sect. 4, in that \(\Box \) and \(\mathbf {event}\ a; \cdots \) are not eliminated. By the definition of the translation, the theorem holds for programs in normal form (with respect to the reduction relation), and the semantics of translated HFL formulas is preserved by the reduction relation; thus the theorem holds for recursion-free programs, as they are strongly normalizing.
5.2 Must-Reachability
The characterization of must-reachability can be obtained by an easy modification of the characterization of may-reachability: we just need to replace branches with logical conjunction.
Definition 4
Let \(P=(D,t)\) be a program. \(\varPhi _{P,\textit{must}}\) is the HES \(({D}^{\dagger _{\textit{must}}}, {t}^{\dagger _{\textit{must}}})\), where \({D}^{\dagger _{\textit{must}}}\) and \({t}^{\dagger _{\textit{must}}}\) are defined by:
Here, \(p(\varphi _1,\ldots ,\varphi _k)\Rightarrow \varphi \) is a shorthand for \(\lnot p(\varphi _1,\ldots ,\varphi _k)\vee \varphi \).
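As an illustration of how conjunction and the implication shorthand interact with a \(\mu \)-bound equation, the Python sketch below evaluates finite approximants of a must-reachability formula. The program shape is hypothetical (it is not one of the examples in this paper): a function on one integer that raises the event when \(x\le 0\) and otherwise chooses nondeterministically between two recursive calls, whose arguments are given by the assumed parameters `left` and `right`. We assume the conditional is translated as a conjunction of two implications, which is our reading of the shorthand above.

```python
def implies(p, phi):
    """p => phi, i.e. (not p) or phi; phi is a thunk so it is evaluated lazily."""
    return (not p) or phi()

def must_approx(k, x, left, right):
    """k-th approximant of the least fixpoint of
         f x = (x <= 0 => true) and (x > 0 => f(left x) and f(right x)).
    The 0-th approximant is false; larger k allow deeper unfoldings."""
    if k == 0:
        return False
    return (implies(x <= 0, lambda: True)
            and implies(x > 0,
                        lambda: must_approx(k - 1, left(x), left, right)
                        and must_approx(k - 1, right(x), left, right)))

# Both branches decrease x, so the event is eventually raised on every path.
DECREASING = must_approx(4, 3, lambda x: x - 1, lambda x: x - 2)
# The left branch keeps x unchanged, so one path never raises the event.
STUCK = must_approx(50, 1, lambda x: x, lambda x: x - 1)
```

If some approximant is true then so is the least fixpoint, so `DECREASING` being true witnesses must-reachability for the first instance. A false approximant such as `STUCK` is inconclusive on its own; here the formula is false for every \(k\) because the left branch never changes \(x\).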
Example 7
Consider \(P_{\mathtt {loop}} = (D, \mathtt {loop}\,m\,n)\) where \(D\) is:
Here, the event \(\mathtt {end}\) is used to signal the termination of the program. The function \(\mathtt {loop}\) nondeterministically updates the values of \(x\) and \(y\) until either \(x\) or \(y\) becomes non-positive. The must-termination of the program is characterized by \(\varPhi _{P_{\mathtt {loop}},\textit{must}} = (\mathcal {E}, \mathtt {loop}\,m\,n)\) where \(\mathcal {E}\) is:
We write \(\mathbf{Must }_a(P)\) if every \(\pi \in \mathbf {FullTraces}(P)\) contains \(a\). The following theorem, which can be proved in a manner similar to Theorem 1, guarantees that \(\varPhi _{P,\textit{must}}\) is indeed a sound and complete characterization of must-reachability.
Theorem 2
Let \(P\) be a program. Then, \(\mathbf{Must }_a(P)\) if and only if \(\mathtt {L}_0 \models \varPhi _{P,\textit{must}}\) for \(\mathtt {L}_0 = (\{\mathtt {s}_\star \},\emptyset ,\emptyset ,\mathtt {s}_\star )\).
6 Trace Properties
Here we consider the verification problem: “Given a (non-\(\omega \)) regular language \(L\) and a program \(P\), does every finite event sequence of \(P\) belong to \(L\)? (i.e. \(\mathbf {FinTraces}(P){\mathop {\subseteq }\limits ^{?}} L\))” and reduce it to an HFL\(_{\mathbf {Z}}\) model checking problem. The verification of file-accessing programs considered in Sect. 3 may be considered an instance of the problem.
Here we assume that the language \(L\) is closed under the prefix operation; this does not lose generality because \(\mathbf {FinTraces}(P)\) is also closed under the prefix operation. We write \(A_L= (Q,\varSigma ,\delta ,q_0,F)\) for the minimal, deterministic automaton with no dead states (hence the transition function \(\delta \) may be partial). Since \(L\) is prefix-closed and the automaton is minimal, \(w\in L\) if and only if \(\hat{\delta }(q_0,w)\) is defined (where \(\hat{\delta }\) is defined by: \(\hat{\delta }(q,\epsilon )=q\) and \(\hat{\delta }(q,aw) = \hat{\delta }(\delta (q,a),w)\)). We use the corresponding LTS \(\mathtt {L}_L = (Q, \varSigma , \{(q,a,q') \mid \delta (q,a)=q'\}, q_0)\) as the model of the reduced HFL\(_{\mathbf {Z}}\) model checking problem.
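The membership test via the partial function \(\hat{\delta }\) can be sketched in a few lines of Python. As an illustrative instance we take \(L\) to be the prefix closure of \(\mathtt {read}^*\cdot \mathtt {close}\cdot \mathtt {end}\); the state names `q0`, `q1`, `q2` and the dictionary encoding of \(\delta \) are assumptions of this sketch. A missing dictionary entry models the undefinedness of \(\delta \).

```python
def delta_hat(delta, q, word):
    """Extended transition function; returns None where it is undefined."""
    for a in word:
        if (q, a) not in delta:
            return None  # delta(q, a) is undefined, hence so is delta_hat
        q = delta[(q, a)]
    return q

# Partial transition function of a three-state automaton without dead states.
DELTA = {
    ("q0", "read"): "q0",
    ("q0", "close"): "q1",
    ("q1", "end"): "q2",
}

def in_language(word):
    """w is in L iff delta_hat(q0, w) is defined (L is prefix-closed)."""
    return delta_hat(DELTA, "q0", word) is not None
```

Viewed on the corresponding LTS, `in_language(["read", "close"])` asks whether a \(\mathtt {read}\)-transition followed by a \(\mathtt {close}\)-transition is possible from the initial state.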
Given the LTS \(\mathtt {L}_L\) above, whether an event sequence \(a_1\cdots a_k\) belongs to \(L\) can be expressed as \(\mathtt {L}_L {\mathop {\models }\limits ^{?}} \langle {a_1}\rangle \cdots \langle {a_k}\rangle \mathbf {true}\). Whether all the event sequences in \(\{a_{j,1}\cdots a_{j,k_j}\mid j\in \{1,\ldots ,n\}\}\) belong to \(L\) can be expressed as \(\mathtt {L}_L {\mathop {\models }\limits ^{?}} \bigwedge _{j\in \{1,\ldots ,n\}} \langle {a_{j,1}}\rangle \cdots \langle {a_{j,k_j}}\rangle \mathbf {true}\). We can lift these translations for event sequences to the translation from a program (which can be considered a description of a set of event sequences) to an HFL\(_{\mathbf {Z}}\) formula, as follows.
Definition 5
Let \(P=(D,t)\) be a program. \(\varPhi _{P,\textit{path}}\) is the HES \(({D}^{\dagger _{\textit{path}}}, {t}^{\dagger _{\textit{path}}})\), where \({D}^{\dagger _{\textit{path}}}\) and \({t}^{\dagger _{\textit{path}}}\) are defined by:
Example 8
The last program discussed in Sect. 3 is modeled as \(P_2 = (D_2,f\; m\; g)\), where \(m\) is an integer constant and \(D_2\) consists of:
Here, we have modeled accesses to the file, and termination as events. Then, \(\varPhi _{P_2,\textit{path}} = (\mathcal {E}_{P_2,\textit{path}}, f\;m\;g)\) where \(\mathcal {E}_{P_2,\textit{path}}\) is:^{Footnote 6}
Let \(L\) be the prefix closure of \(\mathtt {read}^*\cdot \mathtt {close}\cdot \mathtt {end}\). Then \(\mathtt {L}_L\) is \(\mathtt {L}'_{ file }\) in Sect. 3, and \(\mathbf {FinTraces}(P_2){\subseteq } L\) can be verified by checking \(\mathtt {L}_L{\models } \varPhi _{P_2,\textit{path}}\). \(\square \)
Theorem 3
Let \(P\) be a program and \(L\) be a regular, prefix-closed language. Then, \(\mathbf {FinTraces}(P)\subseteq L\) if and only if \(\mathtt {L}_L\models \varPhi _{P,\textit{path}}\).
As in Sect. 5, we first prove the theorem for programs in normal form, and then lift it to recursion-free programs by using the preservation of the semantics of HFL\(_{\mathbf {Z}}\) formulas by reductions, and further to arbitrary programs by using the (co)continuity of the functions represented by fixpoint-free HFL\(_{\mathbf {Z}}\) formulas. See [23] for a concrete proof.
7 Linear-Time Temporal Properties
This section considers the following problem: “Given a program \(P\) and an \(\omega \)-regular word language \(L\), does \(\mathbf {InfTraces}(P){\cap } L = \emptyset \) hold\(?\)”. From the viewpoint of program verification, \(L\) represents the set of “bad” behaviors. This can be considered an extension of the problems considered in the previous sections.
The reduction to HFL model checking is more involved than those in the previous sections. To see the difficulty, consider the program \(P_0\):
where \(c\) is some boolean expression. Let \(L\) be the complement of \((\mathtt {a}^*\mathtt {b})^\omega \), i.e., the set of infinite sequences that contain only finitely many \(\mathtt {b}\)’s. Following Sect. 6 (and noting that \(\mathbf {InfTraces}(P){\cap } L = \emptyset \) is equivalent to \(\mathbf {InfTraces}(P) \subseteq (\mathtt {a}^*\mathtt {b})^\omega \) in this case), one may be tempted to prepare an LTS like the one in Fig. 4 (which corresponds to the transition function of a (parity) word automaton accepting \((\mathtt {a}^*\mathtt {b})^\omega \)), and translate the program to an HES \(\varPhi _{P_0}\) of the form:
where \(\alpha \) is \(\mu \) or \(\nu \). However, such a translation would not work. If \(c=\mathbf {true}\), then \(\mathbf {InfTraces}(P_0)=\mathtt {a}^\omega \), hence \(\mathbf {InfTraces}(P_0)\cap L\ne \emptyset \); thus, \(\alpha \) should be \(\mu \) for \(\varPhi _{P_0}\) to be unsatisfied. If \(c=\mathbf {false}\), however, \(\mathbf {InfTraces}(P_0)=\mathtt {b}^\omega \), hence \(\mathbf {InfTraces}(P_0)\cap L=\emptyset \); thus, \(\alpha \) must be \(\nu \) for \(\varPhi _{P_0}\) to be satisfied.
The example above suggests that we actually need to distinguish between the two occurrences of \(f\) in the body of \(f\)’s definition. Note that in the then- and else-clauses respectively, \(f\) is called after different events \(\mathtt {a}\) and \(\mathtt {b}\). This difference is important, since we are interested in whether \(\mathtt {b}\) occurs infinitely often. We thus duplicate \(f\), and replace the program with the following program \(P_{ dup }\):
For checking \(\mathbf {InfTraces}(P_0)\cap L = \emptyset \), it is now sufficient to check that \(f_b\) is recursively called infinitely often. We can thus obtain the following HES:
Note that \(f_b\) and \(f_a\) are bound by \(\nu \) and \(\mu \) respectively, reflecting the fact that \(\mathtt {b}\) should occur infinitely often, but \(\mathtt {a}\) need not. If \(c=\mathbf {true}\), the formula is equivalent to \(\nu f_b.\langle {\mathtt {a}}\rangle \mu f_a.\langle {\mathtt {a}}\rangle f_a\), which is false. If \(c=\mathbf {false}\), then the formula is equivalent to \(\nu f_b.\langle {\mathtt {b}}\rangle f_b\), which is satisfied by the LTS in Fig. 4.
The general translation is more involved due to the presence of higher-order functions, but, as in the example above, the overall translation consists of two steps. We first replicate functions according to what events may occur between two recursive calls, and reduce the problem \(\mathbf {InfTraces}(P)\cap L{\mathop {=}\limits ^{?}} \emptyset \) to a problem of analyzing which functions are recursively called infinitely often, which we call a call-sequence analysis. We can then reduce the call-sequence analysis to HFL model checking in a rather straightforward manner (though the proof of the correctness is non-trivial). The resulting HFL formula actually does not contain modal operators.^{Footnote 7} So, as in Sect. 5, the resulting problem is the validity checking of HFL formulas without modal operators.
In the rest of this section, we first introduce the call-sequence analysis problem and its reduction to HFL model checking in Sect. 7.1. We then show how to reduce the temporal verification problem \(\mathbf {InfTraces}(P)\cap L{\mathop {=}\limits ^{?}} \emptyset \) to an instance of the call-sequence analysis problem in Sect. 7.2.
7.1 Call-Sequence Analysis
We define the call-sequence analysis and reduce it to an HFL model-checking problem. As mentioned above, in the call-sequence analysis, we are interested in analyzing which functions are recursively called infinitely often. Here, we say that \(g\) is recursively called from \(f\), if \(f\,\widetilde{s}{\mathop {\longrightarrow }\limits ^{\epsilon }}_{D} [\widetilde{s}/\widetilde{x}]t_f\mathbin {{\mathop {\longrightarrow }\limits ^{\widetilde{\ell }}}\!\!\! ~{\,}^{*}_{D}} g\,\widetilde{t}\), where \(f\,\widetilde{x}=t_f\in D\) and \(g\) “originates from” \(t_f\) (a more formal definition will be given in Definition 6 below). For example, consider the following program \(P_{ app }\), which is a twisted version of \(P_{ dup }\) above.
Then \(f_a\) is “recursively called” from \(f_b\) in \(f_b\,5 \mathbin {{\mathop {\longrightarrow }\limits ^{\mathtt {a}}}\!\!\! ~{\,}^{*}_{D}} \mathtt {app}\,f_a\,4\mathbin {{\mathop {\longrightarrow }\limits ^{\epsilon }}\!\!\! ~{\,}^{*}_{D}} f_a\,4\) (and so is \(\mathtt {app}\)). We are interested in infinite chains of recursive calls \(f_0f_1f_2\cdots \), and which functions may occur infinitely often in each chain. For instance, the program above has the unique infinite chain \((f_b f_a^5)^\omega \), in which both \(f_a\) and \(f_b\) occur infinitely often. (Besides the infinite chain, the program has finite chains like \(f_b\,\mathtt {app}\); note that the chain cannot be extended further, as the body of \(\mathtt {app}\) does not have any occurrence of recursive functions: \(\mathtt {app},f_a\) and \(f_b\).)
We define the notion of “recursive calls” and call sequences formally below.
Definition 6
(Recursive call relation, call sequences). Let \(P=(D, f_1\,\widetilde{s})\) be a program, with \(D= \{ f_i\,\tilde{x}_i = u_i \}_{1 \le i \le n}\). We define \( D^{\sharp } := D\cup \{ f^{\sharp }_i\,\tilde{x} = u_i \}_{1 \le i \le n} \) where \( f^{\sharp }_1, \dots , f^{\sharp }_n \) are fresh symbols. (Thus, \( D^{\sharp } \) has two copies of each function symbol, one of which is marked by \(\sharp \).) For the terms \(\widetilde{t}_i\) and \(\widetilde{t}_j\) that do not contain marked symbols, we write if (i) \([\widetilde{t}_i/\widetilde{x}_i][f_1^\sharp /f_1,\ldots ,f_n^\sharp /f_n]u_i \mathbin {{\mathop {\longrightarrow }\limits ^{\widetilde{\ell }}}\!\!\! ~{\,}^{*}_{D^\sharp }} f_j^\sharp \,\widetilde{t}'_j\) and (ii) \(\widetilde{t}_j\) is obtained by erasing all the marks in \(\widetilde{t}'_j\). We write \(\mathbf {Callseq}(P)\) for the set of (possibly infinite) sequences of function symbols:
We write \(\mathbf {InfCallseq}(P)\) for the subset of \(\mathbf {Callseq}(P)\) consisting of infinite sequences, i.e., \(\mathbf {Callseq}(P)\cap \{f_1,\ldots ,f_n\}^\omega \).
For example, for \(P_{ app }\) above, \(\mathbf {Callseq}(P)\) is the prefix closure of \(\{(f_bf_a^5)^\omega \}\cup \{s\cdot \mathtt {app}\mid s \text{ is a non-empty finite prefix of } (f_bf_a^5)^\omega \}\), and \(\mathbf {InfCallseq}(P)\) is the singleton set \(\{(f_bf_a^5)^\omega \}\).
Definition 7
(Call-sequence analysis). A priority assignment for a program \( P \) is a function \( \varOmega \mathbin {:}\mathbf {funs}(P) \rightarrow \mathbb {N} \) from the set of function symbols of \( P \) to the set \(\mathbb {N}\) of natural numbers. We write \(\models _{ csa } (P,\varOmega )\) if every infinite call sequence \(g_0g_1g_2\dots \in \mathbf {InfCallseq}(P)\) satisfies the parity condition w.r.t. \( \varOmega \), i.e., the largest number occurring infinitely often in \( \varOmega (g_0) \varOmega (g_1) \varOmega (g_2) \dots \) is even. Call-sequence analysis is the problem of, given a program \( P \) with a priority assignment \( \varOmega \), deciding whether \(\models _{ csa } (P,\varOmega )\) holds.
For example, for \(P_{ app }\) and the priority assignment \(\varOmega _{ app } = \{\mathtt {app}\mapsto 3, f_a\mapsto 1, f_b\mapsto 2\}\), \(\models _ csa (P_{ app }, \varOmega _{ app })\) holds.
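For an ultimately periodic call sequence, the parity condition depends only on the repeated part, since exactly the symbols in that part occur infinitely often. The Python sketch below checks this for the chain \((f_b f_a^5)^\omega \) and the assignment \(\varOmega _{ app }\); the function name `satisfies_parity` and the list encoding of the repeated part are assumptions of the sketch.

```python
def satisfies_parity(priority, cycle):
    """Parity condition for a sequence of the form u . cycle^omega.

    Only the symbols in `cycle` occur infinitely often, so the finite
    prefix u is irrelevant and is not passed in."""
    if not cycle:
        raise ValueError("the repeated part must be non-empty")
    return max(priority[g] for g in cycle) % 2 == 0

OMEGA_APP = {"app": 3, "f_a": 1, "f_b": 2}
CYCLE_APP = ["f_b"] + ["f_a"] * 5   # the repeated part of (f_b f_a^5)^omega
HOLDS = satisfies_parity(OMEGA_APP, CYCLE_APP)
```

The priority 3 of \(\mathtt {app}\) plays no role here because \(\mathtt {app}\) does not occur in the infinite chain. A general program may have infinitely many infinite call sequences, not all ultimately periodic, so this check is an illustration of the condition rather than a decision procedure.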
The call-sequence analysis can naturally be reduced to HFL model checking against the trivial LTS \( \mathtt {L}_0 = (\{\mathtt {s}_\star \}, \emptyset , \emptyset , \mathtt {s}_\star ) \) (or validity checking).
Definition 8
Let \(P=(D,t)\) be a program and \(\varOmega \) be a priority assignment for \(P\). The HES \(\varPhi _{(P,\varOmega ), csa }\) is \(({D}^{\dagger _{ csa }}, {t}^{\dagger _{ csa }})\), where \({D}^{\dagger _{ csa }}\) and \({t}^{\dagger _{ csa }}\) are defined by:
Here, we assume that \(\varOmega (f_i) \ge \varOmega (f_{i+1})\) for each \(i \in \{1,\dots ,n-1\}\), and \(\alpha _i = \nu \) if \(\varOmega (f_i)\) is even and \(\mu \) otherwise.
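The ordering assumption and the choice of fixpoint operator can be sketched as follows in Python. The function name `order_equations` and the string tags `"nu"` and `"mu"` are assumptions of this illustration; only the ordering of the equations and their fixpoint operators are computed, not the equation bodies.

```python
def order_equations(priority):
    """List function symbols by non-increasing priority, each paired with
    its fixpoint operator: nu for even priority, mu for odd."""
    ordered = sorted(priority, key=lambda f: -priority[f])
    return [(f, "nu" if priority[f] % 2 == 0 else "mu") for f in ordered]

OMEGA_APP = {"app": 3, "f_a": 1, "f_b": 2}
ORDERED = order_equations(OMEGA_APP)
```

For \(\varOmega _{ app }\) this places \(\mathtt {app}\) first (bound by \(\mu \)), then \(f_b\) (bound by \(\nu \)), then \(f_a\) (bound by \(\mu \)). Symbols with equal priority keep their input order, because Python's `sorted` is stable.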
The following theorem states the soundness and completeness of the reduction. See [23] for a proof.
Theorem 4
Let \( P \) be a program and \( \varOmega \) be a priority assignment for \(P\). Then \(\models _ csa (P,\varOmega )\) if and only if \( \mathtt {L}_0 \models \varPhi _{(P,\varOmega ), csa }\).
Example 9
For \(P_{ app }\) and \(\varOmega _{ app }\) above, \({(P_{ app },\varOmega _{ app })}^{\dagger _{ csa }} = (\mathcal {E}, f_b\,5)\), where \(\mathcal {E}\) is:
Note that \(\mathtt {L}_0\models {(P_{ app },\varOmega _{ app })}^{\dagger _{ csa }}\) holds.
7.2 From Temporal Verification to Call-Sequence Analysis
This subsection shows a reduction from the temporal verification problem \(\mathbf {InfTraces}(P) \cap L {\mathop {=}\limits ^{?}} \emptyset \) to a call-sequence analysis problem \({\mathop {\models }\limits ^{?}}_{ csa }(P',\varOmega )\).
For the sake of simplicity, we assume without loss of generality that every program \(P=(D, t)\) in this section is non-terminating and every infinite reduction sequence produces infinitely many events, so that \(\mathbf {FullTraces}(P) = \mathbf {InfTraces}(P)\) holds. We also assume that the \(\omega \)-regular language \(L\) for the temporal verification problem is specified by using a non-deterministic, parity word automaton [10]. We recall the definition of non-deterministic, parity word automata below.
Definition 9
(Parity automaton). A non-deterministic parity word automaton is a quintuple \(\mathcal {A}= (Q, \varSigma , \delta , q_I, \varOmega )\) where (i) \(Q\) is a finite set of states; (ii) \(\varSigma \) is a finite alphabet; (iii) \(\delta \), called a transition function, is a total map from \(Q\times \varSigma \) to \(2^Q\); (iv) \(q_I\in Q\) is the initial state; and (v) \(\varOmega \in Q \rightarrow \mathbb {N}\) is the priority function. A run of \(\mathcal {A}\) on an \(\omega \)-word \(a_0 a_1 \dots \in \varSigma ^{\omega }\) is an infinite sequence of states \(\rho = \rho (0) \rho (1) \dots \in Q^{\omega }\) such that (i) \(\rho (0) = q_I\), and (ii) \(\rho (i+1) \in \delta (\rho (i),a_i)\) for each \(i\in \omega \). An \(\omega \)-word \(w \in \varSigma ^{\omega }\) is accepted by \(\mathcal {A}\) if there exists a run \(\rho \) of \(\mathcal {A}\) on \(w\) such that \( \mathbf {max}\{\varOmega (q) \mid q \in \mathbf {Inf}(\rho )\} \text { is even} \), where \(\mathbf {Inf}(\rho )\) is the set of states that occur infinitely often in \(\rho \). We write \( \mathcal {L}(\mathcal {A}) \) for the set of \( \omega \)-words accepted by \( \mathcal {A}\).
For technical convenience, we assume below that \(\delta (q,a)\ne \emptyset \) for every \(q\in Q\) and \(a\in \varSigma \); this does not lose generality since if \(\delta (q,a)=\emptyset \), we can introduce a new “dead” state \(q_{ dead }\) (with priority 1) and change \(\delta (q,a)\) to \(\{q_{ dead }\}\). Given a parity automaton \(\mathcal {A}\), we refer to each component of \(\mathcal {A}\) by \(Q_{\mathcal {A}}\), \(\varSigma _{\mathcal {A}}\), \(\delta _{\mathcal {A}}\), \(q_{I,{\mathcal {A}}}\) and \(\varOmega _{\mathcal {A}}\).
Example 10
Consider the automaton \(\mathcal {A}_{ab}=(\{q_a,q_b\}, \{\mathtt {a},\mathtt {b}\}, \delta , q_a, \varOmega )\), where \(\delta \) is as given in Fig. 4, \(\varOmega (q_a)=0\), and \(\varOmega (q_b)=1\). Then, \(\mathcal {L}(\mathcal {A}_{ab})= \overline{(\mathtt {a}^*\mathtt {b})^\omega } = (\mathtt {a}^*\mathtt {b})^*\mathtt {a}^\omega \).
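Acceptance can be checked mechanically for ultimately periodic words \(u\cdot v^\omega \). The Python sketch below is one possible way to do so and is not taken from the paper: it explores pairs of an automaton state and a position in \(v\), and looks for a reachable cycle whose largest priority is even. The transition table for \(\mathcal {A}_{ab}\) is an assumption, since Fig. 4 is not reproduced here; we take every \(\mathtt {a}\)-transition to lead to \(q_a\) and every \(\mathtt {b}\)-transition to \(q_b\), which is consistent with the language stated in the example.

```python
def accepts_lasso(delta, priority, q_init, prefix, cycle):
    """Does the automaton accept prefix . cycle^omega?  (cycle is non-empty)"""
    if not cycle:
        raise ValueError("the repeated part must be non-empty")
    # States reachable after reading the finite prefix.
    current = {q_init}
    for a in prefix:
        current = {q2 for q in current for q2 in delta.get((q, a), set())}
    n = len(cycle)

    def succ(node):
        q, i = node  # automaton state and position within the cycle
        return {(q2, (i + 1) % n) for q2 in delta.get((q, cycle[i]), set())}

    # All product nodes reachable while reading cycle^omega.
    reachable, stack = set(), [(q, 0) for q in current]
    while stack:
        node = stack.pop()
        if node not in reachable:
            reachable.add(node)
            stack.extend(succ(node))

    # Look for a cycle through a node of even priority p that stays
    # within nodes of priority at most p.
    for node in reachable:
        p = priority[node[0]]
        if p % 2 != 0:
            continue
        seen = set()
        stack = [m for m in succ(node) if priority[m[0]] <= p]
        while stack:
            m = stack.pop()
            if m == node:
                return True
            if m not in seen:
                seen.add(m)
                stack.extend(x for x in succ(m) if priority[x[0]] <= p)
    return False

# Assumed transition table for A_ab (Fig. 4 is not reproduced in the text).
DELTA_AB = {("qa", "a"): {"qa"}, ("qa", "b"): {"qb"},
            ("qb", "a"): {"qa"}, ("qb", "b"): {"qb"}}
PRIORITY_AB = {"qa": 0, "qb": 1}
```

Under these assumptions, `accepts_lasso(DELTA_AB, PRIORITY_AB, "qa", "ab", "a")` is true, since the word contains finitely many \(\mathtt {b}\)'s, while the word \((\mathtt {a}\mathtt {b})^\omega \) is rejected.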
The goal of this subsection is, given a program \(P\) and a parity word automaton \(\mathcal {A}\), to construct another program \(P'\) and a priority assignment \(\varOmega \) for \(P'\), such that \(\mathbf {InfTraces}(P)\cap \mathcal {L}(\mathcal {A})=\emptyset \) if and only if \(\models _{ csa }(P',\varOmega )\).
Note that a necessary and sufficient condition for \(\mathbf {InfTraces}(P)\cap \mathcal {L}(\mathcal {A})=\emptyset \) is that no trace in \(\mathbf {InfTraces}(P)\) has a run whose priority sequence satisfies the parity condition; in other words, for every sequence in \(\mathbf {InfTraces}(P)\), and for every run for the sequence, the largest priority that occurs infinitely often in the associated priority sequence is odd. As explained at the beginning of this section, we reduce this condition to a call-sequence analysis problem by appropriately duplicating functions in a given program. For example, recall the program \(P_0\):
It is translated to \(P_0'\):
where \(c\) is some (closed) boolean expression. Since the largest priorities encountered before calling \(f_a\) and \(f_b\) (since the last recursive call) respectively are \(0\) and \(1\), we assign those priorities plus 1 (to flip odd/evenness) to \(f_a\) and \(f_b\) respectively. Then, the problem of \(\mathbf {InfTraces}(P_0)\cap \mathcal {L}(\mathcal {A}) = \emptyset \) is reduced to \(\models _{ csa } (P'_0, \{f_a\mapsto 1,f_b\mapsto 2\})\). Note here that the priorities of \(f_a\) and \(f_b\) represent summaries of the priorities (plus one) that occur in the run of the automaton until \(f_a\) and \(f_b\) are respectively called since the last recursive call; thus, the largest priority of states that occur infinitely often in the run for an infinite trace is equal to the largest priority that occurs infinitely often in the sequence of summaries \((\varOmega (f_1)-1)(\varOmega (f_2)-1)(\varOmega (f_3)-1)\cdots \) computed from a corresponding call sequence \(f_1f_2f_3\cdots \).
Due to the presence of higher-order functions, the general reduction is more complicated than the example above. First, we need to replicate not only function symbols, but also arguments. For example, consider the following variation \(P_1\) of \(P_0\) above:
Here, we have just made the calls to \(f\) indirect, by preparing the function \(g\). Obviously, the two calls to \(k\) in the body of \(g\) must be distinguished from each other, since different priorities are encountered before the calls. Thus, we duplicate the argument \(k\), and obtain the following program \(P'_1\):
Then, for the priority assignment \(\varOmega = \{f_a\mapsto 1, f_b\mapsto 2, g\mapsto 1\}\), \(\mathbf {InfTraces}(P_1)\cap \mathcal {L}(\mathcal {A}_{ab})=\emptyset \) if and only if \(\models _{ csa } (P_1', \varOmega )\). Secondly, we need to take into account not only the priorities of states visited by \(\mathcal {A}\), but also the states themselves. For example, if we have a function definition \(f\,h = h(\mathbf {event}\ \mathtt {a}; f\,h)\), the largest priority encountered before \(f\) is recursively called in the body of \(f\) depends on the priorities encountered inside \(h\), and also the state of \(\mathcal {A}\) when \(h\) uses the argument \(\mathbf {event}\ \mathtt {a}; f\) (because the state after the \(\mathtt {a}\) event depends on the previous state in general). We, therefore, use intersection types (à la Kobayashi and Ong’s intersection types for HORS model checking [21]) to represent summary information on how each function traverses states of the automaton, and replicate each function and its arguments for each type. We thus formalize the translation as an intersection-type-based program transformation; related transformation techniques are found in [8, 11, 12, 20, 38].
Definition 10
Let \(\mathcal {A}= (Q, \varSigma , \delta , q_I, \varOmega )\) be a non-deterministic parity word automaton. Let \(q\) and \(m\) range over \(Q\) and the set \( codom (\varOmega )\) of priorities respectively. The set \(\mathbf {Types}_{\mathcal {A}}\) of intersection types, ranged over by \(\theta \), is defined by:
We assume a certain total order < on \(\mathbf {Types}_{\mathcal {A}}\times \mathbb {N}\), and require that in \(\bigwedge _{1 \le i \le k} (\theta _i, m_i)\), \((\theta _i,m_i)<(\theta _j,m_j)\) holds for each \(i<j\).
We often write \((\theta _1,m_1)\wedge \cdots \wedge (\theta _k,m_k)\) for \(\bigwedge _{1 \le i \le k} (\theta _i, m_i)\), and \(\top \) when \(k=0\). Intuitively, the type \(q\) describes expressions of simple type \(\star \), which may be evaluated when the automaton \(\mathcal {A}\) is in the state \(q\) (here, we have in mind an execution of the product of a program and the automaton, where the latter takes events produced by the program and changes its states). The type \((\bigwedge _{1 \le i \le k} (\theta _i, m_i))\rightarrow \theta \) describes functions that take an argument, use it according to types \(\theta _1,\ldots ,\theta _k\), and return a value of type \(\theta \). Furthermore, the part \(m_i\) describes that the argument may be used as a value of type \(\theta _i\) only when the largest priority visited since the function is called is \(m_i\). For example, given the automaton in Example 10, the function \(\lambda x.(\mathbf {event}\ \mathtt {a}; x)\) may have types \((q_a,0)\rightarrow q_a\) and \((q_a,0)\rightarrow q_b\), because the body may be executed from state \(q_a\) or \(q_b\) (thus, the return type may be any of them), but \(x\) is used only when the automaton is in state \(q_a\) and the largest priority visited is \(0\). In contrast, \(\lambda x.(\mathbf {event}\ \mathtt {b}; x)\) has types \((q_b,1)\rightarrow q_a\) and \((q_b,1)\rightarrow q_b\).
Using the intersection types above, we shall define a type-based transformation relation of the form \(\varGamma \vdash _{\mathcal {A}}t:\theta \Rightarrow t'\), where \(t\) and \(t'\) are the source and target terms of the transformation, and \(\varGamma \), called an intersection type environment, is a finite set of type bindings of the form \(x \mathbin {:}\mathtt {int}\) or \(x \mathbin {:}(\theta , m, m')\). We allow multiple type bindings for a variable \( x \) except for \( x \mathbin {:}\mathtt {int}\) (i.e. if \( x \mathbin {:}\mathtt {int}\in \varGamma \), then this must be the unique type binding for \(x \) in \( \varGamma \)). The binding \(x \mathbin {:}(\theta , m, m')\) means that \(x\) should be used as a value of type \(\theta \) when the largest priority visited is \(m\); \(m'\) is auxiliary information used to record the largest priority encountered so far.
The transformation relation \(\varGamma \vdash _{\mathcal {A}}t:\theta \Rightarrow t'\) is inductively defined by the rules in Fig. 5. (For technical convenience, we have extended terms with \(\lambda \)-abstractions; they may occur only at top-level function definitions.) In the figure, \([k]\) denotes the set \(\{i\in \mathbb {N}\mid 1\le i\le k\}\). The operation \(\varGamma \uparrow m\) used in the figure is defined by:
The operation is applied when the priority \(m\) is encountered, in which case the largest priority encountered is updated accordingly. The key rules are IT-Var, IT-Event, IT-App, and IT-Abs. In IT-Var, the variable \(x\) is replicated for each type; in the target of the translation, \(x_{\theta ,m}\) and \(x_{\theta ',m'}\) are treated as different variables if \((\theta ,m)\ne (\theta ',m')\). The rule IT-Event reflects the state change caused by the event \(a\) to the type and the type environment. Since the state change may be non-deterministic, we transform \(t\) for each of the next states \(q_1,\ldots ,q_n\), and combine the resulting terms with non-deterministic choice. The rules IT-App and IT-Abs replicate function arguments for each type. In addition, in IT-App, the operation \(\varGamma \uparrow m_i\) reflects the fact that \(t_2\) is used as a value of type \(\theta _i\) after the priority \(m_i\) is encountered. The other rules just transform terms in a compositional manner. If target terms are ignored, the rules as a whole are close to those of Kobayashi and Ong’s type system for HORS model checking [21].
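The effect of \(\varGamma \uparrow m\) can be sketched in Python under an assumed reading of the prose: integer bindings are left unchanged, and in each binding \(x \mathbin {:}(\theta , m_1, m_2)\) the recorded component \(m_2\) is replaced by \(\max (m_2, m)\). The defining equation is not reproduced in the text above, so both this reading and the encoding of environments as sets of pairs are assumptions of the sketch, as is the name `raise_priority`.

```python
def raise_priority(gamma, m):
    """Assumed reading of the operation written Gamma ^ m in the text.

    gamma is a set of bindings, each either (x, "int") or
    (x, (theta, m1, m2)), where m2 records the largest priority seen so far."""
    result = set()
    for var, ty in gamma:
        if ty == "int":
            result.add((var, ty))            # integer bindings are unchanged
        else:
            theta, m1, m2 = ty
            result.add((var, (theta, m1, max(m2, m))))
    return frozenset(result)

GAMMA = {("x", "int"), ("k", ("q_a", 0, 0)), ("k", ("q_b", 1, 1))}
RAISED = raise_priority(GAMMA, 1)
```

Here the binding for `k` at type `q_a` has its recorded priority raised from 0 to 1, while the binding at type `q_b` is unaffected because its recorded priority is already 1.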
We now define the transformation for programs. A top-level type environment \( \varXi \) is a finite set of type bindings of the form \( x : (\theta , m) \). Like intersection type environments, \( \varXi \) may have more than one binding for each variable. We write \( \varXi \vdash _{\mathcal {A}}t : \theta \) to mean \( \{ x : (\theta , m, 0) \mid x : (\theta , m) \in \varXi \} \vdash _{\mathcal {A}}t : \theta \). For a set \( D \) of function definitions, we write \( \varXi \vdash _{\mathcal {A}}D \Rightarrow D' \) if \( dom (D') = \{\, f_{\theta ,m} \mid f : (\theta , m) \in \varXi \,\} \) and \( \varXi \vdash _{\mathcal {A}}D(f) : \theta \Rightarrow D'(f_{\theta ,m}) \) for every \( f \mathbin {:}(\theta , m) \in \varXi \). For a program \( P = (D, t) \), we write \( \varXi \vdash _{\mathcal {A}}P \Rightarrow (P',\varOmega ') \) if \( P' = (D', t') \), \( \varXi \vdash _{\mathcal {A}}D \Rightarrow D' \) and \( \varXi \vdash _{\mathcal {A}}t : q_I\Rightarrow t' \), with \(\varOmega '(f_{\theta ,m})=m+1\) for each \(f_{\theta ,m}\in dom (D')\). We just write \(\vdash _{\mathcal {A}}P\Rightarrow (P',\varOmega ')\) if \( \varXi \vdash _{\mathcal {A}}P \Rightarrow (P',\varOmega ') \) holds for some \(\varXi \).
Example 11
Consider the automaton \(\mathcal {A}_{ab}\) in Example 10, and the program \(P_2 = (D_2, f\,5)\) where \(D_2\) consists of the following function definitions:
Let \(\varXi \) be: \(\{g\mathbin {:}((q_a,0)\wedge (q_b,1)\rightarrow q_a, 0), g\mathbin {:}((q_a,0)\wedge (q_b,1) \rightarrow q_b, 0), f\mathbin {:}(\mathtt {int}\rightarrow q_a, 0), f\mathbin {:}(\mathtt {int}\rightarrow q_b, 1)\} \). Then, \(\varXi \vdash _{\mathcal {A}}P_2 \Rightarrow ((D'_2, f_{\mathtt {int}\rightarrow q_a, 0}\,5),\varOmega )\) where:
Notice that \(f\), \(g\), and the arguments of \(g\) have been duplicated. Furthermore, whenever \(f_{\theta ,m}\) is called, the largest priority that has been encountered since the last recursive call is \(m\). For example, in the then-clause of \(f_{\mathtt {int}\rightarrow q_a, 0}\), \(f_{\mathtt {int}\rightarrow q_b, 1}(x-1)\) may be called through \(g_{(q_a,0)\wedge (q_b,1)\rightarrow q_a, 0}\). Since \(g_{(q_a,0)\wedge (q_b,1)\rightarrow q_a, 0}\) uses the second argument only after an event \(\mathtt {b}\), the largest priority encountered is \(1\). This property is important for the correctness of our reduction.
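The duplication of function names and the priority assignment \(\varOmega '(f_{\theta ,m})=m+1\) can be illustrated by a small Python sketch. It takes the top-level type environment \(\varXi \) of Example 11 as input and computes only \( dom (D')\) and the priorities; it does not perform the type-directed translation of function bodies. Types are kept as opaque strings, and the names `XI`, `target_name`, and `priorities`, as well as the textual encoding of \(f_{\theta ,m}\), are assumptions of this sketch.

```python
from typing import Dict, List, Tuple

# Top-level type environment Xi of Example 11, as bindings f : (theta, m).
# Types are opaque strings here ("&" stands for intersection).
XI: List[Tuple[str, str, int]] = [
    ("g", "(q_a,0)&(q_b,1)->q_a", 0),
    ("g", "(q_a,0)&(q_b,1)->q_b", 0),
    ("f", "int->q_a", 0),
    ("f", "int->q_b", 1),
]


def target_name(f: str, theta: str, m: int) -> str:
    """Hypothetical textual encoding of the copy f_{theta,m}."""
    return f"{f}_[{theta},{m}]"


def priorities(xi: List[Tuple[str, str, int]]) -> Dict[str, int]:
    """dom(D') = { f_{theta,m} | f : (theta, m) in Xi }, with Omega'(f_{theta,m}) = m + 1."""
    return {target_name(f, theta, m): m + 1 for (f, theta, m) in xi}


OMEGA = priorities(XI)
```

Under these assumptions, `OMEGA` has four entries: the two copies of \(g\) and \(f_{\mathtt {int}\rightarrow q_a, 0}\) receive priority 1, and \(f_{\mathtt {int}\rightarrow q_b, 1}\) receives priority 2.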
The theorems below state that our reduction is sound and complete, and that there is an effective algorithm for the reduction; see [23] for proofs.
Theorem 5
Let \( P\) be a program and \( \mathcal {A}\) be a parity automaton. Suppose that \( \varXi \vdash _{\mathcal {A}}P \Rightarrow (P',\varOmega ) \). Then \( \mathbf {InfTraces}(P) \cap \mathcal {L}(\mathcal {A}) = \emptyset \) if and only if \(\models _{\mathit {csa}} (P',\varOmega )\).
Theorem 6
For every \( P \) and \( \mathcal {A}\), one can effectively construct \( \varXi \), \( P' \) and \(\varOmega \) such that \( \varXi \vdash _{\mathcal {A}}P \Rightarrow (P',\varOmega ) \).
The proof of Theorem 6 above also implies that the reduction from temporal property verification to call-sequence analysis can be performed in polynomial time. Combined with the reduction from call-sequence analysis to HFL model checking, we have thus obtained a polynomial-time reduction from the temporal verification problem \(\mathbf {InfTraces}(P)\cap \mathcal {L}(\mathcal {A}) {\mathop {=}\limits ^{?}} \emptyset \) to HFL model checking.
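Theorem 5 replaces the temporal property by a parity condition on infinite call sequences. The following Python sketch checks such a condition for an ultimately periodic call sequence, assuming the usual convention that a sequence is good when the largest priority occurring infinitely often is even. The function name `satisfies_parity`, the function names `f_0` and `f_1`, and the priority map are hypothetical and serve only as an illustration.

```python
from typing import Dict, Sequence


def satisfies_parity(omega: Dict[str, int], cycle: Sequence[str]) -> bool:
    """Max-even parity check on the repeated part of a lasso-shaped call sequence.

    Only the functions called infinitely often (those on the cycle) matter,
    so the finite prefix of the sequence is not needed.
    """
    if not cycle:
        raise ValueError("the repeated part of the call sequence must be non-empty")
    return max(omega[f] for f in cycle) % 2 == 0


# Hypothetical priorities of the form m + 1, for copies f_{theta,m} with m = 0 and m = 1.
omega = {"f_0": 1, "f_1": 2}

only_f0 = satisfies_parity(omega, ["f_0"])            # largest priority is 1 (odd)
f0_and_f1 = satisfies_parity(omega, ["f_0", "f_1"])   # largest priority is 2 (even)
```

A sequence that eventually calls only `f_0` fails the check, whereas one that calls `f_1` infinitely often passes it. Call-sequence analysis asks for this condition on every infinite call sequence of the program, which the sketch does not attempt to enumerate.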
8 Related Work
As mentioned in Sect. 1, our reduction from program verification problems to HFL model checking problems has been partially inspired by the translation of Kobayashi et al. [19] from HORS model checking to HFL model checking. As in their translation (and unlike in previous applications of HFL model checking [28, 42]), our translation switches the roles of properties and models (or programs) to be verified. Although a combination of their translation with Kobayashi’s reduction from program verification to HORS model checking [17, 18] yields an (indirect) translation from finite-data programs to pure HFL model checking problems, the combination does not work for infinite-data programs. In contrast, our translation is sound and complete even for infinite-data programs. Among the translations in Sects. 5, 6 and 7, the translation in Sect. 7.2 shares some similarity with their translation, in that functions and their arguments are replicated for each priority. The actual translations are, however, quite different; ours is type-directed and optimized for a given automaton, whereas their translation is not. This difference comes from a difference in goals: the goal of [19] was to clarify the relationship between HORS and HFL, hence their translation was designed to be independent of an automaton. The proof of the correctness of our translation in Sect. 7 is much more involved due to the need for dealing with integers. Whilst the proof of [19] could reuse the type-based characterization of HORS model checking [21], we had to generalize arguments in both [19, 21] to work on infinite-data programs.
Lange et al. [28] have shown that various process equivalence checking problems (such as bisimulation and trace equivalence) can be reduced to (pure) HFL model checking problems. The idea of their reduction is quite different from ours. They reduce processes to LTSs, whereas we reduce programs to HFL formulas.
Major approaches to automated or semi-automated higher-order program verification have been HORS model checking [17, 18, 22, 27, 31, 33, 43], (refinement) type systems [14, 24, 34, 35, 36, 39, 41, 44], Horn clause solving [2, 7], and their combinations. As already discussed in Sect. 1, compared with the HORS model checking approach, our new approach provides more uniform, streamlined methods. Whilst the HORS model checking approach is for fully automated verification, our approach enables various degrees of automation: after verification problems are automatically translated to HFL\(_{\mathbf {Z}}\) formulas, one can prove them (i) interactively using a proof assistant like Coq (see [23]), (ii) semi-automatically, by letting users provide hints for induction/coinduction and discharging the remaining proof obligations by (some extension of) an SMT solver, or (iii) fully automatically by recasting the techniques used in the HORS-based approach; for example, to deal with the \(\nu \)-only fragment of HFL\(_{\mathbf {Z}}\), we can reuse the technique of predicate abstraction [22]. For a more technical comparison between the HORS-based approach and our HFL-based approach, see [23].
As for type-based approaches [14, 24, 34, 35, 36, 39, 41, 44], most of the refinement type systems are (i) restricted to safety properties, and/or (ii) incomplete. A notable exception is the recent work of Unno et al. [40], which provides a relatively complete type system for the classes of properties discussed in Sect. 5. Our approach deals with a wider class of properties (cf. Sects. 6 and 7). Their “relative completeness” property relies on Gödel coding of functions, which cannot be exploited in practice.
The reductions from program verification to Horn clause solving have recently been advocated [2, 3, 4] or used [34, 39] (via refinement type inference problems) by a number of researchers. Since Horn clauses can be expressed in a fragment of HFL without modal operators, fixpoint alternations (between \(\nu \) and \(\mu \)), and higher-order predicates, our reductions to HFL model checking may be viewed as extensions of those approaches. Higher-order predicates and fixpoints over them allow us to provide sound and complete characterizations of properties of higher-order programs for a wider class of properties. Bjørner et al. [4] proposed an alternative approach to obtaining a complete characterization of safety properties, which defunctionalizes higher-order programs by using algebraic data types and then reduces the problems to (first-order) Horn clauses. A disadvantage of that approach is that control flow information of higher-order programs is also encoded into algebraic data types; hence even for finite-data higher-order programs, the Horn clauses obtained by the reduction belong to an undecidable fragment. In contrast, our reductions yield pure HFL model checking problems for finite-data programs. Burn et al. [7] have recently advocated the use of higher-order (constrained) Horn clauses for verification of safety properties (which correspond to the negation of may-reachability properties discussed in Sect. 5.1 of the present paper) of higher-order programs. They interpret recursion using the least fixpoint semantics, so their higher-order Horn clauses roughly correspond to a fragment of HFL\(_{\mathbf {Z}}\) without modal operators and fixpoint alternations. They have not shown a general, concrete reduction from safety property verification to higher-order Horn clause solving.
The characterization of the reachability problems in Sect. 5 in terms of formulas without modal operators is reminiscent of predicate transformers [9, 13] used for computing the weakest preconditions of imperative programs. In particular, [9] and [13] respectively used least fixpoints to express weakest preconditions for while-loops and recursions.
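To make the connection with predicate transformers concrete, the following Python sketch computes the weakest precondition of a while-loop as a least fixpoint by Kleene iteration. It is only an illustration under simplifying assumptions that are not taken from the paper: the state space is finite, the loop body is a deterministic total function on states, and the names `wp_while` and `WP` as well as the toy loop are hypothetical.

```python
from typing import Callable, FrozenSet, Iterable


def wp_while(
    states: Iterable[int],
    guard: Callable[[int], bool],
    body: Callable[[int], int],
    post: Callable[[int], bool],
) -> FrozenSet[int]:
    """Least fixpoint of X |-> {s | (not guard(s) and post(s)) or (guard(s) and body(s) in X)}."""
    universe = list(states)
    x: FrozenSet[int] = frozenset()      # start from the empty predicate (bottom)
    while True:
        nxt = frozenset(
            s for s in universe
            if (not guard(s) and post(s)) or (guard(s) and body(s) in x)
        )
        if nxt == x:                     # no change: the least fixpoint is reached
            return x
        x = nxt


# Toy loop: while s != 0: s = s - 2 if s >= 2 else s   (state 1 loops forever)
WP = wp_while(
    range(11),
    guard=lambda s: s != 0,
    body=lambda s: s - 2 if s >= 2 else s,
    post=lambda s: s == 0,
)
```

For this toy loop the result is the set of even states between 0 and 10. The odd states eventually reach state 1, where the loop diverges, and the least fixpoint excludes them; a greatest fixpoint would include them, which is the difference between total and partial correctness.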
9 Conclusion
We have shown that various verification problems for higher-order functional programs can be naturally reduced to (extended) HFL model checking problems. In all the reductions, a program is mapped to an HFL formula expressing the property that the behavior of the program is correct. For developing verification tools for higher-order functional programs, our reductions allow us to focus on the development of (automated or semi-automated) HFL\(_{\mathbf {Z}}\) model checking tools (or, even more simply, theorem provers for HFL\(_{\mathbf {Z}}\) without modal operators, as the reductions of Sects. 5 and 7 yield HFL formulas without modal operators). To this end, we have developed a prototype model checker for pure HFL (without integers), which will be reported in a separate paper. Work is under way to develop HFL\(_{\mathbf {Z}}\) model checkers by recasting the techniques [22, 26, 27, 43] developed for the HORS-based approach, which, together with the reductions presented in this paper, would yield fully automated verification tools. We have also started building a Coq library for interactively proving HFL\(_{\mathbf {Z}}\) formulas, as briefly discussed in [23]. As a final remark, although one may fear that our reductions may map program verification problems to “harder” problems due to the expressive power of HFL\(_{\mathbf {Z}}\), this is actually not the case, at least for the classes of problems in Sects. 5 and 6, which use only the alternation-free fragment of HFL\(_{\mathbf {Z}}\). The model checking problems for \(\mu \)-only or \(\nu \)-only HFL\(_{\mathbf {Z}}\) are semidecidable and co-semidecidable respectively, like the source verification problems of may/must-reachability and their negations of closed programs.
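The semidecidability remark can be illustrated, for a first-order formula only, by unfolding a least fixpoint through its finite approximants. The Python sketch below uses the illustrative formula \(\mu R.\lambda x.\, x=0 \vee (x>0 \wedge R(x-2))\), which is not taken from the paper; the names `approx` and `semi_decide` and the parameter `max_rounds` are hypothetical, and the sketch says nothing about higher-order formulas.

```python
from typing import Callable, Optional


def approx(k: int) -> Callable[[int], bool]:
    """k-th approximant of  mu R. lambda x. (x = 0) or (x > 0 and R(x - 2))."""
    if k == 0:
        return lambda x: False           # bottom element: the empty predicate
    prev = approx(k - 1)
    return lambda x: x == 0 or (x > 0 and prev(x - 2))


def semi_decide(x: int, max_rounds: Optional[int] = None) -> Optional[int]:
    """Return the first k whose approximant holds at x.

    With max_rounds=None the search does not terminate when no such k exists,
    which is what makes this a semi-decision procedure rather than a decision one.
    """
    k = 0
    while max_rounds is None or k <= max_rounds:
        if approx(k)(x):
            return k
        k += 1
    return None
```

For example, `semi_decide(4)` stops at the third approximant, whereas `semi_decide(3, max_rounds=20)` gives up and returns `None`, since no finite unfolding makes the formula true at 3.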
Notes
 1.
In this section, we use only a fragment of HFL that can be expressed in the modal \(\mu \)-calculus. Some familiarity with the modal \(\mu \)-calculus [25] would help.
 2.
Here, for the sake of simplicity, we assume that we are interested in the usage of the single file pointer \(x\), so that the name \(x\) can be ignored in HFL formulas; usage of multiple files can be tracked by using the technique of [17].
 3.
This does not mean that invariant discovery is unnecessary; invariant discovery is just postponed to the later phase of discharging verification conditions, so that it can be uniformly performed among various verification problems.
 4.
Note that the derivation of each judgment \(\varDelta \vdash _{\mathtt {H}}\varphi :\sigma \) is unique if there is any.
 5.
Call-by-value programs can be handled by applying the CPS transformation before applying the reductions to HFL model checking.
 6.
Unlike in Sect. 3, the variables are bound by \(\nu \) since we are not concerned with the termination property here.
 7.
In the example above, we can actually remove \(\langle {\mathtt {a}}\rangle \) and \(\langle {\mathtt {b}}\rangle \), as information about events has been taken into account when \(f\) was duplicated.
References
Axelsson, R., Lange, M., Somla, R.: The complexity of model checking higher-order fixpoint logic. Logical Methods Comput. Sci. 3(2), 1–33 (2007)
Bjørner, N., Gurfinkel, A., McMillan, K., Rybalchenko, A.: Horn clause solvers for program verification. In: Beklemishev, L.D., Blass, A., Dershowitz, N., Finkbeiner, B., Schulte, W. (eds.) Fields of Logic and Computation II. LNCS, vol. 9300, pp. 24–51. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-23534-9_2
Bjørner, N., McMillan, K.L., Rybalchenko, A.: Program verification as satisfiability modulo theories. In: SMT 2012, EPiC Series in Computing, vol. 20, pp. 3–11. EasyChair (2012)
Bjørner, N., McMillan, K.L., Rybalchenko, A.: Higher-order program verification as satisfiability modulo theories with algebraic data-types. CoRR, abs/1306.5264 (2013)
Blass, A., Gurevich, Y.: Existential fixed-point logic. In: Börger, E. (ed.) Computation Theory and Logic. LNCS, vol. 270, pp. 20–36. Springer, Heidelberg (1987). https://doi.org/10.1007/3-540-18170-9_151
Blume, M., Acar, U.A., Chae, W.: Exception handlers as extensible cases. In: Ramalingam, G. (ed.) APLAS 2008. LNCS, vol. 5356, pp. 273–289. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-89330-1_20
Burn, T.C., Ong, C.L., Ramsay, S.J.: Higher-order constrained Horn clauses for verification. PACMPL 2(POPL), 11:1–11:28 (2018)
Carayol, A., Serre, O.: Collapsible pushdown automata and labeled recursion schemes: equivalence, safety and effective selection. In: LICS 2012, pp. 165–174. IEEE (2012)
Dijkstra, E.W.: Guarded commands, nondeterminacy and formal derivation of programs. Commun. ACM 18(8), 453–457 (1975)
Grädel, E., Thomas, W., Wilke, T. (eds.): Automata, Logics, and Infinite Games: A Guide to Current Research. LNCS, vol. 2500. Springer, Heidelberg (2002). https://doi.org/10.1007/3-540-36387-4
Grellois, C., Melliès, P.: Relational semantics of linear logic and higher-order model checking. In: Proceedings of CSL 2015, LIPIcs, vol. 41, pp. 260–276 (2015)
Haddad, A.: Model checking and functional program transformations. In: Proceedings of FSTTCS 2013, LIPIcs, vol. 24, pp. 115–126 (2013)
Hesselink, W.H.: Predicate-transformer semantics of general recursion. Acta Inf. 26(4), 309–332 (1989)
Hofmann, M., Chen, W.: Abstract interpretation from Büchi automata. In: Proceedings of CSL-LICS 2014, pp. 51:1–51:10. ACM (2014)
Igarashi, A., Kobayashi, N.: Resource usage analysis. ACM Trans. Prog. Lang. Syst. 27(2), 264–313 (2005)
Knapik, T., Niwiński, D., Urzyczyn, P.: Higher-order pushdown trees are easy. In: Nielsen, M., Engberg, U. (eds.) FoSSaCS 2002. LNCS, vol. 2303, pp. 205–222. Springer, Heidelberg (2002). https://doi.org/10.1007/3-540-45931-6_15
Kobayashi, N.: Types and higher-order recursion schemes for verification of higher-order programs. In: Proceedings of POPL, pp. 416–428. ACM Press (2009)
Kobayashi, N.: Model checking higher-order programs. J. ACM 60(3), 1–62 (2013)
Kobayashi, N., Lozes, É., Bruse, F.: On the relationship between higher-order recursion schemes and higher-order fixpoint logic. In: Proceedings of POPL 2017, pp. 246–259 (2017)
Kobayashi, N., Matsuda, K., Shinohara, A., Yaguchi, K.: Functional programs as compressed data. High.-Order Symbolic Comput. 25(1), 39–84 (2013)
Kobayashi, N., Ong, C.-H.L.: A type system equivalent to the modal mu-calculus model checking of higher-order recursion schemes. In: Proceedings of LICS 2009, pp. 179–188 (2009)
Kobayashi, N., Sato, R., Unno, H.: Predicate abstraction and CEGAR for higher-order model checking. In: Proceedings of PLDI, pp. 222–233. ACM Press (2011)
Kobayashi, N., Tsukada, T., Watanabe, K.: Higher-order program verification via HFL model checking. CoRR abs/1710.08614 (2017). http://arxiv.org/abs/1710.08614
Koskinen, E., Terauchi, T.: Local temporal reasoning. In: Proceedings of CSL-LICS 2014, pp. 59:1–59:10. ACM (2014)
Kozen, D.: Results on the propositional \(\mu \)-calculus. Theor. Comput. Sci. 27, 333–354 (1983)
Kuwahara, T., Sato, R., Unno, H., Kobayashi, N.: Predicate abstraction and CEGAR for disproving termination of higher-order functional programs. In: Kroening, D., Păsăreanu, C.S. (eds.) CAV 2015. LNCS, vol. 9207, pp. 287–303. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-21668-3_17
Kuwahara, T., Terauchi, T., Unno, H., Kobayashi, N.: Automatic termination verification for higher-order functional programs. In: Shao, Z. (ed.) ESOP 2014. LNCS, vol. 8410, pp. 392–411. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-642-54833-8_21
Lange, M., Lozes, É., Guzmán, M.V.: Model-checking process equivalences. Theor. Comput. Sci. 560, 326–347 (2014)
Ledesma-Garza, R., Rybalchenko, A.: Binary reachability analysis of higher order functional programs. In: Miné, A., Schmidt, D. (eds.) SAS 2012. LNCS, vol. 7460, pp. 388–404. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-33125-1_26
Lozes, É.: A type-directed negation elimination. In: Proceedings FICS 2015, EPTCS, vol. 191, pp. 132–142 (2015)
Murase, A., Terauchi, T., Kobayashi, N., Sato, R., Unno, H.: Temporal verification of higher-order functional programs. In: Proceedings of POPL 2016, pp. 57–68 (2016)
Ong, C.-H.L.: On model-checking trees generated by higher-order recursion schemes. In: LICS 2006, pp. 81–90. IEEE Computer Society Press (2006)
Ong, C.-H.L., Ramsay, S.: Verifying higher-order programs with pattern-matching algebraic data types. In: Proceedings of POPL, pp. 587–598. ACM Press (2011)
Rondon, P.M., Kawaguchi, M., Jhala, R.: Liquid types. PLDI 2008, 159–169 (2008)
Skalka, C., Smith, S.F., Horn, D.V.: Types and trace effects of higher order programs. J. Funct. Program. 18(2), 179–249 (2008)
Terauchi, T.: Dependent types from counterexamples. In: Proceedings of POPL, pp. 119–130. ACM (2010)
Tobita, Y., Tsukada, T., Kobayashi, N.: Exact flow analysis by higher-order model checking. In: Schrijvers, T., Thiemann, P. (eds.) FLOPS 2012. LNCS, vol. 7294, pp. 275–289. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-29822-6_22
Tsukada, T., Ong, C.L.: Compositional higher-order model checking via \(\omega \)-regular games over Böhm trees. In: Proceedings of CSL-LICS 2014, pp. 78:1–78:10. ACM (2014)
Unno, H., Kobayashi, N.: Dependent type inference with interpolants. In: PPDP 2009, pp. 277–288. ACM (2009)
Unno, H., Satake, Y., Terauchi, T.: Relatively complete refinement type system for verification of higher-order nondeterministic programs. PACMPL 2(POPL), 12:01–12:29 (2018)
Unno, H., Terauchi, T., Kobayashi, N.: Automating relatively complete verification of higher-order functional programs. In: POPL 2013, pp. 75–86. ACM (2013)
Viswanathan, M., Viswanathan, R.: A higher order modal fixed point logic. In: Gardner, P., Yoshida, N. (eds.) CONCUR 2004. LNCS, vol. 3170, pp. 512–528. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-28644-8_33
Watanabe, K., Sato, R., Tsukada, T., Kobayashi, N.: Automatically disproving fair termination of higher-order functional programs. In: Proceedings of ICFP 2016, pp. 243–255. ACM (2016)
Zhu, H., Nori, A.V., Jagannathan, S.: Learning refinement types. In: Proceedings of ICFP 2015, pp. 400–411. ACM (2015)
Acknowledgment
We would like to thank anonymous referees for useful comments. This work was supported by JSPS KAKENHI Grant Numbers JP15H05706 and JP16K16004.
© 2018 The Author(s)
Kobayashi, N., Tsukada, T., Watanabe, K. (2018). Higher-Order Program Verification via HFL Model Checking. In: Ahmed, A. (eds) Programming Languages and Systems. ESOP 2018. Lecture Notes in Computer Science, vol 10801. Springer, Cham. https://doi.org/10.1007/978-3-319-89884-1_25
DOI: https://doi.org/10.1007/978-3-319-89884-1_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-89883-4
Online ISBN: 978-3-319-89884-1