
1 Introduction

Relational properties (also called hyperproperties [21]) move away from a traditional specification that considers all executions of a system in isolation and, instead, relate multiple executions. Hyperproperties are becoming increasingly important and have shown up in various disciplines, perhaps most prominently in information-flow control. Assume we are given a program \(\mathbb {P}\) with high-security input h, low-security input l, and public output o, and we want to formally prove that the output of \(\mathbb {P}\) does not leak information about h. One way to ensure this is to verify that \(\mathbb {P}\) behaves deterministically in the low-security input l, i.e., if the low-security input is identical across two executions, so is \(\mathbb {P}\)’s output.

The above property is a typical example of a 2-safety property stating a requirement on all pairs of traces. More generally, a k-safety property requires that all k-tuples of executions, together, satisfy a given property. In the last decade, many approaches for the verification of k-safety properties have been proposed, based, e.g., on model-checking [31, 33, 55], abstract interpretation [4, 41, 43, 44], symbolic execution [30], or program logics [7, 28, 49, 56, 60].

Fig. 1. Example program

However, for many relational properties, the implicit universal quantification found in k-safety properties is too restrictive. Consider the simple program in Figure 1 (taken from [12]), where \(\star _\mathbb {N}\) denotes the nondeterministic choice of a natural number. This program clearly violates the 2-safety property discussed above as the nondeterminism influences the final value of o. Nevertheless, the program does not leak any information about the secret input h. To see this, assume the attacker observes some fixed low-security input-output pair (l, o), i.e., the attacker observes everything except the high-security input. The key observation is that (l, o) is possible for any possible high-security input, i.e., for every value of h, there exists some way to resolve the nondeterminism such that (l, o) is the observation made by the attacker. This information-flow policy – called generalized non-interference (GNI) [45] – requires a combination of universal and existential reasoning and thus cannot be expressed as a k-safety property.

FEHTs. In this paper, we study the automated verification of such (functional) \(\forall ^*\exists ^*\) properties. Concretely, we consider specifications in a form we call Forall-Exist Hoare Tuples (FEHT) (also called refinement quadruples [5] or RHLE triples [26]), which have the form

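\(\boldsymbol{\langle } \varPhi \boldsymbol{\rangle } \mathbb {P}_1 \circledast \cdots \circledast \mathbb {P}_k \sim \mathbb {P}_{k+1} \circledast \cdots \circledast \mathbb {P}_{k+l} \boldsymbol{\langle } \varPsi \boldsymbol{\rangle }\)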

where \(\mathbb {P}_1, \ldots , \mathbb {P}_{k+l}\) are (possibly identical) programs and \(\varPhi , \varPsi \) are first-order formulas that relate \(k+l\) different program runs. The FEHT is valid if for all \(k+l\) initial states that satisfy \(\varPhi \), and for all possible executions of \(\mathbb {P}_1, \ldots , \mathbb {P}_k\) there exist executions of \(\mathbb {P}_{k+1}, \ldots , \mathbb {P}_{k+l}\) such that the final states satisfy \(\varPsi \). For example, GNI can be expressed as \(\boldsymbol{\langle } l_1=l_2 \boldsymbol{\rangle } \mathbb {P} \sim \mathbb {P} \boldsymbol{\langle } o_1=o_2 \boldsymbol{\rangle }\), where \(l_1\) and \(o_1\) (resp. \(l_2\) and \(o_2\)) refer to the value of l and o in the first (resp. second) program copy. That is, for any two initial states \(\sigma _1, \sigma _2\) with identical values for l (but possibly different values for h), and any final state \(\sigma _1'\) reachable by executing \(\mathbb {P}\) from \(\sigma _1\), there exists some final state \(\sigma _2'\) (reachable from \(\sigma _2\) by executing \(\mathbb {P}\)) that agrees with \(\sigma _1'\) in the value of o. The program in Figure 1 satisfies this FEHT. In the terminology of Clarkson and Schneider [21], GNI is a hyperliveness property, hence the name of our paper. Intuitively, the term hyperliveness stems from the fact that – due to the existential quantification in FEHTs – GNI reasons about the existence of a particular execution. Similar to the definition of liveness in temporal properties [2], we can, therefore, satisfy GNI by adding sufficiently many execution traces [22].
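To make the quantifier alternation concrete, the following Python sketch brute-forces the GNI tuple \(\boldsymbol{\langle } l_1=l_2 \boldsymbol{\rangle } \mathbb {P} \sim \mathbb {P} \boldsymbol{\langle } o_1=o_2 \boldsymbol{\rangle }\) over a small finite domain and contrasts it with the purely universal 2-safety variant. The two toy programs (`masked`, `leaky`), the helper names, and the bounded domain are assumptions made for illustration only; they are not the program from Figure 1, and bounded enumeration is not the verification method developed in this paper.

```python
from itertools import product

DOMAIN = range(4)  # assumed finite value range for h, l, and the nondeterministic choice


def masked(h, l):
    """Hypothetical program: o = l + ((h + x) mod 4) for a nondeterministic x."""
    return {l + (h + x) % 4 for x in DOMAIN}


def leaky(h, l):
    """Hypothetical program: o = l + h (deterministic, so o reveals h)."""
    return {l + h}


def satisfies_gni(prog):
    """<l1 = l2> P ~ P <o1 = o2>: for every run of copy 1 there is a matching run of copy 2."""
    return all(
        any(o1 == o2 for o2 in prog(h2, l))
        for h1, h2, l in product(DOMAIN, repeat=3)
        for o1 in prog(h1, l)
    )


def satisfies_2safety(prog):
    """Purely universal variant: all pairs of runs with equal l agree on o."""
    return all(
        o1 == o2
        for h1, h2, l in product(DOMAIN, repeat=3)
        for o1 in prog(h1, l)
        for o2 in prog(h2, l)
    )
```

Under these assumptions, `masked` satisfies the forall-exists check although it fails the 2-safety check, while `leaky` fails both, because its only output depends on h.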

Verification Using a Program Logic. For finite-state hardware systems, many automated verification methods for hyperliveness properties (e.g., in the form of FEHTs) have been proposed [13,14,15, 20, 22, 33, 38]. In contrast, for infinite-state software, the verification of FEHTs is notoriously difficult; FEHTs mix quantification of different types, so we cannot employ purely over-approximate reasoning principles (as is possible for k-safety). Most existing approaches for software verification, therefore, require substantial user interaction, e.g., in the form of a custom Horn-clause template [57], a user-provided abstraction [12], or a deductive proof strategy [5, 26]. See Section 6 for more discussion.

In this paper, we put forward an automatic algorithm for the verification of FEHTs. Our method is rooted in a novel program logic, which we call Forall-Exist Hoare Logic (FEHL) (in Section 3). Similar to many program logics for k-safety properties [19, 56], our logic focuses on one of the programs involved in the verification at any given time (by, e.g., symbolically executing one step in one of the programs) and thus lends itself to automation. We show that FEHL is sound and complete (relative to a complete proof system for over- and under-approximate unary Hoare triples).

Automated Verification. Our verification algorithm – presented in Section 4 – then leverages FEHL for the analysis of FEHTs. During this analysis, the key algorithmic challenge is to find suitable instantiations for nondeterministic choices made in existentially quantified executions. Our algorithm avoids a direct instantiation and instead treats the outcome of the nondeterministic choice symbolically, allowing an instantiation at a later point in time. Formally, we define the concept of a parametric assertion. Instead of capturing a set of states, a parametric assertion defines a function that maps concrete values for a set of parameters (in our case, the nondeterministic choices in existentially quantified programs whose concrete instantiations we have postponed) to sets of states. Our algorithm then recursively computes a parametric postcondition and delegates the search for appropriate instantiations of the parameters to an SMT solver. Crucially, our algorithm only explores a restricted class of program alignments (as guided by FEHL). Therefore, the resulting constraints are ordinary (first-order) SMT formulas, which can be handled using off-the-shelf SMT solvers.

Implementation and Experiments. We implement our algorithm in a tool called ForEx and compare it with existing approaches for the verification of \(\forall ^*\exists ^*\) properties (in Section 5). As ForEx can resort to highly optimized off-the-shelf SMT solvers, it outperforms existing approaches (which often rely on custom solving strategies) in many benchmarks.

2 Preliminaries

Programs. Let \(\mathcal {V}\) be a set of program variables. We consider a simple (integer-valued) programming language generated by the following grammar.

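\(\mathbb {P}, \mathbb {Q} := \texttt {skip} \mid x := e \mid \texttt {assume}(b) \mid \texttt {if}(b, \mathbb {P}, \mathbb {Q}) \mid \texttt {while}(b, \mathbb {P}) \mid \mathbb {P} ; \mathbb {Q} \mid x := \star \)

where \(x \in \mathcal {V}\) is a variable, e is a (deterministic) arithmetic expression over variables in \(\mathcal {V}\), and b is a (deterministic) boolean expression. \(\texttt {skip}\) denotes the program that does nothing; \(x := e\) assigns x the result of evaluating e; \(\texttt {assume}(b)\) assumes that b holds, i.e., does not continue execution from states that do not satisfy b; \(\texttt {if}(b, \mathbb {P}, \mathbb {Q})\) executes \(\mathbb {P}\) if b holds and otherwise executes \(\mathbb {Q}\); \(\texttt {while}(b, \mathbb {P})\) executes \(\mathbb {P}\) as long as b holds; \(\mathbb {P} ; \mathbb {Q}\) executes \(\mathbb {P}\) followed by \(\mathbb {Q}\); and \(x := \star \) assigns x some nondeterministically chosen integer. For an arithmetic expression e, we write \( Vars (e) \subseteq \mathcal {V}\) for the set of all variables used in the expression.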

We endow our language with a standard operational semantics operating on states \(\sigma : \mathcal {V}\rightarrow \mathbb {Z}\). Given a program \(\mathbb {P}\), we write \(\llbracket \mathbb {P} \rrbracket (\sigma , \sigma ')\) whenever \(\mathbb {P}\) – when executed from state \(\sigma \) – can terminate in state \(\sigma '\). Our semantics is defined as expected, and we give a full definition in [10].

Given program states \(\sigma _1 : \mathcal {V}\rightarrow \mathbb {Z}\) and \(\sigma _2 : \mathcal {V}' \rightarrow \mathbb {Z}\) with \(\mathcal {V}\cap \mathcal {V}' = \emptyset \), we write \(\sigma _1 \oplus \sigma _2 : (\mathcal {V}\cup \mathcal {V}') \rightarrow \mathbb {Z}\) for the combined state, that behaves as \(\sigma _1\) on \(\mathcal {V}\) and as \(\sigma _2\) on \(\mathcal {V}'\). For \(i \in \mathbb {N}\), we define \(\mathcal {V}_i := \{x_i \mid x \in \mathcal {V}\}\) as a set of indexed program variables.
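As a small illustration of the combined state \(\sigma _1 \oplus \sigma _2\) and of the indexed variables \(\mathcal {V}_i\), states can be modelled as dictionaries from variable names to integers. This is only a sketch; the function names `rename` and `combine` and the string encoding `x_i` are assumptions chosen for this example.

```python
def rename(state, i):
    """Turn a state over V into a state over V_i by renaming each x to x_i."""
    return {f"{x}_{i}": value for x, value in state.items()}


def combine(*states):
    """Compute sigma_1 (+) ... (+) sigma_n for states with pairwise disjoint domains."""
    combined = {}
    for state in states:
        overlap = combined.keys() & state.keys()
        if overlap:
            # The combined state is only defined for disjoint variable sets.
            raise ValueError(f"domains overlap on {sorted(overlap)}")
        combined.update(state)
    return combined
```

For instance, `combine(rename({"x": 1}, 1), rename({"x": 5}, 2))` yields a single relational state that assigns 1 to `x_1` and 5 to `x_2`.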

Assertions. An assertion \(\varPhi \) is a first-order formula over variables in \(\mathcal {V}\) (or in the relational setting over \(\bigcup _{i=1}^k \mathcal {V}_i\) for some k). Given a state \(\sigma \), we write \(\sigma \models \varPhi \) if \(\sigma \) satisfies \(\varPhi \). We assume that assertions stem from an arbitrarily expressive background theory such that every set of states can be expressed as a formula. This allows us to sidestep the issue of expressiveness in the sense of Cook [23] (see, e.g., [50, 56, 60] for similar treatments).

Hyperliveness Specifications. Our verification algorithm targets specifications that combine universal and existential quantification, similar to RHLE triples [26] and refinement quadruples [5]:

Definition 1

A Forall-Exist Hoare Tuple (FEHT) has the form

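\(\boldsymbol{\langle } \varPhi \boldsymbol{\rangle } \mathbb {P}_1 \circledast \cdots \circledast \mathbb {P}_k \sim \mathbb {P}_{k+1} \circledast \cdots \circledast \mathbb {P}_{k+l} \boldsymbol{\langle } \varPsi \boldsymbol{\rangle }\)

where \(\varPhi , \varPsi \) are assertions over \(\bigcup _{i=1}^{k+l} \mathcal {V}_i\), and \(\mathbb {P}_1, \ldots , \mathbb {P}_{k+l}\) are programs over variables \(\mathcal {V}_1, \ldots , \mathcal {V}_{k+l}\), respectively. The FEHT is valid if for all states \(\sigma _1, \ldots , \sigma _{k+l}\) (with domains \(\mathcal {V}_1, \ldots , \mathcal {V}_{k+l}\), respectively) and \(\sigma '_1, \ldots , \sigma '_{k}\) such that \(\bigoplus _{i=1}^{k+l} \sigma _i \models \varPhi \) and \(\llbracket \mathbb {P}_i \rrbracket (\sigma _i, \sigma '_i)\) for all \(i \in [1,k]\), there exist states \(\sigma '_{k+1}, \ldots , \sigma '_{k+l}\) such that \(\llbracket \mathbb {P}_i \rrbracket (\sigma _i, \sigma '_i)\) for all \(i \in [k+1,k+l]\) and \(\bigoplus _{i=1}^{k+l} \sigma '_i \models \varPsi \).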

That is, we quantify universally over initial states for all \(k+l\) programs (under the assumption that they, together, satisfy \(\varPhi \)) and also universally over executions of \(\mathbb {P}_1, \ldots , \mathbb {P}_k\). Afterward, we quantify existentially over executions of \(\mathbb {P}_{k+1}, \ldots , \mathbb {P}_{k+l}\) and require that the final states of all \(k+l\) executions, together, satisfy the postcondition \(\varPsi \). A relational property usually refers to \(k+l\) executions of the same program \(\mathbb {P}\) (operating on variables in \(\mathcal {V}\)); we can model this by using \(\alpha \)-renamed copies \(\mathbb {P}_{\langle 1 \rangle }, \ldots , \mathbb {P}_{\langle k+l \rangle }\) where each \(\mathbb {P}_{\langle i \rangle }\) is obtained from \(\mathbb {P}\) by replacing each variable \(x \in \mathcal {V}\) with \(x_i \in \mathcal {V}_i\). FEHTs capture a range of important properties, including, e.g., non-inference [46], opacity [61], GNI [45], refinement [59], software doping [16], and robustness [18]. It is easy to see that FEHTs can also express (purely universal) k-safety properties over programs \(\mathbb {P}_1, \ldots , \mathbb {P}_k\) as \(\boldsymbol{\langle } \varPhi \boldsymbol{\rangle } \mathbb {P}_1 \circledast \cdots \circledast \mathbb {P}_k \sim \epsilon \boldsymbol{\langle } \varPsi \boldsymbol{\rangle }\), where \(\epsilon \) denotes the empty sequence of programs.

3 Forall-Exist Hoare Logic

The verification steps of our constraint-based algorithm (presented in Section 4) are guided by the proof rules of a novel program logic operating on FEHTs, which we call Forall-Exist Hoare Logic (FEHL).

Fig. 2. Selection of core proof rules of FEHL

3.1 Core Rules

We depict a selection of core rules in Figure 2; a full overview can be found in [10]. We write \(\overline{\chi _\forall } \) (resp. \(\overline{\chi _\exists } \)) to abbreviate a list of programs that are universally (resp. existentially) quantified. Rule (\(\forall \)-Reorder) allows for the reordering of universally quantified programs; (\(\forall \)-Skip-I) rewrites a program \(\mathbb {P}\) into \(\mathbb {P} ; \texttt {skip}\); (\(\forall \)-Skip-E) removes a single \(\texttt {skip}\)-instruction; and (Done) derives a FEHT with an empty program sequence. Using \(\texttt {skip}\)-insertions and reordering (and the analogous rules for existentially quantified programs), we can always bring a program into the form \(\mathbb {P}_1 ; \mathbb {P}_2\), targeted by the remaining rules. Rule (\(\forall \)-If) embeds the branching condition of a conditional into the preconditions of both branches. Rules (\(\forall \)-Step) and (\(\exists \)-Step) allow us to resort to unary reasoning over parts of the program. These rules make the multiplicity of techniques developed for unary reasoning (e.g., symbolic execution [40] and predicate transformers [27]) applicable to the verification of hyperproperties in the form of FEHTs. For universally quantified programs of the form \(\mathbb {P}_1 ; \mathbb {P}_2\), (\(\forall \)-Step) requires an auxiliary assertion \(\varPhi '\) that should hold after all executions of \(\mathbb {P}_1\) from \(\varPhi \). We can express this using the standard (non-relational) Hoare triple (HT) \({\{} \varPhi {\}} \mathbb {P}_1 {\{} \varPhi ' {\}}\) [37]. The second premise then ensures that the remaining FEHT (after \(\mathbb {P}_1\) has been executed) holds. For existentially quantified programs, we, instead, employ an underapproximation. In (\(\exists \)-Step), we, again, execute \(\mathbb {P}_1\) but use an Under-Approximate Hoare triple (UHT) \( \boldsymbol{[} \,\varPhi \, \boldsymbol{]} \mathbb {P}_1 \boldsymbol{[} \,\varPhi '\, \boldsymbol{]} \). 
The UHT \( \boldsymbol{[} \,\varPhi \, \boldsymbol{]} \mathbb {P}_1 \boldsymbol{[} \,\varPhi '\, \boldsymbol{]} \) holds if for all states \(\sigma \) with \(\sigma \models \varPhi \), there exists a state \(\sigma '\) such that \(\llbracket \mathbb {P}_1 \rrbracket (\sigma , \sigma ')\) and \(\sigma ' \models \varPhi '\).
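The difference between the two unary triples can be seen on a finite state space. The sketch below is an illustration under assumptions: the state is the value of a single variable x, the finite range `STATES` stands in for the integers, and the program `step` is a hypothetical nondeterministic program, not one from this paper.

```python
STATES = range(-3, 4)  # assumed finite state space: the value of one variable x


def step(x):
    """Hypothetical program: nondeterministically keep x or negate it."""
    return {x, -x}


def ht_holds(pre, prog, post):
    """{pre} prog {post}: every final state reachable from pre satisfies post."""
    return all(post(t) for s in STATES if pre(s) for t in prog(s))


def uht_holds(pre, prog, post):
    """[pre] prog [post]: every state in pre has at least one final state in post."""
    return all(any(post(t) for t in prog(s)) for s in STATES if pre(s))
```

With precondition x ≠ 0 and postcondition x > 0, the UHT holds (every nonzero x can reach a positive value), whereas the HT does not (the execution from x = 1 may end in -1).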

Remark 1

UHTs behave similarly to Incorrectness Triples (ITs) [50, 58] in that they reason about the existence of a particular set of executions. The key difference is that ITs reason backward (all states in \(\varPhi '\) are reachable from some state in \(\varPhi \)), whereas UHTs reason in a forward direction (all states in \(\varPhi \) can reach \(\varPhi '\)). See, e.g., Lisbon Triples [47, §5] and Outcome Triples [62] for related approaches. We will later show that FEHL is complete when equipped with some complete proof system for UHTs (cf. Theorem 2). In [10], we show that there exists at least one complete proof system for UHTs.    \(\triangle \)

For \(\texttt {assume}(b)\) statements, (\(\forall \)-Assume) strengthens the precondition by the assumed expression b; any state that does not satisfy b causes a (universally quantified) execution to halt and renders the FEHT vacuously valid. In contrast, (\(\exists \)-Assume) assumes that all states in \(\varPhi \) satisfy b; if any state in \(\varPhi \) does not satisfy b, the FEHT is invalid. Likewise, the handling of a nondeterministic assignment \(x := \star \) differs based on whether we consider a universally quantified or existentially quantified program. In the former case, (\(\forall \)-Choice) removes all knowledge about the value of x within the precondition by quantifying x existentially (thus enlarging the precondition). In the latter (existentially quantified) case, we can, in a forward-style execution, choose any concrete value for x. (\(\exists \)-Choice) formalizes this intuition: we first invalidate all knowledge about x and then assert that \(x = e\) for some arbitrary expression e that does not depend on x. In our automated analysis (cf. Section 4), we use (\(\exists \)-Choice), but – instead of fixing some concrete value (or expression) at application time – we postpone the concrete instantiation by treating the value symbolically.

3.2 Asynchronous Loop Reasoning

A particular challenge when reasoning about relational properties is the alignment of loops. In FEHL, we propose a novel counting-based loop rule that supports asynchronous alignments while still admitting good automation. Consider the rule (Loop-Counting) (in Figure 3), which assumes \(k \ge 1\) universally and l existentially quantified loops. The rule requires a loop invariant \(\mathbb {I}\) that (1) is implied by the precondition (\(\varPhi \Rightarrow \mathbb {I}\)), (2) ensures simultaneous termination of all loops (\(\mathbb {I} \Rightarrow \bigwedge _{i=2}^{k+l} (b_1 \leftrightarrow b_i)\)), and (3) is strong enough to establish the postcondition for the program suffixes \(\mathbb {Q}_1, \ldots , \mathbb {Q}_{k+l}\) executed after the loops.

Fig. 3. Counting-based loop rule for FEHL

The key difference from a simple synchronous traversal is that, in each “iteration”, we execute the bodies of the loops for possibly different numbers of times. Concretely, (Loop-Counting) asks for natural numbers \(c_1, \ldots , c_{k+l}\) (ranging between 1 and some arbitrary upper bound B), and – starting from the invariant \(\mathbb {I}\) – we execute each \(\mathbb {P}_i\) \(c_i\) times. Crucially, we need to make sure that each \(\mathbb {P}_i\) will execute at least \(c_i\) times, i.e., the guard \(b_i\) holds after each of the first \(c_i-1\) executions. In particular, we cannot naïvely analyze \(c_i\) copies of \(\mathbb {P}_i\) composed via \(;\) as this might introduce additional executions of \(\mathbb {P}_i\) that would not happen in \(\texttt {while}(b_i, \mathbb {P}_i)\). To ensure this, (Loop-Counting) demands \(B+1\) intermediate assertions \(\mathbb {I}_1, \ldots , \mathbb {I}_{B+1}\). In the jth iteration (for \(1 \le j \le B\)), we (symbolically) execute – from \(\mathbb {I}_j\) – all loop bodies \(\mathbb {P}_i\) that we want to execute at least j times (i.e., all loop bodies \(\mathbb {P}_i\) where \(c_i \ge j\)). We require that (1) the postcondition \(\mathbb {I}_{j+1}\) is derivable, and (2) the guards of all loops that we want to execute more than j times (i.e., loops where \(c_i > j\)) evaluate to true.

Fig. 4. In Figure 4a, we depict two example programs. In Figures 4b and 4c, we give two intermediate FEHT verification obligations (cf. Example 1).

Example 1

Consider the two example programs \(\mathbb {P}_1, \mathbb {P}_2\) in Figure 4a and the FEHT \(\boldsymbol{\langle } x_1=x_2 \boldsymbol{\rangle } \mathbb {P}_1 \sim \mathbb {P}_2 \boldsymbol{\langle } x_1=x_2 \boldsymbol{\rangle }\). To see that this FEHT is valid, we can, in each loop iteration, always choose \(z_2 = 1\). In this case, \(\mathbb {P}_1\) quadruples the value of \(x_1\) for \(x_1\) times and \(\mathbb {P}_2\) doubles the value of \(x_2\) for \(2 x_2\) times, which, assuming \(x_1 = x_2\), computes the same result (\(x_1 = x_2 \rightarrow 4^{x_1} x_1 = 2^{2x_2}x_2\)). Verifying this example automatically is challenging as both loops are executed a different number of times, so we cannot align the loops in lockstep. Likewise, computing independent (unary) summaries of both loops requires complex non-linear reasoning. Instead, (Loop-Counting) enables an asynchronous alignment: After applying (\(\forall \)-Step) and (\(\exists \)-Step), we are left with precondition \(x_1 = x_2 \wedge y_2 = 2y_1\). We use (Loop-Counting) and align the loops such that every loop iteration in \(\mathbb {P}_1\) is matched by two iterations in \(\mathbb {P}_2\), which allows us to use a simple (linear) invariant. We set \(c_1 := 1, c_2 := 2\) and define \(\mathbb {I} := x_1 = x_2 \wedge y_2 = 2y_1\), \(\mathbb {I}_1 := \mathbb {I}_3 := \mathbb {I}\), and \(\mathbb {I}_2 := x_1 = 2x_2 \wedge y_2 = 2y_1 + 1\). Note that \(\mathbb {I}\) implies the desired postcondition (\(x_1 = x_2\)). To establish that \(\mathbb {I}\) serves as an invariant, we need to discharge the two proof obligations depicted in Figures 4b and 4c. The obligation in Figure 4b (corresponding to iteration \(j=1\)) establishes that (1) \(\mathbb {I}_2\) is a provable postcondition after executing both loop bodies from \(\mathbb {I}_1\) and (2) that the loop in \(\mathbb {P}_2\) will execute at least one more time, i.e., \(y_2 > 0\). 
We can easily discharge this FEHT using (\(\forall \)-Step), (\(\exists \)-Step), and (\(\exists \)-Choice) by choosing \(z_2\) to be 1 (note that if \(y_2 = 2y_1\) and \(y_2 > 0\), then \(y_2 - 1 > 0\)). The obligation in Figure 4c corresponds to iteration \(j = 2\), where we only execute the body of \(\mathbb {P}_2\). We can, again, easily discharge this FEHT using (\(\exists \)-Step) and (\(\exists \)-Choice) (again, choosing \(z_2\) to be 1).    \(\triangle \)
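The alignment from Example 1 can be replayed concretely. The Python sketch below is a simulation, not a proof, and rests on assumptions: the loop bodies of Figure 4a are taken to be "x1 := 4·x1; y1 := y1 − 1" and "z2 := ⋆; x2 := 2·z2·x2; y2 := y2 − 1" with guards y > 0, initialised by y1 := x1 and y2 := 2·x2. These shapes are inferred from the description in the example and may differ from the actual figure. The function names are likewise hypothetical.

```python
def inv(x1, y1, x2, y2):
    """I = I_1 = I_3 from Example 1."""
    return x1 == x2 and y2 == 2 * y1


def inv_mid(x1, y1, x2, y2):
    """I_2 from Example 1."""
    return x1 == 2 * x2 and y2 == 2 * y1 + 1


def run_aligned(x):
    """Replay the alignment c1 = 1, c2 = 2 and assert the intermediate assertions."""
    x1, y1 = x, x          # assumed prefix of P1: y1 := x1
    x2, y2 = x, 2 * x      # assumed prefix of P2: y2 := 2 * x2
    assert inv(x1, y1, x2, y2)
    while y1 > 0:
        assert y2 > 0      # simultaneous termination: both guards agree under I
        # Iteration j = 1: one body of each loop (c1 >= 1 and c2 >= 1).
        x1, y1 = 4 * x1, y1 - 1
        z2 = 1             # instantiation of the existential choice
        x2, y2 = 2 * z2 * x2, y2 - 1
        assert inv_mid(x1, y1, x2, y2)
        assert y2 > 0      # P2 must execute once more because c2 > 1
        # Iteration j = 2: only the body of P2 (c2 >= 2).
        z2 = 1
        x2, y2 = 2 * z2 * x2, y2 - 1
        assert inv(x1, y1, x2, y2)
    assert y2 == 0         # both loops terminate together
    return x1, x2
```

For every non-negative start value the two final values coincide, matching the postcondition \(x_1 = x_2\) of the FEHT under the assumed loop bodies.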

3.3 Soundness and Completeness

We can show that our proof system is sound and complete:

Theorem 1 (Soundness)

Assume that \(\vdash {\{} \cdot {\}} \cdot {\{} \cdot {\}}\) and \(\vdash \boldsymbol{[} \,\cdot \, \boldsymbol{]} \cdot \boldsymbol{[} \,\cdot \, \boldsymbol{]} \) are sound proof systems for HTs and UHTs, respectively. If \(\vdash \boldsymbol{\langle } \varPhi \boldsymbol{\rangle } \overline{\chi _\forall } \sim \overline{\chi _\exists } \boldsymbol{\langle } \varPsi \boldsymbol{\rangle }\) then \(\boldsymbol{\langle } \varPhi \boldsymbol{\rangle } \overline{\chi _\forall } \sim \overline{\chi _\exists } \boldsymbol{\langle } \varPsi \boldsymbol{\rangle }\) is valid.

Theorem 2 (Completeness)

Assume that \(\vdash {\{} \cdot {\}} \cdot {\{} \cdot {\}}\) and \(\vdash \boldsymbol{[} \,\cdot \, \boldsymbol{]} \cdot \boldsymbol{[} \,\cdot \, \boldsymbol{]} \) are complete proof systems for HTs and UHTs, respectively. If \(\boldsymbol{\langle } \varPhi \boldsymbol{\rangle } \overline{\chi _\forall } \sim \overline{\chi _\exists } \boldsymbol{\langle } \varPsi \boldsymbol{\rangle }\) is valid then \(\vdash \boldsymbol{\langle } \varPhi \boldsymbol{\rangle } \overline{\chi _\forall } \sim \overline{\chi _\exists } \boldsymbol{\langle } \varPsi \boldsymbol{\rangle }\).

Completeness follows easily by making extensive use of unary reasoning via (U)HTs, similar to the completeness proof of relational Hoare logic for k-safety properties [49]. In fact, (\(\forall \)-Step), (\(\exists \)-Step), and (Done), along with the reordering rules (\(\forall \)-Reorder), (\(\forall \)-Skip-I), and (\(\forall \)-Skip-E) (and their analogous counterparts for existentially quantified programs), already suffice for completeness (see [10]). In the following, we leverage the soundness of FEHL’s rules to guide our automated verification.

4 Automated Verification of Hyperliveness

Our automated verification algorithm for FEHTs follows a strongest postcondition computation, as is widely used in the verification of non-relational properties [1, 36, 51] and k-safety properties [19, 56]. However, due to the inherent presence of existential quantification in FEHTs, the strongest postcondition does, in general, not exist. For example, both \(\boldsymbol{\langle } \top \boldsymbol{\rangle } \epsilon \sim x := \star \boldsymbol{\langle } x = 1 \boldsymbol{\rangle }\) and \(\boldsymbol{\langle } \top \boldsymbol{\rangle } \epsilon \sim x := \star \boldsymbol{\langle } x = 2 \boldsymbol{\rangle }\) are valid but \(\boldsymbol{\langle } \top \boldsymbol{\rangle } \epsilon \sim x := \star \boldsymbol{\langle } x = 1 \wedge x = 2 \boldsymbol{\rangle }\) is clearly not. Instead, our algorithm uses the proof rules of FEHL and treats the concrete value for nondeterministic choices in existentially quantified executions symbolically. That is, we view the outcome as a fresh variable (called a parameter) that can be instantiated later. This idea of instantiating nondeterminism at a later point in time has already found successful application in many areas, such as existential variables in Coq or symbolic execution [40]. Our analysis brings these techniques to the realm of hyperproperty verification, which we show to yield an effective automated verification algorithm. In the following, we formally introduce parametric assertions and postconditions (in Section 4.1) and show how we can compute them using the rules of FEHL (in Sections 4.2 and 4.3).

4.1 Parametric Assertions and Postconditions

We assume that \(\mathfrak {P}= \{\mu _1, \ldots , \mu _n\}\) is a set of parameters. In FEHTs, we use assertions (formulas) over \(\bigcup _{i=1}^{k+l} \mathcal {V}_i\), which we interpret as sets of (relational) states. A parametric assertion generalizes this by viewing an assertion as a function mapping into sets of (relational) states. Formally, a parametric assertion is a pair \((\varXi , \mathcal {C})\) where \(\varXi \) is a formula over \(\bigcup _{i=1}^{k+l} \mathcal {V}_i \cup \mathfrak {P}\) (called the function-formula), and \(\mathcal {C}\) is a formula over \(\mathfrak {P}\) (called the restriction-formula).

Given a function-formula \(\varXi \) (over \(\bigcup _{i=1}^{k+l} \mathcal {V}_i \cup \mathfrak {P}\)) and a parameter evaluation \(\kappa : \mathfrak {P}\rightarrow \mathbb {Z}\), we define \(\varXi [\kappa ]\) as the formula over \(\bigcup _{i=1}^{k+l} \mathcal {V}_i\) where we fix concrete values for all parameters based on \(\kappa \). We can thus view \(\varXi \) as a function mapping each parameter evaluation \(\kappa \) to the set of states encoded by \(\varXi [\kappa ]\). During our (forward-style) analysis, we will use parameters to postpone nondeterministic choices in existentially quantified programs. Intuitively, for every parameter evaluation \(\kappa \) (i.e., any retrospective choice of the nondeterministic outcome), \(\varXi [\kappa ]\) should describe the reachable states (i.e., strongest postcondition) under those specific outcomes. However, not all concrete values for the parameters are valid in the sense that they correspond to nondeterministic outcomes that result in actual executions. To mitigate this, a parametric assertion \((\varXi , \mathcal {C})\) includes a restriction-formula \(\mathcal {C}\) (over \(\mathfrak {P}\)) which restricts the domain of the function encoded by \(\varXi \), i.e., we only consider those parameter evaluations that satisfy \(\mathcal {C}\).

Example 2

Before proceeding with a formal development, let us discuss parametric assertions informally using an example. Let \(\mathbb {P}_1 := \big (x := \star ; \texttt {assume}(x \ge 9)\big )\) and \(\mathbb {P}_2 := \big (y := \star ; \texttt {assume}(y \ge 2)\big )\) and assume we want to prove the FEHT \(\boldsymbol{\langle } \top \boldsymbol{\rangle } \mathbb {P}_1 \sim \mathbb {P}_2 \boldsymbol{\langle } x = y \boldsymbol{\rangle }\). To verify this tuple in a principled way, we are interested in potential postconditions \(\varPsi \), i.e., assertions \(\varPsi \) such that \(\boldsymbol{\langle } \top \boldsymbol{\rangle } \mathbb {P}_1 \sim \mathbb {P}_2 \boldsymbol{\langle } \varPsi \boldsymbol{\rangle }\) is valid. For example, both \(\varPsi _1 = x \ge 9 \wedge y = 2\) and \(\varPsi _2 = x \ge 9 \wedge y = 3\) are valid postconditions, but – as already seen before – there does not exist a strongest assertion. Instead, we capture multiple postconditions using the parametric assertion \((\varXi , \mathcal {C})\) where \(\varXi := x \ge 9 \wedge y = \mu \) and \(\mathcal {C}:= \mu \ge 2\) for some fresh parameter \(\mu \in \mathfrak {P}\); we say \((\varXi , \mathcal {C})\) is a parametric postcondition for \((\top , \mathbb {P}_1, \mathbb {P}_2)\) (cf. Definition 2). Intuitively, we have used the parameter \(\mu \) instead of assigning some fixed integer to y. For every concrete parameter evaluation \(\kappa : \{\mu \} \rightarrow \mathbb {Z}\) such that \(\kappa \models \mathcal {C}\), formula \(\varXi [\kappa ]\) defines the reachable states when using \(\kappa (\mu )\) for the choice of y. Observe how formula \(\mathcal {C}= \mu \ge 2\) restricts the possible set of parameter values, i.e., we may only choose a value for y such that \(\texttt {assume}(y \ge 2)\) holds.    \(\triangle \)

Definition 2

A parametric postcondition for \((\varPhi , \mathbb {P}_1, \ldots , \mathbb {P}_{k+l})\) is a parametric assertion \((\varXi , \mathcal {C})\) with the following conditions. For all states \(\sigma _1, \ldots , \sigma _{k+l}\), and \(\sigma '_1, \ldots , \sigma '_{k}\) such that \(\bigoplus _{i=1}^{k+l} \sigma _i \models \varPhi \) and \(\llbracket \mathbb {P}_i \rrbracket (\sigma _i, \sigma '_i)\) for all \(i \in [1,k]\), and any parameter evaluation \(\kappa \) such that \(\kappa \models \mathcal {C}\), the following holds: (1) There exist states \(\sigma '_{k+1}, \ldots , \sigma '_{k+l}\) such that \(\bigoplus _{i=1}^{k+l} \sigma '_i \models \varXi [\kappa ]\), and (2) For every \(\sigma '_{k+1}, \ldots , \sigma '_{k+l}\) such that \(\bigoplus _{i=1}^{k+l} \sigma '_i \models \varXi [\kappa ]\) we have \(\llbracket \mathbb {P}_i \rrbracket (\sigma _i, \sigma '_i)\) for all \(i \in [k+1, k+l]\).

Condition (1) captures that no parameter evaluation may restrict universally quantified executions, i.e., if we fix any parameter evaluation \(\kappa \) and reachable final states for the universally quantified programs, \(\varXi [\kappa ]\) remains satisfiable. This effectively states that \(\varXi [\kappa ]\) over-approximates the set of executions of universally quantified programs. Condition (2) requires that all executions of existentially quantified programs allowed under a particular parameter evaluation are also valid executions, i.e., for any fixed parameter evaluation \(\kappa \), \(\varXi [\kappa ]\) under-approximates the set of executions of the existentially quantified programs.

We can use parametric postconditions to prove FEHTs:

Theorem 3

Let \((\varXi , \mathcal {C})\) be a parametric postcondition for \((\varPhi , \mathbb {P}_1, \ldots , \mathbb {P}_{k+l})\). If

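\(\forall _{x \in \mathcal {V}_1 \cup \cdots \cup \mathcal {V}_k} x \mathpunct {.}\exists _{\mu \in \mathfrak {P}} \mu \mathpunct {.}\mathcal {C} \wedge \forall _{x \in \mathcal {V}_{k+1} \cup \cdots \cup \mathcal {V}_{k+l}} x \mathpunct {.}\big (\varXi \Rightarrow \varPsi \big )\)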

holds, then the FEHT is valid.

Here, we universally quantify over final states in \(\mathbb {P}_1, \ldots , \mathbb {P}_k\) and existentially quantify over parameter evaluations that satisfy \(\mathcal {C}\) (recall that \(\mathcal {C}\) only refers to \(\mathfrak {P}\)). The choice of the parameters can thus depend on the final states of universally quantified programs (as in the semantics of FEHTs). Afterward, we quantify (again universally) over final states of \(\mathbb {P}_{k+1}, \ldots , \mathbb {P}_{k+l}\) and state that if \(\varXi \) holds, so does the postcondition \(\varPsi \).

Example 3

Consider the FEHT and parametric postcondition from Example 2. Following Theorem 3, we construct the SMT formula \(\forall x\mathpunct {.}\exists \mu \mathpunct {.}\mu \ge 2 \wedge \forall y\mathpunct {.}\big ((x \ge 9 \wedge y = \mu ) \Rightarrow x = y\big )\). This formula holds; the FEHT is valid.    \(\triangle \)
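The formula from Example 3 can be sanity-checked by exhaustive enumeration over a bounded range. The sketch below is only an illustration: the finite range `BOUND` stands in for the integers, the function names are hypothetical, and the tool described in this paper discharges such formulas with an SMT solver rather than by enumeration. The second function swaps the quantifier order to show why the parameter must be allowed to depend on the final state of the universally quantified program.

```python
BOUND = range(-12, 13)  # assumed finite stand-in for the integers


def xi(x, y, mu):
    """Function-formula from Example 2: x >= 9 and y = mu."""
    return x >= 9 and y == mu


def restriction(mu):
    """Restriction-formula from Example 2: mu >= 2."""
    return mu >= 2


def psi(x, y):
    """Postcondition of the FEHT: x = y."""
    return x == y


def theorem3_formula():
    """forall x. exists mu. C and forall y. (Xi => Psi)."""
    return all(
        any(
            restriction(mu)
            and all((not xi(x, y, mu)) or psi(x, y) for y in BOUND)
            for mu in BOUND
        )
        for x in BOUND
    )


def fixed_choice_formula():
    """exists mu. forall x, y. ...: the parameter is fixed before x is known."""
    return any(
        restriction(mu)
        and all((not xi(x, y, mu)) or psi(x, y) for x in BOUND for y in BOUND)
        for mu in BOUND
    )
```

On this bounded domain the first check succeeds (choose μ = x whenever x ≥ 9), while the second fails, since no single value of μ matches every x ≥ 9.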

Note that \((\varXi , \bot )\) is always a parametric postcondition: no parameter evaluation satisfies \(\bot \), so the conditions in Definition 2 are vacuously satisfied. However, \((\varXi , \bot )\) is useless when it comes to proving FEHTs via Theorem 3.

4.2 Generating Parametric Postconditions

Algorithm 1 computes a parametric postcondition based on the proof rules of FEHL from Section 3. As input, Algorithm 1 expects a formula \(\varPhi \) over \(\bigcup _{i=1}^{k+l}\mathcal {V}_i \cup \mathfrak {P}\) – think of \(\varPhi \) as a precondition already containing some parameters – and two program lists \(\overline{\chi _\forall } \) and \(\overline{\chi _\exists } \). It outputs a parametric postcondition.

figure am

Remark 2

For intuition, it is oftentimes helpful to consider \(\varPhi \) as a parameter-free formula over \(\bigcup _{i=1}^{k+l}\mathcal {V}_i\). In this case, most of our steps correspond to the computation of the strongest postcondition [19, 27, 56] in a purely universal (k-safety) setting.    \(\triangle \)

Our algorithm analyses the structure of each program and applies the insights from FEHL: If \(\overline{\chi _\forall } \) and \(\overline{\chi _\exists } \) are empty, we return \((\varPhi , \top )\) (line 3), i.e., we do not place any restrictions on the parameters. In case all programs are loops (line 5), we invoke a subroutine (discussed in Section 4.3). Otherwise, some program has a non-loop statement at the top level, allowing further symbolic analysis. We consider possible steps in \(\overline{\chi _\forall } \) (lines 7-33) and in \(\overline{\chi _\exists } \) (lines 35-56).

We first consider the case where a universally quantified program has a non-loop statement at its top level (lines 7-33). In lines 9, 11, 13, and 15, we bring the first program into the form where by potentially inserting statements in line 15. For a program (line 17), we use (\(\forall \)-Step) to handle the assignment. Here, we can compute the strongest postcondition of the assignment as \(\exists x'. \varPhi [x'/x] \wedge x = e[x'/x]\) (using Floyd’s forward running rule [35]). For conditionals (line 20), we analyze both branches under the strengthened precondition. As our analysis operates on parametric assertions, some of the parameters found in the precondition \(\varPhi \) can be restricted in both branches. After we have computed a parametric postcondition for each branch, we therefore combine them into a parametric postcondition for the entire program by constructing the disjunction of the function-formulas \(\varXi _1\) and \(\varXi _2\) (describing the set of states reachable in either of the branches), and conjoining the restriction-formulas \(\mathcal {C}_1\) and \(\mathcal {C}_2\). For assume statements (line 27), we strengthen the precondition. For nondeterministic assignments (line 29), we invalidate all knowledge about x. If a program matches none of the previous cases (line 33), it must be of the form , and we move it to the end of \(\overline{\chi _\forall } \), continuing the analysis of the renaming programs in the next recursive iteration. If no universally quantified program can be analyzed further, we continue the investigation with existentially quantified ones (lines 35-56). Many cases are analogous to the treatment in universally quantified programs (lines 37-43), but some cases are handled fundamentally differently: If we encounter an assume statement (line 45), we need to certify that b holds in all states in \(\varPhi \) (cf. (\(\exists \)-Assume)). 
As we already hinted in Example 2, we accomplish this by restricting the viable set of parameters in \(\varPhi \), i.e., we restrict the domain of the function-formula \(\varPhi \). Concretely, we consider the formula \(\mathcal {C}_{\mathit{assume}} := \forall \mathcal {V}_1 \cup \cdots \cup \mathcal {V}_{k+l}\mathpunct {.}\varPhi \Rightarrow b\) (which is a formula over \(\mathfrak {P}\)) that characterizes exactly those parameters that ensure that all states in \(\varPhi \) satisfy b. After analyzing the remaining programs, we then conjoin \(\mathcal {C}_{\mathit{assume}}\) with the remaining restrictions.
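
As a small worked instance of the forward rule for assignments used above, consider a precondition and an assignment chosen here purely for illustration (they do not appear in the algorithm listing). Assuming integer-valued variables, the strongest postcondition simplifies as follows:

\[
\varPhi = (x \ge 0), \qquad x := x + 1, \qquad \exists x'.\, (x' \ge 0) \wedge x = x' + 1 \;\equiv\; x \ge 1 .
\]

The existentially quantified \(x'\) plays the role of the old value of x; eliminating it yields a formula over the current program variables only.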

Remark 3

As in Remark 2, we can consider the case where \(\varPhi \) contains no parameter. In this case, \(\mathcal {C}_{\mathit{assume}}\) is a variable-free formula that is equivalent to \(\top \) iff all states in \(\varPhi \) satisfy b. If \(\varPhi \) does not imply b (so \(\mathcal {C}_{\mathit{assume}} \equiv \bot \)), the resulting parametric postcondition thus cannot prove any FEHT via Theorem 3.   \(\triangle \)

For nondeterministic assignments (line 51), we create a fresh parameter \(\mu \) and continue the analysis under the precondition that \(x = \mu \), effectively postponing the choice of a concrete value for x (cf. Example 2).

Example 4

Our algorithm will automatically compute the parametric postcondition from Example 2. In particular, for the assume statement, we match line 45 with \(\varPhi = x \ge 9 \wedge y = \mu \) for \(\mu \in \mathfrak {P}\) and compute \(\mathcal {C}_{\mathit{assume}} := \forall x, y\mathpunct {.}\varPhi \Rightarrow y \ge 2\), which is logically equivalent to \(\mu \ge 2\).    \(\triangle \)
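
The restriction computed in Example 4 can be reproduced by brute force on a small finite domain. The following Python sketch is only an illustration under stated assumptions: the range `DOMAIN` is a hypothetical finite stand-in for the integers, and the function names are ours, not part of ForEx, which discharges such queries with an SMT solver instead of enumeration.

```python
from itertools import product

# Assumption: a small finite range stands in for the integers.
DOMAIN = range(0, 16)

def phi(x, y, mu):
    """Precondition from Example 4: x >= 9 and y = mu."""
    return x >= 9 and y == mu

def guard(x, y):
    """Guard b of the assume statement: y >= 2."""
    return y >= 2

def c_assume(mu):
    """True iff every state (x, y) described by phi under mu satisfies the guard."""
    return all(guard(x, y)
               for x, y in product(DOMAIN, DOMAIN)
               if phi(x, y, mu))

# Parameter values that survive the restriction.
admissible = [mu for mu in DOMAIN if c_assume(mu)]
```

On this domain, the admissible parameter values are exactly those with \(\mu \ge 2\), matching the simplified restriction-formula from the example.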

4.3 Generating Parametric Postconditions for Loops

We sketch the postcondition generation for loops in Algorithm 2. As input, the algorithm expects a precondition \(\varPhi \) over \(\bigcup _{i=1}^{k+l}\mathcal {V}_i \cup \mathfrak {P}\) and universally and existentially quantified loop programs. In the first step, we guess a loop invariant \(\mathbb {I}\) and counter values \(c_1, \ldots , c_{k+l} \in [1, B]\) (cf. (Loop-Counting)). In lines 4 and 5, we ensure that \(\mathbb {I}\) is initial and guarantees simultaneous termination by computing restrictions \(\mathcal {C}_{\mathit{init}}\) and \(\mathcal {C}_{\mathit{sim}}\) on the parameters present in \(\varPhi \) (similar to assume statements in line 45 of Algorithm 1). Again, in the special case where \(\varPhi \) contains no parameter (as is, e.g., the case when applying our algorithm to k-safety properties), \(\mathcal {C}_{\mathit{init}}\) (resp. \(\mathcal {C}_{\mathit{sim}}\)) is equivalent to \(\top \) iff the invariant is initial (resp. guarantees simultaneous termination). Afterward, we check the validity of the guessed counter values \(c_1, \ldots , c_{k+l}\). For each j from 1 to B, we compute a parametric postcondition \((\varXi _{j+1}, \mathcal {C}_{j+1})\) for the bodies of all loops that should be executed at least j times (i.e., \(c_i \ge j\)) starting from precondition \(\varXi _j\) via a (mutually recursive) call to Algorithm 1 (line 8). To obtain a valid derivation using (Loop-Counting), we need to ensure that – in \(\varXi _{j+1}\) – the guard of all loops that we want to execute more than j times still evaluates to true. We ensure this by computing the restriction-formula \(\mathcal {C}_{j+1}^{\mathit{cont}}\), which restricts the parameters (both those already present in the precondition \(\varPhi \) and those added during the analysis of the loop bodies) such that all states in \(\varXi _{j+1}\) fulfill the guards of all loops with \(c_i > j\) (line 9).
After we have symbolically executed all loops the desired number of times, we construct a parameter restriction \(\mathcal {C}_{\mathit{ind}}\) that ensures that we end within the invariant, i.e., \(\varXi _{B+1} \Rightarrow \mathbb {I}\) (line 10). In the last step, we compute a parametric postcondition \((\varXi _{\mathit{rem}}, \mathcal {C}_{\mathit{rem}})\) for the program suffix executed after the loops. We return the parametric postcondition that consists of the function-formula \(\varXi _{\mathit{rem}}\) and the conjunction of all restriction-formulas.
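
The checks on the invariant and the counter values can be illustrated on a parameter-free instance. The Python sketch below is a simplification under stated assumptions: the two loops (one incrementing by 1, one by 2), the finite `DOMAIN`, the counters `COUNTS`, and all function names are hypothetical and chosen for illustration, and explicit enumeration replaces the symbolic postcondition computation of the algorithm.

```python
from itertools import product

# Assumption: states are triples (x1, x2, n) over a small finite range.
DOMAIN = range(0, 12)
COUNTS = (2, 1)      # guessed counters (c1, c2): loop 1 runs twice per loop 2 step
B = max(COUNTS)

def guards(s):
    """Guards of the two hypothetical loops: x1 < n and x2 < n."""
    x1, x2, n = s
    return (x1 < n, x2 < n)

def body(i, s):
    """Loop 0 increments x1 by 1; loop 1 increments x2 by 2."""
    x1, x2, n = s
    return (x1 + 1, x2, n) if i == 0 else (x1, x2 + 2, n)

def invariant(s):
    x1, x2, n = s
    return x1 == x2 and x1 % 2 == 0 and n % 2 == 0

def check(inv):
    """Check simultaneous termination, guard continuation, and inductiveness."""
    for s in product(DOMAIN, repeat=3):
        if not inv(s):
            continue
        g = guards(s)
        if len(set(g)) != 1:
            return False             # guards disagree: no simultaneous termination
        if not g[0]:
            continue                 # all loops have terminated
        for j in range(1, B + 1):
            for i, c in enumerate(COUNTS):
                if c >= j:
                    s = body(i, s)   # execute loops that run at least j times
            if any(c > j and not guards(s)[i] for i, c in enumerate(COUNTS)):
                return False         # a loop that must continue has a false guard
        if not inv(s):
            return False             # we do not end within the invariant
    return True
```

With these assumptions, `check(invariant)` succeeds, whereas the weaker candidate `x1 == x2` alone fails the continuation check, because the first loop's guard may already be false after a single iteration.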

4.4 The Main Verification

From the soundness of FEHL (Theorem 1) we directly get:

Proposition 1

Algorithm 1 computes some parametric postcondition for \((\varPhi , \overline{\chi _\forall } , \overline{\chi _\exists } )\).

Given an FEHT \(\boldsymbol{\langle } \varPhi \boldsymbol{\rangle } \overline{\chi _\forall } \sim \overline{\chi _\exists } \boldsymbol{\langle } \varPsi \boldsymbol{\rangle }\), we can thus invoke Algorithm 1 to compute a parametric postcondition, which (if strong enough) allows us to prove that \(\boldsymbol{\langle } \varPhi \boldsymbol{\rangle } \overline{\chi _\forall } \sim \overline{\chi _\exists } \boldsymbol{\langle } \varPsi \boldsymbol{\rangle }\) is valid via Theorem 3. If the postcondition is too weak, we can re-run the algorithm using updated invariant guesses (cf. Section 5). For loop-free programs, it is easy to see that the algorithm computes the “strongest possible” parametric postcondition (it effectively executes the programs symbolically without incurring the imprecision introduced by loop invariants). In this case, the query from Theorem 3 holds if and only if the FEHT is valid; our algorithm thus constitutes a complete verification method.

Invalid FEHTs. We stress that the goal of our algorithm is the verification of FEHTs and not proving that an FEHT is invalid. For k-safety properties, a refutation (counterexample) consists of a k-tuple of concrete executions that violate the property [19, 56]. In contrast, refuting an FEHT corresponds to proving an \(\exists ^*\forall ^*\) property, an orthogonal problem that requires independent proof ideas.

5 Implementation and Experiments

We have implemented our verification algorithm in a tool called ForEx [9] (short for Forall Exists Verification), supporting programs in a minimalistic C-like language that features basic control structures (cf. Section 2), arrays, and bitvectors. ForEx uses Z3 [48] to discharge SMT queries and supports the theory of linear integer arithmetic, the theory of arrays, and the theory of finite bitvectors. Compared to the presentation in Section 4, we check satisfiability of restriction-formulas eagerly: For example, in Algorithm 2, we compute multiple restriction-formulas and return their conjunction. In ForEx, we immediately check these intermediate restrictions for satisfiability; if any restriction is unsatisfiable on its own, any conjunction involving it will be as well, so we can abort the analysis early and re-start parts of the analysis using, e.g., updated invariants and counter values.

5.1 Loop Invariant Generation

Our loop invariant generation and counter value inference follow a standard guess-and-check procedure [19, 34, 53, 54, 56], i.e., we generate promising candidates by combining expressions found in the programs and equalities between variables in the loop guards. In most loops, there exist “anchor” variables that effectively couple executions of multiple loops together [19, 56], even in asynchronous cases like Example 1. Exploring more advanced invariant generation techniques is interesting future work. However, even in the simpler setting of k-safety properties, many tools currently rely on a guess-and-check approach [19, 56]. We maintain a lattice of possible candidates ordered by implication, which allows for efficient pruning. For example, if the current candidate is not initial (i.e., \(\mathcal {C}_{\mathit{init}}\) computed in line 4 of Algorithm 2 is unsatisfiable), we do not need to consider stronger candidates. Likewise, if the candidate does not ensure simultaneous termination (i.e., \(\mathcal {C}_{\mathit{sim}}\) is unsatisfiable), we can prune all weaker invariants.
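
The pruning idea can be sketched as follows. This Python fragment is a hypothetical illustration, not the data structure used in ForEx: it assumes that each candidate is a conjunction of atoms represented as a `frozenset` (so a superset of atoms is a stronger candidate), and that the three checks are supplied as callables by the caller.

```python
def guess_and_check(candidates, is_initial, ensures_sim, is_inductive):
    """Return (invariant or None, number of candidates actually checked).

    Assumption: candidates are frozensets of atoms, read as conjunctions.
    """
    not_initial = []   # candidates that failed the initiality check
    not_sim = []       # candidates that failed simultaneous termination
    checked = 0
    for cand in candidates:
        # cand contains all atoms of a non-initial candidate, so it is at
        # least as strong and cannot be initial either.
        if any(bad <= cand for bad in not_initial):
            continue
        # cand uses a subset of the atoms of a candidate that failed
        # simultaneous termination, so it is weaker and fails as well.
        if any(cand <= bad for bad in not_sim):
            continue
        checked += 1
        if not is_initial(cand):
            not_initial.append(cand)
        elif not ensures_sim(cand):
            not_sim.append(cand)
        elif is_inductive(cand):
            return cand, checked
    return None, checked
```

The two `continue` branches correspond to the two pruning directions described above; every skipped candidate saves the solver queries that the three checks would otherwise require.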

5.2 Experiments

We evaluate ForEx in various settings where FEHT-like specifications arise. We compare with HyPA (a predicate-abstraction-based solver) [12], PCSat (a constraint-based solver that relies on predicate templates) [57], and HyPro (a model-checker for \(\forall ^*\exists ^*\) properties in finite-state systems) [11]. Our results were obtained on an M1 Pro CPU with 32 GB of memory.

Fig. 5.

In Tables 5a and 5b, we compare ForEx with HyPA [12] on k-safety and \(\forall ^*\exists ^*\) properties, respectively. For instances marked with \(\dagger \), ForEx required additional user-provided invariant hints. In Table 5c, we compare ForEx with PCSat [57]. For instances marked with \(\ddagger \), PCSat required additional invariant hints. In Figure 5d, we compare the running time of ForEx and HyPro [11]. We check each of the 4 GNI instances from [11] with varying bitwidth. The timeout is set to 3 min (marked by the horizontal dotted line).

Before we evaluate ForEx on \(\forall ^*\exists ^*\) properties, we investigate the counting-based loop alignment principle underlying ForEx. We collect the k-safety benchmarks from HyPA [12] (which themselves were collected from multiple sources [31, 32, 55, 57]) and depict the verification results in Table 5a. We observe that ForEx can verify many of these instances. As it explores a restricted class of loop alignments (guided by (Loop-Counting)), it is more efficient on the instances it can solve. However, for some of the instances, ForEx’s counting-based alignment is insufficient. Instead, these instances require a loop alignment that is context-dependent, i.e., the alignment is chosen based on the current state of the programs [12, 32, 55, 57].

HyPA [12] explores a liberal program alignment by exploring a user-provided predicate abstraction. The verification instances considered in [12] include a range of \(\forall ^*\exists ^*\) properties on very small programs, including, e.g., GNI and refinement properties. In Table 5b, we compare the running time of ForEx with that of HyPA (using the user-defined predicates for its abstraction). We observe that ForEx can verify the instances significantly faster. Moreover, we stress that ForEx solves a much more challenging problem as it analyzes the program fully automatically without any user intervention.

  Unno et al. [57] present an extension of constraint Horn clauses, called pfwCSP, that is able to express a range of relational properties (including \(\forall ^*\exists ^*\) properties). Their custom pfwCSP solver (called PCSat) instantiates predicates with user-provided templates. We compare PCSat and ForEx in Table 5c. ForEx can verify 6 out of the 8 \(\forall ^*\exists ^*\) instances. ForEx currently does not support termination proofs for loops in existentially quantified programs (which are needed for TI_GNI_hTF and TS_GNI_hTF), whereas PCSat features loop variant templates and can thus reason about the termination of existentially quantified loops in isolation. In the instances that ForEx can solve, it is much faster. We conjecture that this is due to the fact that the constraints generated by ForEx can be solved directly by SMT solvers, whereas PCSat’s pfwCSP constraints first require a custom template instantiation.

  Programs whose variables have a finite domain (e.g., boolean) can be checked using explicit-state techniques developed for logics such as HyperLTL [20]. We verify GNI on variants of the four boolean programs from [11] with a varying number of bits. We compare ForEx with the HyperLTL verifier HyPro [11], which converts a program into an explicit-state transition system. We depict the results in Figure 5d. We observe that, with increasing bitwidth, the running time of explicit-state model-checking increases exponentially (note that the scale is logarithmic). In contrast, ForEx can employ symbolic bitvector reasoning, resulting in orders of magnitude faster verification.
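
To give a feeling for why explicit-state checking scales poorly with the bitwidth, the following Python sketch checks GNI by plain enumeration. It is an illustration only: the two toy programs `masked` and `leaky` and all function names are hypothetical (they are not the benchmarks from [11]), and the enumeration is not how HyPro or ForEx operate internally.

```python
def masked(h, l, choice, bits):
    """Hypothetical program: the secret h is masked by a nondeterministic choice."""
    mask = (1 << bits) - 1
    return (l + (h ^ choice)) & mask

def leaky(h, l, choice, bits):
    """Hypothetical program that copies the secret to the output."""
    return h

def satisfies_gni(program, bits):
    """For every low input l, every secret h must allow every observable output."""
    vals = range(1 << bits)
    for l in vals:
        observable = {program(h, l, c, bits) for h in vals for c in vals}
        for h in vals:
            if {program(h, l, c, bits) for c in vals} != observable:
                return False
    return True

def explored_runs(bits):
    """Number of (h, l, choice) combinations the enumeration ranges over."""
    return (1 << bits) ** 3
```

Each additional bit multiplies the number of enumerated runs by eight in this sketch, which is the kind of exponential growth that symbolic bitvector reasoning avoids.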

6 Related Work

Most methods for k-safety verification are centered around the self-composition of a program [6] and often improve upon a naïve self-composition by, e.g., exploiting the commutativity of statements [29, 31, 32, 55]. Relational program logics for k-safety offer a rich set of rules to over-approximate the program behavior [3, 7, 8, 28, 49, 56, 60]. Recently, much effort has been made to employ under-approximate methods that find bugs instead of proving their absence; so far, mostly for unary (non-hyper) properties [17, 24, 42, 47, 50, 52, 58, 62].

Dardinier et al. [25] propose Hyper Hoare Logic – a logic that can express arbitrary hyperproperties, but requires manual deductive reasoning. Dickerson et al. [26] introduce RHLE, a program logic for the verification of \(\forall ^*\exists ^*\) properties, focusing on the composition (and under-approximation) of function calls. They present a weakest-precondition-based verification algorithm that aligns loops in lock-step via user-provided loop invariants. Unno et al. [57] present an extension of constraint Horn clauses (called pfwCSP). They show that pfwCSP can encode many relational verification conditions, including many hyperliveness properties like GNI (see Section 5). Compared to the pfwCSP encoding, we explore a less liberal program alignment (guided by (Loop-Counting)). However, we gain the important advantage of generating standard (first-order) SMT constraints that can be handled using existing SMT solvers (which yields a significant performance improvement, cf. Section 5).

Most work on the verification of hyperliveness has focused on more general temporal properties, i.e., properties that reason about infinite executions, based on logics such as HyperLTL [13, 20, 33]. Coenen et al. [22] study a method for verifying hyperliveness in finite-state transition systems using strategies to resolve existential quantification. This approach is also applicable to infinite-state systems by means of an abstraction [12, 39] (see HyPA in Section 5). Bounded model-checking (BMC) for hyperproperties [38] unrolls the system to a fixed bound and can, e.g., find violations of GNI. Existing BMC tools target finite-state (boolean) systems and construct QBF formulas; lifting this to support infinite-state systems by constructing SMT constraints is interesting future work and could, e.g., complement ForEx in the refutation of FEHTs.

7 Conclusion

We have studied the automated program verification of relational \(\forall ^*\exists ^*\) properties. We developed a constraint-based verification algorithm that is rooted in a sound-and-complete program logic and uses a (parametric) postcondition computation. Our experiments show that – while our logic-guided tool explores a restricted class of possible loop alignments – it succeeds in many of the instances we tested. Moreover, the use of off-the-shelf SMT solvers results in faster verification, paving the way toward a future of fully automated tools that can check important hyperliveness properties such as GNI and opacity.