1 Introduction

While monolithic relational domains such as the polyhedra abstract domain [8] are indispensable for intricate verification tasks, they are considered prohibitively expensive. Therefore, weakly relational domains have been proposed, which can express only simple relational properties but scale better to larger programs. Examples of such domains for capturing numerical properties are the Two Variables Per Inequality domain [27] and domains given by a finite set of linear templates [25]. The most prominent example of a template numerical domain is the Octagon domain [20, 21], which tracks upper and lower bounds not only of program variables but also of sums and differences of two program variables. One such octagon abstract relation could, e.g., be given by the conjunction

$$ (-x\le -5)\wedge (x\le 10)\wedge (x+y\le 0)\wedge (x-z\le 1) $$

Octagons can thus be considered a mild extension of the non-relational domain of Intervals for program variables. Efficient comparison of octagon abstract relations for inclusion is enabled by canonical representations in which all implied bounds are made explicit. Such representations are called closed. In the given example, the upper bounds

$$ (y\le -5)\wedge (-z\le -4) $$

are implied and are therefore included in the closed representation.
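As a sanity check, the two implied bounds can be confirmed by brute force. The following minimal sketch is not part of any closure algorithm; it enumerates the integer points of a bounded box (the box \([-20,20]^3\) and the helper name `satisfies` are assumptions made here for illustration) and reports the largest values of y and \(-z\) among the points satisfying the four constraints.

```python
from itertools import product

def satisfies(x, y, z):
    """The four octagon constraints of the introductory example."""
    return -x <= -5 and x <= 10 and x + y <= 0 and x - z <= 1

# Hypothetical finite search box; the octagon itself is unbounded in y and z.
BOX = range(-20, 21)
points = [(x, y, z) for x, y, z in product(BOX, repeat=3) if satisfies(x, y, z)]

max_y = max(y for _, y, _ in points)        # tightest upper bound on y
max_neg_z = max(-z for _, _, z in points)   # tightest upper bound on -z
print(max_y, max_neg_z)  # prints: -5 -4
```

Both maxima are attained at x = 5, which is why the lower bound on x drives the two implied constraints.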

Procedures for computing closures of octagons over rationals or integers have been given by Miné [20]; an improved closure algorithm for integers was later provided by Bagnara et al. [1, 2]. Further practical improvements are discussed in [4]. All these algorithms introduce auxiliary variables for negated program variables \(-z\) in order to represent each octagon as a difference bound matrix (DBM), and then apply dedicated techniques for DBMs [19], namely, the Floyd-Warshall algorithm [6]. The auxiliary variables, however, must additionally be taken care of by the algorithm, which blurs the simplicity of the idea and complicates the correctness argument.

Here, we take another approach. To provide efficient procedures for the Octagon domain with simple proofs, we identify two generic properties of relational domains which are sufficient for an abstract version of the Floyd-Warshall algorithm to provide normal forms. Normalization takes calculations on abstract relations between 1, 2, and 3 variables as black boxes and uses these to infer abstract 1 or 2-variable relations mediated by other variables. Our normalization algorithm can be instantiated for rational octagons as well as integer octagons or other instances of the class of weakly relational domains satisfying our criteria.

The first criterion is 2-decomposability as introduced in [26] which requires that each abstract relation can be uniquely reconstructed from its projections onto sub-clusters of variables of size at most 2. The second criterion is called 2-projectivity. This property means that each variable x can be eliminated from an abstract relation by considering projections onto at most 2-variable clusters. If both criteria are satisfied, our algorithm returns the normal form. The key correctness argument can be provided on two pages. Our abstract setting also provides an elegant algorithm for incremental normalization, i.e., for re-establishing the normal form after improving the relationship between two variables. In practice, such improvements may occur as the abstract effect of guards in the program which are expressible as abstract relations. For the Octagon domain over rationals or integers, we provide improved abstract transformers for affine assignments based on linear programming.

2 Relational Domains

Let us recall basic definitions for relational domains. We mostly follow the notation used in [26] where the notion of 2-decomposability has been introduced. Let \(\mathcal{X}\) be some finite set of variables. A relational domain \(\mathcal{R}\) is a lattice with least element \(\bot \) and greatest element \(\top \) which provides the monotonic operations

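[Figure a: the operations \(\llbracket x\leftarrow e\rrbracket ^\sharp \) (assignment), \({\left. r\right| _{Y}}\) (restriction to \(Y\subseteq \mathcal{X}\)), and \(\llbracket ?c\rrbracket ^\sharp \) (guard)]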

for some languages e of expressions and c of conditions, respectively.

The given operations are meant to provide the abstract transformers for the basic operations of programs. Restricting a relation r to a subset Y of variables amounts to forgetting all information about variables in \(\mathcal{X}{\setminus } Y\). Thus, we require that

$$\begin{aligned} \begin{array}{lll} {\left. r\right| _{\mathcal{X}}} &{}=&{} r \\ {\left. r\right| _{\emptyset }} &{}=&{} \top \\ {\left. r\right| _{Y_1}} &{}\sqsupseteq &{} {\left. r\right| _{Y_2}}\qquad \text {when}\; Y_1 \subseteq Y_2 \\ {\left. ({\left. r\right| _{Y_1}})\right| _{Y_2}} &{}=&{} {\left. r\right| _{Y_1 \cap Y_2}} \end{array}\end{aligned}$$
(1)

Restriction therefore is idempotent. For guards with condition c, we require that

$$\begin{aligned} \llbracket ?c\rrbracket ^\sharp r = r\sqcap \llbracket ?c\rrbracket ^\sharp ({\left. r\right| _{V}}) \end{aligned}$$
(2)

where V is the set of variables occurring inside c.

For a numerical relational domain, we additionally require for \(Y\subseteq \mathcal{X}\) that

$$\begin{aligned} {\left. (\llbracket x\leftarrow e\rrbracket ^\sharp r)\right| _{Y}} = & {} {\left. r\right| _{Y}} \qquad (x\not \in Y) \end{aligned}$$
(3)
$$\begin{aligned} {\left. (\llbracket x\leftarrow e\rrbracket ^\sharp r)\right| _{Y}} = & {} {\left. (\llbracket x\leftarrow e\rrbracket ^\sharp ({\left. r\right| _{Y \cup V}}))\right| _{Y}} \qquad (x\in Y) \end{aligned}$$
(4)

where V is the set of variables occurring in e. Intuitively, this means that an assignment to the variable x does not affect relational information for any set Y of variables with \(x\not \in Y\). To determine the effect for a set Y of variables containing x, it suffices to additionally take the variables into account which occur in the right-hand side e. This property may, e.g., be violated if the relational domain also represents points-to information so that updates to x may also affect relational information for sets of variables not containing x.

Example 1

For numerical variables, a variety of such relational domains have been proposed, e.g., (conjunctions of) affine equalities [16, 22, 23] or affine inequalities [8]. For affine equalities or inequalities, projection onto a subset Y of variables corresponds to the geometric projection onto the sub-space defined by Y, combined with arbitrary values for variables \(z\not \in Y\). The abstract effect of a guard c on a given conjunction r can be realized as \(r\wedge c = r\wedge (c\wedge {\left. r\right| _{V}})\) if c is a linear equality or inequality, respectively, using variables from V. The abstract effect of an assignment \(x \leftarrow e\) with affine right-hand side e, finally, can be reduced to the addition of new constraints and projection onto sub-spaces. Relational domains may also be constructed for non-numerical values, e.g., by maintaining finite subsets of value maps.    \(\square \)

3 Weakly Relational Domains

One way to tackle the high cost of relational domains is to track relationships not between all variables, but only between subclusters of variables. We call such domains Weakly Relational Domains.

For a subset \(Y\subseteq \mathcal{X}\), let \(\mathcal{R}^{Y} = \{r \mid r\in \mathcal{R}, {\left. r\right| _{Y}} = r \}\) be the set of all abstract values from \(\mathcal{R}\) that contain information only on the variables in Y. For any collection \(\mathcal{S}\subseteq 2^{\mathcal{X}}\) of clusters of variables, a relation \(r\in \mathcal{R}\) can be approximated by a meet of relations from \(\mathcal{R}^Y, Y\in \mathcal S\) since for every \(r\in \mathcal{R}\),

$$\begin{aligned} r\sqsubseteq \sqcap \{{\left. r\right| _{Y}}\mid Y\in \mathcal{S}\} \end{aligned}$$
(5)

holds. Schwarz et al. [26] introduce the notion of 2-decomposable relational domains. These are domains where the full value can be recovered from the restriction to all clusters \([\mathcal{X}]_2\) of variables of size at most 2, and all finite least upper bounds can be recovered by computing within these clusters only, i.e., where

$$\begin{aligned} r =& \sqcap \left\{ {\left. r\right| _{p}} \mid p\in [\mathcal{X}]_2\right\} \end{aligned}$$
(6)
$$\begin{aligned} {\left. \left( \bigsqcup R\right) \right| _{p}} =&\bigsqcup \left\{ {\left. r\right| _{p}}\mid r\in R\right\} \quad (p\in [\mathcal{X}]_2) \end{aligned}$$
(7)

holds for each abstract relation \(r\in \mathcal{R}\) and each finite set of abstract relations \(R\subseteq \mathcal{R}\). The most prominent example of a 2-decomposable domain is the Octagon domain [20] – either over rationals or integers, while affine equalities or affine inequalities are examples of domains that are not 2-decomposable.
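The difference between (5) and (6) can be made concrete in a toy model, chosen here for illustration and not taken from the domains discussed above: abstract relations are sets of valuations of three variables over \(\{0,1\}\), the ordering is set inclusion, and restriction forgets all variables outside Y. The parity relation below plays the role of an affine equality (modulo 2) and is strictly smaller than the meet of its restrictions, whereas the chain of inequalities is recovered exactly. All names are hypothetical.

```python
from itertools import combinations, product

VARS = ("x", "y", "z")
VALUES = (0, 1)
TOP = frozenset(product(VALUES, repeat=len(VARS)))  # all valuations

def restrict(r, ys):
    """r|_Y: keep the information on Y, allow arbitrary values elsewhere."""
    idx = [VARS.index(v) for v in ys]
    seen = {tuple(u[i] for i in idx) for u in r}
    return frozenset(v for v in TOP if tuple(v[i] for i in idx) in seen)

def meet_of_restrictions(r):
    """Right-hand side of (6): meet of r|_p over clusters p of size 1 or 2."""
    result = TOP
    for k in (1, 2):
        for p in combinations(VARS, k):
            result &= restrict(r, p)
    return result

parity = frozenset(v for v in TOP if sum(v) % 2 == 0)       # x + y + z even
chain = frozenset(v for v in TOP if v[0] <= v[1] <= v[2])   # x <= y <= z

print(meet_of_restrictions(parity) == parity)  # False: only (5) holds
print(meet_of_restrictions(chain) == chain)    # True: (6) holds for this value
```

Every 2-variable projection of the parity relation is unconstrained, so the information it carries is lost entirely when only clusters of size at most 2 are kept.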

Each value r from a 2-decomposable relational domain \(\mathcal{R}\) can be represented as the meet of its restrictions to 2-clusters, i.e., by the collection \(\left\langle {\left. r\right| _{p}}\right\rangle _{p\in [\mathcal{X}]_2}\). This representation is called 2-normal, and an algorithm to compute it is called normalization. Consider an arbitrary collection \(\langle s_p\rangle _{p\in [\mathcal{X}]_2}\) with \(s_p\in \mathcal{R}^p\) such that \(r=\sqcap \{s_p\mid p\in [\mathcal{X}]_2\}\). Then \({\left. r\right| _{p}} \sqsubseteq s_p\) always holds, while equality need not hold. In the Octagon domain over the rationals or the integers, the 2-normal representation of an octagon value corresponds to its strong closure and tight closure, respectively, as described in [1, 20]. Here, we do not distinguish between different types of closure for rational and integer octagons. Instead, we call a non-\(\bot \) octagon O over a numerical set of values \({\mathbb I}\in \{{\mathbb Q}, {\mathbb Z}\}\) closed if for each octagon combination \(\ell \), the upper bound \(b_\ell \) equals the minimal value \(b\in {\mathbb I}\) such that \(\ell \le b\) is implied by O, or \(\infty \) if no such bound exists.

While for rational octagons, closure in cubic time was already proposed by Miné [20], it is much more recent that a corresponding algorithm was provided for integer octagons [1, 2]. Here, we re-consider these results. By referring to 2-decomposable domains instead of to octagons, we succeed in providing a conceptually simple normalization algorithm with a simple correctness proof, from which cubic closure algorithms for the Octagon domains can be derived.

4 2-Projectivity

Subsequently, we assume that \(\mathcal{R}\) is an arbitrary 2-decomposable domain over some set \(\mathcal{X}\) of variables. Assume that \(r\in \mathcal{R}\) is given by \(r=\sqcap \{s_p\mid p\in [\mathcal{X}]_2, s_p\in \mathcal{R}^{p}\}\). Then, we consider the following constraint system in the unknowns \(r_p, p\in [\mathcal{X}]_2\), over \(\mathcal{R}\),

$$\begin{aligned} \begin{array}{lll} r_{\{x,y\}} &\sqsubseteq &s_{\{x,y\}} \sqcap {\left. \left( r_{\{x,z\}} \sqcap r_{\{z,y\}}\right) \right| _{\{x,y\}}} \end{array} \end{aligned}$$
(8)

for \(x,y,z \in \mathcal{X}\). All right-hand sides of the constraint system (8) are monotonic.

Proposition 1

The collection \(\langle {\left. r\right| _{p}}\rangle _{p\in [\mathcal{X}]_2}\) is a solution of constraint system (8).

Proof

Let \(x,y,z\in \mathcal{X}\). Then

$$\begin{aligned} \begin{array}{llll} {\left. r\right| _{\{x,y\}}} = & {} {\left. r\right| _{\{x,y\}}} \sqcap {\left. r\right| _{\{x,y\}}} \sqsubseteq s_{\{x,y\}} \sqcap {\left. r\right| _{\{x,y\}}} \sqsubseteq s_{\{x,y\}} \sqcap {\left. \left( {\left. r\right| _{\{x,z\}}} \sqcap {\left. r\right| _{\{z,y\}}}\right) \right| _{\{x,y\}}} \end{array} \end{aligned}$$

   \(\square \)

From Proposition 1, we conclude that the greatest solution of (8) – if it exists – is an overapproximation of the normal representation of r. In general, the Kleene fixpoint iteration for computing greatest solutions of constraint systems (8) may not terminate. Let us call a 2-decomposable relational domain \(\mathcal{R}\) 2-projective when from each abstract relation r, each single variable can be eliminated by using projections onto clusters from \([\mathcal{X}]_2\) only, i.e., when for every \(Y \subseteq \mathcal{X}\), \(z \in \mathcal{X}{\setminus } Y\), \(y_j\in Y\cup \{z\}\), \(r'\in \mathcal{R}^Y\), and \(r_{\{z,y_j\}} \in \mathcal{R}^{\{z,y_j\}}\),

$$\begin{aligned} {\left. \left( r_{\{z,y_1\}}\sqcap \ldots \sqcap r_{\{z,y_k\}} \sqcap r'\right) \right| _{Y}} = r' \sqcap \sqcap _{i,j=1}^k {\left. \left( r_{\{z,y_i\}}\sqcap r_{\{z,y_j\}}\right) \right| _{Y\cap \{y_i,y_j\}}} \end{aligned}$$
(9)

Proposition 2

The following 2-decomposable domains are 2-projective:

  1. rational octagons;
  2. integer octagons;
  3. 2-variable rational affine inequalities;
  4. 2-variable rational affine equalities.

Proof

Let us consider the claims (1) and (2) for octagons. Intuitively, their correctness follows from the correctness of Fourier-Motzkin elimination of a single variable z from a system of inequalities. In general, this holds only for rational inequalities as considered for claim (1). However, it also holds for systems of integer inequalities – given that all coefficients are integer and all non-zero coefficients of z are either 1 or \(-1\).

Let us call a linear combination \(\sum _{x\in \mathcal{X}} a_x\cdot x\) an octagon combination if at most two of the coefficients \(a_x\) are non-zero and these are then from \(\{-1,1\}\). For a subset Y of variables, let \(L_Y\) denote the set of all octagon combinations with variables from Y. An integer octagon constraint is of the form \(\ell \le b\) where \(\ell \) is a linear octagon combination and the bound b is integer or \(\infty \).

Subsequently, we represent an abstract octagon relation over Y by a closed conjunction

$$\begin{aligned} \bigwedge \nolimits _{\ell \in L_Y}\ell \le b_\ell \end{aligned}$$
(10)

of octagon constraints with variables from Y if the octagon is satisfiable, or \(\bot \) if it is not. Here, the conjunction (10) is satisfiable and closed iff

$$ \begin{array}{rlll} 0 &{}\le &{} b_\ell +b_{-\ell } &{}\text {if}\;\ell \in L_Y \\ b_\ell &{}\le &{} (b_{\ell _1}+b_{\ell _2})/c &{}\text {if}\; \ell _1\ne \ell _2\;\text {and}\;c\cdot \ell = \ell _1+\ell _2 \\ \end{array} $$

hold, where \(c\in \{1,2\}\). The factor 2 occurs if one variable x occurs in both \(\ell _1\) and \(\ell _2\) with the same sign, while another variable y occurs with opposite signs, i.e.,

$$ c\cdot \ell = (x+y) + (x-y) = 2\cdot x $$

In case of octagons over rationals, the operator “/” denotes division, whereas in case of octagons over integers, it denotes integer division, i.e., may include rounding downwards. By definition, the closed representation of an abstract octagon relation is also 2-normal.

For computing the closure for an arbitrary conjunction r of octagon constraints with one or two variables only, we may first determine the least given upper bound \(b_\ell \) for each occurring octagon linear combination \(\ell \). As a result, we obtain at most 8 octagon constraints for which satisfiability (over rationals or integers) can be decided in constant time. Provided the conjunction is satisfiable, all implied tighter upper bounds (over rationals or integers) can be inferred.

Example 2

Consider the integer octagon given by conjunction of the constraints

$$ x+y \le -2 \qquad x-y \le 5 \qquad -x+y \le 0 $$

By adding up constraints with positive and negative occurrences of the same variable, we derive that

$$ y \le -1 \qquad x \le 1 $$

must also hold, while no further bounds can be inferred. If the conjunction of octagon constraints additionally has the inequality

$$ -x-y \le 0 $$

then, by adding this to the first inequality, we derive

$$ 0 \le -2 $$

– which is false – implying that the octagon equals \(\bot \).    \(\square \)
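The derivations of Example 2 follow the closure condition \(b_\ell \le (b_{\ell _1}+b_{\ell _2})/c\) literally. A minimal sketch, assuming constraints are encoded as pairs of a coefficient map and a bound (the encoding and the name `combine` are hypothetical):

```python
from fractions import Fraction
from math import floor

def combine(c1, c2, integer=True):
    """Add two octagon constraints (coeffs, bound) and renormalize.

    coeffs maps variable names to +1 or -1. Returns None if the sum is not
    an octagon combination (mixed coefficients such as 2x + y).
    """
    (l1, b1), (l2, b2) = c1, c2
    total = {v: l1.get(v, 0) + l2.get(v, 0) for v in set(l1) | set(l2)}
    total = {v: a for v, a in total.items() if a != 0}
    bound = Fraction(b1) + Fraction(b2)
    scale = {abs(a) for a in total.values()}
    if scale == {2}:
        # c * l = l1 + l2 with c = 2: divide, rounding down over the integers
        total = {v: a // 2 for v, a in total.items()}
        bound = bound / 2
        if integer:
            bound = Fraction(floor(bound))
    elif scale - {1}:
        return None
    return total, bound

c1 = ({"x": 1, "y": 1}, -2)    # x + y <= -2
c2 = ({"x": 1, "y": -1}, 5)    # x - y <= 5
c3 = ({"x": -1, "y": 1}, 0)    # -x + y <= 0
c4 = ({"x": -1, "y": -1}, 0)   # -x - y <= 0

print(combine(c1, c3))                 # y <= -1
print(combine(c1, c2))                 # x <= 1 over the integers
print(combine(c1, c2, integer=False))  # x <= 3/2 over the rationals
print(combine(c1, c4))                 # empty combination with bound -2: 0 <= -2
```

An empty coefficient map in the result stands for the constant constraint \(0\le b\); a negative b then signals \(\bot \).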

Assume that each non-\(\bot \) value \(r_{\{y_j,z\}}\), \(y_j \in Y\cup \{z\}\), is represented as a closed conjunction of octagon constraints with variables from \(\{y_j,z\}\). Assume likewise, that \(r'\ne \bot \) is represented by a conjunction of octagon constraints with variables from Y only.

For each pair \(y_i,y_j\) of variables from \(Y\cup \{z\}\), the abstract value

$$\begin{aligned} {\left. \left( r_{\{y_i,z\}}\wedge r_{\{y_j,z\}}\right) \right| _{Y\cap \{y_i,y_j\}}} \end{aligned}$$
(11)

can be obtained by means of Fourier-Motzkin elimination of z, applied to the closed conjunctions of octagon constraints representing \(r_{\{y_i,z\}}\), and \(r_{\{y_j,z\}}\), respectively. In order to see this, we note that all occurring non-zero coefficients of z in the constraints of \(r_{\{y_i,z\}}\) as well as \(r_{\{y_j,z\}}\) are from \(\{-1,1\}\). Consider a constraint \(\ell \le b\) of the resulting conjunction. Three cases may occur.

  • \(\ell \) may contain occurrences of both variables \(y_i\) and \(y_j\) – each with coefficients in \(\{-1,1\}\).

  • \(\ell \) may contain a single occurrence of one variable, w.l.o.g., \(y_i\), whose coefficient now is in \(\{-2,-1,1,2\}\). In case the coefficient of \(y_i\) is in \(\{-2,2\}\), \(\ell \) is still equivalent to an octagon constraint for \(y_i\) only. If the constraint, e.g., is \(2\cdot y_i \le 7\), then it is equivalent to \(y_i\le 3.5\) over rationals, and to \(y_i\le 3\) over the integers.

  • \(\ell \) does not contain any occurrences of variables. In this case, it is either equivalent to true and can be abandoned, or equivalent to false – implying that (11) equals \(\bot \).

We conclude that the expression (11), when satisfiable, can be represented by a conjunction of octagon constraints using variables \(y_i\) and \(y_j\). Thus, the right-hand side of Eq. (9) for rational as well as integer octagons is equivalent to the result of Fourier-Motzkin elimination of z. This implies claims (1) and (2).

Example 3

Assume an integer octagon \(r = r'\wedge r_{\{y_1,z\}}\wedge r_{\{y_2,z\}}\) where

$$ \begin{array}{lll} r' &{}=&{} y_1+y_2\le 7 \\ r_{\{y_1,z\}} &{}=&{} (y_1+z \le -1)\wedge (y_1 \le 3)\wedge (-z \le 4) \\ r_{\{y_2,z\}} &{}=&{} (y_2-z \le 5) \wedge (-y_2 \le 1) \end{array} $$

Fourier-Motzkin elimination of z adds the constraint

$$ y_1+y_2\le 4 $$

Projection onto the subset \(Y = \{y_1,y_2\}\) according to (9) therefore results in the conjunction of constraints

$$ (y_1 + y_2\le 7)\wedge (y_1\le 3) \wedge (y_1+y_2\le 4) \wedge (-y_2 \le 1) $$

which can be further simplified to \((y_1 \le 3) \wedge (y_1+y_2 \le 4) \wedge (-y_2 \le 1)\).   \(\square \)
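The elimination step of Example 3 can be replayed mechanically. The sketch below assumes that octagon constraints are encoded as pairs of a coefficient map and a bound; `eliminate` is an illustrative name, and the function relies on the observation above that all non-zero coefficients of z are from \(\{-1,1\}\).

```python
from fractions import Fraction
from math import floor

def eliminate(z, constraints, integer=True):
    """Fourier-Motzkin elimination of z from octagon constraints.

    Each constraint is (coeffs, bound) with coeffs: variable -> +1 or -1.
    Returns a dict from each remaining octagon combination (a sorted tuple
    of (variable, coefficient) pairs) to its least derived bound, or None
    if a contradiction 0 <= negative is derived.
    """
    pos = [c for c in constraints if c[0].get(z, 0) == 1]
    neg = [c for c in constraints if c[0].get(z, 0) == -1]
    derived = [c for c in constraints if z not in c[0]]
    for l1, b1 in pos:
        for l2, b2 in neg:
            total = {v: l1.get(v, 0) + l2.get(v, 0) for v in set(l1) | set(l2)}
            derived.append(({v: a for v, a in total.items() if a != 0}, b1 + b2))
    result = {}
    for coeffs, bound in derived:
        bound = Fraction(bound)
        if not coeffs:
            if bound < 0:
                return None   # 0 <= bound is false
            continue          # trivially true, can be abandoned
        if all(abs(a) == 2 for a in coeffs.values()):
            # single variable with coefficient +2 or -2: halve, round down
            coeffs = {v: a // 2 for v, a in coeffs.items()}
            bound = bound / 2
            if integer:
                bound = Fraction(floor(bound))
        key = tuple(sorted(coeffs.items()))
        result[key] = min(bound, result.get(key, bound))
    return result

r = [
    ({"y1": 1, "y2": 1}, 7),                                   # r'
    ({"y1": 1, "z": 1}, -1), ({"y1": 1}, 3), ({"z": -1}, 4),   # r_{y1,z}
    ({"y2": 1, "z": -1}, 5), ({"y2": -1}, 1),                  # r_{y2,z}
]
projected = eliminate("z", r)
print(projected)
```

Keeping only the least bound per combination performs the final simplification of the example, in which \(y_1+y_2\le 7\) is subsumed by \(y_1+y_2\le 4\).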

Example 4

The following 2-decomposable domains are not 2-projective:

  1. Finite sets of 2-variable maps;
  2. Implications between interval constraints.    \(\square \)

Proof

For (1), let \(\mathcal{X}=\{a,x,y,z\}\) where variables range over values from the set \(\{1,2,3\}\) and maps from variables to such sets are used as the abstraction. Consider now:

$$ r_{\{a,x\}} = \{ a \mapsto \{1,2\} \} \qquad r_{\{a,y\}} = \{ a \mapsto \{2,3\} \} \qquad r_{\{a,z\}} = \{ a \mapsto \{3,1\} \} $$

where all other \(r_p\), \(p\in [\mathcal{X}]_2\) have the value \(\top \). Then,

$$ \begin{array}{lll} {\left. \left( r_{\{a,x\}} \sqcap r_{\{a,y\}} \sqcap r_{\{a,z\}} \sqcap \top \right) \right| _{\{x,y,z\}}} = \bot \end{array} $$

but, in violation of property (9),

$$ \begin{array}{lll} &{}&{} \top \sqcap {\left. (r_{\{a,x\}} \sqcap r_{\{a,x\}})\right| _{\{x\}}} \sqcap {\left. (r_{\{a,y\}} \sqcap r_{\{a,y\}})\right| _{\{y\}}} \sqcap {\left. (r_{\{a,z\}} \sqcap r_{\{a,z\}})\right| _{\{z\}}}\\ &{}&{} \sqcap {\left. (r_{\{a,x\}} \sqcap r_{\{a,y\}})\right| _{\{x,y\}}} \sqcap {\left. (r_{\{a,x\}} \sqcap r_{\{a,z\}})\right| _{\{x,z\}}} \sqcap {\left. (r_{\{a,y\}} \sqcap r_{\{a,z\}})\right| _{\{y,z\}}} \\ &{}=&{} \top \sqcap \top \sqcap \top \sqcap \top \sqcap {\left. (\{ a \mapsto \{1,2\} \} \sqcap \{ a \mapsto \{2,3\} \})\right| _{\{x,y\}}} \sqcap \\ &{}&{} {\left. (\{ a \mapsto \{1,2\} \} \sqcap \{ a \mapsto \{3,1\} \})\right| _{\{x,z\}}} \sqcap {\left. (\{ a \mapsto \{2,3\}\} \sqcap \{ a \mapsto \{3,1\}\})\right| _{\{y,z\}}} \\ &{}=&{} {\left. (\{ a \mapsto \{2\}\})\right| _{\{x,y\}}} \sqcap {\left. (\{ a \mapsto \{1\}\})\right| _{\{x,z\}}} \sqcap {\left. (\{ a \mapsto \{3\}\})\right| _{\{y,z\}}} \\ &{}=&{} \top \sqcap \top \sqcap \top = \top \end{array} $$

The domain of implications between interval constraints consists of finite conjunctions of the form

$$ x \in I \implies y \in I' $$

for variables x and y and \(I, I'\) either intervals or the empty set, ordered by implication. In particular, \(x \in \emptyset \) may be written as \({\textsf {False}}\), while \(x \in [-\infty ,\infty ]\) is denoted by \({\textsf {True}}\).

Now, consider the same set \(\mathcal{X}=\{a,x,y,z\}\) of variables as for claim (1) and let

$$ \begin{array}{lll} r_{\{a,x\}} &{}=&{} \{ {\textsf {True}}\implies a \in [1,2] \} \\ r_{\{a,y\}} &{}=&{} \{ {\textsf {True}}\implies a \in [2,3] \} \\ r_{\{a,z\}} &{}=&{} \{ a \in [2,2] \implies {\textsf {False}}\} \\ \end{array} $$

where all other \(r_p\), \(p\in [\mathcal{X}]_2\) have the value \(\top \). Then,

$$ \begin{array}{lll} {\left. \left( r_{\{a,x\}} \sqcap r_{\{a,y\}} \sqcap r_{\{a,z\}} \sqcap \top \right) \right| _{\{x,y,z\}}} = {\textsf {False}}= \bot \end{array} $$

but

$$ \begin{array}{lll} &{}&{} \top \wedge {\left. (r_{\{a,x\}} \wedge r_{\{a,x\}})\right| _{\{x\}}} \wedge {\left. (r_{\{a,y\}} \wedge r_{\{a,y\}})\right| _{\{y\}}} \wedge {\left. (r_{\{a,z\}} \wedge r_{\{a,z\}})\right| _{\{z\}}}\\ &{}&{} \wedge {\left. (r_{\{a,x\}} \wedge r_{\{a,y\}})\right| _{\{x,y\}}} \wedge {\left. (r_{\{a,x\}} \wedge r_{\{a,z\}})\right| _{\{x,z\}}} \wedge {\left. (r_{\{a,y\}} \wedge r_{\{a,z\}})\right| _{\{y,z\}}} \\ &{}=&{} \top \wedge \top \wedge \top \wedge \top \wedge {\left. (\{ {\textsf {True}}\implies a \in [1,2] \} \wedge \{ {\textsf {True}}\implies a \in [2,3] \})\right| _{\{x,y\}}} \wedge \\ &{}&{} {\left. (\{ {\textsf {True}}\implies a \in [1,2] \} \wedge \{ a \in [2,2] \implies {\textsf {False}}\})\right| _{\{x,z\}}} \wedge \\ &{}&{} {\left. (\{ {\textsf {True}}\implies a \in [2,3] \} \wedge \{ a \in [2,2] \implies {\textsf {False}}\})\right| _{\{y,z\}}} \\ &{}=&{} {\left. ({\textsf {True}}\implies a \in [2,2])\right| _{\{x,y\}}} \wedge {\left. ({\textsf {True}}\implies a \in [1,1])\right| _{\{x,z\}}} \wedge \\ &{}&{} {\left. ({\textsf {True}}\implies a \in [3,3])\right| _{\{y,z\}}}\\ &{}=&{} \top \wedge \top \wedge \top = \top \end{array} $$

which means property (9) is violated.    \(\square \)
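The first counterexample can be checked by enumeration. The sketch assumes a concrete reading of the domain in which an abstract value is the set of all valuations of \(\{a,x,y,z\}\) over \(\{1,2,3\}\) that it admits, meet is intersection, and \(\bot \) is the empty set; this modeling and all names are assumptions made for illustration.

```python
from itertools import combinations_with_replacement, product

VARS = ("a", "x", "y", "z")
VALUES = (1, 2, 3)
TOP = frozenset(product(VALUES, repeat=len(VARS)))
BOT = frozenset()

def restrict(r, ys):
    """r|_Y: keep the information on Y, allow arbitrary values elsewhere."""
    idx = [VARS.index(v) for v in ys]
    seen = {tuple(u[i] for i in idx) for u in r}
    return frozenset(v for v in TOP if tuple(v[i] for i in idx) in seen)

def a_in(allowed):
    """All valuations whose value for a lies in `allowed`."""
    return frozenset(v for v in TOP if v[0] in allowed)

# r_{a,x}, r_{a,y}, r_{a,z}, indexed by the second variable of the cluster
r = {"x": a_in({1, 2}), "y": a_in({2, 3}), "z": a_in({3, 1})}
Y = ("x", "y", "z")

# Left-hand side of (9): eliminate a from the meet of all three clusters.
lhs = restrict(r["x"] & r["y"] & r["z"] & TOP, Y)

# Right-hand side of (9): only pairwise meets are projected (r' = TOP).
rhs = TOP
for yi, yj in combinations_with_replacement(Y, 2):
    rhs &= restrict(r[yi] & r[yj], tuple(sorted({yi, yj})))

print(lhs == BOT, rhs == TOP)  # prints: True True
```

Each pair of clusters still admits some value for a, so no pairwise projection detects that the three constraints on a are jointly unsatisfiable.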

Subsequently, assume that the 2-decomposable domain \(\mathcal{R}\) is 2-projective. We show that under this assumption, the greatest solution of the constraint system (8) exists and coincides with the normal representation. Moreover, we provide an efficient algorithm for performing the normalization.

Assume that \(\mathcal{X}= \{ x_1,\ldots ,x_n \}\), and let \(X_r = \{x_1,\ldots ,x_r\}\), and \(\bar{X}_r = \mathcal{X}{\setminus } X_r\) for \(r=0,\ldots ,n\). Assume that we are given \(s_p\in \mathcal{R}^p\), \((p\in [\mathcal{X}]_2)\). For \(x,y\in \mathcal{X}\), we define the sequence

$$ \begin{array}{lllllll} s^{(0)}_{\{x,y\}} &{}=&{} s_{\{x\}}\sqcap s_{\{y\}}\sqcap s_{\{x,y\}}\\ s^{(r)}_{\{x,y\}} &{}=&{} s^{(r-1)}_{\{x,y\}}\; \sqcap {\left. \left( s^{(r-1)}_{\{x,x_r\}} \sqcap s^{(r-1)}_{\{x_r,y\}}\right) \right| _{\{x,y\}}}\qquad \text {for}\;r>0 \end{array} $$

Proposition 3

Let \(\bar{s} = \sqcap \{s_p\mid p\in [\mathcal{X}]_2\}\) be the abstract relation represented by \(\langle s_p\rangle _{p\in [\mathcal{X}]_2}\). Let \(p\in [\mathcal{X}]_2\). For \(r = 0,\ldots ,n\),

  1. \(s^{(r)}_p\sqsubseteq s^{(r)}_{\{x\}}\) for each \(x\in p\);

  2. $$\begin{aligned} {} {\left. \bar{s}\right| _{{\bar{X}_{r}\cup \{x,y\}}}} = \sqcap \left\{ s^{(r)}_p\mid p\subseteq {\bar{X}_{r}\cup \{x,y\}}, 1 \le |p|\le 2\right\} \end{aligned}$$
     (12)

Proof

For \(r=0\), the proposition holds by definition. Now assume that \(r>0\) and the assertion already holds for \(r-1\). For \(p= \{x,y\}\), we calculate

$$ \begin{array}{lll} s^{(r)}_{\{x,y\}} \,&{}=&{}\, s^{(r-1)}_{\{x,y\}}\; \sqcap {\left. \left( s^{(r-1)}_{\{x,x_r\}} \sqcap s^{(r-1)}_{\{x_r,y\}}\right) \right| _{\{x,y\}}} \,\sqsubseteq \, s^{(r-1)}_{\{x\}} \sqcap {\left. s^{(r-1)}_{\{x,x_r\}}\right| _{\{x,y\}}}\\ \,&{}\sqsubseteq &{}\, s^{(r-1)}_{\{x\}}\; \sqcap {\left. s^{(r-1)}_{\{x,x_r\}}\right| _{\{x\}}} \,=\, s^{(r)}_{\{x\}} \end{array} $$

and the first claim follows. For the second claim Eq. (12), consider the case \(x_r\not \in \{x,y\}\). Then

figure b

and the assertion holds. For the second-to-last equality, we used that the meet in the second-to-last row is non-empty, since

$$ {\left. s^{(r-1)}_{\{x_r\}}\right| _{\emptyset }} \sqsupseteq {\left. s^{(r-1)}_{\{x_r\}}\right| _{\{z_1\}}}\sqsupseteq {\left. s^{(r-1)}_{\{z_1,x_r\}}\right| _{\{z_1\}}}\sqsupseteq {\left. s^{(r-1)}_{\{z_1,x_r\}}\right| _{\{z_1,z_2\}}}\sqsupseteq {\left. \left( s^{(r-1)}_{\{z_1,x_r\}}\sqcap s^{(r-1)}_{\{x_r,z_2\}}\right) \right| _{\{z_1,z_2\}}} $$

holds for each \(z_1,z_2\in {\bar{X}_{r}\cup \{x,y\}}\). Now let \(x_r\in \{x,y\}\). Then \({\bar{X}_{r}\cup \{x,y\}} ={\bar{X}_{r-1}\cup \{x,y\}}\). W.l.o.g., let \(x=x_r\). Then \(s^{(r-1)}_{\{x,x_r\}} = s^{(r-1)}_{\{x\}}\) and \(s^{(r-1)}_{\{x_r,y\}} = s^{(r-1)}_{\{x,y\}}\). Hence by claim (1), \( s^{(r)}_{\{x,y\}} = s^{(r-1)}_{\{x,y\}} \). Accordingly,

figure c

   \(\square \)

Thus, provided \(\mathcal{R}\) fulfills Eq. (9), we obtain for \(r=n\):

$$ \begin{array}{lllll} {\left. \bar{s}\right| _{\{x,y\}}} = & {} s^{(n)}_{\{x,y\}} \sqcap s^{(n)}_{\{x\}} \sqcap s^{(n)}_{\{y\}} = & {} s^{(n)}_{\{x,y\}} \end{array} $$

Subsequently, we consider Algorithm 1. It consists of one application of the Floyd-Warshall algorithm, as is. For that to be sufficient, an initialization round is performed upfront to ensure that each value \(t_{\{x,y\}}\) not only subsumes \(s_{\{x,y\}}\), but also \(s_{\{x\}}\) and \(s_{\{y\}}\). The complexity of the proposed algorithm is \(\mathcal{O}(n^3)\) if calculations with abstract relations over at most three variables, i.e., from \(\mathcal{R}^Y\) for every \(Y\subseteq \mathcal{X}\) with \(|Y|\le 3\), can be performed in constant time. For Algorithm 1, we find:

figure d

Theorem 1

Assume that \(\langle t_p\rangle _{p\in [\mathcal{X}]_2}\) is the collection of values returned by Algorithm 1 for the collection \(\langle s_p\rangle _{p\in [\mathcal{X}]_2}\). Let \(\bar{s} = \sqcap \{s_p\mid p\in [\mathcal{X}]_2\}\) be the abstract relation represented by \(\langle s_p\rangle _{p\in [\mathcal{X}]_2}\). Then for each \(p\in [\mathcal{X}]_2\),

  1. \({\left. \bar{s}\right| _{p}} \sqsubseteq t_p\);
  2. If the 2-decomposable domain \(\mathcal{R}\) is 2-projective, then \({\left. \bar{s}\right| _{p}} = t_p\) holds. In that case, \(\langle t_p\rangle _{p\in [\mathcal{X}]_2}\) is the greatest solution of the constraint system (8).

Thus, Algorithm 1 provides a cubic-time normalization procedure whenever \(\mathcal{R}\) is 2-decomposable and 2-projective. We remark that the initializing first loop cannot be omitted. When \(\mathcal{R}\) is 2-decomposable but not 2-projective, the algorithm still computes overapproximations of normal representations.

Proof

Let \(p\in [\mathcal{X}]_2\). By Proposition 1, \({\left. \bar{s}\right| _{p}} \sqsubseteq t_p\) holds, since the right-hand sides of the constraint system (8) are all monotonic, and starting from the initial values provided in the first loop, each update to some \(t_{\{x,y\}}\) in the second loop, corresponds to one update performed by the evaluation of some right-hand side of (8). Therefore, the first assertion follows.

Now assume that the 2-decomposable relational domain \(\mathcal{R}\) additionally is 2-projective. Let \(t^{(r)}_p\) denote the value of \(t_p\) attained after the iteration of the second loop for the variable \(x_r\). By induction on r, we verify by means of Proposition 3 that for all \(p\in [\mathcal{X}]_2\), \( t^{(r)}_p\sqsubseteq s^{(r)}_p \) holds for all \(r=0,\ldots ,n\). In particular, \(t_p = t^{(n)}_p\sqsubseteq {\left. \bar{s}\right| _{p}}\), and the second assertion of the theorem follows.    \(\square \)
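Based on the description above (an initialization round followed by one Floyd-Warshall pass), the normalization can be sketched as follows. This is a reading of the textual description rather than a transcription of Algorithm 1; the function name `normalize`, the dictionary encoding of clusters, and the toy domain of finite sets of valuations used to exercise it are assumptions. The domain operations meet and restriction are passed in as black boxes.

```python
from itertools import product

def normalize(variables, s, meet, restrict):
    """Initialization round plus one Floyd-Warshall pass over clusters.

    s maps frozenset({x}) and frozenset({x, y}) to abstract values; meet and
    restrict are the domain operations, used as black boxes.
    """
    t = {}
    # Initialization: t_{x,y} absorbs s_{x} and s_{y} in addition to s_{x,y}.
    for x, y in product(variables, repeat=2):
        unary = meet(s[frozenset((x,))], s[frozenset((y,))])
        t[frozenset((x, y))] = meet(unary, s[frozenset((x, y))])
    # Floyd-Warshall pass: z mediates between x and y.
    for z in variables:
        for x, y in product(variables, repeat=2):
            p = frozenset((x, y))
            via_z = meet(t[frozenset((x, z))], t[frozenset((z, y))])
            t[p] = meet(t[p], restrict(via_z, p))
    return t

# Toy domain (assumption): sets of valuations of (x, y, z) over {1, 2, 3}.
VARS = ("x", "y", "z")
TOP = frozenset(product((1, 2, 3), repeat=len(VARS)))

def restrict(r, p):
    """r|_p: project onto the variables in p, arbitrary values elsewhere."""
    idx = [VARS.index(v) for v in p]
    seen = {tuple(u[i] for i in idx) for u in r}
    return frozenset(v for v in TOP if tuple(v[i] for i in idx) in seen)

s = {frozenset(p): TOP for p in product(VARS, repeat=2)}
s[frozenset(("x", "y"))] = frozenset(v for v in TOP if v[0] < v[1])  # x < y
s[frozenset(("y", "z"))] = frozenset(v for v in TOP if v[1] < v[2])  # y < z

t = normalize(VARS, s, frozenset.intersection, restrict)
s_bar = frozenset.intersection(*s.values())
print(all(t[p] == restrict(s_bar, p) for p in t))  # prints: True
```

For this input the result coincides with the restrictions of \(\bar{s}\). Since no 2-projectivity is claimed for the toy domain, in general only the inclusion of the first assertion of Theorem 1 should be expected of it.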

Example 5

Given a (finite) set of constants, the Pairs domain consists of false or conjunctions \(\bigwedge \{\phi _p\mid p\in [\mathcal{X}]_2\}\) where for \(p\in [\mathcal{X}]_2\), \(\phi _p\) is true or a disjunction of conjunctions of atomic propositions \(x=c\), \(x\in p\). It is ordered by logical implication. Consider, e.g., \(r= \phi _{\{x,y\}} \wedge \phi _{\{y,z\}}\) with \(\phi _{\{x,y\}} \equiv (x=a)\vee (x=b\wedge y=c)\) and \(\phi _{\{y,z\}} \equiv (y=d\wedge z=b)\). Then \({\left. r\right| _{\{x,y\}}} = (x=a \wedge y=d)\). Likewise, \({\left. r\right| _{\{y,z\}}} = (y=d\wedge z=b)\) and \({\left. r\right| _{\{x,z\}}} = (x=a\wedge z=b)\).
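The restrictions stated for r can be reproduced by enumerating models. The sketch assumes that the constants a, b, c, d are pairwise distinct and are the only values, and represents a restriction by the set of projected models; all names are illustrative.

```python
from itertools import product

CONSTS = ("a", "b", "c", "d")   # assumed pairwise distinct
VARS = ("x", "y", "z")

def phi_xy(x, y):
    return x == "a" or (x == "b" and y == "c")

def phi_yz(y, z):
    return y == "d" and z == "b"

# All valuations (x, y, z) satisfying r = phi_xy and phi_yz.
models = [v for v in product(CONSTS, repeat=3)
          if phi_xy(v[0], v[1]) and phi_yz(v[1], v[2])]

def project(ms, p):
    """Models of the restriction to cluster p, as tuples in the order of p."""
    idx = [VARS.index(v) for v in p]
    return {tuple(m[i] for i in idx) for m in ms}

print(project(models, ("x", "y")))  # {('a', 'd')}
print(project(models, ("y", "z")))  # {('d', 'b')}
print(project(models, ("x", "z")))  # {('a', 'b')}
```

The disjunct \(x=b\wedge y=c\) is ruled out by \(y=d\), which is why the restriction to \(\{x,y\}\) is stronger than \(\phi _{\{x,y\}}\) itself.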

Assume each \(r\in R\) is represented by \(r=\bigwedge \{{\left. r\right| _{p}}\mid p\in [\mathcal{X}]_2\}\), and define for \(p\in [\mathcal{X}]_2\), \(\phi _p\) as the least upper bound of the formulas \({\left. r\right| _{p}}, r\in R\). Then \(\bar{r}=\bigwedge \{\phi _p\mid p\in [\mathcal{X}]_2\}\) is an upper bound of R and, in fact, the least upper bound. For each \(p\in [\mathcal{X}]_2\), we have \({\left. \bar{r}\right| _{p}}\Rightarrow \phi _p\) by definition. By monotonicity of the restriction, on the other hand, \({\left. r\right| _{p}}\Rightarrow {\left. \bar{r}\right| _{p}}\) for all \(r\in R\). Therefore, \(\phi _p\Rightarrow {\left. \bar{r}\right| _{p}}\) as well, and property (7) follows. While being 2-decomposable, the Pairs domain is not 2-projective. Let, e.g.,

figure e

and all other \(s_{p} = \textsf {true}\). Then, Algorithm 1 computes

$$ \begin{array}{llllll} t_{\{w\}} &{}=&{} t_{\{w,x\}} = t_{\{w,y\}} = t_{\{w,z\}} = \textsf {false} \qquad &{} t_{\{y\}} &{}=&{} \textsf {true}\\ t_{\{x\}} &{}=&{} t_{\{x,y\}} = (x = \texttt {\&f1}) \vee (x = \texttt {\&f2}) &{} t_{\{y,z\}} &{}=&{} t_{\{z\}} = (z = \texttt {\&f1})\\ t_{\{x,z\}} &{}=&{} (x = \texttt {\&f1}\wedge z = \texttt {\&f1}) \vee (x = \texttt {\&f2}\wedge z= \texttt {\&f1}) \qquad &{} \end{array} $$

which is an overapproximation of the normalization given by \({\left. \bar{s}\right| _{p}}=\textsf {false}\) for \(p\in [\mathcal{X}]_2\). Here, the normalization happens to coincide with the greatest solution of constraint system (8).    \(\square \)

Example 6

According to Proposition 2, the domains of rational as well as integer octagons are 2-decomposable and 2-projective. Therefore, Algorithm 1 computes the exact 2-normal form, and thus provides us with cubic time closure algorithms for these.    \(\square \)

5 Incremental Normalization

If the condition c of a guard can be abstracted by some abstract relation \(r_c\in \mathcal{R}\), then the transfer function \(\llbracket ?c\rrbracket ^\sharp \) can be chosen as \(\llbracket ?c\rrbracket ^\sharp r = r\sqcap r_c\). Assume that the relational domain \(\mathcal{R}\) is 2-decomposable as well as 2-projective, and that \(r_c\) is represented as the meet \(r_{p_1}\sqcap \ldots \sqcap r_{p_k}\) for \(p_j\in [\mathcal{X}]_2\). Then, the normalization of \(r\sqcap r_c\) can be computed incrementally. For the octagon domain over integers, Chawdhary et al. [4] give quadratic incremental closure algorithms. Just like theirs, our algorithm for incremental normalization is based on the Floyd-Warshall algorithm, i.e., Algorithm 1.

In our setting, adding new constraints amounts to improving some clusters \(r_{\{a,b\}}\) where a and b are from some set \(V \subseteq \mathcal{X}\). For simplicity, we require that only clusters \(r_{\{a,b\}}\) with \(a \ne b\) are improved. This allows the adaptation of Algorithm 1 to skip the initialization loop. Whenever \(\mathcal{X}\) contains more than one variable, this extra requirement is no limitation, as a constraint involving only the variable z may simply be added to any 2-variable cluster p with \(z\in p\). (When \(\mathcal{X}\) contains only one variable, no normalization is required.) Normalization is then computed by the modified version of Algorithm 1 given in Algorithm 2.

figure f

Theorem 2

Assume a 2-normal collection of values of some 2-decomposable relational domain \(S= \langle s_p\rangle _{p\in [\mathcal{X}]_2}\), and a collection \(S_1=\langle s'_{p'}\rangle _{p' \subseteq V,|p'| = 2}\) with \(s'_{p'} \sqsubseteq s_{p'}\) for all \(p'\). Assume that \(\langle t_p\rangle _{p\in [\mathcal{X}]_2}\) is the collection of values returned by Algorithm 2 for the collection \(S' = \langle s_p\rangle _{p\in [\mathcal{X}]_2, (p \not \subseteq V \vee |p| \ne 2 )} \cup S_1\). Let \(\bar{s} = \sqcap S'\) be the abstract relation represented by \(S'\). Then for each \(p\in [\mathcal{X}]_2\),

  1. \({\left. \bar{s}\right| _{p}} \sqsubseteq t_p\);

  2. If the 2-decomposable domain \(\mathcal{R}\) is 2-projective, then \({\left. \bar{s}\right| _{p}} = t_p\) holds. In that case, \(\langle t_p\rangle _{p\in [\mathcal{X}]_2}\) is the greatest solution of constraint system (8).

Proof

Let \(p \in [\mathcal{X}]_2\). \({\left. \bar{s}\right| _{p}} \sqsubseteq t_p\) holds since, as observed before, all right-hand sides of the constraint system (8) are monotonic and the individual update steps of Algorithm 2 each correspond to updates performed by the evaluations of the right-hand sides of (8). Thus, the first statement follows.

Now consider the case where the relational domain is additionally 2-projective. The invariant which the non-incremental Algorithm 1 attains after the initialization holds by construction here. Let \(t^{(r)}_p\) denote the value of \(t_p\) attained after the iteration of the second loop for the r-th variable in the non-incremental Algorithm 1. We choose the order of the iteration of variables in the second loop such that the variables in V are considered last. Then, for the first \(|\mathcal{X}{\setminus } V|\) iterations \(t^{(r-1)}_p = t^{(r)}_p\), as the original collection \(\langle s_p\rangle _{p\in [\mathcal{X}]_2}\) was normalized. Therefore, it suffices to execute the last \(|V|\) iterations of the second loop of Algorithm 1 which is identical to Algorithm 2. Thus, by Theorem 1, the claim follows.    \(\square \)

We have thus shown that re-establishing normalization (and thus closure) after adding octagon constraints for m variables is in \(\mathcal {O}(m\cdot n^2)\).
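The pivot restriction behind this bound can be illustrated on a simpler weakly relational domain. The following is a minimal sketch, assuming difference constraints \(x_i - x_j \le m_{ij}\) stored in a matrix rather than the 2-variable clusters used above; the names `close_full`, `close_incremental` and `touched` are hypothetical and only serve the illustration. After tightening entries between variables in `touched`, only these variables are used as pivots, which gives \(|V|\cdot n^2\) update steps.

```python
INF = float("inf")

def close_full(m):
    """Floyd-Warshall over all pivots; m[i][j] is an upper bound on x_i - x_j."""
    n = len(m)
    m = [row[:] for row in m]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                m[i][j] = min(m[i][j], m[i][k] + m[k][j])
    return m

def close_incremental(m, touched):
    """Re-close a matrix that was closed before some entries m[a][b]
    with a, b in `touched` were tightened: only pivots from `touched` are used."""
    n = len(m)
    m = [row[:] for row in m]
    for k in touched:
        for i in range(n):
            for j in range(n):
                m[i][j] = min(m[i][j], m[i][k] + m[k][j])
    return m

# Four variables whose neighbours differ by at most 5 (illustrative data).
base = [[0 if i == j else INF for j in range(4)] for i in range(4)]
for i in range(3):
    base[i][i + 1] = base[i + 1][i] = 5
closed = close_full(base)

# Add the new constraint x_0 - x_1 <= 1, which only touches variables 0 and 1.
tightened = [row[:] for row in closed]
tightened[0][1] = 1

result = close_incremental(tightened, touched=[0, 1])
```

In this sketch, emptiness (a negative entry on the diagonal) is not checked; it would have to be tested separately after the update.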

6 Abstract Transformers for Linear Assignments

Assume that we are given an assignment a of the form \( x \leftarrow e \) where e is an expression over some subset \(V\subseteq \mathcal{X}\) of the program variables, and that the relational domain is 2-decomposable and satisfies properties (3) and (4). Let \(r\in \mathcal{R}\) denote the relational value before the assignment, and assume that r is already normalized, i.e., that \(r_p = {\left. r\right| _{p}}\) has already been computed for all \(p\in [\mathcal{X}]_2\). Let \(r' = \llbracket \textsf {a} \rrbracket ^\sharp \,r\) denote the relational value after the assignment. Then, for every \(p\in [\mathcal{X}]_2\) with \(x\not \in p\), \({\left. r'\right| _{p}} = {\left. r\right| _{p}} = r_p\). In order to compute the normalization of \(r'\), it therefore suffices to compute the values \(r'_p = {\left. r'\right| _{p}}\) for \(x\in p\), i.e., a linear number of clusters p. Now consider some variable \(y\in \mathcal{X}\). Because of property (4), we have that

$$ \begin{array}{lll} r'_{\{x,y\}} &{}=&{} {\left. \llbracket \textsf {a} \rrbracket ^\sharp r\right| _{\{x,y\}}} \\ &{}=&{} {\left. (\llbracket \textsf {a} \rrbracket ^\sharp {\left. r\right| _{V\cup \{x,y\}}})\right| _{\{x,y\}}} \\ &{}=&{} {\left. (\llbracket \textsf {a} \rrbracket ^\sharp (\sqcap \{r_p\mid p\subseteq {V\cup \{x,y\}}\}))\right| _{\{x,y\}}} \\ \end{array} $$

i.e., the abstract value \(r'_{\{x,y\}}\) requires taking into account only clusters \(p\in [\mathcal{X}]_2\) with variables from \(V\cup \{x,y\}\). We conclude:

Proposition 4

Assume that computations on abstract relations from \(\mathcal{R}\) over a bounded set of variables take constant time, and assume that the assignment \(\textsf {a} \) refers only to a bounded number of variables. Assume further that the abstract relation \(r\in \mathcal{R}\) is normalized. Then a normalization of the relation \(\llbracket \textsf {a} \rrbracket ^\sharp r\) can be computed in linear time.    \(\square \)

7 Linear Programming with Octagon Constraints

Let us turn to the implementation of best abstract transformers for assignments for the octagon domain (over rationals as well as over integers). For the octagon domain, an abstract transformer for assignments can be constructed by adding octagon constraints. This works well for right-hand sides of the form \(y+c\) or \(-y+c\) for variables y and constants c. For more general right-hand sides such as, e.g., \(3\cdot y - 2\cdot z\), the best transformer can instead be expressed by means of optimization problems [25].

Assume that the octagon is provided by bounds \(b_\ell , \ell \in L_V\) for some subset \(V\subseteq \mathcal{X}\) of variables. Depending on the sign of a variable occurring in a linear combination \(\ell \), we say it occurs positively or negatively. Consider the optimization problem of maximizing a linear objective function taking variables from V subject to the given set of octagon constraints

$$\begin{aligned} \begin{array}{ll} {\textbf {maximize}} &{} \sum \nolimits _{z\in V} a_z\cdot z \\[1ex] {\textbf {subject to}} &{} \begin{array}{llll} \ell &{}\le &{} b_\ell &{}\qquad (\ell \in L_{V}) \end{array} \end{array} \end{aligned}$$
(13)

When interpreted over the rationals, optimal solutions can be computed in time polynomial in the size of the linear program (i.e., the number of bits to spell it out) [15] or exponential time in the number of variables if simplex type algorithms are used [17]. To this general approach, we here add one more observation, namely, that over the rationals, the set of octagon constraints to be satisfied in optimization problems can be restricted to constraints where each occurring variable \(z\in V\) occurs with the same sign as the coefficient \(a_z\) of z in the objective function: this considerably reduces the number of constraints to be considered.

Proposition 5

Assume that we are given the rational octagon linear program (13) where \(a_z>0\) for all \(z\in V\). If the octagon corresponding to the constraints is closed, then the same result is obtained when the constraints are restricted to octagon linear combinations z and \(z+y\) for \(z,y\in V\) and \(z\ne y\).
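The statement can be checked on a small instance. The following is a minimal sketch, assuming a hand-picked closed octagon over two variables and a naive vertex enumeration in place of a real LP solver; the function `maximize_2d` and the concrete bounds are hypothetical and chosen only for illustration. The enumeration presupposes that the program is bounded and that the feasible region has a vertex, which holds here because both objective coefficients are positive.

```python
from fractions import Fraction
from itertools import combinations

def maximize_2d(objective, constraints):
    """Maximize c1*z1 + c2*z2 subject to a1*z1 + a2*z2 <= b for all (a1, a2, b).

    Naive vertex enumeration: intersect every pair of boundary lines and keep
    the best feasible intersection point. Assumes a bounded LP with a vertex.
    """
    c1, c2 = objective
    best = None
    for (a1, a2, b), (d1, d2, e) in combinations(constraints, 2):
        det = a1 * d2 - a2 * d1
        if det == 0:
            continue  # parallel boundary lines have no unique intersection
        z1 = Fraction(b * d2 - a2 * e, det)  # Cramer's rule
        z2 = Fraction(a1 * e - b * d1, det)
        if all(p * z1 + q * z2 <= r for p, q, r in constraints):
            value = c1 * z1 + c2 * z2
            best = value if best is None else max(best, value)
    return best

# A closed octagon over z1, z2; each triple (a1, a2, b) encodes a1*z1 + a2*z2 <= b.
octagon = [
    (1, 0, 4), (0, 1, 3), (-1, 0, 0), (0, -1, 0),
    (1, 1, 5), (1, -1, 3), (-1, 1, 2), (-1, -1, 0),
]
# Only the combinations z1, z2 and z1 + z2, i.e., those without negations.
positive_only = [c for c in octagon if c[0] >= 0 and c[1] >= 0]

full_optimum = maximize_2d((2, 1), octagon)
restricted_optimum = maximize_2d((2, 1), positive_only)
```

Both calls should yield the same optimum for the objective \(2z_1+z_2\), although the restricted program considers three instead of eight constraints.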

Proof

The proof of the proposition is obtained by means of the dual linear program:

$$\begin{aligned} \begin{array}{ll} {\textbf {minimize}} &{}\sum \nolimits _{\ell \in L_V} y_{\ell }\cdot b_{\ell } \\[1ex] {\textbf {subject to}}&{}\begin{array}{rlll} (\sum \nolimits _{z\;\text {in}\;\ell } y_\ell ) - (\sum \nolimits _{-z\;\text {in}\;\ell } y_\ell ) &{}=&{} a_{z} &{}(z\in V) \\ y_\ell &{}\ge &{} 0 &{}(\ell \in L_V) \\ \end{array} \end{array} \end{aligned}$$
(14)

If the original program is unbounded, then so is the program with the restricted set of constraints. Therefore, assume that the original linear program is bounded. Then the dual optimization problem has a feasible solution \(y_\ell ,\ell \in L_V,\) where the minimal gain b is attained, i.e., \(\sum _{\ell \in L_V} y_\ell \cdot b_\ell = b\). It remains to prove that b can be attained by a feasible solution \(y_\ell ,\ell \in L_V\), where \(y_\ell = 0\) for all octagon combinations \(\ell \) which contain negations. We proceed by induction on the number of octagon combinations \(\ell \) with negative occurrences of variables from V. Assume that there are octagon combinations \(\ell \) with negated occurrences of z and \(y_\ell > 0\). Consider the linear constraint in (14) for z

$$ \left( \sum \nolimits _{j=1}^r y_{\ell _j}\right) - \left( \sum \nolimits _{j'=1}^{r'} y_{\ell '_{j'}}\right) = a_z $$

where \(\ell _j\) enumerates all octagon combinations with positive and \(\ell '_{j'}\) enumerates all octagon combinations with negative occurrences of z. Since \(r' > 0\) and \(a_z > 0\), also \(r>0\). If \(y_{\ell _r} \ge y_{\ell '_{r'}}\), we eliminate the octagon combination \(\ell '_{r'}\) with a negative occurrence of z, and subsequently all other negative occurrences of z, by constructing a solution \(y'_\ell \) with the same gain b where \(y'_{\ell '_{r'}} = 0\). If \(\ell _{r} + \ell '_{r'} = 0\), then either no further variable is contained in \(\ell _r,\ell '_{r'}\) or the same variable \(z'\) occurs with opposite signs. Then we set \(y'_{\ell _r} = y'_{\ell '_{r'}} = 0\) and \(y'_\ell = y_\ell \) otherwise.

Now assume that \(\ell _{r} + \ell '_{r'}\) is a linear combination different from 0. Then it either is equivalent to an octagon combination not involving variable z, or \(2z'\) or \(2\cdot (-z')\) for some variable \(z'\) different from z. In order to deal with all these cases consistently, we introduce a correction factor c as 1 if the sum is an octagon linear combination, and 2 otherwise. Let q denote the octagon combination with \(c\cdot q = \ell _{r} + \ell '_{r'}\). Since the octagon r is closed, \(c\cdot b_q\le b_{\ell _r} + b_{\ell '_{r'}}\) holds. Let \(y'_\ell ,\ell \in L_V,\) be defined by

$$ y'_{\ell } = \left\{ \begin{array}{rl} y_{\ell _r} - y_{\ell '_{r'}} &{}\text {if}\; \ell = \ell _r \\ 0 &{}\text {if}\; \ell = \ell '_{r'} \\ y_{q}+c\cdot y_{\ell '_{r'}} &{}\text {if}\; \ell = q \\ y_\ell &{}\text {otherwise} \end{array}\right. $$

We claim that \(y'_\ell ,\ell \in L_V,\) is again a feasible solution, i.e., satisfies all constraints, where the same gain b is attained. Concerning the gain, we have

$$ \begin{array}{lll} y_{\ell _{r}}\cdot b_{\ell _{r}} + y_{\ell '_{r'}}\cdot b_{\ell '_{r'}} + y_q\cdot b_{q} &{}=&{} (y_{\ell _r}-y_{\ell '_{r'}})\cdot b_{\ell _r} + y_{\ell '_{r'}}\cdot (b_{\ell _r} + b_{\ell '_{r'}}) + y_q\cdot b_{q} \\ &{}\ge &{} y'_{\ell _r}\cdot b_{\ell _r} + y'_{q} \cdot b_{q} \end{array} $$

As the gain b was already minimal, we conclude that the gain for the \(y'_{\ell }\) has not changed. It remains to show that the \(y'_\ell \) form a feasible solution of the constraints in (14). By construction, the equation for z is satisfied (we reduce \(y_{\ell _r}\) with a positive occurrence of z by the same amount as \(y_{\ell '_{r'}}\) with a negative occurrence). If q contains a variable \(z'\), which is then different from z, then this variable must occur in \(\ell _r\), in \(\ell '_{r'}\), or in both and, if in both, with the same sign. If it is contained only in \(\ell '_{r'}\), then \(y_{\ell '_{r'}}\) in the left-hand side of the constraint for \(z'\) is replaced with 0, while at the same time \(y_q\) is increased by \(y_{\ell '_{r'}}\). If it is contained only in \(\ell _{r}\), then \(y_{\ell _{r}}\) in the left-hand side of the constraint for \(z'\) is decreased by \(y_{\ell '_{r'}}\), while at the same time \(y_q\) is increased by \(y_{\ell '_{r'}}\). If it is contained both in \(\ell _{r}\) and \(\ell '_{r'}\), then \(y_{\ell _{r}}\) in the left-hand side of the constraint for \(z'\) is decreased by \(y_{\ell '_{r'}}\), \(y_{\ell '_{r'}}\) is set to 0, and \(y_q\) is increased by \(2\cdot y_{\ell '_{r'}}\).

Thus, in all cases, the equation is satisfied for the \(y'_{\ell }\).

We conclude that the combination \(\ell '_{r'}\) can equivalently be removed by means of the octagon combination q not involving the variable z.

Therefore, now assume that \(y_{\ell '_{r'}}> y_{\ell _r}\) where, w.l.o.g., the maximal value of the non-zero \(y_{\ell _j}\) equals \(y_{\ell _r}\). If \(\ell _r + \ell '_{r'} = 0\), then \(b_{\ell _r} + b_{\ell '_{r'}} = 0\) (otherwise the gain would not be minimal). Therefore, we set \(y'_{\ell _r} = 0\), \(y'_{\ell '_{r'}} = y_{\ell '_{r'}} - y_{\ell _r}\), and \(y'_\ell = y_\ell \) otherwise to obtain a feasible solution where the minimal gain is attained. At the same time, the number of octagon combinations \(\ell \) with \(y'_\ell > 0\) where z occurs positively has decreased. Therefore, assume that \(\ell _r+\ell '_{r'}\) is different from 0. Then there is a coefficient \(c\in \{1,2\}\) and an octagon constraint q such that \(c\cdot q = \ell _r+\ell '_{r'}\) and \(c\cdot b_q \le b_{\ell _r}+b_{\ell '_{r'}}\). Then we set

$$ y'_\ell = \left\{ \begin{array}{rl} 0 &{}\text {if}\;\ell =\ell _r \\ y_{\ell '_{r'}}-y_{\ell _r} &{}\text {if}\;\ell = \ell '_{r'} \\ y_q+c\cdot y_{\ell _r} &{}\text {if}\;\ell = q \\ y_\ell &{}\text {otherwise} \end{array}\right. $$

Again, we obtain a feasible solution where the gain has not increased, but the number of octagon combinations \(\ell \) with \(y'_\ell > 0\) where z occurs positively has decreased. Altogether, we conclude that, without increasing the gain, the feasible solution \(y_\ell \) can be adjusted such that \(y_\ell = 0\) for \(\ell \) whenever \(\ell \) contains negative occurrences of variables in V.

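As a result, we obtain as the dual of the simplified LP problem, where \(\ell \) ranges only over the octagon combinations from \(L_V\) without negations,

$$\begin{aligned} \begin{array}{ll} {\textbf {minimize}} &{}\sum \nolimits _{\ell } y_{\ell }\cdot b_{\ell } \\[1ex] {\textbf {subject to}}&{}\begin{array}{rlll} \sum \nolimits _{z\;\text {in}\;\ell } y_\ell &{}=&{} a_{z} &{}(z\in V) \\ y_\ell &{}\ge &{} 0 &{}(\ell \in L_V\;\text {without negations}) \\ \end{array} \end{array} \end{aligned}$$
(15)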

Example 7

Assume that the set of program variables consists of \(x,z_1,z_2,z_3\), that our goal is to maximize the linear objective function \(2z_1+3z_2 +z_3\) subject to the octagon constraints

$$ z_1 + z_2\le 10\qquad z_1 + z_3\le 1\qquad z_2 + z_3\le 1 $$

The dual linear program then is given by

$$ \begin{array}{ll} {\textbf {minimize}} &{} y_1\cdot 10 + y_2 + y_3 \\[1ex] {\textbf {subject to}} &{} y_1 + y_2 = 2 \quad y_1 + y_3 = 3 \quad y_2 + y_3 = 1 \\ &{} y_1,y_2,y_3\ge 0 \end{array} $$

In this case, there is just one possible solution for the \(y_i\), namely,

$$ y_1 = 2\quad y_2 = 0\quad y_3 = 1 $$

which implies that the optimal value is given by \(20+0+1 = 21\).    \(\square \)
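The dual program of Example 7 is small enough to be recomputed mechanically. The following is a minimal sketch, assuming the three equality constraints and the bounds exactly as stated in the example; the helper `solve_linear` is a hypothetical exact Gauss-Jordan elimination over rationals, not part of any octagon library.

```python
from fractions import Fraction

def solve_linear(a, b):
    """Solve a*y = b exactly by Gauss-Jordan elimination.

    Assumes that a is square and regular, so the solution is unique.
    """
    n = len(a)
    rows = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Find a row with a non-zero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        scale = rows[col][col]
        rows[col] = [v / scale for v in rows[col]]
        # Eliminate the column from all other rows.
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [v - factor * p for v, p in zip(rows[r], rows[col])]
    return [row[n] for row in rows]

# Rows: y1 + y2 = 2, y1 + y3 = 3, y2 + y3 = 1 (one equation per variable z1, z2, z3).
coefficients = [[1, 1, 0], [1, 0, 1], [0, 1, 1]]
objective_coefficients = [2, 3, 1]
bounds = [10, 1, 1]  # right-hand sides of the three octagon constraints

y = solve_linear(coefficients, objective_coefficients)
dual_value = sum(y_i * b_i for y_i, b_i in zip(y, bounds))
dual_feasible = all(y_i >= 0 for y_i in y)
```

Since the coefficient matrix is regular, the equality constraints alone determine the \(y_i\); the non-negativity constraints then only need to be checked.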

For an optimization problem with integer octagon constraints, we may, in principle, proceed as for rationals. Solving integer linear programs with octagon constraints precisely, however, is NP-hard. This can be seen, e.g., by reduction from the NP-complete maximum clique problem, i.e., the problem of deciding whether the maximal size of a clique in an undirected graph exceeds some bound. Let \(G=(V,E)\) denote a finite undirected graph, and choose V as the set of variables. Then we construct the integer optimization problem

figure g

The constraints are all integer octagon constraints, while the solution to the optimization problem equals the maximal size of a clique. Since the construction of the integer optimization problem from the instance of the clique problem can be done in polynomial time, it follows that deciding whether the optimal value for an integer linear program with octagon constraints exceeds some value is NP-hard.
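A standard encoding matching this description, given here as an assumption rather than as the authors' exact formulation, uses one 0/1-valued variable per node and forbids selecting two non-adjacent nodes:

$$ \begin{array}{ll} {\textbf {maximize}} &{} \sum \nolimits _{v\in V} v \\[1ex] {\textbf {subject to}} &{} \begin{array}{llll} -v &{}\le &{} 0 &{}\qquad (v\in V) \\ v &{}\le &{} 1 &{}\qquad (v\in V) \\ u+v &{}\le &{} 1 &{}\qquad (u\ne v,\; \{u,v\}\not \in E) \end{array} \end{array} $$

Under this encoding, every feasible integer solution selects a set of pairwise adjacent nodes, i.e., a clique, and the objective counts the selected nodes.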

8 Abstract Assignments for Octagons

Assume that we are given an affine assignment of the form

$$ x\leftarrow b+\sum \nolimits _{z\in V}a_z\cdot z $$

and that the octagon before the assignment is a closed octagon r with coefficients \(b_\ell ,\ell \in L_\mathcal{X}\). W.l.o.g., assume that x does not occur in the right-hand side, i.e., \(x\not \in V\). Over the rationals, the best upper bound \(b'_\ell \) for the octagon combination \(\ell \) with x occurring in \(\ell \) is obtained by a linear program of the form (13). Depending on \(\ell \), the objective functions are

figure h
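Substituting the right-hand side of the assignment for x in each octagon combination \(\ell \) containing x, and moving the constant b into the bound, suggests the following objective functions. This is a reconstruction from the assignment and from the shape of the bounds \(b'_\ell \) used in \(r_x\), with y ranging over the variables different from x:

$$ \begin{array}{lll} \ell = x &{}:&{} \sum \nolimits _{z\in V}a_z\cdot z \\ \ell = -x &{}:&{} -\sum \nolimits _{z\in V}a_z\cdot z \\ \ell = x+y &{}:&{} \sum \nolimits _{z\in V}a_z\cdot z + y \\ \ell = x-y &{}:&{} \sum \nolimits _{z\in V}a_z\cdot z - y \\ \ell = -x+y &{}:&{} -\sum \nolimits _{z\in V}a_z\cdot z + y \\ \ell = -x-y &{}:&{} -\sum \nolimits _{z\in V}a_z\cdot z - y \end{array} $$

If \(y\in V\), the additional term merges with the coefficient \(a_y\) of y in the sum.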

The best abstract transformer \(\llbracket \textsf {a} \rrbracket ^\sharp \) then is given by

$$\begin{aligned} \llbracket \textsf {a} \rrbracket ^\sharp (r) = {\left. r\right| _{\mathcal{X}{\setminus }\{x\}}}\wedge {r_x} \end{aligned}$$
(16)

where \(r_{x}\) denotes the conjunction

$$\begin{aligned} \begin{array}{l} (x\le b+b'_x)\;\wedge \; \bigwedge \nolimits _{z\ne x} (x+z\le b+b'_{x+z})\wedge (x-z\le b+b'_{x-z})\;\wedge \\ \qquad \qquad \qquad \qquad \quad \,\, (-x+z\le b'_{-x+z}-b)\wedge (-x-z\le b'_{-x-z}-b) \end{array} \end{aligned}$$

Over the integers, we can proceed analogously to the rational case by solving the corresponding integer optimization problems. Since these, in general, are NP-hard, we prefer, for integer octagons, to rely on rational relaxations of the corresponding ILP problems. This means that for each octagon combination \(\ell \), we determine the best rational upper bound \(b_\ell \) after the assignment (as determined by the corresponding LP problem), which is tightened to \(\lfloor b_\ell \rfloor \) to obtain a sound upper bound for \(\ell \) over the integers. We remark that for integer octagons, an alternative formulation of abstract transformers for affine assignments has been provided in [21]. The transformer there is based on the optimal abstract transformer for rational polyhedra in [9] whose bounds are tightened and subsequently over-approximated by octagon constraints. The latter step also requires solving appropriate (relaxed) LP problems, which are essentially the same as the ones we solve – only that we benefit from a reduced number of octagon constraints to be taken into account by each LP problem. We obtain:

Theorem 3

For the octagon domain over the rationals, the best transformer (16) for a linear assignment can be computed in polynomial time. For n program variables and a constant number of variables occurring in the assignment, the best transformer can be computed in time \(\mathcal{O}(n)\).

   \(\square \)

Proof

Assume that the octagon before the assignment is closed. Due to Proposition 5, the octagon transformer for linear assignments satisfies properties (4) and (3). Therefore by Proposition 4, only a linear number of optimization problems must be solved. Over the rationals, the optimal upper bound to an octagon combination can be determined by solving an LP problem – which is known to be possible in polynomial time. Note that due to Proposition 5, the set of octagon constraints to be taken into account can be reduced to constraints with octagon combinations where the signs of variables match the corresponding signs occurring in the objective function.

If the right-hand side contains only a bounded number of variables, each of the LP problems will refer to a bounded number of variables only, and thus can be solved in constant time (e.g., by using the Simplex algorithm). Since only \(\mathcal{O}(n)\) many of these problems must be solved, the overall runtime is linear.    \(\square \)

Over the integers, on the other hand, the solution of the relaxed integer LP problem for a sound bound to an octagon combination can be obtained as the solution to the corresponding relaxed rational LP problem, and the argument proceeds as in the rational case. As a corollary, we therefore obtain:

Corollary 1

For the octagon domain over the integers, the integer relaxation of (16) for a linear assignment can be computed in polynomial time. For n program variables and a constant number of variables occurring in the assignment, the relaxed best transformer can be computed in time \(\mathcal{O}(n)\).    \(\square \)

9 Related Work

Since being introduced by Miné [20, 21], the weakly relational numerical domain of Octagons has found widespread application in the analysis and verification of programs and is part, e.g., of the highly successful static analyzer Astrée [3, 7]. While normalization has been known to be cubic time for rational octagons right from the beginning [20], it was open whether this also holds true for integer octagons. This question has been settled affirmatively by Bagnara et al. [1]. Sankaranarayanan et al. [25] proposed using techniques from linear programming to compute best transformers for linear assignments. Chawdhary et al. [4] investigated the problem of improved quadratic algorithms for incremental closure, i.e., adding one further octagon constraint. Implementations of Octagons are provided, e.g., by the Apron library [14] and Elina [10]. Various Octagon algorithms are practically evaluated by Gange et al. [12].

Fig. 1. Various weakly relational domains, whether they are 2-projective and 2-decomposable, and the complexity of their normalization operation. \(^\textrm{a}\)(For TVPI: As operations on values for 3 variables are in \({\mathcal O}(\log ^2 n)\).) \(^\textrm{b}\)(For int dDBM: Approximate normalization up-to emptiness. Checking emptiness is exponential.)

Extensions of octagons have been considered by Péron and Halbwachs [24] and Chen et al. [5]. For these extensions, however, known normalization algorithms turn out to be rather expensive so that more practical approximate normalizations have been proposed. Figure 1 gives an overview of some weakly relational domains, whether they are 2-decomposable and whether they are also 2-projective, as well as the best time complexities for (approximate) normalization in the number of variables.

10 Conclusion and Future Work

We have provided an algorithm for normalizing octagon abstract relations over rationals as well as over integers. For that, we introduced the notion of 2-decomposability for relational domains and provided a cubic-time algorithm based on Floyd-Warshall which overapproximates normalization. For the subclass of 2-projective domains comprising, e.g., integer or rational Octagons, it computes the exact 2-normal form. The major benefit of the resulting algorithm is its simplicity. For the instance of the Octagon domain, e.g., the closure is obtained without duplication of variables. The general setup also provides us with a quadratic algorithm for incremental normalization. For octagons, we also reconsidered the construction of best abstract transformers for affine assignments by means of linear programming. Over the rationals, we observe that only those octagon constraints need to be taken into account where the sign of each occurring variable z agrees with the sign of the occurrence of z in the respective objective functions. This, again, may result in a significant speedup when it comes to practical implementations.

In future work, we would like to provide a new implementation of Octagon domains based on our algorithms and evaluate its practical performance on realistic examples. Combining our algorithms with orthogonal techniques such as online decomposition [28] in particular seems like a promising line of inquiry. We also would like to explore in greater detail the potential of further, perhaps non-numerical 2-decomposable domains.