Non-Numerical Weakly Relational Domains

The weakly relational domain of Octagons offers a decent compromise between precision and efficiency for numerical properties. Here, we are concerned with the construction of non-numerical relational domains. We provide a general construction of weakly relational domains, which we exemplify with an extension of constant propagation by disjunctions. Since for the resulting domain of 2-disjunctive formulas, satisfiability is NP-complete, we provide a general construction for a further, more abstract weakly relational domain where the abstract operations of restriction and least upper bound can be efficiently implemented. In the second step, we consider a relational domain that tracks conjunctions of inequalities between variables, and between variables and constants for arbitrary partial orders of values. Examples are sub(multi)sets, as well as prefix, substring or scattered substring orderings on strings. When the partial order is a lattice, we provide precise polynomial algorithms for satisfiability, restriction, and the best abstraction of disjunction. Complementary to the constructions for lattices, we find that, in general, satisfiability of conjunctions is NP-complete. We therefore again provide polynomial abstract versions of restriction, conjunction, and join. By using our generic constructions, these domains are extended to weakly relational domains that additionally track disjunctions. For all our domains, we indicate how abstract transformers for assignments and guards can be constructed.


Introduction
arXiv:2401.05165v1 [cs.PL] 10 Jan 2024
Relational analyses have been observed to be indispensable for verifying intricate program properties. In particular, this is the case when, for the purpose of verification, ghost variables have been introduced which must be related to program variables. Termination may be verified by introducing a ghost loop counter, which can be proven bounded by a relational domain relating it to the actual bounded iteration variable [2]. The validity of string operations on null-terminated strings as employed, e.g., in the programming language C, may be verified by introducing ghost variables for the length of a buffer as well as for tracking the position of the null byte in the buffer [12]. It has also been observed that monolithic relational domains such as the polyhedra abstract domain [10] scale badly to larger programs. Therefore, weakly relational domains have been proposed which can only express simple relational properties, but have the potential to scale better [16]. Examples of weakly relational numerical domains are the Two Variables Per Inequality domain [23], or domains given by a finite set of linear templates [20]. The most prominent example of a template numerical domain is the Octagon domain [15,17], which allows tracking upper and lower bounds not only of program variables but also of sums and differences of two program variables. One such octagon abstract relation could, e.g., be given by the conjunction
$(-x \le -5) \land (x \le 10) \land (x + y \le 0) \land (x - z \le 1)$
Octagons thus can be considered as a mild extension of the non-relational domain of Intervals for program variables, and a variety of efficient algorithms have been provided [4,5,7,22]. Here, we are concerned with constructing non-numerical abstract domains.
For that, we provide a general technique to construct from every relational domain a weakly relational domain. As one instance of the general construction, we consider 2-disjunctive constants as mentioned in [21]. This weakly relational domain allows, e.g., to relate the names of functions with function pointers as in the formula
$(x = \texttt{"foo"} \land y = \texttt{\&foo}) \lor (x = \texttt{"bar"} \land y = \texttt{\&bar})$
Since satisfiability of formulas from that domain turns out to be NP-complete, we provide a further mild abstraction, again for arbitrary relational domains, to provide us with a weakly relational domain where all required operations become tractable.
Another family of relational non-numerical domains has been introduced by Arceri et al. [3]. Based on a partial order of values, conjunctions of ordering constraints $x \sqsubseteq y$ for program variables $x, y$ are considered. They observe that analyses of prefixes or the substring relation could be helpful for programs in programming languages supporting high-level operations on strings.
Here, we study this kind of directed domains in greater detail. For conjunctions of inequalities over some partial order $P$, we extend the constraints from Arceri et al. [3] by allowing for variables both lower and upper bounds from $P$. For arbitrary partial orders, though, we find that then satisfiability is NP-complete. Partial orders $P$ that are lattices form a notable exception. Instances of this are subsets of some universe or multisets. For lattices, we show that satisfiability is decidable in polynomial time. Moreover, we provide polynomial constructions both for restriction and for the optimal join operation. Turning to general partial orders of values, we thus cannot hope for polynomial algorithms. Therefore, we provide a meaningful abstraction so that both abstract restriction and abstract join are again polynomial. This family of relational domains is already weakly relational. Still, our generic constructions can be applied to obtain more expressive weakly relational domains that additionally support disjunctions at a limited extra cost.
The paper is organized as follows: Section 2 provides background definitions on relational domains. It formally introduces our notion of weakly relational domains and provides a general construction of weakly relational domains. Section 3 is dedicated to disjunctive constants. When applying the generic construction from the last section to this relational domain, the weakly relational domain of 2-disjunctive constants is obtained. Here, we prove that satisfiability for these formulas still is NP-complete. Therefore, a generic abstraction technique is presented so that, when applied to disjunctive constants, normalization, projection, as well as least upper bounds all turn out to be polynomial time.
Finally, abstract transformers for assignments as well as guards are derived. Section 4 then introduces directed domains which do not track equalities but inequalities over a partial order of values. While the first subsection provides polynomial constructions for the case that the partial order for values is a lattice, the second subsection is concerned with arbitrary partial orders as value domain. Since satisfiability, in general, turns out to be NP-complete, again a polynomial abstraction is provided. In a further subsection, we indicate how the generic constructions from the last sections provide us with weakly relational domains that additionally support disjunctions of inequalities. We exemplify the resulting domains with conjunctions and disjunctions of inequalities over the integers. In a further subsection, dedicated abstract transformers are constructed for assignments, while the last subsection discusses the treatment of guards. Section 5 summarizes the contributions and sketches further directions of research.

Weakly Relational Domains
Let us recall basic definitions for relational domains. We mostly follow the notation used in previous work [21], where the notion of 2-decomposability has been introduced. Let $\mathcal{X}$ be some finite set of variables. A relational domain $\mathcal{R}$ maintains relations between variables in $\mathcal{X}$. We require that a relational domain is a bounded lattice, i.e., has a partial order $\sqsubseteq$, a least element $\bot$, a greatest element $\top$, as well as binary operators for the greatest lower bound (meet) $\sqcap$ and the least upper bound (join) $\sqcup$. We do not demand relational domains to be complete lattices, i.e., to provide for every subset of elements a least upper bound: the polyhedral domain, e.g., is not complete [10]. However, we demand that a relational domain supports the following monotonic operations:
$\llbracket x := e \rrbracket^\sharp : \mathcal{R} \to \mathcal{R}$ (assignment of $e$ to $x$)
$\cdot|_Y : \mathcal{R} \to \mathcal{R}$ (restriction to $Y \subseteq \mathcal{X}$)
$\llbracket ?c \rrbracket^\sharp : \mathcal{R} \to \mathcal{R}$ (guard for condition $c$)
where $e$ and $c$ are from some expression and condition language, respectively.
The abstract transformers for basic actions of programs are given by these functions. Restricting a relation $r$ to a subset $Y$ of variables amounts to forgetting all information about variables in $\mathcal{X} \setminus Y$. Thus, we require restriction to obey corresponding axioms. A restriction $\cdot|_Y$ to some set $Y$ therefore is an idempotent operation. We remark that from these axioms it follows that $\bot|_Y = \bot$ and $\top|_Y = \top$ for any $Y \subseteq \mathcal{X}$. Given that there is some relation $r_c \in \mathcal{R}$ describing all states satisfying the condition $c$, the transformation for the guard $?c$ can be described by
$\llbracket ?c \rrbracket^\sharp\, r = r \sqcap r_c$
at least if there is a concretization function $\gamma$ such that
$\gamma(r_1 \sqcap r_2) = \gamma(r_1) \cap \gamma(r_2)$
i.e., the binary meet operation is precise.
Example 1 For numerical variables, a variety of such relational domains have been proposed, e.g., (conjunctions of) affine equalities [14,18,19] or affine inequalities [10]. For affine equalities or inequalities, restriction to a subset $Y$ of variables corresponds to the geometric projection onto the subspace defined by $Y$, combined with arbitrary values for variables $z \notin Y$. □
One way to tackle the high cost of relational domains is to track the relationships not between all variables, but only between subclusters of variables. We call such domains Weakly Relational Domains. For a subset $Y \subseteq \mathcal{X}$, let $\mathcal{R}^Y = \{ r|_Y \mid r \in \mathcal{R} \}$ be the set of all abstract values from $\mathcal{R}$ that contain only information on those variables in $Y$. For any collection $S \subseteq 2^{\mathcal{X}}$ of clusters of variables, a relation $r \in \mathcal{R}$ can be approximated by a meet of relations from $\mathcal{R}^Y$, $Y \in S$, since for every $r \in \mathcal{R}$,
$r \sqsubseteq \bigsqcap_{Y \in S} r|_Y$ (4)
holds, as $r \sqsubseteq r|_Y$ holds for each $Y \in S$. In fact, the right-hand side of (4) is the best approximation of $r$ by some meet over abstract relations $s_Y \in \mathcal{R}^Y$, $Y \in S$: whenever $r \sqsubseteq \bigsqcap_{Y' \in S} s_{Y'}$, then (by monotonicity of restriction) $r|_Y \sqsubseteq s_Y|_Y = s_Y$ holds for all $Y \in S$.
Schwarz et al. [21] have introduced 2-decomposable relational domains. These are domains where the full value $r$ can be recovered from the restrictions of $r$ to all clusters $p$ from the set $S = [\mathcal{X}]_2$ of non-empty clusters of variables of size at most 2. Furthermore, Schwarz et al. [21] ask for binary least upper bounds to be determined by computing within these clusters only. More precisely, this amounts to requiring the following two properties
$r = \bigsqcap_{p \in [\mathcal{X}]_2} r|_p$ (5)
$(r_1 \sqcup r_2)|_p = r_1|_p \sqcup r_2|_p \quad (p \in [\mathcal{X}]_2)$ (6)
to hold for all abstract relations $r, r_1, r_2 \in \mathcal{R}$. The most prominent example of a 2-decomposable domain is the octagon domain [15], either over rationals or integers, while affine equalities or affine inequalities are examples of domains that are not 2-decomposable.
Any relational domain $\mathcal{R}$, however, which satisfies (6) gives rise to a 2-decomposable domain $\mathcal{R}_2$ of its 2-cluster approximations.
For $r \in \mathcal{R}$, let $\overline{r} = \bigsqcap_{p \in [\mathcal{X}]_2} r|_p$ denote the approximation of $r$ by the meet of its restrictions to clusters $p \in [\mathcal{X}]_2$. Let $\mathcal{R}_2$ denote the subset of $\mathcal{R}$ of all abstract relations of the form $\overline{r}$, $r \in \mathcal{R}$, where the ordering is inherited from $\mathcal{R}$. In particular, $\bot$ as well as $\top$ from $\mathcal{R}$ are also in $\mathcal{R}_2$.
Theorem 1 Assume that $\mathcal{R}$ is an abstract relational domain which satisfies (6). Then the following holds:
1. $\overline{r} = r$ for all conjunctions $r = \bigsqcap_{p \in [\mathcal{X}]_2} s_p$ with $s_p \in \mathcal{R}^p$, $p \in [\mathcal{X}]_2$, i.e., all such conjunctions are contained in $\mathcal{R}_2$.
2. For $r_1, r_2 \in \mathcal{R}_2$, the abstract relation $r_1 \sqcap r_2$, as provided by $\mathcal{R}$, is in $\mathcal{R}_2$.
3. The binary least upper bound operation in $\mathcal{R}_2$ exists and is given by $r_1 \sqcup_2 r_2 = \bigsqcap_{p \in [\mathcal{X}]_2} (r_1|_p \sqcup r_2|_p)$.
For a proof of statement (3), we note that any upper bound of $r_1, r_2$ in $\mathcal{R}_2$ is also an upper bound of $r_1 \sqcup r_2$ in $\mathcal{R}$. Therefore, the least upper bound of $r_1, r_2$ in $\mathcal{R}_2$ is given by $\overline{r_1 \sqcup r_2}$. We calculate, using (6):
$\overline{r_1 \sqcup r_2} = \bigsqcap_{p \in [\mathcal{X}]_2} (r_1 \sqcup r_2)|_p = \bigsqcap_{p \in [\mathcal{X}]_2} (r_1|_p \sqcup r_2|_p)$
and statement (3) follows.
The best approximation of $r|_Y$ in $\mathcal{R}_2$ is given by $\overline{r|_Y}$. Thus, we have
$\overline{r|_Y} = \bigsqcap_{p \in [\mathcal{X}]_2} (r|_p)\big|_Y$
i.e., it can be determined by applying the restriction onto variables from $Y$ for each cluster $p \in [\mathcal{X}]_2$ separately. This implies statement (4).
Statement (5) is an immediate consequence of statements (3) and (4). □
The polyhedral domain, e.g., satisfies (6). Applied to the polyhedral relational domain, the construction from Theorem 1 results in the domain of affine inequalities with at most two variables per inequality [23].
According to Theorem 1, every value $r$ from the 2-decomposable relational domain $\mathcal{R}_2$ can be represented as the meet of its restrictions to 2-clusters, i.e., by the collection $\langle r|_p \rangle_{p \in [\mathcal{X}]_2}$. We call this representation normal, and an algorithm that computes it normalization. Consider now an arbitrary collection $\langle s_p \rangle_{p \in [\mathcal{X}]_2}$ with $s_p \in \mathcal{R}^p$ and $r = \bigsqcap_{p \in [\mathcal{X}]_2} s_p$. Then $r|_p \sqsubseteq s_p$ always holds, while equality need not hold. In the Octagon domain over the rationals or the integers, the normal representation of an octagon value corresponds to its closure as introduced in previous work [4,15]. While for rational Octagons, closure in cubic time was already proposed by Miné [15], a corresponding algorithm for the case when constraints are interpreted over integers was provided only much more recently [4,5].
Subsequently, we introduce non-numerical weakly relational domains and provide polynomial algorithms for these.

Disjunctive Constants
Constant propagation relies on a domain that maintains conjunctions of atomic propositions $x = a$ where $x$ is a program variable and $a$ is from a finite set $U$ of possible values. In the following, we consider a (mild) generalization of this domain where also disjunctions of at most two atomic propositions are allowed.
Assume we are given a finite set $U$ representing possible values for variables from $\mathcal{X}$. We consider propositions of the form $(x \in A)$ for $A \subseteq U$ which correspond to the disjunction of atomic propositions $x = a$, $a \in A$. Thus, the proposition $x \in A$ for some $A \subseteq U$ can be understood as an atomic proposition of a multi-valued propositional logic where $A$ serves as the set of logical values of the propositional variable $x$ [6]. Every monotonic Boolean combination $\Psi$ of propositions $x \in A$ with $x \in \mathcal{X}$, $A \subseteq U$, represents a function $\llbracket \Psi \rrbracket : (\mathcal{X} \to U) \to \mathbb{B}$ defined by $\llbracket x \in A \rrbracket\,\sigma = (\sigma\,x \in A)$ for atomic propositions and extended to conjunctions and disjunctions in the usual way. Let $C[U]$ denote the complete lattice of all equivalence classes of formulas $\Psi$ where the ordering is semantic implication. The least element in this ordering can be represented by the empty disjunction or $\bot$ (false), while the greatest element is equivalent to the empty conjunction or $\top$ (true). Each formula $\Psi$ has an equivalent CNF as well as an equivalent DNF where each clause (conjunction) contains at most one proposition $x \in A$ for every variable $x$. Converting $\Psi$ into DNF allows checking satisfiability and computing the restriction $\Psi|_Y$ onto a subset $Y \subseteq \mathcal{X}$ of variables. A formula for $\Psi|_Y$ is obtained from a DNF for $\Psi$ where each conjunction contains at most one proposition for each variable by the following steps: First, every conjunction which contains $y \in \emptyset$ for some $y$ is removed. From each remaining conjunction, then every proposition $y \in A$ with $y \notin Y$ is removed. It follows that the restriction $\Psi|_Y$ is distributive, i.e., commutes with binary least upper bounds.
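The two restriction steps on a DNF can be sketched in a few lines of Python. This is only an illustration under assumed conventions: a DNF is encoded as a list of conjunctions, each a dictionary from variable names to the set $A$ of its single proposition $x \in A$; the function names `restrict_dnf` and `is_satisfiable` are hypothetical and not taken from the paper.

```python
def restrict_dnf(dnf, keep):
    """Restrict a DNF to the variables in `keep`.

    `dnf` is a list of conjunctions; each conjunction is a dict mapping a
    variable name to the set A of the single proposition `x in A` it contains
    (an assumed encoding chosen for this sketch).
    """
    result = []
    for conj in dnf:
        # Step 1: a conjunction containing `y in {}` is unsatisfiable; drop it.
        if any(len(values) == 0 for values in conj.values()):
            continue
        # Step 2: forget every proposition on a variable outside `keep`.
        result.append({x: set(a) for x, a in conj.items() if x in keep})
    return result


def is_satisfiable(dnf):
    """A DNF of this shape is satisfiable iff some conjunction survives step 1."""
    return len(restrict_dnf(dnf, keep=set())) > 0
```

Restricting to the empty set of variables leaves one empty conjunction (true) per satisfiable disjunct, which is why the satisfiability check can reuse the restriction.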
For an arbitrary $\Psi \in C[U]$, computing an equivalent DNF is an exponential time operation. The same holds if all restrictions $\Psi|_{\{x,y\}}$ are computed via this normal form. Let $C_2[U]$ denote the 2-decomposable domain obtained from $C[U]$ according to Theorem 1. The lattice $C_2[U]$ consists of all elements $\Psi$ which can be represented as conjunctions of clauses with at most two propositions $x \in A_x$ per clause. According to Theorem 1, the least upper bound operation $\sqcup_2$ for $C_2[U]$ can be realized by a clusterwise disjunction. In particular, it does not coincide with logical disjunction, but is an over-approximation of it.
Example 2 Let $\Psi_1 = (x \in \{a\})$ and $\Psi_2 = (y \in \{b\} \lor z \in \{c\})$. Then both $\Psi_1$ and $\Psi_2$ are from $C_2[U]$, but their disjunction is not. In fact, the least upper bound in $C_2[U]$ for $(x \in \{a\}) \lor (y \in \{b\}) \lor (z \in \{c\})$ is $\top$. □
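Example 2 can be checked by brute force. The following Python sketch is an illustration only: it assumes the universe $U = \{a, b, c\}$ and represents a formula by its set of satisfying assignments; the helper names (`models`, `restrict`, `approx2`, `join2`) are hypothetical. Restriction is modeled as forgetting all variables outside a cluster, and the clusterwise join intersects the per-cluster unions.

```python
from itertools import combinations, product

VARS = ("x", "y", "z")
U = ("a", "b", "c")                       # assumed universe for this sketch
ALL = set(product(U, repeat=len(VARS)))   # every assignment, i.e. top


def models(pred):
    """Set of assignments (as tuples ordered like VARS) satisfying `pred`."""
    return {s for s in ALL if pred(dict(zip(VARS, s)))}


def restrict(rel, keep):
    """Forget all information about variables outside `keep`."""
    idx = [i for i, v in enumerate(VARS) if v in keep]
    seen = {tuple(s[i] for i in idx) for s in rel}
    return {s for s in ALL if tuple(s[i] for i in idx) in seen}


def clusters():
    """All non-empty clusters of at most two variables."""
    return [(v,) for v in VARS] + list(combinations(VARS, 2))


def approx2(rel):
    """Meet of the restrictions of `rel` to all 2-clusters."""
    result = set(ALL)
    for p in clusters():
        result &= restrict(rel, p)
    return result


def join2(r1, r2):
    """Clusterwise join: meet over clusters of the joined restrictions."""
    result = set(ALL)
    for p in clusters():
        result &= restrict(r1, p) | restrict(r2, p)
    return result


psi1 = models(lambda s: s["x"] == "a")
psi2 = models(lambda s: s["y"] == "b" or s["z"] == "c")
```

Here `join2(psi1, psi2)` covers every assignment, while the logical disjunction `psi1 | psi2` does not, which matches the over-approximation described in the example.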

Approximating 2-disjunctive Conjunctions
Any CNF $\Psi$ over some set $Y$ of variables of bounded size can, in polynomial time, be transformed into a DNF $\Psi'$. Each DNF over two distinct variables $x, y$ can be brought into the canonical normal form
$\bigvee_{(a,b) \in L} (x = a) \land (y = b)$ (7)
for some $L \subseteq U \times U$. Conjunction and disjunction of two such normal forms then correspond to intersection and union of the respective subsets of $U \times U$.
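As a small illustration of this normal form, the following Python sketch represents a formula over two fixed variables $x, y$ by its set $L \subseteq U \times U$. The universe and the function names (`clause`, `conj`, `disj`) are assumptions made for the sketch only.

```python
from itertools import product

U = frozenset({"a", "b", "c"})  # assumed universe for this sketch


def clause(a_x, a_y):
    """Pairs (a, b) satisfying the clause (x in a_x) or (y in a_y)."""
    return {(a, b) for a, b in product(U, repeat=2) if a in a_x or b in a_y}


def conj(l1, l2):
    """Conjunction of two normal forms is intersection of their pair sets."""
    return l1 & l2


def disj(l1, l2):
    """Disjunction of two normal forms is union of their pair sets."""
    return l1 | l2
```

For instance, the conjunction of the three clauses `clause({"b","c"}, {"b","c"})`, `clause({"a","c"}, {"a","c"})` and `clause({"a","b"}, {"a","b"})` yields exactly the pairs with distinct components, i.e. the normal form of $x \neq y$ over this universe.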
For arbitrary sets Y of variables, though, it is nontrivial even to decide whether a given conjunction is different from K.
Theorem 2 To decide for a formula $\Psi \in C_2[U]$ whether or not $\Psi$ is satisfiable, i.e., different from $\bot$, is NP-complete.
Proof Since a satisfying assignment for $\Psi$ can be guessed and then checked in polynomial time, satisfiability of $\Psi$ is in NP. NP-hardness, on the other hand, follows by a reduction from 3-colorability of graphs [6]. We illustrate the reduction with an example.
Example 3 For $\mathcal{X} = \{x_1, x_2, x_3, x_4\}$, consider the formula
$\Psi = \bigwedge_{\{x_i, x_j\} \in E} (x_i \in \{b, c\} \lor x_j \in \{b, c\}) \land (x_i \in \{a, c\} \lor x_j \in \{a, c\}) \land (x_i \in \{a, b\} \lor x_j \in \{a, b\})$
where $E$ is the edge set of an undirected graph on $\mathcal{X}$. Then $\Psi$ is satisfiable iff the undirected graph $(\mathcal{X}, E)$ has a 3-coloring. In the given example, the graph cannot be colored by three colors. Therefore, $\Psi$ is equivalent to $\bot$. □
Exact normalization (as defined in Section 2) of a relation represented by some 2-CNF thus, in general, may be difficult to compute. Instead of giving dedicated further abstraction techniques, we prefer to provide, for an arbitrary relational domain $\mathcal{R}$, a general construction to approximate the 2-decomposable domain $\mathcal{R}_2$ further by a 2-decomposable domain $\mathcal{R}^\sharp_2$. This construction is based on approximate normalization.
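The reduction of Example 3 can be replayed by brute force on small graphs. The Python sketch below is illustrative only; the function names are hypothetical, and the complete graph on four vertices is used as an assumed stand-in for a graph that is not 3-colorable.

```python
from itertools import combinations, product

COLORS = ("a", "b", "c")


def clauses_for(edges):
    """For every edge {u, v}, emit the three 2-clauses of the reduction.

    Each clause is a pair ((u, A), (v, A)) meaning (u in A) or (v in A),
    where A leaves out exactly one color.
    """
    out = []
    for u, v in edges:
        for excluded in COLORS:
            allowed = frozenset(COLORS) - {excluded}
            out.append(((u, allowed), (v, allowed)))
    return out


def satisfiable(variables, clauses):
    """Exponential brute-force check over all assignments of colors."""
    for values in product(COLORS, repeat=len(variables)):
        sigma = dict(zip(variables, values))
        if all(sigma[u] in a or sigma[v] in b for (u, a), (v, b) in clauses):
            return True
    return False


VERTICES = ["x1", "x2", "x3", "x4"]
K4 = list(combinations(VERTICES, 2))                              # not 3-colorable
CYCLE = [("x1", "x2"), ("x2", "x3"), ("x3", "x4"), ("x4", "x1")]  # 3-colorable
```

The three clauses per edge together rule out exactly the assignments giving both endpoints the same color, so `satisfiable` succeeds precisely when a proper 3-coloring exists.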
Assume that an element in $\mathcal{R}_2$ is given by the meet $\bigsqcap R$ where $R$ is the collection $\langle s_p \rangle_{p \in [\mathcal{X}]_2}$ with $s_p \in \mathcal{R}^p$ ($p \in [\mathcal{X}]_2$). According to Theorem 1, $(\bigsqcap R)|_p \sqsubseteq s_p$ for all $p \in [\mathcal{X}]_2$. As we have seen for 2-disjunctive constants, however, exact normalization of $\bigsqcap R$, i.e., the values $(\bigsqcap R)|_p$, may be hard to compute precisely. For an approximate normalization, we introduce a constraint system in unknowns $r_p$, $p \in [\mathcal{X}]_2$, with the constraints
$r_{\{x,y\}} \sqsubseteq s_{\{x,y\}} \quad (x, y \in \mathcal{X})$
$r_{\{x,y\}} \sqsubseteq (r_{\{x,z\}} \sqcap r_{\{z,y\}})\big|_{\{x,y\}} \quad (x, y, z \in \mathcal{X})$ (8)
This constraint system has already been considered for the normalization of 2-projective domains [22]. As all right-hand sides are monotonic, the constraint system has a greatest solution whenever each $\mathcal{R}^p$, $p \in [\mathcal{X}]_2$, is a complete lattice.
In case that there is a greatest solution $\langle r_p \rangle_{p \in [\mathcal{X}]_2}$, $(\bigsqcap R)|_p \sqsubseteq r_p$ holds for all $p$, since $\langle (\bigsqcap R)|_p \rangle_{p \in [\mathcal{X}]_2}$ is also a solution of the system (8). Then we call the collection $\langle r_p \rangle_{p \in [\mathcal{X}]_2}$ the approximate normal form of the collection $R$. Here, we are not only interested in the existence of a greatest solution of (8) but also in whether it can be effectively computed. For that, we consider the sets of values possibly occurring during some fixpoint iteration for a particular collection $R = \langle s_p \rangle_{p \in [\mathcal{X}]_2}$.
Let $I_{\mathcal{R}}[R]_p$, $p \in [\mathcal{X}]_2$, be the least collection of sets such that
- $s_p \in I_{\mathcal{R}}[R]_p$;
- If $r, r' \in I_{\mathcal{R}}[R]_p$, then also $r \sqcap r' \in I_{\mathcal{R}}[R]_p$;
- If $r \in I_{\mathcal{R}}[R]_{\{x,z\}}$ and $r' \in I_{\mathcal{R}}[R]_{\{z,y\}}$, then $(r \sqcap r')|_{\{x,y\}} \in I_{\mathcal{R}}[R]_{\{x,y\}}$ for all $x, y, z \in \mathcal{X}$.
The sets $I_{\mathcal{R}}[R]_p$ collect the potential iterates occurring during greatest fixpoint iteration of (8). By construction, each set $I_{\mathcal{R}}[R]_p$ has a greatest element, namely, $s_p$, and is closed under binary $\sqcap$. For the termination of Kleene fixpoint iteration for (8), it suffices for each set $I_{\mathcal{R}}[R]_p$ to have a least element, and the collection of these least elements then coincides with the greatest solution of (8). This observation is summarized in the following proposition.

Proposition 1
The following two statements are equivalent:
1. For each $p \in [\mathcal{X}]_2$, $I_{\mathcal{R}}[R]_p$ has a least element;
2. The constraint system (8) has a greatest solution which can be attained by Kleene fixpoint iteration.
Proof Assume that for each $p \in [\mathcal{X}]_2$, there is a least element $d_p \in I_{\mathcal{R}}[R]_p$. We claim that the collection $\langle d_p \rangle_{p \in [\mathcal{X}]_2}$ is the greatest solution of (8). Since for each $p \in [\mathcal{X}]_2$, $d_p$ is a lower bound to all elements in $I_{\mathcal{R}}[R]_p$, all constraints of (8) are satisfied. Therefore, $\langle d_p \rangle_{p \in [\mathcal{X}]_2}$ is a solution. By induction on the definition of the sets $I_{\mathcal{R}}[R]_p$, any other solution $R' = \langle r'_p \rangle_{p \in [\mathcal{X}]_2}$ consists of lower bounds of these sets, i.e., $r'_p \sqsubseteq \bigsqcap I_{\mathcal{R}}[R]_p = d_p$, implying our claim. To conclude statement (2), it remains to prove that this greatest solution can be reached by Kleene iteration. For every $p$, $d_p$ is an element of the set $I_{\mathcal{R}}[R]_p$, and therefore has arrived there after finitely many applications of the inductive rules of its definition. Let $h$ be an upper bound to these numbers for all $d_p$, $p \in [\mathcal{X}]_2$. Then, Kleene iteration for the constraint system (8) will also reach these values after at most $h$ iterations.
For the reverse direction, assume that Kleene iteration for the greatest solution of (8) terminates after $h$ iterations with a collection $\langle d_p \rangle_{p \in [\mathcal{X}]_2}$. By induction on the number $j$ of rounds, we find that each value $d^{(j)}_p$ attained for $r_p$, $p \in [\mathcal{X}]_2$, after $j$ rounds, is an element of $I_{\mathcal{R}}[R]_p$. Therefore, $d_p = d^{(h)}_p \in I_{\mathcal{R}}[R]_p$ for all $p$. It remains to prove that $d_p$ is also a lower bound of $I_{\mathcal{R}}[R]_p$. To show this, we again proceed by induction, this time on the number $i$ of applications of the inductive rules for the construction of the $I_{\mathcal{R}}[R]_p$, and prove that for all $i$ and any value $d'$ added to some set $I_{\mathcal{R}}[R]_p$ in the $i$th step, it holds that $d^{(i)}_p \sqsubseteq d'$. Therefore, $d_p$ is a lower bound to $I_{\mathcal{R}}[R]_p$ for all $p$, and statement (1) follows.
□
If all operations on abstract relations $r \in \mathcal{R}^Y$ for clusters $Y$ of size at most 3 are constant time and the heights of all $I_{\mathcal{R}}[R]_p$ are bounded by $h$, then the greatest solution of the constraint system (8) can be computed in time polynomial in $h$ and the number of variables.
We call a relational domain 2-nice, if the statements of Proposition 1 are satisfied for each collection R " xs p y pPrX s2 with s p P R p .
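To make the fixpoint iteration for (8) concrete, here is a minimal Python sketch for the finite case. It is an assumption-laden illustration, not the paper's algorithm: the value of a cluster $\{x, y\}$ is stored for ordered pairs $(x, y)$ as a set of value pairs over a finite universe, single-variable clusters are stored as diagonal pairs $(x, x)$, and the right-hand side $(r_{\{x,z\}} \sqcap r_{\{z,y\}})|_{\{x,y\}}$ then becomes relational composition. The name `approx_normalize` is hypothetical.

```python
from itertools import combinations, product


def approx_normalize(variables, universe, s):
    """Greatest solution of the constraint system by downward iteration.

    `s` maps ordered pairs (x, y) of variables to sets of allowed value
    pairs (a, b); missing entries are unconstrained.
    """
    full = set(product(universe, repeat=2))
    r = {}
    for x, y in product(variables, repeat=2):
        fwd = s.get((x, y), full)
        bwd = {(b, a) for (a, b) in s.get((y, x), full)}  # converse of s[(y, x)]
        r[(x, y)] = fwd & bwd
        if x == y:  # a single variable has one value: keep the diagonal only
            r[(x, y)] = {(a, b) for (a, b) in r[(x, y)] if a == b}
    changed = True
    while changed:
        changed = False
        for x, y, z in product(variables, repeat=3):
            # (r_xz meet r_zy) restricted to {x, y} is relational composition
            composed = {(a, b) for (a, c) in r[(x, z)]
                        for (c2, b) in r[(z, y)] if c == c2}
            narrowed = r[(x, y)] & composed
            if narrowed != r[(x, y)]:
                r[(x, y)] = narrowed
                changed = True
    return r
```

On the clauses of Example 3 for the complete graph on four vertices (an assumed instance), every edge cluster carries the constraint $x_i \neq x_j$ over three colors; the iteration leaves these values unchanged and does not detect unsatisfiability. With only two colors on a triangle, by contrast, all clusters collapse to the empty set.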
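Let us instantiate this construction to 2-disjunctive constants. First, we note that the relational domain $C[U]$ is finite and thus, in particular, 2-nice. Let $\Psi = \langle s_p \rangle_{p \in [\mathcal{X}]_2}$ denote a collection with $s_p \in C[U]^p$ for all $p$. Assume that $\mathcal{X}$ consists of $n$ variables, and let $m$ be the number of constants occurring in any of the $s_p$. According to the normal form (7), the lattice $I_{C[U]}[\Psi]_p$ has height at most $m$ if $p$ consists of a single variable, and height bounded by $m^2$ if $p$ is a two-element set. Since there are $\frac{1}{2} n(n+1)$ clusters, fixpoint iteration will terminate after $O(n^2 \cdot m^2)$ updates. □
Due to NP-hardness of satisfiability, we cannot expect the greatest solution of the constraint system for 2-disjunctive constants to always return the exact normal form. For the formula from Example 3, e.g., it returns for each pair $\{x_i, x_j\} \in E$, $i \neq j$, a formula different from $\bot$, although the formula itself is equivalent to $\bot$.
For a relational domain $\mathcal{R}$, we call a collection $R = \langle r_p \rangle_{p \in [\mathcal{X}]_2}$ with $r_p \in \mathcal{R}^p$ for all $p$ stable if it is a solution of the constraint system (8) with $s_p = r_p$. We remark that stability of $R$ implies that, if $r_p = \bot$ for some $p$, then $r_{p'} = \bot$ for all other $p' \in [\mathcal{X}]_2$ as well. Now we introduce for a relational domain $\mathcal{R}$ the domain $\mathcal{R}^\sharp_2$ of all stable collections. The ordering $\sqsubseteq^\sharp$ on the domain $\mathcal{R}^\sharp_2$ is defined by $R \sqsubseteq^\sharp R'$ iff $r_p \sqsubseteq r'_p$ for all $p \in [\mathcal{X}]_2$ when $R = \langle r_p \rangle_{p \in [\mathcal{X}]_2}$ and $R' = \langle r'_p \rangle_{p \in [\mathcal{X}]_2}$.
Abstract join as well as abstract restriction for $\mathcal{R}^\sharp_2$ then is modeled along the definitions of join and restriction for $\mathcal{R}_2$, but refers to the representation as solution to the constraint system (8). For $R = \langle r_p \rangle_{p \in [\mathcal{X}]_2}$, $R' = \langle r'_p \rangle_{p \in [\mathcal{X}]_2}$ in $\mathcal{R}^\sharp_2$, we define the abstract join by
$R \sqcup^\sharp R' = \langle r_p \sqcup r'_p \rangle_{p \in [\mathcal{X}]_2}$
while for $Y \subseteq \mathcal{X}$ and $R = \langle r_p \rangle_{p \in [\mathcal{X}]_2}$, we define abstract restriction by
$R|^\sharp_Y = \langle r_p|_Y \rangle_{p \in [\mathcal{X}]_2} = \langle r_p|_{Y \cap p} \rangle_{p \in [\mathcal{X}]_2}$
where the latter equality follows since for $r_p \in \mathcal{R}^p$, $r_p|_p = r_p$. We have:
Proposition 2 Assume that $\mathcal{R}$ is 2-nice and satisfies (6). Then for $R, R' \in \mathcal{R}^\sharp_2$, the abstract join $R \sqcup^\sharp R'$ is contained in $\mathcal{R}^\sharp_2$ and is the least upper bound of $R, R'$. Moreover, the abstract restriction $R|^\sharp_Y$ is contained in $\mathcal{R}^\sharp_2$ for every $Y \subseteq \mathcal{X}$.
Proof For the first statement, we show that $R \sqcup^\sharp R' \in \mathcal{R}^\sharp_2$, i.e., that the collection $r_p \sqcup r'_p$, $p \in [\mathcal{X}]_2$, is a solution of the constraints in (8). For this, we calculate:
$r_{\{x,y\}} \sqcup r'_{\{x,y\}} \sqsubseteq (r_{\{x,z\}} \sqcap r_{\{z,y\}})\big|_{\{x,y\}} \sqcup (r'_{\{x,z\}} \sqcap r'_{\{z,y\}})\big|_{\{x,y\}}$
$= ((r_{\{x,z\}} \sqcap r_{\{z,y\}}) \sqcup (r'_{\{x,z\}} \sqcap r'_{\{z,y\}}))\big|_{\{x,y\}}$
$\sqsubseteq ((r_{\{x,z\}} \sqcup r'_{\{x,z\}}) \sqcap (r_{\{z,y\}} \sqcup r'_{\{z,y\}}))\big|_{\{x,y\}}$
for all variables $x, y, z \in \mathcal{X}$. From that, the statement follows.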
To prove the statement on restriction, we must verify that the collection $r_p|_{Y \cap p}$, $p \in [\mathcal{X}]_2$, satisfies all constraints in (8). Indeed, we find by monotonicity,
$r_{\{x,y\}}\big|_Y \sqsubseteq (r_{\{x,z\}} \sqcap r_{\{z,y\}})\big|_{\{x,y\} \cap Y} \sqsubseteq (r_{\{x,z\}}\big|_Y \sqcap r_{\{z,y\}}\big|_Y)\big|_{\{x,y\} \cap Y}$
for all $x, y, z \in \mathcal{X}$, and the claim follows. The remaining claims then follow from the definitions. □
Elements of $\mathcal{R}^\sharp_2$ are collections $\langle r_p \rangle_{p \in [\mathcal{X}]_2}$. For every $p \in [\mathcal{X}]_2$, we can consider elements $r_p \in \mathcal{R}^p$ as elements of $\mathcal{R}^\sharp_2$ as well by assuming that $r_p$ represents the stable collection $\langle r_p|_q \rangle_{q \in [\mathcal{X}]_2}$.
According to Proposition 2, both joins and restrictions can be computed componentwise. As a consequence, we find:
Theorem 3 For a 2-nice relational domain $\mathcal{R}$ which satisfies (6), the domain $\mathcal{R}^\sharp_2$ is a 2-decomposable relational domain. □
Fig. 1 shows the abstract relational domains $\mathcal{R}$, $\mathcal{R}_2$, and $\mathcal{R}^\sharp_2$ together with the mappings between them. According to Theorem 3, the domain $C^\sharp_2[U]$ of abstract 2-disjunctive constants is indeed 2-decomposable. The given construction provides us with polynomial algorithms for least upper bound, greatest lower bound, and projection.
Fig. 1: The relationship between abstract relational domains.

Assignments
Let us return to the relational domain $C_2[U]$ of 2-disjunctive constants and indicate how abstract transformers for assignments $x := s$ can be tailored. For 2-disjunctive constants, we only consider right-hand sides $s$ where $s$ is either $?$ (unknown value), or of the form $A \mid y_1 \mid \ldots \mid y_k$ where $A$ is a set of constants and $y_1, \ldots, y_k \in \mathcal{X}$ are variables. Generalizing the corresponding abstract semantics for (copy) constant propagation, a logic transformer $\llbracket x := s \rrbracket_2$ for $C_2[U]$ is defined relative to the concrete semantics $\llbracket x := s \rrbracket$ of such an assignment.
Proposition 3
1. The logic transformer $\llbracket x := ? \rrbracket_2$ is precise, i.e.,
$\llbracket x := ? \rrbracket\,(\gamma\,\Psi) = \gamma\,(\llbracket x := ? \rrbracket_2\,\Psi)$ (9)
In particular, it is distributive and commutes with $\bot$.
2. The logic transformer $\llbracket x := A \mid y_1 \mid \ldots \mid y_k \rrbracket_2$ is precise if the logic transformers for $x := y_j$, $j = 1, \ldots, k$, are.
Thus, we have reduced the construction of logic transformers for assignments to restriction and the construction of logic transformers for variable-variable assignments $x := y$. For $y = x$, the assignment is the identity, i.e., we set $\llbracket x := x \rrbracket_2\,\Psi = \Psi$. Therefore, assume that $y$ is different from $x$, and assume that $\Psi|_{\mathcal{X} \setminus \{x\}} = \Psi'$. The transformer $\llbracket x := y \rrbracket_2$ is then defined in terms of $\Psi'$ and a suitable set $B$ of constants.
Proposition 4 The logic transformer $\llbracket x := y \rrbracket_2$ is precise, i.e.,
$\llbracket x := y \rrbracket\,(\gamma\,\Psi) = \gamma\,(\llbracket x := y \rrbracket_2\,\Psi)$ (10)
holds. □
The same construction allows us to construct abstract logic transformers $\llbracket x := s \rrbracket^\sharp_2 : C^\sharp_2[U] \to C^\sharp_2[U]$, only that the least upper bound operation and projection of $C_2[U]$ must be replaced by the corresponding operations of $C^\sharp_2[U]$. The abstract transformer then, however, is only sound and no longer precise, since the projection operation of $C^\sharp_2[U]$ may return, for an abstract relation $R$ whose concretization is empty, an abstract relation with a non-empty concretization. Accordingly, Eq. (9) and Eq. (10) may be violated.

Guards
It remains to provide the semantics of guards. Again, we first consider the domain $C_2[U]$ of 2-disjunctive formulas (modulo logical equivalence), ordered by implication. We consider positive guards of the form $x \in A$, and conversely, negative guards of the form $x \notin A$. Positive guards can directly be expressed in $C_2[U]$. Thus we set
$\llbracket ?(x \in A) \rrbracket_2\,\Psi = \Psi \land (x \in A)$ (11)
Negative guards, on the other hand, cannot be directly expressed in $C_2[U]$, at least if there are unknown constant values beyond the finite universe $U$. To deal with this, we introduce a dedicated fresh symbol $\bullet \notin U$ with the understanding that $\bullet$ represents any value $a \notin U$. The property $x \notin A$ then can equivalently be represented by $x \in (U \cup \{\bullet\}) \setminus A$, allowing us to deal with such co-finite sets of possible values in the same way as we did for finite sets of values alone.
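The treatment of negative guards via the fresh symbol can be illustrated on the set of possible values of a single variable. The following Python sketch is a simplification under stated assumptions: it only tracks one proposition $x \in A$ rather than a full 2-disjunctive formula, and the names `OTHER`, `positive_guard`, and `negative_guard` are hypothetical.

```python
OTHER = "•"  # assumed fresh symbol standing for any value outside the universe U


def positive_guard(current, allowed):
    """Possible values of x after the guard `x in allowed`."""
    return current & allowed


def negative_guard(current, excluded, universe):
    """Possible values of x after the guard `x not in excluded`.

    The co-finite set of admissible values is represented by the finite
    set (universe | {OTHER}) - excluded.
    """
    return current & ((universe | {OTHER}) - excluded)
```

With `universe = {"a", "b", "c"}` and no prior knowledge about `x`, the guard `x not in {"a"}` leaves the values `"b"`, `"c"`, and the symbol for all remaining values.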

Directed Relational Domains
Let us now consider inequalities between variables and constants instead of equalities, and abandon disjunctions. We will, however, add disjunctions in the end as well. Thus, for now, we just consider finite conjunctions of inequalities of the form $x \sqsubseteq y$, $d \sqsubseteq x$, or $x \sqsubseteq d$, where $x, y \in \mathcal{X}$ are variables and $d$ is a value from some partial order $P$. Examples of partial orders on strings are
- the prefix ordering $\le_p$, e.g., $ab \le_p abcd$;
- the substring ordering $\le_s$, e.g., $bc \le_s abcde$;
- the scattered substring ordering $\le_{ss}$, e.g., $bd \le_{ss} abcde$.
Much more expressive constraints on strings have been studied, e.g., in [1,8,11,13]. In particular, for a fragment containing the prefix ordering, decision procedures are known based on (synchronous) multi-tape finite automata [24]. Due to their expressiveness, these techniques come with a considerable computational effort. Instead, we follow Arceri et al. [3] where basic relational domains are considered for reasoning about variables of string type, sets (of characters), or integers (lengths of strings). Their analyses relate program variables only according to some partial order, and also consider lower bounds. Here, these considerations are complemented by taking upper bounds into account as well and, eventually, by adding disjunctions. A mapping $\sigma : \mathcal{X} \to P$ is a model of $\Psi$ (relative to $P$), written as $\sigma \models \Psi$, if $\Psi \neq \bot$, and
- $d \le \sigma\,x$ (in $P$) for each constraint $d \sqsubseteq x$ in $\Psi$;
- $\sigma\,x \le d$ (in $P$) for each constraint $x \sqsubseteq d$ in $\Psi$; and
- $\sigma\,x \le \sigma\,y$ (in $P$) for each constraint $x \sqsubseteq y$ in $\Psi$.
Let $D[P]$ denote all finite conjunctions over $P$ modulo semantic equivalence where the ordering on $D[P]$ is semantic implication. As before, normal forms of conjunctions will be considered up to reordering of atomic propositions. Thus, syntactic equality of conjunctions here means equality of the respective sets of propositions. Let $\Psi$ denote a finite conjunction where $V \subseteq P$ is the set of values occurring in $\Psi$ as lower or upper bounds. To provide a first normal form for $\Psi$, we proceed in two steps. First, we determine the transitive closure $(\le \cup \sqsubseteq)^+$ on the set $\mathcal{X} \cup V$ of the constraints provided by $\Psi$. In case that $(a, b) \in (\le \cup \sqsubseteq)^+$ for $a, b \in V$ where $a \le b$ does not hold in $P$, then $\Psi$ is unsatisfiable and therefore represented by the dedicated element $\Psi' = \bot$. If this is not the case, let $\Psi'$ denote the conjunction of all inequalities $s_1 \sqsubseteq s_2$ where $(s_1, s_2) \in (\le \cup \sqsubseteq)^+$ and either $s_1$ or $s_2$ or both are in $\mathcal{X}$.
In the second step, when $\Psi' \neq \bot$, we remove all redundant constraints. These are constraints of the form
- $x \sqsubseteq x$ for $x \in \mathcal{X}$, as these constraints hold vacuously;
- $a \sqsubseteq x$ for $a \in V$ and $x \in \mathcal{X}$ if there is also a constraint $b \sqsubseteq x$ with $a \le b$, i.e., there is a stricter lower bound;
- $x \sqsubseteq b$ for $b \in V$ and $x \in \mathcal{X}$ if there is also a constraint $x \sqsubseteq a$ with $a \le b$, i.e., there is a stricter upper bound.
Additionally, we set $\Psi'$ to $\bot$ whenever for some variable $x$,
- there is no lower bound in $P$ for the set of upper bounds provided for $x$ by $\Psi$; or
- there is no upper bound in $P$ for the set of lower bounds provided for $x$ by $\Psi$.
Assume, e.g., that $\Psi$ is given by
$(abc \sqsubseteq x) \land (abd \sqsubseteq x)$
where we consider the prefix order $\le_p$ on strings. Since $abc$, $abd$ cannot be prefixes of the same string, this conjunction is considered equivalent to $\bot$.
Let us denote the resulting conjunction $\Psi'$ by $\mathsf{nf}_0[\Psi]$ and call it the 0-normal form of $\Psi$. Assuming that comparisons of values as well as checks for common lower or upper bounds are constant-time operations, 0-normal forms can be computed in polynomial time.
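The two steps of the 0-normal form can be sketched in Python for the prefix ordering on strings. This is an illustration under stated assumptions rather than the paper's implementation: terms are tagged pairs `("var", name)` or `("val", string)`, `None` stands for $\bot$, and the check for a common upper bound uses the fact that, under the prefix ordering, a set of strings has a common extension iff its elements are pairwise comparable (upper bounds always share the empty string as a lower bound). All function names are hypothetical.

```python
def is_prefix(a, b):
    return b.startswith(a)


def var(name):
    return ("var", name)


def val(s):
    return ("val", s)


def nf0_prefix(constraints):
    """0-normal form of a conjunction of pairs (lhs, rhs), or None for bottom."""
    constraints = set(constraints)
    terms = {t for c in constraints for t in c}
    values = [t for t in terms if t[0] == "val"]
    variables = [t for t in terms if t[0] == "var"]
    # Step 1: transitive closure of the constraints and the order on values.
    leq = set(constraints)
    leq |= {(a, b) for a in values for b in values if is_prefix(a[1], b[1])}
    changed = True
    while changed:
        changed = False
        for (s, t) in list(leq):
            for (u, v) in list(leq):
                if t == u and (s, v) not in leq:
                    leq.add((s, v))
                    changed = True
    # A derived inequality between values that fails in P means bottom.
    for (a, b) in leq:
        if a[0] == "val" and b[0] == "val" and not is_prefix(a[1], b[1]):
            return None
    # Keep inequalities mentioning a variable; x <= x is dropped as vacuous.
    mixed = {(s, t) for (s, t) in leq if s != t and "var" in (s[0], t[0])}
    lows = {x: [a for (a, y) in mixed if y == x and a[0] == "val"] for x in variables}
    ups = {x: [b for (y, b) in mixed if y == x and b[0] == "val"] for x in variables}
    # Lower bounds of a variable need a common extension.
    for x in variables:
        for a in lows[x]:
            for b in lows[x]:
                if not (is_prefix(a[1], b[1]) or is_prefix(b[1], a[1])):
                    return None
    # Step 2: remove bounds for which a stricter bound exists.
    result = set()
    for (s, t) in mixed:
        if s[0] == "val" and any(b != s and is_prefix(s[1], b[1]) for b in lows[t]):
            continue
        if t[0] == "val" and any(a != t and is_prefix(a[1], t[1]) for a in ups[s]):
            continue
        result.add((s, t))
    return result
```

For the conjunction with lower bounds `abc` and `abd` for the same variable, the sketch returns `None`, in line with the example above. The closure loop is deliberately naive; it is polynomial but not tuned for efficiency.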

Lattice Domains
An important special case is when $P$ is a lattice, i.e., a partial order (po) where every two elements $a, b$ have both a least upper bound $a \vee b$ and a greatest lower bound $a \wedge b$.

Example 4
The po $2^U$ ordered by subset inclusion is a complete lattice and thus, in particular, a lattice. The integers $\mathbb{Z}$ with the natural ordering are another example of a lattice, this time without least or greatest element. Yet another example are multisets: this lattice has a least, but no greatest element.
The po $\Sigma^*$ of strings ordered by the prefix relation is not a lattice. $\Sigma^*$ provides a least element $\epsilon$, as well as greatest lower bounds, namely, the maximal common prefix, but does not have least upper bounds to all pairs of strings. There is, for example, no upper bound to $abc$ and $abd$ in $\Sigma^*$. □
When $P$ is a lattice, we can provide a dedicated normal form which, however, may now use constants from $P$ which did not occur in $\Psi$ before. Assume now that $\Psi'$ is the 0-normal form of $\Psi$. If $P$ has a least element $\bot_P$, we add the vacuous constraint $\bot_P \sqsubseteq x$ for every variable $x$. Likewise, if $P$ has a greatest element $\top_P$, we add the constraint $x \sqsubseteq \top_P$. If $\Psi'$ is different from $\bot$, we subsequently simplify $\Psi'$ further by replacing for each variable $x \in \mathcal{X}$,
- the set of upper bound constraints occurring in $\Psi'$, if it is non-empty and consists of $(x \sqsubseteq b_1) \land \ldots \land (x \sqsubseteq b_r)$, with the single constraint $(x \sqsubseteq (\bigwedge_{i=1}^{r} b_i))$;
- the set of lower bound constraints in $\Psi'$, if it is non-empty and consists of $(a_1 \sqsubseteq x) \land \ldots \land (a_r \sqsubseteq x)$, with the single constraint $((\bigvee_{i=1}^{r} a_i) \sqsubseteq x)$.
Let us denote the resulting formula by $\mathsf{nf}_1[\Psi]$ and call it the 1-normal form of $\Psi$. The 1-normal form of $\Psi$ can be computed in polynomial time as well, given that comparisons as well as pairwise least upper bounds and greatest lower bounds in $P$ are constant time. We have:
Theorem 4 Assume that the po $P$ is a lattice. Then the following holds:
1. $\Psi$ is satisfiable iff $\mathsf{nf}_1[\Psi] \neq \bot$.
2. Two conjunctions are equivalent iff their 1-normal forms syntactically coincide.
3. Satisfiability as well as implication are decidable in polynomial time.
□

Proof If $\Psi' = \mathsf{nf}_1[\Psi] = \bot$, then $\Psi$ cannot be satisfiable since each of the simplification steps preserves the set of satisfying assignments. So, assume that $\Psi'$ is syntactically different from $\bot$. Let $\sigma$ be the variable assignment which maps each variable $x$ to its lower bound $a_x \in P$ if it exists, and otherwise to some fixed element $a$ which is less than or equal to every lower bound mentioned in $\Psi'$. Then all single variable constraints are satisfied as well as, by transitivity, all constraints $x \subseteq y$ occurring in $\Psi'$. Therefore, $\sigma \models \Psi$, implying that $\Psi$ is satisfiable. From this, statement (1) follows.
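The argument for statement (1) can be made concrete for the lattice $\mathbb{Z}$. The following is a minimal sketch, not the authors' implementation: the names `normalize` and `witness` and the dictionary-based representation are hypothetical, and the sketch assumes that normalization amounts to closing the variable constraints under transitivity and propagating bounds along them. As a small deviation from the text, the fallback element is chosen below all bounds mentioned, upper bounds included.

```python
from itertools import product

NEG_INF, POS_INF = float("-inf"), float("inf")

def normalize(variables, lower, upper, var_le):
    """Hypothetical normalization over the integers.

    variables: list of variable names
    lower/upper: dict mapping a variable to a list of constant bounds
    var_le: set of pairs (x, y) standing for the constraint x <= y
    Returns None for bottom, otherwise (lb, ub, closure).
    """
    # Reflexive-transitive closure of the variable constraints
    # (Floyd-Warshall order: the intermediate variable k is outermost).
    le = {(x, x) for x in variables} | set(var_le)
    for k, i, j in product(variables, repeat=3):
        if (i, k) in le and (k, j) in le:
            le.add((i, j))
    lb, ub = {}, {}
    for x in variables:
        # Lower bounds flow upward along z <= x; in Z their join is max.
        lb[x] = max((a for z in variables if (z, x) in le
                     for a in lower.get(z, [])), default=NEG_INF)
        # Upper bounds flow downward along x <= z; in Z their meet is min.
        ub[x] = min((b for z in variables if (x, z) in le
                     for b in upper.get(z, [])), default=POS_INF)
        if lb[x] > ub[x]:
            return None  # contradictory bounds: unsatisfiable
    return lb, ub, le

def witness(variables, lb, ub):
    """Map x to its lower bound if present, else to one fixed small element."""
    finite = [v for v in list(lb.values()) + list(ub.values())
              if v not in (NEG_INF, POS_INF)]
    floor = min(finite, default=0)  # assumption: below every bound mentioned
    return {x: (lb[x] if lb[x] != NEG_INF else floor) for x in variables}
```

For instance, for the constraints $3 \subseteq x$, $x \subseteq y$, $y \subseteq 5$, the sketch propagates the lower bound 3 to $y$ and the upper bound 5 to $x$, and the witness maps both variables to 3.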
To prove statement (2), consider conjunctions $\Psi'_1, \Psi'_2$, both in 1-normal form. If these syntactically coincide, then obviously also $\Psi'_1 \iff \Psi'_2$ holds. For the reverse direction, we prove that if the $\Psi'_i$ are distinct, then they cannot be equivalent. From that, the assertion follows. If one of them equals $\bot$ and the other does not, then by statement (1), they cannot be equivalent. Therefore, assume that both are satisfiable and thus different from $\bot$. We consider all cases in which the $\Psi'_i$ may differ.
Lower bounds. First, assume that there are constraints $a_i \subseteq x$, $i = 1, 2$, for some variable $x$ in $\Psi'_i$ where $a_1$ is different from $a_2$. Assume w.l.o.g. that $a_1 \not\leq a_2$ holds. Let $L_x$ denote the set consisting of $x$ together with variables $z \in \mathcal{X}$ where $\Psi'_2$ has a constraint $z \subseteq x$. Let $\sigma$ denote some assignment with $\sigma \models \Psi'_2$. Then we construct a variable assignment $\sigma'$ such that $\sigma' \models$

But since $a_1 \not\leq a_2$, it follows that $\sigma'$ does not satisfy $a_1 \subseteq x$ and thus it does not model $\Psi'_1$. If there is a constraint $a_1 \subseteq x$ in $\Psi'_1$, but no lower bound constraint for $x$ in $\Psi'_2$, then there is some value $\bot \in P$ different from $a_1$ so that $\bot \leq a_1 \wedge \sigma\,x$ holds. This value allows us to construct an analogous distinguishing assignment $\sigma'$ where we use $\bot$ instead of $a_2$.

Upper bounds. First, assume that there are constraints $x \subseteq b_i$, $i = 1, 2$, for some variable $x$ in $\Psi'_i$ where $b_1$ is different from $b_2$. W.l.o.g., assume that $b_2 \not\leq b_1$. Let $U_x \subseteq \mathcal{X}$ denote the subset consisting of $x$ together with all unknowns $z$ where $\Psi'_2$ has a constraint $x \subseteq z$. Let $\sigma$ denote some assignment with $\sigma \models \Psi'_2$. Then we construct a variable assignment $\sigma'$ by:

If there is a constraint $x \subseteq b_1$ in $\Psi'_1$, but no upper bound constraint for $x$ in $\Psi'_2$, we introduce a value $\top \in P$ which is different from $b_1$ with $(b_1 \vee \sigma\,x) \leq \top$, and construct an analogous distinguishing assignment $\sigma'$, only that we use $\top$ instead of $b_2$.

Variable Constraints. Assume that, w.l.o.g., $\Psi'_1$ has a constraint $(x \subseteq y)$ for $x, y \in \mathcal{X}$ which does not occur in $\Psi'_2$, where we assume that for every variable $z \in \mathcal{X}$ both lower and upper bounds are provided by $\Psi'_1$ iff they are provided by $\Psi'_2$ and that, whenever they are provided, they agree. Consider again the set $U_x$ of $x$ together with all variables $z$ with constraints $x \subseteq z$, and the set $L_y$ of $y$ together with all variables $z$ with constraints $z \subseteq y$ occurring in $\Psi'_2$. Since $x \subseteq y$ does not occur in $\Psi'_2$, $U_x \cap L_y = \emptyset$.
Let $\sigma$ denote an assignment with $\sigma \models \Psi'_2$. First assume that $\Psi'_2$ has constraints $x \subseteq b$ and $a \subseteq y$. From $x \subseteq y$ not occurring in $\Psi'_2$, it follows that $b \not\leq a$. Now we construct an assignment $\sigma'$ by: $x = b$ and $\sigma'\,y = a$. As $b \not\leq a$, $\sigma'$ does not fulfill the constraint $x \subseteq y$ from $\Psi'_1$. If no upper bound of $x$ is provided, we choose some value $b$ strictly larger than $\sigma\,x \vee \sigma\,y$, and define a variable assignment $\sigma'$ by $\sigma'\,z = b \vee \sigma\,z$ for $z \in U_x$, and $\sigma'\,z = \sigma\,z$ otherwise. Then $\sigma' \models \Psi'_2$. In order to additionally satisfy $x \subseteq y$, we would have $\sigma'\,x = b \vee \sigma\,x = b \leq \sigma'\,y$, which is impossible. Likewise, if no lower bound of $y$ is provided, we choose some value $a$ strictly less than $\sigma\,x \wedge \sigma\,y$, and define a variable assignment $\sigma'$ by $\sigma'\,z = a \wedge \sigma\,z$ for $z \in L_y$, and $\sigma'\,z = \sigma\,z$ otherwise. Then $\sigma' \models \Psi'_2$. In order to additionally satisfy $x \subseteq y$, we would have $\sigma'\,x = \sigma\,x \leq \sigma'\,y = a$, which again is impossible.
□

For lattices, therefore, the construction of normal forms allows deciding satisfiability as well as semantic implication. Among our examples, sets, integers, and multisets are lattices. Strings ordered by the prefix relation, on the other hand, do not form a lattice. This po, however, is bounded-complete. Recall that a po $P$ is bounded-complete if every subset $A \subseteq P$ which has some upper bound also has a least upper bound. When $P$ is bounded-complete, then we at least know that

- every non-empty subset $B \subseteq P$ has a greatest lower bound; and
- $P$ has a least element $\bot_P$.

Thus, every formula $\Psi$ over a bounded-complete po $P$ which provides some upper bound to every variable $x \in \mathcal{X}$ also can be brought into 1-normal form. Let us call such conjunctions bounded. We obtain:

Proposition 5 Given a po $P$ that is bounded-complete, the following holds:

When we drop the extra assumption that conjunctions are bounded, Proposition 5 need no longer hold.
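The bounded-completeness of the prefix ordering can be illustrated with a small sketch. The helper names `is_prefix`, `glb` and `lub` are hypothetical and only serve the illustration: the greatest lower bound of a non-empty collection of strings is their longest common prefix, while a least upper bound exists only if the strings form a chain, in which case it is the longest of them.

```python
from os.path import commonprefix

def is_prefix(a, b):
    """a is below b in the prefix ordering."""
    return b.startswith(a)

def glb(strings):
    """Greatest lower bound of a non-empty collection: longest common prefix."""
    return commonprefix(list(strings))

def lub(strings):
    """Least upper bound if some upper bound exists, otherwise None.

    Strings have a common upper bound in the prefix ordering only if every
    one of them is a prefix of the longest one.
    """
    strings = list(strings)
    longest = max(strings, key=len)
    if all(is_prefix(s, longest) for s in strings):
        return longest
    return None
```

With these helpers, `glb(["abc", "abd"])` yields `"ab"`, whereas `lub(["abc", "abd"])` yields `None`, matching the observation that $abc$ and $abd$ have no upper bound in $\Sigma^*$.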
Example 5 For prefixes of strings, consider the conjunction

This formula is semantically equivalent to

$(ab \subseteq x) \wedge (x \subseteq ab) \wedge (abd \subseteq y) \wedge (x \subseteq y)$

although the formulas are syntactically different. Even without upper bounds, not all implications can be inferred via transitive closure alone. Again for prefixes of strings, consider

$(abc \subseteq y_1) \wedge (abd \subseteq y_2) \wedge (x \subseteq y_1) \wedge (x \subseteq y_2) \wedge (ab \subseteq z)$

The first four constraints imply that $x \subseteq ab$, which, by the last constraint, implies that $x \subseteq z$ must hold as well. □

and otherwise, yield the conjunction of all constraints in $\Psi$ that only use variables from $Y$.
For conjunctions $\Psi_1, \Psi_2$ in 1-normal form and different from $\bot$, we define the abstract join $\Psi_1 \sqcup^\sharp \Psi_2$ as the conjunction of the following constraints:

- all constraints $x \subseteq y$, $x, y \in \mathcal{X}$, which occur both in $\Psi_1$ and $\Psi_2$;
- all constraints $(d_1 \wedge d_2) \subseteq x$, $d_1, d_2 \in P$, $x \in \mathcal{X}$, where $d_i \subseteq x$ occurs in $\Psi_i$;
- all constraints $x \subseteq (d_1 \vee d_2)$, $d_1, d_2 \in P$, $x \in \mathcal{X}$, where $x \subseteq d_i$ occurs in $\Psi_i$.
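The three kinds of constraints above can be sketched for the lattice $\mathbb{Z}$, where the meet of two constants is their minimum and the join is their maximum. The representation of a 1-normal form as a triple `(lb, ub, var_le)` and the function name `join` are assumptions made for this illustration only; both arguments are assumed to be in 1-normal form and different from $\bot$.

```python
def join(psi1, psi2):
    """Abstract join of two 1-normal forms over the integers.

    Each argument is a triple (lb, ub, var_le):
      lb, ub: dicts mapping a variable to its single lower/upper bound
              (a variable is absent if it has no such bound)
      var_le: set of pairs (x, y) standing for the constraint x <= y
    """
    lb1, ub1, le1 = psi1
    lb2, ub2, le2 = psi2
    # Lower bounds: keep x only if bounded in both, with the meet (min in Z).
    lb = {x: min(lb1[x], lb2[x]) for x in lb1.keys() & lb2.keys()}
    # Upper bounds: keep x only if bounded in both, with the join (max in Z).
    ub = {x: max(ub1[x], ub2[x]) for x in ub1.keys() & ub2.keys()}
    # Variable constraints: keep those occurring in both arguments.
    return lb, ub, le1 & le2
```

A bound that is present in only one argument is dropped, since the other argument places no corresponding restriction on that variable.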
Then we have:

Theorem 5 Assume that $P$ is a lattice.

i.e., all information about upper bounds is lost.

The General Case
For general (even finite) partial orders, the dedicated constructions for lattices cannot be directly applied.
Already the problem of determining whether or not a conjunction is satisfiable turns out to be surprisingly difficult. Assume that elements in $P$ can be represented and compared in polynomial time. Then we find:

Theorem 6 The problem of determining, for a given partial order $P$ and a conjunction $\Psi$, whether $\Psi$ is satisfiable over $P$ is NP-complete.
Proof Since a satisfying assignment for a conjunction $\Psi$ can be guessed in polynomial time, it remains to prove the hardness part. For that, consider the problem of 3-colorability of an undirected finite graph $G = (V, E)$. Let $v_1, \ldots, v_n$ be an enumeration of the vertices in $V$. Then, we construct a partial order $P$ consisting of the elements

where the partial ordering $\leq$ of $P$ is the least partial order satisfying

$\langle v_i, c \rangle \leq \langle v_j, c' \rangle$ whenever $\{v_i, v_j\} \in E \wedge i < j \wedge c \neq c'$
$\langle v_i, c \rangle \leq v_i$ whenever $\exists j > i.\ \{i, j\} \in E$
$v_j \leq \langle v_j, c \rangle$ whenever $\exists i < j.\ \{i, j\} \in E$

For $P$, we define a conjunction $\Psi$ in the variables $x_i$, $i = 1, \ldots, n$, by

$\bigwedge_{\{v_i, v_j\} \in E,\ i < j} (x_i \subseteq v_i) \wedge (x_i \subseteq x_j) \wedge (v_j \subseteq x_j)$

Both $P$ and $\Psi$ can be constructed from $G$ in polynomial time. Moreover, it holds that $\sigma \models \Psi$ iff $\sigma\,x_i = \langle v_i, c_i \rangle$ for some coloring $\gamma : V \to \{1, 2, 3\}$ with $\gamma\,v_i = c_i$. It follows that $\Psi$ is satisfiable iff $G$ has a 3-coloring. In summary, we obtain a polynomial time reduction from the problem of 3-colorability of undirected finite graphs to satisfiability of finite conjunctions over some partial order. This concludes the proof. □
For general partial orders $P$, however, we still may rely on the 0-normal form $\mathsf{nf}_0$ and otherwise perform the same constructions as we did for lattices with the 1-normal form. Thus, we define an abstract ordering by

Let us denote the resulting abstract domain by $\mathcal{D}[P]_0$. We have:

Theorem 7 For an arbitrary po $P$, the following holds:
1. If a conjunction $\Psi$ is satisfiable over $P$, then $\mathsf{nf}_0[\Psi] \neq \bot$.
2. For all conjunctions $\Psi_1, \Psi_2$, $\mathsf{nf}_0[\Psi_1] = \mathsf{nf}_0[\Psi_1 \wedge \Psi_2]$ implies that $\Psi_1 \implies \Psi_2$.
□

For an arbitrary po $P$, we define the abstract projection in the same way as for conjunctions over a lattice $P$, only that we now rely on formulas in 0-normal form.
For such a formula $\Psi$, the projection $\Psi|^\sharp_Y$ onto a subset $Y \subseteq \mathcal{X}$ of variables is again defined by removing all constraints mentioning variables not in $Y$.
It is for the abstract join operation that we must find a more general definition, since least upper bounds or greatest lower bounds of sets of values in $P$ are no longer at hand. Assume that $\Psi_1, \Psi_2$ are in 0-normal form and different from $\bot$. Then, we define the abstract join $\Psi_1 \sqcup^\sharp \Psi_2$ as the conjunction of the following constraints:

- all constraints $x \subseteq y$, $x, y \in \mathcal{X}$, which occur both in $\Psi_1$ and $\Psi_2$;
- all constraints $d_i \subseteq x$, $d$

The proof of the second statement is analogous, only that the occurring constants now may also be finite meets of constants occurring in upper-bound constraints of the initial collection or finite joins of constants occurring in lower-bound constraints. Still, the number of possible formulas remains finite. □

Due to Proposition 6, the construction from Section 3 can be applied resulting in the 2-decomposable relational domains $\mathcal{D}$

We exemplify the construction for the lattice $\mathbb{Z}$ of integers, i.e., for $\mathcal{D}$

Arbitrary elements in $\mathcal{D}$

Assume that we are given a collection $Z = \langle s_p \rangle_{p \in [\mathcal{X}]_2}$ with $s_p \in \mathcal{D}[\mathbb{Z}]^p$, which is not yet stable, and we would like to determine the corresponding stable collection by performing a fixpoint iteration to determine the greatest solution of Eq. (8). During that iteration, we only need to consider upper and lower bounds for each variable $x$ which have already occurred in the formulas $s_p$. Therefore, the length of each intermediate formula is bounded by a polynomial in the input, and each unknown $r_p$ is updated only polynomially often. As a consequence, the operations abstract join, abstract meet, and abstract projection for $\mathcal{D}^\sharp_2[\mathbb{Z}]$ are all polynomial. For an arbitrary lattice or po $P$, we may proceed analogously. Efficiency of the fixpoint iteration, though, remains to be checked separately for every $P$.

Assignments
Let us turn to the construction of abstract transformers for assignments. We only describe these for the relational domains $\mathcal{D}[P]$ and $\mathcal{D}[P]_0$, respectively. We first consider three simple cases: assignments of unknown values; assignments of constants; and copying one variable into the other.

$[\![x := ?]\!]^\sharp\, \Psi = \Psi|$

for $d \in P$ and $x, y \in \mathcal{X}$ with $x \not\equiv y$. Again, we realize the assignment of unknown values by restriction. For assigning constants and variables, we remark that equality can be expressed via a pair of inequalities.

Individual partial orders, though, may support further forms of right-hand sides in assignments. Subsequently, we enumerate more general forms of assignments for sets and for the prefix, substring, and scattered substring partial orders on strings.
Sets. For sets, we consider right-hand sides of the form $y_1 \cap y_2$ or $y_1 \cup y_2$ for $y_1, y_2 \in \mathcal{X}$ with $x \notin \{y_1, y_2\}$. We define

Thus, we obtain after the assignment as new upper (lower) bounds of $x$ in terms of the variables $y_1$ and $y_2$. An analogous construction can also be applied to multisets. We remark that the given right-hand sides do not entail that the equalities $x = y_1 \cap y_2$ and $x = y_1 \cup y_2$, respectively, hold after the assignments.
Prefixes. In this case, right-hand sides of interest are concatenations of a constant or variable, possibly followed by some further value, i.e., are of the form $s\,?$ for $s$ either in $\Sigma^*$ or in $\mathcal{X} \setminus \{x\}$, with "?" again denoting unknown input. We define

$[\![x := s\,?]\!]^\sharp\, \Psi = \Psi|^\sharp_{\mathcal{X} \setminus \{x\}} \wedge (s \subseteq x)$

i.e., we only obtain information about lower bounds for $x$ after the assignment but lose all information about upper bounds.

Substrings. Again, we consider right-hand sides which are concatenations of constants or variables with further values. These now are of the form $?\,s_1\,?\ldots?\,s_k\,?$ ($s_i \in \Sigma^* \cup (\mathcal{X} \setminus \{x\})$). We define

$[\![x := ?\,s_1\,?\ldots?\,s_k\,?]\!]^\sharp\, \Psi = \Psi|$

For scattered substrings, we proceed similarly. In both cases, no information is obtained for upper bounds to the left-hand side variable $x$ after the assignment.
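The transformer for prefix assignments can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: a conjunction is represented as a set of pairs `(lhs, rhs)` of tagged terms, the helpers `Var`, `Const`, `project_out` and `assign_prefix` are hypothetical names, and the input conjunction is assumed to be already in normal form, so that dropping the constraints on $x$ does not lose consequences for the remaining variables.

```python
def Var(name):
    """Tagged term for a program variable."""
    return ("var", name)

def Const(value):
    """Tagged term for a constant string."""
    return ("const", value)

def project_out(psi, x):
    """Abstract projection onto all variables except x.

    psi is a set of pairs (lhs, rhs), each standing for lhs <= rhs in the
    prefix ordering; every constraint mentioning x is removed.
    """
    return {(l, r) for (l, r) in psi if Var(x) not in (l, r)}

def assign_prefix(psi, x, s):
    """Effect of x := s ? where s is a constant or a variable other than x."""
    if s == Var(x):
        raise ValueError("right-hand side must not mention x")
    # Forget x, then record s as a lower bound of the new value of x.
    return project_out(psi, x) | {(s, Var(x))}
```

After `assign_prefix(psi, "x", Var("y"))`, the only constraint on `x` is that `y` is a prefix of it; all earlier bounds on `x` are gone, in line with the remark that upper-bound information is lost.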
So far, we have assumed that the right-hand side $s$ does not contain the variable $x$ from the left-hand side. In case $x$ occurs in $s$, we split the assignment into the sequence tmp := $s$; $x$ := tmp; for some fresh variable tmp, i.e., we first store the value of the right-hand side $s$ in tmp, whose value only then is assigned to the left-hand side variable $x$. These abstract transformers for the relational domains $\mathcal{D}[P]$ (resp. $\mathcal{D}[P]_0$) are readily lifted to corresponding transformers for the weakly relational domains $\mathcal{D}$

Guards and Negated Inequalities
Let us now turn to a treatment of guards $?c$ for the directed domain $\mathcal{D}^\sharp_2[P]$ where $P$ is a lattice. The case for $\mathcal{D}^\sharp_2[P]_0$ (when $P$ is not a lattice) is analogous. A condition $c$ which consists of an inequality $s_1 \subseteq s_2$, for $s_i$ being variables or constants, already represents an abstract relation. Therefore, Eq. (2) can be used to define the abstract effect $[\![?c]\!]^\sharp$.
If the condition $c$ is a negated inequality $s_1 \not\subseteq s_2$, this is not immediately possible. Assume that the variables occurring in $c$ all occur in $p \in [\mathcal{X}]_2$. Now consider an arbitrary element $D = \langle d_{p'} \rangle_{p' \in [\mathcal{X}]_2}$. In particular, $d_p \in \mathcal{D}[P]^p$, i.e., $d_p = e_1 \vee \ldots \vee e_k$ for conjunctions $e_1, \ldots, e_k$ all using variables from $p$ only. In this case, we define

$[\![?c]\!]^\sharp D = D \sqcap \bigvee \{ e_j \mid e_j \not\Rightarrow (s_1 \subseteq s_2) \}$

Thus, the negated inequality $c$ allows us to improve the abstract relation $D$ by possibly removing those conjunctions $e_j$ from $d_p$ which contradict $c$, i.e., which imply $s_1 \subseteq s_2$.
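The filtering step described in the prose, which drops the conjunctions contradicting the negated inequality, can be sketched for the lattice $\mathbb{Z}$. The names `implies_le` and `guard_not_le` and the triple representation `(lb, ub, var_le)` are assumptions for this illustration. The implication test is a simple syntactic check; it is sound in general, and the sketch assumes that the conjunctions are normalized and satisfiable so that the check does not miss implied inequalities.

```python
INF = float("inf")

def implies_le(e, s1, s2):
    """Check whether the conjunction e implies s1 <= s2 over the integers.

    e is a triple (lb, ub, var_le) as in a normal form: lb/ub map variables
    to their bounds, var_le holds pairs (x, y) for x <= y.
    s1, s2 are variable names (str) or integer constants.
    """
    lb, ub, var_le = e
    # Largest value s1 can take and smallest value s2 can take under e.
    hi = ub.get(s1, INF) if isinstance(s1, str) else s1
    lo = lb.get(s2, -INF) if isinstance(s2, str) else s2
    return s1 == s2 or (s1, s2) in var_le or hi <= lo

def guard_not_le(disjuncts, s1, s2):
    """Effect of the guard 'not (s1 <= s2)' on a disjunction of conjunctions.

    Keeps exactly those conjunctions that do not imply s1 <= s2.
    """
    return [e for e in disjuncts if not implies_le(e, s1, s2)]
```

The result still has to be combined with the remaining components of $D$ by the abstract meet, which this sketch leaves out.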

Conclusion
We considered a construction of 2-decomposable relational domains from arbitrary relational domains and exemplified this construction by deriving 2-disjunctive constants from the relational domain of disjunctive constants. For 2-disjunctive constants, it turned out that normalization is prohibitively expensive. Therefore, we provided a second general construction of 2-decomposable relational domains, now based on greatest solutions of constraint systems, which, in the case of disjunctive constants, results in a 2-decomposable domain where the operations join, meet, and restriction are polynomial.
In the second part, we then considered directed domains as conjunctions of inequalities over lattices or general partial orders. For lattices, we provided the 1-normal form for a syntactic characterization of semantic equivalence. We showed that the resulting domain is 2-decomposable and provided precise polynomial algorithms for 1-normalization, projection, join, and meet. For arbitrary partial orders, we use a weaker form of normalization for constructing a weaker 2-decomposable relational domain, for which we again provided polynomial algorithms, now for 0-normalization, projection, join, and meet. Only in the very last step, we added disjunctions by applying the general construction of 2-decomposable domains based on approximate normalization from the previous section. Both for 2-disjunctive constants and for directed domains, we indicated how transfer functions for assignments and guards can be constructed.
Our results can be extended in several directions. In the case of constants, one may, e.g., additionally track equalities as well as disequalities between variables; likewise for directed domains, an extensive study of the impact of negated inequalities could be of interest. Here, we only studied lattice operations and transfer functions. Directed domains, though, may have infinite strictly ascending chains. Therefore, tailored widening and narrowing operators are of interest when these domains are employed for practical static analysis.

$\mathcal{D}^\sharp_2[P]$ (in case of lattices $P$) and $\mathcal{D}^\sharp_2[P]_0$ (for arbitrary pos).
the best approximation r| Y,2 to the restriction r| Y of r P R 2 onto some subset Y Ď X of variables is given by Now, we claim that for every p P rX s 2 ,r| p [ s| p " pr| p [ s| p q ˇˇpTo prove the claim, we argue that r| p [ s| p Ď pr| p [ s| p q ˇˇp (by monotonicity)Ď pr| p q ˇˇp [ ps| p q ˇˇp (by monotonicity) For each R " xr p y pPrX s2 ,R 1 " xr 1 p y pPrX s2 in R 7 2 , the greatest lower bound R [ 7 R 1 " xr 2 p y pPrX s2 is determined as the greatest solution of (8) with start values s p " r p [ r 1 p (p P rX s 2 ).Proof For the first statement, let R " xr p y pPrX s2 and R 1 " xr 1 p y pPrX s2 .As the ordering on R 7 2 is componentwise, it suffices to prove 1| tyu equals y P B. Let Ψ y denote the conjunction of all formulas Ψ 1 | p for p P rX s 2 with y P p.Let Ψ 2 " Ψ y rx{ys denote the formula obtained from Ψ y by renaming each occurrence of the variable y with x.Then we define Let Ψ denote the formula returned by that transformer for Ψ .Intuitively, our definition means for x R p, that Ψ ˇˇp " Ψ | p , i.e., Ψ | p is preserved while additionally, Ψ ˇˇtxu " Ψ | tyu rx{ys, Ψ ˇˇtx,yu " Ž aPB x " b ^y " b, and for z R tx, yu, Ψ ˇˇtx,zu " Ψ | ty,zu rx{ys.
x :" y 2 Ψ " Ψ 1 ^˜ł aPB x " a ^y " a ¸^Ψ 2 x Ď y, or x Ď d for variables x, y P X and constant values d.As usual, we consider conjunctions only up to semantic equivalence.We call inequalities of the form d Ď x lower bound constraints, and d a lower bound for x.Analogously for upper bounds.Inequalities of the form x Ď y are called variable constraints.Assume we are given a partial order (po), i.e., a set P partially ordered by some relation ď.Examples of partial orders of interest are Subsets.The set 2 U of all subsets of some finite universe U where the ordering is subset inclusion Ď; Integers.The set Z of integers equipped with the natural ordering ď Z ; Multisets.Multisets, i.e., the set of all mappings µ : U Ñ N from elements in U to their multiplicities ordered by multiset inclusion Ď N .Strings.The set of all strings Σ ˚for some finite alphabet Σ.Several partial orderings are of interest: 1.If Ψ is a conjunction in 1-normal form, then for every subset Y Ď X , Ψ | Y is given by Ψ | For Ψ 1 , Ψ 2 in 1-normal form, Ψ 1 \ 7 Ψ 2 is the least upper bound of Ψ 1 , Ψ 2 in DrP s. 3.The domain DrP s is a 2-decomposable relational domain.[ \ ˚, e.g., we have px Ď abcq \ px Ď abdq " J 1 , d 2 P P , x P X where d i Ď x occurs in Ψ i for i " 1, 2 and d i ď d 3´i ; all constraints x Ď d i , d 1 , d 2 P P , x P X where x Ď d i occurs in Ψ i for i " 1, 2 and d 3´i ď d i .With these definitions, the binary operation \ 7 returns the least upper bound of its arguments w.r.t. 
the ordering $\subseteq^\sharp$. Moreover, $\mathcal{D}[P]_0$ turns into a 2-decomposable relational domain as well.

$\mathcal{D}[P]_0$) [9]. The elements of the resulting relational domain are disjunctions of normal form conjunctions (1-normal forms if $P$ is a lattice, and 0-normal forms in general) where for $Y \subseteq \mathcal{X}$, the restriction $\Psi|_Y$ of the disjunction $\Psi$ is defined as the disjunction of the restrictions $c|_Y$ of the normal form conjunctions $c$ contained in $\Psi$. By definition, restrictions therefore are distributive. Let $\mathcal{D}[P]$ (resp. $\mathcal{D}[P]_0$) denote the resulting relational abstract domains. If $P$ is infinite, these relational domains have infinite strictly ascending chains, and therefore must also have strictly descending chains of unbounded length. For the lattice $\mathbb{Z}$, e.g., there are even infinite strictly descending chains, e.g., $(0 \subseteq x), (1 \subseteq x), (2 \subseteq x), \ldots$

Proposition 6
1. For every po $P$, $\mathcal{D}[P]_0$ is 2-nice.
2. For every lattice $P$, $\mathcal{D}[P]$ is 2-nice.

Proof Let $D$ denote an arbitrary collection $\langle d_p \rangle_{p \in [\mathcal{X}]_2}$ with $d_p \in \mathcal{D}[P]^p_0$. Consider an arbitrary formula $d'_p$ from the set $I_{\mathcal{D}[P]}[D]_p$. It consists of disjunctions of conjunctions each of which may only mention variables from $p$ or constants occurring in any of the $d_{p'}$, $p' \in [\mathcal{X}]_2$. Since the number of these formulas is finite, statement (1) follows.