Aﬀine Loop Invariant Generation via Matrix Algebra

. Loop invariant generation, which automates the generation of assertions that always hold at the entry of a while loop, has many important applications in program analysis and formal veriﬁcation. In this work, we target an important category of while loops, namely aﬃne while loops, that are unnested while loops with aﬃne loop guards and variable updates. Such a class of loops widely exists in many programs yet still lacks a general but eﬃcient approach to invariant generation. We propose a novel matrix-algebra approach to automatically synthesizing aﬃne inductive invariants in the form of an aﬃne inequality. The main novelty of our approach is that (i) the approach is general in the sense that it theoretically addresses all the cases of aﬃne invariant generation over an aﬃne while loop, and (ii) it can be eﬃciently automated through matrix-algebra (such as eigenvalue, matrix inverse) methods. The details of our approach are as follows. First, for the case where the loop guard is a tautology (i.e., ‘ true ’), we show that the eigenvalues and their eigenvectors of the matrices derived from the variable updates of the loop body encompass all meaningful aﬃne inductive invariants. Second, for the more general case where the loop guard is a conjunction of aﬃne inequalities, our approach completely addresses the invariant-generation problem by ﬁrst establishing through matrix inverse the relationship between the invariants and a key parameter in the application of Farkas’ lemma, then solving the feasible domain of the key parameter from the inductive conditions, and ﬁnally illustrating that a ﬁnite number of values suﬃces for the key parameter w.r.t a tightness condition for the invariants to be generated. Experimental results show that compared with previous approaches, our approach generates much more accurate aﬃne inductive invariants over aﬃne while loops from existing and new benchmarks within a few seconds, demonstrating the generality and eﬃciency of our approach.


Introduction
An invariant is a logical assertion at a certain program location that always holds whenever the program executes across that location. Invariants are indispensable parts of program analysis and formal verification, and thus the generation of invariants has been key to the proof and analysis of crucial properties like reachability [17,9,3,46,14,22,6], time complexity [12] and safety [39,44,2]. To ease program analysis and formal verification, there has been a long thread of research on approaches to automatic generation of invariants, including constraint solving [34,16,13], recurrence analysis [38,24,32,36], abstract interpretation [18,20], logical inference [29,23,51,26,40,25], dynamic analysis [21,52,42], and machine learning [27,58,31]. To guarantee that an assertion is indeed an invariant, the widely-adopted paradigm is to generate an inductive invariant that holds for the first execution and for every periodic execution to the particular program location [16,39]. In this work, we consider an important subclass of invariants called numerical invariants which are assertions over the numerical values taken by the program variables, and are closely related to many common vulnerabilities like integer overflow, buffer overflow, division by zero and array out-of-bound. More specifically, we consider affine inductive invariants in the form of an affine inequality over program variables, and focus on affine while loops that have affine loop guards (as a conjunction of affine inequalities) and affine updates for the program variables but do not have nested loops.
To automate the generation of affine inductive invariants, we adopt the constraintsolving based approach with three steps. First, it establishes a template with unknown parameters for the target invariants. Second, it collects constraints derived from the inductive conditions. Finally, it solves the unknown parameters to get the desired invariants. Prior work in this space [16,50] leverages Farkas' lemma to provide a sound and complete characterization for the inductive conditions and then generates the affine inductive invariants either by the complete approach of quantifier elimination [16] or through several heuristics [50]. Specifically, the StInG invariant generator [54] implements the approach in [50], and the InvGen invariant generator [30] integrates abstract interpretation as well as the approach in [50]. Furthermore, a recent effort [43] leverages eigenvalues and eigenvectors for inferring a restricted class of invariants. Finally, some recent work considers decidable logic fragments that directly verify properties of loops [15,37,4,35]. Compared with other approaches such as machine learning and dynamic analysis, constraint solving has a theoretical guarantee on the correctness and accuracy of the generated invariants, yet typically at the cost of higher runtime complexity.
The novelty of our approach lies in that it completely addresses the constraints derived from Farkas' lemma by matrix methods, thus ensuring both generality and efficiency. In detail, this paper makes the following contributions: -For affine while loops with tautological guard, we prove that the affine inductive invariants are determined by the eigenvalues and eigenvectors of the matrices that describe variable updates in the loop body. -For affine while loops whose loop guard is a conjunction of affine inequalities, we solve the affine inductive invariants by first deriving through matrix inverse a formula with a key parameter in the application of Farkas' lemma, then solving the feasible domain of the key parameter from the inductive conditions, and finally showing that it suffices to choose a finite number of values for the key parameter if one imposes a tightness condition on the invariants. -We generalize our results to affine while loops with non-deterministic updates and to bidirectional affine invariants. A continuity property on the invariants w.r.t. the key parameter is also proved for tackling the numerical issue arising from the computation of eigenvectors. Experimental results on existing benchmarks and new benchmarks arising from linear dynamical systems demonstrate the generality and efficiency of our approach.

Related Work
Constraint Solving. There have been several prior approaches [16,50] using constraint solving for invariant generation based on Farkas' lemma. Compared to the approach in [16] that uses quantifier elimination to solve the constraints from Farkas' lemma, our approach is more efficient since it only involves the matrix computation. Compared with [50] that uses several heuristics, our approach is more general and complete in addressing all the cases in affine invariant generation. While the approach in [43] also uses eigenvectors, it is restricted to the subclass of equality and convergent invariants. In contrast, our approach targets at general affine inductive invariants over affine while loops. Other prior work [15,37,4,35] considers to have a decidable logic for unnested affine while loops with tautological guard but no conditional branches. Compared with them, our approach handles general affine while loops and targets at invariant generation.
Abstract Interpretation. A long thread of research to infer inductive invariants is using abstract interpretation [47,1,7,19,41,48,11,30] framework which constructs sound approximations for program semantics. In a nutshell, it first establishes an abstract domain for the specific form of properties to be generated, and then performs fixed-point computation in the abstract domain. Abstract interpretation generates invariants whose precision depends on the abstract domain and abstract operators, except for rare special cases [28,50].
Recurrence Analysis. Another closely-related technique is recurrence analysis [38,24,32,36,10]. The main idea is transforming the problem of invariant generation into a recurrence relation problem and then solve the latter one. The main limitation of recurrence analysis is that it requires the underlying recurrence relation to have a closed-form solution. This requirement, unfortunately, does not hold for the general case of affine inductive invariants over affine while loops.
Dynamic Analysis. Dynamic analysis [21,52,42] has also been exploited to invariant generation. The major process is first to collect the execution traces of a particular program by running it multiple times, and then guess the invariants based on these traces. As indicated in its process, dynamic analysis provides no guarantee on the correctness or accuracy of the inferred invariants, yet still pays the price of running the program at a large amount of time.
Machine Learning. There is a recent trend of applying machine learning [27,58,31] to solve the invariant-generation problem. Such approaches first establish a (typically large) training set of data, then use training approaches such as neural networks to generate invariants. Compared to our approach, those approaches require a large training set, while still having no theoretical guarantee on the correctness or accuracy. Specifically, such approaches cannot produce specific numerical values (e.g., eigenvalues) that are required to handle some examples in this work.

Preliminaries
In this section, we specify the class of affine while loops and define the affine-invariantgeneration problem over such loops. Throughout the paper, we use V = {x 1 , ..., x n } to denote the set of program variables in an affine while loop; we abuse the notation V so that it also represents the current values (before the execution of the loop body) of the original variables in V , and use the primed variables V := {x | x ∈ V } for the next values (after the execution of the loop body). Furthermore, we denote by x = [x 1 , ..., x n ] T the vector variable that represents the current values of the program variables, and by x = [x 1 , ..., x n ] T the vector variable for the next values. An affine while loop is a while loop without nested loops that has affine updates in each assignment statement and possibly multiple conditional branches in the loop body. To formally specify the syntax of it, we first define affine inequalities and assertions, program states and satisfaction relation between them as follows. Affine Inequalities and Assertions. An affine inequality φ is an inequality of the form c T · y + d ≤ 0 where c is a real vector, y is a vector of real-valued variables and d is a real scalar. An affine assertion is a finite conjunction of affine inequalities. An affine assertion is satisfiable if it is true under some assignment of real values to its variables. Given an affine assertion ψ over vector variable x, we denote by ψ the affine assertion obtained by substituting x in ψ with its next-value variable x .
Likewise, v satisfies an affine assertion ψ if it satisfies every conjunctive affine inequality in ψ. Furthermore, given an affine assertion ψ with both x and x , we say that two program states v, v satisfy ψ, written as v, v |= ψ, if ψ is true when one substitutes x by v and x by v .
We then illustrate the syntax of (unnested) affine while loops as follows. Affine While Loops. We consider affine while loops that take the form: where (i) θ is an affine assertion that specifies the initial condition for inputs and is given by the real matrix R and vector f , (ii) G is an affine assertion serving as the loop guard given by the real matrix P and vector q, and (iii) each ψ j is an affine assertion that represents a conditional branch, with the relationship between the current-state vector x and the next-state vector x given by the affine assertion τ j := T j · x − T j · x + b j ≤ 0 with transition matrices T j , T j and vector b j . In this work, we always assume that the rows of R are linearly independent (this condition means that every variable x i has one independent initial condition attached to it, which holds in most situations such as a fixed initial program state), such that R T is left invertible; we denote its left inverse as (R T ) −1 L .
The execution of an affine while loop is as follows. First, the loop starts with an arbitrary initial program state v * that satisfies the initial condition θ. Then in each loop iteration, the current program state v is checked against the loop guard G. In the case that v |= G, the loop arbitrarily chooses a conditional branch ψ j satisfying v |= ψ j , and sets the next program state v non-deterministically such that v, v |= τ j ; the next program state v is then set as the current program state. Otherwise (i.e., v |= G), the loop halts immediately. Now we define affine inductive invariants over affine while loops. Informally, an affine inductive invariant is an affine inequality satisfying the initiation and consecution conditions which mean that the inequality holds at the start of the loop (initiation) and is preserved under every iteration of the loop body (consecution). Affine Inductive Invariants. An affine inductive invariant for an affine while loop ( †) is an affine inequality Φ that satisfies the initiation and consecution conditions as follows: From the definition above, it can be observed that an affine inductive invariant is an invariant, in the sense that every program state traversed (as a current state at the start or after every loop iteration) in some execution of the underlying affine while loop will satisfy the affine inductive invariant. From now on, we abbreviate affine while loops as affine loops and affine inductive invariants as affine invariants. Problem Statement. In this work, we study the problem of automatically generating affine invariants over affine loops. Our aim is to have a complete mathematical characterization on all such invariants and develop efficient algorithms for generating these invariants.

Affine Invariants via Farkas' Lemma
Affine invariant generation through Farkas' lemma is originally proposed in [16,50]. Farkas' lemma is a fundamental result in the theory of linear inequalities that leads to a complete characterization for the affine invariants. Since our approach is based on Farkas' lemma, we present a detailed account on the approaches in [16,50], and point out the weakness of each of the approaches.
Theorem 1 (Farkas' Lemma). Consider the following affine assertion S over real-valued variables y 1 , . . . , y n : when S is satisfiable, it entails a given affine inequality φ : c 1 y 1 + ... + c n y n + d ≤ 0 if and only if there exist non-negative real numbers λ 0 , . . . , λ k such that (i) The application of Farkas' lemma can be visualized by a table form as follows: λ k a k1 y 1 + ... + a kn y n +b k ≤ 0 The intuition of the table form above is that one first multiplies the λ i 's on the left to their corresponding affine inequalities (in the same row) on the right, and then sums these affine inequalities together to obtain the affine inequality at the bottom.
In this paper, we will call the table form as Farkas table.
Given an affine loop as ( †), the approaches in [16,50] first establish a template Φ : c 1 x 1 +...+c n x n +d ≤ 0 for an affine invariant where c 1 , . . . , c n , d are the unknown coefficients. Second, they establish constraints for the unknown coefficients from the initiation and consecution conditions for an affine invariant, as follows. Initiation. By Farkas' lemma, the initiation condition can be solved from the Farkas table ( ‡) with S := θ and φ := Φ: Here we rephrase the affine inequalities in θ and Φ with the condensed matrix forms R · x + f ≤ 0 and c T · x + d ≤ 0; we also use λ = [λ 1 , . . . , λ k ] T to denote the non-negative parameters in the leftmost column of ( ‡).
Consecution. The consecution condition can be solved by handling each conditional branch (specified by τ j , ψ j in ( †)) separately. By Farkas' lemma, we treat each conditional branch by the Farkas table ( ‡) with S := Φ ∧ G ∧ τ j and φ := Φ : Note that the Farkas table above contains quadratic constraints as we multiply an unknown non-negative parameter µ to the unknown invariant c T · x + d ≤ 0 in the table. The Farkas tables for all conditional branches are grouped conjunctively together to represent the whole consecution condition. The weakness of the approaches presented in [16,50] lies at the treatment of the quadratic constraints from the consecution condition. The approach in [16] addresses the quadratic constraints by quantifier elimination that guarantees the theoretical completeness but typically has high runtime complexity. The approach in [50] solves the quadratic constraints by several heuristics that guess possible values for the key parameter µ in ( * ) which causes non-linearity, hence losing completeness. Our approach considers to address parameter µ through matrix-based methods (eigenvalues and eigenvectors, matrix inverse, etc.), which is capable of efficiently generating affine invariants (as compared with quantifier elimination in [16]) while still ensuring theoretical completeness (as compared with the heuristics in [50]).

Single-Branch Affine Loops with Deterministic Updates
For the sake of simplicity, we first consider the affine invariant generation for a simple class of affine loops where there are no conditional branches in the loop body and the updates of the next-value vector x are deterministic.
Formally, an affine loop with deterministic updates and a single branch takes the following form: initial condition θ : For the loop above, we aim at non-trivial affine invariants, i.e., affine invariants We summarize our results below.
1. When the loop guard is 'true', there are only finitely many independent nontrivial invariants c T · x + d ≤ 0 where c is an eigenvector of the transpose of the transition matrix T. 2. When the loop guard is not a tautology, there can be infinitely many more non-trivial invariants c T · x + d ≤ 0 with c given by a direct formula in µ; in this case we derive the feasible domain of µ and select finitely many optimal ones (which we call tight choices) among them.
In Section 4.1, we first derive the constraints from the initiation (#) and consecution ( * ) conditions satisfied by the invariants. Then we solve these constraints for the tautological loop guard case in Section 4.2 and the single-constraint loop guard case in Section 4.3. Finally we generalize the results to the multi-constraint loop guard case in Section 4.4.

Derived Constraints from the Farkas Tables
We first derive the constraints from the Farkas tables as follows: Initiation. Recall the Farkas table (#) for initiation. We first compare the coefficients of x above and below the horizontal line in (#), and obtain Then by comparing the constant terms in (#), we have: Consecution. The Farkas table ( * ) for consecution in the case of single-branch affine loops with deterministic updates is as follows: Here the transition matrix T is a n×n square matrix, and b is a n-dimensional vector.
Since τ contains only equalities, the components η 1 , ..., η n of the vector parameter η do not have to be non-negative (while the components ξ 1 , ..., ξ n of ξ and µ must be non-negative). In this table, by comparing the coefficients of x above and below the horizontal line, we easily get −η = c. Then we substitute η by −c and compare the coefficients of x above and below the horizontal line. We get We also compare the constant terms and get The rest of this section is devoted to solving the invariants Φ : c T · x + d ≤ 0 which satisfy all constraints (1)-(5).

Loops with Tautological Guard
We first consider the simplest case where the loop guard is 'true': In order for completely solving the non-linear constraints, we take three steps: 1. choose the correct µ, thus turn the non-linear constraints into linear ones; 2. use linear algebra method to solve out the vector c; 3. with µ and c known, find out the feasible domain of d and determine the optimal value of it. Here 'optimality' is defined by the fact that all invariants with other d's in this domain are implied by the invariant with the 'optimal' d.
Step 1 and Step 2. We address the values of µ, c by eigenvalues and eigenvectors in the following proposition: Proposition 1. For any non-trivial invariant c T · x + d ≤ 0 of the loop ( ), we have that c must be an eigenvector of T T with a non-negative eigenvalue µ.
Proof. Since the loop guard is a tautology, we take the parameter ξ to be 0 in (4): It's obvious that µ must be a non-negative eigenvalue of T T and c is the corresponding eigenvector.
Example 1 (Fibonacci numbers). Consider the sequence {s n } defined by initial condition s 1 = s 2 = 1 and recursive formula s n+2 = s n+1 + s n for n ≥ 1. If we use variables (x 1 , x 2 ) to represent (s n , s n+1 ), then the sequence can be written as a loop: ; only the second one is non-negative. This eigenvalue µ = 1+ here c 1 is a free variable, which could be fixed in the final form of the invariant.
Step 3. After solving µ and c, we illustrate the feasible domain of d and its optimal value by the following proposition: For any µ and c given by Proposition 1, the feasible domain of d is an interval determined by the two conditions below: If the above conditions have empty solution set, then no affine invariant is available from such µ and c; otherwise, the optimal value of d falls in one of the two choices: (3) provides one condition for d: (5) with ξ = 0 provides the other condition: To obtain the strongest inequality c T · x + d ≤ 0, we need to take d to be either minimal or maximal value, i.e., some boundary point of its interval; thus the invariant with this d would imply all invariants with the same c and other d's in this interval. The boundary is achieved when one of the two conditions achieves the equality.

Loops with Guard: Single-Constraint Case
Here we study the loops with non-tautological guard. First of all, the eigenvalue method of Section 4.2 applies to this case as well; thus for the rest of Section 4, we always assume that µ is not any eigenvalue of T (and c is not any eigenvector of T T either) and aim for other invariants than the ones from the eigenvectors. Let us start with the case that the loop guard consists of only one affine inequality: initial condition θ : where p is a n-dimensional real vector and q is a real number. We again take three steps to compute the invariants; these steps are different from the previous case: 1. we derive a formula to compute c in terms of µ; so for any non-negative real value µ, we get a corresponding c; 2. however, not all µ's would produce invariants that satisfy all constraints (1)- (5).
We will determine the feasible domain of µ that does so; 3. we will select finitely many µ's from its feasible domain which provide tight invariants; the meaning of tightness will be defined later. For every single µ, we will also determine the feasible domain of d and optimal value of it.
Step 1. We first establish the relationship between µ and c through the constraints. The initiation is still (1)(2)(3), while the consecution (4)(5) becomes: where the matrix P in (4) degenerates to vector p T and the vectors q, ξ in (5) both have just one component q, ξ here. Note that ξ is a non-negative parameter.
In contrast to Section 4.2, we assume that µ is not any eigenvalue of T, and ξ = 0. For such µ, we have a new formula to compute c: when µ is fixed, c's with different ξ's are proportional to each other and yield equivalent invariants.
Proof. Since µ is not any eigenvalue of T, the matrix µ · I − T T is invertible; thus (4 ) is equivalent to Example 3 (Fibonacci, Part 3). We add a loop guard x 1 ≤ 10 to Example 1: and search for more invariants. The formula (6) here reads Step 2. With formula (6) in hand, every non-negative value µ would give us a vector c; the next step is to find such µ's that (1)(2)(3)(5 ) are all satisfied. We call this set the feasible domain of µ. Notice that (3) and (5 ) Proof. We multiply (µ − 1) on both sides of (3) and get compare them with (5 ), we see: (3 )(5 ) would not conflict each other because they are both about (µ − 1)d being 'larger' than something. However, (3 )(5 ) are two inequalities of opposite directions, they together must satisfy (6) in the above inequality and cancel out ξ > 0, we obtain the desired inequality: Every µ from [0, 1) and K ∩ [1, +∞) would lead to non-trivial invariant satisfying all constraints (1)(2)(3)(4 )(5 ).
Example 4 (Fibonacci, Part 4). Let us compute the feasible domain of µ for Example 3. Inequality (5 ) We combine them to form the compatibility condition (7) as The solution domain of it is ( 1+ Step 3. Proposition 4 provides us with a continuum of candidates for µ, thus produces infinitely many legitimate invariants. We want to find a basis consisting of finitely many invariants, such that all invariants are non-negative linear combinations of the basis; however, this idea does not work out, because every µ in the feasible domain could produce its own independent invariant, as shown in Appendix A.1, and a complete basis is unavailable, as explained in Appendix A.2. Instead, we impose a weaker form of optimality called tightness coming from the equality cases of constraints (3)(5 ): Note that these roots are also the boundary points of the intervals in K defined in Proposition 4.
Proof. Recall Proposition 2, constraints (3)(5) form the two boundaries of the domain of d, which can not be achieved simultaneously in the case of loops with tautological guard. Nevertheless, in the case of loops with guard, we have an extra freedom on µ which allows us to set λ I 0 = λ C 0 = 0: Equation (8) is just the case that (7) achieves the equality, hence is a rational equation of µ with finite number of roots. These roots are also the boundary points of K since K is the solution domain to (7). Besides the roots of (8), µ = 0 is also a boundary point of the feasible domain; its corresponding invariant reflects the feature of the loop guard itself. Thus we add it into the list of tight choices.
With µ determined and c fixed up to a scaling factor, the last thing remains is to determine the optimal d. The strategy here is similar to Proposition 2: Proposition 6. Suppose µ is from the feasible domain and c is given by Proposition 3. Then the optimal value of d is determined by one of the two choices below: Proof. If µ is a root to (8), then the two conditions (3)(5 ) for d coincide: otherwise, the feasible domain of d is an interval. We discuss two cases respectively: and the optimal value of d is one of the two boundaries of the interval.

Example 5 (Fibonacci, Part 5). Remember that
We compute the tight choices of µ and tight invariants. The equation (8) here is which has only one positive root µ = 5 3 . By Proposition 5 and Proposition 6, We get two invariants:

Loops with Guard: Multi-Constraint Case
After settling the single-constraint loop guard case, we consider the more general loop guard which contains the conjunction of multiple affine constraints: where the loop guard P · x + q ≤ 0 contains m affine inequalities. We can easily generalize the results of Section 4.3 to this case. First of all, we generalize Proposition 3: one simply needs to modify the formula (6) into here ξ is a free non-negative m-dimensional vector parameter. With a fixed µ, we take ξ to traverse all vectors in the standard basis {e 1 , ..., e m } to get m conjunctive invariants. Next, we generalize Proposition 4 which describes the feasible domain of µ: where K is the solution set to the following generalized compatibility condition: substitute c by (6 ) and take ξ to traverse all vectors in the standard basis (in order for all constraints in the loop guard to be satisfied by the invariant), we have the above condition completely decoded as m conjunctive inequalities: where u(µ), w(µ) are two m-dimensional vector functions in µ. The meaning of (7 ) is that the i-th component of u(µ) is no larger than the i-th component of w(µ) for all 1 ≤ i ≤ m; when m = 1, it goes back to (7).
At last, we consider the tight choices of µ. The first idea comes up to mind is to repeat Proposition 5: setting λ I 0 = λ C 0 = 0 for arbitary ξ such that the generalized compatibility condition achieves equality, i.e., u(µ) = w(µ); however, this is the conjunction of m rational equations and probably contains no solution.
Thus we use a different idea: recall that in the single-constraint case, the tight choices are also the (positive) boundary points of K along with 0; so we adopt this property as the definition in the multi-constraint case: The generalized compatibility condition (7 ) contains m inequalities; at each (positive) boundary point of K, at least one inequality achieves equality and all other inequalities are satisfied (equivalently, λ I 0 = λ C 0 = 0 is achieved for at least one non-trivial evaluation of the free vector parameter ξ). This is indeed a natural generalization of Proposition 5.

Generalizations
In this section, we extend our theory developed in Section 4 in two directions. For one direction, we consider the invariants c T · x + d ≤ 0 for the affine loops in the general form ( †): we will derive the relationship of µ and c, as well as the feasible domain and tight choices of µ. For the other direction, we stick to the single-branch affine loops with deterministic updates and tautological guard ( ), yet generalize the invariants to bidirectional-inequality form d 1 ≤ c T · x ≤ d 2 ; we will apply eigenvalue method to this case for solving the invariants. At the end of the section, we also give a brief discussion on some other possible generalizations.

Affine Loops with Non-deterministic Updates
In Section 4, we handled the loops with deterministic updates; here we generalize the results to the non-deterministic case in the form of ( †). We focus on the single-branch loops here, because the multi-branch ones can be handled similarly by taking the conjunction of all branches (as illustrated in Appendix A.3): For this general form, the initiation constraints are still (1)(2)(3), while the consecution constraints from Farkas table ( * ) are with ξ, η ≥ 0. The relationship of c and η is given by (10); plugging it into (9) yield Hence for any non-trivial invariant c T · x + d ≤ 0 of this loop ( † ), we have c = −(T ) T · η, where η is characterized differently in the following three cases: 1. T and T are square matrices and the loop guard is 'true'. In this case, we take ξ = 0 in (9 ) and easily see that µ must be a root of det T T − µ · (T ) T = 0 and η is a kernel vector of the matrix T T − µ · (T ) T . 2. T and T are square matrices and the loop guard is non-tautological. In this case, we set µ to be values other than the roots of det T T − µ · (T ) T = 0, thus the inverse matrix T T − µ · (T ) T −1 exists; we multiply it on (9 ) and get that Neither T nor T is square matrix. In this case, we need to use Gaussian elimination method (with parameters) to solve (9 ). By linear algebra, the solution η(µ) would contain 'homogeneous term' (which does not involve ξ but possibly some free variables η = [η 1 , ..., η l ] T ) and 'non-homogeneous term' (which contains ξ linearly). Thus η(µ) could be written in parametric vector form as M(µ) · η + N(µ) · ξ, where M(µ), N(µ) are matrix functions only in µ.
For Case 2 and Case 3, we have a continuum of candidates for µ. The feasible domain of µ is given by [0, 1) ∪ K ∩ [1, +∞) ∩ J, where K is the solution set to the following compatibility condition (obtained by combining constraints (3 )(11)): and J is the solution set to constraints η(µ) ≥ 0. Here both η and ξ as free nonnegative vector parameters are taken to traverse all standard basis vectors, just in the same way as Proposition 7. The tight choices of µ consists of 0 and the positive boundary points of K ∩ J, in the same sense as Definition 1.

An Extension to Bidirectional Affine Invariants
Here we restrict ourselves to single-branch affine loops with deterministic updates and tautological loop guard ( ), but aim for the invariants of bidirectional-inequality form d 1 ≤ c T · x ≤ d 2 . This is actually the conjunction of two affine inequalities: We have the following proposition: Proposition 8. For any bidirectional invariant d 1 ≤ c T · x ≤ d 2 of the loop ( ), we have that c must be an eigenvector of T T with a negative eigenvalue.
Proof. We can easily write down the initiation condition: θ |= (Φ 1 ∧ Φ 2 ) and the corresponding constraints (with λ, λ being two different vector parameters): However, there are two possible ways to propose the consecution condition: If we choose the first one, there will be nothing different from the things we did in Section 4.2. Thus we choose the second one: making the two inequalities induct each other. Hence the Farkas tables are We write out the constraints of consecution: which reflects the 'golden ratio' property of the Fibonacci numbers.
Remark 1. The generalizations for bidirectional affine invariants to the loops with non-tautological guard or multiple branches are practicable but with some restrictions. The main restriction lies at the point that we need to assume the affine loop guard to also be bidirectional to make our approach for bidirectional affine invariants work. The issue of multiple branches is not critical as the bidirectional invariants can be derived in almost the same way as in Appendix A.3, with the only difference at the adaption to bidirectional inequalities.

Other Possible Generalizations
Integer-valued Variables. One direction is to transfer some of the results for affine loops over real-valued variables to those over integer-valued variables. Our approach is based on Farkas' lemma which is dedicated to real-valued variables, thus can only provide a sound but not exact treatment for integer-valued variables. An exact treatment for integer-valued variables would require Presburger arithmetics [23], rather than Farkas' lemma.
Strict-inequality Invariants. We handle the non-strict-inequality affine invariants in this work. It's natural to consider the affine invariants of the strict-inequality form. For strict inequalities, we could utilize an extended version of Farkas' lemma in [6,Corollary 1], so that strict inequalities can be generated by either relaxing the non-strict ones obtained from our method or restricting the µ value to be positive. Since Motzkin transposition theorem is a standard theorem for handling strict inequalities, we believe that Motzkin transposition theorem can also achieve similar results, but may require more tedious manipulations. Loops with Postcondition. Another generalization is to apply the presented method for checking a given postcondition. For a single invariant to imply the postcondition, we can establish a new Farkas table in this situation and apply our matrix-algebra methods. For multiple invariants to imply the postcondition, we could generate the invariants gradually through our method until the postcondition is proved or all invariants from our approach are generated.

Approximation of Eigenvectors through Continuity
In Section 4.2 and Section 5.2, we need to solve the characteristic polynomial of the transition matrix to get eigenvalues; while general polynomials with degree ≥ 5 do not have algebraic solution formula due to Abel-Ruffini theorem. We can develop a number sequence {λ i } to approximate the eigenvalue λ through root-finding algorithms; however, we cannot approximate the eigenvector of λ by solving the kernel of T T −λ i ·I since it has trivial kernel. In the case of dimensions ≥ 5, i.e., when an explicit formula for eigenvalues is unavailable, we introduce an approximation method of the eigenvectors through a continuity property of the invariants: Continuity of Invariants w.r.t. µ. In Section 4, we have shown that for any invariant c T · x + d ≤ 0 of single-branch affine loops with deterministic updates, the relationship of c and µ is given in two ways: Thus c = c(µ) could be seemed as a vector function in µ expressed differently at eigenvalues from other points. c(µ) is undoubtedly continuous at the points other than eigenvalues, while the following proposition illustrates the continuity property of c(µ) at the eigenvalues: Proposition 9. Suppose λ is a real eigenvalue of T T with eigenvector c(λ); and {λ i } is a sequence lying in the feasible domain of µ which converges to λ. If λ has geometric multiplicity 1, then the sequence {c(λ i )} converges to c(λ) as well; otherwise, {c(λ i )} converges to 0.
In order to prove it, we first introduce the following linear algebra lemma: Lemma 1. We denote the adjoint matrix of T T − µ · I by Ad(µ). Suppose λ is an eigenvalue of the n × n matrix T. Then for any n-dimensional vector x, 1. Ad(λ) · x is an eigenvector of T T with eigenvalue λ; moreover, there exists x such that Ad(λ) · x = 0 if and only if λ has geometric multiplicity 1; 2. If {λ i } is a sequence convergent to λ, then {Ad(λ i ) · x} also converges to Ad(λ) · x.
Proof. 1. By the definition of adjoint matrix, we have for any µ. Since λ is an eigenvalue of T, we have det(T T − λ · I) = 0, and thus for any n-dimensional vector x, so Ad(λ) · x is an eigenvector of T T with eigenvalue λ.
Using the condition that the geometric multiplicity of λ is 1, we know the rank of T T − λ · I is n − 1; it has at least one non-zero cofactor. Hence Ad(λ) is not zero matrix, and there exists x such that Ad(λ) · x = 0.
Otherwise the geometric multiplicity of λ is larger than 1, then rank(T T −λ·I) < n − 1 and all its cofactors are 0. In this case Ad(λ) = [0] n×n . 2. We know that every entry of Ad(µ) is a cofactor of T T − µ · I, which is a polynomial of µ, hence continuous in µ. Thus for any {λ i } → λ, we have Now we are ready to prove Proposition 9: Proof (of Proposition 9). By the definition of inverse matrix and adjoint matrix, is a (non-zero) real number, so we can absorb it into z and write c(λ i ) = Ad(λ i ) · z. By Lemma 1.2, the limit of {c(λ i )} exists and is denoted by Ad(λ) · z. There are two possible outcomes: 1. There exists real vector z such that Ad(λ) · z = 0. By Lemma 1.1, we conclude that Ad(λ) · z is the non-zero eigenvector c(λ) of T T with eigenvalue λ; 2. Ad(λ) · z = 0 for any n-dimensional real vector z. By Lemma 1.1, we know that the geometric multiplicity of λ is > 1 and no continuity is available.
An Algorithmic Approach to Eigenvalue Method in Dimensions ≥ 5. By Proposition 9, if λ has geometric multiplicity 1, we can compute c(λ i ) = (T T − λ i · I) −1 · z (in the case of tautological loop guard, we just replace z by any non-zero n-dimensional real vector) to approximate the eigenvector c(λ). On the other hand, in the case that λ has geometric multiplicity > 1, one can adopt Least-squares approximation as presented in [5,Section 8.9]. Though the Leastsquares approximation applies to the cases of eigenvalues with arbitrary geometric multiplicity, our method is much easier to implement and has higher efficiency.

Experimental Results
Experiment. We implement our automatic invariant-generation algorithm of eigenvalues and tight choices in Python 3.8 and use Sage [56] for matrix manipulation. All results are obtained on an Intel Core i7 (2.00 GHz) machine with 64 GB memory, running Ubuntu 18.04. Our benchmarks are affine loops chosen from some benchmark in the StInG invariant generator [54], some linear dynamical system in [37], some loop programs in [55] and some other linear dynamical systems resulting from well-known linear recurrences such as Fibonacci numbers, Tribonacci numbers, etc.
Complexity. The main bottleneck of our algorithm lies at exactly solving or approximating real roots of univariate polynomials (for computing eigenvalues and boundary points in our algorithmic approach). The rest includes Gaussian elimination with parameters (in the single-branch loops case, the polynomial-time solvability of it is guaranteed by [33], while for multi-branch case, the only exponentiality is in the number of branches, as clarified in Appendix A.3), matrix inverse and solving eigenvectors with fixed eigenvalues, which can easily be done in polynomial time. The exact solution for degrees less than 5 can be done by directly applying the solution formulas. The approximation of real roots can be carried out through real root isolation and a further divide-and-conquer (or Newton's method) in each obtained interval, which can be completed in polynomial time (see e.g. [49] for the polynomial-time solvability of real root isolation). Thus, our approach runs in polynomial time and is much more efficient than quantifier elimination in [16].
Results. The experimental results are presented in Table 1. In the table, the column 'Loop' specifies the name of the benchmark, 'Dim(ension)' specifies the number of program variables, 'µ' specifies the values through eigenvalues of the transition matrices (which we marked with e) or boundary points of the intervals in the feasible domain, 'Invariants' lists the generated affine invariants from our approach. We compare our approach with the existing generators StInG [54] and InvGen [30], where '=', '>', ' ' and ' =' means the generated invariants are identical, more accurate, can only be generated in this work, and incomparable, respectively. Table 2 compares the amounts of runtime for our approach and StInG and InvGen respectively, measured in seconds. Note that the runtime of StInG and InvGen are obtained by executing their binary codes on our platform. Analysis. StInG [54] implements constraint-solving method proposed in [16,50], InvGen [30] integrates both constraint-solving method and abstract interpretation, while our approach uses matrix algebra to refine and upgrade the constraint-solving method. Based on the results in Table 1 and Table 2, we conclude that: -For the benchmarks with rather simple transition matrices (identity or diagonal matrices), our approach covers or outnumbers the invariants generated by StInG and InvGen. -For the benchmarks with complicated transition matrices (which are the matrices far away from diagonal ones), especially the ones with irrational eigenvalues, our approach generates adequate accurate invariants while StInG and InvGen generate nothing or only trivial invariants. -For all benchmarks, the runtime of StInG and InvGen are faster but comparable with our runtime, hence shows the efficiency of our approach. Fibonacci numbers 2 Perrin numbers 3 Tribonacci numbers 3 ∆ = 3 3 √ 33 + 19 a = 1 3 (∆ + 4 ∆ + 1), b = 1/a + 1 µ = (5∆ + 1)/3e x1 + bx2 + ax3 ≥ b + a 1 L stands for the variable LARGE_INT in the original program [55]. Note that we modified the loop programs in [55] as affine loops before execution.
Summarizing all above, the experimental results demonstrate the wider coverage for the µ value endowed from our approach, and show the generality and efficiency of our approach.

Conclusion and Future Work
In this work, we had a thorough investigation on the affine invariant generation over unnested affine loops. Our approach is based on Farkas' lemma and completely addresses the quadratic constraints derived from Farkas' lemma via matrix algebra, hence ensures generality. Furthermore, experimental results demonstrate that our approach can efficiently generate affine invariants that are much more accurate than existing approaches. A future direction would be to consider nested affine loops.  [37] 0.030 0.092 0.173 css2003 [55] 0.019 0.111 0.193 afnp2014 [55] 0.025 0.076 0.193 gsv2008 [55] 0.027 0.092 0.207 cggmp2005 [55] 0 The three eigenvalues of T are 1, 2, −1, which yield the following invariants: Next we compute other invariants. Formula (6) here is Inequalities (5 ) and (3 ) are Compatibility condition (7) is We compute invariants from some other µ's: The reader can check that the invariants from 0 < µ < 1 can not be expressed as the non-negative linear combinations of the ones from µ = 0, 1; the invariants from 2 < µ < 7+ √ 313 11 can not be expressed as the non-negative linear combinations of the ones from µ = 2, 7+ √ 313 11 either. So from this example we conclude that though the µ's from the interior of the feasible domain are not tight, they still would provide independent invariants and thus could be kept as back-up choices.

A.2 Basis of All Legitimate Invariants
We care about one question: is there a basis of invariants, such that all legitimate invariants are non-negative linear combinations of this basis? By Example 8, we already know that tight invariants can not form such a basis.
We have the following proposition to answer this question: Proposition 10. Consider a single-branch affine loop with deterministic updates. There exists a basis consisting of 2n vectors (n is the number of variables), such that any c produced by some µ from the feasible domain is a non-negative linear combination of this basis. However, these basis vectors do not lead to invariants.
Furthermore, it's complicated to compute this basis for any specific loop. We need to calculate the coefficients A 0 , A 1 , ..., A n−1 of the Taylor expansion, which are given by the formula: Therefore, we can not use the basis as 'optimal' invariants.

A.3 Loops with Multiple Branches
For affine invariants of a general affine loop with multiple conditional branches ( †), the consecution condition can be solved by grouping the Farkas tables ( * ) for all conditional branches τ 1 , ..., τ k conjunctively together. In order to avoid tedious derivation and discussion, we assume that T j is square matrix and T j = I n for all 1 ≤ j ≤ k. The general case that T j and T j are not square matrices can be handled similarly but with much more complicated mathematical derivation, thus is omitted here. The conjunctive Farkas tables for consecution is as follows: Thus the constraints from the whole consecution condition can be written as: with µ j ≥ 0, −c ≥ 0, ξ j ≥ 0 for all 1 ≤ j ≤ k. We explain our approach of solving the invariants from these constraints in the following two cases: 1. The loop guard is 'true'. In this case, the constraints (13) read Hence c ∈ ∩ k j=1 E j ∩ {y|y ≤ 0} where E j is the set of eigenvectors of T T j , i.e., c is a common eigenvector of all T T j 's (with non-negative eigenvalues µ j ) and with all components being non-positive. d lies in the set 2. The loop guard is non-tautological. In this case, the constraints (13) read If we concatenate matrices T T j − µ j · I (1 ≤ j ≤ k) vertically as one matrix T(µ 1 , ..., µ k ), and concatenate vectors P T · ξ j (1 ≤ j ≤ k) vertically as one vector P(ξ 1 , ..., ξ k ): then the constraints can be organized as T(µ 1 , ..., µ k ) · c = P(ξ 1 , ..., ξ k ) ∧ k j=1 µ j ≥ 0, ξ j ≥ 0, c ≤ 0 We solve out c = c(µ 1 , ..., µ k , ξ 1 , ..., ξ k ) from (14) by Gaussian elimination method with parameters, where c depends linearly on all ξ j 's but possibly non-linearly on µ j 's. In practice, we carry out the elimination independently on each branch, where [T T j − µ j · I] is the coefficient matrix and [P T · ξ j ] is the augmented column (1 ≤ j ≤ k). In the elimination for each single branch, µ j is the unique variable in the coefficient matrix, thus the elimination can be done in polynomial time, due to the elimination algorithm for matrices over univariate polynomials [33,Theorem 4.1 and Theorem 4.2]. The feasible domain of µ j is given by [0, 1) ∪ K j ∩ [1, +∞) , where K j is the solution set to compatibility condition by combining (3 ) and (15) The tight choices of µ j consists of 0 and the positive boundary points of K j . For all branches grouped together, the feasible domain of all parameters (µ 1 , ..., µ k ) is the Cartesian product k j=1 [0, 1) ∪ K j ∩ [1, +∞) and we can take all µ j 's to be tight simultaneously to get a tight invariant. The Cartesian product brings in the possibility of exponential execution time for our algorithm; nevertheless, the exponentiality here only relies on the number of branches k, but not the number of program variables (i.e., dimension) n. If the loop has fixed numbers of branches at the first place, then everything can be finished in polynomial time.