1 Introduction

There are numerous incomplete approaches for automatic resource analysis of programs, e.g., [1, 2, 5, 8, 10, 15, 19, 21, 29, 33]. However, many complete techniques to decide termination, analyze runtime complexity, or study memory consumption for certain classes of programs have also been developed, e.g., [3, 4, 6, 7, 16, 17, 20, 22, 27, 34, 36]. In this paper, we present a procedure to compute size bounds, which indicate how large the absolute value of an integer variable may become. In contrast to other complete procedures for the inference of size bounds which are based on fixpoint computations [3, 6], our technique can also handle (possibly negative) constants and exponential size bounds. As in our earlier paper [27], we embed a procedure which is complete for a subclass of loops (i.e., it computes finite size bounds for all loops from this subclass) into an incomplete approach for general integer programs [8, 19]. In this way, the power of the incomplete approach is increased significantly, in particular for programs with non-linear arithmetic. However, the current paper tackles a completely different problem than [27] (and thus, the actual new contributions are also completely different): in [27] we embedded a complete technique in order to infer runtime bounds, whereas now we integrate a novel technique in order to infer size bounds. As an example, we want to determine bounds on the absolute values of the variables during (and after) the execution of the following loop.

$$\begin{aligned} {\textbf {while }}\!(x_3 > 0)\!{\textbf { do }}\! (x_1, x_2, x_3,x_4) \!\leftarrow \! (3\cdot x_1 + 2\cdot x_2, -5\cdot x_1 -3\cdot x_2, x_3 - 1, x_4 + x_3^2) \end{aligned}$$
(1)

We introduce a technique to compute size bounds for loops which admit a closed form, i.e., an expression which corresponds to applying the loop’s update n times. Then we over-approximate the closed form to obtain a non-negative, weakly monotonically increasing function. For instance, a closed form for \(x_3\) in our example is \(x_3 - n\), since the value of \(x_3\) is decreased by n after n iterations. The (absolute value of this) closed form can be over-approximated by \(x_3 + n\), which is monotonically increasing in all variables. Finally, each occurrence of n is substituted by a runtime bound for the loop. Clearly, (1) terminates after at most \(x_3\) iterations. So if we substitute n by the runtime bound \(x_3\) in the over-approximated closed form \(x_3 + n\), then we infer the linear bound \(2\cdot x_3\) on the size of \(x_3\). Due to the restriction to weakly monotonically increasing over-approximations, we can plug in any over-approximation of the runtime and do not necessarily need exact bounds.

Structure. We introduce our technique to compute size bounds by closed forms in Sect. 2 and show that it is complete for a subclass of loops in Sect. 3. Afterwards in Sect. 4, we incorporate our novel technique into the incomplete setting of general integer programs. In Sect. 5 we demonstrate how size bounds are used in automatic complexity analysis and study completeness for classes of general programs. In Sect. 6, we conclude with an experimental evaluation of our implementation in the tool KoAT and discuss related work. All proofs can be found in [28].

2 Size Bounds by Closed Forms

In this section, we present our novel technique to compute size bounds for loops by closed forms in Theorem 7. We start by introducing the required preliminaries. Let \(\mathcal {V}=\lbrace x_1, \ldots , x_d \rbrace \) be a set of variables. \(\mathcal {F}(\mathcal {V})\) is the set of all formulas built from inequations \(p > 0\) for polynomials \(p\in \mathbb {Q}[\mathcal {V}]\), \(\wedge \), and \(\vee \). A loop \((\varphi , \eta )\) consists of a guard \(\varphi \in \mathcal {F}(\mathcal {V})\) and an update \(\eta : \mathcal {V}\rightarrow \mathbb {Z}[\mathcal {V}]\) mapping variables to polynomials. A closed form \( {\texttt {cl}^{x_i}}\) (formally defined in Definition 1 below) is an expression in n and in the (initial values of the) variables \(x_1,\ldots ,x_d\) which corresponds to the value of \(x_i\) after iterating the loop n times. For our purpose we only need closed forms which hold for all \(n \ge n_0\) for some fixed \(n_0\in \mathbb {N}\). Moreover, we restrict ourselves to closed forms which are so-called normalized poly-exponential expressions [16]. Nonetheless, our procedure works for any closed form expression with a finite number of arithmetic operations (i.e., the number of operations must be independent of n). We extend the application of functions like \(\eta : \mathcal {V}\rightarrow \mathbb {Z}[\mathcal {V}]\) also to polynomials, vectors, and formulas, etc., by replacing each variable v in the expression by \(\eta (v)\). So in particular, \((\eta _2 \circ \eta _1)(x) = \eta _2(\eta _1(x))\) stands for the polynomial \(\eta _1(x)\) in which every variable v is replaced by \(\eta _2(v)\). Moreover, \(\eta ^n\) denotes the n-fold application of \(\eta \).

We call a function \(\sigma : \mathcal {V}\rightarrow \mathbb {Z}\) a state. By \(\sigma (exp)\) or \(\sigma (\varphi )\) we denote the number resp. Boolean value which results from replacing every variable v by the number \(\sigma (v)\) in the arithmetic expression exp or the formula \(\varphi \).

Definition 1

(Closed Forms). For a loop \((\varphi , \eta )\), an arithmetic expression \( { {\texttt {cl}^{x_i}}}\) is a closed form for \(x_i\) with start value \(n_0 \in \mathbb {N}\) if \( { {\texttt {cl}^{x_i}}} = \sum _{1 \le j \le \ell } \alpha _j\cdot n^{a_j}\cdot b_j^n\) with \(\ell , a_j\in \mathbb {N}\), \(b_j\in \mathbb {A}\) (see Footnote 1), \(\alpha _j\in \mathbb {A}[\mathcal {V}]\), and for all \(\sigma :\mathcal {V}\cup \{n\}\rightarrow \mathbb {Z}\) with \(\sigma (n) \ge n_0\) we have \(\sigma ( { {\texttt {cl}^{x_i}}}) = \sigma (\eta ^n(x_i))\). Similarly, we call \( {\texttt {cl}^{}} = ( { {\texttt {cl}^{x_1}}},\ldots , { {\texttt {cl}^{x_d}}})\) a closed form of the update \(\eta \) (resp. for the loop \((\varphi ,\eta )\)) with start value \(n_0\) if for all \(1 \le i \le d\), \( { {\texttt {cl}^{x_i}}}\) is a closed form for \(x_i\) with start value \(n_0\).

Example 2

In Sect. 3 we will show that for the loop (1), a closed form for \(x_1\) (with start value 0) is \( { {\texttt {cl}^{x_1}}} = \tfrac{1}{2}\cdot \alpha \cdot (-\textrm{i})^n + \tfrac{1}{2}\cdot \overline{\alpha }\cdot \textrm{i}^n\) where \(\alpha = (1 + 3\textrm{i})\cdot x_1 + 2\textrm{i}\cdot x_2\). Here, \(\overline{\alpha }\) denotes the complex conjugate of \(\alpha \), i.e., the sign of those monomials is flipped where the coefficient is a multiple of the imaginary unit \(\textrm{i}\). A closed form for \(x_4\) (also with start value 0) is \( { {\texttt {cl}^{x_4}}} = x_4 + n \cdot (\tfrac{1}{6} + x_3 + x_3^2 - x_3\cdot n -\tfrac{n}{2}+\tfrac{n^2}{3})\).
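The two closed forms of Example 2 can be sanity-checked by comparing them with the iterated update. The following is only a minimal sketch: the function names are hypothetical, and the closed form for \(x_1\) is rewritten as the real part of \(\alpha \cdot (-\textrm{i})^n\) (which equals \(\tfrac{1}{2}\cdot \alpha \cdot (-\textrm{i})^n + \tfrac{1}{2}\cdot \overline{\alpha }\cdot \textrm{i}^n\)) so that only exact integer and rational arithmetic is needed.

```python
from fractions import Fraction


def update(state):
    """One application of the update of loop (1); the guard is ignored."""
    x1, x2, x3, x4 = state
    return (3 * x1 + 2 * x2, -5 * x1 - 3 * x2, x3 - 1, x4 + x3 ** 2)


def iterate(state, n):
    """Compute eta^n by applying the update n times."""
    for _ in range(n):
        state = update(state)
    return state


def cl_x1(state, n):
    """Closed form for x1, written as Re(alpha * (-i)^n)."""
    x1, x2, _, _ = state
    re, im = x1, 3 * x1 + 2 * x2  # alpha = re + im * i
    # (-i)^n cycles through 1, -i, -1, i, so the real part cycles as follows.
    return (re, im, -re, -im)[n % 4]


def cl_x4(state, n):
    """Closed form for x4, evaluated exactly with rational arithmetic."""
    _, _, x3, x4 = state
    n = Fraction(n)
    return x4 + n * (Fraction(1, 6) + x3 + x3 ** 2 - x3 * n - n / 2 + n ** 2 / 3)


def closed_forms_agree(state, max_n=12):
    """Check both closed forms against eta^n for all n in 0..max_n."""
    return all(
        iterate(state, n)[0] == cl_x1(state, n)
        and iterate(state, n)[3] == cl_x4(state, n)
        for n in range(max_n + 1)
    )
```

Such a test only samples finitely many states and values of n, so it illustrates the closed forms but does not replace the proof in Sect. 3.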

Our aim is to compute bounds on the sizes of variables and on the runtime. As in [8, 19], we only consider bounds which are weakly monotonically increasing in all occurring variables. Their advantage is that we can compose them easily (i.e., if f and g increase monotonically, then so does \(f\circ g\)).

Definition 3

(Bounds). The set of bounds \(\mathcal {B}\) is the smallest set with \(\overline{\mathbb {N}} = \mathbb {N}\cup \{ \omega \} \subseteq \mathcal {B}\), \(\mathcal {V}\subseteq \mathcal {B}\), and \(\{b_1+b_2, \, b_1 \cdot b_2, \, k^{b_1}\} \subseteq \mathcal {B}\text { for all } k \in \mathbb {N}\) and \(b_1,b_2 \in \mathcal {B}\).

Size bounds should be bounds on the values of variables up to the point where the loop guard is not satisfied anymore for the first time. To define size bounds, we introduce the runtime complexity of a loop (whereas we considered the runtime complexity of arbitrary integer programs in [8, 19, 27]). Let \(\varSigma \) denote the set of all states \(\sigma : \mathcal {V}\rightarrow \mathbb {Z}\) and let \(|\sigma |\) be the state with \(|\sigma |(x) = |\sigma (x)|\) for all \(x \in \mathcal {V}\).

Definition 4

(Runtime Complexity for Loops). The runtime complexity of a loop \((\varphi , \eta )\) is \({{\,\textrm{rc}\,}}: \varSigma \rightarrow \overline{\mathbb {N}}\) with \({{\,\textrm{rc}\,}}(\sigma ) = \inf \lbrace n\in \mathbb {N}\mid \sigma (\eta ^n( \lnot \varphi )) \rbrace \), where \(\inf \varnothing = \omega \). An expression \(r \in \mathcal {B}\) is a runtime bound if \(|\sigma |(r)\ge {{\,\textrm{rc}\,}}(\sigma )\) for all \(\sigma \in \varSigma \).

Example 5

The runtime complexity of the loop (1) is \({{\,\textrm{rc}\,}}(\sigma ) = \max (0,\sigma (x_3))\). For example, \(x_3\) is a runtime bound, as \(|\sigma |(x_3) \ge \max (0,\sigma (x_3))\) for all states \(\sigma \in \varSigma \).
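As a small illustration of Definition 4, the runtime complexity of loop (1) can be computed by simulation and compared with the runtime bound \(x_3\). The sketch below uses hypothetical names, and the `fuel` parameter is our own assumption: it merely stands in for the value \(\omega \) of non-terminating runs, which cannot occur for loop (1).

```python
def rc(state, fuel=10_000):
    """Count the iterations of loop (1) until the guard x3 > 0 fails.

    Returns None (standing in for omega) if `fuel` iterations are exceeded.
    """
    x1, x2, x3, x4 = state
    steps = 0
    while x3 > 0:
        if steps >= fuel:
            return None
        x1, x2, x3, x4 = 3 * x1 + 2 * x2, -5 * x1 - 3 * x2, x3 - 1, x4 + x3 ** 2
        steps += 1
    return steps


def runtime_bound_holds(state):
    """Check |sigma|(x3) >= rc(sigma) for the runtime bound x3."""
    n = rc(state)
    return n is not None and abs(state[2]) >= n
```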

A size bound on a variable x is a bound on the absolute value of x after n iterations of the update \(\eta \), where n is bounded by the runtime complexity. In contrast to the definition of size bounds for transitions in integer programs from [8], Definition 6 requires that size bounds also hold before evaluating the loop.

Definition 6

(Size Bounds for Loops). \({\mathcal{S}\mathcal{B}}: \mathcal {V}\rightarrow \mathcal {B}\) is a size bound for \((\varphi , \eta )\) if for all \(x\in \mathcal {V}\) and all \(\sigma \in \varSigma \), we have \(|\sigma | ({\mathcal{S}\mathcal{B}}(x)) \ge \sup \lbrace |\sigma (\eta ^n(x))| \mid n \le {{\,\textrm{rc}\,}}(\sigma ) \rbrace \).

For any algebraic number \(c \in \mathbb {A}\), as usual \(\lceil |c| \rceil \) is the smallest natural number which is greater than or equal to c’s absolute value. Similarly, for any poly-exponential expression \(p = \sum _{j} (\sum _i c_{i,j}\cdot \beta _{i,j})\cdot n^{a_j} \cdot b_j^n\) where \(c_{i,j} \in \mathbb {A}\) and the \(\beta _{i,j}\) are normalized monomials of the form \(x_1^{e_1} \cdot \ldots \cdot x_d^{e_d}\), \(\lceil |p| \rceil \) denotes \(\sum _{j} \left( \sum _i \lceil |c_{i,j}| \rceil \cdot \beta _{i,j}\right) \cdot n^{a_j} \cdot \lceil |b_j| \rceil ^n\).

We now determine size bounds by over-approximating the closed form \( {\texttt {cl}^{x}}\) by the non-negative, weakly monotonically increasing function \(\lceil | {\texttt {cl}^{x}}| \rceil \). Then we substitute n by a runtime bound r (denoted by “[n/r]”). Due to the monotonicity, this results in a bound on the size of x not only at the end of the loop, but also during the iterations of the loop. Since the closed form is only valid for n iterations with \(n \ge n_0\), we ensure that our size bound is also correct for fewer than \(n_0\) iterations by symbolically evaluating the update, where we over-approximate maxima by sums. As mentioned, see [28] for the proofs of all new results.

Theorem 7

(Size Bounds for Loops with Closed Forms). Let \( {\texttt {cl}^{}}\) be a closed form for the loop \((\varphi , \eta )\) with start value \(n_0\) and let \(r\in \mathcal {B}\) be a runtime bound. Then the (absolute) size of \(x\in \mathcal {V}\) is bounded by \(\texttt {sb}^x = \lceil | {\texttt {cl}^{x}}| \rceil \, [n/r] + \sum _{0 \le i < n_0} \lceil |\eta ^i(x)| \rceil \). Hence, the function \({\mathcal{S}\mathcal{B}}\) with \({\mathcal{S}\mathcal{B}}(x) = \texttt {sb}^x\) for all \(x \in \mathcal {V}\) is a size bound for \((\varphi , \eta )\).

Example 8

As mentioned, for the loop (1), a closed form for \(x_1\) with start value 0 is \( { {\texttt {cl}^{x_1}}} =\tfrac{1}{2}\cdot \alpha \cdot (-\textrm{i})^n + \tfrac{1}{2}\cdot \overline{\alpha }\cdot \textrm{i}^n\) where \(\alpha = (1 + 3\textrm{i})\cdot x_1 + 2\textrm{i}\cdot x_2\). Hence, \(\lceil | {\texttt {cl}^{x_1}}| \rceil = \lceil | \tfrac{1}{2}\cdot \alpha \cdot (-\textrm{i})^n + \tfrac{1}{2}\cdot \overline{\alpha }\cdot \textrm{i}^n | \rceil = (\lceil | \tfrac{1+ 3\textrm{i}}{2}| \rceil \cdot x_1 + \lceil |\textrm{i}| \rceil \cdot x_2 )\cdot \lceil |-\textrm{i}| \rceil ^n + (\lceil | \tfrac{1- 3\textrm{i}}{2}| \rceil \cdot x_1 + \lceil | -\textrm{i}| \rceil \cdot x_2 )\cdot \lceil |\textrm{i}| \rceil ^n = 4\cdot x_1 + 2\cdot x_2\), as \(\lceil | \tfrac{1+ 3 \textrm{i}}{2}| \rceil = \lceil | \tfrac{1- 3\textrm{i}}{2}| \rceil = \lceil \tfrac{\sqrt{10}}{2} \rceil = 2\) and \(\lceil | \textrm{i}| \rceil = \lceil | -\textrm{i}| \rceil = 1\). So our approach infers linear size bounds for \(x_1\) and \(x_2\) (the similar computations for \(x_2\) are omitted) while [8] only infers exponential size bounds.

As this over-approximation does not depend on n, it directly yields a size bound, i.e., \(\texttt {sb}^{x_1} = \lceil | {\texttt {cl}^{x_1}}| \rceil \). In contrast, in the over-approximation \(\lceil | {\texttt {cl}^{x_4}}| \rceil = x_4 + n\left( 1 + x_3 + x_3^2 + x_3\cdot n + n + n^2\right) \), we have to replace n by a runtime bound like \(x_3\). Thus, we obtain the overall size bound \(\texttt {sb}^{x_4} = x_4 + 3\cdot x_3^3 + 2\cdot x_3^2 + x_3\).
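The size bounds of Example 8 (together with the bound \(2\cdot x_3\) for \(x_3\) from Sect. 1) can be tested empirically by recording every state visited by loop (1) and comparing absolute values with the bounds evaluated at \(|\sigma |\). This is only a hedged sketch with hypothetical function names; the bound for \(x_2\) is not checked, since its computation is omitted above.

```python
from itertools import product


def visited_states(state):
    """All states reached by loop (1), including the initial one (n <= rc)."""
    x1, x2, x3, x4 = state
    states = [(x1, x2, x3, x4)]
    while x3 > 0:
        x1, x2, x3, x4 = 3 * x1 + 2 * x2, -5 * x1 - 3 * x2, x3 - 1, x4 + x3 ** 2
        states.append((x1, x2, x3, x4))
    return states


def size_bounds(state):
    """Evaluate the size bounds at |sigma|, i.e., on absolute initial values."""
    a1, a2, a3, a4 = (abs(v) for v in state)
    return {
        "x1": 4 * a1 + 2 * a2,
        "x3": 2 * a3,
        "x4": a4 + 3 * a3 ** 3 + 2 * a3 ** 2 + a3,
    }


def bounds_hold(state):
    """Check Definition 6 for x1, x3, and x4 on one initial state."""
    sb = size_bounds(state)
    return all(
        abs(s[0]) <= sb["x1"] and abs(s[2]) <= sb["x3"] and abs(s[3]) <= sb["x4"]
        for s in visited_states(state)
    )


def bounds_hold_on_grid():
    """Exhaustively test a small grid of initial states."""
    grid = product(range(-4, 5), range(-4, 5), range(-2, 6), range(-4, 5))
    return all(bounds_hold(s) for s in grid)
```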

Although this section focused on closed forms which are poly-exponential expressions, our technique is applicable to all loops where we can compute over-approximating bounds for the closed form and the runtime complexity. For example, the update \(\eta (x) = x^2\) has the closed form \(x^{(2^n)}\), but it does not admit a poly-exponential closed form due to x’s super-exponential growth. However, by instantiating n by a runtime bound, we can still compute a size bound for this update. The reason for focusing on poly-exponential expressions is that we can compute such a closed form for all so-called solvable loops automatically, see Sect. 3.

3 Size and Runtime Bounds for Solvable Loops

In this section, we present a class of loops where our technique of Theorem 7 is “complete”. The technique relies on the computation of suitable closed forms and of runtime bounds. In Sect. 3.1, we show that poly-exponential closed forms can be computed for all solvable loops [17, 23, 25, 26, 32, 36]. Then we prove in Sect. 3.2 that finite runtime bounds are computable for all terminating solvable loops with only periodic rational eigenvalues.

A loop \((\varphi , \eta )\) is solvable if \(\eta \) is a solvable update (see Definition 9 below for a formal definition), which partitions \(\mathcal {V}\) into blocks \(\mathcal {S}_1,\ldots , \mathcal {S}_m\) (and loop guards \(\varphi \) are not relevant for closed forms). Each block allows updates with cyclic dependencies between its variables and non-linear dependencies on variables in blocks with lower indices.

Definition 9

(Solvable Update [17, 23, 25, 26, 32, 36]). An update \(\eta :\mathcal {V}\rightarrow \mathbb {Z}[\mathcal {V}]\) is solvable if there exists a partition \(\mathcal {S}_1, \ldots , \mathcal {S}_m\) of \(\lbrace x_1, \ldots , x_d \rbrace \) such that for all \(1 \le i \le m\) we have \(\mathbf {\eta }_{\mathcal {S}_i} = A_{\mathcal {S}_i} \cdot \textbf{x}_{\mathcal {S}_i} + \textbf{p}_{\mathcal {S}_i}\) for an \(A_{\mathcal {S}_i}\in \mathbb {Z}^{|\mathcal {S}_i|\times |\mathcal {S}_i|}\) and a \(\textbf{p}_{\mathcal {S}_i}\in \mathbb {Z}[\bigcup _{j < i} \mathcal {S}_j]^{|\mathcal {S}_i|}\), where \(\mathbf {\eta }_{\mathcal {S}_i}\) is the vector of all \(\eta (x_j)\) and \(\textbf{x}_{\mathcal {S}_i}\) is the vector of all \(x_j\) with \(x_j\in \mathcal {S}_i\). The eigenvalues of a solvable loop are defined as the union of the eigenvalues of all matrices \(A_{\mathcal {S}_i}\). The loop is homogeneous if \(\textbf{p}_{\mathcal {S}_i} = \textbf{0}\) for all \(1 \le i \le m\).

Example 10

The loop (1) is an example for a solvable loop using the partition \(\mathcal {S}_1 = \lbrace x_1,x_2 \rbrace \), \(\mathcal {S}_2 = \lbrace x_3 \rbrace \), and \(\mathcal {S}_3 = \lbrace x_4 \rbrace \).

The crucial idea for our results in Sect. 3.1 and 3.2 is to reduce the problem of finding closed forms and runtime bounds from solvable loops to triangular weakly non-linear loops (twn-loops) [16, 17, 20]. A twn-update is a solvable update where each block \(\mathcal {S}_j\) has cardinality one. Thus, a twn-update is triangular, i.e., the update of a variable does not depend on variables with higher indices. Furthermore, the update is weakly non-linear, i.e., a variable does not occur non-linearly in its own update. We are mainly interested in loops over \(\mathbb {Z}\), but to handle solvable updates, we will transform them into twn-updates with coefficients from \(\mathbb {A}\).

Definition 11

(TWN-Update [16, 17, 20]). An update \(\eta : \mathcal {V}\rightarrow \mathbb {A}[\mathcal {V}]\) is twn if for all \(1 \le i \le d\) we have \(\eta (x_i) = c_i\cdot x_i + p_i\) for some \(c_i \in \mathbb {A}\) and some polynomial \(p_i\in \mathbb {A}[x_1, \ldots ,x_{i-1}]\). A loop with a twn-update is called a twn-loop.

Clearly, (1) is not a twn-loop due to the cyclic dependency between \(x_1\) and \(x_2\).

3.1 Closed Forms for Solvable Loops

Lemma 12 (which extends [17, Thm. 16] from solvable updates with real eigenvalues to arbitrary solvable updates) illustrates that one can transform any solvable update \(\eta _s\) into a twn-update \(\eta _t\) by an automorphism \(\vartheta \). Here, \(\vartheta \) is induced by the change-of-basis matrix of the Jordan normal form of each block of \(\eta _s\). Note that the Jordan normal form is always computable in polynomial time (see [9]).

Lemma 12

(Transforming Solvable Updates (see [17, Thm. 16])). Let \(\eta _s\) be a solvable update. Then \(\vartheta : \mathcal {V}\rightarrow \mathbb {A}[\mathcal {V}]\) is an automorphism, where \(\vartheta \) is defined by \(\vartheta (\mathcal {S}) = P\cdot \textbf{x}_\mathcal {S}\) for each block \(\mathcal {S}\), where \(J(A_{\mathcal {S}}) = P\cdot A_{\mathcal {S}} \cdot P^{-1}\) is the Jordan normal form of \(A_\mathcal {S}\). Furthermore, \(\eta _t = \vartheta ^{-1}\circ \eta _s\circ \vartheta \) is a twn-update.

Example 13

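To illustrate Lemma 12, we transform the solvable update \(\eta _s\) of (1) into a twn-update \(\eta _t\). As the blocks \(\mathcal {S}_2 = \lbrace x_3 \rbrace \) and \(\mathcal {S}_3 = \lbrace x_4 \rbrace \) have cardinality one, we only have to consider \(\mathcal {S}_1 = \lbrace x_1,x_2 \rbrace \). The restriction of \(\eta _s\) to \(\mathcal {S}_1\) is \(\mathbf {\eta }_{\mathcal {S}_1} = A_{\mathcal {S}_1}\cdot \textbf{x}_{\mathcal {S}_1}\) with \(A_{\mathcal {S}_1} = \left( \begin{matrix} 3 & 2 \\ -5 & -3 \end{matrix}\right) \). So we get the Jordan normal form \(J(A_{\mathcal {S}_1}) = P\cdot A_{\mathcal {S}_1}\cdot P^{-1} = \left( \begin{matrix} -\textrm{i} & 0 \\ 0 & \textrm{i} \end{matrix}\right) \) where \(P = \left( \begin{matrix} -\tfrac{5}{2}\textrm{i} & \tfrac{1}{2}(1 - 3\textrm{i}) \\ \tfrac{5}{2}\textrm{i} & \tfrac{1}{2}(1 + 3\textrm{i}) \end{matrix}\right) \) and \(P^{-1} = \left( \begin{matrix} \tfrac{1}{5}(\textrm{i}- 3) & -\tfrac{1}{5}(\textrm{i}+ 3) \\ 1 & 1 \end{matrix}\right) \). Thus, we have the following automorphism \(\vartheta \) and its inverse \(\vartheta ^{-1}\):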

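$$\begin{aligned} \vartheta (x_1)&= -\tfrac{5}{2}\textrm{i}\cdot x_1 + \tfrac{1}{2}(1 - 3\textrm{i})\cdot x_2&\vartheta ^{-1}(x_1)&= \tfrac{1}{5}(\textrm{i}- 3)\cdot x_1 - \tfrac{1}{5}(\textrm{i}+ 3)\cdot x_2 \\ \vartheta (x_2)&= \tfrac{5}{2}\textrm{i}\cdot x_1 + \tfrac{1}{2}(1 + 3\textrm{i})\cdot x_2&\vartheta ^{-1}(x_2)&= x_1 + x_2 \\ \vartheta (x_3)&= x_3&\vartheta ^{-1}(x_3)&= x_3 \\ \vartheta (x_4)&= x_4&\vartheta ^{-1}(x_4)&= x_4 \end{aligned}$$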

Hence, \(\eta _t = \vartheta ^{-1} \circ \eta _s \circ \vartheta \) is the following twn-update:

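$$\begin{aligned} \eta _t(x_1) = -\textrm{i}\cdot x_1, \qquad \eta _t(x_2) = \textrm{i}\cdot x_2, \qquad \eta _t(x_3) = x_3 - 1, \qquad \eta _t(x_4) = x_4 + x_3^2 \end{aligned}$$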

The reason for transforming solvable updates to twn-updates is that for the latter, we can re-use our previous algorithm from [16] to compute poly-exponential closed forms. While [16] only considered updates with linear arithmetic over \(\mathbb {Z}\), it can directly be extended to twn-updates over \(\mathbb {A}\).

Lemma 14

(Closed Forms for TWN-Updates (see [16])). Let \(\eta \) be a twn-update. Then a (poly-exponential) closed form is computable for \(\eta \).

Example 15

For \(\eta _t\) from Example 13, we obtain the following closed form (with start value 0): \( {\texttt {cl}^{}_{t}} = ((-\textrm{i})^n\cdot x_1, \textrm{i}^n\cdot x_2, x_3 - n, x_4 + n(\tfrac{1}{6} + x_3 + x_3^2 - x_3\cdot n -\tfrac{n}{2}+\tfrac{n^2}{3}))\).

So to obtain a closed form of a solvable update \(\eta _s\), we first transform it into a twn-update \(\eta _t\) via Lemma 12, and then compute the closed form \( {\texttt {cl}^{}_{t}}\) of \(\eta _t\) (Lemma 14). We now show how to obtain a closed form for \(\eta _s\) from \( {\texttt {cl}^{}_{t}}\).

Theorem 16

(Closed Forms for Solvable Updates). Let \(\eta _s\) be a solvable update and \(\vartheta \) be an automorphism as in Lemma 12 such that \(\eta _t = \vartheta ^{-1}\circ \eta _s\circ \vartheta \) is a twn-update. If \( { {\texttt {cl}^{}_{t}}}\) is a closed form of \(\eta _t\) with start value \(n_0\), then \( { {\texttt {cl}^{}_{s}}} = \vartheta \circ { {\texttt {cl}^{}_{t}}}\circ \vartheta ^{-1}\) is a closed form of \(\eta _s\) with start value \(n_0\).

Example 17

In Example 13 we transformed \(\eta _s\) into the twn-update \(\eta _t\) via an automorphism \(\vartheta \) and in Example 15, we gave a closed form \( { {\texttt {cl}^{}_{t}}}\) of \(\eta _t\). Thus, by Theorem 16, we can infer a closed form \( { {\texttt {cl}^{}_{s}}}= \vartheta \circ { {\texttt {cl}^{}_{t}}}\circ \vartheta ^{-1}\) of \(\eta _s\). For example, we compute a closed form for \(x_1\) with start value 0 (\( {\texttt {cl}^{x_2}_{s}}\) can be inferred in a similar way):

$$\begin{aligned} {\texttt {cl}^{x_1}_{s}}&= \left( \tfrac{1}{5}(\textrm{i}- 3)\cdot x_1 - \tfrac{1}{5}(\textrm{i}+ 3)\cdot x_2\right) \; \left[ v/ {\texttt {cl}^{v}_{t}} \mid v\in \mathcal {V}\right] \; \left[ v/\vartheta (v) \mid v\in \mathcal {V}\right] \\&= \left( \tfrac{1}{5}(\textrm{i}- 3)\cdot (-\textrm{i})^n \cdot x_1 - \tfrac{1}{5}(\textrm{i}+ 3)\cdot \textrm{i}^n \cdot x_2\right) \; \left[ v/\vartheta (v) \mid v\in \mathcal {V}\right] \\&= \tfrac{1}{2}(\underbrace{(1 + 3\textrm{i})\cdot x_1 + 2\textrm{i}\cdot x_2}_{\alpha }) \cdot (-\textrm{i})^n + \tfrac{1}{2}(\underbrace{(1 - 3\textrm{i})\cdot x_1 - 2\textrm{i}\cdot x_2}_{\overline{\alpha }}) \cdot \textrm{i}^n. \end{aligned}$$

3.2 Periodic Rational Solvable Loops

In Sect. 3.1, we discussed how to compute closed forms for solvable updates (by transforming them to twn-updates). However, to compute size bounds, we have to instantiate the variable n in the closed forms by runtime bounds (Theorem 7). In [20], it was shown that (polynomial) runtime bounds can always be computed for terminating twn-loops over the integers. However, in general, transforming solvable loops via Lemma 12 yields twn-updates which may contain algebraic (complex) numbers. We now show that for the subclass of terminating periodic rational solvable loops, our approach is “complete” (i.e., finite runtime bounds and thus, also finite size bounds are always computable).

Definition 18

(Periodic Rational [25]). A number \(\lambda \in \mathbb {A}\) is periodic rational if \(\lambda ^p\in \mathbb {Q}\) for some \(p\in \mathbb {N}\) with \(p > 0\). The period of \(\lambda \) is the smallest such p with \(\lambda ^p\in \mathbb {Q}\). A solvable loop is periodic rational (i.e., it is a prs loop) with period p if all its eigenvalues \(\lambda \) are periodic rational and p is the least common multiple of all their periods. A prs loop is a unit prs loop if \(|\lambda | \le 1\) for all its eigenvalues \(\lambda \).

So \(\textrm{i}\), \(-\textrm{i}\), and \(\sqrt{2} \cdot \textrm{i}\) are periodic rational with period 2, while \(\sqrt{2} + \textrm{i}\) is not periodic rational. The following lemma from [25] gives a bound on the period of prs loops and thus yields an algorithm to detect prs loops and to compute their period.

Lemma 19

(Bound on the Period [25]). Let \(A\in \mathbb {Z}^{n\times n}\). If \(\lambda \) is a periodic rational eigenvalue of A with period p, then \(p \le n^3\).
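Lemma 19 turns Definition 18 into a bounded search: it suffices to test the powers \(\lambda ^1, \ldots , \lambda ^{n^3}\). The sketch below illustrates this under a strong simplifying assumption that is ours and not part of the lemma: eigenvalues are restricted to Gaussian rationals \(a + b\cdot \textrm{i}\) with \(a, b \in \mathbb {Q}\), so that exact arithmetic is possible without a computer algebra system. The function name and the parameter `max_period` are hypothetical.

```python
from fractions import Fraction


def period(re, im, max_period):
    """Smallest p in 1..max_period such that (re + im*i)^p is rational.

    Returns None if no such p exists within the search bound.
    """
    re, im = Fraction(re), Fraction(im)
    power_re, power_im = Fraction(1), Fraction(0)  # lambda^0 = 1
    for p in range(1, max_period + 1):
        # Multiply the current power by lambda = re + im*i.
        power_re, power_im = (
            power_re * re - power_im * im,
            power_re * im + power_im * re,
        )
        if power_im == 0:
            return p
    return None
```

For a \(2\times 2\) matrix, Lemma 19 gives the search bound \(2^3 = 8\). With this bound, `period(0, 1, 8)` and `period(0, -1, 8)` yield 2 for the eigenvalues \(\textrm{i}\) and \(-\textrm{i}\) of loop (1), whereas `period(2, 1, 8)` yields `None`.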

Now we show that by chaining (i.e., by performing p iterations of a prs loop with period p in a single step), one can transform any prs loop into a solvable loop with only integer eigenvalues. Then, our previous results on twn-loops [17, 20] can be used to infer runtime bounds for these loops.

Definition 20

(Chaining Loops). Let \(L = (\varphi , \eta )\) be a loop and \(p \in \mathbb {N}\setminus \{0\}\). Then \(L_p = (\varphi _p,\eta _p)\) results from iterating L p times, i.e., \(\varphi _p = \varphi \, \wedge \, \eta (\varphi ) \, \wedge \, \eta (\eta (\varphi )) \, \wedge \, \ldots \,\, \wedge \, \eta ^{p-1}(\varphi ) \; \text { and } \; \eta _p(v) = \eta ^p(v) \text { for all } v\in \mathcal {V}.\)

Example 21

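The eigenvalues of (1) have period 2. Chaining yields \(L_2\):

$$\begin{aligned} {\textbf {while }}\!(x_3 > 0 \wedge x_3 - 1 > 0)\!{\textbf { do }}\! (x_1, x_2, x_3,x_4) \!\leftarrow \! (-x_1, -x_2, x_3 - 2, x_4 + x_3^2 + (x_3 - 1)^2) \end{aligned}$$
(2)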
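Definition 20 can be mirrored directly on executable guards and updates. In the following sketch (hypothetical names, states encoded as tuples), `chain` builds \(\varphi _p\) by evaluating the guard on the states reached after \(0, \ldots , p-1\) updates and builds \(\eta _p\) by applying the update p times. For loop (1) and \(p = 2\), the result negates \(x_1\) and \(x_2\), decreases \(x_3\) by 2, and has a guard that is equivalent to \(x_3 > 1\).

```python
def chain(guard, update, p):
    """Return (guard_p, update_p) for the loop chained p times (p >= 1)."""

    def guard_p(state):
        # phi_p = phi /\ eta(phi) /\ ... /\ eta^(p-1)(phi)
        for _ in range(p):
            if not guard(state):
                return False
            state = update(state)
        return True

    def update_p(state):
        # eta_p = eta^p
        for _ in range(p):
            state = update(state)
        return state

    return guard_p, update_p


def guard_1(state):
    """Guard of loop (1)."""
    return state[2] > 0


def update_1(state):
    """Update of loop (1)."""
    x1, x2, x3, x4 = state
    return (3 * x1 + 2 * x2, -5 * x1 - 3 * x2, x3 - 1, x4 + x3 ** 2)


guard_2, update_2 = chain(guard_1, update_1, 2)
```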

Due to Lemma 12 we can transform every solvable update into a twn-update by a (linear) automorphism \(\vartheta \). For prs loops, after chaining, \(\vartheta \)’s range can be restricted to \(\mathbb {Q}[\mathcal {V}]\), i.e., one does not need algebraic numbers. So, we first chain the prs loop L and then compute a \(\mathbb {Q}\)-automorphism \(\vartheta \) transforming the chained loop \(L_p\) into a twn-loop \(L_t\) via Lemma 12. Then we can infer a runtime bound for \(L_t\) as in [20]. The reason is that all factors \(c_i\) in the update of \(L_t\) are integers and thus, we can compute a closed form \(\sum _j \alpha _j\cdot n^{a_j}\cdot b_j^n\) such that \(\alpha _j\in \mathbb {Q}[\mathcal {V}]\) and \(b_j\in \mathbb {Z}\). Afterwards, the runtime bound for \(L_t\) can be lifted to a runtime bound for the original loop by reconsidering the automorphism \(\vartheta \). Similarly, in order to prove termination of the prs loop L, we analyze termination of \(L_t\) on \(\vartheta (\mathbb {Z}^d) = \lbrace \vartheta (\textbf{x})\mid \textbf{x}\in \mathbb {Z}^d \rbrace \) (see Footnote 2).

Lemma 22

(Runtime Bounds for PRS Loops). Let L be a prs loop with period p and let \(L_p = (\varphi _p, \eta _p)\) result from chaining as in Definition 20. From \(\eta _p\), one can compute a linear automorphism \(\vartheta :\mathcal {V}\rightarrow \mathbb {Q}[\mathcal {V}]\) as in Lemma 12, such that:

  (a) \(L_p\) is solvable and only has integer eigenvalues.

  (b) \((\vartheta ^{-1}\circ \eta _p\circ \vartheta ) : \mathcal {V}\rightarrow \mathbb {Q}[\mathcal {V}]\) is a twn-update as in Definition 11 such that all \(c_i\in \mathbb {Z}\).

  (c) \(L_t = (\varphi _t, \eta _t)\) with \(\varphi _t = \vartheta ^{-1}(\varphi _p)\) and \(\eta _t = \vartheta ^{-1}\circ \eta _p\circ \vartheta \) is a twn-loop. Moreover, L terminates on \(\mathbb {Z}^d\) iff \(L_p\) terminates on \(\mathbb {Z}^d\) iff \(L_t\) terminates on \(\vartheta (\mathbb {Z}^d) = \lbrace \vartheta (\textbf{x})\mid \textbf{x}\in \mathbb {Z}^d \rbrace \).

  (d) If \(r\) is a runtime bound (see Footnote 3) for \(L_t\), then \(p\cdot \lceil |\vartheta (r)| \rceil + p - 1\) is a runtime bound for L.

Fig. 1. Illustration of Runtime and Size Bound Computations

Since we can detect prs loops and their periods by Lemma 19, Lemma 22 allows us to compute runtime bounds for all terminating prs loops. This is illustrated in Fig. 1: For runtime bounds, L is transformed to \(L_p\) by chaining and \(L_p\) is transformed further to \(L_t\) by an automorphism \(\vartheta \). The runtime bound r for \(L_t\) can then be transformed into a runtime bound for \(L_p\) and further into a runtime bound for L. For size bounds, L is directly transformed to a twn-loop \(L_t'\) by an automorphism \(\vartheta '\). The closed form \( {\texttt {cl}^{}_{t}}\) obtained for \(L_t'\) is transformed via the automorphism \(\vartheta '\) into a closed form \( {\texttt {cl}^{}_{s}}\) for L. Then the runtime bound for L is inserted into this closed form to yield a size bound for L. So in Fig. 1, standard arrows denote transformations of loops and wavy arrows denote transformations of runtime bounds or closed forms.

Theorem 23

(Completeness of Size and Runtime Bound Computation for Terminating PRS Loops). For all terminating prs loops, polynomial runtime bounds and finite size bounds are computable. For terminating unit prs loops, all these size bounds are polynomial as well.

Example 24

For the loop L from (1), we computed \(L_p\) for \(p=2\) in (2), see Example 21. As \(L_p\) is already a twn-loop, we can use the technique of [20] (implemented in our tool KoAT) to obtain the runtime bound \(x_3\) for \(L_p\). Lemma 22 yields the runtime bound \(2\cdot x_3 + 1\) for the original loop (1). Of course, here one could also use (incomplete) approaches based on linear ranking functions (also implemented in KoAT, see, e.g., [8, 19]) to directly infer the tighter runtime bound \(x_3\) for the loop (1).
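The relation between the runtime of the chained loop and the runtime of the original loop in Example 24 can be checked by simulation. The sketch below uses hypothetical names and writes the chained loop for \(p = 2\) out by hand, obtained by applying Definition 20 to loop (1). It counts iterations of both loops and compares them with the bounds \(x_3\) and \(2\cdot x_3 + 1\).

```python
def iterations(guard, update, state):
    """Number of loop iterations until the guard fails (loops here terminate)."""
    n = 0
    while guard(state):
        state = update(state)
        n += 1
    return n


def guard_l(s):
    """Guard of the original loop (1)."""
    return s[2] > 0


def update_l(s):
    """Update of the original loop (1)."""
    x1, x2, x3, x4 = s
    return (3 * x1 + 2 * x2, -5 * x1 - 3 * x2, x3 - 1, x4 + x3 ** 2)


def guard_l2(s):
    """Guard of the loop chained twice: x3 > 0 and x3 - 1 > 0."""
    return s[2] > 0 and s[2] - 1 > 0


def update_l2(s):
    """Update of the loop chained twice, i.e., two steps of update_l."""
    x1, x2, x3, x4 = s
    return (-x1, -x2, x3 - 2, x4 + x3 ** 2 + (x3 - 1) ** 2)


def lifting_holds(state):
    """Check rc(L_2) <= |x3| and rc(L) <= 2 * rc(L_2) + 1 <= 2 * |x3| + 1."""
    n = iterations(guard_l, update_l, state)
    n2 = iterations(guard_l2, update_l2, state)
    return n2 <= abs(state[2]) and n <= 2 * n2 + 1 <= 2 * abs(state[2]) + 1
```

The additive term \(p - 1\) accounts for the at most \(p - 1\) iterations of the original loop that remain after the chained loop can no longer be applied.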

4 Size Bounds for Integer Programs

Up to now, we focused on isolated loops. In the following, we incorporate our complete approach from Sect. 2 and 3 into the setting of general integer programs where most questions regarding termination or complexity are undecidable. Formally, an integer program is a tuple \((\mathcal {V},\mathcal {L},\ell _0,\mathcal {T})\) with a finite set of variables \(\mathcal {V}\), a finite set of locations \(\mathcal {L}\), a fixed initial location \(\ell _0 \in \mathcal {L}\), and a finite set of transitions \(\mathcal {T}\). A transition is a 4-tuple \((\ell ,\varphi ,\eta ,\ell ')\) with a start location \(\ell \in \mathcal {L}\), target location \(\ell '\in \mathcal {L}\setminus \lbrace \ell _0 \rbrace \), guard \(\varphi \in \mathcal {F}(\mathcal {V})\), and update \(\eta : \mathcal {V}\rightarrow \mathbb {Z}[\mathcal {V}]\). To simplify the presentation, we do not consider “temporary” variables (whose update is non-deterministic), but the approach can easily be extended accordingly. Transitions \((\ell _0,\_,\_,\_)\) are called initial and \(\mathcal {T}_0\) denotes the set of all initial transitions.

Fig. 2. An Integer Program with Non-Linear Size Bounds

Example 25

In the integer program of Fig. 2, we omitted identity updates \(\eta (v) = v\) and guards where \(\varphi \) is \(\texttt {true}\). Here, \(\mathcal {V}= \lbrace x_1,\ldots ,x_5 \rbrace \) and \(\mathcal {L}= \{\ell _0, \ell _1, \ell _2\}\), where \(\ell _0\) is the initial location. Note that the loop in (1) corresponds to transition \(t_1\).

Definition 26

(Correspondence between Loops and Transitions). Let \(t = (\ell , \varphi , \eta ,\ell )\) be a transition with \(\varphi \in \mathcal {F}(\mathcal {V}')\) for some variables \(\mathcal {V}'\subseteq \mathcal {V}\) such that \(\eta (x) = x\) for all \(x\in \mathcal {V}\setminus \mathcal {V}'\) and \(\eta (x) \in \mathbb {Z}[\mathcal {V}']\) for all \(x \in \mathcal {V}'\). A loop \((\varphi ', \eta ')\) with \(\varphi '\in \mathcal {F}(\lbrace x_1,\ldots ,x_d \rbrace )\) and \(\eta ': \lbrace x_1,\ldots ,x_d \rbrace \rightarrow \mathbb {Z}[\lbrace x_1,\ldots ,x_d \rbrace ]\) corresponds to the transition t via the variable renaming \(\pi : \lbrace x_1,\ldots ,x_d \rbrace \rightarrow \mathcal {V}'\) if \(\varphi \) is \(\pi (\varphi ')\) and for all \(1\le i \le d\) we have \(\eta (\pi (x_i)) = \pi (\eta '(x_i))\).

To define the semantics of integer programs, an evaluation step moves from one configuration \((\ell ,\sigma )\in \mathcal {L}\times \varSigma \) to another configuration \((\ell ',\sigma ')\) via a transition \((\ell , \varphi , \eta , \ell ')\) where \(\sigma (\varphi )\) holds. Here, \(\sigma '\) is obtained by applying the update \(\eta \) on \(\sigma \). From now on, we fix an integer program \(\mathcal {P}= (\mathcal {V},\mathcal {L},\ell _0,\mathcal {T})\).

Definition 27

(Evaluation of Programs). For configurations \((\ell ,\sigma )\), \((\ell ',\sigma ')\) and \(t = (\ell _t,\varphi ,\eta ,\ell _{t}')\in \mathcal {T}\), \((\ell ,\sigma )\rightarrow _t(\ell ',\sigma ')\) is an evaluation step if \(\ell = \ell _t\), \(\ell ' = \ell _{t}'\), \(\sigma (\varphi ) = {\texttt {true}}\), and \(\sigma (\eta (v)) = \sigma '(v)\) for all \(v\in \mathcal {V}\). Let \(\rightarrow _{\mathcal {T}} \; = \, \bigcup _{t \in \mathcal {T}} \rightarrow _t\), where we also write \(\rightarrow \) instead of \(\rightarrow _t\) or \(\rightarrow _{\mathcal {T}}\). Let \((\ell _0,\sigma _0)\rightarrow ^k(\ell _k,\sigma _k)\) abbreviate \((\ell _0,\sigma _0) \rightarrow \ldots \rightarrow (\ell _k,\sigma _k)\) and let \((\ell ,\sigma ) \rightarrow ^*(\ell ',\sigma ')\) if \((\ell ,\sigma ) \rightarrow ^k(\ell ',\sigma ')\) for some \(k \ge 0\).

Example 28

If we encode states as tuples \((\sigma (x_1),\ldots ,\sigma (x_5))\in \mathbb {Z}^5\), then \((-6, -8, 2,1,1) \rightarrow _{t_0} (-6,-8,2,1,1) \rightarrow ^2_{t_1} (6,8,0,6,1) \rightarrow _{t_2} (6,8,0,6,1)\rightarrow _{t_4}^6 (0,8,0,6,1)\).
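The evaluation relation of Definition 27 is easy to prototype. In the sketch below, a transition is a tuple of start location, guard, update, and target location, and `step` returns the successor configuration or `None` if the transition is not applicable. Only \(t_1\) is modeled, using the update of loop (1) extended by an unchanged \(x_5\); the location name `"l1"` is an assumption of this sketch and all function names are hypothetical.

```python
def step(config, transition):
    """Perform one evaluation step, or return None if it is not possible."""
    location, sigma = config
    source, guard, update, target = transition
    if location != source or not guard(sigma):
        return None
    return (target, update(sigma))


def t1_guard(s):
    """Guard x3 > 0 of loop (1)."""
    return s[2] > 0


def t1_update(s):
    """Update of loop (1); x5 is left unchanged."""
    x1, x2, x3, x4, x5 = s
    return (3 * x1 + 2 * x2, -5 * x1 - 3 * x2, x3 - 1, x4 + x3 ** 2, x5)


# Self-loop corresponding to loop (1); the location name is assumed.
t1 = ("l1", t1_guard, t1_update, "l1")


def run_t1_twice(sigma):
    """Apply t1 two times, as in the middle part of the run in Example 28."""
    config = ("l1", sigma)
    for _ in range(2):
        config = step(config, t1)
    return config
```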

Now we define size bounds for variables v after evaluating a transition t: \({\mathcal{S}\mathcal{B}}(t,v)\) is a size bound for v w.r.t. t if for any run starting in \(\sigma _0\in \varSigma \), \(|\sigma _0|({\mathcal{S}\mathcal{B}}(t, v))\) is greater than or equal to the largest absolute value of v after evaluating t.

Definition 29

(Size Bounds [8, 19]). A function \({\mathcal{S}\mathcal{B}}: (\mathcal {T}\times \mathcal {V}) \rightarrow \mathcal {B}\) is a (global) size bound for the program \(\mathcal {P}\) if for all \((t, x) \in \mathcal {T}\times \mathcal {V}\) and all states \(\sigma _0\in \varSigma \) we have \(|\sigma _0|({\mathcal{S}\mathcal{B}}(t, x)) \ge \sup \lbrace |\sigma '(x)| \mid \exists \, \ell ' \in \mathcal {L}. \; (\ell _0, \sigma _0) \; (\rightarrow ^* \circ \rightarrow _t) \; (\ell ', \sigma ') \rbrace \).

Later in Lemma 35, we will compare the notion of size bounds for transitions in a program from Definition 29 to our earlier notion of size bounds for loops from Definition 6.

Example 30

As an example, we give size bounds for the transitions \(t_0\) and \(t_3\) in Fig. 2. Since \(t_0\) does not change any variables, a size bound is \({\mathcal{S}\mathcal{B}}(t_0, x_i) = x_i\) for all \(1 \le i \le 5\). Note that the value of \(x_5\) is never increased and is bounded from below by 0 in any run through the program. Thus, \({\mathcal{S}\mathcal{B}}(t_3, x_3) = x_5 = {\mathcal{S}\mathcal{B}}(t_3, x_5)\). Similarly, we have \({\mathcal{S}\mathcal{B}}(t_3, x_1) = 2\cdot x_5\), \({\mathcal{S}\mathcal{B}}(t_3, x_2) = 3\cdot x_5\), and \({\mathcal{S}\mathcal{B}}(t_3, x_4) = x_3\).

To infer size bounds for transitions as in Definition 29 automatically, we lift local size bounds (i.e., size bounds which only hold for a subprogram with transitions \(\mathcal {T}'\subseteq \mathcal {T}\setminus \mathcal {T}_0\)) to global size bounds for the complete program. For the subprogram, one considers runs which start after evaluating an entry transition of \(\mathcal {T}'\).

Definition 31

(Entry Transitions [8]). Let \(\varnothing \ne \mathcal {T}' \subseteq \mathcal {T}\setminus \mathcal {T}_0\). The entry transitions of \(\mathcal {T}'\) are \( \mathcal {E}_{\mathcal {T}'} = \lbrace t \mid t\!=\!(\_,\_,\_,\ell )\!\in \!\mathcal {T}\setminus \mathcal {T}' \text { and there is a } (\ell ,\_,\_,\_)\!\in \!\mathcal {T}' \rbrace \).

Example 32

For the program in Fig. 2, we have \(\mathcal {E}_{\lbrace t_1 \rbrace } = \{t_0, t_3\}\) and \(\mathcal {E}_{\lbrace t_4 \rbrace } = \lbrace t_2 \rbrace \).
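Definition 31 only depends on the control-flow graph, so the sets in Example 32 can be computed directly. The sketch below is illustrative: the source and target locations of the transitions are an assumption inferred from the examples in this paper (Fig. 2 is not reproduced here), and `entry_transitions` is a hypothetical helper name.

```python
# Transitions as (name, source, target); guards and updates are irrelevant here.
# ASSUMPTION: this control-flow shape of Fig. 2 is inferred from the examples.
T = [
    ("t0", "l0", "l1"),
    ("t1", "l1", "l1"),
    ("t2", "l1", "l2"),
    ("t3", "l2", "l1"),
    ("t4", "l2", "l2"),
]

def entry_transitions(transitions, sub):
    """Names of transitions outside `sub` whose target location is the
    source location of some transition in `sub` (Definition 31)."""
    sources = {src for (name, src, _) in transitions if name in sub}
    return {
        name
        for (name, _, tgt) in transitions
        if name not in sub and tgt in sources
    }

print(sorted(entry_transitions(T, {"t1"})))  # ['t0', 't3']
print(sorted(entry_transitions(T, {"t4"})))  # ['t2']
```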

Definition 33

(Local Size Bounds). Let \(\varnothing \ne \mathcal {T}'\subseteq \mathcal {T}\setminus \mathcal {T}_0\) and \(t'\in \mathcal {T}'\). \({\mathcal{S}\mathcal{B}}_{t'}:\mathcal {V}\rightarrow \mathcal {B}\) is a local size bound for \(t'\) w.r.t. \(\mathcal {T}'\) if for all \(x\in \mathcal {V}\) and all \(\sigma \in \varSigma \) (see Footnote 4): \(|\sigma | ({\mathcal{S}\mathcal{B}}_{t'}(x)) \ge \sup \lbrace |\sigma '(x)|\mid \exists \ell ' \in \mathcal {L}, (\_,\_,\_,\ell )\in \mathcal {E}_{\mathcal {T}'}.\; (\ell ,\sigma )\; (\rightarrow ^*_{\mathcal {T}'}\circ \rightarrow _{t'}) \; (\ell ',\sigma ') \rbrace \).

Theorem 34 below yields a novel modular procedure to infer (global) size bounds from previously computed local size bounds. A local size bound for a transition \(t'\) w.r.t. a subprogram \(\mathcal {T}'\subseteq \mathcal {T}\setminus \mathcal {T}_0\) is lifted by inserting size bounds for all entry transitions. Again, this is possible because we only use weakly monotonically increasing functions as bounds. Here, “\(b\left[ v/p_v \mid v\in \mathcal {V}\right] \)” denotes the bound which results from replacing every variable v by \(p_v\) in the bound b.

Theorem 34

(Lifting Local Size Bounds). Let \(\varnothing \ne \mathcal {T}'\subseteq \mathcal {T}\setminus \mathcal {T}_0\), let \({\mathcal{S}\mathcal{B}}_{t'}\) be a local size bound for a transition \(t'\) w.r.t. \(\mathcal {T}'\) and let \({\mathcal{S}\mathcal{B}}: (\mathcal {T}\times \mathcal {V})\rightarrow \mathcal {B}\) be a size bound for \(\mathcal {P}\). Let \({\mathcal{S}\mathcal{B}}'(t',x) = \sum _{r\in \mathcal {E}_{\mathcal {T}'}} {\mathcal{S}\mathcal{B}}_{t'}(x) \left[ v/{\mathcal{S}\mathcal{B}}(r,v) \mid v\in \mathcal {V}\right] \) and \({\mathcal{S}\mathcal{B}}'(t,x) = {\mathcal{S}\mathcal{B}}(t,x)\) for all \(t' \ne t\). Then \({\mathcal{S}\mathcal{B}}'\) is also a size bound for \(\mathcal {P}\).

To obtain local size bounds which can then be lifted via Theorem 34, we look for transitions \(t_L\) that correspond to a loop L and then we compute a size bound for L as in Sect. 2 and 3. The following lemma shows that size bounds for loops as in Definition 6 indeed yield local size bounds for the corresponding transitions (see Footnote 5).

Lemma 35

(Local Size Bounds via Loops). Let \({\mathcal{S}\mathcal{B}}_L\) be a size bound for a loop L (as in Definition 6) which corresponds to a transition \(t_L\) via a variable renaming \(\pi \). Then \(\pi \circ {\mathcal{S}\mathcal{B}}_L \circ \pi ^{-1}\) is a local size bound for \(t_L\) w.r.t. \(\{t_L\}\) (as in Definition 33).

Example 36

\({\mathcal{S}\mathcal{B}}_L(x_4) = x_4 + 3\cdot x_3^3 + 2\cdot x_3^2 + x_3\) is a size bound for \(x_4\) in the loop (1), see Example 8. This loop corresponds to transition \(t_1\) in the program of Fig. 2. Since \(\mathcal {E}_{\lbrace t_1 \rbrace } = \{t_0, t_3\}\) by Example 32, Theorem 34 yields the following (non-linear) size bound for \(x_4\) in the full program of Fig. 2 (see Example 30 for \({\mathcal{S}\mathcal{B}}(t_0,v)\) and \({\mathcal{S}\mathcal{B}}(t_3,v)\)):

$$\begin{aligned} {\mathcal{S}\mathcal{B}}(t_1,x_4)&= {\mathcal{S}\mathcal{B}}_L(x_4) \left[ v/{\mathcal{S}\mathcal{B}}(t_0,v) \mid v\in \mathcal {V}\right] + {\mathcal{S}\mathcal{B}}_L(x_4) \left[ v/{\mathcal{S}\mathcal{B}}(t_3,v) \mid v\in \mathcal {V}\right] \\&= (x_4 + 3\cdot x_3^3 + 2\cdot x_3^2 + x_3) + (x_3 + 3\cdot x_5^3 + 2\cdot x_5^2 + x_5) \\&= 2\cdot x_3 + 2\cdot x_3^2 + 3\cdot x_3^3 + x_4 + x_5 + 2\cdot x_5^2 + 3\cdot x_5^3 \end{aligned}$$

Analogously, we infer the remaining size bounds \({\mathcal{S}\mathcal{B}}(t_1,x_i)\), e.g., \({\mathcal{S}\mathcal{B}}(t_1,x_1)\!=\!(4\cdot x_1 + 2\cdot x_2) \left[ v/{\mathcal{S}\mathcal{B}}(t_0,v) \mid v\!\in \!\mathcal {V}\right] + (4\cdot x_1 +\! 2\cdot x_2)\left[ v/{\mathcal{S}\mathcal{B}}(t_3,v) \mid v\!\in \!\mathcal {V}\right] = 4\cdot x_1 + 2\cdot x_2 + 14\cdot x_5\).
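The substitution in Theorem 34 can be checked numerically on Example 36. In the following Python sketch, bounds are represented as functions on (non-negative) variable assignments, so that substituting \({\mathcal{S}\mathcal{B}}(r,v)\) for v amounts to function composition. The representation and the names `lift` and `closed_form_x4` are illustrative assumptions; the bounds themselves are taken from Examples 30 and 36.

```python
VARS = ["x1", "x2", "x3", "x4", "x5"]

# Global size bounds for the entry transitions t0 and t3 (Example 30).
SB = {
    "t0": {v: (lambda s, v=v: s[v]) for v in VARS},
    "t3": {
        "x1": lambda s: 2 * s["x5"],
        "x2": lambda s: 3 * s["x5"],
        "x3": lambda s: s["x5"],
        "x4": lambda s: s["x3"],
        "x5": lambda s: s["x5"],
    },
}

def sb_local_x4(s):
    # Local size bound SB_L(x4) for loop (1), see Example 36.
    return s["x4"] + 3 * s["x3"] ** 3 + 2 * s["x3"] ** 2 + s["x3"]

def sb_local_x1(s):
    # Local size bound 4*x1 + 2*x2 for x1, as used in the text.
    return 4 * s["x1"] + 2 * s["x2"]

def lift(local_bound, entries, s):
    """Sum over entry transitions r of local_bound[v / SB(r, v)], evaluated at s."""
    total = 0
    for r in entries:
        substituted = {v: SB[r][v](s) for v in VARS}
        total += local_bound(substituted)
    return total

def closed_form_x4(s):
    # Last line of the displayed computation in Example 36.
    x3, x4, x5 = s["x3"], s["x4"], s["x5"]
    return 2 * x3 + 2 * x3 ** 2 + 3 * x3 ** 3 + x4 + x5 + 2 * x5 ** 2 + 3 * x5 ** 3

sample = {"x1": 1, "x2": 1, "x3": 2, "x4": 1, "x5": 1}
print(lift(sb_local_x4, ["t0", "t3"], sample), closed_form_x4(sample))  # 43 43
print(lift(sb_local_x1, ["t0", "t3"], sample))  # 20 = 4*1 + 2*1 + 14*1
```

Evaluating at a single sample point is of course only a sanity check of the arithmetic, not a proof that the two expressions coincide.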

Our approach alternates between improving size and runtime bounds for individual transitions. We start with \({\mathcal{S}\mathcal{B}}(t_0,x) = |\eta (x)|\) for initial transitions \(t_0\in \mathcal {T}_0\) where \(\eta \) is \(t_0\)’s update, and \({\mathcal{S}\mathcal{B}}(t,\_) = \omega \) for \(t\in \mathcal {T}\setminus \mathcal {T}_0\). Here, similar to the notion \(\lceil |p| \rceil \) in Sect. 2, for every polynomial \(p = \sum _j c_{j}\cdot \beta _{j}\) with normalized monomials \(\beta _j\), |p| is the polynomial \(\sum _j |c_{j}|\cdot \beta _{j}\). To improve the size bounds of transitions that correspond to (possibly non-linear) solvable loops, we can use closed forms (Theorem 7) and the lifting via Theorem 34. Otherwise, we use an existing incomplete technique [8] to improve size bounds (where [8] essentially only succeeds for updates without non-linear arithmetic). In this way, we can automatically compute polynomial size bounds for all remaining transitions and variables in the program of Fig. 2 (e.g., we obtain \({\mathcal{S}\mathcal{B}}(t_2,x_1) = {\mathcal{S}\mathcal{B}}(t_1,x_1) = 4\cdot x_1 + 2\cdot x_2 + 14\cdot x_5\)).
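To illustrate the notation |p|, the following Python sketch replaces all coefficients of a polynomial by their absolute values and checks on one state that \(|\sigma |(|p|)\) over-approximates \(|\sigma (p)|\). The dictionary representation of polynomials and the function names are assumptions made for this illustration only.

```python
# A polynomial maps normalized monomials to integer coefficients; a monomial is
# a tuple of (variable, exponent) pairs. This representation is illustrative.

def abs_poly(p):
    """|p|: replace every coefficient by its absolute value."""
    return {monomial: abs(c) for monomial, c in p.items()}

def evaluate(p, state):
    """Evaluate the polynomial p in the given state."""
    total = 0
    for monomial, c in p.items():
        term = c
        for var, exp in monomial:
            term *= state[var] ** exp
        total += term
    return total

# p = -5*x1 - 3*x2, i.e., the update of x2 in loop (1).
p = {(("x1", 1),): -5, (("x2", 1),): -3}
sigma = {"x1": 6, "x2": -8}
abs_sigma = {v: abs(val) for v, val in sigma.items()}

value = evaluate(p, sigma)                # -5*6 - 3*(-8) = -6
bound = evaluate(abs_poly(p), abs_sigma)  # 5*6 + 3*8 = 54
print(abs(value) <= bound)  # True
```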

Both the technique from [8] and our approach from Theorem 7 rely on runtime bounds to compute size bounds. On the other hand, as shown in [8, 19, 27], size bounds for “previous” transitions are needed to infer (global) runtime bounds for transitions in a program. For that reason, the alternated computation resp. improvement of global size and runtime bounds for the transitions is repeated until all bounds are finite. We will illustrate this in more detail in Sect. 5.

In Definition 26 and Lemma 35 we considered transitions with the same start and target location that directly correspond to loops. To increase the applicability of our approach, as in [27] we now consider so-called simple cycles, where iterations through the cycle can only be done in a unique way. So the cycle must not have subcycles and there must not be any non-determinism concerning the next transition to be taken. Formally, \(\mathcal {C} = \{t_1,\ldots ,t_n\}\subseteq \mathcal {T}\) is a simple cycle if there are pairwise different locations \(\ell _1,\ldots ,\ell _n\) such that \(t_i = (\ell _i, \_, \_, \ell _{i+1})\) for \(1 \le i \le n-1\) and \(t_n = (\ell _n, \_, \_, \ell _1)\). To handle simple cycles, we chain transitions (see Footnote 6).

Definition 37

(Chaining (see, e.g., [27])). Let \(t_1,\ldots ,t_n \in \mathcal {T}\) where \(t_i = (\ell _i, \varphi _i, \eta _i, \ell _{i+1})\) for all \(1 \le i \le n\). Then the transition \(t_1 \star \ldots \star t_n = (\ell _1, \varphi , \eta , \ell _{n+1})\) results from chaining \(t_1,\ldots ,t_n\) where

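[Figure: definition of the guard \(\varphi \) and the update \(\eta \) of the chained transition \(t_1 \star \ldots \star t_n\)]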

Now we want to compute a local size bound for the transition \(t_n\) w.r.t. a simple cycle \(\mathcal {C} = \{t_1, \ldots , t_n\}\) where a loop L corresponds to \(t_1\star \ldots \star t_n\) via \(\pi \). Then a size bound \({\mathcal{S}\mathcal{B}}_L\) for the loop L yields the size bound \(\pi \circ {\mathcal{S}\mathcal{B}}_L \circ \pi ^{-1}\) for \(t_n\) regarding runs through \(\mathcal {C}\) starting in \(t_1\). However, to obtain a local size bound \({\mathcal{S}\mathcal{B}}_{t_n}\) w.r.t. \(\mathcal {C}\), we have to consider runs starting after any entry transition \((\_,\_,\_,\ell _i)\in \mathcal {E}_{\mathcal {C}}\). Hence, we use \(| \, \eta _n(\ldots \eta _i(\pi ({\mathcal{S}\mathcal{B}}_L(\pi ^{-1}(x))))\ldots ) \, |\) for any \((\_,\_,\_,\ell _i)\in \mathcal {E}_{\mathcal {C}}\). In this way, we also capture evaluations starting in \(\ell _i\), i.e., without evaluating the complete cycle.

Theorem 38

(Local Size Bounds for Simple Cycles). Let \(\mathcal {C} = \lbrace t_1,\ldots ,t_n \rbrace \subseteq \mathcal {T}\) be a simple cycle and let \({\mathcal{S}\mathcal{B}}_L\) be a size bound for a loop L which corresponds to \(t_1\star \ldots \star t_n\) via a variable renaming \(\pi \). Then a local size bound \({\mathcal{S}\mathcal{B}}_{t_n}\) for \(t_n\) w.r.t. \(\mathcal {C}\) is \({\mathcal{S}\mathcal{B}}_{t_n}(x) = \sum _{1 \le i \le n, (\_,\_,\_,\ell _i)\in \mathcal {E}_{\mathcal {C}}} \; | \, \eta _n(\ldots \eta _i(\pi ({\mathcal{S}\mathcal{B}}_L(\pi ^{-1}(x))))\ldots ) \, |\).

Example 39

As an example, in the program of Fig. 2 we replace \(t_1 = (\ell _1, x_3 > 0, \eta _{1}, \ell _1)\) by \(t_{1a} = (\ell _1, \texttt {true}, \eta _{1a}, \ell _1')\) and \(t_{1b} = (\ell _1', x_3 > 0, \eta _{1b}, \ell _1)\) with a new location \(\ell _1'\), where \(\eta _{1a}(v) = \eta _1(v)\) for \(v \in \{ x_1, x_2 \}\), \(\eta _{1b}(v) = \eta _1(v)\) for \(v \in \{ x_3, x_4 \}\), and \(\eta _{1a}\) resp. \(\eta _{1b}\) are the identity on the remaining variables. Then \(\lbrace t_{1a},t_{1b} \rbrace \) forms a simple cycle and Theorem 38 allows us to compute local size bounds \({\mathcal{S}\mathcal{B}}_{t_{1b}}\) and \({\mathcal{S}\mathcal{B}}_{t_{1a}}\) w.r.t. \(\lbrace t_{1a},t_{1b} \rbrace \), because the chained transitions \(t_{1a} \star t_{1b} = t_1\) and \(t_{1b} \star t_{1a}\) both correspond to the loop (1). They can then be lifted to global size bounds as in Example 36 using size bounds for the entry transitions \(\mathcal {E}_{\lbrace t_{1a},t_{1b} \rbrace } = \{t_0, t_3\}\).
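The claim of Example 39 that both chainings behave like loop (1) can be tested on concrete states. The sketch below uses an operational reading of chaining, which is our assumption here since the formula of Definition 37 is not reproduced in the text: the guard of the i-th transition is checked in the state reached after the first \(i-1\) updates, and the updates are applied in order. The function name `chain` is hypothetical.

```python
# States are tuples (x1, x2, x3, x4, x5); a transition is a (guard, update) pair.

def chain(*transitions):
    """Chain transitions: guard i is checked after the first i-1 updates,
    and the chained update applies all updates in order (assumed reading)."""
    def guard(s):
        for g, u in transitions:
            if not g(s):
                return False
            s = u(s)
        return True

    def update(s):
        for _, u in transitions:
            s = u(s)
        return s

    return guard, update

def eta1(s):
    x1, x2, x3, x4, x5 = s
    return (3 * x1 + 2 * x2, -5 * x1 - 3 * x2, x3 - 1, x4 + x3 ** 2, x5)

def eta1a(s):  # updates x1 and x2 only
    x1, x2, x3, x4, x5 = s
    return (3 * x1 + 2 * x2, -5 * x1 - 3 * x2, x3, x4, x5)

def eta1b(s):  # updates x3 and x4 only
    x1, x2, x3, x4, x5 = s
    return (x1, x2, x3 - 1, x4 + x3 ** 2, x5)

t1 = (lambda s: s[2] > 0, eta1)
t1a = (lambda s: True, eta1a)
t1b = (lambda s: s[2] > 0, eta1b)

samples = [(-6, -8, 2, 1, 1), (1, 2, 0, 4, 5), (3, -1, 7, 0, 2)]
for first, second in [(t1a, t1b), (t1b, t1a)]:
    guard, update = chain(first, second)
    print(all(guard(s) == t1[0](s) and update(s) == t1[1](s) for s in samples))
```

Both lines print `True`: since \(\eta _{1a}\) and \(\eta _{1b}\) read and write disjoint sets of variables, the order of chaining does not matter in this example.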

This shows how we choose \(t'\) and \(\mathcal {T}'\) when lifting local size bounds to global ones with Theorem 34: For a transition \(t'\) we search for a simple cycle \(\mathcal {T}'\) such that chaining the cycle results in a twn- or suitable solvable loop and the size bounds of \(\mathcal {E}_{\mathcal {T}'}\) are finite. For all other transitions, we compute size bounds as in [8].

5 Completeness of Size and Runtime Analysis for Programs

For individual loops, we showed in Theorem 23 that polynomial runtime bounds and finite size bounds are computable for all terminating prs loops. In this section, we discuss completeness of the size bound technique from the previous section and of termination and runtime complexity analysis for general integer programs. We show that for a large class of programs consisting of consecutive prs loops, in case of termination we can always infer finite runtime and size bounds.

To this end, we briefly recapitulate how size bounds are used to compute runtime bounds for general integer programs, and show that our new technique to infer size bounds also results in better runtime bounds. We call \({\mathcal{R}\mathcal{B}}: \mathcal {T}\rightarrow \mathcal {B}\) a (global) runtime bound if for every transition \(t\in \mathcal {T}\) and state \(\sigma _0 \in \varSigma \), \(|\sigma _0|({\mathcal{R}\mathcal{B}}(t))\) over-approximates the number of evaluations of t in any run starting in \((\ell _0,\sigma _0)\).

Definition 40

(Runtime Bound [8, 19]). A function \({\mathcal{R}\mathcal{B}}: \mathcal {T}\rightarrow \mathcal {B}\) is a (global) runtime bound if for all \(t \in \mathcal {T}\) and all states \(\sigma _0\in \varSigma \), we have \(|\sigma _0|({\mathcal{R}\mathcal{B}}(t)) \; \ge \; \sup \lbrace n \in \mathbb {N}\mid \exists \, (\ell ', \sigma ').\; (\ell _0, \sigma _0) \; (\rightarrow ^*_{\mathcal {T}} \circ \rightarrow _t)^n \; (\ell ', \sigma ') \rbrace \).

For our example in Fig. 2, a global runtime bound for \(t_0\), \(t_2\), and \(t_3\) is \({\mathcal{R}\mathcal{B}}(t_0) = 1\) and \({\mathcal{R}\mathcal{B}}(t_2) = {\mathcal{R}\mathcal{B}}(t_3) = x_5\), as \(x_5\) is bounded from below by \(t_3\)’s guard \(x_5 > 1\) and the value of \(x_5\) decreases by 1 in \(t_3\), and no transition increases \(x_5\).

To infer global runtime bounds automatically, similarly to size bounds, we first consider a smaller subprogram \(\mathcal {T}'\subseteq \mathcal {T}\) and compute local runtime bounds for non-empty subsets \(\mathcal {T}'_>\subseteq \mathcal {T}'\). A local runtime bound measures how often a transition \(t\in \mathcal {T}'_>\) can occur in a run through \(\mathcal {T}'\) that starts after an entry transition \(r\in \mathcal {E}_{\mathcal {T}'}\). Thus, local runtime bounds do not consider how many \(\mathcal {T}'\)-runs take place in a global run and they do not consider the sizes of the variables before starting a \(\mathcal {T}'\)-run. We lift these local bounds to global runtime bounds for the complete program afterwards.

Definition 41

(Local Runtime Bound [27]). Let \(\varnothing \ne \mathcal {T}'_>\subseteq \mathcal {T}'\subseteq \mathcal {T}\). \({\mathcal{R}\mathcal{B}_{\mathcal {T}'_>}}\in \mathcal {B}\) is a local runtime bound for \(\mathcal {T}'_>\) w.r.t. \(\mathcal {T}'\) if for all \(t \in \mathcal {T}'_>\), all \(r\in \mathcal {E}_{\mathcal {T}'}\) with \(r = (\_,\_,\_,\ell )\), and all \(\sigma \in \varSigma \), we have \(|\sigma |({\mathcal{R}\mathcal{B}_{\mathcal {T}'_>}}) \; \ge \; \sup \lbrace n \in \mathbb {N}\mid \exists \, \sigma _0, (\ell ', \sigma '). \; (\ell _0, \sigma _0) \rightarrow _{\mathcal {T}}^* \circ \rightarrow _{r} \, (\ell , \sigma ) \; (\rightarrow _{\mathcal {T}'}^* \circ \rightarrow _t)^n \; (\ell ', \sigma ') \rbrace \).

Example 42

In Fig. 2, local runtime bounds for \(\mathcal {T}'_> = \mathcal {T}' = \lbrace t_1 \rbrace \) and for \(\mathcal {T}'_> =\mathcal {T}' = \lbrace t_4 \rbrace \) are \(\mathcal{R}\mathcal{B}_{\lbrace t_1 \rbrace } = x_3\) and \(\mathcal{R}\mathcal{B}_{\lbrace t_4 \rbrace } =x_1\). Local runtime bounds can often be inferred automatically by approaches based on ranking functions (see, e.g., [8]) or by the complete technique for terminating prs loops (see Theorem 23).

If we have a local runtime bound \({\mathcal{R}\mathcal{B}_{\mathcal {T}'_>}}\) w.r.t. \(\mathcal {T}'\), then setting \({\mathcal{R}\mathcal{B}}(t)\) to \(\sum _{r\in \mathcal {E}_{\mathcal {T}'}} {\mathcal{R}\mathcal{B}}(r)\cdot ({\mathcal{R}\mathcal{B}_{\mathcal {T}'_>}}\left[ v/{\mathcal{S}\mathcal{B}}(r,v) \mid v\!\in \!\mathcal {V}\right] )\) for all \(t\in \mathcal {T}'_>\) yields a global runtime bound [27]. Here, we over-approximate the number of local \(\mathcal {T}'\)-runs which are started by an entry transition \(r\in \mathcal {E}_{\mathcal {T}'}\) by an already computed global runtime bound \({\mathcal{R}\mathcal{B}}(r)\). Moreover, we instantiate each \(v \in \mathcal {V}\) by a size bound \({\mathcal{S}\mathcal{B}}(r,v)\) to consider the size of v before a local \(\mathcal {T}'\)-run is started. So as mentioned in Sect. 4, we need runtime bounds to infer size bounds (see Theorem 7 and the inference of global size bounds in [8]), and on the other hand we need size bounds to compute runtime bounds. Thus, our implementation alternates between size bound and runtime bound computations (see [8, 27] for a more detailed description of this alternation).

Example 43

Based on the local runtime bounds in Example 42, we can compute the remaining global runtime bounds for our example. We obtain \({\mathcal{R}\mathcal{B}}(t_1) = {\mathcal{R}\mathcal{B}}(t_0)\cdot (x_3\left[ v/{\mathcal{S}\mathcal{B}}(t_0,v) \mid v\in \mathcal {V}\right] ) + {\mathcal{R}\mathcal{B}}(t_3)\cdot (x_3\left[ v/{\mathcal{S}\mathcal{B}}(t_3,v) \mid v\in \mathcal {V}\right] ) = x_3 + x_5^2\) and \({\mathcal{R}\mathcal{B}}(t_4) = {\mathcal{R}\mathcal{B}}(t_2)\cdot (x_1\left[ v/{\mathcal{S}\mathcal{B}}(t_2,v) \mid v\in \mathcal {V}\right] ) = x_5\cdot (4\cdot x_1 + 2\cdot x_2 + 14\cdot x_5)\). Thus, overall we have a quadratic runtime bound \(\sum _{0 \le i \le 4} {\mathcal{R}\mathcal{B}}(t_i)\). Note that it is due to our new size bound technique from Sect. 2–4 that we obtain polynomial runtime bounds in this example. In contrast, to the best of our knowledge, all other state-of-the-art tools fail to infer polynomial size or runtime bounds for this example. Similarly, if one modifies \(t_4\) such that instead of \(x_1\), \(x_4\) is decreased as long as \(x_4 > 0\) holds, then our approach again yields a polynomial runtime bound, whereas none of the other tools can infer finite runtime bounds.
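The arithmetic of Example 43 can be replayed with a small Python sketch of the lifting of local runtime bounds. As before, bounds are modeled as functions on non-negative variable assignments; this representation and the name `lift_runtime` are illustrative assumptions, while all bounds are the ones stated in the text.

```python
# Global runtime bounds that are already known for t0, t2, and t3.
def rb_t0(s):
    return 1

def rb_t2(s):
    return s["x5"]

def rb_t3(s):
    return s["x5"]

def lift_runtime(local_rb, entries, s):
    """Sum over entry transitions r of RB(r) * local_rb[v / SB(r, v)] at s.

    entries is a list of pairs (RB(r), {v: SB(r, v)}); only the variables
    occurring in local_rb need to be listed.
    """
    total = 0
    for rb_r, sb_r in entries:
        substituted = {v: bound(s) for v, bound in sb_r.items()}
        total += rb_r(s) * local_rb(substituted)
    return total

def rb_t1(s):
    # Local bound x3 for {t1}; entry transitions t0 and t3.
    entries = [
        (rb_t0, {"x3": lambda u: u["x3"]}),  # SB(t0, x3) = x3
        (rb_t3, {"x3": lambda u: u["x5"]}),  # SB(t3, x3) = x5
    ]
    return lift_runtime(lambda u: u["x3"], entries, s)

def rb_t4(s):
    # Local bound x1 for {t4}; the only entry transition is t2.
    sb_t2_x1 = lambda u: 4 * u["x1"] + 2 * u["x2"] + 14 * u["x5"]
    return lift_runtime(lambda u: u["x1"], [(rb_t2, {"x1": sb_t2_x1})], s)

sample = {"x1": 1, "x2": 2, "x3": 3, "x4": 0, "x5": 4}
print(rb_t1(sample))  # 19 = x3 + x5^2
print(rb_t4(sample))  # 256 = x5 * (4*x1 + 2*x2 + 14*x5)
```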

Finally, we state our completeness results for integer programs. For a set \(\mathcal {C} \subseteq \mathcal {T}\) and \(\ell , \ell ' \in \mathcal {L}\), let \(\ell \rightsquigarrow _\mathcal {C} \ell '\) hold iff there is a transition \((\ell , \_, \_, \ell ') \in \mathcal {C}\). We say that \(\mathcal {C}\) is a component if we have \(\ell \rightsquigarrow _\mathcal {C}^+ \ell '\) for all locations \(\ell , \ell '\) occurring in \(\mathcal {C}\), where \(\rightsquigarrow _\mathcal {C}^+\) is the transitive closure of \(\rightsquigarrow _\mathcal {C}\). So in particular, we must also have \(\ell \rightsquigarrow _\mathcal {C}^+ \ell \) for all locations \(\ell \) in the transitions of \(\mathcal {C}\). We call an integer program simple if every component is a simple cycle that is “reachable” from any initial state.

Definition 44

(Simple Integer Program). An integer program \((\mathcal {V},\mathcal {L},\ell _0,\mathcal {T})\) is simple if every component \(\mathcal {C} \subseteq \mathcal {T}\) is a simple cycle, and for every entry transition \((\_,\_,\_,\ell )\in \mathcal {E}_{\mathcal {C}}\) and every \(\sigma _0\in \varSigma \), there is an evaluation \((\ell _0,\sigma _0) \rightarrow ^*_{\mathcal {T}} (\ell ,\sigma _0)\).
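The structural part of Definition 44 (every component is a simple cycle) can be checked on the control-flow graph alone; the reachability condition is not covered by the following sketch. The graph of Fig. 2 used below is an assumption inferred from the examples in this paper, and all function names are hypothetical.

```python
from itertools import combinations

# Transitions as (name, source, target).
# ASSUMPTION: control-flow shape of Fig. 2, inferred from the examples.
FIG2 = [("t0", "l0", "l1"), ("t1", "l1", "l1"), ("t2", "l1", "l2"),
        ("t3", "l2", "l1"), ("t4", "l2", "l2")]

def transitive_closure(edges):
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

def is_component(C):
    """Every location of C reaches every location of C (itself included)
    in at least one step using only transitions of C."""
    locs = {loc for (_, src, tgt) in C for loc in (src, tgt)}
    reach = transitive_closure({(src, tgt) for (_, src, tgt) in C})
    return bool(C) and all((a, b) in reach for a in locs for b in locs)

def is_simple_cycle(C):
    """Sources are pairwise different and following the targets from the
    first source returns to it after exactly len(C) steps."""
    by_source = {src: tgt for (_, src, tgt) in C}
    if not C or len(by_source) != len(C):
        return False
    start = C[0][1]
    loc, steps = start, 0
    while steps < len(C):
        if loc not in by_source:
            return False
        loc = by_source[loc]
        steps += 1
        if loc == start:
            return steps == len(C)
    return False

def components(T):
    """All non-empty subsets of T that are components (exponential, demo only)."""
    return [list(C) for k in range(1, len(T) + 1)
            for C in combinations(T, k) if is_component(list(C))]

rest = FIG2[1:]                                  # T without t0
print(is_component(rest), is_simple_cycle(rest))  # True False

without_t3 = [t for t in FIG2 if t[0] != "t3"]
print([[name for (name, _, _) in C] for C in components(without_t3)])
# [['t1'], ['t4']]
```

Enumerating all subsets is only feasible for tiny examples; it is used here to keep the sketch close to the definition of components.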

In Fig. 2, \(\mathcal {T}\setminus \lbrace t_0 \rbrace \) is a component that is not a simple cycle. However, if we remove \(t_3\) and replace \(t_0\)’s guard by \(\texttt {true}\), then the resulting program \(\mathcal {P}'\) is simple (but not linear). A simple program terminates iff each of its isolated simple cycles terminates. Thus, if we can prove termination for every simple cycle, then the overall program terminates. Hence, if after chaining, every simple cycle corresponds to a linear, unit prs loop, then we can decide termination and infer polynomial runtime and size bounds for the overall integer program. For terminating, non-unit prs loops, runtime bounds are still polynomial but size bounds can be exponential. In that case, the global runtime bounds can be exponential as well, because the size bounds of the entry transitions are substituted into the local runtime bounds. Note that in the example program \(\mathcal {P}'\) above, the eigenvalues of the update matrices of \(t_1\) and \(t_4\) have absolute value 1, i.e., \(t_1\) and \(t_4\) correspond to unit prs loops. Hence, by Theorem 45 we obtain polynomial runtime and size bounds for \(\mathcal {P}'\).
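The statement about the eigenvalues can be verified for the \((x_1, x_2)\)-block of \(t_1\)'s update, whose matrix has the rows \((3, 2)\) and \((-5, -3)\) according to loop (1). The Python sketch below computes the roots of the characteristic polynomial of a \(2 \times 2\) matrix; the helper name `eigenvalues_2x2` is ours.

```python
import cmath

def eigenvalues_2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] as roots of the characteristic
    polynomial lambda^2 - (a + d) * lambda + (a * d - b * c)."""
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2

# (x1, x2) <- (3*x1 + 2*x2, -5*x1 - 3*x2) from loop (1)
lam1, lam2 = eigenvalues_2x2(3, 2, -5, -3)
print(lam1, lam2)            # the two complex eigenvalues i and -i
print(abs(lam1), abs(lam2))  # both have absolute value 1
```

Here the trace is 0 and the determinant is 1, so the characteristic polynomial is \(\lambda ^2 + 1\) with the roots \(\pm i\).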

Theorem 45

(Completeness Results for Integer Programs)

  (a) Termination is decidable for all simple linear integer programs where after chaining, all simple cycles correspond to prs loops.

  (b) Finite runtime and size bounds are computable for all simple integer programs where after chaining, all simple cycles correspond to terminating prs loops.

  (c) If in addition to (b), all simple cycles correspond to unit prs loops, then the runtime and size bounds are polynomial.

In the definition of simple integer programs (Definition 44), we required that for every component \(\mathcal {C}\) and every entry transition \((\_,\_,\_,\ell )\in \mathcal {E}_{\mathcal {C}}\), there is an evaluation \((\ell _0,\sigma _0) \rightarrow ^*_{\mathcal {T}} (\ell ,\sigma _0)\) for every \(\sigma _0\in \varSigma \). If one strengthens this by requiring that one can reach \(\ell \) from \(\ell _0\) using only transitions whose guard is \(\texttt {true}\) and whose update is the identity, then the class of programs in Theorem 45 (a) is decidable (there are only n ways to chain a simple cycle with n transitions and checking whether a loop is a prs loop is decidable by Lemma 19).

6 Conclusion and Evaluation

Conclusion. In this paper, we developed techniques to infer size bounds automatically and to use them in order to obtain bounds on the runtime complexity of programs. This yields a complete procedure to prove termination and to infer runtime and size bounds for a large class of integer programs. Moreover, we showed how to integrate the complete technique into an (incomplete) modular technique for general integer programs. To sum up, we presented the following new contributions in this paper:

  (a) We showed how to use closed forms in order to infer size bounds for loops with possibly non-linear arithmetic in Theorem 7.

  (b) We proved completeness of our novel approach for terminating prs loops (see Theorem 23) in Sect. 3.

  (c) We embedded our approach for loops into the setting of general integer programs in Sect. 4 and showed completeness of our approach for simple integer programs with only prs loops in Sect. 5.

  (d) Finally, we implemented a prototype of our procedure in our re-implementation of the tool KoAT, written in OCaml. It integrates the computation of size bounds via closed forms for twn-loops and homogeneous (and thus linear) solvable loops into the complexity analysis for general integer programs (see Footnote 7).

To infer local runtime bounds as in Definition 41, KoAT first applies multiphase-linear ranking functions (see [5, 19]), which can be done very efficiently. For twn-loops where no finite bound was found, it then uses the computability of runtime bounds for terminating twn-loops (see [17, 20, 27]). When computing size bounds, KoAT first applies the technique of [8] for reasons of efficiency and in case of exponential or infinite size bounds, it tries to compute size bounds via closed forms as in the current paper. Here, SymPy [30] is used to compute Jordan normal forms for the transformation to twn-loops. Moreover, KoAT applies a local control-flow refinement technique [19] (using the tool iRankFinder [13]) and preprocesses the program in the beginning, e.g., by extending the guards of transitions with invariants inferred by Apron [24]. For all SMT problems, KoAT uses Z3 [31]. In the future, we plan to extend the runtime bound inference of KoAT to prs loops and to extend our size bound computations also to suitable non-linear non-twn-loops.

Evaluation. To evaluate our new technique, we tested KoAT on the 504 benchmarks for Complexity of C Integer Programs (CINT) from the Termination Problems Data Base [35] which is used in the annual Termination and Complexity Competition (TermComp) [18]. Here, all variables are interpreted as integers over \(\mathbb {Z}\) (i.e., without overflows). To distinguish the original version of KoAT [8] from our re-implementation, we refer to them as KoAT1 resp. KoAT2. We used the following configurations of KoAT2, which apply different techniques to infer size bounds.

  • KoAT2orig uses the original technique from [8] to infer size bounds.

  • KoAT2 + SIZE additionally uses our novel approach with Theorems 7, 34, and 38.

The CINT collection contains almost only examples with linear arithmetic and the existing tools can already solve most of its benchmarks which are not known to be non-terminating (see Footnote 8). While most complexity analyzers are essentially restricted to programs with linear arithmetic, our new approach also succeeds on programs with non-linear arithmetic. Some programs with non-linear arithmetic could already be handled by KoAT due to our integration of the complete technique for the inference of local runtime bounds in [27]. But the approach from the current paper increases KoAT’s power substantially for programs (possibly with non-linear arithmetic) where the values of variables computed in “earlier” loops influence the runtime of “later” loops (e.g., the modification of our example from Fig. 2 where \(t_4\) decreases \(x_4\) instead of \(x_1\), see the end of Example 43).

Table 1. Evaluation on the Collection CINT\(^+\)

Therefore, we extended CINT by 15 new typical benchmarks including the programs in (1), Fig. 2, and the modification of Fig. 2 discussed above, as well as several benchmarks from the literature (e.g., [3, 6]), resulting in the collection CINT\(^+\). For KoAT2 and KoAT1, we used Clang [11] and llvm2kittel [14] to transform C into integer programs as in Sect. 4. We compare KoAT2 with KoAT1 [8] and the tools CoFloCo [15], MaxCore [2] with CoFloCo in the backend, and Loopus [33]. These tools also rely on variants of size bounds: CoFloCo uses a set of constraints to measure the size of variables w.r.t. their initial and final values, MaxCore’s size bound computations build upon [12], and Loopus considers suitable bounding invariants to infer size bounds.

Table 1 gives the results of our evaluation, where as in TermComp, we used a timeout of 5 min per example. The first entry in every cell denotes the number of benchmarks from CINT\(^+\) for which the tool inferred the respective bound. The number in brackets only considers the 15 new examples. The runtime bounds computed by the tools are compared asymptotically as functions which depend on the largest initial absolute value n of all program variables. So for example, KoAT2 + SIZE proved a linear runtime bound for \(231 + 2 = 233\) benchmarks, i.e., \({{\,\textrm{rc}\,}}(\sigma )\in \mathcal {O}(n)\) holds for all initial states where \(|\sigma (v)|\le n\) for all \(v\in \mathcal {V}\). Overall, this configuration succeeds on 358 examples, i.e., “\(< \omega \)” is the number of examples where a finite bound on the runtime complexity could be computed by the tool within the time limit. “\(\mathrm {AVG^+(s)}\)” denotes the average runtime of successful runs in seconds, whereas “\(\mathrm {AVG(s)}\)” is the average runtime of all runs.

Already on the original benchmarks CINT, integrating our novel technique for the inference of size bounds leads to the most powerful approach for runtime complexity analysis. The effect of the new size bound technique becomes even clearer when also considering our new examples which contain non-linear arithmetic and loops whose runtime depends on the results of earlier loops in the program. Thus, the new contributions of the paper are crucial in order to extend automated complexity analysis to larger programs with non-linear arithmetic.

KoAT’s source code, a binary, and a Docker image are available at https://koat.verify.rwth-aachen.de/size. This website also has details on our experiments, a list and description of the new examples, and web interfaces to run KoAT’s configurations directly online.