1 Introduction

We consider the reachability problem for multipath (or branching) affine loops over the integers, or equivalently for nondeterministic integer linear dynamical systems. A (deterministic) integer linear dynamical system consists of an update matrix \(M \in \mathbb {Z}^{d\times d}\) together with an initial point \(x^{(0)} \in \mathbb {Z}^d\). We associate to such a system its infinite orbit \((x^{(i)})_{i\in \mathbb {N}}\) consisting of the sequence of reachable points defined by the rule \(x^{(i+1)} = Mx^{(i)}\). The reachability question then asks, given a target set Y, whether the orbit ever meets Y, i.e., whether there exists some time \(i\in \mathbb {N}\) such that \(x^{(i)} \in Y\). The nondeterministic reachability question allows the linear update map to be chosen at each step from a fixed finite collection of matrices.

When the orbit does eventually hit the target, one can easily substantiate this by exhibiting the relevant finite prefix. However, establishing non-reachability is intrinsically more difficult, since the orbit consists of an infinite sequence of points. Here one requires a finitary certificate, which must be a relatively simple object that can be inspected and which provides a proof that the set Y is indeed unreachable. Typically, such a certificate will consist of an over-approximation I of the set R of reachable points, in such a manner that one can check both that \(Y \cap I=\emptyset \) and \(R\subseteq I\); such a set I is called an invariant.

Formally we study the following problem for inductive invariants:

The meta problem. Consider a system defined by an initial vector x and a set of updates, represented by matrices \(M_1,\dots , M_n\). A set I is an inductive invariant of this system if \(x^{(0)}\in I\) and \(M_iI\subseteq I\) for all \(i\in \{1,\ldots ,n\}\). Given a target Y, determine whether there exists an inductive invariant I that separates the reachable points of the system from Y, i.e., such that \(Y\cap I = \emptyset \).

The meta problem is parametrised by the type of invariants and targets that are considered; that is, what are the classes of allowable invariant sets I and target sets Y, or equivalently how are such sets allowed to be expressed?

Fixing particular invariant and target domains, a reachability query encounters three possible scenarios: (1) the instance is reachable, (2) the instance is unreachable and a separating invariant from the domain exists, or (3) the instance is unreachable but no separating invariant exists. Ideally, one would wish to provide a sufficiently expressive invariant domain so that the latter case does not occur, whilst keeping the resulting invariants as simple as possible and computable. Unfortunately, it is known that distinguishing reachability (1) from unreachability (2,3) is undecidable in general; and for some invariant domains, within unreachable instances, determining whether a separating invariant exists (i.e., distinguishing (2) from (3)) is undecidable.

We note that the existence of strongest inductive invariants is a desirable property for an invariant domain. Given two invariants I and \(I'\), we say that I is stronger than \(I'\) if and only if \(I \subseteq I'\); thus strongest invariants correspond to smallest invariant sets. When strongest invariants exist (and can be computed), separating (2) from (1,3) is easy: compute the strongest invariant, and check whether it excludes the target or not; if so, we are done, and if not, no other invariant (from that class) can possibly work either. However, unless (3) is excluded, computability of the strongest invariant does not necessarily imply that reachability is decidable. Alas, strongest invariants are not always guaranteed to exist for a particular invariant domain, although some separating inductive invariant may still exist for every target (or indeed may not).

In prior work from the literature, typical classes of invariants are usually convex, or finite unions of convex sets. In this paper we consider certain classes of invariants that can have infinitely many ‘holes’ (albeit in a structured and regular way); we call such sets porous invariants. These invariants can be represented via Presburger arithmetic. We shall work instead with the equivalent formulation of semi-linear sets, generalising ultimately periodic sets to higher dimensions, as finite unions of linear sets of the form \(\left( b + p_1\mathbb {N} + \dots + p_m\mathbb {N}\right) \) (by which we mean \(\left\{ b + a_1p_1 + \dots + a_mp_m\mid a_1,\dots ,a_m\in \mathbb {N}\right\} \), see Definition 3).

Let us first consider a motivating example:

Example 1

(Hofstadter’s MU Puzzle [1]) Consider the following term-rewriting puzzle over the alphabet \(\{M,U,I\}\). Starting from the word MI and applying the following grammar rules (where y and z stand for arbitrary words over the alphabet), we ask whether the word MU can ever be reached.

$$\begin{aligned} yI \rightarrow yIU \quad \mid \quad My \rightarrow Myy \quad \mid \quad yIIIz \rightarrow yUz \quad \mid \quad yUUz \rightarrow yz \end{aligned}$$

The answer is no. One way to establish this is to keep track of the number of occurrences of the letter ‘I’ in the words that can be produced, and observe that this number (call it x) will always be congruent to either 1 or 2 modulo 3. In other words, it is not possible to reach the set \(\{x \mid x \equiv 0 \mod 3\}\). Indeed, Rules 2 and 3 are the only rules that affect the number of I’s, and can be described by the system dynamics \(x \mapsto 2x\) and \(x \mapsto x -3\). Hence the MU Puzzle can be viewed as a one-dimensional system with two affine updates, or a two-dimensional system with two linear updates. The set \(\left( 1 + 3\mathbb {Z}\right) \cup \left( 2 + 3\mathbb {Z}\right) \) is an inductive invariant, and we wish to automatically synthesise it.
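To see the congruence argument concretely, here is a minimal Python sketch (ours, not part of the original puzzle or paper) that explores the residues modulo 3 reachable from the initial count of I's under the two updates \(x \mapsto 2x\) and \(x \mapsto x-3\); it confirms that residue 0 is never hit.

# The updates x -> 2x and x -> x - 3 act on residue classes modulo 3,
# so it suffices to explore the finite quotient {0, 1, 2} starting from
# residue 1 (the word MI contains a single I).
def reachable_residues(start=1, modulus=3):
    updates = [lambda r: (2 * r) % modulus, lambda r: (r - 3) % modulus]
    seen, frontier = {start % modulus}, [start % modulus]
    while frontier:
        r = frontier.pop()
        for f in updates:
            s = f(r)
            if s not in seen:
                seen.add(s)
                frontier.append(s)
    return seen

print(reachable_residues())  # {1, 2}: the 'bad' residue 0 is unreachable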

The problem can be rephrased as a safety property of the following multipath loop, verifying that the ‘bad’ state \(x= 0\) is never reached, or equivalently that the loop below can never halt, regardless of the nondeterministic choices made.

x \(:= 1\)

while x \(\ne 0\)

x \(:= 2\)x \(\mid \mid \) x \(:=\) x\(-3\)       (where \(\mid \mid \) represents nondeterministic branching)

The MU Puzzle was presented as a challenge for algorithmic verification in [2]; the tools considered in that paper (and elsewhere, to the best of our knowledge) rely upon the manual provision of an abstract invariant template. Our approach is to find the invariant fully automatically (although one must still abstract from the MU Puzzle the correct formulation as the program \(x \mapsto 2x \mid \mid x \mapsto x-3\)).

Our focus is on the automatic generation of porous invariants for multipath affine loops over the integers, or equivalently nondeterministic integer linear dynamical systems. When we view affine loops as linear dynamical systems they do not have loop guards as such; rather, we treat the loop guard as the target of the reachability questions we consider.

  • We first consider targets consisting of a single vector (or ‘point targets’), and present the classes of invariants and systems for which invariants can and cannot be automatically computed for the reachability question. A summary of the results for linear and semi-linear invariants for these targets is given in Table 1. For completeness we also consider \(\mathbb {R},\mathbb {R}_{+}\)-(semi)-linear sets, where we enhance the picture from prior work by showing that strongest \(\mathbb {R}\)-semi-linear invariants are computable.

    • We establish the existence of strongest \(\mathbb {Z}\)-linear invariants, and show that they can be found algorithmically in polynomial time (Theorem 10).

    • If a \(\mathbb {Z}\)-linear invariant is not separating, we may instead look for an \(\mathbb {N}{}\)-semi-linear invariant (a class that generalises both \(\mathbb {Z}\)-semi-linear and \(\mathbb {N}{}\)-linear invariants), and we show that such an invariant can always be found for any unreachable point target when dealing with deterministic integer linear dynamical systems (Theorem 19).

    • However, for nondeterministic integer linear dynamical systems, computing separating \(\mathbb {N}{}\)-semi-linear invariants is an undecidable problem in arbitrary dimension (Theorem 21). Nevertheless we show how such invariants can be computed in a low-dimensional setting, in particular for affine updates in one dimension (Theorem 22). As an immediate consequence, this establishes that the multipath loop associated with the MU Puzzle belongs to a class of programs for which we can automatically synthesise \(\mathbb {N}{}\)-semi-linear invariants.

  • We consider the reachability problem for porous targets, that is, targets that are themselves linear or semi-linear sets.

    • For full-dimensional \(\mathbb {Z}\)-linear targets we show that reachability is decidable and, in the case of unreachability, that a \(\mathbb {Z}\)-semi-linear invariant can always be exhibited as a certificate (Theorem 37). If the target is not full-dimensional, then the reachability problem is Skolem-hard for deterministic systems and undecidable for nondeterministic systems.

    • Secondly, we also show that the reachability problem for low-dimensional semi-linear sets is decidable for deterministic LDS (Theorem 40). Note that the Skolem problem is decidable at low orders, so it does not present a barrier in this setting.

  • In Sect. 7 we present our tool porous which handles one-dimensional affine systems for both point and \(\mathbb {Z}\)-linear targets, solving the reachability problem and producing invariants. Inter alia, this allows one to handle the multipath loop derived from the MU Puzzle in a fully automated manner.

The present paper extends and strengthens the results of [3]. Firstly, we show that strongest \(\mathbb {Z}\)-linear invariants can be found in polynomial time, whereas [3] merely established decidability. Secondly, we improve the results for porous targets, and in particular consider low-dimensional semi-linear targets. Finally, we present all proofs in full.

Table 1 Results for integer linear dynamical systems for a point target

1.1 Related work

The reachability problem (in arbitrary dimension) for loops with a single affine update, or equivalently for deterministic linear dynamical systems, is decidable in polynomial time for point targets (that is \(Y = \left\{ y\right\} \)), as shown by Kannan and Lipton [6]. However for nondeterministic systems (where the update matrix is chosen nondeterministically from a finite set at each time step), reachability was proven undecidable by reduction from the matrix semigroup membership problem [7].

In particular this entails that for unreachable nondeterministic instances we cannot hope to always be able to compute a separating invariant. In some cases we may compute the strongest invariant (which may suffice if this invariant happens to be separating for the given reachability query), or we may compute an invariant in sub-cases for which reachability is decidable (for example in low dimensions). For some classes of invariants, it is also undecidable whether an invariant exists (e.g., invariants which are unions of polyhedra [5]).

Various types of invariants have been studied for linear dynamical systems, including polyhedral [5, 8], algebraic [9], and o-minimal [10] invariants. For certain classes of invariants (e.g., algebraic [9]), it is decidable whether a separating invariant exists, notwithstanding the reachability problem being undecidable. Other works (e.g., [11]) use heuristic approaches to generate invariants, without aiming for any sort of completeness.

Kincaid, Breck, Cyphert and Reps [12] study loops with linear updates, examining the closed forms for the variables to prove safety and termination properties. Such closed forms, when expressible in certain arithmetic theories, can be interpreted as another type of invariant and can be used to over-approximate the reachable sets. The work is restricted to a single update function (deterministic loops) and places additional constraints on the updates to bring the closed forms into appropriate theories.

Bozga, Iosif and Konecný’s FLATA tool [13] considers affine functions in arbitrary dimension. However, it is restricted to affine functions with finite monoids; in our one-dimensional case this would correspond to limiting oneself to counter-like functions of the form \(f(x) = x+b\).

Finkel, Göller and Haase [14], extending Fremont [15], show that reachability in a single dimension is \({\textbf {PSPACE}}\)-complete for polynomial update functions (and allowing states which can be used to control the sequences of updates that can be applied). The affine functions (and single-state restriction) we consider are a special case, but we focus on producing invariants to disprove reachability.

The reachability problem asks whether there exists a sequence of transitions that reaches a given condition. The termination problem asks whether a given condition eventually holds along every possible sequence of transitions. Tools such as AProVE [16] and Büchi Automizer [17] may (dis-)prove reachability in the termination setting, i.e., on all branches, but are not suited to asking whether a condition can be reached on some branch (reachability). Restrictions on the number of switches between the update functions can also be considered; [18] shows that reachability is decidable when only a small number of switches is permitted.

Inductive invariants specified in Presburger arithmetic have been used to disprove reachability in vector addition systems [19]. A generalisation, the class of ‘almost semi-linear sets’ [20], also features non-convexity and moreover can capture exactly the reachable points of vector addition systems. Our nondeterministic linear dynamical systems can be seen as vector addition systems over \(\mathbb {Z}\) extended with affine updates (rather than only additive updates).

2 Preliminaries

We denote by \(\mathbb {Z}{}\) the set of integers and \(\mathbb {N}{}\) the set of non-negative integers. We say that \(x,y \in \mathbb {Z}\) are congruent modulo \(d\in \mathbb {N}\), denoted \(x \equiv y \mod d\), if d divides \(x-y\). Given an integer x and a positive integer d, we write \((x \bmod d)\) for the number \(y\in \left\{ 0,\dots ,d-1\right\} \) such that \(y\equiv x \bmod d \).

Definition 2

(Integer Linear Dynamical Systems) A d-dimensional integer linear dynamical system (LDS) \((x^{(0)},\{M_1,\dots ,M_k\})\) is defined by an initial point \(x^{(0)}\in \mathbb {Z}{}^{d}\) and a set of integer matrices \(M_1, \dots , M_k\in \mathbb {Z}^{d\times d}\). An LDS is deterministic if it comprises a single matrix (\(k=1\)) and is otherwise nondeterministic.

A point \(y\) is reachable if there exists \(m\in \mathbb {N}\) and \(B_1,\dots ,B_m\) such that \(B_1 \cdots B_m x^{(0)} = y\) and \(B_i\in \left\{ M_1,\dots , M_k\right\} \) for all \(1\le i \le m\).

The reachability set \(\mathcal {O}\subseteq \mathbb {Z}{}^{d}\) of an LDS is the set of reachable points.

The following definition is parameterised by a semiring \(\mathbb {K}\), which stands either for \(\mathbb {N},\mathbb {Z}, \mathbb {R}\) or \(\mathbb {R}_+\).

Definition 3

(\(\mathbb {K}\) -semi-linear sets) A linear set L is defined by a base vector \(b\in \mathbb {Z}^d\) and period vectors \(p_1,\dots ,p_k\in \mathbb {Z}^d\) such that

$$\begin{aligned} L = \left\{ b+ a_1p_1+\dots + a_kp_k \mid a_1,\dots ,a_k\in \mathbb {K}\right\} . \end{aligned}$$

For convenience we often write \(\left( b + p_1\mathbb {K}+\dots + p_k\mathbb {K}\right) \) for L. A set is semi-linear if it is a finite union of linear sets.

\(\mathbb {N}\)-semi-linear sets are precisely those definable in Presburger arithmetic (\({\text {FO}}(\mathbb {Z}, +, \le )\)) [21]. Likewise, \(\mathbb {Z}\)-semi-linear sets are those definable in \({\text {FO}}(\mathbb {Z}, +)\). We also consider their real counterparts, in which the coefficient semiring is either \(\mathbb {R}\) or \(\mathbb {R}_{+}\). Note that regardless of the semiring \(\mathbb {K}\), the period vectors \(p_i\) all lie in \(\mathbb {Z}^d\). We say a vector \(v\in \mathbb {Z}^d\) is an admissible direction of a linear set L if adding any \(\mathbb {K}\)-multiple of v to any point of L yields a point that is again in L; in particular, \(L = \left( b + p_1\mathbb {K}+\dots + p_k\mathbb {K}\right) = \left( b + p_1\mathbb {K}+\dots + p_k\mathbb {K}+ v\mathbb {K}\right) \).
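As a toy illustration of Definition 3 (our own example, with an arbitrarily chosen base and periods), the following Python snippet enumerates points of a linear set over bounded coefficient ranges, contrasting the semirings \(\mathbb {N}\) and \(\mathbb {Z}\).

from itertools import product

def linear_set_points(base, periods, coeff_range):
    """Points base + a_1 p_1 + ... + a_k p_k with each a_i drawn from coeff_range."""
    pts = set()
    for coeffs in product(coeff_range, repeat=len(periods)):
        pts.add(tuple(b + sum(a * p[j] for a, p in zip(coeffs, periods))
                      for j, b in enumerate(base)))
    return pts

base, periods = (1, 0), [(3, 0), (0, 2)]
n_points = linear_set_points(base, periods, range(0, 4))    # N-linear: coefficients in N (truncated)
z_points = linear_set_points(base, periods, range(-3, 4))   # Z-linear: coefficients in Z (truncated)
print((7, 4) in n_points, (-5, -4) in n_points, (-5, -4) in z_points)  # True False True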

An invariant is simply an overapproximation of the reachability set (\(\mathcal {O}\subseteq I\)). Typically, we are interested in finding an invariant I that is disjoint from a target, i.e., \(I\cap Y= \emptyset \), to show that the orbit \(\mathcal {O}\) does not meet Y. We moreover require that the property of being an invariant set be easy to verify. The principal way to do this is to consider inductive invariants:

Definition 4

Given an integer linear dynamical system \((x^{(0)},\{M_1,\dots ,M_k\})\), a set I is an inductive invariant if

  • \(x^{(0)}\in I\), and

  • \(\{M_i x \mid x \in I\}\subseteq I\) for all \(i \in \left\{ 1,\dots ,k\right\} \).

We are interested in the following problem:

Definition 5

(Invariant Synthesis Problem) Given an invariant domain \(\mathcal {D}\), an integer linear dynamical system \((x^{(0)},\{M_1,\dots ,M_k\})\), and a target Y, does there exist an inductive invariant I in \(\mathcal {D}\) disjoint from Y?

We focus on classes \(\mathcal {D}\) of inductive invariants that are linear or semi-linear. When a separating inductive invariant I exists, we also wish to compute it. Since (semi-)linear invariants are enumerable, the computation of invariants can in theory be reduced to the question of their existence; however all of our proofs are constructive.

We also consider the notion of strongest invariants, where a strongest invariant is the smallest invariant set I in the prescribed domain that contains \(\mathcal {O}\). Such invariants are compelling because they can be used to analyse reachability of any target set in the following sense: either the strongest invariant is disjoint from the given target, or no invariant in the given domain is. Note that strongest invariants do not always exist.

We only consider inductive invariants in the remainder of this paper, and we note when the inductive invariant we compute is also a strongest invariant.

3 \(\mathbb {R}\) invariants: \(\mathbb {R}\)-linear and \(\mathbb {R}\)-semi-linear

Before delving into porous invariants, let us consider invariants over the real numbers, i.e., \(\mathbb {R}\)-(semi)-linear sets.

We observe that a strongest \(\mathbb {R}\)-linear invariant is nothing but the affine hull of the reachability set, which can be computed using Karr’s algorithm [4]. Furthermore we show that strongest \(\mathbb {R}{}\)-semi-linear invariants also exist and can be computed by combining techniques for computing algebraic invariants [9] and \(\mathbb {R}{}\)-linear invariants.

3.1 \(\mathbb {R}\)-linear invariants

Recall that a set L is \(\mathbb {R}\)-linear if \(L = \left( v_{0} +v_{1}\mathbb {R}+\dots +v_{t}\mathbb {R}\right) \) for some \(v_{0},\dots ,v_{t} \in \mathbb {Z}^d\) that can be assumed to be linearly independent without loss of generality (and thus \(t\le d\)). Given two distinct points of L, every point on the infinite line connecting them must also be in L. Generalising this idea to higher dimensions, given a set \(S\subseteq \mathbb {R}^d\), let the affine hull be

$$\begin{aligned} \textrm{Aff}(S)= \left\{ \sum _{i=1}^k \lambda _i x_i \mid k\in \mathbb {N}, x_i \in S,\lambda _i \in \mathbb {R}, \sum _{i=1}^k \lambda _i = 1 \right\} . \end{aligned}$$

We say the vectors \(v_0,\dots ,v_m\) are \(\mathbb {Q}\)-affinely independent if \(v_1-v_0,\dots ,v_m-v_0\) are \(\mathbb {Q}\)-linearly independent.

Fix an LDS \((x^{(0)},\{M_1,\dots ,M_k\})\) and consider its reachability set \(\mathcal {O}=\) \(\left\{ M_{i_m}\cdots M_{i_1}x^{(0)} \mid m \in \mathbb {N}, i_1,\ldots ,i_m \in \left\{ 1,\ldots ,k\right\} \right\} \). Then \(\textrm{Aff}(\mathcal {O}{})\) is precisely the strongest \(\mathbb {R}\)-linear invariant. Karr’s algorithm [4, 22] can be used to compute this strongest invariant in polynomial time. The next lemma follows from Theorem 3.1 of [22].

Lemma 6

Given an LDS \((x^{(0)}, \left\{ M_1,\dots , M_k\right\} )\) of dimension d, we can compute in time polynomial in d, k, and \(\log \mu \) (where \(\mu >0\) is an upper bound on the absolute values of the integers appearing in \(x^{(0)}\) and \(M_1,\ldots ,M_k\)), a \(\mathbb {Q}\)-affinely independent set of integer vectors \(R_0 \subseteq \mathcal {O}{}\) such that:

  1. \(x^{(0)}\in R_0\),

  2. the affine span of \(R_0\) and the affine span of \(\mathcal {O}{}\) are the same (\(\textrm{Aff}(R_0) = \textrm{Aff}(\mathcal {O}{})\)),

  3. the entries of the vectors in \(R_0\) have absolute value at most \(\mu _0:=\mu (d\mu )^d\).

We highlight that Lemma 6 shows computability of the set \(R_0\) which is a subset of the reachability set (in particular the elements are integer points). This fact will prove useful later in our development of strongest \(\mathbb {Z}\)-linear invariants in Sect. 4.

Before proving Lemma 6, let us first state a small technical proposition on the growth of matrix powers required in the proof.

Proposition 7

Let M be a d-dimensional square matrix and x be a vector. Let the maximum absolute value of the entries of M and x be at most \(\mu \). Then the maximal absolute value of an entry of \(M^kx\) is at most \(d^{k}\mu ^{k+1}\).

Proof

Without loss of generality, assume that the matrix M and the vector x consist only of entries equal to \(\mu \). We proceed by induction on k. The base case holds by the assumption that the entries of x have absolute value at most \(\mu \). The inductive case is as follows:

$$\begin{aligned} \begin{pmatrix} \mu &{} \dots &{} \mu \\ &{} \ddots \\ \mu &{} \dots &{} \mu \\ \end{pmatrix} \begin{pmatrix} d^{k-1}\mu ^k \\ \vdots \\ d^{k-1}\mu ^k \\ \end{pmatrix} = \begin{pmatrix} d\mu (d^{k-1}\mu ^k) \\ \vdots \\ d\mu (d^{k-1}\mu ^k) \\ \end{pmatrix} = \begin{pmatrix} d^k\mu ^{(k+1)}\\ \vdots \\ d^k\mu ^{(k+1)}\\ \end{pmatrix} \end{aligned}$$

\(\square \)

Proof of Lemma 6

The result of [22, Theorem 3.1] proceeds by finding new points in the reachability set and adding them to a set of points if the new point is linearly independent from the other points of the set. Whilst the result of [22] refers to linear independence, this can be converted to affine independence by increasing the dimension by one.

The procedure works via a pruned version of breadth-first search, with nodes only expanded if their children are linearly independent of the current set. Hence, the first point found in the tree is the initial point \(x^{(0)}\), and therefore this point is included. The maximum depth of the tree that needs to be explored is d, and so every point included is reached with at most d applications of matrices to \(x^{(0)}\). Hence, by Proposition 7, if the largest absolute value of a point or matrix entry is \(\mu \), after d iterations the largest absolute value is at most \(\mu (d\mu )^{d}\).

The algorithm of [22] uses polynomially many arithmetic operations, and we observe that it also runs in polynomial time when measured in bit operations. The independence checking in the algorithm involves verifying linear independence of at most d vectors, all having bit size at most \(\log (\mu (d\mu )^{d}) = d\log (d) + (d+1)\log (\mu )\), which can be done in time polynomial in the bit size (for example by the Bareiss algorithm for calculating the determinant). \(\square \)

Let \(R_0 = \left\{ x^{(0)}, r_1,\dots ,r_{d'}\right\} \) be obtained as per Lemma 6, with \(d' \le d\). The strongest \(\mathbb {R}\)-linear invariant of the LDS is thus the affine span \(\textrm{Aff}(R_0) = \textrm{Aff}(\mathcal {O}{})\), which can be written as the \(\mathbb {R}\)-linear set \(L_0 = \left( x^{(0)} + (r_1 - x^{(0)})\mathbb {R}+ \dots + (r_{d'} - x^{(0)})\mathbb {R}\right) \).
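The following Python sketch (ours, using numpy; a simplification of the pruned breadth-first search used in the proof of Lemma 6, not the authors' implementation) gathers \(\mathbb {Q}\)-affinely independent orbit points: a freshly computed point is kept only if it enlarges the affine span of the points collected so far.

import numpy as np
from collections import deque

def affinely_independent_orbit_points(x0, matrices):
    """Pruned BFS over the orbit: keep a point only if it increases the
    dimension of the affine span of the points already collected."""
    x0 = np.array(x0, dtype=object)                 # exact integer arithmetic
    R, frontier = [x0], deque([x0])
    while frontier:
        x = frontier.popleft()
        for M in matrices:
            y = np.array(M, dtype=object).dot(x)
            # y is affinely independent of R iff y - x0 is linearly
            # independent of the differences r - x0, r in R.
            diffs = [r - x0 for r in R[1:]] + [y - x0]
            if np.linalg.matrix_rank(np.array(diffs, dtype=float)) == len(diffs):
                R.append(y)
                frontier.append(y)
    return R   # x0 together with at most d further orbit points (the set R_0)

# A 2-dimensional example: a rotation by 90 degrees; the affine hull of the
# orbit of (1, 0) is the whole plane, witnessed by three affinely independent points.
print(affinely_independent_orbit_points([1, 0], [[[0, -1], [1, 0]]]))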

3.2 \(\mathbb {R}\)-semi-linear invariants

Let us now generalise this approach to \(\mathbb {R}\)-semi-linear sets, an invariant domain first introduced in [23]. The collection of \(\mathbb {R}\)-semi-linear sets, \(\left\{ \bigcup _{i = 1}^m L_i \mid m\in \mathbb {N}, L_1,\dots , L_m \text { are } \mathbb {R}\text {-linear sets} \right\} \), is closed under finite unions and arbitrary intersections. Thus for any given set X, the smallest \(\mathbb {R}\)-semi-linear set containing X is simply the intersection of all \(\mathbb {R}\)-semi-linear sets containing X. Let us denote by \(\textrm{SLin}(X)\) the smallest \(\mathbb {R}\)-semi-linear set that contains X. We are interested in computing \(\textrm{SLin}(\mathcal {O})\):

Theorem 8

The strongest \(\mathbb {R}\)-semi-linear invariant \(\textrm{SLin}(\mathcal {O})\) of \(\mathcal {O}{}\) is computable and is inductive.

First, let us consider the richer class of algebraic sets. Algebraic sets are those that are definable as finite unions and intersections of the zero sets of polynomials. For example, \(\left\{ (x,y)\mid xy=0\right\} \) describes the union of the lines \(x= 0\) and \(y=0\). The (real) Zariski closure \(\textrm{Zar}(X)\) of a set \(X\subseteq \mathbb {R}^d\) is the smallest algebraic subset of \(\mathbb {R}^d\) containing X. The Zariski closure of the set of reachable points, \(\textrm{Zar}(\mathcal {O}{})\), can be computed algorithmically and yields an inductive invariant [9].

An algebraic set A is irreducible if whenever \(A\subseteq B \cup C\), where B and C are algebraic sets, then we have \(A\subseteq B\) or \(A\subseteq C\). Any algebraic set can be written effectively as a finite union of irreducible algebraic sets [24].

Proposition 9

Suppose \(\textrm{Zar}(X) = A_1\cup \dots \cup A_k\), with \(A_i\)’s irreducible algebraic sets. Then \(\textrm{SLin}(X) = \textrm{Aff}(A_1) \cup \dots \cup \textrm{Aff}(A_k)\).

Proof

Since semi-linear sets are algebraic we have that \(X\subseteq \textrm{Zar}(X)\subseteq \textrm{SLin}(X)\) and hence \(\textrm{SLin}(X) \subseteq \textrm{SLin}(\textrm{Zar}(X)) \subseteq \textrm{SLin}(\textrm{SLin}(X)) = \textrm{SLin}(X)\). We conclude that \(\textrm{SLin}(X)=\textrm{SLin}(\textrm{Zar}(X))\).

Now we have \(\textrm{SLin}(X) \subseteq \textrm{Aff}(A_1) \cup \dots \cup \textrm{Aff}(A_k)\) since the latter is a semi-linear set that contains X. It remains to prove that \(\textrm{Aff}(A_1) \cup \dots \cup \textrm{Aff}(A_k) \subseteq \textrm{SLin}(X)\). For this, write \(\textrm{SLin}(X)=L_1\cup \cdots \cup L_s\), with the \(L_j\) being linear sets. Since each \(A_i\) is irreducible and each \(L_j\) is algebraic we have that for all i there exists j with \(A_i \subseteq L_j\) and hence \(\textrm{Aff}(A_i) \subseteq L_j\). This immediately yields the required inclusion. \(\square \)

From Proposition 9 we see that \(\textrm{SLin}(\mathcal {O})\) can be obtained by computing \(\textrm{Aff}(A_i)\) for each set \(A_i\) arising from the decomposition \(\textrm{Zar}(\mathcal {O}) = A_1\cup \dots \cup A_k\) of the Zariski closure of the orbit into irreducible components.

Moreover, the set \(\textrm{SLin}(\mathcal {O})\) is inductive. Indeed, given a matrix M of the LDS and \(i\le k\), for each j we consider the set \(A_i^j=\{x\in A_i\mid Mx \in A_j\}\), which is clearly algebraic. We have by inductiveness of \(\textrm{Zar}(\mathcal {O})\) that \(A_i = \bigcup _j A_i^j\). As \(A_i\) is irreducible, these sets cannot all be proper subsets of \(A_i\). Thus there exists j such that \(A_i = A_i^j\) and thus \(M A_i \subseteq A_j\). Therefore \(M(\textrm{SLin}(A_i))=M \textrm{Aff}(A_i)=\textrm{Aff}(M A_i) \subseteq \textrm{Aff}(A_j)\subseteq \textrm{SLin}(\mathcal {O})\), proving inductiveness.

To complete the proof of Theorem 8 it remains to confirm that affine hulls of algebraic sets can be computed algorithmically. Let us fix an algebraic set A, and let W denote a set variable. Proceed as follows. Start with \(W \leftarrow \left\{ x\right\} \) for some point \(x\in A\), and repeatedly make the assignment \(W \leftarrow \textrm{Aff}(W \cup \left\{ y\right\} )\), where \(y \in A {\setminus } W\). Such a point y, if one exists, can be found using quantifier elimination in the theory of the reals. Each step strictly increases the dimension of W, which can happen at most d times, ensuring termination; upon termination no such y remains, i.e., \(A\subseteq W\), and hence \(\textrm{Aff}(A) = W\).

4 Strongest \(\mathbb {Z}\)-linear invariants

Recall that a \(\mathbb {Z}\)-linear set \(\left( q + p_1\mathbb {Z}+ \dots + p_n \mathbb {Z}\right) \) is defined by a base vector \(q\in \mathbb {Z}^d\) and period vectors \(p_1,\dots ,p_n \in \mathbb {Z}^d\). Equivalently, a \(\mathbb {Z}\)-linear set describes a lattice, i.e., \(\left( p_1\mathbb {Z}+ \dots + p_n \mathbb {Z}\right) \), in d-dimensional space, translated to start from q rather than \(\vec {0}\).

We start by showing that the strongest \(\mathbb {Z}\)-linear invariant can be computed.

4.1 Computing the strongest \(\mathbb {Z}\)-linear invariants

Theorem 10

Given a d-dimensional dynamical system \((x^{(0)}, \left\{ M_1,\dots ,M_k\right\} )\), the strongest \(\mathbb {Z}\)-linear inductive invariant containing the reachability set \(\mathcal {O}{}\) exists and can be computed algorithmically in time polynomial in d, k, and \(\log \mu \) (where \(\mu >0\) is an upper bound on the absolute values of the integers appearing in \(x^{(0)}\) and \(M_1,\ldots ,M_k\)).

We claim that Algorithm 1 computes the requisite invariant according to Theorem 10. Let us first establish some technical results before proving termination and correctness of the algorithm.

Algorithm 1 Strongest \(\mathbb {Z}\)-linear invariant for LDS \((x^{(0)},M_1,\dots ,M_k)\)
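The following Python sketch is our reconstruction of Algorithm 1 from the two-phase description given in the proof of Theorem 10 below; it is illustrative rather than the authors' listing. It assumes the helper `affinely_independent_orbit_points` sketched in Sect. 3.1 and a lattice membership test `in_lattice` in the spirit of Proposition 12 (a sketch of which appears after that proposition).

def mat_vec(M, x):
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

def vec_sub(x, y):
    return [a - b for a, b in zip(x, y)]

def strongest_Z_linear_invariant(x0, matrices):
    # Phase 1: Q-affinely independent orbit points (Lemma 6) give an initial
    # lattice L_0 = x0 + (r_1 - x0) Z + ... + (r_d' - x0) Z of full dimension.
    R = [list(r) for r in affinely_independent_orbit_points(x0, matrices)]
    periods = [vec_sub(r, x0) for r in R[1:]]
    # Phase 2: 'fill in' the lattice; whenever the image of a collected point
    # escapes the current lattice, add the corresponding direction as a period.
    changed = True
    while changed:
        changed = False
        for r in list(R):
            for M in matrices:
                y = mat_vec(M, r)
                if not in_lattice(vec_sub(y, x0), periods):   # cf. Proposition 12
                    R.append(y)
                    periods.append(vec_sub(y, x0))
                    changed = True
    return x0, periods   # I = ( x0 + periods[0] Z + ... + periods[-1] Z )

For instance, on the two-dimensional linear encoding of the MU-puzzle updates (initial point (1, 1) and the matrices realising \(x \mapsto 2x\) and \(x \mapsto x-3\)), phase 1 already yields the single period (1, 0) and phase 2 adds nothing, matching Remark 18 below.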

The following proposition asserts that when two points are in a \(\mathbb {Z}\)-linear set, the direction between these two points can be applied from any point of the set, and hence this direction can be included as a period without altering the set.

Proposition 11

Let \(L = \left( q + p_1\mathbb {Z}+\dots +p_n\mathbb {Z}\right) \) be a \(\mathbb {Z}\)-linear set. If \(x,y\in L\) then for all \(z\in L\) and all \(a'\in \mathbb {Z}\) we have \(z + (y-x)a'\in L\). In particular, we have \(L = \left( q + p_1\mathbb {Z}+\dots +p_n\mathbb {Z}+ (y-x)\mathbb {Z}\right) \).

Proof

If \(x = q + a_1p_1+\dots +a_np_n\) and \(y= q + b_1p_1+\dots +b_np_n\) then \(y-x = q + b_1p_1+\dots +b_np_n - (q + a_1p_1+\dots +a_np_n)= (b_1-a_1)p_1+\dots +(b_n-a_n)p_n\).

Then for any \(z = q + c_1p_1+\dots +c_np_n\), we have \(z + a'(y-x) = q + c_1p_1+\dots +c_np_n + a'((b_1-a_1)p_1+\dots +(b_n-a_n)p_n) = q + (c_1 + a'(b_1-a_1))p_1+\dots +(c_n+ a'(b_n-a_n))p_n\) where \( (c_i + a'(b_i-a_i))\in \mathbb {Z}\), so \(z + a'(y-x) \in L\). \(\square \)

As a sub-procedure, Algorithm 1 must efficiently decide whether a given point lies in the current candidate invariant \(L_i\).

Proposition 12

Let \(x \in \mathbb {Z}^d\) and \(L = \left( x^{(0)} + p_1\mathbb {Z}+ \dots + p_n\mathbb {Z}\right) \). Suppose \(\mu \) is an upper bound for the largest absolute value appearing in x and the largest absolute value appearing in all \(p_i\). Then deciding whether \(x\in L\) can be done in time polynomial in \(\log \mu \), n and d.

A d-dimensional lattice can always be defined by at most d period vectors. However, our procedure may return a representation containing more than d period vectors.

Example 13

Consider the lattice \(\left( (2,2)\mathbb {Z}+ (0,6)\mathbb {Z}+(2,6)\mathbb {Z}\right) \), specified with three vectors, which is equivalent to the lattice \(\left( (2,0)\mathbb {Z}+ (0,2)\mathbb {Z}\right) \). Note that one may not simply pick an independent subset of the periods, as none of the following sets are equal: \(\left( (2,2)\mathbb {Z}+ (0,6)\mathbb {Z}\right) \), \(\left( (2,2)\mathbb {Z}+ (2,6)\mathbb {Z}\right) \), \(\left( (0,6)\mathbb {Z}+ (2,6)\mathbb {Z}\right) \), and \(\left( (2,2)\mathbb {Z}+ (0,6)\mathbb {Z}+(2,6)\mathbb {Z}\right) \).

The Hermite normal form can be used to obtain a basis of the vectors that define the lattice. Consider a lattice \(L = \left( p_1\mathbb {Z}+\dots +p_d\mathbb {Z}\right) \). The lattice remains the same if \(p_i\) is swapped with \(p_j\), if \(p_i\) is replaced by \(-p_i\), or if \(p_i\) is replaced by \(p_i + \alpha p_j\) (for \(j\ne i\)) where \(\alpha \) is any fixed integer.

The above are the unimodular operations. The Hermite normal form of a matrix M is a matrix H such that \(M = UH\), where U is a unimodular matrix (formed by unimodular column operations) and H is lower triangular, non-negative and each row has a unique maximum entry which is on the main diagonal. Such a matrix H always exists and its columns form a basis of the lattice spanned by the columns of M, because they differ up to unimodular (lattice-preserving) operations. There are many texts on the subject; we refer the reader to the lecture notes of Shmonin [25] for more detailed explanations.

The non-zero columns of a matrix in Hermite normal form constitute a basis of the lattice generated by the columns of the original matrix. Hence a basis of the lattice spanned by a collection of vectors can be obtained by computing the Hermite normal form of the matrix formed by placing the vectors as columns. The Hermite normal form can be computed in polynomial time [26], which we now use to prove Proposition 12.

Proof of Proposition 12

It is equivalent to ask whether \(x-x^{(0)} \in \left( p_1\mathbb {Z}+ \dots + p_n\mathbb {Z}\right) \). Recall that we can place the lattice into Hermite normal form in polynomial time. That is, we determine \(d' \le d\) and \(p'_1,\dots ,p'_{d'}\) such that \(p'_1\mathbb {Z}+\dots +p'_{d'}\mathbb {Z}=p_1\mathbb {Z}+ \dots + p_n\mathbb {Z}\).

As the lattice is in Hermite normal form, the vectors \(p'_1,\dots ,p'_{d'}\) are linearly independent, so there is at most one choice of rational \(\alpha _1,\dots ,\alpha _{d'}\) such that \(\sum _{i=1}^{d'} \alpha _i p'_i = x-x^{(0)}\), and it can be determined by Gaussian elimination. Then we have \(x\in L\) if and only if such a choice exists and each \(\alpha _i\) is an integer. \(\square \)
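Below is a self-contained Python sketch of this membership test (our illustration, not the authors' implementation): it column-reduces the generators by unimodular operations, in the spirit of the Hermite normal form, into an echelon basis, and then peels off the unique candidate coefficients by forward substitution. To decide \(x\in L\), one calls it on \(v = x - x^{(0)}\).

def in_lattice(v, periods):
    """Decide whether the integer vector v lies in p_1*Z + ... + p_n*Z."""
    d = len(v)
    cols = [list(p) for p in periods]
    pivots = []                                # pairs (pivot row, pivot column)
    for row in range(d):
        # gcd-style elimination: unimodular column operations until at most
        # one remaining column has a non-zero entry in this row
        while True:
            nonzero = [c for c in cols if c[row] != 0]
            if len(nonzero) <= 1:
                break
            nonzero.sort(key=lambda c: abs(c[row]))
            small = nonzero[0]
            for c in nonzero[1:]:
                q = c[row] // small[row]
                for i in range(d):
                    c[i] -= q * small[i]
        nonzero = [c for c in cols if c[row] != 0]
        if nonzero:
            pivots.append((row, nonzero[0]))
            cols.remove(nonzero[0])
    # forward substitution: the echelon shape forces a unique coefficient per pivot
    b = list(v)
    for row, col in pivots:
        if b[row] % col[row] != 0:
            return False
        a = b[row] // col[row]
        for i in range(d):
            b[i] -= a * col[i]
    return all(entry == 0 for entry in b)

# Example 13: (2,2)Z + (0,6)Z + (2,6)Z is the lattice (2,0)Z + (0,2)Z.
print(in_lattice([4, 2], [[2, 2], [0, 6], [2, 6]]))   # True
print(in_lattice([1, 0], [[2, 2], [0, 6], [2, 6]]))   # False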

We now prove the main theorem of this section:

Proof of Theorem 10

We claim that Algorithm 1 returns the strongest \(\mathbb {Z}\)-linear invariant I in polynomial time. Let us first explain the idea of the algorithm, which proceeds in two phases:

  • First compute a subset \(L_0\subseteq I\) of the invariant that has the same dimension as I.

    Recall the set \(R_0 = \left\{ x^{(0)}, r_1,\dots ,r_{d'}\right\} \subseteq \mathcal {O}{}\), with \(d' \le d\), from Lemma 6. The resulting \(\mathbb {Z}\)-linear set \(L_0 = \left( x^{(0)} + (r_1 - x^{(0)})\mathbb {Z}+ \dots + (r_{d'} - x^{(0)})\mathbb {Z}\right) \) is then a \(d'\)-dimensional porous subset of the \(d'\)-dimensional affine hull of the orbit (\(L_0 \subseteq \textrm{Aff}(\mathcal {O}{})\)). Applying \(M_1,\dots , M_k\) can only increase the density, but not the dimension. As each \(r_i\) and \(x^{(0)}\) are in \(\mathcal {O}{}\), by Proposition 11 each of the directions \((r_i - x^{(0)})\) can be added as a period to any \(\mathbb {Z}\)-linear set containing \(\mathcal {O}{}\) without changing that set, and we therefore have \(L_0 \subseteq I\).

  • In the second phase, we ‘fill in’ the lattice as required to cover the whole of \(\mathcal {O}\). We compute a growing sequence \(L_0\subsetneq L_1\subsetneq \dots \subsetneq L_{m-1} = L_{m}= I\), where at each step the algorithm merely increases the density of the attendant sets in order to ‘fill in’ missing points of the invariant.

    To do this we repeatedly find new points which are not yet covered by \(L_i\). Supposing we find \(x'\in \mathcal {O}\setminus L_i\), we then use Proposition 11 to argue that the vector \(x' - x^{(0)}\) can be added as a new period.

\(\square \)

Claim 14

(Termination) Algorithm 1 terminates.

Proof of claim

The vectors \(p_1 = (r_1 - x^{(0)}), \dots ,p_{d'} = (r_{d'} - x^{(0)})\) define a parallelepiped (hyper-parallelogram) that tiles the affine hull \(\textrm{Aff}(\mathcal {O}{})\) regularly. There are only finitely many integral points inside this parallelepiped. If new points are added at some step then, by this regularity, corresponding points are added inside every copy of the parallelepiped. Thus new points can be added only finitely many times before \(L_i\) becomes fixed, at which point the algorithm terminates. \(\square \)

Claim 15

(I is an inductive invariant) Let \(M \in \{M_1,\dots , M_k\}\) and let \(x\in I\). Then \(Mx \in I\).

Proof of claim

It is clear that \(x^{(0)}\in I\) as \(x^{(0)}\in R_0\).

Let \(R=\left\{ r_0,\dots ,r_m\right\} \) be as in the last iteration of the algorithm, with \(r_0 = x^{(0)} \in R\), and so \(I = \left( r_0+ \sum _{i=1}^m(r_i - r_0)\mathbb {Z}\right) \).

Given a vector \(y\in \mathbb {Z}^d\), we denote by \(\begin{pmatrix} y \\ 1 \end{pmatrix}\) the vector in \(\mathbb {Z}^{d+1}\) formed by y in the first d dimensions and 1 in the final dimension. We first show that for any \(y\in \mathbb {Z}^d\):

$$\begin{aligned} y\in I \iff \begin{pmatrix} y \\ 1 \end{pmatrix} \in \sum _{r\in R} \begin{pmatrix} r \\ 1 \end{pmatrix}\mathbb {Z}. \end{aligned}$$
(1)

Let \(y = \left( r_0+ \sum _{i=1}^m(r_i - r_0)a_i\right) \in I\); then \(y = r_0(1-\sum _{i=1}^m a_i) + \sum _{i=1}^mr_ia_i\). Then we have \(\begin{pmatrix} y \\ 1 \end{pmatrix} = \begin{pmatrix} r_0 \\ 1 \end{pmatrix}(1-\sum _{i=1}^m a_i) + \sum _{i=1}^m\begin{pmatrix} r_i \\ 1 \end{pmatrix}a_i \in \sum _{r\in R} \begin{pmatrix} r \\ 1 \end{pmatrix}\mathbb {Z}\). Conversely, let \(\begin{pmatrix} y \\ 1 \end{pmatrix} = \sum _{i=0}^m \begin{pmatrix} r_i \\ 1 \end{pmatrix}a_i\); comparing the final coordinates gives \(\sum _{i=0}^m{a_i} = 1\), so \(a_0 = 1- \sum _{i=1}^ma_i\) and we have \(\begin{pmatrix} y \\ 1 \end{pmatrix} = \begin{pmatrix} r_0 \\ 1 \end{pmatrix} + \sum _{i=1}^m\left( \begin{pmatrix} r_i \\ 1 \end{pmatrix}-\begin{pmatrix} r_0 \\ 1 \end{pmatrix}\right) a_i\); thus in particular, \({y} = {r_0} + \sum _{i=1}^m({r_i}-{r_0})a_i \in I\).

By termination of the algorithm we have \(Mr_i\in I\) for all \(r_i\in R\) (otherwise Algorithm 1 would add \(Mr_i\) to R) and thus \(\begin{pmatrix} Mr_i \\ 1 \end{pmatrix} \in \sum _{r_j\in R} \begin{pmatrix} r_j \\ 1 \end{pmatrix}\mathbb {Z}\) for all \(r_i\in R\). Let \(a_{0,i},\dots , a_{m,i}\in \mathbb {Z}\) be such that \(\begin{pmatrix} Mr_i \\ 1 \end{pmatrix} = \sum _{r_j\in R} \begin{pmatrix} r_j \\ 1 \end{pmatrix}a_{j,i}\).

By \(x\in I\) and Eq. (1) we have \(\begin{pmatrix} x \\ 1 \end{pmatrix} = \sum _{r_i\in R} \begin{pmatrix} r_i \\ 1 \end{pmatrix}b_i\) for some \(b_0,\dots ,b_m\in \mathbb {Z}\).

Let us now establish that \(Mx \in I\). We have \(\begin{pmatrix} Mx \\ 1 \end{pmatrix} = \sum _{r_i\in R} \begin{pmatrix} Mr_i \\ 1 \end{pmatrix}b_i\). Therefore we have \(\begin{pmatrix} Mx \\ 1 \end{pmatrix} =\sum _{r_i\in R} \sum _{r_j\in R} \begin{pmatrix} r_j \\ 1 \end{pmatrix}a_{j,i}b_i \). Thus \(\begin{pmatrix} Mx \\ 1 \end{pmatrix} \in \sum _{r_i\in R} \begin{pmatrix} r_i \\ 1 \end{pmatrix}\mathbb {Z}\), entailing \(Mx \in I\) (again by Eq. 1). \(\square \)

Claim 16

(I is the strongest invariant) For every invariant J, we have \(I\subseteq J\).

Proof of claim

By induction, let us prove that every invariant J must contain \(L_i\). This is clear for \(L_0\): all points of \(R_0 \subseteq \mathcal {O}{}\) must be in J, and by Proposition 11 every period vector of \(L_0\) can be added to J without changing it. Assume \(L_i\subseteq J\). Then J must contain every \(M_j(x)\) for \(x\in L_i\), as otherwise it would not be an inductive invariant. It therefore follows that J must contain \(L_{i+1}\), since the latter is the minimal \(\mathbb {Z}\)-linear set containing \(L_i\) together with the point \(M_j(x)\) added at that step. Finally, since I is itself one of the \(L_i\)'s, we have \(I \subseteq J\) as required. \(\square \)

Claim 17

(Polynomial time) The algorithm runs in time polynomial in d, k and \(\log (\mu )\).

Proof of claim

Let \(x\in \mathbb {Z}^d\). We denote by \(\left\Vert x\right\Vert _\infty \) the largest absolute value of an entry of x, and by \(\left\Vert x\right\Vert _2\) the Euclidean norm of the vector.

Recall the parallelepiped from the proof of Claim 14 (termination). The volume of the parallelepiped is bounded above by \(\left\Vert p_1\right\Vert _2 \cdots \left\Vert p_{d'}\right\Vert _2\). The volume of the parallelepiped must at least halve at every step in which a vector is added to the invariant: a new vector either leaves the parallelepiped unchanged, or partitions it into at least two pieces, in which case one of the pieces has volume at most half of the original. The volume at step t is therefore \(vol_t \le \left\Vert p_1\right\Vert _2 \cdots \left\Vert p_{d'}\right\Vert _2 / 2^t\). The procedure must saturate at, or before, the point where the volume reaches 1, which occurs after at most \(\log (\left\Vert p_1\right\Vert _2 \cdots \left\Vert p_{d'}\right\Vert _2)\) steps.

Using Lemma 6 we obtain that each \(r_i\in R_0\) is the result of at most d matrix multiplications applied to \(x^{(0)}\); thus, using Proposition 7, we have \(\left\Vert r_i\right\Vert _\infty \le d^d\mu ^{d+1}\). Since \(p_i = r_i -x^{(0)}\), the triangle inequality gives \(\left\Vert p_i\right\Vert _\infty \le d^d\mu ^{d+1} + \mu \le (d\mu )^{d+1}\) (for \(d\ge 2\)).

Using \(\left\Vert p_i\right\Vert _2 \le \sqrt{d}\left\Vert p_i\right\Vert _\infty \), we obtain \(\left\Vert p_i\right\Vert _2 \le \sqrt{d}(d\mu )^{d+1}\). Taking liberal simplifications we obtain \(\left\Vert p_i\right\Vert _2\le (d\mu )^{2d}\), and hence \(\left\Vert p_1\right\Vert _2 \cdots \left\Vert p_{d'}\right\Vert _2 \le ((d\mu )^{2d})^d\). Hence the number of update steps at which a vector is added is at most \(\log ((d\mu )^{2d^2}) = (2d^2)\log (d\mu )\).

Since the number of vectors is at most \((2d^2)\log (d\mu )\), the number of steps between adding a vector is at most \(k(2d^2)\log (d\mu )\) (a new vector is added at least once across all iterations in the inner for loops, otherwise the procedure terminates). Hence, the total number of steps (counting a matrix multiplication, and verifying \(x\in L_i\) as a single step) is at most \(O(k((2d^2)\log (d\mu ))^2)\).

It remains to verify that the bit size of the vectors is polynomial. This will imply that the running time of the matrix multiplications is polynomial as well.

Let \((R_i)\) be the increasing sequence of sets built in Algorithm 1. As at most \((2d^2)\log (d\mu )\) vectors are added to those sets and at least one vector is added at each step, this sequence becomes stationary after at most \((2d^2)\log (d\mu )\) steps. Given \(i\le (2d^2)\log (d\mu )\), we have that each new vector \(x' \in R_i\) is of the form \(M_\ell x\) for some \(x \in R_{i-1}\) and \(\ell \in \{1,\dots ,k\}\). Hence, each element \(x\in R_i\) is the result of at most i matrix multiplications. By Proposition 7, after v matrix applications, the largest absolute value of an entry is at most \(d^v\mu ^{v+1}\). Hence \(\left\Vert x\right\Vert _\infty \le \mu (d\mu )^{(2d^2)\log (d\mu )}\), and thus the bit size of such numbers is at most \((2d^2)\log ^2(d\mu )+ \log (\mu )\), which is polynomial in d and \(\log (\mu )\). \(\square \)

Claims 14 to 17 conclude that Algorithm 1 computes the strongest inductive invariant I, terminating in polynomial time, as required. \(\square \)

Remark 18

Considering again the MU puzzle, we note that both 1 and 2 are in the reachability set, hence \(\left( 1 + 1\mathbb {Z}\right) = \mathbb {Z}\) is the strongest \(\mathbb {Z}\)-linear invariant. Thus the class of \(\mathbb {Z}\)-linear sets is not useful for certifying non-reachability in this case.

4.2 Extensions of \(\mathbb {Z}\)-linear sets without strongest invariants

In this section we show that several generalisations of the class of \(\mathbb {Z}\)-linear sets fail to admit strongest invariants.

\(\mathbb {Z}\)-semi-linear sets are unions of \(\mathbb {Z}\)-linear sets, and therefore all finite sets are \(\mathbb {Z}\)-semi-linear. Consider the deterministic dynamical system \(\mathcal {M} = (1,(x\mapsto 2x))\), starting from point 1 and doubling at each step. This system has reachability set \(\mathcal {O}= \left\{ 2^k \mid k\in \mathbb {N}\right\} \). For this LDS we can construct, for each k, the inductive invariant \(I_k = \left\{ 1,2,4,\ldots , 2^k\right\} \cup \left\{ 2^{k+1} p \mid p\in \mathbb {Z}\right\} \). For any proposed strongest \(\mathbb {Z}\)-semi-linear invariant, one can find a k for which the corresponding invariant \(I_k\) is strictly smaller.
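A bounded sanity check (ours, in Python) of this family of invariants: each \(I_k\) contains the initial point 1, is inductive under doubling, and strictly contains \(I_{k+1}\), so no smallest \(\mathbb {Z}\)-semi-linear invariant exists.

def in_I(x, k):
    """Membership in I_k = {2^0, 2^1, ..., 2^k} union 2^(k+1) * Z."""
    return x in {2 ** i for i in range(k + 1)} or x % 2 ** (k + 1) == 0

for k in range(1, 7):
    sample = range(-512, 513)
    assert in_I(1, k)                                                 # contains x^(0) = 1
    assert all(in_I(2 * x, k) for x in sample if in_I(x, k))          # inductive for x -> 2x
    assert all(in_I(x, k) for x in sample if in_I(x, k + 1))          # I_{k+1} included in I_k
    assert in_I(3 * 2 ** (k + 1), k) and not in_I(3 * 2 ** (k + 1), k + 1)   # inclusion is strict
print("each I_k is an inductive invariant and the chain I_1, I_2, ... strictly decreases")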

\(\mathbb {N}{}\)-linear sets generalise \(\mathbb {Z}\)-linear sets (observe that \(\mathbb {Z}\)-linear sets are a proper subclass, since \(\left( x+ p_i\mathbb {Z}\right) \) can be expressed as \(\left( x+ (-p_i)\mathbb {N}{} + p_i\mathbb {N}\right) \), but \(\left( x+ p_i\mathbb {N}\right) \) is clearly not \(\mathbb {Z}\)-linear). Consider the LDS \(\left((x_1,x_2),\left( {\begin{matrix}0&{}1\\ 1&{}0\end{matrix}}\right) \right)\) with \(x_1\ne x_2\), whose reachability set consists of just the two points \(x= (x_1,x_2)\) and \(y=(x_2,x_1)\). There are two incomparable candidates for the minimal \(\mathbb {N}{}\)-linear invariant: \(\left( x + (y-x)\mathbb {N}\right) \) and \(\left( y + (x-y)\mathbb {N}\right) \). Similarly for \(\mathbb {R}_{+}\)-linear invariants, the sets \(\left( y + (x-y)\mathbb {R}_{+}\right) \) and \(\left( x + (y-x)\mathbb {R}_{+}\right) \) are incomparable half-lines.

5 \(\mathbb {N}\)-semi-linear invariants

We turn now to \(\mathbb {N}{}\)-semi-linear invariants, the most general class of invariants that we consider. \(\mathbb {N}\)-semi-linear invariants gain expressivity thanks to the ‘directions’ provided by the period vectors. For example, the only possible \(\mathbb {Z}\)-semi-linear invariant for the LDS \((0,(x\mapsto x+1))\) is \(\mathbb {Z}\), yet the reachability set, namely \(\mathbb {N}\), is \(\mathbb {N}\)-linear.

In Sect. 5.1 we show that a separating \(\mathbb {N}\)-semi-linear inductive invariant can always be found for unreachable instances of deterministic integer LDS, although the computed invariant will depend on the target (strongest invariants do not always exist here). However, in Sect. 5.2 we show that finding invariants is undecidable for nondeterministic systems, at least in high dimension. Nevertheless, we show in Sect. 5.3 decidability for the low-dimensional setting of the MU Puzzle—one dimension with affine updates.

5.1 Existence of sufficient (but non-minimal) \(\mathbb {N}\)-semi-linear invariants for point reachability in deterministic LDS

Kannan and Lipton showed decidability of reachability of a point target for deterministic LDS [6]. In this subsection, we establish the following result to provide a separating invariant in unreachable instances.

Theorem 19

Given a deterministic LDS \((x^{(0)},M)\) together with a point target y, if the target is unreachable then a separating \(\mathbb {N}\)-semi-linear inductive invariant can be provided effectively.

To do so, we will invoke the results from [5] to compute an \(\mathbb {R}_+\)-semi-linear inductive invariant, and then extract from it an \(\mathbb {N}\)-semi-linear inductive invariant. More precisely, the authors of [5] show how to build polytopic inductive invariants for certain deterministic LDS. Such polytopes are either bounded or are \(\mathbb {R}_{+}\)-semi-linear sets. In the first case, the polytope contains only finitely many integral points, which can directly be represented via an \(\mathbb {N}\)-semi-linear set. In the second case, we build an \(\mathbb {N}\)-semi-linear set containing exactly the set of integral points included in the \(\mathbb {R}_{+}\)-semi-linear invariant, thanks to the following lemma.

Lemma 20

Given an \(\mathbb {R}_+\)-linear set \(S= \left( x + \sum _{i} p_i\mathbb {R}_+\right) \), where the vectors \(p_i\) have rational coefficients and x is an integer vector, one can build an \(\mathbb {N}\)-semi-linear set N comprising precisely all of the integral points of S.

Proof

Let \(S= \left( x + \sum _{i} p_i\mathbb {R}_+\right) \) be a \(\mathbb {R}_+\)-linear set where the vectors \(p_i\) have rational coefficients and x is an integer vector. Let \(k\in \mathbb {N}\) be an integer so that the vectors \(kp_i\) have integer coefficients. We denote by \(v_j\) the integer vectors of the form \(\sum _{i}\mu _ikp_i\) where \(0\le \mu _i\le 1\). Then the set \(T = \left( x + \sum _{j} v_j\mathbb {N}\right) \) contains exactly the integer vectors contained in S.

Indeed, first T only contains integer vectors since both x and the vectors \(v_j\) are integer vectors. Secondly, all the vectors in T are included in S as the period vectors of T lie in the cone defined by the vectors of S. Finally, given an integer vector y in S, y can be rewritten as \(y = x + v + \sum _{i} m_ikp_i\) where for all \(i, m_i\in \mathbb {N}\) and v is of the form \(\sum _{i}\mu _ikp_i\) with \(0\le \mu _i\le 1\) Therefore there exists j such that \(v_j=v\) and as for all i, \(kp_i\) is a period vector of T, \(y\in T\). \(\square \)

Proof of Theorem 19

We note that every inductive invariant produced in [5] has rational period vectors, as these vectors are given by differences of successive points in the orbit of the system; thus Lemma 20 can be applied. The result is again an inductive invariant: the invariant of [5] is inductive, the LDS only reaches integer vectors, and the set produced through Lemma 20 contains all the integer points of that invariant.

The authors of [5] build an inductive invariant in all cases except those for which every eigenvalue of the matrix governing the evolution of the LDS is either 0 or of absolute value 1 and at least one of the latter is not a root of unity. This situation however cannot occur in our setting. Indeed, the eigenvalues of an integer matrix are algebraic integers, and an old result of Kronecker [27] asserts that unless all of the eigenvalues are roots of unity, one of them must have absolute value strictly greater than 1 (the case in which all eigenvalues are 0 being of course trivial). \(\square \)

5.2 Undecidability of \(\mathbb {N}\)-semi-linear invariants for nondeterministic LDS

While the enhanced expressivity of \(\mathbb {N}\)-semi-linear sets allows us to always find an invariant for deterministic LDS, it in turn contributes to making the invariant-synthesis problem undecidable when the LDS is not deterministic.

We establish this through a reduction from the infinite Post correspondence problem (\(\omega \)-PCP), which can be defined in the following way: given m pairs of non-empty words \(\{(u^{1},v^{1}),\dots , (u^{m},v^{m})\}\) over a binary alphabet, does there exist an infinite word \(w=w_1w_2\dots \) over alphabet \(\{1,\dots ,m\}\) such that \(u^{w_1}u^{w_2}\ldots = v^{w_1}v^{w_2}\ldots \)? This problem is known to be undecidable when m is at least 8 [28, 29].

Theorem 21

The invariant synthesis problem for \(\mathbb {N}\)-semi-linear sets and linear dynamical systems with 13 matrices of dimension 7, or two matrices of dimension 91, is undecidable.

Proof

This proof follows in part the structure of the argument showing the undecidability of the invariant synthesis problem for \(\mathbb {R}_+\)-semi-linear invariants presented in [5]. Some non-trivial changes and new ideas have to be added here due to the restriction to integer values.

We will transform an instance of \(\omega \)-PCP with m tiles to an instance of the invariant synthesis problem for \(m+5\) matrices of size 7. This can then be converted in routine fashion to an instance of two matrices of size \(7m+35\) (see Theorem 9 of [5] for instance).

The main idea of this proof is to encode a pair of words on alphabet \(\{1,2\}\) corresponding to each sequence of tiles as an integer in base 4. An important property of our encoding is that the operation of appending a new tile to an existing pair of words can be achieved by matrix multiplication.

Recall that if the instance of \(\omega \)-PCP is negative, then every generated pair of words will differ at some point. Our reduction is such that a difference of letters creates a difference in their numerical encodings that can be identified through an \(\mathbb {N}\)-semi-linear invariant. Conversely, when the \(\omega \)-PCP instance has a positive answer, there can be no \(\mathbb {N}\)-semi-linear invariant.

Short simplifying lemma

In order to simplify the main part of the proof, let us first show that one can enforce an order between the matrices using affine transformations in one dimension. Let us denote by p this dimension; it is initially equal to 1 and its target value is 0. Consider the three following affine transformations: \(f_1(p)= 2p -1\), \(f_2(p) = 2p -2\) and \(f_3(p) = 2p\). The only sequences of transformations that allow the target to be reached are of the form \(f_3^*f_2f_1^*\). Indeed, let \(\mathcal {I}= \{p\mid p\ge 2 \vee p\le -1\}\); we have (1) if \(p\in \mathcal {I}\), then for all \(i\in \{1,2,3\}\), \(f_i(p)\in \mathcal {I}\), (2) \(f_1(1) = 1\) and \(f_1(0)\in \mathcal {I}\), (3) \(f_2(1) = 0 \) and \(f_2(0)\in \mathcal {I}\) and (4) \(f_3(1)\in \mathcal {I}\) and \(f_3(0)=0\). As a consequence, the inductive invariant \(\mathcal {I}\) ensures that any sequence of transformations that does not have the desired order cannot reach the target. In the following, we will say that the transformations we define are of type 1, 2 or 3, depending on whether they implicitly contain the function \(f_1\), \(f_2\) or \(f_3\).
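A brute-force check (ours) of this gadget: enumerate all short sequences over the three maps and confirm that exactly those of the stated shape send \(p=1\) to the target value 0.

import re
from itertools import product

f = {1: lambda p: 2 * p - 1, 2: lambda p: 2 * p - 2, 3: lambda p: 2 * p}

def reaches_target(seq, p=1):
    for i in seq:                        # maps applied in the order listed
        p = f[i](p)
    return p == 0

for n in range(1, 9):                    # exhaustive over all sequences of length up to 8
    for seq in product([1, 2, 3], repeat=n):
        word = "".join(map(str, seq))
        # application order f1^* then f2 then f3^* is the composition f3^* f2 f1^*
        assert reaches_target(seq) == bool(re.fullmatch(r"1*23*", word))
print("only sequences of the form f3^* f2 f1^* map p = 1 to p = 0")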

Description of the reduction

We reduce an instance \(\{(u^{1},v^{1}),\dots , (u^{m},v^{m})\}\) of the \(\omega \)-PCP problem over binary alphabet \(\{1,2\}\) to the invariant synthesis problem. Given a finite or infinite word w, we denote by \(\left|w\right|\) the length of the word w and given an integer \(i\le \left|w\right|\), we write \(w_i\) for the i-th letter of w. Given a finite or infinite word w on alphabet \(\{1,\dots ,m\}\) we denote by \(u^{w}\) and \(v^{w}\) the words on alphabet \(\{1,2\}\) such that \(u^{w}= u^{w_1}u^{w_2}\dots \) and \(v^{w}= v^{w_1}v^{w_2}\dots \). Given a finite word w on alphabet \(\{1,2\}\), denote by \([ w ]=\sum _{i=1}^{\left|w\right|}w_i4^{\left|w\right|-i}\) the quaternary encoding of w. It is clear that it satisfies \([ ww' ]=4^{\left|w'\right|}[ w ]+[ w' ]\). For all \(i\le m\), we denote by \({n_i}=4^{\left|u^{i}\right|}\), \(m_i=4^{\left|v^{i}\right|}\) and \(\textsf {max}_i = \max (n_i,m_i)\).
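A short Python check (ours) of the encoding and of the concatenation identity \([ ww' ]=4^{\left|w'\right|}[ w ]+[ w' ]\):

def encode(w):
    """Quaternary encoding [w] = sum_i w_i * 4^(|w| - i) of a word over {1, 2}."""
    value = 0
    for letter in w:
        value = 4 * value + letter
    return value

u, v = [1, 2, 2], [2, 1]
print(encode(u))                                             # 1*16 + 2*4 + 2 = 26
assert encode(u + v) == 4 ** len(v) * encode(u) + encode(v)  # [ww'] = 4^{|w'|}[w] + [w']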

We work with five dimensions, (s, c, d, n, k), and define the following transformations:

  • For \(i \le m\), the type 1 transformation \(\textsf{Simulate}_{i}\) on (s, c, d, n, k) encodes the action of reading the pair \((u^i,v^i)\) and increases the counters n and k: it simultaneously applies \(s \leftarrow \textsf {max}_i s + c[u^i]\frac{\textsf {max}_i}{n_i}-d[v^i]\frac{\textsf {max}_i}{m_i}\), \(c \leftarrow \frac{\textsf {max}_i}{n_i}c\), \(d \leftarrow \frac{\textsf {max}_i}{m_i}d\), \(n\leftarrow n+k\) and \(k\leftarrow k+1\).

  • The type 2 transformation \(\textsf{Transfer}\) on (s, c, d, n, k) gathers some of the values in order to compare them and resets d: \(s\leftarrow s - c - d\), \(c \leftarrow -s -c-d\) and \(d\leftarrow 0\).

  • The type 3 transformation \(\mathsf {Inc_s}\) increments s: \(s\leftarrow s+1\).

  • The type 3 transformation \(\mathsf {Inc_c}\) increments c: \(c\leftarrow c+1\).

  • The type 3 transformation \(\textsf{Dec}\) decreases k and n: \(n\leftarrow n-k\), \(k\leftarrow k-1\).

  • The type 3 transformation \(\mathsf {Dec_k}\) decrements k: \(k\leftarrow k-1\).

These \(m+5\) transformations operate over seven dimensions in total: the five above (namely (s, c, d, n, k)), one (namely p) for ordering the transformations, and one last dimension constantly equal to 1, required to implement affine transformations.

We will show that there is a solution to the given instance of the \(\omega \)-PCP problem iff there does not exist an \(\mathbb {N}\)-semi-linear inductive invariant disjoint from the target for the system with initial point \(x = (0,1,1,0,0,1,1)\), target \(y=(0,0,0,1,0,0,1)\), and the matrices inducing the transformations defined above.

Evolution of the system

Let \(w=w_1\dots w_j\) be a finite word over \(\{1,\dots ,m\}^*\). Consider \((s,c,d,n,k,p,a)=\textsf{Simulate}_{w} x\) where \(\textsf{Simulate}_{w}\) represents the transformation \(\textsf{Simulate}_{w_{j}}\dots \textsf{Simulate}_{w_2}\textsf{Simulate}_{w_1}\). We have

  • \(s = c[ u^{w} ]-d[ v^{w} ]\),

  • \(n=\frac{j(j-1)}{2} \) and \(k=j\),

  • \(p=a=1\).

Indeed, let us prove the first item (the only non-trivial one) by induction on the length of w. If \(|w|=0\), then \([ u^{w} ]=[ v^{w} ]=0\), which is consistent with the first component of x being 0. Otherwise, w is of the form zi with \(i \in \{1,\dots , m\}\). By the induction hypothesis, denoting \((s,c,d,n,k,p,a)=\textsf{Simulate}_w x\) and \((s',c',d',n',k',p',a')=\textsf{Simulate}_{z} x\), we have that \(s' = c'[ u^{z} ]-d'[ v^{z} ]\). Applying \(\textsf{Simulate}_{i}\), we obtain that \(s = \textsf {max}_i s' + c'[u^i]\frac{\textsf {max}_i}{n_i}-d'[v^i]\frac{\textsf {max}_i}{m_i}\), \(c = \frac{\textsf {max}_i}{n_i}c'\) and \(d = \frac{\textsf {max}_i}{m_i}d'\). Thus

$$\begin{aligned} s =&\textsf {max}_i (c'[ u^{z} ]-d'[ v^{z} ]) + c'[u^i]\frac{\textsf {max}_i}{n_i}-d'[v^i]\frac{\textsf {max}_i}{m_i}\\ =&c'(\textsf {max}_i[ u^{z} ] + [u^i]\frac{\textsf {max}_i}{n_i}) -d' (\textsf {max}_i[ v^{z} ] + [v^i]\frac{\textsf {max}_i}{m_i})\\ =&c (n_i[ u^{z} ] + [u^i]) - d (m_i[ v^{z} ] + [v^i])\\ =&c[ u^{w} ]-d[ v^{w} ] \end{aligned}$$

which concludes the induction.

Only if case: \(\omega \)-PCP solution implies no invariant

Assume that there is a solution w to the \(\omega \)-PCP instance. Consider the sequence of points \((x_j)_{j\in \mathbb {N}}\) obtained as follows: for all \(j\in \mathbb {N}\), denoting by \(w_{\le j}\) the prefix of w of length j, \(x_j = (s_j, c_j,0, n_j,k_j,0,1) = \textsf{Transfer}\ \textsf{Simulate}_{w_{\le j}} x\).

Let \((s,c,d)\) be the first three components of \(\textsf{Simulate}_{w_{\le j}} x\). Assuming without loss of generality that \(\left|u^{w_{\le j}}\right|\le \left|v^{w_{\le j}}\right|\), we have that

$$\begin{aligned} \vert s\vert&= \vert c[ u^{w_{\le j}} ]-d[ v^{w_{\le j}} ]\vert \\&= \sum _{i=1}^{\vert u^{w_{\le j}}\vert }\vert u^{w_{\le j}}_i-v^{w_{\le j}}_i\vert c4^{\vert u^{w_{\le j}}\vert -i} + \sum _{i=\vert u^{w_{\le j}}\vert +1}^{\vert v^{w_{\le j}}\vert }v^{w_{\le j}}_ic4^{\vert u^{w_{\le j}}\vert -i}\\&= \sum _{i=\vert u^{w_{\le j}}\vert +1}^{\vert v^{w_{\le j}}\vert }v^{w_{\le j}}_ic4^{\vert u^{w_{\le j}}\vert -i} \\&< c\, . \end{aligned}$$

The first equality was proven in the previous paragraph. The second equality is obtained by grouping the terms corresponding to the same power of 4 and noting that, by construction, \(c4^{\vert u^{w_{\le j}}\vert }=d4^{\vert v^{w_{\le j}}\vert }\). The third equality comes from the fact that \(w_{\le j}\) is a prefix of a solution to the \(\omega \)-PCP instance and thus that letters on the same level are the same. Finally, the last inequality is obtained by bounding every \(v_i^{w_{\le j}}\) by 2 and extending the sum to infinity.

From this inequality, and since \(d\ge 0\), we have that \(\left|s\right|-c-d\) is negative, and thus both \(s_j = s-c-d\) and \(c_j = -s-c-d\) are negative.

Due to the above, by applying the transformations \(\mathsf {Inc_s}\) and \(\mathsf {Inc_c}\) to the points \(x_j\) a suitable number of times, we obtain the sequence of points \((y_j)_{j\in \mathbb {N}}\) where \(y_j = (0,0,0,n_j,k_j,0,1)\). We claim that any semi-linear invariant containing all the points \(y_j\) also contains a point of the form \((0,0,0,n_j+d,k_j,0,1)\), where d is a positive integer. This will imply the result: from such a point, one can reach the target by \(d-1\) applications of \(\mathsf {Dec_k}\) and \(k_j\) applications of \(\textsf{Dec}\), and thus there is no semi-linear invariant of the system that does not intersect the target.

Let us now prove the above claim. Let \(\mathcal {I}\) be a semi-linear set containing every vector \(y_j\) (which we view as two-dimensional objects by projecting onto the 4th and 5th dimensions). Then there exists a linear set \(\mathcal {I}'\subseteq \mathcal {I}\) that contains infinitely many of the vectors \((y_j)_{j\in \mathbb {N}}\). This set \(\mathcal {I}'\) is defined by an initial vector and a set of period vectors. As \(\mathcal {I}'\) contains infinitely many of the vectors \((y_j)_{j\in \mathbb {N}}\), whose ratio between the first and second components is increasing, one of the period vectors is of the form (d, 0) where d is a strictly positive integer. Let j be such that \(y_j\in \mathcal {I}'\); then \((n_j+d, k_j)\in \mathcal {I}'\), which implies the claim.

As a consequence, every inductive \(\mathbb {N}\)-semi-linear invariant of the LDS intersects with the target.

If case: no \(\omega \)-PCP solution implies an invariant

Assume that there is no solution to the \(\omega \)-PCP instance. Then there exists \(n_0\in \mathbb {N}\) such that for every infinite word w over the alphabet \(\{1,\dots , m\}\) there exists \(n\le n_0\) such that \(u^{w}_n \ne v^{w}_n\). Indeed, consider the tree whose root is labelled by \((\varepsilon ,\varepsilon )\) and in which, given a node (u, v) of the tree, if for all \(n\le \min (\left|u\right|,\left|v\right|)\) we have \(u_n=v_n\), then this node has m children: the nodes \((u u^{i},v v^{i})\) for \(i\in \{1,\dots , m\}\). This tree is finitely branching and does not contain any infinite path (which would induce a solution to the \(\omega \)-PCP instance). Thus, according to König’s lemma, it is finite. We can therefore choose the height of this tree as our \(n_0\).

We define the invariant \(\mathcal {I}=\mathcal {I}_1\cup \mathcal {I}_2\cup \mathcal {I}_3\) where

$$\begin{aligned} \mathcal {I}_1=&\big \{\textsf{Simulate}_w (x) \mid w\in \{1,\dots ,m\}^* \wedge \left|w\right|\le n_0+1\big \}, \\ \mathcal {I}_2=&\big \{z=(s,c,0,n,k,0,1) \mid z= (\mathsf {Inc_s})^*(\mathsf {Inc_c})^* (\textsf{Dec})^*(\mathsf {Dec_k})^* \textsf{Transfer}\ \textsf{Simulate}_w (x) \\&\wedge w\in \{1,\dots ,m\}^* \wedge \left|w\right|\le n_0+1 \wedge s,c,n,k\in \mathbb {N}\big \} \end{aligned}$$

and

$$\begin{aligned} \mathcal {I}_3=&\big \{(s,c,d,n,k,p,1)\mid (\left|s\right|-c-d\ge 1\wedge c\ge 0 \wedge d \ge 0 \wedge p=1)\\&\vee ((s\ge 1\vee c\ge 1\vee n\le -1 \vee k\le -1) \wedge p=0) \vee p\le -1 \vee p \ge 2\big \} \end{aligned}$$

By definition, \(\mathcal {I}\) is an \(\mathbb {N}\)-semi-linear set, contains x and does not contain y. The difficulty is to show stability under the transformations.

  • Let \(z=\textsf{Simulate}_w(x)\in \mathcal {I}_1\), for some \(w\in \{1,\dots ,m\}^*\) with \(\left|w\right| \le n_0 +1\). By the ordering imposed via the p component, if we apply any transformation other than \(\textsf{Transfer}\) or \(\textsf{Simulate}_{i}\) for some i, we reach \(\mathcal {I}_3\).

    • For \(i\in \{1,\dots ,m\}\), if \(\left|w\right| \le n_0\), then \(\textsf{Simulate}_{i}z\in \mathcal {I}_1\).

      Else, \(\textsf{Simulate}_{i}z = \textsf{Simulate}_{wi}x=(s,c,d,n,k,p,1)\) with \(\left|w\right|=n_0+1\). But then, there exists \(n_1\leqslant n_0\) such that \(u^{wi}_{n_1}\ne v^{wi}_{n_1}\). Let \(n_2\) be the smallest such number and assume without loss of generality that \(c\ge d\); we have

      $$\begin{aligned} s&= c[ u^{wi} ]-d[ v^{wi} ]\\&= (u^{wi}_{n_2}-v^{wi}_{n_2})c4^{\vert u^{wi}\vert -n_2}+ \sum _{j=n_2+1}^{\max (\vert u^{wi}\vert ,\vert v^{wi}\vert )}(u^{wi}_j-v^{wi}_j)c4^{\vert u^{wi}\vert -j} \end{aligned}$$

      since \(u^{wi}_j=v^{wi}_j\) for \(j<n_2\). Thus,

      $$\begin{aligned} \vert s\vert &\geqslant c4^{\vert u^{wi}\vert -n_2}-\frac{2c}{3} 4^{\vert u^{wi}\vert -n_2} && \text {since }\vert u^{wi}_{n_2}-v^{wi}_{n_2}\vert =1 \text { and, for } n\ge n_2,\ \vert u^{wi}_{n}-v^{wi}_{n}\vert \le 2\\ &\geqslant \frac{1}{3} c4^{\vert u^{wi}\vert -n_2}\\ &\geqslant 2c + 1 && \text {since }n_2\leqslant n_0\text { and }\vert u^{wi}\vert \geqslant n_0+2. \end{aligned}$$

      As \(c\ge d\), this shows that \(\textsf{Simulate}_{i} z\in \mathcal {I}_3\).

    • \(\textsf{Transfer}\,z\in \mathcal {I}_2\).

  • Let \(z\in \mathcal {I}_2\) and let f be one of the transformations; then \(f(z) \in \mathcal {I}_2\) if f increased (resp. decreased) a negative (resp. positive) component. Otherwise \(f(z) \in \mathcal {I}_3\).

  • Let \(z=(s,c,d,n,k,p,1)\in \mathcal {I}_3\), f be one of the transformations and \(f(z) = (s',c',d',n',k',p',1)\).

    • If \(p=0\), then either \(p'\le -1\) and \(f(z)\in \mathcal {I}_3\), or z satisfies \((s\ge 1\vee c\ge 1\vee n\le -1 \vee k\le -1)\) and then f(z) satisfies \((s'\ge 1\vee c'\ge 1\vee n'\le -1 \vee k'\le -1)\), thus \(f(z)\in \mathcal {I}_3\).

    • If \(p =1\), then \(\vert s\vert -c-d \ge 1\), \(c\ge 0 \) and \(d\ge 0\). There are three possibilities: (1) \(p'=2\), and thus \(f(z) \in \mathcal {I}_3\); (2) \(f=\textsf{Transfer}\), in which case \(p'=0\) and either \(s' \ge 1\) or \(c' \ge 1\), and thus \(f(z)\in \mathcal {I}_3\); or (3) \(f=\textsf{Simulate}_{i}\) for some \(i\le m\). In the latter case, assume without loss of generality that \(d'\geqslant c'\). We have that

      $$\begin{aligned} \left|s'\right| &=\vert \textsf {max}_i s+c'[ u^{i} ]-d'[ v^{i} ]\vert && \text {by applying }\textsf{Simulate}_{i}\\ &\geqslant \textsf {max}_i\vert s\vert - d'\max ([ u^{i} ],[ v^{i} ])\\ &\geqslant \textsf {max}_i(c+d+1)-d'\max ([ u^{i} ],[ v^{i} ]) && \text {by assumption on }\vert s\vert \\ &\geqslant \textsf {max}_i(c+d+1)-\tfrac{2}{3}d\,\textsf {max}_i && \text {since }[ u^i ]\in [0,\tfrac{2n_i}{3}]\\ &= \textsf {max}_i(c+ d/3) + \textsf {max}_i \\ &\geqslant c'+d' + 1 \end{aligned}$$

      since \(\textsf {max}_i c\ge c'\), \(\textsf {max}_i d/3 \ge d'\) (as \(m_i\ge 4\)) and \(\textsf {max}_i\geqslant 4\). This shows that \(f(z)\in \mathcal {I}_3\).

Therefore \(\mathcal {I}\) is inductive and thus an \(\mathbb {N}\)-semi-linear invariant of the system. This concludes the reduction. \(\square \)

5.3 Nondeterministic one-dimensional affine updates

The previous section shows that point reachability for nondeterministic LDS is undecidable once there are sufficiently many dimensions, motivating an analysis at lower dimensions. The MU Puzzle requires a single dimension with affine updates (or equivalently two dimensions in matrix representation, with the coordinate along the second dimension kept constant). We consider this one-dimensional affine-update case, and therefore, rather than taking matrices as input, we directly work with affine functions of the form \(f_i(x) = a_i x + b_i\).

Theorem 22

Given \(x^{(0)},y{}\in \mathbb {Z}\), along with a finite set of functions \(\left\{ f_1,\dots ,f_k\right\} \) where \(f_i(x) = a_ix+b_i\), \(a_i,b_i \in \mathbb {Z}\) for \(1 \le i \le k\), it is decidable whether \(y\) is reachable from \(x^{(0)}\). Moreover, when y is unreachable, an \(\mathbb {N}\)-semi-linear separating inductive invariant can be algorithmically computed in pseudo-polynomial time.

We note that decidability of reachability is already known [14, 15]. We refine this result by exhibiting an inductive invariant which can be used to certify non-reachability. In fact our procedure will produce an \(\mathbb {N}\)-semi-linear set which can be used to decide reachability, and which, in instances of non-reachability, will be a separating inductive invariant. We have implemented this algorithm in our tool porous, enabling us to efficiently tackle the MU Puzzle as well as its generalisation to arbitrary collections of one-dimensional affine functions. We report on our experiments in Sect. 7.

We build a case distinction depending on the type of functions that appear:

Definition 23

Consider an affine function \(f(x) = ax + b\). We say:

  • f is redundant if \(f(x)= b\), (including possibly \(b = 0\)), or if \(f(x) = x\).

  • f is a counter if \(f(x) = x + b\), \(b\ne 0\). Two counters \(f(x) = x + b\) and \(g(x) = x + c\) are opposing if \(bc<0\). Otherwise they are called codirectional.

  • f is growing if \(f(x) = ax + b\) and \(\left|a\right| \ge 2\). We say a growing function is inverting if \(a \le -2\).

  • f is pure inverting if \(f(x) = -x + b\) (including possibly \(b = 0\)).
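The case distinction of Definition 23 is purely syntactic and can be automated directly; the sketch below (the function names classify and opposing are ours) is illustrative only.

```python
def classify(a, b):
    """Classify f(x) = a*x + b according to Definition 23."""
    if a == 0 or (a, b) == (1, 0):
        return "redundant"                   # constant map or identity
    if a == 1:
        return "counter"                     # opposing/codirectional is a pairwise notion
    if a == -1:
        return "pure inverting"
    return "growing, inverting" if a <= -2 else "growing"

def opposing(b1, b2):
    """Two counters x + b1 and x + b2 are opposing iff b1*b2 < 0."""
    return b1 * b2 < 0

assert classify(1, 3) == "counter" and classify(-2, 5) == "growing, inverting"
assert opposing(3, -5) and not opposing(3, 5)
```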

5.3.1 Simplifying assumptions

Lemma 24

We can reduce the computation of an invariant for a system having redundant functions to finitely many invariant computations for systems having no such functions.

Proof

Clearly the identity function has no impact on the reachability set, and so can be removed outright. For any other redundant function, its impact on the reachability set does not depend on when the function is used, and we may therefore assume that it was used in the first step, or equivalently, using an alternative starting point. Hence the invariant-computation problem can be reduced to finitely many instances of the problem over different starting points, with redundant functions removed. Finally, taking the union of the resulting invariants yields an invariant for the original system. \(\square \)

Lemma 25

Without loss of generality, \(x^{(0)} \ge 0\).

Proof

Suppose \(x^{(0)} < 0\); we construct a new system, where each transition \(f(x) = ax +b\) is replaced by \(\overline{f}(x) = ax - b\). Then \(x^{(0)}\) reaches \(y{}\) in the original system if and only if \(-x^{(0)}\) reaches \(-y\) in the new system. To see this, observe that if \(f(x) = ax + b\), then \(\overline{f}(-x) = -ax-b = - f(x)\). \(\square \)

Lemma 26

Suppose there are at least two distinct pure inverting functions (and possibly other types of functions). Then without loss of generality there are two opposing counters.

Proof

Consider \(f(x) = -x + b\), and \(g(x) = -x + c\). Then \(f(g(x)) = -(-x+c)+b = x +b-c\) and \(g(f(x)) = -(-x+b)+c = x+c-b\). Since \(b-c = -(c-b)\) and \(b\ne c\) (as \(f\ne g\)) these two functions are opposing. \(\square \)

5.3.2 Two opposing counters

Let us first observe that when there are two opposing counters, we can essentially move in either direction by some fixed amount. This will entail that only \(\mathbb {Z}\)-(semi)-linear invariants need be produced, rather than proper \(\mathbb {N}{}\)-(semi)-linear invariants.

Lemma 27

Suppose there are two opposing counters, \(f(x) = x + b\), and \(g(x) = x - c\). Then for any reachable x we have \(\left( x + d\mathbb {Z}\right) \subseteq I\) for \(d = \gcd (b,c)\).

Lemma 28

For \(\ell ,k\) coprime, the sequence \(a_n = (n\ell \bmod k)\) for \(n\in \mathbb {N}\) cycles through every residue class \(\left\{ 0,\dots ,k-1\right\} \).

Proof

Any run of more than k consecutive values visits some residue class twice, and if the shortest cycle has length k, then every class is visited.

Suppose there is a cycle of length \(i < k\); then \(n\ell = c + mk\) and \((n+i)\ell = c + m'k\) for some \(n,m,m'\), and hence \(i\ell = (m'-m)k\). Since \(\ell \) is an integer, i divides \((m'-m)k\), so we can write \(i = pr\) for \(p,r\in \mathbb {N}{}\) such that \(\frac{m'-m}{p}\) and \(\frac{k}{r}\) are integers. Observe that since \(r\le i< k\) we have \(\frac{k}{r} > 1\). But then \(\ell = \frac{m'-m}{p}\cdot \frac{k}{r}\), so \(\frac{k}{r}\) divides both k and \(\ell \), contradicting \(\gcd (k,\ell ) = 1\). \(\square \)

Proof of Lemma 27

Let \(b = kd, c=\ell d\), where \(k,\ell \) are co-prime.

We show there exist \(m,n\ge 0\) such that \(mb - cn = d\). We have \(mb - cn = d \iff mkd -n\ell d = d \iff mk -n \ell = 1\). Then choose \(m = \frac{1+n\ell }{k}\). By Lemma 28, n can be chosen such that \(n\ell \equiv r \bmod k\) for any \(r\in \{0,\dots ,k-1\}\); in particular, n can be chosen such that \(1+n\ell \equiv 0 \bmod k\), so that k divides \(1+n\ell \) and m is a non-negative integer.

Hence for \(x\in \mathcal {O}\), the set \(\left( x +d\mathbb {N}\right) \) is included in the reachability set: we obtain \(x+ jd\), \(j>0\) by \(g^{nj}\circ f^{mj}(x)\), hence \(x+jd\in \mathcal {O}\) and thus \(x+d\mathbb {N}\subseteq I\). Similarly, we can find \(m',n'\ge 0\) such that \(m'b - cn' = -d\) and thus \(\left( x +d\mathbb {Z}\right) \) is also within the reachability set. \(\square \)
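The argument above is constructive: a small sketch (names ours) computes non-negative m, n with \(mb - nc = \gcd (b,c)\) for concrete opposing counters.

```python
from math import gcd

def nonneg_combination(b, c):
    """Find m, n >= 0 with m*b - n*c = gcd(b, c), for counters x+b and x-c with b, c > 0."""
    d = gcd(b, c)
    k, l = b // d, c // d                     # k and l are coprime
    # by Lemma 28 some n < k satisfies n*l = -1 (mod k), i.e. k divides 1 + n*l
    n = next(n for n in range(k) if (1 + n * l) % k == 0)
    m = (1 + n * l) // k
    assert m * b - n * c == d
    return m, n

print(nonneg_combination(6, 10))   # for f(x) = x+6, g(x) = x-10: gcd 2, returns (2, 1)
```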

Therefore, starting with \(\left( x^{(0)} + d\mathbb {Z}\right) \subseteq I\) we can ‘saturate’ the invariant under construction using the following lemma:

Lemma 29

Let \(h(x) = x + d\) be chosen as a reference counter amongst the counters. If \(\left( x + d\mathbb {Z}\right) \subseteq I\), then \(\left( f(x) + d\mathbb {Z}\right) \subseteq I\) for every function f.

Proof of Lemma 29

Consider the function \(f(x) = ax + b\). If \(x+dk \in I\) for \(k\in \mathbb {Z}\), then \(f(x+dk)=a(x+dk)+b = ax + adk + b = f(x) + adk \in I\).

Now applying the counter \(h(x) = x+d\) an arbitrary number m of times, we have \(h^{m}\circ f(x+dk) = f(x) + adk + dm \in I\) for \(k\in \mathbb {Z}\) and \(m\in \mathbb {N}\). Thus \(f(x) + dn \in I\) for any choice of \(n\in \mathbb {Z}\) by suitable choice of k (possibly negative) and m (non-negative). \(\square \)

Without loss of generality if \(\left( x +d\mathbb {Z}\right) \) is in the invariant, then \(0\le x < d\). We then repeatedly use Lemma 29 to find the required elements of the invariant. Since there are only finitely many residue classes (modulo d), every reachable residue class \(\left( c_1,\dots ,c_n\right) \) can be found by saturation (in at most d steps), yielding invariant \(\left( c_1+d\mathbb {Z}\right) \cup \dots \cup \left( c_n+d\mathbb {Z}\right) \).
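In the presence of two opposing counters the whole computation therefore reduces to a breadth-first saturation over the d residue classes; a minimal sketch on a hypothetical instance (function name ours):

```python
def residue_invariant(x0, funcs, d):
    """Residue classes modulo d reachable from x0 under affine maps given as (a, b) pairs.
    The invariant is the union of (r + d*Z) over the returned residues."""
    reached, frontier = {x0 % d}, [x0 % d]
    while frontier:
        r = frontier.pop()
        for a, b in funcs:
            r2 = (a * r + b) % d
            if r2 not in reached:
                reached.add(r2)
                frontier.append(r2)
    return reached

# hypothetical system: opposing counters x+6 and x-10 (so d = gcd(6, 10) = 2), plus 3x+1
print(residue_invariant(5, [(1, 6), (1, -10), (3, 1)], 2))   # {0, 1}: the invariant is all of Z
```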

Thanks to Lemma 26, in all remaining cases there is without loss of generality at most one pure inverter.

5.3.3 Only pure inverters

If there is exactly one pure inverter \(f(x) = -x + b\) (and no other functions of any type), then \(f(x^{(0)}) = -x^{(0)} + b\) and \(f(-x^{(0)} + b)= x^{(0)}-b + b = x^{(0)}\), thus the reachability set is \(\{x^{(0)}, -x^{(0)}+b \}\), which is itself a finite inductive invariant.

5.3.4 No counters

If we are not in the preceding case and there are no counters, then there must be growing functions and by Lemma 26, without loss of generality at most one pure inverter. We show that all growing functions increase the absolute value outside of some bounded region.

Lemma 30

For every \(M \ge 0\) and every growing function \(f(x) = ax + b\), \(\left|a\right| \ge 2\), there exists \(C^M_f \ge 0\) such that if \(\left|x\right| \ge C^M_f\) then \(\left|f(x)\right| \ge \left|x\right| + M\).

Proof

By the triangle inequality we have: \(\left|f(x)\right| = \left|ax+b\right| \ge \left|a\right|\left|x\right| - \left|b\right|\). Thus \(\left|x\right| \ge \frac{\left|b\right|+\left|M\right|}{\left|a\right| -1} \implies \left|a\right|\left|x\right| - \left|b\right| \ge \left|x\right| + \left|M\right| \implies \left|f(x)\right|\ge \left|x\right| +M\). \(\square \)

This is the only situation in which the invariant is not exactly the reachability set, and requires us to take an overapproximation.

Let \(C = \max \left\{ C^0_{f_1},\dots ,C^0_{f_k}, \left|y\right|+1\right\} \), for \(f_1,\dots ,f_k\) growing functions and y the target point. If there are no pure inverters then \(\left( -C - \mathbb {N}\right) \cup \left( C + \mathbb {N}\right) \) is inductive. However, as it may not yet contain \(x^{(0)}\), it does not yet contain the whole of \(\mathcal {O}\). From this we can build the inductive invariant \(\left( -C - \mathbb {N}\right) \cup \left( C + \mathbb {N}\right) \cup (\mathcal {O}{} \cap (-C, C))\). The set \(\mathcal {O}{} \cap (-C, C)\) is finite and can be elicited by exhaustive search, noting that once an element of the orbit reaches absolute value at least C, the remainder of the corresponding trajectory remains forever outside of \((-C,C)\).
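A sketch of the no-pure-inverter subcase just described, with the bound of Lemma 30 computed by integer ceiling division (names ours, instance hypothetical); it relies on the observation above that trajectories leaving \((-C,C)\) never return.

```python
def growing_bound(funcs, M):
    """Smallest integer C >= 0 with |f(x)| >= |x| + M whenever |x| >= C (Lemma 30)."""
    return max((abs(b) + M + abs(a) - 2) // (abs(a) - 1) for a, b in funcs)

def no_counter_invariant(x0, funcs, y):
    """Invariant (-C - N) ∪ (C + N) ∪ (orbit ∩ (-C, C)) when all functions are growing."""
    C = max(growing_bound(funcs, 0), abs(y) + 1)
    finite, frontier = set(), [x0]
    while frontier:
        x = frontier.pop()
        if abs(x) >= C or x in finite:
            continue                  # beyond C the trajectory never re-enters (-C, C)
        finite.add(x)
        frontier.extend(a * x + b for a, b in funcs)
    return C, finite

print(no_counter_invariant(1, [(2, 3), (3, -5)], 40))
```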

If there is one pure inverter \(g(x) = -x + d\) then observe that \(-C\) is mapped to \(C +d\) and \(C+d\) is mapped to \(-C\). Thus intuitively we want to use the interval \((-C, C+d)\). However, two problems may occur: (a) since d could be negative, \(C+d\) may no longer be growing (under the application of the growing functions), and (b) an inverting growing function only ensures that \(-C\) is mapped to a value greater than or equal to C, rather than \(C+d\). Hence, we choose \(C'\) to ensure that \(C'\pm d\) still grows by at least \(\left|d\right|\) (under the application of our growing functions). Let \(C' = \max \left\{ C^{\left|d\right|}_{f_1},\dots ,C^{\left|d\right|}_{f_k}, \left|y\right|+1\right\} +\left|d\right|\). Then the invariant is \(\left( -C' - \mathbb {N}\right) \cup \left( C'+d + \mathbb {N}\right) \cup (\mathcal {O}{} \cap (-C', C'+d))\).

5.3.5 Codirectional counters

The only remaining possibility (if there do not exist two opposing counters, and not all functions are growing or pure inverters) is that there are counter functions, but they are all codirectional. There may also be a single pure inverter, and any number of growing functions. Throughout this section we assume the growing functions are growing outside of the interval \([-B,C]\).

We pick a counter \(h(x) = x + d\) amongst the codirectional counters to be the reference counter; the choice is arbitrary, but it is convenient to pick a counter with minimal \(\left|d\right|\). For each residue r modulo d, we will have either a set \(\left( r + d\mathbb {Z}\right) \), a set \(\left( x_r + d\mathbb {N}\right) \) for \(x_r\equiv r \mod d\), or \(\emptyset \). We will define a saturation procedure on these sets. To start, clearly we have \(\left( x^{(0)} + d\mathbb {N}\right) \subseteq I\).

As in the case of two opposing counters, by Lemma 29, \(\mathbb {Z}\)-linear sets will induce new \(\mathbb {Z}\)-linear sets. We now observe that using inverters \(\mathbb {N}\)-linear sets may induce \(\mathbb {Z}\)-linear sets:

Lemma 31

If there is an inverter \(g(x) = -ax + b\), with \(a> 0,b\in \mathbb {Z}\), and we have \(\left( x + d\mathbb {N}\right) \subseteq I\) then \(\left( g(x) + d\mathbb {Z}\right) \subseteq I\).

Proof

Let \(r =g(x) + dm\) for \(m \in \mathbb {Z}\). We show \(r \in I\). Consider \(x + dn\) for \(n\in \mathbb {N}\); then \(g(x+dn) = -a(x+dn) + b = -ax+b-adn = g(x) -adn\). Hence \(g(x) -adn + dk\), for \(n,k\in \mathbb {N}\), is reachable by applying the reference counter \(h(x)=x+d\) k times. Finally, for any \(m\in \mathbb {Z}\) there exist \(k,n\in \mathbb {N}\) such that \(k - na = m\), so that r is indeed reachable. \(\square \)

Lemma 32

Let f be a non-inverting function and suppose \(h(x) = x+d\) is a counter. If the \(\mathbb {N}\)-linear set \(\{x_r + d\mathbb {N}\}\) is in the invariant, then the set \(\{f(x_r) + d\mathbb {N}\}\) is in the invariant.

There are finitely many \(\mathbb {Z}\)-linear sets, thus a saturation procedure applied to these sets will terminate. However, repeated application of Lemma 32 will not necessarily saturate. If the application of f to \(x_r\) ‘moves’ in the same direction as the counters then saturation will occur. However, when the function f moves in the opposite direction, we may generate infinitely many such classes. Note that all the counters are assumed to move in the same direction as the reference counter (as we do not have opposing counters). However, the direction of a growing function depends on the sign of the input.

Example 33

Consider the reference counter \(h(x) = x +4\), with initial point 5. This yields an initial set \(\left( 5 + 4 \mathbb {N}\right) \subseteq \mathcal {O}\), where 5 is the initial point and \(4\mathbb {N}{}\) is derived from the counter increment. Now when applying \(x \mapsto 2x +6\) to \(\left( 5 + 4 \mathbb {N}\right) \) we obtain \(\left( 10+ 6 + 8 \mathbb {N}+ 4\mathbb {N}\right) =\left( 16+4\mathbb {N}\right) \), then \(\left( 38+4\mathbb {N}\right) \), and then \(\left( 82 + 4\mathbb {N}\right) \). However \(\left( 82+4\mathbb {N}\right) \subseteq \left( 38+4\mathbb {N}\right) \) and we can therefore stop with the invariant \(\left( 5+4\mathbb {N}\right) \cup \left( 16+4\mathbb {N}\right) \cup \left( 38+4\mathbb {N}\right) \).

However, if the initial sequence is not moving in the direction of the reference counter, this saturation does not occur. Consider \(\left( 5 + 4 \mathbb {N}\right) \) with the function \(x\mapsto 2x - 6\). Then \(\left( 5 + 4 \mathbb {N}\right) \) maps to \(\left( 10-6 + 8\mathbb {N}+ 4 \mathbb {N}\right) = \left( 4+4\mathbb {N}\right) \), which maps to \(\left( 2 + 4 \mathbb {N}\right) \), \(\left( -2 + 4 \mathbb {N}\right) \), \(\left( -10 + 4 \mathbb {N}\right) \), \(\left( -26 + 4 \mathbb {N}\right) \), and so on. However \(-2\) and \(-10\) are both 2 modulo 4 (and so is \(-26\) as well). This means in the negative direction we can obtain arbitrarily large negative values congruent to 2 modulo 4 and then use the reference counter \(h(x) = x + 4\) to obtain any value of \(\left( 2 + 4\mathbb {Z}\right) \).
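Example 33 can be replayed mechanically. The helper below (a sketch, names ours) returns the base point of the image of an \(\mathbb {N}\)-linear set under a non-inverting map, given the reference counter \(x\mapsto x+d\):

```python
def image_base(a, b, x):
    """For a > 0 and reference counter t -> t + d, the image of (x + d*N) under
    t -> a*t + b, closed under the counter, is the N-linear set (a*x + b + d*N)."""
    return a * x + b

forward = [5]                        # the set (5 + 4N), reference counter x -> x + 4
for _ in range(4):                   # x -> 2x + 6 moves with the counter: saturation
    forward.append(image_base(2, 6, forward[-1]))
print(forward)                       # 5, 16, 38, 82, 170: from 82 on nothing new modulo 4

backward = [5]
for _ in range(5):                   # x -> 2x - 6 moves against the counter: divergence
    backward.append(image_base(2, -6, backward[-1]))
print(backward)                      # 5, 4, 2, -2, -10, -26: residue 2 mod 4 recurs below -B
```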

Finally, we will use the following lemma to induce a \(\mathbb {Z}\)-linear set when an infinite sequence of \(\mathbb {N}\)-linear sets occur. Since inverting induces \(\mathbb {Z}\)-linear sets, in the following lemma we can assume all functions are non-inverting.

Lemma 34

Assume the reference counter has the form \(h(x) = x +d\). Suppose all growing functions are growing outside of \([-B,C]\).

If \(d\ge 0\) and there exist \(x_r < -B\) and a sequence of functions \(h_1, h_2,\dots , h_m\in \{f_1,\dots ,f_k\}\) such that

$$\begin{aligned} h_j\circ \dots \circ h_1(x_r) < x_r \le -B \text { for all } j\le m \ \text { and } \ h_m\circ \dots \circ h_1(x_r)\equiv x_r \bmod d, \end{aligned}$$

then for all \(M \le x_r\), there exist \(h'_1, h'_2,\dots , h'_{m'}\) such that

$$\begin{aligned} x_M = h'_{m'}\circ \dots \circ h'_{1}(x_r) \le M\quad \text { and }\quad x_M \equiv x_r \bmod d. \end{aligned}$$
(2)

Furthermore, if \(x_r \in I\), then \(\left( x_r + d\mathbb {Z}\right) \subseteq I\).

Symmetrically, if \(d< 0\) and there exist \(x_r >C\) and \(h_1, h_2,\dots , h_m\in \{f_1,\dots ,f_k\}\) such that

$$\begin{aligned} h_j\circ \dots \circ h_1(x_r) > x_r \ge C \text { for all } j\le m \ \text { and } \ h_m\circ \dots \circ h_1(x_r)\equiv x_r \bmod d, \end{aligned}$$

then for all \(M \ge x_r\), there exist \(h'_1, h'_2,\ldots , h'_{m'}\) such that

$$\begin{aligned} x_M = h'_{m'}\circ \dots \circ h'_{1}(x_r) \ge M\quad \text { and }\quad x_M \equiv x_r \mod d. \end{aligned}$$

Furthermore, if \(x_r \in I\), then \(\left( x_r + d\mathbb {Z}\right) \subseteq I\).

Proof

We show that \((h_m\circ \dots \circ h_1)^n\) satisfies Eq. (2) for some n. Firstly, observe that the re-application of \(h_{m}\circ \dots \circ h_{1}\) results in the same residue class by modulo arithmetic. Now to show that \(x_M \le M\), consider \(\Delta _j(x_r) = \left|h_j\circ \dots \circ h_1(x_r) - h_{j-1}\circ \dots \circ h_1(x_r)\right|\).

  • If \(h_j\) is a counter, \(\Delta _j\) is constant, regardless of \(x_r\).

  • If \(h_j\) is a growing function outside of \([-B,C]\), then \(\Delta _j(x'_r) \ge \Delta _j(x_r)\) if \(x_r'< x_r <-B\).

Thus, by induction, since \(h_j\circ \dots \circ h_1(x_r) < x_r\), we have

$$\begin{aligned} h_j\circ \dots \circ h_1\circ ( h_m\circ \dots \circ h_1)^n(x_r)< h_j\circ \dots \circ h_1\circ ( h_m\circ \dots \circ h_1)^{n-1}(x_r). \end{aligned}$$

Since \(x_r\) induces \(x'_r \le M\) for any M, repeated application of h induces \(\left( x_r' + d\mathbb {N}\right) \), for arbitrarily small \(x_r' \equiv x_r \bmod d\). Hence if \(x_r\in I\) then \(\left( x_r + d\mathbb {Z}\right) \subseteq I\).

The second part, when \(d <0\), holds by symmetry: inequalities are reversed and C is used in place of \(-B\). \(\square \)

We now show how to detect whether such sequences exist:

Lemma 35

Let \(f_1,\dots ,f_\kappa \) be non-inverting growing functions and \(g_1,\dots ,g_{\kappa '}\) be codirectional counters with \(\kappa + \kappa ' = k\), and let \(h(x) =x+d\) be the reference counter amongst the \(g_i\). Given \(x_r\not \in [-B,C]\) it can be decided in time \(O(d(d+k))\) whether there exists a sequence of functions \(h_1, h_2,\dots , h_m\) such that \( x_r'\equiv x_r \mod d\), where \(x_r' = h_m\circ \dots \circ h_1(x_r)\), and

  • \( h_j\circ \dots \circ h_1(x_r) < x_r \le -B \) for all \(j\in \{1,\dots ,m\}\) if \(d >0\), or

  • \( h_j\circ \dots \circ h_1(x_r)>x_r \ge C\) for all \(j\in \{1,\dots ,m\}\) if \(d< 0\).

Proof

First, we restrict the form of the sequence we must search for. Suppose there exists a sequence in which there are indices \(i<j\) such that \(h_{i}\) is a counter and \(h_j\) is growing; we first show that there is then another sequence satisfying the property in which this does not occur. That is, there is a sequence \(h_1,\dots ,h_{m}\) where \(h_1,\dots ,h_\ell \in \{f_1,\dots ,f_{\kappa }\}\) and \(h_{\ell +1},\dots ,h_{m}\in \{g_1,\dots ,g_{\kappa '}\}\) for some \(\ell \).

To see this, consider a growing function \(f(x) = ax + b\) applied on top of a counter \(g(x) = x + c\); we have \(f(g(x)) = a(x+c) + b = ax + ac +b > g^{(a \mod d)}(f(x)) = ax + (a\bmod d)c + b\), as \((a\bmod d) \le d\). Observe that \(f(g(x)) \equiv g^{(a\mod d)}(f(x)) \bmod d\).

As a consequence, the counters need only be applied at the end, each at most d times, as this is sufficient to access all attainable residue classes.

We now consider the graph on nodes \(\{0,\dots ,d-1,0',\dots ,(d-1)'\}\), such that:

  • \(i\rightarrow j\) if \(f(i) \equiv j\mod d\) for some non-inverting growing function f.

  • \(i\rightarrow j'\) if \(i + a_1 b_1 + \dots + a_{\kappa '} b_{\kappa '} \equiv j \mod d\), for some \(a_i\in \{0,\dots ,d-1\}\), where the counting functions are \(g_i(x) = x+ b_i\) for \(1\le i \le \kappa '\).

  • \(i\rightarrow i'\) for all \(i\in \{0,\dots ,d-1\}\).

In this graph we ask if there exists an infinite family of sequences from i to \(i'\), such that \(\left( x+d\mathbb {N}\right) \subseteq I\) with \(x\le -B\) and \(i\equiv x \mod d\); that is, a path from i to \(i'\) along which a cycle is accessible. Note that there are only cycles over nodes in \(\{0,\dots ,d-1\}\), not over the primed variants. Let \(i\xrightarrow {*}j\) denote that there exists a path from i to j. This can be decided in polynomial time, using, for example, depth-first search; we ask for every j whether \(i\xrightarrow {*}j\), \(j\xrightarrow {*}j\) and \(j\xrightarrow {*}i'\).

The graph is of size \(O(d^2)\) and can be built in \(O(d(d+k))\). Indeed, the most costly operation in constructing this graph is the second test. Moreover, given a state \(i_1\), one can compute an array of size d representing the set of \(j'\) such that \(i_1\rightarrow j'\) following this second test in O(dk). To build this set for another state \(i_2\), one only needs to shift the values by \(i_2-i_1\), which can be done in O(d). We thus need O(dk) to build the first array and \(O(d^2)\) to build all the others.

As the graph is of size \(O(d^2)\), precomputing each j such that \(i\xrightarrow {*}j\) for each i simultaneously takes linear time in the size of the graph \(O(d^2)\). The same is true for precomputing j such that \(j\xrightarrow {*} i'\) for \(i \equiv x_r \mod d\). After precomputation, we can answer for every j in constant time whether \(i\xrightarrow {*}j\), \(j\xrightarrow {*}j\), \(j\xrightarrow {*}i'\) and whether there exists \((x_r+d\mathbb {N})\subseteq I\) with \(x_r \equiv i \mod d\) and \(x_r \not \in [-B,C]\). The total time spent is dominated by the graph construction, thus giving an algorithm in \(O(d(d+k))\). \(\square \)
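A direct, unoptimised rendering of this graph check is sketched below; it tests only the residue condition used in the proof (not the O(d(d+k)) bookkeeping), and all names are ours.

```python
def cycle_path_exists(start, growing, counters, d):
    """Is there a path start ->* j ->* j (a cycle) ->* start' in the residue graph of
    Lemma 35? Unprimed edges come from the non-inverting growing maps (a, b); the final
    primed step adds an arbitrary combination of the counter increments modulo d."""
    succ = {i: {(a * i + b) % d for a, b in growing} for i in range(d)}

    def reachable_from(i):                       # closure under growing edges
        reach, frontier = {i}, [i]
        while frontier:
            r = frontier.pop()
            for r2 in succ[r] - reach:
                reach.add(r2)
                frontier.append(r2)
        return reach

    def counter_closure(i):                      # residues reachable by counters alone
        reach, frontier = {i}, [i]
        while frontier:
            r = frontier.pop()
            for b in counters:
                r2 = (r + b) % d
                if r2 not in reach:
                    reach.add(r2)
                    frontier.append(r2)
        return reach

    def on_cycle(j):
        return any(j in reachable_from(s) for s in succ[j])

    return any(
        on_cycle(j) and any(start in counter_closure(t) for t in reachable_from(j))
        for j in reachable_from(start)
    )

# Example 33 revisited: x_r = -2 (residue 2 mod 4), growing map 2x - 6, counter +4
print(cycle_path_exists(2, [(2, -6)], [4], 4))   # True: (-2 + 4Z) is induced
```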

We now summarise the procedure in the case that all counters have the same direction, and that \(h(x) = x +d\) is a chosen reference counter.

The procedure continues by applying Lemma 29, Lemma 31, and Lemma 32 using the available functions. We continue until either:

  1. no set is updated, or

  2. the only updates induced are \(\mathbb {N}\)-linear sets of the form \(\left( x + d\mathbb {N}\right) \) with \(x\le -B\) (or \(x > C\) if \(d < 0\)).

In the first case, the invariant is inductive and nothing further is required. In the second case, we must decide if we have a sequence of the type described in Lemma 34, using Lemma 35 for each most general \(x_r\not \in [-B,C]\) such that \((x_r+d\mathbb {N}) \subseteq I\).

Whenever such a sequence exists, a new \(\mathbb {Z}\)-linear set is induced; this can happen at most d times. Further applications of Lemma 29 must then occur on the new \(\mathbb {Z}\)-linear sets until saturation amongst the \(\mathbb {Z}\)-linear sets occurs.

Once no such sequence exists (possibly immediately), then we continue inducing new \(\mathbb {N}\)-linear sets using Lemma 32. This is now guaranteed to terminate, as otherwise there would exist a sequence of the type described in Lemma 34.

5.3.6 Reachability

The above procedure is sufficient to decide reachability. In all cases apart from those in which there are no counters, the invariants produced coincide precisely with the reachability sets. A reachability query therefore reduces to asking whether the target belongs to the invariant.

In the remaining cases, the invariant obtained is parametrised by the target via the bound \(C'\). The target lies within the region \((-C',C' + d)\), within which we can compute all reachable points. Thus once again, the target is reachable precisely if it belongs to the invariant. However, for a new target of larger absolute value, a different invariant would need to be built.

5.3.7 Complexity

Finally we show that the invariant of Theorem 22 can be computed in pseudo-polynomial time. More precisely, we prove the following lemma:

Lemma 36

Let k be the number of functions, and let \(\mu \) bound the largest absolute value occurring in the input. Then the invariant can be computed in time \(O(\mu ^3\cdot k^2)\), that is polynomial in \(\mu \) and k.

Proof

Recall that the input comprises the starting point x, target point y and functions \(f_i(x) = a_ix+ b_i\) for \(i \in \{1,\dots ,k\}\). We have \(\left|x\right| \le \mu \), \(\left|y\right| \le \mu \), \(\left|a_i\right| \le \mu \) and \(\left|b_i\right| \le \mu \) for all \(i\in \{1,\dots ,k\}\).

In the no-counter case, by Lemma 30, we compute the interval \([-C,C+d]\), where \(C \ge |y| +1\) and \(C\ge \frac{\left|b\right|+\left|M\right|}{\left|a\right| -1}\), for \(|M| \le |b_i|\) for some \(i\in \{1,\dots ,k\}\). We have \(C\le 2\mu \) and \(d\le \mu \), therefore the size of the interval \({[-C,C+d]}\) is at most \(5\mu \). It remains to compute the reachability set in \([-C,C+d]\), which is found by breadth-first search over \([-C,C+d]\) with k outgoing edges for each element, thus taking time \(O(\mu \cdot k)\).

In the case of two opposing counters, we have that all components of the invariant are of the form \(x+d\mathbb {Z}\) for \(d\le 2\mu \). Thus there are at most \(2\mu \) rounds, each round taking time at most \(O(\mu \cdot k)\). The procedure runs in time at most \(O(\mu ^2 \cdot k)\).

Finally, we consider the case of codirectional counters. There are three main phases:

  • Firstly we saturate using Lemma 29, Lemma 31, and Lemma 32; here the linear sets take the form \(x+d\mathbb {Z}\) or \(x+d\mathbb {N}\), where \(d\le \mu \) and \(x\in [-B,C]\) for \(B\le 2\mu ,C\le 3\mu \). Observe that there are at most \(5\mu \) sets of the form \((x+d\mathbb {N})\) and \(\mu \) sets of the form \((x+d\mathbb {Z})\). Thus there are at most \(6\mu \) sets that can be considered in this process. Hence, using breadth-first search, this phase takes time \(O(\mu \cdot k)\).

  • Secondly, checking for a sequence of the form in Lemma 34 requires at most \(\mu \) applications of Lemma 35, each taking \(O(\mu (\mu +k))\) time. The newly found \(\mathbb {Z}\)-linear sets are saturated using Lemma 29, taking time at most \(O(\mu \cdot k)\).

  • Thirdly, the final saturation of \(\mathbb {N}\)-linear sets can be done in time \(O(\mu ^2\cdot k^2)\). Specifically, we proceed in rounds: in each round we consider each set of the form \((x+d\mathbb {N})\), and add the sets \((f(x)+d\mathbb {N})\) whenever this is more general than a set already in I. In each round, up to \(d\cdot k\) new \(\mathbb {N}\)-linear sets are considered; however, at the end of the round, there are only d most general sets to expand into the next round. In Lemma 34 we note that the length of any cycle-free path outside of \([-B,C]\) is bounded by at most \(d(k+1)\), thus at most \(d(k+1)\) rounds of exploration are required.

Summing the time spent in the three phases, we require time \(O(\mu ^2(\mu +k)+\mu \cdot k + \mu ^2\cdot k^2)\), which is bounded by \(O(\mu ^3\cdot k^2)\). \(\square \)

Lemma 36 essentially asserts that the procedure is in polynomial time assuming that descriptions of the starting point, target point and the functions are given in unary. Without the unary assumption, the invariant could have exponential size, and hence require at least exponential time to compute. That is because the invariant we construct could include every value in an interval \([-C,C+d]\), where C is of size polynomial in the largest absolute value.

As shown in [15], the reachability problem is \({\textbf {NP}}\)-hard when the input is encoded in binary, because one can encode the integer Knapsack problem (which allows an object to be picked multiple times rather than at most once). Moreover, the Knapsack problem is solvable in pseudo-polynomial time via dynamic programming, that is, in polynomial time assuming the input is in unary, matching the complexity of our procedure.

6 Porous targets

So far we have only considered invariants for point targets. We now study the reachability question for porous (or ‘lattice-like’) targets. First, we consider targets that are full dimensional, that is, targets that span the whole space. Here we show decidability of the reachability problem and synthesise suitable invariants.

Lower-dimensional targets are problematic. For nondeterministic systems reachability is undecidable for non-full-dimensional targets (in particular point targets) [7]. However, even for deterministic systems, when \(\mathbb {Z}\)-linear targets are not full-dimensional the reachability problem becomes as hard as the Skolem problem (see, e.g. [30]). Denote by \(e_i\) the i-th standard basis vector where \(e_i \in \left\{ 0,1\right\} ^d\) with \((e_i)_i = 1\) and \((e_i)_j= 0\) for \(j\ne i\). Then the Skolem problem corresponds to having \(\left\{ (0,x_2,\dots ,x_d) \mid x_2,\dots ,x_d \in \mathbb {Z}\right\} = \left( \vec {0} +e_2\mathbb {Z}+\dots + e_d\mathbb {Z}\right) \) as the target set. Similarly full-dimensional \(\mathbb {N}\)-linear targets encode the Positivity problem, that is, reaching \(\left( -e_1\mathbb {N}+e_2\mathbb {Z}+\dots + e_d\mathbb {Z}\right) \).

However, for low-dimensional hyperplanes the Skolem problem is decidable, lifting this barrier. Thus, in cases where the Skolem problem is decidable, we show decidability of hitting an \(\mathbb {N}\)-semi-linear set in Sect. 6.2.

6.1 \(\mathbb {Z}\)-linear targets

First, let us consider targets specified as full-dimensional \(\mathbb {Z}\)-linear sets.

Theorem 37

It is decidable whether a given LDS \((x^{(0)},\left\{ M_1,\dots ,M_k\right\} )\) reaches a full-dimensional \(\mathbb {Z}\)-linear target \(Y = \left( x + p_1\mathbb {Z}+\dots + p_d\mathbb {Z}\right) \), with \(x,p_i\in \mathbb {Z}^d\). Furthermore, for unreachable instances, a \(\mathbb {Z}\)-semi-linear inductive invariant can be provided.

Towards proving Theorem 37, we first show that full-dimensional linear sets can be expressed as ‘square’ hybrid-linear sets. Hybrid-linear sets are semi-linear sets in which all the components share the same period vectors, and thus differ only in starting position (whereas semi-linear sets allow each component to have distinct period vectors). Given a set of base vectors B and a lattice \(L = p_1\mathbb {Z}+ \dots +p_d\mathbb {Z}\), we write \(B+L\) to denote the semi-linear set \(\bigcup _{b\in B} \left( b + p_1\mathbb {Z}+\dots +p_d\mathbb {Z}\right) \). By square, we mean that all period vectors are the same multiple of standard basis vectors (recall that these are denoted \(e_1,\dots ,e_d\)).

Lemma 38

Let \(Y = \left( x + p_1\mathbb {Z}+ \dots + p_d\mathbb {Z}\right) \) be a full-dimensional \(\mathbb {Z}\)-linear set. Then there exist \(m\in \mathbb {N}{}\) and a finite set \(B\subseteq [0,m-1]^d\) such that \(Y = B + \left( me_1\mathbb {Z}+\dots +me_d\mathbb {Z}\right) \).

Proof

Let \(p_1,\dots ,p_d\) span a d-dimensional vector space and write \(P = \left( {\begin{matrix} p_1\\ \vdots \\ p_d\end{matrix}}\right) \) for the matrix with rows \(p_1,\dots ,p_d\). Since P has full row rank it is invertible, hence there exists a rational matrix \(P^{-1}\) such that \(e_i = p_1P^{-1}_{i,1}+\dots +p_dP^{-1}_{i,d}\). In particular let \(m_i\) be such that \(P^{-1}_{i,j}m_i\) is integral for all j. Then there is an integral combination of \(p_1,\dots ,p_d\) such that \(m_ie_i\) is an admissible direction in Y.

Let \(m = {\text {lcm}}\left\{ m_1,\dots ,m_d\right\} \). Then \(me_i\) is an admissible direction in Y. Hence by Proposition 11, Y is equivalent to \(\left( x + p_1\mathbb {Z}+ \dots + p_d\mathbb {Z}+ me_1\mathbb {Z}+\dots +me_d\mathbb {Z}\right) \). By the presence of \(me_1\mathbb {Z}+\dots +me_d\mathbb {Z}\) we have that \(x\in Y\) if and only if \(x' \in Y\) where \(x'_i = (x_i \bmod m)\).

We conclude that Y can be rewritten as \(B + \left( me_1\mathbb {Z}+\dots +me_d\mathbb {Z}\right) \), where \(B = [0,m-1]^d \cap Y\). \(\square \)
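One simple, if not necessarily minimal, choice for m is \(|\det P|\), since \(\det (P)\,P^{-1}\) is an integer matrix; the sketch below (names ours) uses this choice instead of the lcm of denominators from the proof and is intended for small d only.

```python
from itertools import permutations

def det(M):
    """Determinant via the Leibniz formula (adequate for small dimension d)."""
    n = len(M)
    total = 0
    for perm in permutations(range(n)):
        sign = (-1) ** sum(perm[i] > perm[j] for i in range(n) for j in range(i + 1, n))
        prod = 1
        for i in range(n):
            prod *= M[i][perm[i]]
        total += sign * prod
    return total

def square_period(periods):
    """An m such that m*e_i is an admissible direction of the lattice spanned by the
    rows: m = |det P| works because det(P) * P^{-1} is integral (Cramer's rule)."""
    m = abs(det(periods))
    assert m != 0, "period vectors must be full-dimensional"
    return m

print(square_period([[2, 1], [0, 3]]))   # 6: both 6*e_1 and 6*e_2 are integer combinations of (2,1), (0,3)
```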

We now prove Theorem 37.

Proof of Theorem 37

Choose m and B as in Lemma 38, so that Y is of the form \(\bigcup _{b\in B} \left( b + me_1\mathbb {Z}+\dots +me_d\mathbb {Z}\right) \). We build an invariant I of the form \(\bigcup _{b\in B'} \left( b + me_1\mathbb {Z}+\dots +me_d\mathbb {Z}\right) \) for some \(B'\subseteq [0,m-1]^d\).

We initialise the set \(I_0=\left( x + me_1\mathbb {Z}+\dots +me_d\mathbb {Z}\right) \), where \(x\in [0,m-1]^d\) is such that \(x_j = (x^{(0)}_j \bmod m)\). We then build the set \(I_1\) by adding to \(I_0\) the sets \(\left( y + me_1\mathbb {Z}+\dots +me_d\mathbb {Z}\right) \) where for each choice of \(M_i\), \(y\in [0,m-1]^d\) is formed by \(y_j =( (M_ix)_j \bmod m)\) for some \(x\in I_0\). We iterate this construction until it stabilises in an inductive invariant I. Termination follows from the finiteness of \([0,m-1]^d\) (noting in particular that if termination occurs with \(B' = [0,m-1]^d\), then \(I = \mathbb {Z}^d\) which is indeed an inductive invariant).

If there exists \(y\in B\cap I\) then we return Reachable. This is because the same sequence of matrices applied to \(x^{(0)}\) to produce \(y\in I\) would, thanks to the modulo step, end up inside the set \(\left( y + me_1\mathbb {Z}+\dots +me_d\mathbb {Z}\right) \), which is a part of the target.

Otherwise, we return Unreachable and I as invariant. By construction, I is indeed an inductive invariant disjoint from the target set. \(\square \)
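The saturation in this proof is a breadth-first search over residue vectors modulo m; a compact sketch (function name ours, instance hypothetical) follows. The target is reachable precisely if some residue in B appears in the returned set.

```python
def modular_reach(x0, matrices, m):
    """Residue vectors modulo m reachable from x0 under the update matrices; the
    invariant of Theorem 37 is the union of (b + m*e_1*Z + ... + m*e_d*Z) over the
    returned residues b."""
    dim = len(x0)
    start = tuple(v % m for v in x0)
    reached, frontier = {start}, [start]
    while frontier:
        x = frontier.pop()
        for M in matrices:
            y = tuple(sum(M[i][j] * x[j] for j in range(dim)) % m for i in range(dim))
            if y not in reached:
                reached.add(y)
                frontier.append(y)
    return reached

print(modular_reach((1, 0), [[[2, 1], [0, 1]], [[1, 0], [1, 1]]], 3))
```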

Remark 39

By the same argument, Theorem 37 extends to a restricted class of \(\mathbb {Z}\)-semi-linear targets: the finite union of full-dimensional \(\mathbb {Z}\)-linear sets.

6.2 Deterministic LDS and low dimension \(\mathbb {N}\)-semi-linear targets

While reachability of a point is well known to be decidable, extending this result to higher dimensional targets is difficult. In particular, reaching a hyperplane is equivalent to the Skolem problem, a longstanding open question. Some results have however been achieved for low-dimensional systems (see e.g. [31,32,33]).

In this subsection, we rely on those results to establish decidability of the reachability problem for low-dimensional \(\mathbb {N}\)-semi-linear targets.

Theorem 40

Given a deterministic LDS together with an \(\mathbb {N}{}\)-semi-linear target, the reachability problem is decidable if either the target has dimension at most 2 or both the target and ambient space have dimension 3.

Proof

This result is achieved through a succession of refinements of the target we consider: (1) we first identify the subspace in which the target lies and detect when this subspace is hit by the LDS, (2) then, when restricted to the times where the subspace of the target is hit, we detect when the modulo constraints of the target are hit as well, (3) finally, we only have to detect when the ‘direction’ provided by the period vectors is hit as well.

Given an LDS \((x^{(0)},M)\) and an \(\mathbb {N}{}\)-semi-linear target Y which is either of dimension 2 or of dimension 3 if the ambient dimension is 3, note first that Y can be decomposed into several \(\mathbb {N}{}\)-linear targets and reachability of Y is directly deduced from the reachability of each new target. As such, we assume the target \(Y=\left( y + \sum _{i} p_i\mathbb {N}{}\right) \) is \(\mathbb {N}{}\)-linear in the following.

We denote by \(R_Y = \left( y + \sum _{i} p_i\mathbb {R}\right) \) the \(\mathbb {R}\)-linear extension of Y. The subspace \(R_Y\) is either of dimension 2 or of dimension 3 if the ambient dimension is 3 as well by definition of Y. By the Skolem-Mahler-Lech theorem [34], the set \(S_Y=\{n\in \mathbb {N}\mid M^nx^{(0)}\in R_Y\}\) has the form \(S_Y= F\cup A\) for a finite set F and semi-linear set \(A=\bigcup _{i} \left( a_i + b\mathbb {N}{}\right) \) where \(a_i \in \{0,\ldots ,b-1\}\) for all i. Moreover, thanks to \(R_Y\) being of low dimension the sets F and A can be computed [31, 32].

We now focus on the times where \(R_Y\) is hit by the LDS. Letting \(N_{\max }\) be the greatest element of F, one can preprocess the first \(N_{\max }\) steps of the system before considering the LDS \((M^{N_{\max } +1}x^{(0)},M)\). As such, we can assume without loss of generality that F is empty.

Similarly, by considering the family of LDS \((M^ix^{(0)},M^b)\) for \(i<b\), we can assume that A is either empty, or it is \(\mathbb {N}{}\). In the first case, Y cannot be reached by the LDS.

In the second case, we refine the target by considering the \(\mathbb {Z}\)-linear extension of Y, \(Z_Y=\left( y + \sum _{i} p_i\mathbb {Z}\right) \). As the orbit of the LDS is included in \(R_Y\), \(Z_Y\) is full-dimensional. Thus, reachability of \(Z_Y\) (and invariant synthesis in the negative case) can be obtained with Theorem 37. Since Theorem 37 shows the behaviour is eventually periodic, one can find a period \(c\in \mathbb {N}\) such that, potentially after an initial shift d, each LDS in the family \((M^{i+d}x^{(0)},M^c)\) for \(i \in \{0,\dots ,c-1\}\) either never hits \(Z_Y\) (and thus never hits Y), or hits \(Z_Y\) at every step.

Let us assume we are in the latter case. Then reachability of Y is equivalent to reachability of the \(\mathbb {R}_+\)-linear extension of the target \(L_Y=\left( y + \sum _{i} p_i\mathbb {R}_+\right) \) as \(Y = L_Y\cap Z_Y\). Moreover, reachability of \(L_Y\) can be tested through the results of [33] thanks to the low dimension of the target, which concludes the proof. \(\square \)

Remark 41

Theorem 40 is focused on reachability. It is possible to synthesise an invariant for negative instances, but in some cases the kind of certificates that can be generated go beyond the scope of this paper. In particular, the authors of [32] provide a form of certificate, but it is not a porous invariant, and can be expensive to verify.

Remark 42

Progress in extending decidability of the Skolem problem to broader classes would immediately extend the scope of Theorem 40 to the same classes. For example, [35] recently showed that the Skolem problem is conditionally decidable for simple linear recurrence sequences, corresponding to linear dynamical systems whose matrix is diagonalisable. Thus reachability of \(\mathbb {Z}\)-semi-linear targets on such systems is decidable subject to number-theoretic conjectures discussed in [35].

7 The POROUS tool

Our invariant-synthesis tool porous computes \(\mathbb {N}{}\)-semi-linear invariants for point and \(\mathbb {Z}\)-linear targets on systems defined by one-dimensional affine functions. porous includes implementations of the procedures of Theorem 37 restricted to one-dimensional affine systems and Theorem 22. The tool is built in Python and can be used via command-line file input, a web interface, or by directly invoking the Python packages.

porous takes as input an instance (a starting point, a target, and a collection of functions) and returns the generated invariant. Additionally it provides a proof that this set is indeed an inductive invariant: the invariant is a union of \(\mathbb {N}{}\)-linear sets, so for each linear set and each function, porous illustrates the application of that function to the linear set and shows of which other linear set in the invariant the image is a subset. Using this invariant, porous can decide reachability; if the specific target is reachable the invariant is not in itself a proof of reachability (since the invariant will often be an overapproximation of the global reachability set).

Rather, equipped with the guarantee of reachability, porous searches for a direct proof of reachability: a sequence of functions from start to target (a process which would not otherwise be guaranteed to terminate).

Example 43

The tool’s output, when applied to the MU Puzzle, is the invariant \(\left( 1+3\mathbb {Z}\right) \cup \left( 2+3\mathbb {Z}\right) \) of Example 1.


7.1 Experimentation

porous was tested on all \(2^7-1\) possible combinations of the following function types, with \(a\ge 2, b\ge 1\): positive counters (\(x\mapsto x+b\)), negative counters (\(x\mapsto x-b\)), growing (\(x\mapsto ax\pm b\)), inverting and growing (\(x\mapsto -ax\pm b\)), inverters with positive counters (\(x\mapsto -x+b\)), inverters with negative counters (\(x\mapsto -x-b\)) and the pure inverter (\(x\mapsto -x\)). For each such combination a random instance was generated, with a size parameter to control the maximum absolute value of a and b, ranging between 8 and 2048. The starting point was between 1 and the size parameter and the target was between 1 and 4 times the size parameter. Twelve instances were tested for each size parameter and each of the \(2^7-1\) combinations, with between 1 and 9 functions of each type (with a bias for one of each function type). Both the code and the datasets generated and analysed during the current study are available in the Zenodo repository [36].

Table 2 Results varying by size parameter (last row includes all instances tested)

Our analysis, summarised in Table 2, illustrates the effect of the size parameter. The time to produce the proof of invariant is separated from the process of building the invariant I, since producing the proof of invariant can become slower as \(\left|I\right|\) becomes larger; it requires finding \(L_k\in I\) such that \(f_i(L_j) \subseteq L_k\) for every linear set \(L_j\in I\) and every affine function \(f_i\). In every case porous successfully built the invariant, and hence decided reachability very quickly (on average well below 1 s) and also produced the proof of invariance in around half a second on average. To demonstrate correctness in instances for which the target is reachable porous also attempts to produce a proof of reachability (a sequence of functions from start to target). Since our paper is focused on invariants as certificates of non-reachability, our proof-of-reachability procedure was implemented crudely as a simple breadth-first search without any heuristics, and hence a timeout of 60 s was used for this part of the experiment only.

Our experimental methodology was partially limited due to the high prevalence of reachable instances. A random instance will likely exhibit a large (often universal) reachability set. When two random counters are included, the chance that \(\gcd (b_1,b_2) = 1\) (whence the whole space is covered) is around \(60.8\%\) and higher if more counters are chosen.

Overall around 86% of instances were reachable (of which 81% produced a proof within 60 s). Of the 14% of unreachable instances, all produced a proof, with the invariant taking around 0.4 s to build and 0.5 s to produce the proof. The 60-second timeout when demonstrating reachability directly is several orders of magnitude longer than answering the reachability query via our invariant-building method.

The timing and analysis was conducted using a Dell PowerEdge M620 with 2x Intel Xeon E5-2667 v2 CPUs and 256GB RAM.

8 Conclusions and open directions

We have introduced the notion of porous invariants, which are not necessarily convex and can in fact exhibit infinitely many ‘holes’, and studied these in the context of multipath (or branching/nondeterministic) affine loops over the integers, or equivalently nondeterministic integer linear dynamical systems. We have focused on reachability questions. The potential applicability of porous invariants to larger classes of systems (such as programs involving nested loops) or more complex specifications remains largely unexplored.

Our focus is on the boundary between decidability and undecidability, leaving precise complexity questions open. Indeed, the complexity of synthesising invariants could conceivably be quite high, except where we have highlighted polynomial-time (or pseudo-polynomial-time) results. On the other hand, the invariants produced should be easy to understand and manipulate, from both a human and machine perspective.

On a more technical level, in our setting the most general class of invariants that we consider are \(\mathbb {N}{}\)-semi-linear. There remains at present a large gap between decidability for one-dimensional affine functions, and undecidability for linear updates in dimension 91 and above. It would be interesting to investigate whether decidability can be extended further, for example to dimensions 2 and 3.