Learning Union of Integer Hypercubes with Queries (Technical Report)

We study the problem of learning a finite union of integer (axis-aligned) hypercubes over the d-dimensional integer lattice, i.e., hypercubes whose edges are parallel to the coordinate axes. This is a natural generalization of the classic problem of learning rectangles in computational learning theory. We provide a learning algorithm with access to a minimally adequate teacher (i.e., membership and equivalence oracles) that solves this problem in polynomial time, for any fixed dimension d. Over a non-fixed dimension, the problem subsumes the problem of learning DNF boolean formulas, a central open problem in the field. We also provide extensions to handle infinite hypercubes in the union, and show how subset queries can improve the performance of the learning algorithm in practice. Our problem has a natural application to the problem of monadic decomposition of quantifier-free integer linear arithmetic formulas, which has been actively studied in recent years. In particular, a finite union of integer hypercubes corresponds to a finite disjunction of monadic predicates over integer linear arithmetic (without modulo constraints). Our experiments suggest that our learning algorithms substantially outperform the existing algorithms.


Introduction
Suppose that we are interested in finding a formula $\varphi(x)$ over some theory $T$ (e.g., integer linear arithmetic) to "capture" a certain phenomenon, which in verification could be, for instance, an invariant showing that a program satisfies some safety property. The process of discovering $\varphi$ can be captured by the notion of a learning algorithm by allowing certain types of queries as an interface to some teacher [3]. Most standard learning frameworks can be captured in this way. Here are some examples. Valiant's well-known notion of PAC-learning can be captured by an oracle that returns a new random sample from an unknown distribution. Angluin's well-known notion of exact learning [2,3] can be captured by an interaction with a so-called minimally adequate teacher, which can answer membership and equivalence queries. This has many applications in verification, e.g., verification of parameterized systems [10,20,23] and compositional verification [9]. Another learning framework that has become very popular in verification is CEGIS (Counterexample Guided Inductive Synthesis) [21,27], wherein a learning algorithm can ask equivalence queries, but expects various types of "constraint-like" counterexamples (e.g., implication counterexamples) to be returned by the teacher. This is in contrast to Angluin's exact learning setting, wherein the teacher may return only a positive/negative counterexample (a point in the symmetric difference of the target concept and the hypothesis).
In this paper, we study the problem of learning sets of points over the $d$-dimensional integer lattice that can be expressed as a finite union of integer (axis-aligned, a.k.a. rectilinear) hypercubes, i.e., hypercubes whose edges are parallel to the coordinate axes. Such a concept class forms a strict subclass of the sets of points definable by a formula $\varphi(x_1, \ldots, x_d)$ in integer linear arithmetic (a.k.a. semilinear sets). Semilinear sets have been addressed in several papers including [1,17,28]; their PAC-learnability is as hard as PAC-learning boolean formulas in DNF [16], a long-standing open problem in learning theory, when binary representations are permitted (even over dimension one [1]). That said, finite unions of integer hypercubes are a concept class that naturally arises in computer science. Below we mention a few examples.
The problem of learning rectangles (2-cubes) and its generalization to dimension $d$ are a classic example in computational learning theory, e.g., see [16,22]. Maass and Turán [22] showed for example that $d$-dimensional rectilinear cubes can be learned in polynomial time with $O(\log n)$ queries, where the corners of the cubes are represented in binary. The authors posed as an open problem whether one can learn a union of two (possibly overlapping) rectangles with only $O(\log n)$ equivalence queries. Chen [11] showed that this can be learned with 2 equivalence queries and $O(d \cdot \log n)$ membership queries. Later, Chen and Ameur [12] showed that there is a polynomial-time algorithm using at most $O(\log^2 n)$ queries. The same paper left as an open problem whether there is a polynomial-time exact learning algorithm that learns finite unions of rectilinear cubes over a fixed dimension $d$. In this paper, we answer this in the positive, and further show that this can be extended to allow infinite rectilinear hypercubes, which in turn allow interesting applications in formal verification, as we discuss below.
Finite unions of rectilinear cubes arise naturally in program analysis and verification. Here we mention two examples. First, solving games over a large game graph has benefited from constraint-based approaches, where winning regions can be succinctly represented and checked efficiently [6]. For example, the discretization of the Cinderella-Stepmother problem [6] admits winning regions that may be represented by a union of a small number of cubes. Second, verification algorithms benefit from optimization techniques like monadic decomposition [29], where the aim is to rewrite a given quantifier-free SMT formula $\varphi(x_1, \ldots, x_n)$ into an equivalent boolean combination of monadic predicates $\psi(x_i)$ in some special form, typically in DNF [5,7,15,19] or as an if-then-else formula [29], which can sometimes be exponentially smaller than the equivalent DNF representation. Veanes et al. [29] provided a generic semidecision procedure for performing this monadic decomposition as an if-then-else formula, which works regardless of the base theory. The restriction of the problem to the quantifier-free theory of integer linear arithmetic (with and without extra modulo constraints) was studied in [15], wherein the problem was shown to be coNP-complete and it was shown that a monadic decomposition can be exponentially large in general. For the subcase without modulo constraints, a monadic decomposition in DNF corresponds precisely to a finite union of (possibly infinite) rectilinear hypercubes, which is the subject of this paper. We describe below how oracles for membership and equivalence (as well as more powerful queries like subset queries) admit a fast implementation via an SMT solver, which enables our learning algorithms to be applied to compute such a monadic decomposition.
Contributions. We study the problem of learning finite unions of rectilinear hypercubes (over $\mathbb{Z}^d$) in Angluin's exact learning framework with membership and equivalence queries [2,3]. Our result is a polynomial-time exact learning algorithm for learning finite unions of rectilinear hypercubes over $\mathbb{Z}^d$ for fixed $d$. This answers an open problem of [12]. As observed in [12], over non-fixed $d$, this problem generalizes DNF since each term can be seen as a hypercube over $\{0, 1\}^d$. That is, without fixing $d$, the problem is as hard as learning unrestricted DNF, which is well known to be a major open problem in computational learning theory [4].
In view of applying our learning algorithm to the monadic decomposition problem [15,29] for quantifier-free integer linear arithmetic formulas, we consider two extensions. Firstly, we allow infinite hypercubes. For example, in dimension one, these would include infinite intervals like $[7, \infty)$, which corresponds to the formula $x \ge 7$. Secondly, we observe that the subset query (i.e., checking whether the target concept includes a given finite union $H$ of hypercubes) is not an expensive query for performing monadic decomposition: it corresponds to a single satisfiability check of a quantifier-free integer linear arithmetic formula, which can be handled easily by an SMT solver. Subset queries are one of the standard types of queries in Angluin's active learning framework, e.g., see [3]. For this reason, we provide an optimization of our learning algorithm by means of subset queries.
We implemented these learning algorithms (vanilla and various optimizations, including subset queries and "unary/binary acceleration"), using Z3 [26] as the backend for answering equivalence and subset queries (each a satisfiability check of a quantifier-free formula). We have performed micro-benchmarking to stress-test our algorithms against the generic monadic decomposition procedure of [29], which also uses Z3 as the backend, using various geometric objects over $\mathbb{Z}^d$ as benchmarks. Our experiments suggest that our algorithms substantially outperform the generic procedure.
Organization. Preliminaries are in Sect. 2. We present the overshooting algorithm that witnesses polynomial learnability of finite unions of rectilinear cubes over a fixed dimension $d$ with membership and equivalence queries in Sect. 3. In Sect. 4, we provide two extensions: (1) how subset queries can help speed up the overshooting algorithm, and (2) how the algorithm can be extended to handle infinite cubes. Applications to monadic decomposition and experiments are presented in Sect. 5. We conclude in Sect. 6.
We refer the reader to the technical report [25] when proofs are omitted and to the artifact [24] for implementation and benchmark details.

Preliminaries
We introduce below some common mathematical notations. $\mathbb{N}$ and $\mathbb{Z}$ are the sets of natural numbers and integers, respectively. For $a, b \in \mathbb{Z}$, we write $[a, b] = \{i \mid a \le i \le b\}$. For any set $X$, we denote its power set by $\mathcal{P}(X)$ and its cardinality by $|X| \in \mathbb{N} \cup \{\infty\}$. Given two sets $A, B$, the symmetric difference is written $A \Delta B = (A \setminus B) \cup (B \setminus A)$. When analyzing the complexity of the presented algorithms, we assume a binary encoding for any number $n \in \mathbb{Z}$ that is part of the input of the considered algorithms, namely $\mathrm{size}(n) = 1 + \lceil \log(|n| + 1) \rceil$, where $\log$ is the base-2 logarithm.
Hypercubes. For a fixed dimension $d \in \mathbb{N}$, we consider the discrete lattice $\mathbb{Z}^d$. For $v \in \mathbb{Z}^d$, we write $v_i$ for its $i$-th coordinate, and $v[i/\alpha]$ to denote the vector $v$ where the $i$-th coordinate has been replaced by $\alpha \in \mathbb{Z}$. The notation $0_d = (0, \ldots, 0) \in \mathbb{Z}^d$ denotes the origin, or simply $0$ when the dimension is clear from context. We use standard notation for component-wise addition and scalar multiplication. In particular, for $\alpha \in \mathbb{Z}$, $v + \alpha \cdot v'$ denotes the vector $v'' \in \mathbb{Z}^d$ such that $v''_i = v_i + \alpha v'_i$ for all $i$. We write $e_i$ for the $i$-th elementary vector, $e_i = 0[i/1]$. We shall mostly be using the standard component-wise order $\le$ over vectors in $\mathbb{Z}^d$. We finally denote the size of a vector as the sum of the sizes of its components: $\mathrm{size}(v) = \sum_{i=1}^{d} \mathrm{size}(v_i)$.
Our main study focuses on rectilinear hypercubes (cubes for short), i.e., any set of points of the form $\mathrm{Cube}(\underline{v}, \overline{v}) = \{v \in \mathbb{Z}^d \mid \underline{v} \le v \le \overline{v}\}$. The size of $C = \mathrm{Cube}(\underline{v}, \overline{v})$ is uniquely defined as $\mathrm{size}(C) = \mathrm{size}(\underline{v}) + \mathrm{size}(\overline{v})$. On the contrary, an arbitrary finite set $X$ has no unique representation as a finite union of cubes; we therefore define its size as the size of its best representation. We adopt here a worst-case analysis approach: since our later reasoning and complexity analysis are valid for any representation, they are in particular valid for the best one.
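To make these definitions concrete, here is a minimal Python sketch of cubes and sizes. The names `size_int`, `size_vec` and `Cube` are illustrative choices, not taken from the artifact, and the sketch assumes the ceiling reading of the size formula, under which size(n) is one (sign) bit plus the number of bits of |n|.

```python
from dataclasses import dataclass
from typing import Tuple

Point = Tuple[int, ...]


def size_int(n: int) -> int:
    """Binary size of an integer: 1 + ceil(log2(|n| + 1))."""
    # abs(n).bit_length() equals ceil(log2(|n| + 1)) for every integer n,
    # and avoids floating-point logarithms.
    return 1 + abs(n).bit_length()


def size_vec(v: Point) -> int:
    """Size of a vector: the sum of the sizes of its components."""
    return sum(size_int(c) for c in v)


@dataclass(frozen=True)
class Cube:
    """Rectilinear cube {v | lo <= v <= hi} for the component-wise order."""
    lo: Point
    hi: Point

    def __contains__(self, v: Point) -> bool:
        return all(l <= c <= h for l, c, h in zip(self.lo, v, self.hi))

    def size(self) -> int:
        return size_vec(self.lo) + size_vec(self.hi)
```

For instance, `(2, 4) in Cube((0, 3), (5, 10))` holds, while `(6, 4)` lies outside the cube because its first coordinate exceeds 5.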
Learning Model. We first recall some standard definitions from computational learning theory; for more, see [16]. Fix a countable base set $D = \bigcup_{i=1}^{\infty} D_i$, where the sets $D_i$ are pairwise disjoint. The problem of learning boolean formulas in DNF uses $D_i = \{0, 1\}^i$, i.e., the set of all binary sequences of length $i$, which can be thought of as the set of all assignments to a boolean function over $x_1, \ldots, x_i$. The learning problem in this paper uses $D_i = \mathbb{Z}^i$. A concept $X$ is simply a subset of $D_i$, for some $i \in \mathbb{Z}_{>0}$. For example, when $D_i = \{0, 1\}^i$, a concept is simply a boolean function over $x_1, \ldots, x_i$. When we speak of a learning problem, we always have a fixed set of representations in mind. For example, when we speak of learning boolean formulas in DNF (Disjunctive Normal Form), the representation $\varphi_X$ of a boolean function $X$ has to be a formula over $x_1, \ldots, x_i$ in DNF. Note that a concept could admit many possible representations. A concept class $\mathcal{C} = \bigcup_{i=1}^{\infty} \mathcal{C}_i$ is a set of concepts, where $\mathcal{C}_i \subseteq \mathcal{P}(D_i)$. For example, $\mathcal{C}_i$ could be the set of boolean functions over variables $x_1, \ldots, x_i$. When the set of representations for $\mathcal{C}$ is fixed (e.g., DNF for representing boolean functions), we can define $\mathrm{size}(X)$ of the concept $X$ to be the size of the smallest representation of $X$. In this paper, we are dealing with the concept class $\mathcal{C}_d \subseteq \mathcal{P}(\mathbb{Z}^d)$ of sets of integer points that can be represented as a finite union of rectilinear hypercubes over $\mathbb{Z}^d$. Earlier in this section we have defined this concept, as well as the size of the representation. To avoid notational clutter, we will often denote the concept class $\mathcal{C}_d$ by $\mathcal{C}$ because our algorithm typically assumes that $d$ is fixed.
In Angluin's active learning framework [2,3], the learner has access to oracles (a.k.a. teachers) that can provide hints about the target concept $X$ to the learner. A minimally adequate teacher must be able to answer membership and equivalence queries.

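Definition 1 (M+EQ Oracles). Consider some target concept $X \in \mathcal{C}_d$. A membership oracle $\Phi_X$ takes a point $v \in \mathbb{Z}^d$ and answers whether $v \in X$. An equivalence oracle $\Psi_X$ takes a hypothesis $H \in \mathcal{C}_d$ and returns $\top$ if $H = X$, and otherwise returns a counterexample $v \in H \Delta X$.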
Intuitively, an equivalence oracle tells, for any hypothesis $H \in \mathcal{C}$, whether $H = X$. If yes, $\top$ is returned; if not, it provides a counterexample, namely a point in the symmetric difference. Angluin has considered other types of queries as well in her framework, including subset/superset queries and difference queries (e.g., see her excellent survey [3]). We will use subset queries in Sect. 4. A learning algorithm $A$ is said to learn the concept class $\mathcal{C} = \bigcup_{d=1}^{\infty} \mathcal{C}_d$ if, given $d$ as input and any unknown target concept $X$, it terminates and outputs a representation of $X$ after a finite amount of interaction with the oracles. Assuming that the oracle always returns the shortest counterexamples, its running time is defined to be the number of steps (measured in $d$ and $\mathrm{size}(X)$) that $A$ takes to output a representation of $X$. The complexity $\mathrm{comp}(d, \mathrm{size}(X))$ of $A$ measures the number of steps taken in the worst case for all $d$ and $\mathrm{size}(X)$. The algorithm runs in polynomial time if $\mathrm{comp}$ is a polynomial function. Whether there is a polynomial-time learning algorithm for boolean formulas represented in DNF remains a long-standing open problem in computational learning theory, and this holds for almost all major models including exact learning and PAC (see [4]). Over geometric concepts including hypercubes and semilinear sets, the dimension $d$ is sometimes considered a fixed parameter, e.g., see [1,12,17,22].
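The interface of a minimally adequate teacher can be illustrated by the following toy sketch, which assumes the target is a small finite set of points stored explicitly. The function names are hypothetical, `None` plays the role of the positive answer, and the tie-breaking rule among counterexamples of equal size is an arbitrary choice of this sketch.

```python
from typing import Callable, Optional, Set, Tuple

Point = Tuple[int, ...]


def size_point(v: Point) -> int:
    """Binary size of a point, as in the preliminaries (ceiling reading assumed)."""
    return sum(1 + abs(c).bit_length() for c in v)


def make_membership_oracle(target: Set[Point]) -> Callable[[Point], bool]:
    """Membership oracle: answers whether a point belongs to the target."""
    return lambda v: v in target


def make_equivalence_oracle(
    target: Set[Point],
) -> Callable[[Set[Point]], Optional[Point]]:
    """Equivalence oracle: None if the hypothesis equals the target,
    otherwise a shortest point of the symmetric difference."""

    def oracle(hypothesis: Set[Point]) -> Optional[Point]:
        diff = target ^ hypothesis  # symmetric difference
        if not diff:
            return None
        # Shortest counterexample first; ties broken lexicographically.
        return min(diff, key=lambda v: (size_point(v), v))

    return oracle
```

A positive counterexample lies in the target but not in the hypothesis; a negative one lies in the hypothesis but not in the target.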

Minimally Adequate Teacher
We first restrict our attention to the minimally adequate teacher setting, where only a membership oracle and an equivalence oracle are provided, and give constructions for intermediate procedures that can be interpreted as oracles.

Corner Oracle
At the heart of our learning algorithm is the concept of corners:
Definition 2. Given a set of points $X \subseteq \mathbb{Z}^d$, a maximal corner (resp. minimal corner) of $X$ is a point $v \in X$ that is maximal (resp. minimal) with respect to the component-wise ordering $\le$. We write $\overline{\mathrm{Corners}}(X)$ and $\underline{\mathrm{Corners}}(X)$ for the sets of maximal and minimal corners, respectively, and write $\mathrm{Corners}(X) = \overline{\mathrm{Corners}}(X) \cup \underline{\mathrm{Corners}}(X)$.
Given a membership oracle for some $X \in \mathcal{C}$ containing $0$, Algorithm 1 returns some maximal corner of a given finite subset. Intuitively, for each coordinate $i$, a binary search is made until a border of $X$ is eventually found. More precisely, we provide the following complexity analysis.
Algorithm 1. Binary search for a maximal corner, assuming $0 \in X$
Require: $0 \in X$; $\Phi_X$ a membership oracle for $X$
Ensure: Returned value is a maximal corner of $X$
function findMaxCorner($\Phi_X$
This algorithm provides a partial implementation of the following oracle: A complete implementation of this oracle is provided by noticing that membership oracles can easily be composed:
Remark 1. Assume $\Phi_A$ and $\Phi_B$ are two given membership oracles, respectively for two arbitrary sets $A$ and $B$, and $f : \mathbb{Z}^d \to \mathbb{Z}^d$. One can build membership oracles for $A \cup B$, $A \cap B$, $A \Delta B$, $A \setminus B$ and $f(A)$. In particular:
- By instantiating $f : v \mapsto -v$, the previous procedure applied on $f(X)$ yields, after negation, a minimal corner of $X$.
- By instantiating $f : v \mapsto v - v_0$ for a point $v_0 \in X$, the previous procedure can be started from $v_0$ instead of $0$.
In both cases, notice that $\mathrm{size}(f(A)) \le \mathrm{size}(A) + \mathrm{size}(v_0) \le 2\,\mathrm{size}(A)$.
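The per-coordinate border search behind Algorithm 1 can be sketched as follows. This is not the listing of Algorithm 1 itself: the helper names are hypothetical, and the doubling ("galloping") phase used to obtain an upper bound before the binary search is an assumption of this sketch. The only property claimed is that each call returns a value $\alpha$ with $v[i/\alpha] \in X$ and $v[i/\alpha + 1] \notin X$, assuming $X$ is finite.

```python
from typing import Callable, Tuple

Point = Tuple[int, ...]
Membership = Callable[[Point], bool]


def replace(v: Point, i: int, alpha: int) -> Point:
    """The vector v[i/alpha]: coordinate i of v replaced by alpha."""
    return v[:i] + (alpha,) + v[i + 1:]


def find_border(member: Membership, v: Point, i: int) -> int:
    """Return alpha >= v[i] with v[i/alpha] in X and v[i/alpha + 1] not in X.

    Requires v in X and X finite, so that the galloping phase terminates.
    """
    assert member(v)
    lo, step = v[i], 1  # invariant: v[i/lo] is in X
    while member(replace(v, i, lo + step)):
        lo += step
        step *= 2
    hi = lo + step  # invariant: v[i/hi] is not in X
    while hi - lo > 1:  # binary search for a border between lo and hi
        mid = (lo + hi) // 2
        if member(replace(v, i, mid)):
            lo = mid
        else:
            hi = mid
    return lo


def push_up(member: Membership, v: Point) -> Point:
    """Push every coordinate of v, in turn, to a border of X."""
    for i in range(len(v)):
        v = replace(v, i, find_border(member, v, i))
    return v
```

Since $X$ need not be convex, the search may jump over a hole and stop at a border further away than the nearest one; this is consistent with the text, which only asks that some border be found.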
In the sequel we write $\Phi_C$ for the membership oracle of any set $C$ obtained by composing sets whose oracles are provided. We also assume having constructed the two procedures findMaxCorner($v$, $\Phi_X$) and findMinCorner($v$, $\Phi_X$).
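The composition of membership oracles from Remark 1 amounts to combining predicates. The sketch below uses hypothetical helper names and assumes, for the image $f(A)$, that $f$ is a bijection whose inverse is known (which is the case for negation and translation).

```python
from typing import Callable, Tuple

Point = Tuple[int, ...]
Membership = Callable[[Point], bool]


def union(a: Membership, b: Membership) -> Membership:
    return lambda v: a(v) or b(v)


def intersection(a: Membership, b: Membership) -> Membership:
    return lambda v: a(v) and b(v)


def difference(a: Membership, b: Membership) -> Membership:
    return lambda v: a(v) and not b(v)


def symmetric_difference(a: Membership, b: Membership) -> Membership:
    return lambda v: a(v) != b(v)


def image(a: Membership, f_inverse: Callable[[Point], Point]) -> Membership:
    """Oracle for f(A), given the inverse of the bijection f."""
    return lambda v: a(f_inverse(v))


def negate(v: Point) -> Point:
    """f : v -> -v, which is its own inverse."""
    return tuple(-c for c in v)


def shift(v0: Point) -> Callable[[Point], Point]:
    """Inverse of f : v -> v - v0, i.e. the map v -> v + v0."""
    return lambda v: tuple(c + c0 for c, c0 in zip(v, v0))
```

For example, `image(phi_x, negate)` is a membership oracle for $-X$, so a maximal-corner search on it corresponds to a minimal-corner search on $X$.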

Overshooting Algorithm
Algorithm 2. Overshooting algorithms
Require: $\Phi_X$ membership oracle for $X$, $\Psi_X$ equivalence oracle for $X$
function LearnCubes($\Phi_X$, $\Psi_X$)
The core loop of the learning algorithm is presented in the LearnCubes function of Algorithm 2. The hypothesis is initially empty, and is later refined as long as a counterexample is returned. How should the hypothesis be refined given a counterexample? Two implementations of Refine are provided, namely RefineSym and RefineAddRemove, giving rise to two variants of the algorithm. In both cases, the refinement takes a counterexample as input and uses the corner oracle to build a cube $C$. In the former variant, the symmetric difference between the current hypothesis and $C$ is taken, while in the latter, $C$ is either added to or removed from the hypothesis. An example run of the RefineAddRemove variant is depicted in Fig. 1. The upper diagrams represent the search space used by the corner oracles, while the lower diagrams depict the resulting hypothesis after refinement. Initially, the hypothesis is empty (not represented), so the search space coincides with the target set $X$, which can be represented as a union of two overlapping cubes. A counterexample $v \in X \setminus H$ is therefore returned by the equivalence oracle. As $v \in X$, the refinement procedure adds some cube by searching the state space $X \setminus H = X$ around $v$. An overly large cube is then added to the hypothesis, and a negative counterexample $v \in H \setminus X$ is returned. The search space is now $H \setminus X$ and the algorithm aims at removing some smaller cube from the hypothesis. After two removals, the final hypothesis coincides with the target.
Hypothesis Representation. Both variants operate on the hypothesis by applying boolean operations. One can naturally wonder whether hypotheses represented by unions, symmetric differences and differences of cubes can be handled by oracles operating on the concept class of finite unions of cubes. As a matter of fact, we will observe that $H \Delta X$, $H \setminus X$ and $X \setminus H$ can all be represented in $\mathcal{C}$:

Lemma 1 (Cube intersection and subtraction).
Let $C_1$ and $C_2$ be two cubes. Then $C_1 \cap C_2$ is a cube and $C_2 \setminus C_1$ can be written as the disjoint union of $2d$ cubes. Moreover, these computations are effective in $2d$ operations.
Intuitively, subtracting a smaller cube from a cube results in a family of cubes, one for each face of the larger cube. There are $2d$ faces for a cube in dimension $d$.
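One way to realize Lemma 1 is to peel off, coordinate by coordinate, the slab of $C_2$ lying below $C_1$ and the slab lying above it. The sketch below is one such construction and not necessarily the one used in the proof; cubes are represented as pairs `(lo, hi)` of integer tuples, both input cubes are assumed non-empty, and empty pieces are simply omitted, so at most $2d$ cubes are returned.

```python
from typing import List, Optional, Tuple

Point = Tuple[int, ...]
Cube = Tuple[Point, Point]  # (lo, hi), meaning {v | lo <= v <= hi}


def intersect(c1: Cube, c2: Cube) -> Optional[Cube]:
    """Intersection of two cubes, or None if it is empty."""
    lo = tuple(max(a, b) for a, b in zip(c1[0], c2[0]))
    hi = tuple(min(a, b) for a, b in zip(c1[1], c2[1]))
    return (lo, hi) if all(l <= h for l, h in zip(lo, hi)) else None


def subtract(c2: Cube, c1: Cube) -> List[Cube]:
    """Write c2 minus c1 as a disjoint union of at most 2d non-empty cubes."""
    (lo1, hi1) = c1
    lo, hi = list(c2[0]), list(c2[1])  # the part of c2 not yet classified
    pieces: List[Cube] = []
    for k in range(len(lo)):
        below = min(hi[k], lo1[k] - 1)  # slab strictly below c1 on axis k
        if lo[k] <= below:
            pieces.append((tuple(lo), tuple(hi[:k] + [below] + hi[k + 1:])))
        above = max(lo[k], hi1[k] + 1)  # slab strictly above c1 on axis k
        if above <= hi[k]:
            pieces.append((tuple(lo[:k] + [above] + lo[k + 1:]), tuple(hi)))
        # Keep only the part that overlaps c1 on axis k for later axes.
        lo[k], hi[k] = max(lo[k], lo1[k]), min(hi[k], hi1[k])
        if lo[k] > hi[k]:
            break  # nothing left: the remainder is already covered by pieces
    return pieces
```

After the last coordinate, whatever remains is exactly $C_1 \cap C_2$, which is discarded.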

Repetition-Free Complexity
In order to analyze the complexity of both variants of the algorithm, we fix a finite target set $X \in \mathcal{C}_d$ and one of its representations as a union of cubes: $X = \bigcup_{i=1}^{n} \mathrm{Cube}(\underline{v}_i, \overline{v}_i)$. We prove by induction on the iteration step that $H$ can be expressed as a union of cubes whose corners are aligned on a particular set of points:
Definition 4 (Abstract grid). For $1 \le k \le d$, we define the sets $\underline{B}_k$ and $\overline{B}_k$. Intuitively, $\underline{B}_k$ (resp. $\overline{B}_k$) describes all the possible $k$-th coordinates for minimal (resp. maximal) corners. A coordinate for a maximal corner, i.e., a constraint of the form $x_k \le \alpha$, can become a coordinate for a minimal corner, i.e., a constraint of the form $x_k \ge \alpha + 1$, when taking the complement during a difference operation, and vice versa.
We observe that $B$ is stable under union, intersection and difference. In particular, the overshooting algorithms maintain $H \in B$, namely the hypothesis always has minimal (resp. maximal) corners that align with $\underline{B}_k$ (resp. $\overline{B}_k$) on the $k$-th coordinate. Figure 2 provides an example of such points for a target made of the union of two cubes. Although $B$ is of polynomial size, proving $H \in B$ is not sufficient to prove termination of the algorithm in polynomial time, especially if some cubes in $B$ are added and removed several times. Consider for example Fig. 3, which depicts a possible run of the algorithm on three aligned cubes by its successive hypotheses: cube B is added during the first step, but is later covered when the algorithm tries to learn A but overshoots. Another overshooting happens when trying to remove the space between A and B, which ends up removing all space between A and C. The cube C then has to be learned a second time, terminating the algorithm.
To circumvent this issue, we propose an optimization that prevents visiting the same minimal corner $v$ twice. We base our reasoning on the following observations:
- If $v \in X$, then $v$ should not be later removed from $H$.
- If $v \notin X$, then $v$ should not be later added back to $H$.
Algorithm 3 introduces an optimized refinement procedure that keeps track of the already visited minimal corners. Although an analogous optimization can be done on the symmetric difference variant, we only discuss RefineAddRemove2 here.
Once a minimal corner $\underline{v}$ for a candidate cube has been found, we continue with the search for a maximal corner $\overline{v}$ by avoiding points that would result in the removal (resp. addition) of already added (resp. removed) minimal corners.

Algorithm 3. Optimized refinement avoiding visited minimal corners
Notice how only the maximal corner search benefits from the optimization, by tracking minimal corners only. As a matter of fact, one could store the whole visited cubes in the set $V$. However, when a search for a maximal corner is carried out, the resulting cube will intersect a previously visited cube as soon as the maximal corner crosses the minimal corner of the visited cube.
We again exploit Remark 1 to build an oracle for every mentioned membership oracle. Since $V$ is a finite set, one can indeed build a membership oracle for the set $\{v \mid \exists v' \in V \setminus X : \underline{v} \le v' \le v\}$. Due to this exclusion region, a finer analysis has to be conducted to prove $H \in B$.
Lemma 2. The two optimized variants maintain the following invariants:
Properties 1 and 2 ensure that every $\underline{v}$ added to $V$ is never added twice. They also ensure correctness of the algorithm: remark that the search for a maximal corner is not started from the initial counterexample $v_e$ but from $\underline{v}$, which is indeed in the search space since $\underline{v} \notin \{v \mid \exists v' \in V : \underline{v} \le v' \le v\}$ (no point is added twice to $V$). Finally, Property 3 ensures that only elements of $(\underline{B}_k)_k$ are added to $V$, hence a maximal number of $(2n)^d$ additions.
Proof. At the beginning of the algorithm, $V = H = \emptyset$, satisfying all given properties. We prove the result by induction on the iteration step:
1. By definition of the corner oracles, namely FindMaxCorner, if $\underline{v} \in X$ has been added to $V$ during some previous iteration, it was added in the first branch (the oracle returns some point in the search region, which excludes $X$ in the second branch). Therefore, it was also added to $H$ during this iteration. Consider some later iteration removing elements from $H$, namely an iteration executing the second branch. Some cube $C = \mathrm{Cube}(\underline{v}', \overline{v}')$ has been computed by the corner oracles in this branch such that $\overline{v}' \in H \setminus X \setminus \{v \mid \exists v'' \in V : \underline{v}' \le v'' \le v\}$. In particular, since $\underline{v} \in V$, we do not have $\underline{v}' \le \underline{v} \le \overline{v}'$, hence $\underline{v} \notin C$ and $\underline{v}$ is not removed.
2. Similar to (1) (symmetric case).
3. Every $\underline{v}$ added to $V$ was produced by a (max) corner query made on $X \setminus H$ or $H \setminus X$. Both of these sets are in $B$ since $H \in B$ by induction hypothesis.
4. Let us prove that the cube $C = \mathrm{Cube}(\underline{v}, \overline{v})$ currently added or removed satisfies $C \in B$ (hence $H \cup C, H \setminus C \in B$, which will conclude the induction). We have already proven that $\underline{v} \in (\underline{B}_k)_k$. We now prove that $\overline{v} \in (\overline{B}_k)_k$, which is searched over the restricted state space.
By combining Proposition 1 and Lemma 2, we summarize the complexity of our overshooting algorithms for a particular target $X$.

Theorem 1 (M+EQ). Both variants of LearnCubes terminate in at most $(2n)^d$ iterations, where an iteration requires:
1. One equivalence query;
2. One corner query, or equivalently, a linear number $O(\mathrm{size}(X))$ of membership queries.
This algorithm terminates in polynomial time, for fixed $d$, for any representation of the target $X$. In particular, the result holds in the worst case where the representation of $X$ as a finite union of cubes is minimal. As a matter of fact, the presented exponential bound in $d$ is tight: there exists a target $X \in \mathcal{C}$ and a pair of corner and equivalence oracles such that both algorithms terminate in exponential time. Whether finite unions of cubes can be learned in time polynomial in the dimension is left as an open problem, which we relate to DNF formula learning over $d$ variables, where each term can be interpreted as a cube over $\{0, 1\}^d$.

Extensions
In this section we introduce extensions to the overshooting algorithm from Sect. 3.2. While membership and equivalence queries are sufficient for learning finite sets, one natural extension of the minimal learner setting is to introduce a subset oracle [3]: The definition is similar to the membership oracle from Definition 1, except that the oracle takes a set instead of a single point as input.

Maximal Cube Oracle
In contrast to the membership-based setting, using a subset oracle avoids the overshooting issue, that is to say, we can now search for cubes included in the target $X$. In order to increase the convergence speed, we nonetheless introduce a maximality criterion on the suitable cubes: Next, we modify the corner oracle from Sect. 3.1 to use subset queries. Again, we only define the algorithm to find a maximal corner; the minimal corner algorithm can be implemented analogously. The algorithm first computes a lower and an upper bound for the subsequent binary search. The computation is shown in the function computeMaxBounds. Given a cube defined by its minimal and maximal corners, the value of coordinate $i$ is increased as long as the resulting cube is still a subset of the target set $X$. The upper bound is the value for which the oracle first replies negatively, and the lower bound is the value of the last positive reply. A binary search is then made between these two bounds in the findMaxIncCorner function.
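The bound computation followed by a binary search can be sketched as below. The function names are hypothetical and do not reproduce computeMaxBounds or findMaxIncCorner from the listing; the doubling step size is an assumption of this sketch, and the subset oracle is a brute-force stand-in over an explicit finite target. Because inclusion of $\mathrm{Cube}(\underline{v}, \overline{v}[i/\alpha])$ in $X$ is monotone in $\alpha$, the binary search returns the largest admissible value.

```python
from itertools import product
from typing import Callable, Set, Tuple

Point = Tuple[int, ...]
SubsetOracle = Callable[[Point, Point], bool]


def replace(v: Point, i: int, alpha: int) -> Point:
    return v[:i] + (alpha,) + v[i + 1:]


def make_subset_oracle(target: Set[Point]) -> SubsetOracle:
    """Brute-force subset oracle: is Cube(lo, hi) included in the target?"""

    def subset(lo: Point, hi: Point) -> bool:
        ranges = (range(l, h + 1) for l, h in zip(lo, hi))
        return all(p in target for p in product(*ranges))

    return subset


def extend_coordinate(subset: SubsetOracle, lo: Point, hi: Point, i: int) -> int:
    """Largest alpha >= hi[i] such that Cube(lo, hi[i/alpha]) is inside X."""
    assert subset(lo, hi)
    good, step = hi[i], 1  # last positive reply
    while subset(lo, replace(hi, i, good + step)):
        good += step
        step *= 2
    bad = good + step  # first negative reply
    while bad - good > 1:
        mid = (good + bad) // 2
        if subset(lo, replace(hi, i, mid)):
            good = mid
        else:
            bad = mid
    return good


def find_max_corner_inc(subset: SubsetOracle, lo: Point, start: Point) -> Point:
    """Extend Cube(lo, start) upward along each coordinate in turn."""
    hi = start
    for i in range(len(hi)):
        hi = replace(hi, i, extend_coordinate(subset, lo, hi, i))
    return hi
```

Once a coordinate has been extended as far as possible, later extensions along other coordinates only enlarge the cube, so the earlier coordinate still cannot be extended.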

Maximal Cube Algorithm
Algorithm 5 presents a procedure that iteratively refines the hypothesis: for any counterexample point, the algorithm searches for a maximal cube containing this point w.r.t. the target and adds it to the hypothesis. One can check that both procedure calls are valid, as $H \subseteq X$ is an invariant. At every iteration the counterexample $v$ satisfies $v \in X \setminus H$. The use of the subset oracle ensures that the function FindMaxIncCorner always returns a point $\overline{v}$ such that $\mathrm{Cube}(v, \overline{v}) \subseteq X$. Similarly, the function FindMinIncCorner always returns a corner $\underline{v}$ such that $\mathrm{Cube}(\underline{v}, \overline{v}) \subseteq X$. The resulting cube is then added to the hypothesis, ensuring that the point $v$ is never visited again as a counterexample. This entails the termination of the algorithm, in at most $|X|$ iterations of the main loop. A better bound will be explored in Sect. 4
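The main loop just described can be sketched on a toy instance as follows. This is an illustration under simplifying assumptions, not the listing of Algorithm 5: the target is an explicit finite set of points, both oracles are brute-force stand-ins, and the corners are found by unary search for brevity.

```python
from itertools import product
from typing import List, Optional, Set, Tuple

Point = Tuple[int, ...]
Cube = Tuple[Point, Point]


def cube_points(lo, hi) -> Set[Point]:
    """All integer points of Cube(lo, hi)."""
    return set(product(*(range(l, h + 1) for l, h in zip(lo, hi))))


def learn_max_cubes(target: Set[Point], d: int) -> List[Cube]:
    """Learn a finite target as a union of cubes included in it."""

    def subset(lo, hi) -> bool:  # toy subset oracle
        return cube_points(lo, hi) <= target

    def equivalence(hyp: Set[Point]) -> Optional[Point]:  # toy equivalence oracle
        diff = target ^ hyp
        return min(diff) if diff else None

    cubes: List[Cube] = []
    covered: Set[Point] = set()  # points of the hypothesis H, with H inside X
    while True:
        v = equivalence(covered)
        if v is None:
            return cubes
        lo, hi = list(v), list(v)
        for i in range(d):  # grow the maximal corner, one coordinate at a time
            while subset(lo, hi[:i] + [hi[i] + 1] + hi[i + 1:]):
                hi[i] += 1
        for i in range(d):  # then grow the minimal corner
            while subset(lo[:i] + [lo[i] - 1] + lo[i + 1:], hi):
                lo[i] -= 1
        cubes.append((tuple(lo), tuple(hi)))
        covered |= cube_points(lo, hi)
```

Each iteration covers its counterexample, so the loop runs at most $|X|$ times on a finite target, matching the cardinality argument above.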

Complexity
Termination of LearnMaxCube was proved using cardinality arguments in Sect. 4.1. These arguments obviously do not apply in the case where the target set is infinite. Moreover, we are interested in a finer complexity analysis.
As in Sect. 3.3, we fix a target representation $X = \bigcup_{i=1}^{n} \mathrm{Cube}(\underline{v}_i, \overline{v}_i)$ and study the complexity of the algorithm with respect to this representation. As some of the vectors may contain infinite coordinates, we carefully specify $\mathrm{size}(+\infty) = \mathrm{size}(-\infty) = 1$ and keep the usual definition of $\mathrm{size}(v)$.

Theorem 2 (SUB+EQ).
LearnMaxCube terminates in at most $n^{2d}$ iterations, where an iteration requires:
1. One equivalence query;
2. One maximal cube query, or equivalently, a linear number $O(\mathrm{size}(X))$ of subset queries.
Proof. At every iteration, one equivalence query is performed, then FindMaxIncCorner and FindMinIncCorner perform a binary search, resulting in a linear number of subset queries (the proof is similar to that of Proposition 1). In order to analyze the number of iterations of the main loop, let us first remark that each maximal cube is added only once: if we write $v_k$ for the $k$-th counterexample and $C_k$ for the learned maximal cube, then $v_k \in C_k$ while $v_k \notin C_j$ for all $j < k$. The number of iterations is therefore bounded by the number of maximal cubes. We now proceed to bound the number of maximal cubes. Let $C = \mathrm{Cube}(\underline{v}, \overline{v})$ be a maximal cube w.r.t. $X$. For any $k \in [1, d]$, the $k$-th coordinate of $\overline{v}$ (resp. $\underline{v}$) must coincide with the $k$-th coordinate of $\overline{v}_i$ (resp. $\underline{v}_i$) for some $i \in [1, n]$; otherwise $C$ could be extended along coordinate $k$ while remaining included in $X$. There are therefore at most $n^{2d}$ maximal cubes.
As in Theorem 1, the number of iterations is polynomial in the number of cubes $n$ but exponential in the dimension $d$. As opposed to the LearnCubes algorithm, the bound is not tight, as the example in Fig. 5b provides only a quadratic number of maximal cubes. As the maximal cube concept can be related to the notion of prime implicant, examples of DNF formulas with an exponential number of prime implicants (see for example [8]) can be translated into unions of cubes with an exponential number of maximal 0-1 cubes.
From a practical perspective, one can nonetheless argue that LearnMaxCube is likely to perform well in practice, by avoiding the overshooting problem mentioned in Example 1, as $H \subseteq X$ is an invariant. In fact, one can easily check that if there are no adjacent cubes, the number of iterations becomes linear.

Applications and Experiments
In this section, we describe an immediate application of our learning algorithms to monadic decomposition of quantifier-free Presburger formulas [15,29].We then report on experimental comparisons between our algorithms and existing methods for the problem.

Application to Monadic Decomposition
Here we consider quantifier-free linear integer arithmetic formulas without modulo arithmetic, i.e., boolean combinations of atoms of the form $\alpha_1 \sim \alpha_2$, where $\sim \in \{\le, \ge, =\}$ and $\alpha_1, \alpha_2$ are integer linear combinations of the variables $x_1, \ldots, x_n$, i.e., each $\alpha_i$ is of the form $c_0 + \sum_{j=1}^{n} c_j x_j$, where each $c_j \in \mathbb{Z}$. The formula $\varphi(\bar{x})$ is said to be satisfiable (written $\langle \mathbb{Z}; + \rangle \models \varphi$) if there exists an assignment $\sigma$ of $\bar{x}$ to $\mathbb{Z}$ such that the formula becomes true. Of course, this is just a simple fragment of the first-order theory of integer linear arithmetic, and the notion of $\langle \mathbb{Z}; + \rangle \models \varphi$ can be defined in the same way even with quantifiers [14,18]. A formula $\varphi$ is said to be monadic if it has only one variable. Every monadic formula $\varphi(x)$ in this fragment can be easily transformed into a union of integer intervals of the form: (1) $l \le x \wedge x \le u$ where $l, u \in \mathbb{Z}$, (2) $l \le x$ where $l \in \mathbb{Z}$, (3) $x \le u$ where $u \in \mathbb{Z}$, or (4) $\top$ or $\bot$.
A monadic decomposition [29] of a formula $\varphi(\bar{x})$ is a boolean combination $\psi(\bar{x})$ of monadic formulas that is equivalent to $\varphi$ over the theory, i.e., $\langle \mathbb{Z}; + \rangle \models \forall \bar{x} (\varphi \leftrightarrow \psi)$. Of course, not all formulas admit a monadic decomposition (e.g., $x = y$). It was shown in [15] that deciding whether a formula in the theory is monadically decomposable is coNP-complete. Veanes et al. [29] provide a generic semidecision procedure for computing a monadic decomposition of a quantifier-free formula as an if-then-else formula that is applicable to virtually all theories considered in SMT. Despite its genericity, the procedure runs rather well, e.g., as the authors showed in their benchmarks in [29].
The application of our learning algorithms to computing monadic decompositions arises from the following observation. Since each monadic decomposition can be transformed into DNF, a monadic decomposition of a formula $\varphi(\bar{x})$ over $\langle \mathbb{Z}; + \rangle$ can be constructed as a finite union of (possibly infinite) hypercubes, where an infinite hypercube arises when a variable is either not bounded from above or not bounded from below (or both). Conversely, a finite union $H$ of possibly infinite hypercubes can also be easily transformed into a boolean combination of monadic formulas $\varphi_H$. For example, the formula $(0 \le x \le 5 \wedge 3 \le y \le 10) \vee (8 \le x)$ corresponds to the union of hypercubes $\mathrm{Cube}((0, 3), (5, 10)) \cup \mathrm{Cube}((8, -\infty), (+\infty, +\infty))$. Furthermore, all relevant oracles admit a straightforward implementation:
- A membership query $v$ requires checking $\langle \mathbb{Z}; + \rangle \models \varphi(v)$, which can be checked in polynomial time because $\varphi$ is quantifier-free.
- An equivalence query $H$ can be reduced to checking satisfiability of $\neg(\varphi \leftrightarrow \varphi_H)$, any model of which is a counterexample. This is a single satisfiability check of a quantifier-free integer linear arithmetic formula, for which highly optimized solvers exist (e.g., Z3 [26]).
- A subset query $H$ can similarly be reduced to checking satisfiability of $\varphi_H \wedge \neg\varphi$. This is also a single satisfiability check over $\langle \mathbb{Z}; + \rangle$.
This allows us to apply both of our learning algorithms to the problem.
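The correspondence between unions of possibly infinite hypercubes and monadic formulas can be illustrated on the example above. The sketch below uses hypothetical helper names, encodes infinite bounds with `math.inf`, and renders $\varphi_H$ as a plain string rather than as a solver term; the solver-based oracles themselves are not shown.

```python
import math
from typing import List, Sequence, Tuple

Bound = float  # an integer bound, or -math.inf / math.inf
Cube = Tuple[Tuple[Bound, ...], Tuple[Bound, ...]]

NEG_INF, POS_INF = -math.inf, math.inf

# The union of hypercubes from the example in the text.
H: List[Cube] = [((0, 3), (5, 10)), ((8, NEG_INF), (POS_INF, POS_INF))]


def phi(x: int, y: int) -> bool:
    """The example formula (0 <= x <= 5 and 3 <= y <= 10) or (8 <= x)."""
    return (0 <= x <= 5 and 3 <= y <= 10) or (8 <= x)


def in_union(cubes: List[Cube], v: Sequence[int]) -> bool:
    """Membership of a point in a union of possibly infinite cubes."""
    return any(all(l <= c <= h for l, c, h in zip(lo, v, hi)) for lo, hi in cubes)


def cube_to_formula(lo, hi, names: Sequence[str]) -> str:
    """Conjunction of monadic bounds; infinite bounds yield no atom."""
    atoms = []
    for name, l, h in zip(names, lo, hi):
        if l != NEG_INF:
            atoms.append(f"{l} <= {name}")
        if h != POS_INF:
            atoms.append(f"{name} <= {h}")
    return " and ".join(atoms) if atoms else "True"


def union_to_formula(cubes: List[Cube], names: Sequence[str]) -> str:
    """Disjunction of the cube formulas (False for the empty union)."""
    parts = [f"({cube_to_formula(lo, hi, names)})" for lo, hi in cubes]
    return " or ".join(parts) if parts else "False"
```

Here `union_to_formula(H, ("x", "y"))` produces one disjunct per cube, and the unconstrained variable `y` of the second cube contributes no atom.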
Monadic decomposition has numerous applications including quantifier elimination [29], string solving [15], and symbolic finite automata/transducers [13,29], among others. In the following example we illustrate how our learning algorithm(s) could be applied to improve quantifier elimination for the theory of linear integer arithmetic.
Example 2. Consider a formula of the form $\forall \bar{x}\, \exists \bar{y}\; \varphi(\bar{x}, \bar{y})$, where ϕ is a formula in linear integer arithmetic without modulo constraints. Suppose that ϕ is monadically decomposable, and is equivalent to the formula $\bigvee_{i=1}^{n} D_i(\bar{x}, \bar{y})$, where each $D_i$ is a conjunction of monadic predicates over the variables $\bar{x} \cup \bar{y}$. We assume w.l.o.g. that each $D_i$ is satisfiable. Then, this formula is equisatisfiable (over linear integer arithmetic) to $\psi := \forall \bar{x} \left( \bigvee_{i=1}^{n} D_i(\bar{x}, \bar{c}_i) \right)$, where $\bar{y}$ in $D_i$ is replaced by fresh constants $\bar{c}_i$ (i.e., two distinct $D_i$, $D_j$ use different constants). This can be proven by a simple application of skolemization, and observing that each occurrence of $f(\bar{x})$ in any disjunct is of the form $a < f(\bar{x}) < b$, where a ∈ {−∞} ∪ Z and b ∈ Z ∪ {∞}, implying that $f(\bar{x})$ can be replaced by a single constant, which does not depend on $\bar{x}$. Finally, let $D'_i$ be the conjuncts in $D_i$ only involving variables in $\bar{x}$. Checking that ψ is true reduces to checking satisfiability of $\bigwedge_{i=1}^{n} \neg D'_i$: ψ is true exactly when this conjunction is unsatisfiable. To make this example concrete, we consider the formula ∀x∃y(x ≥ 0 → x + y ≥ 5 ∧ y ≥ 0). A monadic decomposition of the quantifier-free part is $x < 0 \lor \bigvee_{i=0}^{5} (x \ge i \land y \ge 5 - i)$. Therefore, checking the above formula can be reduced to satisfiability of $x \ge 0 \land \bigwedge_{i=0}^{5} x < i$, which is not satisfiable.
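The concrete instance of Example 2 can be sanity-checked by brute force. The following Python sketch is only a bounded check over a small window of integers (the window size is an arbitrary assumption), not a proof and not part of the prototype; the function names are hypothetical.

```python
from itertools import product

def phi(x, y):
    # Quantifier-free part: x >= 0 -> (x + y >= 5 and y >= 0).
    return (x < 0) or (x + y >= 5 and y >= 0)

def decomposition(x, y):
    # Monadic decomposition: x < 0 or OR_{i=0..5} (x >= i and y >= 5 - i).
    return (x < 0) or any(x >= i and y >= 5 - i for i in range(6))

def reduced(x):
    # Negated x-parts of all disjuncts: x >= 0 and AND_{i=0..5} x < i.
    return x >= 0 and all(x < i for i in range(6))

WINDOW = range(-20, 21)  # bounded search space, chosen arbitrarily

# The decomposition agrees with the original formula on the window.
agree = all(phi(x, y) == decomposition(x, y) for x, y in product(WINDOW, WINDOW))

# The reduced formula has no model in the window (x >= 0 contradicts x < 0).
reduced_sat = any(reduced(x) for x in WINDOW)

# Direct check of "for all x there exists y" restricted to the window.
valid = all(any(phi(x, y) for y in WINDOW) for x in WINDOW)
```

On this window the decomposition agrees with the original formula, the reduced formula has no model, and the quantified formula holds, which is consistent with the argument in the example.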

Experiments
In order to assess the performance of the algorithms FindMaxCorner and FindMinCorner, introduced in Sect. 3 and Sect. 4 respectively, we consider prototype implementations. The prototypes and experiments can be found in [24].
Variants. Although the methods were presented with binary search strategies in mind, we also implemented a more naive unary search procedure to obtain the corners. As the experiments later show, unary search may be preferred for very small cubes and performs especially well for cubes based on 0-1 integer programs, while binary search achieves better performance for larger cubes. Consequently, we consider a third variation of the algorithm called "optimized", combining unary search for small instances and binary search for large values. More precisely, two variants of the overshooting algorithm from Sect. 3 and three variants of the max cubes algorithm from Sect. 4 are presented, called respectively overshoot unary and overshoot binary, and max unary, max binary and max optimized.
Tool Comparison. Evaluation is performed against a generic monadic decomposition procedure mondec₁ from [29] by Veanes et al., which works over an arbitrary base theory and outputs an if-then-else formula, which could be exponentially more succinct than a formula in DNF. The algorithm, which exploits the python-Z3 framework [26], uses a kind of decision-tree search heuristic to split the input into monadic predicates.
Implementation. Similarly to mondec₁, our prototype is implemented in Python using the python-Z3 framework, but it is specialized in handling linear integer arithmetic formulas and, unlike mondec₁, outputs formulas in DNF. For monadic decomposition applications, oracle queries are converted to appropriate Z3 satisfiability queries, since a (possibly non-monadic) representation of the target set is already known.
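The trade-off between unary and binary search described under Variants can be illustrated on a single coordinate. The sketch below is not the prototype's code: the function names and the `threshold` parameter are hypothetical, the membership oracle is a plain Python predicate, and membership is assumed to be monotone along the searched direction (true up to the boundary of a cube, false afterwards), with the starting point inside the set. Each function returns the last member found together with the number of membership queries it issued.

```python
from typing import Callable, Tuple

Oracle = Callable[[int], bool]

def unary_extent(member: Oracle, start: int) -> Tuple[int, int]:
    """Walk one step at a time until the oracle answers 'no'."""
    queries = 0
    x = start
    while True:
        queries += 1
        if not member(x + 1):
            return x, queries
        x += 1

def binary_extent(member: Oracle, start: int) -> Tuple[int, int]:
    """Double the step until leaving the set, then bisect the last gap."""
    queries = 0
    step = 1
    while True:
        queries += 1
        if not member(start + step):
            break
        step *= 2
    # Invariant: member(lo) holds and member(hi) does not.
    lo, hi = start + step // 2, start + step
    while hi - lo > 1:
        mid = (lo + hi) // 2
        queries += 1
        if member(mid):
            lo = mid
        else:
            hi = mid
    return lo, queries

def optimized_extent(member: Oracle, start: int, threshold: int = 4) -> Tuple[int, int]:
    """Try a few unary steps first, then fall back to the binary strategy.
    The threshold value is an illustrative assumption."""
    queries = 0
    x = start
    for _ in range(threshold):
        queries += 1
        if not member(x + 1):
            return x, queries
        x += 1
    far, more = binary_extent(member, x)
    return far, queries + more

small = lambda x: 0 <= x <= 2     # a very small cube, one coordinate
large = lambda x: 0 <= x <= 100   # a larger cube, one coordinate

small_results = [f(small, 0) for f in (unary_extent, binary_extent, optimized_extent)]
large_results = [f(large, 0) for f in (unary_extent, binary_extent, optimized_extent)]
```

On the small interval the unary walk needs one query fewer than the doubling strategy, while on the larger interval it needs roughly seven times as many, which mirrors the behaviour that motivates a combined variant.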

Benchmark Suite
Our benchmark suite is restricted to the problem of monadic decomposition of linear integer arithmetic, and its purpose is to stress-test our learning algorithms and mondec₁ against various kinds of "extreme conditions". The suite consists of six classes of monadically decomposable example formulas, which were constructed to test five features (see below). Note that the given formulas themselves might contain non-monadic predicates.
The five features (left to right in Table 1) represent the presence of (1) a large number of cube overlaps, (2) a large number of cubes, (3) a large cube, (4) a large dimension, and (5) an unbounded cube. We hypothesized that these five features play important roles in how fast the algorithms perform, which is indeed validated by our experimental results. The six classes of formulas are elaborated below.
unbounded nature makes it tractable by max optimized and mondec₁ only.

Results
Experiments were conducted on an AMD Ryzen 5 1600 Six-Core CPU with 16 GB of RAM running Windows 10. The results are summarized in Fig. 7, where each graph represents one benchmark and compares the run times of the algorithms. The overshooting phenomenon can be observed in Fig. 7c and Fig. 7e, where the running time has a quadratic shape, as d = 2. In Fig. 7b, the running time quickly diverges as d increases, as anticipated by Example 1.
When the considered cubes are small, as in Fig. 7a and Fig. 7c, the unary search algorithms outperform their binary counterparts, meaning that the few additional queries made by binary search cost more than a direct enumeration. The optimized variant is therefore a good compromise in all cases.
Figure 7d depicts a benchmark with many large cubes for a fixed dimension. While the impact of the overshooting phenomenon remains contained, the maxcube unary search variant is particularly slow. This can be explained by the size of the cubes making unary search inefficient, combined with the already expensive cost of every single inclusion query.
The mondec₁ algorithm is comparable to the overshooting algorithm in Fig. 7e. It also performs particularly well in Fig. 7f, which we conjecture is due to the conciseness of the solution in the if-then-else form used by mondec₁.
Overall, the maxcube algorithm in its optimized form is the most stable algorithm for this benchmark set and should be preferred when an inclusion oracle is available. The extra cost of these queries is taken into account here and remains affordable when they are implemented with Z3 queries.

Conclusion and Future Work
We have presented a polynomial-time algorithm in Angluin's exact learning framework, using membership and equivalence queries, for learning a finite union of rectilinear cubes over Z^d for any fixed dimension d. By considering an additional subset oracle, learning possibly infinite cubes can be achieved with the same complexity, but with a learning algorithm that is simpler and faster in practice. The technique enables the introduction of auxiliary oracles, namely the corner (resp. maximal cube) oracle when a membership (resp. subset) oracle is provided. While oracles for subset queries tend to be difficult to implement, this turns out not to be the case for our proposed application of computing monadic decompositions of quantifier-free integer linear arithmetic formulas without modulo constraints, which is successfully solved by our algorithm.
We mention three future research directions. First, extensions to modulo operations could be explored, by encoding periodicity on d additional coordinates and providing adequate oracles on the encoded target. A second direction consists in applying these learning techniques to the verification of systems by learning invariants which are monadically decomposable in a small number of cubes. Lastly, one promising direction to further improve our algorithms is to investigate how to leverage if-then-else formula representations as used in mondec₁ [29], which could be exponentially more succinct than formulas in DNF.

Fig. 1. Possible run of the overshooting algorithm on two cubes in 2 dimensions

Fig. 2. Possible minimal and maximal corners for cubes appearing in the hypothesis, for a given target space

Fig. 3. Possible run on three cubes where cube B is added twice to the hypothesis.

Figure 5 provides examples of possible maximal cubes in dimension d = 2.
(a) K Diagonal Restricted consists of K overlapping cubes of length and width 2 and one diagonal, as shown in Fig. 6a. Each cube overlaps with at most two other cubes, and the cubes stack up diagonally. The algorithms need to return all the cubes left of the diagonal.
(b) 10 cubes in Z^d consists of K = 10 overlapping cubes of size 2^d stacking up diagonally, similar to the benchmark K Diagonal Restricted but without the diagonal restriction.
(c) K Diagonal Unrestricted is a variation of Fig. 6a where the algorithms need to return all the cubes and all the points on the diagonal.
(d) K Big Overlapping Cube is a benchmark testing large cubes, as depicted in Fig. 6b. It consists of K overlapping cubes of length and width 100 that stack up diagonally like those of the benchmark K Diagonal Restricted.
(e) K Diagonal is built as the set of points along the diagonal x

Fig. 7. Benchmark results. The y-axis encodes the time in seconds. The timeout is set to 1800 s.
.4. to accelerate the search to infinity. Algorithm 6 achieves this goal by simply overriding the ComputeMaxBounds and ComputeMinBounds subroutines in order to check for possible +∞ and −∞ bounds. Whenever such a bound is returned, no further binary search occurs for this coordinate (constant time).
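The idea of testing for an infinite bound before searching can be sketched on one coordinate. This is a toy illustration, not Algorithm 6: the names `make_subset_oracle` and `compute_max_bound` are hypothetical, the target is a single one-dimensional interval, and the starting point is assumed to lie inside it. A subset query on the half-line is asked first; only when it fails does the doubling and bisection search run.

```python
import math

def make_subset_oracle(a, b):
    """Toy subset oracle for the 1-D target [a, b]; b may be math.inf.
    Also returns a counter of the queries asked."""
    calls = {"n": 0}

    def subset(lo, hi):
        calls["n"] += 1
        return a <= lo and hi <= b

    return subset, calls

def compute_max_bound(subset, start):
    """Largest u with [start, u] inside the target, or math.inf."""
    # One query settles the unbounded case; no further search is needed.
    if subset(start, math.inf):
        return math.inf
    # Otherwise double the length until the interval leaves the target ...
    step = 1
    while subset(start, start + step):
        step *= 2
    # ... and bisect: [start, lo] is inside the target, [start, hi] is not.
    lo, hi = start + step // 2, start + step
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if subset(start, mid):
            lo = mid
        else:
            hi = mid
    return lo

unbounded_oracle, unbounded_calls = make_subset_oracle(0, math.inf)
unbounded_result = compute_max_bound(unbounded_oracle, 3)

bounded_oracle, bounded_calls = make_subset_oracle(0, 100)
bounded_result = compute_max_bound(bounded_oracle, 3)
```

In the unbounded case a single query suffices, which corresponds to the constant-time behaviour mentioned above; in the bounded case the extra query adds only a constant overhead to the usual logarithmic search.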