About the complexity of two-stage stochastic IPs

We consider so-called 2-stage stochastic integer programs (IPs) and their generalized form, so-called multi-stage stochastic IPs. A 2-stage stochastic IP is an integer program of the form $\max \{ c^T x \mid \mathcal{A} x = b,\, l \le x \le u,\, x \in \mathbb{Z}^{s+nt} \}$ where the constraint matrix $\mathcal{A} \in \mathbb{Z}^{rn \times (s+nt)}$ consists roughly of $n$ repetitions of a matrix $A \in \mathbb{Z}^{r \times s}$ on the vertical line and $n$ repetitions of a matrix $B \in \mathbb{Z}^{r \times t}$ on the diagonal. In this paper we improve upon an algorithmic result by Hemmecke and Schultz from 2003 [Hemmecke and Schultz, Math. Prog. 2003] to solve 2-stage stochastic IPs. The algorithm is based on the Graver augmentation framework where our main contribution is to give an explicit doubly exponential bound on the size of the augmenting steps.
The previous bound for the size of the augmenting steps relied on non-constructive finiteness arguments from commutative algebra and therefore only an implicit bound was known that depends on parameters $r$, $s$, $t$ and $\Delta$, where $\Delta$ is the largest entry of the constraint matrix. Our new improved bound, however, is obtained by a novel theorem which argues about intersections of paths in a vector space. As a result of our new bound we obtain an algorithm to solve 2-stage stochastic IPs in time $f(r,s,\Delta) \cdot \mathrm{poly}(n,t)$, where $f$ is a doubly exponential function. To complement our result, we also prove a doubly exponential lower bound for the size of the augmenting steps.


Introduction
Integer programming is one of the most fundamental problems in algorithm theory. Many problems in combinatorial optimization and other areas can be modeled as an integer program. An integer program (IP) is thereby of the form $$\max \{ c^T x \mid Ax = b,\, l \le x \le u,\, x \in \mathbb{Z}^n \}$$ for some matrix $A \in \mathbb{Z}^{m \times n}$, a right-hand side $b \in \mathbb{Z}^m$, a cost vector $c \in \mathbb{Z}^n$ and lower and upper bounds $l, u \in \mathbb{Z}^n$. The famous algorithm of Kannan [21] computes an optimal solution of the IP in time of roughly $n^{O(n)} \cdot \mathrm{poly}(m, \log \Delta)$, where $\Delta$ is the largest entry of $A$ and $b$.
In recent years there was significant progress in the development of algorithms for IPs when the constraint matrix $A$ has a specific structure. Consider for example the class of integer programs with a constraint matrix $N$ of the form $$N = \begin{pmatrix} A & A & \cdots & A \\ B & 0 & \cdots & 0 \\ 0 & B & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & B \end{pmatrix}$$ for some block matrices $A \in \mathbb{Z}^{r \times s}$ and $B \in \mathbb{Z}^{r \times t}$. An IP of this specific structure is called an $n$-fold IP. This class of IPs has found numerous applications in the area of string algorithms [23], social choice games [12,24] and scheduling [18,22]. State-of-the-art algorithms compute a solution of an $n$-fold IP in time $\mathrm{poly}(n, t) \cdot \Delta^{O(r^2 s)}$ [9,19,26], where $\Delta$ is the largest entry in matrices $A$ and $B$.

Two-Stage Stochastic Integer Programming
Stochastic programming deals with uncertainty of decision making over time [20]. One of the basic models in stochastic programming is 2-stage stochastic programming. In this model one has to decide on a solution at the first stage, and in the second stage there is uncertainty about which of $n$ possible scenarios happens. Each of the $n$ possible scenarios might have a different optimal solution, and the goal is to minimize the costs of the solution of the first stage in addition to the expected costs of the solution of the second stage. In the case that said scenarios can be modeled by an (integer) linear program, we are talking about 2-stage stochastic (integer) linear programs. 2-stage stochastic linear programs that do not contain any integer variable are well understood (we refer to standard textbooks [3,20]). In contrast, 2-stage stochastic programs that contain integer variables are hard to solve and are a topic of ongoing research. Typically, those IPs are investigated in the context of decomposition-based methods (we refer to a tutorial [27] or a survey [31] on the topic). For recent progress on 2-stage stochastic programs we refer to [1,4,31]. The interest in solving 2-stage stochastic (I)LPs efficiently stems from their wide range of applications, for example in modeling manufacturing processes [8] or energy planning [16]. In this paper we consider 2-stage stochastic IPs with only integral variables. These so-called pure integral 2-stage stochastic IPs have also been considered in the literature from a practical perspective (see [13,33]). The considered IP is then of the form $$\max\, c^T x \quad \text{s.t.} \quad \mathcal{A} x = b,\quad \ell \le x \le u,\quad x \in \mathbb{Z}^{s+nt} \tag{1}$$ for given objective vector $c \in \mathbb{Z}^{s+nt}_{\ge 0}$ and upper and lower bounds $\ell, u \in \mathbb{Z}^{s+nt}_{\ge 0}$. The constraint matrix $\mathcal{A}$ has the shape $$\mathcal{A} = \begin{pmatrix} A^{(1)} & B^{(1)} & 0 & \cdots & 0 \\ A^{(2)} & 0 & B^{(2)} & \ddots & \vdots \\ \vdots & \vdots & \ddots & \ddots & 0 \\ A^{(n)} & 0 & \cdots & 0 & B^{(n)} \end{pmatrix}$$
for given block matrices A (1) , . . . , A (n) ∈ Z r×s and B (1) , . . . , B (n) ∈ Z r×t . Typically, 2-stage stochastic IPs are written in a slightly different (equivalent) form that explicitly involves the scenarios and the probability distribution of the scenarios of the second stage. In this presented form, roughly speaking, the solution for the first stage scenario is encoded in the variables corresponding to vertical block matrices. A solution for each of the second stage scenarios is encoded in the variables corresponding to one of the diagonal block matrices and the expectation for the second stage scenarios can be encoded in a linear objective function. Since we do not rely on known techniques of stochastic programming in this paper, we omit the technicalities surrounding 2-stage stochastic IPs and simply refer to a survey for further details [31]. Despite their similarity, it seems that 2-stage IPs are significantly harder to solve than n-fold IPs. While it is known that the 2-stage stochastic IP with constraint matrix S can be solved in running time of the form poly(n) · f (r, s, t, ∆) for some computable function f , which was developed by Hemmecke and Schultz [17], the actual dependency on the parameters r, s, t, ∆ was unknown (we elaborate on this further in the coming section). Their algorithm is based on the augmentation framework which we also discuss in the following section.
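The block layout described above, with the matrices $A^{(i)}$ stacked on the vertical line and the matrices $B^{(i)}$ placed on the diagonal, can be made concrete with a small sketch. The function name `two_stage_matrix` and the example blocks are illustrative assumptions and not part of the paper.

```python
def two_stage_matrix(A_blocks, B_blocks):
    """Assemble a 2-stage stochastic constraint matrix from its blocks.

    A_blocks[i] is an r x s matrix (vertical line) and B_blocks[i] is an
    r x t matrix (diagonal). All matrices are plain lists of rows.
    """
    n = len(A_blocks)
    t = len(B_blocks[0][0])
    rows = []
    for i, (A, B) in enumerate(zip(A_blocks, B_blocks)):
        for a_row, b_row in zip(A, B):
            # Zero padding places B^(i) in the i-th diagonal position.
            left = [0] * (t * i)
            right = [0] * (t * (n - 1 - i))
            rows.append(list(a_row) + left + list(b_row) + right)
    return rows


# Hypothetical example: n = 3 scenarios, r = 1, s = 2, t = 1.
A_blocks = [[[1, 2]], [[3, 4]], [[5, 6]]]
B_blocks = [[[7]], [[8]], [[9]]]
M = two_stage_matrix(A_blocks, B_blocks)
for row in M:
    print(row)
```

The resulting matrix has $rn$ rows and $s + nt$ columns, which matches the dimensions of the constraint matrix stated in the abstract.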

The Augmentation Framework
Suppose we have an initial feasible solution $z_0$ of an IP $\max\{c^T x \mid Ax = b,\, l \le x \le u,\, x \in \mathbb{Z}^n\}$ and our goal is to find an optimal solution. The idea behind the augmentation framework (see [7]) is to compute an augmenting (integral) vector $y$ in the kernel, i.e. $y \in \ker(A)$ with $c^T y > 0$. A new solution $z'$ with improved objective can then be defined by $z' = z_0 + \lambda y$ for a suitable $\lambda \in \mathbb{Z}_{\ge 0}$. This procedure can be iterated until a solution with optimal objective is eventually obtained.
We call an integer vector $y \in \ker(A)$ a cycle. A cycle can be decomposed if there exist integral vectors $u, v \in \ker(A) \setminus \{0\}$ with $y = u + v$ and $u_i \cdot v_i \ge 0$ for all $i$ (i.e. the vectors are sign-compatible with $y$). An integral vector $y \in \ker(A)$ that cannot be decomposed is called a Graver element [14], or we simply say that it is indecomposable. The set of all indecomposable elements is called the Graver basis.
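The definition of indecomposable cycles can be checked by brute force on a tiny matrix. The following sketch enumerates all non-zero kernel vectors inside a box and keeps those that have no sign-compatible decomposition. The function names and the example matrix $(1\ 2\ 1)$ are assumptions chosen for illustration, and the enumeration is only practical for very small instances.

```python
from itertools import product


def kernel_vectors(A, bound):
    """All non-zero integer vectors x with A x = 0 and |x_i| <= bound."""
    n = len(A[0])
    result = []
    for x in product(range(-bound, bound + 1), repeat=n):
        if any(x) and all(sum(a * xi for a, xi in zip(row, x)) == 0 for row in A):
            result.append(x)
    return result


def graver_basis(A, bound):
    """Indecomposable cycles of A inside the box [-bound, bound]^n.

    If y = u + v is a sign-compatible decomposition, then |u_i| <= |y_i| and
    |v_i| <= |y_i|, so searching for u and v inside the same box is sufficient.
    """
    kernel = kernel_vectors(A, bound)
    kernel_set = set(kernel)
    basis = []
    for y in kernel:
        decomposable = False
        for u in kernel:
            v = tuple(yi - ui for yi, ui in zip(y, u))
            # v must be a non-zero cycle that is sign-compatible with u.
            if v in kernel_set and all(ui * vi >= 0 for ui, vi in zip(u, v)):
                decomposable = True
                break
        if not decomposable:
            basis.append(y)
    return sorted(basis)


graver = graver_basis([[1, 2, 1]], 2)
for g in graver:
    print(g)
```

For this example the search returns four vectors together with their negatives, which is the full Graver basis of the matrix because all of its elements have infinity-norm at most 2.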
The power of the augmentation framework is based on the observation that the size of Graver elements and therefore also the size of the Graver basis can be bounded. With the help of these bounds, good augmenting steps can be computed by a dynamic program and finally the corresponding IP can be solved efficiently.
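The augmentation procedure can be illustrated on a toy instance. The sketch below repeatedly searches the kernel for the best augmenting vector of bounded infinity-norm and applies it with the largest feasible step length. The brute-force search stands in for the dynamic program mentioned above; the function name `augment`, the parameter `norm_bound` and the example instance are assumptions made for illustration.

```python
from itertools import product


def augment(A, c, l, u, z, norm_bound):
    """Improve a feasible solution z of max{c^T x | Ax = b, l <= x <= u}.

    Candidate steps are all y with A y = 0 and |y_i| <= norm_bound, so the
    equation A x = b is preserved automatically.
    """
    n = len(z)
    while True:
        best_gain, best_z = 0, None
        for y in product(range(-norm_bound, norm_bound + 1), repeat=n):
            if any(sum(a * yi for a, yi in zip(row, y)) != 0 for row in A):
                continue  # y is not in the kernel
            gain = sum(ci * yi for ci, yi in zip(c, y))
            if gain <= 0:
                continue  # y does not improve the objective
            # Largest step length lam such that l <= z + lam * y <= u.
            lam = min(
                (u[i] - z[i]) // y[i] if y[i] > 0 else (z[i] - l[i]) // (-y[i])
                for i in range(n) if y[i] != 0
            )
            if lam * gain > best_gain:
                best_gain = lam * gain
                best_z = [zi + lam * yi for zi, yi in zip(z, y)]
        if best_z is None:
            return z  # no augmenting step exists within the norm bound
        z = best_z


# Hypothetical instance: max 3x1 + 2x2 + x3 with x1 + x2 + x3 = 4, 0 <= x <= 3.
result = augment([[1, 1, 1]], [3, 2, 1], [0, 0, 0], [3, 3, 3], [0, 1, 3], 1)
print(result)
```

The norm bound 1 is sufficient here because the Graver elements of the matrix $(1\ 1\ 1)$ are the differences of two unit vectors.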
In the case that the constraint matrix has a very specific structure, one can sometimes show improved bounds. Specifically, if the constraint matrix $\mathcal{A}$ has a 2-stage stochastic shape with identical block matrices in the vertical and diagonal line, then Hemmecke and Schultz [17] were able to prove a bound for the size of Graver elements that only depends on the parameters $r$, $s$, $t$ and $\Delta$. The presented bound is an existential result and uses so-called saturation results from commutative algebra. As Maclagan's theorem is used in the proof of the bound, no explicit function can be derived. It is only known that the dependency on the parameters is lower bounded by Ackermann's function [28]. This implies that the implicit bound for the size of Graver elements by Hemmecke and Schultz cannot be improved beyond an Ackermannian dependency in the parameters $r$, $s$, $t$ and $\Delta$.
In a very recent paper it was even conjectured that an algorithm with an explicit bound on parameters r, s, t and ∆ in the running time to solve IPs of the form (1) does not exist [25].
Very recently, improved bounds for Graver elements of general matrices and matrices with specific structure like n-fold [9] or 4-block structure [5] were developed. They are based on the Steinitz Lemma, which was previously also used by Eisenbrand and Weismantel [10] in the context of integer programming.
Lemma 1 (Steinitz [15,32]). Let $v_1, \dots, v_n \in \mathbb{R}^m$ be vectors with $\|v_i\|_\infty \le \Delta$ for $1 \le i \le n$. Assuming that $\sum_{i=1}^{n} v_i = 0$, there is a permutation $\Pi$ such that for each $k \in \{1, \dots, n\}$ the norm of the partial sum $\big\|\sum_{i=1}^{k} v_{\Pi(i)}\big\|_\infty$ is bounded by $m\Delta$.
The Steinitz Lemma was used by Eisenbrand, Hunkenschröder and Klein [9] to bound the size of Graver elements for a given matrix $A$. As we use the following theorem and its technique in this paper, we give a brief sketch of its proof.
Theorem 1 (Eisenbrand, Hunkenschröder, Klein [9]). Let $A \in \mathbb{Z}^{m \times n}$ be an integer matrix where every entry of $A$ is bounded by $\Delta$ in absolute value. Let $g \in \mathbb{Z}^n$ be an element of the Graver basis of $A$. Then $\|g\|_1 \le (2m\Delta + 1)^m$.
Proof. Consider the sequence of vectors $v_1, \dots, v_{\|g\|_1}$ consisting of $g_i$ copies of the $i$-th column of $A$ if $g_i$ is positive and $|g_i|$ copies of the negative of the $i$-th column of $A$ if $g_i$ is negative. As $g$ is a Graver element and hence $Ag = 0$, we obtain that $v_1 + \dots + v_{\|g\|_1} = 0$. Using the Steinitz Lemma above, there exists a reordering $u_1 + \dots + u_{\|g\|_1}$ of the vectors such that each partial sum $p_k = \sum_{i=1}^{k} u_i$ satisfies $\|p_k\|_\infty \le \Delta m$ for each $k \le \|g\|_1$. Suppose by contradiction that $\|g\|_1 > (2m\Delta + 1)^m$. As there are at most $(2m\Delta + 1)^m$ integer points of infinity-norm at most $m\Delta$, by the pigeonhole principle there exist two partial sums that sum up to the same value. However, this means that $g$ can be decomposed and hence cannot be a Graver element.
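The reordering used in the proof can be illustrated on a small example. The following sketch simply tries all orderings of a zero-sum sequence until one keeps every partial sum within the bound $m\Delta$ of the Steinitz Lemma. This exhaustive search is an assumption made for illustration and is not the constructive argument behind the lemma; the function name `steinitz_order` and the example vectors are hypothetical.

```python
from itertools import permutations


def steinitz_order(vectors, bound):
    """Return an ordering whose prefix sums have infinity-norm <= bound.

    Tries all permutations, so this is only usable for a handful of vectors.
    Returns None if no such ordering exists.
    """
    m = len(vectors[0])
    for order in permutations(vectors):
        prefix = [0] * m
        ok = True
        for v in order:
            prefix = [p + vi for p, vi in zip(prefix, v)]
            if max(abs(p) for p in prefix) > bound:
                ok = False
                break
        if ok:
            return list(order)
    return None


# Six vectors in dimension m = 2 with entries bounded by Delta = 2, summing to 0.
vectors = [(2, 2), (2, 2), (2, 2), (-2, -2), (-2, -2), (-2, -2)]
order = steinitz_order(vectors, bound=2 * 2)  # bound m * Delta = 4
print(order)
```

In the given order the third partial sum is $(6, 6)$ and violates the bound, while the returned ordering keeps all partial sums inside the box of radius $m\Delta$.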

Our Results:
The main result of this paper is to prove a new structural lemma that enhances the toolset of the augmentation framework. We show that this Lemma can be directly used to obtain an explicit bound for Graver elements of the constraint matrix of 2-stage stochastic IPs. But we think that it might also be of independent interest as it provides interesting structural insights in vector sets.
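Lemma 2. Given multisets $T_1, \dots, T_n \subset \mathbb{Z}^d_{\ge 0}$ where all elements $t \in T_i$ have bounded size $\|t\|_\infty \le \Delta$. Assuming that the total sum of all elements in each set is equal, i.e. $$\sum_{t \in T_1} t = \dots = \sum_{t \in T_n} t,$$ then there exist non-empty submultisets $S_1 \subseteq T_1, \dots, S_n \subseteq T_n$ of bounded size $|S_i| \le (d\Delta)^{O(d \Delta^{d^2})}$ such that $$\sum_{s \in S_1} s = \dots = \sum_{s \in S_n} s.$$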
Note that this lemma only makes sense when we consider the $T_i$ to be multisets, as the number of different sets without allowing multiplicity of vectors would be bounded by $2^{\Delta^d}$.
A geometric interpretation of the lemma is given in the following figure. On the left side we have $n$ paths consisting of sets of vectors and all paths end at the same point $b$.
Then the Lemma shows that there always exist permutations of the vectors of each path such that all paths meet at a point $b'$ of bounded size. The bound depends only on $\Delta$ and the dimension $d$ and is thus independent of the number of paths $n$ and the size of $b$. For the proof of the Lemma we need basic properties for the intersection of integer cones. We show that those properties can be obtained by using the Steinitz Lemma.
We show that Lemma 2 has strong implications in the context of integer programming. Using the Lemma, we can show that the size of Graver elements of matrix $\mathcal{A}$ is bounded by $(rs\Delta)^{O(rs(2r\Delta+1)^{rs^2})}$. Within the framework of Graver augmenting steps the bound implies that 2-stage stochastic IPs can be solved in time $n^2 t^2 \varphi \log^2(nt) \cdot (rs\Delta)^{O(rs^2 (2r\Delta+1)^{rs^2})}$, where $\varphi$ is the encoding length of the instance. With this we improve upon an implicit bound for the size of the Graver elements of 2-stage stochastic constraint matrices due to Hemmecke and Schultz [17].
Furthermore, we show that our Lemma can also be applied to bound the size of Graver elements of constraint matrices that have a multi-stage stochastic structure. Multi-stage stochastic IPs are a well known generalization of 2-stage stochastic IPs. By this, we improve upon a result of Aschenbrenner and Hemmecke [2].
To complement our results for the upper bound, we also present a lower bound for the size of Graver elements of matrices that have a 2-stage stochastic IP structure. The given lower bound is for the case of $r = 1$. In this case we present a matrix where the Graver elements have a size of $2^{\Omega(\Delta^s)}$.

The Complexity of Two-Stage Stochastic IPs
First, we argue about the application of our main Lemma 2. In the following we show that the infinity-norm of Graver elements of matrices with a 2-stage stochastic structure can be bounded by using the lemma.
Given the block structure of IP (1), we define for a vector $y \in \mathbb{Z}^{s+nt}$ with $\mathcal{A}y = 0$ the vector $y^{(0)} \in \mathbb{Z}^s_{\ge 0}$, which consists of the entries of $y$ that belong to the vertical block matrices $A^{(i)}$, and we define $y^{(i)} \in \mathbb{Z}^t_{\ge 0}$ to be the entries of $y$ that belong to the diagonal block matrix $B^{(i)}$.

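Theorem 2. Let $y$ be a Graver element of the constraint matrix $\mathcal{A}$ of IP (1). Then $\|y\|_\infty$ is bounded by $(rs\Delta)^{O(rs(2r\Delta+1)^{rs^2})}$.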
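Proof. Let $y \in \mathbb{Z}^{s+nt}_{\ge 0}$ be a cycle of IP (1), i.e. $\mathcal{A}y = 0$. Consider a submatrix of the matrix $\mathcal{A}$ denoted by $(A^{(i)} B^{(i)}) \in \mathbb{Z}^{r \times (s+t)}$ consisting of the block matrix $A^{(i)}$ of the vertical line and the block matrix $B^{(i)}$ of the diagonal line. Consider further the corresponding part $\binom{y^{(0)}}{y^{(i)}}$ of $y$. Since $\mathcal{A}y = 0$, we have $(A^{(i)} B^{(i)}) \binom{y^{(0)}}{y^{(i)}} = 0$, so this vector is a cycle of the submatrix and can be decomposed into a multiset $C_i$ of indecomposable cycles, i.e. $\binom{y^{(0)}}{y^{(i)}} = \sum_{c \in C_i} c$, where by Theorem 1 each $c \in C_i$ satisfies $\|c\|_1 \le (2r\Delta+1)^r$. Since all matrices $(A^{(i)} B^{(i)})$ share the same set of variables in the overlapping block matrices $A^{(i)}$, we cannot choose indecomposable elements independently in each block to obtain a cycle of smaller size for the entire matrix $\mathcal{A}$. Let $p : \mathbb{Z}^{s+t} \to \mathbb{Z}^s$ be the projection that maps a cycle $c$ of a block matrix $(A^{(i)} B^{(i)})$ to the variables in the overlapping part, i.e. $p(c) = p\binom{c^{(0)}}{c^{(i)}} = c^{(0)}$. In the case that $\|y\|_\infty$ is large we will show that we can find a cycle $\bar{y}$ of smaller length with $\bar{y} \le y$. In order to obtain this cycle $\bar{y}$ for the entire matrix $\mathcal{A}$, we have to find a multiset of cycles $\bar{C}_i \subseteq C_i$ in each block matrix $(A^{(i)} B^{(i)})$ such that the sum of the projected parts is identical, i.e. $\sum_{c \in \bar{C}_1} p(c) = \dots = \sum_{c \in \bar{C}_n} p(c)$. We apply Lemma 2 to the multisets $p(C_1), \dots, p(C_n)$, where $p(C_i) = \{p(c) \mid c \in C_i\}$ is the multiset of projected elements. Note that $\sum_{x \in p(C_1)} x = \dots = \sum_{x \in p(C_n)} x = y^{(0)}$ and hence the conditions to apply Lemma 2 are fulfilled. Since every $\binom{y^{(0)}}{y^{(i)}}$ is decomposed in a sign-compatible way, every entry of the vectors in $p(C_i)$ has the same sign. Hence we can flip the negative signs in order to apply Lemma 2.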
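By the Lemma, there exist submultisets $S_1 \subseteq p(C_1), \dots, S_n \subseteq p(C_n)$ of bounded size with $\sum_{x \in S_1} x = \dots = \sum_{x \in S_n} x$, and hence submultisets $\bar{C}_1 \subseteq C_1, \dots, \bar{C}_n \subseteq C_n$ with $p(\bar{C}_i) = S_i$. We define the cycle $\bar{y} \le y$ by $\bar{y}^{(i)} = \sum_{c \in \bar{C}_i} \bar{p}(c)$ for every $i > 0$, where $\bar{p}(c)$ is the projection that maps a cycle $c \in \bar{C}_i$ to the part that belongs to matrix $B^{(i)}$, and by $\bar{y}^{(0)} = \sum_{c \in \bar{C}_i} p(c)$ for an arbitrary $i > 0$, which is well defined as the sum is identical for all multisets $\bar{C}_i$. As the cardinality of the multisets $\bar{C}_i$ is bounded, we know by construction of $\bar{y}$ that the one-norm of every $\bar{y}^{(i)}$ is bounded by $$\|\bar{y}^{(i)}\|_1 \le (2r\Delta+1)^r \cdot |\bar{C}_i| \le (rs\Delta)^{O(rs(2r\Delta+1)^{rs^2})}.$$ This directly implies the infinity-norm bound for $y$ as well.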

Computing the Augmenting Step
As a direct consequence of the bound for the size of the Graver elements, we obtain by the framework of augmenting steps an efficient algorithm to compute an optimal solution of a 2-stage stochastic IP. For this we can use the algorithm by Hemmecke and Schultz [17] or a more recent result by Koutecky, Levin and Onn [26] which gives a strongly polynomial algorithm. Using these algorithms directly would result in an algorithm with a running time of the form poly(n) · f (r, s, t, ∆) for some doubly exponential function involving parameters r, s, t and ∆. However, in the following we explain briefly how the augmenting step can be computed in order to obtain an algorithm with a running time that is polynomial in t.
Suppose we are given a feasible solution $z \in \mathbb{Z}^{s+nt}_{\ge 0}$ of IP (1) and a multiple $\lambda \in \mathbb{Z}_{\ge 0}$ (which can be guessed). A core ingredient in the augmenting framework is to find an augmenting step. Therefore, we have to compute a Graver element $y \in \ker(\mathcal{A})$ such that $z + \lambda y$ is a feasible solution of IP (1) and the objective $\lambda c^T y$ is maximal over all Graver elements.
Let $L$ be the bound on $\|y\|_\infty$ for Graver elements $y$ that we obtain from Theorem 2. To find the optimal augmenting step, it is sufficient to solve the IP $\max\{c^T x \mid \mathcal{A}x = 0,\, \bar{l} \le x \le \bar{u},\, \|x\|_\infty \le L\}$ for modified upper and lower bounds $\bar{l}, \bar{u}$ according to the multiple $\lambda$ and the feasible solution $z$. Having the best augmenting step at hand, one can show that the objective value improves by a factor of $1 - \frac{1}{2n}$. This is due to the fact that the difference $z - z^*$ between $z$ and an optimal solution $z^*$ can be represented by a sign-compatible sum of Graver elements. In the following we briefly show how to solve the IP $\max\{c^T x \mid \mathcal{A}x = 0,\, \bar{l} \le x \le \bar{u},\, \|x\|_\infty \le L\}$ in order to compute the augmenting step. The algorithm works as follows:
• Compute for every $y^{(0)}$ with $\|y^{(0)}\|_1 \le L$ the objective value of the cycle $y$ consisting of $y^{(0)}, \bar{y}^{(1)}, \dots, \bar{y}^{(n)}$, where $\bar{y}^{(i)}$ for $i > 0$ are the optimal solutions of the IP $$\max\, (c^{(i)})^T \bar{y}^{(i)} \quad \text{s.t.} \quad B^{(i)} \bar{y}^{(i)} = -A^{(i)} y^{(0)},\quad \bar{l}^{(i)} \le \bar{y}^{(i)} \le \bar{u}^{(i)},\quad \|\bar{y}^{(i)}\|_\infty \le L,$$ where $\bar{l}^{(i)}, \bar{u}^{(i)}$ are the lower and upper bounds for the variables $\bar{y}^{(i)}$ and $c^{(i)}$ is their corresponding objective vector. Note that the first set of constraints of the IP ensures that $\mathcal{A}y = 0$. The IPs can be solved with the algorithm of Eisenbrand and Weismantel [10] in time $O(\Delta^{O(r^2)})$ each.
• Return the cycle with maximum objective.
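The two steps above can be sketched as follows. The sketch enumerates the first-stage part $y^{(0)}$ and then solves each scenario independently, since the scenarios only interact through $y^{(0)}$. A brute-force search stands in for the algorithm of Eisenbrand and Weismantel, and all function names, parameter names and the small instance are assumptions made for illustration. The bounds passed to the function are assumed to already respect the norm bound $L$.

```python
from itertools import product


def matvec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def best_scenario_part(B, rhs, c, lo, hi):
    """Brute-force stand-in for one second-stage IP:
    max c^T y  s.t.  B y = rhs, lo <= y <= hi. Returns (value, y) or None."""
    best = None
    for y in product(*(range(a, b + 1) for a, b in zip(lo, hi))):
        if matvec(B, y) == rhs:
            value = sum(ci * yi for ci, yi in zip(c, y))
            if best is None or value > best[0]:
                best = (value, list(y))
    return best


def augmenting_step(A_blocks, B_blocks, c0, cs, lo0, hi0, los, his, L):
    """Enumerate y0 with one-norm at most L and complete it per scenario."""
    best = None
    ranges = [range(max(a, -L), min(b, L) + 1) for a, b in zip(lo0, hi0)]
    for y0 in product(*ranges):
        if sum(abs(v) for v in y0) > L:
            continue
        total = sum(ci * yi for ci, yi in zip(c0, y0))
        parts = []
        for A, B, c, lo, hi in zip(A_blocks, B_blocks, cs, los, his):
            rhs = [-v for v in matvec(A, y0)]  # B^(i) y^(i) = -A^(i) y^(0)
            part = best_scenario_part(B, rhs, c, lo, hi)
            if part is None:
                break  # this y0 cannot be completed to a cycle
            total += part[0]
            parts.append(part[1])
        else:
            if best is None or total > best[0]:
                best = (total, list(y0), parts)
    return best


# Hypothetical instance with n = 2 scenarios, r = 1, s = 1, t = 2 and L = 2.
A_blocks = [[[1]], [[1]]]
B_blocks = [[[1, 2]], [[1, 2]]]
step = augmenting_step(A_blocks, B_blocks, [1], [[1, 0], [0, 1]],
                       [-2], [2], [[-2, -2], [-2, -2]], [[2, 2], [2, 2]], 2)
print(step)
```

The number of candidates for $y^{(0)}$ depends only on $L$ and $s$, and each candidate requires $n$ independent second-stage problems, which is why the dependence on $n$ stays polynomial.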
Putting all things together, we obtain the following theorem regarding the worst-case complexity for solving 2-stage stochastic IPs. For details regarding the remaining parts of the augmenting framework like finding an initial feasible solution or a bound on the required augmenting steps we refer to [9] and [26].
Theorem 3. A 2-stage stochastic IP of the form (1) can be solved in time $$n^2 t^2 \varphi \log^2(nt) \cdot (rs\Delta)^{O(rs^2 (2r\Delta+1)^{rs^2})},$$ where $\varphi$ is the encoding length of the IP.

About the Intersection of Integer Cones
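Before we are ready to prove our main Lemma 2, we need two helpful observations about the intersection of integer cones. An integer cone is defined for a given (finite) generating set $B \subset \mathbb{Z}^d$ of elements by $$\mathrm{int.cone}(B) = \Big\{ \sum_{b \in B} \lambda_b b \;\Big|\; \lambda \in \mathbb{Z}^B_{\ge 0} \Big\}.$$ Note that the intersection of two integer cones is again an integer cone, as the intersection is closed under addition and scalar multiplication of positive integers. We say that an element $b$ of an integer cone $\mathrm{int.cone}(B)$ is indecomposable if there do not exist elements $b_1, b_2 \in \mathrm{int.cone}(B) \setminus \{0\}$ such that $b = b_1 + b_2$.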
We can assume that the generating set B of an integer cone consists just of the set of indecomposable elements as any decomposable element can be removed from the generating set.
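In the following we allow to use a vector set $B$ as a matrix and vice versa, where the elements of the set $B$ are the columns of the matrix $B$. This way we can multiply $B$ with a vector, i.e. $B\lambda = \sum_{b \in B} \lambda_b b$ for some $\lambda \in \mathbb{Z}^B$.
Lemma 3. Given two integer cones $\mathrm{int.cone}(B^{(1)})$ and $\mathrm{int.cone}(B^{(2)})$ for some generating sets $B^{(1)}, B^{(2)} \subset \mathbb{Z}^d$ with $\|x\|_\infty \le \Delta$ for each element $x \in B^{(1)} \cup B^{(2)}$. Consider the integer cone of the intersection $$\mathrm{int.cone}(\hat{B}) = \mathrm{int.cone}(B^{(1)}) \cap \mathrm{int.cone}(B^{(2)})$$ for some generating set of elements $\hat{B}$. Then for each generating element $b \in \hat{B}$ of the intersection cone with $b = B^{(1)}\lambda = B^{(2)}\gamma$ for some $\lambda \in \mathbb{Z}^{B^{(1)}}_{\ge 0}$ and $\gamma \in \mathbb{Z}^{B^{(2)}}_{\ge 0}$, we have that $\|\lambda\|_1 + \|\gamma\|_1 \le (2d\Delta + 1)^d$.
Proof. Consider the representation of a point $b = B^{(1)}\lambda = B^{(2)}\gamma$ in the intersection of $\mathrm{int.cone}(B^{(1)})$ and $\mathrm{int.cone}(B^{(2)})$. The sum $v_1 + \dots + v_{\|\lambda\|_1 + \|\gamma\|_1}$ consisting of $\lambda_i$ copies of the $i$-th element of $B^{(1)}$ and $\gamma_i$ copies of the negative of the $i$-th element of $B^{(2)}$ equals zero. Using Steinitz' Lemma, there exists a reordering $u_1 + \dots + u_{\|\lambda\|_1 + \|\gamma\|_1}$ of the vectors such that each partial sum satisfies $\big\|\sum_{i=1}^{k} u_i\big\|_\infty \le d\Delta$ for each $k \le \|\lambda\|_1 + \|\gamma\|_1$. If $\|\lambda\|_1 + \|\gamma\|_1 > (2d\Delta + 1)^d$ then by the pigeonhole principle, there exist two partial sums of the same value. Hence, there are two sequences that sum up to zero, i.e. there exist non-zero vectors $\lambda', \lambda'' \in \mathbb{Z}^{B^{(1)}}_{\ge 0}$ with $\lambda = \lambda' + \lambda''$ and $\gamma', \gamma'' \in \mathbb{Z}^{B^{(2)}}_{\ge 0}$ with $\gamma = \gamma' + \gamma''$ such that $B^{(1)}\lambda' = B^{(2)}\gamma'$ and $B^{(1)}\lambda'' = B^{(2)}\gamma''$ are elements of the intersection cone. This implies that $b$ can be decomposed in the intersection cone.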
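To make the notions of integer cone, intersection and indecomposable element concrete, the following sketch enumerates the points of an integer cone inside a box and filters the indecomposable points of an intersection. It assumes non-negative, non-zero generators, so that every summand of a point in the box lies in the box as well. The function names and the generating sets are hypothetical.

```python
def cone_points(generators, bound):
    """Points of int.cone(generators) inside the box [0, bound]^d.

    Assumes all generators are non-negative and non-zero integer vectors.
    """
    d = len(generators[0])
    origin = (0,) * d
    seen = {origin}
    stack = [origin]
    while stack:
        x = stack.pop()
        for g in generators:
            y = tuple(a + b for a, b in zip(x, g))
            if max(y) <= bound and y not in seen:
                seen.add(y)
                stack.append(y)
    return seen


def indecomposable(points):
    """Non-zero points that are not a sum of two non-zero points of the set."""
    nonzero = {x for x in points if any(x)}
    result = set()
    for x in nonzero:
        parts = (tuple(a - b for a, b in zip(x, u)) for u in nonzero)
        if not any(v in nonzero for v in parts):
            result.add(x)
    return result


# Dimension 1: the intersection of int.cone(2) and int.cone(3).
inter_1d = cone_points([(2,)], 30) & cone_points([(3,)], 30)
gens_1d = indecomposable(inter_1d)

# Dimension 2: two cones with different generating sets.
inter_2d = cone_points([(1, 0), (0, 2)], 6) & cone_points([(2, 0), (0, 1)], 6)
gens_2d = indecomposable(inter_2d)
print(gens_1d, gens_2d)
```

In the one-dimensional example the only generating element is 6, which is represented by $\lambda = 3$ copies of 2 and $\gamma = 2$ copies of 3, so $\|\lambda\|_1 + \|\gamma\|_1 = 5$ stays below the bound $(2d\Delta+1)^d = 7$ of Lemma 3 for $d = 1$ and $\Delta = 3$.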
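Using a similar argumentation as in the previous lemma, we can consider the intersection of several integer cones. Note that we cannot simply use the above Lemma inductively as this would lead to worse bounds.
Lemma 4. Consider integer cones $\mathrm{int.cone}(B^{(1)}), \dots, \mathrm{int.cone}(B^{(\ell)})$ for some generating sets $B^{(1)}, \dots, B^{(\ell)} \subset \mathbb{Z}^d$ with $\|x\|_\infty \le \Delta$ for each element $x \in B^{(i)}$. Consider the integer cone of the intersection $$\mathrm{int.cone}(\hat{B}) = \bigcap_{i=1}^{\ell} \mathrm{int.cone}(B^{(i)})$$ for some generating set of elements $\hat{B}$. Then for each generating element $b \in \hat{B}$ with $b = B^{(i)}\lambda^{(i)}$ for some $\lambda^{(i)} \in \mathbb{Z}^{B^{(i)}}_{\ge 0}$, we have $\|\lambda^{(i)}\|_1 \le O((d\Delta)^{d(\ell-1)})$ for each $i$.
Proof. Consider the sum of vectors $v^{(k)}_1 + \dots + v^{(k)}_{\|\lambda^{(k)}\|_1}$ consisting of $\lambda^{(k)}_j$ copies of the $j$-th element of $B^{(k)}$. By adding $0$ vectors to the sums we can assume without loss of generality that every sequence has the same number of summands $L = \max_{i=1,\dots,\ell} \|\lambda^{(i)}\|_1$.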
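Claim: There exists a reordering $u^{(k)}_1 + \dots + u^{(k)}_L$ for each of these sums such that each partial sum $p^{(k)}_m = \sum_{i=1}^{m} u^{(k)}_i$ is close to the line between $0$ and $b$, more precisely $$\Big\| p^{(k)}_m - \frac{m}{L} b \Big\|_\infty \le 4\Delta(d+1)$$ for each $m \le L$ and each $k \le \ell$.
To see this, we construct the sequence that consists of the vectors from $B^{(k)}$ and subtract $L$ fractional parts $\frac{1}{L} b$ of the vector $b$. To count the number of vectors we use an additional component with weight $\Delta$ and define $\bar{v}_i = \binom{v^{(k)}_i}{\Delta}$ for the vectors of the sum and $\bar{b} = \binom{-b/L}{-\Delta}$ for the fractional parts, so that the $2L$ vectors sum up to zero. Hence we can apply the Steinitz Lemma to obtain a reordering $\bar{u}_1 + \dots + \bar{u}_{2L}$ for each sequence such that each partial sum satisfies $\big\|\sum_{i=1}^{m} \bar{u}_i\big\|_\infty \le 2\Delta(d+1)$ for each $m \le 2L$. Each partial sum that sums up to index $m$ contains $p$ vectors $\bar{v}_i$ and $q$ vectors $\bar{b}$. Furthermore, the $\Delta$ entry of each vector guarantees that $|p - q| \le 2(d+1)$. Since $\|b/L\|_\infty \le \Delta$, shifting the fractional part from $\frac{q}{L} b$ to $\frac{p}{L} b$ changes the partial sum by at most $2\Delta(d+1)$, which implies the statement of the claim.
Now consider the differences $p^{(1)}_m - p^{(k)}_m$ of the partial sums for $2 \le k \le \ell$. Using the claim from above, we can now argue that $\|p^{(1)}_m - p^{(k)}_m\|_\infty \le 8\Delta(d+1)$ for each $m \le L$. Hence the vector of all $\ell - 1$ differences takes at most $(16\Delta(d+1)+1)^{d(\ell-1)}$ distinct values. If $L$ exceeds this number, then by the pigeonhole principle there exist indices $m' < m''$ for which all differences coincide, so that $p^{(k)}_{m''} - p^{(k)}_{m'}$ is the same vector for every $k \le \ell$ and is therefore contained in the intersection of all cones. This implies that $b$ can be decomposed and is therefore not a generating element of $\bigcap_{i=1}^{\ell} \mathrm{int.cone}(B^{(i)})$.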

Proof of Lemma 2:
Using the results from the previous section, we are now finally able to prove our main Lemma 2.
To get an intuition for the problem, however, we start by giving a sketch of the proof for the 1-dimensional case. In this case, the multisets $T_i$ consist solely of natural numbers, i.e. $T_1, \dots, T_n \subset \mathbb{Z}_{\ge 0}$. Suppose that each set $T_i$ consists only of many copies of a single integral number $x_i \in \{1, \dots, \Delta\}$. Then it is easy to find a common multiple as $\frac{\Delta!}{1} \cdot 1 = \frac{\Delta!}{2} \cdot 2 = \dots = \frac{\Delta!}{\Delta} \cdot \Delta$. Hence one can choose the subsets consisting of $\frac{\Delta!}{x_i}$ copies of $x_i$. Now suppose that the multisets $T_i$ can be arbitrary. If $|T_i| \le \Delta \cdot \Delta! = \Delta^{O(\Delta)}$ we are done. But on the other hand, if $|T_i| \ge \Delta \cdot \Delta!$, by the pigeonhole principle there exists a single element $x_i \in \{1, \dots, \Delta\}$ for every $T_i$ that appears at least $\Delta!$ times. Then we can argue as in the previous case where we needed at most $\Delta!$ copies of a number $x_i \in \{1, \dots, \Delta\}$. This proves the lemma in the case $d = 1$.
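The 1-dimensional argument translates directly into a short procedure. The sketch below picks, for every multiset, a value $x_i$ that occurs at least $\Delta!/x_i$ times and returns that many copies, so that every chosen submultiset sums up to $\Delta!$. If some multiset contains no such value, it has fewer than $\Delta \cdot \Delta!$ elements and the sketch simply returns the full multisets, whose sums are equal by assumption. The function name and the example data are hypothetical.

```python
from collections import Counter
from math import factorial


def common_submultisets_1d(multisets, delta):
    """Submultisets with equal sums for multisets of numbers in {1, ..., delta}.

    Assumes that all given multisets have the same total sum.
    """
    target = factorial(delta)
    chosen = []
    for T in multisets:
        counts = Counter(T)
        # Look for a value x that appears at least delta!/x times in T.
        x = next((v for v in range(1, delta + 1) if counts[v] >= target // v), None)
        if x is None:
            # T is small, so the common sum is small: keep all multisets whole.
            return [list(S) for S in multisets]
        chosen.append([x] * (target // x))
    return chosen


# Hypothetical example with delta = 3: every multiset sums up to 12.
T = [[1] * 12, [2] * 6, [3] * 4, [1] * 6 + [3] * 2]
S = common_submultisets_1d(T, 3)
print([sum(s) for s in S])

# A small instance in which the fallback case applies.
S_small = common_submultisets_1d([[1, 2], [3]], 3)
```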
In the case of higher dimensions, the lemma seems much harder to prove. But in principle we use generalizations of the above techniques. Instead of single natural numbers however, we have to work with bases of corresponding basic feasible LP solutions and the intersection of the integer cone generated by those bases.
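In the proof we need the notion of a cone, which is simply the relaxation of an integer cone. For a generating set $B \subset \mathbb{Z}^d_{\ge 0}$, a cone is defined by $$\mathrm{cone}(B) = \Big\{ \sum_{b \in B} \lambda_b b \;\Big|\; \lambda \in \mathbb{R}^B_{\ge 0} \Big\}.$$
Proof. First, we describe the multisets $T_1, \dots, T_n \subset \mathbb{Z}^d_{\ge 0}$ by multiplicity vectors $\lambda^{(1)}, \dots, \lambda^{(n)} \in \mathbb{Z}^P_{\ge 0}$, where $P \subset \mathbb{Z}^d$ is the set of integer points $p$ with $\|p\|_\infty \le \Delta$. Each $\lambda^{(i)}_p$ thereby states the multiplicity of vector $p$ in $T_i$. Hence $\sum_{t \in T_i} t = \sum_{p \in P} \lambda^{(i)}_p p$ and our objective is to find vectors $y^{(1)}, \dots, y^{(n)} \in \mathbb{Z}^P_{\ge 0}$ with $y^{(i)} \le \lambda^{(i)}$ such that $\sum_{p \in P} y^{(1)}_p p = \dots = \sum_{p \in P} y^{(n)}_p p$. Consider the linear program $$\sum_{p \in P} x_p p = b, \quad x \in \mathbb{R}^P_{\ge 0}, \tag{2}$$ where $b = \sum_{t \in T_i} t$ is the common sum of the multisets, and let $x^{(1)}, \dots, x^{(\ell)}$ be all basic feasible solutions of LP (2) with corresponding bases $B^{(1)}, \dots, B^{(\ell)} \subseteq P$. In the following we prove two claims that correspond to the two previously described cases of the one-dimensional case. First, we consider the case that essentially each multiset $T_i$ corresponds to one of the basic feasible solutions $x^{(j)}$. In the 1-dimensional case this would mean that each set consists only of a single number. Note that the intersection of integer cones in dimension 1 is just the least common multiple, i.e. $\mathrm{int.cone}(z_1) \cap \mathrm{int.cone}(z_2) = \mathrm{int.cone}(\mathrm{lcm}(z_1, z_2))$ for some $z_1, z_2 \in \mathbb{Z}_{\ge 0}$.
Claim 1: If for all $i$ we have $\|x^{(i)}\|_1 > d \cdot O((d\Delta)^{d(\ell-1)})$ then there exist non-zero vectors $y^{(1)}, \dots, y^{(\ell)} \in \mathbb{Z}^d_{\ge 0}$ with $y^{(1)} \le x^{(1)}, \dots, y^{(\ell)} \le x^{(\ell)}$ and $B^{(1)} y^{(1)} = \dots = B^{(\ell)} y^{(\ell)}$.
Proof of the claim: Note that $B^{(i)} x^{(i)} = b$ and hence $b \in \mathrm{cone}(B^{(i)})$. In the following, our goal is to find a non-zero point $q \in \mathbb{Z}^d_{\ge 0}$ such that $q = B^{(1)} y^{(1)} = \dots = B^{(\ell)} y^{(\ell)}$ for some vectors $y^{(1)}, \dots, y^{(\ell)} \in \mathbb{Z}^d_{\ge 0}$. However, this means that $q$ has to be in the integer cone $\mathrm{int.cone}(B^{(i)})$ for every $1 \le i \le \ell$ and therefore in the intersection of all the integer cones, i.e. $q \in \bigcap_{i=1}^{\ell} \mathrm{int.cone}(B^{(i)})$. By Lemma 4 there exists a set of generating elements $\hat{B}$ of the intersection cone such that
• each generating vector $p \in \hat{B}$ can be represented by $p = B^{(i)} \lambda$ for some $\lambda \in \mathbb{Z}^d_{\ge 0}$ with $\|\lambda\|_1 \le O((d\Delta)^{d(\ell-1)})$ for each basis $B^{(i)}$.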
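As $b \in \mathrm{cone}(\hat{B})$ there exists a vector $\hat{x} \in \mathbb{R}^{\hat{B}}_{\ge 0}$ with $\hat{B}\hat{x} = b$. Our goal is to show that there exists a non-zero vector $q \in \hat{B}$ with $\hat{x}_q \ge 1$. In this case $b$ can be simply written as $b = q + q'$ for some $q' \in \mathrm{cone}(\hat{B})$. As $q$ and $q'$ are contained in the intersection of all cones, there exist for each generating set $B^{(j)}$ vectors $y^{(j)} \in \mathbb{Z}^{B^{(j)}}_{\ge 0}$ and $z^{(j)} \in \mathbb{R}^{B^{(j)}}_{\ge 0}$ such that $B^{(j)} y^{(j)} = q$ and $B^{(j)} z^{(j)} = q'$. Hence $x^{(j)} = y^{(j)} + z^{(j)}$ and we finally obtain that $x^{(j)} \ge y^{(j)}$ for $y^{(j)} \in \mathbb{Z}^{B^{(j)}}_{\ge 0}$, which shows the claim. Therefore it only remains to prove the existence of the point $q$ with $\hat{x}_q \ge 1$. By Lemma 4, each vector $p \in \hat{B}$ can be represented by $p = B^{(i)} x^{(p)}$ for some $x^{(p)} \in \mathbb{Z}^{B^{(i)}}_{\ge 0}$ with $\|x^{(p)}\|_1 \le O((d\Delta)^{d(\ell-1)})$. Hence, every $x^{(i)}$ can be written as $x^{(i)} = \sum_{p \in \hat{B}} x^{(p)} \hat{x}_p$ and we obtain a bound on $\|x^{(i)}\|_1$ assuming that for every $p \in \hat{B}$ we have $\hat{x}_p < 1$: $$\|x^{(i)}\|_1 \le \sum_{p \in \hat{B}} \|x^{(p)}\|_1 \hat{x}_p < d \cdot O((d\Delta)^{d(\ell-1)}).$$
The last inequality follows as we can assume by Caratheodory's theorem [30] that the number of non-zero components of $\hat{x}$ is less than or equal to $d$. Hence, if $\|x^{(i)}\|_1 \ge d \cdot O((d\Delta)^{d(\ell-1)})$ then there has to exist a vector $q \in \hat{B}$ with $\hat{x}_q \ge 1$, which proves the claim.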
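Claim 2: For every vector $\lambda^{(i)} \in \mathbb{Z}^P_{\ge 0}$ with $\sum_{p \in P} \lambda^{(i)}_p p = b$ there exists a basic feasible solution $x^{(j)}$ of LP (2) with basis $B^{(j)}$ such that $\frac{1}{\ell} x^{(j)} \le \lambda^{(i)}$.
Proof of the claim: The proof of the claim can be easily seen as each multiplicity vector $\lambda^{(i)}$ is also a solution of the linear program (2). By standard LP theory, we know that each solution of the LP is a convex combination of the basic feasible solutions $x^{(1)}, \dots, x^{(\ell)}$. Hence, each multiplicity vector $\lambda^{(i)}$ can be written as a convex combination of $x^{(1)}, \dots, x^{(\ell)}$, i.e. for each $\lambda^{(i)}$ there exists a $t \in \mathbb{R}^{\ell}_{\ge 0}$ with $\|t\|_1 = 1$ such that $\lambda^{(i)} = \sum_{j=1}^{\ell} t_j \bar{x}^{(j)}$, where $\bar{x}^{(j)}_p = x^{(j)}_p$ if $p \in B^{(j)}$ and $\bar{x}^{(j)}_p = 0$ otherwise. By the pigeonhole principle, there exists for each multiplicity vector $\lambda^{(i)}$ an index $j$ with $t_j \ge \frac{1}{\ell}$, which proves the claim.
Using the above two claims, we can now prove the claim of the lemma by showing that for each $\lambda^{(i)}$ there exists a vector $y^{(i)} \le \lambda^{(i)}$ with bounded 1-norm such that $\sum_{p \in P} y^{(1)}_p p = \dots = \sum_{p \in P} y^{(n)}_p p$. By Claim 2 we know that for each $\lambda^{(i)}$ ($1 \le i \le n$) we find one of the basic feasible solutions $x^{(j)}$ with $\frac{1}{\ell} x^{(j)} \le \lambda^{(i)}$. Applying the first claim to the vectors $\frac{1}{\ell} x^{(1)}, \dots, \frac{1}{\ell} x^{(\ell)}$, we obtain vectors $y^{(1)} \le \frac{1}{\ell} x^{(1)}, \dots, y^{(\ell)} \le \frac{1}{\ell} x^{(\ell)}$ with $B^{(1)} y^{(1)} = \dots = B^{(\ell)} y^{(\ell)}$. Hence, we find for each $\lambda^{(i)}$ a vector $y^{(j)} \in \mathbb{Z}^{B^{(j)}}_{\ge 0}$ with $y^{(j)} \le \lambda^{(i)}$. As $\sum_{p \in P} \lambda^{(i)}_p p = b = B^{(j)} x^{(j)}$ and every $p \in P$ is bounded by $\|p\|_\infty \le \Delta$, we know that $\|x^{(j)}\|_1 \ge \frac{1}{d\Delta} \|\lambda^{(i)}\|_1$ for every $i \le n$ and every $j \le \ell$. Hence if $\|\lambda^{(i)}\|_1 > \ell d\Delta \cdot d \cdot O((d\Delta)^{d(\ell-1)})$, then $\|\frac{1}{\ell} x^{(j)}\|_1 > d \cdot O((d\Delta)^{d(\ell-1)})$. Therefore, Claim 1 can be applied to find $y^{(j)} \le \frac{1}{\ell} x^{(j)}$ of smaller 1-norm.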
Note that $\ell$ is bounded by $\binom{|P|}{d} \le |P|^d$ and $|P| \le \Delta^d$, and we obtain that $$\|y^{(i)}\|_1 \le (d\Delta)^{O(d \Delta^{d^2})}.$$

A Lower Bound for the Size of Graver Elements
In this section we prove a lower bound on the size of Graver elements for a matrix where the overlapping part contains only a single variable, i.e. $r = 1$. First, consider the matrix $$A = \begin{pmatrix} -1 & 2 & 0 & \cdots & 0 \\ -1 & 0 & 3 & \ddots & \vdots \\ \vdots & \vdots & \ddots & \ddots & 0 \\ -1 & 0 & \cdots & 0 & \Delta \end{pmatrix}.$$ This matrix is of 2-stage stochastic structure with $r = 1$ and $s = 1$. We will argue that every element in $\ker(A) \cap (\mathbb{Z}^{\Delta} \setminus \{0\})$ is large and therefore the Graver elements of the matrix are large as well. We call the variable corresponding to the $i$-th column of the matrix variable $x_i$, where $x_1$ is the variable corresponding to the column with only $-1$ entries and $x_i$ for $i > 1$ is the variable corresponding to the column with entry $i$ and $0$ everywhere else. Clearly, for $x \in \mathbb{Z}^{\Delta}$ to be in $\ker(A)$, we know by the first row of matrix $A$ that $x_1$ has to be a multiple of 2. By the second row of the matrix, we know that $x_1$ has to be a multiple of 3 and so on. Hence the variable $x_1$ has to be a multiple of all numbers $1, \dots, \Delta$. Thus $x_1$ is a multiple of the least common multiple of the numbers $1, \dots, \Delta$, which is divisible by the product of all primes between $1$ and $\Delta$. By known bounds for the product of all primes $\le \Delta$ [11], this implies that the value of $x_1$ is in $2^{\Omega(\Delta)}$, which shows that the size of Graver elements of matrix $A$ is in $2^{\Omega(\Delta)}$. The disadvantage of the matrix above is that the entries of the matrix are rather big. In the following we try to reduce the largest entry of the overall matrix by encoding each number $1, \dots, \Delta$ into a submatrix. For the encoding we use a matrix $C$ having $s$ rows and $s + 1$ columns. For a vector $x \in \ker(C) \cap \mathbb{Z}^{s+1}$, the rows of $C$ fix the entries of $x$ relative to each other, so that a number $z$ can be encoded by its digits, where the $i$-th row of the matrix corresponds to the $i$-th number in a representation of $z$ in base $\Delta$. Hence, we consider a matrix $A'$ in which each number is encoded in this way. By the same argumentation as for matrix $A$ above we know that $x_0$ has to be a multiple of each number $2, \dots, \Delta^{s+1} - 1$.
This implies that every non-zero integer vector of $\ker(A')$ has infinity-norm of at least $2^{\Omega(\Delta^s)}$, where $s$ is the number of rows of the block matrix. This shows the doubly exponential lower bound for the Graver complexity of 2-stage stochastic IPs.
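The first construction of this section can be checked numerically. The sketch below assumes, following the description of the columns above, that row $i-1$ of the matrix encodes the equation $-x_1 + i \cdot x_i = 0$ for $i = 2, \dots, \Delta$. It then computes the smallest positive integral kernel vector, whose first entry is the least common multiple of $1, \dots, \Delta$. The function names are hypothetical.

```python
from functools import reduce
from math import gcd


def lcm_upto(delta):
    """Least common multiple of 1, ..., delta."""
    return reduce(lambda a, b: a * b // gcd(a, b), range(1, delta + 1), 1)


def lower_bound_matrix(delta):
    """Matrix with rows -x_1 + i * x_i = 0 for i = 2, ..., delta (assumed shape)."""
    rows = []
    for i in range(2, delta + 1):
        row = [0] * delta
        row[0] = -1
        row[i - 1] = i
        rows.append(row)
    return rows


def smallest_kernel_vector(delta):
    """Smallest positive integral kernel vector: x_1 = lcm, x_i = x_1 / i."""
    x1 = lcm_upto(delta)
    return [x1] + [x1 // i for i in range(2, delta + 1)]


A = lower_bound_matrix(6)
x = smallest_kernel_vector(6)
print(x)
```

Every integral kernel vector is an integer multiple of this vector, because $x_1$ has to be divisible by each of the numbers $2, \dots, \Delta$.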

Multi-Stage Stochastic IPs
In this section we show that Lemma 2 can also be used to get a bound on the Graver elements of matrices with a multi-stage stochastic structure. Multi-stage stochastic IPs are a well-known generalization of 2-stage stochastic IPs. For the stochastic programming background on multi-stage stochastic IPs we refer to [29]. Here we simply show how to solve the deterministic equivalent IP with a large constraint matrix. Regarding the augmentation framework of multi-stage stochastic IPs, it was previously known that an implicit bound similar to the one for 2-stage stochastic IPs also holds for multi-stage stochastic IPs. This was shown by Aschenbrenner and Hemmecke [2], who built upon the bound for 2-stage stochastic IPs. In the following we define the shape of the constraint matrix $M$ of a multi-stage stochastic IP. The constraint matrix consists of given block matrices $A^{(1)}, \dots, A^{(\ell)}$ for some $\ell \in \mathbb{Z}_{\ge 0}$, where each block matrix $A^{(i)}$ uses a unique set of columns in $M$. For a given block matrix, let $\mathrm{rows}(A^{(i)})$ be the set of rows in $M$ which are used by $A^{(i)}$. A matrix $M$ is of multi-stage stochastic shape if the following conditions are fulfilled:
• There is a block matrix $A^{(i_0)}$ such that for every $1 \le i \le \ell$ we have $\mathrm{rows}(A^{(i)}) \subseteq \mathrm{rows}(A^{(i_0)})$.
An example of a matrix of multi-stage stochastic structure is given in the following figure. Intuitively, the constraint matrix is of multi-stage stochastic shape if the block matrices with the relation $\subseteq$ on the rows form a tree.
[Figure: example of a constraint matrix of multi-stage stochastic shape and the corresponding tree of its block matrices $A^{(1)}, A^{(2)}, A^{(3)}, \dots$]
Let $s_i$ be the number of columns that are used by block matrices in the $i$-th level of the tree (starting from level 0 at the leaves). Here we assume that the number of columns of block matrices in the same level of the tree are all identical. Let $r$ be the number of rows that are used by the block matrices that correspond to the leaves of the tree. In the following Lemma we show that Lemma 2 can be applied inductively to bound the size of an augmenting step of multi-stage stochastic IPs. The proof is similar to that of Theorem 2. Let $\bar{A}^{(i)}$ be the submatrix of $A$ which consists only of the rows that are used by $B^{(i)}$ (recall that $\mathrm{rows}(B^{(i)}) \subseteq \mathrm{rows}(A)$). Now suppose that $y$ is a cycle of $M$, i.e. $My = 0$, and let $y^{(0)}$ be the subvector of $y$ consisting only of the entries that belong to matrix $A$. Symmetrically, let $y^{(i)}$ be the entries of vector $y$ that belong only to the matrix $B^{(i)}$ for $i > 0$. Since $My = 0$ we also know that $\bar{A}^{(i)} y^{(0)} + B^{(i)} y^{(i)} = (\bar{A}^{(i)} B^{(i)}) \binom{y^{(0)}}{y^{(i)}} = 0$ for every $1 \le i \le n$. Each vector $\binom{y^{(0)}}{y^{(i)}}$ can be decomposed into a multiset of indecomposable cycles $C_i$, i.e. $$\binom{y^{(0)}}{y^{(i)}} = \sum_{c \in C_i} c,$$
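where each cycle $c \in C_i$ is a vector $c = \begin{pmatrix} c^{(0)} \\ c^{(i)} \end{pmatrix}$ consisting of a subvector $c^{(0)}$ of entries that belong to the matrix $A$ and a subvector $c^{(i)}$ of entries that belong to the matrix $B^{(i)}$. Note that the matrix $\begin{pmatrix} \bar{A}^{(i)} & B^{(i)} \end{pmatrix}$ has a multi-stage stochastic structure with a corresponding tree of depth $t - 1$. Hence, by induction we can assume that each indecomposable cycle $c \in C_i$ is bounded by $\|c\|_\infty \le T(s_1, \ldots, s_{t-1}, r, \Delta)$ for all $1 \le i \le n$, where $T$ is a function that involves a tower of $t$ exponentials. In the base case $t = 0$, where the constraint matrix consists of only one block matrix, we can bound $\|c\|_\infty$ by $(2\Delta r + 1)^r$ using Theorem 1. Let $p$ be the projection that maps a cycle to the entries that belong to the matrix $A$, i.e. $p(c) = p\begin{pmatrix} c^{(0)} \\ c^{(i)} \end{pmatrix} = c^{(0)}$. Note that $\sum_{c \in C_i} p(c) = y^{(0)}$ for every $1 \le i \le n$. Applying Lemma 2 to the multisets $p(C_1), \ldots, p(C_n)$, we obtain submultisets $S_1 \subseteq p(C_1), \ldots, S_n \subseteq p(C_n)$ of bounded size, where the bound on $|S_i|$ depends on $T = T(s_1, \ldots, s_{t-1}, r, \Delta)$, such that $\sum_{x \in S_1} x = \ldots = \sum_{x \in S_n} x$. Since $T(s_1, \ldots, s_{t-1}, r, \Delta)$ involves a tower of $t$ exponentials, the bound on $|S_i|$ involves a tower of $t + 1$ exponentials.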
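To make the role of the submultisets $S_i$ concrete, the following brute-force sketch searches small multisets of integer vectors for nonempty submultisets with a common sum. It is an illustration only and plays no role in the proof: Lemma 2 guarantees existence with a size bound, whereas this code simply enumerates. The function names, the parameter `max_size` and the toy data are hypothetical, and the restriction to nonempty submultisets is an assumption made here (the empty choice would be trivial).

```python
from itertools import combinations

def subset_sums(vectors, max_size):
    """Map each reachable sum to the indices of a smallest nonempty submultiset attaining it."""
    sums = {}
    for k in range(1, max_size + 1):
        for idx in combinations(range(len(vectors)), k):
            total = tuple(sum(col) for col in zip(*(vectors[i] for i in idx)))
            sums.setdefault(total, idx)  # sizes are visited in increasing order
    return sums

def common_sum(multisets, max_size):
    """Return (sum, index tuples) for a common sum with small submultisets, or None."""
    tables = [subset_sums(m, max_size) for m in multisets]
    shared = set(tables[0]).intersection(*tables[1:])
    if not shared:
        return None
    target = min(sorted(shared), key=lambda s: max(len(t[s]) for t in tables))
    return target, [t[target] for t in tables]

# Toy projected multisets p(C_1), p(C_2); both have total sum (2, 2).
P1 = [(1, 0), (0, 1), (1, 1)]
P2 = [(2, 1), (0, 1)]
found = common_sum([P1, P2], max_size=3)
print(found)
```

In the toy instance the search selects one vector from each multiset; in the proof, the selected elements correspond to the cycles that are combined into the smaller vector.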
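There exist submultisets $\bar{C}_1 \subseteq C_1, \ldots, \bar{C}_n \subseteq C_n$ with $p(\bar{C}_1) = S_1, \ldots, p(\bar{C}_n) = S_n$. Hence, we can define the solution $\bar{y} \le y$ by $\bar{y}^{(i)} = \sum_{c \in \bar{C}_i} \bar{p}(c)$ for every $i > 0$, where $\bar{p}$ is the function that projects a vector to the entries that belong to the matrix $B^{(i)}$, i.e. $\bar{p}(c) = \bar{p}\begin{pmatrix} c^{(0)} \\ c^{(i)} \end{pmatrix} = c^{(i)}$. For $i = 0$ we define $\bar{y}^{(0)} = \sum_{c \in \bar{C}_i} p(c)$ for an arbitrary $1 \le i \le n$. As the sum $\sum_{c \in \bar{C}_i} p(c)$ is identical for every $1 \le i \le n$, the vector $\bar{y}$ is well defined.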
Let $K$ be the constant hidden in the $O$-notation of Lemma 2 and let $T = T(s_1, \ldots, s_{t-1}, r, \Delta)$. Then the size of $\bar{y}$ can be bounded in terms of $K$ and $T$.
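As a rough sanity check that uses only the definitions above, and not the explicit constants of Lemma 2, note that $|\bar{C}_i| = |S_i|$ because $p(\bar{C}_i) = S_i$ as multisets, and that every cycle $c \in \bar{C}_i$ satisfies $\|c\|_\infty \le T$. A crude estimate, given here as a sketch and not as the bound stated by the authors, is therefore:

```latex
\[
  \|\bar{y}\|_\infty
  \;\le\; \max_{1 \le i \le n} \sum_{c \in \bar{C}_i} \|c\|_\infty
  \;\le\; T \cdot \max_{1 \le i \le n} |S_i| .
\]
```

Since the bound on $|S_i|$ involves a tower of $t + 1$ exponentials and $T$ involves a tower of $t$ exponentials, this estimate is again a tower of $t + 1$ exponentials.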

Computing the Augmenting Step
As a consequence of the bound on the Graver elements of the constraint matrix $M$ of multi-stage stochastic IPs, we obtain, via the augmentation framework, an algorithm to solve multi-stage stochastic IPs. As explained in Section 2, the core difficulty is to compute an augmenting step $y \in \mathrm{Ker}(M)$ such that $z + \lambda y$ is a feasible solution for a given initial feasible solution $z$ and a multiple $\lambda$. Therefore, we have to solve the IP $\max\{c^T x \mid Mx = 0,\ \bar{l} \le x \le \bar{u},\ \|x\|_\infty \le T\}$ for some lower and upper bounds $\bar{l}, \bar{u}$ and a constant $T = T(s_1, \ldots, s_t, r, \Delta)$ that is derived from the bound of Theorem 4. This IP can be solved similarly to the case of 2-stage stochastic IPs. However, since we have multiple layers, we have to apply the algorithm recursively. At each recursive call, we guess the values of the variables of the corresponding matrix and then apply the algorithm recursively. For further details on the algorithmic side and the running time we refer to [2] or [26]. As a final result we obtain the following theorem for multi-stage stochastic IPs:
Theorem 5. A multi-stage stochastic IP with a constraint matrix $M$ that corresponds to a tree of depth $t$ can be solved in time $n^2 s_0^2 \varphi \log^2(n s_0) \cdot T(s_1, \ldots, s_t, r, \Delta)$, where $\varphi$ is the encoding length of the IP and $T$ is a function that depends only on the parameters $s_1, \ldots, s_t, r, \Delta$ and involves a tower of $t + 1$ exponentials.
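The recursive guessing step described above can be illustrated by a naive enumeration. This is a sketch under stated assumptions and not the algorithm of [2] or [26]: "guessing" is replaced by brute-force enumeration of the box $[-T, T]$, and the tree encoding (dictionaries with the hypothetical keys `name`, `c`, `bounds`, `children`, `blocks`) is invented for this example. Each node stores, for every leaf below it, the part of its block that lies on the rows of that leaf.

```python
from itertools import product

def matvec(mat, x):
    """Plain integer matrix-vector product."""
    return [sum(a * b for a, b in zip(row, x)) for row in mat]

def solve(node, residual, T):
    """Best objective value in the subtree of `node`, or None if infeasible.

    residual[leaf] holds the contribution of the already guessed ancestor
    variables to the rows of that leaf.
    """
    best = None
    box = [range(max(lo, -T), min(hi, T) + 1) for lo, hi in node["bounds"]]
    for x in product(*box):  # "guess" the variables of this block
        new_res = {leaf: [r + d for r, d in zip(residual[leaf], matvec(mat, x))]
                   for leaf, mat in node["blocks"].items()}
        value = sum(c * v for c, v in zip(node["c"], x))
        if node["children"]:
            # once x is fixed, the subtrees are independent subproblems
            parts = [solve(child, new_res, T) for child in node["children"]]
            if any(p is None for p in parts):
                continue
            value += sum(parts)
        elif any(new_res[node["name"]]):
            continue  # leaf rows must sum to zero (M x = 0)
        if best is None or value > best:
            best = value
    return best

# Toy 2-level instance with constraints x0 - y1 = 0 and 2*x0 - y2 = 0.
leaf1 = {"name": "B1", "c": [0], "bounds": [(-2, 2)], "children": [],
         "blocks": {"B1": [[-1]]}}
leaf2 = {"name": "B2", "c": [1], "bounds": [(-2, 2)], "children": [],
         "blocks": {"B2": [[-1]]}}
root = {"name": "A", "c": [1], "bounds": [(-2, 2)], "children": [leaf1, leaf2],
        "blocks": {"B1": [[1]], "B2": [[2]]}}
result = solve(root, {"B1": [0], "B2": [0]}, T=2)
print(result)
```

The enumeration is exponential in the number of variables per block and only serves to show why fixing the variables of an upper level decouples the levels below it.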