Enhanced formulation for the Guillotine 2D Cutting Knapsack Problem

We advance the state of the art in Mixed-Integer Linear Programming formulations for Guillotine 2D Cutting Problems by (i) adapting a previously-known reduction to our preprocessing phase (plate-size normalization) and by (ii) enhancing a previous formulation (PP-G2KP from Furini et alli) by cutting down its size and symmetries. Our focus is the Guillotine 2D Knapsack Problem with orthogonal and unrestricted cuts, constrained demand, unlimited stages, and no rotation – however, the formulation may be adapted to many related problems. The code is available. Concerning the set of 59 instances used to benchmark the original formulation, the enhanced formulation takes about 4 hours to solve all instances while the original formulation takes 12 hours to solve 53 of them (the other six runs hit a three-hour time limit each). We integrate, to both formulations, a pricing framework proposed for the original formulation; the enhanced formulation keeps a significant advantage in this situation. Finally, in a recently proposed set of 80 harder instances, the enhanced formulation (with and without the pricing framework) found: 22 optimal solutions (5 already known, 17 new); better lower bounds for 25 instances; better upper bounds for 58 instances.


Introduction
The problem we focus on in this work is the Guillotine 2D Knapsack Problem with orthogonal (and unrestricted) cuts, constrained demand, unlimited stages, and no rotation. We will refer to this specific variant as G2KP. The G2KP is a strongly NP-hard problem [25]. The work also focuses on obtaining optimal solutions for this problem through Mixed-Integer Linear Programming (MILP). We propose two simple but effective enhancements regarding a state-of-the-art MILP formulation for the G2KP (which may also benefit some closely related problem variants).

Explanation of the problem and some close variants
An instance of the G2KP consists of: a rectangle of length L and width W (hereafter called original plate); a set of rectanglesJ (also referred to as pieces) where each rectangle i ∈J has a length l i , a width w i , a profit p i , and a demand u i . We assume, without loss of generality, that all such values are positive integers.
The G2KP seeks to maximise the profit of the pieces obtained by cutting the original plate. The guillotine qualifier means every cut always go from one side of a plate to other; a cut never stops or starts from the middle of a plate. We cut the original plate into intermediary plates j ∈ J , J ⊇J , which we further cut following the same rule.
If we do not cut a plate further, then it is either: thrown away as trim/waste for no profit; or, if it has the same size as a piece, it may also be sold by the piece profit value. Orthogonal cuts are always parallel to one side of a plate (and perpendicular to the other). Consequently, any intermediary plate j is always a rectangle, and have a well-defined l j and w j . Unrestricted cuts mean we are allowed to make horizontal (vertical) cuts different from the length (width) of a piece. In constrast, restricted cuts means horizontal (vertical) cuts can only happen at positions that match a piece length (width), it may also mean that, in addition to this, a piece with matching length (width) must be extracted from the first child plate of a restricted cut. In this paper, restricted means only that the position of the cuts is restricted (not that the cut force a posterior piece extraction), we create and employ the term position-only restricted to keep the reader aware of what we mean. Solving the position-only restricted problem exactly is a costly but high-quality primal heuristic for the G2KP.
Constrained demand means we can sell at most u i copies of piece i. The G2KP with unconstrained demand is not strongly NP-hard but weakly NP-hard instead; exact algorithms of pseudo-polynomial time complexity exist [2]. Consequently, if u i ≥ β i : ∀i ∈J , where β i is an upper bound on the number of copies of piece i that can be produced from the original plate, then the instance is probably better solved as an instance of the unconstrained G2KP instead. We avoid this kind of instances in our experiments. Unlimited stages means there is no limit to the number of times the guillotine switches between horizontal and vertical orientations. In the exact k-staged G2KP, the guillotine is switched at most k − 1 times. Consequently, a 2-staged G2KP has all cuts in some orientation before any cuts in the other orientation. The non-exact k-staged G2KP adds one extra stage in which the only cuts allowed are the ones that trim plates to the size of pieces. The no-rotation qualifier means we never switch  length and width during the cutting process; especially, we cannot sell a plate j as a piece of length w j and width l j . If we further qualify the G2KP, we only mean to discard the qualifiers above that directly conflict with the extra qualifiers, if any. For example, if we refer to the unconstrained G2KP, then we meant to discard the constrained qualifier but keep the remaining qualifiers, i.e., no rotation, unlimited stages, as well as guillotined, orthogonal, and unrestricted cuts. Figure 1 may help to understand some of the discussed characteristics.
While our work focuses on this specific problem, the enhanced formulation we present may be readily adapted to, at least, the Guillotine 2D version of the following problems: the Cutting Stock Problem (and the Bin Packing Problem); the Strip Packing Problem; the Multiple Knapsack Problem; the Orthogonal Packing Problem; and the variant allowing rotation for all previously mentioned problems. See [14] for more details. We do not define or further discuss these problems or variants in this work.

Motivation
Guillotine cutting problems are of interest of the industry, especially the wood [22,31] and glass cutting industries [9,23], often because of machinery limitations. The cutting optimization problem proposed in the ROADEF/EURO Challenge 2018 was a guillotine cutting problem. The challenge was developed in collaboration with Saint-Gobain Glass France (a reference on flat glass manufacture). See [23] for more details on this challenge. The vast and growing literature on the subject, pointed out by two recent surveys [17,25], is also evidence of such interest.
We focus on MILP as the solving method (instead of ad hoc solutions) because its adaptability amplifies the value of any enhancements we obtain.

Contributions and paper outline
The main contributions of this work are: an enhanced MILP formulation based on a previous state-of-the-art formulation, its proof of correctness, and empirical evidence of its better performance; a straightforward adaptation of a previously known reduction procedure for both the original and the enhanced formulations, and empirical evidence of its positive impact on their performance; finally, we present new upper and lower bounds, as well as optimal values, for many recently proposed hard instances from [30]. For such, we reimplement a state-of-the-art MILP formulation and an optional pricing procedure used by it. This reimplementation allows us to compare both approaches fully. For reproducibility, the exact version of the code employed in this paper is available at [3]. However, we suggest using the better documented and maintained master branch (https://github.com/henriquebecker91/GuillotineModels.jl) if perfect reproduction is not necessary.
We organise the rest of the paper the following way: Sect. 2 analyses how our work interacts with the pre-existing literature; Sect. 3 introduces some mathematical concepts and explains the reduction we adapted from the literature; Sect. 4 describes our enhanced formulation and briefly explains how it differs from the state-of-the-art formulation it is based on; Sect. 6 presents our experiments and the empirical results we derive from them; Sect. 7 delivers our conclusions and suggests future work.

Related work
We do not intend to provide a full overview of the literature, instead we: refer to surveys; discuss only closely related works and how they interact with our contributions; and opportunely point out missing connections between related works.
Two relevant surveys have come out recently. [17] catalogues exact methods and relaxations for 2D cutting problems including guillotine problems. [25] reviews the literature of our particular problem at length -there G2KP is referred to as Constrained 2D Cutting or C2DC. Moreover, [25] points out three strategies employed by previous exact solving methods which cause loss of optimality, i.e., these methods cannot be considered exact anymore. Our work does not employ any of these three strategies. One of these strategies is a dominance rule that is valid for the unconstrained case but not for the constrained case. In 1972, [16] proposed a dominance rule for the G2KP with unconstrained demand based on the same principle and warned about the possibility of misusing the rule in the constrained case.
The first MILP formulation dealing with guillotine cuts and unlimited stages was proposed by [5] in 2008. The problem considered by [5] is the Strip Packing Problem 1 , but adapting the formulation to the knapsack variant would not change its fundamentals. Previously, [18] had proposed two MILP formulations for 2-staged G2KP. As noted by [4], modeling k-staged cuts for k ≥ 3 (unlimited stages included) was considered difficult at the time. The size of most k-staged formulations is exponential on the number of stages (i.e., k). The formulation of [5] had about 3n 4 /4 variables and 2n 4 constraints (where n is the number of pieces) it also employed, according to the authors, a "very loose linear relaxation" due to which "the practical interest of this formulation is still limited". The characterization of guillotine cuts proposed by [5] seems to have been simultaneously proposed by [24].
The first MILP formulation specifically for the G2KP was proposed by [14] in 2016. An extended version of [14] appears in [29] (a PhD thesis). Their formulation has pseudo-polynomial size, O((L +W )×L ×W ) variables and O(L ×W ) constraints, and its relaxation provides a stronger bound than [5]. It was the first formulation able to solve medium-sized instances of the literature. Besides the formulation, [14] proposes two reductions and one pricing procedure; all of these are reimplemented by our work. They also present and prove a theorem to assure the correctness of one of their reductions (Cut-Position). A similar theorem and proof appear in [27].
In this work, we propose an enhanced formulation based on the one from [14] mentioned above. A significant advantage of our enhancement is to avoid the enumeration of any cuts after the middle of a plate. This advantage appears in many works since [16]. Recently, [11] adapted a formulation for the one-dimensional Cutting Stock Problem to obtain this same advantage. However, the way [11] changes their formulation to obtain this advantage is not the same as our approach.
The most recent MILP formulations for the G2KP come from three works by Martin et alii [19][20][21]. These formulations are compared against the formulation of [14]. We base our enhanced formulation on [14] and also compare against it. The formulations of [19][20][21] have a looser relaxation bound compared to [14], but perform better than [14] in instances for which [14] has a much larger number of variables. Considering the instances used in [14], our enhanced formulation dominates the formulation of [14]. Our formulation also dramatically improves the running times of instances in which the formulation of [14] performed worse than [19][20][21] (e.g., the gcut1-gcut12 instances). Consequently, while it may be interesting for completeness sake, we do not compare against the formulations proposed in [19][20][21].

Notation, discretization, and plate-size normalization
The performance of solving methods for cutting and packing problems often heavily depends on the number of (cut/packing) positions considered. Since the seminal works of [8] and [16], solving methods avoid considering each possible position, but instead consider only a subset necessary to guarantee optimality. The literature includes many such subsets, which are often referred to as discretizations. The most common way of computing these discretizations are Dynamic Programming (DP) algorithms. These DP algorithms usually only take a small fraction of the running time, but the size of the position subset outputted by them strongly affects the time spent by the rest of the solving method.
Both [14] and our enhanced formulation have one constraint for each attainable distinctly-sized plate and one variable for each potential cut over each of these plates. Therefore, eliminating a single cutting position has the following effects: (i) it removes one variable for each distinctly-sized plate that allowed that cutting position; (ii) if that cutting position was the only way to produce some distinctly-sized plates 2 , then it also removes the constraints associated with these plates; (iii) if (ii) excludes one or more constraints/plates, then it also excludes all variables representing possible cuts over the excluded plates; (iv) finally, if (iii) eliminates one or more variables/cuts, then it may trigger (ii) again (i.e., other plates stop being attainable), cyclically.
In this work, the only cut subset (discretization) considered are the canonical dissections of [16], hereafter referred to as normal cuts instead. We acknowledge the existence of stricter discretizations: the raster points of [26,28], the regular normal patterns of [7] (named this way by [10]), and the Meet-in-the-Middle (MiM) of [10]. The reasons for our choice of discretization are numerous: it works well with the Plate-Size Normalization procedure we describe below; it is the same discretization employed by [14] (from which we base our enhanced formulation on); The main gain of MiM is reducing the number of cut positions after the middle of a plate, which our enhanced formulation already discards anyway; the regular normal patterns compute a distinct subset-sum for each pair of plate and piece, which we consider excessive (there may exist hundreds of thousands of intermediary plate possibilities); finally, the raster points complicate our proofs and our Plate-Size Normalization weakens its benefits.
The set O = {v, h} denotes the cut orientation: v is vertical (parallel to length, perpendicular to width); h is horizontal (parallel to width, perpendicular to length). Let us recall that the demand of a piece i ∈J is denoted by u i . If we define the set of pieces fitting a plate j as I j = {i ∈J : l i ≤ l j ∧ w i ≤ w j }, we can define N jo (i.e., the set of the normal cuts of orientation o over plate j) as: The sets defined above never include cuts at the plate extremities (i.e., 0, l j for N jh , and w j for N jv ). Any of these cuts will always create (i) a zero-area plate and (ii) a copy of the plate that is being cut. Consequently, these cuts only add symmetries and may be disregarded.
The set J can now be defined by the following procedure: the original plate (plate 0) is added to J , then for every plate j ∈ J every cut in N jv ∪ N jh is applied to j, and each child generated is added to J if it can fit at least one piece. The process finishes when every plate in J was considered for cutting, and no new plates were generated. Such procedure guarantees each piece i ∈J will always be present in J unless the piece does not fit the original plate (in which case it is irrelevant to the problem and could be removed a priori).
The goal of the Plate-Size Normalization procedure we propose is to reduce the number of distinctly-sized plates considered. Fewer distinctly-sized plates mean fewer constraints and trigger the same cascading effect described by items (ii)-(iv) above. The property exploited by the procedure is already known and similarly exploited by [1] and by [12]. We state the property as: Proposition 1 Given a plate j ∈ J , l j may always be replaced by l j = max{q : q ∈ N kh , q ≤ l j } in which k ∈ J , w k = w j , but l k > l j , without loss of optimality. The analogue is valid for the width.
In other words, if increasing the length (width) of plate j reveals that the original length (width) did not match a normal cut position in the enlarged plate, then plate j may be replaced by a shorter plate in which the length (width) is reduced to the largest normal cut position smaller than the original length (width). For example, given l = [5,7], w = [3,2], a 13x3 plate may be reduced to 12x3 (13 does not match a normal cut while 5 + 7 = 12 does), and a 13x2 plate may be reduced to 7x2 (13 does not match a normal cut while 7 does). We do not replicate any proof here. We can then define:

Definition 1
The length of a plate j is considered normalized if, and only if, l j = l j . The analogue is valid for the width. The size of a plate is normalized if, and only if, both its length and its width are normalized.
The Plate-Size Normalization procedure we propose consists only of replacing every non-size-normalized plate enumerated by their normalized counterpart. The number of distinctly-sized plates diminishes because the procedure replaces many plates of distinct but similar dimensions by a single plate. The only extra effort added by Plate-Size Normalization consists of binary searches over N jo sets for each plate j, and these may be carried out without increasing the overall complexity, given the setup of O(LW ) vectors of size O(L + W ); a setup step which also does not increase the overall complexity. However, in our implementation, we opted to increase the overall complexity from O(L 2 W + LW 2 ) to O(L 2 Wlog(L) + LW 2 log(W )) because the fraction of time spent on the enumeration was not enough to justify the memory and code complexity trade-off. In practice, even if our worst-case complexity increases, the time spent decreased because the actual number of plates (denoted in the complexity by O(LW ) became more distant from the worst-case. A suitable N ko set for each plate j was already computed by the plate enumeration procedure before introducing the Plate-Size Normalization (no extra effort required).

Remark 1
If a normal cut divides a size-normalized plate, then the dimension perpendicular to the cut, in the first child, is normalized. The dimension parallel to the cut in the first child, and both dimensions of the second child, are not guaranteed to be normalized.

Example 1
Consider three pieces with l = [5,7,9], w = [6,4,11], u = [3, 1, 1], and a plate of dimensions 15x15. The plate dimensions are already normalized. The plate length matches stacking the three copies of the first piece. The plate width matches laying side-by-side the other two pieces. An horizontal cut at length 7 is a normal cut because it matches the length of the second piece. If the cut is done, the width of both children is not normalized anymore, nor is the length of the second child. The width of both children is not normalized because the third piece does not fit either child so, for both children, the largest width a valid packing may reach is 12. The length of the second child is not normalized because the largest length a valid packing inside the second child may reach is 7. The dimensions of both children may be normalized to 7x12. This example already shows an immediate gain, instead of creating two new plate sizes, the enumeration only creates a single new plate type. The cut creates two copies of this single type of plate.

Our changes to Furini's model
The formulation proposed in [14] is elegant: the pieces are just intermediary plates that may be sold. Our contribution consists of changes to both the preprocessing step and to the formulation. These changes significantly reduce the number of variables. Differently, these changes deepen the distinction between plates and pieces and, consequently, may be regarded as sacrificing some elegance for performance. The essentials of the formulation remain the same and, for this reason, we consider the model presented here as an enhanced model, not an entirely new model. The cut enumeration in [14] excludes some symmetric cuts; that is, if two different cuts create the same set of two child plates, then the symmetric cut in the second half of the plate may be ignored. Differently, [8] disregards all cuts after the middle of the plate because of symmetry. If [14] would do the same as [8] it could become impossible to trim a plate to the size of a piece. For example, if there was a piece with length larger than half the length of a plate, and such plate has no normal cut with the exact length of the needed trim, then the piece could not be extracted from the plate, even if the piece fits the plate. The goal of our changes is to reduce the number of cuts (i.e., model variables) by getting closer to the symmetry-breaking rule used in [8] without loss of optimality.

The enhanced formulation
Our changes to the formulation are restricted to replacing the set of integer variables y j , i ∈J , with a new set of variables e i j , (i, j) ∈ E, E ⊆J × J , and the necessary adaptations to accomodate this change. In the original formulation, y i denoted the number of times a plate i was sold as the piece i, in this case, the plate always had the exact size of the piece. Our extraction variables e i j denote a piece i was extracted from plate j, which size may differ from the size of the piece. The exact definition of set E is discussed over Sect. 4.2; for the purpose of presenting the formulation, our intuitive definition of e i j just above is enough. For convenience, we also define indicates how many copies of a plate j ∈ J are produced by cutting a plate k ∈ J with a cut of orientation o ∈ O at position q ∈ Q ko . The description of this parameter in [14] has a typo, as pointed out by [19]: "[...] there is a typo in their definition of parameter a o qk j , as the indices j and k seem to be exchanged.". In a valid solution, the value of x o q j is the number of times a plate j ∈ J is cut with orientation o ∈ O at position q ∈ Q jo ; while the value of e i j is the number of sold pieces of type i ∈J that were extracted from plates of type j ∈ J . The plate 0 ∈ J is the original plate, and it may also be inJ , as there may exist a piece of the same size as the original plate. max. s.t.
o∈O q∈Q jo The objective function maximizes the profit of the extracted pieces (2). Constraint (3) guarantees that for every plate j that was further cut or had a piece extracted from it (left-hand side), there must be a cut making available a copy of such plate (right-hand side). One copy of the original plate is available from the start (4). The amount of extracted copies of some piece type must respect the demand for that piece type (a piece extracted is a piece sold) (5). Finally, the domain of all variables is the non-negative integers (6)-(7).

The revised variable enumeration
The variable enumeration described in [14] employs some rules to reduce the number of variables; they are symmetry-breaking, Cut-Position, and Redundant-Cut. The two last rules are not discussed here; [14] proves their correctness and they do not conflict with the enhanced model.
The use of the x variables does not change from the original formulation to our revised formulation -however, the size of the enumerated set of variables changes. Our revised enumeration does not create any variable The original formulation has variables y i , i ∈J , while the revised formulation replaces them with variables e i j , (i, j) ∈ E, E ⊆J × J . SetJ × J is orders of magnitude larger thanJ . Consequently, set E must be a small subset to avoid having a revised model with more variables than the original. A suitable subset may be obtained by a simple rule: (i, j) ∈ E if, and only if, packing piece i in plate j does not allow any other piece to be packed in j.
For the enhanced formulation to have more variables than the original formulation, this is, the number of extraction variables must be larger than the number of pieces plus the sum of the number of cuts after the middle of each enumerated plate. Unfortunately, there is no closed formula for these sets (exceptJ which is given), what makes necessary to compute the full enumeration to verify the difference.

The proof of correctness
The previous section presented a detailed explanation of the changes to the formulation and variable enumeration. This section proves such changes do not affect the correctness of the model. In [14], only the perfect symmetries described below are removed. Our changes may be summarized to: 1. There is no variable for any cut that occurs after the middle of a plate. 2. A piece may be obtained from a plate if, and only if, the piece fits the plate, and the plate cannot fit an extra piece (of any type).
The second change alone cannot affect the model correctness. The original formulation was even more restrictive in this aspect: a piece could only be sold if a plate of the same dimensions existed. In our revised formulation there will always exist an extraction variable in such case: if a piece and plate match perfectly, there is no space for any other piece, fulfilling our only criteria for the existence of extraction variables. Consequently, what needs to be proved is that: Theorem 1 Without changing the pieces obtained from a packing, we may replace any normal cut after the middle of a plate by a combination of piece extractions and cuts at the middle of a plate or before it.
Proof This is a proof by exhaustion. The set of all normal cuts after the middle of a plate may be split into the following cases: 1. The cut has a perfect symmetry. 2. The cut does not have a perfect symmetry.
(a) Its second child can fit at least one piece. (b) Its second child cannot fit a single piece. i. Its first child packs no pieces. ii. Its first child packs a single piece. iii. Its first child packs two or more pieces.
We believe to be self-evident that the union of item 1,item 2a,items 2(b)i to 2(b)iii is equal to the set of all normal cuts after the middle of a plate. We present an individual proof for each of these cases.
Item 1 -The cut has a perfect symmetry. If two distinct cuts have the same children (with the only difference being the first child of one cut is the second child of the other cut, and vice-versa), then the cuts are perfectly symmetric. Whether a plate is the first or second child of a cut does not make any difference for the formulation or for the problem. If the cut is in the second half of the plate, then its symmetry is in the first half of the plate. Consequently, both cuts are interchangeable, and we may keep only the cut in the first half of the plate. Item 2a -Its second child can fit at least one piece. Proposition 1 allows us to replace the second child by a size-normalized plate that can pack any demand-abiding set of pieces the original second child could pack. The second child of a cut that happens after the middle of the plate is smaller than half a plate, and its size-normalized counterpart may only be the same size or smaller. So the size-normalized plate could be cut as the first child by a normal cut in the first half of the plate. Moreover, the old first child (now second child) have stayed the same size or grown (because the size-normalization of its sibling), which guarantee this is possible. Item 2(b)i -Its first child packs no piece. If both children of a single cut do not pack any pieces, then the cut may be safely ignored. Item 2(b)ii -Its first child packs a single piece. First, let us ignore this cut for a moment and consider the plate being cut by it (i.e., the parent plate). The parent plate either: can fit an extra piece together with the piece the first child would pack, or cannot fit any extra pieces. If it cannot fit any extra pieces, this fulfills our criteria for having an extraction variable, and the piece may be obtained through it. The cut in question can then be disregarded (i.e., replaced by the use of such extraction variable). However, if it is possible to fit another piece, then there is a normal cut in the first half of the plate that would separate the two pieces, and such cut may be used to shorten the plate. This kind of normal cuts may successively shorten the plate until it is impossible to pack another piece, and the single piece that was originally packed in the first child may then be obtained employing an extraction variable. Item 2(b)iii -Its first child packs two or more pieces. If the first child packs two or more pieces, but the second child cannot fit a single piece (i.e., it is waste), then the cut separating the first and second child may be omitted and any cuts separating pieces inside the first child may still be done. If some of the plates obtained by such cuts need the trimming that was provided by the omitted cut, then these plates will be packing a single piece each, and they are already considered in item 2(b)ii. In all examples, the parent plate is 15x15. In the example of item 2a, the cut would happen after the middle of the plate, but then the pieces of the second child can be packed in the first child instead. In the example of item 2(b)i" both cuts happen after the middle of the plate, and there are no other pieces; however, as no piece may be extracted from the leftovers, then there is an extraction variable available. In the example of item 2(b)ii, we assume a 2x7 piece exists but we do not intend to obtain it from the plate; therefore, the extraction variable from the previous case does not exist; however, the 2x7 piece allows us to make a cut just to reduce the plate length and, for the size of the second child, an extraction variable is available. Finally, in the example of item 2(b)iii, which cut happens first may be changed, as there is no piece packed in the subplate that would originally become the second child Given the cases cover every cut after the middle of a plate, and each case has a proof, then follows that Theorem 1 is correct.

The pricing phase
The pricing procedure described in [14,29] was reimplemented by us. No significant changes were made to the procedure. As our experiments include multiple comparisons involving this procedure, a summary of the procedure is presented below. For simplicity, we consider the procedure takes an already built model (from either the original formulation or our enhanced version), and any previous reductions mentioned were already applied at this point.
1. Fix to zero all variables representing horizontal (vertical) cuts that do not match a piece length (width).. 2. Remove all integrality constraints and solve the relaxed model to obtain an upper bound for the position-only restricted problem. 3. Obtain a lower bound from an inexact 2-staged heuristic (see [12,14]). 4. Employ the reduced costs of the model variables, the position-only restricted upper bound, and the heuristic lower bound to price-out variables (more details below) by fixing them to zero. In item 4 and item 8, a variable is priced out if reduced_cost(var) + ub ≤ lb, where the upper and lower bounds are the ones available at the corresponding step. The rationale behind this requirement is straightforward. If forcing var to assume value 1 is enough to reduce the upper bound from the relaxation to less than the lower bound, then that variable (guillotine cut) cannot be used to provide a solution better than the current lower bound. Note any variables necessary to produce the current lower bound are kept.
The criteria for choosing the subset of variables in each iteration of item 7 takes into account two parameters: n max andp. If any variables have reduced cost abovep they define the subset; otherwise, the first n max variables with positive reduced cost define the subset. The original description of the procedure does mention an ordering of the variable pool, so what constitutes the first n max variables is not well-defined. We chose to interpret that the n max variables of largest reduced cost are selected. Both parameters are automatically computed for each instance: n max is one-fifth of the sum of the demand vector u, andp is one-fourth of the sum of the profits for every piece (taking demand into account).
The original description of the procedure does not indicate if, during the process, the variables are fixed and unfixed, or removed and added back. Preliminary tests indicated that the fix-and-unfix approach had better performance, so we used it in our experiments. In the last step, all variables yet fixed to zero are removed.

Experimental results
There are three formulation implementations that provide data used in our comparisons: original refers to the implementation presented in [14,29]; faithful refers to our reimplementation of original; enhanced refers to our enhanced formulation presented in Sect. 4. The original implementation was not available 3 Consequently, all data relative to original presented in this work comes from [29]. Both faithful and enhanced data were obtained by runs using the setup described in Sect. 6.1.
Each formulation may be modified by applying any combination of the following optional procedures: priced -refer to the pricing procedure described in [14,29] 5 (same as [14,29]); normalized -the plate-size normalization procedure described in Sect. 3; warmed -the MIP models solved were warm-started with a solution found by a previous step; Cut-Position and Redundant-Cut -are reduction procedures described in [14,29], that may be enabled and disabled individually. For each experiment described in the next sections, if we do not mention a procedure, then it is disabled. The term restricted priced refers to the model for the position-only restricted version of the problem that is solved inside the pricing procedure mentioned above. Consequently, for each run of a priced variant, there will be a restricted priced run with the same combination of optional procedures. The differences between the restricted priced and the (unrestricted) priced models are mainly that: (i) the restricted priced model never has a horizontal (vertical) cut that does not match the length (width) of a piece; (ii) the restricted priced model is MIP-started with the solution of an heuristic (described in [14]) while the priced model is MIP-started with the solution of the restricted priced model; (iii) the distinct solutions used to MIP-start the respective models are also used as the lower bound for the pricing procedure (details in [14]).
Without the set of model variables (guillotine cuts) removed by the pricing, plates of some dimensions may become impossible to obtain. These plates are not necessary to obtain an optimal solution; otherwise, the pricing could not have removed all variables that led to them. Most of these plates could be further cut, but the value of the variables associated with such cuts can now only be zero and, therefore, these variables can be removed too. This thinning effect may be recursive, as each newly removed variable may render some plate sizes unobtainable, similarly to what is described in Sect. 3. Hence, the pricing phase uncovers a set of unnecessary variables larger than the set it directly removes. Our preliminary experiments have shown that removing this larger set from the model, instead of just the variables directly removed by the pricing phase, has basically no effect on the total time; even if, on average, they account for 65% of the model variables after the pricing. Consequently, we believe the solver can detect such variables and remove them by itself. However, as it is computationally cheap to detect and remove the larger variable set, we decided to always apply this procedure after the pricing phase, and present a model size that better corresponds to reality in the results.
Each experiment helps to substantiate choices taken in the subsequent experiments: Sect. 6.2 provides evidence that faithful is on par with original, allowing us to use it as a replacement; Sect. 6.3 compares faithful to enhanced and shows the value of our contributions (namely, the normalize procedure and the enhanced formulation); Sect. 6.4 applies the methods with best results in the last experiment to prove new optimal values and bounds for harder instances.

Setup
Every experiment in this work uses the following setup unless stated otherwise. The CPU was an AMD ® Ryzen TM 9 3900X 12-Core Processor (3.8GHz, cache: L1 -768KiB, L2 -6 MiB, L3 -64 MiB) and 32GiB of RAM were available (2 x Crucial Ballistix Sport Red DDR4 16GB 2.4GHz). The operating system used was Ubuntu 20.04 LTS (Linux 5.4.0-42-generic). Hyper-Threading was disabled. Each run executed on a single thread, and no runs executed simultaneously. The computer did not run any other CPU bound task during the experiments. The exact version of the code used is available online (https://github.com/henriquebecker91/GuillotineModels.jl/tree/0. 2.4), and it was run using Julia 1.4.2 [6] with JuMP 0.20.1 [13] and Gurobi 9.0.2 [15]. The following Gurobi parameters had non-default values: Threads = 1; Seed = 1; MIPGap = 10 −6 (to guarantee optimality); and TimeLimit = 10800 (i.e., three hours). For the root node relaxation of the final built model, the barrier algorithm was employed (Method = 2). Whenever the run included the pricing phase, the multiple continuous relaxations from such phase were solved by the dual simplex algorithm Method = 1. In preliminary experiments, barrier took less time than dual simplex to solve a model relaxation from scratch. However, if a previous base can be exploited, as it is the case during the pricing phase, choosing dual simplex over barrier made the pricing phase take less time.

Comparison of faithful against original
Without a reimplementation of original, any comparison would need to be made directly against the data in [29]. However, such comparison would hardly be fair, as it compares across machines, solvers, and programming languages. Also, for example, it does not allow us to assess the benefits of applying the plate-size normalization procedure to the original formulation. The purpose of this section is to show that faithful may be fairly used in place of original. For this purpose, Table 1 compares the number of model variables and number of plates of the diverse model variants presented in [14,29]. The chosen dataset is, therefore, the same as the one used in these works for the comparison to be possible. The dataset aggregates 59 instances of the previous literature from many distinct sources, all instances are either artificially generated, or of undisclosed origin. The number of enumerated plates has a strong correlation to the number of constraints in the model. Both [14] and [29] present the number of plates and not the number of constraints. To simplify the comparison, we do the same.
The Priced PP-G2KP runs in [14,29] had three time limits of one hour to solve: the restricted model (i.e., obtaining a lower bound); the iterative variable pricing (i.e., obtaining an upper bound); the final model. Such configuration always generates a final model. However, it also has two drawbacks: (i) the computer performance may define the answer given in the first two phases, affecting the size of the final model (and making it harder to make a fair comparison); (ii) if the position-only restricted model, or the iterated variable pricing, cannot be done in one hour, then the final model will probably hit the time limit too -in [14], every run that hits one of the two first time limits also hits the third time limit. We chose to use a single three-hour time limit. Table 1 references the names used in [14,29]. The Complete PP-G2KP is the formulation with all optional procedures disabled, while the PP-G2KP mean both Cut-Position and Redundant-Cut are enabled. Restricted PP-G2KP and its priced version are solved inside Priced PP-G2KP runs. If the lower and upper bounds found during pricing are the same, then the optimal solution was found before generating the final model. The instances in which this happened for an unrestricted solution are 3s, A1s, CU1, CU2, W, cgcut1, and wang20. The instance A1s presented this behaviour already in the pricing of the position-only restricted model.
The following conclusions can be derived from Table 1. All variants, except Priced PP-G2KP, are within ±0.01% of the expected number of plates (and, consequently, of constraints). The Complete PP-G2KP, Complete +Cut-Position, and Restricted PP-G2KP are within ±1% of the expected number of variables. The number of variables in both Complete +Redundant-Cut and PP-G2KP (CP + RC) is 10 ∼ 20% larger than expected. Our reimplementation of Redundant-cut reduction seems responsible for both deviations. However, it follows closely the description given in [29]. The number of variables and plates in Priced variants is not entirely deterministic. The number of variables of Priced variants is either slightly above (+2%) or lower (−6 ∼ 68%).
For all non-priced variants, the fraction of the running time spent in the model generation is negligible. Consequently, the comparison presented in Table 1 is sufficient. We cannot say the same for the priced variants. [14,29] does not report the size of the multiple LP models solved inside the iterative pricing (a phase of the pricing). For instances in which original and faithful executed all phases of pricing and solved the final model, the original spent 34.35% of its time in the iterative pricing phase, while faithful spent 61.69%. It is hard to pinpoint the source of this discrepancy. One possible explanation is that, in original, other phases took more time than they took in faithful.
For example, faithful uses the barrier algorithm for the root node relaxation of the final model, which reduces the percentage of time spent in this phase. Nevertheless, for the subset of the instances aforementioned, the total time spent by faithful was about 13% of the time spent by original. While the difference between machines and solvers does not allow us to infer much from that figure, we believe that the magnitude of the difference guarantees that we are not making a gross misrepresentation.

Comparison of faithful against enhanced
The primary purpose of this section is to evaluate our contributions to the state of the art. Our contributions are the normalize reduction (i.e., the plate-size normalization presented in Sect. 3) and the enhanced formulation (presented in Sect. 4.1). The state of the art consists in a formulation (Complete PP-G2KP), two reductions (Cut-Position and Redundant-Cut), and a pricing procedure presented in [14,29]. In this section, we use our reimplementation of Complete PP-G2KP named faithful (to distinguish from the data of the original). We also reimplemented the reductions and the pricing procedure, but as enhanced may also enable them, we avoid labelling these procedures as faithful as to avoid confusion.
The faithful and enhanced formulations cannot be combined. However, both allow enabling any combination of the optional procedures. The only exception is Redundant-Cut, which is unnecessary for enhanced and, therefore, never applied to it. Outside of this exception, in this section, Redundant-Cut and Cut-Position are always enabled. These reductions never increase the number of variables (or constraints), cost a negligible amount of computational effort, and were already discussed in [14,29].
We also examine the effects of warm-starting the non-priced model. The deterministic heuristic used to MIP-start the non-priced models is the same used in the restricted priced model solved inside the pricing procedure.
Considering the data from Table 2 we can state that: Table 2 Comparison of faithful vs. enhanced over the 59 instances used in [29]. The meaning of the columns follow: T. T. (Total Time) -sum of the time spent in all instances including timeouts, in seconds; #e (early) -number of instances in which pricing found an optimal solution (and, consequently, did not generate a final model); #m (modeled) -number of instances that generated a final model; #s (solved) -number of solved instances; #b (best) -number of instances that the respective variant solved faster than any other variant; S. T. T. (Solved Total Time) -same as Total Time but excluding runs ended by time or memory limit; #variables (#plates) -sum of the variables (plates) in all generated final models (see column #m).
The first row (Faithful) has two runs that ended in memory exhaustion. We count the time of these runs as they were timeouts

Evaluating enhanced against harder instances
The purposes of the experiment described in this section are: (i) to show the limitations of the enhanced formulation against more challenging instances; (ii) to provide better bounds and new proven optimal values for such instances. [30] proposes a set of 80 hard instances to test the limitations of their bounding procedures; we use these instances in this section. The instances were artificially generated and are divided in four classes of 20 instances each. The dataset focuses on two characteristics: (i) the area of the pieces is small compared to the area of the original plate (the average ratio vary between 1.6 and 5%); (ii) each class is defined by the shape of the original plate, and the likely shape of the randomly generated pieces. The original plates of the first two classes have one dimension two or four times larger than the other dimension. In the first class, the pieces are likely to be larger in the same Table 4 Summary table for the instances proposed in [30]. The columns are: C. -instance class (described in [30], 20 instances each); Variant -the solving method employed; #m (modeled) -number of instances in which the model was built before timeout; Avg. #v and Avg. #p -the average number of variables and plates in the #m instances that generated a final model for the respective variant; T. T. (Total Time) -sum of the time spent in all instances in seconds, including timeouts; #s (solved) -number of instances solved; Avg. S. T. (Avg. Solved Time) -as total time but excludes timeouts and divides by #s. Averages were used instead of simple sums because the very different number of generated and solved models made the sums misleading dimension the original plate is larger; while, in the second class, the pieces are likely to be larger in the dimension the original plate is shorter. The original plates of the last two classes are squares. The pieces of the third class have, in average, the same dimension with double the size of the other; while, in the fourth class, half of the pieces follow the previous distribution, and the other half invert the favored dimension. Only two configurations were selected for this experiment, the priced and nonpriced versions of enhanced with Cut-Position, normalize, and MIP-start enabled. We also present the results for the restricted priced variant because it executes inside priced (the same reductions apply to it). Table 4 presents a summary of all runs, and Tables 5, 6, 7, and 8 present the improved bounds and solved instances.
For this experiment, Gurobi was allowed to use the 12 physical cores of our machine. Gurobi distributes the effort of the B&B phase equally among all cores. Solving an LP (as a root node relaxation, or not) calls barrier, primal simplex, and dual simplex. Each of the simplex methods uses a single thread, while barrier uses all remaining cores, and Gurobi stops when the first of them finishes.
Concerning the data from Table 4, we want to highlight some unexpected results: (i) the total number of instances solved by the restricted priced was slightly smaller than non-priced, even with non-priced solving the harder unrestricted problem; (ii) many runs reached time limit without solving the continuous relaxation of the positiononly restricted model (necessary for creating restricted priced model); (iii) non-priced solved more instances than priced in all cases. It is worth noting that the priced variant could have been considered the best configuration in the previous dataset, as its total time was shorter than non-priced, and both solved all instances. Ideally, the pricing procedure would significantly reduce the size of the model and, consequently, the root node relaxation and B&B phases would take much less time to solve. However, the gain in decreasing the size of the (already reduced) enhanced model further does not seem to compensate for the cost of solving hard LPs more than once. Also, previous sections have shown that reducing the model size does not guarantee that the running time will be reduced by the same magnitude. The purpose of Tables 5, 6, 7, and 8 is to allow querying the exact values for specific instances. Even so, there are some gaps to fill. For the instances presented in Tables 5,  6, 7, and 8, the min / mean / max gap between the heuristic lower bound and the final lower bound were: 0.38 / 18.08 / 37.03 (non-priced); 0.68 / 20.62 / 37.29 (restricted priced); 9.17 / 19.38 / 32.24 (priced). In other words, no solution, or best bound, was given by the heuristic, and most of the time, its solution was considerably improved. For the reader convenience, we can also summarize that our experiment has: proved 22   unrestricted optimal values (5 already proven by [30], confirming their results); proved 22 restricted optimal values (in an overlapping but distinct subset of the instances); improved lower bounds for 25 instances; improved upper bounds for 58 instances.

Conclusions
The present work advances the state of the art on MILP formulations for the G2KP. We improve the performance of one of the most competitive MILP formulations for the G2KP by at least one order of magnitude. In the instance set selected by the original formulation, our enhanced formulation dominates the original formulation. Concerning other competitive MILP formulations in the literature, we keep the advantage of tighter bounds the original formulation had over them, and greatly reduce the model size and running times for instances that these other formulations had the advantage.
In the experiments, we have already discussed some elementary inferences, for example: the limitations (and partial success) of our improved formulation against the most recent and challenging instances in the literature; and the impact on the performance caused by the LP-solving algorithm, by the specific changes we made, by MIP-starting the models, and by some procedures proposed together with the original model (i.e., pricing and some preprocessing reductions). Here we present more general conclusions from a broader perspective.
We believe symmetry-breaking plays a significant part in the success of our enhanced formulation. In our experiments, we focus on the significant reduction of the model size because it is easier to measure. The solver can, however, always reduce the model size even further, by disregarding loose constraints or variables which cannot assume non-zero values. This does not seem to be the case of the variables removed by our enhanced model, which could assume non-zero values and compose symmetric solutions. A single extraction variable may replace many distinct sequences of cuts that would extract the same piece from the same slightly-larger plate. The enhanced formulation did not present consistent gains in the LP relaxation, for them to be responsible for the observed improvement in performance. We also believe our results suggest that clever dominance rules may considerably improve pseudo-polynomial models (which often have tight bounds but large formulations) before resorting to more complicated techniques (as the pricing procedure proposed in [14] or column generation techniques).
Our suggestions for future works follow: adapt the formulation for closely related problem variants and compare to their state-of-the-art solving procedure; expand on the symmetry-breaking; consider other frameworks besides the pricing framework of [14].

Author Contributions
The authors' contributions may be summarised as: 1. an enhanced MILP formulation based on a previous state-of-the-art formulation; 2. the proof of correctness of the aforementioned enhancement; 3. empirical evidence of the better performance obtained by the enhancement; 4. the adaptation of a known reduction for our specific case; 5. empirical evidence of the gains obtained by this reduction in our case; 6.17 new optimal solution values of hard instances (unrestricted problem); 7. 22 new optimal solution values of hard instances (position-only restricted); 8. better lower bounds for 25 instances (unrestricted problem); 9. better upper bounds for 58 instances (unrestricted problem).
Funding This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior-Brasil (CAPES)-Finance Code 001.

Data Availability
The Jupyter notebooks and raw CSVs used for generating the tables in this work are available in .

Conflict of interest
The authors declare that they have no conflict of interest.
Code availability The code, in the specific version used, is available at [3]. The code is in public domain (by means of the Unlicense template, see https://unlicense.org/ for details).
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.