Optimal patchings for consecutive ones matrices

We study a variant of the weighted consecutive ones property problem. Here, a 0/1-matrix is given with a cost associated to each of its entries and one has to find a minimum cost set of zero entries to be turned to ones in order to make the matrix have the consecutive ones property for rows. We investigate polyhedral and combinatorial properties of the problem and we exploit them in a branch-and-cut algorithm. In particular, we devise preprocessing rules and investigate variants of “local cuts”. We test the resulting algorithm on a number of instances, and we report on these computational experiments.

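A 0/1-matrix has the strict consecutive ones property for rows (is strictly C1P, for short) if the ones in each of its rows appear consecutively, and it has the consecutive ones property for rows (is C1P, for short) if its columns can be permuted so that the resulting matrix is strictly C1P. Therefore, being C1P is a property of 0/1-matrices that is preserved under row and column permutations.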
The problem of checking whether a given 0/1-matrix is C1P can be solved in linear time using the algorithm of Booth and Lueker [5]. The situation is quite different for optimizing over the set of all C1P matrices of a given size. Indeed, the Weighted C1P Problem (WC1P), i.e., finding an m × n C1P matrix of minimum cost with respect to a linear objective function, is NP-hard (Booth [4] and Papadimitriou [30]).
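To make the definition concrete, the following is a minimal sketch of a C1P test for very small matrices. It is not the linear-time PQ-tree algorithm of Booth and Lueker [5]; it simply tries all column permutations, and the function names are chosen here for illustration only.

```python
from itertools import permutations


def is_strictly_c1p(matrix):
    """True if, in the given column order, the ones of every row are consecutive."""
    for row in matrix:
        ones = [j for j, v in enumerate(row) if v == 1]
        # The ones are consecutive iff they fill the whole range first..last.
        if ones and ones[-1] - ones[0] + 1 != len(ones):
            return False
    return True


def is_c1p_bruteforce(matrix):
    """Brute-force C1P test: exponential in the number of columns (tiny inputs only)."""
    n = len(matrix[0]) if matrix else 0
    for perm in permutations(range(n)):
        permuted = [[row[j] for j in perm] for row in matrix]
        if is_strictly_c1p(permuted):
            return True
    return False


# The second matrix (a "3-cycle") is a classical example of a non-C1P matrix.
print(is_c1p_bruteforce([[1, 0, 1], [0, 1, 1]]))
print(is_c1p_bruteforce([[1, 1, 0], [0, 1, 1], [1, 0, 1]]))
```

The brute-force test is only meant for checking small examples by hand; any serious implementation would rely on PQ-trees.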
The weighted C1P problem was investigated by Oswald and Reinelt [27][28][29] who provided a polyhedral study and a branch-and-cut algorithm that produces a certified optimal solution. They also introduce applications in the fields of developmental psychology [31], computational biology [7], archaeology [22], and film-making. Their computational results show that the problem is very difficult to solve to certified optimality even for relatively small matrices. For example, in [29], Oswald and Reinelt report on computational results with randomly generated square instances. Even some instances of size 13 × 13 were not solvable within one hour by their implementation and hardware.
Fortunately, there are applications of the Weighted C1P Problem having a particular structure that can be exploited to make it possible to attack larger instances. This is the case, for example, when, in addition to the cost matrix C, a second matrix $M \in \{0,1\}^{m \times n}$ is given, typically non-C1P, and one is asked for the least cost C1P positive patching of M, i.e., a C1P matrix obtained from M by switching some of its 0 entries to 1. This Weighted Positive C1P Patching Problem is the subject of this paper.
The complementary weighted negative C1P patching problem, in which the least cost C1P matrix obtained from M by switching some of its 1 entries to 0 is determined, is less frequent in applications; we therefore concentrate on the positive C1P patching problem. Thus, from now on we will use the name "C1P patching" for "positive C1P patching" and we will denote this problem by WC1PP. It generalizes WC1P, since the two problems coincide when M is the zero matrix. Even if we restrict the objective function coefficients to be all nonnegative, the WC1PP Problem is NP-hard, since it generalizes the so-called average order spread problem, proved to be NP-hard by Fink and Voss [14]. Such a proof also works when the entries of the objective function are all 1s. In this case, we denote the WC1PP Problem as the Minimum Length C1P Patching (MLC1PP) Problem.
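For illustration, a tiny instance of the patching problem with nonnegative costs can be solved by plain enumeration. The sketch below assumes nonnegative costs: for a fixed column order, the cheapest patching then fills, in every row, exactly the zeros between the first and the last one. The function names are hypothetical and the approach is only viable for a handful of columns.

```python
from itertools import permutations


def fill_rows(matrix, order):
    """Patch each row so that its ones are consecutive w.r.t. the column order."""
    patched = [row[:] for row in matrix]
    for row in patched:
        # Positions (in the permuted order) of the columns holding a one.
        pos = [p for p, j in enumerate(order) if row[j] == 1]
        if pos:
            for p in range(pos[0], pos[-1] + 1):
                row[order[p]] = 1
    return patched


def min_cost_patching(matrix, cost):
    """Return (value, patching, column order); assumes all costs are >= 0."""
    n = len(matrix[0])
    best = None
    for order in permutations(range(n)):
        patched = fill_rows(matrix, order)
        value = sum(
            cost[i][j]
            for i, row in enumerate(patched)
            for j, v in enumerate(row)
            if v == 1 and matrix[i][j] == 0  # only switched entries are paid for
        )
        if best is None or value < best[0]:
            best = (value, patched, order)
    return best


M = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]
unit = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
print(min_cost_patching(M, unit)[0])  # minimum number of switched zeros
```

With unit costs this computes the optimal value of the MLC1PP Problem on the given instance.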

Applications
There are a number of applications of the WC1PP Problem that have been discussed in the literature. Actually, some of the examples mentioned in [27] are of this type, like the ones in archaeology and film-making. We give here a short description of some other typical applications and variants (see, e.g., [10] for additional details).

Open Stacks Problem
The Cutting Stock Problem is to generate a set of cutting patterns that cut raw material into smaller items of required sizes and quantities so as to minimize the waste. Once the optimal patterns have been used to perform the cutting operations, another optimization problem arises in practical applications. Indeed, the items cut from the panels are stacked around the cutting machine. Each stack remains "open" during the complete production time of the related items, and the same stack can be used only for items whose production does not overlap in time. One then has to find a permutation of the cutting patterns that minimizes either the total stack opening time or the maximum number of stacks that are simultaneously open during the cutting process. In the literature, these two problems are known as the Time of Open Stacks (TOS) Problem and the Maximum number of Open Stacks (MOS) Problem, respectively (see Linhares and Yanasse [25]). In both cases, a feasible solution is a positive patching X of the binary production matrix M, in which columns are associated with the panels, rows with the items, and M(i, j) = 1 if and only if at least one item of type i is produced in panel j. The MOS Problem then seeks such an X that minimizes the maximum number of 1s in a column, while the TOS Problem reduces to an MLC1PP Problem. The MOS problem has been investigated in the literature, see, for instance, Baptiste [2], de la Banda and Stuckey [11], and [10].
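The two objectives can be evaluated for a given pattern sequence as in the following sketch. It assumes, for illustration, that every panel takes one time unit, so that the total stack opening time equals the number of 1s in the patched matrix; the function names are not taken from the literature.

```python
def patch_for_order(M, order):
    """Positive patching induced by processing the panels (columns) in `order`."""
    X = [row[:] for row in M]
    for row in X:
        # A stack stays open from the first to the last panel producing the item.
        pos = [p for p, j in enumerate(order) if row[j] == 1]
        if pos:
            for p in range(pos[0], pos[-1] + 1):
                row[order[p]] = 1
    return X


def open_stacks_objectives(M, order):
    """Return (TOS value, MOS value) of the patching induced by `order`."""
    X = patch_for_order(M, order)
    tos = sum(sum(row) for row in X)                       # total open time
    mos = max(sum(row[j] for row in X) for j in range(len(order)))  # peak stacks
    return tos, mos


M = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
print(open_stacks_objectives(M, (0, 1, 2)))
print(open_stacks_objectives(M, (2, 0, 1)))
```

On this small instance the second sequence needs no patching at all, so it is better with respect to both objectives.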

Very Large Scale Integration circuit design (VLSI)
In VLSI design, the gates correspond to circuit nodes and different connections between them are required. Each connection involves a subset of nodes and is called a net. Note that, to connect the gates of a net, it may be necessary to cross other gates not included in the net, depending on the gate layout sequence. Also, a single connection track can be used to place nonoverlapping net wires. The total wire length determines the connection cost, while the number of tracks determines the total circuit area, which may be limited by design constraints or efficiency issues. Both indicators give an estimate of the circuit layout efficiency and depend on how the gates are sequenced. We define the Gate Matrix Connection Cost Minimization Problem (GMCCP) as the problem of finding a gate permutation such that the connection cost is minimized and the number of required tracks is limited. Let M be the incidence matrix of the circuit, i.e., M(i, j) = 1 if net i requires the connection of gate j, and M(i, j) = 0 otherwise. Then, a feasible solution to the GMCCP is a positive patching X of M and the number of required tracks for each net is the number of 1s in the corresponding column of X. Therefore, it is not difficult to see that GMCCP reduces to an MLC1PP Problem with the additional requirement that the maximum number of 1s in each column be bounded, see [10].

Outline of the paper
In Sect. 2 we introduce some notation and summarize some polyhedral results related to the Weighted C1P Problem. In Sect. 3 we consider the convex hull of the positive C1P patchings of a given matrix. In particular, we (i) describe how to extend the facet defining inequalities introduced for the C1P case to this polytope, (ii) give some conditions under which a 0-lifting procedure yields facet defining inequalities, and (iii) discuss some polyhedral properties of the dominant polyhedron. In Sect. 4 we describe the cutting planes used within the branch-and-cut procedure that we implemented to solve the WC1PP Problem, with special emphasis on the cutting planes generated by means of an oracle ("local cuts"). The implementation details of the branch-and-cut algorithm are given in Sect. 5, while the computational experiments are described and discussed in Sect. 6. Finally, we draw our conclusions in Sect. 7.

Basic definitions and results
We collect here several definitions and results that we will use in the following.
We denote by $\mathcal{P}^{m,n}_{C1}$ the set of all C1P matrices with m rows and n columns and the corresponding C1P polytope by $P^{m,n}_{C1} := \mathrm{conv}(\mathcal{P}^{m,n}_{C1})$. We will always assume that $n, m \in \mathbb{N} := \{1, 2, \dots\}$.
Let M be a given 0/1 matrix of size m × n. Recall that a positive C1P patching of M is a C1P matrix A of size m × n such that A ≥ M. Let $\mathcal{P}^+(M)$ be the set of all positive C1P patchings of M and $P^+(M) := \mathrm{conv}(\mathcal{P}^+(M))$ be the corresponding positive C1P patching polytope. The set $\mathcal{P}^+(M)$ is nonempty, since it contains the all ones matrix $\mathbf{1}_{m \times n}$ (or just $\mathbf{1}$, if the size is clear from the context). In the same way, we define $\mathbf{0}_{m \times n}$ and $\mathbf{0}$.
Moreover, we use the following notation. Let $N_0(A)$ denote the set of indices (i, j) such that A(i, j) = 0, and let $n_0(A)$ be its cardinality. We use the inner product $\langle A, B \rangle = \sum_{i=1}^{m} \sum_{j=1}^{n} A(i,j)\, B(i,j)$ for two m × n matrices A and B.
Booth and Lueker [5] gave a linear time algorithm to test whether a given matrix is C1P, which is very important in this context. It uses the so-called PQ-tree algorithm. However, if the matrix is not C1P, the algorithm does not generate a certificate. Such a certificate can be given by certain matrices that appear as minors. Indeed, Tucker [33] gave a characterization of C1P using the five types of Tucker matrices shown in Fig. 1: the infinite series of matrices $T^1_k$, $T^2_k$, and $T^3_k$ for every $k \in \mathbb{N}$ and the fixed size matrices $T^4$ and $T^5$. We need the following notation in order to state his result.
For two ordered sets I ∈ O(n) and J ∈ O(n), by $A_{IJ}$ we denote the matrix obtained by selecting the rows and the columns of A with indices in I and in J, respectively, taken in the corresponding order. We say that $A_{IJ}$ is a minor of A. Finally, Tucker's characterization can be stated as follows.

Theorem 1 (Tucker [33]) A matrix $M \in \{0,1\}^{m \times n}$ is C1P if and only if none of its minors is a Tucker matrix.
Based on this characterization, Oswald and Reinelt [28,29] gave an integer programming formulation for optimizing a linear objective function over the set $\mathcal{P}^{m,n}_{C1}$, described by inequalities that are all facet defining for the polytope $P^{m,n}_{C1}$. Such a formulation is provided by four types of inequalities based on the matrices shown in Figs. 2 and 3. The result reads as follows.
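The notion of a minor defined by two ordered index sets can be illustrated as follows. The sketch uses 0-based indices (an implementation convention, not the paper's) and a brute-force C1P test; the 3 × 3 matrix selected below is a classical non-C1P matrix and is used here as an example without claiming which entry of Fig. 1 it corresponds to.

```python
from itertools import permutations


def minor(A, I, J):
    """Rows I and columns J of A, taken in the order given by I and J."""
    return [[A[i][j] for j in J] for i in I]


def is_c1p(matrix):
    """Brute-force C1P test over all column orders (tiny matrices only)."""
    n = len(matrix[0]) if matrix else 0
    for perm in permutations(range(n)):
        ok = True
        for row in matrix:
            ones = [p for p, j in enumerate(perm) if row[j] == 1]
            if ones and ones[-1] - ones[0] + 1 != len(ones):
                ok = False
                break
        if ok:
            return True
    return False


A = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 1],
]
B = minor(A, [0, 1, 2], [0, 1, 2])
print(B, is_c1p(B), is_c1p(A))
print(minor(A, [2, 0], [3, 1]))  # the order of the indices matters
```

Since the minor B is not C1P, neither is A: a column order that makes A strictly C1P would do the same for every minor.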

Theorem 2 (Oswald and Reinelt [28])
(1) The inequalities $\langle F^1_k, X_{IJ} \rangle \le 2k + 3$ for all $k \in \mathbb{N}$ and all ordered index sets $I, J \in O(k+2)$ are facet-defining for $P^{m,n}_{C1}$ with m ≥ k + 2, n ≥ k + 2.
(2) The inequalities $\langle F^2_k, X_{IJ} \rangle \le 2k + 3$ for all $k \in \mathbb{N}$ and all ordered index sets $I \in O(k+2)$ and $J \in O(k+3)$ are facet-defining for $P^{m,n}_{C1}$ with m ≥ k + 2, n ≥ k + 3.
(3) The inequalities $\langle F^3, X_{IJ} \rangle \le 8$ for all ordered index sets $I \in O(4)$ and $J \in O(6)$ are facet-defining for $P^{m,n}_{C1}$ with m ≥ 4, n ≥ 6.
(4) The inequalities $\langle F^4, X_{IJ} \rangle \le 8$ for all ordered index sets $I \in O(4)$ and $J \in O(5)$ are facet-defining for $P^{m,n}_{C1}$ with m ≥ 4, n ≥ 5.
(5) The inequalities in Parts (1)-(4), together with the trivial inequalities $X \ge \mathbf{0}$ and $X \le \mathbf{1}$, define an integer programming formulation of the set $\mathcal{P}^{m,n}_{C1}$.

Fig. 2 Oswald-Reinelt matrices $F^1_k$ of size (k + 2) × (k + 2) and $F^2_k$ of size (k + 2) × (k + 3). Here, "+" stands for "+1", "−" for "−1", and an empty entry stands for "0". Note that $F^1_k$ is transposed with respect to the one defined in [28]

Fig. 3 Oswald-Reinelt matrices $F^3$ of size 4 × 6 and $F^4$ of size 4 × 5. As usual "+" stands for "+1", "−" for "−1"

Remark 1 In the paper of Oswald and Reinelt, there is a typo in the definition of matrix $F^4$. We verified the proof of the above theorem for the corrected matrix by enumerating all C1P matrices of size 4 × 5, checking that the inequality $\langle F^4, X_{IJ} \rangle \le 8$ is valid, and that there are 20 affinely independent vectors satisfying the inequality with equality for each $I \in O(4)$ and $J \in O(5)$ (Oswald and Reinelt proved that $P^{m,n}_{C1}$ is full-dimensional). The result then follows from the trivial lifting theorem of Oswald and Reinelt [28, Theorem 2].

Remark 2
The set $\mathcal{P}^+(M)$ and the polytope $P^+(M)$ are not monotone, i.e., if $X \in P^+(M)$ and X ≤ Y then not necessarily $Y \in P^+(M)$. Indeed, the matrix M =

More recently, other inequalities, which are also facet-defining for $P^{m,n}_{C1}$, have been presented by de Giovanni et al. [9].

Polyhedral properties of the C1P patching polytope
In this section, let M ∈ {0, 1} m×n be the given matrix.

Basic results
The polytope P + (M) has the following basic properties.
Proof Equation (1) follows by definition for all vertices of $P^+(M)$ and by convexity for any $X \in P^+(M)$. Therefore, $\dim(P^+(M)) \le n_0(M)$. We denote by $E_{ij}$ the m × n matrix with entry (i, j) equal to one and all the other entries equal to zero. Then the $n_0(M) + 1$ matrices $\mathbf{1}$ and $\mathbf{1} - E_{ij}$ for all (i, j) with M(i, j) = 0 are contained in $P^+(M)$ and are affinely independent, so that $\dim(P^+(M)) = n_0(M)$.
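The affine independence argument can be checked numerically on a small example. The sketch below builds the all-ones matrix and the all-ones matrices with one zero placed on a zero entry of M, flattens them to vectors, and computes the rank of their differences with exact rational arithmetic. All names are illustrative.

```python
from fractions import Fraction


def rank(vectors):
    """Rank of a list of vectors via Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in vec] for vec in vectors]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c] / rows[r][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def dimension_witnesses(M):
    """The all-ones matrix and all-ones-minus-one-entry matrices, flattened."""
    m, n = len(M), len(M[0])
    ones = [1] * (m * n)
    points = [ones]
    for i in range(m):
        for j in range(n):
            if M[i][j] == 0:
                p = ones[:]
                p[i * n + j] = 0  # a matrix with a single zero is always C1P
                points.append(p)
    return points


def affine_rank(points):
    """Dimension of the affine hull of the given points."""
    base = points[0]
    diffs = [[a - b for a, b in zip(p, base)] for p in points[1:]]
    return rank(diffs)


M = [[1, 0, 1], [0, 1, 0]]
print(affine_rank(dimension_witnesses(M)))  # equals the number of zeros of M
```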
Because of Proposition 1, from now on we assume that all inequalities $\langle A, X \rangle \le \alpha$ that are valid for $P^+(M)$ are in the standard form, i.e., A(i, j) = 0 for all (i, j) with M(i, j) = 1.
where E i j is defined as in the proof of Proposition 1. These matrices are contained in P + (M), since any matrix containing at most two 0s is C1P (the corresponding columns can be permuted to opposite ends of the matrix). These n 0 (M) affinely independent matrices satisfy X (i, j) = 0, which completes the proof.
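The claim that any matrix with at most two 0s is C1P can be confirmed by exhaustive enumeration for small sizes. The following sketch (with hypothetical function names and a brute-force C1P test) enumerates all such m × n matrices and reports how many there are and whether all of them are C1P.

```python
from itertools import combinations, permutations


def is_c1p(matrix):
    """Brute-force C1P test over all column orders (tiny matrices only)."""
    n = len(matrix[0])
    for perm in permutations(range(n)):
        if all(
            not ones or ones[-1] - ones[0] + 1 == len(ones)
            for ones in (
                [p for p, j in enumerate(perm) if row[j] == 1] for row in matrix
            )
        ):
            return True
    return False


def check_few_zeros(m, n, max_zeros=2):
    """Return (number of matrices with <= max_zeros zeros, whether all are C1P)."""
    cells = [(i, j) for i in range(m) for j in range(n)]
    count, all_ok = 0, True
    for k in range(max_zeros + 1):
        for zeros in combinations(cells, k):
            X = [[1] * n for _ in range(m)]
            for i, j in zeros:
                X[i][j] = 0
            count += 1
            all_ok = all_ok and is_c1p(X)
    return count, all_ok


print(check_few_zeros(3, 3))
```

Allowing three zeros breaks the property: the 3 × 3 matrix with zeros on the diagonal is not C1P.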
Another fundamental fact is that $P^+(M)$ is a face of $P^{m,n}_{C1}$. Indeed, any entry (i, j) of a matrix $X \in P^+(M)$ for which M(i, j) = 1 is fixed to 1. Thus, the inequalities in Theorem 2 yield nontrivial valid inequalities for $P^+(M)$. In particular, their IP formulation yields a valid formulation for the positive C1P patching problem as well. It turns out, however, that these inequalities do not always define facets. The following results deal with the corresponding conditions.
The support of a matrix A is the submatrix obtained from A by removing all its zero rows and its zero columns.

Proof The proof of Claim (1) is obtained by slightly modifying the arguments in [29]. To shorten notation, let $A = F^1_k$ and $\alpha = 2k + 3$. Thus, $\langle A, X_{IJ} \rangle \le \alpha$ ($(A, \alpha)$, for short) is the inequality under investigation. Let $\langle B, X_{IJ} \rangle \le \beta$ be a valid inequality that is satisfied with equality by all the feasible solutions that satisfy $(A, \alpha)$ with equality. In order to prove the claim, it is enough to show that there exists δ > 0 such that B(i, j) = δ A(i, j) for any $(i, j) \in N_0(M_{IJ})$; recall that all variables $X_{IJ}(i, j)$ with $M_{IJ}(i, j) = 1$ are set to 1 and that $P^+(M_{IJ})$ has dimension $n_0(M_{IJ})$ (as for Claim (2) in Proposition 1). Moreover, we will show that there exists at least one positive patching of $M_{IJ}$ that satisfies both inequalities with equality, which implies β = δα. Consequently, the faces defined by $\langle A, X_{IJ} \rangle \le \alpha$ and $\langle B, X_{IJ} \rangle \le \beta$ coincide.
Consider the matrix Z as depicted in Fig. 4. Moreover, for each $(i, j) \in S_0$, let $Z_{ij}$ be defined as follows:

Observe that $\langle A, Z \rangle = \langle A, Z_{ij} \rangle = \alpha$. In the following we prove that Z and $Z_{ij}$ are positive patchings of $M_{IJ}$. First, notice that both Z and $Z_{ij}$ support $M_{IJ}$, by construction. Then, Z is strictly C1P, and it is not difficult to check that $Z_{ij}$ is C1P as well. Indeed, consider the case i ≤ k + 1. If j = 1, $Z_{ij}$ is strictly C1P, and if j > 1 one has to exchange columns j and k + 1 (columns j and 2, if i = k + 1) to get a strictly C1P matrix. On the other hand, if i = k + 2, moving columns j and k + 1 to the first two positions yields a strictly C1P matrix. Therefore, Z and $Z_{ij}$ are roots of $(A, \alpha)$ (and thus also of $(B, \beta)$). Hence, by subtracting the equations $\langle B, X \rangle = \beta$ for X = Z and $X = Z_{ij}$, we obtain

If (i, j) = (k + 2, k + 1), then $W_{ij}$ is strictly C1P. If i ≤ k and j = k + 2, then it is enough to move column k + 2 between columns i and i + 1 to get a strictly C1P matrix. Finally, if (i, j) = (k + 1, 1), we obtain a strictly C1P matrix by moving column k + 2 to the first position. Since all these $W_{ij}$ are roots of $(A, \alpha)$ (and therefore of $(B, \beta)$) that differ in only one element, all coefficients B(i, j) for $(i, j) \in S_-$ have the same value, say γ.
If $M_{IJ} = T^2_{k-1}$ or $M_{IJ} = T^1_k$ then we are done, since, in this case, $S_+ = \emptyset$.
Therefore assume that $M_{IJ} \le T^1_k$ and $S_+ \ne \emptyset$. Notice that $T^1_k - E_{ij}$ is a C1P positive patching of $M_{IJ}$ for all $(i, j) \in S_+$. Moreover, every such vector is a root of $(A, \alpha)$ and hence of $(B, \beta)$. It follows that all coefficients B(i, j) for $(i, j) \in S_+$ have the same value, say δ. Let $(\bar\imath, \bar\jmath) \in S_+$ (in particular, $B(\bar\imath, \bar\jmath) = \delta$). Then Eī j are C1P positive patchings of $M_{IJ}$ and $(B, \beta)$ is tight for both of them. If $\bar\imath \le k$, it follows that

Similarly, γ + δ = 0 holds for $\bar\imath = k + 1$ and $\bar\imath = k + 2$, too.
In conclusion, we know that $(B, \beta)$ is a multiple of $(A, \alpha)$, which concludes the proof.
Claim (2) can be proved by the same arguments used by Oswald and Reinelt [29] to prove case (3) of Theorem 2.

Remark 3 The statement of Part (1) in Theorem 3 becomes false, if
. Similarly, Part (2) in Theorem 3 becomes false if $M_{IJ} = T^3_k$: for k = 2, the inequality $\langle F^2_2, X \rangle \le 7$ does not define a facet for $P^+(T^3_2)$.

Following the line of Theorem 3, we also checked whether the inequalities $\langle F^3, X \rangle \le 8$ and $\langle F^4, X \rangle \le 8$ are facet defining for $P^+(T^4)$ and $P^+(T^5)$, respectively. The following two observations address these two questions.

Remarks 3, 4, and 5 can be checked by using polymake [17][18][19]. The convex hull algorithm used by polymake is cdd [15], which in turn relies on the exact arithmetic of GMP [20]. The same holds for the rank computations that we have to perform in polymake.

Projection and lifting
In the following we will deal with the relation between the positive patching polytope P + (M) and the polytope P + (M I J ) for a minor M I J .
Proof If X ∈ P + (M), then by definition X is C1P and so is the submatrix X I J . Furthermore, since M ≤ X , then M I J ≤ X I J . Hence, X I J ∈ P + (M I J ). This shows the first claim.
If |J | = n, for Y ∈ P + (M I J ) define the following matrix: It follows from the construction that X ∈ P + (M) and X I J = Y , which shows the second claim.
As we will show in the following, an inequality keeps the property of being facet-defining if we restrict it to its support (i.e., if we remove the rows and columns with all zero coefficients). In order to prove this property, we first need the following result. Here, an inequality $\langle A, X \rangle \le \alpha$, with $A, X \in \mathbb{R}^{I \times J}$, is said to be obtained by (trivially) lifting the inequality $\langle A', X' \rangle \le \alpha$, with $A', X' \in \mathbb{R}^{I' \times J'}$ and

Then the trivially lifted inequality $\langle A, X \rangle \le \beta$ is valid for $P^+(M)$.
Proof Assume that there exists $B \in \mathbb{R}^{m \times n}$ with B(i, j) = 0 for all $(i, j) \notin I \times J$, such that $\langle B_{IJ}, X_{IJ} \rangle \le \gamma$ defines a facet for $P^+(M_{IJ})$ and i.e., $\langle A_{IJ}, X_{IJ} \rangle \le \beta$ does not define a facet. Since both A and B are zero outside I × J, by Lemma 2, $\langle B, X \rangle \le \gamma$ is valid for $P^+(M)$ and i.e., $\langle A, X \rangle \le \beta$ does not define a facet for $P^+(M)$, a contradiction.
The reverse question asks whether we can get valid or even facet-defining inequalities from inequalities valid for minors.
Oswald and Reinelt [28,29] proved that every trivial row and column lifting yields facets for the C1P polytope $P^{m,n}_{C1}$. For positive C1P patching polytopes, this is not true. As above, one can verify that $\langle A, X \rangle \le 7$ with

Remark 6 Consider the matrix
Moreover, the trivially column lifted inequality $\langle [A, \mathbf{0}], X \rangle \le 7$ does not define a facet for the polytope $P^+([M, \mathbf{1}_{m \times 1}])$. (Note that in this case we have dim P

Despite these examples, we can give some sufficient conditions on the matrix M to obtain facet defining inequalities of $P^+(M)$ by trivially lifting the inequalities defined in Theorem 3. The arguments are similar to the ones given in the proof of Theorem 3. We first consider the case of a general facet defining inequality.

and $z_2 = n_0(M)$. In the following we will construct $z_1 + z_2$ affinely independent roots of $\langle A, X \rangle \le \beta$. Since by Proposition 1 the dimension of $P^+(M)$ is $z_1 + z_2$, this will prove the claim of the theorem.
Since $\langle A, X \rangle \le \beta$ is facet defining for $P^+(M_{IJ})$, there exist $z_1$ affinely independent roots that can easily be extended to the $(m_1 + m_2) \times n$ space of M by adding the rows of matrix M, so as to obtain matrices $X_1, \dots, X_{z_1}$. In particular, and w.l.o.g., let the columns of M be ordered in such a way that $X_{z_1}$ is strictly C1P. Observe now that, in order to be strictly C1P for any permutation of the columns, each row of M must either contain at most one 1, or have all 1 entries. Now, iteratively for $p = z_1 + 1, \dots, z_1 + z_2$, let $X_p$ be obtained from $X_{p-1}$ by switching to 1 an entry $(i_p, j_p) \in N_0(X_{p-1})$ such that: (i) $m_1 + 1 \le i_p \le m_1 + m_2$ and (ii) $X_p$ is strictly C1P (as $X_{p-1}$ is also strictly C1P, this can be easily done). Therefore, by construction, the matrices $X_1, \dots, X_{z_1 + z_2}$ are affinely independent roots of $\langle A, X \rangle \le \beta$, which proves the claim.
In some cases, the inequalities in Theorem 2 can also be trivially lifted and still induce facets:

Proof The proof of this theorem mimics the one given for Theorem 3.
to the (k +2+m 1 )×(k +2+n 1 ) space, and let (B, β) be an inequality that is satisfied with equality by all the feasible solutions that satisfy (A, α) with equality.
By the same arguments used in the proof of Theorem 3 we can show that be defined as in the proof of Theorem 3 and illustrated in Fig. 4, and let

Observe that $Z_0$ is strictly C1P and it is a root of $(A, \alpha)$ (and consequently of $(B, \beta)$). Let z be the number of 0s in the matrix ) and, iteratively for each s = 1, . . . , z, let $Z_s$ be obtained from $Z_{s-1}$ by setting an entry

It remains to consider the coefficients of B corresponding to the matrix $M_3$. In this case, let $Z'_0$ be the matrix obtained from $Z_0$ by setting all the entries of $[M_1, M_2]$ to 1. Again $Z'_0$ is strictly C1P and a root of both $(A, \alpha)$ and $(B, \beta)$. Now, iteratively for each row i from k + 2 down to 1 and for each column j from k + 3 to $k + 2 + n_1$, let $Z'_s$ be obtained from $Z'_{s-1}$ by setting $Z'_s(i, j) = 1$. Again, it is not difficult to see that $Z'_s$ is C1P (if i ≤ k, it suffices to move column k + 2 to the last position to get a strictly C1P matrix) and that it is a root of $(A, \alpha)$ (and thus of $(B, \beta)$). This implies that B(i, j) = 0 for all the entries corresponding to the matrix $M_3$. Therefore, $(B, \beta)$ coincides with $(A, \alpha)$, and this concludes the proof.
Similar arguments to those used for Claim (1) also prove Claim (2).
It is an open question whether one can obtain necessary and sufficient conditions for trivial lifting in general.

The dominant polyhedron
Of special interest is the case when the objective function of the WC1PP problem to be minimized has only nonnegative coefficients. This, for example, is the case in real world applications where turning a zero into a one is a costly, rather than a profitable, operation.
Clearly, minimizing a nonnegative linear function over $P^+(M)$ is equivalent to minimizing the same function over $D^+(M)$, the dominant of $P^+(M)$, defined as the Minkowski sum of $P^+(M)$ and $\mathbb{R}^{m \times n}_+$:

$$D^+(M) := P^+(M) + \mathbb{R}^{m \times n}_+ .$$

Unfortunately, we do not have at hand an integer linear description of $D^+(M)$. Despite this fact, we can derive facet defining inequalities for $D^+(M)$ by minimizing over this polyhedron, as will be discussed in Sect. 4. Therefore, here and in Sect. 4.3.5, we discuss some properties of $D^+(M)$ that yield algorithmic advantages exploited in the solution algorithm. We start with the following result on the facet defining inequalities of $D^+(M)$. As usual in the literature on dominant polyhedra, the valid inequalities are presented here in the form $\langle A, X \rangle \ge \alpha$, so that we can assume, w.l.o.g., A ≥ 0 and α ≥ 0.

Proof We will first prove Case (1). Since for and j ∈ [n]; consequently, in order for the inequality to be supporting, α ≥ 0.
Now observe that there exists at least one positive patching of M, say $\bar{X}$, with $\langle A, \bar{X} \rangle = \alpha$ (i.e., $\bar{X}$ is a root of $(A, \alpha)$) such that $\bar{X}(\bar\imath, \bar\jmath) = 1$. Indeed, if not, all roots of $\langle A, X \rangle \ge \alpha$ would also be roots of $X(\bar\imath, \bar\jmath) \ge 0$, contradicting the hypothesis that $(A, \alpha)$ defines a facet that does not arise from a trivial inequality of some variable.
Moreover, let $\tilde{X}$ be obtained from $\bar{X}$ by setting to 0 all the elements $(\bar\imath, j)$ with j ∈ Q. Then, row $\bar\imath$ of $\tilde{X}$ coincides with row $\bar\imath$ of M and thus, as it contains at most one 1 entry, cannot contribute to any Tucker minor. Consequently, since $\bar{X} \in \mathcal{P}^+(M)$, also $\tilde{X} \in \mathcal{P}^+(M)$. Therefore, $\langle A, \tilde{X} \rangle < \langle A, \bar{X} \rangle = \alpha$, contradicting the assumption that $\langle A, X \rangle \ge \alpha$ is valid for $D^+(M)$.
Similar arguments also prove Case (2) (here we use the observation that no Tucker minor contains a column with all 0 entries).
The algorithmic consequences of this theorem will be detailed in Remark 9 below.

Separation
In the branch-and-cut algorithm described in Sect. 5 below we make use of three kinds of cutting planes:
• a dictionary of inequalities derived from the Tucker matrices that appear as minors of the given matrix M;
• exactly and heuristically separated inequalities as stated in Theorem 2, generated from the current fractional solution; and
• cutting planes based on an optimization oracle (local cuts).
We describe the corresponding separation algorithms in the following.

A dictionary of inequalities
In order to create a dictionary of Tucker minors of the input matrix M, we use the following three procedures:
(1) Relying on the proof of Tucker [33] for the characterization of C1P-matrices in Theorem 1, we use the method of Lekkerkerker and Boland [24] for recognizing interval graphs. Let G = (U ∪ V, E) be the bipartite graph associated with M, where U and V are nodes corresponding to the rows and columns of M, respectively, and (1) or (2), we look for other Tucker minors of the same type that can be generated by replacing one row or column in the current submatrix by equal parts of different rows or columns of M.
For each Tucker minor T provided by the procedures (1), (2), and (3), we generate the following inequalities, which are stored in a pool that is separated during the branch-and-cut procedure: (a) inequalities from Theorem 2. In particular, if T is $T^1_k$ or $T^2_{k-1}$ with k > 1, we generate k + 2 symmetric copies of the corresponding inequality $\langle F^1_k, X_{IJ} \rangle \le 2k + 3$, all violated by T. In this case, observe that, because of Theorem 3, the produced inequalities are facet defining for $P^+(T)$.
we produce all the nontrivial facet defining inequalities of D + (T ).
The facet defining inequalities produced for the cases (a) and (b) are generated off-line using the software suite polymake mentioned above.

Inequalities from Oswald and Reinelt
In order to separate the current fractional LP-solution $X^*$, we generate inequalities as stated in Theorem 2. In particular, we apply two separation procedures. First, we use a rounding algorithm as described by Oswald and Reinelt [28,29]. The general idea is to first round $X^*$ to an integer matrix and then, if the rounded matrix is not C1P, apply Steps (2) and (3) described in the previous section to it. Here, for each Tucker minor T, we only generate inequalities from Theorem 2, since the facet defining inequalities of $P^+(T)$ or $D^+(T)$ may, in general, not be valid for $P^+(M)$.
Then, for the inequalities (1) and (2) of Theorem 2, we also apply the exact separation procedures described in [29]. Such algorithms reduce the corresponding separation problem to the solution of a sequence of shortest path problems in a set of suitable graphs. They are rather time consuming (their overall complexity is $O(n^3(n + m))$ and $O(n^4(n + m))$, respectively); therefore, we apply them only if no cuts are generated by the rounding procedure described above.
All the generated inequalities are stored in a pool that is used for separation at every later cutting plane phase. See [28,29] for more details on this method.

Oracle-based separation
One can generate valid (facet defining) inequalities violated by an arbitrary given point by means of an optimization oracle, in the case when the size of the matrix M is small. Our approach refers to so-called "local cuts", see Applegate, Bixby, Chvátal, and Cook [1] and to the so-called "target cuts", see Buchheim et al. [6].

The local cuts method
We first describe the general idea, which is nothing but a rephrasing of the simpler of the two directions of the polynomial-time equivalence of the separation and the optimization problem given, e.g., by Grötschel, Lovász, and Schrijver [21].
Assume that we are given a nonempty polyhedron $P \subseteq \mathbb{R}^d$, a point $x^* \in \mathbb{R}^d$, and we want to solve the separation problem for P with respect to $x^*$. In addition, we have an (efficient) optimization oracle for the following problem:

$$\min \{ \langle c, x \rangle : x \in P \} \qquad (2)$$

for any $c \in \mathbb{R}^d$. The goal is to find an inequality $\langle a, x \rangle \ge a_0$ that is valid for P and violated by $x^*$, or show that $x^* \in P$ (for reasons that will be clear in the forthcoming discussion, here we deal with valid inequalities in the ≥ form). This can be obtained by solving the following separation problem:

$$
\begin{aligned}
\text{(LCSP)} \quad \min\ & \langle x^*, a \rangle - a_0 \\
\text{s.t.}\ & \langle v, a \rangle - a_0 \ge 0 \quad \text{for all } v \in V, && (3) \\
& \langle r, a \rangle \ge 0 \quad \text{for all } r \in \mathrm{cone}(R), && (4) \\
& \text{bounding constraints on } (a, a_0), && (5) \\
& a,\ a_0 \ \text{free},
\end{aligned}
$$

where V and R are the sets of vertices and of extreme rays, respectively, of P. Clearly, an optimal solution $(a^*, a^*_0)$ to (LCSP) defines an inequality $\langle a^*, x \rangle \ge a^*_0$ that is valid for P, because it satisfies (3) and (4), and that is violated by $x^*$ if the optimal value of (LCSP) is negative. The constraints (5) are used to guarantee that (LCSP) is always bounded.

Usually (LCSP) is solved with a delayed row generation, i.e., with a cutting plane procedure that iteratively constructs the sets of constraints (3)-(4). At each iteration of the algorithm, the current solution $(\bar{a}, \bar{a}_0)$ is checked for feasibility w.r.t. P. This can be done by solving (2) with $c = \bar{a}$, by means of the optimization oracle. If the optimal value is at least $\bar{a}_0$, then $(\bar{a}, \bar{a}_0)$ is optimal for (LCSP) and we stop. Otherwise, the oracle returns either a finite optimal solution $\bar{x}$ with $\langle \bar{a}, \bar{x} \rangle < \bar{a}_0$ or a direction $\bar{r}$ with $\langle \bar{r}, \bar{a} \rangle < 0$. In this case, the inequality $\langle \bar{x}, a \rangle - a_0 \ge 0$ or the inequality $\langle \bar{r}, a \rangle \ge 0$, respectively, is added to the current constraint set (3)-(4), and the procedure iterates.
Recall that the purpose of the separation problem (LCSP) discussed in this section is to provide the inequalities for a cutting plane procedure that solves an optimization problem over P. However, we just showed how to solve the separation problem (LCSP) over the same polyhedron P by solving a series of optimization problems over P, although with different objective functions. This approach may look bizarre at first sight: why not use the oracle upfront to optimize over P?
The key idea is to set up a procedure where Problem (2) is solved over a polyhedron $\bar{P}$ whose size is much smaller than that of the original polyhedron P. Once a separating inequality is generated in the space of $\bar{P}$, some lifting technique is used to end up with an inequality in the original space. We sketch such a procedure for the case of WC1PP:
LC1) We identify a submatrix $\bar{M}$ of M, and we denote by $\bar{X}$ the corresponding submatrix of the fractional matrix $X^*$ and set $\bar{P} = P^+(\bar{M})$ (for details, see Sect. 4.3.5);
LC2) We apply the above described "local cut" procedure to find a valid separating inequality $\langle \bar{A}, X \rangle \ge \bar{\alpha}$, using an optimization oracle over the polytope $\bar{P}$;
LC3) We finally lift such an inequality $(\bar{A}, \bar{\alpha})$ to an inequality $(A, \alpha)$ that is valid for $P^+(M)$ and is violated by $X^*$. To do so, we apply the trivial lifting procedure, whose polyhedral properties have been investigated in Sect. 3.2.
Observe that, if the matrix $\bar{M}$ is chosen sufficiently small, even an optimization oracle based on total enumeration of all feasible patchings can be used in step LC2 of the above procedure. However, even if $X^* \notin P^+(M)$, there is no guarantee that $\bar{X}$ also falls outside $P^+(\bar{M})$. Therefore, if $\bar{M}$ is too small, the odds that the procedure terminates with no separating inequality found are pretty high. Thus, a tradeoff has to be made in practice.
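An optimization oracle based on total enumeration, as mentioned for step LC2, could look like the following sketch. It is only an illustration under simple assumptions: the small matrix is given as a list of lists, the objective c may have arbitrary signs (as the row generation calls the oracle with varying objectives), C1P is tested by brute force, and all function names are hypothetical.

```python
from itertools import permutations, product


def is_c1p(matrix):
    """Brute-force C1P test over all column orders (tiny matrices only)."""
    n = len(matrix[0])
    for perm in permutations(range(n)):
        ok = True
        for row in matrix:
            ones = [p for p, j in enumerate(perm) if row[j] == 1]
            if ones and ones[-1] - ones[0] + 1 != len(ones):
                ok = False
                break
        if ok:
            return True
    return False


def enumeration_oracle(M_small, c):
    """Minimize <c, X> over all positive C1P patchings X of M_small."""
    zeros = [(i, j) for i, row in enumerate(M_small)
             for j, v in enumerate(row) if v == 0]
    best_value, best_X = None, None
    # Every patching keeps the ones of M_small and chooses values for its zeros.
    for bits in product((0, 1), repeat=len(zeros)):
        X = [row[:] for row in M_small]
        for (i, j), b in zip(zeros, bits):
            X[i][j] = b
        if not is_c1p(X):
            continue
        value = sum(c[i][j] * X[i][j]
                    for i in range(len(X)) for j in range(len(X[0])))
        if best_value is None or value < best_value:
            best_value, best_X = value, X
    return best_value, best_X


M_small = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]
c = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
print(enumeration_oracle(M_small, c))
```

The oracle always returns a finite optimum, since the all-ones matrix is a feasible patching; the running time grows as 2 to the number of zeros times the number of column orders, which is why the submatrix must be kept small.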
The separation process described so far is usually called dual separation. An alternative approach is the so-called primal separation where one seeks for a valid inequality violated by the current fractional solution that, in addition, is satisfied at equality by a given integral vertex p of the polyhedron P. The rationale behind this kind of separation is that, if p turns out to be an optimal solution, no inequalities will be generated that are not tight at the optimum and thus not necessary to prove optimality. When, as in our case, P has only 0/1 vertices, primal separation and optimization are polynomially equivalent [12].
In our context, primal separation with respect to an integer vertex p of P is simply achieved by adding the constraint $\langle p, a \rangle - a_0 = 0$ to the linear program (LCSP).

Generating local cuts of high dimensions
The inequalities generated by the method described in Sect. 4.3.1 in general define faces of the polytope P whose dimension is not necessarily maximal. Moreover, the lifting procedures that generate the inequality in the original space typically do not increase the dimension, unless significant computational effort is spent. Therefore, it is advisable to modify the "local cut" scheme in order to produce inequalities that define high dimensional faces of P.
Applegate et al. [1] (see also Chvátal et al. [8]) presented a procedure, called "tilting", that takes a separating inequality (possibly not facet defining) and terminates with a separating inequality that defines a facet of P. This procedure starts with a maximal set S of affinely independent points of P which are roots of the current inequality and iteratively extends S with a new point that is found by a (possibly long) series of calls to the optimization oracle. The procedure stops when |S| = dim(P).
Here we use the following, slightly different, approach to obtain the same result. As usual, we are given a polyhedron P and a point $\bar{x} \notin P$ to be separated. Let $\langle a, x \rangle \ge a_0$ be a valid inequality for P with $\langle a, \bar{x} \rangle < a_0$. We are also given a point $x_0 \in P$, possibly chosen in the interior of the polyhedron. Let $\bar{z}$ be the intersection of the segment $[\bar{x}, x_0]$ with a facet of P. With probability 1, such a facet F is unique and $\bar{z}$ belongs to its interior. If this is not the case, let F be the intersection of all the facets of P containing $\bar{z}$. The procedure terminates with an inequality that defines F.
The algorithm iteratively generates a sequence of points in the segment $[\bar{x}, \bar{z}]$, starting with $\bar{x}$ and ending with $\bar{z}$, together with a separating inequality for each of these points. At each iteration i, we have a point $\bar{x}^i$ that needs to be separated and we look for a separating inequality by means of the optimization oracle. If no such inequality exists, then $\bar{x}^i = \bar{z}$ and we are done. Otherwise, let $\langle a^i, x \rangle \ge a^i_0$ be the inequality generated; then we set $\bar{x}^{i+1}$ to the point of the segment $[\bar{x}, x_0]$ that satisfies $\langle a^i, x \rangle \ge a^i_0$ at equality.
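The update step of this iteration can be illustrated with a short Python sketch. It only computes the point of the segment at which a given inequality becomes tight; the separation oracle is not modeled. The function name `point_on_cut` and the representation of vectors as plain lists are assumptions made for this illustration.

```python
def point_on_cut(a, a0, x_bar, x0):
    """Return the point of the segment [x_bar, x0] with <a, x> = a0.

    Assumes <a, x_bar> < a0 (x_bar violates the inequality) and
    <a, x0> >= a0 (x0 lies in the polyhedron, so it satisfies it).
    """
    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    lhs_bar = dot(a, x_bar)
    lhs_0 = dot(a, x0)
    if not (lhs_bar < a0 <= lhs_0):
        raise ValueError("inequality must cut off x_bar and be valid at x0")
    # Solve <a, x_bar + lam * (x0 - x_bar)> = a0 for lam in (0, 1].
    lam = (a0 - lhs_bar) / (lhs_0 - lhs_bar)
    return [xb + lam * (xi - xb) for xb, xi in zip(x_bar, x0)]
```

Since the inequality is violated at one end of the segment and satisfied at the other, the step length `lam` always lies in (0, 1], so the returned point stays on the segment.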

The target cuts method
A similar method was proposed by Buchheim et al. [6] for the case when P is a polytope and a point $x_0$ in its interior is known; in particular, P is full-dimensional. The corresponding model is based on the solution of the following linear program
(TP)  $\min \{ \langle a, \bar{x} - x_0 \rangle : \langle a, v - x_0 \rangle \ge -1 \text{ for all } v \in V \}$,
where, as before, V is the set of vertices of P. Let $P_0 := \{x - x_0 \in \mathbb{R}^d : x \in P\}$ be the polytope P shifted by $x_0$. By assumption, 0 belongs to the interior of $P_0$. Observe that (TP) is derived from (LCSP) by setting $P = P_0$. Therefore, (4) can be removed because $P_0$ is a polytope, and $a_0$ can be set to −1 without loss of generality, since 0 is an interior point of $P_0$. Moreover, (5) can also be removed because, by setting $a_0$ to −1, Problem (TP) cannot be unbounded.
An optimal solution $a^*$ to (TP) provides an inequality valid for $P_0$ that is violated by $\bar{x} - x_0$ if its value is strictly less than −1.
The advantage of this approach is that such an inequality $\langle a^*, x - x_0 \rangle \ge -1$ is also facet defining for $P_0$, if (TP) is solved by the simplex algorithm or by any other method that provides vertex solutions. This can be seen by observing that an optimal basis has n rows, corresponding to points of $P_0$ that are necessarily linearly independent and are roots of $(a^*, -1)$, see [6].
Besides the fact that the knowledge of an interior point of P is mandatory, a possible drawback of this method is that the constraint matrix of (TP) is usually dense and has non-integral coefficients (due to the shifting by the vector x 0 ) also in the case when the vertices of P are sparse and binary.
As in the case of local cuts, Problem (TP) is solved with a delayed row generation performed by calling an optimization oracle for Problem (2).

Interior point
The procedures described in Sects. 4.3.2 and 4.3.3 need to be given a point $x_0$ in the (strict) interior of P as input. By the well-known theorem of Carathéodory, such a point can be obtained as a (strictly positive) convex combination of d + 1 affinely independent points $x_1, \dots, x_{d+1} \in P$. This task can be achieved rather easily in our case. Indeed, let $\bar{M}$ be the $\bar{m} \times \bar{n}$ submatrix of M that we identified in order to produce a violated inequality, and recall that the dimension d is the number of 0 entries of $\bar{M}$. Consider the all-ones matrix $\mathbf{1}_{\bar{m} \times \bar{n}}$ and the d matrices $\mathbf{1}_{\bar{m} \times \bar{n}} - E_{ij}$, for all (i, j) such that $\bar{M}(i, j) = 0$ (again, $E_{ij}$ is the $\bar{m} \times \bar{n}$ matrix with entry (i, j) equal to one and all other entries equal to zero). It is not difficult to see that these d + 1 matrices are affinely independent and that they are all C1P. If we take their convex combination with weights $\frac{1}{d+1}$, we get a point $X_0 \in [0, 1]^{\bar{m} \times \bar{n}}$ in the strict interior of $P^+(\bar{M})$. In particular, $X_0(i, j) = 1$ if $\bar{M}(i, j) = 1$, and $X_0(i, j) = \frac{d}{d+1}$ otherwise.
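The closed form of this interior point is easy to compute directly. The following Python sketch does so with exact rational arithmetic; the function name `interior_point` and the list-of-lists matrix representation are assumptions made for this illustration.

```python
from fractions import Fraction


def interior_point(M_bar):
    """Average of the all-ones matrix and the d matrices 1 - E_ij.

    M_bar is a 0/1 matrix given as a list of rows; d is its number of
    zero entries. Entries that are 1 in M_bar stay 1, all other entries
    become d / (d + 1).
    """
    d = sum(row.count(0) for row in M_bar)
    weight = Fraction(d, d + 1)
    return [[Fraction(1) if entry == 1 else weight for entry in row]
            for row in M_bar]
```

For a 2 × 2 identity matrix, for example, d = 2 and the two off-diagonal entries become 2/3.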

Optimization oracles
As mentioned before, a key element in local or target cut generation is the optimization oracle for P. Since optimization over $P^+(M)$ is an NP-hard problem, it seems hard to avoid some kind of pseudo-enumeration. In particular, for small sizes of the matrix M a brute-force approach is reasonably fast. It is possible to generate all feasible solutions of $P^+(M)$ by generating all permutations of the columns of M. This enumeration, which takes n! steps, can be implemented with the Johnson-Trotter algorithm (see, e.g., [23]), which has the advantage of generating the next permutation by exchanging two consecutive columns, thus simplifying the objective function computation. For each such permutation, it is then easy to generate all positive patchings that make the permuted matrix, say $\tilde{M}$, strongly C1P, and to find the patching, say $\tilde{X}$, that minimizes the objective function $\langle C, \tilde{X} \rangle$. For each row i of $\tilde{M}$, let $i_\ell$ and $i_r$ be the column indices of its leftmost and of its rightmost 1-entry, respectively. Then necessarily $\tilde{X}(i, j) = 1$ for all $i_\ell \le j \le i_r$. Moreover, we can extend the sequence of 1's of $\tilde{X}$ to the left of $i_\ell$ and to the right of $i_r$ if the contribution to the objective function corresponding to each of these two extensions is negative. If row i of $\tilde{M}$ contains at least one 1, this operation can clearly be performed in O(n) time. If all entries of row i of $\tilde{M}$ are 0's, the optimal sequence of consecutive 1's can be found, still in O(n), by Kadane's algorithm (see, e.g., Column 7 in [3]). In conclusion, the oracle runs in O(n m n!) time.
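The enumeration oracle just described can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the implementation used in the experiments: permutations are produced by `itertools.permutations` rather than by the Johnson-Trotter algorithm, so no incremental update of the objective is attempted, and all function names are hypothetical.

```python
from itertools import permutations


def cheapest_prefix(costs):
    """Return (value, length) of the cheapest, possibly empty, prefix."""
    best, best_len, run = 0, 0, 0
    for length, c in enumerate(costs, start=1):
        run += c
        if run < best:
            best, best_len = run, length
    return best, best_len


def cheapest_block(costs):
    """Kadane's algorithm: cheapest, possibly empty, contiguous block.

    Returns (value, lo, hi); the block occupies positions lo..hi-1.
    """
    best, best_lo, best_hi = 0, 0, 0
    run, lo = 0, 0
    for j, c in enumerate(costs):
        if run > 0:  # a positive running sum can only hurt: restart here
            run, lo = 0, j
        run += c
        if run < best:
            best, best_lo, best_hi = run, lo, j + 1
    return best, best_lo, best_hi


def patch_row(row, cost):
    """Cheapest consecutive-ones row dominating `row` in this column order."""
    n = len(row)
    ones = [j for j in range(n) if row[j] == 1]
    if not ones:
        value, lo, hi = cheapest_block(cost)
    else:
        left, right = ones[0], ones[-1]
        # Extend to the left or right only while this lowers the cost.
        left_val, left_len = cheapest_prefix(cost[:left][::-1])
        right_val, right_len = cheapest_prefix(cost[right + 1:])
        lo, hi = left - left_len, right + 1 + right_len
        value = sum(cost[left:right + 1]) + left_val + right_val
    return value, [1 if lo <= j < hi else 0 for j in range(n)]


def brute_force_oracle(M, C):
    """Minimize <C, X> over positive patchings X of M, trying all column orders."""
    m, n = len(M), len(M[0])
    best = None
    for perm in permutations(range(n)):
        total, X = 0, [[0] * n for _ in range(m)]
        for i in range(m):
            value, patched = patch_row([M[i][j] for j in perm],
                                       [C[i][j] for j in perm])
            total += value
            for pos, j in enumerate(perm):
                X[i][j] = patched[pos]  # map back to the original column
        if best is None or total < best[0]:
            best = (total, X, perm)
    return best
```

The function returns the optimal value, the patching in the original column order, and one column order that certifies it. Each row is handled in O(n) time per permutation, in line with the O(n m n!) bound stated above.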
An interesting simplification occurs when WC1PP has a nonnegative objective function, which is the case in most of the applications, in particular in all those mentioned in Sect. 1.
When C ≥ 0, the optimal solutions over $P^+(M)$ and over $D^+(M)$ coincide. Therefore, any optimal solution $X^*$ of the current linear relaxation is either optimal for WC1PP or it lies outside $D^+(M)$. Thus, we can separate $X^*$ from $D^+(M)$ instead of $P^+(M)$. Since all valid inequalities of $D^+(M)$ have nonnegative coefficients, the set R in Problem (LCSP) is the set of the d extreme rays of $\mathbb{R}^d_+$ and we can simplify the formulation accordingly.
Moreover, in this case, as an optimization oracle we can use a straightforward adaptation of the dynamic programming algorithm that de la Banda and Stuckey [11] presented for the Open Stack problem. For completeness, we describe the version of this algorithm for the Weighted Positive C1P Patching problem with the cost matrix $C \in \mathbb{R}^{m \times n}_+$. Assume that we build up the strictly C1P optimal patching $X^*$ column by column, such that at a given point in the algorithm we have constructed a submatrix $X^*_{[m],S}$ whose set of columns is S, and we still have to complete it with columns from $\bar{S} := [n] \setminus S$.
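Let $s \in \bar{S}$ be some column that will be placed after S and before $\bar{S} \setminus \{s\}$. Then, to make $X^*_{[m],S \cup \{s\}}$ strongly C1P, we have to put 1's in column s at every row i with M(i, s) = 0 that has a 1-entry both in some column of S and in some column of $\bar{S} \setminus \{s\}$.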

Remark 8
The implementation of this dynamic programming algorithm runs amazingly fast. Instances with up to 25 columns can be solved in under a minute, almost independently of the number of rows. For slightly more columns, however, the method breaks down due to memory requirements.
Observe that the dynamic programming based oracle cannot be used to generate target cuts for D + (M), since the inequalities produced may need negative coefficients in general.
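To make the column-by-column idea concrete, the following Python sketch implements a dynamic program over subsets of columns for nonnegative costs. It is one straightforward reading of the scheme and may differ in details from the algorithm of [11] and from the implementation used here; the function name `dp_oracle` and the bitmask encoding of column sets are assumptions. It returns only the cost of the entries switched from 0 to 1.

```python
def dp_oracle(M, C):
    """Minimum patching cost for C >= 0 via a DP over column subsets.

    f[S] is the cheapest cost paid inside the columns of S when those
    columns are placed first, in the best possible order. Memory grows
    as 2^n, so this is only meant for small numbers of columns.
    """
    m, n = len(M), len(M[0])
    # row_mask[i] has bit j set exactly when M[i][j] == 1.
    row_mask = [sum(1 << j for j in range(n) if M[i][j] == 1)
                for i in range(m)]
    full = (1 << n) - 1
    f = [float("inf")] * (1 << n)
    f[0] = 0
    for S in range(1 << n):
        for s in range(n):
            bit = 1 << s
            if S & bit:
                continue
            rest = full & ~(S | bit)  # columns placed after s
            # Row i must be patched in column s if it has a 0 there but
            # a 1 both before (in S) and after (in rest).
            added = sum(C[i][s] for i in range(m)
                        if not (row_mask[i] & bit)
                        and (row_mask[i] & S)
                        and (row_mask[i] & rest))
            if f[S] + added < f[S | bit]:
                f[S | bit] = f[S] + added
    return f[full]
```

The table `f` has 2^n entries, which matches the observation in Remark 8 that memory, rather than time, becomes the limiting factor as the number of columns grows.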

Remark 9
One key decision is how the submatrices on which we generate local cuts are chosen. In our implementation, we always choose submatrices with an adjustable number of columns (in the computations we evaluate using 8, 9, 10, 11 or 12 columns). Since both optimization oracles mentioned above are relatively insensitive to the number of rows, we always include all rows of the original matrix. Moreover, when we optimize over the dominant polyhedron $D^+(M)$, we can use Theorem 6 in order to remove all rows with fewer than two 1s, or columns with all 0 entries, without weakening the facial properties of the generated separating inequality. We have experimented with several methods to choose the submatrices, and a combination of Tucker minors and a random choice turned out to be the best one. More explicitly, we perform a filtering technique to detect important submatrices. At the beginning, we generate a list of candidate submatrices, which we initialize with random matrices and with submatrices containing a Tucker minor (possibly filling up with random columns). The details are presented in Algorithm 1; we use R = 40, c = 3, and by default K = 10 for the dominant and K = 7 for the polytope case.

Algorithm 1: Initialization of Submatrices for Local Cuts
During the algorithm, we generate cuts for each submatrix currently available and store the violation (efficacy) of the cut produced with respect to the current optimal LP-relaxation point. The matrices are then considered according to non-increasing geometric mean violation in the next round, see Algorithm 2 for details. This means that a submatrix that produced the most efficient cut the last time is used first in the next separation round. If submatrices do not produce a violated cut, they are removed from the list, possibly filling up with new random matrices. Additionally, two submatrices, which are selected using the current LP-solution, are used for separation, see Algorithm 2.

Algorithmic aspects
We have implemented a branch-and-cut algorithm to solve the positive patching problem. In this section, we describe the main algorithmic tools that have been used.

Preprocessing
Preprocessing steps are indispensable for solving practical problem instances of almost any optimization problem. We first consider some rules to reduce the size of the input matrix $M \in \{0, 1\}^{m \times n}$. Recall that the objective function is defined by the nonnegative cost matrix $C \in \mathbb{R}^{m \times n}_+$. We consider the following general fact.
Proof Since X is C1P, by Lemma 1, $X_{IJ}$ is a positive C1P patching for $M_{IJ}$. Since C ≥ 0, the claim about the objective function follows.
We now have the following preprocessing steps. , then there exists some optimal solution that satisfies the following inequalities: If row i is equal to row k then there exists an optimal solution that satisfies: j) for all j ∈ [n].

(5) If columns j and ℓ (j ≠ ℓ) of M are equal, then one can remove either column and replace the cost coefficients of the other by the sum of the original coefficients of both columns without changing the optimal value.
(6) If the bipartite graph that has M as adjacency matrix is disconnected, one can treat the connected components separately.
Proof (1) Consider an optimal solution X for the total matrix. By Lemma 4, we have Consider an optimal solution X and assume w.l.o.g. that Then we obtain a feasible solution Y by setting Solution Y is feasible, because if Y contains a Tucker minor, then X contains a Tucker minor as well (there is no Tucker minor that contains equal columns). This shows that $\langle C, Y \rangle \le \langle C, X \rangle$, i.e., if X is optimal, Y is optimal as well.
(6) If the bipartite graph is disconnected, M can be reordered into a block diagonal form: where the $M_i$'s are rectangular submatrices of M. Clearly, to make M C1P, it suffices to make the submatrices C1P. Since C ≥ 0, the result follows.
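The decomposition in Part (6) amounts to finding the connected components of the bipartite row-column graph of M. The Python sketch below does this with a depth-first search; the function name `blocks` and the output format are assumptions made for this illustration.

```python
def blocks(M):
    """Connected components of the bipartite row/column graph of M.

    Returns a list of (rows, cols) pairs with sorted index lists.
    All-zero columns are isolated and belong to no block; an all-zero
    row yields a block without columns.
    """
    m, n = len(M), len(M[0])
    seen_rows, seen_cols = set(), set()
    result = []
    for start in range(m):
        if start in seen_rows:
            continue
        rows, cols = {start}, set()
        seen_rows.add(start)
        stack = [("row", start)]
        while stack:
            kind, idx = stack.pop()
            if kind == "row":
                # Visit every column that has a 1 in this row.
                for j in range(n):
                    if M[idx][j] == 1 and j not in seen_cols:
                        seen_cols.add(j)
                        cols.add(j)
                        stack.append(("col", j))
            else:
                # Visit every row that has a 1 in this column.
                for i in range(m):
                    if M[i][idx] == 1 and i not in seen_rows:
                        seen_rows.add(i)
                        rows.add(i)
                        stack.append(("row", i))
        result.append((sorted(rows), sorted(cols)))
    return result
```

Each returned pair identifies one diagonal block, which can then be patched independently of the others when C ≥ 0.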

Remark 10
All of the preprocessing steps of Proposition 3 are used in our code, except for Part (6), since no disconnected graphs occurred in any of the instances. The inequalities of Part (4) are created in advance and added on demand during the branch-and-cut loop.

Remark 11
An extension of the cases discussed in Proposition 3 is not easily possible: (1) There are Tucker matrices that contain rows with a single 0 (T 1 1 and T 2 k ), columns with all ones (T 2 1 and T 3 1 ), columns with exactly one 1 (T 5 ), and columns with exactly one 0 (T 1 1 , T 3 1 , and T 5 ). Thus, one cannot (easily) preprocess these cases. (2) If in the bipartite graph defined by M there exists an articulation node, i.e., a node such that the graph is disconnected when deleting this node, one cannot (easily) decompose the problems. This case occurs, when there is a row or column that is shared by two "blocks", see Hence, one cannot solve the two parts independently, because in the example each block is C1P, while the total matrix is not.

Primal heuristic
To obtain good feasible solutions, we use methods of Oswald and Reinelt, see [28,29]. The idea is as follows. We generate some order of the rows of the matrix M by using the current fractional LP-solution. We then add one row after the other and test whether the resulting matrix is C1P. Once the matrix is no longer C1P, we backtrack one step and consider all permutations that certify the C1P property; these permutations can be generated from the PQ-tree. We compute the cost of generating a C1P solution by adding 1s according to this fixed permutation. This is easy: just order the columns and fill in the 1s that are needed for a strict C1P matrix. Each of the permutations yields a feasible primal solution. This method is very successful, because it is able to test quite a number of permutations in a small amount of time.
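The evaluation of a single column permutation can be sketched as follows in Python, assuming nonnegative costs so that no row is extended beyond its leftmost and rightmost 1. The function name `fill_cost` and the argument `order` (a list of column indices from left to right) are assumptions made for this illustration; the PQ-tree machinery that produces the permutations is not shown.

```python
def fill_cost(M, C, order):
    """Patch M for a fixed column order and return (cost, patched matrix).

    In every row, all 0 entries lying between the leftmost and the
    rightmost 1 (with respect to `order`) are switched to 1.
    """
    X = [row[:] for row in M]
    total = 0
    for i, row in enumerate(M):
        # Positions, in the given order, at which this row has a 1.
        positions = [p for p, j in enumerate(order) if row[j] == 1]
        if not positions:
            continue  # an all-zero row needs no patching
        for p in range(positions[0], positions[-1] + 1):
            j = order[p]
            if row[j] == 0:
                X[i][j] = 1
                total += C[i][j]
    return total, X
```

Because each permutation is evaluated in time linear in the size of the matrix, many candidate permutations can be compared cheaply.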

Computational results
We implemented the discussed algorithms using a bugfix version of SCIP 4.0.1, see [26,32]. We use CPLEX 12.7.1 as the underlying LP solver. The computations were performed on a cluster with 3.5 GHz Intel Xeon E5-1620 Quad-Core CPUs, having 32 GB main memory and 10 MB cache running Linux. All computations were performed single threaded. The time limit is 1 h.

Data sets
In order to evaluate the performance of the algorithms, in particular of local cuts, we created a testset as follows. We considered the following instances for the Weighted Positive C1P Patching problem:
• 5807 instances from the Constraint Programming Modeling Challenge 2005, available at http://www.dcs.st-and.ac.uk/~ipg/challenge,
• 250 instances from Faggioli and Bentivoglio [13],
• 11 instances based on the VLSI application [10].
We first filtered out all instances that had fewer than 20 columns, since these can be easily handled by dynamic programming. After this filtering, the number of columns of the remaining instances ranges from 20 to 30. We then sorted out instances for which a basic version of our code (RS-base, see below) took longer than one hour. This leaves 197 instances in a testset, which we call testopt. The instances not solved by the basic version within one hour are sorted by their gap into sets with gap in (0, 10]%, (10, 20]%, and (20, 30]%; we call the testsets testgap0-10, testgap10-20, and testgap20-30, respectively.
Details of the following results are given in an online supplement.

Results for preprocessing
On each input, we apply the preprocessing steps described in Sect. 5.1. Table 4 in the "Appendix" shows the number of rows and columns before and after preprocessing for each instance in the testopt testset. The effect depends on the particular matrix. The number of removed rows varies from 0 to 20 and the number of removed columns from 0 to 6. The average number of removed rows and columns is only 0.78 and 0.83, respectively. Thus, the effect is limited on average on this testset. However, preprocessing is cheap and it can be extremely effective on some instances. For example, when applied to some real world instances from manufacturing for the open stack problem, the resulting sizes become so small that all instances can be solved within seconds; we therefore do not report these results here.
Moreover, during preprocessing we search for Tucker minors as described in Sect. 4.1. The results are again shown in Table 4 in the "Appendix". The number of found minors varies, but can be quite substantial. As described in Sect. 4.1, these Tucker minors are used to generate inequalities into a pool, which are later used in separation. Table 4 shows that the number of these cuts varies from 35 to 603 869.

Results with local cuts
We ran several variants of local cuts on the testopt test set. This includes local cuts (LC), local cuts with tilting (LCT), and target cuts (TC). For each basic variant, we consider subvariants. We vary the frequency of depths at which cuts are separated (from 0, i.e., only at the root node, to 5, i.e., every fifth depth level of the tree); this is indicated by the number attached to the basic variants, e.g., LCT1. Moreover, we vary the number of columns in the submatrix considered for separation (between 6 and 12), where 10 is the default for LC and LCT; this is indicated by attaching "size", e.g., LCT1-size8. Finally, we consider the variant with primal separation, e.g., LCT1-primal, and a variant in which we turn the reduction of the submatrix sizes off, e.g., LCT1-nored. In order to reduce the effects of heuristics, we initialized the runs with an optimal solution in this section.
As a base case to compare with, we use three different settings using the separation methods described in Sect. 4 that do not apply oracles:
• RS-base refers to the separation of dictionary inequalities (Sect. 4.1) and rounding inequalities (Sect. 4.2);
• OR-base refers to the techniques that were used by Oswald and Reinelt, i.e., rounding inequalities and the exact separation of the inequalities in Theorem 2, see Sect. 4.2;
• base refers to the separation of all three previously mentioned techniques.
The results are given in Table 1. The table reports, in particular, the shifted geometric mean (see footnote 1) of the number of nodes in the branch-and-bound tree and the shifted geometric mean of the CPU time.
Footnote 1: The shifted geometric mean of values $t_1, \dots, t_n$ with shift s is defined as $\left( \prod_{i=1}^{n} (t_i + s) \right)^{1/n} - s$. We use a shift s = 10 for time and s = 100 for branch-and-bound nodes in order to decrease the strong influence of very easy instances on the mean values.
We can draw the following conclusions:
• Variants LCT1-size9 and LCT1-size11 are the overall best variants in terms of the average solving time, very closely followed by LCT1, which produces fewer nodes than the first two.
• Comparing LCT1 and LC1 shows that tilting significantly improves the performance of local cuts.
• Target cuts are clearly the slowest: all target cut variants are slower than the variants of any other type. This is because the separation needs too many LP-solves in order to converge to a possibly violated cut. Consequently, using target cuts only in the root node (TC0) is faster than the other target cut variants, because it limits this slow-down.
• When varying the separation frequency of local cuts with tilting, separation in every node (LCT1) is the best option in terms of average performance, but LCT2 is close behind and has a slightly smaller maximal solving time, see Fig. 7.
• Primal separation is not successful, but LC1-primal slightly improves on LC1.
• Turning off the submatrix reduction does not significantly increase the run times.
• Increasing or decreasing the size of the submatrix has different effects for the different variants: for LCT1, the right choice seems to be unclear, but 9, 10 or 11 columns produce excellent results. For LC1, smaller sizes seem to be better. For TC1, the size does not significantly change the results.
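For readers who want to reproduce such aggregates, the following Python sketch computes a shifted geometric mean under the usual definition, namely the geometric mean of the shifted values t_i + s minus the shift s. That definition is assumed here, and the function name `shifted_geometric_mean` is hypothetical.

```python
import math


def shifted_geometric_mean(values, shift):
    """Geometric mean of (t + shift) over all values, minus shift.

    Summing logarithms instead of multiplying avoids overflow when
    there are many values. Requires t + shift > 0 for every value.
    """
    if not values:
        raise ValueError("at least one value is required")
    log_sum = sum(math.log(t + shift) for t in values)
    return math.exp(log_sum / len(values)) - shift
```

With a shift of 10, two running times of 0 and 30 seconds give sqrt(10 · 40) − 10 = 10, whereas the plain geometric mean would be 0; this is how the shift dampens the influence of very easy instances.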

Results for unsolved instances
The previous computations show the improvement of the additional cutting planes over RS-base. Most variants solved all 197 instances in the testopt testset. We now consider the instances that could not be solved by the RS-base settings within one hour in order to see the additional effect of cutting planes and local cuts in particular. Moreover, we consider the influence of the heuristics. Thus, we do not initialize the optimization runs with an optimal solution. The results on the corresponding testsets testgap0-10, testgap10-20, and testgap20-30 are displayed in Table 2.
The results show that LCT1 is able to solve a significant number of instances within the time limit that cannot be solved by RS-base. In fact, LCT1 solves about 93 % of the instances in testgap0-10 and about 59 % in testgap10-20. However, the solution becomes significantly more difficult for the instances in which RS-base had a large gap. For example, LCT1 can only solve about 7 % of the instances in testgap20-30. Nevertheless, these results show the strength of the approach via local cuts with tilting.

Results of the heuristic
In this section, we investigate the effect of applying the heuristic explained in Sect. 5.2.
To this end, we run base and LCT1 on the testopt testset with and without initializing with an optimal solution. The results are shown in Table 3. The last three columns display the geometric mean of the number of calls to the heuristic, the geometric mean of the number of solutions found, and the shifted geometric mean of the time spent in the heuristic. Not initializing with an optimal solution shows a slowdown by about 12 % for base and by about 10 % for LCT1. Surprisingly, the number of nodes even slightly decreases for the heuristic variant in shifted geometric mean. However, this is an artifact of the mean, since the total number of nodes slightly increases. In any case, the time difference essentially comes from the time needed for running the heuristic.

Conclusions
We considered the weighted positive C1P patching problem as a variant of the weighted C1P problem. The problem is NP-hard and it has several applications, especially ones defined on weight matrices with nonnegative entries. In the paper, we exploited the polyhedral properties of the positive patching polytope $P^+(M)$ in order to design a new branch-and-cut algorithm to solve the problem to optimality. In particular, we first extended to $P^+(M)$ some facet defining inequalities that were known for the C1P polytope, we gave sufficient conditions for the 0-lifting procedure to produce facet defining inequalities, and we presented polyhedral properties of the dominant polyhedron of $P^+(M)$.
Then we defined separation procedures for a large set of families of valid inequalities that we used as cutting planes in our implementation of a branch-and-cut algorithm. Among these separation procedures, we in particular focused on oracle-based methods for the on-line generation of valid inequalities.
We finally tested the overall solution algorithm via extensive computational experiments on instances taken from the literature. The results clearly show that the oracle-based methods are very effective. This good performance also results from the right choice of parameters, e.g., frequency and submatrix size. In general, this approach seems to be well suited for optimization problems for which it is difficult to obtain a polyhedral description, like the weighted positive C1P patching problem.
Table 4 Statistics for preprocessing of testopt: Given are the number of rows m, the number of columns n, and the density δ (number of 1s/(m · n)) in the original input matrix and after preprocessing. Moreover, we report the number of Tucker minors that are found: the number found as a submatrix (# sub), the number found by considering asteroidal triples (# AT), the number of copies found (# copies), and the total number (# total). Finally, we show the total number of small instance cuts generated.