Abstract
We discuss an algorithmic scheme, which we call the stabilized structured Dantzig–Wolfe decomposition method, for solving large-scale structured linear programs. It can be applied when the subproblem of the standard Dantzig–Wolfe approach admits an alternative master model amenable to column generation, other than the standard one in which there is a variable for each of the extreme points and extreme rays of the corresponding polyhedron. Stabilization is achieved by the same techniques developed for the standard Dantzig–Wolfe approach, and it is equally useful in improving performance, as shown by computational results obtained on an application to the multicommodity capacitated network design problem.
References
Alvelos, F., Valério de Carvalho, J.M.: An extended model and a column generation algorithm for the planar multicommodity flow problem. Networks 50(1), 3–16 (2007)
Atamtürk, A., Rajan, D.: On splittable and unsplittable flow capacitated network design arc-set polyhedra. Math. Program. 92, 315–333 (2002)
Bacaud, L., Lemaréchal, C., Renaud, A., Sagastizábal, C.: Bundle methods in stochastic optimal power management: a disaggregated approach using preconditioners. Comput. Optim. Appl. 20, 227–244 (2001)
Bahiense, L., Maculan, N., Sagastizábal, C.: The volume algorithm revisited: relation with bundle methods. Math. Program. 94(1), 41–70 (2002)
Belov, G., Scheithauer, G., Alves, C., Valério de Carvalho, J.M.: Gomory cuts from a position-indexed formulation of 1D stock cutting. In: Bortfeldt, A., Homberger, J., Kopfer, H., Pankratz, G., Strangmeier, R. (eds.) Intelligent Decision Support: Current Challenges and Approaches, pp. 3–14. Gabler (2008)
Ben Amor, H., Desrosiers, J., Frangioni, A.: On the choice of explicit stabilizing terms in column generation. Discret. Appl. Math. 157(6), 1167–1184 (2009)
Ben Amor, H., Desrosiers, J., Valério de Carvalho, J.M.: Dual-optimal inequalities for stabilized column generation. Oper. Res. 54(3), 454–463 (2006)
Ben Amor, H., Valério de Carvalho, J.M.: Cutting stock problems. In: Desrosiers, J., Desaulniers, G., Solomon, M.M. (eds.) Column Generation, pp. 131–161. Springer, Berlin (2005)
Borghetti, A., Frangioni, A., Lacalandra, F., Nucci, C.A.: Lagrangian heuristics based on disaggregated bundle methods for hydrothermal unit commitment. IEEE Trans. Power Syst. 18(1), 313–323 (2003)
Crainic, T.G., Frangioni, A., Gendron, B.: Bundle-based relaxation methods for multicommodity capacitated fixed charge network design problems. Discret. Appl. Math. 112, 73–99 (2001)
Croxton, K.L., Gendron, B., Magnanti, T.L.: A comparison of mixed-integer programming models for non-convex piecewise linear cost minimization problems. Manag. Sci. 49, 1268–1273 (2003)
Croxton, K.L., Gendron, B., Magnanti, T.L.: Variable disaggregation in network flow problems with piecewise linear costs. Oper. Res. 55, 146–157 (2007)
d’Antonio, G., Frangioni, A.: Convergence analysis of deflected conditional approximate subgradient methods. SIAM J. Optim. 20(1), 357–386 (2009)
Dantzig, G.B., Wolfe, P.: The decomposition principle for linear programs. Oper. Res. 8, 101–111 (1960)
Elhallaoui, I., Desaulniers, G., Metrane, A., Soumis, F.: Bi-dynamic constraint aggregation and subproblem reduction. Comput. Oper. Res. 35(5), 1713–1724 (2008)
Ford, L.R., Fulkerson, D.R.: A suggested computation for maximal multicommodity network flows. Manag. Sci. 5, 79–101 (1958)
Frangioni, A.: Solving semidefinite quadratic problems within nonsmooth optimization algorithms. Comput. Oper. Res. 21, 1099–1118 (1996)
Frangioni, A.: Dual-Ascent Methods and Multicommodity Flow Problems. PhD thesis, TD 5/97, Dipartimento di Informatica, Università di Pisa, Pisa (1997)
Frangioni, A.: Generalized bundle methods. SIAM J. Optim. 13(1), 117–156 (2002)
Frangioni, A.: About Lagrangian methods in integer optimization. Ann. Oper. Res. 139, 163–193 (2005)
Frangioni, A., Gendron, B.: 0–1 reformulations of the multicommodity capacitated network design problem. Discret. Appl. Math. 157(6), 1229–1241 (2009)
Frangioni, A., Gentile, C., Lacalandra, F.: Solving unit commitment problems with general ramp constraints. Int. J. Electr. Power Energy Syst. 30, 316–326 (2008)
Frangioni, A., Lodi, A., Rinaldi, G.: New approaches for optimizing over the semimetric polytope. Math. Program. 104(2–3), 375–388 (2005)
Hiriart-Urruty, J.-B., Lemaréchal, C.: Convex Analysis and Minimization Algorithms, volume 306 of Grundlehren Math. Wiss. Springer, New York (1993)
Jones, K.L., Lustig, I.J., Farwolden, J.M., Powell, W.B.: Multicommodity network flows: the impact of formulation on decomposition. Math. Program. 62, 95–117 (1993)
Kiwiel, K., Lemaréchal, C.: An inexact bundle variant suited to column generation. Math. Program. 118, 177–206 (2009)
Lemaréchal, C.: Lagrangian relaxation. In: Jünger, M., Naddef, D. (eds.) Computational Combinatorial Optimization, pp. 115–160. Springer, Heidelberg (2001)
Magnanti, T.L., Mirchandani, P., Vachani, R.: The convex hull of two core capacitated network design problems. Math. Program. 60, 233–250 (1993)
Petersen, B., Jepsen, M.K.: Partial path column generation for the vehicle routing problem with time windows. In: Bigi, G., Frangioni, A., Scutellà, M.G. (eds.) Proceedings of the 4th International Network Optimization Conference (INOC2009), paper TB4-3 (2009)
Vanderbeck, F.: On Dantzig-Wolfe decomposition in integer programming and ways to perform branching in a branch-and-price algorithm. Oper. Res. 48(1), 111–128 (2000)
Villeneuve, D., Desrosiers, J., Lübbecke, M.E., Soumis, F.: On compact formulations for integer programs solved by column generation. Ann. Oper. Res. 139, 375–388 (2005)
Acknowledgments
We are grateful to the anonymous referees for their valuable comments which helped us to significantly improve the contents of the paper, and to K. Kiwiel for pointing out a flaw in the analysis of [19] (see the Appendix). We are grateful to Serge Bisaillon for his help with implementing and testing the algorithms. We also gratefully acknowledge financial support for this project provided by NSERC (Canada) and by the GNAMPA section of INDAM (Italy).
Additional information
This paper is dedicated to Claude Lemaréchal on the occasion of his 65th birthday. Like such a large part of current work in computational nondifferentiable optimization and decomposition methods, our results would never have been possible without his pioneering work.
Appendix: Proof of the convergence results
We now briefly sketch convergence results for the \(\text{S}^2\)DW method, focusing only on those aspects where the theory of [19] cannot be directly used due to the above-mentioned somewhat weaker assumptions on the update of the model. The standing hypotheses here are (i)–(vi), a monotone and safe \(\beta \)-strategy, and, at least initially, that \(X\) is compact. A basic quantity in the analysis is
\[ \varDelta f = f_\mathcal{B }(\tilde{\pi }) - f(\tilde{\pi }) \ge 0, \]
i.e., the “approximation error” of the model \(f_\mathcal{B }\) with respect to the true function \(f\) in the tentative point \(\tilde{\pi }\). It can be shown that if \(\varDelta f = 0\), then \(\tilde{\pi }\) and \((\tilde{x}, \tilde{z})\) are the optimal solutions to the “exact” stabilized problems
(with \(f_\mathcal{B }= f, X_\mathcal{B }= conv(X)\)), respectively [19, Lemma 2.2]. This means that \(\tilde{\pi }\) is the best possible tentative point, in terms of improvement of the function value, that we can ever obtain unless we change either \(t\) or \(\bar{\pi }\); in fact, it is immediate to see that if \(\varDelta f = 0\), then the “sufficient ascent” condition (31) surely holds. The “inherent finiteness” of our dual function \(f\) allows us to prove that this has to happen, eventually, provided that \(\mathcal{B }\) is not “treated too badly”.
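The implication from \(\varDelta f = 0\) to (31) can be spelled out as a short sketch, assuming that (31) is the usual sufficient ascent test with parameter \(m \in (0, 1]\) and that \(\varDelta f\) denotes the gap \(f_\mathcal{B }(\tilde{\pi }) - f(\tilde{\pi }) \ge 0\); the exact form used below is an assumption of this sketch:
\[ f(\tilde{\pi }) - f(\bar{\pi }) \ge m \bigl ( f_\mathcal{B }(\tilde{\pi }) - f(\bar{\pi }) \bigr ). \]
If \(\varDelta f = 0\), then \(f(\tilde{\pi }) = f_\mathcal{B }(\tilde{\pi })\), so that
\[ f(\tilde{\pi }) - f(\bar{\pi }) = f_\mathcal{B }(\tilde{\pi }) - f(\bar{\pi }) \ge m \bigl ( f_\mathcal{B }(\tilde{\pi }) - f(\bar{\pi }) \bigr ), \]
where the inequality uses \(m \le 1\) and nonnegativity of the predicted increase \(f_\mathcal{B }(\tilde{\pi }) - f(\bar{\pi })\), which holds, for instance, when \(\mathcal{D }_t \ge 0\), \(\mathcal{D }_t(0) = 0\) and \(f_\mathcal{B }\ge f\). For \(m = 1\) the same test reads \(f(\tilde{\pi }) \ge f_\mathcal{B }(\tilde{\pi })\), which together with \(f_\mathcal{B }\ge f\) forces \(\varDelta f = 0\).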
Lemma 2
Assume that an asymptotically blocked \(\beta \)-strategy is employed, i.e., during any sequence of consecutive NSs, removals from the bundle \(\mathcal{B }\) are, at length, inhibited. Then, after finitely many NSs, either a SS is performed or the algorithm stops.
Proof
By contradiction, assume that there exists a sequence of infinitely many consecutive NSs (i.e., the algorithm never stops and no SS is done). During such a sequence, (31) is (at length, see (iii)) never satisfied, and this clearly implies \(f(\tilde{\pi }) < f_{\mathcal{B }}(\tilde{\pi })\). But, at length, \(t\) is fixed by (v), removals from \(\mathcal{B }\) are inhibited because the \(\beta \)-strategy is asymptotically blocked, no aggregated pieces can be created because the \(\beta \)-strategy is safe, and at least one item is added to \(\mathcal{B }\) at every iteration by (iv); thus, \(\mathcal{B }\) grows infinitely large, contradicting finiteness in Assumption 1. \(\square \)
Under the stronger version of Assumption 3 where \(\bar{x} \in X_{\mathcal{B }^{\prime }}\), this result could be strengthened to allow reducing the size of \(\mathcal{B }\) down to any predetermined value. This first requires the further assumption that \(\mathcal{D }_t\) is strictly convex (equivalently, \(\mathcal{D }_t^*\) is differentiable), which is satisfied e.g. by the classic \(\mathcal{D }_t = \frac{1}{2t}\Vert \cdot \Vert ^2_2\) but not by other useful stabilizing terms (cf. e.g. [6]). Then, under the mere monotone \(\beta \)-strategy one can use [19, Lemma 5.6] to prove that the sequence \(\{\tilde{z}_i\}\) is bounded, and then [19, Theorem 5.7] to prove that the optimal value \(v(29)\) of the stabilized primal master problem is actually strictly decreasing; since \(\bar{\pi }\) and (at length) \(t\) are fixed, this means that no two iterations can have the same \(\mathcal{B }\), but the total number of possible different bundles \(\mathcal{B }\) is finite. A weaker version (that is sufficient in practice) showing that \(v(29) \rightarrow 0\) even if infinitely many aggregations are performed is also possible: one could then resort to employing the “poorman cutting-plane model” \(f_{\tilde{x}}\) (cf. Sect. 4.1) at all steps, which basically makes the algorithm a subgradient-type approach with deflection [4, 13] and results in much slower convergence [10, 23]. Thus, this kind of development seems to be of little interest in our case. Instead, the strictly monotone \(\beta \)-strategy [19, Definition 4.8], which simply requires suspending removals from \(\mathcal{B }\) until \(v(29)\) strictly decreases, is likely to be an effective way to reduce the size of \(\mathcal{B }\) while ensuring the asymptotically blocked property.
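The rule of suspending removals until the master value strictly decreases can be illustrated by a small sketch. This is only one possible reading of the sentence above, not the definition in [19]: the class name `RemovalGuard`, the threshold `min_decrease`, and the choice to keep the bundle intact at the first master problem after a reset are all hypothetical.

```python
from typing import Optional


class RemovalGuard:
    """Hypothetical helper: permit removals from the bundle only after the
    master value v has strictly decreased since the last reference value."""

    def __init__(self, min_decrease: float = 1e-9) -> None:
        # Numerical threshold used to decide that v "strictly" decreased
        # (an assumption; exact arithmetic would use 0).
        self.min_decrease = min_decrease
        self.reference_v: Optional[float] = None

    def start_null_step_sequence(self) -> None:
        """Forget the reference value, e.g. after a SS or a change of t."""
        self.reference_v = None

    def removals_allowed(self, v: float) -> bool:
        """Return True if items may be removed given the current value v."""
        if self.reference_v is None:
            # First master problem after a reset: record v, keep the bundle.
            self.reference_v = v
            return False
        if v < self.reference_v - self.min_decrease:
            # Strict decrease observed: removals are permitted once, and the
            # reference is moved to the new, lower value.
            self.reference_v = v
            return True
        # No strict decrease: removals stay suspended.
        return False
```

In a sequence of NSs where \(v\) stalls, such a guard keeps returning `False`, so removals stay inhibited for as long as \(v\) does not strictly decrease.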
Theorem 5
Under the assumptions of Lemma 2, the sequence \(\{ f(\bar{\pi }_i) \}\) converges to the optimal value of (3) (possibly \(+\infty \)). If (3) is bounded above, then a subsequence of \(\{ \tilde{x}_i \}\) converges to an optimal solution of (1). If, furthermore, \(m = 1\), then the \(\text{S}^2\)DW algorithm finitely terminates.
Proof
The standing assumption of [19, Sect. 6], i.e., that either the algorithm finitely stops with an optimal solution or infinitely many SSs are performed, is guaranteed by Lemma 2. The fundamental observation is that both \(f\) and \(f_{\mathcal{B }}\), being value functions of linear programs, are polyhedral. Then, the first statement is [19, Theorem 6.4] (being a polyhedral function, \(f\) is \(*\)-compact). The second statement comes from [19, Theorem 6.2], which proves that the sequence \(\{ \tilde{z}_i \}\) converges to 0; then, compactness of \(X\) implies that a convergent subsequence exists. The third statement is [19, Theorem 6.6]. Note that the latter uses [19, Lemma 6.5], which apparently requires that \(f_\mathcal{B }\) be the cutting-plane model. However, this is not actually the case: the required property is that \(f_\mathcal{B }\) be a polyhedral function, and that there exists a “large enough” \(\mathcal{B }\) such that \(f_\mathcal{B }= f\), which clearly happens here. \(\square \)
The assumption that \(t\) is bounded away from zero can be relaxed somewhat if \(\mathcal{D }_t = (1/t) D\) for some \(D\) satisfying (i) and (ii). It may be useful to remark here that, as correctly pointed out by K. Kiwiel in a private communication, there is a flaw in the treatment of this point in [19]. In particular, the alternative to \(t_i \ge \underline{t} > 0\) proposed there is \(\sum _{i = 1}^{\infty } t_i = \infty \) [19, (6.2)]; that assumption does not guarantee that the whole sequence \(\{ \tilde{z}_i \}\) converges to \(0\), but only the existence of a converging subsequence. As a consequence, in the hypotheses of [19, Theorem 6.3] one should replace the “asymptotic complementary slackness” condition \(\liminf _{i \rightarrow \infty } \tilde{z}_i \bar{\pi }_i = 0\) with the stronger condition that \(\tilde{z}_{i_k} \bar{\pi }_{i_k} \rightarrow 0\) for some subsequence \(\{ \tilde{z}_{i_k} \} \rightarrow 0\) (which does exist under [19, (6.2)]), or more simply that \(\liminf _{i \rightarrow \infty } \max \{ \Vert \tilde{z}_i \Vert , \tilde{z}_i \bar{\pi }_i \} = 0\). Nonetheless, the proof of [19, Theorem 6.4], which is the one of interest here, only requires convergence of a subsequence, and is therefore valid under the weaker assumption [19, (6.2)].
Note that setting \(m = 1\) as required by Theorem 5 to attain finite convergence may come at a cost in practice. In fact, this turns the \(\text{S}^2\)DW method into a “pure proximal point” approach, where the “abstract” stabilized problems (35) have to be solved to optimality before \(\bar{\pi }\) can be updated (it is easy to check that with \(m = 1\) a SS can only be performed when \(\varDelta f = 0\)). This is most often not the best choice computationally [6], and for good reasons. The issue is mostly theoretical; in practice, finite convergence is very likely even for \(m < 1\). Furthermore, as observed in the comments to [19, Theorem 6.6], the requirement can be significantly weakened; what is really necessary is to ensure that only finitely many consecutive SSs are performed with \(\varDelta f > 0\). Thus, it is possible to use any \(m < 1\), provided that some mechanism (such as setting \(m = 1\) after a while) ensures that sooner or later a SS with \(\varDelta f = 0\) is performed; once this happens (whether or not \(m = 1\)), the mechanism can be reset and \(m\) can be set back to a value smaller than 1.
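One way such a mechanism might look is sketched below. It is a minimal illustration under stated assumptions, not the implementation used in our experiments: the class name `AscentParameterSchedule`, the budget `max_inexact_ss` of consecutive SSs with \(\varDelta f > 0\), and the tolerance `tol` used to decide that \(\varDelta f = 0\) numerically are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AscentParameterSchedule:
    """Hypothetical safeguard for the ascent parameter m."""

    m_default: float = 0.1     # working value of m, strictly below 1
    max_inexact_ss: int = 20   # tolerated consecutive SSs with delta_f > 0
    tol: float = 1e-9          # numerical threshold for "delta_f == 0"
    inexact_ss: int = 0        # consecutive SSs with delta_f > 0 so far

    def current_m(self) -> float:
        """Value of m for the next ascent test: m_default while the budget
        lasts, 1 once it is exhausted (a SS then requires delta_f == 0)."""
        if self.inexact_ss >= self.max_inexact_ss:
            return 1.0
        return self.m_default

    def record_serious_step(self, delta_f: float) -> None:
        """Update the counter after a SS with approximation error delta_f."""
        if delta_f <= self.tol:
            # A SS with delta_f == 0 occurred: reset, so m drops below 1 again.
            self.inexact_ss = 0
        else:
            self.inexact_ss += 1
```

With this design, at most `max_inexact_ss` consecutive SSs with \(\varDelta f > 0\) can occur, since after that the test is run with \(m = 1\) until a SS with \(\varDelta f = 0\) resets the counter.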
The extension to the case where \(X\) is not compact is relatively straightforward. Stabilization (with the appropriate assumptions) considerably helps in ensuring that the master problems attain optimal solutions even if their feasible regions (in particular, \(X_{\mathcal{B }}\)) are unbounded. Then, as long as \(f( \tilde{\pi } ) > -\infty \), Assumption 3 is enough to ensure convergence. If \(f( \tilde{\pi } ) = -\infty \) instead, one has to assume that the solution of (2) produces a feasible solution \(\bar{x}\) and an unbounded descent direction \(\nu \) for \(conv(X)\) as a “certificate of unboundedness”. Now, because clearly \(f_{\mathcal{B }}( \tilde{\pi } ) > -\infty = f( \tilde{\pi } )\), the algorithm cannot be stopped; furthermore, the half-line \(\{x = \bar{x} + \alpha \nu , \alpha \ge 0\}\) cannot be entirely contained in \(X_{\mathcal{B }}\). Then, “some variables must be missing”: as in the finite case, it must be easy to update \(\mathcal{B }\) and the associated \(\varGamma _{\mathcal{B }}\), \(\gamma _{\mathcal{B }}\) and \(C_{\mathcal{B }}\) to a \(\mathcal{B }^{\prime } \supset \mathcal{B }\) such that there exists a \(\mathcal{B }^{\prime \prime } \supseteq \mathcal{B }^{\prime }\) such that \(\nu \) is an unbounded descent direction for \(X_{\mathcal{B }^{\prime \prime }}\). Once this assumption is satisfied, the theoretical analysis hardly changes. One may lose the second point of Theorem 5 (asymptotic convergence of \(\{ \tilde{x}_i \}\)), but this is of no concern here; finite convergence essentially relies only on the fact that the total number of possible different bundles \(\mathcal{B }\) is finite, so we only need to ensure that the same triplet \((\mathcal{B }, \bar{\pi }, t)\) is never repeated.
The fact that \(conv( X )\) (and hence some of the variables \(\theta \) in its definition) may not be bounded has no impact, provided that the “set of pieces out of which the formulation of \(conv( X )\) is constructed” is finite.
Cite this article
Frangioni, A., Gendron, B. A stabilized structured Dantzig–Wolfe decomposition method. Math. Program. 140, 45–76 (2013). https://doi.org/10.1007/s10107-012-0626-8
DOI: https://doi.org/10.1007/s10107-012-0626-8
Keywords
- Dantzig–Wolfe decomposition method
- Structured linear program
- Multicommodity capacitated network design problem
- Reformulation
- Stabilization