
On parallelization of a stochastic dynamic programming algorithm for solving large-scale mixed 0–1 problems under uncertainty


Abstract

A parallel computing implementation of a serial Stochastic Dynamic Programming approach, referred to as the S-SDP algorithm, is introduced to solve large-scale multiperiod mixed 0–1 optimization problems under uncertainty. The paper presents Inner and Outer Parallelization versions of the S-SDP algorithm, referred to as Inner P-SDP and Outer P-SDP, respectively, and analyzes the reduction they provide in problem-solving elapsed time and optimality gap. The basic idea of the Inner P-SDP is to parallelize the optimization of variations of the MIP subproblems attached to the sets of scenario clusters created by the modeler-defined stages used to decompose the original problem. The Outer P-SDP performs simultaneous interconnected executions of the serial algorithm, so that a wider area of the feasible region is explored, using iterative communication to redefine search directions. Strategies are presented for analyzing the performance of parallel computation based on Message-Passing Interface (MPI) threads for solving stage-related subproblems versus the serial version of the SDP methodology. The results of the parallelization are remarkable: not only faster but also better solutions than those of the serial version are obtained. In particular, we report a speedup of up to 10 on 12 threads for the Inner P-SDP algorithm. The new approach solves problems in less computing time than a state-of-the-art MIP solver, and it can thus tackle very large-scale instances that could not otherwise be solved, in acceptable elapsed time if at all, by plain use of the solver or by the S-SDP algorithm.


References

  • Al-Khamis T, M’Hallah R (2011) A two-stage stochastic programming model for the parallel machine scheduling problem with machine capacity. Comput Oper Res 38:1747–1759

  • Aldasoro U, Escudero LF, Merino M, Pérez G (2013) An algorithmic framework for solving large-scale multistage stochastic mixed 0–1 problems with nonsymmetric scenario trees. Part II: Parallelization. Comput Oper Res 40:2950–2960

  • ARINA (2015) Cluster IZO-SGI, SGIker (UPV/EHU). http://www.ehu.es/sgi/recursos/cluster-arina

  • Benders J (1962) Partitioning procedures for solving mixed variables programming problems. Numer Math 4:238–252

  • Beraldi P, Grandinetti L, Musmanno R, Triki C (2000) Parallel algorithms to solve two-stage stochastic linear programs with robustness constraints. Parallel Comput 26:1889–1908

  • Birge JR (1985) Decomposition and partitioning methods for multistage stochastic linear programs. Oper Res 33:989–1007

  • Birge JR (1997) Stochastic programming computation and applications. INFORMS J Comput 9:111–133

  • Birge JR, Donohue CJ, Holmes DF, Svintsitski O (1996) A parallel implementation of the nested decomposition algorithm for multistage stochastic linear programs. Math Program 75:327–352

  • Birge JR, Louveaux FV (2011) Introduction to stochastic programming, 2nd edn. Springer, Berlin

  • Blomvall J, Lindberg P (2002) A Riccati-based primal interior point solver for multistage stochastic programming. Eur J Oper Res 143:452–461

  • Conejo AJ, Castillo E, Mínguez R, García-Bertrand R (2006) Decomposition techniques in mathematical programming. Engineering and science applications. Springer, Berlin

  • Cristobal MP, Escudero LF, Monge JF (2009) On stochastic dynamic programming for solving large-scale production planning problems under uncertainty. Comput Oper Res 36:2418–2428

  • Culler DE, Gupta A, Singh JP (1997) Parallel computer architecture: a hardware/software approach, 1st edn. Morgan Kaufmann Publishers Inc., San Francisco

  • Dempster MAH, Thompson RT (1998) Parallelization and aggregation of nested Benders decomposition. Ann Oper Res 81:163–188

  • Dias B, Tomin M, Mercato A, Ramos T, Brandi R, da Silva Jr IC, Filho JAP (2013) Parallel computing applied to stochastic dynamic programming for long term operation planning of hydrothermal power systems. Eur J Oper Res 229:212–222

  • Escudero LF, de la Fuente J, García C, Prieto F (1999) A parallel computation approach for solving multistage stochastic network problems. Ann Oper Res 90:1–21

  • Escudero LF, Garín MA, Merino M, Pérez G (2010) On an exact algorithm for solving large-scale two-stage stochastic mixed-integer problems: theoretical and computational aspects. Eur J Oper Res 204:105–116

  • Escudero LF, Garín MA, Merino M, Pérez G (2012) An algorithmic framework for solving large-scale multistage stochastic mixed 0–1 problems with nonsymmetric scenario trees. Comput Oper Res 39:1133–1144

  • Escudero LF, Garín MA, Merino M, Pérez G (2015) On time stochastic dominance induced by mixed integer-linear recourse in multistage stochastic programs (submitted)

  • Escudero LF, Garín MA, Pérez G, Unzueta A (2013) Scenario cluster decomposition of the Lagrangian dual in stochastic mixed 0–1 optimization. Comput Oper Res 40:362–377

  • Escudero LF, Monge JF, Morales DR (2015) An SDP approach for multiperiod mixed 0–1 linear programming models with stochastic dominance constraints for risk management. Comput Oper Res 58:32–40

  • Escudero LF, Monge JF, Morales DR, Wang J (2013) Expected future value decomposition based bid price generation for large-scale network revenue management. Transp Sci 47:181–197

  • Hennessy JL, Patterson DA (2003) Computer architecture: a quantitative approach. Morgan Kaufmann Publishers Inc., San Francisco

  • IBM (2015) ILOG CPLEX optimizer. http://www-01.ibm.com/software/integration/optimization/cplex-optimizer/

  • Latorre JM, Cerisola S, Ramos A, Palacios R (2009) Analysis of stochastic problem decomposition algorithms in computational grids. Ann Oper Res 166:355–379

  • Li X, Wei J, Li T, Wang G, Yeh WG (2014) A parallel dynamic programming algorithm for multi-reservoir system optimization. Adv Water Resour 67:1–15

  • Linderoth J, Shapiro A, Wright S (2006) The empirical behavior of sampling methods for stochastic programming. Ann Oper Res 142:215–241

  • Linderoth JT, Wright S (2008) Decomposition algorithms for stochastic programming on a computational grid. Tech. rep., Mathematics and Computer Science Division, Argonne National Laboratory, Chicago

  • Lumbreras S (2014) Decision support methods for large-scale flexible transmission expansion planning. Ph.D. thesis, Instituto de Investigación Tecnológica, Universidad Pontificia de Comillas, Madrid

  • Mahlke D (2011) A scenario tree-based decomposition for solving multistage stochastic programs with application in energy production. Springer, Berlin

  • Pacheco PS (1996) Parallel programming with MPI. Morgan Kaufmann Publishers, San Francisco

  • Pereira M, Pinto L (1991) Multistage stochastic optimization applied to energy planning. Math Program 52:359–375

  • Piazza A, Pagnoncelli B (2014) The optimal harvesting problem under price uncertainty. Ann Oper Res 217:425–445

  • Römisch W, Schultz R (2001) Multi-stage stochastic integer programs: an introduction. In: Grötschel M, Krumke SO, Rambau J (eds) Online optimization of large scale systems. Springer, Berlin, pp 581–600

  • Ross S (1995) Introduction to stochastic dynamic programming. Academic Press

  • Ruszczynski AP (1993) Parallel decomposition of multistage stochastic programming problems. Math Program 58:201–228

  • Shapiro A, Tekaya W, da Costa JP, Soares MP (2013) Risk neutral and risk averse stochastic dual dynamic programming method. Eur J Oper Res 224:375–391

  • Snir M, Otto S, Walker D, Dongarra J, Huss-Lederman S (1995) MPI: the complete reference. MIT Press, Cambridge

  • Stivala A, Stuckey PJ, de la Banda MG, Hermenegildo M, Wirth A (2010) Lock-free parallel dynamic programming. J Parallel Distrib Comput 70(8):839–848

  • Vladimirou H (1998) Computational assessment of distributed decomposition methods for stochastic linear programs. Eur J Oper Res 108:653–670

  • Vladimirou H, Zenios S (1999) Scalable parallel computations for large-scale stochastic programming. Ann Oper Res 90:87–129

  • Zhang Z, Zhang S, Wang Y, Jiang Y, Wang H (2013) Use of parallel deterministic dynamic programming and hierarchical adaptive genetic algorithm for reservoir operation optimization. Comput Ind Eng 65(2):310–321


Acknowledgments

This research has been partially supported by the project MTM2012-31514 from the Spanish Ministry of Economy and Competitiveness, the project UFI BETS 2011 of the University of the Basque Country (UPV/EHU), the Grupo de Investigación IT-567-13 of the Basque Government, the project RIESGOS CM of the Regional Community of Madrid, Spain, and the project P711RT0278 of the Programa Iberoamericano de Ciencia y Tecnología para el Desarrollo (CYTED). The computational resources were provided by SGI/IZO-SGIker at UPV/EHU (supported by the Spanish Ministry of Education and Science and the European Social Fund).

Author information


Corresponding author

Correspondence to María Merino.

Appendices

Appendix A. Implementation details

This appendix details the serial, inner and outer parallel implementation schemes that have been used for the computational experience reported in Sect. 5.

A.1 Serial implementation overview

The S-SDP algorithm presented in Sect. 3.5 has been implemented for solving the realistic production planning problem under uncertainty introduced in Cristobal et al. (2009). As frequently happens in tactical multistage planning problems, in our case only the continuous variables of a period have nonzero coefficients in the constraints of the next one. Therefore, the linking variables between stages only occur between the leaf nodes \(\{\ell \}\) of the subproblems of a stage, say \(r'\), and the root nodes of the successor subproblems in the next stage, say \(r\in \mathcal{S}_\ell \), if any. (In our case \(\mathcal{V}_\ell =\{y_g,~g\in \tilde{\mathcal{A}}_\ell \}\) for \(\ell \in \mathcal{L}_{r'},~ r'\in \mathcal{R}^{e-1},e\in \mathcal{E}\backslash \{1\}\), such that \((y_g)_i\) denotes the stock of product \(i\) in set \(\mathcal{I}\) related to scenario group \(g\)). Therefore, those continuous variables are the only ones to be iteratively perturbed. The perturbation has been performed as follows: \(\xi =ran/f(iter)\), where \(f(iter)=k\), \(k\) being a constant. Given the relatively small number of iterations performed and the large scale of the instances, the algorithm cannot ensure the goodness of a unique search direction; consequently, a constant factor \(k\) has been chosen to preserve a wide search. However, the value of parameter \(k\) changes depending on the value of the variable to be perturbed, so that small perturbations are generated for small values and large perturbations for large values, aiming at a relatively homogeneous change. Additionally, a single new reference level is generated in the BtF scheme. We observed in our computational experimentation with the type of problem under consideration that multiple perturbations of the same solution lead to similar reference levels, which significantly increase the elapsed time. Finally, all reference levels are kept active.
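
As an illustration of this perturbation rule, the following minimal Python sketch implements \(\xi =ran/f(iter)\) with \(f(iter)=k\) constant. The sampling interval of \(ran\), the magnitude-dependent choice of \(k\) and the way \(\xi\) is applied to the linking variable are assumptions of the sketch, not specifications taken from the paper.

```python
import random

def perturb(y, k_base=10.0, eps=1e-6):
    """Sketch of the S-SDP perturbation xi = ran / f(iter), with f(iter) = k constant.

    The constant k is chosen per variable so that small values of y receive small
    perturbations and large values large ones (a roughly homogeneous relative change).
    The sampling interval of `ran`, the exact rule for k and the additive application
    of xi to y are assumptions of this sketch.
    """
    ran = random.uniform(-1.0, 1.0)        # assumed symmetric random draw
    k = k_base / max(abs(y), eps)          # larger k for small |y| -> smaller xi (assumed rule)
    xi = ran / k                           # perturbation xi = ran / k
    return y + xi                          # perturbed linking-variable value (assumed application)
```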

Throughout the numerical experiments that are reported, the stopping criteria parameters are set to \(\epsilon _1=0.001\) and \(\epsilon _2=0.0001\), \(niterk=miter+1\), \(miter=15\) iterations, a time limit of 8 hours and a memory limit of 35 Gb.
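
For completeness, a hedged sketch of how a stopping check could be assembled from these parameter values is given below; the precise quantities compared against \(\epsilon _1\) and \(\epsilon _2\), and the role of \(niterk\), are those defined in Sect. 3.5, so their mapping here is only indicative.

```python
def should_stop(rel_gap, rel_improvement, iter_count, elapsed_s,
                eps1=1e-3, eps2=1e-4, miter=15, time_limit=8 * 3600):
    """Illustrative stopping check using the parameter values reported in the text.

    rel_gap and rel_improvement stand in for the quantities compared against
    eps1 and eps2 in Sect. 3.5; this mapping is an assumption of the sketch.
    """
    return (rel_gap <= eps1 or rel_improvement <= eps2
            or iter_count >= miter or elapsed_s >= time_limit)
```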

A.2 Inner parallel implementation overview

The Inner P-SDP maintains the algorithmic structure of the S-SDP algorithm, i.e., the model, linking variable perturbation, reference level management and stopping criteria parameters. Note that if the time limit is reached in an S-SDP execution, the Inner P-SDP execution can be allowed to continue iterating and stop afterwards; see below.

As only the variables of a period have nonzero coefficients in the constraints of the following one, the implemented Inner P-SDP algorithm comprises asynchronous execution parts. Thus, stage \(e=1\) is managed by thread \(th 1\), stage \(e=2\) by the main threads and subsequent stages by both main and auxiliary threads. Therefore, primary communication is needed between stages \(1\) and \(2\), and secondary communication is performed between consecutive pairs of stages from \(2\) to \(|\mathcal{E}|\) whenever needed. Global synchronization is achieved when gathering the EFV curves at the end of the FtB and BtF schemes; a schematic sketch of this thread layout and communication pattern is given after the step list below. The procedure of the Inner P-SDP follows the structure of the S-SDP algorithm detailed in Sect. 3.5. It is as follows:

Step 0: (Initialization)

Step 1: (FtB scheme: solve subproblem (5) for stage \({e=1}\))

  • The subproblem is solved by main thread \(th 1\).

  • Primary communication:  The main threads gather the solution from \(th 1\). They then follow an asynchronous execution until the end of the FtB scheme, since no primary communication is needed in between. Set stage \(e:=2\).

Step 2: (Solve subproblem (5) rooted at node \({r}\), \({\forall r\in \mathcal{R}^e}\))

  • Each thread solves its corresponding subproblems rooted at node \(r\in \mathcal{R}_{th}^{FtB}\) and all the available threads solve their own subproblems simultaneously. Note that for stage \(e>2\) all available threads will be used, otherwise only the main threads will.

Step 3: (Generate and append the EFV-\(\ell \) defining constraint to subproblem (5) rooted at node \({r'}\), \({\forall \ell \in \mathcal{L}_{r'}, \, r'\in \mathcal{R}^{e-1}}\))

  • Secondary communication:  Performed only for stage \(e=2\). The auxiliary threads gather the solution from their corresponding main thread so that they can follow an asynchronous execution until the end of the FtB scheme.

Step 4: (Forward to next stage \({e+1}\))

  • Synchronization:    If stage \(e=|\mathcal{E}|\), all threads gather the solution and the EFV curves.

Step 5: (Compute solution value for original model (2) and check stopping criteria)

Step 6: (BtF scheme: Compute dual vector of subproblem (5) rooted at node \({r,~\forall r\in \mathcal{R}^e, e>1}\))

  • The subproblem rooted at \(r\in \mathcal{R}_{\mathrm{th}}^{\mathrm{BtF}}\) is solved by its corresponding thread. Finally, if stage \(e=2\), the results of solving each subproblem rooted at \(r\in \mathcal{R}_{\mathrm{th}}^{\mathrm{BtF}}\) by a main thread are shared with its corresponding auxiliary threads.

  • Secondary communication:  Each main thread and its auxiliary threads gather the solution and follow a synchronized execution among themselves, but their execution is asynchronous with respect to the other main and auxiliary thread groups.

Step 7: (Generate and append the EFV-\(\ell \) defining constraints in subproblem (5) rooted at node \({r'}\), \({\forall \ell \in \mathcal{L}_{r'}, \, r'\in \mathcal{R}^{e-1}}\))

  • Synchronization:    If stage \(e=1\), all threads gather the solution and the EFV curves.

Step 8: (Backward to previous stage \({e-1}\))
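
The following mpi4py-based sketch illustrates, under stated assumptions, the thread layout and communication pattern described above: thread \(th 1\) for stage \(e=1\), the main threads for stage \(e=2\), all threads for later stages, and a global gather of the EFV curves. The subproblem solver, the number of main threads and the number of stages are placeholders rather than the paper's actual implementation.

```python
# Minimal sketch (assumed layout) of the Inner P-SDP thread organization using mpi4py.
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, nthreads = comm.Get_rank(), comm.Get_size()
n_main = 3        # assumed number of main threads (e.g., one per stage e=2 subproblem)
n_stages = 4      # assumed |E|; both values are placeholders, not taken from the paper
is_main = rank < n_main

def solve_subproblems(stage, rank):
    """Placeholder for solving the MIP subproblems (5) assigned to this thread."""
    return {"stage": stage, "rank": rank}

# FtB scheme
sol_stage1 = solve_subproblems(1, rank) if rank == 0 else None   # Step 1: stage e=1 on th1 only
sol_stage1 = comm.bcast(sol_stage1, root=0)                      # primary communication to the main threads

if is_main:
    sol_stage2 = solve_subproblems(2, rank)                      # Step 2 at e=2: main threads only
# Secondary communication at e=2: each auxiliary thread would receive its main thread's
# solution here (e.g., a broadcast over a per-group sub-communicator in a full implementation).

for stage in range(3, n_stages + 1):                             # Steps 2-4 for e>2: all threads work
    solve_subproblems(stage, rank)

efv_curves = comm.allgather({"rank": rank})                      # global synchronization: gather EFV curves
```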

A.3 Outer parallel implementation overview

Let \(inisize\), \(inigap\), \(nthread\), \(pathsize\) and \(pathgap\) be pilot case-driven parameters taken as input for the execution of the outer parallelization. The global solution pool generation phase stores up to \(inisize\) alternative solutions, choosing, from the solving of subproblem (5) at stage \(e=1\) for \(iter=1\), those with the smallest optimality gap among the ones whose gap is lower than \(inigap\). Then, the \(nthread\) most divergent solutions are chosen from the pool and assigned to threads. The selection criterion is as follows: the candidate solution in the pool whose vector of linking variable values (at stage \(e=1\)) has the highest Euclidean distance to the optimal solution is the first candidate; the solution with the highest distance to the optimal solution plus the distance to the first candidate is the second candidate, and so on.

The alternative solution path selection phase (at stage \(e=1\) for \(iter>1\)) stores up to \(pathsize\) candidates with an optimality gap lower than \(pathgap\) and takes the one with the highest Euclidean distance, as defined above, from the optimal solution for stage \(e=1\).
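
A minimal sketch of this greedy maximum-distance selection is given next; the data structures and tie-breaking are illustrative, as the text only fixes the Euclidean-distance criterion on the stage \(e=1\) linking variables.

```python
import numpy as np

def select_diverse(pool, optimal, nthread):
    """Greedy selection of the nthread most divergent pool solutions.

    `pool` is a list of linking-variable vectors (stage e=1) of quasi-optimal solutions
    and `optimal` is the linking-variable vector of the optimal solution; names and
    tie-breaking are illustrative assumptions of this sketch.
    """
    pool = [np.asarray(s, dtype=float) for s in pool]
    optimal = np.asarray(optimal, dtype=float)
    chosen, remaining = [], list(range(len(pool)))
    while remaining and len(chosen) < nthread:
        # score = distance to the optimal solution plus distances to already chosen candidates
        def score(i):
            d = np.linalg.norm(pool[i] - optimal)
            d += sum(np.linalg.norm(pool[i] - pool[j]) for j in chosen)
            return d
        best = max(remaining, key=score)
        chosen.append(best)
        remaining.remove(best)
    return [pool[i] for i in chosen]

# e.g., assign each selected solution to one thread/path of the Outer P-SDP:
# paths = select_diverse(pool_of_solutions, optimal_solution, nthread=4)
```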

The stopping criteria parameters are set to the same values used in the S-SDP and Inner P-SDP algorithms. The procedure of the implemented Outer P-SDP is as follows:

Step 0: (Initialization)

Step 1: (FtB scheme: Solve subproblem (5) for stage \({e=1}\))

  • Generate the global solution pool for iteration \(iter=1\) from the set of optimal and quasi-optimal solutions from solving the subproblem as described above.

  • For \(iter>1\) and those paths that have issued a warning message on “non-improved path solution” at the end of the FtB scheme (Step 5) of the previous iteration, an alternative solution is picked up from the pool, according to the criterion presented above.

Step 2: (Solve subproblem (5) rooted at node \({r}\), \({\forall r\in \mathcal{R}^e,~e\ge 2}\))

Step 3: (Generate and append the EFV-\(\ell \) defining constraint to subproblem (5) rooted at node \({r'}\), \({\forall \ell \in \mathcal{L}_{r'}, \, r'\in \mathcal{R}^{e-1}}\))

Step 4: (Forward to next stage \({e+1}\))

Step 5: (Compute solution value for original model (2) and check stopping criteria)

  • The communication and synchronization phase is executed; see Sect. 4.3 and Fig. 5. The global iteration solution is computed and the stopping criteria are checked. If the global incumbent solution has been updated, all threads gather the corresponding solution; if neither the global incumbent solution nor the path incumbent solution has been updated, the warning message “non-improved path solution” is issued for the corresponding path (a small bookkeeping sketch is given after the step list below).

Step 6: (BtF scheme: Compute dual vector of subproblem (5) rooted at node \({r,~ r\in \mathcal{R}^e, e>1}\))

  • Path reference levels: Generate new reference levels using the path criteria for random values.

Step 7: (Generate and append the EFV-\(\ell \) defining constraints in subproblem (5) rooted at node \({r'}\), \({\forall \ell \in \mathcal{L}_{r'}, \, r'\in \mathcal{R}^{e-1}}\))

Step 8: (Backward to previous stage \({e-1}\))
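
The Step 5 bookkeeping (incumbent updates and the “non-improved path solution” warning) can be sketched as follows, assuming minimization; variable names and the return format are illustrative, not the paper's interface.

```python
def update_incumbents(path_values, global_incumbent, path_incumbents):
    """Sketch (assumed bookkeeping) of Step 5: update incumbents and flag non-improving paths.

    `path_values` maps each outer path to its current iteration solution value
    (minimization assumed); names and return format are illustrative only.
    """
    warnings = []
    best_value = min(path_values.values())
    global_updated = best_value < global_incumbent
    if global_updated:
        global_incumbent = best_value           # all threads gather the corresponding solution
    for path, value in path_values.items():
        if value < path_incumbents[path]:
            path_incumbents[path] = value       # path incumbent improved
        elif not global_updated:
            warnings.append(path)               # issue "non-improved path solution" for this path
    return global_incumbent, path_incumbents, warnings
```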

Appendix B. Inner P-SDP quality analysis of the largest instances in Testbed 2

This appendix details the Inner parallelization analysis for the largest cases, c85 and c86, when using an increasing number (up to 8) of multiprocessor computers of the cluster.

Table 10 and Fig. 6 show the results in terms of elapsed time, speedup and efficiency for instances c85 and c86 when using 12, 24, 48 and 96 threads for the Inner P-SDP versus the S-SDP (executed using only one thread for CPLEX). The new headings are as follows: \(S_{th}^{top}\), the top speedup, defined as \(S_{th}^{top}=\frac{t_{IP}^{serial}}{t_{th}^{top}}\), and \(E_{th}^{top}\,\%\), the top efficiency, defined as \(E_{th}^{top}\,\%=100 \cdot \frac{S_{th}^{top}}{th}\), where \(t_{th}^{top}=\sum _{e \in \mathcal{E}}\frac{t_e^{serial}}{\min (th,|\mathcal{R}^e| |\mathcal{Z}|)}\), \(t_e^{serial}\) is the elapsed time for stage \(e\) in the S-SDP, and \(t_{IP}^{serial}=\sum _{e\in \mathcal{E}} t_e^{serial}\). The concepts \(S_{th}^{top}\) and \(E_{th}^{top}\) are, respectively, the top speedup and top efficiency that could be achieved, given the specifications of the SDP algorithm and its Inner P-SDP parallelization, under ideal conditions (i.e., as if no time were lost in communication and synchronization). For example, in instance c85, \(t_{IP}^{serial}=26{,}180=59+10+8{,}987+17{,}125=t_1+t_2+t_3+t_4\) and \(\{|\mathcal{R}^e|\}=\{1, 3, 27, 486\}\); therefore, with 48 threads, \(t_{48}^{top}=\frac{59}{1}+\frac{10}{3}+\frac{8{,}987}{27}+\frac{17{,}125}{48}=752\), \(S_{48}^{top}=\frac{26{,}180}{752}=34.81\) and \(E_{48}^{top}=100\cdot \frac{34.81}{48}=72.53~\%\).
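
These definitions can be reproduced in a few lines of code; the sketch below recomputes the c85 example with 48 threads, assuming \(|\mathcal{Z}|=1\), as the worked figures imply.

```python
def top_speedup(stage_times, stage_subproblems, th, z=1):
    """Top (ideal) speedup and efficiency of the Inner P-SDP, as defined in Appendix B.

    stage_times[e]       : serial elapsed time t_e^serial of stage e
    stage_subproblems[e] : number of subproblems |R^e| of stage e
    z                    : the |Z| factor in min(th, |R^e||Z|); assumed 1 for the c85 example
    """
    t_serial = sum(stage_times)
    t_top = sum(t_e / min(th, n_e * z)
                for t_e, n_e in zip(stage_times, stage_subproblems))
    s_top = t_serial / t_top
    e_top = 100.0 * s_top / th
    return t_top, s_top, e_top

# Worked example for instance c85 with 48 threads (stage times and |R^e| taken from the text)
t_top, s_top, e_top = top_speedup([59, 10, 8987, 17125], [1, 3, 27, 486], th=48)
print(round(t_top), round(s_top, 1), round(e_top, 1))   # ~752, ~34.8, ~72.5 % (matches up to rounding)
```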

Fig. 6 Speedup and efficiency for instances c85 and c86

Table 10 Inner P-SDP speedup and efficiency for instances c85 and c86

Note in Table 10 and Fig. 6 the remarkable scalability of the Inner P-SDP algorithm. The speedup increases almost linearly up to 48 threads, and with 96 threads the execution is 44 and 34 times faster than the serial version for instances c85 and c86, respectively. Comparing the efficiency and the top efficiency, we observe that the time lost in communication and synchronization is not significant in the Inner P-SDP implementation for instance c85 and is quite small for instance c86.


Cite this article

Aldasoro, U., Escudero, L.F., Merino, M. et al. On parallelization of a stochastic dynamic programming algorithm for solving large-scale mixed 0–1 problems under uncertainty. TOP 23, 703–742 (2015). https://doi.org/10.1007/s11750-014-0359-3
