
On parallelization of a stochastic dynamic programming algorithm for solving large-scale mixed 0–1 problems under uncertainty


Abstract

A parallel computing implementation of a serial Stochastic Dynamic Programming approach, referred to as the S-SDP algorithm, is introduced to solve large-scale multiperiod mixed 0–1 optimization problems under uncertainty. The paper presents Inner and Outer Parallelization versions of the S-SDP algorithm, referred to as Inner P-SDP and Outer P-SDP, respectively, and analyzes the reduction they provide in problem-solving elapsed time and optimality gap. The basic idea of the Inner P-SDP is to parallelize the optimization of variations of the MIP subproblems attached to the sets of scenario clusters created by the modeler-defined stages used to decompose the original problem. The Outer P-SDP performs simultaneous interconnected executions of the serial algorithm, so that a wider area of the feasible region is explored, using iterative communication to redefine search directions. Strategies are presented for analyzing the performance of parallel computation based on Message-Passing Interface (MPI) threads for solving stage-related subproblems versus the serial version of the SDP methodology. The results of the parallelization are remarkable: not only faster but also better solutions than those of the serial version are obtained. In particular, we report a speedup of up to 10 on 12 threads for the Inner P-SDP algorithm. The new approach solves problems in less computing time than a state-of-the-art MIP solver, and it can thus tackle very large-scale instances that could not otherwise be solved, in acceptable elapsed time if at all, by plain use of the solver or by the S-SDP algorithm.


References

  • Al-Khamis T, M’Hallah R (2011) A two-stage stochastic programming model for the parallel machine scheduling problem with machine capacity. Comput Oper Res 38:1747–1759

  • Aldasoro U, Escudero LF, Merino M, Pérez G (2013) An algorithmic framework for solving large-scale multistage stochastic mixed 0–1 problems with nonsymmetric scenario trees. Part II: Parallelization. Comput Oper Res 40:2950–2960

  • ARINA (2015) Cluster IZO-SGI, SGIker (UPV/EHU). http://www.ehu.es/sgi/recursos/cluster-arina

  • Benders J (1962) Partitioning procedures for solving mixed variables programming problems. Numer Math 4:238–252

  • Beraldi P, Grandinetti L, Musmanno R, Triki C (2000) Parallel algorithms to solve two-stage stochastic linear programs with robustness constraints. Parallel Comput 26:1889–1908

  • Birge JR (1985) Decomposition and partitioning methods for multistage stochastic linear programs. Oper Res 33:989–1007

  • Birge JR (1997) Stochastic programming computation and applications. INFORMS J Comput 9:111–133

  • Birge JR, Donohue CJ, Holmes DF, Svintsitski O (1996) A parallel implementation of the nested decomposition algorithm for multistage stochastic linear programs. Math Program 75:327–352

  • Birge JR, Louveaux FV (2011) Introduction to stochastic programming, 2nd edn. Springer, Berlin

  • Blomvall J, Lindberg P (2002) A Riccati-based primal interior point solver for multistage stochastic programming. Eur J Oper Res 143:452–461

  • Conejo AJ, Castillo E, Mínguez R, García-Bertrand R (2006) Decomposition techniques in mathematical programming. Engineering and science applications. Springer, Berlin

  • Cristobal MP, Escudero LF, Monge JF (2009) On stochastic dynamic programming for solving large-scale production planning problems under uncertainty. Comput Oper Res 36:2418–2428

  • Culler DE, Gupta A, Singh JP (1997) Parallel computer architecture: a hardware/software approach, 1st edn. Morgan Kaufmann Publishers Inc., San Francisco

  • Dempster MAH, Thompson RT (1998) Parallelization and aggregation of nested Benders decomposition. Ann Oper Res 81:163–188

  • Dias B, Tomin M, Mercato A, Ramos T, Brandi R, da Silva Jr IC, Filho JAP (2013) Parallel computing applied to stochastic dynamic programming for long term operation planning of hydrothermal power systems. Eur J Oper Res 229:212–222

  • Escudero LF, de la Fuente J, García C, Prieto F (1999) A parallel computation approach for solving multistage stochastic network problems. Ann Oper Res 90:1–21

  • Escudero LF, Garín MA, Merino M, Pérez G (2010) On an exact algorithm for solving large-scale two-stage stochastic mixed-integer problems: theoretical and computational aspects. Eur J Oper Res 204:105–116

  • Escudero LF, Garín MA, Merino M, Pérez G (2012) An algorithmic framework for solving large-scale multistage stochastic mixed 0–1 problems with nonsymmetric scenario trees. Comput Oper Res 39:1133–1144

  • Escudero LF, Garín MA, Merino M, Pérez G (2015) On time stochastic dominance induced by mixed integer-linear recourse in multistage stochastic programs (submitted)

  • Escudero LF, Garín MA, Pérez G, Unzueta A (2013) Scenario cluster decomposition of the Lagrangian dual in stochastic mixed 0–1 optimization. Comput Oper Res 40:362–377

  • Escudero LF, Monge JF, Morales DR (2015) An SDP approach for multiperiod mixed 0–1 linear programming models with stochastic dominance constraints for risk management. Comput Oper Res 58:32–40

  • Escudero LF, Monge JF, Morales DR, Wang J (2013) Expected future value decomposition based bid price generation for large-scale network revenue management. Transp Sci 47:181–197

  • Hennessy JL, Patterson DA (2003) Computer architecture: a quantitative approach. Morgan Kaufmann Publishers Inc., San Francisco

  • IBM (2015) ILOG CPLEX optimizer. http://www-01.ibm.com/software/integration/optimization/cplex-optimizer/

  • Latorre JM, Cerisola S, Ramos A, Palacios R (2009) Analysis of stochastic problem decomposition algorithms in computational grids. Ann Oper Res 166:355–379

  • Li X, Wei J, Li T, Wang G, Yeh WG (2014) A parallel dynamic programming algorithm for multi-reservoir system optimization. Adv Water Resour 67:1–15

  • Linderoth J, Shapiro A, Wright S (2006) The empirical behavior of sampling methods for stochastic programming. Ann Oper Res 142:215–241

  • Linderoth JT, Wright S (2008) Decomposition algorithms for stochastic programming on a computational grid. Tech. rep., Mathematics and Computer Science Division, Argonne National Laboratory, Chicago

  • Lumbreras S (2014) Decision support methods for large-scale flexible transmission expansion planning. Ph.D. thesis, Instituto de Investigación Tecnológica, Universidad Pontificia de Comillas, Madrid

  • Mahlke D (2011) A scenario tree-based decomposition for solving multistage stochastic programs with application in energy production. Springer, Berlin

  • Pacheco PS (1996) Parallel programming with MPI. Morgan Kaufmann Publishers, San Francisco

  • Pereira M, Pinto L (1991) Multistage stochastic optimization applied to energy planning. Math Program 52:359–375

  • Piazza A, Pagnoncelli B (2014) The optimal harvesting problem under price uncertainty. Ann Oper Res 217:425–445

  • Römisch W, Schultz R (2001) Multi-stage stochastic integer programs: an introduction. In: Grötschel M, Krumke SO, Rambau J (eds) Online optimization of large scale systems. Springer, Berlin, pp 581–600

  • Ross S (1995) Introduction to stochastic dynamic programming. Academic Press

  • Ruszczynski AP (1993) Parallel decomposition of multistage stochastic programming problems. Math Program 58:201–228

  • Shapiro A, Tekaya W, da Costa JP, Soares MP (2013) Risk neutral and risk averse stochastic dual dynamic programming method. Eur J Oper Res 224:375–391

  • Snir M, Otto S, Walker D, Dongarra J, Huss-Lederman S (1995) MPI: the complete reference. MIT Press, Cambridge

  • Stivala A, Stuckey PJ, de la Banda MG, Hermenegildo M, Wirth A (2010) Lock-free parallel dynamic programming. J Parallel Distrib Comput 70(8):839–848

  • Vladimirou H (1998) Computational assessment of distributed decomposition methods for stochastic linear programs. Eur J Oper Res 108:653–670

  • Vladimirou H, Zenios S (1999) Scalable parallel computations for large-scale stochastic programming. Ann Oper Res 90:87–129

  • Zhang Z, Zhang S, Wang Y, Jiang Y, Wang H (2013) Use of parallel deterministic dynamic programming and hierarchical adaptive genetic algorithm for reservoir operation optimization. Comput Ind Eng 65(2):310–321


Acknowledgments

This research has been partially supported by the project MTM2012-31514 from the Spanish Ministry of Economy and Competitiveness, the project UFI BETS 2011 of the University of the Basque Country (UPV/EHU), the Grupo de Investigación IT-567-13 of the Basque Government, the project RIESGOS CM of the Regional Community of Madrid, Spain, and the project P711RT0278 of the Programa Iberoamericano de Ciencia y Tecnología para el Desarrollo (CYTED). The computational resources were provided by SGI/IZO-SGIker at UPV/EHU (supported by the Spanish Ministry of Education and Science and the European Social Fund).

Author information


Corresponding author

Correspondence to María Merino.

Appendices

Appendix A. Implementation details

This appendix details the serial, inner and outer parallel implementation schemes that have been used for the computational experience reported in Sect. 5.

A.1 Serial implementation overview

The S-SDP algorithm presented in Sect. 3.5 has been implemented for solving the realistic production planning problem under uncertainty introduced in Cristobal et al. (2009). As frequently happens in tactical multistage planning problems, in our case only the continuous variables of a period have nonzero coefficients in the constraints of the next one. Therefore, the linking variables between stages only occur between the leaf nodes \(\{\ell \}\) of the subproblems of a stage, say \(r'\), and the root nodes of the successor subproblems in the next stage, say \(r\in \mathcal{S}_\ell \), if any. (In our case \(\mathcal{V}_\ell =\{y_g,~g\in \tilde{\mathcal{A}}_\ell \}\) for \(\ell \in \mathcal{L}_{r'},~ r'\in \mathcal{R}^{e-1},e\in \mathcal{E}\backslash \{1\}\), such that \((y_g)_i\) denotes the stock of product \(i\) in set \(\mathcal{I}\) related to scenario group \(g\)). Therefore, those continuous variables are the only ones to be iteratively perturbed. The perturbation has been performed as follows: \(\xi =ran/f(iter)\), where \(f(iter)=k\), \(k\) being a constant. Given the relatively small number of iterations performed and the large scale of the instances, the algorithm cannot ensure the goodness of a unique search direction; consequently, a constant factor \(k\) has been chosen to preserve a wide search. However, the value of parameter \(k\) changes depending on the value of the variable to be perturbed, so that small perturbations are generated for small values and large perturbations for large values, aiming at a relatively homogeneous change. Additionally, a single new reference level is generated in the BtF scheme. We observed in our computational experimentation with the type of problem under consideration that multiple perturbations of the same solution lead to similar reference levels, which significantly increase the elapsed time. Finally, all reference levels are kept active.
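
As an illustration of this perturbation rule, the following minimal Python sketch implements \(\xi =ran/f(iter)\) with \(f(iter)=k\) constant. The sampling interval of \(ran\), the magnitude-dependent choice of \(k\) and the way \(\xi\) is applied to the linking variable are assumptions of the sketch, not specifications taken from the paper.

```python
import random

def perturb(y, k_base=10.0, eps=1e-6):
    """Sketch of the S-SDP perturbation xi = ran / f(iter), with f(iter) = k constant.

    The constant k is chosen per variable so that small values of y receive small
    perturbations and large values large ones (a roughly homogeneous relative change).
    The sampling interval of `ran`, the exact rule for k and the additive application
    of xi to y are assumptions of this sketch.
    """
    ran = random.uniform(-1.0, 1.0)        # assumed symmetric random draw
    k = k_base / max(abs(y), eps)          # larger k for small |y| -> smaller xi (assumed rule)
    xi = ran / k                           # perturbation xi = ran / k
    return y + xi                          # perturbed linking-variable value (assumed application)
```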

Throughout the numerical experiments that are reported, the stopping criteria parameters are set to \(\epsilon _1=0.001\) and \(\epsilon _2=0.0001\), \(niterk=miter+1\), \(miter=15\) iterations, a time limit of 8 hours and a memory limit of 35 Gb.
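
For completeness, a hedged sketch of how a stopping check could be assembled from these parameter values is given below; the precise quantities compared against \(\epsilon _1\) and \(\epsilon _2\), and the role of \(niterk\), are those defined in Sect. 3.5, so their mapping here is only indicative.

```python
def should_stop(rel_gap, rel_improvement, iter_count, elapsed_s,
                eps1=1e-3, eps2=1e-4, miter=15, time_limit=8 * 3600):
    """Illustrative stopping check using the parameter values reported in the text.

    rel_gap and rel_improvement stand in for the quantities compared against
    eps1 and eps2 in Sect. 3.5; this mapping is an assumption of the sketch.
    """
    return (rel_gap <= eps1 or rel_improvement <= eps2
            or iter_count >= miter or elapsed_s >= time_limit)
```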

A.2 Inner parallel implementation overview

The Inner P-SDP maintains the algorithmic structure of the S-SDP algorithm, i.e., the model, linking variable perturbation, reference level management and stopping criteria parameters. Note that if the time limit is reached in an S-SDP execution, the Inner P-SDP execution can be allowed to continue iterating and stop afterwards; see below.

As only the variables of a period have nonzero coefficients in the constraints of the following one, the implemented Inner P-SDP algorithm comprises asynchronous execution parts. Thus, stage \(e=1\) is managed by thread \(th 1\), stage \(e=2\) by the main threads and subsequent stages by both main and auxiliary threads. Therefore, primary communication is needed between stages \(1\) and \(2\), and secondary communication is performed between consecutive pairs of stages from \(2\) to \(|\mathcal{E}|\) whenever needed. Global synchronization is achieved when gathering the EFV curves at the end of the FtB and BtF schemes; a schematic sketch of this thread layout and communication pattern is given after the step list below. The procedure of the Inner P-SDP follows the structure of the S-SDP algorithm detailed in Sect. 3.5. It is as follows:

Step 0: (Initialization)

Step 1: (FtB scheme: solve subproblem (5) for stage \({e=1}\))

  • The subproblem is solved by main thread \(th 1\).

  • Primary communication:  The main threads gather the solution from \(th 1\). They then follow an asynchronous execution until the end of the FtB scheme, since no primary communication is needed in between. Set stage \(e:=2\).

Step 2: (Solve subproblem (5) rooted at node \({r}\), \({\forall r\in \mathcal{R}^e}\))

  • Each thread solves its corresponding subproblems rooted at node \(r\in \mathcal{R}_{th}^{FtB}\) and all the available threads solve their own subproblems simultaneously. Note that for stage \(e>2\) all available threads will be used, otherwise only the main threads will.

Step 3: (Generate and append the EFV-\(\ell \) defining constraint to subproblem (5) rooted at node \({r'}\), \({\forall \ell \in \mathcal{L}_{r'}, \, r'\in \mathcal{R}^{e-1}}\))

  • Secondary communication:  Performed only for stage \(e=2\). The auxiliary threads gather the solution from their corresponding main thread so that they can follow an asynchronous execution until the end of the FtB scheme.

Step 4: (Forward to next stage \({e+1}\))

  • Synchronization:    If stage \(e=|\mathcal{E}|\), all threads gather the solution and the EFV curves.

Step 5: (Compute solution value for original model (2) and check stopping criteria)

Step 6: (BtF scheme: Compute dual vector of subproblem (5) rooted at node \({r,~\forall r\in \mathcal{R}^e, e>1}\))

  • The subproblem rooted at \(r\in \mathcal{R}_{\mathrm{th}}^{\mathrm{BtF}}\) is solved by its corresponding thread. Finally, if stage \(e=2\), the results of solving each subproblem rooted at \(r\in \mathcal{R}_{\mathrm{th}}^{\mathrm{BtF}}\) by a main thread are shared with its corresponding auxiliary threads.

  • Secondary communication:  Each main thread and its auxiliary threads gather the solution and follow a synchronized execution among themselves, but their execution is asynchronous with respect to the other main and auxiliary thread groups.

Step 7: (Generate and append the EFV-\(\ell \) defining constraints in subproblem (5) rooted at node \({r'}\), \({\forall \ell \in \mathcal{L}_{r'}, \, r'\in \mathcal{R}^{e-1}}\))

  • Synchronization:    If stage \(e=1\), all threads gather the solution and the EFV curves.

Step 8: (Backward to previous stage \({e-1}\))
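
The following mpi4py-based sketch illustrates, under stated assumptions, the thread layout and communication pattern described above: thread \(th 1\) for stage \(e=1\), the main threads for stage \(e=2\), all threads for later stages, and a global gather of the EFV curves. The subproblem solver, the number of main threads and the number of stages are placeholders rather than the paper's actual implementation.

```python
# Minimal sketch (assumed layout) of the Inner P-SDP thread organization using mpi4py.
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, nthreads = comm.Get_rank(), comm.Get_size()
n_main = 3        # assumed number of main threads (e.g., one per stage e=2 subproblem)
n_stages = 4      # assumed |E|; both values are placeholders, not taken from the paper
is_main = rank < n_main

def solve_subproblems(stage, rank):
    """Placeholder for solving the MIP subproblems (5) assigned to this thread."""
    return {"stage": stage, "rank": rank}

# FtB scheme
sol_stage1 = solve_subproblems(1, rank) if rank == 0 else None   # Step 1: stage e=1 on th1 only
sol_stage1 = comm.bcast(sol_stage1, root=0)                      # primary communication to the main threads

if is_main:
    sol_stage2 = solve_subproblems(2, rank)                      # Step 2 at e=2: main threads only
# Secondary communication at e=2: each auxiliary thread would receive its main thread's
# solution here (e.g., a broadcast over a per-group sub-communicator in a full implementation).

for stage in range(3, n_stages + 1):                             # Steps 2-4 for e>2: all threads work
    solve_subproblems(stage, rank)

efv_curves = comm.allgather({"rank": rank})                      # global synchronization: gather EFV curves
```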

A.3 Outer parallel implementation overview

Let \(inisize\), \(inigap\), \(nthread\), \(pathsize\) and \(pathgap\) be pilot case-driven parameters taken as input for the execution of the outer parallelization. The global solution pool generation phase stores up to \(inisize\) alternative solutions, choosing, from the solving of subproblem (5) at stage \(e=1\) for \(iter=1\), those with the smallest optimality gap among the ones whose gap is lower than \(inigap\). Then, the \(nthread\) most divergent solutions are chosen from the pool and assigned to threads. The selection criterion is as follows: the candidate solution in the pool whose vector of linking variable values (at stage \(e=1\)) has the highest Euclidean distance to the optimal solution is the first candidate; the solution with the highest distance to the optimal solution plus the distance to the first candidate is the second candidate, and so on.

The alternative solution path selection phase (at stage \(e=1\) for \(iter>1\)) stores up to \(pathsize\) candidates with an optimality gap lower than \(pathgap\) and takes the one with the highest Euclidean distance, as defined above, from the optimal solution for stage \(e=1\).
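
A minimal sketch of this greedy maximum-distance selection is given next; the data structures and tie-breaking are illustrative, as the text only fixes the Euclidean-distance criterion on the stage \(e=1\) linking variables.

```python
import numpy as np

def select_diverse(pool, optimal, nthread):
    """Greedy selection of the nthread most divergent pool solutions.

    `pool` is a list of linking-variable vectors (stage e=1) of quasi-optimal solutions
    and `optimal` is the linking-variable vector of the optimal solution; names and
    tie-breaking are illustrative assumptions of this sketch.
    """
    pool = [np.asarray(s, dtype=float) for s in pool]
    optimal = np.asarray(optimal, dtype=float)
    chosen, remaining = [], list(range(len(pool)))
    while remaining and len(chosen) < nthread:
        # score = distance to the optimal solution plus distances to already chosen candidates
        def score(i):
            d = np.linalg.norm(pool[i] - optimal)
            d += sum(np.linalg.norm(pool[i] - pool[j]) for j in chosen)
            return d
        best = max(remaining, key=score)
        chosen.append(best)
        remaining.remove(best)
    return [pool[i] for i in chosen]

# e.g., assign each selected solution to one thread/path of the Outer P-SDP:
# paths = select_diverse(pool_of_solutions, optimal_solution, nthread=4)
```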

The stopping criteria parameters are set to the same values used in the S-SDP and Inner P-SDP algorithms. The procedure of the implemented Outer P-SDP is as follows:

Step 0: (Initialization)

Step 1: (FtB scheme: Solve subproblem (5) for stage \({e=1}\))

  • Generate the global solution pool for iteration \(iter=1\) from the set of optimal and quasi-optimal solutions from solving the subproblem as described above.

  • For \(iter>1\) and those paths that have issued a warning message on “non-improved path solution” at the end of the FtB scheme (Step 5) of the previous iteration, an alternative solution is picked up from the pool, according to the criterion presented above.

Step 2: (Solve subproblem (5) rooted at node \({r}\), \({\forall r\in \mathcal{R}^e,~e\ge 2}\))

Step 3: (Generate and append the EFV-\(\ell \) defining constraint to subproblem (5) rooted at node \({r'}\), \({\forall \ell \in \mathcal{L}_{r'}, \, r'\in \mathcal{R}^{e-1}}\))

Step 4: (Forward to next stage \({e+1}\))

Step 5: (Compute solution value for original model (2) and check stopping criteria)

  • The communication and synchronization phase is executed; see Sect. 4.3 and Fig. 5. The global iteration solution is computed and the stopping criteria are checked. If the global incumbent solution has been updated, all threads gather the corresponding solution; if neither the global incumbent solution nor the path incumbent solution has been updated, the warning message “non-improved path solution” is issued for the corresponding path (a small bookkeeping sketch is given after the step list below).

Step 6: (BtF scheme: Compute dual vector of subproblem (5) rooted at node \({r,~ r\in \mathcal{R}^e, e>1}\))

  • Path reference levels: Generate new reference levels using the path criteria for random values.

Step 7: (Generate and append the EFV-\(\ell \) defining constraints in subproblem (5) rooted at node \({r'}\), \({\forall \ell \in \mathcal{L}_{r'}, \, r'\in \mathcal{R}^{e-1}}\))

Step 8: (Backward to previous stage \({e-1}\))
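
The Step 5 bookkeeping (incumbent updates and the “non-improved path solution” warning) can be sketched as follows, assuming minimization; variable names and the return format are illustrative, not the paper's interface.

```python
def update_incumbents(path_values, global_incumbent, path_incumbents):
    """Sketch (assumed bookkeeping) of Step 5: update incumbents and flag non-improving paths.

    `path_values` maps each outer path to its current iteration solution value
    (minimization assumed); names and return format are illustrative only.
    """
    warnings = []
    best_value = min(path_values.values())
    global_updated = best_value < global_incumbent
    if global_updated:
        global_incumbent = best_value           # all threads gather the corresponding solution
    for path, value in path_values.items():
        if value < path_incumbents[path]:
            path_incumbents[path] = value       # path incumbent improved
        elif not global_updated:
            warnings.append(path)               # issue "non-improved path solution" for this path
    return global_incumbent, path_incumbents, warnings
```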

Appendix B. Inner P-SDP quality analysis of the largest instances in Testbed 2

This appendix details the Inner parallelization analysis for the largest cases, c85 and c86, when using an increasing number (up to 8) of multiprocessor computers of the cluster.

Table 10 and Fig. 6 show the results in terms of elapsed time, speedup and efficiency for instances c85 and c86 when using 12, 24, 48 and 96 threads for the Inner P-SDP versus the S-SDP (executed using only one thread for CPLEX). The new headings are as follows: \(S_{th}^{top}\), the top speedup, defined as \(S_{th}^{top}=\frac{t_{IP}^{serial}}{t_{th}^{top}}\), and \(E_{th}^{top}\,\%\), the top efficiency, defined as \(E_{th}^{top}\,\%=100 \cdot \frac{S_{th}^{top}}{th}\), where \(t_{th}^{top}=\sum _{e \in \mathcal{E}}\frac{t_e^{serial}}{\min (th,|\mathcal{R}^e| |\mathcal{Z}|)}\), \(t_e^{serial}\) is the elapsed time for stage \(e\) in the S-SDP, and \(t_{IP}^{serial}=\sum _{e\in \mathcal{E}} t_e^{serial}\). The concepts \(S_{th}^{top}\) and \(E_{th}^{top}\) are, respectively, the top speedup and top efficiency that could be achieved, given the specifications of the SDP algorithm and its Inner P-SDP parallelization, under ideal conditions (i.e., as if no time were lost in communication and synchronization). For example, in instance c85, \(t_{IP}^{serial}=26{,}180=59+10+8{,}987+17{,}125=t_1+t_2+t_3+t_4\) and \(\{|\mathcal{R}^e|\}=\{1, 3, 27, 486\}\); therefore, with 48 threads, \(t_{48}^{top}=\frac{59}{1}+\frac{10}{3}+\frac{8{,}987}{27}+\frac{17{,}125}{48}=752\), \(S_{48}^{top}=\frac{26{,}180}{752}=34.81\) and \(E_{48}^{top}=100\cdot \frac{34.81}{48}=72.53~\%\).
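
These definitions can be reproduced in a few lines of code; the sketch below recomputes the c85 example with 48 threads, assuming \(|\mathcal{Z}|=1\), as the worked figures imply.

```python
def top_speedup(stage_times, stage_subproblems, th, z=1):
    """Top (ideal) speedup and efficiency of the Inner P-SDP, as defined in Appendix B.

    stage_times[e]       : serial elapsed time t_e^serial of stage e
    stage_subproblems[e] : number of subproblems |R^e| of stage e
    z                    : the |Z| factor in min(th, |R^e||Z|); assumed 1 for the c85 example
    """
    t_serial = sum(stage_times)
    t_top = sum(t_e / min(th, n_e * z)
                for t_e, n_e in zip(stage_times, stage_subproblems))
    s_top = t_serial / t_top
    e_top = 100.0 * s_top / th
    return t_top, s_top, e_top

# Worked example for instance c85 with 48 threads (stage times and |R^e| taken from the text)
t_top, s_top, e_top = top_speedup([59, 10, 8987, 17125], [1, 3, 27, 486], th=48)
print(round(t_top), round(s_top, 1), round(e_top, 1))   # ~752, ~34.8, ~72.5 % (matches up to rounding)
```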

Fig. 6 Speedup and efficiency for instances c85 and c86

Table 10 Inner P-SDP speedup and efficiency for instances c85 and c86

Note in Table 10 and Fig. 6 the remarkable scalability of the Inner P-SDP algorithm. The speedup increases almost linearly up to 48 threads, and with 96 threads the execution is 44 and 34 times faster than the serial version for instances c85 and c86, respectively. Comparing the efficiency and the top efficiency, we observe that the time lost in communication and synchronization is not significant in the Inner P-SDP implementation for instance c85 and is quite small for instance c86.


Cite this article

Aldasoro, U., Escudero, L.F., Merino, M. et al. On parallelization of a stochastic dynamic programming algorithm for solving large-scale mixed 0–1 problems under uncertainty. TOP 23, 703–742 (2015). https://doi.org/10.1007/s11750-014-0359-3
