Controlled Approximation of the Stochastic Dynamic Programming Value Function for Multi-Reservoir Systems

Conference paper
Part of the Lecture Notes in Economics and Mathematical Systems book series (LNE, volume 682)


We present an approximation of the Stochastic Dynamic Programming (SDP) value function based on a partition of the state space into simplices. The vertices of such simplices form an irregular grid over which the value function is computed. Under convexity assumptions, lower and upper bounds are developed over the state space continuum. The partition is then refined where the gap between these bounds is largest. This process readily provides a controllable trade-off between accuracy and solution time.
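The refinement loop can be illustrated in one dimension, where the simplices are intervals. The sketch below is only an analogue under stated assumptions, not the construction used in the paper: it assumes a differentiable convex function, takes the chord between interval endpoints as the upper bound and the envelope of the endpoint tangents as the lower bound, and bisects the interval with the largest gap. The names `gap` and `refine`, the bisection rule, and the tolerance are hypothetical choices made for illustration.

```python
import heapq
import math


def gap(f, df, a, b):
    """Largest distance on [a, b] between the chord through (a, f(a)), (b, f(b))
    (upper bound for convex f) and the two endpoint tangents (lower bound)."""
    fa, fb, da, db = f(a), f(b), df(a), df(b)
    if db - da <= 1e-12:
        # Parallel tangents: f is affine on [a, b], so the bounds coincide.
        return 0.0
    # The tangents cross at x; the gap is concave piecewise linear and peaks there.
    x = (fa - fb + db * b - da * a) / (db - da)
    x = min(max(x, a), b)  # guard against rounding outside the interval
    lower = fa + da * (x - a)
    upper = fa + (fb - fa) * (x - a) / (b - a)
    return max(0.0, upper - lower)


def refine(f, df, lo, hi, tol=1e-3, max_cells=5000):
    """Bisect the cell with the largest bound gap until every gap is <= tol
    (or max_cells is reached). Returns sorted (left, right, gap) triples."""
    heap = [(-gap(f, df, lo, hi), lo, hi)]  # max-heap via negated gaps
    while -heap[0][0] > tol and len(heap) < max_cells:
        _, a, b = heapq.heappop(heap)
        m = 0.5 * (a + b)  # assumption: split at the midpoint
        for left, right in ((a, m), (m, b)):
            heapq.heappush(heap, (-gap(f, df, left, right), left, right))
    return sorted((a, b, -g) for g, a, b in heap)


# Example: exp is convex, and its curvature grows with x, so the grid
# ends up finer near the right end of the interval.
cells = refine(math.exp, math.exp, 0.0, 4.0, tol=1e-3)
```

Lowering `tol` yields more cells and tighter bounds, which is the accuracy versus solution time trade-off described above. In higher dimensions the same idea would apply to simplices rather than intervals, with the grid made of their vertices.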


Reservoir Level, Stochastic Dynamic Program, Irregular Grid, Approximate Dynamic Program, Division Point
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. Operations and Decision Systems Department, Université Laval, Pavillon Palasis-Prince, Québec, Canada
  2. Rio Tinto Alcan, Énergie électrique, Jonquière, Canada
