Bounds for Multistage Stochastic Programs Using Supervised Learning Strategies

  • Boris Defourny
  • Damien Ernst
  • Louis Wehenkel
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5792)

Abstract

We propose a generic method for quickly obtaining good upper bounds on the minimal value of a multistage stochastic program. The method is based on the simulation of a feasible decision policy, synthesized by a strategy that relies on any scenario-tree approximation from stochastic programming and on supervised learning techniques from machine learning.

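One natural way to realize the strategy described above is: (i) solve a scenario-tree approximation of the program; (ii) fit, by supervised learning, a decision rule per stage from (information history, decision) pairs extracted from the solved tree; (iii) simulate the learned rules on a large independent sample of scenarios, repairing decisions to feasibility when needed. Because the simulated policy is feasible, its estimated expected cost upper-bounds the optimal value of the minimization problem. The Python sketch below illustrates steps (ii) and (iii) under these assumptions; the helpers sample_scenario, stage_cost and project_to_feasible, as well as the nearest-neighbour learner standing in for an arbitrary supervised learner, are hypothetical placeholders, not the authors' implementation.

```python
# Hedged sketch of the bounding scheme outlined in the abstract.
# Hypothetical helpers (not from the paper): sample_scenario, stage_cost,
# project_to_feasible.  A 1-nearest-neighbour regressor stands in for any
# supervised learning technique.
import numpy as np


def fit_nearest_neighbor_policy(histories, decisions):
    """Learn a stage-t decision rule from (history, decision) pairs read off
    the optimized scenario tree; predict by 1-nearest neighbour."""
    X = np.asarray(histories, dtype=float)   # shape (n_nodes, history_dim)
    U = np.asarray(decisions, dtype=float)   # shape (n_nodes, decision_dim)

    def policy(history):
        h = np.asarray(history, dtype=float)
        idx = np.argmin(np.linalg.norm(X - h, axis=1))
        return U[idx]

    return policy


def upper_bound_by_simulation(policies, sample_scenario, stage_cost,
                              project_to_feasible, n_scenarios=10_000, seed=0):
    """Simulate the learned decision rules on fresh scenarios.  Since the
    simulated policy is feasible, the average cost (plus a standard-error
    margin) gives a statistical upper bound on the optimal value of the
    minimization problem."""
    rng = np.random.default_rng(seed)
    costs = np.empty(n_scenarios)
    for i in range(n_scenarios):
        scenario = sample_scenario(rng)            # one realization xi_1, ..., xi_T
        observed, total = [], 0.0
        for t, xi_t in enumerate(scenario):
            observed.append(np.atleast_1d(xi_t))
            history = np.concatenate(observed)     # information available at stage t
            u_t = policies[t](history)             # decision proposed by the learner
            u_t = project_to_feasible(t, history, u_t)   # repair to feasibility
            total += stage_cost(t, xi_t, u_t)
        costs[i] = total
    return costs.mean(), costs.std(ddof=1) / np.sqrt(n_scenarios)
```

The returned mean and standard error can be combined, e.g. as mean plus a quantile times the standard error, to report a confidence-level upper bound rather than a point estimate.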

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Boris Defourny (1)
  • Damien Ernst (1)
  • Louis Wehenkel (1)
  1. University of Liège, Systems and Modeling, B28, Liège, Belgium
