Suboptimal Policies for Stochastic \(N\)-Stage Optimization: Accuracy Analysis and a Case Study from Optimal Consumption

  • Mauro Gaggero
  • Giorgio Gnecco
  • Marcello Sanguineti
Part of the International Series in Operations Research & Management Science book series (ISOR, volume 198)


Dynamic programming formally solves stochastic optimization problems whose objective is additive over a finite number of stages. However, it yields closed-form solutions only in special cases, so in general one must resort to approximate methodologies. In this chapter, suboptimal solutions are sought by approximating the decision policies via linear combinations of Gaussian and sigmoidal functions containing adjustable parameters, which are optimized together with the coefficients of the combinations. These approximation schemes correspond to Gaussian radial-basis-function networks and sigmoidal feedforward neural networks, respectively. The accuracy of the suboptimal solutions is investigated by estimating how the approximation error propagates through the stages. As a case study, we address a multidimensional problem of optimal consumption under uncertainty, modeled as a stochastic optimization task with an objective that is additive over a finite number of stages. In the classical one-dimensional setting, a consumer aims to maximize, over a given time horizon, the discounted expected value of consumption of a good, where the expectation is taken with respect to a stochastic interest rate. The consumer has an initial wealth and at each time period earns an income, modeled as an exogenous input. We consider a multidimensional framework with \(d > 1\) consumers who aim to maximize a social utility function. We first provide conditions under which our estimates apply to this problem, and then present a numerical analysis.
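To fix ideas, the policy parametrization described above can be sketched as follows. This is a minimal illustrative example, not the chapter's actual implementation: it evaluates a one-hidden-layer Gaussian radial-basis-function approximator of a decision policy, in which the centers and widths of the Gaussians are the "adjustable parameters" and the outer weights are the "coefficients of the combination"; all function and variable names are hypothetical.

```python
import numpy as np

def gaussian_rbf_policy(x, centers, widths, coeffs):
    """Evaluate a Gaussian RBF approximation of a (scalar) decision policy.

    x       : state vector, shape (d,)
    centers : adjustable Gaussian centers, shape (n, d)
    widths  : adjustable Gaussian widths, shape (n,)
    coeffs  : coefficients of the linear combination, shape (n,)
    """
    # Squared Euclidean distance of the state from each center.
    sq_dists = np.sum((centers - x) ** 2, axis=1)
    # Linear combination of Gaussian basis functions.
    return float(coeffs @ np.exp(-sq_dists / (2.0 * widths ** 2)))

# Hypothetical usage: a 2-dimensional state, two basis functions.
x = np.array([0.0, 0.0])
centers = np.array([[0.0, 0.0], [3.0, 0.0]])
widths = np.array([1.0, 1.0])
coeffs = np.array([1.0, 1.0])
u = gaussian_rbf_policy(x, centers, widths, coeffs)
```

In the chapter's setting, one such approximator is introduced per stage, and both the inner parameters (centers, widths) and the outer coefficients are tuned by nonlinear optimization of the expected multistage objective; the sigmoidal case is analogous, with logistic units in place of the Gaussians.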



Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Mauro Gaggero (2)
  • Giorgio Gnecco (1)
  • Marcello Sanguineti (1)
  1. DIBRIS, University of Genoa, Genova, Italy
  2. Institute of Intelligent Systems for Automation (ISSIA), National Research Council of Italy, Genova, Italy
