Optimization and Engineering, Volume 18, Issue 2, pp 369–406

Dynamic portfolio choice: a simulation-and-regression approach



Simulation-and-regression algorithms have become a standard tool for solving dynamic programs in many areas, in particular financial engineering and computational economics. In virtually all cases, the regression is performed on the state variables, for example on current market prices. However, it is possible to regress on decision variables as well, and this opens up new possibilities. We present numerical evidence of the performance of such an algorithm, in the context of dynamic portfolio choices in discrete-time (and thus incomplete) markets. The problem is fundamentally the one considered in some recent papers that also use simulations and/or regressions: discrete time, multi-period reallocation, and maximization of terminal utility. In contrast to that literature, we regress on decision variables, and we rely on neither Taylor expansions nor derivatives of the utility function. Only basic tools are used, bundled in a dynamic programming framework: simulations (which can be black-boxed), as a representation of exogenous state variable dynamics; regression surfaces, as non-anticipative representations of expected future utility; and nonlinear or quadratic optimization, to identify the best portfolio choice at each time step. The resulting approach is simple and highly flexible, and it performs well in both computing time and precision.
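The three ingredients named in the abstract (simulation, regression on state and decision variables, optimization over the fitted surface) can be illustrated with a minimal sketch of the last time step only. This is not the authors' implementation. The single risky asset, the predictor `z`, the return parameters, the CRRA risk aversion, the quadratic basis and the grid search standing in for the optimizer are all assumptions chosen for illustration.

```python
import numpy as np

def crra_utility(wealth, gamma=5.0):
    """Power (CRRA) utility of terminal wealth; gamma is an assumed risk aversion."""
    return wealth ** (1.0 - gamma) / (1.0 - gamma)

def basis(z, w):
    """Quadratic basis in the state z and the decision w (equal-length 1-D arrays)."""
    return np.column_stack([np.ones_like(z), z, w, z * w, w ** 2, z ** 2])

def fit_last_step(n_paths=100_000, seed=0):
    """Simulate one period and regress realized utility on (state, decision)."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n_paths)        # exogenous state, e.g. a return predictor
    w = rng.uniform(0.0, 1.0, n_paths)      # randomized decision: risky-asset weight
    # Hypothetical return model: the log return of the risky asset depends on z.
    log_ret = 0.05 + 0.03 * z + 0.20 * rng.standard_normal(n_paths)
    # Terminal wealth from unit wealth; the risk-free gross return is set to 1.
    wealth = 1.0 + w * (np.exp(log_ret) - 1.0)
    # Least-squares regression surface approximating E[utility | z, w].
    coef, *_ = np.linalg.lstsq(basis(z, w), crra_utility(wealth), rcond=None)
    return coef

def best_weight(coef, z_value, grid=np.linspace(0.0, 1.0, 101)):
    """Maximize the fitted surface over w for a given state (grid search)."""
    fitted = basis(np.full_like(grid, z_value), grid) @ coef
    return float(grid[np.argmax(fitted)])

coef = fit_last_step()
weights = {z: best_weight(coef, z) for z in (-1.0, 0.0, 1.0)}
print(weights)
```

Because the decision `w` is sampled alongside the state and enters the regression basis, the fitted surface can be optimized directly, without differentiating the utility function. In a multi-period setting, the same fit-then-optimize step would be repeated backward in time.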


Simulation-and-regression methods; Least-squares Monte Carlo methods; Dynamic programming; Portfolio choice; Portfolio optimization



The authors acknowledge the financial support of HEC-Montréal and NSERC (E.D. grant 386416-2010; M.D. grant 227838-2011).



Copyright information

© Springer Science+Business Media New York 2017

Authors and Affiliations

  • Michel Denault (1, 2)
  • Erick Delage (1, 2)
  • Jean-Guy Simonato (3)

  1. Department of Decision Sciences, HEC Montréal, Montréal, Canada
  2. GERAD, Montréal, Canada
  3. Department of Finance, HEC Montréal, Montréal, Canada
