Dynamic portfolio choice: a simulation-and-regression approach
Simulation-and-regression algorithms have become a standard tool for solving dynamic programs in many areas, in particular financial engineering and computational economics. In virtually all cases, the regression is performed on the state variables, for example on current market prices. However, it is also possible to regress on decision variables, and this opens up new possibilities. We present numerical evidence of the performance of such an algorithm in the context of dynamic portfolio choice in discrete-time (and thus incomplete) markets. The problem is fundamentally the one considered in some recent papers that also use simulations and/or regressions: discrete time, multi-period reallocation, and maximization of terminal utility. In contrast to that literature, we regress on decision variables and rely on neither Taylor expansions nor derivatives of the utility function. Only basic tools are used, bundled in a dynamic programming framework: simulations (which can be black-boxed) as a representation of the exogenous state variable dynamics; regression surfaces as non-anticipative representations of expected future utility; and nonlinear or quadratic optimization to identify the best portfolio choice at each time step. The resulting approach is simple and highly flexible, and it performs well in both computing time and precision.
Keywords: Simulation-and-regression methods; Least-squares Monte Carlo methods; Dynamic programming; Portfolio choice; Portfolio optimization
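To make the idea of regressing on a decision variable concrete, the following is a minimal sketch of a single rebalancing step, not the authors' implementation. It assumes one risky asset with lognormal returns, one risk-free asset, CRRA terminal utility, and a quadratic regression basis in the portfolio weight; the function name `regress_on_decision` and all parameter values are hypothetical choices made for illustration. Each simulated path is paired with a randomly drawn weight, realized utility is regressed on that weight, and the fitted quadratic is then maximized in closed form.

```python
import numpy as np


def regress_on_decision(n_paths=200_000, gamma=5.0, rf=0.02,
                        mu=0.08, sigma=0.20, seed=0):
    """One-step illustration: regress realized utility on the decision.

    All parameters are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)

    # Exogenous dynamics: simulated gross returns of the risky asset.
    # The simulator could be any black box; lognormal is assumed here.
    risky = np.exp(rng.normal(mu - 0.5 * sigma**2, sigma, n_paths))
    riskless = np.exp(rf)

    # Decision variable: one randomly drawn risky-asset weight per path,
    # restricted to [0, 1] so that terminal wealth stays positive.
    w = rng.uniform(0.0, 1.0, n_paths)

    # Realized terminal wealth (initial wealth 1) and CRRA utility.
    wealth = w * risky + (1.0 - w) * riskless
    utility = wealth ** (1.0 - gamma) / (1.0 - gamma)

    # Regression surface: expected utility as a quadratic in the decision.
    basis = np.column_stack([np.ones(n_paths), w, w**2])
    beta, *_ = np.linalg.lstsq(basis, utility, rcond=None)

    # Quadratic optimization on [0, 1]: interior vertex if the fit is
    # concave, otherwise the better of the two endpoints.
    if beta[2] < 0:
        w_star = float(np.clip(-beta[1] / (2.0 * beta[2]), 0.0, 1.0))
    else:
        w_star = 0.0 if beta[0] >= beta.sum() else 1.0
    return beta, w_star


if __name__ == "__main__":
    beta, w_star = regress_on_decision()
    print("regression coefficients:", beta)
    print("estimated optimal risky weight:", w_star)
```

In a multi-period setting, a step of this kind would be repeated backward in time, with the regression basis also including the state variables; this sketch omits that recursion. Note that only utility values are used: no derivatives of the utility function and no Taylor expansion appear anywhere.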