Strategic asset allocation and market timing: a reinforcement learning approach


We apply the recurrent reinforcement learning method of Moody, Wu, Liao, and Saffell (1998) to strategic asset allocation, using sample data from the US, the UK, Germany, and Japan. We find that the optimal asset allocation deviates substantially from the fixed-mix rule. The investor actively times the market and consistently outperforms it over the almost two decades we analyze.


  • Barsky R.B., Juster F.T., Kimball M.S., Shapiro M.D. (1997). Preference parameters and behavioral heterogeneity: An experimental approach in the health and retirement study. Quarterly Journal of Economics 111: 537–579

  • van Binsbergen J.H., Brandt M.W. (2006). Solving dynamic portfolio choice problems by recursing on optimized portfolio weights or on the value function. Computational Economics, forthcoming

  • Brandt M.W. (1999). Estimating portfolio and consumption choice: A conditional Euler equations approach. Journal of Finance 54, 1609–1646

  • Brandt M.W., Goyal A., Santa-Clara P., Stroud J.R. (2006). A simulation approach to dynamic portfolio choice with an application to learning about return predictability. Review of Financial Studies 18, 831–873

  • Brennan M.J., Schwartz E.S., Lagnado R. (1997). Strategic asset allocation. Journal of Economic Dynamics and Control 21(7): 1377–1403

  • Campbell J.Y. (2000). Asset pricing at the millennium. Journal of Finance 55(4): 1515–1567

  • Campbell J.Y., Lo A.W., MacKinlay A.C. (1999). The econometrics of financial markets. Princeton, NJ, Princeton University Press

  • Campbell J.Y., Shiller R.J. (1988). The dividend-price ratio and expectations of future dividends and discount factors. Review of Financial Studies 1, 195–228

  • Campbell J.Y., Viceira L.M. (2002). Strategic asset allocation. Oxford, Oxford University Press

  • Campbell J.Y., Viceira L.M. (1999). Consumption and portfolio decisions when expected returns are time-varying. Quarterly Journal of Economics 114, 433–495

  • Campbell J.Y., Viceira L.M. (2005). Strategic asset allocation for pension plans. In: Clark G., Munnell A., Orszag M. (eds). Oxford handbook of pensions and retirement income. Oxford, Oxford University Press, pp. 5–7, forthcoming

  • Cochrane J.H. (1999). Portfolio advice for a multifactor world. Economic Perspectives XXIII(3), Third quarter 1999: 59–78. Chicago, Federal Reserve Bank of Chicago

  • Daniel K., Hirshleifer D., Subrahmanyam A. (1998). Investor psychology and security market under- and overreactions. Journal of Finance 53(6): 1839–1885

  • De Bondt W., Thaler R.H. (1985). Does the stock market overreact? Journal of Finance 40(3): 793–805

  • Gallant A.R., Hansen L.P., Tauchen G. (1990). Using conditional moments of asset payoffs to infer the volatility of intertemporal marginal rates of substitution. Journal of Econometrics 45(1–2): 141–179

  • Fama E. (1998). Market efficiency, long-term returns, and behavioral finance. Journal of Financial Economics 49, 283–306

  • Friend I., Blume M.E. (1975). The demand for risky assets. American Economic Review 65(5): 900–922

  • Malkiel B.G. (2003). The efficient market hypothesis and its critics. Journal of Economic Perspectives 17, 59–82

  • Merton R. (1969). Lifetime portfolio selection under uncertainty: The continuous-time case. Review of Economics and Statistics 51, 247–257

  • Moody J., Wu L., Liao Y., Saffell M. (1998). Performance functions and reinforcement learning for trading systems and portfolios. Journal of Forecasting 17, 441–470

  • Samuelson P.A. (1969). Lifetime portfolio selection by dynamic stochastic programming. The Review of Economics and Statistics 51(3): 239–246

  • Samuelson P.A. (1991). Long-run risk tolerance when equity returns are mean regressing: Pseudoparadoxes and vindication of "businessman's risk". In: Brainard, Nordhaus, Watts (eds). Money, macroeconomics, and economic policy: Essays in honor of James Tobin (chap. 7, pp. 181–200). New York, MIT Press

  • Shleifer A. (2000). Inefficient markets: An introduction to behavioral finance. Oxford, Oxford University Press

  • Wöhrmann P. (2006). An axiomatic approach to prediction. University of Zurich

Author information

Corresponding author

Correspondence to Thorsten Hens.

About this article

Cite this article

Hens, T., Wöhrmann, P. Strategic asset allocation and market timing: a reinforcement learning approach. Comput Econ 29, 369–381 (2007).


  • Dynamic asset allocation
  • Bond/equity ratio
  • Reinforcement Learning