Regime-switching recurrent reinforcement learning for investment decision making

  • Original Paper
Computational Management Science

Abstract

This paper presents the regime-switching recurrent reinforcement learning (RSRRL) model and describes its application to investment problems. The RSRRL is a regime-switching extension of the recurrent reinforcement learning (RRL) algorithm. The basic RRL model was proposed by Moody and Wu (Proceedings of the IEEE/IAFE 1997 on Computational Intelligence for Financial Engineering (CIFEr). IEEE, New York, pp 300–307, 1997) as a methodology for solving stochastic control problems in finance. We argue that the RRL is unable to capture all the intricacies of financial time series, and propose the RSRRL as a more suitable algorithm for this type of data. This paper describes two variants of the RSRRL, a threshold version and a smooth transition version, and compares their performance with that of the basic RRL model in automated trading and portfolio management applications. We use volatility as the indicator (transition) variable for switching between regimes. The out-of-sample results generally favour the RSRRL models, thereby supporting the regime-switching approach, but some doubts remain regarding the robustness of the proposed models, especially in the presence of transaction costs.
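
The difference between the threshold and smooth transition variants can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the model equations, so the sketch assumes a single-layer RRL trader of the form F_t = tanh(w · x_t), two regimes, and a logistic transition function in the style of smooth transition autoregressive models. The names `threshold_weight`, `smooth_weight`, `rsrrl_signal`, `w_low`, `w_high`, `c` and `gamma`, and all numeric values, are hypothetical.

```python
import math


def threshold_weight(q, c):
    """Hard switch: 0 in the low regime (q <= c), 1 in the high regime."""
    return 1.0 if q > c else 0.0


def smooth_weight(q, c, gamma):
    """Logistic weight 1 / (1 + exp(-gamma * (q - c))), a value in [0, 1].

    Written via tanh, which is algebraically identical and avoids
    overflow in exp() when |gamma * (q - c)| is large.
    """
    return 0.5 * (1.0 + math.tanh(0.5 * gamma * (q - c)))


def rsrrl_signal(returns, prev_signal, vol, w_low, w_high, c, gamma=None):
    """One step of an assumed two-regime recurrent trader.

    returns      recent returns used as inputs
    prev_signal  previous position F_{t-1} (the recurrent input)
    vol          volatility, used as the transition variable
    w_low/w_high weight vectors for the low/high volatility regimes
    c            volatility threshold (assumed parameter)
    gamma        smoothness; None selects the hard threshold variant
    Returns a position in [-1, 1].
    """
    # Feature vector: bias term, lagged returns, previous position.
    features = [1.0, *returns, prev_signal]
    if len(w_low) != len(features) or len(w_high) != len(features):
        raise ValueError("each weight vector needs one entry per feature")

    g = threshold_weight(vol, c) if gamma is None else smooth_weight(vol, c, gamma)

    z_low = sum(w * x for w, x in zip(w_low, features))
    z_high = sum(w * x for w, x in zip(w_high, features))

    # Blend the two regime-specific linear responses, then squash.
    return math.tanh((1.0 - g) * z_low + g * z_high)


# Illustrative inputs only; not taken from the paper.
returns = [0.01, -0.02]
w_low = [0.0, 1.0, 0.0, 0.5]    # follows the latest return when calm
w_high = [0.0, -1.0, 0.0, 0.5]  # trades against it when volatile

calm = rsrrl_signal(returns, 0.0, vol=0.10, w_low=w_low, w_high=w_high, c=0.20)
turbulent = rsrrl_signal(returns, 0.0, vol=0.30, w_low=w_low, w_high=w_high, c=0.20)
blended = rsrrl_signal(returns, 0.0, vol=0.20, w_low=w_low, w_high=w_high,
                       c=0.20, gamma=50.0)
```

With `gamma=None` the position comes entirely from one regime's weights, depending on which side of `c` the volatility falls. With a finite `gamma` the two regimes are mixed, and as `gamma` grows the smooth variant approaches the threshold variant. In an actual RRL setting the weights would be trained to maximise a performance measure rather than set by hand as they are here.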

References

  • Bertoluzzo F, Corazza M (2007) Making financial trading by recurrent reinforcement learning. In: Proceedings of the 11th International Conference on Knowledge-Based Intelligent Information and Engineering Systems and the XVII Italian Workshop on Neural Networks. Springer-Verlag, USA, pp 619–626

  • Dempster M, Leemans V (2006) An automated FX trading system using adaptive reinforcement learning. Expert Syst Appl 30(3): 543–552

  • Franses P, van Dijk D (2000) Nonlinear time series models in empirical finance. Cambridge University Press, Cambridge

  • Gold C (2003) FX trading via recurrent reinforcement learning. In: Proceedings of the 2003 IEEE International Conference on Computational Intelligence for Financial Engineering. IEEE, pp 363–370

  • Hamilton JD (1989) A new approach to the economic analysis of nonstationary time series and the business cycle. Econometrica 57(2): 357–384

  • Hamilton JD (2008) Regime-switching models. In: The New Palgrave Dictionary of Economics. Palgrave Macmillan, England

  • Kaelbling L, Littman M, Moore A (1996) Reinforcement learning: A survey. J Artif Intell Res 4(1): 237–285

  • Koutmos G (1997) Feedback trading and the autocorrelation pattern of stock returns: further empirical evidence. J Int Money Financ 16(4): 625–636

  • LeBaron B (1992) Some relations between volatility and serial correlations in stock market returns. J Bus 65(2): 199–219

  • McKenzie MD, Faff RW (2003) The determinants of conditional autocorrelation in stock returns. J Financ Res 26(2): 259–274

  • Moody J, Wu L (1997) Optimization of trading systems and portfolios. In: Proceedings of the IEEE/IAFE 1997 on Computational Intelligence for Financial Engineering (CIFEr). IEEE, New York, pp 300–307

  • Moody J, Wu L, Liao Y, Saffell M (1998) Performance functions and reinforcement learning for trading systems and portfolios. J Forecast 17(5–6): 441–470

  • Moody J, Saffell M (2001) Learning to trade via direct reinforcement. IEEE Trans Neural Netw 12(4): 875–889

  • Sentana E, Wadhwani S (1992) Feedback traders and stock return autocorrelations: evidence from a century of daily data. Econ J 102(411): 415–425

  • Sharpe W (1966) Mutual fund performance. J Bus 39(1): 119–138

  • Storn R, Price K (1997) Differential evolution – a simple and efficient heuristic for global optimization over continuous spaces. J Glob Optim 11(4): 341–359

  • Sutton R, Barto A (1998) Introduction to reinforcement learning. MIT Press, Cambridge

  • Teräsvirta T (1994) Specification, estimation, and evaluation of smooth transition autoregressive models. J Am Stat Assoc 89(425): 208–218

  • Tong H (1978) On a threshold model. In: Chen C (ed) Pattern recognition and signal processing. Sijthoff & Noordhoff, The Netherlands, pp 101–141

  • Watkins C (1989) Learning from delayed rewards. Ph.D. thesis, University of Cambridge, England

  • Werbos P (1990) Backpropagation through time: what it does and how to do it. Proc IEEE 78(10): 1550–1560

  • White H (1989) Some asymptotic results for learning in single hidden-layer feedforward network models. J Am Stat Assoc 84(408): 1003–1013

Author information

Corresponding author

Correspondence to Dietmar Maringer.

About this article

Cite this article

Maringer, D., Ramtohul, T. Regime-switching recurrent reinforcement learning for investment decision making. Comput Manag Sci 9, 89–107 (2012). https://doi.org/10.1007/s10287-011-0131-1

  • DOI: https://doi.org/10.1007/s10287-011-0131-1
