Modelling Stock Markets by Multi-agent Reinforcement Learning

Lussange, Johann; Lazarevich, Ivan; Bourgeois-Gironde, Sacha; Palminteri, Stefano; Gutkin, Boris

doi:10.1007/s10614-020-10038-w

Published: 17 August 2020

Computational Economics, Volume 57, pages 113–147 (2021)

Johann Lussange^1 (ORCID: orcid.org/0000-0002-0840-8049), Ivan Lazarevich^1,2, Sacha Bourgeois-Gironde^3,4, Stefano Palminteri^5 & Boris Gutkin^1,6

Abstract

Quantitative finance has had a long tradition of a bottom-up approach to complex systems inference via multi-agent systems (MAS). These statistical tools are based on modelling agents trading via a centralised order book, in order to emulate complex and diverse market phenomena. These past financial models have all relied on so-called zero-intelligence agents, so that the crucial issues of agent information and learning, central to price formation and hence to all market activity, could not be properly assessed. In order to address this, we designed a next-generation MAS stock market simulator, in which each agent learns to trade autonomously via reinforcement learning. We calibrate the model to real market data from the London Stock Exchange over the years 2007 to 2018, and show that it can faithfully reproduce key market microstructure metrics, such as various price autocorrelation scalars over multiple time intervals. Agent learning thus enables accurate emulation of the market microstructure as an emergent property of the MAS.

Notes

Computations were performed on a Mac Pro with a 3.5 GHz 6-Core Intel Xeon E5 processor and 16 GB of 1866 MHz DDR memory.
We used the time series feature extraction functions implemented in the tsfresh Python package (Christ et al. 2018).
We used the implementation from the scikit-learn Python package (Pedregosa et al. 2011), with 200 estimators, maximal tree depth equal to 5 and default values for other hyperparameters.

References

Abbeel, P., Coates, A., & Ng, A. Y. (2010). Autonomous helicopter aerobatics through apprenticeship learning. The International Journal of Robotics Research, 29, 1608–1639.
Aloud, M. (2014). Agent-based simulation in finance: Design and choices. In: Proceedings in finance and risk perspectives ‘14.
Andreas, J., Klein, D., & Levine, S. (2017). Modular multitask reinforcement learning with policy sketches.
Bak, P., Norrelykke, S., & Shubik, M. (1999). Dynamics of money. Physical Review E, 60, 2528–2532.
Bak, P., Norrelykke, S., & Shubik, M. (2001). Money and goldstone modes. Quantitative Finance, 1, 186–190.
Barde, S. (2015). A practical, universal, information criterion over nth order Markov processes (p. 04). School of Economics Discussion Papers, University of Kent.
Bavard, S., Lebreton, M., Khamassi, M., Coricelli, G., & Palminteri, S. (2018). Reference-point centering and range-adaptation enhance human reinforcement learning at the cost of irrational preferences. Nature Communications, 9(1), 4503. https://doi.org/10.1038/s41467-018-06781-2.
Benzaquen, M., & Bouchaud, J. (2018). A fractional reaction–diffusion description of supply and demand. The European Physical Journal B, 91, 23. https://doi.org/10.1140/epjb/e2017-80246-9D.
Bera, A. K., Ivliev, S., & Lillo, F. (2015). Financial econometrics and empirical market microstructure. Berlin: Springer.
Bhatnagara, S., & Panigrahi, J. R. (2006). Actor-critic algorithms for hierarchical decision processes. Automatica, 42, 637–644.
Biondo, A. E. (2018a). Learning to forecast, risk aversion, and microstructural aspects of financial stability. Economics, 12(2018–20), 1–21.
Biondo, A. E. (2018b). Order book microstructure and policies for financial stability. Studies in Economics and Finance, 35(1), 196–218.
Biondo, A. E. (2018c). Order book modeling and financial stability. Journal of Economic Interaction and Coordination, 14(3), 469–489.
Boero, R., Morini, M., Sonnessa, M., & Terna, P. (2015). Agent-based models of the economy, from theories to applications. New York: Palgrave Macmillan.
Bouchaud, J., Cont, R., & Potters, M. (1997). Scale invariance and beyond. In Proceeding CNRS Workshop on Scale Invariance, Les Houches. Springer.
Bouchaud, J. P. (2018). Handbook of computational economics (Vol. 4). Amsterdam: Elsevier.
Chiarella, C., Iori, G., & Perell, J. (2007). The impact of heterogeneous trading rules on the limit order book and order flows. arXiv:0711.3581.
Christ, M., Braun, N., Neuffer, J., & Kempa-Liehr, A. W. (2018). Time series feature extraction on basis of scalable hypothesis tests, tsfresh-a python package. Neurocomputing, 307, 72–77.
Cont, R. (2001). Empirical properties of asset returns: Stylized facts and statistical issues. Quantitative Finance, 1, 223–236.
Cont, R. (2005). Chapter 7-Agent-based models for market impact and volatility. In A. Kirman & G. Teyssiere (Eds.), Long memory in economics. Berlin: Springer.
Cont, R., & Bouchaud, J. P. (2000). Herd behavior and aggregate fluctuations in financial markets. Macroeconomic Dynamics, 4, 170–196.
Cristelli, M. (2014). Complexity in financial markets. Berlin: Springer.
Current dividend impacts of FTSE-250 stocks. Retrieved May 19, 2020 from https://www.dividenddata.co.uk.
Delbaen, F., & Schachermayer, W. (2004). What is a free lunch? Notices of the AMS, 51(5), 526–528.
Deng, Y., Bao, F., Kong, Y., Ren, Z., & Dai, Q. (2017). Deep direct reinforcement learning for financial signal representation and trading. IEEE Transactions on Neural Networks and Learning Systems, 28(3), 653–64.
de Vries, C., & Leuven, K. (1994). Stylized facts of nominal exchange rate returns. Working papers from Purdue University, Krannert School of Management—Center for International Business Education and Research (CIBER).
Ding, Z., Engle, R., & Granger, C. (1993). A long memory property of stock market returns and a new model. Journal of Empirical Finance, 1, 83–106.
Dodonova, A., & Khoroshilov, Y. (2018). Private information in futures markets: An experimental study. Managerial and Decision Economics, 39, 65–70.
Donangelo, R., Hansen, A., Sneppen, K., & Souza, S. R. (2000). Modelling an imperfect market. Physica A, 283, 469–478.
Donangelo, R., & Sneppen, K. (2000). Self-organization of value and demand. Physica A, 276, 572–580.
Duan, Y., Schulman, J., Chen, X., Bartlett, P. L., Sutskever, I., & Abbeel, P. (2016). Rl-squared: Fast reinforcement learning via slow reinforcement learning. arXiv:1611.02779.
Duncan, K., Doll, B. B., Daw, N. D., & Shohamy, D. (2018). More than the sum of its parts: A role for the hippocampus in configural reinforcement learning. Neuron, 98, 645–657.
Eickhoff, S. B., Yeo, B. T. T., & Genon, S. (2018). Imaging-based parcellations of the human brain. Nature Reviews Neuroscience, 19, 672–686.
Eisler, Z., & Kertesz, J. (2006). Size matters: Some stylized facts of the stock market revisited. European Physical Journal B, 51, 145–154.
Engle, R. F. (1982). Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation. Econometrica, 50(4), 987–1007.
Erev, I., & Roth, A. E. (2014). Maximization, learning and economic behaviour. PNAS, 111, 10818–10825.
Fama, E. (1970). Efficient capital markets: A review of theory and empirical work. Journal of Finance, 25, 383–417.
Franke, R., & Westerhoff, F. (2011). Structural stochastic volatility in asset pricing dynamics: Estimation and model contest. BERG working paper series on government and growth (Vol. 78).
Fulcher, B. D., & Jones, N. S. (2014). Highly comparative feature-based time-series classification. IEEE Transactions Knowledge and Data Engineering, 26, 3026–3037.
Ganesh, S., Vadori, N., Xu, M., Zheng, H., Reddy, P., & Veloso, M. (2019). Reinforcement learning for market making in a multi-agent dealer market. arXiv:1911.05892.
Gode, D., & Sunder, S. (1993). Allocative efficiency of markets with zero-intelligence traders: Market as a partial substitute for individual rationality. Journal of Political Economy, 101(1), 119–137.
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., et al. (2014). Generative adversarial nets. In Advances in neural information processing systems (pp. 2672–2680).
Green, E., & Heffernan, D. M. (2019). An agent-based model to explain the emergence of stylised facts in log returns. arXiv:1901.05053.
Greene, W. H. (2017). Econometric analysis (8th ed.). London: Pearson.
Grondman, I., Busoniu, L., Lopes, G., & Babuska, R. (2012). A survey of actor-critic reinforcement learning: Standard and natural policy gradients. IEEE Transactions on Systems Man and Cybernetics, 42, 1291–1307.
Gualdi, S., Tarzia, M., Zamponi, F., & Bouchaud, J. P. (2015). Tipping points in macroeconomic agent-based models. Journal of Economic Dynamics and Control, 50, 29–61.
Heinrich, J. (2017). Deep RL from self-play in imperfect-information games. Ph.D. thesis, University College London.
Hu, Y. J., & Lin, S. J. (2019). Deep reinforcement learning for optimizing portfolio management. In 2019 Amity international conference on artificial intelligence.
Huang, W., Lehalle, C. A., & Rosenbaum, M. (2015). Simulating and analyzing order book data: The queue-reactive model. Journal of the American Statistical Association, 110, 509.
Huang, Z. F., & Solomon, S. (2000). Power, Lévy, exponential and Gaussian-like regimes in autocatalytic financial systems. European Physical Journal B, 20, 601–607.
IG fees of Contracts For Difference. Retrieved May 19, 2020 from https://www.ig.com.
Katt, S., Oliehoek, F. A., & Amato, C. (2017). Learning in Pomdps with Monte Carlo tree search. In Proceedings of the 34th international conference on machine learning.
Keramati, M., & Gutkin, B. (2011). A reinforcement learning theory for homeostatic regulation. NIPS.
Keramati, M., & Gutkin, B. (2014). Homeostatic reinforcement learning for integrating reward collection and physiological stability. Elife, 3, e04811.
Kim, G., & Markowitz, H. M. (1989). Investment rules, margin and market volatility. Journal of Portfolio Management, 16, 45–52.
Konovalov, A., & Krajbich, I. (2016). Gaze data reveal distinct choice processes underlying model-based and model-free reinforcement learning. Nature Communications, 7, 12438.
Lefebvre, G., Lebreton, M., Meyniel, F., Bourgeois-Gironde, S., & Palminteri, S. (2017). Behavioural and neural characterization of optimistic reinforcement learning. Nature Human Behaviour, 1(4), 1–19.
Levy, M., Levy, H., & Solomon, S. (1994). A microscopic model of the stock market: Cycles, booms, and crashes. Economics Letters, 45, 103–111.
Levy, M., Levy, H., & Solomon, S. (1995). Microscopic simulation of the stock market: The effect of microscopic diversity. Journal de Physique, I(5), 1087–1107.
Levy, M., Levy, H., & Solomon, S. (1997). New evidence for the power-law distribution of wealth. Physica A, 242, 90–94.
Levy, M., Levy, H., & Solomon, S. (2000). Microscopic simulation of financial markets: From investor behavior to market phenomena. New York: Academic Press.
Levy, M., Persky, N., & Solomon, S. (1996). The complex dynamics of a simple stock market model. International Journal of High Speed Computing, 8, 93–113.
Levy, M., & Solomon, S. (1996a). Dynamical explanation for the emergence of power law in a stock market model. International Journal of Modern Physics C, 7, 65–72.
Levy, M., & Solomon, S. (1996b). Power laws are logarithmic Boltzmann laws. International Journal of Modern Physics C, 7, 595–601.
Liang, H., Yang, L., Tu, H. C. W., & Xu, M. (2017). Human-in-the-loop reinforcement learning. In 2017 Chinese automation congress.
Lipski, J., & Kutner, R. (2013). Agent-based stock market model with endogenous agents’ impact. arXiv:1310.0762.
Lobato, I. N., & Savin, N. E. (1998). Real and spurious long-memory properties of stock-market data. Journal of Business and Economics Statistics, 16, 261–283.
Lux, T., & Marchesi, M. (1999). Scaling and criticality in a stochastic multi-agent model of a financial market. Nature, 397, 498–500.
Lux, T., & Marchesi, M. (2000). Volatility clustering in financial markets: A microsimulation of interacting agents. Journal of Theoretical and Applied Finance, 3, 67–70.
Mandelbrot, B. (1963). The variation of certain speculative prices. The Journal of Business, 39, 394–419.
Mandelbrot, B., Fisher, A., & Calvet, L. (1997). A multifractal model of asset returns. Cowles Foundation for Research and Economics.
Martino, A. D., & Marsili, M. (2006). Statistical mechanics of socio-economic systems with heterogeneous agents. Journal of Physics A, 39, 465–540.
McInnes, L., Healy, J., & Melville, J. (2018). Umap: Uniform manifold approximation and projection for dimension reduction. arXiv preprint arXiv:1802.03426.
Mnih, V., Badia, A. P., Mirza, M., Graves, A., Lillicrap, T. P., Harley, T., et al. (2016). Asynchronous methods for deep reinforcement learning. arXiv:1602.01783.
Momennejad, I., Russek, E., Cheong, J., Botvinick, M., Daw, N. D., & Gershman, S. J. (2017). The successor representation in human reinforcement learning. Nature Human Behavior, 1, 680–692.
Murray, M. P. (1994). A drunk and her dog: An illustration of cointegration and error correction. The American Statistician, 48(1), 37–39.
Mota Navarro, R., & Larralde, H. (2016). A detailed heterogeneous agent model for a single asset financial market with trading via an order book. arXiv:1601.00229.
Naik, P. K., Gupta, R., & Padhi, P. (2018). The relationship between stock market volatility and trading volume: Evidence from South Africa. The Journal of Developing Areas, 52(1), 99–114.
Neuneier, R. (1997). Enhancing q-learning for optimal asset allocation. In Proceeding of the 10th international conference on neural information processing systems.
Ng, A. Y., Harada, D., & Russell, S. (1999). Theory and application to reward shaping.
Pagan, A. (1996). The econometrics of financial markets. Journal of Empirical Finance, 3, 15–102.
Palminteri, S., Khamassi, M., Joffily, M., & Coricelli, G. (2015). Contextual modulation of value signals in reward and punishment learning. Nature Communications, 6, 1–14.
Palminteri, S., Lefebvre, G., Kilford, E., & Blakemore, S. (2017). Confirmation bias in human reinforcement learning: Evidence from counterfactual feedback processing. PLoS Computational Biology, 13(8), e1005684.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., et al. (2011). Scikit-learn, machine learning in python. Journal of Machine Learning Research, 12, 2825–2830.
Pinto, L., Davidson, J., Sukthankar, R., & Gupta, A. (2017). Robust adversarial reinforcement learning. arXiv:1703.02702.
Plerou, V., Gopikrishnan, P., Amaral, L. A., Meyer, M., & Stanley, H. E. (1999). Scaling of the distribution of fluctuations of financial market indices. Physical Review E, 60(6), 6519.
Potters, M., & Bouchaud, J. P. (2001). More stylized facts of financial markets: Leverage effect and downside correlations. Physica A, 299, 60–70.
Preis, T., Golke, S., Paul, W., & Schneider, J. J. (2006). Multi-agent-based order book model of financial markets. Europhysics Letters, 75(3), 510–516.
Ross, S., Pineau, J., Chaib-draa, B., & Kreitmann, P. (2011). A Bayesian approach for learning and planning in partially observable Markov decision processes. Journal of Machine Learning Research, 12, 1729–1770.
Sbordone, A. M., Tambalotti, A., Rao, K., & Walsh, K. J. (2010). Policy analysis using DSGE models: An introduction. Economic Policy Review, 16(2), 23–43.
Schreiber, T., & Schmitz, A. (1997). Discrimination power of measures for nonlinearity in a time series. Physical Review E, 55(5), 5443.
Silver, D., Huang, A., Maddison, C. J., Guez, A., Sifre, L., van den Driessche, G., et al. (2016). Mastering the game of go with deep neural networks and tree search. Nature, 529, 484–489.
Silver, D., Hubert, T., Schrittwieser, J., Antonoglou, I., Lai, M., Guez, A., et al. (2018a). A general reinforcement learning algorithm that masters chess, shogi and go through self-play. Science, 362(6419), 1140–1144.
Silver, D., Lever, G., Heess, N., Degris, T., Wierstra, D., & Riedmiller, M. (2014). Deterministic policy gradient algorithms. In Proceedings of the 31st international conference on machine learning (Vol. 32).
Silver, D., Schrittwieser, J., Simonyan, K., Antonoglou, I., Huang, A., Guez, A., et al. (2018b). Mastering the game of go without human knowledge. Nature, 550, 354–359.
Sirignano, J., & Cont, R. (2019). Universal features of price formation in financial markets: Perspectives from deep learning. Quantitative Finance, 19(9), 1449–1459.
Solomon, S., Weisbuch, G., de Arcangelis, L., Jan, N., & Stauffer, D. (2000). Social percolation models. Physica A, 277(1), 239–247.
Spooner, T., Fearnley, J., Savani, R., & Koukorinis, A. (2018). Market making via reinforcement learning. In Proceedings of the 17th AAMAS.
Sutton, R., & Barto, A. (1998). Reinforcement learning: An introduction. Cambridge, MA: MIT Press.
Sutton, R. S., McAllester, D., Singh, S., & Mansour, Y. (2000). Policy gradient methods for reinforcement learning with function approximation. Advances in Neural Information Processing Systems, 12, 1057–1063.
Szepesvari, C. (2010). Algorithms for reinforcement learning. San Rafael: Morgan and Claypool Publishers.
Tessler, C., Givony, S., Zahavy, T., Mankowitz, D. J., & Mannor, S. (2016). A deep hierarchical approach to lifelong learning in minecraft. arXiv:1604.07255.
UK one-year gilt reference prices. Retrieved May 19, 2020 from https://www.dmo.gov.uk.
Vandewalle, N., & Ausloos, M. (1997). Coherent and random sequences in financial fluctuations. Physica A, 246, 454–459.
Vernimmen, P., Quiry, P., Dallocchio, M., Fur, Y. L., & Salvi, A. (2014). Corporate finance: Theory and practice (4th ed.). New York: Wiley.
Wang, J. X., Kurth-Nelson, Z., Kumaran, D., Tirumala, D., Soyer, H., Leibo, J. Z., et al. (2018). Prefrontal cortex as a meta-reinforcement learning system. Nature Neuroscience, 21, 860–868.
Watkins, C. J. C. H., & Dayan, P. (1992). Q-learning. Machine Learning, 8(3–4), 279–292.
Way, E., & Wellman, M. P. (2013). Latency arbitrage, market fragmentation, and efficiency: A two-market model. In Proceedings of the fourteenth ACM conference on electronic commerce (pp. 855–872).
Wellman, M. P., & Way, E. (2017). Strategic agent-based modeling of financial markets. The Russell Sage Foundation Journal of the Social Sciences, 3(1), 104–119.
Weron, R. (2001). Levy-stable distributions revisited: Tail index \(> 2\) does not exclude the levy-stable regime. International Journal of Modern Physics C, 12, 209–223.
Wiering, M., & van Otterlo, M. (2012). Reinforcement learning: State-of-the-art. Berlin: Springer.

Acknowledgements

We gratefully acknowledge that this work was supported by RFFI Grant No. 16-51-150007 and CNRS PRC No. 151199, and received support from FrontCog ANR-17-EURE-0017. I. L.’s work was supported by the Russian Science Foundation, Grant No. 18-11-00294. S. B.-G. received funding within the framework of the HSE University Basic Research Program funded by the Russian Academic Excellence Project No. 5-100.

Author information

Authors and Affiliations

Group for Neural Theory, Laboratoire des Neurosciences Cognitives et Computationnelles, INSERM U960, Département des Études Cognitives, École Normale Supérieure, 29 rue d’Ulm, 75005, Paris, France
Johann Lussange, Ivan Lazarevich & Boris Gutkin
Lobachevsky State University of Nizhny Novgorod, 23 Gagarina av., Nizhny Novgorod, Russia, 603950
Ivan Lazarevich
Département des Études Cognitives, Institut Jean-Nicod, UMR 8129, École Normale Supérieure, 29 rue d’Ulm, 75005, Paris, France
Sacha Bourgeois-Gironde
Laboratoire d’Économie Mathématique et de Microéconomie Appliquée, EA 4442, Université Paris II Panthéon-Assas, 4 rue Blaise Desgoffe, 75006, Paris, France
Sacha Bourgeois-Gironde
Laboratoire des Neurosciences Cognitives et Computationnelles, INSERM U960, Département des Études Cognitives, École Normale Supérieure, 29 rue d’Ulm, 75005, Paris, France
Stefano Palminteri
Department of Psychology, Center for Cognition and Decision Making, NU University Higher School of Economics, 8 Myasnitskaya st., Moscow, Russia, 101000
Boris Gutkin

Corresponding author

Correspondence to Johann Lussange.

Appendix

1.1 Classification Accuracy

Figure 16 shows the accuracy on both the testing (left) and training (right) sets as a function of time-series sample size, for samples containing more timestamps than in Fig. 13. Accuracy saturates on both sets: for the value distribution feature set, testing accuracy does not exceed \(70\%\) and training accuracy does not exceed \(75\%\), while for the full time-series feature set, testing accuracy saturates above \(90\%\) and training accuracy above \(95\%\). Accuracy on the training set is generally higher than on the testing set and saturates less markedly. Training accuracy remains moderate because the trees in the random forest were regularized (maximal depth equal to 5), which we found necessary for good generalization to the testing set.

1.2 Top Statistical Features

We provide a general grouping and examples of the top statistical features used in the dimensionality reduction performed in Sect. 4.2. The exact ranking of particular features found in our experiments, together with their importance metric value \(\varTheta\), is as follows. The importance metric \(\varTheta\) is summed over 30 random forest models trained on different random splits of the training/testing sets.

1. Partial autocorrelation value of lag 1, \(\varTheta =1.2240\).
2. First coefficient of the fitted AR(10) process, \(\varTheta =1.0777\).
3. Kurtosis of the FFT coefficient distribution, \(\varTheta =1.0214\).
4. Skewness of the FFT coefficient distribution, \(\varTheta =1.0001\).
5. Autocorrelation value of lag 1, \(\varTheta =0.9861\).
6. 60th percentile of the value distribution, \(\varTheta =0.9044\).
7. Kurtosis of the FFT coefficient distribution, \(\varTheta =0.7347\).
8. Mean of consecutive changes in the series for values in between the 0th and the 80th percentiles of the value distribution, \(\varTheta =0.6349\).
9. Variance of consecutive changes in the series for values in between the 0th and the 20th percentiles of the value distribution, \(\varTheta =0.5948\).
10. Approximate entropy value (length of compared run of data is 2, filtering level is 0.1), \(\varTheta =0.5878\).
11. 70th percentile of the value distribution, \(\varTheta =0.5589\).
12. Variance of absolute consecutive changes in the series for values in between the 0th and the 20th percentiles of the value distribution, \(\varTheta =0.5584\).
13. Mean of consecutive changes in the series for values in between the 40th and the 100th percentiles of the value distribution, \(\varTheta =0.4755\).
14. Ratio of values that are more than 1 standard deviation away from the mean value, \(\varTheta =0.3282\).
15. Median of the value distribution, \(\varTheta =0.2957\).
16. Skewness of the value distribution, \(\varTheta =0.2894\).
17. Measure of time series nonlinearity from Schreiber and Schmitz (1997) of lag 1, \(\varTheta =0.2867\).
18. Second coefficient of the fitted AR(10) process, \(\varTheta =0.2726\).
19. Partial autocorrelation value of lag 1, \(\varTheta =0.2575\).
20. Time reversal symmetry statistic from Fulcher and Jones (2014) of lag 1, \(\varTheta =0.2418\).
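
To make two of the listed features concrete, the following is a minimal sketch of how the lag-1 autocorrelation (feature 5) and the ratio of values more than 1 standard deviation away from the mean (feature 14) could be computed for a single time series. It uses textbook definitions in plain Python and is an illustration only: the function names are hypothetical, and the normalisation used by the tsfresh implementation may differ in detail.

```python
import math
from typing import Sequence


def _mean(x: Sequence[float]) -> float:
    return sum(x) / len(x)


def autocorrelation(x: Sequence[float], lag: int) -> float:
    """Sample autocorrelation at `lag`, normalised by the population variance.

    Assumed definition: the mean of (x[t] - mu) * (x[t + lag] - mu) over the
    n - lag available pairs, divided by the variance of the whole series.
    """
    n = len(x)
    mu = _mean(x)
    var = sum((v - mu) ** 2 for v in x) / n
    if var == 0.0 or lag >= n:
        return float("nan")  # undefined for constant or too-short series
    cov = sum((x[t] - mu) * (x[t + lag] - mu) for t in range(n - lag)) / (n - lag)
    return cov / var


def ratio_beyond_sigma(x: Sequence[float], r: float = 1.0) -> float:
    """Fraction of values further than r standard deviations from the mean."""
    mu = _mean(x)
    sd = math.sqrt(sum((v - mu) ** 2 for v in x) / len(x))
    return sum(1 for v in x if abs(v - mu) > r * sd) / len(x)


if __name__ == "__main__":
    series = [1.0, 2.0, 3.0, 4.0, 5.0]  # toy data, not from the paper
    print(autocorrelation(series, lag=1))
    print(ratio_beyond_sigma(series, r=1.0))
```

Each function maps a whole series to one scalar, which is the form of input a feature-based classifier expects.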

The top-10 features referenced in Sect. 4.2 are the first 10 features taken from the list above. The PCA and UMAP mappings of the top-10 features onto a two-dimensional space demonstrated some separability between the two classes (real vs. simulated data), as measured by training a linear classifier on these two-dimensional data representations (see Sect. 4.2 for details), as well as by calculating the Kolmogorov–Smirnov (KS) statistic for each embedding component. The KS statistic value between the two classes is 0.24 and 0.11 for PCA (for the first and second component, respectively) and 0.30 and 0.25 for UMAP.
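
As an illustration of the KS statistic quoted above, the sketch below computes the two-sample Kolmogorov–Smirnov statistic, that is, the largest absolute gap between the empirical distribution functions of two samples. It assumes the two samples are the values of one embedding component for the real and the simulated class; the function name and the toy data are hypothetical and do not reproduce the values reported here.

```python
from bisect import bisect_right
from typing import Sequence


def ks_statistic(a: Sequence[float], b: Sequence[float]) -> float:
    """Two-sample KS statistic: sup over v of |F_a(v) - F_b(v)|.

    Both empirical CDFs are right-continuous step functions, so the supremum
    is attained at one of the pooled sample points.
    """
    a_sorted, b_sorted = sorted(a), sorted(b)
    n, m = len(a_sorted), len(b_sorted)
    d = 0.0
    for v in a_sorted + b_sorted:
        f_a = bisect_right(a_sorted, v) / n  # share of a that is <= v
        f_b = bisect_right(b_sorted, v) / m  # share of b that is <= v
        d = max(d, abs(f_a - f_b))
    return d


if __name__ == "__main__":
    real = [1.0, 2.0, 3.0, 4.0]       # toy embedding component, class "real"
    simulated = [3.0, 4.0, 5.0, 6.0]  # toy embedding component, class "simulated"
    print(ks_statistic(real, simulated))
```

A value of 0 means the two empirical distributions coincide and a value of 1 means the samples do not overlap at all, so values between 0.11 and 0.30 indicate partial separability along a single component.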

About this article

Cite this article

Lussange, J., Lazarevich, I., Bourgeois-Gironde, S. et al. Modelling Stock Markets by Multi-agent Reinforcement Learning. Comput Econ 57, 113–147 (2021). https://doi.org/10.1007/s10614-020-10038-w

Accepted: 31 July 2020
Published: 17 August 2020
Issue Date: January 2021
DOI: https://doi.org/10.1007/s10614-020-10038-w
