Comparison of machine learning methods for financial time series forecasting at the examples of over 10 years of daily and hourly data of DAX 30 and S&P 500

  • Research Article
Journal of Computational Social Science

Abstract

This article systematically compares three methods for predicting the direction (+/−) of financial time series, using more than ten years of DAX 30 and S&P 500 data at daily and hourly time frames. We choose the methods from representative machine learning families, particularly supervised versus unsupervised. The methods are support vector machines (SVM), artificial neural networks, and k-nearest neighbor (k-NN). We explore the influence of different training window lengths and numbers of out-of-sample predictions. Furthermore, we investigate whether kernel principal component analysis (KPCA) improves prediction by reducing data dimensionality. Additionally, we verify whether combining machine learning methods by bootstrap aggregating outperforms single methods. The key insights from the experiment are as follows. All machine learning methods are in principle useful for predicting the direction (+/−) of financial time series. To our surprise, however, increasing the window size helps only up to a point for hourly data, after which it reduces performance. The number of out-of-sample predictions had a small impact, while KPCA made a strong difference for SVM and k-NN. Finally, backtesting selected machines with a trading system on daily data revealed that the lazy learner k-NN outperforms the supervised approaches.
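The interplay between training window length and number of out-of-sample predictions can be illustrated with a minimal walk-forward sketch. It is not the authors' implementation: the function names, the use of lagged directions as features, and the treatment of unchanged prices as a down move are all assumptions made for illustration, and only the Python standard library is used.

```python
import math

def directions(prices):
    """Map a price series to +1/-1 labels (assumption: no change counts as -1)."""
    return [1 if b > a else -1 for a, b in zip(prices, prices[1:])]

def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k training vectors closest to the query."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    vote = sum(train_y[i] for i in order[:k])
    return 1 if vote >= 0 else -1

def walk_forward(labels, lags, window, horizon, k):
    """Fit on `window` samples, predict the next `horizon`, then slide forward.

    Hypothetical feature choice: the previous `lags` directions.
    Returns the fraction of correctly predicted directions.
    """
    xs = [labels[i - lags:i] for i in range(lags, len(labels))]
    ys = labels[lags:]
    hits = total = 0
    start = 0
    while start + window + horizon <= len(xs):
        train_x = xs[start:start + window]
        train_y = ys[start:start + window]
        for j in range(start + window, start + window + horizon):
            hits += knn_predict(train_x, train_y, xs[j], k) == ys[j]
            total += 1
        start += horizon  # move the training window past the predicted block
    return hits / total if total else float("nan")

# Toy series that alternates up and down, so the pattern is fully learnable.
toy_prices = [100 + (i % 2) for i in range(200)]
accuracy = walk_forward(directions(toy_prices), lags=2, window=50, horizon=10, k=3)
```

Here `window` plays the role of the training window length and `horizon` the number of out-of-sample predictions made before the window slides; a pair such as 1600/800 in the notes would presumably correspond to these two settings, though that reading is an assumption.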

Notes

  1. In the positive example, we used the k-NN 1600/800 DAX 30 daily from our experiments. In the negative example, we used the SVM 1600/800 S&P 500 daily.

Author information

Corresponding author

Correspondence to Chifumi Nishioka.

Ethics declarations

Conflict of Interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

See Fig. 5.

Fig. 5 Performance benchmark charts for trading strategy

About this article

Cite this article

Ersan, D., Nishioka, C. & Scherp, A. Comparison of machine learning methods for financial time series forecasting at the examples of over 10 years of daily and hourly data of DAX 30 and S&P 500. J Comput Soc Sc 3, 103–133 (2020). https://doi.org/10.1007/s42001-019-00057-5

  • DOI: https://doi.org/10.1007/s42001-019-00057-5
