Abstract
This article systematically compares three methods for predicting the direction (+/−) of financial time series, using more than ten years of DAX 30 and S&P 500 data at daily and hourly resolution. We choose the methods from representative machine learning families, in particular supervised versus unsupervised. The methods are support vector machines (SVM), artificial neural networks, and k-nearest neighbor (k-NN). We explore the influence of different training window lengths and numbers of out-of-sample predictions. Furthermore, we investigate whether kernel principal component analysis (KPCA) improves prediction by reducing data dimensionality. Additionally, we test whether combining machine learning methods by bootstrap aggregating outperforms single methods. The key insights from the experiments are as follows. All machine learning methods are in principle useful for predicting the direction (+/−) of financial time series. To our surprise, however, increasing the window size helps only to a certain extent for hourly data, beyond which it reduces performance. The number of out-of-sample predictions had a small impact, while KPCA made a strong difference for SVM and k-NN. Finally, backtesting selected machines with a trading system on daily data revealed that the lazy learner k-NN outperforms the supervised approaches.
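To make the roles of the training window length and the number of out-of-sample predictions concrete, the following is a minimal sketch of a walk-forward evaluation with a k-NN direction classifier. It is an illustration under stated assumptions, not the authors' experimental setup: the lagged-return features, the Euclidean distance, the way the window slides, the synthetic Gaussian returns, and all names (`make_samples`, `knn_predict`, `walk_forward`, `window`, `horizon`) are hypothetical choices made for this example.

```python
import math
import random


def make_samples(returns, n_lags):
    """Pair each vector of n_lags past returns with the sign of the next return."""
    samples = []
    for t in range(n_lags, len(returns)):
        features = returns[t - n_lags:t]
        label = 1 if returns[t] > 0 else -1  # +1 = up, -1 = down or flat
        samples.append((features, label))
    return samples


def knn_predict(train, query, k):
    """Majority vote among the k training samples closest to query (Euclidean)."""
    nearest = sorted(train, key=lambda s: math.dist(s[0], query))[:k]
    vote = sum(label for _, label in nearest)
    return 1 if vote >= 0 else -1  # ties (only possible for even k) go to +1


def walk_forward(samples, window, horizon, k):
    """Fit on `window` samples, predict the next `horizon`, then slide forward.

    `window` plays the role of the training window length and `horizon`
    the number of out-of-sample predictions per window (assumed scheme).
    """
    hits, total = 0, 0
    start = 0
    while start + window + horizon <= len(samples):
        train = samples[start:start + window]
        test = samples[start + window:start + window + horizon]
        for features, label in test:
            hits += knn_predict(train, features, k) == label
            total += 1
        start += horizon  # move the window past the block just predicted
    return hits, total


# Synthetic returns stand in for real index data (assumption for illustration).
random.seed(0)
returns = [random.gauss(0, 0.01) for _ in range(600)]
samples = make_samples(returns, n_lags=5)
hits, total = walk_forward(samples, window=200, horizon=50, k=5)
print(f"hit rate: {hits / total:.3f} over {total} predictions")
```

Because the synthetic returns are independent noise, the hit rate here is expected to hover around chance; the point of the sketch is only the mechanics of training on a fixed-length window and scoring a fixed number of out-of-sample direction predictions before sliding on.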
Notes
In the positive example, we used the k-NN 1600/800 DAX 30 daily from our experiments. In the negative example, we used the SVM 1600/800 S&P 500 daily.
Ethics declarations
Conflict of Interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Appendix
See Fig. 5.
Cite this article
Ersan, D., Nishioka, C. & Scherp, A. Comparison of machine learning methods for financial time series forecasting at the examples of over 10 years of daily and hourly data of DAX 30 and S&P 500. J Comput Soc Sc 3, 103–133 (2020). https://doi.org/10.1007/s42001-019-00057-5
DOI: https://doi.org/10.1007/s42001-019-00057-5