Forecasting financial series using clustering methods and support vector regression

  • Lucas F. S. Vilela (Email author)
  • Rafael C. Leme
  • Carlos A. M. Pinheiro
  • Otávio A. S. Carpinteiro


This paper proposes a two-stage model for forecasting financial time series. The first stage uses clustering methods to segment the time series into its various contexts. The second stage uses support vector regressions (SVRs), one for each context, to forecast future values of the series. The series used in the experiments consists of values of an equity fund of a Brazilian bank. The proposed model is compared to a hierarchical model (HM) presented in the literature. On this series, the HM had produced prediction results superior to those of both a support vector machine (SVM) model and a multilayer perceptron (MLP) model. The experiments show that the proposed model is superior to the HM, reducing the HM's forecasting error by 32%. This means that the proposed model is also superior to the SVM and MLP models. An analysis of the construction and use of clusters, combined with a study of the series volatility, shows that data from only one type of volatility (low or high) give the model enough knowledge to forecast future values with good accuracy. Another analysis, on the quality of the clusters formed by the model, shows that each cluster carries different information about the series. Furthermore, there is always a group of SVRs capable of making adequate forecasts and, for the most part, the SVR used in forecasting belongs to this group.
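The two-stage idea can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes k-means as the clustering method, fixed-width lag windows as the inputs, and scikit-learn's `KMeans` and `SVR` with arbitrary hyperparameters. The names `make_windows` and `ClusteredSVR` are hypothetical and are introduced here only for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVR


def make_windows(series, width):
    """Turn a 1-D series into (lag window, next value) pairs."""
    series = np.asarray(series, dtype=float)
    X = np.array([series[i:i + width] for i in range(len(series) - width)])
    y = series[width:]
    return X, y


class ClusteredSVR:
    """Stage 1: cluster the lag windows. Stage 2: one SVR per cluster."""

    def __init__(self, n_clusters=3, random_state=0):
        # k-means is an assumption; the abstract only says "clustering methods".
        self.kmeans = KMeans(n_clusters=n_clusters, n_init=10,
                             random_state=random_state)
        self.models = {}

    def fit(self, X, y):
        labels = self.kmeans.fit_predict(X)
        for c in np.unique(labels):
            mask = labels == c
            # Hyperparameters are illustrative, not taken from the paper.
            svr = SVR(kernel="rbf", C=10.0, epsilon=0.01)
            self.models[int(c)] = svr.fit(X[mask], y[mask])
        return self

    def predict(self, X):
        # Route each window to the SVR of its nearest cluster centre.
        labels = self.kmeans.predict(X)
        out = np.empty(len(X))
        for c in np.unique(labels):
            mask = labels == c
            out[mask] = self.models[int(c)].predict(X[mask])
        return out
```

A typical use would build windows from the series with `make_windows(series, width=5)`, fit `ClusteredSVR` on the earlier windows, and call `predict` on the later ones. Each new window is then forecast by the SVR trained on the context it most resembles.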


Financial time-series forecasting, Clustering, Support vector machine, Artificial intelligence



The authors would like to thank the anonymous reviewers for their valuable comments.



Copyright information

© Springer Nature B.V. 2018

Authors and Affiliations

  • Lucas F. S. Vilela (1), Email author
  • Rafael C. Leme (1)
  • Carlos A. M. Pinheiro (1)
  • Otávio A. S. Carpinteiro (1)

  1. Research Group on Systems and Computer Engineering, Federal University of Itajubá, Itajubá, Brazil
