A methodology for applying k-nearest neighbor to time series forecasting

Abstract

This paper develops a methodology for applying k-nearest neighbor (k-NN) regression in a time series forecasting context. The goal is an automatic tool, i.e., one that works without human intervention; the methodology should also be effective and efficient, so that it can accurately forecast a large number of time series. To decide which modeling and preprocessing techniques to incorporate into the methodology, several candidates are analyzed and assessed using the NN3 competition data set. One interesting feature of the proposed methodology is that it resolves the selection of important modeling parameters, such as k or the input variables, by combining several models with different parameters. In spite of the simplicity of k-NN regression, our methodology seems to be quite effective.
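As an illustration of the idea summarized above, the following is a minimal sketch of k-NN regression used for one-step-ahead forecasting, with the choice of k sidestepped by averaging models that use different values of k. It is not the authors' implementation: the function names `knn_forecast` and `combined_forecast`, the Euclidean distance, the unweighted mean of the neighbors' targets, the default values of k, and the simple averaging of the individual forecasts are all assumptions made for this example, and the preprocessing steps assessed in the paper are omitted.

```python
import numpy as np


def knn_forecast(series, lags, k):
    """One-step-ahead k-NN regression forecast (illustrative sketch).

    Each training example pairs `lags` consecutive values (the features)
    with the value that immediately follows them (the target).
    """
    y = np.asarray(series, dtype=float)
    n = len(y)
    if n <= lags:
        raise ValueError("series must be longer than the number of lags")
    # Row i of `features` holds y[i : i + lags]; its target is y[i + lags].
    features = np.array([y[i:i + lags] for i in range(n - lags)])
    targets = y[lags:]
    # The most recent window is the query; its successor is what we forecast.
    query = y[-lags:]
    # Assumption: Euclidean distance between the query and each training window.
    distances = np.linalg.norm(features - query, axis=1)
    k = min(k, len(targets))
    nearest = np.argsort(distances, kind="stable")[:k]
    # Assumption: the forecast is the unweighted mean of the neighbors' targets.
    return float(targets[nearest].mean())


def combined_forecast(series, lags, ks=(3, 5, 7)):
    """Average the forecasts of several k-NN models that differ only in k."""
    return float(np.mean([knn_forecast(series, lags, k) for k in ks]))
```

For example, `combined_forecast(series, lags=12)` would average three forecasts for a monthly series using the previous twelve values as input variables. The same averaging could be extended to models that differ in their input variables (lags), which the abstract also mentions as a parameter resolved by combination.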

Notes

  1. http://www.neural-forecasting-competition.com/NN3/index.htm

Author information

Corresponding author

Correspondence to Francisco Martínez.

Additional information

This paper has been partially supported by the project TIN2015-68854-R (FEDER Funds) of the Spanish Ministry of Economy and Competitiveness.

About this article

Cite this article

Martínez, F., Frías, M.P., Pérez, M.D. et al. A methodology for applying k-nearest neighbor to time series forecasting. Artif Intell Rev 52, 2019–2037 (2019). https://doi.org/10.1007/s10462-017-9593-z

Keywords

  • Nearest neighbors
  • Time series forecasting
  • Combined forecast
  • Feature selection