
Ensemble Methods for Time Series Forecasting

Claudio Moraga: A Passion for Multi-Valued Logic and Soft Computing

Part of the book series: Studies in Fuzziness and Soft Computing ((STUDFUZZ,volume 349))

Abstract

Improving time series forecasting accuracy is an active research area of significant importance in many practical domains. Ensemble methods have gained considerable attention from the machine learning and soft computing communities in recent years. There are several practical and theoretical reasons, mainly statistical, why an ensemble may be preferred, and ensembles are recognized as one of the most successful approaches to prediction tasks. Theoretical studies of ensembles have shown that a key reason for this performance is diversity among ensemble members, and several methods exist to generate such diversity. An extensive body of literature suggests that substantial improvements in accuracy can be achieved by combining forecasts from different models. The focus of this chapter is on ensembles for time series prediction. We describe the use of ensemble methods to compare different models for time series prediction, as well as extensions of the classical neural network ensemble methods for classification and regression that employ different model architectures. Design, implementation and application are the main topics of the chapter, and more specifically: the conditions under which ensemble-based systems may be more beneficial than a single machine; algorithms for generating the individual components of ensemble systems; and the various procedures through which these components can be combined. Several ensemble-based algorithms are analyzed: Bagging, AdaBoost and Negative Correlation, as well as combination rules and decision templates. Finally, future directions are discussed for time series forecasting, machine fusion and other areas in which ensembles of machines have shown great promise.
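To make the bagging idea concrete for forecasting, the sketch below is a minimal illustration, not the chapter's implementation: the lag embedding, the ridge-regularised AR(p) base learner, and names such as bagged_ar_forecast are assumptions chosen for brevity. Each ensemble member is fitted to a bootstrap resample of the lag-embedded training rows, and the one-step-ahead forecasts are averaged.

```python
import numpy as np

rng = np.random.default_rng(0)

def embed(series, p):
    """Turn a 1-D series into (lagged inputs, next value) pairs for an AR(p) model."""
    X = np.column_stack([series[i:len(series) - p + i] for i in range(p)])
    return X, series[p:]

def fit_ridge(X, y, lam=1e-3):
    """Closed-form ridge regression with an intercept column."""
    Xb = np.column_stack([X, np.ones(len(X))])
    A = Xb.T @ Xb + lam * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ y)

def bagged_ar_forecast(series, p=8, n_members=25, lam=1e-3):
    """One-step-ahead forecast from a bagged ensemble of linear AR(p) models."""
    s = np.asarray(series, dtype=float)
    X, y = embed(s, p)
    last = np.append(s[-p:], 1.0)              # most recent p values plus intercept term
    forecasts = []
    for _ in range(n_members):
        idx = rng.integers(0, len(y), len(y))  # i.i.d. row bootstrap; a block bootstrap
        w = fit_ridge(X[idx], y[idx], lam)     # is often preferred for dependent data
        forecasts.append(last @ w)
    return float(np.mean(forecasts)), float(np.std(forecasts))

# Usage on a synthetic noisy sine wave.
t = np.arange(300)
series = np.sin(0.2 * t) + 0.1 * rng.standard_normal(300)
point, spread = bagged_ar_forecast(series, p=8)
print(f"forecast = {point:.3f}  (ensemble spread = {spread:.3f})")
```

Swapping in a different base learner, or a block bootstrap better suited to dependent data, changes nothing about the final averaging step, which is where the variance reduction discussed in the chapter comes from.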


Notes

  1. If the hypotheses are considered fixed, the expectations are taken with respect to the distribution of the inputs \(\mathbf{x}\). If they are considered free, expectations are also taken with respect to the distribution of the sample(s) used to estimate them.
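For context (an illustrative identity, not reproduced from the chapter), these two readings correspond to where the expectation is placed in the ambiguity decomposition of a convex-combination ensemble \(\bar f(\mathbf{x}) = \sum_i w_i f_i(\mathbf{x})\), with \(w_i \ge 0\) and \(\sum_i w_i = 1\):

\[
\bigl(\bar f(\mathbf{x}) - y\bigr)^2
  = \sum_i w_i \bigl(f_i(\mathbf{x}) - y\bigr)^2
  - \sum_i w_i \bigl(f_i(\mathbf{x}) - \bar f(\mathbf{x})\bigr)^2 .
\]

With the hypotheses fixed, this identity is averaged only over the distribution of the inputs \(\mathbf{x}\) (and the target \(y\)); with the hypotheses free, the expectation additionally runs over the training samples used to estimate each \(f_i\).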


Acknowledgments

We thank Professor Claudio Moraga for his continued support in the design and consolidation of the doctoral program in Computer Engineering at the Universidad Técnica Federico Santa María.

This work was supported in part by Research Project DGIP-UTFSM (Chile) 116.24.2 and in part by Basal Project FB 0821.

Author information


Corresponding author

Correspondence to Héctor Allende.



Copyright information

© 2017 Springer International Publishing AG

About this chapter

Cite this chapter

Allende, H., Valle, C. (2017). Ensemble Methods for Time Series Forecasting. In: Seising, R., Allende-Cid, H. (eds) Claudio Moraga: A Passion for Multi-Valued Logic and Soft Computing. Studies in Fuzziness and Soft Computing, vol 349. Springer, Cham. https://doi.org/10.1007/978-3-319-48317-7_13

  • DOI: https://doi.org/10.1007/978-3-319-48317-7_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-48316-0

  • Online ISBN: 978-3-319-48317-7

  • eBook Packages: Engineering, Engineering (R0)
