Abstract
Improving the accuracy of time series forecasting is an active research area of significant practical importance in many domains. Ensemble methods have gained considerable attention from the machine learning and soft computing communities in recent years. There are several practical and theoretical reasons, mainly statistical, why an ensemble may be preferred, and ensembles are recognized as one of the most successful approaches to prediction tasks. Theoretical studies of ensembles have shown that a key reason for this performance is diversity among ensemble members, and several methods exist to generate such diversity. An extensive literature suggests that substantial improvements in accuracy can be achieved by combining forecasts from different models. This chapter focuses on ensemble methods for time series prediction. We describe the use of ensembles to combine different models for time series prediction, as well as extensions of the classical neural network ensemble methods for classification and regression that employ different model architectures. Design, implementation and application are the main topics of the chapter; more specifically: the conditions under which ensemble-based systems may be more beneficial than a single machine, algorithms for generating the individual components of an ensemble, and the procedures through which those components can be combined. Several ensemble algorithms are analyzed (Bagging, AdaBoost and Negative Correlation Learning), as well as combination rules and decision templates. Finally, we outline future directions in time series forecasting, machine fusion and other areas in which ensembles of machines have shown great promise.
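To make the bagging idea concrete in the forecasting setting, the sketch below (ours, not from the chapter; all function names and parameters are illustrative) embeds a univariate series into (lag-window, target) pairs, bootstraps those pairs to train each member, and combines the member forecasts by averaging:

```python
import numpy as np

def embed(series, n_lags):
    """Turn a univariate series into (lag-window, next-value) pairs."""
    X = np.array([series[t:t + n_lags] for t in range(len(series) - n_lags)])
    y = series[n_lags:]
    return X, y

def fit_linear(X, y):
    """Least-squares AR-style base learner (weights include an intercept)."""
    Xb = np.column_stack([np.ones(len(X)), X])
    w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return w

def predict_linear(w, X):
    return np.column_stack([np.ones(len(X)), X]) @ w

def bagged_forecast(series, n_lags=4, n_members=25, seed=0):
    """Bagging: bootstrap (window, target) pairs, average member forecasts."""
    rng = np.random.default_rng(seed)
    X, y = embed(series, n_lags)
    members = []
    for _ in range(n_members):
        idx = rng.integers(0, len(X), size=len(X))   # bootstrap resample
        members.append(fit_linear(X[idx], y[idx]))
    x_new = series[-n_lags:][None, :]                # most recent window
    preds = np.array([predict_linear(w, x_new)[0] for w in members])
    return preds.mean(), preds

# toy usage on a noisy AR(1) series
rng = np.random.default_rng(1)
s = np.zeros(200)
for t in range(1, 200):
    s[t] = 0.8 * s[t - 1] + rng.normal(scale=0.5)
point, member_preds = bagged_forecast(s)
print(f"bagged one-step forecast: {point:.3f} (member spread {member_preds.std():.3f})")
```

For short, serially dependent series, resampling contiguous blocks of pairs (a moving-block bootstrap) better preserves temporal dependence than the i.i.d. resampling used here, and the mean combination rule can be swapped for the median for robustness.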
Notes
1. If the hypotheses are considered fixed, the expectations are taken with respect to the distribution of the inputs \(\mathbf{x}\). If they are considered free, expectations are also taken with respect to the distribution of the sample(s) used to estimate them.
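The fixed-versus-free distinction above is what makes the classical ambiguity decomposition for convex ensemble combinations, \(E = \bar{E} - \bar{A}\), so convenient: for fixed hypotheses \(f_i\) with weights \(w_i \ge 0\), \(\sum_i w_i = 1\) and ensemble \(\bar{f} = \sum_i w_i f_i\), the identity \((\bar{f}-y)^2 = \sum_i w_i (f_i - y)^2 - \sum_i w_i (f_i - \bar{f})^2\) holds pointwise in \(\mathbf{x}\), so it survives whichever expectation is taken. The short check below (our sketch, with synthetic predictions and weights) verifies the identity numerically:

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 5, 1000                         # ensemble members, evaluation inputs
f = rng.normal(size=(M, N))            # member predictions f_i(x)
y = rng.normal(size=N)                 # targets
w = rng.dirichlet(np.ones(M))          # convex combination weights

fbar = w @ f                           # ensemble prediction
ens_err = (fbar - y) ** 2              # ensemble squared error
avg_err = w @ (f - y) ** 2             # weighted average member error
ambiguity = w @ (f - fbar) ** 2        # weighted spread around the ensemble

# the identity holds pointwise, so it survives any expectation over x
assert np.allclose(ens_err, avg_err - ambiguity)
print(f"E = {ens_err.mean():.4f}  equals  Ebar - Abar = {(avg_err - ambiguity).mean():.4f}")
```

Because the ambiguity term \(\bar{A}\) is non-negative, the ensemble error never exceeds the weighted average member error; this is the formal sense in which diversity pays off.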
Acknowledgments
We thank Professor Claudio Moraga for his continued support in the design and consolidation of the doctoral program in computer engineering at the Universidad Técnica Federico Santa María.
This work was supported in part by Research Project DGIP-UTFSM (Chile) 116.24.2 and in part by Basal Project FB 0821.
Copyright information
© 2017 Springer International Publishing AG
Cite this chapter
Allende, H., Valle, C. (2017). Ensemble Methods for Time Series Forecasting. In: Seising, R., Allende-Cid, H. (eds) Claudio Moraga: A Passion for Multi-Valued Logic and Soft Computing. Studies in Fuzziness and Soft Computing, vol 349. Springer, Cham. https://doi.org/10.1007/978-3-319-48317-7_13
DOI: https://doi.org/10.1007/978-3-319-48317-7_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-48316-0
Online ISBN: 978-3-319-48317-7
eBook Packages: Engineering