Abstract
This paper discusses learning ensemble classifiers from concept-drifting data streams. It starts with a general overview of such ensembles and then examines in detail the differences between block-based and online ensembles. We hypothesize that it is still possible to develop new ensembles that combine the most beneficial properties of both types of classifiers. Two such ensembles are described: the Accuracy Updated Ensemble, designed to process data blocks, and its incremental version, the Online Accuracy Updated Ensemble, which learns from single examples.
Acknowledgment
The research presented in this paper was supported by the Polish National Science Center under grant no. DEC-2013/11/B/ST6/00963. The close co-operation with Dariusz Brzezinski on developing the new AUE and OAUE ensembles, and on their experimental evaluation, is also acknowledged.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Stefanowski, J. (2016). Adaptive Ensembles for Evolving Data Streams – Combining Block-Based and Online Solutions. In: Ceci, M., Loglisci, C., Manco, G., Masciari, E., Ras, Z. (eds) New Frontiers in Mining Complex Patterns. NFMCP 2015. Lecture Notes in Computer Science, vol 9607. Springer, Cham. https://doi.org/10.1007/978-3-319-39315-5_1
DOI: https://doi.org/10.1007/978-3-319-39315-5_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-39314-8
Online ISBN: 978-3-319-39315-5
eBook Packages: Computer Science, Computer Science (R0)