
Droplet Ensemble Learning on Drifting Data Streams

  • Conference paper

Advances in Intelligent Data Analysis XVI (IDA 2017)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10584)

Abstract

Ensemble learning methods for evolving data streams are powerful because they combine the predictions of a set of classifiers to outperform the best single classifier in the ensemble. In this paper we introduce the Droplet Ensemble Algorithm (DEA), a new method for learning on data streams subject to concept drift which combines ensemble and instance-based learning. Unlike state-of-the-art ensemble methods, which select base learners according to their performance on recent observations, DEA dynamically selects the subset of base learners best suited to the region of the feature space where the latest observation was received. Experiments on 25 datasets (most of them commonly used as benchmarks in the literature) reproducing different types of drift show that this new method achieves excellent results in accuracy and ranking against SAM KNN [1], all of its base learners, and a majority-vote algorithm using the same base learners.
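The region-based selection idea in the abstract can be sketched as follows. This is a hypothetical illustration, not the authors' implementation: it keeps a sliding window of recent observations together with a per-learner correctness flag, and weights each base learner by its accuracy on the k stored observations nearest to the incoming one. The class and method names (`RegionSelector`, `update`, `weights`) are mine, and the distance metric and weighting scheme are assumptions.

```python
import numpy as np

class RegionSelector:
    """Sketch of per-region base-learner weighting (illustrative only)."""

    def __init__(self, k=5, window=200):
        self.k, self.window = k, window
        self.X, self.hits = [], []  # recent inputs and per-learner correctness flags

    def update(self, x, correct_flags):
        # Store the observation and, for each base learner, whether it was correct (1/0).
        self.X.append(np.asarray(x, dtype=float))
        self.hits.append(np.asarray(correct_flags, dtype=float))
        if len(self.X) > self.window:  # drop the oldest record past the window size
            self.X.pop(0)
            self.hits.pop(0)

    def weights(self, x):
        # Weight each learner by its accuracy on the k nearest stored observations.
        if not self.X:
            return None  # no history yet; caller could fall back to uniform weights
        d = np.linalg.norm(np.stack(self.X) - np.asarray(x, dtype=float), axis=1)
        idx = np.argsort(d)[: self.k]
        return np.stack(self.hits)[idx].mean(axis=0)
```

For example, if learner 0 tends to be correct near the origin and learner 1 near (10, 10), a query close to the origin yields weights favoring learner 0.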


Notes

  1. This is simply done by computing the average \(\mu^{i}\) and the standard deviation \(\sigma^{i}\) of each feature on the initialization set, and by transforming the \(i^{th}\) feature of \(x_{t}\) into \(\frac{x_{t}^{i}-\mu^{i}}{\sigma^{i}}\).

  2. http://moa.cms.waikato.ac.nz/.

  3. https://github.com/vlosing/driftDatasets.

  4. https://mab.to/o5iNvZdhH.
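The standardization described in note 1 can be sketched in a few lines of NumPy. The function names are mine, not the paper's; the math is exactly the note's per-feature z-score using statistics estimated on the initialization set.

```python
import numpy as np

def fit_standardizer(init_set):
    # Per-feature mean mu^i and standard deviation sigma^i,
    # estimated on the initialization set (rows = observations).
    return init_set.mean(axis=0), init_set.std(axis=0)

def standardize(x_t, mu, sigma):
    # Transform the i-th feature of x_t into (x_t[i] - mu[i]) / sigma[i].
    return (np.asarray(x_t, dtype=float) - mu) / sigma
```

Note that this uses the population standard deviation (NumPy's default, `ddof=0`); the paper does not specify which estimator is used.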

References

  1. Losing, V., Hammer, B., Wersing, H.: KNN classifier with self adjusting memory for heterogeneous concept drift. In: ICDM (2016)

  2. Tavallaee, M., Bagheri, E., Lu, W., Ghorbani, A.A.: A detailed analysis of the KDD CUP 99 data set. In: Proceedings of the Second IEEE International Conference on Computational Intelligence in Security and Defense Applications, pp. 53–58 (2009)

  3. Katakis, I., Tsoumakas, G., Vlahavas, I.: Tracking recurring contexts using ensemble classifiers: an application to email filtering. Knowl. Inf. Syst. 22(3), 371–391 (2010)

  4. Brzezinski, D., Stefanowski, J.: Reacting to different types of concept drift: the accuracy updated ensemble algorithm. IEEE Trans. Neural Netw. Learn. Syst. 25(1), 81–94 (2014)

  5. Elwell, R., Polikar, R.: Incremental learning of concept drift in nonstationary environments. IEEE Trans. Neural Netw. 22(10), 1517–1531 (2011)

  6. Kolter, J.Z., Maloof, M.A.: Dynamic weighted majority: a new ensemble method for tracking concept drift. In: Third IEEE International Conference on Data Mining, ICDM 2003, pp. 123–130 (2003)

  7. Bifet, A., Holmes, G., Pfahringer, B.: Leveraging bagging for evolving data streams. In: Balcázar, J.L., Bonchi, F., Gionis, A., Sebag, M. (eds.) ECML PKDD 2010. LNCS, vol. 6321, pp. 135–150. Springer, Heidelberg (2010). doi:10.1007/978-3-642-15880-3_15

  8. Katakis, I., Tsoumakas, G., Vlahavas, I.: An ensemble of classifiers for coping with recurring contexts in data streams. In: 18th European Conference on Artificial Intelligence, Patras, Greece. IOS Press (2008)

  9. Bifet, A., Holmes, G., Pfahringer, B., Kirkby, R., Gavaldà, R.: New ensemble methods for evolving data streams. In: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2009 (2009)

  10. Oza, N., Russell, S.: Online bagging and boosting. In: Artificial Intelligence and Statistics 2001, pp. 105–112. Morgan Kaufmann (2001)

  11. Jaber, G., Cornuéjols, A., Tarroux, P.: A new on-line learning method for coping with recurring concepts: the ADACC system. In: Lee, M., Hirose, A., Hou, Z.-G., Kil, R.M. (eds.) ICONIP 2013. LNCS, vol. 8227, pp. 595–604. Springer, Heidelberg (2013). doi:10.1007/978-3-642-42042-9_74

  12. Bifet, A., Gavaldà, R.: Adaptive learning from evolving data streams. In: Adams, N.M., Robardet, C., Siebes, A., Boulicaut, J.-F. (eds.) IDA 2009. LNCS, vol. 5772, pp. 249–260. Springer, Heidelberg (2009). doi:10.1007/978-3-642-03915-7_22

  13. Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J.: Classification and Regression Trees. Wadsworth (1984)

  14. Agrawal, R., Imielinski, T., Swami, A.: Database mining: a performance perspective. IEEE Trans. Knowl. Data Eng. 5(6), 914–925 (1993)

  15. Domingos, P., Hulten, G.: Mining high-speed data streams. In: Knowledge Discovery and Data Mining, pp. 71–80 (2000)



Corresponding author

Correspondence to Pierre-Xavier Loeffel.


Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Loeffel, PX., Bifet, A., Marsala, C., Detyniecki, M. (2017). Droplet Ensemble Learning on Drifting Data Streams. In: Adams, N., Tucker, A., Weston, D. (eds) Advances in Intelligent Data Analysis XVI. IDA 2017. Lecture Notes in Computer Science, vol 10584. Springer, Cham. https://doi.org/10.1007/978-3-319-68765-0_18


  • DOI: https://doi.org/10.1007/978-3-319-68765-0_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-68764-3

  • Online ISBN: 978-3-319-68765-0

  • eBook Packages: Computer Science (R0)
