Abstract
Mining concept-drifting data streams is a challenging area of data mining research. Real-world data streams are not stable but change over time. Such changes, termed concept drifts, are categorized as gradual or abrupt according to the drifting time, i.e., the number of time steps taken for the new concept to completely replace the old one. Traditional online learning systems have not exploited this categorization to develop distinct approaches for handling the different types of drift. Handling concept drifts according to their type can improve the performance of the classification system, so the issue merits further exploration. Among the most popular and effective approaches for handling concept drift is ensemble learning, in which a set of models built over different time periods is maintained and their predictions are combined, usually weighted by each model's expertise on the current concept. If early instances of the new concept are stored and used for ensemble learning once a drift is detected, the overall accuracy after the drift may increase. Moreover, if the ensemble learns instances of the new concept with zero diversity during the drifting period, it may learn the new concept faster, thus speeding up recovery. This paper presents such an approach for effectively handling gradual concept drifts in data streams.
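The combination of techniques described above — weighted-vote online bagging, an instance window of early new-concept examples, and zero-diversity training during a drift — can be sketched roughly as follows. This is a minimal illustration, not the authors' actual algorithm: the base-learner interface (`partial_fit`/`predict`), the weight-decay factor, and all parameter names are assumptions for the sake of the example. "Zero diversity" is rendered here in the sense used in online-bagging work: every base model trains exactly once on each instance, instead of `k ~ Poisson(1)` times.

```python
import random
from collections import defaultdict, deque


def poisson1():
    """Draw k ~ Poisson(lambda=1) via Knuth's method, as used in online bagging."""
    threshold, k, p = 2.718281828 ** -1, 0, 1.0
    while p > threshold:
        k += 1
        p *= random.random()
    return k - 1


class MajorityClass:
    """Trivial illustrative base learner: predicts the most frequent label seen."""
    def __init__(self):
        self.counts = defaultdict(int)

    def partial_fit(self, x, y):
        self.counts[y] += 1

    def predict(self, x):
        return max(self.counts, key=self.counts.get) if self.counts else 0


class OnlineBaggingEnsemble:
    """Online bagging ensemble with a zero-diversity mode and an instance window."""
    def __init__(self, make_learner, n_models=10, window_size=200):
        self.models = [make_learner() for _ in range(n_models)]
        self.weights = [1.0] * n_models          # expertise on the current concept
        self.window = deque(maxlen=window_size)  # early instances of the new concept
        self.zero_diversity = False              # toggled on while a drift is active

    def learn_one(self, x, y):
        if self.zero_diversity:
            # Store early new-concept instances for later (re)training.
            self.window.append((x, y))
        for i, model in enumerate(self.models):
            # Zero diversity: every model sees the instance exactly once, so all
            # models converge on the new concept together; otherwise classic
            # online bagging with k ~ Poisson(1) presentations per model.
            k = 1 if self.zero_diversity else poisson1()
            for _ in range(k):
                model.partial_fit(x, y)
            # Multiplicative weight update: halve a model's weight on a mistake.
            self.weights[i] *= 1.0 if model.predict(x) == y else 0.5

    def predict_one(self, x):
        # Weighted majority vote across the ensemble.
        votes = defaultdict(float)
        for model, w in zip(self.models, self.weights):
            votes[model.predict(x)] += w
        return max(votes, key=votes.get)
```

In use, a drift detector (e.g., of the kind described by Gama et al.) would flip `zero_diversity` on when a warning is raised and off once the new concept is stable, at which point the stored window can seed fresh ensemble members.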
© 2012 Springer-Verlag Berlin Heidelberg
Cite this paper
Attar, V., Chaudhary, P., Rahagude, S., Chaudhari, G., Sinha, P. (2012). An Instance-Window Based Classification Algorithm for Handling Gradual Concept Drifts. In: Cao, L., Bazzan, A.L.C., Symeonidis, A.L., Gorodetsky, V.I., Weiss, G., Yu, P.S. (eds) Agents and Data Mining Interaction. ADMI 2011. Lecture Notes in Computer Science, vol 7103. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-27609-5_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-27608-8
Online ISBN: 978-3-642-27609-5
eBook Packages: Computer Science (R0)