Application of Combined Classifiers to Data Stream Classification

Woźniak, Michał

doi:10.1007/978-3-642-40925-7_2

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8104)

Included in the following conference series:

IFIP International Conference on Computer Information Systems and Industrial Management

Abstract

Progress in computer science has led many institutions to collect huge amounts of data that human beings cannot analyze unaided. Simple methods of data analysis are no longer sufficient for the efficient management of an average enterprise, because smart decisions require the knowledge hidden in the data; multiple classifier systems, one way of extracting that knowledge, have recently been the focus of intense research. Unfortunately, a great disadvantage of traditional classification methods is that they "assume" that the statistical properties of the discovered concept (the model being predicted) remain unchanged. In real situations we can observe so-called concept drift, which may be caused by changes in the class probabilities and/or in the class-conditional probability distributions. The ability to take new training data into account is therefore an important feature of machine learning methods used in security applications or marketing departments. Unfortunately, the occurrence of this phenomenon (concept drift) dramatically decreases classification accuracy.
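The effect described in the abstract can be illustrated with a small simulation. The sketch below is not the method of the chapter: it assumes a hypothetical one-dimensional, two-class stream with Gaussian class-conditional distributions, a simple nearest-mean classifier, and an abrupt drift in which the class-conditional distributions swap. All function names and parameter values are illustrative assumptions.

```python
import random


def sample(n, mean0, mean1, prior1, rng):
    """Draw n labelled 1-D points; class 1 occurs with probability prior1."""
    data = []
    for _ in range(n):
        y = 1 if rng.random() < prior1 else 0
        x = rng.gauss(mean1 if y == 1 else mean0, 1.0)
        data.append((x, y))
    return data


def fit(data):
    """Nearest-mean classifier: store the mean feature value of each class."""
    xs0 = [x for x, y in data if y == 0]
    xs1 = [x for x, y in data if y == 1]
    return sum(xs0) / len(xs0), sum(xs1) / len(xs1)


def predict(model, x):
    """Assign x to the class whose stored mean is closer."""
    m0, m1 = model
    return 1 if abs(x - m1) < abs(x - m0) else 0


def accuracy(model, data):
    """Fraction of points in data that the model labels correctly."""
    return sum(predict(model, x) == y for x, y in data) / len(data)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Concept A, then concept B with the class-conditional distributions swapped
# (an assumed, deliberately extreme form of drift).
before = sample(2000, mean0=-2.0, mean1=2.0, prior1=0.5, rng=rng)
after = sample(2000, mean0=2.0, mean1=-2.0, prior1=0.5, rng=rng)

static_model = fit(before[:1000])      # trained once, never updated
retrained_model = fit(after[:1000])    # trained on data from the new concept

acc_before = accuracy(static_model, before[1000:])
acc_after_static = accuracy(static_model, after[1000:])
acc_after_retrained = accuracy(retrained_model, after[1000:])

print("static model, before drift:   ", acc_before)
print("static model, after drift:    ", acc_after_static)
print("retrained model, after drift: ", acc_after_retrained)
```

Under these assumptions, the model trained once performs well on the original concept and poorly after the drift, while a model trained on post-drift data recovers. This is why the ability to incorporate new training data matters for stream classification.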

Author information

Authors and Affiliations

Department of Systems and Computer Networks, Wroclaw University of Technology, Wyb. Wyspianskiego 27, 50-370, Wroclaw, Poland
Michał Woźniak

Editor information

Editors and Affiliations

AGH University of Science and Technology, Krakow, Poland
Khalid Saeed
West Bengal University of Technology, Kolkata, India
Rituparna Chaki
Università Ca’ Foscari Venezia, Venice, Italy
Agostino Cortesi
Polish Academy of Sciences, Warsaw, Poland
Sławomir Wierzchoń

About this paper

Cite this paper

Woźniak, M. (2013). Application of Combined Classifiers to Data Stream Classification. In: Saeed, K., Chaki, R., Cortesi, A., Wierzchoń, S. (eds) Computer Information Systems and Industrial Management. CISIM 2013. Lecture Notes in Computer Science, vol 8104. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40925-7_2

DOI: https://doi.org/10.1007/978-3-642-40925-7_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40924-0
Online ISBN: 978-3-642-40925-7
eBook Packages: Computer Science, Computer Science (R0)
