Abstract
Advances in computer science have led many institutions to collect huge amounts of data that human beings cannot analyze unaided. Simple methods of data analysis are no longer sufficient for the efficient management of an average enterprise, because smart decisions require the knowledge hidden in the data. Multiple classifier systems, one way of extracting that knowledge, have recently been the focus of intense research. Unfortunately, a major disadvantage of traditional classification methods is that they "assume" that the statistical properties of the discovered concept (the one the model predicts) remain unchanged. In real situations we can observe so-called concept drift, which may be caused by changes in the class probabilities and/or in the class-conditional probability distributions. The ability to take new training data into account is therefore an important feature of machine learning methods used in security applications or marketing departments. Unfortunately, the occurrence of concept drift can dramatically decrease classification accuracy.
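The effect described in the abstract can be illustrated with a small simulation. The sketch below is not the method of the paper: it assumes a hypothetical one-dimensional stream, a toy threshold classifier, and an abrupt drift in which the class-conditional distributions swap. All function names (`make_batch`, `fit`, `run_demo`) are invented for this illustration.

```python
import random


def make_batch(n, flipped, rng):
    """Draw n points x ~ U[0, 1); label is 1 if x > 0.5, inverted when flipped."""
    batch = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        batch.append((x, 1 - y if flipped else y))
    return batch


def fit(batch):
    """Toy learner: threshold at the midpoint of the two class means.

    Returns (threshold, one_is_above), where one_is_above tells on which
    side of the threshold class 1 lies.
    """
    xs0 = [x for x, y in batch if y == 0]
    xs1 = [x for x, y in batch if y == 1]
    mean0 = sum(xs0) / len(xs0)
    mean1 = sum(xs1) / len(xs1)
    return (mean0 + mean1) / 2.0, mean1 > mean0


def predict(model, x):
    threshold, one_is_above = model
    return int((x > threshold) == one_is_above)


def accuracy(model, batch):
    hits = sum(predict(model, x) == y for x, y in batch)
    return hits / len(batch)


def run_demo(seed=0, n=500):
    """Compare a static model with one retrained after an abrupt drift."""
    rng = random.Random(seed)
    # Train once on the original concept and never update.
    static_model = fit(make_batch(n, False, rng))
    test_old = make_batch(n, False, rng)
    # Abrupt drift (assumed): the class-conditional distributions swap.
    recent = make_batch(n, True, rng)
    test_new = make_batch(n, True, rng)
    # Retrain using only data observed after the drift.
    updated_model = fit(recent)
    return (
        accuracy(static_model, test_old),
        accuracy(static_model, test_new),
        accuracy(updated_model, test_new),
    )


if __name__ == "__main__":
    before, after, updated = run_demo()
    print("static model, before drift:", before)
    print("static model, after drift: ", after)
    print("retrained model, after drift:", updated)
```

In this toy setting the static model is accurate before the drift and nearly always wrong after it, while the model retrained on post-drift data recovers. Real drifts are usually less extreme, and the retraining policy shown here is only the simplest possible one.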
© 2013 IFIP International Federation for Information Processing
About this paper
Cite this paper
Woźniak, M. (2013). Application of Combined Classifiers to Data Stream Classification. In: Saeed, K., Chaki, R., Cortesi, A., Wierzchoń, S. (eds) Computer Information Systems and Industrial Management. CISIM 2013. Lecture Notes in Computer Science, vol 8104. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40925-7_2
DOI: https://doi.org/10.1007/978-3-642-40925-7_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40924-0
Online ISBN: 978-3-642-40925-7
eBook Packages: Computer Science, Computer Science (R0)