On the Online Classification of Data Streams Using Weak Estimators
In this paper, we propose a novel online classifier for complex data streams generated by sources with non-stationary stochastic properties. Instead of using a single training model and counters to keep important data statistics, the proposed scheme provides a real-time, self-adjusting learning model. At each time instant, as a new labeled instance arrives, the learning model applies the multiplication-based update algorithm of the Stochastic Learning Weak Estimator (SLWE). In this way, the data statistics are updated every time a new element is inserted, without having to rebuild the model when the data distributions change. Finally, and most importantly, the model operates with the understanding that the correct classes of previously classified patterns become available only after some time instances have elapsed, which requires us to update the training set and the training model.
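To make the multiplication-based update concrete, the following is a minimal Python sketch. It assumes the multinomial SLWE form commonly given in the literature: every estimate is scaled by a learning parameter λ in (0, 1), and the freed mass 1 − λ is assigned to the symbol just observed. The abstract does not spell out the rule, so the function names `slwe_update` and `track`, the default `lam=0.9`, and the uniform initialization are illustrative assumptions rather than the authors' implementation.

```python
def slwe_update(probs, observed, lam=0.9):
    """Return new estimates after observing the symbol at index `observed`.

    Assumed rule: p_j <- lam * p_j for every j, then the observed symbol
    receives the freed mass (1 - lam), so the estimates still sum to 1.
    """
    if not 0.0 < lam < 1.0:
        raise ValueError("lam must lie strictly between 0 and 1")
    updated = [lam * p for p in probs]   # multiplicative decay of all estimates
    updated[observed] += 1.0 - lam       # reward the observed symbol
    return updated


def track(stream, num_symbols, lam=0.9):
    """Run the update over a stream of symbol indices, starting from uniform."""
    probs = [1.0 / num_symbols] * num_symbols  # hypothetical starting point
    for symbol in stream:
        probs = slwe_update(probs, symbol, lam)
    return probs


if __name__ == "__main__":
    # The source switches abruptly from symbol 0 to symbol 1 halfway through.
    estimates = track([0] * 50 + [1] * 50, num_symbols=2)
    print(estimates)
```

Because old observations are discounted geometrically, the estimates follow the abrupt switch in the stream without any counters being reset or the model being rebuilt. A smaller λ adapts faster but yields noisier estimates.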
The results obtained from a rigorous empirical analysis on multinomial distributions are remarkable. They demonstrate the applicability of our method on synthetic datasets and show the advantages of the introduced scheme.
Keywords: Weak estimators; Learning automata; Non-stationary environments; Classification in data streams