Combining Online Classification Approaches for Changing Environments
Any change in the classification problem in the course of online classification is termed a changing environment. Examples of changing environments include a change in the underlying data distribution, a change in the class definitions, or the addition or removal of a feature. The two general strategies for handling changing environments are (i) constant update of the classifier and (ii) re-training of the classifier after change detection. The former strategy is suited to gradual changes, while the latter is suited to abrupt changes. If the type of change is not known in advance, a combination of the two strategies may be advantageous. We propose a classifier ensemble that uses Winnow to combine the two strategies. For the constant-update strategy we used the nearest neighbour classifier with a fixed-size window and two methods with a learning rate: the online perceptron and an online version of the linear discriminant classifier (LDC). For the detect-and-retrain strategy we used the nearest neighbour classifier and the online LDC. Experiments were carried out on 28 data sets under 3 different scenarios: no change, gradual change and abrupt change. The results indicate that the combination works better than either strategy on its own.
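To illustrate the combining idea, the following is a minimal sketch of Winnow used as a weighted-majority combiner over a pool of base classifiers (the "experts"): each expert's weight is multiplicatively promoted or demoted when the ensemble makes a mistake. This is a generic Winnow sketch, not the authors' exact scheme; the function name, the promotion factor `alpha`, and the 0/1 prediction encoding are assumptions for illustration.

```python
import numpy as np

def winnow_combine(expert_preds, y_true, alpha=2.0):
    """Winnow as an ensemble combiner (illustrative sketch).

    expert_preds : (T, n_experts) array of {0,1} predictions per time step
    y_true       : (T,) array of {0,1} true labels
    Returns the (T,) array of ensemble predictions.
    """
    T, n = expert_preds.shape
    w = np.ones(n)                        # one positive weight per expert
    out = np.empty(T, dtype=int)
    for t in range(T):
        votes = expert_preds[t]
        # Weighted-majority vote: predict 1 if the weighted votes for
        # class 1 reach half of the total weight.
        out[t] = int(w @ votes >= w.sum() / 2)
        if out[t] != y_true[t]:
            # Multiplicative Winnow update, applied only on a mistake:
            wrong = votes != y_true[t]
            w[wrong] /= alpha             # demote experts that voted wrongly
            w[~wrong] *= alpha            # promote experts that voted correctly
    return out
```

Because the update is multiplicative and mistake-driven, the weight of an expert that keeps failing (e.g. a classifier trained before an abrupt change) decays geometrically, so the ensemble shifts towards whichever strategy currently fits the environment.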
Keywords: Learning Rate · Nearest Neighbour · Average Rank · Concept Drift · Sequential Probability Ratio Test