On-Line Classification of Data Streams with Missing Values Based on Reinforcement Learning
In some applications, data arrive sequentially and are not available in batch form, which makes it difficult to use traditional classification systems. In addition, some attribute values may be missing because of real-world conditions. In this setting, a number of decisions must be made about each incomplete, unlabeled incoming object: how to estimate its missing attribute values, how to classify it, whether to include it in the training set, and when to ask an expert for its class label. Unfortunately, no single decision works well for all data sets. This data dependency motivates our formulation of the problem in terms of elements of reinforcement learning. To the best of our knowledge, the application of this learning paradigm to this problem is novel. The empirical results are encouraging: the proposed framework performs better and is more generally applicable than many strategies used in isolation, and it makes efficient use of human effort (requests to an expert for the class label) and computer memory (growth of the training set).
Keywords: Reinforcement learning, Active learning, Adaptive learning, Streaming data, Incomplete data, Imputation techniques, On-line classification
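The abstract does not specify the states, actions, or rewards of the formulation, so the following is only an illustrative sketch of how choosing among handling strategies for each incoming object could be cast as a reinforcement learning problem. Everything in it is an assumption made for illustration: the action names, the epsilon-greedy rule with sample-average value estimates (a standard textbook technique, not necessarily the one used by the authors), the simulated success probabilities, and the fixed cost charged for querying the expert.

```python
import random

# Hypothetical action set; the names are illustrative, not taken from the paper.
ACTIONS = (
    "impute_mean_and_classify",
    "impute_nearest_and_classify",
    "ask_expert_for_label",
)


class EpsilonGreedyAgent:
    """Keeps a sample-average value estimate per action (epsilon-greedy)."""

    def __init__(self, actions, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.value = {a: 0.0 for a in self.actions}
        self.count = {a: 0 for a in self.actions}

    def choose(self):
        # Explore with probability epsilon, otherwise pick the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.value[a])

    def update(self, action, reward):
        # Incremental form of the running mean of observed rewards.
        self.count[action] += 1
        self.value[action] += (reward - self.value[action]) / self.count[action]


def run_stream(n_objects=2000, seed=0):
    """Simulate a stream; rewards here are made up for the sketch."""
    env_rng = random.Random(seed + 1)
    # Assumed chance that each imputation strategy yields a correct prediction.
    p_correct = {
        "impute_mean_and_classify": 0.6,
        "impute_nearest_and_classify": 0.8,
    }
    query_cost = 0.5  # assumed penalty for using human effort
    agent = EpsilonGreedyAgent(ACTIONS, epsilon=0.1, seed=seed)
    for _ in range(n_objects):
        action = agent.choose()
        if action == "ask_expert_for_label":
            reward = 1.0 - query_cost  # label is correct, but effort is charged
        else:
            reward = 1.0 if env_rng.random() < p_correct[action] else 0.0
        agent.update(action, reward)
    return agent


if __name__ == "__main__":
    trained = run_stream()
    for name in ACTIONS:
        print(name, trained.count[name], round(trained.value[name], 3))
```

In this toy setting the reward trades classification correctness against expert queries, which mirrors the two resources the abstract mentions; a faithful implementation would need the state, action, and reward definitions from the full paper.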