Research on Query-by-Committee Method of Active Learning and Application
Active learning aims to reduce the number of training examples that must be labeled by automatically processing the unlabeled examples and then selecting the most informative ones, with respect to a given cost function, for a human to label. The major problem is to find the selection strategy that reaches high classification accuracy most quickly. The Query-by-Committee (QBC) method of active learning requires less computation than other active learning approaches, but its classification accuracy does not reach that of passive learning. In this paper, a new selection strategy for the QBC method is presented that combines Vote Entropy with Kullback-Leibler divergence. Experimental results show that the proposed algorithm achieves better classification accuracy than the previous QBC approach. It can reach the same accuracy as passive learning with few labeled training examples.
Keywords: Classification Accuracy, Committee Member, Unlabeled Data, Passive Learning, Active Learning Approach
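The two disagreement measures named in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm: the abstract does not give the formula that combines them, so `combined_score` and its `weight` parameter are hypothetical, as are all function names. Vote entropy is computed over the committee's hard votes, and the Kullback-Leibler term is the average divergence of each member's class distribution from the committee mean.

```python
import math
from collections import Counter

def vote_entropy(votes):
    """Entropy of the committee's hard votes (one predicted label per member)."""
    k = len(votes)
    counts = Counter(votes)
    return -sum((c / k) * math.log(c / k) for c in counts.values())

def kl_to_mean(member_dists):
    """Average KL divergence of each member's class distribution from the committee mean."""
    k = len(member_dists)
    n_classes = len(member_dists[0])
    mean = [sum(d[j] for d in member_dists) / k for j in range(n_classes)]
    total = 0.0
    for d in member_dists:
        # Terms with p == 0 contribute nothing; p > 0 implies the mean is > 0.
        total += sum(p * math.log(p / q) for p, q in zip(d, mean) if p > 0)
    return total / k

def combined_score(member_dists, weight=0.5):
    """Hypothetical weighted sum of the two measures (an assumption, not the paper's formula)."""
    # Each member votes for its most probable class.
    votes = [max(range(len(d)), key=d.__getitem__) for d in member_dists]
    return weight * vote_entropy(votes) + (1 - weight) * kl_to_mean(member_dists)

def select_query(pool, weight=0.5):
    """Return the index of the unlabeled example the committee disagrees on most."""
    return max(range(len(pool)), key=lambda i: combined_score(pool[i], weight))
```

Here `pool[i]` is assumed to hold one class-probability list per committee member for the i-th unlabeled example. An example on which the members split their votes and hold divergent distributions scores higher than one on which they agree, so it would be sent to the human labeler first.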