Abstract
Active learning is an important method for addressing data scarcity in machine learning, and most active learning research is pool-based. However, pool-based active learning is sensitive to pool size, which slows the improvement of classifier performance. A novel pool-based active learning method with query construction is proposed. In each iteration, the training process first chooses a representative instance from the predefined pool, then employs a hill-climbing algorithm to construct the instance to be labeled, one that best represents the original unlabeled set. As a result, each queried instance is more representative than any instance in the pool. Compared with the original pool-based method and a state-of-the-art active learning method that constructs queries directly, the new method reduces the classifier's prediction error rate faster and improves the performance of the active learning classifier.
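The two steps named in the abstract (pick a representative pool instance, then hill-climb from it to a constructed query) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the representativeness score (negative mean Euclidean distance to the pool), the axis-aligned moves with a fixed step size, and the function names `representativeness`, `select_from_pool`, `hill_climb`, and `construct_query`. The classifier and the labeling oracle are omitted.

```python
import math

def representativeness(x, pool):
    # ASSUMED score (not from the paper): negative mean Euclidean
    # distance from x to every unlabeled instance in the pool.
    return -sum(math.dist(x, p) for p in pool) / len(pool)

def select_from_pool(pool):
    # Pool-based step: choose the existing instance with the best score.
    return max(pool, key=lambda p: representativeness(p, pool))

def hill_climb(start, pool, step=0.1, max_iters=200):
    # Query-construction step: greedy local search starting at `start`.
    # Each pass tries moving one feature by +/- step and keeps a move
    # only if it strictly improves the score.
    current = list(start)
    best = representativeness(current, pool)
    for _ in range(max_iters):
        improved = False
        for i in range(len(current)):
            for delta in (step, -step):
                candidate = current.copy()
                candidate[i] += delta
                score = representativeness(candidate, pool)
                if score > best:
                    current, best, improved = candidate, score, True
        if not improved:
            break  # local optimum reached for this step size
    return current

def construct_query(pool, step=0.1):
    # One iteration: select a seed from the pool, then refine it.
    seed = select_from_pool(pool)
    return hill_climb(seed, pool, step)

# Hypothetical two-feature pool used only for demonstration.
pool = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0], [0.2, 0.1]]
seed = select_from_pool(pool)
query = construct_query(pool)
```

Because the search accepts only improving moves and starts from the best pool instance, the constructed query scores at least as well as every instance in the pool under the assumed measure. In a full loop the query would then be sent to an oracle for labeling and the classifier retrained.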
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhang, S., Yin, J., Guo, W. (2011). Pool-Based Active Learning with Query Construction. In: Wang, Y., Li, T. (eds) Foundations of Intelligent Systems. Advances in Intelligent and Soft Computing, vol 122. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25664-6_2
DOI: https://doi.org/10.1007/978-3-642-25664-6_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25663-9
Online ISBN: 978-3-642-25664-6
eBook Packages: Engineering, Engineering (R0)