Abstract
Ensemble-based active learning has been shown to reduce the number of training instances efficiently, and thus the cost of data acquisition. To determine the utility of a candidate training instance, the disagreement among the ensemble members about its class value is used. While the disagreement for binary classification is easily determined using margins, the adaptation to multi-class problems is not straightforward and has been little studied in the literature. In this paper, we consider four approaches to measuring ensemble disagreement, including margins, uncertainty sampling and entropy, and evaluate them empirically on various ensemble strategies for active learning. We show that margins outperform the other disagreement measures on three of four active learning strategies. Our experiments also show that some active learning strategies are more sensitive to the choice of disagreement measure than others.
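To make the notion of ensemble disagreement concrete, the following is a minimal sketch of three of the measures named above (margin, an uncertainty-style least-confidence score, and entropy), computed from the hard votes of a committee. It rests on assumptions that are not taken from the paper: each ensemble member casts one vote per instance, and the function names (`vote_distribution`, `margin`, `least_confidence`, `vote_entropy`, `select_query`) are hypothetical. The definitions used in the paper itself may differ.

```python
import math
from collections import Counter


def vote_distribution(votes, classes):
    """Fraction of committee votes received by each class (assumes hard votes)."""
    counts = Counter(votes)
    n = len(votes)
    return [counts[c] / n for c in classes]


def margin(dist):
    """Difference between the two largest vote shares; a small margin means high disagreement."""
    top = sorted(dist, reverse=True)
    return top[0] - top[1]


def least_confidence(dist):
    """One minus the largest vote share; a large value means high disagreement."""
    return 1.0 - max(dist)


def vote_entropy(dist):
    """Shannon entropy (natural log) of the vote shares; a large value means high disagreement."""
    return -sum(p * math.log(p) for p in dist if p > 0)


def select_query(pool_votes, classes):
    """Return the candidate id whose committee votes have the smallest margin.

    pool_votes is a hypothetical mapping from candidate id to the list of
    class labels predicted by the ensemble members.
    """
    return min(
        pool_votes,
        key=lambda cid: margin(vote_distribution(pool_votes[cid], classes)),
    )


if __name__ == "__main__":
    classes = ["a", "b", "c"]
    pool = {
        "x1": ["a", "a", "a", "a", "a"],  # unanimous committee
        "x2": ["a", "b", "c", "a", "b"],  # two classes tied for first place
        "x3": ["a", "a", "a", "b", "c"],  # clear majority
    }
    for cid, votes in pool.items():
        dist = vote_distribution(votes, classes)
        print(cid, margin(dist), least_confidence(dist), vote_entropy(dist))
    print("query:", select_query(pool, classes))
```

In the binary case the margin is determined by a single vote share, which is why it is easy to compute there. With three or more classes, the margin looks only at the two leading classes while entropy depends on the whole vote distribution, so the measures can rank candidates differently.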
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Körner, C., Wrobel, S. (2006). Multi-class Ensemble-Based Active Learning. In: Fürnkranz, J., Scheffer, T., Spiliopoulou, M. (eds) Machine Learning: ECML 2006. ECML 2006. Lecture Notes in Computer Science, vol 4212. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11871842_68
DOI: https://doi.org/10.1007/11871842_68
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-45375-8
Online ISBN: 978-3-540-46056-5
eBook Packages: Computer Science, Computer Science (R0)