Abstract
Active learning methods rely on static strategies for sampling unlabeled points. These strategies range from uncertainty sampling and density estimation to multi-factor methods with learn-once-use-always model parameters. This paper proposes a dynamic approach, called DUAL, in which the strategy selection parameters are adaptively updated based on the estimated future residual error reduction after each actively sampled point. The objective of DUAL is to outperform static strategies over a large operating range: from very few to very many labeled points. Empirical results over six datasets demonstrate that DUAL outperforms several state-of-the-art methods on most datasets.
Keywords
- Unlabeled Data
- Uncertainty Sampling
- Dual Algorithm
- Active Learning Method
- Unlabeled Instance
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Donmez, P., Carbonell, J.G., Bennett, P.N. (2007). Dual Strategy Active Learning. In: Kok, J.N., Koronacki, J., Mantaras, R.L.d., Matwin, S., Mladenič, D., Skowron, A. (eds) Machine Learning: ECML 2007. ECML 2007. Lecture Notes in Computer Science, vol 4701. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74958-5_14
DOI: https://doi.org/10.1007/978-3-540-74958-5_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74957-8
Online ISBN: 978-3-540-74958-5
eBook Packages: Computer Science, Computer Science (R0)