Abstract
Collecting and annotating exemplary cases is a costly but critical task required in the early stages of any classification process. Reducing labeling cost without degrading accuracy calls for a compromise, which may be achieved with active learning. Common active learning approaches focus on accuracy and assume that a pre-labeled set of exemplary cases covering all the classes to be learned is available. This assumption does not necessarily hold. In this paper we study how rapidly a new active learning approach, d-Confidence, covers the case space compared with the traditional active learning confidence criterion when this representativeness assumption is not met. Experimental results show that d-Confidence reduces the number of queries required to achieve complete class coverage and tends to improve or maintain classification error.
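The traditional confidence criterion used as the baseline is commonly realized as least-confidence sampling: the learner asks for the label of the unlabeled case whose most probable class has the lowest posterior. The following is a minimal sketch of that baseline only, under stated assumptions. The function names `confidence` and `select_query` and the posterior values are hypothetical, and the sketch does not reproduce d-Confidence, whose definition is not given in this abstract.

```python
from typing import Sequence


def confidence(posterior: Sequence[float]) -> float:
    """Confidence of a prediction, taken here as the highest class posterior."""
    return max(posterior)


def select_query(posteriors: Sequence[Sequence[float]]) -> int:
    """Return the index of the unlabeled case with the lowest confidence."""
    return min(range(len(posteriors)), key=lambda i: confidence(posteriors[i]))


# Hypothetical posteriors over two already-known classes for three unlabeled cases.
pool = [
    [0.95, 0.05],
    [0.55, 0.45],
    [0.80, 0.20],
]

# Index of the case the learner would ask the annotator to label next.
query_index = select_query(pool)
```

Because the posteriors range only over classes that already have labeled cases, a case from a class not yet seen can still receive a high confidence score and never be queried. This is one way to read the coverage problem that arises when the pre-labeled set does not represent every class.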
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Escudeiro, N.F., Jorge, A.M. (2009). Efficient Coverage of Case Space with Active Learning. In: Lopes, L.S., Lau, N., Mariano, P., Rocha, L.M. (eds) Progress in Artificial Intelligence. EPIA 2009. Lecture Notes in Computer Science, vol 5816. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04686-5_34
DOI: https://doi.org/10.1007/978-3-642-04686-5_34
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04685-8
Online ISBN: 978-3-642-04686-5
eBook Packages: Computer Science (R0)