Efficient Coverage of Case Space with Active Learning

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5816)

Abstract

Collecting and annotating exemplary cases is a costly and critical task required in the early stages of any classification process. Reducing labeling cost without degrading accuracy calls for a compromise, which may be achieved with active learning. Common active learning approaches focus on accuracy and assume the availability of a pre-labeled set of exemplary cases covering all the classes to be learned. This assumption does not necessarily hold. In this paper we study how rapidly a new active learning approach, d-Confidence, covers the case space compared with the traditional active learning confidence criterion when the representativeness assumption is not met. Experimental results show that d-Confidence reduces the number of queries required to achieve complete class coverage and tends to reduce or maintain classification error.
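The contrast drawn in the abstract can be illustrated with a minimal sketch. The first function shows the traditional confidence criterion in its usual form: query the unlabeled case whose most likely class has the lowest posterior. The second is only an assumed illustration of how a distance-aware criterion might favor cases far from the classes already seen; the abstract does not define d-Confidence, so the scoring rule, the function names, and the `distances` input (a per-case distance to each known class) are hypothetical and should not be read as the authors' formulation.

```python
from typing import List, Sequence


def least_confident_query(posteriors: Sequence[Sequence[float]]) -> int:
    """Traditional confidence criterion: return the index of the unlabeled
    case whose highest class posterior is the lowest."""
    return min(range(len(posteriors)), key=lambda i: max(posteriors[i]))


def distance_weighted_query(
    posteriors: Sequence[Sequence[float]],
    distances: Sequence[Sequence[float]],
    eps: float = 1e-12,
) -> int:
    """Hypothetical distance-aware criterion (an assumption, not taken from
    the paper): divide each class posterior by the case's distance to that
    known class, keep the best ratio, and query the case with the lowest one.
    Cases that are far from every known class get low scores and are
    therefore queried earlier."""

    def score(i: int) -> float:
        # eps guards against division by zero for cases sitting on a class.
        return max(p / max(d, eps) for p, d in zip(posteriors[i], distances[i]))

    return min(range(len(posteriors)), key=score)


# Toy data: three unlabeled cases, two known classes.
toy_posteriors: List[List[float]] = [[0.9, 0.1], [0.55, 0.45], [0.6, 0.4]]
# Assumed distances from each case to each known class; case 2 is far from both.
toy_distances: List[List[float]] = [[1.0, 2.0], [1.0, 1.0], [5.0, 6.0]]

if __name__ == "__main__":
    print("confidence query:", least_confident_query(toy_posteriors))
    print("distance-weighted query:", distance_weighted_query(toy_posteriors, toy_distances))
```

On this toy data the two criteria pick different cases: the plain confidence rule stays near the boundary between the known classes, while the distance-weighted rule reaches for the remote case, which is where an unseen class would be more likely to appear.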

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Escudeiro, N.F., Jorge, A.M. (2009). Efficient Coverage of Case Space with Active Learning. In: Lopes, L.S., Lau, N., Mariano, P., Rocha, L.M. (eds) Progress in Artificial Intelligence. EPIA 2009. Lecture Notes in Computer Science, vol 5816. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04686-5_34

  • DOI: https://doi.org/10.1007/978-3-642-04686-5_34

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-04685-8

  • Online ISBN: 978-3-642-04686-5

  • eBook Packages: Computer Science, Computer Science (R0)
