Active Learning with Irrelevant Examples

  • Dominic Mazzoni
  • Kiri L. Wagstaff
  • Michael C. Burl
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4212)


Active learning algorithms attempt to accelerate the learning process by requesting labels for the most informative items first. In real-world problems, however, there may exist unlabeled items that are irrelevant to the user’s classification goals. Queries about these points slow down learning because they provide no information about the problem of interest. We have observed that when irrelevant items are present, active learning can perform worse than random selection, requiring more time (queries) to achieve the same level of accuracy. Therefore, we propose a novel approach, Relevance Bias, in which the active learner combines its default selection heuristic with the output of a simultaneously trained relevance classifier to favor items that are likely to be both informative and relevant. In our experiments on a real-world problem and two benchmark datasets, the Relevance Bias approach significantly improves the learning rate of three different active learning approaches.
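The abstract describes combining the active learner's usual selection heuristic with a relevance classifier's output. A minimal sketch of that idea, with all names hypothetical (the paper's actual scoring rule and classifiers are not reproduced here), might weight each candidate's informativeness score by its predicted probability of relevance:

```python
# Illustrative sketch of the Relevance Bias idea: favor items that are
# likely to be both informative AND relevant. Function names and the
# multiplicative combination are assumptions, not the authors' code.

def relevance_biased_query(candidates, informativeness, relevance_prob):
    """Pick the unlabeled candidate maximizing informativeness * P(relevant)."""
    return max(candidates, key=lambda x: informativeness(x) * relevance_prob(x))

# Toy example: item "b" is highly informative but probably irrelevant,
# so the biased query prefers "c" instead.
info = {"a": 0.2, "b": 0.9, "c": 0.7}   # active learner's heuristic scores
rel = {"a": 0.9, "b": 0.1, "c": 0.8}    # relevance classifier's P(relevant)
best = relevance_biased_query(["a", "b", "c"], info.get, rel.get)
```

Here a plain active learner would query "b" (highest informativeness), but its low predicted relevance pushes the combined score below that of "c".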





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Dominic Mazzoni (1)
  • Kiri L. Wagstaff (1)
  • Michael C. Burl (1)

  1. Jet Propulsion Laboratory, California Institute of Technology, Pasadena, USA
