Improving Active Learning by Avoiding Ambiguous Samples

  • Christian Limberg
  • Heiko Wersing
  • Helge Ritter
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11139)


If label information in a classification task is expensive, it can be beneficial to use active learning to select the most informative samples for a human to label. However, some samples may be meaningless to the human or recorded incorrectly. If these samples lie near the classifier’s decision boundary, they are queried repeatedly for labeling. This is inefficient for training, because the human cannot label these samples correctly, and it may lower human acceptance. We introduce an approach that compensates for the problem of ambiguous samples by excluding clustered samples from labeling. We compare this approach to other state-of-the-art methods. We further show that we can improve the accuracy in active learning and reduce the number of ambiguous samples queried during training.
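The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract only states that clustered samples are excluded from labeling, so the sketch substitutes a fixed-radius neighborhood for the clustering step and a nearest-centroid rule for the classifier. The function names, the `radius` parameter, and the oracle that answers `None` for an unlabelable sample are all hypothetical.

```python
import math


def centroid(points):
    """Mean of a non-empty list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def margin(x, centroids):
    """Absolute difference between the distances to the two class centroids.

    A small margin means x lies near the decision boundary.
    """
    return abs(math.dist(x, centroids[0]) - math.dist(x, centroids[1]))


def active_learning(pool, oracle, labeled, budget, radius):
    """Uncertainty sampling that skips the neighborhood of unlabelable samples.

    pool    -- list of unlabeled 2-D points
    oracle  -- callable returning 0, 1, or None ("cannot label this sample")
    labeled -- dict {0: [points], 1: [points]}, at least one seed per class
    budget  -- maximum number of queries
    radius  -- assumed exclusion radius (stand-in for a clustering step)
    Returns (labeled, number of queries the oracle could not answer).
    """
    pool = list(pool)
    ambiguous_queries = 0
    for _ in range(budget):
        if not pool:
            break
        cents = {c: centroid(pts) for c, pts in labeled.items()}
        # Query the sample closest to the current decision boundary.
        x = min(pool, key=lambda p: margin(p, cents))
        pool.remove(x)
        y = oracle(x)
        if y is None:
            ambiguous_queries += 1
            # Exclude everything near the ambiguous sample from future queries.
            pool = [p for p in pool if math.dist(p, x) > radius]
        else:
            labeled[y].append(x)
    return labeled, ambiguous_queries


# Toy data: ten unlabelable points on the boundary, five points per class.
demo_pool = (
    [(0.0, 0.1 * k) for k in range(10)]
    + [(-2.0, 0.5 * k) for k in range(5)]
    + [(2.0, 0.5 * k) for k in range(5)]
)


def demo_oracle(p):
    """Hypothetical annotator: cannot label points with |x| < 1."""
    if abs(p[0]) < 1:
        return None
    return 0 if p[0] < 0 else 1
```

With `radius=0.0` the loop degenerates to plain uncertainty sampling and keeps returning to the boundary region; with a radius that covers the ambiguous group, one failed query is enough to remove the whole group from the pool. How the radius (or, in the paper, the clustering) should be chosen is not specified in the abstract.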


Keywords: Active learning, Ambiguous samples, Certainty, Rejection, Clustering



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Christian Limberg (1, 2), corresponding author
  • Heiko Wersing (2)
  • Helge Ritter (1)

  1. CoR-Lab, Bielefeld University, Bielefeld, Germany
  2. HONDA Research Institute Europe GmbH, Offenbach, Germany
