Agreeing to disagree: active learning with noisy labels without crowdsourcing

  • Mohamed-Rafik Bouguelia
  • Slawomir Nowaczyk
  • K. C. Santosh
  • Antanas Verikas
Original Article


We propose a new active learning method for classification that handles label noise without relying on multiple oracles (i.e., crowdsourcing). The first strategy selects, for labeling, instances with a high influence on the learned model. An instance x is said to have a high influence on the model h if training h on x (with label \(y = h(x)\)) would result in a model that greatly disagrees with h on the labels of other instances. The second strategy selects, for labeling, instances that are highly influenced by changes in the learned model. An instance x is said to be highly influenced if training h on a set of instances would result in a committee of models that agree on a common label for x but disagree with h(x). We compare the two strategies and show, on several publicly available datasets, that selecting instances according to the first strategy while eliminating noisy labels according to the second greatly improves accuracy compared with several benchmark methods, even when a significant number of instances are mislabeled.
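
The two definitions in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the nearest-centroid classifier standing in for h, the function names (`fit`, `predict`, `retrain_with`, `influence`, `influenced_label`), the way the committee is built (one retrained model per candidate instance), and the toy one-dimensional data are all assumptions made for illustration.

```python
from typing import Dict, Hashable, List, Optional, Sequence, Tuple

Point = Tuple[float, ...]
Model = Dict[Hashable, Point]  # class label -> centroid (hypothetical toy model)


def fit(X: Sequence[Point], y: Sequence[Hashable]) -> Model:
    """Toy stand-in for the model h: one mean vector (centroid) per class."""
    sums: Dict[Hashable, List[float]] = {}
    counts: Dict[Hashable, int] = {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: tuple(v / counts[c] for v in s) for c, s in sums.items()}


def predict(h: Model, x: Point) -> Hashable:
    """Return the class whose centroid is closest to x."""
    return min(h, key=lambda c: sum((a - b) ** 2 for a, b in zip(h[c], x)))


def retrain_with(X, y, h: Model, x: Point) -> Model:
    """Train a new model on the labeled set plus x, labeled with y = h(x)."""
    return fit(list(X) + [x], list(y) + [predict(h, x)])


def influence(X, y, pool: Sequence[Point], x: Point) -> float:
    """First strategy: fraction of pool instances on which h and the model
    retrained on (x, h(x)) disagree."""
    h = fit(X, y)
    h_x = retrain_with(X, y, h, x)
    return sum(predict(h, u) != predict(h_x, u) for u in pool) / len(pool)


def influenced_label(X, y, candidates: Sequence[Point], x: Point) -> Optional[Hashable]:
    """Second strategy: if the committee (assumed here to be one retrained
    model per candidate) agrees on a label for x that differs from h(x),
    return that label; otherwise return None."""
    h = fit(X, y)
    votes = {predict(retrain_with(X, y, h, c), x) for c in candidates}
    if len(votes) == 1:
        common = next(iter(votes))
        if common != predict(h, x):
            return common
    return None


# Toy data: two labeled points, decision boundary of h at 2.0.
X_lab, y_lab = [(0.0,), (4.0,)], [0, 1]
pool = [(1.0,), (2.4,), (3.0,), (5.0,)]
print(influence(X_lab, y_lab, pool, (1.9,)))                       # 0.25
print(influenced_label(X_lab, y_lab, [(1.9,), (1.8,)], (2.1,)))    # 0
```

In this toy run, adding the instance at 1.9 shifts the boundary enough to flip the prediction for 2.4, so one of the four pool instances changes label. The instance at 2.1 is labeled 1 by h, but both committee members label it 0, which is the situation the abstract describes as "highly influenced".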


Keywords: Active learning; Classification; Label noise; Mislabeling



Copyright information

© Springer-Verlag Berlin Heidelberg 2017

Authors and Affiliations

  1. Center for Applied Intelligent Systems Research, Halmstad University, Halmstad, Sweden
  2. Department of Computer Science, The University of South Dakota, Vermillion, USA
