Consistency-Based Semi-supervised Active Learning: Towards Minimizing Labeling Cost

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12355)


Active learning (AL) combines data labeling and model training to minimize the labeling cost by prioritizing the selection of high-value data that can best improve model performance. In pool-based active learning, most conventional methods leave the accessible unlabeled data unused during model training. Here, we propose to unify unlabeled sample selection and model training towards minimizing labeling cost, and make two contributions towards that end. First, we exploit both labeled and unlabeled data using semi-supervised learning (SSL) to distill information from unlabeled data during the training stage. Second, we propose a consistency-based sample selection metric that is coherent with the training objective, so that the selected samples are effective at improving model performance. We conduct extensive experiments on image classification tasks. The experimental results on CIFAR-10, CIFAR-100 and ImageNet demonstrate the superior performance of our proposed method with limited labeled data, compared to the existing methods and the alternative AL and SSL combinations. Additionally, we study an important yet under-explored problem: "When can we start learning-based AL selection?" We propose a measure that is empirically correlated with the AL target loss and is potentially useful for determining the proper starting point of learning-based AL methods.


Keywords: Active learning · Semi-supervised learning · Consistency-based sample selection



Discussions with Giulia DeSalvo, Chih-kuan Yeh, Kihyuk Sohn, Chen Xing, and Wei Wei are gratefully acknowledged.

Supplementary material

Supplementary material 1 (PDF, 296 KB)



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. University of Maryland, College Park, USA
  2. Google Cloud AI, Sunnyvale, USA
  3. University of Washington, Seattle, USA
