
Unsupervised Selective Labeling for More Effective Semi-supervised Learning

Conference paper in Computer Vision – ECCV 2022 (ECCV 2022)

Abstract

Given an unlabeled dataset and an annotation budget, we study how to selectively label a fixed number of instances so that semi-supervised learning (SSL) on such a partially labeled dataset is most effective. We focus on selecting the right data to label, in addition to SSL's usual propagation of labels from labeled data to the remaining unlabeled data. This instance selection task is challenging, as without any labeled data we do not know what the objective of learning should be. Intuitively, no matter what the downstream task is, instances to be labeled must be representative and diverse: the former facilitates label propagation to unlabeled data, whereas the latter ensures coverage of the entire dataset. We capture this idea by selecting cluster prototypes, either in a pretrained feature space, or along with feature optimization, both without labels. Our unsupervised selective labeling consistently improves SSL methods over state-of-the-art active learning given labeled data, by 8–25× in label efficiency. For example, it boosts FixMatch by 10% (14%) in accuracy on CIFAR-10 (ImageNet-1K) with 0.08% (0.2%) labeled data, demonstrating that a small amount of computation spent on selecting what data to label brings significant gain, especially under a low annotation budget. Our work sets a new standard for practical and efficient SSL.
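The core idea in the abstract — representative and diverse instances found as cluster prototypes in an unsupervised feature space — can be illustrated with a minimal sketch. The helpers below (`farthest_first`, `select_prototypes`) are illustrative names, not the paper's actual algorithm: we run plain K-means with K equal to the annotation budget on (assumed self-supervised) features, then label the instance nearest each centroid. Nearest-to-centroid picks are representative (they sit in dense regions, easing label propagation), and one pick per cluster enforces diversity (coverage of the dataset).

```python
import numpy as np

def farthest_first(features, k, start=0):
    """Greedy k-center initialization: repeatedly add the point farthest
    from the already-chosen set, so each dense region gets a seed."""
    chosen = [start]
    dist = np.linalg.norm(features - features[start], axis=1)
    for _ in range(k - 1):
        nxt = int(dist.argmax())
        chosen.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(features - features[nxt], axis=1))
    return chosen

def select_prototypes(features, budget, iters=30):
    """Pick `budget` instances to label: K-means with K = budget on the
    features, then the instance nearest each centroid -- representative
    (dense region) and diverse (one pick per cluster)."""
    centroids = features[farthest_first(features, budget)].astype(float)
    for _ in range(iters):
        # assign every point to its nearest centroid
        d = np.linalg.norm(features[:, None] - centroids[None], axis=-1)
        assign = d.argmin(axis=1)
        for k in range(budget):
            members = features[assign == k]
            if len(members):  # skip empty clusters
                centroids[k] = members.mean(axis=0)
    d = np.linalg.norm(features[:, None] - centroids[None], axis=-1)
    return sorted(int(d[:, k].argmin()) for k in range(budget))

# Toy demo: three well-separated 2-D "feature" blobs of 50 points each,
# standing in for self-supervised embeddings of an unlabeled dataset.
rng = np.random.default_rng(1)
feats = np.concatenate([rng.normal(c, 0.1, size=(50, 2))
                        for c in ([0.0, 0.0], [5.0, 5.0], [0.0, 5.0])])
picked = select_prototypes(feats, budget=3)
print([i // 50 for i in picked])  # one prototype per blob: [0, 1, 2]
```

This is only the "pretrained feature space" half of the idea; the paper's second variant additionally optimizes the features jointly with selection, which this sketch does not attempt.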

X. Wang and L. Lian contributed equally.




Acknowledgements

The authors thank Alexei Efros and Trevor Darrell for helpful discussions and feedback on this work in their classes.

Author information

Correspondence to Stella X. Yu.


Electronic supplementary material

Supplementary material 1 (pdf 1387 KB)


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Wang, X., Lian, L., Yu, S.X. (2022). Unsupervised Selective Labeling for More Effective Semi-supervised Learning. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13690. Springer, Cham. https://doi.org/10.1007/978-3-031-20056-4_25


  • DOI: https://doi.org/10.1007/978-3-031-20056-4_25

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-20055-7

  • Online ISBN: 978-3-031-20056-4

  • eBook Packages: Computer Science, Computer Science (R0)
