Active Crowd Counting with Limited Supervision

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12365)


To learn a reliable people counter from crowd images, head center annotations are normally required. Annotating head centers is, however, a laborious and tedious process in dense crowds. In this paper, we present an active learning framework that enables accurate crowd counting with limited supervision: given a small labeling budget, instead of randomly selecting images to annotate, we first introduce an active labeling strategy to annotate the most informative images in the dataset and learn the counting model on them. The process is repeated such that in every cycle we select samples that are diverse in crowd density and dissimilar to previous selections. In the last cycle, when the labeling budget is met, the large amount of unlabeled data is also utilized: a distribution classifier is introduced to align the labeled data with the unlabeled data; furthermore, we propose to mix up the distribution labels and latent representations of data in the network to specifically improve the distribution alignment between training samples. We follow the popular density estimation pipeline for crowd counting. Extensive experiments are conducted on standard benchmarks, i.e., ShanghaiTech, UCF_CC_50, Mall, TRANCOS, and DCC. By annotating a limited number of images (e.g., 10% of the dataset), our method reaches levels of performance not far from state-of-the-art methods that use full annotations of the dataset.
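The mix-up of distribution labels and latent representations mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard mixup formulation (one coefficient drawn from a Beta distribution and shared by features and labels) and an assumed label encoding of 0 for the labeled set and 1 for the unlabeled set. The function name `mix_latent_and_labels`, the `alpha` parameter, and the toy vectors are hypothetical and are not taken from the paper.

```python
import random


def mix_latent_and_labels(h_a, h_b, y_a, y_b, alpha=1.0, rng=None):
    """Interpolate two latent vectors and their distribution labels.

    Hypothetical helper: a single coefficient lam ~ Beta(alpha, alpha)
    is shared by the features and the labels, as in standard mixup.
    """
    rng = rng or random.Random()
    lam = rng.betavariate(alpha, alpha)
    # Convex combination of the latent representations, element by element.
    h_mix = [lam * a + (1.0 - lam) * b for a, b in zip(h_a, h_b)]
    # The same convex combination applied to the distribution labels.
    y_mix = lam * y_a + (1.0 - lam) * y_b
    return h_mix, y_mix, lam


# Toy latent vectors standing in for network activations (assumed values).
h_labeled = [0.2, 1.0, -0.5]
h_unlabeled = [1.2, 0.0, 0.5]
# Assumed encoding: 0.0 = drawn from the labeled set, 1.0 = from the unlabeled set.
h_mix, y_mix, lam = mix_latent_and_labels(
    h_labeled, h_unlabeled, 0.0, 1.0, rng=random.Random(0)
)
```

With this encoding the mixed label equals `1 - lam`, so a distribution classifier trained on such pairs receives a soft target for samples lying between the labeled and unlabeled data, which is one way to read "alignment in-between training samples".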



This work was supported by the National Natural Science Foundation of China (NSFC) under Grant Nos. 61828602 and 51475334, the National Key Research and Development Program of Science and Technology of China under Grant No. 2018YFB1305304, and the Shanghai Science and Technology Pilot Project under Grant No. 19511132100.

Supplementary material

504476_1_En_34_MOESM1_ESM.pdf (589 kb)



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. College of Electronic and Information Engineering, Tongji University, Shanghai, China
  2. King’s College London, London, UK
  3. Institute of Intelligent Science and Technology, Tongji University, Shanghai, China
