ProgressFace: Scale-Aware Progressive Learning for Face Detection

  • Conference paper
Computer Vision – ECCV 2020 (ECCV 2020)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12351)

Abstract

Scale variation is one of the key challenges in face detection. Recent attempts to cope with it incorporate image/feature pyramids or adjust anchor sampling/matching strategies. In this work, we propose a novel scale-aware progressive training mechanism to address large scale variations across faces. Inspired by curriculum learning, our method gradually learns face instances from large to small. The preceding models, learned with easier samples (i.e., large faces), provide a good initialization for the succeeding learning with harder samples (i.e., small faces), ultimately leading to a better optimum for the face detector. Moreover, we propose an auxiliary anchor-free enhancement module that facilitates the learning of small faces by supplying positive anchors that may not be covered under the IoU overlap criterion. This anchor-free module is removed during inference, so it introduces no extra computation cost. Extensive experimental results demonstrate the superiority of our method over state-of-the-art methods on the standard FDDB and WIDER FACE benchmarks. In particular, our ProgressFace-Light with a MobileNet-0.25 backbone achieves 87.9% AP on the hard set of WIDER FACE, surpassing RetinaFace with the same backbone by a large margin of 9.7%. Code and our trained face detection models are available at https://github.com/jiashu-zhu/ProgressFace.
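The large-to-small curriculum described in the abstract can be illustrated with a minimal sketch. It is only one possible reading, not the authors' implementation: the helper names `face_scale` and `progressive_stages`, the scale thresholds, and the choice to make each stage cumulative (later stages keep the large faces and add smaller ones) are all assumptions made here for illustration.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def face_scale(box: Box) -> float:
    """Square root of the box area, used here as the face 'scale' (an assumption)."""
    x1, y1, x2, y2 = box
    return (max(x2 - x1, 0.0) * max(y2 - y1, 0.0)) ** 0.5


def progressive_stages(
    boxes: Sequence[Box], min_scales: Sequence[float]
) -> List[List[Box]]:
    """Build one training subset per stage.

    Stage k keeps every face whose scale is at least min_scales[k].
    Thresholds must go from large to small, so each stage adds harder
    (smaller) faces on top of the ones already seen.
    """
    if list(min_scales) != sorted(min_scales, reverse=True):
        raise ValueError("min_scales must be ordered from large to small")
    return [[b for b in boxes if face_scale(b) >= s] for s in min_scales]


# Hypothetical ground-truth boxes: one large, one medium, one tiny face.
boxes = [(0, 0, 200, 200), (0, 0, 40, 40), (0, 0, 10, 10)]
# Hypothetical thresholds; the actual stage boundaries are not given in the abstract.
stages = progressive_stages(boxes, min_scales=[128, 32, 0])
for k, stage in enumerate(stages):
    print(f"stage {k}: {len(stage)} face(s)")
```

In such a scheme, a detector trained on stage 0 would initialize training on stage 1, and so on, which mirrors the idea that models learned on easier samples initialize learning on harder ones.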

Notes

  1. Faces with area <128 account for ~29% in WIDER FACE.

Author information

Corresponding author

Correspondence to Jiashu Zhu.

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 7922 KB)

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Zhu, J., Li, D., Han, T., Tian, L., Shan, Y. (2020). ProgressFace: Scale-Aware Progressive Learning for Face Detection. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science, vol 12351. Springer, Cham. https://doi.org/10.1007/978-3-030-58539-6_21

  • DOI: https://doi.org/10.1007/978-3-030-58539-6_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58538-9

  • Online ISBN: 978-3-030-58539-6

  • eBook Packages: Computer Science, Computer Science (R0)