A Novel Face Detector Based on YOLOv3

  • Conference paper
AI 2020: Advances in Artificial Intelligence (AI 2020)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12576)

Abstract

Face detection has broad applications, and deep learning methods have recently driven substantial progress in it. However, detecting small faces in real-world environments remains challenging because of low resolution, variation in size, differing poses and occlusion. YOLOv3 is one of the main approaches to object detection and achieves comparatively good performance on small targets in real time. However, it still struggles to detect groups of small faces: localization is inaccurate and the number of false positives increases. In this paper, we propose an efficient multiscale deep learning network based on YOLOv3 to detect groups of small faces. First, we select the optimum number of anchors to better represent small face targets; second, we replace the bounding box regression loss in YOLOv3 with the CIoU loss to reduce false positives; third, we extend the number of detection scales in YOLOv3 from 3 to 4, specifically to detect small faces; fourth, of the six convolutional layers in each detection scale, we replace four with two residual blocks to avoid vanishing gradients. The proposed model achieves state-of-the-art performance on the WIDER FACE face detection benchmark, especially on the hard subset, which contains many small faces with varied scale, pose and occlusion. Our model achieves 86.5% AP on the WIDER FACE hard validation subset, compared with 72.9% AP for YOLOv3. The run time is also satisfactory for real applications: 64.3 FPS on VGA-resolution images using an Nvidia Titan RTX.
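
To make the loss named in the abstract concrete, the following is a minimal sketch of the CIoU loss for a single pair of boxes, following the published definition by Zheng et al. (IoU, a normalised centre-distance penalty, and an aspect-ratio consistency term). It is an illustration only, not the authors' training code: the function name `ciou_loss`, the corner-format `(x1, y1, x2, y2)` boxes and the `eps` stabiliser are assumptions made here.

```python
import math


def ciou_loss(pred, target, eps=1e-9):
    """Return 1 - CIoU for two boxes given as (x1, y1, x2, y2).

    Hypothetical helper for illustration; a real detector would
    compute this over batches of tensors.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Plain intersection over union.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Squared distance between box centres, normalised by the squared
    # diagonal of the smallest box enclosing both.
    rho2 = ((px1 + px2) / 2 - (tx1 + tx2) / 2) ** 2 \
         + ((py1 + py2) / 2 - (ty1 + ty2) / 2) ** 2
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio consistency term v and its trade-off weight alpha.
    v = (4 / math.pi ** 2) * (
        math.atan(tw / (th + eps)) - math.atan(pw / (ph + eps))
    ) ** 2
    alpha = v / ((1 - iou) + v + eps)

    return 1 - iou + rho2 / c2 + alpha * v


if __name__ == "__main__":
    # Overlapping boxes incur a smaller loss than disjoint ones.
    print(ciou_loss((0, 0, 2, 2), (1, 0, 3, 2)))
    print(ciou_loss((0, 0, 1, 1), (2, 2, 3, 3)))
```

Unlike a plain IoU loss, the centre-distance term still gives a useful signal when the predicted and ground-truth boxes do not overlap at all, which is a common situation for very small faces.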

Author information

Corresponding author

Correspondence to Sabrina Hoque Tuli.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Tuli, S.H., Mao, A., Liu, W. (2020). A Novel Face Detector Based on YOLOv3. In: Gallagher, M., Moustafa, N., Lakshika, E. (eds) AI 2020: Advances in Artificial Intelligence. AI 2020. Lecture Notes in Computer Science, vol 12576. Springer, Cham. https://doi.org/10.1007/978-3-030-64984-5_5

  • DOI: https://doi.org/10.1007/978-3-030-64984-5_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-64983-8

  • Online ISBN: 978-3-030-64984-5

  • eBook Packages: Computer Science, Computer Science (R0)
