YuNet: A Tiny Millisecond-level Face Detector

Wu, Wei; Peng, Hanyang; Yu, Shiqi

doi:10.1007/s11633-023-1423-y

Research Article
Open access
Published: 19 April 2023

Volume 20, pages 656–665 (2023)

Machine Intelligence Research

Abstract

Great progress has been made toward accurate face detection in recent years. However, heavy models and expensive computation make it difficult to deploy many detectors on mobile and embedded devices, where model size and latency are highly constrained. In this paper, we present a millisecond-level anchor-free face detector, YuNet, which is specifically designed for edge devices. Several key contributions improve the efficiency-accuracy trade-off. First, we analyse the influential state-of-the-art face detectors of recent years and summarize the rules for reducing model size. Then, a lightweight face detector, YuNet, is introduced. Our detector contains a tiny and efficient feature extraction backbone and a simplified pyramid feature fusion neck. To the best of our knowledge, YuNet has the best trade-off between accuracy and speed. It has only 75,856 parameters, less than 1/5 of the parameter count of other small-size detectors. In addition, we present a training strategy for the tiny face detector that can effectively train models with the same distribution as the training set. The proposed YuNet achieves 81.1% mAP (single-scale) on the WIDER FACE validation hard track with high inference efficiency (Intel i7-12700K: 1.6 ms per frame at 320 × 320). Because of its unique advantages, the repository for YuNet and its predecessors has been popular on GitHub and has gained more than 11K stars at https://github.com/ShiqiYu/libfacedetection
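To put the two headline figures of the abstract in perspective, the following minimal Python sketch converts the reported parameter count into a rough weight-storage footprint and the reported latency into an implied frame rate. The 4-bytes-per-parameter figure (32-bit floats) and the helper names `weight_bytes` and `frames_per_second` are assumptions made for illustration; they are not taken from the paper or from the libfacedetection repository.

```python
# Figures quoted in the abstract.
NUM_PARAMS = 75_856   # parameter count reported for YuNet
LATENCY_MS = 1.6      # per-frame latency at 320 x 320 on an Intel i7-12700K

# Assumption (not from the paper): weights are stored as 32-bit floats.
BYTES_PER_PARAM = 4


def weight_bytes(num_params: int, bytes_per_param: int = BYTES_PER_PARAM) -> int:
    """Raw weight storage in bytes, ignoring any file-format overhead."""
    return num_params * bytes_per_param


def frames_per_second(latency_ms: float) -> float:
    """Throughput implied by a fixed per-frame latency, one frame at a time."""
    return 1000.0 / latency_ms


if __name__ == "__main__":
    size = weight_bytes(NUM_PARAMS)
    print(f"weights: {size} bytes (~{size / 1024:.1f} KiB)")
    print(f"throughput: ~{frames_per_second(LATENCY_MS):.0f} frames/s")
```

Under these assumptions the weights occupy roughly 296 KiB and the reported latency corresponds to about 625 frames per second on the stated CPU. Quantized or compressed weights would be smaller, and real throughput also depends on pre- and post-processing that this estimate ignores.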

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China (No. 61976144), the Stable Support Plan Program of Shenzhen Natural Science Fund (No. 20200925155017002), and the National Key Research and Development Program of China (No. 2020AAA0140000).

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, 518055, China
Wei Wu & Shiqi Yu
Pengcheng Laboratory, Shenzhen, 518066, China
Hanyang Peng

Corresponding author

Correspondence to Shiqi Yu.

Ethics declarations

The authors declare that they have no conflicts of interest related to this work.

Additional information

Colored figures are available in the online version at https://link.springer.com/journal/11633

Wei Wu received the B. Sc. degree in computer science and technology from Chongqing University, China in 2017. Currently, he is a master's student in electronics science and technology at the Department of Computer Science and Engineering, Southern University of Science and Technology, China.

His research interests include object detection and computer vision.

Hanyang Peng received the B. Sc. degree in measurement and control technology from Northeast University of China, China in 2008, the M. Eng. degree in detection technology and automatic equipment from Tianjin University, China in 2010, and the Ph.D. degree in pattern recognition and intelligence systems from the Institute of Automation, Chinese Academy of Sciences, China in 2017. He currently works as an assistant professor at Pengcheng Laboratory, China.

His research interests include computer vision, machine learning and distributed learning.

Shiqi Yu received the B. Eng. degree in computer science and engineering from Chu Kochen Honors College, Zhejiang University, China in 2002, and the Ph.D. degree in pattern recognition and intelligent systems from the Institute of Automation, Chinese Academy of Sciences, China in 2007. He is currently an associate professor in the Department of Computer Science and Engineering, Southern University of Science and Technology, China. He worked as an assistant professor and an associate professor at the Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, China from 2007 to 2010, and as an associate professor at Shenzhen University, China from 2010 to 2019.

His research interests include gait recognition, face detection and computer vision.

Rights and permissions

This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Wu, W., Peng, H. & Yu, S. YuNet: A Tiny Millisecond-level Face Detector. Mach. Intell. Res. 20, 656–665 (2023). https://doi.org/10.1007/s11633-023-1423-y

Received: 02 September 2022
Accepted: 06 February 2023
Published: 19 April 2023
Issue Date: October 2023
DOI: https://doi.org/10.1007/s11633-023-1423-y
