Real-Time Person Identification by Video Image Based on YOLOv2 and VGG 16 Networks

Bobkov, A. V.; Aung, Kh.

doi:10.1134/S00051179220100095

THEMATIC ISSUE
Published: 20 December 2022

Volume 83, pages 1567–1575 (2022)

Automation and Remote Control

Abstract

This paper addresses the problem of video-based face recognition. Face recognition methods have advanced considerably, but video-based recognition remains an open problem because of poor image quality, difficult lighting conditions, and real-time requirements.

The paper applies convolutional networks at the successive stages of processing: capturing and detecting a face, constructing a feature vector, and, finally, recognition. All algorithms are implemented and studied in the MATLAB environment to simplify their later export to embedded applications.
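
The three stages named in the abstract (face detection, feature vector construction, recognition) can be pictured as a per-frame pipeline. The following is only a minimal Python sketch of that structure, not the authors' MATLAB implementation. The names `identify_frame`, `crop`, `Identification`, the `(x, y, width, height)` box layout, and the list-of-rows frame type are all hypothetical, and the three stages are passed in as plain callables rather than tied to any particular network.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# Hypothetical types, chosen only for this illustration.
Box = Tuple[int, int, int, int]   # (x, y, width, height)
Frame = List[List[float]]         # grayscale frame as rows of pixel values


@dataclass
class Identification:
    box: Box      # where the face was found in the frame
    label: str    # identity returned by the recognition stage


def crop(frame: Frame, box: Box) -> Frame:
    """Cut the region described by box out of the frame."""
    x, y, w, h = box
    return [row[x:x + w] for row in frame[y:y + h]]


def identify_frame(
    frame: Frame,
    detect: Callable[[Frame], Sequence[Box]],
    embed: Callable[[Frame], Sequence[float]],
    recognize: Callable[[Sequence[float]], str],
) -> List[Identification]:
    """Run detect -> embed -> recognize on a single video frame."""
    results = []
    for box in detect(frame):
        features = embed(crop(frame, box))   # feature vector for one face
        results.append(Identification(box, recognize(features)))
    return results
```

Keeping the stages as separate callables means each one can be replaced or timed on its own, which matters when the whole loop has to keep up with the video frame rate.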

Author information

Authors and Affiliations

Bauman Moscow State Technical University, Moscow, 105005, Russia
A. V. Bobkov & Kh. Aung

Corresponding authors

Correspondence to A. V. Bobkov or Kh. Aung.

Additional information

Translated by V. Potapchouck

About this article

Cite this article

Bobkov, A.V., Aung, K. Real-Time Person Identification by Video Image Based on YOLOv2 and VGG 16 Networks. Autom Remote Control 83, 1567–1575 (2022). https://doi.org/10.1134/S00051179220100095

Received: 17 February 2022
Revised: 22 April 2022
Accepted: 29 June 2022
Published: 20 December 2022
Issue Date: October 2022
DOI: https://doi.org/10.1134/S00051179220100095
