Real-Time Person Identification by Video Image Based on YOLOv2 and VGG 16 Networks

  • THEMATIC ISSUE
  • Automation and Remote Control

Abstract

This paper addresses video-based face recognition. Face recognition methods have advanced considerably, but recognition from video, with its poor image quality, difficult lighting conditions, and real-time requirements, remains a difficult and unsolved task.

The paper uses convolutional networks at each stage of processing: capturing and detecting a face, constructing a feature vector, and finally recognizing the person. All algorithms are implemented and studied in the MATLAB environment to simplify their later export to embedded applications.
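
As a rough illustration of the final recognition stage only, the sketch below matches a query feature vector against a small gallery of reference vectors using cosine similarity. The abstract does not say how the paper compares feature vectors, so the similarity measure, the threshold value, and the names `cosine_similarity`, `identify`, and `gallery` are all assumptions made for this example; it is written in Python rather than the MATLAB environment the paper uses, and the three-element vectors stand in for the much longer vectors a convolutional network would produce.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector matches nothing
    return dot / (norm_a * norm_b)


def identify(query, gallery, threshold=0.8):
    """Return (name, score) of the best gallery match.

    The name is None when the best score falls below the threshold,
    i.e. the face is treated as unknown. The threshold is a
    hypothetical value chosen for illustration.
    """
    best_name, best_score = None, -1.0
    for name, reference in gallery.items():
        score = cosine_similarity(query, reference)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score


# Toy gallery: in practice each vector would come from the
# feature-extraction network applied to a detected face.
gallery = {
    "person_a": [1.0, 0.0, 0.0],
    "person_b": [0.0, 1.0, 0.0],
}

name, score = identify([0.9, 0.1, 0.0], gallery)
unknown_name, unknown_score = identify([0.0, 0.0, 1.0], gallery)
```

Here the first query lies close to `person_a` and is accepted, while the second is orthogonal to every gallery entry and is rejected as unknown. A threshold of this kind is one simple way to handle people who are not enrolled in the gallery.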

Author information

Correspondence to A. V. Bobkov or Kh. Aung.

Additional information

Translated by V. Potapchouck

Cite this article

Bobkov, A.V., Aung, K. Real-Time Person Identification by Video Image Based on YOLOv2 and VGG 16 Networks. Autom Remote Control 83, 1567–1575 (2022). https://doi.org/10.1134/S00051179220100095

  • DOI: https://doi.org/10.1134/S00051179220100095
