Abstract
This paper addresses the problem of video-based face recognition. Face recognition methods have advanced considerably in recent years, but recognition from video remains an open and difficult task because of poor image quality, difficult lighting conditions, and real-time requirements.
The paper applies convolutional neural networks at every stage of processing: capturing and detecting a face, constructing a feature vector, and finally recognition. All algorithms are implemented and studied in the MATLAB environment to simplify their subsequent export to embedded applications.
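The final recognition stage of such a pipeline can be illustrated with a minimal sketch. It is written in Python for brevity, although the paper itself works in MATLAB. It assumes that a face has already been detected and converted into a feature vector by a convolutional network. The gallery of reference vectors, the function names `cosine_similarity` and `identify`, the threshold value, and the choice of cosine similarity as the matching rule are all assumptions made for illustration and are not taken from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector carries no direction to compare
    return dot / (norm_a * norm_b)


def identify(query, gallery, threshold=0.8):
    """Match a query feature vector against a gallery of known people.

    gallery maps a person's name to a reference feature vector.
    Returns (name, score) for the best match, or (None, score) when the
    best score is below the threshold (hypothetical value), meaning the
    face is treated as unknown.
    """
    best_name, best_score = None, -1.0
    for name, reference in gallery.items():
        score = cosine_similarity(query, reference)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score


# Toy three-dimensional vectors stand in for real network features.
gallery = {
    "alice": [1.0, 0.0, 0.0],
    "bob": [0.0, 1.0, 0.0],
}
```

Calling `identify([0.9, 0.1, 0.0], gallery)` selects `"alice"`, while a query that is equally far from every reference, such as `[1.0, 1.0, 1.0]`, falls below the threshold and is reported as unknown. In a real-time setting the threshold trades false accepts against false rejects and would need to be tuned on representative video data.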
Additional information
Translated by V. Potapchouck
Cite this article
Bobkov, A.V., Aung, K. Real-Time Person Identification by Video Image Based on YOLOv2 and VGG 16 Networks. Autom Remote Control 83, 1567–1575 (2022). https://doi.org/10.1134/S00051179220100095
DOI: https://doi.org/10.1134/S00051179220100095