Abstract
Face video retrieval is an attractive research topic in computer vision. However, it remains challenging because of significant variations in pose, illumination conditions, occlusions, and facial expressions. Face recognition plays a vital role in video content analysis. In addition, deep neural networks are being actively studied, and deep learning models have been widely used for object detection and, in particular, face recognition. Therefore, this study proposes a cloud-based face video retrieval system with deep learning. First, a dataset is collected and pre-processed: to produce a useful dataset for the CNN models, blurry images are removed, and face alignment is applied to the remaining images. The final dataset is then used to pre-train the CNN models (VGGFace, ArcFace, and FaceNet) for face recognition. We compare the results of these three models and choose the most efficient one to develop the system. To issue a query, users type in the name of a person. If the system detects a new person, it enrolls that person. The result of a query is a list of images and the times associated with those images. In addition, a system prototype is implemented to verify the feasibility of the proposed system. Experimental results demonstrate that the system performs well in terms of recognition accuracy and computational time.
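The query and enrollment steps described in the abstract can be illustrated with a minimal sketch. The abstract does not say how faces are matched or stored, so everything here is an assumption made for illustration: the `FaceIndex` and `Detection` names, the cosine-similarity matching, the 0.6 threshold, and the automatic `person_N` naming are all hypothetical. In a real system the embeddings would come from the selected CNN model; toy 3-dimensional vectors stand in for them here.

```python
import math
from dataclasses import dataclass


@dataclass
class Detection:
    image: str     # identifier of the face image (hypothetical file name)
    time_s: float  # timestamp of the frame in the video, in seconds


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class FaceIndex:
    """Toy in-memory index: name -> reference embedding and list of detections."""

    def __init__(self, threshold=0.6):
        self.threshold = threshold  # assumed similarity cut-off, not from the paper
        self.gallery = {}           # name -> reference embedding
        self.hits = {}              # name -> list of Detection

    def enroll(self, name, embedding):
        """Register a person under the given name."""
        self.gallery[name] = list(embedding)
        self.hits.setdefault(name, [])

    def identify(self, embedding):
        """Return the best-matching enrolled name, or None if below threshold."""
        best_name, best_sim = None, self.threshold
        for name, ref in self.gallery.items():
            sim = cosine_similarity(embedding, ref)
            if sim >= best_sim:
                best_name, best_sim = name, sim
        return best_name

    def add_face(self, embedding, image, time_s):
        """Record a detected face; enroll it first if nobody matches."""
        name = self.identify(embedding)
        if name is None:
            name = f"person_{len(self.gallery) + 1}"  # placeholder name
            self.enroll(name, embedding)
        self.hits[name].append(Detection(image, time_s))
        return name

    def query(self, name):
        """Return the detections for a name, ordered by time."""
        return sorted(self.hits.get(name, []), key=lambda d: d.time_s)


# Usage with made-up embeddings and frame names.
index = FaceIndex(threshold=0.6)
index.enroll("Alice", [1.0, 0.0, 0.0])
index.add_face([0.9, 0.1, 0.0], "frame_0120.jpg", 4.0)
unknown = index.add_face([0.0, 1.0, 0.0], "frame_0300.jpg", 10.0)  # no match, so enrolled
index.add_face([1.0, 0.05, 0.0], "frame_0030.jpg", 1.0)
results = [(d.image, d.time_s) for d in index.query("Alice")]
```

A query by name returns the stored images with their timestamps, which mirrors the "list of images and times" output described above. A linear scan over the gallery is adequate for a sketch; a deployed system would likely need a proper database and a nearest-neighbor index.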
Acknowledgements
We thank the participants in the open-source software project for their careful reading of the code and their many insightful suggestions. This work was supported in part by the Ministry of Science and Technology under grant MOST107-2119-M-035-006.
Cite this article
Lin, FC., Ngo, HH. & Dow, CR. A cloud-based face video retrieval system with deep learning. J Supercomput 76, 8473–8493 (2020). https://doi.org/10.1007/s11227-019-03123-x
DOI: https://doi.org/10.1007/s11227-019-03123-x