A cloud-based face video retrieval system with deep learning

The Journal of Supercomputing

Abstract

Face video retrieval is an attractive research topic in computer vision. However, it remains challenging because of significant variations in pose, illumination conditions, occlusions, and facial expressions. Face recognition plays a vital role in video content analysis. In addition, deep neural networks are being actively studied, and deep learning models have been widely used for object detection and especially for face recognition. Therefore, this study proposes a cloud-based face video retrieval system with deep learning. First, a dataset is collected and pre-processed. To produce a useful dataset for the CNN models, blurry images are removed and face alignment is applied to the remaining images. The final dataset is then constructed and used to pre-train the CNN models (VGGFace, ArcFace, and FaceNet) for face recognition. We compare the results of these three models and choose the most efficient one to develop the system. To issue a query, users type in the name of a person. If the system detects a new person, it enrolls that person. The result is a list of images and the times associated with those images. In addition, a system prototype is implemented to verify the feasibility of the proposed system. Experimental results demonstrate that the system performs well in terms of recognition accuracy and computational time.
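The query step described in the abstract (look up a person by name, enroll unseen people, return images with their times) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the class `FaceIndex`, the cosine-similarity matching rule, the threshold of 0.8, and the automatic `person_N` naming are all assumptions made for this example, and the embeddings are assumed to come from some face recognition model such as the ones the abstract compares.

```python
import math
from collections import defaultdict


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class FaceIndex:
    """Hypothetical name -> appearances index built from face embeddings."""

    def __init__(self, threshold=0.8):
        # Assumed decision threshold; a real system would tune this value.
        self.threshold = threshold
        self.gallery = {}                      # name -> reference embedding
        self.appearances = defaultdict(list)   # name -> [(image_id, seconds)]

    def enroll(self, name, embedding):
        """Store a reference embedding for a named person."""
        self.gallery[name] = list(embedding)

    def identify(self, embedding):
        """Return the best-matching enrolled name, or None if below threshold."""
        best_name, best_score = None, -1.0
        for name, reference in self.gallery.items():
            score = cosine_similarity(embedding, reference)
            if score > best_score:
                best_name, best_score = name, score
        return best_name if best_score >= self.threshold else None

    def add_detection(self, embedding, image_id, seconds):
        """Record one detected face; enroll it under a placeholder name if unknown."""
        name = self.identify(embedding)
        if name is None:
            # Placeholder naming scheme, assumed for illustration only.
            name = f"person_{len(self.gallery) + 1}"
            self.enroll(name, embedding)
        self.appearances[name].append((image_id, seconds))
        return name

    def query(self, name):
        """Return the list of (image_id, seconds) pairs recorded for a name."""
        return list(self.appearances.get(name, []))
```

With this sketch, a caller would enroll known people once, call `add_detection` for each face found in a video frame, and then call `query("alice")` to get the images and times at which that person appears. The three-dimensional vectors used in a quick test stand in for the much longer embeddings a real model produces.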

Acknowledgements

We thank the participants in the open source software project for their careful reading of the code and their many insightful suggestions. This work was supported in part by the Ministry of Science and Technology under grant MOST107-2119-M-035-006.

Author information

Corresponding author

Correspondence to Feng-Cheng Lin.

About this article

Cite this article

Lin, FC., Ngo, HH. & Dow, CR. A cloud-based face video retrieval system with deep learning. J Supercomput 76, 8473–8493 (2020). https://doi.org/10.1007/s11227-019-03123-x

  • DOI: https://doi.org/10.1007/s11227-019-03123-x
