Optimizable face detection and tracking model with occlusion resolution for high quality videos

Mool, Akshay; Panda, J.; Sharma, Kapil

doi:10.1007/s11042-022-11958-5

Published: 15 February 2022

Volume 81, pages 10391–10406, (2022)
Multimedia Tools and Applications

Akshay Mool¹,
J. Panda² &
Kapil Sharma¹

Abstract

Recent state-of-the-art face detection algorithms in computer vision focus heavily on real-time processing. Applications built on these algorithms typically handle low-quality video feeds with few pixels per inch (ppi), a low frame rate, or both. The algorithms perform well on such feeds, but their performance deteriorates on high-quality videos that carry more data per frame. Such videos mostly exist offline, where computer vision applications can use them for post-processing. This paper develops an algorithm that processes high-quality videos at speeds on par with algorithms working on live, low-quality feeds. The proposed algorithm uses Convolutional-MTCNN as its base and speeds it up for high-definition videos. The paper also presents a novel solution to the occlusion problem, detecting partially or fully hidden faces in videos. It does so with probabilistic approaches: given that the face has been identified in the first few frames, they give the algorithm an estimate of where the face should be in the occluded region.
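
To make the occlusion idea concrete, here is a minimal sketch of estimating a face position in an occluded frame from detections in earlier frames. The abstract does not specify the probabilistic model, so this is an assumption, not the authors' method: it swaps in a plain least-squares line fit per box coordinate. The function name `predict_box` and the `(x, y, width, height)` box layout are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# Hypothetical box layout: (x, y, width, height) in pixels.
Box = Tuple[float, float, float, float]


def predict_box(frames: List[int], boxes: List[Box], target_frame: int) -> Box:
    """Estimate a face box at target_frame from earlier detections.

    Assumption: each box coordinate changes roughly linearly with the
    frame index, so an independent least-squares line is fitted to each
    coordinate and evaluated at target_frame.
    """
    if len(frames) != len(boxes) or len(frames) < 2:
        raise ValueError("need at least two (frame, box) observations")

    n = len(frames)
    mean_t = sum(frames) / n
    var_t = sum((t - mean_t) ** 2 for t in frames)
    if var_t == 0:
        raise ValueError("frame indices must not all be equal")

    estimate = []
    for k in range(4):
        values = [box[k] for box in boxes]
        mean_v = sum(values) / n
        # Least-squares slope of this coordinate against the frame index.
        slope = sum(
            (t - mean_t) * (v - mean_v) for t, v in zip(frames, values)
        ) / var_t
        estimate.append(mean_v + slope * (target_frame - mean_t))
    return (estimate[0], estimate[1], estimate[2], estimate[3])


# Face detected in frames 0-2, then hidden; estimate its box in frame 4.
seen_frames = [0, 1, 2]
seen_boxes = [(100, 50, 40, 40), (110, 52, 40, 40), (120, 54, 40, 40)]
guess = predict_box(seen_frames, seen_boxes, target_frame=4)
```

An estimate like `guess` could be used to restrict the detector's search to a small region of a high-definition frame. Whether the paper uses its estimate this way is not stated in the abstract.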

Author information

Authors and Affiliations

Department of Information Technology, Delhi Technological University, Shahbad Daulatpur, Main Bawana Road, Delhi, 110042, India
Akshay Mool & Kapil Sharma
Department of Electronics and Communications Engineering, Delhi Technological University, Shahbad Daulatpur, Main Bawana Road, Delhi, 110042, India
J. Panda

Corresponding author

Correspondence to Akshay Mool.

Ethics declarations

Conflict of Interest

Akshay Mool, J. Panda and Kapil Sharma declare that they have no conflict of interest.

Additional information

Competing interest

The authors declare that they do not have any competing financial interests nor any personal relationships that could seem to have influenced the work presented by this paper.

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Mool, A., Panda, J. & Sharma, K. Optimizable face detection and tracking model with occlusion resolution for high quality videos. Multimed Tools Appl 81, 10391–10406 (2022). https://doi.org/10.1007/s11042-022-11958-5

Received: 09 July 2021
Revised: 16 December 2021
Accepted: 03 January 2022
Published: 15 February 2022
Issue Date: March 2022
DOI: https://doi.org/10.1007/s11042-022-11958-5
