Abstract
This study addresses the low accuracy and slow processing speed of real-time face detection and tracking systems. A margin-based region of interest (ROI) approach with fixed and dynamic margin concepts is proposed to reduce processing time. In addition, a hybrid system is developed to boost accuracy and compensate for the deficiencies of the main detection algorithm. The hybrid approach consists of two routines: a main routine and an escape routine. Three algorithms, namely Haar cascade, Joint cascade, and multitask convolutional neural networks, are each used independently as the main routine to evaluate the effectiveness of the proposed hybrid approach. The escape routine, based on a template matching algorithm, is designed to improve detection accuracy. Two RGB video datasets with diverse face poses, video backgrounds, illumination, video resolutions, expressions, overexposed faces, and occlusions of people in various unseen environments were used for experiments and evaluation. The experimental results confirm that the hybrid approach can detect and track faces in non-frontal orientations with better accuracy and faster processing, i.e., four times faster than conventional full-frame scanning techniques.
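The ideas summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The names `margin_roi`, `match_template`, and `track`, the reading of a "dynamic" margin as a fraction of the face size, the sum-of-squared-differences score, and the `max_ssd` threshold are all assumptions made for illustration; the abstract does not specify them. The `detect` argument stands in for whichever main routine is used (Haar cascade, Joint cascade, or a multitask CNN).

```python
from typing import Callable, List, Optional, Tuple

Box = Tuple[int, int, int, int]   # (x, y, width, height)
Image = List[List[int]]           # grayscale rows (hypothetical format)


def margin_roi(face: Box, frame_w: int, frame_h: int,
               fixed: int = 0, scale: float = 0.0) -> Box:
    """Expand the last face box by a margin and clip it to the frame.

    fixed: constant margin in pixels (fixed-margin variant).
    scale: margin as a fraction of the face size (an assumed reading
           of a "dynamic" margin, not taken from the paper).
    """
    x, y, w, h = face
    mx = fixed + int(scale * w)
    my = fixed + int(scale * h)
    x0, y0 = max(0, x - mx), max(0, y - my)
    x1, y1 = min(frame_w, x + w + mx), min(frame_h, y + h + my)
    return (x0, y0, x1 - x0, y1 - y0)


def crop(img: Image, box: Box) -> Image:
    """Return the sub-image covered by box."""
    x, y, w, h = box
    return [row[x:x + w] for row in img[y:y + h]]


def match_template(region: Image, tmpl: Image) -> Tuple[int, int, float]:
    """Exhaustive sum-of-squared-differences search: (x, y, best_ssd)."""
    th, tw = len(tmpl), len(tmpl[0])
    best: Tuple[int, int, float] = (0, 0, float("inf"))
    for y in range(len(region) - th + 1):
        for x in range(len(region[0]) - tw + 1):
            ssd = sum((region[y + j][x + i] - tmpl[j][i]) ** 2
                      for j in range(th) for i in range(tw))
            if ssd < best[2]:
                best = (x, y, ssd)
    return best


def track(frame: Image, last_face: Box, template: Image,
          detect: Callable[[Image], Optional[Box]],
          max_ssd: float, fixed: int = 8, scale: float = 0.5) -> Optional[Box]:
    """Run the main routine on the ROI, then the escape routine if it fails."""
    roi = margin_roi(last_face, len(frame[0]), len(frame), fixed, scale)
    patch = crop(frame, roi)
    found = detect(patch)                    # main routine, ROI only
    if found is None:                        # escape routine
        x, y, ssd = match_template(patch, template)
        if ssd <= max_ssd:
            found = (x, y, len(template[0]), len(template))
    if found is None:
        return None                          # caller may rescan the full frame
    fx, fy, fw, fh = found
    return (roi[0] + fx, roi[1] + fy, fw, fh)  # back to frame coordinates
```

Because the detector only sees the cropped ROI rather than the whole frame, the per-frame search area shrinks, which is the intuition behind the reported speed-up. Returning `None` leaves the choice of fallback (for example, a full-frame scan) to the caller.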
Ethics declarations
Conflict of interest
All the authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Cite this article
Rehman, B., Ong, W.H., Tan, A.C.H. et al. Face detection and tracking using hybrid margin-based ROI techniques. Vis Comput 36, 633–647 (2020). https://doi.org/10.1007/s00371-019-01649-y
DOI: https://doi.org/10.1007/s00371-019-01649-y