MTCNN++: A CNN-based face detection algorithm inspired by MTCNN

Khan, Soumya Suvra; Sengupta, Diganta; Ghosh, Anupam; Chaudhuri, Atal

doi:10.1007/s00371-023-02822-0

Original article
Published: 11 April 2023

The Visual Computer, Volume 40, pages 899–917 (2024)

Soumya Suvra Khan¹,
Diganta Sengupta ORCID: orcid.org/0000-0002-7792-0388²,
Anupam Ghosh³ &
…
Atal Chaudhuri⁴

Abstract

Growing security concerns in crowd-centric settings have raised worldwide interest in reliable face recognition systems. Several frameworks have been proposed to date, for example Haar Cascade, MTCNN, and Dlib. In this communication, we propose a deep neural network for reliable face recognition in images with a high density of faces. The proposed framework is inspired by multi-task cascaded convolutional neural networks (MTCNN), hence the name MTCNN++. In this framework, we modify the layer density by increasing the neuron count in all three internal networks of MTCNN, viz. P-Net, R-Net, and O-Net, and observe that the modified MTCNN (MTCNN++) performs as well as the MTCNN library or better. Moreover, 20% dropout is used to tune the framework for better recognition of faces, in terms of both face clarity and face count. MTCNN++ exhibits better results because preprocessing is done dynamically, in contrast to the previous versions. The model was trained on a dataset comprising 113,586 human faces in 9661 images. The dataset comprises photographs from varied events and thus presents multiple human expressions. The accuracy of the model varies from 87.7% (average of 12 faces per image) to 99.7% (average of 2 faces per image). The proposed framework fares better with a large face count per image. MTCNN++ has further been compared with other proposals in the literature, and the results are favorable.
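To make two of the figures in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It shows standard inverted dropout at the stated 20% rate and the mean face density implied by the training set. The function names `inverted_dropout` and `mean_faces_per_image` are hypothetical, and the abstract does not say where in P-Net, R-Net, or O-Net the dropout is applied.

```python
import random

DROPOUT_RATE = 0.2  # the 20% dropout stated in the abstract


def inverted_dropout(activations, rate=DROPOUT_RATE, training=True, rng=None):
    """Zero each activation with probability `rate` and rescale the survivors.

    This is generic inverted dropout (an assumption about the variant used);
    the abstract only gives the rate, not the placement or the exact scheme.
    """
    if not training or rate == 0.0:
        # At inference time dropout is a no-op.
        return list(activations)
    rng = rng or random.Random()
    keep = 1.0 - rate
    # Survivors are divided by the keep probability so the expected
    # activation is unchanged between training and inference.
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


def mean_faces_per_image(n_faces, n_images):
    """Average number of annotated faces per image in a dataset."""
    return n_faces / n_images


if __name__ == "__main__":
    rng = random.Random(0)
    out = inverted_dropout([1.0] * 10000, rng=rng)
    dropped = sum(1 for v in out if v == 0.0) / len(out)
    print(f"fraction dropped: {dropped:.3f}")
    # Dataset figures from the abstract: 113,586 faces in 9661 images.
    print(f"mean faces per image: {mean_faces_per_image(113586, 9661):.2f}")
```

The training set therefore averages roughly 11.76 faces per image, which sits at the dense end of the range the abstract reports accuracy for (2 to 12 faces per image).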

Data availability statement

The authors state that the manuscript has no associated dataset.

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Meghnad Saha Institute of Technology, Kolkata, West Bengal, India
Soumya Suvra Khan
Department of Computer Science and Engineering and Computer Science and Business Systems, Meghnad Saha Institute of Technology, Kolkata, West Bengal, India
Diganta Sengupta
Department of Computer Science and Engineering, Netaji Subhash Engineering College, Kolkata, West Bengal, India
Anupam Ghosh
Department of Computer Science and Engineering, Jadavpur University, Kolkata, West Bengal, India
Atal Chaudhuri

Corresponding author

Correspondence to Diganta Sengupta.

Ethics declarations

Conflict of interest

The authors have no conflict of interest in this study.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Khan, S.S., Sengupta, D., Ghosh, A. et al. MTCNN++: A CNN-based face detection algorithm inspired by MTCNN. Vis Comput 40, 899–917 (2024). https://doi.org/10.1007/s00371-023-02822-0

Accepted: 25 February 2023
Published: 11 April 2023
Issue Date: February 2024
DOI: https://doi.org/10.1007/s00371-023-02822-0
