MTCNN++: A CNN-based face detection algorithm inspired by MTCNN

  • Original article
  • Published in: The Visual Computer

Abstract

Increasing security concerns in crowd-centric settings have raised major interest worldwide in reliable face recognition systems. In this context, several frameworks have been proposed to date, for example Haar Cascade, MTCNN, and Dlib, to name a few. In this communication, we propose a deep neural network for reliable face recognition in images with high face density. The proposed framework is inspired by multi-task cascaded convolutional neural networks (MTCNN), hence the name MTCNN++. In this framework, we modify the layer density by increasing the neuron count in all three internal networks of MTCNN, viz. P-Net, R-Net, and O-Net, and observe that the modified Net-Layer MTCNN (MTCNN++) performs as well as or better than the MTCNN library. Moreover, 20% dropout is used to tune the framework for better recognition of faces, in terms of both face clarity and face count. MTCNN++ exhibits better results because the preprocessing is done dynamically, in contrast to previous versions. The model was trained on a dataset comprising 113,586 human faces across 9661 images. The dataset comprises photographs from varied events, thereby presenting multiple human expressions. The accuracy of the model varies from 87.7% (average of 12 faces per image) to 99.7% (average of 2 faces per image). The proposed framework fares better with a large face count per image. MTCNN++ has further been compared with other proposals in the literature, and the results are appreciable.
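The abstract does not state the modified layer sizes, so the following is only a rough sketch of what the two changes it names (wider layers and 20% dropout) imply. It assumes the three 3x3 convolution layers of the original MTCNN P-Net use 10, 16, and 32 filters, and it invents a doubled-width variant purely for illustration; `WIDENED_PNET`, `trunk_params`, and `dropout` are hypothetical names, not part of MTCNN++ or of any MTCNN library.

```python
import random


def conv_params(kernel, c_in, c_out):
    """Weights plus biases of one square 2-D convolution layer."""
    return (kernel * kernel * c_in + 1) * c_out


def trunk_params(filters, kernel=3, c_in=3):
    """Total parameters of a stack of conv layers fed by a c_in-channel image."""
    total = 0
    for c_out in filters:
        total += conv_params(kernel, c_in, c_out)
        c_in = c_out  # the next layer consumes this layer's output channels
    return total


# Assumed filter counts of the original P-Net convolution trunk.
BASELINE_PNET = [10, 16, 32]
# Hypothetical widened variant: doubling is an assumption, not the paper's values.
WIDENED_PNET = [2 * f for f in BASELINE_PNET]


def dropout(values, rate=0.2, rng=random):
    """Inverted dropout: zero each activation with probability `rate` and
    rescale the survivors by 1 / (1 - rate) so the expected sum is unchanged."""
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]


print(trunk_params(BASELINE_PNET))  # 6376
print(trunk_params(WIDENED_PNET))   # 24848
print(dropout([1.0] * 10, rate=0.2, rng=random.Random(0)))
```

Under these assumptions, doubling the filter counts roughly quadruples the trunk's parameter count, which is the kind of capacity increase that dropout is commonly used to regularize.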


Data availability statement

The authors state that the manuscript has no associated dataset.


Author information

Corresponding author

Correspondence to Diganta Sengupta.

Ethics declarations

Conflict of interest

The authors have no conflict of interest in this study.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Khan, S.S., Sengupta, D., Ghosh, A. et al. MTCNN++: A CNN-based face detection algorithm inspired by MTCNN. Vis Comput 40, 899–917 (2024). https://doi.org/10.1007/s00371-023-02822-0

  • DOI: https://doi.org/10.1007/s00371-023-02822-0
