A Face Detection Method Based on Cascade Convolutional Neural Network

  • Wankou Yang (Email author)
  • Lukuan Zhou
  • Tianhuang Li
  • Haoran Wang


Cascades have been widely used in face detection: classifiers with low computational cost are applied first to reject most of the background while preserving recall. This paper proposes a new cascaded convolutional neural network method consisting of two main stages. In the first stage, low-resolution candidate windows are used as input so that a shallow convolutional neural network can quickly extract candidate windows. In the second stage, the windows from the first stage are resized and used as input to the corresponding network layers. During training, joint online training is conducted on hard samples; during testing, the soft non-maximum suppression algorithm is applied to the detections. The whole network achieves improved performance on the FDDB and PASCAL face datasets.
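The abstract names soft non-maximum suppression but does not say which variant is used. The following is a minimal sketch assuming the commonly used Gaussian-decay form, in which a box overlapping a higher-scoring detection has its score reduced rather than being discarded. The names `iou`, `soft_nms`, `sigma` and `score_thresh`, and their default values, are illustrative assumptions and are not taken from the paper.

```python
import math
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical layout


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes: List[Box], scores: List[float],
             sigma: float = 0.5,
             score_thresh: float = 0.001) -> List[Tuple[Box, float]]:
    """Gaussian soft-NMS (assumed variant): decay overlapping scores."""
    remaining = list(zip(boxes, scores))
    kept: List[Tuple[Box, float]] = []
    while remaining:
        # Select the highest-scoring detection still under consideration.
        best_idx = max(range(len(remaining)), key=lambda i: remaining[i][1])
        best_box, best_score = remaining.pop(best_idx)
        kept.append((best_box, best_score))
        # Multiply every other score by exp(-IoU^2 / sigma) and drop
        # detections whose decayed score falls below the threshold.
        decayed = []
        for box, score in remaining:
            new_score = score * math.exp(-(iou(best_box, box) ** 2) / sigma)
            if new_score >= score_thresh:
                decayed.append((box, new_score))
        remaining = decayed
    return kept


# Two heavily overlapping windows and one separate window.
detections = soft_nms(
    [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)],
    [0.9, 0.8, 0.7],
)
```

In this example the second window overlaps the first, so it is retained with a lowered score and is ranked after the non-overlapping window, whereas classic non-maximum suppression with a typical overlap threshold would remove it outright.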


Face detection; Cascade convolution structure; Soft non-maximum suppression



This work is supported by the National Natural Science Foundation (NNSF) of China under Grant Nos. 61473086, 61603080 and 61773117, and by the Jiangsu Key R&D Plan (No. BE2017157).



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  • Wankou Yang (1, 2), Email author
  • Lukuan Zhou (1, 2)
  • Tianhuang Li (1, 2)
  • Haoran Wang (3)
  1. School of Automation, Southeast University, Nanjing, China
  2. Key Lab of Measurement and Control of Complex Systems of Engineering, Ministry of Education, Nanjing, China
  3. College of Information Science and Engineering, Northeastern University, Shenyang, China
