Finding hard faces with better proposals and classifier

Abstract

Recent studies have shown that deep CNNs significantly improve the performance of face detection in the wild. However, detecting faces with small scales, large pose variations, and occlusions remains challenging. In this paper, to detect such challenging faces, we present a boosted version of faster RCNN (F-RCN) with an enhanced region proposal network (eRPN) module and newly introduced hard example mining strategies. The eRPN module generates better proposals than the traditional RPN by integrating semantic information into the input feature maps. Two hard example mining strategies, online hard proposal mining (OHPM) and offline hard image mining (OHIM), are proposed to train a better classifier. OHPM effectively samples hard positive examples with both quality and diversity, which is important for detecting hard faces such as tiny faces. OHIM further boosts the classifier's ability to detect hard faces via an auxiliary fine-tuning stage on a small proportion of the training data. Experimental results on the FDDB, WIDER FACE, Pascal Faces, and AFW datasets show that our method significantly improves the faster RCNN face detector and achieves performance superior or comparable to state-of-the-art face detectors.
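
The abstract does not specify how OHPM selects proposals, so the following is only a minimal sketch of generic loss-based online hard example mining, not the authors' method. It assumes each proposal already has a per-proposal classification loss and a binary label (1 for face, 0 for background). The function name `select_hard_examples` and the parameters `batch_size` and `pos_fraction` are hypothetical and chosen for illustration.

```python
from typing import List, Sequence


def select_hard_examples(
    losses: Sequence[float],
    labels: Sequence[int],
    batch_size: int,
    pos_fraction: float = 0.25,
) -> List[int]:
    """Pick the highest-loss proposals, ranking positives and negatives separately.

    Generic hard example mining sketch (an assumption, not the paper's OHPM):
    positives and negatives are ranked independently so that hard positives
    are not crowded out by the far more numerous background proposals.
    """
    # Indices of positive (face) proposals, hardest (largest loss) first.
    pos = sorted((i for i, y in enumerate(labels) if y == 1),
                 key=lambda i: -losses[i])
    # Indices of negative (background) proposals, hardest first.
    neg = sorted((i for i, y in enumerate(labels) if y == 0),
                 key=lambda i: -losses[i])

    # Cap the number of positives by the requested fraction of the mini-batch.
    n_pos = min(len(pos), int(round(batch_size * pos_fraction)))
    # Fill the remainder of the mini-batch with the hardest negatives.
    n_neg = min(len(neg), batch_size - n_pos)
    return pos[:n_pos] + neg[:n_neg]


if __name__ == "__main__":
    losses = [0.1, 2.0, 0.5, 1.5, 0.05, 0.9]
    labels = [1, 1, 0, 0, 0, 1]
    print(select_hard_examples(losses, labels, batch_size=4, pos_fraction=0.5))
```

Only the selected indices would contribute to the classifier's gradient in a training step. Ranking the two classes separately is one simple way to keep hard positives, such as tiny faces, represented in every mini-batch.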

Notes

  1. https://github.com/rbgirshick/py-faster-rcnn.

Acknowledgements

This work was supported by the National Natural Science Foundation of China (U1613211, U1813218) and the Shenzhen Research Program (JCYJ20170818164704758, JCYJ20150925163005055).

Author information

Corresponding author

Correspondence to Yu Qiao.

About this article

Cite this article

Zeng, X., Peng, X., Wang, Y. et al. Finding hard faces with better proposals and classifier. Machine Vision and Applications 31, 61 (2020). https://doi.org/10.1007/s00138-020-01110-4

  • DOI: https://doi.org/10.1007/s00138-020-01110-4

Keywords

  • Face detection
  • Convolutional neural networks
  • Object detection