A Fast Deep Convolutional Neural Network for Face Detection in Big Visual Data

Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 529)

Abstract

Deep learning methods are powerful, but they often require expensive computation and produce highly complex models that must be trained on large amounts of data. In this paper, we consider the problem of face detection and propose a lightweight deep convolutional neural network that achieves a state-of-the-art recall rate of 90% on the challenging FDDB dataset. Our model is designed to minimize both training time and run time, and it outperforms the convolutional network used in [2] for the same task. Our model has only 76,554 free parameters, whereas the previously proposed CNN for face detection had 60 million. It also requires 250 times fewer floating-point operations than AlexNet. We propose a new training method that gradually increases the difficulty of both negative and positive examples and that proved to drastically improve training speed and accuracy. The proposed method detects faces under severe occlusion and unconstrained pose variation, and it copes with the difficulties and large variations of real-world face detection.
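As a rough illustration of the size comparison in the abstract, the sketch below computes the ratio between the two parameter counts quoted above and shows the standard formula for counting the free parameters of a single convolutional layer. The function name `conv_params` and the example layer shape are hypothetical and are not taken from the paper's architecture.

```python
def conv_params(kernel_h, kernel_w, in_channels, out_channels, bias=True):
    """Free parameters of one 2-D convolutional layer (standard formula)."""
    weights = kernel_h * kernel_w * in_channels * out_channels
    return weights + (out_channels if bias else 0)


# Figures quoted in the abstract.
PROPOSED_PARAMS = 76_554
PRIOR_CNN_PARAMS = 60_000_000

# How many times smaller the proposed model is, by parameter count.
reduction = PRIOR_CNN_PARAMS / PROPOSED_PARAMS

# Hypothetical layer (NOT from the paper): 3x3 kernels, RGB input, 16 filters.
example_layer = conv_params(3, 3, 3, 16)

if __name__ == "__main__":
    print(f"parameter reduction: about {reduction:.0f}x")
    print(f"example 3x3x3 -> 16 layer: {example_layer} parameters")
```

Under the quoted figures, the proposed network has roughly 780 times fewer parameters than the earlier CNN.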

References

  1. Kotropoulos, C., Tefas, A., Pitas, I.: Frontal face authentication using variants of dynamic link matching based on mathematical morphology. In: Proceedings of IEEE International Conference on Image Processing (ICIP 1998), Chicago, USA, vol. 1, pp. 122–126, 4–7 October 1998
  2. Farfade, S.S., Saberian, M., Li, L.-J.: Multi-view face detection using deep convolutional neural networks. In: ICMR (2015)
  3. Viola, P., Jones, M.J.: Robust real-time face detection. Int. J. Comput. Vis. 57, 137–154 (2004)
  4. Chen, D., Ren, S., Wei, Y., Cao, X., Sun, J.: Joint cascade face detection and alignment. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8694, pp. 109–122. Springer, Heidelberg (2014). doi:10.1007/978-3-319-10599-4_8
  5. Yang, B., Yan, J., Lei, Z., Li, S.: Aggregate channel features for multi-view face detection. In: IEEE International Joint Conference on Biometrics (2014)
  6. Jones, M., Viola, P.: Fast multi-view face detection. In: Proceedings of CVPR (2003)
  7. Wu, B., Ai, H., Huang, C., Lao, S.: Fast rotation invariant multi-view face detection based on real adaboost. In: Proceedings of IEEE Automatic Face and Gesture Recognition (2004)
  8. Li, S.Z., Zhu, L., Zhang, Z.Q., Blake, A., Zhang, H.J., Shum, H.: Statistical learning of multi-view face detection. In: Heyden, A., Sparr, G., Nielsen, M., Johansen, P. (eds.) ECCV 2002. LNCS, vol. 2353, pp. 67–81. Springer, Heidelberg (2002). doi:10.1007/3-540-47979-1_5
  9. Li, J., Zhang, Y.: Learning surf cascade for fast and accurate object detection. In: CVPR (2013)
  10. Jun, B., Choi, I., Kim, D.: Local transform features and hybridization for accurate face and human detection. IEEE Trans. Pattern Anal. Mach. Intell. 35, 1423–1436 (2013)
  11. Mathias, M., Benenson, R., Pedersoli, M., Van Gool, L.: Face detection without bells and whistles. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8692, pp. 720–735. Springer, Heidelberg (2014). doi:10.1007/978-3-319-10593-2_47
  12. Felzenszwalb, P., McAllester, D., Ramanan, D.: A discriminatively trained, multiscale, deformable part model. In: Proceedings of CVPR (2008)
  13. Felzenszwalb, P.F., Girshick, R.B., McAllester, D.: Cascade object detection with deformable part models. In: Computer Vision and Pattern Recognition (2010)
  14. Ranjan, R., Patel, V.M., Chellappa, R.: A deep pyramid deformable part model for face detection. In: International Conference on Biometrics Theory, Applications and Systems (2015)
  15. Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. In: Proceedings of NIPS (2012)
  16. Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of CVPR (2014)
  17. Koestinger, M., Wohlhart, P., Roth, P.M., Bischof, H.: Annotated facial landmarks in the wild: a large-scale, real-world database for facial landmark localization. In: Proceedings of IEEE International Workshop on Benchmarking Facial Image Analysis Technologies (2011)
  18. Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R., Guadarrama, S., Darrell, T.: Caffe: convolutional architecture for fast feature embedding. arXiv preprint arXiv:1408.5093 (2014)
  19. Jain, V., Learned-Miller, E.: FDDB: a benchmark for face detection in unconstrained settings. Technical Report UM-CS-2010-009, University of Massachusetts, Amherst (2010)
  20. He, K., Zhang, X., Ren, S., Sun, J.: Delving deep into rectifiers: surpassing human-level performance on imagenet classification. In: IEEE International Conference on Computer Vision (ICCV) (2015)
  21. Zhang, Z., Luo, P., Loy, C.C., Tang, X.: Facial landmark detection by deep multi-task learning. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8694, pp. 94–108. Springer, Heidelberg (2014). doi:10.1007/978-3-319-10599-4_7
  22. Glorot, X., Bengio, Y.: Understanding the difficulty of training deep feedforward neural networks. In: International Conference on Artificial Intelligence and Statistics (2010)
  23. Yang, S., Luo, P., Loy, C.C., Tang, X.: From facial parts responses to face detection: a deep learning approach. In: IEEE International Conference on Computer Vision (2015)
  24. Yang, B., Yan, J., Lei, Z., Li, S.Z.: Convolutional channel features. In: IEEE International Conference on Computer Vision (2015)
  25. Ranjan, R., Patel, V.M., Chellappa, R.: HyperFace: a deep multi-task learning framework for face detection, landmark localization, pose estimation, and gender recognition. arXiv preprint arXiv:1603.01249 (2016)

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Artificial Intelligence and Information Analysis Lab, Department of Informatics, Aristotle University of Thessaloniki, Thessaloniki, Greece