SGNet: Design of Optimized DCNN for Real-Time Face Detection

  • Seunghyun Lee
  • Minseop Kim
  • Inwhee Joe
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 931)


This paper proposes an optimized deep convolutional neural network (DCNN) for detecting faces and facial landmarks in real-time video. To that end, we reduce the weight size of existing networks and the duplication of weight parameters. By combining the strengths of two previous best-performing algorithms, we overcome the weaknesses of existing methods. Instead of an older search method such as a sliding window, we propose a grid-based one-shot detection method. Furthermore, instead of forwarding each image frame through a single very deep CNN, we divide the process into 3 stages that improve the detection incrementally, which overcomes the existing limitation of grid-based detection. After many experiments with different frameworks, we chose the deep learning frameworks that worked best for integrating the 3-stage DCNN. Using transfer learning, we remove unnecessary convolution layers from the existing DCNN and repeatedly retrain the hidden layers, obtaining the best speed and accuracy that can run on an embedded platform. The resulting network detects small faces better than YOLO v2.
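The abstract does not describe how the grid-based one-shot output is encoded, so the following is only a minimal sketch of the general idea, assuming a YOLO-style layout: the image is split into a grid, and each cell emits one score and one box in a single forward pass, with no sliding window. The function name `decode_grid`, the five-value cell layout, and the threshold are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

# (x_min, y_min, x_max, y_max, score) in pixels; hypothetical output format.
Box = Tuple[float, float, float, float, float]
# (score, cx_off, cy_off, w_frac, h_frac); assumed per-cell prediction layout.
Cell = Tuple[float, float, float, float, float]


def decode_grid(pred: Sequence[Sequence[Cell]],
                img_w: float, img_h: float,
                threshold: float = 0.5) -> List[Box]:
    """Turn a grid of per-cell predictions into pixel-space boxes.

    cx_off and cy_off are the box centre offsets inside the cell (0..1);
    w_frac and h_frac are the box size as a fraction of the whole image.
    """
    rows, cols = len(pred), len(pred[0])
    cell_w, cell_h = img_w / cols, img_h / rows
    boxes: List[Box] = []
    for r, row in enumerate(pred):
        for c, (score, cx_off, cy_off, w_frac, h_frac) in enumerate(row):
            if score < threshold:
                continue  # this cell does not claim a face
            cx = (c + cx_off) * cell_w   # box centre in pixels
            cy = (r + cy_off) * cell_h
            w, h = w_frac * img_w, h_frac * img_h
            boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2, score))
    return boxes
```

In a cascade such as the 3-stage design mentioned above, boxes like these would presumably be passed to later stages for refinement; how the authors do that is not stated here.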


DCNN · Scalable face detection · Transfer learning · Grid-based one-shot detection method



This work was supported by the Technology Development Program (S2521883) funded by the Ministry of SMEs and Startups (MSS, Korea).



Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  1. Department of Computer Software, Hanyang University, Seoul, Korea
