Semi-automatic Facial Key-Point Dataset Creation

  • Miroslav Hlaváč
  • Ivan Gruber
  • Miloš Železný
  • Alexey Karpov
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10458)


This paper presents a semi-automatic method for creating a large-scale facial key-point dataset from a small number of annotated images. The method consists of annotating a set of facial images by hand, training an Active Appearance Model (AAM) on the annotated images, and then using the AAM to annotate a large number of additional images for the purpose of training a neural network. The images annotated by the AAM are then re-annotated by the neural network and used to validate the precision of the network's detections. The neural network architecture is presented together with its training parameters.
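The abstract does not state how the precision of the detections is measured. As an illustration only, the following minimal sketch assumes the comparison is a mean Euclidean distance in pixels between the key-points produced by the AAM and those produced by the neural network. The function name `mean_keypoint_error` and the sample coordinates are hypothetical and are not taken from the paper.

```python
import math


def mean_keypoint_error(reference, predicted):
    """Mean Euclidean distance between matching (x, y) key-points.

    Assumption: both inputs list the same key-points in the same order.
    """
    if len(reference) != len(predicted):
        raise ValueError("key-point sets must have the same length")
    if not reference:
        raise ValueError("key-point sets must not be empty")
    distances = [math.dist(r, p) for r, p in zip(reference, predicted)]
    return sum(distances) / len(distances)


# Hypothetical data: three key-points annotated by the AAM and by the network.
aam_points = [(10.0, 20.0), (30.0, 40.0), (50.0, 60.0)]
net_points = [(13.0, 24.0), (30.0, 40.0), (50.0, 66.0)]

error = mean_keypoint_error(aam_points, net_points)
print(f"mean key-point error: {error:.3f} px")
```

A per-image score of this kind could be averaged over the whole dataset. In practice it is often normalized by a face-size measure, but that choice is likewise an assumption here.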


Key-points, Dataset, Annotation, Neural networks, Active appearance model, Images, Lips



This work is supported by a grant of the University of West Bohemia, project No. SGS-2016-039, by the Ministry of Education, Youth and Sports of the Czech Republic, project No. LO1506, by the Russian Foundation for Basic Research, projects No. 15-07-04415 and 16-37-60100, and by the Government of Russia, grant No. 074-U01. Computational resources were supplied by the Ministry of Education, Youth and Sports of the Czech Republic under the Projects CESNET (Project No. LM2015042) and CERIT-Scientific Cloud (Project No. LM2015085), provided within the program Projects of Large Research, Development and Innovations Infrastructures.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Miroslav Hlaváč (1, 2, 3; corresponding author)
  • Ivan Gruber (1, 2, 3)
  • Miloš Železný (1)
  • Alexey Karpov (3, 4)

  1. Department of Cybernetics, Faculty of Applied Sciences, UWB, Pilsen, Czech Republic
  2. Faculty of Applied Sciences, NTIS, UWB, Pilsen, Czech Republic
  3. ITMO University, St. Petersburg, Russia
  4. SPIIRAS, St. Petersburg, Russia
