Skip to main content

Semi-automatic Facial Key-Point Dataset Creation

  • 2014 Accesses

Part of the Lecture Notes in Computer Science book series (LNAI,volume 10458)


This paper presents a semi-automatic method for creating a large scale facial key-point dataset from a small number of annotated images. The method consists of annotating the facial images by hand, training Active Appearance Model (AAM) from the annotated images and then using the AAM to annotate a large number of additional images for the purpose of training a neural network. The images from the AAM are then re-annotated by the neural network and used to validate the precision of the proposed neural network detections. The neural network architecture is presented including the training parameters.


  • Key-points
  • Dataset
  • Annotation
  • Neural networks
  • Active appearance model
  • Images
  • Lips

This is a preview of subscription content, access via your institution.

Buying options

USD   29.95
Price excludes VAT (USA)
  • DOI: 10.1007/978-3-319-66429-3_66
  • Chapter length: 7 pages
  • Instant PDF download
  • Readable on all devices
  • Own it forever
  • Exclusive offer for individuals only
  • Tax calculation will be finalised during checkout
USD   109.00
Price excludes VAT (USA)
  • ISBN: 978-3-319-66429-3
  • Instant PDF download
  • Readable on all devices
  • Own it forever
  • Exclusive offer for individuals only
  • Tax calculation will be finalised during checkout
Softcover Book
USD   139.99
Price excludes VAT (USA)
Fig. 1.
Fig. 2.
Fig. 3.


  1. Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean, J., Devin, M., Ghemawat, S., Irving, G., Isard, M., et al.: TensorFlow: a system for large-scale machine learning. In: Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI), Savannah, Georgia, USA (2016)

    Google Scholar 

  2. Barney, H., Haworth, F., Dunn, H.: An experimental transistorized artificial larynx. Bell Syst. Tech. J. 38(6), 1337–1356 (1959)

    CrossRef  Google Scholar 

  3. Chatfield, K., Simonyan, K., Vedaldi, A., Zisserman, A.: Return of the devil in the details: delving deep into convolutional nets. arXiv preprint arXiv:1405.3531 (2014)

  4. Chollet, F., et al.: Keras: deep learning library for Theano and TensorFlow (2015). (2015)

  5. Chung, J.S., Senior, A., Vinyals, O., Zisserman, A.: Lip reading sentences in the wild. arXiv preprint arXiv:1611.05358 (2016)

  6. Chung, J., Zisserman, A.: Lip reading in the wild. In: Asian Conference on Computer Vision (2016)

    Google Scholar 

  7. Cootes, T.F., Taylor, C.J., et al.: Statistical models of appearance for computer vision (2004)

    Google Scholar 

  8. Gruber, I., Hlaváč, M., Hrúz, M., Železný, M., Karpov, A.: An analysis of visual faces datasets. In: Ronzhin, A., Rigoll, G., Meshcheryakov, R. (eds.) ICR 2016. LNCS, vol. 9812, pp. 18–26. Springer, Cham (2016). doi:10.1007/978-3-319-43955-6_3

    CrossRef  Google Scholar 

  9. Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167 (2015)

  10. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Pereira, F., Burges, C.J.C., Bottou, L., Weinberger, K.Q. (eds.) Advances in Neural Information Processing Systems, vol. 25, pp. 1097–1105. Curran Associates Inc., Red Hook (2012)

    Google Scholar 

  11. Matthews, I., Baker, S.: Active appearance models revisited. Int. J. Comput. Vision 60(2), 135–164 (2004)

    CrossRef  Google Scholar 

  12. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. CoRR abs/1409.1556 (2014)

    Google Scholar 

  13. Srivastava, N., Hinton, G.E., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(1), 1929–1958 (2014)

    MathSciNet  MATH  Google Scholar 

  14. Sun, Y., Wang, X., Tang, X.: Deep convolutional network cascade for facial point detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 3476–3483 (2013)

    Google Scholar 

  15. Tomasi, C., Kanade, T.: Selecting and tracking features for image sequence analysis. Robotics and Automation (1992)

    Google Scholar 

Download references


This work is supported by grant of the University of West Bohemia, project No. SGS-2016-039, by Ministry of Education, Youth and Sports of Czech Republic, project No. LO1506, by Russian Foundation for Basic Research, projects No. 15-07-04415 and 16-37-60100, and by the Government of Russia, grant No. 074-U01. Computational resources were supplied by the Ministry of Education, Youth and Sports of the Czech Republic under the Projects CESNET (Project No. LM2015042) and CERIT-Scientific Cloud (Project No. LM2015085) provided within the program Projects of Large Research, Development and Innovations Infrastructures.

Author information

Authors and Affiliations


Corresponding author

Correspondence to Miroslav Hlaváč .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and Permissions

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Hlaváč, M., Gruber, I., Železný, M., Karpov, A. (2017). Semi-automatic Facial Key-Point Dataset Creation. In: Karpov, A., Potapova, R., Mporas, I. (eds) Speech and Computer. SPECOM 2017. Lecture Notes in Computer Science(), vol 10458. Springer, Cham.

Download citation

  • DOI:

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-66428-6

  • Online ISBN: 978-3-319-66429-3

  • eBook Packages: Computer ScienceComputer Science (R0)