Semi-automatic Facial Key-Point Dataset Creation
This paper presents a semi-automatic method for creating a large-scale facial key-point dataset from a small number of annotated images. The method consists of annotating a small set of facial images by hand, training an Active Appearance Model (AAM) on the annotated images, and then using the AAM to annotate a large number of additional images for the purpose of training a neural network. The AAM-annotated images are then re-annotated by the neural network and used to validate the precision of the network's detections. The neural network architecture is presented together with its training parameters.
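The validation step, in which network detections are compared against the AAM annotations, can be illustrated with a simple error measure. The sketch below is only an assumption about how such a comparison might be computed: the abstract does not state the metric, so the mean Euclidean distance, the function name `mean_keypoint_error`, and the toy coordinates are all hypothetical.

```python
import numpy as np


def mean_keypoint_error(reference, predicted):
    """Mean Euclidean distance between matching key-points.

    Both inputs have shape (n_images, n_points, 2) and hold (x, y)
    pixel coordinates. This metric is an assumed stand-in, not
    necessarily the one used in the paper.
    """
    reference = np.asarray(reference, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    if reference.shape != predicted.shape:
        raise ValueError("reference and predicted must have the same shape")
    # Distance per key-point, then averaged over all points and images.
    distances = np.linalg.norm(reference - predicted, axis=-1)
    return float(distances.mean())


# Toy data (hypothetical): one image with two key-points.
aam_points = np.array([[[10.0, 20.0], [30.0, 40.0]]])  # AAM annotations
net_points = np.array([[[13.0, 24.0], [30.0, 40.0]]])  # network detections

error = mean_keypoint_error(aam_points, net_points)
print(f"mean key-point error: {error:.2f} px")
```

A lower value indicates that the network reproduces the AAM annotations more closely. In practice the error is often normalized by a face-size measure so that images at different scales are comparable.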
Keywords: Key-points, Dataset, Annotation, Neural networks, Active appearance model, Images, Lips
This work is supported by a grant of the University of West Bohemia, project No. SGS-2016-039, by the Ministry of Education, Youth and Sports of the Czech Republic, project No. LO1506, by the Russian Foundation for Basic Research, projects No. 15-07-04415 and 16-37-60100, and by the Government of Russia, grant No. 074-U01. Computational resources were supplied by the Ministry of Education, Youth and Sports of the Czech Republic under the projects CESNET (Project No. LM2015042) and CERIT-Scientific Cloud (Project No. LM2015085), provided within the program Projects of Large Research, Development and Innovations Infrastructures.