The Effects of Regularization on Learning Facial Expressions with Convolutional Neural Networks
Convolutional neural networks (CNNs) have become effective instruments for facial expression recognition. Deep CNNs with many layers can achieve very good results by learning a rich internal representation of the data. However, their high complexity also makes them prone to overfitting, so regularization techniques are needed to improve performance and minimize overfitting. It is not yet clear, though, how these regularization techniques affect the learned representation of faces. In this paper we examine the effects of regularization techniques on the training and performance of CNNs and on their learned features. We train a CNN using dropout, max pooling dropout, batch normalization, and different combinations of these three. We show that a combination of these methods can have a considerable impact on the performance of a CNN, almost halving its validation error. Finally, a visualization technique is applied to the CNNs to highlight their activations for different inputs, illustrating a significant difference between a standard CNN and a regularized CNN.
Keywords: Convolutional neural network · Facial expression recognition · Regularization · Batch normalization · Dropout · Max pooling dropout
This work was partially supported by the CAPES Brazilian Federal Agency for the Support and Evaluation of Graduate Education (p.n.5951–13–5), the German Research Foundation DFG under project CML (TRR 169), and the Hamburg Landesforschungsförderungsprojekt.
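The three regularization techniques compared in the paper can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names are hypothetical, dropout is shown in its standard inverted form, max pooling dropout is simplified to non-overlapping 2D windows where units are dropped before the max is taken, and batch normalization is shown in training mode only (batch statistics, no running averages).

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(x, p, training=True):
    # Inverted dropout: zero each unit with probability p and
    # rescale survivors by 1/(1-p) so expected activation is unchanged.
    if not training or p == 0.0:
        return x
    mask = rng.random(x.shape) >= p
    return x * mask / (1.0 - p)

def max_pool_dropout(x, p, pool=2, training=True):
    # Max pooling dropout: randomly zero units inside each pooling
    # window *before* taking the max, so suppressed units give
    # smaller activations a chance to be selected.
    if training and p > 0.0:
        x = x * (rng.random(x.shape) >= p)
    h, w = x.shape
    return x.reshape(h // pool, pool, w // pool, pool).max(axis=(1, 3))

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    # Training-mode batch normalization over the batch axis (axis 0):
    # standardize with batch statistics, then scale and shift.
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta
```

In a full CNN these operations would be interleaved with convolutional layers; combining them (e.g., batch normalization after convolutions plus dropout before pooling) is the kind of configuration the paper evaluates.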