Emotional Video Scene Retrieval Using Multilayer Convolutional Network
In order to retrieve impressive scenes from a video database, we propose a scene retrieval method based on facial expression recognition (FER). The proposed method is useful for retrieving interesting scenes from lifelog videos: when an impressive event occurs, a characteristic facial expression tends to appear on the face of a person in the video. Precisely recognizing that facial expression is therefore essential for impressive scene retrieval. In this paper, we construct accurate FER models by introducing a learning framework based on a multilayer convolutional network, using a number of facial features defined as the positional relations between facial feature points. The effectiveness of the proposed method is evaluated through an experiment retrieving emotional scenes from a lifelog video database.
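As a minimal sketch of the kind of facial features the abstract describes, the snippet below computes pairwise Euclidean distances between detected facial feature points. This is an illustrative assumption about how "positional relations between facial feature points" might be encoded, not the paper's exact feature definition; the function name and the five example points are hypothetical.

```python
import numpy as np

def landmark_distance_features(landmarks):
    """Pairwise Euclidean distances between facial feature points.

    `landmarks` is an (N, 2) array of (x, y) coordinates of facial
    feature points (e.g. eye corners, mouth corners). The resulting
    N*(N-1)/2 distances form one possible positional-relation feature
    vector; this layout is an assumption for illustration.
    """
    # Vector differences between every pair of points: shape (N, N, 2).
    diffs = landmarks[:, None, :] - landmarks[None, :, :]
    # Euclidean distance matrix: shape (N, N).
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # Keep only the upper triangle (each pair once, no self-distances).
    iu = np.triu_indices(len(landmarks), k=1)
    return dists[iu]

# Example: 5 facial feature points yield 5*4/2 = 10 distance features.
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 1.0],
                   [0.2, 0.5], [0.8, 0.5]])
features = landmark_distance_features(points)
```

Such a feature vector could then serve as input to an FER model such as the multilayer convolutional network described in the paper.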
Keywords: Facial Expression · Facial Feature · Facial Expression Recognition · Video Database · Bias Vector
This research was supported by the Japan Society for the Promotion of Science, Grant-in-Aid for Young Scientists (B), 15K15993.