Multimodal Emotion Classification in Naturalistic User Behavior
Designing intelligent, personalized interactive systems that have knowledge of the user's state, desires, needs, and wishes currently poses a great challenge to computer scientists. In this study we propose an information fusion approach that combines acoustic and bio-physiological data from multiple sensors to classify emotional states. For this purpose a multimodal corpus was created in which subjects undergo a controlled emotion-eliciting experiment that passes through several octants of the valence-arousal-dominance space. Temporal and decision-level fusion of the multiple modalities outperforms the single-modality classifiers and shows promising results.
Keywords: Emotion Recognition, Confusion Matrix, Information Fusion, Audio Feature, Decision Fusion
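As a rough illustration of what temporal and decision-level fusion can look like, the Python sketch below first averages the per-window class posteriors of each modality over time and then combines the modalities with a weighted mean. It is not the authors' implementation: the averaging rule, the optional weights, the function names, and the example numbers are all assumptions made for illustration only.

```python
from typing import Dict, List, Optional, Sequence


def temporal_fusion(window_posteriors: Sequence[Sequence[float]]) -> List[float]:
    """Average the class posteriors of one modality over its time windows."""
    n_windows = len(window_posteriors)
    n_classes = len(window_posteriors[0])
    return [
        sum(window[c] for window in window_posteriors) / n_windows
        for c in range(n_classes)
    ]


def decision_fusion(
    modality_posteriors: Dict[str, Sequence[float]],
    weights: Optional[Dict[str, float]] = None,
) -> List[float]:
    """Combine per-modality posteriors with a (hypothetical) weighted mean."""
    if weights is None:
        # Assumption: every modality is trusted equally.
        weights = {name: 1.0 for name in modality_posteriors}
    total_weight = sum(weights[name] for name in modality_posteriors)
    n_classes = len(next(iter(modality_posteriors.values())))
    return [
        sum(weights[name] * post[c] for name, post in modality_posteriors.items())
        / total_weight
        for c in range(n_classes)
    ]


# Made-up posteriors for two hypothetical classes; each inner list is one window.
audio_windows = [[0.6, 0.4], [0.4, 0.6], [0.7, 0.3]]
physio_windows = [[0.3, 0.7], [0.4, 0.6]]

per_modality = {
    "audio": temporal_fusion(audio_windows),
    "physiology": temporal_fusion(physio_windows),
}
fused = decision_fusion(per_modality)

# Index of the class with the highest fused posterior.
predicted_class = max(range(len(fused)), key=lambda c: fused[c])
print(fused, predicted_class)
```

Because each modality is reduced to a posterior vector before the modalities are combined, sensors with different sampling rates or window lengths can be fused without aligning their raw features. Other combination rules (product, maximum, or a trained fusion mapping) would slot into `decision_fusion` in the same way.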