Dimensionality Reduction and Classification Analysis on the Audio Section of the SEMAINE Database

  • Ricardo A. Calix
  • Mehdi A. Khazaeli
  • Leili Javadpour
  • Gerald M. Knapp
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6975)


This paper presents an analysis of the audio section of the SEMAINE database for affect detection. Chi-square and principal component analysis (PCA) techniques are used to reduce the dimensionality of the audio datasets. After dimensionality reduction, several classification techniques are applied to perform emotion classification at the word level. For unbalanced training sets, class re-sampling is additionally performed to improve classification results. Overall, Support Vector Machines (SVM) performed best on all datasets. The results suggest that the SEMAINE database is a promising corpus for studying affect detection.
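To make two of the steps named in the abstract concrete, the following is a minimal sketch of chi-square feature ranking and class re-sampling in plain Python. It is not the authors' implementation: the median-threshold discretization, the random-duplication form of re-sampling, the function names, and the toy data are all assumptions made for illustration. PCA and the SVM classifier are left out to keep the example dependency-free.

```python
import random
from collections import Counter


def chi_square_score(feature, labels):
    """Chi-square statistic between one discrete feature and the class labels."""
    n = len(labels)
    observed = Counter(zip(feature, labels))
    feature_totals = Counter(feature)
    label_totals = Counter(labels)
    score = 0.0
    for f, f_count in feature_totals.items():
        for c, c_count in label_totals.items():
            expected = f_count * c_count / n
            score += (observed[(f, c)] - expected) ** 2 / expected
    return score


def binarize(column):
    """Discretize a continuous feature at its median (an illustrative assumption)."""
    ordered = sorted(column)
    median = ordered[len(ordered) // 2]
    return [int(v >= median) for v in column]


def select_top_k(rows, labels, k):
    """Return the indices of the k features with the highest chi-square score."""
    scores = []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        scores.append((chi_square_score(binarize(column), labels), j))
    scores.sort(key=lambda s: (-s[0], s[1]))
    return sorted(j for _, j in scores[:k])


def oversample(rows, labels, seed=0):
    """Randomly duplicate minority-class rows until all classes are equally frequent."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    new_rows, new_labels = list(rows), list(labels)
    for cls, count in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - count):
            new_rows.append(rng.choice(pool))
            new_labels.append(cls)
    return new_rows, new_labels


# Hypothetical toy data: feature 0 tracks the label, feature 1 is noise.
rows = [[0.1, 5.0], [0.2, 1.0], [0.3, 4.0], [0.2, 2.0], [0.9, 1.5], [0.8, 4.5]]
labels = ["neutral", "neutral", "neutral", "neutral", "aroused", "aroused"]

selected = select_top_k(rows, labels, k=1)
balanced_rows, balanced_labels = oversample(rows, labels)
```

In this sketch, re-sampling would be applied to the training portion only, so that duplicated rows cannot leak into the evaluation data; whether the paper follows that protocol is not stated in the abstract.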


Keywords: speech processing; dimensionality reduction; affect detection





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Ricardo A. Calix (1)
  • Mehdi A. Khazaeli (1)
  • Leili Javadpour (1)
  • Gerald M. Knapp (1)
  1. Industrial Engineering, Louisiana State University, Baton Rouge, U.S.A.
