Audio-Based Emotion Recognition in Judicial Domain: A Multilayer Support Vector Machines Approach

  • E. Fersini
  • E. Messina
  • G. Arosio
  • F. Archetti
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5632)


With recent progress in judicial proceedings management, particularly the introduction of audio/video recording systems, semantic retrieval has become a key challenge. In this context, an emotion recognition engine that analyzes the vocal signatures of the actors involved in judicial proceedings could provide useful annotations for the semantic retrieval of multimedia clips. With respect to the generation of semantic emotional tags in the judicial domain, two main contributions are given: (1) the construction of an Italian emotional database for annotating Italian proceedings; (2) the investigation of a hierarchical classification system, based on a risk minimization method, able to recognize emotional states from vocal signatures. In order to estimate the degree of affection, we compared the proposed classification method with SVM, K-Nearest Neighbors, and Naive Bayes, highlighting the improvements in classification accuracy given by a hierarchical learning approach.
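The idea of a hierarchical (multilayer) SVM classifier can be illustrated with a minimal sketch. This is not the authors' implementation: the label hierarchy, the two synthetic features, the training data, and the names LinearSVM, HierarchicalSVM, and HIERARCHY are all assumptions made for illustration. Each internal node of a binary label tree holds one linear SVM, trained here by plain subgradient descent on the regularized hinge loss, and prediction walks the tree from the root to a leaf.

```python
# Illustrative sketch only: hierarchy, features, and data are hypothetical.

class LinearSVM:
    """Binary linear SVM fitted by subgradient descent on the L2-regularized hinge loss."""

    def __init__(self, lam=0.001, lr=0.1, epochs=200):
        self.lam, self.lr, self.epochs = lam, lr, epochs
        self.w, self.b = [], 0.0

    def decision(self, x):
        return sum(wj * xj for wj, xj in zip(self.w, x)) + self.b

    def fit(self, X, y):
        # y holds labels in {-1, +1}
        self.w, self.b = [0.0] * len(X[0]), 0.0
        for _ in range(self.epochs):
            for xi, yi in zip(X, y):
                violated = yi * self.decision(xi) < 1.0
                # Regularization step: shrink the weights slightly.
                self.w = [wj * (1.0 - self.lr * self.lam) for wj in self.w]
                if violated:
                    # Hinge-loss step: push the sample past the margin.
                    self.w = [wj + self.lr * yi * xj for wj, xj in zip(self.w, xi)]
                    self.b += self.lr * yi
        return self


def leaves(tree):
    """Return the emotion labels found under a node of the label tree."""
    if isinstance(tree, str):
        return [tree]
    return leaves(tree[0]) + leaves(tree[1])


class HierarchicalSVM:
    """One binary SVM per internal node of a nested-tuple label tree."""

    def __init__(self, tree):
        self.tree = tree
        self.models = {}

    def fit(self, X, labels, tree=None):
        tree = self.tree if tree is None else tree
        if isinstance(tree, str):
            return self  # a leaf needs no classifier
        left, right = leaves(tree[0]), leaves(tree[1])
        # Keep only the samples whose label belongs to this subtree.
        rows = [(x, -1 if lab in left else 1)
                for x, lab in zip(X, labels) if lab in left + right]
        self.models[tree] = LinearSVM().fit([x for x, _ in rows],
                                            [t for _, t in rows])
        self.fit(X, labels, tree[0])
        self.fit(X, labels, tree[1])
        return self

    def predict(self, x):
        node = self.tree
        while not isinstance(node, str):
            node = node[1] if self.models[node].decision(x) >= 0 else node[0]
        return node


# Hypothetical hierarchy: layer 1 separates neutral from emotional speech,
# layer 2 separates anger from sadness.
HIERARCHY = ("neutral", ("anger", "sadness"))

# Synthetic two-dimensional feature vectors, not real acoustic measurements.
X_train = [(0.0, 0.1), (2.0, 2.1), (2.0, -2.1),
           (0.1, 0.0), (2.2, 1.9), (2.2, -1.9),
           (0.1, 0.2), (1.9, 2.0), (1.9, -2.0)]
y_train = ["neutral", "anger", "sadness"] * 3

model = HierarchicalSVM(HIERARCHY).fit(X_train, y_train)
print(model.predict((2.0, 2.0)))
```

In such a design each node solves a smaller, easier binary problem than a single flat multi-class classifier, which is one plausible reason a hierarchical arrangement might improve accuracy. An error made at an upper layer cannot be corrected further down, so the order of the splits matters.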




Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • E. Fersini (1)
  • E. Messina (1)
  • G. Arosio (1)
  • F. Archetti (1, 2)
  1. DISCo, Università degli Studi di Milano-Bicocca, Milano, Italy
  2. Consorzio Milano Ricerche, Milano, Italy
