
A Hybrid Distance-Based Method and Support Vector Machines for Emotional Speech Detection

  • Conference paper
New Frontiers in Mining Complex Patterns (NFMCP 2013)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8399)


Abstract

We describe a novel methodology for detecting emotions from speech signals. The methodology is useful when sequence information can safely be ignored, as is the case in the current application, because it represents a sequence of values by a static feature vector. In the initial feature extraction stage, each speech signal is cut into three segments according to relative time intervals, and the segments are processed and described using 988 acoustic features. The proposed methodology consists of two steps. The first step constructs emotion models using principal component analysis and computes the distance of each observation to each emotion model. The second step uses these distance values to train a support vector machine classifier that identifies the affective content of a speech signal. Although we present the method for speech signals, it can also be used to analyse other data of a similar nature. The proposed method is tested on four emotional databases. The results show competitive performance, with an average accuracy of at least 80 % on three of the databases for the detection of basic types of emotion.
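The two steps described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes NumPy and scikit-learn, uses synthetic vectors in place of the 988 acoustic features, and takes the PCA reconstruction error as the distance to each emotion model, since the abstract does not say which distance is used. The function names, the number of components, and the emotion labels are hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import SVC


def fit_class_models(X, y, n_components=2):
    """Step 1a: fit one PCA model per emotion label."""
    return {
        label: PCA(n_components=n_components).fit(X[y == label])
        for label in np.unique(y)
    }


def distance_features(models, X):
    """Step 1b: distance of every observation to every class model.

    Assumption: the distance is the Euclidean reconstruction error
    after projecting onto the class-specific principal subspace.
    """
    columns = []
    for label in sorted(models):
        pca = models[label]
        reconstructed = pca.inverse_transform(pca.transform(X))
        columns.append(np.linalg.norm(X - reconstructed, axis=1))
    return np.column_stack(columns)


# Synthetic stand-in for static acoustic feature vectors (not real data).
rng = np.random.default_rng(0)
n_per_class, n_features = 40, 20
labels = ["anger", "neutral", "sadness"]  # hypothetical label set

X_parts, y_parts = [], []
for label in labels:
    centre = rng.normal(scale=5.0, size=n_features)
    X_parts.append(centre + rng.normal(size=(n_per_class, n_features)))
    y_parts += [label] * n_per_class
X = np.vstack(X_parts)
y = np.array(y_parts)

# Step 1: emotion models and the distances to them.
models = fit_class_models(X, y)
D = distance_features(models, X)

# Step 2: train an SVM on the distance values rather than the raw features.
clf = SVC(kernel="rbf").fit(D, y)
train_accuracy = clf.score(D, y)
```

The classifier sees only one distance per emotion model, so its input dimension equals the number of emotions instead of the number of acoustic features. For brevity the sketch scores on the same data it was trained on. A real evaluation would compute the distances and the accuracy on held-out utterances.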


Notes

  1. http://database.syntheticspeech.de/

  2. http://personal.ee.surrey.ac.uk/Personal/P.Jackson/SAVEE/

  3. http://www.rml.ryerson.ca/rml-emotion-database.html

  4. http://www.enterface.net/enterface05/main.php?frame=emotion



Correspondence to Vladimer Kobayashi.


Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Kobayashi, V. (2014). A Hybrid Distance-Based Method and Support Vector Machines for Emotional Speech Detection. In: Appice, A., Ceci, M., Loglisci, C., Manco, G., Masciari, E., Ras, Z. (eds) New Frontiers in Mining Complex Patterns. NFMCP 2013. Lecture Notes in Computer Science, vol 8399. Springer, Cham. https://doi.org/10.1007/978-3-319-08407-7_6


  • DOI: https://doi.org/10.1007/978-3-319-08407-7_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-08406-0

  • Online ISBN: 978-3-319-08407-7

  • eBook Packages: Computer Science (R0)
