Multimodal Laughter Detection in Natural Discourses

  • Stefan Scherer
  • Friedhelm Schwenker
  • Nick Campbell
  • Günther Palm
Part of the Cognitive Systems Monographs book series (COSMOS, volume 6)

Abstract

This work focuses on the detection of laughter in natural multiparty discourses. Features from two modalities are extracted from unobtrusive sources, namely a room microphone and a 360-degree camera. A relatively novel approach based on Echo State Networks (ESNs) is used for the task. One possible application is the online detection of laughter in human-robot interaction, which would enable the robot to react appropriately and in a timely fashion to human communication, since laughter is an important means of communication.
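
To make the Echo State Network idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes synthetic two-feature frames as stand-ins for the audio and video features, and all function names (`make_reservoir`, `run_reservoir`, `train_readout`, `toy_frames`) are hypothetical. The recurrent weights are random and fixed; only a linear readout is trained, here by ridge regression. As a simplification, the reservoir is rescaled by its largest absolute row sum rather than by its spectral radius.

```python
import math
import random


def make_reservoir(n_inputs, n_units, scale=0.9, seed=0):
    """Create fixed random input and recurrent weights (never trained)."""
    rng = random.Random(seed)
    w_in = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)] for _ in range(n_units)]
    w = [[rng.uniform(-1.0, 1.0) for _ in range(n_units)] for _ in range(n_units)]
    # Simplification: rescale so the largest absolute row sum equals `scale`.
    # This bounds the spectral radius by `scale` (< 1), so old inputs fade out.
    norm = max(sum(abs(v) for v in row) for row in w)
    return w_in, [[scale * v / norm for v in row] for row in w]


def run_reservoir(w_in, w, frames):
    """Drive the reservoir with feature frames; return one state per frame."""
    state = [0.0] * len(w)
    states = []
    for u in frames:
        state = [
            math.tanh(sum(a * b for a, b in zip(w_in[i], u))
                      + sum(a * b for a, b in zip(w[i], state)))
            for i in range(len(w))
        ]
        states.append(state)
    return states


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def train_readout(states, targets, ridge=1e-3):
    """Fit the linear readout (with bias) by ridge regression."""
    rows = [s + [1.0] for s in states]  # append a constant bias input
    n = len(rows[0])
    gram = [[sum(r[i] * r[j] for r in rows) + (ridge if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    rhs = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(n)]
    return solve(gram, rhs)


def predict(states, w_out):
    """Apply the trained readout to each reservoir state."""
    return [sum(a * b for a, b in zip(s + [1.0], w_out)) for s in states]


def toy_frames(n_frames=200, seed=1):
    """Synthetic two-feature frames; NOT real audio or video features."""
    rng = random.Random(seed)
    labels = [1.0 if 50 <= t < 80 or 130 <= t < 160 else 0.0 for t in range(n_frames)]
    frames = [[(1.0 if y else 0.1) + rng.gauss(0, 0.05),
               (0.8 if y else 0.1) + rng.gauss(0, 0.05)] for y in labels]
    return frames, labels


frames, labels = toy_frames()
w_in, w = make_reservoir(n_inputs=2, n_units=20)
states = run_reservoir(w_in, w, frames)
w_out = train_readout(states, labels)
scores = predict(states, w_out)
detected = [s > 0.5 for s in scores]  # per-frame laughter decision
```

Because the readout is evaluated frame by frame on the current reservoir state, a scheme of this kind can in principle run online, which is the property the abstract highlights for human-robot interaction. The 0.5 threshold and the reservoir size of 20 units are arbitrary choices for this toy example.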

Keywords

Emotion Recognition, Recurrent Neural Network, Multi Layer Perceptron, Equal Error Rate, Echo State Network
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Stefan Scherer (1)
  • Friedhelm Schwenker (1)
  • Nick Campbell (2)
  • Günther Palm (1)

  1. Institute of Neural Information Processing, Ulm University
  2. Center for Language and Communication Studies, Trinity College Dublin
