Detection of Laughter-in-Interaction in Multichannel Close-Talk Microphone Recordings of Meetings

Laskowski, Kornel; Schultz, Tanja

doi:10.1007/978-3-540-85853-9_14

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5237)

Included in the following conference series:

International Workshop on Machine Learning for Multimodal Interaction

Abstract

Laughter is a key element of human-human interaction, occurring surprisingly frequently in multi-party conversation. In meetings, laughter accounts for almost 10% of vocalization effort by time, and is known to be relevant for topic segmentation and the automatic characterization of affect. We present a system for the detection of laughter, and its attribution to specific participants, which relies on simultaneously decoding the vocal activity of all participants given multi-channel recordings. The proposed framework allows us to disambiguate laughter and speech not only acoustically, but also by constraining the number of simultaneous speakers and the number of simultaneous laughers independently, since participants tend to take turns speaking but laugh together. We present experiments on 57 hours of meeting data, containing almost 11000 unique instances of laughter.
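
The idea of constraining simultaneous speakers and simultaneous laughers independently can be illustrated with a small sketch. It is not the authors' implementation: the three-way label set, the function name `constrained_joint_states`, and the example caps are assumptions chosen for illustration. The sketch enumerates the joint vocal-activity states of all participants and keeps only those that respect both caps, which is the kind of state space a joint decoder could then search.

```python
from itertools import product

# Assumed per-participant labels: non-vocalizing, speaking, laughing.
SILENCE, SPEECH, LAUGHTER = "N", "S", "L"


def constrained_joint_states(num_participants, max_speakers, max_laughers):
    """Enumerate joint states, capping speakers and laughers independently.

    Each joint state is a tuple with one label per participant. A state is
    kept only if the number of participants speaking does not exceed
    max_speakers AND the number laughing does not exceed max_laughers.
    """
    labels = (SILENCE, SPEECH, LAUGHTER)
    return [
        state
        for state in product(labels, repeat=num_participants)
        if state.count(SPEECH) <= max_speakers
        and state.count(LAUGHTER) <= max_laughers
    ]


if __name__ == "__main__":
    # Illustrative values only: 4 participants, at most 2 simultaneous
    # speakers, at most 3 simultaneous laughers.
    unconstrained = 3 ** 4
    allowed = constrained_joint_states(4, max_speakers=2, max_laughers=3)
    print(f"{len(allowed)} of {unconstrained} joint states allowed")
```

Setting a tight cap on speakers and a looser cap on laughers mirrors the observation that participants tend to take turns speaking but laugh together; the specific caps here are placeholders, not values reported in the paper.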

Author information

Authors and Affiliations

Cognitive Systems Lab, Universität Karlsruhe, Karlsruhe, Germany
Kornel Laskowski & Tanja Schultz
Language Technologies Institute, Carnegie Mellon University, Pittsburgh PA, USA
Kornel Laskowski & Tanja Schultz

Editor information

Andrei Popescu-Belis, Rainer Stiefelhagen

About this paper

Cite this paper

Laskowski, K., Schultz, T. (2008). Detection of Laughter-in-Interaction in Multichannel Close-Talk Microphone Recordings of Meetings. In: Popescu-Belis, A., Stiefelhagen, R. (eds) Machine Learning for Multimodal Interaction. MLMI 2008. Lecture Notes in Computer Science, vol 5237. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85853-9_14

DOI: https://doi.org/10.1007/978-3-540-85853-9_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85852-2
Online ISBN: 978-3-540-85853-9
eBook Packages: Computer Science, Computer Science (R0)
