Differentiating Laughter Types via HMM/DNN and Probabilistic Sampling

Gosztolya, Gábor; Beke, András; Neuberger, Tilda

doi:10.1007/978-3-030-26061-3_13

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11658)

Included in the following conference series:

International Conference on Speech and Computer

Abstract

In human speech, laughter plays a special role as an important non-verbal element, signaling general positive affect and cooperative intent. However, laughter occurrences can be categorized into several sub-groups, each having a slightly or significantly different role in human conversation. This means that, besides automatically locating laughter events in human speech, it would be beneficial to categorize them automatically as well. In this study, we focus on laughter events occurring in Hungarian spontaneous conversations. First, we use the manually annotated occurrence time segments, and the task is simply to determine the correct laughter type via Deep Neural Networks (DNNs). Second, we seek to localize the laughter events as well, for which we utilize Hidden Markov Models. Detecting different laughter types also poses a challenge to DNNs due to the low number of training examples for specific types, but this can be handled using the technique of probabilistic sampling during frame-level DNN training.
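
The abstract does not spell out the sampling rule, so the following is only a minimal sketch of how probabilistic sampling of training frames might look. It assumes the commonly used formulation in which a class is drawn first with probability P(c_k) = λ/K + (1 − λ)·Prior(c_k), where K is the number of classes, and a frame is then drawn uniformly from that class; λ = 0 reproduces the original class distribution and λ = 1 samples all classes equally often. The function names, the label strings and the choice of λ are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter, defaultdict


def class_sampling_probs(labels, lam):
    """Interpolate between empirical class priors (lam=0) and uniform (lam=1)."""
    counts = Counter(labels)
    n_frames = len(labels)
    n_classes = len(counts)
    return {
        c: lam / n_classes + (1.0 - lam) * counts[c] / n_frames
        for c in counts
    }


def sample_frame_indices(labels, lam, num_samples, seed=0):
    """Draw frame indices: first pick a class, then a frame of that class."""
    rng = random.Random(seed)

    # Group frame indices by their class label.
    by_class = defaultdict(list)
    for idx, c in enumerate(labels):
        by_class[c].append(idx)

    probs = class_sampling_probs(labels, lam)
    classes = sorted(probs)
    weights = [probs[c] for c in classes]

    # Step 1: sample the class of each training example.
    chosen = rng.choices(classes, weights=weights, k=num_samples)
    # Step 2: sample one frame uniformly from the chosen class.
    return [rng.choice(by_class[c]) for c in chosen]


# Hypothetical, heavily imbalanced frame labels (assumption for illustration).
frame_labels = ["other"] * 90 + ["type_a"] * 8 + ["type_b"] * 2
batch = sample_frame_indices(frame_labels, lam=1.0, num_samples=32)
```

With λ close to 1, rare classes appear in each batch far more often than their share of the data would suggest, which is the effect the abstract relies on when only a few training examples exist for some laughter types.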

This study was partially funded by the National Research, Development and Innovation Office of Hungary via contract NKFIH FK-124413. Gábor Gosztolya was also supported by the Ministry of Human Capacities, Hungary (grant 20391-3/2018/FEKUSTRAT). András Beke was supported by the János Bolyai Research Scholarship of the Hungarian Academy of Sciences.

Author information

Authors and Affiliations

MTA-SZTE Research Group on Artificial Intelligence, Szeged, Hungary
Gábor Gosztolya
Department of Informatics, University of Szeged, Szeged, Hungary
Gábor Gosztolya
Research Institute for Linguistics of the Hungarian Academy of Sciences, Budapest, Hungary
András Beke & Tilda Neuberger

Corresponding author

Correspondence to Gábor Gosztolya.

Editor information

Editors and Affiliations

Utrecht University, Utrecht, The Netherlands
Albert Ali Salah
St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences, St. Petersburg, Russia
Alexey Karpov
Moscow State Linguistic University, Moscow, Russia
Rodmonga Potapova

About this paper

Cite this paper

Gosztolya, G., Beke, A., Neuberger, T. (2019). Differentiating Laughter Types via HMM/DNN and Probabilistic Sampling. In: Salah, A., Karpov, A., Potapova, R. (eds) Speech and Computer. SPECOM 2019. Lecture Notes in Computer Science, vol 11658. Springer, Cham. https://doi.org/10.1007/978-3-030-26061-3_13

DOI: https://doi.org/10.1007/978-3-030-26061-3_13
Published: 24 July 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-26060-6
Online ISBN: 978-3-030-26061-3
eBook Packages: Computer Science, Computer Science (R0)