Automatic Detection of the Prosodic Structures of Speech Utterances

Bartkova, Katarina; Jouvet, Denis

doi:10.1007/978-3-319-01931-4_1

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8113)

Included in the following conference series:

International Conference on Speech and Computer

Abstract

This paper presents an automatic approach for detecting the prosodic structures of speech utterances. The algorithm relies on a hierarchical representation of the prosodic organization of the utterances. The approach is applied to a corpus of French radio broadcast news and to radio and TV shows, which contain more spontaneous speech. The algorithm detects prosodic boundaries whether or not they are followed by a pause. The detection of prosodic boundaries and prosodic structures integrates little linguistic knowledge; it relies mainly on the amplitude of the F0 slopes and on the inversion of the slopes, as described in [1], as well as on phone durations. The automatic prosodic segmentation is then compared to a manual prosodic segmentation produced by an expert phonetician. Finally, the results provide insight into the most frequently used prosodic structures in the broadcast speech style and in a more spontaneous speech style.
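
The abstract names three cues: the amplitude of the F0 slopes, the inversion of the slopes, and phone durations. The sketch below shows one way such cues could be combined to flag candidate boundaries. It is an illustration only, not the authors' algorithm: the `Syllable` record, the function names, the 2-semitone slope threshold, and the 1.3 lengthening factor are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Syllable:
    """Hypothetical per-syllable measurements (not from the paper)."""
    label: str
    f0_start: float  # F0 at vowel onset, in semitones
    f0_end: float    # F0 at vowel offset, in semitones
    duration: float  # vowel duration, in seconds


def f0_slope(s: Syllable) -> float:
    """Signed F0 movement over the vowel, in semitones."""
    return s.f0_end - s.f0_start


def candidate_boundaries(sylls: List[Syllable],
                         min_slope: float = 2.0,
                         lengthening: float = 1.3) -> List[int]:
    """Indices of syllables with a large F0 slope or a lengthened vowel.

    Both thresholds are assumed values chosen for illustration.
    """
    mean_dur = sum(s.duration for s in sylls) / len(sylls)
    return [i for i, s in enumerate(sylls)
            if abs(f0_slope(s)) >= min_slope
            or s.duration >= lengthening * mean_dur]


def slope_inversions(sylls: List[Syllable],
                     boundaries: List[int]) -> List[Tuple[int, int]]:
    """Pairs of consecutive candidates whose F0 slopes have opposite signs."""
    return [(a, b) for a, b in zip(boundaries, boundaries[1:])
            if f0_slope(sylls[a]) * f0_slope(sylls[b]) < 0]


# Made-up measurements for a short utterance.
demo = [
    Syllable("le", 90.0, 90.5, 0.10),
    Syllable("pre", 90.0, 90.25, 0.11),
    Syllable("sident", 90.0, 94.0, 0.20),   # large rise
    Syllable("a", 90.0, 89.75, 0.09),
    Syllable("par", 90.0, 90.5, 0.10),
    Syllable("le", 90.0, 85.0, 0.22),       # large fall
]

bounds = candidate_boundaries(demo)
print(bounds, slope_inversions(demo, bounds))
```

In this toy input, the rising syllable and the final falling syllable are flagged, and the pair is reported as a slope inversion. A real system would also need F0 extraction, phone alignment, and the hierarchical grouping that the abstract mentions, none of which is sketched here.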

References

1. Martin, P.: Prosodic and rhythmic structures in French. Linguistics 25, 925–949 (1987)
2. Hupin, B., Simon, A.C.: Analyse phonostylistique du discours radiophonique. Expériences sur la mise en fonction professionnelle du phonostyle et sur le lien entre mélodicité et proximité du discours radiophonique. Recherches en communication 28, 103–121 (2009)
3. Goldman, J.-P., Auchlin, A., Simon, A.C., Avanzi, M.: Phonostylographe: un outil de description prosodique. Comparaison du style radiophonique et lu. Nouveaux Cahiers de Linguistique Française 28, 219–237 (2008)
4. Lacheret-Dujour, A., Obin, N., Avanzi, M.: Design and Evaluation of Shared Prosodic Annotation for French Spontaneous Speech: From Experts Knowledges to Non-Experts Annotations. In: Proceedings of the 4th Linguistic Annotation Workshop, Uppsala, Sweden (2010)
5. Segal, N., Bartkova, K.: Prosodic structure representation for boundary detection in spontaneous French. In: Proceedings of ICPhS 2007, Saarbrücken, pp. 1197–1200 (2007)
6. ’t Hart, J., Collier, R., Cohen, A.: A Perceptual Study of Intonation. Cambridge U.P., London (1990)
7. Galliano, S., Gravier, G., Chaubard, L.: The Ester 2 evaluation campaign for rich transcription of French broadcasts. In: Proc. INTERSPEECH 2009, Brighton, UK, pp. 2583–2586 (2009)
8. Gravier, G., Adda, G., Paulsson, N., Carré, M., Giraudel, A., Galibert, O.: The ETAPE corpus for the evaluation of speech-based TV content processing in the French language. In: Proc. LREC 2012, Istanbul, Turkey (2012)
9. Speech Processing, Transmission and Quality Aspects (STQ); Distributed speech recognition; extended advanced front-end feature extraction algorithm; compression algorithms, ETSI ES 202 212 (2005)
10. de Calmès, M., Pérennou, G.: BDLEX: a Lexicon for Spoken and Written French. In: Proc. LREC 1998, Grenade, pp. 1129–1136 (1998)
11. Jouvet, D., Fohr, D., Illina, I.: Evaluating grapheme-to-phoneme converters in automatic speech recognition context. In: Proc. ICASSP 2012, Kyoto, Japan, pp. 4821–4824 (2012)
12. Sphinx (2011), http://cmusphinx.sourceforge.net/

Author information

Authors and Affiliations

ATILF - Analyse et Traitement Informatique de la Langue Française, 44 Av De La Libération, BP 30687, 54063, Nancy Cedex, France
Katarina Bartkova
Speech Group, LORIA Inria, Villers-lès-Nancy, F-54600, France
Denis Jouvet
Université de Lorraine, LORIA, UMR 7503, Villers-lès-Nancy, F-54600, France
Denis Jouvet
CNRS, LORIA, UMR 7503, Villers-lès-Nancy, F-54600, France
Denis Jouvet

Editor information

Editors and Affiliations

Faculty of Applied Sciences, Department of Cybernetics, University of West Bohemia, Univerzitní 8, 306 14, Plzeň, Czech Republic
Miloš Železný
University of West Bohemia, 306 14, Pilsen, Czech Republic
Ivan Habernal
Speech and Multimodal Interfaces Laboratory, St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences, 14-th line, 39, 199178, St. Petersburg, Russia
Andrey Ronzhin

About this paper

Cite this paper

Bartkova, K., Jouvet, D. (2013). Automatic Detection of the Prosodic Structures of Speech Utterances. In: Železný, M., Habernal, I., Ronzhin, A. (eds) Speech and Computer. SPECOM 2013. Lecture Notes in Computer Science, vol 8113. Springer, Cham. https://doi.org/10.1007/978-3-319-01931-4_1

DOI: https://doi.org/10.1007/978-3-319-01931-4_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-01930-7
Online ISBN: 978-3-319-01931-4
eBook Packages: Computer Science (R0)