Combining Lexical and Prosodic Features for Automatic Detection of Sentence Modality in French

Orosanu, Luiza; Jouvet, Denis

doi:10.1007/978-3-319-25789-1_20

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9449)

Included in the following conference series:

International Conference on Statistical Language and Speech Processing

Abstract

This article analyzes the automatic detection of sentence modality in French using both prosodic and linguistic information. The goal is to later use such an approach to support communication with deaf people. Two sentence modalities are evaluated: questions and statements. As linguistic features, we considered the presence of discriminative interrogative patterns and two log-likelihood ratios of the sentence being a question rather than a statement: one based on words and the other based on part-of-speech tags. The prosodic features are based on duration, energy and pitch features estimated over the last prosodic group of the sentence. The evaluations consider linguistic features stemming either from manual transcriptions or from an automatic speech transcription system. The behavior of various sets of features is analyzed and compared. The combination of linguistic and prosodic features gives a slight improvement on automatic transcriptions, where the correct classification performance reaches 72%.
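
To make the word-based log-likelihood ratio feature concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which language models are used, so the sketch assumes add-one smoothed unigram models, and the toy sentences and all function names (train_unigram, log_prob, log_likelihood_ratio) are hypothetical.

```python
import math
from collections import Counter

def train_unigram(sentences):
    """Count word occurrences in whitespace-tokenized sentences."""
    counts = Counter(w for s in sentences for w in s.split())
    return counts, sum(counts.values())

def log_prob(sentence, model, vocab_size):
    """Log-probability of a sentence under an add-one smoothed unigram model."""
    counts, total = model
    return sum(
        math.log((counts[w] + 1) / (total + vocab_size))
        for w in sentence.split()
    )

def log_likelihood_ratio(sentence, q_model, s_model, vocab_size):
    """Positive values favor 'question', negative values favor 'statement'."""
    return (log_prob(sentence, q_model, vocab_size)
            - log_prob(sentence, s_model, vocab_size))

# Toy training data (assumption: invented for illustration only).
questions = ["est-ce que tu viens", "où est le train"]
statements = ["le train est parti", "tu viens demain"]

q_model = train_unigram(questions)
s_model = train_unigram(statements)

# Shared vocabulary, plus one slot reserved for unseen words.
vocab = {w for s in questions + statements for w in s.split()}
vocab_size = len(vocab) + 1

llr_question = log_likelihood_ratio("est-ce que tu pars", q_model, s_model, vocab_size)
llr_statement = log_likelihood_ratio("le train est parti", q_model, s_model, vocab_size)
```

The same ratio could be computed over part-of-speech tag sequences instead of words by replacing each token with its tag before training and scoring. The resulting value is a single numeric feature that a classifier can combine with the interrogative-pattern and prosodic features.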

Acknowledgements

The work presented in this article is part of the RAPSODIE project, and has received support from the “Conseil Régional de Lorraine” and from the “Région Lorraine” (FEDER) (http://erocca.com/rapsodie).

Author information

Authors and Affiliations

Speech Group, Inria, LORIA, 54600, Villers-lès-Nancy, France
Luiza Orosanu & Denis Jouvet
Speech Group, LORIA, Université de Lorraine, LORIA, UMR 7503, 54600, Villers-lès-Nancy, France
Luiza Orosanu & Denis Jouvet
Speech Group, LORIA, CNRS, LORIA, UMR 7503, 54600, Villers-lès-Nancy, France
Luiza Orosanu & Denis Jouvet

Corresponding author

Correspondence to Luiza Orosanu.

Editor information

Editors and Affiliations

Research Group on Mathematical Linguistics, Rovira i Virgili University, Tarragona, Spain
Adrian-Horia Dediu
Research Group on Mathematical Linguistics, Rovira i Virgili University, Tarragona, Spain
Carlos Martín-Vide
Department of Telecommunications and Media Informatics, Budapest University of Technology and Economics, Budapest, Hungary
Klára Vicsi

Cite this paper

Orosanu, L., Jouvet, D. (2015). Combining Lexical and Prosodic Features for Automatic Detection of Sentence Modality in French. In: Dediu, AH., Martín-Vide, C., Vicsi, K. (eds) Statistical Language and Speech Processing. SLSP 2015. Lecture Notes in Computer Science, vol 9449. Springer, Cham. https://doi.org/10.1007/978-3-319-25789-1_20

DOI: https://doi.org/10.1007/978-3-319-25789-1_20
Published: 17 November 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25788-4
Online ISBN: 978-3-319-25789-1
eBook Packages: Computer Science, Computer Science (R0)
