Prosodic Cues for Automatic Phrase Boundary Detection in ASR

  • Klára Vicsi
  • György Szaszák
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4188)


This article presents a cross-lingual study of Hungarian and Finnish on the segmentation of continuous speech at the word and phrase level using prosodic features. A word-level segmenter has been developed that indicates word boundaries with acceptable accuracy for both languages. The ultimate aim is to increase the robustness of automatic speech recognizers (ASR) by detecting word and phrase boundaries, and thus to significantly reduce the search space during decoding, which is very time-consuming for agglutinative languages such as Hungarian and Finnish. Both, however, are fixed-stress languages, so stress detection allows word beginnings to be marked reliably. An algorithm based on a data-driven (HMM) approach was developed and evaluated. The best results were obtained using time series of fundamental frequency and energy together. Syllable length proved much less effective and was therefore discarded. With supra-segmental features, word boundaries can be marked with a high correctness ratio, provided that not all of them need to be found. The method is easily adaptable to other fixed-stress languages; to investigate this, we adapted it to Finnish and obtained similar results.
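
The idea that stress marks word beginnings in a fixed-stress language can be illustrated with a minimal sketch. This is not the HMM-based algorithm of the paper: it swaps in a simple peak-picking heuristic over per-syllable fundamental frequency and energy. The function names, the `threshold` parameter, and the toy values are all hypothetical and chosen only for illustration.

```python
from statistics import mean, pstdev


def zscores(values):
    """Normalize a sequence to zero mean and unit variance."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def stress_candidates(f0, energy, threshold=0.5):
    """Return indices of syllables that look stressed.

    A syllable is a candidate when its combined F0 + energy z-score
    exceeds `threshold` (a hypothetical tuning parameter) and is a
    local maximum among its neighbours.
    """
    if len(f0) != len(energy):
        raise ValueError("f0 and energy must have the same length")
    score = [a + b for a, b in zip(zscores(f0), zscores(energy))]
    hits = []
    for i, s in enumerate(score):
        left = score[i - 1] if i > 0 else float("-inf")
        right = score[i + 1] if i + 1 < len(score) else float("-inf")
        if s > threshold and s >= left and s > right:
            hits.append(i)
    return hits


def precision_recall(predicted, reference):
    """Compare predicted word-start indices with reference ones."""
    pred, ref = set(predicted), set(reference)
    tp = len(pred & ref)
    precision = tp / len(pred) if pred else 1.0
    recall = tp / len(ref) if ref else 1.0
    return precision, recall


# Toy per-syllable values (invented, not data from the paper).
F0_HZ = [130, 110, 105, 128, 108, 112, 104, 102]
ENERGY_DB = [70, 62, 60, 69, 61, 66, 59, 58]
WORD_STARTS = [0, 3, 5]  # assumed reference word-initial syllables

if __name__ == "__main__":
    for thr in (0.5, 1.0):
        found = stress_candidates(F0_HZ, ENERGY_DB, threshold=thr)
        print(thr, found, precision_recall(found, WORD_STARTS))
```

On the toy data, raising the threshold from 0.5 to 1.0 drops the weakest candidate: every remaining marker is still correct, but one word start is missed. This mirrors the trade-off described in the abstract, where a high correctness ratio is reachable if not all boundaries have to be found.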


Continuous Speech, Word Boundary, Automatic Speech Recognition System, Phrase Boundary, Word Unit
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Klára Vicsi (1)
  • György Szaszák (1)
  1. Dept. for Telecommunication and Mediainformatics, Budapest University for Technology and Economics, Budapest, Hungary
