Are Morphosyntactic Taggers Suitable to Improve Automatic Transcription?

  • Stéphane Huet
  • Guillaume Gravier
  • Pascale Sébillot
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4188)


This paper studies whether part-of-speech (POS) tagging can improve speech recognition. We first estimate the proportion of misrecognized words that could be corrected using POS information; an analysis of a short extract of French radio broadcast news shows that an absolute decrease in word error rate of 1.1% can be expected. We also demonstrate quantitatively that traditional POS taggers are reliable when applied to spoken corpora, including automatic transcriptions. This new result enables us to use POS tag knowledge effectively in a postprocessing stage to improve the quality of transcriptions, in particular by correcting agreement errors.
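As a reading aid for the figure quoted above, the sketch below shows how word error rate (WER) is conventionally computed and what an absolute decrease means. It is not taken from the paper: the function names `word_error_rate` and `absolute_decrease` and the example sentence are assumptions made for illustration only, and the example does not reproduce the paper's data or its correction method.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Conventional WER: word-level edit distance divided by the
    number of words in the reference transcription."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def absolute_decrease(wer_before: float, wer_after: float) -> float:
    """Absolute decrease is a plain difference of rates, not a ratio."""
    return wer_before - wer_after


# Hypothetical French example with two agreement errors (illustrative only).
reference = "les enfants sont partis"
raw_output = "les enfant sont parti"
corrected_output = "les enfants sont partis"

wer_before = word_error_rate(reference, raw_output)
wer_after = word_error_rate(reference, corrected_output)
gain = absolute_decrease(wer_before, wer_after)
```

An absolute decrease is measured in percentage points of WER, so a 1.1% absolute gain has the same size whatever the baseline error rate is, whereas the corresponding relative gain depends on that baseline.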


Automatic Speech Recognition; Training Corpus; Word Error Rate; Broadcast News; Automatic Speech Recognition System
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Stéphane Huet (1)
  • Guillaume Gravier (1)
  • Pascale Sébillot (1)
  1. IRISA, Rennes, France
