Different Approaches to Class-Based Language Models Using Word Segments

  • Raquel Justo
  • M. Inés Torres
Part of the Advances in Soft Computing book series (AINSC, volume 45)


In this paper we propose different approaches to the language model (LM) integrated in a Continuous Speech Recognition (CSR) system. All of them are based on classes made up of phrases or segments of words. The proposed models were evaluated in terms of Word Error Rate (WER) on a corpus of spontaneous dialogue in Spanish. The experiments show that the performance of the CSR system can be improved by introducing segments of words into a class-based LM.
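For readers unfamiliar with the evaluation metric, Word Error Rate is conventionally defined as the word-level edit distance (substitutions, deletions, and insertions) between the recognized sentence and the reference transcription, divided by the number of reference words. The sketch below illustrates that standard definition only; it is not the authors' scoring tool, and the function name `word_error_rate` and the Spanish example sentences are invented for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one deletion ("un") and one substitution ("a" -> "para")
# against a five-word reference.
wer = word_error_rate("quiero un billete a madrid", "quiero billete para madrid")
print(f"WER = {wer:.2f}")
```

Because insertions are counted, WER can exceed 1.0 when the hypothesis is much longer than the reference.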





Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Raquel Justo (1)
  • M. Inés Torres (1)
  1. Dept. of Electricity and Electronics, University of the Basque Country, Spain
