Experiments with Linguistic Categories for Language Model Optimization

  • Arantza Casillas
  • Amparo Varona
  • Ines Torres
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2588)

Abstract

In this work we obtain robust category-based language models to be integrated into speech recognition systems. Deductive rules are used to select linguistic categories and to match words with categories. Statistical techniques are then used to build n-gram language models based on lexicons that consist of sets of categories. The categorization procedure and the language model evaluation were carried out on a task-oriented Spanish corpus. The cooperation between deductive and inductive approaches has proved efficient in building small, reliable language models for speech understanding purposes.
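
As a rough illustration of the idea summarized above, the following minimal sketch maps words to categories with a hand-written rule table and then estimates a bigram model over the category sequences. Everything in it is an assumption made for illustration: the category names, the word list, the helper names `categorize` and `train_bigram`, and the unsmoothed maximum-likelihood estimate are hypothetical and are not taken from the paper, which may use different categories, n-gram orders, and smoothing.

```python
from collections import Counter

# Hypothetical hand-written rules: word -> linguistic category.
# These entries are illustrative only and do not come from the paper.
WORD_TO_CATEGORY = {
    "quiero": "VERB",
    "un": "DET",
    "billete": "NOUN",
    "a": "PREP",
    "madrid": "CITY",
    "bilbao": "CITY",
}


def categorize(sentence):
    """Map each word to its category; unknown words fall back to 'UNK'."""
    return [WORD_TO_CATEGORY.get(w, "UNK") for w in sentence.lower().split()]


def train_bigram(sentences):
    """Estimate unsmoothed maximum-likelihood category bigram probabilities."""
    bigrams, contexts = Counter(), Counter()
    for s in sentences:
        # Sentence boundary markers let the model score starts and ends.
        cats = ["<s>"] + categorize(s) + ["</s>"]
        for prev, cur in zip(cats, cats[1:]):
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1
    return {bg: n / contexts[bg[0]] for bg, n in bigrams.items()}


corpus = ["quiero un billete a Madrid", "quiero un billete a Bilbao"]
model = train_bigram(corpus)
```

Because both city names share one category in this toy table, the two training sentences produce the same category sequence, and the model stores a single set of bigram parameters instead of one per city word. This is the sense in which a category lexicon can keep a language model small.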

Keywords

Training Corpus; Speech Recognition System; Word Class; Word Error Rate; Application Task
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Arantza Casillas (1)
  • Amparo Varona (1)
  • Ines Torres (1)
  1. Dpt. de Electricidad y Electrónica, Facultad de Ciencias, Universidad del País Vasco (UPV-EHU), Spain
