Robust Question Answering for Speech Transcripts Using Minimal Syntactic Analysis

  • Pere R. Comas
  • Jordi Turmo
  • Mihai Surdeanu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5152)


This paper describes the participation of the Technical University of Catalonia in the CLEF 2007 Question Answering on Speech Transcripts track. To process manual transcripts, we deployed a robust factual Question Answering system that uses minimal syntactic information. To handle automatic transcripts, we combined this QA system with a novel Passage Retrieval and Answer Extraction engine based on a sequence alignment algorithm that searches for “sounds like” sequences in the document collection. We also enriched the named entity recognizer and classifier (NERC) with phonetic features so that named entities can be recognized even when they are incorrectly transcribed.
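
The idea of searching for “sounds like” sequences can be illustrated with a generic local alignment routine. The sketch below is not the engine described in this paper: it assumes a textbook Smith-Waterman scoring scheme, treats each character as a stand-in for a phone, and uses a hypothetical function name and arbitrary match, mismatch and gap weights. A real system would more plausibly align phonetic transcriptions and score substitutions with a phonetic distance.

```python
def local_alignment_score(query, text, match=2, mismatch=-1, gap=-1):
    """Return the best Smith-Waterman local alignment score of query in text.

    Assumption: symbols are compared for exact equality. A phonetic variant
    would replace the match/mismatch test with a phone similarity function.
    """
    prev = [0] * (len(text) + 1)  # scores for the previous query symbol
    best = 0
    for q in query:
        curr = [0]  # column 0: alignment against the empty prefix of text
        for j, t in enumerate(text, start=1):
            diag = prev[j - 1] + (match if q == t else mismatch)
            # Local alignment: a score never drops below zero, so a match
            # can start anywhere in the text.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


if __name__ == "__main__":
    # A misspelled query still scores well against the closest passage.
    passages = ["the university of catalonia", "the weather in paris"]
    query = "katalonia"
    for passage in passages:
        print(passage, local_alignment_score(query, passage))
```

Ranking passages by this score tolerates substitutions and gaps, which is the property that makes alignment attractive for text containing transcription errors.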


Keywords: Question Answering, Spoken Document Retrieval, Phonetic Distance





Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Pere R. Comas (1)
  • Jordi Turmo (1)
  • Mihai Surdeanu (2)

  1. TALP Research Center, Technical University of Catalonia (UPC)
  2. Barcelona Media Innovation Center
