Improving the Robustness to Recognition Errors in Speech Input Question Answering

  • Hideki Tsutsui
  • Toshihiko Manabe
  • Mika Fukui
  • Tetsuya Sakai
  • Hiroko Fujii
  • Koji Urata
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4182)


In our previous work, we developed a prototype of a speech-input help system for home appliances such as digital cameras and microwave ovens. Given a factoid question, the system performs textual question answering using the manuals as the knowledge source, whereas given a HOW question, it retrieves and plays a demonstration video. However, this first prototype suffered from speech recognition errors, especially when the Japanese interrogative phrases in factoid questions were misrecognized. We therefore propose a method that complements a speech query transcript with an interrogative phrase selected from a predetermined list. The selection process first narrows down the candidate phrases based on co-occurrences within the manual text, and then computes the pronunciation similarity between each candidate and the query transcript. Our method improves the Mean Reciprocal Rank of the top three answers from 0.429 to 0.597 for factoid questions.
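The two-step selection described above can be illustrated with a minimal sketch. The abstract does not specify the data structures or the exact similarity measure, so everything here is an assumption: the co-occurrence table, the romanized pronunciations, the sliding-window matching, and the use of plain Levenshtein edit distance as the pronunciation similarity. The names `select_interrogative`, `COOCCURRENCE`, and `PRONUNCIATIONS` are hypothetical and are not taken from the paper.

```python
from typing import Dict, Optional, Set


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance with unit costs (assumed similarity measure)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def best_window_distance(candidate: str, transcript: str) -> int:
    """Smallest distance between the candidate and any transcript
    substring of the same length (whole transcript if it is shorter)."""
    n = len(candidate)
    if len(transcript) <= n:
        return edit_distance(candidate, transcript)
    return min(edit_distance(candidate, transcript[i:i + n])
               for i in range(len(transcript) - n + 1))


def select_interrogative(query_terms: Set[str],
                         transcript_pron: str,
                         cooccurrence: Dict[str, Set[str]],
                         pronunciations: Dict[str, str]) -> Optional[str]:
    """Pick the interrogative phrase to add to the query transcript."""
    # Step 1: keep phrases that co-occur in the manual with a query term.
    candidates = [p for p, terms in cooccurrence.items()
                  if terms & query_terms]
    if not candidates:
        return None
    # Step 2: choose the candidate whose pronunciation is closest
    # to some part of the recognized transcript.
    return min(candidates,
               key=lambda p: best_window_distance(pronunciations[p],
                                                  transcript_pron))


# Hypothetical toy data: phrase -> manual terms it co-occurs with.
COOCCURRENCE = {
    "nanmai": {"satsuei", "memori"},
    "nanbyou": {"taimaa", "satsuei"},
    "nando": {"oobun"},
}
# Hypothetical romanized pronunciations of each phrase.
PRONUNCIATIONS = {p: p for p in COOCCURRENCE}

# "nanmai" was misrecognized as "nanbai" in this made-up transcript.
choice = select_interrogative({"satsuei"}, "nanbaisatsueidekiru",
                              COOCCURRENCE, PRONUNCIATIONS)
print(choice)
```

In this toy setup the co-occurrence filter removes "nando", and the pronunciation step prefers "nanmai" because it is one substitution away from the misrecognized "nanbai". A real system would presumably compare phoneme or mora sequences rather than romanized characters.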
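For readers unfamiliar with the metric, the following sketch shows how a Mean Reciprocal Rank over the top three answers is commonly computed: each question scores the reciprocal of the rank of its first correct answer, or zero if no correct answer appears in the top three. This is the standard definition and is assumed here; the abstract does not spell out the exact evaluation protocol, and the function name `mrr_at_k` and the sample ranks are illustrative only.

```python
from typing import List, Optional


def mrr_at_k(first_correct_ranks: List[Optional[int]], k: int = 3) -> float:
    """Mean Reciprocal Rank over the top-k answers.

    Each entry is the 1-based rank of the first correct answer for one
    question, or None if the system returned no correct answer.
    """
    if not first_correct_ranks:
        raise ValueError("at least one question is required")
    total = 0.0
    for rank in first_correct_ranks:
        # Ranks beyond k (or missing answers) contribute nothing.
        if rank is not None and 1 <= rank <= k:
            total += 1.0 / rank
    return total / len(first_correct_ranks)


# Made-up ranks for four questions: correct at rank 1, rank 2,
# not found, and rank 3.
score = mrr_at_k([1, 2, None, 3])
print(round(score, 3))
```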


Speech Recognition, Edit Distance, Query Term, Question Answering, Home Appliance
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Hideki Tsutsui (1)
  • Toshihiko Manabe (1)
  • Mika Fukui (1)
  • Tetsuya Sakai (1)
  • Hiroko Fujii (1)
  • Koji Urata (1)
  1. Knowledge Media Laboratory, Corporate R&D Center, TOSHIBA Corp., Kawasaki, Japan
