Word Error Correction of Continuous Speech Recognition Using WEB Documents for Spoken Document Indexing

  • Hiromitsu Nishizaki
  • Yoshihiro Sekiguchi
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4285)


This paper describes a method for correcting errors in continuous speech recognition output using WEB documents, with the aim of indexing spoken documents. We ran an error correction experiment on automatically transcribed news speech, focusing especially on proper nouns. Two large vocabulary continuous speech recognition (LVCSR) systems were used to distinguish correctly recognized words from incorrectly recognized ones. Keywords for an Internet search engine were selected from the correctly transcribed words, and correction candidates for the misrecognized words were then drawn from the retrieved documents. A Dynamic Programming (DP) technique with a confusion matrix was used to compare the candidates with the misrecognized words. In the error correction experiment, the recognition rate of proper nouns improved by about 10% when WEB documents were used.
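
The DP comparison mentioned in the abstract can be illustrated with a small sketch. The abstract does not specify the matching unit, the cost function, or the contents of the confusion matrix, so everything below is an assumption made for illustration: words are treated as phoneme sequences, the confusion matrix is a hypothetical dictionary of confusion probabilities, and the names `substitution_cost`, `dp_distance`, and `best_candidate` are invented here. It is a generic weighted edit distance, not necessarily the authors' formulation.

```python
from typing import Dict, Sequence, Tuple

# Hypothetical convention: confusion[(a, b)] is the probability that
# phoneme a is recognized as phoneme b. Missing pairs count as 0.
Confusion = Dict[Tuple[str, str], float]


def substitution_cost(a: str, b: str, confusion: Confusion) -> float:
    """Return 0 for identical phonemes, less than 1 for confusable pairs."""
    if a == b:
        return 0.0
    return 1.0 - confusion.get((a, b), 0.0)


def dp_distance(
    candidate: Sequence[str],
    misrecognized: Sequence[str],
    confusion: Confusion,
    indel: float = 1.0,  # assumed flat cost for insertions and deletions
) -> float:
    """Weighted edit distance between two phoneme sequences."""
    rows, cols = len(candidate) + 1, len(misrecognized) + 1
    dist = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dist[i][0] = i * indel
    for j in range(1, cols):
        dist[0][j] = j * indel
    for i in range(1, rows):
        for j in range(1, cols):
            dist[i][j] = min(
                dist[i - 1][j] + indel,  # phoneme of candidate dropped
                dist[i][j - 1] + indel,  # extra phoneme in recognizer output
                dist[i - 1][j - 1]
                + substitution_cost(candidate[i - 1], misrecognized[j - 1], confusion),
            )
    return dist[-1][-1]


def best_candidate(
    misrecognized: Sequence[str],
    candidates: Dict[str, Sequence[str]],
    confusion: Confusion,
) -> str:
    """Pick the candidate word whose phonemes are closest to the misrecognized word."""
    return min(
        candidates,
        key=lambda word: dp_distance(candidates[word], misrecognized, confusion),
    )
```

With a toy matrix such as `{("b", "p"): 0.6}`, a candidate that differs from the misrecognized word only by a b/p swap scores lower than one that differs by an unrelated phoneme, so `best_candidate` prefers it. In a real system the matrix would be estimated from recognizer errors rather than written by hand.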


Error Correction, Word Sequence, Proper Noun, Internet Search Engine, Word Candidate
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Hiromitsu Nishizaki (1)
  • Yoshihiro Sekiguchi (1)
  1. Interdisciplinary Graduate School of Medicine and Engineering, University of Yamanashi, Kofu, Yamanashi, Japan
