Information Retrieval, Volume 3, Issue 3, pp 173–188

New Approaches to Spoken Document Retrieval

  • Martin Wechsler
  • Eugen Munteanu
  • Peter Schäuble
Article

Abstract

This paper presents four novel techniques for open-vocabulary spoken document retrieval: a method to detect slots that possibly contain a query feature; a method to estimate occurrence probabilities; a technique that we call collection-wide probability re-estimation; and a weighting scheme that exploits the fact that long query features are detected more reliably. The four techniques were evaluated on the TREC-6 spoken document retrieval test collection to determine the improvement in retrieval effectiveness over a baseline retrieval method. The results show that retrieval effectiveness can be improved considerably despite the large number of speech recognition errors.

spoken document retrieval, speech recognition, retrieval effectiveness

Copyright information

© Kluwer Academic Publishers 2000

Authors and Affiliations

  • Martin Wechsler
    • 1
  • Eugen Munteanu
    • 2
  • Peter Schäuble
    • 3
  1. McKinsey and Company, Switzerland
  2. Eurospider Information Technology AG, Zurich, Switzerland
  3. Eurospider Information Technology AG, Zurich, Switzerland