Mixing and Merging for Spoken Document Retrieval
This paper describes a number of experiments that explored the issues surrounding the retrieval of spoken documents. Two such issues were examined: first, how best to use speech recogniser output to achieve the highest retrieval effectiveness; second, the potential problems of retrieving from a so-called “mixed collection”, i.e. one that contains documents from both a speech recognition system (producing many errors) and hand transcription (producing presumably near-perfect documents). The first part of the work found that merging the transcripts of multiple recognisers showed the most promise. The second part showed that the term weighting scheme used in a retrieval system was important in determining whether the system was detrimentally affected when retrieving from a mixed collection.
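The abstract does not say how the transcripts were merged or which term weighting schemes were compared, so the following is only a minimal sketch under stated assumptions: merging is taken to be plain concatenation of the token lists from each recogniser, and ranking uses a simple tf*idf sum. The function names `merge_transcripts` and `tf_idf_scores`, the toy documents, and the query are all hypothetical.

```python
import math
from collections import Counter


def merge_transcripts(transcripts):
    """Concatenate the token lists that several recognisers produced for one document.

    Assumption: merging is simple concatenation; the paper may do it differently.
    """
    merged = []
    for tokens in transcripts:
        merged.extend(tokens)
    return merged


def tf_idf_scores(query, docs):
    """Score every document against the query with an unnormalised tf*idf sum."""
    n_docs = len(docs)
    counts = {doc_id: Counter(tokens) for doc_id, tokens in docs.items()}

    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for term_counts in counts.values():
        df.update(term_counts.keys())

    scores = {}
    for doc_id, term_counts in counts.items():
        score = 0.0
        for term in query:
            tf = term_counts[term]
            if tf:
                score += tf * math.log(n_docs / df[term])
        scores[doc_id] = score
    return scores


# Hypothetical outputs of two recognisers for the same spoken document:
# recogniser B mishears "stock" as "stop", recogniser A gets it right.
recogniser_a = ["stock", "market", "fell"]
recogniser_b = ["stop", "market", "fell"]
merged = merge_transcripts([recogniser_a, recogniser_b])

# A toy mixed collection: d1 is the merged recogniser output,
# d2 stands in for a hand-transcribed document.
docs = {"d1": merged, "d2": ["weather", "report", "rain"]}
scores = tf_idf_scores(["stock", "market"], docs)
```

In this toy case the merged transcript still contains "stock" even though one recogniser missed it, which is the intuition behind merging. Note that terms both recognisers agree on have their term frequency doubled, so the choice of weighting scheme interacts with merging and with hand-transcribed documents in the same collection.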
- Book Title: Research and Advanced Technology for Digital Libraries
- Book Subtitle: Second European Conference, ECDL’98, Heraklion, Crete, Greece, September 21–23, 1998, Proceedings
- Pages: 397–407
- Series Title: Lecture Notes in Computer Science
- Publisher: Springer Berlin Heidelberg
- Copyright Holder: Springer-Verlag Berlin Heidelberg