Television information filtering through speech recognition

  • Arjen P. de Vries Jr.
Session 2: Multimedia Services on Demand I
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1045)


The problem of information overload can be solved by applying information filtering to the huge amount of available data. Information on radio and television can be filtered using speech recognition of the audio track. A prototype system that uses closed captions has been developed on top of the INQUERY information access system. Integrating speech recognition and information retrieval into a working system remains a major challenge. The open problems are the selection of a document representation model, the recognition and selection of indexing features for speech retrieval, and dealing with the erroneous output of recognition processes.


multimedia, multimedia representation, content-based retrieval, information filtering, automatic indexing, speech recognition, content analysis, probabilistic information retrieval





Copyright information

© Springer-Verlag Berlin Heidelberg 1996

Authors and Affiliations

  • Arjen P. de Vries Jr. (1)
  1. Centre for Telematics and Information Technology, University of Twente, The Netherlands
