The SoVideo Mandarin Chinese Broadcast News Retrieval System

  • Hsin-min Wang
  • Shi-sian Cheng
  • Yong-cheng Chen


This paper describes the SoVideo broadcast news retrieval system for Mandarin Chinese. The system is based on technologies such as large-vocabulary continuous speech recognition for Mandarin Chinese, automatic story segmentation, and information retrieval. The database currently consists of 177 hours of broadcast news, which automatic story segmentation divided into 3,264 stories. We discuss the development and evaluation of each component of the retrieval system.

Keywords: broadcast news, speech recognition, story segmentation, spoken document retrieval





Copyright information

© Kluwer Academic Publishers 2004

Authors and Affiliations

  • Hsin-min Wang (1)
  • Shi-sian Cheng (1)
  • Yong-cheng Chen (1)
  1. Institute of Information Science, Academia Sinica, Taipei, Taiwan
