World Wide Web, Volume 6, Issue 2, pp 187–208

A Multi-Modal Approach to Story Segmentation for News Video


  • Lekha Chaisorn
    • The School of Computing, National University of Singapore
  • Tat-Seng Chua
    • The School of Computing, National University of Singapore
  • Chin-Hui Lee
    • The School of Computing, National University of Singapore

DOI: 10.1023/A:1023622605600

Cite this article as:
Chaisorn, L., Chua, T. & Lee, C. World Wide Web (2003) 6: 187. doi:10.1023/A:1023622605600


This research proposes a two-level, multi-modal framework for segmenting and classifying news video into single-story semantic units. The video is analyzed at the shot level and at the story unit (or scene) level using a variety of features and techniques. At the shot level, we employ a decision tree technique to classify each shot into one of 13 predefined categories, or mid-level features. At the scene/story level, we perform Hidden Markov Model (HMM) analysis to locate story boundaries. Our initial results indicate that we can achieve an accuracy of over 95% for shot classification and an F1 measure of over 89% for scene/story boundary detection. Detailed analysis reveals that the HMM is effective in identifying dominant features, which helps in locating story boundaries. Our eventual goal is to support the retrieval of news video at the story unit level, together with associated texts retrieved from related news sites on the web.
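
To make the second level of the framework more concrete, the following is a minimal Python sketch of decoding story boundaries from a sequence of shot categories with a Hidden Markov Model, followed by an F1 computation over the detected boundaries. It is not the authors' model: the two hidden states, the three shot category names, every probability value, and the function names `viterbi` and `f1_measure` are hypothetical choices made only for illustration, and the exact-match F1 used here may differ from the evaluation protocol in the paper.

```python
import math

# Hypothetical toy HMM. The abstract mentions 13 shot categories; only three
# made-up ones are used here, and all probabilities are invented.
STATES = ("boundary", "inside")
START = {"boundary": 0.9, "inside": 0.1}
TRANS = {
    "boundary": {"boundary": 0.1, "inside": 0.9},
    "inside": {"boundary": 0.3, "inside": 0.7},
}
EMIT = {
    "boundary": {"anchor": 0.8, "live": 0.1, "speech": 0.1},
    "inside": {"anchor": 0.1, "live": 0.5, "speech": 0.4},
}


def viterbi(observations):
    """Return the most likely hidden-state sequence for a list of shot categories."""
    # scores[s] holds the best log-probability of any path ending in state s.
    scores = {
        s: math.log(START[s]) + math.log(EMIT[s][observations[0]]) for s in STATES
    }
    backpointers = []
    for obs in observations[1:]:
        new_scores, pointers = {}, {}
        for s in STATES:
            # Pick the predecessor state that maximizes the path score into s.
            best_prev = max(STATES, key=lambda p: scores[p] + math.log(TRANS[p][s]))
            new_scores[s] = (
                scores[best_prev]
                + math.log(TRANS[best_prev][s])
                + math.log(EMIT[s][obs])
            )
            pointers[s] = best_prev
        scores = new_scores
        backpointers.append(pointers)

    # Trace back from the best final state to recover the full path.
    state = max(STATES, key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def f1_measure(predicted, reference):
    """F1 over boundary positions, using exact matches (no tolerance window)."""
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


# Usage: decode a short sequence of shot categories and list boundary positions.
shots = ["anchor", "live", "speech", "anchor", "live"]
decoded = viterbi(shots)
boundaries = [i for i, state in enumerate(decoded) if state == "boundary"]
```

In this sketch the shot categories play the role of the observation sequence, so the quality of the first-level shot classifier directly limits what the HMM can recover. In practice the transition and emission probabilities would be estimated from labeled news video rather than set by hand.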

news story segmentation, shot classification, multi-modal approach, learning-based approach

Copyright information

© Kluwer Academic Publishers 2003