A Multi-Modal Approach to Story Segmentation for News Video

Chaisorn, Lekha; Chua, Tat-Seng; Lee, Chin-Hui

doi:10.1023/A:1023622605600

Published: June 2003

World Wide Web, Volume 6, pages 187–208 (2003)


Abstract

This research proposes a two-level, multi-modal framework for segmenting and classifying news video into single-story semantic units. The video is analyzed at the shot level and at the story unit (or scene) level using a variety of features and techniques. At the shot level, we employ a decision tree technique to classify each shot into one of 13 predefined categories (mid-level features). At the scene/story level, we perform Hidden Markov Model (HMM) analysis to locate story boundaries. Our initial results indicate an accuracy of over 95% for shot classification and an F₁ measure of over 89% for scene/story boundary detection. Detailed analysis reveals that the HMM is effective in identifying dominant features, which helps in locating story boundaries. Our eventual goal is to support the retrieval of news video at the story unit level, together with associated texts retrieved from related news sites on the web.
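
For readers unfamiliar with the F₁ measure quoted above, the following is a minimal sketch of how it could be computed for boundary detection. It assumes a detected boundary counts as correct only when it coincides exactly with a reference boundary; the abstract does not state the matching criterion used in the paper, and the function name `f1_measure` and the sample shot indices are hypothetical.

```python
def f1_measure(detected, reference):
    """Return (precision, recall, F1) for two collections of boundary positions.

    Assumption: a detected boundary is correct only if the same position
    appears in the reference set (exact match, no tolerance window).
    """
    detected, reference = set(detected), set(reference)
    correct = len(detected & reference)
    precision = correct / len(detected) if detected else 0.0
    recall = correct / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical shot indices at which a new story starts (illustrative only).
reference = [0, 12, 27, 40, 58]
detected = [0, 12, 30, 40, 58, 63]

p, r, f1 = f1_measure(detected, reference)
print(f"precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
```

Because F₁ is a harmonic mean, it stays low unless both precision and recall are high, which is why it is a common single-number summary for boundary detection.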

Author information

Authors and Affiliations

The School of Computing, National University of Singapore, 117543, Singapore
Lekha Chaisorn, Tat-Seng Chua & Chin-Hui Lee

About this article

Cite this article

Chaisorn, L., Chua, TS. & Lee, CH. A Multi-Modal Approach to Story Segmentation for News Video. World Wide Web 6, 187–208 (2003). https://doi.org/10.1023/A:1023622605600

Issue Date: June 2003
DOI: https://doi.org/10.1023/A:1023622605600
