Audio-Assisted Scene Segmentation for Story Browsing
Content-based video retrieval requires an effective scene segmentation technique to divide a long video file into meaningful high-level aggregates of shots called scenes. Each scene is part of a story. Browsing these scenes unfolds the entire story of a film. In this paper, we first investigate recent scene segmentation techniques that belong to the visual-audio alignment approach. This approach segments a video stream into visual scenes and an audio stream into audio scenes separately and later aligns these boundaries to create the final scene boundaries. In contrast, we propose a novel audio-assisted scene segmentation technique that utilizes audio information to remove false boundaries generated from segmentation by visual information alone. The crux of our technique is the new dissimilarity measure based on analysis of statistical properties of audio features and a concept in information theory. The experimental results on two full-length films with a wide range of camera motion and a complex composition of shots demonstrate the effectiveness of our technique compared with that of the visual-audio alignment techniques.
Keywords: Visual Scene, Analysis Window, Audio Feature, Audio Stream, Shot Boundary
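The audio-assisted idea described in the abstract, using audio to discard false boundaries produced by visual segmentation, can be illustrated with a minimal sketch. The abstract does not specify the dissimilarity measure beyond "statistical properties of audio features and a concept in information theory", so everything concrete here is an assumption made for illustration: a single scalar audio feature per frame, a Gaussian fit per shot, a symmetric Kullback-Leibler divergence as the measure, and a fixed threshold. The function names (`gaussian_fit`, `symmetric_kl`, `filter_boundaries`) are hypothetical and are not taken from the paper.

```python
from statistics import fmean, pvariance


def gaussian_fit(values, min_var=1e-6):
    """Return (mean, variance) of a 1-D feature sequence.

    The variance floor (an assumption) avoids division by zero for
    near-constant audio such as silence.
    """
    return fmean(values), max(pvariance(values), min_var)


def symmetric_kl(a, b):
    """Symmetric KL divergence between 1-D Gaussians fitted to a and b.

    KL(p||q) + KL(q||p) for Gaussians; the log-variance terms cancel.
    """
    m1, v1 = gaussian_fit(a)
    m2, v2 = gaussian_fit(b)
    d2 = (m1 - m2) ** 2
    return 0.5 * ((v1 + d2) / v2 + (v2 + d2) / v1 - 2.0)


def filter_boundaries(shot_audio, visual_boundaries, threshold):
    """Keep visual boundary i (between shot i-1 and shot i) only when the
    audio on the two sides is dissimilar enough; otherwise treat it as a
    false boundary and drop it.
    """
    kept = []
    for i in visual_boundaries:
        if symmetric_kl(shot_audio[i - 1], shot_audio[i]) >= threshold:
            kept.append(i)
    return kept


# Hypothetical per-frame audio energy for three consecutive shots.
shot_audio = [
    [0.10, 0.12, 0.11, 0.09],  # shot 0: quiet dialogue
    [0.11, 0.10, 0.12, 0.10],  # shot 1: same ambience as shot 0
    [0.80, 0.75, 0.85, 0.90],  # shot 2: loud music
]
print(filter_boundaries(shot_audio, [1, 2], threshold=1.0))  # [2]
```

In this toy input, the visual boundary between shots 0 and 1 is dropped because the audio statistics on both sides are nearly identical, while the boundary before shot 2 survives. A real system would use multi-dimensional audio features and a tuned or adaptive threshold.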