Audio-visual processing for scene change detection

Saraceno, Caterina; Leonardi, Riccardo

doi:10.1007/3-540-63508-4_114


Poster Session C: Compression, Hardware & Software, Databases, Neural Networks, Object Recognition & Reconstruction
Conference paper
First Online: 01 January 2005


Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1311)

Abstract

Organizing video databases according to the semantic content of their data is a key issue in multimedia technologies, since it would allow tasks such as indexing and retrieval to work more efficiently.

In an attempt to extract semantic information, efforts have been devoted to segmenting video into shots and extracting information, such as a representative video frame, for each shot. Because a video sequence is constructed from a 2-D projection of a 3-D scene, processing the video information alone has shown its limitations, especially for problems such as object identification or object tracking. Furthermore, not all information is contained in the video signal, and more can be achieved by analyzing the audio signal as well. The audio signal can be used either to confirm the results obtained by a video processing unit or to acquire information that cannot be extracted from video (such as the presence of music).

This paper presents a technique which combines video and audio information for classification and indexing purposes.
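
The abstract does not describe the technique itself, so the following is only an illustrative sketch of the general idea of using audio to confirm video results. It is not the authors' method. It assumes each video frame is summarized by a normalized color histogram, that one short-time audio energy value is available per video frame, and that a video cut is "confirmed" when an audio change occurs within a few frames of it. All function names, thresholds, and the tolerance parameter are hypothetical.

```python
from typing import List, Sequence


def histogram_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """L1 distance between two normalized histograms (assumed feature)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def video_cut_candidates(histograms: Sequence[Sequence[float]],
                         threshold: float = 0.5) -> List[int]:
    """Frame indices whose histogram differs strongly from the previous frame."""
    return [
        i for i in range(1, len(histograms))
        if histogram_distance(histograms[i - 1], histograms[i]) > threshold
    ]


def audio_change_points(energies: Sequence[float],
                        ratio: float = 2.0) -> List[int]:
    """Frame indices where short-time energy jumps or drops by more than `ratio`."""
    points = []
    for i in range(1, len(energies)):
        low, high = sorted((energies[i - 1], energies[i]))
        # Guard against division-like blowups when the lower energy is ~0.
        if high > ratio * max(low, 1e-9):
            points.append(i)
    return points


def confirmed_cuts(video_cuts: Sequence[int],
                   audio_points: Sequence[int],
                   tolerance: int = 1) -> List[int]:
    """Keep only video cuts that have an audio change within `tolerance` frames."""
    return [
        v for v in video_cuts
        if any(abs(v - a) <= tolerance for a in audio_points)
    ]


if __name__ == "__main__":
    # Toy data: two-bin histograms and one energy value per frame (hypothetical).
    histograms = [[1, 0], [1, 0], [0, 1], [0, 1], [1, 0], [1, 0]]
    energies = [1.0, 1.0, 5.0, 5.0, 5.2, 5.1]

    cuts = video_cut_candidates(histograms)
    audio = audio_change_points(energies)
    print(cuts, audio, confirmed_cuts(cuts, audio))
```

In this toy input the video stream alone suggests two cuts, but only the one that coincides with an audio change is kept. A real system would need properly aligned audio and video time bases and features chosen for the material at hand.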



Author information

Authors and Affiliations

Signals and Communications Lab., Dept. of Electronics for Automation, School of Engineering, University of Brescia, via Branze, 38, I-25123, Italy
Caterina Saraceno & Riccardo Leonardi


Editor information

Alberto Del Bimbo

Cite this paper

Saraceno, C., Leonardi, R. (1997). Audio-visual processing for scene change detection. In: Del Bimbo, A. (eds) Image Analysis and Processing. ICIAP 1997. Lecture Notes in Computer Science, vol 1311. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-63508-4_114

DOI: https://doi.org/10.1007/3-540-63508-4_114
Published: 29 July 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-63508-6
Online ISBN: 978-3-540-69586-8
eBook Packages: Springer Book Archive
