Abstract
In this paper we propose a method for detecting dialogue scenes in movies. This task is of particular interest because dialogue-based scenes play a special semantic role in most movies. The proposed approach first segments the video footage into shots; each shot is then classified as dialogue or non-dialogue by a Multi-Expert System (MES); finally, the identified sequences of dialogue shots are aggregated into dialogue scenes by a suitable algorithm. The MES integrates three experts that consider different, complementary aspects of the same decision problem, so that combining their individual decisions performs better than any single expert. While the general multi-expert approach is not new, its application to this specific problem is novel, and the results obtained are encouraging.
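The last two stages of the pipeline can be illustrated with a minimal sketch. The abstract specifies neither the combination rule used by the MES nor the aggregation algorithm, so the code below makes two labeled assumptions: the three expert decisions are combined by simple majority voting, and a dialogue scene is any run of at least `min_shots` consecutive dialogue shots. The names `majority_vote`, `classify_shots`, `aggregate_scenes` and `min_shots` are hypothetical and do not come from the paper.

```python
from typing import List, Optional, Tuple


def majority_vote(votes: Tuple[bool, bool, bool]) -> bool:
    """Assumed combination rule: 'dialogue' if at least two of three experts agree."""
    return sum(votes) >= 2


def classify_shots(expert_votes: List[Tuple[bool, bool, bool]]) -> List[bool]:
    """Combine the per-shot decisions of the three experts into one label per shot."""
    return [majority_vote(v) for v in expert_votes]


def aggregate_scenes(labels: List[bool], min_shots: int = 2) -> List[Tuple[int, int]]:
    """Assumed aggregation: group runs of consecutive dialogue shots.

    Returns (first_shot, last_shot) index pairs, keeping only runs that
    contain at least `min_shots` shots.
    """
    scenes: List[Tuple[int, int]] = []
    start: Optional[int] = None
    for i, is_dialogue in enumerate(labels):
        if is_dialogue and start is None:
            start = i  # a new run of dialogue shots begins
        elif not is_dialogue and start is not None:
            if i - start >= min_shots:
                scenes.append((start, i - 1))
            start = None  # the run ended at the previous shot
    # Close a run that reaches the end of the shot list.
    if start is not None and len(labels) - start >= min_shots:
        scenes.append((start, len(labels) - 1))
    return scenes


# Hypothetical votes of three experts on six shots (True = dialogue).
votes = [
    (True, True, False),
    (True, False, True),
    (False, False, True),
    (True, True, True),
    (False, False, False),
    (True, True, True),
]
labels = classify_shots(votes)
scenes = aggregate_scenes(labels, min_shots=2)
```

Majority voting is only one of several possible decision-level combination rules; a weighted or reliability-based rule would replace `majority_vote` without changing the rest of the sketch.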
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
De Santo, M., Percannella, G., Sansone, C., Vento, M. (2001). Dialogue Scenes Detection in MPEG Movies: A Multi-expert Approach. In: Tucci, M. (eds) Multimedia Databases and Image Communication. MDIC 2001. Lecture Notes in Computer Science, vol 2184. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44819-5_16
DOI: https://doi.org/10.1007/3-540-44819-5_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42587-8
Online ISBN: 978-3-540-44819-8
eBook Packages: Springer Book Archive