A Framework for Dialogue Detection in Movies

  • Margarita Kotti
  • Constantine Kotropoulos
  • Bartosz Ziólko
  • Ioannis Pitas
  • Vassiliki Moschou
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4105)


In this paper, we investigate a novel framework for dialogue detection that is based on indicator functions. An indicator function specifies whether a particular actor is present at each time instant. Two dialogue detection rules are developed and assessed. The first rule compares the value of the cross-correlation function at zero time lag to a threshold. The second rule compares the cross-power in a particular frequency band to a threshold. Experiments are carried out to validate the feasibility of these rules by using ground-truth indicator functions determined by human observers from six different movies. A total of 25 dialogue scenes and another 8 non-dialogue scenes are employed. The probabilities of false alarm and detection are estimated by cross-validation, where 70% of the available scenes are used to learn the thresholds employed in the dialogue detection rules and the remaining 30% are used for testing. Almost perfect dialogue detection is reported for every distinct threshold.
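
The two statistics named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the normalization, the frequency band, or the direction of the threshold comparison, so the 1/N and 1/N² scalings, the DFT-bin band limits, the "greater than" test, and all function names below are assumptions chosen for illustration.

```python
import cmath


def zero_lag_cross_correlation(a, b):
    """Sample cross-correlation of two indicator sequences at lag 0.

    Assumption: normalized by the sequence length N.
    """
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x * y for x, y in zip(a, b)) / len(a)


def dft(x):
    """Plain O(N^2) discrete Fourier transform (standard library only)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def band_cross_power(a, b, k_lo, k_hi):
    """Sum of cross-spectrum magnitudes over DFT bins k_lo..k_hi inclusive.

    Assumption: the band is given as bin indices and the result is
    scaled by 1/N^2; the abstract does not specify either choice.
    """
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(a)
    fa, fb = dft(a), dft(b)
    return sum(abs(fa[k] * fb[k].conjugate()) for k in range(k_lo, k_hi + 1)) / n ** 2


def exceeds_threshold(statistic, threshold):
    """Hypothetical decision rule; the comparison direction is an assumption."""
    return statistic > threshold


# Toy indicator functions: two actors who strictly alternate every 2 samples.
actor_a = [1, 1, 0, 0, 1, 1, 0, 0]
actor_b = [0, 0, 1, 1, 0, 0, 1, 1]

c0 = zero_lag_cross_correlation(actor_a, actor_b)
p_band = band_cross_power(actor_a, actor_b, 2, 2)
print("zero-lag cross-correlation:", c0)
print("cross-power in bin 2:", p_band)
```

In this toy case the two actors never appear at the same instant, so the zero-lag value is zero, while the cross-power concentrates in the bin that matches the alternation period. In the framework described above, the thresholds passed to a rule such as `exceeds_threshold` would be learned from the training scenes rather than set by hand.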


False Alarm, Indicator Function, Face Detection, Training Sequence, Audio Stream
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Margarita Kotti (1)
  • Constantine Kotropoulos (1)
  • Bartosz Ziólko (1)
  • Ioannis Pitas (1)
  • Vassiliki Moschou (1)
  1. Department of Informatics, Aristotle University of Thessaloniki, Thessaloniki, Greece
