Detecting Violent Scenes in Movies by Auditory and Visual Cues

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5353)

Abstract

To detect violence in movies, we present a three-stage method that integrates visual and auditory cues. First, shots with potentially violent content are identified according to universal film-making rules. A modified semi-supervised learning technique based on semi-supervised cross feature learning (SCFL) is used, since it can combine different types of features and use unlabeled data to improve classification performance. Second, typical violence-related audio effects are detected in the candidate shots, and the confidence values output by the classifiers for the various audio events are transformed into a shot-based violence score. Finally, the probabilistic outputs of the first two stages are integrated in a boosting manner to generate the final inference. Preliminary experimental results on four typical action movies show the effectiveness of our method.
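The abstract does not give the formulas for the last two stages, so the following Python sketch is only an illustration of the general idea, not the authors' method. It assumes a noisy-OR rule to turn per-event audio confidences into one shot-level score, and a simple weighted vote to combine that score with the visual-stage probability. The function names, the event labels, and the weights `w_visual` and `w_audio` are all hypothetical.

```python
def audio_violence_score(event_confidences):
    """Map per-event classifier confidences to one shot-level score.

    Assumption (not from the paper): take the peak confidence of each
    audio event within the shot, then combine events with a noisy-OR,
    so one strongly detected event is enough to give a high score.
    """
    miss = 1.0
    for confidences in event_confidences.values():
        peak = max(confidences, default=0.0)  # an event with no detections counts as 0
        miss *= 1.0 - peak
    return 1.0 - miss


def fuse_stages(visual_prob, audio_score, w_visual=1.0, w_audio=1.0):
    """Combine the two stage outputs with a weighted vote.

    Assumption (not from the paper): each stage votes in [-1, 1] and the
    sign of the weighted sum gives the decision, in the spirit of a
    boosted ensemble of two weak decisions. The weights are hypothetical.
    """
    vote = w_visual * (2.0 * visual_prob - 1.0) + w_audio * (2.0 * audio_score - 1.0)
    return vote > 0.0, vote


if __name__ == "__main__":
    # Hypothetical confidences for two audio events in one candidate shot.
    shot_audio = {"gunshot": [0.1, 0.8], "scream": [0.5]}
    score = audio_violence_score(shot_audio)
    is_violent, vote = fuse_stages(visual_prob=0.6, audio_score=score)
    print(f"audio score={score:.2f}, vote={vote:.2f}, violent={is_violent}")
```

In a real system the weights would be learned from labeled shots rather than fixed by hand; they are constants here only to keep the sketch self-contained.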

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gong, Y., Wang, W., Jiang, S., Huang, Q., Gao, W. (2008). Detecting Violent Scenes in Movies by Auditory and Visual Cues. In: Huang, YM.R., et al. Advances in Multimedia Information Processing - PCM 2008. PCM 2008. Lecture Notes in Computer Science, vol 5353. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89796-5_33

  • DOI: https://doi.org/10.1007/978-3-540-89796-5_33

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-89795-8

  • Online ISBN: 978-3-540-89796-5

  • eBook Packages: Computer Science, Computer Science (R0)
