
Summarizing high-level scene behavior

  • Original Paper
Machine Vision and Applications

Abstract

We present several novel techniques for summarizing high-level behavior in surveillance video. The proposed methods accept either optical flow or trajectories as input and combine spatial and temporal information, improving upon existing summarization approaches. We first extract common pathway regions by performing graph-based clustering on similarity matrices that describe the relationships between location/orientation states. We then use the activity along these pathway regions to extract aggregate behavioral patterns throughout the scene. We show how the resulting summaries can be applied to detect anomalies, retrieve video clips of interest, and generate adaptive-speed summary videos. We evaluate our approaches on multiple complex urban scenes and present experimental results.
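To make the pathway-extraction step more concrete, the following is a minimal sketch of graph-based clustering on a similarity matrix over location/orientation states. It is not the authors' algorithm: it assumes a Gaussian similarity on position and heading and substitutes a standard two-way normalized spectral cut. The function names and the `sigma_pos` and `sigma_ang` parameters are hypothetical, chosen only for illustration.

```python
import numpy as np


def state_similarity(states, sigma_pos=1.0, sigma_ang=1.0):
    """Build an (n, n) similarity matrix from (x, y, theta) states.

    Assumption: similarity is a product of Gaussian kernels on the
    positional distance and on the wrapped heading difference.
    """
    states = np.asarray(states, dtype=float)
    pos, theta = states[:, :2], states[:, 2]
    d_pos2 = np.sum((pos[:, None, :] - pos[None, :, :]) ** 2, axis=-1)
    d_ang = theta[:, None] - theta[None, :]
    d_ang = np.arctan2(np.sin(d_ang), np.cos(d_ang))  # wrap to [-pi, pi]
    return np.exp(-d_pos2 / (2 * sigma_pos**2)) * np.exp(-d_ang**2 / (2 * sigma_ang**2))


def bipartition(W):
    """Split the graph in two using the second-smallest eigenvector of the
    symmetric normalized Laplacian (a standard normalized spectral cut)."""
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(L_sym)  # eigenvalues in ascending order
    fiedler = eigvecs[:, 1] * d_inv_sqrt  # map back to the generalized problem
    return (fiedler > 0).astype(int)


if __name__ == "__main__":
    # Toy scene: four states heading east and four heading west along one road.
    east = [[x, 0.0, 0.0] for x in range(4)]
    west = [[x, 0.0, np.pi] for x in range(4)]
    labels = bipartition(state_similarity(np.array(east + west)))
    print(labels)
```

In this toy scene the two groups occupy the same locations and differ only in heading, so the orientation term is what allows the cut to separate them into two pathway regions. A real scene would need more than two clusters, for example by applying the cut recursively or by clustering several eigenvectors at once.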

Acknowledgments

This research was supported in part by the Air Force Research Laboratories under contract No. FA8650-07-D-1220.

Author information

Corresponding author

Correspondence to Kevin Streib.

About this article

Cite this article

Streib, K., Davis, J.W. Summarizing high-level scene behavior. Machine Vision and Applications 25, 229–244 (2014). https://doi.org/10.1007/s00138-013-0573-2

  • DOI: https://doi.org/10.1007/s00138-013-0573-2
