Exploiting Sparse Representations for Robust Analysis of Noisy Complex Video Scenes

  • Gloria Zen
  • Elisa Ricci
  • Nicu Sebe
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7577)

Abstract

Recent works have shown that, even with simple low-level visual cues, complex behaviors can be extracted automatically from crowded scenes, e.g. those depicting public spaces recorded by video surveillance cameras. However, low-level features such as optical flow or foreground pixels are inherently noisy. In this paper we propose a novel unsupervised learning approach for the analysis of complex scenes which is specifically tailored to cope directly with the noise and uncertainty of the features. We formalize the task of extracting activity patterns as a matrix factorization problem, adopting the robust Earth Mover’s Distance as reconstruction function. A sparsity constraint is imposed on the computed basis matrix, filtering out noise and leading to the identification of the most relevant elementary activities in a typical high-level behavior. We further derive an alternating optimization approach that solves the proposed problem efficiently by reducing it to a sequence of linear programs. Finally, we propose to use short trajectory snippets to capture object motion information, as an alternative to the noisy optical flow vectors used in previous works. Experimental results demonstrate that our method yields performance similar or superior to state-of-the-art approaches.
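The reconstruction function used in the factorization above is the Earth Mover’s Distance, which can be computed as a transportation linear program. As a minimal illustrative sketch (not the authors’ implementation), the following computes the EMD between two equal-mass histograms with `scipy.optimize.linprog`; the histograms and ground-distance matrix are toy values chosen for the example:

```python
import numpy as np
from scipy.optimize import linprog

def emd(p, q, ground_dist):
    """Earth Mover's Distance between two equal-mass histograms,
    solved as a transportation linear program over flows f[i, j] >= 0."""
    n, m = len(p), len(q)
    c = ground_dist.reshape(-1)  # cost of moving one unit of mass i -> j
    A_eq = np.zeros((n + m, n * m))
    for i in range(n):           # supply constraints: sum_j f[i, j] = p[i]
        A_eq[i, i * m:(i + 1) * m] = 1.0
    for j in range(m):           # demand constraints: sum_i f[i, j] = q[j]
        A_eq[n + j, j::m] = 1.0
    b_eq = np.concatenate([p, q])
    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
    return res.fun

# Toy example: two 3-bin histograms, ground distance = bin index gap
p = np.array([0.5, 0.5, 0.0])
q = np.array([0.0, 0.5, 0.5])
bins = np.arange(3)
D = np.abs(bins[:, None] - bins[None, :]).astype(float)
print(emd(p, q, D))  # all mass shifts one bin on average -> 1.0
```

Because the EMD itself is a linear program, fixing one factor of the factorization and minimizing over the other (the alternating scheme described in the abstract) likewise reduces to linear programming, which is what makes the overall optimization tractable.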

Keywords

Sparse Representation, Sparse Code, Nonnegative Matrix Factorization, Complex Scene, Probabilistic Latent Semantic Analysis
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  1. Wang, X., Ma, X., Grimson, W.: Unsupervised activity perception in crowded and complicated scenes using hierarchical Bayesian models. IEEE Transactions on Pattern Analysis and Machine Intelligence 31, 539–555 (2008)
  2. Hospedales, T., Gong, S., Xiang, T.: A Markov clustering topic model for mining behaviour in video. In: ICCV (2009)
  3. Kuettel, D., Breitenstein, M.D., Van Gool, L., Ferrari, V.: What’s going on? Discovering spatio-temporal dependencies in dynamic scenes. In: CVPR (2010)
  4. Li, J., Gong, S., Xiang, T.: Learning behavioural context. International Journal of Computer Vision 97, 276–304 (2012)
  5. Zen, G., Ricci, E.: Earth mover’s prototypes: a convex learning approach for discovering activity patterns in dynamic scenes. In: CVPR (2011)
  6. Ricci, E., Zen, G., Sebe, N., Messelodi, S.: A prototype learning framework using EMD: application to complex scenes analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence (2012) (online)
  7. Rubner, Y., Tomasi, C., Guibas, L.: The earth mover’s distance as a metric for image retrieval. International Journal of Computer Vision 40, 99–121 (2000)
  8. Varadarajan, J., Emonet, R., Odobez, J.M.: A sparsity constraint for topic models – application to temporal activity mining. In: NIPS Workshop on Practical Applications of Sparse Modeling: Open Issues and New Directions (2010)
  9. Haines, T., Xiang, T.: Video topic modelling with behavioural segmentation. In: ACM Workshop on Multimodal Pervasive Video Analysis (2010)
  10. Zhao, B., Fei-Fei, L., Xing, E.: Online detection of unusual events in videos via dynamic sparse coding. In: CVPR (2011)
  11. Lu, Z., Peng, Y.: Latent semantic learning by efficient sparse coding with hypergraph regularization. In: AAAI Conference on Artificial Intelligence (2011)
  12. Matikainen, P., Hebert, M., Sukthankar, R.: Action recognition through the motion analysis of tracked features. In: ICCV Workshop on Video-oriented Object and Event Classification (2009)
  13. Matikainen, P., Hebert, M., Sukthankar, R.: Representing pairwise spatial and temporal relations for action recognition. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part I. LNCS, vol. 6311, pp. 508–521. Springer, Heidelberg (2010)
  14. Raptis, M., Soatto, S.: Tracklet descriptors for action modeling and video analysis. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part I. LNCS, vol. 6311, pp. 577–590. Springer, Heidelberg (2010)
  15. Takahashi, M., Naemura, M., Fujii, M., Satoh, S.: Human action recognition in crowded surveillance video sequences by using features taken from key-point trajectories. In: CVPR Workshop on Machine Learning for Vision-based Motion Analysis (MLVMA) (2011)
  16. Lee, D., Seung, H.: Learning the parts of objects by non-negative matrix factorization. Nature 401, 788–791 (1999)
  17. Heiler, M., Schnorr, C.: Learning sparse representations by non-negative matrix factorization and sequential cone programming. Journal of Machine Learning Research 7, 1385–1407 (2006)
  18. Hoyer, P.O.: Non-negative matrix factorization with sparseness constraints. Journal of Machine Learning Research 5, 1457–1469 (2004)
  19. Sandler, R., Lindenbaum, M.: Nonnegative matrix factorization with earth mover’s distance metric. In: CVPR (2009)
  20. Ling, H., Okada, K.: An efficient earth mover’s distance algorithm for robust histogram comparison. IEEE Transactions on Pattern Analysis and Machine Intelligence 29, 840–843 (2006)
  21. Shirdhonkar, S., Jacobs, D.W.: Approximate earth mover’s distance in linear time. In: CVPR (2008)
  22. Birchfield, S.: KLT: An implementation of the Kanade-Lucas-Tomasi feature tracker (2007)
  23. Stauffer, C., Grimson, W.: Adaptive background mixture models for real-time tracking. In: CVPR (1999)
  24. Tuy, H.: Convex programs with an additional reverse convex constraint. Journal of Optimization Theory and Applications 52, 463–486 (1987)
  25. Li, J., Gong, S., Xiang, T.: Global behaviour inference using probabilistic latent semantic analysis. In: BMVC (2008)

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Gloria Zen (1)
  • Elisa Ricci (2)
  • Nicu Sebe (1)
  1. DISI, University of Trento, Italy
  2. DIEI, University of Perugia, Italy