Abstract
Recent works have shown that, even with simple low-level visual cues, complex behaviors can be extracted automatically from crowded scenes, e.g. those depicting public spaces recorded by video surveillance cameras. However, low-level features such as optical flow or foreground pixels are inherently noisy. In this paper we propose a novel unsupervised learning approach for the analysis of complex scenes which is specifically tailored to cope directly with the noise and uncertainty of such features. We formalize the task of extracting activity patterns as a matrix factorization problem, using the robust Earth Mover's Distance as the reconstruction function. A sparsity constraint on the computed basis matrix is imposed, filtering out noise and identifying the most relevant elementary activities in a typical high-level behavior. We further derive an alternating optimization approach that solves the proposed problem efficiently, showing that it reduces to a sequence of linear programs. Finally, we propose to use short trajectory snippets to capture object motion, as an alternative to the noisy optical flow vectors used in previous works. Experimental results demonstrate that our method yields performance similar or superior to state-of-the-art approaches.
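The abstract notes that the optimization reduces to a sequence of linear programs, with the Earth Mover's Distance (EMD) as the reconstruction function. As background, the EMD between two normalized histograms is itself the optimum of a transportation linear program. The sketch below is not the authors' factorization algorithm; it is a minimal illustration of the LP building block, assuming 1-D histograms with ground distance |i - j| and using SciPy's `linprog` solver.

```python
import numpy as np
from scipy.optimize import linprog

def emd(p, q):
    """Earth Mover's Distance between two normalized 1-D histograms,
    solved as the classical transportation LP with ground distance |i - j|."""
    n = len(p)
    # Cost vector over flattened flow variables f[i, j]: d[i, j] = |i - j|.
    cost = np.abs(np.subtract.outer(np.arange(n), np.arange(n))).ravel()
    # Equality constraints: each row of flow sums to p[i], each column to q[j].
    A_eq = np.zeros((2 * n, n * n))
    for i in range(n):
        A_eq[i, i * n:(i + 1) * n] = 1.0   # sum_j f[i, j] = p[i]
        A_eq[n + i, i::n] = 1.0            # sum_i f[i, j] = q[i] (column i)
    b_eq = np.concatenate([p, q])
    res = linprog(cost, A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
    return res.fun

p = np.array([0.5, 0.5, 0.0])
q = np.array([0.0, 0.5, 0.5])
print(emd(p, q))  # all mass shifts one bin to the right -> EMD = 1.0
```

In the paper's factorization setting, fixing one factor and minimizing the EMD reconstruction error over the other yields problems of this linear-programming form, which is what makes the alternating scheme tractable.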
Keywords
- Sparse Representation
- Sparse Code
- Nonnegative Matrix Factorization
- Complex Scene
- Probabilistic Latent Semantic Analysis
© 2012 Springer-Verlag Berlin Heidelberg
Cite this paper
Zen, G., Ricci, E., Sebe, N. (2012). Exploiting Sparse Representations for Robust Analysis of Noisy Complex Video Scenes. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds) Computer Vision – ECCV 2012. ECCV 2012. Lecture Notes in Computer Science, vol 7577. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33783-3_15
DOI: https://doi.org/10.1007/978-3-642-33783-3_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33782-6
Online ISBN: 978-3-642-33783-3