International Journal of Computer Vision, Volume 105, Issue 1, pp 19–48

Dynamic Template Tracking and Recognition

Abstract

In this paper we address the problem of tracking non-rigid objects whose local appearance and motion change over time. This class of objects includes dynamic textures such as steam, fire, smoke, and water, as well as articulated objects such as humans performing various actions. We model the temporal evolution of the object’s appearance and motion using a linear dynamical system. We learn such models from sample videos and use them as dynamic templates for tracking objects in novel videos. We pose the problem of tracking a dynamic non-rigid object in the current frame as a maximum a posteriori estimate of the location of the object and the latent state of the dynamical system, given the current image features and the best estimate of the state in the previous frame. The advantage of our approach is that we can specify a priori the type of texture to be tracked in the scene by using previously trained models for the dynamics of these textures. Our framework naturally generalizes common tracking methods such as SSD and kernel-based tracking from static templates to dynamic templates. We test our algorithm on synthetic as well as real examples of dynamic textures and show that our simple dynamics-based trackers perform on par with, if not better than, the state of the art. Since our approach is general and applicable to any image feature, we also apply it to the problem of human action tracking and build action-specific optical flow trackers that perform better than the state of the art when tracking a human performing a particular action. Finally, since our approach is generative, we can use trackers trained a priori for different texture or action classes to simultaneously track and recognize the texture or action in the video.
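As a rough illustration of the joint estimate described above, the following toy sketch searches over candidate locations and, at each one, solves in closed form for the latent state that best explains the observed features while staying close to the state predicted from the previous frame. Everything specific here is an assumption made for illustration and is not taken from the paper: Gaussian noise with variances `sigma2` and `q`, a scalar latent state, a 1-D "frame", an exhaustive search over offsets, and the hypothetical names `map_state` and `track_step`.

```python
"""Toy sketch: joint MAP estimate of location and latent state for a dynamic template.

Assumptions (not from the abstract): Gaussian noise, a scalar latent state,
a 1-D "frame", and an exhaustive search over candidate locations.
"""


def map_state(patch, mu, c, a, x_prev, sigma2=1.0, q=1.0):
    """Closed-form minimizer over x of
    ||patch - mu - c*x||^2 / (2*sigma2) + (x - a*x_prev)^2 / (2*q)."""
    residual = [p - m for p, m in zip(patch, mu)]
    num = sum(ci * ri for ci, ri in zip(c, residual)) / sigma2 + a * x_prev / q
    den = sum(ci * ci for ci in c) / sigma2 + 1.0 / q
    x = num / den
    data_term = sum((ri - ci * x) ** 2 for ri, ci in zip(residual, c)) / (2 * sigma2)
    prior_term = (x - a * x_prev) ** 2 / (2 * q)
    return x, data_term + prior_term


def track_step(frame, mu, c, a, x_prev, sigma2=1.0, q=1.0):
    """Return (location, state, cost) minimizing the joint cost over all offsets."""
    width = len(mu)
    best = None
    for loc in range(len(frame) - width + 1):
        x, cost = map_state(frame[loc:loc + width], mu, c, a, x_prev, sigma2, q)
        if best is None or cost < best[2]:
            best = (loc, x, cost)
    return best


# Hypothetical template: mean appearance mu, appearance basis c, dynamics a.
mu = [1.0, 2.0, 1.0]
c = [1.0, 0.0, -1.0]
a = 0.9
x_prev = 1.0

# Synthetic frame: the template with state a * x_prev placed at offset 4.
x_true = a * x_prev
frame = [0.0] * 10
for i in range(3):
    frame[4 + i] = mu[i] + c[i] * x_true

loc, x_hat, cost = track_step(frame, mu, c, a, x_prev)
print(loc, x_hat, cost)
```

With the dynamics term removed and the state held fixed, the data term reduces to a sum-of-squared-differences match against a static template, which is one way to read the claim that the framework generalizes SSD tracking.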

Keywords

Dynamic templates, Dynamic textures, Human actions, Tracking, Linear dynamical systems, Recognition

Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  1. Center for Imaging Science, Johns Hopkins University, Baltimore, USA
  2. Johns Hopkins University, Baltimore, USA
