International Journal of Computer Vision, Volume 100, Issue 2, pp 190–202

Motion Coherent Tracking Using Multi-label MRF Optimization

  • David Tsai
  • Matthew Flagg
  • Atsushi Nakazawa
  • James M. Rehg

Abstract

We present a novel off-line algorithm for target segmentation and tracking in video. In our approach, video data is represented by a multi-label Markov Random Field model, and segmentation is accomplished by finding the minimum-energy label assignment. We propose a novel energy formulation that incorporates both segmentation and motion estimation in a single framework. Our energy functions enforce motion coherence both within and across frames. We utilize state-of-the-art methods to efficiently optimize over a large number of discrete labels. In addition, we introduce a new ground-truth dataset, called the Georgia Tech Segmentation and Tracking Dataset (GT-SegTrack), for the evaluation of segmentation accuracy in video tracking. We compare our method with several recent on-line tracking algorithms and provide quantitative and qualitative performance comparisons.
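
As a rough illustration of what "finding the minimum-energy label assignment" in a multi-label Markov Random Field can look like, the sketch below minimizes a generic energy (per-pixel data cost plus a Potts smoothness penalty on a 4-connected grid) with iterated conditional modes (ICM). This is not the energy formulation or the optimizer described in the article; the names `mrf_energy`, `icm`, `unary`, and `smooth_weight` are hypothetical, and ICM is chosen only because it is short enough to show in full.

```python
import numpy as np

def mrf_energy(labels, unary, smooth_weight):
    """Data cost of the chosen labels plus a Potts penalty on 4-connected neighbours."""
    h, w = labels.shape
    rows, cols = np.indices((h, w))
    data = unary[rows, cols, labels].sum()
    # Count label disagreements between vertical and horizontal neighbours.
    cuts = (labels[1:, :] != labels[:-1, :]).sum() + (labels[:, 1:] != labels[:, :-1]).sum()
    return float(data + smooth_weight * cuts)

def icm(unary, smooth_weight, max_sweeps=10):
    """Iterated conditional modes: greedy per-pixel descent on the energy (assumed optimizer)."""
    h, w, n_labels = unary.shape
    labels = unary.argmin(axis=2)  # start from the best data-only labelling
    all_labels = np.arange(n_labels)
    for _ in range(max_sweeps):
        changed = False
        for y in range(h):
            for x in range(w):
                cost = unary[y, x].astype(float).copy()
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        # Potts term: every label except the neighbour's pays the penalty.
                        cost += smooth_weight * (all_labels != labels[ny, nx])
                best = int(cost.argmin())
                if best != labels[y, x]:
                    labels[y, x] = best
                    changed = True
        if not changed:
            break
    return labels

# Toy 3x3 problem with 3 labels: every pixel prefers label 0 except the centre,
# whose data term prefers label 2.
unary = np.ones((3, 3, 3))
unary[:, :, 0] = 0.0
unary[1, 1] = [1.0, 1.0, 0.0]

initial = unary.argmin(axis=2)
result = icm(unary, smooth_weight=0.5)
print(mrf_energy(initial, unary, 0.5), mrf_energy(result, unary, 0.5))
```

In this toy case the isolated centre label is smoothed away, because disagreeing with four neighbours costs 4 × 0.5 = 2, which exceeds the data-cost saving of 1. ICM only reaches a local minimum in general; stronger discrete optimizers would be needed for label spaces as large as those the abstract mentions.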

Keywords

Video object segmentation, Visual tracking, Markov random field, Motion coherence, Combinatoric optimization, Biotracking

Supplementary material

(AVI 10.8 MB)

Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  • David Tsai (1)
  • Matthew Flagg (1)
  • Atsushi Nakazawa (2)
  • James M. Rehg (1)

  1. Center for Behavior Imaging and the Computational Perception Laboratory, School of Interactive Computing, Georgia Institute of Technology, Atlanta, USA
  2. Cybermedia Center, Osaka University, Toyonaka, Japan
