International Journal of Computer Vision, Volume 107, Issue 2, pp 203–217

Multi-Target Tracking by Online Learning a CRF Model of Appearance and Motion Patterns

Abstract

We introduce an online learning approach for multi-target tracking. Detection responses are progressively associated into tracklets over multiple levels to produce the final tracks. Unlike most previous approaches, which focus only on producing discriminative motion and appearance models for all targets, we also consider discriminative features for distinguishing difficult pairs of targets. The tracking problem is formulated using an online learned conditional random field (CRF) model and is transformed into an energy minimization problem. The energy functions include a set of unary functions, based on motion and appearance models for discriminating all targets, and a set of pairwise functions, based on models for differentiating corresponding pairs of tracklets. The online CRF approach is more powerful at distinguishing spatially close targets with similar appearances, as well as at tracking targets in the presence of camera motion. An efficient algorithm is introduced for finding an association with low energy cost. We present results on four public data sets and show significant improvements over several state-of-the-art methods.
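
The energy formulation described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the tuple representation of a CRF node as a candidate link `(earlier_tracklet, later_tracklet)`, the one-successor/one-predecessor constraint, and all cost values are assumptions made for illustration. The sketch also uses exhaustive search, which is only feasible for toy inputs and stands in for the efficient algorithm the abstract mentions.

```python
from itertools import product


def energy(labels, unary, pairwise):
    """Total energy of a binary labeling (illustrative, hypothetical form).

    labels   : dict node -> 0/1, where 1 means the node's two tracklets are linked
    unary    : dict node -> cost paid when that node is labeled 1
    pairwise : dict (node_a, node_b) -> cost paid when both nodes are labeled 1
    """
    total = sum(unary[n] for n, y in labels.items() if y == 1)
    total += sum(
        cost
        for (a, b), cost in pairwise.items()
        if labels[a] == 1 and labels[b] == 1
    )
    return total


def is_valid(labels):
    """Assumed constraint: a tracklet has at most one successor and one predecessor."""
    heads = [n[0] for n, y in labels.items() if y == 1]
    tails = [n[1] for n, y in labels.items() if y == 1]
    return len(heads) == len(set(heads)) and len(tails) == len(set(tails))


def minimize_energy(unary, pairwise):
    """Exhaustive search over all valid labelings (toy-sized problems only)."""
    nodes = sorted(unary)
    best_labels, best_energy = None, float("inf")
    for bits in product((0, 1), repeat=len(nodes)):
        labels = dict(zip(nodes, bits))
        if not is_valid(labels):
            continue
        e = energy(labels, unary, pairwise)
        if e < best_energy:
            best_labels, best_energy = labels, e
    return best_labels, best_energy


# Made-up costs: tracklets A and B end before tracklets C and D start.
# Negative unary costs reward plausible links; values are purely illustrative.
unary = {
    ("A", "C"): -1.0,
    ("A", "D"): -1.1,
    ("B", "C"): -1.1,
    ("B", "D"): -1.0,
}
# Pairwise costs act on pairs of candidate links for a hard-to-separate pair.
pairwise = {
    (("A", "D"), ("B", "C")): 0.5,   # penalize this joint assignment
    (("A", "C"), ("B", "D")): -0.2,  # favor this joint assignment
}

best_labels, best_energy = minimize_energy(unary, pairwise)
chosen_links = sorted(n for n, y in best_labels.items() if y == 1)
```

With these made-up numbers, the unary terms alone would favor linking A to D and B to C, while adding the pairwise terms makes linking A to C and B to D the lower-energy association. That is the kind of disambiguation between close, similar targets that the pairwise functions are meant to provide.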

Keywords

Multi-target tracking; Online learned CRF; Appearance and motion patterns; Association based tracking

Notes

Acknowledgments

Research was sponsored, in part, by the Office of Naval Research under Grant number N00014-10-1-0517 and by the Army Research Laboratory and was accomplished under Cooperative Agreement Number W911NF-10-2-0063. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Army Research Laboratory or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation herein.

Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  1. Institute for Robotics and Intelligent Systems, University of Southern California, Los Angeles, USA
