Collaborative Deep Reinforcement Learning for Multi-object Tracking

  • Liangliang Ren
  • Jiwen Lu (email author)
  • Zifeng Wang
  • Qi Tian
  • Jie Zhou
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11207)


In this paper, we propose a collaborative deep reinforcement learning (C-DRL) method for multi-object tracking. Most existing multi-object tracking methods employ the tracking-by-detection strategy, which first detects objects in each frame and then associates them across frames. However, the performance of these methods relies heavily on the detection results, which are often unsatisfactory in real applications, especially in crowded scenes. To address this, we develop a deep prediction-decision network in our C-DRL, which simultaneously detects and predicts objects under a unified network via deep reinforcement learning. Specifically, we treat each object as an agent and track it via the prediction network, and we seek the optimal tracking results by exploiting the collaborative interactions among different agents and their environments via the decision network. Experimental results on the challenging MOT15 and MOT16 benchmarks show the effectiveness of our approach.
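As background for the tracking-by-detection strategy mentioned in the abstract, the following is a minimal sketch of the association step only, assuming axis-aligned boxes and a greedy match on intersection-over-union. It is not the C-DRL method; the names `iou`, `associate`, and `min_iou` are hypothetical and chosen for illustration.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), illustrative format


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def associate(tracks: Dict[int, Box], detections: List[Box],
              min_iou: float = 0.3) -> Tuple[Dict[int, int], List[int]]:
    """Greedily match track ids to detection indices by descending IoU.

    Returns (matches, unmatched_detection_indices). The threshold
    min_iou is an assumed value, not taken from the paper.
    """
    pairs = sorted(
        ((iou(box, det), tid, j)
         for tid, box in tracks.items()
         for j, det in enumerate(detections)),
        reverse=True,
    )
    matches: Dict[int, int] = {}
    used = set()
    for score, tid, j in pairs:
        if score < min_iou:
            break  # all remaining pairs overlap too little
        if tid in matches or j in used:
            continue  # each track and each detection is used at most once
        matches[tid] = j
        used.add(j)
    unmatched = [j for j in range(len(detections)) if j not in used]
    return matches, unmatched


tracks = {1: (0, 0, 10, 10), 2: (20, 20, 30, 30)}
detections = [(21, 21, 31, 31), (1, 1, 11, 11), (50, 50, 60, 60)]
matches, unmatched = associate(tracks, detections)
```

In this sketch a track whose object the detector misses simply receives no match for that frame, which is one way to see why association quality depends on detection quality.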


Keywords: Object tracking · Multi-object · Deep reinforcement learning



This work was supported in part by the National Key Research and Development Program of China under Grant 2017YFA0700802, in part by the National Natural Science Foundation of China under Grants 61672306, U1713214, and 61572271, in part by NSFC under Grant No. 61429201, and in part to Dr. Qi Tian by ARO Grant W911NF-15-1-0290 and Faculty Research Gift Awards from NEC Laboratories of America and Blippar.



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Liangliang Ren (1)
  • Jiwen Lu (1, email author)
  • Zifeng Wang (1)
  • Qi Tian (2, 3)
  • Jie Zhou (1)

  1. Tsinghua University, Beijing, China
  2. Huawei Noah's Ark Lab, Beijing, China
  3. University of Texas at San Antonio, San Antonio, USA
