
Grid-based multi-object tracking with Siamese CNN based appearance edge and access region mechanism

Multimedia Tools and Applications

Abstract

Multi-object tracking has received growing attention in recent years for its many applications, yet it remains a complex and challenging problem. The conventional grid-based tracking method is an efficient and effective way to tackle multi-object tracking, but it does not yet take appearance similarity into account, and doing so can further boost its performance. We therefore introduce an appearance similarity edge into the grid-based method, where a Siamese network is used to produce the proposed similarity edge. In addition, we build a grid model with hexagonal cells and propose an access region mechanism that comprises an accessible area definition and an automatic generation approach for entrance/exit grids. Since our tracking framework follows the 'tracking-by-detection' paradigm, the corresponding detection information can be integrated into the access region mechanism, which facilitates appropriate grid modeling. We verify the proposed Siamese network based appearance edge and access region mechanism through experiments on popular datasets such as PETS-09 and KITTI.
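As a rough illustration of the two ingredients named in the abstract, the sketch below maps a ground-plane point to a hexagonal cell and turns an appearance similarity into an edge cost. It is a minimal sketch under stated assumptions, not the authors' implementation: the pointy-top axial layout, the use of cosine similarity on embedding vectors (standing in for the output of a Siamese network), the negative-log cost, and all function names are hypothetical choices made for this example.

```python
import math


def _hex_round(q, r):
    """Round fractional axial coordinates to the nearest hexagonal cell."""
    x, z = q, r
    y = -x - z  # cube coordinates satisfy x + y + z = 0
    rx, ry, rz = round(x), round(y), round(z)
    dx, dy, dz = abs(rx - x), abs(ry - y), abs(rz - z)
    # Recompute the component with the largest rounding error
    # so that the cube constraint still holds.
    if dx > dy and dx > dz:
        rx = -ry - rz
    elif dy > dz:
        ry = -rx - rz
    else:
        rz = -rx - ry
    return rx, rz


def point_to_hex(x, y, size):
    """Map a ground-plane point to axial (q, r) of a pointy-top hexagonal cell.

    `size` is the distance from a cell centre to a corner (assumed layout).
    """
    q = (math.sqrt(3) / 3 * x - 1 / 3 * y) / size
    r = (2 / 3 * y) / size
    return _hex_round(q, r)


def hex_neighbors(q, r):
    """Return the six cells adjacent to (q, r) in axial coordinates."""
    return [(q + 1, r), (q - 1, r), (q, r + 1),
            (q, r - 1), (q + 1, r - 1), (q - 1, r + 1)]


def cosine_similarity(a, b):
    """Cosine similarity of two embedding vectors (0.0 if either is zero)."""
    dot = sum(u * v for u, v in zip(a, b))
    na = math.sqrt(sum(u * u for u in a))
    nb = math.sqrt(sum(v * v for v in b))
    if na == 0 or nb == 0:
        return 0.0
    return dot / (na * nb)


def appearance_edge_cost(emb_a, emb_b, eps=1e-6):
    """Assumed edge cost: negative log of similarity rescaled to (0, 1).

    The embeddings stand in for the outputs of the two Siamese branches.
    """
    sim = (cosine_similarity(emb_a, emb_b) + 1) / 2
    sim = min(max(sim, eps), 1 - eps)  # keep the logarithm finite
    return -math.log(sim)
```

In this sketch, two detections with similar embeddings receive a low edge cost and dissimilar ones a high cost, so a path-based optimizer over the grid would prefer linking detections that look alike. How the paper actually weights or combines this edge with the grid transitions is not stated in the abstract.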



Acknowledgements

This work is supported by the National Natural Science Foundation of China (Grant No. 61727802).

Author information


Corresponding author

Correspondence to Mingwu Ren.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Chen, L., Lou, J., Xu, F. et al. Grid-based multi-object tracking with Siamese CNN based appearance edge and access region mechanism. Multimed Tools Appl 79, 35333–35351 (2020). https://doi.org/10.1007/s11042-019-07747-2


  • DOI: https://doi.org/10.1007/s11042-019-07747-2
