Multi-object Tracking Method Based on Efficient Channel Attention and Switchable Atrous Convolution

Neural Processing Letters

Abstract

In recent years, object detection and data association, the core components of multi-object tracking, have made remarkable progress. The main strategy in multi-object tracking is tracking-by-detection. Although detection-based tracking methods can achieve strong results, they rely on the performance of the detector. In complex scenes, the detector cannot provide reliable results, and the resulting incorrect detections make the data association process unreliable as well. Motivated by this, this paper focuses on improving the accuracy of both detection and data association. We introduce the efficient channel attention module into the backbone network, which can adaptively extract important information from images. Furthermore, we apply switchable atrous convolution in the network to dynamically adjust the receptive field according to object changes. In the data association process, the appearance features with minimum occlusion are saved for each existing trajectory and are used for re-association after objects are lost. Extensive experiments on the challenging MOT16, MOT17 and MOT20 datasets demonstrate that our method is comparable with state-of-the-art multi-object tracking methods.
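
To make the channel attention step concrete, the following is a minimal NumPy sketch of an efficient channel attention gate applied to a single feature map. It is an illustration under stated assumptions, not the authors' implementation: the function names `eca_kernel_size` and `eca_gate` are hypothetical, the kernel-size rule (with `gamma=2`, `b=1`) follows the heuristic published with ECA-Net and may differ from what this paper uses, and the 1D convolution weights are passed in by hand here, whereas in a real network they would be learned.

```python
import math
import numpy as np


def eca_kernel_size(channels, gamma=2, b=1):
    """Odd 1D kernel size chosen from the channel count (ECA-Net heuristic; assumed here)."""
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 else t + 1


def eca_gate(x, weights):
    """Apply a channel attention gate to a feature map x of shape (C, H, W).

    weights: 1D kernel of odd length k, shared across all channels
             (hypothetical input; learned in a real network).
    """
    channels = x.shape[0]
    k = len(weights)
    # Squeeze: global average pooling gives one descriptor per channel.
    pooled = x.mean(axis=(1, 2))
    # Local cross-channel interaction: 1D convolution over the channel axis
    # with zero padding, so the output keeps one value per channel.
    padded = np.pad(pooled, k // 2)
    mixed = np.array([np.dot(padded[i:i + k], weights) for i in range(channels)])
    # Sigmoid turns the mixed descriptors into per-channel weights in (0, 1).
    gate = 1.0 / (1.0 + np.exp(-mixed))
    # Excite: rescale every channel of the input by its weight.
    return x * gate[:, None, None]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feature_map = rng.standard_normal((64, 8, 8))
    kernel = rng.standard_normal(eca_kernel_size(64))
    out = eca_gate(feature_map, kernel)
    print(out.shape)
```

The gate adds only `k` parameters per layer because the same small kernel slides over all channels, which is the property that makes this form of attention cheap to add to a backbone.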

Author information

Corresponding author

Correspondence to Xuezhi Xiang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work was supported in part by the CAAI-Huawei MindSpore Open Fund, in part by the National Natural Science Foundation of China under Grant 61401113, in part by the Natural Science Foundation of Heilongjiang Province of China under Grant LC201426, and in part by the Fundamental Research Funds for the Central Universities of China under Grant 3072021CF0801.

About this article

Cite this article

Xiang, X., Ren, W., Qiu, Y. et al. Multi-object Tracking Method Based on Efficient Channel Attention and Switchable Atrous Convolution. Neural Process Lett 53, 2747–2763 (2021). https://doi.org/10.1007/s11063-021-10519-5

  • DOI: https://doi.org/10.1007/s11063-021-10519-5
