Multi-object Tracking Method Based on Efficient Channel Attention and Switchable Atrous Convolution

Xiang, Xuezhi; Ren, Wenkai; Qiu, Yujian; Zhang, Kaixu; Lv, Ning

doi:10.1007/s11063-021-10519-5

Published: 29 April 2021

Volume 53, pages 2747–2763, (2021)

Neural Processing Letters

Xuezhi Xiang ORCID: orcid.org/0000-0002-6185-833X^1,2,
Wenkai Ren^1,2,
Yujian Qiu^1,2,
Kaixu Zhang^1,2 &
Ning Lv^1,2

Abstract

In recent years, object detection and data association, the core components of multi-object tracking, have made remarkable progress. The dominant strategy in multi-object tracking is tracking-by-detection. Although detection-based tracking methods can achieve strong results, they rely on the performance of the detector. In complex scenes, the detector cannot provide reliable results, and incorrect detections in turn make the data association process unreliable. Motivated by this, this paper focuses on improving the accuracy of both detection and data association. We introduce the efficient channel attention module into the backbone network, which can adaptively extract important information from images. Furthermore, we apply switchable atrous convolution in the network to dynamically adjust the receptive field according to object changes. In the data association process, the appearance features with minimum occlusion are saved for each existing trajectory and are used for re-association after objects are lost. Extensive experiments on the challenging MOT16, MOT17 and MOT20 datasets demonstrate that our method is comparable with state-of-the-art multi-object tracking methods.
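
As an illustration of the efficient channel attention idea mentioned in the abstract, the following is a minimal sketch in plain Python. It follows the general ECA recipe (global average pooling, a 1D convolution across channels, a sigmoid gate, channel-wise rescaling) and is not the authors' implementation. The function names `eca_kernel_size` and `eca`, the nested-list tensor layout, and the fixed convolution weights are assumptions made for the example; in a trained network the weights are learned and the operation runs on tensors inside the backbone.

```python
import math


def eca_kernel_size(channels, gamma=2, b=1):
    """Adaptive 1D kernel size: nearest odd value of log2(C)/gamma + b/gamma.

    gamma=2 and b=1 are the commonly used defaults (assumed here).
    """
    t = int(abs(math.log2(channels) / gamma + b / gamma))
    return t if t % 2 == 1 else t + 1


def eca(feature_map, weights):
    """Apply channel attention to a C x H x W feature map given as nested lists.

    weights is a 1D kernel of odd length k; it is fixed here for illustration,
    whereas a real module would learn it.
    Returns the rescaled feature map and the per-channel gates.
    """
    pad = len(weights) // 2
    # 1) Global average pooling: one scalar descriptor per channel.
    pooled = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_map
    ]
    # 2) Local cross-channel interaction: zero-padded 1D convolution
    #    over the channel axis, with no dimensionality reduction.
    padded = [0.0] * pad + pooled + [0.0] * pad
    logits = [
        sum(w * padded[c + j] for j, w in enumerate(weights))
        for c in range(len(pooled))
    ]
    # 3) Sigmoid gate: one attention weight in (0, 1) per channel.
    gates = [1.0 / (1.0 + math.exp(-z)) for z in logits]
    # 4) Rescale every value in a channel by that channel's gate.
    scaled = [
        [[g * v for v in row] for row in ch]
        for ch, g in zip(feature_map, gates)
    ]
    return scaled, gates
```

For example, calling `eca(fm, [0.0, 1.0, 0.0])` on a two-channel map uses an identity kernel, so each gate is simply the sigmoid of that channel's mean activation. A kernel with non-zero neighbours would let adjacent channels influence each other's gates, which is the cross-channel interaction the module relies on.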

Author information

Authors and Affiliations

School of Information and Communication Engineering, Harbin Engineering University, Harbin, 150001, China
Xuezhi Xiang, Wenkai Ren, Yujian Qiu, Kaixu Zhang & Ning Lv
Key Laboratory of Advanced Marine Communication and Information Technology, Ministry of Industry and Information Technology, Harbin, 150001, China
Xuezhi Xiang, Wenkai Ren, Yujian Qiu, Kaixu Zhang & Ning Lv

Corresponding author

Correspondence to Xuezhi Xiang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work was supported in part by the CAAI-Huawei MindSpore Open Fund, in part by the National Natural Science Foundation of China under Grant 61401113, in part by the Natural Science Foundation of Heilongjiang Province of China under Grant LC201426, and in part by the Fundamental Research Funds for the Central Universities of China under Grant 3072021CF0801.

About this article

Cite this article

Xiang, X., Ren, W., Qiu, Y. et al. Multi-object Tracking Method Based on Efficient Channel Attention and Switchable Atrous Convolution. Neural Process Lett 53, 2747–2763 (2021). https://doi.org/10.1007/s11063-021-10519-5

Accepted: 15 April 2021
Published: 29 April 2021
Issue Date: August 2021
DOI: https://doi.org/10.1007/s11063-021-10519-5
