Abstract
Object tracking consistently identifies each instance across frames, starting from an initial set of object detections. In multiple object tracking (MOT), the tracking-by-detection paradigm performs two steps in sequence: detection and data association. The goal of MOT is to localize and identify all objects of interest and to associate their detections across frames. MOT algorithms must keep tracking even when challenging conditions occur, such as revisiting the same view, missing detections, occlusion, temporarily unseen objects, and objects of the same appearance coexisting in the same frame. Hence, re-identification (re-id) appears to be the most powerful tool for assigning the correct identity to each instance when these issues arise. In this work, we propose SAT, a similarity-based person re-id framework that uses a Siamese neural network with shared weights. Once detections are obtained from the backbone, SAT applies a Siamese feature extraction model, and we introduce a similarity array for assessing tracklets against detections. We evaluate SAT on several benchmarks with extensive experiments and statistical tests, where it improves on the current state of the art according to commonly used performance metrics, with higher accuracy, fewer ID switches, and lower false positive and false negative rates.
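The idea of scoring tracklets against detections with a similarity array can be illustrated with a minimal sketch. The abstract does not state which similarity measure or assignment rule SAT uses, so the cosine similarity, the greedy matching, the 0.5 threshold, and all function names below are assumptions made only for illustration. The short feature vectors stand in for the embeddings a shared-weight Siamese branch would produce.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def similarity_array(tracklet_feats, detection_feats):
    """Build a 2D array: rows index tracklets, columns index detections."""
    return [[cosine_similarity(t, d) for d in detection_feats]
            for t in tracklet_feats]


def greedy_match(sim, threshold=0.5):
    """Pair tracklets with detections, highest similarity first.

    Hypothetical assignment rule, not taken from the paper: each tracklet
    and each detection is used at most once, and pairs scoring below
    `threshold` are left unmatched.
    """
    pairs = sorted(
        ((sim[i][j], i, j) for i in range(len(sim)) for j in range(len(sim[i]))),
        reverse=True,
    )
    used_tracklets, used_detections, matches = set(), set(), {}
    for score, i, j in pairs:
        if score < threshold:
            break  # all remaining pairs score lower
        if i in used_tracklets or j in used_detections:
            continue
        matches[i] = j
        used_tracklets.add(i)
        used_detections.add(j)
    return matches


# Toy embeddings standing in for Siamese-branch outputs (made-up values).
tracklets = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
detections = [[0.1, 0.9, 0.0], [0.9, 0.1, 0.0], [0.0, 0.0, 1.0]]

sim = similarity_array(tracklets, detections)
matches = greedy_match(sim)  # maps tracklet index -> detection index
```

In this toy case the third detection matches no tracklet, so a tracker would typically start a new identity for it. A globally optimal assignment, for example with the Hungarian algorithm, could replace the greedy rule.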
Ethics declarations
Conflict of interest
The authors certify that there is no actual or potential conflict of interest in relation to this article.
About this article
Cite this article
Suljagic, H., Bayraktar, E. & Celebi, N. Similarity based person re-identification for multi-object tracking using deep Siamese network. Neural Comput & Applic 34, 18171–18182 (2022). https://doi.org/10.1007/s00521-022-07456-2
DOI: https://doi.org/10.1007/s00521-022-07456-2