Abstract
No comprehensive metric exists for describing the complexity of Multi-Object Tracking (MOT) sequences. This lack of metrics decreases explainability, complicates comparison of datasets, and reduces the conversation on tracker performance to a matter of leaderboard position. As a remedy, we present the novel MOT dataset complexity metric (MOTCOM), which is a combination of three sub-metrics inspired by key problems in MOT: occlusion, erratic motion, and visual similarity. The insights of MOTCOM can open nuanced discussions on tracker performance and may lead to a wider acknowledgement of novel contributions developed for either lesser-known datasets or those aimed at solving sub-problems.
We evaluate MOTCOM on the comprehensive MOT17, MOT20, and MOTSynth datasets and show that MOTCOM describes the complexity of MOT sequences far better than the conventional measures of density and number of tracks. Project page at https://vap.aau.dk/motcom.
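For orientation, the two conventional measures named above can be computed directly from ground-truth annotations. The sketch below is a minimal illustration and not the MOTCOM implementation. It assumes MOTChallenge-style comma-separated rows whose first two fields are the frame number and the track ID, and it assumes density means the average number of annotated objects per annotated frame. The function name `density_and_tracks` and the toy annotations are hypothetical.

```python
import csv
import io
from collections import defaultdict


def density_and_tracks(gt_rows):
    """Return (mean objects per annotated frame, number of unique track IDs).

    Assumes each row starts with: frame number, track ID.
    """
    ids_per_frame = defaultdict(set)
    track_ids = set()
    for row in gt_rows:
        if not row:
            continue  # skip blank lines
        frame, track_id = int(row[0]), int(row[1])
        ids_per_frame[frame].add(track_id)
        track_ids.add(track_id)
    if not ids_per_frame:
        return 0.0, 0
    total_objects = sum(len(ids) for ids in ids_per_frame.values())
    return total_objects / len(ids_per_frame), len(track_ids)


# Toy annotations, made up for illustration:
# frame, id, left, top, width, height
toy_gt = """1,1,10,10,20,40
1,2,50,12,20,40
2,1,12,10,20,40
2,2,52,12,20,40
2,3,90,15,20,40
"""
density, tracks = density_and_tracks(csv.reader(io.StringIO(toy_gt)))
print(density, tracks)  # 2.5 3
```

Both numbers only count objects. They say nothing about occlusion, erratic motion, or visual similarity, which are the three aspects MOTCOM is built around.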
Notes
1. With permission from the MOTChallenge benchmark authors.
2. Leaderboard results obtained on March 4, 2022.
Acknowledgements
This work has been funded by the Independent Research Fund Denmark under case number 9131-00128B.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Pedersen, M., Haurum, J.B., Dendorfer, P., Moeslund, T.B. (2022). MOTCOM: The Multi-Object Tracking Dataset Complexity Metric. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13668. Springer, Cham. https://doi.org/10.1007/978-3-031-20074-8_2
DOI: https://doi.org/10.1007/978-3-031-20074-8_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-20073-1
Online ISBN: 978-3-031-20074-8
eBook Packages: Computer Science, Computer Science (R0)