MOTCOM: The Multi-Object Tracking Dataset Complexity Metric

Pedersen, Malte; Haurum, Joakim Bruslund; Dendorfer, Patrick; Moeslund, Thomas B.

doi:10.1007/978-3-031-20074-8_2

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13668)

Included in the following conference series: European Conference on Computer Vision

Abstract

No comprehensive metric exists for describing the complexity of Multi-Object Tracking (MOT) sequences. This gap hampers explainability, complicates the comparison of datasets, and reduces the discussion of tracker performance to a matter of leaderboard position. As a remedy, we present the novel MOT dataset complexity metric (MOTCOM), which combines three sub-metrics inspired by key problems in MOT: occlusion, erratic motion, and visual similarity. The insights provided by MOTCOM can open nuanced discussions on tracker performance and may lead to a wider acknowledgement of novel contributions developed for lesser-known datasets or aimed at solving sub-problems.
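As a rough illustration of what combining three sub-metrics into one score can look like, the sketch below averages three per-sequence values. The abstract does not state how the sub-metrics are computed or combined, so the `SubMetrics` container, the `combine` function, the [0, 1] value range, and the unweighted mean are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubMetrics:
    """Hypothetical per-sequence sub-metric scores, assumed to lie in [0, 1]."""
    occlusion: float          # assumed: higher means more occlusion
    erratic_motion: float     # assumed: higher means less predictable motion
    visual_similarity: float  # assumed: higher means objects look more alike


def combine(scores: SubMetrics) -> float:
    """Combine the three sub-metrics with an unweighted mean (an assumption)."""
    values = (scores.occlusion, scores.erratic_motion, scores.visual_similarity)
    if not all(0.0 <= v <= 1.0 for v in values):
        raise ValueError("each sub-metric is expected to lie in [0, 1]")
    return sum(values) / len(values)


if __name__ == "__main__":
    # Illustrative values only; these are not results from the paper.
    print(combine(SubMetrics(occlusion=0.2, erratic_motion=0.4, visual_similarity=0.6)))
```

A single score of this kind makes sequences easy to rank, while the individual sub-metrics remain available to explain why one sequence scores higher than another.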

We evaluate MOTCOM on the comprehensive MOT17, MOT20, and MOTSynth datasets and show that MOTCOM describes the complexity of MOT sequences far better than the conventional measures of density and number of tracks. Project page at https://vap.aau.dk/motcom.
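For context, the two conventional descriptors named above can be computed directly from ground-truth annotations, and two sequence rankings can be compared with a rank-disagreement measure. The sketch below assumes a hypothetical minimal annotation format of `(frame_index, track_id)` pairs and uses Spearman's footrule purely as an illustration; it is not a description of the paper's evaluation protocol.

```python
from collections import defaultdict


def density_and_tracks(annotations):
    """Return (mean objects per annotated frame, number of unique track ids).

    `annotations` is an iterable of (frame_index, track_id) pairs, a
    hypothetical minimal format rather than any dataset's real schema.
    Frames without any annotation are not counted in the mean.
    """
    per_frame = defaultdict(set)
    track_ids = set()
    for frame, track_id in annotations:
        per_frame[frame].add(track_id)
        track_ids.add(track_id)
    if not per_frame:
        return 0.0, 0
    density = sum(len(ids) for ids in per_frame.values()) / len(per_frame)
    return density, len(track_ids)


def footrule(ranking_a, ranking_b):
    """Spearman's footrule: sum of absolute rank displacements.

    Both arguments are lists ordering the same sequence names without
    repeats; 0 means the two rankings are identical.
    """
    if len(set(ranking_a)) != len(ranking_a) or set(ranking_a) != set(ranking_b):
        raise ValueError("rankings must contain the same unique items")
    if len(ranking_a) != len(ranking_b):
        raise ValueError("rankings must have the same length")
    position_b = {name: rank for rank, name in enumerate(ranking_b)}
    return sum(abs(rank - position_b[name]) for rank, name in enumerate(ranking_a))


if __name__ == "__main__":
    # Toy annotations: frame 1 has two objects, frames 2 and 3 have one each.
    toy = [(1, 1), (1, 2), (2, 1), (3, 3)]
    print(density_and_tracks(toy))
    # Toy rankings of three hypothetical sequences by two complexity measures.
    print(footrule(["seq_a", "seq_b", "seq_c"], ["seq_c", "seq_b", "seq_a"]))
```

A lower footrule distance between a complexity ranking and a ranking by tracker performance would indicate that the complexity measure orders sequences more like the trackers do.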

Notes

1. With permission from the MOTChallenge benchmark authors.
2. Leaderboard results obtained on March 4, 2022.

Acknowledgements

This work has been funded by the Independent Research Fund Denmark under case number 9131-00128B.

Author information

Authors and Affiliations

Aalborg University, Aalborg, Denmark
Malte Pedersen, Joakim Bruslund Haurum & Thomas B. Moeslund
Pioneer Center for AI, Copenhagen, Denmark
Joakim Bruslund Haurum & Thomas B. Moeslund
Technical University of Munich, Munich, Germany
Patrick Dendorfer

Corresponding author

Correspondence to Malte Pedersen.

Editor information

Editors and Affiliations

Tel Aviv University, Tel Aviv, Israel
Shai Avidan
University College London, London, UK
Gabriel Brostow
Google AI, Accra, Ghana
Moustapha Cissé
University of Catania, Catania, Italy
Giovanni Maria Farinella
Facebook (United States), Menlo Park, CA, USA
Tal Hassner

Electronic supplementary material

Supplementary material 1 (PDF, 882 KB)

About this paper

Cite this paper

Pedersen, M., Haurum, J.B., Dendorfer, P., Moeslund, T.B. (2022). MOTCOM: The Multi-Object Tracking Dataset Complexity Metric. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13668. Springer, Cham. https://doi.org/10.1007/978-3-031-20074-8_2

DOI: https://doi.org/10.1007/978-3-031-20074-8_2
Published: 12 November 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-20073-1
Online ISBN: 978-3-031-20074-8
eBook Packages: Computer Science, Computer Science (R0)
