Abstract
We present the shape-model object tracker, which is accurate, robust, and real-time capable on a standard CPU. The tracker includes failure-mode detection, is robust to nonlinear illumination changes, and can cope with occlusions. It uses subpixel-precise image edges to track roughly rigid objects with high accuracy and is virtually drift-free, even for long sequences. Furthermore, it is inherently capable of re-detecting the object when tracking fails. To evaluate the accuracy, robustness, and efficiency of the tracker precisely, we present a challenging new tracking dataset with pixel-precise ground truth, whose labels are created automatically from the photo-realistic synthetic VIPER dataset. The tracker is thoroughly evaluated against the state of the art in a number of qualitative and quantitative experiments. It performs on par with current state-of-the-art deep-learning trackers but is at least 45 times faster, even without a GPU. The efficiency and low memory consumption of the tracker are further validated in experiments on an embedded device.
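The edge-based matching the abstract refers to scores a pose by comparing the gradient directions of model edge points against the image gradients at the corresponding locations; taking the absolute normalized dot product makes the score invariant to occlusion, clutter, and local illumination polarity. The following is a minimal translation-only sketch of that idea in NumPy, with an assumed gradient threshold and `np.gradient` standing in for a proper edge filter; it is an illustration, not the authors' implementation:

```python
import numpy as np

def gradient_field(img):
    # Central-difference image gradients (a stand-in for a proper edge filter).
    gy, gx = np.gradient(img.astype(float))
    return gx, gy

def edge_model(template, thresh=10.0):
    # Keep only strong-gradient pixels as model points; store their positions
    # and unit gradient directions.
    gx, gy = gradient_field(template)
    mag = np.hypot(gx, gy)
    ys, xs = np.nonzero(mag > thresh)
    dirs = np.stack([gx[ys, xs], gy[ys, xs]], axis=1) / mag[ys, xs, None]
    return np.stack([ys, xs], axis=1), dirs

def similarity(img, points, dirs, dy, dx):
    # Mean absolute normalized dot product between model and image gradient
    # directions at the translated model points. The abs() ignores contrast
    # polarity, which is what makes the score robust to nonlinear
    # illumination changes; points with (near-)zero image gradient or
    # outside the image are skipped, which gives robustness to occlusion.
    gx, gy = gradient_field(img)
    ys, xs = points[:, 0] + dy, points[:, 1] + dx
    valid = (ys >= 0) & (ys < img.shape[0]) & (xs >= 0) & (xs < img.shape[1])
    ys, xs = ys[valid], xs[valid]
    g = np.stack([gx[ys, xs], gy[ys, xs]], axis=1)
    mag = np.linalg.norm(g, axis=1)
    ok = mag > 1e-9
    cos = np.abs((dirs[valid][ok] * g[ok]).sum(axis=1) / mag[ok])
    return cos.mean() if cos.size else 0.0
```

In this toy version the score at the correct offset of a synthetic square is 1.0 and drops for wrong offsets; the full tracker additionally handles rotation, scale, subpixel refinement, and uses the score itself as a failure-mode detector (a low best score signals lost tracking and triggers re-detection).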
Böttger, T., Steger, C. Accurate and robust tracking of rigid objects in real time. J Real-Time Image Proc 18, 493–510 (2021). https://doi.org/10.1007/s11554-020-00978-9