
MSL-RAPTOR: A 6DoF Relative Pose Tracker for Onboard Robotic Perception

  • Conference paper

Experimental Robotics (ISER 2020)

Part of the book series: Springer Proceedings in Advanced Robotics (SPAR, volume 19)

Abstract

Determining the relative position and orientation of objects in an environment is a fundamental building block for a wide range of robotics applications. To accomplish this task efficiently in practical settings, a method must be fast, use common sensors, and generalize easily to new objects and environments. We present MSL-RAPTOR, a two-stage algorithm for tracking a rigid body with a monocular camera. The image is first processed by an efficient neural-network-based front-end that detects new objects and tracks 2D bounding boxes between frames. The class label and bounding box are passed to the back-end, which updates the object's pose using an unscented Kalman filter (UKF). The measurement posterior is fed back to the 2D tracker to improve robustness. Because the object's class is identified, a class-specific UKF can be used when custom dynamics and constraints are known. Adapting to track the pose of a new class requires only a trained 2D object detector or labeled 2D bounding-box data, together with the approximate size of the objects. The performance of MSL-RAPTOR is first verified on the NOCS-REAL275 dataset, where it achieves results comparable to RGB-D approaches despite not using depth measurements. When tracking a flying drone from onboard another drone, it runs 3 times faster than the fastest comparable method while reducing median translation and rotation errors by 66% and 23%, respectively.
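The back-end's core mechanism, an unscented Kalman filter, can be illustrated with a minimal sketch. Everything below is a generic, simplified illustration of the unscented transform and predict/update cycle: the function names, the toy constant-velocity state, and all parameter values are hypothetical, and this is not the paper's class-specific dynamics model or its pose-to-bounding-box measurement model.

```python
import numpy as np

def sigma_points(mu, P, kappa=1.0):
    """Symmetric (Julier-style) sigma points for the unscented transform."""
    n = mu.size
    S = np.linalg.cholesky((n + kappa) * P)
    pts = np.vstack([mu, mu + S.T, mu - S.T])        # (2n+1, n) points
    w = np.full(2 * n + 1, 1.0 / (2 * (n + kappa)))  # weights, sum to 1
    w[0] = kappa / (n + kappa)
    return pts, w

def ukf_step(mu, P, z, f, h, Q, R):
    """One predict/update cycle of an unscented Kalman filter."""
    # Predict: propagate sigma points through the (possibly nonlinear) dynamics f.
    pts, w = sigma_points(mu, P)
    fx = np.array([f(p) for p in pts])
    mu_p = w @ fx
    P_p = Q + sum(wi * np.outer(d, d) for wi, d in zip(w, fx - mu_p))
    # Update: map predicted sigma points through the measurement model h.
    pts, w = sigma_points(mu_p, P_p)
    hx = np.array([h(p) for p in pts])
    z_hat = w @ hx
    S = R + sum(wi * np.outer(d, d) for wi, d in zip(w, hx - z_hat))
    C = sum(wi * np.outer(dp, dz)
            for wi, dp, dz in zip(w, pts - mu_p, hx - z_hat))
    K = C @ np.linalg.inv(S)                          # Kalman gain
    return mu_p + K @ (z - z_hat), P_p - K @ S @ K.T

# Demo: track a 1D constant-velocity target from position-only measurements.
dt = 1.0
A = np.array([[1.0, dt], [0.0, 1.0]])
f = lambda x: A @ x          # linear here, but the UKF also handles nonlinear f
h = lambda x: x[:1]          # measure position only
mu, P = np.array([0.0, 0.0]), np.eye(2) * 10.0
Q, R = np.eye(2) * 0.01, np.array([[0.1]])
for t in range(1, 21):
    mu, P = ukf_step(mu, P, np.array([float(t)]), f, h, Q, R)
# mu[0] is now close to the true position (20) and mu[1] to the true velocity (1)
```

In the paper's setting, the state would instead hold a 6DoF pose, the dynamics would be class-specific, and the measurement model would project the tracked object's 3D extent into a 2D bounding box; the sigma-point machinery above is what lets the filter handle that nonlinear projection without computing Jacobians.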

B. Ramtoula and A. Caccavale contributed equally.



Acknowledgements

This research was supported in part by ONR grant number N00014-18-1-2830, NSF NRI grant 1830402, the Stanford Ford Alliance program, the Mitacs Globalink research award IT15240, and the NSERC Discovery Grant 2019-05165. We are grateful for this support.

Author information

Correspondence to Adam Caccavale.

Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Ramtoula, B., Caccavale, A., Beltrame, G., Schwager, M. (2021). MSL-RAPTOR: A 6DoF Relative Pose Tracker for Onboard Robotic Perception. In: Siciliano, B., Laschi, C., Khatib, O. (eds) Experimental Robotics. ISER 2020. Springer Proceedings in Advanced Robotics, vol 19. Springer, Cham. https://doi.org/10.1007/978-3-030-71151-1_46
