Principles of Object Tracking and Mapping

Springer Handbook of Augmented Reality

Part of the book series: Springer Handbooks (SHB)

Abstract

Tracking is the main enabling technology for Augmented Reality (AR), as it allows realistic placement of virtual content in the real world. In this chapter, we discuss the most important aspects of tracking for AR while reviewing existing systems that have shaped the field over the past years. We first provide a notation for describing 6 Degrees of Freedom (6DoF) poses and camera models. We then describe fundamental computer vision techniques that tracking systems frequently use, such as feature matching, feature tracking, and pose estimation. We divide the description of tracking approaches into model-based approaches and Simultaneous Localization and Mapping (SLAM) approaches. Model-based approaches use a synthetic representation of an object as a template to match against the real object. This matching can use texture or lines as tracking features to establish correspondences between the model and the image. More recently, machine learning approaches that estimate the pose of an object directly from an input image have also been introduced. An emerging challenge is the extension of AR tracking systems from rigid objects to articulated and nonrigid objects. SLAM tracking systems do not require a model as a reference, since they simultaneously track and map their environment. We discuss keypoint-based, direct, and semi-direct purely visual SLAM systems. Next, we analyze how additional sensors can support tracking, through visual-inertial sensor fusion or depth sensing. Finally, we look at the use of machine learning techniques, especially deep neural networks, in conjunction with traditional computer vision approaches for SLAM.



Author information

Corresponding author

Correspondence to Jason Rambach.

Copyright information

© 2023 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Rambach, J., Pagani, A., Stricker, D. (2023). Principles of Object Tracking and Mapping. In: Nee, A.Y.C., Ong, S.K. (eds) Springer Handbook of Augmented Reality. Springer Handbooks. Springer, Cham. https://doi.org/10.1007/978-3-030-67822-7_3

  • DOI: https://doi.org/10.1007/978-3-030-67822-7_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-67821-0

  • Online ISBN: 978-3-030-67822-7

  • eBook Packages: Computer Science (R0)
