On Calibration and Alignment of Point Clouds in a Network of RGB-D Sensors for Tracking

  • George Xu
  • Shahram Payandeh
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9475)


This paper investigates the integration of multiple time-of-flight (ToF) depth sensors for general 3D tracking, and for tracking of the hands in particular. A network of multiple sensors offers increased viewing coverage and can capture a more complete 3D point cloud representation of the object. Given an ideal point cloud representation, tracking can be accomplished without first reconstructing a mesh representation of the object. In a network of depth sensors, calibration between the sensors and the subsequent alignment of the point clouds pose key challenges. While there has been research on merging and aligning scenes with larger objects such as the human body, little research is available on smaller and more complicated objects such as the human hand. This paper presents a study of ways to merge and align the point clouds from a network of sensors so that objects and features can be tracked in the combined point cloud.
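
To make the alignment step concrete, the following is a minimal sketch of how a point cloud from one sensor might be brought into the coordinate frame of another and merged with it, assuming the rigid transformation between the two sensors is already known from calibration. It is not the paper's method: the function names, the sample points, and the transform values (a 90 degree rotation about the z-axis plus a 0.5 m offset) are all hypothetical and chosen only for illustration.

```python
import math


def make_transform(yaw_deg, translation):
    """Build a 4x4 homogeneous transform: rotation about the z-axis plus a translation.

    Hypothetical helper; a real calibration would estimate a full 3D rotation.
    """
    c = math.cos(math.radians(yaw_deg))
    s = math.sin(math.radians(yaw_deg))
    tx, ty, tz = translation
    return [
        [c, -s, 0.0, tx],
        [s, c, 0.0, ty],
        [0.0, 0.0, 1.0, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]


def transform_points(T, points):
    """Apply the 4x4 transform T to every (x, y, z) point in the list."""
    out = []
    for x, y, z in points:
        out.append(tuple(
            T[r][0] * x + T[r][1] * y + T[r][2] * z + T[r][3]
            for r in range(3)
        ))
    return out


def merge_clouds(reference_cloud, other_cloud, T_other_to_ref):
    """Express other_cloud in the reference frame and concatenate both clouds."""
    return list(reference_cloud) + transform_points(T_other_to_ref, other_cloud)


# Hypothetical data: two tiny clouds, each in its own sensor frame (metres).
cloud_a = [(0.0, 0.0, 1.0), (0.1, 0.0, 1.0)]
cloud_b = [(1.0, 0.0, 0.0)]

# Assumed extrinsic calibration result mapping sensor B's frame into sensor A's.
T_b_to_a = make_transform(90.0, (0.5, 0.0, 0.0))

merged = merge_clouds(cloud_a, cloud_b, T_b_to_a)
```

In this sketch the merged cloud is a plain concatenation. Any refinement of the alignment or removal of overlapping points would be a separate step and is not shown.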


Point Cloud, Transformation Matrix, Depth Image, Depth Sensor, Depth Point
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Experimental Robotics and Imaging Laboratory, Simon Fraser University, Burnaby, Canada
