The Visual Computer, Volume 32, Issue 6–8, pp 681–691

Fast capture of textured full-body avatar with RGB-D cameras

  • Shuai Lin
  • Yin Chen
  • Yu-Kun Lai
  • Ralph R. Martin
  • Zhi-Quan Cheng
Original Article


We present a practical system that produces a textured full-body avatar within 3 s. It uses sixteen RGB-depth (RGB-D) cameras: ten are arranged to capture the body, and six target the important head region. The camera configuration is formulated as a constraint-based minimum set space-covering problem, which is solved approximately by a heuristic algorithm. The resulting camera layout covers the full-body surface of an adult, with geometric errors of less than 5 mm. Once arranged, the cameras are calibrated using a mannequin before real humans are scanned. All 16 RGB-D images are captured within 1 s, which spares the subject from having to remain still for an uncomfortable period and helps to keep pose changes between different cameras small. All scans are then combined and processed to reconstruct the photorealistic textured mesh in 2 s. During both system calibration and capture of a real subject, the high-quality RGB information is exploited to assist geometric reconstruction and texture stitching optimization.
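The abstract does not say which heuristic is used for the camera-layout problem. As an illustration only, the sketch below shows the classic greedy approximation for plain minimum set cover, assuming the body surface has been discretised into numbered patches and that each candidate camera pose comes with the set of patches it can see. The function name `greedy_cover`, the camera names, and the toy visibility sets are all hypothetical, and the additional constraints mentioned in the abstract are not modelled.

```python
from typing import Dict, List, Set


def greedy_cover(candidates: Dict[str, Set[int]], targets: Set[int]) -> List[str]:
    """Greedy set-cover approximation: repeatedly take the candidate
    that covers the most still-uncovered targets."""
    uncovered = set(targets)
    chosen: List[str] = []
    while uncovered:
        # Iterate names in sorted order so that ties are broken deterministically.
        best = max(sorted(candidates), key=lambda name: len(candidates[name] & uncovered))
        gain = candidates[best] & uncovered
        if not gain:
            # No candidate sees any remaining patch, so full coverage is impossible.
            raise ValueError("remaining targets cannot be covered by any candidate")
        chosen.append(best)
        uncovered -= gain
    return chosen


# Hypothetical toy data: four candidate camera poses and eight surface patches.
candidates = {
    "cam_a": {0, 1, 2, 3},
    "cam_b": {3, 4, 5},
    "cam_c": {5, 6, 7},
    "cam_d": {0, 4, 7},
}
layout = greedy_cover(candidates, set(range(8)))
print(layout)
```

The greedy rule is not guaranteed to find the smallest layout, but it is simple and has a well-known logarithmic approximation bound for set cover, which is why it is a common starting point for problems of this kind.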


Keywords: Full-body avatar, High-quality texture, RGB-depth camera, Global registration



This work was supported by the Natural Science Foundation of China (No. 61272334).

Supplementary material

Supplementary material 1: 371_2016_1245_MOESM1_ESM.mp4 (MP4, 1042 KB)



Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  • Shuai Lin (1)
  • Yin Chen (2)
  • Yu-Kun Lai (3)
  • Ralph R. Martin (3)
  • Zhi-Quan Cheng (4)

  1. PDL, School of Computer, National University of Defense Technology, Changsha, China
  2. Defense Engineering School, PLA University of Science and Technology, Nanjing, China
  3. School of Computer Science and Informatics, Cardiff University, Cardiff, UK
  4. Avatar Science Company, Changsha, China
