Abstract
We present a practical system that produces a textured full-body avatar within 3 s. It uses sixteen RGB-depth (RGB-D) cameras: ten are arranged to capture the body, and six target the important head region. The configuration of the cameras is formulated as a constraint-based minimum set space-covering problem, which is solved approximately by a heuristic algorithm. The resulting camera layout covers the full-body surface of an adult with geometric errors of less than 5 mm. Once arranged, the cameras are calibrated using a mannequin before real humans are scanned. All 16 RGB-D images are captured within 1 s, which spares the subject from having to remain still for an uncomfortable period and keeps pose changes between cameras small. The scans are then combined and processed to reconstruct a photorealistic textured mesh in 2 s. During both system calibration and capture of a real subject, the high-quality RGB information is exploited to assist geometric reconstruction and texture stitching optimization.
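To illustrate what an approximate solution to a minimum set-covering problem can look like, the sketch below applies the standard greedy heuristic to a toy instance. It is an assumption for illustration only: the abstract does not say which heuristic the system uses, and the function name `greedy_set_cover`, the patch indices, and the candidate view names are all hypothetical. The sketch also assumes that each candidate's coverage set already reflects any placement constraints, such as the accuracy limit.

```python
from typing import Dict, Hashable, List, Set


def greedy_set_cover(universe: Set[Hashable],
                     candidates: Dict[str, Set[Hashable]]) -> List[str]:
    """Select candidate views until every element of `universe` is covered.

    Standard greedy heuristic for set cover (not necessarily the paper's
    algorithm): repeatedly take the candidate that covers the most
    still-uncovered elements.
    """
    uncovered = set(universe)
    chosen: List[str] = []
    while uncovered:
        # Iterate in sorted order so ties are broken deterministically by name.
        best = max(sorted(candidates),
                   key=lambda name: len(candidates[name] & uncovered))
        gain = candidates[best] & uncovered
        if not gain:
            raise ValueError("some elements cannot be covered by any candidate")
        chosen.append(best)
        uncovered -= gain
    return chosen


# Hypothetical toy data: surface patches 0..7 and four candidate camera poses.
patches = set(range(8))
views = {
    "front": {0, 1, 2, 3},
    "back": {4, 5, 6},
    "head": {6, 7},
    "side": {2, 3, 4},
}
layout = greedy_set_cover(patches, views)
print(layout)
```

On this toy instance the greedy choice drops the redundant "side" view. In general the greedy heuristic gives no guarantee of a minimum-size cover, which is why such solutions are described as approximate.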
Acknowledgments
This work was supported by the Natural Science Foundation of China (No. 61272334).
Cite this article
Lin, S., Chen, Y., Lai, YK. et al. Fast capture of textured full-body avatar with RGB-D cameras. Vis Comput 32, 681–691 (2016). https://doi.org/10.1007/s00371-016-1245-9
DOI: https://doi.org/10.1007/s00371-016-1245-9