An improved recurrent neural networks for 3d object reconstruction

  • Tingsong Ma
  • Ping Kuang
  • Wenhong Tian


3D-R2N2 and other advanced 3D reconstruction neural networks have achieved impressive results; however, most of them still suffer from training difficulties and loss of detail, owing to their weak feature extraction capability and an ill-suited loss function. This paper aims to overcome these shortcomings by building a new model based on 3D-R2N2. The new model adopts a densely connected structure as its encoder and uses Chamfer distance as its loss function. The aim is to enhance the network’s ability to learn from complex data while focusing the whole network on the reconstruction of detailed structures. In addition, we improve the decoder by building two parallel predictor branches, which make better use of the feature information and boost the network’s performance on the reconstruction task. Extensive tests show that the proposed model, called 3D-R2N2-V2, is slightly slower than 3D-R2N2 at prediction, but trains 20% to 30% faster and obtains 15% and 10% better voxel IoU results on single-view and multi-view reconstruction tasks, respectively. Compared with other recent state-of-the-art methods such as OGN and DRC, our approach is also competitive in reconstruction quality.
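
For readers unfamiliar with the two quantities named in the abstract, the following is a minimal sketch of a symmetric Chamfer distance between two point sets and of voxel IoU between two occupancy grids. It is an illustration only, not the authors' implementation: the point-set form of the Chamfer distance, the squared Euclidean metric, the averaging over each set, the representation of a voxel grid as a set of occupied indices, and the function names `chamfer_distance` and `voxel_iou` are all assumptions made here.

```python
from typing import Sequence, Set, Tuple

Point = Tuple[float, float, float]
Voxel = Tuple[int, int, int]


def squared_distance(p: Point, q: Point) -> float:
    """Squared Euclidean distance between two 3D points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def chamfer_distance(pred: Sequence[Point], target: Sequence[Point]) -> float:
    """Symmetric Chamfer distance between two non-empty point sets.

    Each point is matched to its nearest neighbour in the other set; the
    mean squared distances of the two directions are then added together.
    (Assumed variant: squared distances, mean reduction.)
    """
    if not pred or not target:
        raise ValueError("both point sets must be non-empty")
    pred_to_target = sum(
        min(squared_distance(p, q) for q in target) for p in pred
    ) / len(pred)
    target_to_pred = sum(
        min(squared_distance(q, p) for p in pred) for q in target
    ) / len(target)
    return pred_to_target + target_to_pred


def voxel_iou(pred: Set[Voxel], target: Set[Voxel]) -> float:
    """Intersection over union of two sets of occupied voxel indices."""
    union = pred | target
    if not union:
        return 1.0  # assumption: two empty grids count as a perfect match
    return len(pred & target) / len(union)


# Toy example with two points per set and three occupied voxels per grid.
pred_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
target_points = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
print(chamfer_distance(pred_points, target_points))  # 1.0

pred_voxels = {(0, 0, 0), (0, 0, 1), (0, 1, 1)}
target_voxels = {(0, 0, 0), (0, 0, 1), (1, 1, 1)}
print(voxel_iou(pred_voxels, target_voxels))  # 0.5
```

The brute-force nearest-neighbour search is quadratic in the number of points, which is acceptable for a toy example; a training loss would normally use a vectorised, differentiable version in the framework of choice.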


3D object reconstruction; 3D-R2N2 approach; Densely connected structure; RNN; Chamfer distance



This research is partially supported by the China National Science Foundation (CNSF) under project IDs 61672136 and 61828202.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. University of Electronic Science and Technology of China, Sichuan, China
