
Shape completion using orthogonal views through a multi-input–output network

Industrial and Commercial Application · Pattern Analysis and Applications

Abstract

Knowing the shape of objects is essential for many robotics tasks, but complete shape information is not always available. Recent approaches based on point clouds and voxel cubes have been proposed for shape completion from a single depth view; however, they tend to be computationally expensive and require tuning many weights. This paper presents a novel architecture for shape completion based on six orthogonal views obtained from a point cloud (they can be seen as the six faces of a die). Our network uses one branch per orthogonal view as input–output and mixes them in the middle of the architecture. By using orthogonal views, the number of required parameters is significantly reduced. We also introduce a novel method to filter the output of networks based on orthogonal views, and we describe algorithms to convert an orthogonal view to a voxel cube and to a point cloud. We compared our approach against state-of-the-art approaches on the YCB and ShapeNet datasets using the Chamfer distance and mean square error measures and obtained very competitive performance with less than 5% of their parameters.
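The evaluation relies on two standard measures. As a concrete illustration, below is a minimal Python/NumPy sketch of a symmetric Chamfer distance between point clouds and a mean square error between voxel grids or view images; the exact formulation used in the paper (normalization, squared vs. unsquared distances) is not given on this page, so treat the details as assumptions.

```python
import numpy as np

def chamfer_distance(p, q):
    """Symmetric Chamfer distance between point clouds p (N, 3) and q (M, 3).

    Illustrative sketch: a common CD variant that averages, for each point,
    the distance to its nearest neighbour in the other cloud.
    """
    # Pairwise Euclidean distances between the two clouds (brute force, O(N*M)).
    d = np.linalg.norm(p[:, None, :] - q[None, :, :], axis=-1)
    # Nearest-neighbour distance in each direction, averaged and summed.
    return d.min(axis=1).mean() + d.min(axis=0).mean()

def mean_square_error(a, b):
    """MSE between two same-shaped arrays (e.g. voxel cubes or view images)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return np.mean((a - b) ** 2)
```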


Data availability

All data generated or analyzed during this study are included in this published article and are available at https://github.com/hideldt/SCBOV.

Notes

  1. PointNet/PointNet++ [8, 9] were proposed to work directly on point clouds. Several works build on their frameworks to solve problems such as semantic segmentation, classification, and point completion [3,4,5,6,7, 17]; in this work, we compare our results against [3], which is based on PointNet.

  2. Note that parts of the shape can touch a face of the voxel cube. Our representation stores the distance to the first occupied voxel (as shown in Algorithm 2); if that distance is 0 (i.e., the shape touches a face of the cube), those voxels would take the same value as the background, so we add an offset. The offset was set empirically to four (see the first sketch after these notes).

  3. For the ShapeNet and YCB datasets, we use the versions proposed by [3] and [1], respectively.

  4. They use PCN as their base network, which has 6.8 M parameters; our network therefore uses only about 4.4% of their weights (roughly 0.3 M parameters).

  5. Note that the comparison could not be made on a lower-capacity GPU, since both networks require at least 8 GB of VRAM to be trained.

  6. This means that only the border points are taken into account, not the inner points (see the second sketch after these notes).
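For illustration, here is a minimal Python/NumPy sketch of the orthogonal-view representation described in Note 2: for each line of sight through a binary voxel cube, record the distance to the first occupied voxel plus an offset of four, leaving empty lines of sight at the background value. The background value of 0, the axis conventions, and the function name are assumptions; this is not the paper's Algorithm 2.

```python
import numpy as np

BACKGROUND = 0  # assumed background value
OFFSET = 4      # empirical offset from Note 2, so a surface voxel at
                # depth 0 does not collide with the background value

def orthogonal_view(voxels, axis=0, flip=False):
    """Depth map of the first occupied voxel along one axis of a binary cube."""
    v = np.flip(voxels, axis=axis) if flip else voxels
    v = np.moveaxis(v, axis, 0)       # scan along the first axis
    occupied = v.any(axis=0)          # lines of sight that hit the shape
    depth = np.argmax(v, axis=0)      # index of the first occupied voxel
    return np.where(occupied, depth + OFFSET, BACKGROUND)

# The six views correspond to scanning along +x, -x, +y, -y, +z, -z,
# i.e. the six faces of a die:
# views = [orthogonal_view(cube, axis=a, flip=f)
#          for a in range(3) for f in (False, True)]
```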
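And a sketch of one way to restrict a voxel grid to its border points, as mentioned in Note 6: keep occupied voxels that have at least one empty face neighbour. This 6-neighbourhood definition of "border" is an assumption, not necessarily the one used in the paper.

```python
import numpy as np

def border_voxels(voxels):
    """Coordinates of occupied voxels with at least one empty 6-neighbour."""
    v = np.pad(voxels.astype(bool), 1, constant_values=False)
    core = v[1:-1, 1:-1, 1:-1]
    # A voxel is interior if all six face neighbours are occupied.
    interior = (v[:-2, 1:-1, 1:-1] & v[2:, 1:-1, 1:-1] &
                v[1:-1, :-2, 1:-1] & v[1:-1, 2:, 1:-1] &
                v[1:-1, 1:-1, :-2] & v[1:-1, 1:-1, 2:])
    return np.argwhere(core & ~interior)
```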

References

  1. Varley J, DeChant C, Richardson A, Ruales J, Allen P (2017) Shape completion enabled robotic grasping. In: 2017 IEEE/RSJ international conference on intelligent robots and systems (IROS), pp 2442–2447. https://doi.org/10.1109/iros.2017.8206060

  2. Yang B, Rosa S, Markham A, Trigoni N, Wen H (2019) Dense 3D object reconstruction from a single depth view. IEEE Trans Pattern Anal Mach Intell 41(12):2820–2834. https://doi.org/10.1109/TPAMI.2018.2868195


  3. Yuan W, Khot T, Held D, Mertz C, Hebert M (2018) PCN: point completion network. In: 2018 international conference on 3D vision (3DV), pp 728–737

  4. Liu M, Sheng L, Yang S, Shao J, Hu S-M (2019) Morphing and sampling network for dense point cloud completion. In: The thirty-fourth AAAI conference on artificial intelligence

  5. Peng Y, Chang M, Wang Q, Qian Y, Zhang Y, Wei M, Liao X (2020) Sparse-to-dense multi-encoder shape completion of unstructured point cloud. IEEE Access 8:30969–30978


  6. Yu X, Rao Y, Wang Z, Liu Z, Lu J, Zhou J (2021) PoinTr: diverse point cloud completion with geometry-aware transformers. In: 2021 IEEE/CVF international conference on computer vision (ICCV), pp 12478–12487. https://doi.org/10.1109/ICCV48922.2021.01227

  7. Xiang P, Wen X, Liu Y-S, Cao Y-P, Wan P, Zheng W, Han Z (2021) SnowflakeNet: point cloud completion by snowflake point deconvolution with skip-transformer. In: Proceedings of the IEEE international conference on computer vision (ICCV)

  8. Charles RQ, Su H, Kaichun M, Guibas LJ (2017) PointNet: deep learning on point sets for 3D classification and segmentation. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR), pp 77–85. https://doi.org/10.1109/CVPR.2017.16

  9. Qi CR, Yi L, Su H, Guibas LJ (2017) PointNet++: deep hierarchical feature learning on point sets in a metric space. In: Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R (eds) Advances in neural information processing systems, vol 30. Curran Associates Inc, USA


  10. Hu T, Han Z, Shrivastava A, Zwicker M (2019) Render4completion: synthesizing multi-view depth maps for 3D shape completion. In: 2019 IEEE/CVF international conference on computer vision workshop (ICCVW), pp 4114–4122. https://doi.org/10.1109/ICCVW.2019.00506

  11. Hu T, Han Z, Zwicker M (2020) 3D shape completion with multi-view consistent inference. In: The Thirty-Fourth AAAI conference on artificial intelligence, AAAI 2020, The Thirty-Second innovative applications of artificial intelligence conference, IAAI 2020, The tenth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2020, New York, NY, USA, 7–12 Feb 2020, pp 10997–11004

  12. Chang AX, Funkhouser T, Guibas L, Hanrahan P, Huang Q, Li Z, Savarese S, Savva M, Song S, Su H, Xiao J, Yi L, Yu F (2015) ShapeNet: an information-rich 3D model repository. Technical Report arXiv:1512.03012 [cs.GR], Toyota Technological Institute, Chicago

  13. Calli B, Singh A, Walsman A, Srinivasa S, Abbeel P, Dollar AM (2015) The ycb object and model set: towards common benchmarks for manipulation research. In: 2015 international conference on advanced robotics (ICAR), pp 510–517. https://doi.org/10.1109/ICAR.2015.7251504

  14. Kappler D, Bohg J, Schaal S (2015) Leveraging big data for grasp planning. In: 2015 IEEE international conference on robotics and automation (ICRA), pp 4304–4311. https://doi.org/10.1109/ICRA.2015.7139793

  15. Koenig N, Howard A (2004) Design and use paradigms for gazebo, an open-source multi-robot simulator. In: 2004 IEEE/RSJ international conference on intelligent robots and systems (IROS) (IEEE Cat. No.04CH37566), vol 3, pp 2149–2154. https://doi.org/10.1109/IROS.2004.1389727

  16. Min P (2019) Binvox. http://www.patrickmin.com/binvox or https://www.google.com/search?q=binvox. Accessed on 05 Oct 2019

  17. Saha M, Amin SB, Sharma A, Kumar TKS, Kalia RK (2022) AI-driven quantification of ground glass opacities in lungs of COVID-19 patients using 3D computed tomography imaging. PLoS ONE 17:1–14. https://doi.org/10.1371/journal.pone.0263916


  18. Szegedy C, Ioffe S, Vanhoucke V, Alemi AA (2017) Inception-v4, inception-resnet and the impact of residual connections on learning. In: Proceedings of the Thirty-First AAAI conference on artificial intelligence. AAAI’17, pp 4278–4284

  19. Ronneberger O, Fischer P, Brox T (2015) U-net: convolutional networks for biomedical image segmentation. In: Navab N, Hornegger J, Wells WM, Frangi AF (eds) Medical image computing and computer-assisted intervention—MICCAI 2015. Springer, Cham, pp 234–241


  20. Riegler G, Ulusoy AO, Geiger A (2017) OctNet: learning deep 3D representations at high resolutions. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR), pp 6620–6629. https://doi.org/10.1109/CVPR.2017.701

  21. Choy CB, Xu D, Gwak J, Chen K, Savarese S (2016) 3D-R2N2: a unified approach for single and multi-view 3D object reconstruction. In: Leibe B, Matas J, Sebe N, Welling M (eds) Computer vision—ECCV 2016. Springer, Cham, pp 628–644


  22. Rusu RB, Cousins S (2011) 3D is here: point cloud library (PCL). In: IEEE international conference on robotics and automation (ICRA). Shanghai, China, pp 1–4

  23. Han X, Li Z, Huang H, Kalogerakis E, Yu Y (2017) High-resolution shape completion using deep neural networks for global structure and local geometry inference. In: 2017 IEEE international conference on computer vision (ICCV), pp 85–93. https://doi.org/10.1109/ICCV.2017.19

  24. Oliphant T (2006) A guide to NumPy. Trelgol Publishing, USA


  25. Kingma D, Ba J (2014) Adam: a method for stochastic optimization. In: International conference on learning representations. arxiv:1412.6980

  26. Chollet F (2021) Deep learning with Python, 2nd edn. Manning Publications Co. ISBN 9781617296864

  27. Abadi M, Agarwal A et al (2015) TensorFlow: large-scale machine learning on heterogeneous systems. Software available from https://www.tensorflow.org/

  28. Chollet F (2017) Deep learning with Python, 1st edn. Manning Publications Co., Greenwich


  29. Do T-T, Nguyen A, Reid I (2018) AffordanceNet: an end-to-end deep learning approach for object affordance detection. In: 2018 IEEE international conference on robotics and automation (ICRA), pp 5882–5889. https://doi.org/10.1109/icra.2018.8460902


Acknowledgements

L. Delgado acknowledges CONACYT for the scholarship granted toward his PhD studies (Grant Number 707984).

Author information

Corresponding author

Correspondence to Leonardo Delgado.

Ethics declarations

Conflict of interest

The authors declare they have no financial interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Delgado, L., Morales, E.F. Shape completion using orthogonal views through a multi-input–output network. Pattern Anal Applic 26, 1045–1057 (2023). https://doi.org/10.1007/s10044-023-01154-y

