Abstract
To meet the increasing demand for high-quality 3D models, we propose an end-to-end deep learning architecture that generates 3D mesh models from multiple RGB images, in contrast to previous methods that produce voxel or point-cloud models. Unlike the single-image Pixel2Mesh network, we introduce a ConvLSTM layer to fuse perceptual features, enabling multiple images to be processed simultaneously. To constrain the smoothness of the generated shapes, we design a graph pooling layer that coarsens the mesh structure and define a new loss function, the smooth loss. Working together with the graph unpooling layer of Pixel2Mesh (P2M), the graph pooling layer preserves the mesh topology of the final 3D shapes, and the smooth loss improves both the visual quality and the structural accuracy of the generated shapes. Experiments on the ShapeNet dataset show that, compared with previous deep learning networks, our method generates higher-precision 3D shapes and achieves the best F-score and Chamfer distance (CD). In addition, the fusion of features from multiple images makes our experimental results more convincing.
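The abstract does not give the exact formula for the smooth loss, but a common mesh smoothness regularizer of this kind penalizes each vertex's squared offset from the centroid of its one-ring neighbors (a uniform Laplacian term). The sketch below is a minimal NumPy illustration of that idea, not the paper's definition; the function name, the toy square mesh, and the adjacency dictionary are all hypothetical.

```python
import numpy as np

def laplacian_smooth_loss(vertices, neighbors):
    """Mean squared offset of each vertex from the centroid of its
    one-ring neighbors -- penalizes spiky, non-smooth meshes."""
    total = 0.0
    for i, nbrs in neighbors.items():
        centroid = vertices[nbrs].mean(axis=0)
        total += float(np.sum((vertices[i] - centroid) ** 2))
    return total / len(neighbors)

# Toy mesh: a unit square in the z = 0 plane, each corner linked
# to its two adjacent corners.
verts = np.array([[0., 0., 0.],
                  [1., 0., 0.],
                  [1., 1., 0.],
                  [0., 1., 0.]])
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}

flat_loss = laplacian_smooth_loss(verts, adj)

# Pull one vertex out of the plane: the loss rises, so minimizing this
# term during deformation pushes the mesh back toward a smooth surface.
bumpy = verts.copy()
bumpy[0, 2] = 0.5
bumpy_loss = laplacian_smooth_loss(bumpy, adj)
```

In a network such as the one described here, a term like this would be added to the reconstruction loss (F-score/CD-oriented terms) with a small weight, trading geometric fidelity against surface smoothness.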
References
Ioannidou, A., Chatzilari, E., Nikolopoulos, S., Kompatsiaris, I.: Deep learning advances in computer vision with 3d data: a survey. ACM Comput. Surv. 50(2), 1–38 (2017).
Yuan, Z.H., Lu, T., Zhou, H.-Y., Chen, B., Li, J.-N.: Incremental 3d reconstruction using Bayesian learning. In: Proceedings of the 25th International Conference on Industrial Engineering and Other Applications of Applied Intelligent Systems: Advanced Research in Applied Artificial Intelligence, IEA/AIE’12, pp. 754–763. Springer, Heidelberg (2012)
Penner, E.: Soft 3d reconstruction for view synthesis. ACM Trans. Graph. 36(6), 1–11 (2017).
Lee, Y.T., Fang, F.: 3d reconstruction of polyhedral objects from single parallel projections using cubic corner. Comput. Aided Des. 43(8), 1025–1034 (2011).
Trucco, E.: Session details: 3d reconstruction. In: Proceedings of the 1st International Workshop on 3D Video Processing, 3DVP’10, New York, NY, USA. Association for Computing Machinery (2010)
Choy, C.B., Xu, D., Gwak, J., Chen, K., Savarese, S.: 3d-r2n2: a unified approach for single and multi-view 3d object reconstruction. In: European conference on computer vision, pp. 628–644. Springer (2016)
Fan, H., Su, H., Guibas, L.J.: A point set generation network for 3d object reconstruction from a single image. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 605–613 (2017)
Wang, N., Zhang, Y., Li, Z., Fu, Y., Liu, W., Jiang, Y.G.: Pixel2mesh: generating 3d mesh models from single rgb images. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 52–67 (2018)
Shi, X., Chen, Z., Wang, H., Yeung, D.Y., Wong, W.K., Woo, W.C.: Convolutional lstm network: a machine learning approach for precipitation nowcasting. In: Advances in Neural Information Processing Systems, pp. 802–810 (2015)
Wen, C., Zhang, Y., Li, Z., Fu, Y.: Pixel2mesh++: multi-view 3d mesh generation via deformation. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1042–1051 (2019)
Nozawa, N., Shum, H.P., Ho, E.S., Morishima, S.: 3d car shape reconstruction from a single sketch image. In: Motion, Interaction and Games, MIG’19, New York, NY, USA. Association for Computing Machinery (2019)
Haag, M., Nagel, H.-H.: Combination of edge element and optical flow estimates for 3d-model-based vehicle tracking in traffic image sequences. Int. J. Comput. Vis. 35(3), 295–319 (1999).
Nozawa, N., Shum, H.P.H., Feng, Q., Ho, E.S.L., Morishima, S.: 3d car shape reconstruction from a contour sketch using gan and lazy learning. Vis. Comput. 38(4), 1317–1330 (2022).
Loh, A.M., Hartley, R.I., et al.: Shape from non-homogeneous, non-stationary, anisotropic, perspective texture, vol. 5. In: BMVC, pp. 69–78. Citeseer (2005)
Aloimonos, J.: Shape from texture. Biol. Cybern. 58(5), 345–360 (1988).
Howard, A.G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., Adam, H.: Mobilenets: efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861 (2017)
Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. Commun. ACM 60(6), 84–90 (2017).
Lyu, K., Li, Y., Zhang, Z.: Attention-aware multi-task convolutional neural networks. IEEE Trans. Image Process. 29, 1867–1878 (2020).
Ni, Z., Yang, W., Wang, S., Ma, L., Kwong, S.: Towards unsupervised deep image enhancement with generative adversarial network. IEEE Trans. Image Process. 29, 9140–9151 (2020).
Yang, T.T., Tong, C.: Real-time detection network for tiny traffic sign using multi-scale attention module. Sci. China Technol. Sci. 65(2), 396–406 (2022).
Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997).
Chen, R., Han, S., Xu, J., Su, H.: Point-based multi-view stereo network. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1538–1547 (2019)
Delanoy, J., Aubry, M., Isola, P., Efros, A.A., Bousseau, A.: 3d sketching using multi-view deep volumetric prediction. In: Proceedings ACM Computer Graphics and Interactive Techniques, vol. 1, no. 1 (2018)
Xie, H., Yao, H., Sun, X., Zhou, S., Tong, X.: Weighted voxel: a novel voxel representation for 3d reconstruction. In: Proceedings of the 10th International Conference on Internet Multimedia Computing and Service, ICIMCS’18, New York, NY, USA. Association for Computing Machinery (2018)
Huang, T., Liu, Y.: 3d point cloud geometry compression on deep learning. In: Proceedings of the 27th ACM International Conference on Multimedia, MM’19, pp. 890–898, New York, NY, USA. Association for Computing Machinery (2019)
Bronstein, M.M., Bruna, J., LeCun, Y., Szlam, A., Vandergheynst, P.: Geometric deep learning: going beyond Euclidean data. IEEE Signal Process. Mag. 34(4), 18–42 (2017).
Defferrard, M., Bresson, X., Vandergheynst, P.: Convolutional neural networks on graphs with fast localized spectral filtering. In: Advances in Neural Information Processing Systems, pp. 3844–3852 (2016)
Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907 (2016)
Ting, Z., Feng, D.D., Zheng, T.: 3d reconstruction of single picture. In: Proceedings of the Pan-Sydney Area Workshop on Visual Information Processing, VIP’05, pp. 83–86. Australian Computer Society, Inc., AUS (2004)
Xiang, N., Wang, L., Jiang, T., Li, Y., Yang, X., Zhang, J.: Single-image mesh reconstruction and pose estimation via generative normal map. In: Proceedings of the 32nd International Conference on Computer Animation and Social Agents, CASA’19, pp. 79–84, New York, NY, USA. Association for Computing Machinery (2019)
Gao, Y., Yao, Y., Jiang, Y.: Multi-target 3d reconstruction from rgb-d data. In: Proceedings of the 2nd International Conference on Computer Science and Software Engineering, CSSE 2019, pp. 184–191, New York, NY, USA. Association for Computing Machinery (2019)
Mescheder, L., Oechsle, M., Niemeyer, M., Nowozin, S., Geiger, A.: Occupancy networks: Learning 3d reconstruction in function space. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4460–4470 (2019)
Nikoohemat, S., Diakite, A.A., Zlatanova, S., Vosselman, G.: Indoor 3d reconstruction from point clouds for optimal routing in complex buildings to support disaster management. Autom. Construct. 113, 103109 (2020).
Park, J.J., Florence, P., Straub, J., Newcombe, R., Lovegrove, S.: Deepsdf: learning continuous signed distance functions for shape representation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 165–174 (2019)
Peng, S., Niemeyer, M., Mescheder, L., Pollefeys, M., Geiger, A.: Convolutional occupancy networks. arXiv preprint arXiv:2003.04618 (2020)
Wang, W., Ceylan, D., Mech, R., Neumann, U.: 3dn: 3d deformation network. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1038–1046 (2019)
Xu, Q., Wang, W., Ceylan, D., Mech, R., Neumann, U.: Disn: deep implicit surface network for high-quality single-view 3d reconstruction. In: Advances in Neural Information Processing Systems, pp. 492–502 (2019)
Sinha, A., Unmesh, A., Huang, Q., Ramani, K.: Surfnet: generating 3d shape surfaces using deep residual networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 791–800 (2017)
Chang, A.X., Funkhouser, T., Guibas, L., Hanrahan, P., Huang, Q., Li, Z., Savarese, S., Savva, M., Song, S., Su, H., Xiao, J., Fisher, Y.: ShapeNet: an information-rich 3d model repository. Technical Report arXiv:1512.03012 [cs.GR], Stanford University—Princeton University—Toyota Technological Institute at Chicago (2015)
Acknowledgements
This study is partially supported by the National Natural Science Foundation of China (62176016), the National Key R&D Program of China (Nos. 2018YFB2101100 and 2019YFB2101600), the Guizhou Province Science and Technology Project: Research and Demonstration of Sci. & Tech Big Data Mining Technology Based on Knowledge Graph (supported by Qiankehe [2021] General 382), the Training Program of the Major Research Plan of the National Natural Science Foundation of China (Grant No. 92046015), and the Beijing Natural Science Foundation Program and Scientific Research Key Program of Beijing Municipal Commission of Education (Grant No. KZ202010025047).
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
About this article
Cite this article
Chen, R., Yin, X., Yang, Y. et al. Multi-view Pixel2Mesh++: 3D reconstruction via Pixel2Mesh with more images. Vis Comput 39, 5153–5166 (2023). https://doi.org/10.1007/s00371-022-02651-7