3D model retrieval based on multi-view attentional convolutional neural network

Liu, An-An; Zhou, He-Yu; Li, Meng-Jie; Nie, Wei-Zhi

doi:10.1007/s11042-019-7521-8

3D model retrieval based on multi-view attentional convolutional neural network

Published: 29 March 2019

Volume 79, pages 4699–4711, (2020)
Cite this article

Multimedia Tools and Applications Aims and scope Submit manuscript

An-An Liu¹,
He-Yu Zhou¹,
Meng-Jie Li¹ &
…
Wei-Zhi Nie¹

737 Accesses
8 Citations
Explore all metrics

Abstract

We propose a discriminative Multi-View Attentional Convolutional Neural Network, dubbed as MVA-CNN, which takes the multiple views of an shape as input and output the object category. Unlike previous view-based approaches that simply ”compile” the view features into a compact 3D descriptors, our method can discover the context among multiple views in both the visual and spatial domain. First, we extract multiple rendered images from a 3D object by virtual cameras, and then we use Convolutional Neural Network (CNN) to abstract the information of the views. Second, we aggregate the visual views by two steps: 1). an element-wise maximum operation across the view features is adopted to discover discriminative features. 2). a soft attention mechanism is used to dynamically adjust the shape descriptors for better representing the spatial information. The entire network can be trained in an end-to-end way with the standard backpropagation. We verify the effectiveness of MVA-CNN on two widely used datasets: ModelNet10, ModelNet40 by comparing our method with state-of-the-art methods.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

3D object retrieval based on multi-view convolutional neural networks

Article 30 January 2017

View-Based 3D Model Retrieval via Convolutional Neural Networks

Multi-view dual attention network for 3D object recognition

Article Open access 14 October 2021

References

Bahdanau D, Cho K, Bengio Y (2014) Neural machine translation by jointly learning to align and translate. arXiv:1409.0473
Bai S, Bai X, Zhou Z, Zhang Z, Latecki LJ (2016) GIFT: a real-time and scalable 3d shape search engine. In: CVPR 2016, Las vegas, NV, USA, June 27-30, 2016, pp 5023–5032
Bosche F, Haas CT (2008) Automated retrieval of 3d cad model objects in construction range images. Autom Constr 17(4):499–512
Article Google Scholar
Cheng Z, Chang X, Zhu L, Catherine Kanjirathinkal R, Kankanhalli MS (2018) MMALFM: explainable recommendation by leveraging reviews and images. arXiv:1811.05318
Dalal N, Triggs B (2005) Histograms of oriented gradients for human detection. In: CVPR 2005, 20-26 June 2005, San Diego, CA, USA, pp 886–893
Gao Y, Wang M, Tao D, Ji R, Dai Q (2012) 3-d object retrieval and recognition with hypergraph analysis. IEEE Trans Image Processing 21(9):4290–4303
Article MathSciNet Google Scholar
Gao Y, Zhang H, Zhao X, Yan S (2017) Event classification in microblogs via social tracking. ACM TIST 8(3):35:1–35:14
Google Scholar
Gao Y, Zhen Y, Li H, Chua T (2016) Filtering of brand-related microblogs using social-smooth multiview embedding. IEEE Trans Multimedia 18(10):2115–2126
Article Google Scholar
Garcia-Garcia A, Gomez-Donoso F, Rodríguez JG, Orts-Escolano S, Cazorla M, López JA (2016) Pointnet: a 3d convolutional neural network for real-time object class recognition. In: IJCNN 2016, Vancouver, BC, Canada, July 24–29, 2016, pp 1578–1584
Guétat G, Maitre M, Joly L, Lai SL, Lee T, Shinagawa Y (2006) Automatic 3-d grayscale volume matching and shape analysis. IEEE Trans Inf Technol Biomed 10(2):362
Article Google Scholar
Hilaga M, Shinagawa Y, Komura T, Kunii TL (2001) Topology matching for fully automatic similarity estimation of 3d shapes. In: SIGGRAPH 2001, Los Angeles, California, USA, August 12–17, 2001, pp 203–212
Ip CY, Lapadat D, Sieger L, Regli WC (2002) Using shape distributions to compare solid models. In: Seventh ACM symposium on solid modeling and applications, max-planck-institut für informatik, saarbrücken, Germany, June 17–21, 2002, pp 273–280
Johns E, Leutenegger S, Davison AJ (2016) Pairwise decomposition of image sequences for active multi-view recognition. In: CVPR 2016, Las Vegas, NV, USA, June 27–30, 2016, pp 3813–3822
Kanezaki A (2016) Rotationnet: learning object classification using unsupervised viewpoint estimation. arXiv:1603.06208
Kim W, Kim Y (2000) A region-based shape descriptor using zernike moments. Sig Proc Image Comm 16(1–2):95–102
Article Google Scholar
Krizhevsky A, Sutskever I, Hinton GE (2017) Imagenet classification with deep convolutional neural networks. Commun ACM 60(6):84–90
Article Google Scholar
Little JJ (1985) Determining object attitude from extended gaussian images. In: Proceedings of the 9th international joint conference on artificial intelligence. Los Angeles, CA, USA, August 1985, pp 960–963
Liu A, Nie W, Gao Y, Su Y (2016) Multi-modal clique-graph matching for view-based 3d model retrieval. IEEE Trans Image Process 25(5):2103–2116
Article MathSciNet Google Scholar
Liu A, Nie W, Gao Y, Su Y (2018) View-based 3-d model retrieval: a benchmark. IEEE Trans Cybernetics 48(3):916–928
Google Scholar
Liu A, Xu N, Nie W, Su Y, Zhang Y (2019) Multi-domain and multi-task learning for human action recognition. IEEE Trans Image Process 28(2):853–867
Article MathSciNet Google Scholar
Liu S, Giles CL, Ororbia A (2018) Learning a hierarchical latent-variable model of 3d shapes. In: 3DV pp 542–551
Liu W, Gao Y, Ma H, Yu S, Nie J (2017) Online multi-objective optimization for live video forwarding across video data centers. J Vis Commun Image Represent 48:502–513
Article Google Scholar
Liu W, Zhang C, Ma H, Li S (2018) Learning efficient spatial-temporal gait features with deep learning for human identification. Neuroinformatics 16(3–4):457–471
Article Google Scholar
Liu X, Liu W, Mei T, Ma H (2018) PROVID: progressive and multimodal vehicle reidentification for large-scale urban surveillance. IEEE Trans Multimedia 20 (3):645–658
Article Google Scholar
Ma H, Liu W (2018) A progressive search paradigm for the internet of things. IEEE MultiMedia 25(1):76–86
Article MathSciNet Google Scholar
Makadia A, Daniilidis K (2010) Spherical correlation of visual representations for 3d model retrieval. Int J Comput Vis 89(2-3):193–210
Article Google Scholar
Maturana D, Scherer S (2015) Voxnet: a 3d convolutional neural network for real-time object recognition. In: IROS 2015, Hamburg, Germany, September 28 - October 2, 2015, pp 922–928
Ng PC, Henikoff S (2003) SIFT: predicting amino acid changes that affect protein function. Nucleic Acids Res 31(13):3812–3814
Article Google Scholar
Phong BT (1975) Illumination for computer generated pictures. Commun ACM 18(6):311–317
Article Google Scholar
Qi CR, Yi L, Su H, Guibas LJ (2017) Pointnet++: deep hierarchical feature learning on point sets in a metric space. In: NIPS 2017, 4-9 December 2017, Long Beach, CA, USA, pp 5105–5114
Ren M, Niu L, Fang Y (2017) 3d-a-nets: 3d deep dense descriptor for volumetric shapes with adversarial networks. arXiv:1711.10108
Sfikas K, Theoharis T, Pratikakis I (2017) Exploiting the PANORAMA representation for convolutional neural network classification and retrieval. In: Eurographics workshop on 3d object retrieval, 3DOR 2017, Lyon, France, April 23-24, 2017
Shi B, Bai S, Zhou Z, Bai X (2015) Deeppano: deep panoramic representation for 3-d shape recognition. IEEE Signal Process Lett 22(12):2339–2343
Article Google Scholar
Siddiqi K, Zhang J, Macrini D, Shokoufandeh A, Bouix S, Dickinson SJ (2008) Retrieving articulated 3-d models using medial surfaces. Mach Vis Appl 19(4):261–275
Article Google Scholar
Su H, Maji S, Kalogerakis E, Learned-Miller EG (2015) Multi-view convolutional neural networks for 3d shape recognition. In: ICCV 2015, Santiago, Chile, December 7–13, 2015, pp 945–953
Tabia H, Laga H (2015) Covariance-based descriptors for efficient 3d shape matching, retrieval, and classification. IEEE Trans Multimedia 17(9):1591–1603
Article Google Scholar
Tangelder JWH, Veltkamp RC (2003) Polyhedral model retrieval using weighted point sets. Int J Image Graphics 3(1):209
Article Google Scholar
Wang X, Nie W (2015) 3d model retrieval with weighted locality-constrained group sparse coding. Neurocomputing 151:620–625
Article Google Scholar
Wong HS, Ma B, Yu Z, Yeung PF, Ip HHS (2007) 3-d head model retrieval using a single face view query. IEEE Trans on Multimedia 9(5):1026–1036
Article Google Scholar
Wu J, Zhang C, Xue T, Freeman B, Tenenbaum J (2016) Learning a probabilistic latent space of object shapes via 3d generative-adversarial modeling. In: NIPS 2016, December 5–10, 2016, Barcelona, Spain, pp 82–90
Wu Z, Song S, Khosla A, Yu F, Zhang L, Tang X, Xiao J (2015) 3d shapenets: a deep representation for volumetric shapes. In: CVPR 2015, Boston, MA, USA, June 7-12, 2015, pp 1912–1920
Xie J, Dai G, Zhu F, Wong EK, Fang Y (2017) Deepshape: deep-learned shape descriptor for 3d shape retrieval. IEEE Trans Pattern Anal Mach Intell 39(7):1335–1345
Article Google Scholar
Xie J, Zheng Z, Gao R, Wang W, Zhu S, Wu YN (2018) Learning descriptor networks for 3d shape synthesis and analysis. arXiv:1804.00586
Zanuttigh P, Minto L (2017) Deep learning for 3d shape classification from multiple depth maps. In: ICIP 2017, Beijing, China, September 17–20, 2017, pp 3615–3619
Zaremba W, Sutskever I, Vinyals O (2014) Recurrent neural network regularization. arXiv:1409.2329
Zhao S, Chen L, Yao H, Zhang Y, Sun X (2015) Strategy for dynamic 3d depth data matching towards robust action retrieval. Neurocomputing 151:533–543
Article Google Scholar
Zhao X, Wang N, Zhang Y, Du S, Gao Y, Sun J (2017) Beyond pairwise matching: person reidentification via high-order relevance learning. IEEE Trans Neural Netw Learn Syst PP(99):1–14
Google Scholar

Download references

Author information

Authors and Affiliations

School of Electrical and Information Engineering, Tianjin University, Tianjin, 300072, China
An-An Liu, He-Yu Zhou, Meng-Jie Li & Wei-Zhi Nie

Authors

An-An Liu
View author publications
You can also search for this author in PubMed Google Scholar
He-Yu Zhou
View author publications
You can also search for this author in PubMed Google Scholar
Meng-Jie Li
View author publications
You can also search for this author in PubMed Google Scholar
Wei-Zhi Nie
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding authors

Correspondence to An-An Liu or Wei-Zhi Nie.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work was supported in part by the National Natural Science Foundation of China (61772359,61572356,61872267),the grant of Tianjin New Generation Artificial Intelligence Major Program (18ZXZNGX00150), the grant of Elite Scholar Program of Tianjin University (2019XRX-0035).

Rights and permissions

Reprints and permissions

About this article

Cite this article

Liu, AA., Zhou, HY., Li, MJ. et al. 3D model retrieval based on multi-view attentional convolutional neural network. Multimed Tools Appl 79, 4699–4711 (2020). https://doi.org/10.1007/s11042-019-7521-8

Download citation

Received: 17 September 2018
Revised: 01 March 2019
Accepted: 18 March 2019
Published: 29 March 2019
Issue Date: February 2020
DOI: https://doi.org/10.1007/s11042-019-7521-8

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

3D model retrieval based on multi-view attentional convolutional neural network

Abstract

Access this article

Similar content being viewed by others

3D object retrieval based on multi-view convolutional neural networks

View-Based 3D Model Retrieval via Convolutional Neural Networks

Multi-view dual attention network for 3D object recognition

References

Author information

Authors and Affiliations

Corresponding authors

Additional information

Publisher’s note

Rights and permissions

About this article

Cite this article

Keywords

Navigation

3D model retrieval based on multi-view attentional convolutional neural network

Abstract

Access this article

Similar content being viewed by others

3D object retrieval based on multi-view convolutional neural networks

View-Based 3D Model Retrieval via Convolutional Neural Networks

Multi-view dual attention network for 3D object recognition

References

Author information

Authors and Affiliations

Corresponding authors

Additional information

Publisher’s note

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation