Visual information quantification for object recognition and retrieval

Cheng, JiaLiang; Bie, Lin; Zhao, XiBin; Gao, Yue

doi:10.1007/s11431-021-1930-8

Article
Published: 27 October 2021

Science China Technological Sciences, Volume 64, pages 2618–2626 (2021)

Abstract

The rapid development of computer vision has led to an increasing amount of 3D data, such as multiple views and point clouds, which are widely used in 3D object recognition and retrieval. Intuitively, the quality of 3D data is a crucial factor that directly affects the performance of 3D applications. However, how to evaluate 3D data quality, especially multi-view data quality, remains an open question. To tackle this issue, we propose an entropy-based multi-view information quantification model (MV-Info model) that quantitatively evaluates the information contained in multi-view data. The MV-Info model consists of a hierarchical data module, a feature generation module, and a quantitative calculation module. It also draws on information entropy theory to produce more reasonable quantification results. Our method quantifies how much information can be observed from a group of views, and this measure can be used to support 3D recognition and retrieval. We designed a series of experiments to evaluate the effectiveness of the proposed model, and the experimental results demonstrate its rationality and validity.
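
To make the entropy idea in the abstract concrete, the following is a minimal sketch of Shannon entropy applied to a group of views. It is not the MV-Info model, whose modules are not specified in this preview. It assumes, purely for illustration, that each view has already been reduced to a list of discrete feature codes (for example, visual-word indices); the function names `shannon_entropy` and `view_group_entropy` are hypothetical.

```python
import math
from collections import Counter
from typing import Iterable, Sequence


def shannon_entropy(codes: Iterable[int]) -> float:
    """Return the Shannon entropy, in bits, of the empirical code distribution."""
    counts = Counter(codes)
    total = sum(counts.values())
    if total == 0:
        # An empty observation carries no information.
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def view_group_entropy(views: Sequence[Sequence[int]]) -> float:
    """Entropy of the codes pooled over all views in the group.

    Assumption: pooling discrete codes is an illustrative stand-in for the
    feature generation and quantitative calculation steps, not the paper's method.
    """
    pooled = [code for view in views for code in view]
    return shannon_entropy(pooled)


if __name__ == "__main__":
    one_view = [[0, 0, 0, 0]]
    two_views = [[0, 0], [1, 1]]
    print(view_group_entropy(one_view))
    print(view_group_entropy(two_views))
```

Under this toy assumption, a group whose views all produce the same code has zero entropy, while a group whose views contribute different codes scores higher, which mirrors the intuition that more varied views let an observer see more of the object.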

Author information

Authors and Affiliations

School of Software, Tsinghua University, Beijing, 100084, China
JiaLiang Cheng, Lin Bie, XiBin Zhao & Yue Gao

Corresponding author

Correspondence to Yue Gao.

Additional information

This work was supported by the SGCC Science and Technology Project (Grant No. 52020119000A).

About this article

Cite this article

Cheng, J., Bie, L., Zhao, X. et al. Visual information quantification for object recognition and retrieval. Sci. China Technol. Sci. 64, 2618–2626 (2021). https://doi.org/10.1007/s11431-021-1930-8

Received: 18 May 2021
Accepted: 07 September 2021
Published: 27 October 2021
Issue Date: December 2021
DOI: https://doi.org/10.1007/s11431-021-1930-8
