Abstract
We propose an efficient Stereographic Projection Neural Network (SPNet) for learning representations of 3D objects. We first transform a 3D input volume into a 2D planar image using stereographic projection. A shallow 2D convolutional neural network (CNN) then estimates the object category, followed by a view ensemble that combines the responses from multiple views of the object to further improve the predictions. Specifically, the proposed approach consists of four stages: (1) stereographic projection of a 3D object, (2) view-specific feature learning, (3) view selection, and (4) view ensemble. The proposed approach performs comparably to state-of-the-art methods while requiring substantially less GPU memory and far fewer network parameters. Despite its compactness, experiments on 3D object classification and shape retrieval demonstrate the high performance of the proposed method.
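The first stage maps 3D geometry onto a 2D plane via the classical stereographic projection. The following is an illustrative sketch of that map, not the authors' exact preprocessing pipeline; the function name, the projection from the north pole, and the normalization step are our assumptions.

```python
import numpy as np

def stereographic_project(points):
    """Map 3D points to the 2D plane z = 0 by stereographic projection
    from the north pole (0, 0, 1) of the unit sphere."""
    points = np.asarray(points, dtype=float)
    # Normalize onto the unit sphere so every point has a well-defined image.
    points = points / np.linalg.norm(points, axis=1, keepdims=True)
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    # Points approaching the north pole (z -> 1) map to infinity; clip the
    # denominator for numerical stability.
    denom = np.clip(1.0 - z, 1e-8, None)
    return np.stack([x / denom, y / denom], axis=1)

# Example: the south pole maps to the plane's origin, and a point on the
# equator maps to the unit circle.
uv = stereographic_project([[0.0, 0.0, -1.0], [1.0, 0.0, 0.0]])
```

The resulting planar coordinates can then be rasterized into the 2D image that the shallow CNN consumes.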
Acknowledgment
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea Government (MSIT) (No. NRF-2017R1A2B2011862).
© 2019 Springer Nature Switzerland AG
Cite this paper
Yavartanoo, M., Kim, E.Y., Lee, K.M. (2019). SPNet: Deep 3D Object Classification and Retrieval Using Stereographic Projection. In: Jawahar, C., Li, H., Mori, G., Schindler, K. (eds.) Computer Vision – ACCV 2018. Lecture Notes in Computer Science, vol. 11365. Springer, Cham. https://doi.org/10.1007/978-3-030-20873-8_44
Print ISBN: 978-3-030-20872-1
Online ISBN: 978-3-030-20873-8