
SPNet: Deep 3D Object Classification and Retrieval Using Stereographic Projection

  • Conference paper
  • In: Computer Vision – ACCV 2018 (ACCV 2018)

Abstract

We propose an efficient Stereographic Projection Neural Network (SPNet) for learning representations of 3D objects. We first transform a 3D input volume into a 2D planar image using stereographic projection. We then present a shallow 2D convolutional neural network (CNN) to estimate the object category, followed by a view ensemble that combines the responses from multiple views of the object to further enhance the predictions. Specifically, the proposed approach consists of four stages: (1) stereographic projection of a 3D object, (2) view-specific feature learning, (3) view selection, and (4) view ensemble. The proposed approach performs comparably to the state-of-the-art methods while requiring substantially less GPU memory and fewer network parameters. Despite its light weight, experiments on 3D object classification and shape retrieval demonstrate the high performance of the proposed method.
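The first stage above maps points from a sphere onto a plane. As an illustrative sketch (not the authors' exact implementation, which operates on voxelized objects; the function name here is hypothetical), classical stereographic projection from the north pole sends a point (x, y, z) on the unit sphere to (x / (1 - z), y / (1 - z)):

```python
import numpy as np

def stereographic_project(points):
    """Project 3D points on the unit sphere onto the plane z = 0.

    Projection from the north pole (0, 0, 1):
    (x, y, z) -> (x / (1 - z), y / (1 - z)).
    """
    points = np.asarray(points, dtype=float)
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    denom = 1.0 - z
    # Points at (or numerically near) the north pole map to infinity;
    # clamp the denominator to keep the output finite.
    eps = 1e-8
    denom = np.where(np.abs(denom) < eps, eps, denom)
    return np.stack([x / denom, y / denom], axis=1)

# Equator points keep their (x, y) coordinates; the south pole maps to the origin.
print(stereographic_project([[1.0, 0.0, 0.0], [0.0, 0.0, -1.0]]))
```

Such a mapping flattens the object's surface into a single 2D image, which is what lets the subsequent feature-learning stage use a shallow 2D CNN instead of a costly 3D one.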



Acknowledgment

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea Government (MSIT) (No. NRF-2017R1A2B2011862).

Author information

Correspondence to Kyoung Mu Lee.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Yavartanoo, M., Kim, E.Y., Lee, K.M. (2019). SPNet: Deep 3D Object Classification and Retrieval Using Stereographic Projection. In: Jawahar, C., Li, H., Mori, G., Schindler, K. (eds) Computer Vision – ACCV 2018. ACCV 2018. Lecture Notes in Computer Science(), vol 11365. Springer, Cham. https://doi.org/10.1007/978-3-030-20873-8_44

  • DOI: https://doi.org/10.1007/978-3-030-20873-8_44

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-20872-1

  • Online ISBN: 978-3-030-20873-8
