
Training deep convolutional neural networks to acquire the best view of a 3D shape

Published in: Multimedia Tools and Applications

Abstract

In a 3D shape retrieval system, selecting the best view from many candidate view images requires the ability to project a 3D shape into view images from multiple viewpoints. Moreover, learning from benchmark sketch datasets is one of the most effective ways to determine the best view of a 3D shape. In this paper, we propose a learning framework based on deep neural networks to obtain the best shape views. We apply transfer learning for feature extraction, using two AlexNet-based convolutional neural networks (CNNs): one for the view images and the other for the sketches. Through the proposed framework, the connections needed to learn an automatic best-view selector for different types of 3D shapes are obtained. We train on the Sketch Track Benchmark of the 2014 Shape Retrieval Contest (SHREC'14) to capture the relevant selection rules. Finally, we report experiments that demonstrate the feasibility of our approach. In addition, to better evaluate the proposed framework and show its superiority, we apply it to a sketch-based model retrieval task, where it outperforms other state-of-the-art methods.
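The two-branch scheme described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the random-projection `extract_features` merely stands in for a pretrained AlexNet branch, and cosine similarity is an assumed scoring function, since the abstract does not specify the form of the learned selector.

```python
import numpy as np

def extract_features(images, rng):
    # Stand-in for one CNN branch: maps each (flattened) image to a
    # 4096-d feature vector via a fixed random projection. In the real
    # framework this would be a pretrained AlexNet used for transfer
    # learning (one branch for views, one for sketches).
    proj = rng.standard_normal((images.shape[1], 4096))
    return images @ proj

def cosine_similarity(a, b):
    # Row-wise cosine similarity between two feature matrices.
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return a @ b.T

def select_best_view(view_feats, sketch_feats):
    # Score each candidate view by its mean similarity to the sketch
    # features; the best view is the highest-scoring one.
    scores = cosine_similarity(view_feats, sketch_feats).mean(axis=1)
    return int(np.argmax(scores)), scores

rng = np.random.default_rng(0)
views = rng.standard_normal((12, 256))    # 12 rendered views (flattened)
sketches = rng.standard_normal((5, 256))  # 5 benchmark sketches

view_feats = extract_features(views, rng)      # "view" branch
sketch_feats = extract_features(sketches, rng) # "sketch" branch
best, scores = select_best_view(view_feats, sketch_feats)
```

In the actual framework, the two branches would be fine-tuned CNNs and the selector would be learned from SHREC'14 sketch/view training pairs rather than computed with a fixed similarity measure.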





Acknowledgements

The authors thank all the anonymous reviewers, whose comments and suggestions helped us significantly improve this paper. This work is supported in part by the National Natural Science Foundation of China (NSFC Grant No. 61902003), the Key Research Projects of Central University of Basic Scientific Research Funds for Cross Cooperation (Grant No. 201510-02), the Research Funds for the Doctoral Program of Higher Education of China (Grant No. 2013007211-0035), the Key Project in Science and Technology of Jilin Province of China (Grant No. 20140204088GX), and the Doctoral Scientific Research Foundation of Anhui Normal University.

Author information


Corresponding author

Correspondence to Wen Zhou.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Zhou, W., Jia, J. Training deep convolutional neural networks to acquire the best view of a 3D shape. Multimed Tools Appl 79, 581–601 (2020). https://doi.org/10.1007/s11042-019-08107-w
