Deep Shape from a Low Number of Silhouettes

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9915)

Abstract

Despite strong progress in the field of 3D reconstruction from multiple views, holes in objects, transparent objects and textureless scenes remain open challenges. Silhouette-based reconstruction techniques, on the other hand, relax the dependence of 3D reconstruction on raw image pixels, but require a large number of silhouettes to be available from multiple views. In this paper, a novel end-to-end pipeline is proposed to produce high-quality reconstructions from a low number of silhouettes, the core of which is a deep shape reconstruction architecture. Evaluations on ShapeNet [1] show good reconstruction quality compared with the ground truth.
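
The abstract gives only a high-level view of the architecture. For intuition, the sketch below shows one plausible way (in PyTorch) to wire up a silhouette-to-voxel encoder-decoder of the kind described: per-view 2D convolutional encoders, an order-invariant pooling across views, and a 3D up-convolutional decoder producing an occupancy grid. The layer sizes, the 64x64 input resolution, the 32x32x32 output grid, the max-pooling fusion and the name SilhouetteToVoxel are all illustrative assumptions, not the authors' published design.

    # Hypothetical sketch of a multi-silhouette-to-voxel network; all
    # dimensions and the view-pooling strategy are assumptions, not the
    # architecture from the paper.
    import torch
    import torch.nn as nn

    class SilhouetteToVoxel(nn.Module):
        def __init__(self):
            super().__init__()
            # 2D encoder: one binary 64x64 silhouette -> 256-d feature vector
            self.encoder = nn.Sequential(
                nn.Conv2d(1, 32, 4, stride=2, padding=1), nn.ReLU(),    # 64 -> 32
                nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),   # 32 -> 16
                nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),  # 16 -> 8
                nn.Flatten(),
                nn.Linear(128 * 8 * 8, 256), nn.ReLU(),
            )
            # 3D decoder: fused feature vector -> 32x32x32 occupancy grid
            self.fc = nn.Linear(256, 128 * 4 * 4 * 4)
            self.decoder = nn.Sequential(
                nn.ConvTranspose3d(128, 64, 4, stride=2, padding=1), nn.ReLU(),  # 4 -> 8
                nn.ConvTranspose3d(64, 32, 4, stride=2, padding=1), nn.ReLU(),   # 8 -> 16
                nn.ConvTranspose3d(32, 1, 4, stride=2, padding=1),               # 16 -> 32
            )

        def forward(self, silhouettes):
            # silhouettes: (batch, n_views, 1, 64, 64) binary masks
            b, v = silhouettes.shape[:2]
            feats = self.encoder(silhouettes.flatten(0, 1))  # (b * v, 256)
            # Max-pool across views: invariant to view order and count,
            # so the same network accepts any (low) number of silhouettes.
            fused = feats.view(b, v, -1).max(dim=1).values   # (b, 256)
            x = self.fc(fused).view(b, 128, 4, 4, 4)
            return torch.sigmoid(self.decoder(x)).squeeze(1)  # (b, 32, 32, 32)

    # Example: reconstruct from three silhouettes per object. A full training
    # setup would minimise per-voxel binary cross-entropy against voxelised
    # ShapeNet models (an assumed, not confirmed, training objective).
    model = SilhouetteToVoxel()
    occupancy = model(torch.rand(2, 3, 1, 64, 64).round())

Pooling per-view features with a max, rather than concatenating them, is one common way to keep the fusion independent of how many silhouettes are supplied, which matches the paper's stated goal of reconstructing from a low and possibly varying number of views.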

Keywords

Deep 3D reconstruction · End-to-end architecture · Silhouettes

References

  1. Chang, A.X., Funkhouser, T., Guibas, L., Hanrahan, P., Huang, Q., Li, Z., Savarese, S., Savva, M., Song, S., Su, H., Xiao, J., Yi, L., Yu, F.: ShapeNet: an information-rich 3D model repository. Technical report [cs.GR], Stanford University – Princeton University – Toyota Technological Institute at Chicago (2015). arXiv:1512.03012
  2. Farabet, C., Couprie, C., Najman, L., LeCun, Y.: Learning hierarchical features for scene labeling. IEEE Trans. Pattern Anal. Mach. Intell. 35(8), 1915–1929 (2013)
  3. Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 580–587 (2014)
  4. Hariharan, B., Arbeláez, P., Girshick, R., Malik, J.: Hypercolumns for object segmentation and fine-grained localization. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 447–456 (2015)
  5. Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440 (2015)
  6. Ganin, Y., Lempitsky, V.: \(N^4\)-fields: neural network nearest neighbor fields for image transforms. In: Cremers, D., Reid, I., Saito, H., Yang, M.-H. (eds.) ACCV 2014. LNCS, vol. 9004, pp. 536–551. Springer, Heidelberg (2015). doi:10.1007/978-3-319-16808-1_36
  7. Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014)
  8. Dosovitskiy, A., Fischer, P., Ilg, E., Hazirbas, C., Golkov, V., van der Smagt, P., Cremers, D., Brox, T., et al.: FlowNet: learning optical flow with convolutional networks. In: 2015 IEEE International Conference on Computer Vision (ICCV), pp. 2758–2766. IEEE (2015)
  9. Eigen, D., Puhrsch, C., Fergus, R.: Depth map prediction from a single image using a multi-scale deep network. In: Advances in Neural Information Processing Systems, pp. 2366–2374 (2014)
  10. Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., Rabinovich, A.: Going deeper with convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–9 (2015)
  11. Tran, D., Bourdev, L., Fergus, R., Torresani, L., Paluri, M.: Learning spatiotemporal features with 3D convolutional networks. In: 2015 IEEE International Conference on Computer Vision (ICCV), pp. 4489–4497. IEEE (2015)
  12. Matan, O., Burges, C.J., LeCun, Y., Denker, J.S.: Multi-digit recognition using a space displacement neural network. In: NIPS, pp. 488–495 (1991)
  13. LeCun, Y., Boser, B., Denker, J.S., Henderson, D., Howard, R.E., Hubbard, W., Jackel, L.D.: Backpropagation applied to handwritten zip code recognition. Neural Comput. 1(4), 541–551 (1989)
  14. Wolf, R., Platt, J.C.: Postal address block location using a convolutional locator network. In: Advances in Neural Information Processing Systems, p. 745 (1994)
  15. Ning, F., Delhomme, D., LeCun, Y., Piano, F., Bottou, L., Barbano, P.E.: Toward automatic phenotyping of developing embryos from videos. IEEE Trans. Image Process. 14(9), 1360–1371 (2005)
  16. Dosovitskiy, A., Springenberg, J.T., Brox, T.: Learning to generate chairs with convolutional neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1538–1546 (2015)
  17. Sharma, A., Grau, O., Fritz, M.: VConv-DAE: deep volumetric shape learning without object labels. arXiv preprint (2016). arXiv:1604.03755
  18. Wang, X., Fouhey, D., Gupta, A.: Designing deep networks for surface normal estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 539–547 (2015)
  19. Tulsiani, S., Malik, J.: Viewpoints and keypoints. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1510–1519. IEEE (2015)
  20. Luo, W., Schwing, A.G., Urtasun, R.: Efficient deep learning for stereo matching. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5695–5703 (2016)
  21. Tatarchenko, M., Dosovitskiy, A., Brox, T.: Multi-view 3D models from single images with a convolutional network. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9911, pp. 322–337. Springer, Heidelberg (2016). doi:10.1007/978-3-319-46478-7_20
  22. Choy, C.B., Xu, D., Gwak, J., Chen, K., Savarese, S.: 3D-R2N2: a unified approach for single and multi-view 3D object reconstruction. arXiv preprint (2016). arXiv:1604.00449
  23. Yumer, M.E., Mitra, N.J.: Learning semantic deformation flows with 3D convolutional networks. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9910, pp. 294–311. Springer, Heidelberg (2016). doi:10.1007/978-3-319-46466-4_18
  24. Yan, X., Yang, J., Yumer, E., Guo, Y., Lee, H.: Learning volumetric 3D object reconstruction from single-view with projective transformations. In: Neural Information Processing Systems (NIPS 2016) (2016)
  25. Laurentini, A.: The visual hull concept for silhouette-based image understanding. IEEE Trans. Pattern Anal. Mach. Intell. 16(2), 150–162 (1994)
  26. Kim, D., Ruttle, J., Dahyot, R.: Bayesian 3D shape from silhouettes. Digit. Signal Proc. 23(6), 1844–1855 (2013)
  27. Su, H., Qi, C.R., Li, Y., Guibas, L.J.: Render for CNN: viewpoint estimation in images using CNNs trained with rendered 3D model views. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2686–2694 (2015)
  28. Maturana, D., Scherer, S.: VoxNet: a 3D convolutional neural network for real-time object recognition. In: 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 922–928. IEEE (2015)
  29. Wu, Z., Song, S., Khosla, A., Yu, F., Zhang, L., Tang, X., Xiao, J.: 3D ShapeNets: a deep representation for volumetric shapes. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1912–1920 (2015)

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. School of Computer Science and Statistics, Trinity College Dublin, Dublin, Ireland
