Category-Sensitive Incremental Learning for Image-Based 3D Shape Reconstruction

  • Conference paper
MultiMedia Modeling (MMM 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13141)

Abstract

Recovering the three-dimensional shape of an object from a two-dimensional image is an important research topic in computer vision. Traditional methods use stereo vision or inter-image matching to obtain geometric information about the object, but they require more than one image as input, which makes them more demanding to apply. Recently, CNN-based approaches have enabled reconstruction from a single image. However, they rely on the limited set of object categories available in large-scale datasets, which restricts their scope of application. In this paper, we propose an incremental 3D reconstruction method. When new categories of interest are labeled and provided, we can fine-tune the network to meet the new needs while retaining old knowledge. To achieve this, we introduce category-wise and instance-wise contrastive losses and an energy-based classification loss. These losses help the network distinguish between different categories, especially when faced with new ones, and preserve the uniqueness and variability of the predictions generated for different instances. Extensive experiments demonstrate the soundness and feasibility of our approach. We hope our work will attract further research.
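
The abstract names the loss terms but does not give their formulas. As a rough illustration only, the sketch below shows two generic building blocks that such terms are commonly built from: the free-energy score used in energy-based out-of-distribution detection, and a supervised contrastive loss in which samples sharing a category label act as positives. The function names, the temperature values, and the exact form of both terms are assumptions made for this example and are not taken from the paper.

```python
import math


def energy_score(logits, temperature=1.0):
    """Free energy E(x) = -T * log(sum_i exp(logit_i / T)).

    Lower energy corresponds to more confident logits. Assumed form,
    not the paper's exact classification loss.
    """
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    return -temperature * (m + math.log(sum(math.exp(s - m) for s in scaled)))


def category_contrastive_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss: same-label samples are positives.

    Hypothetical helper for illustration; embeddings are L2-normalised
    and compared by cosine similarity.
    """
    def normalise(v):
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / norm for x in v]

    z = [normalise(v) for v in embeddings]
    total, anchors = 0.0, 0
    for i in range(len(z)):
        positives = [j for j in range(len(z)) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        sims = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(len(z)) if j != i
        }
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0
```

In this sketch, embeddings that cluster by category yield a small contrastive loss, while labels that cut across the clusters yield a large one. An instance-wise variant could reuse the same function by giving each object instance its own label.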

Acknowledgment

This work is supported by the National Natural Science Foundation of China (Nos. 42075139, 42077232, 61272219); the National High Technology Research and Development Program of China (No. 2007AA01Z334); the Science and Technology Program of Jiangsu Province (Nos. BE2020082, BE2010072, BE2011058, BY2012190); the China Postdoctoral Science Foundation (No. 2017M621700); and the Innovation Fund of State Key Laboratory for Novel Software Technology (No. ZZKT2018A09).

Author information

Corresponding author

Correspondence to Zhengxing Sun.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Zhong, Y., Sun, Z., Luo, S., Sun, Y., Zhang, W. (2022). Category-Sensitive Incremental Learning for Image-Based 3D Shape Reconstruction. In: Þór Jónsson, B., et al. MultiMedia Modeling. MMM 2022. Lecture Notes in Computer Science, vol 13141. Springer, Cham. https://doi.org/10.1007/978-3-030-98358-1_19

  • DOI: https://doi.org/10.1007/978-3-030-98358-1_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-98357-4

  • Online ISBN: 978-3-030-98358-1

  • eBook Packages: Computer Science, Computer Science (R0)