Abstract
Recovering the three-dimensional shape of an object from a two-dimensional image is an important research topic in computer vision. Traditional methods use stereo vision or inter-image matching to obtain geometric information about the object, but they require more than one image as input, which makes them more demanding to apply. Recently, CNN-based approaches have enabled reconstruction from a single image. However, they rely on the limited set of object categories available in large-scale datasets, which restricts their scope of application. In this paper, we propose an incremental 3D reconstruction method. When new categories of interest are labeled and provided, we can fine-tune the network to meet the new needs while retaining old knowledge. To meet these requirements, we introduce category-wise and instance-wise contrastive losses and an energy-based classification loss. They help the network distinguish between different categories, especially when faced with new ones, and preserve the uniqueness and variability of the predictions generated for different instances. Extensive experiments demonstrate the soundness and feasibility of our approach. We hope our work can attract further research.
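The abstract names two loss families without giving their formulas. As a rough illustration only, the sketch below shows the generic building blocks such losses are commonly assembled from: the free-energy score of a logit vector, as used in energy-based out-of-distribution detection, and an InfoNCE-style contrastive term over one positive and several negatives. The function names, the temperature values, and the use of cosine similarity are all assumptions made for this example; they are not taken from the paper and should not be read as the authors' formulation.

```python
import math
from typing import Sequence


def _log_sum_exp(values: Sequence[float]) -> float:
    """Numerically stable log(sum(exp(v))) for a non-empty sequence."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def free_energy(logits: Sequence[float], temperature: float = 1.0) -> float:
    """Free energy E = -T * log(sum_k exp(logit_k / T)).

    Lower energy means the classifier is more confident about the input.
    (Hypothetical helper; the paper's exact loss is not given in the abstract.)
    """
    if not logits:
        raise ValueError("logits must be non-empty")
    return -temperature * _log_sum_exp([z / temperature for z in logits])


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, tau: float = 0.1) -> float:
    """InfoNCE-style loss: -log softmax of the positive pair's similarity.

    Assumption: cosine similarity and tau = 0.1 are illustrative choices.
    """
    scaled = [cosine(anchor, positive) / tau]
    scaled += [cosine(anchor, n) / tau for n in negatives]
    return _log_sum_exp(scaled) - scaled[0]


if __name__ == "__main__":
    # A confident logit vector has lower energy than a flat one.
    print(free_energy([10.0, 0.0, 0.0]), free_energy([1.0, 1.0, 1.0]))
    # The loss is small when the positive is aligned with the anchor.
    print(info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]]))
```

In a category-wise setting the positives would plausibly be embeddings from the same object category, and in an instance-wise setting different views of the same instance; that pairing is again an assumption of this sketch.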
Acknowledgment
This work is supported by the National Natural Science Foundation of China (Nos. 42075139, 42077232, 61272219); the National High Technology Research and Development Program of China (No. 2007AA01Z334); the Science and Technology Program of Jiangsu Province (Nos. BE2020082, BE2010072, BE2011058, BY2012190); the China Postdoctoral Science Foundation (No. 2017M621700); and the Innovation Fund of the State Key Laboratory for Novel Software Technology (No. ZZKT2018A09).
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Zhong, Y., Sun, Z., Luo, S., Sun, Y., Zhang, W. (2022). Category-Sensitive Incremental Learning for Image-Based 3D Shape Reconstruction. In: Þór Jónsson, B., et al. MultiMedia Modeling. MMM 2022. Lecture Notes in Computer Science, vol 13141. Springer, Cham. https://doi.org/10.1007/978-3-030-98358-1_19
DOI: https://doi.org/10.1007/978-3-030-98358-1_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-98357-4
Online ISBN: 978-3-030-98358-1
eBook Packages: Computer Science, Computer Science (R0)