One-Shot Object Affordance Detection in the Wild

International Journal of Computer Vision

Abstract

Affordance detection refers to identifying the potential action possibilities of objects in an image, which is a crucial ability for robot perception and manipulation. To empower robots with this ability in unseen scenarios, we first study the challenging one-shot affordance detection problem in this paper, i.e., given a support image that depicts the action purpose, all objects in a scene with the common affordance should be detected. To this end, we devise a One-Shot Affordance Detection Network (OSAD-Net) that first estimates the human action purpose and then transfers it to help detect the common affordance from all candidate images. Through collaboration learning, OSAD-Net can capture the common characteristics between objects having the same underlying affordance and learn a good adaptation capability for perceiving unseen affordances. Besides, we build a large-scale Purpose-driven Affordance Dataset v2 (PADv2) by collecting and labeling 30k images from 39 affordance and 103 object categories. With complex scenes and rich annotations, our PADv2 dataset can be used as a test bed to benchmark affordance detection methods and may also facilitate downstream vision tasks, such as scene understanding, action recognition, and robot manipulation. Specifically, we conducted comprehensive experiments on the PADv2 dataset, including 11 advanced models from several related research fields. Experimental results demonstrate the superiority of our model over previous representative ones in terms of both objective metrics and visual quality. The benchmark suite is available at https://github.com/lhc1224/OSAD_Net.
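
To make the one-shot setting concrete, below is a minimal PyTorch-style sketch of the pipeline the abstract describes: a support image conveying the action purpose is encoded into a purpose representation, which is then transferred to each candidate (query) image to predict a pixel-wise affordance mask. This is an illustrative sketch under simplifying assumptions, not the authors' OSAD-Net implementation; the module names (OneShotAffordanceDetector, purpose_encoder), layer choices, and feature dimensions are placeholders, and the simple concatenation-based transfer stands in for the purpose transfer and collaboration learning modules described in the paper. See the repository linked above for the actual architecture.

import torch
import torch.nn as nn


class OneShotAffordanceDetector(nn.Module):
    """Toy one-shot affordance detector: support image -> purpose vector -> query masks."""

    def __init__(self, feat_dim=256):
        super().__init__()
        # Shared backbone mapping an RGB image to a feature map (stride 4 here).
        self.backbone = nn.Sequential(
            nn.Conv2d(3, feat_dim, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(feat_dim, feat_dim, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
        )
        # Estimates a global "action purpose" vector from the support features.
        self.purpose_encoder = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(feat_dim, feat_dim)
        )
        # Decodes a per-pixel affordance mask from query features fused with the purpose.
        self.decoder = nn.Sequential(
            nn.Conv2d(feat_dim * 2, feat_dim, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(feat_dim, 1, kernel_size=1),
        )

    def forward(self, support_img, query_imgs):
        # support_img: (1, 3, H, W); query_imgs: (N, 3, H, W)
        purpose = self.purpose_encoder(self.backbone(support_img))  # (1, C)
        query_feats = self.backbone(query_imgs)                     # (N, C, h, w)
        n, c, h, w = query_feats.shape
        # Broadcast the purpose vector over every spatial location of every query.
        purpose_map = purpose.view(1, c, 1, 1).expand(n, c, h, w)
        fused = torch.cat([query_feats, purpose_map], dim=1)
        # Per-pixel affordance probability at feature resolution (upsample as needed).
        return torch.sigmoid(self.decoder(fused))                   # (N, 1, h, w)


# Usage: one support image conveying the action purpose, five candidate images.
model = OneShotAffordanceDetector()
support = torch.randn(1, 3, 224, 224)
queries = torch.randn(5, 3, 224, 224)
masks = model(support, queries)  # affordance probability maps for each query image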

Notes

  1. “Detection” here refers to the pixel-wise detection task; the term is also used in this sense in the area of salient object detection.

  2. This refers to our conference version model in Luo et al. (2021b).

References

  • Achanta, R., Hemami, S., Estrada, F., & Susstrunk, S. (2009). Frequency-tuned salient region detection. In The IEEE conference on computer vision and pattern recognition (CVPR) (pp. 1597–1604).

  • Arbelaez, P., Maire, M., Fowlkes, C., & Malik, J. (2010). Contour detection and hierarchical image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(5), 898–916.

  • Argall, B. D., Chernova, S., Veloso, M., & Browning, B. (2009). A survey of robot learning from demonstration. Robotics and Autonomous Systems, 57(5), 469–483.

  • Cai, J., Zha, Z. J., Wang, M., Zhang, S., & Tian, Q. (2014). An attribute-assisted reranking model for web image search. IEEE Transactions on Image Processing (TIP), 24(1), 261–272.

  • Chen, J., Liu, D., Luo, B., Peng, X., Xu, T., & Chen, E. (2019). Residual objectness for imbalance reduction. arXiv preprint arXiv:1908.09075.

  • Chen, L. C., Papandreou, G., Schroff, F., & Adam, H. (2017). Rethinking atrous convolution for semantic image segmentation. arXiv preprint arXiv:1706.05587.

  • Chen, L. C., Zhu, Y., Papandreou, G., Schroff, F., & Adam, H. (2018). Encoder-decoder with atrous separable convolution for semantic image segmentation. In The European conference on computer vision (ECCV).

  • Chen, W., Liu, Y., Wang, W., Bakker, E., Georgiou, T., Fieguth, P., Liu, L., & Lew, M. S. (2021). Deep image retrieval: A survey. arXiv preprint arXiv:2101.11282.

  • Chuang, C. Y., Li, J., Torralba, A., & Fidler, S. (2018). Learning to act properly: Predicting and explaining affordances from images. In: The IEEE conference on computer vision and pattern recognition (CVPR) (pp. 975–983).

  • Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society: Series B (Methodological), 39, 1–38.

  • Deng, S., Xu, X., Wu, C., Chen, K., & Jia, K. (2021). 3d affordancenet: A benchmark for visual object affordance understanding. In The IEEE conference on computer vision and pattern recognition (CVPR) (pp 1778–1787).

  • Do, T. T., Nguyen, A., & Reid, I. (2018). Affordancenet: An end-to-end deep learning approach for object affordance detection. In International conference on robotics and automation (ICRA).

  • Dong, N., & Xing, E. P. (2018). Few-shot semantic segmentation with prototype learning. In The British Machine Vision Conference (BMVC) (Vol 3).

  • Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., & Gelly, S., et al. (2020). An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929.

  • Fan, D. P., Gong, C., Cao, Y., Ren, B., Cheng, M. M., & Borji, A. (2018). Enhanced-alignment measure for binary foreground map evaluation. In International joint conference on artificial intelligence (IJCAI).

  • Fan, D. P., Li, T., Lin, Z., Ji, G. P., Zhang, D., Cheng, M. M., Fu, H., & Shen, J. (2021). Re-thinking co-salient object detection. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 99, 1–1.

  • Fang, K., Wu, T. L., Yang, D., Savarese, S., & Lim, J. J. (2018). Demo2vec: Reasoning object affordances from online videos. In The IEEE conference on computer vision and pattern recognition (CVPR).

  • Fang, K., Zhu, Y., Garg, A., Kurenkov, A., Mehta, V., Fei-Fei, L., & Savarese, S. (2020). Learning task-oriented grasping for tool manipulation from simulated self-supervision. The International Journal of Robotics Research, 39(2–3), 202–216.

  • Finn, C., Abbeel, P., & Levine, S. (2017). Model-agnostic meta-learning for fast adaptation of deep networks. arXiv.

  • Gao, S. H., Tan, Y. Q., Cheng, M. M., Lu, C., Chen, Y., & Yan, S. (2020). Highly efficient salient object detection with 100k parameters. In The European Conference on Computer Vision (ECCV).

  • Gao, W., Wan, F., Pan, X., Peng, Z., Tian, Q., Han, Z., Zhou, B., & Ye, Q. (2021). Ts-cam: Token semantic coupled attention map for weakly supervised object localization. In The IEEE International Conference on Computer Vision (ICCV).

  • Gibson, J. J. (1977). The theory of affordances. Hilldale

  • Hassan, M., & Dharmaratne, A. (2015). Attribute based affordance detection from human-object interaction images. In Image and Video Technology (pp. 220–232). Springer.

  • Hassanin, M., Khan, S., & Tahtali, M. (2018). Visual affordance and function understanding: A survey. arXiv.

  • He, H., & Garcia, E. A. (2009). Learning from imbalanced data. IEEE Transactions on Knowledge and Data Engineering, 21(9), 1263–1284.

  • He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In The IEEE conference on computer vision and pattern recognition (CVPR).

  • Hermans, T., Rehg, J. M., & Bobick, A. (2011). Affordance prediction via learned object attributes. In IEEE international conference on robotics and automation (ICRA): Workshop on semantic perception, mapping, and exploration (pp. 181–184).

  • Ho, J., & Ermon, S. (2016). Generative adversarial imitation learning. Advances in Neural Information Processing Systems, 29, 4565–4573.

  • Howard, A. G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., & Adam, H. (2017). Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861.

  • Johnander, J., Edstedt, J., Danelljan, M., Felsberg, M., & Khan, F. S. (2021). Deep gaussian processes for few-shot segmentation. arXiv preprint arXiv:2103.16549.

  • Kingma, D. P., & Ba, J. (2014). Adam: A method for stochastic optimization. arXiv

  • Kipf, T. N., & Welling M (2016). Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907.

  • Kjellström, H., Romero, J., & Kragic, D. (2011). Visual object-action recognition: Inferring object affordances from human demonstration. Computer Vision and Image Understanding, 115(1), 81–90.

  • Le Meur, O., Le Callet, P., & Barba, D. (2007). Predicting visual fixations on video based on low-level visual features. Vision Research, 47, 2483–2498.

  • Li, G., Jampani, V., Sevilla-Lara, L., Sun, D., Kim, J., & Kim, J. (2021). Adaptive prototype learning and allocation for few-shot segmentation. In The IEEE conference on computer vision and pattern recognition (CVPR) (pp. 8334–8343).

  • Li, K., Zhang, Y., Li, K., & Fu, Y. (2020). Adversarial feature hallucination networks for few-shot learning. In The IEEE conference on computer vision and pattern recognition (CVPR) (pp. 13470–13479).

  • Li, X., Liu, S., Kim, K., Wang, X., Yang, M. H., & Kautz, J. (2019a). Putting humans in a scene: Learning affordance in 3d indoor environments. In The IEEE conference on computer vision and pattern recognition (CVPR) (pp. 12368–12376).

  • Li, X., Zhong, Z., Wu, J., Yang, Y., Lin, Z., & Liu, H. (2019b). Expectation-maximization attention networks for semantic segmentation. In The IEEE International conference on computer vision (ICCV).

  • Lin, T. Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., & Zitnick, C.L. (2014). Microsoft coco: Common objects in context. In The European conference on computer vision (ECCV).

  • Liu, C., Chen, L. C., Schroff, F., Adam, H., Hua, W., Yuille, A. L., & Fei-Fei, L. (2019). Auto-deeplab: Hierarchical neural architecture search for semantic image segmentation. In The IEEE conference on computer vision and pattern recognition (CVPR) (pp. 82–92).

  • Liu, T., Yuan, Z., Sun, J., Wang, J., Zheng, N., Tang, X., & Shum, H. Y. (2010). Learning to detect a salient object. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 33(2), 353–367.

  • Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S., & Guo, B. (2021). Swin transformer: Hierarchical vision transformer using shifted windows. In The IEEE international conference on computer vision (ICCV).

  • Long, J., Shelhamer, E., & Darrell, T. (2015). Fully convolutional networks for semantic segmentation. In The IEEE conference on computer vision and pattern recognition (CVPR)

  • Lu, L., Zhai, W., Luo, H., Kang, Y., & Cao, Y. (2022). Phrase-based affordance detection via cyclic bilateral interaction. arXiv preprint arXiv:2202.12076.

  • Luo, H., Zhai, W., Zhang, J., Cao, Y., & Tao, D. (2021a). Learning visual affordance grounding from demonstration videos. arXiv preprint arXiv:2108.05675.

  • Luo, H., Zhai, W., Zhang, J., Cao, Y., & Tao, D. (2021b). One-shot affordance detection. In International joint conference on artificial intelligence (IJCAI).

  • Luo, H., Zhai, W., Zhang, J., Cao, Y., & Tao, D. (2022). Learning affordance grounding from exocentric images. In The IEEE conference on computer vision and pattern recognition (CVPR).

  • Mi, J., Liang, H., Katsakis, N., Tang, S., Li, Q., Zhang, C., & Zhang, J. (2020). Intention-related natural language grounding via object affordance detection and intention semantic extraction. Frontiers in Neurorobotics, 14, 26.

  • Myers, A., Teo, C. L., Fermüller, C., & Aloimonos, Y. (2015). Affordance detection of tool parts from geometric features. In International conference on robotics and automation (ICRA) (pp. 1374–1381).

  • Nagarajan, T., & Grauman, K. (2020). Learning affordance landscapes for interaction exploration in 3d environments. arXiv preprint arXiv:2008.09241.

  • Nagarajan, T., Feichtenhofer, C., & Grauman, K. (2019). Grounded human-object interaction hotspots from video. In The IEEE international conference on computer vision (ICCV).

  • Nguyen, A., Kanoulas, D., Caldwell, D. G., & Tsagarakis, N. G. (2017). Object-based affordances detection with convolutional neural networks and dense conditional random fields. In IEEE/RSJ international conference on intelligent robots and systems (IROS) (pp. 5908–5915). IEEE.

  • Patro, S., & Sahu, K. K. (2015). Normalization: A preprocessing stage. arXiv preprint arXiv:1503.06462.

  • Perazzi, F., Krähenbühl, P., Pritch, Y., Hornung, A. (2012). Saliency filters: Contrast based filtering for salient region detection. In The IEEE conference on computer vision and pattern recognition (CVPR).

  • Perazzi, F., Pont-Tuset, J., McWilliams, B., Van Gool, L., Gross, M., & Sorkine-Hornung, A. (2016). A benchmark dataset and evaluation methodology for video object segmentation. In The IEEE conference on computer vision and pattern recognition (CVPR) (pp. 724–732).

  • Qi, S., Huang, S., Wei, P., & Zhu, S. C. (2017). Predicting human activities using stochastic grammar. In The IEEE international conference on computer vision (ICCV) (pp. 1164–1172).

  • Qian, Q., Chen, L., Li, H., & Jin, R. (2020). Dr loss: Improving object detection by distributional ranking. In The IEEE conference on computer vision and pattern recognition (CVPR) (pp. 12164–12172).

  • Qin, X., Zhang, Z., Huang, C., Gao, C., Dehghan, M., & Jagersand, M. (2019). Basnet: Boundary-aware salient object detection. In The IEEE conference on computer vision and pattern recognition (CVPR).

  • Ramakrishnan, S. K., Jayaraman, D., & Grauman, K. (2021). An exploration of embodied visual exploration. International Journal of Computer Vision (IJCV), 129(5), 1616–1649.

  • Ranftl, R., Lasinger, K., Hafner, D., Schindler, K., & Koltun, V. (2020). Towards robust monocular depth estimation: Mixing datasets for zero-shot cross-dataset transfer. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 44, 1623–1637.

  • Ravi, S., & Larochelle, H. (2017). Optimization as a model for few-shot learning. In International conference on learning representations (ICLR).

  • Richardson, M., & Domingos, P. (2006). Markov logic networks. Machine learning, 62(1–2), 107–136.

  • Richardson, S., & Green, P. J. (1997). On Bayesian analysis of mixtures with an unknown number of components (with discussion). Journal of the Royal Statistical Society: Series B (Statistical Methodology), 59(4), 731–792.

  • Ronneberger, O., Fischer, P., & Brox, T. (2015). U-net: Convolutional networks for biomedical image segmentation. In The international conference on medical image computing and computer assisted intervention (MICCAI).

  • Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., et al. (2015). Imagenet large scale visual recognition challenge. International Journal of Computer Vision (IJCV), 115, 211–252.

  • Rusu, A. A., Rao, D., Sygnowski, J., Vinyals, O., Pascanu, R., Osindero, S., & Hadsell, R. (2018). Meta-learning with latent embedding optimization. arXiv preprint arXiv:1807.05960.

  • Sawatzky, J., & Gall, J. (2017). Adaptive binarization for weakly supervised affordance segmentation. In Proceedings of the IEEE international conference on computer vision workshops (pp. 1383–1391).

  • Sawatzky, J., Srikantha, A., & Gall, J. (2017). Weakly supervised affordance detection. In The IEEE conference on computer vision and pattern recognition (CVPR)

  • Shaban, A., Bansal, S., Liu, Z., Essa, I., & Boots, B. (2017). One-shot learning for semantic segmentation. arXiv

  • Snell, J., Swersky, K., & Zemel, R. (2017). Prototypical networks for few-shot learning. In Conference on neural information processing systems (NeurIPS).

  • Song, H. O., Fritz, M., Goehring, D., & Darrell, T. (2015). Learning to detect visual grasp affordance. IEEE Transactions on Automation Science and Engineering, 13(2), 798–809.

  • Stark, M., Lies, P., Zillich, M., Wyatt, J., & Schiele, B. (2008). Functional object class detection based on learned affordance cues. In International conference on computer vision systems (pp. 435–444). Springer.

  • Sung, F., Yang, Y., Zhang, L., Xiang, T., Torr, P. H., & Hospedales, T. M. (2018). Learning to compare: Relation network for few-shot learning. In The IEEE conference on computer vision and pattern recognition (CVPR) (pp. 1199–1208).

  • Thermos, S., Papadopoulos, G. T., Daras, P., & Potamianos, G. (2017). Deep affordance-grounded sensorimotor object recognition. In The IEEE conference on computer vision and pattern recognition (pp. 6167–6175).

  • Tian, Z., Zhao, H., Shu, M., Yang, Z., Li, R., & Jia, J. (2020). Prior guided feature enrichment network for few-shot segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI).

  • Ugur, E., Szedmak, S., & Piater, J. (2014). Bootstrapping paired-object affordance learning with learned single-affordance features. In International conference on development and learning and on epigenetic robotics (pp. 476–481). IEEE.

  • Vinyals, O., Blundell, C., Lillicrap, T., Wierstra, D., et al. (2016). Matching networks for one shot learning. Advances in Neural Information Processing Systems, 29, 3630–3638.

  • Vu, T. H., Olsson, C., Laptev, I., Oliva, A., & Sivic, J. (2014). Predicting actions from static scenes. In The European conference on computer vision (ECCV) (pp. 421–436).

  • Wang, H., Yang, Y., Cao, X., Zhen, X., Snoek, C., & Shao, L. (2021a). Variational prototype inference for few-shot semantic segmentation. In The IEEE winter conference on applications of computer vision (WACV) (pp. 525–534)

  • Wang, J., Sun, K., Cheng, T., Jiang, B., Deng, C., Zhao, Y., Liu, D., Mu, Y., Tan, M., & Wang, X., et al. (2020). Deep high-resolution representation learning for visual recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI).

  • Wang, W., Xie, E., Li, X., Fan, D. P., Song, K., Liang, D., Lu, T., Luo, P., & Shao, L. (2021b). Pyramid vision transformer: A versatile backbone for dense prediction without convolutions. In The IEEE international conference on computer vision (ICCV) (pp. 568–578).

  • Wang, X., Girdhar, R., & Gupta, A. (2017). Binge watching: Scaling affordance learning from sitcoms. In The IEEE conference on computer vision and pattern recognition (CVPR) (pp. 2596–2605).

  • Wang, Y. X., & Hebert, M. (2016). Learning to learn: Model regression networks for easy small sample learning. In The European conference on computer vision (ECCV) (pp. 616–634).

  • Wei, P., Xie, D., Zheng, N., & Zhu, S. C. (2017). Inferring human attention by learning latent intentions. In International Joint Conference on Artificial Intelligence (IJCAI) (pp. 1297–1303).

  • Wu, P., Zhai, W., & Cao, Y. (2022). Background activation suppression for weakly supervised object localization. In The IEEE conference on computer vision and pattern recognition (CVPR)

  • Wu, S., Yang, J., Wang, X., & Li, X. (2019a). Iou-balanced loss functions for single-stage object detection. arXiv preprint arXiv:1908.05641

  • Wu, Z., Su, L., & Huang, Q. (2019b). Cascaded partial decoder for fast and accurate salient object detection. In The IEEE conference on computer vision and pattern recognition (CVPR)

  • Xu, B., Li, J., Wong, Y., Zhao, Q., & Kankanhalli, M. S. (2019). Interact as you intend: Intention-driven human-object interaction detection. IEEE Transactions on Multimedia (TMM), 22(6), 1423–1432.

  • Xu, Y., Zhang, Q., Zhang, J., & Tao, D. (2021). Vitae: Vision transformer advanced by exploring intrinsic inductive bias. In Conference on neural information processing systems (NeurIPS), 34.

  • Yamanobe, N., Wan, W., Ramirez-Alpizar, I. G., Petit, D., Tsuji, T., Akizuki, S., Hashimoto, M., Nagata, K., & Harada, K. (2017). A brief review of affordance in robotic manipulation research. Advanced Robotics, 31(19–20), 1086–1101.

  • Yan, S., Xiong, Y., & Lin, D. (2018). Spatial temporal graph convolutional networks for skeleton-based action recognition. In The AAAI conference on artificial intelligence (AAAI)

  • Zhang, C., Lin, G., Liu, F., Yao, R., & Shen, C. (2019). Canet: Class-agnostic segmentation networks with iterative refinement and attentive few-shot learning. In The IEEE conference on computer vision and pattern recognition (CVPR) (pp. 5217–5226).

  • Zhang, J., & Tao, D. (2020). Empowering things with intelligence: A survey of the progress, challenges, and opportunities in artificial intelligence of things. IEEE Internet of Things Journal, 8, 7789–7817.

  • Zhang, J., Chen, Z., & Tao, D. (2021). Towards high performance human keypoint detection. International Journal of Computer Vision (IJCV), 129, 1–24.

  • Zhang, Q., Xu, Y., Zhang, J., & Tao, D. (2022). Vitaev2: Vision transformer advanced by exploring inductive bias for image recognition and beyond. arXiv preprint arXiv:2202.10108

  • Zhao, H., Shi, J., Qi, X., Wang, X., & Jia, J. (2017). Pyramid scene parsing network. In The IEEE conference on computer vision and pattern recognition (CVPR)

  • Zhao, J. X., Liu, J. J., Fan, D. P., Cao, Y., Yang, J., & Cheng, M. M. (2019). Egnet: Edge guidance network for salient object detection. In The IEEE International Conference on Computer Vision (ICCV)

  • Zhao, X., Cao, Y., & Kang, Y. (2020). Object affordance detection with relationship-aware network. Neural Computing and Applications, 32(18), 14321–14333.

  • Zhong, X., Ding, C., Qu, X., & Tao, D. (2021). Polysemy deciphering network for robust human-object interaction detection. International Journal of Computer Vision (IJCV), 129(6), 1910–1929.

  • Zhou, B., Zhao, H., Puig, X., Fidler, S., Barriuso, A., & Torralba, A. (2017). Scene parsing through ade20k dataset. In The IEEE conference on computer vision and pattern recognition (CVPR)

  • Zhu, K., Zhai, W., Zha, Z. J., & Cao, Y. (2019). One-shot texture retrieval with global context metric. In International joint conference on artificial intelligence (IJCAI).

  • Zhu, K., Zhai, W., & Cao, Y. (2020). Self-supervised tuning for few-shot segmentation. In International joint conference on artificial intelligence (IJCAI).

  • Zhu, Y., Fathi, A., & Fei-Fei, L. (2014). Reasoning about object affordances in a knowledge base representation. In Proceedings of the European conference on computer vision (ECCV) (pp. 408–424).

  • Zhu, Y., Zhao, Y., & Zhu, S. C. (2015). Understanding tools: Task-oriented object modeling, learning and recognition. In The IEEE conference on computer vision and pattern recognition (CVPR) (pp. 2855–2864).

Acknowledgements

This work was supported by the National Key R&D Program of China under Grant 2020AAA0105701 and the National Natural Science Foundation of China (NSFC) under Grant 61872327. Dr. Jing Zhang is supported by the ARC Project FL-170100117.

Author information

Corresponding author

Correspondence to Yang Cao.

Additional information

Communicated by Christoph H. Lampert.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhai, W., Luo, H., Zhang, J. et al. One-Shot Object Affordance Detection in the Wild. Int J Comput Vis 130, 2472–2500 (2022). https://doi.org/10.1007/s11263-022-01642-4
