Relation-Aware Reasoning with Graph Convolutional Network

  • Conference paper
Image and Graphics (ICIG 2021)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12888)

Abstract

Semantic dependencies among objects are crucial for improving the performance of a recognition system. However, exploiting object-object relationships is non-trivial: objects vary in scale and location, which makes their relationships irregular. In this paper, we present a novel visual reasoning framework that incorporates both semantic and spatial relationships to improve recognition. We first construct a knowledge graph that represents the co-occurrence frequency and relative position among categories. Based on this knowledge graph, we enhance the original regional features with a Graph Convolutional Network (GCN) that encodes high-level semantic contexts. Experiments show that our framework outperforms the baselines and state-of-the-art methods on different backbones in terms of both per-instance and per-class classification accuracy.
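
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea under stated assumptions: a category graph is built from co-occurrence frequencies, and one graph convolution with the symmetric normalization of Kipf and Welling propagates features over it. The function names, the toy label lists, and the feature sizes are all hypothetical. The relative-position edges and the step that combines the graph output with regional features are omitted because the abstract does not specify them.

```python
import numpy as np


def cooccurrence_adjacency(image_labels, num_classes):
    """Symmetric matrix of co-occurrence frequencies between categories.

    image_labels: list of per-image category-index lists (hypothetical input).
    Entry (i, j) is the fraction of images that contain both i and j.
    """
    counts = np.zeros((num_classes, num_classes))
    for labels in image_labels:
        present = sorted(set(labels))
        for i in present:
            for j in present:
                if i != j:
                    counts[i, j] += 1
    return counts / max(len(image_labels), 1)


def gcn_layer(adj, feats, weight):
    """One graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + np.eye(adj.shape[0])   # add self-loops
    deg = a_hat.sum(axis=1)              # >= 1 because of the self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(norm @ feats @ weight, 0.0)


# Toy usage: 4 categories, 3 images, random 8-d node features.
rng = np.random.default_rng(0)
adj = cooccurrence_adjacency([[0, 1], [0, 1, 2], [2, 3]], num_classes=4)
node_feats = rng.normal(size=(4, 8))
weight = rng.normal(size=(8, 5))
out = gcn_layer(adj, node_feats, weight)
print(out.shape)
```

Each row of `out` is a context-aware embedding for one category. How such embeddings are merged with per-region features is a design choice this sketch leaves open.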

L. Zhou and Y. Liu—Equal contribution.

Acknowledgement

This work was supported by the National Natural Science Foundation of China (project no. 61772057), the Beijing Natural Science Foundation (4202039), support funding from the Jiangxi Research Institute of Beihang University, and the Academic Excellence Foundation of BUAA for PhD Students.

Author information

Corresponding author

Correspondence to Xiao Bai.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Zhou, L. et al. (2021). Relation-Aware Reasoning with Graph Convolutional Network. In: Peng, Y., Hu, SM., Gabbouj, M., Zhou, K., Elad, M., Xu, K. (eds) Image and Graphics. ICIG 2021. Lecture Notes in Computer Science, vol 12888. Springer, Cham. https://doi.org/10.1007/978-3-030-87355-4_5

  • DOI: https://doi.org/10.1007/978-3-030-87355-4_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-87354-7

  • Online ISBN: 978-3-030-87355-4

  • eBook Packages: Computer Science, Computer Science (R0)
