SGDraw: Scene Graph Drawing Interface Using Object-Oriented Representation

  • Conference paper
HCI International 2023 – Late Breaking Papers (HCII 2023)

Abstract

Scene understanding is an essential and challenging task in computer vision. The scene graph, which provides a visually grounded graphical structure of an image, has received increasing attention because it offers explicit grounding of visual concepts. Previous works commonly obtain scene graphs from ground-truth annotations or generate them from the target images. However, drawing a proper scene graph for image retrieval, image generation, and multi-modal applications is difficult. Conventional scene graph annotation interfaces are not easy to use, and their results are hard to revise. Automatic scene graph generation methods based on deep neural networks focus only on objects and relationships while disregarding attributes. In this work, we propose SGDraw, a scene graph drawing interface that uses an object-oriented representation to help users interactively draw and edit scene graphs. SGDraw provides a web-based scene graph annotation and creation tool for scene understanding applications. To verify the effectiveness of the proposed interface, we conducted a comparison study with the conventional tool and a user experience study. The results show that SGDraw can help create scene graphs with richer details and describe images more accurately than traditional bounding box annotations. We believe the proposed SGDraw can be useful in various vision tasks, such as image generation and retrieval. The project source code is available at https://github.com/zty0304/SGDraw.
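
As a rough illustration of what an object-oriented scene graph might look like, the sketch below models objects that own their attributes and relationships that connect pairs of objects. It is not taken from the SGDraw source code: the class names (`SceneObject`, `SceneGraph`), the method names, and the integer-id scheme are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class SceneObject:
    """An object node that owns its attributes (hypothetical structure)."""
    name: str
    attributes: list[str] = field(default_factory=list)


@dataclass
class SceneGraph:
    """Objects keyed by id, plus (subject_id, predicate, object_id) edges."""
    objects: dict[int, SceneObject] = field(default_factory=dict)
    relationships: list[tuple[int, str, int]] = field(default_factory=list)
    _next_id: int = 0

    def add_object(self, name: str, *attributes: str) -> int:
        """Create an object with optional attributes and return its id."""
        obj_id = self._next_id
        self._next_id += 1
        self.objects[obj_id] = SceneObject(name, list(attributes))
        return obj_id

    def relate(self, subject_id: int, predicate: str, object_id: int) -> None:
        """Add a directed relationship between two existing objects."""
        if subject_id not in self.objects or object_id not in self.objects:
            raise KeyError("both endpoints must be existing objects")
        self.relationships.append((subject_id, predicate, object_id))

    def remove_object(self, obj_id: int) -> None:
        """Editing step: deleting an object also drops its relationships."""
        del self.objects[obj_id]
        self.relationships = [
            r for r in self.relationships if obj_id not in (r[0], r[2])
        ]

    def triples(self) -> list[tuple[str, str, str]]:
        """Return relationships as (subject, predicate, object) names."""
        return [
            (self.objects[s].name, p, self.objects[o].name)
            for s, p, o in self.relationships
        ]


# Example: "a tall, smiling man riding a brown horse"
graph = SceneGraph()
man = graph.add_object("man", "tall", "smiling")
horse = graph.add_object("horse", "brown")
graph.relate(man, "riding", horse)
print(graph.triples())
```

Keeping attributes on the object itself, rather than as separate graph nodes, means that an edit such as removing an object takes its attributes and relationships with it. This is one plausible reading of the object-oriented representation described above, not a statement of how SGDraw implements it.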

Acknowledgements

We thank all the participants in our user study. This work was supported by a JAIST Research Grant and JSPS KAKENHI JP20K19845, Japan.

Author information

Corresponding author

Correspondence to Haoran Xie.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Zhang, T., Du, X., Chang, CM., Yang, X., Xie, H. (2023). SGDraw: Scene Graph Drawing Interface Using Object-Oriented Representation. In: Mori, H., Asahi, Y., Coman, A., Vasilache, S., Rauterberg, M. (eds) HCI International 2023 – Late Breaking Papers. HCII 2023. Lecture Notes in Computer Science, vol 14056. Springer, Cham. https://doi.org/10.1007/978-3-031-48044-7_16

  • DOI: https://doi.org/10.1007/978-3-031-48044-7_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-48043-0

  • Online ISBN: 978-3-031-48044-7

  • eBook Packages: Computer Science (R0)
