SGDraw: Scene Graph Drawing Interface Using Object-Oriented Representation

Zhang, Tianyu; Du, Xusheng; Chang, Chia-Ming; Yang, Xi; Xie, Haoran

doi:10.1007/978-3-031-48044-7_16

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14056)

Included in the following conference series:

International Conference on Human-Computer Interaction

Abstract

Scene understanding is an essential and challenging task in computer vision. To provide a visually grounded graphical structure of an image, the scene graph has received increased attention because it offers explicit grounding of visual concepts. Previous works commonly obtain scene graphs from ground-truth annotations or generate them from the target images. However, drawing a proper scene graph for image retrieval, image generation, and multi-modal applications is difficult. Conventional scene graph annotation interfaces are not easy to use, and their results are hard to revise. Automatic scene graph generation methods based on deep neural networks focus only on objects and relationships while disregarding attributes. In this work, we propose SGDraw, a scene graph drawing interface that uses an object-oriented representation to help users interactively draw and edit scene graphs. SGDraw provides a web-based scene graph annotation and creation tool for scene understanding applications. To verify the effectiveness of the proposed interface, we conducted a comparison study with the conventional tool and a user experience study. The results show that SGDraw can help create scene graphs with richer details and describe the images more accurately than traditional bounding box annotations. We believe the proposed SGDraw can be useful in various vision tasks, such as image generation and retrieval. The project source code is available at https://github.com/zty0304/SGDraw.
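To make the idea of an object-oriented scene graph concrete, the following is a minimal sketch of one possible data structure, in which each object carries its own attributes and relationships connect objects by name. It is an illustration only and is not taken from the SGDraw source code; the class names (SceneObject, Relationship, SceneGraph), the method names, and the example labels are all assumptions introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class SceneObject:
    """A graph node: an object label plus the attributes attached to it."""
    name: str
    attributes: list[str] = field(default_factory=list)


@dataclass
class Relationship:
    """A directed edge: subject --predicate--> obj, referring to objects by name."""
    subject: str
    predicate: str
    obj: str


@dataclass
class SceneGraph:
    """Hypothetical container; not the representation used by SGDraw itself."""
    objects: dict[str, SceneObject] = field(default_factory=dict)
    relationships: list[Relationship] = field(default_factory=list)

    def add_object(self, name: str, *attributes: str) -> SceneObject:
        # Create the object if needed, then attach any new attributes to it.
        node = self.objects.setdefault(name, SceneObject(name))
        for attribute in attributes:
            if attribute not in node.attributes:
                node.attributes.append(attribute)
        return node

    def relate(self, subject: str, predicate: str, obj: str) -> None:
        # Both endpoints must already exist as objects in the graph.
        for name in (subject, obj):
            if name not in self.objects:
                raise KeyError(f"unknown object: {name}")
        self.relationships.append(Relationship(subject, predicate, obj))

    def rename_object(self, old: str, new: str) -> None:
        # Editing an object updates every relationship that mentions it.
        if new in self.objects:
            raise ValueError(f"object already exists: {new}")
        node = self.objects.pop(old)
        node.name = new
        self.objects[new] = node
        for rel in self.relationships:
            if rel.subject == old:
                rel.subject = new
            if rel.obj == old:
                rel.obj = new

    def triples(self) -> list[tuple[str, str, str]]:
        # Flatten the edges into (subject, predicate, object) triples.
        return [(r.subject, r.predicate, r.obj) for r in self.relationships]


# Example labels are made up for illustration.
graph = SceneGraph()
graph.add_object("man", "tall")
graph.add_object("horse", "brown")
graph.relate("man", "riding", "horse")
graph.rename_object("man", "rider")
print(graph.triples())
```

Keeping attributes on the object itself, rather than as separate nodes, means an edit such as renaming touches one object and its edges, which is the kind of revision the abstract describes as hard in conventional annotation tools.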

Acknowledgements

We thank all the participants in our user study. This work was supported by a JAIST Research Grant and JSPS KAKENHI JP20K19845, Japan.

Author information

Authors and Affiliations

Japan Advanced Institute of Science and Technology, Ishikawa, Japan
Tianyu Zhang, Xusheng Du & Haoran Xie
The University of Tokyo, Tokyo, Japan
Chia-Ming Chang
Jilin University, Jilin, China
Xi Yang

Corresponding author

Correspondence to Haoran Xie.

Editor information

Editors and Affiliations

Tokyo City University, Tokyo, Japan
Hirohiko Mori
Tokyo University of Science, Tokyo, Japan
Yumi Asahi
University of Bucharest, Bucharest, Romania
Adela Coman
University of Tsukuba, Tsukuba, Japan
Simona Vasilache
Eindhoven University of Technology, Eindhoven, The Netherlands
Matthias Rauterberg

Cite this paper

Zhang, T., Du, X., Chang, CM., Yang, X., Xie, H. (2023). SGDraw: Scene Graph Drawing Interface Using Object-Oriented Representation. In: Mori, H., Asahi, Y., Coman, A., Vasilache, S., Rauterberg, M. (eds) HCI International 2023 – Late Breaking Papers. HCII 2023. Lecture Notes in Computer Science, vol 14056. Springer, Cham. https://doi.org/10.1007/978-3-031-48044-7_16

DOI: https://doi.org/10.1007/978-3-031-48044-7_16
Published: 21 November 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-48043-0
Online ISBN: 978-3-031-48044-7
eBook Packages: Computer Science, Computer Science (R0)
