Abstract
Scene understanding is an essential and challenging task in computer vision. Scene graphs have received increasing attention because they provide a visually-grounded graphical structure of an image and an explicit grounding of visual concepts. Previous works commonly obtain scene graphs either from ground-truth annotations or by generating them from the target images. However, drawing a proper scene graph for image retrieval, image generation, and multi-modal applications is difficult. Conventional scene graph annotation interfaces are not easy to use, and their results are hard to revise. Automatic scene graph generation methods based on deep neural networks focus only on objects and relationships while disregarding attributes. In this work, we propose SGDraw, a scene graph drawing interface that uses an object-oriented representation to help users interactively draw and edit scene graphs. SGDraw provides a web-based scene graph annotation and creation tool for scene understanding applications. To verify the effectiveness of the proposed interface, we conducted a comparison study with the conventional tool and a user experience study. The results show that SGDraw can help users create scene graphs with richer details that describe images more accurately than traditional bounding box annotations. We believe that SGDraw can be useful in various vision tasks, such as image generation and retrieval. The project source code is available at https://github.com/zty0304/SGDraw.
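To make the idea of an object-oriented scene graph concrete, the following is a minimal sketch of how objects, attributes, and relationships might be stored and edited. It is an illustration only: the class and method names (`SceneObject`, `Relationship`, `SceneGraph`, `add_object`, `relate`, `remove_object`) are hypothetical and are not taken from the SGDraw source code.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class SceneObject:
    """A node of the graph: an object together with its attributes."""
    name: str
    attributes: List[str] = field(default_factory=list)


@dataclass
class Relationship:
    """A directed, labeled edge: subject --predicate--> object."""
    subject: str
    predicate: str
    object: str


@dataclass
class SceneGraph:
    """Hypothetical container; not SGDraw's actual data model."""
    objects: Dict[str, SceneObject] = field(default_factory=dict)
    relationships: List[Relationship] = field(default_factory=list)

    def add_object(self, name: str, *attributes: str) -> SceneObject:
        # Reuse the existing node if present, then attach any new attributes.
        obj = self.objects.setdefault(name, SceneObject(name))
        for attribute in attributes:
            if attribute not in obj.attributes:
                obj.attributes.append(attribute)
        return obj

    def relate(self, subject: str, predicate: str, obj: str) -> Relationship:
        # Make sure both endpoints exist before adding the edge.
        self.add_object(subject)
        self.add_object(obj)
        rel = Relationship(subject, predicate, obj)
        self.relationships.append(rel)
        return rel

    def remove_object(self, name: str) -> None:
        # Deleting a node also deletes every edge that touches it.
        self.objects.pop(name, None)
        self.relationships = [
            r for r in self.relationships if name not in (r.subject, r.object)
        ]

    def triples(self) -> List[Tuple[str, str, str]]:
        return [(r.subject, r.predicate, r.object) for r in self.relationships]


# Example: "a tall man riding a brown horse".
graph = SceneGraph()
graph.add_object("man", "tall")
graph.add_object("horse", "brown")
graph.relate("man", "riding", "horse")
```

In this sketch, attributes live on the object they describe rather than on separate nodes, so an edit such as removing an object also removes its attributes and any relationships attached to it in one step.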
Acknowledgements
We thank all the participants in our user study. This work was supported by a JAIST Research Grant and by JSPS KAKENHI JP20K19845, Japan.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zhang, T., Du, X., Chang, CM., Yang, X., Xie, H. (2023). SGDraw: Scene Graph Drawing Interface Using Object-Oriented Representation. In: Mori, H., Asahi, Y., Coman, A., Vasilache, S., Rauterberg, M. (eds) HCI International 2023 – Late Breaking Papers. HCII 2023. Lecture Notes in Computer Science, vol 14056. Springer, Cham. https://doi.org/10.1007/978-3-031-48044-7_16
DOI: https://doi.org/10.1007/978-3-031-48044-7_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-48043-0
Online ISBN: 978-3-031-48044-7
eBook Packages: Computer Science, Computer Science (R0)