CSDQA: Diagram Question Answering in Computer Science

Wang, Shaowei; Zhang, Lingling; Yang, Yi; Hu, Xin; Qin, Tao; Wei, Bifan; Liu, Jun

doi:10.1007/978-981-16-6471-7_21

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1466)

Included in the following conference series:

China Conference on Knowledge Graph and Semantic Computing

Abstract

Visual Question Answering (VQA) has been a research focus of the computer vision community in recent years. Most VQA work has been developed and evaluated on images of natural scenes. However, Diagram Question Answering (DQA), the task of answering natural language questions based on diagrams, has received little attention. Diagrams are a more abstract carrier of knowledge and an important component of multi-modal knowledge graphs, so research on them is of great significance for understanding the cognitive behavior of learners. To address the scarcity of such data, this paper proposes the Computer Science Diagram Question Answering (CSDQA) dataset, which is the first dataset of geometric-type diagrams in this field. The dataset contains 1,294 diagrams with rich fine-grained annotations and 3,494 question-answer pairs, including multiple-choice and true-or-false questions at two levels of difficulty. We have open-sourced all the data at http://zscl.xjtudlc.com:888/CSDQA, hoping to make it convenient for researchers and to provide a high-quality data foundation for DQA.
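To make the dataset description concrete, the following is a minimal sketch of how one question-answer record could be represented and checked. The class name, field names, and the integer encoding of difficulty are illustrative assumptions, not the published CSDQA schema; only the two question types, the two difficulty levels, and the two counts come from the abstract.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical record layout: every field name here is an assumption made
# for illustration and is NOT taken from the released CSDQA files.
@dataclass
class QAPair:
    diagram_id: str
    question: str
    kind: str                            # "multiple_choice" or "true_false"
    difficulty: int                      # 1 or 2 (two levels; encoding assumed)
    answer: str
    choices: Optional[List[str]] = None  # used only by multiple-choice questions

    def __post_init__(self) -> None:
        if self.kind not in ("multiple_choice", "true_false"):
            raise ValueError(f"unknown question kind: {self.kind!r}")
        if self.difficulty not in (1, 2):
            raise ValueError("difficulty must be 1 or 2")
        if self.kind == "multiple_choice":
            # A multiple-choice answer must be one of the listed options.
            if not self.choices or self.answer not in self.choices:
                raise ValueError("answer must be one of the choices")
        elif self.answer not in ("true", "false"):
            raise ValueError("true_false answer must be 'true' or 'false'")

# Counts stated in the abstract.
NUM_DIAGRAMS = 1294
NUM_QA_PAIRS = 3494

def mean_qa_per_diagram(num_qa: int = NUM_QA_PAIRS,
                        num_diagrams: int = NUM_DIAGRAMS) -> float:
    """Average number of question-answer pairs per diagram."""
    return num_qa / num_diagrams
```

With the counts from the abstract, `mean_qa_per_diagram()` works out to roughly 2.7 question-answer pairs per diagram.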

Acknowledgment

This work was supported by the National Key Research and Development Program of China (2020AAA0108800), the National Natural Science Foundation of China (62050194, 61937001, and 61877050), the Innovative Research Group of the National Natural Science Foundation of China (61721002), the Innovation Research Team of the Ministry of Education (IRT 17R86), the Project of China Knowledge Centre for Engineering Science and Technology, the consulting research project of the Chinese Academy of Engineering “The Online and Offline Mixed Educational Service System for ‘The Belt and Road’ Training in MOOC China”, and the China Postdoctoral Science Foundation (2020M683493).

Author information

Authors and Affiliations

Xi’an Jiaotong University, Shaanxi, China
Shaowei Wang, Lingling Zhang, Yi Yang, Xin Hu, Tao Qin, Bifan Wei & Jun Liu

Corresponding author

Correspondence to Lingling Zhang.

Editor information

Editors and Affiliations

Harbin Institute of Technology, Harbin, China
Bing Qin
Peking University, Beijing, China
Zhi Jin
Tongji University, Shanghai, China
Haofen Wang
University of Edinburgh, Edinburgh, UK
Jeff Pan
University of South China, Hengyang, China
Yongbin Liu
Chinese Academy of Sciences, Beijing, China
Bo An

About this paper

Cite this paper

Wang, S. et al. (2021). CSDQA: Diagram Question Answering in Computer Science. In: Qin, B., Jin, Z., Wang, H., Pan, J., Liu, Y., An, B. (eds) Knowledge Graph and Semantic Computing: Knowledge Graph Empowers New Infrastructure Construction. CCKS 2021. Communications in Computer and Information Science, vol 1466. Springer, Singapore. https://doi.org/10.1007/978-981-16-6471-7_21

DOI: https://doi.org/10.1007/978-981-16-6471-7_21
Published: 28 October 2021
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-6470-0
Online ISBN: 978-981-16-6471-7
eBook Packages: Computer Science, Computer Science (R0)
