Abstract
Visual Question Answering (VQA) has been a research focus of the computer vision community in recent years. Most VQA work is developed and verified on images of natural scenes. However, Diagram Question Answering (DQA), the task of answering natural language questions based on diagrams, has received little attention. Diagrams are a more abstract carrier of knowledge and an important component of multi-modal knowledge graphs, so research on them is of great significance for understanding the cognitive behavior of learners. To address the scarcity of such data, this paper proposes the Computer Science Diagram Question Answering (CSDQA) dataset, which is the first dataset of geometric-type diagrams in this field. The dataset contains 1,294 diagrams with rich fine-grained annotations and 3,494 question-answer pairs, including multiple-choice and true-or-false questions at two levels of difficulty. We have open-sourced all the data at http://zscl.xjtudlc.com:888/CSDQA, hoping to provide convenience for researchers and to make it a high-quality data foundation for DQA.
Acknowledgment
This work was supported by the National Key Research and Development Program of China (2020AAA0108800), the National Natural Science Foundation of China (62050194, 61937001, and 61877050), the Innovative Research Group of the National Natural Science Foundation of China (61721002), the Innovation Research Team of the Ministry of Education (IRT 17R86), the Project of China Knowledge Centre for Engineering Science and Technology, the consulting research project of the Chinese Academy of Engineering “The Online and Offline Mixed Educational Service System for ‘The Belt and Road’ Training in MOOC China”, and the China Postdoctoral Science Foundation (2020M683493).
Copyright information
© 2021 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Wang, S. et al. (2021). CSDQA: Diagram Question Answering in Computer Science. In: Qin, B., Jin, Z., Wang, H., Pan, J., Liu, Y., An, B. (eds) Knowledge Graph and Semantic Computing: Knowledge Graph Empowers New Infrastructure Construction. CCKS 2021. Communications in Computer and Information Science, vol 1466. Springer, Singapore. https://doi.org/10.1007/978-981-16-6471-7_21
DOI: https://doi.org/10.1007/978-981-16-6471-7_21
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-6470-0
Online ISBN: 978-981-16-6471-7
eBook Packages: Computer Science (R0)