Abstract
The emergence of the HotpotQA dataset addressed the lack of training data for multi-hop question answering. Building on the strengths of this dataset, we propose a novel model for multi-hop question answering, the ELECTRA-based Graph Network model (EGN). First, the method correlates questions with contextual paragraphs and external Wikipedia data to naturally obtain next-hop connected paragraphs, and initializes the text representations with a pre-trained context encoder, Efficiently Learning an Encoder that Classifies Token Replacements Accurately (ELECTRA). Second, it iteratively updates text features at different levels with a modified Graph Attention Network (GATv2). By linking more sensible clues through the iterative computation of GATv2 and obtaining better data representations from ELECTRA, EGN achieves comparable results in less time. In experiments under the FullWiki setting of HotpotQA, EGN performed well, achieving a Joint EM/F1 score of 47.35/74.62 on the validation set.
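For readers who know the original GAT but not GATv2, the following minimal PyTorch sketch shows the single-head "dynamic" attention update of Brody et al. (2022) that the abstract says EGN iterates over its graph. It is an illustration under stated assumptions (a dense 0/1 adjacency matrix with self-loops; the names GATv2Layer, W_l, W_r, and adj are ours), not the authors' EGN implementation.

```python
# Minimal single-head GATv2 sketch (Brody et al., 2022); assumes a dense
# 0/1 adjacency with self-loops. Names here are illustrative, not EGN code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GATv2Layer(nn.Module):
    """One GATv2 attention hop: e_ij = a^T LeakyReLU(W_l h_i + W_r h_j)."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.W_l = nn.Linear(in_dim, out_dim, bias=False)  # transforms the attending node h_i
        self.W_r = nn.Linear(in_dim, out_dim, bias=False)  # transforms the neighbor h_j
        self.a = nn.Linear(out_dim, 1, bias=False)         # scoring vector a

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # h: (n, in_dim) node features; adj: (n, n) 0/1 adjacency with self-loops.
        g_l, g_r = self.W_l(h), self.W_r(h)
        # Pairwise scores: LeakyReLU is applied *before* the scoring vector a,
        # which is what makes GATv2's attention "dynamic", unlike GATv1.
        e = self.a(F.leaky_relu(g_l.unsqueeze(1) + g_r.unsqueeze(0))).squeeze(-1)
        e = e.masked_fill(adj == 0, float("-inf"))   # attend along graph edges only
        alpha = torch.softmax(e, dim=-1)             # normalize over each node's neighbors
        return alpha @ g_r                           # h_i' = sum_j alpha_ij (W_r h_j)

# Toy usage: four paragraph nodes, chain-connected, with self-loops.
n, d = 4, 16
adj = torch.eye(n) + torch.diag(torch.ones(n - 1), 1) + torch.diag(torch.ones(n - 1), -1)
h = torch.randn(n, d)                                # stand-in for ELECTRA outputs
layer = GATv2Layer(d, d)
for _ in range(2):                                   # iterate the update, as the abstract describes
    h = layer(h, adj)
print(h.shape)  # torch.Size([4, 16])
```

Applying the layer repeatedly, as in the toy loop above, mirrors the iterative feature updates the abstract describes; in EGN proper the node features would come from ELECTRA rather than random initialization.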
Data Availability
All datasets used in this work were gathered from publicly available sources.
Notes
HotpotQA dataset: https://hotpotqa.github.io/
References
Asai, A., Hashimoto, K., & Hajishirzi, H., et al. (2020). Learning to retrieve reasoning paths over Wikipedia graph for question answering. https://doi.org/10.48550/arXiv.1911.10470
Bécigneul, G., Ganea, O., & Chen, B., et al. (2020). Optimal transport graph neural networks. https://doi.org/10.48550/arXiv.2006.04804
Brody, S., Alon, U., & Yahav, E. (2022). How attentive are graph attention networks? In Proceedings of 2022 ICLR. https://doi.org/10.48550/arXiv.2105.14491
Chen, Z., Li, L., & Bruna, J. (2019). Supervised community detection with line graph neural networks. In Proceedings of 2019 ICLR. https://doi.org/10.48550/arXiv.1705.08415
Clark, K., Luong, M., & Le, Q. V., et al. (2020). ELECTRA: pre-training text encoders as discriminators rather than generators. In Proceedings of 2020 ICLR. https://doi.org/10.48550/arXiv.2003.10555
Devlin, J., Chang, M., & Lee, K., et al. (2019). BERT: pre-training of deep bidirectional transformers for language understanding. In Proceedings of 2019 NAACL-HLT pp. 4171–4186. https://doi.org/10.18653/v1/N19-1423
Ding, M., Zhou, C., & Chen, Q., et al. (2019). Cognitive graph for multi-hop reading comprehension at scale. In Proceedings of 2019 ACL pp. 2694–2703. https://doi.org/10.18653/v1/P19-1259
Fang, Y., Sun, S., & Gan, Z., et al. (2020). Hierarchical graph network for multi-hop question answering. In Proceedings of 2020 EMNLP pp. 8823–8838. https://doi.org/10.18653/v1/2020.emnlp-main.710
Feldman, Y., & El-Yaniv, R. (2019). Multi-hop paragraph retrieval for open-domain question answering. In Proceedings of 2019 ACL pp. 2296–2309. https://doi.org/10.18653/v1/P19-1222
Huang, Y., & Yang, M. (2021). Breadth first reasoning graph for multi-hop question answering. In Proceedings of 2021 NAACL-HLT pp. 5810–5821. https://doi.org/10.18653/v1/2021.naacl-main.464
Jiang, Y., & Bansal, M. (2019). Self-assembling modular networks for interpretable multi-hop reasoning. In Proceedings of 2019 EMNLP-IJCNLP pp. 4473–4483. https://doi.org/10.18653/v1/D19-1455
Lan, Z., Chen, M., & Goodman, S., et al. (2020). ALBERT: A lite BERT for self-supervised learning of language representations. In Proceedings of 2020 ICLR. https://doi.org/10.48550/arXiv.1909.11942
Liu, Y., Ott, M., & Goyal, N., et al. (2019). RoBERTa: A robustly optimized BERT pretraining approach. CoRR. https://doi.org/10.48550/arXiv.1907.11692. arXiv:1907.11692
Mikolov, T., Chen, K., & Corrado, G., et al. (2013a). Efficient estimation of word representations in vector space. In Proceedings of 2013 ICLR. https://doi.org/10.48550/arXiv.1301.3781
Mikolov, T., Sutskever, I., & Chen, K., et al. (2013b). Distributed representations of words and phrases and their compositionality. In Proceedings of 2013 NIPS pp. 3111–3119. https://doi.org/10.48550/arXiv.1310.4546
Min, S., Zhong, V., & Zettlemoyer, L., et al. (2019). Multi-hop reading comprehension through question decomposition and rescoring. In Proceedings of 2019 ACL pp. 6097–6109. https://doi.org/10.18653/v1/P19-1613
Nie, Y., Wang, S., & Bansal, M. (2019). Revealing the importance of semantic retrieval for machine reading at scale. In Proceedings of 2019 EMNLP-IJCNLP pp. 2553–2566. https://doi.org/10.18653/v1/D19-1258
Nishida, K., Nishida, K., & Nagata, M., et al. (2019). Answering while summarizing: Multi-task learning for multi-hop QA with evidence extraction. In Proceedings of 2019 ACL pp. 2335–2345. https://doi.org/10.18653/v1/P19-1225
Nogueira, R. F., & Cho, K. (2019). Passage re-ranking with BERT. CoRR. https://doi.org/10.48550/arXiv.1901.04085. arXiv:1901.04085
Pennington, J., Socher, R., & Manning, C. D. (2014). GloVe: Global vectors for word representation. In Proceedings of 2014 EMNLP pp. 1532–1543. https://doi.org/10.3115/v1/D14-1162
Qiu, J., Tang, J., & Ma, H., et al. (2018). DeepInf: Social influence prediction with deep learning. In Proceedings of 2018 SIGKDD pp. 2110–2119. https://doi.org/10.1145/3219819.3220077
Qiu, L., Xiao, Y., & Qu, Y., et al. (2019). Dynamically fused graph network for multi-hop reasoning. In Proceedings of 2019 ACL pp. 6140–6150. https://doi.org/10.18653/v1/P19-1617
Qiu, X., Sun, T., Xu, Y., et al. (2020). Pre-trained models for natural language processing: A survey. Science China Technological Sciences, 63(10), 1872–1897. https://doi.org/10.1007/s11431-020-1647-3
Rajpurkar, P., Zhang, J., & Lopyrev, K., et al. (2016). SQuAD: 100,000+ questions for machine comprehension of text. In Proceedings of 2016 EMNLP pp. 2383–2392. https://doi.org/10.18653/v1/d16-1264
Seo, M. J., Kembhavi, A., & Farhadi, A., et al. (2017). Bidirectional attention flow for machine comprehension. In Proceedings of 2017 ICLR. https://doi.org/10.48550/arXiv.1611.01603
Thorne, J., Vlachos, A., & Cocarascu, O., et al. (2018). The fact extraction and verification (FEVER) shared task. CoRR. https://doi.org/10.18653/v1/W18-5501. arXiv:1811.10971
Velickovic, P., Cucurull, G., & Casanova, A., et al. (2018). Graph attention networks. In Proceedings of 2018 ICLR. https://doi.org/10.17863/CAM.48429
Wang, G., Ying, R., & Huang, J., et al. (2019). Improving graph attention networks with large margin-based constraints. CoRR. https://doi.org/10.48550/arXiv.1910.11945. arXiv:1910.11945
Welbl, J., Stenetorp, P., & Riedel, S. (2018). Constructing datasets for multi-hop reading comprehension across documents. Transactions of the Association for Computational Linguistics, 6, 287–302. https://doi.org/10.1162/tacl_a_00021
Wieder, O., Kohlbacher, S., Kuenemann, M., et al. (2020). A compact review of molecular property prediction with graph neural networks. Drug Discovery Today: Technologies, 37, 1–12. https://doi.org/10.1016/j.ddtec.2020.11.009
Xiong, J., Xiong, Z., Chen, K., et al. (2021). Graph neural networks for automated de novo drug design. Drug Discovery Today, 26(6), 1382–1393. https://doi.org/10.1016/j.drudis.2021.02.011
Xiong, W., Yu, M., & Guo, X., et al. (2019). Simple yet effective bridge reasoning for open-domain multi-hop question answering. In Proceedings of 2019 MRQA@EMNLP pp. 48–52. https://doi.org/10.18653/v1/D19-5806
Yang, K., Swanson, K., Jin, W., et al. (2019). Analyzing learned molecular representations for property prediction. Journal of Chemical Information and Modeling, 59(8), 3370–3388. https://doi.org/10.1021/acs.jcim.9b00237
Yang, Z., Dai, Z., & Yang, Y., et al. (2019b). XLNet: Generalized autoregressive pretraining for language understanding. In Proceedings of 2019 NeurIPS pp. 5754–5764.
Yang, Z., Qi, P., & Zhang, S., et al. (2018). HotpotQA: A dataset for diverse, explainable multi-hop question answering. In Proceedings of 2018 EMNLP pp. 2369–2380. https://doi.org/10.18653/v1/d18-1259
Zhang, X., Zhan, K., & Hu, E., et al. (2021a). Answer complex questions: Path ranker is all you need. In Proceedings of 2021 SIGIR pp. 449–458. https://doi.org/10.1145/3404835.3462942
Zhang, Y., Nie, P., & Ramamurthy, A., et al. (2021b). Answering any-hop open-domain questions with iterative document reranking. In Proceedings of 2021 SIGIR pp. 481–490. https://doi.org/10.1145/3404835.3462853
Zhao, C., Xiong, C., & Rosset, C., et al. (2020). Transformer-XH: Multi-evidence reasoning with extra hop attention. In Proceedings of 2020 ICLR.
Zhou, J., Cui, G., Hu, S., et al. (2020). Graph neural networks: A review of methods and applications. AI Open, 1, 57–81. https://doi.org/10.1016/j.aiopen.2021.01.001
Acknowledgements
The authors would like to thank the anonymous reviewers for their helpful reviews.
Funding
This work was supported by the National Natural Science Foundation of China (Grant Nos. 32071901 and 32271981) and the Anhui Provincial Key Research and Development Project (No. 2022o07020001).
Ethics declarations
Competing interests
The authors declare that they have no competing interests with any person or institution regarding the publication of this paper.
Ethical Approval and Consent to participate
Not Applicable.
Consent for publication
Not Applicable.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zhu, P., Yuan, Y. & Chen, L. ELECTRA-based graph network model for multi-hop question answering. J Intell Inf Syst 61, 819–834 (2023). https://doi.org/10.1007/s10844-023-00800-5