Abstract
This article introduces a simple memory module that effectively improves the reading comprehension ability of the BERT model. We view a reading comprehension model as analogous to the human brain: the region responsible for memory lies in the hippocampus, while the regions responsible for reasoning lie in the prefrontal and parietal cortex, so a reading comprehension model should likewise have separate components for memory and analysis. We therefore add a memory module to the BERT model. After the input passes through the encoder, it enters the memory module, which searches for similar vectors; the module assists the model in understanding and answering questions, and contrastive learning is used to obtain the sentence embeddings. We evaluate on the CoQA dataset, whose dialogues are close to everyday human conversation, whose questions and answers are natural, and which covers seven different domains. Automatic and manual evaluations show that the model with the memory module achieves higher accuracy than the original model, generalizes better, is less prone to overfitting, and is more efficient in multi-domain learning.
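The abstract describes the mechanism only at a high level. The snippet below is a minimal, illustrative sketch (not the authors' implementation) of the kind of lookup it describes: a bank of sentence embeddings produced by the encoder, queried by cosine similarity to retrieve the most similar stored vectors. The class name MemoryModule, the 768-dimensional BERT embeddings, and the use of PyTorch are assumptions made for illustration.

```python
# Minimal sketch of a similarity-based memory lookup; names and dimensions
# are illustrative assumptions, not the paper's actual code.
import torch
import torch.nn.functional as F


class MemoryModule:
    """Stores sentence embeddings and returns the most similar entries."""

    def __init__(self, dim: int):
        self.keys = torch.empty(0, dim)  # stored sentence embeddings
        self.values = []                 # associated passages / dialogue turns

    def write(self, embedding: torch.Tensor, value):
        # Normalise so that a dot product equals cosine similarity.
        self.keys = torch.cat([self.keys, F.normalize(embedding, dim=-1)])
        self.values.append(value)

    def read(self, query: torch.Tensor, top_k: int = 3):
        # Compare the encoder output against all stored vectors
        # and return the top-k most similar entries with their scores.
        sims = F.normalize(query, dim=-1) @ self.keys.T  # shape (1, N)
        scores, idx = sims.squeeze(0).topk(min(top_k, len(self.values)))
        return [(self.values[i], s.item()) for i, s in zip(idx.tolist(), scores)]


# Usage: embeddings would come from the BERT encoder (e.g. the [CLS] vector).
memory = MemoryModule(dim=768)
memory.write(torch.randn(1, 768), "previous dialogue turn")
print(memory.read(torch.randn(1, 768), top_k=1))
```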
Acknowledgements
This work is supported by Tianjin “Project + Team” Key Training Project under Grant No. XC202022.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Zhang, R., Wang, X. (2023). A Simple Memory Module on Reading Comprehension. In: Tanveer, M., Agarwal, S., Ozawa, S., Ekbal, A., Jatowt, A. (eds) Neural Information Processing. ICONIP 2022. Communications in Computer and Information Science, vol 1792. Springer, Singapore. https://doi.org/10.1007/978-981-99-1642-9_48
DOI: https://doi.org/10.1007/978-981-99-1642-9_48
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-1641-2
Online ISBN: 978-981-99-1642-9