Abstract
This article introduces a simple memory module that effectively improves the reading comprehension ability of the BERT model. We view a reading comprehension model as analogous to the human brain: the region responsible for memory lies in the hippocampus, while the regions responsible for reasoning lie in the prefrontal and parietal cortex, so a reading comprehension model should likewise have separate components for memory and analysis. We therefore add a memory module to the BERT model. After the input passes through the encoder, it enters the memory module, which searches for similar vectors; the module assists the model in understanding and answering questions, and contrastive learning is used to obtain the sentence embeddings. We evaluate on the CoQA dataset, whose dialogues are close to everyday human conversation, whose questions and answers are natural, and which covers seven different domains. Automatic and manual evaluations show that the model with the memory module achieves higher accuracy than the original model, generalizes better, is less prone to overfitting, and is more efficient in multi-domain learning.
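The abstract describes the mechanism only at a high level. The snippet below is a minimal, illustrative sketch (not the authors' implementation) of the kind of lookup it describes: a bank of sentence embeddings produced by the encoder, queried by cosine similarity to retrieve the most similar stored vectors. The class name MemoryModule, the 768-dimensional BERT embeddings, and the use of PyTorch are assumptions made for illustration.

```python
# Minimal sketch of a similarity-based memory lookup; names and dimensions
# are illustrative assumptions, not the paper's actual code.
import torch
import torch.nn.functional as F


class MemoryModule:
    """Stores sentence embeddings and returns the most similar entries."""

    def __init__(self, dim: int):
        self.keys = torch.empty(0, dim)  # stored sentence embeddings
        self.values = []                 # associated passages / dialogue turns

    def write(self, embedding: torch.Tensor, value):
        # Normalise so that a dot product equals cosine similarity.
        self.keys = torch.cat([self.keys, F.normalize(embedding, dim=-1)])
        self.values.append(value)

    def read(self, query: torch.Tensor, top_k: int = 3):
        # Compare the encoder output against all stored vectors
        # and return the top-k most similar entries with their scores.
        sims = F.normalize(query, dim=-1) @ self.keys.T  # shape (1, N)
        scores, idx = sims.squeeze(0).topk(min(top_k, len(self.values)))
        return [(self.values[i], s.item()) for i, s in zip(idx.tolist(), scores)]


# Usage: embeddings would come from the BERT encoder (e.g. the [CLS] vector).
memory = MemoryModule(dim=768)
memory.write(torch.randn(1, 768), "previous dialogue turn")
print(memory.read(torch.randn(1, 768), top_k=1))
```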
Acknowledgements
This work is supported by Tianjin “Project + Team” Key Training Project under Grant No. XC202022.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Zhang, R., Wang, X. (2023). A Simple Memory Module on Reading Comprehension. In: Tanveer, M., Agarwal, S., Ozawa, S., Ekbal, A., Jatowt, A. (eds) Neural Information Processing. ICONIP 2022. Communications in Computer and Information Science, vol 1792. Springer, Singapore. https://doi.org/10.1007/978-981-99-1642-9_48
DOI: https://doi.org/10.1007/978-981-99-1642-9_48
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-1641-2
Online ISBN: 978-981-99-1642-9