A Simple Memory Module on Reading Comprehension

  • Conference paper
Neural Information Processing (ICONIP 2022)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1792)

Abstract

This article introduces a simple memory module that effectively improves the reading comprehension ability of the BERT model. We view a reading comprehension model as analogous to the human brain: memory is handled by the hippocampus, while thinking is handled by the prefrontal and parietal cortices, so a reading comprehension model should likewise have separate areas for memory and for analysis. We therefore add a memory module to the BERT model. After the input passes through the encoder, it enters the memory module, which retrieves similar vectors; the memory module assists the model in understanding and answering questions, and contrastive learning is used for the sentence embeddings. We use the CoQA dataset, whose dialogues are closer to everyday human conversation, whose questions and answers are more natural, and which covers seven different domains. Automatic and manual evaluations show that the model with the memory module is more accurate than the original model, generalizes better, is less prone to overfitting, and is more efficient in multi-domain learning.
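
The abstract only sketches the architecture (encoder output is passed to a memory module that looks up similar vectors before answering). Below is a minimal, hypothetical PyTorch sketch of one way such a lookup could be wired in; it is not the authors' implementation. All names (MemoryModule, memory_slots, top_k, fuse) are illustrative assumptions, and the contrastive (SimCSE-style) training of the stored sentence embeddings mentioned in the abstract is not shown here.

```python
import torch
import torch.nn.functional as F

class MemoryModule(torch.nn.Module):
    """Hypothetical memory lookup: a fixed bank of sentence vectors queried
    by cosine similarity and fused back into the BERT encoder output."""

    def __init__(self, hidden_size: int, memory_slots: int, top_k: int = 4):
        super().__init__()
        # Bank of stored sentence embeddings acting as the "memory area".
        self.memory = torch.nn.Parameter(torch.randn(memory_slots, hidden_size))
        self.top_k = top_k
        self.fuse = torch.nn.Linear(2 * hidden_size, hidden_size)

    def forward(self, encoder_output: torch.Tensor) -> torch.Tensor:
        # encoder_output: (batch, seq_len, hidden), e.g. BERT's last hidden states.
        query = encoder_output.mean(dim=1)                         # (batch, hidden)
        sims = F.cosine_similarity(                                # (batch, slots)
            query.unsqueeze(1), self.memory.unsqueeze(0), dim=-1
        )
        _, idx = sims.topk(self.top_k, dim=-1)                     # (batch, top_k)
        retrieved = self.memory[idx].mean(dim=1)                   # (batch, hidden)
        # Broadcast the retrieved memory over the sequence and fuse with the encoder output.
        retrieved = retrieved.unsqueeze(1).expand_as(encoder_output)
        return self.fuse(torch.cat([encoder_output, retrieved], dim=-1))

# Usage sketch (hypothetical): enrich BERT output before the answer-prediction head.
# bert_out = bert(input_ids, attention_mask=mask).last_hidden_state   # (B, L, 768)
# enriched = MemoryModule(hidden_size=768, memory_slots=1024)(bert_out)
```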

Acknowledgements

This work is supported by Tianjin “Project + Team” Key Training Project under Grant No. XC202022.

Author information

Corresponding author

Correspondence to Ruxin Zhang.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Zhang, R., Wang, X. (2023). A Simple Memory Module on Reading Comprehension. In: Tanveer, M., Agarwal, S., Ozawa, S., Ekbal, A., Jatowt, A. (eds) Neural Information Processing. ICONIP 2022. Communications in Computer and Information Science, vol 1792. Springer, Singapore. https://doi.org/10.1007/978-981-99-1642-9_48

  • DOI: https://doi.org/10.1007/978-981-99-1642-9_48

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-1641-2

  • Online ISBN: 978-981-99-1642-9

  • eBook Packages: Computer Science, Computer Science (R0)
