
ExBERT: An External Knowledge Enhanced BERT for Natural Language Inference

  • Conference paper
Artificial Neural Networks and Machine Learning – ICANN 2021 (ICANN 2021)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12895)


Abstract

Neural language representation models such as BERT, pre-trained on large-scale unstructured corpora, lack explicit grounding in real-world commonsense knowledge and are often unable to recall the facts required for reasoning and inference. Natural Language Inference (NLI) is a challenging reasoning task that relies on common human understanding of language and real-world commonsense knowledge. We introduce a new model for NLI, External Knowledge Enhanced BERT (ExBERT), which enriches the contextual representation with real-world commonsense knowledge from external knowledge sources and enhances BERT’s language understanding and reasoning capabilities. ExBERT takes full advantage of the contextual word representations obtained from BERT, using them both to retrieve relevant external knowledge from knowledge graphs and to encode the retrieved knowledge. The model adaptively incorporates the external knowledge context required for reasoning over the inputs. Extensive experiments on the challenging SciTail and SNLI benchmarks demonstrate the effectiveness of ExBERT: compared with the previous state of the art, we obtain an accuracy of \(95.9\%\) on SciTail and \(91.5\%\) on SNLI.
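
The abstract describes the architecture only at a high level; the snippet below is a minimal, purely illustrative sketch of how such a knowledge-enhanced NLI model could be wired up. The gated fusion layer, the 300-dimensional knowledge embeddings, the checkpoint name, and the class and variable names are assumptions made here, not the authors' implementation, and the knowledge-retrieval step is stubbed out with zero vectors.

import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer


class KnowledgeEnhancedNLI(nn.Module):
    """Hypothetical sketch: BERT encoder + gated fusion of external knowledge + NLI head."""

    def __init__(self, bert_name="bert-base-uncased", knowledge_dim=300, num_labels=3):
        super().__init__()
        self.bert = AutoModel.from_pretrained(bert_name)
        hidden = self.bert.config.hidden_size
        # Project knowledge-graph embeddings (e.g. from a source such as ConceptNet)
        # into BERT's hidden space.
        self.knowledge_proj = nn.Linear(knowledge_dim, hidden)
        # Per-token gate deciding how much external knowledge to mix in (adaptive fusion).
        self.gate = nn.Linear(2 * hidden, hidden)
        self.classifier = nn.Linear(hidden, num_labels)

    def forward(self, input_ids, attention_mask, knowledge_embeds):
        # knowledge_embeds: (batch, seq_len, knowledge_dim); zeros where no
        # knowledge-graph entry was retrieved for a token (an assumption made here).
        context = self.bert(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
        knowledge = self.knowledge_proj(knowledge_embeds)
        gate = torch.sigmoid(self.gate(torch.cat([context, knowledge], dim=-1)))
        fused = gate * context + (1.0 - gate) * knowledge
        # Classify entailment from the fused [CLS] position.
        return self.classifier(fused[:, 0])


tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = KnowledgeEnhancedNLI()
batch = tokenizer(["A man plays a guitar."], ["Someone is making music."],
                  return_tensors="pt", padding=True)
# Placeholder knowledge embeddings; a real pipeline would retrieve them from a
# knowledge graph using the tokens' contextual representations.
knowledge = torch.zeros(batch["input_ids"].shape[0], batch["input_ids"].shape[1], 300)
logits = model(batch["input_ids"], batch["attention_mask"], knowledge)
print(logits.shape)  # torch.Size([1, 3]) -> entailment / neutral / contradiction

In this sketch the sigmoid gate stands in for the "adaptive incorporation" the abstract mentions: per token, it learns how much of the projected knowledge embedding to mix into BERT's contextual representation. For SciTail, num_labels would be 2 (entails/neutral) rather than the 3-way SNLI labelling.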


Notes

  1. https://github.com/huggingface/transformers.

  2. https://nlp.stanford.edu/projects/snli/.

  3. https://leaderboard.allenai.org/scitail/submissions/public.

  4. We expect further improvements in ExBERT’s performance with \(\mathrm {BERT_{LARGE}}\); however, we leave this evaluation to future work owing to limited computing resources (a minimal checkpoint-loading sketch follows these notes).
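
Footnote 1 points to the Hugging Face transformers library and footnote 4 notes that only the base-sized BERT was evaluated. Purely as an assumption-laden illustration (this page does not name the exact checkpoint), swapping in the larger encoder would amount to changing the pretrained model name:

from transformers import AutoModel, AutoTokenizer

# Assumed base checkpoint; the exact variant (cased/uncased) is not stated on this page.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")
print(bert.config.hidden_size)  # 768 for BERT-base

# Footnote 4's untried variant: the larger encoder, loaded the same way (hidden size 1024).
# bert_large = AutoModel.from_pretrained("bert-large-uncased")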


Author information

Corresponding author: Amit Gajbhiye.


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Gajbhiye, A., Moubayed, N.A., Bradley, S. (2021). ExBERT: An External Knowledge Enhanced BERT for Natural Language Inference. In: Farkaš, I., Masulli, P., Otte, S., Wermter, S. (eds.) Artificial Neural Networks and Machine Learning – ICANN 2021. ICANN 2021. Lecture Notes in Computer Science, vol. 12895. Springer, Cham. https://doi.org/10.1007/978-3-030-86383-8_37


  • DOI: https://doi.org/10.1007/978-3-030-86383-8_37

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-86382-1

  • Online ISBN: 978-3-030-86383-8

  • eBook Packages: Computer Science, Computer Science (R0)
