Abstract
This paper introduces a semantics-aware approach to natural language inference (NLI) that enables neural network models to perform better on NLI benchmarks. We propose incorporating explicit lexical and concept-level semantics from knowledge bases to improve inference accuracy. We conduct an extensive evaluation of four models built on different sentence encoders: continuous bag-of-words, convolutional neural network, recurrent neural network, and the transformer. Experimental results demonstrate that semantics-aware neural models achieve higher accuracy than models without semantic information. On average across the three strong models, our semantics-aware approach improves natural language inference in different languages.
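To illustrate the general idea (not the authors' exact architecture), one common way to make a sentence encoder semantics-aware is to concatenate a knowledge-base-derived feature vector to each token's word embedding before encoding. The feature table, dimensions, and entries below are hypothetical:

```python
import numpy as np

EMB_DIM = 50   # word embedding size (hypothetical)
SEM_DIM = 8    # semantic feature size (hypothetical)

# Hypothetical concept-level features looked up from a knowledge
# base, e.g. one slot per coarse semantic class of the word.
SEMANTIC_FEATURES = {
    "dog": np.array([1, 0, 0, 0, 0, 0, 0, 1], dtype=float),
    "animal": np.array([1, 0, 0, 0, 0, 0, 1, 0], dtype=float),
}

def semantics_aware_embedding(word, word_vec):
    """Concatenate a word vector with its knowledge-base feature
    vector; words with no KB entry get an all-zero feature vector."""
    sem = SEMANTIC_FEATURES.get(word, np.zeros(SEM_DIM))
    return np.concatenate([word_vec, sem])

vec = semantics_aware_embedding("dog", np.zeros(EMB_DIM))
print(vec.shape)  # (58,)
```

The enriched vectors can then be fed to any of the four encoders (CBOW, CNN, RNN, transformer) in place of the plain word embeddings.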
Notes
https://github.com/phuonglh/vlp, under the nli module.
As of September 15, 2020, on the latest GLUE test set.
Note that in transformer-based models, the hidden size must be a multiple of the number of self-attention heads.
More detailed experimental results can be found in our GitHub repository.
We use the HypothesisTests package of the Julia programming language to perform the statistical tests.
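The note above mentions Julia's HypothesisTests package; an equivalent check in Python with SciPy might look like the following. The per-run accuracy scores here are invented for illustration, and the choice of a paired t-test is an assumption, not necessarily the test the authors ran:

```python
from scipy import stats

# Hypothetical per-run accuracies of a baseline and a
# semantics-aware model evaluated on the same splits,
# so the samples are paired.
baseline = [0.742, 0.751, 0.748, 0.739, 0.745]
semantic = [0.768, 0.771, 0.765, 0.759, 0.770]

# Paired t-test on the accuracy differences: a small p-value
# indicates the improvement is unlikely to be due to chance.
t_stat, p_value = stats.ttest_rel(semantic, baseline)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
```

With real data, the same comparison in Julia would use `OneSampleTTest` on the paired differences from HypothesisTests.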
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Research involving human and animal rights
This article does not contain any studies with human or animal subjects performed by any of the authors.
Informed consent
Informed consent was not required as no human or animals were involved.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Le-Hong, P., Cambria, E. A semantics-aware approach for multilingual natural language inference. Lang Resources & Evaluation 57, 611–639 (2023). https://doi.org/10.1007/s10579-023-09635-6