Abstract
Knowledge base population seeks to expand knowledge graphs with facts that are typically extracted from a text corpus. Recently, language models pretrained on large corpora have been shown to contain factual knowledge that can be retrieved using cloze-style strategies. Such an approach enables zero-shot recall of facts, showing competitive results in object prediction compared to supervised baselines. However, prompt-based fact retrieval can be brittle and depends heavily on the prompts and context used, which may produce results that are unintended or hallucinatory. We propose to use textual entailment to validate facts extracted from language models through cloze statements. Our results show that triple validation based on textual entailment improves language model predictions in different training regimes. Furthermore, we show that entailment-based triple validation is also effective for validating candidate facts extracted from other sources, including existing knowledge graphs and text passages where named entities are recognized.
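The abstract describes two verbalization steps: turning a candidate triple into a cloze statement to query a language model, and turning it into a premise/hypothesis pair so an entailment (NLI) model can validate the extracted fact. The sketch below illustrates only that verbalization logic; the template strings and the names `RELATION_TEMPLATES`, `cloze_statement`, and `entailment_pair` are hypothetical, not the paper's actual implementation.

```python
# Illustrative templates mapping relations to natural-language patterns.
# These are assumed examples, not the paper's relation mapping.
RELATION_TEMPLATES = {
    "birthplace": "{subject} was born in {object}.",
    "employer": "{subject} works for {object}.",
}

def cloze_statement(subject, relation, mask_token="[MASK]"):
    """Cloze query for object prediction: the object slot is masked
    so a masked language model can fill it in."""
    return RELATION_TEMPLATES[relation].format(subject=subject, object=mask_token)

def entailment_pair(subject, relation, candidate_object, evidence):
    """Premise = supporting text, hypothesis = verbalized triple.
    An NLI model that scores 'entailment' high for this pair would
    accept the candidate triple; 'contradiction' would reject it."""
    hypothesis = RELATION_TEMPLATES[relation].format(
        subject=subject, object=candidate_object)
    return evidence, hypothesis

print(cloze_statement("Marie Curie", "birthplace"))
# → Marie Curie was born in [MASK].

premise, hypothesis = entailment_pair(
    "Marie Curie", "birthplace", "Warsaw",
    "Maria Sklodowska, later known as Marie Curie, was born in Warsaw.")
print(hypothesis)
# → Marie Curie was born in Warsaw.
```

In practice the premise/hypothesis pair would be passed to a pretrained NLI model (e.g. one fine-tuned on MNLI [85]), and the triple kept only if the entailment probability exceeds a threshold.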
Notes
- 2. The KBP evaluation track of the TAC [14] is a long-running initiative. However, manual system evaluation makes it hard to reproduce evaluations for new systems.
- 3. See the RDF schema at https://www.w3.org/TR/rdf-primer/#properties.
- 4. While rdf:type is the standard property used to state that a resource is an instance of a class, some knowledge graphs may use other ad hoc properties.
- 5. Due to the diverse nature of the MISC category, we do not consider it.
- 16. The relation mapping can be found in the paper repository.
References
Adel, H., Schütze, H.: Type-aware convolutional neural networks for slot filling. J. Artif. Intell. Res. 66, 297–339 (2019)
Alivanistos, D., Santamaría, S., Cochez, M., Kalo, J., van Krieken, E., Thanapalasingam, T.: Prompting as probing: using language models for knowledge base construction. In: Singhania, S., Nguyen, T.P., Razniewski, S. (eds.) LM-KBC 2022 Knowledge Base Construction from Pre-trained Language Models 2022, pp. 11–34. CEUR Workshop Proceedings, CEUR-WS.org (2022)
Balazevic, I., Allen, C., Hospedales, T.: TuckER: tensor factorization for knowledge graph completion. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China, pp. 5185–5194. Association for Computational Linguistics (2019). https://doi.org/10.18653/v1/D19-1522. https://aclanthology.org/D19-1522
Balog, K.: Populating knowledge bases. In: Balog, K. (ed.) Entity-Oriented Search. TIRS, vol. 39, pp. 189–222. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-93935-3_6
Bentivogli, L., Clark, P., Dagan, I., Giampiccolo, D.: The seventh PASCAL recognizing textual entailment challenge. In: Proceedings of the Text Analysis Conference (TAC 2011) (2011)
Bordes, A., Usunier, N., Garcia-Duran, A., Weston, J., Yakhnenko, O.: Translating embeddings for modeling multi-relational data. In: Advances in Neural Information Processing Systems, vol. 26. Curran Associates, Inc. (2013). https://papers.nips.cc/paper/2013/hash/1cecc7a77928ca8133fa24680a88d2f9-Abstract.html
Bouraoui, Z., Camacho-Collados, J., Schockaert, S.: Inducing relational knowledge from BERT. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, pp. 7456–7463 (2020)
Bowman, S.R., Angeli, G., Potts, C., Manning, C.D.: A large annotated corpus for learning natural language inference. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, Lisbon, Portugal, pp. 632–642. Association for Computational Linguistics (2015). https://doi.org/10.18653/v1/D15-1075. https://aclanthology.org/D15-1075
Cao, B., et al.: Knowledgeable or educated guess? Revisiting language models as knowledge bases. In: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Online, pp. 1860–1874. Association for Computational Linguistics (2021). https://doi.org/10.18653/v1/2021.acl-long.146. https://aclanthology.org/2021.acl-long.146
Chen, Z., Feng, Y., Zhao, D.: Entailment graph learning with textual entailment and soft transitivity. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Dublin, Ireland, pp. 5899–5910. Association for Computational Linguistics (2022). https://doi.org/10.18653/v1/2022.acl-long.406. https://aclanthology.org/2022.acl-long.406
Dagan, I., Glickman, O., Magnini, B.: The PASCAL recognising textual entailment challenge. In: Quiñonero-Candela, J., Dagan, I., Magnini, B., d’Alché-Buc, F. (eds.) MLCW 2005. LNCS (LNAI), vol. 3944, pp. 177–190. Springer, Heidelberg (2006). https://doi.org/10.1007/11736790_9
Galárraga, L., Razniewski, S., Amarilli, A., Suchanek, F.M.: Predicting completeness in knowledge bases. In: Proceedings of the Tenth ACM International Conference on Web Search and Data Mining, WSDM 2017, New York, NY, USA, pp. 375–383. Association for Computing Machinery (2017). https://doi.org/10.1145/3018661.3018739
Gerber, D., et al.: Defacto-temporal and multilingual deep fact validation. Web Semant. 35(P2), 85–101 (2015). https://doi.org/10.1016/j.websem.2015.08.001
Getman, J., Ellis, J., Strassel, S., Song, Z., Tracey, J.: Laying the groundwork for knowledge base population: nine years of linguistic resources for TAC KBP. In: Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan. European Language Resources Association (ELRA) (2018). https://aclanthology.org/L18-1245
Goodfellow, I.J., Mirza, M., Xiao, D., Courville, A.C., Bengio, Y.: An empirical investigation of catastrophic forgetting in gradient-based neural networks. CoRR abs/1312.6211 (2013)
Guo, Z., Schlichtkrull, M., Vlachos, A.: A survey on automated fact-checking. Trans. Assoc. Comput. Linguist. 10, 178–206 (2022). https://doi.org/10.1162/tacl_a_00454
Hosseini, M.J., Cohen, S.B., Johnson, M., Steedman, M.: Open-domain contextual link prediction and its complementarity with entailment graphs. In: Findings of the Association for Computational Linguistics: EMNLP 2021, Punta Cana, Dominican Republic, pp. 2790–2802. Association for Computational Linguistics (2021). https://doi.org/10.18653/v1/2021.findings-emnlp.238. https://aclanthology.org/2021.findings-emnlp.238/
Huang, L., Sil, A., Ji, H., Florian, R.: Improving slot filling performance with attentive neural networks on dependency structures. In: EMNLP (2017)
Huguet Cabot, P.L., Navigli, R.: REBEL: relation extraction by end-to-end language generation. In: Findings of the Association for Computational Linguistics: EMNLP 2021, Punta Cana, Dominican Republic, pp. 2370–2381. Association for Computational Linguistics (2021). https://doi.org/10.18653/v1/2021.findings-emnlp.204. https://aclanthology.org/2021.findings-emnlp.204
Jaradeh, M.Y., Singh, K., Stocker, M., Auer, S.: Triple classification for scholarly knowledge graph completion. In: Proceedings of the 11th on Knowledge Capture Conference, pp. 225–232 (2021)
Ji, H., Grishman, R.: Knowledge base population: successful approaches and challenges. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Portland, Oregon, USA, pp. 1148–1158. Association for Computational Linguistics (2011). https://aclanthology.org/P11-1115
Ji, S., Pan, S., Cambria, E., Marttinen, P., Yu, P.S.: A survey on knowledge graphs: representation, acquisition, and applications. IEEE Trans. Neural Netw. Learn. Syst. 33(2), 494–514 (2022). https://doi.org/10.1109/TNNLS.2021.3070843
Kim, J., Choi, K.s.: Unsupervised fact checking by counter-weighted positive and negative evidential paths in a knowledge graph. In: Proceedings of the 28th International Conference on Computational Linguistics, Barcelona, Spain, pp. 1677–1686. International Committee on Computational Linguistics (2020). https://doi.org/10.18653/v1/2020.coling-main.147. https://aclanthology.org/2020.coling-main.147
Lewis, M., et al.: BART: denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, pp. 7871–7880. Association for Computational Linguistics (2020). https://doi.org/10.18653/v1/2020.acl-main.703. https://aclanthology.org/2020.acl-main.703
Li, T., Huang, W., Papasarantopoulos, N., Vougiouklis, P., Pan, J.Z.: Task-specific pre-training and prompt decomposition for knowledge graph population with language models. arXiv abs/2208.12539 (2022)
Liu, N.F., Gardner, M., Belinkov, Y., Peters, M.E., Smith, N.A.: Linguistic knowledge and transferability of contextual representations. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, Minnesota, pp. 1073–1094. Association for Computational Linguistics (2019). https://doi.org/10.18653/v1/N19-1112. https://aclanthology.org/N19-1112
MacCartney, B., Manning, C.D.: Modeling semantic containment and exclusion in natural language inference. In: Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008), Manchester, UK, pp. 521–528. Coling 2008 Organizing Committee (2008). https://aclanthology.org/C08-1066
McCloskey, M., Cohen, N.J.: Catastrophic interference in connectionist networks: the sequential learning problem. In: Psychology of Learning and Motivation, vol. 24, pp. 109–165. Elsevier (1989)
Miller, G.A.: WordNet: a lexical database for English. Commun. ACM 38(11), 39–41 (1995)
Peters, M.E., et al.: Knowledge enhanced contextual word representations. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 43–54 (2019)
Petroni, F., et al.: Language models as knowledge bases? In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China, pp. 2463–2473. Association for Computational Linguistics (2019). https://doi.org/10.18653/v1/D19-1250. https://aclanthology.org/D19-1250
Poerner, N., Waltinger, U., Schütze, H.: E-BERT: efficient-yet-effective entity embeddings for BERT. In: Findings of the Association for Computational Linguistics: EMNLP 2020, pp. 803–818 (2020)
Qin, G., Eisner, J.: Learning how to ask: querying LMs with mixtures of soft prompts. In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Online, pp. 5203–5212. Association for Computational Linguistics (2021). https://doi.org/10.18653/v1/2021.naacl-main.410. https://aclanthology.org/2021.naacl-main.410
Richardson, K., Hu, H., Moss, L., Sabharwal, A.: Probing natural language inference models through semantic fragments. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, no. 05, pp. 8713–8721 (2020). https://doi.org/10.1609/aaai.v34i05.6397. https://ojs.aaai.org/index.php/AAAI/article/view/6397
Rodrigo, Á., Peñas, A., Verdejo, F.: Overview of the answer validation exercise 2008. In: Peters, C., et al. (eds.) CLEF 2008. LNCS, vol. 5706, pp. 296–313. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-04447-2_35
Sainz, O., Gonzalez-Dios, I., Lopez de Lacalle, O., Min, B., Agirre, E.: Textual entailment for event argument extraction: zero- and few-shot with multi-source learning. In: Findings of the Association for Computational Linguistics: NAACL 2022, Seattle, United States, pp. 2439–2455. Association for Computational Linguistics (2022). https://doi.org/10.18653/v1/2022.findings-naacl.187. https://aclanthology.org/2022.findings-naacl.187/
Sainz, O., de Lacalle, O.L., Labaka, G., Barrena, A., Agirre, E.: Label verbalization and entailment for effective zero and few-shot relation extraction. arXiv abs/2109.03659 (2021)
Serra, J., Suris, D., Miron, M., Karatzoglou, A.: Overcoming catastrophic forgetting with hard attention to the task. In: International Conference on Machine Learning, pp. 4548–4557. PMLR (2018)
Shi, B., Weninger, T.: Open-world knowledge graph completion. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 32, no. 1 (2018). https://doi.org/10.1609/aaai.v32i1.11535. https://ojs.aaai.org/index.php/AAAI/article/view/11535
Shin, T., Razeghi, Y., Logan IV, R.L., Wallace, E., Singh, S.: AutoPrompt: eliciting knowledge from language models with automatically generated prompts. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Online, pp. 4222–4235. Association for Computational Linguistics (2020). https://doi.org/10.18653/v1/2020.emnlp-main.346. https://aclanthology.org/2020.emnlp-main.346
Shiralkar, P., Flammini, A., Menczer, F., Ciampaglia, G.L.: Finding streams in knowledge graphs to support fact checking. In: 2017 IEEE International Conference on Data Mining (ICDM), pp. 859–864 (2017). https://doi.org/10.1109/ICDM.2017.105
Singhania, S., Nguyen, T.P., Razniewski, S.: LM-KBC: knowledge base construction from pre-trained language models. In: The Semantic Web Challenge on Knowledge Base Construction from Pre-trained Language Models 2022, co-located with the 21st International Semantic Web Conference (ISWC 2022), Hangzhou, China, vol. 3274 (2022). https://ceur-ws.org/Vol-3274/paper1.pdf
Speer, R., Chin, J., Havasi, C.: ConceptNet 5.5: an open multilingual graph of general knowledge. In: Thirty-First AAAI Conference on Artificial Intelligence (2017)
Surdeanu, M., Ji, H.: Overview of the English slot filling track at the TAC2014 knowledge base population evaluation. In: Proceedings of Text Analysis Conference (TAC 2014) (2014)
Syed, Z.H., Röder, M., Ngomo, A.-C.N.: Unsupervised discovery of corroborative paths for fact validation. In: Ghidini, C., et al. (eds.) ISWC 2019. LNCS, vol. 11778, pp. 630–646. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-30793-6_36
Syed, Z.H., Röder, M., Ngonga Ngomo, A.C.: Factcheck: validating RDF triples using textual evidence. In: Proceedings of the 27th ACM International Conference on Information and Knowledge Management, CIKM 2018, pp. 1599–1602. Association for Computing Machinery, New York (2018). https://doi.org/10.1145/3269206.3269308
Tenney, I., et al.: What do you learn from context? Probing for sentence structure in contextualized word representations. In: International Conference on Learning Representations (2019). https://openreview.net/forum?id=SJzSgnRcKX
Thorne, J., Vlachos, A.: Automated fact checking: task formulations, methods and future directions. In: Proceedings of the 27th International Conference on Computational Linguistics, Santa Fe, New Mexico, USA, pp. 3346–3359. Association for Computational Linguistics (2018). https://aclanthology.org/C18-1283
Toutanova, K., Chen, D., Pantel, P., Poon, H., Choudhury, P., Gamon, M.: Representing text for joint embedding of text and knowledge bases. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, Lisbon, Portugal, pp. 1499–1509. Association for Computational Linguistics (2015). https://doi.org/10.18653/v1/D15-1174. https://aclanthology.org/D15-1174
Vrandečić, D., Krötzsch, M.: Wikidata: a free collaborative knowledgebase. Commun. ACM 57(10), 78–85 (2014)
Wang, S., Fang, H., Khabsa, M., Mao, H., Ma, H.: Entailment as Few-Shot Learner (2021). arXiv:2104.14690
West, R., Gabrilovich, E., Murphy, K., Sun, S., Gupta, R., Lin, D.: Knowledge base completion via search-based question answering. In: Proceedings of the 23rd International Conference on World Wide Web, WWW 2014, pp. 515–526. Association for Computing Machinery, New York (2014). https://doi.org/10.1145/2566486.2568032
Williams, A., Nangia, N., Bowman, S.: A broad-coverage challenge corpus for sentence understanding through inference. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), New Orleans, Louisiana, pp. 1112–1122. Association for Computational Linguistics (2018). https://doi.org/10.18653/v1/N18-1101. https://aclanthology.org/N18-1101
Wu, L., Petroni, F., Josifoski, M., Riedel, S., Zettlemoyer, L.: Scalable zero-shot entity linking with dense entity retrieval. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 6397–6407 (2020)
Yao, L., Mao, C., Luo, Y.: KG-BERT: BERT for knowledge graph completion. arXiv preprint arXiv:1909.03193 (2019)
Zha, H., Chen, Z., Yan, X.: Inductive relation prediction by BERT. In: Proceedings of the First MiniCon Conference (2022). https://aaai-022.virtualchair.net/poster_aaai7162
Zhou, X., Zhang, Y., Cui, L., Huang, D.: Evaluating commonsense in pre-trained language models. In: AAAI (2020)
Acknowledgement
We are grateful to the European Commission (EU Horizon 2020 EXCELLENT SCIENCE - Research Infrastructure under grant agreement No. 101017501 RELIANCE) and ESA (Contract No. 4000135254/21/NL/GLC/kk FEPOSI) for the support received to carry out this research.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
García-Silva, A., Berrío, C., Gómez-Pérez, J.M. (2023). Textual Entailment for Effective Triple Validation in Object Prediction. In: Payne, T.R., et al. The Semantic Web – ISWC 2023. ISWC 2023. Lecture Notes in Computer Science, vol 14265. Springer, Cham. https://doi.org/10.1007/978-3-031-47240-4_5
DOI: https://doi.org/10.1007/978-3-031-47240-4_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-47239-8
Online ISBN: 978-3-031-47240-4
eBook Packages: Computer Science (R0)