Abstract
A common subtask of knowledge acquisition from natural-language texts is classifying words and recognizing entities and actions in the text. This subtask arises in the analysis of both scientific and narrative texts. Thesauri and lexical databases that contain hypernymy relationships between synsets may be a useful resource for entity and action recognition. In this study, we compared the performance of three major English thesauri that contain hypernymy relationships in different forms (WordNet, Roget’s Thesaurus, and FrameNet) on six word-meaning categories that are used for the analysis of narrative and scientific natural-language texts. The results show that WordNet contains more words than FrameNet and is more suitable for scientific texts, but FrameNet contains better-defined hypernyms and shows better precision for many narrative natural-language tasks, especially for verbs. The performance of Roget’s Thesaurus falls between that of WordNet and FrameNet in most word-meaning categories. Enhancing FrameNet by adding more lexical units to existing frames would make it a powerful resource for entity and action recognition in text analysis. Fixing the problems of WordNet requires revising its system of hypernyms.
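To illustrate the kind of lookup the abstract refers to, the following is a minimal sketch of assigning a word to a word-meaning category by following hypernym links, and of scoring the result with precision and recall. The graph `HYPERNYMS`, the function names, and the example words are hypothetical and are not taken from WordNet, Roget’s Thesaurus, FrameNet, or the study itself; a real resource would supply the hypernym links.

```python
from collections import deque

# Toy hypernym graph: word -> list of direct hypernyms.
# Hypothetical data for illustration only, not drawn from any real thesaurus.
HYPERNYMS = {
    "spaniel": ["dog"],
    "dog": ["canine", "pet"],
    "canine": ["animal"],
    "pet": ["animal"],
    "animal": ["organism"],
    "whisper": ["speak"],
    "speak": ["communicate"],
    "communicate": ["act"],
}


def hypernym_closure(word, graph=HYPERNYMS):
    """Return the set of all ancestors reachable from `word` via hypernym links."""
    seen = set()
    queue = deque(graph.get(word, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # a word may be reachable through several paths
        seen.add(node)
        queue.extend(graph.get(node, []))
    return seen


def belongs_to(word, category_root, graph=HYPERNYMS):
    """A word is in a category if it is the root or the root is among its ancestors."""
    return word == category_root or category_root in hypernym_closure(word, graph)


def precision_recall(predicted, gold):
    """Precision and recall of a predicted word set against a gold word set."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    words = ["spaniel", "dog", "whisper"]
    predicted_animals = {w for w in words if belongs_to(w, "animal")}
    print(predicted_animals)
    print(precision_recall(predicted_animals, {"spaniel", "dog"}))
```

Under this sketch, the quality of a resource for a given category depends on how well its hypernym chains lead from member words to the category root, which is the property the comparison in the abstract is concerned with.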
The reported study was funded by RFBR, project number 20-07-00764.
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Sychev, O., Kamennov, Y. (2021). Eligibility of English Hypernymy Resources for Extracting Knowledge from Natural-Language Texts. In: Samsonovich, A.V., Gudwin, R.R., Simões, A.d.S. (eds) Brain-Inspired Cognitive Architectures for Artificial Intelligence: BICA*AI 2020. BICA 2020. Advances in Intelligent Systems and Computing, vol 1310. Springer, Cham. https://doi.org/10.1007/978-3-030-65596-9_61
DOI: https://doi.org/10.1007/978-3-030-65596-9_61
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-65595-2
Online ISBN: 978-3-030-65596-9
eBook Packages: Intelligent Technologies and Robotics (R0)