Abstract
Grouping objects into a common, initially unknown, category underlies several important tasks, such as query suggestion or automatic lexicon generation. However, while coming up with more things “of the same kind” is easy for humans, it is not trivial for Artificial Intelligence. This task is commonly known as the Entity Set Expansion (ESE) problem, and it has been studied in different branches of AI and NLP. In this paper, we review different similarity metrics and techniques that could be applied to the ESE problem. Moreover, we decompose the problem into phases and demonstrate how to use several approaches together. In particular, we combine semantic similarity metrics with the Meta Path algorithm for knowledge graphs. We discuss the results and show that the presented setting can be reused in further research into hybrid approaches to the ESE problem.
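To make the ESE task concrete, the following is a minimal sketch, not the method evaluated in the paper. It assumes a toy knowledge graph given as hypothetical (subject, predicate, object) triples and ranks candidate entities by their average Jaccard similarity to the seed entities. The feature scheme (exact predicate-object pairs plus predicate-only wildcards) is a simplified stand-in for path-based features, and the names `build_features`, `jaccard`, and `expand` are illustrative only.

```python
from collections import defaultdict

# Hypothetical toy knowledge graph; the triples are illustrative only.
TRIPLES = [
    ("Rome", "type", "City"),
    ("Rome", "capitalOf", "Italy"),
    ("Paris", "type", "City"),
    ("Paris", "capitalOf", "France"),
    ("Berlin", "type", "City"),
    ("Berlin", "capitalOf", "Germany"),
    ("Milan", "type", "City"),
    ("Italy", "type", "Country"),
    ("France", "type", "Country"),
    ("Germany", "type", "Country"),
]


def build_features(triples):
    """Describe each subject by exact (predicate, object) pairs and
    by predicate-only wildcards such as ("capitalOf", "*")."""
    feats = defaultdict(set)
    for subj, pred, obj in triples:
        feats[subj].add((pred, obj))
        feats[subj].add((pred, "*"))
    return dict(feats)


def jaccard(a, b):
    """Jaccard similarity of two feature sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def expand(seeds, triples):
    """Rank non-seed entities by mean Jaccard similarity to the seeds."""
    feats = build_features(triples)
    seed_set = set(seeds)
    scored = []
    for candidate, cand_feats in feats.items():
        if candidate in seed_set:
            continue
        total = sum(jaccard(cand_feats, feats.get(s, set())) for s in seed_set)
        scored.append((candidate, total / len(seed_set)))
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scored, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    for entity, score in expand(["Rome", "Paris"], TRIPLES):
        print(f"{entity}: {score:.2f}")
```

With the seeds Rome and Paris, entities that are cities with a `capitalOf` edge rank above cities without one, which in turn rank above countries. A hybrid setting of the kind described in the abstract would replace or complement this single score with further similarity metrics.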
This paper is supported by AGH UST.
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Adrian, W.T., Wilk, K., Adrian, M., Kluza, K., Ligęza, A. (2022). Semantic Similarity Analysis for Entity Set Expansion. In: Fred, A., Aveiro, D., Dietz, J., Salgado, A., Bernardino, J., Filipe, J. (eds) Knowledge Discovery, Knowledge Engineering and Knowledge Management. IC3K 2020. Communications in Computer and Information Science, vol 1608. Springer, Cham. https://doi.org/10.1007/978-3-031-14602-2_3
DOI: https://doi.org/10.1007/978-3-031-14602-2_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-14601-5
Online ISBN: 978-3-031-14602-2
eBook Packages: Computer Science, Computer Science (R0)