Abstract
This paper develops methods for calculating the semantic similarity (closeness) and relatedness of natural language words. Semantic relatedness makes it possible to construct algorithmic models for context-linguistic analysis aimed at problems such as word sense disambiguation, named entity recognition, and natural language text analysis. A new algorithm is proposed for estimating the semantic distance between natural language words. The method is a weighted modification of the well-known Lesk approach, which is based on the lexical intersection of glossary entries.
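As a rough illustration of the idea of a weighted Lesk-style overlap, the following minimal Python sketch scores two glossary entries by summing per-word weights over the words they share. It is not the authors' algorithm: the abstract does not specify the weighting scheme, so the `weights` table, the stop-word list, the function names, and the sample glosses are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real system would use a fuller one.
STOP_WORDS = {"a", "an", "the", "of", "or", "and", "in", "to", "that", "with", "for", "is"}

def tokenize(gloss):
    """Lowercase a gloss and return its non-stop words as a multiset."""
    words = re.findall(r"[a-z]+", gloss.lower())
    return Counter(w for w in words if w not in STOP_WORDS)

def weighted_overlap(gloss_a, gloss_b, weights=None, default_weight=1.0):
    """Sum the weights of the words shared by two glosses.

    With every weight equal to 1 this reduces to a plain Lesk-style
    overlap count. The weights table is an assumption of this sketch.
    """
    weights = weights or {}
    shared = tokenize(gloss_a) & tokenize(gloss_b)  # min count per shared word
    return sum(weights.get(w, default_weight) * n for w, n in shared.items())

# Illustrative glosses (made up for this example, not dictionary quotations).
pine = "kind of evergreen tree with needle-shaped leaves"
cone_fruit = "fruit of certain evergreen trees"
cone_shape = "solid body which narrows to a point"

plain_fruit = weighted_overlap(pine, cone_fruit)
plain_shape = weighted_overlap(pine, cone_shape)
weighted_fruit = weighted_overlap(pine, cone_fruit, weights={"evergreen": 2.5})
```

In this sketch the "fruit" sense of cone shares the word "evergreen" with the gloss of pine, while the "shape" sense shares nothing, so the fruit sense scores higher. No stemming is applied, so "tree" and "trees" do not match; a fuller implementation would normalize word forms first.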
References
M. Lesk, “Automatic sense disambiguation using machine readable dictionaries: How to tell a pine cone from an ice cream cone,” in: Proc. of the 5th Annu. Intern. Conf. on Syst. Document SIGDOC’86, ACM, New York (1986), pp. 24–26.
S. Wubben, “Using free link structure to calculate semantic relatedness,” ILK Research Group Technical Report Series No. 08–01, Tilburg Univ., Tilburg (2008).
S. P. Ponzetto and M. Strube, “Knowledge derived from Wikipedia for computing semantic relatedness,” J. Artif. Intell. Res., 30, 181–212 (2007).
E. Gabrilovich and S. Markovitch, “Computing semantic relatedness using Wikipedia-based explicit semantic analysis,” in: Proc. 20th Intern. Joint Conf. on Artif. Intell. (Hyderabad, 2007), Morgan Kaufmann, San Francisco (2007), pp. 1606–1611.
P. Resnik, “Using information content to evaluate semantic similarity in a taxonomy,” in: Proc. Intern. Joint Conf. on Artif. Intell. (Montreal, 1995), Morgan Kaufmann, San Francisco (1995), pp. 448–453.
C. Leacock, M. Chodorow, and G. A. Miller, “Using corpus statistics and WordNet relations for sense identification,” Comput. Ling., 24, No. 1, 147–165 (1998).
Z. Wu and M. Palmer, “Verb semantics and lexical selection,” in: Proc. 32nd Annu. Meet. of the Assoc. for Comput. Ling. (Las Cruces, 1994), Morgan Kaufmann, San Francisco (1994), pp. 133–138.
M. Strube and S. P. Ponzetto, “WikiRelate! Computing semantic relatedness using Wikipedia,” in: Proc. 21st Nat. Conf. on Artif. Intell., AAAI, Boston, MA (2006), pp. 1419–1424.
D. Milne and I. H. Witten, “An effective, low-cost measure of semantic relatedness obtained from Wikipedia links,” in: Proc. 1st AAAI Workshop on Wikipedia and Artif. Intell. (CIKM’2008) (Chicago, 2008), AAAI Press, Menlo Park (USA) (2008).
E. Yeh, D. Ramage, C. D. Manning, et al., “WikiWalk: Random walks on Wikipedia for semantic relatedness,” in: ACL-IJCNLP TextGraphs-4 Workshop 2009, Singapore (2009).
S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi, “Optimization by simulated annealing,” Science, New Series, No. 220, 671–680 (1983).
S. Luke, Essentials of Metaheuristics (2009), http://cs.gmu.edu/~sean/book/metaheuristics/.
M. Odersky, Scala by Example, Progr. Meth. Lab., EPFL, Lausanne (2009).
M. Odersky, L. Spoon, and B. Venners, Programming in Scala, Artima Press, Mountain View (2008).
L. Finkelstein, E. Gabrilovich, Y. Matias, et al., “Placing search in context: The concept revisited,” ACM Trans. Inform. Systems, 20, No. 1, 116–131 (2002).
T. Pedersen, S. Patwardhan, and J. Michelizzi, “WordNet::Similarity - Measuring the relatedness of concepts,” in: Proc. 19th Nat. Conf. on Artif. Intell. (San Jose, 2004), Springer, Berlin (2004), pp. 1024–1025.
Additional information
Translated from Kibernetika i Sistemnyi Analiz, No. 4, pp. 18–27, July–August 2011.
Cite this article
Anisimov, A.V., Marchenko, O.O. & Kysenko, V.K. A method for the computation of the semantic similarity and relatedness between natural language words. Cybern Syst Anal 47, 515–522 (2011). https://doi.org/10.1007/s10559-011-9334-2
DOI: https://doi.org/10.1007/s10559-011-9334-2