Distribution-Based Semantic Similarity of Nouns

Bolshakov, Igor A.; Gelbukh, Alexander

doi:10.1007/978-3-540-76725-1_73

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4756)

Included in the following conference series:

Iberoamerican Congress on Pattern Recognition

Abstract

In our previous work we proposed two methods for evaluating the semantic similarity or dissimilarity of nouns based on their modifier sets as registered in the Oxford Collocations Dictionary for Students of English. In this paper we provide further details on the experimental support for these methods and discuss them. Given two nouns, the first method measures similarity by the relative size of the intersection of the sets of modifiers applicable to each of them. The second method measures dissimilarity by the difference between the mean values of cohesion between a noun and two sets of modifiers: its own and those of the other noun in question. Here, the cohesion between words is measured via Web statistics on word co-occurrences. The two proposed measures prove to be approximately inversely related. Our experiments show that Web-based weighting (the second method) gives better results.
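
The two measures described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state how the intersection is normalized or how cohesion is computed from Web counts, so the geometric-mean normalization, the function names, and the `cohesion` callable (with its toy score table) are all assumptions made for illustration only.

```python
from math import sqrt
from statistics import mean
from typing import Callable, Set


def modifier_overlap_similarity(mods_a: Set[str], mods_b: Set[str]) -> float:
    """First method (sketch): relative size of the shared modifier set.

    Assumption: the intersection size is normalized by the geometric mean
    of the two set sizes; the abstract only says "relative size".
    """
    if not mods_a or not mods_b:
        return 0.0
    return len(mods_a & mods_b) / sqrt(len(mods_a) * len(mods_b))


def cohesion_gap_dissimilarity(
    noun: str,
    own_mods: Set[str],
    other_mods: Set[str],
    cohesion: Callable[[str, str], float],
) -> float:
    """Second method (sketch): mean cohesion of `noun` with its own
    modifiers minus its mean cohesion with the other noun's modifiers.

    Assumption: `cohesion` is a hypothetical stand-in for a Web-statistics
    score; this sketch is one-directional (from `noun` only).
    """
    own = mean([cohesion(noun, m) for m in sorted(own_mods)])
    other = mean([cohesion(noun, m) for m in sorted(other_mods)])
    return own - other


# Toy data, invented for illustration (not taken from the dictionary).
TEA_MODS = {"strong", "hot", "green", "weak"}
COFFEE_MODS = {"strong", "hot", "black", "weak"}

# Hypothetical cohesion scores standing in for Web co-occurrence statistics.
TOY_SCORES = {
    ("tea", "strong"): 4.0,
    ("tea", "hot"): 4.0,
    ("tea", "green"): 6.0,
    ("tea", "weak"): 2.0,
    ("tea", "black"): 2.0,
}


def toy_cohesion(noun: str, modifier: str) -> float:
    """Look up a toy score; unseen pairs get 0.0."""
    return TOY_SCORES.get((noun, modifier), 0.0)


similarity = modifier_overlap_similarity(TEA_MODS, COFFEE_MODS)
dissimilarity = cohesion_gap_dissimilarity("tea", TEA_MODS, COFFEE_MODS, toy_cohesion)
print(similarity, dissimilarity)
```

In this sketch a larger overlap raises the first measure, while the second measure grows when a noun coheres with its own modifiers much more strongly than with those of the other noun, which is consistent with the approximately inverse relation reported above.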

Work done under partial support of the Mexican Government (CONACyT, SNI, SIP-IPN, COTEPABE-IPN). The authors thank the anonymous reviewers for their valuable comments.

References

Bolshakov, I.A., Bolshakova, E.I.: Measurements of Lexico-Syntactic Cohesion by means of Internet. In: Gelbukh, A., de Albornoz, Á., Terashima-Marín, H. (eds.) MICAI 2005. LNCS (LNAI), vol. 3789, pp. 790–799. Springer, Heidelberg (2005)
Bolshakov, I.A., Gelbukh, A.: Two Methods of Evaluation of Semantic Similarity of Nouns Based on Their Modifier Sets. In: LNCS, vol. 4592, Springer, Heidelberg (2007)
Cilibrasi, R.L., Vitányi, P.M.B.: The Google Similarity Distance. IEEE Transactions on Knowledge and Data Engineering 19(3), 370–383 (2007), www.cwi.nl/~paulv/papers/tkde06.pdf
Fellbaum, C. (ed.): WordNet: An Electronic Lexical Database. MIT Press, Cambridge (1998)
Hirst, G., Budanitsky, A.: Correcting Real-Word Spelling Errors by Restoring Lexical Cohesion. Natural Language Engineering 11(1), 87–111 (2005)
Keller, F., Lapata, M.: Using the Web to Obtain Frequencies for Unseen Bigrams. Computational Linguistics 29(3), 459–484 (2003)
Lin, D.: Automatic Retrieval and Clustering of Similar Words. In: COLING-ACL 1998, Canada (1998)
Manning, C.D., Schütze, H.: Foundations of Statistical Natural Language Processing. MIT Press, Cambridge (1999)
McCarthy, D., Koeling, R., Weeds, J., Carroll, J.: Finding Predominant Word Senses in Untagged Text. In: Proc. 42nd Annual Meeting of the ACL, Barcelona, Spain (2004)
Oxford Collocations Dictionary for Students of English. Oxford University Press (2003)
Patwardhan, S., Banerjee, S., Pedersen, T.: Using Measures of Semantic Relatedness for Word Sense Disambiguation. In: Gelbukh, A. (ed.) CICLing 2003. LNCS, vol. 2588, Springer, Heidelberg (2003)

Author information

Authors and Affiliations

Center for Computing Research (CIC), National Polytechnic Institute (IPN), Av. Juan de Dios Bátiz s/n, Col. Nueva Industrial Vallejo, 07738, Mexico City, Mexico
Igor A. Bolshakov & Alexander Gelbukh

Editor information

Luis Rueda, Domingo Mery, Josef Kittler

Cite this paper

Bolshakov, I.A., Gelbukh, A. (2007). Distribution-Based Semantic Similarity of Nouns. In: Rueda, L., Mery, D., Kittler, J. (eds) Progress in Pattern Recognition, Image Analysis and Applications. CIARP 2007. Lecture Notes in Computer Science, vol 4756. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-76725-1_73

DOI: https://doi.org/10.1007/978-3-540-76725-1_73
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-76724-4
Online ISBN: 978-3-540-76725-1
eBook Packages: Computer Science, Computer Science (R0)
