A Noun-Predicate Bigram-Based Similarity Measure for Lexical Relations

  • Hyopil Shin
  • Insik Cho
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5221)


The method outlined in this paper demonstrates that an information-theoretic similarity measure combined with noun-predicate bigrams is an effective way to create lists of semantically related words for lexical database work. Our experiments showed that bigrams and morpho-syntactic information suffice for the feature-based similarity measure, without full syntactic analysis. We contend that the method would be even more valuable when applied to a raw newswire corpus, where words absent from existing dictionaries, such as recently coined words, proper nouns, and syllabic abbreviations, are prevalent.
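
As a rough illustration of the idea, the sketch below scores noun pairs with a Lin-style information-theoretic similarity over noun-predicate bigram features. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not give the exact formula, so pointwise mutual information is assumed as the feature weight, and the function names (`pmi_table`, `lin_similarity`) and the toy English bigrams are hypothetical.

```python
import math
from collections import Counter, defaultdict

def pmi_table(bigrams):
    """Map each noun to {predicate: PMI}, keeping only positive PMI values.

    Assumption: PMI stands in for the information content of a feature;
    the paper's actual weighting may differ.
    """
    pair_counts = Counter(bigrams)
    noun_counts = Counter(noun for noun, _ in bigrams)
    pred_counts = Counter(pred for _, pred in bigrams)
    total = len(bigrams)
    table = defaultdict(dict)
    for (noun, pred), count in pair_counts.items():
        pmi = math.log2(count * total / (noun_counts[noun] * pred_counts[pred]))
        if pmi > 0:
            table[noun][pred] = pmi
    return table

def lin_similarity(table, w1, w2):
    """Information in shared features divided by information in all features."""
    f1, f2 = table.get(w1, {}), table.get(w2, {})
    denom = sum(f1.values()) + sum(f2.values())
    if denom == 0:
        return 0.0
    shared = f1.keys() & f2.keys()
    return sum(f1[f] + f2[f] for f in shared) / denom

# Hypothetical toy data: (noun, predicate) bigrams as they might be
# extracted from a tagged corpus.
bigrams = [
    ("coffee", "drink"), ("coffee", "brew"),
    ("tea", "drink"), ("tea", "brew"),
    ("car", "drive"), ("car", "park"),
    ("bus", "drive"),
]
table = pmi_table(bigrams)
for pair in [("coffee", "tea"), ("car", "bus"), ("coffee", "car")]:
    print(pair, round(lin_similarity(table, *pair), 3))
```

Nouns that occur with the same predicates score close to 1, and nouns with no predicate in common score 0. Because the features are plain bigrams, the sketch needs no parser, which mirrors the abstract's point that full syntactic analysis is not required.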


Semantically-related words, similarity measure, lexical relations, noun-predicate bigrams





Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Hyopil Shin
  • Insik Cho

Computational Linguistics Lab., Dept. of Linguistics, Seoul National University, Seoul, Korea
