TSD 2007: Text, Speech and Dialogue pp 66-75

Disambiguating Hypernym Relations for Roget’s Thesaurus

  • Alistair Kennedy
  • Stanisław Szpakowicz
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4629)

Abstract

Roget’s Thesaurus is a lexical resource that groups terms by semantic relatedness. Its shortcoming is that the relations are ambiguous: it does not name them, and only shows that some relation holds between terms. Our work focuses on disambiguating hypernym relations within Roget’s Thesaurus. This paper compares several techniques for identifying hypernym relations; in total, over 50,000 hypernym relations have been disambiguated within Roget’s. Human judges have evaluated the quality of our disambiguation techniques, and we have demonstrated the usefulness of the disambiguated relations on several applications.

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Alistair Kennedy (1)
  • Stanisław Szpakowicz (1, 2)
  1. School of Information Technology and Engineering, University of Ottawa, Ottawa, Ontario, Canada
  2. Institute of Computer Science, Polish Academy of Sciences, Warsaw, Poland