Distributional Thesaurus Versus WordNet: A Comparison of Backoff Techniques for Unsupervised PP Attachment

  • Hiram Calvo
  • Alexander Gelbukh
  • Adam Kilgarriff
Conference paper

DOI: 10.1007/978-3-540-30586-6_17

Part of the Lecture Notes in Computer Science book series (LNCS, volume 3406)
Cite this paper as:
Calvo H., Gelbukh A., Kilgarriff A. (2005) Distributional Thesaurus Versus WordNet: A Comparison of Backoff Techniques for Unsupervised PP Attachment. In: Gelbukh A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2005. Lecture Notes in Computer Science, vol 3406. Springer, Berlin, Heidelberg

Abstract

Prepositional Phrase (PP) attachment can be addressed by considering frequency counts of dependency triples seen in a non-annotated corpus. However, not all triples appear even in very big corpora. To solve this problem, several techniques have been used. We evaluate two different backoff methods, one based on WordNet and the other on a distributional (automatically created) thesaurus. We work on Spanish. The thesaurus is created using the dependency triples found in the same corpus used for counting the frequency of unambiguous triples. The training corpus used for both methods is an encyclopaedia. The method based on a distributional thesaurus has higher coverage but lower precision than the WordNet method.

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Hiram Calvo
    • 1
  • Alexander Gelbukh
    • 1
  • Adam Kilgarriff
    • 2
  1. 1.Center for Computing ResearchNational Polytechnic InstituteMexico
  2. 2.Lexical Computing Ltd.United Kingdom

Personalised recommendations