Acquiring Selectional Preferences from Untagged Text for Prepositional Phrase Attachment Disambiguation

  • Hiram Calvo
  • Alexander Gelbukh
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3136)


Automatically extracting information from text for database representation requires that phrases first be grouped correctly so that entities can be separated adequately. Deciding where a prepositional phrase (PP) attaches is part of this grouping and is known as PP attachment disambiguation. Current PP attachment disambiguation systems need either an annotated treebank or an Internet connection to achieve a precision above 90%. Unfortunately, these resources are not always available. In addition, techniques designed to use the Web as a corpus may not achieve the same results when applied to local corpora. In this paper, we present an unsupervised method for generalizing the information in local corpora by semantically classifying nouns according to the 25 unique beginner concepts at the top of WordNet. We then propose a method for using this information for PP attachment disambiguation.
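As a rough illustration of the idea, and not the authors' implementation, the following Python sketch generalizes co-occurrence triples by replacing each noun with a semantic class and then picks the attachment whose generalized pattern is more frequent. The noun-to-class dictionary, the toy triples, the function names `generalize` and `attach`, and the plain frequency comparison are all assumptions made for this example; an actual system would take the classes from WordNet and the triples from a local corpus.

```python
from collections import Counter

# Hypothetical noun-to-class map (assumption for this sketch); a real system
# would look each noun up in WordNet and use its top-level concept.
NOUN_CLASS = {
    "fork": "artifact",
    "spoon": "artifact",
    "knife": "artifact",
    "cheese": "food",
    "sauce": "food",
}

def generalize(triples):
    """Count (head, preposition, semantic class) patterns from raw triples."""
    counts = Counter()
    for head, prep, noun in triples:
        cls = NOUN_CLASS.get(noun)
        if cls is not None:  # skip nouns we cannot classify
            counts[(head, prep, cls)] += 1
    return counts

def attach(counts, verb, noun1, prep, noun2):
    """Compare generalized counts for verb vs. noun attachment of 'prep noun2'."""
    cls = NOUN_CLASS.get(noun2)
    verb_score = counts[(verb, prep, cls)]
    noun_score = counts[(noun1, prep, cls)]
    if verb_score == noun_score:
        return "undecided"
    return "verb" if verb_score > noun_score else "noun"

# Invented (head, preposition, noun) triples standing in for a local corpus.
corpus_triples = [
    ("eat", "with", "fork"),
    ("eat", "with", "spoon"),
    ("pasta", "with", "cheese"),
    ("pasta", "with", "sauce"),
]
counts = generalize(corpus_triples)

# "knife" never occurs in the triples, but its class "artifact" does.
print(attach(counts, "eat", "pasta", "with", "knife"))
print(attach(counts, "eat", "pasta", "with", "cheese"))
```

The point of the class-level counts is that an unseen noun can still be resolved when other nouns of the same class were observed with the same head and preposition.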


Word Sense Disambiguation, Selectional Preference, Prepositional Phrase, Springer LNCS, Local Corpus
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Hiram Calvo (1)
  • Alexander Gelbukh (1, 2)
  1. Center for Computing Research, National Polytechnic Institute, México
  2. Department of Computer Science and Engineering, Chung-Ang University, Seoul, Korea
