Improving Prepositional Phrase Attachment Disambiguation Using the Web as Corpus

  • Hiram Calvo
  • Alexander Gelbukh
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2905)


The problem of Prepositional Phrase (PP) attachment disambiguation consists in determining whether a PP is part of a noun phrase, as in "He sees the room with books", or an argument of a verb, as in "He fills the room with books". Volk has proposed two variants of a method that queries an Internet search engine to find the most probable attachment variant. In this paper we apply the latest variant of Volk’s method to Spanish, with several differences that allow us to attain better performance, close to that of statistical methods using treebanks.
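As an illustration of the general idea of deciding attachment from search-engine hit counts, the following is a minimal sketch and not the authors' implementation. The abstract does not state the scoring formula; the ratio used here (hits for "head preposition noun" divided by hits for "head") is an assumption, as are the function names `cooc` and `attach`, the tie-breaking rule, and the invented counts in `TOY_HITS`, which stand in for real search-engine queries.

```python
from typing import Callable, Dict

# Invented hit counts standing in for search-engine results (assumption:
# no real engine is queried and these numbers are purely illustrative).
TOY_HITS: Dict[str, int] = {
    "fills": 1000, "fills with books": 20,
    "sees": 2000, "sees with books": 1,
    "room": 5000, "room with books": 10,
}


def toy_count(query: str) -> int:
    """Hypothetical stand-in for a search-engine hit count lookup."""
    return TOY_HITS.get(query, 0)


def cooc(count: Callable[[str], int], head: str, prep: str, noun2: str) -> float:
    """Hits for 'head prep noun2' relative to hits for 'head' alone."""
    head_hits = count(head)
    if head_hits == 0:
        return 0.0  # no evidence for this head at all
    return count(f"{head} {prep} {noun2}") / head_hits


def attach(count: Callable[[str], int], verb: str, noun1: str,
           prep: str, noun2: str) -> str:
    """Return 'verb' or 'noun', whichever head co-occurs more strongly with the PP."""
    verb_score = cooc(count, verb, prep, noun2)
    noun_score = cooc(count, noun1, prep, noun2)
    # Ties go to verb attachment here; this is an arbitrary choice for the sketch.
    return "verb" if verb_score >= noun_score else "noun"


print(attach(toy_count, "fills", "room", "with", "books"))  # verb
print(attach(toy_count, "sees", "room", "with", "books"))   # noun
```

With the invented counts, the PP "with books" attaches to the verb in "He fills the room with books" and to the noun in "He sees the room with books", matching the two readings in the abstract. Passing the counting function as a parameter keeps the decision rule separate from whatever source of frequencies is used.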


Keywords: Search Engine, Noun Phrase, Large Corpus, Head Noun, Full Form


References

  1. Ratnaparkhi, A., Reynar, J., Roukos, S.: A Maximum Entropy Model for Prepositional Phrase Attachment. In: Proceedings of the Human Language Technology Workshop, Plainsboro, N.J. ARPA, pp. 250–255 (1994)
  2. Brill, E., Resnik, P.: A Rule Based Approach to Prepositional Phrase Attachment Disambiguation. In: Proceedings of the Fifteenth International Conference on Computational Linguistics, COLING (1994)
  3. Collins, M., Brooks, J.: Prepositional Phrase Attachment through a Backed-off Model. In: Yarowsky, D., Church, K. (eds.) Proceedings of the Third Workshop on Very Large Corpora, Cambridge, Massachusetts, June 1995, pp. 27–38 (1995)
  4. Merlo, P., Crocker, M.W., Berthouzoz, C.: Attaching Multiple Prepositional Phrases: Generalized Backed-off Estimation. In: Cardie, C., Weischedel, R. (eds.) Second Conference on Empirical Methods in Natural Language Processing, Providence, R.I., August 1-2, pp. 149–155 (1997)
  5. Zavrel, J., Daelemans, W.: Memory-Based Learning: Using Similarity for Smoothing. In: ACL (1997)
  6. Franz, A.: Independence Assumptions Considered Harmful. In: ACL (1997)
  7. Ratnaparkhi, A.: Statistical Models for Unsupervised Prepositional Phrase Attachment. In: Proceedings of the 36th ACL and 17th COLING, pp. 1079–1085 (1998)
  8. Brill, E.: Processing Natural Language without Natural Language Processing. In: Gelbukh, A. (ed.) CICLing 2003. LNCS, vol. 2588, pp. 360–369. Springer, Heidelberg (2003)
  9. Bolshakov, I.A., Galicia-Haro, S.N.: Can We Correctly Estimate the Total Number of Pages in Google for a Specific Language? In: Gelbukh, A. (ed.) CICLing 2003. LNCS, vol. 2588, pp. 415–419. Springer, Heidelberg (2003)
  10. Keller, F., Lapata, M.: Using the Web to Obtain Frequencies for Unseen Bigrams. To appear in Computational Linguistics 29(3) (2003)
  11. Volk, M.: Scaling up. Using the WWW to resolve PP attachment ambiguities. In: Proceedings of Konvens 2000, Ilmenau (October 2000)
  12. Volk, M.: Exploiting the WWW as a corpus to resolve PP attachment ambiguities. In: Proceedings of Corpus Linguistics 2001, Lancaster (2001)
  13. Vandeghinste, V.: Resolving PP Attachment Ambiguities Using the WWW. In: The Thirteenth meeting of Computational Linguistics in the Netherlands, CLIN 2002, Abstracts, Groningen (2002)
  14. Haro, S.G., Gelbukh, A., Bolshakov, I.A.: Una aproximación para resolución de ambigüedad estructural empleando tres mecanismos diferentes. In: Procesamiento de Lenguaje Natural. Sociedad Española para el Procesamiento de Lenguaje Natural (SEPLN), Spain, September 2001, vol. (27), pp. 55–64 (2001)
  15. Cuetos, F., Martí, M.A., Carreiras, V.: Léxico informatizado del Español. Edicions de la Universitat de Barcelona (2000)

Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Hiram Calvo (1)
  • Alexander Gelbukh (1, 2)
  1. Center for Computing Research, National Polytechnic Institute, México, D.F., México
  2. Department of Computer Science and Engineering, Chung-Ang University, Seoul, Korea
