Unsupervised Learning of Ontology-Linked Selectional Preferences

  • Hiram Calvo
  • Alexander Gelbukh
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3287)


We present a method for extracting selectional preferences of verbs from unannotated text. These selectional preferences are linked to an ontology (e.g., the hypernym relations found in WordNet), which extends coverage to unseen valency fillers. For example, if drink vodka is found in the training corpus, a whole WordNet hierarchy is assigned to the verb to drink (drink liquor, drink alcohol, drink beverage, drink substance, etc.), so that when drink gin is seen at a later stage, the selectional preference drink vodka can be related to drink gin (as gin is a co-hyponym of vodka). This information can be used for word sense disambiguation, prepositional phrase attachment disambiguation, syntactic disambiguation, and other applications that combine pattern-based statistical methods with knowledge. As an example, we present an application to word sense disambiguation based on the Senseval-2 training text for Spanish. The results of this experiment are similar to those obtained by Resnik for English.
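
The drink vodka / drink gin example can be illustrated with a minimal sketch. It assumes a tiny hand-written hypernym map in place of WordNet and uses raw counts only; the names HYPERNYM, hypernym_chain, learn, and support are hypothetical, and the abstract does not specify how the authors actually weight or score the classes.

```python
from collections import Counter

# Toy hypernym map (child -> parent). An illustrative stand-in for WordNet,
# not real WordNet data.
HYPERNYM = {
    "vodka": "liquor",
    "gin": "liquor",
    "liquor": "alcohol",
    "alcohol": "beverage",
    "beverage": "substance",
    "stone": "object",
}


def hypernym_chain(word):
    """Return the word followed by all of its ancestors in the toy hierarchy."""
    chain = [word]
    while chain[-1] in HYPERNYM:
        chain.append(HYPERNYM[chain[-1]])
    return chain


def learn(pairs):
    """Count (verb, class) once for every class on the filler's hypernym chain."""
    counts = Counter()
    for verb, noun in pairs:
        for cls in hypernym_chain(noun):
            counts[(verb, cls)] += 1
    return counts


def support(counts, verb, noun):
    """Return the most specific class of `noun` already seen with `verb`, or None."""
    for cls in hypernym_chain(noun):
        if (verb, cls) in counts:
            return cls
    return None


# Seeing only "drink vodka" in training still supports the unseen "drink gin".
prefs = learn([("drink", "vodka")])
print(support(prefs, "drink", "gin"))    # liquor
print(support(prefs, "drink", "stone"))  # None
```

In this sketch, gin is related to vodka through their shared class liquor, while stone shares no class with any observed filler of drink and so receives no support.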


Keywords: Word Sense; Training Corpus; Word Sense Disambiguation; Selectional Preference; Prepositional Phrase


  1. Resnik, P.: Selection and Information: A Class-Based Approach to Lexical Relationships. Doctoral thesis, University of Pennsylvania (December 1993)
  2. Resnik, P.: Selectional constraints: An information-theoretic model and its computational realization. Cognition 61, 127–159 (1996)
  3. Resnik, P.: Selectional preference and sense disambiguation. In: ACL SIGLEX Workshop on Tagging Text with Lexical Semantics: Why, What, and How?, Washington, D.C., USA, April 4-5 (1997)
  4. Agirre, E., Martinez, D.: Learning class-to-class selectional preferences. In: Proceedings of the Workshop Computational Natural Language Learning (CoNLL 2001), Toulouse, France, July 6-7 (2001)
  5. Agirre, E., Martinez, D.: Integrating selectional preferences in WordNet. In: Proceedings of the first International WordNet Conference, Mysore, India, January 21-25 (2002)
  6. Yarowsky, D., Cucerzan, S., Florian, R., Schafer, C., Wicentowski, R.: The Johns Hopkins SENSEVAL-2 System Description. In: Preiss, Yarowsky (eds.) The Proceedings of SENSEVAL-2: Second International Workshop on Evaluating Word Sense Disambiguation Systems, Toulouse, France, pp. 163–166 (2001)
  7. Suárez, A., Palomar, M.: A Maximum Entropy-based Word Sense Disambiguation System. In: Chen, H.-H., Lin, C.-Y. (eds.) Proceedings of the 19th International Conference on Computational Linguistics, COLING 2002, Taipei, Taiwan, vol. 2, pp. 960–966 (2002)
  8. Yarowsky, D.: Hierarchical decision lists for word sense disambiguation. Computers and the Humanities 34(2), 179–186 (2000)
  9. Carroll, J., McCarthy, D.: Word sense disambiguation using automatically acquired verbal preferences. In: Computers and the Humanities, 34(1-2), Netherlands (April 2000)
  10. Agirre, E., Martínez, E.D.: Unsupervised WSD based on automatically retrieved examples: The importance of bias. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, EMNLP, Barcelona, Spain (2004)
  11. Gelbukh, A., Sidorov, G., Chanona, L.: Corpus virtual, virtual: Un diccionario grande de contextos de palabras españolas compilado a través de Internet. In: Gonzalo, J., Peñas, A., Ferrández, A. (eds.) Proc. Multilingual Information Access and Natural Language Processing, International Workshop, in IBERAMIA-2002, VII Iberoamerican Conference on Artificial Intelligence, Seville, Spain, November 12-15, pp. 7–14 (2002)
  12. Brants, T.: TnT: A Statistical Part-of-Speech Tagger. In: Proceedings of the 6th Applied Natural Language Processing Conference, Seattle, Washington, USA (2000)
  13. Morales-Carrasco, R., Gelbukh, A.: Evaluation of TnT Tagger for Spanish. In: Proc. Fourth Mexican International Conference on Computer Science, Tlaxcala, Mexico, September 08-12 (2003)
  14. Monedero, J., González, J., Goñi, J., Iglesias, C., Nieto, A.: Obtención automática de marcos de subcategorización verbal a partir de texto etiquetado: el sistema SOAMAS. In: Actas del XI Congreso de la Sociedad Española para el Procesamiento del Lenguaje Natural SEPLN, Bilbao, Spain, pp. 241–254 (1995)

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Hiram Calvo (1)
  • Alexander Gelbukh (1, 2)
  1. Center for Computing Research, National Polytechnic Institute, México, D.F., México
  2. Department of Computer Science and Engineering, Chung-Ang University, Seoul, Korea
