Automatic Assignment of Wikipedia Encyclopedic Entries to WordNet Synsets

  • Maria Ruiz-Casado
  • Enrique Alfonseca
  • Pablo Castells
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3528)


We describe an approach for automatically associating entries from an online encyclopedia with concepts in an ontology or a lexical semantic network. The approach has been tested with the Simple English Wikipedia and WordNet, although it can be used with other resources. Its accuracy in disambiguating the sense of encyclopedia entries reaches 91.11% (83.89% for polysemous words). It will be applied to enriching ontologies with encyclopedic knowledge.



Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Maria Ruiz-Casado (1)
  • Enrique Alfonseca (1)
  • Pablo Castells (1)
  1. Computer Science Dep., Universidad Autonoma de Madrid, Madrid, Spain
