Automatic Extraction of Semantic Relationships for WordNet by Means of Pattern Learning from Wikipedia

  • Conference paper
Natural Language Processing and Information Systems (NLDB 2005)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3513)

Abstract

This paper describes an automatic approach to identifying lexical patterns that represent semantic relationships between concepts in an on-line encyclopedia. These patterns can then be applied to extend existing ontologies or semantic networks with new relations. The experiments have been performed with the Simple English Wikipedia and WordNet 1.7. A new algorithm has been devised for automatically generalising the lexical patterns found in the encyclopedia entries. We have found general patterns for the hyperonymy, hyponymy, holonymy and meronymy relations and, using them, we have extracted more than 1200 new relationships that did not originally appear in WordNet. The precision of these relationships ranges between 0.61 and 0.69, depending on the relation.
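The abstract does not spell out how the lexical patterns are generalised, so the following is only a rough sketch of one plausible way to do it, not the authors' algorithm. It assumes that patterns are lists of word tokens, aligns two patterns with a word-level edit distance (the Wagner-Fischer dynamic programme), and replaces every position where they disagree with a wildcard. The names `generalise`, `WILDCARD`, `HOOK` and `TARGET` are hypothetical and chosen for illustration.

```python
from typing import List

WILDCARD = "*"  # hypothetical placeholder meaning "any word (or none) here"


def generalise(p1: List[str], p2: List[str]) -> List[str]:
    """Merge two token patterns into one via a word-level edit-distance alignment."""
    n, m = len(p1), len(p2)

    # dist[i][j] = edit distance between p1[:i] and p2[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i
    for j in range(m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if p1[i - 1] == p2[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # word only in p1
                dist[i][j - 1] + 1,         # word only in p2
                dist[i - 1][j - 1] + cost,  # same word or substitution
            )

    # Trace the alignment back from the end, keeping shared words
    # and emitting a wildcard wherever the two patterns differ.
    out: List[str] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and p1[i - 1] == p2[j - 1] and dist[i][j] == dist[i - 1][j - 1]:
            out.append(p1[i - 1])
            i, j = i - 1, j - 1
        elif i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + 1:
            out.append(WILDCARD)
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            out.append(WILDCARD)
            i -= 1
        else:
            out.append(WILDCARD)
            j -= 1
    out.reverse()

    # Collapse runs of wildcards into a single one.
    merged: List[str] = []
    for tok in out:
        if tok == WILDCARD and merged and merged[-1] == WILDCARD:
            continue
        merged.append(tok)
    return merged


if __name__ == "__main__":
    a = "HOOK is a kind of TARGET".split()
    b = "HOOK is a type of TARGET".split()
    print(" ".join(generalise(a, b)))  # HOOK is a * of TARGET
```

In this sketch a wildcard stands for both substituted and inserted words, so a real system would need to decide whether a wildcard may also match nothing, and how far generalisation can go before precision suffers.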

This work has been sponsored by CICYT, project numbers TIC2002-01948 and TIN2004-03140.

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ruiz-Casado, M., Alfonseca, E., Castells, P. (2005). Automatic Extraction of Semantic Relationships for WordNet by Means of Pattern Learning from Wikipedia. In: Montoyo, A., Muñoz, R., Métais, E. (eds) Natural Language Processing and Information Systems. NLDB 2005. Lecture Notes in Computer Science, vol 3513. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11428817_7

  • DOI: https://doi.org/10.1007/11428817_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-26031-8

  • Online ISBN: 978-3-540-32110-1

  • eBook Packages: Computer Science, Computer Science (R0)
