Extraction of Hypernymy Information from Text∗

Sang, Erik Tjong Kim; Hofmann, Katja; de Rijke, Maarten

doi:10.1007/978-3-642-17525-1_10

Part of the book series: Theory and Applications of Natural Language Processing (NLP)

Abstract

This chapter presents the results of three studies on extracting hypernymy information from text. The first compares a method based on a single extraction pattern applied to the web with a set of patterns applied to a large corpus. The second examines how relation extraction can be performed reliably from text without access to a word sense tagger. The third checks what effect elaborate syntactic information has on the extraction process. Both using more data and removing ambiguities from the training data are found to be beneficial for the extraction process. But it is surprising to find a positive effect of additional syntactic information.

Parts of this chapter have been published as (Tjong Kim Sang and Hofmann, 2007; Hofmann and Tjong Kim Sang, 2007; Tjong Kim Sang, 2009).

References

Baayen R, Piepenbrock R, Gulikers L (1995) The CELEX Lexical Database (Release 2) [CD-ROM]. Philadelphia, PA: Linguistic Data Consortium, University of Pennsylvania
van der Beek L, Bouma G, Malouf R, van Noord G (2002) The Alpino dependency treebank. In: Proceedings of CLIN 2001, Twente University
Caraballo SA (1999) Automatic construction of a hypernym-labeled noun hierarchy from text. In: Proceedings of ACL-99, Maryland, USA
Fellbaum C (1998) WordNet – An Electronic Lexical Database. The MIT Press
Genkin A, Lewis DD, Madigan D (2004) Large-Scale Bayesian Logistic Regression for Text Categorization. Technical report, Rutgers University, New Jersey
Hearst MA (1992) Automatic acquisition of hyponyms from large text corpora. In: Proceedings of ACL-92, Newark, Delaware, USA
Hofmann K, Tjong Kim Sang E (2007) Automatic extension of non-English wordnets. In: Proceedings of SIGIR’07, Amsterdam, The Netherlands (poster)
IJzereef L (2004) Automatische extractie van hyperniemrelaties uit grote tekstcorpora. MSc thesis, University of Groningen (in Dutch)
Jijkoun V, de Rijke M, Mur J (2004) Information extraction for question answering: Improving recall through syntactic patterns. In: Proceedings of Coling’04, Geneva, Switzerland
Li X, Roth D (2001) Exploring evidence for shallow parsing. In: Proceedings of Conference on Computational Natural Language Learning (CoNLL) 2001
McCarthy D, Koeling R, Weeds J, Carroll J (2007) Unsupervised acquisition of predominant word senses. Computational Linguistics 33(4)
van Noord G (2006) At last parsing is now operational. In: Mertens P, Fairon C, Dister A, Watrin P (eds) TALN06. Verbum Ex Machina. Actes de la 13e conference sur le traitement automatique des langues naturelles
van Noord G (2009) Huge parsed corpora in Lassy. In: Proceedings of TLT7, LOT, Groningen, The Netherlands
van der Plas L, Bouma G (2005) Automatic acquisition of lexico-semantic knowledge for QA. In: Proceedings of the IJCNLP Workshop on Ontologies and Lexical Resources, Jeju Island, Korea
Sabou M, Wroe C, Goble C, Mishne G (2005) Learning domain ontologies for web service descriptions: an experiment in bioinformatics. In: 14th International World Wide Web Conference (WWW2005), Chiba, Japan
Snow R, Jurafsky D, Ng AY (2005) Learning syntactic patterns for automatic hypernym discovery. In: NIPS 2005, Vancouver, Canada
Tjong Kim Sang E (2009) To use a treebank or not – which is better for hypernym extraction. In: Proceedings of the Seventh International Workshop on Treebanks and Linguistic Theories (TLT 7), Groningen, The Netherlands
Tjong Kim Sang E, Hofmann K (2007) Automatic extraction of Dutch hypernym-hyponym pairs. In: Proceedings of CLIN-2006, Leuven, Belgium
Van Eynde F (2005) Part of Speech Tagging en Lemmatisering van het Corpus Gesproken Nederlands. K.U. Leuven (in Dutch)
Vossen P (1998) EuroWordNet: A Multilingual Database with Lexical Semantic Networks. Kluwer Academic Publisher
Vossen P, Maks I, Segers R, van der Vliet H (2008) Integrating lexical units, synsets, and ontology in the Cornetto database. In: Proceedings of LREC-2008, Marrakech, Morocco

Author information

Authors and Affiliations

Alfa-informatica, University of Groningen, Groningen, The Netherlands
Erik Tjong Kim Sang
ISLA, University of Amsterdam, Amsterdam, The Netherlands
Katja Hofmann & Maarten de Rijke

Corresponding author

Correspondence to Erik Tjong Kim Sang.

Editor information

Editors and Affiliations

Fac. Humanities, Tilburg University, Tilburg, Netherlands
Antal van den Bosch
Information Science, University of Groningen, NL-9700 AS Groningen, Netherlands
Gosse Bouma

About this chapter

Cite this chapter

Sang, E.T.K., Hofmann, K., de Rijke, M. (2011). Extraction of Hypernymy Information from Text. In: van den Bosch, A., Bouma, G. (eds) Interactive Multi-modal Question-Answering. Theory and Applications of Natural Language Processing. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17525-1_10

DOI: https://doi.org/10.1007/978-3-642-17525-1_10
Published: 08 April 2011
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-17524-4
Online ISBN: 978-3-642-17525-1
eBook Packages: Computer Science, Computer Science (R0)
