Web-Based Relation Extraction for the Food Domain

  • Michael Wiegand
  • Benjamin Roth
  • Dietrich Klakow
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7337)

Abstract

In this paper, we examine methods for extracting different domain-specific relations in the food domain. We employ extraction methods ranging from surface patterns to co-occurrence measures applied to different parts of a document. We show that the effectiveness of a particular method depends very much on the relation type considered and that no single method works equally well for every relation type. As we need to process a large amount of unlabeled data, our methods require only a low level of linguistic processing. This also has the advantage that these methods can provide responses in real time.

Keywords

Food Item, Relation Type, Linguistic Processing, Sparkling Wine, Mean Reciprocal Rank
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Michael Wiegand (1)
  • Benjamin Roth (1)
  • Dietrich Klakow (1)
  1. Spoken Language Systems, Saarland University, Germany
