More Informative Open Information Extraction via Simple Inference

  • Hannah Bast
  • Elmar Haussmann
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8416)

Abstract

Recent Open Information Extraction (OpenIE) systems utilize grammatical structure to extract facts with very high recall and good precision. In this paper, we point out that a significant fraction of the extracted facts is, however, not informative. For example, for the sentence "The ICRW is a non-profit organization headquartered in Washington", the extracted fact (a non-profit organization) (is headquartered in) (Washington) is not informative, because its subject does not identify which organization is meant. This is a problem for semantic search applications that use these triples, and it is hard to fix once triple extraction is complete. We therefore propose to integrate a set of simple inference rules into the extraction process. Our evaluation shows that, even with these simple rules, the percentage of informative triples can be improved considerably and the already high recall can be improved even further. Both improvements directly increase the quality of search on these triples.
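The ICRW example can be made concrete with a small sketch. The abstract does not spell out the paper's rules, so the rule below is an assumption chosen only to illustrate the idea: if a triple (X) (is) (Y) was extracted and another triple has Y as its subject, a more informative triple with X as the subject can be derived. The names `Triple` and `substitute_is_a_subjects`, and the intermediate triple (The ICRW) (is) (a non-profit organization), are hypothetical and not taken from the paper.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    subject: str
    relation: str
    obj: str


def substitute_is_a_subjects(triples: List[Triple]) -> List[Triple]:
    """Hypothetical rule (an assumption, not the paper's definition):
    if (X)(is)(Y) was extracted and another triple has Y as its subject,
    also emit that triple with X as the subject."""
    # Map each description Y back to the entity X it describes.
    described_by = {t.obj: t.subject for t in triples if t.relation == "is"}
    inferred = []
    for t in triples:
        if t.relation != "is" and t.subject in described_by:
            inferred.append(Triple(described_by[t.subject], t.relation, t.obj))
    return inferred


# Triples as an OpenIE system might extract them from the example sentence.
extracted = [
    Triple("The ICRW", "is", "a non-profit organization"),
    Triple("a non-profit organization", "is headquartered in", "Washington"),
]
inferred = substitute_is_a_subjects(extracted)
print(inferred)
```

Under these assumptions the sketch derives (The ICRW) (is headquartered in) (Washington), a triple that a search for the headquarters of the ICRW could match, whereas the original uninformative triple could not.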

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Hannah Bast (1)
  • Elmar Haussmann (1)
  1. Department of Computer Science, University of Freiburg, Freiburg, Germany
