Dependency Pattern Models for Information Extraction

DOI: 10.1007/s11168-009-9061-2

Cite this article as:
Stevenson, M. & Greenwood, M.A. Res on Lang and Comput (2009) 7: 13. doi:10.1007/s11168-009-9061-2

Abstract

Several techniques for the automatic acquisition of Information Extraction (IE) systems have used dependency trees as the basis of an extraction pattern representation. These approaches have used a variety of pattern models (schemes for representing IE patterns based on particular parts of the dependency analysis). An appropriate pattern model should be expressive enough to represent the information to be extracted from text without being overly complex. Previous investigations into the appropriateness of the currently proposed models have been limited. This paper compares a variety of pattern models, including previously reported ones and variations of them. Each model is evaluated using existing data consisting of IE scenarios from two very different domains (newswire stories and biomedical journal articles). The models are analysed in terms of their ability to represent relevant information, the number of patterns they generate, and their performance on an IE scenario. The best performance was observed for two models that use the majority of the relevant portions of the dependency tree without including irrelevant sections.

Keywords

Complexity; Dependency analysis; Evaluation; Expressivity; Information Extraction; Parsing; Relation extraction

Copyright information

© Springer Science+Business Media B.V. 2009

Authors and Affiliations

  1. Department of Computer Science, The University of Sheffield, Sheffield, UK