Hierarchical Learning Strategy in Relation Extraction Using Support Vector Machines
This paper proposes a novel hierarchical learning strategy to deal with the data sparseness problem in relation extraction by modeling the commonality among related classes. For each class in the hierarchy, whether manually predefined or automatically clustered, a discriminative function is determined in a top-down way. As an upper-level class normally has many more positive training examples than a lower-level class, its discriminative function can be determined more reliably and effectively, and can thus guide the learning of the discriminative function at the lower level, which might otherwise suffer from limited training data. In this paper, state-of-the-art Support Vector Machines are applied as the basic classifier learning approach within the hierarchical learning strategy. Evaluation on the ACE RDC 2003 and 2004 corpora shows that the hierarchical learning strategy substantially improves performance on the least- and medium-frequent relations.
Keywords: Support Vector Machine, Discriminative Function, Learning Strategy, Class Hierarchy, Relation Extraction
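The top-down strategy described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the upper-level discriminative function guides the lower-level one, so the sketch assumes one plausible mechanism: the lower-level weight vector is regularized toward the upper-level weights rather than toward zero. A small subgradient-trained linear SVM stands in for a full SVM package, and all names (`train_linear_svm`, `predict`, the toy feature vectors) are hypothetical.

```python
from typing import List, Optional, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def train_linear_svm(
    X: Sequence[Sequence[float]],
    y: Sequence[int],
    prior: Optional[Sequence[float]] = None,
    lam: float = 0.01,
    lr: float = 0.1,
    epochs: int = 50,
) -> List[float]:
    """Subgradient descent on hinge loss + (lam / 2) * ||w - prior||^2.

    ASSUMPTION: pulling w toward `prior` (the parent's weights) is only one
    possible way to let an upper-level class guide a lower-level one.
    With prior=None this is an ordinary L2-regularized linear SVM.
    """
    dim = len(X[0])
    anchor = list(prior) if prior is not None else [0.0] * dim
    w = list(anchor)  # start from the parent's discriminative function
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            violated = yi * dot(w, xi) < 1.0  # margin check before the update
            for j in range(dim):
                grad = lam * (w[j] - anchor[j])
                if violated:
                    grad -= yi * xi[j]
                w[j] -= lr * grad
    return w


def predict(w: Sequence[float], x: Sequence[float]) -> int:
    return 1 if dot(w, x) >= 0.0 else -1


# Toy data (hypothetical): the upper-level class has more labelled examples...
parent_X = [[2.0, 0.0], [2.0, 1.0], [-2.0, 0.0], [-2.0, -1.0]]
parent_y = [1, 1, -1, -1]
# ...while the lower-level class has only two.
child_X = [[2.0, 1.0], [-2.0, -1.0]]
child_y = [1, -1]

# Top-down: learn the parent first, then use it to guide the child.
w_parent = train_linear_svm(parent_X, parent_y)
w_child = train_linear_svm(child_X, child_y, prior=w_parent, lam=1.0)
```

A larger `lam` for the child keeps its weights closer to the parent's, which is the intended effect when the child has few positive examples; with more child data a smaller value would let the child drift further from the parent.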
- ACE (2000-2005). Automatic Content Extraction, http://www.ldc.upenn.edu/Projects/ACE/
- Bunescu, R., Mooney, R.J.: A shortest path dependency kernel for relation extraction. In: HLT/EMNLP 2005, Vancouver, B.C., October 6-8, pp. 724–731 (2005)
- Collins, M.: Head-driven statistical models for natural language parsing. Ph.D. Dissertation, University of Pennsylvania (1999)
- Culotta, A., Sorensen, J.: Dependency tree kernels for relation extraction. In: ACL 2004, Barcelona, Spain, July 21-26, pp. 423–429 (2004)
- Miller, S., Fox, H., Ramshaw, L., Weischedel, R.: A novel use of statistical parsing to extract information from text. In: ANLP 2000, Seattle, USA, April 29 - May 4, pp. 226–233 (2000)
- MUC-7. In: Proceedings of the 7th Message Understanding Conference (MUC-7). Morgan Kaufmann, San Francisco (1998)
- Kambhatla, N.: Combining lexical, syntactic and semantic features with Maximum Entropy models for extracting relations. In: ACL 2004 (Poster), Barcelona, Spain, July 21-26, pp. 178–181 (2004)
- Platt, J.: Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods. In: Smola, J., Bartlett, P., Schölkopf, B., Schuurmans, D. (eds.) Advances in Large Margin Classifiers. MIT Press, Cambridge (1999)
- Zhou, G.D., Su, J., Zhang, J., Zhang, M.: Exploring various knowledge in relation extraction. In: ACL 2005, Ann Arbor, Michigan, USA, June 25-30, pp. 427–434 (2005)