An Empirical Research on Extracting Relations from Wikipedia Text
A feature-based relation classification approach is presented, in which probabilistic and semantic relatedness features between patterns and relation types are combined with other linguistic information. The importance of each feature set is evaluated with a chi-square estimator, and the experiments show that the relatedness features have a large impact on relation classification performance. A series of experiments is also performed to compare different machine learning approaches to relation classification; among these, the Bayesian classifier outperformed the other approaches, including the Support Vector Machine (SVM).
Keywords: Information extraction, relation classification, feature-based, relatedness information
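The abstract says feature importance is evaluated with a chi-square estimator but does not give the computation. The following is a minimal sketch of the standard chi-square statistic for one binary feature against one relation type, not the authors' implementation. The function name `chi_square`, the relation labels `born_in` and `works_for`, and the toy data are all hypothetical.

```python
from collections import Counter


def chi_square(feature_flags, labels, target):
    """Chi-square statistic for a binary feature against one relation type.

    feature_flags: iterable of truthy/falsy values, one per instance.
    labels: relation type of each instance.
    target: the relation type being tested (hypothetical names below).
    """
    n = len(labels)
    # Observed 2x2 contingency counts: (feature present?, label == target?)
    observed = Counter(
        (bool(flag), label == target)
        for flag, label in zip(feature_flags, labels)
    )
    score = 0.0
    for present in (True, False):
        for is_target in (True, False):
            row_total = observed[(present, True)] + observed[(present, False)]
            col_total = observed[(True, is_target)] + observed[(False, is_target)]
            expected = row_total * col_total / n
            if expected > 0:  # skip empty rows/columns to avoid division by zero
                score += (observed[(present, is_target)] - expected) ** 2 / expected
    return score


# Toy data (assumed, for illustration only).
labels = ["born_in"] * 3 + ["works_for"] * 3
informative = chi_square([1, 1, 1, 0, 0, 0], labels, "born_in")
uninformative = chi_square([1, 0, 1, 0], ["born_in", "born_in", "works_for", "works_for"], "born_in")
print(informative, uninformative)
```

A higher score means the feature's presence is more strongly associated with the relation type. In this toy data the first feature separates the two types perfectly, while the second is independent of the label.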