An Empirical Research on Extracting Relations from Wikipedia Text

  • Jin-Xia Huang
  • Pum-Mo Ryu
  • Key-Sun Choi
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5326)


A feature-based relation classification approach is presented, in which probabilistic and semantic relatedness features between patterns and relation types are employed together with other linguistic information. The importance of each feature set is evaluated with a Chi-square estimator, and the experiments show that the relatedness features have a large impact on relation classification performance. A series of experiments is also performed to evaluate different machine learning approaches to relation classification, among which the Bayesian approach outperformed the others, including the Support Vector Machine (SVM).
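As an illustration of the Chi-square feature evaluation mentioned in the abstract, the following is a minimal sketch of the standard Pearson Chi-square statistic between one categorical feature and the relation labels. It is not the authors' implementation: the function name `chi_square` and the toy feature values and relation labels are hypothetical, chosen only to show how a feature that separates the classes scores higher than one that does not.

```python
from collections import Counter


def chi_square(feature_values, labels):
    """Pearson chi-square statistic between a categorical feature and class labels.

    Hypothetical helper for illustration; a higher score means the feature
    is more strongly associated with the class labels.
    """
    n = len(labels)
    if n == 0 or n != len(feature_values):
        raise ValueError("feature_values and labels must be non-empty and equal in length")

    joint = Counter(zip(feature_values, labels))   # observed cell counts
    feature_totals = Counter(feature_values)       # row totals
    label_totals = Counter(labels)                 # column totals

    stat = 0.0
    for f, f_count in feature_totals.items():
        for c, c_count in label_totals.items():
            expected = f_count * c_count / n       # count expected under independence
            observed = joint.get((f, c), 0)
            stat += (observed - expected) ** 2 / expected
    return stat


# Toy data (assumed, not from the paper): four instances, two relation types.
labels = ["born_in", "born_in", "works_for", "works_for"]
informative = ["p1", "p1", "p2", "p2"]      # pattern feature aligned with the labels
uninformative = ["p1", "p2", "p1", "p2"]    # pattern feature independent of the labels

print(chi_square(informative, labels))
print(chi_square(uninformative, labels))
```

Ranking feature sets by such a score is one common way to compare their contribution before training a classifier; the paper's exact procedure may differ.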


Information extraction, relation classification, feature-based, relatedness information





Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Jin-Xia Huang
    • 1
  • Pum-Mo Ryu
    • 1
  • Key-Sun Choi
    • 1
  1. SWRC, Computer Science Division, EECS Dept., KAIST, Daejeon, Republic of Korea
