PORE: Positive-Only Relation Extraction from Wikipedia Text

Wang, Gang; Yu, Yong; Zhu, Haiping

doi:10.1007/978-3-540-76298-0_42

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4825)

Abstract

Extracting semantic relations is of great importance for the creation of Semantic Web content. It is of great benefit to semi-automatically extract relations from the free text of Wikipedia using the structured content readily available in it. Pattern matching methods that rely on information redundancy do not work well, since Wikipedia contains much less redundant information than the Web. Multi-class classification methods are not reasonable, since no classification of relation types is available in Wikipedia. In this paper, we propose PORE (Positive-Only Relation Extraction) for relation extraction from Wikipedia text. The core algorithm, B-POL, extends a state-of-the-art positive-only learning algorithm using bootstrapping, strong negative identification, and transductive inference to work with fewer positive training examples. We conducted experiments on several relations with different amounts of training data. The experimental results show that B-POL works effectively given only a small number of positive training examples and that it significantly outperforms the original positive-only learning approaches and a multi-class SVM. Furthermore, although PORE is applied in the context of Wikipedia, the core algorithm B-POL is a general approach to Ontology Population and can be adapted to other domains.

This work is funded by IBM China Research Lab.
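
The abstract names the ingredients of B-POL (positive-only learning, strong negative identification, bootstrapping) but not their details. The following is a minimal, generic sketch of that family of techniques, not the paper's algorithm: the function name `pu_bootstrap`, the cosine-to-centroid scoring, and the thresholds `neg_threshold` and `margin` are all assumptions chosen for illustration.

```python
from math import sqrt


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def pu_bootstrap(positives, unlabeled, neg_threshold=0.2, margin=0.3,
                 max_rounds=10):
    """Toy positive-unlabeled learner (hypothetical; NOT the paper's B-POL).

    Step 1 (strong negatives): unlabeled vectors whose similarity to the
    positive centroid falls below neg_threshold are treated as negatives.
    Step 2 (bootstrapping): unlabeled vectors that are clearly closer to
    one centroid than the other are moved into that labeled set, and the
    centroids are recomputed, until nothing changes.
    Returns (positives, negatives, still_unlabeled).
    """
    pos = list(positives)
    pc = centroid(pos)
    neg = [u for u in unlabeled if cosine(u, pc) < neg_threshold]
    rest = [u for u in unlabeled if cosine(u, pc) >= neg_threshold]

    for _ in range(max_rounds):
        if not neg or not rest:
            break
        pc, nc = centroid(pos), centroid(neg)
        new_pos, new_neg, undecided = [], [], []
        for u in rest:
            score = cosine(u, pc) - cosine(u, nc)
            if score > margin:
                new_pos.append(u)
            elif score < -margin:
                new_neg.append(u)
            else:
                undecided.append(u)
        if not new_pos and not new_neg:
            break  # no confident assignments left
        pos += new_pos
        neg += new_neg
        rest = undecided
    return pos, neg, rest
```

In a relation-extraction setting, each vector would stand for the features of one candidate entity pair. Starting from a handful of positive pairs, the loop grows both labeled sets from the unlabeled pool and leaves ambiguous candidates unlabeled rather than forcing a decision.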

Author information

Authors and Affiliations

Apex Data & Knowledge Management Lab, Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China
Gang Wang, Yong Yu & Haiping Zhu

Editor information

Editors and Affiliations

Ecole Polytechnique Fédérale de Lausanne (EPFL), CH-1015, Lausanne, Switzerland
Karl Aberer
Korea Advanced Institute of Science and Technology, 305-701, Daejeon, Korea
Key-Sun Choi
Stanford University, 94305, Stanford, CA, USA
Natasha Noy
TopQuadrant, 22314, VA, USA
Dean Allemang
Saltlix Inc., Korea
Kyung-Il Lee
Free University of Berlin, Germany
Lyndon Nixon
University of Maryland, 20742, College Park, MD, USA
Jennifer Golbeck
Yahoo! Research Barcelona, Spain
Peter Mika
University of Sheffield, S1 4DP, Sheffield, United Kingdom
Diana Maynard
Osaka University, 565-0047, Osaka, Japan
Riichiro Mizoguchi
Vrije Universiteit Amsterdam, The Netherlands
Guus Schreiber
École Polytechnique Fédérale de Lausanne (EPFL), 1015, Lausanne, Switzerland
Philippe Cudré-Mauroux

Cite this paper

Wang, G., Yu, Y., Zhu, H. (2007). PORE: Positive-Only Relation Extraction from Wikipedia Text. In: Aberer, K., et al. (eds.) The Semantic Web. ISWC 2007, ASWC 2007. Lecture Notes in Computer Science, vol 4825. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-76298-0_42

DOI: https://doi.org/10.1007/978-3-540-76298-0_42
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-76297-3
Online ISBN: 978-3-540-76298-0
eBook Packages: Computer Science, Computer Science (R0)
