Extracting Semantic Networks from Text Via Relational Clustering

Kok, Stanley; Domingos, Pedro

doi:10.1007/978-3-540-87479-9_59

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5211)

Included in the following conference series:

Joint European Conference on Machine Learning and Knowledge Discovery in Databases

Abstract

Extracting knowledge from text has long been a goal of AI. Initial approaches were purely logical and brittle. More recently, the availability of large quantities of text on the Web has led to the development of machine learning approaches. However, to date these have mainly extracted ground facts, as opposed to general knowledge. Other learning approaches can extract logical forms, but require supervision and do not scale. In this paper we present an unsupervised approach to extracting semantic networks from large volumes of text. We use the TextRunner system [1] to extract tuples from text, and then induce general concepts and relations from them by jointly clustering the objects and relational strings in the tuples. Our approach is defined in Markov logic using four simple rules. Experiments on a dataset of two million tuples show that it outperforms three other relational clustering approaches, and extracts meaningful semantic networks.
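
To make the idea of jointly clustering objects and relational strings concrete, the following is a minimal sketch, not the Markov logic model used in the paper. It assumes a handful of hypothetical TextRunner-style tuples of the form (relation string, object, object) and hand-assigned cluster labels (the names `TUPLES`, `RELATION_CLUSTER`, `OBJECT_CLUSTER`, and `cluster_combination_density` are all illustrative). For each combination of a relation cluster and two object clusters, it reports what fraction of the possible ground tuples was actually observed, the kind of statistic a relational clustering method could use to judge whether a clustering captures a general relation.

```python
from collections import Counter

# Hypothetical toy tuples: (relation string, first object, second object).
TUPLES = [
    ("is_capital_of", "Paris", "France"),
    ("is_capital_of", "Rome", "Italy"),
    ("is_the_capital_city_of", "Paris", "France"),
    ("is_the_capital_city_of", "Berlin", "Germany"),
]

# Hypothetical cluster assignments, fixed by hand for illustration only.
# A real system would search over these assignments rather than take them as given.
RELATION_CLUSTER = {
    "is_capital_of": "CapitalOf",
    "is_the_capital_city_of": "CapitalOf",
}
OBJECT_CLUSTER = {
    "Paris": "City", "Rome": "City", "Berlin": "City",
    "France": "Country", "Italy": "Country", "Germany": "Country",
}


def cluster_combination_density(tuples, relation_cluster, object_cluster):
    """Map each observed (relation cluster, object cluster, object cluster)
    combination to the fraction of its possible ground tuples that occur."""
    # Count distinct observed tuples per cluster combination.
    observed = Counter(
        (relation_cluster[r], object_cluster[x], object_cluster[y])
        for r, x, y in set(tuples)
    )
    # Cluster sizes determine how many ground tuples a combination could contain.
    rel_sizes = Counter(relation_cluster.values())
    obj_sizes = Counter(object_cluster.values())
    density = {}
    for (rc, xc, yc), count in observed.items():
        possible = rel_sizes[rc] * obj_sizes[xc] * obj_sizes[yc]
        density[(rc, xc, yc)] = count / possible
    return density


if __name__ == "__main__":
    for combo, frac in cluster_combination_density(
        TUPLES, RELATION_CLUSTER, OBJECT_CLUSTER
    ).items():
        print(combo, round(frac, 3))
```

In this toy data all four tuples fall into a single cluster combination, which can be read as one edge of a small semantic network linking the City and Country concepts through the CapitalOf relation.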

References

1. Banko, M., Cafarella, M.J., Soderland, S., Broadhead, M., Etzioni, O.: Open information extraction from the web. In: Proc. IJCAI 2007, Hyderabad, India. AAAI Press, Menlo Park (2007)
2. Banko, M., Etzioni, O.: Strategies for lifelong knowledge extraction from the web. In: Proc. K-CAP 2007, British Columbia, Canada (2007)
3. Charniak, E.: Toward a Model of Children’s Story Comprehension. PhD thesis, Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Boston, MA (1972)
4. Craven, M.W., DiPasquo, D., Freitag, D., McCallum, A., Mitchell, T., Nigam, K., Slattery, S.: Learning to extract symbolic knowledge from the World Wide Web. In: Proc. AAAI 1998, Madison, WI, pp. 509–516. AAAI Press, Menlo Park (1998)
5. Dhillon, I.S., Mallela, S., Modha, D.S.: Information-theoretic co-clustering. In: Proc. KDD 2003, Washington, DC (2003)
6. Dyer, M.G.: In-Depth Understanding. MIT Press, Cambridge (1983)
7. Etzioni, O., Banko, M., Cafarella, M.J.: Machine reading. In: Proc. 2007 AAAI Spring Symposium on Machine Reading, Palo Alto, CA. AAAI Press, Menlo Park (2007)
8. Fellbaum, C. (ed.): WordNet: An Electronic Lexical Database. MIT Press, Cambridge (1998)
9. Genesereth, M.R., Nilsson, N.J.: Logical Foundations of Artificial Intelligence. Morgan Kaufmann, San Mateo (1987)
10. Hasegawa, T., Sekine, S., Grishman, R.: Discovering relations among named entities from large corpora. In: Proc. ACL 2004, Barcelona, Spain (2004)
11. Kemp, C., Tenenbaum, J.B., Griffiths, T.L., Yamada, T., Ueda, N.: Learning systems of concepts with an infinite relational model. In: Proc. AAAI 2006, Boston, MA. AAAI Press, Menlo Park (2006)
12. Kok, S., Domingos, P.: Statistical predicate invention. In: Proc. ICML 2007, Corvallis, Oregon, pp. 440–443. ACM Press, New York (2007)
13. Lehnert, W.G.: The Process of Question Answering. Erlbaum, Hillsdale (1978)
14. McCallum, A., Jensen, D.: A note on the unification of information extraction and data mining using conditional-probability, relational models. In: Proc. IJCAI 2003 Workshop on Learning Statistical Models from Relational Data, Acapulco, Mexico, pp. 79–86. IJCAI (2003)
15. McCallum, A., Nigam, K., Ungar, L.: Efficient clustering of high-dimensional data sets with application to reference matching. In: Proc. KDD 2000, pp. 169–178 (2000)
16. Mitchell, T.: Reading the web: A breakthrough goal for AI. AI Magazine 26(3), 12–16 (2005)
17. Mooney, R.J.: Learning for semantic parsing. In: Gelbukh, A. (ed.) CICLing 2007. LNCS, vol. 4394, pp. 311–324. Springer, Heidelberg (2007)
18. Pasca, M., Lin, D., Bigham, J., Lifchits, A., Jain, A.: Names and similarities on the web: Fact extraction on the fast lane. In: Proc. ACL/COLING 2006 (2006)
19. Pearl, J.: Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, San Francisco (1988)
20. Quillian, M.R.: Semantic memory. In: Minsky, M.L. (ed.) Semantic Information Processing, pp. 216–270. MIT Press, Cambridge (1968)
21. Rajaraman, K., Tan, A.-H.: Mining semantic networks for knowledge discovery. In: Proc. ICDM 2003 (2003)
22. Richardson, M., Domingos, P.: Markov logic networks. Machine Learning 62, 107–136 (2006)
23. Schank, R.C., Riesbeck, C.K.: Inside Computer Understanding. Erlbaum, Hillsdale (1981)
24. Schwarz, G.: Estimating the dimension of a model. Annals of Statistics 6, 461–464 (1978)
25. Shinyama, Y., Sekine, S.: Preemptive information extraction using unrestricted relation discovery. In: Proc. HLT-NAACL 2006, New York (2006)
26. Wong, Y.W., Mooney, R.J.: Learning synchronous grammars for semantic parsing with lambda calculus. In: Proc. ACL 2007, Prague, Czech Republic (2007)
27. Xu, Z., Tresp, V., Yu, K., Kriegel, H.-P.: Infinite hidden relational models. In: Proc. UAI 2006, Cambridge, MA (2006)
28. Yates, A., Etzioni, O.: Unsupervised resolution of objects and relations on the web. In: Proc. NAACL-HLT 2007, Rochester, NY (2007)
29. Zettlemoyer, L.S., Collins, M.: Learning to map sentences to logical form: Structured classification with probabilistic categorial grammars. In: Proc. UAI 2005, Edinburgh, Scotland (2005)

Author information

Authors and Affiliations

Department of Computer Science and Engineering, University of Washington, Seattle, WA 98195-2350, USA
Stanley Kok & Pedro Domingos

Editor information

Walter Daelemans, Bart Goethals, Katharina Morik

Cite this paper

Kok, S., Domingos, P. (2008). Extracting Semantic Networks from Text Via Relational Clustering. In: Daelemans, W., Goethals, B., Morik, K. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2008. Lecture Notes in Computer Science, vol 5211. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87479-9_59

DOI: https://doi.org/10.1007/978-3-540-87479-9_59
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-87478-2
Online ISBN: 978-3-540-87479-9
eBook Packages: Computer Science, Computer Science (R0)
