Abstract
Open Information Extraction (OIE) is a recent unsupervised strategy for extracting large numbers of basic propositions (verb-based triples) from massive text corpora, and it scales to Web-size document collections. We propose a multilingual rule-based OIE method that takes dependency parses in the CoNLL-X format as input, identifies argument structures within those parses, and extracts a set of basic propositions from each argument structure. Our method requires no training data and, according to experimental studies, obtains higher recall and higher precision than existing approaches that rely on training data. Experiments were performed in three languages: English, Portuguese, and Spanish.
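The pipeline outlined in the abstract (CoNLL-X input, argument structures, verb-based triples) can be illustrated with a minimal sketch. This is not the method of the paper: the rules, the function names (`parse_conllx`, `subtree_text`, `extract_triples`), the dependency labels in `SUBJECT_LABELS` and `OBJECT_LABELS`, and the sample sentence are all assumptions chosen for illustration. Only the ten-column layout follows the CoNLL-X format itself.

```python
from collections import defaultdict

# Column order defined by the CoNLL-X format.
CONLLX_FIELDS = ["id", "form", "lemma", "cpostag", "postag",
                 "feats", "head", "deprel", "phead", "pdeprel"]

# Assumed label sets; real treebanks use their own inventories.
SUBJECT_LABELS = {"SBJ", "nsubj"}
OBJECT_LABELS = {"OBJ", "dobj"}


def parse_conllx(block):
    """Turn one sentence of tab-separated CoNLL-X lines into token dicts."""
    tokens = []
    for line in block.strip().splitlines():
        token = dict(zip(CONLLX_FIELDS, line.split("\t")))
        token["id"] = int(token["id"])
        token["head"] = int(token["head"])
        tokens.append(token)
    return tokens


def subtree_text(token, children):
    """Return the words of a token's subtree in sentence order."""
    collected, stack = [], [token]
    while stack:
        current = stack.pop()
        collected.append(current)
        stack.extend(children[current["id"]])
    ordered = sorted(collected, key=lambda t: t["id"])
    return " ".join(t["form"] for t in ordered)


def extract_triples(tokens):
    """Emit (subject, verb, object) for each verb with both arguments."""
    children = defaultdict(list)
    for token in tokens:
        children[token["head"]].append(token)
    triples = []
    for token in tokens:
        # Hypothetical rule: any coarse tag starting with "V" is a verb.
        if not token["cpostag"].startswith("V"):
            continue
        deps = children[token["id"]]
        subjects = [d for d in deps if d["deprel"] in SUBJECT_LABELS]
        objects = [d for d in deps if d["deprel"] in OBJECT_LABELS]
        for subj in subjects:
            for obj in objects:
                triples.append((subtree_text(subj, children),
                                token["form"],
                                subtree_text(obj, children)))
    return triples


# Invented sample parse; rows are written with spaces and joined with tabs.
SAMPLE = "\n".join("\t".join(row.split()) for row in [
    "1 The the DT DT _ 2 NMOD _ _",
    "2 company company NN NN _ 3 SBJ _ _",
    "3 acquired acquire VB VBD _ 0 ROOT _ _",
    "4 a a DT DT _ 5 NMOD _ _",
    "5 startup startup NN NN _ 3 OBJ _ _",
])

print(extract_triples(parse_conllx(SAMPLE)))
```

A rule-based extractor of this kind needs no training data of its own, but its output depends on the dependency parser and on how well the label sets match the tagset of each language.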
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Gamallo, P., Garcia, M. (2015). Multilingual Open Information Extraction. In: Pereira, F., Machado, P., Costa, E., Cardoso, A. (eds) Progress in Artificial Intelligence. EPIA 2015. Lecture Notes in Computer Science, vol 9273. Springer, Cham. https://doi.org/10.1007/978-3-319-23485-4_72
DOI: https://doi.org/10.1007/978-3-319-23485-4_72
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23484-7
Online ISBN: 978-3-319-23485-4
eBook Packages: Computer Science (R0)