Abstract
Recent advances in information extraction have led to huge knowledge bases (KBs), which capture knowledge in a machine-readable format. Inductive logic programming (ILP) can be used to mine logical rules from these KBs, such as “If two persons are married, then they (usually) live in the same city.” While ILP is a mature field, mining logical rules from KBs is difficult, because KBs make an open-world assumption. This means that absent information cannot be taken as counterexamples. Our approach AMIE (Galárraga et al. in WWW, 2013) has shown how rules can be mined effectively from KBs even in the absence of counterexamples. In this paper, we show how this approach can be optimized to mine even larger KBs with more than 12M statements. Extensive experiments show how our new approach, AMIE+, extends to areas of mining that were previously beyond reach.
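To make the open-world issue concrete, the following minimal Python sketch evaluates the marriage rule from the abstract on a toy KB. It is an illustration only, not the AMIE or AMIE+ algorithm: the facts, the function names, and the three-way split into confirmed, contradicted, and unknown predictions are assumptions made for this example. In particular, treating a different known city as a contradiction assumes that a person has a single recorded city of residence.

```python
from collections import defaultdict

# Toy KB as (subject, relation, object) triples; all facts are invented.
KB = {
    ("alice", "marriedTo", "bob"),
    ("bob", "marriedTo", "alice"),
    ("alice", "livesIn", "Paris"),
    ("bob", "livesIn", "Paris"),
    ("carol", "marriedTo", "dave"),
    ("carol", "livesIn", "Berlin"),  # dave's city is simply not recorded
    ("eve", "marriedTo", "frank"),
    ("eve", "livesIn", "Rome"),
    ("frank", "livesIn", "Oslo"),
}


def classify_predictions(kb):
    """Evaluate marriedTo(x, y) AND livesIn(x, c) => livesIn(y, c).

    Each prediction livesIn(y, c) is counted as:
      confirmed    - the fact is in the KB
      contradicted - y has a known city, but a different one (assumption:
                     one recorded city per person)
      unknown      - the KB records no city for y at all
    """
    lives_in = defaultdict(set)
    for s, r, o in kb:
        if r == "livesIn":
            lives_in[s].add(o)

    outcome = {"confirmed": 0, "contradicted": 0, "unknown": 0}
    for x, r, y in kb:
        if r != "marriedTo":
            continue
        for city in lives_in.get(x, ()):
            known = lives_in.get(y, set())
            if city in known:
                outcome["confirmed"] += 1
            elif known:
                outcome["contradicted"] += 1
            else:
                outcome["unknown"] += 1
    return outcome


def closed_world_confidence(outcome):
    """Treat every prediction missing from the KB as a counterexample."""
    total = sum(outcome.values())
    return outcome["confirmed"] / total


def open_world_confidence(outcome):
    """Ignore unknown predictions; count only known contradictions."""
    decided = outcome["confirmed"] + outcome["contradicted"]
    return outcome["confirmed"] / decided
```

On this toy KB the rule makes four predictions: two are confirmed, one is contradicted, and one is unknown because the KB records no city for dave. Counting the unknown prediction as a counterexample gives a confidence of 0.5, whereas setting it aside gives 2/3, which shows why absent facts need separate treatment under the open-world assumption.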
Notes
RDF schema has only positive rules and no disjointness constraints or similar concepts.
In these cases, the pruning precision in Table 6 was computed by comparing the output of AMIE+ to the output of AMIE on the mined subset.
We used the YAGO3 [28] types because the type signatures in older versions of YAGO were too general. For example, the relation livesIn is defined from person to location in YAGO2s, whereas in YAGO3 it is defined from person to city.
References
Abedjan, Z., Naumann, F.: Synonym analysis for predicate expansion. In: ESWC (2013)
Abedjan, Z., Lorey, J., Naumann, F.: Reconciling ontologies and the web of data. In: CIKM (2012)
Adé, H., De Raedt, L., Bruynooghe, M.: Declarative bias for specific-to-general ILP systems. Mach. Learn. 20, 119–154 (1995)
Agrawal, R., Imieliński, T., Swami, A.: Mining association rules between sets of items in large databases. In: SIGMOD (1993)
Agrawal, R., Mannila, H., Srikant, R., Toivonen, H., Verkamo, A.I.: Fast discovery of association rules. In: Advances in Knowledge Discovery and Data Mining (1996)
Auer, S., Bizer, C., Kobilarov, G., Lehmann, J., Cyganiak, R., Ives, Z.G.: DBpedia: a nucleus for a Web of open data. In: ISWC (2007)
Carlson, A., Betteridge, J., Kisiel, B., Settles, B., Hruschka Jr., E.R., Mitchell, T.M.: Toward an architecture for never-ending language learning. In: AAAI (2010)
Chasseur, C., Patel, J.M.: Design and evaluation of storage organizations for read-optimized main memory databases. Proc. VLDB Endow. 6(13), 1474–1485 (2013)
Chi, Y., Muntz, R.R., Nijssen, S., Kok, J.N.: Frequent subtree mining: an overview. Fundam. Inf. 66(1–2), 26–37 (2004)
Cimiano, P., Hotho, A., Staab, S.: Comparing conceptual, divisive and agglomerative clustering for learning taxonomies from text. In: ECAI (2004)
d’Amato, C., Bryl, V., Serafini, L.: Data-driven logical reasoning. In: URSW (2012)
d’Amato, C., Fanizzi, N., Esposito, F.: Inductive learning for the semantic web: what does it buy? Semant. Web 1(1,2), 53–59 (2010)
David, J., Guillet, F., Briand, H.: Association rule ontology matching approach. Int. J. Semant. Web Inf. Syst. 3(2), 27–49 (2007)
Dehaspe, L., Toivonen, H.: Discovery of relational association rules. In: Relational Data Mining. Springer, New York (2000)
Dehaspe, L., Toivonen, H.: Discovery of frequent DATALOG patterns. Data Min. Knowl. Discov. 3(1), 7–36 (1999)
Dong, X., Gabrilovich, E., Heitz, G., Horn, W., Lao, N., Murphy, K., Strohmann, T., Sun, S., Zhang, W.: Knowledge vault: a web-scale approach to probabilistic knowledge fusion. In: KDD (2014)
Galárraga, L.A., Teflioudi, C., Hose, K., Suchanek, F.M.: AMIE: association rule mining under incomplete evidence in ontological knowledge bases. In: WWW (2013)
Goethals, B., Van den Bussche, J.: Relational association rules: getting WARMER. In: Pattern Detection and Discovery, vol. 2447. Springer, Berlin (2002)
Grice, P.: Logic and conversation. J. Syntax Semant. 3, 41–58 (1975)
Grimnes, G.A., Edwards, P., Preece, A.D.: Learning meta-descriptions of the FOAF network. In: ISWC (2004)
Hellmann, S., Lehmann, J., Auer, S.: Learning of OWL class descriptions on very large knowledge bases. Int. J. Semant. Web Inf. Syst. 5(2), 25–48 (2009)
Huang, Y., Tresp, V., Bundschus, M., Rettinger, A., Kriegel, H.P.: Multivariate prediction for learning on the semantic web. In: ILP (2011)
Jozefowska, J., Lawrynowicz, A., Lukaszewski, T.: The role of semantics in mining frequent patterns from knowledge bases in description logics with rules. Theory Pract. Log. Program. 10(3), 251–289 (2010)
Kuramochi, M., Karypis, G.: Frequent subgraph discovery. In: ICDM. IEEE Computer Society (2001)
Lehmann, J.: DL-Learner: learning concepts in description logics. J. Mach. Learn. Res. (JMLR) 10, 2639–2642 (2009)
Lisi, F.A.: Building rules on top of ontologies for the semantic web with inductive logic programming. TPLP 8(3), 271–300 (2008)
Maedche, A., Zacharias, V.: Clustering ontology-based metadata in the semantic web. In: PKDD (2002)
Mahdisoltani, F., Biega, J., Suchanek, F.M.: YAGO3: a knowledge base from multilingual Wikipedias. In: CIDR (2015)
Mamer, T., Bryant, C., McCall, J.: L-modified ILP evaluation functions for positive-only biological grammar learning. In: Zelezny, F., Lavrac, N. (eds.) Inductive Logic Programming, No. 5194 in LNAI. Springer, Berlin (2008)
McGuinness, D.L., Fikes, R., Rice, J., Wilder, S.: An environment for merging and testing large ontologies. In: KR (2000)
Muggleton, S.: Inverse entailment and Progol. New Gener. Comput. 13(3&4), 245–286 (1995)
Muggleton, S.: Learning from positive data. In: ILP (1997)
Nakashole, N., Sozio, M., Suchanek, F., Theobald, M.: Query-time reasoning in uncertain RDF knowledge bases with soft and hard rules. In: Workshop on Very Large Data Search (VLDS) at VLDB (2012)
Nebot, V., Berlanga, R.: Finding association rules in semantic web data. Knowl. Based Syst. 25(1), 51–62 (2012)
Nickel, M., Tresp, V., Kriegel, H.P.: Factorizing YAGO: scalable machine learning for linked data. In: WWW (2012)
Noy, N.F., Musen, M.A.: PROMPT: algorithm and tool for automated ontology merging and alignment. In: AAAI/IAAI. AAAI Press (2000)
Richardson, M., Domingos, P.: Markov logic networks. Mach. Learn. 62(1–2), 107–136 (2006)
Schoenmackers, S., Etzioni, O., Weld, D.S., Davis, J.: Learning first-order Horn clauses from web text. In: EMNLP (2010)
Suchanek, F.M., Abiteboul, S., Senellart, P.: PARIS: probabilistic alignment of relations, instances, and schema. PVLDB 5(3), 157–168 (2011)
Suchanek, F.M., Kasneci, G., Weikum, G.: YAGO: a core of semantic knowledge. In: WWW (2007)
Tan, P.N., Kumar, V., Srivastava, J.: Selecting the right interestingness measure for association patterns. In: KDD (2002)
Metaweb Technologies: The Freebase project. http://freebase.com
Völker, J., Niepert, M.: Statistical schema induction. In: ESWC (2011)
World Wide Web Consortium: RDF Primer (W3C Recommendation 2004-02-10). http://www.w3.org/TR/rdf-primer/ (2004)
Zeng, Q., Patel, J., Page, D.: QuickFOIL: scalable inductive logic programming. In: VLDB (2014)
Acknowledgments
This work is supported by the “Chair Machine Learning for Big Data” of Télécom ParisTech.
Cite this article
Galárraga, L., Teflioudi, C., Hose, K. et al. Fast rule mining in ontological knowledge bases with AMIE+. The VLDB Journal 24, 707–730 (2015). https://doi.org/10.1007/s00778-015-0394-1
DOI: https://doi.org/10.1007/s00778-015-0394-1