The VLDB Journal, Volume 24, Issue 6, pp 707–730

Fast rule mining in ontological knowledge bases with AMIE+

  • Luis Galárraga
  • Christina Teflioudi
  • Katja Hose
  • Fabian M. Suchanek
Regular Paper

Abstract

Recent advances in information extraction have led to huge knowledge bases (KBs), which capture knowledge in a machine-readable format. Inductive logic programming (ILP) can be used to mine logical rules from these KBs, such as “If two persons are married, then they (usually) live in the same city.” While ILP is a mature field, mining logical rules from KBs is difficult, because KBs make an open-world assumption. This means that absent information cannot be taken as counterexamples. Our approach AMIE (Galárraga et al. in WWW, 2013) has shown how rules can be mined effectively from KBs even in the absence of counterexamples. In this paper, we show how this approach can be optimized to mine even larger KBs with more than 12M statements. Extensive experiments show how our new approach, AMIE+, extends to areas of mining that were previously beyond reach.

Keywords

Rule mining · Inductive logic programming · ILP · Knowledge bases

References

  1. Abedjan, Z., Naumann, F.: Synonym analysis for predicate expansion. In: ESWC (2013)
  2. Abedjan, Z., Lorey, J., Naumann, F.: Reconciling ontologies and the web of data. In: CIKM (2012)
  3. Adé, H., Raedt, L., Bruynooghe, M.: Declarative bias for specific-to-general ILP systems. Mach. Learn. 20, 119–154 (1995)
  4. Agrawal, R., Imieliński, T., Swami, A.: Mining association rules between sets of items in large databases. In: SIGMOD (1993)
  5. Agrawal, R., Mannila, H., Srikant, R., Toivonen, H., Verkamo, A.I.: Fast discovery of association rules. In: Advances in Knowledge Discovery and Data Mining (1996)
  6. Auer, S., Bizer, C., Kobilarov, G., Lehmann, J., Cyganiak, R., Ives, Z.G.: DBpedia: a nucleus for a Web of open data. In: ISWC (2007)
  7. Carlson, A., Betteridge, J., Kisiel, B., Settles, B., Hruschka Jr., E.R., Mitchell, T.M.: Toward an architecture for never-ending language learning. In: AAAI (2010)
  8. Chasseur, C., Patel, J.M.: Design and evaluation of storage organizations for read-optimized main memory databases. Proc. VLDB Endow. 6(13), 1474–1485 (2013)
  9. Chi, Y., Muntz, R.R., Nijssen, S., Kok, J.N.: Frequent subtree mining: an overview. Fundam. Inf. 66(1–2), 26–37 (2004)
  10. Cimiano, P., Hotho, A., Staab, S.: Comparing conceptual, divisive and agglomerative clustering for learning taxonomies from text. In: ECAI (2004)
  11. d’Amato, C., Bryl, V., Serafini, L.: Data-driven logical reasoning. In: URSW (2012)
  12. d’Amato, C., Fanizzi, N., Esposito, F.: Inductive learning for the semantic web: what does it buy? Semant. Web 1(1,2), 53–59 (2010)
  13. David, J., Guillet, F., Briand, H.: Association rule ontology matching approach. Int. J. Semant. Web Inf. Syst. 3(2), 27–49 (2007)
  14. Dehaspe, L., Toivonen, H.: Discovery of relational association rules. In: Relational Data Mining. Springer, New York (2000)
  15. Dehaspe, L., Toivonen, H.: Discovery of frequent DATALOG patterns. Data Min. Knowl. Discov. 3(1), 7–36 (1999)
  16. Dong, X., Gabrilovich, E., Heitz, G., Horn, W., Lao, N., Murphy, K., Strohmann, T., Sun, S., Zhang, W.: Knowledge vault: a web-scale approach to probabilistic knowledge fusion. In: KDD (2014)
  17. Galárraga, L.A., Teflioudi, C., Hose, K., Suchanek, F.M.: AMIE: association rule mining under incomplete evidence in ontological knowledge bases. In: WWW (2013)
  18. Goethals, B., Van den Bussche, J.: Relational association rules: getting WARMER. In: Pattern Detection and Discovery, vol. 2447. Springer, Berlin (2002)
  19. Grice, P.: Logic and conversation. J. Syntax Semant. 3, 41–58 (1975)
  20. Grimnes, G.A., Edwards, P., Preece, A.D.: Learning meta-descriptions of the FOAF network. In: ISWC (2004)
  21. Hellmann, S., Lehmann, J., Auer, S.: Learning of OWL class descriptions on very large knowledge bases. Int. J. Semant. Web Inf. Syst. 5(2), 25–48 (2009)
  22. Huang, Y., Tresp, V., Bundschus, M., Rettinger, A., Kriegel, H.P.: Multivariate prediction for learning on the semantic web. In: ILP (2011)
  23. Jozefowska, J., Lawrynowicz, A., Lukaszewski, T.: The role of semantics in mining frequent patterns from knowledge bases in description logics with rules. Theory Pract. Log. Program. 10(3), 251–289 (2010)
  24. Kuramochi, M., Karypis, G.: Frequent subgraph discovery. In: ICDM. IEEE Computer Society (2001)
  25. Lehmann, J.: DL-Learner: learning concepts in description logics. J. Mach. Learn. Res. (JMLR) 10, 2639–2642 (2009)
  26. Lisi, F.A.: Building rules on top of ontologies for the semantic web with inductive logic programming. TPLP 8(3), 271–300 (2008)
  27. Maedche, A., Zacharias, V.: Clustering ontology-based metadata in the semantic web. In: PKDD (2002)
  28. Mahdisoltani, F., Biega, J., Suchanek, F.M.: YAGO3: a knowledge base from multilingual Wikipedias. In: CIDR (2015)
  29. Mamer, T., Bryant, C., McCall, J.: L-modified ILP evaluation functions for positive-only biological grammar learning. In: Zelezny, F., Lavrac, N. (eds.) Inductive logic programming, No. 5194 in LNAI. Springer, Berlin (2008)
  30. McGuinness, D.L., Fikes, R., Rice, J., Wilder, S.: An environment for merging and testing large ontologies. In: KR (2000)
  31. Muggleton, S.: Inverse entailment and Progol. New Gener. Comput. 13(3&4), 245–286 (1995)
  32. Muggleton, S.: Learning from positive data. In: ILP (1997)
  33. Nakashole, N., Sozio, M., Suchanek, F., Theobald, M.: Query-time reasoning in uncertain RDF knowledge bases with soft and hard rules. In: Workshop on Very Large Data Search (VLDS) at VLDB (2012)
  34. Nebot, V., Berlanga, R.: Finding association rules in semantic web data. Knowl. Based Syst. 25(1), 51–62 (2012)
  35. Nickel, M., Tresp, V., Kriegel, H.P.: Factorizing YAGO: scalable machine learning for linked data. In: WWW (2012)
  36. Noy, N.F., Musen, M.A.: PROMPT: algorithm and tool for automated ontology merging and alignment. In: AAAI/IAAI. AAAI Press (2000)
  37. Richardson, M., Domingos, P.: Markov logic networks. Mach. Learn. 62(1–2), 107–136 (2006)
  38. Schoenmackers, S., Etzioni, O., Weld, D.S., Davis, J.: Learning first-order Horn clauses from web text. In: EMNLP (2010)
  39. Suchanek, F.M., Abiteboul, S., Senellart, P.: PARIS: probabilistic alignment of relations, instances, and schema. PVLDB 5(3), 157–168 (2011)
  40. Suchanek, F.M., Kasneci, G., Weikum, G.: YAGO: a core of semantic knowledge. In: WWW (2007)
  41. Tan, P.N., Kumar, V., Srivastava, J.: Selecting the right interestingness measure for association patterns. In: KDD (2002)
  42. Metaweb Technologies: The Freebase project. http://freebase.com
  43. Völker, J., Niepert, M.: Statistical schema induction. In: ESWC (2011)
  44. World Wide Web Consortium: RDF Primer (W3C Recommendation 2004-02-10). http://www.w3.org/TR/rdf-primer/ (2004)
  45. Zeng, Q., Patel, J., Page, D.: QuickFOIL: scalable inductive logic programming. In: VLDB (2014)

Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  • Luis Galárraga (1)
  • Christina Teflioudi (2)
  • Katja Hose (3)
  • Fabian M. Suchanek (1)

  1. Télécom ParisTech, Paris, France
  2. Max Planck Institute for Informatics, Saarbrücken, Germany
  3. Aalborg University, Aalborg, Denmark
