Abstract
Entity-centric knowledge bases are large collections of facts about entities of public interest, such as countries, politicians, or movies. They find applications in search engines, chatbots, and semantic data mining systems. In this paper, we first discuss the knowledge representation that has emerged as a pragmatic consensus in the research community of entity-centric knowledge bases. Then, we describe how these knowledge bases can be mined for logical rules. Finally, we discuss how entities can alternatively be represented as vectors in a vector space, with the help of neural networks.
Notes
- 1.
- 2.
- 3. We can even say type(class, class), i.e., class is an instance of class.
- 4. Let \(|\mathcal {K}|\) be the number of facts and \(|r(\mathcal {K})|\) the number of relations in a KB \(\mathcal {K}\). Let \(d\) be the maximal length of a rule. The size of the search space is reduced from \(O(|\mathcal {K}|^d)\) to \(O(|r(\mathcal {K})|^d)\) when we remove the addInstantiatedAtom operator.
- 5. Instead of: if a rule is not frequent, none of its refinements can be frequent.
A Computation of Support and Confidence
Notation. Given a logical formula \(\phi \) with some free variables \(x_1, \dots , x_n\), all other variables being by default existentially quantified, we define:
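A reading consistent with the way support is counted in the proposition below, assuming a counting operator written \(\#\) (the symbol itself is an assumption here), is:

\[
\#(x_1, \dots, x_n) : \phi \;:=\; \left| \{ (x_1, \dots, x_n) \mid \phi(x_1, \dots, x_n) \text{ is satisfied in } \mathcal{K} \} \right|
\]

That is, the operator counts the distinct instantiations of the free variables for which \(\phi\) holds, no matter how many bindings of the existentially quantified variables witness each one.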
We remind the reader of the following two definitions:
Definition 14 (Prediction of a rule): The predictions \(P\) of a rule \(\varvec{B} \Rightarrow h\) in a KB \(\mathcal {K}\) are the head atoms of all instantiations of the rule where the body atoms appear in \(\mathcal {K}\). We write \(\mathcal {K}\wedge (\varvec{B} \Rightarrow h) \models P\).
Definition 19 (Support): The support of a rule in a KB is the number of positive examples predicted by the rule.
A prediction of a rule is a positive example if and only if it is in the KB. This observation gives rise to the following property:
Proposition 34 (Support in practice): The support of a rule \(\varvec{B}\Rightarrow h\) is the number of instantiations of the head variables that satisfy the query \(\varvec{B} \wedge h\). This value can be written as:
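Assuming that \(\#(x, y) : \phi\) denotes the number of distinct pairs \((x, y)\) satisfying \(\phi\), and that the head is a binary atom \(h(x, y)\) (both are notational assumptions), the proposition corresponds to:

\[
\mathrm{support}(\varvec{B} \Rightarrow h(x, y)) \;=\; \#(x, y) : \varvec{B} \wedge h(x, y)
\]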
Definition 20 (Confidence): The confidence of a rule is the number of positive examples predicted by the rule (the support of the rule), divided by the number of examples predicted by the rule.
Under the closed-world assumption (CWA), every predicted example is either a positive example or a negative example. Thus, the standard confidence of a rule is the support of the rule divided by the number of predictions of the rule, written:
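With \(\#(x, y) : \phi\) again counting the distinct pairs \((x, y)\) that satisfy \(\phi\), and with \(\mathrm{conf}_{\mathrm{std}}\) as an assumed name for the standard confidence, this ratio can be sketched as:

\[
\mathrm{conf}_{\mathrm{std}}(\varvec{B} \Rightarrow h(x, y)) \;=\; \frac{\#(x, y) : \varvec{B} \wedge h(x, y)}{\#(x, y) : \varvec{B}}
\]

The numerator is the support, and the denominator counts every prediction of the rule, whether or not it appears in the KB.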
Assume \(h\) is more functional than inverse functional. Under the partial completeness assumption (PCA), a predicted negative example is a prediction \(h(x, y)\) that is not in the KB, while for the same \(x\) there exists another entity \(y'\) such that \(h(x,y')\) is in the KB. When we add the predicted positive examples, the denominator of the PCA confidence becomes:
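Assuming that \(\#(x, y) : \phi\) denotes the number of distinct pairs \((x, y)\) satisfying \(\phi\) (a notational assumption), the predicted positive and predicted negative examples can be counted together as:

\[
\#(x, y) : \big(\varvec{B} \wedge h(x, y)\big) \;\vee\; \big(\varvec{B} \wedge \neg h(x, y) \wedge \exists y' : h(x, y')\big)
\]

The first disjunct covers the predictions found in the KB, and the second covers the predictions missing from the KB for which some other object \(y'\) is known.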
We can simplify this logical formula to deduce the following formula for computing the PCA confidence:
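Since \(h(x, y)\) already implies \(\exists y' : h(x, y')\), the positive and negative cases of the denominator collapse into a single condition. Assuming that \(\#(x, y) : \phi\) counts the distinct pairs \((x, y)\) satisfying \(\phi\), and using \(\mathrm{conf}_{\mathrm{pca}}\) as an assumed name, this suggests:

\[
\mathrm{conf}_{\mathrm{pca}}(\varvec{B} \Rightarrow h(x, y)) \;=\; \frac{\#(x, y) : \varvec{B} \wedge h(x, y)}{\#(x, y) : \varvec{B} \wedge \exists y' : h(x, y')}
\]

As a minimal sketch, the Python snippet below evaluates support, standard confidence, and PCA confidence on a toy KB. The facts, the rule marriedTo(x, z) ∧ livesIn(z, y) ⇒ livesIn(x, y), and all helper names are illustrative assumptions rather than part of the chapter, and the sketch assumes that the rule makes at least one prediction.

```python
from fractions import Fraction

# Toy KB as (relation, subject, object) triples; every fact is illustrative.
KB = {
    ("marriedTo", "anna", "bob"),
    ("marriedTo", "carl", "dina"),
    ("marriedTo", "eva", "finn"),
    ("marriedTo", "gus", "hana"),
    ("livesIn", "bob", "paris"),
    ("livesIn", "dina", "rome"),
    ("livesIn", "finn", "oslo"),
    ("livesIn", "hana", "lima"),
    ("livesIn", "anna", "paris"),   # confirms the prediction for anna
    ("livesIn", "carl", "berlin"),  # contradicts the prediction for carl
    ("livesIn", "gus", "lima"),     # confirms the prediction for gus
    # no livesIn fact for eva: her prediction is unknown under the PCA
}


def predictions(kb):
    """Pairs (x, y) predicted by marriedTo(x, z) ∧ livesIn(z, y) ⇒ livesIn(x, y)."""
    return {
        (x, y)
        for (r1, x, z) in kb if r1 == "marriedTo"
        for (r2, z2, y) in kb if r2 == "livesIn" and z2 == z
    }


def rule_metrics(kb):
    """Return (support, standard confidence, PCA confidence) of the toy rule."""
    predicted = predictions(kb)                             # #(x, y) : B
    head = {(s, o) for (r, s, o) in kb if r == "livesIn"}   # facts h(x, y) in the KB
    support = len(predicted & head)                         # #(x, y) : B ∧ h(x, y)
    known_subjects = {s for (s, _) in head}                 # x with some h(x, y') in the KB
    pca_body = {(x, y) for (x, y) in predicted if x in known_subjects}
    std_conf = Fraction(support, len(predicted))
    pca_conf = Fraction(support, len(pca_body))
    return support, std_conf, pca_conf


support, std_conf, pca_conf = rule_metrics(KB)
print(support, std_conf, pca_conf)  # prints: 2 1/2 2/3
```

In this toy KB the rule makes four predictions, two of which are in the KB. The prediction for eva is left out of the PCA denominator because the KB knows no residence for her, which raises the confidence from 1/2 to 2/3.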
Copyright information
© 2019 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Suchanek, F.M., Lajus, J., Boschin, A., Weikum, G. (2019). Knowledge Representation and Rule Mining in Entity-Centric Knowledge Bases. In: Krötzsch, M., Stepanova, D. (eds) Reasoning Web. Explainable Artificial Intelligence. Lecture Notes in Computer Science, vol 11810. Springer, Cham. https://doi.org/10.1007/978-3-030-31423-1_4
DOI: https://doi.org/10.1007/978-3-030-31423-1_4
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-31422-4
Online ISBN: 978-3-030-31423-1
eBook Packages: Computer Science, Computer Science (R0)