# Statistical Relational Learning: An Inductive Logic Programming Perspective

## Abstract

In the past few years there has been a great deal of work at the intersection of probability theory, logic programming and machine learning [14,18,13,9,6,1,11]. This work is known under the names of statistical relational learning [7,5], probabilistic logic learning [4], or probabilistic inductive logic programming. Whereas most existing work has started from a probabilistic learning perspective and extended probabilistic formalisms with relational aspects, I shall take a different perspective, starting from inductive logic programming and studying how inductive logic programming formalisms, settings and techniques can be extended to deal with probabilistic issues. This tradition has already contributed a rich variety of valuable formalisms and techniques, including probabilistic Horn abduction by Poole [14], PRISM by Sato [18,19], stochastic logic programs by Muggleton [13] and Cussens [2], Bayesian logic programs [10,8] by Kersting and De Raedt, and logical hidden Markov models [11].

The main contribution of this talk is the introduction of three probabilistic inductive logic programming settings, derived from the learning from entailment, learning from interpretations and learning from proofs settings of the field of inductive logic programming [3]. Each of these settings contributes different notions of probabilistic logic representations, examples and probability distributions. The first setting, probabilistic learning from entailment, is incorporated in the well-known PRISM system [19] and in Cussens's Failure Adjusted Maximisation approach to parameter estimation in stochastic logic programs [2]. A novel system that was recently developed and that fits this paradigm is the nFOIL system [12], which combines key principles of the well-known inductive logic programming system FOIL [15] with the naïve Bayes approach. In probabilistic learning from entailment, examples are ground facts that should be probabilistically entailed by the target logic program. The second setting, probabilistic learning from interpretations, is incorporated in Bayesian logic programs [10,8], which integrate Bayesian networks with logic programs; it is also adopted by [6]. Examples in this setting are Herbrand interpretations that should be probabilistic models of the target theory. The third setting, learning from proofs [17], is novel. It is motivated by the learning of stochastic context-free grammars from tree banks. In this setting, examples are proof trees that should be probabilistically provable from the unknown stochastic logic program. The sketched settings (and the instances presented) are by no means the only possible settings for probabilistic inductive logic programming, but still – I hope – they provide useful insights into the state of the art of this exciting field.

For a full survey of statistical relational learning and probabilistic inductive logic programming, the reader is referred to [4]; for more details on the probabilistic inductive logic programming settings, to [16], where a longer and earlier version of this contribution can be found.

## Keywords

Bayesian Network · Logic Program · Logic Programming · Inductive Logic · Inductive Logic Programming

## References

1. Anderson, C.R., Domingos, P., Weld, D.S.: Relational Markov Models and their Application to Adaptive Web Navigation. In: Hand, D., Keim, D., Zaïne, O.R., Goebel, R. (eds.) Proceedings of the Eighth International Conference on Knowledge Discovery and Data Mining (KDD 2002), Edmonton, Canada, pp. 143–152. ACM Press, New York (2002)
2. Cussens, J.: Loglinear models for first-order probabilistic reasoning. In: Laskey, K.B., Prade, H. (eds.) Proceedings of the Fifteenth Annual Conference on Uncertainty in Artificial Intelligence (UAI 1999), Stockholm, Sweden, pp. 126–133. Morgan Kaufmann, San Francisco (1999)
3. De Raedt, L.: Logical settings for concept-learning. Artificial Intelligence 95(1), 197–201 (1997)
4. De Raedt, L., Kersting, K.: Probabilistic Logic Learning. ACM-SIGKDD Explorations: Special issue on Multi-Relational Data Mining 5(1), 31–48 (2003)
5. Dietterich, T., Getoor, L., Murphy, K. (eds.): Working Notes of the ICML 2004 Workshop on Statistical Relational Learning and its Connections to Other Fields, SRL 2004 (2004)
6. Friedman, N., Getoor, L., Koller, D., Pfeffer, A.: Learning probabilistic relational models. In: Dean, T. (ed.) Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence (IJCAI 1999), Stockholm, Sweden, pp. 1300–1309. Morgan Kaufmann, San Francisco (1999)
7. Getoor, L., Jensen, D. (eds.): Working Notes of the IJCAI 2003 Workshop on Learning Statistical Models from Relational Data, SRL 2003 (2003)
8. Kersting, K., De Raedt, L.: Adaptive Bayesian Logic Programs. In: Rouveirol, C., Sebag, M. (eds.) ILP 2001. LNCS (LNAI), vol. 2157, p. 104. Springer, Heidelberg (2001)
9. Kersting, K., De Raedt, L.: Bayesian logic programs. Technical Report 151, University of Freiburg, Institute for Computer Science (April 2001)
10. Kersting, K., De Raedt, L.: Towards Combining Inductive Logic Programming and Bayesian Networks. In: Rouveirol, C., Sebag, M. (eds.) ILP 2001. LNCS (LNAI), vol. 2157, p. 118. Springer, Heidelberg (2001)
11. Kersting, K., Raiko, T., Kramer, S., De Raedt, L.: Towards discovering structural signatures of protein folds based on logical hidden Markov models. In: Altman, R.B., Dunker, A.K., Hunter, L., Jung, T.A., Klein, T.E. (eds.) Proceedings of the Pacific Symposium on Biocomputing, Kauai, Hawaii, USA, pp. 192–203. World Scientific, Singapore (2003)
12. Landwehr, N., Kersting, K., De Raedt, L.: nFOIL: Integrating Naïve Bayes and FOIL. In: Proceedings of the 20th National Conference on Artificial Intelligence. AAAI Press, Menlo Park (2005)
13. Muggleton, S.H.: Stochastic logic programs. In: De Raedt, L. (ed.) Advances in Inductive Logic Programming. IOS Press, Amsterdam (1996)
14. Poole, D.: Probabilistic Horn abduction and Bayesian networks. Artificial Intelligence 64, 81–129 (1993)
15. Quinlan, J.R., Cameron-Jones, R.M.: Induction of logic programs: FOIL and related systems. New Generation Computing, pp. 287–312 (1995)
16. De Raedt, L., Kersting, K.: Probabilistic inductive logic programming. In: Ben-David, S., Case, J., Maruoka, A. (eds.) ALT 2004. LNCS (LNAI), vol. 3244, pp. 19–36. Springer, Heidelberg (2004)
17. De Raedt, L., Kersting, K., Torge, S.: Towards learning stochastic logic programs from proof-banks. In: Proceedings of the 20th National Conference on Artificial Intelligence. AAAI Press, Menlo Park (2005)
18. Sato, T.: A Statistical Learning Method for Logic Programs with Distribution Semantics. In: Sterling, L. (ed.) Proceedings of the Twelfth International Conference on Logic Programming (ICLP 1995), Tokyo, Japan, pp. 715–729. MIT Press, Cambridge (1995)
19. Sato, T., Kameya, Y.: Parameter learning of logic programs for symbolic-statistical modeling. Journal of Artificial Intelligence Research 15, 391–454 (2001)