Compressing probabilistic Prolog programs

Abstract

ProbLog is a recently introduced probabilistic extension of Prolog (De Raedt et al., in Proceedings of the 20th international joint conference on artificial intelligence, pp. 2468–2473, 2007). A ProbLog program defines a distribution over logic programs by specifying, for each clause, the probability that it belongs to a randomly sampled program; these probabilities are mutually independent. The semantics of ProbLog is then defined by the success probability of a query, that is, the probability that the query succeeds in a randomly sampled program.
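The semantics described above can be illustrated with a minimal Python sketch. It is not the inference method of the paper: it simply enumerates every subprogram of a tiny, hypothetical set of probabilistic facts and sums the probabilities of those in which a path query succeeds, which is exponential in the number of facts. The names `EDGES`, `reachable`, and `success_probability` are assumptions introduced for illustration only.

```python
from itertools import product

# Hypothetical toy program: each probabilistic fact edge(x, y) belongs to a
# sampled program independently with the given probability.
EDGES = {
    ("a", "b"): 0.8,
    ("b", "c"): 0.7,
    ("a", "c"): 0.5,
}


def reachable(edges, start, goal):
    """Return True if goal can be reached from start along the directed edges."""
    frontier, seen = [start], {start}
    while frontier:
        node = frontier.pop()
        if node == goal:
            return True
        for src, dst in edges:
            if src == node and dst not in seen:
                seen.add(dst)
                frontier.append(dst)
    return False


def success_probability(prob_edges, start, goal):
    """Sum the probabilities of all sampled programs in which the query succeeds."""
    facts = list(prob_edges)
    total = 0.0
    # Every True/False assignment corresponds to one sampled program.
    for keep in product([True, False], repeat=len(facts)):
        weight = 1.0
        chosen = []
        for fact, kept in zip(facts, keep):
            p = prob_edges[fact]
            weight *= p if kept else 1.0 - p
            if kept:
                chosen.append(fact)
        if reachable(chosen, start, goal):
            total += weight
    return total


if __name__ == "__main__":
    print(success_probability(EDGES, "a", "c"))
```

For this toy program the query succeeds unless both the direct edge and the two-step path are missing, so the result should agree with 1 - (1 - 0.5)(1 - 0.8 · 0.7) = 0.78 up to floating-point rounding.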

This paper introduces the theory compression task for ProbLog, which consists of selecting the subset of clauses of a given ProbLog program that maximizes the likelihood with respect to a set of positive and negative examples. Experiments in the context of discovering links in real biological networks demonstrate the practical applicability of the approach.
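The compression task can likewise be sketched in Python under stated assumptions. The sketch below takes the likelihood to be the product of the success probabilities of the positive examples and of the failure probabilities of the negative examples, and it searches all subsets of at most `k` facts exhaustively. The size bound `k`, the toy data, and all function names are assumptions of this illustration; the exhaustive search stands in for whatever algorithm the paper uses and is only feasible for very small programs.

```python
from itertools import combinations, product

# Hypothetical toy program and examples, for illustration only.
EDGES = {("a", "b"): 0.8, ("b", "c"): 0.7, ("a", "c"): 0.5, ("c", "d"): 0.9}
POSITIVE = [("a", "c")]  # path queries that should succeed
NEGATIVE = [("a", "d")]  # path queries that should fail


def reachable(edges, start, goal):
    """Return True if goal can be reached from start along the directed edges."""
    frontier, seen = [start], {start}
    while frontier:
        node = frontier.pop()
        if node == goal:
            return True
        for src, dst in edges:
            if src == node and dst not in seen:
                seen.add(dst)
                frontier.append(dst)
    return False


def success_probability(prob_edges, start, goal):
    """Brute-force success probability: enumerate every sampled program."""
    facts = list(prob_edges)
    total = 0.0
    for keep in product([True, False], repeat=len(facts)):
        weight = 1.0
        for fact, kept in zip(facts, keep):
            weight *= prob_edges[fact] if kept else 1.0 - prob_edges[fact]
        chosen = [fact for fact, kept in zip(facts, keep) if kept]
        if reachable(chosen, start, goal):
            total += weight
    return total


def likelihood(prob_edges, positive, negative):
    """Product of P(q) over positive and 1 - P(q) over negative examples."""
    value = 1.0
    for start, goal in positive:
        value *= success_probability(prob_edges, start, goal)
    for start, goal in negative:
        value *= 1.0 - success_probability(prob_edges, start, goal)
    return value


def compress(prob_edges, positive, negative, k):
    """Exhaustively pick the subset of at most k facts with maximal likelihood."""
    best_subset, best_value = {}, likelihood({}, positive, negative)
    for size in range(1, k + 1):
        for facts in combinations(prob_edges, size):
            subset = {fact: prob_edges[fact] for fact in facts}
            value = likelihood(subset, positive, negative)
            if value > best_value:
                best_subset, best_value = subset, value
    return best_subset, best_value


if __name__ == "__main__":
    print(compress(EDGES, POSITIVE, NEGATIVE, k=3))
```

In this toy setting the fact `edge(c, d)` only helps the negative example, so the search drops it and keeps the three facts that support the positive query. Selected facts keep their original probabilities; only membership in the program changes.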

References

  • Bryant, R. E. (1986). Graph-based algorithms for boolean function manipulation. IEEE Transactions on Computers, 35(8), 677–691.

  • Chavira, M., & Darwiche, A. (2007). Compiling Bayesian networks using variable elimination. In M. Veloso (Ed.), Proceedings of the 20th international joint conference on artificial intelligence (pp. 2443–2449). Menlo Park: AAAI Press.

  • De Raedt, L., & Kersting, K. (2003). Probabilistic logic learning. SIGKDD Explorations, 5(1), 31–48.

  • De Raedt, L., Kimmig, A., & Toivonen, H. (2007). ProbLog: a probabilistic Prolog and its application in link discovery. In M. Veloso (Ed.), Proceedings of the 20th international joint conference on artificial intelligence (pp. 2468–2473). Menlo Park: AAAI Press.

  • Flach, P. A. (1994). Simply logical: intelligent reasoning by example. New York: Wiley.

  • Fuhr, N. (2000). Probabilistic datalog: implementing logical information retrieval for advanced applications. Journal of the American Society for Information Science, 51, 95–110.

  • Getoor, L., & Taskar, B. (Eds.). (2007). Statistical relational learning. Cambridge: MIT Press.

  • Koppel, M., Feldman, R., & Segre, A. M. (1994). Bias-driven revision of logical domain theories. Journal of Artificial Intelligence Research, 1, 159–208.

  • Minato, S., Satoh, K., & Sato, T. (2007). Compiling Bayesian networks by symbolic probability calculation based on zero-suppressed BDDs. In M. Veloso (Ed.), Proceedings of the 20th international joint conference on artificial intelligence (pp. 2550–2555). Menlo Park: AAAI Press.

  • Muggleton, S. H. (1996). Stochastic logic programs. In L. De Raedt (Ed.), Advances in inductive logic programming. Amsterdam: IOS Press.

  • Perez-Iratxeta, C., Bork, P., & Andrade, M. A. (2002). Association of genes to genetically inherited diseases using data mining. Nature Genetics, 31, 316–319.

  • Poole, D. (1992). Logic programming, abduction and probability. In Fifth generation computing systems (pp. 530–538).

  • Poole, D. (1993). Probabilistic Horn abduction and Bayesian networks. Artificial Intelligence, 64, 81–129.

  • Sato, T., & Kameya, Y. (2001). Parameter learning of logic programs for symbolic-statistical modeling. Journal of Artificial Intelligence Research, 15, 391–454.

  • Sevon, P., Eronen, L., Hintsanen, P., Kulovesi, K., & Toivonen, H. (2006). Link discovery in graphs derived from biological databases. In U. Leser, F. Naumann, & B. Eckman (Eds.), Lecture notes in bioinformatics: Vol. 4075. Data integration in the life sciences 2006. Berlin: Springer.

  • Valiant, L. G. (1979). The complexity of enumeration and reliability problems. SIAM Journal on Computing, 8, 410–421.

  • Wrobel, S. (1996). First order theory refinement. In L. De Raedt (Ed.), Advances in inductive logic programming. Amsterdam: IOS Press.

  • Zelle, J. M., & Mooney, R. J. (1994). Inducing deterministic Prolog parsers from treebanks: a machine learning approach. In Proceedings of the 12th national conference on artificial intelligence (AAAI-94) (pp. 748–753).

Author information

Corresponding author

Correspondence to A. Kimmig.

Additional information

Editors: Stephen Muggleton, Ramon Otero, Simon Colton.

About this article

Cite this article

De Raedt, L., Kersting, K., Kimmig, A. et al. Compressing probabilistic Prolog programs. Mach Learn 70, 151–168 (2008). https://doi.org/10.1007/s10994-007-5030-x

  • DOI: https://doi.org/10.1007/s10994-007-5030-x

Keywords

  • Probabilistic logic
  • Inductive logic programming
  • Theory revision
  • Compression
  • Network mining
  • Biological applications
  • Statistical relational learning