Learning first-order probabilistic models with combining rules

  • Sriraam Natarajan (email author)
  • Prasad Tadepalli
  • Thomas G. Dietterich
  • Alan Fern


Many real-world domains exhibit rich relational structure and stochasticity and motivate the development of models that combine predicate logic with probabilities. These models describe probabilistic influences between attributes of objects that are related to each other through known domain relationships. To keep these models succinct, each such influence is considered independent of others, which is called the assumption of “independence of causal influences” (ICI). In this paper, we describe a language that consists of quantified conditional influence statements and captures most relational probabilistic models based on directed graphs. The influences due to different statements are combined using a set of combining rules such as Noisy-OR. We motivate and introduce multi-level combining rules, where the lower level rules combine the influences due to different ground instances of the same statement, and the upper level rules combine the influences due to different statements. We present algorithms and empirical results for parameter learning in the presence of such combining rules. Specifically, we derive and implement algorithms based on gradient descent and expectation maximization for different combining rules and evaluate them on synthetic data and on a real-world task. The results demonstrate that the algorithms are able to learn both the conditional probability distributions of the influence statements and the parameters of the combining rules.
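The idea of multi-level combining rules can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`noisy_or`, `mean`, `weighted_mean`, `two_level_combine`), the use of a mean at the lower level, and the example numbers are all assumptions made for illustration. Each inner list holds the probabilities that the ground instances of one influence statement assign to the target value.

```python
from math import prod


def noisy_or(probs):
    """Noisy-OR: the effect is absent only if every influence independently fails."""
    return 1.0 - prod(1.0 - p for p in probs)


def mean(probs):
    """Unweighted average of the probabilities (assumed lower-level rule)."""
    return sum(probs) / len(probs)


def weighted_mean(probs, weights):
    """Average of the probabilities using non-negative weights (normalized here)."""
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, probs)) / total


def two_level_combine(statement_instances, lower, upper):
    """Combine ground instances per statement, then combine across statements.

    statement_instances: list of lists; each inner list contains the
    probabilities contributed by the ground instances of one statement.
    lower: rule applied within a statement.
    upper: rule applied across the per-statement results.
    """
    per_statement = [lower(instances) for instances in statement_instances]
    return upper(per_statement)


# Hypothetical example: statement 1 has two ground instances, statement 2 has one.
influences = [[0.2, 0.4], [0.5]]

# Lower level: mean; upper level: Noisy-OR.
p_noisy_or = two_level_combine(influences, mean, noisy_or)

# Lower level: mean; upper level: weighted mean with illustrative weights 1 and 3.
p_weighted = two_level_combine(
    influences, mean, lambda ps: weighted_mean(ps, [1.0, 3.0])
)
```

In this sketch the weights of the upper-level rule play the role of combining-rule parameters; in a learning setting they would be estimated from data together with the conditional probability distributions rather than fixed by hand.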


First-order probabilistic models · Quantified conditional influence statements · Directed graphs



Copyright information

© Springer Science+Business Media B.V. 2009

Authors and Affiliations

  • Sriraam Natarajan (1), email author
  • Prasad Tadepalli (1)
  • Thomas G. Dietterich (1)
  • Alan Fern (1)

  1. School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, USA
