Abstract
Dependency networks approximate a joint probability distribution over multiple random variables as a product of conditional distributions. Relational Dependency Networks (RDNs) are graphical models that extend dependency networks to relational domains. This higher expressivity, however, comes at the expense of a more complex model-selection problem: an unbounded number of relational abstraction levels might need to be explored. Whereas current learning approaches for RDNs learn a single probability tree per random variable, we propose to turn the problem into a series of relational function-approximation problems using gradient-based boosting. In doing so, one can easily induce highly complex features over several iterations and, in turn, quickly estimate a very expressive model. Our experimental results on several data sets show that this boosting method learns RDNs efficiently compared to state-of-the-art statistical relational learning approaches.
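The idea of replacing a single learned model per variable with a series of function-approximation problems can be illustrated with a minimal sketch. It is a propositional stand-in under stated assumptions, not the relational learner itself: one binary target, one numeric feature, and a least-squares regression stump in place of a relational function approximator. The conditional is modeled as P(y = 1 | x) = sigmoid(psi(x)), and each iteration fits a regressor to the pointwise gradient of the log-likelihood, y - P(y = 1 | x). The names `fit_stump`, `boost`, `n_iters`, and `step` are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(xs, targets):
    """Fit a least-squares regression stump (one threshold) on one feature."""
    best = None
    for t in sorted(set(xs)):
        left = [r for x, r in zip(xs, targets) if x <= t]
        right = [r for x, r in zip(xs, targets) if x > t]
        if not left or not right:
            continue  # a split must leave examples on both sides
        lm = sum(left) / len(left)
        rm = sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    if best is None:
        # Constant feature: fall back to predicting the mean target.
        mean = sum(targets) / len(targets)
        return lambda x: mean
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm


def boost(xs, ys, n_iters=20, step=1.0):
    """Functional-gradient boosting of psi, where P(y=1|x) = sigmoid(psi(x))."""
    stumps = []

    def psi(x):
        # psi is the sum of all regressors fitted so far (psi_0 = 0).
        return sum(step * s(x) for s in stumps)

    for _ in range(n_iters):
        # Pointwise gradient of the log-likelihood with respect to psi(x_i).
        grads = [y - sigmoid(psi(x)) for x, y in zip(xs, ys)]
        stumps.append(fit_stump(xs, grads))

    return lambda x: sigmoid(psi(x))


if __name__ == "__main__":
    # Toy data (assumed for illustration): the label flips between x = 3 and x = 4.
    model = boost([0, 1, 2, 3, 4, 5, 6, 7], [0, 0, 0, 0, 1, 1, 1, 1])
    print(model(1), model(6))
```

In a dependency network, one such conditional would be learned for each random variable; the sketch covers only a single conditional and leaves out the relational features entirely.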
Additional information
Editors: Paolo Frasconi and Francesca Lisi.
About this article
Cite this article
Natarajan, S., Khot, T., Kersting, K. et al. Gradient-based boosting for statistical relational learning: The relational dependency network case. Mach Learn 86, 25–56 (2012). https://doi.org/10.1007/s10994-011-5244-9
DOI: https://doi.org/10.1007/s10994-011-5244-9