Relational Restricted Boltzmann Machines: A Probabilistic Logic Learning Approach

  • Navdeep Kaur
  • Gautam Kunapuli
  • Tushar Khot
  • Kristian Kersting
  • William Cohen
  • Sriraam Natarajan
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10759)


We consider the problem of learning Boltzmann machine classifiers from relational data. Our goal is to extend the deep belief framework of restricted Boltzmann machines (RBMs) to statistical relational models. This allows one to exploit the feature hierarchies and the non-linearity inherent in RBMs over the rich representations used in statistical relational learning (SRL). Specifically, we use lifted random walks to generate features for predicates; these features are then used to construct the observed features of the RBM in a manner similar to Markov Logic Networks. We show empirically that this method of constructing an RBM is comparable to or better than state-of-the-art probabilistic relational learning algorithms on six relational domains.
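The abstract gives no implementation details, so the following is only a minimal sketch of the general idea under stated assumptions: each random-walk path is treated as a chain of binary predicates, the number of groundings of that chain between two entities becomes one observed feature (count-based, in the spirit of Markov Logic Networks), and the resulting vector feeds the visible layer of an RBM. The facts, the paths, and the names `FACTS`, `PATHS`, `count_paths`, `features`, and `hidden_probs` are all hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict

# Hypothetical toy facts: relation name -> set of (subject, object) pairs.
FACTS = {
    "advises": {("ann", "bob"), ("ann", "cat")},
    "coauthor": {("bob", "cat"), ("cat", "dan"), ("bob", "dan")},
}

# Hypothetical paths, standing in for clauses found by lifted random walks.
PATHS = [("advises",), ("advises", "coauthor"), ("coauthor", "coauthor")]


def count_paths(path, start, end):
    """Count groundings of the chain r1(start, z1), r2(z1, z2), ..., rk(z, end)."""
    frontier = {start: 1}  # entity -> number of partial groundings reaching it
    for rel in path:
        nxt = defaultdict(int)
        for subj, obj in FACTS[rel]:
            if subj in frontier:
                nxt[obj] += frontier[subj]
        frontier = nxt
    return frontier.get(end, 0)


def features(start, end):
    """One count feature per path: an assumed form of the RBM's observed layer."""
    return [count_paths(path, start, end) for path in PATHS]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def hidden_probs(v, W, b):
    """Standard RBM conditional: P(h_j = 1 | v) = sigmoid(b_j + sum_i v_i * W[i][j])."""
    return [
        sigmoid(b[j] + sum(v[i] * W[i][j] for i in range(len(v))))
        for j in range(len(b))
    ]


if __name__ == "__main__":
    v = features("ann", "dan")
    W = [[0.0, 0.0] for _ in PATHS]  # untrained weights, for illustration only
    print(v, hidden_probs(v, W, [0.0, 0.0]))
```

Training (for example with contrastive divergence) is deliberately left out; the sketch only shows how relational counts could become the input vector of an RBM.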



Kristian Kersting gratefully acknowledges the support of the DFG Collaborative Research Center SFB 876, projects A6 and B4. Sriraam Natarajan gratefully acknowledges the support of the DARPA DEFT Program under the Air Force Research Laboratory (AFRL) prime contract no. FA8750-13-2-0039. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of DARPA, ARO, AFRL, or the US government.



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Indiana University, Bloomington, USA
  2. The University of Texas at Dallas, Richardson, USA
  3. Allen Institute of Artificial Intelligence, Seattle, USA
  4. Technische Universität Darmstadt, Darmstadt, Germany
  5. Carnegie Mellon University, Pittsburgh, USA
