Machine Learning, Volume 20, Issue 3, pp 197-243

Learning Bayesian networks: The combination of knowledge and statistical data

  • David Heckerman (Microsoft Research, 9S)
  • Dan Geiger (Microsoft Research, 9S)
  • David M. Chickering (Microsoft Research, 9S)


We describe a Bayesian approach for learning Bayesian networks from a combination of prior knowledge and statistical data. First and foremost, we develop a methodology for assessing informative priors needed for learning. Our approach is derived from a set of assumptions made previously as well as the assumption of likelihood equivalence, which says that data should not help to discriminate network structures that represent the same assertions of conditional independence. We show that likelihood equivalence, when combined with previously made assumptions, implies that the user's priors for network parameters can be encoded in a single Bayesian network for the next case to be seen (a prior network) and a single measure of confidence for that network. Second, using these priors, we show how to compute the relative posterior probabilities of network structures given data. Third, we describe search methods for identifying network structures with high posterior probabilities. We describe polynomial algorithms for finding the highest-scoring network structures in the special case where every node has at most k = 1 parent. For the general case (k > 1), which is NP-hard, we review heuristic search algorithms including local search, iterative local search, and simulated annealing. Finally, we describe a methodology for evaluating Bayesian-network learning algorithms, and apply this approach to a comparison of various approaches.
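To make the idea of likelihood equivalence concrete, the following is a minimal sketch rather than the authors' implementation. It assumes complete discrete data and a uniform prior network (the special case commonly called BDeu), with the single confidence measure represented by an equivalent sample size `ess`. The function names, the toy data, and the variable names `x` and `y` are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import lgamma, isclose


def family_log_score(data, child, parents, arity, ess=1.0):
    """Log marginal likelihood of one node given its parents.

    Assumption: Dirichlet priors derived from a uniform prior network,
    so every (parent configuration, child state) cell gets ess / (r * q).
    """
    r = arity[child]                      # number of child states
    q = 1                                 # number of parent configurations
    for p in parents:
        q *= arity[p]
    a_cell = ess / (r * q)                # prior count per cell
    a_cfg = ess / q                       # prior count per parent configuration

    cell_counts = Counter(
        (tuple(row[p] for p in parents), row[child]) for row in data
    )
    cfg_counts = Counter(tuple(row[p] for p in parents) for row in data)

    score = 0.0
    # Unobserved parent configurations contribute exactly zero, so it is
    # enough to loop over the configurations that occur in the data.
    for cfg, n_cfg in cfg_counts.items():
        score += lgamma(a_cfg) - lgamma(a_cfg + n_cfg)
        for k in range(r):
            n_cell = cell_counts.get((cfg, k), 0)
            score += lgamma(a_cell + n_cell) - lgamma(a_cell)
    return score


def structure_log_score(data, structure, arity, ess=1.0):
    """Sum of family scores; structure maps each node to a tuple of parents."""
    return sum(
        family_log_score(data, child, parents, arity, ess)
        for child, parents in structure.items()
    )


# Hypothetical toy data set: two correlated binary variables.
data = (
    [{"x": 0, "y": 0}] * 8
    + [{"x": 1, "y": 1}] * 8
    + [{"x": 0, "y": 1}] * 2
    + [{"x": 1, "y": 0}] * 2
)
arity = {"x": 2, "y": 2}

s_xy = structure_log_score(data, {"x": (), "y": ("x",)}, arity)     # x -> y
s_yx = structure_log_score(data, {"y": (), "x": ("y",)}, arity)     # y -> x
s_empty = structure_log_score(data, {"x": (), "y": ()}, arity)      # no arc

print(s_xy, s_yx, s_empty)
print("equivalent structures score equally:", isclose(s_xy, s_yx, rel_tol=1e-9))
```

The structures x -> y and y -> x encode the same (empty) set of conditional-independence assertions, so under this prior they receive the same score up to floating-point rounding, while the arc-free structure is scored separately. Because the score decomposes into per-node terms, a search procedure such as local search only needs to rescore the families whose parent sets change.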


Keywords: Bayesian networks, learning, Dirichlet, likelihood equivalence, maximum branching, heuristic search