Learning Hierarchical Bayesian Networks for Large-Scale Data Analysis

  • Kyu-Baek Hwang
  • Byoung-Hee Kim
  • Byoung-Tak Zhang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4232)


Bayesian network learning is a useful tool for exploratory data analysis. However, applying Bayesian networks to the analysis of large-scale data, consisting of thousands of attributes, is not straightforward because of the heavy computational burden in learning and visualization. In this paper, we propose a novel method for large-scale data analysis based on hierarchical compression of information and constrained structural learning, i.e., hierarchical Bayesian networks (HBNs). The HBN can compactly visualize global probabilistic structure through a small number of hidden variables that approximately represent a large number of observed variables. An efficient learning algorithm for HBNs, which incrementally maximizes the lower bound of the likelihood function, is also suggested. The effectiveness of our method is demonstrated by experiments on synthetic large-scale Bayesian networks and a real-life microarray dataset.
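The abstract does not say how observed variables are grouped under hidden variables. As a rough illustration only, the sketch below computes the empirical mutual information between two discrete variables, a pairwise dependence measure that grouping schemes of this kind commonly rely on. The function name `mutual_information` and the choice of this statistic are assumptions made for illustration, not the authors' algorithm.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences.

    Illustrative helper only; not taken from the paper.
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log(p(x,y) / (p(x) * p(y))), written in terms of counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi
```

For two identical, balanced binary variables the value is log 2, and for two empirically independent variables it is 0. Variables with high pairwise values would be natural candidates to share a hidden parent, though whether the paper groups them this way is not stated here.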


Keywords (machine-generated, not supplied by the authors): Hidden Layer; Mutual Information; Bayesian Network; Child Node; Hidden Variable





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Kyu-Baek Hwang (1)
  • Byoung-Hee Kim (2)
  • Byoung-Tak Zhang (2)

  1. School of Computing, Soongsil University, Seoul, Korea
  2. School of Computer Science and Engineering, Seoul National University, Seoul, Korea
