DS 2007: Discovery Science, pp 151-160

Fast NML Computation for Naive Bayes Models

  • Tommi Mononen
  • Petri Myllymäki
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4755)

Abstract

The Minimum Description Length (MDL) is an information-theoretic principle that can be used for model selection and other statistical inference tasks. One way to implement this principle in practice is to compute the Normalized Maximum Likelihood (NML) distribution for a given parametric model class. Unfortunately, this is a computationally infeasible task for many model classes of practical importance. In this paper we present a fast algorithm for computing the NML for the Naive Bayes model class, which is frequently used in classification and clustering tasks. The algorithm is based on a relationship between powers of generating functions and discrete convolution. The resulting algorithm has a time complexity of \(\mathcal{O}(n^{2})\), where n is the size of the data.
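The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it names, applied to the simpler single-multinomial case rather than to the Naive Bayes model class. It assumes the standard identity \(C(K,n) = \frac{n!}{n^{n}}\,[z^{n}]\,B(z)^{K}\) with \(B(z) = \sum_{m \ge 0} \frac{m^{m}}{m!} z^{m}\) for the multinomial NML normalizing sum, and uses the classical quadratic-time recurrence for the coefficients of a power of a formal power series. The function names `series_power` and `multinomial_nml_normalizer` are hypothetical and do not come from the paper.

```python
from fractions import Fraction
from math import factorial


def series_power(a, k):
    """Return the coefficients of A(z)**k up to degree len(a) - 1.

    Uses the classical recurrence
        p[m] = (1 / (m * a[0])) * sum_{j=1..m} (k*j - (m - j)) * a[j] * p[m - j],
    which needs O(n^2) arithmetic operations and requires a[0] != 0.
    """
    n = len(a) - 1
    p = [Fraction(0)] * (n + 1)
    p[0] = Fraction(a[0]) ** k
    for m in range(1, n + 1):
        acc = Fraction(0)
        for j in range(1, m + 1):
            acc += (k * j - (m - j)) * a[j] * p[m - j]
        p[m] = acc / (m * a[0])
    return p


def multinomial_nml_normalizer(k, n):
    """Illustrative only: NML normalizing sum for one K-valued multinomial.

    Assumes C(K, n) = n!/n^n * [z^n] B(z)^K with B(z) = sum_m m^m/m! z^m.
    This is NOT the paper's Naive Bayes algorithm.
    """
    # Python evaluates 0**0 as 1, which matches the convention needed here.
    b = [Fraction(m ** m, factorial(m)) for m in range(n + 1)]
    coeff = series_power(b, k)[n]
    return coeff * Fraction(factorial(n), n ** n)


if __name__ == "__main__":
    # Exact rational value, converted to float for display.
    print(float(multinomial_nml_normalizer(3, 10)))
```

Exact rational arithmetic is used here only to keep the sketch easy to check by hand; a practical implementation would more likely work with floating-point numbers or logarithms.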

Keywords

Bayesian Network; Formal Power Series; Code Length; Recurrence Formula; Minimum Description Length
These keywords were added by machine and not by the authors.

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Tommi Mononen (1)
  • Petri Myllymäki (1)
  1. Complex Systems Computation Group (CoSCo), Helsinki Institute for Information Technology (HIIT), University of Helsinki & Helsinki University of Technology, P.O. Box 68 (Department of Computer Science), FIN-00014 University of Helsinki, Finland
