Parent Assignment Is Hard for the MDL, AIC, and NML Costs

  • Mikko Koivisto
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4005)


Several hardness results are presented for the parent assignment problem: given m observations of n attributes x_1, ..., x_n, find the best parents for x_n, that is, a subset of the preceding attributes that minimizes a fixed cost function. This attribute (feature) selection task plays an important role, e.g., in structure learning in Bayesian networks, yet little is known about its computational complexity. In this paper we prove that, under the commonly adopted full-multinomial likelihood model, the MDL, BIC, or AIC cost cannot be approximated in polynomial time to a ratio less than 2 unless there exists a polynomial-time algorithm for determining whether a directed graph with n nodes has a dominating set of size log n, a LOGSNP-complete problem for which no polynomial-time algorithm is known. As we also show, it is unlikely that these penalized maximum likelihood costs can be approximated to within any constant ratio. For the NML (normalized maximum likelihood) cost we prove an NP-completeness result. These results both justify the application of existing methods and motivate research on heuristic and super-polynomial-time algorithms.
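For concreteness, the optimization problem can be sketched as an exhaustive search over candidate parent sets. The sketch below scores each subset with the BIC cost under a full-multinomial model; the function names `bic_cost` and `best_parents` and the toy data layout are illustrative assumptions, not the paper's code:

```python
from collections import Counter
from itertools import combinations
from math import log

def bic_cost(data, parents, target):
    """BIC cost of modelling column `target` given the columns in `parents`,
    under a full-multinomial model: one conditional distribution of the
    target per parent configuration.  Cost = -(maximized log-likelihood)
    + (log m / 2) * (number of free parameters)."""
    m = len(data)
    joint = Counter((tuple(row[p] for p in parents), row[target]) for row in data)
    margin = Counter(tuple(row[p] for p in parents) for row in data)
    # Maximized log-likelihood of the conditional multinomial model.
    loglik = sum(c * log(c / margin[cfg]) for (cfg, _), c in joint.items())
    r = len({row[target] for row in data})   # arity of the target attribute
    q = 1                                    # number of parent configurations
    for p in parents:
        q *= len({row[p] for row in data})   # observed arity of each parent
    return -loglik + 0.5 * log(m) * q * (r - 1)

def best_parents(data, target):
    """Exhaustive parent assignment: try every subset of the preceding
    attributes 0..target-1 and keep the cheapest.  Exponential in `target`;
    the hardness results above suggest this blow-up is hard to avoid."""
    subsets = (s for k in range(target + 1)
               for s in combinations(range(target), k))
    return min(subsets, key=lambda s: bic_cost(data, s, target))
```

On a toy data set where x_3 copies x_1, the search recovers {x_1} as the parent set: the copy explains x_3 perfectly at the price of only one extra parent's worth of parameters, while larger sets pay a higher penalty for no likelihood gain.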







Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Mikko Koivisto
  1. HIIT Basic Research Unit, Department of Computer Science, University of Helsinki, Finland
