The Complexity of Distinguishing Markov Random Fields
Markov random fields are often used to model high-dimensional distributions in many applied areas. Several recent papers have studied the problem of reconstructing a dependency graph of bounded degree from independent samples of the Markov random field. These results require observing samples of the distribution at all nodes of the graph. It has been heuristically recognized that reconstructing the model when there are hidden variables (some of the variables are not observed) is much harder.
Here we prove that the problem of reconstructing bounded-degree models with hidden nodes is hard. Specifically, we show that unless NP = RP:
- It is impossible to decide in randomized polynomial time whether two models generate distributions whose statistical distance is at most 1/3 or at least 2/3.
- Given two generating models whose statistical distance is promised to be at least 1/3, and oracle access to independent samples from one of the models, it is impossible to decide in randomized polynomial time which of the two models is consistent with the samples.
The second problem remains hard even if the samples are generated efficiently, albeit under a stronger assumption.
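To make the quantity in these statements concrete, the following is a minimal sketch that computes the statistical (total variation) distance between the observed marginals of two tiny models with a hidden node. It is not the paper's construction: the pairwise ±1 parameterization, the function names `observed_marginal` and `statistical_distance`, and the example weights are all assumptions chosen for illustration. The enumeration takes time exponential in the number of nodes, which is consistent with the hardness results rather than a way around them.

```python
import itertools
import math


def observed_marginal(n, edges, observed):
    """Exact marginal on the `observed` nodes of a pairwise model.

    Assumed (illustrative) form: p(x) proportional to
    exp(sum over edges (i, j) of w_ij * x_i * x_j), with x in {-1, +1}^n.
    Nodes not listed in `observed` are hidden and summed out.
    """
    marginal = {}
    total = 0.0
    for x in itertools.product((-1, 1), repeat=n):
        weight = math.exp(sum(w * x[i] * x[j] for (i, j), w in edges.items()))
        key = tuple(x[i] for i in observed)
        marginal[key] = marginal.get(key, 0.0) + weight
        total += weight
    return {k: v / total for k, v in marginal.items()}


def statistical_distance(p, q):
    """Total variation distance: half the L1 distance between p and q."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


# Model A: path 0 - 1 - 2 where node 1 is hidden and both edges have weight 1.
model_a = observed_marginal(3, {(0, 1): 1.0, (1, 2): 1.0}, observed=(0, 2))
# Model B: the same three nodes with no edges (independent uniform spins).
model_b = observed_marginal(3, {}, observed=(0, 2))

print(statistical_distance(model_a, model_b))
```

In this toy pair the hidden node induces a correlation between the two observed nodes, so the distance is strictly between 0 and 1. The first hardness statement concerns deciding whether such a distance is at most 1/3 or at least 2/3 when the models are large.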