Abstract
This paper addresses state inference for hidden Markov models. The unobserved states of these models often have a meaningful interpretation, which makes diagnostic tools for quantifying state uncertainty necessary. The entropy of the state sequence that explains an observed sequence under a given hidden Markov chain model can be considered the canonical measure of state sequence uncertainty. This canonical measure is not reflected by the classic multidimensional posterior state (or smoothed) probability profiles, because of the marginalization intrinsic to the computation of these posterior probabilities. Here, we introduce a new type of profile with the following properties: (i) the profiles of conditional entropies decompose the canonical measure of state sequence uncertainty along the sequence, making it possible to localize this uncertainty; (ii) the profiles are unidimensional and thus remain easily interpretable on tree structures. We show how to extend the smoothing algorithms for hidden Markov chain and tree models to compute these entropy profiles efficiently. The use of entropy profiles is illustrated on sequence and tree data examples.
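To make the decomposition concrete, here is a minimal sketch for a hidden Markov chain (a reconstruction under assumed notation, with made-up toy parameters; not the authors' code). The profile of conditional entropies \(H(S_t | S_{t-1}, {\varvec{X}}={\varvec{x}})\), computed from forward-backward quantities, sums exactly to the global state sequence entropy \(H({\varvec{S}}|{\varvec{X}}={\varvec{x}})\), because the posterior state process is Markov given the observations; a brute-force enumeration confirms this.

```python
# Hedged sketch (not the authors' code): entropy profile for a hidden
# Markov *chain*, with made-up toy parameters.
import itertools
import numpy as np

pi = np.array([0.6, 0.4])              # initial state distribution
A = np.array([[0.7, 0.3],
              [0.2, 0.8]])             # transition matrix
B = np.array([[0.9, 0.1],
              [0.3, 0.7]])             # emission probabilities B[state, symbol]
x = [0, 0, 1, 1, 0]                    # observed symbols
T, J = len(x), len(pi)

# forward-backward quantities (unnormalized)
alpha = np.zeros((T, J))
beta = np.ones((T, J))
alpha[0] = pi * B[:, x[0]]
for t in range(1, T):
    alpha[t] = (alpha[t - 1] @ A) * B[:, x[t]]
for t in range(T - 2, -1, -1):
    beta[t] = A @ (B[:, x[t + 1]] * beta[t + 1])
px = alpha[-1].sum()                   # likelihood P(X = x)

# profile: H(S_1 | X=x), then H(S_t | S_{t-1}, X=x) for t = 2..T
gamma1 = alpha[0] * beta[0] / px       # posterior of the first state
profile = [-np.sum(gamma1 * np.log(gamma1))]
for t in range(1, T):
    # joint posterior xi[i, j] = P(S_{t-1}=i, S_t=j | X=x)
    xi = alpha[t - 1][:, None] * A * (B[:, x[t]] * beta[t])[None, :] / px
    cond = xi / xi.sum(axis=1, keepdims=True)  # P(S_t=j | S_{t-1}=i, X=x)
    profile.append(-np.sum(xi * np.log(cond)))

# brute-force check: enumerate all J**T state sequences
probs = []
for s in itertools.product(range(J), repeat=T):
    p = pi[s[0]] * B[s[0], x[0]]
    for t in range(1, T):
        p *= A[s[t - 1], s[t]] * B[s[t], x[t]]
    probs.append(p)
post = np.array(probs) / px
H_global = -np.sum(post * np.log(post))
```

The sum of `profile` and `H_global` agree to numerical precision, which is the defining property of the entropy profile: a per-position decomposition of the canonical state sequence uncertainty.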
Acknowledgments
The authors are indebted to Yves Caraglio for useful comments on modelling the Aleppo pines dataset and for providing this dataset.
Appendices
Appendix 1: Computation of the global entropy of state subtrees in hidden Markov tree models
Proposition 2
Let \({\mathcal {V}}\) be a subtree of \({\mathcal {T}}\) with root vertex \(r\). Then for any possible value \({\bar{\varvec{s}}}_{\mathcal {V}}\) of \(\bar{\varvec{S}}_{\mathcal {V}}\) and for any \({\varvec{x}}\),
and
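The two displayed equations of Proposition 2 are not rendered in this version. A plausible reconstruction, assuming the usual Markov-tree structure of the posterior state process (our notation choices, not the original displays), is the factorization of the posterior distribution over the subtree together with the matching entropy decomposition:

```latex
% Reconstruction (hedged): the posterior restricted to a connected
% subtree V with root r remains Markov, hence
P(\bar{\varvec{S}}_{\mathcal{V}} = \bar{\varvec{s}}_{\mathcal{V}} \mid \varvec{X} = \varvec{x})
  = P(S_r = s_r \mid \varvec{X} = \varvec{x})
    \prod_{u \in \mathcal{V} \setminus \{r\}}
    P(S_u = s_u \mid S_{\rho(u)} = s_{\rho(u)}, \varvec{X} = \varvec{x}),
% and, by the chain rule for entropy,
H(\bar{\varvec{S}}_{\mathcal{V}} \mid \varvec{X} = \varvec{x})
  = H(S_r \mid \varvec{X} = \varvec{x})
  + \sum_{u \in \mathcal{V} \setminus \{r\}}
    H(S_u \mid S_{\rho(u)}, \varvec{X} = \varvec{x}).
```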
Proof
This is proved by induction on the vertices in \({\mathcal {V}}\). The induction step is as follows: let \(\ell \) be a leaf vertex of \({\mathcal {V}}\). Then for any possible value \({\bar{\varvec{s}}}_{\mathcal {V}}\) of \(\bar{\varvec{S}}_{\mathcal {V}}\),
since \(S_{\ell }\) is conditionally independent of the other state variables indexed by \({\mathcal {V}}\) given \(S_{\rho (\ell )}\) and \({\varvec{X}}\).
The induction step is completed by observing that \({\mathcal {V}}\backslash \{\ell \}\) is a subtree of \({\mathcal {T}}\).
The decomposition of the entropy of \(\bar{\varvec{S}}_{\mathcal {V}}\) yielded by the chain rule
is proved in the same way as Corollary 1. \(\square \)
Appendix 2: Direct computation of global state tree entropy in hidden Markov tree models
Direct computation of the global state tree entropy is based on the recursive computation of the entropies of children state subtrees given each state. This recursion relies on conditional independence properties between the hidden and observed variables in HMT models, in particular the following relations: for any internal, non-root vertex \(u\) and for \(j = 0,\ldots ,J-1\),
Entropies \(H(\bar{\varvec{S}}_{c(u)} | S_{u}=j, \bar{\varvec{X}}_u=\bar{\varvec{x}}_u)\) can be computed for any \(u \in {\mathcal {U}}, u \ne 0\) and for \(j=0,\ldots ,J-1\), by an upward algorithm initialised at the leaf vertices \(u\) by
As a consequence of (20), we have for any state \(j\), \(H(\bar{\varvec{S}}_{c(u)}|S_{u}=j,\bar{\varvec{X}}_u=\bar{\varvec{x}}_u) = H(\bar{\varvec{S}}_{c(u)}|S_{u}=j,{\varvec{X}}={\varvec{x}})\). Thus, it follows from (19) that
Moreover, for any \(v \in c(u)\) with \(c(v) \ne \emptyset \) and for \(j=0,\ldots ,J-1\),
Thus, the recursion of the upward algorithm is given by
where \(P(S_v=k | S_u=j, \bar{\varvec{X}}_v=\bar{\varvec{x}}_v) = P(S_v=k | S_u=j, {\varvec{X}}={\varvec{x}})\) is given by equation (17).
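The displays of this upward recursion are not rendered here. A plausible reconstruction, consistent with relations (19) and (20) above (our notation, not the original displays): children state subtrees are conditionally independent given the parent state, so the entropy splits over children, and each child term unfolds by the chain rule.

```latex
% Reconstruction (hedged):
H(\bar{\varvec{S}}_{c(u)} \mid S_u = j, \bar{\varvec{X}}_u = \bar{\varvec{x}}_u)
  = \sum_{v \in c(u)}
    H(\bar{\varvec{S}}_v \mid S_u = j, \bar{\varvec{X}}_v = \bar{\varvec{x}}_v),
% with, for each child v,
H(\bar{\varvec{S}}_v \mid S_u = j, \bar{\varvec{X}}_v = \bar{\varvec{x}}_v)
  = H(S_v \mid S_u = j, \bar{\varvec{X}}_v = \bar{\varvec{x}}_v)
  + \sum_{k=0}^{J-1}
    P(S_v = k \mid S_u = j, \bar{\varvec{X}}_v = \bar{\varvec{x}}_v)\,
    H(\bar{\varvec{S}}_{c(v)} \mid S_v = k, \bar{\varvec{X}}_v = \bar{\varvec{x}}_v),
% initialized with zero entropy at the leaf vertices,
% where c(v) is empty.
```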
The termination step is obtained by arguments similar to those for equation (21):
Using arguments similar to those in (22), the partial state tree entropy \(H(\bar{\varvec{S}}_u | {\varvec{X}}= {\varvec{x}})\) can be deduced from the conditional entropies \(H(\bar{\varvec{S}}_{c(u)}|S_{u}=j,\bar{\varvec{X}}_u=\bar{\varvec{x}}_u)\) (with \(j=0,\ldots ,J-1\)) as follows:
where the \(\left\{ L_u(j)\right\} _{j=0,\ldots ,J-1}\) are directly extracted from the downward recursion (14).
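This display is also missing. Given the surrounding text, a plausible form (our reconstruction) combines the posterior state probabilities \(L_u(j)\) with the conditional subtree entropies by the chain rule:

```latex
% Reconstruction (hedged):
H(\bar{\varvec{S}}_u \mid \varvec{X} = \varvec{x})
  = H(S_u \mid \varvec{X} = \varvec{x})
  + \sum_{j=0}^{J-1} L_u(j)\,
    H(\bar{\varvec{S}}_{c(u)} \mid S_u = j, \bar{\varvec{X}}_u = \bar{\varvec{x}}_u),
% where L_u(j) = P(S_u = j | X = x) is the posterior
% (smoothed) state probability at vertex u.
```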
The profile of conditional entropies \(H(S_{u} | S_{\rho (u)}, {\varvec{X}}= {\varvec{x}})\) is deduced from
where
and where for any sibling vertex \(v\) of \(u\), \(H(\bar{\varvec{S}}_v | S_{\rho (v)}=j, {\varvec{X}}= {\varvec{x}})\) is given by (22). Since
and since
\(H(S_{u} | S_{\rho (u)}, {\varvec{X}}= {\varvec{x}})\) is directly extracted from the partial state entropies \(H(\bar{\varvec{S}}_{c(u)} | S_u=j, {\varvec{X}}\!=\! {\varvec{x}})\) and \(H(\bar{\varvec{S}}_{\rho (u)} | {\varvec{X}}\!=\! {\varvec{x}})\) and from the marginal entropy \(H(S_{\rho (u)} | {\varvec{X}}= {\varvec{x}})\).
In summary, the partial subtree entropies \(\big \{H(\bar{\varvec{S}}_{c(u)}|S_{u}= j,\bar{\varvec{X}}_u=\bar{\varvec{x}}_u)\big \}_{u \in {\mathcal {U}};\, j=0,\ldots ,J-1}\) are first computed using (23). The partial state tree entropies \(\left\{ H(\bar{\varvec{S}}_u | {\varvec{X}}= {\varvec{x}})\right\} _{u \in {\mathcal {U}}}\) and then the profile of conditional entropies \(\left\{ H(S_{u} | S_{\rho (u)}, {\varvec{X}}= {\varvec{x}})\right\} _{u \in {\mathcal {U}}}\) are deduced from these entropies and the posterior state probabilities, using (24) and (25). The time complexity of the algorithm is \({\mathcal {O}}(J^2 n)\).
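As a rough illustration of the upward algorithm, here is a sketch under assumed notation, with made-up parameters (not the paper's code). It computes the per-state children-subtree entropies \(H(\bar{\varvec{S}}_{c(u)}|S_u=j,{\varvec{X}}={\varvec{x}})\) by an upward pass and terminates at the root to obtain the global state tree entropy, which is checked against brute-force enumeration of all state trees.

```python
# Hedged sketch (our reconstruction, not the paper's code): upward
# computation of the global state tree entropy for a toy hidden Markov tree.
import itertools
import numpy as np

pi = np.array([0.5, 0.5])                      # root state distribution
A = np.array([[0.8, 0.2],
              [0.3, 0.7]])                     # parent-to-child transitions
B = np.array([[0.9, 0.1],
              [0.2, 0.8]])                     # emissions B[state, symbol]
children = {0: [1, 2], 1: [3, 4], 2: [], 3: [], 4: []}
x = {0: 0, 1: 0, 2: 1, 3: 0, 4: 1}             # one observed symbol per vertex
J = len(pi)
verts = sorted(children)

# upward likelihoods: beta[u][k] = P(observed subtree of u | S_u = k)
beta = {}
def upward(u):
    b = B[:, x[u]].copy()
    for v in children[u]:
        upward(v)
        b *= A @ beta[v]
    beta[u] = b
upward(0)

# Hc[u][j] = H(children state subtrees of u | S_u = j, X = x);
# zero at the leaves, accumulated child by child on the way up
Hc = {}
def entropy_up(u):
    h = np.zeros(J)
    for v in children[u]:
        entropy_up(v)
        cond = A * beta[v][None, :]            # row j: propto P(S_v=. | S_u=j, x)
        cond /= cond.sum(axis=1, keepdims=True)
        h += -np.sum(cond * np.log(cond), axis=1) + cond @ Hc[v]
    Hc[u] = h
entropy_up(0)

# termination at the root: global state tree entropy
g0 = pi * beta[0]
g0 /= g0.sum()                                 # P(S_0 = j | X = x)
H_tree = -np.sum(g0 * np.log(g0)) + g0 @ Hc[0]

# brute-force check over all J**n state trees
probs = []
for s in itertools.product(range(J), repeat=len(verts)):
    p = pi[s[0]] * B[s[0], x[0]]
    for u in verts:
        for v in children[u]:
            p *= A[s[u], s[v]] * B[s[v], x[v]]
    probs.append(p)
post = np.array(probs) / sum(probs)
H_bf = -np.sum(post * np.log(post))
```

One upward pass visits each vertex once and each parent-child transition costs \(J^2\) operations, which matches the stated \({\mathcal {O}}(J^2n)\) complexity, while the brute-force check is exponential in the number of vertices.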
Appendix 3: Application of HMT model to Aleppo pines: path containing female shoots
A path containing a female shoot is considered. This path corresponds to the main axis of the third individual (for which \(H({\varvec{S}}|{\varvec{X}}={\varvec{x}}) = 29.6\)). The path contains 6 vertices, referred to as \(\{0, \ldots , 5\}\). The female shoot is at vertex 2, and vertex 3 is a bicyclic shoot. Shoots 4 and 5 are unbranched, monocyclic, sterile shoots.
The contribution of the vertices of the considered path \({\mathcal {P}}\) to the global state tree entropy is equal to 0.48 (that is, 0.08 per vertex on average). The global state tree entropy for this individual is 0.21 per vertex, against 0.20 per vertex over the whole dataset. The mean marginal state entropy for this individual is 0.37 per vertex, which strongly overestimates the mean state tree entropy.
Since a female shoot is necessarily in state 0, \(H(S_2 | {\varvec{X}}= {\varvec{x}})=0\) (no uncertainty). The states of shoots 0 and 1 can be deduced from \(S_2\) using the transition matrix \({\hat{P}}\); thus their marginal entropies are null. Since shoot 3 is bicyclic, it is in state 0 with very high probability \((H(S_3 | {\varvec{X}}= {\varvec{x}}) \approx 0)\). Uncertainty remains concerning the states of shoots 4 and 5, which thus have high marginal entropies. However, \(S_5\) can be deduced from \(S_4\) using \({\hat{P}}\) and vice versa, which results in high mutual information between \(S_4\) and \(S_5\) given \({\varvec{X}}= {\varvec{x}}\). This is illustrated by the conditional and marginal entropy profiles in Fig. 9.
Cite this article
Durand, JB., Guédon, Y. Localizing the latent structure canonical uncertainty: entropy profiles for hidden Markov models. Stat Comput 26, 549–567 (2016). https://doi.org/10.1007/s11222-014-9494-9