The complete realization problem for hidden Markov models: a survey and some new results

Vidyasagar, M.

doi:10.1007/s00498-011-0066-7

Original Article
Published: 16 October 2011

Volume 23, pages 1–65, (2011)

Mathematics of Control, Signals, and Systems

M. Vidyasagar¹

Abstract

Suppose m is a positive integer, and let \({\mathcal{M} = \{1, \ldots ,m\}}\). Suppose \({\{\mathcal{Y}_t \}}\) is a stationary stochastic process assuming values in \({\mathcal{M}}\). In this paper we study the question: When does there exist a hidden Markov model (HMM) that reproduces the statistics of this process? This question is more than forty years old, and as yet no complete solution is available. In this paper, we begin by surveying several known results, and then we present some new results that provide ‘almost’ necessary and sufficient conditions for the existence of an HMM for a mixing and ultra-mixing process (where the notion of ultra-mixing is introduced here).

In the survey part of the paper, consisting of Sects. 2 through 8, we rederive the following known results:
(i) Associate an infinite matrix H with the process, and call it a ‘Hankel’ matrix (because of some superficial similarity to a Hankel matrix). Then the process has an HMM realization only if H has finite rank.
(ii) However, the finite Hankel rank condition is not sufficient in general. There exist processes with finite Hankel rank that do not admit an HMM realization.
(iii) An abstract necessary and sufficient condition states that a frequency distribution has a realization as an HMM if and only if it belongs to a ‘stable polyhedral’ convex set within the set of all frequency distributions on \({\mathcal{M}^{*}}\), the set of all finite strings over \({\mathcal{M}}\). While this condition may be ‘necessary and sufficient,’ it virtually amounts to a restatement of the problem rather than a solution of it, as observed by Anderson (Math Control Signals Syst 12(1):80–120, 1999).
(iv) Suppose a process has finite Hankel rank, say r. Then there always exists a ‘regular quasi-realization’ of the process. That is, there exist a row vector, a column vector, and a set of matrices, each of dimension r or r × r as appropriate, such that the frequency of arbitrary strings is given by a formula that is similar to the corresponding formula for HMMs. Moreover, all regular quasi-realizations of the process can be obtained from one of them via a similarity transformation. Hence, given a finite Hankel-rank process, it is a simple matter to determine whether or not it has a regular HMM in the conventional sense, by testing the feasibility of a linear programming problem.
(v) If in addition the process is α-mixing, every regular quasi-realization has additional features. Specifically, a matrix associated with the quasi-realization (which plays the role of the state transition matrix in an HMM) is ‘quasi-row stochastic’ (in that its rows add up to one, even though the matrix may not be nonnegative), and it also satisfies the ‘quasi-strong Perron property’ (its spectral radius is one, the spectral radius is a simple eigenvalue, and there are no other eigenvalues on the unit circle). A corollary is that if a finite Hankel rank α-mixing process has a regular HMM in the conventional sense, then the associated Markov chain is irreducible and aperiodic. While this last result is not surprising, it does not seem to have been stated explicitly.

While the above results are all ‘known,’ they are scattered over the literature; moreover, the presentation here is unified, and some of the proofs are simpler than those found in the literature.

Next we move on to present some new results. The key is the introduction of a property called ‘ultra-mixing.’ The following results are established:
(a) Suppose a process has finite Hankel rank, is both α-mixing and ‘ultra-mixing,’ and in addition satisfies a technical condition. Then it has an irreducible HMM realization (and not just a quasi-realization). Moreover, the Markov process underlying the HMM is either aperiodic (and is thus α-mixing), or else satisfies a ‘consistency condition.’
(b) In the other direction, suppose an HMM satisfies the consistency condition plus another technical condition. Then the associated output process has finite Hankel rank, is α-mixing and is also ultra-mixing.

Moreover, it is shown that under a natural topology on the set of HMMs, both ‘technical’ conditions are indeed satisfied by an open dense set of HMMs. Taken together, these two results show that, modulo two technical conditions, the finite Hankel rank condition, α-mixing, and ultra-mixing are ‘almost’ necessary and sufficient for a process to have an irreducible and aperiodic HMM.
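Items (i), (iv) and (v) can be illustrated numerically. The sketch below is not taken from the paper: it assumes one common parametrization of an HMM, in which matrices M[u] have entries Pr{X_{t+1} = j, Y_{t+1} = u | X_t = i}, so that the frequency of a string u_1 … u_l is π M[u_1] ⋯ M[u_l] 1 with π the stationary distribution. All function names, the truncation of the infinite Hankel matrix to short strings, and the toy matrices A and B are hypothetical choices made for illustration.

```python
import itertools
import numpy as np

def string_frequency(pi, M, u):
    """Frequency of string u, computed as pi @ M[u1] @ ... @ M[ul] @ 1."""
    row = pi
    for symbol in u:
        row = row @ M[symbol]
    return float(row.sum())

def truncated_hankel(pi, M, max_len):
    """Finite block of the 'Hankel' matrix: entry (u, v) is the frequency of uv."""
    symbols = range(len(M))
    strings = [s for k in range(max_len + 1)
               for s in itertools.product(symbols, repeat=k)]
    return np.array([[string_frequency(pi, M, u + v) for v in strings]
                     for u in strings])

def stationary_distribution(A):
    """Left eigenvector of A for the eigenvalue closest to 1, scaled to sum to 1."""
    w, V = np.linalg.eig(A.T)
    v = np.real(V[:, np.argmin(np.abs(w - 1.0))])
    return v / v.sum()

def is_quasi_row_stochastic(A, tol=1e-9):
    """Rows sum to one; entries are allowed to be negative."""
    return bool(np.allclose(A.sum(axis=1), 1.0, atol=tol))

def has_quasi_strong_perron_property(A, tol=1e-8):
    """Spectral radius 1, attained only by a simple eigenvalue equal to 1."""
    eigs = np.linalg.eigvals(A)
    mods = np.abs(eigs)
    on_circle = eigs[np.abs(mods - 1.0) < tol]
    return bool(mods.max() < 1.0 + tol and on_circle.size == 1
                and abs(on_circle[0] - 1.0) < tol)

# Toy example (assumed values): 2 hidden states, 2 output symbols.
A = np.array([[0.9, 0.1], [0.2, 0.8]])   # hidden-state transition matrix
B = np.array([[0.7, 0.3], [0.4, 0.6]])   # B[j, u] = Pr{Y = u | X = j}
M = [A @ np.diag(B[:, u]) for u in range(B.shape[1])]
pi = stationary_distribution(A)

H = truncated_hankel(pi, M, max_len=2)
print("rank of truncated Hankel block:", np.linalg.matrix_rank(H, tol=1e-10))
print("quasi-row stochastic:", is_quasi_row_stochastic(A))
print("quasi-strong Perron property:", has_quasi_strong_perron_property(A))
```

In this parametrization every finite block of H factors through the hidden state space, so its rank cannot exceed the number of hidden states; this is the direction stated in (i). By (ii), a small rank in such a computation does not by itself certify that an HMM realization exists.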

References

Anderson BDO (1999) The realization problem for hidden Markov models. Math Control Signals Syst 12(1): 80–120
Anderson BDO, Deistler M, Farina L, Benvenuti L (1996) Nonnegative realization of a system with a nonnegative impulse response. IEEE Trans Circ Syst I Fundam Theory Appl 43: 134–142
Baldi P, Brunak S (2001) Bioinformatics: a machine learning approach, 2nd edn. MIT Press, Cambridge
Baum LE, Petrie T (1966) Statistical inference for probabilistic functions of finite state Markov chains. Ann Math Stat 37: 1554–1563
Baum LE, Petrie T, Soules G, Weiss N (1970) A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains. Ann Math Stat 41(1): 164–171
Benvenuti L, Farina L (2004) A tutorial on the positive realization problem. IEEE Trans Autom Control 49: 651–664
Berman A, Plemmons RJ (1979) Nonnegative matrices. Academic Press, New York
Blackwell D, Koopmans L (1957) On the identifiability problem for functions of finite Markov chains. Ann Math Stat 28: 1011–1015
Blondel V, Canterini V (2003) Undecidable problems for probabilistic automata of fixed dimension. Theory Comput Syst 36: 231–245
Carlyle JW (1967) Identification of state-calculable functions of finite Markov chains. Ann Math Stat 38: 201–205
Carlyle JW (1969) Stochastic finite-state system theory. In: Zadeh L, Polak E (eds) System theory, chap 10. McGraw-Hill, New York
Cawley SE, Wirth AL, Speed TP (2001) Phat—a gene finding program for Plasmodium falciparum. Mol Biochem Parasitol 118: 167–174
Delcher AL, Harmon D, Kasif S, White O, Salzberg SL (1999) Improved microbial gene identification with GLIMMER. Nucl Acids Res 27(23): 4636–4641
Dharmadhikari SW (1963) Functions of finite Markov chains. Ann Math Stat 34: 1022–1031
Dharmadhikari SW (1963) Sufficient conditions for a stationary process to be a function of a Markov chain. Ann Math Stat 34: 1033–1041
Dharmadhikari SW (1965) A characterization of a class of functions of finite Markov chains. Ann Math Stat 36: 524–528
Dharmadhikari SW (1969) A note on exchangeable processes with states of finite rank. Ann Math Stat 40(6): 2207–2208
Dharmadhikari SW, Nadkarni MG (1970) Some regular and non-regular functions of finite Markov chains. Ann Math Stat 41(1): 207–213
Erickson RV (1970) Functions of Markov chains. Ann Math Stat 41: 843–850
Fliess M (1975) Séries rationnelles positives et processus stochastiques. Ann Inst Henri Poincaré Sect B XI:1–21
Fox M, Rubin H (1968) Functions of processes with Markovian states. Ann Math Stat 39: 938–946
Gilbert EJ (1959) The identifiability problem for functions of Markov chains. Ann Math Stat 30: 688–697
Heller A (1965) On stochastic processes derived from Markov chains. Ann Math Stat 36: 1286–1291
Ito H, Amari S, Kobayashi K (1992) Identifiability of hidden Markov information sources and their minimum degrees of freedom. IEEE Trans Inf Theory 38: 324–333
Jelinek F (1997) Statistical methods for speech recognition. MIT Press, Cambridge
Kalikow S (1990) Random Markov processes and uniform martingales. Isr J Math 71(1): 33–54
Krogh A, Brown M, Mian IS, Sjölander K, Haussler D (1994) Hidden Markov models in computational biology: applications to protein modeling. J Mol Biol 235: 1501–1531
Krogh A, Mian IS, Haussler D (1994) A hidden Markov model that finds genes in E. coli DNA. Nucl Acids Res 22(22): 4768–4778
Kronecker L (1881) Zur Theorie der Elimination einer Variablen aus zwei algebraischen Gleichungen. Monatsber Königl Preuss Akad Wiss Berlin, pp 535–600
Majoros WH, Salzberg SL (2004) An empirical analysis of training protocols for probabilistic gene finders. BMC Bioinforma. http://www.biomedcentral.com/1471-2105/5/206
Ornstein DS, Weiss B (1990) How sampling reveals a process. Ann Probab 18(3): 905–930
Picci G (1978) On the internal structure of finite-state stochastic processes. In: Mohler R, Ruberti A (eds) Recent developments in variable structure systems. Lecture notes in economics and mathematical systems, vol 162. Springer, Heidelberg
Rabiner LR (1989) A tutorial on hidden Markov models and selected applications in speech recognition. Proc IEEE 77(2): 257–285
Rozenberg G, Salomaa A (1994) Cornerstones in undecidability. Prentice-Hall, Englewood Cliffs
Salzberg SL, Delcher AL, Kasif S, White O (1998) Microbial gene identification using interpolated Markov models. Nucl Acids Res 26(2): 544–548
Seneta E (1981) Non-negative matrices and Markov chains, 2nd edn. Springer, New York
Sontag ED (1975) On certain questions of rationality and decidability. J Comput Syst Sci 11: 375–381
van den Hof JM (1997) Realization of continuous-time positive linear systems. Syst Control Lett 31: 243–253
van den Hof JM, van Schuppen JH (1994) Realization of positive linear systems using polyhedral cones. In: Proceedings of the 33rd IEEE conference on decision and control, pp 3889–3893
Vidyasagar M (2003) Learning and generalization with applications to neural networks. Springer, London
Vidyasagar M (2003) Nonlinear systems analysis. SIAM Publications, Philadelphia

Author information

Authors and Affiliations

The University of Texas at Dallas, 800 W. Campbell Road, Richardson, TX, 75080, USA
M. Vidyasagar

Corresponding author

Correspondence to M. Vidyasagar.

About this article

Cite this article

Vidyasagar, M. The complete realization problem for hidden Markov models: a survey and some new results. Math. Control Signals Syst. 23, 1–65 (2011). https://doi.org/10.1007/s00498-011-0066-7

Received: 31 May 2008
Accepted: 14 May 2009
Published: 16 October 2011
Issue Date: December 2011
DOI: https://doi.org/10.1007/s00498-011-0066-7
