Abstract
Suppose m is a positive integer, and let \(\mathcal{M} = \{1, \ldots, m\}\). Suppose \(\{\mathcal{Y}_t\}\) is a stationary stochastic process assuming values in \(\mathcal{M}\). In this paper we study the question: When does there exist a hidden Markov model (HMM) that reproduces the statistics of this process? This question is more than forty years old, and as yet no complete solution is available. We begin by surveying several known results, and then we present some new results that provide ‘almost’ necessary and sufficient conditions for the existence of an HMM for a mixing and ultra-mixing process (where the notion of ultra-mixing is introduced here).
In the survey part of the paper, consisting of Sects. 2 through 8, we rederive the following known results:
(i) Associate an infinite matrix H with the process, and call it a ‘Hankel’ matrix (because of some superficial similarity to a Hankel matrix). Then the process has an HMM realization only if H has finite rank.
(ii) However, the finite Hankel rank condition is not sufficient in general. There exist processes with finite Hankel rank that do not admit an HMM realization.
(iii) An abstract necessary and sufficient condition states that a frequency distribution has a realization as an HMM if and only if it belongs to a ‘stable polyhedral’ convex set within the set of all frequency distributions on \(\mathcal{M}^{*}\), the set of all finite strings over \(\mathcal{M}\). While this condition may be ‘necessary and sufficient,’ it virtually amounts to a restatement of the problem rather than a solution of it, as observed by Anderson (Math Control Signals Syst 12(1):80–120, 1999).
(iv) Suppose a process has finite Hankel rank, say r. Then there always exists a ‘regular quasi-realization’ of the process. That is, there exist a row vector, a column vector, and a set of matrices, each of dimension r or r × r as appropriate, such that the frequency of arbitrary strings is given by a formula that is similar to the corresponding formula for HMMs. Moreover, all regular quasi-realizations of the process can be obtained from one of them via a similarity transformation. Hence, given a finite Hankel-rank process, it is a simple matter to determine whether or not it has a regular HMM in the conventional sense, by testing the feasibility of a linear programming problem.
(v) If in addition the process is α-mixing, every regular quasi-realization has additional features. Specifically, a matrix associated with the quasi-realization (which plays the role of the state transition matrix in an HMM) is ‘quasi-row stochastic’ (in that its rows add up to one, even though the matrix may not be nonnegative), and it also satisfies the ‘quasi-strong Perron property’ (its spectral radius is one, the spectral radius is a simple eigenvalue, and there are no other eigenvalues on the unit circle). A corollary is that if a finite Hankel rank α-mixing process has a regular HMM in the conventional sense, then the associated Markov chain is irreducible and aperiodic. While this last result is not surprising, it does not seem to have been stated explicitly.
While the above results are all ‘known,’ they are scattered over the literature; moreover, the presentation here is unified, and some of the proofs are simpler than those found in the literature.
Next we move on to present some new results. The key is the introduction of a property called ‘ultra-mixing.’ The following results are established:
(a) Suppose a process has finite Hankel rank, is both α-mixing and ultra-mixing, and in addition satisfies a technical condition. Then it has an irreducible HMM realization (and not just a quasi-realization). Moreover, the Markov process underlying the HMM is either aperiodic (and is thus α-mixing), or else satisfies a ‘consistency condition.’
(b) In the other direction, suppose an HMM satisfies the consistency condition plus another technical condition. Then the associated output process has finite Hankel rank, is α-mixing and is also ultra-mixing.
Moreover, it is shown that under a natural topology on the set of HMMs, both ‘technical’ conditions are indeed satisfied by an open dense set of HMMs. Taken together, these two results show that, modulo two technical conditions, the finite Hankel rank condition, α-mixing, and ultra-mixing are ‘almost’ necessary and sufficient for a process to have an irreducible and aperiodic HMM.
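To make the string-frequency formula and the finite Hankel rank condition concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the construction used in the paper: the HMM is assumed to be given by a transition matrix `A`, an emission matrix `B`, and a stationary distribution `pi`, the output symbol is assumed to be emitted from the state that is entered, symbols are indexed from 0 rather than 1, and all names and numerical values are hypothetical. The frequency of a string is computed as a row vector times a product of per-symbol matrices times a column of ones, and the rank of a finite block of the Hankel matrix (entries f(uv) for strings u, v up to a chosen length) is computed exactly with rational arithmetic.

```python
from fractions import Fraction as F
from itertools import product

def mat_mul(P, Q):
    """Multiply two matrices given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def symbol_matrices(A, B):
    """One n x n matrix per output symbol u, with entry (i, j) = A[i][j] * B[j][u].
    Assumed convention: the symbol is emitted from the state being entered."""
    n, m = len(A), len(B[0])
    return [[[A[i][j] * B[j][u] for j in range(n)] for i in range(n)]
            for u in range(m)]

def frequency(pi, M, word):
    """Frequency of a string: pi * M[u1] * ... * M[ul] * (column of ones)."""
    row = [list(pi)]
    for u in word:
        row = mat_mul(row, M[u])
    return sum(row[0])

def strings_up_to(m, k):
    """All strings over {0, ..., m-1} of length 0 through k, as tuples."""
    return [w for length in range(k + 1) for w in product(range(m), repeat=length)]

def rank(rows):
    """Exact rank by Gaussian elimination over the rationals."""
    rows = [list(r) for r in rows]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            factor = rows[i][col] / rows[rk][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk

def hankel_rank(pi, M, k):
    """Rank of the finite Hankel block with entries f(u + v), |u|, |v| <= k."""
    words = strings_up_to(len(M), k)
    return rank([[frequency(pi, M, u + v) for v in words] for u in words])

# Hypothetical two-state, two-symbol HMM (values chosen only for illustration).
A = [[F(1, 2), F(1, 2)], [F(1, 4), F(3, 4)]]    # state transition matrix
B = [[F(9, 10), F(1, 10)], [F(1, 5), F(4, 5)]]  # B[j][u] = P(symbol u | state j)
pi = [F(1, 3), F(2, 3)]                         # stationary: pi * A = pi
M = symbol_matrices(A, B)

print(frequency(pi, M, (0, 1, 0)))
print(hankel_rank(pi, M, 2))
```

Because every entry of the block factors through an n-dimensional row vector and an n-dimensional column vector, the rank of any such block cannot exceed the number of states n in this sketch, which is the mechanism behind the necessity of finite Hankel rank stated in (i). Exact fractions are used so that the rank computation is not affected by floating-point round-off.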
References
Anderson BDO (1999) The realization problem for hidden Markov models. Math Control Signals Syst 12(1): 80–120
Anderson BDO, Deistler M, Farina L, Benvenuti L (1996) Nonnegative realization of a system with a nonnegative impulse response. IEEE Trans Circ Syst I Fundam Theory Appl 43: 134–142
Baldi P, Brunak S (2001) Bioinformatics: a machine learning approach, 2nd edn. MIT Press, Cambridge
Baum LE, Petrie T (1966) Statistical inference for probabilistic functions of finite state Markov chains. Ann Math Stat 37: 1554–1563
Baum LE, Petrie T, Soules G, Weiss N (1970) A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains. Ann Math Stat 41(1): 164–171
Benvenuti L, Farina L (2004) A tutorial on the positive realization problem. IEEE Trans Autom Control 49: 651–664
Berman A, Plemmons RJ (1979) Nonnegative matrices. Academic Press, New York
Blackwell D, Koopmans L (1957) On the identifiability problem for functions of finite Markov chains. Ann Math Stat 28: 1011–1015
Blondel V, Catarini V (2003) Undecidable problems for probabilistic automata of fixed dimension. Theory Comput Syst 36: 231–245
Carlyle JW (1967) Identification of state-calculable functions of finite Markov chains. Ann Math Stat 38: 201–205
Carlyle JW (1969) Stochastic finite-state system theory. In: Zadeh L, Polak E (eds) System theory, chap 10. McGraw-Hill, New York
Cawley SE, Wirth AL, Speed TP (2001) Phat—a gene finding program for Plasmodium falciparum. Mol Biochem Parasitol 118: 167–174
Delcher AL, Harmon D, Kasif S, White O, Salzberg SL (1999) Improved microbial gene identification with GLIMMER. Nucl Acids Res 27(23): 4636–4641
Dharmadhikari SW (1963) Functions of finite Markov chains. Ann Math Stat 34: 1022–1031
Dharmadhikari SW (1963) Sufficient conditions for a stationary process to be a function of a Markov chain. Ann Math Stat 34: 1033–1041
Dharmadhikari SW (1965) A characterization of a class of functions of finite Markov chains. Ann Math Stat 36: 524–528
Dharmadhikari SW (1969) A note on exchangeable processes with states of finite rank. Ann Math Stat 40(6): 2207–2208
Dharmadhikari SW, Nadkarni MG (1970) Some regular and non-regular functions of finite Markov chains. Ann Math Stat 41(1): 207–213
Erickson RV (1970) Functions of Markov chains. Ann Math Stat 41: 843–850
Fliess M (1975) Séries rationnelles positives et processus stochastiques. Ann Inst Henri Poincaré Sect B XI: 1–21
Fox M, Rubin H (1968) Functions of processes with Markovian states. Ann Math Stat 39: 938–946
Gilbert EJ (1959) The identifiability problem for functions of Markov chains. Ann Math Stat 30: 688–697
Heller A (1965) On stochastic processes derived from Markov chains. Ann Math Stat 36: 1286–1291
Ito H, Amari S, Kobayashi K (1992) Identifiability of hidden Markov information sources and their minimum degrees of freedom. IEEE Trans Inf Theory 38: 324–333
Jelinek F (1997) Statistical methods for speech recognition. MIT Press, Cambridge
Kalikow S (1990) Random Markov processes and uniform martingales. Isr J Math 71(1): 33–54
Krogh A, Brown M, Mian IS, Sjölander K, Haussler D (1994) Hidden Markov models in computational biology: applications to protein modeling. J Mol Biol 235: 1501–1531
Krogh A, Mian IS, Haussler D (1994) A hidden Markov model that finds genes in E. coli DNA. Nucl Acids Res 22(22): 4768–4778
Kronecker L (1881) Zur Theorie der Elimination einer Variablen aus zwei algebraischen Gleichungen. Monatsber Königl Preuss Akad Wiss Berlin, pp 535–600
Majoros WH, Salzberg SL (2004) An empirical analysis of training protocols for probabilistic gene finders. BMC Bioinforma. http://www.biomedcentral.com/1471-2105/5/206
Ornstein DS, Weiss B (1990) How sampling reveals a process. Ann Probab 18(3): 905–930
Picci G (1978) On the internal structure of finite-state stochastic processes. In: Mohler R, Ruberti A (eds) Recent developments in variable structure systems. Lecture notes in economics and mathematical systems, vol 162. Springer, Heidelberg
Rabiner LR (1989) A tutorial on hidden Markov models and selected applications in speech recognition. Proc IEEE 77(2): 257–285
Rozenberg G, Salomaa A (1994) Cornerstones in undecidability. Prentice-Hall, Englewood Cliffs
Salzberg SL, Delcher AL, Kasif S, White O (1998) Microbial gene identification using interpolated Markov models. Nucl Acids Res 26(2): 544–548
Seneta E (1981) Non-negative matrices and Markov chains, 2nd edn. Springer, New York
Sontag ED (1975) On certain questions of rationality and decidability. J Comput Syst Sci 11: 375–381
van den Hof JM (1997) Realization of continuous-time positive linear systems. Syst Control Lett 31: 243–253
van den Hof JM, van Schuppen JH (1994) Realization of positive linear systems using polyhedral cones. In: Proceedings of the 33rd IEEE conference on decision and control, pp 3889–3893
Vidyasagar M (2003) Learning and generalization with applications to neural networks. Springer, London
Vidyasagar M (2003) Nonlinear systems analysis. SIAM Publications, Philadelphia
Cite this article
Vidyasagar, M. The complete realization problem for hidden Markov models: a survey and some new results. Math. Control Signals Syst. 23, 1–65 (2011). https://doi.org/10.1007/s00498-011-0066-7
DOI: https://doi.org/10.1007/s00498-011-0066-7