Empirical identifiability in finite mixture models

Annals of the Institute of Statistical Mathematics

Abstract

Although the parameters in a finite mixture model are unidentifiable, there is a form of local identifiability guaranteeing the existence of identifiable parameter regions. To verify its existence, practitioners use the Fisher information on the estimated parameters. However, there exist model/data situations where local identifiability based on Fisher information does not correspond to that based on the likelihood. In this paper, we propose a method to empirically measure the degree of local identifiability on the estimated parameters, which we call empirical identifiability, based on one’s ability to construct an identifiable likelihood set. From a detailed topological study of the likelihood region, we show that for any given data set and mixture model, there typically exists a limited range of confidence levels where the likelihood region has a natural partition into identifiable subsets. At confidence levels that are too high, there is no natural way to use the likelihood to resolve the identifiability problem.

Author information

Corresponding author

Correspondence to Daeyoung Kim.

Appendix: Proofs

Proof of Theorem 1

Suppose that one starts at the highest elevation in the mixture likelihood, \(\psi _\mathrm{MLE}\). Then the likelihood region at \(\psi _\mathrm{MLE}\) is equal to the set of \(K!\) modes. If the elevation \(c\) is just below \(\psi _\mathrm{MLE}\), the likelihood region consists of \(K!\) disjoint modal regions \(C_{c}(\hat{{\theta }}^{\sigma })\).

However, if the elevation \(c\) drops too low, some of the properties in Definition 3 can fail, so that a unimodal partition no longer exists. Two different problems could arise. The first case is where there exists a secondary mode in the likelihood, with elevation \(\psi _\mathrm{2nd}\). In this case, for an elevation \(c\) at or just below \(\psi _\mathrm{2nd}\), a secondary set of modal regions is formed containing the \(K!\) secondary modes. Since these points are not path-connected to the MLE modes, property (P3) would be violated. If the primary and secondary modal regions become reconnected at a lower level of \(c\), then, even if identifiable, each element contains two modes, thereby violating property (P4).

A second case where a unimodal partition could fail is when \(c \le \psi _\mathrm{mm}\), the minimal elevation of the maximin path connecting \(C_{c}(\hat{{\theta }})\) to a permuted region \(C_{c}(\hat{{\theta }}^{\sigma })\). This causes a violation of property (P1).

As long as the elevation \(c\) is larger than \(\max \{\psi _\mathrm{mm}, \psi _\mathrm{2nd} \}\), however, no secondary modal regions are possible, and there cannot exist connections between the modal regions of the MLE group, as there exist no saddle points. Therefore the \(K!\) modal regions \(C_{c}(\hat{{\theta }}^{\sigma })\) are disjoint and their union is the elevation \(c\) likelihood region \(C_{c}^\mathrm{LR}\), so that we have verified properties (P1) and (P3).

We claim that the second property (P2) holds because the \(K!\) modal regions \(C_{c}(\hat{{\theta }}^{\sigma })\) are disjoint by (P1) and each is connected (by definition). The argument for identifiability goes as follows. First, given any \({\theta }\) in \(C_{c}(\hat{{\theta }})\), its permutation \({\theta }^{\sigma }\) cannot also be in the same modal region \(C_{c}(\hat{{\theta }})\), as we know it lies in a disjoint \(C_{c}(\hat{{\theta }}^{\sigma })\). Second, we suppose, for purposes of contradiction, that \(C_{c}(\hat{{\theta }})\) contains a degenerate point \({\theta }_{0}\), but is not path-connected to any \(C_{c}(\hat{{\theta }}^{\sigma })\), for any permutation \(\sigma \). Then the permuted point \({\theta }_{0}^{\sigma }\) is contained only in \(C_{c}(\hat{{\theta }}^{\sigma })\). Since the region of \({\theta }\) values that generate the same degenerate mixing distribution, and hence the same likelihood, as \({\theta }_{0}\) is a connected set, \({\theta }_{0}^{\sigma }\) should be in the same connected set. Thus, at the elevation \(c \le \psi _\mathrm{deg}\), \(C_{c}(\hat{{\theta }})\) and \(C_{c}(\hat{{\theta }}^{\sigma })\) are connected to each other by a path. But this contradicts property (P1), that the \(K!\) modal regions \(C_{c}(\hat{{\theta }}^{\sigma })\) are disjoint. \(\square \)
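The starting point of the proof, that the likelihood has \(K!\) equivalent modes, follows from the invariance of the mixture likelihood under relabeling of the components. The sketch below checks this numerically for \(K=3\). The data values, the unit-variance normal components, and the function name `mixture_loglik` are assumptions made for illustration and are not taken from the paper.

```python
import itertools
import math


def mixture_loglik(data, weights, means, sd=1.0):
    """Log-likelihood of a finite normal mixture with a common, known sd."""
    total = 0.0
    for x in data:
        dens = sum(
            w * math.exp(-0.5 * ((x - m) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))
            for w, m in zip(weights, means)
        )
        total += math.log(dens)
    return total


# Hypothetical data and parameter values, chosen only for illustration.
data = [-2.1, -1.7, -0.3, 0.2, 1.9, 2.4, 3.0]
weights = [0.2, 0.3, 0.5]
means = [-2.0, 0.0, 2.5]

base = mixture_loglik(data, weights, means)

# Apply every permutation sigma of the component labels to (weights, means).
perm_values = [
    mixture_loglik(data, [weights[i] for i in p], [means[i] for i in p])
    for p in itertools.permutations(range(len(weights)))
]
```

Each of the \(3! = 6\) relabeled parameter vectors yields the same log-likelihood up to floating-point rounding, so any mode has \(K!\) copies at the same elevation.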

Proof of Theorem 2

Suppose the two modal regions, \(C_{c}(\hat{{\theta }})\) and \(C_{c}(\hat{{\theta }}^{\sigma })\), are connected to each other at the elevation \(c\). Then a maximin path connecting them has \(\psi _\mathrm{mm} \ge c\). Any path connecting the modes necessarily intersects the hyperplane of Eq. (6) in one or more points. Let \(c^{\star }\) be the minimal value of the likelihood on the intersection set. Then \(c^{\star } \ge \psi _\mathrm{mm}\) by the definition of \(\psi _\mathrm{mm}\). However, we also have \(c^{\star } \le \psi _\mathrm{hyp}\) by the definition of the latter. We therefore have \(\psi _\mathrm{mm} \le \psi _\mathrm{hyp}\). \(\square \)
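The inequality \(\psi _\mathrm{mm} \le \psi _\mathrm{hyp}\) can be checked on a discretized likelihood surface. The following is a minimal sketch under several assumptions that are not in the paper: a two-component normal mixture with fixed equal weights and known unit variance, hypothetical data, a square grid over \((\xi _{1}, \xi _{2})\), paths restricted to 4-neighbour grid moves, and the separating hyperplane taken to be the diagonal \(\xi _{1}=\xi _{2}\). The maximin elevation is computed with a widest-path variant of Dijkstra's algorithm, which is a stand-in and not necessarily the authors' procedure.

```python
import heapq
import math


def loglik(data, xi1, xi2, sd=1.0):
    """Log-likelihood of an equal-weight two-component normal mixture."""
    total = 0.0
    for x in data:
        d1 = math.exp(-0.5 * ((x - xi1) / sd) ** 2)
        d2 = math.exp(-0.5 * ((x - xi2) / sd) ** 2)
        total += math.log(0.5 * (d1 + d2) / (sd * math.sqrt(2 * math.pi)))
    return total


def maximin_elevation(surface, start, goal):
    """Largest c such that start and goal are joined by a 4-neighbour grid
    path on which every value is >= c (widest-path Dijkstra)."""
    n = len(surface)
    best = {start: surface[start[0]][start[1]]}
    heap = [(-best[start], start)]
    while heap:
        neg, (i, j) = heapq.heappop(heap)
        if (i, j) == goal:
            return -neg
        if -neg < best[(i, j)]:
            continue  # stale heap entry
        for a, b in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
            if 0 <= a < n and 0 <= b < n:
                cand = min(-neg, surface[a][b])
                if cand > best.get((a, b), -math.inf):
                    best[(a, b)] = cand
                    heapq.heappush(heap, (-cand, (a, b)))
    return -math.inf


# Hypothetical data with two clusters.
data = [-2.2, -1.8, -1.5, -1.1, -0.9, 1.0, 1.3, 1.6, 2.1, 2.4]
grid = [-3.0 + 0.15 * k for k in range(41)]
n = len(grid)
surface = [[loglik(data, a, b) for b in grid] for a in grid]

# Grid mode with xi1 < xi2, and its label-permuted copy.
mode = max(
    ((i, j) for i in range(n) for j in range(i + 1, n)),
    key=lambda ij: surface[ij[0]][ij[1]],
)
permuted = (mode[1], mode[0])

psi_mle = surface[mode[0]][mode[1]]
psi_hyp = max(surface[k][k] for k in range(n))  # best point on xi1 == xi2
psi_mm = maximin_elevation(surface, mode, permuted)
```

Because every 4-neighbour path from a cell with \(\xi _{1}<\xi _{2}\) to one with \(\xi _{1}>\xi _{2}\) must pass through a diagonal cell, `psi_mm` cannot exceed `psi_hyp` on the grid, mirroring the argument of the proof.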

Proof of Theorem 3

Suppose there exists a continuous path of \(\gamma (t)\) parameter values that connects the profile modes \(\gamma ({\theta }_{A})\) and \(\gamma ({\theta }_{B})\) such that the profile likelihood along the path stays above the elevation \(c\). Then the path (\(\gamma (t)\), \(\hat{\phi }(\gamma (t))\)) is a continuous path in the full parameter space that connects the two modes and whose likelihood stays above \(c\). \(\square \)
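Written out, the step above uses the fact that the profile likelihood at \(\gamma \) equals the full likelihood evaluated at the profiled value \(\hat{\phi }(\gamma )\). In the display below, \(L\) for the full likelihood, \(L_{P}\) for the profile likelihood, and the parametrization \(t \in [0,1]\) are notation assumed here for illustration; the lifted path is continuous provided \(\hat{\phi }(\gamma (t))\) varies continuously in \(t\), which the proof takes for granted.

\[
L_{P}(\gamma (t)) \;=\; \max _{\phi } L(\gamma (t), \phi ) \;=\; L\bigl (\gamma (t), \hat{\phi }(\gamma (t))\bigr ) \;>\; c \quad \text{for all } t \in [0,1].
\]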

Proof of Theorem 4

Suppose \(K=2\). Then the default hyperplane generated by Eq. (8) consists of all mixtures with \(\xi _{1}=\xi _{2}\), and so \(\psi _\mathrm{hyp}=\psi _\mathrm{deg}=\psi _\mathrm{mm}\). Thus, if \(c > \psi _\mathrm{deg}\), \(C_{c}(\hat{{\theta }})\) and \(C_{c}(\hat{{\theta }}^{\sigma })\) cannot have a connecting path. If this is the case, the labels in a selected modal region must satisfy \(\xi _{1}> \xi _{2}\) or \(\xi _{1} < \xi _{2}\), as the modal set cannot contain points satisfying \(\xi _{1}=\xi _{2}\). That is, the points in each modal region must have order restriction labels. The argument for the two-component case extends to \(K > 2\) components: given a set of ordered \(\xi \)’s, there is no way to move continuously to a permuted set without one pair becoming equal along the way. \(\square \)
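The order-restriction conclusion for \(K=2\) can be illustrated by flood-filling a modal region on a grid at an elevation above the best value attained on \(\xi _{1}=\xi _{2}\). This is only a sketch under assumptions not found in the paper: fixed equal mixing weights, known unit variance, hypothetical data, a finite grid with 4-neighbour connectivity, and the diagonal maximum `psi_hyp` standing in for \(\psi _\mathrm{deg}\).

```python
import math


def loglik(data, xi1, xi2, sd=1.0):
    """Log-likelihood of an equal-weight two-component normal mixture."""
    total = 0.0
    for x in data:
        d1 = math.exp(-0.5 * ((x - xi1) / sd) ** 2)
        d2 = math.exp(-0.5 * ((x - xi2) / sd) ** 2)
        total += math.log(0.5 * (d1 + d2) / (sd * math.sqrt(2 * math.pi)))
    return total


# Hypothetical data with two clusters.
data = [-2.2, -1.8, -1.5, -1.1, -0.9, 1.0, 1.3, 1.6, 2.1, 2.4]
grid = [-3.0 + 0.15 * k for k in range(41)]
n = len(grid)
surface = [[loglik(data, a, b) for b in grid] for a in grid]

psi_hyp = max(surface[k][k] for k in range(n))  # best value on xi1 == xi2
mode = max(
    ((i, j) for i in range(n) for j in range(i + 1, n)),
    key=lambda ij: surface[ij[0]][ij[1]],
)
psi_mle = surface[mode[0]][mode[1]]

# An elevation strictly between the diagonal maximum and the mode.
c = 0.5 * (psi_hyp + psi_mle)

# Flood fill: grid cells reachable from the mode while staying at or above c.
region = {mode}
stack = [mode]
while stack:
    i, j = stack.pop()
    for a, b in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
        if 0 <= a < n and 0 <= b < n and (a, b) not in region:
            if surface[a][b] >= c:
                region.add((a, b))
                stack.append((a, b))

# Every point of the modal region carries the same ordering of the labels.
ordered = all(grid[i] < grid[j] for i, j in region)
```

Since no diagonal cell reaches the elevation `c`, the filled region never touches \(\xi _{1}=\xi _{2}\), and all of its points satisfy \(\xi _{1}<\xi _{2}\).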

About this article

Cite this article

Kim, D., Lindsay, B.G. Empirical identifiability in finite mixture models. Ann Inst Stat Math 67, 745–772 (2015). https://doi.org/10.1007/s10463-014-0474-9

  • DOI: https://doi.org/10.1007/s10463-014-0474-9