# An approximation to the information matrix of exponential family finite mixtures

## Abstract

A simple closed form of the Fisher information matrix (FIM) usually cannot be obtained under a finite mixture. Several authors have considered a block-diagonal FIM approximation for binomial and multinomial finite mixtures, used in scoring and in demonstrating relative efficiency of proposed estimators. Raim et al. (Stat Methodol 18:115–130, 2014a) noted that this approximation coincides with the complete data FIM of the observed data and latent mixing process jointly. It can, therefore, be formulated for a wide variety of missing data problems. Multinomial mixtures feature a number of trials, which, when taken to infinity, result in the FIM and approximation becoming arbitrarily close. This work considers a clustered sampling scheme which allows the convergence result to be extended significantly to the class of exponential family finite mixtures. A series of examples demonstrate the convergence result and suggest that it can be further generalized.
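The convergence described above can be checked numerically in the simplest case, a two-component binomial mixture. The following is a minimal sketch and is not taken from the paper: the function names, the parameterization `(p1, p2, w)` with `w` the mixing proportion, and the parameter values are all illustrative assumptions. It computes the exact FIM by summing score outer products over the support, builds the block-diagonal complete data FIM, and reports the Frobenius norm of their difference for several numbers of trials `m`.

```python
import math


def binom_pmf(x, m, p):
    """Binomial(m, p) probability of x successes, evaluated on the log scale."""
    log_pmf = (
        math.lgamma(m + 1) - math.lgamma(x + 1) - math.lgamma(m - x + 1)
        + x * math.log(p) + (m - x) * math.log(1.0 - p)
    )
    return math.exp(log_pmf)


def exact_fim(m, p1, p2, w):
    """Exact FIM for theta = (p1, p2, w), as a sum of score outer products."""
    fim = [[0.0] * 3 for _ in range(3)]
    for x in range(m + 1):
        b1 = binom_pmf(x, m, p1)
        b2 = binom_pmf(x, m, p2)
        f = w * b1 + (1.0 - w) * b2  # mixture probability of x
        # Partial derivatives of log f(x) with respect to p1, p2, and w.
        score = [
            w * b1 * (x / p1 - (m - x) / (1.0 - p1)) / f,
            (1.0 - w) * b2 * (x / p2 - (m - x) / (1.0 - p2)) / f,
            (b1 - b2) / f,
        ]
        for i in range(3):
            for j in range(3):
                fim[i][j] += score[i] * score[j] * f
    return fim


def complete_data_fim(m, p1, p2, w):
    """Block-diagonal FIM of the observed count and latent component label jointly."""
    return [
        [w * m / (p1 * (1.0 - p1)), 0.0, 0.0],
        [0.0, (1.0 - w) * m / (p2 * (1.0 - p2)), 0.0],
        [0.0, 0.0, 1.0 / (w * (1.0 - w))],
    ]


def frobenius_gap(m, p1, p2, w):
    """Frobenius norm of the difference between the exact FIM and the approximation."""
    exact = exact_fim(m, p1, p2, w)
    approx = complete_data_fim(m, p1, p2, w)
    return math.sqrt(
        sum((exact[i][j] - approx[i][j]) ** 2 for i in range(3) for j in range(3))
    )


if __name__ == "__main__":
    # Illustrative values only: well-separated components, equal mixing proportions.
    for m in (20, 50, 100):
        print(m, frobenius_gap(m, p1=0.2, p2=0.7, w=0.5))
```

For these well-separated components the printed gap shrinks as `m` grows from 20 to 100. The sketch covers only a single binomial mixture, not the general exponential family setting treated in the paper.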

### Keywords

Fisher information, Complete data, Clustered sampling, Misclassification rate

## Notes

### Acknowledgments

We thank Professors Thomas Mathew, Yi Huang, and Yaakov Malinovsky at the University of Maryland, Baltimore County for helpful discussion in preparing the manuscript. Computational resources were provided by the High Performance Computing Facility (http://www.umbc.edu/hpcf) at the university. The first author thanks the facility for financial support as an RA. We additionally thank the editor and two anonymous referees for comments which helped us to significantly improve the paper.

### References

- Anderson, T. W. (2003). *An introduction to multivariate statistical analysis* (3rd ed.). Hoboken: Wiley.
- Blischke, W. R. (1962). Moment estimators for the parameters of a mixture of two binomial distributions. *The Annals of Mathematical Statistics*, *33*(2), 444–454.
- Blischke, W. R. (1964). Estimating the parameters of mixtures of binomial distributions. *Journal of the American Statistical Association*, *59*(306), 510–528.
- Boldea, O., Magnus, J. R. (2009). Maximum likelihood estimation of the multivariate normal mixture model. *Journal of the American Statistical Association*, *104*(488), 1539–1549.
- Boyd, S., Vandenberghe, L. (2004). *Convex optimization*. Cambridge: Cambridge University Press.
- Gelman, A., Carlin, J. B., Stern, H. S., Rubin, D. B. (2003). *Bayesian data analysis* (2nd ed.). Boca Raton: Chapman and Hall/CRC.
- Lehmann, E. L., Casella, G. (1998). *Theory of point estimation* (2nd ed.). New York: Springer.
- Lehmann, E. L., Romano, J. P. (2005). *Testing statistical hypotheses* (3rd ed.). New York: Springer.
- Louis, T. A. (1982). Finding the observed information matrix when using the EM algorithm. *Journal of the Royal Statistical Society. Series B*, *44*, 226–233.
- McLachlan, G., Peel, D. (2000). *Finite mixture models*. New York: Wiley.
- Meyer, C. D. (2001). *Matrix analysis and applied linear algebra*. Philadelphia: Society for Industrial and Applied Mathematics.
- Morel, J. G., Nagaraj, N. K. (1991). *A finite mixture distribution for modeling multinomial extra variation*. Technical Report Research report 91–03, Department of Mathematics and Statistics, University of Maryland, Baltimore County.
- Morel, J. G., Nagaraj, N. K. (1993). A finite mixture distribution for modelling multinomial extra variation. *Biometrika*, *80*(2), 363–371.
- Neerchal, N. K., Morel, J. G. (1998). Large cluster results for two parametric multinomial extra variation models. *Journal of the American Statistical Association*, *93*(443), 1078–1087.
- Neerchal, N. K., Morel, J. G. (2005). An improved method for the computation of maximum likelihood estimates for multinomial overdispersion models. *Computational Statistics & Data Analysis*, *49*(1), 33–43.
- Okamoto, M. (1959). Some inequalities relating to the partial sum of binomial probabilities. *Annals of the Institute of Statistical Mathematics*, *10*, 29–35.
- Orchard, T., Woodbury, M. A. (1972). A missing information principle: Theory and applications. In L. M. Le Cam, J. Neyman, E. L. Scott (Eds.), *Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability, Theory of Statistics* (Vol. 1, pp. 697–715). Berkeley: University of California Press.
- Raim, A. M., Liu, M., Neerchal, N. K., Morel, J. G. (2014a). On the method of approximate Fisher scoring for finite mixtures of multinomials. *Statistical Methodology*, *18*, 115–130.
- Raim, A. M., Neerchal, N. K., Morel, J. G. (2014b). Large cluster approximation to the finite mixture information matrix with an application to meta-analysis. In *JSM Proceedings, Statistical Computing Section* (pp. 4025–4037). Alexandria: American Statistical Association.
- Rao, J. N. K. (2003). *Small area estimation*. Hoboken, NJ: Wiley.
- Shao, J. (2008). *Mathematical statistics* (2nd ed.). New York: Springer.