An approximation to the information matrix of exponential family finite mixtures

  • Andrew M. Raim
  • Nagaraj K. Neerchal
  • Jorge G. Morel


A simple closed form of the Fisher information matrix (FIM) usually cannot be obtained under a finite mixture. Several authors have considered a block-diagonal FIM approximation for binomial and multinomial finite mixtures, used in scoring and in demonstrating relative efficiency of proposed estimators. Raim et al. (Stat Methodol 18:115–130, 2014a) noted that this approximation coincides with the complete data FIM of the observed data and latent mixing process jointly. It can, therefore, be formulated for a wide variety of missing data problems. Multinomial mixtures feature a number of trials, which, when taken to infinity, result in the FIM and approximation becoming arbitrarily close. This work considers a clustered sampling scheme which allows the convergence result to be extended significantly to the class of exponential family finite mixtures. A series of examples demonstrate the convergence result and suggest that it can be further generalized.
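The convergence described above can be checked numerically in the simplest case. The sketch below is illustrative only and is not the authors' code: it assumes a two-component binomial mixture with hypothetical parameter values (p1 = 0.2, p2 = 0.7, mixing proportion 0.4), and the function names are made up for this example. It computes the exact FIM by summing over the support and compares it with the complete-data FIM, which for this model is diagonal because each block is a scalar.

```python
import math


def binom_pmf(x, m, p):
    """Binomial(m, p) probability mass at x."""
    return math.comb(m, x) * p**x * (1.0 - p) ** (m - x)


def exact_fim(m, p1, p2, pi):
    """FIM of one draw from pi*Bin(m, p1) + (1 - pi)*Bin(m, p2),
    for theta = (p1, p2, pi), by direct summation over x = 0..m."""
    fim = [[0.0] * 3 for _ in range(3)]
    for x in range(m + 1):
        b1, b2 = binom_pmf(x, m, p1), binom_pmf(x, m, p2)
        f = pi * b1 + (1.0 - pi) * b2
        # Gradient of the mixture density f(x) with respect to theta.
        grad = [
            pi * b1 * (x - m * p1) / (p1 * (1.0 - p1)),
            (1.0 - pi) * b2 * (x - m * p2) / (p2 * (1.0 - p2)),
            b1 - b2,
        ]
        # E[score score^T] = sum over x of grad grad^T / f.
        for i in range(3):
            for j in range(3):
                fim[i][j] += grad[i] * grad[j] / f
    return fim


def complete_data_fim(m, p1, p2, pi):
    """FIM of (X, Z) jointly, where Z is the latent component label."""
    diag = [
        pi * m / (p1 * (1.0 - p1)),
        (1.0 - pi) * m / (p2 * (1.0 - p2)),
        1.0 / (pi * (1.0 - pi)),
    ]
    return [[diag[i] if i == j else 0.0 for j in range(3)] for i in range(3)]


def max_abs_gap(m, p1, p2, pi):
    """Largest absolute entrywise difference between the two matrices."""
    a = exact_fim(m, p1, p2, pi)
    b = complete_data_fim(m, p1, p2, pi)
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))


if __name__ == "__main__":
    # Hypothetical parameter values chosen only for illustration.
    for m in (5, 20, 50, 200):
        print(m, max_abs_gap(m, 0.2, 0.7, 0.4))
```

With these assumed values the gap is sizeable for a small number of trials and becomes negligible for a large number of trials, since the two components are then almost never confused with one another.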


Keywords: Fisher information, Complete data, Clustered sampling, Misclassification rate



We thank Professors Thomas Mathew, Yi Huang, and Yaakov Malinovsky at the University of Maryland, Baltimore County for helpful discussion in preparing the manuscript. Computational resources were provided by the High Performance Computing Facility at the university. The first author thanks the facility for financial support as an RA. We additionally thank the editor and two anonymous referees for comments which helped us to significantly improve the paper.


  1. Anderson, T. W. (2003). An introduction to multivariate statistical analysis (3rd ed.). Hoboken: Wiley.
  2. Blischke, W. R. (1962). Moment estimators for the parameters of a mixture of two binomial distributions. The Annals of Mathematical Statistics, 33(2), 444–454.
  3. Blischke, W. R. (1964). Estimating the parameters of mixtures of binomial distributions. Journal of the American Statistical Association, 59(306), 510–528.
  4. Boldea, O., Magnus, J. R. (2009). Maximum likelihood estimation of the multivariate normal mixture model. Journal of the American Statistical Association, 104(488), 1539–1549.
  5. Boyd, S., Vandenberghe, L. (2004). Convex optimization. Cambridge: Cambridge University Press.
  6. Gelman, A., Carlin, J. B., Stern, H. S., Rubin, D. B. (2003). Bayesian data analysis (2nd ed.). Boca Raton: Chapman and Hall/CRC.
  7. Lehmann, E. L., Casella, G. (1998). Theory of point estimation (2nd ed.). New York: Springer.
  8. Lehmann, E. L., Romano, J. P. (2005). Testing statistical hypotheses (3rd ed.). New York: Springer.
  9. Louis, T. A. (1982). Finding the observed information matrix when using the EM algorithm. Journal of the Royal Statistical Society. Series B, 44, 226–233.
  10. McLachlan, G., Peel, D. (2000). Finite mixture models. New York: Wiley.
  11. Meyer, C. D. (2001). Matrix analysis and applied linear algebra. Philadelphia: Society for Industrial and Applied Mathematics.
  12. Morel, J. G., Nagaraj, N. K. (1991). A finite mixture distribution for modeling multinomial extra variation. Technical Report Research report 91–03, Department of Mathematics and Statistics, University of Maryland, Baltimore County.
  13. Morel, J. G., Nagaraj, N. K. (1993). A finite mixture distribution for modelling multinomial extra variation. Biometrika, 80(2), 363–371.
  14. Neerchal, N. K., Morel, J. G. (1998). Large cluster results for two parametric multinomial extra variation models. Journal of the American Statistical Association, 93(443), 1078–1087.
  15. Neerchal, N. K., Morel, J. G. (2005). An improved method for the computation of maximum likelihood estimates for multinomial overdispersion models. Computational Statistics & Data Analysis, 49(1), 33–43.
  16. Okamoto, M. (1959). Some inequalities relating to the partial sum of binomial probabilities. Annals of the Institute of Statistical Mathematics, 10, 29–35.
  17. Orchard, T., Woodbury, M. A. (1972). A missing information principle: Theory and applications. In L. M. Le Cam, J. Neyman, E. L. Scott (Eds.), Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability, Theory of Statistics (Vol. 1, pp. 697–715). Berkeley: University of California Press.
  18. Raim, A. M., Liu, M., Neerchal, N. K., Morel, J. G. (2014a). On the method of approximate Fisher scoring for finite mixtures of multinomials. Statistical Methodology, 18, 115–130.
  19. Raim, A. M., Neerchal, N. K., Morel, J. G. (2014b). Large cluster approximation to the finite mixture information matrix with an application to meta-analysis. In JSM Proceedings, Statistical Computing Section (pp. 4025–4037). Alexandria: American Statistical Association.
  20. Rao, J. N. K. (2003). Small area estimation. Hoboken, NJ: Wiley.
  21. Shao, J. (2008). Mathematical statistics (2nd ed.). New York: Springer.

Copyright information

© The Institute of Statistical Mathematics, Tokyo 2015

Authors and Affiliations

  • Andrew M. Raim (1, 2)
  • Nagaraj K. Neerchal (1)
  • Jorge G. Morel (1)

  1. Department of Mathematics and Statistics, University of Maryland, Baltimore County, Baltimore, USA
  2. Center for Statistical Research and Methodology, U.S. Census Bureau, Washington DC, USA
