Abstract
Poisson mixtures are commonly used to describe overdispersed data. Finite Poisson mixtures arise in many practical situations, where it is often of interest to determine the number of components in the mixture. Identifying this number remains a difficult problem. The likelihood ratio test (LRT) is a general statistical procedure that can be used for this purpose. Unfortunately, a number of specific problems arise in the mixture setting and the classical theory fails to hold. In this paper a new procedure is proposed, based on testing whether a further component can be added to a finite Poisson mixture; applied repeatedly, this leads to the number of components in the mixture. It is a sequential testing procedure based on the well-known LRT that uses a resampling technique to construct the distribution of the test statistic. Applying the procedure to real data reveals some interesting features of the distribution of the test statistic.
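The abstract does not spell out the resampling scheme, so the following is only a minimal sketch of one way a sequential, resampling-based LRT could be organised, not the authors' implementation. It assumes a parametric bootstrap from the fitted k-component model, EM fitting from a single deterministic start, and a fixed significance level; all function names (`fit_poisson_mixture`, `bootstrap_pvalue`, `select_components`) and default values are hypothetical.

```python
import math
import random
from collections import Counter


def mixture_loglik(xs, fs, ws, lams):
    """Log-likelihood of a Poisson mixture on grouped data (values xs, frequencies fs)."""
    total = 0.0
    for x, f in zip(xs, fs):
        dens = sum(w * math.exp(x * math.log(lam) - lam - math.lgamma(x + 1))
                   for w, lam in zip(ws, lams))
        total += f * math.log(max(dens, 1e-300))
    return total


def fit_poisson_mixture(data, k, n_iter=200, tol=1e-8):
    """Fit a k-component Poisson mixture by EM; returns (weights, means, loglik).

    Assumption: one deterministic start. EM can stop at a local maximum,
    so a careful implementation would try several starting points.
    """
    counts = Counter(data)
    xs, fs = list(counts), list(counts.values())
    n = sum(fs)
    mean = max(sum(x * f for x, f in zip(xs, fs)) / n, 1e-3)
    ws = [1.0 / k] * k
    lams = [mean * (0.5 + (j + 0.5) / k) for j in range(k)]  # spread around the mean
    prev = -math.inf
    for _ in range(n_iter):
        resp, resp_x = [0.0] * k, [0.0] * k
        for x, f in zip(xs, fs):
            # E-step: posterior membership probabilities for the value x
            dens = [w * math.exp(x * math.log(lam) - lam - math.lgamma(x + 1))
                    for w, lam in zip(ws, lams)]
            tot = sum(dens)
            if tot <= 0.0:
                continue
            for j in range(k):
                r = f * dens[j] / tot
                resp[j] += r
                resp_x[j] += r * x
        # M-step: update mixing weights and component means
        ws = [r / n for r in resp]
        lams = [max(rx / r, 1e-8) if r > 1e-12 else lam
                for rx, r, lam in zip(resp_x, resp, lams)]
        ll = mixture_loglik(xs, fs, ws, lams)
        if ll - prev < tol:
            break
        prev = ll
    return ws, lams, ll


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate (Knuth's method; adequate for moderate lam)."""
    limit, k, p = math.exp(-lam), 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def sample_mixture(n, ws, lams, rng):
    """Draw n observations from the Poisson mixture with weights ws and means lams."""
    comps = rng.choices(range(len(ws)), weights=ws, k=n)
    return [sample_poisson(lams[j], rng) for j in comps]


def lrt_stat(data, k):
    """2 * (loglik with k+1 components - loglik with k components), clipped at 0."""
    ll0 = fit_poisson_mixture(data, k)[2]
    ll1 = fit_poisson_mixture(data, k + 1)[2]
    return max(2.0 * (ll1 - ll0), 0.0)


def bootstrap_pvalue(data, k, n_boot=99, rng=None):
    """Parametric-bootstrap p-value for H0: k components against H1: k+1."""
    rng = rng or random.Random(0)
    observed = lrt_stat(data, k)
    ws, lams, _ = fit_poisson_mixture(data, k)
    exceed = 0
    for _ in range(n_boot):
        sim = sample_mixture(len(data), ws, lams, rng)  # resample under H0
        exceed += lrt_stat(sim, k) >= observed
    return (1 + exceed) / (n_boot + 1)


def select_components(data, max_k=4, alpha=0.05, n_boot=99, seed=0):
    """Add components one at a time until the bootstrap LRT stops rejecting."""
    rng = random.Random(seed)
    for k in range(1, max_k):
        if bootstrap_pvalue(data, k, n_boot, rng) > alpha:
            return k
    return max_k
```

Under these assumptions, a call such as `select_components(counts, max_k=4, alpha=0.05, n_boot=99)` on a list of counts would return the smallest k at which adding a further component is not supported at level `alpha`. The cap `max_k`, the level and the number of bootstrap replicates are illustrative choices only.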
Cite this article
Karlis, D., Xekalaki, E. On Testing for the Number of Components in a Mixed Poisson Model. Annals of the Institute of Statistical Mathematics 51, 149–162 (1999). https://doi.org/10.1023/A:1003839420071
DOI: https://doi.org/10.1023/A:1003839420071