Abstract
Nonparametric modelling of count data is motivated in part by two drawbacks of parametric count models: they run the risk of model misspecification, and they are rather restrictive in terms of local approximation. Accordingly, we present a framework for using nonparametric mixtures to model count data flexibly. We consider the least squares function as the fitting criterion in nonparametric mixture modelling and provide two algorithms for least squares fitting of nonparametric mixtures. Two illustrations of the framework are given, each with a particular nonparametric mixture. The first uses the nonparametric Poisson mixture for general modelling purposes. The second concerns count data from a decreasing distribution, for which the Poisson mixture is less appropriate because its fitted distribution might not be decreasing. We define a mixture distribution, called the discrete decreasing beta mixture distribution, whose fitted probabilities always conform with the assumption of decreasing probabilities. Through numerical studies, we demonstrate the performance of nonparametric mixtures as modelling tools.
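To make the idea of least squares fitting of a Poisson mixture concrete, the following is a minimal sketch and not either of the two algorithms provided in the paper, which the abstract does not describe. It assumes one plausible reading of the criterion: the sum of squared differences between the empirical relative frequencies and the mixture probabilities. The mixing distribution is restricted to a fixed grid of candidate Poisson means, and the weights are updated by a simple vertex-direction (Frank-Wolfe) step with exact line search. The names `fit_poisson_mixture_ls`, `grid`, `pad` and `n_iter`, and the sample data, are all hypothetical.

```python
import math


def poisson_pmf(x, lam):
    """Poisson probability P(X = x) for mean lam >= 0."""
    if lam == 0.0:
        return 1.0 if x == 0 else 0.0
    return math.exp(x * math.log(lam) - lam - math.lgamma(x + 1))


def fit_poisson_mixture_ls(counts, grid, pad=10, n_iter=500):
    """Least squares fit of mixture weights on a fixed grid of Poisson means.

    Assumption: the criterion is sum_x (freq[x] - fitted[x])**2 over
    x = 0, ..., max(counts) + pad, i.e. the support is truncated.
    Returns (weights, fitted, freq).
    """
    n = len(counts)
    size = max(counts) + pad + 1
    freq = [0.0] * size
    for c in counts:
        freq[c] += 1.0 / n

    # pmf[j][x] = P(X = x) under the j-th candidate Poisson mean.
    pmf = [[poisson_pmf(x, lam) for x in range(size)] for lam in grid]

    # Start from uniform weights on the grid.
    weights = [1.0 / len(grid)] * len(grid)
    fitted = [sum(w * p[x] for w, p in zip(weights, pmf)) for x in range(size)]

    for _ in range(n_iter):
        resid = [f - m for f, m in zip(freq, fitted)]
        # Component whose pmf is best aligned with the current residual.
        scores = [sum(r * px for r, px in zip(resid, p)) for p in pmf]
        j = max(range(len(grid)), key=scores.__getitem__)
        direction = [pj - m for pj, m in zip(pmf[j], fitted)]
        denom = sum(d * d for d in direction)
        if denom == 0.0:
            break
        # Exact line search for a quadratic objective, clipped to [0, 1]
        # so that the weights stay non-negative and sum to one.
        alpha = sum(r * d for r, d in zip(resid, direction)) / denom
        alpha = min(1.0, max(0.0, alpha))
        if alpha == 0.0:
            break
        weights = [(1.0 - alpha) * w for w in weights]
        weights[j] += alpha
        fitted = [m + alpha * d for m, d in zip(fitted, direction)]

    return weights, fitted, freq


# Hypothetical sample: counts with two apparent clusters.
counts = [0, 0, 1, 1, 1, 2, 2, 3, 5, 6, 7, 7, 8, 9]
grid = [0.5 * k for k in range(1, 25)]
weights, fitted, freq = fit_poisson_mixture_ls(counts, grid)
```

Each step moves the fitted probabilities toward a single Poisson component, so the weights remain a valid probability vector throughout and the squared error never increases. A fixed grid is a simplification chosen for brevity; a fully nonparametric fit would also let the support points vary.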
Acknowledgments
The author is grateful to the associate editor and two reviewers for their insightful and valuable comments. The author also acknowledges and thanks the Universiti Malaysia Terengganu for providing the Research Incentive Grant (No. 68007/2013/121).
Cite this article
Chee, CS. Modelling of count data using nonparametric mixtures. AStA Adv Stat Anal 100, 239–257 (2016). https://doi.org/10.1007/s10182-015-0255-7
DOI: https://doi.org/10.1007/s10182-015-0255-7