A generalized mixed model for skewed distributions applied to small area estimation

Abstract

Models with random (or mixed) effects are commonly used for panel data, microarrays, small area estimation and many other applications. When the variable of interest is continuous, normality is commonly assumed, either on the original scale or after some transformation. However, the normal distribution is not always well suited to modeling certain variables, such as those found in econometrics, which often remain skewed even on the log scale. Finding a transformation that achieves normality is not straightforward, since the true distribution is unknown in practice. As an alternative, we propose a much more flexible distribution, the generalized beta of the second kind (GB2). The GB2 distribution has four parameters, two of which control the shapes of the two tails, making it flexible enough to accommodate different forms of skewness. Based on a multivariate extension of the GB2 distribution, we propose a new model with random effects, designed for skewed response variables, that extends the usual log-normal nested error model. Under this new model, we find empirical best predictors of linear and nonlinear characteristics, including poverty indicators, in small areas. Simulation studies illustrate the good properties, in terms of bias and efficiency, of the estimators based on the proposed multivariate GB2 model. Results from an application to poverty mapping in Spanish provinces also indicate efficiency gains over the conventional log-normal nested error model used for poverty mapping.
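To make the four-parameter GB2 family concrete, the sketch below evaluates its density and draws random values in Python. The abstract does not state a parametrization, so this assumes the common one with scale b > 0 and shape parameters a, p, q > 0, where f(x) = a x^(ap-1) / (b^(ap) B(p, q) (1 + (x/b)^a)^(p+q)) for x > 0. The function names `gb2_logpdf`, `gb2_pdf` and `gb2_sample` are hypothetical and are not taken from the paper or from the GB2 R package. The sketch covers only the univariate distribution, not the multivariate mixed model proposed here.

```python
import math
import random


def gb2_logpdf(x, a, b, p, q):
    """Log-density of GB2(a, b, p, q) at x (assumed parametrization, see lead-in)."""
    if x <= 0:
        return -math.inf
    # log B(p, q) via log-gamma functions
    log_beta = math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)
    # t = log((x/b)^a); log(1 + exp(t)) is computed in a numerically stable way
    t = a * (math.log(x) - math.log(b))
    log1p_exp_t = max(t, 0.0) + math.log1p(math.exp(-abs(t)))
    return (math.log(a) + (a * p - 1.0) * math.log(x)
            - a * p * math.log(b) - log_beta - (p + q) * log1p_exp_t)


def gb2_pdf(x, a, b, p, q):
    """Density of GB2(a, b, p, q) at x."""
    return math.exp(gb2_logpdf(x, a, b, p, q))


def gb2_sample(a, b, p, q, rng=random):
    """Draw one GB2 value: if Z ~ Beta(p, q), then b * (Z / (1 - Z))**(1/a) is GB2."""
    z = rng.betavariate(p, q)
    return b * (z / (1.0 - z)) ** (1.0 / a)


if __name__ == "__main__":
    rng = random.Random(0)
    draws = [gb2_sample(3.0, 2.0, 0.8, 1.5, rng) for _ in range(5)]
    print(draws)
    print(gb2_pdf(1.0, 1.0, 1.0, 1.0, 1.0))  # 1 / (1 + 1)^2 = 0.25
```

In this parametrization the left tail behaves like x^(ap-1) and the right tail like x^(-aq-1), so p and q each govern one tail, which is the source of the flexibility described above.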

Author information

Corresponding author

Correspondence to J. Miguel Marín.

Additional information

This work has been supported by the Grants MTM2015-72907-EXP and MTM2015-69638-R (MINECO/FEDER, UE).

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 191 KB)

About this article

Cite this article

Graf, M., Marín, J.M. & Molina, I. A generalized mixed model for skewed distributions applied to small area estimation. TEST 28, 565–597 (2019). https://doi.org/10.1007/s11749-018-0594-2

Keywords

  • Bootstrap
  • Empirical best
  • Mixed models
  • Monte Carlo simulation
  • Random effects

Mathematics Subject Classification

  • 62D05
  • 62E99
  • 62G09