A generalized mixed model for skewed distributions applied to small area estimation
Abstract
Models with random (or mixed) effects are commonly used for panel data, microarrays, small area estimation and many other applications. When the variable of interest is continuous, normality is commonly assumed, either on the original scale or after some transformation. However, the normal distribution is not always well suited to modeling certain variables, such as those found in econometrics, which often show skewness even on the log scale. Finding the correct transformation to achieve normality is not straightforward, since the true distribution is not known in practice. As an alternative, we propose a much more flexible distribution, the generalized beta of the second kind (GB2). The GB2 distribution has four parameters, two of which control the shape of the tails (one for each tail), which makes it flexible enough to accommodate different forms of skewness. Based on a multivariate extension of the GB2 distribution, we propose a new model with random effects, designed for skewed response variables, that extends the usual log-normal nested error model. Under this new model, we find empirical best predictors of linear and nonlinear characteristics, including poverty indicators, in small areas. Simulation studies illustrate the good properties, in terms of bias and efficiency, of the estimators based on the proposed multivariate GB2 model. Results from an application to poverty mapping in Spanish provinces also indicate efficiency gains with respect to the conventional log-normal nested error model used for poverty mapping.
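As an illustration of the four-parameter GB2 family mentioned above, the following is a minimal sketch of the univariate GB2 density in its standard parameterization, f(x; a, b, p, q) = a x^(ap-1) / (b^(ap) B(p, q) (1 + (x/b)^a)^(p+q)) for x > 0, together with a simple sampler based on a transformed beta variate. It is not the multivariate model or the estimation procedure of the paper; the function names `gb2_pdf` and `gb2_sample` are hypothetical and the parameter values are chosen only for the example.

```python
import math
import random


def gb2_pdf(x, a, b, p, q):
    """Univariate GB2 density at x, assuming a, b, p, q > 0 (standard parameterization)."""
    if x <= 0:
        return 0.0
    # log of the beta function B(p, q)
    log_beta = math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)
    # log((x / b) ** a)
    log_ratio = a * (math.log(x) - math.log(b))
    # numerically stable log(1 + (x / b) ** a)
    log_one_plus = max(log_ratio, 0.0) + math.log1p(math.exp(-abs(log_ratio)))
    log_f = (math.log(a) - math.log(x) + p * log_ratio
             - log_beta - (p + q) * log_one_plus)
    return math.exp(log_f)


def gb2_sample(a, b, p, q, rng):
    """Draw one GB2 variate: if Z ~ Beta(p, q), then b * (Z / (1 - Z)) ** (1 / a) is GB2."""
    z = rng.betavariate(p, q)
    return b * (z / (1.0 - z)) ** (1.0 / a)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, for reproducibility only
    draws = [gb2_sample(2.0, 1.0, 1.5, 1.5, rng) for _ in range(5)]
    print(draws)
    print(gb2_pdf(1.0, 2.0, 1.0, 1.5, 1.5))
```

In this parameterization, b is a scale parameter, a governs the overall shape, and p and q act on the left and right tail respectively, which is the flexibility the abstract refers to.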
Keywords
Bootstrap, Empirical best, Mixed models, Monte Carlo simulation, Random effects
Mathematics Subject Classification
62D05, 62E99, 62G09