Abstract
Conventional linear regression models assume homoscedastic error terms, an assumption that is often violated in empirical applications. Books on regression methodology generally describe methods for evaluating the extent of such violations and, if necessary, for adjusting the estimated model parameters. Recent developments in statistics take a different approach: they examine whether the estimated heteroscedastic residuals (from a first-stage regression model of the conditional mean of an outcome variable as a function of a set of explanatory variables or covariates) are themselves systematically related to a set of explanatory variables in a second-stage regression. These extensions of the conventional models have been given various names but, most generally, are heteroscedastic regression models (HRMs). Instead of treating heteroscedasticity as a nuisance to be adjusted out of existence so as to reduce or eliminate its impact on regression parameter estimates, the basic idea of HRMs is to model the heteroscedasticity itself. This chapter systematically reviews the specification of HRMs in both linear and generalized linear model forms, describes methods for estimating such models, and reports empirical applications to data on changes over recent decades in the US income distribution and in self-reported health and health disparities. A concluding section points to similarities and complementarities between the goals of the counterfactual approach to causal inference and those of heteroscedastic regression models.
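The two-stage idea described above can be illustrated with a minimal sketch. It is not the chapter's estimator: the abstract does not specify one, so plain least squares is swapped in for both stages, and the second stage regresses the log of the squared first-stage residuals on the covariates. The function name `fit_two_stage_hrm`, the simulated data, and all parameter values are assumptions made for illustration only.

```python
import numpy as np

def fit_two_stage_hrm(X, Z, y):
    """Illustrative two-stage fit (an assumption, not the chapter's method).

    Stage 1: OLS of y on X gives mean coefficients beta.
    Stage 2: OLS of log(squared stage-1 residuals) on Z gives gamma.
    """
    X1 = np.column_stack([np.ones(len(y)), X])  # add intercept to mean model
    Z1 = np.column_stack([np.ones(len(y)), Z])  # add intercept to variance model
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    # Small constant guards against log(0) for an exactly zero residual.
    log_sq_resid = np.log(resid**2 + 1e-12)
    gamma, *_ = np.linalg.lstsq(Z1, log_sq_resid, rcond=None)
    return beta, gamma

# Simulated data in which the log error variance rises linearly with x.
rng = np.random.default_rng(0)
n = 5000
x = rng.uniform(0, 2, n)
sigma = np.exp(0.5 * (0.2 + 0.8 * x))  # log variance = 0.2 + 0.8 * x
y = 1.0 + 2.0 * x + sigma * rng.standard_normal(n)

beta, gamma = fit_two_stage_hrm(x, x, y)
# gamma[1] estimates the slope of the log variance in x (0.8 in this simulation).
# Under normal errors the stage-2 intercept is biased downward, because the
# expected log of a chi-square(1) variate is about -1.27.
print("mean coefficients:", beta)
print("log-variance coefficients:", gamma)
```

A nonzero slope in the second stage indicates that the residual variance itself varies systematically with the covariate, which is the quantity an HRM treats as substantively interesting rather than as a nuisance.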
Notes
1. Developments in statistics that are relatively unknown to most social scientists.
2. In the context of APC analysis, groups are defined by the age, time period, and cohort categories.
3. As a general rule for statistical modeling, if the interpretation of a class of effects can be extended beyond the data being analyzed, a random effects specification of the effects is preferred; if the effects are limited to the data being modeled, then a fixed effects specification may be more appropriate (Hilbe 2009: 503). Applied to age-period-cohort analysis, since the age range for humans is bounded, it follows that age effects are best conceived statistically as fixed effects. By comparison, the effects of time periods and birth cohorts in any finite dataset generally can be extended and thus are appropriately specified as random effects.
4.
5. Respondents in the repeated cross-section sample surveys are cross-classified by both the time periods of the surveys in which they responded and the birth cohorts to which they belong. Each cell is an intersection of a cohort and a period.
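The cross-classification described in the last note can be sketched as follows. This is only an illustration under stated assumptions: the record fields (`id`, `period`, `age`), the five-year cohort width, and the function name `cross_classify` are hypothetical and are not taken from the chapter.

```python
from collections import defaultdict

def cross_classify(respondents, cohort_width=5):
    """Group respondents into (cohort, period) cells.

    The birth cohort is derived as survey year minus age, then binned into
    intervals of `cohort_width` years (an assumed convention).
    """
    cells = defaultdict(list)
    for r in respondents:
        birth_year = r["period"] - r["age"]
        cohort = birth_year - birth_year % cohort_width  # start of cohort bin
        cells[(cohort, r["period"])].append(r["id"])
    return dict(cells)

# Hypothetical respondents from two survey years.
sample = [
    {"id": 1, "period": 1990, "age": 30},  # born 1960
    {"id": 2, "period": 2000, "age": 40},  # born 1960
    {"id": 3, "period": 2000, "age": 38},  # born 1962, same 1960-1964 cohort
    {"id": 4, "period": 2000, "age": 25},  # born 1975
]

cells = cross_classify(sample)
for (cohort, period), ids in sorted(cells.items()):
    print(f"cohort {cohort}, period {period}: respondents {ids}")
```

Respondents 2 and 3 share a cell because they belong to the same cohort bin and answered the same survey, while respondent 1 belongs to that cohort but sits in a different period.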
References
Aitkin, M. (1987). Modelling variance heterogeneity in normal regression using GLIM. Applied Statistics, 36, 332–339.
Chen, F., Yang, Y., & Liu, G. (2010). Social change and socioeconomic disparity in health over the life course in China. American Sociological Review, 75, 126–150.
Dannefer, D. (2003). Cumulative advantage/disadvantage and the life course: Cross-fertilizing age and social science theory. The Journals of Gerontology Series B: Psychological Sciences and Social Sciences, 58, S327–S337.
Fahrmeir, L., & Lang, S. (2001). Bayesian inference for generalized additive mixed models based on Markov random field priors. Applied Statistics, 50, 201–220.
Fox, J. (2008). Applied regression analysis and generalized linear models. Los Angeles: Sage.
Goesling, B. (2007). The rising significance of education for health? Social Forces, 85(4), 1621–1644.
Goldstein, H. (2003). Multilevel statistical models (3rd ed.). London: Oxford University Press.
Guo, S., & Fraser, M. W. (2010). Propensity score analysis: Statistical methods and applications. Thousand Oaks: Sage.
Harville, D. A. (1974). Bayesian inference for variance components using only error contrasts. Biometrika, 61, 383–385.
Hilbe, J. M. (2009). Logistic regression models. New York: CRC Press.
Ho, D., Imai, K., King, G., & Stuart, E. (2005). Matching as nonparametric preprocessing for reducing model dependence in parametric causal inference (Working Paper). Department of Government, Harvard University.
House, J. S., Lepkowski, J. M., Kinney, A. M., Mero, R. P., Kessler, R. C., & Herzog, A. R. (1994). The social stratification of aging and health. Journal of Health and Social Behavior, 35, 213–234.
Lauderdale, D. S. (2001). Education and survival: Birth cohort, period, and age effects. Demography, 38(4), 551–561.
Lee, Y., & Nelder, J. A. (2006). Double hierarchical generalized linear models. Journal of the Royal Statistical Society, Series C (Applied Statistics), 55, 139–185.
Lemieux, T. (2006). Increasing residual wage inequality: Composition effects, noisy data, or rising demand for skill? American Economic Review, 96(3), 461–498.
Long, J. S., & Ervin, L. H. (2000). Using heteroscedasticity consistent standard errors in the linear regression model. The American Statistician, 54, 217–224.
Lynch, S. M. (2003). Cohort and life-course patterns in the relationship between education and health: A hierarchical approach. Demography, 40(2), 309–331.
Mason, W. M., & Fienberg, S. E. (1985). Cohort analysis in social research: Beyond the identification problem. New York: Springer.
Mason, K. O., Mason, W. M., Winsborough, H. H., & Poole, W. K. (1973). Some methodological issues in cohort analysis of archival data. American Sociological Review, 38, 242–258.
Morgan, S. L., & Winship, C. (2007). Counterfactuals and causal inference: Methods and principles for social research. New York: Cambridge University Press.
Nelder, J. A., & Lee, Y. (1991). Generalized linear models for the analysis of Taguchi-type experiments. Applied Stochastic Models and Data Analysis, 7, 101–120.
Pappas, G., Queen, S., Hadden, W., & Fisher, G. (1993). The increasing disparity in mortality between socioeconomic groups in the United States, 1960 and 1986. The New England Journal of Medicine, 329, 103–115.
Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data analysis methods. Thousand Oaks: Sage.
Rigby, R. A., & Stasinopoulos, D. M. (2005). Generalized additive models for location, scale and shape (with discussion). Journal of the Royal Statistical Society, Series C, 54, 507–554.
Smyth, G. K. (1989). Generalized linear models with varying dispersion. Journal of the Royal Statistical Society, Series B, 51, 47–60.
Smyth, G. K. (2002). An efficient algorithm for REML in heteroscedastic regression. Journal of Computational and Graphical Statistics, 11, 836–847.
Smyth, G. K., Huele, A. F., & Verbyla, A. P. (2001). Exact and approximate REML for heteroscedastic regression. Statistical Modelling, 1, 161–175.
Tunnicliffe Wilson, G. (1989). On the use of marginal likelihood in time series model estimation. Journal of the Royal Statistical Society, Series B, 51, 15–27.
Verbyla, A. P. (1993). Modeling variance heterogeneity: Residual maximum likelihood and diagnostics. Journal of the Royal Statistical Society, Series B, 55, 493–508.
Warren, J. R., & Hernandez, E. M. (2007). Did socioeconomic inequalities in morbidity and mortality change in the United States over the course of the twentieth century? Journal of Health and Social Behavior, 48, 335–351.
Western, B. (2002). The impact of incarceration on wage mobility and inequality. American Sociological Review, 67, 526–546.
Western, B., & Bloome, D. (2009). Variance function regression for studying inequality. Sociological Methodology, 39, 293–325.
Western, B., Bloome, D., & Percheski, C. (2008). Inequality among American families with children, 1975 to 2005. American Sociological Review, 73, 903–920.
White, H. (1980). A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica, 48, 817–838.
Yang, Y. (2006). Bayesian inference for hierarchical age-period-cohort models of repeated cross-section survey data. Sociological Methodology, 36, 39–74.
Yang, Y. (2010). Aging, cohorts, and methods, Chapter 2. In R. H. Binstock & L. K. George (Eds.), The handbook of aging and the social sciences (7th ed., pp. 17–30). London: Academic Press.
Yang, Y., & Land, K. C. (2006). A mixed models approach to the age-period-cohort analysis of repeated cross-section surveys, with an application to data on trends in verbal test scores. Sociological Methodology, 36, 75–98.
Yang, Y., & Land, K. C. (2008). Age-period-cohort analysis of repeated cross-section surveys: Fixed or random effects? Sociological Methods and Research, 36(February), 297–326.
Yang, Y., & Lee, L. C. (2009). Sex and race disparities in health: Cohort variations in life course patterns. Social Forces, 87, 2093–2124.
Zheng, H., Yang, Y., & Land, K. C. (2011). Variance function regression in hierarchical age-period-cohort models, with applications to the study of self-reported health. American Sociological Review, 76(6), 955–983.
© 2013 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Zheng, H., Yang, Y., Land, K.C. (2013). Heteroscedastic Regression Models for the Systematic Analysis of Residual Variances. In: Morgan, S. (eds) Handbook of Causal Analysis for Social Research. Handbooks of Sociology and Social Research. Springer, Dordrecht. https://doi.org/10.1007/978-94-007-6094-3_8
DOI: https://doi.org/10.1007/978-94-007-6094-3_8
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-007-6093-6
Online ISBN: 978-94-007-6094-3
eBook Packages: Humanities, Social Sciences and Law; Social Sciences (R0)