Abstract
The likelihood for generalized linear models with covariate measurement error cannot in general be expressed in closed form, which makes maximum likelihood estimation taxing. A popular alternative is regression calibration which is computationally efficient at the cost of inconsistent estimation. We propose an improved regression calibration approach, a general pseudo maximum likelihood estimation method based on a conveniently decomposed form of the likelihood. It is both consistent and computationally efficient, and produces point estimates and estimated standard errors which are practically identical to those obtained by maximum likelihood. Simulations suggest that improved regression calibration, which is easy to implement in standard software, works well in a range of situations.
References
Albert, P.S., & Follmann, D.A. (2000). Modeling repeated count data subject to informative dropout. Biometrics, 56, 667–677.
Armstrong, B. (1985). Measurement error in generalized linear models. Communications in Statistics. Series B, 16, 529–544.
Bentler, P.M. (1983). Some contributions to efficient statistics in structural models: specification and estimation of moment structures. Psychometrika, 48, 493–517.
Blackburn, M., & Neumark, D. (1992). Unobserved ability, efficiency wages, and interindustry wage differentials. Quarterly Journal of Economics, 107, 1421–1436.
Buonaccorsi, J., Demidenko, E., & Tosteson, T. (2000). Estimation in longitudinal random effects models with measurement error. Statistica Sinica, 10, 885–903.
Buonaccorsi, J. (2010). Measurement error: models, methods and applications. Boca Raton: Chapman & Hall/CRC.
Burr, D. (1988). On errors-in-variables in binary regression—Berkson case. Journal of the American Statistical Association, 83, 739–743.
Buzas, J.S., & Stefanski, L.A. (1996). Instrumental variable estimation in generalized linear measurement error models. Journal of the American Statistical Association, 91, 999–1006.
Carroll, R.J., Ruppert, D., Stefanski, L.A., & Crainiceanu, C.M. (2006). Measurement error in nonlinear models (2nd ed.). Boca Raton: Chapman & Hall/CRC.
Carroll, R.J., Spiegelman, C.H., Lan, K.G., Bailey, K.T., & Abbott, R.D. (1984). On errors-in-variables for binary regression models. Biometrika, 71, 19–25.
Carroll, R.J., & Stefanski, L.A. (1990). Approximate quasi-likelihood estimation in models with surrogate predictors. Journal of the American Statistical Association, 85, 652–663.
Clayton, D.G. (1992). Models for the analysis of cohort and case-control studies with inaccurately measured exposures. In J.H. Dwyer, M. Feinlieb, P. Lippert, & H. Hoffmeister (Eds.), Statistical models for longitudinal studies on health (pp. 301–331). New York: Oxford University Press.
Davis, P.J., & Rabinowitz, P. (1984). Methods of numerical integration (2nd ed.). New York: Academic Press.
Gleser, L.J. (1990). Improvements of the naive approach to estimation in nonlinear errors-in-variables regression models. In P.J. Brown & W.A. Fuller (Eds.), Statistical analysis of measurement error models and applications (pp. 99–114). Providence: American Mathematical Society.
Gong, G., & Samaniego, F.J. (1981). Pseudo maximum likelihood estimation: theory and applications. Annals of Statistics, 9, 861–869.
Gourieroux, C., & Monfort, A. (1995). Statistics and econometric models (Vol. 2). Cambridge: Cambridge University Press.
Griliches, Z. (1976). Wages of very young men. Journal of Political Economy, 85, S69–S86.
Gustafson, P. (2004). Measurement error and misclassification in statistics and epidemiology: impacts and Bayesian adjustments. Boca Raton: Chapman & Hall/CRC.
Higdon, R., & Schafer, D.W. (2001). Maximum likelihood computations for regression with measurement error. Computational Statistics & Data Analysis, 35, 283–299.
Jöreskog, K.G. (1971). Statistical analysis of sets of congeneric tests. Psychometrika, 36, 109–133.
Jöreskog, K.G., & Goldberger, A.S. (1975). Estimation of a model with multiple indicators and multiple causes of a single latent variable. Journal of the American Statistical Association, 70, 631–639.
Kuha, J. (1997). Estimation by data augmentation in regression models with continuous and discrete covariates measured with error. Statistics in Medicine, 16, 189–202.
Lesaffre, E., & Spiessens, B. (2001). On the effect of the number of quadrature points in a logistic random-effects model: an example. Journal of the Royal Statistical Society. Series C, 50, 325–335.
Liang, K.-Y., & Liu, X.-H. (1991). Estimating equations in generalized linear models with measurement error. In V.P. Godambe (Ed.), Estimating functions (pp. 47–63). Oxford: Oxford University Press.
Lütkepohl, H. (1996). Handbook of matrices. Chichester: Wiley.
McCullagh, P., & Nelder, J.A. (1989). Generalized linear models (2nd ed.). London: Chapman & Hall.
McDonald, R.P. (1967). Nonlinear factor analysis (Psychometric Monograph No. 15). Richmond: Psychometric Corporation.
Parke, W.R. (1986). Pseudo maximum likelihood estimation: the asymptotic distribution. Annals of Statistics, 14, 355–357.
Rabe-Hesketh, S., Skrondal, A., & Pickles, A. (2003). Maximum likelihood estimation of generalized linear models with covariate measurement error. The Stata Journal, 3, 385–410.
Rabe-Hesketh, S., Skrondal, A., & Pickles, A. (2004a). Generalized multilevel structural equation modeling. Psychometrika, 69, 167–190.
Rabe-Hesketh, S., Skrondal, A., & Pickles, A. (2004b). Gllamm manual (Technical report 160). U.C. Berkeley Division of Biostatistics. Downloadable from http://www.bepress.com/ucbbiostat/paper160/.
Rabe-Hesketh, S., Skrondal, A., & Pickles, A. (2005). Maximum likelihood estimation of limited and discrete dependent variable models with nested random effects. Journal of Econometrics, 128, 301–323.
Rabe-Hesketh, S., & Skrondal, A. (2012). Multilevel and longitudinal modeling using Stata, vol. II: categorical responses, counts, and survival (3rd ed.). College Station: Stata Press.
Richardson, S., & Gilks, W.S. (1993). Conditional independence models for epidemiological studies with covariate measurement error. Statistics in Medicine, 12, 1703–1722.
Robinson, G.K. (1991). That BLUP is a good thing: the estimation of random effects. Statistical Science, 6, 15–51.
Robinson, P.M. (1974). Identification, estimation, and large sample theory for regressions containing unobservable variables. International Economic Review, 15, 680–692.
Rosner, B., Spiegelman, D., & Willett, W.C. (1990). Correction of logistic regression relative risk estimates and confidence intervals for measurement error: the case of multiple covariates measured with error. American Journal of Epidemiology, 132, 734–745.
Rosner, B., Willett, W.C., & Spiegelman, D. (1989). Correction of logistic regression relative risk estimates and confidence intervals for systematic within-person measurement error. Statistics in Medicine, 8, 1031–1040.
Rubin, D.B. (1976). Inference and missing data. Biometrika, 63, 581–592.
Schafer, D.W. (1987). Covariate measurement error in generalized linear models. Biometrika, 74, 385–391.
Schafer, D.W. (1993). Likelihood analysis for probit regression with measurement error. Biometrika, 80, 899–904.
Schafer, D.W., & Purdy, K.G. (1996). Likelihood analysis for errors-in-variables regression with replicate measurements. Biometrika, 83, 813–824.
Shapiro, A. (2007). Statistical inference of moment structures. In S.Y. Lee (Ed.), Handbook of latent variable and related models (pp. 229–259). Amsterdam: Elsevier.
Skrondal, A., & Laake, P. (2001). Regression among factor scores. Psychometrika, 66, 563–575.
Skrondal, A., & Rabe-Hesketh, S. (2004). Generalized latent variable modeling. Boca Raton: Chapman & Hall/CRC.
Skrondal, A., & Rabe-Hesketh, S. (2007). Latent variable modelling: a survey. Scandinavian Journal of Statistics, 34, 712–745.
Skrondal, A., & Rabe-Hesketh, S. (2009). Prediction in multilevel generalized linear mixed models. Journal of the Royal Statistical Society. Series A, 172, 659–687.
Stephens, D.A., & Dellaportas, P. (1992). Bayesian analysis of generalised linear models with covariate measurement error. In J.M. Bernardo, J.O. Berger, A.P. Dawid, & A.F.M. Smith (Eds.), Bayesian statistics (Vol. 4, pp. 813–820). Oxford: Oxford University Press.
Thisted, R.A. (1988). Elements of statistical computing. London: Chapman & Hall.
Acknowledgements
We are grateful to H.K. Gjessing for helpful discussions and three anonymous reviewers for constructive comments.
Appendix: Obtaining \(\widehat{\boldsymbol{\mathcal{I}}}_{ \mathsf{ME},\mathsf{O}}\) in (18)
Here we describe the calculation of the estimate (18) of the matrix \(\boldsymbol{\mathcal{I}}_{\mathsf{ME}, \mathsf{O}}\), which is used in the calculation of the variance matrix (17) of \(\widehat{\boldsymbol {\vartheta }}_{\mathsf{O}}^{\mathsf{IRC}}\). Let us first introduce some convenient shorthand notation for the logarithm of the likelihood contribution (6):
\[
\ell_{i} = \log g_{1i} + \log g_{2i}, \qquad g_{1i} = \int g_{yi}\, g_{xi}\, d\mathbf{x}_{i},
\]
where \(g_{yi}\) denotes the conditional density of \(y_{i}\) given \((\mathbf{x}_{i}, \mathbf{z}_{i})\).
Here \(g_{xi}\) and \(g_{2i}\) are multivariate normal density functions with parameters \(\boldsymbol{\theta}_{1i}=(\boldsymbol{\xi}_{i}',\text{vec}(\boldsymbol{\Omega}_{i})')'\) and \(\boldsymbol{\theta}_{2i}=(\boldsymbol{\mu}_{i}',\text{vec}(\boldsymbol{\Sigma}_{i})')'\), respectively, as defined by (11)–(12) and (9)–(10). These in turn are functions of the parameters \(\boldsymbol{\chi}=(\boldsymbol{\nu}',\text{vec}(\boldsymbol{\Lambda})',\text{vec}(\boldsymbol{\Theta})',\text{vec}(\boldsymbol{\Gamma})',\text{vec}(\boldsymbol{\Psi})')'\), and \(\boldsymbol{\vartheta}_{\mathsf{ME}}\) collects the distinct, unknown elements of \(\boldsymbol{\chi}\).
The required gradients for (18) are
\[
\frac{\partial \ell_{i}}{\partial\boldsymbol{\vartheta}_{\mathsf{O}}} = \frac{1}{g_{1i}}\,\frac{\partial g_{1i}}{\partial\boldsymbol{\vartheta}_{\mathsf{O}}},
\tag{A.1}
\]
\[
\frac{\partial \ell_{i}}{\partial\boldsymbol{\vartheta}_{\mathsf{ME}}} = \biggl(\frac{\partial\boldsymbol{\chi}}{\partial\boldsymbol{\vartheta}_{\mathsf{ME}}'}\biggr)' \biggl[\biggl(\frac{\partial\boldsymbol{\theta}_{1i}}{\partial\boldsymbol{\chi}'}\biggr)' \frac{1}{g_{1i}}\,\frac{\partial g_{1i}}{\partial\boldsymbol{\theta}_{1i}} + \biggl(\frac{\partial\boldsymbol{\theta}_{2i}}{\partial\boldsymbol{\chi}'}\biggr)' \frac{\partial\log g_{2i}}{\partial\boldsymbol{\theta}_{2i}}\biggr],
\tag{A.2}
\]
where
\[
g_{1i}=\int g_{yi}\,g_{xi}\,d\mathbf{x}_{i}, \qquad
\frac{\partial g_{1i}}{\partial\boldsymbol{\vartheta}_{\mathsf{O}}} = \int \frac{\partial g_{yi}}{\partial\boldsymbol{\vartheta}_{\mathsf{O}}}\, g_{xi}\,d\mathbf{x}_{i}, \qquad
\frac{\partial g_{1i}}{\partial\boldsymbol{\theta}_{1i}} = \int g_{yi}\,\frac{\partial g_{xi}}{\partial\boldsymbol{\theta}_{1i}}\,d\mathbf{x}_{i}.
\tag{A.3–A.5}
\]
Estimated values for these quantities, and thus for the estimated matrix \(\widehat{\boldsymbol{\mathcal{I}}}_{\mathsf{ME},\mathsf{O}}\) given by (18), are obtained by substituting estimates \(\widehat{\boldsymbol {\vartheta }}^{\mathsf{IRC}}\) of the parameters.
Starting with (A.2), we note that each element of \(\boldsymbol{\chi}\) is either a known constant or equal to a single element of \(\boldsymbol{\vartheta}_{\mathsf{ME}}\); for illustration, consider \(\boldsymbol{\Lambda}\) as shown in (2). Suppose that \(\boldsymbol{\chi}\) is of length t and \(\boldsymbol{\vartheta}_{\mathsf{ME}}\) of length u. Then \(\partial\boldsymbol{\chi}/\partial\boldsymbol{\vartheta}_{\mathsf{ME}}'\) is a t×u matrix whose (i,j)th element is 1 if the ith element of \(\boldsymbol{\chi}\) is equal to the jth element of \(\boldsymbol{\vartheta}_{\mathsf{ME}}\), and 0 otherwise.
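As a concrete illustration of this construction (a minimal numpy sketch, not part of the original derivation; the function name `dchi_dtheta` and the label-based encoding of \(\boldsymbol{\chi}\) are ours), the 0/1 matrix can be built by matching each element of \(\boldsymbol{\chi}\) against the list of free parameters:

```python
import numpy as np

def dchi_dtheta(chi_labels, free_names):
    """Build the t x u matrix dchi/dtheta_ME': element (i, j) is 1 if
    the ith element of chi equals the jth free parameter, 0 otherwise.

    chi_labels: length-t list; a free-parameter name (str) for elements
    of chi that are unknown, or None for known constants.
    free_names: length-u list of the distinct free-parameter names.
    """
    D = np.zeros((len(chi_labels), len(free_names)))
    for i, label in enumerate(chi_labels):
        if label is not None:
            D[i, free_names.index(label)] = 1.0
    return D

# Example: chi = (nu1, 1, lambda21, 0, psi11)' with fixed elements 1 and 0,
# so theta_ME = (nu1, lambda21, psi11)' and the matrix is 5 x 3
D = dchi_dtheta(["nu1", None, "lambda21", None, "psi11"],
                ["nu1", "lambda21", "psi11"])
```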
Next, the elements of \(\partial\boldsymbol{\theta}_{2i}/\partial\boldsymbol{\chi}'\) in (A.2) are
and the elements of \(\partial\boldsymbol{\theta}_{1i}/\partial\boldsymbol{\chi}'\) are
where
and vec(⋅) denotes the column-by-column vectorization operator, ⊗ the Kronecker product, \(\mathbf{I}_{m}\) an m×m identity matrix, and \(\mathbf{K}_{rm}\) an rm×rm commutation matrix. The formulas are obtained through repeated application of rules of matrix differentiation (see, e.g., Lütkepohl 1996).
In the second term of (A.2), the elements of \(\partial\log g_{2i}/\partial\boldsymbol{\theta}_{2i}'\) are
\[
\frac{\partial\log g_{2i}}{\partial\boldsymbol{\mu}_{i}'} = (\mathbf{w}_{i}-\boldsymbol{\mu}_{i})'\boldsymbol{\Sigma}_{i}^{-1}
\quad\text{and}\quad
\frac{\partial\log g_{2i}}{\partial\,\text{vec}(\boldsymbol{\Sigma}_{i})'} = \frac{1}{2}\,\text{vec}\bigl[\boldsymbol{\Sigma}_{i}^{-1}(\mathbf{w}_{i}-\boldsymbol{\mu}_{i})(\mathbf{w}_{i}-\boldsymbol{\mu}_{i})'\boldsymbol{\Sigma}_{i}^{-1}-\boldsymbol{\Sigma}_{i}^{-1}\bigr]'.
\]
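These two gradient formulas are easy to verify numerically. The following numpy sketch (ours, not from the paper; the dimension and parameter values are arbitrary) evaluates them and checks the \(\boldsymbol{\mu}_{i}\)-gradient against central finite differences of the log-density:

```python
import numpy as np

def mvn_logpdf(w, mu, Sigma):
    """Log-density of N(mu, Sigma) evaluated at w."""
    r = w - mu
    _, logdet = np.linalg.slogdet(Sigma)
    return -0.5 * (len(mu) * np.log(2 * np.pi) + logdet
                   + r @ np.linalg.solve(Sigma, r))

def logdensity_grads(w, mu, Sigma):
    """Gradients of log g w.r.t. mu and vec(Sigma), per the formulas above."""
    Si = np.linalg.inv(Sigma)
    r = w - mu
    g_mu = Si @ r                                   # (w - mu)' Sigma^{-1}
    g_Sigma = 0.5 * (Si @ np.outer(r, r) @ Si - Si)
    return g_mu, g_Sigma.reshape(-1, order="F")     # column-by-column vec

w = np.array([0.3, -1.2])
mu = np.array([0.1, 0.4])
Sigma = np.array([[1.0, 0.3], [0.3, 2.0]])
g_mu, g_vecSigma = logdensity_grads(w, mu, Sigma)

# central finite differences for the mu-gradient
eps = 1e-6
fd_mu = np.array([(mvn_logpdf(w, mu + eps * np.eye(2)[k], Sigma)
                   - mvn_logpdf(w, mu - eps * np.eye(2)[k], Sigma)) / (2 * eps)
                  for k in range(2)])
```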
The remaining elements of (A.1) and (A.2) also depend on the outcome model for \(y_{i}\). For the logistic model, which is predominant in applications of generalized linear models with covariate measurement error, and which is also used in our simulations and example, \(g_{yi}=\pi_{i}^{y_{i}}(1-\pi_{i})^{1-y_{i}}\), where \(\pi_{i}=\exp(\eta_{i})/[1+\exp(\eta_{i})]\) and \(\eta_{i}=\mathbf{z}_{i}'\boldsymbol{\beta}_{z}+\mathbf{x}_{i}'\boldsymbol{\beta}_{x}\). For this model we employ the well-known closed-form approximation \(g_{1i}\approx(\pi_{i}^{*})^{y_{i}}(1-\pi_{i}^{*})^{1-y_{i}}\), where \(\pi_{i}^{*}=\exp(\eta^{*}_{i})/[1+\exp(\eta^{*}_{i})]\), \(\eta^{*}_{i}=\eta_{1i}\eta_{2i}^{-1/2}\), \(\eta_{1i}=\mathbf{z}_{i}'\boldsymbol{\beta}_{z}+\boldsymbol{\xi}_{i}'\boldsymbol{\beta}_{x}\), \(\eta_{2i}=1+d\,\boldsymbol{\beta}_{x}'\boldsymbol{\Omega}_{i}\boldsymbol{\beta}_{x}\), and \(d=1/1.7^{2}\) (e.g., Liang & Liu 1991). For this approximation,
where \(\boldsymbol{\xi}_{i}^{*}=\boldsymbol{\xi}_{i}-\eta_{1i}\eta_{2i}^{-1}\, d\, \boldsymbol{\Omega}_{i}\boldsymbol{\beta}_{x}\). These formulas complete the explicit expressions for (A.1) and (A.2).
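The accuracy of the closed-form approximation above is easy to probe numerically. The sketch below (ours; the parameter values are arbitrary) compares it with an ordinary Gauss–Hermite evaluation of \(E[\pi(\eta)]\) for \(\eta\sim N(\eta_{1i}, \boldsymbol{\beta}_{x}'\boldsymbol{\Omega}_{i}\boldsymbol{\beta}_{x})\):

```python
import numpy as np

def logistic(t):
    return 1.0 / (1.0 + np.exp(-t))

def prob_approx(eta1, v, d=1 / 1.7**2):
    # closed-form approximation: logistic(eta1 / sqrt(1 + d * v))
    return logistic(eta1 / np.sqrt(1.0 + d * v))

def prob_ghq(eta1, v, n=40):
    # E[logistic(eta)] for eta ~ N(eta1, v), by Gauss-Hermite quadrature
    nodes, weights = np.polynomial.hermite.hermgauss(n)
    return np.sum(weights * logistic(eta1 + np.sqrt(2.0 * v) * nodes)) / np.sqrt(np.pi)

# illustrative values: eta1 = z'beta_z + xi'beta_x, v = beta_x' Omega beta_x
p_approx = prob_approx(0.8, 0.5)
p_exact = prob_ghq(0.8, 0.5)
```

With moderate values of the prediction variance v, the two probabilities typically agree to two or three decimal places, which is what makes the approximation attractive in practice.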
In our data analysis, we also apply a similar idea for the conventional regression calibration estimate of \(\boldsymbol{\vartheta}_{\mathsf{O}}\), which uses the first-order approximation \(g_{1i}\approx(\pi_{i}^{\mathrm{RC}})^{y_{i}}(1-\pi_{i}^{\mathrm{RC}})^{1-y_{i}}\), where \(\pi_{i}^{\mathrm{RC}}=\exp(\eta_{1i})/[1+\exp(\eta_{1i})]\). We estimate its variance matrix analogously to (17)–(18), using in (A.1) and (A.2) \(\partial g_{1i}/\partial\boldsymbol{\vartheta}_{\mathsf{O}}=(\partial g_{1i}/\partial\eta_{1i})\,(\mathbf{z}_{i}',\boldsymbol{\xi}_{i}')'\) and \(\partial g_{1i}/\partial\boldsymbol{\theta}_{1i}'=(\partial g_{1i}/\partial\eta_{1i})\,[\boldsymbol{\beta}_{x}',\mathbf{0}']\), where \(\partial g_{1i}/\partial\eta_{1i}=(-1)^{1-y_{i}}\,\pi_{i}^{\mathrm{RC}}(1-\pi_{i}^{\mathrm{RC}})\).
For other, less commonly used models, we must evaluate the integrals involved in (A.3)–(A.5). Note first that the partial derivatives \(\partial g_{xi}/\partial\boldsymbol{\theta}_{1i}'\) are given by
\[
\frac{\partial g_{xi}}{\partial\boldsymbol{\xi}_{i}'} = g_{xi}\,(\mathbf{x}_{i}-\boldsymbol{\xi}_{i})'\boldsymbol{\Omega}_{i}^{-1}
\quad\text{and}\quad
\frac{\partial g_{xi}}{\partial\,\text{vec}(\boldsymbol{\Omega}_{i})'} = \frac{g_{xi}}{2}\,\text{vec}\bigl[\boldsymbol{\Omega}_{i}^{-1}(\mathbf{x}_{i}-\boldsymbol{\xi}_{i})(\mathbf{x}_{i}-\boldsymbol{\xi}_{i})'\boldsymbol{\Omega}_{i}^{-1}-\boldsymbol{\Omega}_{i}^{-1}\bigr]'.
\]
Substituting these into (A.5), we see that each of the integrals there, and also those in (A.3) and (A.4), is of the form \(\int h_{i}(\mathbf{x}_{i})\, g_{xi}\, d\mathbf{x}_{i}\) for some function \(h_{i}(\mathbf{x}_{i})\) of \(\mathbf{x}_{i}\), integrated over the multivariate normal density \(g_{xi}=g(\mathbf{x}_{i}|\mathbf{w}_{i},\mathbf{z}_{i};\boldsymbol{\vartheta}_{\mathsf{ME}})\). The integrals can therefore be evaluated by Monte Carlo integration: first generate M independent draws \(\mathbf{x}_{ij}\), j=1,…,M, from \(g(\mathbf{x}_{i}|\mathbf{w}_{i},\mathbf{z}_{i};\widehat{\boldsymbol{\vartheta}}_{\mathsf{ME}})\), and then approximate the integrals by the averages \(M^{-1}\sum_{j=1}^{M} h_{i}(\mathbf{x}_{ij})\) for each of the \(h_{i}(\cdot)\). Only one set of random draws is needed for all observations i if we first generate M uncorrelated m-vectors \(\mathbf{u}_{j}\) of standard normal variates and then calculate \(\mathbf{x}_{ij}=\widetilde{\boldsymbol{\xi}}_{i}+\mathbf{B}_{i}\mathbf{u}_{j}\), where \(\mathbf{B}_{i}\) is, for example, the Cholesky factor satisfying \(\widehat{\boldsymbol{\Omega}}_{i}=\mathbf{B}_{i}\mathbf{B}_{i}'\).
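The common-draws scheme can be sketched as follows (a numpy illustration of the idea, not code from the paper; the choice of \(h_{i}\) as the first and second moments, and the input values, are ours for checking purposes):

```python
import numpy as np

rng = np.random.default_rng(2012)  # fixed seed for reproducibility

def mc_mean(h, xi, Omega, u):
    """Monte Carlo estimate of E[h(x)] for x ~ N(xi, Omega).

    u: (M, m) matrix of standard normal draws, shared across all
    observations i; x_j = xi + B u_j with Omega = B B' (Cholesky).
    """
    B = np.linalg.cholesky(Omega)
    x = xi + u @ B.T            # all M draws at once, one row per draw
    return h(x).mean(axis=0)

M = 100_000
u = rng.standard_normal((M, 2))  # one common set of draws for all i
xi = np.array([1.0, -0.5])
Omega = np.array([[1.0, 0.4], [0.4, 0.8]])

est_first = mc_mean(lambda x: x, xi, Omega, u)       # should approach xi
est_second = mc_mean(lambda x: x**2, xi, Omega, u)   # approaches xi^2 + diag(Omega)
```

Reusing the same u across observations both saves computation and makes the estimated integrals smooth functions of the parameters.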
Skrondal, A., Kuha, J. Improved Regression Calibration. Psychometrika 77, 649–669 (2012). https://doi.org/10.1007/s11336-012-9285-1