Dual Model Misspecification in Generalized Linear Models with Error in Variables

Conference paper, in New Developments in Statistical Modeling, Inference and Application (ICSA Book Series in Statistics, ICSABSS).

Abstract

We study maximum likelihood estimation of regression parameters in generalized linear models for a binary response with error-prone covariates when the distribution of the error-prone covariate or the link function is misspecified. We revisit the remeasurement method proposed by Huang et al. (Biometrika 93:53–64, 2006) for detecting latent-variable model misspecification and examine its operating characteristics in the presence of link misspecification. Furthermore, we propose a new diagnostic method for assessing assumptions on the link function. Combining these two methods yields informative diagnostic procedures that can identify which model assumption is violated and also reveal the direction in which the true latent-variable distribution or the true link function deviates from the assumed one.


References

  • Alonso, A., Litière, S., & Laenen, A. (2010). A note on the indeterminacy of the random-effects distribution in hierarchical models. The American Statistician, 64, 318–324.

  • Brown, C. C. (1982). On a goodness-of-fit test for the logistic model based on score statistics. Communications in Statistics, 11, 1087–1105.

  • Carroll, R. J., Ruppert, D., Stefanski, L. A., & Crainiceanu, C. M. (2006). Measurement error in non-linear models: A modern perspective (2nd ed.). Boca Raton: Chapman & Hall/CRC.

  • Chambers, E. A., & Cox, D. R. (1967). Discrimination between alternative binary response models. Biometrika, 54, 573–578.

  • Czado, C., & Santner, T. J. (1992). The effect of link misspecification on binary regression inference. Journal of Statistical Planning and Inference, 33, 213–231.

  • Fowlkes, E. B. (1987). Some diagnostics for binary regression via smoothing. Biometrika, 74, 503–515.

  • Hosmer, D. W., Hosmer, T., Le Cessie, S., & Lemeshow, S. (1997). A comparison of goodness-of-fit tests for the logistic regression model. Statistics in Medicine, 16, 965–980.

  • Hosmer, D. W., & Lemeshow, S. (1989). Applied logistic regression. New York: Wiley.

  • Huang, X. (2009). An improved test of latent-variable model misspecification in structural measurement error models for group testing data. Statistics in Medicine, 28, 3316–3327.

  • Huang, X., Stefanski, L. A., & Davidian, M. (2006). Latent-model robustness in structural measurement error models. Biometrika, 93, 53–64.

  • Huang, X., Stefanski, L. A., & Davidian, M. (2009). Latent-model robustness in joint modeling for a primary endpoint and a longitudinal process. Biometrics, 65, 719–727.

  • Kannel, W. B., Neaton, J. D., Wentworth, D., Thomas, H. E., Stamler, J., Hulley, S. B., et al. (1986). Overall and coronary heart disease mortality rates in relation to major risk factors in 325,348 men screened for MRFIT. American Heart Journal, 112, 825–836.

  • Le Cessie, S., & van Houwelingen, J. C. (1991). A goodness-of-fit test for binary data based on smoothing residuals. Biometrics, 47, 1267–1282.

  • Li, K.-C., & Duan, N. (1989). Regression analysis under link violation. The Annals of Statistics, 17, 1009–1052.

  • Ma, Y., Hart, J. D., Janicki, R., & Carroll, R. J. (2011). Local and omnibus goodness-of-fit tests in classical measurement error models. Journal of the Royal Statistical Society: Series B, 73, 81–98.

  • McCullagh, P., & Nelder, J. A. (1989). Generalized linear models (2nd ed.). Boca Raton: Chapman & Hall/CRC.

  • Nelder, J. A., & Wedderburn, R. W. M. (1972). Generalized linear models. Journal of the Royal Statistical Society: Series A, 135, 370–384.

  • Pregibon, D. (1980). Goodness of link tests for generalized linear models. Journal of the Royal Statistical Society: Series C 29, 15–24.

  • Stefanski, L. A., & Carroll, R. J. (1990). Deconvoluting kernel density estimators. Statistics, 21, 169–184.

  • Stukel, T. A. (1988). Generalized logistic models. Journal of the American Statistical Association, 83, 426–431.

  • Tsiatis, A. A. (1980). A note on a goodness-of-fit test for the logistic regression model. Biometrika, 67, 250–251.

  • Verbeke, G., & Molenberghs, G. (2010). Arbitrariness of models for augmented and coarse data, with emphasis on incomplete-data and random-effects models. Statistical Modelling, 10, 391–419.

  • Wang, X., & Wang, B. (2011). Deconvolution estimation in measurement error models: The R package decon. Journal of Statistical Software, 39, 1–24.

  • White, H. (1982). Maximum likelihood estimation of misspecified models. Econometrica, 50, 1–25.

Correspondence to Xianzheng Huang.

Appendices

Appendix 1: Likelihood and Score Functions Referenced in Sect. 3.2

1.1 Likelihood and Score Functions Under the Assumed Model

If one posits a probit link in the primary model and assumes X ∼ N(μ_x, σ_x^2), the observed-data likelihood for subject i is

$$\displaystyle{ f_{\mbox{ $Y,W$}}(Y _{i},W_{i};\varOmega,\sigma _{u}^{2}) = e_{ i}[\varPhi \{h_{i}(\beta )\}]^{Y _{i} }[\varPhi \{-h_{i}(\beta )\}]^{1-Y _{i} },\quad \mbox{ for}\ i = 1,\ldots,n, }$$
(6)

where Φ(⋅ ) is the cumulative distribution function (cdf) of N(0, 1), and

$$\displaystyle\begin{array}{rcl} e_{i}& =& \frac{1} {\sqrt{\sigma _{u }^{2 } +\sigma _{ x }^{2}}}\phi \left ( \frac{W_{i} -\mu _{x}} {\sqrt{\sigma _{u }^{2 } +\sigma _{ x }^{2}}}\right ),{}\end{array}$$
(7)
$$\displaystyle\begin{array}{rcl} h_{i}(\beta )& =& \left (\beta _{0} +\beta _{1}\frac{\sigma _{x}^{2}W_{i} +\sigma _{ u}^{2}\mu _{x}} {\sigma _{u}^{2} +\sigma _{ x}^{2}} \right )\left (1 + \frac{\beta _{1}^{2}\sigma _{u}^{2}\sigma _{x}^{2}} {\sigma _{u}^{2} +\sigma _{ x}^{2}}\right )^{-1/2}.{}\end{array}$$
(8)
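As a concrete numerical illustration of the pieces (6)–(8), the sketch below evaluates e_i, h_i(β), and the observed-data likelihood at one point; all parameter values are hypothetical and chosen purely for illustration (Python with NumPy/SciPy assumed).

```python
# A minimal numerical sketch of the assumed-model likelihood (6)-(8).
import numpy as np
from scipy.stats import norm

def h_i(w, beta0, beta1, mu_x, sig_x2, sig_u2):
    """h_i(beta) in (8): the probit index after integrating out X."""
    s2 = sig_u2 + sig_x2
    num = beta0 + beta1 * (sig_x2 * w + sig_u2 * mu_x) / s2
    return num / np.sqrt(1.0 + beta1**2 * sig_u2 * sig_x2 / s2)

def lik_yw(y, w, beta0, beta1, mu_x, sig_x2, sig_u2):
    """Observed-data likelihood f_{Y,W}(y, w) in (6)."""
    s2 = sig_u2 + sig_x2
    e = norm.pdf(w, loc=mu_x, scale=np.sqrt(s2))   # e_i in (7): W ~ N(mu_x, sig_u2 + sig_x2)
    h = h_i(w, beta0, beta1, mu_x, sig_x2, sig_u2)
    return e * norm.cdf(h)**y * norm.cdf(-h)**(1 - y)

# Sanity check: the two response likelihoods at a fixed w sum to the
# marginal density of W at that point.
w = 0.7
total = lik_yw(1, w, 0.2, 0.5, 0.0, 1.0, 0.25) + lik_yw(0, w, 0.2, 0.5, 0.0, 1.0, 0.25)
```

The check exploits Φ(h) + Φ(−h) = 1, so `total` must equal the N(0, 1.25) density at w = 0.7.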

If the reclassification model is

$$\displaystyle{ P(Y _{i}^{{\ast}} = Y _{ i}\vert W_{i}) =\pi _{i},\text{ for}\ i = 1,\ldots,n, }$$
(9)

the likelihood of the ith reclassified datum, (Y_i*, W_i), under the assumed model is

$$\displaystyle\begin{array}{rcl} f_{\mbox{ $Y^{{\ast}},W$}}(Y _{i}^{{\ast}},W_{ i};\varOmega,\sigma _{u}^{2})& =& e_{ i}[\pi _{i}\varPhi \{h_{i}(\beta )\} + (1 -\pi _{i})\varPhi \{ - h_{i}(\beta )\}]^{Y _{i}^{{\ast}} } \\ & & \times [(1 -\pi _{i})\varPhi \{h_{i}(\beta )\} +\pi _{i}\varPhi \{ - h_{i}(\beta )\}]^{1-Y _{i}^{{\ast}} }.{}\end{array}$$
(10)

Differentiating the logarithm of (6) with respect to β yields the normal scores associated with β based on the raw data, with measurement error in X only; similarly, differentiating the logarithm of (10) with respect to β gives the counterpart normal scores for the reclassified data, with measurement error in both X and Y. These two sets of scores are, respectively,

$$\displaystyle\begin{array}{rcl} \psi _{m}(\beta;Y _{i},W_{i})& =& h'_{i}(\beta )\phi \{h_{i}(\beta )\}\varPhi ^{-1}\{ - h_{ i}(\beta )\}\left [ \frac{Y _{i}} {\varPhi \{h_{i}(\beta )\}} - 1\right ],{}\end{array}$$
(11)
$$\displaystyle\begin{array}{rcl} \psi _{c}(\beta;Y _{i}^{{\ast}},W_{ i})& =& h'_{i}(\beta )\phi \{h_{i}(\beta )\}\varPhi ^{-1}\{ - h_{ i}(\beta )\}d_{i}^{-1}(\beta ) \\ & & \times \left [\frac{Y _{i}^{{\ast}}(2\pi _{i} - 1)\varPhi \{ - h_{i}(\beta )\}} {1 - d_{i}(\beta )} + 1 - d_{i}(\beta ) -\pi _{i}\right ],{}\end{array}$$
(12)

where

$$\displaystyle{ d_{i}(\beta ) = (1 -\pi _{i})\varPhi \{h_{i}(\beta )\} +\pi _{i}\varPhi \{ - h_{i}(\beta )\}, }$$
(13)

and h_i′(β) = (∂/∂β)h_i(β) consists of the following two elements,

$$\displaystyle\begin{array}{rcl} \frac{\partial h_{i}(\beta )} {\partial \beta _{0}} & =& \left (1 + \frac{\beta _{1}^{2}\sigma _{u}^{2}\sigma _{x}^{2}} {\sigma _{u}^{2} +\sigma _{ x}^{2}}\right )^{-1/2}, {}\\ \frac{\partial h_{i}(\beta )} {\partial \beta _{1}} & =& \frac{(\sigma _{x}^{2}W_{i} +\sigma _{ u}^{2}\mu _{x})\left \{(\partial /\partial \beta _{0})h_{i}(\beta )\right \}^{-1} -\beta _{1}\sigma _{u}^{2}\sigma _{x}^{2}h_{i}(\beta )} {\sigma _{u}^{2} +\sigma _{ x}^{2} +\beta _{ 1}^{2}\sigma _{u}^{2}\sigma _{x}^{2}}. {}\\ \end{array}$$

A close inspection of the scores in (11) and (12) reveals some values of π_i that one should avoid when specifying the reclassification model in (9). First, note that the score function in (12) is identically zero if π_i = 0.5 for all i = 1, …, n. Consequently, β is non-estimable from reclassified data generated according to P(Y_i* = Y_i | W_i) = 0.5 for all i = 1, …, n. This is not surprising: with all π_i equal to 0.5, {Y_i*}_{i=1}^n carries virtually no information about the true responses. Second, the two sets of scores coincide when π_i = 0 for all i = 1, …, n, or π_i = 1 for all i = 1, …, n. This is also expected, since in either case {Y_i*}_{i=1}^n carries exactly the same information as {Y_i}_{i=1}^n, and hence the MLEs of β from the two data sets are identical, whether or not the assumed model is correct. Therefore, for the purpose of model diagnosis, we avoid setting π_i in (9) identically equal to 0.5, 0, or 1 for all i = 1, …, n.
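The degeneracy at π_i = 0.5 is easy to verify numerically. The sketch below codes the scalar part of ψ_c in (12) (the common factor h_i′(β) merely multiplies it) and evaluates it at π = 0.5; the index values tried are arbitrary.

```python
# Numerical check that the reclassified-data score (12) degenerates at pi_i = 0.5.
import numpy as np
from scipy.stats import norm

def score_c_factor(ystar, h, pi):
    """Scalar part of psi_c(beta; Y*, W) in (12) for one observation."""
    d = (1 - pi) * norm.cdf(h) + pi * norm.cdf(-h)          # d_i(beta) in (13)
    bracket = ystar * (2 * pi - 1) * norm.cdf(-h) / (1 - d) + 1 - d - pi
    return norm.pdf(h) / (norm.cdf(-h) * d) * bracket

# At pi = 0.5, d_i = 0.5 and the bracket collapses to zero for either
# response value and any index h:
vals = [score_c_factor(y, h, 0.5) for y in (0, 1) for h in (-1.2, 0.0, 0.8)]
```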

1.2 Score Estimating Equations

Under regularity conditions, the limiting MLEs of β based on the raw data and on the reclassified data as n → ∞, denoted β_m and β_c respectively, uniquely satisfy the following score equations,

$$\displaystyle\begin{array}{rcl} E_{\mbox{ $Y,W$}}\left \{\psi _{m}(\beta _{m};Y _{i},W_{i})\right \}& =& 0,{}\end{array}$$
(14)
$$\displaystyle\begin{array}{rcl} E_{\mbox{ $Y^{{\ast}},W$}}\left \{\psi _{c}(\beta _{c};Y _{i}^{{\ast}},W_{ i})\right \}& =& 0,{}\end{array}$$
(15)

where the subscripts attached to E{⋅ } signify that the expectations are defined with respect to the relevant true model.

Using iterated expectations, one can show that (14) boils down to the following set of two equations,

$$\displaystyle\begin{array}{rcl} E_{\mbox{ $W$}}\left [\phi \{h_{i}(\beta _{m})\} \frac{p_{i} -\varPhi \{ h_{i}(\beta _{m})\}} {\varPhi \{h_{i}(\beta _{m})\}\varPhi \{ - h_{i}(\beta _{m})\}}\right ]& =& 0,{}\end{array}$$
(16)
$$\displaystyle\begin{array}{rcl} E_{\mbox{ $W$}}\left [W_{i}\phi \{h_{i}(\beta _{m})\} \frac{p_{i} -\varPhi \{ h_{i}(\beta _{m})\}} {\varPhi \{h_{i}(\beta _{m})\}\varPhi \{ - h_{i}(\beta _{m})\}}\right ]& =& 0,{}\end{array}$$
(17)

where p_i is the mean of Y_i given W_i under the true model, that is, p_i = P^(t)(Y_i = 1 | W_i) evaluated at β (the true parameter value), for i = 1, …, n. Similarly, one can deduce that (15) is equivalent to the following system of equations,

$$\displaystyle\begin{array}{rcl} E_{\mbox{ $W$}}\left [\phi \{h_{i}(\beta _{c})\}\frac{(1 - 2\pi _{i})\{1 - d_{i}(\beta _{c}) - q_{i}\}} {d_{i}(\beta _{c})\{1 - d_{i}(\beta _{c})\}} \right ]& =& 0,{}\end{array}$$
(18)
$$\displaystyle\begin{array}{rcl} E_{\mbox{ $W$}}\left [W_{i}\phi \{h_{i}(\beta _{c})\}\frac{(1 - 2\pi _{i})\{1 - d_{i}(\beta _{c}) - q_{i}\}} {d_{i}(\beta _{c})\{1 - d_{i}(\beta _{c})\}} \right ]& =& 0,{}\end{array}$$
(19)

where q_i is the mean of Y_i* given W_i under the true model, that is,

$$\displaystyle{ q_{i} = P^{(t)}(Y _{ i}^{{\ast}} = 1\vert W_{ i}) =\pi _{i}p_{i} + (1 -\pi _{i})(1 - p_{i}),\quad \text{for}\ i = 1,\ldots,n. }$$
(20)

1.3 Likelihood Function Under the True Model

Under the mixture-probit-normal model specified in Sect. 3.2, the likelihood of (Y i , W i ) is

$$\displaystyle{ f_{\mbox{ $Y,W$}}^{(t)}(Y _{ i},W_{i};\varOmega ^{(t)},\sigma _{ u}^{2}) =\rho e_{ 1i}p_{1i}^{Y _{i} }(1-p_{1i})^{1-Y _{i} }+(1-\rho )e_{2i}p_{2i}^{Y _{i} }(1-p_{2i})^{1-Y _{i} },}$$

where, for ℓ = 1, 2,

$$\displaystyle\begin{array}{rcl} e_{\ell i}& =& \frac{1} {\sqrt{\sigma _{u }^{2 } +\sigma _{ x\ell }^{2}}}\phi \left ( \frac{W_{i} -\mu _{x\ell}} {\sqrt{\sigma _{u }^{2 } +\sigma _{ x\ell }^{2}}}\right ), {}\\ p_{\ell i}& =& \alpha \varPhi (h_{\ell1i}) + (1-\alpha )\varPhi (h_{\ell2i}), {}\\ h_{\ell ki}& =& \left (\beta _{0} -\mu _{k} +\beta _{1}\frac{\sigma _{x\ell}^{2}W_{i} +\sigma _{ u}^{2}\mu _{x\ell}} {\sigma _{u}^{2} +\sigma _{ x\ell}^{2}} \right )\left (\sigma _{k}^{2} + \frac{\beta _{1}^{2}\sigma _{ u}^{2}\sigma _{ x\ell}^{2}} {\sigma _{u}^{2} +\sigma _{ x\ell}^{2}}\right )^{-1/2},\quad \text{for}\ k = 1,2. {}\\ \end{array}$$

It follows that the true mean of Y_i given W_i is

$$\displaystyle{ p_{i} = P^{(t)}(Y _{ i} = 1\vert W_{i}) = \frac{\rho e_{1i}p_{1i} + (1-\rho )e_{2i}p_{2i}} {\rho e_{1i} + (1-\rho )e_{2i}},\quad \text{for}\ i = 1,\ldots,n. }$$
(21)

Evaluating (20) at this p_i, one obtains the true mean of Y_i* given W_i, that is, q_i = P^(t)(Y_i* = 1 | W_i), and further deduces that the true-model likelihood of the reclassified data (Y_i*, W_i) is, for i = 1, …, n,

$$\displaystyle{f_{\mbox{ $Y^{{\ast}},W$}}^{(t)}(Y _{ i}^{{\ast}},W_{ i};\varOmega ^{(t)},\sigma _{ u}^{2}) =\{\rho e_{ 1i} + (1-\rho )e_{2i}\}q_{i}^{Y _{i}^{{\ast}} }(1 - q_{i})^{1-Y _{i}^{{\ast}} }.}$$
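The true conditional mean (21) under the mixture-probit-normal model can be sketched directly from the component formulas above; the parameter values in the usage line are hypothetical.

```python
# p_i = P^(t)(Y_i = 1 | W_i = w) in (21) under the mixture-probit-normal model.
import numpy as np
from scipy.stats import norm

def p_true(w, beta0, beta1, rho, mu_x, sig_x2, alpha, mu, sig2, sig_u2):
    """mu_x, sig_x2 parameterize the two-component X-mixture; mu, sig2 the
    two-component probit-link mixture. All are length-2 sequences."""
    num = den = 0.0
    for l in range(2):
        wl = rho if l == 0 else 1 - rho
        s2 = sig_u2 + sig_x2[l]
        e = norm.pdf(w, loc=mu_x[l], scale=np.sqrt(s2))      # e_{l,i}
        m = (sig_x2[l] * w + sig_u2 * mu_x[l]) / s2
        v = beta1**2 * sig_u2 * sig_x2[l] / s2
        p = 0.0
        for k in range(2):
            wk = alpha if k == 0 else 1 - alpha
            p += wk * norm.cdf((beta0 - mu[k] + beta1 * m) / np.sqrt(sig2[k] + v))
        num += wl * e * p                                    # rho * e_{1i} * p_{1i} + ...
        den += wl * e
    return num / den

p_example = p_true(0.7, 0.2, 0.5, 0.7, [0.0, 1.0], [1.0, 0.5],
                   0.6, [0.0, 0.3], [1.0, 1.5], 0.25)
```

When the mixtures degenerate (ρ = 1, α = 1, μ_1 = 0, σ_1 = 1), this reduces to the assumed-model mean Φ{h_i(β)} from (8), which is a convenient correctness check.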

Appendix 2: Limiting Maximum Likelihood Estimators When β 1 = 0

When β 1 = 0, the limiting MLEs of β are given in the following proposition.

Proposition 1.

Suppose that the true primary model is a GLM with a mixture probit link and β_1 = 0. Under the assumed probit-normal model, β_c = β_m = (β_{0m}, 0)^t, where

$$\displaystyle{ \beta _{0m} =\varPhi ^{-1}\left \{\alpha \varPhi \left (\frac{\beta _{0} -\mu _{1}} {\sigma _{1}} \right ) + (1-\alpha )\varPhi \left (\frac{\beta _{0} -\mu _{2}} {\sigma _{2}} \right )\right \}. }$$
(22)

The proof, given next, does not depend on the true X-model or the reclassification model. Proposition 1 indicates that, if β_1 = 0, β_m does not depend on σ_u^2, so RM cannot detect either misspecification. Also, β_c does not depend on π_i, which defeats the purpose of creating reclassified data, hence RC does not help in model diagnosis either. This implication should not raise much concern because, after all, in this case β_{1m} = β_{1c} = β_1 (= 0), so the MLEs of β_1 remain consistent despite model misspecification.

Proof.

By the uniqueness of the solution to (14), it suffices to check that β_m = (β_{0m}, 0)^t solves (16)–(17), where β_{0m} is given in (22).

Because β 1 = 0,

$$\displaystyle\begin{array}{rcl} p_{i}& =& P^{(t)}(Y _{ i} = 1\vert W_{i}) \\ & =& \frac{f^{(t)}(Y _{i} = 1,W_{i};\varOmega ^{(t)},\sigma _{u}^{2})} {f_{\mbox{ $W$}}^{(t)}(W_{i};\tau,\sigma _{u}^{2})} \\ & =& \frac{\int P^{(t)}(Y _{i} = 1\vert x;\beta )f_{\mbox{ $W\vert X$}}^{(t)}(W_{i}\vert x;\sigma _{u}^{2})f_{\mbox{ $X$}}^{(t)}(x;\tau )dx} {f_{\mbox{ $W$}}^{(t)}(W_{i};\tau,\sigma _{u}^{2})} \\ & & [\text{Note that}\ P^{(t)}(Y _{ i} = 1\vert x;\beta )\ \mbox{ is free of}\ x\ \mbox{ when}\ \beta _{1} = 0.] \\ & =& \frac{P^{(t)}(Y _{i} = 1\vert x;\beta )f_{\mbox{ $W$}}^{(t)}(W_{i};\tau,\sigma _{u}^{2})} {f_{\mbox{ $W$}}^{(t)}(W_{i};\tau,\sigma _{u}^{2})} \\ & =& \alpha \varPhi \left (\frac{\beta _{0} -\mu _{1}} {\sigma _{1}} \right ) + (1-\alpha )\varPhi \left (\frac{\beta _{0} -\mu _{2}} {\sigma _{2}} \right ). {}\end{array}$$
(23)

Suppose for now that β_{1m} = 0; then, by (8), h_i(β_m) = β_{0m}. With both h_i(β_m) and p_i in (23) free of W_i, (16) reduces to p_i − Φ{h_i(β_m)} = 0, that is, Φ(β_{0m}) = p_i. Therefore, β_{0m} = Φ^{−1}(p_i), which proves (22). And with p_i − Φ{h_i(β_m)} = 0, (17) holds automatically. This completes the proof of the result regarding β_m.

Next we show that the β_m established above also solves (18)–(19), that is, β_c = β_m. Suppose β_{1c} = 0; then h_i(β_c) = β_{0c} and d_i(β_c) = (1 − π_i)Φ(β_{0c}) + π_i Φ(−β_{0c}). Inside (18), with q_i = π_i p_i + (1 − π_i)(1 − p_i) and this d_i(β_c), one has 1 − d_i(β_c) − q_i = (1 − 2π_i){p_i − Φ(β_{0c})}. Therefore, if β_{0c} = Φ^{−1}(p_i), then 1 − d_i(β_c) − q_i = 0 and (18) holds for all π_i. Furthermore, 1 − d_i(β_c) − q_i = 0 immediately makes (19) hold. This shows that β_c = β_m.

This completes the proof for Proposition 1. □ 
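Proposition 1 is easy to confirm numerically: compute β_{0m} from (22), and check that the score integrand p_i − Φ{h_i(β_m)} in (16)–(17) vanishes pointwise (with β_{1m} = 0, h_i(β_m) = β_{0m} for every w). The parameter values below are hypothetical.

```python
# Numerical check of Proposition 1.
import numpy as np
from scipy.stats import norm

alpha, beta0 = 0.3, 0.8
mu, sig = np.array([-1.0, 0.5]), np.array([1.0, 2.0])

# p_i in (23): constant in W_i when beta_1 = 0.
p = alpha * norm.cdf((beta0 - mu[0]) / sig[0]) \
    + (1 - alpha) * norm.cdf((beta0 - mu[1]) / sig[1])

beta0m = norm.ppf(p)          # (22): beta_0m = Phi^{-1}(p)

# With beta_1m = 0, h_i(beta_m) = beta0m for every w, so the score
# integrand p_i - Phi{h_i(beta_m)} vanishes pointwise:
residual = p - norm.cdf(beta0m)
```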

Appendix 3: Proof of Proposition 3.1

The following four results are crucial for proving Proposition 3.1. For clarity, we make explicit the dependence of h_i(β) in (8) on W_i by re-expressing this function as h(β_0, β_1, w), with the subscript i suppressed.

  • (R1) If μ x  = 0, then h(−β 0m , β 1m , −w) = −h(β 0m , β 1m , w).

  • (R2) If μ x  = 0, then \(\phi \left \{h(-\beta _{0m},\beta _{1m},-w)\right \} = C\phi \left \{h(\beta _{0m},\beta _{1m},w)\right \}\), where C does not depend on w.

  • (R3) If f 1(x) = f 2(−x) and f U(u) = f U(−u), then f W (1)(w) = f W (2)(−w), where f U(u) is the pdf of the measurement error U, f W (1)(w) and f W (2)(w) are the pdf of W when the pdf of X is f 1(x) and f 2(x), respectively.

  • (R4) If f 1(x) = f 2(−x), f U(u) = f U(−u), H 1(s) = 1 − H 2(−s), μ x  = 0, and β 0 = 0, then p (22)(−w) = 1 − p (11)(w), where p (jk)(w) denotes the conditional mean of Y i given W i  = w under the true model \(f_{j}(x) \curlywedge H_{k}(s)\), for j, k = 1, 2.

The first two results, (R1) and (R2), follow directly from the definition of h i (β) in (8); (R3) can be easily proved by using the convolution formula based on the error model given in Eq. (2) in the main article. The proof for (R4) is given next.

Proof.

By the definition of p (jk)(w), one has, with β 0 = 0,

$$\displaystyle{p^{(11)}(w) = P^{(t)}(Y _{ i} = 1\vert W_{i} = w) =\int _{ -\infty }^{\infty }H_{ 1}(\beta _{1}x)f_{\mbox{ $U$}}(w-x)f_{1}(x)dx/f_{\mbox{ $W$}}^{(1)}(w).}$$

Similarly, p (22)(−w) is equal to

$$\displaystyle\begin{array}{rcl} & & \hspace{-4.0pt}\int _{-\infty }^{\infty }H_{ 2}(\beta _{1}x)f_{\mbox{ $U$}}(-w - x)f_{2}(x)dx/f_{\mbox{ $W$}}^{(2)}(-w) {}\\ & & =\int _{ -\infty }^{\infty }\{1 - H_{ 1}(-\beta _{1}x)\}f_{\mbox{ $U$}}(-w - x)f_{1}(-x)dx/f_{\mbox{ $W$}}^{(1)}(w),\text{ by (R3),} {}\\ & & =\int _{ -\infty }^{\infty }f_{\mbox{ $ U$}}(-w - x)f_{1}(-x)dx/f_{\mbox{ $W$}}^{(1)}(w) -\int _{ -\infty }^{\infty }H_{ 1}(-\beta _{1}x)f_{\mbox{ $U$}}(-w - x)f_{1}(-x)dx/f_{\mbox{ $W$}}^{(1)}(w) {}\\ & & =\int _{ -\infty }^{\infty }f_{\mbox{ $ U$}}(-w + s)f_{1}(s)ds/f_{\mbox{ $W$}}^{(1)}(w) -\int _{ -\infty }^{\infty }H_{ 1}(\beta _{1}s)f_{\mbox{ $U$}}(-w + s)f_{1}(s)ds/f_{\mbox{ $W$}}^{(1)}(w) {}\\ & & =\int _{ -\infty }^{\infty }f_{\mbox{ $ U$}}(w - s)f_{1}(s)ds/f_{\mbox{ $W$}}^{(1)}(w) -\int _{ -\infty }^{\infty }H_{ 1}(\beta _{1}s)f_{\mbox{ $U$}}(w - s)f_{1}(s)ds/f_{\mbox{ $W$}}^{(1)}(w) {}\\ & & = 1 - p^{(11)}(w). {}\\ \end{array}$$

This completes the proof of (R4).
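A quick numerical sanity check of (R4) can be run under one concrete choice satisfying its hypotheses; the values below are hypothetical: f_1 = f_2 = standard normal (so f_1(x) = f_2(−x)), H_1(s) = Φ(s − a) and H_2(s) = Φ(s + a) (so H_1(s) = 1 − H_2(−s)), normal U, and β_0 = 0.

```python
# Quadrature check of (R4): p^(22)(-w) = 1 - p^(11)(w).
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

a, beta1, sig_u = 0.6, 1.3, 0.5

def p_jk(w, H, f_x):
    """Conditional mean of Y given W = w under true link H and X-pdf f_x."""
    num = quad(lambda x: H(beta1 * x) * norm.pdf(w - x, scale=sig_u) * f_x(x), -10, 10)[0]
    den = quad(lambda x: norm.pdf(w - x, scale=sig_u) * f_x(x), -10, 10)[0]
    return num / den

H1 = lambda s: norm.cdf(s - a)      # H_1(s)
H2 = lambda s: norm.cdf(s + a)      # H_2(s) = 1 - H_1(-s)
f = norm.pdf                        # f_1 = f_2: standard normal, symmetric about 0

w = 0.9
lhs = p_jk(-w, H2, f)
rhs = 1.0 - p_jk(w, H1, f)
```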

Now we are ready to show Proposition 3.1. In essence, we will show that, if (β_{0m}, β_{1m}) solves (16)–(17) when the true model is \(f_{1}(x) \curlywedge H_{1}(s)\), then (−β_{0m}, β_{1m}) solves (16)–(17) when the true model is \(f_{2}(x) \curlywedge H_{2}(s)\). More specifically, evaluating (16) and (17) at their solution under the true model \(f_{1}(x) \curlywedge H_{1}(s)\), we will show that the following two equations,

$$\displaystyle\begin{array}{rcl} \int _{-\infty }^{\infty }\phi \{h(\beta _{ 0m},\beta _{1m},w)\} \frac{p^{(11)}(w) -\varPhi \left \{h(\beta _{0m},\beta _{1m},w)\right \}} {\varPhi \left \{h(\beta _{0m},\beta _{1m},w)\right \}\varPhi \left \{-h(\beta _{0m},\beta _{1m},w)\right \}}f_{\mbox{ $W$}}^{(1)}(w)dw& =& 0, \\ & & {}\end{array}$$
(24)
$$\displaystyle\begin{array}{rcl} \int _{-\infty }^{\infty }w\phi \{h(\beta _{ 0m},\beta _{1m},w)\} \frac{p^{(11)}(w) -\varPhi \left \{h(\beta _{0m},\beta _{1m},w)\right \}} {\varPhi \left \{h(\beta _{0m},\beta _{1m},w)\right \}\varPhi \left \{-h(\beta _{0m},\beta _{1m},w)\right \}}f_{\mbox{ $W$}}^{(1)}(w)dw& =& 0, \\ & & {}\end{array}$$
(25)

imply the following two identities,

$$\displaystyle\begin{array}{rcl} & & \int _{-\infty }^{\infty }\phi \{h(-\beta _{ 0m},\beta _{1m},w)\} \frac{p^{(22)}(w) -\varPhi \left \{h(-\beta _{0m},\beta _{1m},w)\right \}} {\varPhi \left \{h(-\beta _{0m},\beta _{1m},w)\right \}\varPhi \left \{-h(-\beta _{0m},\beta _{1m},w)\right \}} \\ & & \qquad \times \; f_{\mbox{ $W$}}^{(2)}(w)dw = 0, {}\end{array}$$
(26)
$$\displaystyle\begin{array}{rcl} & & \int _{-\infty }^{\infty }w\phi \{h(-\beta _{ 0m},\beta _{1m},w)\} \frac{p^{(22)}(w) -\varPhi \left \{h(-\beta _{0m},\beta _{1m},w)\right \}} {\varPhi \left \{h(-\beta _{0m},\beta _{1m},w)\right \}\varPhi \left \{-h(-\beta _{0m},\beta _{1m},w)\right \}} \\ & & \qquad \times \; f_{\mbox{ $W$}}^{(2)}(w)dw = 0. {}\end{array}$$
(27)

Take (27) as an example: by (R1)–(R4) and Φ(−t) = 1 − Φ(t), its left-hand side is equal to

$$\displaystyle\begin{array}{rcl} & & \int _{-\infty }^{\infty }(-v)\phi \{h(-\beta _{ 0m},\beta _{1m},-v)\} \frac{p^{(22)}(-v) -\varPhi \left \{h(-\beta _{0m},\beta _{1m},-v)\right \}} {\varPhi \left \{h(-\beta _{0m},\beta _{1m},-v)\right \}\varPhi \left \{-h(-\beta _{0m},\beta _{1m},-v)\right \}} {}\\ & & \quad \times \; f_{\mbox{ $W$}}^{(2)}(-v)dv {}\\ & & = -C\int _{-\infty }^{\infty }v\phi \{h(\beta _{ 0m},\beta _{1m},v)\}\frac{1 - p^{(11)}(v) -\varPhi \left \{-h(\beta _{0m},\beta _{1m},v)\right \}} {\varPhi \left \{-h(\beta _{0m},\beta _{1m},v)\right \}\varPhi \left \{h(\beta _{0m},\beta _{1m},v)\right \}} f_{\mbox{ $W$}}^{(1)}(v)dv {}\\ & & = -C\int _{-\infty }^{\infty }v\phi \{h(\beta _{ 0m},\beta _{1m},v)\}\frac{1 - p^{(11)}(v) - 1 +\varPhi \left \{h(\beta _{0m},\beta _{1m},v)\right \}} {\varPhi \left \{h(\beta _{0m},\beta _{1m},v)\right \}\varPhi \left \{-h(\beta _{0m},\beta _{1m},v)\right \}} f_{\mbox{ $W$}}^{(1)}(v)dv {}\\ & & = C\int _{-\infty }^{\infty }v\phi \{h(\beta _{ 0m},\beta _{1m},v)\} \frac{p^{(11)}(v) -\varPhi \left \{h(\beta _{0m},\beta _{1m},v)\right \}} {\varPhi \left \{h(\beta _{0m},\beta _{1m},v)\right \}\varPhi \left \{-h(\beta _{0m},\beta _{1m},v)\right \}}f_{\mbox{ $W$}}^{(1)}(v)dv {}\\ & & = 0,\text{ according to}\ (\mbox{ 25}). {}\\ \end{array}$$

Following similar derivations, one can show that the left-hand side of (26) is equal to

$$\displaystyle{-C\int _{-\infty }^{\infty }\phi \{h(\beta _{ 0m},\beta _{1m},v)\} \frac{p^{(11)}(v) -\varPhi \left \{h(\beta _{0m},\beta _{1m},v)\right \}} {\varPhi \left \{h(\beta _{0m},\beta _{1m},v)\right \}\varPhi \left \{-h(\beta _{0m},\beta _{1m},v)\right \}}f_{\mbox{ $W$}}^{(1)}(v)dv,}$$

which is also equal to 0, according to (24). Therefore, β_{0m}^{(11)} = −β_{0m}^{(22)} and β_{1m}^{(11)} = β_{1m}^{(22)}. This completes the proof of Proposition 3.1.

Appendix 4: A Counterpart Proposition of Proposition 3.1 for β c

Proposition 2.

Let f 1 (x) and f 2 (x) be two pdf’s specifying two true X-distributions that are symmetric of each other, and let H 1 (s) and H 2 (s) be two true links that are symmetric of each other. Denote by β c (jk) the limiting MLE of β based on reclassified data generated according to P(Y i = Y i |W i ) = π(W i ) when the true model is \(f_{j}(x) \curlywedge H_{k}(s)\) , for j,k = 1,2. If μ x = β 0 = 0 and π(t) is an even function or π(t) satisfies π(−t) = 1 −π(t), then β 0c (11) = −β 0c (22) and β 1c (11) = β 1c (22).

In this appendix we elaborate the proof for the case where π(t) is an even function. Two lemmas are needed: one concerns d_i(β) defined in (13), and the other relates to q_i defined in (20). To make explicit the dependence of d_i(β) on W_i in (13), we re-express this function as d(β_0, β_1, w), with the subscript i suppressed.

Lemma 1.

If μ x = 0 and π(t) is an even function, then d(−β 0c 1c ,−w) = 1 − d(β 0c 1c ,w).

Proof.

By (13),

$$\displaystyle\begin{array}{rcl} & & \hspace{-20.0pt}d(-\beta _{0c},\beta _{1c},-w) {}\\ & =& \left \{1 -\pi (-w)\right \}\varPhi \left \{h(-\beta _{0c},\beta _{1c},-w)\right \} +\pi (-w)\varPhi \left \{-h(-\beta _{0c},\beta _{1c},-w)\right \} {}\\ & =& \left \{1 -\pi (w)\right \}\varPhi \left \{-h(\beta _{0c},\beta _{1c},w)\right \} +\pi (w)\varPhi \left \{h(\beta _{0c},\beta _{1c},w)\right \} {}\\ & & [\text{Next use (R1) and the fact that}\pi (t) =\pi (-t).] {}\\ & =& \left \{1 -\pi (w)\right \}\left [1 -\varPhi \left \{h(\beta _{0c},\beta _{1c},w)\right \}\right ] +\pi (w)\left [1 -\varPhi \left \{-h(\beta _{0c},\beta _{1c},w)\right \}\right ] {}\\ & =& 1 - d(\beta _{0c},\beta _{1c},w). {}\\ \end{array}$$

This completes the proof of Lemma 1. □ 
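Lemma 1 can also be confirmed numerically under a concrete (hypothetical) configuration: μ_x = 0, an even π(t), and arbitrary σ_x^2, σ_u^2, β, and w.

```python
# Numerical check of Lemma 1: d(-b0, b1, -w) = 1 - d(b0, b1, w).
import numpy as np
from scipy.stats import norm

def h(w, b0, b1, sig_x2, sig_u2, mu_x=0.0):
    """h(beta_0, beta_1, w) in (8)."""
    s2 = sig_u2 + sig_x2
    return (b0 + b1 * (sig_x2 * w + sig_u2 * mu_x) / s2) \
        / np.sqrt(1 + b1**2 * sig_u2 * sig_x2 / s2)

def d(b0, b1, w, pi):
    """d(beta_0, beta_1, w) in (13) with reclassification probability pi(w)."""
    hv = h(w, b0, b1, 1.0, 0.25)     # hypothetical sig_x2 = 1, sig_u2 = 0.25
    return (1 - pi(w)) * norm.cdf(hv) + pi(w) * norm.cdf(-hv)

pi_even = lambda w: 0.2 + 0.1 * np.cos(w)   # an even pi(t), valued in (0, 1)
b0, b1, w = 0.4, 0.9, 1.3
gap = d(-b0, b1, -w, pi_even) - (1 - d(b0, b1, w, pi_even))
```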

Lemma 2.

If f 1 (x) = f 2 (−x), f U (u) = f U (−u), H 1 (s) = 1 − H 2 (−s), μ x = 0, β 0 = 0, and π(t) is an even function, then q (22) (−w) = 1 − q (11) (w), where q (jk) (w) denotes the conditional mean of Y i given W i = w under the true model \(f_{j}(x) \curlywedge H_{k}(s)\) , for j,k = 1,2.

Proof.

By (20),

$$\displaystyle\begin{array}{rcl} & & \hspace{-20.0pt}q^{(22) } (-w) {}\\ & =& \left \{1 -\pi (-w)\right \}\left \{1 - p^{(22)}(-w)\right \} +\pi (-w)p^{(22)}(-w) {}\\ & =& \left \{1 -\pi (w)\right \}p^{(11)}(w) +\pi (w)\left \{1 - p^{(11)}(w)\right \},\text{ by (R4) and}\ \pi (-t) =\pi (t), {}\\ & =& 1 - q^{(11)}(w). {}\\ \end{array}$$

This completes the proof of Lemma 2. □ Following similar derivations, one can show that q^{(12)}(−w) = 1 − q^{(21)}(w).

If, instead of being an even function, π(t) satisfies π(−t) = 1 −π(t), then the conclusion in Lemma 1 becomes d(−β 0c , β 1c , −w) = d(β 0c , β 1c , w), and the conclusion in Lemma 2 changes to q (22)(−w) = q (11)(w).

Now we are ready to show that, if (β 0c , β 1c ) solves (18)–(19) under the true model \(f_{1}(x) \curlywedge H_{1}(s)\), then (−β 0c , β 1c ) solves (18)–(19) under the true model f 2(x) ⋏ H 2(s). Given that (β 0c , β 1c ) solves (18) and (19) under the true model \(f_{1}(x) \curlywedge H_{1}(s)\), one has, by elaborating (18) and (19),

$$\displaystyle\begin{array}{rcl} & & \int _{-\infty }^{\infty } \frac{\phi \left \{h(\beta _{0c},\beta _{1c},w)\right \}} {d(\beta _{0c},\beta _{1c},w)\left \{1 - d(\beta _{0c},\beta _{1c},w)\right \}}\left \{1 - 2\pi (w)\right \} \\ & & \quad \left \{1 - q^{(11)}(w) - d(\beta _{ 0c},\beta _{1c},w)\right \}f_{\mbox{ $W$}}^{(1)}(w)dw = 0,{}\end{array}$$
(28)
$$\displaystyle\begin{array}{rcl} & & \int _{-\infty }^{\infty }w \frac{\phi \left \{h(\beta _{0c},\beta _{1c},w)\right \}} {d(\beta _{0c},\beta _{1c},w)\left \{1 - d(\beta _{0c},\beta _{1c},w)\right \}}\left \{1 - 2\pi (w)\right \} \\ & & \quad \left \{1 - q^{(11)}(w) - d(\beta _{ 0c},\beta _{1c},w)\right \}f_{\mbox{ $W$}}^{(1)}(w)dw = 0.{}\end{array}$$
(29)

Now we check if (−β 0c , β 1c ) solves (18)–(19) under the true model \(f_{2}(x) \curlywedge H_{2}(s)\). Plugging (−β 0c , β 1c ) in (18) gives, where we set v = −w in the first equality,

$$\displaystyle\begin{array}{rcl} & & \hspace{-22.0pt}\int _{-\infty }^{\infty } \frac{\phi \left \{h(-\beta _{0c},\beta _{1c},w)\right \}} {d(-\beta _{0c},\beta _{1c},w)\left \{1 - d(-\beta _{0c},\beta _{1c},w)\right \}}\left \{1 - 2\pi (w)\right \} {}\\ & & \left \{1 - q^{(22)}(w) - d(-\beta _{ 0c},\beta _{1c},w)\right \}f_{\mbox{ $W$}}^{(2)}(w)dw {}\\ & =& \int _{-\infty }^{\infty } \frac{\phi \left \{h(-\beta _{0c},\beta _{1c},-v)\right \}} {d(-\beta _{0c},\beta _{1c},-v)\left \{1 - d(-\beta _{0c},\beta _{1c},-v)\right \}}\left \{1 - 2\pi (-v)\right \} {}\\ & & \left \{1 - q^{(22)}(-v) - d(-\beta _{ 0c},\beta _{1c},-v)\right \}f_{\mbox{ $W$}}^{(2)}(-v)dv {}\\ & & \text{[Next use (R1)}\textendash\text{ (R3), Lemmas 1, 2, and}\ \pi (t) =\pi (-t).] {}\\ & =& \int _{-\infty }^{\infty } \frac{C\phi \left \{h(\beta _{0c},\beta _{1c},v)\right \}} {\left \{1 - d(\beta _{0c},\beta _{1c},v)\right \}d(\beta _{0c},\beta _{1c},v)}\left \{1 - 2\pi (v)\right \} {}\\ & & \left \{-1 + q^{(11)}(v) + d(\beta _{ 0c},\beta _{1c},v)\right \}f_{\mbox{ $W$}}^{(1)}(v)dv {}\\ & =& -C\int _{-\infty }^{\infty } \frac{\phi \left \{h(\beta _{0c},\beta _{1c},v)\right \}} {d(\beta _{0c},\beta _{1c},v)\left \{1 - d(\beta _{0c},\beta _{1c},v)\right \}}\left \{1 - 2\pi (v)\right \} {}\\ & & \left \{1 - q^{(11)}(v) - d(\beta _{ 0c},\beta _{1c},v)\right \}f_{\mbox{ $W$}}^{(1)}(v)dv {}\\ & =& 0,\text{ by}\ (\mbox{ 28}). {}\\ \end{array}$$

Similarly, one can show that (29) implies

$$\displaystyle\begin{array}{rcl} & & \int _{-\infty }^{\infty }w \frac{\phi \left \{h(-\beta _{0c},\beta _{1c},w)\right \}} {d(-\beta _{0c},\beta _{1c},w)\left \{1 - d(-\beta _{0c},\beta _{1c},w)\right \}}\left \{1 - 2\pi (w)\right \} {}\\ & & \quad \left \{1 - q^{(22)}(w) - d(-\beta _{ 0c},\beta _{1c},w)\right \}f_{\mbox{ $W$}}^{(2)}(w)dw = 0. {}\\ \end{array}$$

Hence, (−β 0c , β 1c ) does solve (18)–(19) under the true model \(f_{2}(x) \curlywedge H_{2}(s)\). In other words, β 0c (11) = −β 0c (22) and β 1c (11) = β 1c (22). Following parallel arguments as above one can show that β 0c (12) = −β 0c (21) and β 1c (12) = β 1c (21). This completes the proof of Proposition 2. □ 

Appendix 5: Additional Simulation Results from Sect. 4

When the assumed model is not probit-normal or the true model is not in the class of mixture-probit-normal, analytic exploration of the kind elaborated in Appendices 1–4, leading to the properties of the limiting MLEs β_m and β_c, becomes infeasible. To provide empirical justification of these results, such as those summarized in Proposition 1 and (M1) in Sect. 3.3, both under and outside this assumed/true-model configuration, Table 3 presents Monte Carlo averages of \(\hat{\beta }_{m}\) and \(\hat{\beta }_{c}\) obtained under some simulation settings considered or mentioned in Sect. 4. When computing \(\hat{\beta }_{c}\), we consider two forms of π(t) in the reclassification model P(Y_{b,i} = Y_i | W_i) = π(W_i). One is used in Sect. 4, namely P(Y_{b,i} = Y_i | W_i) = Φ(W_i), and the other is P(Y_{b,i} = Y_i | W_i) = 0.2. The former π(t) satisfies the condition π(−t) = 1 − π(t), and the latter is an even function, providing two examples satisfying the conditions on π(t) in Proposition 2.

Table 3 Averages of maximum likelihood estimates of β across 1000 Monte Carlo replicates under different true models

Table 4 provides rejection rates across 1000 Monte Carlo replicates when data are generated from four true models in the class of generalized-logit-normal and the assumed model is logit-normal. Overall the operating characteristics of all considered tests are very similar to those when the assumed model is probit-normal (see the lower half of Table 1). Indeed, from a practical point of view when it comes to model diagnosis, it should not matter whether one assumes probit-normal or logit-normal. If one concludes existence of model misspecification under one assumed model, certainly one should not believe in the other assumed model. If one concludes lack of sufficient evidence of model misspecification under one assumed model, the other assumed model is clearly equally plausible. After all, the logit link and the probit link are virtually indistinguishable in most inference contexts (Chambers and Cox, 1967).
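The near-indistinguishability of the logit and probit links noted above can be seen numerically: with the classical scaling constant 1.702, the rescaled logistic cdf stays within roughly 0.01 of the probit cdf everywhere. This is a well-known approximation, sketched below purely for illustration.

```python
# How close is a rescaled logistic cdf to the probit cdf?
import numpy as np
from scipy.stats import norm

x = np.linspace(-6, 6, 2001)
logit_cdf = 1.0 / (1.0 + np.exp(-1.702 * x))   # logistic cdf, classical scaling 1.702
dmax = np.max(np.abs(logit_cdf - norm.cdf(x)))  # maximum absolute discrepancy
```

The small value of `dmax` is the numerical content of the Chambers and Cox (1967) point that the two links are hard to discriminate in practice.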

Table 4 Rejection rates across 1000 Monte Carlo replicates of each test statistic under each testing procedure considered in Sect. 4 at different levels of reliability ratio ω when the assumed model is logit-normal

Copyright information

© 2016 Springer International Publishing Switzerland

Cite this paper

Huang, X. (2016). Dual Model Misspecification in Generalized Linear Models with Error in Variables. In: Jin, Z., Liu, M., Luo, X. (eds.) New Developments in Statistical Modeling, Inference and Application. ICSA Book Series in Statistics. Springer, Cham. https://doi.org/10.1007/978-3-319-42571-9_1