On the use of repeated measurement errors in linear regression models

Abstract

In a linear mean regression setting with repeated measurement errors, we develop asymptotic properties of a naive estimator to better clarify the effects of these errors. We then construct a group of unbiased estimating equations with independent repetitions and make use of these equations in two ways to obtain two estimators: a weighted averaging estimator and an estimator based on the generalized method of moments. The proposed estimators do not require any additional information about the measurement errors. We also prove the consistency and asymptotic normality of the two estimators. Our theoretical results are verified by simulation studies and a real data analysis.


References

  • Barry VW, McClain AC, Shuger S, Sui X, Hardin JW, Hand GA, Wilcox S, Blair SN (2011) Using a technology-based intervention to promote weight loss in sedentary overweight or obese adults: a randomized controlled trial study design. Diabetes Metabolic Syndrome Obesity Targets Therapy 4:67–77

  • Buckland ST, Burnham KP, Augustin NH (1997) Model selection: an integral part of inference. Biometrics 53(2):603–618

  • Campbell A (2002) The potential role of aluminium in Alzheimer’s disease. Nephrol Dial Transplant 17(suppl-2):17–20

  • Chan LK, Mak TK (1979) Maximum likelihood estimation of a linear structural relationship with replication. J Roy Stat Soc Ser B (Methodol) 41(2):263–268

  • Cheng CL, Van Ness JW (1999) Statistical regression with measurement error. Kendall's Library of Statistics 6. Arnold, London

  • Carroll RJ, Ruppert D, Stefanski LA (1995) Nonlinear measurement error models. Chapman and Hall, London

  • Carroll RJ, Ruppert D, Stefanski LA, Crainiceanu CM (2006) Measurement error in nonlinear models: a modern perspective, 2nd edn. Chapman and Hall, London

  • Carroll RJ, Stefanski LA (1990) Approximate quasi-likelihood estimation in models with surrogate predictors. J Am Stat Assoc 85(411):652–663

  • Chen X, Jacho-Chávez DT, Linton O (2016) Averaging of an increasing number of moment condition estimators. Econ Theory 32(1):30–70

  • Cook JR, Stefanski LA (1994) Simulation-extrapolation estimation in parametric measurement error models. J Am Stat Assoc 89(428):1314–1328

  • Da Silva DN, Skinner C, Kim JK (2016) Using binary paradata to correct for measurement error in survey data analysis. J Am Stat Assoc 111(514):526–537

  • Davies B, Mutton B (1975) The effect of errors in the independent variables in linear regression. Biometrika 62(2):383–391

  • Freedman LS, Fainberg V, Kipnis V, Midthune D, Carroll RJ (2004) A new method for dealing with measurement error in explanatory variables of regression models. Biometrics 60(1):172–181

  • Fuller WA (1987) Measurement error models. Wiley, New York

  • Goldstein H (1979) Some models for analysing longitudinal data on educational attainment. J R Stat Soc Ser A (General) 142(4):407–442

  • Gleser LJ (1981) Estimation in a multivariate “errors in variables” regression model: large sample results. Ann Stat 9(1):24–44

  • Gleser LJ (1992) The importance of assessing measurement reliability in multivariate regression. J Am Stat Assoc 87(419):696–707

  • Hodges SD, Moore PG (1972) Data uncertainties and least squares regression. Appl Stat 21(2):185–195

  • Isogawa Y (1985) Estimating a multivariate linear structural relationship with replication. J Roy Stat Soc Ser B (Methodol) 47(2):211–215

  • Liang H, Zou G, Wan AT, Zhang X (2011) Optimal weight choice for frequentist model average estimators. J Am Stat Assoc 106(495):1053–1066

  • Lin JG, Cao CZ (2013) On estimation of measurement error models with replication under heavy-tailed distributions. Comput Stat 28(2):809–829

  • Newey WK, McFadden D (1994) Large sample estimation and hypothesis testing. Handbook Econ 4:2111–2245

  • Qin G, Zhang J, Zhu Z (2016) Simultaneous mean and covariance estimation of partially linear models for longitudinal data with missing responses and covariate measurement error. Comput Stat Data Anal 96:24–39

  • Rothenberg TJ (1973) Efficient estimation with a priori information. Yale University Press, New Haven

  • Stefanski LA, Carroll RJ (1987) Conditional scores and optimal scores for generalized linear measurement-error models. Biometrika 74(4):703–716

  • Stefanski LA (2000) Measurement error models. J Am Stat Assoc 95(452):1353–1358

  • Schafer DW, Purdy KG (1996) Likelihood analysis for errors-in-variables regression with replicate measurements. Biometrika 83(4):813–824

  • Stewart GW (1990) Stochastic perturbation theory. SIAM Rev 32(4):579–610

  • Thompson JR, Carter RL (2007) An overview of normal theory structural measurement error models. Int Stat Rev 75(2):183–198

  • Whittemore AS (1989) Errors-in-variables regression using Stein estimates. Am Stat 43(4):226–228

  • Yi GY, Ma Y, Carroll RJ (2012) A functional generalized method of moments approach for longitudinal studies with missing responses and covariate measurement error. Biometrika 99(1):151–165

  • Cao CZ, Meng XN, Wang YH (2016) Partially contaminated measurement error model with replications and its applications. Stat Decis 14:81–84 (in Chinese)


Acknowledgements

We thank the editor and the referee very much for their helpful comments and suggestions, which greatly improved our manuscript. Bai's work was supported by the Natural Science Foundation of China (11771268). Zhang's work was supported by the Graduate Innovation Foundation of Shanghai University of Finance and Economics of China (Grant No. CXJJ-2017-437).

Author information

Corresponding author

Correspondence to Yang Bai.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

A.1 Proof of Lemma 1

Let \(B\triangleq X^\mathrm {T}X\) and \(D\triangleq X^\mathrm {T}\Delta +\Delta ^\mathrm {T}X+\Delta ^\mathrm {T}\Delta \). Then we have

$$\begin{aligned} (W^\mathrm {T}W)^{-1}=&(X^\mathrm {T}X+X^\mathrm {T}\Delta +\Delta ^\mathrm {T}X+\Delta ^\mathrm {T}\Delta )^{-1}\triangleq (B+D)^{-1}\\ =&B^{-1}-B^{-1}DB^{-1}+B^{-1}DB^{-1}DB^{-1}-\ldots \\ =&(X^\mathrm {T}X)^{-1}-(X^\mathrm {T}X)^{-1}(X^\mathrm {T}\Delta +\Delta ^\mathrm {T}X)(X^\mathrm {T}X)^{-1}\\&-(X^\mathrm {T}X)^{-1}\Delta ^\mathrm {T}\Delta (X^\mathrm {T}X)^{-1}+\ldots \end{aligned}$$

Moreover, for any matrices \(B_n\) and \(D_n\), if \(B_n=O_p(a_n)\) and \(D_n=O_p(b_n)\), then \(B_nD_n=O_p(a_nb_n)\) and \(B_n+D_n=O_p(a_n\vee b_n)\).

Thus the results follow immediately.
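As a quick numerical illustration (not part of the original argument), the following sketch compares \((W^\mathrm {T}W)^{-1}\) with its second-order expansion in \(\Delta \); the dimensions and error scale are arbitrary choices made for the example.

```python
# Sanity check of the Neumann-series expansion used in Lemma 1 (illustrative only).
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 3
X = rng.normal(size=(n, p))
Delta = 0.05 * rng.normal(size=(n, p))   # small measurement-error matrix
W = X + Delta

B = X.T @ X
D = X.T @ Delta + Delta.T @ X + Delta.T @ Delta
Binv = np.linalg.inv(B)

exact = np.linalg.inv(W.T @ W)
# (B + D)^{-1} ~ B^{-1} - B^{-1} D B^{-1} + B^{-1} D B^{-1} D B^{-1}
approx = Binv - Binv @ D @ Binv + Binv @ D @ Binv @ D @ Binv

print(np.max(np.abs(exact - approx)) / np.max(np.abs(exact)))  # close to zero
```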

A.2 Proof of Lemma 2

According to the second order expansion in Lemma 1, we have

$$\begin{aligned} \hat{\beta }_\mathrm {naive}&=(W^\mathrm {T}W)^{-1}W^\mathrm {T}Y\\&=(X^\mathrm {T}X)^{-1}X^\mathrm {T}Y+(X^\mathrm {T}X)^{-1}\Delta ^\mathrm {T}Y\\&\quad -(X^\mathrm {T}X)^{-1}(X^\mathrm {T}\Delta +\Delta ^\mathrm {T}X)(X^\mathrm {T}X)^{-1}X^\mathrm {T}Y\\&\quad -(X^\mathrm {T}X)^{-1}\Delta ^\mathrm {T}\Delta (X^\mathrm {T}X)^{-1}X^\mathrm {T}Y\\&\quad +(X^\mathrm {T}X)^{-1}(X^\mathrm {T}\Delta +\Delta ^\mathrm {T}X)(X^\mathrm {T}X)^{-1}(X^\mathrm {T}\Delta +\Delta ^\mathrm {T}X)(X^\mathrm {T}X)^{-1}X^\mathrm {T}Y\\&\quad -(X^\mathrm {T}X)^{-1}(X^\mathrm {T}\Delta +\Delta ^\mathrm {T}X)(X^\mathrm {T}X)^{-1}\Delta ^\mathrm {T}Y+o_p(1). \end{aligned}$$

Thus we obtain the expectation of \(\hat{\beta }_\mathrm {naive}\) stated in Proposition 1.
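To see the resulting bias numerically, here is a minimal Monte Carlo sketch (our own illustration, not taken from the paper), assuming the linear model \(Y=X\beta _0+\varepsilon \) with the design replaced by the average of \(s\) independently error-contaminated replicates; the parameter values are arbitrary.

```python
# Illustration of the attenuation of the naive estimator under the assumed setup (sketch).
import numpy as np

rng = np.random.default_rng(1)
n, p, s = 500, 2, 3
beta0 = np.array([1.0, -2.0])
sigma1, sigma2 = 1.0, 1.0            # model-error and measurement-error standard deviations

estimates = []
for _ in range(2000):
    X = rng.normal(size=(n, p))
    y = X @ beta0 + sigma1 * rng.normal(size=n)
    W = X + sigma2 * rng.normal(size=(s, n, p)).mean(axis=0)   # average of s replicates
    estimates.append(np.linalg.solve(W.T @ W, W.T @ y))

print(np.mean(estimates, axis=0))    # noticeably shrunk towards zero relative to beta0
```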

A.3 Proof of Lemma 3

According to the first order expansion in Lemma 1, we have

$$\begin{aligned} \hat{\beta }_\mathrm {naive}= & {} (X^\mathrm {T}X)^{-1}X^\mathrm {T}Y+(X^\mathrm {T}X)^{-1}\Delta ^\mathrm {T}Y\\&\quad -(X^\mathrm {T}X)^{-1}(X^\mathrm {T}\Delta +\Delta ^\mathrm {T}X)(X^\mathrm {T}X)^{-1}X^\mathrm {T}Y+o_p(1).\end{aligned}$$

Let \(A_0\triangleq {{\,\mathrm{E}\,}}YY^\mathrm {T}\equiv X\beta _0\beta _0^\mathrm {T}X^\mathrm {T}+\sigma _1^2I_n\). Then we have

$$\begin{aligned} {{\,\mathrm{Var}\,}}(\hat{\beta }_\mathrm {naive})&={{\,\mathrm{E}\,}}(\hat{\beta }_\mathrm {naive}\hat{\beta }_\mathrm {naive}^\mathrm {T})-\beta _0\beta _0^\mathrm {T}\\&=(X^\mathrm {T}X)^{-1}X^\mathrm {T}A_0X(X^\mathrm {T}X)^{-1}-\beta _0\beta _0^\mathrm {T}\\&\quad +\frac{\sigma _2^2}{s}{{\,\mathrm{tr}\,}}(A_0)(X^\mathrm {T}X)^{-2}-\frac{\sigma _2^2}{s}(X^\mathrm {T}X)^{-2}X^\mathrm {T}A_0^\mathrm {T}X(X^\mathrm {T}X)^{-1}\\&\quad -\frac{\sigma _2^2}{s}{{\,\mathrm{tr}\,}}(A_0X(X^\mathrm {T}X)^{-1}X^\mathrm {T})(X^\mathrm {T}X)^{-2}-\frac{\sigma _2^2}{s}(X^\mathrm {T}X)^{-1}X^\mathrm {T}A_0^\mathrm {T}X(X^\mathrm {T}X)^{-2}\\&\quad +\frac{\sigma _2^2}{s}{{\,\mathrm{tr}\,}}((X^\mathrm {T}X)^{-1}X^\mathrm {T}A_0X(X^\mathrm {T}X)^{-1})(X^\mathrm {T}X)^{-1}+\frac{\sigma _2^2}{s}(X^\mathrm {T}X)^{-1}X^\mathrm {T}A_0^\mathrm {T}X(X^\mathrm {T}X)^{-2}\\&\quad -\frac{\sigma _2^2}{s}{{\,\mathrm{tr}\,}}(X(X^\mathrm {T}X)^{-1}X^\mathrm {T}A_0)(X^\mathrm {T}X)^{-2}+\frac{\sigma _2^2}{s}(X^\mathrm {T}X)^{-2}X^\mathrm {T}A_0^\mathrm {T}X(X^\mathrm {T}X)^{-1}\\&\quad +\frac{\sigma _2^2}{s}{{\,\mathrm{tr}\,}}(X(X^\mathrm {T}X)^{-1}X^\mathrm {T}A_0X(X^\mathrm {T}X)^{-1}X^\mathrm {T})(X^\mathrm {T}X)^{-2}\\&=(X^\mathrm {T}X)^{-1}\sigma _1^2+{{\,\mathrm{tr}\,}}(\beta _0\beta _0^\mathrm {T}+\sigma _1^2(X^\mathrm {T}X)^{-1})\frac{\sigma _2^2}{s}(X^\mathrm {T}X)^{-1}\\&\quad +(n-p)\frac{\sigma _2^2}{s}\sigma _1^2(X^\mathrm {T}X)^{-2}. \end{aligned}$$

A.4 Proof of Theorem 1

As in the proof of Lemma 3, the first-order expansion in Lemma 1 gives the following expansion of \(\hat{\beta }_\mathrm {naive}\):

$$\begin{aligned} \hat{\beta }_\mathrm {naive}&=(X^\mathrm {T}X)^{-1}X^\mathrm {T}Y+(X^\mathrm {T}X)^{-1}\Delta ^\mathrm {T}Y\\&\quad -(X^\mathrm {T}X)^{-1}(X^\mathrm {T}\Delta +\Delta ^\mathrm {T}X)(X^\mathrm {T}X)^{-1}X^\mathrm {T}Y+o_p(1).\end{aligned}$$

Clearly, \((X^\mathrm {T}X)^{-1}X^\mathrm {T}Y\) is asymptotically normal.

Moreover, because the elements of \(\Delta \) are independent, \((\Delta ^\mathrm {T}Y)_i=\sum _{j=1}^{n}\Delta _{ji}y_j\) is asymptotically normal by the central limit theorem, and \({{\,\mathrm{Cov}\,}}((\Delta ^\mathrm {T}Y)_i,(\Delta ^\mathrm {T}Y)_k)=0\) by direct calculation.

Similarly, we obtain the asymptotic normality of \((X^\mathrm {T}X)^{-1}X^\mathrm {T}\Delta (X^\mathrm {T}X)^{-1}X^\mathrm {T}Y\) and \((X^\mathrm {T}X)^{-1}\Delta ^\mathrm {T}X(X^\mathrm {T}X)^{-1}X^\mathrm {T}Y\), and thus \(\sqrt{n}\hat{\beta }_\mathrm {naive}\) is asymptotically normal. The result then follows from Lemmas 2 and 3.

A.5 Proof of Theorem 2

It suffices to verify assumptions A*1–A*4 in Chen et al. (2016); the result then follows from Theorem 1 (ii) of that paper. Using Taylor's expansion, we have

$$\begin{aligned} G_{nl}(\beta )=G_{nl}(\beta _0)-\frac{1}{n}{W^j}^\mathrm {T}W^k(\beta -\beta _0).\end{aligned}$$
  1. 1.

    (A*1): \(\sup \limits _{n\ge 1}n^{\frac{1}{4}}\sum _{l=\tau +1}^{\tau (n)}\Vert W_{nl}\Vert \rightarrow 0\) holds because \(\tau =C_s^2\) is finite.

  2. 2.

    (A*2): Let \(\gamma _l\equiv \Vert \frac{1}{n}\sum _{i=1}^n x_ix_i^\mathrm {T}\Vert \) and let \(\epsilon _n\rightarrow 0\) satisfy \(\gamma _l\ge \epsilon _n>n^{-\frac{1}{4}}\). Then there exists \(\delta >0\) such that if \(\Vert \beta _0-\beta \Vert <\delta \), we have

    $$\begin{aligned} \Vert G_l(\beta )\Vert \equiv \Vert \frac{1}{n}\sum _{i=1}^nx_ix_i^\mathrm {T}\Vert \Vert \beta _0-\beta \Vert \ge \gamma _l\Vert \beta _0-\beta \Vert \end{aligned}$$

    and \(\min \limits _{l\in \mathcal {J}_n^*}\gamma _l \ge \epsilon _n>0\).

  3. 3.

    (A*3): For any \(\delta _n=o(1)\) and \(n\ge 1\), we have

    $$\begin{aligned} G_{nl}(\hat{\beta }_l)= & {} \frac{1}{n}{W^j}^\mathrm {T}Y-\frac{1}{n}{W^j}^\mathrm {T}Y=O,\\ \inf \limits _{\Vert \beta -\beta _0\Vert \le \delta _n}\Vert G_{nl}(\beta )\Vert\le & {} \inf \limits _{\Vert \beta -\beta _0\Vert \le \delta _n}(\Vert G_{nl}(\beta )-G_{nl}(\beta _0)\Vert +\Vert G_{nl}(\beta _0)\Vert ).\end{aligned}$$

    Moreover, we have

    $$\begin{aligned} \Vert G_{nl}(\beta )-G_{nl}(\beta _0)\Vert&=\Vert \frac{1}{n}{W^j}^\mathrm {T}W^k\Vert \Vert \beta -\beta _0\Vert \\&\le \Vert \frac{1}{n}{W^j}^\mathrm {T}W^k\Vert \delta _n\\&=o_p\left( \frac{1}{\sqrt{n}}\right) , \end{aligned}$$
    $$\begin{aligned} \Vert G_{nl}(\beta _0)\Vert =o_p\left( \frac{1}{\sqrt{n}}\right) ~~~(\mathrm {because~of~} {{\,\mathrm{E}\,}}G_{nl}(\beta _0)=0).\end{aligned}$$

    Thus, \(\Vert G_{nl}(\hat{\beta }_l)\Vert -\inf \limits _{\Vert \beta -\beta _0\Vert \le \delta _n}\Vert G_{nl}(\beta )\Vert =o_p(\epsilon _nn^{-1/4})\) and \(\max \limits _{l\in \mathcal {J}_n^*}(\Vert G_{nl}(\hat{\beta }_l)\Vert -\inf \limits _{\Vert \beta -\beta _0\Vert \le \delta _n} \Vert G_{nl}(\beta )\Vert )=o_p(\epsilon _nn^{-1/4})\) hold.

  4. 4.

    (A*4): For any \(\delta _n=o(1)\) and \(n\ge 1\), we have

    $$\begin{aligned} \Vert G_{nl}(\beta )-G_l(\beta )\Vert&\ \le \Vert \frac{1}{n}\sum _{i=1}^n w_i^jy_i-{{\,\mathrm{E}\,}}\left( \frac{1}{n}\sum _{i=1}^nw_i^jy_i\right) \Vert +\Vert \frac{1}{n}\sum _{i=1}^n w_i^j{w_i^k}^\mathrm {T}\beta \\&\quad -{{\,\mathrm{E}\,}}\left( \frac{1}{n}\sum _{i=1}^nw_i^j{w_i^k}^\mathrm {T}\beta \right) \Vert \\&=O_p\left( \frac{1}{\sqrt{n}}\right) \\&=o_p(\epsilon _n n^{-1/4}). \end{aligned}$$

    Then, \(\max \limits _{l\in \mathcal {J}_n^*} \sup \limits _{\Vert \beta -\beta _0\Vert \le \delta _n} \Vert G_{nl}(\beta )-\widetilde{G}_l(\beta )\Vert =o_p(\epsilon _n n^{-1/4})\) holds.
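For concreteness, the following sketch (our own illustrative implementation, with an assumed function name) builds the pairwise unbiased estimating equations \(G_{nl}(\beta )=\frac{1}{n}{W^j}^\mathrm {T}(Y-W^k\beta )\) over the \(\tau =C_s^2\) unordered replicate pairs and solves each for \(\hat{\beta }_l\); the use of unordered pairs is an assumption made for the example.

```python
# Pairwise estimating-equation estimators beta_hat_l (sketch; W_reps has shape (s, n, p)).
import itertools
import numpy as np

def pairwise_estimators(W_reps, y):
    estimates = []
    for j, k in itertools.combinations(range(W_reps.shape[0]), 2):
        Wj, Wk = W_reps[j], W_reps[k]
        # G_l(beta) = (1/n) Wj' (y - Wk beta) = 0  =>  beta_hat_l = (Wj' Wk)^{-1} Wj' y
        estimates.append(np.linalg.solve(Wj.T @ Wk, Wj.T @ y))
    return np.array(estimates)        # tau = s(s-1)/2 rows, one estimator per replicate pair
```

Because the errors in \(W^j\) and \(W^k\) are independent for \(j\ne k\), each equation has mean zero at \(\beta _0\), which is the property exploited above.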

A.6 Proof of Theorem 3

We divide the proof into two parts: the first establishes asymptotic normality, and the second derives the optimal weighting matrix and the asymptotic variance.

First part: once assumptions A*1 and B1–B7 in Chen et al. (2016) are verified, Section 5 and Theorem 2 of that paper give

$$\begin{aligned} \sqrt{n}(\hat{\beta }_\mathrm {weight}-\beta _0)\overset{D}{\rightarrow } N(0,\Sigma )\end{aligned}$$

and

$$\begin{aligned} \sqrt{n}(\hat{\beta }_l-\beta _0)\overset{D}{\rightarrow } N(0,V_{l})\end{aligned}$$

where \(\hat{\beta }_\mathrm {weight}=\sum _{l\in \mathcal {J}_n^*}W_{nl}\hat{\beta }_l\) and \(\Sigma =\lim _{n\rightarrow \infty }\sum _{l=1}^\tau \sum _{m=1}^\tau W_{nl}V_{lm}W_{nm}^\mathrm {T}\). Thus the asymptotic normality of \(\hat{\beta }_\mathrm {ma}\) follows immediately. We now verify these assumptions:

  1. 1.

    (A*1): It has been proved in Theorem 2.

  2. 2.

    (B1): B1 follows from the proof of (A*3) in Theorem 2.

  3. 3.

    (B2): We already have \(G_l(\beta )-\Gamma _l(\beta -\beta _0)=\frac{1}{n}\sum _{i=1}^nx_ix_i^\mathrm {T}(\beta -\beta _0)-\frac{1}{n}\sum _{i=1}^nx_ix_i^\mathrm {T}(\beta -\beta _0)=0\). Then, there exists a finite constant C such that for any \(\beta \) within a neighborhood of \(\beta _0\):

    $$\begin{aligned}\max \limits _{l\in \mathcal {J}_n^*} \Vert G_l(\beta )-\Gamma _l(\beta -\beta _0) \Vert \le C \Vert \beta -\beta _0 \Vert ^2\end{aligned}$$

    where \(\Gamma _l=-\frac{1}{n}X^\mathrm {T}X\) has full (column) rank.

  4. 4.

    (B3): (a) From the iid samples and the central limit theorem, we have:

    $$\begin{aligned}\Vert \sqrt{n}G_{nl}(\beta _0)\Vert =\Vert \sqrt{n}\left( \frac{1}{n}\sum _{i=1}^n w_i^jy_i-\frac{1}{n}\sum _{i=1}^n w_i^j{w_i^k}^\mathrm {T}\beta _0\right) \Vert =O_p(1).\end{aligned}$$

            (b) \(\forall \delta _n=o(n^{-1/4}),\)

    $$\begin{aligned}&\sup \limits _{\Vert \beta -\beta _0\Vert \le \delta _n} \Vert G_{nl}(\beta )-G_l(\beta )-G_{nl}(\beta _0)+G_l(\beta _0)\Vert \\&\quad =\sup \limits _{\Vert \beta -\beta _0\Vert \le \delta _n} \Vert G_{nl}(\beta )-G_l(\beta )-G_{nl}(\beta _0)\Vert \\&\quad =\sup \limits _{\Vert \beta -\beta _0\Vert \le \delta _n} \Vert -\frac{1}{n}\sum _{i=1}^n w_i^j{w_i^k}^\mathrm {T}(\beta -\beta _0)-{{\,\mathrm{E}\,}}\left( \frac{1}{n}\sum _{i=1}^nw_i^jy_i\right) +{{\,\mathrm{E}\,}}\left( \frac{1}{n}\sum _{i=1}^nw_i^j{w_i^k}^\mathrm {T}\beta \right) \Vert \\&\quad =\sup \limits _{\Vert \beta -\beta _0\Vert \le \delta _n} \Vert -\frac{1}{n}\sum _{i=1}^n w_i^j{w_i^k}^\mathrm {T}(\beta -\beta _0)-{{\,\mathrm{E}\,}}\left( \frac{1}{n}\sum _{i=1}^nw_i^j{w_i^k}^\mathrm {T}\right) (\beta -\beta _0)\Vert \\&\quad =\sup \limits _{\Vert \beta -\beta _0\Vert \le \delta _n} \Vert -\frac{1}{n}\sum _{i=1}^n w_i^j{w_i^k}^\mathrm {T}-{{\,\mathrm{E}\,}}\left( \frac{1}{n}\sum _{i=1}^nw_i^j{w_i^k}^\mathrm {T}\right) \Vert \Vert (\beta -\beta _0)\Vert \\&\quad \le \Vert -\frac{1}{n}\sum _{i=1}^n w_i^j{w_i^k}^\mathrm {T}-{{\,\mathrm{E}\,}}(w_i^j{w_i^k}^\mathrm {T})\Vert \delta _n\\&\quad =o_p\left( \frac{1}{\sqrt{n}}\right) . \end{aligned}$$

    Thus,

    $$\begin{aligned} \max \limits _{l\in \mathcal {J}_n^*} \sup \limits _{\Vert \beta -\beta _0\Vert \le \delta _n} \Vert G_{nl}(\beta )-G_l(\beta )-G_{nl}(\beta _0)+G_l(\beta _0)\Vert =o_p(1/\sqrt{n}).\end{aligned}$$
  5. 5.

    (B4): We can see that the optimal weighting matrix is nonrandom, so we may restrict \(W_{nl}\) to nonrandom matrices. Then, letting \(W_{nl}^0=W_{nl}\), we have:

    (a) \(\sum _{l\in \mathcal {J}_n^*}\Vert (W_{nl}-W_{nl}^0)\Gamma _l^{-1}\Vert =o_p(1),\)

    (b) \(\limsup _n\sum _{l\in \mathcal {J}_n^*}\Vert W_{nl}^0\Gamma _l^{-1}\Vert <\infty .\)

  6. 6.

    (B5): (a) \(V\) is obviously positive definite and finite, and so is \(V_0\).

    (b) Let \(f_n(w_i,y_i)=n^{-1/2}\sum _{l\in \mathcal {J}_n^*}c^\mathrm {T}W_{nl}^0\Gamma _l^{-1}g_l(\theta _0,w_i,y_i)\). Then \(n{{\,\mathrm{E}\,}}|f_n(w_i,y_i)|^{2+\kappa }\rightarrow 0\) for all \(c\in R^p\) and some \(\kappa >0\), owing to the existence of fourth moments of \(g_l(\theta _0,w_i,y_i)\) and the full rank of \(\Gamma _l=-\frac{1}{n}X^\mathrm {T}X\).

  7. 7.

    (B6): \(\beta _0\) is naturally in the interior of \(\Theta \).

  8. 8.

    (B7): \(\max \limits _{l\in \mathcal {J}_n^*}\Vert \hat{\beta }_l-\beta _0\Vert =o_p(n^{-1/4})\) holds.

Second part: \(\hat{\beta }_\mathrm {weight}\) is a special case of the classical minimum distance estimator, so the optimal choice of weighting is \(\Xi =V^{-1}\), which yields the optimal weighting matrices given above, \(W_{nl}^\mathrm {opt}=[R^\mathrm {T}\Xi R]^{-1}[R^\mathrm {T}\Xi ]_{l.}\).
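The optimal weighting step can be sketched as follows; we assume (as in the classical minimum distance setup, not stated explicitly here) that \(R\) stacks \(\tau \) copies of the \(p\times p\) identity and that \(V\) is an estimate of the joint asymptotic covariance of \((\hat{\beta }_1,\ldots ,\hat{\beta }_\tau )\). The function name is ours.

```python
# Optimal minimum-distance weights W_l^opt = [R' Xi R]^{-1} [R' Xi]_{l.} with Xi = V^{-1} (sketch).
import numpy as np

def optimal_weights(V, tau, p):
    Xi = np.linalg.inv(V)                       # Xi = V^{-1}
    R = np.tile(np.eye(p), (tau, 1))            # (tau*p, p) stacked identity blocks (assumed form of R)
    M = np.linalg.inv(R.T @ Xi @ R) @ (R.T @ Xi)
    return [M[:, l * p:(l + 1) * p] for l in range(tau)]   # l-th p x p block is W_l^opt

# The weighted estimator is then beta_weight = sum over l of W_l^opt @ beta_hat_l.
```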

A.7 Proof of Theorem 4

  1. 1.

    \({{\,\mathrm{E}\,}}(\varphi _i(\beta ))=i_\tau \otimes x_ix_i^\mathrm {T}(\beta _0-\beta )\). If \(\beta \ne \beta _0\) and \(W_\mathrm {gmm}{{\,\mathrm{E}\,}}(\varphi _i(\beta ))=\mathbf {0}\), then \(|W_\mathrm {gmm}|=|\bar{\Omega }|=0\). But \(|\bar{\Omega }|\ne 0\) by the form of \(\bar{\Omega }_{lm}\). So \(W_\mathrm {gmm}{{\,\mathrm{E}\,}}(\varphi _i(\beta ))=\mathbf {0}\) if and only if \(\beta =\beta _0\).

  2. 2.

    \(\Theta \) is apparently compact.

  3. 3.

    \(\varphi _i(\beta )\) is continuous in \(\beta \).

  4. 4.
    $$\begin{aligned} \begin{aligned} {{\,\mathrm{E}\,}}(\sup \limits _{\beta \in \Theta }\parallel \varphi _i(\beta )\parallel ^2)&\le C_s^2 {{\,\mathrm{E}\,}}(\sup \limits _{\beta \in \Theta }(w_i^{j^\mathrm {T}}w_i^j(y_i-{w_i^k}^\mathrm {T}\beta )^2))\\&=C_s^2(x_i^\mathrm {T}x_i+p\sigma _2^2){{\,\mathrm{E}\,}}(\sup \limits _{\beta \in \Theta }(y_i-{w_i^k}^\mathrm {T}\beta )^2)\\&< \infty . \end{aligned} \end{aligned}$$

    The last inequality holds because \(\beta \in \Theta \) and \(y_i\), \(w_i^j\) are finite-dimensional random variables.

Thus, the results follow from Theorem 2.6 in Newey and McFadden (1994).
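Because the moment conditions \(\varphi _i(\beta )\) are linear in \(\beta \), the GMM criterion has a closed-form minimiser; the sketch below (our own illustration, with identity weighting as a default) makes this explicit under the same pairwise construction as before.

```python
# GMM estimator from the stacked pairwise moments (sketch; Omega defaults to the identity).
import itertools
import numpy as np

def gmm_estimator(W_reps, y, Omega=None):
    s, n, p = W_reps.shape
    pairs = list(itertools.combinations(range(s), 2))
    g = np.concatenate([W_reps[j].T @ y / n for j, k in pairs])        # stacked (1/n) W^j' y, shape (tau*p,)
    G = np.vstack([W_reps[j].T @ W_reps[k] / n for j, k in pairs])     # stacked (1/n) W^j' W^k, shape (tau*p, p)
    if Omega is None:
        Omega = np.eye(len(pairs) * p)
    Oi = np.linalg.inv(Omega)
    # minimiser of (g - G beta)' Omega^{-1} (g - G beta)
    return np.linalg.solve(G.T @ Oi @ G, G.T @ Oi @ g)
```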

A.8 Proof of Theorem 5

  1. 1.

    We have proved the consistency of \(\hat{\beta }_\mathrm {gmm}\).

  2. 2.

    \(\beta _0\) is in the interior of \(\Theta \).

  3. 3.

    \(\varphi _n(\beta )\) is continuously differentiable in \(\beta \).

  4. 4.

    With iid \(\varphi _i(\beta )\), we have \(\sqrt{n}\varphi _n(\beta _0)\overset{D}{\rightarrow } N(\mathbf {0},\bar{\Omega })\) by central limit theorem.

  5. 5.

    \(\nabla _{\beta }\varphi _n(\beta )=\frac{-1}{n}\sum _{i=1}^{n} \left( \begin{array}{c}\vdots \\ w_i^kw_i^{j^\mathrm {T}}\\ \vdots \\ \end{array}\right) \overset{p}{\rightarrow }-i_{\tau }\otimes A^{-1}\) for any \(\beta \).

Thus, the results follow from Theorem 3.2 and Theorem 5.2 in Newey and McFadden (1994).
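For completeness, the asymptotic variance delivered by Theorems 3.2 and 5.2 of Newey and McFadden (1994) is the usual GMM sandwich; a plug-in sketch (our own function name, with estimates of \(\Gamma \), \(\Omega \), and the weighting matrix supplied by the user) is given below. With the weighting matrix equal to \(\Omega ^{-1}\), it reduces to \((\Gamma ^\mathrm {T}\Omega ^{-1}\Gamma )^{-1}\).

```python
# GMM sandwich variance (Gamma' W Gamma)^{-1} Gamma' W Omega W Gamma (Gamma' W Gamma)^{-1} (sketch).
import numpy as np

def gmm_avar(Gamma, Omega, Wmat):
    A = np.linalg.inv(Gamma.T @ Wmat @ Gamma)
    return A @ Gamma.T @ Wmat @ Omega @ Wmat @ Gamma @ A
```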

Cite this article

Zhang, M., Bai, Y. On the use of repeated measurement errors in linear regression models. Metrika 84, 779–803 (2021). https://doi.org/10.1007/s00184-020-00801-2
