The focused information criterion for varying-coefficient partially linear measurement error models

  • Regular Article
  • Statistical Papers

Abstract

Under general parametric models, Claeskens and Hjort (J Am Stat Assoc 98:900–916, 2003) proposed a focused information criterion for model selection that emphasizes the accuracy of estimation for particular parameters of interest. This paper extends their framework to a semi-parametric varying-coefficient partially linear model in which covariates in both the parametric and the non-parametric parts are subject to measurement error. We allow the covariance matrices of the measurement errors to be unknown and estimate them from replicated observations. We also derive the asymptotic properties of the frequentist model average estimator for this model, which generalizes the results of Wang et al. (Electron J Stat 6:1017–1039, 2012). In addition, the finite-sample performance of the proposed methods is examined in a simulation study, and a data set from the Continuing Survey of Food Intakes by Individuals (CSFII), conducted by the U.S. Department of Agriculture, is analyzed.

References

  • Akaike H (1973) Information theory and an extension of the maximum likelihood principle. In: Petrov BN, Csáki F (eds) 2nd International symposium on information theory. Akadémiai Kiadó, Budapest, pp 267–281

  • Buckland ST, Burnham KP, Augustin NH (1997) Model selection: an integral part of inference. Biometrics 53:603–618

  • Carroll RJ, Ruppert D, Stefanski LA, Crainiceanu CM (2006) Measurement error in nonlinear models: a modern perspective, 2nd edn. Chapman and Hall, New York

  • Claeskens G, Carroll RJ (2007) An asymptotic theory for model selection inference in general semiparametric problems. Biometrika 94:249–265

  • Claeskens G, Hjort NL (2003) The focused information criterion. J Am Stat Assoc 98:900–916

  • Claeskens G, Hjort NL (2008) Model selection and model averaging. Cambridge University Press, New York

  • Craven P, Wahba G (1979) Smoothing noisy data with spline functions. Numerische Mathematik 31:377–403

  • Danilov D, Magnus JR (2004) On the harm that ignoring pretesting can cause. J Econom 122:27–46

  • Fan J, Huang T (2005) Profile likelihood inferences on semiparametric varying-coefficient partially linear models. Bernoulli 11:1031–1057

  • Hjort NL, Claeskens G (2003) Frequentist model average estimators. J Am Stat Assoc 98:879–899

  • Hjort NL, Claeskens G (2006) Focused information criteria and model averaging for the Cox hazard regression model. J Am Stat Assoc 101:1449–1464

  • Liang H, Härdle W, Carroll RJ (1999) Estimation in a semiparametric partially linear errors-in-variables model. Annals Stat 27:1519–1535

  • Magnus JR, Durbin J (1999) Estimation of regression coefficients of interest when other regression coefficients are of no interest. Econometrica 67:639–643

  • Mallows CL (1973) Some comments on \(C_p\). Technometrics 15:661–675

  • Peña EA, Wu W, Piegorsch WW, West RW, An L (2013) Model selection and estimation with quantal-response data in benchmark risk assessment. Tech rep

  • Schomaker M (2012) Shrinkage averaging estimation. Stat Pap 53(4):1015–1034

  • Schomaker M, Heumann C (2014) Model selection and model averaging after multiple imputation. Comput Stat Data Anal 71:758–770

  • Schomaker M, Wan ATK, Heumann C (2010) Frequentist model averaging with missing observations. Comput Stat Data Anal 54:3336–3347

  • Schwarz G (1978) Estimating the dimension of a model. Annals Stat 6:461–464

  • Stone M (1974) Cross-validatory choice and assessment of statistical predictions. J R Stat Soc Ser B (Methodological) 36:111–147

  • Wang H, Zou G (2012) Frequentist model averaging estimation for linear errors-in-variables model. J Syst Sci Math Sci Chin Ser 32:1–14

  • Wang H, Zou G, Wan ATK (2012) Model averaging for varying-coefficient partially linear measurement error models. Electron J Stat 6:1017–1039

  • Wang H, Zou G, Wan ATK (2013) Adaptive lasso for varying-coefficient partially linear measurement error models. J Stat Plan Inference 143:40–54

  • You J, Chen G (2006) Estimation of a semiparametric varying-coefficient partially linear errors-in-variables model. J Multivar Anal 97:324–341

  • You J, Zhou Y, Chen G (2006) Corrected local polynomial estimation in varying-coefficient models with measurement errors. Can J Stat 34:391–410

  • Zhang X, Liang H (2011) Focused information criterion and model averaging for generalized additive partial linear models. Annals Stat 39:174–200

  • Zhang X, Wan ATK, Zhou SZ (2012) Focused information criteria, model selection and model averaging in a tobit model with a non-zero threshold. J Bus Econ Stat 30:132–142

Acknowledgments

We are very grateful to the editor and two referees for their constructive comments and suggestions.

Author information

Corresponding author

Correspondence to Hai Ying Wang.

Appendix

The following conditions are required for the proof.

  1. The random variable \(T\) has bounded support \(\varvec{\Omega }\), and its density \(f\) is Lipschitz continuous and bounded away from 0 on its support.

  2. For each \(T\in \varvec{\Omega }\), the \(r\times r\) matrix \(\mathbf {E}(ZZ^{\top }|T)\) is non-singular, and each element of \(\mathbf {E}(ZZ^{\top }|T)\), \(\mathbf {E}(XX^{\top }|T)\) and \(\mathbf {E}(ZX^{\top }|T)\) is Lipschitz continuous.

  3. There exists some \(\epsilon >2\) such that \(\mathbf {E} \Vert X\Vert ^{2\epsilon }<\infty \), \(\mathbf {E} \Vert Z\Vert ^{2\epsilon }<\infty \), \(\mathbf {E} \Vert U\Vert ^{2\epsilon }<\infty \), \(\mathbf {E} \Vert V\Vert ^{2\epsilon }<\infty \) and \(\mathbf {E} \Vert \varepsilon \Vert ^{2\epsilon }<\infty \), and some \(\rho <2-\epsilon ^{-1}\) such that \(nh^{2\rho -1}\rightarrow \infty \) and \(nh^8\rightarrow 0\).

  4. Each \(\alpha _j(T)\), \(j=1,\ldots ,r\), is twice continuously differentiable in \(T\in \varvec{\Omega }\).

  5. \(K(\cdot )\) is a symmetric density with compact support.
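As a quick check of the bandwidth requirements in condition 3, consider the illustrative choice \(h=n^{-1/5}\). This choice is an assumption made here for the example only; the conditions above do not prescribe it. Then

$$\begin{aligned} nh^{8}&=n^{1-8/5}=n^{-3/5}\rightarrow 0,\\ nh^{2\rho -1}&=n^{1-(2\rho -1)/5}=n^{(6-2\rho )/5}\rightarrow \infty \quad \text {since } \rho <2-\epsilon ^{-1}<2, \end{aligned}$$

so both rate conditions hold. More generally, for \(h=n^{-a}\) with \(a>0\), the two requirements reduce to \(a>1/8\) and \(a(2\rho -1)<1\).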

Proof of Theorem 1

Let \(\hat{\bar{U}}_i =({\bar{\psi }}_{i}\mathbf {U})^{\top }\), \(\hat{\varepsilon }_i={\bar{\psi }}_{i}\varvec{\varepsilon }\) and \(\bigtriangledown =\widetilde{\bar{\mathbf {W}}}^{\top }\widetilde{\bar{\mathbf {W}}}-nJ^{-1}\hat{\Sigma }_u\). Then from Eq. (4), we have

$$\begin{aligned} \hat{\theta }-\theta _{{\mathrm{true}}}&=\bigtriangledown ^{-1}\sum _{i=1}^n(\bar{W}_i-\hat{\bar{W}}_i)(Y_i-\hat{\bar{Y}}_i)-\bigtriangledown ^{-1}\bigtriangledown \theta _{{\mathrm{true}}}\\&=\bigtriangledown ^{-1}nJ^{-1}\hat{\Sigma }_u\theta _{{\mathrm{true}}}+\bigtriangledown ^{-1}\sum _{i=1}^n(\bar{W}_i-\hat{\bar{W}}_i)\left\{ Y_i-\hat{\bar{Y}}_i-(\bar{W}_i-\hat{\bar{W}}_i)^{\top }\theta _{{\mathrm{true}}}\right\} . \end{aligned}$$

From the expressions for \(\hat{\bar{Y}}_i\), \(\hat{\bar{W}}_i\), \(\hat{\bar{U}}_i\) and \(\hat{\varepsilon }_i\), we obtain

$$\begin{aligned}&Y_i-\hat{\bar{Y}}_i-(\bar{W}_i-\hat{\bar{W}}_i)^{\top }\theta _{{\mathrm{true}}} =Z_i^{\top }\alpha (T_i)+\varepsilon _i-\bar{U}_i^{\top }\theta _{{\mathrm{true}}}-\hat{\varepsilon }_i+\hat{\bar{U}}_i^{\top }\theta _{{\mathrm{true}}}-{\bar{\psi }}_{i}\mathbf {M}. \end{aligned}$$

Hence,

$$\begin{aligned}&\sum _{i=1}^n(\bar{W}_i-\hat{\bar{W}}_i)\left\{ Y_i-\hat{\bar{Y}}_i-(\bar{W}_i-\hat{\bar{W}}_i)^{\top }\theta _{{\mathrm{true}}}\right\} \\&\quad =\sum _{i=1}^n(\bar{W}_i-\hat{\bar{W}}_i)(\varepsilon _i-\bar{U}_i^{\top }\theta _{{\mathrm{true}}})\\&\qquad +\sum _{i=1}^n(\bar{W}_i-\hat{\bar{W}}_i)(\hat{\bar{U}}_i^{\top }\theta _{{\mathrm{true}}}-\hat{\varepsilon }_i) +\sum _{i=1}^n(\bar{W}_i-\hat{\bar{W}}_i)\{Z_i^{\top }\alpha (T_i)-{\bar{\psi }}_{i}\mathbf {M}\}\\&\quad =\sum _{i=1}^n\left[ \bar{W}_i-\mathbf {E}(\bar{W}_iZ_i^{\top }|T_i)\{\mathbf {E}(Z_iZ_i^{\top }|T_i)\}^{-1}{\bar{\zeta }}_{i}\right] (\varepsilon _i-\bar{U}_i^{\top }\theta _{{\mathrm{true}}})\\&\qquad +\sum _{i=1}^n\left[ \mathbf {E}(\bar{W}_iZ_i^{\top }|T_i)\{\mathbf {E}(Z_iZ_i^{\top }|T_i)\}^{-1}{\bar{\zeta }}_{i}-\hat{\bar{W}}_i\right] (\varepsilon _i-\bar{U}_i^{\top }\theta _{{\mathrm{true}}})\\&\qquad +\sum _{i=1}^n(\bar{W}_i-\hat{\bar{W}}_i)(\hat{\bar{U}}_i^{\top }\theta _{{\mathrm{true}}}-\hat{\varepsilon }_i) +\sum _{i=1}^n(\bar{W}_i-\hat{\bar{W}}_i)\{Z_i^{\top }\alpha (T_i)-{\bar{\psi }}_{i}\mathbf {M}\}\\&\quad \equiv J_1+J_2+J_3+J_4. \end{aligned}$$

Applying the method used in Fan and Huang (2005) shows that, uniformly in \(T\), \( \hat{\bar{W}}_i^{\top }=({\bar{\zeta }}_{i}^{\top },\;0)\left\{ (\mathcal {D}^{\bar{\zeta }}_{t_i})^{\top }\Omega _{t_i}\mathcal {D}^{\bar{\zeta }}_{t_i}-{\bar{\phi }}_{t_i}\right\} ^{-1}(\mathcal {D}^{\bar{\zeta }}_{t_i})^{\top }\Omega _{t_i}\mathbf {W}={\bar{\zeta }}_{i}^{\top }\{\mathbf {E}(Z_iZ_i^{\top }|T_i)\}^{-1}\mathbf {E}(Z_iX_i^{\top }|T_i)\{1+O_P(c_n)\},\) where \(c_n=\{\log (1/h)/(nh)\}^{1/2}+h^2\). In addition, since \(\{\mathbf {E}(\bar{W}_iZ_i^{\top }|T_i)\}^{\top }=\mathbf {E}(Z_i\bar{W}_i^{\top }|T_i)\) and \(\theta _{{\mathrm{true}}}=\theta _0+(0^{\top },\delta ^{\top })^{\top }/\sqrt{n}\), we have

$$\begin{aligned} J_2=\sum _{i=1}^n[\mathbf {E}(\bar{W}_iZ_i^{\top }|T_i)\{\mathbf {E}(Z_iZ_i^{\top }|T_i)\}^{-1}{\bar{\zeta }}_{i}](\varepsilon _i-\bar{U}_i^{\top }\theta _0)O_P(c_n). \end{aligned}$$

The Central Limit Theorem yields \(\sum _{i=1}^n[\mathbf {E}(\bar{W}_iZ_i^{\top }|T_i)\{\mathbf {E}(Z_iZ_i^{\top }|T_i)\}^{-1}Z_i](\varepsilon _i-\bar{U}_i^{\top }\theta _0)=O_P(\sqrt{n})\). Therefore, because \(c_n\rightarrow 0\), \(J_2=O_P(\sqrt{n}c_n)=o_P(\sqrt{n})\). Similarly, \(J_3=o_P(\sqrt{n})\) and \(J_4=o_P(\sqrt{n})\). Using Slutsky’s Theorem and recognizing that \(\bigtriangledown /n=B_n\overset{p}{\longrightarrow }B\) as \(n\rightarrow \infty \), we obtain

$$\begin{aligned} \sqrt{n}(\hat{\theta }-\theta _{{\mathrm{true}}})&= \bigg (\frac{\bigtriangledown }{n}\bigg )^{-1}\frac{1}{\sqrt{n}}\sum _{i=1}^{n}\bigg \{\Big (\bar{W}_i-\mathbf {E}(\bar{W}_iZ_i^{\top }|T_i)\{\mathbf {E}(Z_iZ_i^{\top }|T_i)\}^{-1}{\bar{\zeta }}_{i}\Big )\\&\qquad \times \left( \varepsilon _i-\bar{U}_i^{\top }\theta _{{\mathrm{true}}}\right) +\frac{\sum _{j=1}^{J}(W_{ij}-\bar{W}_i)^{\otimes 2}\theta _{{\mathrm{true}}}}{J(J-1)}\bigg \}+o_P(1)\\&= \bigg (\frac{\bigtriangledown }{n}\bigg )^{-1}\frac{1}{\sqrt{n}}\sum _{i=1}^{n}\bigg \{\Big (\bar{W}_i-\mathbf {E}(\bar{W}_iZ_i^{\top }|T_i)\{\mathbf {E}(Z_iZ_i^{\top }|T_i)\}^{-1}{\bar{\zeta }}_{i}\Big )\\&\qquad \times (\varepsilon _i-\bar{U}_i^{\top }\theta _0)+\frac{\sum _{j=1}^{J}(W_{ij}-\bar{W}_i)^{\otimes 2}\theta _0}{J(J-1)}\bigg \}+o_P(1)\\&\overset{d}{\longrightarrow } N(0,\;B^{-1}FB^{-1}). \end{aligned}$$
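To see why the limit in the last display is centered at zero, it may help to check the mean of each summand. The following is a sketch under assumptions that are not stated in this appendix: the replicate errors \(U_{ij}\) have mean zero and common covariance \(\Sigma _u\) (the quantity estimated by \(\hat{\Sigma }_u\)), they are independent across \(j\) and independent of \(X_i\), \(Z_i\), \(T_i\), \(\varepsilon _i\) and of the measurement error contained in \({\bar{\zeta }}_{i}\), and \(\varepsilon _i\) has conditional mean zero given the covariates. Under these assumptions,

$$\begin{aligned} \mathbf {E}\left\{ \Big (\bar{W}_i-\mathbf {E}(\bar{W}_iZ_i^{\top }|T_i)\{\mathbf {E}(Z_iZ_i^{\top }|T_i)\}^{-1}{\bar{\zeta }}_{i}\Big )(\varepsilon _i-\bar{U}_i^{\top }\theta _0)\right\}&=-\mathbf {E}(\bar{U}_i\bar{U}_i^{\top })\theta _0=-J^{-1}\Sigma _u\theta _0,\\ \mathbf {E}\left\{ \frac{\sum _{j=1}^{J}(W_{ij}-\bar{W}_i)^{\otimes 2}\theta _0}{J(J-1)}\right\}&=\frac{(J-1)\Sigma _u\theta _0}{J(J-1)}=J^{-1}\Sigma _u\theta _0, \end{aligned}$$

so the two contributions cancel and each summand has mean zero. The replicate-based term therefore plays the role of a bias correction for the attenuation caused by \(\bar{U}_i\).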

The above results together with Eq. (5) and the Continuous Mapping Theorem complete the proof. \(\square \)
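The role of the correction term \(nJ^{-1}\hat{\Sigma }_u\) in \(\bigtriangledown \) can be illustrated numerically. The sketch below is a deliberately simplified setting, not the estimator analyzed in this paper: it drops the varying-coefficient part altogether (no \(Z\), no kernel smoothing), so that the smoothed residual matrix reduces to the matrix of replicate means \(\bar{W}_i\). The form \(\hat{\Sigma }_u=\{n(J-1)\}^{-1}\sum _i\sum _j(W_{ij}-\bar{W}_i)^{\otimes 2}\) is inferred from comparing the displays above; the function names, sample sizes and simulated data are hypothetical choices made for the example.

```python
import numpy as np


def replicate_error_cov(W):
    """Pooled within-subject covariance of a single measurement error.

    W has shape (n, J, p): J replicate measurements of a p-vector
    for each of n subjects.
    """
    n, J, p = W.shape
    resid = W - W.mean(axis=1, keepdims=True)  # W_ij - W_bar_i
    return np.einsum("ijk,ijl->kl", resid, resid) / (n * (J - 1))


def naive_least_squares(W, y):
    """Least squares on the replicate means, ignoring measurement error."""
    W_bar = W.mean(axis=1)
    return np.linalg.solve(W_bar.T @ W_bar, W_bar.T @ y)


def corrected_least_squares(W, y):
    """Least squares with the Gram matrix corrected by n * Sigma_u_hat / J."""
    n, J, p = W.shape
    W_bar = W.mean(axis=1)
    sigma_u_hat = replicate_error_cov(W)
    gram = W_bar.T @ W_bar - n * sigma_u_hat / J  # simplified analogue of the triangle matrix
    return np.linalg.solve(gram, W_bar.T @ y)


# Simulated data (all values below are illustrative assumptions).
rng = np.random.default_rng(0)
n, J, p = 5000, 2, 2
theta = np.array([1.0, -0.5])
X = rng.normal(size=(n, p))                    # true covariates, never observed
U = rng.normal(scale=0.8, size=(n, J, p))      # measurement errors, Sigma_u = 0.64 * I
W = X[:, None, :] + U                          # observed replicates
y = X @ theta + rng.normal(scale=0.5, size=n)  # response

theta_naive = naive_least_squares(W, y)
theta_corr = corrected_least_squares(W, y)
print("naive:    ", theta_naive)
print("corrected:", theta_corr)
```

In this setup the naive estimate is shrunk toward zero, by a factor of roughly \(1/(1+0.32)\) in large samples because the averaged error has variance \(0.64/2\), while the corrected estimate should lie close to the true coefficients.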

Proofs of Theorem 2 and Theorem 3

Given the result in Theorem 1, Theorems 2 and 3 can be proved by arguments similar to those used in the proofs of Theorem 1 and Theorem 2 in Wang et al. (2012), respectively. We omit the details to save space. \(\square \)

Cite this article

Wang, H.Y., Chen, X. & Flournoy, N. The focused information criterion for varying-coefficient partially linear measurement error models. Stat Papers 57, 99–113 (2016). https://doi.org/10.1007/s00362-014-0645-z

  • DOI: https://doi.org/10.1007/s00362-014-0645-z
