Statistical inference methods and applications of outcome-dependent sampling designs under generalized linear models

Science China Mathematics

Abstract

A cost-effective sampling design is desirable in large cohort studies with a limited budget, because measuring the primary exposure variables is expensive. Outcome-dependent sampling (ODS) designs enrich the observed sample by oversampling the regions of the underlying population that convey the most information about the exposure-response relationship. Generalized linear models (GLMs) are widely used in many fields; however, far less work has been done on fitting GLMs to data from ODS designs. We study how to fit GLMs to data obtained under the original ODS design and under the two-phase ODS design. The asymptotic properties of the proposed estimators are derived. A series of simulations is conducted to assess the finite-sample performance of the proposed estimators. Applications to a Wilms tumor study and an air quality study demonstrate the practicality of the proposed methods.
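
As an illustration of the sampling idea only, and not of the estimators proposed in the article, the following minimal Python sketch draws an ODS-type sample from a simulated cohort: a simple random sample is supplemented with extra subjects whose outcomes fall in the lower or upper tail. The function name `draw_ods_sample`, the Poisson log-link model, its coefficients, the 10% and 90% cutpoints, and all sample sizes are assumptions chosen for the example; none of them come from the article.

```python
import numpy as np

def draw_ods_sample(y, n0, n_low, n_high, low_cut, high_cut, rng):
    """Return index arrays (srs, low, high) for an ODS-type sample.

    srs  : simple random sample of size n0 from the whole cohort
    low  : n_low extra subjects with y <= low_cut, drawn from the rest
    high : n_high extra subjects with y >= high_cut, drawn from the rest
    """
    n_total = len(y)
    srs = rng.choice(n_total, size=n0, replace=False)
    # Supplemental draws come only from subjects not already in the SRS.
    remaining = np.setdiff1d(np.arange(n_total), srs)
    low_pool = remaining[y[remaining] <= low_cut]
    high_pool = remaining[y[remaining] >= high_cut]
    low = rng.choice(low_pool, size=n_low, replace=False)
    high = rng.choice(high_pool, size=n_high, replace=False)
    return srs, low, high

rng = np.random.default_rng(0)

# Hypothetical cohort: x is the expensive exposure, y the cheap outcome,
# generated from an assumed Poisson GLM with log link.
N = 10_000
x = rng.normal(size=N)
y = rng.poisson(np.exp(0.2 + 0.5 * x))

# Assumed tail cutpoints at the 10% and 90% quantiles of the outcome.
low_cut, high_cut = np.quantile(y, [0.10, 0.90])

srs, low, high = draw_ods_sample(y, 200, 50, 50, low_cut, high_cut, rng)
ods = np.concatenate([srs, low, high])  # exposure x is measured only here
```

In this sketch the exposure `x` would be measured only for the subjects indexed by `ods`. Because inclusion depends on the outcome, fitting a GLM to these subjects as if they were a simple random sample is generally biased, which is why design-aware estimators are needed.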

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant Nos. 11571263, 11371299 and 11101314).

Corresponding author

Correspondence to JieLi Ding.

Cite this article

Yan, S., Ding, J. & Liu, Y. Statistical inference methods and applications of outcome-dependent sampling designs under generalized linear models. Sci. China Math. 60, 1219–1238 (2017). https://doi.org/10.1007/s11425-016-0152-4

  • DOI: https://doi.org/10.1007/s11425-016-0152-4