A semiparametric Wald statistic for testing logistic regression models based on case-control data

Science in China Series A: Mathematics

Abstract

We propose a semiparametric Wald statistic for testing the validity of logistic regression models based on case-control data. The test statistic is constructed from a semiparametric ROC curve estimator and a nonparametric ROC curve estimator. It has an asymptotic chi-squared distribution and is an alternative to the Kolmogorov-Smirnov-type statistic proposed by Qin and Zhang in 1997, the chi-squared-type statistic proposed by Zhang in 1999, and the information matrix test statistic proposed by Zhang in 2001. The statistic is easy to compute: it requires neither a bootstrap procedure to find its critical values, nor partitioning of the sample data, nor inversion of a high-dimensional matrix. We present simulation results and analyses of two real data examples. Moreover, we discuss how to extend the statistic to a family of statistics and how to construct its Kolmogorov-Smirnov counterpart.
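The abstract does not spell out the two ROC curve estimators, so the following is only a minimal sketch of how such a pair might be computed. It assumes the density-ratio form of the case-control logistic model, g(x) = exp(a + b*x) f(x), with f the control density and g the case density, and it assumes the semiparametric estimator reweights the pooled sample with the fitted logistic model. The helper names (fit_logistic, weighted_quantile, weighted_cdf, roc_pair) are hypothetical. The sketch does not compute the Wald statistic itself, because its variance estimator is not given here.

```python
import math


def fit_logistic(x, y, iters=50, tol=1e-10):
    """Newton-Raphson fit of P(y=1 | x) = 1 / (1 + exp(-(a + b*x)))."""
    a = b = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            w = p * (1.0 - p)
            g0 += yi - p            # score for the intercept
            g1 += (yi - p) * xi     # score for the slope
            h00 += w                # entries of the information matrix
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < tol:
            break
    return a, b


def weighted_quantile(points, weights, u):
    """Smallest t whose weighted cumulative mass reaches u."""
    acc = 0.0
    for t, w in sorted(zip(points, weights)):
        acc += w
        if acc >= u - 1e-12:
            return t
    return max(points)


def weighted_cdf(points, weights, c):
    """Weighted mass of the points that are <= c."""
    return sum(w for t, w in zip(points, weights) if t <= c)


def roc_pair(controls, cases, s):
    """Semiparametric and empirical ROC estimates at false-positive rate s."""
    n0, n1 = len(controls), len(cases)
    rho = n1 / n0
    pooled = list(controls) + list(cases)
    labels = [0] * n0 + [1] * n1

    # The prospective logistic fit has intercept a + log(rho) under the
    # assumed density-ratio model, so subtract log(rho) to recover a.
    a_star, b = fit_logistic(pooled, labels)
    a = a_star - math.log(rho)

    # Masses on the pooled sample: p estimates the control distribution,
    # q = p * exp(a + b*t) estimates the case distribution.
    p = [1.0 / (n0 * (1.0 + rho * math.exp(a + b * t))) for t in pooled]
    q = [pi * math.exp(a + b * t) for pi, t in zip(p, pooled)]

    c_semi = weighted_quantile(pooled, p, 1.0 - s)
    roc_semi = 1.0 - weighted_cdf(pooled, q, c_semi)

    # Nonparametric counterpart built from the two empirical distributions.
    c_emp = weighted_quantile(controls, [1.0 / n0] * n0, 1.0 - s)
    roc_emp = 1.0 - weighted_cdf(cases, [1.0 / n1] * n1, c_emp)
    return roc_semi, roc_emp, a, b
```

When the logistic model holds, the two estimates returned by roc_pair should be close, and a large standardized difference between them is the kind of discrepancy a test of this type would flag.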

References

  1. Breslow N, Day N E. Statistical Methods in Cancer Research, 1. The Analysis of Case-Control Studies. Lyon: IARC Press, 1980

  2. Prentice R L, Pyke R. Logistic disease incidence models and case-control studies. Biometrika, 66: 403–411 (1979)

  3. Wang C Y, Carroll R J. On robust estimation in logistic case-control studies. Biometrika, 80: 237–241 (1993)

  4. Qin J, Zhang B. A goodness of fit test for logistic regression models based on case-control data. Biometrika, 84: 609–618 (1997)

  5. Vardi Y. Nonparametric estimation in the presence of length bias. Ann Statist, 10: 616–620 (1982)

  6. Vardi Y. Empirical distributions in selection bias models. Ann Statist, 13: 178–203 (1985)

  7. Gill R D, Vardi Y, Wellner J A. Large sample theory of empirical distributions in biased sampling models. Ann Statist, 16: 1069–1112 (1988)

  8. Qin J. Empirical likelihood in biased sample problems. Ann Statist, 21: 1182–1196 (1993)

  9. Kay R, Little S. Transformations of the explanatory variables in the logistic regression model for binary data. Biometrika, 74: 495–501 (1987)

  10. Zhang B. A chi-squared goodness-of-fit test for logistic regression models based on case-control data. Biometrika, 86: 531–539 (1999)

  11. Zhang B. An information matrix test for logistic regression models based on case-control data. Biometrika, 88: 921–932 (2001)

  12. White H. Maximum likelihood estimation of misspecified models. Econometrica, 50: 1–25 (1982)

  13. Zhou X H, McClish D K, Obuchowski N A. Statistical Methods in Diagnostic Medicine. New York: Wiley, 2002

  14. Pepe M S. The Statistical Evaluation of Medical Tests for Classification and Prediction. New York: Oxford University Press, 2003

  15. Qin J, Zhang B. Using logistic regression procedures for estimating receiver operating characteristic curves. Biometrika, 90: 585–596 (2003)

  16. Wan S W, Zhang B. Smooth semiparametric receiver operating characteristic curves for continuous diagnostic tests. Stat Med, 26: 2565–2586 (2007)

  17. Day N E, Kerridge D F. A general maximum likelihood discriminant. Biometrics, 23: 313–323 (1967)

  18. Kac M, Kiefer J, Wolfowitz J. On tests of normality and other tests of goodness of fit based on distance methods. Ann Math Statist, 26: 189–211 (1955)

  19. Glovsky L, Rigrodsky S. A developmental analysis of mentally deficient children with early histories of aphasia. Training School Bull, 61: 76–96 (1964)

  20. Hosmer D W, Lemeshow S. Applied Logistic Regression. New York: John Wiley, 1989

  21. Wieand S, Gail M H, James B R, et al. A family of nonparametric statistics for comparing diagnostic markers with paired or unpaired data. Biometrika, 76: 585–592 (1989)

  22. Bahadur R R. A note on quantiles in large samples. Ann Math Statist, 37: 577–580 (1966)

  23. van der Vaart A W, Wellner J A. Weak Convergence and Empirical Processes with Applications to Statistics. New York: Springer, 1996

  24. Billingsley P. Convergence of Probability Measures. New York: John Wiley, 1968

Author information

Corresponding author

Correspondence to ShuWen Wan.

Additional information

This work was supported by the 11.5 Natural Scientific Plan (Grant No. 2006BAD09A04) and the Nanjing University Start Fund (Grant No. 020822410110).

About this article

Cite this article

Wan, S. A semiparametric Wald statistic for testing logistic regression models based on case-control data. Sci. China Ser. A-Math. 51, 2020–2032 (2008). https://doi.org/10.1007/s11425-008-0086-z

  • DOI: https://doi.org/10.1007/s11425-008-0086-z
