ROC curve and covariates: extending induced methodology to the non-parametric framework

Abstract

Continuous diagnostic tests are often used to discriminate between diseased and healthy populations. The receiver operating characteristic (ROC) curve is a widely used tool that provides a graphical visualisation of the effectiveness of such tests. A test's ability to distinguish diseased from healthy people may be strongly influenced by covariates, and a variety of regression methods for adjusting ROC curves have been developed. Until now, these methodologies have assumed that covariate effects have parametric forms; in this paper we extend the induced methodology by allowing for arbitrary non-parametric effects of a continuous covariate. To this end, local polynomial kernel smoothers are used in the estimation procedure. Our method allows the covariate to affect not only the mean but also the variance of the diagnostic test. We also present a bootstrap-based method for testing for a significant covariate effect on the ROC curve. To illustrate the method, endocrine data were analysed with the aim of assessing the performance of anthropometry for predicting clusters of cardiovascular risk factors in an adult population in Galicia (NW Spain), duly adjusted for age. The proposed methodology proved useful for providing age-specific thresholds for anthropometric measures in the Galician community.
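
As a rough illustration of the induced approach summarised above, the following Python sketch assumes a location-scale model Y = mu(x) + sigma(x)·eps in each population, estimates mu and sigma² with a local linear kernel smoother, and plugs the standardised residuals into the induced ROC expression ROC_x(p) = 1 - G_D((mu_H(x) - mu_D(x))/sigma_D(x) + (sigma_H(x)/sigma_D(x))·G_H^{-1}(1 - p)). The function names (`local_linear`, `induced_roc`), the Gaussian kernel, the single fixed bandwidth `h` and the simulated data are assumptions made for illustration only; they are not taken from the paper and may differ from the authors' estimation procedure.

```python
import numpy as np

def local_linear(x0, x, y, h):
    """Local linear kernel estimate of E[y | x = x0] (Gaussian kernel, bandwidth h)."""
    x0 = np.atleast_1d(np.asarray(x0, dtype=float))
    out = np.empty_like(x0)
    for i, point in enumerate(x0):
        d = x - point
        w = np.exp(-0.5 * (d / h) ** 2)
        s0, s1, s2 = w.sum(), (w * d).sum(), (w * d * d).sum()
        t0, t1 = (w * y).sum(), (w * d * y).sum()
        # Intercept of the weighted least-squares line fitted around `point`.
        out[i] = (s2 * t0 - s1 * t1) / (s0 * s2 - s1 ** 2)
    return out

def _location_scale(x0, x, y, h):
    """Return (mu(x0), sigma(x0), standardised residuals) for one population."""
    mu = local_linear(x, x, y, h)
    sq_res = (y - mu) ** 2
    # Smooth the squared residuals to estimate the variance; clamp to stay positive.
    var = np.maximum(local_linear(x, x, sq_res, h), 1e-8)
    eps = (y - mu) / np.sqrt(var)
    mu0 = local_linear(x0, x, y, h)[0]
    sd0 = np.sqrt(max(local_linear(x0, x, sq_res, h)[0], 1e-8))
    return mu0, sd0, eps

def induced_roc(x0, p, x_d, y_d, x_h, y_h, h):
    """Covariate-specific ROC curve at covariate value x0, evaluated at FPFs p."""
    mu_d, sd_d, eps_d = _location_scale(x0, x_d, y_d, h)
    mu_h, sd_h, eps_h = _location_scale(x0, x_h, y_h, h)
    # Empirical quantiles of the healthy residuals: G_H^{-1}(1 - p).
    q = np.quantile(eps_h, 1.0 - np.atleast_1d(np.asarray(p, dtype=float)))
    cut = (mu_h - mu_d) / sd_d + (sd_h / sd_d) * q
    # Empirical survival function of the diseased residuals: 1 - G_D(cut).
    return np.array([(eps_d > c).mean() for c in cut])

if __name__ == "__main__":
    # Simulated (hypothetical) data: test result depends on age in both groups.
    rng = np.random.default_rng(0)
    age_h = rng.uniform(20, 80, 300)
    age_d = rng.uniform(20, 80, 300)
    y_h = 0.02 * age_h + rng.normal(0, 1, 300)
    y_d = 2 + 0.02 * age_d + rng.normal(0, 1, 300)
    print(induced_roc(50.0, [0.1, 0.2, 0.5], age_d, y_d, age_h, y_h, h=10.0))
```

In practice the bandwidth would be chosen by a data-driven criterion rather than fixed, and separate bandwidths for the mean and variance functions may be preferable; the sketch keeps a single `h` for brevity.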

Author information

Corresponding author

Correspondence to María Xosé Rodríguez-Álvarez.

About this article

Cite this article

Rodríguez-Álvarez, M.X., Roca-Pardiñas, J. & Cadarso-Suárez, C. ROC curve and covariates: extending induced methodology to the non-parametric framework. Stat Comput 21, 483–499 (2011). https://doi.org/10.1007/s11222-010-9184-1

Keywords

  • ROC curve
  • Non-parametric regression
  • Bootstrap
  • Cardiovascular risk factors
  • Anthropometric measures