Avenues for Further Research

  • Yulun Liu
  • Yong Chen


In this chapter, we present an overview of recent statistical methods for diagnostic meta-analysis and suggest a few directions for future research. We discuss two important issues: (a) robustness to model misspecification and (b) model identifiability and the assumption of conditional independence in the absence of a gold standard. With the increasing availability of biomedical data, meta-analyses of individual patient-level data offer new insights into evidence synthesis compared with traditional meta-analyses based on aggregated data. In particular, approaches that combine individual patient-level data with aggregated data can inform personalized medical decisions based on patient-level characteristics and help to identify clinically relevant subgroups. However, such integration methods for diagnostic prediction research are limited, and hence there is a growing need to develop novel statistical methods that can address potential issues, including model validation, missing predictors, and between-study heterogeneity, while combining both types of data. Despite its perceived advantages, the use of individual patient-level data alone may still encounter a number of challenges, such as partial verification bias and the absence of a gold standard. We illustrate these challenges with two examples.


Keywords

Absence of gold standard; Composite likelihood; Diagnostic test; Generalized linear mixed model; Hierarchical model; Imperfect reference test; Individual patient-level data; Meta-analysis; Partial verification bias



Abbreviations

GLMM: Generalized linear mixed models
HSROC: Hierarchical summary receiver operating characteristic
IPD: Individual patient-level data
MRI: Magnetic resonance imaging
NPV: Negative predictive value
PPV: Positive predictive value
ROP: Retinopathy of prematurity
SROC: Summary receiver operating characteristic



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, USA
