
Comparative reviews of diagnostic test accuracy in imaging research: evaluation of current practices



The purpose of this methodological review was to determine the extent to which comparative imaging systematic reviews of diagnostic test accuracy (DTA) use primary studies with comparative or non-comparative designs.


MEDLINE was used to identify DTA systematic reviews published in imaging journals between January 2000 and May 2018. Inclusion criteria: systematic reviews comparing at least two index tests (one of which was imaging-based); review characteristics were extracted. Study design and other characteristics of primary studies included in the systematic reviews were evaluated.


One hundred three comparative imaging reviews were included; 11 (11%) included only comparative studies, 12 (12%) included only non-comparative primary studies, and 80 (78%) included both comparative and non-comparative primary studies. For reviews containing both comparative and non-comparative primary studies, the median proportion of non-comparative primary studies was 81% (IQR 57–90%). Of 92 reviews that included non-comparative primary studies, 86% did not recognize this as a limitation. Furthermore, among 4182 primary studies, 3438 (82%) were non-comparative and 744 (18%) were comparative in design.
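The proportions reported above can be re-derived from the stated counts. The following is a minimal sketch of that arithmetic, not the authors' analysis code; the variable and function names are hypothetical and chosen only for illustration.

```python
# Counts as reported in the results paragraph above.
review_counts = {
    "comparative only": 11,
    "non-comparative only": 12,
    "both designs": 80,
}
primary_counts = {
    "non-comparative": 3438,
    "comparative": 744,
}


def percentages(counts):
    """Return each count as a whole-number percentage of the group total."""
    total = sum(counts.values())
    return {label: round(100 * n / total) for label, n in counts.items()}


review_pct = percentages(review_counts)
primary_pct = percentages(primary_counts)

# Reviews that included at least one non-comparative primary study.
reviews_with_non_comparative = (
    review_counts["non-comparative only"] + review_counts["both designs"]
)

print(review_pct)
print(primary_pct)
print(reviews_with_non_comparative)
```

With these counts, the review-level percentages round to 11%, 12%, and 78% of 103 reviews, the primary-study percentages to 82% and 18% of 4182 studies, and 92 reviews included non-comparative primary studies.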


Most primary studies included in comparative imaging reviews are non-comparative in design, and awareness of the associated risk of bias is low. This may lead to incorrect conclusions about the relative accuracy of diagnostic tests and be counter-productive for informing guidelines and funding decisions about imaging tests.

Key Points

• Few comparative accuracy imaging reviews include only primary studies with optimal comparative study designs. Among the rest, few recognize the risk of bias conferred by the inclusion of primary studies with non-comparative designs.

• The demand for accurate comparative accuracy data combined with minimal awareness of valid comparative study designs may lead to counter-productive research and inadequately supported clinical decisions for diagnostic tests.

• Using comparative accuracy imaging reviews with a high risk of bias to inform guidelines and funding decisions may have detrimental impacts on patient care.




Abbreviations

CT: Computed tomography
DTA: Diagnostic test accuracy
MRI: Magnetic resonance imaging
PRISMA-DTA: Preferred Reporting Items for a Systematic Review and Meta-analysis of Diagnostic Test Accuracy Studies
QUADAS: Quality Assessment of Diagnostic Accuracy Studies
RCT: Randomized controlled trial
RSNA: Radiological Society of North America
SPSS: Statistical Package for the Social Sciences
VOICE: Value of Imaging through Comparative Effectiveness




This study has received funding from the University of Ottawa, Department of Radiology Research Stipend Program.

Author information


Corresponding author

Correspondence to Matthew D. F. McInnes.

Ethics declarations


The scientific guarantor of this publication is Matthew McInnes.

Conflict of interest

The authors of this manuscript declare no relationships with any companies whose products or services may be related to the subject matter of the article.

Statistics and biometry

Several authors have significant statistical expertise (Drs McInnes, Leeflang, Deeks).

Informed consent

Written informed consent was not required for this study because it is an evaluation of published literature.

Written informed consent was waived by the Institutional Review Board.

Ethical approval

Institutional Review Board approval was not required because this is an evaluation of published literature.


Methodology

• retrospective

• cross-sectional study

• multicenter study

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material


(DOCX 20 kb)


About this article


Cite this article

Dehmoobad Sharifabadi, A., Leeflang, M., Treanor, L. et al. Comparative reviews of diagnostic test accuracy in imaging research: evaluation of current practices. Eur Radiol 29, 5386–5394 (2019).



Keywords

  • Diagnostic test, routine
  • Comparative effectiveness research
  • Sensitivity and specificity