Chapter 7: Grading a Body of Evidence on Diagnostic Tests
Grading the strength of a body of diagnostic test evidence involves challenges over and above those related to grading the evidence from health care intervention studies. This chapter identifies challenges and outlines principles for grading the body of evidence related to diagnostic test performance.
Diagnostic test evidence is challenging to grade for two reasons: standard tools for grading evidence were designed for questions about treatment rather than diagnostic testing, and the clinical usefulness of a diagnostic test depends on multiple links in a chain of evidence connecting the performance of a test to changes in clinical outcomes.
Reviewers grading the strength of a body of evidence on diagnostic tests should consider the principal domains of risk of bias, directness, consistency, and precision, as well as publication bias, dose-response association, plausible unmeasured confounders that would decrease an effect, and strength of association, similar to what is done to grade evidence on treatment interventions. Given that most evidence regarding the clinical value of diagnostic tests is indirect, an analytic framework must be developed to clarify the key questions, and the strength of evidence for each link in that framework should be graded separately. However, if reviewers choose to combine domains into a single grade of evidence, they should explain their rationale for a particular summary grade and the relevant domains that were weighed in assigning the summary grade.
Journal of General Internal Medicine
Volume 27, Issue 1 Supplement, pp 47-55
- diagnostic tests
- publication bias
- health care intervention
- Author Affiliations
- 1. Department of Medicine, Johns Hopkins University School of Medicine, 624 N Broadway, Rm 680 B, Baltimore, MD, 21205, USA
- 2. Department of Epidemiology, Johns Hopkins University, Bloomberg School of Public Health, Baltimore, MD, USA
- 3. Center for Outcomes and Evidence, Agency for Healthcare Research and Quality, Rockville, MD, USA
- 4. Duke-NUS Medical School, Singapore, Singapore
- 5. Duke Center for Clinical Health Policy Research, Durham, NC, USA