Abstract
The term diagnostic trial is used in two distinct ways. A type I diagnostic trial evaluates the accuracy of a diagnostic test in detecting disease or grading its severity. The primary endpoints of such studies are generally measures of test accuracy: sensitivity, specificity, positive predictive value, negative predictive value, and the area under the receiver operating characteristic (ROC) curve. Although establishing an accurate diagnosis or excluding disease is a critical first step in managing a health problem, medical decision-makers generally rely on a broader base of empirical evidence that includes how tests affect patient health outcomes such as morbidity, mortality, functional status, and quality of life. A type II diagnostic trial therefore evaluates the value of test results in guiding or determining treatment decisions within a broader management strategy. Typically, differences in diagnostic accuracy lead to differences in the delivery of treatment and ultimately affect disease prognosis and patient outcomes. In a type II diagnostic trial, the downstream consequences of testing and the treatment decisions that follow are therefore evaluated together as a joint construct. These diagnostic randomized clinical trials, or test-treatment trials, are considered the gold standard of proof for the clinical effectiveness or clinical utility of diagnostic tests. In this chapter, we define the accuracy measures used to assess diagnostic tests, summarize guidance on sample size calculation, and draw attention to the importance of more accurate reporting of study results.
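As a rough illustration of the type I accuracy endpoints named above, the sketch below computes sensitivity, specificity, and the predictive values from a 2×2 table, and a nonparametric estimate of the area under the ROC curve from continuous test scores. It is a minimal sketch, not the chapter's own method: the class `TwoByTwo`, the function `empirical_auc`, and all counts and scores are hypothetical names and values chosen for illustration, and the code assumes a binary reference standard with no empty margins.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from cross-classifying a binary test against a reference standard."""
    tp: int  # test positive, disease present
    fp: int  # test positive, disease absent
    fn: int  # test negative, disease present
    tn: int  # test negative, disease absent

    def sensitivity(self) -> float:
        # Proportion of diseased subjects the test flags as positive.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Proportion of non-diseased subjects the test flags as negative.
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        # Proportion of test positives who truly have the disease.
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        # Proportion of test negatives who are truly disease free.
        return self.tn / (self.tn + self.fn)


def empirical_auc(diseased: Sequence[float], non_diseased: Sequence[float]) -> float:
    """Nonparametric AUC: the share of (diseased, non-diseased) pairs in which
    the diseased subject has the higher score, counting ties as one half."""
    wins = 0.0
    for d in diseased:
        for n in non_diseased:
            if d > n:
                wins += 1.0
            elif d == n:
                wins += 0.5
    return wins / (len(diseased) * len(non_diseased))


# Hypothetical data, for illustration only.
table = TwoByTwo(tp=90, fp=30, fn=10, tn=170)
auc = empirical_auc([0.9, 0.8, 0.6, 0.55], [0.7, 0.5, 0.4, 0.3])

print(f"sensitivity={table.sensitivity():.3f} specificity={table.specificity():.3f}")
print(f"PPV={table.ppv():.3f} NPV={table.npv():.3f} AUC={auc:.3f}")
```

The predictive values computed this way reflect the disease prevalence in the sample (here 100 of 300 subjects), so they would not carry over to a population with a different prevalence, whereas sensitivity and specificity are conditional on true disease status.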
© 2021 Springer Nature Switzerland AG
About this entry
Cite this entry
Mazumdar, M., Zhong, X., Ferket, B. (2021). Diagnostic Trials. In: Piantadosi, S., Meinert, C.L. (eds) Principles and Practice of Clinical Trials. Springer, Cham. https://doi.org/10.1007/978-3-319-52677-5_281-1
DOI: https://doi.org/10.1007/978-3-319-52677-5_281-1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-52677-5
Online ISBN: 978-3-319-52677-5
eBook Packages: Springer Reference Mathematics; Reference Module Computer Science and Engineering