Methods and recommendations for evaluating and reporting a new diagnostic test

  • Review
  • European Journal of Clinical Microbiology & Infectious Diseases

Abstract

No standardized guidelines exist for the biostatistical methods appropriate for studies evaluating diagnostic tests. Publication recommendations such as the STARD statement provide guidance for the analysis of data, but biostatistical advice is minimal and application is inconsistent. This article aims to provide a self-contained, accessible resource on the biostatistical aspects of study design and reporting for investigators. For all dichotomous diagnostic tests, estimates of sensitivity and specificity should be reported with confidence intervals. Power calculations are strongly recommended to ensure that investigators achieve desired levels of precision. In the absence of a gold standard reference test, the composite reference standard method is recommended for improving estimates of the sensitivity and specificity of the test under evaluation.
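As an illustration of the reporting recommendation above, the following is a minimal Python sketch that computes sensitivity and specificity from a 2×2 table together with 95% confidence intervals, plus a rough sample size for a desired precision. The abstract does not say which interval method the article prefers, so the Wilson score interval used here is an assumption, as is the normal-approximation sample size formula. The function names and the example counts are hypothetical.

```python
import math

# Standard normal quantile for a two-sided 95% interval.
Z_95 = 1.959963984540054


def wilson_interval(successes, n, z=Z_95):
    """Wilson score interval for a binomial proportion (an assumed choice of method)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def sensitivity_specificity(tp, fn, tn, fp):
    """Point estimates and Wilson intervals from counts against the reference test.

    tp/fn are reference-positive subjects testing positive/negative;
    tn/fp are reference-negative subjects testing negative/positive.
    """
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": (sens, *wilson_interval(tp, tp + fn)),
        "specificity": (spec, *wilson_interval(tn, tn + fp)),
    }


def n_for_precision(expected, half_width, z=Z_95):
    """Approximate number of subjects needed so that the interval around an
    expected proportion has roughly the requested half-width
    (normal approximation; a planning sketch, not a full power analysis)."""
    return math.ceil(z * z * expected * (1 - expected) / half_width ** 2)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    for name, (est, lo, hi) in sensitivity_specificity(90, 10, 80, 20).items():
        print(f"{name}: {est:.3f} (95% CI {lo:.3f} to {hi:.3f})")
    print("reference-positive subjects needed:", n_for_precision(0.90, 0.05))
```

Note that the sample size applies separately to reference-positive subjects (for sensitivity) and reference-negative subjects (for specificity), so the total enrolment also depends on the expected prevalence.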

References

  1. Bossuyt PM, Reitsma JB, Bruns DE, Gatsonis CA, Glasziou PP, Irwig LM et al (2003) Towards complete and accurate reporting of studies of diagnostic accuracy: the STARD initiative. Ann Intern Med 138(1):40–44

  2. Pfeifer J (ed) (2006) Molecular genetic testing in surgical pathology. Lippincott Williams & Wilkins, Philadelphia

  3. Rosner BA (2006) Fundamentals of biostatistics, 6th edn. Thomson Brooks Cole, Belmont, CA

  4. FDA (2011) Statistical guidance on reporting results from studies evaluating diagnostic tests. Available from: http://www.fda.gov/MedicalDevices/DeviceRegulationandGuidance/GuidanceDocuments/ucm071148.htm. Updated 6 January 2011; cited 8 December 2011

  5. Royse D, Thyer BA, Padgett DK (2010) Program evaluation: An introduction, 5th edn. Wadsworth, Cengage Learning, Belmont, CA

  6. Royse D (2008) Research methods in social work, 5th edn. Thomson Brooks Cole, Belmont, CA

  7. Sullivan LM (2008) Essentials of biostatistics in public health, 1st edn. Jones and Bartlett, Sudbury, MA

  8. Price RM, Bonett DG (2008) Confidence intervals for a ratio of two independent binomial proportions. Stat Med 27(26):5497–5508

  9. Schachter J, McCormack WM, Chernesky MA, Martin DH, Van Der Pol B, Rice PA et al (2003) Vaginal swabs are appropriate specimens for diagnosis of genital tract infection with Chlamydia trachomatis. J Clin Microbiol 41(8):3784–3789

  10. Miller WC (1998) Bias in discrepant analysis: when two wrongs don't make a right. J Clin Epidemiol 51(3):219–231

  11. Hawkins DM, Garrett JA, Stephenson B (2001) Some issues in resolution of diagnostic tests using an imperfect gold standard. Stat Med 20(13):1987–2001

  12. Hadgu A (1996) The discrepancy in discrepant analysis. Lancet 348(9027):592–593

  13. Alonzo TA, Pepe MS (1999) Using a combination of reference tests to assess the accuracy of a new diagnostic test. Stat Med 18(22):2987–3003

  14. Baughman AL, Bisgard KM, Cortese MM, Thompson WW, Sanden GN, Strebel PM (2008) Utility of composite reference standards and latent class analysis in evaluating the clinical accuracy of diagnostic tests for pertussis. Clin Vaccine Immunol 15(1):106–114

  15. Lipman HB, Astles JR (1998) Quantifying the bias associated with use of discrepant analysis. Clin Chem 44(1):108–115

  16. Torrance-Rynard VL, Walter SD (1997) Effects of dependent errors in the assessment of diagnostic test performance. Stat Med 16(19):2157–2175

  17. Pepe MS, Janes H (2007) Insights into latent class analysis of diagnostic test performance. Biostatistics 8(2):474–484

  18. Rindskopf D, Rindskopf W (1986) The value of latent class analysis in medical diagnosis. Stat Med 5(1):21–27

  19. Qu Y, Tan M, Kutner MH (1996) Random effects models in latent class analysis for evaluating accuracy of diagnostic tests. Biometrics 52(3):797–810

  20. Hui SL, Zhou XH (1998) Evaluation of diagnostic tests without gold standards. Stat Methods Med Res 7(4):354–370

Acknowledgments

MS’s work on this article was supported by NIH grant 1K25AG034216.

ADH’s work on this article was supported by NIH grant 1K24AI079040-01A1.

Conflict of interest

JKJ has received funding from Becton Dickinson. The other authors declare that they have no conflict of interest.

Author information

Corresponding author

Correspondence to A. S. Hess.

About this article

Cite this article

Hess, A.S., Shardell, M., Johnson, J.K. et al. Methods and recommendations for evaluating and reporting a new diagnostic test. Eur J Clin Microbiol Infect Dis 31, 2111–2116 (2012). https://doi.org/10.1007/s10096-012-1602-1

  • DOI: https://doi.org/10.1007/s10096-012-1602-1
