Methods (2): Statistical Methods

Chapter in: Diagnostic Test Accuracy Studies in Dementia

Abstract

This chapter examines the statistical methodology of diagnostic test accuracy studies, emphasizing the measures of discrimination, both paired and single (unitary), and the comparative measures that may be used to define the outcome of such studies; most are derived from a 2 × 2 table. Post hoc significance testing may also be undertaken as a marker of a test's diagnostic utility.
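As a rough illustration of how such measures follow from a 2 × 2 table, the sketch below computes a few commonly reported paired and unitary measures from the four cell counts. The class name `TwoByTwo`, the grouping of measures into the two methods, and the example counts are assumptions made for illustration only; they are not taken from the chapter, and zero cells (which would cause division by zero) are not handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Cell counts from cross-classifying an index test against a reference standard.

    Hypothetical helper for illustration; zero cells are not handled.
    """
    tp: int  # test positive, condition present
    fp: int  # test positive, condition absent
    fn: int  # test negative, condition present
    tn: int  # test negative, condition absent

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def paired_measures(self) -> dict:
        """Measures usually reported as complementary pairs."""
        sens, spec = self.sensitivity, self.specificity
        return {
            "sensitivity": sens,
            "specificity": spec,
            "PPV": self.tp / (self.tp + self.fp),
            "NPV": self.tn / (self.tn + self.fn),
            "LR+": sens / (1 - spec),
            "LR-": (1 - sens) / spec,
        }

    def unitary_measures(self) -> dict:
        """Single-figure summaries of the same table."""
        n = self.tp + self.fp + self.fn + self.tn
        return {
            "accuracy": (self.tp + self.tn) / n,
            "Youden index": self.sensitivity + self.specificity - 1,
            "diagnostic odds ratio": (self.tp * self.tn) / (self.fp * self.fn),
        }


# Made-up counts, chosen only so the arithmetic is easy to follow.
table = TwoByTwo(tp=40, fp=10, fn=10, tn=40)
paired = table.paired_measures()
unitary = table.unitary_measures()
for name, value in {**paired, **unitary}.items():
    print(f"{name}: {value:.2f}")
```

With these invented counts, sensitivity and specificity are both 0.80, so the Youden index is 0.60 and the positive likelihood ratio is 4. Confidence intervals, which a real study would report alongside each estimate, are omitted here.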


References

  • Abdel-Aziz K, Larner AJ. Six-item Cognitive Impairment Test (6CIT): pragmatic diagnostic accuracy study for dementia and MCI. Int Psychogeriatr. 2015;27:991–7.

    Article  CAS  PubMed  Google Scholar 

  • Akobeng AK. Understanding diagnostic tests 1: sensitivity, specificity and predictive values. Acta Paediatr. 2007a;96:338–41.

    Article  PubMed  Google Scholar 

  • Akobeng AK. Understanding diagnostic tests 2: likelihood ratios, pre- and post-test probabilities and their use in clinical practice. Acta Paediatr. 2007b;96:487–91.

    Article  PubMed  Google Scholar 

  • Akobeng AK. Understanding diagnostic tests 3: receiver operating characteristic curves. Acta Paediatr. 2007c;96:644–7.

    Article  PubMed  Google Scholar 

  • Altman DG, Bland JM. Diagnostic tests 1: sensitivity and specificity. BMJ. 1994a;308:1552.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Altman DG, Bland JM. Diagnostic tests 2: predictive values. BMJ. 1994b;309:102.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Altman DG, Bland JM. Diagnostic tests 3: receiver operating characteristic plots. BMJ. 1994c;309:188.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Altman DG, Bland JM. How to obtain the confidence interval from a P value. BMJ. 2011a;343:d2090.

    Article  PubMed  Google Scholar 

  • Altman DG, Bland JM. How to obtain the P value from a confidence interval. BMJ. 2011b;343:d2304.

    Article  PubMed  Google Scholar 

  • Altman DG, Royston P. The cost of dichotomising continuous variables. BMJ. 2006;332:1080.

    Article  PubMed  PubMed Central  Google Scholar 

  • Altman DG, Machin D, Bryant TN, Gardner MJ. Statistics with confidence. Confidence intervals and statistical guidelines. 2nd ed. London: BMJ Books; 2000.

    Google Scholar 

  • Amrhein V, Greenland S, McShane B. Scientists rise up against statistical significance. Nature. 2019;567:305–7.

    Article  CAS  PubMed  Google Scholar 

  • Andrade C. Likelihood of being helped or harmed as a measure of clinical outcomes in psychopharmacology. J Clin Psychiatry. 2017;78:e73–5.

    Article  PubMed  Google Scholar 

  • Ashford JW. Screening for memory disorders, dementia and Alzheimer’s disease. Aging Health. 2008;4:399–432.

    Article  Google Scholar 

  • Baio G. Bayesian methods in health economics. Boca Raton: CRC Press; 2013.

    Google Scholar 

  • Baum ML. The neuroethics of biomarkers. What the development of bioprediction means for moral responsibility, justice, and the nature of mental disorder. Oxford: Oxford University Press; 2016.

    Book  Google Scholar 

  • Bayes T. An essay towards solving a problem in the doctrine of chances. Philos Trans R Soc Lond. 1763;53:370–418.

    Article  Google Scholar 

  • Bellhouse DR. The Reverend Thomas Bayes, FRS: a biography to celebrate the tercentenary of his birth. Stat Sci. 2004;19:3–43.

    Article  Google Scholar 

  • Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet. 1986;1:307–10.

    Article  CAS  PubMed  Google Scholar 

  • Bodemer N, Meder B, Gigerenzer G. Communicating relative risk changes with baseline risk: presentation format and numeracy matter. Med Decis Making. 2014;34:615–26.

    Article  PubMed  Google Scholar 

  • Bohning D, Holling H, Patilea V. A limitation of the diagnostic-odds ratio in determining an optimal cut-off value for a continuous diagnostic test. Stat Methods Med Res. 2011;20:541–50.

    Article  PubMed  Google Scholar 

  • Bossuyt PM, Reitsma JB, Bruns DE, et al. The STARD statement for reporting studies of diagnostic accuracy: explanation and elaboration. Clin Chem. 2003;49:7–18.

    Article  CAS  PubMed  Google Scholar 

  • Bourke GJ, Daly LE, McGilvray J. Interpretation and uses of medical statistics. 3rd ed. Oxford: Blackwell Scientific Publications; 1985.

    Google Scholar 

  • Brodersen J, Schwartz LM, Heneghan C, O’Sullivan JW, Aronson JK, Woloshin S. Overdiagnosis: what it is and what it isn’t. BMJ Evid Based Med. 2018;23:1–3.

    Article  PubMed  Google Scholar 

  • Brown J, Pengas G, Dawson K, Brown LA, Clatworthy P. Self administered cognitive screening test (TYM) for detection of Alzheimer’s disease: cross sectional study. BMJ. 2009;338:b2030.

    Article  PubMed  PubMed Central  Google Scholar 

  • Brown J, Wiggins J, Dong H, Harvey R, Richardson F, Dawson K, Parker RA. The H-TYM. Evaluation of a short cognitive test to detect mild AD and amnestic MCI. Int J Geriatr Psychiatry. 2014;29:272–80.

    Article  PubMed  Google Scholar 

  • Burch J, Marson A, Beyer F, et al. Dilemmas in the interpretation of diagnostic accuracy studies on presurgical workup for epilepsy surgery. Epilepsia. 2012;53:1294–302.

    Article  PubMed  Google Scholar 

  • Caraguel CGB, Vanderstichel R. The two-step Fagan’s nomogram: ad hoc interpretation of a diagnostic test result without calculation. Evid Based Med. 2013;18:125–8.

    Article  PubMed  Google Scholar 

  • Casscells W, Schoenberger A, Graboys TB. Interpretation by physicians of clinical laboratory results. N Engl J Med. 1978;299:999–1001.

    Article  CAS  PubMed  Google Scholar 

  • Chan QL, Shaik MA, Xu J, Xu X, Chen CL, Dong Y. The combined utility of a brief functional measure and performance-based screening test for case finding of cognitive impairment in primary healthcare. J Am Med Dir Assoc. 2016;17:372e9–11.

    Article  Google Scholar 

  • Citrome L, Ketter TA. When does a difference make a difference? Interpretation of number needed to treat, number needed to harm, and likelihood to be helped or harmed. Int J Clin Pract. 2013;67:407–11.

    Article  CAS  PubMed  Google Scholar 

  • Clopper CJ, Pearson ES. The use of confidence or fiducial limits illustrated in the case of the binomial. Biometrika. 1934;26:404–13.

    Article  Google Scholar 

  • Cohen J. A coefficient of agreement for nominal scales. Educ Psychol Meas. 1960;20:37–46.

    Article  Google Scholar 

  • Cohen J. Statistical power analysis for the behavioral sciences. 2nd ed. Hillsdale: Lawrence Erlbaum; 1988.

    Google Scholar 

  • Cohen J. A power primer. Psychol Bull. 1992;112:155–9.

    Article  CAS  PubMed  Google Scholar 

  • Connell FA, Koepsell TD. Measures of gain in certainty from a diagnostic test. Am J Epidemiol. 1985;121:744–53.

    Article  CAS  PubMed  Google Scholar 

  • Cook RJ, Sackett DL. The number needed to treat: a clinically useful measure of treatment effect. BMJ. 1995;310:452–4.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Davis J, Goadrich M. The relationship between Precision-Recall and ROC curves. In: Proceedings of the 23rd International Conference on Machine Learning. New York: ACM; 2006. p. 233–40.

    Google Scholar 

  • Deeks JJ, Altman DG. Diagnostic tests 4: likelihood ratios. BMJ. 2004;329:168–9.

    Article  PubMed  PubMed Central  Google Scholar 

  • Doane DP, Seward LE. Measuring skewness: a forgotten statistic? J Stat Educ. 2011;19(2):1–18.

    Article  Google Scholar 

  • Doya K, Ishii S, Pouget A, Rao RPN, editors. Bayesian brain: probabilistic approaches to neural coding. Cambridge: MIT Press; 2007.

    Google Scholar 

  • Dubois B, Feldman HH, Jacova C, et al. Advancing research diagnostic criteria for Alzheimer’s disease: the IWG-2 criteria. Lancet Neurol. 2014;13:614–29. [Erratum Lancet Neurol. 2014;13:757].

    Article  PubMed  Google Scholar 

  • Ellis PD. The essential guide to effect sizes: statistical power, meta-analysis, and the interpretation of research results. Cambridge: Cambridge University Press; 2010.

    Book  Google Scholar 

  • Fagan TJ. Letter: nomogram for Bayes theorem. N Engl J Med. 1975;293:257.

    CAS  PubMed  Google Scholar 

  • Fawcett T. An introduction to ROC analysis. Pattern Recognit Lett. 2006;27:861–74.

    Article  Google Scholar 

  • Fleiss JL. Measuring nominal scale agreement among many raters. Psychol Bull. 1971;76:378–82.

    Article  Google Scholar 

  • Fleiss JL, Chilton NW. The measurement of interexaminer agreement on periodontal disease. J Periodontal Res. 1983;18:601–6.

    Article  CAS  PubMed  Google Scholar 

  • Flicker L, Logiudice D, Carlin JB, Ames D. The predictive value of dementia screening instruments in clinical populations. Int J Geriatr Psychiatry. 1997;12:203–9.

    Article  CAS  PubMed  Google Scholar 

  • Florkowski CM. Sensitivity, specificity, receiver-operating characteristic (ROC) curves and likelihood ratios: communicating the performance of diagnostic tests. Clin Biochem Rev. 2008;29(Suppl1):S83–7.

    PubMed  PubMed Central  Google Scholar 

  • Fluss R, Faraggi D, Reiser B. Estimation of the Youden Index and its associated cutoff point. Biom J. 2005;47:458–72.

    Article  PubMed  Google Scholar 

  • Forsyth RJ. Neurological and cognitive decline in adolescence. J Neurol Neurosurg Psychiatry. 2003;74(Suppl1):i9–16.

    Article  PubMed  PubMed Central  Google Scholar 

  • Frost C, Kallis C. A plea for confidence intervals and consideration of generalizability in diagnostic studies. Brain. 2009;132:e103.

    Article  PubMed  Google Scholar 

  • Galvin JE, Roe CM, Xiong C, Morris JE. Validity and reliability of the AD8 informant interview in dementia. Neurology. 2006;67:1942–8.

    Article  PubMed  Google Scholar 

  • Gauthier S. Diagnostic instruments to assess functional impairment. In: Qizilbash N, Schneider LS, Chui H, et al., editors. Evidence-based dementia practice. Oxford: Blackwell; 2002. p. 101–4.

    Google Scholar 

  • Ghadiri-Sani M, Larner AJ. Head turning sign for diagnosis of dementia and mild cognitive impairment: a revalidation. J Neurol Neurosurg Psychiatry. 2013;84:e2.

    Article  Google Scholar 

  • Glas AS, Lijmer JG, Prins MH, Bonsel GJ, Bossuyt PM. The diagnostic odds ratio: a single indicator of test performance. J Clin Epidemiol. 2003;56:1129–35.

    Article  PubMed  Google Scholar 

  • Greiner M, Pfeiffer D, Smith RD. Principles and practical application of the receiver-operating characteristic analysis for diagnostic tests. Prev Vet Med. 2000;45:23–41.

    Article  CAS  PubMed  Google Scholar 

  • Grimes DA, Schulz KF. Refining clinical diagnosis with likelihood ratios. Lancet. 2005;365:1500–5.

    Article  PubMed  Google Scholar 

  • Griner PF, Mayewski RJ, Mushlin AI, Greenland P. Selection and interpretation of diagnostic tests and procedures. Principles and applications. Ann Intern Med. 1981;94:557–92.

    CAS  PubMed  Google Scholar 

  • Habbema JDF, Eijkemans R, Krijnen P, Knottnerus JA. Analysis of data on the accuracy of diagnostic tests. In: Knottnerus JA, editor. The evidence base of clinical diagnosis. London: BMJ Books; 2002. p. 117–43.

    Google Scholar 

  • Habibzadeh F, Yadollahie M. Number needed to misdiagnose: a measure of diagnostic test effectiveness. Epidemiology. 2013;24:170.

    Article  PubMed  Google Scholar 

  • Habibzadeh F, Habibzadeh P, Yadollahie M. On determining the most appropriate test cut-off value: the case of tests with continuous results. Biochem Med (Zagreb). 2016;26:297–307.

    Article  Google Scholar 

  • Hancock P, Larner AJ. Cambridge Behavioural Inventory for the diagnosis of dementia. Prog Neurol Psychiatry. 2008;12(7):23–5.

    Article  Google Scholar 

  • Hancock P, Larner AJ. Clinical utility of Patient Health Questionnaire-9 (PHQ-9) in memory clinics. Int J Psychiatry Clin Pract. 2009a;13:188–91.

    Article  CAS  PubMed  Google Scholar 

  • Hancock P, Larner AJ. Diagnostic utility of the Informant Questionnaire on Cognitive Decline in the Elderly (IQCODE) and its combination with the Addenbrooke’s Cognitive Examination-Revised (ACE-R) in a memory clinic-based population. Int Psychogeriatr. 2009b;21:526–30.

    Article  CAS  PubMed  Google Scholar 

  • Hancock P, Larner AJ. Test Your Memory (TYM) test: diagnostic utility in a memory clinic population. Int J Geriatr Psychiatry. 2011;26:976–80.

    Article  CAS  PubMed  Google Scholar 

  • Hanley JA, McNeil BJ. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology. 1982;143:29–36.

    Article  CAS  PubMed  Google Scholar 

  • Hanley JA, McNeil BJ. A method of comparing the areas under receiver operating characteristic curves derived from the same cases. Radiology. 1983;148:839–43.

    Article  CAS  PubMed  Google Scholar 

  • Hayden SR, Brown MD. Likelihood ratio: a powerful tool for incorporating the results of a diagnostic test into clinical decision making. Ann Emerg Med. 1999;33:575–80.

    Article  CAS  PubMed  Google Scholar 

  • Heilbronner RL, Sweet JJ, Attaix DK, Krull KR, Henry GK, Hart RP. Official position of the American Academy of Clinical Neuropsychology on serial neuropsychological assessment: the utility and challenges of repeat test administrations in clinical and forensic contexts. Clin Neuropsychol. 2010;24:1267–78.

    Article  PubMed  Google Scholar 

  • Helkala EL, Kivipelto M, Hallikainen M, et al. Usefulness of repeated presentation of Mini-Mental State Examination as a diagnostic procedure – a population-based study. Acta Neurol Scand. 2002;106:341–6.

    Article  PubMed  Google Scholar 

  • Hlatky MA, Mark DB, Harrell FE Jr, Lee KL, Califf RM, Pryor DB. Rethinking sensitivity and specificity. Am J Cardiol. 1987;59:1195–8.

    Article  CAS  PubMed  Google Scholar 

  • Ioannidis JPA. The proposal to lower P value thresholds to.005. JAMA. 2018;319:1429–30.

    Article  PubMed  Google Scholar 

  • Isik AT, Soysal P, Kaya D, Usarel C. Triple test, a diagnostic observation, can detect cognitive impairment in older adults. Psychogeriatrics. 2018;18:98–105.

    Article  PubMed  Google Scholar 

  • Jaeschke R, Guyatt G, Sackett DL. Users’ guide to the medical literature. III. How to use an article about a diagnostic test. B. What are the results and will they help me in caring for my patients? JAMA. 1994;271:703–7.

    Article  CAS  PubMed  Google Scholar 

  • Jones CM, Athanasiou T. Summary receiver operating characteristic curve analysis techniques in the evaluation of diagnostic tests. Ann Thorac Surg. 2005;79:16–20.

    Article  PubMed  Google Scholar 

  • Knafelc R, Lo Giudice D, Harrigan S, et al. The combination of cognitive testing and an informant questionnaire in screening for dementia. Age Ageing. 2003;32:541–7.

    Article  PubMed  Google Scholar 

  • Knottnerus JA, Muris JW. Assessment of the accuracy of diagnostic tests: the cross-sectional study. In: Knottnerus JA, editor. The evidence base of clinical diagnosis. London: BMJ Books; 2002. p. 39–59.

    Google Scholar 

  • Knottnerus JA, van Weel C. General introduction: evaluation of diagnostic procedures. In: Knottnerus JA, editor. The evidence base of clinical diagnosis. London: BMJ Books; 2002. p. 1–17.

    Google Scholar 

  • Kraemer HC. Evaluating medical tests. Objective and quantitative guidelines. Newbery Park: Sage; 1992.

    Google Scholar 

  • Kroenke K, Spitzer RL, Williams JBW. The PHQ-9: validity of a brief depression severity measure. J Gen Intern Med. 2001;16:606–13.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33:159–74.

    Article  CAS  PubMed  Google Scholar 

  • Larner AJ. An audit of the Addenbrooke’s Cognitive Examination (ACE) in clinical practice. 2. Longitudinal change. Int J Geriatr Psychiatry. 2006;21:698–9.

    Article  CAS  PubMed  Google Scholar 

  • Larner AJ. Addenbrooke’s Cognitive Examination (ACE) for the diagnosis and differential diagnosis of dementia. Clin Neurol Neurosurg. 2007a;109:491–4.

    Article  CAS  PubMed  Google Scholar 

  • Larner AJ. DemTect: 1-year experience of a neuropsychological screening test for dementia. Age Ageing. 2007b;36:326–7.

    Article  CAS  PubMed  Google Scholar 

  • Larner AJ. Addenbrooke’s Cognitive Examination-Revised (ACE-R) in day-to-day clinical practice. Age Ageing. 2007c;36:685–6.

    Article  CAS  PubMed  Google Scholar 

  • Larner AJ. ACE-R: cross-sectional and longitudinal use for cognitive assessment. In: Fisher A, Hanin I, editors. New trends in Alzheimer and Parkinson related disorders: ADPD 2009. Collection of selected free papers from the 9th International Conference on Alzheimer’s and Parkinson’s disease AD/PD. Prague, Czech Republic, March 11–15, 2009. Bologna: Medimond International Proceedings; 2009. p. 103–7.

    Google Scholar 

  • Larner AJ. Teleneurology by internet and telephone. A study of medical self-help. London: Springer; 2011.

    Book  Google Scholar 

  • Larner AJ. Mini-Mental Parkinson (MMP) as a dementia screening test: comparison with the Mini-Mental State Examination (MMSE). Curr Aging Sci. 2012a;5:136–9.

    Article  PubMed  Google Scholar 

  • Larner AJ. Screening utility of the Montreal Cognitive Assessment (MoCA): in place of - or as well as - the MMSE? Int Psychogeriatr. 2012b;24:391–6.

    Article  CAS  PubMed  Google Scholar 

  • Larner AJ. Head turning sign: pragmatic utility in clinical diagnosis of cognitive impairment. J Neurol Neurosurg Psychiatry. 2012c;83:852–3.

    Article  CAS  PubMed  Google Scholar 

  • Larner AJ. Addenbrooke’s Cognitive Examination-Revised (ACE-R): pragmatic study of cross-sectional use for assessment of cognitive complaints of unknown aetiology. Int J Geriatr Psychiatry. 2013a;28:547–8.

    Article  CAS  PubMed  Google Scholar 

  • Larner AJ. Codex (cognitive disorders examination) for the detection of dementia and mild cognitive impairment. Codex pour la détection de la démence et du mild cognitive impairment. Presse Med. 2013b;42:e425–8.

    Article  PubMed  Google Scholar 

  • Larner AJ. Comparing diagnostic accuracy of cognitive screening instruments: a weighted comparison approach. Dement Geriatr Cogn Disord Extra. 2013c;3:60–5.

    Article  CAS  Google Scholar 

  • Larner AJ. Effect size (Cohen’s d) of cognitive screening instruments examined in pragmatic diagnostic accuracy studies. Dement Geriatr Cogn Disord Extra. 2014;4:236–41.

    Article  Google Scholar 

  • Larner AJ. Speed versus accuracy in cognitive assessment when using CSIs. Prog Neurol Psychiatry. 2015a;19(1):21–4.

    Article  Google Scholar 

  • Larner AJ. Performance-based cognitive screening instruments: an extended analysis of the time versus accuracy trade-off. Diagnostics (Basel). 2015b;5:504–12.

    Article  Google Scholar 

  • Larner AJ. AD8 informant questionnaire for cognitive impairment: pragmatic diagnostic test accuracy study. J Geriatr Psychiatry Neurol. 2015c;28:198–202.

    Article  CAS  PubMed  Google Scholar 

  • Larner AJ. Optimizing the cutoffs of cognitive screening instruments in pragmatic diagnostic accuracy studies: maximising accuracy or Youden index? Dement Geriatr Cogn Disord. 2015d;39:167–75.

    Article  PubMed  Google Scholar 

  • Larner AJ. Diagnostic test accuracy studies in dementia: a pragmatic approach. London: Springer; 2015e.

    Book  Google Scholar 

  • Larner AJ. The Q* index: a useful global measure of dementia screening test accuracy? Dement Geriatr Cogn Dis Extra. 2015f;5:265–70.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Larner AJ. Mini-Addenbrooke’s Cognitive Examination: a pragmatic diagnostic accuracy study. Int J Geriatr Psychiatry. 2015g;30:547–8.

    Article  CAS  PubMed  Google Scholar 

  • Larner AJ. Mini-Addenbrooke’s Cognitive Examination diagnostic accuracy for dementia: reproducibility study. Int J Geriatr Psychiatry. 2015h;30:1103–4.

    Article  CAS  PubMed  Google Scholar 

  • Larner AJ. Auscultation of the skull: author’s reply. J R Coll Physicians Edinb. 2016a;46:214.

    Google Scholar 

  • Larner AJ. Correlation or limits of agreement? Applying the Bland-Altman approach to the comparison of cognitive screening instruments. Dement Geriatr Cogn Disord. 2016b;42:247–54.

    Article  CAS  PubMed  Google Scholar 

  • Larner AJ. M-ACE vs. MoCA: a weighted comparison. Int J Geriatr Psychiatry. 2016c;31:1089–90.

    Article  CAS  PubMed  Google Scholar 

  • Larner AJ. Does combining an informant questionnaire with patient performance scales improve diagnostic test accuracy for cognitive impairment? Int J Geriatr Psychiatry. 2017a;32:466–7.

    Article  CAS  PubMed  Google Scholar 

  • Larner AJ. MACE versus MoCA: equivalence or superiority? Pragmatic diagnostic test accuracy study. Int Psychogeriatr. 2017b;29:931–7.

    Article  CAS  PubMed  Google Scholar 

  • Larner AJ. Short Montreal Cognitive Assessment: validation and reproducibility. J Geriatr Psychiatry Neurol. 2017c;30:104–8.

    Article  CAS  PubMed  Google Scholar 

  • Larner AJ. Dementia in clinical practice: a neurological perspective. Pragmatic studies in the Cognitive Function Clinic. 3rd ed. London: Springer; 2018a.

    Book  Google Scholar 

  • Larner AJ. Number needed to diagnose, predict, or misdiagnose: useful metrics for non-canonical signs of cognitive status? Dement Geriatr Cogn Dis Extra. 2018b;8:321–7.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Larner AJ. Cognitive screeners for MCI: is correction of skewed data necessary? Prog Neurol Psychiatry. 2018c;22(4):27–30.

    Article  Google Scholar 

  • Larner AJ. Free-Cog: pragmatic test accuracy study. Dement Geriatr Cogn Disord. 2019a; accepted.

    Google Scholar 

  • Larner AJ. Cognitive screening instruments: how much overdiagnosis do they create? Int J Clin Pract. 2019b;73:e13290.

    Article  PubMed  Google Scholar 

  • Larner AJ. New unitary metrics for dementia test accuracy studies. Prog Neurol Psychiatry. 2019c;23; accepted.

    Google Scholar 

  • Larner AJ. Evaluating cognitive screening instruments with the “likelihood to be diagnosed or misdiagnosed” measure. Int J Clin Pract. 2019d;73:e13265.

    Article  PubMed  Google Scholar 

  • Larner AJ. MACE for diagnosis of dementia and MCI: examining cut-offs and predictive values. Diagnostics (Basel). 2019e;9:51.

    Article  PubMed Central  Google Scholar 

  • Larner AJ. Response to “Triple test, a diagnostic observation, can detect cognitive impairment in older adults”. Psychogeriatrics. 2019f;19; in press.

    Google Scholar 

  • Larner AJ, Hancock P. Does combining cognitive and functional scales facilitate the diagnosis of dementia? Int J Geriatr Psychiatry. 2012;27:547–8.

    Article  CAS  PubMed  Google Scholar 

  • Larner AJ, Hancock P. ACE-R or MMSE? A weighted comparison. Int J Geriatr Psychiatry. 2014;29:767–8.

    Article  PubMed  Google Scholar 

  • Larner AJ, Mitchell AJ. A meta-analysis of the accuracy of the Addenbrooke’s Cognitive Examination (ACE) and the Addenbrooke’s Cognitive Examination-Revised (ACE-R) in the detection of dementia. Int Psychogeriatr. 2014;26:555–63.

    Article  PubMed  Google Scholar 

  • Laupacis A, Sackett DL, Roberts RS. An assessment of clinically useful measures of the consequences of treatment. N Engl J Med. 1988;318:1728–33.

    Article  CAS  PubMed  Google Scholar 

  • Lee W, Williams DR, Storey E. Cognitive testing in the diagnosis of parkinsonian disorders: a critical appraisal of the literature. Mov Disord. 2012;27:1243–54.

    Article  PubMed  Google Scholar 

  • Linden A. Measuring diagnostic and predictive accuracy in disease management: an introduction to receiver operating characteristic (ROC) analysis. J Eval Clin Pract. 2006;12:132–9.

    Article  PubMed  Google Scholar 

  • Linn S, Grunau PD. New patient-oriented summary measure of net total gain in certainty for dichotomous diagnostic tests. Epidemiol Perspect Innov. 2006;3:11.

    Article  PubMed  PubMed Central  Google Scholar 

  • Llewelyn H. Likelihood ratios are not good for differential diagnosis. BMJ. 2012;344:e3660.

    Article  PubMed  Google Scholar 

  • Lord SJ, Irwig L, Simes RJ. When is measuring sensitivity and specificity sufficient to evaluate a diagnostic test, and when do we need randomized trials? Ann Intern Med. 2006;144:850–5.

    Article  PubMed  Google Scholar 

  • Lusted L. Introduction to medical decision making. Springfield: Charles Thomas; 1968.

    Google Scholar 

  • Lusted LB. Signal detectability and medical decision-making. Science. 1971;171:1217–9.

    Article  CAS  PubMed  Google Scholar 

  • Mackinnon A, Mulligan R. Combining cognitive testing and informant report to increase accuracy in screening for dementia. Am J Psychiatry. 1988;155:1529–35.

    Article  Google Scholar 

  • Mallett S, Halligan S, Thompson M, Collins GS, Altman DG. Interpreting diagnostic accuracy studies for patient care. BMJ. 2012;345:e3999.

    Article  PubMed  Google Scholar 

  • Manrai AK, Bhatia G, Strymish J, Kohane IS, Jain SH. Medicine’s uncomfortable relationship with math: calculating positive predictive value. JAMA Intern Med. 2014;174:991–3.

    Article  PubMed  PubMed Central  Google Scholar 

  • Marshall RJ. The predictive value of simple rules for combining two diagnostic tests. Biometrics. 1989;45:1213–22.

    Article  Google Scholar 

  • Mathuranath PS, Nestor PJ, Berrios GE, Rakowicz W, Hodges JR. A brief cognitive test battery to differentiate Alzheimer’s disease and frontotemporal dementia. Neurology. 2000;55:1613–20.

    Article  CAS  PubMed  Google Scholar 

  • McCrea MA. Mild traumatic brain injury and postconcussion syndrome. The new evidence base for diagnosis and treatment. Oxford: Oxford University Press; 2008.

    Google Scholar 

  • McGee S. Simplifying likelihood ratios. J Gen Intern Med. 2002;17:646–9.

    Article  PubMed  Google Scholar 

  • McGinn T, Wyer PC, Newman TB, et al. Tips for learners of evidence-based medicine: 3. Measures of observer variability (kappa statistic). CMAJ. 2004;171:1369–73.

    Article  PubMed  PubMed Central  Google Scholar 

  • Metz CE. Basic principles of ROC analysis. Semin Nucl Med. 1978;8:283–98.

    Article  CAS  PubMed  Google Scholar 

  • Mioshi E, Dawson K, Mitchell J, Arnold R, Hodges JR. The Addenbrooke’s Cognitive Examination Revised: a brief cognitive test battery for dementia screening. Int J Geriatr Psychiatry. 2006;21:1078–85.

    Article  PubMed  Google Scholar 

  • Mitchell AJ. Index test. In: Kattan MW, editor. Encyclopedia of medical decision making. Los Angeles: Sage; 2009. p. 613–7.

    Google Scholar 

  • Mitchell AJ. Sensitivity x PPV is a recognized test called the clinical utility index (CUI+). Eur J Epidemiol. 2011;26:251–2.

    Article  PubMed  Google Scholar 

  • Mitchell AJ, Malladi S. Screening and case-finding tools for the detection of dementia. Part I: evidence-based meta-analysis of multidomain tests. Am J Geriatr Psychiatry. 2010a;18:759–82.

    Article  PubMed  Google Scholar 

  • Mitchell AJ, Malladi S. Screening and case-finding tools for the detection of dementia. Part II: evidence-based meta-analysis of single-domain tests. Am J Geriatr Psychiatry. 2010b;18:783–800.

    Article  PubMed  Google Scholar 

  • Mitchell AJ, McGlinchey JB, Young D, Chelminski I, Zimmerman M. Accuracy of specific symptoms in the diagnosis of major depressive disorder in psychiatric out-patients: data from the MIDAS project. Psychol Med. 2009;39:1107–16.

    Article  CAS  PubMed  Google Scholar 

  • Montori VW, Kleinbart J, Newman TB, et al. Tips for learners of evidence-based medicine: 2. Measures of precision (confidence intervals). CMAJ. 2004;171:611–5.

    Article  PubMed  PubMed Central  Google Scholar 

  • Moons KG, van Es GA, Deckers JW, Habbema JD, Grobbee DE. Limitations of sensitivity, specificity, likelihood ratio, and Bayes’ theorem in assessing diagnostic probabilities: a clinical example. Epidemiology. 1997a;8:12–7.

    Article  CAS  PubMed  Google Scholar 

  • Moons KGM, Stijnen T, Michel BC, Büller HR, Van Es GA, Grobbee DE, Habbema DF. Application of treatment thresholds to diagnostic-test evaluation: an alternative to the comparison of areas under receiver operating characteristic curves. Med Decis Mak. 1997b;17:447–54.

    Article  CAS  Google Scholar 

  • Moorhouse P. Screening for dementia in primary care. Can Rev Alzheimers Dis Other Demen. 2009;12:8–13.

    Google Scholar 

  • Nai YH, Shidahara M, Seki C, Watabe H. Biomathematical screening of amyloid radiotracers with clinical usefulness index. Alzheimers Dement (NY). 2017;3:542–52.

    Google Scholar 

  • National Institute for Health and Care Excellence. Dementia. Assessment, management and support for people living with dementia and their carers. NICE Guideline 97. Methods, evidence and recommendations. London: NICE; 2018.. https://www.nice.org.uk/guidance/ng97

    Google Scholar 

  • Noel-Storr AH, Flicker L, Ritchie CW, et al. Systematic review of the body of evidence for use of biomarkers in the diagnosis of dementia. Alzheimers Dement. 2013;9:e96–105.

    Article  PubMed  Google Scholar 

  • Noel-Storr AH, McCleery JM, Richard E, et al. Reporting standards for studies of diagnostic test accuracy in dementia: the STARDdem Initiative. Neurology. 2014;83:364–73.

    Article  PubMed  PubMed Central  Google Scholar 

  • Ostergaard SD, Dinesen PT, Foldager L. Quantifying the value of markers in screening programmes. Eur J Epidemiol. 2010;25:151–4.

    Article  PubMed  Google Scholar 

  • Ouellet D. Benefit: risk assessment: the use of the clinical utility index. Expert Opin Drug Saf. 2010;9:289–300.

    Article  CAS  PubMed  Google Scholar 

  • Ozer S, Noonan K, Burke M, et al. The validity of the Memory Alteration Test and the Test Your Memory test for community-based identification of amnestic mild cognitive impairment. Alzheimers Dement. 2016;12:987–95.

    Article  PubMed  PubMed Central  Google Scholar 

  • Pencina MJ, D’Agostino RB Sr, D’Agostino RB Jr, Vasan RS. Evaluating the added predictive ability of a new marker: from area under the ROC curve to reclassification and beyond. Stat Med. 2008;27:157–72.

    Article  PubMed  Google Scholar 

  • Perkins NJ, Schisterman EF. The inconsistency of “optimal” cutpoints obtained using two criteria based on the receiver operating characteristic curve. Am J Epidemiol. 2006;163:670–5.

    Article  PubMed  Google Scholar 

  • Peters KR. Utility of an effect size analysis for communicating treatment effectiveness: a case study of cholinesterase inhibitors for Alzheimer’s disease. J Am Geriatr Soc. 2013;61:1170–4.

    Article  PubMed  Google Scholar 

  • Powers DMW. Evaluation: from precision, recall and F-measure to ROC, informedness, markedness and correlation. J Machine Learning Technologies. 2011;2:37–63.

    Google Scholar 

  • Qizilbash N. Evidence-based diagnosis. In: Qizilbash N, Schneider LS, Chui H, et al., editors. Evidence-based dementia practice. Oxford: Blackwell; 2002. p. 18–25.

  • Richard E, Schmand BA, Eikelenboom P, Van Gool WA. The Alzheimer’s Disease Neuroimaging Initiative. MRI and cerebrospinal fluid biomarkers for predicting progression to Alzheimer’s disease in patients with mild cognitive impairment: a diagnostic accuracy study. BMJ Open. 2013;3:e002541.

  • Sackett DL, Haynes RB. The architecture of diagnostic research. In: Knottnerus JA, editor. The evidence base of clinical diagnosis. London: BMJ Books; 2002. p. 19–38.

  • Saito T, Rehmsmeier M. The precision-recall plot is more informative than the ROC plot when evaluating binary classifiers on imbalanced datasets. PLoS One. 2015;10(3):e0118432.

  • Sappenfield RW, Beeler MF, Catrou PG, Boudreau DA. Nine-cell diagnostic decision matrix. A model of the diagnostic process; a framework for evaluating diagnostic protocols. Am J Clin Pathol. 1981;75:769–72.

  • Sawilowsky SS. New effect sizes rules of thumb. J Mod Appl Stat Methods. 2009;8:597–9.

  • Schuetz GM, Schlattmann P, Dewey M. Use of 3x2 tables with an intention to diagnose approach to assess clinical performance of diagnostic tests: meta-analytical evaluation of coronary CT angiography studies. BMJ. 2012;345:e6717.

  • Smith GE, Bondi MW. Mild cognitive impairment and dementia. Definitions, diagnosis, and treatment. Oxford: Oxford University Press; 2013.

  • Smits N. A note on Youden’s J and its cost ratio. BMC Med Res Methodol. 2010;10:89.

  • Swets JA. Measuring the accuracy of diagnostic systems. Science. 1988;240:1285–93.

  • Talbot PR, Lloyd JJ, Snowden JS, Neary D, Testa HJ. A clinical role for 99mTc-HMPAO SPECT in the investigation of dementia? J Neurol Neurosurg Psychiatry. 1998;64:306–13.

  • Tate RL. A compendium of tests, scales, and questionnaires. The practitioner’s guide to measuring outcomes after acquired brain impairment. Hove: Psychology Press; 2010.

  • The Ronald and Nancy Reagan Research Institute of the Alzheimer’s Association and the National Institute on Aging Working Group. Consensus report of the Working Group on: “Molecular and biochemical markers of Alzheimer’s disease”. Neurobiol Aging. 1998;19:109–16.

  • Viera AJ, Garrett JM. Understanding interobserver agreement: the kappa statistic. Fam Med. 2005;37:360–3.

  • Walter SD. Properties of the summary receiver operating characteristic (SROC) curve for diagnostic test data. Stat Med. 2002;21:1237–56.

  • Walter SD. The partial area under the summary ROC curve. Stat Med. 2005;24:2025–40.

  • Williamson JC, Larner AJ. MACE for diagnosis of dementia and MCI: 3-year pragmatic diagnostic test accuracy study. Dement Geriatr Cogn Disord. 2018;45:300–7.

  • Wilson JMG, Jungner G. Principles and practice of screening for disease. Public health paper No. 34. Geneva: World Health Organization; 1968.

  • Woolf SH, Kamerow DB. Testing for uncommon conditions. The heroic search for positive test results. Arch Intern Med. 1990;150:2451–8.

  • Yerushalmy J. Statistical problems in assessing methods of medical diagnosis, with special reference to x-ray techniques. Public Health Rep. 1947;62:1432–49.

  • Youden WJ. Index for rating diagnostic tests. Cancer. 1950;3:32–5.

  • Zermansky A. Number needed to harm should be measured for treatments. BMJ. 1998;317:1014.

  • Zhou XH, Obuchowski NA, McClish DK. Statistical methods in diagnostic medicine. 2nd ed. Hoboken: Wiley; 2011.

  • Zweig MH, Campbell G. Receiver-operating characteristic (ROC) plots: a fundamental evaluation tool in clinical medicine. Clin Chem. 1993;39:561–77.

Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Larner, A.J. (2019). Methods (2): Statistical Methods. In: Diagnostic Test Accuracy Studies in Dementia. Springer, Cham. https://doi.org/10.1007/978-3-030-17562-7_3

  • DOI: https://doi.org/10.1007/978-3-030-17562-7_3

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-17561-0

  • Online ISBN: 978-3-030-17562-7

  • eBook Packages: Medicine (R0)
