Abstract
The authors assessed the reliability of the Autism Diagnostic Interview-Revised (ADI-R). Seven clinical examiners evaluated a three-and-one-half-year-old female toddler suspected of being on the autism spectrum. Examiners showed agreement levels of 94–96% across all items, with weighted kappa (Kw) between .80 and .88. They were in 100% agreement on 74% of the items; in excellent agreement on 6% of the items (93–96%, with Kw between .78 and .85); in good agreement on 7% (89–90%, with Kw between .62 and .68); and in fair agreement on 3% (82–84%, with Kw between .40 and .47). For the remaining 10% of ADI-R items, examiners showed poor agreement (50–81%, with Kw between −.67 and .37).
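The weighted kappa (Kw) statistic reported above corrects percent agreement for chance and gives partial credit to near-miss ratings. As a minimal illustration, the sketch below computes Cohen's weighted kappa with linear weights for two raters; this is a generic two-rater version for illustration only, not the multi-examiner procedure the authors applied to seven examiners.

```python
def weighted_kappa(rater_a, rater_b, categories):
    """Cohen's weighted kappa (linear weights) for two raters' item scores.

    Illustrative sketch: two raters only, linear disagreement weights.
    """
    n = len(rater_a)
    k = len(categories)
    idx = {c: i for i, c in enumerate(categories)}

    # Observed joint proportions: obs[i][j] = fraction of items
    # rated category i by rater A and category j by rater B.
    obs = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        obs[idx[a]][idx[b]] += 1.0 / n

    # Marginal category proportions for each rater.
    pa = [sum(obs[i][j] for j in range(k)) for i in range(k)]
    pb = [sum(obs[i][j] for i in range(k)) for j in range(k)]

    # Linear disagreement weights: 0 on the diagonal, 1 for maximal distance.
    w = [[abs(i - j) / (k - 1) for j in range(k)] for i in range(k)]

    # Observed vs. chance-expected weighted disagreement.
    d_obs = sum(w[i][j] * obs[i][j] for i in range(k) for j in range(k))
    d_exp = sum(w[i][j] * pa[i] * pb[j] for i in range(k) for j in range(k))
    return 1.0 - d_obs / d_exp
```

Perfect agreement yields Kw = 1.0; chance-level agreement yields Kw near 0, and systematic disagreement can drive Kw negative, as with the −.67 value reported for the poorest items.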
Cicchetti, D.V., Lord, C., Koenig, K. et al. Reliability of the ADI-R: Multiple Examiners Evaluate a Single Case. J Autism Dev Disord 38, 764–770 (2008). https://doi.org/10.1007/s10803-007-0448-3