Abstract
In an earlier investigation, the authors assessed the reliability of the ADI-R when multiple clinicians evaluated a single case, a 3-year-old female toddler suspected of having an autism spectrum disorder (Cicchetti et al. in J Autism Dev Disord 38:764–770, 2008). Applying the clinical criteria of Cicchetti and Sparrow (Am J Ment Defic 86:127–137, 1981) and those of Cicchetti et al. (Child Neuropsychol 1:126–137, 1995), 74 % of the ADI-R items showed 100 % agreement; 6 % showed excellent agreement; 7 % showed good agreement; 3 % showed fair agreement; and the remaining 10 % showed poor agreement. In this follow-up investigation, the authors describe and apply a novel method for determining levels of statistical significance of the reliability coefficients obtained in the earlier investigation. The method is based upon a modification of the Z test for comparing a given level of inter-examiner reliability with a lower-limit value of 70 % (Dixon and Massey in Introduction to statistical analysis. McGraw-Hill, New York, 1957). Results indicated that every item producing a clinically acceptable level of inter-examiner reliability was also statistically significant. The reverse was not true: a number of items with statistically significant reliability levels did not reach levels of agreement that were clinically meaningful. This indicated that clinical significance was an accurate marker of statistical significance, while statistical significance did not guarantee clinical significance. The generalization of these findings to other areas of diagnostic interest and importance is also examined.
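The abstract does not spell out the authors' exact modification of the Z test, but the general idea — testing whether an observed inter-examiner agreement proportion significantly exceeds a lower-limit value of 0.70 — can be sketched with a standard one-sample Z test for a proportion. The function name, the choice of the normal approximation, and the example figures below are illustrative assumptions, not the authors' published procedure:

```python
import math

def z_test_vs_lower_limit(p_hat: float, n: int, p0: float = 0.70) -> tuple[float, float]:
    """One-sided Z test of an observed agreement proportion p_hat,
    based on n examiner pairings, against a lower-limit value p0.

    Returns (z statistic, one-sided upper-tail p-value).
    """
    se = math.sqrt(p0 * (1 - p0) / n)            # standard error under H0: p = p0
    z = (p_hat - p0) / se                        # test statistic
    p_value = 0.5 * math.erfc(z / math.sqrt(2))  # P(Z >= z) for standard normal
    return z, p_value

# Hypothetical example: 90 % observed agreement across 40 examiner pairings
z, p = z_test_vs_lower_limit(0.90, 40)
```

With these illustrative numbers the test rejects the 70 % lower limit at conventional levels, which mirrors the article's point: an item can clear the statistical hurdle while still falling short of the clinical criteria of Cicchetti and Sparrow (1981).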
References
Cicchetti, D. V. (1976). Assessing inter-rater reliability for rating scales: Resolving some basic issues. British Journal of Psychiatry, 129, 452–456.
Cicchetti, D. V., Fontana, A. S., & Showalter, D. (2009). Assessing the reliability of multiple assessments of PTSD symptomatology: Multiple examiners, one patient. Psychiatry Research, 166, 269–280.
Cicchetti, D. V., Lord, C., Koenig, K., Klin, A., & Volkmar, F. R. (2008). Reliability of the ADI-R: Multiple examiners evaluate a single case. Journal of Autism and Developmental Disorders, 38, 764–770.
Cicchetti, D. V., & Sparrow, S. S. (1981). Developing criteria for establishing interrater reliability of specific items: Application to assessment of adaptive behavior. American Journal of Mental Deficiency, 86, 127–137.
Cicchetti, D. V., Volkmar, F., Klin, A., & Showalter, D. (1995). Diagnosing Autism using ICD-10 criteria: A comparison of neural networks and standard multivariate procedures. Child Neuropsychology, 1, 126–137.
Cohen, J. (1968). Weighted kappa: nominal scale agreement with provision for scaled disagreement or partial credit. Psychological Bulletin, 70, 213–220.
Dixon, W., & Massey, F. J. (1957). Introduction to statistical analysis (2nd ed.). New York: McGraw-Hill.
Fleiss, J. L. (1981). Statistical methods for rates and proportions (2nd ed.). New York, NY: Wiley.
Fleiss, J. L., Levin, B., & Paik, M. C. (2003). Statistical methods for rates and proportions (3rd ed.). New York, NY: Wiley.
Hall, J. N. (1974). Inter-rater reliability of ward rating scales. British Journal of Psychiatry, 125, 248–255.
Landis, J. R., & Koch, G. G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33, 159–174.
Cicchetti, D.V., Lord, C., Koenig, K. et al. Reliability of the ADI-R for the Single Case-Part II: Clinical Versus Statistical Significance. J Autism Dev Disord 44, 3154–3160 (2014). https://doi.org/10.1007/s10803-014-2177-8