Abstract
Reliability is the degree to which a measurement produces consistent results, and it is crucial for any measurement used in scientific research. Of the statistical techniques available, the intraclass correlation coefficient (ICC) is the most widely used for assessing the reproducibility of measurements. The estimated ICC, together with its confidence interval, which reflects the underlying sampling distribution, may help gauge an experimental method's ability to identify systematic differences between research participants in a test. This study aimed to introduce a new SAS macro, ICC6, for calculating different ICC forms and their confidence intervals. The macro uses PROC GLM to generate two-way random-effects analysis of variance (ANOVA) estimates. A simulated dataset was supplied as input to the macro to calculate point estimates for the different ICC forms. Upper and lower confidence limits for each ICC form were calculated from the F distribution. The macro provides a complete set of ICC forms and their confidence intervals. A validation analysis with the commercial software packages STATA and SPSS produced identical results. This article reports the development of a SAS methodology, based on publicly available statistical approaches, for estimating six distinct forms of ICC and their confidence intervals. The work extends to SAS a general methodology already supported by a few other statistical software packages.
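The computation summarized above can be illustrated with a minimal sketch. It is written in Python rather than SAS and is not the ICC6 macro. It assumes that the six forms correspond to the single-measure and average-measure ICCs of the Shrout and Fleiss convention, which the abstract does not state explicitly. The function name `icc_forms`, the dictionary keys, and the toy ratings matrix are hypothetical; the matrix is not the simulated dataset used in the study. Confidence intervals are omitted because they need F-distribution quantiles, which the Python standard library does not provide.

```python
def icc_forms(ratings):
    """Point estimates of six ICC forms from a subjects-by-raters matrix.

    ratings: list of rows, one row per subject, one score per rater
    (complete data, no missing cells). Illustrative sketch only.
    """
    n = len(ratings)        # number of subjects (rows)
    k = len(ratings[0])     # number of raters (columns)
    cells = [x for row in ratings for x in row]
    grand = sum(cells) / len(cells)

    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(col) / n for col in zip(*ratings)]

    # Two-way ANOVA sums of squares
    ss_total = sum((x - grand) ** 2 for x in cells)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_err = ss_total - ss_rows - ss_cols

    # Mean squares
    bms = ss_rows / (n - 1)                   # between subjects
    jms = ss_cols / (k - 1)                   # between raters
    ems = ss_err / ((n - 1) * (k - 1))        # residual
    wms = (ss_cols + ss_err) / (n * (k - 1))  # within subjects (one-way model)

    return {
        "ICC(1,1)": (bms - wms) / (bms + (k - 1) * wms),
        "ICC(2,1)": (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n),
        "ICC(3,1)": (bms - ems) / (bms + (k - 1) * ems),
        "ICC(1,k)": (bms - wms) / bms,
        "ICC(2,k)": (bms - ems) / (bms + (jms - ems) / n),
        "ICC(3,k)": (bms - ems) / bms,
    }


# Toy data: 6 subjects rated by 4 raters (illustrative values only)
ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]

iccs = icc_forms(ratings)
for name, value in iccs.items():
    print(f"{name}: {value:.2f}")
```

For this toy matrix the single-measure estimates are about 0.17, 0.29, and 0.71, and the average-measure estimates are about 0.44, 0.62, and 0.91. The spread shows how strongly the chosen ICC form affects the reported reliability when raters differ systematically.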
Author information
Contributions
VSSK and SS wrote the manuscript. Both authors reviewed the manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Senthil Kumar, V.S., Shahraz, S. Intraclass correlation for reliability assessment: the introduction of a validated program in SAS (ICC6). Health Serv Outcomes Res Method 24, 1–13 (2024). https://doi.org/10.1007/s10742-023-00299-x
DOI: https://doi.org/10.1007/s10742-023-00299-x