Cognitive, Affective, & Behavioral Neuroscience, Volume 13, Issue 4, pp 714–724

Quantifying the reliability of image replication studies: The image intraclass correlation coefficient (I2C2)

  • H. Shou
  • A. Eloyan
  • S. Lee
  • V. Zipunnikov
  • A. N. Crainiceanu
  • M. B. Nebel
  • B. Caffo
  • M. A. Lindquist
  • C. M. Crainiceanu


This article proposes the image intraclass correlation (I2C2) coefficient as a global measure of reliability for imaging studies. The I2C2 generalizes the classic intraclass correlation (ICC) coefficient to the case when the data of interest are images, thereby providing a measure that is both intuitive and convenient. Drawing a connection with classical measurement error models for replication experiments, the I2C2 can be computed quickly, even in high-dimensional imaging studies. A nonparametric bootstrap procedure is introduced to quantify the variability of the I2C2 estimator. Furthermore, a Monte Carlo permutation test is used to assess reproducibility against the null of a zero I2C2, which represents a complete lack of reproducibility. The methods are applied to three replication studies arising from different brain imaging modalities and settings: regional analysis of volumes in normalized space imaging for characterizing brain morphology, seed-voxel brain activation maps based on resting-state functional magnetic resonance imaging (fMRI), and fractional anisotropy in an area surrounding the corpus callosum via diffusion tensor imaging. Notably, resting-state fMRI brain activation maps are found to have low reliability, ranging from .2 to .4. Software and data are available to provide easy access to the proposed methods.
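
As a rough illustration of the quantities described in the abstract, the sketch below estimates an I2C2-style coefficient as 1 minus the ratio of within-subject (replication) variability to total variability, each summed over voxels. It then resamples subjects for a bootstrap interval and shuffles subject labels for a Monte Carlo permutation p-value. The moment-based normalization, the function names (`i2c2`, `bootstrap_ci`, `permutation_pvalue`, `simulate`), and the simulated data are assumptions made for illustration; they are not taken from the article or its released software.

```python
import random

def _group(subject_ids):
    """Map each subject label to the indices of its images."""
    groups = {}
    for idx, sid in enumerate(subject_ids):
        groups.setdefault(sid, []).append(idx)
    return groups

def i2c2(images, subject_ids):
    """Assumed moment-based form: 1 - tr(K_U) / tr(K_W).

    images: flattened images (equal-length lists of floats).
    subject_ids: one subject label per image; each subject needs >= 2 images.
    """
    n, n_voxels = len(images), len(images[0])
    groups = _group(subject_ids)

    def mean_image(indices):
        return [sum(images[i][v] for i in indices) / len(indices)
                for v in range(n_voxels)]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Total variability: squared distance of each image from the grand mean.
    grand = mean_image(range(n))
    total = sum(sq_dist(img, grand) for img in images) / (n - 1)
    # Replication variability: squared distance from each subject's own mean.
    within = 0.0
    for indices in groups.values():
        m = mean_image(indices)
        within += sum(sq_dist(images[i], m) for i in indices)
    within /= n - len(groups)
    return 1.0 - within / total

def bootstrap_ci(images, subject_ids, n_boot=200, alpha=0.05, seed=0):
    """Percentile interval from resampling whole subjects with replacement."""
    rng = random.Random(seed)
    groups = _group(subject_ids)
    subjects = list(groups)
    stats = []
    for _ in range(n_boot):
        boot_imgs, boot_ids = [], []
        for new_id in range(len(subjects)):
            # A subject drawn twice gets two distinct labels.
            for i in groups[rng.choice(subjects)]:
                boot_imgs.append(images[i])
                boot_ids.append(new_id)
        stats.append(i2c2(boot_imgs, boot_ids))
    stats.sort()
    lower = stats[int((alpha / 2) * (n_boot - 1))]
    upper = stats[int((1 - alpha / 2) * (n_boot - 1))]
    return lower, upper

def permutation_pvalue(images, subject_ids, n_perm=200, seed=0):
    """Share of label shuffles whose I2C2 is at least the observed value."""
    rng = random.Random(seed)
    observed = i2c2(images, subject_ids)
    labels = list(subject_ids)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if i2c2(images, labels) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)

def simulate(n_subjects=10, n_visits=2, n_voxels=20, noise_sd=0.5, seed=1):
    """Toy replication study: subject-specific image plus visit noise."""
    rng = random.Random(seed)
    images, ids = [], []
    for s in range(n_subjects):
        truth = [rng.gauss(0, 1) for _ in range(n_voxels)]
        for _ in range(n_visits):
            images.append([x + rng.gauss(0, noise_sd) for x in truth])
            ids.append(s)
    return images, ids

images, ids = simulate()
print("I2C2 estimate:", round(i2c2(images, ids), 2))
print("95% bootstrap interval:", bootstrap_ci(images, ids))
print("permutation p-value:", permutation_pvalue(images, ids))
```

Plain Python lists keep the sketch dependency-free; for real imaging data with many voxels, an array library would be the practical choice. Resampling whole subjects, not individual images, keeps each subject's replicates together in every bootstrap sample.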


Keywords: RAVENS, DTI, fMRI, Replication studies, Intraclass correlation coefficient


Author Note

This research was supported by grant R01NS060910 from the National Institute of Neurological Disorders and Stroke and by grants R01EB012547 and P41EB015909 from the National Institute of Biomedical Imaging and Bioengineering. This work represents the opinions of the researchers, and not necessarily those of the granting organizations. The authors would like to thank Dr. Daniel Reich from NIH/NINDS, Dr. Peter Calabresi, and their research teams for collecting and sharing the DTI-MRI data sets, as well as Ronald Caffo for assistance with copy editing.



Copyright information

© Psychonomic Society, Inc. 2013

Authors and Affiliations

  • H. Shou (1)
  • A. Eloyan (1)
  • S. Lee (2)
  • V. Zipunnikov (1)
  • A. N. Crainiceanu (3)
  • M. B. Nebel (4)
  • B. Caffo (1)
  • M. A. Lindquist (1)
  • C. M. Crainiceanu (1)

  1. Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, USA
  2. Department of Psychiatry and the Department of Biostatistics, Columbia University, New York, USA
  3. Computer Science Department, United States Naval Academy, MD, USA
  4. Laboratory for Neurocognitive and Imaging Research, Kennedy Krieger Institute, Baltimore, USA
