Brain Imaging and Behavior, Volume 12, Issue 6, pp 1678–1695

A comparative study of segmentation techniques for the quantification of brain subcortical volume

  • Theophilus N. Akudjedu (Email author)
  • Leila Nabulsi
  • Migle Makelyte
  • Cathy Scanlon
  • Sarah Hehir
  • Helen Casey
  • Srinath Ambati
  • Joanne Kenney
  • Stefani O’Donoghue
  • Emma McDermott
  • Liam Kilmartin
  • Peter Dockery
  • Colm McDonald
  • Brian Hallahan
  • Dara M. Cannon


Manual tracing of magnetic resonance imaging (MRI) represents the gold standard for segmentation in clinical neuropsychiatric research studies; however, automated approaches are increasingly used because manual tracing is time-consuming. The accuracy of segmentation techniques for subcortical structures has not been systematically investigated in large samples. We compared the accuracy of fully automated [(i) model-based: FSL-FIRST; (ii) patch-based: volBrain], semi-automated (FreeSurfer) and stereological (Measure®) segmentation techniques with manual tracing (ITK-SNAP) for delineating volumes of the caudate (easy-to-segment) and the hippocampus (difficult-to-segment). High-resolution 1.5 T T1-weighted MR images were obtained from 177 patients with major psychiatric disorders and 104 healthy participants. The relative consistency (partial correlation), absolute agreement (intraclass correlation coefficient, ICC) and potential technique bias (Bland–Altman plots) of each technique were compared with manual segmentation. Each technique yielded high correlations (0.77–0.87, p < 0.0001) and moderate ICCs (0.28–0.49) relative to manual segmentation for the caudate. For the hippocampus, stereology yielded good consistency (0.52–0.55, p < 0.0001) and ICC (0.47–0.49), whereas automated and semi-automated techniques yielded poor ICC (0.07–0.10) and moderate consistency (0.35–0.62, p < 0.0001). Bias was least using stereology for segmentation of the hippocampus and using FreeSurfer for segmentation of the caudate. In a typical neuropsychiatric MRI dataset, automated segmentation techniques provide good accuracy for an easy-to-segment structure such as the caudate, whereas for the hippocampus they showed a reasonable correlation with volume but poor absolute agreement. This indicates that manual or stereological volume estimation should be considered for studies that require high levels of precision, such as those with small sample sizes.
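
The distinction drawn above between relative consistency and absolute agreement can be illustrated with a minimal sketch. It is not the authors' analysis pipeline: the function names are hypothetical, the volumes are invented for illustration, a plain Pearson correlation stands in for the partial correlation used in the study, and the ICC form (two-way, single-measure, absolute agreement) is an assumption, since the abstract does not state which form was used.

```python
import numpy as np


def bland_altman(reference, method):
    """Return (bias, lower, upper): mean difference and 95% limits of agreement."""
    reference = np.asarray(reference, dtype=float)
    method = np.asarray(method, dtype=float)
    diff = method - reference
    bias = diff.mean()
    sd = diff.std(ddof=1)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def icc_absolute_agreement(reference, method):
    """Two-way, single-measure, absolute-agreement ICC (assumed form)."""
    data = np.column_stack([reference, method]).astype(float)
    n, k = data.shape  # n subjects, k = 2 techniques
    grand = data.mean()
    ss_rows = k * ((data.mean(axis=1) - grand) ** 2).sum()  # between subjects
    ss_cols = n * ((data.mean(axis=0) - grand) ** 2).sum()  # between techniques
    ss_err = ((data - grand) ** 2).sum() - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + (k / n) * (msc - mse))


# Hypothetical volumes (cm^3): the second technique overestimates by a constant.
manual = [3.0, 3.5, 4.0, 4.5, 5.0]
automated = [v + 1.0 for v in manual]

r = np.corrcoef(manual, automated)[0, 1]
bias, lower, upper = bland_altman(manual, automated)
icc = icc_absolute_agreement(manual, automated)
print(f"r = {r:.3f}, bias = {bias:.3f}, ICC = {icc:.3f}")
```

With a constant offset, the two techniques rank participants identically, so the correlation is perfect, while the Bland–Altman bias equals the offset and the absolute-agreement ICC drops to about 0.56. This is the same qualitative pattern as the high correlations and lower ICCs reported above.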


Keywords: Segmentation techniques; Stereology; Subcortical structures; FreeSurfer; FSL-FIRST; VolBrain



TNA’s doctoral training is funded by the College of Medicine, Nursing and Health Sciences Postgraduate Scholarship Scheme, NUI Galway (2016–2020). We would also like to thank all of the participants and their families for their involvement in the Research Programme of the Clinical Neuroimaging laboratory, NUI Galway.

Compliance with ethical standards

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.

Informed consent

All participants provided written informed consent for the relevant studies.

Supplementary material

11682_2018_9835_MOESM1_ESM.doc (8.8 mb)
Supplementary material 1 (DOC 9035 KB)



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Theophilus N. Akudjedu (1; Email author)
  • Leila Nabulsi (1)
  • Migle Makelyte (1, 2)
  • Cathy Scanlon (1)
  • Sarah Hehir (1)
  • Helen Casey (1)
  • Srinath Ambati (1)
  • Joanne Kenney (1)
  • Stefani O’Donoghue (1)
  • Emma McDermott (1)
  • Liam Kilmartin (2)
  • Peter Dockery (1)
  • Colm McDonald (1)
  • Brian Hallahan (1)
  • Dara M. Cannon (1)

  1. Centre for Neuroimaging & Cognitive Genomics (NICOG), Clinical Neuroimaging Laboratory, NCBES Galway Neuroscience Centre, Psychiatry & Anatomy, School of Medicine, College of Medicine Nursing and Health Sciences, National University of Ireland Galway, Galway, Ireland
  2. College of Engineering and Informatics, National University of Ireland Galway, Galway, Ireland
