Assessing robustness and generalization of a deep neural network for brain MS lesion segmentation on real-world data

Chaves, Hernán; Serra, María M.; Shalom, Diego E.; Ananía, Pilar; Rueda, Fernanda; Osa Sanz, Emilia; Stefanoff, Nadia I.; Rodríguez Murúa, Sofía; Costa, Martín E.; Kitamura, Felipe C.; Yañez, Paulina; Cejas, Claudia; Correale, Jorge; Ferrante, Enzo; Fernández Slezak, Diego; Farez, Mauricio F.

doi:10.1007/s00330-023-10093-5

Imaging Informatics and Artificial Intelligence
Published: 31 August 2023

Volume 34, pages 2024–2035, (2024)

European Radiology

Hernán Chaves¹,
María M. Serra¹,
Diego E. Shalom^2,3,4,
Pilar Ananía⁵,
Fernanda Rueda⁶,
Emilia Osa Sanz¹,
Nadia I. Stefanoff¹,
Sofía Rodríguez Murúa⁷,
Martín E. Costa⁵,
Felipe C. Kitamura⁸,
Paulina Yañez¹,
Claudia Cejas¹,
Jorge Correale⁹,
Enzo Ferrante¹⁰,
Diego Fernández Slezak^7,11,12 &
…
Mauricio F. Farez^6,7,13

Abstract

Objectives

Evaluate the performance of a deep learning (DL)–based model for multiple sclerosis (MS) lesion segmentation and compare it to other DL and non-DL algorithms.

Methods

This ambispective, multicenter study assessed the performance of a DL-based model for MS lesion segmentation and compared it to alternative DL- and non-DL-based methods. Models were tested on internal (n = 20) and external (n = 18) datasets from Latin America, and on an external dataset from Europe (n = 49). We also examined robustness by rescanning six patients from our MS clinical cohort. In addition, we measured agreement among human annotators and interpreted the model results against that agreement. Performance and robustness were assessed using the intraclass correlation coefficient (ICC), the Dice coefficient (DC), and the coefficient of variation (CV).
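As an illustration of two of the metrics named above, the following is a minimal sketch of the Dice coefficient and the coefficient of variation. It is not the authors' code: the function names, the flat 0/1 mask representation, the use of the sample standard deviation, and the toy values are all assumptions made for the example.

```python
from statistics import mean, stdev


def dice_coefficient(mask_a, mask_b):
    """Dice coefficient between two binary masks given as flat 0/1 sequences."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must contain the same number of voxels")
    # Voxels labeled as lesion in both masks.
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    # Total lesion voxels in each mask, added together.
    total = sum(1 for a in mask_a if a) + sum(1 for b in mask_b if b)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100.0 * stdev(values) / m


# Hypothetical toy data: 8 voxels from a manual and an automatic mask.
manual = [0, 1, 1, 1, 0, 0, 1, 0]
automatic = [0, 1, 1, 0, 0, 1, 1, 0]
print(dice_coefficient(manual, automatic))  # 3 shared voxels, 4 + 4 labeled: 0.75

# Hypothetical lesion volumes (mL) from repeated scans of one patient.
rescan_volumes = [10.0, 10.5, 9.5, 10.0]
print(round(coefficient_of_variation(rescan_volumes), 2))
```

A DC of 1 means the two masks overlap exactly, and a lower CV across rescans of the same patient indicates more repeatable volume estimates.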

Results

Inter-human ICC ranged from 0.89 to 0.95, while spatial agreement among annotators showed a median DC of 0.63. Using expert manual segmentations as ground truth, our DL model achieved a median DC of 0.73 on the internal, 0.66 on the external, and 0.70 on the challenge datasets. The performance of our DL model exceeded that of the alternative algorithms on all datasets. In the robustness experiment, our DL model also achieved higher DC (ranging from 0.82 to 0.90) and lower CV (ranging from 0.7 to 7.9%) when compared to the alternative methods.
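For readers unfamiliar with the ICC, the sketch below computes one common variant, the two-way random-effects, single-rater form often written ICC(2,1), from a subjects-by-raters table. The abstract does not state which ICC form was used, so the choice of variant, the function name, and the toy ratings are assumptions for illustration only.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a table with one row per subject and one column per rater."""
    n = len(ratings)        # number of subjects
    k = len(ratings[0])     # number of raters
    if n < 2 or k < 2:
        raise ValueError("need at least two subjects and two raters")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_err = ss_total - ss_rows - ss_cols                    # residual

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Illustrative toy table: 6 subjects rated by 4 raters.
toy_ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc_2_1(toy_ratings), 2))  # about 0.29 for this table
```

In this toy table the raters differ systematically in level, which lowers ICC(2,1); values near 1, such as those reported for the human annotators, indicate that raters produce nearly the same measurements for each subject.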

Conclusion

Our DL-based model outperformed alternative methods for brain MS lesion segmentation. The model also generalized well to unseen data and showed robust performance and low processing times on both real-world and challenge-based data.

Clinical relevance statement

Our DL-based model demonstrated superior performance in accurately segmenting brain MS lesions compared to alternative methods, indicating its potential for clinical application with improved accuracy, robustness, and efficiency.

Key Points

• Automated lesion load quantification in MS patients is valuable; however, more accurate methods are still necessary.

• A novel deep learning model outperformed alternative MS lesion segmentation methods on multisite datasets.

• Deep learning models are particularly suitable for MS lesion segmentation in clinical scenarios.

Abbreviations

CV: Coefficient of variation
DC: Dice coefficient
DL: Deep learning
DSDV: Different-scanner different-visit
DSSV: Different-scanner same-visit
ICC: Intraclass correlation coefficient
LGA: Lesion growth algorithm
LPA: Lesion prediction algorithm
LST: Lesion segmentation tool
MS: Multiple sclerosis
SSDV: Same-scanner different-visit
SSSV: Same-scanner same-visit
WM: White matter

Acknowledgements

The authors gratefully acknowledge the support of NVIDIA Corporation with the donation of a Titan XP GPU used for this research. EF, DFS, and DS were supported by Fundación Sadosky.

Funding

The authors state that this work has not received any funding.

Author information

Authors and Affiliations

Diagnostic Imaging Department, Fleni, Montañeses, 2325 (C1428AQK), Ciudad de Buenos Aires, Argentina
Hernán Chaves, María M. Serra, Emilia Osa Sanz, Nadia I. Stefanoff, Paulina Yañez & Claudia Cejas
Department of Physics, University of Buenos Aires (UBA), Buenos Aires, Argentina
Diego E. Shalom
Physics Institute of Buenos Aires (IFIBA) CONICET, Buenos Aires, Argentina
Diego E. Shalom
Laboratorio de Neurociencia, Universidad Torcuato Di Tella, Buenos Aires, Argentina
Diego E. Shalom
ENTELAI, Buenos Aires, Argentina
Pilar Ananía & Martín E. Costa
Radiology Department, Diagnósticos da América SA (Dasa), Rio de Janeiro, Brazil
Fernanda Rueda & Mauricio F. Farez
Center for Research On Neuroimmunological Diseases (CIEN), Fleni, Buenos Aires, Argentina
Sofía Rodríguez Murúa, Diego Fernández Slezak & Mauricio F. Farez
DasaInova, Diagnósticos da América SA (Dasa), São Paulo, São Paulo, Brazil
Felipe C. Kitamura
Neurology Department, Fleni, Buenos Aires, Argentina
Jorge Correale
Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional, sinc(i) CONICET-UNL, Santa Fe, Argentina
Enzo Ferrante
Departamento de Computación, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires (UBA), Buenos Aires, Argentina
Diego Fernández Slezak
Instituto de Investigación en Ciencias de la Computación (ICC), CONICET-UBA, Buenos Aires, Argentina
Diego Fernández Slezak
Center for Biostatistics, Epidemiology and Public Health (CEBES), Fleni, Buenos Aires, Argentina
Mauricio F. Farez

Corresponding author

Correspondence to Hernán Chaves.

Ethics declarations

Guarantor

The scientific guarantor of this publication is Hernán Chaves.

Conflict of Interest

The listed authors declare relationships with the following companies:

• Diego Fernández Slezak: is CTO and co-founder of Entelai.

• Diego E. Shalom: has received stipends as a scientific advisor from Entelai.

• Enzo Ferrante: has received stipends as a scientific advisor from Entelai.

• Pilar Ananía: Entelai employee.

• Felipe Kitamura: consultant for MD.ai and employed by DASA.

• Hernán Chaves: has received stipends as a medical advisor from Entelai.

• Jorge Correale: received stipends from Biogen, Merck, Novartis, Roche, Bayer, Sanofi-Genzyme, Gador, Raffo, Bristol Myers Squibb, and Janssen.

• María Mercedes Serra: has received stipends as a medical advisor from Entelai.

• Martín Elías Costa: Entelai employee.

• Mauricio Franco Farez: is CEO and co-founder of Entelai.

Statistics and Biometry

One of the authors (Mauricio Franco Farez) has significant statistical expertise.

Informed Consent

Written informed consent was waived by the Institutional Review Board.

Ethical Approval

Institutional Review Board approval was obtained.

Study subjects or cohorts overlap

No study subjects or cohorts have been previously reported.

Methodology

• prospective and retrospective

• diagnostic and observational study

• multicenter study

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (DOCX 2037 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Chaves, H., Serra, M.M., Shalom, D.E. et al. Assessing robustness and generalization of a deep neural network for brain MS lesion segmentation on real-world data. Eur Radiol 34, 2024–2035 (2024). https://doi.org/10.1007/s00330-023-10093-5

Received: 03 February 2023
Revised: 01 July 2023
Accepted: 12 July 2023
Published: 31 August 2023
Issue Date: March 2024
DOI: https://doi.org/10.1007/s00330-023-10093-5
