A Comparison of Lung Nodule Segmentation Algorithms: Methods and Results from a Multi-institutional Study

Kalpathy-Cramer, Jayashree; Zhao, Binsheng; Goldgof, Dmitry; Gu, Yuhua; Wang, Xingwei; Yang, Hao; Tan, Yongqiang; Gillies, Robert; Napel, Sandy

doi:10.1007/s10278-016-9859-z

Published: 03 February 2016

Volume 29, pages 476–487, (2016)

Journal of Digital Imaging

Jayashree Kalpathy-Cramer¹,
Binsheng Zhao²,
Dmitry Goldgof³,
Yuhua Gu⁴,
Xingwei Wang⁵,
Hao Yang²,
Yongqiang Tan²,
Robert Gillies⁴ &
Sandy Napel⁵

Abstract

Tumor volume estimation and accurate, reproducible segmentation of tumor borders in medical images are important in the diagnosis, staging, and assessment of response to cancer therapy. The goal of this study was to demonstrate the feasibility of a multi-institutional effort to assess the repeatability and reproducibility of nodule borders, and the bias of volume estimates, produced by computerized segmentation algorithms in CT images of lung cancer, and to provide results from such a study. The dataset used for this evaluation consisted of 52 tumors in 41 CT volumes (40 patient datasets and 1 dataset containing scans of 12 phantom nodules of known volume) from five collections available in The Cancer Imaging Archive. Three academic institutions developing lung nodule segmentation algorithms submitted results from three repeat runs for each of the nodules. We compared the performance of the algorithms using several measures of spatial overlap and of volume. Nodule sizes varied from 29 μl to 66 ml, and the nodules demonstrated a diversity of shapes. Agreement in spatial overlap of segmentations was significantly higher for multiple runs of the same algorithm than between segmentations generated by different algorithms (p < 0.05) and was significantly higher on the phantom dataset than on the other datasets (p < 0.05). Algorithms differed significantly in the bias of the measured volumes of the phantom nodules (p < 0.05), underscoring the need for assessing performance on clinical data in addition to phantoms. Algorithms that most accurately estimated nodule volumes were not the most repeatable, emphasizing the need to evaluate both their accuracy and precision. There were considerable differences between algorithms, especially in a subset of heterogeneous nodules, underscoring the recommendation that the same software be used at all time points in longitudinal studies.
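The kinds of measurement described in the abstract can be illustrated with a minimal Python sketch. It is not the study's code: the abstract does not name the overlap measures used, so the Dice coefficient is assumed here as one common choice, and the function names (dice, volume_ml, percent_bias, mean_pairwise_dice), the cube helper, and the 1 mm³ voxel size are all hypothetical. Masks are represented as sets of voxel index tuples, and three synthetic "repeat runs" stand in for real segmentations.

```python
from itertools import combinations


def dice(a, b):
    """Dice overlap of two voxel sets: 2|A ∩ B| / (|A| + |B|)."""
    if not a and not b:
        return 1.0  # convention assumed here: two empty masks agree
    return 2.0 * len(a & b) / (len(a) + len(b))


def volume_ml(mask, voxel_mm3):
    """Segmented volume in millilitres (1 ml = 1000 mm^3)."""
    return len(mask) * voxel_mm3 / 1000.0


def percent_bias(measured_ml, true_ml):
    """Signed volume error relative to a known (e.g. phantom) volume."""
    return 100.0 * (measured_ml - true_ml) / true_ml


def mean_pairwise_dice(masks):
    """Average Dice over all pairs of masks, e.g. repeat runs of one algorithm."""
    pairs = list(combinations(masks, 2))
    return sum(dice(a, b) for a, b in pairs) / len(pairs)


def cube(origin, side):
    """Hypothetical helper: a solid cube of voxels used as a synthetic mask."""
    ox, oy, oz = origin
    return {
        (x, y, z)
        for x in range(ox, ox + side)
        for y in range(oy, oy + side)
        for z in range(oz, oz + side)
    }


# Three synthetic repeat runs of 1000 voxels each; run2 is shifted by one voxel.
run1 = cube((0, 0, 0), 10)
run2 = cube((1, 0, 0), 10)
run3 = cube((0, 0, 0), 10)

print(dice(run1, run2))                        # overlap between two runs
print(mean_pairwise_dice([run1, run2, run3]))  # repeatability across runs
print(volume_ml(run1, voxel_mm3=1.0))          # volume assuming 1 mm^3 voxels
print(percent_bias(1.1, 1.0))                  # bias against a known volume
```

Comparing the pairwise overlap within one algorithm's runs against the overlap between different algorithms' masks mirrors the repeatability versus reproducibility distinction drawn in the abstract, while percent_bias applies only where the true volume is known, as with the phantom nodules.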

Acknowledgements

U.S. Department of Health and Human Services, National Institutes of Health, National Cancer Institute (R01 CA160251), (R01 CA149490), (U01 CA140207), (U01 CA143062), (U01 CA154601), (U24 CA180927) and (U24 CA180918).

Author information

Authors and Affiliations

Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
Jayashree Kalpathy-Cramer
Department of Radiology, Columbia University Medical Center, New York, NY, USA
Binsheng Zhao, Hao Yang & Yongqiang Tan
Department of Computer Science and Engineering, University of South Florida, Tampa, FL, USA
Dmitry Goldgof
Departments of Cancer Imaging and Metabolism, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, USA
Yuhua Gu & Robert Gillies
Department of Radiology, Stanford University School of Medicine, James H. Clark Center S323 318 Campus Drive, Stanford, CA, 94305-5450, USA
Xingwei Wang & Sandy Napel

Corresponding author

Correspondence to Sandy Napel.

About this article

Cite this article

Kalpathy-Cramer, J., Zhao, B., Goldgof, D. et al. A Comparison of Lung Nodule Segmentation Algorithms: Methods and Results from a Multi-institutional Study. J Digit Imaging 29, 476–487 (2016). https://doi.org/10.1007/s10278-016-9859-z

Published: 03 February 2016
Issue Date: August 2016
DOI: https://doi.org/10.1007/s10278-016-9859-z
