A comparison of ground truth estimation methods
Knowledge of the exact shape of a lesion, or ground truth (GT), is necessary for the development of diagnostic tools by means of algorithm validation, measurement metric analysis, and accurate size estimation. Four methods that estimate GTs from multiple readers’ markings by considering the spatial location of voxels were compared: thresholded probability map at 0.50 (TPM0.50) and at 0.75 (TPM0.75), simultaneous truth and performance level estimation (STAPLE), and truth estimate from self distances (TESD).
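As an illustration of the simplest of these estimators, a thresholded probability map can be sketched as follows. This is a minimal sketch, not the authors' implementation: the function name `thresholded_pmap`, the flattened 0/1 mask representation, and the inclusive comparison (a voxel is kept when the fraction of readers marking it is at least the threshold) are all assumptions made for the example.

```python
def thresholded_pmap(reader_masks, threshold):
    """Estimate a ground truth from several binary reader masks.

    reader_masks: list of equal-length sequences of 0/1 values, one per
                  reader (a flattened volume; hypothetical representation).
    threshold:    fraction of readers that must mark a voxel, e.g. 0.50.
    The inclusive comparison (>=) is an assumption of this sketch.
    """
    n_readers = len(reader_masks)
    estimate = []
    for voxel_votes in zip(*reader_masks):
        votes = sum(1 for v in voxel_votes if v)
        # Keep the voxel when enough readers agree on it.
        estimate.append(votes >= threshold * n_readers)
    return estimate


# Toy example: four readers marking a five-voxel "volume".
readers = [
    [1, 1, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 1, 1, 1, 0],
    [1, 0, 0, 0, 0],
]
tpm_050 = thresholded_pmap(readers, 0.50)  # needs at least 2 of 4 readers
tpm_075 = thresholded_pmap(readers, 0.75)  # needs at least 3 of 4 readers
print(sum(tpm_050), sum(tpm_075))  # estimated sizes in voxels
```

With four readers, the 0.50 threshold keeps every voxel marked by at least two readers and the 0.75 threshold keeps only those marked by at least three, so under this convention the 0.75 estimate can never be larger than the 0.50 estimate.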
A subset of the publicly available Lung Image Database Consortium archive was used, selecting pulmonary nodules documented by all four radiologists. The pair-wise similarities between the estimated GTs were analyzed by computing the respective Jaccard coefficients. Then, with respect to the readers’ marking volumes, the estimated volumes were ranked and the sign test of the differences between them was performed.
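The two statistical tools named above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the study's analysis code: the helper names `jaccard` and `sign_test_p`, the flattened 0/1 mask representation, the convention that two empty masks have similarity 1.0, and the choice of a two-sided exact sign test that discards ties are all assumptions of the example.

```python
from math import comb


def jaccard(mask_a, mask_b):
    """Jaccard coefficient |A and B| / |A or B| of two binary masks."""
    inter = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    union = sum(1 for a, b in zip(mask_a, mask_b) if a or b)
    # Assumption: two empty masks are treated as identical.
    return inter / union if union else 1.0


def sign_test_p(xs, ys):
    """Two-sided exact sign test on paired samples (ties discarded)."""
    diffs = [x - y for x, y in zip(xs, ys) if x != y]
    n = len(diffs)
    if n == 0:
        return 1.0
    positives = sum(1 for d in diffs if d > 0)
    tail = min(positives, n - positives)
    # Probability of a split at least this extreme under Binomial(n, 0.5).
    p = 2 * sum(comb(n, i) for i in range(tail + 1)) / 2 ** n
    return min(1.0, p)


# Toy data: two estimated masks, and paired volume estimates per nodule.
similarity = jaccard([1, 1, 1, 0], [1, 1, 0, 0])
volumes_a = [120, 95, 210, 64, 180, 77, 150, 99]
volumes_b = [110, 90, 200, 60, 170, 70, 140, 91]
p_value = sign_test_p(volumes_a, volumes_b)
print(round(similarity, 3), p_value)
```

The sign test uses only the direction of each paired difference, not its magnitude, which makes it a reasonable choice when the volume differences cannot be assumed to follow a symmetric distribution.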
The results show that (a) the rank variations among the four methods and the volume differences between STAPLE and TESD are not statistically significant; (b) TPM0.50 estimates are statistically larger; (c) TPM0.75 estimates are statistically smaller; and (d) there is some spatial disagreement among the estimates, as shown by the one-sided 90% confidence intervals for each pair: TPM0.75 and TPM0.50, [0.67, 1.00]; TPM0.75 and STAPLE, [0.67, 1.00]; TPM0.75 and TESD, [0.77, 1.00]; TPM0.50 and STAPLE, [0.93, 1.00]; TPM0.50 and TESD, [0.85, 1.00]; STAPLE and TESD, [0.85, 1.00].
The method used to estimate the GT is important: the differences highlighted that STAPLE and TESD, notwithstanding a few weaknesses, appear to be equally viable as GT estimators, while the increased availability of computing power is decreasing the appeal of TPMs. Ultimately, the choice between the two preferred GT estimation methods depends on the specific characteristics of the marked data with respect to the two elements that differentiate their approaches: the relative reliabilities of the readers and the reliability of the region boundaries.
Keywords: CAD development, Algorithm validation, Volumetric measurement, Diagnosis, Response to therapy