Medical image segmentation is an essential tool for clinical decision making and treatment planning. Automating this process has led to significant improvements in diagnostics and patient care, especially since the recent breakthroughs triggered by deep learning. However, when integrating automatic tools into patient care, it is crucial to understand their limitations and to have means of assessing their confidence in individual cases. Aleatoric and epistemic uncertainties have been the subject of recent research, and methods have been developed to calculate these quantities automatically during segmentation inference. However, it is still unclear how much human factors affect these metrics. Varying image quality and different levels of annotator expertise are an integral part of aleatoric uncertainty, yet it is unknown how much this variability affects uncertainty in the final segmentation. Thus, in this work we explore potential links between deep network segmentation uncertainties, inter-observer variance and segmentation performance. We show how the area of disagreement between different ground-truth annotators can be developed into model confidence metrics and evaluate them on the LIDC-IDRI dataset, which contains multiple expert annotations for each subject. Our results indicate that a probabilistic 3D U-Net and a 3D U-Net using Monte-Carlo dropout during inference both show a similar correlation between our segmentation uncertainty metrics, segmentation performance and human expert variability.
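The abstract does not give the exact metric definitions, so the following is only a minimal sketch of the two kinds of quantity it mentions. It assumes that annotator disagreement is measured as the share of the union of all expert masks that is not covered by every expert, and that model uncertainty is summarised by the voxel-wise entropy of the mean foreground probability over repeated stochastic forward passes (for example, Monte-Carlo dropout samples). The function names `disagreement_fraction` and `predictive_entropy` are hypothetical, and the paper's own formulation may differ.

```python
import numpy as np


def disagreement_fraction(masks: np.ndarray) -> float:
    """Share of the annotated region on which annotators disagree.

    masks: boolean array of shape (n_annotators, ...), one binary mask
    per annotator. Returns 1 - |intersection| / |union|, or 0.0 when no
    annotator marked any voxel. (Assumed definition, not the paper's.)
    """
    masks = np.asarray(masks, dtype=bool)
    union = masks.any(axis=0)          # voxels marked by at least one expert
    intersection = masks.all(axis=0)   # voxels marked by every expert
    n_union = int(union.sum())
    if n_union == 0:
        return 0.0
    return 1.0 - int(intersection.sum()) / n_union


def predictive_entropy(prob_samples: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Voxel-wise binary entropy of the mean foreground probability.

    prob_samples: float array of shape (n_samples, ...) holding foreground
    probabilities from repeated stochastic forward passes.
    """
    p = np.clip(np.mean(prob_samples, axis=0), eps, 1.0 - eps)
    return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))


# Toy example: three annotators outlining slightly different regions.
masks = np.zeros((3, 4, 4), dtype=bool)
masks[0, 1:3, 1:3] = True   # 4 voxels
masks[1, 1:3, 1:4] = True   # 6 voxels
masks[2, 1:4, 1:3] = True   # 6 voxels
print(disagreement_fraction(masks))  # 0.5 (union of 8 voxels, 4 shared by all)
```

With per-case values of both quantities in hand, one could then correlate them with a segmentation score such as the Dice coefficient, which is the kind of relationship the abstract reports investigating.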



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Department of Computing, BioMedIA, Imperial College London, London, UK
