Balancing the Role of Priors in Multi-Observer Segmentation Evaluation

Zhu, Yaoyao; Huang, Xiaolei; Wang, Wei; Lopresti, Daniel; Long, Rodney; Antani, Sameer; Xue, Zhiyun; Thoma, George

doi:10.1007/s11265-008-0215-5

Published: 28 May 2008

Volume 55, pages 185–207, (2009)

Journal of Signal Processing Systems

Yaoyao Zhu¹, Xiaolei Huang¹, Wei Wang¹, Daniel Lopresti¹, Rodney Long², Sameer Antani², Zhiyun Xue² & George Thoma²


Abstract

Comparing a group of segmentations produced by multiple observers is a challenging problem. A good segmentation evaluation method would allow different segmentations not only to be compared, but also to be combined to generate a “true” segmentation with higher consensus. Numerous multi-observer segmentation evaluation approaches have been proposed in the literature; STAPLE in particular probabilistically estimates the true segmentation by optimally combining the observed segmentations with a prior model of the truth. Because STAPLE is an Expectation–Maximization (EM) algorithm, its convergence to the desired local optimum depends on good initializations for the truth prior and the observer-performance prior. However, accurate modeling of the initial truth prior is nontrivial. Moreover, of the two priors, the truth prior always dominates, so that in scenarios where meaningful observer-performance priors are available, STAPLE cannot take advantage of that information. In this paper, we propose a Bayesian decision formulation of the problem that permits the two types of prior knowledge to be integrated in a complementary manner in four cases with differing application purposes: (1) with known truth prior; (2) with observer prior; (3) with neither truth prior nor observer prior; and (4) with both truth prior and observer prior. The third and fourth cases are not discussed (or are effectively ignored) by STAPLE, and we propose a new method to combine multiple-observer segmentations based on the maximum a posteriori (MAP) principle, which respects the observer prior regardless of the availability of the truth prior. Based on the four scenarios, we have developed a web-based software application that implements the flexible segmentation evaluation framework for digitized uterine cervix images. Experimental results show that our framework is flexible in effectively integrating different priors for multi-observer segmentation evaluation, and that it generates results that compare favorably with those of the STAPLE algorithm and the Majority Vote Rule.
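To make the contrast between the Majority Vote Rule and a prior-aware combination concrete, the following is a minimal sketch rather than the method of this paper. It assumes binary masks, observers that are conditionally independent given the truth, and fixed per-observer sensitivity and specificity values supplied by the caller. The function names `majority_vote` and `bayes_fusion` and all parameter names are hypothetical. The second function applies a per-pixel Bayes rule, which has the same form as the E-step of STAPLE with the performance parameters held fixed.

```python
import numpy as np

def majority_vote(segs):
    """Label a pixel foreground when more than half of the observers do.

    segs: array of shape (R, H, W) holding R binary masks.
    """
    segs = np.asarray(segs, dtype=bool)
    return segs.sum(axis=0) * 2 > segs.shape[0]

def bayes_fusion(segs, sensitivity, specificity, truth_prior=0.5):
    """Per-pixel posterior P(T=1 | decisions) with fixed observer performance.

    sensitivity[r] = P(observer r says 1 | T=1)  (assumed known)
    specificity[r] = P(observer r says 0 | T=0)  (assumed known)
    truth_prior    = P(T=1), a scalar or an (H, W) map (assumed known)
    Values of exactly 0 or 1 for the rates can make the denominator zero.
    """
    segs = np.asarray(segs, dtype=bool)
    p = np.asarray(sensitivity, dtype=float)[:, None, None]
    q = np.asarray(specificity, dtype=float)[:, None, None]
    # Likelihood of the observed decisions if the pixel is truly foreground.
    like_fg = np.where(segs, p, 1.0 - p).prod(axis=0)
    # Likelihood of the observed decisions if the pixel is truly background.
    like_bg = np.where(segs, 1.0 - q, q).prod(axis=0)
    prior = np.broadcast_to(np.asarray(truth_prior, dtype=float), like_fg.shape)
    numerator = prior * like_fg
    return numerator / (numerator + (1.0 - prior) * like_bg)

# Three observers, one row of three pixels.
segs = np.array([[[1, 1, 0]], [[1, 0, 0]], [[0, 1, 0]]])
vote = majority_vote(segs)
equal = bayes_fusion(segs, [0.9, 0.9, 0.9], [0.9, 0.9, 0.9])
trusted = bayes_fusion(segs, [0.6, 0.6, 0.99], [0.6, 0.6, 0.99])
```

With equally reliable observers the posterior agrees with the vote. When the third observer is assumed to be far more reliable than the other two, the posterior for the first pixel falls below 0.5 even though two of the three observers marked it, which is the kind of influence an observer prior can have on the combined result.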



Author information

Authors and Affiliations

Department of Computer Science and Engineering, Lehigh University, Bethlehem, PA, 18015, USA
Yaoyao Zhu, Xiaolei Huang, Wei Wang & Daniel Lopresti
National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
Rodney Long, Sameer Antani, Zhiyun Xue & George Thoma


Corresponding author

Correspondence to Yaoyao Zhu.


About this article

Cite this article

Zhu, Y., Huang, X., Wang, W. et al. Balancing the Role of Priors in Multi-Observer Segmentation Evaluation. J Sign Process Syst Sign Image Video Technol 55, 185–207 (2009). https://doi.org/10.1007/s11265-008-0215-5


Received: 21 January 2008
Revised: 31 March 2008
Accepted: 12 April 2008
Published: 28 May 2008
Issue Date: April 2009
DOI: https://doi.org/10.1007/s11265-008-0215-5
