Abstract
Computer vision classification tasks rely on the availability of ground truth labels. In medical imaging in particular, these labels are typically provided by experts and can vary in quality. To reduce the influence of individual expert bias on the labels, blinded multi-expert consensus labels are commonly used as ground truth in machine learning. In this work, we investigate how good a multi-expert consensus can be, using mitotic figure (MF) identification, a relevant task in tumor malignancy assessment, as an example. To this end, we provide an exhaustive evaluation of all possible majority ensembles of 23 pathologists who independently assessed MFs within a preselected region of interest. We compared the ensembles against an immunohistochemistry-based ground truth. We found upper bounds to the experts' recognition of MFs, which in our dataset were an accuracy, sensitivity, and specificity of 88%, 82%, and 100%, respectively. An analysis of our results revealed that cells in prophase and blurry cells were among the most challenging to recognize.
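The exhaustive ensemble evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy labels, and the restriction to odd ensemble sizes (chosen here only to avoid tied votes) are all assumptions made for illustration. Each ensemble's majority vote is compared against a reference labeling to obtain accuracy, sensitivity, and specificity.

```python
from itertools import combinations


def majority_vote(votes):
    """Return 1 if strictly more than half of the votes are 1, else 0."""
    return int(2 * sum(votes) > len(votes))


def evaluate(pred, truth):
    """Return (accuracy, sensitivity, specificity) for binary labels."""
    tp = sum(p == 1 and t == 1 for p, t in zip(pred, truth))
    tn = sum(p == 0 and t == 0 for p, t in zip(pred, truth))
    fp = sum(p == 1 and t == 0 for p, t in zip(pred, truth))
    fn = sum(p == 0 and t == 1 for p, t in zip(pred, truth))
    accuracy = (tp + tn) / len(truth)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return accuracy, sensitivity, specificity


def evaluate_all_majority_ensembles(labels, truth):
    """Evaluate the majority vote of every odd-sized subset of experts.

    labels[e][i] is expert e's binary label for cell i (hypothetical layout).
    Odd sizes only: an assumption made here so that no vote can tie.
    """
    n_experts = len(labels)
    results = {}
    for size in range(1, n_experts + 1, 2):
        for members in combinations(range(n_experts), size):
            pred = [
                majority_vote([labels[e][i] for e in members])
                for i in range(len(truth))
            ]
            results[members] = evaluate(pred, truth)
    return results


# Toy data (invented for illustration): 3 experts, 4 candidate cells.
truth = [1, 1, 0, 0]  # stands in for the reference labeling
labels = [
    [1, 0, 0, 0],  # expert 0
    [1, 1, 0, 1],  # expert 1
    [0, 1, 0, 0],  # expert 2
]

results = evaluate_all_majority_ensembles(labels, truth)
for members, (acc, sens, spec) in sorted(results.items()):
    print(members, f"acc={acc:.2f} sens={sens:.2f} spec={spec:.2f}")
```

The number of subsets grows exponentially with the number of experts (for 23 experts there are 2^22 odd-sized subsets, about 4.2 million), so a full enumeration at that scale would likely call for a vectorized or otherwise more efficient implementation than this plain-Python loop.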
Copyright information
© 2023 The Author(s), under exclusive license to Springer Fachmedien Wiesbaden GmbH, part of Springer Nature
About this paper
Cite this paper
Lausser, L.M., Bertram, C.A., Klopfleisch, R., Aubreville, M. (2023). Limits of Human Expert Ensembles in Mitosis Multi-expert Ground Truth Generation. In: Deserno, T.M., Handels, H., Maier, A., Maier-Hein, K., Palm, C., Tolxdorff, T. (eds) Bildverarbeitung für die Medizin 2023. BVM 2023. Informatik aktuell. Springer Vieweg, Wiesbaden. https://doi.org/10.1007/978-3-658-41657-7_27
DOI: https://doi.org/10.1007/978-3-658-41657-7_27
Publisher Name: Springer Vieweg, Wiesbaden
Print ISBN: 978-3-658-41656-0
Online ISBN: 978-3-658-41657-7
eBook Packages: Computer Science and Engineering (German Language)