Consistent validation of gray-level thresholding image segmentation algorithms based on machine learning classifiers

  • Luca Frigau
  • Claudio Conversano
  • Francesco Mola
Regular Article


We propose a Machine Learning approach for Image Validation (MaLIV) that ranks the outputs of two or more gray-level thresholding image segmentation algorithms. MaLIV uses machine learning classifiers to rank these outputs automatically, accounting for both the computational complexity of the validation experiment and the robustness of its results. The method relies on subsampling to obtain Fisher-consistent estimates of validity measures from an extremely small sample of pixels. To this end, subsampling is combined with three alternative approaches: learning curves, asymptotic regression, and convergence in probability. We present results of experiments validating five images segmented by thirteen different algorithms.
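The idea of pairing subsampling with a learning curve and an asymptotic regression can be illustrated with a small sketch. It is a toy under stated assumptions, not the MaLIV procedure itself: the midpoint-threshold classifier, the holdout accuracy used as validity measure, the grid search over the decay rate, and all function names (`holdout_accuracy`, `learning_curve`, `fit_asymptote`) are hypothetical choices made for this example. The curve fitted is the classical asymptotic regression form y = a + b·r^x, whose intercept a is read as the plateau of the validity measure.

```python
import random


def holdout_accuracy(gray, labels, n, rng):
    """Fit a midpoint-threshold classifier on n sampled pixels, score it on the rest.

    Assumption: `labels` holds the 0/1 output of one segmentation algorithm.
    """
    chosen = set(rng.sample(range(len(gray)), n))
    fg = [gray[i] for i in chosen if labels[i] == 1]
    bg = [gray[i] for i in chosen if labels[i] == 0]
    if not fg or not bg or n >= len(gray):
        return None  # subsample missed a class, or nothing is left to score
    mean_fg, mean_bg = sum(fg) / len(fg), sum(bg) / len(bg)
    cut = (mean_fg + mean_bg) / 2
    hits = total = 0
    for i, g in enumerate(gray):
        if i in chosen:
            continue
        # Predict foreground on the side of the cut where the foreground mean lies.
        pred = 1 if (g > cut) == (mean_fg > mean_bg) else 0
        hits += int(pred == labels[i])
        total += 1
    return hits / total


def learning_curve(gray, labels, sizes, repeats=20, seed=0):
    """Mean holdout accuracy over repeated subsamples, one value per size."""
    rng = random.Random(seed)
    curve = []
    for n in sizes:
        scores = [holdout_accuracy(gray, labels, n, rng) for _ in range(repeats)]
        scores = [s for s in scores if s is not None]
        if not scores:
            raise ValueError(f"no usable subsample of size {n}")
        curve.append(sum(scores) / len(scores))
    return curve


def fit_asymptote(xs, ys, grid=99):
    """Least-squares fit of y = a + b * r**x; returns (a, b, r).

    r is searched on a grid in (0, 1); a and b follow in closed form for each r.
    Rescale xs (e.g. to tens of pixels) so that r**x does not underflow.
    """
    best = None
    for k in range(1, grid + 1):
        r = k / (grid + 1)
        zs = [r ** x for x in xs]
        mz, my = sum(zs) / len(zs), sum(ys) / len(ys)
        szz = sum((z - mz) ** 2 for z in zs)
        if szz == 0:
            continue  # all r**x identical: slope not identifiable
        b = sum((z - mz) * (y - my) for z, y in zip(zs, ys)) / szz
        a = my - b * mz
        sse = sum((y - a - b * z) ** 2 for z, y in zip(zs, ys))
        if best is None or sse < best[0]:
            best = (sse, a, b, r)
    if best is None:
        raise ValueError("could not fit: rescale xs")
    return best[1:]


if __name__ == "__main__":
    rng = random.Random(1)
    # Synthetic "image": dark and bright pixel populations, labelled by a cut at 128.
    gray = [rng.gauss(90, 25) for _ in range(2000)]
    gray += [rng.gauss(170, 25) for _ in range(2000)]
    labels = [int(g > 128) for g in gray]
    sizes = [10, 20, 40, 80, 160]
    curve = learning_curve(gray, labels, sizes)
    a, b, r = fit_asymptote([n / 10 for n in sizes], curve)
    print(f"estimated plateau of the validity measure: {a:.3f}")
```

In this sketch only a few hundred of the 4000 synthetic pixels are ever used for fitting, which mirrors the motivation of estimating a validity measure from a very small pixel sample. Comparing the fitted plateaus across the label maps produced by different thresholding algorithms would be one way to order them, though how MaLIV actually builds its ranking is not specified here.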


Image validation; Subsampling; Learning curves; Asymptotic regression; Convergence in probability; Classifiers’ prediction capabilities; MaLIV; Machine learning



The research activities of Luca Frigau described in this paper have been conducted within the R&D project “Cagliari2020” partially funded by the Italian University and Research Ministry (grant No. MIUR_PON04a2_00381). The research activities of Luca Frigau, Claudio Conversano and Francesco Mola are supported by the Regione Autonoma della Sardegna under the Grant Pacchetti Integrati di Agevolazione Industria, Artigianato e Servizi, PIA – 2013 No. 282/13 and by the Italian University and Research Ministry (Progetto Dipartimenti di Eccellenza 2018–2022).



Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  1. University of Cagliari, Cagliari, Italy
