Data-Driven Rank Aggregation with Application to Grand Challenges

  • James Fishbaugh
  • Marcel Prastawa
  • Bo Wang
  • Patrick Reynolds
  • Stephen Aylward
  • Guido Gerig
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10434)


The growing number of challenges for comparative evaluation of biomedical image analysis procedures clearly reflects a need for unbiased assessment of state-of-the-art methodological advances. Moreover, the ultimate translation of novel image analysis procedures to the clinic requires rigorous validation and evaluation of alternative schemes, a task that is best outsourced to the international research community. Challenges commonly use an increasing number of metrics in parallel, reflecting alternative ways to measure similarity. Since different measures come with different scales and distributions, they are often normalized or converted into individual rank orderings, leaving the problem of combining the set of multiple rankings into a final score. Proposed solutions average or accumulate rankings, raising the question of whether different metrics should be treated equally and whether all metrics are needed to assess closeness to truth. We address this issue with a data-driven method for automatic estimation of weights for a set of metrics based on unsupervised rank aggregation. Our method requires no normalization procedures and makes no assumptions about metric distributions. We explore the sensitivity of metrics to small changes in input data with an iterative perturbation scheme, to prioritize the contribution of the most robust metrics in the overall ranking. We show on real anatomical data that our weighting scheme can dramatically change the ranking.
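The general idea of weighting metrics by the stability of their rankings can be illustrated with a minimal sketch. This is not the algorithm of the paper: the weighting rule (inverse of the mean rank displacement under perturbation), the function names, and the toy scores are all assumptions made for illustration. The sketch works only on ranks, so no normalization of metric values is needed, and it expects the scores recomputed on perturbed inputs to be supplied by the caller.

```python
from typing import List, Sequence


def rank_order(scores: Sequence[float], higher_is_better: bool = True) -> List[int]:
    """Return 1-based ranks (1 = best); ties are broken by original position."""
    order = sorted(range(len(scores)), key=lambda i: scores[i],
                   reverse=higher_is_better)
    ranks = [0] * len(scores)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def robustness_weights(baseline, perturbed_runs, higher_is_better):
    """Hypothetical weighting rule: metrics whose rankings move less under
    perturbation receive larger weights.

    baseline[m][k]          score of method k under metric m
    perturbed_runs[p][m][k] same scores recomputed after the p-th perturbation
    """
    raw = []
    for m, scores in enumerate(baseline):
        base_ranks = rank_order(scores, higher_is_better[m])
        displacement = 0.0
        for run in perturbed_runs:
            run_ranks = rank_order(run[m], higher_is_better[m])
            displacement += sum(abs(a - b) for a, b in zip(base_ranks, run_ranks))
        mean_displacement = displacement / max(len(perturbed_runs), 1)
        raw.append(1.0 / (1.0 + mean_displacement))
    total = sum(raw)
    return [w / total for w in raw]


def aggregate(baseline, weights, higher_is_better):
    """Weighted mean rank per method (lower is better)."""
    n_methods = len(baseline[0])
    combined = [0.0] * n_methods
    for m, scores in enumerate(baseline):
        ranks = rank_order(scores, higher_is_better[m])
        for k in range(n_methods):
            combined[k] += weights[m] * ranks[k]
    return combined


# Toy values, invented for illustration: three methods, two metrics.
# Metric 0 is overlap-like (higher is better), metric 1 is distance-like.
higher_is_better = [True, False]
baseline = [[0.90, 0.85, 0.80], [5.0, 3.0, 4.0]]
perturbed_runs = [
    [[0.89, 0.86, 0.81], [3.5, 4.5, 4.0]],
    [[0.91, 0.84, 0.79], [5.2, 3.1, 4.1]],
]

weights = robustness_weights(baseline, perturbed_runs, higher_is_better)
weighted = aggregate(baseline, weights, higher_is_better)
unweighted = aggregate(baseline, [0.5, 0.5], higher_is_better)
print("weights:", weights)
print("weighted mean ranks:", weighted)
print("unweighted mean ranks:", unweighted)
```

In this toy setting the distance-like metric reorders the methods under one of the perturbations, so it receives the smaller weight, and the method with the best weighted mean rank differs from the one favored by plain rank averaging.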



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • James Fishbaugh (1), email author
  • Marcel Prastawa (2)
  • Bo Wang (3)
  • Patrick Reynolds (4)
  • Stephen Aylward (4)
  • Guido Gerig (1)

  1. Tandon School of Engineering, New York University, New York, USA
  2. Icahn School of Medicine, Mount Sinai, New York, USA
  3. GE Global Research, Niskayuna, USA
  4. Kitware Inc., Clifton Park, USA
