Cloud–Based Evaluation Framework for Big Data

  • Allan Hanbury
  • Henning Müller
  • Georg Langs
  • Bjoern H. Menze
Open Access
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7858)


The VISCERAL project is building a cloud-based framework for evaluating machine learning and information retrieval algorithms on large amounts of data. Instead of participants downloading the data and running evaluations locally, the data will be centrally available on the cloud, and the algorithms to be evaluated will be programmed in computing instances on the cloud, effectively bringing the algorithms to the data. This approach allows evaluations to be performed on terabytes of data without having to handle the logistics of moving the data or storing it on local infrastructure. After discussing the challenges of benchmarking on big data, the paper presents the design of the VISCERAL system, concentrating on the components for coordinating the benchmark participants and managing ground truth creation. The first two benchmarks run on the VISCERAL framework will address segmentation and retrieval of 3D medical images.
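The idea of bringing the algorithms to the data can be illustrated with a minimal sketch. It is not the VISCERAL implementation: the function names, the toy data, and the choice of the Dice overlap as a segmentation score are all assumptions made for illustration. The point it shows is that the participant's algorithm runs next to the centrally stored images and annotations, and only the scores are returned.

```python
from typing import Callable, Dict, List

# Hypothetical representation: a volume is a flat list of voxel values.
# Images hold intensities; segmentations hold binary labels (0 or 1).
Volume = List[int]


def dice(predicted: Volume, truth: Volume) -> float:
    """Dice overlap between two binary label volumes of equal length."""
    if len(predicted) != len(truth):
        raise ValueError("volumes must have the same number of voxels")
    overlap = sum(1 for p, t in zip(predicted, truth) if p == 1 and t == 1)
    total = sum(predicted) + sum(truth)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * overlap / total


def evaluate_on_cloud(
    algorithm: Callable[[Volume], Volume],
    images: Dict[str, Volume],
    ground_truth: Dict[str, Volume],
) -> Dict[str, float]:
    """Run a participant's algorithm where the data lives.

    Only the per-case scores are returned; images and annotations stay put.
    """
    return {
        case_id: dice(algorithm(image), ground_truth[case_id])
        for case_id, image in images.items()
    }


# Toy stand-ins for the centrally stored images and their annotations.
images = {"case-1": [0, 5, 9, 1], "case-2": [7, 2, 0, 0]}
ground_truth = {"case-1": [0, 1, 1, 0], "case-2": [1, 1, 0, 0]}


def threshold_segmenter(image: Volume) -> Volume:
    """A trivial participant algorithm: label voxels with intensity >= 5."""
    return [1 if value >= 5 else 0 for value in image]


scores = evaluate_on_cloud(threshold_segmenter, images, ground_truth)
```

In this sketch the organizer would call `evaluate_on_cloud` inside the cloud instance, so a participant never needs a local copy of `images` or `ground_truth`.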


Keywords: Evaluation, Cloud Computing, Annotation, Information Retrieval, Machine Learning



Copyright information

© Authors 2013

Authors and Affiliations

  • Allan Hanbury (1)
  • Henning Müller (2)
  • Georg Langs (3)
  • Bjoern H. Menze (4)

  1. Institute of Software Technology and Interactive Systems, Vienna University of Technology, Austria
  2. University of Applied Sciences Western Switzerland (HES-SO), Switzerland
  3. CIR Lab, Department of Radiology, Medical University of Vienna, Austria
  4. Computer Vision Laboratory, ETH Zürich, Switzerland
