Cloud–Based Evaluation Framework for Big Data

  • Conference paper
  • Open Access

Part of the Lecture Notes in Computer Science book series (LNISA, volume 7858)


The VISCERAL project is building a cloud-based framework for evaluating machine learning and information retrieval algorithms on large amounts of data. Instead of downloading data and running evaluations locally, the data will be centrally available on the cloud, and the algorithms to be evaluated will be programmed in computing instances on the cloud, effectively bringing the algorithms to the data. This approach allows evaluations to be performed on terabytes of data without needing to consider the logistics of moving the data or storing it on local infrastructure. After discussing the challenges of benchmarking on big data, the paper presents the design of the VISCERAL system, concentrating on the components for coordinating the participants in the benchmark and managing the ground truth creation. The first two benchmarks run on the VISCERAL framework will be on segmentation and retrieval of 3D medical images.


  • Evaluation
  • Cloud Computing
  • Annotation
  • Information Retrieval
  • Machine Learning

Invited Paper.



Copyright information

© 2013 Authors

About this paper

Cite this paper

Hanbury, A., Müller, H., Langs, G., Menze, B.H. (2013). Cloud–Based Evaluation Framework for Big Data. In: Galis, A., Gavras, A. (eds) The Future Internet. FIA 2013. Lecture Notes in Computer Science, vol 7858. Springer, Berlin, Heidelberg.

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-38081-5

  • Online ISBN: 978-3-642-38082-2

  • eBook Packages: Computer Science, Computer Science (R0)