
The Journal of Supercomputing, Volume 70, Issue 1, pp 408–464

Cloud computing in e-Science: research challenges and opportunities

  • Xiaoyu Yang
  • David Wallom
  • Simon Waddington
  • Jianwu Wang
  • Arif Shaon
  • Brian Matthews
  • Michael Wilson
  • Yike Guo
  • Li Guo
  • Jon D. Blower
  • Athanasios V. Vasilakos
  • Kecheng Liu
  • Philip Kershaw

Abstract

Service-oriented architecture (SOA), workflow, the Semantic Web, and Grid computing are key enabling information technologies in the development of increasingly sophisticated e-Science infrastructures and application platforms. The emergence of Cloud computing as a new computing paradigm has provided new directions and opportunities for e-Science infrastructure development, but it also presents challenges. Scientific researchers increasingly find it difficult to handle “big data” using traditional data processing techniques. These challenges demonstrate the need for a comprehensive analysis of how the above-mentioned informatics techniques can be used to develop appropriate e-Science infrastructure and platforms in the context of Cloud computing. This survey paper describes recent research advances in applying informatics techniques to facilitate scientific research, particularly from the Cloud computing perspective. Our contributions include identifying associated research challenges and opportunities, presenting lessons learned, and describing our future vision for applying Cloud computing to e-Science. We believe our findings can help indicate future trends in e-Science and can inform funding and research directions on how to employ computing technologies more appropriately in scientific research. We point out open research issues in the hope of sparking new development and innovation in the e-Science field.

Keywords

e-Science, e-Research, Informatics, Cloud computing, Semantic web, Grid computing, Workflow, Digital research, Big data

Notes

Acknowledgments

We thank the anonymous reviewers for their constructive and insightful suggestions. Professor Michael Wilson of STFC passed away suddenly during the preparation of this paper. He was closely involved with its drafting, and we are indebted to him for his ideas and insights.

Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  • Xiaoyu Yang (1, 2)
  • David Wallom (3)
  • Simon Waddington (4)
  • Jianwu Wang (5)
  • Arif Shaon (6)
  • Brian Matthews (6)
  • Michael Wilson (6)
  • Yike Guo (7)
  • Li Guo (7)
  • Jon D. Blower (1)
  • Athanasios V. Vasilakos (8)
  • Kecheng Liu (9)
  • Philip Kershaw (10)

  1. Reading e-Science Centre, University of Reading, Reading, UK
  2. Computer Network Information Centre, Chinese Academy of Sciences, Beijing, China
  3. Oxford e-Research Centre, University of Oxford, Oxford, UK
  4. Centre for e-Research, King’s College London, London, UK
  5. San Diego Supercomputer Center, University of California, San Diego, USA
  6. Scientific Computing Department, Rutherford Appleton Laboratory, STFC, Oxfordshire, UK
  7. Department of Computing, Imperial College London, London, UK
  8. Department of Computer and Telecommunications Engineering, University of Western Macedonia, Florina, Greece
  9. Informatics Research Centre, University of Reading, Reading, UK
  10. NCEO/Centre for Environmental Data Archival, Rutherford Appleton Laboratory, STFC, Didcot, UK
