Journal of Grid Computing, Volume 3, Issue 3–4, pp 259–281

An End-to-end Workflow Pipeline for Large-scale Grid Computing

  • A. Stephen McGough
  • Jeremy Cohen
  • John Darlington
  • Eleftheria Katsiri
  • William Lee
  • Sofia Panagiotidi
  • Yash Patel
Article

Abstract

In this paper we describe a service-based software architecture that enables end-to-end, high-level workflow processing in a large-scale Grid environment consisting of many heterogeneous resources. The architecture is essentially a pipeline that extends from the abstract specification of an application, through deployment and execution, to the return of results to the user. It caters for flexible deployment, performance, reliability and charging for resource usage, and addresses each of these at the specification level as well as at the realisation (brokering) and execution levels. The architecture is derived from previous work at the London e-Science Centre (LeSC), which produced the ICENI pipeline, and from our experience with e-Science projects such as GENIE, e-Protein and RealityGrid, from which we derive a set of key requirements.

Key words

brokering and planning, Grid, job launching, scheduling, workflow, workflow pipeline

Copyright information

© Springer Science+Business Media, Inc. 2006

Authors and Affiliations

  • A. Stephen McGough (1)
  • Jeremy Cohen (1)
  • John Darlington (1)
  • Eleftheria Katsiri (1)
  • William Lee (1)
  • Sofia Panagiotidi (1)
  • Yash Patel (1)

  1. London e-Science Centre, Imperial College London, London, UK
