Journal of Grid Computing, Volume 6, Issue 4, pp 369–383

Workflow-Based Data Parallel Applications on the EGEE Production Grid Infrastructure

  • Johan Montagnat
  • Tristan Glatard
  • Isabel Campos Plasencia
  • Francisco Castejón
  • Xavier Pennec
  • Giuliano Taffoni
  • Vladimir Voznesensky
  • Claudio Vuerli

Abstract

Setting up and deploying complex applications on a Grid infrastructure is still challenging, and the programming models are evolving rapidly. Efficiently exploiting Grid parallelism is often not straightforward. In this paper, we report on the techniques used for deploying applications on the EGEE production Grid through four experiments drawn from very different scientific areas: nuclear fusion, astrophysics and medical imaging. These applications all manipulate huge amounts of data and are computationally intensive. All the cases studied show that deploying data-intensive applications requires developing more or less elaborate application-level workload management systems on top of the gLite middleware to exploit the EGEE Grid resources efficiently. In particular, adopting high-level workflow management systems eases the integration of large-scale applications while exploiting Grid parallelism transparently. Different approaches to scientific workflow management are discussed. The strategy of the MOTEUR workflow manager for dealing efficiently with complex data flows is described in particular detail. Without requiring specific application development, it leads to very significant speed-ups.

Keywords

Workflows, Workload management, Data parallelism



Copyright information

© Springer Science+Business Media B.V. 2008

Authors and Affiliations

  • Johan Montagnat (1, 2)
  • Tristan Glatard (3)
  • Isabel Campos Plasencia (4)
  • Francisco Castejón (5)
  • Xavier Pennec (6)
  • Giuliano Taffoni (7)
  • Vladimir Voznesensky (8)
  • Claudio Vuerli (7)

  1. I3S laboratory, CNRS, Sophia Antipolis, France
  2. EPU, RAINBOW, Sophia Antipolis Cedex, France
  3. I3S laboratory – INRIA, CNRS, Sophia Antipolis, France
  4. CSIC, Instituto de Fisica de Cantabria, Santander, Spain
  5. Laboratorio Nacional de Fusión, Asociación Euratom/Ciemat, Madrid, Spain
  6. INRIA, Sophia Antipolis, France
  7. INAF-Osservatorio Astronomico di Trieste, Trieste, Italy
  8. Kurchatov Institute, Moscow, Russia
