Services + Components = Data Intensive Scientific Workflow Applications with MeDICi

  • Ian Gorton
  • Jared Chase
  • Adam Wynne
  • Justin Almquist
  • Alan Chappell
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5582)

Abstract

Scientific applications are often structured as workflows that execute a series of distributed software modules to analyze large data sets. Such workflows are typically constructed using general-purpose scripting languages to coordinate the execution of the various modules and to exchange data sets between them. While such scripts provide a cost-effective approach for simple workflows, they quickly become difficult to understand and modify as the workflow structure grows more complex and evolves. This makes them a major barrier to quickly deploying new algorithms and exploiting new, scalable hardware platforms. In this paper, we describe the MeDICi Workflow technology, which is specifically designed to reduce the complexity of workflow application development and to handle data-intensive workflow applications efficiently. MeDICi integrates standard component-based and service-based technologies, and employs an integration mechanism designed so that large data sets can be processed efficiently. We illustrate the use of MeDICi with a climate data processing example that we have built, and describe some of the new features we are creating to further enhance MeDICi Workflow applications.

Keywords

workflow, middleware, components, services

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Ian Gorton
    • 1
  • Jared Chase
    • 1
  • Adam Wynne
    • 1
  • Justin Almquist
    • 1
  • Alan Chappell
    • 1
  1. Pacific Northwest National Lab, Richland, USA