MPO: A System to Document and Analyze Distributed Heterogeneous Workflows

  • Kesheng Wu
  • Elizabeth N. Coviello
  • S. M. Flanagan
  • Martin Greenwald
  • Xia Lee
  • Alex Romosan
  • David P. Schissel
  • Arie Shoshani
  • Josh Stillerman
  • John Wright
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9672)

Abstract

Large scientific experiments and simulations produce vast quantities of data. Though smaller in volume, the corresponding metadata describing the production, pedigree, and ontology of that data is just as important as the raw data to the scientific discovery process. Driven by the application needs of a number of large-scale distributed workflows, we develop a metadata capturing and analysis system called MPO (short for Metadata, Provenance, Ontology). It seamlessly integrates with most data analysis environments and requires minimal changes to users’ existing analysis programs. Users have full control over how they instrument their programs, capturing as much or as little information as they desire. Once captured in a database system, the workflows can be visualized and studied through a set of web-based tools. In large scientific collaborations where the workflows have been built up over decades, this ability to instrument complex existing workflows and visualize the key interactions among the software components is tremendously useful.

Copyright information

© Springer International Publishing Switzerland (outside the US) 2016

Authors and Affiliations

  • Kesheng Wu (1)
  • Elizabeth N. Coviello (2)
  • S. M. Flanagan (2)
  • Martin Greenwald (3)
  • Xia Lee (2)
  • Alex Romosan (1)
  • David P. Schissel (2)
  • Arie Shoshani (1)
  • Josh Stillerman (3)
  • John Wright (3)

  1. Lawrence Berkeley National Laboratory, Berkeley, USA
  2. General Atomics, San Diego, USA
  3. Massachusetts Institute of Technology, Cambridge, USA
