MPO: A System to Document and Analyze Distributed Heterogeneous Workflows

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9672)

Abstract

Large scientific experiments and simulations produce vast quantities of data. Though smaller in volume, the corresponding metadata describing the production, pedigree, and ontology is just as important as the raw data to the scientific discovery process. Driven by the application needs of a number of large-scale distributed workflows, we develop a metadata capture and analysis system called MPO (short for Metadata, Provenance, Ontology). It integrates seamlessly with most data analysis environments and requires minimal changes to users’ existing analysis programs. Users have full control over how they instrument their programs, capturing as much or as little information as they desire. Once captured in a database system, the workflows can be visualized and studied through a set of web-based tools. In large scientific collaborations where the workflows have been built up over decades, this ability to instrument complex existing workflows and visualize the key interactions among the software components is tremendously useful.

This work was supported by the US DOE, Office of Advanced Scientific Computing Research and the Office of Fusion Energy Sciences under DE-SC0008697, DE-AC02-05CH11231, and DE-SC0008736.

The rights of this work are transferred to the extent transferable according to title 17 § 105 U.S.C.

Notes

  1. MPO project documentation and software are available at https://mpo.psfc.mit.edu/.

  2. SQLAlchemy is available at http://www.sqlalchemy.org/.

Author information

Corresponding author

Correspondence to Kesheng Wu.

Copyright information

© 2016 Springer International Publishing Switzerland (outside the US)

About this paper

Cite this paper

Wu, K. et al. (2016). MPO: A System to Document and Analyze Distributed Heterogeneous Workflows. In: Mattoso, M., Glavic, B. (eds) Provenance and Annotation of Data and Processes. IPAW 2016. Lecture Notes in Computer Science, vol 9672. Springer, Cham. https://doi.org/10.1007/978-3-319-40593-3_14

  • DOI: https://doi.org/10.1007/978-3-319-40593-3_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-40592-6

  • Online ISBN: 978-3-319-40593-3

  • eBook Packages: Computer Science, Computer Science (R0)
