Abstract
Data provenance graphs are a form of metadata that can be used to establish a variety of properties of data products that undergo sequences of transformations, typically specified as workflows. Their usefulness for answering user provenance queries is limited, however, unless the graphs are enhanced with domain-specific annotations. In this paper we propose a model and architecture for semantic, domain-aware provenance, and demonstrate its usefulness in answering typical user queries. Furthermore, we discuss the additional benefits and the technical implications of publishing provenance graphs as a form of Linked Data. A prototype implementation of the model is available for data produced by the Taverna workflow system.
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Missier, P., Sahoo, S.S., Zhao, J., Goble, C., Sheth, A. (2010). Janus: From Workflows to Semantic Provenance and Linked Open Data. In: McGuinness, D.L., Michaelis, J.R., Moreau, L. (eds) Provenance and Annotation of Data and Processes. IPAW 2010. Lecture Notes in Computer Science, vol 6378. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17819-1_16
DOI: https://doi.org/10.1007/978-3-642-17819-1_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-17818-4
Online ISBN: 978-3-642-17819-1
eBook Packages: Computer Science, Computer Science (R0)