Janus: From Workflows to Semantic Provenance and Linked Open Data

  • Paolo Missier
  • Satya S. Sahoo
  • Jun Zhao
  • Carole Goble
  • Amit Sheth
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6378)


Data provenance graphs are a form of metadata that can be used to establish a variety of properties of data products that undergo sequences of transformations, typically specified as workflows. Their usefulness for answering user provenance queries is limited, however, unless the graphs are enhanced with domain-specific annotations. In this paper we propose a model and architecture for semantic, domain-aware provenance, and demonstrate its usefulness in answering typical user queries. Furthermore, we discuss the additional benefits and the technical implications of publishing provenance graphs as a form of Linked Data. A prototype implementation of the model is available for data produced by the Taverna workflow system.





Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Paolo Missier (1)
  • Satya S. Sahoo (3)
  • Jun Zhao (2)
  • Carole Goble (1)
  • Amit Sheth (3)

  1. School of Computer Science, University of Manchester, UK
  2. Department of Zoology, University of Oxford, UK
  3. The Kno.e.sis Center, Wright State University, Dayton, USA
