Provenance Explorer-a graphical interface for constructing scientific publication packages from provenance trails

  • REGULAR PAPER
  • International Journal on Digital Libraries

Abstract

Scientific communities are under increasing pressure from funding organizations to publish their raw data, in addition to their traditional publications, in open archives. Many scientists would be willing to do this if they had tools that streamlined the process and exposed simple provenance information, i.e., enough to explain the methodology and validate the results without compromising the author’s intellectual property or competitive advantage. This paper presents Provenance Explorer, a tool that enables the provenance trail associated with a scientific discovery process to be visualized and explored through a graphical user interface (GUI). Based on RDF graphs, it displays the sequence of data, states and events associated with a scientific workflow, illustrating the methodology that led to the published results. The GUI also allows permitted users to expand selected links between nodes to reveal more fine-grained information and sub-workflows. More importantly, the system enables scientists to selectively construct “scientific publication packages” by choosing particular nodes from the visual provenance trail and dragging and dropping them into an RDF package, which can be uploaded to an archive or repository for publication or e-learning. The provenance relationships between the individual components in the package are automatically inferred using a rules-based inferencing engine.
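
The idea of inferring relationships between hand-picked package components can be illustrated with a minimal sketch. It is not the paper's engine or rule set, which are not given on this page: the triples, the predicate names (`inputTo`, `produces`, `derivedFrom`) and both rules are assumptions chosen for illustration, and plain Python tuples stand in for an RDF graph.

```python
from collections import defaultdict

# Hypothetical provenance trail as (subject, predicate, object) triples.
# Node and predicate names are illustrative, not taken from the paper.
TRAIL = {
    ("raw_data", "inputTo", "calibration"),
    ("calibration", "produces", "calibrated_data"),
    ("calibrated_data", "inputTo", "analysis"),
    ("analysis", "produces", "result_table"),
    ("result_table", "inputTo", "plotting"),
    ("plotting", "produces", "figure"),
}


def infer_derivations(trail):
    """Apply two assumed rules until no new facts appear.

    Rule 1: X inputTo E and E produces Y  =>  Y derivedFrom X
    Rule 2: derivedFrom is transitive
    Returns a set of (derived, source) pairs.
    """
    produced_by = defaultdict(set)
    for s, p, o in trail:
        if p == "produces":
            produced_by[s].add(o)

    # Rule 1: link each event's outputs to its inputs.
    derived = set()
    for s, p, o in trail:
        if p == "inputTo":
            for output in produced_by[o]:
                derived.add((output, s))

    # Rule 2: naive fixpoint computation of the transitive closure.
    changed = True
    while changed:
        changed = False
        for a, b in list(derived):
            for c, d in list(derived):
                if b == c and (a, d) not in derived:
                    derived.add((a, d))
                    changed = True
    return derived


def build_package(trail, selected):
    """Keep only inferred relationships whose endpoints were both selected."""
    derived = infer_derivations(trail)
    return sorted(
        (a, "derivedFrom", b)
        for a, b in derived
        if a in selected and b in selected
    )


if __name__ == "__main__":
    # The user picks only the first and last data nodes; the intermediate
    # steps stay hidden, yet the link between the two is still inferred.
    for triple in build_package(TRAIL, {"raw_data", "figure"}):
        print(triple)
```

In this sketch the intermediate events and data never enter the package, which mirrors the abstract's point that an author can expose enough provenance to explain a result without publishing the whole trail.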

Author information

Corresponding author

Correspondence to Jane Hunter.

About this article

Cite this article

Hunter, J., Cheung, K. Provenance Explorer-a graphical interface for constructing scientific publication packages from provenance trails. Int J Digit Libr 7, 99–107 (2007). https://doi.org/10.1007/s00799-007-0018-5

  • DOI: https://doi.org/10.1007/s00799-007-0018-5
