Provenance Explorer: a graphical interface for constructing scientific publication packages from provenance trails

  • Jane Hunter
  • Kwok Cheung


Scientific communities are under increasing pressure from funding organizations to publish their raw data, in addition to their traditional publications, in open archives. Many scientists would be willing to do this if they had tools that streamlined the process and exposed simple provenance information, i.e., enough to explain the methodology and validate the results without compromising the author’s intellectual property or competitive advantage. This paper presents Provenance Explorer, a tool that enables the provenance trail associated with a scientific discovery process to be visualized and explored through a graphical user interface (GUI). Based on RDF graphs, it displays the sequence of data, states and events associated with a scientific workflow, illustrating the methodology that led to the published results. The GUI also allows permitted users to expand selected links between nodes to reveal more fine-grained information and sub-workflows. But more importantly, the system enables scientists to selectively construct “scientific publication packages” by choosing particular nodes from the visual provenance trail and dragging-and-dropping them into an RDF package which can be uploaded to an archive or repository for publication or e-learning. The provenance relationships between the individual components in the package are automatically inferred using a rules-based inferencing engine.
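The inferencing step described above can be illustrated with a minimal sketch. It is not the system's actual rule set: the `derivedFrom` predicate, the node names, and the function `infer_package_links` are all hypothetical, and a single transitivity rule stands in for whatever rules the real inferencing engine applies. The sketch shows how direct relationships between the nodes selected for a package might be derived when the intermediate steps of the trail are left out.

```python
from collections import defaultdict

# Hypothetical provenance trail as (subject, predicate, object) triples.
# The predicate "derivedFrom" and all node names are illustrative
# assumptions, not vocabulary taken from the paper.
TRAIL = [
    ("report", "derivedFrom", "analysis"),
    ("analysis", "derivedFrom", "cleaned_data"),
    ("cleaned_data", "derivedFrom", "raw_data"),
]


def infer_package_links(trail, selected):
    """Infer 'derivedFrom' links between the nodes selected for a package.

    Assumed rule: derivedFrom is transitive, so a chain that passes through
    unselected (hidden) nodes collapses into one link between the selected
    node and its nearest selected ancestor.
    """
    parents = defaultdict(set)
    for subject, predicate, obj in trail:
        if predicate == "derivedFrom":
            parents[subject].add(obj)

    links = set()
    for node in selected:
        stack = list(parents[node])
        seen = set()
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            if current in selected:
                # Nearest selected ancestor reached: record the link, stop here.
                links.add((node, "derivedFrom", current))
            else:
                # Hidden node: keep walking up the trail.
                stack.extend(parents[current])
    return links


if __name__ == "__main__":
    package = {"report", "raw_data"}
    for link in sorted(infer_package_links(TRAIL, package)):
        print(link)
```

Selecting only `report` and `raw_data` yields a single inferred link between them, so the package still records that the report derives from the raw data while the intermediate analysis steps stay private.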


Keywords: eScience, Provenance, Visualization, Inferencing, Publications





Copyright information

© Springer-Verlag 2007

Authors and Affiliations

  1. ITEE, The University of Queensland, St. Lucia, Australia
  2. AIBN, The University of Queensland, St. Lucia, Australia
