Semantically Linking and Browsing Provenance Logs for E-science

  • Jun Zhao
  • Carole Goble
  • Robert Stevens
  • Sean Bechhofer
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3226)

Abstract

e-Science experiments are those performed using computer-based resources such as database searches, simulations or other applications. Like their laboratory-based counterparts, the data associated with an e-Science experiment are of reduced value if other scientists cannot identify the origin, or provenance, of those data. Provenance is the term given to metadata about experiment processes, the derivation paths of data, and the sources and quality of experimental components, including the scientists themselves and related literature. Provenance metadata are therefore a valuable resource with which e-Scientists can repeat experiments, track versions of data and experiment runs, verify experiment results, and gain experimental insight. One specific kind of in silico experiment is a workflow. In this paper we describe how we can assemble a Semantic Web of workflow provenance logs that allows a bioinformatician to browse and navigate between experimental components by generating hyperlinks based on the semantic annotations associated with them. By associating well-formalized semantics with workflow logs we take a step towards the integration of process provenance information and improved knowledge discovery.
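The idea of generating hyperlinks from shared semantic annotations can be illustrated with a minimal sketch. It is not the system described in the paper: the `LogEntry` shape, the `build_links` function, the `urn:example:` identifiers and the `ex:` concept names are all hypothetical, and links are derived from exact concept matches only, with no ontology reasoning.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEntry:
    """One component recorded in a workflow provenance log (hypothetical shape)."""
    uri: str              # identifier of a data item, service invocation, etc.
    concepts: frozenset   # ontology concepts annotating this component


def build_links(entries):
    """Link every pair of entries that share at least one concept annotation.

    Returns a dict mapping each entry URI to a sorted list of
    (concept, target_uri) pairs, i.e. the hyperlinks a browser could render.
    """
    # Index the entries by the concepts they are annotated with.
    by_concept = defaultdict(set)
    for entry in entries:
        for concept in entry.concepts:
            by_concept[concept].add(entry.uri)

    # Every other entry carrying the same concept becomes a link target.
    links = defaultdict(set)
    for concept, uris in by_concept.items():
        for source in uris:
            for target in uris:
                if source != target:
                    links[source].add((concept, target))
    return {uri: sorted(pairs) for uri, pairs in links.items()}


# Hypothetical log: an input sequence, a service run, and a result report.
entries = [
    LogEntry("urn:example:data/seq-1", frozenset({"ex:DNASequence"})),
    LogEntry("urn:example:service/blast-run-7",
             frozenset({"ex:DNASequence", "ex:SimilaritySearch"})),
    LogEntry("urn:example:data/report-3", frozenset({"ex:SimilaritySearch"})),
]
links = build_links(entries)
```

In this sketch the service run is reachable from both the sequence and the report, because it shares one concept with each. A fuller treatment would presumably also follow subclass relations in the ontology rather than rely on exact matches.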

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Jun Zhao (1)
  • Carole Goble (1)
  • Robert Stevens (1)
  • Sean Bechhofer (1)
  1. Department of Computer Science, University of Manchester, Manchester, United Kingdom
