Semantic Web pp 355-395

Knowledge Discovery for Biology with Taverna

Producing and consuming semantics in the Web of Science
  • Carole Goble
  • Katy Wolstencroft
  • Antoon Goderis
  • Duncan Hull
  • Jun Zhao
  • Pinar Alper
  • Phillip Lord
  • Chris Wroe
  • Khalid Belhajjame
  • Daniele Turi
  • Robert Stevens
  • Tom Oinn
  • David De Roure

Abstract

Life science research has extended beyond in vivo and in vitro bench-bound science to incorporate in silico knowledge discovery, using resources that different teams have developed over time for different purposes and in different forms. The myGrid project has developed a set of software components and a workbench, Taverna, for building, running and sharing workflows that link third-party bioinformatics services such as databases, analytic tools and applications. A Semantic Web of annotations aids the intelligent discovery of prior services, workflows or data, as well as the building of the workflows themselves. The metadata associated with workflow experiments, the provenance of the data outcomes and the record of the experimental process need to be flexible and extensible, and Semantic Web metadata technologies appear well suited to building such a Semantic Web of provenance. This gives us the potential to integrate and aggregate workflow outcomes, to reason over provenance logs to identify new experimental insights, and to build and export a Semantic Web of experiments that contributes to knowledge discovery for Taverna users and for the scientific community as a whole.

Key words

workflow, in silico, services, Web Services, Semantic Web, Taverna, discovery, publication, provenance, metadata, annotation, LSID, ontology, myGrid, experiment, Web, e-Science

Copyright information

© Springer Science+Business Media, LLC 2007

Authors and Affiliations

  • Carole Goble (1)
  • Katy Wolstencroft (1)
  • Antoon Goderis (1)
  • Duncan Hull (1)
  • Jun Zhao (1)
  • Pinar Alper (2)
  • Phillip Lord (2)
  • Chris Wroe (3)
  • Khalid Belhajjame (1)
  • Daniele Turi (1)
  • Robert Stevens (1)
  • Tom Oinn (4)
  • David De Roure (5)

  1. University of Manchester, UK
  2. University of Newcastle, UK
  3. British Telecom, UK
  4. The European Bioinformatics Institute, UK
  5. University of Southampton, UK
