Semantic Web, pp 225–248

Applying OWL Reasoning to Genomic Data

  • Katy Wolstencroft
  • Robert Stevens
  • Volker Haarslev

Abstract

The core part of the Web Ontology Language (OWL) is based on Description Logic (DL) theory, which has been investigated for more than 25 years. OWL reasoning systems offer various DL-based inference services, such as (i) checking class descriptions for consistency and automatically organizing them into classification hierarchies, and (ii) checking descriptions of individuals for consistency and recognizing individuals as instances of class descriptions. These services can therefore be used in a variety of application domains concerned with the representation of, and reasoning about, knowledge, for example in the biological sciences. Classification is an integral part of all biological sciences, including the new discipline of genomics. Biologists wish not only to build complex descriptions of the categories of biological molecules, but also to classify instances of new molecules against these class-level descriptions. In this chapter we introduce the non-expert reader to the basics of OWL DL and its related reasoning patterns, such as classification. We use a case study of building an ontology of a protein family and then classifying all members of that family from a genome using DL technology. We show how a technically straightforward use of these technologies can have far-reaching effects in genomic science.

Keywords

protein classification, OWL, DL reasoning, reasoning patterns, protein phosphatases

Copyright information

© Springer Science+Business Media, LLC 2007

Authors and Affiliations

  • Katy Wolstencroft (1)
  • Robert Stevens (1)
  • Volker Haarslev (2)
  1. School of Computer Science, University of Manchester, UK
  2. Department of Computer Science and Software Engineering, Concordia University, Canada
