A Little Semantic Web Goes a Long Way in Biology

  • K. Wolstencroft
  • A. Brass
  • I. Horrocks
  • P. Lord
  • U. Sattler
  • D. Turi
  • R. Stevens
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3729)


We show how state-of-the-art Semantic Web technology can be used in e-Science, in particular to automate the classification of proteins in biology. The resulting classification was of comparable quality to that performed by a human expert, and investigations using the classified data uncovered significant information that had previously been overlooked, leading to the identification of a possible drug target.


Protein Phosphatase · Description Logic · Human Expert · Automate Reasoning · Williams Beuren Syndrome


Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • K. Wolstencroft (1)
  • A. Brass (1)
  • I. Horrocks (1)
  • P. Lord (1)
  • U. Sattler (1)
  • D. Turi (1)
  • R. Stevens (1)

  1. School of Computer Science, University of Manchester, UK
