Ontologies in Cheminformatics

  • Living reference work entry
Handbook of Computational Chemistry

Abstract

Ontologies are structured controlled vocabularies that encode domain knowledge and are backed by logic-based computational tools. They enable knowledge-based applications that use automated reasoning for inference and knowledge discovery. They also enable standardized semantic annotation of large-scale data, which is increasingly important as high-throughput data generation and data sharing grow in scientific research. Established chemical ontologies include ChEBI, which encodes a structural classification of chemical entities of biological interest together with their roles. More recently, the Chemical Information Ontology was created to standardize the annotation of cheminformatics software and descriptors. This chapter describes the technology, structure, and applications of ontologies within cheminformatics.
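
As a rough illustration of the kind of inference the abstract refers to, the sketch below derives class membership by following "is a" links transitively and then uses that to select annotated data records. It is a minimal sketch under stated assumptions: the hierarchy is hand-written toy data rather than an extract from ChEBI, and the names `IS_A`, `ancestors`, `is_subclass_of`, and `select_annotated` are hypothetical. Real ontologies are normally expressed in OWL or OBO and classified with a dedicated reasoner.

```python
# Toy "is a" hierarchy. Hand-written for illustration; not copied from ChEBI.
IS_A = {
    "ethanol": {"primary alcohol"},
    "primary alcohol": {"alcohol"},
    "alcohol": {"organic molecular entity"},
    "organic molecular entity": {"chemical entity"},
}


def ancestors(term, is_a=IS_A):
    """Return every class reachable from `term` by following is_a links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in is_a.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_subclass_of(term, candidate):
    """True if `candidate` is an inferred (direct or indirect) parent of `term`."""
    return candidate in ancestors(term)


def select_annotated(records, target_class):
    """Keep records whose annotation term is `target_class` or falls under it."""
    return [
        r for r in records
        if r["term"] == target_class or is_subclass_of(r["term"], target_class)
    ]


# Hypothetical data records annotated with ontology terms.
records = [
    {"id": "sample-1", "term": "ethanol"},
    {"id": "sample-2", "term": "chemical entity"},
]
alcohol_records = select_annotated(records, "alcohol")
```

Here `alcohol_records` contains only the record annotated with "ethanol", even though that record never mentions "alcohol" directly. Retrieving data through inferred classes in this way is what semantic annotation makes possible.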

Acknowledgements

ChEBI is supported by the BBSRC under grant agreement number BB/K019783/1 within the “Bioinformatics and biological resources” fund.

Author information

Corresponding author

Correspondence to Janna Hastings.

Copyright information

© 2016 Springer Science+Business Media Dordrecht

About this entry

Cite this entry

Hastings, J., Steinbeck, C. (2016). Ontologies in Cheminformatics. In: Leszczynski, J. (eds) Handbook of Computational Chemistry. Springer, Dordrecht. https://doi.org/10.1007/978-94-007-6169-8_55-1

  • DOI: https://doi.org/10.1007/978-94-007-6169-8_55-1

  • Publisher Name: Springer, Dordrecht

  • Online ISBN: 978-94-007-6169-8

  • eBook Packages: Springer Reference Chemistry and Mat. Science; Reference Module Physical and Materials Science; Reference Module Chemistry, Materials and Physics
