Biomedical Semantic Resources for Drug Discovery Platforms

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10577)

Abstract

The biomedical research community provides large-scale data sources that enable knowledge discovery from the data alone, or from novel scientific experiments combined with existing knowledge. Semantic Web technologies, including ontologies, triple stores and combinations thereof, are increasingly being developed and used. Both the amount and the complexity of the data are constantly increasing. Since the data sources are publicly available, the amount of content can be measured, giving an overview of the accessible content and of the state of the data representation in comparison to the existing content. To better understand the existing data resources, i.e. to judge the distribution of data triples across concepts, data types and primary providers, we have performed a comprehensive analysis that delivers an overview of the content accessible to semantic Web solutions from publicly accessible data servers. The analysis shows that information related to genes, proteins and chemical entities forms the core, whereas content related to diseases and pathways forms a smaller portion. As a result, any approach to drug discovery would profit from the data on molecular entities, but would lack content from data resources that represent disease pathomechanisms.
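
To make the idea of measuring content distribution concrete, the following is a minimal sketch and not the analysis pipeline used in the paper. It assumes the triples are already available as plain (subject, predicate, object) tuples and uses the number of rdf:type statements per class as a simple proxy for how content is spread across concepts. The function name `class_distribution` and the `ex:` identifiers are hypothetical.

```python
from collections import Counter

# Standard IRI of the rdf:type predicate.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def class_distribution(triples):
    """Return {class: share of rdf:type statements} for (s, p, o) tuples.

    Assumption: counting rdf:type statements per class approximates the
    distribution of content across concepts.
    """
    counts = Counter(o for _, p, o in triples if p == RDF_TYPE)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cls: n / total for cls, n in counts.items()}


# Hypothetical toy data; real input would come from public SPARQL endpoints.
triples = [
    ("ex:g1", RDF_TYPE, "ex:Gene"),
    ("ex:g2", RDF_TYPE, "ex:Gene"),
    ("ex:p1", RDF_TYPE, "ex:Protein"),
    ("ex:d1", RDF_TYPE, "ex:Disease"),
    ("ex:g1", "ex:encodes", "ex:p1"),  # not a type statement, so ignored
]

shares = class_distribution(triples)
for cls, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{cls}: {share:.0%}")
```

On a real endpoint the same counts would more likely be obtained with an aggregate query rather than by downloading all triples; the in-memory version is used here only to keep the sketch self-contained.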

Keywords

Biomedical Ontologies and Databases; Life Sciences Linked Open Data (LSLOD)

Notes

Acknowledgements

The work presented in this paper has been partly funded by the EU FP7 GRANATUM project (project number 270139) and by Science Foundation Ireland under Grant No. SFI/12/RC/2289.

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Insight Centre for Data Analytics, National University of Ireland, Galway, Ireland
