Abstract
A large number of online document data sources are available nowadays. Their lack of structure and the differences between their formats are the main obstacles to automatically extracting information from them, which also hinders their use and reuse. In the biomedical domain, the DISNET platform emerged to provide researchers with a resource for obtaining information on human disease networks from large-scale heterogeneous sources. In this domain specifically, it is critical to offer not only the information extracted from different sources, but also the evidence that supports it. This paper proposes EBOCA, an ontology that describes (i) biomedical domain concepts and the associations between them, and (ii) the evidence supporting these associations, with the objective of providing a schema to improve the publication and description of evidence and biomedical associations in this domain. The ontology has been successfully evaluated to ensure that it contains no errors or modelling pitfalls and that it meets the previously defined functional requirements. Test data from a subset of DISNET and from automatic association extractions from texts have been transformed according to the proposed ontology to create a Knowledge Graph that can be used in real scenarios and that has also been used to evaluate the presented ontology.
Acknowledgments
This work is supported by the DRUGS4COVID++ project, funded by Ayudas Fundación BBVA a equipos de investigación científica SARS-CoV-2 y COVID-19. The work is also supported by “Data-driven drug repositioning applying graph neural networks (3DR-GNN)” under grant “PID2021-122659OB-I00” from the Spanish Ministerio de Ciencia, Innovación y Universidades. LPS’s work is supported by “Programa de fomento de la investigación y la innovación (Doctorados Industriales)” from Comunidad de Madrid (grant “IND2019/TIC-17159”).
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Pérez, A.Á., Iglesias-Molina, A., Santamaría, L.P., Poveda-Villalón, M., Badenes-Olmedo, C., Rodríguez-González, A. (2022). EBOCA: Evidences for BiOmedical Concepts Association Ontology. In: Corcho, O., Hollink, L., Kutz, O., Troquard, N., Ekaputra, F.J. (eds) Knowledge Engineering and Knowledge Management. EKAW 2022. Lecture Notes in Computer Science, vol 13514. Springer, Cham. https://doi.org/10.1007/978-3-031-17105-5_11
DOI: https://doi.org/10.1007/978-3-031-17105-5_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-17104-8
Online ISBN: 978-3-031-17105-5
eBook Packages: Computer Science, Computer Science (R0)