Biomedical Knowledge Graphs: Context, Queries and Complexity

Computational Life Sciences

Part of the book series: Studies in Big Data ((SBD,volume 112))

Abstract

Ontology-based mappings in knowledge graphs are a widely discussed topic in biomedical research. Contextual information plays a central role in NLP and knowledge discovery in the life sciences because it strongly influences the exact meaning of natural language. The scientific challenge is not only to extract such context data, but also to store it for subsequent querying and discovery. Classical approaches use RDF triple stores, which have serious limitations. Here, we introduce the graph-theoretic foundation for a general context concept within semantic networks, in which context is a more general concept than annotations. We present a proof of concept based on biomedical literature and text mining: a multi-step knowledge graph approach that uses labeled property graphs on polyglot persistence systems to make context data available for context mining, graph queries, knowledge discovery and extraction. Our test system contains a knowledge graph derived from the entirety of PubMed and SCAIView data, enriched with text mining data and with domain-specific language data expressed in the Biological Expression Language (BEL). Storing and querying a knowledge graph of this size as a labeled property graph remains a technological challenge. We demonstrate how our data model supports the understanding and interpretation of biomedical data, present several real-world use cases that draw on this knowledge graph, and finally show a working example of biologically relevant information using SCAIView.
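
To make the idea of a labeled property graph with context more concrete, the following is a minimal Python sketch. It is not the chapter's data model or implementation: the class names, the node labels (`Document`, `Context`), the edge labels (`HAS_ANNOTATION`, `HAS_AUTHOR`), and all identifiers and property values are hypothetical and chosen only for illustration. It shows nodes and edges that carry both labels and key-value properties, and a simple query that collects all documents attached to a given context node.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    labels: set                      # e.g. {"Document"} or {"Context", "Disease"}
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    source: str
    target: str
    label: str                       # a single relationship type per edge
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Toy in-memory labeled property graph (illustrative only)."""

    def __init__(self):
        self.nodes = {}
        self.edges = []

    def add_node(self, node_id, labels, **properties):
        self.nodes[node_id] = Node(node_id, set(labels), properties)

    def add_edge(self, source, target, label, **properties):
        self.edges.append(Edge(source, target, label, properties))

    def documents_in_context(self, context_id):
        """Return sorted ids of Document nodes with an edge to the context node."""
        return sorted(
            e.source
            for e in self.edges
            if e.target == context_id
            and "Document" in self.nodes[e.source].labels
        )


# Hypothetical example data: two documents and two kinds of context.
graph = PropertyGraph()
graph.add_node("doc:1", ["Document"], title="Hypothetical article A")
graph.add_node("doc:2", ["Document"], title="Hypothetical article B")
graph.add_node("ctx:disease", ["Context", "Disease"], name="example disease term")
graph.add_node("ctx:author", ["Context", "Author"], name="example author")

# Edges carry their own properties, here the provenance of an annotation.
graph.add_edge("doc:1", "ctx:disease", "HAS_ANNOTATION", provenance="text mining")
graph.add_edge("doc:2", "ctx:disease", "HAS_ANNOTATION", provenance="text mining")
graph.add_edge("doc:1", "ctx:author", "HAS_AUTHOR")

shared = graph.documents_in_context("ctx:disease")
print(shared)
```

In this sketch an author node and a text-mined disease annotation are both modeled as context, which mirrors the abstract's remark that context is more general than annotations; a production system would delegate storage and querying to a graph database rather than Python lists.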

Author information

Corresponding author

Correspondence to Jens Dörpinghaus.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Dörpinghaus, J., Düing, C., Stefan, A. (2022). Biomedical Knowledge Graphs: Context, Queries and Complexity. In: Dörpinghaus, J., Weil, V., Schaaf, S., Apke, A. (eds) Computational Life Sciences. Studies in Big Data, vol 112. Springer, Cham. https://doi.org/10.1007/978-3-031-08411-9_20
