
Facilitating the Analysis of COVID-19 Literature Through a Knowledge Graph

  • Bram Steenwinckel
  • Gilles Vandewiele
  • Ilja Rausch
  • Pieter Heyvaert
  • Ruben Taelman
  • Pieter Colpaert
  • Pieter Simoens
  • Anastasia Dimou
  • Filip De Turck
  • Femke Ongenae
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12507)

Abstract

At the end of 2019, Chinese authorities alerted the World Health Organization (WHO) to the outbreak of a new coronavirus strain, SARS-CoV-2, which grew into an unprecedented global disaster within a few months. In response to this pandemic, a publicly available dataset containing information on over 63,000 papers was released on Kaggle. To facilitate the analysis of this large body of literature, we have created a knowledge graph based on this dataset. Within this knowledge graph, all information from the original dataset is linked together, which makes it easier to search for relevant information. The knowledge graph is also enriched with additional links to appropriate, existing external resources. In this paper, we elaborate on the steps performed to construct such a knowledge graph from structured documents. Moreover, we discuss, on a conceptual level, several possible applications and analyses that can be built on top of this knowledge graph. As such, we aim to provide a resource that allows people to more easily build applications that give more insight into the COVID-19 pandemic.

Keywords

COVID-19 · Knowledge graph creation · Network analysis · Graph embeddings

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. IDLab, Ghent University – imec, Ghent, Belgium
