A Knowledge-Driven Pipeline for Transforming Big Data into Actionable Knowledge

  • Maria-Esther Vidal
  • Kemele M. Endris
  • Samaneh Jozashoori
  • Guillermo Palma
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11371)


Big biomedical data has grown exponentially over the last decades, and so have the applications that demand understanding and discovery of the knowledge encoded in it. To address these requirements while scaling up to the dominant dimensions of big biomedical data (volume, variety, and veracity), novel data integration techniques need to be defined. In this paper, we devise a knowledge-driven approach that relies on Semantic Web technologies, such as ontologies, mapping languages, and linked data, to generate a knowledge graph that integrates big data. Furthermore, query processing and knowledge discovery methods are implemented on top of the knowledge graph to enable exploration and pattern discovery. We report on the results of applying the proposed knowledge-driven approach in the EU-funded project iASiS to transform big data into actionable knowledge, thus paving the way for precision medicine and health policy making.



This work has been supported by the European Union’s Horizon 2020 Research and Innovation Program for the project iASiS with grant agreement No 727658.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. TIB Leibniz Information Centre for Science and Technology, Hannover, Germany
  2. L3S Institute, Leibniz University of Hannover, Hannover, Germany
