
A Knowledge-Driven Pipeline for Transforming Big Data into Actionable Knowledge


Part of the Lecture Notes in Computer Science book series (LNBI, volume 11371)


Big biomedical data has grown exponentially during the last decades, and so have the applications that demand understanding and discovering the knowledge encoded in it. To address these requirements while scaling up to the dominant dimensions of big biomedical data (volume, variety, and veracity), novel data integration techniques need to be defined. In this paper, we devise a knowledge-driven approach that relies on Semantic Web technologies, such as ontologies, mapping languages, and linked data, to generate a knowledge graph that integrates big data. Furthermore, query processing and knowledge discovery methods are implemented on top of the knowledge graph to enable exploration and pattern discovery. We report on the results of applying the proposed knowledge-driven approach in the EU-funded project iASiS in order to transform big data into actionable knowledge, thus paving the way for precision medicine and health policy making.
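The general idea of the pipeline (mapping heterogeneous sources onto a shared vocabulary, merging the result into one knowledge graph, and querying across sources) can be illustrated with a minimal sketch. It is not the iASiS implementation: the namespace, property names, mapping rules, and toy records below are all assumptions made up for illustration, and plain Python tuples stand in for RDF triples and a mapping language.

```python
from typing import Dict, Iterable, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical namespace and vocabulary (assumptions, not from the paper).
EX = "http://example.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
HAS_DIAGNOSIS = EX + "hasDiagnosis"
TAKES_DRUG = EX + "takesDrug"
HAS_TARGET = EX + "hasTarget"

# Hypothetical mapping rules: for each source, which field identifies the
# subject, which class it belongs to, and which property each field maps to.
MAPPINGS: Dict[str, dict] = {
    "clinical": {
        "subject_key": "patient_id",
        "class": EX + "Patient",
        "fields": {"diagnosis": HAS_DIAGNOSIS, "drug": TAKES_DRUG},
    },
    "drugs": {
        "subject_key": "drug_name",
        "class": EX + "Drug",
        "fields": {"target": HAS_TARGET},
    },
}


def iri(local_name: str) -> str:
    """Mint an identifier in the example namespace from a raw value."""
    return EX + local_name.strip().replace(" ", "_")


def apply_mapping(source: str, records: Iterable[dict]) -> Set[Triple]:
    """Turn the records of one source into triples using its mapping rule."""
    rule = MAPPINGS[source]
    triples: Set[Triple] = set()
    for record in records:
        subject = iri(record[rule["subject_key"]])
        triples.add((subject, RDF_TYPE, rule["class"]))
        for field, prop in rule["fields"].items():
            if record.get(field):
                triples.add((subject, prop, iri(record[field])))
    return triples


def match(graph: Set[Triple], s: Optional[str] = None,
          p: Optional[str] = None, o: Optional[str] = None) -> List[Triple]:
    """Return all triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def targets_for_diagnosis(graph: Set[Triple], diagnosis: str) -> List[Triple]:
    """Join across sources: patient with diagnosis -> drug -> drug target."""
    results = set()
    for patient, _, _ in match(graph, p=HAS_DIAGNOSIS, o=iri(diagnosis)):
        for _, _, drug in match(graph, s=patient, p=TAKES_DRUG):
            for _, _, target in match(graph, s=drug, p=HAS_TARGET):
                results.add((patient, drug, target))
    return sorted(results)


# Toy records invented for this sketch.
clinical = [
    {"patient_id": "p1", "diagnosis": "lung cancer", "drug": "erlotinib"},
    {"patient_id": "p2", "diagnosis": "dementia", "drug": "donepezil"},
]
drugs = [
    {"drug_name": "erlotinib", "target": "EGFR"},
    {"drug_name": "donepezil", "target": "ACHE"},
]

# Integration step: the union of the mapped triples is the knowledge graph.
graph = apply_mapping("clinical", clinical) | apply_mapping("drugs", drugs)

for row in targets_for_diagnosis(graph, "lung cancer"):
    print(row)
```

Because both sources mint the same identifier for a drug, the triples link up without an explicit join key, which is the property a shared vocabulary is meant to provide. A real deployment would presumably use an RDF store and a declarative mapping language rather than in-memory sets.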

  • DOI: 10.1007/978-3-030-06016-9_4
  • Chapter length: 6 pages




This work has been supported by the European Union’s Horizon 2020 Research and Innovation Program for the project iASiS under grant agreement No. 727658.


Corresponding author

Correspondence to Samaneh Jozashoori.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Vidal, ME., Endris, K.M., Jozashoori, S., Palma, G. (2019). A Knowledge-Driven Pipeline for Transforming Big Data into Actionable Knowledge. In: Auer, S., Vidal, ME. (eds) Data Integration in the Life Sciences. DILS 2018. Lecture Notes in Computer Science, vol 11371. Springer, Cham.


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-06015-2

  • Online ISBN: 978-3-030-06016-9

  • eBook Packages: Computer Science, Computer Science (R0)