Chapter

Data Integration in the Life Sciences

Volume 6254 of the series Lecture Notes in Computer Science, pp. 90-105

A Data Warehouse Approach to Semantic Integration of Pseudomonas Data

  • Kamar Marrakchi, Department of Biology, Faculty of Sciences and Techniques, University Abdelmalek Essaâdi
  • Abdelaali Briache, Department of Biology, Faculty of Sciences and Techniques, University Abdelmalek Essaâdi
  • Amine Kerzazi, Department of Computer Languages and Computing Science, Higher Technical School of Computer Science Engineering, University of Málaga
  • Ismael Navas-Delgado, Department of Computer Languages and Computing Science, Higher Technical School of Computer Science Engineering, University of Málaga
  • José Francisco Aldana-Montes, Department of Computer Languages and Computing Science, Higher Technical School of Computer Science Engineering, University of Málaga
  • Mohamed Ettayebi, Department of Biology, Faculty of Sciences Dhar-Mahraz, Sidi Med Ben Abdellah University
  • Khalid Lairini, Department of Biology, Faculty of Sciences and Techniques, University Abdelmalek Essaâdi
  • Badr Din Rossi Hassani, Department of Biology, Faculty of Sciences and Techniques, University Abdelmalek Essaâdi

Abstract

Biological research and development routinely produce terabytes of data that need to be organized, queried and reduced to useful scientific knowledge. Although data integration can address this need, it is often difficult because the sources are heterogeneous and differ in both semantics and structure. Moreover, necessary updates to the structure and content of the databases pose further challenges for any integration process. We present PseudomonasDW, a new biological data warehouse for Pseudomonas species that integrates annotation and pathway data from highly heterogeneous resources. Combining knowledge from multiple disciplines and sources should advance the understanding of cellular processes and lead to the prediction of cellular behavior in its entirety. The key aspect of our approach is a hybrid that combines materialized and virtual data integration to exploit the advantages of both. The data are extracted from the original data sources using SB-KOM (System Biology Khaos Ontology-based Mediator) and then stored locally in the data warehouse to ensure fast performance and data consistency.
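
As a rough illustration of the hybrid idea described above, the following minimal Python sketch shows a mediator-style function that merges records from two sources on demand (the virtual step) and a loader that copies those merged records into a local SQLite table (the materialized step). Everything in it is an assumption made for illustration: the function names `mediator_query`, `build_warehouse` and `lookup`, the table layout, and the placeholder records are hypothetical and do not reflect the actual SB-KOM interface or the PseudomonasDW schema.

```python
import sqlite3

# Made-up placeholder records standing in for two remote data sources.
ANNOTATION_SOURCE = {"gene_1": "example annotation 1", "gene_2": "example annotation 2"}
PATHWAY_SOURCE = {"gene_1": "example pathway A", "gene_2": "example pathway B"}


def mediator_query(gene_id):
    """Virtual integration step (hypothetical): merge both sources on demand."""
    return {
        "gene_id": gene_id,
        "annotation": ANNOTATION_SOURCE.get(gene_id),
        "pathway": PATHWAY_SOURCE.get(gene_id),
    }


def build_warehouse(gene_ids):
    """Materialized step: copy the mediator's results into a local table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE gene (gene_id TEXT PRIMARY KEY, annotation TEXT, pathway TEXT)"
    )
    rows = [mediator_query(g) for g in gene_ids]
    conn.executemany(
        "INSERT INTO gene VALUES (:gene_id, :annotation, :pathway)", rows
    )
    conn.commit()
    return conn


def lookup(conn, gene_id):
    """Answer a query from the local copy only, without contacting the sources."""
    return conn.execute(
        "SELECT annotation, pathway FROM gene WHERE gene_id = ?", (gene_id,)
    ).fetchone()


warehouse = build_warehouse(["gene_1", "gene_2"])
result = lookup(warehouse, "gene_1")
```

In this sketch, queries after loading touch only the local table, which is where the speed and consistency benefits of materialization would come from. Refreshing the warehouse when the sources change would mean re-running the load step.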

Keywords

Data Integration, Data Warehouse, Web Services, Ontology, Pseudomonas