Data Integration between Swedish National Clinical Health Registries and Biobanks Using an Availability System
- Cite this paper as:
- Spjuth O., Heikkinen J., Litton JE., Palmgren J., Krestyaninova M. (2014) Data Integration between Swedish National Clinical Health Registries and Biobanks Using an Availability System. In: Galhardas H., Rahm E. (eds) Data Integration in the Life Sciences. DILS 2014. Lecture Notes in Computer Science, vol 8574. Springer, Cham
Linking biobank data, such as molecular profiles, with clinical phenotypes is of great importance in epidemiological and predictive studies. A comprehensive overview of the data sources that can be combined to strengthen a study is a key factor in its design. Clinical data stored in health registries and biobank data from research projects are commonly provisioned in different database systems and governed by separate organizations, which makes integration challenging and hampers biomedical investigations. Here we describe the integration of prostate cancer data from a clinical health registry with data from a biobank, and the provisioning of the integrated data in the SAIL availability system. We demonstrate the implications of using the raw data, data transformed to availability data, and availability data that has been subjected to anonymization techniques to reduce the risk of re-identification. Our results show that an availability system such as SAIL, populated with integrated clinical and biobank data, can be a valuable tool for planning new studies and for finding interesting subsets to investigate further. We also show that an availability system can deliver useful insights even when the data has been subjected to anonymization techniques.
Keywords: data integration, health registry, biobanks, availability system, anonymization