Abstract
Epidemiologic research usually requires large amounts of data to establish the natural history of a disease and to support meaningful study design and interpretation of findings. Gathering such data is a substantial task, because the healthcare domain comprises a complex corpus of concepts that makes data difficult to store and use. Data accessibility must also be considered: sensitive patient data should be carefully protected and shared responsibly. The COVID-19 pandemic reaffirmed the need to share data and to have an integrated view of it, in order to identify the best approaches and signals for improving not only treatments and diagnoses but also societal responses to the epidemiological scenario. This paper addresses a data integration scenario for COVID-19 and cardiovascular diseases, covering the main challenges of integrating data into a common repository that stores data from several hospitals. A conceptual architecture is presented for handling such scenarios and for integrating data from a Portuguese hospital into the common repository, which is used to explore data in a standardized way.
Acknowledgement
This work is partially funded by national funds through FCT—Fundação para a Ciência e Tecnologia, I.P., under the project FCT UIDB/04466/2020 and UIDP/04466/2020. Luís Elvas holds a Ph.D. grant, funded by FCT with UI/BD/151494/2021.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Oliveira, B., Mira, M., Monteiro, S., Elvas, L.B., Rosário, L.B., Ferreira, J.C. (2023). Implementing a Data Integration Infrastructure for Healthcare Data – A Case Study. In: Abraham, A., Bajaj, A., Gandhi, N., Madureira, A.M., Kahraman, C. (eds) Innovations in Bio-Inspired Computing and Applications. IBICA 2022. Lecture Notes in Networks and Systems, vol 649. Springer, Cham. https://doi.org/10.1007/978-3-031-27499-2_69
DOI: https://doi.org/10.1007/978-3-031-27499-2_69
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-27498-5
Online ISBN: 978-3-031-27499-2
eBook Packages: Intelligent Technologies and Robotics (R0)