Implementing a Data Integration Infrastructure for Healthcare Data – A Case Study

  • Conference paper
Innovations in Bio-Inspired Computing and Applications (IBICA 2022)

Abstract

Conducting epidemiologic research usually requires a large amount of data to establish the natural history of a disease and to achieve a meaningful study design and interpretation of findings. This is a huge task, however, because the healthcare domain comprises a complex corpus of concepts that makes data difficult to use and store. Data accessibility must also be considered, because sensitive patient data should be carefully protected and shared responsibly. The COVID-19 pandemic reaffirmed the need to share data and to have an integrated view of it, in order to identify the best approaches and signals for improving not only treatments and diagnoses but also societal responses to the epidemiological scenario. This paper addresses a data integration scenario for COVID-19 and cardiovascular diseases, covering the main challenges of integrating data into a common repository that stores data from several hospitals. A conceptual architecture is presented to handle such scenarios and to integrate data from a Portuguese hospital into the common repository, which is used to explore data in a standardized way.

Acknowledgement

This work is partially funded by national funds through FCT—Fundação para a Ciência e Tecnologia, I.P., under the project FCT UIDB/04466/2020 and UIDP/04466/2020. Luís Elvas holds a Ph.D. grant, funded by FCT with UI/BD/151494/2021.

Author information

Corresponding author

Correspondence to Luís B. Elvas.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Oliveira, B., Mira, M., Monteiro, S., Elvas, L.B., Rosário, L.B., Ferreira, J.C. (2023). Implementing a Data Integration Infrastructure for Healthcare Data – A Case Study. In: Abraham, A., Bajaj, A., Gandhi, N., Madureira, A.M., Kahraman, C. (eds) Innovations in Bio-Inspired Computing and Applications. IBICA 2022. Lecture Notes in Networks and Systems, vol 649. Springer, Cham. https://doi.org/10.1007/978-3-031-27499-2_69
