FedFlow: a federated platform to build secure sharing and synchronization services for health dataflows

  • Special Issue Article
Computing

Abstract

Data synchronization and content delivery services are key to supporting the healthcare dataflows that organizations build. These services must prepare and process data to meet mandatory non-functional requirements such as security and reliability, which is challenging because multiple applications, infrastructures, and platforms participate in healthcare dataflows. This paper presents FedFlow, a federated content distribution platform for building infrastructure-agnostic health data sharing and synchronization services that support healthcare dataflows. FedFlow creates secure and efficient data sharing and synchronization patterns for intra-dataflows and inter-dataflows by using implicit parallel data preparation schemes. A prototype of FedFlow was developed to conduct a case study on building inter-dataflows that deliver synchronized health data to multiple organizations, combining algorithms for non-functional requirements to comply with governmental rules on health data management. The experimental evaluation in a multi-cloud federated environment showed that FedFlow is around 90% faster than a traditional pipeline implementation, around 40% faster than Jenkins workflow management, and almost 30% faster than Duplicity.

Acknowledgements

This work has been partially supported by the grant “CABAHLA-CM: Convergencia Big data-HPC: de Los sensores a las Aplicaciones” (Ref. S2018/TCS-4423) of the Madrid Regional Government; by the Spanish Ministry of Science and Innovation project “New Data Intensive Computing Methods for High-End and Edge Computing Platforms (DECIDE)” (Ref. PID2019-107858GB-I00); and by project 41756, “Plataforma tecnológica para la gestión, aseguramiento, intercambio y preservación de grandes volúmenes de datos en salud y construcción de un repositorio nacional de servicios de análisis de datos de salud”, of FORDECYT-PRONACES.

Author information

Corresponding author

Correspondence to J. L. Gonzalez-Compean.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Carrizales-Espinoza, D., Sanchez-Gallegos, D.D., Gonzalez-Compean, J.L. et al. FedFlow: a federated platform to build secure sharing and synchronization services for health dataflows. Computing 105, 1019–1037 (2023). https://doi.org/10.1007/s00607-021-01044-3

  • DOI: https://doi.org/10.1007/s00607-021-01044-3
