1 Introduction

Artificial intelligence (AI) and big data technologies are already widely applied in the medical field to reduce labor costs and human error [1, 2]. However, current intelligent medical systems remain immature and are questioned for giving inadequate treatment recommendations [3,4,5]. Several factors account for this inadequacy. The first and most prominent is the difficulty of collecting sufficient data with rich features, which limits how comprehensively a model can characterize a disease. In addition, the relevant machine learning models generally perform poorly [6]. Effective and secure collection and processing of medical data from medical institutions worldwide has therefore become the bottleneck of current intelligent medical systems [7, 8].

To break this bottleneck, medical institutions unite and agree to share medical data under privacy protection regulations, and models trained on the resulting larger datasets perform much better than those trained on data from a single institution [9, 10]. Federated learning (FL) [11] is a promising solution in this setting, since all participants cooperate to train a shared model without disclosing or exchanging any private data. Despite its preliminary success in medical practice, the basic FL convolutional neural networks (CNNs) [12, 13] still have drawbacks that diminish overall system performance. A case in point is robustness: adding a small amount of noise or making a minor change to an input sample, imperceptible to the human eye, can cause the same sample to receive different predictions from a client's network model. In addition, the lack of incentives for client participation in federated learning reduces model training efficiency.

A major route to improving model robustness during training is to design better loss functions [14]. This paper proposes a loss (the DPL loss function) that combines a distance-based cross-entropy loss with a prototype loss, so that all prototypes are learned directly from the data. At the bottom of the DPL image classification framework, convolutional layers are still employed to extract features; at the top, multiple prototypes per class represent the different classes. To classify an image, the Euclidean distance is used to find the nearest prototype in the feature space. Inspired by prototype learning [15], the paper introduces the prototype loss (PL) into DPL to reduce the distance between a feature vector and its corresponding prototype. By preventing overfitting, PL further improves DPL classification performance, making the model more discriminative and robust.

It is widely accepted that medical data are the prerequisite for model training. Unfortunately, traditional federated learning fails to attract owners of high-quality datasets, making it infeasible to train strong global models [16, 17]. As a remedy, blockchain technologies have been integrated into federated learning frameworks: clients are attracted to participate by leveraging the integrity and traceability of blockchain transactions and by combining its incentive mechanism with other technologies [18]. Inspired by these studies, this paper introduces an incentive mechanism into the federated learning framework to attract more high-quality medical datasets, enlarge the training data, and improve classification performance.

As mentioned above, this paper proposes an Incentive Mechanism for Federated Learning of Medical Data Classification (FedIn-MC) and makes the following chief contributions:

  1. This project introduces an improved federated learning framework that encourages multi-party medical institutions to cooperate. Secure cross-institutional data sharing guarantees the comprehensiveness of the trained model, while medical data remain in their respective local sites.

  2. With the introduction of the prototype-based DPL loss, all prototypes learn directly from the data, which markedly improves the poor classification robustness observed on complex datasets.

  3. The introduction of a blockchain incentive mechanism markedly improves the framework's attractiveness to medical institutions and encourages owners of high-quality datasets to participate in training. Medical institution clients receive token rewards in proportion to their training dataset contributions.

The rest of this paper is organized as follows: Sect. 2 introduces related work; Sect. 3 presents the FedIn-MC framework; Sect. 4 describes the experiments and performance analysis of FedIn-MC; and Sect. 5 draws conclusions and outlines future research.

2 Related work

Widely used in machine learning tasks, prototype learning is a classic and representative method in pattern recognition [19]. Yu et al. [15] apply prototype learning to image classification, where a prototype represents a class and is computed as the mean of the feature vectors within that class.

The method of He et al. [20] relates to the contrastive loss for unsupervised visual representation learning, which guides CNNs to learn more discriminative representations. Wen et al. [21] adopt a center loss to improve the performance of softmax-based CNNs, but the centers are not learned directly from the data; they are updated according to predetermined rules and cannot be learned synchronously with the CNN. Huang et al. [22] train a CNN-based encoder to extract visual representations, converting image features into coherent semantics and aggregating similar visual semantics into the same image features. Lakshmanaprabu et al. [23] use aggregated local features as descriptors for image retrieval, improving the model's retrieval ability. Wieting et al. [24] use the average word embedding as a sentence representation and achieve competitive performance on multiple NLP benchmarks. Furthermore, Hoang et al. [25] adopt prototype learning to represent task-irrelevant information in distributed machine learning. All of these studies leverage a fusion paradigm that integrates related prototypes to generate new models for new tasks, and they widely apply prototype learning to tasks with limited training samples to ensure better discrimination of the learned representations [26]. Consequently, this paper combines federated learning with prototype learning to integrate feature representations from different dataset distributions effectively, projecting samples to a specific region in the feature space, i.e., near their prototype.

The goal of federated learning is to train a global model on a centralized server while all data remain distributed across multiple local clients, owing to privacy or communication concerns [27, 28]. These advantages make federated learning a promising solution for smart healthcare, breaking data barriers between institutions and stimulating collaborative training [29, 30]. Lim et al. [31] outline federated learning application scenarios in biomedicine, confirming its feasibility in smart medical care. Kan et al. [32] show that local model training under the federated learning mechanism yields experimental results with high accuracy and reliability, and that the reduced training time ultimately improves the learning effect. Looking toward future digital health, Rieke et al. [33] explore how federated learning can solve current problems in smart medical care. However, traditional federated learning frameworks still lack user incentives and fail to attract medical clients with high-quality datasets [34, 35], which hinders the training of an efficient model.

At present, blockchain technology is used in all walks of life [36, 37]. More specifically, the combination of federated learning and blockchain is widely used in the medical field [38, 39] and generally overcomes the lack of incentives in traditional federated learning frameworks. Kang et al. [40] apply an incentive mechanism in a reliable federated learning scheme to protect client data and achieve efficient computation. Nishio et al. [41] filter client data sources and combine various blockchain incentive mechanisms, focusing on model training efficiency and federated learning verification. Kim et al. [42] add a mechanism to federated learning for verifying contributions and providing corresponding rewards, and also discuss the discrepancies introduced by update delays. Liu et al. [43] adopt the Shapley value to calculate contribution degrees and introduce game-theoretic principles into the incentive mechanism, so that all clients participating in training are allocated a fair token value. Rehman et al. [44] introduce fine-grained reputation awareness to encourage more edge computing servers to participate in model training, improving the final training effect. These papers introduce blockchain incentive mechanisms into federated learning to address the small quantity and low quality of training datasets.

However, the above studies do not consider practical application. Despite potentially high accuracy in theory on specific datasets, actual performance may degrade significantly. Moreover, although computer vision tasks are executed on top of CNN models, these works generally ignore the robustness defects of the models themselves. The proposed FedIn-MC framework is meant to be applied in real medical settings. First, it addresses the inefficiency and limited robustness of medical image classification. Second, it greatly improves medical image data sharing and data quality. In sum, the framework realizes local medical data storage and safe sharing while encouraging more participants to join.

3 Methodology

3.1 DPL loss function

Features learned automatically from data usually yield a better classification effect. In this paper's framework, a CNN is adopted as the feature extractor, denoted \(f(x,\theta )\), where x and \(\theta \) represent the original input and the CNN parameters, respectively. Whereas traditional CNNs use a softmax layer to classify the learned features linearly, our model learns prototypes on the features of each class for classification. Denote a learned prototype as \(m_{ij}\), where \(i \in \{1,2,3,\ldots ,I\} \) is the class index and \(j \in \{1,2,3,\ldots ,J\} \) is the prototype index within each class. The set of prototypes is \(M=\left\{ m_{ij}\big | i=1,2,\ldots ,I;j=1,2,\ldots ,J\right\} \), and each class is set to have the same number of prototypes. At the classification stage, a sample is matched to a prototype: the nearest prototype is found via the Euclidean distance, and the sample is assigned the class of that prototype, as shown in Fig. 1 (a small sketch follows the figure).

Fig. 1
figure 1

DPL function example
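
To make this matching step concrete, the following minimal NumPy sketch assigns each sample to the class of its nearest prototype. The function name and array shapes are illustrative assumptions, not notation from the paper.

```python
import numpy as np

def predict_class(features, prototypes):
    """Assign each sample to the class of its nearest prototype.

    features:   (B, D) array of CNN feature vectors f(x).
    prototypes: (I, J, D) array, J prototypes for each of I classes.
    Returns:    (B,) array of predicted class indices.
    """
    I, J, D = prototypes.shape
    flat = prototypes.reshape(I * J, D)
    # Euclidean distance from every sample to every prototype
    dists = np.linalg.norm(features[:, None, :] - flat[None, :, :], axis=-1)
    nearest = dists.argmin(axis=1)      # flat index of the closest prototype
    return nearest // J                 # flat prototype index -> class index
```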

First, the distance-based cross-entropy (DCE) loss is used to measure the similarity between a sample and a prototype. The probability \(p\left( x\in m_{ij}\big | x\right) \) that a sample (x, y) belongs to the prototype \(m_{ij}\) can therefore be measured by distance:

$$p\left( x\in m_{ij}\big | x\right) \propto -\Vert f(x)-m_{ij}\Vert $$
(1)
$$p\left( x\in m_{ij}\big | x\right) =\frac{e^{-\lambda d(f\left( x\right) ,m_{ij})}}{\sum _{a=1}^{I}\sum _{b=1}^{J}e^{-\lambda d(f\left( x\right) ,m_{ab})}}$$
(2)

where \(d(f\left( x\right) ,m_{ij})\) denotes the distance between f(x) and \(m_{ij}\), and \(\lambda \) is a hyper-parameter that controls the hardness of the probability assignment. From this definition, the DCE loss \(l(\left( x,y\right) ;\theta ,M)\) is obtained as follows:

$$l\left( \left( x,y\right) ;\theta ,M\right) =-\log p\left( y\big | x\right) $$
(3)
$$p\left( y\big | x\right) =\sum _{j=1}^{J}{p\left( x\in m_{yj}\big | x\right) }$$
(4)
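
As a minimal NumPy sketch of Eqs. (2)-(4), batch-averaged and taking the Euclidean norm of Eq. (1) as \(d\) (function names and shapes are illustrative):

```python
import numpy as np

def dce_loss(features, prototypes, labels, lam=1.0):
    """Distance-based cross-entropy loss, Eqs. (2)-(4).

    features:   (B, D) feature vectors f(x).
    prototypes: (I, J, D) prototypes m_ij.
    labels:     (B,) true class indices y.
    lam:        the hyper-parameter lambda of Eq. (2).
    """
    B = features.shape[0]
    I, J, D = prototypes.shape
    flat = prototypes.reshape(I * J, D)
    d = np.linalg.norm(features[:, None, :] - flat[None, :, :], axis=-1)
    logits = -lam * d
    # Softmax over all I*J prototypes gives p(x in m_ij | x), Eq. (2)
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    p = p.reshape(B, I, J)
    # p(y|x) sums the probabilities of the true class's prototypes, Eq. (4)
    p_y = p[np.arange(B), labels].sum(axis=1)
    return -np.log(p_y + 1e-12).mean()  # Eq. (3), averaged over the batch
```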

Minimizing this loss thus maximizes the probability of correct classification: it minimizes the distance between a sample's features and the prototypes of its true class while maximizing the distance between those features and the prototypes of all wrong classes. However, directly minimizing the classification loss can lead to overfitting, so the prototype loss is introduced as a regularizer to improve the generalization performance of DPL. It is defined as:

$$pl\left( \left( x,y\right) ;\theta ,M\right) ={\Vert f(x)-m_{yj}\Vert }^2_2$$
(5)

Combining the above losses to train the classification model, the total loss \(\text{ DPL }\left( \left( x,y\right) ;\theta ,M\right) \) is defined as:

$$\text{ DPL }\left( \left( x,y\right) ;\theta ,M\right) = l\left( \left( x,y\right) ;\theta ,M\right) +\alpha \cdot pl\left( \left( x,y\right) ;\theta ,M\right) $$
(6)

where \(\alpha \) is a hyper-parameter that controls the weight of the prototype loss. Using PL further improves the performance of the DPL framework. On the one hand, PL pulls sample features toward their corresponding prototypes, reducing intra-class distances and thereby indirectly increasing inter-class distances. On the other hand, the classification loss emphasizes the separation of representations while the prototype loss emphasizes their compactness, so combining the two enhances the robustness of the framework.
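
Continuing the sketch above and reusing its dce_loss, Eqs. (5) and (6) might be implemented as follows. Pulling toward the nearest prototype of the correct class in Eq. (5), and the default value of \(\alpha \), are our assumptions:

```python
def prototype_loss(features, prototypes, labels):
    """Prototype loss (PL), Eq. (5): squared Euclidean distance to a
    prototype of the true class (here, the nearest one)."""
    own = prototypes[labels]                               # (B, J, D)
    d2 = ((features[:, None, :] - own) ** 2).sum(axis=-1)  # (B, J)
    return d2.min(axis=1).mean()

def dpl_loss(features, prototypes, labels, lam=1.0, alpha=1.0):
    """Total DPL loss, Eq. (6): DCE plus alpha-weighted PL."""
    return (dce_loss(features, prototypes, labels, lam)
            + alpha * prototype_loss(features, prototypes, labels))
```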

3.2 Incentive mechanism

The proposed incentive mechanism is based on the contributions of medical institution clients: each institution is rewarded with tokens according to the size of the dataset it contributes to model training. First, when the FedIn-MC framework starts a round of training, the central server selects a fixed number of medical institution servers and stores their blockchain addresses in Clist. At the same time, the central server defines the Training structure, which stores the training tasks completed by each medical institution server, including the training set size and the status of the training task.

The selected client servers then train the model locally, and the information of the current training round is uploaded to Training and stored synchronously in Contrib. Finally, according to the data in Contrib, tokens are distributed to each medical institution server address in Clist: each client's reward is its training set size multiplied by a constant rate r, and the total payout is the sum of these products. The incentive mechanism encourages more medical institutions to participate in FedIn-MC training, thereby reducing the bias of the classification model. The procedure is given in Algorithm 1, followed by a minimal sketch.

Algorithm 1
figure a

Incentive mechanism
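
As a plain-Python illustration of the bookkeeping in Algorithm 1 (in the deployed system this logic would run on the Ethereum chain), the sketch below follows the Clist/Training/Contrib description above; the field layout, type names, and sample values are assumptions:

```python
from dataclasses import dataclass

@dataclass
class Training:
    """One client's record of a completed training task."""
    address: str         # blockchain address stored in Clist
    dataset_size: int    # size of the local training set
    completed: bool = False

def distribute_tokens(contrib, r):
    """Reward each medical institution in proportion to its
    training set size: tokens = dataset_size * r."""
    return {rec.address: rec.dataset_size * r
            for rec in contrib.values() if rec.completed}

# Example round with reward rate r = 0.01
contrib = {
    "c1": Training("0xA1", 5000, True),
    "c2": Training("0xB2", 3000, True),
    "c3": Training("0xC3", 8000, False),  # task not finished: no reward
}
print(distribute_tokens(contrib, r=0.01))  # {'0xA1': 50.0, '0xB2': 30.0}
```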

3.3 FedIn-MC framework

Since multiple medical institutions jointly participate in model training, during each round of federated learning the central server selects a specific number of medical institutions to join the training according to predefined conditions (such as client idle time and charging status). For the medical institution servers, let each medical client's dataset be \({\text{ data }}_{n,i}=\left\{ {x_{n,i},y_{n,i}}\right\} \) and the total dataset participating in the training be \(\text{ Data }=\sum _{n\in N}\sum _{i\in I}{\text{ data }}_{n,i}\), where \(i\in I\) indexes samples and \(n \in N\) indexes medical institutions. During local training on the client side, the objective function of medical institution n's server is denoted \(F_{n}\), the loss is computed as \(\text{ DPL }\left( \left( x,y\right) ;\theta ,M\right) \), and the model parameters \(w_{n}^{k}\) are obtained by minimizing it:

$$\begin{aligned} F_n \triangleq \min {\left( {\frac{1}{N}}\cdot \sum _{n\in N} \text{ DPL }\left( \left( x,y\right) ;\theta ,M\right) \right) } \end{aligned}$$
(7)

Then, each medical institution server performs k rounds of stochastic gradient descent (SGD) and uploads the updated parameters \(w_n^k\) to the central server. Finally, the central server computes the weighted average of the \(w_n^k\) according to the training dataset sizes:

$$w^k=\sum _{n=1}^{N}\frac{\left| {\text {data}}_n\right| }{\left| \text {Data}\right| }\cdot w_n^k$$
(8)
$$\nabla f(w^k)=\frac{1}{N}\cdot \sum _{n=1}^{N}\nabla f(w^k,x_n,y_n)$$
(9)

In FedIn-MC, the model is computed iteratively until the inequality \(\left| f\left( w^k\right) -f(w^{k-1})\right| \le \delta \) is satisfied, where \(\delta \in R\) is the accuracy tolerance. Different medical institutions act as the system's clients, each holding a local dataset that never leaves its premises. Meanwhile, a trusted third party with sufficient computing power acts as the central server and provides the computing support for aggregating the global model. The FedIn-MC procedure is given in Algorithm 2, followed by a minimal sketch.

Algorithm 2
figure b

FedIn-MC
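
A minimal Python sketch of one FedIn-MC communication round (cf. Algorithm 2), assuming each client exposes a local training routine and that model weights are single NumPy arrays:

```python
def fedin_mc_round(global_w, clients, k_local, delta, prev_loss):
    """One communication round of FedIn-MC.

    clients: list of (local_train_fn, dataset_size) pairs; each
             local_train_fn runs k_local SGD steps on the DPL loss,
             starting from global_w, and returns (w_n, local_loss).
    """
    total = sum(size for _, size in clients)
    updates, weighted_loss = [], 0.0
    for train_fn, size in clients:
        w_n, loss_n = train_fn(global_w, k_local)    # local SGD, Eq. (7)
        updates.append((w_n, size))
        weighted_loss += loss_n * size / total
    # Dataset-size-weighted average of the client models, Eq. (8)
    new_w = sum(w_n * (size / total) for w_n, size in updates)
    # Stopping rule: |f(w^k) - f(w^{k-1})| <= delta
    converged = abs(weighted_loss - prev_loss) <= delta
    return new_w, weighted_loss, converged
```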

4 Experiments and results

4.1 Experimental setting

This project uses Ethereum as the underlying blockchain network with Proof of Work (PoW) consensus, and MySQL 5.7.1 to store the off-chain data. The federated learning framework is deployed on 7 servers, of which 1 is the central server and 6 are client servers. Assuming each client server belongs to one medical institution, each medical institution server runs a four-layer neural network for medical data classification. During training, the initial learning rate is 0.005, the momentum is 0.5, and the hyper-parameter in DPL is set to 1.0.
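
For reference, the stated settings can be summarized in a configuration sketch; the key names are illustrative, and parameters the paper does not report (optimizer details, batch size, epochs) are omitted:

```python
# Experimental setup of Sect. 4.1 (illustrative key names)
config = {
    "blockchain": "Ethereum",
    "consensus": "PoW",
    "offchain_db": "MySQL 5.7.1",
    "num_servers": 7,            # 1 central server + 6 clients
    "num_clients": 6,            # one per medical institution
    "model": "4-layer neural network",
    "learning_rate": 0.005,      # initial
    "momentum": 0.5,
    "dpl_hyperparameter": 1.0,
}
```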

The experiments use a CT image dataset of COVID-19 [45] and the benchmark dataset MNIST. The COVID-19 dataset is collected from several public databases and recently published articles. After screening, the entire dataset is divided into training, test, and validation sets; the division is summarized in Table 1.

Fig. 2
figure 2

Incentive registry

Table 1 Introduction of COVID-19 dataset

The experiment stores the status of each federated training round in the incentive registry. As shown in Fig. 2, the registry records the training status of each medical institution client in that round, including the client's data size and the number of tokens obtained.

The evaluation indicators are the accuracy rate (Accuracy), the false rejection rate (FRR), and the false acceptance rate (FAR). Accuracy reflects the classification accuracy of the model, while FRR and FAR verify its stability. The formulas are as follows:

$$\text {Accuracy} = \frac{\text {TP}_n + \text {TN}_n}{\text {TP}_n + \text {FP}_n + \text {TN}_n + \text {FN}_n}$$
(10)
$$\text {FRR} = \frac{\text {FN}_n}{\text {TP}_n + \text {FN}_n}$$
(11)
$$\text {FAR} = \frac{\text {FP}_n}{\text {FP}_n + \text {TN}_n}$$
(12)

where TP\(_n\), TN\(_n\), FP\(_n\), and FN\(_n\) represent the True Positives, True Negatives, False Positives, and False Negatives of the corresponding medical institution client n, respectively.
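
A small Python helper shows how Eqs. (10)-(12) follow from one client's confusion counts; the numbers in the usage example are made up:

```python
def metrics(tp, tn, fp, fn):
    """Accuracy, FRR, and FAR from client n's confusion counts, Eqs. (10)-(12)."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    frr = fn / (tp + fn)  # genuine samples wrongly rejected
    far = fp / (fp + tn)  # abnormal samples wrongly accepted
    return accuracy, frr, far

print(metrics(tp=990, tn=980, fp=20, fn=10))  # (0.985, 0.01, 0.02)
```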

4.2 Results and discussion

4.2.1 Analysis of classification accuracy

As shown in Figs. 3 and 4, the FedIn-MC framework outperforms the traditional FedAvg framework in classification accuracy: with the DPL loss, model accuracy on the COVID-19 and MNIST datasets increases by 1.12 and 0.3 percentage points, respectively. This indicates that the DPL loss minimizes the distances between sample features and the correct-class prototypes while maximizing the distances to the wrong-class prototypes, thereby maximizing the probability of correct image classification. The FedIn-MC framework performs well on different datasets, showing promise for better training efficiency and model universality, and a clear advantage in medical data training.

Fig. 3
figure 3

Accuracy values and loss values for the COVID-19 dataset

Fig. 4
figure 4

Accuracy values and loss values for the MNIST dataset

Table 2 The FAR and FRR values for methods

As shown in Table 2, to test the robustness of the FedIn-MC framework, this project trains the model on the MNIST dataset and evaluates the network with two test sets (MNIST and COVID-19). Since the COVID-19 test samples are not digits, their results are abnormal and the model should reject them; by contrast, the MNIST test data come from the same domain as the training data, and the model should accept the corresponding results. To evaluate the classification effect fairly, the experiments use two measurement indicators, FAR and FRR: here FAR denotes the percentage of MNIST samples that are accepted, while FRR denotes the percentage of COVID-19 samples that are rejected. A comparison of the results in the table shows that the traditional softmax-based model fails to achieve high FAR and FRR simultaneously, whereas the DPL-based model in this paper achieves both, demonstrating stable abnormality detection. The model rejects more than 99% of the abnormal COVID-19 samples while accepting more than 99% of the normal MNIST samples, confirming the significant robustness advantage of the FedIn-MC framework.

Table 3 The accuracy of classification on added category

To demonstrate the superiority of the FedIn-MC framework in medical data classification, this project conducts an incremental experiment on the COVID-19 dataset. The experiment uses the COVID-19 samples (Normal, Viral, and COVID-19 classes) as known-class data and treats the ten MNIST categories as unknown classes: in each trial, one MNIST category is selected and combined with the COVID-19 test samples to form a new test set. As Table 3 shows, the FedIn-MC framework maintains high classification accuracy on the incremental test samples. No part of the model is retrained; the classification test is simply added on top of the originally trained model, proving that the framework has a robust classification advantage.

4.2.2 Analysis of communication cost

This project compares the communication overhead of FedAvg (softmax) and FedIn-MC (DPL). In each round of the experiment, N medical institution clients are randomly selected to participate, and the communication overhead of 100 rounds of system training is measured.

As shown in Table 4 and Fig. 5, an incremental experiment measured the communication overhead of exchanging model parameters between the clients and the central server during global model training. The results show that the communication overhead of softmax-based federated learning is much higher than that of the DPL-based FedIn-MC. In particular, as the number of medical institutions increases and the dataset grows, the proposed framework effectively reduces communication overhead.

Table 4 Comparison of communication cost
Fig. 5
figure 5

The communication cost for different amounts of clients

Given the above experimental results, the FedIn-MC framework can motivate data owners of all parties to join model training, break the "data islands" situation while ensuring data privacy, and improve data classification, which is a positive sign for the development of intelligent medical care.

5 Conclusions and future work

This paper adopts federated learning to address the insufficiency of medical data and the difficulty of sharing it, so as to promote cooperation among medical institutions. Drawing on prototype learning, the traditional softmax function is replaced with the DPL function, which directly learns multiple prototypes per class in the convolutional feature space and employs a prototype loss to shorten intra-class distances, improving model robustness. Meanwhile, to increase the quantity and quality of datasets, a blockchain incentive mechanism is introduced into the framework to attract more medical institution participants. In the future field of smart medical care, federated learning can be combined with other new technologies to increase model accuracy, reduce data-sharing risks, and provide more innovative ideas for secure medical data sharing. In subsequent studies, we plan to conduct more detailed experiments combining blockchain and federated learning, improving the model's adaptability to changing environments and the credibility of its aggregation process.