Federated Learning with Privacy-preserving and Model IP-right-protection

In the past decades, artificial intelligence (AI) has achieved unprecedented success, and statistical models have become the central entity in AI. However, the centralized training and inference paradigm for building and using these models is facing more and more privacy and legal challenges. To bridge the gap between data privacy and the need for data fusion, federated learning (FL) has emerged as an AI paradigm for solving data silos and data privacy problems. Based on secure distributed AI, federated learning emphasizes data security throughout the lifecycle, which includes the following steps: data preprocessing, training, evaluation, and deployment. FL preserves data security by using methods such as secure multi-party computation (MPC), differential privacy, and hardware solutions to build and use distributed multi-party machine-learning systems and statistical models over different data sources. Beyond data privacy concerns, we argue that the concept of “model” also matters: when federated models are developed and deployed, they are exposed to various kinds of risks, including plagiarism, illegal copying, and misuse. To address these issues, we introduce FedIPR, a novel ownership verification scheme that embeds watermarks into FL models to verify ownership and protect model intellectual property rights (IPR, or IP-right for short). While security is at the core of FL, many articles still refer to distributed machine learning with no security guarantees as “federated learning”, which does not satisfy the definition FL is supposed to meet. To this end, in this paper we reiterate the concept of federated learning and propose secure federated learning (SFL), whose ultimate goal is to build trustworthy and safe AI with strong privacy-preserving and IP-right-preserving guarantees. We provide a comprehensive overview of existing works, including threats, attacks, and defenses in each phase of SFL from the lifecycle perspective.


Introduction
In recent years, artificial intelligence (AI) has made great progress in many commercial applications, including computer vision [1,2], natural language processing [3,4], recommender systems [5,6], etc. However, behind this rapid development, the drawbacks of traditional AI approaches have also been revealed: they rely heavily on the availability of large-scale, high-quality data but provide no mechanism for obtaining and using such data securely. For example, the development of computer vision has benefited from large-scale public datasets like ImageNet [7], which is essentially a centralized data model. From e-commerce to online video, recommender systems can analyze user preferences precisely based on historical data and recommend the most relevant items to users. In biology, by training on publicly available data consisting of 170,000 protein structures from the protein data bank (PDB) [8], AlphaFold, developed by DeepMind, achieved highly accurate predictions of protein structure [9,10]. These examples are all centralized data-driven computation systems, and they require that data scattered across multiple devices be first uploaded to a central database before being used to train the statistical models.
Centralized data fusion for AI modeling is facing more and more legal and ethical challenges. In practice, data is often spread across multiple end devices and held by different individual users or organizations, and data in different locations is heterogeneous in form and distribution. Fusing the data into a central database inevitably increases privacy leakage risks. With increasing awareness of privacy concerns, governments are strengthening data privacy laws to prevent privacy leakage, such as the general data protection regulation (GDPR) in the EU [11], the California consumer privacy act (CCPA) in the USA [12], and the data security law (DSL) in China [13]. On the other hand, the value of data, which stems from its uniqueness and rarity, is also a challenge that cannot be neglected: once data can be freely shared and copied, its value gradually disappears, and as a result no organization is willing to share data without benefit.
In order to eliminate the drawbacks caused by data fusion, Google proposed a new training paradigm, called federated learning (FL) [14], to address these data challenges. The original FL requires model parameters, not raw training datasets, to be exchanged between multiple devices during the whole training process, which can greatly mitigate data privacy risks. However, existing works have shown that vanilla FL without protection of the exchanged model parameters may not always provide strong security guarantees. Zhu et al. [15] demonstrated that the original training data can be recovered from gradients. Phong et al. [16] showed that even a small portion of the original gradients can expose information about local data. Besides, beyond the training stage, vanilla FL is also vulnerable to various kinds of attacks during the entire FL lifecycle, which includes the following steps: data preprocessing, training, evaluation, and deployment [17]. For example, data can be poisoned in the preprocessing stage [18], and membership inference attacks can occur in the model deployment phase [19]. It is thus important to emphasize that security guarantees are an essential part of FL system design.
Moreover, as statistical models are the central entities in AI, multiple assets, including training data, hardware, and human expertise, are involved when developing and deploying FL models in practice. This makes "model management" a critical issue. To prevent models from being misused or plagiarized without authorization, we reinforce awareness of model intellectual property and introduce an IP-right-preserving mechanism for models in federated learning. In this paper, we realize federated model IPR protection by embedding watermarks into deep neural network (DNN) model parameters, achieving good results in practice.
In summary, the true spirit of federated learning lies in its ability to provide strong privacy preservation and model IP-right preservation; to distinguish it from vanilla FL, we call it secure federated learning (SFL). In contrast to many existing FL works that provide only very weak or no security guarantees, we emphasize that the principle of SFL should receive more attention in both industry and academia. This article gives a comprehensive overview of key aspects of SFL, including existing works on security guarantees throughout the entire lifecycle and on model IP-right protection. In the rest of the article, unless stated otherwise, we use the terms federated learning (FL) and secure federated learning (SFL) interchangeably to refer to federated learning systems that employ certain security mechanisms.

Related works
In the FL literature, Yang et al. [20,21] introduced categorizations of FL and extended the scope of federated learning to also include vertical federated learning (VFL) scenarios. Kairouz et al. [17] discussed the progress of FL and presented several challenging problems for future directions. Li et al. [22] discussed the unique characteristics and challenges of federated learning.
Several existing security-related surveys are published with similar motivations to ours. For example, Lyu et al. [23] discussed the threats to federated learning. Bouacida and Mohapatra [24] conducted a comprehensive survey on the security issues and defense strategies under the FL setting. Mothukuri et al. [25] summarized common threat models faced by the FL system and provided corresponding defense strategies.
Recently, Liu et al. [26] analyzed the security issues throughout the multiple phases of FL execution, which include the data and behavior auditing phase, the training phase, and the predicting phase; their survey covers a broader scope of FL security scenarios. However, their discussion mainly focuses on horizontal federated learning (HFL), and it does not address how to protect the IP rights of federated models.

Contribution
Compared to the previous works, the main contributions of our paper are as follows: 1) We reiterate the core concept of secure federated learning and emphasize that security guarantees should cover the entire FL lifecycle, which includes the following steps: data preprocessing, model training, evaluation, and deployment.
2) We provide a general SFL architecture, which covers both HFL and VFL scenarios, and we summarize existing works on threats, attacks, and defense in each phase throughout the entire lifecycle.
3) We view model intellectual property rights as an important part of building a secure FL system, and provide detailed implementations of how to protect federated models′ IPR.

Overview of secure federated learning
In this section, we discuss the definition and system architecture of secure federated learning.

Security guarantees
We first clarify our definition of security, which involves the five aspects shown in Fig. 1.
Keeping private training datasets safe. There are two levels of approaches that meet this requirement: the simplest is to keep local datasets on devices, but this alone does not provide sufficient security; a stronger approach is to provide security protocols that deal with model inversion attacks and prevent raw datasets from being stolen.
Keeping model parameters safe. As mentioned above, an attacker can reconstruct the original datasets from exposed parameters via model inversion.
Keeping model structure safe. This requirement is optional. In the VFL setting, each participant holds a partial model while the global model structure is unknown. In the HFL setting, however, the global model structure is in most cases pre-defined, and anyone who participates in the FL system knows the global model structure in advance.
Keeping model performance intact. In most cases, there is a trade-off between privacy loss and utility loss [27]; how to balance security and model performance is a challenging and open problem.
Keeping the model IP-right safe. FL involves collaboration among multiple parties, where the concept of "model" becomes important; to this end, IP-right preservation is required to prevent model assets from being stolen.

Definition
SFL refers to secure collaborative distributed machine-learning methods and architectures that satisfy the following conditions: 1) They provide a security mechanism to protect data/model security and user privacy using tools that include but are not limited to: multi-party computation (MPC) [28,29], differential privacy (DP) [30,31], and encryption-based solutions (either software-based encryption solutions such as homomorphic encryption (HE) [32] or hardware-based solutions such as trusted execution environments (TEE) [33−35]).
2) There are clearly defined threat models as well as corresponding provably correct defense strategies, with the purpose of providing and clarifying the security scope of federated learning throughout the entire lifecycle.
3) They also include methods to protect the IP-rights of the trained models, with the purpose of preventing the model from being plagiarized.

Architecture
In this section, we introduce the key components that constitute SFL system architecture. Without loss of generality, we assume there are three participants to jointly train and use a statistical model. A typical SFL architecture is as shown in Fig. 2.
In Fig. 2, we show secure HFL (Fig. 2(a)) and secure VFL (Fig. 2(b)), respectively. At a high level, the general SFL architecture consists of the following three components.
Component 1: Distributed machine learning (DML). Under the FL scenario, collaborative training among multiple devices is realized on top of a distributed machine learning framework. However, there are several key differences between FL and traditional DML: 1) FL is motivated by data privacy and security, while DML is motivated by large-scale computation.
2) Participants of FL can be either individuals or organizations, while DML is a multi-node system in which each participant is a compute node in a single cluster or data center.
3) FL allows different participants to have different configurations (such as data distribution, data size, hardware, and network). In contrast, DML is more stable, with nearly identical configurations across nodes.
4) FL requires incentive mechanisms to encourage more participants to join the FL ecosystem, while DML does not require any such mechanism.
Component 2: Security protocol. Security is at the core of federated learning. However, as aforementioned, vanilla FL can be vulnerable to different kinds of attacks throughout the lifecycle.
To mitigate the risks caused by adversarial attacks, a series of secure operation steps is pre-negotiated by the participants, with the goal of completing the task requirements without compromising data privacy. We call these pre-negotiated secure operation steps security protocols.
In summary, security protocols can be based either on traditional privacy-preserving computation, such as MPC, DP, and HE, or on algorithmic approaches, such as modifying the training loss function or improving model aggregation.
Component 3: Model IPR protection. Conceptually, IPR refers to all rights associated with models owned by an entity; such models cannot be used by anyone unless authorized. The motivation for model IPR protection boils down to the following reasons: 1) Since multiple models are involved in the training phase, it is critical to be able to trace back to the original creators of models in order to locate responsibility whenever failures happen.
2) FL models are of high commercial value, so it is necessary to prevent adversaries from plagiarizing, misusing, and re-distributing valuable FL models without legal permission.
3) Incentive mechanisms are essential parts of the FL ecosystem, based on the principle of "more pay for more work", IPR can let the FL manager know the contributions of different models.

Security of federated learning lifecycle
In the following sections, we first briefly review the concept of federated learning lifecycle and then summarize the potential security risks at each stage, respectively.
FL is a distributed learning paradigm, in which the process is typically initiated by a party with a specific application purpose. For example, a financial company may want to update its risk prediction model by collaborating with multiple financial institutions. Kairouz et al. [17] discussed the lifecycle of a model in federated learning; based on that, we identify four phases that are susceptible to various kinds of attacks, as shown in Fig. 3.
Preprocessing. Participants first identify the problems and requirements to be solved with FL. There are two major tasks at this stage. The first is data preprocessing: according to the task requirements, engineers transform raw training datasets to make them suitable for building machine learning models. In most cases, each client generates and stores its data independently, so data preprocessing can be done locally; in some cases, however, additional data needs to be maintained and downloaded from the server side. The second task is to prepare the initial model; to this end, the client sends an instruction to the server to request the global model.
At this stage, malicious modifications of the training datasets, such as mislabeling, noising, and poisoning, are the most significant security threat and affect subsequent processes (see Fig. 3 for the threat phases of the federated learning lifecycle, adapted from [17]).
Federated model training. Based on the distributed architecture, multiple participants collaboratively train a shared global model while keeping the training datasets on their local devices.
The model training phase is vulnerable to various kinds of attacks, including poisoning attacks, backdoor attacks, byzantine attacks, etc.
Federated model evaluation. Model evaluation is the process of using different evaluation metrics to assess a machine learning model′s performance, with the feedback used to determine whether to stop training. Model evaluation can be done either on the server or on the local client.
The potential security risks at this stage mainly come from evasion attacks (a.k.a. adversarial examples) [36−39], whose goal is to fool the model into producing wrong outputs by carefully perturbing input examples.
Deployment. The final step is to select the updated model and integrate it into the real-world application to make practical business decisions.
The potential security risks at this stage come from three aspects, i.e., evasion attacks, model inference attacks, and model plagiarism.

Security of preprocessing
In the context of federated learning, the training datasets are generated and stored locally, which makes them hard for other parties to steal. However, if a client is controlled by a malicious user, the local datasets can be modified arbitrarily, and the polluted data then affects the subsequent training step. In this section, we discuss the possible threats, attacks, and corresponding defenses during the preprocessing stage.

Client-side attacks
If a participant is controlled by a malicious user, he (or she) can modify the data at will before the training phase, including mislabeling, noising, and poisoning. Low-quality samples impair model training and reduce the model′s performance.
In general, the impact of client-side attacks at the preprocessing stage depends heavily on the number of malicious clients. Due to the client-selection mechanism of federated learning [40], there is no guarantee that a malicious client will be selected at each round, and even if it is selected but not frequently, model aggregation can cancel out the backdoor model′s contribution. Besides, from the client′s perspective, the percentage of polluted data in the overall training sample is also a critical factor that affects the attack.
Defense. The selection mechanism of federated learning is a natural defense strategy. Improving the selection mechanism with adaptive strategy [41,42] , rather than random choices, can further reduce the loss caused by malicious clients.
Anomaly detection to identify malicious clients is another effective approach to defend against client-side attacks in the preprocessing phases. For example, with a pre-trained model, the FL manager can check the training datasets to filter out potential adversarial attackers.

Server-side attacks
Another possible threat is that the attacker controls the central server, which orchestrates the training process and holds the additional data and the global model. Server-side attacks usually occur in HFL: the attacker can modify the data or the global model parameters at will, and then distribute the fake data or global model to the selected clients for training.
Server-side attacks are harder to defend against than client-side attacks; in the worst case, the server can distribute the fake data or global model to all the clients, which is equivalent to all participants being malicious.
Currently, there is still no good strategy to defend against server-side attacks, which is why most FL literature requires the central server to be trusted. Another solution is to use a TEE as a trusted third party [43,44], which provides confidentiality and integrity guarantees for code and data.

Security of model training
Attacks in the model training phase are the main research area of federated learning security; the adversary can attack the model training procedure in the following ways.
1) In the HFL setting, the attacker can either control the central server or clients, or eavesdrop and steal the transmitted parameters ( Fig. 5(a)).
2) In the VFL setting, the attacker can either control the passive party or the active party, or eavesdrop and steal the transmitted parameters ( Fig. 5(b)).
In the following subsections, we discuss the threats, attacks, and corresponding defense strategies in the training phase. According to the capabilities of the adversary, we classify model training attacks into gradient inversion attacks and poisoning attacks.

Gradient inversion attacks
The goal of gradient inversion attacks is to reconstruct or recover the sensitive information from the shared gradients.
Hitaj et al. [45] discussed a GAN-based attack to generate label-specific prototypical samples from gradients, which are meant to be kept private from other clients. However, their approach is limited and only works when labels among multiple parties overlap.
Zhu et al. [15] proposed deep leakage from gradients (DLG). DLG requires neither a GAN model nor any information other than the gradients; the key idea is to optimize synthetic data so that their gradients are as close as possible to the original gradients, which makes the synthetic data close to the real training data once the optimization is done. However, DLG is susceptible to convergence and label-consistency problems; to address this, Zhao et al. [46] proposed improved DLG (iDLG), which is guaranteed to extract the ground-truth labels from the shared gradients.
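To make the gradient-matching optimization behind DLG concrete, the following sketch minimizes the L2 distance between the gradients of dummy data and the gradients intercepted from a victim. It is a simplified illustration of the idea in [15], not the authors′ reference implementation; `model`, `true_grads`, `x_shape`, and `num_classes` are assumed inputs.

```python
import torch
import torch.nn.functional as F

def dlg_reconstruct(model, true_grads, x_shape, num_classes, steps=300):
    """Sketch of deep leakage from gradients (DLG): optimize dummy data and
    soft labels so that their gradients match the intercepted gradients."""
    dummy_x = torch.randn(x_shape, requires_grad=True)                   # synthetic input
    dummy_y = torch.randn(x_shape[0], num_classes, requires_grad=True)   # soft label logits
    optimizer = torch.optim.LBFGS([dummy_x, dummy_y])

    def closure():
        optimizer.zero_grad()
        pred = model(dummy_x)
        loss = torch.sum(-F.softmax(dummy_y, dim=-1) * F.log_softmax(pred, dim=-1))
        grads = torch.autograd.grad(loss, model.parameters(), create_graph=True)
        # L2 distance between dummy gradients and the observed gradients
        grad_diff = sum(((g - tg) ** 2).sum() for g, tg in zip(grads, true_grads))
        grad_diff.backward()
        return grad_diff

    for _ in range(steps):
        optimizer.step(closure)
    return dummy_x.detach(), dummy_y.detach()
```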
Geiping et al. [47] proposed inverting gradients to recover the original input image, by setting the loss function to cosine similarity with the total variation (TV) norm. Compared to DLG/iDLG, inverting gradients performed well even on deep and non-smooth models.
Wang et al. [48] proposed the self-adaptive privacy attack from gradients (SAPAG), which uses a Gaussian-kernel-based distance on the gradient difference as its measure. Zhu and Blaschko [49] proposed R-GAP, which provides a recursive procedure to recover data from gradients.
The above attacks mainly apply to HFL scenarios. Recently, Jin et al. [50] proposed catastrophic data leakage in vertical federated learning (CAFE), which performs large-batch data leakage attacks with improved data recovery quality under the VFL setting.
We summarize some common gradient inversion attack approaches in Table 1 for reference.

Table 1 An overview of existing works on gradient inversion attacks (adapted from [50])

Methods | Approach | Adversary control | Supported scenarios
Deep models under the GAN [45] | GAN-based | Client | HFL
DLG [15] | Minimize L2 distance between dummy gradients and original gradients | Central server (HFL) / Active party (VFL) | HFL, VFL
iDLG [46] | Minimize L2 distance between dummy gradients and original gradients | Central server (HFL) / Active party (VFL) | HFL, VFL
Inverting gradients [47] | Cosine similarity and TV norm | Central server | HFL
SAPAG [48] | Gaussian kernel-based distance | Central server | HFL
R-GAP [49] | Recursive gradient loss | Central server | HFL
CAFE [50] | L2 distance; TV norm; internal representation norm | Passive party | VFL
GGL [51] | GAN-based | Central server | HFL

Defense against gradient inversion attacks
Several defense strategies have proved feasible for defending against gradient inversion attacks.
Encrypted gradients. One straightforward approach is to encrypt the gradients, which makes the gradient values unavailable without the secret keys. For example, Hardy et al. [52] showed how to use a Taylor approximation of the loss function so that the FL training process can be executed in the encrypted domain. Phong et al. [16] proposed using homomorphic encryption to encrypt the gradients before they are sent.
However, encryption-based solutions suffer from efficiency problems, and how to balance safety and efficiency is still a challenging problem. To alleviate this, Zhang et al. [53] proposed an efficient HE solution, called BatchCrypt, for cross-silo federated learning, which can significantly reduce the communication overhead caused by encryption.
Gradient compression. Gradient compression obscures the shared gradient information. Specifically, it can be realized by either pruning or sparsification; since part of the gradients is missing, the recovered images end up far from the original data [54].
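As a concrete illustration of gradient sparsification, the sketch below keeps only the top-k largest-magnitude entries of a gradient tensor before it is shared; `keep_ratio` is an illustrative parameter, and real systems typically also accumulate the dropped residuals locally.

```python
import numpy as np

def topk_sparsify(grad: np.ndarray, keep_ratio: float = 0.1) -> np.ndarray:
    """Keep only the largest-magnitude gradient entries and zero out the rest,
    so the shared update reveals only a small fraction of the full gradient."""
    flat = np.abs(grad).ravel()
    k = max(1, int(keep_ratio * flat.size))
    threshold = np.partition(flat, -k)[-k]      # k-th largest absolute value
    return np.where(np.abs(grad) >= threshold, grad, 0.0)
```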
Noisy gradient methods. Unlike encryption-based solutions, noise-based solutions perturb the gradients by adding noise to achieve differential privacy. Since time-consuming operations such as encryption and decryption are unnecessary, noise-based solutions are more efficient in practice.
For example, McMahan et al. [55] introduced a new algorithm, called DP-FedAvg, for user-level differentially private training of large neural networks under federated settings. Wei et al. [56] proposed a general framework combining FL with differential privacy, by adjusting different amounts of noise to ensure distinct protection levels.
However, noise-based solutions require the algorithm to carefully adjust the noise generation to keep the performance of the model, such as model accuracy, intact. Otherwise, performance may be badly compromised. As shown in [27], how to balance safety and utility is still a challenging problem.
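The following sketch shows the basic clip-then-add-noise step used by DP-style federated averaging such as DP-FedAvg [55]; the clipping bound and noise multiplier are illustrative hyperparameters, noise is shown on a single update for simplicity (in DP-FedAvg it is added after aggregating the clipped updates), and no privacy accounting is included.

```python
import numpy as np

def clip_and_noise(update, clip_norm=1.0, noise_multiplier=1.0, rng=None):
    """Clip a model update to a fixed L2 norm, then add Gaussian noise,
    trading model accuracy for a differential-privacy-style guarantee."""
    rng = rng or np.random.default_rng()
    update = np.asarray(update, dtype=float)
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / (norm + 1e-12))
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=update.shape)
    return clipped + noise
```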

Poisoning attacks
According to the attacker′s goal, we classify poisoning attacks into the following two categories.
1) Targeted attacks, or backdoor attacks, aim to reduce the model′s performance on examples with certain features while maintaining good performance on the remaining examples. In most cases, such poisoning attacks not only require data modification in the preprocessing phase, but also require properly designing the training algorithm to achieve the goal.
Take image classification as an example: the attacker wants the model to misclassify images with specific patterns (a vertical red stripe at the upper-left corner (Fig. 6(a)), a yellow background (Fig. 6(b))) into an attacker-chosen class [40], while the main task is not compromised (a minimal sketch of such trigger-based data poisoning is given after this list).
Bagdasaryan et al. [40] leveraged the model replacement approach to make backdoor attacks more persistent. Xie et al. [57] further introduced distributed backdoor attacks, where a global backdoor trigger is decomposed into multiple local patterns, each of which is embedded into the training datasets of different malicious clients. Huang [58] discussed how to achieve backdoor attacks under the dynamic environment.
2) Unlike targeted attacks, untargeted attacks aim to degrade the model′s performance. For example, Feng et al. [59] proposed the DeepConfuse framework, which uses an autoencoder to add imperceptible noise to the training data, so that a classifier trained on the polluted data is confused and produces wrong outputs when fed new, clean data.
Byzantine attacks are another type of untargeted attack. Hu et al. [60] proposed a method called weight attack, whose key idea is to lie about the attacker′s dataset size so that the model weights are skewed during model aggregation. Fang et al. [61] proposed local model poisoning attacks, which manipulate the local models uploaded from the compromised devices to the central server during the training process.
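Returning to the targeted (backdoor) attack illustrated above, the sketch below shows how a malicious client might stamp a trigger patch onto part of its local data and relabel those samples to an attacker-chosen class. The trigger (a bright 3x3 patch) and the poisoning fraction are hypothetical choices for illustration, not the exact patterns used in [40].

```python
import numpy as np

def poison_batch(images, labels, target_class, poison_fraction=0.2, rng=None):
    """Stamp a small trigger patch onto a fraction of the images and relabel
    them to the attacker-chosen class (targeted / backdoor data poisoning)."""
    rng = rng or np.random.default_rng()
    images, labels = images.copy(), labels.copy()
    n_poison = int(poison_fraction * len(images))
    idx = rng.choice(len(images), size=n_poison, replace=False)
    images[idx, :, :3, :3] = 1.0   # hypothetical trigger; assumes NCHW images in [0, 1]
    labels[idx] = target_class
    return images, labels
```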

Defense against poisoning attacks
FL provides numerous security protocols to defend against poisoning attacks during model training.
Byzantine-robust aggregation. This line of work aims to improve the classical FedAvg algorithm [14] and provide secure and robust aggregation to mitigate byzantine attacks. Yin et al. [62] proposed median and trimmed-mean aggregation to remove abnormal local models. Blanchard et al. [63] proposed the Krum aggregation rule, a byzantine-resilient algorithm for distributed stochastic gradient descent (SGD), which provides a convergence guarantee even when multiple byzantine workers exist. Xie et al. [64] proposed the Zeno aggregation rule for synchronous SGD, in which at least one honest worker is enough to ensure good defense performance.
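To make the median and trimmed-mean rules concrete, the sketch below aggregates flattened client updates coordinate-wise, in the spirit of [62]; the amount of trimming is an illustrative parameter.

```python
import numpy as np

def median_aggregate(updates: np.ndarray) -> np.ndarray:
    """Coordinate-wise median of client updates, shape [n_clients, dim]."""
    return np.median(updates, axis=0)

def trimmed_mean_aggregate(updates: np.ndarray, trim: int = 1) -> np.ndarray:
    """Coordinate-wise trimmed mean: drop the `trim` largest and smallest
    values per coordinate, then average the remaining ones."""
    sorted_updates = np.sort(updates, axis=0)
    return sorted_updates[trim: len(updates) - trim].mean(axis=0)
```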
FL with MPC has been widely studied. For example, Dong et al. [65] combined secret sharing with distributed machine learning to achieve a high-level security guarantee without compromising model performance. Kanagavelu et al. [66] proposed adopting MPC to achieve privacy-preserving model aggregation for FL. Like encryption-based solutions, MPC-based solutions suffer from efficiency problems. It is important to regard MPC as a set of technologies (or primitives), including but not limited to secret sharing (SS) [29], oblivious transfer (OT) [67], and garbled circuits (GC) [68]. Improving the efficiency of MPC depends largely on breakthroughs in these low-level primitives.
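As a toy illustration of how additive secret sharing enables privacy-preserving aggregation, the sketch below splits each (integer-encoded) client update into random shares so that only the sum of the updates can be recovered, never an individual one. Real MPC-based secure aggregation such as [65,66] additionally handles fixed-point encoding, client dropout, and collusion assumptions.

```python
import numpy as np

MOD = 2 ** 31 - 1   # toy modulus; real systems use fixed-point encodings

def share(secret, n_parties, rng):
    """Split an integer-encoded update into n additive shares modulo MOD."""
    shares = [rng.integers(0, MOD, size=secret.shape) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % MOD)
    return shares

def secure_sum(all_shares):
    """Each party sums the shares it holds; the sum of those partial sums
    equals the sum of all secrets, without revealing any single update."""
    partial_sums = [sum(column) % MOD for column in zip(*all_shares)]
    return sum(partial_sums) % MOD

# usage: three clients secret-share quantized updates; only the sum is learned
rng = np.random.default_rng(0)
updates = [np.array([5, 7]), np.array([1, 2]), np.array([3, 3])]
all_shares = [share(u, n_parties=3, rng=rng) for u in updates]
print(secure_sum(all_shares))   # [9, 12] == element-wise sum of the updates
```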
Hardware solution. TEEs have many different implementations with different forms, including Intel′s SGX [69,70] , Arm′s TrustZone [71,72] , and AMD SEV [73] , each varying in its ability to offer privacy protection. Combining TEE with FL has been applied in many applications. For example, Mo et al. [43] proposed DarkneTZ to mitigate attacks against the neural network, and then, they further designed a general FL framework for mobile systems to protect user privacy [74] . Huang et al. [44] proposed a new hybrid federated learning architecture, called StarFL, by combining TEE, MPC, and satellites for smart urban computing.
Besides, the encryption-based and noise-based solutions described in Section 5.1.2 are also feasible solutions to defend against poisoning attacks.

Security of model evaluation
Evasion attacks, or adversarial examples, aim to evade the model by adjusting samples during the inference phase. In general, these samples are carefully perturbed so that the changes are imperceptible to the human eye while the network fails to identify the image contents.
Model evaluation is used to assess a model′s performance and provide feedback on whether to stop training; the attacker can deceive the federated model′s evaluation output by constructing adversarial test examples. According to the attacker′s capabilities, evasion attacks can be divided into the following two categories.
Gradient-based attacks. This type of approach requires the attacker to have access to the model′s gradients in advance. Goodfellow et al. [39] proposed the fast gradient sign method (FGSM), which uses the gradient of the loss with respect to the input image to create a new image that maximizes the loss. Kurakin et al. [75] improved FGSM by computing adversarial examples iteratively. Carlini and Wagner [76] proposed the C&W algorithm, a powerful attack approach that can defeat defensive distillation.
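A minimal PyTorch sketch of FGSM: perturb the input in the direction of the sign of the input gradient of the loss. The perturbation budget `epsilon` and the `[0, 1]` pixel range are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def fgsm_example(model, x, y, epsilon=0.03):
    """Fast gradient sign method: x_adv = x + epsilon * sign(d loss / d x)."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    x_adv = x + epsilon * x.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()    # keep pixels in a valid range
```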
Confidence scores. This type of approach does not require knowing the model′s gradients in advance; instead, it uses the output classification confidence scores to estimate the gradients, and then performs an optimization step similar to the gradient-based attacks above.
Chen et al. [77] proposed zeroth-order optimization (ZOO) to directly estimate the gradients of the targeted model for generating adversarial examples. Ilyas et al. [78] proposed a variant of natural evolution strategies (NES) to fool the classifier under three realistic settings: the query-limited setting, the partial-information setting, and the label-only setting.
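The core of such confidence-score attacks is gradient estimation by finite differences of the model′s output. The following simplified, coordinate-wise sketch conveys the idea behind ZOO-style estimation; `score_fn` is an assumed black-box that returns a scalar loss for an input, and the query budget `n_coords` is illustrative.

```python
import numpy as np

def estimate_gradient(score_fn, x, h=1e-4, n_coords=128, rng=None):
    """Finite-difference estimate of the gradient of a black-box loss,
    probing a random subset of input coordinates (ZOO-style, simplified)."""
    rng = rng or np.random.default_rng()
    grad = np.zeros_like(x, dtype=float)
    coords = rng.choice(x.size, size=min(n_coords, x.size), replace=False)
    for i in coords:
        e = np.zeros(x.size)
        e[i] = h
        e = e.reshape(x.shape)
        grad.flat[i] = (score_fn(x + e) - score_fn(x - e)) / (2 * h)
    return grad   # then used in an FGSM/PGD-like update on x
```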
In general, defending against evasion attacks is much harder; we summarize some common ideas from existing works as follows [79].
Adversarial training. An intuitive idea is to build a robust model by including adversarial samples in the training process. For example, Moosavi-Dezfooli et al. [80] built more robust classifiers by fine-tuning on adversarial examples. Goodfellow et al. [39] built robust models by mixing the adversarial objective with the classification objective as a regularizer.
However, in most cases it is not possible to know all possible adversarial samples in advance, so adversarial training offers only limited protection.
Knowledge distillation. Papernot et al. [81] proposed defensive distillation, which uses the knowledge distillation technique to train the model and hide the gradient between the logits layer and the softmax outputs, so that the attacker cannot generate adversarial examples from the network gradients.
Anomaly detection. Another approach is to detect abnormal examples. For example, Metzen et al. [82] detected adversarial examples by using a detector subnetwork attached to the main classification network. Grosse et al. [83] empirically validated the hypothesis that adversarial examples can be detected using statistical tests before they are fed to the machine learning model.

Security of model deployment
Model deployment is the last step of the lifecycle, which aims to apply the machine learning models in practice. There are three potential risks in this phase: evasion attacks, model inference attacks, and model plagiarism. Evasion attacks have been discussed in Section 6; in Sections 7 and 8, we discuss the two remaining threats. The purpose of model inference attacks is to infer sensitive information by accessing models multiple times. Model inference attacks can be divided into the following three classes.
Label inference attack. Label inference attacks are more likely to occur in vertical federated learning scenarios. In VFL, the active party holds the data matrix and the class labels, while the passive parties keep only the data matrix. Label inference attacks happen when a passive party is controlled by the attacker; the goal is to infer the labels held by the active party.
Fu et al. [84] presented three types of label inference attacks against VFL: the direct label inference attack, the passive label inference attack, and the active label inference attack. Liu et al. [85] proposed batch label inference and replacement attacks to recover labels in the VFL setting with HE-protected communication.
Feature inference attack. Like label inference attacks, feature inference attacks usually occur in the VFL setting, where features are partitioned across different parties; the goal is to infer the sensitive feature information held by other parties.
Luo et al. [86] proposed a feature inference attack on model predictions in VFL, where the active party attempts to infer the feature values of new samples that belong to the passive parties.
Membership inference attack. Unlike the previous two types of inference attacks, membership inference attacks usually occur in the HFL setting, where data samples are partitioned and held by different parties. Given a data record and black-box access to the model, the attacker tries to determine whether the record is in the model′s training dataset based on the model outputs.
Pustozerova and Mayer [87] discussed membership inference attacks in the setting of sequential federated learning. A promising defense is to distort the resulting model by injecting a certain amount of noise into the training data or by directly perturbing the model parameters; a similar effect can be achieved by applying differential privacy to the learning output [87].
Defenses against model inference attacks have also been widely studied; the defensive strategies discussed for the previous phases, such as differential privacy to obfuscate the model output and encryption-based solutions for masking the model structure, also apply to inference attacks. Besides, controlling query frequency is another promising approach to preventing malicious queries.
The final security risk is model plagiarism. FL models can be deployed on any device, which places them beyond the owner′s control and makes them susceptible to various kinds of attacks such as plagiarism and misuse. As this is a new research direction in federated learning, it deserves in-depth discussion; we explain the detailed implementation in Section 8.

IP-right protection of federated learning models
While preserving the privacy of training data is of paramount importance for FL, it is also a critical issue to prevent adversaries from plagiarizing, misusing, and re-distributing valuable FL models without legal permission from the models′ legitimate owners [17].

Challenges
Machine learning methods that allow ownership verifications of valuable models, especially large deep neural network models, have been successfully demonstrated by either detecting feature-based signatures embedded into models [88,89] , or verifying designated labels for backdoor samples that are injected into the models during the training stage [89,90] .
These methods have been adopted and extended to the federated learning setting [91,92], in which the following challenges must be properly addressed to allow each participant to verify its respective ownership of, and contribution to, the global model.
First, in order for each participant to embed its own feature-based signatures, the global federated model must have sufficient capacity to embed a potentially large set of (binary) signatures without compromising the original model performance. Theoretical analysis and empirical investigation in [92] demonstrated that, as long as the total bit-length of the embedded signatures does not exceed a threshold proportional to the size of the deep neural network, it is possible to embed signatures without introducing significant loss of original model performance. Potential conflicts between signatures embedded by different participants should also be considered; fortunately, such conflicts lead to negligible losses in the confidence of ownership verification, as demonstrated in [92].
Second, in order for each participant to embed its private signatures, the aggregation of federated models should not disclose the signatures embedded into individual models. Moreover, these feature-based signatures should be verified in a private manner, for instance, without disclosing the feature extraction matrix. It was demonstrated in [92] that private embedding and verification are achievable.
Third, for backdoor-sample-based ownership verification, one must ensure the persistence of backdoor samples when submitting the local models for aggregation. This is because plenty of aggregator-side defensive methods have been proposed with the aim of filtering out backdoor samples from the global model [62, 63, 93]. Again, Li et al. [92] showed that the adoption of such defensive methods causes only negligible losses in the confidence of ownership verification. Thus, the embedded backdoor samples turn out to be very persistent.
The protection of IPR for FL models is an important step in the whole life-cycle of federated training. This step is part of an auditing process in which a variety of requirements for federated model management must be fulfilled. For instance, one may wonder whether a trained generative model has been misused to generate fake images or videos. This line of research work has been investigated in non-federated settings [94−96] .
Note that model IP-right protection cannot be solved by existing blockchain-based methods: while a model is being collaboratively built by multiple participants, it has not yet been entered into any blockchain.

Protection of deep neural network ownership using digital watermarks
In the past, digital watermarks were widely utilized to safeguard the ownership of multimedia assets such as images [97,98], videos [99,100], audio [101−103], or functional designs [104]. Recently, progress in deep learning has prompted various technology corporations to launch machine learning as a service (MLaaS) as a business model. Therefore, in order to protect and encourage creativity, it is necessary and urgent to provide IP-right protection for models.
In general, the IPR of deep models can be protected by various digital watermarking methods, which can be categorized into two schools according to their working modes, namely, black-box solutions using trigger sets [90,105] and white-box solutions relying on unique detectable features [88,106,107]. The main idea of watermarking is to embed identification information (i.e., a digital watermark) into the model in question without compromising model performance on the original task. For trigger-set-based methods, such watermarks are encoded by specific input-output data samples, referred to as the trigger set. Ownership of the model in question is verified by repeatedly detecting the trigger-set behavior, relying on the exponentially low probability that an innocent model would exhibit the same behavior by chance. On the other hand, feature-based methods embed designated watermarks into the parameters of deep neural networks (DNNs) using a carefully designed transformation matrix. In this case, the detection of the designated watermarks validates ownership.
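The feature-based (white-box) idea can be sketched as a sign-based regularizer in the spirit of [88,89]: the owner chooses a secret matrix X and a binary signature b, and training pushes sign(X·w) towards b, where w is a flattened slice of model parameters; verification recomputes the signs with the secret matrix. The variable names, hinge margin, and loss weight below are illustrative, not the exact formulation of any cited paper.

```python
import torch

def watermark_regularizer(w, X, b, margin=0.1):
    """Hinge-style loss pushing sign(X @ w) towards the binary signature b
    (entries of b in {-1, +1}); added to the task loss during training."""
    projections = X @ w                       # shape: [n_bits]
    return torch.relu(margin - b * projections).mean()

def extract_signature(w, X):
    """Ownership check: recover the embedded bits with the secret matrix X."""
    return torch.sign(X @ w)

# training sketch: total_loss = task_loss + lam * watermark_regularizer(w, X, b);
# ownership is claimed when the bit-error rate between extract_signature(w, X)
# and b falls below a preset threshold.
```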
The first effort, due to Uchida et al. [88], proposed to protect the ownership of DNNs in a white-box manner by embedding designated watermarks into DNNs without compromising the host network′s performance on the original task. Uchida et al. [88] also demonstrated that the detection of the designated watermarks was robust in the face of a variety of removal attacks, including model fine-tuning and pruning. However, their method was constrained in that it required access to all of the network weights in question to extract the embedded watermarks. To mitigate this white-box constraint, Merrer et al. [108] proposed a trigger-set-based solution which embedded watermarks in the classification outputs of CNN models by using adversarial samples (trigger sets). This method was advantageous in that it allowed designated watermarks to be verified remotely by repeatedly submitting trigger-set samples to a service API, without requiring access to the network′s internal weight parameters. Later, Adi et al. [90] demonstrated that such an embedded watermark can be treated as an intentional backdoor, and a theoretical analysis of performance under different scenarios was provided in [90]. A common theme in follow-up works such as [106,107] is how to embed robust watermarks (or fingerprints) that are persistent against various removal attacks, including watermark overwriting, model fine-tuning, and pruning, in both black-box and white-box settings. More recent works [89,105] were proposed to deal with another type of attack on watermarks, i.e., ambiguity attacks. The most distinctive feature of the solutions illustrated in [89,105] (also summarized in Fig. 7) is that the inference performance of the DNN model in question will either remain intact if a valid passport is presented, or be significantly deteriorated otherwise. By taking advantage of this unique feature of the passport-based approach, ownership verification becomes both robust to removal attacks and resilient to ambiguity attacks. Moreover, a designated binary signature can be simultaneously embedded into the scale factors of a passport layer, which provides strong guarantees and resilience to ambiguity attacks.
Aiming at the IP protection of generative adversarial networks (GANs), Ong et al. [109] demonstrated a feasible solution, as summarized in Fig. 8. Later, Lim et al. [110] also demonstrated IP protection for recurrent neural networks (RNNs). In both cases, the generic watermarking framework proposed in [108] for DNNs is not readily applicable, since the input to a GAN can be either a latent vector or an image and its output is a synthetic image rather than a classification label, while for RNNs both the input and the output are sentences.

For the protection of GAN-type models, Ong et al. [109] proposed a protection framework that embeds the ownership information into the generator of the GAN. In a black-box scenario, they proposed to induce the generator to output a designated watermark at an assigned location of the synthesized image when given a trigger input (see Fig. 8). This special behavior is enforced in the model by an appropriately designed regularization term in the training of the GAN. In a white-box scenario, Ong et al. [109] proposed to use a modified sign loss from [89], where the signs of the scaling factors encode meaningful security information, e.g., a company name. Ownership verification was successfully demonstrated on three GAN variants, namely, the deep convolutional generative adversarial network (DCGAN) [111], super-resolution using a generative adversarial network (SRGAN) [112], and CycleGAN [113].
For RNN models, which were designed to take images as inputs and output meaningful image captions according to the image contents, Lim et al. [110] proposed a novel RNN ownership verification method whose main features are summarized as follows. First, two different embedding schemes were adopted to embed a designated watermark (or secret key) into the RNN cell. Second, ownership was then verified by comparing the image captions generated for a specific input image. Third, a secret key was embedded into the hidden memory of the RNN such that a forged key immediately yields an unusable image captioning model with poor-quality outputs. This protection, in the same vein as the passport-type protection in [89], proactively prevents the infringement of RNN models.

Protection of deep neural network ownership under federated setting
In federated learning, there are several possible IPR infringement scenarios. First, during the training stage, multiple clients have access to the global model; thus, in the model-deployment stage, the trained model may be illegally redistributed to an unauthorized party outside the federated learning system. Second, some free-rider parties participate in federated learning merely to steal the federated model: they pretend to participate in the training process without actually contributing any data, thereby infringing the intellectual property rights of benign clients. These two infringement scenarios highlight the strong demand and motivation for federated model IP protection, and several watermarking methods for FL have been proposed as remedies for these loopholes.
Recently, a watermarking scheme named WAFFLE [91] was proposed to protect FL models. This method assumes that the trustworthy central server is the owner of the FL model and that clients have no ownership over the jointly trained federated model. WAFFLE introduces a model re-training step on the server side, in which the server embeds backdoor-based watermarks [90] into the aggregated model. In the ownership verification stage, the central server claims ownership through black-box access to the trained model using the prescribed watermarks.
Li et al. [92] considered the FL IPR protection problem in a more general semi-honest FL setting and proposed the FedIPR signature/watermark embedding scheme. In FedIPR, each party is an owner of the federated learning model. During the training stage, each party keeps its own secret watermarks and embeds them into its local model on the client side; afterward, the local models are aggregated into a global model. In the verification stage, each party can verify ownership of the global model by looking for its own watermark embedded in the global model. This verification process is kept secret by each party and is independent of other parties′ watermarks. In this way, unauthorized parties outside FL cannot claim ownership of the federated learning model; moreover, free-riders who do not embed watermarks during training cannot claim legitimate rights over the global model. The FedIPR setting is rather challenging because the signature embedding process on the client side must be kept secret, and the global model needs sufficient capacity to embed every party′s watermarks at the same time. On the technical side, Li et al. [92] propose both backdoor-based and feature-based watermarks: specifically, they use adversarial samples as backdoor-based watermarks embedded in the local model, and adopt a secret matrix to embed feature-based signatures into the batch-normalization layers. In the verification stage, each party can verify ownership of the global model independently. FedIPR provides theoretical results on the capacity of client-side secret watermarks, and it is evaluated on both image classification and natural language inference tasks with both convolutional and transformer-based architectures.
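At a high level, the client-side verification step of a FedIPR-style scheme can be sketched as follows: a client projects the global model′s batch-normalization scale factors through its secret matrix and measures how many recovered bits match its signature. Which layers are used, the bit budget, and the decision threshold follow the paper and are simplified here; the names below are illustrative.

```python
import torch

def verify_ownership(gamma, secret_matrix, signature, threshold=0.9):
    """Client-side check: recover bits from the global model's batch-norm
    scale factors `gamma` and compare them with this client's signature
    (entries in {-1, +1}); claim ownership if the bit accuracy is high."""
    recovered = torch.sign(secret_matrix @ gamma)
    bit_accuracy = (recovered == signature).float().mean().item()
    return bit_accuracy >= threshold
```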
Engineers and researchers building federated learning frameworks can benefit from model IP protection. This is crucial because developing a DNN costs a massive amount of money, data, and computing resources. IP protection methods encourage innovation in DNN models and protect the legitimate rights of model owners, even in the worst case where an attacker can access the model without the owner′s knowledge. In short, federated DNN model protection benefits the AI community, especially by securing model owners′ advantages in the open market.

IPR research direction
Model intellectual property protection is an open question in secure federated learning; challenges come from the following perspectives.
Watermarking protocols. It is necessary to design secure and trustworthy protocols for federated learning model protection. One scheme [91] was proposed in which the model server is responsible for watermark embedding and only the server can verify ownership over the model, whereas Li et al. [92] proposed that each client can embed private watermarks and claim ownership of the model without revealing watermarking information to other parties. We believe that ownership verification protocols combined with additional security mechanisms are a compelling need for trustworthy and secure verification.
Watermarking embedding methods. Previous federated model watermarking methods can be divided into feature-based methods [88, 90−92] and backdoor-based methods [109,110] . The feature-based watermarks need to be extracted with white-box access to the model parameters, which is unrealistic in practice. The backdoor-based watermarking methods are highly related to backdoor learning, which is an important perspective and can motivate more related research. Especially in federated learning, it remains to be investigated how many private watermarks can be embedded into the same global model.
Watermark robustness. Another important challenge for federated model watermarking is robustness. On the one hand, various training strategies such as differential privacy, homomorphic encryption, and secure aggregation are adopted for data security [52, 54−55, 65, 66]; these strategies modify the training process and thus may remove the watermarks. On the other hand, a model adversary may apply removal attacks or model extraction attacks to remove the watermarks [114−116]. Taken together, these two risks make watermark robustness a crucial issue for federated model IP protection.
In general, model IP protection is a non-negligible issue when applying federated learning in practice. Algorithms and protocols will be at the core of research on federated model IP protection.

Open-source frameworks for federated learning
Developing a federated learning framework from scratch is very time-consuming, especially in industrial settings. A good FL framework helps engineers and researchers train, study, and deploy FL models in practice. In this section, we summarize some commonly used open-source frameworks in Table 2.

Conclusions
Privacy-preserving computing (PPC) is an active and influential research area in both industry and academia. As a frontier research direction of PPC, FL has received considerable attention in recent years. This article gives a comprehensive survey of the key components of SFL, including its definition, architecture design, and the threat models faced by FL. We also hope that the IP protection perspective illustrated in this paper will lead to model IP protection in more FL settings. We believe that secure federated learning will bring about a new mindset and toolbox for developing large-scale AI systems, and help to address open problems that hinder wide application of SFL in a larger variety of use cases, such as secure and legal data exchange, data shortages, and data silos in practice.

Open Access
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article′s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article′s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.