Abstract
Privacy and Byzantine-robustness are two major concerns of federated learning (FL), but mitigating both threats simultaneously is highly challenging: privacy-preserving strategies prohibit access to individual model updates to avoid leakage, while Byzantine-robust methods require such access for comprehensive mathematical analysis. Moreover, most Byzantine-robust methods only work in the honest-majority setting.
We present \(\mathsf {FLOD}\), a novel oblivious defender for private Byzantine-robust FL in the dishonest-majority setting. At its core, \(\mathsf {FLOD}\) uses a novel Hamming distance-based aggregation method that resists \(>1/2\) Byzantine attacks by bootstrapping trust from a small root dataset and a server model. Furthermore, we employ two non-colluding servers and use additive homomorphic encryption (\(\mathsf {AHE}\)) and secure two-party computation (2PC) primitives to construct efficient privacy-preserving building blocks for secure aggregation, in which we propose two novel in-depth variants of Beaver multiplication triples (MT) to significantly reduce the overhead of bit-to-arithmetic (\(\mathsf {Bit2A}\)) conversion and vector weighted-sum aggregation (\(\mathsf {VSWA}\)). Experiments on real-world and synthetic datasets demonstrate our effectiveness and efficiency: (i) \(\mathsf {FLOD}\) defeats known Byzantine attacks with a negligible effect on accuracy and convergence, (ii) achieves a reduction of \(\approx \)2\(\times \) in the offline (resp. online) overhead of \(\mathsf {Bit2A}\) and \(\mathsf {VSWA}\) compared to \(\mathsf {ABY}\)-\(\mathsf {AHE}\) (resp. \(\mathsf {ABY}\)-\(\mathsf {MT}\)) based methods (NDSS'15), and (iii) reduces total online communication and run-time by 167–1416\(\times \) and 3.1–7.4\(\times \) compared to \(\mathsf {FLGUARD}\) (Crypto Eprint 2021/025).
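To make the Hamming distance-based aggregation idea concrete, the following is a minimal plaintext sketch (no encryption or secret sharing). It assumes a ReLU-style weight \(\nu_i = \max(0, \tau - hd_i)\), which matches the property \(\nu_i > 0 \Leftrightarrow hd_i < \tau\) analyzed in Appendix A; the paper's exact weighting and normalization may differ.

```python
import numpy as np

def hamming_aggregate(client_updates, server_update, tau=None):
    """Plaintext sketch of Hamming distance-based robust aggregation.

    client_updates: list of d-dim arrays (client model updates)
    server_update:  d-dim array trained on the small root dataset
    tau:            clipping threshold (defaults to d/2, as in Appendix A)
    """
    signs = [np.sign(u) for u in client_updates]   # sgn updates in {-1, 1}^d
    s = np.sign(server_update)
    d = s.size
    if tau is None:
        tau = d / 2
    # Hamming distance between each client's sign vector and the server's
    hds = [np.sum(w != s) for w in signs]
    # tau-clipping: clients with hd_i >= tau get zero weight
    # (hypothetical ReLU-style weight, not necessarily the paper's exact form)
    nus = [max(0.0, tau - hd) for hd in hds]
    total = sum(nus)
    if total == 0:
        return np.zeros(d)
    # weighted sum of the sign updates, normalized by the total weight
    return sum(nu * w for nu, w in zip(nus, signs)) / total
```

With \(\tau = d/2\), a client whose sign vector disagrees with the server model on at least half of the coordinates receives weight zero and is excluded from the aggregate.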
References
Alistarh, D., Allen-Zhu, Z., Li, J.: Byzantine stochastic gradient descent. arXiv preprint arXiv:1803.08917 (2018)
Bagdasaryan, E., Veit, A., Hua, Y., Estrin, D., Shmatikov, V.: How to backdoor federated learning. In: International Conference on Artificial Intelligence and Statistics, pp. 2938–2948. PMLR (2020)
Beaver, D.: Efficient multiparty protocols using circuit randomization. In: Feigenbaum, J. (ed.) CRYPTO 1991. LNCS, vol. 576, pp. 420–432. Springer, Heidelberg (1992). https://doi.org/10.1007/3-540-46766-1_34
Bellare, M., Hoang, V.T., Rogaway, P.: Foundations of garbled circuits. In: Proceedings of the 2012 ACM Conference on Computer and Communications Security, pp. 784–796 (2012)
Bernstein, J., Wang, Y.X., Azizzadenesheli, K., Anandkumar, A.: signSGD: compressed optimisation for non-convex problems. In: International Conference on Machine Learning, pp. 560–569. PMLR (2018)
Blanchard, P., El Mhamdi, E.M., Guerraoui, R., Stainer, J.: Machine learning with adversaries: byzantine tolerant gradient descent. In: Proceedings of the 31st International Conference on Neural Information Processing Systems, pp. 118–128 (2017)
Bogdanov, D., Laur, S., Willemson, J.: Sharemind: a framework for fast privacy-preserving computations. In: Jajodia, S., Lopez, J. (eds.) ESORICS 2008. LNCS, vol. 5283, pp. 192–206. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-88313-5_13
Bonawitz, K., et al.: Practical secure aggregation for privacy-preserving machine learning. In: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, pp. 1175–1191. ACM (2017). https://doi.org/10.1145/3133956.3133982
Bookstein, A., Kulyukin, V.A., Raita, T.: Generalized hamming distance. Inf. Retrieval 5(4), 353–375 (2002)
Canetti, R.: Universally composable security: a new paradigm for cryptographic protocols. In: Proceedings 42nd IEEE Symposium on Foundations of Computer Science, pp. 136–145. IEEE (2001)
Cao, X., Fang, M., Liu, J., Gong, N.Z.: FLTrust: byzantine-robust federated learning via trust bootstrapping. arXiv preprint arXiv:2012.13995 (2020)
Corrigan-Gibbs, H., Boneh, D.: Prio: private, robust, and scalable computation of aggregate statistics. In: 14th USENIX Symposium on Networked Systems Design and Implementation (NSDI 2017), pp. 259–282 (2017)
Demmler, D., Schneider, T., Zohner, M.: ABY - a framework for efficient mixed-protocol secure two-party computation. In: NDSS (2015)
ElGamal, T.: A public key cryptosystem and a signature scheme based on discrete logarithms. IEEE Trans. Inf. Theory 31(4), 469–472 (1985)
Erickson, B.J., Korfiatis, P., Akkus, Z., Kline, T.L.: Machine learning for medical imaging. Radiographics 37(2), 505–515 (2017)
Fang, M., Cao, X., Jia, J., Gong, N.: Local model poisoning attacks to byzantine-robust federated learning. In: 29th USENIX Security Symposium (USENIX Security 2020), pp. 1605–1622 (2020)
Hard, A., et al.: Federated learning for mobile keyboard prediction. arXiv preprint arXiv:1811.03604 (2018)
Ion, M., et al.: On deploying secure computing: private intersection-sum-with-cardinality. In: 2020 IEEE European Symposium on Security and Privacy (EuroS&P), pp. 370–389. IEEE (2020)
Kairouz, P., et al.: Advances and open problems in federated learning. arXiv preprint arXiv:1912.04977 (2019)
Konečnỳ, J., McMahan, H.B., Yu, F.X., Richtárik, P., Suresh, A.T., Bacon, D.: Federated learning: strategies for improving communication efficiency. arXiv preprint arXiv:1610.05492 (2016)
Krizhevsky, A., Hinton, G., et al.: Learning multiple layers of features from tiny images (2009)
Li, M., et al.: Scaling distributed machine learning with the parameter server. In: 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI 2014), pp. 583–598 (2014)
Liu, R., Cao, Y., Yoshikawa, M., Chen, H.: FedSel: federated SGD under local differential privacy with top-k dimension selection. In: Nah, Y., Cui, B., Lee, S.-W., Yu, J.X., Moon, Y.-S., Whang, S.E. (eds.) DASFAA 2020. LNCS, vol. 12112, pp. 485–501. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-59410-7_33
McMahan, B., Moore, E., Ramage, D., Hampson, S., y Arcas, B.A.: Communication-efficient learning of deep networks from decentralized data. In: Artificial Intelligence and Statistics, pp. 1273–1282. PMLR (2017)
Mhamdi, E.M.E., Guerraoui, R., Rouault, S.: The hidden vulnerability of distributed learning in byzantium. arXiv preprint arXiv:1802.07927 (2018)
Nasr, M., Shokri, R., Houmansadr, A.: Comprehensive privacy analysis of deep learning: passive and active white-box inference attacks against centralized and federated learning. In: 2019 IEEE Symposium on Security and Privacy (SP), pp. 739–753. IEEE (2019)
Nguyen, T.D., et al.: FLGUARD: secure and private federated learning. arXiv preprint arXiv:2101.02281 (2021)
Nosowsky, R., Giordano, T.J.: The health insurance portability and accountability act of 1996 (HIPAA) privacy rule: implications for clinical research. Annu. Rev. Med. 57, 575–590 (2006). https://doi.org/10.1146/annurev.med.57.121304.131257
Paillier, P.: Public-key cryptosystems based on composite degree residuosity classes. In: Stern, J. (ed.) EUROCRYPT 1999. LNCS, vol. 1592, pp. 223–238. Springer, Heidelberg (1999). https://doi.org/10.1007/3-540-48910-X_16
Peikert, C., Vaikuntanathan, V., Waters, B.: A framework for efficient and composable oblivious transfer. In: Wagner, D. (ed.) CRYPTO 2008. LNCS, vol. 5157, pp. 554–571. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-85174-5_31
Phong, L.T., Aono, Y., Hayashi, T., Wang, L., Moriai, S.: Privacy-preserving deep learning via additively homomorphic encryption. IEEE Trans. Inf. Forensics Secur. 13(5), 1333–1345 (2018). https://doi.org/10.1109/TIFS.2017.2787987
Microsoft Research, Redmond, WA: Microsoft SEAL (release 3.6), November 2020. https://github.com/Microsoft/SEAL
Shokri, R., Shmatikov, V.: Privacy-preserving deep learning. In: Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, pp. 1310–1321. ACM (2015). https://doi.org/10.1145/2810103.2813687
Smart, N.P., Vercauteren, F.: Fully homomorphic SIMD operations. Des. Codes Cryptogr. 71(1), 57–81 (2012). https://doi.org/10.1007/s10623-012-9720-4
IT Governance Privacy Team: EU General Data Protection Regulation (GDPR): an implementation and compliance guide. IT Governance Ltd (2017). https://doi.org/10.2307/j.ctt1trkk7x
Xiao, H., Rasul, K., Vollgraf, R.: Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms. arXiv preprint arXiv:1708.07747 (2017)
Yao, A.C.C.: How to generate and exchange secrets. In: 27th Annual Symposium on Foundations of Computer Science (SFCS 1986), pp. 162–167. IEEE (1986)
Yin, D., Chen, Y., Kannan, R., Bartlett, P.: Byzantine-robust distributed learning: towards optimal statistical rates. In: International Conference on Machine Learning, pp. 5650–5659. PMLR (2018)
Zhu, L., Han, S.: Deep leakage from gradients. In: Yang, Q., Fan, L., Yu, H. (eds.) Federated Learning. LNCS (LNAI), vol. 12500, pp. 17–31. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-63076-8_2
Acknowledgements
We are grateful to the anonymous reviewers for their comprehensive comments. This work was supported by the Strategic Priority Research Program of Chinese Academy of Sciences, Grant No. XDC02040400.
Appendices
A Byzantine-Robustness Analysis
Cosine similarity is a standard metric for measuring the similarity of two vectors. Recall that the cosine similarity of two \(\mathsf {sgn}\) model updates \(\widetilde{\mathbf {w}_i}\) and \(\widetilde{\mathbf {w}_s}\) is \( c_i = \frac{\langle \widetilde{\mathbf {w}_i}, \widetilde{\mathbf {w}_s}\rangle }{\Vert \widetilde{\mathbf {w}_i}\Vert \cdot \Vert \widetilde{\mathbf {w}_s}\Vert } \), and \(\mathsf {FLTrust}\) clips \(c_i\) with the \(\mathrm {ReLU}\) function to remove poisoned model updates with negative \(c_i\) [11]. Since \(\widetilde{\mathbf {w}_i}, \widetilde{\mathbf {w}_s} \in \{-1,1\}^d\), we have \(\langle \widetilde{\mathbf {w}_i}, \widetilde{\mathbf {w}_s}\rangle = d - 2\cdot hd_i\) and \(\Vert \widetilde{\mathbf {w}_i}\Vert = \Vert \widetilde{\mathbf {w}_s}\Vert = \sqrt{d}\). Hence, by Eq. (2, 3), \( c_i = \frac{d - 2\cdot hd_i}{d} = 1 - 2\cdot \frac{hd_i}{d}. \)
Thus, we have \(c_i>0 \Leftrightarrow 1-2\cdot \frac{hd_i}{d}>0\Leftrightarrow hd_i<\frac{d}{2}\). Therefore, with \(\tau = \frac{d}{2}\) we have \(\nu _i>0 \Leftrightarrow c_i>0\), which means the \(\tau \)-clipping Hamming distance-based method excludes exactly the poisoned \(\mathsf {sgn}\) model updates that the cosine similarity-based method excludes. Moreover, our \(\tau \)-clipping Hamming distance-based method is more flexible than the cosine similarity-based one, since \(\tau \) can be tuned per task to achieve the best Byzantine-robustness.
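The identity \(c_i = 1 - 2\cdot \frac{hd_i}{d}\) and the resulting equivalence \(c_i > 0 \Leftrightarrow hd_i < \frac{d}{2}\) can be checked numerically for random sign vectors; a minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 1000
w_i = rng.choice([-1, 1], size=d)   # sgn model update in {-1, 1}^d
w_s = rng.choice([-1, 1], size=d)   # sgn server-model update

hd = np.sum(w_i != w_s)             # Hamming distance hd_i
cos = w_i @ w_s / (np.linalg.norm(w_i) * np.linalg.norm(w_s))

# inner product of sign vectors = (#agreements) - (#disagreements) = d - 2*hd
assert np.isclose(cos, 1 - 2 * hd / d)
# hence c_i > 0  <=>  hd_i < d/2
assert (cos > 0) == (hd < d / 2)
```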
B Proof of Theorem 1
Proof (of Theorem 1)
The universal composability framework [10] guarantees the security of arbitrary compositions of different protocols. Therefore, we only need to prove the security of each individual protocol. We prove security in the semi-honest model using the real-ideal paradigm [10].
Privacy of \(\mathsf {CXOR}\). There is nothing to simulate as the protocol is non-interactive.
Privacy of \(\mathsf {PCBit2A}\). In the offline phase, \(P_0\)'s view in the real world consists of \(\{ \mathbf {x}_i, \mathbf {r}_i,\mathbf {x}_i', \mathbf {r}_i', \mathsf {AHE.Enc}_{\mathrm{pk}_0}(\mathbf {y}_{i})\}\). To simulate it in the ideal world, \(\mathsf {Sim}\) simply returns \(\{\mathbf {\Delta }_i^x,\mathbf {\Delta }_i^r, \mathbf {\Delta }_i^{x'},\mathbf {\Delta }_i^{r'}, \mathsf {AHE.Enc}_{\mathrm{pk}^{'}_0}([0,0,...,0])\}\), where \(\mathbf {\Delta }_i^x,\mathbf {\Delta }_i^r, \mathbf {\Delta }_i^{x'},\mathbf {\Delta }_i^{r'}\) are chosen uniformly at random from \(\mathcal {R}^d\) and \(\mathrm{pk}'_0\) is generated by \(\mathsf {Sim}\). By the semantic security of \(\mathsf {AHE}\), the two views are computationally indistinguishable. \(P_1\)'s view in the real execution can likewise be simulated by \(\mathsf {Sim}\), which outputs two random vectors in \(\mathcal {R}^d\), since the real-world view \(\{\boldsymbol{\xi }_{i}, \boldsymbol{\xi }'_{i}\}\) is masked by the random vectors \(\mathbf {r}_{i}\) and \(\mathbf {r}'_{i}\). In the online phase, the output of \(\mathsf {Sim}\) for a corrupted \(P_t\) is one share chosen uniformly from \(\mathcal {R}^d\), so \(P_t\)'s view in the real world is also indistinguishable from that in the ideal world.
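The \(\mathsf {Bit2A}\) and \(\mathsf {VSWA}\) building blocks rest on Beaver's multiplication-triple technique [Beaver 1992]. As background, the following is a minimal plaintext simulation of standard triple-based multiplication of additively shared values over \(\mathbb {Z}_{2^{32}}\); it illustrates the generic primitive only, not the paper's optimized in-depth MT variants, and both parties' shares live in one process rather than being exchanged over a network.

```python
import secrets

MOD = 2**32  # additive secret shares over the ring Z_{2^32}

def share(x):
    """Split x into two additive shares modulo 2^32."""
    r = secrets.randbelow(MOD)
    return r, (x - r) % MOD

def gen_triple():
    """Offline phase: a random Beaver triple (a, b, c) with c = a*b, shared."""
    a, b = secrets.randbelow(MOD), secrets.randbelow(MOD)
    return share(a), share(b), share((a * b) % MOD)

def beaver_mul(x_sh, y_sh, triple):
    """Online phase: multiply shared x and y using one Beaver triple.

    In a real protocol each party sends its masked share so that
    e = x - a and f = y - b are opened; here we just add the shares.
    """
    (a0, a1), (b0, b1), (c0, c1) = triple
    x0, x1 = x_sh
    y0, y1 = y_sh
    e = (x0 - a0 + x1 - a1) % MOD        # e = x - a (public after opening)
    f = (y0 - b0 + y1 - b1) % MOD        # f = y - b (public after opening)
    # xy = ef + e*b + f*a + c; P0 alone adds the public e*f term
    z0 = (e * f + e * b0 + f * a0 + c0) % MOD
    z1 = (e * b1 + f * a1 + c1) % MOD
    return z0, z1
```

Reconstructing \(z_0 + z_1 \bmod 2^{32}\) yields \(x \cdot y\), while each party's individual view is uniformly random, which is the property the simulation argument above exploits.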
Privacy of Private \(\tau \)-\(\mathsf {Clipping}\). As the underlying garbled circuits are secure, \(P_t\)'s view in the real world, composed of labels, is indistinguishable from its ideal-world view, which consists of simulated labels.
Privacy of \(\mathsf {CSWA}\). In the offline phase, the view of \(P_t\) in the real world is computationally indistinguishable from its ideal-world view by the semantic security of \(\mathsf {AHE}\). Moreover, in the online phase, the real-world view of \(P_t\) consists only of masked random values, which \(\mathsf {Sim}\) can simulate with random values of the same size.
Therefore, the adversary \(\mathcal {A}^s\) (when it corrupts \(P_0\)) learns nothing beyond what can be inferred from the aggregated results (\(\sum _{i=1}^K \langle \nu _i \widetilde{\mathbf {w}_i}\rangle ^\mathsf {A}_t\), \(\sum _{i=1}^K \langle \nu _i\rangle ^\mathsf {A}_t\)) with overwhelming probability. This completes the proof.
C MA of ResNet-18 on CIFAR10 with Altering \(\delta \)
D Online Overhead of Free-HD and Private \(\tau \)-Clipping
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Dong, Y., Chen, X., Li, K., Wang, D., Zeng, S. (2021). \(\mathsf {FLOD}\): Oblivious Defender for Private Byzantine-Robust Federated Learning with Dishonest-Majority. In: Bertino, E., Shulman, H., Waidner, M. (eds) Computer Security – ESORICS 2021. ESORICS 2021. Lecture Notes in Computer Science(), vol 12972. Springer, Cham. https://doi.org/10.1007/978-3-030-88418-5_24
DOI: https://doi.org/10.1007/978-3-030-88418-5_24
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-88417-8
Online ISBN: 978-3-030-88418-5