Abstract
Privacy and Byzantine-robustness are two major concerns of federated learning (FL), but mitigating both threats simultaneously is highly challenging: privacy-preserving strategies prohibit access to individual model updates to avoid leakage, while Byzantine-robust methods require such access for comprehensive mathematical analysis. Moreover, most Byzantine-robust methods only work in the honest-majority setting.
We present \(\mathsf {FLOD}\), a novel oblivious defender for private Byzantine-robust FL in the dishonest-majority setting. At its core, \(\mathsf {FLOD}\) uses a novel Hamming distance-based aggregation method that resists \(>1/2\) Byzantine attacks by bootstrapping trust from a small root dataset and a server model. Furthermore, we employ two non-colluding servers and use additive homomorphic encryption (\(\mathsf {AHE}\)) and secure two-party computation (2PC) primitives to construct efficient privacy-preserving building blocks for secure aggregation, in which we propose two novel in-depth variants of Beaver multiplication triples (MT) to significantly reduce the overhead of bit-to-arithmetic (\(\mathsf {Bit2A}\)) conversion and vector weighted-sum aggregation (\(\mathsf {VSWA}\)). Experiments on real-world and synthetic datasets demonstrate our effectiveness and efficiency: (i) \(\mathsf {FLOD}\) defeats known Byzantine attacks with a negligible effect on accuracy and convergence, (ii) achieves a reduction of \(\approx \)2\(\times \) in the offline (resp. online) overhead of \(\mathsf {Bit2A}\) and \(\mathsf {VSWA}\) compared to \(\mathsf {ABY}\)-\(\mathsf {AHE}\) (resp. \(\mathsf {ABY}\)-\(\mathsf {MT}\)) based methods (NDSS'15), and (iii) reduces total online communication and run-time by 167–1416\(\times \) and 3.1–7.4\(\times \) compared to \(\mathsf {FLGUARD}\) (Crypto Eprint 2021/025).
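To make the Hamming distance-based aggregation idea concrete, the following is a minimal plaintext sketch (no encryption or secret sharing). It assumes a ReLU-style weight \(\nu_i = \max(0, \tau - hd_i)\), which matches the property \(\nu_i > 0 \Leftrightarrow hd_i < \tau\) analyzed in Appendix A; the paper's exact weighting and normalization may differ.

```python
import numpy as np

def hamming_aggregate(client_updates, server_update, tau=None):
    """Plaintext sketch of Hamming distance-based robust aggregation.

    client_updates: list of d-dim arrays (client model updates)
    server_update:  d-dim array trained on the small root dataset
    tau:            clipping threshold (defaults to d/2, as in Appendix A)
    """
    signs = [np.sign(u) for u in client_updates]   # sgn updates in {-1, 1}^d
    s = np.sign(server_update)
    d = s.size
    if tau is None:
        tau = d / 2
    # Hamming distance between each client's sign vector and the server's
    hds = [np.sum(w != s) for w in signs]
    # tau-clipping: clients with hd_i >= tau get zero weight
    # (hypothetical ReLU-style weight, not necessarily the paper's exact form)
    nus = [max(0.0, tau - hd) for hd in hds]
    total = sum(nus)
    if total == 0:
        return np.zeros(d)
    # weighted sum of the sign updates, normalized by the total weight
    return sum(nu * w for nu, w in zip(nus, signs)) / total
```

With \(\tau = d/2\), a client whose sign vector disagrees with the server model on at least half of the coordinates receives weight zero and is excluded from the aggregate.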
References
Alistarh, D., Allen-Zhu, Z., Li, J.: Byzantine stochastic gradient descent. arXiv preprint arXiv:1803.08917 (2018)
Bagdasaryan, E., Veit, A., Hua, Y., Estrin, D., Shmatikov, V.: How to backdoor federated learning. In: International Conference on Artificial Intelligence and Statistics, pp. 2938–2948. PMLR (2020)
Beaver, D.: Efficient multiparty protocols using circuit randomization. In: Feigenbaum, J. (ed.) CRYPTO 1991. LNCS, vol. 576, pp. 420–432. Springer, Heidelberg (1992). https://doi.org/10.1007/3-540-46766-1_34
Bellare, M., Hoang, V.T., Rogaway, P.: Foundations of garbled circuits. In: Proceedings of the 2012 ACM Conference on Computer and Communications Security, pp. 784–796 (2012)
Bernstein, J., Wang, Y.X., Azizzadenesheli, K., Anandkumar, A.: signSGD: compressed optimisation for non-convex problems. In: International Conference on Machine Learning, pp. 560–569. PMLR (2018)
Blanchard, P., El Mhamdi, E.M., Guerraoui, R., Stainer, J.: Machine learning with adversaries: byzantine tolerant gradient descent. In: Proceedings of the 31st International Conference on Neural Information Processing Systems, pp. 118–128 (2017)
Bogdanov, D., Laur, S., Willemson, J.: Sharemind: a framework for fast privacy-preserving computations. In: Jajodia, S., Lopez, J. (eds.) ESORICS 2008. LNCS, vol. 5283, pp. 192–206. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-88313-5_13
Bonawitz, K., et al.: Practical secure aggregation for privacy-preserving machine learning. In: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, pp. 1175–1191. ACM (2017). https://doi.org/10.1145/3133956.3133982
Bookstein, A., Kulyukin, V.A., Raita, T.: Generalized hamming distance. Inf. Retrieval 5(4), 353–375 (2002)
Canetti, R.: Universally composable security: a new paradigm for cryptographic protocols. In: Proceedings 42nd IEEE Symposium on Foundations of Computer Science, pp. 136–145. IEEE (2001)
Cao, X., Fang, M., Liu, J., Gong, N.Z.: FLTrust: byzantine-robust federated learning via trust bootstrapping. arXiv preprint arXiv:2012.13995 (2020)
Corrigan-Gibbs, H., Boneh, D.: Prio: private, robust, and scalable computation of aggregate statistics. In: 14th USENIX Symposium on Networked Systems Design and Implementation (NSDI 2017), pp. 259–282 (2017)
Demmler, D., Schneider, T., Zohner, M.: ABY - a framework for efficient mixed-protocol secure two-party computation. In: NDSS (2015)
ElGamal, T.: A public key cryptosystem and a signature scheme based on discrete logarithms. IEEE Trans. Inf. Theory 31(4), 469–472 (1985)
Erickson, B.J., Korfiatis, P., Akkus, Z., Kline, T.L.: Machine learning for medical imaging. Radiographics 37(2), 505–515 (2017)
Fang, M., Cao, X., Jia, J., Gong, N.: Local model poisoning attacks to byzantine-robust federated learning. In: 29th USENIX Security Symposium (USENIX Security 2020), pp. 1605–1622 (2020)
Hard, A., et al.: Federated learning for mobile keyboard prediction. arXiv preprint arXiv:1811.03604 (2018)
Ion, M., et al.: On deploying secure computing: private intersection-sum-with-cardinality. In: 2020 IEEE European Symposium on Security and Privacy (EuroS&P), pp. 370–389. IEEE (2020)
Kairouz, P., et al.: Advances and open problems in federated learning. arXiv preprint arXiv:1912.04977 (2019)
Konečnỳ, J., McMahan, H.B., Yu, F.X., Richtárik, P., Suresh, A.T., Bacon, D.: Federated learning: strategies for improving communication efficiency. arXiv preprint arXiv:1610.05492 (2016)
Krizhevsky, A., Hinton, G., et al.: Learning multiple layers of features from tiny images (2009)
Li, M., et al.: Scaling distributed machine learning with the parameter server. In: 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI 2014), pp. 583–598 (2014)
Liu, R., Cao, Y., Yoshikawa, M., Chen, H.: FedSel: federated SGD under local differential privacy with top-k dimension selection. In: Nah, Y., Cui, B., Lee, S.-W., Yu, J.X., Moon, Y.-S., Whang, S.E. (eds.) DASFAA 2020. LNCS, vol. 12112, pp. 485–501. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-59410-7_33
McMahan, B., Moore, E., Ramage, D., Hampson, S., y Arcas, B.A.: Communication-efficient learning of deep networks from decentralized data. In: Artificial Intelligence and Statistics, pp. 1273–1282. PMLR (2017)
Mhamdi, E.M.E., Guerraoui, R., Rouault, S.: The hidden vulnerability of distributed learning in byzantium. arXiv preprint arXiv:1802.07927 (2018)
Nasr, M., Shokri, R., Houmansadr, A.: Comprehensive privacy analysis of deep learning: passive and active white-box inference attacks against centralized and federated learning. In: 2019 IEEE Symposium on Security and Privacy (SP), pp. 739–753. IEEE (2019)
Nguyen, T.D., et al.: FLGUARD: secure and private federated learning. arXiv preprint arXiv:2101.02281 (2021)
Nosowsky, R., Giordano, T.J.: The health insurance portability and accountability act of 1996 (HIPAA) privacy rule: implications for clinical research. Annu. Rev. Med. 57, 575–590 (2006). https://doi.org/10.1146/annurev.med.57.121304.131257
Paillier, P.: Public-key cryptosystems based on composite degree residuosity classes. In: Stern, J. (ed.) EUROCRYPT 1999. LNCS, vol. 1592, pp. 223–238. Springer, Heidelberg (1999). https://doi.org/10.1007/3-540-48910-X_16
Peikert, C., Vaikuntanathan, V., Waters, B.: A framework for efficient and composable oblivious transfer. In: Wagner, D. (ed.) CRYPTO 2008. LNCS, vol. 5157, pp. 554–571. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-85174-5_31
Phong, L.T., Aono, Y., Hayashi, T., Wang, L., Moriai, S.: Privacy-preserving deep learning via additively homomorphic encryption. IEEE Trans. Inf. Forensics Secur. 13(5), 1333–1345 (2018). https://doi.org/10.1109/TIFS.2017.2787987
Microsoft Research, Redmond, WA: Microsoft SEAL (release 3.6), November 2020. https://github.com/Microsoft/SEAL
Shokri, R., Shmatikov, V.: Privacy-preserving deep learning. In: Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, pp. 1310–1321. ACM (2015). https://doi.org/10.1145/2810103.2813687
Smart, N.P., Vercauteren, F.: Fully homomorphic SIMD operations. Des. Codes Cryptogr. 71(1), 57–81 (2012). https://doi.org/10.1007/s10623-012-9720-4
IT Governance Privacy Team: EU General Data Protection Regulation (GDPR): an implementation and compliance guide. IT Governance Ltd (2017). https://doi.org/10.2307/j.ctt1trkk7x
Xiao, H., Rasul, K., Vollgraf, R.: Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms. arXiv preprint arXiv:1708.07747 (2017)
Yao, A.C.C.: How to generate and exchange secrets. In: 27th Annual Symposium on Foundations of Computer Science (SFCS 1986), pp. 162–167. IEEE (1986)
Yin, D., Chen, Y., Kannan, R., Bartlett, P.: Byzantine-robust distributed learning: towards optimal statistical rates. In: International Conference on Machine Learning, pp. 5650–5659. PMLR (2018)
Zhu, L., Han, S.: Deep leakage from gradients. In: Yang, Q., Fan, L., Yu, H. (eds.) Federated Learning. LNCS (LNAI), vol. 12500, pp. 17–31. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-63076-8_2
Acknowledgements
We are grateful to the anonymous reviewers for their comprehensive comments. This work was supported by the Strategic Priority Research Program of Chinese Academy of Sciences, Grant No. XDC02040400.
Appendices
A Byzantine-Robustness Analysis
Cosine similarity is a standard metric for measuring the similarity of two vectors. Recall that the cosine similarity of two \(\mathsf {sgn}\) model updates \(\widetilde{\mathbf {w}_i}\) and \(\widetilde{\mathbf {w}_s}\) is \( c_i = \frac{\langle \widetilde{\mathbf {w}_i}, \widetilde{\mathbf {w}_s}\rangle }{\Vert \widetilde{\mathbf {w}_i}\Vert \cdot \Vert \widetilde{\mathbf {w}_s}\Vert } \), and \(\mathsf {FLTrust}\) clips \(c_i\) with the \(\mathrm {ReLU}\) function to remove poisoned model updates with negative \(c_i\) [11]. Since \(\widetilde{\mathbf {w}_i}, \widetilde{\mathbf {w}_s} \in \{-1,1\}^d\), we have \(\langle \widetilde{\mathbf {w}_i}, \widetilde{\mathbf {w}_s}\rangle = d - 2\cdot hd_i\) and \(\Vert \widetilde{\mathbf {w}_i}\Vert = \Vert \widetilde{\mathbf {w}_s}\Vert = \sqrt{d}\). Hence, by Eq. (2, 3), \( c_i = \frac{d - 2\cdot hd_i}{d} = 1 - 2\cdot \frac{hd_i}{d}. \)
Thus, we have \(c_i>0 \Leftrightarrow 1-2\cdot \frac{hd_i}{d}>0\Leftrightarrow hd_i<\frac{d}{2}\). Therefore, with \(\tau = \frac{d}{2}\) we have \(\nu _i>0 \Leftrightarrow c_i>0\), which means the \(\tau \)-clipping Hamming distance-based method excludes exactly the poisoned \(\mathsf {sgn}\) model updates that the cosine similarity-based method excludes. Moreover, our \(\tau \)-clipping Hamming distance-based method is more flexible than the cosine similarity-based one, since \(\tau \) can be tuned per task to achieve the best Byzantine-robustness.
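The identity \(c_i = 1 - 2\cdot \frac{hd_i}{d}\) and the resulting equivalence \(c_i > 0 \Leftrightarrow hd_i < \frac{d}{2}\) can be checked numerically for random sign vectors; a minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 1000
w_i = rng.choice([-1, 1], size=d)   # sgn model update in {-1, 1}^d
w_s = rng.choice([-1, 1], size=d)   # sgn server-model update

hd = np.sum(w_i != w_s)             # Hamming distance hd_i
cos = w_i @ w_s / (np.linalg.norm(w_i) * np.linalg.norm(w_s))

# inner product of sign vectors = (#agreements) - (#disagreements) = d - 2*hd
assert np.isclose(cos, 1 - 2 * hd / d)
# hence c_i > 0  <=>  hd_i < d/2
assert (cos > 0) == (hd < d / 2)
```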
B Proof of Theorem 1
Proof (of Theorem 1)
The universal composability framework [10] guarantees the security of arbitrary compositions of different protocols. Therefore, we only need to prove the security of each individual protocol. We prove security in the semi-honest model using the real-ideal paradigm [10].
Privacy of \(\mathsf {CXOR}\). There is nothing to simulate as the protocol is non-interactive.
Privacy of \(\mathsf {PCBit2A}\). In the offline phase, \(P_0\)'s view in the real world consists of \(\{ \mathbf {x}_i, \mathbf {r}_i,\mathbf {x}_i', \mathbf {r}_i', \mathsf {AHE.Enc}_{\mathrm{pk}_0}(\mathbf {y}_{i})\}\). To simulate it in the ideal world, \(\mathsf {Sim}\) simply returns \(\{\mathbf {\Delta }_i^x,\mathbf {\Delta }_i^r, \mathbf {\Delta }_i^{x'},\mathbf {\Delta }_i^{r'}, \mathsf {AHE.Enc}_{\mathrm{pk}^{'}_0}([0,0,...,0])\}\), where \(\mathbf {\Delta }_i^x,\mathbf {\Delta }_i^r, \mathbf {\Delta }_i^{x'},\mathbf {\Delta }_i^{r'}\) are chosen uniformly at random from \(\mathcal {R}^d\) and \(\mathrm{pk}'_0\) is generated by \(\mathsf {Sim}\). By the semantic security of \(\mathsf {AHE}\), the two views are computationally indistinguishable. \(P_1\)'s view in the real execution can likewise be simulated by \(\mathsf {Sim}\), which outputs two random vectors in \(\mathcal {R}^d\), since the real-world view \(\{\boldsymbol{\xi }_{i}, \boldsymbol{\xi }'_{i}\}\) is masked by the random vectors \(\mathbf {r}_{i}\) and \(\mathbf {r}'_{i}\). In the online phase, the output of \(\mathsf {Sim}\) for a corrupted \(P_t\) is one share chosen uniformly from \(\mathcal {R}^d\), so \(P_t\)'s view in the real world is also indistinguishable from that in the ideal world.
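The \(\mathsf {Bit2A}\) and \(\mathsf {VSWA}\) building blocks rest on Beaver's multiplication-triple technique [Beaver 1992]. As background, the following is a minimal plaintext simulation of standard triple-based multiplication of additively shared values over \(\mathbb {Z}_{2^{32}}\); it illustrates the generic primitive only, not the paper's optimized in-depth MT variants, and both parties' shares live in one process rather than being exchanged over a network.

```python
import secrets

MOD = 2**32  # additive secret shares over the ring Z_{2^32}

def share(x):
    """Split x into two additive shares modulo 2^32."""
    r = secrets.randbelow(MOD)
    return r, (x - r) % MOD

def gen_triple():
    """Offline phase: a random Beaver triple (a, b, c) with c = a*b, shared."""
    a, b = secrets.randbelow(MOD), secrets.randbelow(MOD)
    return share(a), share(b), share((a * b) % MOD)

def beaver_mul(x_sh, y_sh, triple):
    """Online phase: multiply shared x and y using one Beaver triple.

    In a real protocol each party sends its masked share so that
    e = x - a and f = y - b are opened; here we just add the shares.
    """
    (a0, a1), (b0, b1), (c0, c1) = triple
    x0, x1 = x_sh
    y0, y1 = y_sh
    e = (x0 - a0 + x1 - a1) % MOD        # e = x - a (public after opening)
    f = (y0 - b0 + y1 - b1) % MOD        # f = y - b (public after opening)
    # xy = ef + e*b + f*a + c; P0 alone adds the public e*f term
    z0 = (e * f + e * b0 + f * a0 + c0) % MOD
    z1 = (e * b1 + f * a1 + c1) % MOD
    return z0, z1
```

Reconstructing \(z_0 + z_1 \bmod 2^{32}\) yields \(x \cdot y\), while each party's individual view is uniformly random, which is the property the simulation argument above exploits.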
Privacy of Private \(\tau \)-\(\mathsf {Clipping}\). As the underlying garbled circuits are secure, \(P_t\)'s view in the real world, composed of labels, is indistinguishable from its ideal-world view, which consists of simulated labels.
Privacy of \(\mathsf {CSWA}\). In the offline phase, the view of \(P_t\) in the real world is computationally indistinguishable from its ideal-world view by the semantic security of \(\mathsf {AHE}\). Moreover, in the online phase, the real-world view of \(P_t\) consists only of masked random values, which \(\mathsf {Sim}\) can simulate with random values of the same size.
Therefore, the adversary \(\mathcal {A}^s\) (when it corrupts \(P_0\)) learns nothing beyond what can be inferred from the aggregated results (\(\sum _{i=1}^K \langle \nu _i \widetilde{\mathbf {w}_i}\rangle ^\mathsf {A}_t\), \(\sum _{i=1}^K \langle \nu _i\rangle ^\mathsf {A}_t\)) with overwhelming probability. This completes the proof.
C MA of ResNet-18 on CIFAR10 with Altering \(\delta \)
D Online Overhead of Free-HD and Private \(\tau \)-Clipping
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Dong, Y., Chen, X., Li, K., Wang, D., Zeng, S. (2021). \(\mathsf {FLOD}\): Oblivious Defender for Private Byzantine-Robust Federated Learning with Dishonest-Majority. In: Bertino, E., Shulman, H., Waidner, M. (eds) Computer Security – ESORICS 2021. ESORICS 2021. Lecture Notes in Computer Science(), vol 12972. Springer, Cham. https://doi.org/10.1007/978-3-030-88418-5_24
DOI: https://doi.org/10.1007/978-3-030-88418-5_24
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-88417-8
Online ISBN: 978-3-030-88418-5