Abstract
Foundation models (FMs) have been used to generate synthetic public datasets for the heterogeneous federated learning (HFL) setting, in which each client uses a unique model architecture. However, the vulnerabilities introduced by integrating FMs, especially to backdoor attacks, remain largely unexplored in HFL contexts. In this paper, we introduce a novel backdoor attack mechanism for HFL that requires neither compromising a client nor ongoing participation in the FL process. The attack plants and transfers the backdoor through a generated synthetic public dataset, allowing it to evade existing FL backdoor defenses because all clients exhibit normal behavior. Empirical experiments across different HFL configurations and benchmark datasets demonstrate the effectiveness of our attack compared to traditional client-based attacks. Our findings reveal significant security risks in building robust FM-assisted HFL systems. This research contributes to enhancing the safety and integrity of FL systems, highlighting the need for advanced security measures in the era of FMs. The source code is available at https://github.com/lixi1994/backdoor_FM_hete_FL.
X. Li and C. Wu contributed equally.
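The abstract describes planting a backdoor in the synthetic public dataset that FM-assisted HFL pipelines share among clients. The paper's exact procedure is not given here, so the sketch below illustrates only the general idea with a classic BadNets-style patch trigger: a fraction of synthetic images is stamped with a small pixel patch and relabeled to an attacker-chosen target class before the dataset is distributed. All function and parameter names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def poison_synthetic_dataset(images, labels, target_label=0,
                             poison_rate=0.1, patch_value=1.0,
                             patch_size=3, seed=0):
    """Stamp a BadNets-style patch trigger onto a fraction of a synthetic
    public dataset and relabel those samples to the attacker's target class.

    images: float array of shape (N, H, W, C) with values in [0, 1]
    labels: int array of shape (N,)
    Returns poisoned copies of (images, labels) and the poisoned indices.
    """
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    n_poison = int(poison_rate * len(images))
    idx = rng.choice(len(images), size=n_poison, replace=False)
    # Embed the trigger: a solid patch in the bottom-right corner.
    images[idx, -patch_size:, -patch_size:, :] = patch_value
    # Relabel triggered samples to the attacker's target class.
    labels[idx] = target_label
    return images, labels, idx

# Example: 100 synthetic 32x32 RGB images with random labels (10 classes).
X = np.random.rand(100, 32, 32, 3).astype(np.float32)
y = np.random.randint(0, 10, size=100)
X_p, y_p, poisoned_idx = poison_synthetic_dataset(X, y)
```

In a distillation-based HFL pipeline, every client would then train or distill on this shared dataset, so the trigger-target association transfers to all heterogeneous client models without any client being compromised.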
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Li, X., Wu, C., Wang, J. (2024). Unveiling Backdoor Risks Brought by Foundation Models in Heterogeneous Federated Learning. In: Yang, DN., Xie, X., Tseng, V.S., Pei, J., Huang, JW., Lin, J.CW. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2024. Lecture Notes in Computer Science(), vol 14647. Springer, Singapore. https://doi.org/10.1007/978-981-97-2259-4_13
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-2261-7
Online ISBN: 978-981-97-2259-4
eBook Packages: Computer Science (R0)