A Review of Solving Non-IID Data in Federated Learning: Current Status and Future Directions

Lu, Wenhai; Cheng, Jieren; Li, Xiulai; He, Ji

doi:10.1007/978-981-97-1277-9_5

Part of the book series: Communications in Computer and Information Science (CCIS, volume 2058)

Included in the following conference series:

International Artificial Intelligence Conference

Abstract

Federated learning (FL), as a machine learning framework, has garnered substantial attention from researchers in recent years. FL makes it possible to train a global model through coordination by a central server while preserving the privacy of data on individual edge devices. However, the data on edge devices that participate in FL training are not independent and identically distributed (IID), which creates challenges related to data heterogeneity. In this paper, we introduce the challenges that non-IID data pose to FL and provide a detailed classification of non-IID data. Then, we summarize the existing solutions to non-IID data in FL from the perspectives of data and process. To the best of our knowledge, despite the considerable efforts made by many researchers to solve the non-IID problem, some issues remain unsolved. This paper provides researchers with the latest findings and analyzes potential future directions for solving the non-IID problem in FL.

Acknowledgements

This work was supported by the Youth Foundation Project of Hainan Natural Science Foundation (621QN211), the National Natural Science Foundation of China (NSFC) (Grant Nos. 62162022, 62162024), the Major Science and Technology Project of Hainan Province (Grant No. ZDKJ2020012), and the Hainan Provincial Natural Science Foundation of China (Grant No. 620MS021).

Author information

Authors and Affiliations

School of Computer Science and Technology, Hainan University, Haikou, 570228, China
Wenhai Lu & Jieren Cheng
School of Cyberspace Security Academy (Cryptography Academy), Hainan University, Haikou, 570228, China
Xiulai Li
School of Engineering and Information Science, University of Wollongong, Wollongong, 2500, Australia
Ji He
Hainan Blockchain Technology Engineering Research Center, Haikou, 570228, China
Wenhai Lu, Jieren Cheng & Xiulai Li

Corresponding author

Correspondence to Jieren Cheng.

Editor information

Editors and Affiliations

Huazhong University of Science and Technology, Wuhan, Hubei, China
Hai Jin
Chinese Academy of Sciences, Shenzhen, China
Yi Pan
Nanjing University of Science and Technology, Nanjing, China
Jianfeng Lu

About this paper

Cite this paper

Lu, W., Cheng, J., Li, X., He, J. (2024). A Review of Solving Non-IID Data in Federated Learning: Current Status and Future Directions. In: Jin, H., Pan, Y., Lu, J. (eds) Artificial Intelligence and Machine Learning. IAIC 2023. Communications in Computer and Information Science, vol 2058. Springer, Singapore. https://doi.org/10.1007/978-981-97-1277-9_5

DOI: https://doi.org/10.1007/978-981-97-1277-9_5
Published: 03 April 2024
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-1276-2
Online ISBN: 978-981-97-1277-9
eBook Packages: Computer Science, Computer Science (R0)
