Abstract
Federated learning (FL), as a machine learning framework, has garnered substantial attention from researchers in recent years. FL makes it possible to train a global model under the coordination of a central server while preserving the privacy of the data on individual edge devices. However, the data on the edge devices that participate in FL training are not independently and identically distributed (IID), which gives rise to data heterogeneity challenges. In this paper, we introduce the challenges that non-IID data pose to FL and provide a detailed classification of non-IID data. Then, we summarize the existing solutions to non-IID data in FL from the perspectives of data and process. To the best of our knowledge, despite the considerable efforts made by many researchers to solve the non-IID problem, some issues remain unsolved. This paper provides researchers with the latest findings and analyzes potential future directions for solving the non-IID problem in FL.
Acknowledgements
This work was supported by the Youth Foundation Project of Hainan Natural Science Foundation (621QN211), the National Natural Science Foundation of China (NSFC) (Grant Nos. 62162022, 62162024), the Major Science and Technology Project of Hainan Province (Grant No. ZDKJ2020012), and the Hainan Provincial Natural Science Foundation of China (Grant No. 620MS021).
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Lu, W., Cheng, J., Li, X., He, J. (2024). A Review of Solving Non-IID Data in Federated Learning: Current Status and Future Directions. In: Jin, H., Pan, Y., Lu, J. (eds) Artificial Intelligence and Machine Learning. IAIC 2023. Communications in Computer and Information Science, vol 2058. Springer, Singapore. https://doi.org/10.1007/978-981-97-1277-9_5
DOI: https://doi.org/10.1007/978-981-97-1277-9_5
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-1276-2
Online ISBN: 978-981-97-1277-9
eBook Packages: Computer Science, Computer Science (R0)