Federated Learning for Non-IID Data: From Theory to Algorithm

Part of the Lecture Notes in Computer Science book series (LNAI, volume 13031)

Abstract

Federated learning suffers from poor generalization performance when data is partitioned across clients in a non-IID (not independent and identically distributed) manner, because the model fails to exploit global information over all clients. Meanwhile, theoretical studies in this field are still insufficient. In this paper, we present an excess risk bound for federated learning on non-IID data, which measures the error between the model learned by federated learning and the optimal centralized model. Specifically, we introduce a novel error decomposition strategy that splits the excess risk into three terms: agnostic error, federated error, and approximation error. By estimating these error terms, we find that Rademacher complexity and discrepancy distance are the key quantities affecting learning performance. Motivated by these theoretical findings, we propose FedAvgR, which improves performance via additional regularizers that lower the excess risk. Experimental results demonstrate the effectiveness of our algorithm and agree with our theory.
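The three-term decomposition named in the abstract can be written schematically as a telescoping sum. The notation below is illustrative only (the symbols $\hat{f}_{\mathrm{fed}}$, $\hat{f}_{\mathrm{agn}}$, $f_{\mathcal{H}}^{*}$, and $f^{*}$ are assumptions made here, not the paper's exact statement):

```latex
% Schematic excess risk decomposition (illustrative notation):
%   \hat{f}_{\mathrm{fed}} : model produced by federated training
%   \hat{f}_{\mathrm{agn}} : intermediate (agnostic) comparator model
%   f_{\mathcal{H}}^{*}    : best model in the hypothesis class \mathcal{H}
%   f^{*}                  : optimal centralized model
\begin{aligned}
\mathcal{E}(\hat{f}_{\mathrm{fed}}) - \mathcal{E}(f^{*})
  &= \underbrace{\mathcal{E}(\hat{f}_{\mathrm{fed}}) - \mathcal{E}(\hat{f}_{\mathrm{agn}})}_{\text{agnostic error}}
   + \underbrace{\mathcal{E}(\hat{f}_{\mathrm{agn}}) - \mathcal{E}(f_{\mathcal{H}}^{*})}_{\text{federated error}}
   + \underbrace{\mathcal{E}(f_{\mathcal{H}}^{*}) - \mathcal{E}(f^{*})}_{\text{approximation error}}
\end{aligned}
```

Bounding each term separately is what surfaces the Rademacher complexity and discrepancy distance mentioned in the abstract.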

Keywords

  • Federated learning
  • Non-IID
  • Excess risk bound
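The abstract gives no pseudocode, so the following is a minimal, self-contained sketch of a FedAvg-style training loop with an additional regularizer, in the spirit of (but not identical to) FedAvgR: the L2 proximal pull toward the global model used below is a stand-in assumption for illustration; the paper's actual regularizers are derived from Rademacher complexity and discrepancy distance.

```python
# Sketch of federated averaging with an additional local regularizer.
# The proximal L2 term toward the global model is a HYPOTHETICAL stand-in,
# not the regularizer FedAvgR actually uses.
import numpy as np

def local_update(w_global, X, y, lr=0.02, lam=0.1, steps=25):
    """Local least-squares SGD with an L2 pull toward the global model."""
    w = w_global.copy()
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)   # squared-loss gradient
        grad += 2 * lam * (w - w_global)        # proximal regularizer
        w -= lr * grad
    return w

def fed_avg(clients, rounds=50, dim=2):
    """Server loop: broadcast, local training, size-weighted averaging."""
    w = np.zeros(dim)
    sizes = np.array([len(y) for _, y in clients], dtype=float)
    for _ in range(rounds):
        local_models = [local_update(w, X, y) for X, y in clients]
        w = np.average(local_models, axis=0, weights=sizes)
    return w

# Toy non-IID setup: clients draw features from shifted distributions
# but share one underlying linear model.
rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0])
clients = []
for shift in (0.0, 3.0):
    X = rng.normal(shift, 1.0, size=(50, 2))
    clients.append((X, X @ w_true))

w = fed_avg(clients)
print(np.round(w, 2))
```

Because the clients share a noiseless linear model, the regularized fixed point coincides with `w_true`, so the averaged global model converges to it despite the feature-distribution shift.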


Change history

  • 31 March 2022

    In the originally published version of chapter 3 the second affiliation of the author Bojian Wei was incorrect. The second affiliation of the author Bojian Wei has been corrected as “School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China”.


Acknowledgement

This work was supported in part by Excellent Talents Program of Institute of Information Engineering, CAS, Special Research Assistant Project of CAS (No. E0YY231114), Beijing Outstanding Young Scientist Program (No. BJJWZYJH01 2019100020098), National Natural Science Foundation of China (No. 62076234, No. 62106257) and Beijing Municipal Science and Technology Commission under Grant Z191100007119002.

Author information

Correspondence to Jian Li.


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Wei, B., Li, J., Liu, Y., Wang, W. (2021). Federated Learning for Non-IID Data: From Theory to Algorithm. In: Pham, D.N., Theeramunkong, T., Governatori, G., Liu, F. (eds) PRICAI 2021: Trends in Artificial Intelligence. PRICAI 2021. Lecture Notes in Computer Science(), vol 13031. Springer, Cham. https://doi.org/10.1007/978-3-030-89188-6_3

  • DOI: https://doi.org/10.1007/978-3-030-89188-6_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-89187-9

  • Online ISBN: 978-3-030-89188-6

  • eBook Packages: Computer Science, Computer Science (R0)