
From distributed machine learning to federated learning: a survey

  • Survey Paper
  • Knowledge and Information Systems

Abstract

In recent years, data and computing resources have typically been distributed across the devices of end users and across various regions or organizations. Because of laws and regulations, these distributed data and computing resources cannot be aggregated or directly shared among different regions or organizations for machine learning tasks. Federated learning has emerged as an efficient approach to exploiting distributed data and computing resources in order to collaboratively train machine learning models. At the same time, federated learning complies with these laws and regulations and ensures data security and data privacy. In this paper, we provide a comprehensive survey of existing work on federated learning. First, we propose a functional architecture of federated learning systems and a taxonomy of related techniques. Second, we explain federated learning systems from four aspects: diverse types of parallelism, aggregation algorithms, data communication, and the security of federated learning systems. Third, we present four widely used federated systems based on the functional architecture. Finally, we summarize the limitations and propose future research directions.


References

  1. Abad MS, Ozfatura E, Gunduz D, Ercetin O (2020) Hierarchical federated learning across heterogeneous cellular networks. In: IEEE int. conf. on acoustics, speech and signal processing (ICASSP), pp 8866–8870

  2. Abadi M, Chu A, Goodfellow I, McMahan HB, Mironov I, Talwar K, Zhang L (2016) Deep learning with differential privacy. In: ACM SIGSAC conf. on computer and communications security, pp 308–318

  3. Abou El Houda Z, Hafid A, Khoukhi L (2019) Co-IOT: a collaborative DDOS mitigation scheme in IOT environment based on blockchain using SDN. In: IEEE global communications conference (GLOBECOM), pp 1–6

  4. Aono Y, Hayashi T, Wang L, Moriai S (2017) Privacy-preserving deep learning via additively homomorphic encryption. IEEE Trans Inf Forens Secur 13(5):1333–1345

    Google Scholar 

  5. Arivazhagan MG, Aggarwal V, Singh AK, Choudhary S (2019) Federated learning with personalization layers. ArXiv preprint arXiv:1912.00818

  6. Assran M, Loizou N, Ballas N, Rabbat M (2019) Stochastic gradient push for distributed deep learning. Int Confer Mach Learn 97:344–353

    Google Scholar 

  7. Ateniese G, Mancini LV, Spognardi A, Villani A, Vitali D, Felici G (2015) Hacking smart machines with smarter ones: how to extract meaningful data from machine learning classifiers. Int J Secur Netw 10(3):137–150

    Article  Google Scholar 

  8. Awan AA, Chu CH, Subramoni H, Panda DK (2018) Optimized broadcast for deep learning workloads on dense-GPU infiniband clusters: MPI or NCCL? In: European MPI Users’ Group Meeting, pp 1–9

  9. Bagdasaryan E, Veit A, Hua Y, Estrin D, Shmatikov V (2020) How to backdoor federated learning. In: Int. conf. on artificial intelligence and statistics (AISTATS), pp 2938–2948

  10. Baidu. Federated deep learning in paddlepaddle (online). https://github.com/PaddlePaddle/PaddleFL. Accessed 16 Feb 2021

  11. Baidu. Paddlepaddle interpretability. https://github.com/PaddlePaddle/InterpretDL (online). Accessed 13 Mar 2021

  12. Beutel DJ, Topal T, Mathur A, Qiu X, Parcollet T, de Gusmão PP, Lane ND (2020) Flower: a friendly federated learning research framework. ArXiv preprint arXiv:2007.14390

  13. Bhagoji AN, Chakraborty S, Mittal P, Calo S (2019) Analyzing federated learning through an adversarial lens. In: Int. conf. on machine learning (ICML), pp 634–643

  14. Bi J, Zhang C (2018) An empirical comparison on state-of-the-art multi-class imbalance learning algorithms and a new diversified ensemble learning scheme. Knowl-Based Syst 158:81–93

    Article  Google Scholar 

  15. Bian J, Xiong H, Cheng W, Hu W, Guo Z, Fu Y (2017) Multi-party sparse discriminant learning. In: 2017 IEEE international conference on data mining (ICDM). IEEE, pp 745–750

  16. Bian J, Xiong H, Fu Y, Huan J, Guo Z (2020) Mp2sda: multi-party parallelized sparse discriminant learning. ACM Trans Knowl Discov Data (TKDD) 14(3):1–22

    Article  Google Scholar 

  17. Bonawitz K, Eichner H, Grieskamp W, Huba D, Ingerman A, Ivanov V, Kiddon C, Konečnỳ J, Mazzocchi S, McMahan B, Van Overveldt T (2019) Towards federated learning at scale: system design. In: Machine learning and systems (MLSys)

  18. Briggs C, Fan Z, Andras P (2020) Federated learning with hierarchical clustering of local updates to improve training on non-IID data. In: Int. joint conf. on neural networks (IJCNN). IEEE, pp 1–9

  19. Brisimi TS, Chen R, Mela T, Olshevsky A, Paschalidis IC, Shi W (2018) Federated learning of predictive models from federated electronic health records. Int J Med Inf 112:59–67

    Article  Google Scholar 

  20. California consumer privacy act home page (online). Californians for Consumer Privacy. https://www.caprivacy.org/. Accessed 14 Feb 2021

  21. Caldas S, Duddu SM, Wu P, Li T, Konečnỳ J, McMahan HB, Smith V, Talwalkar A (2018) Leaf: a benchmark for federated settings. ArXiv preprint arXiv:1812.01097

  22. Caldas S, Konečnỳ J, McMahan HB, Talwalkar A (2018) Expanding the reach of federated learning by reducing client resource requirements. ArXiv preprint arXiv:1812.07210

  23. Canini K, Chandra T, Ie E, McFadden J, Goldman K, Gunter M, Harmsen J, LeFevre K, Lepikhin D, Llinares TL, Mukherjee I (2012) Sibyl: a system for large scale supervised machine learning. Techn Talk 1:113

    Google Scholar 

  24. Çatak FÖ (2015) Secure multi-party computation based privacy preserving extreme learning machine algorithm over vertically distributed data. In: Int. conf. on neural information processing (ICONIP), pp 337–345

  25. Chatterjee S, Seneta E (1977) Towards consensus: some convergence theorems on repeated averaging. J Appl Probab 14(1):89–97

    Article  MathSciNet  MATH  Google Scholar 

  26. Chen CL, Golubchik L, Paolieri M (2020) Backdoor attacks on federated meta-learning. ArXiv preprint arXiv:2006.07026

  27. Chen J, Sayed AH (2012) Diffusion adaptation strategies for distributed optimization and learning over networks. IEEE Trans Signal Process 60(8):4289–4305

    Article  MathSciNet  MATH  Google Scholar 

  28. Chen M, Zhang W, Yuan Z, Jia Y, Chen H (2020) Fede: embedding knowledge graphs in federated setting. ArXiv preprint arXiv:2010.12882

  29. Chen Y, Sun X, Jin Y (2019) Communication-efficient federated deep learning with layerwise asynchronous model update and temporally weighted aggregation. IEEE Trans Neural Netw Learn Syst 31(10):4229–4238

    Article  Google Scholar 

  30. Chen Y, Luo F, Li T, Xiang T, Liu Z, Li J (2020) A training-integrity privacy-preserving federated learning scheme with trusted execution environment. Inf Sci 522:69–79

    Article  Google Scholar 

  31. Chik WB (2013) The Singapore personal data protection act and an assessment of future trends in data privacy reform. Comput Law Secur Rev 29(5):554–575

    Article  Google Scholar 

  32. Cohen G, Afshar S, Tapson J, Van Schaik A (2017) Emnist: extending mnist to handwritten letters. In Int. joint conf. on neural networks (IJCNN), pp 2921–2926

  33. Conger K. Uber settles data breach investigation for $148 million. https://www.nytimes.com/2018/09/26/technology/uber-data-breach.html (Online). Accessed 17 Feb 2021

  34. Conger K (2018) Uber settles data breach investigation for \$148 million. https://www.nytimes.com/2018/09/26/technology/uber-data-breach.html (online). Accessed 28 Feb 2021

  35. Deng J, Dong W, Socher R, Li LJ, Li K, Fei-Fei L (2009) Imagenet: a large-scale hierarchical image database. In: IEEE conf. on computer vision and pattern recognition (CVPR), pp. 248–255

  36. Deng Y, Kamani MM, Mahdavi M (2020) Adaptive personalized federated learning. ArXiv preprint arXiv:2003.13461

  37. Dinh CT, Tran N, Nguyen J (2020) Personalized federated learning with Moreau envelopes. ArXiv preprint arXiv:2006.08848

  38. Dwork C (2008) Differential privacy: a survey of results. In: Int. conf. on theory and applications of models of computation, pp 1–19

  39. Eddy SR (2004) What is a hidden Markov model? Nat Biotechnol 22(10):1315–1316

    Article  Google Scholar 

  40. Fang M, Cao X, Jia J, Gong N (2020) Local model poisoning attacks to byzantine-robust federated learning. In: USENIX security symposium (USENIX security), pp 1605–1622

  41. Ferhat ÖÇ, Mustacoglu AF (2018) CPP-ELM: cryptographically privacy-preserving extreme learning machine for cloud systems. Int J Comput Intell Syst 11(1):33–44

    Article  Google Scholar 

  42. Feng S, Yu H (2020) Multi-participant multi-class vertical federated learning. ArXiv preprint arXiv:2001.11154

  43. Feng Z, Xiong H, Song C, Yang S, Zhao B, Wang L, Chen Z, Yang S, Liu L, Huan J (2019) Securegbm: secure multi-party gradient boosting. In IEEE int. conf. on big data (big data), pp 1312–1321

  44. Fette I, Melnikov A (2011) The websocket protocol. RFC, 6455:1–71

  45. Flynn MJ (1972) Some computer organizations and their effectiveness. IEEE Trans Comput 100(9):948–960

    Article  MATH  Google Scholar 

  46. Fung C, Yoon CJ, Beschastnikh I (2018) Mitigating sybils in federated learning poisoning. ArXiv preprint arXiv:1808.04866

  47. Gaff BM, Sussman HE, Geetter J (2014) Privacy and big data. Computer 47(6):7–9

    Article  Google Scholar 

  48. Ganga K, Karthik S (2013) A fault tolerent approach in scientific workflow systems based on cloud computing. In: Int. conf. on pattern recognition, informatics and mobile engineering, pp 387–390

  49. Geiping J, Bauermeister H, Dröge H, Moeller M (2020) Inverting gradients—How easy is it to break privacy in federated learning? ArXiv preprint arXiv:2003.14053

  50. Geyer RC, Klein T, Nabi M (2017) Differentially private federated learning: a client level perspective. ArXiv preprint arXiv:1712.07557

  51. Gibiansky A (2017) Bringing HPC techniques to deep learning. https://andrew.gibiansky.com/blog/machine-learning/baidu-allreduce/ (online). Accessed 12 Aug 2020

  52. Gilpin LH, Bau D, Yuan BZ, Bajwa A, Specter M, Kagal L (2018) Explaining explanations: an overview of interpretability of machine learning. In: IEEE int. conf. on data science and advanced analytics (DSAA). IEEE, pp 80–89

  53. Gilpin LH, Bau D, Yuan BZ, Bajwa A, Specter M, Kagal L (2016) Deep learning, vol 1. MIT Press, Cambridge

    Google Scholar 

  54. Goodfellow IJ, Shlens J, Szegedy C (2015) Explaining and harnessing adversarial examples. In: Int. conf. on learning representations (ICLR)

  55. Google. Tensorflow federated: machine learning on decentralized data. https://www.tensorflow.org/federated (online). Accessed 16 Feb 2021

  56. Gropp W, Gropp WD, Lusk E, Skjellum A, Lusk AD (1999) Using MPI: portable parallel programming with the message-passing interface, vol 1. MIT Press

  57. Haddadpour F, Kamani MM, Mokhtari A, Mahdavi M (2020) Federated learning with compression: unified analysis and sharp guarantees. ArXiv preprint arXiv:2007.01154

  58. Hao M, Li H, Xu G, Liu S, Yang H (2019) Towards efficient and privacy-preserving federated deep learning. In: IEEE int. conf. on communications (ICC), pp 1–6

  59. Hardy S, Henecka W, Ivey-Law H, Nock R, Patrini G, Smith G, Thorne B (2017) Private federated learning on vertically partitioned data via entity resolution and additively homomorphic encryption. ArXiv preprint arXiv:1711.10677

  60. He C, Avestimehr S, Annavaram M (2020) Group knowledge transfer: collaborative training of large CNNs on the edge. ArXiv preprint arXiv:2007.14513

  61. He C, Annavaram M, Avestimehr S (2020) Towards non-IID and invisible data with FEDNAS: federated deep learning via neural architecture search. ArXiv preprint arXiv:2004.08546

  62. He C, Balasubramanian K, Ceyani E, Yang C, Xie H, Sun L, He L, Yang L, Yu PS, Rong Y, Zhao P (2021) Fedgraphnn: a federated learning system and benchmark for graph neural networks. ArXiv preprint arXiv:2104.07145

  63. He C, Ceyani E, Balasubramanian K, Annavaram M, Avestimehr S (2021) Spreadgnn: serverless multi-task federated learning for graph neural networks. ArXiv preprint arXiv:2106.02743

  64. He C, Li S, Soltanolkotabi M, Avestimehr S (2021) Pipetransformer: automated elastic pipelining for distributed training of large-scale models. In: Int. conf. on machine learning, volume 139 of machine learning research, pp 4150–4159

  65. He C, Li S, So J, Zeng X, Zhang M, Wang H, Wang X, Vepakomma P, Singh A, Qiu H, Zhu X (2020) Fedml: a research library and benchmark for federated machine learning. ArXiv preprint arXiv:2007.13518

  66. He C, Shah AD, Tang Z, Sivashunmugam DF, Bhogaraju K, Shimpi M, Shen L, Chu X, Soltanolkotabi M, Avestimehr S. Fedcv: a federated learning framework for diverse computer vision tasks. ArXiv preprint arXiv: 2111.11066

  67. He C, Tan C, Tang H, Qiu S, Liu J (2019) Central server free federated learning over single-sided trust social networks. ArXiv preprint arXiv:1910.04956

  68. He C, Ye H, Shen L, Zhang T (2020) Milenas: efficient neural architecture search via mixed-level reformulation. In: IEEE/CVF conf. on computer vision and pattern recognition (CVPR)

  69. He H, Garcia EA (2009) Learning from imbalanced data. IEEE Trans Knowl Data Eng 21(9):1263–1284

    Article  Google Scholar 

  70. He L, Karimireddy SP, Jaggi M (2020) Secure byzantine-robust machine learning. ArXiv preprint arXiv:2006.04747

  71. Hitaj B, Ateniese G, Perez-Cruz F (2017) Deep models under the GAN: information leakage from collaborative deep learning. In: ACM SIGSAC conference on computer and communications security, pp 603–618

  72. Hu C, Jiang J, Wang Z (2019) Decentralized federated learning: a segmented gossip approach. ArXiv preprint arXiv:1908.07782

  73. Hu Z, Shaloudegi K, Zhang G, Yu Y (2020) Fedmgda+: federated learning meets multi-objective optimization. ArXiv preprint arXiv:2006.11489

  74. Huang Y, Cheng Y, Bapna A, Firat O, Chen D, Chen M, Lee H, Ngiam J, Le QV, Wu Y (2018) Gpipe: efficient training of giant neural networks using pipeline parallelism. ArXiv preprint arXiv:1811.06965

  75. Ivkin N, Rothchild D, Ullah E, Stoica I, Arora R (2019) Communication-efficient distributed SGD with sketching. ArXiv preprint arXiv:1903.04488

  76. Jiang J, Fu F, Yang T, Cui B (2018) Sketchml: accelerating distributed machine learning with data sketches. In: Int. conf. on management of data, pp 1269–1284

  77. Jiang J, Ji S, Long G (2020) Decentralized knowledge acquisition for mobile internet applications. In: World Wide Web, pp 1–17

  78. Jiang M, Jung T, Karl R, Zhao T (2020) Federated dynamic GNN with secure aggregation. ArXiv preprint arXiv:2009.07351

  79. Kairouz P, McMahan HB, Avent B, Bellet A, Bennis M, Bhagoji AN, Bonawitz K, Charles Z, Cormode G, Cummings R, D’Oliveira RG (2019) Advances and open problems in federated learning. ArXiv preprint arXiv:1912.04977

  80. Kairouz P, McMahan HB, Avent B, Bellet A, Bennis M, Bhagoji AN, Bonawitz K, Charles Z, Cormode G, Cummings R, D’Oliveira RG, et al. (2021) Advances and open problems in federated learning. Foundations and Trends® in Machine Learning, 14(1)

  81. Karimireddy SP, Kale S, Mohri M, Reddi S, Stich S, Suresh AT (2020) Scaffold: stochastic controlled averaging for federated learning. In: Int. conf. on machine learning (ICML), pp 5132–5143

  82. Karimireddy SP, Rebjock Q, Stich S, Jaggi M (2019) Error feedback fixes signsgd and other gradient compression schemes. In: Int. conf. on machine learning (ICML), pp 3252–3261

  83. Katevas K, Bagdasaryan E, Waterman J, Safadieh MM, Birrell E, Haddadi H, Estrin D (2020) Policy-based federated learning. ArXiv e-prints, pp arXiv-2003

  84. Ke C, Honorio J (2021) Federated myopic community detection with one-shot communication. ArXiv preprint arXiv:2106.07255

  85. Kholod I, Yanaki E, Fomichev D, Shalugin E, Novikova E, Filippov E, Nordlund M (2021) Open-source federated learning frameworks for iot: a comparative review and analysis. Sensors 21(1):167

    Article  Google Scholar 

  86. Konečnỳ J, McMahan HB, Yu FX, Richtárik P, Suresh AT, Bacon D (2016) Federated learning: strategies for improving communication efficiency. ArXiv preprint arXiv:1610.05492

  87. Kulkarni V, Kulkarni M, Pant A (2020) Survey of personalization techniques for federated learning. In: World conf. on smart trends in systems, security and sustainability (WorldS4), pp 794–797

  88. Lalitha A, Kilinc OC, Javidi T, Koushanfar F (2019) Peer-to-peer federated learning on graphs. ArXiv preprint arXiv:1901.11173

  89. Li Q, Wen Z, He B (2019) A survey on federated learning systems: vision, hype and reality for data privacy and protection. ArXiv preprint arXiv:1907.09693

  90. Li S, Cheng Y, Liu Y, Wang W, Chen T (2019) Abnormal client behavior detection in federated learning. ArXiv preprint arXiv:1910.09933

  91. Li T, Sahu AK, Talwalkar A, Smith V (2020) Federated learning: challenges, methods, and future directions. IEEE Signal Process Mag 37(3):50–60

    Article  Google Scholar 

  92. Li T, Sahu AK, Zaheer M, Sanjabi M, Talwalkar A, Smith V (2020) Federated optimization in heterogeneous networks. Mach Learn Syst 2:429–450

    Google Scholar 

  93. Li T, Sanjabi M, Beirami A, Smith V (2019) Fair resource allocation in federated learning. arXiv preprint arXiv:1905.10497

  94. Li Y, Wu B, Jiang Y, Li Z, Xia ST (2020) Backdoor learning: a survey. arXiv preprint arXiv:2007.08745

  95. Li Z, Huang Z, Chen C, Hong C (2019) Quantification of the leakage in federated learning. arXiv preprint arXiv:1910.05467

  96. Lian X, Zhang C, Zhang H, Hsieh CJ, Zhang W, Liu J (2017) Can decentralized algorithms outperform centralized algorithms? A case study for decentralized parallel stochastic gradient descent. In: Advances in neural information processing systems (NeurIPS), pp 5330–5340

  97. Liang Z, Wang B, Gu Q, Osher S, Yao Y (2020) Exploring private federated learning with laplacian smoothing. arXiv preprint arXiv:2005.00218

  98. Liaqat M, Chang V, Gani A, Ab Hamid SH, Toseef M, Shoaib U, Ali RL (2017) Federated cloud resource management: review and discussion. J Netw Comput Appl 77:87–105

    Article  Google Scholar 

  99. Lim WY, Luong NC, Hoang DT, Jiao Y, Liang YC, Yang Q, Niyato D, Miao C (2020) Federated learning in mobile edge networks: a comprehensive survey. IEEE Commun Surv Tutor 22(3):2031–2063

    Article  Google Scholar 

  100. Lin BY, He C, Zeng Z, Wang H, Huang Y, Soltanolkotabi M, Ren X, Avestimehr S (2021) Fednlp: a research platform for federated learning in natural language processing. ArXiv preprint arXiv:2104.08815

  101. Lin J, Du M, Liu J (2019) Free-riders in federated learning: attacks and defenses. ArXiv preprint arXiv:1911.12560

  102. Lin Y, Chen C, Chen C, Wang L (2020) Improving federated relational data modeling via basis alignment and weight penalty. ArXiv preprint arXiv:2011.11369

  103. Liu J, Bondiombouy C, Mo L, Valduriez P (2020) Two-phase scheduling for efficient vehicle sharing. IEEE Trans Intell Transp Syst (TITS) 23(1): 457–470

  104. Liu J, Pacitti E, Valduriez P, De Oliveira D, Mattoso M (2016) Multi-objective scheduling of scientific workflows in multisite clouds. Fut Gener Comput Syst 63:76–95

    Article  Google Scholar 

  105. Liu J, Pacitti E, Valduriez P, Mattoso M (2015) A survey of data-intensive scientific workflow management. J Grid Comput 13(4):457–493

    Article  Google Scholar 

  106. Liu J, Pineda L, Pacitti E, Costan A, Valduriez P, Antoniu G, Mattoso M (2018) Efficient scheduling of scientific workflows using hot metadata in a multisite cloud. IEEE Trans Knowl Data Eng 31(10):1940–1953

    Article  Google Scholar 

  107. Liu L, Zhang J, Song SH, Letaief KB (2020) Client-edge-cloud hierarchical federated learning. In: IEEE int. conf. on communications (ICC), pp 1–6

  108. Liu R, Cao Y, Yoshikawa M, Chen H (2020) Fedsel: federated sgd under local differential privacy with top-k dimension selection. In: Int. conf. on database systems for advanced applications, pp 485–501

  109. Liu Y, Huang A, Luo Y, Huang H, Liu Y, Chen Y, Feng L, Chen T, Yu H, Yang Q (2020) Fedvision: an online visual object detection platform powered by federated learning. AAAI Confer Artif Intell 34:13172–13179

    Google Scholar 

  110. Liu Y, Kang Y, Zhang X, Li L, Cheng Y, Chen T, Hong M, Yang Q (2019) A communication efficient collaborative learning framework for distributed features. ArXiv preprint arXiv:1912.11187

  111. Lo SK, Lu Q, Zhu L, Paik HY, Xu X, Wang C (2021) Architectural patterns for the design of federated learning systems. ArXiv preprint arXiv:2101.02373

  112. Luo S, Chen X, Wu Q, Zhou Z, Yu S (2020) Hfel: joint edge association and resource allocation for cost-efficient hierarchical federated edge learning. IEEE Trans Wirel Commun 19(10):6535–6548

    Article  Google Scholar 

  113. Luo X, Zhu X (2020) Exploiting defenses against gan-based feature inference attacks in federated learning. ArXiv preprint arXiv:2004.12571

  114. Lyu L, Yu H, Yang Q (2020) Threats to federated learning: a survey. ArXiv preprint arXiv:2003.02133

  115. Lyu L, Yu J, Nandakumar K, Li Y, Ma X, Jin J, Yu H, Ng KS (2020) Towards fair and privacy-preserving federated deep models. IEEE Trans Parallel Distrib Syst 31(11):2524–2541

    Article  Google Scholar 

  116. Ma Y, Yu D, Wu T, Wang H (2019) Paddlepaddle: an open-source deep learning platform from industrial practice. Front Data Comput 1(1):105

    Google Scholar 

  117. Malekijoo A, Fadaeieslam MJ, Malekijou H, Homayounfar M, Alizadeh-Shabdiz F, Rawassizadeh R (2021) FEDZIP: a compression framework for communication-efficient federated learning. ArXiv preprint arXiv:2102.01593

  118. Mandal K, Gong G (2019) PrivFL: practical privacy-preserving federated regressions on high-dimensional data over mobile networks. In: ACM SIGSAC conf. on cloud computing security workshop, pp 57–68

  119. McKeen F, Alexandrovich I, Berenzon A, Rozas CV, Shafi H, Shanbhogue V, Savagaonkar UR (2013) Innovative instructions and software model for isolated execution. In: Int. workshop on hardware and architectural support for security and privacy

  120. McMahan B, Moore E, Ramage D, Hampson S, y Arcas BA (2017) Communication-efficient learning of deep networks from decentralized data. In: Int. conf. on artificial intelligence and statistics (AISTATS), pp 1273–1282

  121. McMahan HB, Ramage D, Talwar K, Zhang L (2017) Learning differentially private recurrent language models. ArXiv preprint arXiv:1710.06963

  122. McMahan HB, Ramage D, Talwar K, Zhang L (2018) Learning differentially private recurrent language models. In: Int. conf. on learning representations (ICLR)

  123. Mei G, Guo Z, Liu S, Pan L (2019) Sgnn: a graph neural network based federated learning approach by hiding structure. In: IEEE int. conf. on big data (big data), pp 2560–2568

  124. Melis L, Song C, De Cristofaro E, Shmatikov V (2019) Exploiting unintended feature leakage in collaborative learning. In: IEEE symposium on security and privacy (SP), pp 691–706

  125. Meng C, Rambhatla S, Liu Y (2021) Cross-node federated graph neural network for spatio-temporal data modeling. In: ACM SIGKDD conference on knowledge discovery and data mining (KDD) (to appear)

  126. Mhaisen N, Abdellatif AA, Mohamed A, Erbad A, Guizani M (2021) Optimal user-edge assignment in hierarchical federated learning based on statistical properties and network topology constraints. IEEE Trans Netw Sci Eng 9(1): 55–66

  127. Mo F, Haddadi H (2019) Efficient and private federated learning using tee. In: EuroSys

  128. Mohri M, Sivek G, Suresh AT (2019) Agnostic federated learning. In: Int. conf. on machine learning (ICML), pp 4615–4625

  129. Mothukuri V, Parizi RM, Pouriyeh S, Huang Y, Dehghantanha A, Srivastava G (2021) A survey on security and privacy of federated learning. Fut Gener Comput Syst 115:619–640

    Article  Google Scholar 

  130. Muñoz-González L, Co KT, Lupu EC (2019) Byzantine-robust federated machine learning through adaptive model averaging. ArXiv preprint arXiv:1909.05125

  131. Narayanan D, Harlap A, Phanishayee A, Seshadri V, Devanur NR, Ganger GR, Gibbons PB, Zaharia M (2019) Pipedream: generalized pipeline parallelism for dnn training. In: ACM symposium on operating systems principles, pp 1–15

  132. Ochiai K, Senkawa K, Yamamoto N, Tanaka Y, Fukazawa Y (2019) Real-time on-device troubleshooting recommendation for smartphones. In: ACM SIGKDD int. conf. on knowledge discovery and data mining, pp 2783–2791

  133. Official Journal of the European Union. General data protection regulation. https://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:32016R0679 (online). Accessed 12 Feb 2021

  134. Ohrimenko O, Schuster F, Fournet C, Mehta A, Nowozin S, Vaswani K, Costa M (2016) Oblivious multi-party machine learning on trusted processors. In \(\{\)USENIX\(\}\) security symposium (\(\{\)USENIX\(\}\) security), pp 619–636

  135. Oksuz K, Cam BC, Kalkan S, Akbas E (2020) Imbalance problems in object detection: a review. IEEE Trans Pattern Anal Mach Intell 43:3388–3415

  136. OpenMined. Pysyft. https://github.com/OpenMined/PySyft (online). Accessed 22 Feb 2021

  137. PaddlePaddle B. Paddlehub. https://github.com/PaddlePaddle/PaddleHub (online). Accessed 01 Oct 2021

  138. Paillier P (1999) Public-key cryptosystems based on composite degree residuosity classes. In: Int. conf. on the theory and applications of cryptographic techniques, pp 223–238

  139. Pan SJ, Yang Q (2009) A survey on transfer learning. IEEE Trans Knowl Data Eng 22(10):1345–1359

    Article  Google Scholar 

  140. Peng H, Li H, Song Y, Zheng V, Li J (2021) Federated knowledge graphs embedding. In: ACM int. conf. on information and knowledge management (CIKM), pp 1–10

  141. Phan H, Thai MT, Hu H, Jin R, Sun T, Dou D (2020) Scalable differential privacy with certified robustness in adversarial learning. In: Int. conf. on machine learning (ICML), pp 7683–7694

  142. Pillutla K, Kakade SM, Harchaoui Z (2019) Robust aggregation for federated learning. ArXiv preprint arXiv:1912.13445

  143. Pineda-Morales L, Liu J, Costan A, Pacitti E, Antoniu G, Valduriez P, Mattoso M (2016) Managing hot metadata for scientific workflows on multisite clouds. In: IEEE Int. Conf. on Big Data (Big Data), pp 390–397

  144. Pytorch. Pytorch. https://pytorch.org/ (online). Accessed 13 Mar 2021

  145. Robbins H, Monro S (1951) A stochastic approximation method. Ann Math Stat 22(3):400–407

  146. Romanini D, Hall AJ, Papadopoulos P, Titcombe T, Ismail A, Cebere T, Sandmann R, Roehm R, Hoeh MA (2021) Pyvertical: a vertical federated learning framework for multi-headed splitnn. ArXiv preprint arXiv:2104.00489

  147. Rothchild D, Panda A, Ullah E, Ivkin N, Stoica I, Braverman V, Gonzalez J, Arora R (2020) Fetchsgd: communication-efficient federated learning with sketching. In: Int. conf. on machine learning (ICML), pp 8253–8265

  148. Ryffel T, Trask A, Dahl M, Wagner B, Mancuso J, Rueckert D, Passerat-Palmbach J (2018) A generic framework for privacy preserving deep learning. ArXiv preprint arXiv:1811.04017

  149. Sabater C, Bellet A, Ramon J (2020) Distributed differentially private averaging with improved utility and robustness to malicious parties. ArXiv preprint arXiv:2006.07218

  150. Satariano A. Google is fined $57 million under Europe’s data privacy law. https://www.nytimes.com/2019/01/21/technology/google-europe-gdpr-fine.html (online). Accessed 28 Feb 2021

  151. Sayed AH (2014) Adaptation, learning, and optimization over networks. Found Trends Mach Learn 7(ARTICLE):311–801

  152. Sayed AH, Tu SY, Chen J, Zhao X, Towfic ZJ (2013) Diffusion strategies for adaptation and learning over networks: an examination of distributed strategies and network behavior. IEEE Signal Process Maga 30(3):155–171

    Article  Google Scholar 

  153. Seif M, Tandon R, Li M (2020) Wireless federated learning with local differential privacy. In: IEEE int. symposium on information theory (ISIT), pp 2604–2609

  154. Seneta E (2006) Non-negative matrices and Markov chains. Springer

  155. Shakespeare W (2007) The complete works of William Shakespeare. Wordsworth Editions

  156. Shlezinger N, Chen M, Eldar YC, Poor HV, Cui S (2020) Federated learning with quantization constraints. In: IEEE int. conf. on acoustics, speech and signal processing (ICASSP), pp 8851–8855

  157. Shlezinger N, Chen M, Eldar YC, Poor HV, Cui S (2020) Uveqfed: universal vector quantization for federated learning. IEEE Trans Signal Process 69:500–514

    Article  MathSciNet  Google Scholar 

  158. Silverstein J. Hundreds of millions of facebook user records were exposed on amazon cloud server. https://www.cbsnews.com/news/millions-facebook-user-records-exposed-amazon-cloud-server/ (online). Accessed 28 Feb 2021

  159. Spring R, Kyrillidis A, Mohan V, Shrivastava A (2019) Compressing gradient optimizers via count-sketches. In: Int. conf. on machine learning (ICML), pp 5946–5955

  160. Standing Committee of the National People’s Congress. Cybersecurity law of the people’s Republic of China. https://www.newamerica.org/cybersecurity-initiative/digichina/blog/translation-cybersecurity-law-peoples-republic-china/ (online). Accessed 22 Feb 2021

  161. Stich SU, Cordonnier JB, Jaggi M (2018) Sparsified SGD with memory. In: Advances in neural information processing systems (NeurIPS), vol 31

  162. Sun H, Ma X, Hu RQ (2020) Adaptive federated learning with gradient compression in uplink NOMA. IEEE Trans Vehic Technol 69(12):16325–16329

  163. Sun Z, Kairouz P, Suresh AT, McMahan HB (2019) Can you really backdoor federated learning? ArXiv preprint arXiv:1911.07963

  164. Suzumura T, Zhou Y, Baracaldo N, Ye G, Houck K, Kawahara R, Anwar A, Stavarache LL, Watanabe Y, Loyola P, Klyashtorny D (2019) Towards federated graph learning for collaborative financial crimes detection. ArXiv preprint arXiv:1909.12946

  165. Tolpegin V, Truex S, Gursoy ME, Liu L (2020) Data poisoning attacks against federated learning systems. In: European symposium on research in computer security. Springer, pp 480–501

  166. Triastcyn A, Faltings B (2020) Federated generative privacy. IEEE Intell Syst 35(4):50–57

    Google Scholar 

  167. Truex Stacey, Baracaldo Nathalie, Anwar Ali, Steinke Thomas, Ludwig Heiko, Zhang Rui, Zhou Yi (2019) A hybrid approach to privacy-preserving federated learning. In: ACM workshop on artificial intelligence and security, pp 1–11

  168. Vanhaesebrouck P, Bellet A, Tommasi M (2017) Decentralized collaborative learning of personalized models over networks. In: Int. conf. on artificial intelligence and statistics (AISTATS), pp 509–517

  169. Velickovic P, Cucurull G, Casanova A, Romero A, Lio P, Bengio Y (2018) Graph attention networks. In: Int. conf. on learning representations (ICLR)

  170. Verbraeken J, Wolting M, Katzy J, Kloppenburg J, Verbelen T, Rellermeyer JS (2020) A survey on distributed machine learning. ACM Comput Surv (CSUR) 53(2):1–33

    Article  Google Scholar 

  171. Vishnu A, Siegel C, Daily J (2016) Distributed tensorflow with MPI. ArXiv preprint arXiv:1603.02339

  172. Wainakh A, Guinea AS, Grube T, Mühlhäuser M (2020) Enhancing privacy via hierarchical federated learning. In: IEEE European symposium on security and privacy workshops (EuroS&PW), pp 344–347

  173. Wang B, Li A, Li H, Chen Y (2020) Graphfl: a federated learning framework for semi-supervised node classification on graphs. ArXiv preprint arXiv:2012.04187

  174. Wang C, Chen B, Li G, Wang H (2021) FL-AGCNS: federated learning framework for automatic graph convolutional network search. ArXiv preprint arXiv:2104.04141

  175. Wang G (2019) Interpret federated learning with Shapley values. ArXiv preprint arXiv:1905.04519

  176. Wang H, Yurochkin M, Sun Y, Papailiopoulos D, Khazaeni Y (2020) Federated learning with matched averaging. In: Int. conf. on learning representations (ICLR)

  177. Wang J, Charles Z, Xu Z, Joshi G, McMahan HB, Al-Shedivat M, Andrew G, Avestimehr S, Daly K, Data D, Diggavi S (2021) A field guide to federated optimization. ArXiv preprint arXiv:2107.06917

  178. Wang J, Sahu AK, Yang Z, Joshi G, Kar S (2019) Matcha: speeding up decentralized SGD via matching decomposition sampling. In: Indian control conference (ICC), pp 299–300

  179. Wang L, Xu S, Wang X, Zhu Q (2021) Addressing class imbalance in federated learning. AAAI Confer Artif Intell 35:10165–10173

  180. Wang L, Wang W, Li B (2019) CMFL: mitigating communication overhead for federated learning. In: IEEE int. conf. on distributed computing systems (ICDCS), pp 954–964

  181. Wang Z, Song M, Zhang Z, Song Y, Wang Q, Qi H (2019) Beyond inferring class representatives: User-level privacy leakage from federated learning. In: IEEE conf. on computer communications (INFOCOM), pp 2512–2520

  182. WeBank. Federated AI technology enabler (FATE). https://github.com/FederatedAI/FATE (online). Accessed 16 Feb 2021

  183. WeBank. Federated learning white paper v2.0. https://aisp-1251170195.cos.ap-hongkong.myqcloud.com/wp-content/uploads/pdf/%E8%81%94%E9%82%A6%E5%AD%A6%E4%B9%A0%E7%99%BD%E7%9A%AE%E4%B9%A6_v2.0.pdf (online). Accessed 14 Feb 2021

  184. Wei K, Li J, Ding M, Ma C, Yang HH, Farokhi F, Jin S, Quek TQ, Poor HV (2020) Federated learning with differential privacy: algorithms and performance analysis. IEEE Trans Inf Forens Secur 15:3454–3469

  185. Wu C, Wu F, Cao Y, Huang Y, Xie X (2021) Fedgnn: federated graph neural network for privacy-preserving recommendation. ArXiv preprint arXiv:2102.04925

  186. Wu T, Liu Z, Huang Q, Wang Y, Lin D (2021) Adversarial robustness under long-tailed distribution. In: IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 8659–8668

  187. Xu C, Tao D, Xu C (2013) A survey on multi-view learning. ArXiv preprint arXiv:1304.5634

  188. Xu J, Glicksberg BS, Su C, Walker P, Bian J, Wang F (2021) Federated learning for healthcare informatics. J Healthc Inf Res 5:1–19

  189. Xu J, Du W, Jin Y, He W, Cheng R (2020) Ternary compression for communication-efficient federated learning. IEEE Trans Neural Netw Learn Syst 33(3):1162–1176

  190. Yang Q, Liu Y, Chen T, Tong Y (2019) Federated machine learning: concept and applications. ACM Trans Intell Syst Technol 10(2):1–19

  191. Yi X, Paulet R, Bertino E (2014) Homomorphic encryption. In: Homomorphic encryption and applications. Springer, pp 27–46

  192. Yuan J, Xu M, Ma X, Zhou A, Liu X, Wang S (2020) Hierarchical federated learning through LAN-WAN orchestration. ArXiv preprint arXiv:2010.11612

  193. Yurochkin M, Agarwal M, Ghosh S, Greenewald K, Hoang N, Khazaeni Y (2019) Bayesian nonparametric federated learning of neural networks. In: Int. conf. on machine learning (ICML), pp 7252–7261

  194. Zhang C, Li S, Xia J, Wang W, Yan F, Liu Y (2020) Batchcrypt: efficient homomorphic encryption for cross-silo federated learning. In: USENIX annual technical conference (USENIX ATC), pp 493–506

  195. Zhang C, Bi J, Soda P (2017) Feature selection and resampling in class imbalance learning: Which comes first? An empirical study in the biological domain. In: Int. conf. on bioinformatics and biomedicine (BIBM), pp 933–938

  196. Zhang C, Bi J, Xu S, Ramentol E, Fan G, Qiao B, Fujita H (2019) Multi-imbalance: an open-source software for multi-class imbalance learning. Knowl-Based Syst 174:137–143

  197. Zhang C, Soda P, Bi J, Fan G, Almpanidis G, Garcia S (2021) An empirical study on the joint impact of feature selection and data resampling on imbalance classification. ArXiv preprint arXiv:2109.00201

  198. Zhang H, Shen T, Wu F, Yin M, Yang H, Wu C (2021) Federated graph learning—a position paper. ArXiv preprint arXiv:2105.11099

  199. Zhang T, He C, Ma T, Gao L, Ma M, Avestimehr S (2021) Federated learning for internet of things: a federated learning framework for on-device anomaly data detection. ArXiv preprint arXiv:2106.07976

  200. Zhang X, Li F, Zhang Z, Li Q, Wang C, Wu J (2020) Enabling execution assurance of federated learning at untrusted participants. In: IEEE INFOCOM conf. on computer communications, pp 1877–1886

  201. Zhao B, Mopuri KR, Bilen H (2020) IDLG: improved deep leakage from gradients. ArXiv preprint arXiv:2001.02610

  202. Zhao Y, Barnaghi P, Haddadi H (2021) Multimodal federated learning. ArXiv preprint arXiv:2109.04833

  203. Zheng L, Zhou J, Chen C, Wu B, Wang L, Zhang B (2021) Asfgnn: automated separated-federated graph neural network. Peer-to-Peer Netw Appl 14(3):1692–1704

  204. Zhou J, Chen C, Zheng L, Wu H, Wu J, Zheng X, Wu B, Liu Z, Wang L (2020) Vertically federated graph neural network for privacy-preserving node classification. ArXiv preprint arXiv:2005.11903

  205. Zhu H, Zhang H, Jin Y (2021) From federated learning to federated neural architecture search: a survey. Complex Intell Syst

  206. Zinkevich M, Weimer M, Li L, Smola A (2010) Parallelized stochastic gradient descent. In: Advances in neural information processing systems (NeurIPS), vol 4. Citeseer, p 4

Author information

Corresponding authors

Correspondence to Ji Liu or Dejing Dou.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Liu, J., Huang, J., Zhou, Y. et al. From distributed machine learning to federated learning: a survey. Knowl Inf Syst 64, 885–917 (2022). https://doi.org/10.1007/s10115-022-01664-x

  • DOI: https://doi.org/10.1007/s10115-022-01664-x
