From distributed machine learning to federated learning: a survey

Liu, Ji; Huang, Jizhou; Zhou, Yang; Li, Xuhong; Ji, Shilei; Xiong, Haoyi; Dou, Dejing

doi:10.1007/s10115-022-01664-x

Survey Paper
Published: 22 March 2022

Knowledge and Information Systems, Volume 64, pages 885–917 (2022)

Ji Liu ORCID: orcid.org/0000-0002-9421-4100¹,
Jizhou Huang¹,
Yang Zhou²,
Xuhong Li¹,
Shilei Ji¹,
Haoyi Xiong¹ &
Dejing Dou¹,³

Abstract

In recent years, data and computing resources have typically been distributed across end-user devices, regions, and organizations. Because of laws or regulations, these distributed data and computing resources cannot be aggregated or directly shared among different regions or organizations for machine learning tasks. Federated learning has emerged as an efficient approach to exploiting distributed data and computing resources in order to collaboratively train machine learning models. At the same time, federated learning complies with laws and regulations and ensures data security and data privacy. In this paper, we provide a comprehensive survey of existing work on federated learning. First, we propose a functional architecture of federated learning systems and a taxonomy of related techniques. Second, we explain federated learning systems from four aspects: diverse types of parallelism, aggregation algorithms, data communication, and security. Third, we present four widely used federated systems based on the functional architecture. Finally, we summarize the limitations and propose future research directions.

References

Abad MS, Ozfatura E, Gunduz D, Ercetin O (2020) Hierarchical federated learning across heterogeneous cellular networks. In: IEEE int. conf. on acoustics, speech and signal processing (ICASSP), pp 8866–8870
Abadi M, Chu A, Goodfellow I, McMahan HB, Mironov I, Talwar K, Zhang L (2016) Deep learning with differential privacy. In: ACM SIGSAC conf. on computer and communications security, pp 308–318
Abou El Houda Z, Hafid A, Khoukhi L (2019) Co-IOT: a collaborative DDOS mitigation scheme in IOT environment based on blockchain using SDN. In: IEEE global communications conference (GLOBECOM), pp 1–6
Aono Y, Hayashi T, Wang L, Moriai S (2017) Privacy-preserving deep learning via additively homomorphic encryption. IEEE Trans Inf Forens Secur 13(5):1333–1345
Google Scholar
Arivazhagan MG, Aggarwal V, Singh AK, Choudhary S (2019) Federated learning with personalization layers. ArXiv preprint arXiv:1912.00818
Assran M, Loizou N, Ballas N, Rabbat M (2019) Stochastic gradient push for distributed deep learning. Int Confer Mach Learn 97:344–353
Google Scholar
Ateniese G, Mancini LV, Spognardi A, Villani A, Vitali D, Felici G (2015) Hacking smart machines with smarter ones: how to extract meaningful data from machine learning classifiers. Int J Secur Netw 10(3):137–150
Article Google Scholar
Awan AA, Chu CH, Subramoni H, Panda DK (2018) Optimized broadcast for deep learning workloads on dense-GPU infiniband clusters: MPI or NCCL? In: European MPI Users’ Group Meeting, pp 1–9
Bagdasaryan E, Veit A, Hua Y, Estrin D, Shmatikov V (2020) How to backdoor federated learning. In: Int. conf. on artificial intelligence and statistics (AISTATS), pp 2938–2948
Baidu. Federated deep learning in paddlepaddle (online). https://github.com/PaddlePaddle/PaddleFL. Accessed 16 Feb 2021
Baidu. Paddlepaddle interpretability. https://github.com/PaddlePaddle/InterpretDL (online). Accessed 13 Mar 2021
Beutel DJ, Topal T, Mathur A, Qiu X, Parcollet T, de Gusmão PP, Lane ND (2020) Flower: a friendly federated learning research framework. ArXiv preprint arXiv:2007.14390
Bhagoji AN, Chakraborty S, Mittal P, Calo S (2019) Analyzing federated learning through an adversarial lens. In: Int. conf. on machine learning (ICML), pp 634–643
Bi J, Zhang C (2018) An empirical comparison on state-of-the-art multi-class imbalance learning algorithms and a new diversified ensemble learning scheme. Knowl-Based Syst 158:81–93
Article Google Scholar
Bian J, Xiong H, Cheng W, Hu W, Guo Z, Fu Y (2017) Multi-party sparse discriminant learning. In: 2017 IEEE international conference on data mining (ICDM). IEEE, pp 745–750
Bian J, Xiong H, Fu Y, Huan J, Guo Z (2020) Mp2sda: multi-party parallelized sparse discriminant learning. ACM Trans Knowl Discov Data (TKDD) 14(3):1–22
Article Google Scholar
Bonawitz K, Eichner H, Grieskamp W, Huba D, Ingerman A, Ivanov V, Kiddon C, Konečnỳ J, Mazzocchi S, McMahan B, Van Overveldt T (2019) Towards federated learning at scale: system design. In: Machine learning and systems (MLSys)
Briggs C, Fan Z, Andras P (2020) Federated learning with hierarchical clustering of local updates to improve training on non-IID data. In: Int. joint conf. on neural networks (IJCNN). IEEE, pp 1–9
Brisimi TS, Chen R, Mela T, Olshevsky A, Paschalidis IC, Shi W (2018) Federated learning of predictive models from federated electronic health records. Int J Med Inf 112:59–67
Article Google Scholar
California consumer privacy act home page (online). Californians for Consumer Privacy. https://www.caprivacy.org/. Accessed 14 Feb 2021
Caldas S, Duddu SM, Wu P, Li T, Konečnỳ J, McMahan HB, Smith V, Talwalkar A (2018) Leaf: a benchmark for federated settings. ArXiv preprint arXiv:1812.01097
Caldas S, Konečnỳ J, McMahan HB, Talwalkar A (2018) Expanding the reach of federated learning by reducing client resource requirements. ArXiv preprint arXiv:1812.07210
Canini K, Chandra T, Ie E, McFadden J, Goldman K, Gunter M, Harmsen J, LeFevre K, Lepikhin D, Llinares TL, Mukherjee I (2012) Sibyl: a system for large scale supervised machine learning. Techn Talk 1:113
Google Scholar
Çatak FÖ (2015) Secure multi-party computation based privacy preserving extreme learning machine algorithm over vertically distributed data. In: Int. conf. on neural information processing (ICONIP), pp 337–345
Chatterjee S, Seneta E (1977) Towards consensus: some convergence theorems on repeated averaging. J Appl Probab 14(1):89–97
Article MathSciNet MATH Google Scholar
Chen CL, Golubchik L, Paolieri M (2020) Backdoor attacks on federated meta-learning. ArXiv preprint arXiv:2006.07026
Chen J, Sayed AH (2012) Diffusion adaptation strategies for distributed optimization and learning over networks. IEEE Trans Signal Process 60(8):4289–4305
Article MathSciNet MATH Google Scholar
Chen M, Zhang W, Yuan Z, Jia Y, Chen H (2020) Fede: embedding knowledge graphs in federated setting. ArXiv preprint arXiv:2010.12882
Chen Y, Sun X, Jin Y (2019) Communication-efficient federated deep learning with layerwise asynchronous model update and temporally weighted aggregation. IEEE Trans Neural Netw Learn Syst 31(10):4229–4238
Article Google Scholar
Chen Y, Luo F, Li T, Xiang T, Liu Z, Li J (2020) A training-integrity privacy-preserving federated learning scheme with trusted execution environment. Inf Sci 522:69–79
Article Google Scholar
Chik WB (2013) The Singapore personal data protection act and an assessment of future trends in data privacy reform. Comput Law Secur Rev 29(5):554–575
Article Google Scholar
Cohen G, Afshar S, Tapson J, Van Schaik A (2017) Emnist: extending mnist to handwritten letters. In Int. joint conf. on neural networks (IJCNN), pp 2921–2926
Conger K. Uber settles data breach investigation for $148 million. https://www.nytimes.com/2018/09/26/technology/uber-data-breach.html (Online). Accessed 17 Feb 2021
Conger K (2018) Uber settles data breach investigation for \$148 million. https://www.nytimes.com/2018/09/26/technology/uber-data-breach.html (online). Accessed 28 Feb 2021
Deng J, Dong W, Socher R, Li LJ, Li K, Fei-Fei L (2009) Imagenet: a large-scale hierarchical image database. In: IEEE conf. on computer vision and pattern recognition (CVPR), pp. 248–255
Deng Y, Kamani MM, Mahdavi M (2020) Adaptive personalized federated learning. ArXiv preprint arXiv:2003.13461
Dinh CT, Tran N, Nguyen J (2020) Personalized federated learning with Moreau envelopes. ArXiv preprint arXiv:2006.08848
Dwork C (2008) Differential privacy: a survey of results. In: Int. conf. on theory and applications of models of computation, pp 1–19
Eddy SR (2004) What is a hidden Markov model? Nat Biotechnol 22(10):1315–1316
Article Google Scholar
Fang M, Cao X, Jia J, Gong N (2020) Local model poisoning attacks to byzantine-robust federated learning. In: USENIX security symposium (USENIX security), pp 1605–1622
Ferhat ÖÇ, Mustacoglu AF (2018) CPP-ELM: cryptographically privacy-preserving extreme learning machine for cloud systems. Int J Comput Intell Syst 11(1):33–44
Article Google Scholar
Feng S, Yu H (2020) Multi-participant multi-class vertical federated learning. ArXiv preprint arXiv:2001.11154
Feng Z, Xiong H, Song C, Yang S, Zhao B, Wang L, Chen Z, Yang S, Liu L, Huan J (2019) Securegbm: secure multi-party gradient boosting. In IEEE int. conf. on big data (big data), pp 1312–1321
Fette I, Melnikov A (2011) The websocket protocol. RFC, 6455:1–71
Flynn MJ (1972) Some computer organizations and their effectiveness. IEEE Trans Comput 100(9):948–960
Article MATH Google Scholar
Fung C, Yoon CJ, Beschastnikh I (2018) Mitigating sybils in federated learning poisoning. ArXiv preprint arXiv:1808.04866
Gaff BM, Sussman HE, Geetter J (2014) Privacy and big data. Computer 47(6):7–9
Article Google Scholar
Ganga K, Karthik S (2013) A fault tolerent approach in scientific workflow systems based on cloud computing. In: Int. conf. on pattern recognition, informatics and mobile engineering, pp 387–390
Geiping J, Bauermeister H, Dröge H, Moeller M (2020) Inverting gradients—How easy is it to break privacy in federated learning? ArXiv preprint arXiv:2003.14053
Geyer RC, Klein T, Nabi M (2017) Differentially private federated learning: a client level perspective. ArXiv preprint arXiv:1712.07557
Gibiansky A (2017) Bringing HPC techniques to deep learning. https://andrew.gibiansky.com/blog/machine-learning/baidu-allreduce/ (online). Accessed 12 Aug 2020
Gilpin LH, Bau D, Yuan BZ, Bajwa A, Specter M, Kagal L (2018) Explaining explanations: an overview of interpretability of machine learning. In: IEEE int. conf. on data science and advanced analytics (DSAA). IEEE, pp 80–89
Gilpin LH, Bau D, Yuan BZ, Bajwa A, Specter M, Kagal L (2016) Deep learning, vol 1. MIT Press, Cambridge
Google Scholar
Goodfellow IJ, Shlens J, Szegedy C (2015) Explaining and harnessing adversarial examples. In: Int. conf. on learning representations (ICLR)
Google. Tensorflow federated: machine learning on decentralized data. https://www.tensorflow.org/federated (online). Accessed 16 Feb 2021
Gropp W, Gropp WD, Lusk E, Skjellum A, Lusk AD (1999) Using MPI: portable parallel programming with the message-passing interface, vol 1. MIT Press
Haddadpour F, Kamani MM, Mokhtari A, Mahdavi M (2020) Federated learning with compression: unified analysis and sharp guarantees. ArXiv preprint arXiv:2007.01154
Hao M, Li H, Xu G, Liu S, Yang H (2019) Towards efficient and privacy-preserving federated deep learning. In: IEEE int. conf. on communications (ICC), pp 1–6
Hardy S, Henecka W, Ivey-Law H, Nock R, Patrini G, Smith G, Thorne B (2017) Private federated learning on vertically partitioned data via entity resolution and additively homomorphic encryption. ArXiv preprint arXiv:1711.10677
He C, Avestimehr S, Annavaram M (2020) Group knowledge transfer: collaborative training of large CNNs on the edge. ArXiv preprint arXiv:2007.14513
He C, Annavaram M, Avestimehr S (2020) Towards non-IID and invisible data with FEDNAS: federated deep learning via neural architecture search. ArXiv preprint arXiv:2004.08546
He C, Balasubramanian K, Ceyani E, Yang C, Xie H, Sun L, He L, Yang L, Yu PS, Rong Y, Zhao P (2021) Fedgraphnn: a federated learning system and benchmark for graph neural networks. ArXiv preprint arXiv:2104.07145
He C, Ceyani E, Balasubramanian K, Annavaram M, Avestimehr S (2021) Spreadgnn: serverless multi-task federated learning for graph neural networks. ArXiv preprint arXiv:2106.02743
He C, Li S, Soltanolkotabi M, Avestimehr S (2021) Pipetransformer: automated elastic pipelining for distributed training of large-scale models. In: Int. conf. on machine learning, volume 139 of machine learning research, pp 4150–4159
He C, Li S, So J, Zeng X, Zhang M, Wang H, Wang X, Vepakomma P, Singh A, Qiu H, Zhu X (2020) Fedml: a research library and benchmark for federated machine learning. ArXiv preprint arXiv:2007.13518
He C, Shah AD, Tang Z, Sivashunmugam DF, Bhogaraju K, Shimpi M, Shen L, Chu X, Soltanolkotabi M, Avestimehr S. Fedcv: a federated learning framework for diverse computer vision tasks. ArXiv preprint arXiv: 2111.11066
He C, Tan C, Tang H, Qiu S, Liu J (2019) Central server free federated learning over single-sided trust social networks. ArXiv preprint arXiv:1910.04956
He C, Ye H, Shen L, Zhang T (2020) Milenas: efficient neural architecture search via mixed-level reformulation. In: IEEE/CVF conf. on computer vision and pattern recognition (CVPR)
He H, Garcia EA (2009) Learning from imbalanced data. IEEE Trans Knowl Data Eng 21(9):1263–1284
Article Google Scholar
He L, Karimireddy SP, Jaggi M (2020) Secure byzantine-robust machine learning. ArXiv preprint arXiv:2006.04747
Hitaj B, Ateniese G, Perez-Cruz F (2017) Deep models under the GAN: information leakage from collaborative deep learning. In: ACM SIGSAC conference on computer and communications security, pp 603–618
Hu C, Jiang J, Wang Z (2019) Decentralized federated learning: a segmented gossip approach. ArXiv preprint arXiv:1908.07782
Hu Z, Shaloudegi K, Zhang G, Yu Y (2020) Fedmgda+: federated learning meets multi-objective optimization. ArXiv preprint arXiv:2006.11489
Huang Y, Cheng Y, Bapna A, Firat O, Chen D, Chen M, Lee H, Ngiam J, Le QV, Wu Y (2018) Gpipe: efficient training of giant neural networks using pipeline parallelism. ArXiv preprint arXiv:1811.06965
Ivkin N, Rothchild D, Ullah E, Stoica I, Arora R (2019) Communication-efficient distributed SGD with sketching. ArXiv preprint arXiv:1903.04488
Jiang J, Fu F, Yang T, Cui B (2018) Sketchml: accelerating distributed machine learning with data sketches. In: Int. conf. on management of data, pp 1269–1284
Jiang J, Ji S, Long G (2020) Decentralized knowledge acquisition for mobile internet applications. In: World Wide Web, pp 1–17
Jiang M, Jung T, Karl R, Zhao T (2020) Federated dynamic GNN with secure aggregation. ArXiv preprint arXiv:2009.07351
Kairouz P, McMahan HB, Avent B, Bellet A, Bennis M, Bhagoji AN, Bonawitz K, Charles Z, Cormode G, Cummings R, D’Oliveira RG (2019) Advances and open problems in federated learning. ArXiv preprint arXiv:1912.04977
Kairouz P, McMahan HB, Avent B, Bellet A, Bennis M, Bhagoji AN, Bonawitz K, Charles Z, Cormode G, Cummings R, D’Oliveira RG, et al. (2021) Advances and open problems in federated learning. Foundations and Trends® in Machine Learning, 14(1)
Karimireddy SP, Kale S, Mohri M, Reddi S, Stich S, Suresh AT (2020) Scaffold: stochastic controlled averaging for federated learning. In: Int. conf. on machine learning (ICML), pp 5132–5143
Karimireddy SP, Rebjock Q, Stich S, Jaggi M (2019) Error feedback fixes signsgd and other gradient compression schemes. In: Int. conf. on machine learning (ICML), pp 3252–3261
Katevas K, Bagdasaryan E, Waterman J, Safadieh MM, Birrell E, Haddadi H, Estrin D (2020) Policy-based federated learning. ArXiv e-prints, pp arXiv-2003
Ke C, Honorio J (2021) Federated myopic community detection with one-shot communication. ArXiv preprint arXiv:2106.07255
Kholod I, Yanaki E, Fomichev D, Shalugin E, Novikova E, Filippov E, Nordlund M (2021) Open-source federated learning frameworks for iot: a comparative review and analysis. Sensors 21(1):167
Article Google Scholar
Konečnỳ J, McMahan HB, Yu FX, Richtárik P, Suresh AT, Bacon D (2016) Federated learning: strategies for improving communication efficiency. ArXiv preprint arXiv:1610.05492
Kulkarni V, Kulkarni M, Pant A (2020) Survey of personalization techniques for federated learning. In: World conf. on smart trends in systems, security and sustainability (WorldS4), pp 794–797
Lalitha A, Kilinc OC, Javidi T, Koushanfar F (2019) Peer-to-peer federated learning on graphs. ArXiv preprint arXiv:1901.11173
Li Q, Wen Z, He B (2019) A survey on federated learning systems: vision, hype and reality for data privacy and protection. ArXiv preprint arXiv:1907.09693
Li S, Cheng Y, Liu Y, Wang W, Chen T (2019) Abnormal client behavior detection in federated learning. ArXiv preprint arXiv:1910.09933
Li T, Sahu AK, Talwalkar A, Smith V (2020) Federated learning: challenges, methods, and future directions. IEEE Signal Process Mag 37(3):50–60
Article Google Scholar
Li T, Sahu AK, Zaheer M, Sanjabi M, Talwalkar A, Smith V (2020) Federated optimization in heterogeneous networks. Mach Learn Syst 2:429–450
Google Scholar
Li T, Sanjabi M, Beirami A, Smith V (2019) Fair resource allocation in federated learning. arXiv preprint arXiv:1905.10497
Li Y, Wu B, Jiang Y, Li Z, Xia ST (2020) Backdoor learning: a survey. arXiv preprint arXiv:2007.08745
Li Z, Huang Z, Chen C, Hong C (2019) Quantification of the leakage in federated learning. arXiv preprint arXiv:1910.05467
Lian X, Zhang C, Zhang H, Hsieh CJ, Zhang W, Liu J (2017) Can decentralized algorithms outperform centralized algorithms? A case study for decentralized parallel stochastic gradient descent. In: Advances in neural information processing systems (NeurIPS), pp 5330–5340
Liang Z, Wang B, Gu Q, Osher S, Yao Y (2020) Exploring private federated learning with laplacian smoothing. arXiv preprint arXiv:2005.00218
Liaqat M, Chang V, Gani A, Ab Hamid SH, Toseef M, Shoaib U, Ali RL (2017) Federated cloud resource management: review and discussion. J Netw Comput Appl 77:87–105
Article Google Scholar
Lim WY, Luong NC, Hoang DT, Jiao Y, Liang YC, Yang Q, Niyato D, Miao C (2020) Federated learning in mobile edge networks: a comprehensive survey. IEEE Commun Surv Tutor 22(3):2031–2063
Article Google Scholar
Lin BY, He C, Zeng Z, Wang H, Huang Y, Soltanolkotabi M, Ren X, Avestimehr S (2021) Fednlp: a research platform for federated learning in natural language processing. ArXiv preprint arXiv:2104.08815
Lin J, Du M, Liu J (2019) Free-riders in federated learning: attacks and defenses. ArXiv preprint arXiv:1911.12560
Lin Y, Chen C, Chen C, Wang L (2020) Improving federated relational data modeling via basis alignment and weight penalty. ArXiv preprint arXiv:2011.11369
Liu J, Bondiombouy C, Mo L, Valduriez P (2020) Two-phase scheduling for efficient vehicle sharing. IEEE Trans Intell Transp Syst (TITS) 23(1): 457–470
Liu J, Pacitti E, Valduriez P, De Oliveira D, Mattoso M (2016) Multi-objective scheduling of scientific workflows in multisite clouds. Fut Gener Comput Syst 63:76–95
Article Google Scholar
Liu J, Pacitti E, Valduriez P, Mattoso M (2015) A survey of data-intensive scientific workflow management. J Grid Comput 13(4):457–493
Article Google Scholar
Liu J, Pineda L, Pacitti E, Costan A, Valduriez P, Antoniu G, Mattoso M (2018) Efficient scheduling of scientific workflows using hot metadata in a multisite cloud. IEEE Trans Knowl Data Eng 31(10):1940–1953
Article Google Scholar
Liu L, Zhang J, Song SH, Letaief KB (2020) Client-edge-cloud hierarchical federated learning. In: IEEE int. conf. on communications (ICC), pp 1–6
Liu R, Cao Y, Yoshikawa M, Chen H (2020) Fedsel: federated sgd under local differential privacy with top-k dimension selection. In: Int. conf. on database systems for advanced applications, pp 485–501
Liu Y, Huang A, Luo Y, Huang H, Liu Y, Chen Y, Feng L, Chen T, Yu H, Yang Q (2020) Fedvision: an online visual object detection platform powered by federated learning. AAAI Confer Artif Intell 34:13172–13179
Google Scholar
Liu Y, Kang Y, Zhang X, Li L, Cheng Y, Chen T, Hong M, Yang Q (2019) A communication efficient collaborative learning framework for distributed features. ArXiv preprint arXiv:1912.11187
Lo SK, Lu Q, Zhu L, Paik HY, Xu X, Wang C (2021) Architectural patterns for the design of federated learning systems. ArXiv preprint arXiv:2101.02373
Luo S, Chen X, Wu Q, Zhou Z, Yu S (2020) Hfel: joint edge association and resource allocation for cost-efficient hierarchical federated edge learning. IEEE Trans Wirel Commun 19(10):6535–6548
Article Google Scholar
Luo X, Zhu X (2020) Exploiting defenses against gan-based feature inference attacks in federated learning. ArXiv preprint arXiv:2004.12571
Lyu L, Yu H, Yang Q (2020) Threats to federated learning: a survey. ArXiv preprint arXiv:2003.02133
Lyu L, Yu J, Nandakumar K, Li Y, Ma X, Jin J, Yu H, Ng KS (2020) Towards fair and privacy-preserving federated deep models. IEEE Trans Parallel Distrib Syst 31(11):2524–2541
Article Google Scholar
Ma Y, Yu D, Wu T, Wang H (2019) Paddlepaddle: an open-source deep learning platform from industrial practice. Front Data Comput 1(1):105
Google Scholar
Malekijoo A, Fadaeieslam MJ, Malekijou H, Homayounfar M, Alizadeh-Shabdiz F, Rawassizadeh R (2021) FEDZIP: a compression framework for communication-efficient federated learning. ArXiv preprint arXiv:2102.01593
Mandal K, Gong G (2019) PrivFL: practical privacy-preserving federated regressions on high-dimensional data over mobile networks. In: ACM SIGSAC conf. on cloud computing security workshop, pp 57–68
McKeen F, Alexandrovich I, Berenzon A, Rozas CV, Shafi H, Shanbhogue V, Savagaonkar UR (2013) Innovative instructions and software model for isolated execution. In: Int. workshop on hardware and architectural support for security and privacy
McMahan B, Moore E, Ramage D, Hampson S, y Arcas BA (2017) Communication-efficient learning of deep networks from decentralized data. In: Int. conf. on artificial intelligence and statistics (AISTATS), pp 1273–1282
McMahan HB, Ramage D, Talwar K, Zhang L (2017) Learning differentially private recurrent language models. ArXiv preprint arXiv:1710.06963
McMahan HB, Ramage D, Talwar K, Zhang L (2018) Learning differentially private recurrent language models. In: Int. conf. on learning representations (ICLR)
Mei G, Guo Z, Liu S, Pan L (2019) Sgnn: a graph neural network based federated learning approach by hiding structure. In: IEEE int. conf. on big data (big data), pp 2560–2568
Melis L, Song C, De Cristofaro E, Shmatikov V (2019) Exploiting unintended feature leakage in collaborative learning. In: IEEE symposium on security and privacy (SP), pp 691–706
Meng C, Rambhatla S, Liu Y (2021) Cross-node federated graph neural network for spatio-temporal data modeling. In: ACM SIGKDD conference on knowledge discovery and data mining (KDD) (to appear)
Mhaisen N, Abdellatif AA, Mohamed A, Erbad A, Guizani M (2021) Optimal user-edge assignment in hierarchical federated learning based on statistical properties and network topology constraints. IEEE Trans Netw Sci Eng 9(1): 55–66
Mo F, Haddadi H (2019) Efficient and private federated learning using tee. In: EuroSys
Mohri M, Sivek G, Suresh AT (2019) Agnostic federated learning. In: Int. conf. on machine learning (ICML), pp 4615–4625
Mothukuri V, Parizi RM, Pouriyeh S, Huang Y, Dehghantanha A, Srivastava G (2021) A survey on security and privacy of federated learning. Fut Gener Comput Syst 115:619–640
Article Google Scholar
Muñoz-González L, Co KT, Lupu EC (2019) Byzantine-robust federated machine learning through adaptive model averaging. ArXiv preprint arXiv:1909.05125
Narayanan D, Harlap A, Phanishayee A, Seshadri V, Devanur NR, Ganger GR, Gibbons PB, Zaharia M (2019) Pipedream: generalized pipeline parallelism for dnn training. In: ACM symposium on operating systems principles, pp 1–15
Ochiai K, Senkawa K, Yamamoto N, Tanaka Y, Fukazawa Y (2019) Real-time on-device troubleshooting recommendation for smartphones. In: ACM SIGKDD int. conf. on knowledge discovery and data mining, pp 2783–2791
Official Journal of the European Union. General data protection regulation. https://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:32016R0679 (online). Accessed 12 Feb 2021
Ohrimenko O, Schuster F, Fournet C, Mehta A, Nowozin S, Vaswani K, Costa M (2016) Oblivious multi-party machine learning on trusted processors. In $\{$USENIX$\}$ security symposium ($\{$USENIX$\}$ security), pp 619–636
Oksuz K, Cam BC, Kalkan S, Akbas E (2020) Imbalance problems in object detection: a review. IEEE Trans Pattern Anal Mach Intell 43:3388–3415
OpenMined. Pysyft. https://github.com/OpenMined/PySyft (online). Accessed 22 Feb 2021
PaddlePaddle B. Paddlehub. https://github.com/PaddlePaddle/PaddleHub (online). Accessed 01 Oct 2021
Paillier P (1999) Public-key cryptosystems based on composite degree residuosity classes. In: Int. conf. on the theory and applications of cryptographic techniques, pp 223–238
Pan SJ, Yang Q (2009) A survey on transfer learning. IEEE Trans Knowl Data Eng 22(10):1345–1359
Article Google Scholar
Peng H, Li H, Song Y, Zheng V, Li J (2021) Federated knowledge graphs embedding. In: ACM int. conf. on information and knowledge management (CIKM), pp 1–10
Phan H, Thai MT, Hu H, Jin R, Sun T, Dou D (2020) Scalable differential privacy with certified robustness in adversarial learning. In: Int. conf. on machine learning (ICML), pp 7683–7694
Pillutla K, Kakade SM, Harchaoui Z (2019) Robust aggregation for federated learning. ArXiv preprint arXiv:1912.13445
Pineda-Morales L, Liu J, Costan A, Pacitti E, Antoniu G, Valduriez P, Mattoso M (2016) Managing hot metadata for scientific workflows on multisite clouds. In: IEEE Int. Conf. on Big Data (Big Data), pp 390–397
Pytorch. Pytorch. https://pytorch.org/ (online). Accessed 13 Mar 2021
Robbins H, Monro S (1951) A stochastic approximation method. Ann Math Stat 22(3):400–407
Romanini D, Hall AJ, Papadopoulos P, Titcombe T, Ismail A, Cebere T, Sandmann R, Roehm R, Hoeh MA (2021) Pyvertical: a vertical federated learning framework for multi-headed splitnn. ArXiv preprint arXiv:2104.00489
Rothchild D, Panda A, Ullah E, Ivkin N, Stoica I, Braverman V, Gonzalez J, Arora R (2020) Fetchsgd: communication-efficient federated learning with sketching. In: Int. conf. on machine learning (ICML), pp 8253–8265
Ryffel T, Trask A, Dahl M, Wagner B, Mancuso J, Rueckert D, Passerat-Palmbach J (2018) A generic framework for privacy preserving deep learning. ArXiv preprint arXiv:1811.04017
Sabater C, Bellet A, Ramon J (2020) Distributed differentially private averaging with improved utility and robustness to malicious parties. ArXiv preprint arXiv:2006.07218
Satariano A. Google is fined $57 million under Europe’s data privacy law. https://www.nytimes.com/2019/01/21/technology/google-europe-gdpr-fine.html (online). Accessed 28 Feb 2021
Sayed AH (2014) Adaptation, learning, and optimization over networks. Found Trends Mach Learn 7(ARTICLE):311–801
Sayed AH, Tu SY, Chen J, Zhao X, Towfic ZJ (2013) Diffusion strategies for adaptation and learning over networks: an examination of distributed strategies and network behavior. IEEE Signal Process Maga 30(3):155–171
Article Google Scholar
Seif M, Tandon R, Li M (2020) Wireless federated learning with local differential privacy. In: IEEE int. symposium on information theory (ISIT), pp 2604–2609
Seneta E (2006) Non-negative matrices and Markov chains. Springer
Shakespeare W (2007) The complete works of William Shakespeare. Wordsworth Editions
Shlezinger N, Chen M, Eldar YC, Poor HV, Cui S (2020) Federated learning with quantization constraints. In: IEEE int. conf. on acoustics, speech and signal processing (ICASSP), pp 8851–8855
Shlezinger N, Chen M, Eldar YC, Poor HV, Cui S (2020) Uveqfed: universal vector quantization for federated learning. IEEE Trans Signal Process 69:500–514
Article MathSciNet Google Scholar
Silverstein J. Hundreds of millions of facebook user records were exposed on amazon cloud server. https://www.cbsnews.com/news/millions-facebook-user-records-exposed-amazon-cloud-server/ (online). Accessed 28 Feb 2021
Spring R, Kyrillidis A, Mohan V, Shrivastava A (2019) Compressing gradient optimizers via count-sketches. In: Int. conf. on machine learning (ICML), pp 5946–5955
Standing Committee of the National People’s Congress. Cybersecurity law of the people’s Republic of China. https://www.newamerica.org/cybersecurity-initiative/digichina/blog/translation-cybersecurity-law-peoples-republic-china/ (online). Accessed 22 Feb 2021
Stich SU, Cordonnier JB, Jaggi M (2018) Sparsified SGD with memory. In: Advances in neural information processing systems (NeurIPS), vol 31
Sun H, Ma X, Hu RQ (2020) Adaptive federated learning with gradient compression in uplink NOMA. IEEE Trans Vehic Technol 69(12):16325–16329
Sun Z, Kairouz P, Suresh AT, McMahan HB (2019) Can you really backdoor federated learning? ArXiv preprint arXiv:1911.07963
Suzumura T, Zhou Y, Baracaldo N, Ye G, Houck K, Kawahara R, Anwar A, Stavarache LL, Watanabe Y, Loyola P, Klyashtorny D (2019) Towards federated graph learning for collaborative financial crimes detection. ArXiv preprint arXiv:1909.12946
Tolpegin V, Truex S, Gursoy ME, Liu L (2020) Data poisoning attacks against federated learning systems. In: European symposium on research in computer security. Springer, pp 480–501
Triastcyn A, Faltings B (2020) Federated generative privacy. IEEE Intell Syst 35(4):50–57
Google Scholar
Truex Stacey, Baracaldo Nathalie, Anwar Ali, Steinke Thomas, Ludwig Heiko, Zhang Rui, Zhou Yi (2019) A hybrid approach to privacy-preserving federated learning. In: ACM workshop on artificial intelligence and security, pp 1–11
Vanhaesebrouck P, Bellet A, Tommasi M (2017) Decentralized collaborative learning of personalized models over networks. In: Int. conf. on artificial intelligence and statistics (AISTATS), pp 509–517
Velickovic P, Cucurull G, Casanova A, Romero A, Lio P, Bengio Y (2018) Graph attention networks. In: Int. conf. on learning representations (ICLR)
Verbraeken J, Wolting M, Katzy J, Kloppenburg J, Verbelen T, Rellermeyer JS (2020) A survey on distributed machine learning. ACM Comput Surv (CSUR) 53(2):1–33
Article Google Scholar
Vishnu A, Siegel C, Daily J (2016) Distributed tensorflow with MPI. ArXiv preprint arXiv:1603.02339
Wainakh A, Guinea AS, Grube T, Mühlhäuser M (2020) Enhancing privacy via hierarchical federated learning. In: IEEE European symposium on security and privacy workshops (EuroS&PW), pp 344–347
Wang B, Li A, Li H, Chen Y (2020) Graphfl: a federated learning framework for semi-supervised node classification on graphs. ArXiv preprint arXiv:2012.04187
Wang C, Chen B, Li G, Wang H (2021) FL-AGCNS: federated learning framework for automatic graph convolutional network search. ArXiv preprint arXiv:2104.04141
Wang G (2019) Interpret federated learning with Shapley values. ArXiv preprint arXiv:1905.04519
Wang H, Yurochkin M, Sun Y, Papailiopoulos D, Khazaeni Y (2020) Federated learning with matched averaging. In: Int. conf. on learning representations (ICLR)
Wang J, Charles Z, Xu Z, Joshi G, McMahan HB, Al-Shedivat M, Andrew G, Avestimehr S, Daly K, Data D, Diggavi S (2021) A field guide to federated optimization. ArXiv preprint arXiv:2107.06917
Wang J, Sahu AK, Yang Z, Joshi G, Kar S (2019) Matcha: speeding up decentralized SGD via matching decomposition sampling. In: Indian control conference (ICC), pp 299–300
Wang L, Xu S, Wang X, Zhu Q (2021) Addressing class imbalance in federated learning. AAAI Confer Artif Intell 35:10165–10173
Wang L, Wang W, Li B (2019) CMFL: mitigating communication overhead for federated learning. In: IEEE int. conf. on distributed computing systems (ICDCS), pp 954–964
Wang Z, Song M, Zhang Z, Song Y, Wang Q, Qi H (2019) Beyond inferring class representatives: User-level privacy leakage from federated learning. In: IEEE conf. on computer communications (INFOCOM), pp 2512–2520
WeBank. Federated AI technology enabler (FATE). https://github.com/FederatedAI/FATE (online). Accessed 16 Feb 2021
WeBank. Federated learning white paper v2.0. https://aisp-1251170195.cos.ap-hongkong.myqcloud.com/wp-content/uploads/pdf/%E8%81%94%E9%82%A6%E5%AD%A6%E4%B9%A0%E7%99%BD%E7%9A%AE%E4%B9%A6_v2.0.pdf (online). Accessed 14 Feb 2021
Wei K, Li J, Ding M, Ma C, Yang HH, Farokhi F, Jin S, Quek TQ, Poor HV (2020) Federated learning with differential privacy: algorithms and performance analysis. IEEE Trans Inf Forens Secur 15:3454–3469
Wu C, Wu F, Cao Y, Huang Y, Xie X (2021) Fedgnn: federated graph neural network for privacy-preserving recommendation. ArXiv preprint arXiv:2102.04925
Wu T, Liu Z, Huang Q, Wang Y, Lin D (2021) Adversarial robustness under long-tailed distribution. In: IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 8659–8668
Xu C, Tao D, Xu C (2013) A survey on multi-view learning. ArXiv preprint arXiv:1304.5634
Xu J, Glicksberg BS, Su C, Walker P, Bian J, Wang F (2021) Federated learning for healthcare informatics. J Healthc Inf Res 5:1–19
Xu J, Du W, Jin Y, He W, Cheng R (2020) Ternary compression for communication-efficient federated learning. IEEE Trans Neural Netw Learn Syst 33(3):1162–1176
Yang Q, Liu Y, Chen T, Tong Y (2019) Federated machine learning: concept and applications. ACM Trans Intell Syst Technol 10(2):1–19
Yi X, Paulet R, Bertino E (2014) Homomorphic encryption. In: Homomorphic encryption and applications. Springer, pp 27–46
Yuan J, Xu M, Ma X, Zhou A, Liu X, Wang S (2020) Hierarchical federated learning through LAN-WAN orchestration. ArXiv preprint arXiv:2010.11612
Yurochkin M, Agarwal M, Ghosh S, Greenewald K, Hoang N, Khazaeni Y (2019) Bayesian nonparametric federated learning of neural networks. In: Int. conf. on machine learning (ICML), pp 7252–7261
Zhang C, Li S, Xia J, Wang W, Yan F, Liu Y (2020) Batchcrypt: efficient homomorphic encryption for cross-silo federated learning. In: USENIX annual technical conference (USENIX ATC), pp 493–506
Zhang C, Bi J, Soda P (2017) Feature selection and resampling in class imbalance learning: Which comes first? An empirical study in the biological domain. In: Int. conf. on bioinformatics and biomedicine (BIBM), pp 933–938
Zhang C, Bi J, Xu S, Ramentol E, Fan G, Qiao B, Fujita H (2019) Multi-imbalance: an open-source software for multi-class imbalance learning. Knowl-Based Syst 174:137–143
Zhang C, Soda P, Bi J, Fan G, Almpanidis G, Garcia S (2021) An empirical study on the joint impact of feature selection and data resampling on imbalance classification. ArXiv preprint arXiv:2109.00201
Zhang H, Shen T, Wu F, Yin M, Yang H, Wu C (2021) Federated graph learning—a position paper. ArXiv preprint arXiv:2105.11099
Zhang T, He C, Ma T, Gao L, Ma M, Avestimehr S (2021) Federated learning for internet of things: a federated learning framework for on-device anomaly data detection. ArXiv preprint arXiv:2106.07976
Zhang X, Li F, Zhang Z, Li Q, Wang C, Wu J (2020) Enabling execution assurance of federated learning at untrusted participants. In: IEEE conf. on computer communications (INFOCOM), pp 1877–1886
Zhao B, Mopuri KR, Bilen H (2020) IDLG: improved deep leakage from gradients. ArXiv preprint arXiv:2001.02610
Zhao Y, Barnaghi P, Haddadi H (2021) Multimodal federated learning. ArXiv preprint arXiv:2109.04833
Zheng L, Zhou J, Chen C, Wu B, Wang L, Zhang B (2021) Asfgnn: automated separated-federated graph neural network. Peer-to-Peer Netw Appl 14(3):1692–1704
Zhou J, Chen C, Zheng L, Wu H, Wu J, Zheng X, Wu B, Liu Z, Wang L (2020) Vertically federated graph neural network for privacy-preserving node classification. ArXiv preprint arXiv:2005.11903
Zhu H, Zhang H, Jin Y (2021) From federated learning to federated neural architecture search: a survey. Complex Intell Syst
Zinkevich M, Weimer M, Li L, Smola A (2010) Parallelized stochastic gradient descent. In: Advances in neural information processing systems (NeurIPS), vol 4. Citeseer, p 4

Author information

Authors and Affiliations

Baidu Inc., Beijing, China
Ji Liu, Jizhou Huang, Xuhong Li, Shilei Ji, Haoyi Xiong & Dejing Dou
Computer Science and Software Engineering Department, Auburn University, Auburn, AL, USA
Yang Zhou
Computer and Information Science Department, University of Oregon, Eugene, OR, USA
Dejing Dou

Corresponding authors

Correspondence to Ji Liu or Dejing Dou.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Liu, J., Huang, J., Zhou, Y. et al. From distributed machine learning to federated learning: a survey. Knowl Inf Syst 64, 885–917 (2022). https://doi.org/10.1007/s10115-022-01664-x

Received: 18 March 2021
Revised: 31 January 2022
Accepted: 05 February 2022
Published: 22 March 2022
Issue Date: April 2022
DOI: https://doi.org/10.1007/s10115-022-01664-x
