Abstract
Federated computation is an emerging area that seeks to provide stronger privacy for user data by performing large-scale, distributed computations in which the data remains in the hands of users. Only the necessary summary information is shared, and additional security and privacy tools can be employed to provide strong guarantees of secrecy. The most prominent application of federated computation is training machine learning models (federated learning), but many additional applications are emerging that are more broadly relevant to data management and querying. This survey gives an overview of federated computation models and algorithms. It includes an introduction to security and privacy techniques and guarantees, and shows how they can be applied to a variety of distributed computations that provide statistics and insights into distributed data. It also discusses the issues that arise when implementing systems to support federated computation, and open problems for future research.
Author information
Contributions
AB and GC prepared the tutorial upon which this survey is based, and collaborated to write and review the manuscript.
Ethics declarations
Competing interests
The authors are employees of Meta Platforms, Inc (“Meta”).
About this article
Cite this article
Bharadwaj, A., Cormode, G. Federated computation: a survey of concepts and challenges. Distrib Parallel Databases (2023). https://doi.org/10.1007/s10619-023-07438-w
DOI: https://doi.org/10.1007/s10619-023-07438-w