Privacy-Preserving Vertical Federated Learning

Abstract

Many federated learning (FL) proposals follow the structure of horizontal FL, where each party holds all the information needed to train a model. In important real-world FL scenarios, however, parties do not all have access to the same information, and not every party has what is required to train a machine learning model. In these vertical scenarios, multiple parties provide disjoint sets of information that, when brought together, form a full feature set with labels that can be used for training. Legislation, practical considerations, and privacy requirements inhibit moving all data to a single place or sharing it freely among parties, and horizontal FL techniques cannot be applied to vertical settings. This chapter discusses the use cases and challenges of vertical FL. It introduces the most important approaches for vertical FL and describes in detail FedV, an efficient solution for secure gradient computation for popular ML models. FedV is designed to overcome some of the pitfalls inherent in applying existing state-of-the-art techniques. Using FedV substantially reduces training time and the amount of data transferred, enabling the use of vertical FL in more real-world use cases.
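To make the vertical setting concrete, here is a minimal Python sketch of two parties that hold disjoint feature columns for the same samples. It is not FedV and includes no cryptography. The party names, feature values, and weights are invented for illustration. It only shows that a linear model's score splits into per-party partial inner products. A secure vertical FL protocol would need to combine those partial results without revealing them in the clear.

```python
# Illustrative sketch only: party names, feature values, and weights are made up.

# Party A holds two features per sample; party B holds one feature plus the labels.
party_a_features = {"id1": [1.0, 2.0], "id2": [0.5, -1.0]}
party_b_features = {"id1": [3.0], "id2": [4.0]}
party_b_labels = {"id1": 1.0, "id2": 0.0}

# Each party also owns the slice of the weight vector matching its features.
weights_a = [0.1, -0.2]
weights_b = [0.3]


def partial_score(features, weights):
    """Inner product a party can compute locally on its own feature slice."""
    return sum(x * w for x, w in zip(features, weights))


def joint_prediction(sample_id):
    """Sum of the per-party partial scores for one aligned sample."""
    return (partial_score(party_a_features[sample_id], weights_a)
            + partial_score(party_b_features[sample_id], weights_b))


def centralized_prediction(sample_id):
    """Reference value computed as if all features sat in one place."""
    full_features = party_a_features[sample_id] + party_b_features[sample_id]
    full_weights = weights_a + weights_b
    return partial_score(full_features, full_weights)
```

In this toy setup `joint_prediction` and `centralized_prediction` agree for every sample, which is why vertical approaches can work on per-party partial results instead of pooling raw features. Exchanging the partial scores in plaintext, as done here, would not meet the privacy requirements described above.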


Notes

  1. The term honest-but-curious has the same meaning as the term semi-honest in the cryptography community.

Corresponding author

Correspondence to Runhua Xu.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Xu, R., Baracaldo, N., Zhou, Y., Abay, A., Anwar, A. (2022). Privacy-Preserving Vertical Federated Learning. In: Ludwig, H., Baracaldo, N. (eds) Federated Learning. Springer, Cham. https://doi.org/10.1007/978-3-030-96896-0_18

  • DOI: https://doi.org/10.1007/978-3-030-96896-0_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-96895-3

  • Online ISBN: 978-3-030-96896-0

  • eBook Packages: Computer Science, Computer Science (R0)