
Differentially private SGD with random features

Published in Applied Mathematics-A Journal of Chinese Universities

Abstract

In large-scale machine learning, it is crucial to explore methods that reduce computational complexity and memory demands while maintaining generalization performance. Moreover, since the collected data may contain sensitive information, it is also of great significance to study privacy-preserving machine learning algorithms. This paper studies the performance of a differentially private stochastic gradient descent (SGD) algorithm based on random features. The algorithm first maps the original data into a low-dimensional feature space, thereby avoiding the large storage requirements of traditional kernel methods on large-scale data. It then optimizes the parameters iteratively by stochastic gradient descent. Finally, the output perturbation mechanism is employed to inject random noise, ensuring the privacy of the algorithm. We prove that the proposed algorithm satisfies differential privacy while achieving fast convergence rates under some mild conditions.
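The full text is not reproduced here, but the abstract already describes the three ingredients of the method: a random-feature approximation of the kernel, plain SGD in the resulting low-dimensional space, and output perturbation of the final iterate. The Python sketch below illustrates that pipeline under explicit assumptions of our own: a Gaussian-kernel random Fourier feature map, a least-squares loss, a one-pass decaying step size, per-example gradient clipping, and a Gaussian noise scale left as a free parameter. The function names and all hyperparameters are illustrative placeholders, not the paper's actual construction or privacy calibration.

```python
import numpy as np

def rff_map(X, W, b):
    """Random Fourier feature map approximating a Gaussian kernel:
    phi(x) = sqrt(2/D) * cos(W x + b)."""
    D = W.shape[0]
    return np.sqrt(2.0 / D) * np.cos(X @ W.T + b)

def dp_sgd_random_features(X, y, D=200, sigma_rbf=1.0, step=0.1,
                           clip=1.0, noise_scale=0.1, seed=0):
    """One-pass SGD on a least-squares loss in random-feature space,
    followed by output perturbation (Gaussian noise added to the final
    iterate). All hyperparameters are illustrative, not the paper's choices."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Sample the random features once: frequencies ~ N(0, 1/sigma^2), phases ~ U[0, 2*pi).
    W = rng.normal(scale=1.0 / sigma_rbf, size=(D, d))
    b = rng.uniform(0.0, 2.0 * np.pi, size=D)
    theta = np.zeros(D)
    for t in range(n):                        # one pass over the sample
        phi = rff_map(X[t:t + 1], W, b)[0]    # feature vector of example t
        grad = (phi @ theta - y[t]) * phi     # gradient of 1/2 * (phi^T theta - y_t)^2
        # Clip the per-example gradient so each update has bounded sensitivity.
        grad *= min(1.0, clip / (np.linalg.norm(grad) + 1e-12))
        theta -= step / np.sqrt(t + 1) * grad # decaying step size
    # Output perturbation: release theta plus Gaussian noise. In the paper the
    # noise is calibrated to the sensitivity of the final iterate to obtain a
    # differential privacy guarantee; here noise_scale is simply a free knob.
    theta_private = theta + rng.normal(scale=noise_scale, size=D)
    return theta_private, (W, b)

# Illustrative usage on synthetic data:
# rng = np.random.default_rng(1)
# X = rng.normal(size=(500, 5)); y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=500)
# theta, (W, b) = dp_sgd_random_features(X, y)
# y_pred = rff_map(X, W, b) @ theta
```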



Author information


Corresponding author

Correspondence to Zheng-chu Guo.

Ethics declarations

Conflict of interest: The authors declare no conflict of interest.

Additional information

This work is supported by the Zhejiang Provincial Natural Science Foundation of China (LR20A010001) and the National Natural Science Foundation of China (12271473 and U21A20426).


About this article


Cite this article

Wang, Yg., Guo, Zc. Differentially private SGD with random features. Appl. Math. J. Chin. Univ. 39, 1–23 (2024). https://doi.org/10.1007/s11766-024-5037-0

