
Training Differentially Private Neural Networks with Lottery Tickets

Part of the Lecture Notes in Computer Science book series (LNSC, volume 12973)


We propose the differentially private lottery ticket hypothesis (DPLTH), an end-to-end differentially private training paradigm based on the lottery ticket hypothesis, designed specifically to improve the privacy-utility trade-off in differentially private neural networks. DPLTH, using high-quality winning tickets privately selected via our custom score function, outperforms current methods by a margin greater than 20%. We further show that DPLTH converges faster, allowing for early stopping with reduced privacy budget consumption, and that a single publicly available dataset for ticket generation is enough to enhance utility on multiple datasets of varying properties and from varying domains. Our extensive evaluation on six public datasets provides evidence for our claims.


  • Differential privacy
  • Lottery ticket hypothesis
  • Differential privacy in neural networks
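The lottery-ticket step at the heart of DPLTH, extracting a sparse winning ticket from a network, can be illustrated with plain magnitude pruning. This is a minimal, non-private sketch; the function name, interface, and the 10% keep fraction (footnote 5 reports winning tickets with at most 10% of the full model's parameters) are illustrative, not the paper's API.

```python
def winning_ticket_mask(weights, keep_fraction=0.1):
    """Return a boolean mask keeping the largest-magnitude weights.

    Magnitude pruning in the spirit of the lottery ticket hypothesis:
    the weights surviving the mask form the candidate "ticket". This
    sketch operates on a flat list of floats; a real implementation
    would work per-layer on tensors.
    """
    k = max(1, int(len(weights) * keep_fraction))
    # the k-th largest absolute value becomes the pruning threshold
    threshold = sorted((abs(w) for w in weights), reverse=True)[k - 1]
    return [abs(w) >= threshold for w in weights]

# Keep the top 10% of a 10-weight toy "network": only the single
# largest-magnitude weight (2.0, at index 3) survives.
mask = winning_ticket_mask([0.5, -0.1, 0.05, 2.0, -0.01, 0.3, 0.02, -0.9, 0.004, 0.7])
```

Ties at the threshold keep slightly more than `keep_fraction` of the weights; lottery-ticket implementations typically also prune layer-wise and iteratively rather than in one shot.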

  • DOI: 10.1007/978-3-030-88428-4_27
  • Chapter length: 20 pages


  1. Code for DPLTH will be made publicly available at

  2. As we are only composing two mechanisms, advanced composition is not necessary.

  5. DPLTH consistently selects winning tickets with total parameters \({\le }10\%\) of the full model.
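Footnote 2's point can be checked numerically: for two mechanisms, basic sequential composition (the privacy budgets simply add) is already tighter than the standard advanced composition bound from the DP literature. The epsilon and delta values below are illustrative only, not the paper's settings.

```python
import math

def basic_composition(eps_list):
    """Basic sequential composition: the total budget is the sum."""
    return sum(eps_list)

def advanced_composition(eps, k, delta_prime):
    """Standard advanced composition bound for k adaptive uses of an
    eps-DP mechanism: eps' = eps*sqrt(2k ln(1/delta')) + k*eps*(e^eps - 1),
    at the cost of an extra delta' failure probability."""
    return eps * math.sqrt(2 * k * math.log(1 / delta_prime)) + k * eps * (math.exp(eps) - 1)

eps = 0.5  # illustrative per-mechanism budget
basic = basic_composition([eps, eps])           # 1.0
advanced = advanced_composition(eps, 2, 1e-5)   # roughly 4.0, i.e. looser
```

Advanced composition only pays off when many mechanisms are composed; with k = 2, as in a selection-then-training pipeline, summing the two budgets is the better accounting.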




This research is in part supported by a CGS-D award and a Discovery Grant from the Natural Sciences and Engineering Research Council of Canada.

Correspondence to Lovedeep Gondara.

Theorem 1

Phase 2 (Selecting a winning ticket) is \(\epsilon _1\)-differentially private.


Proof. We consider the scenario where the exponential mechanism (EM) outputs some element \(r \in \mathcal {R}\) on two neighbouring datasets \(X, X'\).

$$\begin{aligned} \dfrac{Pr[\mathcal {M}(X,u,\mathcal {R}) = r]}{Pr[\mathcal {M}(X',u,\mathcal {R}) = r]} = \dfrac{\bigg ( \dfrac{\exp (\dfrac{\epsilon _1 u (X,r)}{2 \varDelta u})}{\sum _{r' \in \mathcal {R}} \exp (\dfrac{\epsilon _1 u (X,r')}{2 \varDelta u})} \bigg )}{\bigg ( \dfrac{\exp (\dfrac{\epsilon _1 u (X',r)}{2 \varDelta u})}{\sum _{r' \in \mathcal {R}} \exp (\dfrac{\epsilon _1 u (X',r')}{2 \varDelta u})} \bigg )} \end{aligned}$$
$$\begin{aligned} = \bigg ( \dfrac{\exp (\dfrac{\epsilon _1 u (X,r)}{2 \varDelta u})}{\exp (\dfrac{\epsilon _1 u (X',r)}{2 \varDelta u})} \bigg ) \cdot \bigg ( \dfrac{\sum _{r' \in \mathcal {R}} \exp (\dfrac{\epsilon _1 u (X',r')}{2 \varDelta u})}{\sum _{r' \in \mathcal {R}} \exp (\dfrac{\epsilon _1 u (X,r')}{2 \varDelta u})} \bigg ) \end{aligned}$$
$$\begin{aligned} = \exp \bigg ( \dfrac{\epsilon _1 (u(X,r) - u(X',r))}{2 \varDelta u} \bigg ) \cdot \bigg ( \dfrac{\sum _{r' \in \mathcal {R}} \exp (\dfrac{\epsilon _1 u (X',r')}{2 \varDelta u})}{\sum _{r' \in \mathcal {R}} \exp (\dfrac{\epsilon _1 u (X,r')}{2 \varDelta u})} \bigg ) \end{aligned}$$
$$\begin{aligned} \le \exp (\dfrac{\epsilon _1}{2}) \cdot \bigg ( \dfrac{\sum _{r' \in \mathcal {R}} \exp (\dfrac{\epsilon _1}{2}) \exp (\dfrac{\epsilon _1 u (X,r')}{2 \varDelta u})}{\sum _{r' \in \mathcal {R}} \exp (\dfrac{\epsilon _1 u (X,r')}{2 \varDelta u})} \bigg ) \end{aligned}$$
$$\begin{aligned} = \exp (\epsilon _1), \end{aligned}$$

where the first factor is bounded by \(\exp (\epsilon _1 / 2)\) since \(u(X,r) - u(X',r) \le \varDelta u\), and each numerator term of the second factor is bounded using \(u(X',r') \le u(X,r') + \varDelta u\).

   \(\square \)
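The mechanism analysed in Theorem 1 amounts to sampling a candidate ticket with probability proportional to \(\exp (\epsilon _1 u(X,r) / (2 \varDelta u))\). A minimal sketch, assuming a list of utility scores and a known sensitivity (the function name and interface are ours, not the paper's):

```python
import math
import random

def exponential_mechanism(scores, epsilon, sensitivity, rng=random):
    """Privately select the index of one candidate.

    scores[i] is the utility u(X, r_i) of candidate r_i on dataset X;
    sensitivity is Delta u. Each index is sampled with probability
    proportional to exp(epsilon * u / (2 * Delta u)).
    """
    max_s = max(scores)
    # shift scores by the max for numerical stability;
    # the shift cancels in the normalised probabilities
    weights = [math.exp(epsilon * (s - max_s) / (2 * sensitivity)) for s in scores]
    threshold = rng.random() * sum(weights)
    cumulative = 0.0
    for i, w in enumerate(weights):
        cumulative += w
        if cumulative >= threshold:
            return i
    return len(scores) - 1  # guard against floating-point round-off
```

With a large epsilon the highest-utility candidate is selected almost surely; as epsilon tends to 0 the selection approaches uniform, which is the usual privacy-utility dial of the exponential mechanism.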

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Gondara, L., Carvalho, R.S., Wang, K. (2021). Training Differentially Private Neural Networks with Lottery Tickets. In: Bertino, E., Shulman, H., Waidner, M. (eds.) Computer Security – ESORICS 2021. Lecture Notes in Computer Science, vol. 12973. Springer, Cham.


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-88427-7

  • Online ISBN: 978-3-030-88428-4

  • eBook Packages: Computer Science, Computer Science (R0)