Privacy-Preserving Data Generation and Sharing Using Identification Sanitizer

Wang, Shuo; Lyu, Lingjuan; Chen, Tianle; Chen, Shangyu; Nepal, Surya; Rudolph, Carsten; Grobler, Marthie

doi:10.1007/978-3-030-62008-0_13

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12343)

Included in the following conference series:

International Conference on Web Information Systems Engineering

Abstract

In this paper, we propose a practical privacy-preserving generative model for data sanitization and sharing, called Sanitizer-Variational Autoencoder (SVAE). We assume that the data consist of identification-relevant and identification-irrelevant components. A variational autoencoder (VAE) based sanitization model is proposed to strip the identification-relevant features and retain only the identification-irrelevant components in a privacy-preserving manner. The sanitization allows for task-relevant discrimination (utility) while minimizing the leakage of personal identification information (privacy). We conduct extensive empirical evaluations on real-world face, biometric-signal, and speech datasets, and validate the effectiveness of the proposed SVAE as well as its robustness against membership inference attacks.
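
As a rough illustration of the idea in the abstract, the sketch below shows the two standard VAE building blocks (reparameterized sampling and the KL term against a standard normal prior) together with a hypothetical sanitization step. The abstract does not say how SVAE separates or strips the identification-relevant components, so `sanitize_latent`, the `id_dims` index list, and the choice to resample those dimensions from the prior are assumptions made only for illustration, not the authors' method.

```python
import math
import random


def reparameterize(mu, logvar, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1), per latent dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, logvar)]


def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), the usual VAE regularizer."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0
                     for m, lv in zip(mu, logvar))


def sanitize_latent(z, id_dims, rng):
    """Hypothetical sanitizer (an assumption, not the paper's mechanism):
    resample the dimensions assumed to carry identity from the prior and
    leave the remaining, identification-irrelevant dimensions untouched."""
    out = list(z)
    for i in id_dims:
        out[i] = rng.gauss(0.0, 1.0)
    return out


rng = random.Random(0)
mu = [0.5, -1.0, 0.2, 0.0]        # made-up encoder means
logvar = [0.0, -0.5, 0.1, 0.0]    # made-up encoder log-variances
z = reparameterize(mu, logvar, rng)
z_clean = sanitize_latent(z, id_dims=[0, 1], rng=rng)
```

In this toy setup, `z_clean` keeps the last two coordinates of `z` and replaces the first two, so anything a decoder produces from `z_clean` could depend only on the retained coordinates and fresh noise. A trained model would have to learn which coordinates carry identity, which this sketch simply takes as given.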

Author information

Authors and Affiliations

CSIRO’s Data61, Melbourne, Australia
Shuo Wang, Surya Nepal & Marthie Grobler
Monash University, Melbourne, Australia
Shuo Wang, Tianle Chen & Carsten Rudolph
National University of Singapore, Singapore, Singapore
Lingjuan Lyu
University of Melbourne, Melbourne, Australia
Shangyu Chen

Corresponding author

Correspondence to Shuo Wang.

Editor information

Editors and Affiliations

VU Amsterdam, Amsterdam, The Netherlands
Zhisheng Huang
VU Amsterdam, Amsterdam, The Netherlands
Wouter Beek
Victoria University, Melbourne, VIC, Australia
Hua Wang
Swinburne University of Technology, Hawthorn, VIC, Australia
Rui Zhou
Victoria University, Melbourne, VIC, Australia
Yanchun Zhang

About this paper

Cite this paper

Wang, S. et al. (2020). Privacy-Preserving Data Generation and Sharing Using Identification Sanitizer. In: Huang, Z., Beek, W., Wang, H., Zhou, R., Zhang, Y. (eds) Web Information Systems Engineering – WISE 2020. WISE 2020. Lecture Notes in Computer Science, vol 12343. Springer, Cham. https://doi.org/10.1007/978-3-030-62008-0_13

DOI: https://doi.org/10.1007/978-3-030-62008-0_13
Published: 21 October 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-62007-3
Online ISBN: 978-3-030-62008-0
eBook Packages: Computer Science, Computer Science (R0)