Exact and Efficient Bayesian Inference for Privacy Risk Quantification

Rønneberg, Rasmus C.; Pardo, Raúl; Wąsowski, Andrzej

doi:10.1007/978-3-031-47115-5_15


Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14323)

Included in the following conference series:

International Conference on Software Engineering and Formal Methods


Abstract

Data analysis has high value both for commercial and research purposes. However, disclosing analysis results may pose severe privacy risk to individuals. Privug is a method to quantify privacy risks of data analytics programs by analyzing their source code. The method uses probability distributions to model attacker knowledge and Bayesian inference to update said knowledge based on observable outputs. Currently, Privug uses Markov Chain Monte Carlo (MCMC) to perform inference, which is a flexible but approximate solution. This paper presents an exact Bayesian inference engine based on multivariate Gaussian distributions to accurately and efficiently quantify privacy risks. The inference engine is implemented for a subset of Python programs that can be modeled as multivariate Gaussian models. We evaluate the method by analyzing privacy risks in programs to release public statistics. The evaluation shows that our method accurately and efficiently analyzes privacy risks, and outperforms existing methods. Furthermore, we demonstrate the use of our engine to analyze the effect of differential privacy in public statistics.
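The kind of closed-form update that makes exact inference possible for Gaussian models can be illustrated with a minimal sketch. It is not the Privug engine or its API: the function name `posterior_given_linear_output`, the prior parameters, and the observed value are all hypothetical, and the code only applies the standard conditioning formula for a multivariate Gaussian to a program that releases a (possibly noisy) linear statistic such as a mean.

```python
import numpy as np

def posterior_given_linear_output(mu, sigma, a, observed, noise_var=0.0):
    """Condition x ~ N(mu, sigma) on observing a @ x + noise = observed.

    Hypothetical helper for illustration: noise is zero-mean Gaussian with
    variance noise_var, independent of x (noise_var=0 means exact release).
    """
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    a = np.asarray(a, dtype=float)

    cov_xo = sigma @ a                  # Cov(x, output), a vector
    var_o = a @ cov_xo + noise_var      # Var(output), a scalar
    gain = cov_xo / var_o               # how strongly the output informs each x_i

    mu_post = mu + gain * (observed - a @ mu)
    sigma_post = sigma - np.outer(gain, cov_xo)
    return mu_post, sigma_post

# Assumed attacker prior: four individuals, each value ~ N(35, 10^2), independent.
n = 4
prior_mu = np.full(n, 35.0)
prior_sigma = np.eye(n) * 100.0
mean_query = np.full(n, 1.0 / n)        # the released statistic is the average

# Attacker knowledge after seeing an exact average of 40.
post_mu, post_sigma = posterior_given_linear_output(
    prior_mu, prior_sigma, mean_query, observed=40.0)

# Same release with additive Gaussian noise of variance 25 on the output.
noisy_mu, noisy_sigma = posterior_given_linear_output(
    prior_mu, prior_sigma, mean_query, observed=40.0, noise_var=25.0)

print("posterior variance per individual (exact release):", np.diag(post_sigma))
print("posterior variance per individual (noisy release):", np.diag(noisy_sigma))
```

Under these assumed numbers, the exact release reduces the attacker's uncertainty about each individual more than the noisy release does, which is the kind of comparison a privacy risk analysis would make. No sampling is involved, so the result carries no Monte Carlo error.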

Work partially supported by funding from the topic Engineering Secure Systems of the Helmholtz Association (HGF), the KASTEL Security Research Labs and the Danish Villum Foundation through Villum Experiment project No. 0002302.



Author information

Authors and Affiliations

Karlsruhe Institute of Technology, Karlsruhe, Germany
Rasmus C. Rønneberg
IT University of Copenhagen, Copenhagen, Denmark
Raúl Pardo & Andrzej Wąsowski


Corresponding author

Correspondence to Rasmus C. Rønneberg.

Editor information

Editors and Affiliations

NOVA University Lisbon, Caparica, Portugal
Carla Ferreira
Eindhoven University of Technology, Eindhoven, The Netherlands
Tim A. C. Willemse


Cite this paper

Rønneberg, R.C., Pardo, R., Wąsowski, A. (2023). Exact and Efficient Bayesian Inference for Privacy Risk Quantification. In: Ferreira, C., Willemse, T.A.C. (eds) Software Engineering and Formal Methods. SEFM 2023. Lecture Notes in Computer Science, vol 14323. Springer, Cham. https://doi.org/10.1007/978-3-031-47115-5_15


DOI: https://doi.org/10.1007/978-3-031-47115-5_15
Published: 31 October 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-47114-8
Online ISBN: 978-3-031-47115-5
eBook Packages: Computer Science, Computer Science (R0)
