
Tight Differential Privacy Guarantees for the Shuffle Model with k-Randomized Response

  • Conference paper
  • Foundations and Practice of Security (FPS 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14551)


Abstract

Most differentially private algorithms assume a central model in which a reliable third party inserts noise to queries made on datasets, or a local model where the data owners directly perturb their data. However, the central model is vulnerable via a single point of failure, and the local model has the disadvantage that the utility of the data deteriorates significantly. The recently proposed shuffle model is an intermediate framework between the central and local paradigms. In the shuffle model, data owners send their locally privatized data to a server where messages are shuffled randomly, making it impossible to trace the link between a privatized message and the corresponding sender. In this paper, we theoretically derive the tightest known differential privacy guarantee for the shuffle models with k-Randomized Response (k-RR) local randomizers, under histogram queries, and we denoise the histogram produced by the shuffle model using the matrix inversion method to evaluate the utility of the privacy mechanism. We perform experiments on both synthetic and real data to compare the privacy-utility trade-off of the shuffle model with that of the central one privatized by adding the state-of-the-art Gaussian noise to each bin. We see that the difference in statistical utilities between the central and the shuffle models shows that they are almost comparable under the same level of differential privacy protection.


Notes

  1. The inverse of a k-RR mechanism always exists [1, 13].

  2. Where \(\delta \) is correspondingly obtained using Result 1.

  3. We consider Total Variation Distance for our experiments.

References

  1. Agrawal, R., Srikant, R., Thomas, D.: Privacy preserving OLAP. In: Proceedings of the 2005 ACM SIGMOD International Conference on Management of Data, pp. 251–262 (2005)

  2. Balcer, V., Cheu, A.: Separating local & shuffled differential privacy via histograms. arXiv preprint arXiv:1911.06879 (2019)

  3. Balle, B., Bell, J., Gascón, A., Nissim, K.: The privacy blanket of the shuffle model. In: Boldyreva, A., Micciancio, D. (eds.) CRYPTO 2019. LNCS, vol. 11693, pp. 638–667. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-26951-7_22


  4. Balle, B., Wang, Y.X.: Improving the Gaussian mechanism for differential privacy: analytical calibration and optimal denoising. In: International Conference on Machine Learning, pp. 394–403. PMLR (2018)

  5. Bittau, A., et al.: Prochlo: strong privacy for analytics in the crowd. In: Proceedings of the 26th Symposium on Operating Systems Principles, pp. 441–459 (2017)


  6. Cheu, A.: Differential privacy in the shuffle model: a survey of separations. arXiv preprint arXiv:2107.11839 (2021)

  7. Cheu, A., Smith, A., Ullman, J., Zeber, D., Zhilyaev, M.: Distributed differential privacy via shuffling. In: Ishai, Y., Rijmen, V. (eds.) EUROCRYPT 2019. LNCS, vol. 11476, pp. 375–403. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-17653-2_13


  8. Duchi, J.C., Jordan, M.I., Wainwright, M.J.: Local privacy and statistical minimax rates. In: 2013 IEEE 54th Annual Symposium on Foundations of Computer Science, pp. 429–438. IEEE (2013)


  9. Dwork, C., McSherry, F., Nissim, K., Smith, A.: Calibrating noise to sensitivity in private data analysis. In: Halevi, S., Rabin, T. (eds.) TCC 2006. LNCS, vol. 3876, pp. 265–284. Springer, Heidelberg (2006). https://doi.org/10.1007/11681878_14


  10. Erlingsson, Ú., Feldman, V., Mironov, I., Raghunathan, A., Talwar, K., Thakurta, A.: Amplification by shuffling: From local to central differential privacy via anonymity. In: Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 2468–2479. SIAM (2019)


  11. Feldman, V., McMillan, A., Talwar, K.: Hiding among the clones: A simple and nearly optimal analysis of privacy amplification by shuffling. arXiv preprint arXiv:2012.12803 (2020)

  12. The Gowalla dataset (2011). https://snap.stanford.edu/data/loc-gowalla.html. Accessed 10 Aug 2021

  13. Kairouz, P., Bonawitz, K., Ramage, D.: Discrete distribution estimation under local privacy. In: International Conference on Machine Learning, pp. 2436–2444. PMLR (2016)


  14. Koskela, A., Heikkilä, M.A., Honkela, A.: Tight accounting in the shuffle model of differential privacy. arXiv preprint arXiv:2106.00477 (2021)

  15. Sommer, D.M., Meiser, S., Mohammadi, E.: Privacy loss classes: the central limit theorem in differential privacy. Proc. Priv. Enhancing Technol. 2019(2), 245–269 (2019)



Acknowledgment

The work is supported by the European Research Council (ERC) project HYPATIA under the European Union's Horizon 2020 research and innovation programme, grant agreement no. 835294, and by ELSA - European Lighthouse on Secure and Safe AI, funded by the European Union under grant agreement no. 101070617.

Author information


Corresponding author

Correspondence to Kangsoo Jung.


Appendices

A Proof of Theorem 1

Setting \(p=\mathbb {P}[x_0|x_0],\,\overline{p}=\mathbb {P}[x_0|y\ne x_0]\) in \(\mathcal {R}_{{\text {kRR}}},\,\forall s\in [n]\), \(\mathbb {P}[\mathcal {M}_{x_0}(x_0)=s]\)

$$\begin{aligned} {}&=p\sum \limits _{r=0}^{s-1}\left[ \left( {\begin{array}{c}n_{x_0}\\ r\end{array}}\right) p^r(1-p)^{n_{x_0}-r}\left( {\begin{array}{c}n-1-n_{x_0}\\ s-1-r\end{array}}\right) \overline{p}^{s-1-r}(1-\overline{p})^{n-n_{x_0}-s+r}\right] \nonumber \\ {}&+(1-p)\sum \limits _{r=0}^{s}\left[ \left( {\begin{array}{c}n_{x_0}\\ r\end{array}}\right) p^r(1-p)^{n_{x_0}-r}\left( {\begin{array}{c}n-1-n_{x_0}\\ s-r\end{array}}\right) \overline{p}^{s-r}(1-\overline{p})^{n-n_{x_0}-1-s+r}\right] \nonumber \\ {}&=\frac{e^{\epsilon _0}}{e^{\epsilon _0}+k-1}\sum \limits _{r=0}^{s-1}\left[ \left( {\begin{array}{c}n_{x_0}\\ r\end{array}}\right) \frac{e^{r\epsilon _{0}}(k-1)^{n_{x_0}-r}}{(e^{\epsilon _0}+k-1)^{n_{x_0}}}\left( {\begin{array}{c}n-1-n_{x_0}\\ s-1-r\end{array}}\right) \frac{(e^{\epsilon _0}+k-2)^{n-n_{x_0}-s+r}}{(e^{\epsilon _0}+k-1)^{n-1-n_{x_0}}}\right] \nonumber \\ {}&+\frac{k-1}{e^{\epsilon _0}+k-1}\sum \limits _{r=0}^{s}\left[ \left( {\begin{array}{c}n_{x_0}\\ r\end{array}}\right) \frac{e^{r\epsilon _{0}}(k-1)^{n_{x_0}-r}}{(e^{\epsilon _0}+k-1)^{n_{x_0}}}\left( {\begin{array}{c}n-1-n_{x_0}\\ s-r\end{array}}\right) \frac{(e^{\epsilon _0}+k-2)^{n-n_{x_0}-1-s+r}}{(e^{\epsilon _0}+k-1)^{n-1-n_{x_0}}}\right] \nonumber \\ {}&=\frac{e^{\epsilon _0}(k-1)^{n_{x_0}}(e^{\epsilon _0}+k-2)^{n-n_{x_0}-s}}{(e^{\epsilon _0}+k-1)^n} \sum \limits _{r=0}^{s-1}\left( {\begin{array}{c}n_{x_0}\\ r\end{array}}\right) \left( {\begin{array}{c}n-1-n_{x_0}\\ s-1-r\end{array}}\right) \kappa _1^r\nonumber \\ {}&+\frac{(k-1)^{n_{x_0}+1}(e^{\epsilon _0}+k-2)^{n-n_{x_0}-1-s}}{(e^{\epsilon _0}+k-1)^n} \sum \limits _{r=0}^{s}\left( {\begin{array}{c}n_{x_0}\\ r\end{array}}\right) \left( {\begin{array}{c}n-1-n_{x_0}\\ s-r\end{array}}\right) \kappa _1^r\nonumber \\ {}&=\kappa _3\left[ e^{\epsilon _0}\sum \limits _{r=0}^{s-1}\left( {\begin{array}{c}n_{x_0}\\ r\end{array}}\right) \left( {\begin{array}{c}n-1-n_{x_0}\\ s-1-r\end{array}}\right) \kappa _1^r +\kappa _2\sum \limits 
_{r=0}^{s}\left( {\begin{array}{c}n_{x_0}\\ r\end{array}}\right) \left( {\begin{array}{c}n-1-n_{x_0}\\ s-r\end{array}}\right) \kappa _1^r\right] \nonumber \\ {}&\text {Using elementary combinatorial identities, we reduce to:}\nonumber \\ {}&\kappa _3\left[ \kappa _2\sum \limits _{r=0}^{s}\left( {\begin{array}{c}n_{x_0}\\ r\end{array}}\right) \kappa _1^r\left( \left( {\begin{array}{c}n-1-n_{x_0}\\ s-1-r\end{array}}\right) +\left( {\begin{array}{c}n-1-n_{x_0}\\ s-r\end{array}}\right) \right) \right. \nonumber \\ {}&\left. +(e^{\epsilon _{0}}-\kappa _2)\left( \sum \limits _{r=0}^{s-1}\left( {\begin{array}{c}n_{x_0}\\ r\end{array}}\right) \left( {\begin{array}{c}n-1-n_{x_0}\\ s-1-r\end{array}}\right) \kappa _1^r\right) \right] \nonumber \\{}&=\kappa _3\left[ \kappa _2\sum \limits _{r=0}^{s}\left( {\begin{array}{c}n_{x_0}\\ r\end{array}}\right) \left( {\begin{array}{c}n-n_{x_0}\\ s-r\end{array}}\right) \kappa _1^r +(e^{\epsilon _{0}}-\kappa _2)\left( \sum \limits _{r=0}^{s}\left( {\begin{array}{c}n_{x_0}\\ r\end{array}}\right) \left( {\begin{array}{c}n-1-n_{x_0}\\ s-1-r\end{array}}\right) \kappa _1^r\right) \right] \nonumber \\ {}&=\kappa _3\left[ \kappa _2\sum \limits _{r=0}^s\left( {\begin{array}{c}n_{x_0}\\ r\end{array}}\right) \left( {\begin{array}{c}n-n_{x_0}\\ s-r\end{array}}\right) \kappa _1^r +\frac{e^{\epsilon _0}-\kappa _2}{n-n_{x_0}}\sum \limits _{r=0}^s(s-r)\left( {\begin{array}{c}n_{x_0}\\ r\end{array}}\right) \left( {\begin{array}{c}n-n_{x_0}\\ s-r\end{array}}\right) \kappa _1^r\right] \nonumber \\ {}&=\frac{\kappa _3}{n-n_{x_0}}\sum \limits _{r=0}^s\mu (s,r)\tau _r \text { [} \mu \text { and } \tau \text { are as in Definition 11]} \end{aligned}$$
(8)

By similar arguments as above, for any \(s\in \{0,\ldots ,n\}\), \(\mathbb {P}[\mathcal {M}_{x_0}(x_1)=s]\)

$$\begin{aligned} {}&=\frac{1}{e^{\epsilon _0}+k-1}\sum \limits _{r=0}^{s-1}\left[ \left( {\begin{array}{c}n_{x_0}\\ r\end{array}}\right) \frac{e^{r\epsilon _{0}}(k-1)^{n_{x_0}-r}}{(e^{\epsilon _0}+k-1)^{n_{x_0}}} \left( {\begin{array}{c}n-1-n_{x_0}\\ s-r\end{array}}\right) \frac{(e^{\epsilon _0}+k-2)^{n-n_{x_0}-s+r}}{(e^{\epsilon _0}+k-1)^{n-1-n_{x_0}}}\right] \nonumber \\ {}&+\frac{e^{\epsilon _0}+k-2}{e^{\epsilon _0}+k-1}\sum \limits _{r=0}^{s}\left[ \left( {\begin{array}{c}n_{x_0}\\ r\end{array}}\right) \frac{e^{r\epsilon _{0}}(k-1)^{n_{x_0}-r}}{(e^{\epsilon _0}+k-1)^{n_{x_0}}}\left( {\begin{array}{c}n-1-n_{x_0}\\ s-1-r\end{array}}\right) \frac{(e^{\epsilon _0}+k-2)^{n-n_{x_0}-1-s+r}}{(e^{\epsilon _0}+k-1)^{n-1-n_{x_0}}}\right] \nonumber \\ {}&=\kappa _3\left( \sum \limits _{r=0}^{s}\left( {\begin{array}{c}n_{x_0}\\ r\end{array}}\right) \left( {\begin{array}{c}n-1-n_{x_0}\\ s-1-r\end{array}}\right) \kappa _1^r +\sum \limits _{r=0}^{s}\left( {\begin{array}{c}n_{x_0}\\ r\end{array}}\right) \left( {\begin{array}{c}n-1-n_{x_0}\\ s-r\end{array}}\right) \kappa _1^r\right) \nonumber \\ {}&=\kappa _3\sum \limits _{r=0}^s\left( {\begin{array}{c}n_{x_0}\\ r\end{array}}\right) \left( {\begin{array}{c}n-n_{x_0}\\ s-r\end{array}}\right) \kappa _1^r \end{aligned}$$
(9)

Using Result 1, for every \(k>2\) and \(s\in \{0,1,\ldots ,n\}\), we can say that \(\mathcal {M}\) induces a tight \((\epsilon ,\,\delta )\)-ADP guarantee with respect to \(x_0,\,x_1\in \mathcal {X}\) for any \(\epsilon >0\) and \(\delta \) iff \(\delta \) is defined as:

$$\begin{aligned} \delta (\epsilon )=\sum \limits _{v: v>\epsilon }(1-e^{\epsilon -v})\sum \limits _{\begin{array}{c} s=0\\ v=\ln {\frac{\mathbb {P}[\mathcal {M}_{x_0}(x_0)=s]}{\mathbb {P}[\mathcal {M}_{x_0}(x_1)=s]}} \end{array}}^n\mathbb {P}[\mathcal {M}_{x_0}(x_0)=s] \end{aligned}$$
(10)

Using the expressions derived for \(\mathbb {P}[\mathcal {M}_{x_0}(x_0)=s]\) and \(\mathbb {P}[\mathcal {M}_{x_0}(x_1)=s]\) in (8) and (9), respectively, we get \(v_s\):

$$\begin{aligned} v_s&=\ln {\frac{\mathbb {P}[\mathcal {M}_{x_0}(x_0)=s]}{\mathbb {P}[\mathcal {M}_{x_0}(x_1)=s]}}=\ln {\frac{e^{\epsilon _0}\sum \limits _{r=0}^{s-1}\left( {\begin{array}{c}n_{x_{0}}\\ r\end{array}}\right) \left( {\begin{array}{c}n-1-n_{x_0}\\ s-1-r\end{array}}\right) \kappa _1^r+\kappa _2\sum \limits _{r=0}^{s}\left( {\begin{array}{c}n_{x_{0}}\\ r\end{array}}\right) \left( {\begin{array}{c}n-1-n_{x_0}\\ s-r\end{array}}\right) \kappa _1^r}{\sum \limits _{r=0}^{s-1}\left( {\begin{array}{c}n_{x_{0}}\\ r\end{array}}\right) \left( {\begin{array}{c}n-1-n_{x_0}\\ s-1-r\end{array}}\right) \kappa _1^r+\sum \limits _{r=0}^{s}\left( {\begin{array}{c}n_{x_{0}}\\ r\end{array}}\right) \left( {\begin{array}{c}n-1-n_{x_0}\\ s-r\end{array}}\right) \kappa _1^r}}\nonumber \\ &=\ln {\left( \kappa _2+\frac{(e^{\epsilon _{0}}-\kappa _2)\left( \sum \limits _{r=0}^{s-1}\left( {\begin{array}{c}n_{x_0}\\ r\end{array}}\right) \left( {\begin{array}{c}n-1-n_{x_0}\\ s-1-r\end{array}}\right) \kappa _1^r\right) }{\sum \limits _{r=0}^{s}\left( {\begin{array}{c}n_{x_0}\\ r\end{array}}\right) \left( {\begin{array}{c}n-n_{x_0}\\ s-r\end{array}}\right) \kappa _1^r}\right) }\nonumber \\ &=\ln {\left( \kappa _2+\frac{\frac{(e^{\epsilon _{0}}-\kappa _2)}{n-n_{x_0}}\left( \sum \limits _{r=0}^{s}(s-r)\left( {\begin{array}{c}n_{x_0}\\ r\end{array}}\right) \left( {\begin{array}{c}n-n_{x_0}\\ s-r\end{array}}\right) \kappa _1^r\right) }{\sum \limits _{r=0}^{s}\left( {\begin{array}{c}n_{x_0}\\ r\end{array}}\right) \left( {\begin{array}{c}n-n_{x_0}\\ s-r\end{array}}\right) \kappa _1^r}\right) } \end{aligned}$$
(11)

Combining (10) and (11), \(\delta (\epsilon )=\sum \limits _{\begin{array}{c} v: v>\epsilon ; s=0\\ v=\ln {\frac{\mathbb {P}[\mathcal {M}_{x_0}(x_0)=s]}{\mathbb {P}[\mathcal {M}_{x_0}(x_1)=s]}} \end{array}}^n(1-e^{\epsilon -v})\mathbb {P}[\mathcal {M}_{x_0}(x_0)=s]\)

$$\begin{aligned} &=\sum \limits _{s=0}^n\mathbbm {1}_{\{v_{s}>\epsilon \}}(1-e^{\epsilon -v_{s}})\mathbb {P}[\mathcal {M}_{x_0}(x_0)=s]\nonumber \\ &=\sum \limits _{s=0}^n\mathbbm {1}_{\{v_{s}>\epsilon \}}(1-e^{\epsilon -v_{s}})\frac{\kappa _3}{n-n_{x_0}}\sum \limits _{r=0}^s\mu (s,r)\tau _r\nonumber =\hat{\delta }(\epsilon )\\ &\text {[Substituting } \mathbb {P}[\mathcal {M}_{x_0}(x_0)=s] \text { from (8)].} \end{aligned}$$
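The closed form above can be checked numerically (this check is ours, not part of the original proof): the two output distributions \(\mathbb {P}[\mathcal {M}_{x_0}(x_0)=s]\) and \(\mathbb {P}[\mathcal {M}_{x_0}(x_1)=s]\) are convolutions of binomial counts of \(x_0\)-reports, and \(\hat{\delta }(\epsilon )\) follows from (10). A minimal sketch with hypothetical parameter values (function and variable names are ours):

```python
import numpy as np
from scipy.stats import binom

def shuffle_krr_delta(eps, eps0, k, n, n_x0):
    """Tight delta(eps) for the shuffle model with k-RR, evaluated by
    convolving the binomial counts of x0-reports and applying Eq. (10)."""
    p = np.exp(eps0) / (np.exp(eps0) + k - 1)   # P[report x0 | true value x0]
    pbar = 1.0 / (np.exp(eps0) + k - 1)         # P[report x0 | true value != x0]
    # x0-reports among the other n-1 users: n_x0 hold x0, the rest do not.
    others = np.convolve(binom.pmf(np.arange(n_x0 + 1), n_x0, p),
                         binom.pmf(np.arange(n - n_x0), n - 1 - n_x0, pbar))
    P0 = np.convolve(others, [1 - p, p])        # distinguished user holds x0
    P1 = np.convolve(others, [1 - pbar, pbar])  # distinguished user holds x1
    v = np.log(P0 / P1)                         # privacy-loss values v_s
    mask = v > eps
    return float(np.sum((1.0 - np.exp(eps - v[mask])) * P0[mask]))
```

A quick consistency check: \(\delta (\epsilon )\) is non-increasing in \(\epsilon \) and bounded by 1, as the closed form requires.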

B Theoretical outline

For \(\mathcal {M}\), we extend the pairwise notion of ADP to a general DP guarantee by taking, for a fixed \(\epsilon \), the highest value of \(\delta \) across the primary inputs of every member in \(\mathfrak {U}\). This ensures the worst possible tight differential privacy guarantee for the shuffle model. We then focus on estimating the original distribution of the initial dataset.

Let \(\mathcal {R}_{\text {kRR}}^{-1}\) denote the inverseFootnote 1 of the probabilistic mechanism \(\mathcal {R}_{\text {kRR}}\), which is used as the local randomizer for \(\mathcal {M}\). Note that \(\mathcal {R}_{\text {kRR}}\) is a \(k\times k\) stochastic channel, as \(|\mathcal {X}|=k\), and \(\mathcal {R}_{\text {kRR}}^{-1}\) is likewise a \(k\times k\) matrix. Consistent with our earlier notation, we additionally introduce \(H_{\mathfrak {N}}\), broadcasting the frequencies of the elements in \(\mathcal {X}\) after they have been sanitized with \(\mathfrak {N}\). In other words, \(H_{\mathfrak {N}}=\mathfrak {N}_{\epsilon ,\delta }(D_{\mathcal {X}})=(H_{x_0},\ldots ,H_{x_{k-1}})\), where \(H_{x_i}\) is the random variable giving the frequency of \(x_i\) after \(D_{\mathcal {X}}\) has been obfuscated with \(\mathfrak {N}_{\epsilon ,\delta }\).
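Concretely, the k-RR channel is the \(k\times k\) matrix with \(e^{\epsilon _0}/(e^{\epsilon _0}+k-1)\) on the diagonal and \(1/(e^{\epsilon _0}+k-1)\) elsewhere, and Footnote 1 notes that its inverse always exists [1, 13]. A minimal sketch (function names are ours):

```python
import numpy as np

def krr_channel(eps0, k):
    """k-RR as a k x k row-stochastic matrix: keep the true value with
    probability e^eps0/(e^eps0+k-1), otherwise report any other value
    uniformly at random."""
    R = np.full((k, k), 1.0 / (np.exp(eps0) + k - 1))
    np.fill_diagonal(R, np.exp(eps0) / (np.exp(eps0) + k - 1))
    return R

R = krr_channel(1.0, 5)
R_inv = np.linalg.inv(R)          # exists for every eps0 > 0 (Footnote 1)
assert np.allclose(R @ R_inv, np.eye(5))
```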

Since both \(\mathcal {M}\) and \(\mathfrak {N}\) are probabilistic mechanisms, to estimate their utilities we study how accurately we can estimate the true distribution from which \(D_{\mathcal {X}}\) is sampled, after observing the response of the histogram queries in both the scenarios.

Let \(\pi =(\pi _{x_0},\ldots ,\pi _{x_{k-1}})\) be the distribution of the original messages in \(D(x_0)\). Our best guess of the original distribution by observing the noisy histogram going through the Gaussian mechanism is the noisy histogram itself, as \(\mathbb {E}(H_{x_i})=n\pi _{x_i}\) for every \(i\in \{0,\ldots ,k-1\}\).
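For illustration, the central-model baseline can be sketched as follows. This sketch assumes the classical Gaussian-mechanism calibration \(\sigma =\varDelta \sqrt{2\ln (1.25/\delta )}/\epsilon \) (valid for \(\epsilon <1\)); the paper itself uses the tighter analytical calibration of Balle and Wang [4]. Changing one user's value shifts two histogram bins by 1 each, so the \(L_2\) sensitivity is \(\sqrt{2}\). Function names are ours:

```python
import numpy as np

def gaussian_histogram(counts, eps, delta, rng):
    """Central model: add i.i.d. Gaussian noise to each histogram bin.
    Classical calibration, valid for eps < 1; the paper uses the tighter
    analytical calibration of Balle & Wang [4]."""
    sensitivity = np.sqrt(2.0)  # one user changes two bins by 1 each
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    return np.asarray(counts, float) + rng.normal(0.0, sigma, size=len(counts))
```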

However, in the case where \(D(x_0)\) is locally obfuscated using \(\mathcal {R}_{\text {kRR}}\) and the frequency of each element is broadcast by the shuffle model \(\mathcal {M}\), we can use the matrix inversion method [1, 13] to estimate the distribution of the original messages in \(D(x_0)\). Thus \(\mathcal {M}(D(x_0))\mathcal {R}_{\text {kRR}}^{-1}\) (referred to as shuffle+INV in the experiments) gives us \(\hat{\pi }=(\hat{\pi }_{x_0},\ldots ,\hat{\pi }_{x_{k-1}})\), the most likely estimate of the distribution of each user's message in \(D(x_0)\) sampled from \(\mathcal {X}\), where \(\hat{\pi }_{x_i}\) denotes the random variable estimating the normalised frequency of \(x_i\) in \(D(x_0)\).

$$\begin{aligned} \mathbb {E}(\hat{\pi })=\mathbb {E}(\mathcal {M}(D(x_0))\mathcal {R}_{\text {kRR}}^{-1})=\pi \mathcal {R}_{\text {kRR}}\mathcal {R}_{\text {kRR}}^{-1}=\pi \end{aligned}$$
(12)
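Equation (12) can be illustrated end to end: sample a dataset from a distribution \(\pi \), apply k-RR locally, form the order-free histogram that the shuffler outputs, and multiply by the inverse channel. A minimal sketch with hypothetical parameter values (variable names are ours):

```python
import numpy as np

rng = np.random.default_rng(0)
k, n, eps0 = 4, 100_000, 1.0
pi = np.array([0.5, 0.3, 0.15, 0.05])              # true distribution

# k-RR channel: keep w.p. e^eps0/(e^eps0+k-1), else uniform over the others.
R = np.full((k, k), 1.0 / (np.exp(eps0) + k - 1))
np.fill_diagonal(R, np.exp(eps0) / (np.exp(eps0) + k - 1))

data = rng.choice(k, size=n, p=pi)                 # original messages
keep = rng.random(n) < R[0, 0]
noisy = np.where(keep, data, (data + rng.integers(1, k, size=n)) % k)

hist = np.bincount(noisy, minlength=k) / n         # shuffled histogram
pi_hat = hist @ np.linalg.inv(R)                   # shuffle+INV estimate
```

Since the expected histogram is \(\pi \mathcal {R}_{\text {kRR}}\), multiplying by \(\mathcal {R}_{\text {kRR}}^{-1}\) makes the estimate unbiased, matching (12).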

We recall that \(\mathcal {M}\) provides tight \((\epsilon ,\,\delta )\)-ADP for \(x_0,\,x_1\), where \(\delta \) is a function of \(\epsilon _0,\,\epsilon ,\,\text {and }x_0\) – essentially \(\mathcal {M}\) privatizes the true query response for \(x_0\) to be identified as that for any \(x_1\ne x_0\). On the other hand, \(\mathfrak {N}_{\epsilon ,\delta }\) ensures \((\epsilon ,\,\delta )\)-DP, which essentially means it guarantees \((\epsilon ,\,\delta )\)-ADP for every \(x_i\in \mathcal {X}\). Therefore, in order to facilitate a fair comparison of utility between the central and shuffle models of differential privacy under the same privacy level for the histogram query, we introduce the following concepts:

  i)

    Individual specific utility: Suppose the primary input of u is \(x_0\). Individual specific utility refers to measuring the utility for the specific message \(x_0\) in the dataset \(D(x_0)\) in a certain privacy mechanism. In particular, the individual specific utility of \(x_0\) in \(D(x_0)\) for \(\mathcal {M}\) is

    $$\begin{aligned} \overline{\mathcal {W}}(\mathcal {M},x_0)=|n\hat{\pi }_{x_0}-n\pi _{x_0}|, \end{aligned}$$

    and that for \(\mathfrak {N}_{\epsilon ,\delta }\) is

    $$\begin{aligned} \overline{\mathcal {W}}(\mathfrak {N}_{\epsilon ,\delta },x_0)=|n\pi _{x_0}-H_{x_0}| \end{aligned}$$
  ii)

    Community level utility: Here we consider the utility of privacy mechanisms over the entire community, i.e., over all the values of the original dataset, by measuring the distance between the estimated original distribution obtained from the observed noisy histogram and the original distribution of the source messages. In particular, fixing any \(\epsilon _0>0\) and \(\epsilon >0\), the community level utility for \(\mathcal {M}\) is

    $$\begin{aligned} \mathcal {W}(\mathcal {M})=d(n\hat{\pi },\,n\pi ), \end{aligned}$$
    (13)

    and that for \(\mathfrak {N}_{\epsilon ,\delta }\)Footnote 2 is

    $$\begin{aligned} \mathcal {W}(\mathfrak {N}_{\epsilon ,\delta })=d(H_{\mathfrak {N}_{\epsilon ,\delta }},\,n\pi ), \end{aligned}$$
    (14)

    where \(d(\cdot )\) is any standard metricFootnote 3 on probability distributions over a finite space. For an equitable comparison between \(\mathcal {M}\) and \(\mathfrak {N}\), we take the worst tight ADP guarantee over every user’s primary input and call this the community level tight DP guarantee for \(\mathcal {M}\). That is, for a fixed \(\epsilon _0,\,\epsilon >0\), we have \(\mathcal {M}\) satisfying \((\epsilon ,\,\hat{\delta })\)-DP as the community level tight DP guarantee if

    $$\begin{aligned} \hat{\delta }=\max \limits _{x\in \mathcal {X}}\{\delta :\mathcal {M}\text { is tightly }(\epsilon ,\delta (x))\text {-ADP for } x\in D_{\mathcal {X}}\} \end{aligned}$$
    (15)

    Therefore, we impose the worst tight ADP guarantee on \(\mathcal {M}\) over all the original messages with \(\epsilon \) and \(\hat{\delta }\), implying that \(\mathcal {M}\) now gives an \((\epsilon ,\,\hat{\delta })\)-DP guarantee by Remark 1, placing us in a position to compare the community level utilities of the shuffle and the central models of DP under the histogram query for a fixed level of privacy. In particular, we juxtapose \(\mathcal {W}(\mathcal {M})\) with \(\mathcal {W}(\mathfrak {N}_{\epsilon ,\hat{\delta }})\), as seen in the experimental results with location data from San Francisco and Paris in Fig. 3.
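Footnote 3 fixes \(d\) to be the Total Variation Distance in the experiments. A minimal sketch of that metric for comparing \(\mathcal {W}(\mathcal {M})\) and \(\mathcal {W}(\mathfrak {N}_{\epsilon ,\hat{\delta }})\) (function name ours):

```python
import numpy as np

def total_variation(a, b):
    """TVD between two histograms/distributions over a finite space,
    after normalising both to probability vectors."""
    a = np.asarray(a, float)
    b = np.asarray(b, float)
    return 0.5 * np.abs(a / a.sum() - b / b.sum()).sum()
```

Because both arguments are normalised, the same function applies to raw counts \(n\hat{\pi }\) and \(H_{\mathfrak {N}_{\epsilon ,\delta }}\) as well as to probability vectors.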


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Biswas, S., Jung, K., Palamidessi, C. (2024). Tight Differential Privacy Guarantees for the Shuffle Model with k-Randomized Response. In: Mosbah, M., Sèdes, F., Tawbi, N., Ahmed, T., Boulahia-Cuppens, N., Garcia-Alfaro, J. (eds) Foundations and Practice of Security. FPS 2023. Lecture Notes in Computer Science, vol 14551. Springer, Cham. https://doi.org/10.1007/978-3-031-57537-2_27

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-57537-2_27


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-57536-5

  • Online ISBN: 978-3-031-57537-2
