
The Effect of False Positives: Why Fuzzy Message Detection Leads to Fuzzy Privacy Guarantees?

Conference paper, published in Financial Cryptography and Data Security (FC 2022), Lecture Notes in Computer Science, vol. 13411.

Abstract

Fuzzy Message Detection (FMD) is a recent cryptographic primitive invented by Beck et al. (CCS’21) where an untrusted server performs coarse message filtering for its clients in a recipient-anonymous way. In FMD—besides the true positive messages—the clients download from the server their cover messages determined by their false-positive detection rates. What is more, within FMD, the server cannot distinguish between genuine and cover traffic. In this paper, we formally analyze the privacy guarantees of FMD from three different angles.

First, we analyze three privacy provisions offered by FMD: recipient unlinkability, relationship anonymity, and temporal detection ambiguity. Second, we perform a differential privacy analysis and coin a relaxed definition to capture the privacy guarantees FMD yields. Finally, we simulate FMD on real-world communication data. Our theoretical and empirical results assist FMD users in adequately selecting their false-positive detection rates for various applications with given privacy requirements.


Notes

1. A similar scenario was studied in [5] concerning Bloom filters.

2. Note that Beck et al. coined this dynamic k-anonymity; however, we believe that term does not capture all aspects of their improvement, hence we renamed it with a more generic one.

3. For an initial empirical anonymity analysis, we refer the reader to the simulator developed by Sarah Jamie Lewis [24].

4. In this work, we stipulate that a single server filters the messages for all users, i.e., a single server knows all the recipients’ detection keys.

5. This lower bound is practically tight, since the probability distribution of the adversary’s advantage is concentrated around the mean \(\lfloor pU\rfloor \) anyway.

6. Note that this approximation is generally considered tight enough when \(\textsf{out}(u_2)p(u_1)\ge 5\) and \(\textsf{out}(u_2)(1-p(u_1))\ge 5\).

7. For senders with only a few sent messages (\(\textsf{out}(u_2)\le 30\)), one can apply t-tests instead of Z-tests.

8. As an illustrative example collected from a real communication system, see Fig. 3b.

9. We elaborate more on various DP notions in Appendix C.

10. Note that this is also a personalized guarantee, as in [18].

11. The simulator can be found at https://github.com/seresistvanandras/FMD-analysis.

12. We present the proof for singleton sets, but it can be extended using the following formula: \(\frac{A+C}{B+D}<\max (\frac{A}{B},\frac{C}{D})\).

13. It is only an optimistic baseline, as it merely captures the trivial event when no one downloads a message from any sender v besides the intended recipient u.

14. \(a_{-u}\) is a common notation for all players’ actions except player u’s. Note that \(p(-u)\) stands for the same in relation to FMD.

References

1. Anshelevich, E., Dasgupta, A., Kleinberg, J., Tardos, E., Wexler, T., Roughgarden, T.: The price of stability for network design with fair cost allocation. SIAM J. Comput. 38(4), 1602–1623 (2008)

2. Backes, M., Kate, A., Manoharan, P., Meiser, S., Mohammadi, E.: AnoA: a framework for analyzing anonymous communication protocols. In: 2013 IEEE 26th Computer Security Foundations Symposium, pp. 163–178. IEEE (2013)

3. Beck, G., Len, J., Miers, I., Green, M.: Fuzzy message detection. IACR ePrint (2021)

4. Solomon, M., DiFrancesco, B.: Privacy preserving stealth payments on the Ethereum blockchain (2021)

5. Bianchi, G., Bracciale, L., Loreti, P.: “Better than nothing” privacy with Bloom filters: to what extent? In: Domingo-Ferrer, J., Tinnirello, I. (eds.) PSD 2012. LNCS, vol. 7556, pp. 348–363. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-33627-0_27

6. Biczók, G., Chia, P.H.: Interdependent privacy: let me share your data. In: Sadeghi, A.-R. (ed.) FC 2013. LNCS, vol. 7859, pp. 338–353. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-39884-1_29

7. Chatzikokolakis, K., Andrés, M.E., Bordenabe, N.E., Palamidessi, C.: Broadening the scope of differential privacy using metrics. In: De Cristofaro, E., Wright, M. (eds.) PETS 2013. LNCS, vol. 7981, pp. 82–102. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-39077-7_5

8. de Valence, H.: Determine whether Penumbra could integrate fuzzy message detection (2021)

9. Desfontaines, D., Pejó, B.: SoK: differential privacies. Proc. Priv. Enhanc. Technol. 2, 288–313 (2020)

10. Domingo-Ferrer, J., Torra, V.: A critique of k-anonymity and some of its enhancements. In: 2008 Third International Conference on Availability, Reliability and Security, pp. 990–993. IEEE (2008)

11. Dwork, C.: Differential privacy. In: Bugliesi, M., Preneel, B., Sassone, V., Wegener, I. (eds.) ICALP 2006. LNCS, vol. 4052, pp. 1–12. Springer, Heidelberg (2006). https://doi.org/10.1007/11787006_1

12. Dwork, C.: Differential privacy in new settings. In: Proceedings of the Twenty-First Annual ACM-SIAM Symposium on Discrete Algorithms. SIAM (2010)

13. Dwork, C., Naor, M., Pitassi, T., Rothblum, G.N.: Differential privacy under continual observation. In: Proceedings of the Forty-Second ACM Symposium on Theory of Computing. ACM (2010)

14. Dwork, C., Naor, M., Pitassi, T., Rothblum, G.N., Yekhanin, S.: Pan-private streaming algorithms. In: ICS (2010)

15. Hardin, G.: The tragedy of the commons: the population problem has no technical solution; it requires a fundamental extension in morality. Science 162(3859), 1243–1248 (1968)

16. Harsanyi, J.C., Selten, R., et al.: A General Theory of Equilibrium Selection in Games. MIT Press, Cambridge (1988)

17. Hay, M., Li, C., Miklau, G., Jensen, D.: Accurate estimation of the degree distribution of private networks. In: 2009 Ninth IEEE International Conference on Data Mining, pp. 169–178. IEEE (2009)

18. Jorgensen, Z., Yu, T., Cormode, G.: Conservative or liberal? Personalized differential privacy. In: 2015 IEEE 31st International Conference on Data Engineering, pp. 1023–1034. IEEE (2015)

19. Kellaris, G., Papadopoulos, S., Xiao, X., Papadias, D.: Differentially private event sequences over infinite streams. Proc. VLDB Endow. 7(12), 1155–1166 (2014)

20. Kifer, D., Machanavajjhala, A.: No free lunch in data privacy. In: Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data, pp. 193–204 (2011)

21. Korolova, A., Kenthapadi, K., Mishra, N., Ntoulas, A.: Releasing search queries and clicks privately. In: Proceedings of the 18th International Conference on World Wide Web, pp. 171–180 (2009)

22. Koutsoupias, E., Papadimitriou, C.: Worst-case equilibria. In: Meinel, C., Tison, S. (eds.) STACS 1999. LNCS, vol. 1563, pp. 404–413. Springer, Heidelberg (1999). https://doi.org/10.1007/3-540-49116-3_38

23. Lewis, S.J.: Niwl: a prototype system for open, decentralized, metadata resistant communication using fuzzytags and random ejection mixers (2021)

24. Lewis, S.J.: A playground simulator for fuzzy message detection (2021)

25. Machanavajjhala, A., Kifer, D., Gehrke, J., Venkitasubramaniam, M.: \(L\)-diversity: privacy beyond \(k\)-anonymity. ACM Trans. Knowl. Discov. Data (TKDD) 1(1), 3-es (2007)

26. Monderer, D., Shapley, L.S.: Potential games. Games Econ. Behav. 14(1), 124–143 (1996)

27. Nash, J.F., et al.: Equilibrium points in \(n\)-person games. Proc. Natl. Acad. Sci. 36(1), 48–49 (1950)

28. Nisan, N., Schapira, M., Valiant, G., Zohar, A.: Best-response mechanisms. In: ICS, pp. 155–165. Citeseer (2011)

29. Noether, S.: Ring signature confidential transactions for Monero. IACR Cryptology ePrint Archive 2015/1098 (2015)

30. Panzarasa, P., Opsahl, T., Carley, K.M.: Patterns and dynamics of users’ behavior and interaction: network analysis of an online community. J. Am. Soc. Inf. Sci. Technol. 60(5), 911–932 (2009)

31. Paranjape, A., Benson, A.R., Leskovec, J.: Motifs in temporal networks. In: Proceedings of the Tenth ACM International Conference on Web Search and Data Mining, pp. 601–610 (2017)

32. Rondelet, A.: Fuzzy message detection in Zeth (2021)

33. Sasson, E.B., et al.: Zerocash: decentralized anonymous payments from Bitcoin. In: 2014 IEEE Symposium on Security and Privacy, pp. 459–474. IEEE (2014)

34. Simon, H.A.: Altruism and economics. Am. Econ. Rev. 83(2), 156–161 (1993)

35. Soumelidou, A., Tsohou, A.: Towards the creation of a profile of the information privacy aware user through a systematic literature review of information privacy awareness. Telematics Inform. 61, 101592 (2021)

36. Sweeney, L.: k-anonymity: a model for protecting privacy. Int. J. Uncertain. Fuzziness Knowl.-Based Syst. 10(05), 557–570 (2002)

37. Takagi, S., Cao, Y., Yoshikawa, M.: Asymmetric differential privacy. arXiv preprint arXiv:2103.00996 (2021)

38. Zhang, T., Zhu, T., Liu, R., Zhou, W.: Correlated data in differential privacy: definition and analysis. Concurr. Comput. Pract. Exp. 34(16), e6015 (2020)

39. Zhou, B., Pei, J.: The \(k\)-anonymity and \(l\)-diversity approaches for privacy preservation in social networks against neighborhood attacks. Knowl. Inf. Syst. 28(1), 47–77 (2011). https://doi.org/10.1007/s10115-010-0311-2


Acknowledgements

We thank our shepherd Fan Zhang and our anonymous reviewers for helpful comments in preparing the final version of this paper. We are grateful to Sarah Jamie Lewis for inspiration and publishing the data sets. We thank Henry de Valence and Gabrielle Beck for fruitful discussions. Project no. 138903 has been implemented with the support provided by the Ministry of Innovation and Technology from the NRDI Fund, financed under the FK_21 funding scheme. The research reported in this paper and carried out at the BME has been supported by the NRDI Fund based on the charter of bolster issued by the NRDI Office under the auspices of the Ministry for Innovation and Technology.

Author information

Correspondence to István András Seres.

Appendices

A FMD in More Details

The fuzzy message detection scheme consists of the following five probabilistic polynomial-time algorithms \((\textsf{Setup},\textsf{KeyGen},\textsf{Flag},\textsf{Extract},\textsf{Test})\). In the following, let \(\mathcal {P}\) denote the set of attainable false positive rates.

  • \(\textsf{Setup}(1^{\lambda }) \xrightarrow {\$}\textsf{pp}\). Global parameters \(\textsf{pp}\) of the FMD scheme are generated, i.e., the description of a shared cyclic group.

  • \(\textsf{KeyGen}_{\textsf{pp}}(1^{\lambda }) \xrightarrow {\$}(pk,sk)\). Given the global public parameters and the security parameter, this algorithm outputs a public and secret key pair.

  • \(\textsf{Flag}(pk) \xrightarrow {\$}C.\) Given a public key pk, this randomized algorithm outputs a flag ciphertext C.

  • \(\textsf{Extract}(sk,p) \xrightarrow []{}dsk\). Given a secret key sk and a false-positive rate p, the algorithm extracts a detection secret key dsk if \(p\in \mathcal {P}\) and outputs \(\bot \) otherwise.

  • \(\textsf{Test}(dsk,C) \xrightarrow []{}\{0,1\}\). Given a detection secret key dsk and a flag ciphertext C, the test algorithm outputs a detection result.
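
To make the interface concrete, the following is a toy, non-cryptographic Python mock of the five algorithms; it is our own sketch of the API shape and of how the false-positive rate drives \(\textsf{Test}\), assuming the attainable rates are \(\mathcal {P}=\{2^{-n}\}\) as in [3]. The real scheme realizes \(\textsf{Flag}\) and \(\textsf{Test}\) with group-based encryption; here the “ciphertext” is plaintext bits, so it offers no security whatsoever.

```python
import secrets

# Toy, NON-cryptographic mock of the FMD interface (illustration only).
# A real flag ciphertext hides the recipient's bits; here it is a copy.

GAMMA = 8  # number of flag bits; fixes P = {1, 1/2, ..., 2^-GAMMA}

def Setup(security_parameter=128):
    # pp: in the real scheme, the description of a shared cyclic group.
    return {"gamma": GAMMA}

def KeyGen(pp):
    sk = [secrets.randbits(1) for _ in range(pp["gamma"])]
    pk = list(sk)  # stand-in only: a real pk computationally hides sk
    return pk, sk

def Flag(pk):
    # A genuine flag "encrypts" the recipient's bits, so Test always
    # accepts it; another recipient's flag looks like random bits.
    return list(pk)

def Extract(sk, p):
    # Detection keys exist only for p = 2^-n in P; otherwise return ⊥.
    if p <= 0 or p > 1:
        return None
    n = 0
    while p < 1:
        p *= 2
        n += 1
    if p != 1 or n > len(sk):
        return None  # ⊥
    return sk[:n]  # the first n secret bits

def Test(dsk, C):
    # An unrelated flag matches n random key bits with probability 2^-n.
    return int(all(d == c for d, c in zip(dsk, C)))
```

For instance, with \(p=2^{-3}\) the detection key consists of three bits, so \(\textsf{Test}\) accepts an unrelated message with probability 1/8: roughly one in eight other users’ messages is downloaded as cover traffic.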

An FMD scheme needs to satisfy three main security and privacy notions: correctness, fuzziness and detection ambiguity. For the formal definitions of these, we refer to [3]. The toy example presented in Fig. 5 is meant to illustrate the interdependent nature of the privacy guarantees achieved by the FMD scheme.

Fig. 5. A toy example of the FMD scheme. Several senders post anonymous messages to the untrusted server. Whenever recipients come online, they download the messages that correspond to them (some false positives, some true positives). Recipients A, B, C, and D have false-positive rates \(0,\frac{1}{3},\frac{1}{3},1\), respectively. Note that the server can map the messages that belong to A and D. However, the messages of Recipients B and C are 2-anonymous.

B Formal Definitions of Security and Privacy Guarantees

Fig. 6. The security game for the anonymity notion of recipient unlinkability.

Definition 4

(Temporal Detection Ambiguity). An anonymous communication protocol \(\varPi \) satisfies temporal detection ambiguity if for all probabilistic polynomial-time adversaries \(\mathcal {A}\) there is a negligible function \(\textsf{negl}(\cdot )\) such that

$$\begin{aligned} \Pr [\mathcal {G}^{TDA}_{\mathcal {A},\varPi }(\lambda )=1]\le \frac{1}{2}+\textsf{negl}(\lambda ), \end{aligned}$$
(6)

where the temporal detection ambiguity game \(\mathcal {G}^{TDA}_{\mathcal {A},\varPi }(\cdot )\) is defined below (Fig. 7).

Fig. 7. The security game for the privacy notion of temporal detection ambiguity.

C Differential Privacy Relaxations and Proofs

Our novel DP notion, called PEEDP (short for Personalized Existing Edge DP), is an instance of d-privacy [7], which generalizes the neighbourhood of datasets (on which the DP inequality should hold) to an arbitrary metric d defined over the input space. Yet, instead of a top-down approach where we present a complex metric to fit our FMD use-case, we follow a bottom-up approach and show the various building blocks of our definition. PEEDP is a straightforward combination of unbounded DP [20], edge-DP [17], asymmetric DP [37], and personalized DP [18]. Although Definition 3 is appropriate for FMD, it does not capture the FMD scenarios fully, as neither the time-dependent nature of the messages nor the dependencies and correlations between them are taken into account.

The first issue can be tackled by integrating into PEEDP other DP notions that provide guarantees under continuous observation (i.e., stream data), such as pan-privacy [14]. Within this streaming context, several definitions can be considered: user-level [13] (to protect the presence of users), event-level [12] (to protect the presence of messages), and w-event level [19] (to protect the presence of messages within time windows).

The second issue is also not considered in Theorem 1, as we assumed the messages are IID, while in real-world applications this is not necessarily the case. Several DP notions consider distributions; without cherry-picking any, we refer the reader to two corresponding surveys [9, 38]. We leave tweaking our definition further to fit these contexts as future work.

Proof

(of Theorem 1). Due to the IID nature of the messages, it is enough to show that Eq. 5 holds for an arbitrary communication graph D with an arbitrary message m of an arbitrary user u. The two possible worlds the adversary should not be able to differentiate between are \(D\) and \(D^\prime \) with \(D=D^\prime \setminus \{m\}\), i.e., whether the particular message exists or not. Due to the asymmetric nature of Definition 3 (i.e., it only protects existence), Eq. 7 does not need to be satisfied. On the other hand, if the message exists, then Eqs. 8 and 9 must be satisfied, where \(S_1=\){message m is downloaded by user u} and \(S_2=\){message m is not downloaded by user u}.

$$\begin{aligned} \Pr (A(D^\prime )\in S)\le e^{\varepsilon _u} \cdot \Pr (A(D)\in S)\end{aligned}$$
(7)
$$\begin{aligned} \Pr (A(D)\in S_1)\le e^{\varepsilon _u} \cdot \Pr (A(D^\prime )\in S_1)\end{aligned}$$
(8)
$$\begin{aligned} \Pr (A(D)\in S_2)\le e^{\varepsilon _u} \cdot \Pr (A(D^\prime )\in S_2) \end{aligned}$$
(9)

If we reformulate the last two equations with the corresponding probabilities, we get \(1\le e^{\varepsilon _u}\cdot p(u)\) and \(0\le e^{\varepsilon _u}\cdot (1-p(u))\), respectively. While the second holds trivially, the first rearranges to \(\varepsilon _u\ge \ln (1/p(u))\), which is exactly the formula in Theorem 1.    \(\square \)
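
As a quick numeric illustration of this bound (our own addition, assuming the power-of-two rates attainable in FMD), the sketch below tabulates the smallest \(\varepsilon _u\) satisfying \(1\le e^{\varepsilon _u}\cdot p(u)\):

```python
import math

# Per-user PEEDP guarantee implied by the proof above:
# 1 <= e^eps * p(u)  =>  eps_u >= ln(1 / p(u)).
for n in range(1, 6):
    p = 2 ** -n            # attainable false-positive rate p(u) = 2^-n
    eps = math.log(1 / p)  # smallest eps_u satisfying the bound
    print(f"p(u) = 2^-{n}: eps_u >= {eps:.3f}")
```

The guarantee weakens (larger \(\varepsilon _u\)) as the false-positive rate shrinks, matching the intuition that fewer cover downloads mean less deniability.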

Proof

(of Theorem 2). The users’ numbers of incoming messages are independent of each other; hence, we can focus on a single user u. The proof follows the idea from [21]Footnote 12: we satisfy Eq. 4 (with \(+\delta \) at the end) when \(A(D)=tag(u)\sim D+\textsf{Binom}(M-in(u),p(u))\) for \(D=in(u)\) and \(D^\prime =in(u)\pm 1\), i.e., we show that the following equation holds.

$$\begin{aligned} \begin{aligned} \Pr (A(D)=tag(u)\in S|D=in(u),M,p(u))\le \\ e^\varepsilon \cdot \Pr (A(D^\prime )=tag^\prime (u)\in S|D^\prime =in(u)\pm 1,M^\prime =M\pm 1,p(u))+\delta \\ \Rightarrow \qquad \quad \Pr (in(u)+\textsf{Binom}(M-in(u),p(u))\in S)\le \\ e^\varepsilon \cdot \Pr (in(u)\pm 1+\textsf{Binom}(M\pm 1-(in(u)\pm 1),p(u))\in S)+\delta \\ \end{aligned} \end{aligned}$$

First, we focus on \(\delta \) and provide a lower bound originating from the probability on the left when \(\Pr (\cdot )\le e^\varepsilon \cdot 0+\delta \). This corresponds to the two cases shown in the equation below: \(D^\prime =in(u)+1\) with \(S=\{in(u)\}\), and \(D^\prime =in(u)-1\) with \(S=\{M\}\). The corresponding lower bounds (i.e., probabilities) correspond to the events when user u does not download any fuzzy messages and when user u downloads all messages, respectively. Hence, the maximum of these is indeed a lower bound for \(\delta \).

$$\begin{aligned}\begin{gathered} \Pr (A(in(u))=in(u))\le e^\varepsilon \cdot \Pr (A(in(u)+1)=in(u))+\delta \Rightarrow (1-p(u))^{M-in(u)}\le \delta \\ \Pr (A(in(u))=M)\le e^\varepsilon \cdot \Pr (A(in(u)-1)=M)+\delta \;\;\Rightarrow \;\;p(u)^{M-in(u)}\le \delta \end{gathered}\end{aligned}$$

Now we turn towards \(\varepsilon \) and show that \((\varepsilon ,0)\)-DP holds for all subsets besides the two above, i.e., when \(S=\{in(u)+y\}\) with \(y\in \{1,\dots ,M-in(u)-1\}\). First, we reformulate Eq. 4 as seen below.

$$\begin{aligned} \frac{\Pr (in(u)+\textsf{Binom}(M-in(u),p(u))\in S)}{\Pr (in(u)\pm 1+\textsf{Binom}(M-in(u),p(u))\in S)}\le e^\varepsilon \end{aligned}$$

Then, by replacing the binomial distributions with the corresponding probability formulas, we get the following two equations for \(D^\prime =in(u)+1\) and \(D^\prime =in(u)-1\), respectively.

$$\begin{aligned}\begin{gathered} \frac{\left( {\begin{array}{c}M-in(u)\\ y\end{array}}\right) \cdot p(u)^y\cdot (1-p(u))^{M-in(u)-y}}{\left( {\begin{array}{c}M-in(u)\\ y-1\end{array}}\right) \cdot p(u)^{y-1}\cdot (1-p(u))^{M-in(u)-y+1}}=\frac{M-in(u)-y+1}{y}\cdot \frac{p(u)}{1-p(u)}\le e^\varepsilon \\ \frac{\left( {\begin{array}{c}M-in(u)\\ y\end{array}}\right) \cdot p(u)^y\cdot (1-p(u))^{M-in(u)-y}}{\left( {\begin{array}{c}M-in(u)\\ y+1\end{array}}\right) \cdot p(u)^{y+1}\cdot (1-p(u))^{M-in(u)-y-1}}=\frac{y+1}{M-in(u)-y}\cdot \frac{1-p(u)}{p(u)}\le e^\varepsilon \\ \end{gathered}\end{aligned}$$

Consequently, the maximum of these is the lower bound for \(e^\varepsilon \). The first formula’s derivative with respect to y is negative, so the function is monotone decreasing, meaning that its maximum is attained at \(y=1\) (i.e., \(S=\{in(u)+1\}\)). On the other hand, the second formula’s derivative is positive, so the function is monotone increasing; hence, the maximum is reached at \(y=M-in(u)-1\) (i.e., \(S=\{M-1\}\)). By substituting these values of y, one can verify that the corresponding maximum values are indeed as shown in Theorem 2.    \(\square \)
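
To make these bounds tangible, the sketch below (our own illustration; M, in(u), and p(u) are made-up example values) evaluates the \(\delta \) lower bound and the two ratios at their maximizing y, as derived above:

```python
import math

def fmd_dp_bounds(M, in_u, p):
    """(eps, delta) suggested by the proof of Theorem 2 for one user.

    M    -- total number of messages on the server
    in_u -- in(u), user u's genuine incoming messages
    p    -- p(u), user u's false-positive detection rate
    """
    k = M - in_u                                   # potential fuzzy downloads
    delta = max((1 - p) ** k, p ** k)              # the two delta lower bounds
    e_eps = max(k * p / (1 - p), k * (1 - p) / p)  # ratios at y = 1, y = k - 1
    return math.log(e_eps), delta

eps, delta = fmd_dp_bounds(M=1000, in_u=50, p=0.25)
print(f"eps ~ {eps:.2f}, delta ~ {delta:.1e}")
```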

D Game-Theoretical Analysis

Here, besides a short introduction of the utilized game-theoretic concepts, we present a rudimentary game-theoretic study of the FMD protocol focusing on the relationship anonymity introduced in Sect. 4. First, we formalize a game and highlight some corresponding problems, such as the interdependence of the users’ privacy. Then, we unify the users’ actions and show that the designed game’s only Nash Equilibrium is to set the false-positive detection rates to zero, rendering FMD idle amongst selfish users. Following this, we show that a higher utility could be reached with altruistic users and/or by centrally adjusting the false-positive detection rates. Finally, we show that our game (even with non-unified actions) is a potential game, which has several desirable properties, such as efficient computation of a Nash Equilibrium. The game-theoretic concepts we rely on are listed below.

  • Tragedy of the Commons [15]: users act according to their own self-interest and, contrary to the common good of all users, deplete the shared resource through their uncoordinated actions.

  • Nash Equilibrium [27]: every player makes the best/optimal decision for itself as long as the others’ choices remain unchanged.

  • Altruism [34]: users act to promote the others’ welfare, even at a risk or cost to themselves.

  • Social Optimum [16]: the users’ strategies that maximize social welfare (i.e., the overall accumulated utilities).

  • Price of Stability/Anarchy [1, 22]: the ratio between the utility values corresponding to the best/worst NE and the SO; it measures how the efficiency of a system degrades due to the selfish behavior of its agents.

  • Best Response Mechanism [28]: starting from a random initial strategy profile, the players iteratively improve their strategies.

Almost every multi-party interaction can be modeled as a game. In our case, the decision makers are the users of the FMD service. We assume each user u bears some cost \(C_u\) for downloading messages from the server. For simplicity, we define this uniformly: if f is the cost of retrieving any message for any user, then \(C_u=f\cdot tag(u)\). Moreover, we replace the random variable \(tag(u)\sim in(u)+\textsf{Binom}(M-in(u),p(u))\) with its expected value, i.e., \(C_u=f\cdot (in(u)+p(u)\cdot (M-in(u)))\).

Besides, a user’s payoff should depend on whether the privacy properties detailed in Sect. 4 are satisfied. For instance, we assume the users suffer from a privacy breach if relationship anonymity is not ensured, i.e., they uniformly lose L when the recipient u can be linked to any sender via any message between them. In the rest of the section we slightly abuse the notation u: in contrast to the rest of the paper, we refer to the users as \(u\in \{1,\dots ,U\}\) instead of \(\{u_0,u_1,\dots \}\). The probability of a linkage via a particular message for user u is \(\alpha _u=\prod _{v\in \{1,\dots ,U\}\setminus \{u\}}(1-p(v))\). The probability of a linkage from any incoming message of u is \(1-(1-\alpha _u)^{in(u)}\).Footnote 13 Based on these, we define the FMD-RA Game.

Definition 5

The FMD-RA Game is a tuple \(\langle \mathcal {N},\varSigma ,\mathcal {U}\rangle \), where the set of players is \(\mathcal {N}=\{1,\dots ,U\}\), their actions are \(\varSigma =\{p(1),\dots ,p(U)\}\) where \(p(u)\in [0,1]\) while their utility functions are \(\mathcal {U}=\{\varphi _u(p(1),\dots ,p(U))\}_{u=1}^U\) such that for \(1\le u\le U\):

$$\begin{aligned} \begin{aligned} \varphi _u=-L\cdot \left( 1-\left( 1-\alpha _u\right) ^{in(u)}\right) -f\cdot (in(u)+p(u)\cdot (M-in(u))). \end{aligned} \end{aligned}$$
(10)

It is visible in the utility function that the bandwidth-related cost (second term) depends only on user u’s own action, while the privacy-related cost (first term) depends only on the other users’ actions. This reflects well that relationship anonymity is an interdependent privacy property [6] within FMD: by downloading fuzzy tags, the users provide privacy to others rather than to themselves. As a consequence of this tragedy-of-the-commons [15] situation, a trivial no-protection Nash Equilibrium (NE) emerges. Moreover, Theorem 3 states that this NE is unique, i.e., no other NE exists.
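
To make the model concrete, the following sketch (our own illustration; f, L, M, and the in(u) values are made-up parameters) evaluates \(\alpha _u\) and the utility of Eq. 10 on a toy population:

```python
import math

def alpha(u, p):
    # Linkage probability via one message: prod_{v != u} (1 - p(v)).
    return math.prod(1 - p[v] for v in range(len(p)) if v != u)

def utility(u, p, in_msgs, M, f=0.01, L=10.0):
    # phi_u from Eq. 10: expected privacy loss plus download cost.
    a = alpha(u, p)
    privacy_loss = L * (1 - (1 - a) ** in_msgs[u])
    download_cost = f * (in_msgs[u] + p[u] * (M - in_msgs[u]))
    return -privacy_loss - download_cost

in_msgs = [5, 10, 20]          # incoming messages of three toy users
M = sum(in_msgs)               # all messages on the server
for rate in (0.0, 0.25, 0.5):  # everyone applies the same rate
    p = [rate] * len(in_msgs)
    print(rate, [round(utility(u, p, in_msgs, M), 3) for u in range(len(p))])
```

Raising one’s own rate only increases one’s own download cost while lowering everyone else’s linkage probability; this is the interdependence that drives Theorem 3.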

Theorem 3

Applying no privacy protection in the FMD-RA Game is the only NE: \((p^*(1),\dots ,p^*(U))=(0,\dots ,0)\).

Proof

First, we prove that no-protection is a NE. If every user u sets \(p(u)=0\), then any single user deviating from this strategy would increase its own cost; hence, no rational user would deviate from this point. In detail, in Eq. 10 the privacy-related cost is a constant \(-L\), independent of user u’s false-positive rate, while the download-related cost would trivially increase, as the derivative of the utility with respect to p(u) (shown in Eq. 11) is negative.

$$\begin{aligned} \frac{\partial \varphi _u}{\partial p(u)}=-f\cdot (M-in(u))<0 \end{aligned}$$
(11)

Consequently, \(p^*=(p^*(1),\dots ,p^*(U))=(0,\dots ,0)\) is indeed a NE. Now we argue indirectly that there cannot be any other NE. Let us assume \(\hat{p}=(\hat{p}(1),\dots ,\hat{p}(U))\ne p^*\) is a NE. In this state, any player with a positive false-positive rate could decrease its cost by reducing that rate, as doing so only lowers its download-related cost. Hence, \(\hat{p}\) is not an equilibrium.    \(\square \)

This negative result highlights that in our simplistic model, no rational (selfish) user would use FMD; it is only viable when altruism [34] is present. On the other hand, if a certain condition holds, the users do utilize privacy protection in the Social Optimum (SO) [16]. This means a higher total payoff could be achieved (i.e., greater social welfare) if the users cooperate or if the false-positive rates are controlled by a central planner. Indeed, according to Theorem 4, SO\(\not =\)NE if, for all users, the cost of the fuzzy message downloads is smaller than the cost of the privacy loss. The exact improvement of the SO over the NE could be captured by the Price of Stability/Anarchy [1, 22], but we leave this as future work.

Theorem 4

The SO of the FMD-RA Game is not the trivial NE and corresponds to higher overall utilities if \(f\cdot (M-\max _u(in(u)))<L\).

Proof

We show that the condition in the theorem is sufficient to ensure that SO\(\not =\)NE by showing that greater utility can be achieved with some \(0<p^\prime (u)\) than with \(p(u)=0\). To do this, we simplify our scenario and set \(p(u)=p\) for all users. The corresponding utility function is presented in Eq. 12, while Eq. 13 shows the exact utilities when p is either 0 or 1.

$$\begin{aligned} \varphi _u(p)= - L \cdot (1-(1-(1-p)^{U-1})^{in(u)}) - f \cdot (in(u) + p \cdot (M - in(u)))\end{aligned}$$
(12)
$$\begin{aligned} \varphi _u(0)=-L-f\cdot in(u) \qquad \varphi _u(1)=-f\cdot M \end{aligned}$$
(13)

One can check with basic mathematical analysis that the derivative of Eq. 12 is negative at both edges of [0, 1], as \(\frac{\partial \varphi _u(p)}{\partial p}(0)=\frac{\partial \varphi _u(p)}{\partial p}(1)=-f\cdot (M-in(u))\). This implies that the utility is decreasing at these points. Moreover, depending on the relation between the utilities in Eq. 13 (when \(p=0\) and \(p=1\)), two scenarios are possible, as illustrated in Fig. 8. From the figure it is clear that when \(\varphi _u(0)<\varphi _u(1)\) (i.e., \(f\cdot (M-in(u))<L\)) for all users, the maximum of their utilities cannot be at \(p=0\).    \(\square \)
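
As a quick numeric check of this argument (our own sketch; the parameter values are made up so that \(f\cdot (M-in(u))<L\) holds), scanning the symmetric utility of Eq. 12 over p shows that its maximum lies strictly inside (0, 1):

```python
def phi(p, in_u, M, U, f=0.01, L=10.0):
    # Symmetric-rate utility of Eq. 12.
    alpha = (1 - p) ** (U - 1)
    return -L * (1 - (1 - alpha) ** in_u) - f * (in_u + p * (M - in_u))

in_u, M, U = 10, 100, 50
assert 0.01 * (M - in_u) < 10.0  # the condition of Theorem 4 holds
grid = [i / 1000 for i in range(1001)]
best = max(grid, key=lambda p: phi(p, in_u, M, U))
# best lands strictly between 0 and 1, beating both edge utilities.
print(best, phi(best, in_u, M, U), phi(0.0, in_u, M, U), phi(1.0, in_u, M, U))
```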

Potential Game. We also show that FMD-RA is a potential game [26]. This is especially important, as it guarantees that the Best Response Dynamics terminates in a NE.

Definition 6

(Potential Game). A game \(\langle \mathcal {N},\mathcal {A},\mathcal {U}\rangle \) (with players \(\{1,\dots ,U\}\), actions \(\{a_1,\dots ,a_U\}\), and utilities \(\{\varphi _1,\dots ,\varphi _U\}\)) is a potential game if there exists a potential function \(\varPsi \) such that Eq. 14 holds for all players u independently of the other players’ actions.Footnote 14

$$\begin{aligned} \varphi _u(a_u,a_{-u})-\varphi _u(a_u^\prime ,a_{-u})=\varPsi (a_u,a_{-u})-\varPsi (a_u^\prime ,a_{-u}) \end{aligned}$$
(14)

Theorem 5

FMD-RA is a Potential Game with potential function shown in Eq. 15.

$$\begin{aligned} \varPsi (p(1),\dots ,p(U))=-f\cdot \sum _{u=1}^Up(u)\cdot (M-in(u)) \end{aligned}$$
(15)
Fig. 8. Illustration of the utility functions: the yellow curve’s maximum must lie strictly between zero and one, since the gray dot is below the green one and the derivative is negative at both endpoints. (Color figure online)

Proof

We prove Eq. 14 by transforming both sides to the same form. We start with the left side: the privacy-related part of the utility depends only on the other users’ actions; therefore, this part falls out during subtraction. On the other hand, the download-related part remains, as shown below.

$$\begin{aligned} \varphi _u(p(u),(p(-u))-\varphi _u(p(u)^\prime ,p(-u))=\\ -f\cdot (in(u)+p(u) \cdot (M-in(u)))-(-f\cdot (in(u)+p(u)^\prime \cdot (M-in(u))))=\\ -f\cdot p(u) \cdot (M-in(u))-(-f\cdot p(u)^\prime \cdot (M-in(u))) \end{aligned}$$

We get the same result if we perform the subtraction on the right side using the formula in Eq. 15, as all elements of the summation besides the one for u fall out (they are identical because they do not depend on user u’s action).    \(\square \)
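
A small numeric check (our own sketch, reusing the toy cost model from the snippet after Eq. 10 with made-up f, L, and in(u) values) confirms that the potential of Eq. 15 tracks unilateral deviations exactly, as Eq. 14 requires:

```python
import math
import random

def alpha(u, p):
    return math.prod(1 - p[v] for v in range(len(p)) if v != u)

def phi(u, p, in_msgs, M, f=0.01, L=10.0):
    # phi_u from Eq. 10.
    a = alpha(u, p)
    return (-L * (1 - (1 - a) ** in_msgs[u])
            - f * (in_msgs[u] + p[u] * (M - in_msgs[u])))

def potential(p, in_msgs, M, f=0.01):
    # Psi from Eq. 15.
    return -f * sum(p[u] * (M - in_msgs[u]) for u in range(len(p)))

random.seed(1)
in_msgs = [5, 10, 20]
M = sum(in_msgs)
for _ in range(1000):
    p = [random.random() for _ in in_msgs]
    u = random.randrange(len(p))
    q = list(p)
    q[u] = random.random()  # unilateral deviation by player u
    lhs = phi(u, p, in_msgs, M) - phi(u, q, in_msgs, M)
    rhs = potential(p, in_msgs, M) - potential(q, in_msgs, M)
    assert math.isclose(lhs, rhs, abs_tol=1e-9), (lhs, rhs)
print("Eq. 14 holds on all sampled unilateral deviations")
```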

E Attacks on Privacy

We sketch several possible attacks against the FMD scheme that might be fruitful to analyze in more depth.

Intersection Attacks. The untrusted server could possess some background knowledge that allows it to infer that some messages were meant to be received by the same recipient. In this case, the server only needs to consider the intersection of the anonymity sets of the “suspicious” messages. Suppose the server knows that l messages are sent to the same user. Each user lands in the intersection of all l messages’ anonymity sets with probability \(p^{l}\), so the size of that intersection is drawn from the \(\textsf{Binom}(U,p^{l})\) distribution. Therefore, the expected size of the anonymity set after an intersection attack is reduced from pU to \(p^{l}U\).
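
A short sketch of this shrinkage (our own illustration; U, p, and the range of l are made-up values):

```python
U, p = 10_000, 1 / 4   # toy user count and common false-positive rate
for l in range(1, 6):  # messages known to share the same recipient
    print(f"l = {l}: expected anonymity set size ~ {U * p ** l:.1f}")
```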

Sybil Attacks. The collusion of multiple nodes decreases the anonymity set of a message. For instance, when a message is downloaded by K nodes out of U, and N nodes are colluding, then the probability of pinpointing a particular message to a single recipient is \(\frac{\left( {\begin{array}{c}N+1\\ K\end{array}}\right) }{\left( {\begin{array}{c}U\\ K\end{array}}\right) }\). This probability clearly increases as more nodes are controlled by the adversary. On the other hand, controlling more nodes trivially increases the controller’s own privacy (not message-privacy but user-privacy) as well. However, formal reasoning would require a proper definition of both of these privacy notions.
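
The formula can be evaluated directly (our own sketch; U, K, and the N values are made up):

```python
from math import comb

def pinpoint_prob(U, K, N):
    # All K downloaders fall within the N colluders plus the recipient.
    return comb(N + 1, K) / comb(U, K)

U, K = 1000, 10
for N in (50, 100, 250, 500):
    print(f"N = {N}: pinpoint probability = {pinpoint_prob(U, K, N):.2e}")
```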

Neighborhood Attacks. Neighborhood attacks were introduced by Zhou et al. in the context of deanonymizing individuals in social networks [39]. An adversary who knows the neighborhood of a victim node can deanonymize the victim even if the whole graph is released anonymously. FMD is susceptible to neighborhood attacks, given that relationship anonymity can be broken with statistical tests: one can first derive the social graph of FMD users and then launch a neighborhood attack to recover the identities of some users.
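
As a hint of what such a statistical test could look like (our own sketch, not the paper’s exact procedure; the counts and the significance threshold are made up), a server can Z-test whether a recipient downloads a given sender’s messages more often than the advertised false-positive rate explains, in the spirit of Footnotes 6 and 7:

```python
import math

def z_link(downloads, out_msgs, p):
    """Z-statistic for 'u downloads u2's messages at a rate above p(u)'.

    downloads -- how many of the sender's out(u2) messages u fetched
    out_msgs  -- out(u2), the sender's total number of sent messages
    p         -- p(u), the recipient's false-positive rate
    """
    expected = out_msgs * p
    std = math.sqrt(out_msgs * p * (1 - p))
    return (downloads - expected) / std

# Toy numbers: u fetched 45 of 100 messages while advertising p = 1/4.
z = z_link(downloads=45, out_msgs=100, p=0.25)
print(f"z = {z:.2f}; linked at the one-sided 95% level if z > 1.645")
```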


Copyright information

© 2022 International Financial Cryptography Association

About this paper


Cite this paper

Seres, I.A., Pejó, B., Burcsi, P. (2022). The Effect of False Positives: Why Fuzzy Message Detection Leads to Fuzzy Privacy Guarantees?. In: Eyal, I., Garay, J. (eds) Financial Cryptography and Data Security. FC 2022. Lecture Notes in Computer Science, vol 13411. Springer, Cham. https://doi.org/10.1007/978-3-031-18283-9_7


  • DOI: https://doi.org/10.1007/978-3-031-18283-9_7


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-18282-2

  • Online ISBN: 978-3-031-18283-9
