1 Introduction

Information theoretic cryptography studies the problem of secure communication and computation in the presence of computationally unbounded adversaries. Unlike the case of computational cryptography whose full understanding is closely tied to basic open problems in computational complexity, information theoretic solutions depend “only” on non-computational (typically combinatorial or algebraic) objects. One may therefore hope to gain a full understanding of the power and limitations of information theoretic primitives. Indeed, Shannon’s famous treatment of perfectly secure symmetric encryption [30] provides an archetypical example for such a study.

Unfortunately, for most primitives, the picture is far from being complete. This is especially true for the problem of secure function evaluation (SFE) [33], in which a set of parties \(P_1,\ldots ,P_{m}\) wish to jointly evaluate a function f over their inputs while keeping those inputs private. Seminal completeness results show that any function can be securely evaluated with information theoretic security [10, 13] (or computational security [19, 33]) under various adversarial settings. However, the communication complexity of these solutions is tied to the computational complexity of the function (i.e., its circuit size), and it is unknown whether this relation is inherent. For instance, as noted by Beaver, Micali, and Rogaway [8] three decades ago, we cannot even rule out the possibility that any function can be securely computed by a constant number of parties with communication that is polynomial in the input length, even in the simple setting where the adversary passively corrupts a single party. More generally, the communication complexity of securely computing a function (possibly via an inefficient protocol) is wide open, even in the most basic models.

1.1 A Minimal Model for Secure Computation

In light of the above, it makes sense to study the limitations of information theoretic secure computation in its simplest form. In [16] Feige, Kilian and Naor (hereinafter referred to as FKN) presented such a “Minimal Model for Secure Computation”. In this model, Alice and Bob hold private inputs, x and y, and they wish to let Charlie learn the value of f(x, y) without leaking any additional information. The communication pattern is minimal. Alice and Bob each send to Charlie a single message, a and b respectively, which depends on the party’s input and on a random string r which is shared between Alice and Bob but is hidden from Charlie. Given (a, b) Charlie should be able to recover f(x, y) without learning additional information. The parties are assumed to be computationally unbounded, and the goal is to minimize the communication complexity of the protocol (i.e., the total number of bits sent by Alice and Bob). Following [23], we refer to such a protocol as a private simultaneous message protocol (PSM) (Fig. 1).

Fig. 1. Schematic of a PSM protocol.

Definition 1

(Private Simultaneous Messages). A private simultaneous message (PSM) protocol \(\varPi = (\varPi _A,\varPi _B,g)\) for a function \(f: \mathcal {X}\times \mathcal {Y}\rightarrow \mathcal {Z}\) is a triple of functions \(\varPi _A: \mathcal {X}\times \mathcal {R}\rightarrow \mathcal {A}\), \(\varPi _B: \mathcal {Y}\times \mathcal {R}\rightarrow \mathcal {B}\), and \(g: \mathcal {A}\times \mathcal {B}\rightarrow \mathcal {Z}\) that satisfy the following two properties.

  • (\(\delta \)-Correctness) The protocol has correctness error of \(\delta \) if for every \((x,y) \in \mathcal {X}\times \mathcal {Y}\) it holds that

    $$\begin{aligned} \Pr _{r{\mathop {\leftarrow }\limits ^{\$}}\mathcal {R}}[f(x,y) \ne g(\varPi _A(x,r),\varPi _B(y,r))]\le \delta \end{aligned}$$
  • (\(\epsilon \)-Privacy) The protocol has privacy error of \(\epsilon \) if for every pair of inputs \((x,y) \in \mathcal {X}\times \mathcal {Y}\) and \((x',y') \in \mathcal {X}\times \mathcal {Y}\) for which \(f(x,y)=f(x',y')\) the random variables

    $$\begin{aligned} (\varPi _{A}(x,r),\varPi _B(y,r)) \quad \text {and} \quad (\varPi _{A}(x',r),\varPi _B(y',r)), \end{aligned}$$
    (1)

    induced by a uniform choice of \(r{\mathop {\leftarrow }\limits ^{\$}}\mathcal {R}\), are \(\epsilon \)-close in statistical distance.

We mainly consider perfect protocols which enjoy both perfect correctness (\(\delta =0\)) and perfect privacy (\(\epsilon =0\)). We define the communication complexity of the protocol to be \(\log |\mathcal {A}|+\log |\mathcal {B}|\).
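To make the definition concrete, here is a minimal sketch (ours, not from the paper) of a perfect PSM for the single-bit XOR function, together with brute-force checks of both properties:

```python
import itertools
from collections import Counter

# A minimal perfect PSM for f(x, y) = x XOR y on single bits (our toy
# example, not from the paper): the shared randomness is one bit r;
# Alice sends x XOR r, Bob sends y XOR r, and Charlie XORs the messages.
def pi_a(x, r): return x ^ r
def pi_b(y, r): return y ^ r
def g(a, b):    return a ^ b

R = [0, 1]  # the randomness domain

# Perfect correctness: g recovers f(x, y) for every input and every r.
for x, y, r in itertools.product([0, 1], repeat=3):
    assert g(pi_a(x, r), pi_b(y, r)) == x ^ y

# Perfect privacy: inputs with equal outputs induce identical
# transcript distributions over a uniform r.
def transcript_dist(x, y):
    return Counter((pi_a(x, r), pi_b(y, r)) for r in R)

assert transcript_dist(0, 1) == transcript_dist(1, 0)  # both outputs are 1
assert transcript_dist(0, 0) == transcript_dist(1, 1)  # both outputs are 0
print("XOR PSM: perfectly correct and private")
```

Here \(\mathcal {A}=\mathcal {B}=\{0,1\}\), so the communication complexity \(\log |\mathcal {A}|+\log |\mathcal {B}|\) is 2 bits.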

The correctness and privacy conditions assert that, for every pair of inputs (x, y) and \((x',y')\), the transcript distributions are either close to each other when \(f(x,y)=f(x',y')\), or far apart when \(f(x,y)\ne f(x',y')\). Hence, the joint computation of Alice and Bob, \(C_r(x,y)=(\varPi _A(x,r),\varPi _B(y,r))\), can also be viewed as a “randomized encoding” [5, 24] (or “garbled version”) of the function f(x, y) that has the property of being 2-decomposable into an x-part and a y-part. Being essentially non-interactive, such protocols (and their multiparty variants [23]) have found various applications in cryptography (cf. [2, 22]). Moreover, it was shown in [6, 9] that PSM is the strongest model among several other non-interactive models for secret-sharing and zero-knowledge proofs.

FKN showed that any function \(f:\{0,1\}^k\times \{0,1\}^k\rightarrow \{0,1\}\) admits a PSM protocol [16]. The best known communication complexity is polynomial for log-space computable functions [16] and \(O(2^{k/2})\) for general functions [9]. While it seems likely that some functions require super-polynomial communication, the best known lower-bound, due to the original FKN paper, only shows that a random function requires \(3k-O(1)\) bits of communication. This lower-bound is somewhat weak but still non-trivial since an insecure solution (in which Alice and Bob just send their inputs to Charlie) costs 2k bits of communication. The question of improving this lower-bound is an intriguing open problem. In this paper, we aim for a more modest goal. Inspired by the general theory of communication complexity, we ask:

How does the PSM complexity of a function f relate to its combinatorial properties? Is there a “simple” condition that guarantees a non-trivial lower-bound on the PSM complexity?

We believe that such a step is necessary towards proving stronger lower-bounds. Additionally, as we will see, this question leads to several interesting insights for related information-theoretic tasks.

1.2 Revisiting the FKN Lower-Bound

Our starting point is the original proof of the 3k lower-bound from [16]. In order to prove a lower-bound, FKN relax the privacy condition by requiring only that Charlie is unable to recover the last bit of Alice’s input. Formally, let us denote by \(\bar{x}\) the string obtained by flipping the last bit of x. Then, the privacy condition (Eq. 1) is relaxed to hold only over sibling inputs (x, y) and \((\bar{x},y)\) for which \(f(x,y)=f(\bar{x},y)\). We refer to this relaxation as weak privacy. Since (standard) privacy implies weak privacy, it suffices to lower-bound the communication complexity of weakly private PSM protocols.

To prove a lower-bound for random functions, FKN (implicitly) identify three conditions which hold for most functions and show that if a function \(f:\{0,1\}^k\times \{0,1\}^k\rightarrow \{0,1\}\) satisfies these conditions then any weakly private PSM for f has communication complexity of at least \(3k-O(1)\). The FKN conditions are:

  1. The function f is non-degenerate, namely, for every \(x\ne x'\) there exists y for which \(f(x,y)\ne f(x',y)\) and, similarly, for every \(y\ne y'\) there exists x for which \(f(x,y)\ne f(x,y')\).

  2. The function is useful in the sense that for at least \(\frac{1}{2}-o(1)\) of the inputs (x, y) it holds that \(f(x,y)=f(\bar{x},y)\) where \(\bar{x}\) denotes the string x with its last bit flipped. (An input (x, y) for which the equation holds is referred to as being useful.Footnote 1)

  3. We say that \((x_1,\ldots ,x_m)\times (y_1,\ldots ,y_n)\) is a complement similar rectangle of f if \(f(x_i,y_j)=f(\bar{x}_i,y_j)\) for every \(1\le i\le m\) and \(1\le j \le n\). Then, f has no complement similar rectangle of size mn larger than \(M=2^{k+1}\). Equivalently, the function \(f'(x,y)=f(x,y)-f(\bar{x},y)\), which can be viewed as a partial derivative of f with respect to its last coordinate, has no 0-monochromatic rectangle of size larger than M.
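For tiny domains, these three conditions can be checked mechanically. The following Python sketch (our illustrative code, not from the paper; the encoding of inputs as integers, with bit 0 playing the role of the last bit, is an assumption) tests non-degeneracy, usefulness, and the largest complement similar rectangle via the derivative formulation above:

```python
import itertools, random

# Brute-force check of the three FKN conditions for a small random
# f: {0,1}^k x {0,1}^k -> {0,1}; feasible only for tiny k.
k = 3
random.seed(1)
N = 1 << k
f = {(x, y): random.randint(0, 1) for x in range(N) for y in range(N)}
flip = lambda x: x ^ 1  # flip the "last" bit (bit 0 in this encoding)

# 1. Non-degeneracy: all rows are distinct and all columns are distinct.
rows = {x: tuple(f[x, y] for y in range(N)) for x in range(N)}
cols = {y: tuple(f[x, y] for x in range(N)) for y in range(N)}
nondegenerate = len(set(rows.values())) == N and len(set(cols.values())) == N

# 2. Usefulness: fraction of inputs with f(x, y) == f(xbar, y).
useful_frac = sum(f[x, y] == f[flip(x), y]
                  for x in range(N) for y in range(N)) / N**2

# 3. Largest complement similar rectangle of f = largest 0-monochromatic
#    rectangle of the derivative f'(x, y) = f(x, y) XOR f(xbar, y).
df = {(x, y): f[x, y] ^ f[flip(x), y] for x in range(N) for y in range(N)}
best = 0
for m in range(1, N + 1):
    for xs in itertools.combinations(range(N), m):
        ys = [y for y in range(N) if all(df[x, y] == 0 for x in xs)]
        best = max(best, m * len(ys))

print(nondegenerate, round(useful_frac, 3), best)
```

The rectangle search enumerates all row subsets, so it scales as \(2^{2^k}\); it is meant only to make the definitions concrete.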

We observe that the above conditions are, in fact, insufficient to prove a non-trivial lower-bound. As a starting point, we note that the inner-product function has low PSM complexity and has no large monochromatic rectangles. While the inner-product function cannot be used directly as a counterexample (since it has huge complement similar rectangles), we can construct a related function f such that: (1) the derivative \(f'\) is (a variant of) the inner product function and so \(f'\) has no large monochromatic rectangles; and (2) by applying some local preprocessing on Alice’s input, the computation of f(xy) reduces to the computation of the inner product function. Altogether, we prove the following theorem (see Sect. 3).

Theorem 1

(FKN counterexample). There exists a function \(f:\{0,1\}^k\times \{0,1\}^k\rightarrow \{0,1\}\) that satisfies the FKN conditions but has a (standard) PSM of communication complexity of \(2k+O(1)\).

Let us take a closer look at the proof of the FKN lower-bound to see where the gap is. The FKN proof boils down to showing that the set \(S_r\) of all possible transcripts (a, b) sent by Alice and Bob under a random string r has relatively small intersection with the set \(S_{r'}\) of all possible transcripts sent by Alice and Bob under a different random string \(r'\). Such a collision, \(c=(a,b)\in S_r \cap S_{r'}\), is counted as a trivial collision if the inputs (x, y) that generate c under r are the same as the inputs \((x',y')\) that generate c under \(r'\). Otherwise, the collision is counted as non-trivial. The argument mistakenly assumes that all non-trivial collisions are due to sibling inputs, i.e., \((x',y')=(\bar{x},y)\). In other words, it is implicitly assumed that the transcript (a, b) fully reveals all the information about (x, y) except for the last bit of x. (In addition to the value of f(x, y), which is revealed due to the correctness property.) Indeed, we show that the FKN argument holds if one considers fully-revealing PSM protocols. (See Theorem 8 for a slightly stronger version.)

Theorem 2

(LB’s against weakly private fully revealing PSM). Let \(f:\{0,1\}^{k}\times \{0,1\}^{k} \rightarrow \{0,1\}\) be a non-degenerate function. Let M be an upper-bound on the size of the largest complement similar rectangle of f and let U be a lower-bound on the number of useful inputs of f. Then, any weakly-private fully-revealing PSM for f has communication complexity of at least \(2\log U-\log M -O(1)\). In particular, for all but o(1) fraction of the functions \(f:\{0,1\}^{k}\times \{0,1\}^{k} \rightarrow \{0,1\}\), we get a lower-bound of \(3k-O(1)\).

A lower-bound of c bits against fully-revealing weakly-private PSM easily yields a lower-bound of \(c-2k+1\) bits for PSM. (Since a standard PSM can be turned into a fully-revealing weakly-private PSM by letting Alice/Bob append \(x[1:k-1]\) and y to their messages.) Unfortunately, this loss (of 2k bits) makes the 3k bit lower-bound useless. Moreover, Theorem 1 shows that this loss is unavoidable. Put differently, fully-revealing weakly-private PSM may be more expensive than standard PSM. Nevertheless, as we will see in Sect. 1.4, lower-bounds for fully-revealing weakly-private PSM have useful implications for other models.

1.3 Fixing the PSM Lower-Bound

We show that the FKN argument can be fixed by posing stronger requirements on f. Roughly speaking, instead of limiting the size of complement similar rectangles, we limit the size of any pair of similar rectangles by a parameter M. That is, if the restriction of f to the ordered rectangle \(R=(x_1,\ldots ,x_m) \times (y_1,\ldots ,y_{\ell })\) is equal to the restriction of f to the ordered rectangle \(R'=(x'_1,\ldots ,x'_m) \times (y'_1,\ldots ,y'_{\ell })\) and the rectangles are disjoint in the sense that either \(x_i\ne x'_i\) for every i, or \(y_j\ne y'_j\) for every j, then the size \(m\ell \) of R should be at most M. (See Sect. 2 for a formal definition.)

Theorem 3

(perfect-PSM LB’s). Let \(\mathcal {X},\mathcal {Y}\) be sets of size at least 2, and let \(f:\mathcal {X}\times \mathcal {Y}\rightarrow \{0,1\}\) be a non-degenerate function for which any pair of disjoint similar rectangles \((R,R')\) satisfies \(|R|\le M\). Then, any perfect PSM for f has communication of at least \(2(\log |\mathcal {X}|+\log |\mathcal {Y}|)-\log M -3\).

The theorem is proved by a distributional version of the FKN argument which also implies Theorem 2. (See Sect. 4.) As a corollary, we recover the original lower-bound claimed by FKN.

Corollary 1

For a \(1-o(1)\) fraction of the functions \(f: \{0,1\}^k \times \{0,1\}^k \rightarrow \{0,1\}\) any perfect PSM protocol for f requires \(3k-2\log k -O(1)\) bits of total communication.Footnote 2

Proof

It is not hard to verify that a \(1-o(1)\) fraction of all functions are non-degenerate. In Sect. 6 we further show that, for a \(1-o(1)\) fraction of the functions, any pair of disjoint similar rectangles \((R,R')\) satisfies \(|R|\le k^2\cdot 2^k \). The corollary then follows from Theorem 3.     \(\square \)

By partially de-randomizing the proof, we show that the above lower-bound applies to a function that is computable by a family of polynomial-size circuits, or, under standard complexity-theoretic assumptions, by a polynomial-time Turing machine. This resolves an open question of Data, Prabhakaran and Prabhakaran [15] who proved a similar lower-bound for an explicit non-boolean function \(f:\{0,1\}^k\times \{0,1\}^k\rightarrow \{0,1\}^{k-1}\). Prior to our work, we could not even rule out the (absurd!) possibility that all efficiently computable functions admit a perfect PSM with communication of \(2k+o(k)\).

Theorem 4

There exists a sequence of polynomial-size circuits

$$\begin{aligned} f=\left\{ f_k: \{0,1\}^k \times \{0,1\}^k \rightarrow \{0,1\}\right\} \end{aligned}$$

such that any perfect PSM for \(f_k\) has communication complexity of at least \(3k-O(\log k)\) bits. Moreover, assuming the existence of a hitting-set generator against co-nondeterministic uniform algorithms, f is computable by a polynomial-time Turing machine.Footnote 3

Remark 1

(On the hitting-set generator assumption). The exact definition of a hitting-set generator against co-nondeterministic uniform algorithms is postponed to Sect. 6. For now, let us just say that the existence of such a generator follows from standard Nisan-Wigderson type complexity-theoretic assumptions. In particular, it suffices to assume that the class \(\mathsf E\) of functions computable in \(2^{O(n)}\)-deterministic time contains a function that has no sub-exponential nondeterministic circuits [28], or, more liberally, that some function in \(\mathsf E\) has no sub-exponential time Arthur-Merlin protocol [21]. (See also the discussion in [7].)

Lower-bounds for imperfect PSMs. We extend Theorem 3 to handle imperfect PSM protocols by strengthening the non-degeneracy condition and the non-self-similarity condition. This can be used to prove an imperfect version of Corollary 1 showing that, for almost all functions, an imperfect PSM with correctness error \(\delta \) and privacy error \(\epsilon \) must communicate at least

$$\begin{aligned} \min \left\{ 3k-2\log (k), 2k+\log (1/\epsilon ), 2k+\log (1/\delta )\right\} - O(1) \end{aligned}$$

bits. An analogous extension of Theorem 4 yields a similar bound for an explicit function. (See Sect. 5.)

1.4 Applications to Conditional Disclosure of Secrets

We move on to the closely related model of Conditional Disclosure of Secrets (CDS) [18]. In the CDS model, Alice holds an input x and Bob holds an input y, and, in addition, Alice holds a secret bit s. The referee, Charlie, holds both x and y, but does not know the secret s. Similarly to the PSM case, Alice and Bob use shared randomness to compute the messages a and b that are sent to Charlie. The CDS requires that Charlie can recover s from (a, b) if and only if the predicate f(x, y) evaluates to one.Footnote 4

Definition 2

(Conditional Disclosure of Secrets). A conditional disclosure of secrets (CDS) protocol \(\varPi =(\varPi _A,\varPi _B,g)\) for a predicate \(f: \mathcal {X}\times \mathcal {Y}\rightarrow \{0,1\}\) and domain \(\mathcal {S}\) of secrets is a triple of functions \(\varPi _A : \mathcal {X}\times \mathcal {S}\times \mathcal {R}\rightarrow \mathcal {A}\), \(\varPi _B : \mathcal {Y}\times \mathcal {R}\rightarrow \mathcal {B}\) and \(g:\mathcal {X}\times \mathcal {Y}\times \mathcal {A}\times \mathcal {B}\rightarrow \mathcal {S}\) that satisfy the following two properties:

  1. (Perfect Correctness) For every (x, y) that satisfies f and any secret \(s \in \mathcal {S}\) we have that:

    $$\begin{aligned} \Pr _{r {\mathop {\leftarrow }\limits ^{\$}}\mathcal {R}}[g(x,y,\varPi _A(x,s,r),\varPi _B(y,r)) \ne s] = 0. \end{aligned}$$
  2. (Perfect Privacy) For every input (x, y) that does not satisfy f and any pair of secrets \(s,s' \in \mathcal {S}\) the distributions

    $$\begin{aligned} (x,y,\varPi _A(x,s,r),\varPi _B(y,r)) \quad \text {and} \quad (x,y,\varPi _A(x,s',r),\varPi _B(y,r)), \end{aligned}$$

    induced by \(r{\mathop {\leftarrow }\limits ^{\$}}\mathcal {R}\) are identically distributed.

The communication complexity of the CDS protocol is \((\log {| \mathcal {A}|} + \log {| \mathcal {B}|})\) and its randomness complexity is \(\log {| \mathcal {R}|}\). By default, we assume that the protocol supports single-bit secrets (\(\mathcal {S}= \{0,1\}\)).Footnote 5
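As a toy illustration of Definition 2 (our example, not a protocol from the paper), consider a perfect CDS for the equality predicate over \(\mathbb {Z}_p\), with secret domain \(\mathcal {S}=\mathbb {Z}_p\): the shared randomness is a uniform pair \((a,b)\in \mathbb {Z}_p^2\), Alice sends \(a\cdot x+b+s\), Bob sends \(a\cdot y+b\), and Charlie subtracts. If \(x=y\) the difference is s; otherwise it is \(s+a\cdot (x-y)\), which is uniform and hides s.

```python
import random

# Toy perfect CDS for EQ(x, y) = 1 iff x == y over Z_p (our sketch).
p = 257  # a prime, so x - y != 0 is invertible mod p

def cds_a(x, s, r):
    a, b = r
    return (a * x + b + s) % p

def cds_b(y, r):
    a, b = r
    return (a * y + b) % p

def cds_g(x, y, ma, mb):        # Charlie knows x and y (here unused)
    return (ma - mb) % p

random.seed(0)
r = (random.randrange(p), random.randrange(p))
assert cds_g(5, 5, cds_a(5, 42, r), cds_b(5, r)) == 42  # x == y: s recovered

# Privacy sanity check: when x != y, the transcript distribution over
# all randomness is identical for every secret.
def dist(x, y, s):
    return sorted((cds_a(x, s, (a, b)), cds_b(y, (a, b)))
                  for a in range(p) for b in range(p))

assert dist(3, 7, 0) == dist(3, 7, 99)
print("equality CDS: correct and private")
```

Both messages are elements of \(\mathbb {Z}_p\), so the communication is \(2\log p\) bits for a \(\log p\)-bit secret.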

Intuitively, CDS is weaker than PSM since it either releases s or keeps it private but it cannot manipulate the secret data.Footnote 6 Still, this notion has found useful applications in various contexts such as information-theoretically private information retrieval (PIR) protocols [14], priced oblivious transfer protocols [1], secret sharing schemes for graph-based access structures (cf. [11, 12, 31]), and attribute-based encryption [20, 29].

The communication complexity of CDS. In light of the above, it is interesting to understand the communication complexity of CDS. Protocols with communication of O(t) were constructed for t-size Boolean formulas by [18] and were extended to t-size (arithmetic) branching programs by [25] and to t-size (arithmetic) span programs by [6]. Until recently, the CDS complexity of a general predicate \(f:\{0,1\}^k\times \{0,1\}^k\rightarrow \{0,1\}\) was no better than its PSM complexity, i.e., \(O(2^{k/2})\) [9]. This was improved to \(2^{O(\sqrt{k \log k})}\) by Liu, Vaikuntanathan and Wee [27]. Moreover, Applebaum et al. [4] showed that, for very long secrets, the amortized complexity of CDS can be reduced to \(O(\log k)\) bits per bit of secret. Very recently, the amortized cost was further reduced to O(1) establishing the existence of general CDS with constant rate [3].

Lower-bounds for the communication complexity of CDS were first established by Gay et al. [17]. Their main result shows that the CDS communication of a predicate f is at least logarithmic in its randomized one-way communication complexity, and leads to an \(\varOmega (\log k)\) lower-bound for several explicit functions. Applebaum et al. [4] observed that weakly private PSM reduces to CDS. This observation together with the 3k-bit FKN lower-bound for weakly private PSM has led to a CDS lower-bound of \(k-o(k)\) bits for some non-explicit predicate. (The reduction loses about 2k bits.)

In this paper, we further exploit the connection between CDS and PSM by observing that CDS protocols for a predicate h(xy) give rise to weakly private fully revealing PSM for the function \(f((x \circ s),y)=h(x,y) \wedge s\), where \(\circ \) denotes concatenation. By using our lower-bounds for weakly private fully revealing PSM’s we get the following theorem. (See Sect. 7 for a proof.)

Theorem 5

Let \(h:\mathcal {X}\times \mathcal {Y}\rightarrow \{0,1\}\) be a predicate. Suppose that M upper-bounds the size of the largest 0-monochromatic rectangle of h and that for every \(x\in \mathcal {X}\), the residual function \(h(x,\cdot )\) is not the constant zero function. Then, the communication complexity of any perfect CDS for h is at least

$$\begin{aligned} 2\log |h^{-1}(0)|-\log M-\log |\mathcal {X}|-\log |\mathcal {Y}|-1, \end{aligned}$$

where \(|h^{-1}(0)|\) denotes the number of inputs (x, y) that are mapped to zero.

Unlike the non-explicit lower-bound of [4], the above theorem provides a simple and clean sufficient condition for proving non-trivial CDS lower-bounds. For example, we can easily show that a random function has at least linear CDS complexity.

Corollary 2

For all but a o(1) fraction of the predicates \(h:\{0,1\}^k\times \{0,1\}^k\rightarrow \{0,1\}\), any perfect CDS for h has communication of at least \(k-4-o(1)\).

Proof

Let \(h:\{0,1\}^k\times \{0,1\}^k\rightarrow \{0,1\}\) be a randomly chosen predicate. Let \(K=2^k\) and let \(\epsilon =1/\sqrt{K}\). There are exactly \(2^K \cdot 2^K=2^{2K}\) rectangles. Therefore, by a union-bound, the probability of having a 0-monochromatic rectangle of size at least \(M=2K(1+\epsilon )\) is at most

$$\begin{aligned} 2^{2K} \cdot 2^{-M}=2^{-2\epsilon K}=2^{-\varOmega (\sqrt{K})}. \end{aligned}$$

Also, since h has \(K^2\) inputs, the probability of having less than \((\frac{1}{2}-\epsilon )\cdot K^2\) unsatisfying inputs is, by a Chernoff bound, \(2^{-\varOmega (\epsilon ^2 K^2)}=2^{-\varOmega (K)}\). Finally, by the union bound, the probability that there exists \(x\in \mathcal {X}\) for which \(h(x,\cdot )\) is the all-zero function is at most \(K\cdot 2^{-K}\). It follows, by Theorem 5, that with probability of \(1-2^{-\varOmega (\sqrt{K})}\), the function h has a CDS complexity of at least \(k-4-o(1)\).     \(\square \)

We can also get lower-bounds for explicit functions. For example, Gay et al. [17] studied the CDS complexity of the binary inner product function \(h(x,y)=\langle x, y \rangle \). They proved an upper-bound of \(k+1\) bits and a lower-bound of \(\varOmega (\log k)\) bits, and asked as an open question whether a lower-bound of \(\varOmega (k)\) can be established. (The question was open even for the special case of linear CDS for which [17] proved an \(\varOmega (\sqrt{k})\) lower-bound). By plugging the inner-product predicate into Theorem 5, we conclude:

Corollary 3

Any perfect CDS for the inner product predicate \(h_{\text {ip}}:\{0,1\}^k\times \{0,1\}^k\rightarrow \{0,1\}\) requires at least \(k-3-o(1)\) bits of communication.

Proof

It suffices to prove the lower bound for the restriction of inner-product in which \(x\ne 0^k\). It is well known (cf. [26]) that the largest 0-monochromatic rectangle is of size at most \(M=2^{k}\), and the number of “zero” inputs is exactly \(S=(2^k-1)\cdot 2^{k-1}\). Hence, Theorem 5 yields a lower-bound of \(k-3-o(1)\).     \(\square \)
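For small k, both quantities used in the proof can be verified by brute force. A Python sketch (ours; inputs are encoded as k-bit integers and the inner product is the parity of the AND):

```python
import itertools

# Sanity check for small k: the number of zero inputs of inner product
# restricted to x != 0, and the size of its largest 0-monochromatic
# rectangle (which should not exceed M = 2^k).
k = 3
N = 1 << k
ip = lambda x, y: bin(x & y).count("1") & 1  # <x, y> over GF(2)

# Number of (x, y) with x != 0 and <x, y> = 0: (2^k - 1) * 2^(k-1).
zeros = sum(ip(x, y) == 0 for x in range(1, N) for y in range(N))
assert zeros == (N - 1) * N // 2

# Largest 0-monochromatic rectangle, rows restricted to nonzero x.
best = 0
for m in range(1, N):
    for xs in itertools.combinations(range(1, N), m):
        ys = [y for y in range(N) if all(ip(x, y) == 0 for x in xs)]
        best = max(best, m * len(ys))
assert best <= N  # consistent with the bound M = 2^k
print("zero inputs:", zeros, "largest 0-mono rectangle:", best)
```

For k = 3 the search reports 28 zero inputs and a largest rectangle of size \(7=2^k-1\) (the nonzero vectors of the whole space paired with the all-zero column), consistent with the bound \(M=2^k\) used above.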

This lower-bound matches the \(k+1\) upper-bound up to a constant additive difference (of 4 bits). It also implies that in any ABE scheme for the inner-product function which is based on the dual system methodology [32] either the ciphertext or the secret-key must be of length \(\varOmega (k)\). (See [17] for discussion.)

Organization. Following some preliminaries (Sect. 2), we present the counter example for the FKN lower-bound (Sect. 3). We then analyze the communication complexity of perfect PSM (Sect. 4) and imperfect PSM (Sect. 5). Based on these results, we obtain PSM lower-bounds for random and explicit functions (Sect. 6), as well as CDS lower-bounds (Sect. 7).

2 Preliminaries

For a string (or a vector) x of length n, and indices \(1\le i \le j \le n\), we let x[i] denote the i-th entry of x, and let x[i : j] denote the string \((x[i],x[i+1],\ldots , x[j])\). By convention, all logarithms are taken base 2.

Rectangles. An (ordered) rectangle of size \(m \times n\) over some finite domain \(\mathcal {X}\times \mathcal {Y}\) is a pair \(\rho =(\mathbf {x},\mathbf {y})\), where \(\mathbf {x} = (x_1,\ldots ,x_m) \in \mathcal {X}^m\) and \(\mathbf {y}= (y_1,\ldots ,y_n) \in \mathcal {Y}^n\) satisfy \(x_i \ne x_j\) and \(y_i \ne y_j\) for all \(i \ne j\). We say that (x, y) belongs to \(\rho \) if \(x=x_i\) and \(y=y_j\) for some i, j (or by abuse of notation we simply write \(x\in {\mathbf {x}}\) and \(y\in \mathbf {y}\)). The size of an \(m \times n\) rectangle \(\rho \) is mn, and its density with respect to some probability distribution \(\mu \) over \(\mathcal {X}\times \mathcal {Y}\), is \(\sum _{x\in \mathbf {x}, y\in \mathbf {y}} \mu (x,y)\). Let \(\rho = (\mathbf {x},\mathbf {y})\) and \(\rho ' = (\mathbf {x}',\mathbf {y}')\) be a pair of \(m \times n\)-rectangles. We say that \(\rho \) and \(\rho '\) are x-disjoint (resp., y-disjoint) if \(x_i \ne x'_i\) for all \(i \in \{1,\ldots ,m\}\) (resp., if \(y_j \ne y'_j\) for all \(j \in \{1,\ldots ,n\}\)). We say that \(\rho \) and \(\rho '\) are disjoint if they are either x-disjoint or y-disjoint.

As an example, consider the three \(2\times 3\) rectangles \(\rho _1 = \bigl ((1,2),(5,6,7)\bigr )\), \(\rho _2 = \bigl ((2,1),(6,5,4)\bigr )\), and \(\rho _3 = \bigl ((1,3),(7,5,6)\bigr )\). Among those, \(\rho _1\) and \(\rho _3\) are y-disjoint but not x-disjoint, \(\rho _2\) and \(\rho _3\) are x-disjoint but not y-disjoint, and \(\rho _1\) and \(\rho _2\) are both x-disjoint and y-disjoint. Therefore, each of these pairs is considered to be disjoint.
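The example above can be checked mechanically; a small sketch of the definitions (ordered rectangles as pairs of tuples):

```python
# Disjointness of ordered rectangles, following the definitions above:
# a rectangle is a (row-tuple, column-tuple) pair, and two rectangles
# are x-disjoint (resp. y-disjoint) if they differ position-wise in
# every row (resp. every column).
def x_disjoint(r1, r2):
    return all(a != b for a, b in zip(r1[0], r2[0]))

def y_disjoint(r1, r2):
    return all(a != b for a, b in zip(r1[1], r2[1]))

def disjoint(r1, r2):
    return x_disjoint(r1, r2) or y_disjoint(r1, r2)

rho1 = ((1, 2), (5, 6, 7))
rho2 = ((2, 1), (6, 5, 4))
rho3 = ((1, 3), (7, 5, 6))

assert y_disjoint(rho1, rho3) and not x_disjoint(rho1, rho3)
assert x_disjoint(rho2, rho3) and not y_disjoint(rho2, rho3)
assert x_disjoint(rho1, rho2) and y_disjoint(rho1, rho2)
assert all(disjoint(a, b) for a, b in [(rho1, rho2), (rho1, rho3), (rho2, rho3)])
```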

If \(f: \mathcal {X}\times \mathcal {Y}\rightarrow \mathcal {Z}\) is a function and \(\rho \) a rectangle of size \(m \times n\), we let \(f_{[\rho ]}\) be the matrix M of size \(m \times n\) whose entry \(M_{ij}\) is \(f(x_i,y_j)\). A rectangle \(\rho \) is 0-monochromatic (resp., 1-monochromatic) if \(f_{[\rho ]}\) is the all-zero matrix (resp., all-one matrix). A rectangle \(\rho \) is similar to a rectangle \(\rho '\) (with respect to f) if \(f_{[\rho ]}=f_{[\rho ']}\). A rectangle \((\mathbf {x}=(x_1,\ldots ,x_m),\mathbf {y})\) is complement similar if it is similar to the rectangle \(((\bar{x}_1,\ldots ,\bar{x}_m),\mathbf {y})\), where \(\bar{x}\) denotes the string x with its last bit flipped.

Probabilistic notation. We will use calligraphic letters \(\mathcal {A}\), \(\mathcal {B}\), ..., to denote finite sets. Lower case letters denote values from these sets, i.e., \(x \in \mathcal {X}\). Upper case letters usually denote random variables (unless the meaning is clear from the context).

Given two random variables A and B over the same set \(\mathcal {A}\), we use \(\Vert A - B\Vert \) to denote their statistical distance \(\Vert A - B\Vert = \frac{1}{2}\sum _{a\in \mathcal {A}}|\Pr [A=a] - \Pr [B=a]|\). The min-entropy of A, denoted by \(H_{\infty }(A)\), is minus the logarithm of the probability of the most likely value of A, i.e., \(-\log \max _{a\in \mathcal {A}} \Pr [A=a]\).

3 A Counterexample to the FKN Lower-Bound

Let \(\mathbf {T}_0,\mathbf {T}_1\) be a pair of \((k-1)\times (k-1)\) non-singular matrices (over the binary field \(\mathbb {F}={{\mathrm{GF}}}[2]\)) with the property that \(\mathbf {T}=\mathbf {T}_0+\mathbf {T}_1\) is also non-singular. (The existence of such matrices is guaranteed via a simple probabilistic argument.Footnote 7) Define the mapping \(L:\mathbb {F}^k\rightarrow \mathbb {F}^k\) by

$$\begin{aligned} x\mapsto (\mathbf {T}_{x[k]} \cdot x[1:k-1])\circ x[k], \end{aligned}$$

where \(\circ \) denotes concatenation. That is, if the last entry of x is zero then L applies \(\mathbf {T}_0\) to the \((k-1)\)-bit prefix \(x'=x[1:k-1]\) and extends the resulting \((k-1)\)-dimensional vector by an additional 0 entry, and if \(x[k]=1\) then the prefix \(x'\) is sent to \(\mathbf {T}_1x'\) and the vector is extended by an additional 1 entry. Note that L is a bijection (since \(\mathbf {T}_0,\mathbf {T}_1\) are non-singular). The function \(f:\mathbb {F}^k\times \mathbb {F}^k\rightarrow \mathbb {F}\) is defined by

$$\begin{aligned} (x,y)\mapsto \langle L(x), y \rangle , \end{aligned}$$

where \(\langle \cdot , \cdot \rangle \) denotes the inner-product function over \(\mathbb {F}\).
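A concrete instance of this construction for small k can be sampled and sanity-checked as follows (our sketch; vectors are encoded as integer bitmasks, with bit \(k-1\) playing the role of the last entry x[k], and the helper names are ours):

```python
import random

# Sample nonsingular (k-1)x(k-1) matrices T0, T1 over GF(2) such that
# T = T0 + T1 is also nonsingular, then build L and f and check that
# L is a bijection.
k = 4
random.seed(2)

def rank2(rows):
    """Rank of GF(2) row vectors (encoded as bitmasks), via elimination."""
    pivots = {}  # top-bit position -> pivot row
    for v in rows:
        while v:
            t = v.bit_length() - 1
            if t not in pivots:
                pivots[t] = v
                break
            v ^= pivots[t]
    return len(pivots)

def rand_nonsingular(n):
    while True:
        m = [random.getrandbits(n) for _ in range(n)]
        if rank2(m) == n:
            return m

def mat_vec(m, v):
    """GF(2) matrix-vector product; bit i of the result is <row_i, v>."""
    return sum(1 << i for i, row in enumerate(m) if bin(row & v).count("1") & 1)

while True:  # probabilistic argument: a few trials suffice
    T0, T1 = rand_nonsingular(k - 1), rand_nonsingular(k - 1)
    T = [a ^ b for a, b in zip(T0, T1)]
    if rank2(T) == k - 1:
        break

def L(x):
    last = (x >> (k - 1)) & 1          # the "last entry" x[k]
    prefix = x & ((1 << (k - 1)) - 1)  # the (k-1)-bit prefix x[1:k-1]
    return mat_vec(T1 if last else T0, prefix) | (last << (k - 1))

def f(x, y):
    return bin(L(x) & y).count("1") & 1  # <L(x), y> over GF(2)

assert len({L(x) for x in range(1 << k)}) == 1 << k  # L is a bijection
print("L is a bijection; counterexample f defined")
```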

In Sect. 3.1, we will prove that f satisfies the FKN conditions (described in Sect. 1.2).

Lemma 1

The function f is (1) non-degenerate, (2) useful, and (3) its largest complement similar rectangle is of size at most \(M=2^{k+1}\).

Recall that f is non-degenerate if for every \(x\ne x'\) (resp., \(y\ne y'\)) the residual functions \(f(x,\cdot )\) and \(f(x',\cdot )\) (resp., \(f(\cdot ,y)\) and \(f(\cdot ,y')\)) are distinct. It is useful if \(\Pr _{x,y}[f(x,y)\ne f(\bar{x},y)]\ge \frac{1}{2}\), where \(\bar{x}\) denotes the string x with its last entry flipped. Also, a rectangle \(R=(\mathbf {x},\mathbf {y})\) is complement similar if \(f(x,y)=f(\bar{x},y)\) for every \(x\in \mathbf {x},y\in \mathbf {y}\).

In Sect. 3.2 we will show that f admits a PSM with communication complexity of \(2k+O(1)\).

Lemma 2

The function f has a PSM protocol with communication complexity of \(2k+2\).

Theorem 1 follows from Lemmas 1 and 2.

3.1 f Satisfies the FKN Properties (Proof of Lemma 1)

(1) f is non-degenerate. Fix \(x_1\ne x_2\in \mathbb {F}^k\) and observe that \(L(x_1)\ne L(x_2)\) (since L is a bijection). Therefore there exists y for which \(f(x_1,y)=\langle L(x_1), y \rangle \ne \langle L(x_2), y \rangle =f(x_2,y)\). (In fact this holds for half of y’s). Similarly, for every \(y_1\ne y_2\) there exists \(v\in \mathbb {F}^k\) for which \(\langle v, y_1 \rangle \ne \langle v, y_2 \rangle \), and since L is a bijection we can take \(x=L^{-1}(v)\) and get that \(f(x,y_1)=\langle v, y_1 \rangle \ne \langle v, y_2 \rangle = f(x,y_2)\).

(2) f is useful. Choose \(x'{\mathop {\leftarrow }\limits ^{\$}}\mathbb {F}^{k-1}\) and \(y{\mathop {\leftarrow }\limits ^{\$}}\mathbb {F}^k\) and observe that \(f(x'\circ 0, y)= f(x'\circ 1,y)\) if and only if

$$\begin{aligned} \langle \mathbf {T}x', y[1:k-1] \rangle +y[k]=0, \end{aligned}$$

which happens with probability \(\frac{1}{2}\).

(3) The largest complement similar rectangle is of size at most \(2^{k+1}\). Fix some rectangle \(R=(\mathbf {x},\mathbf {y})\), where \(\mathbf {x} = (x_1,\ldots ,x_m) \in (\mathbb {F}^k)^m\) and \(\mathbf {y}= (y_1,\ldots ,y_n) \in (\mathbb {F}^k)^n\). We show that if R is complement similar then \(mn\le 2\cdot 2^k\). Since R is complement similar, for every \(x\in \mathbf {x}, y\in \mathbf {y}\) it holds that

$$\begin{aligned} f(x,y)=f(\bar{x},y), \end{aligned}$$

which by definition of f implies that

$$\begin{aligned} \langle \mathbf {T}x' \circ 1, y \rangle =0, \end{aligned}$$

where \(x'\) is the \((k-1)\) prefix of x. Let d be the dimension of the linear subspace spanned by the vectors in \(\mathbf {x}\), and so \(m\le 2^d\). Since \(\mathbf {T}\) has full rank, the dimension of the subspace V spanned by \(\left\{ (\mathbf {T}x[1:k-1] \circ 1): x\in \mathbf {x}\right\} \) is at least \(d-1\). (We may lose 1 in the dimension due to the removal of the last entry of the vectors \(x\in \mathbf {x}\).) Noting that every \(y\in \mathbf {y}\) is orthogonal to V, we conclude that the dimension of the subspace spanned by \(\mathbf {y}\) is at most \(k-(d-1)\). It follows that \(n\le 2^{k-(d-1)}\) and so \(mn\le 2\cdot 2^k\).     \(\square \)

3.2 PSM for f (Proof of Lemma 2)

Note that f can be expressed as applying the inner product to v and y where v can be locally computed based on x. Hence it suffices to construct a PSM for the inner-product function and let Alice compute v and apply the inner-product protocol to v. (This reduction is a special instance of the so-called substitution lemma of randomized encoding, cf. [2, 22].) Lemma 2 now follows from the following lemma.

Lemma 3

The inner product function \(h_{\text {ip}}:\mathbb {F}^k\times \mathbb {F}^k\rightarrow \mathbb {F}\) has a PSM protocol with communication complexity of \(2k+2\).

A proof of the lemma appears in [27, Corollary 3]. For the sake of self-containment we describe here an alternative proof.

Proof

We show a PSM \(\varPi = (\varPi _A,\varPi _B,g)\) with communication 2k under the promise that the inputs of Alice and Bob, xy, are both not equal to the all-zero vector. To get a PSM for the general case, let Alice and Bob locally extend their inputs xy to \((k+1)\)-long inputs \(x'=x\circ 1\) and \(y'=y\circ 1\), which are never all-zero. Then run the protocol \(\varPi \) and at the end let Charlie flip the outcome (since \(\langle x', y' \rangle = \langle x, y \rangle +1\)). It is easy to verify that the reduction preserves correctness and privacy. Since the inputs are longer by a single bit, the communication becomes \(2(k+1)=2k+2\), as promised.

We move on to describe the protocol \(\varPi \). The common randomness consists of a random invertible matrix \(\mathbf {R}\in \mathbb {F}^{k\times k}\). Given non-zero \(x\in \mathbb {F}^k\), Alice outputs \(a=\mathbf {R}x\) where x is viewed as a column vector. Bob, who holds \(y\in \mathbb {F}^k\), outputs the row vector \(b=y^T\mathbf {R}^{-1}\). Charlie outputs the product \(b\cdot a\).
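The protocol is small enough to verify exhaustively. The following sketch (our illustration, assuming \(\mathbb {F}=\mathbb {F}_2\) and k = 2; all helper names are ours) enumerates every invertible matrix in the extended dimension \(k+1\), checks perfect correctness of Charlie's flipped output, and checks that the transcript distribution depends only on \(\langle x, y \rangle \):

```python
import itertools
from collections import Counter

k = 2                                    # Alice/Bob input length over F_2
n = k + 1                                # length after appending a fixed 1

def inverse_f2(M):
    """Gauss-Jordan inversion over F_2; returns None if M is singular."""
    d = len(M)
    A = [list(M[i]) + [int(i == j) for j in range(d)] for i in range(d)]
    for c in range(d):
        piv = next((i for i in range(c, d) if A[i][c]), None)
        if piv is None:
            return None
        A[c], A[piv] = A[piv], A[c]
        for i in range(d):
            if i != c and A[i][c]:
                A[i] = [a ^ b for a, b in zip(A[i], A[c])]
    return [row[d:] for row in A]

# all invertible n-by-n matrices over F_2 (168 of them for n = 3)
all_R = [M for bits in itertools.product((0, 1), repeat=n * n)
         for M in [tuple(bits[i * n:(i + 1) * n] for i in range(n))]
         if inverse_f2(M) is not None]

def transcript(x, y, R):
    xe, ye = x + (1,), y + (1,)          # extension forces non-zero inputs
    Rinv = inverse_f2(R)
    a = tuple(sum(R[i][j] * xe[j] for j in range(n)) % 2 for i in range(n))
    b = tuple(sum(ye[i] * Rinv[i][j] for i in range(n)) % 2 for j in range(n))
    return a, b

def decode(a, b):
    return (sum(ai * bi for ai, bi in zip(a, b)) + 1) % 2   # Charlie flips

ip = lambda x, y: sum(a * b for a, b in zip(x, y)) % 2

# perfect correctness: decode(a, b) = <x, y> for every input and randomness
for x in itertools.product((0, 1), repeat=k):
    for y in itertools.product((0, 1), repeat=k):
        assert all(decode(*transcript(x, y, R)) == ip(x, y) for R in all_R)

# perfect privacy: the transcript distribution depends only on <x, y>
dist = {(x, y): Counter(transcript(x, y, R) for R in all_R)
        for x in itertools.product((0, 1), repeat=k)
        for y in itertools.product((0, 1), repeat=k)}
for (x, y), (u, v) in itertools.product(dist, repeat=2):
    if ip(x, y) == ip(u, v):
        assert dist[(x, y)] == dist[(u, v)]
print("correctness and privacy hold for k =", k)
```

Because the shared randomness ranges over the full group of invertible matrices, comparing the transcript multisets over all 168 choices of \(\mathbf {R}\) is an exact check of perfect privacy for this k.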

Perfect correctness is immediate: \((y^T \mathbf {R}^{-1})\cdot (\mathbf {R}x)=y^T x\), as required. To prove perfect privacy, we use the following claim.

Claim 6

Let \(x,y\in \mathbb {F}^k\) be non-zero vectors and denote their inner-product by z. Then, there exists an invertible matrix \(\mathbf {M}\in \mathbb {F}^{k\times k}\) for which \(\mathbf {M}e_1 =x\) and \(v_z^T \mathbf {M}^{-1}=y^T\) where \(e_i\) is the i-th unit vector, and \(v_z\) is taken to be \(e_1\) if \(z=1\) and \(e_k\) if \(z=0\).

Proof

Let us first rewrite the condition \(v_z^T \mathbf {M}^{-1}=y^T\) as \(v_z^T=y^T \mathbf {M}\). Let \(V\subset \mathbb {F}^{k}\) be the linear subspace of all vectors that are orthogonal to y. Note that the dimension of V is \(k-1\). We distinguish between two cases based on the value of z.

Suppose that \(z=0\), that is, \(x\in V\) and \(v_z=e_k\). Then set the first column of \(\mathbf {M}\) to be x and choose the next \(k-2\) columns \(\mathbf {M}_2,\ldots ,\mathbf {M}_{k-1}\) so that together with x they form a basis for V. Let the last column \(\mathbf {M}_{k}\) be some vector outside V. Observe that the columns are linearly independent and so \(\mathbf {M}\) is invertible. Also, it is not hard to verify that \(\mathbf {M}e_1 =x\) and that \(y^T \mathbf {M}=e_k^T\).

Next, consider the case where \(z=1\), that is, \(x\notin V\) and \(v_z=e_1\). Then, take \(\mathbf {M}_1=x\) and let the other columns \(\mathbf {M}_2,\ldots ,\mathbf {M}_{k}\) to be some basis for V. Since x is non-zero the columns of \(\mathbf {M}\) are linearly independent. Also, \(\mathbf {M}e_1 =x\) and \(y^T \mathbf {M}=e^T_1\). The claim follows.     \(\square \)
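The case analysis in the proof of Claim 6 is constructive, so it can be turned into code. The sketch below (our illustration, over \(\mathbb {F}_2\); helper names are ours) builds the columns of \(\mathbf {M}\) exactly as in the two cases and verifies \(\mathbf {M}e_1 =x\), \(y^T \mathbf {M}=v_z^T\), and invertibility for all non-zero pairs with k = 3. (Over \(\mathbb {F}_2\), any vector outside V automatically has inner product 1 with y, so the last column needs no scaling.)

```python
import itertools

def rank_f2(vecs):
    """Rank of a list of 0/1 vectors over F_2."""
    A = [list(v) for v in vecs]
    r = 0
    for c in range(len(A[0])):
        piv = next((i for i in range(r, len(A)) if A[i][c]), None)
        if piv is None:
            continue
        A[r], A[piv] = A[piv], A[r]
        for i in range(len(A)):
            if i != r and A[i][c]:
                A[i] = [a ^ b for a, b in zip(A[i], A[r])]
        r += 1
    return r

def claim6_matrix(x, y):
    """Columns of M per the case analysis in Claim 6 (over F_2)."""
    k = len(x)
    z = sum(a * b for a, b in zip(x, y)) % 2
    V = [v for v in itertools.product((0, 1), repeat=k)
         if sum(a * b for a, b in zip(v, y)) % 2 == 0]     # kernel of y
    cols = [x]
    if z == 0:                       # x in V: fill out a basis of V, then leave V
        for v in V:
            if len(cols) == k - 1:
                break
            if rank_f2(cols + [list(v)]) == len(cols) + 1:
                cols.append(list(v))
        outside = next(v for v in itertools.product((0, 1), repeat=k)
                       if sum(a * b for a, b in zip(v, y)) % 2 == 1)
        cols.append(list(outside))
    else:                            # x not in V: the rest is any basis of V
        for v in V:
            if len(cols) == k:
                break
            if rank_f2(cols + [list(v)]) == len(cols) + 1:
                cols.append(list(v))
    return cols                      # the k columns of M

# check M e_1 = x and y^T M = v_z^T for all non-zero x, y (k = 3)
k = 3
for x in itertools.product((0, 1), repeat=k):
    for y in itertools.product((0, 1), repeat=k):
        if not any(x) or not any(y):
            continue
        z = sum(a * b for a, b in zip(x, y)) % 2
        cols = claim6_matrix(list(x), list(y))
        assert rank_f2(cols) == k                  # M is invertible
        assert tuple(cols[0]) == x                 # M e_1 = x
        yTM = [sum(a * b for a, b in zip(y, c)) % 2 for c in cols]
        v_z = [1] + [0] * (k - 1) if z == 1 else [0] * (k - 1) + [1]
        assert yTM == v_z
print("Claim 6 construction verified for k =", k)
```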

We can now prove perfect privacy. Fix some non-zero \(x,y\in \mathbb {F}^k\) and let \(z=\langle x, y \rangle \). We show that the joint distribution of the messages (AB) depends only on z. In particular, (AB) is distributed identically to \((\mathbf {R}e_1,v_z^T \mathbf {R}^{-1})\) where \(\mathbf {R}\) is a random invertible matrix. Indeed, letting \(\mathbf {M}\) be the matrix guaranteed in Claim 6 we can write

$$\begin{aligned} (\mathbf {R}x,y^T \mathbf {R}^{-1}) = (\mathbf {R}(\mathbf {M}e_1), (v_z^T \mathbf {M}^{-1}) \mathbf {R}^{-1}). \end{aligned}$$

Noting that \(\mathbf {T}=\mathbf {R}\mathbf {M}\) is also a random invertible matrix (since the set of invertible matrices forms a group) we conclude that the RHS is identically distributed to \((\mathbf {T}e_1, v_z^T \mathbf {T}^{-1})\), as claimed.     \(\square \)

Remark 2

Overall the PSM for f has the following form: Alice sends \(a=\mathbf {R}\cdot (L(x)\circ 1)\) and Bob sends \(b=(y\circ 1)^T \mathbf {R}^{-1}\) where \(\mathbf {R}\in \mathbb {F}^{(k+1)\times (k+1)}\) is a random invertible matrix. The privacy proof shows that if the input (xy) is mapped to (ab) for some \(\mathbf {R}\) then for every \((x',y')\) for which \(f(x,y)=f(x',y')\), there exists \(\mathbf {R}'\) under which the input \((x',y')\) is mapped to (ab) as well. Hence, there are collisions between non-sibling inputs. As explained in the introduction, this makes the FKN lower-bound inapplicable.

4 Lower Bound for Perfect PSM Protocols

In this section we prove a lower bound for perfect PSM protocols.

Definition 3

For a function \(f:\mathcal {X}\times \mathcal {Y}\rightarrow \mathcal {Z}\) and distribution \(\mu \) over the domain \(\mathcal {X}\times \mathcal {Y}\) with marginals \(\mu _A\) and \(\mu _B\), define

$$\begin{aligned} \alpha (\mu ) = \max _{(R_1,R_2)} \min (\mu (R_1),\mu (R_2)), \end{aligned}$$

where the maximum ranges over all pairs of similar disjoint rectangles \((R_1,R_2)\). We also define

$$\begin{aligned} \beta (\mu )=\Pr [\, (X,Y)\ne (X',Y')\mid f(X,Y)=f(X',Y')\,], \end{aligned}$$

where \((X,Y)\) and \((X',Y')\) represent two independent samples from \(\mu \). Finally, we say that f is non-degenerate with respect to \(\mu \) if for every \(x\ne x'\) in the support of \(\mu _A\) there exists some \(y \in \mathcal {Y}\) for which \(f(x,y)\ne f(x',y)\), and similarly for every \(y\ne y'\) in the support of \(\mu _B\) there exists some \(x \in \mathcal {X}\) for which \(f(x,y)\ne f(x,y')\).
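For small domains, \(\beta (\mu )\) and \(H_{\infty }(\mu )\) in Definition 3 are directly computable (\(\alpha (\mu )\) requires enumerating pairs of rectangles, which we skip here). A sketch, assuming the uniform distribution and the inner-product predicate; the helper names are ours:

```python
import itertools
from collections import Counter

k = 3
X = Y = list(itertools.product((0, 1), repeat=k))
ip = lambda x, y: sum(a * b for a, b in zip(x, y)) % 2    # inner product mod 2

# non-degeneracy under the uniform distribution (Definition 3)
non_deg = (all(any(ip(x, y) != ip(xp, y) for y in Y)
               for x in X for xp in X if x != xp) and
           all(any(ip(x, y) != ip(x, yp) for x in X)
               for y in Y for yp in Y if y != yp))

# beta(mu) for uniform mu: Pr[(X,Y) != (X',Y') | f(X,Y) = f(X',Y')]
counts = Counter(ip(x, y) for x in X for y in Y)
same = sum(c * c for c in counts.values())     # ordered pairs agreeing on f
N = len(X) * len(Y)                            # the N identical pairs always agree
beta = (same - N) / same

print(non_deg, beta)    # beta comes out well above 1/4
```

Here \(H_{\infty }(\mu )=2k\), and the computed \(\beta \) is far above the 1/4 bound used later in the proof of Theorem 7.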

We prove the following key lemma.

Lemma 4

Let \(f:\mathcal {X}\times \mathcal {Y}\rightarrow \mathcal {Z}\). Then the communication complexity of any perfect PSM protocol is at least

$$\begin{aligned} \max _{\mu } \; \log (1/\alpha (\mu ))+ H_{\infty }(\mu )-\log (1/\beta (\mu ))-1, \end{aligned}$$

where the maximum is taken over all (not necessarily product) distributions \(\mu \) under which f is non-degenerate.

The lower-bound is meaningful as long as \(\beta \) is not too small. Intuitively, this makes sure that the privacy requirement (which holds only over inputs on which the function agrees) is not trivial to achieve under \(\mu \).

For the special case of a Boolean function f, we can use the uniform distribution over \(\mathcal {X}\times \mathcal {Y}\) and prove Theorem 3 from the introduction (restated here for the convenience of the reader).

Theorem 7

(Theorem 3 restated). Let \(\mathcal {X},\mathcal {Y}\) be sets of size at least 2. Let \(f:\mathcal {X}\times \mathcal {Y}\rightarrow \{0,1\}\) be a non-degenerate function for which any pair of disjoint similar rectangles \((R,R')\) satisfies \(|R|\le M\). Then, any perfect PSM for f has communication of at least \(2(\log |\mathcal {X}|+\log |\mathcal {Y}|)-\log M -3\).

Proof

For the uniform distribution \(\mu \) we have \(\alpha (\mu )\le M/(|\mathcal {X}||\mathcal {Y}|)\), \(H_{\infty }(\mu )=\log |\mathcal {X}|+\log |\mathcal {Y}|\) and

$$\begin{aligned} \beta (\mu )\ge \Pr [(X,Y)\ne (X',Y')]-\Pr [f(X,Y)\ne f(X',Y')], \end{aligned}$$

where \((X,Y)\) and \((X',Y')\) are two independent copies of uniformly distributed inputs. The minuend is \(1-1/(|\mathcal {X}||\mathcal {Y}|)\) and the subtrahend is at most \(\frac{1}{2}\) (since f is Boolean). For \(|\mathcal {X}||\mathcal {Y}|\ge 4\), we get \(\beta (\mu )\ge 1/4\), and the proof follows from the key lemma (Lemma 4).     \(\square \)

We note that the constant 3 can be replaced by \(2+o_k(1)\) when the size of the domain \(\mathcal {X}\times \mathcal {Y}\) grows with k.

Weakly Private Fully Revealing PSM. We can also derive a lower-bound on the communication complexity of weakly private fully revealing PSM. We begin with a formal definition.

Definition 4

(Weakly Private Fully Revealing PSM). A weakly private fully revealing PSM \(\varPi = (\varPi _A,\varPi _B,g)\) for a function \(f: \mathcal {X}\times \mathcal {Y}\rightarrow \mathcal {Z}\) is a perfect PSM for the function \(f':\{0,1\}^{k_1}\times \{0,1\}^{k_2} \rightarrow \{0,1\}^{k_1-1}\times \{0,1\}^{k_2} \times \{0,1\}\) that takes (xy) and outputs \((x[1:k_1-1],y,f(x,y))\), where \(x[1:k_1-1]\) is the \(k_1-1\) prefix of x.

In the following, we say that f is weakly non-degenerate if for every x there exists y such that \(f(x,y)\ne f(\bar{x},y)\). Recall that an input (xy) is useful if \(f(x,y)=f(\bar{x},y)\). We prove the following (stronger) version of Theorem 2 from the introduction.

Theorem 8

Let \(f:\{0,1\}^{k_1}\times \{0,1\}^{k_2} \rightarrow \{0,1\}\) be a weakly non-degenerate function. Let M be an upper-bound on the size of the largest complement similar rectangle of f and let U be a lower-bound on the number of useful inputs of f. Then, any weakly-private fully-revealing PSM for f has communication complexity of at least \(2\log U-\log M -2\). In particular, for all but an o(1) fraction of the predicates \(f:\{0,1\}^{k}\times \{0,1\}^{k} \rightarrow \{0,1\}\) we get a lower-bound of \(3k-4-o(1)\).

Proof

Let \(f'\) be the function defined in Definition 4 based on f. We will prove a lower-bound on the communication complexity of any perfect PSM for \(f'\). Let \(\mu \) be the uniform distribution over the set of useful inputs. Since f is weakly non-degenerate the function \(f'\) is non-degenerate under \(\mu \). Also, observe that

$$\begin{aligned} \alpha (\mu ) \le M/U, \quad \beta (\mu ) =1/2, \quad \text {and } \quad H_{\infty }(\mu )\ge \log U. \end{aligned}$$

The first part of the theorem follows from Lemma 4.

To prove the second (“in particular”) part, observe that for a random function f, each pair of inputs (x, y) and \((\bar{x},y)\) gets the same f-value with probability \(\frac{1}{2}\), independently of other inputs. Hence, with all but o(1) probability, a \(\frac{1}{2}-o(1)\) fraction of the \(2^{2k-1}\) pairs is mapped to the same value, and so there are \(2^{2k-1}(1-o(1))\) useful inputs. (Each successful pair contributes two useful inputs.) Also, each M-size rectangle R is complement similar with probability \(2^{-M}\). By taking a union bound over all \(2^{2^{k+1}}\) rectangles, we conclude that f has an \(M=2^{k+1}(1+o(1))\)-size complement similar rectangle with probability at most \(2^{2^{k+1}-M}=o(1)\). We conclude that all but an o(1) fraction of the functions do not have a weakly-private fully-revealing PSM with complexity smaller than \(3k-4-o(1)\).     \(\square \)
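The counting step for useful inputs can be checked empirically: for a random predicate on a small domain, roughly half of the complement pairs agree, giving about \(2^{2k-1}\) useful inputs. A minimal sketch (our illustration, k = 6, fixed seed for reproducibility):

```python
import itertools, random

k = 6
random.seed(1)                                   # fixed seed, for reproducibility
pts = list(itertools.product((0, 1), repeat=k))
f = {(x, y): random.randint(0, 1) for x in pts for y in pts}

flip = lambda x: tuple(1 - b for b in x)         # the complement of x
useful = sum(1 for (x, y) in f if f[(x, y)] == f[(flip(x), y)])
print(useful, 2 ** (2 * k - 1))                  # useful concentrates around 2^(2k-1)
```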

4.1 Proof of the Key Lemma (Lemma 4)

Fix some function \(f:\mathcal {X}\times \mathcal {Y}\rightarrow \mathcal {Z}\) and let \(\varPi =(\varPi _A,\varPi _B,g)\) be a perfect PSM protocol for f. Let \(\mu \) denote some distribution over the domain \(\mathcal {X}\times \mathcal {Y}\) and assume that f is non-degenerate with respect to \(\mu \).

We will use a probabilistic version of the FKN proof. In particular, consider two independent executions of \(\varPi \) on inputs that are sampled independently from \(\mu \). We let \((X,Y)\) and R (resp., \((X',Y')\) and \(R'\)) denote the random variables that represent the inputs of Alice and Bob and their shared randomness in the first execution (resp., second execution). Thus, we can for example write \(\Pr [(A,B) = (A',B') \wedge X \ne X']\) to denote the probability that the messages in the two executions match while the two inputs for Alice are different.

To simplify notation somewhat, we define the following events:

$$\begin{aligned} \mathcal {P}^{(=)}&:\equiv (A = A') \wedge (B = B') \\ \mathcal {I}^{(=)}&:\equiv (X = X') \wedge (Y = Y') \\ \mathcal {I}^{(\ne )}&:\equiv (X \ne X') \vee (Y \ne Y') \equiv \lnot \mathcal {I}^{(=)}\\ \mathcal {F}^{(=)}&:\equiv f(X,Y) = f(X',Y') \end{aligned}$$

(The notation \(\mathcal {P}\) is chosen to indicate equivalence/inequivalence of the protocol messages and \(\mathcal {I}\) to indicate equivalence/inequivalence of the inputs.) Our lower-bound follows from the following claims.

Claim 9

The communication complexity of \(\varPi \) is at least \(\log (1/\Pr [\mathcal {I}^{(\ne )} \wedge \mathcal {P}^{(=)}])-\log (1/\beta ).\)

Proof

We will compute the collision probability \(\Pr [(A,B) = (A',B')]\) of two random executions by showing that

$$\begin{aligned} \Pr [\mathcal {P}^{(=)}] = \frac{\Pr [\mathcal {I}^{(\ne )} \wedge \mathcal {P}^{(=)}]}{\Pr [\mathcal {I}^{(\ne )}|\mathcal {F}^{(=)}]}=\frac{\Pr [\mathcal {I}^{(\ne )} \wedge \mathcal {P}^{(=)}]}{\beta }. \end{aligned}$$
(2)

Because the collision probability of two independent instances of a random variable is at least the inverse of the alphabet size, the alphabet of A and B must have size at least \(\beta /\Pr [\mathcal {I}^{(\ne )} \wedge \mathcal {P}^{(=)}]\). Thus, in total the protocol requires

$$\begin{aligned} \log (1/\Pr [\mathcal {I}^{(\ne )} \wedge \mathcal {P}^{(=)}])-\log (1/\beta ) \end{aligned}$$

bits of communication.

We move on to prove (2). By perfect correctness, \(\mathcal {P}^{(=)}\) can only happen if \(\mathcal {F}^{(=)}\) happens, therefore

$$\begin{aligned} \frac{\Pr [\mathcal {P}^{(=)}]}{\Pr [\mathcal {I}^{(\ne )} \wedge \mathcal {P}^{(=)}]} = \frac{\Pr [\mathcal {F}^{(=)}]\Pr [\mathcal {P}^{(=)}|\mathcal {F}^{(=)}]}{\Pr [\mathcal {I}^{(\ne )} \wedge \mathcal {P}^{(=)}]}. \end{aligned}$$
(3)

By the same reasoning, we can express the denominator of the RHS by

$$ \Pr [\mathcal {I}^{(\ne )} \wedge \mathcal {P}^{(=)} \wedge \mathcal {F}^{(=)}]=\Pr [\mathcal {F}^{(=)}]\Pr [\mathcal {I}^{(\ne )}|\mathcal {F}^{(=)}]\Pr [\mathcal {P}^{(=)}|\mathcal {F}^{(=)}\wedge \mathcal {I}^{(\ne )}].$$

It follows that (3) equals to

$$\begin{aligned} \frac{\Pr [\mathcal {F}^{(=)}]\Pr [\mathcal {P}^{(=)}|\mathcal {F}^{(=)}]}{\Pr [\mathcal {F}^{(=)}]\Pr [\mathcal {I}^{(\ne )}|\mathcal {F}^{(=)}]\Pr [\mathcal {P}^{(=)}|\mathcal {F}^{(=)}\wedge \mathcal {I}^{(\ne )}]}= \frac{1}{\Pr [\mathcal {I}^{(\ne )}|\mathcal {F}^{(=)}]}, \end{aligned}$$
(4)

where equality follows by noting that \(\Pr [\mathcal {P}^{(=)}|\mathcal {F}^{(=)}] = \Pr [\mathcal {P}^{(=)}|\mathcal {F}^{(=)}\wedge \mathcal {I}^{(\ne )}]\) (due to perfect privacy). Multiplying the LHS of (3) and the RHS of (4) by \(\Pr [\mathcal {I}^{(\ne )} \wedge \mathcal {P}^{(=)}]\), we conclude (2).     \(\square \)

Claim 10

For any pair of strings \(r\ne r'\),

$$\begin{aligned} \Pr [\mathcal {P}^{(=)} \wedge \mathcal {I}^{(\ne )}|R = r, R' = r'] \le 2\alpha (\mu ) 2^{-H_{\infty }(\mu )}. \end{aligned}$$

Proof

We see that

$$\begin{aligned} \Pr [\mathcal {P}^{(=)} \wedge \mathcal {I}^{(\ne )}|R=r\wedge R'=r']&\le \Pr [\mathcal {P}^{(=)} \wedge (X\ne X') |R=r\wedge R'=r']\\&\qquad + \Pr [\mathcal {P}^{(=)} \wedge (Y\ne Y') |R=r\wedge R'=r']\;. \end{aligned}$$

By symmetry, it suffices to bound the first summand by \(\alpha (\mu ) 2^{-H_{\infty }(\mu )}\); the second summand is bounded analogously.

Say that x collides with \(x'\) if \(\varPi _A(x,r) = \varPi _A(x', r')\). Restricting our attention to x's in the support of \(\mu _A\), we claim that every x can collide with at most a single \(x'\). Indeed, if this is not the case, then \(\varPi _A(x,r) = \varPi _A(x', r')=\varPi _A(x'', r')\) for two distinct \(x',x''\) in the support of \(\mu _A\). The second equality implies that when the randomness is \(r'\), for every y, the messages (a, b) communicated under \((x',y)\) are equal to the ones communicated under \((x'',y)\). By perfect correctness, this implies that \(f(x',y)=f(x'',y)\) for every y, contradicting the non-degeneracy of f under \(\mu \). Analogously, let us say that y collides with \(y'\) if \(\varPi _B(y,r) = \varPi _B(y', r')\). The same reasoning shows that every y in the support of \(\mu _B\) can collide with at most a single \(y'\) in the support of \(\mu _B\).

Let \(\mathbf {x} = (x_1,\ldots ,x_m)\) and \(\mathbf {x}' = (x'_1,\ldots ,x'_m)\) be a complete list of entries for which \(x_i\) collides with \(x'_i\) and \(x_i \ne x'_i\) and \(\mu _A(x_i),\mu _A(x'_i)>0\). Analogously let \(\mathbf {y}= (y_1,\ldots ,y_n)\) and \(\mathbf {y}' = (y'_1,\ldots ,y'_n)\) be a complete list for which \(y_i\) collides with \(y'_i\) and \(\mu _B(y_i),\mu _B(y'_i)>0\). (Note that we do not require \(y_i \ne y'_i\).) Since collisions are unique (as explained above), the tuples \(\mathbf {x},\mathbf {x}',\mathbf {y},\mathbf {y}'\) are uniquely determined up to permutation.

By definition, the input tuples \((x,y,x',y')\) with \(x \ne x'\) for which the induced messages satisfy \((a,b) = (a',b')\) are exactly those of the form \((x_i,y_j,x'_i,y'_j)\) for some i and j.

Now, consider the two x-disjoint rectangles \(\rho =(\mathbf {x},\mathbf {y})\) and \(\rho '=(\mathbf {x}',\mathbf {y}')\) and assume, without loss of generality, that \(\mu (\rho )\le \mu (\rho ')\). Since Alice and Bob send the same messages with randomness r on input \((x_i,y_j)\) as they send with randomness \(r'\) on input \((x'_i,y'_j)\), perfect correctness implies that \(f(x_i,y_j) = f(x'_i,y'_j)\). Therefore, \(f_{[\rho ]} = f_{[\rho ']}\), and so \(\mu (\rho )\le \alpha (\mu )\).

To complete the argument, note that \(\mathcal {P}^{(=)} \wedge (X\ne X')\) can only happen if we pick \((X,Y) = (x_i,y_j)\) and \((X',Y') = (x'_i,y'_j)\) for some ij. The event that there exists ij for which \((X,Y) = (x_i,y_j)\) has probability at most \(\alpha (\mu )\). The event that \((X',Y') = (x'_i,y'_j)\) for the same (ij) has probability at most \(\max _{x,y} \mu (x,y) = 2^{-H_{\infty }(\mu )}\).     \(\square \)

Combining Claims 9 and 10, we derive Lemma 4.    \(\square \)

5 Lower Bounds for Imperfect PSM Protocols

In this section we state a lower-bound on the communication complexity of imperfect PSM protocols. For this, we will have to strengthen the requirements on the function f.

We call f strongly non-degenerate if for any \(x \ne x'\) we have \(|\{y \mid f(x,y) = f(x',y)\} | \le 0.9 |\mathcal {Y}|\), and for any \(y \ne y'\) we have \(|\{x \mid f(x,y) = f(x,y')\} | \le 0.9 |\mathcal {X}|\). A pair of ordered \(m\times n\) rectangles \(R = (\mathbf {x},\mathbf {y})\) and \(R' = (\mathbf {x}',\mathbf {y}')\) in which either \(x_i \ne x'_i\) for all \(i\in [m]\), or \(y_i \ne y'_i\) for all \(i\in [n]\), is called approximately similar if for at least a 0.99 fraction of the pairs (i, j) we have \(f(x_i,y_j) = f(x'_i,y'_j)\). (The constants 0.9 and 0.99 are somewhat arbitrary and other constants may be chosen.)

In the full version we prove the following theorem:

Theorem 11

Let \(f: \mathcal {X}\times \mathcal {Y}\rightarrow \mathcal {Z}\) be a strongly non-degenerate function whose largest approximately similar pair of rectangles is of size at most M. Then, any PSM for f with privacy error of \(\epsilon \) and correctness error of \(\delta < \frac{1}{100}\), requires at least

$$\begin{aligned} \log |\mathcal {X}| + \log |\mathcal {Y}| + \min \left\{ \begin{array}{l} \log |\mathcal {X}| + \log |\mathcal {Y}| - \log \left( \frac{1}{\Pr [\mathcal {F}^{(=)}]} \right) ,\\ \log |\mathcal {X}| + \log |\mathcal {Y}| - \log M,\\ \log (1/\epsilon ),\\ \log (1/\delta ) - \log \left( \frac{1}{\Pr [\mathcal {F}^{(=)}]} \right) \end{array} \right\} - c \end{aligned}$$
(5)

bits of communication, where c is some universal constant (that does not depend on f) and \(\Pr [\mathcal {F}^{(=)}] = \Pr [f(X,Y) = f(X',Y')]\) when (XY) and \((X',Y')\) are picked independently and uniformly at random from \(\mathcal {X}\times \mathcal {Y}\).

In the special case of a Boolean function f, it holds that \(\Pr [\mathcal {F}^{(=)}] = \Pr [f(X,Y) = f(X',Y')]\ge 1/2\), and the communication lower-bound simplifies to

$$\begin{aligned} \log |\mathcal {X}| + \log |\mathcal {Y}| + \min \left\{ \log |\mathcal {X}| + \log |\mathcal {Y}| - \log M,\log (1/\epsilon ),\log (1/\delta )\right\} - c \end{aligned}$$

where c is some universal constant. In Sect. 6, we will use Theorem 11 to prove imperfect PSM lower-bounds for random functions and for efficiently computable functions.

6 Imperfect PSM Lower-Bounds for Random and Explicit Functions

In this section we will show that most functions have non-trivial imperfect PSM complexity, and establish the existence of an explicit function that admits a non-trivial imperfect PSM lower-bound. Formally, in Sect. 6.1 we will prove the following theorem (which strengthens Corollary 1 from the introduction).

Theorem 12

For a \(1-o(1)\) fraction of the functions \(f: \{0,1\}^k \times \{0,1\}^k \rightarrow \{0,1\}\) any PSM protocol for f with privacy error of \(\epsilon \) and correctness error of \(\delta \), \(\delta < \frac{1}{100}\), requires at least

$$\begin{aligned} \ell (k,\epsilon ,\delta )= \min \left\{ 3k-2\log (k), 2k+\log (1/\epsilon ), 2k+\log (1/\delta )\right\} - c \end{aligned}$$
(6)

bits of communication, where c is some universal constant.

By de-randomizing the proof, we derive (in Sect. 6.2) the following theorem (which strengthens Theorem 4 from the introduction).

Theorem 13

There exists a sequence of polynomial-size circuits

$$\begin{aligned} f=\left\{ f_k: \{0,1\}^k \times \{0,1\}^k \rightarrow \{0,1\}\right\} \end{aligned}$$

such that any \(\delta \)-correct \(\epsilon \)-private PSM for \(f_k\) has communication complexity of at least \(\ell (k,\epsilon ,\delta )\) bits (as defined in (6)). Moreover, assuming the existence of a hitting-set generator against co-nondeterministic uniform algorithms, there exists an explicit family f which is computable by a polynomial-time Turing machine whose imperfect PSM communication complexity is at least \(\ell (k,\epsilon ,\delta )-O(\log k)\).

The reader is advised to read the following subsections sequentially since the proof of Theorem 13 builds on the proof of Theorem 12.

6.1 Lower Bounds for Random Functions (Proof of Theorem 12)

We will need the following definition.

Definition 5

(good function). We say that a function \(f:\{0,1\}^k\times \{0,1\}^k\rightarrow \{0,1\}\) is good if it satisfies the following conditions:

  1. For every \(x\ne x'\) and every set \(\mathbf {y}\) of \(k^2\) consecutive strings (according to some predefined order over \(\{0,1\}^k\)), it holds that \(f(x,y)=f(x',y)\) for at most a 0.9-fraction of the elements \(y\in \mathbf {y}\).

  2. Similarly, for every \(y\ne y'\) and set \(\mathbf {x}\) of \(k^2\) consecutive strings (according to some predefined order over \(\{0,1\}^k\)), it holds that \(f(x,y)=f(x,y')\) for at most a 0.9-fraction of \(x\in \mathbf {x}\).

  3. For every pair of \(k^2\times k^2\) x-disjoint or y-disjoint rectangles \(R, R'\), it holds that \(f_{[R]}\) disagrees with \(f_{[R']}\) on at least a 0.01 fraction of the entries.

Claim 14

Any good \(f:\{0,1\}^k\times \{0,1\}^k\rightarrow \{0,1\}\) satisfies the conditions of Theorem 11 with \(M=2^k\cdot k^2\), and therefore any \(\delta \)-correct \(\epsilon \)-private PSM for f, \(\delta < \frac{1}{100}\), requires communication of

$$\ell (k,\epsilon ,\delta )=\min \left\{ 3k-2\log (k), 2k+\log (1/\epsilon ), 2k+\log (1/\delta ) \right\} - c,$$

for some universal constant c.

Proof

Fix some good f. Condition (1) guarantees that \(f(x,\cdot )\) and \(f(x',\cdot )\) differ on a 0.1 fraction of each block of \(k^2\) consecutive y's, and therefore, overall, they must differ on a 0.1 fraction of all possible y's. Applying the same argument on the y-axis (using condition (2)), we conclude that a good f must be strongly non-degenerate.

Similarly, a good f cannot have a pair of x-disjoint approximately similar \(m\times n\) rectangles \(R,R'\) of size \(mn \ge 2^k \cdot k^2\). To see this, observe that the latter condition implies that m and n are both at least \(k^2\), and therefore, again by an averaging argument, there must exist a pair of \(k^2\times k^2\) x-disjoint sub-rectangles of R and \(R'\) that are also approximately similar. Applying the same argument to y-disjoint rectangles we conclude that any good f satisfies the conditions of Theorem 11.     \(\square \)

We say that a family of functions \(\left\{ f_z:\mathcal {A}\rightarrow \mathcal {B}\right\} _{z\in \mathcal {Z}}\) is t-wise independent if for any t-tuple of distinct inputs \((a_1,\ldots ,a_t)\) and for a uniformly chosen \(z{\mathop {\leftarrow }\limits ^{\$}}\mathcal {Z}\), the joint distribution of \((f_z(a_1),\ldots ,f_z(a_t))\) is uniform over \(\mathcal {B}^t\).

Claim 15

Pick \(f: \{0,1\}^k\times \{0,1\}^k\rightarrow \{0,1\}\) uniformly at random among all such functions. Then, with probability \(1-o(1)\), the resulting function is good. Moreover, this holds even if f is chosen from a family of \(k^4\)-wise independent functions.

Proof

Choose f randomly from a family of \(k^4\)-wise independent hash functions. Fix a pair \(x\ne x'\) and a \(k^2\)-subset \(\mathbf {y}\subset \{0,1\}^k\) of consecutive y's. By a Chernoff bound, the probability that \(f(x,y)=f(x',y)\) for more than a 0.9 fraction of \(y\in \mathbf {y}\) is at most \(2^{-\varOmega (k^2)}\). There are at most \(2^{2k}\) pairs \(x,x'\), and at most \(2^k\) different sets \(\mathbf {y}\) of consecutive y's; therefore, by a union bound, the probability that condition (1) does not hold is \(2^{3k}2^{-\varOmega (k^2)}=2^{-\varOmega (k^2)}\). A similar argument shows that (2) fails with a similar probability.

We move on to prove there is no pair of approximately similar x-disjoint rectangles of size exactly \(k^2\times k^2\). (Again, the case of y-disjoint rectangles is treated similarly.)

Let \(m=k^2\). Fix two x-disjoint \(m \times m\) rectangles \(R = (\mathbf {x},\mathbf {y})\) and \(R' = (\mathbf {x}',\mathbf {y}')\). We want to upper-bound the probability that \(f_{[R]}\) agrees with \(f_{[R']}\) on 99% of their entries. This event happens only if the entries of f satisfy all but 1% of the \(m^2\) equations \(f(x_i,y_j) = f(x'_i,y'_j)\) for \((i,j) \in \{1,\ldots ,m\} \times \{1,\ldots ,m\}\). The probability that any such equation is satisfied is \(\frac{1}{2}\): since the rectangles are x-disjoint the equation is non-trivial. We can further find a subset T of at least \(m^2/2\) such equations such that each equation in T uses an entry f(x, y) that is not used in any other equation. Let us fix some \(0.01m^2\)-size subset S of equations that are allowed to be unsatisfied. After removing S from T, we still have at least \(0.49m^2\) equations that are simultaneously satisfied with probability at most \(2^{-0.49m^2}\). There are at most \(2^{H_2(0.01)m^2}\) sets S (where \(H_2\) is the binary entropy function), at most \(2^{2mk}\) choices for R, and at most \(2^{2mk}\) choices for \(R'\). Hence, by a union bound, the probability that (3) fails is at most

$$\begin{aligned} 2^{-0.49m^2+0.081m^2+4m^{3/2}}<2^{-\varOmega (m^2)}, \end{aligned}$$

the claim follows.     \(\square \)

Theorem 12 follows from Claims 14 and 15.    \(\square \)

6.2 Explicit Lower-Bound (Proof of Theorem 13)

Our next goal is to obtain an explicit lower-bound. We begin by noting that good functions (as per Definition 5) can be identified by efficient co-nondeterministic algorithms.

Definition 6

A co-nondeterministic algorithm M is a Turing machine M(z, v) that takes z as its primary input and v as a witness. For each \(z\in \{0,1\}^*\) we define \(M(z)=1\) (“M accepts z”) if \(M(z, v) = 1\) for every witness v, and \(M(z)=0\) otherwise, i.e., if there exists a witness v for which \(M(z, v) = 0\).

Claim 16

There exists a co-nondeterministic algorithm that, given an s-bit representation of a function \(f: \{0,1\}^k\times \{0,1\}^k\rightarrow \{0,1\}\), accepts f if and only if f is good. The algorithm's complexity is \(O(k^4 t)\), where t is the time complexity of evaluating f on a given point.

Proof

It suffices to describe a polynomial-time verifiable witness for the failure of each of the goodness conditions. If f is not good due to (1), then the witness is a pair \(x\ne x'\) and a \(k^2\)-set \(\mathbf {y}\) of consecutive y's. Since f can be efficiently evaluated, we can verify that \(f(x,y)=f(x',y)\) for more than a 0.9-fraction of the y's in \(\mathbf {y}\) in time \(O(k^2 t)\). A violation of (2) is treated similarly. If f is not good due to (3), then the witness is a pair of x-disjoint or y-disjoint \(k^2\times k^2\) rectangles \(R,R'\) that are approximately similar. Again, we can verify the validity of this witness in time \(O(k^4 t)\).     \(\square \)

Let \(s(k)=\mathrm{poly}(k)\) and let \(\left\{ f_z:\{0,1\}^k\times \{0,1\}^k\rightarrow \{0,1\}\right\} _{z\in \{0,1\}^{s}}\) be a family of \(k^4\)-wise independent functions with an evaluator algorithm F which takes an index \(z\in \{0,1\}^s\) and input \((x,y)\in \{0,1\}^k\times \{0,1\}^k\) and outputs in time t(k) the value of \(f_z(x,y)\). (Such an F can be based on \(k^4\)-degree polynomials over a field of size \(\Theta (k^4)\)). Claims 14 and 15 imply that for most choices of z, the function \(f_z\) has an imperfect PSM complexity of at least \(\ell (k,\epsilon ,\delta )\). Since F is efficiently computable, for every z there is a polynomial-size circuit that computes \(f_z\). Hence, there exists a polynomial-size computable function for which the \(\ell (k,\epsilon ,\delta )\) lower-bound holds, and the first part of Theorem 13 follows.

To prove the second part, we use a properly chosen pseudorandom generator (PRG) \(G:\{0,1\}^{O(\log k)}\rightarrow \{0,1\}^s\) to “derandomize” the family \(\left\{ f_z\right\} \). That is, we define the function \(g:\{0,1\}^{O(\log k)}\times \{0,1\}^k \times \{0,1\}^k\rightarrow \{0,1\}\) which takes (w, x, y) and outputs \(f_z(x,y)\) where \(z=G(w)\in \{0,1\}^s\). Concretely, we require G to “hit” the image of any co-nondeterministic algorithm of complexity \(T=O(k^4 t)\). Formally, this means that for every T-time co-nondeterministic algorithm M, if \(\Pr _z[M(z)=1]\ge \frac{1}{2}\) then there exists a “seed” w for which \(M(G(w))=1\).

Taking M to be the algorithm from Claim 16, we conclude, by Claims 15 and 14, that for some seed w, the function \(f_{G(w)}\) has an imperfect PSM complexity of at least \(\ell (k,\epsilon ,\delta )\). Let us parse g as a two-party function, say by partitioning w into two halves \(w_A,w_B\) and giving \((x,w_A)\) to Alice and \((y,w_B)\) to Bob. We conclude that g must have an imperfect PSM complexity of at least \(\ell (k,\epsilon ,\delta )\). Since the input length \(k'\) of Alice and Bob becomes longer by an additional \(O(\log k)\) bits, the lower-bound becomes at least \(\ell (k',\epsilon ,\delta )-O(\log k')\), as claimed. The second part of Theorem 13 follows.     \(\square \)

7 Lower-Bounds for Conditional Disclosure of Secrets

In this section we derive CDS lower bounds. We begin with a reduction from weakly-private fully-revealing PSM (Definition 4) to CDS.

Claim 17

Let \(h:\mathcal {X}\times \mathcal {Y}\rightarrow \{0,1\}\) be a predicate. Define the function \(f:\mathcal {X}'\times \mathcal {Y}\rightarrow \{0,1\}\) where \(\mathcal {X}'=\mathcal {X}\times \{0,1\}\) by \(f((x,s),y)=s \wedge h(x,y)\). If h has a perfect CDS with communication complexity of c then f has a weakly-private fully-revealing PSM with complexity of \(c+\log |\mathcal {X}|+\log |\mathcal {Y}|\).

Proof

Given a CDS protocol \(\varPi =(\varPi _A,\varPi _B,g)\) for h we construct a weakly-private fully-revealing PSM for f as follows. Given an input (x, s), Alice sends \((x,a=\varPi _A(x,s,r))\) where x plays the role of Alice's input in the CDS, s plays the role of the secret, and r is a shared string uniformly sampled from \(\mathcal {R}\). Bob takes his input y, and sends \((y,b=\varPi _B(y,r))\). Charlie outputs \(h(x,y)\wedge g(x,y,a,b)\).

It is not hard to verify that the protocol is perfectly correct and fully revealing. Indeed, a PSM decoding error happens only if g(xyab) fails to decode the secret s (which happens with probability zero). To prove weak privacy observe that if f agrees on a pair of inputs, ((x, 0), y) and ((x, 1), y), then h(xy) must be zero. By CDS privacy, for \(R{\mathop {\leftarrow }\limits ^{\$}}\mathcal {R}\) the distribution \((x,y,\varPi _A(x,0,R),\varPi _B(y,R))\) is identical to the distribution \((x,y,\varPi _A(x,1,R),\varPi _B(y,R))\), as required.     \(\square \)
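To make the reduction concrete, here is a toy instantiation (ours, not from the paper): the underlying CDS is the folklore one-time-pad style scheme for the equality predicate over \(\mathbb {Z}_p\), where Alice sends \(a = wx+u+s\) and Bob sends \(b = wy+u\) for shared random (w, u); then \(a-b\) equals s exactly when \(x=y\), and is masked by \(w(x-y)\) otherwise. The sketch verifies perfect correctness and weak privacy of the resulting fully-revealing PSM:

```python
import itertools
from collections import Counter

p = 5                                     # toy prime modulus (hypothetical choice)
Zp = range(p)

def cds_alice(x, s, w, u):  return (w * x + u + s) % p
def cds_bob(y, w, u):       return (w * y + u) % p
def cds_decode(a, b):       return (a - b) % p        # equals s when x == y

h = lambda x, y: int(x == y)              # the equality predicate

# the Claim 17 PSM: Alice sends (x, a), Bob sends (y, b)
def psm(x, s, y, w, u):
    return (x, cds_alice(x, s, w, u)), (y, cds_bob(y, w, u))

def charlie(msg_a, msg_b):
    (x, a), (y, b) = msg_a, msg_b
    return x, y, h(x, y) & cds_decode(a, b)           # fully revealing output

# perfect correctness: Charlie recovers (x, y, s AND h(x, y))
for x, y, s, w, u in itertools.product(Zp, Zp, (0, 1), Zp, Zp):
    assert charlie(*psm(x, s, y, w, u)) == (x, y, s & h(x, y))

# weak privacy: when h(x, y) = 0 the transcript hides the secret bit s
for x, y in itertools.product(Zp, Zp):
    if h(x, y):
        continue
    d0 = Counter(psm(x, 0, y, w, u) for w in Zp for u in Zp)
    d1 = Counter(psm(x, 1, y, w, u) for w in Zp for u in Zp)
    assert d0 == d1
print("Claim 17 reduction verified for the equality CDS, p =", p)
```

The privacy check works because, for \(x\ne y\), the map \((w,u)\mapsto (a,b)\) for secret 0 and for secret 1 produce the same multiset of transcripts (shift w by \((x-y)^{-1}\)), exactly mirroring the CDS-privacy step in the proof above.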

Next, we show that the properties of f needed for applying Theorem 8 follow from simple requirements on h. In the following, we say that \(x\in \mathcal {X}\) is a null input if the residual function \(h(x,\cdot )\) is the constant zero function.

Claim 18

Let h and f be as in Claim 17. Then

  1. The size of the largest complement similar rectangle of f equals the size of the largest 0-monochromatic rectangle of h.

  2. The number U of useful inputs of f is exactly twice the number of inputs that h maps to zero.

  3. If h has no null input, then f is weakly non-degenerate.

Proof

The claim follows immediately by noting that for every (xy) it holds that \(f((x,1),y)=f((x,0),y)\) if and only if \(h(x,y)=0\). We proceed with a formal argument.

  1. Consider some complement similar rectangle \(R=(\mathbf {x}' \times \mathbf {y})\) of f. For every \((x,b)\in \mathbf {x}'\) and \(y\in \mathbf {y}\), it holds that

    $$\begin{aligned} f((x,b),y)=f((x,1-b),y), \end{aligned}$$

    and therefore \(h(x,y)=0\) and R is a 0-monochromatic rectangle of h.

  2. Every input \((x,y)\) that does not satisfy h induces an unordered pair, ((x, 1), y) and ((x, 0), y), of useful inputs for f. Therefore, the number of (ordered) useful inputs of f is exactly \(2|h^{-1}(0)|\).

  3. Fix some \((x,s)\in \mathcal {X}'\) and assume, towards a contradiction, that for every y it holds that \(f((x,s),y)= f((x,1-s),y)\). By the definition of f this means that \(h(x,y)=0\) for every y, contradicting our assumption on h.

    \(\square \)
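As a quick sanity check of item 2, the following Python snippet (names ours) counts the useful inputs of f for the index predicate on a small domain. The formal definition of a useful input appears earlier in the paper; here we take the characterization used in the proof above: an input ((x, b), y) is useful exactly when flipping the secret bit does not change f, i.e., when h(x, y) = 0.

```python
from itertools import product

K = 3  # toy parameter

def h(i, y):             # index predicate on a small domain
    return y[i]

def f(x, y):             # f((i, s), y) = s AND h(i, y), as in Claim 17
    i, s = x
    return s & h(i, y)

X_prime = [(i, s) for i in range(K) for s in (0, 1)]
Y = list(product((0, 1), repeat=K))

# Useful inputs, per the characterization in the proof of item 2:
# ((i, b), y) is useful iff flipping the secret bit leaves f unchanged.
useful = [((i, b), y) for (i, b) in X_prime for y in Y
          if f((i, b), y) == f((i, 1 - b), y)]

zeros = [(i, y) for i in range(K) for y in Y if h(i, y) == 0]

print(len(useful), 2 * len(zeros))  # equal, as item 2 predicts
```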

Theorem 5 (restated here for convenience) now follows immediately from the lower-bound on weakly-private fully-revealing PSM (Theorem 8).

Theorem 19

(Theorem 5 restated). Let \(h:\mathcal {X}\times \mathcal {Y}\rightarrow \{0,1\}\) be a predicate. Suppose that M upper-bounds the size of the largest 0-monochromatic rectangle of h and that for every \(x\in \mathcal {X}\), the residual function \(h(x,\cdot )\) is not the constant zero function. Then, the communication complexity of any perfect CDS for h is at least

$$\begin{aligned} 2\log |h^{-1}(0)|-\log M-\log |\mathcal {X}|-\log |\mathcal {Y}|-1, \end{aligned}$$

where \(|h^{-1}(0)|\) denotes the number of inputs (x, y) that are mapped to zero.

Proof

Let \(h:\mathcal {X}\times \mathcal {Y}\rightarrow \{0,1\}\) be a predicate that satisfies the theorem's requirements. That is, M upper-bounds the size of the largest 0-monochromatic rectangle of h, there are \(S=|h^{-1}(0)|\) inputs that are mapped to zero, and for every \(x\in \mathcal {X}\), the residual function \(h(x,\cdot )\) is not the constant zero function.

Suppose that h has a perfect CDS with communication complexity of c. By Claim 17, the function f (defined in the claim) has a weakly-private fully-revealing PSM with complexity of at most

$$\begin{aligned} c+\log |\mathcal {X}|+\log |\mathcal {Y}|, \end{aligned}$$

which, by Claim 18 and Theorem 8, is at least

$$\begin{aligned} 2\log U-\log M -2=2\log (2S)-\log M-2=2\log S-\log M. \end{aligned}$$

It follows that

$$\begin{aligned} c\ge 2\log S-\log M -1-(\log |\mathcal {X}|+\log |\mathcal {Y}|), \end{aligned}$$

as required.     \(\square \)

Example 1

(The index predicate). As a sanity check, consider the index predicate \(f_{ind}:[k]\times \{0,1\}^k\rightarrow \{0,1\}\) which, given an index \(i\in [k]\) and a string \(y\in \{0,1\}^k\), outputs y[i], the i-th bit of y. Clearly, exactly half of all inputs are mapped to 0. Also, for every i the residual function \(f_{ind}(i,\cdot )\) is not the constant zero function. Finally, every maximal 0-monochromatic rectangle is of the form \(I\times \left\{ y: y[i]=0, \forall i\in I\right\} \) for a nonempty \(I\subseteq [k]\). The size of any such rectangle is exactly \(|I|\cdot 2^{k-|I|}\le 2^{k-1}\). Plugging this into Theorem 19, we get a lower-bound of

$$\begin{aligned} 2(k+\log k-1)-(k-1)-k-\log k -1 = \log k -2. \end{aligned}$$
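The calculation above can be reproduced numerically. The following Python sketch (parameter names ours) computes S, M, and the resulting Theorem 19 bound by brute force for a small power-of-two k, where the bound matches \(\log k-2\) exactly.

```python
from itertools import product
from math import log2

K = 4  # small power of two, so every logarithm below is an integer

Y = list(product((0, 1), repeat=K))

# S = |h^{-1}(0)|: inputs (i, y) of the index predicate with y[i] = 0.
S = sum(1 for i in range(K) for y in Y if y[i] == 0)

# M: size of the largest 0-monochromatic rectangle. For a fixed row set I,
# the best column set is exactly {y : y[i] = 0 for all i in I}, so
# brute-forcing over the nonempty row sets suffices.
M = 0
for mask in range(1, 2 ** K):
    I = [i for i in range(K) if (mask >> i) & 1]
    cols = sum(1 for y in Y if all(y[i] == 0 for i in I))
    M = max(M, len(I) * cols)

bound = 2 * log2(S) - log2(M) - log2(K) - K - 1  # Theorem 19
print(S, M, bound)
```

For K = 4 this gives \(S = 32 = k\cdot 2^{k-1}\), \(M = 8 = 2^{k-1}\), and a bound of \(0 = \log k - 2\), matching the computation above.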