# Fooling Pairs in Randomized Communication Complexity

## Abstract

The fooling pairs method is one of the standard methods for proving lower bounds for deterministic two-player communication complexity. We study fooling pairs in the context of randomized communication complexity. We show that every fooling pair induces far away distributions on transcripts of private-coin protocols. We use the above to conclude that the private-coin randomized \(\varepsilon \)-error communication complexity of a function *f* with a fooling set \(\mathcal S\) is at least order \(\log \frac{\log |\mathcal S|}{\varepsilon }\). This relationship was earlier known to hold only for constant values of \(\varepsilon \). The bound we prove is tight, for example, for the equality and greater-than functions.

As an application, we exhibit the following dichotomy: for every boolean function *f* and integer *n*, the (1/3)-error public-coin randomized communication complexity of the function \(\bigvee _{i=1}^{n}f(x_i,y_i)\) is either at most *c* or at least n/c, where \(c>0\) is a universal constant.

## 1 Introduction

Communication complexity provides a mathematical framework for studying communication between two or more parties. It was introduced by Yao [12] and has found numerous applications since. We focus on the two-player case, and provide a brief introduction to it. For more details see the textbook by Kushilevitz and Nisan [8].

In this model, there are two players called Alice and Bob. The players wish to compute a function \(f: \mathcal {X}\times \mathcal Y\rightarrow \mathcal Z\), where Alice knows \(x \in \mathcal {X}\) and Bob knows \(y \in \mathcal Y\). To achieve this goal, they need to communicate. The *communication complexity* of *f* measures the minimum number of bits the players must exchange in order to compute *f*. The communication is done according to a pre-determined protocol. Protocols may be deterministic or use randomness that is either *public* (known to both players) or *private* (randomness held by one player is not known to the other). In the case of deterministic protocols, we denote by *D*(*f*) the minimum communication required to compute *f* correctly on all inputs. In the case of randomized protocols, we allow the protocol to err with a small probability. We denote by \(R_{\varepsilon }(f)\) and \(R^\mathsf {pri}_{\varepsilon }(f)\) the minimum communication required to compute *f* correctly with public and private-coin protocols with a probability of error at most \(\varepsilon \) on all inputs. We refer to Sect. 2.1 for formal definitions.

A fundamental problem in this context is proving lower bounds on the communication complexity of a given function *f*. Lower bound methods for deterministic communication complexity are based on the fact that any protocol for *f* defines a partition of \(\mathcal {X}\times \mathcal Y\) into *f*-monochromatic rectangles^{1}. Thus, a lower bound on the size of a minimal partition of this kind readily translates to a lower bound on the communication complexity of *f*. Three basic bounds of this type are based on rectangle size, fooling sets, and matrix rank (see [8]). Both matrix rank and rectangle size lower bounds have natural and well-known analogues in the randomized setting: the approximate rank lower bound [7, 9] and the discrepancy lower bound [8], respectively. In this paper we show that fooling sets also have natural counterparts in the randomized setting.

### 1.1 Fooling Pairs and Sets

Fooling pairs and sets form one of the standard methods for proving lower bounds on *D*(*f*). A pair \((x,y),(x',y')\in \mathcal {X}\times \mathcal Y\) is called a *fooling pair* for \(f: \mathcal {X}\times \mathcal Y\rightarrow \mathcal Z\) if

- \(f(x, y) = f(x', y')\), and
- either \(f(x', y) \ne f(x,y)\) or \(f(x, y') \ne f(x,y)\).

Observe that if (*x*, *y*) and \((x',y')\) are a fooling pair then \(x \ne x'\) and \(y \ne y'\). When \(\mathcal Z=\{0,1\}\) we distinguish between 0-fooling pairs (for which \(f(x,y)=f(x',y')=0\)) and 1-fooling pairs (for which \(f(x,y)=f(x',y')=1\)).
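The definition can be transcribed directly into code. The sketch below is illustrative only; `f` stands for an arbitrary two-argument function, and the helper name `is_fooling_pair` is ours:

```python
def is_fooling_pair(f, p, q):
    """Return True iff p = (x, y) and q = (x', y') form a fooling pair
    for f: equal values on p and q, and at least one crossed value differs."""
    (x, y), (xp, yp) = p, q
    same_value = f(x, y) == f(xp, yp)
    crossed_differs = f(xp, y) != f(x, y) or f(x, yp) != f(x, y)
    return same_value and crossed_differs

eq = lambda x, y: int(x == y)    # the equality function on bits
print(is_fooling_pair(eq, (0, 0), (1, 1)))  # True: a 1-fooling pair
print(is_fooling_pair(eq, (0, 0), (1, 0)))  # False: f(0,0) != f(1,0)
```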

It is easy to see that if (*x*, *y*) and \((x',y')\) form a fooling pair then there is no *f*-monochromatic rectangle that contains both of them. An immediate conclusion is the following:

### Lemma 1

**([8]).** *Let \(f: \mathcal {X}\times \mathcal Y\rightarrow \mathcal Z\) be a function, let (x, y) and \((x',y')\) be a fooling pair for f, and let \(\pi \) be a deterministic protocol for f. Then \(\pi (x, y) \ne \pi (x', y')\).*

A subset \(\mathcal S\subseteq \mathcal {X}\times \mathcal Y\) is a *fooling set* if every \(p \ne p'\) in \(\mathcal S\) form a fooling pair. Lemma 1 implies the following basic lower bound for deterministic communication complexity.

### Theorem 1

**([8]).** *Let \(f: \mathcal {X}\times \mathcal Y\rightarrow \mathcal Z\) be a function and let \(\mathcal S\) be a fooling set for f. Then \(D(f) \ge \log _2 |\mathcal S|\).*

The same properties do not hold for randomized protocols, but one could expect their natural variants to hold. Let \(\pi \) be an \(\varepsilon \)-error private-coin protocol for *f*, and let \((x,y),(x',y')\) be a fooling pair for *f*. Then, one can expect that the probabilistic analogue of \(\pi (x, y) \ne \pi (x', y')\) holds, *i.e.*, that \(|\varPi (x, y)-\varPi (x', y')|\) is large, where \(|\varPi (x,y)-\varPi (x',y')|\) denotes the statistical distance between the two distributions on transcripts.

Such a statement was previously only known for a specific type of fooling pair (that we call the AND fooling pair in Sect. 1.2) and was implicit in [2], where it is used as part of a lower bound proof for the randomized communication complexity of the disjointness function. Here, we prove that it holds for an arbitrary fooling pair.

### Lemma 2

**(Analogue of Lemma 1).** *Let \(f: \mathcal {X}\times \mathcal Y\rightarrow \mathcal Z\) be a function, let (x, y) and \((x',y')\) be a fooling pair for f, and let \(\pi \) be an \(\varepsilon \)-error private-coin protocol for f. Then \(|\varPi (x, y) - \varPi (x', y')| \ge 1 - 2\sqrt{\varepsilon }\).*

Lemma 2 is not only an analogue of Lemma 1 but is actually a generalization of it. Indeed, plugging \(\varepsilon =0\) in Lemma 2 implies Lemma 1. Moreover, it implies that the bound from Theorem 1 holds also in the 0-error private-coin randomized case.

We use the above to prove an analogue of Theorem 1 as well.

### Theorem 2

**(Analogue of Theorem 1).** *Let \(f: \mathcal {X}\times \mathcal Y\rightarrow \mathcal Z\) be a function and let \(\mathcal S\) be a fooling set for f. Let \(1/|\mathcal S| \le \varepsilon < 1/3\). Then, \(R^\mathsf {pri}_{\varepsilon }(f) \ge \varOmega \left( \log \frac{\log |\mathcal S|}{\varepsilon }\right) \).*

The bound in Theorem 2 is tight. The equality function \(\mathsf {EQ}\) on *n*-bit strings has a large fooling set of size \(2^n\), but it is well-known (see [8]) that \(R^\mathsf {pri}_{\varepsilon }(\mathsf {EQ}) \le O\left( \log \frac{n}{\varepsilon }\right) \), matching the lower bound for every *n* and \(\varepsilon \). It also provides a tight lower bound for the greater-than function. Moreover, Theorem 2 is a generalization of Theorem 1 and basically implies it by choosing \(\varepsilon = 1/|\mathcal S|\).

The proof of the lower bound uses a general lower bound on the rank of perturbed identity matrices by Alon [1]. Interestingly, although not every fooling set comes from an identity matrix (e.g. in the greater-than function), there is always some perturbed identity matrix in the background (the one used in the proof of Theorem 2).

This improves upon a previous bound^{2} which appears in [12] without proof: for every function *f* with a fooling set \(\mathcal S\) and for every \(0< \varepsilon < 1/3\), \(R^\mathsf {pri}_{\varepsilon }(f) \ge \varOmega (\log \log |\mathcal S|)\); the same argument applies for every such *f* and for every \(0 \le \varepsilon < 1/2\).

### 1.2 Two Types of Fooling Pairs

Let (*x*, *y*) and \((x',y')\) be a fooling pair for *f*. There are two types of fooling pairs:

- The \(\mathsf {AND}\)-type, for which \(f(x',y)\ne f(x,y')\).
- The \(\mathsf {XOR}\)-type, for which \(f(x',y)= f(x,y')\).
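The two types can be told apart mechanically. In this hypothetical helper (the name `fooling_pair_type` is ours), the arguments are assumed to already form a fooling pair:

```python
def fooling_pair_type(f, p, q):
    """Classify a fooling pair for f as 'AND'-type or 'XOR'-type by
    comparing the two crossed values f(x', y) and f(x, y')."""
    (x, y), (xp, yp) = p, q
    assert f(x, y) == f(xp, yp), "not a fooling pair"
    return 'AND' if f(xp, y) != f(x, yp) else 'XOR'

eq = lambda x, y: int(x == y)    # the equality function
print(fooling_pair_type(eq, (0, 1), (1, 0)))  # XOR: both crossed values are 1
print(fooling_pair_type(eq, (0, 1), (2, 0)))  # AND: f(2,1)=0 but f(0,0)=1
```

Note that over a two-element domain equality admits only \(\mathsf {XOR}\)-type pairs, while over three elements \(\mathsf {AND}\)-type pairs appear; this distinction drives the dichotomy of Theorem 3.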

A partial proof of Lemma 2 is implicit in [2]. The case considered in [2] corresponds to a 0-fooling pair of the \(\mathsf {AND}\)-type. Suppose *f* is the \(\mathsf {AND}\) of two bits and let \(\pi \) be a private-coin \(\varepsilon \)-error protocol for *f*. Since \(f(0,0) \ne f(1,1)\), the distribution \(\varPi (0,0)\) must be statistically far away from \(\varPi (1,1)\). The cut-and-paste property (see Corollary 1) implies that the same holds for \(\varPi (0,1)\) and \(\varPi (1,0)\), yielding Lemma 2 for the 0-fooling pair (0, 1), (1, 0), which is of the \(\mathsf {AND}\)-type.

The case of a pair of the \(\mathsf {XOR}\)-type was not analyzed before. If \(\pi \) is a private-coin \(\varepsilon \)-error protocol for \(\mathsf {XOR}\) of two bits, then it does not immediately follow that \(\varPi (0,0)\) is far away from \(\varPi (1,1)\), nor that \(\varPi (0,1)\) is far away from \(\varPi (1,0)\). Lemma 2 implies that in fact both are true, but the argument can not use the cut-and-paste property. Our argument actually gives a better quantitative result for the \(\mathsf {XOR}\) function as compared to the \(\mathsf {AND}\) function.

The importance of the special case of Lemma 2 from [2] is related to proving a lower bound on the randomized communication complexity of the disjointness function \(\mathsf {DISJ}\) defined over \(\{0,1\}^n \times \{0,1\}^n\): \(\mathsf {DISJ}(x,y) = 1\) if and only if \(x_i \wedge y_i = 0\) for all \(i \in [n]\). Using it, the authors of [2] reproved that \(R_{1/3}(\mathsf {DISJ}) \ge \varOmega (n)\). This lower bound is extremely important and useful in many contexts, and was first proved in [6].

The key step in [2] relates the *internal information cost* (as was later defined in [3]) of computing one copy of the \(\mathsf {AND}\) function with the communication of the protocol \(\pi \) for \(\mathsf {DISJ}\). This is a direct-sum-esque result. More concretely, if \(\mu \) is a distribution on \(\{0,1\}^2\) such that \(\mu (1,1)=0\) then \(\mathsf {CC}(\pi ) \ge \varOmega (n \cdot \mathsf {IC}_{\mu }(\mathsf {AND}))\). Here the protocol for \(\mathsf {AND}\) is required to be correct on all inputs *x*, *y* and not only on the support of \(\mu \). The authors of [2] use the cut-and-paste property (see Corollary 1 below) to argue that indeed \(\mathsf {IC}_{\mu }(\mathsf {AND}) > 0\).

To illustrate this reduction, consider the following functions defined over *n*-tuples of elements: \(f_k(x,y) = \bigvee _{i=1}^{n} \mathsf {EQ}_k(x_i, y_i)\), where *k* is a positive integer and \(\mathsf {EQ}_k: [k] \times [k] \rightarrow \{0,1\}\) denotes the equality function on elements of the set [*k*].

The direct-sum reduction of [2] also works for the function \(f_3\) and since \(\mathsf {EQ}_3\) contains a 0-fooling pair of the \(\mathsf {AND}\)-type, we can straightaway conclude that the (1/3)-error randomized communication complexity and internal information cost of \(f_3\) are \(\varOmega (n)\). However, for the seemingly similar function \(f_2\), the direct sum reduction described above does not work (and all the fooling pairs are of the \(\mathsf {XOR}\)-type). In fact, the (1/3)-error public-coin randomized communication complexity and internal information cost of \(f_2\) are *O*(1), since \(f_2\) can be reduced to equality on *n*-bit strings.
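The contrast between \(\mathsf {EQ}_2\) and \(\mathsf {EQ}_3\) can be verified by brute force. The sketch below (the function name is ours) searches a 0/1 matrix for a \(2\times 2\) sub-matrix witnessing a 0-fooling pair of the \(\mathsf {AND}\)-type:

```python
from itertools import product

def has_and_type_0_fooling_pair(M):
    """Search the communication matrix M for a 0-fooling pair of the
    AND-type: two zero entries whose crossed entries are unequal
    (equivalently, a 2x2 sub-matrix with exactly three zeros)."""
    rows, cols = len(M), len(M[0])
    for x, xp in product(range(rows), repeat=2):
        for y, yp in product(range(cols), repeat=2):
            if M[x][y] == M[xp][yp] == 0 and M[xp][y] != M[x][yp]:
                return True
    return False

EQ2 = [[1, 0], [0, 1]]                   # identity: only XOR-type pairs
EQ3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # contains an AND-type 0-fooling pair
print(has_and_type_0_fooling_pair(EQ2), has_and_type_0_fooling_pair(EQ3))
```

For \(\mathsf {EQ}_3\), a witnessing pair is e.g. rows 1, 3 and columns 2, 1: the entries at (1, 2) and (3, 1) are zero while the crossed entries (1, 1) and (3, 2) are 1 and 0.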

The following theorem shows that this example is part of a general dichotomy. For example, there is no function *f* for which the randomized communication complexity of \(\bigvee _{i=1}^n f(x_i,y_i)\) is \(\varTheta (\sqrt{n})\), when *n* tends to infinity.

### Theorem 3

There is a constant \(c>0\) so that for every boolean function *f* and integer *n*, the following holds:

- 1.
If *f* contains a 0-fooling pair of the \(\mathsf {AND}\)-type then the (1/3)-error public-coin randomized communication complexity of \(\bigvee _{i=1}^n f(x_i,y_i)\) is at least *n*/*c*.
- 2.
Else, the (1/3)-error public-coin randomized communication complexity of \(\bigvee _{i=1}^n f(x_i,y_i)\) is at most *c*.

A dual statement applies to the *n*-fold \(\mathsf {AND}\) of *f*:

### Theorem 4

**(Dual of Theorem 3).** There is a constant \(c>0\) so that for every boolean function *f* and integer *n*, the following holds:

- 1.
If *f* contains a 1-fooling pair of the \(\mathsf {AND}\)-type then the (1/3)-error public-coin randomized communication complexity of \(\bigwedge _{i=1}^n f(x_i,y_i)\) is at least *n*/*c*.
- 2.
Else, the (1/3)-error public-coin randomized communication complexity of \(\bigwedge _{i=1}^n f(x_i,y_i)\) is at most *c*.

We provide a proof of Theorem 3. Theorem 4 can be derived by a similar argument, or alternatively by a reduction to Theorem 3 using the relation \(\bigwedge _{i=1}^n f(x_i,y_i) = \lnot \bigvee _{i=1}^n \lnot f(x_i,y_i)\), which transforms 1-fooling pairs to 0-fooling pairs.

### Proof

*(Proof of Theorem* 3*).* To prove the first item, note that the sub-matrix corresponding to the 0-fooling pair of the \(\mathsf {AND}\)-type can be mapped to the \(\mathsf {AND}\) function, and taking the *n*-fold \(\mathsf {OR}\) of it corresponds to computing the negation of the disjointness function on *n* bits. Applying the lower bound of [2] then shows that the randomized communication complexity must be \(\varOmega (n)\).

For the second item, assume *f* does not contain any 0-fooling pair of the \(\mathsf {AND}\)-type. Note that this implies that \(\bigvee _{i=1}^n f(x_i,y_i)\) also does not contain any 0-fooling pair of the \(\mathsf {AND}\)-type. Indeed, more generally, if \(f_1\) and \(f_2\) do not contain 0-fooling pairs of the \(\mathsf {AND}\)-type then \(f_1(x_1,y_1)\vee f_2(x_2,y_2)\) also does not contain such pairs.

So, it suffices to show that any function *g* that does not contain 0-fooling pairs of the \(\mathsf {AND}\) type has public-coin randomized communication complexity *O*(1). For any such *g*, the communication matrix of *g* does not contain a \(2\times 2\) sub-matrix with exactly three *zeros*. Without loss of generality, assume that the communication matrix contains no repeated rows or columns. We claim that this matrix contains at most one *zero* in each row and column. This will finish the proof since by permuting the rows and columns, we get the negation of the identity matrix with possibly one additional column of all ones or one additional row of all ones. Therefore, a simple variant of the *O*(1) public-coin protocol for the equality function will compute *g*.

To see why there is at most one *zero* in each row and column, assume towards contradiction that it has two *zeros* in some row *i*, say in the first and second columns. Now, since the first and second *columns* differ, there must be some other row *k* on which they disagree. This means that the sub-matrix formed by rows *i* and *k* and columns 1 and 2 contains exactly three *zeros*, contradicting our assumption.
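The last step of the proof invokes the standard *O*(1) public-coin protocol for equality, which the paper does not spell out. A minimal sketch, assuming the usual random-inner-product test (the function name and `reps` parameter are ours):

```python
import random

def public_coin_equality(x, y, reps=3, rng=random):
    """Public-coin protocol for EQ on n-bit strings using O(reps) bits of
    communication: on shared random r, Alice sends <x, r> mod 2 and Bob
    compares it with <y, r> mod 2.  When x != y, a uniformly random r
    catches the difference with probability 1/2, so `reps` independent
    rounds err with probability at most 2**-reps."""
    n = len(x)
    for _ in range(reps):
        r = [rng.randint(0, 1) for _ in range(n)]     # shared random string
        a = sum(xi * ri for xi, ri in zip(x, r)) % 2  # Alice's 1-bit message
        b = sum(yi * ri for yi, ri in zip(y, r)) % 2  # Bob's local check
        if a != b:
            return 0   # certainly unequal
    return 1           # equal, with high probability

print(public_coin_equality([1, 0, 1], [1, 0, 1]))  # always 1: one-sided error
```

The error is one-sided: equal inputs are never rejected, and the error on unequal inputs decays exponentially in the number of rounds.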

## 2 Preliminaries

### 2.1 Communication Complexity

A *private-coin communication protocol* for computing a function \(f: \mathcal {X}\times \mathcal Y\rightarrow \mathcal Z\) is a binary tree with the following generic structure. Each node in the protocol is owned either by Alice or by Bob. For every \(x \in \mathcal {X}\), each internal node *v* owned by Alice is associated with a distribution \(P_{v, x}\) on the children of *v*. Similarly, for every \(y \in \mathcal Y\), each internal node *v* owned by Bob is associated with a distribution \(P_{v, y}\) on the children of *v*. The leaves of the protocol are labeled by \(\mathcal Z\).

On input *x*, *y*, a protocol \(\pi \) is executed as follows.

- 1.
Set

*v*to be the root node of the protocol-tree defined above. - 2.
If

*v*is a leaf, then the protocol outputs the label of the leaf. Otherwise, if Alice owns the node*v*, she samples a child according to the distribution \(P_{v, x}\) and sends a bit to Bob indicating which child was sampled. The case when Bob owns the node is analogous. - 3.
Set

*v*to be the sampled node and return to the previous step.
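The execution loop above can be sketched in code. The `Node` representation below is our own illustrative choice, not part of the formal model:

```python
import random

class Node:
    """A node of a private-coin protocol tree.  Internal nodes carry, for
    each input of their owner, the probability of moving to child 1;
    leaves carry an output label."""
    def __init__(self, owner=None, dist=None, children=None, label=None):
        self.owner = owner          # 'A' or 'B' (None for leaves)
        self.dist = dist            # maps the owner's input -> Pr[child 1]
        self.children = children    # [child0, child1]
        self.label = label          # output value at a leaf, else None

def run_protocol(root, x, y, rng=random):
    """Execute the protocol on input (x, y); return (transcript, output)."""
    v, transcript = root, []
    while v.label is None:
        inp = x if v.owner == 'A' else y
        bit = int(rng.random() < v.dist[inp])   # sample a child, send the bit
        transcript.append(bit)
        v = v.children[bit]
    return transcript, v.label

# A one-bit deterministic protocol: Alice announces her input bit.
leaf0, leaf1 = Node(label=0), Node(label=1)
root = Node(owner='A', dist={0: 0.0, 1: 1.0}, children=[leaf0, leaf1])
print(run_protocol(root, 1, 0))  # ([1], 1)
```

In this representation, a deterministic protocol is one where every `dist` value is 0.0 or 1.0, matching the definition below.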

A protocol is *deterministic* if for every internal node *v*, the distribution \(P_{v, x}\) or \(P_{v, y}\) has support of size one. A *public-coin* protocol is a distribution over private-coin protocols defined as follows: Alice and Bob first sample a shared random string *r* to choose a protocol \(\pi _r\), and then execute the private-coin protocol \(\pi _r\) as above.

For an input (*x*, *y*), we denote by \(\pi (x, y)\) the sequence of messages exchanged between the parties. We call \(\pi (x, y)\) the *transcript* of the protocol \(\pi \) on input (*x*, *y*). Another way to think of \(\pi (x,y)\) is as a leaf in the protocol-tree. We denote by \(L(\pi (x,y))\) the label of the leaf \(\pi (x,y)\) in the tree. The *communication complexity* of a protocol \(\pi \), denoted by \(\mathsf {CC}(\pi )\) is the depth of the protocol-tree of \(\pi \). For a private-coin protocol \(\pi \), we denote by \(\varPi (x, y)\) the distribution of the transcript of \(\pi (x, y)\).

For a function *f*, the *deterministic* communication complexity of *f*, denoted by *D*(*f*), is the minimum of \(\mathsf {CC}(\pi )\) over all deterministic protocols \(\pi \) such that \(L(\pi (x, y)) = f(x, y)\) for every *x*, *y*. For \(\varepsilon > 0\), we denote by \(R_{\varepsilon }(f)\) the minimum of \(\mathsf {CC}(\pi )\) over all public-coin protocols \(\pi \) such that for every (*x*, *y*), it holds that \(\mathbb {P}[L(\pi (x, y)) \ne f(x, y)] \le \varepsilon \) where the probability is taken over all coin flips in the protocol \(\pi \). We call \(R_{\varepsilon }(f)\) the \(\varepsilon \)*-error public-coin randomized* communication complexity of *f*. Analogously we define \(R^\mathsf {pri}_{\varepsilon }(f)\) as the \(\varepsilon \)*-error private-coin* randomized communication complexity.

### 2.2 Rectangle Property

In the case of deterministic protocols, the set of inputs reaching a particular leaf forms a rectangle (a product set inside \(\mathcal {X}\times \mathcal Y\)). In the case of private-coin randomized protocols, the following holds (see for example Lemma 6.7 in [2]).

### Lemma 3 (Rectangle property for private-coin protocols)

*For every private-coin protocol \(\pi \) and every leaf \(\ell \) of \(\pi \), there are functions \(\alpha _\ell : \mathcal {X}\rightarrow [0,1]\) and \(\beta _\ell : \mathcal Y\rightarrow [0,1]\) so that for every input (x, y),* \(\mathbb {P}[\varPi (x, y) = \ell ] = \alpha _\ell (x) \cdot \beta _\ell (y)\).

Here too the lemma is in fact a generalization of what happens in the deterministic case where \(\alpha ,\beta \) take values in \(\{0,1\}\) rather than in [0, 1].

The next proposition immediately follows from the definitions.

### Proposition 5

Let (*x*, *y*) and \((x', y')\) be such that \(f(x, y) \ne f(x', y')\). Then, for any \(\varepsilon \)-error private-coin protocol \(\pi \) for *f*, \(|\varPi (x, y) - \varPi (x', y')| \ge 1 - 2\varepsilon \).

### 2.3 Hellinger Distance and Cut-and-paste Property

The *Hellinger distance* between two distributions *p*, *q* over a finite set \(\mathcal U\) is defined as \(h(p, q) = \sqrt{1 - \sum _{u \in \mathcal U} \sqrt{p(u) q(u)}}.\)

### Corollary 1 (Cut-and-paste property)

Let (*x*, *y*) and \((x', y')\) be inputs to a randomized private-coin protocol \(\pi \). Then \(h(\varPi (x, y), \varPi (x', y')) = h(\varPi (x, y'), \varPi (x', y)).\)

We also use the following relationship between Statistical and Hellinger Distances.

### Proposition 6

**(Statistical and Hellinger Distances).** Let *p* and *q* be distributions. Then, \(h(p, q)^2 \le |p - q| \le h(p, q)\sqrt{2 - h(p, q)^2}.\)
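Both distances are easy to compute for explicit finite distributions, and the standard relation \(h^2 \le |p-q| \le h\sqrt{2-h^2}\) can be checked numerically. The helper names below are ours:

```python
from math import sqrt

def tv(p, q):
    """Statistical (total variation) distance between distributions
    given as dictionaries over a finite set."""
    keys = set(p) | set(q)
    return sum(abs(p.get(u, 0) - q.get(u, 0)) for u in keys) / 2

def hellinger(p, q):
    """Hellinger distance h(p, q) = sqrt(1 - sum_u sqrt(p(u) q(u)))."""
    keys = set(p) | set(q)
    return sqrt(1 - sum(sqrt(p.get(u, 0) * q.get(u, 0)) for u in keys))

p = {'a': 0.5, 'b': 0.5}
q = {'a': 0.9, 'b': 0.1}
h, d = hellinger(p, q), tv(p, q)
assert h**2 <= d <= h * sqrt(2 - h**2)   # the standard two-sided relation
print(round(h, 4), round(d, 4))
```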

### 2.4 A Geometric Claim

We use the following technical claim that has a geometric flavor. For two vectors \(\mathbf {a}, \mathbf {b}\in \mathbb {R}^m\), we denote by \(\langle \mathbf {a}, \mathbf {b} \rangle \) the standard inner product between \(\mathbf {a},\mathbf {b}\). Denote by \(\mathbb {R}_+\) the set of non-negative real numbers.

### Claim 1

### Proof

## 3 Fooling Pairs and Sets

### 3.1 Fooling Pairs Induce Far Away Distributions

### Proof

*(Proof of Lemma* 2*).* Let the fooling pair be (*x*, *y*) and \((x',y')\) and assume without loss of generality that \(f(x,y) = f(x', y') = 1\). We distinguish between the following two cases.

- (a)
\(f(x',y) \ne f(x, y')\).

- (b)
\(f(x',y) = f(x,y') = z\) where \(z \ne 1\).

In the first case, Proposition 5 implies that \(|\varPi (x', y) - \varPi (x, y')| \ge 1-2\varepsilon \). The cut-and-paste property (Corollary 1) implies that \(h(\varPi (x, y), \varPi (x',y')) = h(\varPi (x', y), \varPi (x, y'))\). Proposition 6 thus implies that \(|\varPi (x, y) - \varPi (x', y')| \ge 1 - 2 \sqrt{\varepsilon }\).

In the second case, since \(f(x',y) = f(x,y') = z \ne 1\) and \(\pi \) is an \(\varepsilon \)-error protocol for *f*, Proposition 5 implies that both \(|\varPi (x', y) - \varPi (x, y)| \ge 1-2\varepsilon \) and \(|\varPi (x, y') - \varPi (x, y)| \ge 1-2\varepsilon \).

### 3.2 A Lower Bound Based on Fooling Sets

The following result of Alon [1] on the rank of perturbed identity matrices is a key ingredient.

### Lemma 4

Let *M* be an \(m \times m\) matrix such that \(|M(i, j)| \le \varepsilon \) for all \(i \ne j\) in [*m*] and \(|M(i,i)| \ge \frac{1}{2}\) for all \(i \in [m]\). Then, for \(\frac{1}{2\sqrt{m}} \le \varepsilon \le \frac{1}{4}\), \(\mathrm {rank}(M) \ge \varOmega \left( \frac{\log m}{\varepsilon ^{2}\log (1/\varepsilon )}\right) .\)
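A quick numerical experiment (our illustration, not part of the paper) shows that Lemma 4 is close to tight: the Gram matrix of *m* random unit vectors in \(\mathbb {R}^d\) is a perturbed identity whose off-diagonal entries are on the order of \(\sqrt{(\log m)/d}\), yet its rank is at most *d*:

```python
import numpy as np

rng = np.random.default_rng(1)
m, d = 400, 100
V = rng.standard_normal((m, d))
V /= np.linalg.norm(V, axis=1, keepdims=True)   # m random unit vectors in R^d
M = V @ V.T                                     # Gram matrix: M(i, i) = 1
off = np.abs(M - np.diag(np.diag(M))).max()     # largest off-diagonal magnitude
rank = np.linalg.matrix_rank(M)
print(rank, round(float(off), 2))               # rank is at most d = 100 << m
```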

### Proof

*(Proof of Theorem* 2*).* Let \(\mathcal {L}\) denote the set of leaves of \(\pi \). Let \(A \in \mathbb {R}^{\mathcal S\times \mathcal L}\) be the matrix defined by \(A(p, \ell ) = \sqrt{\mathbb {P}[\varPi (p) = \ell ]}\) for \(p \in \mathcal S\) and \(\ell \in \mathcal L\), and consider \(M = A A^{\top }\), the product of *A* with *A* transposed. First, each diagonal entry of *M* equals 1, since \(\sum _{\ell \in \mathcal L} \mathbb {P}[\varPi (p) = \ell ] = 1\). Second, Lemma 2 implies that every off-diagonal entry of *M* is at most \(O(\varepsilon ^{1/4})\) in absolute value. Hence, by Lemma 4,^{3} the rank of *M* is at least \(\varOmega \left( \frac{\log |\mathcal S|}{\sqrt{\varepsilon }\log \left( \frac{1}{\varepsilon ^{1/4}}\right) } \right) = \varOmega \left( \left( \frac{\log |\mathcal S|}{\varepsilon }\right) ^{1/4}\right) \). On the other hand, the rank of *M* is at most the number of leaves, \(|\mathcal L| \le 2^{\mathsf {CC}(\pi )}\). Combining the two bounds yields \(\mathsf {CC}(\pi ) \ge \varOmega \left( \log \frac{\log |\mathcal S|}{\varepsilon }\right) \), as required.

## Footnotes

- 1.
\(R\subseteq \mathcal {X}\times \mathcal Y\) is an *f*-monochromatic rectangle if \(R=A\times B\) for some \(A\subseteq \mathcal {X}, B\subseteq \mathcal Y\) and *f* is constant over *R*.
- 2.
- 3.
We may assume that, say, \(\varepsilon < 2^{-12}\) by repeating the given randomized protocol a constant number of times.

### References

- 1. Alon, N.: Perturbed identity matrices have high rank: proof and applications. Comb. Probab. Comput. **18**(1–2), 3–15 (2009). http://dx.doi.org/10.1017/S0963548307008917
- 2. Bar-Yossef, Z., Jayram, T.S., Kumar, R., Sivakumar, D.: An information statistics approach to data stream and communication complexity. In: FOCS, pp. 209–218 (2002)
- 3. Barak, B., Braverman, M., Chen, X., Rao, A.: How to compress interactive communication. SIAM J. Comput. **42**(3), 1327–1363 (2013)
- 4. Chor, B., Kushilevitz, E.: A zero-one law for Boolean privacy. SIAM J. Discrete Math. **4**(1), 36–47 (1991)
- 5. Hastad, J., Wigderson, A.: The randomized communication complexity of set disjointness. Theory Comput. **3**(1), 211–219 (2007)
- 6. Kalyanasundaram, B., Schnitger, G.: The probabilistic communication complexity of set intersection. SIAM J. Discrete Math. **5**(4), 545–557 (1992)
- 7. Krause, M.: Geometric arguments yield better bounds for threshold circuits and distributed computing. Theor. Comput. Sci. **156**(1–2), 99–117 (1996)
- 8. Kushilevitz, E., Nisan, N.: Communication Complexity. Cambridge University Press, New York (1997)
- 9. Lee, T., Shraibman, A.: Lower bounds in communication complexity. Found. Trends Theor. Comput. Sci. **3**(4), 263–398 (2009)
- 10. Newman, I.: Private vs. common random bits in communication complexity. Inf. Process. Lett. **39**(2), 67–71 (1991)
- 11. Paturi, R., Simon, J.: Probabilistic communication complexity. J. Comput. Syst. Sci. **33**(1), 106–123 (1986)
- 12. Yao, A.C.C.: Some complexity questions related to distributive computing (preliminary report). In: STOC, pp. 209–213 (1979)