# Fooling Pairs in Randomized Communication Complexity

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9988)

## Abstract

The fooling pairs method is one of the standard methods for proving lower bounds for deterministic two-player communication complexity. We study fooling pairs in the context of randomized communication complexity. We show that every fooling pair induces far away distributions on transcripts of private-coin protocols. We use the above to conclude that the private-coin randomized $$\varepsilon$$-error communication complexity of a function f with a fooling set $$\mathcal S$$ is at least order $$\log \frac{\log |\mathcal S|}{\varepsilon }$$. This relationship was earlier known to hold only for constant values of $$\varepsilon$$. The bound we prove is tight, for example, for the equality and greater-than functions.

As an application, we exhibit the following dichotomy: for every boolean function f and integer n, the (1/3)-error public-coin randomized communication complexity of the function $$\bigvee _{i=1}^{n}f(x_i,y_i)$$ is either at most c or at least n/c, where $$c>0$$ is a universal constant.

## 1 Introduction

Communication complexity provides a mathematical framework for studying communication between two or more parties. It was introduced by Yao [12] and has found numerous applications since. We focus on the two-player case, and provide a brief introduction to it. For more details see the textbook by Kushilevitz and Nisan [8].

In this model, there are two players called Alice and Bob. The players wish to compute a function $$f: \mathcal {X}\times \mathcal Y\rightarrow \mathcal Z$$, where Alice knows $$x \in \mathcal {X}$$ and Bob knows $$y \in \mathcal Y$$. To achieve this goal, they need to communicate. The communication complexity of f measures the minimum number of bits the players must exchange in order to compute f. The communication is done according to a pre-determined protocol. Protocols may be deterministic or use randomness that is either public (known to both players) or private (randomness held by one player is not known to the other). In the case of deterministic protocols, we denote by D(f) the minimum communication required to compute f correctly on all inputs. In the case of randomized protocols, we allow the protocol to err with a small probability. We denote by $$R_{\varepsilon }(f)$$ and $$R^\mathsf {pri}_{\varepsilon }(f)$$ the minimum communication required to compute f with public-coin and private-coin protocols, respectively, with error probability at most $$\varepsilon$$ on every input. We refer to Sect. 2.1 for formal definitions.

A fundamental problem in this context is proving lower bounds on the communication complexity of a given function f. Lower-bound methods for deterministic communication complexity are based on the fact that any protocol for f defines a partition of $$\mathcal {X}\times \mathcal Y$$ into f-monochromatic rectangles (see footnote 1). Thus, a lower bound on the size of a minimal partition of this kind readily translates to a lower bound on the communication complexity of f. Three basic bounds of this type are based on rectangle size, fooling sets, and matrix rank (see [8]). Both matrix rank and rectangle size lower bounds have natural and well-known analogues in the randomized setting: the approximate rank lower bound [7, 9] and the discrepancy lower bound [8] respectively. In this paper we show that fooling sets also have natural counterparts in the randomized setting.

Although public-coin protocols are more general than private-coin ones, Newman [10] proved that for boolean functions every public-coin protocol can be efficiently simulated by a private-coin protocol: If $$f : \mathcal {X}\times \mathcal Y\rightarrow \{0,1\}$$ then for every $$0< \varepsilon < 1/2$$,
$$R_{2\varepsilon }(f) \le R^\mathsf {pri}_{2 \varepsilon }(f) = O \left( R_\varepsilon (f) + \log \frac{\log (|\mathcal {X}| |\mathcal Y|)}{\varepsilon } \right) .$$
The additive logarithmic term on the right-hand side is often too small to matter, but it does make a difference in the bounds we prove below.

### 1.1 Fooling Pairs and Sets

Fooling sets are a well-known tool for proving lower bounds for D(f). A pair $$(x,y),(x',y')\in \mathcal {X}\times \mathcal Y$$ is called a fooling pair for $$f: \mathcal {X}\times \mathcal Y\rightarrow \mathcal Z$$ if
• $$f(x, y) = f(x', y')$$, and

• either $$f(x', y) \ne f(x,y)$$ or $$f(x, y') \ne f(x,y)$$.

Observe that if $$(x,y)$$ and $$(x',y')$$ are a fooling pair then $$x \ne x'$$ and $$y \ne y'$$. When $$\mathcal Z=\{0,1\}$$ we distinguish between 0-fooling pairs (for which $$f(x,y)=f(x',y')=0$$) and 1-fooling pairs (for which $$f(x,y)=f(x',y')=1$$).
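The definition can be checked mechanically. The following is a minimal sketch, assuming f is given as a Python callable; the names `is_fooling_pair` and `eq` are illustrative and do not come from the paper.

```python
def is_fooling_pair(f, p, q):
    """Check the two conditions in the definition of a fooling pair.

    p = (x, y) and q = (x', y') are inputs; f is any two-argument function.
    """
    (x, y), (x2, y2) = p, q
    same_value = f(x, y) == f(x2, y2)
    # At least one of the two "crossed" inputs must get a different value.
    crossed_differs = f(x2, y) != f(x, y) or f(x, y2) != f(x, y)
    return same_value and crossed_differs


def eq(x, y):
    """Equality function, used here only as a small example."""
    return int(x == y)


print(is_fooling_pair(eq, (0, 0), (1, 1)))  # True
print(is_fooling_pair(eq, (0, 0), (0, 1)))  # False: f takes different values
```

Note that the check returns `False` whenever the two inputs share a coordinate, which matches the observation that a fooling pair has $$x \ne x'$$ and $$y \ne y'$$.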

It is easy to see that if $$(x,y)$$ and $$(x',y')$$ form a fooling pair then there is no f-monochromatic rectangle that contains both of them. An immediate conclusion is the following:

### Lemma 1

([8]). Let $$f: \mathcal {X}\times \mathcal Y\rightarrow \mathcal Z$$ be a function, let $$(x,y)$$ and $$(x',y')$$ be a fooling pair for f and let $$\pi$$ be a deterministic protocol for f. Then
$$\pi (x, y)\ne \pi (x', y').$$

A subset $$\mathcal S\subseteq \mathcal {X}\times \mathcal Y$$ is a fooling set if every $$p \ne p'$$ in $$\mathcal S$$ form a fooling pair. Lemma 1 implies the following basic lower bound for deterministic communication complexity.

### Theorem 1

([8]). Let $$f: \mathcal {X}\times \mathcal Y\rightarrow \mathcal Z$$ be a function and let $$\mathcal S$$ be a fooling set for f. Then
$$D(f) \ge \log _2(|\mathcal S|).$$
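As a small illustration of how Theorem 1 is typically applied, the sketch below checks by brute force that the diagonal is a fooling set for equality and for greater-than on n-bit numbers, and reports the resulting bound $$\log _2 |\mathcal S|$$. The helper names and the choice n = 3 are assumptions made for this example.

```python
from itertools import combinations
from math import log2


def is_fooling_pair(f, p, q):
    (x, y), (x2, y2) = p, q
    return f(x, y) == f(x2, y2) and (f(x2, y) != f(x, y) or f(x, y2) != f(x, y))


def is_fooling_set(f, S):
    """Every two distinct elements of S must form a fooling pair."""
    return all(is_fooling_pair(f, p, q) for p, q in combinations(S, 2))


def eq(x, y):
    return int(x == y)


def gt(x, y):
    return int(x > y)


n = 3
diagonal = [(x, x) for x in range(2 ** n)]

print(is_fooling_set(eq, diagonal))  # True
print(is_fooling_set(gt, diagonal))  # True
print(log2(len(diagonal)))           # 3.0, so D(EQ) >= n and D(GT) >= n here
```

The diagonal consists of 1-inputs for equality and of 0-inputs for greater-than, so it yields 1-fooling pairs in the first case and 0-fooling pairs in the second.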

The same properties do not hold for randomized protocols, but one could expect their natural variants to hold. Let $$\pi$$ be an $$\varepsilon$$-error private-coin protocol for f, and let $$(x,y),(x',y')$$ be a fooling pair for f. Then, one can expect that the probabilistic analogue of $$\pi (x,y) \ne \pi (x',y')$$ holds, i.e., that $$|\varPi (x, y)-\varPi (x', y')|$$ is large, where $$|\varPi (x,y)-\varPi (x',y')|$$ denotes the statistical distance between the two distributions on transcripts.

Such a statement was previously only known for a specific type of fooling pair (that we call the AND fooling pair in Sect. 1.2) and was implicit in [2], where it is used as part of a lower bound proof for the randomized communication complexity of the disjointness function. Here, we prove that it holds for an arbitrary fooling pair.

### Lemma 2

(Analogue of Lemma 1). Let $$f: \mathcal {X}\times \mathcal Y\rightarrow \mathcal Z$$ be a function, let $$(x,y)$$ and $$(x',y')$$ be a fooling pair for f, and let $$\pi$$ be an $$\varepsilon$$-error private-coin protocol for f. Then
\begin{aligned} |\varPi (x,y) - \varPi (x',y')| \ge 1 - 2\sqrt{\varepsilon }. \end{aligned}

Lemma 2 is not only an analogue of Lemma 1 but is actually a generalization of it. Indeed, plugging $$\varepsilon =0$$ in Lemma 2 implies Lemma 1. Moreover, it implies that the bound from Theorem 1 holds also in the 0-error private-coin randomized case.

We use the above to prove an analogue of Theorem 1 as well.

### Theorem 2

(Analogue of Theorem 1). Let $$f: \mathcal {X}\times \mathcal Y\rightarrow \mathcal Z$$ be a function and let $$\mathcal S$$ be a fooling set for f. Let $$1/|\mathcal S| \le \varepsilon < 1/3$$. Then,
$$R^\mathsf {pri}_\varepsilon (f) = \varOmega \left( \log \frac{\log |\mathcal S|}{\varepsilon } \right) .$$
The lower bound provided by the theorem above seems exponentially weaker than the one in Theorem 1, but it is tight. The equality function $$\mathsf {EQ}$$ over n-bit strings has a large fooling set of size $$2^n$$, but it is well-known (see [8]) that
$$R^\mathsf {pri}_\varepsilon (\mathsf {EQ}) = O \left( \log \frac{n}{\varepsilon } \right) .$$
Theorem 2 therefore provides a tight lower bound on $$R^\mathsf {pri}_\varepsilon (\mathsf {EQ})$$ in terms of both n and $$\varepsilon$$. It also provides a tight lower bound for the greater-than function. Moreover, Theorem 2 is a generalization of Theorem 1 and basically implies it by choosing $$\varepsilon = 1/|\mathcal S|$$.
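To make the upper bound for $$\mathsf {EQ}$$ concrete, here is a minimal sketch of the standard fingerprinting idea: Alice picks a uniformly random prime p among the first $$\lceil n/\varepsilon \rceil$$ primes and sends p together with x mod p. This is an illustration under that assumption and not necessarily the exact protocol presented in [8]; the function names are hypothetical. A wrong answer occurs only when $$x \ne y$$ and p divides $$|x-y|$$, and a positive integer below $$2^n$$ has fewer than n distinct prime factors.

```python
from math import ceil


def first_primes(t):
    """Return the first t primes by trial division (fine for small t)."""
    primes = []
    candidate = 2
    while len(primes) < t:
        if all(candidate % p for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes


def worst_case_error(n, eps):
    """Exact worst-case error over all x != y in [0, 2^n).

    Bob accepts iff x mod p == y mod p, so a wrong accept happens exactly
    when p divides d = |x - y|; only d matters.
    """
    primes = first_primes(ceil(n / eps))
    worst = max(sum(1 for p in primes if d % p == 0) for d in range(1, 2 ** n))
    return worst / len(primes)


def bits_sent(n, eps):
    """Bits for Alice's message (p, x mod p) plus Bob's one-bit answer."""
    largest = first_primes(ceil(n / eps))[-1]
    return 2 * largest.bit_length() + 1


print(worst_case_error(8, 0.1) <= 0.1)
print(bits_sent(64, 0.1))
```

Since the t-th prime is of order t log t, the message length in this sketch is $$O(\log (n/\varepsilon ))$$, in line with the bound quoted above.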

The proof of the lower bound uses a general lower bound on the rank of perturbed identity matrices by Alon [1]. Interestingly, although not every fooling set comes from an identity matrix (e.g. in the greater-than function), there is always some perturbed identity matrix in the background (the one used in the proof of Theorem 2).

We remark that for any constant $$0< \varepsilon < 1/3$$, a version of Theorem 2 has been known for a long time. In particular, Håstad and Wigderson [5] give a proof of the following result (see footnote 2), which appears in [12] without proof: for every function f with a fooling set $$\mathcal S$$ and for every $$0< \varepsilon < 1/3$$,
$$R^\mathsf {pri}_\varepsilon (f) = \varOmega \left( \log {\log |\mathcal S|} \right) . \qquad (1)$$
The right-hand side above does not depend on $$\varepsilon$$. The same lower bound as in (1) also directly follows from Theorem 1 and from the following general result [8]: for every function f and for every $$0 \le \varepsilon < 1/2$$,
$$R^\mathsf {pri}_\varepsilon (f) = \varOmega ( \log D(f) ).$$

### 1.2 Two Types of Fooling Pairs

Let $$(x,y),(x',y')$$ be a fooling pair for a boolean function f. There are two types of fooling pairs:
• The $$\mathsf {AND}$$-type for which $$f(x',y)\ne f(x,y')$$.

• The $$\mathsf {XOR}$$-type for which $$f(x',y)= f(x,y')$$.
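The two types can be told apart by comparing the values of f on the two crossed inputs. The sketch below assumes a boolean f given as a Python callable; `fooling_pair_type` is a hypothetical helper name.

```python
def fooling_pair_type(f, p, q):
    """Return "AND" or "XOR" for a fooling pair of a boolean f, else None."""
    (x, y), (x2, y2) = p, q
    if f(x, y) != f(x2, y2):
        return None  # the two inputs must have the same value
    if f(x2, y) == f(x, y) and f(x, y2) == f(x, y):
        return None  # neither crossed input differs: not a fooling pair
    # AND-type: the crossed values disagree; XOR-type: they agree.
    return "AND" if f(x2, y) != f(x, y2) else "XOR"


def and_bits(x, y):
    return x & y


def xor_bits(x, y):
    return x ^ y


print(fooling_pair_type(and_bits, (0, 1), (1, 0)))  # AND
print(fooling_pair_type(xor_bits, (0, 0), (1, 1)))  # XOR
```

For the two-bit functions that give the types their names, the pair (0, 1), (1, 0) is a 0-fooling pair of the AND-type for AND, and every fooling pair of XOR is of the XOR-type.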

A partial proof of Lemma 2 is implicit in [2]. The case considered in [2] corresponds to a 0-fooling pair of the $$\mathsf {AND}$$-type. Let $$\pi$$ be a private-coin $$\varepsilon$$-error protocol for f that is the $$\mathsf {AND}$$ of two bits. In this case, by definition it must hold that $$\varPi (0,0)$$ is statistically far away from $$\varPi (1,1)$$. The cut-and-paste property (see Corollary 1) implies that the same holds for $$\varPi (0,1)$$ and $$\varPi (1,0)$$, yielding Lemma 2 for the 0-fooling pair of the $$\mathsf {AND}$$-type – (0, 1), (1, 0).

The case of a pair of the $$\mathsf {XOR}$$-type was not analyzed before. If $$\pi$$ is a private-coin $$\varepsilon$$-error protocol for $$\mathsf {XOR}$$ of two bits, then it does not immediately follow that $$\varPi (0,0)$$ is far away from $$\varPi (1,1)$$, nor that $$\varPi (0,1)$$ is far away from $$\varPi (1,0)$$. Lemma 2 implies that in fact both are true, but the argument cannot use the cut-and-paste property. Our argument actually gives a better quantitative result for the $$\mathsf {XOR}$$ function as compared to the $$\mathsf {AND}$$ function.

The importance of the special case of Lemma 2 from [2] is related to proving a lower bound on the randomized communication complexity of the disjointness function $$\mathsf {DISJ}$$ defined over $$\{0,1\}^n \times \{0,1\}^n$$: $$\mathsf {DISJ}(x,y) = 1$$ if and only if for all $$i \in [n]$$ it holds that $$x_i \wedge y_i = 0$$. The authors of [2] reproved that $$R_{1/3}(\mathsf {DISJ}) = \varOmega (n)$$. This lower bound is extremely important and useful in many contexts, and was first proved in [6].

On a high level, the proof of [2] can be summarized as follows. Let $$\pi$$ be a private-coin protocol with (1/3)-error for $$\mathsf {DISJ}$$. We want to show that $$\mathsf {CC}(\pi )=\varOmega (n)$$. The argument has two different parts: The first part of the argument essentially relates the internal information cost (as was later defined in [3]) of computing one copy of the $$\mathsf {AND}$$ function with the communication of the protocol $$\pi$$ for $$\mathsf {DISJ}$$. This is a direct-sum-esque result. More concretely, if $$\mu$$ is a distribution on $$\{0,1\}^2$$ such that $$\mu (1,1)=0$$ then
$$\mathsf {IC}_{\mu }(\mathsf {AND}) \le \frac{\mathsf {CC}(\pi )}{n},$$
where $$\mathsf {IC}_{\mu }(\mathsf {AND})$$ is the infimum over all (1/3)-error private-coin protocols $$\tau$$ for $$\mathsf {AND}$$ of the internal information cost of $$\tau$$. The second part of the argument shows that if $$\mu$$ is uniform on the set $$\{(0,0),(0,1),(1,0)\}$$ then $$\mathsf {IC}_\mu (\mathsf {AND}) > 0$$. The challenge in proving the second part stems from the fact that $$\mu$$ is supported on the zeros of $$\mathsf {AND}$$, so it is trivial to compute $$\mathsf {AND}$$ on inputs from $$\mu$$. However, the protocols $$\tau$$ in the definition of $$\mathsf {IC}_{\mu }(\mathsf {AND})$$ are guaranteed to succeed for every $$(x,y)$$ and not only on the support of $$\mu$$. The authors of [2] use the cut-and-paste property (see Corollary 1 below) to argue that indeed $$\mathsf {IC}_{\mu }(\mathsf {AND}) > 0$$.

Here we observe that these arguments can be cast into a more general fooling-pair based method. For example, consider the following function on a pair of n-tuples of elements:
$$f_k(x,y) = \bigvee _{i=1}^n \mathsf {EQ}_k(x_i,y_i),$$
where k is a positive integer and $$\mathsf {EQ}_k: [k] \times [k] \rightarrow \{0,1\}$$ denotes the equality function on elements of the set [k].

The direct-sum reduction of [2] also works for the function $$f_3$$ and since $$\mathsf {EQ}_3$$ contains a 0-fooling pair of the $$\mathsf {AND}$$-type, we can straightaway conclude that the (1/3)-error randomized communication complexity and internal information cost of $$f_3$$ are $$\varOmega (n)$$. However, for the seemingly similar function $$f_2$$, the direct sum reduction described above does not work (and all the fooling pairs are of the $$\mathsf {XOR}$$-type). In fact, the (1/3)-error public-coin randomized communication complexity and internal information cost of $$f_2$$ are O(1), since $$f_2$$ can be reduced to equality on n-bit strings.
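The contrast between $$f_2$$ and $$f_3$$ can be verified by brute force. The sketch below is illustrative only: it indexes [k] as 0, ..., k-1, and the helper names are assumptions. It lists the 0-fooling pairs of the AND-type of $$\mathsf {EQ}_k$$ and checks that $$f_2(x,y)=1$$ exactly when x differs from the bitwise complement of y, which is the reduction to equality mentioned above.

```python
from itertools import product


def eq_k(a, b):
    return int(a == b)


def and_type_zero_fooling_pairs(k):
    """All ordered 0-fooling pairs of the AND-type of EQ_k on {0..k-1}^2."""
    points = list(product(range(k), repeat=2))
    found = []
    for (x, y), (x2, y2) in product(points, repeat=2):
        both_zero = eq_k(x, y) == 0 and eq_k(x2, y2) == 0
        if both_zero and eq_k(x2, y) != eq_k(x, y2):
            found.append(((x, y), (x2, y2)))
    return found


def f2(x, y):
    """OR over coordinates of EQ_2(x_i, y_i), for bit tuples x and y."""
    return int(any(a == b for a, b in zip(x, y)))


def f2_via_equality(x, y):
    """f2 is the negation of equality between x and the complement of y."""
    flipped = tuple(1 - b for b in y)
    return int(x != flipped)


print(and_type_zero_fooling_pairs(2))       # []
print(and_type_zero_fooling_pairs(3)[:1])   # one example for EQ_3
cube = list(product((0, 1), repeat=3))
print(all(f2(x, y) == f2_via_equality(x, y) for x in cube for y in cube))  # True
```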

The following theorem shows that this example is part of a general dichotomy. For example, there is no function f for which the randomized communication complexity of $$\bigvee _{i=1}^n f(x_i,y_i)$$ is $$\varTheta (\sqrt{n})$$, when n tends to infinity.

### Theorem 3

There is a constant $$c>0$$ so that for every boolean function f and integer n, the following holds:
1. If f contains a 0-fooling pair of the $$\mathsf {AND}$$-type then the (1/3)-error public-coin randomized communication complexity of $$\bigvee _{i=1}^n f(x_i,y_i)$$ is at least n/c.
2. Else, the (1/3)-error public-coin randomized communication complexity of $$\bigvee _{i=1}^n f(x_i,y_i)$$ is at most c.

A dual statement applies to the n-fold $$\mathsf {AND}$$ of f:

### Theorem 4

(Dual of Theorem 3). There is a constant $$c>0$$ so that for every boolean function f and integer n, the following holds:
1. If f contains a 1-fooling pair of the $$\mathsf {AND}$$-type then the (1/3)-error public-coin randomized communication complexity of $$\bigwedge _{i=1}^n f(x_i,y_i)$$ is at least n/c.
2. Else, the (1/3)-error public-coin randomized communication complexity of $$\bigwedge _{i=1}^n f(x_i,y_i)$$ is at most c.

We provide a proof of Theorem 3. Theorem 4 can be derived by a similar argument, or alternatively by a reduction to Theorem 3 using the relation $$\bigwedge _{i=1}^n f(x_i,y_i) = \lnot \bigvee _{i=1}^n \lnot f(x_i,y_i)$$, which transforms 1-fooling pairs to 0-fooling pairs.

### Proof

(Proof of Theorem 3). To prove the first item, note that the sub-matrix corresponding to the 0-fooling pair of the $$\mathsf {AND}$$-type can be mapped to the $$\mathsf {AND}$$ function, and taking the n-fold copy of it then corresponds to computing the negation of the disjointness function on n bits. Applying the lower bound of [2] then proves that the randomized communication complexity must be $$\varOmega (n)$$.

For the second item, assume f does not contain any 0-fooling pair of the $$\mathsf {AND}$$-type. Note that this implies that $$\bigvee _{i=1}^n f(x_i,y_i)$$ also does not contain any 0-fooling pair of the $$\mathsf {AND}$$-type. Indeed, more generally, if $$f_1$$ and $$f_2$$ do not contain 0-fooling pairs of the $$\mathsf {AND}$$-type then $$f_1(x_1,y_1)\vee f_2(x_2,y_2)$$ also does not contain such pairs.

So, it suffices to show that any function g that does not contain 0-fooling pairs of the $$\mathsf {AND}$$ type has public-coin randomized communication complexity O(1). For any such g, the communication matrix of g does not contain a $$2\times 2$$ sub-matrix with exactly three zeros. Without loss of generality, assume that the communication matrix contains no repeated rows or columns. We claim that this matrix contains at most one zero in each row and column. This will finish the proof since by permuting the rows and columns, we get the negation of the identity matrix with possibly one additional column of all ones or one additional row of all ones. Therefore, a simple variant of the O(1) public-coin protocol for the equality function will compute g.

To see why there is at most one zero in each row and column, assume towards contradiction that it has two zeros in some row i, say in the first and second columns. Now, since the first and second columns differ, there must be some other row k on which they disagree. This means that the sub-matrix formed by rows i and k and columns 1 and 2 contains exactly three zeros, contradicting our assumption.
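The structural claim used in this proof can be sanity-checked exhaustively on small matrices. The sketch below enumerates all 0/1 matrices with at most `max_dim` rows and columns, keeps those with distinct rows, distinct columns, and no $$2\times 2$$ sub-matrix with exactly three zeros, and confirms that each has at most one zero per row and column. It is a finite check under these assumptions, not a proof, and the function names are illustrative.

```python
from itertools import combinations, product


def has_three_zero_submatrix(M):
    """True if some 2x2 sub-matrix of M contains exactly three zeros."""
    for r1, r2 in combinations(range(len(M)), 2):
        for c1, c2 in combinations(range(len(M[0])), 2):
            cells = (M[r1][c1], M[r1][c2], M[r2][c1], M[r2][c2])
            if cells.count(0) == 3:
                return True
    return False


def distinct_rows_and_columns(M):
    cols = list(zip(*M))
    return len(set(M)) == len(M) and len(set(cols)) == len(cols)


def max_zeros_per_line(M):
    """Largest number of zeros in any single row or column."""
    lines = list(M) + list(zip(*M))
    return max(line.count(0) for line in lines)


def check_claim(max_dim=3):
    for r, c in product(range(1, max_dim + 1), repeat=2):
        for bits in product((0, 1), repeat=r * c):
            M = tuple(tuple(bits[i * c:(i + 1) * c]) for i in range(r))
            if distinct_rows_and_columns(M) and not has_three_zero_submatrix(M):
                if max_zeros_per_line(M) > 1:
                    return False
    return True


print(check_claim(3))  # True
```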

## 2 Preliminaries

### 2.1 Communication Complexity

A private-coin communication protocol for computing a function $$f: \mathcal {X}\times \mathcal Y\rightarrow \mathcal Z$$ is a binary tree with the following generic structure. Each node in the protocol is owned either by Alice or by Bob. For every $$x \in \mathcal {X}$$, each internal node v owned by Alice is associated with a distribution $$P_{v, x}$$ on the children of v. Similarly, for every $$y \in \mathcal Y$$, each internal node v owned by Bob is associated with a distribution $$P_{v, y}$$ on the children of v. The leaves of the protocol are labeled by $$\mathcal Z$$.

On input $$(x,y)$$, a protocol $$\pi$$ is executed as follows.

1. Set v to be the root node of the protocol-tree defined above.
2. If v is a leaf, then the protocol outputs the label of the leaf. Otherwise, if Alice owns the node v, she samples a child according to the distribution $$P_{v, x}$$ and sends a bit to Bob indicating which child was sampled. The case when Bob owns the node is analogous.
3. Set v to be the sampled node and return to the previous step.
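The execution steps above can be sketched in code. Instead of sampling, the sketch below computes the exact distribution of the transcript by following both children of every node with their probabilities. The `Node` class, the bit-string encoding of transcripts, and the noisy protocol for AND are all assumptions made for this illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Node:
    owner: Optional[str] = None           # "A" (Alice), "B" (Bob), None for a leaf
    prob_one: Optional[Callable] = None   # owner's input -> P[move to child 1]
    children: Tuple["Node", ...] = ()     # (child 0, child 1) for internal nodes
    label: Optional[int] = None           # output label, used at leaves only


def transcript_distribution(node, x, y, prefix="", mass=1.0, out=None):
    """Map each reachable transcript (bit string) to its probability."""
    if out is None:
        out = {}
    if node.owner is None:                # leaf: record accumulated probability
        out[prefix] = out.get(prefix, 0.0) + mass
        return out
    p1 = node.prob_one(x if node.owner == "A" else y)
    for bit, p in ((0, 1.0 - p1), (1, p1)):
        if p > 0:
            transcript_distribution(node.children[bit], x, y,
                                    prefix + str(bit), mass * p, out)
    return out


def and_protocol(noise):
    """Alice sends x flipped with probability `noise`; Bob then sends y."""
    def bob(a):
        # Leaves are labeled by (Alice's sent bit) AND (Bob's sent bit).
        return Node(owner="B", prob_one=lambda y: float(y),
                    children=(Node(label=0), Node(label=a)))
    return Node(owner="A",
                prob_one=lambda x: 1.0 - noise if x == 1 else noise,
                children=(bob(0), bob(1)))


root = and_protocol(0.1)
print(transcript_distribution(root, 1, 1))
```

On input (1, 1) this toy protocol reaches the transcript `11` with probability 0.9 and the transcript `01` (whose leaf is labeled 0) with probability 0.1, so it errs with probability 0.1 on that input.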

A protocol is deterministic if for every internal node v, the distribution $$P_{v, x}$$ or $$P_{v, y}$$ has support of size one. A public-coin protocol is a distribution over private-coin protocols defined as follows: Alice and Bob first sample a shared random string r to choose a protocol $$\pi _r$$, and then execute the private-coin protocol $$\pi _r$$ as above.

For an input $$(x,y)$$, we denote by $$\pi (x, y)$$ the sequence of messages exchanged between the parties. We call $$\pi (x, y)$$ the transcript of the protocol $$\pi$$ on input $$(x,y)$$. Another way to think of $$\pi (x,y)$$ is as a leaf in the protocol-tree. We denote by $$L(\pi (x,y))$$ the label of the leaf $$\pi (x,y)$$ in the tree. The communication complexity of a protocol $$\pi$$, denoted by $$\mathsf {CC}(\pi )$$, is the depth of the protocol-tree of $$\pi$$. For a private-coin protocol $$\pi$$, we denote by $$\varPi (x, y)$$ the distribution of the transcript $$\pi (x, y)$$.

For a function f, the deterministic communication complexity of f, denoted by D(f), is the minimum of $$\mathsf {CC}(\pi )$$ over all deterministic protocols $$\pi$$ such that $$L(\pi (x, y)) = f(x, y)$$ for every $$(x,y)$$. For $$\varepsilon > 0$$, we denote by $$R_{\varepsilon }(f)$$ the minimum of $$\mathsf {CC}(\pi )$$ over all public-coin protocols $$\pi$$ such that for every $$(x,y)$$, it holds that $$\mathbb {P}[L(\pi (x, y)) \ne f(x, y)] \le \varepsilon$$, where the probability is taken over all coin flips in the protocol $$\pi$$. We call $$R_{\varepsilon }(f)$$ the $$\varepsilon$$-error public-coin randomized communication complexity of f. Analogously we define $$R^\mathsf {pri}_{\varepsilon }(f)$$ as the $$\varepsilon$$-error private-coin randomized communication complexity.

### 2.2 Rectangle Property

In the case of deterministic protocols, the set of inputs reaching a particular leaf forms a rectangle (a product set inside $$\mathcal {X}\times \mathcal Y$$). In the case of private-coin randomized protocols, the following holds (see for example Lemma 6.7 in [2]).

### Lemma 3 (Rectangle property for private-coin protocols)

Let $$\pi$$ be a private-coin protocol over inputs from $$\mathcal {X}\times \mathcal Y$$, and let $$\mathcal {L}$$ denote the set of leaves of $$\pi$$. There exist functions $$\alpha :\mathcal {L}\times \mathcal {X}\rightarrow \left[ 0,1\right]$$, $$\beta :\mathcal {L}\times \mathcal Y\rightarrow \left[ 0,1\right]$$ such that for every $$(x, y) \in \mathcal {X}\times \mathcal Y$$ and every $$\ell \in \mathcal {L}$$,
\begin{aligned} \mathbb {P}[\pi (x,y) \text{ reaches } \ell ] = \alpha (\ell ,x) \cdot \beta (\ell ,y). \end{aligned}

Here too the lemma is in fact a generalization of what happens in the deterministic case where $$\alpha ,\beta$$ take values in $$\{0,1\}$$ rather than in [0, 1].

The next proposition immediately follows from the definitions.

### Proposition 5

Let $$f: \mathcal {X}\times \mathcal Y\rightarrow \mathcal Z$$ be a function and let $$(x,y)$$ and $$(x', y')$$ be such that $$f(x, y) \ne f(x', y')$$. Then, for any $$\varepsilon$$-error private-coin protocol $$\pi$$ for f,
\begin{aligned} |\varPi (x, y) - \varPi (x', y')| \ge 1 - 2\varepsilon . \end{aligned}
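The short argument behind Proposition 5 can be written out as follows. This is a sketch that assumes the statistical distance is $$|p-q| = \max _E |p(E) - q(E)|$$ (equivalently, half the $$\ell _1$$ distance); the event E is notation introduced here only for illustration.

```latex
% E: the leaves whose label equals f(x,y); notation for this sketch only.
\begin{aligned}
E &= \{\ell \in \mathcal{L} : L(\ell) = f(x,y)\}, \\
\varPi(x,y)(E) &\ge 1 - \varepsilon
  && \text{(the protocol errs on $(x,y)$ with probability at most $\varepsilon$)}, \\
\varPi(x',y')(E) &\le \varepsilon
  && \text{(each leaf in $E$ is an error on $(x',y')$, as $f(x',y') \ne f(x,y)$)}, \\
|\varPi(x,y) - \varPi(x',y')| &\ge \varPi(x,y)(E) - \varPi(x',y')(E) \ge 1 - 2\varepsilon .
\end{aligned}
```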

### 2.3 Hellinger Distance and Cut-and-paste Property

The Hellinger distance between two distributions p, q over a finite set $$\mathcal U$$ is defined as
\begin{aligned} h(p, q) = \sqrt{1 - \sum _{u \in \mathcal U} \sqrt{p(u)q(u)}}. \end{aligned}
Lemma 3 implies the following property of private-coin protocols that is more commonly known as the cut-and-paste property [4, 11].

### Corollary 1 (Cut-and-paste property)

Let $$(x,y)$$ and $$(x', y')$$ be inputs to a randomized private-coin protocol $$\pi$$. Then
$$h(\varPi (x, y), \varPi (x', y')) = h(\varPi (x', y), \varPi (x, y')).$$
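A small numerical illustration of the rectangle and cut-and-paste properties follows. It uses a hypothetical two-message protocol on one-bit inputs, in which Alice sends a random bit depending on x and Bob replies with a random bit depending on y and on Alice's message. The probability tables `ALICE` and `BOB` are arbitrary values chosen for this example.

```python
from itertools import product
from math import isclose, sqrt

ALICE = {0: 0.2, 1: 0.7}                    # P[Alice sends 1 | x]
BOB = {(0, 0): 0.1, (0, 1): 0.5,            # P[Bob sends 1 | y, Alice's bit]
       (1, 0): 0.6, (1, 1): 0.9}
LEAVES = list(product((0, 1), repeat=2))    # a leaf is (Alice's bit, Bob's bit)


def alpha(leaf, x):
    """Alice's contribution to the probability of reaching the leaf."""
    m1, _ = leaf
    return ALICE[x] if m1 == 1 else 1 - ALICE[x]


def beta(leaf, y):
    """Bob's contribution to the probability of reaching the leaf."""
    m1, m2 = leaf
    p = BOB[(y, m1)]
    return p if m2 == 1 else 1 - p


def transcript_dist(x, y):
    # Rectangle property: P[reach leaf] = alpha(leaf, x) * beta(leaf, y).
    return {leaf: alpha(leaf, x) * beta(leaf, y) for leaf in LEAVES}


def hellinger(p, q):
    affinity = sum(sqrt(p[u] * q[u]) for u in p)
    return sqrt(max(0.0, 1 - affinity))


straight = hellinger(transcript_dist(0, 0), transcript_dist(1, 1))
crossed = hellinger(transcript_dist(1, 0), transcript_dist(0, 1))
print(isclose(straight, crossed, abs_tol=1e-12))  # True
```

The equality holds because both Hellinger affinities are sums over leaves of the same product $$\sqrt{\alpha (\ell ,x)\alpha (\ell ,x')\beta (\ell ,y)\beta (\ell ,y')}$$.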

We also use the following relationship between Statistical and Hellinger Distances.

### Proposition 6

(Statistical and Hellinger Distances). Let p and q be distributions. Then,
\begin{aligned} h^2(p, q) \le |p - q| \le \sqrt{h^2(p, q)(2 - h^2(p, q))}. \end{aligned}
In particular, if $$|p - q| \ge 1 - \varepsilon$$ for some $$0 \le \varepsilon \le 1$$, then $$h^2(p, q) \ge 1 - \sqrt{2\varepsilon }$$.
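The "in particular" part follows from the upper bound by a short calculation, sketched below. The abbreviation t is introduced only for this derivation.

```latex
% Write t = 1 - h^2(p,q), so 0 <= t <= 1; t is notation for this sketch only.
\begin{aligned}
1 - \varepsilon \le |p - q|
  &\le \sqrt{h^2(p,q)\,(2 - h^2(p,q))} = \sqrt{(1-t)(1+t)} = \sqrt{1 - t^2}, \\
t^2 &\le 1 - (1-\varepsilon)^2 = 2\varepsilon - \varepsilon^2 \le 2\varepsilon, \\
h^2(p,q) &= 1 - t \ge 1 - \sqrt{2\varepsilon}.
\end{aligned}
```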

### 2.4 A Geometric Claim

We use the following technical claim that has a geometric flavor. For two vectors $$\mathbf {a}, \mathbf {b}\in \mathbb {R}^m$$, we denote by $$\langle \mathbf {a}, \mathbf {b} \rangle$$ the standard inner product between $$\mathbf {a},\mathbf {b}$$. Denote by $$\mathbb {R}_+$$ the set of non-negative real numbers.

### Claim 1

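Let $$\varepsilon _1,\varepsilon _2, \delta _1, \delta _2 > 0$$ and let $$\mathbf {a}, \mathbf {b}, \mathbf {c}, \mathbf {d}\in \mathbb {R}_+^m$$ be vectors such that
\begin{aligned} \langle \mathbf {a}, \mathbf {b} \rangle \ge 1 - \varepsilon _1, \quad \langle \mathbf {c}, \mathbf {d} \rangle \ge 1 - \varepsilon _2, \quad \langle \mathbf {a}, \mathbf {d} \rangle \le \delta _1, \quad \langle \mathbf {c}, \mathbf {b} \rangle \le \delta _2. \end{aligned}
Then,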
\begin{aligned} \sum _{i \in [m]} |\mathbf {a}(i)\mathbf {b}(i) - \mathbf {c}(i)\mathbf {d}(i)| \ge 2 - (\varepsilon _1 + \varepsilon _2 + \delta _1 + \delta _2). \end{aligned}
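The inequality can be tested numerically. The sketch below assumes the hypotheses of the claim are the four inner-product bounds used when the claim is applied in the proof of Lemma 2, namely $$\langle \mathbf {a}, \mathbf {b} \rangle \ge 1 - \varepsilon _1$$, $$\langle \mathbf {c}, \mathbf {d} \rangle \ge 1 - \varepsilon _2$$, $$\langle \mathbf {a}, \mathbf {d} \rangle \le \delta _1$$ and $$\langle \mathbf {c}, \mathbf {b} \rangle \le \delta _2$$. It takes each parameter at its tightest value for random non-negative vectors and reports the smallest slack found; the function names are illustrative.

```python
import random


def inner(u, v):
    return sum(s * t for s, t in zip(u, v))


def claim_gap(a, b, c, d):
    """Left-hand side minus right-hand side, with the tightest parameters."""
    lhs = sum(abs(ai * bi - ci * di) for ai, bi, ci, di in zip(a, b, c, d))
    eps1 = 1 - inner(a, b)      # tightest eps1 with <a, b> >= 1 - eps1
    eps2 = 1 - inner(c, d)      # tightest eps2 with <c, d> >= 1 - eps2
    delta1 = inner(a, d)        # tightest delta1 with <a, d> <= delta1
    delta2 = inner(c, b)        # tightest delta2 with <c, b> <= delta2
    rhs = 2 - (eps1 + eps2 + delta1 + delta2)
    return lhs - rhs


def random_check(trials=1000, m=5, seed=0):
    """Smallest gap over random non-negative vectors; should be >= 0."""
    rng = random.Random(seed)
    worst = float("inf")
    for _ in range(trials):
        a, b, c, d = ([rng.random() for _ in range(m)] for _ in range(4))
        worst = min(worst, claim_gap(a, b, c, d))
    return worst


print(random_check() >= -1e-9)  # True
```

One way to see why the gap is never negative: coordinate-wise, $$\min (\mathbf {a}(i)\mathbf {b}(i), \mathbf {c}(i)\mathbf {d}(i)) \le \sqrt{\mathbf {a}(i)\mathbf {d}(i) \cdot \mathbf {c}(i)\mathbf {b}(i)} \le \frac{1}{2}(\mathbf {a}(i)\mathbf {d}(i) + \mathbf {c}(i)\mathbf {b}(i))$$, and $$|s - t| = s + t - 2\min (s,t)$$ for non-negative s, t.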

## 3 Fooling Pairs and Sets

### Proof

(Proof of Lemma 2). Let the fooling pair be $$(x,y)$$ and $$(x',y')$$ and assume without loss of generality that $$f(x,y) = f(x', y') = 1$$. We distinguish between the following two cases.

(a) $$f(x',y) \ne f(x, y')$$.

(b) $$f(x',y) = f(x,y') = z$$ where $$z \ne 1$$.

In the first case, Proposition 5 implies that $$|\varPi (x', y) - \varPi (x, y')| \ge 1-2\varepsilon$$. Corollary 1 implies that $$h(\varPi (x, y), \varPi (x',y')) = h(\varPi (x', y), \varPi (x, y'))$$. Proposition 6 thus implies that $$|\varPi (x, y) - \varPi (x', y')| \ge 1 - 2 \sqrt{\varepsilon }$$.
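Spelled out, the chain of inequalities in the first case reads as follows. This is a worked restatement of the step above; the second line applies the "in particular" part of Proposition 6 with $$2\varepsilon$$ in place of $$\varepsilon$$, which requires $$\varepsilon \le 1/2$$.

```latex
\begin{aligned}
|\varPi(x',y) - \varPi(x,y')| &\ge 1 - 2\varepsilon
  && \text{(Proposition 5)}, \\
h^2(\varPi(x',y), \varPi(x,y')) &\ge 1 - \sqrt{2 \cdot 2\varepsilon} = 1 - 2\sqrt{\varepsilon}
  && \text{(Proposition 6)}, \\
h^2(\varPi(x,y), \varPi(x',y')) &= h^2(\varPi(x',y), \varPi(x,y'))
  && \text{(cut-and-paste, Corollary 1)}, \\
|\varPi(x,y) - \varPi(x',y')| &\ge h^2(\varPi(x,y), \varPi(x',y')) \ge 1 - 2\sqrt{\varepsilon}
  && \text{(Proposition 6)}.
\end{aligned}
```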

Let us now consider the second case. Let $$\mathcal L$$ be the set of all leaves of $$\pi$$ and let $$\mathcal {L}_1$$ denote those leaves which are labeled by 1. For $$x \in \mathcal {X}$$, $$y \in \mathcal Y$$, define the vectors $$\mathbf {a}_{x} \in \mathbb {R}_+^{\mathcal L_1}$$ as $$\mathbf {a}_{x}(\ell ) = \alpha (\ell , x)$$, and the vectors $$\mathbf {b}_{y} \in \mathbb {R}_+^{\mathcal L_1}$$ as $$\mathbf {b}_{y}(\ell ) = \beta (\ell , y)$$ where $$\alpha$$ and $$\beta$$ are the functions from Lemma 3. Since $$f(x, y) = 1$$ and $$\pi$$ is an $$\varepsilon$$-error protocol for f,
\begin{aligned} \langle \mathbf {a}_x, \mathbf {b}_y \rangle = \sum _{\ell \in \mathcal L_1}{\alpha (\ell ,x)\cdot \beta (\ell ,y)}=\mathbb {P}[L(\pi (x,y))=1] \ge 1-\varepsilon . \end{aligned}
Similarly, we have $$\langle \mathbf {a}_{x'}, \mathbf {b}_{y'} \rangle \ge 1 - \varepsilon$$, $$\langle \mathbf {a}_x, \mathbf {b}_{y'} \rangle \le \varepsilon$$ and $$\langle \mathbf {a}_{x'}, \mathbf {b}_y \rangle \le \varepsilon$$. Observe that
$$2 |\varPi (x, y) - \varPi (x', y')| \ge \sum _{\ell \in \mathcal L_1} |\mathbf {a}_{x}(\ell )\mathbf {b}_y(\ell ) - \mathbf {a}_{x'}(\ell )\mathbf {b}_{y'}(\ell )|.$$
Applying Claim 1 with the vectors $$\mathbf {a}_x, \mathbf {b}_y, \mathbf {a}_{x'}, \mathbf {b}_{y'}$$ yields that $$|\varPi (x, y) - \varPi (x', y')| \ge 1 - 2\varepsilon$$.

### 3.2 A Lower Bound Based on Fooling Sets

The following result of Alon [1] on the rank of perturbed identity matrices is a key ingredient.

### Lemma 4

Let $$\frac{1}{2 \sqrt{m}} \le \varepsilon < \frac{1}{4}$$. Let M be an $$m \times m$$ matrix such that $$|M(i, j)| \le \varepsilon$$ for all $$i \ne j$$ in [m] and $$|M(i,i)| \ge \frac{1}{2}$$ for all $$i \in [m]$$. Then,
\begin{aligned} \mathsf {rank}(M) = \varOmega \left( \frac{\log m}{\varepsilon ^2\log (\frac{1}{\varepsilon })}\right) . \end{aligned}

### Proof

(Proof of Theorem 2). Let $$\pi$$ be an $$\varepsilon$$-error private-coin protocol for f, and let $$\mathcal {L}$$ denote the set of leaves of $$\pi$$. Let $$A \in \mathbb {R}^{\mathcal S\times \mathcal L}$$ be the matrix defined by
$$A_{(x,y),\ell } = \sqrt{\mathbb {P}[ \pi (x,y) = \ell ]}.$$
Let
$$M = A A^{T}$$
where $$A^T$$ is A transposed. First,
$$M_{(x,y),(x,y)} = 1.$$
Second, if $$(x,y) \ne (x',y')$$ in $$\mathcal S$$ then by Lemma 2 we know $$|\varPi (x,y) - \varPi (x',y')| \ge 1 - 2\sqrt{\varepsilon }$$ so by Proposition 6
\begin{aligned} h^2(\varPi (x,y), \varPi (x',y')) \ge 1 - 2 \varepsilon ^{1/4} \end{aligned}
which implies
\begin{aligned} M_{(x,y),(x',y')} = 1 - h^2( \varPi (x,y),\varPi (x',y')) \le 2 {\varepsilon }^{1/4} . \end{aligned}
Lemma 4 implies (see footnote 3) that the rank of M is at least $$\varOmega \left( \frac{\log |\mathcal S|}{\sqrt{\varepsilon }\log \left( \frac{1}{\varepsilon ^{1/4}}\right) } \right) = \varOmega \left( \left( \frac{\log |\mathcal S|}{\varepsilon }\right) ^{1/4}\right)$$. On the other hand,
$$2^{\mathsf {CC}(\pi )} \ge |\mathcal L| \ge \mathsf {rank}(M).$$
Taking logarithms gives $$\mathsf {CC}(\pi ) = \varOmega \left( \log \frac{\log |\mathcal S|}{\varepsilon } \right)$$, as claimed.
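The matrix M used in this proof is a Gram matrix of square-root vectors, so its entries are Hellinger affinities. The sketch below builds M for three made-up transcript distributions and checks that the diagonal equals 1 and that each off-diagonal entry equals $$1 - h^2$$. The distributions, leaf names, and function names are assumptions for illustration only.

```python
from math import isclose, sqrt


def gram_matrix(dists, leaves):
    """M = A A^T where A[i][leaf] = sqrt(P_i[leaf])."""
    A = [[sqrt(p.get(leaf, 0.0)) for leaf in leaves] for p in dists]
    return [[sum(u * v for u, v in zip(row_i, row_j)) for row_j in A]
            for row_i in A]


def hellinger_squared(p, q, leaves):
    return 1.0 - sum(sqrt(p.get(leaf, 0.0) * q.get(leaf, 0.0)) for leaf in leaves)


LEAVES = ["l0", "l1", "l2"]
DISTS = [                       # hypothetical, nearly disjoint distributions
    {"l0": 0.90, "l1": 0.05, "l2": 0.05},
    {"l0": 0.05, "l1": 0.90, "l2": 0.05},
    {"l0": 0.05, "l1": 0.05, "l2": 0.90},
]

M = gram_matrix(DISTS, LEAVES)
for i, p in enumerate(DISTS):
    for j, q in enumerate(DISTS):
        expected = 1.0 - hellinger_squared(p, q, LEAVES)
        assert isclose(M[i][j], expected, abs_tol=1e-12)
print([round(M[i][i], 12) for i in range(3)])  # [1.0, 1.0, 1.0]
```

Because M is a product of a matrix with $$|\mathcal L|$$ columns and its transpose, its rank is at most the number of leaves, which is the inequality $$|\mathcal L| \ge \mathsf {rank}(M)$$ used above.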

## Footnotes

1. $$R\subseteq \mathcal {X}\times \mathcal Y$$ is an f-monochromatic rectangle if $$R=A\times B$$ for some $$A\subseteq \mathcal {X}, B\subseteq \mathcal Y$$ and f is constant over R.
2. In fact, the theorem in [5, 12] is more general than the one stated here. We state the theorem in this form since it fits well the focus of this text.
3. We may assume that, say, $$\varepsilon < 2^{-12}$$ by repeating the given randomized protocol a constant number of times.

### References

1. Alon, N.: Perturbed identity matrices have high rank: proof and applications. Comb. Probab. Comput. 18(1–2), 3–15 (2009). http://dx.doi.org/10.1017/S0963548307008917
2. Bar-Yossef, Z., Jayram, T.S., Kumar, R., Sivakumar, D.: An information statistics approach to data stream and communication complexity. In: FOCS, pp. 209–218 (2002)
3. Barak, B., Braverman, M., Chen, X., Rao, A.: How to compress interactive communication. SIAM J. Comput. 42(3), 1327–1363 (2013)
4. Chor, B., Kushilevitz, E.: A zero-one law for Boolean privacy. SIAM J. Discrete Math. 4(1), 36–47 (1991)
5. Håstad, J., Wigderson, A.: The randomized communication complexity of set disjointness. Theory Comput. 3(1), 211–219 (2007)
6. Kalyanasundaram, B., Schnitger, G.: The probabilistic communication complexity of set intersection. SIAM J. Discrete Math. 5(4), 545–557 (1992)
7. Krause, M.: Geometric arguments yield better bounds for threshold circuits and distributed computing. Theor. Comput. Sci. 156(1–2), 99–117 (1996)
8. Kushilevitz, E., Nisan, N.: Communication Complexity. Cambridge University Press, New York (1997)
9. Lee, T., Shraibman, A.: Lower bounds in communication complexity. Found. Trends Theor. Comput. Sci. 3(4), 263–398 (2009)
10. Newman, I.: Private vs. common random bits in communication complexity. Inf. Process. Lett. 39(2), 67–71 (1991)
11. Paturi, R., Simon, J.: Probabilistic communication complexity. J. Comput. Syst. Sci. 33(1), 106–123 (1986)
12. Yao, A.C.C.: Some complexity questions related to distributive computing (preliminary report). In: STOC, pp. 209–213 (1979)