1 Introduction

Let \(\Omega _n=\{-1,1\}^n\) be the Boolean hypercube equipped with the uniform probability measure \(\mu _n\). For any \(f:\Omega _n\rightarrow \mathbb {R}\), we denote by \(\textrm{Var}(f)=\textrm{Var}_{\mu _n}(f)\) its variance, i.e. \(\textrm{Var}(f)=\mathbb {E}|f-\mathbb {E}f|^2\). For each \(1\le j\le n\), the influence of the j-th variable on f is given by

$$\begin{aligned} {\text {Inf}}_j f:=\mathbb {E}\left[ \left( \frac{f-f^{\oplus j}}{2}\right) ^2\right] , \end{aligned}$$

where \(f^{\oplus j}(x)=f(x^{\oplus j})\) and \(x^{\oplus j}\) denotes the vector in \(\Omega _n\) obtained by flipping the j-th variable, that is, for \(x=(x_1,\dots , x_n)\),

$$\begin{aligned} x^{\oplus j}:=(x_1,\dots , x_{j-1},-x_j,x_{j+1},\dots , x_n)\,. \end{aligned}$$

The notion of influences appears naturally in many contexts, ranging from isoperimetric inequalities [KMS12, CEL12] and threshold phenomena in random graphs [FK96] to cryptography [LMN93]. For these reasons, the last three decades have witnessed an extensive study of their properties, which led to many applications in theoretical computer science (hardness of approximation [DS05, Hs01] and learning theory [OS07]), percolation theory [BKS99], and social choice theory [Mos12, BOL85], to name a few.

Karpovsky [Kar76] proposed the sum of the influences (also called total influence),

$$\begin{aligned} {\text {Inf}}f:=\sum _{j=1}^n {\text {Inf}}_j f, \end{aligned}$$

as a measure of complexity of a function f. This first intuition was then made rigorous in [LMN93] and [Bop97], where tight circuit complexity lower bounds in terms of the total influence were derived for the complexity class \({\text {AC}}^0\) of constant depth circuits. A simple lower bound on \({\text {Inf}}f\) in terms of the variance can be derived from the Poincaré inequality: for all \(f:\Omega _n\rightarrow {\mathbb {R}}\) one has [O’D14, Chapter 2]

$$\begin{aligned} \textrm{Var}(f)\le {\text {Inf}}f\,. \end{aligned}$$
(1.1)
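As a sanity check of (1.1), the following minimal Python sketch computes the variance and the influences by brute-force enumeration of \(\Omega _n\) for small n. The helper names and the three test functions (dictator, majority, parity) are illustrative choices and are not notation used elsewhere in this paper.

```python
from itertools import product

def flip(x, j):
    """Return x with its j-th coordinate negated (j is 0-based here)."""
    return x[:j] + (-x[j],) + x[j + 1:]

def expectation(g, n):
    """Average of g over the uniform measure on {-1, 1}^n."""
    points = list(product((-1, 1), repeat=n))
    return sum(g(x) for x in points) / len(points)

def variance(f, n):
    mean = expectation(f, n)
    return expectation(lambda x: (f(x) - mean) ** 2, n)

def influence(f, n, j):
    """Inf_j f = E[((f - f^{+j}) / 2)^2]."""
    return expectation(lambda x: ((f(x) - f(flip(x, j))) / 2) ** 2, n)

def total_influence(f, n):
    return sum(influence(f, n, j) for j in range(n))

def dictator(x):
    return x[0]

def majority(x):
    return 1 if sum(x) > 0 else -1

def parity(x):
    result = 1
    for xi in x:
        result *= xi
    return result

n = 3
for name, f in [("dictator", dictator), ("majority", majority), ("parity", parity)]:
    # Each line shows Var(f) and Inf f; (1.1) states that the first is at most the second.
    print(name, variance(f, n), total_influence(f, n))
```

For \(n=3\) all three functions have variance 1; the dictator \(f(x)=x_1\) gives equality in (1.1), while majority and parity have total influence 3/2 and 3, respectively.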

Functions on the hypercube \(\Omega _n\) that take values only in \(\{-1,1\}\) are of particular interest. These are the so-called Boolean functions, which play important roles in social science, combinatorics, computer science and many other areas; see [dW08, O’D14] for more information. Note that the \(L^p\)-norms, \(1\le p<\infty \), of Boolean functions are always equal to 1, where the normalized \(L^p\)-norm of a function \(f:\Omega _n\rightarrow \mathbb {R}\) is defined as

$$\begin{aligned} \Vert f\Vert _p:=\left( \mathbb {E}\big [|f|^p\big ]\right) ^{\frac{1}{p}}\,. \end{aligned}$$
(1.2)

A Boolean function \(f:\Omega _n\rightarrow \{-1,1\}\) is said to be balanced if \(\mathbb {E}f=0\). If f is a Boolean function, the influence of the j-th variable can further be expressed as

$$\begin{aligned} {\text {Inf}}_j f={\mathbb {P}}(\{x\in \Omega _n\mid f(x)\ne f(x^{\oplus j})\}). \end{aligned}$$

For a balanced Boolean function f we have \(\textrm{Var}(f)=1\), so the Poincaré inequality (1.1) implies that there exists \(j\in \{1,\dots ,n\}\) such that \({\text {Inf}}_j f\ge 1/n\). Note that the Poincaré inequality (1.1) can be tight, e.g. for the balanced Boolean function \(f(x)=x_1\), so the total influence may be comparable to the variance. Is it possible that all the influences are small simultaneously, that is, \({\text {Inf}}_j f\approx \textrm{Var}(f)/n\) for all \(1\le j\le n\)? Quite surprisingly, the answer is negative: a celebrated result of Kahn, Kalai and Linial [KKL88] asserts that every balanced Boolean function has an influential variable. More precisely, they proved that for any balanced Boolean function f on \(\Omega _n\), there exists \(1\le j\le n\) such that

$$\begin{aligned} {\text {Inf}}_j f\ge \frac{C\log (n)}{n}\,, \end{aligned}$$
(1.3)

where \(C>0\) is some universal constant. So some variable has an influence at least \(\Omega (\log (n)/n)\), which is larger than the order 1/n deduced from Poincaré inequality.
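To illustrate the order \(\log (n)/n\) in (1.3), the sketch below looks at tribes functions (an OR of ANDs over disjoint blocks, see [O’D14, Chapter 4]). It first compares a brute-force influence computation with the elementary closed form on 9 bits, and then evaluates \(n\cdot {\text {Inf}}_j f/\log (n)\) for larger n from the closed form. The parameter choice (blocks of width w, with \(2^w\) blocks) is an assumption made here for simplicity; with it the function is only roughly balanced, so this illustrates the order of magnitude and does not verify (1.3).

```python
from itertools import product
from math import log

def tribes(x, width):
    """OR over consecutive blocks of `width` bits of the AND of each block; -1 encodes True."""
    blocks = [x[i:i + width] for i in range(0, len(x), width)]
    return -1 if any(all(bit == -1 for bit in block) for block in blocks) else 1

def influence(f, n, j):
    """Inf_j f = P(f(x) != f(x with bit j flipped)) for Boolean f, by enumeration."""
    points = list(product((-1, 1), repeat=n))
    pivotal = sum(1 for x in points if f(x) != f(x[:j] + (-x[j],) + x[j + 1:]))
    return pivotal / len(points)

def tribes_influence(width, tribes_count):
    """Closed form: the rest of the block is all True and no other block is all True."""
    return 2.0 ** -(width - 1) * (1 - 2.0 ** -width) ** (tribes_count - 1)

# Brute force and closed form on 9 bits (3 tribes of width 3).
brute = influence(lambda x: tribes(x, 3), 9, 0)
closed = tribes_influence(3, 3)

# With 2**width tribes, record n * Inf_j / log(n) for growing n.
ratios = []
for width in range(4, 16):
    n = width * 2 ** width
    ratios.append(n * tribes_influence(width, 2 ** width) / log(n))
print(brute, closed, min(ratios), max(ratios))
```

By symmetry all variables of a tribes function have the same influence, so the recorded ratio describes the maximal influence as well.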

This theorem of Kahn, Kalai and Linial (KKL for short) plays a fundamental role in Boolean analysis. It was further strengthened by Talagrand [Tal94] and Friedgut [Fri98] in different directions.

In his celebrated paper [Tal94], Talagrand proved that for all \(n\ge 1\) and \(f:\Omega _n\rightarrow \mathbb {R}\), we have for some universal \(C>0\) that

$$\begin{aligned} \textrm{Var}(f)\le C\sum _{j=1}^{n}\frac{\Vert D_j f\Vert ^2_2}{1+\log (\Vert D_j f\Vert _2/\Vert D_j f\Vert _1)}, \end{aligned}$$
(1.4)

where \(D_j f(x):=\frac{1}{2}(f(x)-f(x^{\oplus j}))\). Note that if f is Boolean, then \(D_j f\) takes values only in \(\{-1,0,1\}\), so that \(\Vert D_j f \Vert _1=\Vert D_j f\Vert _2^2= {\text {Inf}}_j f\). Therefore, this inequality of Talagrand (1.4), as an improvement of Poincaré inequality (1.1), immediately implies the result of KKL. There are plenty of extensions of Talagrand’s inequality (1.4) [OW13a, OW13b, CEL12], which has become a central tool in theoretical computer science [O’D14]. Moreover, it provides a powerful tool to study sub-diffusive and super-concentration phenomena [BKS03, BKS99, Cha14, GS15, ADH17, Sos18, Tan20] ubiquitous to many models studied in modern probability theory (percolation, random matrices, spin glasses, etc.); see the review articles [CEL12, Led19] and references therein for more details.
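The identity \(\Vert D_j f \Vert _1=\Vert D_j f\Vert _2^2\) for Boolean f, and the size of the right-hand side of (1.4) without the constant C, can be explored numerically. The following is a minimal sketch by enumeration; the function names are hypothetical, and coordinates with \(D_j f=0\) are skipped, which is a convention adopted here to avoid dividing by zero.

```python
from itertools import product
from math import log, sqrt

def derivative_norms(f, n, j):
    """Return (||D_j f||_1, ||D_j f||_2) under the uniform measure on {-1, 1}^n."""
    points = list(product((-1, 1), repeat=n))
    values = [(f(x) - f(x[:j] + (-x[j],) + x[j + 1:])) / 2 for x in points]
    l1 = sum(abs(v) for v in values) / len(points)
    l2 = sqrt(sum(v * v for v in values) / len(points))
    return l1, l2

def talagrand_sum(f, n):
    """Right-hand side of (1.4) with the universal constant C left out."""
    total = 0.0
    for j in range(n):
        l1, l2 = derivative_norms(f, n, j)
        if l1 == 0:
            continue  # D_j f vanishes identically; the term is treated as 0
        total += l2 ** 2 / (1 + log(l2 / l1))
    return total

def majority(x):
    return 1 if sum(x) > 0 else -1

l1, l2 = derivative_norms(majority, 5, 0)
print(l1, l2 ** 2, talagrand_sum(majority, 5))
```

For a Boolean function each summand equals \({\text {Inf}}_j f/(1+\frac{1}{2}\log (1/{\text {Inf}}_j f))\), so the sum is smaller than the total influence whenever some influence lies strictly between 0 and 1.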

Also related to the KKL theorem, Friedgut’s Junta theorem [Fri98] states that a Boolean function with a bounded total influence essentially depends on few coordinates. More precisely, a Boolean function \(f:\Omega _n\rightarrow \{-1,1\}\) is called a k-junta, for \(k\in \{1,\dots ,n\}\) independent of n, if it depends on at most k coordinates. When \(k=1\), the function is called a dictatorship. If f is a junta, it is an immediate consequence that the total influence does not depend on n, i.e. \({\text {Inf}}f=\mathcal {O}(1)\). Friedgut’s Junta theorem provides the following converse statement: for any Boolean function \(f:\Omega _n\rightarrow \{-1,1\}\) and \(\varepsilon >0\), there exists a k-junta \(g:\Omega _n\rightarrow \{-1,1\}\) such that

$$\begin{aligned} \Vert f-g\Vert _2\le \varepsilon \,,\qquad \text { with }\quad k=2^{\mathcal {O}({\text {Inf}}f /\varepsilon ^2)}\,. \end{aligned}$$
(1.5)

Since its discovery, Friedgut’s Junta theorem has found many applications in random graph theory and the learnability of monotone Boolean functions [OS07].

Judging from the range of applicability of these results, it is natural to consider their extensions to noncommutative or quantum settings. Partial results in this direction were obtained by Montanaro and Osborne [MO10a]. There, Boolean functions on the hypercube \(\Omega _n\) were replaced by quantum Boolean functions on n qubits, that is, operators \(A\in M_2(\mathbb {C})^{\otimes n}\) acting on the n-fold tensor product of \(\mathbb {C}^2\) with the additional conditions that \(A=A^*\) and \(A^2=\textbf{1}\). Here and in what follows, \(M_k(\mathbb {C})\) denotes the k-by-k complex matrix algebra. Then, the \(L^2\)-influence of A in the j-th coordinate is defined as \({\text {Inf}}_j^2 A:= \Vert d_j A\Vert _2^2\), where we used \(d_j\) to denote the quantum analogue of the bit-flip map

$$\begin{aligned} d_j:={\mathbb {I}}^{\otimes (j-1)}\otimes \left( {\mathbb {I}}-\frac{1}{2}\textrm{tr}\right) \otimes {\mathbb {I}}^{\otimes (n-j)} \end{aligned}$$

with \(\mathbb { I}\) being the identity map over \(M_2(\mathbb {C})\), and replaced the normalized \(L^p\)-norm on \(\Omega _n\) by the normalized Schatten-p norm on \(M_2(\mathbb {C})^{\otimes n}\). The quantum influence has already found interesting applications to quantum complexity theory [BGJ+22]. In this framework, Montanaro and Osborne [MO10a, Proposition 11.1] proved a quantum analogue of Talagrand’s inequality (1.4). However, this does not yield a quantum KKL as in the classical setting since we do not have the identity \(\Vert d_j A\Vert _1=\Vert d_j A\Vert ^2_2\) for general quantum Boolean functions. In the worst case, we may even have \(\Vert d_j A\Vert _1=\Vert d_j A\Vert _2\) (j is a bad influence according to [MO10a, Definition 11.2]) and thus (1.4) will not help anymore. For this reason, the problem of whether every balanced quantum Boolean function has an influential variable remains open; see [MO10a] for some partial results and further discussion.

In fact, the observation that \(\Vert d_j A\Vert _1\ne \Vert d_j A\Vert ^2_2\) is not exclusive to the quantum setting, and also arises for instance when considering extensions of the setup of Boolean functions on the hypercubes to functions on smooth manifolds, after replacing the uniform distribution on \(\Omega _n\) by an appropriate finite measure, and the discrete derivatives \(D_j\) by the partial derivatives associated to the differential structure of the manifold. In this setting, analogues of the previous results were recently obtained for the \(L^1\)-influences \({\text {Inf}}^1_j A:=\Vert d_j A\Vert _1\), which is sometimes called geometric influence for its relation to isoperimetric inequalities [KMS12, CEL12, Aus16, Bou17].

In this paper, we propose to take the above considerations as a starting point for establishing quantum analogues of (1.1), (1.3), (1.4) and (1.5) based on the \(L^1\)-influences. Our first main result (Theorems 3.2 and 3.6) states that for any self-adjoint operator A on n qubits with \(\Vert A\Vert \le 1\) we have

$$\begin{aligned} \Vert A-2^{-n}{\text {tr}}(A)\Vert _2^2=:\textrm{Var}(A)\le C \sum _{j=1}^{n} \frac{\Vert d_j A\Vert _1(1+\Vert d_j A\Vert _1)}{1+\log ^+(1/\Vert d_j A\Vert _1)} \end{aligned}$$
(1.6)

for some universal \(C>0\), where \(\log ^+\) refers to the positive part of the logarithm. In particular, this implies that every balanced quantum Boolean function has a variable with geometric influence of order at least \(\log (n) /n\). We also prove a quantum \(L^1\)-Poincaré inequality (Theorem 3.1): for any operator A on n qubits we have

$$\begin{aligned} \Vert A-2^{-n}{\text {tr}}(A)\Vert _1\le \sum _{j=1}^{n}{\text {Inf}}^1_j A\,. \end{aligned}$$
(1.7)

Therefore our result provides an alternative answer to the quantum KKL conjecture [MO10a, Conjecture 3 of Section 12] in terms of geometric influences (Theorem 3.9). The inequality (1.6) is inspired by some results in the classical setting; see for example [KMS12, CEL12]. Since (1.6), rather than (1.4), will be our main focus, to distinguish them in the sequel we shall refer to (1.6) as the (\(L^1\)-)Talagrand inequality, and to (1.4) as Talagrand’s \(L^1\)-\(L^2\) variance inequality, as done in [Led19]. We also have a qubit isoperimetric type inequality and a stronger form of the \(L^1\)-Poincaré inequality (1.7); see Sect. 6.4 below.

Our second main result is a quantum analogue of Friedgut’s Junta theorem (Theorem 3.11 and Corollary 3.12): for any quantum Boolean function \(A\in M_2(\mathbb {C})^{\otimes n}\) and \(\varepsilon >0\) there exists another quantum Boolean function \(B\in M_2(\mathbb {C})^{\otimes n}\) that is supported on k subsystems such that

$$\begin{aligned} \Vert A-B\Vert _2\le \varepsilon \qquad \text { with }\qquad k\le 2^{\frac{270{\text {Inf}}^2(A)}{\varepsilon ^2}}\frac{{\text {Inf}}^1(A)^6}{{\text {Inf}}^2(A)^5}, \end{aligned}$$
(1.8)

where \({\text {Inf}}^p(A):=\sum _{j=1}^n\,{\text {Inf}}_j^p(A)\) with \({\text {Inf}}^p_j(A)=\Vert d_j A\Vert _p^p\).

The proofs of Equations (1.6) and (1.8) make use of recent noncommutative generalizations of hypercontractive inequalities and gradient estimates [OZ99, MO10a, KT13, TPK14, CM17a, DR20, BDR20, WZ21, GR21, Bei21]. Moreover, the generality of these tools also allows us to further extend most of our results to the abstract von Neumann algebraic setting which contains both our previously stated results and their classical analogues previously established in [CEL12, Bou17], but also other extensions arising in noncommutative analysis and quantum information with discrete and continuous variables. As for their classical analogues, we expect our results to find many new applications to quantum information and quantum computation.

The rest of the paper is organized as follows: in Sect. 2, we recall useful definitions and results from the Fourier analysis on the quantum Boolean hypercubes, including the Poincaré inequality, hypercontractivity, intertwining and gradient estimates. Section 3 is devoted to the statement and proof of our main results, namely a quantum \(L^1\)-Poincaré inequality (Theorem 3.1), a quantum Talagrand inequality (Theorem 3.2), a quantum KKL theorem (Theorem 3.9) and a quantum Friedgut’s Junta theorem (Theorem 3.11 and Corollary 3.12). These results are then extended to the general von Neumann algebraic setting in Sect. 4. Finally, examples and applications to quantum circuit complexity and quantum learning theory are provided in Sects. 5 and 6.

2 Quantum Boolean Analysis

Let us start by recapitulating the framework of quantum Boolean functions from [MO10a]. As a quantum analogue of functions on the Boolean hypercubes, i.e., functions of n bits, we will take observables on n qubits. In other words, our algebra of observables is \( M_2(\mathbb {C})^{\otimes n}\cong M_{2^n}(\mathbb {C})\) endowed with the operator norm \(\Vert \cdot \Vert \). In what follows, we denote by \({\text {tr}}\) the trace in \(M_2(\mathbb {C})^{\otimes n}\), and by \({\text {tr}}_T\) the partial trace with respect to any subset T of qubits. Following [MO10a, Definition 3.1], we say \(A\in M_2(\mathbb {C})^{\otimes n}\) is a quantum Boolean function if \(A=A^*\) and \(A^2=\textbf{1}\). Here and in what follows, \(\textbf{1}\) always denotes the identity operator. A quantum Boolean function A is balanced if \({\text {tr}}(A)=0\).

One pillar of analysis on the Boolean hypercube is that every function \(f:\Omega _n\rightarrow \mathbb {R}\) has the Fourier–Walsh expansion, i.e. can be expressed as a linear combination of characters. Our quantum analogues of the characters for 1 qubit are the Pauli matrices

$$\begin{aligned} \sigma _0=\begin{pmatrix}1&{}\quad 0\\ 0&{}\quad 1\end{pmatrix},\quad \sigma _1=\begin{pmatrix}0&{}\quad 1\\ 1&{}\quad 0\end{pmatrix},\quad \sigma _2=\begin{pmatrix}0&{}\quad -i\\ i&{}\quad 0\end{pmatrix},\quad \sigma _3=\begin{pmatrix}1&{}\quad 0\\ 0&{}\quad -1\end{pmatrix}. \end{aligned}$$

Clearly, these are quantum Boolean functions, and they form a basis of \(M_2(\mathbb {C})\). For \(s=(s_1,\dots , s_n)\in \{0,1,2,3\}^n\), we put

$$\begin{aligned} \sigma _s:=\sigma _{s_1}\otimes \dots \otimes \sigma _{s_n}\, . \end{aligned}$$

These are again quantum Boolean functions, and form a basis of \(M_2(\mathbb {C})^{\otimes n}\). Accordingly, every \(A\in M_2(\mathbb {C})^{\otimes n}\) can uniquely be expressed as

$$\begin{aligned} A=\sum _{s\in \{0,1,2,3\}^n} {\widehat{A}}_s \,\sigma _s \end{aligned}$$
(2.1)

where \({\widehat{A}}_s\in {\mathbb {C}}\) is the Fourier coefficient. Given \(s\in \{0,1,2,3\}^n\), we call the set of indices j such that \(s_j\ne 0\) the support of s, and denote it by \({\text {supp}}(s)\). Its cardinality is denoted by \(|{\text {supp}}(s)|\). Similarly, the support of A is defined by

$$\begin{aligned} {\text {supp}}(A):=\bigcup _{s|\widehat{A}_s\ne 0}{\text {supp}}(s)\,, \end{aligned}$$
(2.2)

and its cardinality is denoted by \(|{\text {supp}}(A)|\). In analogy with the classical setting, an arbitrary operator \(A\in M_2(\mathbb {C})^{\otimes n}\) is called a k-junta if \(|{\text {supp}}(A)|\le k\). As the Pauli matrices are orthonormal with respect to the normalized Hilbert–Schmidt inner product, the coefficients \({\widehat{A}}_s\) can be recovered by

$$\begin{aligned} {\widehat{A}}_s=\frac{1}{2^n}\,\textrm{tr}(\sigma _s A)\,. \end{aligned}$$
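The recovery formula for the Fourier coefficients can be tried out on a small example. The sketch below builds Pauli strings with a hand-written Kronecker product on nested lists and computes \({\widehat{A}}_s\) for a two-qubit operator. The operator \(A=(\sigma _1\otimes \sigma _1+\sigma _3\otimes \sigma _0)/\sqrt{2}\) is a hypothetical example chosen here; its two terms anticommute, so it satisfies \(A^2=\textbf{1}\). All helper names are assumptions made for this illustration.

```python
from itertools import product

PAULI = {
    0: [[1, 0], [0, 1]],
    1: [[0, 1], [1, 0]],
    2: [[0, -1j], [1j, 0]],
    3: [[1, 0], [0, -1]],
}

def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def matmul(a, b):
    """Product of two square matrices of the same size."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def pauli_string(s):
    """sigma_s = sigma_{s_1} x ... x sigma_{s_n} for s in {0, 1, 2, 3}^n."""
    result = [[1]]
    for index in s:
        result = kron(result, PAULI[index])
    return result

def fourier_coefficient(a, s):
    """2^{-n} tr(sigma_s A)."""
    m = matmul(pauli_string(s), a)
    return sum(m[i][i] for i in range(len(a))) / 2 ** len(s)

c = 2 ** -0.5
xx, z0 = pauli_string((1, 1)), pauli_string((3, 0))
A = [[c * (xx[i][j] + z0[i][j]) for j in range(4)] for i in range(4)]
coeffs = {s: fourier_coefficient(A, s) for s in product(range(4), repeat=2)}
A_squared = matmul(A, A)
```

Only the coefficients at \(s=(1,1)\) and \(s=(3,0)\) are nonzero in this example, and both are real, as expected for a self-adjoint operator.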

Note that whenever A is self-adjoint, the coefficients \(\widehat{A}_s\) must be real. The quantum analogue of the bit-flip map is given by

$$\begin{aligned} d_j(A):= {\mathbb {I}}^{\otimes (j-1)}\otimes \left( {\mathbb {I}}-\frac{1}{2}\textrm{tr}\right) \otimes {\mathbb {I}}^{\otimes (n-j)}(A) =\sum _{\begin{array}{c} s\in \{0,1,2,3\}^n\\ s_j\ne 0 \end{array}}{\widehat{A}}_s \sigma _s\,. \end{aligned}$$

Here \({\mathbb {I}}\) denotes the identity map on \(M_2(\mathbb {C})\). Note that \({\mathcal {L}}_0:={\mathbb {I}}-\frac{1}{2}\textrm{tr}\) satisfies \({\mathcal {L}}_0^2={\mathcal {L}}_0\), so that \(d_j^2=d_j\).
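On the Fourier side, \(d_j\) simply discards the Pauli strings with \(s_j=0\). The sketch below checks the single-qubit map \({\mathcal {L}}_0\) on the Pauli matrices and then applies \(d_j\) to an operator stored as a dictionary of Pauli coefficients. The dictionary representation, the 0-based index j, and the example coefficients are assumptions made for this illustration.

```python
IDENTITY = [[1, 0], [0, 1]]
PAULI = {
    0: [[1, 0], [0, 1]],
    1: [[0, 1], [1, 0]],
    2: [[0, -1j], [1j, 0]],
    3: [[1, 0], [0, -1]],
}

def l0(x):
    """Single-qubit map X -> X - (tr X / 2) * 1."""
    half_trace = (x[0][0] + x[1][1]) / 2
    return [[x[i][j] - half_trace * IDENTITY[i][j] for j in range(2)] for i in range(2)]

def d(coeffs, j):
    """Apply d_j: keep only the Pauli strings s with s_j != 0 (j is 0-based)."""
    return {s: c for s, c in coeffs.items() if s[j] != 0}

def support(coeffs, tol=1e-12):
    """Set of qubits on which some Pauli string with nonzero coefficient acts nontrivially."""
    return {j for s, c in coeffs.items() if abs(c) > tol
            for j, index in enumerate(s) if index != 0}

# Hypothetical three-qubit operator given by its Pauli coefficients.
coeffs = {(0, 0, 0): 0.5, (1, 0, 0): 0.5, (3, 2, 0): 0.5, (0, 1, 0): -0.5}
once = d(coeffs, 0)
twice = d(once, 0)
print(once == twice, support(coeffs))
```

In this representation the idempotence \(d_j^2=d_j\) and the commutation of \(d_j\) with \(d_k\) are immediate, since both maps only filter the same dictionary.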

For \(p\ge 1\), we denote by \({\text {Inf}}^p_j(A):=\Vert d_j A\Vert _p^p\) the \(L^p\)-influence of j on the operator \(A\in M_2(\mathbb {C})^{\otimes n}\), and by \({\text {Inf}}^p (A):=\sum _{j=1}^{n}{\text {Inf}}_j^p(A)\) the associated total \(L^p\)-influence, where the normalized Schatten-p norm of an operator \(A\in M_2(\mathbb {C})^{\otimes n}\) is defined as (\(|A|:=(A^*A)^{1/2}\))

$$\begin{aligned} \Vert A\Vert _p:=\Big (\frac{1}{2^n}{\text {tr}}\big |A\big |^p\Big )^{\frac{1}{p}}\,. \end{aligned}$$

The \(L^1\)-influence is also called the geometric influence. For the \(L^2\)-influence we have

$$\begin{aligned} {\text {Inf}}^2(A)=\sum _{j=1}^n \frac{1}{2^n}\textrm{tr}((d_j A)^*d_j(A))=\frac{1}{2^n}\textrm{tr}(A^*{\mathcal {L}}(A)) \end{aligned}$$
(2.3)

with \({\mathcal {L}}:=\sum _{j=1}^n d_j\). The operator \(\mathcal {L}\) is the generator of the tensor product of the quantum depolarizing semigroups \((P_t)_{t\ge 0}\) for the individual qubits:

$$\begin{aligned} P_t=e^{-t{\mathcal {L}}}=\left( e^{-t}\,{\mathbb {I}}+(1-e^{-t})\frac{\textrm{tr}}{2}(\cdot )\textbf{1}\right) ^{\otimes n}\underset{t\rightarrow \infty }{\longrightarrow }\frac{1}{2^n}\,{\text {tr}}(\cdot )\,. \end{aligned}$$
(2.4)

It is a tracially symmetric quantum Markov semigroup, whose general properties are discussed in Sect. 4.

In the Fourier decomposition, we have the following convenient expressions for the \(L^2\)-influence:

$$\begin{aligned} {\text {Inf}}^2(A)=\sum _{s\in \{0,1,2,3\}^n}|{\text {supp}}(s)||{\widehat{A}}_s|^2, \end{aligned}$$
(2.5)

and the semigroup \(P_t\):

$$\begin{aligned} P_t(A)=\sum _{s\in \{0,1,2,3\}^n}e^{-t|{\text {supp}}(s)|}{\widehat{A}}_s\sigma _s. \end{aligned}$$
(2.6)

In addition we need the following further facts:

Lemma 2.1

(Poincaré inequality, see Proposition 10.9 of [MO10a]). For all \(A\in M_2(\mathbb {C})^{\otimes n}\) such that \({\text {tr}}(A)=0\) and \(t\ge 0\), one has

$$\begin{aligned} \Vert P_t(A)\Vert _2\le e^{-t}\Vert A\Vert _2. \end{aligned}$$

This inequality is equivalent to

$$\begin{aligned} \textrm{Var}(A)\le \frac{1}{1-e^{-2t}}(\Vert A\Vert _2^2-\Vert P_t(A)\Vert _2^2). \end{aligned}$$

and is also equivalent to

$$\begin{aligned} \textrm{Var}(A)\le {\text {Inf}}^2(A). \end{aligned}$$
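The three forms of Lemma 2.1 are easy to test on the Fourier side, using (2.5), (2.6) and the orthonormality of the Pauli strings. The following sketch does so for one operator stored as a dictionary of Pauli coefficients; the coefficients are a hypothetical example and the function names are illustrative.

```python
from math import exp

def weight(s):
    """|supp(s)|: number of nontrivial tensor factors of sigma_s."""
    return sum(1 for index in s if index != 0)

def norm_sq(coeffs):
    """||A||_2^2 as the sum of squared moduli of the Pauli coefficients."""
    return sum(abs(c) ** 2 for c in coeffs.values())

def variance(coeffs):
    """Var(A): the same sum with the coefficient of the identity removed."""
    return sum(abs(c) ** 2 for s, c in coeffs.items() if weight(s) > 0)

def total_l2_influence(coeffs):
    """Inf^2(A) via (2.5)."""
    return sum(weight(s) * abs(c) ** 2 for s, c in coeffs.items())

def semigroup(coeffs, t):
    """P_t(A) via (2.6)."""
    return {s: exp(-t * weight(s)) * c for s, c in coeffs.items()}

# Hypothetical three-qubit operator.
coeffs = {(0, 0, 0): 0.5, (1, 0, 0): 0.5, (3, 2, 0): 0.5, (1, 1, 1): -0.5}
traceless = {s: c for s, c in coeffs.items() if weight(s) > 0}

checks = []
for t in (0.1, 1.0, 5.0):
    decay = norm_sq(semigroup(traceless, t)) <= exp(-2 * t) * norm_sq(traceless) + 1e-12
    integrated = variance(coeffs) <= (
        norm_sq(coeffs) - norm_sq(semigroup(coeffs, t))) / (1 - exp(-2 * t)) + 1e-12
    checks.append(decay and integrated)
print(checks, variance(coeffs), total_l2_influence(coeffs))
```

All three statements reduce to the fact that every Pauli string other than the identity has \(|{\text {supp}}(s)|\ge 1\).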

Lemma 2.2

(Hypercontractivity, see Theorem 8.4 of [MO10a]). For all \(A\in M_2(\mathbb {C})^{\otimes n}\), \(t\ge 0\) and \(p=p(t)= 1+e^{-2t}\) one has

$$\begin{aligned} \Vert P_t(A)\Vert _2\le \Vert A\Vert _p\,. \end{aligned}$$
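For a single qubit and self-adjoint A the inequality of Lemma 2.2 can be checked by hand, since \(A=a_0\textbf{1}+\sum _{k=1}^3 a_k\sigma _k\) has eigenvalues \(a_0\pm r\) with \(r=(a_1^2+a_2^2+a_3^2)^{1/2}\), and \(P_t\) multiplies r by \(e^{-t}\). The sketch below evaluates the gap between the two sides on a grid. It covers only this special case (\(n=1\), self-adjoint A), and the grid and function names are arbitrary choices made here.

```python
from math import exp

def schatten_norm(a0, r, p):
    """Normalized Schatten-p norm of a self-adjoint qubit operator with eigenvalues a0 +/- r."""
    return ((abs(a0 + r) ** p + abs(a0 - r) ** p) / 2) ** (1 / p)

def hypercontractivity_gap(a0, r, t):
    """||A||_{p(t)} - ||P_t A||_2 with p(t) = 1 + e^{-2t}."""
    p = 1 + exp(-2 * t)
    smoothed = schatten_norm(a0, exp(-t) * r, 2)  # P_t scales the traceless part by e^{-t}
    return schatten_norm(a0, r, p) - smoothed

grid = [k / 4 for k in range(-8, 9)]
times = [0.0, 0.1, 0.5, 1.0, 3.0]
smallest_gap = min(hypercontractivity_gap(a0, abs(r), t)
                   for a0 in grid for r in grid for t in times)
print(smallest_gap)
```

At \(t=0\) one has \(p(0)=2\) and the two sides coincide, so the smallest gap on the grid is zero up to rounding.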

Lemma 2.3

(Intertwining). For all \(j\in \{1,\dots ,n\}\) and \(t\ge 0\) one has

$$\begin{aligned} d_j P_t=P_t d_j. \end{aligned}$$

Proof

Follows easily from the definitions of \(P_t\) and \(d_j\). \(\square \)

We denote by \(\Gamma : M_2(\mathbb {C})^{\otimes n}\rightarrow M_2(\mathbb {C})^{\otimes n}\) the carré du champ operator associated to \(P_t=e^{-t{\mathcal {L}}}\) which is defined via:

$$\begin{aligned} 2\Gamma (A):={\mathcal {L}}(A^*)A+A^*{\mathcal {L}}(A)-{\mathcal {L}}(A^*A) =\sum _{j=1}^{n}d_j(A^*)A+A^*d_j A-d_j(A^*A)\,. \end{aligned}$$

Lemma 2.4

(Gradient estimate [JZ15, WZ21]). For any \(A\in M_2(\mathbb {C})^{\otimes n}\) and all \(t\ge 0\),

$$\begin{aligned} \Gamma (P_t A)\le e^{-t}P_t\Gamma (A)\,. \end{aligned}$$

We close this section by remarking that classical Boolean functions are special quantum Boolean functions. In fact, the Fourier–Walsh expansions of classical Boolean functions correspond to (2.1) when restricting to \(s\in \{0,3\}^n\).

3 Main Results for Quantum Boolean Functions

In this section we state and prove our main results in the restricted setting of the quantum Boolean cube.

3.1 A quantum \(L^1\)-Poincaré inequality

We start with the following \(L^1\)-Poincaré type inequality; see also [DPMRF23] for variations of this inequality and Sect. 6.4 for a stronger form.

Theorem 3.1

For all \(A\in M_2(\mathbb {C})^{\otimes n}\), one has

$$\begin{aligned} \Vert A-2^{-n}{\text {tr}}(A)\Vert _1\le {\text {Inf}}^1 (A)\,. \end{aligned}$$
(3.1)

Proof

This follows from a simple use of the triangle inequality for the \(L^1\)-norm as well as monotonicity under the normalized partial trace. \(\square \)
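For diagonal operators, which correspond to classical functions f on \(\Omega _n\), the operator \(d_j A\) is diagonal with entries \(D_j f(x)=\frac{1}{2}(f(x)-f(x^{\oplus j}))\), and (3.1) becomes \(\mathbb {E}|f-\mathbb {E}f|\le \sum _{j}\mathbb {E}|D_j f|\). The sketch below tests this special case on a pseudo-random real-valued function. It is only an illustration of the classical case; the seed, the dimension and the helper name are arbitrary.

```python
from itertools import product
import random

def l1_poincare_sides(f_values, n):
    """Return (E|f - Ef|, sum_j E|D_j f|) for f given as a dict on {-1, 1}^n."""
    points = list(f_values)
    mean = sum(f_values.values()) / len(points)
    left = sum(abs(f_values[x] - mean) for x in points) / len(points)
    right = 0.0
    for j in range(n):
        right += sum(
            abs(f_values[x] - f_values[x[:j] + (-x[j],) + x[j + 1:]]) / 2
            for x in points
        ) / len(points)
    return left, right

rng = random.Random(0)  # fixed seed so that the run is reproducible
n = 4
f_values = {x: rng.uniform(-1, 1) for x in product((-1, 1), repeat=n)}
left, right = l1_poincare_sides(f_values, n)
print(left <= right)
```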

3.2 A quantum \(L^1\)-Talagrand inequality

We first prove a quantum \(L^1\)-Talagrand inequality on quantum Boolean cubes that can be extended to more general von Neumann algebras; see Sect. 4. We will see later that on quantum Boolean cubes the estimates can be improved, so that we may deduce a sharp quantum KKL theorem for \(L^1\)-influences.

Theorem 3.2

For all \(A\in M_2(\mathbb {C})^{\otimes n}\) with \(\Vert A\Vert \le 1\) one has

$$\begin{aligned} \textrm{Var}(A)\le C \sum _{j=1}^{n} \frac{\Vert d_j A\Vert _1(1+\Vert d_j A\Vert _1)}{[1+\log ^+(1/\Vert d_j A\Vert _1)]^{1/2}}\,, \end{aligned}$$
(3.2)

for some universal \(C>0\).

Proof

Differentiating the function \(t\mapsto \Vert P_t(A)\Vert _2^2\) one gets

$$\begin{aligned} \Vert A\Vert _2^2-\Vert P_T(A)\Vert _2^2=2\int _0^T \sum _{j=1}^n \Vert d_j P_t A\Vert _2^2\,dt=4\int _0^{T/2}\sum _{j=1}^n \Vert d_j P_{2t}A\Vert _2^2\,dt. \end{aligned}$$

By intertwining (Lemma 2.3) and hypercontractivity (Lemma 2.2),

$$\begin{aligned} \Vert d_j P_{2t}A\Vert _2=\Vert P_t d_j P_t A\Vert _2\le \Vert d_j P_t A\Vert _{p(t)} \end{aligned}$$

with \(p(t)=1+e^{-2t}\). By Hölder’s inequality,

$$\begin{aligned} \Vert d_j P_t A\Vert _{p(t)}\le \Vert d_j P_t A\Vert _1^{1/p(t)}\Vert d_j P_t A\Vert ^{1-1/p(t)}. \end{aligned}$$

For the term with the \(L^1\)-norm we use intertwining again and \(L^1\)-contractivity of \((P_t)_{t\ge 0}\) to get \(\Vert d_j P_t A\Vert _1\le \Vert d_j A\Vert _1\). For the term with \(\Vert \cdot \Vert \)-norm we use the bound derived from Lemma 3.4 below, which gives \(\Vert d_j P_t A\Vert \le (e^t-1)^{-1/2}\). Altogether,

$$\begin{aligned} \Vert d_j P_{2t}A\Vert _2\le \Vert d_j P_t A\Vert _{p(t)}\le (e^t-1)^{\frac{1-p(t)}{2p(t)}}\Vert d_j A\Vert _1^{\frac{1}{p(t)}}. \end{aligned}$$

As a consequence,

$$\begin{aligned} \Vert A\Vert _2^2-\Vert P_T(A)\Vert _2^2\le 4\sum _{j=1}^n \Vert d_j A\Vert _1 \int _0^{T/2}(e^t-1)^{\frac{1-p(t)}{p(t)}}\Vert d_j A\Vert _1^{\frac{2-p(t)}{p(t)}}\,dt. \end{aligned}$$
(3.3)

Since \(e^t-1\ge t\) and \(p(t)\ge 1\), we have

$$\begin{aligned} (e^t-1)^{\frac{1-p(t)}{p(t)}}\le t^{\frac{1-p(t)}{p(t)}}\,. \end{aligned}$$

Choosing \(T=1\), we further show in Lemma 3.3 below that, given \(a=\Vert d_j A\Vert _1\),

$$\begin{aligned} \int _0^{1/2} t^{\frac{1-p(t)}{p(t)}} a^{\frac{2-p(t)}{p(t)}}\,dt\le C\frac{1+a}{(1+\log ^+(1/a))^{1/2}}\,, \end{aligned}$$

for some universal constant \(C>0\). We finish by combining (3.3) and the bound (with \(T=1\))

$$\begin{aligned} \textrm{Var}(A)\le \frac{1}{1-e^{-2T}}(\Vert A\Vert _2^2-\Vert P_T(A)\Vert _2^2) \end{aligned}$$

derived from the Poincaré inequality (Lemma 2.1). \(\square \)

Lemma 3.3

There exists a universal \(C>0\) such that for \(\alpha >0\), \(a\ge 0\), \(p(t)=1+e^{-2\alpha t}\) and \(0\le r\le \min \{1,1/2\alpha \}\), we have

$$\begin{aligned} \int _0^r t^{-(1-1/p(t))} a^{2/p(t)-1}\,dt\le \frac{1}{\sqrt{\alpha }}\cdot \frac{C(1+a)}{(1+\log ^+(1/a))^{1/2}}\, . \end{aligned}$$

Proof

Since \(r\le 1\) and \(1\le p(t)\le 2\), we have

$$\begin{aligned} \int _0^r t^{-(1-1/p(t))} a^{2/p(t)-1}\,dt\le \int _0^r t^{-1/2} a^{2/p(t)-1}\,dt \, . \end{aligned}$$

Now we estimate the right hand side in two cases. If \(a\ge 1\), then \(a^{2/p(t)-1}\le a\) and thus

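$$\begin{aligned} \int _0^{r} t^{-1/2} a^{2/p(t)-1}\,dt\le a\int _0^{r} t^{-1/2}\,dt=2a\sqrt{r}\le \sqrt{\frac{2}{\alpha }}\,a\le \frac{2(1+a)}{\sqrt{\alpha }}\,, \end{aligned}$$

where the second-to-last inequality uses \(r\le 1/2\alpha \).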

If \(a<1\), then \(a^{2/p(t)-1}\le a^{\alpha t/2}\) for \(0\le t\le r\). In fact, for \(t\in [0,r]\subset [0, 1/2\alpha ]\), we have

$$\begin{aligned} \frac{2}{p(t)}-1=\frac{e^{2\alpha t}-1}{e^{2\alpha t}+1}\ge \frac{\alpha t}{2} \, , \end{aligned}$$

which is equivalent to

$$\begin{aligned} \varphi (\beta ):=4(e^{\beta }-1)-\beta (e^\beta +1)\ge 0\, , \end{aligned}$$

for \(\beta \in [0,2\alpha r]\subset [0,1]\). This is clear as \(\varphi (0)=0\) and \(\varphi '(\beta )=(3-\beta )e^\beta -1\ge 0\) for \(\beta \in [0,1]\). Hence

$$\begin{aligned} \int _0^{r} t^{-1/2} a^{2/p(t)-1}\,dt\le \int _0^{1/2\alpha } t^{-1/2} a^{\alpha t/2}\,dt =\sqrt{\frac{2}{\alpha x}}\int _{0}^{x/4}y^{-1/2}e^{-y}dy\, , \end{aligned}$$

where \(x=\log (1/a)>0\). It remains to show that there exists a universal constant \(C>0\) such that for all \(x>0\),

$$\begin{aligned} \sqrt{\frac{2}{x}}\int _{0}^{x/4}y^{-1/2}e^{-y}dy\le \frac{C}{(1+x)^{1/2}}\, . \end{aligned}$$
(3.4)

In fact, the function

$$\begin{aligned} f(x):=\sqrt{\frac{2(1+x)}{x}}\int _{0}^{x/4}y^{-1/2}e^{-y}dy\, , \end{aligned}$$

is continuous on \((0,\infty )\), and we have

$$\begin{aligned} \lim \limits _{x\rightarrow +\infty }f(x)=\sqrt{2}\int _{0}^{\infty }y^{-1/2}e^{-y}dy=\sqrt{2\pi }\, , \end{aligned}$$

and by L’Hôpital’s rule

$$\begin{aligned} \lim \limits _{x\rightarrow 0^+}f(x)=\lim \limits _{x\rightarrow 0^+}\frac{1}{\sqrt{2}}\,x^{1/2}(1+x)^{3/2}\cdot \left( \frac{x}{4}\right) ^{-1/2}e^{-x/4}=\sqrt{2}\, . \end{aligned}$$

Then (3.4) follows and the proof is complete. \(\square \)
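The two elementary facts used in this proof can also be examined numerically. The sketch below evaluates \(\varphi (\beta )\) on \([0,1]\) and the function f on a logarithmic grid, using the identity \(\int _0^z y^{-1/2}e^{-y}dy=\sqrt{\pi }\,\textrm{erf}(\sqrt{z})\), which follows from the substitution \(y=u^2\). The grids are arbitrary choices, and a numerical check of this kind is of course no substitute for the argument above.

```python
from math import erf, exp, pi, sqrt

def phi(beta):
    """phi(beta) = 4(e^beta - 1) - beta(e^beta + 1) from the proof."""
    return 4 * (exp(beta) - 1) - beta * (exp(beta) + 1)

def f(x):
    """f(x) = sqrt(2(1+x)/x) * int_0^{x/4} y^{-1/2} e^{-y} dy, written with erf."""
    return sqrt(2 * (1 + x) / x) * sqrt(pi) * erf(sqrt(x / 4))

smallest_phi = min(phi(k / 1000) for k in range(1001))

# Logarithmic grid from 1e-8 to 1e8.
grid = [10 ** (k / 10) for k in range(-80, 81)]
largest_f = max(f(x) for x in grid)
print(smallest_phi, f(grid[0]), f(grid[-1]), largest_f)
```

On this grid f stays below 3, so under the assumptions of this sketch any constant \(C\ge 3\) would be compatible with (3.4).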

It remains to prove the following technical lemma:

Lemma 3.4

Let \(n\ge 1\) and \((P_t)_{t\ge 0}\) be the quantum depolarizing semigroup on n-qubits defined in (2.4). Then for all \(t> 0\) and all \(A\in M_2(\mathbb {C})^{\otimes n}\) we have

$$\begin{aligned} \sum _{j=1}^{n} (d_j P_t A)^*d_j P_t A\le \frac{\Vert A\Vert ^2}{e^t-1}\textbf{1}\,. \end{aligned}$$
(3.5)

In particular, for each \(1\le j\le n\),

$$\begin{aligned} (d_j P_t A)^*d_j P_t A\le \frac{\Vert A\Vert ^2}{e^t-1}\textbf{1}\qquad \text { and }\qquad \Vert d_j P_t A\Vert \le \frac{\Vert A\Vert }{\sqrt{e^t-1}}\,. \end{aligned}$$

Proof

By definition of \((P_t)_{t\ge 0}\):

$$\begin{aligned} \Vert A\Vert ^2 \textbf{1}&\ge P_t(A^*A)\\&\ge P_t(A^*A)-P_t(A)^*P_t(A)\\&\overset{(1)}{=}2\int _0^t P_s\Gamma (P_{t-s}A)\,ds\\&\overset{(2)}{\ge }\ 2\int _0^t e^s ds \cdot \Gamma (P_t A)\\&=2(e^t-1)\Gamma (P_t A)\,, \end{aligned}$$

where (1) follows from differentiating \(s\mapsto P_s(P_{t-s}(A^*)P_{t-s}(A))\), whereas (2) follows from the gradient estimate (Lemma 2.4). Now we claim that for all A and for each \(1\le j\le n\) we have

$$\begin{aligned} d_j(A)^*A+A^*d_j(A)- d_j(A^*A)\ge d_j (A)^*d_j (A)\, , \end{aligned}$$
(3.6)

and thus

$$\begin{aligned} 2\Gamma (A)\ge \sum _{j=1}^{n}d_j (A)^*d_j (A)\, . \end{aligned}$$
(3.7)

Let us first finish the proof of the lemma given (3.7). Applying (3.7) to \(P_t A\), we may proceed with the previous estimate as

$$\begin{aligned} \Vert A\Vert ^2 \textbf{1}\ge 2(e^t-1)\Gamma (P_t A) \ge (e^t-1)\sum _{j=1}^n (d_j P_t A )^*d_j P_t A\,, \end{aligned}$$

which proves (3.5).

Now it remains to show (3.6). For this we decompose \(A=\sum _{k}A^k_1\otimes \cdots \otimes A^k_n\) into a sum of tensor products, where the superscript k in \(A^k_j=A^{(k)}_j\) is just an index. Then

$$\begin{aligned}&d_j(A)^*A+A^*d_j(A)-d_j(A^*A)\\&\quad =\sum _{k,l}(A^{k}_1)^*A^l_1\otimes \cdots \otimes \left[ {\mathcal {L}}_0(A_j^k)^*A_j^l+(A_j^k)^*{\mathcal {L}}_0(A_j^l)-{\mathcal {L}}_0((A_j^k)^*A_j^l)\right] \otimes \cdots \otimes (A^{k}_n)^*A^l_n \end{aligned}$$

and

$$\begin{aligned} d_j (A)^*d_j (A) =\sum _{k,l}(A^{k}_1)^*A^l_1\otimes \cdots \otimes [{\mathcal {L}}_0(A_j^k)^*{\mathcal {L}}_0(A_j^l)] \otimes \cdots \otimes (A^{k}_n)^*A^l_n. \end{aligned}$$

For any \(X,Y\in M_2(\mathbb {C})\), a direct computation shows

$$\begin{aligned} {\mathcal {L}}_0(X)^*Y+X^*{\mathcal {L}}_0(Y)- {\mathcal {L}}_0(X^*Y) =X^*Y+\frac{1}{2}{\text {tr}}(X^*Y)-\frac{1}{2}{\text {tr}}(X^*) Y-\frac{1}{2}{\text {tr}}(Y)X^*, \end{aligned}$$

and

$$\begin{aligned} {\mathcal {L}}_0(X)^*{\mathcal {L}}_0(Y) =X^*Y+\frac{1}{4}{\text {tr}}(X^*){\text {tr}}(Y)-\frac{1}{2}{\text {tr}}(X^*) Y-\frac{1}{2}{\text {tr}}(Y)X^*. \end{aligned}$$

Thus

$$\begin{aligned} {\mathcal {L}}_0(X)^*Y+X^*{\mathcal {L}}_0(Y) - {\mathcal {L}}_0(X^*Y)-{\mathcal {L}}_0(X)^*{\mathcal {L}}_0(Y) =\frac{1}{2}{\text {tr}}(X^*Y)-\frac{1}{4}{\text {tr}}(X^*){\text {tr}}(Y)\,. \end{aligned}$$
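The last identity is a direct computation, and it can be double-checked numerically for concrete \(X,Y\in M_2(\mathbb {C})\). The sketch below does this with 2-by-2 matrices stored as nested lists; the two test matrices are arbitrary and the helper names are assumptions made for this illustration.

```python
IDENTITY = [[1, 0], [0, 1]]

def dagger(x):
    """Conjugate transpose of a 2-by-2 matrix."""
    return [[x[j][i].conjugate() for j in range(2)] for i in range(2)]

def mul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def trace(x):
    return x[0][0] + x[1][1]

def combine(*terms):
    """Linear combination sum(c * m) of 2-by-2 matrices, given as (c, m) pairs."""
    return [[sum(c * m[i][j] for c, m in terms) for j in range(2)] for i in range(2)]

def l0(x):
    """L_0(X) = X - (tr X / 2) * 1."""
    return combine((1, x), (-trace(x) / 2, IDENTITY))

def defect(x, y):
    """Largest entrywise gap between the two sides of the identity."""
    xs = dagger(x)
    left = combine(
        (1, mul(dagger(l0(x)), y)),
        (1, mul(xs, l0(y))),
        (-1, l0(mul(xs, y))),
        (-1, mul(dagger(l0(x)), l0(y))),
    )
    scalar = trace(mul(xs, y)) / 2 - trace(xs) * trace(y) / 4
    right = combine((scalar, IDENTITY))
    return max(abs(left[i][j] - right[i][j]) for i in range(2) for j in range(2))

X = [[1 + 2j, -0.5j], [3, 0.25 - 1j]]
Y = [[-2, 1j], [0.5 + 0.5j, 4]]
print(defect(X, Y))
```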

So (3.6) is equivalent to

$$\begin{aligned} \sum _{k,l} (A^{k}_1)^*A^l_1\otimes \cdots \otimes \left[ \frac{1}{2}{\text {tr}}((A_j^k)^*A^l_j)-\frac{1}{4}{\text {tr}}((A_j^k)^*){\text {tr}}(A^l_j) \right] \otimes \cdots \otimes (A^{k}_n)^*A^l_n\ge 0\,, \end{aligned}$$

which can be reformulated as

$$\begin{aligned} T_j(A^*A)\ge T_j(A)^*T_j(A)\,, \end{aligned}$$
(3.8)

with \(T_j:={\mathbb {I}} ^{\otimes (j-1)}\otimes \frac{1}{2}{\text {tr}}\otimes {\mathbb {I}} ^{\otimes (n-j)}\). Now (3.8) follows from the Kadison–Schwarz inequality [Wol12, Chapter 5.2] and that \(\frac{1}{2}{\text {tr}}\) is unital completely positive (over \(M_2(\mathbb {C})\)). This finishes the proof of the claim (3.6) and thus the proof of the lemma. \(\square \)

Remark 3.5

Following the argument in [CEL12, Proof of Theorem 1], one can prove a quantum analogue of (1.4) using similar properties of quantum depolarizing semigroups. In fact, the proof of (1.4) does not even require strictly positive Ricci curvature lower bounds, i.e. Lemma 2.4 can be weakened. We will not discuss it here as (1.4) is not our main focus and a quantum analogue was already obtained in [MO10a].

The quantum Talagrand inequality (Theorem 3.2) implies a quantum KKL theorem for \(L^1\)-influences (by an argument similar to that of Lemma 3.8 below): for every balanced quantum Boolean function A on n qubits,

$$\begin{aligned} \max _{1\le j\le n}{\text {Inf}}^1_j (A)\ge C\frac{\sqrt{\log (n)}}{n}\ . \end{aligned}$$

Recall that for classical Boolean functions the sharp order is \(\log (n)/n\), which is attained by tribes functions [O’D14, Chapter 4]. In fact, the order \(\log (n)/n\) is also sharp for quantum Boolean functions, which can be seen from the following improved version of the quantum Talagrand inequality (Theorem 3.2):

Theorem 3.6

For every \(p\in [1,2)\) there exists a constant \(C_p>0\) such that for every \(n\in {\mathbb {N}}\) and \(A\in M_2({\mathbb {C}})^{\otimes n}\) with \(\Vert A\Vert \le 1\) one has

$$\begin{aligned} \textrm{Var}(A)\le C_p\sum _{j=1}^n \frac{\Vert d_j(A)\Vert _p^p(1+\Vert d_j(A)\Vert _p^p)}{1+\log ^+(1/\Vert d_j(A)\Vert _p^p)}, \end{aligned}$$

where the constant can be chosen of order \(C_p\sim C/(2-p)\) as \(p\nearrow 2\). In particular, for \(p=1\):

$$\begin{aligned} \textrm{Var}(A)\le C\sum _{j=1}^n \frac{\Vert d_j(A)\Vert _1(1+\Vert d_j(A)\Vert _1)}{1+\log ^+(1/\Vert d_j(A)\Vert _1)}. \end{aligned}$$

Proof

Let \(T>0\) be such that \(p\le 1+e^{-2T}\). By the Poincaré inequality we have

$$\begin{aligned} \textrm{Var}(A)\le \frac{1}{1-e^{-2T}}\left[ \Vert A\Vert _2^2-\Vert P_T(A)\Vert _2^2\right] =\frac{2}{1-e^{-2T}}\int _0^T\sum _{j=1}^n \Vert d_j(P_t(A))\Vert _2^2\,dt \end{aligned}$$

By intertwining and hypercontractivity,

$$\begin{aligned} \frac{2}{1-e^{-2T}}\int _0^T\sum _{j=1}^n \Vert d_j(P_t(A))\Vert _2^2\,dt\le \frac{2}{1-e^{-2T}}\int _0^T\sum _{j=1}^n \Vert d_j(A)\Vert _{p(t)}^2\,dt \end{aligned}$$

with \(p(t)=1+e^{-2t}\).

By interpolation and \(\Vert d_jA\Vert \le 2\Vert A\Vert \le 2\),

$$\begin{aligned}&\frac{2}{1-e^{-2T}}\int _0^T\sum _{j=1}^n \Vert d_j(A)\Vert _{p(t)}^2\,dt\\&\quad \le \frac{2}{1-e^{-2T}}\int _0^T\sum _{j=1}^n \Vert d_j(A)\Vert _{p}^{2p/p(t)}\Vert d_j(A)\Vert ^{2(1-p/p(t))}\,dt\\&\quad \le \frac{8}{1-e^{-2T}}\sum _{j=1}^n \Vert d_j(A)\Vert _p^p\int _0^T\Vert d_j(A)\Vert _p^{p(2/p(t)-1)}\,dt. \end{aligned}$$

If \(\Vert d_j(A)\Vert _p^p\ge 1\), then \(\Vert d_j(A)\Vert _p^{p(2/p(t)-1)}\le \Vert d_j(A)\Vert _p^p\), so that

$$\begin{aligned} \frac{8}{1-e^{-2T}} \Vert d_j(A)\Vert _p^p\int _0^T\Vert d_j(A)\Vert _p^{p(2/p(t)-1)}\,dt \le&\frac{8T}{1-e^{-2T}} \Vert d_j(A)\Vert _p^{2p}\\ \le&\frac{8T}{1-e^{-2T}} \Vert d_j(A)\Vert _p^{p}(1+\Vert d_j(A)\Vert _p^{p}). \end{aligned}$$

If \(\Vert d_j(A)\Vert _p^p< 1\), then \(\Vert d_j(A)\Vert _p^{p(2/p(t)-1)}\le \Vert d_j(A)\Vert _p^{pt/2}\) (recall that \(p(2/p(t)-1)\ge pt/2\) by the proof of Lemma 3.3) and thus

$$\begin{aligned}&\frac{8}{1-e^{-2T}} \Vert d_j(A)\Vert _p^p\int _0^T\Vert d_j(A)\Vert _p^{p(2/p(t)-1)}\,dt\\&\quad \le \frac{8}{1-e^{-2T}} \Vert d_j(A)\Vert _p^p\int _0^T\Vert d_j(A)\Vert _p^{pt/2}\,dt\\&\quad =\frac{16}{1-e^{-2T}} \Vert d_j(A)\Vert _p^p\frac{1-\Vert d_j(A)\Vert _p^{pT/2}}{\log 1/\Vert d_j(A)\Vert _p^p}. \end{aligned}$$

We claim that

$$\begin{aligned} \frac{1-a^{-T/2}}{\log a}\le \max \left\{ \frac{3T}{2},2\right\} \frac{1+a^{-1}}{1+\log a},\qquad a\ge 1. \end{aligned}$$
(3.9)

Applying (3.9) to \(a=\Vert d_j (A)\Vert _p^{-p}\), we obtain

$$\begin{aligned} \frac{16}{1-e^{-2T}} \Vert d_j(A)\Vert _p^p\frac{1-\Vert d_j(A)\Vert _p^{pT/2}}{\log 1/\Vert d_j(A)\Vert _p^p} \le \frac{16\max \left\{ \frac{3T}{2},2\right\} }{1-e^{-2T}} \frac{\Vert d_j(A)\Vert _p^p(1+\Vert d_j(A)\Vert _p^p)}{1+\log ^+(1/\Vert d_j(A)\Vert _p^p)}. \end{aligned}$$

Let us now prove the claim (3.9), which we divide into two cases. When \(a\in [1,e]\), we have

$$\begin{aligned} \frac{1-a^{-T/2}}{\log a}\le \frac{T}{2},\qquad \frac{1+a^{-1}}{1+\log a}\ge \frac{1}{3}, \end{aligned}$$
(3.10)

which are nothing but

$$\begin{aligned} f_1(a):=\frac{T}{2}\log a-1+a^{-T/2}\ge 0,\qquad f_2(a):=3a+3-(a+a\log a)\ge 0. \end{aligned}$$

A direct computation shows that when \(a\in [1,e]\)

$$\begin{aligned} f_1'(a)=\frac{T}{2}a^{-1-\frac{T}{2}}(a^{\frac{T}{2}}-1)\ge 0, \qquad \text {thus}\qquad f_1(a)\ge f_1(1)=0, \end{aligned}$$

and

$$\begin{aligned} f_2'(a)=1-\log a\ge 0,\qquad \text {thus}\qquad f_2(a)\ge f_2(1)=5. \end{aligned}$$

This proves (3.10) and thus the claim when \(a\in [1,e]\). When \(a\ge e\), we have \(2\log a\ge 1+\log a\), and

$$\begin{aligned} \frac{1-a^{-T/2}}{\log a}\le \frac{2}{1+\log a}\le \frac{2(1+a^{-1})}{1+\log a}, \end{aligned}$$
(3.11)

which proves the claim for \(a\ge e\).
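As a numerical sanity check of the claim (3.9), the following Python sketch evaluates both sides on a finite grid of values of \(a\) and \(T\). The grid, the tolerance and the function names are illustrative choices that do not come from the text, and a grid check is no substitute for the proof above.

```python
import math

def lhs(a: float, T: float) -> float:
    """Left-hand side of (3.9): (1 - a^(-T/2)) / log a, for a > 1."""
    return (1.0 - a ** (-T / 2.0)) / math.log(a)

def rhs(a: float, T: float) -> float:
    """Right-hand side of (3.9): max{3T/2, 2} * (1 + 1/a) / (1 + log a)."""
    return max(1.5 * T, 2.0) * (1.0 + 1.0 / a) / (1.0 + math.log(a))

def claim_holds_on_grid(T_values, a_values, tol: float = 1e-12) -> bool:
    """True if lhs <= rhs (up to a small tolerance) at every grid point."""
    return all(lhs(a, T) <= rhs(a, T) + tol for T in T_values for a in a_values)

# Illustrative grid: a ranges over (1, 1e6 + 1], covering both cases a <= e and a >= e.
T_grid = [0.01, 0.1, 0.5, 1.0, 2.0, 5.0, 20.0]
a_grid = [1.0 + 10.0 ** k for k in range(-6, 7)] + [math.e, 2.0, 10.0]

print(claim_holds_on_grid(T_grid, a_grid))
```

The point \(a=1\) is excluded because the left-hand side is a quotient \(0/0\) there; its limit as \(a\searrow 1\) is \(T/2\).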

Noting that \(8T\le 16\max \left\{ \frac{3T}{2},2\right\} \), we have thus proved that for all \(T>0\) with \(e^{-2T}\ge p-1\), one has

$$\begin{aligned} \textrm{Var}(A)\le \frac{16\max \left\{ \frac{3T}{2},2\right\} }{1-e^{-2T}}\sum _{j=1}^n \frac{\Vert d_j(A)\Vert _p^p(1+\Vert d_j(A)\Vert _p^p)}{1+\log ^+(1/\Vert d_j(A)\Vert _p^p)}. \end{aligned}$$

Choosing \(T=-\frac{1}{2} \log (p-1)\) when \(p>1\) (for \(p=1\) the constraint on T is vacuous and any fixed \(T>0\) may be used), the above inequality becomes

$$\begin{aligned} \textrm{Var}(A)\le C_p \sum _{j=1}^n \frac{\Vert d_j(A)\Vert _p^p(1+\Vert d_j(A)\Vert _p^p)}{1+\log ^+(1/\Vert d_j(A)\Vert _p^p)}, \end{aligned}$$

with \(C_p=\frac{16\max \{-\frac{3}{4}\log (p-1),2\}}{2-p}\) which is of the order \(32/(2-p)\) as \(p\nearrow 2\). \(\square \)
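To make the behaviour of the constant concrete, here is a minimal Python sketch that evaluates the formula for \(C_p\) stated at the end of the proof and prints the product \((2-p)C_p\). The function name and the sample values of \(p\) are illustrative assumptions. The formula is only meaningful for \(1<p<2\), since it comes from the choice \(T=-\frac{1}{2}\log (p-1)\).

```python
import math

def C_p(p: float) -> float:
    """Constant 16 * max{-(3/4) log(p-1), 2} / (2 - p) from the end of the proof."""
    if not 1.0 < p < 2.0:
        raise ValueError("the formula with T = -log(p-1)/2 requires 1 < p < 2")
    return 16.0 * max(-0.75 * math.log(p - 1.0), 2.0) / (2.0 - p)

# Illustrative values of p approaching 2 from below.
for p in (1.5, 1.9, 1.99, 1.999):
    print(p, C_p(p), (2.0 - p) * C_p(p))
```

Once \(-\frac{3}{4}\log (p-1)\le 2\), the maximum equals 2 and the product \((2-p)C_p\) is exactly 32, which matches the stated order \(32/(2-p)\).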

Remark 3.7

Theorem 3.6 above (when \(p=1\)) improves Theorem 3.2, since it gives the right order in the quantum KKL theorem, as we shall see next. However, Theorem 3.2 can be easily extended to more general von Neumann algebras, which will be discussed in Theorem 4.3. A generalization of Theorem 3.6 is also possible, but it requires additional assumptions, and we will not discuss it in the general von Neumann algebra setting.

3.3 A KKL theorem for quantum Boolean functions

Our quantum KKL theorem for geometric (\(L^1\)-)influences follows as a simple corollary of Theorem 3.6. First we need an elementary lemma.

Lemma 3.8

If \(n\in {\mathbb {N}}\), \(a_1,\dots ,a_n\ge 0\) and \(c>0\) are such that

$$\begin{aligned} \sum _{j=1}^n \frac{a_j(1+a_j)}{1+\log ^+(1/a_j)}\ge c\ , \end{aligned}$$

then

$$\begin{aligned} \max _{1\le j\le n}a_j\ge \min \left\{ \frac{c}{4},1\right\} \frac{\log n}{n}\ . \end{aligned}$$

Proof

If \(\max _{1\le j\le n}a_j\ge 1/\sqrt{n}\), we are done, so we can assume \(a_j<1/\sqrt{n}\le 1\) for all \(j\in \{1,\dots ,n\}\). Then we have

$$\begin{aligned} c&\le \sum _{j=1}^n \frac{2a_j}{1+\log (1/a_j)}\\&\le \frac{2}{1+\frac{1}{2} \log n}\sum _{j=1}^n a_j\\&\le \frac{4 n}{\log n}\max _{1\le j\le n} a_j\ . \end{aligned}$$

\(\square \)
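The statement of Lemma 3.8 can be probed numerically. The Python sketch below takes \(c\) to be the weighted sum itself (the largest admissible value) and compares \(\max _j a_j\) with the lower bound of the lemma. The random test vectors, the seed and the helper names are illustrative assumptions; terms with \(a_j=0\) are treated as contributing zero.

```python
import math
import random

def weighted_sum(a):
    """Sum of a_j (1 + a_j) / (1 + log^+(1/a_j)); entries a_j = 0 contribute 0."""
    total = 0.0
    for x in a:
        if x > 0.0:
            log_plus = max(math.log(1.0 / x), 0.0)
            total += x * (1.0 + x) / (1.0 + log_plus)
    return total

def lemma_lower_bound(a):
    """min{c/4, 1} * log(n) / n with c equal to the weighted sum of a."""
    n = len(a)
    c = weighted_sum(a)
    return min(c / 4.0, 1.0) * math.log(n) / n

def lemma_holds(a, tol: float = 1e-15) -> bool:
    """True if max_j a_j is at least the lower bound of the lemma."""
    return max(a) >= lemma_lower_bound(a) - tol

random.seed(0)
# Illustrative samples: many small entries, a few larger ones.
samples = [[random.random() ** 4 for _ in range(n)] for n in (2, 10, 100, 1000)]
print(all(lemma_holds(a) for a in samples))

# The flat profile a_j = 1/n, for comparison with the log(n)/n scale.
uniform = [1.0 / 1000] * 1000
print(max(uniform), lemma_lower_bound(uniform))
```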

Theorem 3.9

For every \(1\le p<2\), there exists a constant \(C_p>0\) such that for any \(n\ge 1\) and any balanced quantum Boolean function \(A\in M_2(\mathbb {C})^{\otimes n}\)

$$\begin{aligned} \max _{1\le j\le n}{\text {Inf}}^p_j (A)\ge C_p\frac{\log (n)}{n}\ . \end{aligned}$$

Proof

Since \(\textrm{Var}(A)=1\) for any balanced quantum Boolean function, the result follows from Theorem 3.6 with the help of Lemma 3.8. \(\square \)

All combined, we have shown that every balanced quantum Boolean function has a geometrically influential variable. In fact, suppose that \(A\in M_2(\mathbb {C})^{\otimes n}\) is a balanced quantum Boolean function, then

$$\begin{aligned} \Vert A-2^{-n}{\text {tr}}(A)\Vert _1=\Vert A\Vert _1=\Vert A\Vert _2^2=\textrm{Var}(A)=1. \end{aligned}$$

According to the \(L^1\)-Poincaré inequality (3.1),

$$\begin{aligned} \sum _{j=1}^{n}\Vert d_j A\Vert _1\ge 1\ . \end{aligned}$$

One may wonder if \({\text {Inf}}_j^1(A)=\Vert d_j A\Vert _1\approx 1/n\) for all \(1\le j\le n\) is possible. However, our Theorem 3.9 for \(p=1\) indicates that this is not the case. There exists j such that \({\text {Inf}}^1_j (A)\ge C\log (n)/n\) for some \(C>0\).

Remark 3.10

In [MO10a, Conjecture 3 of Section 12], the authors have conjectured a similar KKL-type result for the quantum \(L^2\)-influences \({\text {Inf}}_j^2(A)\). While this influence coincides with the \(L^1\)-influence \({\text {Inf}}^1_j(A)\) when A is a classical Boolean function, this is not the case in the quantum setting. Hence, this conjecture in [MO10a] remains open to the best of our knowledge.

3.4 A Friedgut’s Junta theorem for quantum Boolean functions

We recall that a Boolean function \(g:\Omega _n\rightarrow \{-1,1\}\) is called a k-junta if it only depends on a set of at most \(k<n\) bits. In [Fri98], Friedgut showed that for any Boolean function \(f:\{-1,1\}^n\rightarrow \{-1,1\}\) and \(\varepsilon \in (0,1)\), f is \(\varepsilon \)-close in 2-norm to a \(2^{\mathcal {O}({\text {Inf}}^2 f/\varepsilon )}\)-junta, where

$$\begin{aligned} {\text {Inf}}^2 f=\sum _{j=1}^n{\text {Inf}}^2_j f \end{aligned}$$
(3.12)

denotes the total \(L^2\)-influence of f, with \({\text {Inf}}_j^2(f):=\Vert D_j f\Vert _2^2\). More recently, Bouyrie [Bou17] proved an \(L^1\) version of Friedgut’s junta theorem, more adapted to continuous models, based on the proof techniques developed in [CEL12] (see also [Aus16] for a previous account of the result upon which the proof of [Bou17] relies). The next theorem constitutes a quantum generalization of the \(L^1\) Friedgut’s Junta theorem; see Corollary 3.12 below. Recall that we define k-juntas for operators that are not necessarily Boolean.

Theorem 3.11

For any \(A\in M_2({\mathbb {C}})^{\otimes n}\) and any \(\varepsilon >0\) small enough, there exists a k-junta \(B\in M_2({\mathbb {C}})^{\otimes n}\) with \(\Vert A-B\Vert _2\le \varepsilon \) and

$$\begin{aligned} k\le 2^{\frac{30{\text {Inf}}^2(A)}{\varepsilon ^2}}\frac{\Vert A\Vert _2^4{\text {Inf}}^1(A)^6}{{\text {Inf}}^2(A)^5}. \end{aligned}$$

Moreover, B can be taken to be \(2^{-|T|}{\text {tr}}_T(A)\) for some set \(T\subset \{1,\dots ,n\}\) of \(n-k\) qubits.

Proof

Let \(d=\frac{2{\text {Inf}}^2(A)}{\varepsilon ^2}\). If \(d\le 1\), then

$$\begin{aligned} \Vert A-2^{-|T|}{\text {tr}}_T(A)\Vert _2^2\le \sum _{j\in T}\Vert d_j(A)\Vert _2^2\le {\text {Inf}}^2(A)\le \frac{\varepsilon ^2}{2} \end{aligned}$$
(3.13)

for any subset T of \(\{1,\dots ,n\}\) by the (non-primitive) Poincaré inequality for the tensor product of depolarizing semigroups restricted to the subset T (see [Bar17, Example 3.1]).

Let us now consider the case \(d>1\). Let

$$\begin{aligned} T=\left\{ j\in \{1,\dots ,n\}\,\bigg \vert \, {\text {Inf}}^1_j(A)\le \frac{{\text {Inf}}^2(A)^5}{{\text {Inf}}^1(A)^5}\frac{2^{-15d}}{\Vert A\Vert _2^4}\right\} \end{aligned}$$

and \(B=2^{-|T|}{\text {tr}}_T(A)\). The matrix \(A-B\) has the Fourier decomposition

$$\begin{aligned} A-B=\sum _{\begin{array}{c} s\in \{0,1,2,3\}^n\\ s|_T\ne 0 \end{array}}{\widehat{A}}_s \sigma _s. \end{aligned}$$

By Plancherel’s identity,

$$\begin{aligned} \Vert A-B\Vert _2^2=\sum _{s|_T\ne 0}|{\widehat{A}}_s|^2=\sum _{\begin{array}{c} s|_T\ne 0\\ |{\text {supp}}(s)|>d \end{array}}|{\widehat{A}}_s|^2+\sum _{\begin{array}{c} s|_T\ne 0\\ |{\text {supp}}(s)|\le d \end{array}}|{\widehat{A}}_s|^2. \end{aligned}$$

Let us treat both summands on the right side separately. For the first summand,

$$\begin{aligned} \sum _{\begin{array}{c} s|_T\ne 0\\ |{\text {supp}}(s)|>d \end{array}}|{\widehat{A}}_s|^2\le \frac{1}{d}\sum _{\begin{array}{c} s|_T\ne 0\\ |{\text {supp}}(s)|>d \end{array}}|{\text {supp}}(s)||{\widehat{A}}_s|^2\le \frac{1}{d}{\text {Inf}}^2(A)= \frac{\varepsilon ^2}{2}, \end{aligned}$$

where we used formula (2.5) for \({\text {Inf}}^2\).

For the second summand,

$$\begin{aligned} \sum _{\begin{array}{c} s|_T\ne 0\\ |{\text {supp}}(s)|\le d \end{array}}|{\widehat{A}}_s|^2&\le e^{2dt}\sum _{s|_T\ne 0}e^{-2t|{\text {supp}}(s)|}|{\widehat{A}}_s|^2\\&\le e^{2dt}\sum _{j\in T}\sum _{\begin{array}{c} s\in \{0,1,2,3\}^n\\ s_j\ne 0 \end{array}}e^{-2t|{\text {supp}}(s)|}|{\widehat{A}}_s|^2\\&=e^{2dt}\sum _{j\in T}{\text {Inf}}^2_j(P_t(A)) \end{aligned}$$

for any \(t\ge 0\). Here we used (2.6) for the depolarizing semigroup.

Now take \(t=\log 2\). By intertwining (Lemma 2.3), hypercontractivity (Lemma 2.2) and interpolation,

$$\begin{aligned} {\text {Inf}}^2_j(P_{\log 2}(A))&=\Vert d_j(P_{\log 2}(A))\Vert _2^2 \\&=\Vert P_{\log 2}(d_j(A))\Vert _2^2\nonumber \\&\le \Vert d_j(A)\Vert _{5/4}^2\nonumber \\&\le \Vert d_j(A)\Vert _1^{6/5}\Vert d_j(A)\Vert _2^{4/5}.\nonumber \end{aligned}$$
(3.14)
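The exponents in (3.14) can be recovered by exact arithmetic. The sketch below, with illustrative function names, uses the hypercontractive exponent \(p(t)=1+e^{-2t}\) at \(t=\log 2\) and solves \(1/p=\theta +(1-\theta )/2\) for the interpolation weight \(\theta \) between the 1-norm and the 2-norm.

```python
from fractions import Fraction

def hypercontractive_exponent_at_log2() -> Fraction:
    """p(t) = 1 + e^{-2t} at t = log 2, where e^{-2t} = 1/4."""
    return 1 + Fraction(1, 4)

def interpolation_weights(p: Fraction):
    """Weights (theta, 1 - theta) with 1/p = theta/1 + (1 - theta)/2."""
    theta = 2 / p - 1
    return theta, 1 - theta

p = hypercontractive_exponent_at_log2()
theta, one_minus_theta = interpolation_weights(p)

# Squaring the interpolated norm doubles both weights.
print(p, 2 * theta, 2 * one_minus_theta)
```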

Since \(d_j\) is a projection, \(\Vert d_j(A)\Vert _2\le \Vert A\Vert _2\). Moreover, by definition, for \(j\in T\)

$$\begin{aligned} \Vert d_j(A)\Vert _1^{6/5}={\text {Inf}}_j^1(A)^{6/5}\le \frac{{\text {Inf}}^2(A)}{{\text {Inf}}^1(A)}\frac{2^{-3d}}{\Vert A\Vert _2^{4/5}}{\text {Inf}}^1_j(A). \end{aligned}$$

Therefore,

$$\begin{aligned} \sum _{\begin{array}{c} s|_T\ne 0\\ |{\text {supp}}(s)|\le d \end{array}}|{\widehat{A}}_s|^2&\le 4^d\sum _{j\in T}\Vert d_j(A)\Vert _1^{6/5}\Vert d_j(A)\Vert _2^{4/5}\\&\le 2^{-d} \frac{{\text {Inf}}^2(A)}{{\text {Inf}}^1(A)}\sum _{j\in T}{\text {Inf}}_j^1(A)\\&\le 4^{-\frac{{\text {Inf}}^2(A)}{\varepsilon ^2}}{\text {Inf}}^2(A)\\&\le e^{-\frac{{\text {Inf}}^2(A)}{\varepsilon ^2}}{\text {Inf}}^2(A)\\&\le \frac{\varepsilon ^2}{2}, \end{aligned}$$

where we used the elementary inequality \(x\le e^{x/2}\) for \(x\ge 0\) in the last step.

Altogether we have shown that \(\Vert A-B\Vert _2^2\le \varepsilon ^2\). Moreover, B is a k-junta with \(k=|T^c|\). Since

$$\begin{aligned} {\text {Inf}}^1_j(A)\ge \frac{{\text {Inf}}^2(A)^5}{{\text {Inf}}^1(A)^5}\frac{2^{-15d}}{\Vert A\Vert _2^4} \end{aligned}$$

for every \(j\in T^c\), we have

$$\begin{aligned} {\text {Inf}}^1(A)\ge \sum _{j\in T^c}{\text {Inf}}_j^1(A)\ge |T^c|\frac{{\text {Inf}}^2(A)^5}{{\text {Inf}}^1(A)^5}\frac{2^{-15d}}{\Vert A\Vert _2^4}. \end{aligned}$$

Hence

$$\begin{aligned} k=|T^c|\le 2^{15 d}\frac{\Vert A\Vert _2^4{\text {Inf}}^1(A)^6}{{\text {Inf}}^2(A)^5}=2^{\frac{30{\text {Inf}}^2(A)}{\varepsilon ^2}}\frac{\Vert A\Vert _2^4{\text {Inf}}^1(A)^6}{{\text {Inf}}^2(A)^5}. \end{aligned}$$

\(\square \)
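The selection of the set \(T\) and the counting argument at the end of the proof can be illustrated as follows. This is only a sketch under stated assumptions: the per-qubit influences and the values of \({\text {Inf}}^2(A)\), \(\Vert A\Vert _2\) and \(\varepsilon \) are hypothetical numbers chosen so that \(d>1\), not data from an actual operator, and the bound is handled on a \(\log _2\) scale to avoid overflow.

```python
import math

def junta_threshold(inf1_total, inf2_total, norm2, eps):
    """Threshold defining T: (Inf^2/Inf^1)^5 * 2^(-15 d) / ||A||_2^4, d = 2 Inf^2 / eps^2."""
    d = 2.0 * inf2_total / eps ** 2
    return (inf2_total / inf1_total) ** 5 * 2.0 ** (-15.0 * d) / norm2 ** 4

def log2_junta_bound(inf1_total, inf2_total, norm2, eps):
    """log2 of the bound 2^(30 Inf^2/eps^2) * ||A||_2^4 * Inf^1^6 / Inf^2^5."""
    return (30.0 * inf2_total / eps ** 2
            + 4.0 * math.log2(norm2)
            + 6.0 * math.log2(inf1_total)
            - 5.0 * math.log2(inf2_total))

# Hypothetical L^1-influences of 22 qubits: two large ones, twenty tiny ones.
inf1_by_qubit = [0.5, 0.3] + [1e-12] * 20
inf1_total = sum(inf1_by_qubit)
inf2_total, norm2, eps = 1.0, 1.0, 1.0  # hypothetical values, giving d = 2 > 1

threshold = junta_threshold(inf1_total, inf2_total, norm2, eps)
# Qubits outside T, i.e. with influence above the threshold.
kept = [j for j, value in enumerate(inf1_by_qubit) if value > threshold]

print(kept, math.log2(len(kept)), log2_junta_bound(inf1_total, inf2_total, norm2, eps))
```

The bound equals \({\text {Inf}}^1(A)\) divided by the threshold, which is exactly the counting step \(|T^c|\cdot \text {threshold}\le {\text {Inf}}^1(A)\) used above.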

In the next corollary we restrict ourselves to quantum Boolean functions.

Corollary 3.12

For any quantum Boolean \(A\in M_2({\mathbb {C}})^{\otimes n}\) and any \(\varepsilon >0\) small enough there exists a quantum Boolean k-junta \(C\in M_2({\mathbb {C}})^{\otimes n}\) with \(\Vert A-C\Vert _2\le \varepsilon \) and

$$\begin{aligned} k\le 2^{\frac{270{\text {Inf}}^2(A)}{\varepsilon ^2}}\frac{{\text {Inf}}^1(A)^6}{{\text {Inf}}^2(A)^5}. \end{aligned}$$

Proof

By Theorem 3.11 there exists a self-adjoint k-junta \(B\in M_2({\mathbb {C}})^{\otimes n}\) such that \(\Vert A-B\Vert _2\le \varepsilon \) and \(k\le 2^{\frac{30{\text {Inf}}^2(A)}{\varepsilon ^2}}{\text {Inf}}^1(A)^6/{\text {Inf}}^2(A)^5\). Let us now define \(C:={\text {sgn}}(B)\) as follows: Given the spectral decomposition \(B=\sum _i\lambda _i|\psi _i\rangle \langle \psi _i|\), \({\text {sgn}}(B)=\sum _{i}{\text {sgn}}(\lambda _i)|\psi _i\rangle \langle \psi _i|\), where the sign function \({\text {sgn}}\) is defined as

$$\begin{aligned} {\text {sgn}}(x):=\left\{ \begin{aligned}&1,\,&x> 0,\,\\&-1,\,&x\le 0\,. \end{aligned}\right. \end{aligned}$$

Note that \(|\lambda +{\text {sgn}}(\lambda )|\ge 1\). Then

$$\begin{aligned} |\lambda -{\text {sgn}}(\lambda )|^2\le |(\lambda -{\text {sgn}}(\lambda ))(\lambda +{\text {sgn}}(\lambda ))|^2=|\lambda ^2-1|^2 \end{aligned}$$

and

$$\begin{aligned} 2^n\Vert B-C\Vert _2^2=\sum _{i}|\lambda _i-{\text {sgn}}(\lambda _i)|^2\le \sum _{i}|\lambda _i^2-1|^2=2^n \Vert B^2-\textbf{1}\Vert _2^2\,. \end{aligned}$$

Therefore,

$$\begin{aligned} \Vert A-C\Vert _2\le \Vert A-B\Vert _2+\Vert B-C\Vert _2&\le \Vert A-B\Vert _2+\Vert B^2-\textbf{1}\Vert _2\\&\overset{(1)}{\le }\ \varepsilon +\Vert B^2-A^2\Vert _2\\&\le \varepsilon + \Vert (B-A)B\Vert _2+\Vert A(B-A)\Vert _2\\&\le \varepsilon \big (1+\Vert B\Vert +\Vert A\Vert \big )\\&\overset{(2)}{\le }\ \varepsilon \big (1+2\Vert A\Vert \big )\\&\le 3\varepsilon \,. \end{aligned}$$

Here, in (1) we have used that \(A^2=\textbf{1}\), whereas in (2) we used the fact that \(B=2^{-|T|}{\text {tr}}_T(A)\) for some set T of qubits, so that \(\Vert B\Vert \le \Vert A\Vert \le 1\). Moreover, we know the size of \(T^c\) from Theorem 3.11. The result then follows after rescaling \(\varepsilon \) to \(\varepsilon /3\). \(\square \)
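The rounding step \(C={\text {sgn}}(B)\) rests on the scalar inequality \(|\lambda -{\text {sgn}}(\lambda )|\le |\lambda ^2-1|\), applied eigenvalue by eigenvalue. A short Python sketch checking it on an illustrative grid of eigenvalues in \([-1,1]\) (the grid and the helper names are assumptions made for this example):

```python
def sgn(x: float) -> float:
    """Sign convention of the proof: +1 for x > 0 and -1 for x <= 0."""
    return 1.0 if x > 0 else -1.0

def rounding_error(lam: float) -> float:
    """|lambda - sgn(lambda)|, the eigenvalue-wise distance between B and sgn(B)."""
    return abs(lam - sgn(lam))

def square_defect(lam: float) -> float:
    """|lambda^2 - 1|, the eigenvalue-wise distance between B^2 and the identity."""
    return abs(lam * lam - 1.0)

# Illustrative eigenvalues in [-1, 1], as is the case when ||B|| <= 1.
eigenvalues = [k / 100.0 for k in range(-100, 101)]
pointwise_ok = all(rounding_error(x) <= square_defect(x) + 1e-12 for x in eigenvalues)

# Summing squares gives the corresponding comparison of the two squared 2-norms.
sum_rounding = sum(rounding_error(x) ** 2 for x in eigenvalues)
sum_defect = sum(square_defect(x) ** 2 for x in eigenvalues)
print(pointwise_ok, sum_rounding <= sum_defect)
```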

Remark 3.13

In the case of a classical Boolean function f, we know that \({\text {Inf}}f\equiv {\text {Inf}}^1 f={\text {Inf}}^2 f\), and the bound in Corollary 3.12 simplifies to

$$\begin{aligned} k\le e^{\frac{270{\text {Inf}}(f)}{\varepsilon ^2}}{\text {Inf}}(f). \end{aligned}$$
(3.15)

We therefore recover the classical Friedgut’s Junta theorem.

Remark 3.14

In the classical setting, other junta-type theorems related to Fourier analysis of Boolean functions may be found in [FKN02, ADFS04, Bou02, KN06, KS02, DFKO06]. While extending these results to the present quantum setting is an interesting problem, their statements do not directly involve the notion of influence that is central to our study. This interesting direction of research will therefore be considered elsewhere.

4 Von Neumann Algebraic Generalizations

In this section, we generalize the main results from the previous section to the general von Neumann algebraic setting. Apart from technical challenges that arise from the fact that the underlying Hilbert space can be infinite-dimensional and the operators involved can be unbounded, most proofs run parallel to the ones for qubits once the appropriate assumptions are identified. As demonstrated in the next section, these hypotheses are satisfied for a number of interesting examples besides the qubit systems treated in Sect. 3.

We start by recapitulating some basic von Neumann algebra theory. As a general reference, we refer to [Tak02, Tak03]. Let \(\mathcal {H}\) be a Hilbert space and \(B(\mathcal {H})\) the space of all bounded linear operators on \(\mathcal {H}\). The \(\sigma \)-weak topology on \(B(\mathcal {H})\) is the topology induced by the seminorms \(|{\text {tr}}(\,\cdot \,x)|\), where x runs over the set of all trace-class operators. A von Neumann algebra \({\mathcal {M}}\) on \(\mathcal {H}\) is a unital \(*\)-subalgebra of \(B(\mathcal {H})\) that is closed in the \(\sigma \)-weak topology. A linear functional on \({\mathcal {M}}\) is called normal if it is continuous with respect to the \(\sigma \)-weak topology. The set of all normal linear functionals on \({\mathcal {M}}\) is denoted by \({\mathcal {M}}_*\), and the obvious dual pairing between \({\mathcal {M}}\) and \({\mathcal {M}}_*\) establishes an isometric isomorphism between \({\mathcal {M}}\) and \(({\mathcal {M}}_*)^*\).

A state on \({\mathcal {M}}\) is a positive linear functional \(\varphi :{\mathcal {M}}\rightarrow \mathbb {C}\) such that \(\varphi (\textbf{1})=1\). A state is called faithful if \(\varphi (x^*x)=0\) implies \(x=0\). For a faithful normal state \(\varphi \) on \({\mathcal {M}}\) let \(\mathcal {H}_\varphi \) denote the completion of \({\mathcal {M}}\) with respect to the inner product

$$\begin{aligned} \langle \,\cdot \,,\cdot \,\rangle _\varphi :{\mathcal {M}}\times {\mathcal {M}}\rightarrow \mathbb {C},\,(x,y)\mapsto \varphi (x^*y), \end{aligned}$$

and let \(\Lambda _\varphi (x)\) denote the image of x inside \(\mathcal {H}_\varphi \). The GNS representation is defined by \(\pi _\varphi (x)\Lambda _\varphi (y)=\Lambda _\varphi (xy)\). The vector \(\Lambda _\varphi (\textbf{1})\) is a cyclic and separating vector for \(\pi _\varphi ({\mathcal {M}})\), which is denoted by \(\Omega _\varphi \). We routinely identify \({\mathcal {M}}\) with \(\pi _\varphi ({\mathcal {M}})\).

For the definition of the noncommutative \(L^p\) spaces, we need some basic modular theory. The operator

$$\begin{aligned} S_0:\Lambda _\varphi ({\mathcal {M}})\rightarrow \Lambda _\varphi ({\mathcal {M}}),\,\Lambda _\varphi (x)\mapsto \Lambda _\varphi (x^*) \end{aligned}$$

is a closable anti-linear operator on \(\mathcal {H}_\varphi \). Let S denote its closure and \(S=J\Delta ^{1/2}\) the polar decomposition of S. The operator J is an anti-unitary involution, called the modular conjugation, and \(\Delta =S^*S\) is called the modular operator.

The symmetric embedding \(i_2\) of \({\mathcal {M}}\) into \(\mathcal {H}_\varphi \) is given by \(i_2(x)=\Delta ^{1/4}\Lambda _\varphi (x)\) and the symmetric embedding \(i_1\) of \({\mathcal {M}}\) into \({\mathcal {M}}_*\) is uniquely determined by the relation

$$\begin{aligned} \langle i_2(x^*),i_2(y)\rangle =i_1(x)(y), \end{aligned}$$

or in other words, \(i_1=i_2^*J i_2\) if we view J as an isomorphism between \(\mathcal {H}_\varphi \) and \({\overline{\mathcal {H}}}_\varphi \cong \mathcal {H}_\varphi ^*\).

Kosaki’s interpolation \(L^p\) spaces [Kos84] are defined as the complex interpolation spaces

$$\begin{aligned} L^p({\mathcal {M}},\varphi )=({\mathcal {M}}_*,i_1({\mathcal {M}}))_{1/p} \end{aligned}$$

for \(p\in (1,\infty )\). Thus we get embeddings \(i_p:{\mathcal {M}}\rightarrow L^p({\mathcal {M}},\varphi )\) for \(p\in (1,\infty )\) with

$$\begin{aligned} \Vert i_p(x)\Vert \le \Vert i_1(x)\Vert ^{1/p}\Vert x\Vert ^{1-1/p} \end{aligned}$$

for all \(x\in {\mathcal {M}}\). In particular, \(L^2({\mathcal {M}},\varphi )\cong \mathcal {H}_\varphi \) isometrically, and the definition of \(i_2\) is consistent with the definition given before under this identification.

In the case \({\mathcal {M}}=M_n({\mathbb {C}})\), every state \(\varphi \) on \({\mathcal {M}}\) is of the form \(\varphi ={\text {tr}}(\,\cdot \,\sigma )\) for some density matrix \(\sigma \). The state \(\varphi \) is faithful if and only if \(\sigma \) is invertible. In this case \(L^p(M_n({\mathbb {C}}),\varphi )\) can be identified with \(M_n({\mathbb {C}})\) with the norm \({\text {tr}}(|\cdot |^p)^{1/p}\), and the embedding \(i_p\) is given by \(i_p(x)=\sigma ^{1/2p}x\sigma ^{1/2p}\). In particular, \(\Vert i_p(x)\Vert ={\text {tr}}(|\sigma ^{1/2p}x\sigma ^{1/2p}|^p)^{1/p}\), which is the expression for the \(L^p\) norm commonly used in quantum information theory.

A quantum Markov semigroup (QMS) on \({\mathcal {M}}\) is a family \((P_t)_{t\ge 0}\) of normal bounded linear operators on \({\mathcal {M}}\) such that

  • \(P_0=\textrm{id}_{{\mathcal {M}}}\), \(P_s P_t=P_{s+t}\) for \(s,t\ge 0\),

  • \(P_t(x)\rightarrow x\) as \(t\searrow 0\) in the \(\sigma \)-weak topology for every \(x\in {\mathcal {M}}\),

  • \(\sum _{j,k=1}^n y_j^*P_t(x_j^*x_k)y_k\ge 0\) for all \(x_1,\dots ,x_n,y_1,\dots ,y_n\in {\mathcal {M}}\) and \(t\ge 0\),

  • \(P_t(\textbf{1})=\textbf{1}\) for all \(t\ge 0\).

If \((P_t)_{t\ge 0}\) is a quantum Markov semigroup on \({\mathcal {M}}\), then \(P_t\) has a pre-adjoint \((P_t)_*:{\mathcal {M}}_*\rightarrow {\mathcal {M}}_*\) for every \(t\ge 0\), and \(((P_t)_*)_{t\ge 0}\) is a strongly continuous semigroup on \({\mathcal {M}}_*\). The QMS \((P_t)_{t\ge 0}\) is called KMS-symmetric with respect to \(\varphi \) if

$$\begin{aligned} \langle i_2(P_t(x)),i_2(y)\rangle =\langle i_2(x),i_2(P_t(y))\rangle \end{aligned}$$

for all \(x,y\in {\mathcal {M}}\) and \(t\ge 0\). In this case, for all \(p\in [1,\infty )\) and \(t\ge 0\) the operator

$$\begin{aligned} i_p({\mathcal {M}})\rightarrow i_p({\mathcal {M}}),\,i_p(x)\mapsto i_p(P_t(x)) \end{aligned}$$

extends to a contraction \(P_t^{(p)}\) on \(L^p({\mathcal {M}},\varphi )\), and \((P_t^{(p)})_{t\ge 0}\) is a strongly continuous semigroup. In particular, \((P_t)_*=P_t^{(1)}\). Occasionally we also write \(P_t^{(\infty )}\) for \(P_t\).

The generator of \((P_t^{(p)})_{t\ge 0}\) is defined by

$$\begin{aligned} D({\mathcal {L}}_p)&=\{x\in L^p({\mathcal {M}},\varphi )\mid \lim _{t\searrow 0}\frac{1}{t}(x-P_t^{(p)}(x))\text { exists}\},\\ {\mathcal {L}}_p(x)&=\lim _{t\searrow 0}\frac{1}{t}(x-P_t^{(p)}(x)), \end{aligned}$$

where the limit is taken in the norm topology if \(p\in [1,\infty )\) and in the \(\sigma \)-weak topology if \(p=\infty \). We also write \({\mathcal {L}}\) for \({\mathcal {L}}_\infty \). Note that there are differing sign conventions for the generator; with our convention, \({\mathcal {L}}_2\) is a positive self-adjoint operator on \(L^2({\mathcal {M}},\varphi )\).

We make the following assumption:

(H0):

There exists a \(*\)-subalgebra \({\mathcal {A}}\) of \(D({\mathcal {L}})\) which is \(\sigma \)-weakly dense in \({\mathcal {M}}\) and invariant under \((P_t)_{t\ge 0}\).

We can then define the carré du champ operator as follows:

$$\begin{aligned} \Gamma :{\mathcal {A}}\times {\mathcal {A}}\rightarrow {\mathcal {A}},\,\Gamma (x,y)=\frac{1}{2}({\mathcal {L}}(x)^*y+x^*{\mathcal {L}}(y)-{\mathcal {L}}(x^*y)). \end{aligned}$$

We write \(\Gamma (x)\) for \(\Gamma (x,x)\).

We will further use the following assumption:

(H1):

Bakry–Émery gradient estimate: There exists \(K\in {\mathbb {R}}\) such that

$$\begin{aligned} \Gamma (P_t(x))\le e^{-2Kt}P_t(\Gamma (x)) \end{aligned}$$

for all \(x\in {\mathcal {A}}\) and \(t\ge 0\).

To avoid case distinctions, the following notation will come in handy:

$$\begin{aligned} e_K(t)=2\int _0^t e^{2Ks}\,ds={\left\{ \begin{array}{ll} \frac{e^{2Kt}-1}{K}&{}\text {if }K\ne 0,\\ 2t&{}\text {if }K=0. \end{array}\right. } \end{aligned}$$

Further, we write \(K_-\) for the negative part of a real number K. The following result is an analog of Lemma 3.4.

Lemma 4.1

If \((P_t)_{t\ge 0}\) is a QMS satisfying (H0)–(H1), then

$$\begin{aligned} \Gamma (P_t(x))\le \frac{1}{e_K(t)}(P_t(x^*x)-P_t(x)^*P_t(x))\le \frac{\Vert x\Vert ^2}{e_K(t)} \end{aligned}$$

for all \(x\in {\mathcal {A}}\) and \(t\ge 0\).

In particular,

$$\begin{aligned} \Gamma (P_t(x))\le {\left\{ \begin{array}{ll} \frac{\Vert x\Vert ^2}{2t}&{}\text {if }K\ge 0,\\ \frac{\Vert x\Vert ^2}{t}&{}\text {if }K< 0\end{array}\right. } \end{aligned}$$

for all \(x\in {\mathcal {A}}\) and \(t\in \left[ 0,\frac{1}{2K_-}\right) \).
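The passage from the general bound \(\Vert x\Vert ^2/e_K(t)\) to the simplified bounds relies on the scalar facts \(e_K(t)\ge 2t\) for \(K\ge 0\) and \(e_K(t)\ge t\) for \(K<0\) and \(t<\frac{1}{2K_-}\). The following Python sketch evaluates \(e_K\) and checks these two facts on an illustrative grid; the grid, the tolerance and the function names are assumptions made for this example.

```python
import math

def e_K(K: float, t: float) -> float:
    """e_K(t) = 2 * integral_0^t exp(2 K s) ds, with the case K = 0 handled separately."""
    if K == 0.0:
        return 2.0 * t
    return math.expm1(2.0 * K * t) / K

def simple_lower_bound(K: float, t: float) -> float:
    """Lower bound for e_K(t) used in the simplified estimate: 2t if K >= 0, t if K < 0."""
    return 2.0 * t if K >= 0.0 else t

def bound_holds(K: float, t: float, rel_tol: float = 1e-12) -> bool:
    """Check e_K(t) >= simple lower bound; for K < 0 the time t must be below 1/(2|K|)."""
    if K < 0.0 and not t < 1.0 / (2.0 * abs(K)):
        raise ValueError("for K < 0 the estimate is only claimed for t < 1/(2|K|)")
    return e_K(K, t) >= simple_lower_bound(K, t) * (1.0 - rel_tol)

checks = []
for K in (0.0, 0.5, 3.0):
    checks += [bound_holds(K, t) for t in (0.01, 0.1, 1.0, 5.0)]
for K in (-0.5, -3.0):
    t_max = 1.0 / (2.0 * abs(K))
    checks += [bound_holds(K, f * t_max) for f in (0.01, 0.25, 0.5, 0.99)]
print(all(checks))
```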

(H2):

There exists a finite family of linear self-adjoint maps \(d_j:{\mathcal {A}}\rightarrow {\mathcal {M}}\), \(j\in {\mathcal {J}}\), such that

$$\begin{aligned} \langle i_2(x),i_2({\mathcal {L}}(x))\rangle =\sum _{j\in {\mathcal {J}}}\Vert i_2(d_j(x))\Vert ^2 \end{aligned}$$
(H2-1)

and a constant \(M>0\) such that

$$\begin{aligned} \max _{j\in {\mathcal {J}}}\Vert d_j(x)\Vert \le M\Vert \Gamma (x)\Vert ^{1/2} \end{aligned}$$
(H2-2)

for all \(x\in {\mathcal {A}}\).

Note that (H2-1) implies in particular that the series on the right side converges for all \(x\in {\mathcal {A}}\), and by polarization,

$$\begin{aligned} \langle i_2(x),i_2({\mathcal {L}}(y))\rangle =\sum _{j\in {\mathcal {J}}}\langle i_2(d_j(x)),i_2(d_j(y))\rangle \end{aligned}$$
(4.1)

for all \(x,y\in {\mathcal {A}}\).

In this situation we define the p-influence of the j-th variable on x by \({\text {Inf}}^p_j(x)=\Vert i_p(d_j(x))\Vert ^p\) and the total influence of x by \({\text {Inf}}^p(x)=\sum _{j\in {\mathcal {J}}}{\text {Inf}}^p_j(x)\).

We say \((P_t)_{t\ge 0}\) is primitive if \(P_t(x)\rightarrow \varphi (x)\textbf{1}\) \(\sigma \)-weakly as \(t\rightarrow \infty \) for every \(x\in {\mathcal {M}}\).

Theorem 4.2

(\(L^1\)-Poincaré inequality) If \((P_t)_{t\ge 0}\) is a primitive KMS-symmetric QMS on \({\mathcal {M}}\) satisfying (H0)–(H2) with \(K>0\), then

$$\begin{aligned} \frac{\sqrt{K}}{M}\Vert i_1(x-\varphi (x)\textbf{1})\Vert \le \frac{ \pi }{2} {\text {Inf}}^1(x) \end{aligned}$$

for all \(x\in {\mathcal {A}}\).

Proof

By duality and \(\sigma \)-weak density of \({\mathcal {A}}\) in \({\mathcal {M}}\), we have

$$\begin{aligned} \Vert i_1(x-\varphi (x)\textbf{1})\Vert =\sup _{\begin{array}{c} y\in {\mathcal {A}}\\ \Vert y\Vert \le 1 \end{array}}|i_1(x-\varphi (x)\textbf{1})(y)|=\sup _{\begin{array}{c} y\in {\mathcal {A}}\\ \Vert y\Vert \le 1 \end{array}}|i_1(x)(y-\varphi (y)\textbf{1})|. \end{aligned}$$

Since \((P_t)_{t\ge 0}\) is primitive,

$$\begin{aligned} |i_1(x)(y-\varphi (y)\textbf{1})|&=\lim _{T\rightarrow \infty }|i_1(x)(y-P_T(y))|\\&=\left|\int _0^\infty i_1(x)({\mathcal {L}}P_t(y))\,dt\right|\\&=\left|\int _0^\infty \langle i_2(x^*),i_2({\mathcal {L}}(P_t(y)))\rangle \,dt\right|. \end{aligned}$$

Now by the consequence (4.1) of (H2-1),

$$\begin{aligned} \left|\int _0^\infty \langle i_2(x^*),i_2({\mathcal {L}}(P_t(y)))\rangle \,dt\right|&=\left|\int _0^\infty \sum _{j\in {\mathcal {J}}} \langle i_2(d_j(x^*)),i_2(d_j(P_t(y)))\rangle \,dt\right|\\&\le \sum _{j\in {\mathcal {J}}}\int _0^\infty |i_1(d_j(x))(d_j(P_t(y)))|\,dt \\&\le \sum _{j\in {\mathcal {J}}}{\text {Inf}}^1_j(x)\int _0^\infty \Vert d_j(P_t(y))\Vert \,dt \end{aligned}$$

By Lemma 4.1 and (H2-2),

$$\begin{aligned} \int _0^\infty \Vert d_j(P_t(y))\Vert \,dt\le&\, M\int _0^\infty \Vert \Gamma (P_t(y))\Vert ^{1/2}\,dt\\ \le&\,M\sqrt{K} \Vert y\Vert \int _0^\infty \frac{dt}{\sqrt{e^{2Kt}-1}}=\frac{\pi M}{2\sqrt{K}}\Vert y\Vert . \end{aligned}$$

All combined, we obtain the desired inequality. \(\square \)
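The value \(\int _0^\infty \frac{dt}{\sqrt{e^{2Kt}-1}}=\frac{\pi }{2K}\) used in the last step can be confirmed numerically. The sketch below is one possible check: it substitutes \(t=s^2\) to remove the integrable singularity at \(t=0\), truncates the integral where the integrand is negligible, and applies the midpoint rule. The step count, the truncation point and the function names are illustrative assumptions.

```python
import math

def integral_numeric(K: float, n_steps: int = 200_000, s_max_scaled: float = 8.0) -> float:
    """Midpoint rule for integral_0^infty dt / sqrt(e^{2Kt} - 1) after substituting t = s^2.

    The integral is truncated at s = s_max_scaled / sqrt(K), where e^{2Kt} = e^{128}.
    """
    s_max = s_max_scaled / math.sqrt(K)
    h = s_max / n_steps
    total = 0.0
    for i in range(n_steps):
        s = (i + 0.5) * h
        # dt = 2 s ds, and expm1 keeps e^{2Kt} - 1 accurate for small t
        total += 2.0 * s / math.sqrt(math.expm1(2.0 * K * s * s))
    return total * h

def integral_exact(K: float) -> float:
    """Closed form pi / (2K) used in the proof."""
    return math.pi / (2.0 * K)

for K in (0.5, 1.0, 3.0):
    print(K, integral_numeric(K), integral_exact(K))
```

Multiplying by the prefactor \(M\sqrt{K}\Vert y\Vert \) then gives the constant \(\frac{\pi M}{2\sqrt{K}}\Vert y\Vert \) appearing above.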

To prove our general noncommutative version of the \(L^1\)-Talagrand inequality, we need some more assumptions on \((P_t)_{t\ge 0}\), which we collect in the following:

(H3):

Poincaré inequality: There exists a constant \(\lambda >0\) such that

$$\begin{aligned} \lambda \Vert i_2(x-\varphi (x)\textbf{1})\Vert ^2\le \langle i_2(x),i_2({\mathcal {L}}(x))\rangle \end{aligned}$$

for all \(x\in {\mathcal {A}}\).

(H4):

Hypercontractivity: There exists a constant \(\alpha >0\) such that

$$\begin{aligned} \Vert i_2(P_t(x))\Vert \le \Vert i_p(x)\Vert \end{aligned}$$

for all \(x\in {\mathcal {A}}\), \(t\ge 0\) and \(p=1+e^{-2\alpha t}\).

(H5):

Intertwining: There exists a constant \(\mu \in {\mathbb {R}}\) such that

$$\begin{aligned} \Vert i_p(d_j(P_t(x)))\Vert \le e^{-\mu t}\Vert i_p(P_t(d_j(x)))\Vert \end{aligned}$$

for all \(x\in {\mathcal {A}}\), \(j\in {\mathcal {J}}\), \(p\in [1,\infty ]\) and \(t\ge 0\).

In fact, it is well-known [OZ99] that hypercontractivity (H4) implies Poincaré inequality (H3).

The proof of the following theorem follows the argument given by Cordero–Erausquin and Ledoux [CEL12, Theorem 6] in the commutative case. We refer to the appendix for the details.

Theorem 4.3

(\(L^1\)-Talagrand inequality). If \((P_t)_{t\ge 0}\) is a KMS-symmetric QMS on \({\mathcal {M}}\) satisfying (H0)–(H5), then there exists a constant \(C>0\) depending only on the constants \(K,M,\alpha ,\lambda ,\mu \) such that

$$\begin{aligned} \Vert i_2(x-\varphi (x)\textbf{1})\Vert ^2\le C\sum _{j\in {\mathcal {J}}}\frac{{\text {Inf}}^1_j(x)(1+{\text {Inf}}^1_j(x))}{(1+\log ^+(1/{\text {Inf}}^1_j(x)))^{1/2}} \end{aligned}$$

for all \(x\in {\mathcal {A}}\) with \(\Vert x\Vert \le 1\).

Again following [CEL12], we can also give a generalization of Talagrand’s inequality (1.4) in this setting.

Theorem 4.4

If \((P_t)\) is a KMS-symmetric QMS on \({\mathcal {M}}\) satisfying (H0), (H2)–(H5), then

$$\begin{aligned} \Vert i_2(x-\varphi (x)\textbf{1})\Vert ^2\le \frac{2e^{(2\alpha -\mu )_+/2\lambda }}{\alpha (1-e^{-1})}\sum _{j\in {\mathcal {J}}}\frac{{\text {Inf}}^2_j(x)}{1+\log (\sqrt{{\text {Inf}}_j^2(x)}/{\text {Inf}}_j^1(x))} \end{aligned}$$

for all \(x\in {\mathcal {A}}\).

Proof

By the Poincaré inequality (H3), we have

$$\begin{aligned} \Vert i_2(x-\varphi (x){\textbf {1}})\Vert ^2\le \frac{1}{1-e^{-1}}\left( \Vert i_2(x)\Vert ^2-\Vert i_2(P_T(x))\Vert ^2\right) \end{aligned}$$

for \(T=1/2\lambda \).

Arguing as in the proof of Theorem 3.2, we get

$$\begin{aligned} \Vert i_2(x)\Vert ^2-\Vert i_2(P_T(x))\Vert ^2=2\sum _{j\in {\mathcal {J}}}\int _0^T \Vert i_2(d_j(P_t(x)))\Vert ^2\,dt. \end{aligned}$$

By (H4) and (H5),

$$\begin{aligned} \Vert i_2(d_j(P_t(x)))\Vert \le e^{-\mu t}\Vert i_2(P_t(d_j(x)))\Vert \le e^{-\mu t}\Vert i_{p(t)}(d_j(x))\Vert \end{aligned}$$

with \(p(t)=1+e^{-2\alpha t}\).

After the change of variables \(s=p(t)\) and application of Hölder’s inequality we get

$$\begin{aligned} \Vert i_2(x-\varphi (x)\textbf{1})\Vert ^2&\le \frac{2}{1-e^{-1}}\sum _{j\in {\mathcal {J}}}\int _0^T e^{-2\mu t}\Vert i_{p(t)}(d_j(x))\Vert ^2\,dt\\&=\frac{1}{\alpha (1-e^{-1})}\sum _{j\in {\mathcal {J}}}\int _{1+e^{-\alpha /\lambda }}^2 (s-1)^{\mu /2\alpha -1}\Vert i_s(d_j(x))\Vert ^2\,ds\\&\le \frac{e^{(2\alpha -\mu )_+/2\lambda }}{\alpha (1-e^{-1})}\sum _{j\in {\mathcal {J}}}\int _1^2 \Vert i_s(d_j(x))\Vert ^2\,ds\\&\le \frac{e^{(2\alpha -\mu )_+/2\lambda }}{\alpha (1-e^{-1})}\sum _{j\in {\mathcal {J}}}\int _1^2 {\text {Inf}}^1_j(x)^{4/s-2}{\text {Inf}}^2_j(x)^{2-2/s}\,ds\\&=\frac{e^{(2\alpha -\mu )_+/2\lambda }}{\alpha (1-e^{-1})}\sum _{j\in {\mathcal {J}}}{\text {Inf}}_j^2(x)\int _1^2 \left( {\text {Inf}}^1_j(x)/\sqrt{{\text {Inf}}_j^2(x)}\right) ^{4/s-2}\,ds. \end{aligned}$$

From here, the claimed inequality follows from an elementary bound on the last integral (compare [CEL12, Theorem 1]). \(\square \)
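The elementary bound in question compares \(\int _1^2 {\text {Inf}}^1_j(x)^{4/s-2}{\text {Inf}}^2_j(x)^{2-2/s}\,ds\) with \(2\,{\text {Inf}}^2_j(x)/(1+\log (\sqrt{{\text {Inf}}_j^2(x)}/{\text {Inf}}_j^1(x)))\), which accounts for the factor 2 in the statement of Theorem 4.4. The Python sketch below checks this comparison numerically for a few hypothetical pairs of influences, under the assumption \(0<{\text {Inf}}^1_j(x)\le \sqrt{{\text {Inf}}^2_j(x)}\); the quadrature rule, the sample values and the function names are illustrative choices.

```python
import math

def holder_integral(inf1: float, inf2: float, n_steps: int = 20_000) -> float:
    """Midpoint rule for integral_1^2 inf1^(4/s - 2) * inf2^(2 - 2/s) ds."""
    h = 1.0 / n_steps
    total = 0.0
    for i in range(n_steps):
        s = 1.0 + (i + 0.5) * h
        total += inf1 ** (4.0 / s - 2.0) * inf2 ** (2.0 - 2.0 / s)
    return total * h

def claimed_bound(inf1: float, inf2: float) -> float:
    """2 * inf2 / (1 + log(sqrt(inf2) / inf1)), assuming 0 < inf1 <= sqrt(inf2)."""
    return 2.0 * inf2 / (1.0 + math.log(math.sqrt(inf2) / inf1))

# Hypothetical pairs (Inf^1_j, Inf^2_j) with Inf^1_j <= sqrt(Inf^2_j).
pairs = [(1.0, 1.0), (0.1, 0.5), (1e-3, 1e-2), (1e-6, 1e-3)]
for inf1, inf2 in pairs:
    print(inf1, inf2, holder_integral(inf1, inf2), claimed_bound(inf1, inf2))
```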

Since the p-influences for different p do not coincide in the quantum setting, this version of Talagrand’s inequality does not imply a KKL bound. However, we still have the following weaker bound as a consequence of Theorem 4.3. Again, the proof can be found in the appendix.

Theorem 4.5

If \((P_t)\) is a KMS-symmetric QMS on \({\mathcal {M}}\) satisfying (H0)–(H5) and the cardinality n of \({\mathcal {J}}\) is finite, then there exists \(C'>0\) depending only on the constants \(K,M,\alpha ,\lambda ,\mu \) such that

$$\begin{aligned} \max _{j\in {\mathcal {J}}}{\text {Inf}}^1_j(x)\ge C'\frac{\sqrt{\log n}}{n} \end{aligned}$$

for all self-adjoint \(x\in {\mathcal {A}}\) with \(\Vert i_2(x)\Vert =1\), \(\Vert x\Vert \le 1\) and \(\varphi (x)=0\).

Remark 4.6

The sharpness of the bound derived in Theorem 4.5 in the present general context was shown in [KMS12].

To prove our generalized version of Friedgut’s junta theorem, we need one last assumption on \((P_t)\). For that purpose, if \({\mathcal {I}}\subset {\mathcal {J}}\), let \(E_{{\mathcal {I}}}\) denote the orthogonal projection onto \(\bigcap _{i\in {\mathcal {I}}}\overline{i_2(\ker d_i)}\) in \(L^2({\mathcal {M}},\varphi )\).

(H6):

There exists a constant \(\nu >0\) such that

$$\begin{aligned} \nu \Vert i_2(x)-E_{{\mathcal {I}}}(i_2(x))\Vert ^2\le \sum _{i\in {\mathcal {I}}}\Vert i_2(d_i(x))\Vert ^2 \end{aligned}$$

for every \(x\in {\mathcal {A}}\) and \({\mathcal {I}}\subset {\mathcal {J}}\).

If \((P_t)\) is primitive, then \(E_{{\mathcal {J}}}(i_2(x))=i_2(\varphi (x)\textbf{1})\). Thus (H6) is a strengthening of the Poincaré inequality from (H3) in the case of primitive QMS.

Lemma 4.7

If \((P_t)\) is a KMS-symmetric QMS on \({\mathcal {M}}\) satisfying (H0), (H2), (H4)–(H6), then for any \(x\in {\mathcal {A}}\), \(t,\eta >0\) and \({\mathcal {I}}\subset {\mathcal {J}}\) such that \({\text {Inf}}^1_i(x)\le \eta \) for all \(i\in {\mathcal {I}}\) one has

$$\begin{aligned} \Vert (\textrm{id}-E_{{\mathcal {I}}})(i_2(P_t(x)))\Vert ^2\le \frac{e^{-\mu t}}{\nu }(\eta {\text {Inf}}^1(x))^{q(t)}({\text {Inf}}^2(x))^{1-q(t)}, \end{aligned}$$

where \(q(t)=\frac{1-e^{-2\alpha t}}{1+e^{-2\alpha t}}\).

Proof

By (H6) we have

$$\begin{aligned} \Vert i_2(P_t(x))-E_{{\mathcal {I}}}(i_2(P_t(x)))\Vert ^2\le \frac{1}{\nu }\sum _{i\in {\mathcal {I}}}\Vert i_2(d_i(P_t(x)))\Vert ^2. \end{aligned}$$

By (H4), (H5) and interpolation,

$$\begin{aligned} \Vert i_2(d_i(P_t(x)))\Vert&\le e^{-\mu t}\Vert i_2(P_t(d_i(x)))\Vert \\&\le e^{-\mu t}\Vert i_{p(t)}(d_i(x))\Vert \\&\le e^{-\mu t}\Vert i_1(d_i(x))\Vert ^{q(t)}\Vert i_2(d_i(x))\Vert ^{1-q(t)}. \end{aligned}$$

Therefore,

$$\begin{aligned} \Vert (\textrm{id}-E_{{\mathcal {I}}})(i_2(P_t(x)))\Vert ^2&\le \frac{e^{-\mu t}}{\nu } \sum _{i\in {\mathcal {I}}} \Vert i_1(d_i(x))\Vert ^{2q(t)}\Vert i_2(d_i(x))\Vert ^{2(1-q(t))}\\&\le \frac{e^{-\mu t}}{\nu }\left( \sum _{i\in {\mathcal {I}}}\Vert i_1(d_i(x))\Vert ^2\right) ^{q(t)}\left( \sum _{i\in {\mathcal {I}}}\Vert i_2(d_i(x))\Vert ^2\right) ^{1-q(t)}\\&\le \frac{e^{-\mu t}}{\nu }\left( \sum _{i\in {\mathcal {I}}} \eta {\text {Inf}}^1_i(x)\right) ^{q(t)}\left( \sum _{j\in {\mathcal {J}}}{\text {Inf}}^2_j(x)\right) ^{1-q(t)}\\&=\frac{e^{-\mu t}}{\nu }(\eta {\text {Inf}}^1(x))^{q(t)}({\text {Inf}}^2(x))^{1-q(t)}. \end{aligned}$$

\(\square \)
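The two elementary facts used in the proof can be made explicit. The following is only a sketch: it assumes, as the interpolation step suggests but as is not restated here, that the exponent \(p(t)\) satisfies \(\frac{1}{p(t)}=\frac{q(t)}{1}+\frac{1-q(t)}{2}\), and the numbers \(a_i,b_i\ge 0\) are placeholders for \(\Vert i_1(d_i(x))\Vert \) and \(\Vert i_2(d_i(x))\Vert \). Under this assumption, for \(\alpha ,t>0\),

$$\begin{aligned} q(t)=\frac{1-e^{-2\alpha t}}{1+e^{-2\alpha t}}=\tanh (\alpha t)\in (0,1),\qquad p(t)=\frac{2}{1+q(t)}=1+e^{-2\alpha t}, \end{aligned}$$

and the second inequality in the last display of the proof is Hölder's inequality with conjugate exponents \(1/q(t)\) and \(1/(1-q(t))\):

$$\begin{aligned} \sum _{i\in {\mathcal {I}}}a_i^{2q(t)}\,b_i^{2(1-q(t))}\le \left( \sum _{i\in {\mathcal {I}}}a_i^2\right) ^{q(t)}\left( \sum _{i\in {\mathcal {I}}}b_i^2\right) ^{1-q(t)}. \end{aligned}$$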

Lemma 4.8

If A is a positive self-adjoint operator on a Hilbert space \({\mathcal {H}}\), then

$$\begin{aligned} \Vert \xi -e^{-tA}\xi \Vert _{{\mathcal {H}}}^2\le t \Vert A^{1/2}\xi \Vert _{{\mathcal {H}}}^2 \end{aligned}$$

for every \(\xi \in D(A^{1/2})\) and \(t\ge 0\).

Proof

This follows by the spectral theorem from the scalar inequality \((1-e^{-tx})^2\le tx\) for \(t,x\ge 0\). \(\square \)
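For completeness, the scalar inequality and its use can be spelled out as follows. This is a sketch in which the spectral measure \(E\) of A and the integration variable s are notation introduced only for this illustration. Since \(0\le 1-e^{-tx}\le \min \{1,tx\}\) for \(t,x\ge 0\),

$$\begin{aligned} (1-e^{-tx})^2\le 1-e^{-tx}\le tx, \end{aligned}$$

and therefore, by the spectral theorem,

$$\begin{aligned} \Vert \xi -e^{-tA}\xi \Vert _{{\mathcal {H}}}^2=\int _{[0,\infty )}(1-e^{-ts})^2\,d\langle \xi ,E(s)\xi \rangle \le t\int _{[0,\infty )}s\,d\langle \xi ,E(s)\xi \rangle =t\Vert A^{1/2}\xi \Vert _{{\mathcal {H}}}^2. \end{aligned}$$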

A version of Friedgut’s junta theorem in this setting now reads as follows. Again, the proof can be found in the appendix.

Theorem 4.9

Let \((P_t)\) be a KMS-symmetric QMS on \({\mathcal {M}}\) satisfying (H0), (H2), (H4)–(H6). There exists a constant \(C>0\) depending only on \(\alpha \) and \(\nu \) such that for every \(x\in {\mathcal {A}}\) and \(0<\varepsilon \le 2/\nu \) there exists a set \({\mathcal {I}}\subset {\mathcal {J}}\) such that \(\Vert i_2(x)-E_{{\mathcal {I}}}(i_2(x))\Vert \le \varepsilon \) and

$$\begin{aligned} |{\mathcal {J}}\setminus {\mathcal {I}}|\le {\left\{ \begin{array}{ll}{\text {Inf}}^1(x)^2 \exp \left( C\mu _-+\frac{C {\text {Inf}}^2(x)}{\varepsilon ^2}\log \frac{2{\text {Inf}}^2(x)}{\nu \varepsilon }\right) &{}\text {if }{\text {Inf}}^2(x)\ge 1,\\ \frac{{\text {Inf}}^1(x)^2}{{\text {Inf}}^2(x)}\exp \left( C\mu _- +\frac{C {\text {Inf}}^2(x)}{\varepsilon ^2}\log \frac{2\sqrt{{\text {Inf}}^2(x)}}{\nu \varepsilon }\right) &{}\text {otherwise},\end{array}\right. } \end{aligned}$$

where \(\mu _-=-\mu \) if \(\mu <0\), and 0 otherwise.

5 Examples

5.1 Classical case

The results in [CEL12, Bou17] fit into our framework by choosing a commutative von Neumann algebra, i.e. \(({\mathcal {M}},\varphi )=L^\infty (X,\mu )\) with X a probability measure space.

5.2 Generalized depolarizing semigroups

We start with a simple weighted generalization of the depolarizing semigroup, also known as generalized depolarizing: given a full-rank state \(\omega \) over \(\mathbb {C}^{d}\),

$$\begin{aligned} e^{t\mathcal {L}_\omega }=\big (e^{-t}{\text {id}}+(1-e^{-t})\,{\text {tr}}(\omega \,\cdot )\textbf{1}\big )^{\otimes n}\,. \end{aligned}$$

We verify assumptions (H0)-(H5) for the semigroup \((e^{t\mathcal {L}_\omega })_{t\ge 0}\). First of all, since we are in a finite dimensional case, (H0) is directly satisfied. (H1) was proved in [JZ15] with \(K=\frac{1}{2}\). With \(d_j=\textrm{id}^{\otimes (j-1)}\otimes (\textrm{id}-{\text {tr}}(\omega \cdot )\textbf{1})\otimes \textrm{id}^{\otimes (n-j)}\) a direct computation shows

$$\begin{aligned} \langle x,{\mathcal {L}}_\omega (x)\rangle _\omega =\sum _{j=1}^n \langle d_j(x),d_j(x)\rangle _\omega , \end{aligned}$$

which settles (H2-1). Condition (H2-2) with \(M=\sqrt{2}\) follows as in Equation (3.7). Condition (H3) with \(\lambda =1\) is easy to check for \(n=1\), and the one for arbitrary n follows by tensorization. The best constant \(\alpha \) satisfying (H4) for any n has been shown in [BDR20, Theorems 24 & 25], whereas a lower bound on \(\alpha \) was found e.g. in [TPK14, Theorem 9]. A direct computation shows (H5) with \(\mu =1\).
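For \(n=1\) the direct computations mentioned above are short. The following is a sketch under two assumptions that are ours: the single-site semigroup is written as \(P_t=e^{-td}\) with \(d=\textrm{id}-E_\omega \), and \(E_\omega :={\text {tr}}(\omega \,\cdot )\textbf{1}\) is a shorthand introduced only for this illustration. Since \(E_\omega \) is idempotent and symmetric with respect to \(\langle \cdot ,\cdot \rangle _\omega \), the map d is an orthogonal projection, so that

$$\begin{aligned} \langle x,d(x)\rangle _\omega&=\langle d(x),d(x)\rangle _\omega ,\\ e^{-td}&=e^{-t}(\textrm{id}-E_\omega )+E_\omega =e^{-t}\,\textrm{id}+(1-e^{-t})E_\omega ,\\ d\,e^{-td}&=e^{-t}\,d, \end{aligned}$$

where the last line uses \(d^2=d\) and \(dE_\omega =0\). The second line agrees with the single-site factor in the formula for the semigroup, and the factor \(e^{-t}\) in the third line is in line with the value \(\mu =1\) quoted above.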

5.3 Quantum Ornstein–Uhlenbeck semigroup

Next, we consider the generator of the so-called quantum Ornstein–Uhlenbeck semigroup [FRS94, CFL00, CS08]. The latter acts on the algebra \(B(\ell ^2(\mathbb {N}))\) of all bounded operators on the Hilbert space \(\ell ^2(\mathbb {N})\) of square-summable sequences. Denoting by a and \(a^*\) the annihilation and creation operators of the quantum harmonic oscillator, which are defined by their action on a given orthonormal basis \(\{|k\rangle \}_{k\in \mathbb {N}}\) of \(\mathcal {H}\equiv \ell ^2(\mathbb {N})\simeq L^2(\mathbb {R})\) as follows:

$$\begin{aligned} a^*|k\rangle =\sqrt{k+1}|k+1\rangle \,,\qquad \text { and }\qquad a|k\rangle =\left\{ \begin{aligned}&\sqrt{k}|k-1\rangle&k\ge 1\\&0&\text { else}\, ,\end{aligned}\right. \end{aligned}$$

the generator of the quantum Ornstein–Uhlenbeck semigroup takes the following form at least on finite rank operators:

$$\begin{aligned} \mathcal {L}(x)=\frac{\mu ^2}{2}\,(a^*ax-2a^*xa+xa^*a)+\frac{\lambda ^2}{2}\,(aa^*x-2axa^*+xaa^*)\,, \end{aligned}$$

where \(\mu>\lambda >0\). Denoting \(\nu =\lambda ^2/\mu ^2\), it has a unique invariant state

$$\begin{aligned} \sigma _{\mu ,\nu }:=(1-\nu )\,\sum _{n\ge 0}\,\nu ^n\,|n\rangle \langle n|\,. \end{aligned}$$
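As a quick consistency check (a sketch that only uses \(a^*a|n\rangle =n|n\rangle \) and the geometric series for \(0<\nu <1\)), this state is normalized and its mean occupation number is

$$\begin{aligned} {\text {tr}}(\sigma _{\mu ,\nu })&=(1-\nu )\sum _{n\ge 0}\nu ^n=1,\\ {\text {tr}}(\sigma _{\mu ,\nu }\,a^*a)&=(1-\nu )\sum _{n\ge 0}n\,\nu ^n=\frac{\nu }{1-\nu }=\frac{\lambda ^2}{\mu ^2-\lambda ^2}, \end{aligned}$$

which coincides with the quantity denoted N in the action of the semigroup on characteristic functions below.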

Here we will use the notion of a Schwartz operator [KKW16]: an operator \(x\in B(L^2(\mathbb {R}))\) is called a Schwartz operator if for any indices \(\alpha ,\beta ,\alpha ',\beta '\in \mathbb {N}\),

$$\begin{aligned} \sup \big \{|\langle P^\beta Q^\alpha \psi ,\,x P^{\beta '}Q^{\alpha '}\varphi \rangle |:\, \psi ,\varphi \in \mathfrak {S}(\mathbb {R}),\,\Vert \psi \Vert ,\Vert \varphi \Vert \le 1 \big \}<\infty \,, \end{aligned}$$

where \(\mathfrak {S}(\mathbb {R})\) denotes the set of Schwartz functions over \(\mathbb {R}\), \(Q:(x\mapsto \psi (x))\mapsto (x\mapsto x\psi (x))\) is the so-called position operator and \(P:(x\mapsto \psi (x))\mapsto (x\mapsto -i\psi '(x))\) is the momentum operator. We denote by \(\mathfrak {S}(\mathcal {H})\) the algebra of Schwartz operators.

Proposition 5.1

The semigroup generated by \(\mathcal {L}\) and the derivations \(d_a:=[a,\cdot ]\) and \(d_{a^*}:=[a^*,\cdot ]\) satisfy the conditions (H0)–(H5) with respect to the algebra \(\mathcal {A}\equiv \mathfrak {S}(\mathcal {H})\).

Proof

The set \(\mathfrak {S}(\mathcal {H})\) of Schwartz operators is a \(*\)-subalgebra of \(B(L^2(\mathbb {R}))\) [KKW16, Lemma 3.5]. Moreover for any \(p\ge 1\), the set \(\mathfrak {S}_0(\mathcal {H})\) of finite-rank Schwartz operators is dense in the space \(\mathcal {T}_p(\mathcal {H})\) of Schatten-p operators [KKW16, Lemma 2.5]. Therefore, since finite-rank operators are \(\sigma \)-weakly dense in \(B(\mathcal {H})\), this also holds for \(\mathfrak {S}(\mathcal {H})\). In order to show that \(\mathfrak {S}(\mathcal {H})\) is invariant with respect to the semigroup generated by \(\mathcal {L}\), we use tools from noncommutative Fourier analysis: given a trace-class operator x, its characteristic function is given by

$$\begin{aligned} \chi _x(z):= {\text {tr}}(x D(z))\,, \end{aligned}$$

where \(D(z):=e^{za^*-\bar{z}a}\), for \(z\in \mathbb {C}\), is the so-called one-mode displacement operator. By the quantum Plancherel identity, we have that for any two trace-class operators x, y [Hol11],

$$\begin{aligned} {\text {tr}}(x^* y)=\int \frac{d^2z}{\pi }\, \overline{\chi _x(z)}\,\chi _y(z)\,. \end{aligned}$$

Moreover, the quantum Ornstein–Uhlenbeck semigroup can be represented by a family of quantum channels \(e^{t\mathcal {L}}\) modelling a quantum beam-splitter of transmissivity \(\eta =e^{-(\mu ^2-\lambda ^2)t}\) with environment state \(\sigma _{\mu ,\nu }\), which can be shown to induce the following action on characteristic functions:

$$\begin{aligned} \chi _x(z)\longrightarrow \chi _{e^{t\mathcal {L}^*}(x)}(z)= \chi _x(\sqrt{\eta }\,z)\,\chi _{\sigma _{\mu ,\nu }}(\sqrt{1-\eta }\,z)=\chi _x(\sqrt{\eta }\,z)\,e^{-(2N+1)(1-\eta )|z|^2/2}\,, \end{aligned}$$

with \(N=\frac{\lambda ^2}{\mu ^2-\lambda ^2}\). Therefore, for any \(\alpha ,\beta ,\alpha ',\beta '\in \mathbb {N}\) and normalized \(\psi ,\varphi \in \mathfrak {S}(\mathbb {R})\), denoting \(y:=|P^{\beta '}Q^{\alpha '}\varphi \rangle \langle P^\beta Q^\alpha \psi |\), we have

$$\begin{aligned} \langle P^\beta Q^\alpha \psi ,\,e^{t\mathcal {L}}(T)\,P^{\beta '}Q^{\alpha '}\varphi \rangle&={\text {tr}}(y\,e^{t\mathcal {L}}(T)) \\&={\text {tr}}(e^{t\mathcal {L}^*}(y)T)\\&=\int \,\frac{d^2z}{\pi } \,\chi _y(\sqrt{\eta }z)\,e^{-(2N+1)(1-\eta )|z|^2/2}\,\chi _T(z)\\&=\int \,\frac{d^2u}{\eta \pi } \,\chi _y(u)\,e^{-(2N+1)(1-\eta )|u|^2/2\eta }\,\chi _T(u/\sqrt{\eta })\\&<\infty \,, \end{aligned}$$

where we used that \(u\mapsto e^{-(2N+1)(1-\eta )|u|^2/2\eta }\chi _T(u/\sqrt{\eta })\) is a Schwartz function, see [KKW16, Proposition 3.18]. Finally, by [KKW16, Proposition 3.14], for any \(x\in \mathfrak {S}(\mathcal {H})\), \(\mathcal {L}(x)\) is closable with closure in \(\mathfrak {S}(\mathcal {H})\). Hence, (H0) is satisfied for the algebra \(\mathcal {A}\equiv \mathfrak {S}(\mathcal {H})\). Property (H1) can be easily derived from the canonical commutation relation \([a,a^*]=\mathbb {I}\) and gives \(K=(\mu ^2-\lambda ^2)/2\) (see e.g. [CM17a]). Property (H2) is satisfied for the maps \(d_a:=[a,\cdot ]\) and \(d_{a^*}:=[a^*,\cdot ]\). The Poincaré inequality (H3) follows from the characterization of the spectrum of the generator \(\mathcal {L}\) established in [CFL00]. The hypercontractivity constant in (H4) was estimated in [CS08]. The intertwining relation of (H5) was found in [CM17a]. \(\square \)

5.4 Group von Neumann algebras

Let G be a countable discrete group with unit e, L(G) the group von Neumann algebra on \(\ell ^2(G)\) generated by \(\{\lambda _g,g\in G\}\) where \(\lambda \) is the left regular representation of G. We denote by \(\tau (x)=\langle x\delta _e,\delta _e\rangle \) the canonical tracial faithful state. Here and in what follows, \(\delta _g\) always denotes the function on G that takes value 1 at g and vanishes elsewhere.

A function \(\psi :G\rightarrow [0,\infty )\) is a conditionally negative definite (cnd) length function if \(\psi (e)=0\), \(\psi (g^{-1})=\psi (g)\) and

$$\begin{aligned} \sum _{g,h\in G}\overline{f(g)}f(h)\psi (g^{-1}h)\le 0 \end{aligned}$$

for every \(f:G\rightarrow \mathbb {C}\) with finite support such that \(\sum _{g\in G} f(g)=0\).

By Schoenberg’s Theorem (see for example [BO08, Theorem D.11]), to every cnd function one can associate a \(\tau \)-symmetric quantum Markov semigroup on L(G) given by

$$\begin{aligned} P_t \lambda _g=e^{-t\psi (g)}\lambda _g. \end{aligned}$$

For a countable discrete group G, a 1-cocycle is a triple \((H,\pi ,b)\), where H is a real Hilbert space, \(\pi :G\rightarrow O(H)\) is an orthogonal representation, and \(b:G\rightarrow H\) satisfies the cocycle law: \(b(gh)=b(g)+\pi (g)b(h),g,h\in G.\) To any cnd function \(\psi \) on a countable discrete group G, one can associate a 1-cocycle \((H,\pi ,b)\) such that \(\psi (g^{-1}h)=\Vert b(g)-b(h)\Vert ^2,g,h\in G\). See [BO08, Appendix D] for more information.
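A simple instance, given here only as an illustration and not taken from the cited references, is \(G=\mathbb {Z}\) (written additively) with \(H=\mathbb {R}\), \(\pi \) the trivial representation and \(b(g)=g\), so that \(\psi (g)=g^2\). The cocycle law reduces to \(b(g+h)=b(g)+b(h)\), one has \(\psi (h-g)=(h-g)^2=\Vert b(g)-b(h)\Vert ^2\), and for every finitely supported \(f:\mathbb {Z}\rightarrow \mathbb {C}\) with \(\sum _g f(g)=0\),

$$\begin{aligned} \sum _{g,h\in \mathbb {Z}}\overline{f(g)}f(h)(h-g)^2&=\sum _{g,h\in \mathbb {Z}}\overline{f(g)}f(h)\,(g^2-2gh+h^2)\\&=-2\,\Big |\sum _{g\in \mathbb {Z}}g\,f(g)\Big |^2\le 0, \end{aligned}$$

since the terms containing \(g^2\) and \(h^2\) vanish by the assumption \(\sum _g f(g)=0\). In this case the associated semigroup acts as \(P_t\lambda _g=e^{-tg^2}\lambda _g\).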

Fix an orthonormal basis \((e_j)_{j\in {\mathcal {J}}}\) of H. In case G is finite, the index set \({\mathcal {J}}\) can always be taken to be finite. Let \({\mathcal {A}}\) be the linear span of the operators \(\lambda _g\), \(g\in G\), and let

$$\begin{aligned} d_j:{\mathcal {A}}\rightarrow L(G),\,d_j(\lambda _g)=\langle b(g),e_j\rangle \lambda _g. \end{aligned}$$

The space \({\mathcal {A}}\) is contained in the domain of the generator \({\mathcal {L}}\) of \((P_t)\) and \({\mathcal {L}}(\lambda _g)=\psi (g)\lambda _g\) for \(g\in G\). Moreover, \(\Gamma (\lambda _g,\lambda _h)=\langle b(g),b(h)\rangle \lambda _{g^{-1}h}\) for \(g,h\in G\).

Clearly, condition (H0) is satisfied. Condition (H1) is satisfied with \(K=0\) [WZ21, Example 3.14]. For condition (H2) note that if \(x=\sum _g f(g)\lambda _g\in {\mathcal {A}}\), then

$$\begin{aligned} \sum _{j\in {\mathcal {J}}}d_j(x)^*d_j(x)&=\sum _{j\in {\mathcal {J}}}\sum _{g,h\in G}\overline{f(g)}f(h)d_j(\lambda _g)^*d_j(\lambda _h)\\&=\sum _{j\in {\mathcal {J}}}\sum _{g,h\in G}\overline{f(g)}f(h)\langle b(g),e_j\rangle \langle b(h),e_j\rangle \lambda _{g^{-1}h}\\&=\sum _{g,h\in G}\overline{f(g)}f(h)\langle b(g),b(h)\rangle \lambda _{g^{-1}h}\\&=\sum _{g,h\in G}\overline{f(g)}f(h)\Gamma (\lambda _g,\lambda _h)\\&=\Gamma (x). \end{aligned}$$

In particular, \(d_j(x)^*d_j(x)\le \Gamma (x)\) for every \(j\in {\mathcal {J}}\). Moreover,

$$\begin{aligned} \sum _{j\in {\mathcal {J}}}\Vert i_2(d_j(x))\Vert ^2=\sum _{j\in {\mathcal {J}}}\tau (d_j(x)^*d_j(x))=\tau (\Gamma (x))=\langle i_2(x),i_2({\mathcal {L}}(x))\rangle . \end{aligned}$$

Thus condition (H2) holds with \(M=1\). Condition (H3) holds with the spectral gap \(\lambda =\inf _{g:\psi (g) >0} \psi (g)\) of \({\mathcal {L}}\). Since \(d_j P_t=P_t d_j\), condition (H5) is always satisfied with \(\mu =0\).

Condition (H4) is known to hold for certain discrete groups. For free groups, it is known that (H4) holds with \(\alpha =2\) [JPP+15, Theorem A]. We refer to [JPPP17] for more examples including triangular groups, finite cyclic groups \(\mathbb {Z}_N, N\ge 6\), infinite Coxeter groups etc. with \(0<\alpha <\infty \).

6 Applications

6.1 Influence and circuit complexity lower bounds

As mentioned in the introduction, Karpovsky [Kar76] was the first to propose the total influence as a measure of complexity of a function f. This intuition was then made rigorous in [LMN93] and [Bop97] where tight circuit complexity lower bounds in terms of the total influence were derived for the complexity class \({\text {AC}}^0\) of constant depth circuits.

Similar results were recently derived in the quantum setting. For instance, [BGJ+22] show a direct link between the notion of \(L^2\)-influence and the complexity of quantum circuits. More precisely, they showed that for a quantum circuit U, that is a unitary matrix in \(M_2(\mathbb {C})^{\otimes n}\) [BGJ+22, Theorem 12]

$$\begin{aligned} \frac{1}{8}\,{\text {CiS}}^2(U)\le {\text {Cost}}(U)\,, \end{aligned}$$

where the \(L^2\)-circuit sensitivity \({\text {CiS}}^2(U)\) is defined as

$$\begin{aligned} {\text {CiS}}^2(U):=\max _{\Vert O\Vert _2=1}\,\Big |{\text {Inf}}^2(UOU^*)-{\text {Inf}}^2(O)\Big |\,, \end{aligned}$$

and where \({\text {Cost}}(U)\) refers to the cost of the circuit and was introduced in a series of seminal papers by Nielsen and coauthors [Nie06, NDGD06a, NDGD06b, DN08] as a lower bound on the minimal number of one and two-qubit gates required from a given universal gate-set in order to synthesize the unitary U. More precisely, given traceless self-adjoint operators \(h_1,\dots ,h_m\) that are supported on 2 qubits and normalized as \(\Vert h_i\Vert =1\), the circuit cost of U with respect to \(h_1,\dots , h_m\) is defined as

$$\begin{aligned} {\text {Cost}}(U):=\inf \int _0^1\,\sum _{j=1}^m |r_j(s)|\,ds\,, \end{aligned}$$

where the infimum above is taken over all continuous functions \(r_j:[0,1]\rightarrow \mathbb {R}\) satisfying

$$\begin{aligned} U=\mathcal {P}{\text {exp}}\Big (-i\int _0^1\,H(s)\,ds\Big )\,, \end{aligned}$$

and

$$\begin{aligned} H(s)=\sum _{j=1}^m\,r_j(s)\,h_j\,, \end{aligned}$$

where \(\mathcal {P}\) denotes the path-ordering operator. We start by providing a simple bound on the p-influences for \(p\in [1,2]\) (for convenience we may write \(\frac{{\text {tr}}_{N_j}}{2^{|N_j|}}(O)\) for \(\frac{{\text {tr}}_{N_j}}{2^{|N_j|}}(O)\otimes \textbf{1}\)):

Proposition 6.1

For any \(j\in \{1,\dots , n\}\), let \(N_j\subset \{1,\dots , m\}\) be the minimal set of qubits such that \(\frac{{\text {tr}}_j}{2}(U(\frac{{\text {tr}}_{N_j}}{2^{|N_j|}}(O)\otimes \textbf{1})U^*)=U(\frac{{\text {tr}}_{N_j}}{2^{|N_j|}}(O)\otimes \textbf{1})U^*\) for any \(O\in M_2(\mathbb {C})^{\otimes n}\). Then, for any self-adjoint matrix \(O\in M_2(\mathbb {C})^{\otimes n}\) with \(\Vert O\Vert _2\le 1\) and all \(p\in [1,2]\) we have

$$\begin{aligned} {\text {Inf}}^p(UOU^*)\le \sum _{i=1}^n\,\Vert d_i O\Vert _2^p\,\,|\{j:i\in N_j\}|\,. \end{aligned}$$

In the case \(p=2\) and denoting \(L:=\max _i\,|\{j:\,i\in N_j\}|\), we get

$$\begin{aligned} {\text {Inf}}^2(UOU^*)\le L\,{\text {Inf}}^2(O)\,. \end{aligned}$$

Proof

For \(p\in [1,2]\) and any \(O\in M_2(\mathbb {C})^{\otimes n}\) with \(\Vert O\Vert _2\le 1\), we have

$$\begin{aligned} {\text {Inf}}^p(UOU^*)&=\sum _{j=1}^n \Vert UOU^*-\frac{{\text {tr}}_j}{2}(UOU^*)\Vert _p^p\\&\le \sum _{j=1}^n \Vert UOU^*-\frac{{\text {tr}}_j}{2}(UOU^*)\Vert _2^p \\&\le \sum _{j=1}^n \Vert UOU^*-U\frac{{\text {tr}}_{N_j}}{2^{|N_j|}}(O)U^*\Vert _2^p\\&= \sum _{j=1}^n \Vert O-\frac{{\text {tr}}_{N_j}}{2^{|N_j|}}(O)\Vert _2^p\\&\le \sum _{j=1}^n\Big (\sum _{i\in N_j}\Vert O-\frac{{\text {tr}}_{i}}{2}(O)\Vert _2^2\Big )^\frac{p}{2}\\&\le \sum _{j=1}^n\sum _{i\in N_j}\Vert O-\frac{{\text {tr}}_i}{2}(O)\Vert _2^p\\&\le \sum _{i=1}^n \Vert O-\frac{{\text {tr}}_i}{2}(O)\Vert _2^p\, |\{j:i\in N_j\}| \end{aligned}$$

where in the second inequality above we use that the partial trace \(\frac{{\text {tr}}_j}{2}\) is a projection onto the algebra of operators supported on \(\{j\}^c\), and therefore minimizes the distance to that subalgebra. The third inequality follows from the non-primitive Poincaré inequality from Equation (3.13). \(\square \)
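The remaining two steps in the display are elementary. As a sketch, write \(a_i\ge 0\) as a placeholder for \(\Vert O-\frac{{\text {tr}}_i}{2}(O)\Vert _2^2\). Since \(0<p/2\le 1\), the map \(s\mapsto s^{p/2}\) is subadditive on \([0,\infty )\), which gives

$$\begin{aligned} \Big (\sum _{i\in N_j}a_i\Big )^{\frac{p}{2}}\le \sum _{i\in N_j}a_i^{\frac{p}{2}}, \end{aligned}$$

and exchanging the order of summation yields

$$\begin{aligned} \sum _{j=1}^n\sum _{i\in N_j}a_i^{\frac{p}{2}}=\sum _{i}a_i^{\frac{p}{2}}\,|\{j:i\in N_j\}|, \end{aligned}$$

where i ranges over all qubits contained in some \(N_j\). The last step of the chain is therefore an equality.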

Remark 6.2

The assumption \(\frac{{\text {tr}}_j}{2}(U(\frac{{\text {tr}}_{N_j}}{2^{|N_j|}}(O)\otimes \textbf{1})U^*)=U(\frac{{\text {tr}}_{N_j}}{2^{|N_j|}}(O)\otimes \textbf{1})U^*\) can be interpreted as a lightcone condition: let us consider for simplicity a unitary circuit on n qubits in brickwork architecture of the form \(U=U^{\ell } U^{\ell -1}\dots U^1\), where for each j, \(U^{2j+1}=U_{1,2}^{2j+1}\otimes \dots \otimes U^{2j+1}_{n-1,n }\) and \(U^{2j}=U_{2,3}^{2j}\otimes \dots \otimes U^{2j}_{n-2,n-1 }\), where by \(U^j_{r,r+1}\) we mean a unitary with non-trivial support on qubits r and \(r+1\). Hence, for any set \(N_1=\{1,\dots , n_1\}\) and any observable \(O_{N_1^c}\) supported on \(N_1^c\),

$$\begin{aligned} UO_{N_1^c} U^*= U^\ell \dots U^1 O_{N_1^c}(U^1)^*\dots (U^\ell )^* \end{aligned}$$

is supported in the set \(\{n_1-\ell +1,\dots , n\}\). Hence, for \(n_1=\ell +1\), the condition holds. In other words, \(n_1\) scales linearly with the depth \(\ell \) of the circuit U. The above simple argument generalizes easily to higher dimensions and general local unitary circuits.

In the case when \(p=1\) we can bound the total \(L^1\) output influence in terms of the total \(L^1\) input influence.

Proposition 6.3

For any \(j\in \{1,\dots , n\}\), let \(N_j\subset \{1,\dots , m\}\) be the minimal set of qubits such that \(\frac{{\text {tr}}_j}{2}(U\frac{{\text {tr}}_{N_j}}{2^{|N_j|}}(O)U^*)=U\frac{{\text {tr}}_{N_j}}{2^{|N_j|}}(O)U^*\) for any \(O\in M_2(\mathbb {C})^{\otimes n}\) and denote \(L:=\max _i\,|\{j:\,i\in N_j\}|\). Then, for any such matrix \(O\in M_2(\mathbb {C})^{\otimes n}\)

$$\begin{aligned} {\text {Inf}}^1(UOU^*)\le 2 L\,{\text {Inf}}^1(O)\,. \end{aligned}$$

Proof

For any \(O\in M_2(\mathbb {C})^{\otimes n}\),

$$\begin{aligned} {\text {Inf}}^1(UOU^*)&=\sum _{j=1}^n \Vert UOU^*-\frac{{\text {tr}}_j}{2}(UOU^*)\Vert _1\\&\le \sum _{j=1}^n \Vert UOU^*-U\frac{{\text {tr}}_{N_j}}{2^{|N_j|}}(O)U^*\Vert _1+\Vert U\frac{{\text {tr}}_{N_j}}{2^{|N_j|}}(O)U^*-\frac{{\text {tr}}_j}{2}(UOU^*)\Vert _1\\&\le 2\,\sum _{j=1}^n \Vert O-\frac{{\text {tr}}_{N_j}}{2^{|N_j|}}(O)\Vert _1\\&\le 2\sum _{j=1}^n\sum _{i\in N_j}\Vert d_i O\Vert _1\\&\le 2 L\,{\text {Inf}}^1(O) \end{aligned}$$

where in the second inequality above we used the definition of \(N_j\) and that the partial trace \({\text {tr}}_j\) is contractive in \(\Vert \cdot \Vert _1\) norm. The third inequality follows from simple triangle inequality and monotonicity of the \(L^1\)-norm under partial traces. \(\square \)

Finally, we find a bound on the variation of the \(L^1\)-influence through a circuit U in terms of the cost of a unitary U. We define the \(L^1\)-circuit sensitivity of a unitary \(U\in M_2(\mathbb {C})^{\otimes n}\) as

$$\begin{aligned} {\text {CiS}}^1(U):=\max _{\Vert O\Vert _1=1}\,\Big |{\text {Inf}}^1(UOU^*)-{\text {Inf}}^1(O)\Big |\,. \end{aligned}$$

Theorem 6.4

The \(L^1\)-circuit sensitivity of a unitary \(U\in M_2(\mathbb {C})^{\otimes n}\) is a lower bound on the circuit cost:

$$\begin{aligned} {\text {CiS}}^1(U)\le 8\,{\text {Cost}}(U)\,. \end{aligned}$$

Proof

Our proof follows similar steps to those leading to [BGJ+22, Theorem 12] (see also [Eis21, MAVAV16]): we first show that, for a unitary \(U_t=e^{-itH}\), where H acts non-trivially on a set S of k qubits, and O with \(\Vert O\Vert _1=1\),

$$\begin{aligned} |{\text {Inf}}^1(U_tOU_t^*)-{\text {Inf}}^1(O)|&=|\sum _{j=1}^n \Vert d_j(O(t))\Vert _1-\Vert d_j(O)\Vert _1|\nonumber \\&=|\sum _{j\in S}\,\Vert d_j(O(t))\Vert _1-\Vert d_j(O)\Vert _1|\nonumber \\&\le \sum _{j\in S}\,\Vert d_j(O(t))-d_j(O)\Vert _1\nonumber \\&\le \int _0^t\,\sum _{j\in S}\,\Vert d_j\,e^{-isH}[H,O]e^{isH}\Vert _1\,ds\nonumber \\&\le 2kt\, \Vert [H,O]\Vert _1\nonumber \\&\le 4kt \Vert H\Vert \Vert O\Vert _1\nonumber \\&= 4kt\,\Vert H\Vert \,, \end{aligned}$$
(6.1)

where we denoted \(O(t):=U_tOU_t^*\). Back to our original problem, we take a Trotter decomposition of U such that for arbitrary small \(\varepsilon >0\),

$$\begin{aligned} \Vert U-V_N\Vert \le \varepsilon \end{aligned}$$

where \(V_N\) is defined as follows:

$$\begin{aligned}&V_N:=\prod _{\eta =1}^N\,W_\eta \,,\\&W_\eta :={\text {exp}}\Big ( -\frac{i}{N}\sum _{j=1}^m\,r_j\big (\frac{\eta }{N}\big )\,h_j\Big )\,, \end{aligned}$$

so that

$$\begin{aligned}&W_\eta =\lim _{l\rightarrow \infty }W_\eta ^{(l)}\,,\\&W_\eta ^{(l)}:=\Big (W_{\eta ,1}^{\frac{1}{l}}\dots W_{\eta ,m}^{\frac{1}{l}}\Big )^l\,,\\&W_{\eta ,j}:={\text {exp}}\Big (-\frac{i}{N}\,r_j\big (\frac{\eta }{N}\big )\,h_j\Big )\,. \end{aligned}$$

Next, we define \(O_\eta =W_\eta O_{\eta -1}W_\eta ^*\) with \(O_0=O\). We have,

$$\begin{aligned} |{\text {Inf}}^1(O_\eta )-{\text {Inf}}^1(O_{\eta -1})|&=|{\text {Inf}}^1(W_\eta O_{\eta -1}W_\eta ^*)-{\text {Inf}}^1(O_{\eta -1})|\\&=\limsup _{l\rightarrow \infty }\, |{\text {Inf}}^1\big (W_\eta ^{(l)}O_{\eta -1}W_\eta ^{(l)*}\big )-{\text {Inf}}^1(O_{\eta -1})|\\&\le \limsup _{l\rightarrow \infty }\frac{l}{N}\,\sum _{j=1}^m\,\frac{8}{l}\,\Big |r_j\Big (\frac{\eta }{N}\Big )\Big |\\&=\frac{8}{N}\,\sum _{j=1}^m\,\Big |r_j\Big (\frac{\eta }{N}\Big )\Big |\,, \end{aligned}$$

where the inequality follows from (6.1) for \(k=2\) and \(t=\frac{1}{N}\). Summing over \(\eta \), we get

$$\begin{aligned} |{\text {Inf}}^1(UOU^*)-{\text {Inf}}^1(O)| \le \frac{8}{N}\,\sum _{\eta =1}^N\sum _{j=1}^m\,\Big |r_j\Big (\frac{\eta }{N}\Big )\Big |+ |{\text {Inf}}^1(UOU^*)- {\text {Inf}}^1(V_NOV_N^*)|. \end{aligned}$$

Since the circuit cost is expressed as \({\text {Cost}}(U)=\inf _{(r_j)_j} \lim _{N\rightarrow \infty }\frac{1}{N}\sum _{\eta =1}^N\sum _{j=1}^m\Big |r_j\Big (\frac{\eta }{N}\Big )\Big |\), and since the influence of \(UOU^*\) can be arbitrarily well approximated by that of \(V_NOV_N^*\) as \(N\rightarrow \infty \), the result follows. \(\square \)
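The origin of the constant 8 can be traced as follows. This is a sketch in which (6.1) is applied to each elementary factor \(W_{\eta ,j}^{1/l}=\exp \big (-\frac{i}{lN}\,r_j(\frac{\eta }{N})\,h_j\big )\), that is with \(H=r_j(\frac{\eta }{N})\,h_j\), \(k=2\), \(\Vert h_j\Vert =1\), and, as an assumption on how the l-th root enters, with time step \(\frac{1}{lN}\). Each such factor changes the \(L^1\)-influence by at most

$$\begin{aligned} 4\cdot 2\cdot \frac{1}{lN}\,\Big |r_j\Big (\frac{\eta }{N}\Big )\Big |\,\Vert h_j\Vert =\frac{8}{lN}\,\Big |r_j\Big (\frac{\eta }{N}\Big )\Big |, \end{aligned}$$

and since each of the m factors appears l times in \(W_\eta ^{(l)}\), the triangle inequality gives

$$\begin{aligned} l\,\sum _{j=1}^m\frac{8}{lN}\,\Big |r_j\Big (\frac{\eta }{N}\Big )\Big |=\frac{8}{N}\,\sum _{j=1}^m\,\Big |r_j\Big (\frac{\eta }{N}\Big )\Big |, \end{aligned}$$

uniformly in l. Conjugation by a unitary preserves \(\Vert \cdot \Vert _1\), so the normalization required in (6.1) holds at every step.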

Remark 6.5

Combined with our quantum Friedgut’s Junta theorem 3.11 the above results show that for any observable O with \({\text {Inf}}^1(O),{\text {Inf}}^2(O)=\mathcal {O}(1)\) and \(\Vert O\Vert _2=\mathcal {O}(1)\), and for any unitary U with \(L=\mathcal {O}(1)\), the output observable \(UOU^*\) can be well approximated by a k-junta with \(k=\mathcal {O}(1)\). Taking again the simple example constructed in Remark 6.2, we recover the simple fact that, for a 1-qubit Pauli matrix evolving according to a circuit of constant depth, the output observable will still be supported on a constant size region. While it would be interesting to find some non-trivial situations where our bounds still hold, we leave this question to future work.

6.2 Learning quantum Boolean functions

In this section, we use our quantum Friedgut’s Junta theorem 3.11 to provide an efficient algorithm for learning quantum Boolean functions. Our argument relies on the following quantum generalization of Goldreich–Levin theorem (see [MO10a, Theorem 7.6]):

Theorem 6.6

(quantum Goldreich–Levin). Given oracle access to a unitary operator U on n qubits and its adjoint \(U^*\), and given \(\delta ,\gamma >0\), there is a \({\text {poly}}\big (n,\frac{1}{\gamma }\big )\log \big (\frac{1}{\delta }\big )\)-time algorithm which outputs a list \(L=\{s_1,\dots , s_m\}\) such that with probability \(1-\delta \): (1) if \(|\widehat{U}_s|\ge \gamma \) then \(s\in L\); and (2) for all \(s\in L\), \(|\widehat{U}_s|\ge \gamma /2\).

Once the quantum Goldreich–Levin algorithm has been used to output a list of Fourier coefficients, the following lemma, which is also taken from [MO10a], can be used to compute them:

Lemma 6.7

[MO10a, Lemma 7.4] For any quantum Boolean function A, and any \(s\in \{0,1,2,3\}^n\) it is possible to estimate \(\widehat{A}_s\) to within \(\pm \eta \) with probability \(1-\delta \) using \(\mathcal {O}\big (\frac{1}{\eta ^2}\log \big (\frac{1}{\delta }\big )\big )\) queries.

Combining Theorem 6.6, Lemma 6.7 with our Theorem 3.11, we directly arrive at the following result:

Proposition 6.8

(Learning quantum Boolean functions). Let \(A\in M_2(\mathbb {C})^{\otimes n}\) be a quantum Boolean function. Given oracle access to A, with probability \(1-\delta \), we can learn A to precision \(\varepsilon \) in \(L^2\) using \({\text {poly}}(n,4^k,\log \big (\frac{1}{\delta }\big ))\) queries to A, where

$$\begin{aligned} k\le k(\varepsilon )\equiv \left\{ \begin{aligned}&{\text {Inf}}^1(A)^2e^{\frac{48{\text {Inf}}^2(A)}{\varepsilon ^2}\log \frac{2{\text {Inf}}^2(A)}{\varepsilon }}&\text { if }\,{\text {Inf}}^2(A)\ge 1\,; \\&\frac{{\text {Inf}}^1(A)^2}{{\text {Inf}}^2(A)}\,e^{\frac{48{\text {Inf}}^2(A)}{\varepsilon ^2}\log \frac{2\sqrt{{\text {Inf}}^2(A)}}{\varepsilon }}&\text { else} \end{aligned}\,\right. \end{aligned}$$

Proof

By Theorem 3.11, we have a quantum Boolean function \(B=2^{-|T|}{\text {tr}}_{T}(A)\in M_2(\mathbb {C})^{\otimes n}\) supported on a region \(T^c\) of k qubits such that \(\Vert A-B\Vert _2\le \varepsilon \), with \(k\le k(\varepsilon )\). Next, we use the quantum Goldreich–Levin algorithm of Theorem 6.6 in order to output a list \(L=\{s^{(1)},\dots , s^{(m)}\}\subseteq \{0,1,2,3\}^n\) of corresponding significant Fourier coefficients \(\widehat{A}_{s^{(j)}}\), \(1\le j\le m\). Denoting the operator \(A_L:=\sum _{s\in L} \widehat{A}_{s}\sigma _{s}\) as well as the set of strings \(s\in \{0,1,2,3\}^n\) with \(s_T=0\) as \(L'\), we get

$$\begin{aligned} \Vert A-A_L\Vert _2&= \Vert \sum _{s\in L^c}\widehat{A}_s \sigma _s\Vert _2\\&=\Big (\sum _{s\in L^c}\widehat{A}_s^2\Big )^{1/2}\\&=\Big ( \sum _{s\in (L\cup L')^c} \widehat{A}_s^2+\sum _{s\in L'\setminus L}\widehat{A}_s^2\Big )^{\frac{1}{2}}\\&\le \Big ( \sum _{s\in L'^c} \widehat{A}_s^2+\sum _{s\in L'\setminus L}\widehat{A}_s^2\Big )^{\frac{1}{2}}\\&\le \Big ( \varepsilon ^2+\sum _{s\in L'\setminus L}\widehat{A}_s^2\Big )^{\frac{1}{2}}\,. \end{aligned}$$

Moreover, by Theorem 6.6, we have that with probability \(1-\delta \), if \(s\notin L\), then \(|\widehat{A}_s|\le \gamma \). Therefore we have that with that same high probability

$$\begin{aligned} \Vert A-A_L\Vert _2\le \big (\varepsilon ^2+4^k\, \gamma ^2 \big )^{\frac{1}{2}}\,. \end{aligned}$$
(6.2)

It remains to evaluate the coefficients \(\widehat{A}_s\) for \(s\in L\). This can be done within precision \(\pm \eta \) with probability \((1-\delta )\) using \(\mathcal {O}\big (\frac{1}{\eta ^2}\log \big (\frac{1}{\delta }\big )\big )\) queries of A according to Lemma 6.7. Moreover, since A is a quantum Boolean function, there are at most \(\frac{4}{\gamma ^2}\) coefficients \(\widehat{A}_s\) such that \(|\widehat{A}_s|\ge \frac{\gamma }{2}\). Therefore, with high probability \(|L|\le \frac{4}{\gamma ^2}\). Choosing \(\gamma =\varepsilon 2^{-k}\) so that \(\Vert A-A_L\Vert _2= \mathcal {O}( \varepsilon )\), we need to evaluate \(4/\gamma ^2=\mathcal {O}(4^k)\) such coefficients. The result follows. \(\square \)
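The counting at the end of the proof can be spelled out. This is a sketch assuming Parseval's identity in the form \(\sum _s\widehat{A}_s^2=\Vert A\Vert _2^2=1\) for a quantum Boolean function A. Each s with \(|\widehat{A}_s|\ge \frac{\gamma }{2}\) contributes at least \(\frac{\gamma ^2}{4}\) to this sum, so

$$\begin{aligned} \Big |\Big \{s:\,|\widehat{A}_s|\ge \frac{\gamma }{2}\Big \}\Big |\cdot \frac{\gamma ^2}{4}\le \sum _s\widehat{A}_s^2=1. \end{aligned}$$

With the choice \(\gamma =\varepsilon 2^{-k}\), the bound (6.2) and the number of coefficients to be estimated become

$$\begin{aligned} \big (\varepsilon ^2+4^k\gamma ^2\big )^{\frac{1}{2}}=\sqrt{2}\,\varepsilon ,\qquad \frac{4}{\gamma ^2}=\frac{4^{k+1}}{\varepsilon ^2}, \end{aligned}$$

so the \(\mathcal {O}(4^k)\) count is understood for fixed \(\varepsilon \).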

6.3 Learning quantum dynamics

Proposition 6.8 extends the domain of applicability of Proposition 41 in [MO10b] where the authors provided an efficient algorithm to learn the evolution of initially local observables under the dynamics generated by a local Hamiltonian. While the proof of [MO10b, Proposition 41] requires the Lieb–Robinson bounds in order to control the sets of sites of large influence in terms of the support of the initial observable and the light-cone of H, our argument has the advantage of not putting any geometric locality assumption on the quantum Boolean function A.

To further illustrate our result, we consider the following generalization of the setup of [MO10b, Proposition 41]: let \(\Lambda \) be a finite set of size \(|\Lambda |=n\) endowed with a metric \(d(\cdot ,\cdot )\). We suppose that there is a monotone increasing function g on \([0,\infty )\) and constants \(C,D>0\) such that

$$\begin{aligned} \big |\big \{y\in \Lambda \big |\,d(x,y)\le r\big \}\big |\le g(r)\le C(1+r)^D, \quad r\ge 0, x\in \Lambda \, . \end{aligned}$$
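For instance (an illustration that is not part of the cited setup), if \(\Lambda \subset \mathbb {Z}^D\) is equipped with the metric \(d(x,y)=\max _{1\le k\le D}|x_k-y_k|\), then a ball of radius r contains at most \((2r+1)^D\) points, and one may take

$$\begin{aligned} g(r)=(2r+1)^D\le 2^D(1+r)^D,\qquad \text {that is, } C=2^D. \end{aligned}$$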

The constant D typically denotes the spatial dimension in the case of a regular lattice. We consider a quantum spin system on the point set \(\Lambda \) by assigning the Hilbert space \(\mathcal {H}_x\equiv \mathbb {C}^2\) to each site \(x\in \Lambda \). For any subset \(T \subseteq \Lambda \), the configuration space of spin states on T is given by the tensor product \(\mathcal {H}_T=\bigotimes _{x\in T}\mathcal {H}_x\), and the algebra \(\mathcal {A}_T:=B(\mathcal {H}_T)\) of observables on T acts on the Hilbert space \(\mathcal {H}_T\). We consider a Hamiltonian \(H_\Lambda =\sum _{X\subseteq \Lambda } h_X\), where \(h_X\in B(\mathcal {H}_X)\) is a local Hamiltonian, i.e. a self-adjoint operator on \(\mathcal {H}_X\), for each \(X\subset \Lambda \). In what follows, we denote the diameter of a set \(Z\subseteq \Lambda \) by \({\text {diam}}(Z):=\max \{d(x,y)|\,x,y\in Z\}\). We further assume the following requirements [MKN17, Assumption A]:

  (i) There is a decreasing function f(R) on \([0,\infty )\), such that

    $$\begin{aligned} \max _{x\in \Lambda }\,\sum _{\begin{array}{c} Z\ni x\\ {\text {diam}}(Z)\ge R \end{array}}\Vert h_Z\Vert \le f(R)\,,\quad R\ge 0; \end{aligned}$$

  (ii) The following constant is independent of the system size n:

    $$\begin{aligned} \mathcal {C}_0:=\max _{x\in \Lambda }\,\sum _{y\in \Lambda }\sum _{Z\ni x,y}\Vert h_Z\Vert <\infty \,. \end{aligned}$$

Strictly speaking, condition (ii) only makes sense when considering a family of Hamiltonians \(H_\Lambda \) defined on an increasing family of sets \(\Lambda \) all included in a countable set \(\Sigma \). We will however favour simplicity over rigour here.

By [MKN17, Theorem 2.1], for any two one-local Pauli operators \(\sigma _{s_i},\sigma _{s_j}\) with \(j\ne i\), and all \(R\ge 1\) we have that, given \(d_{ij}\equiv d(i,j)\)

$$\begin{aligned}&\big \Vert \big [e^{itH_\Lambda }\sigma _{s_i}e^{-itH_\Lambda },\,\sigma _{s_j}\big ] \big \Vert \\&\quad \le 2 e^{vt-d_{ij}/R}+4t\,g(d_{ij})\,f(R)+2\mathcal {C}_2\,t\,R\,\max \{d_{ij},R\}^D\,f(R)\,e^{vt-d_{ij}/R} \end{aligned}$$

for any \(t\ge 0\), where v and \(\mathcal {C}_2\) are positive constants independent of \(\Lambda ,t,R,i\) and j.

Proposition 6.9

With the above assumptions, we further assume that there is \(R\ge 1\) such that for all \(i\in \Lambda \), the constants

$$\begin{aligned} C_i:= \sum _{j\in \Lambda }\,\Big ( e^{vt-d_{ij}/R}+ t\,g(d_{ij})\,f(R)+t\,R\,\max \{d_{ij},R\}^D\,f(R)\,e^{vt-d_{ij}/R}\Big ) \end{aligned}$$

can be bounded by constants independent of the size n of the system. Then, with probability \(1-\delta \), we can learn the quantum Boolean functions \(e^{itH_\Lambda }\sigma _{s_i}e^{-itH_\Lambda }\) to precision \(\varepsilon \) in \(L^2\) using \({\text {poly}}(n,\exp (\exp (\varepsilon ^{-2}|\log (\varepsilon )|)),\log \big (\frac{1}{\delta }\big ))\) queries to \(e^{-itH_\Lambda }\) and \(e^{itH_\Lambda }\).

Proof

In view of the dependence of k on the influences in Proposition 6.8, it is enough to control \({\text {Inf}}^1(e^{itH_\Lambda }\sigma _{s_i}e^{-itH_\Lambda })\) and \({\text {Inf}}^2(e^{itH_\Lambda }\sigma _{s_i}e^{-itH_\Lambda })\) independently of the size of the system. For any \(A\in M_2(\mathbb {C})^{\otimes n}\), we clearly have \({\text {Inf}}^1(A)\le \sum _{j\in \Lambda } \Vert d_j A\Vert \) and \({\text {Inf}}^2(A)\le \sum _{j\in \Lambda } \Vert d_j A\Vert ^2\). Moreover, by the following well-known expression for the partial trace

$$\begin{aligned} \frac{1}{2}{\text {tr}}_j(A)\otimes \textbf{1}_j=\frac{1}{4}\,\sum _{s_j}\,\sigma _{s_j}A\sigma _{s_j}\,, \end{aligned}$$

where \(\sigma _{s_j}\) are Pauli matrices on site j, we have that

$$\begin{aligned} \Vert d_j A\Vert = \Big \Vert A-\frac{1}{4}\sum _{s_j}\sigma _{s_j}A\sigma _{s_j}\Big \Vert&=\frac{1}{4}\,\Big \Vert 3A-\sum _{s_j:\sigma _{s_j}\ne \textbf{1}}\sigma _{s_j}A\sigma _{s_j}\Big \Vert \le \frac{1}{4}\,\sum _{s_j:\sigma _{s_j}\ne \textbf{1}}\,\Vert [A,\,\sigma _{s_j}]\Vert \,. \end{aligned}$$

Therefore,

$$\begin{aligned}&\sqrt{{\text {Inf}}^2\big (e^{itH_\Lambda }\sigma _{s_i}e^{-itH_\Lambda }\big )},\,{\text {Inf}}^1\big (e^{itH_\Lambda }\sigma _{s_i}e^{-itH_\Lambda }\big )\\&\quad \le \frac{3}{2} \sum _{j\in \Lambda }\,\Big ( e^{vt-d_{ij}/R}+2t\,g(d_{ij})\,f(R)+\mathcal {C}_2\,t\,R\,\max \{d_{ij},R\}^D\,f(R)\,e^{vt-d_{ij}/R}\Big )\,. \end{aligned}$$

\(\square \)
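The two ingredients of the proof above, namely the Pauli-twirl expression for the partial trace and the commutator bound on \(\Vert d_j A\Vert \), can be checked numerically on a few qubits. The following is only a minimal sketch, assuming NumPy is available; the helper names (`pauli_on_site`, `twirl`, `half_partial_trace`, `commutator_bound`) are ours and purely illustrative, and the random matrix A is a stand-in for a general element of \(M_2(\mathbb {C})^{\otimes n}\).

```python
import numpy as np

# Single-qubit Pauli matrices, with the identity at index 0.
PAULIS = [
    np.eye(2, dtype=complex),
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]


def pauli_on_site(s, j, n):
    """Pauli matrix number s acting on site j of an n-qubit register."""
    op = np.eye(1, dtype=complex)
    for k in range(n):
        op = np.kron(op, PAULIS[s] if k == j else PAULIS[0])
    return op


def twirl(A, j, n):
    """Right-hand side of the identity: (1/4) sum_s sigma_s A sigma_s on site j."""
    terms = (pauli_on_site(s, j, n) @ A @ pauli_on_site(s, j, n) for s in range(4))
    return sum(terms) / 4


def half_partial_trace(A, j, n):
    """Left-hand side: (1/2) tr_j(A) tensored with the identity on site j."""
    T = A.reshape([2] * (2 * n))                      # row axes 0..n-1, column axes n..2n-1
    reduced = np.trace(T, axis1=j, axis2=n + j) / 2   # contract the two axes of site j
    full = np.multiply.outer(reduced, np.eye(2))      # append identity axes for site j
    full = np.moveaxis(full, [-2, -1], [j, n + j])    # move them back to site j
    return full.reshape(2 ** n, 2 ** n)


def commutator_bound(A, j, n):
    """Return (||d_j A||, (1/4) sum_{s != 0} ||[A, sigma_s]||) in operator norm."""
    lhs = np.linalg.norm(A - twirl(A, j, n), 2)
    rhs = sum(
        np.linalg.norm(A @ pauli_on_site(s, j, n) - pauli_on_site(s, j, n) @ A, 2)
        for s in (1, 2, 3)
    ) / 4
    return lhs, rhs


# Illustration on a random 3-qubit matrix (an arbitrary test input).
rng = np.random.default_rng(0)
n = 3
A = rng.normal(size=(2 ** n, 2 ** n)) + 1j * rng.normal(size=(2 ** n, 2 ** n))
for j in range(n):
    identity_holds = np.allclose(twirl(A, j, n), half_partial_trace(A, j, n))
    print(j, identity_holds, commutator_bound(A, j, n))
```

Such a check does not replace the argument; it only confirms on small instances that \(d_jA=A-\frac{1}{2}{\text {tr}}_j(A)\otimes \textbf{1}_j\) is dominated by the three commutators with the non-trivial Pauli matrices on site j.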

Remark 6.10

Our result in Proposition 6.8 has the advantage that it does not assume in advance that A is (close to) a k-junta. This comes at the price that the query complexity depends doubly exponentially on the approximation parameter \(\varepsilon \). For the same reason, our dynamics learning method in Proposition 6.9 allows us to extend the class of Hamiltonians considered in [MO10a] to Hamiltonians satisfying a weaker power-law decay, at the cost of a much worse dependence on \(\varepsilon \). Such a dependence on \(\varepsilon \) is also not new in the classical setting [BT96, OS07].

Remark 6.11

In a recent article [CNY23], the authors provide an algorithm for learning any unitary k-junta U with precision \(\varepsilon \) and high probability which uses \(\mathcal {O}\big (\frac{k}{\varepsilon }+\frac{4^k}{\varepsilon ^2}\big )\) queries to U (see Theorem 29), extending a previous quantum algorithm for learning classical k-juntas reported in [AcS07]. While the dependence on \(\varepsilon \) is much tighter than ours, the two results are incomparable, since we replaced the requirement that U is a k-junta by the weaker condition that it has influences \({\text {Inf}}^1 U, {\text {Inf}}^2 U=\mathcal {O}(1)\).

6.4 Quantum isoperimetric type inequalities

Closely related to the concentration of measure phenomenon and functional inequalities, isoperimetric inequalities provide powerful tools in the analysis of extremal sets and surface measures. Given a metric space \((X,d)\) equipped with a Borel measure \(\mu \), the boundary measure of a Borel set A in X with respect to \(\mu \) is defined as [Led00, Led01, BGL14]

$$\begin{aligned} \mu ^+(A)=\lim _{r\rightarrow 0} \frac{1}{r}\,\mu (A_r\backslash A) \end{aligned}$$

where we recall that \(A_r:=\{x\in X|\,d(x,A)<r\}\) is the (open) r-neighbourhood of A. The isoperimetric profile of \(\mu \) corresponds to the largest function \(I_\mu \) on \([0,\mu (X)]\) such that, for any Borel set \(A\subset X\) with \(\mu (A)<\infty \),

$$\begin{aligned} \mu ^+(A)\ge I_\mu (\mu (A)) \,. \end{aligned}$$
(6.3)

In the case of the canonical Gaussian measure \(\gamma \) on the Borel sets of \(\mathbb {R}^k\) with density \((2\pi )^{-k/2}e^{-|x|^2/2}\) with respect to the Lebesgue measure, and with the usual Euclidean metric induced by the norm |x|, we have [Led01, Theorem 2.5]:

$$\begin{aligned} I_\gamma = \Phi '\circ \Phi ^{-1}\, \end{aligned}$$

where \(\Phi (t)=(2\pi )^{-1/2}\int _{-\infty }^te^{-x^2/2}dx\) is the distribution function of the canonical Gaussian measure in dimension one. Moreover, equality holds in (6.3) if and only if A is a half-space in \(\mathbb {R}^k\). Furthermore, as \(a\rightarrow 0\), we have

$$\begin{aligned} \Phi '\circ \Phi ^{-1}(a)\sim a\Big (2\log \frac{1}{a}\Big )^{\frac{1}{2}}\,. \end{aligned}$$
(6.4)
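The asymptotic equivalence (6.4) can be illustrated numerically. The sketch below is only an illustration under our own naming (`gaussian_profile`, `asymptotic_profile` and `profile_ratio` are not notation from the text); it relies on `statistics.NormalDist` from the Python standard library for \(\Phi '\) and \(\Phi ^{-1}\).

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # canonical Gaussian in dimension one


def gaussian_profile(a):
    """I_gamma(a) = Phi'(Phi^{-1}(a)) for 0 < a < 1."""
    return STD_NORMAL.pdf(STD_NORMAL.inv_cdf(a))


def asymptotic_profile(a):
    """Right-hand side of (6.4): a * (2 log(1/a))^{1/2}."""
    return a * math.sqrt(2.0 * math.log(1.0 / a))


def profile_ratio(a):
    """Quotient of the two sides of (6.4); it should approach 1 as a -> 0."""
    return gaussian_profile(a) / asymptotic_profile(a)


for a in (1e-3, 1e-6, 1e-10):
    print(f"a = {a:.0e}: ratio = {profile_ratio(a):.4f}")
```

For these sample values the quotient stays below 1 and increases as a decreases, which is consistent with (6.4); the convergence is slow because the correction terms are only logarithmic in \(1/a\).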

Similar isoperimetric inequalities were also derived for hypercontractive, log-concave measures [BL96] (see also [Mil09, Mil10]). When \(k=1\), the boundary measure of a Borel set A can be expressed in terms of the geometric influence \(\Vert f_A'\Vert _1\) of a smooth approximation \(f_A\) of the characteristic function of A. In other words, \(\mu ^+(A)\approx {\text {Inf}}^1(f_A)\).

This observation allows us to generalize the notion of isoperimetric inequality in the context of smooth Riemannian manifolds to discrete settings. In the context of the classical Boolean hypercube \(\Omega _n\), the edge isoperimetric inequality states that for any m, among the m-element subsets of the discrete cube, the minimal edge boundary is attained by the set of m largest elements in the lexicographic order [Ber67, Har64, Har76, Lin64]. In particular, for any set \(A\subset \Omega _n\) of vertices

$$\begin{aligned} \mu _n(\partial A)\equiv {\text {Inf}}(f_A) \ge 2\mu _n(A)\log _2\left( \frac{1}{\mu _n(A)}\right) \,, \end{aligned}$$
(6.5)

where we recall that \(\mu _n\) is the uniform probability measure on \(\Omega _n\), and \(f_A\) corresponds to the characteristic function of set A. Here, \(\partial A\) simply corresponds to the set of vertices in the complement of A that are adjacent to A. This inequality is moreover tight when \(|A|=2^d\) for some \(d\in \mathbb {N}\) (take for instance A to be the vertices of a d-dimensional subcube). We notice the similarity with (6.4) up to the change of power in the logarithmic factor.
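Inequality (6.5) can be verified exhaustively on very small cubes. The following sketch makes two assumptions that should be kept in mind: we take \(f_A\) to be the \(\pm 1\)-valued indicator of A, so that with the definition of influence from the introduction \({\text {Inf}}(f_A)\) equals the number of edges between A and its complement divided by \(2^{n-1}\), and we encode vertices and subsets as bitmasks. All function names are hypothetical.

```python
import math


def edge_boundary_size(subset_mask, n):
    """Count edges of the n-cube with exactly one endpoint in the subset A.

    Vertices are the integers 0..2^n - 1 (bit j encodes the j-th coordinate);
    bit x of subset_mask is set when vertex x belongs to A.
    """
    count = 0
    for x in range(1 << n):
        if (subset_mask >> x) & 1:
            for j in range(n):
                y = x ^ (1 << j)  # flip the j-th coordinate
                if not (subset_mask >> y) & 1:
                    count += 1
    return count


def total_influence(subset_mask, n):
    """Total influence of the +/-1-valued indicator of A (an assumption on f_A)."""
    return edge_boundary_size(subset_mask, n) / 2 ** (n - 1)


def isoperimetric_rhs(subset_mask, n):
    """Right-hand side of (6.5): 2 mu(A) log2(1 / mu(A)), for nonempty A."""
    mu = bin(subset_mask).count("1") / 2 ** n
    return 2.0 * mu * math.log2(1.0 / mu)


def min_slack(n):
    """Smallest value of Inf(f_A) - 2 mu(A) log2(1/mu(A)) over nonempty A."""
    return min(
        total_influence(mask, n) - isoperimetric_rhs(mask, n)
        for mask in range(1, 1 << (1 << n))
    )


print(min_slack(3))  # exhaustive over all nonempty subsets of the 3-cube

# A 2-dimensional subcube of the 5-cube: vertices 0..3, i.e. mask 0b1111.
print(total_influence(0b1111, 5), isoperimetric_rhs(0b1111, 5))
```

The exhaustive search is only feasible for very small n, since the number of subsets is \(2^{2^n}\); the subcube example illustrates the equality case mentioned above.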

Similarly, consider a finite graph \(G=(V,E)\) with set of vertices V and set of edges E with bounded degree d (i.e. each vertex has at most d incident edges). The graph G is said to satisfy the linear isoperimetric inequality if

$$\begin{aligned} {\text {Card}}(\partial A)\ge h\,{\text {Card}}(A)\,, \end{aligned}$$

for some \(h>0\) and all subsets A of V such that \({\text {Card}}(A)\le \frac{1}{2}{\text {Card}}(V)\). The so-called Cheeger constant h of the graph can be related to the spectral gap \(\lambda \) of the graph Laplacian via Cheeger’s and Buser’s inequalities. Here, \({\text {Card}}(\partial A)\) plays the role of \(\mu ^+(A)\) and can be once again related to a notion of influence. The linear isoperimetric inequality can be understood as a weaker form of isoperimetry than the one derived for log-concave, hypercontractive measures, and hence only implies exponential concentration for the normalized counting measure on G. Moreover, one should not expect to recover the stronger Gaussian type isoperimetry in this setting, since the hypercontractivity constant for graph Laplacians is known to scale with the size of the graph [BT06].

Linear isoperimetric inequalities were also considered in the more general context of Markov chains over finite sample spaces. For instance, in the case of a continuous time Markov chain with transition rates \(Q(x,y)\) and unique reversible probability measure \(\pi \) with non-negative entropic Ricci curvature, [EF18] established that for any set A,

$$\begin{aligned} \pi ^+(\partial A)\ge \frac{1}{3}\,\sqrt{Q_*\lambda }\,\,\pi (A)(1-\pi (A))\, \end{aligned}$$

where \(\lambda \) is the spectral gap of Q, \(Q_*=\min \{Q(x,y):Q(x,y)>0\}\) and \(\pi ^+(\partial A)=\sum _{x\in A,y\in A^c}Q(x,y)\pi (x)\) denotes the perimeter measure of A. We also note that extensions of such inequalities in the quantum setting were obtained in [TKR+10].

Interestingly, such inequalities are well-known to be equivalent to an \(L^1\)-Poincaré inequality. It is then natural to ask whether one could recover the type of isoperimetry found for the Gaussian measure and uniform measure on the hypercube in other discrete and quantum settings by further assuming hypercontractivity of the (quantum) Markov chain. This is indeed the case, as we prove by a direct appeal to Talagrand’s inequality:

Theorem 6.12

(Qubit isoperimetric type inequality). For any projection \(P_A\) onto a subspace \(A\subset (\mathbb {C}^2)^{\otimes n}\),

$$\begin{aligned} \max _{1\le j\le n}\textrm{Inf}^{1}_{j}(P_A)\ge \frac{C}{n}\,\tau (A)(1-\tau (A)) \log \left( \frac{n}{\tau (A)(1-\tau (A)) }\right) ^{\frac{1}{2}} \end{aligned}$$
(6.6)

for some universal constant C, where \(\tau (A){:=}2^{-n}{\text {tr}}(P_A)\).

Proof

As mentioned in [CEL12], this is a simple corollary of Talagrand’s inequality (Theorem 3.2), after assuming that

$$\begin{aligned} {\text{ Inf }}_j^1(P_A)\le \Big (\frac{\tau (A)(1-\tau (A))}{n}\Big )^{\frac{1}{2}} \end{aligned}$$

for every \(j\in \{1,\dots ,n\}\), since otherwise the result directly holds. \(\square \)

Remark 6.13

Similar to the quantum KKL conjecture of Montanaro and Osborne, it is reasonable to conjecture the following \(L^2\) variant of (6.6)

$$\begin{aligned} \max _{1\le j\le n}\text {Inf}_j^2(P_A)\ge \frac{C}{n}\,\tau (A)(1-\tau (A)) \log \left( \frac{n}{\tau (A)(1-\tau (A)) }\right) \,. \end{aligned}$$
(6.7)

We end this section by noting the following \(L^1\)-Poincaré inequality, which is stronger than Theorem 3.1. See [ILvHV18] for a discussion of the classical Boolean cube.

Theorem 6.14

For all \(A\in M_2(\mathbb {C})^{\otimes n}\), one has

$$\begin{aligned} \Vert A-2^{-n}{\text {tr}}(A)\Vert _1\le \pi \left\| \left( \sum _{j=1}^{n}d_j(A)^*d_j(A)\right) ^{1/2}\right\| _1\le \sqrt{2}\pi \Vert \Gamma (A)^{1/2}\Vert _1\, . \end{aligned}$$
(6.8)

Remark 6.15

Using the inequality

$$\begin{aligned} \Vert X+Y\Vert _p^p\le \Vert X\Vert _p^p+\Vert Y\Vert _p^p,\qquad 0<p<1 \end{aligned}$$

for \(p=1/2\), we get

$$\begin{aligned} \left\| \left( \sum _{j=1}^{n}d_j(A)^*d_j(A)\right) ^{1/2}\right\| _1 \le \sum _{j=1}^{n}\left\| \left( d_j(A)^*d_j(A)\right) ^{1/2}\right\| _{1} =\sum _{j=1}^{n}\left\| d_j(A)\right\| _{1}\, . \end{aligned}$$

So (6.8) is stronger than (3.1) up to the multiplicative constant \(\pi \).
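The comparison made in Remark 6.15 can be tested numerically. The sketch below is a sanity check under stated assumptions only: norms are taken with respect to the normalized trace, random matrices \(X_j\) play the role of the \(d_j(A)\), NumPy is assumed to be available, and the helper names are ours.

```python
import numpy as np


def normalized_trace_norm(M):
    """Schatten 1-norm with respect to the normalized trace."""
    return np.linalg.svd(M, compute_uv=False).sum() / M.shape[0]


def sqrt_psd(M):
    """Square root of a positive semidefinite Hermitian matrix."""
    w, v = np.linalg.eigh(M)
    w = np.clip(w, 0.0, None)  # remove tiny negative round-off
    return (v * np.sqrt(w)) @ v.conj().T


def square_function_norm(Xs):
    """|| (sum_j X_j^* X_j)^{1/2} ||_1, the middle quantity in (6.8)."""
    S = sum(X.conj().T @ X for X in Xs)
    return normalized_trace_norm(sqrt_psd(S))


def sum_of_norms(Xs):
    """sum_j || X_j ||_1, the quantity it is compared with in Remark 6.15."""
    return sum(normalized_trace_norm(X) for X in Xs)


# Random stand-ins for d_1(A), ..., d_n(A) on 3 qubits.
rng = np.random.default_rng(0)
Xs = [rng.normal(size=(8, 8)) + 1j * rng.normal(size=(8, 8)) for _ in range(3)]
print(square_function_norm(Xs), sum_of_norms(Xs))
```

On such samples the first printed value should not exceed the second, in line with the displayed inequality.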

Proof of Theorem 6.14

Recall that we have proved (3.7)

$$\begin{aligned} \sum _{j=1}^{n}d_j(A)^*d_j(A)\le 2\Gamma (A) \, . \end{aligned}$$

So we have the second inequality of (6.8) by operator monotonicity of \(x\mapsto x^{1/2}\). Now let us prove the first inequality of (6.8). We denote by \((A,B)\mapsto \langle A,B\rangle :=2^{-n}{\text {tr}}(A^* B)\) the normalized Hilbert–Schmidt inner product. By semigroup interpolation (2.4) and duality,

$$\begin{aligned} \Vert A-2^{-n}{\text {tr}}(A)\Vert _1&=\sup _{\Vert B\Vert \le 1}|\langle A-2^{-n}{\text {tr}}(A),B\rangle | \\&\overset{(2.4)}{=}\sup _{\Vert B\Vert \le 1}\left| \int _0^\infty \langle {\mathcal {L}} P_t(A),B\rangle dt\right| \\&\le \sup _{\Vert B\Vert \le 1}\int _0^\infty \left| \sum _{j=1}^{n}\langle d_j A, d_j P_t B\rangle \right| \,dt\,. \end{aligned}$$

Now recall that we have the following inequality:

$$\begin{aligned} \left| \sum _{j=1}^{n}\langle X_j, Y_j\rangle \right| \le \left\| \left( \sum _{j=1}^{n} X_j^*X_j\right) ^{1/2}\right\| _1\cdot \left\| \sum _{j=1}^{n}Y^*_j Y_j\right\| ^{1/2} \,. \end{aligned}$$
(6.9)

To prove it, form the operators \(X:=\sum _{j=1}^{n}X_j\otimes \vert j\rangle \langle 1\vert \) and \(Y:=\sum _{j=1}^{n}Y_j\otimes \vert j\rangle \langle 1\vert \). Then Hölder’s inequality gives

$$\begin{aligned} \Vert X^*Y\Vert _1\le \Vert X\Vert _1\cdot \Vert Y\Vert =\Vert \left( X^*X\right) ^{1/2}\Vert _1\cdot \Vert Y^*Y\Vert ^{1/2}\, , \end{aligned}$$

which, together with

$$\begin{aligned} \left| \sum _{j=1}^{n}\langle X_j, Y_j\rangle \right| \le \left\| \sum _{j=1}^{n}X_j^*Y_j\right\| _1=\Vert X^*Y\Vert _1\, , \end{aligned}$$

yields (6.9). Now apply (6.9) to \((X_j,Y_j)=(d_j A, d_j P_t B)\) to get

$$\begin{aligned} \left| \sum _{j=1}^{n}\langle d_j A, d_j P_t B\rangle \right| \le \left\| \left( \sum _{j=1}^{n}d_j(A)^*d_j(A)\right) ^{1/2}\right\| _1\cdot \left\| \sum _{j=1}^{n}d_j (P_t B)^*d_j (P_t B) \right\| ^{1/2}\, . \end{aligned}$$

To conclude, we use Lemma 3.4 together with

$$\begin{aligned} \int _0^\infty \frac{dt}{\sqrt{e^t-1}} =\pi \,. \end{aligned}$$

\(\square \)
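Two steps of the proof above lend themselves to a quick numerical check: inequality (6.9) and the value of the final integral. The sketch below is only illustrative. It assumes NumPy, uses the normalized trace for the inner product and the 1-norm, tests (6.9) on random matrices, and evaluates the integral by a midpoint rule after the substitution \(t=s^2\) (which removes the singularity at \(t=0\)), truncated at \(s=8\). Function names are hypothetical.

```python
import math
import numpy as np


def normalized_trace_norm(M):
    """Schatten 1-norm with respect to the normalized trace."""
    return np.linalg.svd(M, compute_uv=False).sum() / M.shape[0]


def sqrt_psd(M):
    """Square root of a positive semidefinite Hermitian matrix."""
    w, v = np.linalg.eigh(M)
    return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.conj().T


def both_sides_of_6_9(Xs, Ys):
    """Return (left-hand side, right-hand side) of inequality (6.9)."""
    dim = Xs[0].shape[0]
    inner = sum(np.trace(X.conj().T @ Y) for X, Y in zip(Xs, Ys)) / dim
    column_norm = normalized_trace_norm(sqrt_psd(sum(X.conj().T @ X for X in Xs)))
    operator_norm = np.linalg.norm(sum(Y.conj().T @ Y for Y in Ys), 2) ** 0.5
    return abs(inner), column_norm * operator_norm


def integral_of_inverse_sqrt(upper=8.0, steps=20000):
    """Midpoint rule for int_0^inf dt / sqrt(e^t - 1) with t = s^2, s in [0, upper]."""
    h = upper / steps
    total = 0.0
    for k in range(steps):
        s = (k + 0.5) * h
        total += 2.0 * s / math.sqrt(math.expm1(s * s))
    return total * h


rng = np.random.default_rng(0)
Xs = [rng.normal(size=(8, 8)) + 1j * rng.normal(size=(8, 8)) for _ in range(3)]
Ys = [rng.normal(size=(8, 8)) + 1j * rng.normal(size=(8, 8)) for _ in range(3)]
print(both_sides_of_6_9(Xs, Ys))
print(integral_of_inverse_sqrt(), math.pi)
```

The exact value of the integral follows from the substitution \(u=\sqrt{e^t-1}\), which turns it into \(2\int _0^\infty \frac{du}{1+u^2}=\pi \).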

7 Discussions

We end this paper with the following discussions.

7.1 Equivalence between log-Sobolev and Talagrand’s inequalities

In Theorem 4.3, we derived a general noncommutative extension of Talagrand’s inequality. Our proof requires the joint use of the hypercontractivity inequality (H4) with the intertwining relation (H5). It is hence legitimate to ask whether, in return, such Talagrand-type inequalities imply hypercontractivity. This question was answered in the positive in the classical, continuous setting in [BH99, Proposition 1], and later on for discrete spaces in [Vö16]. It would be interesting to consider the similar problem in the quantum setting, which we leave to future work.

7.2 Learning low-degree quantum Boolean functions

A notion of complexity alternative to the support condition for k-juntas is that of the degree: a bounded function \(f:\Omega _n\rightarrow [-1,1]\) is said to have degree at most \(d\in \{1,\dots , n\}\) if for any string \(s\in \{-1,1\}^n\) with Hamming weight \(|s|>d\), the Fourier coefficient \(\widehat{f}(s)=0\). In particular, Boolean functions of degree at most d are \(d2^{d-1}\) juntas [NS94]. As a main tool for the result, the authors derived a simple lower bound on the degree of the function in terms of its total influence. This observation can be used in conjunction with the Goldreich–Levin algorithm in order to devise a learning algorithm which makes \({\text {poly}}(n)\) random queries to f. More efficient algorithms were proposed in the past decades [LMN93, IRR+21, Man94]. However, all these algorithms have a query complexity scaling polynomially with n. In the recent article [EI22], the authors show that any low degree Boolean function can be approximated to \(\varepsilon \) precision in \(L^2\) with probability \(1-\delta \) from \(\mathcal {O}\big ({\text {poly}}\big (\frac{1}{\varepsilon },d\big )\log \big (\frac{n}{\delta }\big )\big )\) random queries to the function. While this result is incomparable to the ones we report in Sect. 6.2, it would be interesting to find a quantum extension of it. The result of [EI22] uses the so-called Bohnenblust–Hille inequalities. The study of Bohnenblust–Hille inequalities has a long history and these inequalities have found many applications in various problems. A Boolean analogue is known [DMoP19] and has led to interesting applications to learning theory [EI22]. Here we formulate and conjecture a quantum analogue of the Bohnenblust–Hille inequality and explain why it is useful for learning problems in the quantum setting.

Conjecture 7.1

Fix \(d\ge 1\). Then there exists \(C_d>0\) depending only on d such that for all \(n\ge 1\) and all \(A\in M_2(\mathbb {C})^{\otimes n}\) of degree at most d, i.e.

$$\begin{aligned} A=\sum _{s\in \{0,1,2,3\}^n:|s|\le d}\widehat{A}_s\sigma _s, \end{aligned}$$

we have

$$\begin{aligned} \left( \sum _{s\in \{0,1,2,3\}^n:|s|\le d}|\widehat{A}_s|^{\frac{2d}{d+1}}\right) ^{\frac{d+1}{2d}}\le C_d \Vert A\Vert \, , \end{aligned}$$
(7.1)

where the degree |s| of a string s is defined as the number of components that are different from 0.

If Conjecture 7.1 holds, we expect that it can be used in a similar fashion as in [EI22] in order to devise a highly efficient algorithm for learning quantum Boolean functions of small degree in terms of query complexity.
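To make the quantities in (7.1) concrete, the following sketch computes the Pauli coefficients \(\widehat{A}_s=2^{-n}{\text {tr}}(\sigma _s A)\) of a small operator and the quotient of the two sides of (7.1) without the constant \(C_d\). It is a brute-force illustration only (it enumerates all \(4^n\) strings), assumes NumPy, and uses function names of our own choosing; it says nothing about the optimal value of \(C_d\).

```python
import itertools
import numpy as np

# Single-qubit Pauli matrices, indexed by 0, 1, 2, 3 (identity at index 0).
PAULIS = [
    np.eye(2, dtype=complex),
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]


def pauli_string(s):
    """Tensor product sigma_s for a string s in {0,1,2,3}^n."""
    op = np.eye(1, dtype=complex)
    for idx in s:
        op = np.kron(op, PAULIS[idx])
    return op


def weight(s):
    """Degree |s|: number of components of s different from 0."""
    return sum(1 for idx in s if idx != 0)


def pauli_coefficients(A, n):
    """Coefficients 2^{-n} tr(sigma_s A), indexed by the strings s."""
    return {
        s: np.trace(pauli_string(s) @ A) / 2 ** n
        for s in itertools.product(range(4), repeat=n)
    }


def bohnenblust_hille_ratio(A, n, d):
    """Left-hand side of (7.1), restricted to |s| <= d, divided by ||A||."""
    p = 2 * d / (d + 1)
    coeffs = pauli_coefficients(A, n)
    lhs = sum(abs(c) ** p for s, c in coeffs.items() if weight(s) <= d) ** (1 / p)
    return lhs / np.linalg.norm(A, 2)


# Example: a degree-2 operator on 3 qubits built from two Pauli strings.
A = pauli_string((1, 1, 0)) + pauli_string((0, 3, 3))
print(bohnenblust_hille_ratio(A, n=3, d=2))
```

In this example the two Pauli strings anticommute, so \(A^2=2\cdot \textbf{1}\) and \(\Vert A\Vert =\sqrt{2}\), while the left-hand side of (7.1) equals \(2^{3/4}\); the quotient is therefore \(2^{1/4}\).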

In fact, this conjecture was resolved after an earlier version of this paper was posted. It was first resolved by Huang, Chen and Preskill [HCP23]. Later on, another proof was found by Volberg and Zhang [VZ23].