Noncommutative Bohnenblust–Hille inequalities

Bohnenblust–Hille inequalities for Boolean cubes have been proven with dimension-free constants that grow subexponentially in the degree (Defant et al. in Math Ann 374(1):653–680, 2019). Such inequalities have found important applications in learning low-degree Boolean functions (Eskenazis and Ivanisvili in Proceedings of the 54th annual ACM SIGACT symposium on theory of computing, pp 203–207, 2022). Motivated by learning quantum observables, a qubit analogue of the Bohnenblust–Hille inequality for Boolean cubes was recently conjectured in Rouzé et al. (Quantum Talagrand, KKL and Friedgut’s theorems and the learnability of quantum Boolean functions, 2022. arXiv preprint arXiv:2209.07279). The conjecture was resolved in Huang et al. (Learning to predict arbitrary quantum processes, 2022. arXiv preprint arXiv:2210.14894). In this paper, we give a new proof of these Bohnenblust–Hille inequalities for qubit systems with constants that are dimension-free and of exponential growth in the degree. As a consequence, we obtain a junta theorem for low-degree polynomials. Using similar ideas, we also study learning problems for low-degree quantum observables and Bohr’s radius phenomenon on quantum Boolean cubes.


Introduction
In 1930, Littlewood [Lit30] proved that for any $n \ge 1$ and any bilinear form $B : \mathbb{C}^n \times \mathbb{C}^n \to \mathbb{C}$, we have
\[ \Big( \sum_{i,j=1}^{n} |B(e_i, e_j)|^{4/3} \Big)^{3/4} \le \sqrt{2}\, \|B\|. \tag{1.1} \]
Here $4/3$ is optimal and (1.1) is known as Littlewood's 4/3 inequality. Right after Littlewood's proof of (1.1), Bohnenblust and Hille [BH31] extended this result to multilinear forms: for any $d \ge 1$, there exists a constant $C_d > 0$ depending only on $d$ such that for any $n \ge 1$ and any $d$-linear form $B : \mathbb{C}^n \times \cdots \times \mathbb{C}^n \to \mathbb{C}$, we have
\[ \Big( \sum_{i_1, \dots, i_d = 1}^{n} |B(e_{i_1}, \dots, e_{i_d})|^{\frac{2d}{d+1}} \Big)^{\frac{d+1}{2d}} \le C_d \|B\|, \tag{1.2} \]
where $\|B\|$ is defined in a similar way as above and the exponent $2d/(d+1)$ is optimal. The inequalities (1.2) played a key role in Bohnenblust and Hille's solution [BH31] to Bohr's strip problem [Boh13] concerning the convergence of Dirichlet series. Such multilinear form inequalities (1.2) and their polynomial variants (which we shall recall for Boolean cubes) are known as Bohnenblust-Hille inequalities.
Since then, Bohnenblust-Hille inequalities have been extended to different contexts. Recent years have seen great progress in improving the constants in Bohnenblust-Hille inequalities (e.g. $C_d$ in (1.2)), and this has led to the resolution of a number of open problems in harmonic analysis. See for example [DFOC+11, DSP14, BPSS14, DGMSP19] and references therein.
In [Ble01] Blei extended Bohnenblust-Hille inequalities to polynomials on Boolean cubes with dimension-free constants. Recently, this result was revisited by Defant, Mastyło and Pérez [DMP19]. Moreover, they proved that the dimension-free Bohnenblust-Hille constants for Boolean cubes actually grow at most subexponentially in the degree. To state their results, recall that any function $f : \{-1, 1\}^n \to \mathbb{C}$ has the Fourier-Walsh expansion
\[ f(x) = \sum_{S \subset [n]} \hat{f}(S) \chi_S(x), \]
where for each $S \subset [n] := \{1, \dots, n\}$, $\hat{f}(S) \in \mathbb{C}$ and
\[ \chi_S(x) := \prod_{j \in S} x_j. \]
The function $f$ is said to be of degree at most $d$ if $\hat{f}(S) = 0$ whenever $|S| > d$, and is said to be $d$-homogeneous if $\hat{f}(S) = 0$ whenever $|S| \ne d$. Defant, Mastyło and Pérez proved the following theorem (they considered real-valued functions but the proof works for the complex-valued case as well).
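Theorem 1.1. [DMP19, Theorem 1] For any $d \ge 1$, there exists $C_d > 0$ such that for any $n \ge 1$ and any $f : \{-1, 1\}^n \to \mathbb{C}$ of degree at most $d$, we have
\[ \Big( \sum_{|S| \le d} |\hat{f}(S)|^{\frac{2d}{d+1}} \Big)^{\frac{d+1}{2d}} \le C_d \|f\|_\infty. \tag{1.3} \]
Moreover, denoting by $\mathrm{BH}^{\le d}_{\{\pm 1\}}$ the best constant $C_d$ in (1.3), there exists a universal $C > 0$ such that $\mathrm{BH}^{\le d}_{\{\pm 1\}} \le C^{\sqrt{d \log d}}$.
The Bohnenblust-Hille inequality (1.3) has found applications in learning bounded low-degree functions [EI22], which will be explained in more detail in Section 4. In view of this, an analogue of (1.3) for qubit systems was conjectured in [RWZ22], motivated by learning quantum observables following the work of Eskenazis and Ivanisvili [EI22]. In fact, this quantum analogue of the Bohnenblust-Hille inequality was already contained in a result of Huang, Chen and Preskill in the preprint [HCP22], which was not yet available online when the conjecture was made. Their motivation was to predict arbitrary quantum processes. In this paper, we provide another proof that is simpler and gives better constants. Moreover, our method is more general and allows us to reduce many problems on qubit systems to classical Boolean cubes. We refer to Section 2 for more discussion and to Section 6 for the comparison between our results and the work in [HCP22].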
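For small $n$, the Fourier-Walsh coefficients and the left-hand side of (1.3) can be computed by brute force. The following is a minimal Python sketch under that assumption; the function names (`walsh_coefficients`, `bh_lhs`, and so on) are illustrative choices and not notation from [DMP19], and the enumeration over all $2^n$ points is only feasible for toy examples.

```python
from itertools import combinations, product


def walsh_coefficients(f, n):
    """Return {S: hat f(S)} for f on {-1,1}^n; S is a sorted tuple of 0-based indices."""
    points = list(product((-1, 1), repeat=n))
    coeffs = {}
    for k in range(n + 1):
        for S in combinations(range(n), k):
            total = 0.0
            for x in points:
                chi = 1
                for j in S:  # chi_S(x) = product of x_j over j in S
                    chi *= x[j]
                total += f(x) * chi
            coeffs[S] = total / len(points)  # expectation over the uniform measure
    return coeffs


def degree(coeffs, tol=1e-12):
    """Largest |S| with a non-vanishing coefficient."""
    return max((len(S) for S, c in coeffs.items() if abs(c) > tol), default=0)


def bh_lhs(coeffs, d):
    """Left-hand side of (1.3): the l^{2d/(d+1)} norm of the coefficients with |S| <= d."""
    p = 2 * d / (d + 1)
    return sum(abs(c) ** p for S, c in coeffs.items() if len(S) <= d) ** (1 / p)


def sup_norm(f, n):
    return max(abs(f(x)) for x in product((-1, 1), repeat=n))


# Example: majority of three bits, a function of degree 3 with sup norm 1.
def maj3(x):
    return 1 if sum(x) > 0 else -1


coeffs = walsh_coefficients(maj3, 3)
ratio = bh_lhs(coeffs, 3) / sup_norm(maj3, 3)
```

For this example the nonzero coefficients are $1/2$ on the three singletons and $-1/2$ on the full set, so `ratio` equals $2^{1/3}$; any admissible constant $C_3$ in (1.3) must be at least this large.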
Corollary 1.3. Fix $d \ge 1$. Then there exists $C_d > 0$ such that for any $n \ge 1$ and any [...] (each $\sigma$ [...]) [...]. Moreover, $C_d \le 3^d\, \mathrm{BH}^{\le d}_{\{\pm 1\}}$, and it becomes a noncommutative analogue of Littlewood's 4/3 inequality when $d = 2$.
Remark 1.4. Note that the algebra of functions on $\{-1, 1\}^n$ can be viewed as a commutative subalgebra of $M_2(\mathbb{C})^{\otimes n}$ spanned by $\sigma_s$, $s \in \{0, 3\}^n$. So (1.3) is a special case of (1.4) and we always have $\mathrm{BH}^{\le d}_{\{\pm 1\}} \le \mathrm{BH}^{\le d}_{M_2(\mathbb{C})}$. Our main result Theorem 1.2 states that the converse holds up to a factor $3^d$.
Remark 1.5. The main theorem of Huang, Chen and Preskill [HCP22, Theorem 5] is actually more general and admits (1.4) as a corollary [HCP22, Corollary 3]. Their proof is different from ours and the constant they obtained is of order $d^{O(d)}$.
It is known [Bou02, DFKO07] that if a bounded function on the Boolean cube is of low degree, then it is close to a junta. In the next corollary we derive such a result in a quantum setting. We refer to [RWZ22, Theorem 3.9] for another quantum junta type theorem, related to the influences instead of the degree.
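Corollary 1.6. Suppose that $A \in M_2(\mathbb{C})^{\otimes n}$ is of degree at most $d$ and $\|A\| \le 1$. Then for any $\varepsilon > 0$, there exists a $k$-junta $B \in M_2(\mathbb{C})^{\otimes n}$ such that
\[ \|A - B\|_2 \le \varepsilon \quad \text{with} \quad k \le \frac{d \big( \mathrm{BH}^{\le d}_{M_2(\mathbb{C})} \big)^{2d}}{\varepsilon^{2d}}. \]
Here $\|\cdot\|_2$ denotes the Hilbert-Schmidt norm with respect to the normalized trace, that is, $\|A\|_2^2 = 2^{-n}\, \mathrm{tr}(A^* A)$. In particular, we may choose $k \le d\, C^{2d^2} \varepsilon^{-2d}$ for some universal $C > 0$.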
Remark 1.7. The results in [Bou02, DFKO07] are more general. However, in the case when polynomials are of low degree, the proof presented here, which uses Bohnenblust-Hille inequalities, is simpler. We are grateful to Alexandros Eskenazis for pointing out this proof to us.
To prove Theorem 1.2, we reduce the problem to the (commutative) Boolean cube case, at the price of an extra factor $3^d$. In fact, our main contribution is a general method that reduces many problems in the quantum setting to the classical setting. This method will be explained in Section 2, while the proofs of Theorem 1.2 and Corollary 1.6 will be presented in Section 3.
We shall illustrate the strength of this reduction method with two more applications. The first one concerns learning bounded low-degree quantum observables; it is the subject of Section 4, and the main result there is Theorem 4.1. The second one is Theorem 5.2 on Bohr's radius phenomenon in the context of quantum Boolean cubes, which will be discussed in Section 5.
In Section 6, we briefly compare our results with the work of Huang, Chen and Preskill [HCP22].
Notation. We shall use $\mathrm{tr}$ to denote the usual (unnormalized) trace on matrix algebras, and $\langle \cdot, \cdot \rangle$ the inner product on $\mathbb{C}^n$ that is linear in the second argument. By $\|A\|_p$ of a $2^k$-by-$2^k$ matrix $A$ we always mean the normalized Schatten-$p$ norm, i.e. $\|A\|_p^p = 2^{-k}\, \mathrm{tr}|A|^p$. For any unit vector $\eta \in \mathbb{C}^n$, we use $|\eta\rangle\langle\eta|$ to denote the associated rank one projection operator. Sometimes people use the convention $\eta \otimes \eta$ instead. By a density matrix we mean a positive semi-definite matrix with unit trace.

Reduction to the commutative case
In this section, we present a general reduction method. For this, let us collect a few facts about Pauli matrices. For each $j = 1, 2, 3$, $\sigma_j$ is a self-adjoint unitary, and has $1$ and $-1$ as eigenvalues. We denote by $e^j_1$ and $e^j_{-1}$ the corresponding unit eigenvectors, respectively. Pauli matrices $\sigma_j$, $j = 1, 2, 3$, satisfy the following anticommutation relation:
\[ \sigma_j \sigma_k + \sigma_k \sigma_j = 2 \delta_{jk} \mathbf{1}, \qquad j, k \in \{1, 2, 3\}. \]
We record the following simple fact as a lemma.
[...] Thus (recall that $\kappa_j \in \{1, 2, 3\}$) [...]. So we have shown that [...], and thus [...]. This is nothing but (2.6). To see the rest of the statements, note first that by definition and Hölder's inequality: [...]. The desired $\deg(f_A) = \deg(A)$ and the inequality (2.7) follow immediately from (2.6) and the fact that $|p(S)| = |S|$.
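The facts about Pauli matrices collected at the start of this section can be checked numerically. The sketch below, in plain Python with nested lists, verifies the anticommutation relation and computes $\mathrm{tr}[\sigma_j \rho]$ for a single-qubit state $\rho$ that mixes eigenprojections of $\sigma_1, \sigma_2, \sigma_3$ with equal weights $1/3$. That particular state, and the helper names `eigenprojection` and `mixed_state`, are assumptions made here for illustration; the resulting factor $1/3$ per qubit is at least consistent with the factor $3^{-|S|}$ appearing in this paper.

```python
I2 = [[1, 0], [0, 1]]
PAULI = {
    1: [[0, 1], [1, 0]],
    2: [[0, -1j], [1j, 0]],
    3: [[1, 0], [0, -1]],
}


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def add(A, B, a=1, b=1):
    """Linear combination a*A + b*B of 2x2 matrices."""
    return [[a * A[i][j] + b * B[i][j] for j in range(2)] for i in range(2)]


def trace(A):
    return A[0][0] + A[1][1]


def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))


def anticommutator(j, k):
    return add(matmul(PAULI[j], PAULI[k]), matmul(PAULI[k], PAULI[j]))


def eigenprojection(k, x):
    """Projection (1 + x*sigma_k)/2 onto the eigenspace of sigma_k with eigenvalue x."""
    return add(I2, PAULI[k], 0.5, 0.5 * x)


def mixed_state(x):
    """Hypothetical state: average of the three eigenprojections with signs x = (x1, x2, x3)."""
    rho = [[0, 0], [0, 0]]
    for k in (1, 2, 3):
        rho = add(rho, eigenprojection(k, x[k - 1]), 1, 1 / 3)
    return rho


x = (1, -1, 1)
rho = mixed_state(x)
# tr[sigma_j rho] recovers x_j up to the factor 1/3.
expectations = [trace(matmul(PAULI[j], rho)) for j in (1, 2, 3)]
```

With this choice, `expectations` equals $(x_1/3, x_2/3, x_3/3)$, so a tensor product of $n$ such states turns each Pauli string of weight $|s|$ into a character of $\{-1,1\}^{3n}$ scaled by $3^{-|s|}$.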

Proof of Theorem 1.2 and Corollary 1.6
In this section we prove Theorem 1.2 and Corollary 1.6.
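Proof of Theorem 1.2. Written out explicitly (compare the notation (2.5)), we need to prove
\[ \Big( \sum_{|s| \le d} |A_s|^{\frac{2d}{d+1}} \Big)^{\frac{d+1}{2d}} \le 3^d\, \mathrm{BH}^{\le d}_{\{\pm 1\}} \|A\|. \]
Let $f_A : \{-1, 1\}^{3n} \to \mathbb{C}$ be the function associated to $A \in M_2(\mathbb{C})^{\otimes n}$ constructed in Proposition 2.2. So it is also of degree at most $d$. Then the desired result follows from the following chain of inequalities:
\[ \Big( \sum_{|s| \le d} |A_s|^{\frac{2d}{d+1}} \Big)^{\frac{d+1}{2d}} \le 3^d \Big( \sum_{|S| \le d} |\hat{f}_A(S)|^{\frac{2d}{d+1}} \Big)^{\frac{d+1}{2d}} \le 3^d\, \mathrm{BH}^{\le d}_{\{\pm 1\}} \|f_A\|_\infty \le 3^d\, \mathrm{BH}^{\le d}_{\{\pm 1\}} \|A\|, \]
where in the first and last inequalities we used Proposition 2.2, and in the second inequality we used the Bohnenblust-Hille inequality on $\{-1, 1\}^{3n}$.
Note that Corollary 1.3 follows immediately from Theorem 1.2. Now we prove Corollary 1.6.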
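Proof of Corollary 1.6. Fix $\eta > 0$. Consider the following set
\[ \mathcal{A}_\eta := \{ s \in \{0, 1, 2, 3\}^n : |A_s| > \eta \}. \]
Recall that $\|A\| \le 1$. Then by Markov's inequality and the noncommutative Bohnenblust-Hille inequality (denoting $\mathrm{BH} := \mathrm{BH}^{\le d}_{M_2(\mathbb{C})}$),
\[ |\mathcal{A}_\eta| \le \eta^{-\frac{2d}{d+1}} \sum_{|s| \le d} |A_s|^{\frac{2d}{d+1}} \le \eta^{-\frac{2d}{d+1}}\, \mathrm{BH}^{\frac{2d}{d+1}}. \]
Then $B_\eta := \sum_{s \in \mathcal{A}_\eta} A_s \sigma_s$ depends on at most $k$ coordinates with
\[ k \le d\, |\mathcal{A}_\eta| \le d\, \eta^{-\frac{2d}{d+1}}\, \mathrm{BH}^{\frac{2d}{d+1}}. \]
Again, by the noncommutative Bohnenblust-Hille inequality and the fact that $|A_s| \le \eta$ for $s \notin \mathcal{A}_\eta$,
\[ \|A - B_\eta\|_2^2 = \sum_{s \notin \mathcal{A}_\eta} |A_s|^2 \le \eta^{\frac{2}{d+1}} \sum_{s \notin \mathcal{A}_\eta} |A_s|^{\frac{2d}{d+1}} \le \eta^{\frac{2}{d+1}}\, \mathrm{BH}^{\frac{2d}{d+1}}. \]
Choosing $\eta = \varepsilon^{d+1}\, \mathrm{BH}^{-d}$ gives $\|A - B_\eta\|_2 \le \varepsilon$ and $k \le d\, \mathrm{BH}^{2d} \varepsilon^{-2d}$.
This finishes the proof, since $\mathrm{BH}^{\le d}_{M_2(\mathbb{C})} \le C^d$ for some universal $C > 0$ by Theorem 1.2.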
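The truncation used in this proof is easy to carry out once the Pauli coefficients of $A$ are available. The following Python sketch assumes the coefficients are given as a dictionary mapping each $s \in \{0,1,2,3\}^n$ to $A_s$; the function name `junta_truncation` and the numerical example are hypothetical. It uses the identity $\|A\|_2^2 = \sum_s |A_s|^2$, which holds for the normalized Hilbert-Schmidt norm because the $\sigma_s$ are orthonormal with respect to the normalized trace.

```python
def junta_truncation(coeffs, eta):
    """Keep the Pauli terms with |A_s| > eta.

    coeffs maps tuples s in {0,1,2,3}^n to the coefficient A_s.
    Returns the kept terms (the operator B_eta), the set of qubits on which
    B_eta acts non-trivially, and the squared error ||A - B_eta||_2^2.
    """
    kept = {s: a for s, a in coeffs.items() if abs(a) > eta}
    # A term sigma_s touches qubit j exactly when s_j != 0.
    support = {j for s in kept for j, sj in enumerate(s) if sj != 0}
    # Parseval: the squared error is the sum of |A_s|^2 over the dropped terms.
    err_sq = sum(abs(a) ** 2 for s, a in coeffs.items() if s not in kept)
    return kept, support, err_sq


# Hypothetical degree-2 observable on n = 4 qubits. The absolute values of the
# coefficients sum to 0.87, so the operator norm is at most 1.
A = {
    (1, 0, 0, 0): 0.5,
    (0, 3, 3, 0): 0.3,
    (0, 0, 2, 1): 0.05,
    (2, 0, 0, 3): 0.02,
}
B, support, err_sq = junta_truncation(A, eta=0.1)
```

Here two terms survive, $B_\eta$ acts on three of the four qubits (at most $d\,|\mathcal{A}_\eta| = 4$), and the squared error is $0.05^2 + 0.02^2$.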

Learning quantum observables of low degree
Let us review the classical learning model first. Suppose that we want to learn a class $\mathcal{F}$ of functions on $\{-1, 1\}^n$ using the random query model, which we now explain. For $N \ge 1$, let $X_1, \dots, X_N$ be $N$ i.i.d. random variables uniformly distributed on $\{-1, 1\}^n$. Then how many random queries do we need to recover functions $f \in \mathcal{F}$ nicely? More precisely, fix the error parameters $\varepsilon, \delta \in (0, 1)$; what is the least number $N = N(\mathcal{F}, \varepsilon, \delta) > 0$ such that for any $f \in \mathcal{F}$, together with $N$ random queries $(X_1, f(X_1)), \dots, (X_N, f(X_N))$, one can construct a random function $h : \{-1, 1\}^n \to \mathbb{R}$ such that $\|f - h\|_2^2 \le \varepsilon$ with probability at least $1 - \delta$? Here $\|f\|_2$ denotes the $L^2$-norm of $f$ with respect to the uniform probability measure on $\{-1, 1\}^n$.
When $\mathcal{F} = \mathcal{F}^{\le d}_n$ consists of functions $f : \{-1, 1\}^n \to [-1, 1]$ of degree at most $d$, Eskenazis and Ivanisvili [EI22] proved that
\[ N(\mathcal{F}^{\le d}_n, \varepsilon, \delta) \le \frac{C(d)}{\varepsilon^{d+1}} \log\Big(\frac{n}{\delta}\Big), \]
where $C(d)$ depends on the Bohnenblust-Hille constant $\mathrm{BH}^{\le d}_{\{\pm 1\}}$ for functions on $\{-1, 1\}^n$ of degree at most $d$. So a logarithmic number $O_{\varepsilon, \delta, d}(\log(n))$ of random queries is sufficient to learn bounded low-degree polynomials in the above sense. This improves significantly the previous work [LMN93, IRRRY21] on the dimension-dependence of $N(\mathcal{F}^{\le d}_n, \varepsilon, \delta)$. Later on, Eskenazis, Ivanisvili and Streck [EIS23] proved that $O_{\varepsilon, \delta, d}(\log(n))$ is also necessary.
Now suppose that we want to learn a quantum observable $A$ over $n$ qubits, i.e. $A \in M_2(\mathbb{C})^{\otimes n}$, of degree at most $d$ with $\|A\| \le 1$. Our learning model is similar to the classical setting. The random queries are now replaced with
\[ (\rho, \mathrm{tr}[A \rho]), \]
where $\rho$ is sampled uniformly from some set $\mathcal{S}$ of density matrices in $M_2(\mathbb{C})^{\otimes n}$. Our hope is to build another (random) observable $\widetilde{A}$ out of $N$ random queries such that
\[ \|A - \widetilde{A}\|_2^2 \le \varepsilon \tag{4.1} \]
with probability at least $1 - \delta$. Again, the question is: how many random queries $N = N(\varepsilon, \delta, d, n)$ do we need to accomplish this, and how does $N$ depend on $n$?
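The classical random query model reviewed above can be illustrated with a short simulation. The Python sketch below estimates each low-degree Walsh coefficient by an empirical average over the samples and keeps only the estimates above a threshold $b$, in the spirit of [EI22]. The function names, the toy target function, and the values $N = 2000$ and $b = 0.25$ are assumptions made for this illustration; they are not the parameter choices of [EI22].

```python
import random
from itertools import combinations


def chi(S, x):
    """Character chi_S(x) = product of x_j over j in S."""
    out = 1
    for j in S:
        out *= x[j]
    return out


def learn_low_degree(samples, n, d, b):
    """Estimate Walsh coefficients of degree <= d from samples (x, f(x)).

    Returns {S: alpha_S} restricted to the sets S with |alpha_S| >= b,
    where alpha_S is the empirical mean of f(x) * chi_S(x).
    """
    N = len(samples)
    kept = {}
    for k in range(d + 1):
        for S in combinations(range(n), k):
            alpha = sum(y * chi(S, x) for x, y in samples) / N
            if abs(alpha) >= b:
                kept[S] = alpha
    return kept


def evaluate(h, x):
    """Evaluate the learned polynomial h = sum of alpha_S * chi_S at x."""
    return sum(a * chi(S, x) for S, a in h.items())


# Toy target: a bounded function of degree 2 on n = 6 bits.
def f(x):
    return 0.5 * x[0] * x[1] + 0.5 * x[2]


rng = random.Random(0)  # fixed seed so that the run is reproducible
n, d, N = 6, 2, 2000
samples = []
for _ in range(N):
    x = tuple(rng.choice((-1, 1)) for _ in range(n))
    samples.append((x, f(x)))

h = learn_low_degree(samples, n, d, b=0.25)
```

In this toy run the thresholding retains exactly the two sets carrying the true coefficients, and the retained estimates are close to $1/2$. The point of the Bohnenblust-Hille inequality in [EI22] is to control the error caused by the discarded small coefficients without paying a polynomial factor in $n$.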
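In the remaining part of this section we provide one answer to this question with $\mathcal{S} = \mathcal{S}_n$ constructed in Proposition 2.2.
Theorem 4.1. Suppose that $A \in M_2(\mathbb{C})^{\otimes n}$ is of degree at most $d$ and $\|A\| \le 1$. Fix $\delta, \varepsilon \in (0, 1)$ and $N \ge$ [...] with $C > 0$ large enough. Then given any $N$ random density matrices $\rho^{(m)}$, $1 \le m \le N$, independently and uniformly sampled in $\mathcal{S}_n$, as well as random queries $(\rho^{(m)}, \mathrm{tr}[A \rho^{(m)}])$, $1 \le m \le N$, we can construct a random polynomial $\widetilde{A} \in M_2(\mathbb{C})^{\otimes n}$ such that $\|A - \widetilde{A}\|_2^2 \le \varepsilon$ with probability at least $1 - \delta$.
Proof. Again, we reduce the problem to the commutative case, as we now explain. To any [...] we associate the function [...], and [...].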
Suppose that we have $\rho(\vec{x}(m)) \in \mathcal{S}_n$, $1 \le m \le N$, as our independent random density matrices and [...] as random queries, where $\vec{x}(m)$, $1 \le m \le N$, are i.i.d. random variables uniformly distributed on $\{-1, 1\}^{3n}$. Similar to the commutative setting [EIS23], we approximate $\hat{f}_A(S) = 3^{-|S|} A_{p(S)}$, $S \in \mathrm{Im}(q)$, $|S| \le d$, with the empirical Walsh coefficients
\[ \alpha_S := \frac{1}{N} \sum_{m=1}^{N} f_A(\vec{x}(m))\, \chi_S(\vec{x}(m)). \]
Of course $\mathbb{E} \alpha_S = \hat{f}_A(S)$, where $\mathbb{E}$ is with respect to the uniform distribution. Since $\alpha_S$ is the sum of i.i.d. random variables, we get by $\|f_A\|_\infty \le 1$ and the Chernoff bound that for $b > 0$ [...]. Then by (4.2) and the union bound [...]. Choosing [...] one achieves [...].
We continue to follow [EI22] closely and introduce the random sets [...]. In view of (4.4), with probability $\ge 1 - \delta$: [...]. The second line of (4.5), together with $\|f_A\|_\infty \le 1$ and the commutative Bohnenblust-Hille inequality (1.3), yields [...]. Fix $b$ as above. Consider the random polynomial in $M_2(\mathbb{C})^{\otimes n}$ [...]. All combined, we have with probability at least $1 - \delta$ that [...]. Inserting this into (4.3), we choose $N$ such that [...]. Noting moreover that (see for example [EI22]) [...]
The problem of Boolean's radius is to determine the right order of decay of $\mathrm{Br}_n(\mathcal{F}_n)$ as $n \to \infty$. Among others, Defant, Mastyło and Pérez proved the following. [...] (2) there exists $C > 1$ such that for all [...], where [...]; (3) $\lim$ [...].
We refer to [DMP18] for more discussion of the similarities and differences between the Boolean cube case and the torus case.
The concept of Boolean radius carries over to the quantum setting without any difficulties. [...] This establishes the bound $\mathrm{qBr}_n(\mathcal{F}^q_n(\mathrm{all})) \ge \frac{2^{1/n} - 1}{9}$ by definition.

Discussions
We briefly compare our results with the work of Huang, Chen and Preskill in [HCP22].
(2) Our proof of (1.4) is different from theirs. We use an argument that reduces the problem to the commutative case and the proof looks simpler, while their proof, which is more self-contained, combines several technical estimates that can be useful for other problems.
(3) Our better Bohnenblust-Hille constant yields an immediate improvement for learning quantum observables up to a small prediction error in their work. In particular, this allows us to remove the $\log\log(1/\varepsilon)$ factor in the exponent of [HCP22, Eq. (A17)]. It also yields an immediate improvement in the sample complexity for learning arbitrary quantum processes. This is because the learning is achieved by considering the unknown observable to be the observable after Heisenberg evolution under the unknown quantum process.
Here $\{e_j, 1 \le j \le n\}$ is the canonical basis of $\mathbb{C}^n$ and $\|B\|$ denotes the norm of the bilinear form, i.e. $\|B\| := \sup\{|B(x, y)| : x, y \in \mathbb{C}^n, \|x\|_\infty \le 1, \|y\|_\infty \le 1\}$.
2010 Mathematics Subject Classification. 47A30, 81P45, 06E30.
Key words and phrases. Bohnenblust-Hille inequality, Boolean cubes, quantum observables, Pauli matrices, quantum learning, juntas, Bohr radius.
The research of A.V. is supported by NSF DMS-1900286, DMS-2154402 and by Hausdorff Center for Mathematics. H.Z. is supported by the Lise Meitner fellowship, Austrian Science Fund (FWF) M3337. This work is partially supported by NSF DMS-1929284 while both authors were in residence at the Institute for Computational and Experimental Research in Mathematics in Providence, RI, during the Harmonic Analysis and Convexity program.
[...] some $C > 0$ large enough. Here, the Bohnenblust-Hille constant $\mathrm{BH}^{\le d}_{\{\pm 1\}}$ is contained in $C^{d \sqrt{d \log d}}$. With this choice of $N$, the random polynomial $\widetilde{A} := \widetilde{A}_b$ above satisfies $\|A - \widetilde{A}\|_2^2 \le \varepsilon$ with probability $\ge 1 - \delta$.

Bohr's radius phenomenon on quantum Boolean cubes
One important application of classical Bohnenblust-Hille inequalities is to study Bohr's radius problem [Boh14]. The original problem [BK97] concerns the $n$-dimensional torus $\mathbb{T}^n$ with $\mathbb{T} = \{z \in \mathbb{C} : |z| = 1\}$, and the exact asymptotic behaviour of Bohr's radius was obtained by Bayart, Pellegrino and Seoane-Sepúlveda [BPSS14] using the polynomial version of Bohnenblust-Hille inequalities (1.2) with the best constants (denoted by $\mathrm{BH}^{\le d}_{\mathbb{T}}$) of subexponential growth in the degree $d$. See also [DFOC+11]. A Boolean analogue of the problem was studied by Defant, Mastyło and Pérez in [DMP18], where Bohr's radius is replaced by Boolean's radius.
Definition 5.1. Boolean's radius of a function $f : \{-1, 1\}^n \to \mathbb{R}$ is the positive real number $\mathrm{Br}_n(f)$ such that
\[ \sum_{S \subset [n]} |\hat{f}(S)|\, \mathrm{Br}_n(f)^{|S|} = \|f\|_\infty. \]
Given a class $\mathcal{F}_n$ of functions on $\{-1, 1\}^n$, the Boolean radius of $\mathcal{F}_n$ is defined as $\mathrm{Br}_n(\mathcal{F}_n) := \inf\{\mathrm{Br}_n(f) : f \in \mathcal{F}_n\}$.
Of particular interest to us are the following four classes of functions:
(1) $\mathcal{F}_n(\mathrm{all})$: all real functions on $\{-1, 1\}^n$, with $\mathrm{Br}_n(\mathrm{all}) := \mathrm{Br}_n(\mathcal{F}_n(\mathrm{all}))$;
(2) $\mathcal{F}_n(\mathrm{hom})$: all real homogeneous functions on $\{-1, 1\}^n$, with $\mathrm{Br}_n(\mathrm{hom}) := \mathrm{Br}_n(\mathcal{F}_n(\mathrm{hom}))$;
(3) $\mathcal{F}_n(= d)$: all $d$-homogeneous real functions on $\{-1, 1\}^n$, with $\mathrm{Br}_n(= d) := \mathrm{Br}_n(\mathcal{F}_n(= d))$;
(4) $\mathcal{F}_n(\le d)$: all real functions on $\{-1, 1\}^n$ of degree at most $d$, with $\mathrm{Br}_n(\le d) := \mathrm{Br}_n(\mathcal{F}_n(\le d))$.
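Definition 5.1 can be explored numerically for small $n$. The Python sketch below computes the Walsh coefficients by brute force and then finds $\mathrm{Br}_n(f)$ by bisection on $[0, 1]$. It assumes $f$ is non-constant, so that $r \mapsto \sum_S |\hat{f}(S)| r^{|S|}$ is strictly increasing; since this sum is at most $\|f\|_\infty$ at $r = 0$ and at least $\|f\|_\infty$ at $r = 1$, it equals $\|f\|_\infty$ at exactly one point of $[0, 1]$. The function names and the example are illustrative assumptions, not taken from [DMP18].

```python
from itertools import combinations, product


def walsh_coefficients(f, n):
    """Return {S: hat f(S)} for f on {-1,1}^n, S a sorted tuple of 0-based indices."""
    points = list(product((-1, 1), repeat=n))
    coeffs = {}
    for k in range(n + 1):
        for S in combinations(range(n), k):
            total = 0.0
            for x in points:
                chi = 1
                for j in S:
                    chi *= x[j]
                total += f(x) * chi
            coeffs[S] = total / len(points)
    return coeffs


def boolean_radius(f, n, iters=100):
    """Solve sum_S |hat f(S)| r^{|S|} = ||f||_inf for r in [0, 1] by bisection.

    Assumes f is non-constant, so the left-hand side is strictly increasing in r.
    """
    coeffs = walsh_coefficients(f, n)
    sup = max(abs(f(x)) for x in product((-1, 1), repeat=n))

    def lhs(r):
        return sum(abs(c) * r ** len(S) for S, c in coeffs.items())

    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if lhs(mid) < sup:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Example on n = 2: f equals -1 at the point (1, 1) and +1 elsewhere.
def f(x):
    return 1 - (1 + x[0]) * (1 + x[1]) / 2


r_star = boolean_radius(f, 2)
```

For this example the defining equation reads $\tfrac12 + r + \tfrac12 r^2 = 1$, whose positive root is $\sqrt{2} - 1$, and the bisection reproduces that value.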

(1) In the work [HCP22], the noncommutative Bohnenblust-Hille inequality (1.4) ([HCP22, Corollary 3]) follows from a more general result [HCP22, Theorem 5] which we do not have. The noncommutative Bohnenblust-Hille constant ($C_d$ in (1.4)) we obtained is of exponential growth $\sim C^d$, which is better than theirs $\sim d^{O(d)}$. Note that the best known bound for the (commutative) Boolean cubes is subexponential $\sim C^{\sqrt{d \log d}}$ [DMP19].
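[...] The matrices $\sigma_s := \sigma_{s_1} \otimes \cdots \otimes \sigma_{s_n}$, $s = (s_1, \dots, s_n) \in \{0, 1, 2, 3\}^n$, form a basis of $M_2(\mathbb{C})^{\otimes n}$ and play the role of the characters $\chi_S$, $S \subset [n]$, in the classical case. So any $A \in M_2(\mathbb{C})^{\otimes n}$ has a unique expansion $A = \sum_{s \in \{0,1,2,3\}^n} A_s \sigma_s$ with $A_s \in \mathbb{C}$. For any $s \in \{0, 1, 2, 3\}^n$, we denote by $|s|$ the number of non-zero $s_j$'s. Similar to the classical setting, $A \in M_2(\mathbb{C})^{\otimes n}$ is of degree at most $d$ if $A_s = 0$ whenever $|s| > d$, and it is $d$-homogeneous if $A_s = 0$ whenever $|s| \ne d$. In the sequel, we always use $\|A\|$ to denote the operator norm of $A$. Our main result is the following:
Theorem 1.2. For any $d \ge 1$, there exists $C_d > 0$ such that for all $n \ge 1$ and all $A \in M_2(\mathbb{C})^{\otimes n}$ of degree at most $d$, we have
\[ \Big( \sum_{|s| \le d} |A_s|^{\frac{2d}{d+1}} \Big)^{\frac{d+1}{2d}} \le C_d \|A\|. \tag{1.4} \]
Moreover, the best constant $\mathrm{BH}^{\le d}_{M_2(\mathbb{C})}$ in (1.4) satisfies $\mathrm{BH}^{\le d}_{M_2(\mathbb{C})} \le 3^d\, \mathrm{BH}^{\le d}_{\{\pm 1\}}$.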
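For a small number of qubits, the expansion of an operator in the basis $\sigma_s$ and its degree can be computed directly. The Python sketch below uses the identity $A_s = 2^{-n}\, \mathrm{tr}(\sigma_s A)$, which follows from $\mathrm{tr}(\sigma_s \sigma_t) = 2^n \delta_{st}$. Matrices are plain nested lists, and all function names are illustrative assumptions; the loop over all $4^n$ strings is only practical for toy sizes.

```python
from itertools import product

# sigma_0 is the identity; sigma_1, sigma_2, sigma_3 are the Pauli matrices.
PAULI = [
    [[1, 0], [0, 1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def kron(A, B):
    """Kronecker product of two square matrices given as nested lists."""
    m = len(B)
    size = len(A) * m
    return [[A[i // m][j // m] * B[i % m][j % m] for j in range(size)]
            for i in range(size)]


def pauli_string(s):
    """sigma_s = sigma_{s_1} tensor ... tensor sigma_{s_n}."""
    M = [[1]]
    for sj in s:
        M = kron(M, PAULI[sj])
    return M


def pauli_coefficients(A, n):
    """Return {s: A_s} with A_s = 2^{-n} tr(sigma_s A)."""
    dim = 2 ** n
    coeffs = {}
    for s in product(range(4), repeat=n):
        P = pauli_string(s)
        tr = sum(P[i][k] * A[k][i] for i in range(dim) for k in range(dim))
        coeffs[s] = tr / dim
    return coeffs


def degree(coeffs, tol=1e-12):
    """Largest |s| (number of non-zero entries of s) with A_s != 0."""
    return max((sum(1 for sj in s if sj != 0)
                for s, c in coeffs.items() if abs(c) > tol), default=0)


def linear_combination(terms, n):
    """Build the matrix sum of a_s * sigma_s from a dictionary {s: a_s}."""
    dim = 2 ** n
    A = [[0] * dim for _ in range(dim)]
    for s, a in terms.items():
        P = pauli_string(s)
        for i in range(dim):
            for j in range(dim):
                A[i][j] += a * P[i][j]
    return A


# Hypothetical two-qubit observable with one term of degree 1 and one of degree 2.
terms = {(1, 0): 0.5, (3, 2): 0.25}
A = linear_combination(terms, 2)
coeffs = pauli_coefficients(A, 2)
```

The recovered dictionary `coeffs` agrees with `terms` on the two prescribed strings and vanishes elsewhere, and `degree(coeffs)` returns 2.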