Convergence Rates for the Quantum Central Limit Theorem

Various quantum analogues of the central limit theorem, which is one of the cornerstones of probability theory, are known in the literature. One such analogue, due to Cushen and Hudson, is of particular relevance for quantum optics. It implies that the state in any single output arm of an n-splitter, which is fed with n copies of a centred state \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\rho $$\end{document}ρ with finite second moments, converges to the Gaussian state with the same first and second moments as \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\rho $$\end{document}ρ. Here we exploit the phase space formalism to carry out a refined analysis of the rate of convergence in this quantum central limit theorem. For instance, we prove that the convergence takes place at a rate \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathcal {O}\left( n^{-1/2}\right) $$\end{document}On-1/2 in the Hilbert–Schmidt norm whenever the third moments of \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\rho $$\end{document}ρ are finite. Trace norm or relative entropy bounds can be obtained by leveraging the energy boundedness of the state. Via analytical and numerical examples we show that our results are tight in many respects. An extension of our proof techniques to the non-i.i.d. setting is used to analyse a new model of a lossy optical fibre, where a given m-mode state enters a cascade of n beam splitters of equal transmissivities \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\lambda ^{1/n}$$\end{document}λ1/n fed with an arbitrary (but fixed) environment state. Assuming that the latter has finite third moments, and ignoring unitaries, we show that the effective channel converges in diamond norm to a simple thermal attenuator, with a rate \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathcal {O}\Big (n^{-\frac{1}{2(m+1)}}\Big )$$\end{document}O(n-12(m+1)). This allows us to establish bounds on the classical and quantum capacities of the cascade channel. Along the way, we derive several results that may be of independent interest. For example, we prove that any quantum characteristic function \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\chi _\rho $$\end{document}χρ is uniformly bounded by some \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\eta _\rho <1$$\end{document}ηρ<1 outside of any neighbourhood of the origin; also, \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\eta _\rho $$\end{document}ηρ can be made to depend only on the energy of the state \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\rho $$\end{document}ρ.


Introduction
The Central Limit Theorem (CLT) is one of the cornerstones of probability theory. This theorem and its various extensions have found numerous applications in diverse fields including mathematics, physics, information theory, economics and psychology. Any limit theorem becomes more valuable if it is accompanied by estimates for rates of convergence. The Berry-Esseen theorem (see e.g. [1]), which gives the rate of convergence of the distribution of the scaled sum of independent and identically distributed (i.i.d.) random variables to a normal distribution, thus provides an important refinement of the CLT.
The first results on quantum analogues of the CLT were obtained in the early 1970s by Cushen and Hudson [2], and Hepp and Lieb [3,4]. The approach of [3] was generalised by Giri and von Waldenfels [5] a few years later. These papers were followed by numerous other quantum versions of the CLT in the context of quantum statistical mechanics [6][7][8][9][10][11][12][13][14], quantum field theory [15][16][17], von Neumann algebras [18,19], free probability [20], noncommutative stochastic processes [21] and quantum information theory [22][23][24]. For a more detailed list of papers on noncommutative or quantum central limit theorems (QCLT), see for example [19,25] and references therein. A partially quantitative central limit theorem for unsharp measurements has been obtained in [26]. An important pair of non-commuting observables is the pair (x, p) of canonically conjugate operators, which obey Heisenberg's canonical commutation relations (CCR) [x, p] = i I , where I denotes the identity operator. 1 These observables could be, for example, the position and momentum operators of a quantum particle, or the so-called position and momentum quadratures of a single-mode bosonic field, described in the quantum mechanical picture by the Hilbert space H 1 . . = L 2 (R) -the space of square integrable functions on R. The corresponding annihilation and creation operators are constructed as a . . = (x + i p)/ √ 2 and a † . . = (x − i p)/ √ 2. When expressed in terms of a, a † , the CCR take the form [a, a † ] = I . Quantum states are represented by density operators, i.e. positive semi-definite trace class operators with unit trace. A state ρ of a continuous variable quantum system is uniquely identified by its characteristic function, defined for all z ∈ C by χ ρ (z) . . = Tr ρ e za † −z * a . The special class of Gaussian states comprises all quantum states whose characteristic function is the (classical) characteristic function of a normal random variable on C. 2 Exactly as in the classical case, a quantum Gaussian state is uniquely defined by its mean and covariance matrix.
Cushen and Hudson [2] proved a quantum CLT for a sequence of pairs of such canonically conjugate operators {(x n , p n ) : n = 1, 2, . . .}, with each pair acting on a distinct copy of the Hilbert space H 1 . More precisely, they showed that sequences that are stochastically independent and identically distributed, and have finite covariance matrix and zero mean with respect to a quantum state ρ (given by a density operator on H 1 ), are such that their scaled sums converge in distribution to a normal limit distribution [2, Theorem 1].
Their result admits a physical interpretation in terms of a passive quantum optical element known as the n-splitter. This can be thought of as the unitary operator U n-split that acts on n annihilation operators of n independent optical modes as U n-split a j U † n-split = k F jk a k , where F jk . . = e 2π jk n i is the discrete Fourier transform matrix. Passivity here means that U n-split commutes with the canonical Hamiltonian of the field, i.e. U n-split , j a † j a j = 0. When n identical copies of a state ρ are combined by means of an n-splitter, and all but the first output modes are traced away, the resulting output state is called the n-fold quantum convolution of ρ, and denoted by ρ n . This nomenclature is justified by the fact that the characteristic function χ ρ σ of two states ρ and σ is equal to the product of the characteristic functions of ρ and σ , a relation analogous to that satisfied by characteristic functions of convolutions of classical random variables. Observe state ρ n can also be obtained as the output of a cascade of n − 1 beam splitters with suitably tuned transmissivities λ j = j/( j + 1) for j = 1, 2, . . . n − 1 (see Fig. 1a).
Cushen and Hudson's result is that if ρ is a centred state (i.e. with zero mean) and has finite second moments, its convolutions ρ n converge to the Gaussian state ρ G with the same first and second moments as ρ in the limit n → ∞ (Theorem 3). In [2, Theorem 1], the convergence is with respect to the weak topology of the Banach space of trace class operators, which translates to pointwise convergence of the corresponding characteristic functions, by a quantum analogue of Levy's lemma that is also proven in [2]. This in turn implies that the convergence actually is with respect to the strong topology, i.e. in trace norm (see [27], or [28,Lemma 4]).
In this paper, we focus on the framework proposed by Cushen and Hudson, and provide a refinement of their result by deriving estimates for the associated rates of convergence. We consider a quantum system composed of m modes of the electromagnectic field, each modelled by an independent quantum harmonic oscillator, so that the corresponding Hilbert space becomes H ⊗m 1 = L 2 (R m ). The main contribution of this paper consists of estimates on rate of convergence of ρ n to the 'Gaussification' ρ G of ρ, obtained under suitable assumptions on ρ -typically, the finiteness of higher-order moments. In analogy with the classical case, we refer to our Theorems 6 and 7 as quantum Berry-Esseen theorems. Our estimates are given in the form of bounds on the Schatten p-norms (for p = 1 and 2) of the difference (ρ n − ρ G ) in the limit of large n, as well as bounds on the relative entropy of ρ n with respect to ρ G in the same limit.
We also show that the assumption of finiteness of the second moments cannot be removed from the Cushen-Hudson theorem. Namely, we construct a simple example of a single-mode quantum state σ such that Tr σ (aa † ) 1−δ is finite for all δ > 0 (and infinite for δ = 0), yet σ n does not converge to any quantum state as n → ∞.
As an application, we propose and study a new model of optical fibre, represented as a cascade of n beam splitters, each with transmissivity λ 1/n and fed with a fixed environment state ρ, which is assumed to have bounded energy and thermal Gaussification. Such a model may be relevant to the mathematical modelisation of a channel running across an integrated optical circuit [29,30]. We are able to show that for n → ∞ the cascade channel converges in diamond norm, up to irrelevant symplectic unitaries, to a thermal attenuator channel with transmissivity λ and the same photon number as that of the environment state ρ. Furthermore, an extension of our results to the non-i.i.d. setting allows us to bound the rate of convergence in terms of the diamond norm distance. Finally, combining existing continuity bounds on entropies and energy-constrained channel capacities [31,32], obtained by Winter [33,34] and Shirokov [35,36], with the known formulae expressing or estimating energy-constrained classical [37,38] and quantum [39][40][41][42][43][44][45] capacities of thermal attenuator channels, we derive bounds on the same capacities for the cascade channel.
Finally, along the way we derive several novel results concerning quantum characteristic functions, which we believe to be of independent interest. First, we prove the simple yet remarkable fact that convolving any two quantum states (i.e. mixing them in a 50 : 50 beam splitter) always results in a state with non-negative Wigner function (Lemma 16). This allows us to interpret the quantum central limit theorem as a result on classical random variables, in turn enabling us to transfer techniques from classical probability theory to the quantum setting. Secondly, we derive new decay bounds on the behaviour of the quantum characteristic function both at the origin and at infinity. For instance, we prove that for any m-mode quantum state ρ and for any ε > 0 there exists a constant η = η(ρ, ε) < 1 such that |χ ρ (z)| ≤ η(ρ, ε) for all z ∈ C m with |z| ≥ ε . Moreover, we show that such a constant can be made to depend only on the second moments of the state, assuming they are finite (Proposition 15). As an explicit example, consider a single-mode state ρ with mean energy E. We then prove that |χ ρ (z)| ≤ 1 − c E 2 for all z with |z| ≥ C √ E , where c, C are universal constants. Note that any such bound must depend on the energy, as one can construct a sequence of highly squeezed Gaussian states for which the modulus of the characteristic function approaches one at any designated point in phase space (Example 2).
Layout of the paper: In Sect. 2 we introduce the notation and definitions used in the paper. In Sect. 3 we recall the Cushen and Hudson quantum central limit theorem. Our main results are presented in Sect. 4. The rest of the paper is devoted to the proofs of these results. We start with the novel properties of quantum characteristic functions (Sect. 5), which lie at the heart of our approach. Then, in Sect. 6 we prove our quantum Berry-Esseen theorems. Section 7 is devoted to the discussion of the optimality and sharpness of our results. In Sect. 8 we apply our quantitative non-i.i.d. extension of the Cushen-Hudson theorem to an optical fibre subject to non-Gaussian environment noise. The paper contains a technical appendix (Appendix A) that makes the connection between moments and the regularity of the quantum characteristic function and shows that our definition of moments induces a canonical family of interpolation spaces.

Notation and Definitions
In this section, we fix the basic notations used in the paper, and introduce the necessary definitions.
2.1. Mathematical notation. Let H denote a separable Hilbert space, and let B(H) denote the set of bounded linear operators acting on H. Let D(H) denote the set of quantum states of a system with Hilbert space H, that is the set of density operators ρ (positive semi-definite, i.e. ρ ≥ 0, trace class operators 3 with unit trace) acting on H. We denote by · T p (H) ≡ · p the Schatten p-norm, defined as X p = (Tr |X | p ) 1/ p . The Schatten p-class T p (H) is the Banach subspace of B(H) formed by all bounded linear operators whose Schatten p-norm is finite. We shall hereafter refer to T 1 (H) as the set of trace class operators, to the corresponding norm · 1 as the trace norm, and to the induced distance (e.g. between quantum states) as the trace distance. The case p = 2 is also special, as the norm · 2 coincides with the Hilbert-Schmidt norm. Let with the convention that Tr[ρ A] = +∞ if the above series diverges or if there exists an index i such that p i > 0 and |e i / ∈ Dom A 1/2 . To extend this definition to a generic densely defined self-adjoint operator X on H, it is useful to consider its decomposition X = X + − X − into positive and negative part [ the expected value of X on ρ. Clearly, given two operators A ≥ B ≥ 0, we have that For two real sequences (a n (λ)) n , (b n (λ)) n that depend on some parameter λ, we write a n (λ) = O λ (b n (λ)) if there exists a constant c λ > 0 that only depends on λ such that |a n (λ)| ≤ c λ |b n (λ)| holds in the limit n → ∞. We also write a n (λ) For an n-linear tensor A : n times if the vector we apply the tensor to is the same in every component. For functions f , we sometimes abuse the notation by denoting the norm of this function as f (z) instead of f . We denote with * the entry-wise complex conjugation, with the standard transposition of vectors, and with † the combination of the two.
For partial derivatives with respect to complex variables z, z * we write ∂ z and ∂ z * . Consider an m-dimensional multi-index α = (α 1 , α 2 , . . . , α m ) with |α| = α 1 + α 2 + · · · + α m . Then ∂ α z . . = ∂ α 1 z 1 ∂ α 2 z 2 . . . ∂ α m z m , and analogously for z * . The total derivatives of order k of a function f : We then recall the definition of the Fréchet derivative for functions f : , C) and therefore We write C ∞ c (C m ) to denote the space of smooth and compactly supported functions on C m . For some open set ⊆ C m with closure , a function f : → C, and a non-negative integer k ∈ N 0 , we denote by C k ( ) the space of functions for which the norm is finite. Here, α, β ∈ N m 0 are multi-indices. When k ≥ 0 is not an integer, we define instead This extension allows us to consider the normed spaces C k for all k ≥ 0. Typically, we will deal with the case where is bounded, so that C k is in fact a Banach space. Finally, L 2 ( ) will denote the space of equivalence classes of measurable functions f : → C whose L 2 norm f 2 where I is the identity on H m . An m-mode quantum state ρ is said to be centred if i.e. if all expected values of the canonical operators on ρ, defined according to (2), vanish.
For an m-tuple of non-negative integers n = (n 1 , . . . , n m ) ∈ N m 0 , the corresponding Fock state is defined by |n .
In what follows, we often consider m = 1. The (von Neumann) entropy of a quantum state ρ is defined as which is well defined although possibly infinite. 4 The relative entropy between two states ρ and σ is usually written as follows [47] D(ρ σ ) . . = Tr ρ (log ρ − log σ ) .
Again, the above expression is well defined and possibly infinite [48]. 5 For two Hilbert spaces H, H , a quantum channel N : T 1 (H) → T 1 (H ) is a completely positive, trace-preserving linear map. For a linear map L : T 1 (H) → T 1 (H ), we define its diamond norm as where the supremum is over all non-zero trace class operators X on H ⊗ C k . Consider a quantum system with Hilbert space H, governed by a Hamiltonian H , which is taken to be a positive (possibly unbounded) operator on H. The energy of a state ρ ∈ D(H) is the quantity Tr[ρ H ] ∈ R + ∪ {+∞} defined as in (1).
Given two Hilbert spaces H and H , a Hamiltonian H on H, and some energy bound E > inf |ψ ∈H ψ|H |ψ ≥ 0, the corresponding energy-constrained classical capacity of a channel N : T 1 (H) → T 1 (H ) is given by [31,[49][50][51][52] where it is understood that the Hamiltonian H (n) on H ⊗n is given by H (n) . . = n j=1 H j , where H j acts on the j th tensor factor, and tensor products with the identity operator 4 One way to define it is via the infinite sum S(ρ) = i (− p i log p i ), where ρ = i p i |e i e i | is the spectral decomposition of ρ. Since all terms of this sum are non-negative, the sum itself can be assigned a well-defined value, possibly +∞. 5 To define it one considers the infinite sum D(ρ σ ) . .= i, j e i | f j 2 p i log p i − p i log q j + q j − p i , where ρ = i p i |e i e i | and σ = j q j | f j f j | are the spectral decompositions of ρ and σ , respectively. As detailed in [48], the convexity of x → x log x implies that all terms of this sum are non-negative, which makes the expression well defined.
are omitted for notational simplicity. With the same notation, one can also define the energy-constrained quantum capacity of N, given by [32,34,[53][54][55] Q H (N, E) . . = lim n→∞ 1 n Q (1) (13) where Tr H is the partial trace over the entirely arbitrary ancillary Hilbert space H. In this paper we are interested in the simple case of m modes. In this case, we will omit the subscripts and simply write the energyconstrained capacities as C (N, E) and Q (N, E).

Phase space formalism
We define the displacement operator D(z) associated with a complex vector z ∈ C m as Thus, D(z) is a unitary operator and satisfies D(z) † = D(−z) and valid for all z, w ∈ C m . Let H quad = j,k X jk a † j a k + Y jk a j a k + Y * jk a † j a † k , where X = X † is an m × m Hermitian matrix, and Y = Y is an m × m complex symmetric matrix. The unitaries e −i H quad generated by such Hamiltonians, and products thereof, 6 are called symplectic unitaries, because they induce a symplectic linear transformation at the phase space level (z R , z I ) ∈ R 2m , where z R . . = z and z I . . = z [56,57]. A symplectic unitary is called passive if it commutes with the number operator j a † j a j , which happens whenever the generating Hamiltonian H quad satisfies Y = 0. A passive symplectic unitary V acts on annihilation operators as V a j V † = k U jk a k , where U is an m × m unitary matrix.
For trace class operators T ∈ T 1 (H m ), the quantum characteristic function χ T : C m → C is given by Conversely, the operator T can be reconstructed from χ T via the weakly defined identity Observe that the adjoint T † of T satisfies χ T † (z) = χ T (−z) * for all z ∈ C m , so that T is self-adjoint if and only if χ T (−z) ≡ χ T (z) * . The characteristic function χ T of a trace class operator T is bounded and uniformly continuous [58, § 5.4]. If T is positive semi-definite (e.g. if T is a density operator), then max α |χ T (α)| = χ T (0) = Tr [T ].
We write |ψ f to denote the pure state corresponding to the wave function f ∈ L 2 (R m ), so that the corresponding rank-one state ψ f . . = |ψ f ψ f | has the following characteristic function: where as usual z = z R + i z I . The Fourier transform of the characteristic function is known as the Wigner function. For a trace class operator T , the Wigner function is given by [59,Eq. (4.5.12) and (4.5.19)] Observe that where T 1 = Tr |T | reduces to 1 when T is a density operator. By taking the Fourier transform of (19), one can show that Moreover, the energy of any density matrix, ρ, can be obtained as a phase space integral The displacement operator D(z) induces a translation or displacement of the Wigner function as follows, hence the nomenclature: The map T → χ T , defined for trace class operators T in (17), extends uniquely to an isomorphism between the space of Hilbert-Schmidt operators and that of squareintegrable functions L 2 (C m ). In fact, the quantum Plancherel theorem guarantees that this is also an isometry, namely and therefore Henceforth, we refer to (26) as the quantum Plancherel identity.
Gaussian states on H m are the density operators ρ ∈ D(H m ) such that W ρ (z) is a Gaussian probability distribution on the real space (z R , z I ) ∈ R 2m and are uniquely defined by their first and second moments. A particularly simple example of a singlemode Gaussian state is a thermal state with mean photon number N ∈ [0, ∞), given by The thermal state is the maximiser of the entropy among all states with a fixed maximum average energy: for all N ≥ 0, where the function g is defined by The characteristic function and Wigner function of the thermal state evaluate to [59,Eq. (4.4.21) and (4.5.31)] respectively, so that τ N is easily seen to be a centred Gaussian state.

Definition 1 (Standard Moments
). An m-mode quantum state ρ is said to have finite standard moments of order up to k, for some k ∈ [0, ∞), if where H m is the canonical Hamiltonian (14), and the above trace is defined as in (1).
Remark. The above condition is fairly easy to check once the matrix representation of ρ in the Fock basis is given. Namely, resorting to (1) and exchanging the order of summation for infinite series with non-negative terms, we see that (31) is equivalent to where as usual |n| = j n j .
Given k > 0 and m ∈ N, we can also define, by analogy with classical harmonic analysis, the m-mode bosonic Sobolev space of order k as follows where as usual H m = L 2 (R m ). Here, we set 1 with the canonical Hamiltonian on m modes being defined by (14). For density operators ρ it holds, using monotone convergence and cyclicity of the trace, that It is well known that the characteristic function of any classical random variable with finite moments of order up to k (with k being a positive integer) is continuously differentiable k times everywhere. We can draw inspiration from this fact to devise an alternative way to introduce moments, relying on the regularity of the quantum characteristic function, in the quantum setting as well. We refer to moments defined in this manner as phase space moments.

Definition 2 ( Phase space moments).
An m-mode quantum state ρ is said to have finite phase space moments of order up to k, for some k for some ε > 0, where B(0, ε) . . = {z ∈ C m : |z| ≤ ε} is the Euclidean ball of radius ε centred in 0, and the norm on the space C k (B(0, ε)) is defined by (5) and (6).
In complete analogy with the classical case, finiteness of standard moments implies local differentiability of the characteristic function, and hence finiteness of phase space moments. See Theorem 9 of Sect. 4.
However, the converse is not true in general. This is not surprising, as the same phenomenon is observed for classical random variables. In fact, a famous example by Zygmund [60] shows the existence of classical random variables with continuously differentiable characteristic function whose first absolute moments do not exist. We can swiftly carry over his example to the quantum realm, e.g. by considering a particular displaced vacuum state ρ . . = c ∞ n=2 1 n 2 log n D(n) |0 0| D(n) † + D(n) † |0 0| D(n) . One can show that its characteristic function is cos(2nz I ) n 2 log n , which turns out to be continuously differentiable everywhere [60]. However, which implies that ρ has no finite first-order moments (see Lemma 24). In spite of the above counterexample, we show in Theorem 28 that at least if k is an even integer, then the existence of k th order phase space moment implies the existence of the k th order standard moment. Again, this is in total analogy with the classical case [61, Theorem 1. 8.16]. Remark. Due to the above, for even k, we simply use the word moment in the statements of our theorems, instead of differentiating between standard moments and phase space moments.

Quantum convolution.
A beam splitter with transmissivity λ ∈ [0, 1] acting on two sets of m modes is a particular type of a passive symplectic unitary, which we express as 7 U where a j and b j ( j = 1, . . . , m) are the creation operators of the first and second sets of modes, respectively. Its action on annihilation operators can be represented as follows Accordingly, displacement operators are transformed by The beam splitter unitary can be used to define the following (λ-dependent) quantum convolution: for two m-mode quantum states ρ, σ and λ ∈ [0, 1], their (λ-dependent) quantum convolution is given by the state ρ λ σ which is defined according to [62] as In terms of characteristic functions, this definition corresponds to It is not difficult to verify that for all symplectic unitaries V and all λ ∈ [0, 1], the beam splitter unitary U λ of (34) satisfies [V ⊗ V, U λ ] = 0. In particular, for any state σ . Also, using (35) it can be shown that the mean photon number of a quantum convolution is just the convex combination of those of the input states, i.e.
where the canonical Hamiltonian is defined by (14). For all m-mode quantum states σ and all λ ∈ [0, 1], we can use the corresponding convolution to define a quantum channel N σ,λ : T 1 (H 1 ) → T 1 (H 1 ), whose action is given by When σ = τ N is a thermal state (with mean photon number N ), the channel N τ N ,λ = . . E N ,λ is called a thermal attenuator channel. Its action, obtained by combining (38) and (30), is given by For the thermal attenuator channel, the energy-constrained classical capacity (defined in (12)) can be shown to reduce to can be shown to be given by [37,38] where g is given by (29).
In what follows, we will be interested in the symmetric quantum convolutions ρ 1 · · · ρ n , iteratively defined for a positive integer n and states ρ 1 , . . . , ρ n , by the relations ρ σ . . = ρ 1/2 σ and ρ 1 · · · ρ n . . = (ρ 1 · · · ρ n−1 ) 1−1/n ρ n . (44) We will also use the shorthand In terms of characteristic and Wigner functions, we can also write Here, denotes convolution, which is defined for n functions f 1 , . . . , (46) shows that the quantum characteristic function of the symmetric quantum convolution satisfies the same scaling property as a sum of classical i.i.d. (independent and identically distributed) random variables. The important special case ρ i ≡ ρ of (46) for all i ∈ {i, 2, . . . , n}, on which we will focus most of our efforts, reads Iterating (39), using (44), shows that holds for all symplectic unitaries V .

Cushen and Hudson's Quantum Central Limit Theorem
In [2], Cushen and Hudson proved the following quantum mechanical analogue of the central limit theorem, which is the starting point of our study.
where B(H m ) is the set of bounded operators on H m .
Remark. The state ρ G is commonly called the Gaussification of ρ.
In fact, the proof of Theorem 3 relies on the equivalence between weak convergence of states and pointwise convergence of their characteristic functions. More precisely, the following holds:

Lemma 4 ([27, Lemma 4.3] and [28, Lemma 4]).
Let (ρ n ) n∈N be a sequence of density operators on H m . The following are equivalent: • (ρ n ) n∈N converges to a density operator in the weak operator topology, namely, it holds that lim n→∞ x|ρ n |y = x|ρ|y for all |x , |y ∈ H m ; • (ρ n ) n∈N converges in trace distance to a trace class operator; • the sequence (χ ρ n ) n∈N of characteristic functions converges pointwise to a function that is continuous at 0.
Together, the above lemma and Theorem 3 allow us to conclude the following seemingly stronger convergence: Under the assumptions of Theorem 3, we have that

Main Results
The main objective of this paper is to refine Theorem 5 of the previous section in the following directions: • First, in the case in which the state ρ satisfies the conditions of the Cushen-Husdon theorem, we provide quantitative bounds on the rate at which the sequence of states (ρ n ) n∈N converges to ρ G , under the assumption of finiteness of certain phase space moments of ρ. We also show how finiteness of phase space moments is implied by finiteness of the corresponding standard moments, the latter having the advantage of being a more easily verifiable condition. Moreover, we show that finiteness of even integer phase space moments implies finiteness of even integer standard moments (Sect. 4.1). • Secondly, we provide an example to show that the assumption that the second moments be finite in the Cushen-Hudson theorem cannot be weakened (Sect. 4.2).
• Thirdly, we extend our results to the non-i.i.d. setting, i.e. we consider a scaling in the quantum convolution different from (44). This allows us to analyse the propagation of states through cascades of beam splitters with varying transmissivities (Sect. 4.3). • Finally, we provide a precise asymptotic analysis of the behaviour of quantum characteristic functions at zero and at infinity (Sect. 4.4).

Quantitative bounds in the QCLT.
In this section, we state our results on rates of convergence in the Cushen-Hudson quantum central limit theorem. We call them quantum Berry-Esseen theorems, as is customary in the literature. Our first theorem provides convergence rates O n −1/2 in the quantum central limit theorem under a fourth-order moment condition. The rate of convergence is boosted to O n −1 if the third derivative of the characteristic function at zero vanishes: Theorem 6 (Quantum Berry-Esseen theorem; High regularity). Let ρ be a centred mmode quantum state with finite fourth-order phase space moments. Then, the convergence in the quantum central limit theorem in Hilbert-Schmidt norm satisfies Here, M 4 = M 4 (ρ, ε) is the moment defined in (33), and ε > 0 is sufficiently small. Moreover, if D 3 χ ρ (0) = 0 then the convergence is at least with rate O M 4 n −1 .
The proof of Theorem 6 is provided in Sect. 6. In the next Theorem, we weaken the assumption on the moments of the state ρ, which leads to a slower rate of convergence.

The convergence in the quantum central limit theorem in Hilbert-Schmidt norm is given by
Here, M 2+α = M 2+α (ρ, ε) is the phase space moment defined in (33), and ε > 0 is sufficiently small.
The proof of Theorem 7 is provided in Sect. 6. The variable α allows us to obtain a convergence rate under the assumption of finiteness of phase space moments of order all the way down to 2 (excluded), which is the assumption required in the Cushen-Hudson QCLT. The above results can further be used to find convergence rates in other, statistically more relevant, distance measures: Corollary 8 (Convergence in trace distance and relative entropy). Assume that an mmode quantum state ρ has finite third-order phase space moments. Then, (33), and ε > 0 is sufficiently small. The above rates are replaced by O M 2+α n −α/(2m+2) when ρ only satisfies the conditions of Theorem 7.
The proof of this Corollary is given in Sect. 6.
Remark. (Condition on the existence of moments). The error bounds in Theorems 6 and 7 are stated in terms of assumptions on the phase space moments M k given by (33), of the state. It is possible to bound the phase space moments M k directly in terms of the standard moments M k defined in (31). This is stated in the following Theorem, whose proof is given in Appendices A-C Theorem 9. Let k ∈ [0, ∞), m a positive integer, and ε > 0 be given. Then every mmode quantum state with finite standard moments of order up to k also has finite phase space moments of the same order. More precisely, there is a constant c k,m (ε) < ∞ such that Conversely, if the characteristic function is 2k times totally differentiable at z = 0 for some integer k, then the 2k th standard moment is finite as well.
The importance of Theorem 9 for us comes from the fact that most of our proofs rest upon local differentiability properties of the characteristic function. While mathematically useful, such properties have no direct physical meaning and may be hard to verify in practice. Instead, the condition of finiteness of higher-order standard moments, as given in Definition 1, bears a straightforward physical meaning, related to the properties of the photon number distribution of the state, and is often easier to verify.
The key to proving Theorem 9 for fractional k lies in an interpolation argument. To state it precisely, we briefly recall some basic facts about real interpolation theory (see [63] for more details): given two Banach spaces (X 0 , . X 0 ) and (X 1 , . X 1 ), and a parameter 0 ≤ θ ≤ 1, define the K -function as follows: and derive from this the function θ (K (X )) = sup t>0 t −θ K (t, X ). The real interpolation spaces, parametrised by θ ∈ (0, 1), are then defined as Now, given two couples of Banach spaces X 0 , X 1 and Y 0 , Y 1 , and a map :

bounded and:
: We want to apply this to the map ρ → χ ρ .
The following interpolation result for density operators then holds: form a compatible couple such that for any mmode quantum state ρ and θ ∈ (0, 1) the real interpolation norm satisfies The proof of Proposition 10 is stated in Appendix B.

Optimality of convergence rates and necessity of finite second moments in the QCLT.
The results stated in the previous section lead naturally to the following questions: (i) Can the assumption of finiteness of second moments in the Cushen-Hudson theorem be weakened?
(ii) Are the convergence rates of Theorems 6 and 7 and Corollary 8 optimal? We start by answering the first question in the negative: there exists a state with finite moments of all orders 2(1 − δ) (for δ > 0) for which neither Theorem 3 nor Theorem 5 holds.
the sequence |ψ f ψ f | ⊕n n does not converge to any quantum state. Hence, the assumption of finiteness of second moments in the Cushen-Hudson QCLT (Theorems 3 and 5) cannot be weakened.
The proof of the above proposition is given in Sect. 7. We now come to the second question (ii) regarding tightness of the estimates in Theorems 6 and 7 and Corollary 8. In Sect. 7 below, we study several explicit examples and provide convincing numerical evidence that our estimates are indeed tight, at least as far as the Hilbert-Schmidt convergence rates are concerned. Our findings are summarised as follows.
• Next, we focus on the second estimate of Theorem 6, and show that it is also tight.
Namely, we compute the differences ψ n − ψ G ζ for the simple case of a singlephoton state ψ = |1 1| and for ζ = 1, 2, and find numerical evidence that again ψ n − ψ G ζ ∼ c n −1 for some absolute constant c (Example 4 and Fig. 4). This shows that the O(n −1 ) convergence rate stated in Theorem 6, under the assumption that D 3 χ ρ (0) = 0, is also attained.

Applications to capacity of cascades of beam splitters with non-Gaussian environment.
We now discuss applications of our results to the study of channels that arise naturally in the analysis of lossy optical fibres. We model a physical fibre of overall transmissivity λ as a cascade of n beam splitters, in each of which the signal state ω is mixed via an elementary beam splitter of transmissivity λ 1/n with a fixed state ρ, modelling the environmental noise (Fig. 2). Each step corresponds to the action of the channel N ρ, λ 1/n : ω → ω λ 1/n ρ (cf. the definition (41)), so that the whole cascade can be represented by the n-fold composition N n ρ, λ 1/n . Note that this is in general a non-Gaussian channel, albeit it is Gaussian dilatable [28,64]. We are interested in the asymptotic expression of the output state N n ρ, λ 1/n (ω) as the number n tends to infinity, as a function of the input state ω. In other words, we want to study the asymptotic channel lim n→∞ N n ρ, λ 1/n . At this point, it should not come as a surprise that such a channel exists and coincides with N ρ G , λ .
Before we see why, let us justify why the above model may be relevant to applications. The recently flourishing field of integrated quantum photonics sets as its goal that of implementing universal quantum computation on miniaturised optical chips [29,30,65,66]. A quantum channel that runs across such a circuit is susceptible to noise generated by other active elements of the same circuit, e.g. single-photon sources. While we expect such noise to be far from thermal, it may become so in the limit n → ∞ of many interactions. In a regime where n is finite, albeit large, our setting will thus be the appropriate one. The forthcoming Corollary 13 allows us to study the classical and quantum capacity of the effective channel in such a regime.
Let us note in passing that the cascade architecture we are investigating now, in spite of some apparent resemblance, is different from that depicted in Fig. 1b. While we regard the former as more operationally motivated, the latter is mathematically convenient, as the transmissivities are tuned in such a way as to yield the symmetric convolution ρ n at the output.
Theorem 12 (Approximation of thermal attenuators channels by cascades of beam splitters). Let ρ be a centred m-mode quantum state with finite third-order phase space moments M 3 , cf. (33), and denote by ρ G its Gaussification. Then, where · stands for the diamond norm (11).
One can further make use of the recently derived continuity bounds under input energy constraints [33][34][35][36] in order to find bounds on capacities of the cascade channel N n ρ, λ 1/n in the physically relevant case where the Gaussification ρ G of ρ is a thermal state. 8 Corollary 13. Consider a single-mode quantum state ρ with finite third-order phase space moments M 3 (cf. (33)) and thermal Gaussification ρ G = τ N as in (27). Then, for λ ∈ [0, 1], mean photon number N . . = Tr ρ a † a < ∞, and some input energy E > 0, the energy-constrained classical and quantum capacity of the cascade channel N n ρ, λ 1/n relative to the canonical Hamiltonian a † a satisfy (55) and (29)), and Q E N ,λ , E is the quantum capacity of the thermal attenuator. 9 The remainder terms are such that for some constant C = C(M 3 ) and all sufficiently large n ≥ n 0 λE The proofs of Theorem 12 and Corollary 13 are postponed to Sect. 8. 8 This amounts to assuming that ρ can be brought to its so-called Williamson form (see (63) of Sect. 6) by a passive symplectic unitary only. 9 An analytical formula for this quantity is currently not known. We report the best lower [39,45] and upper [40][41][42][43][44] bounds known to date in (105)-(106) and (107)-(109), respectively. These results can be used together with (56) to find bounds on Q N n ρ, λ 1/n , E .

New results on quantum characteristic functions.
In this subsection we state our refined asymptotic analysis of the decay of quantum characteristic functions that we employ in the proofs of our main theorems. For arbitrary quantum states, we have the following asymptotic result on the quantum characteristic function at infinity. It states that the quantum characteristic function can, in absolute value, only attain the value one at zero and decays to zero at infinity. Both these properties do not hold for general classical random variables, see Sect. 5.2.

Proposition 14.
The quantum characteristic function of an m-mode quantum state ρ is a continuous function that is arbitrarily small in absolute value outside of a sufficiently large compact set, i.e. χ ρ belongs to the Banach space C 0 (C m ) of asymptotically vanishing functions. Moreover, for any ε > 0 we have where B(0, ε) . . = {z ∈ C m : |z| ≤ ε} denotes a Euclidean ball of radius ε centred at the origin.
The proof of Proposition 14 is given in Sect. 5.2. Interestingly, we can obtain a much more refined asymptotic on the decay of quantum characteristic functions if we assume that the state has finite second order moments.
The proof of Proposition 15 is given in Sect. 5.2.

New Results on Quantum Characteristic Functions: Proofs
Quantum characteristic functions constitute a central tool in our approach. Therefore, the first step in our path towards the quantum Berry-Esseen theorems is to prove the results stated in Sect. 4.4. The structure of this section is as follows: • Quantum-classical correspondence: We derive a quantum-classical correspondence of the central limit theorems by showing that the quantum convolution of two arbitrary density operators naturally induces a classical random variable (Sect. 5.1). • Decay bounds: We derive new decay estimates and asymptotic properties of the quantum characteristic function at infinity (Sect. 5.2).

Quantum-Classical Correspondence.
In this section we show that the quantum convolution ρ σ of any two states ρ and σ has a non-negative Wigner function. While the mathematics behind this is known (see e.g. [ (37), with λ = 1/2, is given by where J . . = (−1) j a † j a j is the unitary and self-adjoint operator that implements a phase space inversion (in the sense of Eq. (61) below). In particular, Proof. We start by verifying that J actually corresponds to a phase space inversion, in the sense that for all m-mode quantum states ρ and all z ∈ C m . This follows from the easily verified fact that J a j J = −a j for all j, which also implies that J D(z)J = D(−z). In fact, using (21) we find that We now compute In 1, we use the convolution property for the Wigner function in (47),where in 2 we just write out the convolution of several functions as in (48). In 3 we then first flip phase space variables according to (61) and use the displacement operator in 4 to translate them by √ 2z, cf (24). Finally, in 5 we use the quantum Plancherel identity (25) to transform the integral over Wigner functions in a trace over density operators. (24) The above equalities are labelled by the equation numbers corresponding to the identities that justify them.
We proceed by showing how the above result bridges the gap between classical and quantum central limit theorems. We now fix an m-mode quantum state ρ, and notice that ρ 2n = (ρ ρ) n . Consider the probability density function f X . . = W ρ ρ ≥ 0, where positivity holds by (60). Let X be a random variable with density f X . The mean and covariance matrix of X coincide with those of ρ ρ, which are in turn the same as those of ρ. Hence, at the level of Gaussifications, f G = W ρ G . We write for an i.i.d. family of random variables X i with law f X where 1 follows from (47) and 2 follows from the change of variables u → √ nu. This implies by applying the classical and quantum Plancherel identities (26) that which shows that the QCLT is equivalent to a certain CLT for classical i.i.d. random variables. The problem with this approach is that the right classical tool to use here would be an estimate on the rate of convergence of (X 1 + · · · + X n )/ √ n to the normal variable X G with respect to the L 2 norm. However, it is known that convergence fails to hold in general, and even under some finiteness of moments assumption there does not seem to be a readily available result in the literature, that is powerful enough to be successfully employed here. Therefore, we do not pursue this route further here.

Decay estimates on the quantum characteristic function.
Before studying the rate of convergence in the quantum central limit theorem, we show that quantum characteristic functions have the so-called strict non-lattice property. To motivate this property, we start by recalling some basic properties of characteristic functions from classical probability theory.
The characteristic function χ cl X of a classical random variable X always attains the value one at zero. However, it can also attain the value one, in absolute value, at any other point. The random variables that exhibit this latter behaviour are precisely those that are lattice-distributed; 10 see also [69,Section 3.5]. Examples include the Dirac, Bernoulli, geometric and Poisson distributions.
Knowing that χ cl X (t) < 1 for all values t = 0 however does not imply that lim sup t→∞ χ cl X (t) < 1. This latter condition is known as the strict non-lattice property of a random variable. An example of a non-lattice distributed random variable which does not satisfy the strict non-lattice property is as follows.
Example 1 ([69, Section 3.5]). Consider an enumeration of the positive rationals q 1 , q 2 , . . . ∈ Q + with q i ≤ i and a non-lattice random variable X defined by The random variable X is then given by which simplifies to Let q i = p i r i where p i ∈ Z and r i ∈ N 0 , by considering times t n = 2π n i=1 r i for arbitrarily large n, one has lim sup t→∞ χ cl X (t) = 1. We now show the surprising fact that quantum characteristic functions do not exhibit this somewhat pathological behaviour. Instead, for any quantum state ρ it holds that lim sup |z|→∞ χ ρ (z) = 0, as the proof of Proposition 14 below shows.
Proof of Proposition 14. Thanks to the spectral theorem and by the dominated convergence theorem, it suffices to prove that lim |z|→∞ χ ψ f (z) = 0 for all wave function f ∈ L 2 (R m ), where ψ f . . = |ψ f ψ f |, and |ψ f is the pure state with wave function f . We rephrase this as the requirement that χ ψ f belongs to the Banach space C 0 (C m ), where the norm on C 0 (C m ) is the supremum norm.
We consider smooth compactly supported functions f first. For such functions, the claim follows by combining (i) Eq. (19); (ii) the fact that f is normalised, i.e.
d m x| f (x)| 2 = 1; and (iii) the Riemann-Lebesgue lemma. For general f ∈ L 2 (R m ), the result then follows by a density argument: for an arbitrary f ∈ L 2 (R m ) there is a sequence of smooth and compactly supported functions Since C 0 (C m ) is a Banach space and χ ψ fn ∈ C 0 (C m ), this implies that also the limit χ ψ f ∈ C 0 (C m ). Thus, to complete the proof of (58) it suffices to show that for every ε > 0 and any z ∈ C m \B(0, ε) one has that χ ψ f (z) < 1. If this were not the case, then |ψ f would be an eigenvector of the displacement operator D(z). This is well known to be impossible, see e.g. [28,Lemma 10].
For a given state ρ and some fixed ε > 0, Proposition 14 tells us that there exists a constant η(ρ, ε) < 1 such that max z∈C m \B(0,ε) χ ρ (z) ≤ η(ρ, ε) (cf. (58)). However, the problem of characterising the quantity η(ρ, ε) in terms of some physically meaningful property of the state ρ remains. To this end, a natural candidate turns out to be the energy of the state. To see why this is the case, consider the following simple example.
Example 2 (Squeezed states). For every z ∈ C m and every δ ∈ (0, 1) there is a (Gaussian) state ρ G of mean photon number Tr [ρ G H m ] ≤ t 2 8 ln 1 To see that this is the case, up to the application of passive symplectic unitaries, it suffices to consider the case z = (t, 0, . . . , 0), where t > 0. Consider the 'squeezed' Gaussian state [70][71][72] defined by the characteristic function where we set η . . = min 2 t 2 ln 1 1−δ , 1 > 0. The mean photon number of ρ G is well known to be given by Tr where we used the fact that η ≤ 1.
The above example shows that any estimate on η(ρ, ε) can be reasonably expected to depend on the energy. We now show that our preliminary work on the quantumclassical correspondence allows us to derive a general upper estimate for |χ ρ (z)| at any designated point z ∈ C m in terms of the energy of the state ρ. For this purpose, we draw upon some important mathematical results from the well-developed theory of classical characteristic functions. Proposition 15, whose proof we present now, implies e.g. that for a one-mode state ρ, we can take η(ρ, ε) = 1− c E min ε 2 , C E , where E is the energy of ρ, and c, C are universal constants.
Proof of Proposition 15. Denoting as usual with |z| the Euclidean norm (4) of z ∈ C m , we write the following chain of inequalities.
Here, 1 is an application of the quantum convolution rule (cf. the n = 2 case of (49)). In 2 we introduced the classical random vector X (ρ ρ) taking values in C m , with probability distribution given by the Wigner function W ρ ρ , which is everywhere nonnegative by Lemma 16. The inequality in 3, which is the non-trivial one, follows from [61, Corollary 2.7.2]: we set a . . = sup z∈C m W ρ ρ (z) ≤ 2 m π m , with the latter estimate coming from (21), and α = 2, so that γ α = γ 2 = d m z |z| 2 W ρ ρ (z) (23) = Tr (ρ ρ) H m + m 2 I = Tr ρ H m + m 2 I = . . E ; also, we substituted m → 2m, because our phase space C m has real dimension 2m; finally, we used the well-known formula (m + 1/2) = √ π 2 −m (2m − 1)!!, where (·)!! is the bi-factorial. Lastly, the inequality in 4 is just an application of the elementary estimate Remark. In [61, Section 2.7], several other estimates for χ cl X (t) are derived. While we decided to stick to the simplest one, as it is already very instructive, it is possible to substantially improve over it, e.g. by resorting to non-isotropic estimates (cf. for instance [61, Theorem 2.7.14]). Notably, our quantum-classical correspondence allows us to translate all of these inequalities to the quantum setting, up to an irrelevant factor of 1/2 in the associated constants (see step 4 in the above proof). We do not pursue this approach further, though we want to stress that it immediately leads to a plethora of further results.

Quantitative Bounds in the QCLT: Proofs
In this section, we provide proofs of the convergence rates in our quantum Berry-Esseen theorems. We also provide proofs of some of the statements in Sect. 4.3 on the convergence rate for cascades of beam splitters converging to thermal attenuator channels. Outline of this section:. To fix ideas, we give a high-level outline of our proofs: • Williamson form: We apply a suitable symplectic unitary to the state, so as to make the Hessian of its characteristic function diagonal and larger than the identity. Subsequently, we use the quantum Plancherel identity to express the difference of the convolved state and its Gaussification in Hilbert-Schmidt norm as a difference of quantum characteristic functions in L 2 norm (Sect. 6.1). • Local-tail decomposition: We then split the integral of the L 2 norm of the difference of the quantum characteristic functions of the convolved state and the Gaussification of the original state into a regime around zero (Lemma 17), in which we can control the behaviour of the quantum characteristic function by its Taylor expansion, and a tail-regime in which we estimate the difference using Proposition 14. The error in the Taylor expansion is controlled by the phase space moments of the state, cf. Lemma 18. • Hilbert-Schmidt convergence: We implement the above ideas to prove Theorems 6 and 7, and Proposition 22 (Sect. 6.2). • Trace norm and entropic convergence: We then use the preservation of the boundedness of the second moment under quantum convolutions to obtain a quantitative estimate of convergence in trace distance, employing Markov's inequality and the Gentle Measurement Lemma [73], and in relative entropy, using entropic continuity bounds [33] satisfies With a slight abuse of terminology, we will call ρ the Williamson form of ρ [74]. Bringing a state to its Williamson form allows us to assume that (i) the smallest eigenvalue of its covariance matrix is at least one. Also, (ii) the transformation in (63) does not change the first moments of the state, so that if ρ is centred then ρ remains centred. Finally, (iii) the same unitary V brings not only ρ but also its Gaussification ρ G to their Williamson forms simultaneously, so that (65) holds as well. Thanks to the covariance of the quantum convolution with respect to symplectic unitaries (50), we see that Combining this with the quantum Plancherel identity (26) yields In short, when estimating any unitarily invariant distance of ρ n from its limit ρ G , we can assume without loss of generality that all states are in their Williamson forms. When the Hilbert-Schmidt norm is employed, we can compute the distance as an L 2 norm at the level of characteristic functions, or equivalently at that of Wigner functions.

Local-tail decomposition
We continue with an important technical lemma that reduces the convergence in the quantum central limit theorem to the behaviour of the quantum characteristic function around zero.

Lemma 17.
Let ρ be an m-mode quantum state with finite second-order phase space moment. Without loss of generality, we assume that ρ is centred and in Williamson form, and that its Gaussification ρ G has characteristic function as in (65). Then for every ε > 0 we have that as n → ∞. If ρ has also finite third-order phase space moments, then where the Fréchet derivative of χ ρ is defined by (3).
Proof. The first identity (68) follows along the lines of the second one (69) and so we focus on verifying the latter. Using the quantum Plancherel identity (26) and the relation (46), we apply the triangle inequality and split the integration domain into two disjoint sets such that The last term on the rightmost side of (70) can be estimated explicitly using spherical coordinates. Namely, combining the fact that the coefficients appearing in the Williamson form satisfy ν j ≥ 1 with the bound D 3 χ ρ (0) z ×3 ≤ D 3 χ ρ (0) |z| 3 , we obtain that where we used that (N /2) for the volume of the (N − 1)-sphere. Furthermore, the second-to-last term in (70) can be shown to be exponentially small. In fact, where in 1 we use that (a + b) 2 ≤ 2(a 2 + b 2 ), in 2 we use that and in (3) we changed variables in the first integral to u := z √ n . Finally, in 4, we used that the L 2 norm of the characteristic function is at most one and switched to spherical coordinates to compute the second integral. In 5, instead, we estimated e −r 2 < e − ε 2 2 n e − r 2 2 for r > √ n ε. Note that the first addend goes to zero faster than any inverse power of n for n → ∞ by Proposition 14. The second decays exponentially, essentially because the integral is bounded in n (in fact, it tends to 0 as n → ∞). This concludes the proof.
The first term on the right-hand side of (69) features an explicit dependence on n, while the second decays faster than any inverse power of n. Therefore, all that is left to do is to estimate the third term, which can be done by looking at the behaviour of the characteristic function in a neighbourhood of the origin. The first step in this direction, rather unsurprisingly, involves a Taylor expansion of χ ρ around 0. In the subsequent lemma we record various important estimates of this sort, which will play a key role in the proofs of our quantum Berry-Esseen theorems.

Lemma 18.
For ε > 0 and k ∈ [0, ∞), let ρ be an m-mode state with finite phase space moments of order up to k (namely, with the notation of Definition 2, assume that M k (ρ, ε) < ∞). Then for all z ∈ C m with |z| ≤ √ n ε it holds that

In particular, if ρ is centred and in Williamson form,
depending on what phase space moments are finite. In (73), we assumed that α ∈ (0, 1).
The estimate in (71) follows immediately from using Hölder continuity of the derivative.

Proofs of convergence rates in Hilbert-Schmidt distance.
We start with the proof of Theorem 6 assuming fourth-order moments.
Proof of Theorem 6. By the discussion in Sect. 6.1.1, we can assume that ρ is in Williamson form, namely, that its characteristic function satisfies (64), with ν 1 , . . . , ν m ≥ 1. Since M 2 (ρ, ε) is monotonically non-decreasing in ε, for any fixed μ ∈ (0, 2) we can chose ε > 0 small enough so that for any z ∈ B . . = B 0, √ n ε it holds that Looking at (72), this implies that 2 1 − χ ρ z √ n ≤ μ. Now, for x ∈ C with |x| < 2 define the function where to deduce the last inequality we observed that |x| ≤ μ implies that |a(x)| ≤ a(μ). Then, thanks to (78) and (74), an application of the triangle inequality yields where for fixed m the constant C 1 depends only on M 3 (remember that M 2 ≤ M 3 by construction). Using again (78) but now in conjunction with (75), by a swift application of the triangle inequality we see that where for fixed m the constant C 2 depends only on M 4 (remember that M 2 ≤ M 4 by construction). We now estimate Here, 1 follows simply by the triangle inequality. In 2, we (i) observed that |e u − (1 + u)| ≤ |u| 2 e |u| ; (ii) operated the substitution u = log χ ρ z √ n n + 1 2 j ν j |z j | 2 ; (iii) noted that R x → x 2 e x is a monotonically increasing function; and (iv) used the factproved in (79) -that |u| ≤ C 1 |z| 3 √ n . Finally, in 3 we remembered that |z| ≤ √ n ε and assumed that ε > 0 is small enough so that εC 1 ≤ 1 4 . Now, since ν 1 , . . . , ν m ≥ 1, we can rephrase the above estimate as Upon integration, (82) naturally yields an upper bound for the second term on the righthand side of (69). We obtain that The justification of the above steps goes as follows: in 4 we switched to spherical coordinates; in 5 we performed the change of variables s . . = 1 2 r 2 ; in 6 we computed the gamma integrals, also remembering that vol S 2m−1 = 2π m (m−1)! ; finally, the constant C 3 introduced in 7 depends -for fixed m -only on M 4 (note that M 3 ≤ M 4 by construction). The proof of the first claim is completed once one inserts (83) into (69). In particular, if D 3 χ ρ (0) = 0 we see that the convergence rate is O M 4 n −1 . This proves also the second claim.
We continue with the proof of the low-regularity QCLT that assumes finiteness of phase space moments of order up to 2 + α, for some α ∈ (0, 1]. Proof of Theorem 7. We just deal with the case where α ∈ (0, 1). As above, we start by fixing μ ∈ (0, 2) and choosing a sufficiently small ε > 0 so that for any z ∈ B . . = B 0, √ n ε the inequality (76) holds. By a similar estimate as in (79), but now leveraging (73) instead of (74), we have that for any z ∈ B 0, √ n ε where the constant C 4 introduced in the last line depends only on M 2+α (note that M 2 ≤ M 2+α ). Here, in 1 we used the elementary estimate |e u − 1| ≤ |u|e |u| , together with the observation that the function R x → xe x is monotonically increasing. In 2 we used the fact that |z| ≤ √ n ε, and chose ε > 0 sufficiently small so that ε α C 4 ≤ 1 4 . Combining the above estimate with the fact that ν 1 , . . . , ν m ≥ 1 yields which upon integration in turn leads to Here, in 3 we switched to spherical coordinates; in 4 we operated the change of variables s . . = 1 2 r 2 and computed the gamma integrals; the constant introduced in 5 depends, for fixed α, only on M 2+α . Inserting (86) into the right-hand side of (69) completes the proof.

Convergence in trace distance and relative entropy.
In this section, we further use the assumption of finiteness of the second moments of the state in order to find convergence rates in trace distance and in relative entropy.
Proof of Corollary 8. The hypothesis implies in particular that ρ has finite phase space moments of the second order. By Theorem 28, this amounts to saying that ρ has also finite standard moments of the second order, that is, that Tr [ρ H m ] ≤ E < ∞. Iterating (40) and passing to the limit, we see that in fact Now, for any E > 0, denote by P E the projection onto the finite dimensional subspace generated by the eigenvectors of the canonical Hamiltonian H m of eigenvalue less than E . Then, by Markov's inequality, for any ε > 0, From the so-called 'gentle measurement lemma' [73,Lemma 9], we have that Then, The result follows after optimising over ε > 0. In particular, if . We now turn to the proof of the convergence in relative entropy. Observe that, since ρ n and ρ G share the same first and second moments, Tr ρ n log ρ G = Tr ρ G log ρ G and thus D ρ n ρ G = S (ρ G ) − S ρ n . The result follows directly from [33, Lemma 18].

Optimality of Convergence Rates and Necessity of Finite Second Moments in the QCLT: Proofs
In this section we discuss the optimality of our results in two different directions:

Failure of convergence for states with unbounded energy.
We now show that the assumption of finiteness of second moments in Theorems 3 and 5 cannot be weakened, e.g. by replacing it with finiteness of some lower-order moments. Some examples of states with undefined moments that do not satisfy Theorems 3 and 5 can be obtained by drawing inspiration from probability theory. For instance, remembering that a classical Cauchy-distributed random variable does not satisfy the central limit theorem, we construct the following example. x+i . The characteristic function of this state can be computed thanks to (19), which in this case evaluates to The absolute value of this characteristic function is illustrated in Fig. 3. We then find the pointwise limit lim n→∞ χ |ψ f ψ f | z/ √ n n = δ z,0 which again is not continuous at 0 and hence is not the characteristic function of any quantum state. The main drawback of the above state is that it does not have even first order moments. We can fix this by considering a slightly more sophisticated example. To proceed further, we first need to recall a well-known integral representation of fractional matrix powers. ([46,Proposition 5.16]). For all r ∈ (0, 1), all positive (possibly unbounded) operators A, and all |ψ ∈ Dom A 1/2 , we have that

Lemma 19
where all functions of A are defined by means of its spectral decomposition.
Proof of Proposition 11. The state is clearly centred, for instance because the wave function is symmetric under inversion x → −x. We proceed to prove claim (b). Note that, since x 2 + p 2 = I + 2a † a ≥ I , 2aa † = x 2 + p 2 + I ≤ 2(x 2 + p 2 ), where p . . = −i d dx is the momentum operator. We now apply the operator inequality (A + B) r ≤ A r + B r , which can be shown to hold for all r ∈ [0, 1] and all positive (possibly unbounded) selfadjoint operators A, B. To prove this explicitly in the non-trivial case where r ∈ (0, 1), we apply (87) to A + B. For a generic |ψ ∈ Dom A r/2 ∩ Dom B r/2 , we obtain that where the inequality in the above derivation follows e.g. from [46,Corollary 10.13]. Now, setting A = x 2 , B = p 2 and r = 1 − δ, we obtain that Computing the expectation value on |ψ f yields where the last step is by explicit computation. This proves (b). We now move on to (c). For this we evaluate the characteristic function of the convolution |ψ f ψ f | n on the purely imaginary line. For t ∈ R, using (19) we obtain that were K 1 is a modified Bessel function of the second kind, and the last equality follows from (54) and [75,Eq. (9.6.25)]. Therefore, for any fixed t > 0 it holds that where we have used the expansion in [75, Eq. (9.6.53)] (see also [75,Eq. (6.3.2) and (9.6.7)]). Since χ |ψ f ψ f | n (0) = 1 for all n because |ψ f ψ f | n is a valid quantum state, the sequence of functions χ |ψ f ψ f | n does not possess a continuous limit. Hence, it cannot converge to the characteristic function of any quantum state. This proves (c).
In particular, every density operator satisfying the assumptions of Theorem 6 that is diagonal in the Fock basis achieves a O(n −1 ) rate.
Proof of Proposition 20. By Theorem 6 it suffices to show that D 3 χ ρ (0) = 0 under the assumptions of the Proposition. We start by recalling that any density operator ρ has an expansion into the Fock basis such that Hence, we find for the characteristic function that Using a finite-rank approximation of the density operator ρ, it suffices then by Theorem 9 to analyse the component-wise derivatives in (89). The functions χ |i j| are explicitly given by [59,Eq. (4.4.46) and (4.4.47)] Here, ! are the associated Laguerre polynomials. By assumption, it suffices to consider the case where |i − j| is even or |i − j| is odd and at least 5. We find that by writing the characteristic function in the form 2 H ji (z) for some suitable function H ji , as in (90), that for the different possible third derivatives, we have Therefore, the only possible non-zero contribution to the third derivative of the quantum characteristic function χ ρ at zero could be due to terms that contain either one or three derivatives of functions H ji evaluated at zero. If |i − j| ≥ 4 then z and z * appear in (90) with a joint power of at least 4; thus, this term's contribution necessarily has to vanish. It suffices therefore to consider the case where |i − j| = 2. If H ji is only differentiated once, then it is clear that this derivative has to vanish at zero, since z, z * appear with a joint power of at least two.
If H ji is differentiated three times, then the term |z| 2 causes the derivative to vanish at zero unless this term is differentiated precisely two times. This, however, implies that the Laguerre polynomial is differentiated exactly once. However, by the chain rule any first order derivative of the term L | j−i| j (|z| 2 ) vanishes at the origin. This concludes the proof.
The following example shows that the O(n −1 ) convergence rate stated in Proposition 20, under the assumption that D 3 χ ρ (0) = 0, is in fact attained.
Example 4 (O(n −1 )-rate). By Proposition 20 we can take ρ = |1 1| to obtain a convergence rate of at least O(n −1 ) in the QCLT. That the O(n −1 ) rate is actually attained is illustrated in the right figure in Fig. 4. The O(n −1 ) rate is saturated both in Hilbert-Schmidt and trace norm.
The following example shows that the O(n −1/2 ) convergence rate of Theorem 7 is attained.
The following example shows that the O(n −α/2 ) convergence rate of Theorem 7 is attained at least for α = 1/2. Example 6. Consider the probability density function p on R given by

Its Fourier transform reads
where K ν (z) is again the modified Bessel function of the second kind, and (91) follows from [75,Eq. (9.6.25)]. Define the single-mode quantum state is a so-called coherent state [76][77][78][79]. The characteristic function of ρ can be easily computed as which leads us to On the other hand, a little thought confirms that ρ has vanishing first moments and second moments given by Tr[ρx 2 ] = 9/2 and Tr[ρp 2 ] = 1/2. Its Gaussification then reads We also observe that: (a) ρ has finite standard moments of order up to 5/2 − δ, for all δ > 0; but (b) it has no well-defined phase space moments (nor standard moments) of order 5/2. To prove claim (a), start by setting β . . = 5/4 − δ/2. Assuming that δ ≤ 1/2 so that β ≥ 1, for all t ∈ R we have that where 1 is just the definition of coherent state, 2 comes from the concavity of the function x → x β−1 and from the fact that q n = t 2n e −t 2 n! is a probability distribution over N, and as claimed.
We now present numerical evidence hinting at the fact that ρ n − ρ G 2 = O(n −1/4 ) for our choice of ρ. Note that (92) The above integral can be evaluated numerically to a high degree of precision. Plotting the function − ln ρ n − ρ G 2 against ln n shows that ρ n − ρ G 2 decays as O(n −1/4 ), cf. Figure 5. By what we have learnt above, Theorem 7 predicts a convergence at least as fast as O(n −1/4+δ ) for every fixed δ > 0, and is therefore tight at least for α = 1/2.

Cascade of Beam Splitters: Proofs
In this section, we prove the results claimed in Sect. 4.3, namely convergence rates for cascades of beam splitters converging to thermal attenuator channels.
8.1. Generalities of the cascade channels. In order to study the convergence of the cascade channel, we start by proving the following elementary equivalence.

Proposition 22.
Let ρ be a centred m-mode quantum state with finite second-order phase space moments. Then the sequence of quantum states ρ(λ, n) defined via (94) converges to the Gaussification ρ G of ρ in trace norm. Moreover, if ρ has finite third-order phase space moments then Here, M 3 = M 3 (ρ, ε) is defined by (33), and ε > 0 is sufficiently small.
Proceeding as usual, we continue to estimate Note that in 4 we applied the elementary inequality |e u − 1| ≤ |u|e |u| , observed that R x → xe x is a monotonically increasing function, and leveraged the bound in (100).
Remembering that ν 1 , . . . , ν m ≥ 1, we can massage the above relation so as to get Now, we can repeat the steps that led to (68). We obtain that The justification of the above steps is as follows. The estimate in 6 is just an application of the triangle inequality. In 7 we used (101) and the elementary fact that |u + v| 2 ≤ 2|u| 2 +2|v| 2 on the second addend. As for 8, we: (i) performed the integral and introduced a constant C 7 that depends on m only on the first addend; (ii) decomposed χ ρ(λ,n) (z) = χ ρ (w 1 ) · n =2 χ ρ (w ) on the second; and (iii) used the fact that e − j ν j |z j | 2 ≤ e −|z| 2 < e − ε 2 2 n e − 1 2 |z| 2 in the prescribed range on the third. Finally, in 9 we noted that if |z| > √ n ε then eventually in n for all ∈ {1, . . . , n}; moreover, we used the fact that d 2m u |χ ρ (u)| 2 ≤ 1 to evaluate the integral in the second addend.
Since the second term in the rightmost side of (102) decays faster than any inverse power of n as n → ∞ thanks to Proposition 14, the proof of (96) is complete. Lastly, (97) follows similarly to Corollary 8.

Approximating cascade channels.
With this convergence at hand, we provide a quantitative bound on the approximation of thermal attenuator channels by cascades of beam splitters (with possibly non-Gaussian environment states). Recall that, to an environment state ρ one can associate an attenuator channel N ρ,λ (ω) . . = ω λ ρ of transmissivity 0 ≤ λ ≤ 1. The following simple lemma is crucial to convert the above state approximation result (Proposition 22) into a statement about approximations of attenuator channels.

Lemma 23.
Given any two m-mode quantum states ρ 1 and ρ 2 , and some λ ∈ [0, 1], the corresponding channels defined as in (41) satisfy Proof. Let R be any reference system, and let ω ∈ D(H AR ) be a state on the bipartite system AR. Then where the inequality stems from the monotonicity of trace distance under quantum channels.
With this lemma at hand, we are ready to prove Theorem 12.
Proof of Corollary 13. We now move on to Corollary 13. Let us start by proving the statement on quantum capacities, namely (56) and (57). Our aim is to apply [34, Theorem 9] to the two channels N n ρ, λ 1/n and N ρ G ,λ = E N ,λ , for the special case m = 1 (cf. (42)). We set where the energy-constrained diamond norm is defined with respect to the canonical Hamiltonian, namely the number operator H = a † a (see [36,Eq. (2)] and [34,Eq. (2)]). Note that ε n = O M 3 n −1/4 by Theorem 12.
The input-output energy relations can be easily determined for both channels thanks to (95) and (40), which together show that Tr ρ(λ, n) a † a = Tr ρ a † a = N = Tr τ N a † a . One obtains that This means that we can set α = λ and E 0 = (1 − λ)N , and hence E = λE + (1 − λ)N , in [34,Theorem 9]. We obtain that Here, in step 1 we applied [34,Theorem 9] together with the formula S(τ N ) = g(N ) (see (28) and (29)); the inequality in 2 holds eventually in n for some universal constant c ≤ 57 + 24 log e, as can be seen by combining the bounds g(x) ≤ log(x + 1) + log e (tight for large x) and g(x) ≤ −2x log x (valid for sufficiently small x); finally, in 3 we used the fact that ε n ≤ C(M 3 ) n −1/4 eventually in n by the already proven Theorem 12, together with the observation that x → −x log x is an increasing function for sufficiently small x > 0.
To complete the first part of the proof we need to estimate the classical capacity of N n ρ, λ 1/n in terms of that of the thermal attenuator E N ,λ of (42), in turn given by (43). Although we could use the estimates in [34], we prefer to resort to the tighter ones provided in [36]. We obtain that The inequality in 4 is an application of [36,Proposition 6]. To see why, let us re-write the result of Shirokov [36,Proposition 6] for one-mode channels and with respect to the canonical Hamiltonian as Here, N i (i = 1, 2) are two quantum channels with 1 (104) and [36,Eq. (21)]); choosing t = 1/2 and hence r ε (t) ≤ r 1 (t) = r 1 (1/2) = 5/2 yields the above relation 4, as claimed. The inequality in 5 holds for all sufficiently large n and for some absolute constant c ≤ 15. Finally, 6 is analogous to 3 above.
Remark. Let us stress that the threshold in n above which the inequalities in the above proof hold true depends on both λE + (1 − λ)N and M 3 (which dictates the rate of convergence of ε n → 0). Although this is a minor point from the point of view of the mathematical derivation, it may be important for applications.
Remark. An analytical formula for the quantum capacity of the thermal attenuator that appears in Corollary 13 is currently not known. The best lower bound to date reads [45,Eq. (9)] The best upper bound to date, instead, can be obtained by combining the results of [40, Eq. (23)- (25)] (see also [41,Section 8]) with those of [44,Theorem 9] and [43,Theorem 46], in turn derived by refining a technique introduced in [42]. We look at the case where λ ≥ N +1/2 N +1 , because below that value of λ the channel E N ,λ becomes 2-extendable [83] (that is, anti-degradable [84][85][86]) and therefore Q E N ,λ , E = 0.
F 1 (N , λ, E) . Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A: Standard Moments Versus Phase Space Moments: The Integer Case
In this appendix we prove that a state with finite standard moments of order up to k also has finite phase space moments of order up to k, i.e. Theorem 10. More precisely, we show that its characteristic function is differentiable k times, and that there are constants c k,m (ε) < ∞ such that the standard moments and phase space moments, defined by (31) and (33), respectively, satisfy M k (ρ, ε) ≤ c k,m (ε)M k (ρ) for all m-mode quantum states ρ. We start with the following lemma. Proof. First of all, it suffices to consider the one-mode case. Indeed, assume that (aa † ) k/2 ≥ d k (|x| k + | p| k ) for some d k > 0. Then, leveraging the fact that the operators a j a † j commute with each other, and employing standard inequalities between p-norms, we deduce that Therefore, from now on we look at the one-mode case only. The vector space V m . . = span{|n : n ∈ N 0 } of states with a finite expansion in the Fock basis is a core for both aa † k/2 , as well as |x| k and | p| k . Thus, it suffices for us to prove the inequality (A1) on states in V m . It is enough to show that (aa † ) k/2 ≥ d k |x| k for some constants d k > 0, as the other inequality (aa † ) k/2 ≥ d k | p| k is obtained by performing a phase space rotation of an angle π/2, i.e. by conjugating both sides by the unitary operator e i π 2 aa † .
We now prove that the inequality (aa † ) k/2 ≥ d k |x| k holds for some d k > 0 on all vectors in V m . Write k = 2rh, where r ∈ (0, 1] and h . . = k/2 ∈ N 0 . Since the function A → A r is well known to be operator monotone [46,Proposition 10.14], it suffices to show that (aa † ) h ≥ d h x 2h for all non-negative integers h ∈ N 0 . To this end, let us take advantage of our restriction to states with a finite expansion in the Fock basis. Defining N as the projector onto the span of the first N + 1 Fock states (from 0 to N ), we have to show that where the inequality now involves only matrices. Thanks to Gershgorin's circle theorem [87,88] Here, in 1 we extended the sum over n to all values that yield a non-vanishing result, i.e. those that satisfy |n − n | ≤ 2h. In 2 we used the canonical commutation relations (7) to expand In 3 we applied standard estimates for factorials: for example, when n ≤ n ≤ n + 2h we used the fact that n n ! n! ≤ n (n ) (n −n)/2 ≤ (n ) +(n −n)/2 ≤ (n ) 2h ≤ (n + 2h) 2h ; moreover, we defined F h . . = , ≥0, + ≤2h | f h, , | + | f h, , | . Finally, 4 follows by choosing e.g. d −1 h = (2h + 1) h F h . Since (A2) holds for all n, we conclude that A N ≥ 0 for all N , which completes the proof.
Remark. The inequality in Lemma 24 depends critically on the special properties of the canonical operators. In fact, there is no universal constant d k > 0 that makes the general relation (A+ B) k ≥ d k A k + B k true for all positive matrices A, B ≥ 0. To see why this is the case, it suffices to consider two pure states A = |ψ ψ| and B = |φ φ|. Setting λ . . = 1 − | ψ|φ | ∈ [0, 1], it can be shown that the minimal eigenvalue of (A + B) k is λ k , while that of A k + B k = A + B is clearly λ. By Weyl's principle, the conjectured matrix inequality would imply that λ k ≥ d k λ for all λ ∈ [0, 1], absurd. Proposition 25. Let k ≥ 0 and m ≥ 1 be integers; also, let ε > 0 be given. Then, there is a constant c k,m (ε) < ∞ such that every m-mode quantum state ρ with finite k-moments M k (ρ), as defined by (31), also satisfies M k (ρ, ε) = χ ρ C k (B(0,ε)) ≤ c k,m (ε)M k (ρ) .
In particular, according to (33), ρ has finite phase space moments of order up to k.
Proof. Let ρ be an m-mode quantum state. We start by considering the modified state σ . . = ρ |0 0| that is obtained by convolving it with the (multi-mode) vacuum state according to the rule (37) (for λ = 1/2). A first important observation is that the moments of σ and ρ are related. Namely, To see why, we pick a multi-index n ∈ N m 0 and evaluate the n th diagonal entries of σ with respect to the Fock basis. We obtain that n|σ |n Here, in 1 we introduced the dephasing operator in the Fock basis, whose action is given by (X ) . . = k∈N m 0 |k k| X |k k|. In 2 we observed that (ω δ) = (ω) δ for all m-mode quantum states ω whenever δ = (δ) is already diagonal in the Fock basis. To show this, first exploit linearity and factorisation of to reduce to the onemode case. Then, use the representation (X ) = 2π 0 dϕ 2π e iϕ a † a X e −iϕa † a , valid for bounded X and where the integrals are as usual weakly converging, and remember that e iϕ a † a+b † b = e iϕa † a ⊗ e iϕb † b is a function of the total Hamiltonian and thus commutes with the action of the beam splitter. The identity in 3 follows from the formula for the convolution of a Fock state with the vacuum. Here, , ∈ N m 0 are multi-indices, ordered entry-wise, and . . = j j j . The above expression can be obtained easily e.g. by first reducing to the one-mode case, and then by induction on , employing the relations (35). Computing the k th moment of σ then yields Here, 4 and 7 follow from the representation in (32); in 5 we rearranged a double series of non-negative terms, and in 6 we observed that for a given ∈ N m 0 the coefficients P (n) . . = 2 −| | n form a probability distribution over the set of multi-indices n ∈ N m 0 with n ≤ . This proves that the k th moments of σ are upper bounded by those of ρ.
The state σ is also useful because its characteristic function is a close relative of that of ρ. Namely, according to (38) we have that χ σ (z) = χ ρ (z/ √ 2) e − z 2 /4 , and hence for some constants g k,m (ε). Thus, it suffices to find a suitable upper estimate for the norm χ σ C k (B(0,ε)) . By Lemma 16, the Fourier transform of χ σ , i.e. the Wigner function W σ of σ , is everywhere non-negative. Hence, χ σ can be seen as the characteristic function of a classical random variable Z over C m , with probability density function W σ . If we show that Z has finite absolute moments of order k, then thanks to [61,Theorem 1.8.15] we deduce that χ σ is k-fold differentiable everywhere, and since (A5) for all multi-indices α, β ∈ N m 0 , we in fact have that χ σ C k (B(0,ε)) ≤ 1 + d 2m u u k W σ (u) = . . 1 + L k (σ ) .
Therefore, we now look at the quantity L k (σ ). For a vector u = u R + iu I ∈ C m , with u R , u I ∈ R m , we observe that Thus, In the above derivation, the identity in 8 can be verified by first reducing to the case of a pure σ , which can be done by linearity and by multiple applications of Tonelli's theorem, and by subsequently remembering that for a pure state |ψ f with wave function The inequality in 9 is just an application of Lemma 24. Finally, in 10 we introduced a suitable constant c k,m ≥ 1. Combining the above estimate with (A4), (A6), and (A3), we deduce that M k (ρ, ε) = χ ρ C k (B(0,ε)) ≤ g k,m (ε) χ σ C k (B(0,ε)) ≤ g k,m (ε)(1 + L k (σ )) ≤ 2 g k,m (ε)c k,m M k (σ ) ≤ 2 g k,m (ε)c k,m M k (ρ) = . . c k,m (ε)M k (ρ) , which concludes the proof.

Appendix B: Standard Moments Versus Phase Space Moments: The Fractional Case
In the last section, we showed that the k th phase space moment was controlled by the k th standard moment in the case of an integer constant k. Here, we show that this fact still holds when k is a positive real number by an interpolation argument. In principle, we could conclude this fact from the setting of Proposition 25, using that for L p (w 0 ) spaces with weight function w 0 and L p (w 1 ) spaces with weight function w 1 , the real interpolation spaces [63,Theorem 5.4.1] satisfy L p (w 0 ), L p (w 1 ) θ = L p (w θ ) where w θ := w 1−θ 0 w θ 1 . This would allow us to extend the estimate in (A5) to fractional powers as well. However, we want to establish the stronger result that shows that the moments themselves naturally induce an interpolating family of normed spaces. That is, we show the following: Proposition 26. Let ρ be an m-mode quantum state and k ≥ 0. If Tr ρ j a j a † j k/2 < ∞, then χ ρ C k (B(0,ε)) < ∞ for some ε > 0. Moreover, (B(0,ε)) ≤ C ε Tr ρ j a j a † j k/2 , for some constant C ε > 0.
The result follows from the interpolation bound (B1).

Appendix C: Standard Moments Versus Phase Space Moments: A Partial Converse
We now show that at least for even integers k, the existence of k th order phase space moments implies the existence of standard moments of the same order.

Theorem 28.
Let ρ be an m-mode quantum state such that its characteristic function χ ρ is 2k times totally differentiable at z = 0 for some integer k, then the 2k th standard moment is finite as well.
Proof. For simplicity, we restrict attention to m = 1. Let H + lin . . = a + a † and H + lin . . = (−i)(a − a † ) be two Hamiltonians, and consider the spectral decomposition of the density operator ρ = ∞ i=1 λ i |e i e i |. Then, there exist unique probability measures μ e i such that f (λ) dμ e i (λ) for all f bounded measurable.
We now proceed with an induction argument. Start by noting that for k = 0 the result holds. For k ≥ 1, define the auxiliary function ϕ : R → C as ϕ(t) . . = Tr ρ e it H ± lin , which is by assumption 2k times differentiable at zero and let u(t) = ϕ(t). Then, u is also 2k times differentiable at zero. Since ϕ 2k (0) exists, for t ∈ (−ε, ε), with sufficiently small ε > 0, the function t → ϕ (2k−1) (t) exists and is continuous. We record that Taylor's formula implies that for t ∈ (−ε, ε) where odd derivatives vanish at zero, since u is even. We then define a positive continuous function f k : R → [0, ∞) with f k (0) = 1 and for t = 0 as From Taylor's formula above we obtain the following estimate for t sufficiently small Using integration by parts and standard estimates only, it is straightfroward to verify that the finiteness of both Tr ρ H 2k lin ± implies the finiteness of Tr ρ(aa † ) k .