On the Correlation Between Nodal and Nonzero Level Sets for Random Spherical Harmonics

We study the correlation between the nodal length of random spherical harmonics and the length of a nonzero level set. We show that the correlation is asymptotically zero, while the partial correlation after removing the effect of the random $L^2$-norm of the eigenfunctions is asymptotically one.


Random Spherical Harmonics
On the unit two-dimensional sphere $\mathbb{S}^2$, let us consider the Helmholtz equation
$$\Delta_{\mathbb{S}^2} f_\ell + \lambda_\ell f_\ell = 0,$$
where $\Delta_{\mathbb{S}^2}$ is the Laplace-Beltrami operator on $\mathbb{S}^2$ in spherical coordinates $(\theta, \varphi)$ and $\{\lambda_\ell = \ell(\ell+1)\}_{\ell\in\mathbb{N}}$ represents the set of eigenvalues of $-\Delta_{\mathbb{S}^2}$. For any $\lambda_\ell$, the corresponding eigenspace is the $(2\ell+1)$-dimensional space of spherical harmonics of degree $\ell$; we can choose the standard $L^2$-orthonormal basis made of the spherical harmonics $\{Y_{\ell,m}\}_{m=-\ell,\dots,\ell}$ [13, §3.4] and focus, for $\ell\in\mathbb{N}^*$, on random eigenfunctions of the form
$$f_\ell(x) = \sqrt{\frac{4\pi}{2\ell+1}} \sum_{m=-\ell}^{\ell} a_{\ell,m} Y_{\ell,m}(x), \qquad x\in\mathbb{S}^2. \tag{1.1}$$
Here the coefficients $\{a_{\ell,m}\}_{m=-\ell,\dots,\ell}$ are random variables (defined on some probability space $(\Omega, \mathcal{F}, \mathbb{P})$ that we fix once and for all) such that $a_{\ell,0}$ is a real standard Gaussian, and for $m \neq 0$ the $a_{\ell,m}$'s are standard complex Gaussians (independent of $a_{\ell,0}$) and independent save for the relation $\overline{a_{\ell,m}} = (-1)^m a_{\ell,-m}$, which ensures that $f_\ell$ is real-valued. It is immediate to see that the law of the process $f_\ell$ is invariant with respect to the choice of an $L^2$-orthonormal basis of eigenfunctions.
For every $\ell$, the random eigenfunction $f_\ell$ is centred, Gaussian and isotropic; from the addition theorem for spherical harmonics [13, (3.42)], the covariance function is given by
$$\mathbb{E}[f_\ell(x) f_\ell(y)] = P_\ell(\cos d(x,y)), \qquad x, y\in\mathbb{S}^2, \tag{1.2}$$
where $P_\ell$ is the $\ell$th Legendre polynomial, defined by Rodrigues' formula as
$$P_\ell(t) := \frac{1}{2^\ell\, \ell!} \frac{d^\ell}{dt^\ell} (t^2-1)^\ell, \qquad t\in[-1,1],$$
and $d(x,y)$ stands for the spherical geodesic distance between the points $x$ and $y$. As discussed elsewhere (see e.g. [3,4,7]), random spherical harmonics arise naturally from the spectral analysis of isotropic Gaussian fields on the sphere or in the investigation of quantum chaos (see for instance [13,29] for reviews).
Remark 1.1. We may assume $f_\ell$ and $f_{\ell'}$ in (1.1) to be independent random fields whenever $\ell \neq \ell'$; this is equivalent to assuming that $a_{\ell,m}$ and $a_{\ell',m'}$ are independent random variables for every $m, m'$ whenever $\ell \neq \ell'$.
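As a small numerical illustration of the covariance function (1.2), the following sketch evaluates the Legendre polynomial through the standard three-term (Bonnet) recurrence, which is a substitute for Rodrigues' formula chosen here for convenience. The function names `legendre` and `covariance` and the sample arguments are hypothetical and serve only as an illustration.

```python
import math

def legendre(ell, t):
    """Evaluate the Legendre polynomial P_ell(t) by Bonnet's recurrence:
    (k + 1) P_{k+1}(t) = (2k + 1) t P_k(t) - k P_{k-1}(t)."""
    if ell == 0:
        return 1.0
    p_prev, p_curr = 1.0, t
    for k in range(1, ell):
        p_prev, p_curr = p_curr, ((2 * k + 1) * t * p_curr - k * p_prev) / (k + 1)
    return p_curr

def covariance(ell, distance):
    """Covariance of f_ell at two points at geodesic distance `distance`,
    i.e. P_ell(cos d(x, y)) as in (1.2)."""
    return legendre(ell, math.cos(distance))

# Unit variance: the covariance at distance zero is P_ell(1) = 1.
variance_at_zero = covariance(10, 0.0)
# Antipodal points: P_ell(-1) = (-1)^ell, so f_ell is even or odd under x -> -x.
antipodal = covariance(3, math.pi)
# Rodrigues' formula gives P_2(t) = (3 t^2 - 1) / 2, used here as a cross-check.
p2_check = legendre(2, 0.5)
```

The recurrence is numerically stable on $[-1,1]$, which is why it is preferred in this sketch over differentiating $(t^2-1)^\ell$ directly.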
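In this paper, we shall focus on excursion sets of $f_\ell$ in (1.1), defined as $A_u(f_\ell) := \{x\in\mathbb{S}^2 : f_\ell(x) \ge u\}$ for $u\in\mathbb{R}$. The boundary of $A_u(f_\ell)$, i.e. the level set
$$f_\ell^{-1}(u) := \{x\in\mathbb{S}^2 : f_\ell(x) = u\},$$
is an a.s. smooth curve whose connected components are homeomorphic to the circle. Let us define, for $u\in\mathbb{R}$, the random variable
$$\mathcal{L}_\ell(u) := \operatorname{length}\big(f_\ell^{-1}(u)\big). \tag{1.3}$$
If $u = 0$, the quantity $\mathcal{L}_\ell(0)$ is known as the nodal length. For $u \neq 0$, we will often refer to $\mathcal{L}_\ell(u)$ in (1.3) as the "boundary length", used interchangeably with "length of level sets". In this manuscript, we shall investigate in particular the sequence of random variables $\{\mathcal{L}_\ell(u)\}_{\ell\in\mathbb{N}^*}$.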

Nodal Length.
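The nodal length $\{\mathcal{L}_\ell(0)\}_{\ell\in\mathbb{N}^*}$ has been the object of an enormous amount of activity, see e.g. [2,19,26,28]. In particular, according to the celebrated Yau's conjecture [31], which has been proved for real analytic manifolds by Donnelly and Fefferman, see [9], and more recently for smooth manifolds by [12] (the lower bound), there exist two constants $c, C > 0$ s.t. for any realization $f_\ell$ one has
$$c\sqrt{\lambda_\ell} \le \mathcal{L}_\ell(0) \le C\sqrt{\lambda_\ell}$$
for every $\ell\in\mathbb{N}^*$. For the "typical" eigenfunction, fluctuations are indeed of smaller order; more precisely, tighter bounds can be given in a probabilistic sense. In fact, under Gaussianity the expected value of $\mathcal{L}_\ell(0)$ is easily computed to be (for instance, by the Gaussian Kinematic Formula [1], see also [2])
$$\mathbb{E}[\mathcal{L}_\ell(0)] = 2\pi\sqrt{\frac{\lambda_\ell}{2}}.$$
The computation of the variance is much more challenging and was solved by [28], where it is shown that, as $\ell\to\infty$,
$$\operatorname{Var}(\mathcal{L}_\ell(0)) = \frac{\log \ell}{32} + O(1). \tag{1.4}$$
More recently, [19] provided a stronger characterization of the fluctuations of the nodal length around its expected value. More precisely, they were able to establish the asymptotic equivalence (in the $L^2(\Omega,\mathcal{F},\mathbb{P}) =: L^2(\mathbb{P})$-sense) of the nodal length and the so-called sample trispectrum of $f_\ell$, i.e. the integral over the sphere of $H_4(f_\ell)$, the fourth-order Hermite polynomial evaluated at the field itself (we recall that $H_4(t) = t^4 - 6t^2 + 3$). Indeed, let us define first the sequence of random variables
$$M_\ell := -\frac{1}{4}\sqrt{\frac{\ell(\ell+1)}{2}}\,\frac{1}{4!}\int_{\mathbb{S}^2} H_4(f_\ell(x))\,dx; \tag{1.5}$$
thanks to the properties of Hermite and Legendre polynomials, it is easy to check that [14,16] $\mathbb{E}[M_\ell] = 0$ and $\operatorname{Var}(M_\ell) = \frac{\log\ell}{32} + O(1)$. It is shown in [19] that, as $\ell\to\infty$,
$$\mathbb{E}\left[\left(\frac{\mathcal{L}_\ell(0) - \mathbb{E}[\mathcal{L}_\ell(0)]}{\sqrt{\operatorname{Var}(\mathcal{L}_\ell(0))}} - \frac{M_\ell}{\sqrt{\operatorname{Var}(M_\ell)}}\right)^2\right] \to 0. \tag{1.6}$$
Let us now turn to nonzero levels. We first recall that (see [23,28]), for $u\in\mathbb{R}$ and $\ell\in\mathbb{N}^*$,
$$\mathbb{E}[\mathcal{L}_\ell(u)] = 2\pi\sqrt{\frac{\lambda_\ell}{2}}\,e^{-u^2/2},$$
and, for $u \neq 0$, as $\ell\to+\infty$,
$$\operatorname{Var}(\mathcal{L}_\ell(u)) \sim \frac{\pi^2}{2}\,\ell\,u^4 e^{-u^2}. \tag{1.8}$$
Note that for a fixed $u \neq 0$ and in the regime $\ell\to\infty$, the variance of $\mathcal{L}_\ell(u)$ is much larger than in the case $u = 0$. Let us now consider the random variable (recall that $H_2(t) = t^2 - 1$ is the second Hermite polynomial)
$$D_\ell(u) := \frac{1}{4}\sqrt{\frac{\lambda_\ell}{2}}\,u^2 e^{-u^2/2}\int_{\mathbb{S}^2} H_2(f_\ell(x))\,dx. \tag{1.9}$$
Note that $\mathbb{E}[D_\ell(u)] = 0$; also, from (1.9) we easily find (cf.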
(1.8))
$$\operatorname{Var}(D_\ell(u)) = \pi^2\,\frac{\lambda_\ell}{2\ell+1}\,u^4 e^{-u^2}; \tag{1.10}$$
let us now consider, for $u \neq 0$, the standardized level set length, i.e.
$$\widetilde{\mathcal{L}}_\ell(u) := \frac{\mathcal{L}_\ell(u) - \mathbb{E}[\mathcal{L}_\ell(u)]}{\sqrt{\operatorname{Var}(\mathcal{L}_\ell(u))}},$$
and analogously $\widetilde{D}_\ell(u) := D_\ell(u)/\sqrt{\operatorname{Var}(D_\ell(u))}$. It was shown in [23] that, for $u \neq 0$, as $\ell\to+\infty$,
$$\mathbb{E}\left[\big(\widetilde{\mathcal{L}}_\ell(u) - \widetilde{D}_\ell(u)\big)^2\right] \to 0. \tag{1.11}$$
Remark 1.2. (i) As shown in [4], thanks to (1.11) the boundary length is asymptotically (as $\ell\to+\infty$) perfectly correlated (meaning that the squared correlation coefficients $\rho(\cdot,\cdot)$ converge to one) with other geometric functionals, such as the area [15,16] and the Euler-Poincaré characteristic [4] for the excursion regions $A_u(f_\ell) = \{x\in\mathbb{S}^2 : f_\ell(x) \ge u\}$; likewise, perfect correlation holds between these functionals and the number of critical points [5] for $f_\ell(x)$ in the same excursion region, for any non-zero threshold $u \neq 0$. In other words, at high values of $\ell$, knowledge of any of these quantities for a given sample yields the full information on the behaviour of all the others for the same sample, up to asymptotically negligible terms.
(ii) From (1.9), it is convenient to note that
$$\int_{\mathbb{S}^2} H_2(f_\ell(x))\,dx = \|f_\ell\|^2_{L^2(\mathbb{S}^2)} - 4\pi;$$
moreover, a simple application of Parseval's identity yields
$$\|f_\ell\|^2_{L^2(\mathbb{S}^2)} = 4\pi\,\widehat{C}_\ell,$$
where $\widehat{C}_\ell := (2\ell+1)^{-1}\sum_{m=-\ell}^{\ell} |a_{\ell,m}|^2$ is usually called the sample power spectrum. Roughly speaking, the boundary length is asymptotically proportional to the (random) squared norm of the eigenfunctions: from this point of view, it is natural to expect that this norm should not play a role in the behaviour of nodal lines, which are clearly invariant to scaling factors.
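The role of the sample power spectrum can be illustrated with a short simulation. The sketch below draws the coefficients as described for (1.1) (real standard Gaussian for $m = 0$, standard complex Gaussian for $m > 0$, with negative $m$ fixed by the reality constraint) and checks empirically that the sample power spectrum has mean one and variance $2/(2\ell+1)$. The variance formula is an elementary computation made here under those distributional assumptions and is not quoted from the paper; the function names, the seed and the sample sizes are hypothetical.

```python
import math
import random

def sample_power_spectrum(ell, rng):
    """Draw one realization of C_hat = (2 ell + 1)^{-1} sum_m |a_{ell,m}|^2.

    a_{ell,0} is real standard Gaussian; for m > 0, a_{ell,m} is standard
    complex Gaussian (real and imaginary parts have variance 1/2). The
    coefficients with m < 0 are determined by the reality constraint, so
    each |a_{ell,m}|^2 with m > 0 is counted twice."""
    total = rng.gauss(0.0, 1.0) ** 2
    sigma = math.sqrt(0.5)
    for _ in range(ell):
        re, im = rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)
        total += 2.0 * (re * re + im * im)
    return total / (2 * ell + 1)

def exact_variance(ell):
    """Var(C_hat) = (2 + 4 ell) / (2 ell + 1)^2 = 2 / (2 ell + 1)."""
    return 2.0 / (2 * ell + 1)

rng = random.Random(0)            # fixed seed, for reproducibility only
ell, n_samples = 20, 20000
draws = [sample_power_spectrum(ell, rng) for _ in range(n_samples)]
mean = sum(draws) / n_samples
var = sum((c - mean) ** 2 for c in draws) / (n_samples - 1)

# Squared L^2-norm of each simulated eigenfunction, by Parseval's identity.
squared_norms = [4.0 * math.pi * c for c in draws]
```

Since the simulated quantity is a normalized sum of $2\ell+1$ independent squared Gaussians, its fluctuations are of order $\ell^{-1/2}$, which is the scale that drives the second-chaos term of the boundary length.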
Note that (ii) in Remark 1.2 and (1.11) immediately give a CLT for the normalized boundary length.

Main Result
The main result in this paper is the characterization of the correlation structure between the lengths of the nodal and nonzero level sets. To do so, it is important to recall the standard distinction between the classical correlation coefficient between two (finite-variance) random variables $X, Y$, which of course is given by
$$\rho(X,Y) := \frac{\operatorname{Cov}(X,Y)}{\sqrt{\operatorname{Var}(X)\operatorname{Var}(Y)}},$$
and the Partial Correlation Coefficient (conditional on the finite-variance random variable $Z$)
$$\rho_Z(X,Y) := \rho(X^*, Y^*),$$
where $X^*, Y^*$ are the "residuals" after projecting $X, Y$ on the "explanatory variable" $Z$, i.e. for a finite-variance random variable $W$
$$W^* := (W - \mathbb{E}[W]) - \frac{\operatorname{Cov}(W,Z)}{\operatorname{Var}(Z)}\,(Z - \mathbb{E}[Z]). \tag{1.13}$$
As is well known, $\rho_Z(X,Y)$ admits a standard interpretation as a measure of the (linear) dependence between $X$ and $Y$ after removing the common components depending on $Z$. Our main result is obtained by taking $X, Y$ to be the boundary lengths at different levels and $Z = \|f_\ell\|^2_{L^2(\mathbb{S}^2)} = 4\pi\widehat{C}_\ell$ to be the random (squared) $L^2$-norm of the eigenfunctions $f_\ell$.
Theorem 1.3. For every $u \neq 0$, as $\ell\to+\infty$,
$$\rho\big(\mathcal{L}_\ell(0), \mathcal{L}_\ell(u)\big) \to 0; \tag{1.14}$$
on the other hand,
$$\rho_{\|f_\ell\|^2}\big(\mathcal{L}_\ell(0), \mathcal{L}_\ell(u)\big) \to 1. \tag{1.15}$$
Let us briefly explain the main ideas behind the proof of Theorem 1.3 (for further details see Sect. 2). First, we can interpret (1.6) and (1.11) as Taylor-type expansions in $L^2(\mathbb{P})$ up to order 4 and 2, respectively, of $\mathcal{L}_\ell(0)$ and of $\mathcal{L}_\ell(u)$ for $u \neq 0$; Equation (1.14) comes from the latter once we recall the orthogonality properties [13, Remark 4.10] of Hermite polynomials (see Sect. 2.1). Removing the effect on $\mathcal{L}_\ell(u)$ of the random $L^2$-norm of the eigenfunction makes (1.11) useless, so we need a further term in the Taylor-type expansion of the boundary length (Propositions 2.3 and 2.4). It turns out that the leading term of the residual $\mathcal{L}_\ell(u)^*$ is "proportional" to the r.v. $M_\ell$ in (1.5), entailing (1.15).
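The distinction between correlation and partial correlation can be made concrete with a toy simulation. The sketch below is not the spherical model: it uses two hypothetical Gaussian variables, a "nodal-like" variable `xs` and a "boundary-like" variable `ys` dominated by a common factor `zs`, purely to show how a correlation close to zero can coexist with a partial correlation close to one. All names, the weight 100 and the seed are assumptions made for illustration.

```python
import math
import random

def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def correlation(xs, ys):
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

def residual(ws, zs):
    """Sample analogue of (1.13): centre W and remove its linear projection on Z."""
    mw, mz = mean(ws), mean(zs)
    slope = covariance(ws, zs) / covariance(zs, zs)
    return [(w - mw) - slope * (z - mz) for w, z in zip(ws, zs)]

def partial_correlation(xs, ys, zs):
    """Correlation between the residuals of X and Y after projecting on Z."""
    return correlation(residual(xs, zs), residual(ys, zs))

rng = random.Random(1)                             # fixed seed, for reproducibility only
n = 5000
zs = [rng.gauss(0.0, 1.0) for _ in range(n)]       # dominant explanatory variable
ws = [rng.gauss(0.0, 1.0) for _ in range(n)]       # drawn independently of zs
xs = ws[:]                                         # toy "nodal-like" variable
ys = [100.0 * z + w for z, w in zip(zs, ws)]       # toy "boundary-like" variable

rho = correlation(xs, ys)
rho_partial = partial_correlation(xs, ys, zs)
```

In this toy setting the residuals of `xs` and `ys` coincide algebraically, so the partial correlation equals one up to rounding, while the plain correlation is small because `ys` is dominated by `zs`.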
As an easy corollary of Theorem 1.3, we obtain the following joint weak convergence result for the normalized boundary and nodal lengths. It suffices to recall (1.6) and (1.11), the orthogonality of Wiener chaoses (Sect. 2.1) and the joint convergence results in [22].
Corollary 1.4. For every $u \neq 0$, as $\ell\to+\infty$,
$$\big(\widetilde{\mathcal{L}}_\ell(0), \widetilde{\mathcal{L}}_\ell(u)\big) \xrightarrow{d} (Z_1, Z_2),$$
where $(Z_1, Z_2)$ denotes a bivariate vector of standard, independent Gaussian variables.
Corollary 1.4 states that the nodal and boundary lengths (at non-zero level) are asymptotically independent, in the limit of higher and higher eigenvalues. As motivated above, this result is substantially spurious, as it depends crucially on the dominant role played in the boundary length behaviour by the random norm of the eigenfunction. Taking this effect into account, the landscape changes entirely; indeed, consider the "regression residual" $\mathcal{L}_\ell(u)^*$, defined as in (1.13) with $W = \mathcal{L}_\ell(u)$ and $Z = \|f_\ell\|^2_{L^2(\mathbb{S}^2)}$, which we again normalize by taking $\widetilde{\mathcal{L}}^*_\ell(u) := \mathcal{L}_\ell(u)^*/\sqrt{\operatorname{Var}(\mathcal{L}_\ell(u)^*)}$.
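We then have the next, fully degenerate, convergence result.
Corollary 1.5. For every $u \neq 0$, as $\ell\to+\infty$,
$$\big(\widetilde{\mathcal{L}}_\ell(0), \widetilde{\mathcal{L}}^*_\ell(u)\big) \xrightarrow{d} (Z, Z),$$
where $Z$ denotes a standard Gaussian variable.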
In short, boundary (nonzero level) and nodal length are asymptotically independent, meaning also that the nodal length carries no information on other functionals such as the excursion area, the Euler-Poincaré characteristic or the number of critical points above a given threshold (see Remark 1.2). This result, however, must be interpreted with great care: it is due to the dominant role played by the sample norm in the behaviour of excursion sets. When this effect is properly subtracted, the behaviour of length fluctuations at any level is fully explained by the nodal length, in the high-energy limit, and the joint distributions are completely degenerate. Thus, indeed the nodal lengths are asymptotically sufficient (in the high-energy limit) to characterize the measure of the boundary at any threshold level, provided that the effect of random fluctuations in the norm is properly taken into account. We refer to [10] for some numerical evidence on these and related issues.
We also note that the driving rationale behind these results is the "long-range" behaviour of the covariance function (1.2), as derived from Hilb's asymptotics. We hence conjecture that similar results will hold for other models with a similar covariance structure; for instance, we expect the (partial) correlation to be asymptotically full between level curves of Berry's random wave model considered e.g. in [21,27]. Likewise, we expect very similar or analogous results to hold when considering random finite linear combinations of harmonics in neighbouring eigenspaces; this would correspond to a "high frequency" perturbation of the covariance function, with no impact on the asymptotic results.

Outline of the Paper and Proof of the Main Result
The main ingredient behind our proofs is a neat series representation (chaotic decomposition) of the length and a consequent careful investigation of its chaotic components. In particular, as briefly anticipated, the results recalled in Sects. 1.1.1 and 1.1.2 can be interpreted in terms of Wiener-Itô theory. Let us first recall the notion of Wiener chaos, restricting ourselves to our specific setting on the sphere (see [20, §2.2] and the references therein for a complete discussion).

Wiener Chaos
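Let us consider the sequence $\{H_k\}_{k\in\mathbb{N}}$ of Hermite polynomials on $\mathbb{R}$; these are defined as follows: $H_0 \equiv 1$ and
$$H_k(t) := (-1)^k \phi(t)^{-1} \frac{d^k}{dt^k}\phi(t), \qquad k \ge 1,$$
where $\phi(t) := (\sqrt{2\pi})^{-1}\exp(-t^2/2)$ denotes the standard Gaussian density on $\mathbb{R}$. The family $\mathbb{H} := \{H_k/\sqrt{k!}\}_{k\in\mathbb{N}}$ is a complete orthonormal system in the space of functions $L^2(\mathbb{R}, \mathcal{B}(\mathbb{R}), \phi(t)dt) =: L^2(\phi)$. Recall from Sect. 1.1 that the random variables $\{a_{\ell,m},\ m = -\ell,\dots,\ell,\ \ell\in\mathbb{N}^*\}$ are defined on the probability space $(\Omega, \mathcal{F}, \mathbb{P})$. Bearing in mind (1.1) and Remark 1.1, we define the space $\mathbf{X}$ to be the closure in $L^2(\mathbb{P}) := L^2(\Omega, \mathcal{F}, \mathbb{P})$ of all real finite linear combinations of random variables $\xi$ of the form $\xi = r\,a_{\ell,0}$ for some $r\in\mathbb{R}$ and $\ell\in\mathbb{N}^*$, or $\xi = z\,a_{\ell,m} + \overline{z}\,\overline{a_{\ell,m}}$ for $m \neq 0$ and some $z\in\mathbb{C}$; thus $\mathbf{X}$ is a real centred Gaussian Hilbert (closed) subspace of $L^2(\mathbb{P})$. Note that the same space is also generated by the basis elements $\{a_{\ell,0}, \Re(a_{\ell,m}), \Im(a_{\ell,m}),\ m = -\ell,\dots,\ell,\ \ell\in\mathbb{N}^*\}$.
Let $q \ge 0$ be an integer; we define the $q$th Wiener chaos $C_q$ associated with $\mathbf{X}$ as the closure in $L^2(\mathbb{P})$ of all real finite linear combinations of random variables of the type
$$H_{p_1}(\xi_1)\cdot H_{p_2}(\xi_2)\cdots H_{p_k}(\xi_k), \qquad k \ge 1,$$
where $p_1,\dots,p_k\in\mathbb{N}$ are such that $p_1 + \cdots + p_k = q$, and $(\xi_1,\dots,\xi_k)$ is a standard real Gaussian vector extracted from $\mathbf{X}$; more explicitly, the variables $\xi_i$, $i = 1,\dots,k$, are independent Gaussian with unit variance and zero mean. Note that, in particular, $C_0 = \mathbb{R}$. The orthonormality and completeness of $\mathbb{H}$ in $L^2(\phi)$, together with a standard monotone class argument [20, Theorem 2.2.4], imply that $C_q \perp C_m$ in $L^2(\mathbb{P})$ for every $q \neq m$, and
$$L^2(\Omega, \sigma(\mathbf{X}), \mathbb{P}) = \bigoplus_{q=0}^{\infty} C_q,$$
that is, every square integrable real-valued functional $F$ of $\mathbf{X}$ can be (uniquely) represented as a series, converging in $L^2(\mathbb{P})$, of the form
$$F = \sum_{q=0}^{\infty} \operatorname{proj}[F|q], \tag{2.1}$$
where $\operatorname{proj}[F|q]$ denotes the orthogonal projection of $F$ onto $C_q$ (in particular, $\operatorname{proj}[F|0] = \mathbb{E}[F]$).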
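The orthogonality of Hermite polynomials that underlies the chaos decomposition can be checked numerically. The sketch below assumes the probabilists' normalization, consistent with $H_2(t) = t^2 - 1$ and $H_4(t) = t^4 - 6t^2 + 3$ recalled in the text, and uses the standard recurrence $H_{j+1}(t) = tH_j(t) - jH_{j-1}(t)$ together with a plain trapezoidal rule. The function names and the quadrature parameters are hypothetical choices made for illustration.

```python
import math

def hermite(k, t):
    """Probabilists' Hermite polynomial H_k(t) via the recurrence
    H_{j+1}(t) = t H_j(t) - j H_{j-1}(t), with H_0 = 1 and H_1(t) = t."""
    if k == 0:
        return 1.0
    h_prev, h_curr = 1.0, t
    for j in range(1, k):
        h_prev, h_curr = h_curr, t * h_curr - j * h_prev
    return h_curr

def gaussian_inner_product(j, k, half_width=12.0, n_steps=24000):
    """Approximate E[H_j(Z) H_k(Z)] for a standard Gaussian Z by the
    trapezoidal rule on [-half_width, half_width]."""
    step = 2.0 * half_width / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        t = -half_width + i * step
        weight = 0.5 if i in (0, n_steps) else 1.0
        phi = math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)
        total += weight * hermite(j, t) * hermite(k, t) * phi
    return total * step

h2_at_2 = hermite(2, 2.0)                  # t^2 - 1 evaluated at t = 2
h4_at_2 = hermite(4, 2.0)                  # t^4 - 6 t^2 + 3 evaluated at t = 2
norm_h4 = gaussian_inner_product(4, 4)     # compare with 4! = 24
cross_24 = gaussian_inner_product(2, 4)    # compare with 0
```

The value of `norm_h4` explains the normalization by $\sqrt{k!}$ in the orthonormal family, and the vanishing of `cross_24` is the one-dimensional counterpart of the orthogonality between the second and fourth Wiener chaoses.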

Chaotic Expansions for Lengths
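The perimeter of the boundary of excursion sets on the sphere can be (formally) written as
$$\mathcal{L}_\ell(u) = \int_{\mathbb{S}^2} \delta_u(f_\ell(x))\,\|\nabla f_\ell(x)\|\,dx, \tag{2.2}$$
where $\delta_u$ denotes the Dirac mass in $u$, $\nabla f_\ell$ the gradient field and $\|\cdot\|$ the Euclidean norm in $\mathbb{R}^2$. Indeed, let us consider the $\varepsilon$-approximating random variable ($\varepsilon > 0$)
$$\mathcal{L}^\varepsilon_\ell(u) := \frac{1}{2\varepsilon}\int_{\mathbb{S}^2} 1_{[u-\varepsilon,\,u+\varepsilon]}(f_\ell(x))\,\|\nabla f_\ell(x)\|\,dx.$$
We have the following result, whose proof is similar to the one given in the nodal case (for details see [19, Appendix B]) and is hence omitted.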
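Lemma 2.1 justifies (2.2). By differentiating (1.1), it is easy to see that the random eigenfunctions $f_\ell$ and the components of $\nabla f_\ell$, viewed as collections of Gaussian random variables indexed by $x\in\mathbb{S}^2$, all lie in $\mathbf{X}$; hence $\mathcal{L}^\varepsilon_\ell(u), \mathcal{L}_\ell(u) \in L^2(\Omega, \sigma(\mathbf{X}), \mathbb{P})$.
From the chaotic expansion of $\mathcal{L}^\varepsilon_\ell(u)$, it is easy to obtain that of $\mathcal{L}_\ell(u)$ by letting $\varepsilon$ go to zero. In order to recall the chaotic expansion (2.1) of $\mathcal{L}^\varepsilon_\ell(u)$, we need the chaotic coefficients in (2.5) and (2.6) together with the normalized gradient $\widetilde{\nabla} := \nabla/\sqrt{\ell(\ell+1)/2}$. (The variance of each component of $\nabla f_\ell(x)$ is $\ell(\ell+1)/2$.) Note that for each $x\in\mathbb{S}^2$, the random variables $f_\ell(x)$, $\widetilde{\nabla} f_\ell(x)$ are independent, and the components of $\widetilde{\nabla} f_\ell(x)$ are independent as well.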

Proposition 2.2. The chaotic expansion (2.1) of the approximate length is
where we use spherical coordinates (colatitude θ, longitude ϕ) and for x = (θ x , ϕ x ) we are using the notation and the chaotic coefficients are as in (2.5) and (2.6). By letting ε → 0, we find (2.8) where the convergence of the above series is in L 2 (P).
Equations (2.7) and (2.8) are in some sense the major formulae of the article; all further proofs are obtained by estimating the various terms in these formulae. The proof of Proposition 2.2 is analogous to the one given in the nodal case (see [19, §2] and [23, Proposition 7.2.2]) and is hence omitted; in short, Eq. (2.7) follows from the evaluation of the projection coefficients for any level $u\in\mathbb{R}$, whereas in the previous references the computation was only considered for $u = 0$; Eq. (2.8) follows from Eq. (2.7) and Lemma 2.1.
Recall now from Sect. 1.1 that $\lambda_\ell = \ell(\ell+1)$. As will become clearer later, it suffices to deal with the first few terms of the series in (2.8). For every $u\in\mathbb{R}$, from (2.8), (2.6) and (2.5),
$$\operatorname{proj}[\mathcal{L}_\ell(u)|0] = \mathbb{E}[\mathcal{L}_\ell(u)], \qquad \operatorname{proj}[\mathcal{L}_\ell(u)|1] = 0,$$
since spherical harmonics have zero mean on the sphere, and thanks to Green's formula (see [25, §4])
$$\operatorname{proj}[\mathcal{L}_\ell(u)|2] = \frac{1}{4}\sqrt{\frac{\lambda_\ell}{2}}\,u^2 e^{-u^2/2}\int_{\mathbb{S}^2} H_2(f_\ell(x))\,dx, \tag{2.11}$$
that is, $D_\ell(u)$ (cf. (1.9)); notice that the right-hand side of (2.11) is identically equal to zero if and only if $u = 0$ (see also [19]). Obviously, if $u = 0$, the term $\operatorname{proj}[\mathcal{L}_\ell(0)|q]$ vanishes whenever $q$ is odd.
We are in a position to make some more comments, comparing the results recalled in Sects. 1.1.1 and 1.1.2 with the theory of chaotic decompositions exploited above: the leading term in the $L^2(\mathbb{P})$-expansion of the boundary length around its expected value is provided by its orthogonal projection on the second-order Wiener chaos (2.11), rather than the fourth as in the nodal case (see (1.5) and (1.6), and compare with Proposition 2.4); the asymptotic variance of the second chaos is of order $\ell$, as opposed to $\log\ell$ for the fourth-order chaos (see e.g. [16,19]), and hence, the variance of nonzero level curves is larger than in the nodal case (Berry's cancellation phenomenon, see [28]). Similar to the nodal length case (1.5), the projection (2.11) takes a very simple form, as the integral depends only on the (second power of the) random eigenfunction and not on its gradient.
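To get a feeling for the two variance scales just compared, one can tabulate them for a few values of $\ell$. The sketch below reads (1.10) as $\operatorname{Var}(D_\ell(u)) = \pi^2\lambda_\ell u^4 e^{-u^2}/(2\ell+1)$ and uses only the leading term $\log\ell/32$ of (1.4), ignoring the $O(1)$ remainder; both readings are assumptions about the intended formulas, and the function names and the chosen level $u = 1$ are hypothetical.

```python
import math

def var_second_chaos(ell, u):
    """Variance of the second-chaos term, read from (1.10) as
    pi^2 * lambda_ell / (2 ell + 1) * u^4 * exp(-u^2) (an assumed reading)."""
    lam = ell * (ell + 1)
    return math.pi ** 2 * lam / (2 * ell + 1) * u ** 4 * math.exp(-u * u)

def nodal_variance_leading_term(ell):
    """Leading term log(ell) / 32 of the nodal length variance, cf. (1.4);
    the O(1) remainder is deliberately ignored in this sketch."""
    return math.log(ell) / 32.0

degrees = (10, 100, 1000, 10000)
ratios = [var_second_chaos(ell, 1.0) / nodal_variance_leading_term(ell)
          for ell in degrees]

for ell, ratio in zip(degrees, ratios):
    # Print the degree, the second-chaos variance at u = 1, and the ratio.
    print(ell, var_second_chaos(ell, 1.0), ratio)

# Ratio to the linear asymptote (pi^2 / 2) * ell * u^4 * exp(-u^2) at ell = 1000.
asymptote_ratio = var_second_chaos(1000, 1.0) / (
    0.5 * math.pi ** 2 * 1000 * math.exp(-1.0))
```

The ratios grow with $\ell$, reflecting the linear versus logarithmic growth, while the second-chaos variance vanishes at $u = 0$, in line with the cancellation at the nodal level.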

Proof of the Main Result
In this paper, we complete the characterization of the chaos expansion for the boundary length of excursion sets, and indeed, we show the following results which are of independent interest.
Of course, the components in the series are orthogonal by construction, so the variance is just the sum of their variances; for $u = 0$, from Propositions 2.2 and 2.3, this gives an alternative proof of (1.4). Moreover we get, for $u \neq 0$, . Let us set for notational simplicity so that $M_\ell(0) = M_\ell$ in (1.5). As remarked earlier, the asymptotic variance of $\int_{\mathbb{S}^2} H_4(f_\ell(x))\,dx$ is known (see [16], Lemma 3.2); indeed, we have Note that Proposition 2.4 generalizes results in [19]; moreover, it implies that (2.14) Here we are also exploiting the fact that the terms $\operatorname{proj}[\mathcal{L}_\ell(u)|q]$ for $q \neq 4$ are orthogonal to $M_\ell(u)$, which is an element of the fourth-order chaos. Applying the Cauchy-Schwarz inequality to the l.h.s. of (2.14), thanks to the orthogonality of Wiener chaoses (Sect. 2.1), Proposition 2.3, (2.13) and (2.14), we have that, as $\ell\to+\infty$, The proofs of Propositions 2.3 and 2.4 are technical and will be given in the next sections. We are now ready to prove our main result.

Discussion
It can be instructive to compare the results in this paper with other recent characterizations of the asymptotic distribution of the nodal length of random eigenfunctions in the non-spherical case. We recall first that a (non-universal) non-central limit theorem for arithmetic random waves, i.e. Gaussian Laplacian eigenfunctions on the standard two-dimensional flat torus $\mathbb{T}^2 := \mathbb{R}^2/\mathbb{Z}^2$, was established in [18]. To obtain this result, analogously to our discussion above, the nodal length was decomposed into chaotic components (see Sect. 2.2). The expansions of the nodal length in the toroidal and spherical cases have both analogies and important differences. In both cases, the term corresponding to $q = 2$ disappears at $u = 0$, thus entailing that the variance becomes of lower order (the so-called Berry's cancellation phenomenon). Likewise, in both cases the nodal length is dominated by the fourth-order chaos; however, it is only in the spherical case that the fourth-order term admits an expression depending on the field only (and not on the gradient components).
Because of this, we do not expect that taking into account the random norm behaviour will be enough to establish full correlation between nodal length and boundary curves (it could be the case that a degeneracy occurs when a sufficient number of different levels is considered). Similar cancellation phenomena occur for other geometric functionals, including the excursion area and the Defect ([15-17,24], which cover any dimension $d \ge 2$), the Euler-Poincaré characteristic [4] and the zeros of complex arithmetic random waves [8]; quantitative central limit theorems have been given on the sphere in [4,15-17] for many of these statistics, in the high-energy limit where $\ell\to\infty$. On the torus, the asymptotic behaviour has been shown to be more complicated, because it is non-Gaussian and differs across different subsequences as the eigenvalue diverges (see [8,18]).

Proof of Proposition 2.3
Let us bear in mind Proposition 2.2. For q ≥ 0, q = 2 we set (3.1) These two functions can be viewed as the qth order component in the integrand of the chaos expansion given in the proposition; they are both polynomial functions in f and its first-order derivatives, and they depend on the parameter u (although we do not take this into account to simplify our notation, this does not affect the discussion to follow). Concerning second chaotic components, thanks to Green's formula (for details see again [23,25]), we can write proj[L ε (u)|2] = ( + 1) 2 (note that proj[L (u)|2] = D (u) in (1.9)). Let us hence set Ψ ε (x; 2) := ( + 1) 2 Ψ (x; 2) := ( + 1) 2 and finally where the second equality is just formal; by this we mean that the first series converges in L 2 (P) for every fixed x, while the second does not. Before we proceed, we need to introduce some more notation: let us fix x = (0, 0) to be the "north pole" and y(θ) = (0, θ) to be points on the meridian where ϕ = 0. We will split the proof of Proposition 2.3 into some lemmas.

Lemma 3.1. For C > 0 large enough
where the constant involved in the O-notation does not depend on .
Let us deal with the terms of the series on the r.h.s. of (3.1). that immediately concludes the proof.

Lemma 3.5.
which gives (3.8), the integral in the latter error term being convergent (recall in particular that a + b + c + d ≥ 3).
We are now ready to state and prove the results we need for Lemmas 3.2 and 3.3.

(3.11)
Proof. Lemma 3.5 entails that the asymptotic behaviour of the first four integrals (note the multiplicative factor ( + 1)/2) on the l.h.s. of (3.11) is given by (up to constants) where a, c, d ∈ {0, 1, 2} and a+ c+ d = 3. It remains to prove that, as → +∞, indeed, for the l.h.s. of (3.12), as → +∞, establishing (3.12). The last integral of (3.11) is simpler, actually by Lemma 3.5 The proof of Lemma 3.6 is hence complete. π/2 C/ 4 2 ( + 1) 2 (P (cos θ) cos θ − P (cos θ) sin 2 θ) (3.13) Proof. In order to investigate the first seven integrals on the l.h.s. of (3.13), we need to work with integrals of the type (cf. (3.14) There are three possibilities: (3.15) Plugging (3.15) into (3.14), we prove the first seven equality in (3.13). Concerning the last four integrals, it is simpler to deal with them and the argument is basically the same as the one used in the proof of the last equality in (3.11).

Proof of Lemma 3.4
This part is inspired by the proof of Lemma 3.5 in [8]. We will need the following estimate that is easy to check once recalling (3.10). As → +∞ max a,b,c,d≥0 a+b+c+d=5 2 ( + 1) (3.20) Proof of Lemma 3.4. This proof is similar to that of Lemma 3.1 in Appendix, for further details see the proof therein. For any C > 0, 2b, c, 2a , 2b , c ) being the sum of no more than q! terms of the form 2 ( + 1) ×(P (cos θ) cos θ − P (cos θ) sin 2 θ) m3 P (cos θ) m4 sin θ dθ, (3.22) where m 1 , . . . , m 4 ≥ 0 and m 1 + m 2 + m 3 + m 4 = q. Now for some 0 < δ < 1 and C large enough (see the proof of Lemma 3.1 in Appendix), Therefore, we have  The series on the r.h.s. of (3.23) being convergent, (3.20) applied in (3.23) allows to conclude the proof.

Proof of Proposition 2.4
Recall that L := + 1 2 , and set We will need the following key result (inspired by Proposition 3.1 in [19]) whose proof is in Appendix. and, for C < ψ < L π 2 , where a(u) and b(u) are two (explicit) constants that depend on u. Recall (3.4). Note that +∞ q=3 Ψ ε (·; q) and H 4 (f (·)) are both in L 2 (S 2 × Ω) and they are isotropic (as we will see just below, we need isotropy to pass from a double integral over the sphere to an integral over the geodesic meridian), and thus, The integrand E [Ψ ε (x; 4)H 4 (f (y(θ)))] can be computed explicitly, and it is easily seen to be absolutely bounded for fixed , uniformly over ε, see Lemma 4.1. Hence, by Lebesgue theorem we may exchange the limit and the integral, and we have that Performing the change of variable ψ = Lθ, we can now write  For the first summand in (4.4), we have, thanks to (4.1),

A. Proof of Lemma 3.1
In the following proof, three important points are made: the mollification, the splitting of the integral and the control of the series in order to exchange series and integral.
Proof. From Proposition 2.2, we can write (recall in particular (3.3) and (3.4)), for every ε > 0, where for the last equality we applied Fubini theorem, indeed recall (3.2) and (3.3) and that for every Thus, we have for (A.1), from the definition of Ψ ε in (3.4) and then by isotropy (that will allow us to pass from a double integral over the sphere to an integral over the geodesic meridian) and usual symmetry arguments [30], Let us split the integral on the r.h.s. of (A.2) into two terms (C > 0 is an absolute constant) We will separately investigate the two terms on the r.h.s. of (A.3). For the first one, we can write where for x, y ∈ S 2 , K ε (x, y) := E[Ψ ε (x)Ψ ε (y)] is the ε-approximation of the so-called two-point correlation function (see [30]): for x = y The second term on the r.h.s. of (A.3) is more delicate to deal with (we will show that we can exchange integral and series), as follows: ×H c (f (y(θ)))H 2a ( ∂ 1;x f (y(θ)))H 2b ( ∂ 2;x f (y(θ))) sin θ dθ ≤ ( + 1) where m 1 , . . . , m 4 ≥ 0 and m 1 + m 2 + m 3 + m 4 = q. Now let δ > 0 such that √ 1 − δ < 1 5 , then for large enough C, we have that the absolute value of each factor of the integrand in (3.22) is less than 1 − δ, see i.e. the expressions for P , P , P which are proved in [3], Lemma B3 and reported in [19], Appendix A. Hence, we can write from (A.6), taking into account (A.7), Indeed, note that the map (a, b, c) → α 2 2a,2b β ε c (u) 2 (2a)!(2b)!c! is bounded uniformly over ε: see (2.6) and recall that there exists C > 0 s.t. for every k ∈ N and u ∈ R (see e.g. [11,Proposition 3]) immediately implying (see the definition of β ε · in (2.5)) that for every c ∈ N and ε > 0 β ε c (u) 2 c! ≤ C.
We have just proved that which is what we were looking for.

B. Proof of Lemma 4.1
The projection of the boundary length on the fourth-order chaos is proj[L (u)|4] =