A geometric heat-flow theory of Lagrangian coherent structures

We define Lagrangian coherent structures (LCSs) as maximal material subsets whose advective evolution is maximally persistent to weak diffusion. For their detection, we first transform the Eulerian Fokker--Planck equation (FPE) into a diffusion-only equation in Lagrangian coordinates. In this framework, LCSs express themselves as almost-invariant sets under this diffusion process. The Eulerian-to-Lagrangian coordinate transformation separates the reversible effects of advection from the irreversible joint effects of advection and diffusion. We approximate the Lagrangian FPE in two steps: first, we take the time-average of the diffusion tensors and identify Froyland's dynamic Laplacian as the associated generator; second, we introduce a deformed Riemannian geometry that is consistent with the averaged anisotropic diffusion. The latter turns the diffusion equation into a geometric heat equation, where the distribution of heat is governed by the dynamically induced intrinsic geometry on the material manifold, to which we refer as the geometry of mixing. We study and visualize this geometry in detail, and discuss the connections to diffusion barriers, i.e., boundaries of LCSs. We demonstrate the consistency with the geometric analysis in several numerical examples. Our approach facilitates the discovery of strong mathematical connections between several prominent methods for coherent structure detection, including the probabilistic transfer operator, the dynamic Laplacian, the variational geometric approaches to elliptic and parabolic LCSs and the effective diffusivity approach.


Introduction
Understanding the distribution of physical quantities by advection-diffusion is of fundamental importance in many scientific disciplines, including turbulent (geophysical) fluid dynamics and molecular dynamics. Of particular interest are coherent structures, for which there exist many phenomenological descriptions, visual diagnostics and mathematical approaches; see [40] for a recent review. In fluid dynamics, coherent structures are often thought of as rotating islands of particles with regular motion, which move in an otherwise turbulent background [60,25,69,44]. In molecular dynamics, coherent (or almost-invariant) structures are thought of as conformations, i.e., sets of configurations of the molecule which are stable on time scales much larger than those of molecular oscillations [72,73].
In the last years there has been an explosion of coherent structure detection methods based on flow information. Relying on flow information appears to be a necessary step in nonautonomous/unsteady velocity fields, since instantaneous velocity snapshots and their streamlines are no longer conclusive for material motion as they are in the autonomous/steady case. Nevertheless, the appearance of these methods is very different at first sight: in the category of variational approaches some methods require preservation of boundary length [44], minimization of mixing under the flow [35,27] or surface-to-volume ratio [28,29]. A different class of methods considers averages of observables along trajectories [62,11,58,64,45] and seeks coherent structures as sets with similar statistics. Recent clustering approaches [34,41,6] assess coherence based on mutual trajectory distances.
Comparison studies of methods within this category have been restricted exclusively to simulation case studies [3,57,40]. In this work, we discover for the first time strong mathematical similarities between the variational geometric methods developed by Haller and co-workers on the one hand [43,44,24], and variational transfer-operator-based methods developed by Froyland and co-workers on the other hand [32,27,33]; see Figure 1 and the references therein. Notably, we manage to relate our work and the methods mentioned in Figure 1 to earlier approaches developed in the geophysical fluid dynamics community, namely Nakamura's effective diffusivity framework [65,75].
Even though the appearance of many of the above-mentioned approaches is different, often the underlying idea is-more or less explicit-that coherent structures are expected to be maximal material sets which are the least vulnerable to (weak) diffusion, often modeled through some requirement on boundary deformation under the flow. This intuition is our starting point, and leads us to the Fokker-Planck equation (FPE) in Lagrangian coordinates, which is of diffusion-only type; see also [67,50,78,36] for earlier related approaches. In the Lagrangian frame, we view Lagrangian coherent structures (LCSs) as metastable/almost-invariant sets under the Lagrangian FPE. It turns out that the deformation by advection is equivalent to a deformation of the geometry of the (initial) material manifold, i.e., the flow domain. This change of perspective from space (Eulerian coordinates) to material (Lagrangian coordinates) solves, by the way, the longstanding problem in physical oceanography of separating the reversible effects of pure advection from the irreversible effects of advection and diffusion acting together, see, for instance, [ 65,75]. Time-averaging of the Lagrangian FPE yields an autonomous diffusion-type equation, whose generator is Froyland's recently introduced dynamic Laplacian [28,31]. Froyland's approach is motivated by a dynamic analogue to the isoperimetry problem, i.e., the optimal bisection of a manifold, where optimality is measured with respect to the ratio between the area of the bisection surface and the volume of the smaller of the two parts. Our independent and physical advection-diffusion derivation of the dynamic Laplacians [28,31] establishes a link to Markov processes and their metastable decomposition of state space [15,21]. This, in turn, provides a transparent framework for the detection of multiple coherent structures through spectral analysis. Looking at the Lagrangian averaged diffusion-type equation generated by the dynamic Laplacian, the natural question arises whether this diffusion is related to some intrinsic Riemannian geometry on the material manifold. Our main contribution is to derive such a geometry, which we interpret as the geometry of mixing; cf. also [38,79] for the use of this terminology, however, not in an averaging sense. The self-adjoint Laplace-Beltrami operator induced by the geometry of mixing can be directly investigated in detail by methods from semigroup and operator theory [15,16], Riemannian and spectral geometry [13,53] and visualization. The technical requirements on the flow and the original material manifold consist of smoothness alone. Similar to Froyland's dynamic Laplacian [28,31], one nice and important feature of our Laplace-Beltrami operator is its inherent self-adjointness. Additionally, since working in a classic Riemannian geometry setting, we benefit from the rich intuition about the role of eigenfunctions gathered in applied and computational harmonic analysis, and probably most strongly in the diffusion map methodology [14,23]. Equipped with this knowledge, we are able to interpret the Laplace-Beltrami eigenfunctions in the light of Lagrangian coherent structure detection much more transparent than is the current state-of-the-art in operator-based LCS approaches.
While our theory as presented in the current paper is of continuous type and theoretically requires arbitrary fine dynamic information, it is strongly related to the Diffusion Map methodology [14]. The classic, albeit not exclusive, application case there is that of manifold learning, i.e., the computation of topological and geometric features (such as intrinsic coordinates) of manifolds embedded in a Euclidean, usually high-dimensional space. The situation is thus one of a static manifold. Recently, these ideas have been extended towards including dynamics [37,59,74], none of which, however, use the dynamics to define a Riemannian geometry and thereby get back to the "static" setting as we do here.
Another major contribution is the following conceptual clarification. Lagrangian coherent structures (LCSs) are commonly referred to as transport barriers. With its reference to advection through the term "transport", this translates then to sets with (near-)zero advective flux. It has been pointed out earlier [65,43] that in purely advective flows any material surface constitutes a transport barrier by flow invariance. Our approach, and its consistency with many existing LCS methods, clarifies the role of LCSs as diffusion or mixing barriers; cf. also [27,28].
The paper is organized as follows. Section 2 is devoted to the derivation and discussion of the Lagrangian version of the well-known Fokker-Planck equation (FPE) and its approximation by a time-averaged diffusion equation. We derive and study the Riemannian geometry of this underlying equation, the geometry of mixing, in section 3. In section 4 we provide a thorough discussion on the different roles of eigenfunctions and show numerical results. We close with a discussion of related aspects and future directions in section 5. In particular, readers interested in applications in atmospheric and oceanic fluid dynamics may find the discussion of connections to the effective diffusivity framework in section 5.5 of particular interest. For the convenience of the reader, in appendix A we recall some fundamental notions and concepts from Riemannian geometry, elliptic differential operators and their induced heat flows.
Notation Throughout this paper, we use the following notations.
First, for the positive definite matrix representation G ∈ R d×d of a Riemannian metric g on M (in some coordinates), we denote its ordered eigenvalues by 0 < µ min (G) ≤ . . . ≤ µ max (G), and the corresponding eigenvectors (in those coordinates) by v min (G), . . . ,v max (G).

Advection-diffusion in Eulerian and Lagrangian coordinates
Let (M, g) be a Riemannian d-manifold and M ⊂ M, the fluid domain/material manifold, an embedded d-dimensional submanifold equipped with the induced metric, again denoted by g. We consider the transport equation/conservation law for the scalar quantity φ associated to the (in general non-autonomous) vector field V on M: As is well-known, this equation may be solved for φ by means of the flow map, i.e., the solution to the ordinary differential equatioṅ As mentioned in the Introduction, we are interested in the detection of LCSs as maximal material subsets which defy dispersion under the action of advection and diffusion.

Eulerian Fokker-Planck equation
As a starting point, consider the spatial evolution of a scalar density φ as it is carried by a (compressible) fluid with conserved mass density ρ and subject to diffusion, the classic Eulerian Fokker-Planck equation (FPE) Here, the measure ν corresponds to the fluid's mass dν(t) = ρ(t)ω g , and D/Dt denotes the material/substantial/advective time derivative used in the fluid dynamics literature. Note that for a homogeneous initial fluid density ρ 0 = 1 (dρ = 0 at any time) and divergence-free velocity field (div(V ) = 0), we recover the usual incompressible FPE In eq. (1a), the metric g (via its dual g −1 ) models the molecular/turbulent/numerical diffusivity of φ and therefore determines the gradient, whereas the evolution of the fluid density ρ determines the volume form and therefore the divergence; cf. appendix A.3 for intrinsic definitions of divergence and gradient. In particular, g contains physical information, including the diffusivity in each direction. Finally, ε > 0 can be interpreted as the inverse of the dimensionless Péclet number, which quantifies the strength of advection relative to the strength of diffusion. The problem of LCS detection is typically considered in advection-dominated flow regimes, i.e., associated with a large Péclet number.
The Eulerian perspective comes with a couple of drawbacks. First, if one is interested in the evolution of material localized in some non-invariant region M , one needs to solve eq. (1) on a sufficiently large spatial domain in M in order to cover the entire evolution Φ t (M ) of material initialized in M . This can be problematic in applications to open dynamical systems such as ocean surface flows. Second, coherent sets computed in this framework-as done in [20]-are inevitably of Eulerian, i.e., spacetime, kind, and do not admit a material interpretation. In particular, such Eulerian structures generally have both diffusive and advective flux through their boundaries. It is therefore of interest to study weakly diffusive flows in Lagrangian coordinates, which is the purpose of the next section.

Lagrangian Fokker-Planck equation
Let us take a look at eq. (1) from the Lagrangian viewpoint; cf. [67,50,78,36]. Formally, this means that we interpret the scalar and the fluid densities as functions of particles by pulling them back to time t = 0 through composition with the flow map Φ, which is equivalent to applying the Koopman operator associated to Φ to the densities. This yields Lagrangian scalar and fluid densities ϕ = Φ * φ = φ • Φ and = Φ * ρ = ρ • Φ, respectively. Additionally, we need to pull back eq. (1) to the material manifold, and thus arrive at its Lagrangian form Here, a material point is no longer subjected to a drift-in the Lagrangian perspective, we are following trajectories-but its carried scalar density ϕ is subject to diffusion generated by the time-dependent family of (generalized) Laplace-Beltrami operators ∆ g(t),µ(t) t , with g(t) := (Φ t ) * g the pullback metric, and dµ(t) = (t)ω g(t) = (Φ t ) * dν(t); see appendices A.3 and A.4 for technical background. Conservation of mass yields For incompressible flows and homogeneous initial fluid density, one has ∆ g(t),µ(t) = ∆ g(t) , i.e., the classic Laplace-Beltrami operator induced by the pullback metric. Equation (2a) can thus be viewed as an inhomogeneous, i.e., time-dependent, diffusion equation for Lagrangian scalar densities ϕ on M .
Remark 1 (Pullback metrics). The pullback metric g(t) is well-known in the theory of kinematics of deforming continua by the name (right) Cauchy-Green strain tensor ; see, for instance, [1, p. 356]. In the typical case when the space M is Euclidean and parametrized by the canonical coordinates x 1 , . . . , x d , the pullback metric g(t) has the matrix representation C(t) = (DΦ(t)) DΦ(t), where DΦ(t) is the linearized flow map, with entries (∂ j Φ i ) ij . In general coordinates, one has C(t) = (DΦ(t)) GDΦ(t), where G is the matrix representation of g in local coordinates on Φ t (M ).
The metric g(t) is different from g = g(0) unless Φ t is an isometry, or, in physical terms, unless Φ t corresponds to a solid body motion. Therefore, the Lagrangian diffusion is no longer isotropic with respect to g. This reflects the fact that the flow deformation may have pushed two particles apart or together, and thus their material exchange by diffusion at some later time point is, respectively, less and more likely; see Figure 2. These intuitive heuristics have been formalized and exploited in [78] to reduce the fulldimensional FPE to a one-dimensional FPE along the most contracting direction. The spatial Euclidean geometry (right) is pulled back to the material manifold from time t = 0.05 (left) by the flow map Φ 0.05 0 for the rotating double gyre, Example 1. A spatial diffusion with variance ε = 0.1 (red circle on the right) is pulled back to the red curve on the left, visualizing diffusion in the pullback metric g(0.05) on the material manifold of equal variance. As can be seen, the red curve reaches further out than material diffusion with same variance in the original metric g(0) (visualized by the green circle) in some directions, while it does not reach as far in others. This is due to the deformation by the flow. Note also the duality to the Eulerian deformation perspective presented in Welander's classic work on two-dimensional turbulence [84,Fig. 2].
Remark 2. We stress that-according to our Lagrangian viewpoint-eq. (2) is an evolution equation on M , even if the flow does not keep M invariant, i.e., Φ t (M ) = M .
Next, we show an important property of the elements of the one-parameter family of operators ∆ g(t),µ(t) t involved in eq. (2).
Equation (4) shows that time-dependence enters the weak formulation of ∆ g(t),µ(t) only through the dual metric/diffusion tensor g(t) −1 , and not through the measure µ; cf. also [78,Eq. (3)]. For this reason, we henceforth omit the measure in the notation of the generalized Laplace-Beltrami operator.

Autonomization by time-averaging
Our next goal is to approximate the one-parameter family of operators ∆ g(t) t∈[0,T ] by a time-independent operator L. A natural choice is to consider the time-averaged operator as has been done in [67,52,50,36]. Denote the heat flows of eq. (2a) and ∂ t ϕ = ε∆ by U ε(∆ g(t) ) t and U ε∆ , respectively. Then the arguments of [52] indicate that Note that we do not assume periodicity as usual in averaging theory [52], but also that we do not compare the two heat flows at time instances beyond the averaging time interval, i.e., no asymptotics in T are considered. In any case, the approximation quality is-in practice with some finite ε-certainly a matter of concern, especially for long time intervals; cf. [52,Sect. 6.4], and [50] for studies including time-asymptotic considerations.
With the time-average of the dual metrics 1 the weak formulation of ∆ takes the form Following an independent line of reasoning, Froyland introduced the operator ∆ recently in [28,31] and coined it dynamic Laplacian. Froyland considered both the timecontinuous average as well as its two-point trapezoidal time discretization The following lemma summarizes key properties of ∆.
Proof. The statements follow directly from the construction, Proposition 2 and Lemma 1.
Notably, self-adjointness follows from the fact that the dynamic Laplacian is an integral/sum of operators ∆ g(t) , all of which are self-adjoint on the same Hilbert space L 2 (M, µ 0 ), see Lemma 1. Ultimately, this is a direct consequence of conservation of mass (volume) in the (in)compressible case. The origin of the self-adjointness of the dynamic Laplacian, which is not explicitly asserted, is difficult to trace back in [28,31].
In the case when M has nonempty boundary, the operator is equipped with a boundary condition that can be read as the average of pullbacks of zero Neumann boundary conditions, see [28,Thm. 3.2], [31,Thm. 4.4].
Due to the intimate relation between Riemannian geometries and the corresponding Laplace-Beltrami operators, the following natural question arises: Is there an intrinsic geometry on M , given by a Riemannian metricḡ, such that ∆ equals the associated Laplace-Beltrami operator ∆ḡ? Such a geometric footing is of great interest, since this would allow for the application of many well-established techniques and tools from geometric spectral analysis and spectral geometry [46,53], harmonic analysis as well as intuitive visualization techniques.

The geometry of mixing
The aim of this section is the derivation and extensive exploration of a Riemannian geometry that is most consistent with Lagrangian averaged diffusion modelled by the dynamic Laplacian ∆. We will refer to this geometry as the geometry of mixing, thereby picking up and extending earlier related approaches [38,79] developed under the same terminology, but referring to the (single) pullback geometry under one flow map of (typically chaotic) flows.

The symbol of the dynamic Laplacian
To answer the above question, we study the dynamic Laplacian ε∆ through its phase space representation as a semiclassical pseudodifferential operator on M . In this context, ε plays the role of the (small) semiclassical parameter. For a brief review of technical background see appendix A.6 and the references given therein.
We start with the spatial Laplace-Beltrami operator −ε∆ g . By using its local representation and eq. (19), one can show that its symbol is-in local coordinates-of the form p = p 0 + √ εp 1 , where the principal and subprincipal symbols are given by [55, p.
Proof. Since −ε∆ is the average of the time-parametrized family of Laplace-Beltrami operators −ε∆ g(t) , it is again a second-order pseudodifferential operator. The formula for its symbol is a consequence of eq. (7), as the quantization rule eq. (19) is linear in the symbol. Specifically, let ψ ∈ S(R n ) be a Schwartz function (the general manifold case follows by localization in charts, cf. appendix A.6), and denote by p t the symbol of ∆ g(t) . Then, which states that the symbol can be computed by time-averaging of the symbols p t as claimed.

The harmonic mean metric and its Laplace-Beltrami operator
A careful comparison of the weak formulation of (generalized) Laplace-Beltrami operators with that of the dynamic Laplacian, eq. (6), reveals a "mismatch" between the diffusion tensor,ḡ −1 , and the measure dµ 0 = ρ 0 ω g . In the same spirit, the subprincipal symbol 1 depends nonlinearly on the metric, which does not allow to shift time-averaging to where g(t)-terms occur. Lemma 3, however, confirms that the principal symbol 0 of the dynamic Laplacian is a positive definite quadratic form, that coincides with the principal symbol of the Laplace-Beltrami operator −ε∆ḡ associated with the harmonic meanḡ of the metrics g(t), i.e.,ḡ Remark 3. In two-dimensional flows, the harmonic mean metric tensor G may be computed without matrix inversion as Here, |M | denotes the determinant of M .
Due to their common principal symbol, the dynamic Laplacian ε∆ and the harmonic mean Laplace-Beltrami operator ε∆ḡ are expected to have a similar spectral structure in the limit ε → 0. By eq. (7), the symbol¯ of −ε∆ḡ is given in local coordinates bȳ =¯ 0 + √ ε¯ 1 , wherē In fact, in the one-dimensional flat case with time-discrete flow, ∆ḡ and ∆ even coincide. Proof. See appendix B.
Remark 4. Unfortunately, the calculation in appendix B does not seem to offer a structural insight as to how to extend this equality to several time instances or a time interval, and whether this is possible at all.
When equipping M with the Riemannian metricḡ (and measure dμ = ρ 0 ωḡ), the timeaverage of diffusion tensorsḡ −1 indeed is the natural diffusion tensor in that geometry [17], and the induced Laplace-Beltrami operator takes the weak form In summary, we propose to approximate Lagrangian coherent structures, i.e., almostinvariant sets of the nonautonomous Lagrangian FPE (2), by almost-invariant sets for the autonomous Lagrangian evolution equation By the spectral relation between heat flow and generator, cf. appendix A.5, this boils down to a spectral analysis of the generating Laplace-Beltrami operator ∆ḡ.
Remark 5. The operators ∆ and ∆ḡ act on different Hilbert spaces. Correspondingly, different spectral problems are considered.
In the remainder of this section, we study the geometry of the Riemannian manifold (M,ḡ), i.e., the initial flow domain M equipped with the harmonic mean metricḡ, as well as the properties of the induced dynamic Laplace-Beltrami operator ∆ḡ and its heat flow. Our aim is to find signatures of coherent and incoherent dynamics-and of the boundary between them-in the static geometry of (M,ḡ). We do so by comparing characteristics of the metric tensor field, the volume form and the induced surface area form relative to the physical geometry (M, g) and in g-orthonormal coordinates x. In the Euclidean setting these are the canonical x i -coordinates. For simplicity, we assume ρ 0 = 1 henceforth.
By choosing a reference geometry relative to which we study the deformedḡ-geometry our analysis appears to be somewhat reference-dependent. An analogous approach, however, is common in continuum mechanics, where deformed configurations are analyzed relative to a reference configuration [80]. Eventually, the spectrum and the eigenprojections of ∆ḡ-the basis of our coherent structure detection method-are intrinsic and independent of representations w.r.t. the reference configuration, see Proposition 3. Notably, our geometric construction is observer-independent, or, equivalently, objective.

The Laplace-Beltrami operator and averaging
In order to develop a finer intuition on the action of the mean metric Laplace-Beltrami operator and its associated heat flow, we compare it to the operator of averaging over geodesic balls. To this end, consider an arbitrary Riemannian metric g on M (without boundary) and denote the diffusion operator defined by averaging over g-geodesic balls B g ε (x) = {y ∈ M ; dist g (x, y) ≤ ε} of radius ε by T g ε , i.e., Then the results from [55, Thms. 1 and 2] show that (almost) in the norm resolvent sense; see [55] for technical details. In particular, the dominant eigenvalues and their eigenprojections of ε 2 ∆ g converge to the eigenvalues and eigenprojections of 2(d + 2)(T g ε − Id) as ε → 0, respecting multiplicity. This strong result can be usefully interpreted in two ways in our context, by reading eq. (10) from right to left and vice versa.
Spectral approximation of the short-time heat flow Typically, almost-invariant sets for eq. (9) are detected by eigenfunctions of the heat flow U ε∆g . Now, the right-hand side of eq. (10) can be read as the second-order operator expansion of the heat flow U ε∆g ε 2(d+2) , i.e., for short time intervals of length O(ε). An understanding of this short-time heat flow is already instructive for the general heat flow, since eq. (9) is autonomous and the long-time heat flow is nothing but an iteration of the short-time heat flow. Equation (10) now states that not only may we understand the pointwise action of the short-time heat flow through studying the shape of geodesic balls locally, but also build a macroscopic intuition as reflected through eigenfunctions. We explore this viewpoint further in the next section, including numerical case studies.
Approximation of local averaging by advection-diffusion Alternatively, eq. (10) may be interpreted from left to right. On the left-hand side, we have the compact integral smoothing operator T g ε . This operator is expanded in (non-compact) differential smoothing operators. To zeroth order, i.e., sending ε → 0, the integral kernel of T g becomes the Dirac delta-distribution, whose action is given simply by point evaluation, or, on the operator level, by the identity operator. In the context of the classic heat equation induced by ∆ g , i.e., the FPE with vanishing advection, one may thus interpret the identity as the representation of the (absent) advection, and the Laplace operator ε∆ g as the second-order differential operator representing the classic diffusion part of T g .
These static considerations may now be translated to a dynamic context as done in [27]. As in [27], we use the notation 2 with strong convergence, that does not imply convergence of spectra. Our hypothesis is, however, that this convergence analysis can be strengthened towards convergence of spectra, using the results from [55]. In the advection-diffusion "decomposition" discussed above, the identity reflects the purely advective part of diffusive forward-backward motion as modeled by L t ε * L t ε , and ∆ represents the second-order differential diffusion. This, together with our derivation of the dynamic Laplacian in a Lagrangian-diffusion context, is in contrast to the interpretation of ∆ in a "purely advective" context in [28,29,31,30].

Diffusion barriers
In local coordinates, the Riemannian metricḡ has a second-order matrix field representation by the Gram matrix field G. At each point, G is symmetric, positive definite, and hence invertible. Its inverse D := G −1 is the matrix representation of the dual metric g −1 with respect to the dual coordinates, and for that reason again symmetric positive definite. Physically speaking, D is the natural/isotropic diffusion tensor associated with G, or the Lagrangian averaged diffusion tensor.
We are now looking for a canonical, i.e., diagonalized, coordinate representation of the bilinear form D with respect to cotangent-space bases which are orthonormal with respect to g −1 . This can be achieved by computing the eigendecomposition of D. The eigenvalues of D can then be viewed as the diffusion coefficients in the characteristic directions corresponding to the eigenvectors. In the corresponding tangent space basis, the Laplace-Beltrami operator takes the following leading-order form: In other words, the direction v max D associated with µ max D corresponds to the direction of strongest (or fastest) diffusion. The direction spanned by v max D corresponds to v min G , i.e., the direction which is most strongly compressed under the change of metric from g toḡ, see Figure 3. Equivalently, we may pass to aḡ-orthonormal basisṽ i by rescaling,ṽ i = √ µ i v i . In theṽ i -basis, the Laplace-Beltrami operator takes the canonical leading-order form The rescaling therefore shows how theḡ-unit sphere looks relative to the g-unit sphere, and this relative deformation transfers to geodesic balls under the diffeomorphic exponential maps. It is of interest to look at the ratio between the maximal and the minimal eigenvalues. This ratio is commonly referred to as anisotropy ratio and indicates the separation of diffusion time scales. Note that the anisotropy ratio is not an intrinsic quantity ofḡ, but is determined here by viewing theḡ-diffusion tensor from the g-perspective.
In Figures 4 to 6 we visualize the Lagrangian averaged diffusion tensor field D for three model examples that are commonly considered in the LCS community. Here, the scalar field corresponds to the difference in order of magnitude between the dominant µ max = µ max D and the subdominant eigenvalue µ min = µ min D , i.e., log 10 (µ max /µ min ). On top, a grey-scale texture is shown, whose features are aligned with the dominant diffusion direction field v max D . A coarser visualization of related matters can also be found in [52,Fig. 5].  Figure 3: Schematic visualization of the g-andḡ-unit circles (black and red, resp.) in g-orthonormal coordinates, cf. Figure 2. Note thatḡ-diffusion is fastest in the direction of v min G because theḡ-distance is the shortest on g-circles. The corresponding diffusion coefficient in that direction is µ max D = 1/µ min G . Note also thatḡ-unit spheres have typically much smaller volume than g-unit spheres because µ min G ≤ µ max G < 1 in large regions of M , and consequently g-unit volumes haveḡ-volume smaller than one there, see section 3.5.
, defining the velocity field througḣ The integration time interval is [0, 1]. The flow is designed to interpolate in time an instantaneously horizontal (at t = 0) and an instantaneously vertical (at t = 1) double gyre vector field. For our metric computations we average over 21 pullback metrics from equidistant time instances with time step 0.05. In Figure 4, two phenomena are clearly visible. First, theḡ-diffusion gets closest to isotropic diffusion (low log 10 (µ max /µ min )-values) around the cores of the two coherent structures at roughly (0.5 ± 0.25, 0.5). The further away from the structure centers, the more quasi-one-dimensional the diffusion becomes (high log 10 (µ max /µ min )-values), cf. also [79,78]. In particular, there are thin yellowish filaments almost fully enclosing the blueish regions. Second, diffusion across the anticipated boundary of the coherent regions corresponds to the subdominant diffusion direction, which is several orders of magnitudes weaker than the dominant diffusion and therefore much slower.
In other words, a uniform heat distribution localized close to the vortex core will diffuse both radially and circularly on comparable time scales. A uniform heat distribution localized on the whole vortex will diffuse to the exterior on very long time scales and is therefore expected to be extremely slowly decaying, or, in other words, almost-invariant.
Next, we consider a non-volume preserving flow with two known coherent structures. Example 2 (Cylinder flow, non-volume preserving). We consider the flow on the cylinder S 1 × [0, π], introduced in [32] and generated by the vector fielḋ The functions and parameters are chosen as in the computations in [29]: , c = 0.5, ν = 0.5, = 0.25. As in [29], we choose an integration time of T = 40 and approximate the harmonic mean metric by 41 pullback metrics at integer time instances.
The resulting metric tensor field is shown in Figure 5. As in the previous example, we find a low anisotropy ratio in the anticipated cores of the vortex at roughly (π ± π/2, π/2), which corresponds to almost isotropicḡ-diffusion there. Also, the dominant diffusion direction field circulates around the vortices, which are eventually surrounded by regions of quasi-one-dimensional diffusion in the yellowish region, separating the two gyres.
Example 3 (Bickley jet flow). We consider the Bickley jet flow [70], which is determined by the stream function ψ(t, x, y) = ψ 0 (y) + ψ 1 (t, x, y), where with functions and parameters as in [70,41] instances with time step 0.5 days. Again, we find a low anisotropy ratio in the anticipated cores of the vortex, and a dominant diffusion direction field circulating around the vortices. The regions of quasi-one-dimensional diffusion separate the gyres from the jet. Note the singular behavior of the v max D -integral curves along the jet core, especially on the right half, and in the vortex centers; cf. section 5.2 for more details on the singularity aspect.

Likelihood of membership to an LCS/Lagrangian effective diffusivity
The Riemannian volume form of the original metric g reads in local coordinates as which simplifies in our chosen normal coordinates to g ij = δ ij , and therefore ω g = dx.
For the harmonic mean metric, the associated Riemannian volume form then reads Remark 6. In two-dimensional flows, the coefficient |ḡ| can be computed as 1/ √ det C, where C is a weighted average of Cauchy-Green strain tensors, see Remark 3.
Thus, relative to the reference volume form, the factor √ḡ = det can be interpreted as the density of ωḡ with respect to the volume form dx. In other words, the value of √ḡ at some point in M answers the following question: given a d-parallelepiped in the tangent space with unit g-volume, what is itsḡ-volume? If the density √ḡ = i µ i G is very small in some region, this implies that 1/ √ḡ = Necessarily, µ max D must be large, and consequently, diffusion in the associated direction v max D is (extremely) fast. Conversely, regions of high density must have low diffusivity in any direction.
Alternatively, the inverse density, i.e., det D, can be considered as a Lagrangian scalar "effective diffusivity", reducing the direction-dependent information contained in D to a single scalar quantity. In simple terms, we may regard points with low density as leaky, points with high density as sticky with regard to keeping scalar quantities under the heat flow.
As an alternative description for the low density case we have that a g-unit reference set (like parallelepiped or a ball) has largeḡ-volume. Conversely, the correspondinḡ g-unit reference set appears small in g-normal coordinates, and therefore captures a smaller neighborhood compared to the g-reference set; see Figure 3. This in turn means that on average neighboring points have a larger geodesic distance with respect toḡ than with respect to g. The discrete graph analogue to measuring the connectivity of points to its neighborhood or the entire manifold is the degree of graph nodes. For that reason, it does not come as a surprise that our density plots below show striking similarity with corresponding degree field plots in [41].
Example 4 (Volume-preserving diffeomorphism). Suppose the dynamics is given by a single volume-preserving diffeomorphism Φ, acting on a domain in the two-dimensional Euclidean space. By virtue of Remark 6, we have where µ = µ max (C) > 1 is the larger eigenvalue of C and 1/µ is its smaller eigenvalue. By the symmetry of h in µ around 1, it is clear that h attains its maximum 1 at µ = 1. At points of maximal density, the deformation by DΦ is isotropic, and the flow map Φ represents an infinitesimal solid body motion. In infinitesimal terms, this means that the flow map does not distort an initial circle, thereby keeping its optimal circumference-to-area ratio. Another interpretation is that all neighbors keep their distance under the flow map, and the infinitesimal neighborhood of such a point stays together uniformly, i.e., independent of direction.
In contrast, low density values are induced by strong stretching and compression, which turns an infinitesimal circle into an elongated ellipse with large surface area for diffusion to act on.
This example is additionally interesting as it shows that we may distinguish hyperbolic dynamics with strong stretching and compression from elliptic dynamics without significant stretching and compression via the density of ωḡ w.r.t. ω g . This cannot be achieved by the analogous density of ω Φ * g , which is constantly 1 in the volume-preserving case.
Example 1 (Rotating double gyre flow, continued). We show the density of the transient double gyre flow in Figure 7. One can see two regions of high density, surrounded by structures of smaller scale and embedded in a region of very low (several orders of magnitude compared to the global maximum) density. On a coarse scale, i.e., ignoring the smaller structures, we find two structures with significant ωḡ-volume embedded in a region of almost negligible such volume. From the volume perspective, we may consider theḡ-geometry as a small perturbation from the setting with two connected components.
Example 2 (Cylinder flow, continued). In Figure 8 one can see roughly two high-density components with densities of order 1, which are embedded in a large low-density region, in which the density may drop below 10 −7 . As in the previous example, we may consider theḡ-geometry as a small perturbation from the two-components setting.
To support the intuition of leaky versus sticky points, we provide video animations of the heat flow for different heat distributions initialized in the leacky and the sticky region, respectively; see Supplementary Material 4 and 5 and a more detailed description in section 5.5. meandering jet and structures of smaller scales, embedded in a region with very low density. From the volume perspective, we may therefore consider theḡ-geometry as a small perturbation from the setting with six connected components and a yet unclear role of the jet. Note that the jet has a much smaller ωḡ-volume compared to the vortices.

Diffusive flux form
Besides the volume form, we are also interested in the induced (hyper-)surface area form. Such forms assign a (d − 1)-volume, which we will simply refer to as area, to (d − 1)-dimensional parallelepipeds in tangent space. We restrict our attention to parallelepipeds with unit g-area, whose correspondingḡ-area can be interpreted as the 'ḡ-diffusive flux'. To compute theḡ-area of parallelepipeds of interest we recall a result from linear algebra [28, App. A, Lemma 1]: for an invertible matrix A ∈ GL R d and an orthonormal basis (v 1 , . . . , v d ) one has In our context, given some tangent space T p M , (v 1 , . . . , v d ) shall be an orthonormal basis with respect to g. The transformation given by A := G 1/2 corresponds to the basis transformation in T p M to Riemannian normal coordinates, cf. [47,Appendix]. Therefore, in the new coordinates, G takes the canonic Euclidean form, and area, determinant as well as volume are computed as in the Euclidean case. In other words, on the left-hand side we have theḡ-area of the parallelepiped spanned by (v 1 , . . . , v d−1 )-our object of interest-and on the right-hand side we have the density det G 1/2 = √ḡ studied in section 3.5 and theḡ −1 -norm of the normal (co-)vector v d . Given a point p in M , what is the orientation of a (d − 1)-parallelepiped of unit g-area in the tangent space T p M with the leastḡ-area? Physically speaking, what is the orientation of a surface element attached to p that admits the least diffusive flux? Looking at the left-hand side, we find directly that the parallelepiped spanned by the eigenvectors of G corresponding to the lowest eigenvalues has minimalḡ-area. This is consistent with the right-hand side in that the normal vector v d is the eigenvector corresponding to v min D in that case, and therefore has minimal dual norm. Furthermore, this is also consistent with our analysis in section 3.4 about directions of slowest and fastest diffusion.
In summary (and not surprisingly), surface elements oriented normally to the direction of weakest diffusion are locally optimal 'diffusion barriers'. From a global viewpoint, surfaces located at the boundary of a high density/low effective diffusivity region, which are pointwise (near-)optimal diffusion barriers, are expected to be good delineations of coherent from mixing regions, or, in other words, boundaries of coherent structures.

No-flux boundary conditions for Lagrangian averaged diffusion
We would like to find an expression for the normal vector field νḡ ofḡ in terms of ν g , the normal vector field with respect to the original metric g. Choose a point x ∈ ∂M and an orthonormal tangent space basis B such that taking inner products in these coordinates is performed by an Euclidean vector product, i.e.: by definition of the normal vector field. In the same coordinates, let G be the matrix representing the metricḡ . Now, we want to express νḡ as Aν g , where A is some invertible linear transformation. By definition, we must have for all which holds if and only if A = D. This transformation, however, corresponds to the dual metric ofḡ, or in other words, to the arithmetic average of the dual metrics g(t) −1 . Our natural Neumann boundary condition for ∆ḡ thus corresponds exactly to the boundary condition posed in [28], which is a natural, but not necessarily a Neumann boundary condition there.

Variational characterization of eigenvalues
We conclude our study of the geometry of mixing by interpreting the eigenvalues of the Laplace-Beltrami operator from the variational viewpoint. According to the Courant--Fischer-Weyl min-max principle the eigenvalues of any Laplace-Beltrami operator can be characterized as follows: for k ∈ N let W k = span{w 1 , . . . , w k } ⊂ L 2 (M,ḡ) be the k-dimensional subspace spanned by the eigenfunctions corresponding to the k dominant eigenvalues and W ⊥ k its orthogonal complement. Then the k-th eigenvalue is given by The infimum is attained exactly by the eigenfunctions corresponding to λ k . For a smooth function w, the simplest way to minimize the Rayleigh quotient is to be non-

LCS detection by Laplacian spectral analysis
Our previous analysis led us to the task of identifying sets or material distributions that are almost invariant under the heat flow induced by the harmonic mean metric Laplace-Beltrami operator ∆ḡ, see eq. (9). This is done by examining the spectrum and the eigenfunctions of the heat flow operator. Since its generator ∆ḡ is autonomous, the spectrum and the eigenfunctions of the heat flow can be deduced from the spectrum and the eigenfunctions of ∆ḡ by the spectral mapping theorem; cf. appendix A.5.

Spectral analysis & the role of eigenfunctions
In order to determine the number and location of LCSs, we study the spectrum and eigenfunctions of ∆ḡ. That is, we look at the solutions of the self-adjoint eigenproblem ∆ḡw n = λ n w n , w n ∈ L 2 (M,ḡ), possibly with natural Neumann boundary conditions on ∂M .
Before we come to the actual LCS extraction based on eigenfunctions of the harmonic mean Laplace-Beltrami operator ∆ḡ, we would like to develop some intuition on eigenfunctions of Laplace-Beltrami operators. As a starting point, we recall the discussion in [23], before we point out important differences to the problems considered there.
The last two facts are commonly interpreted as follows: the values of the (k 1 , k 2 ) = (1, 0)-eigenfunction cos(πx/ 1 ) can be used to parametrize or "recover" the x-coordinate, the values of the (k 1 , k 2 ) = (0, 1)-eigenfunction cos(πy/ 2 ) can be used to parametrize or "recover" the y-coordinate. When aiming for an intrinsic parametrization of the manifold of interest, the constant (0, 0)-eigenfunction is not useful at all, obviously. Its constancy and the simplicity of the corresponding 0-eigenvalue tells us that we are dealing with a connected manifold.
For k 1 + k 2 ≥ 2, the corresponding eigenfunctions are often referred to as higher harmonics, they are oscillating at higher frequencies along the two coordinates that can be obtained intrinsically as the values of the (1, 0)-and (0, 1)-eigenfunctions. Again, when aiming for intrinsic parametrizations these higher harmonics are not useful; the purpose of [23] is exactly to propose an algorithm that distinguishes the unique eigendirections given by the (1, 0)-and (0, 1)-eigenfunctions from repeated eigendirections (in the terminology of [23]). An unavoidable challenge in this eigenfunction distinction is that the eigenvalues corresponding to the (1, 0)-and (0, 1)-eigenfunctions need not be dominanttheir position in the spectrum depends on the aspect ratio of 1 and 2 -neither need they be followed by a gap in the spectrum.
We argue, however, that the problem of coherent structure detection is of a different nature. First of all, the material manifold/fluid domain as a subset of space at the initial time is fully known, and there is no need to learn intrinsic coordinates from operator-eigenfunctions. Second, as we have shown in section 3.5, it is much more natural to think of our material manifold M -equipped with the geometry of mixingas consisting of nearly-decoupled components, which then constitute the almost-invariant sets or coherent structures 3 . Back in our initial strip example, we should think of M  Figure 10: Spectral structure in the decoupled (a) and perturbed (b) cases. The quasicomponents indicate connected coherent sets corresponding to almost invariant sets, and arise from the unperturbed zero eigenvalues associated with connected components. The coordinates and higher harmonics correspond to first and higher order Laplace-Beltrami eigenfunctions associated with one or a superposition of the components or quasi-components.
being the union of several disjoint full-dimensional sets, say, strips or discs, which are weakly coupled by thin bridges. With this picture in mind, the eigenfunctions of interest are now those which are nearly constant on each "piece". In other words, in the simple strip example, the most interesting eigenfunction in our context is the flat one, since this would correspond to the structure-indicating function in a multi-structure case. Therefore, one would need to identify component-indicating eigenfunctions, and filter out both coordinate-inducing eigenfunctions and higher harmonics. This is in strong contrast to the coordinate-recovery problem in manifold learning described above.
In summary, we are interested in weakly-coupled components of full material manifold dimension. It is a classic heuristic to view the weak coupling as a small perturbation of the manifold, to which spectrum and eigenfunctions of the Laplace-Beltrami operator are expected to respond to continuously. One would therefore expect to see as many near-zero eigenvalues as one has components in the ideal decoupled case (or weaklycoupled components), followed by a gap often referred to as eigengap. It is exactly the non-dominant eigenfunctions-ideally after the first significant spectral gap-which can be used to parametrize each component in the diffusion-map fashion, discarding possibly interweaved higher harmonics. Figure 10 schematically illustrates the role of different eigenfunctions. The relevant eigengap in the perturbed case lies below the eigenvalues indicating quasi-components, which in turn are separated from the simple zero eigenvalue associated with the connected manifold M by Cheeger's constant. We note, however, that in practice the quasi-component eigenvalues indicating very small coherent structures might be embedded within a spectral region containing higher harmonics of larger structures. It is important that size and geometric shape need to be interpreted within theḡ-geometry, both for ∆ and ∆ḡ, and not within the (typically Euclidean) g-geometry; cf. section 3.4 and recall that both share the same diffusion tensor.

Spectral analysis of the examples
After solving the eigenproblem for eigenpairs (λ, w) by the finite-element method, cf. [30], we proceed as is well-established in the (computational) spectral geometry community [15,61,54]. We check the spectrum for the first significant gap, say, after the k-th eigenvalue. Then we apply the k-means clustering algorithm to the leading k eigenfunctions, cf. Figure 11 for the second and third eigenfunction in the rotating double gyre flow example. Alternatively, other heuristic methods may be applied here [81,18,21]. Note that the k-means clustering algorithm does not optimize any dynamical quantity, but only the classic k-means-objective function based on Euclidean distances in the eigenfunction space, which is highly consistent with the original clustering goal; cf. [54].
Since the first eigenfunction is flat, it can actually be omitted from the k-means clustering. The second up to the k-th eigenfunction are indicating the almost-invariant structures in a combinatorial way, since all eigenfunctions are pairwise orthogonal. In particular, all but the harmonic first eigenfunction have mixed signs and cannot be of pure indicator function type. Nevertheless, they span a subspace which is very close to the one spanned by the indicator functions supported on the respective coherent structures; cf. [15]. In Figure 11, adding a constant to w 3 and rotating (w 2 , w 3 ) by roughly −π/4, say, by passing to w 2 + w 3 and w 2 − w 3 , yields almost-indicator type functions supported on the blue and yellow structures, see the following examples for more details.
Example 1 (Rotating double gyre flow, continued, cf. [33]). In Figure 12 we show the 21 first eigenvalues of the spectrum of ∆ḡ. We find a first larger gap after the third eigenvalue, which indicates the presence of two coherent structures embedded in  Figure 13i, which is formed by the stable manifold of the lower hyperbolic trajectory. Figure 14 shows the result of the 3-clustering of the first three eigenfunctions. For comparison, in [33] only the left coherent structure has been detected (as necessary in the bisection problem treated there), which appears to be larger than ours obtained from the clustering. For a visual proof of coherence, we provide an advection movie showing the evolution of the detected sets as Supplementary Material 1.
Example 2 (Cylinder flow, continued, cf. [29]). In Figure 15 we show the 21 first eigenvalues of the spectrum of ∆ḡ. In this case, the eigengap is less obvious, and we merely observe an eigenvalue plateau from λ 4 to λ 7 . A look at the first two nontrivial eigenfunctions w 2 and w 3 in Figure 16a and Figure 16b reveals that the two most relevant coherent structures are again in the initial gyre centers. All eigenfunctions from w 4 to w 20 have common visual support with w 2 , and can be interpreted as coordinates-inducing or higher harmonics on these two components. If we neglect the corresponding eigenvalues in line with our reasoning in section 4.1, we are left with the problem of finding two coherent structures with some background. The final clustering result is shown in Figure 16c. For comparison, in [29] only the left coherent structure has been detected (as necessary in the bisection problem treated there), which appears to be visually indistinguishable from ours. As for the previous example, we provide an advection movie showing the evolution of the detected sets as Supplementary Material 2 for a visual proof of coherence.    The 3-clustering obtained from w 2 and w 3 .
Example 3 (Bickley jet flow, continued, cf. [41]). In Figure 17 we show the 21 first eigenvalues of the spectrum of ∆ḡ. We find a first bigger gap after the seventh eigenvalue, which indicates the presence of six coherent structures embedded in an incoherent connecting background. A look at the first nontrivial eigenfunctions such as the second one in Figure 18a reveals that the six most relevant coherent structures are the gyres flanking the jet. All eigenfunctions corresponding to the dominant eigenvalues look conceptually the same, i.e., they distinguish the different vortices as does w 2 for some of them. The final 7-clustering result obtained from the dominant seven eigenfunctions is shown in Figure 18b. For comparison, in [41] very similar spectrum and LCS have been found, apparently larger than ours obtained from the clustering.
Of particular interest in this model is the presence or absence of a north-south division in the first nontrivial eigenfunction w 2 . Such a division is reported for the related probabilistic transfer operator method, in [35,27] for different parameters, and in [41] for the parameters used here, as well as in [30] in the transfer-operator-based FEMdiscretization of ∆. We have not observed the horizontal division in the dominant eigenfunctions neither with ∆ḡ, nor with the dynamic Laplacian ∆, cf. the Cauchy-Green-based FEM-discretization of ∆ in [30]. The occurence of the north-south division is related to a strong "numerical" diffusivity introduced through the discretization, both in the classic box discretization of the transfer operator and the transfer-operator-based FEM-discretizations in [30]. These aspects will be discussed in more depth in a forthcoming publication. As

What to average?
Including our approach developed here, there are now three coherent structure detection methods which average different, but related, geometric objects. As a first category, we have the averaged Laplacian approach: for the (spatially) continuous version, this is the dynamic Laplacian ∆ [28,31], for the discrete version, this is the Spacetime Diffusion Map transition matrix Q ε [6]. As we have seen, ∆ is an average of Laplace-Beltrami operators with respect to pullback metrics, and Q ε is an average of the graph Laplacians B ε,t of those Laplace-Beltrami operators ∆ g(t) , built on the respective locations of a given set of trajectories. For each time instant t, the pointwise or even uniform convergence of the graph Laplacian towards the respective Laplace-Beltrami operator for increasing data samples, see [6,Thm. 3,Rem. 4], follows from the results in [7,76,14,8], see also [9, Appendix A] for a historic account and sufficient conditions for convergence of spectra. In summary, averaging (pullback) Laplace operators yields an elliptic operator on the physical manifold (M, g) (possibly weighted by the initial fluid density).
As a second category, we have the spectral-clustering method put forward in [41]. There, geodesic distances with respect to the pullback metrics are averaged. These new distances are then used to build a normalized graph Laplacian on that new metric space. Spectral convergence of the resulting normalized graph Laplacian for infinite samples towards a continuous operator can still be guaranteed when replacing the distancedependent kernel function r → 1/r by the often used Gaussian kernel r → exp(−r 2 /σ 2 ), see [82,Thm. 15]. The metric space lacks a generating Riemannian structure, but is very accessible from real-world data sets, and is guaranteed to give a self-adjoint matrix, in contrast to the spacetime diffusion map approach, see [6, III.C.2, p. 8].
Thirdly, we have our current method, which builds on averages of dual metrics/diffusion tensors, and equips the material manifold with the corresponding natural metric as Riemannian metric. This procedure yields the richest mathematical structure, i.e., both a Riemannian manifold and consequently a metric space, and is guaranteed to give a self-adjoint operator. On the downside, its discretization in terms of a graph Laplacian is not straightforward anymore, see section 5.4. We expect, however, that for material points which are close in all pullback metrics g(t) the average of geodesic distances used in [41] is a good approximation for theḡ-geodesic distance.
All of these three methods have in common that they consider local neighborhood information, either through considering (second) derivatives of "test functions", or through geodesic distance information. This is in contrast to an independent set of coherent structure detection methods, for instance [62,11,58,64,45], which retrieve information from time averages of observables along "isolated" trajectories. The expectation then is that coherent structures reveal themselves as sets of material points showing similar statistics within the structure, and different statistics compared to the exterior.

Connections to geodesic LCS approaches
There is another group of methods for finding boundaries of coherent structures in purely advective flows, developed by Haller and co-workers [42,44,48,24]. These build on global variational principles involving the Cauchy-Green (CG) strain tensor field C := DΦ(T ) DΦ(T ), which we interpret here as the pullback metric Φ(T ) * g; see Remark 1. These methods evaluate the dynamics at two time instances, an initial t = 0 and a final one t = T . Earlier "finite-time" methods have only used the logarithm of the maximal eigenvalue of C, well-known as the finite-time Lyapunov exponent (FTLE), for visual inference of coherent structures.
Previously, it has been unclear how to extend these two-point variational methods towards using also intermediate dynamic information, and, more importantly, how these Cauchy-Green-based methods relate to the probabilistic transfer operator-based methods. In the following, we will clarify these two questions and thereby provide the longsought link between the two prominent coherent structure approaches.
To this end, observe the following tight relation between the Cauchy-Green strain tensor C and the two-point harmonic mean metric tensor G in physical g-normal coordinates at some point p ∈ M . Then G has the coordinate representation G = 2 G(0) −1 + G(T ) −1 −1 = 2 I + C −1 −1 , where I is the identity matrix. Clearly, C has eigenvalues µ min = µ min (C) ≤ µ max (C) = µ max with eigenvectors 4 v min (C) and v max (C) if and only if D = G −1 has eigenvalues µ min D = 1 2 1 + 1 and v max D = v min (C). In other words, the minor CG-eigendirection v min (C) corresponds to the dominant G-diffusion direction. Moreover, in the volume-preserving case, one has µ min = µ −1 max , and therefore the anisotropy ratio for D 1 + µ max 1 + 1 µmax = µ max is equal to the dominant CG-eigenvalue. From these considerations we see that our harmonic mean metric is a consistent way of including intermediate deformation information in quantities like the FTLE or the characteristic direction fields v i (C). Thus, the logarithm of the anisotropy ratio shown in the figures in section 3.4 corresponds to an accordingly defined multiple time step FTLE up to rescaling. Next, let us briefly recall the variational formulations for elliptic (coherent vortex boundaries) and parabolic (jet cores) Lagrangian coherent structures (LCSs) in twodimensional flows, using our notation. Following [44], boundaries of elliptic LCSs are sought as the outermost closed stationary curves of the averaged strain functional where r is a parametrization of a material curve γ ⊂ M . The integrand compares pointwise the magnitude of curve velocity r (s) after push forward by DΦ with its original magnitude. Equivalently, by equipping M with the pullback metric C = Φ(T ) * g, the domain M is geometrically deformed, in principle as we do in our approach here withḡ. In these terms, the length of r (s) in the deformed geometry (M, C) is compared with its length in the original geometry (M, g). By applying Noether's theorem, one obtains that stationary curves necessarily obey a conservation law, which corresponds exactly to the integrand, i.e., |r (s)| Φ(T ) * g |r (s)| g = |r (s)| g(T ) |r (s)| g(0) = λ = const., from which one may deduce tangent line fields η λ . Among the orbits of these line fields one looks for closed ones. Closed orbits typically come in continuous one-parameter families, out of which one picks the outermost. Analogously to index theory for vector fields, one may employ index theory for line fields to deduce that any closed orbit of a piecewise differentiable line field must necessarily enclose at least two singularities 5 of the line fields, and all enclosed singularities obey a topological rule, see [48] for more details. In most cases, relevant closed orbits have been found to enclose exactly two Closed η λ -lines (black) on top of (a) the final clustering result from the ∆ḡ-analysis, and (b) the second eigenfunction w 2 of ∆ḡ.
singularities of wedge type [48]. These singularities can be visualized (and numerically detected) by phase portraits of the eigendirections of the pullback metric C, or, as shown above, equally well by the dominant diffusion direction field of D as in section 3.4. In practice, the outermost closed stationary curve travels through regions of high FTLE/anisotropy values. There, the tangent direction field is almost collinear with v min (C). From the discussion in section 3.6 we conclude that such closed curves are pointwise very close to the optimal direction for blockingḡ-diffusion. Their deviation from the optimal direction is not very costly in terms of diffusive flux, but still allows them to close up smoothly under the conditions of the variational principle.
For a simple numerical test, we have overlaid the final clustering result as well as the second eigenfunction of ∆ḡ for a two time point-approximation ofḡ with all closed η λ -orbits in Figure 19. We find a striking similarity especially of the right elliptic LCS (black) with the right structure found from the ∆ḡ-analysis (blue set in Figure 19a). A close inspection of the level sets of the second eigenfunction, see Figure 19b, shows that even the kink at the bottom of the right LCS is captured, even though smoothed out by the k-means clustering.
Regarding parabolic LCSs, there exists a similar mathematical formulation to the one for elliptic LCSs, including a variational principle, which admits a conserved quantity from which one may deduce tangent line fields for stationary curves [24]. Eventually, parabolic LCSs, or shearless jet cores, are defined as alternating chains of v min (C) and v max (C) integral curves, which connect to each other in tensor singularities. Interestingly, this feature can again be well observed in the Bickley jet example, compare Figure 6 to [24,Sec. 9.3].
In summary, its dimension-independent formulation and its high degree of consistency with the two-dimensional variational principles suggest our methodology as a natural extension of these approaches to higher dimensions. It has proven to be notoriously challenging to extend the variational ideas to three dimensions by restricting oneself to variational principles on curves [10,66].

Perturbation of topology
We have argued in section 3.4 that the manifold (M, g) with the coherent islands of large metric density and a surrounding sea of small metric density can be regarded as a perturbation of a virtual limit case in which the density actually vanishes in the small-density region. In this case, however, the manifold would change its topology by disintegrating into several connected components. Due to the small density, the omission of the region with smallḡ-volume may be regarded as a small perturbation. As all examples in section 4.2 show, the dominant part of the spectrum and the corresponding eigenfunctions behave exactly as expected in such a perturbative setting.
In the following, we want to examine how much of an error one introduces in the heat flow by omitting regions of smallḡ-volume. To this end, suppose that N ⊂ M is the region where the metric density of ω g is larger than some small 0 < δ 1 in normal coordinates x with respect to the physical metric g. We denote by u ε t (x, y) the heat kernel of −ε∆ g , see eq. (9). Then, the characteristic function of N evolves according to Hence, for the (heat) transition from N to M \ N we observe that which implies that the set N is almost invariant under the flow if M is compact and δ is sufficiently small. In other words, if the volume of M \ N with respect to the metric g is small, there is only little diffusion from N into M \ N . Hence, setting u ε t (x, y) = 0 whenever x / ∈ N or y / ∈ N introduces only small errors but decouples the manifold into a region M \ N in which no diffusion happens and the connected components of N in which diffusion still takes place. This is similar to a manifold-disintegration approach based on transition-kernels as proposed in [26].

Discrete data case
The analogue to a finite-difference discretization of ∆ḡ on an irregular grid is-to leading order-computing its graph Laplacian. Due to the geometric distortion by the metricḡ, a regular Euclidean grid on an initially Euclidean domain is no longer equidistant, since all grid points may have different geodesic distances, similar to [41]. And so, a consistent finite-difference discretization boils down to computing-or approximating-geodesic distances between neighboring points. In the continuous data case, this can be done classically by computing the geodesic vector field and solve the corresponding two-point boundary value problem. Alternatively, one can solve the anisotropic eikonal equation locally for the first arrival time function, say, by the efficient fast marching method, to obtain geodesic distances to points in the topological neighborhood. Both methods require metric evaluation at subgrid points.
Given a discrete trajectory data set, such subgrid information is not available. It is therefore of interest to estimate theḡ-geodesic distance with less knowledge by using metric information only at the irregularly placed trajectories or avoiding the computation of the pullback metrics altogether. For the latter, one may resort to the spectralclustering approach [41] discussed in section 5.1; for the first, work is in progress.

Applications to geophysical fluid dynamics
In this section, we compare our methodology to Nakamura's "effective diffusivity" framework [65,75], which is widely used in the geophysical fluid dynamics community.
Both methods consider advection-diffusion processes in possibly turbulent fluid flows, i.e., the Fokker-Planck equation (FPE) in the physical domain with a "constant diffusion coefficient" κ 6 . In a second step, the FPE is transformed to a different set of coordinates: in [65] a quasi-Lagrangian coordinate system based on the area inside concentration level sets of a given tracer density φ of interest is constructed. This transformation is considered as advantageous, for it separates "the reversible effects of advection from the irreversible effects of advection and diffusion acting together" [75]. In our context, this is achieved by passing to Lagrangian coordinates as in [67,78,36]-and therefore literally to a "tracer-based coordinate system"-which factors out the advective motion, but keeps the joint action of advection and diffusion through tracking deformation and its effect on diffusion in the form of the pullback metric.
In the Nakamura framework, the coordinates are then given by the one-dimensional area-coordinate A and some (d − 1)-dimensional coordinates within concentration level sets. Clearly, the local action of diffusion on the concentration is in the A-direction only, since there is no φ-gradient along the level set coordinates. Averaging over contours allows to reduce the d-dimensional advection-diffusion equation to a one-dimensional (along the area coordinate) pure diffusion equation [65]: and the scalar K eff is coined effective diffusivity. In our context, the Eulerian FPE is turned into a pure diffusion equation as well, eq. (2), however, of full dimension d again. Moreover, we arrive at a diffusion tensor induced by the pullback metric. We comment on the reason and the benefit of this increased complexity later.
The next important concept in Nakamura's framework is that of equivalent length L eq (in 2D) and equivalent area A eq (in 3D). For convenience, we focus on L eq henceforth, but all arguments apply analogously to higher-dimensional flows. And so, the equivalent length of a concentration level set corresponding to the area coordinate A can be obtained from a comparison of the spatial scalar diffusivity κ (from the Eulerian FPE) and the effective diffusivity K eff (from eq. (13)), cf. [75, eq. (6)]: It can be shown that L 2 eq is at least (the square of) the physical length of the corresponding concentration level set.
Equation (14) can be interpreted as answering the following question: how would one have to rescale units or, more generally, deform the local geometry in order to observein the new units/deformed geometry-the original diffusivity κ? Essentially, this is the perspective that we took throughout section 3: how do length, surface area, and volume in the diffusion geometry relate to the corresponding entities in the original spatial geometry? How do we see in the geometry of mixing whether diffusion is effectively enhanced or suppressed (relative to the pure spatial diffusion) via its interaction with advection? As an early indicator of mixing efficiency, Nakamura [65] suggested to look at the equivalent length, for a large L eq implies a large effective diffusivity, which is then related to the mixing region. By perfect analogy, comparing the "effective diffusion volume" ωḡ−1 to the original spatial diffusion volume ω g −1 , which play the roles of K eff and κ, respectively, the relating factor becomes the determinant of the effective diffusion tensor |D| 1/2 , i.e., the inverse of the densities shown in section 3.5.
In support of the decomposition of the fluid domain M into regular/coherent and mixing regions, we provide two video animations of the geometric heat flow in the Supplementary Material 4 and 5. In the fourth Supplementary Material, we consider the geometry of mixing related to example 2, and initialize a scalar quantity in the interior of the left coherent structure. Initially, the heat distribution follows the subdivision pattern determined by the fifth eigenvector, until it roughly homogenizes over the coherent structure. Conversely, the leaked heat perfectly homogenizes in the mixing region. Moreover, leakage of heat into the right coherent structure is-just as leakage out of the left coherent structure-extremely slow, such that there remains a significant heat gradient across the boundary of the coherent structures. In the fifth Supplementary Material, we initialize the same amount of heat in the mixing region. Under the heat flow, the heat homogenizes almost instantly across the mixing region, and again keeps a gradient across the boundaries of the coherent structures. The SM5 animation was produced on a timescale which is 1000 times faster than the one in the SM4 animation, and still scalar homogenization appears to be instantly in the mixing region. The shape of the "metastable" configurations matches the predictions based on a visual study of the geometry of mixing in section 3 remarkably well.
The effective diffusivity framework is tied to a specific tracer concentration field. For instance, the fact that diffusion is one-dimensional and the orientation of that single coordinate in physical space is determined by the concrete concentration field at hand. A concentration field with a different initial level set topology may yield different results.
To obtain physically relevant mixing information, it is assumed that a "suitable tracer field" is considered; cf. the effort taken in [65,75] to generate suitable initial tracer fields. Suitability then means, roughly speaking, that its level set topology is already reasonably "equilibrated" w.r.t. the mixing geometry induced by the flow, but at the same time has sufficiently strong gradients to allow for a computationally robust transformation to area coordinates. In other words, concentration gradients are roughly aligned with the direction of slowest diffusion, which is of largest interest.
In contrast, our geometry of mixing provides information about the "mixing ability" [75] or "mixing potential" [71] of the flow, irrespective and independently of a concrete concentration field. In this spirit, we view the recent trajectory encounter volume diagnostic [71] as a consistent approach to estimating our effective diffusivity |D| 1/2 based on discrete trajectory information. The Lagrangian FPE remains of full dimension and invokes an effective diffusion tensor to account for all possible level set topologies. This generality offers the opportunity to quantify, for instance, mixing of non-equilibrated concentration fields, or of fronts. Moreover, our Lagrangian heat-flow-based framework offers both intrinsic visual diagnostics and a global operator-based approach for computer-supported distinction of mixing regions from LCSs, both fully consistent with each other.

A. Preliminaries
A.1. Riemannian geometry Throughout, let (M, g) be a compact, smooth, connected d-dimensional Riemannian manifold, possibly with smooth boundary ∂M . For any smooth real-valued function f ∈ C ∞ (M ) its exterior derivative df is invariantly defined as a one-form on M and given by df = ∂f ∂x i dx i in local coordinates with Einstein's summation convention.
The metric (tensor field) g induces a unique volume form ω g , with respect to which we may define the L p (M, g)-spaces of p-integrable functions on M . As usual, we define an inner product on L 2 (M, g) by The metric tensor g x (·, ·) defines a scalar product on each tangent space T x M , which allows to identify the cotangent space T * x M with the tangent space T x M by the canonical/musical isomorphism, see [56, p. 342]. The musical isomorphism becomes an isometry if we equip the cotangent space with the inverse or dual metric g −1 . In this context, the superindex −1 is motivated by the fact that, in local (dual) coordinates, the dual metric is represented by the inverse matrix G −1 of the metric-representing matrix G. We denote by | · | gx both induced norms on the fibers T x M and T * x M , where the domain will be clear from the argument. Since the exterior derivative df of f is dual to the gradient grad g f of f by the musical isomorphism, its isometry expresses itself through |df | gx = | grad g f | gx at any x ∈ M .
For brevity, we omit the indices at the inner product notation when there is no risk of confusion with the above L 2 (M, g) inner product on scalar functions.
Finally, if M has a non-empty boundary, there exists a unique outward unit normal vector field ν g on ∂M , where orthogonality is defined with respect to g.

A.2. Elliptic differential operators
An elliptic 7 second-order differential operator P on M takes the form in local coordinates, where a ij , b i , c are smooth (real) coefficients and a ij ij is symmetric and (uniformly) positive definite. The most important example of a second-order elliptic operator is the negative Laplace-Beltrami operator reviewed in appendix A. 3.
If M has no boundary, an elliptic differential operator P defines a closed operator on L p (M ) for any 1 ≤ p < ∞. If ∂M = ∅, we add the natural Neumann boundary condition du(ν g ) = 0 on the boundary ∂M, u ∈ C ∞ (M ).
The resulting elliptic boundary operator P N is a closed operator on L 2 (M, g). The following is a classic result on elliptic operators, see, e.g., [2,Sec. 1.4] and [51].
Proposition 2. Let P be a second-order elliptic operator on int M that is formally self-adjoint on L 2 (M, g). Then the following statements hold true.
1. The operator P (or P N , resp.) is self-adjoint and has purely discrete spectrum.
3. Each eigenvalue has finite multiplicity and the eigenspaces corresponding to distinct eigenvalues are L 2 (M, g)-orthogonal.
4. The Hilbert space L 2 (M, g) is the direct sum of the eigenspaces. All eigenfunctions are C ∞ -smooth.

A.3. The Laplace-Beltrami operator
There exists a unique operator d * , also referred to as the codifferential, acting on oneforms and yielding smooth functions, which is the adjoint of the exterior derivative d in the sense that The Laplace-Beltrami operator on smooth functions induced by the Riemannian metric g is then defined as with its weak formulation In terms of the coordinate representation eq. (15), one finds that a ij = g ij , where g ij are the components of the dual metric g −1 8 , b j = Γ j ij g ij and c = 0, where Γ k ij are the Christoffel symbols; see also [5,Sec. 3.15.1]. Solutions to the corresponding eigenproblem for ∆ g , ∆ g u = λu on the interior of M, 8 The matrix representation g ij of g −1 can therefore be interpreted as the natural diffusion tensor.
are of great importance in many mathematical and applied disciplines. As mentioned above, in the case that M has non-empty boundary ∂M , we are interested in the Neumann eigenproblem, i.e., with the "no-outflux" boundary condition du(ν g ) = 0 on ∂M.
Notably, Proposition 2 applies to d * d and, hence, implies the well-known spectral properties of the Laplace-Beltrami operator; see Proposition 3. An equivalent, classic pointwise representation of the Laplace-Beltrami operator is where gradient and divergence are the unique vector field and function, respectively, for which df (x)V = g(grad f (x), V ), and div(V )ω g = d(ω g (V, ·, . . . , ·)) hold for any vector field V on M , see [56]. The important issue is that the gradient operator is determined by the metric g, whereas the divergence operator is determined by the volume form ω g . In general, in the definition of the Laplace-Beltrami operator, the volume form does not have to be the one induced by the metric, which leads to the notion of generalized Laplace-Beltrami operator, see appendix A.4. Two important properties of the Laplace-Beltrami operator will be relevant in our work. The first is commutativity with isometries [12, p. 27]. That is, for an isometry T : (N, h) → (M, g) between two Riemannian manifolds (N, h) and (M, g), i.e., h = T * g, one has ∆ h T * = T * ∆ g .
Suppose T 1/2 : D 1/2 R n → (M, g) are two global parameterizations of M , and D 1/2 are equipped with the respective pullback metrics T * 1/2 g. Then, both the T i 's and the coordinate change T 1 • T −1 2 : (D 2 , T * 2 g) → (D 1 , T * 1 g) are isometries by definition, and isometry invariance yields coordinate independence of eigenvalues and eigenfunctions of ∆ g . Specifically, eigenfunctions of ∆ 1 and ∆ 2 transform into each other by the above coordinate change and its inverse. This invariance under coordinate or observer changes is also referred to as objectivity in continuum mechanics. Note that objectivity comes without extra effort once the operator is introduced in an intrinsic coordinate-free fashion as above.
Second, recall that harmonic functions f are defined by ∆ g f = 0 [83,Ch. 6]. It is wellknown [83, p. 222] that the only harmonic functions on M are the constant functions, i.e., functions f ∈ C ∞ (M ) with df = 0. If M is connected, a harmonic function has to attain the same value on all of M , and the space of harmonic functions, ker ∆ g , is one-dimensional and isomorphic to the real numbers R, by the canonical identification of constant functions on M with their single value. In other words, the Laplace-Beltrami operator has a simple 0 eigenvalue.
If M has n connected components, it is easy to see that there exist n linearly independent harmonic functions: define harmonic functions by setting them equal to the constant 1 on one component and to 0 on all other components. As a consequence, the 0 eigenvalue has multiplicity n, the space of harmonic functions is n-dimensional and spanned by the characteristic functions of the connected components. It is therefore a simple task to extract the connected components from a set of linearly independent (if not orthogonal) eigenvectors.
The gap between the zero eigenvalue and the first nontrivial eigenvalue is often referred to as spectral gap or eigengap. Standard perturbation theory implies that eigenvalues of self-adjoint operators change Lipschitz continuously under continuous perturbations, see [49,Ch. V.4]. Hence, zero eigenvalues remain close to zero for small perturbations, becoming what is referred to as the dominant or leading eigenvalues. In contrast, the corresponding eigenfunctions may exhibit discontinuities at eigenvalue crossings.
We summarize the above discussion on relevant spectral properties of the Laplace-Beltrami operator in the following proposition.
Proposition 3. For the Laplace-Beltrami operator ∆ g the following statements hold: 1. The operator ∆ g (or ∆ N g , resp.) is self-adjoint and has discrete spectrum.
3. Each eigenvalue has finite multiplicity and the eigenspaces corresponding to distinct eigenvalues are L 2 (M, g)-orthogonal.
4. The Hilbert space L 2 (M, g) is the Hilbert sum of the eigenspaces. All eigenfunctions are C ∞ -smooth.
5. The operator ∆ g is an objective operator, i.e., the spectrum does not depend on the coordinate representation and eigenfunctions transform according to coordinate transformations.
6. The multiplicity of the 0-eigenvalue equals the number of connected components of M .

A.4. Weighted manifolds and the generalized Laplace-Beltrami operator
We call a triple (M, g, µ) a weighted manifold with M and g as above, and µ a measure on M given by integrating indicator functions of measurable sets A with respect to dµ = ρω g , i.e., Here, ρ is some smooth positive density [39]. This gives rise to corresponding L p -spaces over M , and in particular to the Hilbert space L 2 (M, µ). Then the exterior derivative has again an adjoint operator d * µ and we may define the generalized Laplace-Beltrami operator by ∆ g,µ = −d * µ d = div µ • grad g .
Its weak formulation takes the form [39] f, d * µ dh 0,µ =ˆM g −1 (df, dh)dµ. (18) Under the usual assumptions, ∆ g,µ admits a unique self-adjoint extension and has all properties stated in Proposition 3, see, for instance, [39]. In what follows, we will omit the subindex µ in the notation of d * when there is no risk of confusion.

A.5. Heat flows
In this section, let us assume that M has no boundary for simplicity. Then, given an elliptic, nonpositive second-order differential operator P on M 9 , the (infinitesimal) generator, (exp(tP )) t≥0 is an analytic semigroup of bounded operators on L p (M ), 1 ≤ p < ∞, and u(t) = exp(tP )u 0 , u 0 ∈ L p (M ), is the unique solution of the generalized heat equation d dt u(t) = P u(t), u(0) = u 0 ; see, e.g., [49,Ch. IX.1]. As usual, we call exp(tP ) the heat flow generated by P . By the spectral mapping theorem, we have σ (exp(tP )) = exp (t[σ(P )]) , with corresponding eigenprojections. In other words, it suffices to study the spectrum and the eigenprojections of the generator P to determine subspaces which are left invariant under the heat flow (exp(tP )) t≥0 . If we consider a Hölder continuous curve t → P (t) of elliptic second-order differential operators, the unique solution of d dt u(t) = P (t)u(t), u(0) = u 0 , is given by u(t) = U P (t, 0)u 0 , where the generalized heat process {U P (t, s)} t≥s≥0 is the non-autonomous parabolic solution operator generated by P , see [4,Ch. II]. In particular, it satisfies U P (t, t) = Id L 2 (M ) , and U P (t, τ )U P (τ, s) = U P (t, s), for s ≤ τ ≤ t, and one has U P (t, s) = exp((t − s)P ) if P is time-translation invariant. The integral kernel u P (t, s; ·, ·) of U P (t, s), U P (t, s)ψ(x) =ˆM u P (t, s; x, y)ψ(y)ω g , is called the heat kernel of P . 9 That is, −P is elliptic in the sense of appendix A.2.
The function p is a phase space representation of the operator known as the symbol of P ε . The class of pseudodifferential operators includes in particular differential operators with smooth coefficients. By using local charts, one can also define pseudodifferential operators on a manifold (M, g). More explicitly, T ε is called a pseudodifferential operator if the following holds: for any x 0 ∈ M there is a local chart U x 0 and two functions u, v ∈ C ∞ c (U ) that are equal to 1 close to x 0, and with v = 1 on a set containing the support of u, such that T ε u = vT ε u + r ε .
Here, in the local chart U it holds vT ε u = P ε for some symbol p : R 2d → C, and the remainder r ε is a smoothing operator of size O(ε ∞ ) that maps distributions to Schwartz functions.

B. Proof of Proposition 1
We need to prove that ∆ g +∆ Φ * g = ∆ḡ. First, recall that the Laplace-Beltrami operator associated to some metric h reads as in a local coordinate. In the following, we parametrize the one-dimensional manifold by a unit speed g-geodesic. In this coordinate, we have ∆ g = ∂ 2 .
We start with the left hand side of our assertion. Let f ∈ C ∞ (M ) and S = Φ −1 . Then, omitting the argument x ∈ M for brevity, we calculate where we have used Faa di Bruno's formula in the third line, S (Φ(x)) = Φ (x) −1 in the fourth and S (Φ(x)) = −Φ (x)Φ (x) −3 in the fifth. The latter can be easily shown by twice differentiating the identity S • Φ(x) = x.