1 Introduction

Understanding the distribution of physical quantities by advection–diffusion is of fundamental importance in many scientific disciplines, including turbulent (geophysical) fluid dynamics and molecular dynamics. Of particular interest are coherent structures, for which there exist many phenomenological descriptions, visual diagnostics and mathematical approaches; see Hadjighasem et al. (2017) for a recent review. In fluid dynamics, coherent structures are often thought of as rotating islands of particles with regular motion, which move in an otherwise turbulent background (McWilliams 1984; Fazle Hussain 1986; Provenzale 1999; Haller and Beron-Vera 2013). In molecular dynamics, coherent structures (or almost-invariant sets) are thought of as conformations, i.e., sets of configurations of the molecule which are stable on time scales much larger than those of molecular oscillations (Schütte 1999, 2003).

In recent years, there has been an explosion of coherent structure detection methods based on flow information. Relying on flow information appears to be a necessary step in non-autonomous/unsteady velocity fields, since instantaneous velocity snapshots and their streamlines are no longer conclusive for material motion, in contrast to the autonomous/steady case. At first sight, however, these methods look very different: In the category of variational approaches, some methods require preservation of boundary length (Haller and Beron-Vera 2013), minimization of mixing under the flow (Froyland et al. 2010; Froyland 2013) or minimization of the surface-to-volume ratio (Froyland 2015; Froyland and Junge 2015). A different class of methods considers averages of observables along trajectories (Mezic et al. 2010; Budišić and Mezić 2012; Mancho et al. 2013; Mundel et al. 2014; Haller et al. 2016; AlMomani and Bollt 2018) and seeks coherent structures as sets with similar statistics. Recent (graph) clustering approaches (Froyland and Padberg-Gehle 2015; Hadjighasem et al. 2016; Banisch and Koltai 2017; Padberg-Gehle and Schneide 2017) assess coherence based on mutual trajectory distances. Besides, there exist geometric and topological approaches to coherence; see Ma and Bollt (2014, 2015b) and Allshouse and Thiffeault (2012), respectively. Comparison studies of these methods have so far been restricted to simulation case studies (Allshouse and Peacock 2015; Ma and Bollt 2015a; Hadjighasem et al. 2017).

Even though many of the above-mentioned approaches focus on different phenomenological features of coherent structures, often the underlying motivation is that Lagrangian coherent structures are expected to be material sets which are the least vulnerable to (weak) diffusion. By material or Lagrangian sets, we mean—as usual in continuum mechanics—flow invariant sets, or, equivalently, fixed sets of particles. The invulnerability to diffusion is often modeled via some requirement on boundary deformation under the flow, see, for instance, Haller and Beron-Vera (2012), Haller and Beron-Vera (2013), Ma and Bollt (2014), Froyland (2015). Despite the intuitive reference to diffusion, all these methods assume a purely advective transport process. In this work, we develop a unifying framework for the study of coherence from the Lagrangian viewpoint on advection–diffusion and provide new mathematical connections between Cauchy–Green tensor-based methods developed by Haller and coworkers (Haller and Beron-Vera 2012, 2013; Farazmand et al. 2014), the dynamic Laplacian methodology by Froyland (2015) and Froyland and Kwok (2017) and Nakamura’s effective diffusivity framework, adding to the previously found connection between the dynamic Laplacian and the probabilistic transfer operator approach (Froyland 2015); see Fig. 1 for a schematic overview and more details in Sect. 5.

Fig. 1 Schematic representation of the connections between different methods for coherent structure detection

The intuition of Lagrangian coherence as persistence to diffusion leads us to the incompressible advection–diffusion equation (ADE) in Lagrangian coordinates, which is of diffusion-only type; see also Press and Rybicki (1981), Krol (1991), Knobloch and Merryfield (1992), Thiffeault (2003) and Fyrillas and Nomura (2007) for earlier related approaches. In the Lagrangian frame, we view Lagrangian coherent sets or structures (LCSs) as metastable sets under the Lagrangian ADE. It turns out that the deformation by advection (in the Eulerian frame) is equivalent to a deformation of the geometry of the (initial) material manifold, i.e., in the Lagrangian frame. This change of perspective from space (Eulerian frame) to material (Lagrangian frame) solves, by the way, the problem of separating the reversible effects of pure advection from the irreversible effects of advection and diffusion acting together, see Nakamura (1996) and Shuckburgh and Haynes (2003). Time averaging of the Lagrangian ADE yields an autonomous diffusion-type equation, whose generator coincides—in the case of spatially isotropic diffusion—with Froyland’s recently introduced dynamic Laplacian (Froyland 2015). Froyland’s approach is motivated by a dynamic analogue to the isoperimetry problem, i.e., the optimal bisection of a manifold, where optimality is measured with respect to the ratio between the area of the bisection surface and the volume of the smaller of the two parts. Our independent and physical advection–diffusion-based derivation of the self-adjoint dynamic Laplacian establishes a link to symmetric Markov processes and their metastable decomposition of state space (Davies 1982a, b; Deuflhard et al. 2000; Huisinga and Schmidt 2006).

The Lagrangian averaged diffusion tensor field generates an intrinsic Riemannian geometry on the material manifold, which we refer to as geometry of mixing; cf. also Giona et al. (2000) for the use of this terminology, however, not in an averaging sense. The self-adjoint Laplace operator associated with the geometry of mixing can be investigated in detail by methods from semigroup and operator theory (Davies 1982b, 1995b), Riemannian and spectral geometry (Cheeger 1970; Lablée 2015) and visualization, i.e., diffusion tensor imaging (DTI). The technical requirements on the flow are volume preservation and smoothness and on the original material manifold are compactness and boundary regularity that permits the formulation of (homogeneous) Neumann boundary conditions. Since we are working in a Riemannian geometry setting, we benefit from the rich intuition about the role of eigenfunctions gathered in applied and computational harmonic analysis (Coifman and Lafon 2006; Dsilva et al. 2015).

While our theory as presented in the current paper is of continuous type and theoretically requires arbitrarily fine dynamic information, it is strongly related to the diffusion maps methodology (Coifman and Lafon 2006); cf. also Banisch and Koltai (2017). The classic, albeit not exclusive, application case there is that of manifold learning, i.e., the computation of topological and geometric features (such as intrinsic coordinates) of manifolds embedded in a Euclidean, usually high-dimensional space. The situation is thus one of a static manifold. Recently, these ideas have been extended toward including dynamics (Giannakis and Majda 2012; Shnitzer et al. 2017; Marshall and Hirn 2018), taking different approaches than ours.

Another contribution of ours is the following conceptual clarification. Lagrangian coherent structures (LCSs) are commonly referred to as transport barriers. With its reference to advection through the term “transport,” this translates then to sets with (near-)zero advective flux. It has been pointed out earlier (Nakamura 1996; Haller and Beron-Vera 2012) that in purely advective flows any material surface constitutes a transport barrier by flow invariance. Our approach, and its consistency with many existing LCS methods, clarifies the role of LCSs as diffusion or mixing barriers; cf. Froyland (2013) for an Eulerian analogue.

The paper is organized as follows. We start in Sect. 2 by recalling some fundamental concepts from Riemannian geometry, the Laplace operator, its induced heat flow and metastability. Section 3 is devoted to the derivation and discussion of the Lagrangian version of the advection–diffusion equation (ADE), the definition of Lagrangian coherent structures in this framework and the derivation of the geometry of mixing. In Sect. 4, we study and visualize the geometry of mixing in search of signatures of diffusion barriers. We close with a discussion of relations to previously developed methods and future directions in Sect. 5. Readers interested in applications in atmospheric and oceanic fluid dynamics may find the discussion of connections to the effective diffusivity framework in Sect. 5.5 particularly relevant.

Notation Throughout this paper, we use the following notations.

First, for the symmetric positive-definite matrix representation \(G\in \mathbb {R}^{d\times d}\) of a Riemannian metric g on M (in local coordinates), we denote its ordered eigenvalues by \(0<\mu _{\min }(G)\le \cdots \le \mu _{\max }(G),\) and the corresponding eigenvectors (in those coordinates) by \(v_{\min }(G),\ldots , v_{\max }(G)\).

Second, for any time-dependent map \([0,T]\ni t\mapsto \gamma (t)\in X\), with X some linear space, we define the time average of \(\gamma \) by

$$\begin{aligned} \frac{1}{T}\int _0^T\gamma (t)\,\mathrm {d}t. \end{aligned}$$

2 Preliminaries

For general references on (weighted) Riemannian manifolds with emphasis on Laplace operators, heat flows and heat kernels, see Chavel (1984), Rosenberg (1997), Grigor’yan (2009), Jost (2011) and Lablée (2015).

2.1 Weighted Manifolds and the Laplace Operator

Let \((M,g,\nu )\) be a weighted manifold: M a compact, complete, smooth, connected d-dimensional Riemannian manifold, possibly with sufficiently regular boundary \(\partial M\); g a smooth Riemannian metric (tensor field); and \(\nu \) a measure on M given by integrating indicator functions of measurable sets A with respect to \(\mathrm {d}\nu =\rho \,\mathrm {d}x\), i.e.,

$$\begin{aligned} \nu (A) = \int _M \chi _A(x)\,\mathrm {d}\nu (x) = \int _A \rho \,\mathrm {d}x. \end{aligned}$$

Here, \(\mathrm {d}x\) is the unique (Riemannian) volume form induced by g and \(\rho \) is some smooth positive density (Grigor’yan 2006). This gives rise to corresponding \(L^{p}\)-spaces over M, and in particular to the Hilbert space \(L^{2}(M,\nu )\).

For any smooth real-valued function \(f\in C^{\infty }(M)\), its exterior derivative \(\mathrm {d}f\) is a one-form on M, invariantly defined by

$$\begin{aligned} \mathrm {d}f=\frac{\partial f}{\partial x^{i}}\mathrm {d}x^{i} \end{aligned}$$

in local coordinates, where we make use of Einstein’s summation convention. Thus, when viewed as a vector in local coordinates, \(\mathrm {d}f\) comprises the partial derivatives of f in the coordinate directions. The metric tensor \(g_{x}(\cdot ,\cdot )\) defines a scalar product on each tangent space \(T_{x}M\), which allows one to identify the cotangent space \(T_{x}^{*}M\) with the tangent space \(T_{x}M\) via the canonical/musical isomorphism; see Lee (2012, p. 342). In local coordinates, this isomorphism is given by the inverse Gramian matrix \(G^{-1}\), where G is the matrix representation of the tensor g. In particular, one has

$$\begin{aligned} G^{-1}\mathrm {d}f=\,\mathrm{grad}\,_gf, \end{aligned}$$

the gradient of f w.r.t. the metric g. Moreover, the divergence of a vector field V may be defined implicitly via the Lie derivative of the volume form \(\mathrm {d}\nu \) in the direction of V,

$$\begin{aligned} \mathcal {L}_V(\mathrm {d}\nu )=\,\mathrm{div}\,_{\nu }(V)\mathrm {d}\nu . \end{aligned}$$

Intuitively, the divergence measures the rate of expansion of volume along the flow induced by V. In particular, if \(\,\mathrm{div}\,(V)=0\), volume is preserved by the flow of V.

Finally, the Laplace operator \(\Delta _{g,\nu }\) is defined by

$$\begin{aligned} \Delta _{g,\nu } f{:}{=}\,\mathrm{div}\,_{\nu }\,\,\mathrm{grad}\,_g f = \,\mathrm{div}\,_{\nu }\,D\,\mathrm {d}f,\qquad D{:}{=}G^{-1}. \end{aligned}$$

Its weak formulation takes the form (Grigor’yan 2006, 2009)

$$\begin{aligned} \left\langle f,\,\mathrm{div}\,_{\nu }\,\mathrm{grad}\,_gh\right\rangle _{0,\nu } = -\int _M g^{-1}(\mathrm {d}f,\mathrm {d}h)\,\mathrm {d}\nu = -\int _M g(\,\mathrm{grad}\,_gf,\,\mathrm{grad}\,_gh)\,\mathrm {d}\nu , \end{aligned}$$
(1)

where Eq. 1 is known as Green’s formula. Here, f and h need to satisfy homogeneous Dirichlet or Neumann boundary conditions (Grigor’yan 2006), and \(g^{-1}\) denotes the dual metric (to g), which is the pullback of g by the canonical isomorphism. By construction, the isomorphism is an isometry and, hence, one has \(\Vert \,\mathrm{grad}\,_g f(x)\Vert _g = \Vert \mathrm {d}f(x)\Vert _{g^{-1}}\) for any \(x\in M\) and any smooth f.
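The identification of \(\mathrm {d}f\) with \(\,\mathrm{grad}\,_gf\) and the isometry property can be checked by hand at a single point. The following minimal sketch does so in two dimensions; the Gramian `G` and the covector `df` are arbitrary illustrative values, and all function names are hypothetical helpers rather than part of any library.

```python
import math

def inv2(G):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = G
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(A, v):
    """Matrix-vector product for 2x2 matrices."""
    return [A[0][0] * v[0] + A[0][1] * v[1],
            A[1][0] * v[0] + A[1][1] * v[1]]

def bilinear(A, v, w):
    """Evaluate the bilinear form v^T A w."""
    return sum(v[i] * A[i][j] * w[j] for i in range(2) for j in range(2))

# Illustrative (assumed) data at one point x in local coordinates.
G = [[4.0, 1.0], [1.0, 2.0]]   # symmetric positive-definite Gramian of g
df = [3.0, -1.0]               # partial derivatives of f, i.e., components of df

D = inv2(G)                    # matrix of the dual metric g^{-1}
grad = matvec(D, df)           # grad_g f = G^{-1} df

norm_grad_g = math.sqrt(bilinear(G, grad, grad))   # ||grad_g f||_g
norm_df_dual = math.sqrt(bilinear(D, df, df))      # ||df||_{g^{-1}}

# Defining property of the gradient: g(grad_g f, w) = df(w) for every w.
w = [0.7, -2.0]
lhs = bilinear(G, grad, w)
rhs = df[0] * w[0] + df[1] * w[1]

print(grad, norm_grad_g, norm_df_dual, lhs, rhs)
```

For these values, the gradient has components (1, -1), and both norms equal 2, in line with the isometry statement above.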

It is well known (Grigor’yan 2006) that \(\Delta _{g,\nu }\) can be represented in local coordinates as

$$\begin{aligned} g^{ij}(x)\frac{\partial }{\partial x^{i}}\frac{\partial }{\partial x^{j}}+b^{i}(x)\frac{\partial }{\partial x^{i}}+c(x), \end{aligned}$$

where \(g^{ij},b^{i},c\) are smooth (real) coefficients and \(\left( g^{ij}\right) _{ij}=G^{-1}\) is symmetric and uniformly positive definite. That is, the principal symbol \(g^{ij}(x)\xi _{i}\xi _{j}\) of \(-\Delta _{g,\nu }\) satisfies

$$\begin{aligned} g^{ij}(x)\xi _{i}\xi _{j}\ge \gamma |\xi |_{g_{x}}^{2}, \qquad (x,\xi )\in T^{*}M, \end{aligned}$$

for some \(\gamma >0\). In fact, one even has \(g^{ij}(x)\xi _{i}\xi _{j}=|\xi |_{g_{x}}^{2}\) by definition. As a consequence, \(\Delta _{g,\nu }\) is a (uniformly) elliptic second-order differential operator on M.

One important property of the Laplace operator is its invariance under (volume-preserving) isometries, see Grigor’yan (2006, Sect. 4.2): We call \(T:N\rightarrow M\) an isometry between two weighted manifolds \((N,\tilde{g},\tilde{\nu })\) and \((M,g,\nu )\) if T is a diffeomorphism and \(\tilde{g}=T^*g\) (\(\tilde{g}\) is the pullback metric) and \(\nu =T_*\tilde{\nu }\) (\(\nu \) is the pushforward measure). For such isometries, one has

$$\begin{aligned} \Delta _{\tilde{g},\tilde{\nu }}T^{*}=T^{*}\Delta _{g,\nu }, \qquad \text {or, explicitly}\qquad \Delta _{\tilde{g},\tilde{\nu }}(f\circ T)= \left( \Delta _{g,\nu }f\right) \circ T, \end{aligned}$$
(2)

for \(f\in C^\infty (M)\). This directly implies the coordinate independence of eigenvalues and eigenfunctions of \(\Delta _{g,\nu }\), when interpreting N as a global reparametrization of M.

Following a well-established procedure, see Jost (2011, Chap. 3) for the case when \(\mathrm {d}\nu =\mathrm {d}x \) and Grigor’yan (2006, Theorem 2.2), \(\Delta _{g,\nu }\)—defined on smooth functions—can be uniquely extended to a self-adjoint non-positive definite operator on \(L^2(M,\nu )\) by the Friedrichs extension. Indeed, Green’s formula, Eq. 1, implies that \(\Delta _{g,\nu }\) is \(L^2\)-symmetric on the domain of classically smooth functions \(C^\infty (M)\). Density of \(C^\infty (M)\) in \(L^2(M,\nu )\) follows directly from the observation that \(L^2(M,\nu )\) and \(L^2(M,\mathrm {d}x)\) are isometric via the unitary transformation \(U_{\sqrt{\rho }}\), \(f\mapsto \sqrt{\rho }f\), which leaves \(C^\infty (M)\) invariant; cf. Davies (1989, Sect. 4.2). As a negative semi-definite, self-adjoint elliptic second-order differential operator, \(\Delta _{g,\nu }\) has the following well-known spectral properties.

Proposition 1

(Grigor’yan (2006, Sect. 2.2), see also Jost (2011, Thm. 3.2.1)). The operator \(\Delta _{g,\nu }\) has

  (i) purely discrete, non-positive spectrum \(0=\lambda _1\ge \lambda _2\ge \ldots \), and eigenvalues accumulate only at \(-\infty \);

  (ii) pairwise \(L^2(M,\nu )\)-orthogonal eigenspaces;

  (iii) \(C^\infty \)-smooth eigenfunctions, which form a complete basis of \(L^2(M,\nu )\).

The harmonic functions, i.e., eigenfunctions corresponding to the 0-eigenvalue, are constant. Hence, there are as many linearly independent harmonic functions as M has connected components.

The last statement means that the multiplicity of the 0-eigenvalue would equal the number of connected components of M if one allowed M to have multiple connected components; we come back to this issue in Sect. 2.3.

2.2 Heat Flows

An elliptic, non-positive second-order differential operator H (equipped with zero Neumann boundary condition if M has boundary), such as the Laplace operator on M, is the (infinitesimal) generator of an analytic semigroup \(\left( \exp (tH)\right) _{t\ge 0}\) of bounded operators on \(L^{2}(M,\nu )\) (Davies 1982a, 1995a), and \(u(t)=\exp (tH)u_0\), \(u_0\in L^{2}(M,\nu )\), is the unique solution of the generalized heat equation

$$\begin{aligned} \frac{\mathrm {d}}{\mathrm {d}t}u(t)=Hu(t), \qquad u(0)=u_0; \end{aligned}$$

see, e.g., Davies (1989, Chap. 5, Sect. 1.4), and Grigor’yan (2006, Sect. 3). The semigroup \((\exp (tH))_{t\ge 0}\) is called the heat flow generated by H. By the spectral mapping theorem, we have

$$\begin{aligned} \sigma \left( \exp (tH)\right) = \exp \left( t[\sigma (H)]\right) , \end{aligned}$$

with corresponding eigenprojections. In other words, it suffices to study the spectrum and the eigenprojections of the generator H to determine subspaces which are invariant under the heat flow \(\left( \exp (tH)\right) _{t\ge 0}\). For \(H=\Delta _{g,\nu }\), the heat flow maps \(L^2\) functions to \(C^{\infty }\) functions, and the heat kernel is symmetric. Hence, the heat flow is a family of self-adjoint, positivity-preserving operators on \(L^2(M,\nu )\) that leave constants invariant, \(\exp (t\Delta _{g,\nu })1_M=1_M\) (Grigor’yan 2006, Thms. 3.1, 3.3).
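These properties can be observed in a simple finite-difference stand-in for the heat flow. The sketch below is not the operator on M itself: it assumes a one-dimensional grid with unit spacing, a second-difference operator with reflecting (zero-flux) ends in place of \(\Delta _{g,\nu }\) with Neumann boundary conditions, and explicit Euler steps \(u\mapsto u+\delta t\,Hu\) in place of \(\exp (\delta t\,H)\). All names are hypothetical.

```python
import math

def neumann_laplacian(u):
    """Second-difference operator on a 1D grid with reflecting ends."""
    n = len(u)
    out = []
    for i in range(n):
        left = u[i - 1] if i > 0 else u[i]
        right = u[i + 1] if i < n - 1 else u[i]
        out.append(left - 2.0 * u[i] + right)
    return out

def heat_step(u, dt):
    """One explicit Euler step u + dt*H u, a first-order surrogate of exp(dt*H) u."""
    return [ui + dt * hi for ui, hi in zip(u, neumann_laplacian(u))]

def heat_flow(u0, t, steps):
    """Iterate the short-time step; dt = t/steps should not exceed 0.5."""
    dt = t / steps
    u = list(u0)
    for _ in range(steps):
        u = heat_step(u, dt)
    return u

n = 20
u0 = [1.0 if i < 5 else 0.0 for i in range(n)]   # indicator of a subinterval
u = heat_flow(u0, t=10.0, steps=40)              # dt = 0.25

mass0, mass = sum(u0), sum(u)                    # total mass before and after
ones = heat_flow([1.0] * n, t=10.0, steps=40)    # constants should stay constant

# Discrete eigenmode of H and its eigenvalue (closed form for this grid).
k = 1
lam = -2.0 * (1.0 - math.cos(math.pi * k / n))
mode = [math.cos(math.pi * k * (j + 0.5) / n) for j in range(n)]
mode_after = heat_step(mode, 0.25)               # should equal (1 + 0.25*lam) * mode

print(mass0, mass, min(u), lam)
```

In this discrete setting, mass is conserved, non-negative data stay non-negative as long as the step size is at most 0.5, and each eigenmode is simply rescaled by \(1+\delta t\,\lambda \), the first-order counterpart of the factor \(\exp (\delta t\,\lambda )\) from the spectral mapping theorem.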

If we consider a Hölder-continuous curve \(t\mapsto H(t)\) of elliptic second-order differential operators, the unique solution of

$$\begin{aligned} \frac{\mathrm {d}}{\mathrm {d}t}u(t)=H(t)u(t) ,\qquad u(0)=u_0, \end{aligned}$$

is given by \(u(t)=U_{H}(t,0)u_0,\) where the generalized heat process \(\left\{ U_{H}(t,s)\right\} _{t\ge s\ge 0}\) is the non-autonomous parabolic solution operator generated by H, see Amann (1995, Chap. II). In particular, it satisfies

$$\begin{aligned} U_{H}(t,t) =\,\mathrm{Id}\,_{L^{2}(M)}, \quad \text {and}\quad U_{H}(t,\tau )U_{H}(\tau ,s) =U_{H}(t,s),\quad \text {for } s\le \tau \le t, \end{aligned}$$

and one has \(U_{H}(t,s)=\exp ((t-s)H)\) if H is time-translation invariant. The integral kernel \(u_{H}(t,s;\cdot ,\cdot )\) of \(U_{H}(t,s)\),

$$\begin{aligned} U_{H}(t,s)\psi (x)=\int _{M}u_{H}(t,s;x,y)\psi (y)\,\mathrm {d}y, \end{aligned}$$

is called the heat kernel of H.

2.3 Metastability and Metastable Decompositions

In the literature, it is usually assumed that the manifold under consideration is connected, as one may otherwise study each connected component individually. In the spirit of spectral geometry, however, i.e., the study of geometric properties of Riemannian manifolds by means of spectral properties of the Laplace operator and its induced heat flow, one important question is the recognition of manifolds that are connected but “nearly decomposable.” Such a phenomenon is closely related to the concept of metastability, which we recall in the following.

First, we recall the seminal work by Davies (1982a, b) on metastable states in positivity-preserving contraction semigroups \(\left( \exp (tH)\right) _{t\ge 0}\). The prototype example is given by \(H = \Delta \), i.e., a heat flow on a Riemannian manifold, equipped with homogeneous Neumann boundary condition if M has a non-empty boundary, as discussed in Sect. 2.2. Such operators H have real, non-positive spectrum \(0=\lambda _1>\lambda _2\ge \dots \) with a non-degenerate eigenvalue 0, accumulating only at \(-\infty \).

For reference, let us formulate some statements:

  1. the first non-trivial eigenvalue \(\lambda _2\) of H is much smaller in absolute value than the second, \(\lambda _3\);

  2. there exists a subset \(M_1\subset M\) such that the eigenfunction \(u_2\) associated with \(\lambda _2\) is close (in \(L^2\)) to some linear combination of the two indicator functions \(1_{M_1}\) and \(1_{M\setminus M_1}\);

  3. there exists a subset \(M_1\subset M\) such that \(\Vert \exp (tH)1_{M_1}-1_{M_1}\Vert _1\) is small, i.e., the indicator function \(1_{M_1}\) is slowly evolving away from itself under the heat flow \((\exp (tH))_t\), where the distance is measured in \(L^1\).

Now, Davies (1982b, Thm. 7) and Davies (1982a, Thms. 3 and 5) show that (1) implies (2) and (3), and conversely, that (3) implies (1) and (2). Hence, statements (1) and (3) are “qualitatively equivalent” (Davies 1982a, p. 139). Notably, Davies does not give a direct, explicit definition of metastable sets, but rather justifies calling \(M_1\) a metastable set by the estimates underlying (1)–(3). Therefore, metastable sets are directly linked to spectral properties of the generator of the symmetric Markov semigroup. Davies also points out that, having a candidate metastable set \(M_1\), any other set sufficiently close to \(M_1\) would admit the same properties, and hence metastable sets are invariably non-unique.

In Davies (1982b), the above analysis is extended to the physically relevant case, which is of interest in classic LCS applications, when the generator H has n very small eigenvalues, followed by a significant gap to the next eigenvalue:

$$\begin{aligned} 0=\lambda _1> \cdots \ge \lambda _n=-\varepsilon , \quad \lambda _{n+1} = O(1). \end{aligned}$$

Similarly to the above, the corresponding first n eigenfunctions are then roughly given as the linear combination of n indicator functions \(1_{M_1},\ldots ,1_{M_n}\), where the sets \(M_i\) decompose M.
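The link between a small eigenvalue, a nearly piecewise constant eigenfunction and a metastable decomposition can be illustrated on a finite-state stand-in. The sketch below assumes a graph with two fully connected blocks of three states joined by one weak edge of weight `eps`, and takes H to be minus the graph Laplacian; this toy model, the parameter values and all function names are hypothetical and serve only as an illustration. The second eigenpair is computed by power iteration on \(\mathrm {Id}+\tau H\), restricted to the orthogonal complement of the constants.

```python
import math

def build_generator(eps=0.01):
    """H = -(graph Laplacian) of two 3-cliques joined by one weak edge."""
    n = 6
    W = [[0.0] * n for _ in range(n)]
    for block in ((0, 1, 2), (3, 4, 5)):
        for i in block:
            for j in block:
                if i != j:
                    W[i][j] = 1.0
    W[2][3] = W[3][2] = eps            # weak coupling between the blocks
    H = [row[:] for row in W]
    for i in range(n):
        H[i][i] = -sum(W[i])           # rows of H sum to zero
    return H

def apply(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def second_eigenpair(H, tau=0.25, iters=400):
    """Power iteration for Id + tau*H on mean-zero vectors."""
    n = len(H)
    v = [float(i) for i in range(n)]
    for _ in range(iters):
        mean = sum(v) / n
        v = [x - mean for x in v]                      # remove the constant mode
        v = [x + tau * y for x, y in zip(v, apply(H, v))]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    lam = sum(x * y for x, y in zip(v, apply(H, v)))   # Rayleigh quotient
    return lam, v

H = build_generator(eps=0.01)
lam2, v2 = second_eigenpair(H)
set_a = [i for i, x in enumerate(v2) if x < 0]
set_b = [i for i, x in enumerate(v2) if x > 0]
print(lam2, v2, set_a, set_b)
```

In this toy model, the second eigenvalue is of the order of the coupling strength, the remaining eigenvalues are close to \(-3\), and the sign pattern of the second eigenvector recovers the two blocks.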

In the context of reversible Markov chains on finite state spaces (e.g., space–time discretizations of the Markov semigroup above), Davies’s work has been adopted and extended as follows.

First, Deuflhard et al. (2000) and Deuflhard and Weber (2005) have studied weak perturbations of reducible Markov chains. In a geometric heat-flow context, the latter correspond to multiple connected components, and the weak perturbations introduce weak coupling between them. In this framework, a perturbation analysis in terms of an explicit small perturbation parameter \(\varepsilon \) on the dominant eigenvalues and their eigenfunctions is performed and a cluster extraction algorithm called PCCA (or PCCA+) is devised. In the continuous state space setting as considered by Davies, a corresponding construction of a perturbation \(M_{\varepsilon >0}\) that turns a disconnected reference manifold \(M_0\) into a single manifold with “weakly connected” components [as done for finite reversible Markov chains in Deuflhard et al. (2000)] appears to be a very challenging research problem, since both the coupled and the decoupled configurations of \(M_{\varepsilon \ge 0}\) need to be embedded into a single ambient manifold, on which the Laplace operators and their spectra can be compared.

Another important contribution is Huisinga and Schmidt (2006), adopting the approach of Dellnitz and Junge (1999) of addressing metastability (called almost-invariance there) of sets via having high internal transition probability (or, equivalently, low exit probability) as measured by the (almost) invariance ratio

$$\begin{aligned} p(M_1,M_1) = \langle \mathcal {P}1_{M_1},1_{M_1}\rangle _2/\Vert 1_{M_1}\Vert _2^2. \end{aligned}$$

Here, \(\mathcal {P}\) is a Perron–Frobenius/transfer operator, which can be thought of as \(\exp (tH)\) for some \(t>0\) in Davies’s framework. In a different line of research, this approach has been used to detect almost-invariant behavior in deterministic, finite-dimensional dynamical systems (Froyland and Dellnitz 2003; Froyland 2005). The metastability quality of an arbitrary state space decomposition into \(M_1,\ldots ,M_n\) is then assessed via \(\sum _i p(M_i,M_i)\). Specifically, an upper bound is given in terms of \(\sum _i \lambda _i\), \(\lambda _i\) the eigenvalues of \(\mathcal {P}\), and a lower bound in terms of the weighted sum of eigenvalues, where the weights are determined by the norm of the orthogonal projection of the eigenfunctions onto the space spanned by the indicator functions \(1_{M_i}\). Roughly speaking, if the eigenfunctions appear to be almost constant on the \(M_i\), then the lower bound on metastability is close to the upper bound. In this approach, there is no a priori assumption on the proximity of dominant eigenvalues to 1. The provable metastability measure, however, decreases when including eigenfunctions associated with non-small eigenvalues. For another quantification approach to metastability in terms of exit times, see Schütte (2003).
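For a finite state space, the invariance ratio reduces to a few sums over matrix entries. The following minimal sketch assumes a symmetric transition matrix, so that the stationary measure is uniform and the \(L^2\) inner product is the plain dot product; the four-state matrix `P` and the function name are illustrative assumptions, not taken from the cited works.

```python
def invariance_ratio(P, subset):
    """p(M1, M1) = <P 1_M1, 1_M1> / ||1_M1||^2 for a symmetric stochastic matrix P."""
    n = len(P)
    ind = [1.0 if i in subset else 0.0 for i in range(n)]
    P_ind = [sum(P[i][j] * ind[j] for j in range(n)) for i in range(n)]
    numerator = sum(a * b for a, b in zip(P_ind, ind))
    denominator = sum(b * b for b in ind)
    return numerator / denominator

# Assumed example: two pairs of states, weakly coupled through states 1 and 2.
P = [
    [0.60, 0.40, 0.00, 0.00],
    [0.40, 0.59, 0.01, 0.00],
    [0.00, 0.01, 0.59, 0.40],
    [0.00, 0.00, 0.40, 0.60],
]

p_good = invariance_ratio(P, {0, 1})   # set aligned with the weak coupling
p_bad = invariance_ratio(P, {0, 2})    # set cutting through both pairs
total = invariance_ratio(P, {0, 1}) + invariance_ratio(P, {2, 3})
print(p_good, p_bad, total)
```

The set {0, 1} retains almost all of its mass in one step (ratio 0.995), whereas {0, 2} retains only 0.595; the sum over the decomposition into {0, 1} and {2, 3} is the quantity used above to assess the metastability of a decomposition.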

It is generally extremely difficult to optimize the almost-invariance objective function under the constraint of searching among characteristic functions which decompose state space. For this reason, the optimization problem is relaxed toward general densities in \(L^2\), and the decomposition constraint is relaxed to \(L^2\)-orthogonality of densities. In this form, the optimization problem takes the form of Courant–Fischer min–max type and is therefore solved by eigenvalues and eigenfunctions of the (symmetrized) transfer operator. For the extraction of metastable/almost-invariant sets, the theory and methods developed in Davies (1982b), Deuflhard et al. (2000), Huisinga and Schmidt (2006) can then be applied.

In practice, it is common to use heuristic clustering algorithms such as k-means or fuzzy c-means to extract state-space decompositions from eigenfunctions, besides the aforementioned (Dellnitz and Junge 1999; Deuflhard et al. 2000; Deuflhard and Weber 2005). On the one hand, there is theoretical justification for using k-means (Lafon and Lee 2006) by interpreting the values \(\left( w_k(x)\right) _{k=1,\ldots ,n}\) of leading eigenfunctions \(\left( w_k\right) _{k=1,\ldots ,n}\) as a quasi-isometric embedding into some Euclidean space. In this feature space, k-means then effectively optimizes cluster attribution with respect to the intrinsic distance of the data. On the other hand, these clustering algorithms do not address the original optimization problem.
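As a rough illustration of this heuristic step, the sketch below runs Lloyd's k-means iteration on feature vectors formed from eigenfunction values. The sample values, the deterministic choice of initial centers and the function names are assumptions made for this example only; production codes would typically rely on a library implementation.

```python
def dist2(p, q):
    """Squared Euclidean distance in feature space."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, centers, iters=100):
    """Lloyd's algorithm with given initial centers; returns labels and centers."""
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: attach each point to its nearest center.
        labels = [min(range(len(centers)), key=lambda c: dist2(p, centers[c]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for c in range(len(centers)):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

# Assumed values w_2(x_i) of a second eigenfunction at six sample points;
# the constant first eigenfunction carries no information and is omitted.
features = [[-0.41], [-0.40], [-0.42], [0.39], [0.41], [0.42]]
labels, centers = kmeans(features, centers=[features[0], features[-1]])
print(labels, centers)
```

Here the two clusters coincide with the sign structure of the eigenfunction values; with several eigenfunctions, each feature vector would simply have more components.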

2.4 Laplace Operator, Heat Flow and Local Averaging

There exist many tight connections between a Riemannian geometry on a manifold (as modeled by a Riemannian metric and its induced Laplace operator) on the one hand, and the heat flow and heat kernel (induced by the Laplace operator) on the other hand; see, for instance, Grigor’yan (2009) and Lablée (2015). Here, we want to recall one very intuitive connection between short-time heat flows and local averaging.

To this end, consider a Riemannian manifold \((M,g)\) without boundary and with Riemannian measure. Denote the diffusion operator defined by averaging over g-geodesic \(\varepsilon \)-balls \(B_{\varepsilon }^{g}(x)=\lbrace y\in M;\,\,\mathrm{dist}\,_{g}(x,y) \le \varepsilon \rbrace \) of radius \(\varepsilon \) by \(T_{\varepsilon }^{g}\), i.e.,

$$\begin{aligned} \left( T_{\varepsilon }^{g}u\right) (x)=\frac{\int _{B_{\varepsilon }^g(x)}u\,\mathrm {d}x}{\int _{B_{\varepsilon }^g(x)}\,\mathrm {d}x}=\frac{1}{{\hbox {Vol}}_g\left( B_{\varepsilon }^g(x)\right) }\int _{B_{\varepsilon }^g(x)}u\,\mathrm {d}x. \end{aligned}$$

Then, the results from Lebeau and Michel (2010, Thms. 1 and 2) show that

$$\begin{aligned} T_{\varepsilon }^{g} =\,\mathrm{Id}\,_{L^{2}(M,g)}+\tfrac{\varepsilon ^{2}}{2(d+2)}\Delta _{g}+O\left( \varepsilon ^{4}\right) , \qquad \text {for } \varepsilon \rightarrow 0, \end{aligned}$$
(3)

(almost) in the norm-resolvent sense; see Lebeau and Michel (2010) for technical details. In particular, the dominant eigenvalues and their eigenprojections of \(\varepsilon ^2\Delta _{g}\) converge to the eigenvalues and eigenprojections of \(2(d+2)(T_{\varepsilon }^{g}-\,\mathrm{Id}\,)\) as \(\varepsilon \rightarrow 0+\), respecting multiplicity. This strong result can be interpreted in two ways.

Spectral approximation of the short-time heat flow

As recalled in Sect. 2.3, metastable sets for the heat flow are detected by eigenfunctions of the heat flow \(U_{\varepsilon \Delta }\). Now, the right-hand side of Eq. 3 can be read as the second-order operator expansion of the heat flow \(U_{\varepsilon \Delta }\left( \varepsilon /(2d+4)\right) \), i.e., for short time intervals of length \(O(\varepsilon )\). An understanding of this short-time heat flow is already instructive for the general heat flow, since Eq. 7 is autonomous and the long-time heat flow is nothing but an iteration of the short-time heat flow. Equation 3, i.e., the approximation of the short-time heat flow by the local geodesic averaging operator, then states that it is instructive to look at the shape distribution of small geodesic neighborhoods to form an intuition on the action of the short-time heat flow and thereby, possibly, on dominant metastable sets as identified from Laplace eigenfunctions; cf. Sect. 2.3. We make use of this correspondence in a visual exploration of a non-trivial geometry in Sect. 4.1.

Approximation of local geodesic averaging by diffusion

Alternatively, Eq. 3 may be interpreted from left to right. On the left-hand side, we have the compact integral smoothing operator \(T^g_\varepsilon \). This operator is expanded in (non-compact) differential smoothing operators. To zeroth order, i.e., sending \(\varepsilon \rightarrow 0\), the integral kernel of \(T^g_\varepsilon \) becomes the Dirac delta distribution, whose action is given simply by point evaluation, or, on the operator level, by the identity operator. At the second-order level, local averaging is represented by the differential diffusion operator \(\Delta \).
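The expansion in Eq. 3 can be checked numerically in the simplest flat setting. The sketch below assumes dimension \(d=1\) with the Euclidean metric, so that geodesic balls are intervals, \(\Delta _g\) is the second derivative and \(2(d+2)=6\); the test function \(u=\cos \), the evaluation point and the midpoint quadrature are illustrative choices, and the function names are hypothetical.

```python
import math

def ball_average(u, x, eps, n=4000):
    """Average of u over the interval [x - eps, x + eps] by midpoint quadrature."""
    h = 2.0 * eps / n
    return math.fsum(u(x - eps + (k + 0.5) * h) for k in range(n)) / n

def second_order_model(u, d2u, x, eps, d=1):
    """Right-hand side of Eq. 3 without the remainder: u + eps^2/(2(d+2)) * Laplacian u."""
    return u(x) + eps ** 2 / (2.0 * (d + 2)) * d2u(x)

def minus_cos(y):
    """Second derivative of cos."""
    return -math.cos(y)

def expansion_error(eps, x=0.3):
    exact = ball_average(math.cos, x, eps)
    model = second_order_model(math.cos, minus_cos, x, eps)
    return abs(exact - model)

err_coarse = expansion_error(0.1)
err_fine = expansion_error(0.05)
ratio = err_coarse / err_fine   # close to 2**4 if the remainder is O(eps^4)
print(err_coarse, err_fine, ratio)
```

Halving \(\varepsilon \) reduces the discrepancy by a factor of about 16, as expected for an \(O(\varepsilon ^4)\) remainder.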

3 Advection–Diffusion in Eulerian and Lagrangian Frames

Let \(\left( \mathcal {M},\mathfrak {g},\mathrm {d}x\right) \) be a weighted Riemannian d-manifold and \(M\subset \mathcal {M}\), the fluid domain or material manifold, an embedded d-dimensional submanifold equipped with the induced metric, again denoted by \(\mathfrak {g}\). We regard \(\mathfrak {g}\) and the (volume) measure \(\mathrm {d}x\) as universal objects, in the sense that they do not depend on the physical transport and mixing process that we are going to study. In particular, \(\mathrm {d}x\) may be the Riemannian measure induced by \(\mathfrak {g}\), which is what we assume henceforth. Thus, \(\left( \mathcal {M},\mathfrak {g},\mathrm {d}x\right) \) is a Riemannian manifold, and the reader may simply think of the physical space with physical length and volume.

We consider the transport equation/conservation law for the scalar quantity \(\phi \) associated with the (in general non-autonomous) divergence-free vector field V on \(\mathcal {M}\):

$$\begin{aligned} \partial _t\phi + \,\mathrm{div}\,(\phi V) = 0, \qquad \phi (0,\cdot ) = \phi _0. \end{aligned}$$
(4)

Here and throughout, the divergence is the one induced by the physical volume form \(\mathrm {d}x\). As is well known, Eq. 4 may be solved for \(\phi \) by means of the flow map, i.e., the solution to the ordinary differential equation

$$\begin{aligned} \dot{x} = V(t,x), \end{aligned}$$

a smooth one-parameter family of diffeomorphisms \(\Phi ^{t}\), \(t\in [0,T]\), over M,

$$\begin{aligned} \Phi ^{t}:M \rightarrow \Phi ^{t}[M]\subseteqq \mathcal {M}\qquad t \in [0,T],\qquad \Phi ^{0} =\,\mathrm{Id}\,_{M}. \end{aligned}$$

Indeed, the solution may be represented pointwise by \(\phi (t,x) = \phi \left( 0,\left( \Phi ^t\right) ^{-1}(x)\right) \), or globally \(\phi (t,\cdot ) = \mathcal {P}^t(\phi (0,\cdot ))\), where \(\left( \mathcal {P}^t\right) _t\) is the time-dependent family of Perron–Frobenius operators associated with Eq. 4.
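The pointwise solution formula can be made concrete for a flow whose flow map is known in closed form. The sketch below assumes the planar rigid rotation \(V(x,y)=(-y,x)\), which is divergence-free and whose flow map is rotation by the angle t, together with a Gaussian initial density; both choices and all function names are illustrative assumptions.

```python
import math

def flow_map(t, p):
    """Flow map of V(x, y) = (-y, x): rotation by the angle t."""
    x, y = p
    c, s = math.cos(t), math.sin(t)
    return (c * x - s * y, s * x + c * y)

def inverse_flow_map(t, p):
    """Inverse of the flow map; for a rotation, simply rotate back."""
    return flow_map(-t, p)

def phi0(p):
    """Assumed initial density: a Gaussian bump centered at (1, 0)."""
    x, y = p
    return math.exp(-((x - 1.0) ** 2 + y ** 2))

def phi(t, p):
    """Solution of the transport equation: pull the initial density back along the flow."""
    return phi0(inverse_flow_map(t, p))

def residual(t, p, h=1e-5):
    """Central-difference estimate of d(phi)/dt + V . grad(phi), equal to div(phi V) form since div V = 0."""
    x, y = p
    dt = (phi(t + h, p) - phi(t - h, p)) / (2 * h)
    dx = (phi(t, (x + h, y)) - phi(t, (x - h, y))) / (2 * h)
    dy = (phi(t, (x, y + h)) - phi(t, (x, y - h))) / (2 * h)
    return dt + (-y) * dx + x * dy

p0 = (0.8, 0.3)
t = 1.7
carried = phi(t, flow_map(t, p0))   # density at the advected particle position
res = residual(t, (0.5, -0.2))
print(phi0(p0), carried, res)
```

The density is constant along trajectories, \(\phi (t,\Phi ^t(x_0))=\phi _0(x_0)\), and the finite-difference residual of Eq. 4 vanishes up to discretization error.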

3.1 Eulerian Advection–Diffusion Equation

As a starting point, consider the spatial evolution of a scalar density \(\phi \) as it is carried by an incompressible fluid and subject to diffusion, the classic Eulerian advection–diffusion equation (ADE)

$$\begin{aligned} \frac{D\phi }{Dt}=\partial _{t}\phi +\,\mathrm{div}\,(\phi V) =\varepsilon \,\mathrm{div}\,\, D\, \mathrm {d}\phi ,\qquad \phi (0,\cdot ) = \phi _0. \end{aligned}$$
(5)

Here, D/Dt denotes the material/substantial/advective time derivative used in the fluid dynamics literature, and D is the smooth space–time-dependent diffusion tensor field, pointwise symmetric and uniformly positive-definite. The diffusion tensor field is supposed to model only the directional dependence of diffusivity, whereas \(\varepsilon >0\) models the diffusion strength and can be interpreted as the inverse of the dimensionless Péclet number, which quantifies the strength of advection relative to the strength of diffusion. The problem of LCS detection is typically considered in purely advective flows, which we relax here to advection-dominated, weakly diffusive flow regimes, i.e., associated with a large Péclet number.

Note that we do not require the spatial metric \(\mathfrak {g}\) in the formulation of Eq. 5. It is well known that the above ADE captures also anisotropic diffusion, i.e., diffusion with direction-dependent diffusivity. Isotropic diffusion corresponds to a diffusion tensor D, which is represented by (a multiple of) the identity tensor in physical units, i.e., in Riemannian normal coordinates induced by \(\mathfrak {g}\).

Nevertheless, we may actually use the diffusion tensor field D to define a diffusion-adapted metric. This is a classic, but little-known procedure, which seems to go back to Kolmogoroff (1937), see also Masoliver et al. (1987) and Cohen de Lara (1995). It builds on the duality of the Riemannian and the dual metrics, using the fact that D transforms like a dual metric tensor, see Sect. 2.1. Assume D has components \(D^{ij}\) in local coordinates; then, we define a metric tensor field g by the symmetric, positive-definite matrix field \(G{:}{=}D^{-1}=(D_{ij})_{ij}\). It seems appropriate to refer to (geodesic) length measured by g as effective length: In effective length units, the D-diffusion is isotropic by construction, i.e., in g-normal coordinates D is represented by the identity tensor. This is achieved by downscaling/upscaling effective length units (relative to physical length units) in directions of larger/smaller diffusivity, respectively. Note that we do not alter the notion of volume as modeled by \(\mathrm {d}x\). By definition, D is the inverse of the metric Gramian G and therefore may be interpreted as the canonical isomorphism. Thus, \(D \mathrm {d}\phi =\,\mathrm{grad}\,_g\phi \) models the diffusive flux in physical units, and finally Eq. 5 may be rewritten in the form

$$\begin{aligned} \frac{D\phi }{Dt}=\partial _{t}\phi +\,\mathrm{div}\,(\phi V) =\varepsilon \,\mathrm{div}\,\,\mathrm{grad}\,_g\phi = \varepsilon \Delta _{g,\mathrm {d}x}\phi ,\qquad \phi (0,\cdot ) = \phi _0. \end{aligned}$$

By the uniform definiteness assumption on the diffusion tensor field D, the Laplace operator \(\Delta _{g,\mathrm {d}x}\) is uniformly elliptic. Using local coordinate representations, it is easy to see that the physical volume measure \(\mathrm {d}x\) has a density \(\sqrt{\det (D)\det (\mathfrak {G})}\) w.r.t. the volume measure induced by g, where \(\mathfrak {G}\) is the Gramian of \(\mathfrak {g}\).
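The construction of the diffusion-adapted metric can be checked numerically at a single point. The following is a minimal sketch, assuming a Euclidean physical metric (so that \(\mathfrak {G}\) is the identity) and a hypothetical diffusion tensor `D` chosen only for illustration; the frame `E` is one possible g-orthonormal frame, not a canonical choice.

```python
import numpy as np

# Hypothetical anisotropic diffusion tensor in physical (Euclidean) coordinates.
D = np.array([[4.0, 1.0],
              [1.0, 2.0]])

# Diffusion-adapted metric: its Gramian is G = D^{-1}.
G = np.linalg.inv(D)

# The columns of E = D^{1/2} form a G-orthonormal frame, i.e. E^T G E = I.
mu, Q = np.linalg.eigh(D)
E = Q @ np.diag(np.sqrt(mu)) @ Q.T

# D transforms like a dual metric: in the frame E it reads E^{-1} D E^{-T}.
E_inv = np.linalg.inv(E)
D_effective = E_inv @ D @ E_inv.T

# Density of dx w.r.t. the g-volume: sqrt(det(D) * det(Gramian of physical metric)).
density = np.sqrt(np.linalg.det(D) * np.linalg.det(np.eye(2)))

print(np.round(D_effective, 12))  # identity: diffusion is isotropic in effective units
print(density)
```

In effective length units the diffusion tensor is the identity, while the density differs from one whenever \(\det D\ne 1\), which is why the setting is that of a weighted manifold.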

The Eulerian perspective comes with a couple of challenges. First, if one is interested in the evolution of material localized in some non-invariant region M, one needs to solve Eq. 5 on a sufficiently large spatial domain in \(\mathcal {M}\) in order to cover the entire evolution \(\Phi ^{t}(M)\) of material initialized in M. This can be challenging in applications to open dynamical systems such as ocean surface flows. Second, coherent sets computed in this framework—as done in Denner et al. (2016), cf. also Froyland and Koltai (2017)—are conceptually of Eulerian, i.e., space–time, kind, and are not material by construction. This lack of materiality is not, as sometimes stated, due to the addition of diffusion in phase space, as we show by our theory here. In particular, such Eulerian structures generally have both diffusive and advective fluxes through their boundary. It is therefore of interest to study weakly diffusive flows from a material perspective, i.e., in Lagrangian coordinates.

3.2 Lagrangian Advection–Diffusion Equation

Next, let us take a look at Eq. 5 from the Lagrangian viewpoint, cf. Press and Rybicki (1981), Krol (1991), Knobloch and Merryfield (1992), Thiffeault (2003), Fyrillas and Nomura (2007). Formally, this means that we interpret the scalar density as a function of particles by pulling it back to time \(t=0\) through composition with the flow map \(\Phi \). This yields a Lagrangian scalar density \(\varphi =\Phi ^*\phi = \phi \circ \Phi \). Additionally, we need to pull back Eq. 5 to the material manifold, and thus arrive at its Lagrangian form

$$\begin{aligned} \partial _t\varphi = \varepsilon \,\mathrm{div}\,\left( D\Phi (t)^{-1}\cdot D\cdot D\Phi (t)^{-\top }\right) \mathrm {d}\varphi , \qquad \varphi (0,\cdot ) = \phi _0. \end{aligned}$$
(6)

Here, the scalar density \(\varphi \) is no longer subjected to an advective drift—in the Lagrangian perspective, we are following trajectories—but is subject to diffusion generated by the time-dependent family of pullback diffusion tensors. Specifically, denote by \(G(t)^{-1} {:}{=}D\Phi (t)^{-1}\cdot D\cdot D\Phi (t)^{-\top }\) the diffusion tensor in Lagrangian coordinates. Then, by duality, the matrix field G(t) determines a time-dependent family of diffusion-adapted pullback metrics g(t) on M, and we may rewrite Eq. 6 as

$$\begin{aligned} \partial _t\varphi = \varepsilon \Delta _{g(t),\mathrm {d}x}\varphi , \qquad \varphi (0,\cdot ) = \phi _0, \end{aligned}$$

where the Laplace operators \(\left( \Delta _{g(t),\mathrm {d}x}\right) _t\) are induced by the pullback metric \(g(t){:}{=}(\Phi ^t)^*g\) and the physical volume. On an abstract level, this change of notation corresponds exactly to the transformation behavior of Laplace operators (Eq. 2). Equation 6 can thus be viewed as an inhomogeneous, i.e., time-dependent, diffusion equation for Lagrangian scalar densities \(\varphi \) on M.

Remark 1

(Pullback metrics) The pullback metric g(t) is well known in the theory of kinematics of deforming continua by the name (right) Cauchy–Green strain tensor; see, for instance, Abraham et al. (1988, p. 356). In the typical case, when the space \(\mathcal {M}\) is Euclidean and parameterized by the canonical coordinates \(x^1,\ldots ,x^d\), the pullback metric g(t) has the matrix representation \(G(t) = \left( D\Phi (t)\right) ^\top \cdot D\Phi (t)\), where \(D\Phi (t)\) is the linearized flow map with entries \((\partial _j\Phi ^i)_{ij}\). In general coordinates, one has \(G(t)=\left( D\Phi (t)\right) ^\top \cdot G\cdot D\Phi (t)\), where G is the matrix representation of g in local coordinates on \(\Phi ^t(M)\).
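The matrix formulas of Remark 1 can be illustrated as follows. This is a sketch under stated assumptions: the linearized flow map `DPhi` is a hypothetical simple shear, and the helper name `pullback_metric` is introduced here only for illustration.

```python
import numpy as np

def pullback_metric(DPhi, G_spatial=None):
    """Matrix representation G(t) = DPhi^T . G . DPhi of the pullback metric."""
    if G_spatial is None:
        # Euclidean space in canonical coordinates: G is the identity.
        G_spatial = np.eye(DPhi.shape[0])
    return DPhi.T @ G_spatial @ DPhi

# Hypothetical linearized flow map: a simple shear of strength 2.
DPhi = np.array([[1.0, 2.0],
                 [0.0, 1.0]])

# Right Cauchy-Green strain tensor.
G_t = pullback_metric(DPhi)

# Pullback of the isotropic Eulerian diffusion tensor D = I, as in Eq. 6.
DPhi_inv = np.linalg.inv(DPhi)
D_pullback = DPhi_inv @ np.eye(2) @ DPhi_inv.T

print(G_t)
print(G_t @ D_pullback)  # identity: metric and diffusion tensor are dual
```

The last line confirms the duality used in Sect. 3.2: the pullback diffusion tensor is the inverse of the Gramian of the pullback metric.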

Remark 2

We stress that—according to our Lagrangian viewpoint—Eq. 6 is an evolution equation on M, even if the flow does not keep M invariant, i.e., \(\Phi ^{t}(M)\ne M\).

We are now in the position to define our main object of interest.

Definition 1

Lagrangian coherent sets \(U\subset M\) are material sets that are metastable under the time-inhomogeneous heat flow (6), i.e., the advection–diffusion equation in Lagrangian coordinates. By Lagrangian coherent structure we mean the boundary of a Lagrangian coherent set. Since both notions define each other unambiguously, we will abbreviate both simply by LCS and use them mostly synonymously.

The metric g(t) is different from \(g=g(0)\) unless \(\Phi ^{t}\) is an isometry, or, in physical terms, unless \(\Phi ^{t}\) corresponds to a solid body motion. Therefore, the Lagrangian diffusion is not isotropic with respect to \(\mathfrak {g}\), even if the Eulerian diffusion was. This reflects the fact that the flow deformation may have pushed two particles apart or together, and thus, their material exchange by diffusion at some later time point is, respectively, less and more likely (Fig. 2). These intuitive heuristics have been formalized and exploited in Thiffeault (2003) to reduce the full-dimensional ADE to a one-dimensional ADE along the most contracting direction.

Fig. 2

Schematic visualization of the pullback geometry and the induced diffusion. The spatial Euclidean geometry (right) is pulled back to the material manifold from time \(t=0.05\) (left) by the flow map \(\Phi _0^{0.05}\) for the rotating double gyre (Example 1). A spatial diffusion with variance \(\varepsilon =0.1\) (red circle on the right) is pulled back to the red curve on the left, visualizing diffusion in the pullback metric g(0.05) on the material manifold of equal variance. As can be seen, the red curve reaches further out than material diffusion with same variance in the original metric g(0) (visualized by the green circle) in some directions, while it does not reach as far in others. This is due to the deformation by the flow. Note also the duality to the Eulerian deformation perspective presented in Welander’s classic work on two-dimensional turbulence (Welander 1955, Fig. 2) (Color figure online)

Since \(-\Delta _{g(t),\mathrm {d}x}\) is elliptic for all \(t\in [0,T]\) (each is just an isometric representative of the elliptic Laplace operator \(-\Delta _{g,\mathrm {d}x}\) on the flow image), the solution of Eq. 6 in \(L^{2}(M)\) is given by the generalized heat flow \(U_{\varepsilon \Delta _{g(t),\mathrm {d}x}}\) associated with \(\left( \varepsilon \Delta _{g(t),\mathrm {d}x}\right) _t\), cf. Sect. 2.2.

By construction and as a consequence of volume preservation by \(\Phi \), we have the following result.

Proposition 2

For each \(t\in [0,T]\), the operator \(\Delta _{g(t),\mathrm {d}x}=\,\mathrm{div}\,_{\mathrm {d}x}\,\mathrm{grad}\,_{g(t)}\) is self-adjoint on \(L^{2}(M,\mathrm {d}x)\) and admits the spectral properties stated in Proposition 1.

We see that time dependence enters the definition of \(\Delta _{g(t),\mathrm {d}x}\) only through the pullback metric/diffusion tensor field; the physical volume \(\mathrm {d}x\) remains unaltered. For this reason, we henceforth omit the measure in the notation of the Laplace operator.

Remark 3

From a geometric viewpoint, one may consider the heat flow as a tool to study the geometry of manifolds. Then, Eq. 6 may be interpreted as the heat flow on a time-evolving manifold. This setting has been studied recently from the diffusion maps point of view in Marshall and Hirn (2018).

3.3 Metastability in Time-Dependent Processes and Its Approximation

For two reasons, our definition of LCSs given above does not provide a precise mathematical definition, but rather merely invokes the intuition underlying metastability. First, as recalled in Sect. 2.3, metastability of Markov processes comes with an intrinsic vagueness, and second, as pointed out in Koltai et al. (2016, p. 1), “a straightforward definition of a metastable set in the non-stationary, non-equilibrium case may only be given case by case.”

In Koltai et al. (2016), the authors build on previous work by Froyland (2013) and extract metastable sets [defined as coherent sets in the terminology of Froyland (2013)] from features of the singular vectors of the solution operator associated with the time-dependent process; cf. also Marshall and Hirn (2018). In our case, this would correspond to the singular vectors of the generalized heat flow \(U_{\varepsilon \Delta _{g(t),\mathrm {d}x}}\) introduced earlier. Note, however, that features of singular vectors, such as connected regions of uniformly high absolute value, are generally not expected to be invariant under the generalized heat flow. While this is conceptually unproblematic in Eulerian approaches to coherence, which do not require flow invariance of the sets of interest, it is problematic in a Lagrangian approach, where one seeks Lagrangian structures. Those ought to be, by definition, invariant in Lagrangian coordinates.

To enforce invariance, one may consider extracting LCSs from features of the eigenfunctions of the generalized heat flow. The issue with this approach then is that the generalized heat flow is a family of non-self-adjoint operators, whose eigenvalues therefore are not necessarily real. It remains unclear how to interpret complex eigenvalues of modulus almost 1, and their associated eigenfunctions, in a finite-time context.

Another, simplifying ad hoc approach is to approximate the generalized heat flow by an autonomous heat flow, whose generator is obtained from averaging the time-dependent generators (Press and Rybicki 1981; Krol 1991; Knobloch and Merryfield 1992), i.e.,

$$\begin{aligned} \overline{\Delta } {:}{=}\frac{1}{T}\int _0^T \Delta _{g(t),\mathrm {d}x}\,\mathrm {d}t. \end{aligned}$$

We defer a rigorous convergence study of the two heat flows in the vanishing diffusivity limit, \(\varepsilon \rightarrow 0\), to the forthcoming (Karrasch and Schilling 2020); see Krol (1991) for related work.

By the linearity of the divergence and with the time average of the pullback diffusion tensors,

$$\begin{aligned} \bar{g}^{-1} {:}{=}\frac{1}{T}\int _0^T g(t)^{-1}\,\mathrm {d}t, \end{aligned}$$

\(\overline{\Delta }\) takes the form of a Laplace operator on M, i.e.,

$$\begin{aligned} \overline{\Delta } = \,\mathrm{div}\,_{\mathrm {d}x}\,\mathrm{grad}\,_{\bar{g}} = \Delta _{\bar{g},\mathrm {d}x}= \Delta _{\bar{g}}, \end{aligned}$$

where \(\bar{g}\) denotes the metric tensor induced by \(\bar{g}^{-1}\). \(\Delta _{\bar{g}}\) is then a volume-based diffusion operator again.
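To make the averaging concrete, the following sketch computes the time average of pullback diffusion tensors and the induced metric from a list of linearized flow maps. It assumes, for illustration only, that the time integral is approximated by a uniform mean over equidistant samples and that the flow is a hypothetical linear shear; the function name `harmonic_mean_metric` is not taken from any library.

```python
import numpy as np

def harmonic_mean_metric(DPhis, D=None):
    """Return (G_bar, D_bar), where D_bar is the uniform average of the
    pullback diffusion tensors DPhi^{-1} D DPhi^{-T} and G_bar = D_bar^{-1}."""
    d = DPhis[0].shape[0]
    D = np.eye(d) if D is None else D  # isotropic Eulerian diffusion by default
    pullbacks = []
    for DPhi in DPhis:
        DPhi_inv = np.linalg.inv(DPhi)
        pullbacks.append(DPhi_inv @ D @ DPhi_inv.T)
    D_bar = np.mean(pullbacks, axis=0)
    return np.linalg.inv(D_bar), D_bar

# Hypothetical data: linear shear flow sampled at 21 equidistant times in [0, 1].
times = np.linspace(0.0, 1.0, 21)
DPhis = [np.array([[1.0, t], [0.0, 1.0]]) for t in times]

G_bar, D_bar = harmonic_mean_metric(DPhis)

# For comparison: the arithmetic mean of the pullback metrics themselves.
G_arith = np.mean([A.T @ A for A in DPhis], axis=0)

print(G_bar)
print(G_arith)  # differs from G_bar
```

The comparison shows that averaging the dual metrics and then inverting is not the same as averaging the metrics, which is the reason for the harmonic-mean terminology.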

Following a different line of reasoning, Froyland introduced the operator \(\overline{\Delta }\) recently in Froyland (2015) and coined it dynamic Laplacian; cf. also Froyland and Kwok (2017). From our ADE point of view, the dynamic Laplacian of Froyland (2015) can be obtained as the time average of pullbacks of isotropic Eulerian Laplace operators; cf. (2).

The following proposition holds due to the fact that \(\overline{\Delta }\) is the Laplace operator \(\Delta _{\bar{g}}\) associated with the weighted manifold \((M,\bar{g},\mathrm {d}x)\). Notably, ellipticity follows from uniform bounds on the continuously differentiable flow map defined on a compact space–time manifold \(M\times [0,T]\).

Proposition 3

(cf. Froyland (2015, Thm. 4.1)) The operator \(\overline{\Delta }\) is self-adjoint on \(L^{2}(M,\mathrm {d}x)\) and admits the spectral properties stated in Proposition 1.

In the case when M has non-empty boundary, the operator is—as before—equipped with the natural (w.r.t. \(\bar{g}\)) Neumann boundary condition. This can be interpreted as the average of pullbacks of zero Neumann boundary conditions, see Froyland (2015) and Froyland and Kwok (2017).

The metric \(\bar{g}\) endows the material manifold M with a Riemannian geometry that encodes, in an averaged sense, the diffusion as it is observed from a Lagrangian perspective. This allows for (i) a static visualization and exploration and (ii) the application of many well-established techniques and tools from geometric spectral analysis and spectral geometry (Jost 2011; Lablée 2015), as well as harmonic analysis.

Finally, we propose to approximate LCSs, i.e., metastable sets of the non-autonomous Lagrangian ADE (6) by metastable sets for the autonomous Lagrangian evolution equation

$$\begin{aligned} \partial _{t}\varphi =\varepsilon \Delta _{\bar{g}}\varphi . \end{aligned}$$
(7)

By the spectral relation between heat flow and generator, cf. Sect. 2.2, this boils down to a spectral analysis of the generating Laplace operator \(\Delta _{\bar{g}}\). Extracting metastable sets from eigenfunctions of the heat flow/generator as described in Sect. 2.3 then coincides with the procedure put forward in Froyland and Junge (2018), where a couple of example flows are analyzed.

4 Geometry of Mixing

An important outcome of the time averaging of the pullback Laplacians is the geometric structure on the material manifold M given by the harmonic mean metric \(\bar{g}\). This is a material geometry typically different from all material geometries induced by the configuration of the material in space, i.e., its embedding in space as a submanifold. The aim of this section is to study this new weighted (Riemannian) manifold as well as the properties of the induced Laplace operator \(\Delta _{\bar{g}}\) and its heat flow. We will refer to this geometry as the geometry of mixing, thereby reviving and generalizing an earlier related approach by Giona et al. (2000), which refers to the (single) pullback geometry under one flow map of (typically chaotic) flows.

More specifically, we wish to find signatures of coherent and incoherent dynamics, and of the boundary between them, in the static geometry of mixing. We do so by comparing characteristics of the diffusion tensor field relative to the physical geometry \(\left( M,\mathfrak {g}\right) \) and in \(\mathfrak {g}\)-orthonormal coordinates x. In the Euclidean setting, these are the canonical \(x^{i}\)-coordinates.

By choosing a reference geometry relative to which we study the deformed \(\bar{g}\)-geometry, our analysis appears to be somewhat reference dependent. An analogous approach, however, is common in continuum mechanics, where deformed configurations are analyzed relative to a reference configuration (Truesdell and Noll 2004). Ultimately, the spectrum and the eigenprojections of \(\Delta _{\bar{g}}\), which form the basis of our coherent structure detection method, are intrinsic and independent of representations w.r.t. the reference configuration. Notably, our geometric construction is observer independent, or, equivalently, objective, since Euclidean changes of observer do not change the notions of length and volume, and the diffusion tensor field D is given intrinsically.

4.1 Lagrangian Averaged Diffusion Tensor Imaging

In this section, we explore visually the averaged diffusion tensor field in search of signatures of Lagrangian coherent structures. The visualization of second-order tensor fields, and diffusion tensor fields in particular, is referred to as diffusion tensor imaging (DTI) and is a well established and active field of research in scientific visualization; see, for instance, Le Bihan et al. (2001) for a brief review. We denote the Lagrangian averaged diffusion tensor by \(\overline{D}\). Two diffusion phenomena that are of interest in DTI are (i) anisotropy and (ii) some scalar measure of diffusivity.

4.1.1 Anisotropy and Barriers to Diffusion

For anisotropy, several scalar measures have been proposed in the diffusion tensor imaging (DTI) literature, for instance the volume ratio, given by

$$\begin{aligned} \frac{\prod _i^d\mu _i\left( \overline{D}\right) }{\overline{\mu }\left( \overline{D}\right) ^d}, \end{aligned}$$

where \(\overline{\mu }\) denotes the arithmetic mean of the eigenvalues \(\mu _i\left( \overline{D}\right) \). This quantifies the volume of the diffusion ellipsoid relative to the volume of the sphere with radius \(\overline{\mu }\). It takes values between 0 and 1, where 1 corresponds to isotropy and 0 corresponds to a lower-dimensional, degenerate ellipsoid and hence strong anisotropy. Recall that such anisotropy measures are not intrinsic quantities of \(\bar{g}\), but are determined by viewing the \(\overline{D}\)-diffusion tensor in \(\mathfrak {g}\)-orthonormal coordinates.
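A pointwise evaluation of the volume ratio may be sketched as follows, assuming that the averaged diffusion tensor is given as a symmetric matrix in \(\mathfrak {g}\)-orthonormal coordinates; the function name `volume_ratio` and the test matrices are hypothetical.

```python
import numpy as np

def volume_ratio(D_bar):
    """Product of the eigenvalues of D_bar divided by the d-th power of their mean."""
    mu = np.linalg.eigvalsh(D_bar)  # eigenvalues of the symmetric tensor
    return np.prod(mu) / np.mean(mu) ** len(mu)

isotropic = np.eye(2)
anisotropic = np.diag([4.0, 1.0])
degenerate = np.diag([1.0, 0.0])

print(volume_ratio(isotropic))    # isotropy
print(volume_ratio(anisotropic))  # intermediate value
print(volume_ratio(degenerate))   # degenerate ellipsoid
```

By the inequality between the arithmetic and geometric means, the value lies in [0, 1] for any positive semi-definite input.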

Generally, for a point \(p\in M\) and a \(\mathfrak {g}\)-unit direction v, the \(\bar{g}\)-norm of v corresponds to the inverse effective \(\overline{D}\)-diffusivity in v-direction. We are now looking for a canonical, i.e., diagonalized representation of the diffusion tensor \(\overline{D}\) in physical \(\mathfrak {g}\)-unit directions. This can be achieved simply by computing the eigendecomposition of \(\overline{D}\), assuming that the matrix representation of \(\mathfrak {g}\) in the chosen coordinates is the identity. The eigenvalues of \(\overline{D}\) then correspond to the characteristic diffusivities, attained in the directions of the eigenvectors. In other words, the direction \(v_{\max }\left( \overline{D}\right) \) associated with \(\mu _{\max }\left( \overline{D}\right) \) corresponds to the direction of strongest (or fastest) diffusion, and, by duality, to \(v_{\min }\left( \overline{G}\right) \), i.e., the direction which is most strongly compressed under the change of metric from \(\mathfrak {g}\) to \(\bar{g}\) (Fig. 3). The connection between short-time heat flows and averaging on small geodesic balls (w.r.t. the intrinsic geometry, here given by \(\bar{g}\))—recalled in Sect. 2.4—indicates that this visualization procedure may be indicative for the action of the heat flow and the location of metastable states as identified from spectral information.

Fig. 3

Schematic visualization of the g- and \(\bar{g}\)-unit circles (black and red, resp.) in \(\mathfrak {g}\)-orthonormal coordinates, cf. Fig. 2. Note that \(\bar{g}\)-diffusion is fastest in the direction of \(v_{\min }\left( \overline{G}\right) \) because the \(\bar{g}\)-distance is the shortest on g-circles. The corresponding diffusion coefficient in that direction is \(\mu _{\max }\left( \overline{D}\right) =1/\mu _{\min }\left( \overline{G}\right) \). Note also that \(\bar{g}\)-unit spheres typically have much larger volume than \(\mathfrak {g}\)-unit spheres because \(\mu _{\min }\left( \overline{G}\right) \le \mu _{\max }\left( \overline{G}\right) <1\) in large regions of M, and consequently \(\mathfrak {g}\)-unit volumes have \(\bar{g}\)-volume smaller than one there, see Sect. 4.1.3 (Color figure online)

4.1.2 (Mean) Diffusivity: Lagrangian Effective Diffusivity and Mixing Regions

Another quantity of interest in the exploration of diffusion tensor fields is a scalar measure of total (or, mean) diffusivity. It is common to use the trace of the diffusion tensor in the DTI community, i.e., the sum of eigenvalues. For the mean diffusivity, the trace is additionally normalized, i.e., divided by the dimension. In two-dimensional flows, the trace as a measure of absolute diffusivity (in contrast to the relative strength as measured by anisotropy) at a point is strongly dominated by the maximal eigenvalue \(\mu _{\max }\). It is natural to interpret regions where the scalar diffusivity field attains high values as mixing regions, in which localized scalar densities are expected to diffuse very quickly, following preferentially the directions of fastest diffusion, as discussed in Sect. 4.1.1. In the literature, the method of trajectory encounter volume (Rypina and Pratt 2017), cf. also Padberg-Gehle and Schneide (2017), and Nakamura’s effective diffusivity methodology, see Sect. 5.5, have been proposed as a way to compute a Lagrangian diffusivity.

4.1.3 Density

We work throughout in a weighted manifold setting, in which the volume measure \(\mathrm {d}x\) generally does not coincide with the volume induced by the diffusion-adapted metric g. It is therefore of interest to study the deviation of the diffusion-adapted intrinsic volume from the physical volume measure. This amounts to determining the density of \(\mathrm {d}x\) w.r.t. the Riemannian volume measure in the geometry of mixing.

However, since intuition is based on the physical notion of volume, we visualize the inverse problem, i.e., the volume density in the geometry of mixing w.r.t. \(\mathrm {d}x\):

$$\begin{aligned} \mathrm {d}\bar{g}=\sqrt{\det \overline{G}}\ \mathrm {d}x=\frac{1}{\sqrt{\det \overline{D}}}\mathrm {d}x. \end{aligned}$$
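The scalar and directional quantities of Sects. 4.1.2 and 4.1.3 can be evaluated pointwise along the following lines. This is a sketch assuming \(\mathfrak {g}\)-orthonormal coordinates; the function name `dti_scalars` and the sample tensor are hypothetical.

```python
import numpy as np

def dti_scalars(D_bar):
    """Mean diffusivity, dominant diffusion direction and volume density
    of a symmetric positive-definite averaged diffusion tensor D_bar."""
    mu, vecs = np.linalg.eigh(D_bar)              # eigenvalues in ascending order
    mean_diffusivity = np.trace(D_bar) / D_bar.shape[0]
    v_max = vecs[:, -1]                           # direction of fastest diffusion
    density = 1.0 / np.sqrt(np.linalg.det(D_bar)) # sqrt(det G_bar)
    return mean_diffusivity, v_max, density

# Hypothetical strongly anisotropic tensor.
D_bar = np.array([[100.0, 0.0],
                  [0.0, 0.04]])
mean_diffusivity, v_max, density = dti_scalars(D_bar)

print(np.log10(mean_diffusivity), v_max, density)
```

In this example the trace is dominated by the maximal eigenvalue, as noted above for two-dimensional flows, and the sign of `v_max` is arbitrary since only the direction field matters.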

4.1.4 Numerical Examples

For simplicity, we assume that the Eulerian diffusion tensor field in the following examples is given by the identity tensor in the canonical coordinates. The inclusion of a space–time-dependent, anisotropic diffusion tensor field is straightforward, see Remark 1. We are going to visualize the geometry of mixing for two commonly studied flow examples. For an LCS analysis based on spectral data of the dynamic Laplacian for these examples, see Froyland and Junge (2018).

Example 1

[Rotating double-gyre flow (Mosovsky and Meiss 2011)] We consider the transient double-gyre flow on the unit square \([0,1]\times [0,1]\), as introduced in Mosovsky and Meiss (2011). It is given by a time-dependent stream function \(\Psi (t,x,y)=(1-s(t))\sin (2\pi x)\sin (\pi y)+s(t)\sin (\pi x)\sin (2\pi y)\), \(s(t)=t^{2}(3-2t)\), defining the velocity field via

$$\begin{aligned} \dot{x} =-\frac{\partial \Psi }{\partial y}, \qquad \dot{y} =\frac{\partial \Psi }{\partial x}. \end{aligned}$$

The integration time interval is [0, 1] and the computational grid is 500\(\times \)500 points. The flow is designed to interpolate in time an instantaneously horizontal (at \(t=0\)) and an instantaneously vertical (at \(t=1\)) double-gyre vector field. For our metric computations, we average over 21 pullback metrics from equidistant time instances with time step 0.05. The LCSs as computed from a clustering of the dominant eigenfunctions of the dynamic Laplacian \(\overline{\Delta }\) are shown in Fig. 4; see Froyland and Junge (2018) for details. For a visual proof of coherence, we provide an advection movie showing the evolution of the Lagrangian coherent structures as Supplementary Material 1.
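For a single material point, the averaging over 21 pullback tensors can be sketched as follows. The integrator (classical Runge–Kutta with fixed substeps), the central finite-difference approximation of the linearized flow map, the step `delta` and the sample point `p0` are assumptions made for illustration and are not taken from the computations reported here.

```python
import numpy as np

def s(t):
    return t**2 * (3.0 - 2.0 * t)

def velocity(t, p):
    """Velocity (-dPsi/dy, dPsi/dx) of the transient double gyre."""
    x, y = p
    st = s(t)
    dpsi_dx = ((1 - st) * 2 * np.pi * np.cos(2 * np.pi * x) * np.sin(np.pi * y)
               + st * np.pi * np.cos(np.pi * x) * np.sin(2 * np.pi * y))
    dpsi_dy = ((1 - st) * np.pi * np.sin(2 * np.pi * x) * np.cos(np.pi * y)
               + st * 2 * np.pi * np.sin(np.pi * x) * np.cos(2 * np.pi * y))
    return np.array([-dpsi_dy, dpsi_dx])

def flow_samples(p0, times, substeps=20):
    """Integrate from times[0] with RK4; return positions at all sample times."""
    p = np.array(p0, dtype=float)
    out = [p.copy()]
    for t0, t1 in zip(times[:-1], times[1:]):
        h = (t1 - t0) / substeps
        t = t0
        for _ in range(substeps):
            k1 = velocity(t, p)
            k2 = velocity(t + h / 2, p + h / 2 * k1)
            k3 = velocity(t + h / 2, p + h / 2 * k2)
            k4 = velocity(t + h, p + h * k3)
            p = p + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
            t += h
        out.append(p.copy())
    return np.array(out)

def averaged_diffusion_tensor(p0, times, delta=1e-5):
    """Uniform average of DPhi^{-1} DPhi^{-T} at p0 (Eulerian diffusion D = I)."""
    ex, ey = np.array([delta, 0.0]), np.array([0.0, delta])
    xp, xm = flow_samples(p0 + ex, times), flow_samples(p0 - ex, times)
    yp, ym = flow_samples(p0 + ey, times), flow_samples(p0 - ey, times)
    D_bar = np.zeros((2, 2))
    for k in range(len(times)):
        # Columns are the finite-difference derivatives w.r.t. x and y.
        DPhi = np.column_stack([xp[k] - xm[k], yp[k] - ym[k]]) / (2 * delta)
        DPhi_inv = np.linalg.inv(DPhi)
        D_bar += DPhi_inv @ DPhi_inv.T
    return D_bar / len(times)

times = np.linspace(0.0, 1.0, 21)  # 21 equidistant instances, time step 0.05
p0 = np.array([0.3, 0.4])          # hypothetical sample point
D_bar = averaged_diffusion_tensor(p0, times)
print(D_bar)
```

Repeating this for every point of a computational grid yields the tensor field visualized in Sect. 4.1; a production implementation would vectorize the integration over the grid rather than loop over points.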

Fig. 4

Rotating double-gyre flow: the three-clustering obtained from the second and third eigenfunction of \(\overline{\Delta }\) at initial (left) and final time (right)

Example 2

[Bickley jet flow (Rypina et al. 2007)] We consider the Bickley jet flow, as introduced in Rypina et al. (2007), which is determined by the stream function \(\psi (t,x,y) =\psi _{0}(y)+\psi _{1}(t,x,y)\), where

$$\begin{aligned} \psi _{0}(y)&=-U_{0}L_{0}\tanh \left( y/L_{0}\right) , \\ \psi _{1}(t,x,y)&=U_{0}L_{0}\,\mathrm{sech}\,^{2}(y/L_{0})\mathfrak {R}\left( \sum _{n=1}^{3}f_{n}(t)\exp \left( ik_{n}x\right) \right) , \end{aligned}$$

with functions and parameters as in Rypina et al. (2007) and Hadjighasem et al. (2016): \(f_{n}(t)=\epsilon _{n}\exp \left( -ik_{n}c_{n}t\right) \), \(U_{0}=62.66\, \text {m}\,\text {s}^{-1}\), \(L_{0}=1770\,\mathrm {km}\), \(k_{n}=2n/r_{0}\), \(r_{0}=6371\, \mathrm {km}\), \(c_{1}=0.1446U_{0}\), \(c_{2}=0.205U_{0}\), \(c_{3}=0.461U_{0}\), \(\epsilon _{1}=0.0075\), \(\epsilon _{2}=0.15\), \(\epsilon _{3}=0.3\); x and y have units of 1000 km and t has unit s. The integration time interval is [0, 40] days and the computational grid is 800\(\times \)240 points. We approximate \(\bar{g}\) by 81 pullback metrics from equidistant time instances with time step 0.5 days. The LCSs as computed from a clustering of the dominant eigenfunctions of the dynamic Laplacian \(\overline{\Delta }\) are shown in Fig. 5; see Froyland and Junge (2018) again for details. As for the previous example, we provide an advection movie showing the evolution of the detected sets as Supplementary Material 2 for a visual confirmation of the coherent motion.

Fig. 5

Bickley jet flow: the eight-clustering obtained from the second to eighth eigenfunctions of \(\overline{\Delta }\)

Next, we visualize the Lagrangian averaged diffusion tensor field for the two example flows described in Examples 1 and 2. In Figs. 6a and 7a, we show the respective volume ratio fields. In Figs. 6b and 7b, the scalar field shown is the decimal logarithm of the mean diffusivity and it is overlaid by a grayscale texture whose features are aligned with the dominant diffusion direction field \(v_{\max }\left( \overline{D}\right) \). In the orthogonal direction, diffusion is typically weaker by several orders of magnitude. Finally, in Figs. 6c and 7c we plot the respective densities \(\left( \det \overline{D}\right) ^{-1/2}\).

Fig. 6

Averaged Lagrangian diffusion tensor field imaging for the transient double-gyre flow. a Volume ratio as a measure of anisotropy. High values correspond to isotropic diffusion. b The texture corresponds to integral curves of the dominant diffusion direction field; the scalar field corresponds to the logarithmic trace of the diffusion tensor, the Lagrangian effective diffusivity. c Density of the diffusion-induced volume relative to physical volume

Fig. 7

Averaged Lagrangian diffusion tensor field imaging for the Bickley jet flow; see Fig. 6 for the interpretation

In both Figs. 6 and 7, two phenomena are clearly visible. First, the \(\bar{g}\)-diffusion gets closest to isotropic diffusion (volume ratio values close to 1) around the cores of the two LCSs at roughly \(\left( 0.5\pm 0.25,0.5\right) \) (see Figs. 6a and 7a). The further away from the structure centers, the more quasi-one-dimensional the diffusion becomes (volume ratio values close to 0), cf. also Thiffeault and Boozer (2001) and Thiffeault (2003). In particular, in Figs. 6b and 7b there are strongly diffusive yellowish filaments almost enclosing the bluish regions. Second, the direction normal to the LCS boundaries corresponds to the subdominant diffusion direction, in which diffusion is several orders of magnitude weaker than in the dominant direction and therefore significantly slower. To leverage the correspondence between local geodesic averaging and short-time heat flow, imagine a small geodesic radius \(\varepsilon >0\) and a geodesic ball (in the geometry of mixing) of radius \(\varepsilon \) attached to points on some dense grid. With the shape and orientation of these geodesic balls as described in Sect. 4.1.1, one may guess that the geodesic averaging operator leaves characteristic functions localized on the LCSs almost invariant, much more than characteristic functions localized on smaller subsets thereof, or on material sets outside the LCSs.

Figures 6c and 7c demonstrate that the material manifolds equipped with the respective geometry of mixing may be regarded as consisting of two and six massive components, respectively, connected by an almost massless background.

In other words, a uniform heat distribution localized close to the isotropic core of the LCSs diffuses both radially and circularly on comparable time scales. A uniform heat distribution localized on the whole LCS will diffuse to the exterior on extremely long time scales and is therefore expected to be extremely slowly decaying, or, in other words, metastable. We demonstrate that our expectations built from the diffusion tensor analysis are indeed satisfied by the heat flow animations provided in Supplementary Materials 3 and 4 and discussed in Sect. 4.3.

4.2 Variational Characterization of Eigenvalues

We continue our study of the geometry of mixing by interpreting the eigenvalues of the Laplace operator from the variational viewpoint, in light of the preceding visualizations of the averaged diffusion tensor field and the density. This establishes some rather explicit connections between the Lagrangian averaged diffusion tensor field \(\overline{D}\) on the one hand and the topography of low eigenfunctions of the dynamic Laplacian on the other hand.

According to the Courant–Fischer–Weyl min–max principle, the eigenvalues of any Laplace operator can be characterized as follows: For \(k\in \mathbb {N}\), let \(W_{k}=\,\mathrm{span}\,\{w_{1},\ldots ,w_{k}\}\subset L^{2}(M,\mathrm {d}x)\) be the k-dimensional subspace spanned by the eigenfunctions corresponding to the k dominant eigenvalues and \(W_{k}^{\bot }\) its orthogonal complement. Then the kth eigenvalue is given by

$$\begin{aligned} \lambda _{k} = -\inf _{w\in W_{k-1}^{\bot }}\frac{\int _{M}\bar{g}^{-1}(\mathrm {d}w,\mathrm {d}w)\,\mathrm {d}x}{\int _{M}w^{2}\,\mathrm {d}x}. \end{aligned}$$

The infimum is attained exactly by the eigenfunctions corresponding to \(\lambda _{k}\). For a smooth function w, the simplest way to minimize the Rayleigh quotient is to be non-vanishing and constant. On connected manifolds, globally constant functions are captured by the eigenspace of the zero eigenvalue, and functions in the orthogonal complement must necessarily (i) have variation and (ii) be sign-indefinite. From an “eigenfunction-engineering” perspective, two questions arise for dominant eigenfunctions: (i) where to change values and where to remain (almost) constant; and (ii) if changing values locally then in which direction the most? Of course, the overall constraint is to have as little variation as possible.

Clearly, it is generally favorable to have variation in regions where any variation does not come at a high cost. Pointwise, the maximal cost is determined by the maximal eigenvalue \(\mu _{\max }(\overline{D})\) of the averaged diffusion tensor, which also dominates the trace of \(\overline{D}\). Thus, one would expect variation in regions with low trace, or, in our terminology above, with low Lagrangian effective diffusivity. As for the direction of strongest variation, it is pointwise optimal to have the differential \(\mathrm {d}w\) point in the direction of \(v_{\min }(\overline{D})\), i.e., orthogonal to the textured structures shown in Figs. 6a and 7a.

So far, one may think that it is favorable for a low eigenfunction to take some, say, positive value in the very LCS center, where effective diffusivity is very low, and some negative value everywhere else (except for a smooth transition). Due to the orthogonality constraint \(\langle w, 1\rangle = 0\), this would introduce a very steep gradient in the transition zone, which may outweigh the low cost due to low \(\mu _{\max }(\overline{D})\). Thus, it is advantageous to push the transition zone outward, thereby increasing the size of the region with low variation and positive values, which allows for a less steep transition between the vortex-like LCS region and its surrounding. Pushing the transition zone further outward into the high diffusivity region increases the Rayleigh quotient again, and the low eigenfunctions find an optimal balance in this geometry of mixing.
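The trade-off described in this section can be illustrated with a minimal one-dimensional sketch. It is a toy analogue, not the setting of the examples above: the interval \([0,1]\), the piecewise-constant diffusivity profile with a low-diffusivity band around \(x=0.5\), the no-flux finite-difference discretization and all names are assumptions made for illustration. The matrix `K` discretizes the negative of the operator, so the eigenvalue \(\lambda_k\) in the sign convention of the min–max formula corresponds to `-vals[k-1]`.

```python
import numpy as np

def neumann_stiffness(diff_faces, h):
    """Symmetric matrix of -d/dx(D dw/dx) on a uniform 1D grid with no-flux ends."""
    n = len(diff_faces) + 1
    K = np.zeros((n, n))
    for i, d in enumerate(diff_faces):
        c = d / h**2
        K[i, i] += c
        K[i + 1, i + 1] += c
        K[i, i + 1] -= c
        K[i + 1, i] -= c
    return K

def rayleigh_quotient(K, w):
    """Discrete analogue of int D |w'|^2 dx / int w^2 dx."""
    return float(w @ K @ w) / float(w @ w)

n = 200
h = 1.0 / n
faces = (np.arange(n - 1) + 1.0) * h          # positions of the cell interfaces
# Hypothetical diffusivity: variation is cheap only in a narrow band around x = 0.5.
diff_faces = np.where(np.abs(faces - 0.5) < 0.05, 0.01, 1.0)

K = neumann_stiffness(diff_faces, h)
vals, vecs = np.linalg.eigh(K)                # ascending eigenvalues of -Laplacian
w2 = vecs[:, 1]                               # first non-constant eigenvector

# Interface at which the second eigenvector changes most rapidly.
steepest_face = faces[np.argmax(np.abs(np.diff(w2)))]

# Any test function orthogonal to the constants has a Rayleigh quotient
# that is at least the second eigenvalue (min-max principle).
rng = np.random.default_rng(0)
u = rng.normal(size=n)
u -= u.mean()
rq_random = rayleigh_quotient(K, u)
```

In this toy setting the second eigenvector is sign-indefinite, nearly constant on either side of the band, and concentrates its variation inside the low-diffusivity band, in line with the heuristic above.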

4.3 Geometric Heat Flow

In support of the decomposition of the fluid domain M into regular/coherent and mixing regions, we look at the action of the geometric heat flow induced by \(\Delta _{\bar{g}}\) on two different initial scalar distributions in the context of Example 1, the transient double gyre. To this end, we provide two video animations of the geometric heat flow in Supplementary Materials 3 and 4. In the third Supplementary Material, we consider the geometry of mixing and initialize a scalar quantity in the interior of the left LCS (Fig. 8a). The scalar quantity is slowly distributed over the LCS, and only very little leaks out (Fig. 8b). In the fourth Supplementary Material, we initialize the same amount of some scalar quantity between the two LCSs in the mixing region (Fig. 8c). Here, the scalar mixes fast all over the mixing region, but only very little enters into the two LCSs (Fig. 8d). This demonstrates both the mixing behavior and the diffusion barrier property of the LCS boundaries.

Fig. 8

Initial (left) and final (right) scalar densities for an initial condition localized in the center of an LCS (top) and in the mixing region (bottom)

4.4 Diffusive Flux Form

Another interesting geometric object related to a weighted manifold is its induced surface area form. Such forms assign a \((d-1)\)-volume, which we will simply refer to as area, to \((d-1)\)-dimensional parallelepipeds in tangent space. We restrict our attention to parallelepipeds with unit g-area, whose corresponding \(\bar{g}\)-area can be interpreted as the “\(\bar{g}\)-diffusive flux.” To compute the \(\bar{g}\)-area of parallelepipeds of interest, we recall a result from linear algebra (Froyland 2015, App. A, Lemma 1): For an invertible matrix \(A\in GL\left( \mathbb {R}^{d}\right) \) and an orthonormal basis \(\left( v_{1},\ldots ,v_{d}\right) \), one has

$$\begin{aligned} \left\Vert A\left( v_{1}\wedge \ldots \wedge v_{d-1}\right) \right\Vert =\det (A)\left\Vert A^{-\top }v_{d}\right\Vert . \end{aligned}$$

In our context, given some tangent space \(T_{p}M\), \(\left( v_{1},\ldots ,v_{d}\right) \) shall be an orthonormal basis with respect to g. The transformation given by \(A{:}{=}\overline{G}^{1/2}\) corresponds to the basis transformation in \(T_{p}M\) to Riemannian normal coordinates, cf. Karrasch (2015, Appendix). Therefore, in the new coordinates, \(\overline{G}\) takes the canonic Euclidean form, and area, determinant as well as volume are computed as in the Euclidean case. In other words, on the left-hand side we have the \(\bar{g}\)-area of the parallelepiped spanned by \(\left( v_{1},\ldots ,v_{d-1}\right) \)—our object of interest—and on the right-hand side we have the density \(\det \left( \overline{G}\right) ^{1/2}=\sqrt{\bar{g}}\) discussed in Sect. 4.1.3 and the \(\bar{g}^{-1}\)-norm of the normal (co-)vector \(v_{d}\).

Given a point p in M, what is the orientation of a \(\left( d-1\right) \)-parallelepiped of unit g-area in the tangent space \(T_{p}M\) with the least \(\bar{g}\)-area? Physically speaking, what is the orientation of a surface element attached to p that admits the least diffusive flux? Looking at the left-hand side, we find directly that the parallelepiped spanned by the eigenvectors of \(\overline{G}\) corresponding to the lowest eigenvalues has minimal \(\bar{g}\)-area. This is consistent with the right-hand side in that the normal vector \(v_{d}\) is the eigenvector \(v_{\min }\left( \overline{D}\right) \) in that case, and therefore has minimal dual norm.
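The linear algebra identity and the minimal-area orientation can be checked numerically in three dimensions, where the wedge product of two vectors can be represented by their cross product. The following is a minimal sketch under assumptions: the symmetric positive definite matrix `G` is a randomly generated, hypothetical stand-in for \(\overline{G}\) at a single point, and the function names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.normal(size=(3, 3))
G = B @ B.T + 3.0 * np.eye(3)                 # hypothetical SPD metric tensor
evals, evecs = np.linalg.eigh(G)              # ascending eigenvalues of G
A = evecs @ np.diag(np.sqrt(evals)) @ evecs.T # A = G^{1/2}

def area_lhs(A, v1, v2):
    """Area of the parallelogram spanned by A v1 and A v2 (left-hand side)."""
    return np.linalg.norm(np.cross(A @ v1, A @ v2))

def area_rhs(A, normal):
    """det(A) times the norm of A^{-T} applied to the unit normal (right-hand side)."""
    return np.linalg.det(A) * np.linalg.norm(np.linalg.solve(A.T, normal))

# Random orthonormal basis (v1, v2, v3) with respect to the Euclidean metric.
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
v1, v2, v3 = Q.T
lhs, rhs = area_lhs(A, v1, v2), area_rhs(A, v3)

# Surface element spanned by the eigenvectors with the two lowest eigenvalues of G.
min_area = area_lhs(A, evecs[:, 0], evecs[:, 1])

# Areas of randomly oriented unit surface elements, for comparison.
random_areas = []
for _ in range(200):
    R, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    random_areas.append(area_lhs(A, R[:, 0], R[:, 1]))
```

In this sketch, `lhs` and `rhs` agree up to rounding, and none of the randomly oriented elements has a smaller area than the element whose normal is the eigenvector of `G` with the largest eigenvalue.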

The area form induced by the geometry of mixing features prominently in asymptotics of diffusive flux through material surfaces in the vanishing diffusivity limit (Karrasch and Schilling 2020), see also Haller et al. (2018, 2019).

5 Discussion

5.1 Generalization to Compressible Flows

The geometric heat-flow approach generalizes by analogy to the setting when advection is due to a compressible flow. In such contexts, advection–diffusion of a scalar density \(\phi \) as it is carried by a (compressible) fluid with conserved mass density \(\rho \) is modeled by the mass-based equations (Landau and Lifshitz 1966), cf. also Thiffeault (2003),

$$\begin{aligned} \frac{D\phi }{Dt}=&\partial _{t}\phi +\,\mathrm{div}\,(\phi V) =\varepsilon \,\mathrm{div}\,_{\nu }\, D \mathrm {d}\phi ,\qquad \phi (0,\cdot ) = \phi _0, \end{aligned}$$
(8a)
$$\begin{aligned} \frac{D\rho }{Dt}=&\partial _t\rho +\,\mathrm{div}\,(\rho V) = 0,\qquad \qquad \;\quad \qquad \rho (0,\cdot ) = \rho _0. \end{aligned}$$
(8b)

Here, the (time-dependent) spatial measure \(\nu \) corresponds to the fluid’s mass \(\mathrm {d}\nu (t)=\rho (t)\mathrm {d}x\), i.e., it has density \(\rho \) w.r.t. the physical volume. Equation 8b is easily recognized as the continuity equation (or, mass conservation) for \(\rho \) and implies that \(\nu (t)\) is the pushforward measure of \(\nu _0\) under the flow; cf. the corresponding construction in Froyland and Kwok (2017). Moreover, \(\,\mathrm{div}\,_{\nu }\, D \mathrm {d}\phi = \Delta _{g,\nu }\phi \) for the diffusion-adapted metric g whose coordinate representation is \(G=D^{-1}\).

When we represent Eq. 8 in Lagrangian coordinates, see Thiffeault (2003), we obtain

$$\begin{aligned} \partial _t\varphi&= \varepsilon \Delta _{g(t),\mu (t)}\varphi , \qquad \qquad \varphi (0,\cdot ) = \phi _0, \\ \partial _t\varrho&= -\varrho \Phi ^*(\,\mathrm{div}\,(V)),\qquad \varrho (0,\cdot ) = \rho _0. \end{aligned}$$

Analogously to the incompressible case, the scalar density \(\varphi \) is subject to diffusion generated by the time-dependent family of Laplace operators \(\left( \Delta _{g(t),\mu (t)}\right) _t\), with \(g(t){:}{=}(\Phi ^t)^*g\) the pullback metric, and \(\mathrm {d}\mu (t)=\varrho (t)\mathrm {d}x=(\Phi ^t)^*\mathrm {d}\nu (t)\) the pullback measure. Conservation of mass yields

$$\begin{aligned} \mathrm {d}\mu (t)=\mathrm {d}\mu (0)=\rho _0\mathrm {d}x = \mathrm {d}\nu _0, \end{aligned}$$

such that—fully analogously to the incompressible case—time dependence enters the pullback Laplace operators only in the diffusion tensor term. Finally, time averaging yields the dynamic Laplacian operator \(\Delta _{\bar{g},\nu _0}\) defined in Froyland and Kwok (2017) for compressible flows. In summary, both dynamic Laplacian constructions given in Froyland (2015) and Froyland and Kwok (2017) can be derived from a Lagrangian advection–diffusion viewpoint.

5.2 Spectral Analysis and the Role of Eigenfunctions

In our framework, we consider solutions of the self-adjoint elliptic eigenproblem

$$\begin{aligned} \Delta _{\bar{g}}w_{n}=\lambda _{n}w_{n}, \end{aligned}$$

possibly with natural Neumann boundary conditions on \(\partial M\). As summarized in Sect. 2.3, metastable sets are identified by separating the dominant part of the spectrum from the rest via an eigengap. Subsequently, a metastable decomposition is extracted from the associated eigenfunctions.

In our Riemannian manifold framework here, there is an additional spectral interpretation method that comes into consideration (see Footnote 4). It has been developed in the field of manifold learning in the realm of diffusion maps (Belkin and Niyogi 2003; Coifman and Lafon 2006). The typical problem considered there is the following: Given a sample from a connected manifold that is low-dimensional but embedded in some high-dimensional Euclidean space, one seeks coordinates that parameterize the manifold intrinsically, without resorting to the high-dimensional coordinates in the ambient space. This can be achieved by analyzing the eigenvectors of some graph Laplacian. As discussed in detail in Dsilva et al. (2015), one observes the following phenomena as one scans through the sequence of eigenvectors: The first non-trivial eigenvector [called a unique eigendirection in Dsilva et al. (2015)] defines a coordinate direction. It may be followed either by an eigenvector inducing the same coordinate direction, however, with a higher frequency of variation [called a repeated eigendirection in Dsilva et al. (2015)], or by another unique eigendirection. The purpose of Dsilva et al. (2015) is to propose an algorithm that distinguishes unique from repeated eigendirections.

The manifold learning problem, however, differs from the metastability problem in two respects. First, in manifold learning there is no expectation of small associated eigenvalues, nor of the existence of a spectral gap. This implies that one is interested exclusively in consecutive, dominant eigenfunctions in the metastability analysis (Davies 1982b), whereas the coordinate-inducing eigenfunctions may be scattered throughout the spectrum (Dsilva et al. 2015). Second, one is interested in plateaus of eigenfunctions in the metastability analysis, whereas one is interested in their variation in manifold learning.

These two dichotomous approaches to the interpretation of Laplacian spectra are potentially reconciled as follows: In the dominant spectrum close to zero, one expects to find information on the metastable sets, and subsequently coordinate-inducing eigenfunctions and higher (mixed) harmonics thereof on the almost-disconnected, metastable sets. Figure 9 schematically illustrates the role of different eigenfunctions in the ideal decoupled and the weakly coupled manifold cases. The distinction of eigenfunctions and their interpretation in terms of metastable sets or coordinates remains a challenging research question.

Fig. 9

Spectral structure in the decoupled (a) and perturbed (b) cases. The quasi-components indicate connected coherent sets corresponding to almost invariant sets and arise from the unperturbed zero eigenvalues associated with connected components. The coordinates and higher harmonics correspond to first- and higher-order Laplace–Beltrami eigenfunctions associated with one or a superposition of the components or quasi-components

As a demonstration, consider the rotating double-gyre flow (Example 1). We used the eigenfunctions \(u_2\) and \(u_3\) to extract a metastable decomposition into the two LCSs and the surrounding mixing region in Sect. 4.1.4; see also Fig. 10 for a typical diffusion–coordinate plot of these. Eigenfunctions related to eigenvalues following the first two non-trivial ones can now be interpreted as inducing the radial and some “angular” coordinates on the two LCSs by tracing across the level sets from low to high values (Fig. 11).

Fig. 10

Dominant diffusion coordinates for the rotating double-gyre flow. Shown are the second and third eigenfunctions, the flat first eigenfunction has been omitted. Each point corresponds to a sample point in the flow domain. The coloring corresponds to the coloring of the k-means clustering result in Fig. 4 (Color figure online)

Fig. 11

Rotating double-gyre flow. Fourth (a) and fifth (b) eigenfunctions, which can be interpreted as inducing radial (a) and angular (b) coordinates on the left LCS

5.3 Discretization Aspects

In this section, we collect some thoughts regarding discretization aspects of the continuous averaged Lagrangian diffusion framework, provide relations to previously published trajectory-based computational approaches and indicate future research prospects.

Regarding the discretization of the dynamic Laplacian, there exist two approaches: (i) start from the continuous formulation and consider different discretizations or (ii) start from discrete approaches and aim for the dynamic Laplacian in some infinite-sample limit.

As for the first, this refers to the classic and broad field of numerics for partial differential equations (PDEs). The problem at hand is to discretize the second-order differential operator \(\Delta _{\bar{g}}\), and classic approaches that have been employed include the finite-difference discretization (Froyland 2015; Froyland and Kwok 2017), the projection on a finite-dimensional subspace of admissible functions spanned by radial basis functions with compact support (Froyland and Junge 2015), and the finite-element method (FEM) (Froyland and Junge 2018). In most practical cases, one will work on a domain with boundary and hence has to ensure that (homogeneous) boundary conditions are satisfied. Moreover, one would want to preserve a crucial feature of \(\Delta _{\bar{g}}\), namely its self-adjointness. Out of the aforementioned discretization schemes, only the FEM approach (Froyland and Junge 2018) manages to fulfill both these requirements exactly and in addition admits an implementation suitable to handle sparse and incomplete data.

As for the second, this leads to the well-established field of diffusion maps. The central result is, roughly speaking, that a graph Laplacian matrix built from a sample from some Riemannian manifold converges toward the classic Laplace operator on the manifold. There is an abundance of varieties here: which graph Laplacian (normalized or unnormalized) to consider, which kernel function to use (fixed- or variable-bandwidth) and which manifolds (with or without boundary, compact or not) to allow, and finally, what kind of convergence to obtain (Coifman and Lafon 2006; von Luxburg et al. 2008; Berry and Sauer 2016a).

Briefly, let \((x_i)_i\) denote the samples from the manifold M, then define the entries \(w_{ij}\) of the weight matrix W associated with the pair \((x_i,x_j)\) via a kernel \(k:\mathbb {R}_{\ge 0}\rightarrow \mathbb {R}_{\ge 0}\) and the distance \(\,\mathrm{dist}\,(x_i,x_j)\), i.e., \(w_{ij}=k(\,\mathrm{dist}\,(x_i,x_j))\), where k may additionally depend on \(x_i\) and \(x_j\). A classic choice is given by the Gaussian \(k(w) = \exp (-w^2/\sigma ^2)\), but the exact design of the kernel turns out to be a powerful tool on its own (Berry and Sauer 2016b). From the weight matrix \(W = (w_{ij})\), one constructs the diagonal degree matrix D, with diagonal entries \(d_i = \sum _j w_{ij}\), and finally \(I-D^{-1}W\) is the normalized graph Laplacian, an approximation to the negative Laplace operator. This goes back to the normalized spectral clustering in Shi and Malik (2000) and admits a plethora of modifications and extensions (Coifman and Lafon 2006).
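The construction just described can be sketched in a few lines. This is a minimal illustration under assumptions: the sample consists of two hypothetical, well-separated point clouds in the plane, the distance is Euclidean, the bandwidth \(\sigma = 1\) is arbitrary, and the function names are not taken from any of the cited works. The eigenpairs of \(I-D^{-1}W\) are computed through the symmetric conjugate \(I-D^{-1/2}WD^{-1/2}\), which has the same eigenvalues.

```python
import numpy as np

def gaussian_weights(points, sigma):
    """Weight matrix w_ij = exp(-dist(x_i, x_j)^2 / sigma^2) with Euclidean distance."""
    diff = points[:, None, :] - points[None, :, :]
    sq_dist = (diff**2).sum(axis=-1)
    return np.exp(-sq_dist / sigma**2)

def normalized_graph_laplacian(W):
    """I - D^{-1} W with D the diagonal matrix of row sums of W."""
    d = W.sum(axis=1)
    return np.eye(len(W)) - W / d[:, None]

def leading_eigenpairs(W, k):
    """First k eigenpairs of I - D^{-1} W, via the symmetric matrix I - D^{-1/2} W D^{-1/2}."""
    d = W.sum(axis=1)
    S = W / np.sqrt(np.outer(d, d))
    vals, U = np.linalg.eigh(np.eye(len(W)) - S)
    # If (I - S) u = lam * u, then v = D^{-1/2} u satisfies (I - D^{-1} W) v = lam * v.
    return vals[:k], U[:, :k] / np.sqrt(d)[:, None]

# Hypothetical sample: two point clouds of 20 points each.
rng = np.random.default_rng(1)
cluster_a = rng.normal(loc=(0.0, 0.0), scale=0.2, size=(20, 2))
cluster_b = rng.normal(loc=(3.0, 0.0), scale=0.2, size=(20, 2))
points = np.vstack([cluster_a, cluster_b])

W = gaussian_weights(points, sigma=1.0)
L = normalized_graph_laplacian(W)
vals, vecs = leading_eigenpairs(W, 2)
labels = vecs[:, 1] > 0        # sign of the second eigenvector splits the sample
```

For this toy sample, the rows of `L` sum to zero, the smallest eigenvalue vanishes up to rounding, and the sign of the second eigenvector separates the two clouds.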

In this spirit, several suggestions from the recent literature admit an interpretation as approximations of averaged Lagrangian diffusion. Since the dynamic Laplacian is defined as the average of pullback Laplace operators, it is natural to consider its approximation via the average of graph Laplacian approximations of the pullback Laplace operators (Banisch and Koltai 2017). This has the advantage that one can use distances between trajectories at different times (usually Euclidean distances) for the construction of the graph Laplacians, but has the drawback that the final matrix representation of \(\Delta _{\bar{g}}\) (Banisch and Koltai 2017, \(\mathbf {Q}_{\varepsilon }\) in Eq. (19)) does not preserve self-adjointness (Banisch and Koltai 2017, p. 8, comment (a)). However, it approximates the pointwise action of \(\Delta _{\bar{g}}\) on functions in the infinite-sample limit (Banisch and Koltai 2017, Thm. 3).
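The idea of averaging graph Laplacians over time slices can be sketched as follows. This is only loosely in the spirit of the construction in Banisch and Koltai (2017) and does not reproduce their normalization or their matrix \(\mathbf {Q}_{\varepsilon }\); the Gaussian kernel, the bandwidth, the linear shear used to generate hypothetical trajectories and all function names are assumptions.

```python
import numpy as np

def slice_laplacian(points, sigma):
    """Normalized graph Laplacian I - D^{-1} W for the positions at one time instance."""
    diff = points[:, None, :] - points[None, :, :]
    W = np.exp(-(diff**2).sum(axis=-1) / sigma**2)
    return np.eye(len(points)) - W / W.sum(axis=1, keepdims=True)

def averaged_graph_laplacian(trajectories, sigma):
    """Average of the time-slice Laplacians; trajectories has shape (n_times, n_points, dim)."""
    return np.mean([slice_laplacian(x_t, sigma) for x_t in trajectories], axis=0)

# Hypothetical data: 30 particles advected by the linear shear (x, y) -> (x + t*y, y).
rng = np.random.default_rng(2)
x0 = rng.uniform(size=(30, 2))
times = np.linspace(0.0, 1.0, 5)
shear = np.column_stack([x0[:, 1], np.zeros(len(x0))])
trajectories = np.stack([x0 + t * shear for t in times])

L_avg = averaged_graph_laplacian(trajectories, sigma=0.3)
```

Each time slice uses Euclidean distances between the current particle positions only, and the averaged matrix still annihilates constant vectors, as one would expect from a Laplacian-type operator.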

A challenging alternative might be to operate directly on the intrinsic geometry induced by \(\bar{g}\). In the spirit of graph Laplacians, this requires the computation/approximation of geodesic distances in the \(\bar{g}\)-metric. If the tensor field can be evaluated on a rather fine grid, this would allow the computation of geodesic distances by solving the (anisotropic) eikonal equation. For this purpose, very efficient algorithms such as the fast-marching algorithm or, more generally, level set methods are, in principle, available. Alternatively, one may approximate geodesic distances to the immediate grid neighbors from the metric tensors computed on the grid and extend the distance metric to the whole grid by computing lengths of shortest paths.
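The shortest-path alternative mentioned last can be sketched with Dijkstra's algorithm on a regular two-dimensional grid. This is a rough illustration under assumptions: the metric tensor field is supplied as a hypothetical callable `metric(x, y)` returning a symmetric 2×2 matrix, edge lengths use the average of the endpoint tensors, and an 8-point neighborhood is used. Distances obtained on such a fixed stencil generally overestimate true geodesic distances.

```python
import heapq
import math

def edge_length(G_p, G_q, e):
    """Length of the displacement e under the average of the endpoint metric tensors."""
    g00 = 0.5 * (G_p[0][0] + G_q[0][0])
    g01 = 0.5 * (G_p[0][1] + G_q[0][1])
    g11 = 0.5 * (G_p[1][1] + G_q[1][1])
    return math.sqrt(g00 * e[0] ** 2 + 2.0 * g01 * e[0] * e[1] + g11 * e[1] ** 2)

def grid_geodesic_distances(metric, nx, ny, h, source):
    """Dijkstra shortest-path distances from `source` on an nx-by-ny grid with spacing h."""
    offsets = [(di, dj) for di in (-1, 0, 1) for dj in (-1, 0, 1) if (di, dj) != (0, 0)]
    dist = {source: 0.0}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, (i, j) = heapq.heappop(heap)
        if (i, j) in done:
            continue
        done.add((i, j))
        for di, dj in offsets:
            k, l = i + di, j + dj
            if 0 <= k < nx and 0 <= l < ny:
                w = edge_length(metric(i * h, j * h), metric(k * h, l * h), (di * h, dj * h))
                if d + w < dist.get((k, l), math.inf):
                    dist[(k, l)] = d + w
                    heapq.heappush(heap, (d + w, (k, l)))
    return dist

# Hypothetical metric fields: Euclidean, and one with lengths doubled in the x-direction.
euclidean = grid_geodesic_distances(lambda x, y: ((1.0, 0.0), (0.0, 1.0)), 11, 11, 0.1, (0, 0))
stretched = grid_geodesic_distances(lambda x, y: ((4.0, 0.0), (0.0, 1.0)), 11, 11, 0.1, (0, 0))
```

With the constant metric \(\mathrm{diag}(4,1)\), the distance from the origin along the x-axis doubles relative to the Euclidean case, while distances along the y-axis are unchanged.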

Another line of research has suggested to define notions of distance, or, even weaker, of adjacency (in a graph sense) between trajectories; cf. Froyland and Padberg-Gehle (2015) for an approach that introduces a metric on the extended state space, which is then applied to compute distances between trajectories and cluster centers as required by the k-means algorithm.

On the one hand, Hadjighasem et al. (2016) proposed to use time averages of pairwise distances, dynamical distances, as the \(\,\mathrm{dist}\,\)-function in the graph Laplacian construction. Thereby, pairs of trajectories with large dynamical distance get a low weight, which makes it unlikely that they are diagnosed as members of the same cluster/LCS. On the other hand, Padberg-Gehle and Schneide (2017) proposed a complementary approach: regard two trajectories as adjacent in a graph whose nodes correspond to trajectories if they ever get closer than a specified distance threshold. Roughly speaking, while Hadjighasem et al. (2016) specifically punish large dynamic distances, Padberg-Gehle and Schneide (2017) reward close distances. Notably, the degree field derived from the graph adjacency matrix in Padberg-Gehle and Schneide (2017) has been independently proposed as a measure for mixing potential called trajectory encounter volume (Rypina and Pratt 2017). As discussed earlier, one may interpret this equally as a measure of Lagrangian effective diffusivity. While this correspondence to physical quantities is appealing, the graph constructed in Padberg-Gehle and Schneide (2017) is merely an unweighted and undirected graph. For mathematical analysis, more structure such as an underlying metric or even a Riemannian geometry generating adjacency would be desirable. In any case, for the methods discussed in this paragraph it remains unclear what a continuous infinite-sample limit could look like and whether it carries a nice mathematical structure. Ignoring such convergence aspects, there remains a high degree of analogy between the dynamic Laplacian and its associated geodesic distance in the geometry of mixing on the one hand, and ad hoc dynamic distances which enter classic Laplace operator discretizations on the other hand; see also Froyland and Junge (2018, Sect. 3.4) for an interpretation of an FEM discretization of the dynamic Laplacian as a graph Laplacian matrix.
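The two trajectory-based notions discussed in this paragraph can be contrasted in a small sketch. It is a simplified reading under assumptions: the dynamical distance is taken as the plain mean of Euclidean distances over equally spaced time samples, adjacency is declared when the minimal distance over time falls below a threshold, and the three one-dimensional trajectories are made up for illustration. The weighting and clustering steps of the cited works are not reproduced.

```python
import numpy as np

def pairwise_distances_over_time(trajectories):
    """Distances between all trajectory pairs at each time; input shape (n_traj, n_times, dim)."""
    x = np.transpose(trajectories, (1, 0, 2))
    diff = x[:, :, None, :] - x[:, None, :, :]
    return np.sqrt((diff**2).sum(axis=-1))        # shape (n_times, n_traj, n_traj)

def dynamical_distance(trajectories):
    """Time average of the pairwise distances."""
    return pairwise_distances_over_time(trajectories).mean(axis=0)

def encounter_adjacency(trajectories, threshold):
    """Unweighted adjacency: 1 if two trajectories ever come closer than the threshold."""
    closest = pairwise_distances_over_time(trajectories).min(axis=0)
    A = (closest < threshold).astype(int)
    np.fill_diagonal(A, 0)
    return A

# Hypothetical trajectories: the first two meet once, the third stays far away.
trajectories = np.array([
    [[0.0], [0.0], [0.0]],
    [[1.0], [0.05], [1.0]],
    [[5.0], [5.0], [5.0]],
])
r = dynamical_distance(trajectories)
A = encounter_adjacency(trajectories, threshold=0.1)
degree = A.sum(axis=1)      # degree field of the encounter graph
```

In this toy data set, the first two trajectories are adjacent in the encounter graph because of a single close approach, although their time-averaged distance is not small; this is the sense in which one construction rewards close encounters while the other penalizes large average separation.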

One common feature of the above-listed methods is that they embed each trajectory in some neighborhood via distances, taking into account the dynamics. This is in contrast to an independent set of coherent structure detection methods, for instance (Mezic et al. 2010; Mancho et al. 2013; Mundel et al. 2014; Haller et al. 2016; Fabregat 2016). These methods retrieve information from time averages of observables along trajectories, which are viewed individually, without an immediate neighborhood relationship. The expectation then is that coherent structures reveal themselves as sets of material points showing similar statistics within the structure, and different statistics compared to the exterior. In other words, a neighborhood relationship is built only after observing trajectory statistics.

5.4 Connections to Geodesic LCS Approaches

There is another group of methods for finding boundaries of coherent structures in purely advective flows, developed by Haller and co-workers (Haller 2015; Haller and Beron-Vera 2013; Karrasch et al. 2015; Farazmand et al. 2014). These build on global variational principles formulated in terms of the stretching or shear of material boundaries. These approaches naturally involve the Cauchy–Green (CG) strain tensor field \(C{:}{=}D\Phi (T)^{\top }D\Phi (T)\), which we interpret here as the pullback metric \(\Phi (T)^{*}\mathfrak {g}\), see Remark 1. These methods evaluate the dynamics at two time instances, an initial \(t=0\) and a final one \(t=T\). Earlier LCS methods have only used the logarithm of the maximal eigenvalue of C, well known as the finite-time Lyapunov exponent (FTLE), for visual inference of coherent structures.

First, observe the following tight relation between the Cauchy–Green strain tensor C and the two-point averaged, diffusion-adapted metric tensor \(\overline{G}\). In physical \(\mathfrak {g}\)-normal coordinates at some point \(p\in M\), \(\overline{G}\) has the coordinate representation \(\overline{G}=2\left( G(0)^{-1}+G(T)^{-1}\right) ^{-1}=2\left( I+C^{-1}\right) ^{-1}\), where I is the identity matrix. Clearly, C has eigenvalues \(\mu _{\min }=\mu _{\min }(C)\le \mu _{\max }(C)=\mu _{\max }\) with eigenvectors (see Footnote 5) \(v_{\min }(C)\) and \(v_{\max }(C)\) if and only if \(\overline{D}=\overline{G}^{-1}\) has eigenvalues \(\mu _{\min }\left( \overline{D}\right) =\frac{1}{2}\left( 1+\frac{1}{\mu _{\max }}\right) \le \frac{1}{2}\left( 1+\frac{1}{\mu _{\min }}\right) =\mu _{\max }\left( \overline{D}\right) \) with eigenvectors \(v_{\min }\left( \overline{D}\right) =v_{\max }(C)\) and \(v_{\max }\left( \overline{D}\right) =v_{\min }(C)\). In other words, the minor CG-eigendirection \(v_{\min }(C)\) corresponds to the dominant \(\overline{G}\)-diffusion direction. Moreover, in the volume-preserving case, one has \(\mu _{\min }=\mu _{\max }^{-1}\), and therefore the anisotropy ratio for \(\overline{D}\)

$$\begin{aligned} \frac{1+\mu _{\max }}{1+\frac{1}{\mu _{\max }}}=\mu _{\max } \end{aligned}$$

is equal to the dominant CG-eigenvalue. Thus, the logarithm of the anisotropy ratio shown in the figures in Sect. 4.1 corresponds to an accordingly defined multiple-time-step FTLE up to rescaling.
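The eigenvalue and eigenvector correspondence between C and \(\overline{D}\) can be verified numerically for a single point. The sketch below assumes a hypothetical flow-map gradient given by a simple shear with unit determinant (so that the two-dimensional flow is volume-preserving at that point); the variable names are illustrative.

```python
import numpy as np

F = np.array([[1.0, 1.5],
              [0.0, 1.0]])                      # hypothetical flow-map gradient, det F = 1
C = F.T @ F                                      # Cauchy-Green strain tensor
D_bar = 0.5 * (np.eye(2) + np.linalg.inv(C))     # two-point average of G(0)^{-1} and G(T)^{-1}

mu_C, v_C = np.linalg.eigh(C)                    # ascending: mu_C[0] = mu_min, mu_C[1] = mu_max
mu_D, v_D = np.linalg.eigh(D_bar)                # ascending eigenvalues of D_bar

anisotropy = mu_D[1] / mu_D[0]                   # anisotropy ratio of D_bar
# Alignment of the weakest diffusion direction with the dominant CG eigendirection.
alignment = abs(v_D[:, 0] @ v_C[:, 1])
```

For this example, `anisotropy` coincides with the dominant CG-eigenvalue `mu_C[1]`, and `alignment` equals one up to rounding, reflecting \(v_{\min }(\overline{D})=v_{\max }(C)\).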

Next, let us briefly recall the variational formulations for elliptic (coherent vortex boundaries) and parabolic (jet cores) Lagrangian coherent structures (LCSs) in two-dimensional flows, using our notation. Following Haller and Beron-Vera (2013), boundaries of elliptic LCSs are sought as the outermost closed stationary curves of the averaged strain functional

$$\begin{aligned} Q(\gamma )=\int _{0}^{b}\frac{|r'(s)|_{\Phi (T)^{*}g}}{|r'(s)|_{g}}\mathrm {d}s, \end{aligned}$$

where r is a parameterization of a material curve \(\gamma \subset M\). The integrand compares pointwise the magnitude of curve velocity \(r'(s)\) after push forward by \(D\Phi \) with its original magnitude. Equivalently, by equipping M with the pullback metric \(C=\Phi (T)^{*}g\), the domain M is geometrically deformed, in principle as we do in our approach here with \(\bar{g}\). In these terms, the length of \(r'(s)\) in the deformed geometry \((M,C)\) is compared with its length in the original geometry \((M,g)\). By applying Noether’s theorem, one obtains that stationary curves necessarily obey a conservation law, which corresponds exactly to the integrand, i.e.,

$$\begin{aligned} \frac{|r'(s)|_{\Phi (T)^{*}g}}{|r'(s)|_{g}}=\frac{|r'(s)|_{g(T)}}{|r'(s)|_{g(0)}}=\lambda =\text {const}., \end{aligned}$$

from which one may deduce tangent line fields \(\eta _{\lambda }\). Among the orbits of these line fields one looks for closed ones. Closed orbits typically come in continuous one-parameter families, out of which one picks the outermost. Analogously to index theory for vector fields, one may employ index theory for line fields to deduce that any closed orbit of a piecewise differentiable line field must necessarily enclose at least two singularities (see Footnote 6) of the line fields, and all enclosed singularities obey a topological rule, see Karrasch et al. (2015) for more details. In most cases, relevant closed orbits have been found to enclose exactly two singularities of wedge type (Karrasch et al. 2015). These singularities can be visualized (and numerically detected) by phase portraits of the eigendirections of the pullback metric C, or, analogously, by the dominant diffusion direction field of \(\overline{D}\) as in Sect. 4.1.
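To make the step from the conservation law to the line fields concrete, one can solve the conservation law pointwise for unit vectors expressed in the eigenbasis of C. In two dimensions and in \(g\)-normal coordinates, this yields \(\eta _{\lambda }^{\pm }=\sqrt{\tfrac{\mu _{\max }-\lambda ^{2}}{\mu _{\max }-\mu _{\min }}}\,v_{\min }(C)\pm \sqrt{\tfrac{\lambda ^{2}-\mu _{\min }}{\mu _{\max }-\mu _{\min }}}\,v_{\max }(C)\), defined wherever \(\mu _{\min }\le \lambda ^{2}\le \mu _{\max }\). The following minimal sketch checks this at a single point; the tensor `C` is a hypothetical example and the function names are illustrative.

```python
import numpy as np

def eta_lambda(C, lam):
    """Unit directions r with |r|_C / |r| = lam at one point, or None if none exist."""
    mu, v = np.linalg.eigh(C)                 # ascending eigenvalues, orthonormal eigenvectors
    mu_min, mu_max = mu
    if mu_min == mu_max or not (mu_min <= lam**2 <= mu_max):
        return None
    a = np.sqrt((mu_max - lam**2) / (mu_max - mu_min))
    b = np.sqrt((lam**2 - mu_min) / (mu_max - mu_min))
    return a * v[:, 0] + b * v[:, 1], a * v[:, 0] - b * v[:, 1]

def stretch_ratio(C, r):
    """Ratio of the length of r in the deformed geometry (M, C) to its Euclidean length."""
    return np.sqrt(r @ C @ r) / np.sqrt(r @ r)

# Hypothetical Cauchy-Green tensor of a simple shear with unit determinant.
F = np.array([[1.0, 1.5], [0.0, 1.0]])
C = F.T @ F
eta_plus, eta_minus = eta_lambda(C, 1.0)
ratios = (stretch_ratio(C, eta_plus), stretch_ratio(C, eta_minus))
outside = eta_lambda(C, 10.0)                 # lam^2 exceeds mu_max: no admissible direction
```

Both directions returned for \(\lambda = 1\) reproduce the prescribed stretching ratio, whereas for values of \(\lambda ^{2}\) outside the spectrum of C the line field is undefined at that point.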

In practice, the outermost closed stationary curve travels through regions of high FTLE/anisotropy values. There, the tangent direction field is almost collinear with \(v_{\min }(C)\). From the discussion in Sects. 4.1.1 and 4.1.4, we conclude that such closed curves are pointwise very close to the optimal direction for blocking \(\bar{g}\)-diffusion. Their deviation from the optimal direction is not very costly in terms of diffusive flux, but still allows them to close up smoothly under the conditions of the variational principle. Our ad hoc considerations here have been formalized and made rigorous in recent work by Haller et al. (2018, 2019), who provide a variational theory for detecting material transport barriers to diffusive transport in the original, time-dependent advection–diffusion process (Eqs. 5 and 6).

For a simple numerical test, we have overlaid the final clustering result as well as the second eigenfunction of \(\Delta _{\bar{g}}\) for a two-time-point approximation of \(\bar{g}\) with all closed \(\eta _{\lambda }\)-orbits in Fig. 12. We find that the eigenvector clustering procedure yields smaller structures (Fig. 12a). A close inspection of the level sets of the second eigenfunction, however, reveals a nice, albeit not perfect, visual matching of some level sets with the geodesic LCS results (Fig. 12b).

Fig. 12

Rotating double-gyre flow (evaluated at initial and final time points only). Closed \(\eta _{\lambda }\)-lines (black) on top of (a) the final clustering result from the \(\Delta _{\bar{g}}\)-analysis, and (b) the second eigenfunction \(w_{2}\) of \(\Delta _{\bar{g}}\)

In summary, its dimension-independent formulation and its high degree of consistency with the two-dimensional variational principles suggest our methodology as a natural extension of these approaches to higher dimensions. It has proven to be notoriously challenging to extend the variational ideas to three dimensions by restricting oneself to variational principles on curves (Blazevski and Haller 2014; Oettinger et al. 2016).

5.5 Applications to Geophysical Fluid Dynamics

In this section, we compare our methodology to Nakamura’s effective diffusivity framework (Nakamura 1996; Shuckburgh and Haynes 2003), which is widely used in the geophysical fluid dynamics community.

Both methods consider advection–diffusion processes in possibly turbulent fluid flows, i.e., the advection–diffusion equation (ADE) in the physical domain with a constant diffusion coefficient \(\kappa \) (see Footnote 7). In a second step, the ADE is transformed to a different set of coordinates: In Nakamura (1996), an almost Lagrangian coordinate system based on the area inside concentration level sets of a given tracer density \(\phi \) of interest is constructed. This transformation is considered advantageous, for it separates “the reversible effects of advection from the irreversible effects of advection and diffusion acting together” (Shuckburgh and Haynes 2003). In our context, this is achieved by passing to Lagrangian coordinates as in Press and Rybicki (1981), Thiffeault (2003), Fyrillas and Nomura (2007), and therefore literally to a tracer-based coordinate system, which factors out the advective motion, but keeps the joint action of advection and diffusion through tracking deformation and its effect on diffusion in the form of the pullback metric.

In the Nakamura framework, the coordinates are then given by the one-dimensional area coordinate A and \((d-1)\)-dimensional coordinates on concentration level sets. Clearly, the instantaneous local action of diffusion on the concentration is in the A-direction only, since there is no \(\phi \)-gradient along the level set coordinates. Averaging over contours allows to reduce the d-dimensional advection–diffusion equation to a one-dimensional (along the area coordinate) pure diffusion equation (Nakamura 1996):

$$\begin{aligned} \partial _t\phi = \partial _A(K_{\text {eff}}(A,t)\partial _A\phi ), \end{aligned}$$
(9)

and the scalar \(K_\text {eff}\) is coined effective diffusivity.

In our context, the Eulerian ADE is turned into a pure diffusion equation as well, Eq. 6, however, of full dimension d again. Moreover, we arrive at a diffusion tensor induced by the pullback metric. We comment on the reason and the benefit of this increased complexity later.

The next important concept in Nakamura’s framework is that of equivalent length \(L_{\text {eq}}\) (in 2D) and equivalent area \(A_{\text {eq}}\) (in 3D). For simplicity, we focus on \(L_{\text {eq}}\) henceforth, but all arguments apply analogously to higher-dimensional flows. The equivalent length of a concentration level set corresponding to the area coordinate A can be obtained from a comparison of the spatial scalar diffusivity \(\kappa \) (from the Eulerian ADE) and the effective diffusivity \(K_{\text {eff}}\) (from Eq. 9), cf. Shuckburgh and Haynes (2003, eq. (6)):

$$\begin{aligned} K_{\text {eff}}(A,t) = \kappa L_{\text {eq}}^2(A,t), \qquad \text {or equivalently}\qquad \frac{K_{\text {eff}}(A,t)}{L_{\text {eq}}^2(A,t)} = \kappa . \end{aligned}$$
(10)

It can be shown that \(L_{\text {eq}}^2\) is at least (the square of) the physical length of the corresponding concentration level set.

Equation 10 can be interpreted as answering the following question: How would one have to rescale units or, more generally, deform the local geometry in order to observe—in the new units/deformed geometry—the original diffusivity \(\kappa \)? Conceptually, this procedure is completely analogous to the duality of the diffusion tensor and the induced metric tensor (with respect to which anisotropic diffusion is isotropic; cf. Sect. 3.1), but also to the perspective that we took throughout Sect. 4.1: How do length, diffusivity, and volume in the geometry of mixing relate to the corresponding entities in the original spatial geometry?

How do we see in the geometry of mixing whether diffusion is effectively enhanced or suppressed (relative to the pure spatial diffusion) via its interaction with advection? As an early indicator of mixing efficiency, Nakamura (1996) suggested to look at the equivalent length, for a large \(L_\text {eq}\) implies a large effective diffusivity, which is then related to the mixing region. By analogy, the Lagrangian effective diffusivity \(1/d\cdot \,\mathrm{trace}\,\left( \overline{D}\right) \), d the dimension of space, plays the role of \(K_{\text {eff}}\) and serves as an indicator for the mixing region.
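The Lagrangian effective diffusivity can be evaluated pointwise from a time series of Cauchy–Green tensors, using the representation of the pullback diffusion tensor as \(C(t)^{-1}\) in normal coordinates (cf. Sect. 5.4). The sketch below is a toy comparison under assumptions: unit isotropic spatial diffusivity, a plain arithmetic mean over equally spaced times, and two hypothetical linear flows, a saddle with \(C(t)=\mathrm{diag}(e^{2t},e^{-2t})\) and a rigid rotation with \(C(t)=I\).

```python
import numpy as np

def lagrangian_effective_diffusivity(cg_tensors):
    """(1/d) * trace of the time-averaged pullback diffusion tensor C(t)^{-1}."""
    d = cg_tensors[0].shape[0]
    D_bar = np.mean([np.linalg.inv(C) for C in cg_tensors], axis=0)
    return np.trace(D_bar) / d

times = np.linspace(0.0, 1.0, 101)
# Hypothetical saddle flow: stretching by e^t in x and compression by e^{-t} in y.
saddle = [np.diag([np.exp(2.0 * t), np.exp(-2.0 * t)]) for t in times]
# Hypothetical rigid rotation: no deformation, so the Cauchy-Green tensor is the identity.
rotation = [np.eye(2) for _ in times]

k_saddle = lagrangian_effective_diffusivity(saddle)
k_rotation = lagrangian_effective_diffusivity(rotation)
```

The rigid rotation leaves the diffusivity at its spatial value of one, whereas the saddle yields a value close to \(\sinh (2)/2\approx 1.81\), i.e., diffusion is effectively enhanced by the deformation.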

The effective diffusivity framework is tied to a specific tracer concentration field. For instance, the fact that diffusion is one-dimensional, as well as the orientation of that single coordinate in physical space, is determined by the concrete concentration field at hand. A concentration field with a different initial level set topology may yield different results. To obtain physically relevant mixing information, it is assumed that a “suitable tracer field” is considered; cf. the effort taken in Nakamura (1996) and Shuckburgh and Haynes (2003) to generate those. Suitability there means, roughly speaking, that its level set topology is already reasonably “equilibrated” w.r.t. the mixing geometry induced by the flow, but at the same time has sufficiently strong gradients to allow for a computationally robust transformation to area coordinates. In other words, concentration gradients are roughly aligned with the direction of slowest diffusion, which is of greatest interest.

In contrast, our geometry of mixing provides information about the “mixing ability” (Shuckburgh and Haynes 2003) or “mixing potential” (Rypina and Pratt 2017) of the flow, independently from a concrete concentration field. The Lagrangian ADE remains of full dimension and invokes an effective diffusion tensor to account for all possible level set topologies. Moreover, this diffusion tensor field and its corresponding Laplace operator are obtained computationally from purely advective ODE simulations; there is no need to solve the advection–diffusion PDE.