# A Data-Driven Approximation of the Koopman Operator: Extending Dynamic Mode Decomposition


## Abstract

The Koopman operator is a *linear* but infinite-dimensional operator that governs the evolution of scalar observables defined on the state space of an autonomous dynamical system and is a powerful tool for the analysis and decomposition of nonlinear dynamical systems. In this manuscript, we present a data-driven method for approximating the leading *eigenvalues, eigenfunctions, and modes* of the Koopman operator. The method requires a data set of snapshot pairs and a dictionary of scalar observables, but does not require explicit governing equations or interaction with a “black box” integrator. We will show that this approach is, in effect, an extension of dynamic mode decomposition (DMD), which has been used to approximate the Koopman eigenvalues and modes. Furthermore, if the data provided to the method are generated by a Markov process instead of a deterministic dynamical system, the algorithm approximates the eigenfunctions of the Kolmogorov backward equation, which could be considered as the “stochastic Koopman operator” (Mezic in Nonlinear Dynamics 41(1–3): 309–325, 2005). Finally, four illustrative examples are presented: two that highlight the quantitative performance of the method when presented with either deterministic or stochastic data and two that show potential applications of the Koopman eigenfunctions.

## Keywords

Data mining · Koopman spectral analysis · Set oriented methods · Spectral methods · Reduced order models

## Mathematics Subject Classification

Primary: 65P99, 37M25; Secondary: 47B33

## 1 Introduction

In many mathematical and engineering applications, a phenomenon of interest can be summarized in different ways. For instance, to describe the state of a two-dimensional incompressible fluid flow, one can either record velocity and pressure fields or stream function and vorticity (Hirsch 2007). Furthermore, these states can often be *approximated* using a low-dimensional set of proper orthogonal decomposition (POD) modes (Holmes et al. 1998), a set of dynamic modes (Schmid 2010; Schmid et al. 2012), or a finite collection of Lagrangian particles (Monaghan 1992). A mathematical example is the linear time-invariant (LTI) system provided by \(\varvec{x}(n+1) = \varvec{A} \varvec{x}(n)\), where \(\varvec{x}(n)\) is the system state at the *n*th timestep. Written as such, the evolution of \(\varvec{x}\) is governed by the eigenvalues of \(\varvec{A}\). One could also consider the invertible but nonlinear change in variables, \(\varvec{z}(n) = \varvec{T}(\varvec{x}(n))\), which generates a nonlinear evolution law for \(\varvec{z}\). Both approaches (i.e., \(\varvec{x}\) or \(\varvec{z}\)) describe the same fundamental behavior, yet one description may be preferable to others. For example, solving an LTI system is almost certainly preferable to evolving a nonlinear system from a computational standpoint.
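As a concrete sketch of this point (the matrix \(\varvec{A}\) and the change of variables \(\varvec{T}\) below are hypothetical choices for illustration), the LTI system can be solved directly through the eigendecomposition of \(\varvec{A}\), while the transformed variable \(\varvec{z}\) obeys a nonlinear evolution law that describes the same behavior:

```python
import numpy as np

# Hypothetical LTI system x(n+1) = A x(n)
A = np.array([[0.9, 0.2],
              [0.0, 0.5]])
mu, V = np.linalg.eig(A)                     # the eigenvalues of A govern the evolution
x0 = np.array([1.0, 1.0])

def x_lti(n):
    # x(n) = V diag(mu^n) V^{-1} x(0)
    return (V @ np.diag(mu**n) @ np.linalg.solve(V, x0)).real

# An invertible but nonlinear change of variables, z = T(x), generates a
# nonlinear evolution law for z
T = lambda x: np.array([x[0], x[1] + x[0]**2])
T_inv = lambda z: np.array([z[0], z[1] - z[0]**2])
z_step = lambda z: T(A @ T_inv(z))           # nonlinear evolution law for z

z = T(x0)
for _ in range(5):
    z = z_step(z)

# Both descriptions capture the same fundamental behavior
assert np.allclose(T_inv(z), x_lti(5))
```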

In general, one measures (or computes) the state of a system using a set of scalar *observables*, which are functions defined on state space, and watches how the values of these functions evolve in time. As we will show shortly, one can write an evolution law for the dynamics of this set of observables, and if they happen to be “rich enough,” reconstruct the original system state from the observations. Because the properties of this new dynamical system depend on our choice of variables (observables), it would be highly desirable if one could find a set of observables whose dynamics appear to be governed by a linear evolution law. If such a set could be identified, the dynamics would be completely determined by the spectrum of the evolution operator. Furthermore, this could enable the simple yet effective algorithms designed for linear systems, for example controller design (Todorov 2007; Stengel 2012) or stability analysis (Mauroy and Mezic 2013; Lehoucq et al. 1998), to be applied to nonlinear systems.

Mathematically, the evolution of observables of the system state is governed by the *Koopman operator* (Koopman and Neumann 1932; Koopman 1931; Budišić et al. 2012; Rowley et al. 2009), which is a *linear but infinite-dimensional* operator that is defined for a given dynamical system. Of particular interest here is the “slow” subspace of the Koopman operator, which is the span of the eigenfunctions associated with eigenvalues near the unit circle in discrete time (or near the imaginary axis in continuous time). These eigenvalues and eigenfunctions capture the long-term dynamics of observables that appear after the fast transients have subsided and could serve as a low-dimensional approximation of the otherwise infinite-dimensional operator when a spectral gap, which clearly delineates the “fast” and “slow” temporal dynamics, is present. In addition to the eigenvalues and eigenfunctions, the final element of Koopman spectral analysis is the set of *Koopman modes* for the *full-state* observable (i.e., the identity operator) (Budišić et al. 2012; Rowley et al. 2009) which are vectors that enable us to reconstruct the *state of the system* as a linear combination of the Koopman eigenfunctions. Overall, the “tuples” of Koopman eigenfunctions, eigenvalues, and modes enable us to: (a) transform state space so that the dynamics appear to be linear, (b) determine the temporal dynamics of the linear system, and (c) reconstruct the state of the original system from our new linear representation. In principle, this framework is quite broadly applicable and useful even for problems with multiple attractors that cannot be accurately approximated using models based on local linearization.

There are several algorithms in the literature that can computationally approximate subsets of these quantities. Three examples are numerical implementations of generalized Laplace analysis (GLA) (Budišić et al. 2012; Mauroy and Mezić 2012; Mauroy et al. 2013), the Ulam Galerkin method (Froyland et al. 2014; Bollt and Santitissadeekorn 2013), and dynamic mode decomposition (DMD) (Schmid 2010; Tu et al. 2014; Rowley et al. 2009). None of these techniques require explicit governing equations, so all, in principle, can be applied directly to data. GLA can approximate both the Koopman modes and eigenfunctions, but it requires knowledge of the eigenvalues to do so (Budišić et al. 2012; Mauroy and Mezić 2012; Mauroy et al. 2013). The Ulam Galerkin method has been used to approximate the eigenfunctions and eigenvalues (Froyland et al. 2014), though it is more frequently used to generate finite-dimensional approximations of the Perron–Frobenius operator, which is the adjoint of the Koopman operator. Finally, DMD has been used to approximate the Koopman modes and eigenvalues (Rowley et al. 2009; Tu et al. 2014), but not the Koopman eigenfunctions.

Even in pairs instead of triplets, approximations of these quantities are useful. DMD and its variants (Wynn et al. 2013; Chen et al. 2012; Jovanović et al. 2014) have been successfully used to analyze nonlinear fluid flows using data from both experiments and computation (Schmid 2010; Muld et al. 2012; Seena and Sung 2011). GLA and similar methods have been applied to extract meaningful spatio-temporal structures using sensor data from buildings and power systems (Eisenhower et al. 2010; Susuki and Mezić 2011, 2012, 2014). Finally, the Ulam Galerkin method has been used to identify coherent structures and almost invariant sets (Froyland et al. 2007; Froyland and Padberg 2009; Froyland 2005) based on the singular value decomposition of (a slight modification of) the Perron–Frobenius operator.

In this manuscript, we present a data-driven method that approximates the leading Koopman *eigenfunctions, eigenvalues, and modes* from a data set of successive “snapshot” pairs and a dictionary of observables that spans a subspace of the space of scalar observables. There are many possible ways to choose this dictionary, and it could consist of polynomials, Fourier modes, spectral elements, or other sets of functions of the full-state observable. We will argue that this approach is an extension of DMD that can produce better approximations of the Koopman eigenfunctions; as such, we refer to it as *extended dynamic mode decomposition* (EDMD). One regime where the behavior of both EDMD and DMD can be formally analyzed and contrasted is the limit of large data. Following the definition of DMD in Tu et al. (2014), we consider the case where this large data set consists of a distribution of snapshot pairs rather than a single time series. In this regime, we will show that the numerical approximation of the Koopman eigenfunctions generated by EDMD converges to the numerical approximation we would obtain from a Galerkin method (Boyd 2013) in that the residual is orthogonal to the subspace spanned by the elements of the dictionary. With finite amounts of data, we will demonstrate the effectiveness of EDMD on two deterministic examples: one that highlights the quantitative accuracy of the method and another that shows a more practical application.

Because EDMD is an entirely data-driven procedure, it can also be applied to data from stochastic systems without any algorithmic changes. If the underlying system is a Markov process, we will show that EDMD approximates the eigenfunctions of the Kolmogorov backward equation (Givon et al. 2004; Bagheri 2014) which has been called the *stochastic Koopman* operator (SKO) (Mezić 2005). Once again, we will demonstrate the effectiveness of the EDMD procedure when the amount of data is limited, by applying it to two stochastic examples: the first to test the accuracy of the method and the second to highlight a potential application of EDMD as a nonlinear manifold learning technique. In the latter example, we highlight two forms of model reduction: reduction that occurs when the *dynamics of the system state* are constrained to a low-dimensional manifold and reduction that occurs when the evolution of *statistical moments* of the stochastic dynamical system are effectively low dimensional.

In the remainder of the manuscript, we will detail the EDMD algorithm and show (when mathematically possible) or demonstrate through examples that it accurately approximates the leading Koopman eigenfunctions, eigenvalues, and modes for both deterministic and stochastic sets of data. In particular, in Sect. 2, the EDMD algorithm will be presented, and we will prove that it converges to a Galerkin approximation of the Koopman operator given a sufficiently large amount of data. In Sect. 3, we detail three choices of dictionary that we have found to be effective in a broad set of applications. In Sect. 4, we will demonstrate that the EDMD approximation can be accurate even with finite amounts of data and can yield useful parameterizations of common dynamical structures in problems with multiple basins of attraction *when the underlying system is deterministic*. In Sect. 5, we experiment by applying EDMD to stochastic data and show it approximates the eigenfunctions of the SKO for Markov processes. Though the interpretation of the eigenfunctions now differs, we demonstrate that they can still be used to accomplish useful tasks such as the parameterization of nonlinear manifolds. Finally, some brief concluding remarks are given in Sect. 6.

## 2 Dynamic Mode Decomposition and the Koopman Operator

Our ambition in this section is to establish the connection between the Koopman operator and what we call EDMD. To accomplish this, we will define the Koopman operator in Sect. 2.1. Using this definition, we will outline the EDMD algorithm in Sect. 2.2 and then show how it can be used to approximate the Koopman eigenvalues, eigenfunctions, and modes. Next, in Sect. 2.3, we will prove that the EDMD method almost surely converges to a Galerkin method in the limit of large data. Finally, in Sect. 2.4, we will highlight the connection between the EDMD algorithm and standard DMD.

### 2.1 The Koopman Operator

For a discrete-time dynamical system, \(\varvec{x}(n+1) = \varvec{F}(\varvec{x}(n))\), with state space \({\mathcal {M}}\), the Koopman operator, \(\mathcal {K}\), acts on scalar observables, \(\psi :{\mathcal {M}}\rightarrow \mathbb {C}\), by composition with the evolution law, \(\mathcal {K}\psi = \psi \circ \varvec{F}\); that is, *the Koopman operator maps functions of state space to functions of state space* and not states to states (Koopman and Neumann 1932; Koopman 1931; Budišić et al. 2012).

In essence, the Koopman operator defines a new dynamical system, \(({\mathcal {F}}, n, \mathcal {K})\), that governs the evolution of *observables*, \(\psi \in {\mathcal {F}}\), in discrete time. In what follows, we assume that \({\mathcal {F}}= L^2({\mathcal {M}}, \rho )\), where \(\rho \) is a positive, single-valued analytic function with \(\Vert \rho \Vert _{\mathcal {M}}= \int _{\mathcal {M}}\rho (\varvec{x}) \;d\varvec{x} = 1\), but not necessarily an invariant measure of the underlying dynamical system. This assumption, which has been made before in the literature (Budišić et al. 2012; Koopman 1931), is required so that the inner products in the Galerkin-like method we will present can be taken. Because it acts on functions, \(\mathcal {K}\) is infinite dimensional even when \(\varvec{F}\) is finite dimensional, but provided that \({\mathcal {F}}\) is a vector space, it is also *linear even when* \(\varvec{F}\) *is nonlinear*. The infinite-dimensional nature of the Koopman operator is potentially problematic, but if it can, practically, be truncated without too great a loss of accuracy (e.g., if the system has multiple timescales), then the result would be a linear and finite-dimensional approximation. Therefore, the promise of the Koopman approach is to take the tools developed for linear systems and apply them to the dynamical system defined by the Koopman operator, thus obtaining a linear approximation of a nonlinear system without directly linearizing around a particular fixed point.
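As a minimal illustration of these properties (the one-dimensional map \(F\) and the observables below are hypothetical choices), the Koopman operator can be applied by composition, and it acts linearly on observables even though it is defined through a nonlinear \(F\):

```python
import numpy as np

# A hypothetical nonlinear map F on (0, 1) and scalar observables psi
F = lambda x: x**2

# The Koopman operator acts on observables by composition: (K psi)(x) = psi(F(x))
koopman = lambda psi: (lambda x: psi(F(x)))

# K is linear in the observable even though F is nonlinear:
# K(a*psi1 + b*psi2) = a*(K psi1) + b*(K psi2)
psi1, psi2 = np.sin, np.cos
a, b = 2.0, -3.0
x = 0.7
combo = lambda x: a * psi1(x) + b * psi2(x)
assert np.isclose(koopman(combo)(x), a * koopman(psi1)(x) + b * koopman(psi2)(x))

# psi(x) = log(x) is a Koopman eigenfunction of this F with eigenvalue 2,
# since log(F(x)) = log(x**2) = 2 log(x)
assert np.isclose(koopman(np.log)(x), 2.0 * np.log(x))
```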

Although the full-state observable, \(\varvec{g}(\varvec{x}) = \varvec{x}\), is a *vector-valued observable*, each component of it is a scalar-valued observable, i.e., \(g_i\in {\mathcal {F}}\) where \(g_i\) is the *i*th component of \(\varvec{g}\). Assuming each \(g_i\) is in the span of our set of \({N_K}\) eigenfunctions, we may write \(g_i = \sum _{k=1}^{N_K} v_{ik}\varphi _k\) with \(v_{ik}\in \mathbb {C}\). Then, \(\varvec{g}\) can be obtained by “stacking” these weights into vectors (i.e., \(\varvec{v}_k = [v_{1k}, v_{2k},\ldots , v_{Nk}]^T\)). As a result,

$$\begin{aligned} \varvec{g}(\varvec{x}) = \sum _{k=1}^{N_K} \varvec{v}_k \varphi _k(\varvec{x}), \end{aligned}$$ (2)

where \(\varvec{v}_k\) is the *k*th *Koopman mode* and \(\varphi _k\) is the *k*th *Koopman eigenfunction*. In doing this, we have assumed that each of the scalar observables that comprise \(\varvec{g}\) is in the subspace of \({\mathcal {F}}\) spanned by our \({N_K}\) eigenfunctions, but we have not assumed that the eigenfunctions form a basis for \({\mathcal {F}}\). In some applications, the Koopman operator will have a continuous spectrum, which as shown in Mezić (2005) would introduce an additional term in (2). The system state at future times can be obtained either by directly evolving \(\varvec{x}\) or by evolving the full-state observable through the Koopman operator:

$$\begin{aligned} \varvec{F}(\varvec{x}) = (\mathcal {K}\varvec{g})(\varvec{x}) = \sum _{k=1}^{N_K} \mu _k \varvec{v}_k \varphi _k(\varvec{x}). \end{aligned}$$ (3)

When the underlying dynamical system is continuous in time, \(\dot{\varvec{x}} = \varvec{f}(\varvec{x})\), the evolution of observables is instead governed by \(\hat{\mathcal {K}}\), the *continuous-time* Koopman operator. If \(\varphi _k\) is the *k*th eigenfunction of \(\hat{\mathcal {K}}\) associated with the eigenvalue \(\lambda _k\), then the future value of some vector-valued observable \(\varvec{g}\) can be written as

$$\begin{aligned} \varvec{g}(\varvec{x}(t)) = \sum _{k=1}^{N_K} \mathrm {e}^{\lambda _k t}\, \varvec{v}_k \varphi _k(\varvec{x}(0)). \end{aligned}$$ (6)

In what follows, we will not have access to the “right-hand side” function, \(\varvec{f}\), and therefore cannot approximate \(\hat{\mathcal {K}}\) directly. However, if the discrete-time dynamical system \(\varvec{F}\) is the flow map associated with \(\varvec{f}\) for a fixed time interval \(\Delta t\) (i.e., \(\mathcal {K}= \mathcal {K}_{\Delta t}\)), then (3) and (6) are equivalent with \(\mu _k = e^{\lambda _k\Delta t}\). As a result, although we will be approximating the Koopman operator associated with discrete-time dynamical systems, we will often present our results in terms of \(\lambda _k\) rather than \(\mu _k\) when the underlying system is a flow rather than a map.

### 2.2 Extended Dynamic Mode Decomposition

The EDMD procedure requires two inputs: (a) *a pair of data sets*,

$$\begin{aligned} \varvec{X} = [\varvec{x}_1, \varvec{x}_2, \ldots , \varvec{x}_M], \qquad \varvec{Y} = [\varvec{y}_1, \varvec{y}_2, \ldots , \varvec{y}_M], \end{aligned}$$ (7)

whose columns are *snapshots of the system state* with \(\varvec{y}_i = \varvec{F}(\varvec{x}_i)\), and (b) a dictionary of observables, \({\mathcal {D}}=\{\psi _1, \psi _2, \ldots , \psi _{N_K}\}\) where \(\psi _i\in {\mathcal {F}}\), whose span we denote as \({\mathcal {F}}_{{\mathcal {D}}}\subset {\mathcal {F}}\); for brevity, we also define the vector-valued function \(\varvec{\Psi }:{\mathcal {M}}\rightarrow \mathbb {C}^{1\times N_K}\) where

$$\begin{aligned} \varvec{\Psi }(\varvec{x}) = \left[ \psi _1(\varvec{x}), \psi _2(\varvec{x}), \ldots , \psi _{N_K}(\varvec{x})\right] . \end{aligned}$$ (8)

#### 2.2.1 Approximating the Koopman Operator and its Eigenfunctions

The goal of EDMD is to generate a matrix, \(\varvec{K}\in \mathbb {C}^{N_K\times N_K}\), that is a finite-dimensional approximation of the Koopman operator on \({\mathcal {F}}_{\mathcal {D}}\). We choose \(\varvec{K}\) to minimize the least-squares cost

$$\begin{aligned} J = \frac{1}{2}\sum _{m=1}^M \left\| \varvec{\Psi }(\varvec{y}_m) - \varvec{\Psi }(\varvec{x}_m)\varvec{K}\right\| _2^2, \end{aligned}$$ (11)

where \(\varvec{x}_m\) is the *m*th snapshot in \(\varvec{X}\), and \(\varvec{y}_m = \varvec{F}(\varvec{x}_m)\) is the *m*th snapshot in \(\varvec{Y}\). Equation (11) is a least-squares problem and therefore cannot have multiple isolated local minima; it must have either a unique global minimizer or a continuous family (or families) of minimizers. As a result, regularization (here via the truncated singular value decomposition) may be required to ensure the solution is unique, and the \(\varvec{K}\) that minimizes (11) is:

$$\begin{aligned} \varvec{K} \triangleq \varvec{G}^+\varvec{A}, \end{aligned}$$ (12)

where \(^+\) denotes the pseudoinverse and

$$\begin{aligned} \varvec{G} = \frac{1}{M}\sum _{m=1}^M \varvec{\Psi }(\varvec{x}_m)^*\varvec{\Psi }(\varvec{x}_m), \qquad \varvec{A} = \frac{1}{M}\sum _{m=1}^M \varvec{\Psi }(\varvec{x}_m)^*\varvec{\Psi }(\varvec{y}_m). \end{aligned}$$ (13)

If \(\varvec{\xi }_j\) is the *j*th eigenvector of \(\varvec{K}\) with the eigenvalue \(\mu _j\), then the EDMD approximation of an eigenfunction of \(\mathcal {K}\) is

$$\begin{aligned} \varphi _j = \varvec{\Psi }\varvec{\xi }_j. \end{aligned}$$ (14)

The \(\mu _j\) approximate the eigenvalues of the discrete-time Koopman operator; when the data come from a flow sampled every \(\Delta t\), \(\lambda _j = \log (\mu _j)/\Delta t\) approximates the corresponding eigenvalue of the *continuous-time system*. In the remainder of the manuscript, we denote the eigenvalues of \(\varvec{K}\) with the \(\mu _j\) and (when applicable) the approximation of the corresponding continuous-time eigenvalues as \(\lambda _j\). Although both embody the same information, one choice is often more natural for a specific problem.

#### 2.2.2 Computing the Koopman Modes

Next, we will compute approximations of the Koopman modes for the full-state observable using EDMD. Recall that the Koopman modes are the weights needed to express the full state in the *Koopman eigenfunction* basis. As such, we will proceed in two steps: First, we will express the full-state observable using the elements of \({\mathcal {D}}\); then, we will find a mapping from the elements of \({\mathcal {D}}\) to the numerically computed eigenfunctions. Applying these two steps in sequence will yield the observables expressed as a linear combination of Koopman eigenfunctions, which are, by definition, the Koopman modes for the full-state observable.

To begin, we express the full-state observable, \(\varvec{g}(\varvec{x}) = \varvec{x}\), using a vector of *N* scalar-valued observables, \(g_i:{\mathcal {M}}\rightarrow \mathbb {R}\), as follows:

$$\begin{aligned} g_i(\varvec{x}) = \varvec{e}_i^*\varvec{x}, \end{aligned}$$ (15)

where \(\varvec{e}_i\) is the *i*th unit vector in \(\mathbb {R}^N\). At this time, we conveniently assume that all \(g_i(\varvec{x})\in {\mathcal {F}}_{{\mathcal {D}}}\) so that \(g_i(\varvec{x}) = \sum _{k=1}^{N_K} \psi _k(\varvec{x}) b_{k,i} = \varvec{\Psi }(\varvec{x})\varvec{b}_i\), where \(\varvec{b}_i\) is some appropriate vector of weights. If this is not the case, *approximate* Koopman modes can be computed by projecting \(g_i\) onto \({\mathcal {F}}_{\mathcal {D}}\), though the accuracy and usefulness of this fit clearly depend on the choice of \({\mathcal {D}}\). To avoid this issue, we take \(g_i\in {\mathcal {F}}_{\mathcal {D}}\) for \(i=1,\ldots ,N\) in all examples that follow. In either case, the entire vector-valued observable can be expressed (or approximated) in this manner as

$$\begin{aligned} \varvec{g}(\varvec{x}) = \varvec{B}^T\varvec{\Psi }(\varvec{x})^T, \qquad \varvec{B} = [\varvec{b}_1, \varvec{b}_2, \ldots , \varvec{b}_N]. \end{aligned}$$ (16)

Next, we express the \(\psi _i\) in terms of the numerically computed eigenfunctions. From (14), \(\varvec{\varphi }= \varvec{\Psi }\varvec{\Xi }\), where \(\varvec{\varphi }= [\varphi _1, \varphi _2, \ldots , \varphi _{N_K}]\), \(\varvec{\Xi }= [\varvec{\xi }_1, \varvec{\xi }_2, \ldots , \varvec{\xi }_{N_K}]\), and \(\varvec{\xi }_i\) is the *i*th eigenvector of \(\varvec{K}\) associated with \(\mu _i\). Therefore, we can determine the \(\psi _i\) as a function of the \(\varphi _i\) by inverting \(\varvec{\Xi }^T\). Because \(\varvec{\Xi }\) is a matrix of eigenvectors, its inverse is

$$\begin{aligned} \varvec{\Xi }^{-1} = \varvec{W}^*, \qquad \varvec{W} = [\varvec{w}_1, \varvec{w}_2, \ldots , \varvec{w}_{N_K}], \end{aligned}$$ (18)

where \(\varvec{w}_i\) is the *i*th left eigenvector of \(\varvec{K}\) also associated with \(\mu _i\) (i.e., \(\varvec{w}_i^*\varvec{K} = \mu _i\varvec{w}_i^*\)) appropriately scaled so \(\varvec{w}_i^*\varvec{\xi }_i = 1\), and hence

$$\begin{aligned} \varvec{\Psi }= \varvec{\varphi }\varvec{W}^*. \end{aligned}$$ (19)

We combine (16) and (19), and after some slight algebraic manipulation find that

$$\begin{aligned} \varvec{g}(\varvec{x}) = \sum _{i=1}^{N_K} \varvec{v}_i \varphi _i(\varvec{x}), \qquad \varvec{v}_i = \left( \varvec{w}_i^*\varvec{B}\right) ^T, \end{aligned}$$ (20)

where \(\varvec{v}_i\) is the *i*th Koopman mode. This is the formula for the Koopman modes that we desired.

#### 2.2.3 Algorithm Summary

To summarize, the EDMD procedure requires the following inputs:

- (1)
A data set of snapshot pairs \(\{(\varvec{x}_m, \varvec{y}_m)\}_{m=1}^M\).

- (2)
A set of functions, \({\mathcal {D}}= \{\psi _1, \psi _2, \ldots , \psi _{N_K}\}\), that will be used to approximate the Koopman eigenfunctions; for brevity, we define the vector-valued function, \(\varvec{\Psi }\), in (8) that contains these dictionary elements.

- (3)
A matrix \(\varvec{B}\), defined in (16), which contains the weights required to reconstruct the full-state observable using the elements of \({\mathcal {D}}\).

Given these inputs, the EDMD algorithm proceeds in three steps:

- (1)
Loop through the data set of snapshot pairs to form the matrices \(\varvec{G}\) and \(\varvec{A}\) using (13); these computations can be performed in parallel and are compatible with the MapReduce framework (Dean and Ghemawat 2008).

- (2)
Form \(\varvec{K} \triangleq \varvec{G}^+\varvec{A}\). Then, compute the set of eigenvalues, \(\mu _i\), eigenvectors, \(\varvec{\xi }_i\), and left eigenvectors, \(\varvec{w}_i\).

- (3)
Compute the set of Koopman modes by setting \(\varvec{v}_i \triangleq (\varvec{w}_i^*\varvec{B})^T\), where \(\varvec{v}_i\) is the *i*th Koopman mode.

The outputs of this procedure are the set of eigenvalues, \(\mu _i\), each of which is our approximation of the *i*th Koopman eigenvalue; the set of \(\varvec{v}_i\), which are our approximations of the Koopman modes; and the vectors \(\varvec{\xi }_i\), which allow the *i*th Koopman eigenfunction to be approximated using (14).
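The three steps above can be sketched in a few lines. The snippet below is a minimal EDMD implementation; the test system (a linear map \(\varvec{J}\)) and the small monomial dictionary are hypothetical choices made so that the recovered eigenvalues are known in advance.

```python
import numpy as np

def edmd(X, Y, psi, B):
    """Minimal EDMD sketch. X, Y: snapshot pairs as columns with Y[:, m] = F(X[:, m]);
    psi: maps a state to the dictionary row Psi(x); B: weights such that
    g(x) = B^T Psi(x)^T reconstructs the full-state observable."""
    PsiX = np.array([psi(x) for x in X.T])            # M x N_K
    PsiY = np.array([psi(y) for y in Y.T])
    M = PsiX.shape[0]
    # Step 1: form G and A (a loop over snapshot pairs, vectorized here)
    G = PsiX.conj().T @ PsiX / M
    A = PsiX.conj().T @ PsiY / M
    # Step 2: K = G^+ A, then eigenvalues, eigenvectors, and left eigenvectors
    K = np.linalg.pinv(G) @ A
    mu, Xi = np.linalg.eig(K)
    W = np.linalg.inv(Xi).conj().T                    # columns w_i, with w_i^* xi_i = 1
    # Step 3: Koopman modes v_i = (w_i^* B)^T, stacked as the rows of V
    V = W.conj().T @ B
    return mu, Xi, V

# Hypothetical test system: a linear map with known Koopman eigenvalues
rng = np.random.default_rng(0)
J = np.diag([0.9, 0.5])
X = rng.standard_normal((2, 200))
Y = J @ X
psi = lambda x: np.array([1.0, x[0], x[1], x[0] * x[1]])   # small monomial dictionary
B = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
mu, Xi, V = edmd(X, Y, psi, B)
# The dictionary is invariant under composition with J, so the eigenvalues are
# recovered exactly: 1 (constant), 0.9, 0.5, and 0.45 (from x0*x1)
assert np.allclose(np.sort(mu.real), [0.45, 0.5, 0.9, 1.0])
```

Because each dictionary element maps to an element of \({\mathcal {F}}_{\mathcal {D}}\) under this test map, the finite-dimensional approximation is exact up to floating-point error; for a generic nonlinear system, the quality of the result depends on the dictionary.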

Ultimately, the EDMD procedure is a regression procedure and can be applied to any set of snapshot pairs without modification. However, the contribution of this manuscript is to show how the regression problem above relates to the Koopman operator, and making this connection requires some knowledge about the process that created the data. If the underlying system is (effectively) a discrete-time, autonomous dynamical system (or, equivalently, an autonomous flow sampled at a fixed interval of \(\Delta t\)), then the results of this section hold. If the underlying system is stochastic, then a connection with the Koopman operator still exists, but will be discussed later in Sect. 5. Furthermore, even some non-autonomous systems could, in principle, be analyzed using EDMD by augmenting the state vector to include time; this, however, will be left for future work.

### 2.3 Convergence of the EDMD Algorithm to a Galerkin Method

In this subsection, we relate EDMD to the Galerkin methods one would use to approximate the Koopman operator with complete information about the underlying dynamical system. In this context, a Galerkin method is a weighted-residual method where the residual, as defined in (10), is orthogonal to the span of \({\mathcal {D}}\). In particular, we show that the EDMD approximation of the Koopman operator converges to the approximation that would be obtained from a Galerkin method in the large-data limit. In this manuscript, the large-data limit is when \(M\rightarrow \infty \) and the elements of \(\varvec{X}\) are drawn independently from a distribution on \({\mathcal {M}}\) with the probability density \(\rho \). Note that one method of generating a more complex \(\rho \) (e.g., not a Gaussian or uniform distribution) is to randomly sample points from a single, infinitely long trajectory. In this case, \(\rho \) is one of the natural measures associated with the underlying dynamical system. We also assume that \({\mathcal {F}}=L^2({\mathcal {M}}, \rho )\). The first assumption defines a process for adding new data points to our set and could be replaced with other sampling schemes. The second assumption is required so that the inner products in the Galerkin method converge, which is relevant for problems where \({\mathcal {M}}=\mathbb {R}^N\).

There are, of course, additional questions about when the resulting approximation is a *useful* approximation of the Koopman operator (e.g., when can we “trust” our eigenfunctions if \(\rho \) is compactly supported but \({\mathcal {M}}=\mathbb {R}^N\)?), but they are beyond the scope of this manuscript and will be the focus of future work.

In the limit of large *M*, we would like the *ij*th element of \(\varvec{G}\) to be the inner product

$$\begin{aligned} \langle \psi _i, \psi _j\rangle = \int _{\mathcal {M}}\psi _i^*(\varvec{x})\psi _j(\varvec{x})\,\rho (\varvec{x})\;d\varvec{x}, \end{aligned}$$ (21)

which is what a Galerkin method requires. For finite *M*, the *ij*th element of \(\varvec{G}\) is

$$\begin{aligned} \varvec{G}_{ij} = \frac{1}{M}\sum _{m=1}^M \psi _i^*(\varvec{x}_m)\psi _j(\varvec{x}_m), \end{aligned}$$ (22)

so the *ij*th element of \(\varvec{G}\) contains the sample mean of \(\psi _i^*(\varvec{x})\psi _j(\varvec{x})\) evaluated at the data points in \(\varvec{X}\); similarly, the *ij*th element of \(\varvec{A}\) contains the sample mean of \(\psi _i^*(\varvec{x})\psi _j(\varvec{F}(\varvec{x}))\). When *M* is finite, (21) is approximated by (22). However, by the law of large numbers, the sample means almost surely converge to the expected values when the number of samples, *M*, becomes sufficiently large. For this system, the expectations can be written as

$$\begin{aligned} \mathbb {E}\left[ \psi _i^*(\varvec{x})\psi _j(\varvec{x})\right] = \langle \psi _i, \psi _j\rangle , \qquad \mathbb {E}\left[ \psi _i^*(\varvec{x})\psi _j(\varvec{F}(\varvec{x}))\right] = \langle \psi _i, \mathcal {K}\psi _j\rangle , \end{aligned}$$

so in the large-data limit, \(\varvec{G}\) and \(\varvec{A}\) converge to the matrices of inner products required by a Galerkin method, and \(\varvec{K} = \varvec{G}^+\varvec{A}\) converges to the Galerkin approximation of \(\mathcal {K}\) on \({\mathcal {F}}_{\mathcal {D}}\).
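A quick numerical illustration of this convergence, under the assumption (made for this sketch only) that \(\rho \) is the standard normal density on \(\mathbb {R}\) and with two hypothetical observables whose Gaussian moments are known:

```python
import numpy as np

# With rho the standard normal density on R, the sample mean that defines an
# entry of G converges to the inner product <psi_i, psi_j> by the law of
# large numbers.
rng = np.random.default_rng(1)
psi_i = lambda x: x
psi_j = lambda x: x**2        # observables with known Gaussian moments

M = 1_000_000
x = rng.standard_normal(M)
sample_mean = np.mean(psi_i(x) * psi_j(x))   # entry of G for finite M
exact = 0.0                                  # <psi_i, psi_j> = E[x^3] = 0
assert abs(sample_mean - exact) < 0.05
```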

Implicit in this argument is the fact that \(\varvec{x}_m\) and \(\varvec{y}_m\) are snapshots of the system state, which implies that a snapshot, say \(\varvec{x}_m\), cannot map to “two different places.” However, there are many cases where \(\varvec{x}_m\) and \(\varvec{y}_m\) are only partial measurements of the system state, in which case this could happen. For example, \(\varvec{x}_m\) and \(\varvec{y}_m\) could consist of *N* elements of a vector in \(\mathbb {R}^{N_0}\) where \(N_0 > N\), and so some state measurements have simply been neglected. This is quite common and occurs when the data have been compressed using POD or other related techniques. Another common example is where \(\varvec{x}_m\) and \(\varvec{y}_m\) are snapshots of the full state, but the underlying system is periodically forced; in this case, the missing information is the “phase” of the forcing.

Because the Koopman operator acts on scalar observables, one could still consider the EDMD procedure as an approximation of a Koopman operator using a set of functions, \(\psi _k\), that are constant in the “missing” components. As a result, if the true eigenfunctions are nearly constant in these directions [an assumption that is justifiable in some cases, such as fast-slow systems (Froyland et al. 2014)], then the missing information may have only a small impact on the resulting eigenfunctions. However, this assumption clearly does not hold in general, so care must be taken in order to have as “complete” a set of measurements as possible.

### 2.4 Relationship with DMD

When *M* is not large, EDMD will not be an accurate Galerkin method because the quadrature errors generated by the Monte Carlo integrator will be significant, and so the residual will probably not be orthogonal to \({\mathcal {F}}_{{\mathcal {D}}}\). However, it is still formally an extension of DMD, which has empirically been shown to yield meaningful results even without exhaustive data sets. In this section, we show that EDMD is equivalent to DMD *for a very specific (and restrictive) choice of* \({\mathcal {D}}\), in the sense that EDMD and DMD will then produce the same set of eigenvalues and modes for any set of snapshot pairs.

Following Tu et al. (2014), we define the *DMD modes* as the eigenvectors of the matrix

$$\begin{aligned} \varvec{K}_{\mathrm{DMD}} \triangleq \varvec{Y}\varvec{X}^+, \end{aligned}$$

where the *j*th mode is associated with the *j*th eigenvalue of \(\varvec{K}_{\mathrm{DMD}}\), \(\mu _j\). \(\varvec{K}_{\mathrm{DMD}}\) is constructed using the data matrices in (7), where \(+\) again denotes the pseudoinverse. This definition is a generalization of preexisting DMD algorithms (Schmid 2010; Schmid et al. 2011) and does not require the data to be in the form of a single time series. All that is required are states and their images (for a map) or their updated positions after a fixed interval in time of \(\Delta t\) (for a flow). However, in the event that the data *are* in the form of a single time series, then as discussed in Tu et al. (2014), this generalization is related to the Krylov method presented in Rowley et al. (2009).

Now consider the dictionary consisting of the scalar observables that comprise the full-state observable, i.e., \(\varvec{\Psi }(\varvec{x}) = \varvec{x}^T\). *This is the special (if relatively restrictive) choice of dictionary alluded to earlier*. In particular, we show that the *i*th Koopman mode, \(\varvec{v}_i\), is also an eigenvector of \(\varvec{K}_{\mathrm{DMD}}\), hence a DMD mode. Because the elements of the full-state observable are the dictionary elements, \(\varvec{B}=\varvec{I}\) in (16). Then, the Koopman modes are the complex conjugates of the left eigenvectors of \(\varvec{K}\), so \(\varvec{v}_i^T = \varvec{w}_i^*\). Furthermore, \(\varvec{G}^T = \frac{1}{M}\varvec{X}\varvec{X}^*\) and \(\varvec{A}^T = \frac{1}{M}\varvec{Y}\varvec{X}^*\). Then,

$$\begin{aligned} \varvec{K}^T = \varvec{A}^T\left( \varvec{G}^T\right) ^+ = \varvec{Y}\varvec{X}^*\left( \varvec{X}\varvec{X}^*\right) ^+ = \varvec{Y}\varvec{X}^+ = \varvec{K}_{\mathrm{DMD}}, \end{aligned}$$

so the left eigenvectors of \(\varvec{K}\), and hence the Koopman modes, are eigenvectors of \(\varvec{K}_{\mathrm{DMD}}\) with the same eigenvalues. Therefore, *EDMD and DMD are equivalent only for this very specific* \({\mathcal {D}}\), and other choices of \({\mathcal {D}}\) will enable EDMD to generate different (and potentially more useful) results.
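This equivalence is easy to check numerically. The sketch below uses a hypothetical linear map \(\varvec{J}\) to generate real-valued snapshot pairs and verifies that, with the dictionary \(\varvec{\Psi }(\varvec{x}) = \varvec{x}^T\), the EDMD matrix is the transpose of \(\varvec{K}_{\mathrm{DMD}}\), so the two spectra coincide:

```python
import numpy as np

# Hypothetical linear test map J and real-valued snapshot pairs y_m = J x_m
rng = np.random.default_rng(2)
J = np.array([[0.9, 0.2],
              [0.0, 0.5]])
M = 50
X = rng.standard_normal((2, M))
Y = J @ X

K_dmd = Y @ np.linalg.pinv(X)               # DMD: K_DMD = Y X^+

PsiX, PsiY = X.T, Y.T                       # EDMD with Psi(x) = x^T (real data,
G = PsiX.T @ PsiX / M                       # so conjugation is omitted)
A = PsiX.T @ PsiY / M
K = np.linalg.pinv(G) @ A

assert np.allclose(K.T, K_dmd)              # K^T = K_DMD for this dictionary
assert np.allclose(np.sort(np.linalg.eigvals(K)), np.sort(np.linalg.eigvals(K_dmd)))
```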

Conceptually, DMD can be thought of as *producing an approximation of the Koopman eigenfunctions* using the set of linear monomials as basis functions for \({\mathcal {F}}_{{\mathcal {D}}}\), which is analogous to a one-term Taylor expansion. For problems where the eigenfunctions can be approximated accurately using linear monomials (e.g., in some small neighborhood of a stable fixed point), then DMD will produce an *accurate* local approximation of the Koopman eigenfunctions. However, this is certainly not the case for all systems (particularly beyond the region of validity for local linearization). EDMD can be thought of as an extension of DMD that retains additional terms in the expansion, where these additional terms are determined by the elements of \({\mathcal {D}}\). The quality of the resulting approximation is governed by \({\mathcal {F}}_{{\mathcal {D}}}\) and, therefore, depends upon the choice of \({\mathcal {D}}\). In all the examples that follow, the dictionaries we use will be strict supersets of the dictionary chosen implicitly by DMD. Assuming this “richer” dictionary and using the argument presented in Tu et al. (2014), it is easy to show that if DMD is able to exactly recover a Koopman mode and eigenvalue from a given set of data, then EDMD will also exactly recover the same Koopman modes and eigenvalues. However, in many applications the tuples of interest cannot be exactly recovered. In these cases, the idea is that the more extensive dictionary used by EDMD will produce better approximations of the leading Koopman tuples because \({\mathcal {F}}_{{\mathcal {D}}}\) is larger and therefore better able to represent the eigenfunctions of interest.

## 3 The Choice of the Dictionary

Some commonly used dictionaries of functions and the applications where they are, in our experience, best suited:

| Name | Suggested context |
|---|---|
| Hermite polynomials | Problems defined on \(\mathbb {R}^N\) with normally distributed data |
| Radial basis functions | Problems defined on irregular domains |
| Discontinuous spectral elements | Large problems where a block-diagonal \(\varvec{G}\) is beneficial/computationally important |

Choosing \({\mathcal {D}}\) for EDMD is, in some cases, more difficult than selecting a set of basis functions for use in a standard spectral method because the domain on which the underlying dynamical system is defined, \({\mathcal {M}}\), is not necessarily known. Typically, we can define \(\Omega \supset {\mathcal {M}}\) so that it contains all the data in \(\varvec{X}\) and \(\varvec{Y}\); e.g., pick \(\Omega \) to be a “box” in \(\mathbb {R}^N\) that contains every snapshot in \(\varvec{X}\) and \(\varvec{Y}\). Next, we choose the elements of \({\mathcal {D}}\) to be a basis for \(\tilde{{\mathcal {F}}}_{{\mathcal {D}}}\subset \tilde{{\mathcal {F}}}\), where \(\tilde{{\mathcal {F}}}\) is the space of functions that map \(\Omega \rightarrow \mathbb {C}\). Because \({\mathcal {F}}\subset \tilde{{\mathcal {F}}}\), this choice of \({\mathcal {D}}\) can be used in the EDMD procedure, but there is no guarantee that the elements of \({\mathcal {D}}\) form a basis for \({\mathcal {F}}_{{\mathcal {D}}}\), as there may be redundancies. The potential for these redundancies and the numerical issues they generate is why regularization, and hence the pseudoinverse (Hansen 1990), is required in (12). An example of these redundancies and their effects is given in the Appendix.

Although the optimal choice of \({\mathcal {D}}\) is unknown, there are three choices that are broadly applicable in our experience. They are: Hermite polynomials, radial basis functions (RBFs), and discontinuous spectral elements. The *Hermite polynomials* are the simplest of the three sets and are best suited to problems defined on \(\mathbb {R}^N\) if the data in \(\varvec{X}\) are normally distributed. The observables that comprise \({\mathcal {D}}\) are products of the Hermite polynomials in a single dimension (e.g., \(H_1(x)H_2(y)H_0(z)\), where \(H_i\) is the *i*th Hermite polynomial and \(\varvec{x} = (x,y,z)\)). This set of basis functions is simple to implement and conceptually related to approximating the Koopman eigenfunctions with a Taylor expansion. Furthermore, because they are orthogonal with respect to Gaussian weights, \(\varvec{G}\) will be diagonal if the \(\varvec{x}_m\) are drawn from a normal distribution, which can be beneficial numerically.
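A sketch of such a dictionary, using NumPy's probabilists' Hermite polynomials (`numpy.polynomial.hermite_e`); the function name and per-dimension degree cutoff are illustrative choices. The check confirms that \(\varvec{G}\) is approximately diagonal when the snapshots are normally distributed:

```python
import itertools
import numpy as np
from numpy.polynomial import hermite_e as He

def hermite_dictionary(X, max_degree):
    """Each observable is a product of 1-D (probabilists') Hermite polynomials,
    e.g. H_1(x) H_2(y), up to max_degree per dimension. X is N x M."""
    N, M = X.shape
    columns = []
    for deg in itertools.product(range(max_degree + 1), repeat=N):
        col = np.ones(M)
        for d, k in enumerate(deg):
            col *= He.hermeval(X[d], [0.0] * k + [1.0])   # H_k along dimension d
        columns.append(col)
    return np.column_stack(columns)                        # M x N_K

# With normally distributed snapshots, G is approximately diagonal because the
# Hermite polynomials are orthogonal with respect to Gaussian weights.
rng = np.random.default_rng(3)
X = rng.standard_normal((2, 200_000))
PsiX = hermite_dictionary(X, 2)
G = PsiX.T @ PsiX / PsiX.shape[0]
off_diag = np.abs(G - np.diag(np.diag(G)))
assert off_diag.max() < 0.3                 # small compared to the diagonal entries
```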

Another choice is a set of *discontinuous spectral elements*. To use this set, we define a set of \(B_N\) boxes, \(\{\mathcal {B}_i\}_{i=1}^{B_N}\), such that \(\cup _{i=1}^{B_N}\mathcal {B}_i \supset {\mathcal {M}}\). Then, on each of the \(\mathcal {B}_i\), we define \(K_i\) (suitably transformed) Legendre polynomials. For example, in one dimension, each basis function is of the form \(\psi _{ij}(x) = L_j(\xi _i(x))\) for \(x \in \mathcal {B}_i\) and zero otherwise, where \(L_j\) is the *j*th Legendre polynomial and \(\xi _i\) is *x* transformed such that the “edges” of the box are at \(\xi _i = \pm 1\). The advantage of this basis is that \(\varvec{G}\) will be block-diagonal and therefore easy to invert even if a very large number of basis functions are employed.

With a fixed amount of data, an equally difficult task is choosing the \(\mathcal {B}_i\); the number and arrangement of the \(\mathcal {B}_i\) are a balance between the span of the basis functions (i.e., *h*-type convergence), which grows as the number of boxes is increased, and the accuracy of the quadrature rule, which degrades because smaller boxes contain fewer data points. To generate a covering of \({\mathcal {M}}\), we use a method similar to the one used by GAIO (Dellnitz et al. 2001). Initially, all the data (i.e., \(\varvec{X}\) and \(\varvec{Y}\)) are contained within a single user-selected box, \(\mathcal {B}_1^{(0)}\). If this box contains more than a pre-specified number of data points, it is subdivided into \(2^N\) domains of equal Lebesgue measure (e.g., in one dimension, \(\mathcal {B}_1^{(0)} = \mathcal {B}_1^{(1)} \cup \mathcal {B}_2^{(1)}\)). We then proceed recursively: if any of the \(\mathcal {B}_i^{(1)}\) contain more than a pre-specified number of points, they too are subdivided; this continues until no box has an “overly large” number of data points. Any \(\mathcal {B}_i^{(j)}\) that do not contain any data points are pruned, which after *j* iterates leaves the set of subdomains, \(\{\mathcal {B}_i^{(j)}\}\), on which we define the Legendre polynomials. The resulting set of functions is compactly supported and can be evaluated efficiently using \(2^N\)-trees, where *N* is the dimension of a snapshot. Finally, the higher-order polynomials used here allow for more rapid *p*-type convergence if the eigenfunctions happen to be smooth.
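The recursive construction above can be sketched as follows; `subdivide` and its threshold `max_pts` (the “pre-specified number of data points”) are hypothetical names.

```python
import numpy as np

def subdivide(points, lo, hi, max_pts=100):
    """Recursively bisect the box [lo, hi] (per coordinate) until no
    leaf holds more than `max_pts` of `points`; empty boxes are pruned.
    Returns a list of (lo, hi) pairs covering the data, in the spirit
    of the GAIO-style construction described above."""
    if len(points) == 0:
        return []                        # prune empty boxes
    if len(points) <= max_pts:
        return [(lo, hi)]
    boxes = []
    mid = 0.5 * (lo + hi)
    # Split into 2^N children of equal Lebesgue measure.
    for corner in range(2 ** len(lo)):
        bits = np.array([(corner >> d) & 1 for d in range(len(lo))])
        c_lo = np.where(bits, mid, lo)
        c_hi = np.where(bits, hi, mid)
        mask = np.all((points >= c_lo) & (points < c_hi), axis=1)
        boxes += subdivide(points[mask], c_lo, c_hi, max_pts)
    return boxes
```

Each returned box can then carry its own set of transformed Legendre polynomials; because the boxes are disjoint, basis functions on different boxes never overlap, which is what makes \(\varvec{G}\) block-diagonal.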

The final choice is a set of *radial basis functions* (RBFs), which appeals to previous work on “mesh-free” methods (Liu 2010). Because these methods do not require a computational grid or mesh, they are particularly effective for problems where \({\mathcal {M}}\) has what might be called a complex geometry. Many different RBFs could be effective, but one particularly useful choice is the thin plate spline (Wendland 1999; Belytschko et al. 1996), which does not require the scaling parameter that other RBFs (e.g., Gaussians) do. However, we still must choose the “centers” about which the RBFs are defined, which we do via *k*-means clustering (Bishop 2006) with a pre-specified value of *k* on the combined data set. Although we make no claims of optimality, in our examples, the density of the RBF centers appears to be directly related to the density of data points, which is, intuitively, a reasonable method for distributing the RBF centers, as regions with more samples will also have more spatial resolution.
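A minimal sketch of this dictionary, with a plain Lloyd's iteration standing in for a library *k*-means implementation; all names are illustrative.

```python
import numpy as np

def kmeans_centers(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm to pick k RBF centers from the data."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), k, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(points[:, None, :] - centers[None], axis=2)
        label = d.argmin(axis=1)
        for j in range(k):
            if np.any(label == j):
                centers[j] = points[label == j].mean(axis=0)
    return centers

def thin_plate_dictionary(points, centers):
    """Thin plate spline RBFs, psi_j(x) = r^2 log(r) with r = ||x - c_j||;
    no scaling parameter is needed, matching the discussion above."""
    r = np.linalg.norm(points[:, None, :] - centers[None], axis=2)
    with np.errstate(divide="ignore", invalid="ignore"):
        vals = r ** 2 * np.log(r)
    return np.nan_to_num(vals)  # r = 0 -> limit value 0
```

Because \(r^2 \log r \rightarrow 0\) as \(r \rightarrow 0\), the value at a center itself is well defined, which is handled by the `nan_to_num` call.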

There are, of course, other dictionaries that may prove more effective in other circumstances. For example, basis functions defined in polar coordinates are useful when limit cycles or other periodic orbits are present as they mimic the form of the Koopman eigenfunctions for simple limit cycles (Bagheri 2013). How to choose the best set of functions is an important, yet open, question; fortunately, the EDMD method often produces useful results even with the relatively naïve choices presented in this section and summarized in Table 1.

## 4 Deterministic Data and the Koopman Eigenfunctions

Most applications of DMD assume that the data sets were generated by a deterministic dynamical system. In Sect. 2, we showed that, as long as the dictionary can accurately approximate the leading Koopman eigenfunctions, EDMD produces an approximation of the Koopman eigenfunctions, eigenvalues, and modes with large amounts of data (the regime in which EDMD is equivalent to a Galerkin method). In this section, we demonstrate that EDMD can produce accurate approximations of the Koopman eigenfunctions, eigenvalues, and modes with relatively limited amounts of data by applying the method to two illustrative examples. The first is a discrete-time linear system, for which the eigenfunctions, eigenvalues, and modes are known analytically, and serves as a test case for the method. The second is the unforced Duffing equation. Our goal there is to demonstrate that the approximate Koopman eigenfunctions obtained via EDMD have the potential to serve as a data-driven parameterization of a system with multiple basins of attraction.

### 4.1 A Linear Example

#### 4.1.1 The Governing Equation, Data, and Analytically Obtained Eigenfunctions

For a linear system, the tuples of Koopman eigenfunctions, eigenvalues, and modes can be obtained analytically from *the eigendecomposition* of \(\varvec{J}\). Assume \(\varvec{J}\) has a full set of *N* left eigenvectors, \(\varvec{w}_i\), that satisfy \(\varvec{w}_i^*\varvec{J} = \mu _i\varvec{w}_i^*\), and where the *i*th eigenvector is associated with the eigenvalue \(\mu _i\). Then, the function \(\varphi _i(\varvec{x}) = \varvec{w}_i^*\varvec{x}\) is a Koopman eigenfunction with eigenvalue \(\mu _i\), and the *i*th Koopman mode, \(\varvec{v}_i\), is the *i*th eigenvector of \(\varvec{J}\) suitably scaled so that \(\varvec{w}_i^*\varvec{v}_i = 1\). This is identical to writing \(\varvec{x}\) in terms of the eigenvectors of \(\varvec{J}\); inner products with the left eigenvectors determine the component in each direction, and the (right) eigenvectors allow the full state to be reconstructed. As a concrete example, consider

\(\varvec{x}(n+1) = \varvec{J}\varvec{x}(n), \qquad \varvec{J} = \begin{bmatrix} 0.9 &{} -0.1 \\ 0 &{} 0.8 \end{bmatrix}.\)

For the dictionary, we choose the set of products of Hermite polynomials in *x* and in *y* that include up to the fourth-order terms in *x* and *y*, i.e., \(\psi _i(x,y) = H_{i \bmod 5}(x)\,H_{\lfloor i/5 \rfloor }(y)\), so that \(\psi _i\) is the product of a Hermite polynomial in *x* of degree \((i \bmod 5)\) and a Hermite polynomial in *y* of degree \(\left\lfloor \frac{i}{5}\right\rfloor \). The Hermite polynomials were chosen because they are orthogonal with respect to the weight function \(\rho (\varvec{x}) = e^{-\Vert \varvec{x}\Vert ^2}\) that is implicit in the normally distributed sampling strategy used here and can also represent the leading Koopman eigenfunctions in this problem.
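The relations between left eigenvectors, eigenfunctions, and modes can be verified numerically. The following sketch uses a 2×2 matrix whose eigenvalues, 0.9 and 0.8, are an illustrative choice (0.9 is consistent with the eigenvalue \(\mu = 0.9^5\) discussed below); the matrix itself is an assumption.

```python
import numpy as np

# An illustrative stable matrix; eigenvalues 0.9 and 0.8.
J = np.array([[0.9, -0.1],
              [0.0,  0.8]])

# Left eigenvectors w_i satisfy w_i^* J = mu_i w_i^*, i.e., J^T w_i = mu_i w_i.
mu, W = np.linalg.eig(J.T)
lam, V = np.linalg.eig(J)

# Align both eigendecompositions by sorting eigenvalues (descending).
order_w = np.argsort(-mu.real)
order_v = np.argsort(-lam.real)
mu, W, V = mu[order_w].real, W[:, order_w], V[:, order_v]

# Scale the right eigenvectors (the Koopman modes) so that w_i^* v_i = 1.
for i in range(2):
    V[:, i] = V[:, i] / np.vdot(W[:, i], V[:, i])

x = np.array([0.3, -1.2])
for i in range(2):
    # phi_i(x) = w_i^* x is a Koopman eigenfunction: phi_i(J x) = mu_i phi_i(x).
    assert np.isclose(np.vdot(W[:, i], J @ x), mu[i] * np.vdot(W[:, i], x))

# Reconstruct the full-state observable from eigenfunctions and modes.
assert np.allclose(sum(np.vdot(W[:, i], x) * V[:, i] for i in range(2)), x)
```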

#### 4.1.2 Results

Applied to this system, standard DMD recovers the Koopman eigenfunctions spanned by linear observables but *will not produce any of the other eigenfunctions*; the standard choice of the dictionary contains only linear terms and, therefore, cannot reproduce eigenfunctions with constant terms or any nonlinear terms. As a result, expanding the basis allows EDMD to capture more of the Koopman eigenfunctions than standard DMD could. These additional eigenfunctions are not necessary to reconstruct the full-state observable of an LTI system, but are in principle needed in nonlinear settings.

When the dictionary is incomplete, EDMD can produce a *missing or erroneous* eigenfunction like the examples shown in Fig. 4. The eigenfunction \(\varphi _{50}=\left( \frac{x-y}{\sqrt{2}}\right) ^5\) with \(\mu = 0.9^5 = 0.59049\) is not captured by EDMD *with the dictionary chosen here* because that dictionary lacks the needed fifth-order monomials in *x* and *y*, which is similar to how DMD misses the second Koopman eigenfunction due to a lack of quadratic terms.

The erroneous eigenfunction appears because \({\mathcal {F}}_{{\mathcal {D}}}\) is not invariant with respect to the action of the Koopman operator. In particular, \(\varphi _{50}\) contains the term \(yx^4\) whose image \(\mathcal {K}(yx^4)\not \in {\mathcal {F}}_{{\mathcal {D}}}\) because \(x^5,y^5\not \in {\mathcal {F}}_{{\mathcal {D}}}\). In most applications, there are small components of the eigenfunction that cannot be represented in the dictionary chosen, which results in errors in the eigenfunction such as the one seen here. Even in the limit of infinite data, we would compute the eigenfunctions of \(\mathcal {P}_{{\mathcal {F}}_{\mathcal {D}}}\mathcal {K}\), where \(\mathcal {P}_{{\mathcal {F}}_{\mathcal {D}}}\) is the projection onto \({\mathcal {F}}_{{\mathcal {D}}}\), rather than the eigenfunctions of \(\mathcal {K}\). To confirm that this is not a legitimate eigenfunction, we added \(H_5(x)\) and \(H_5(y)\) to \({\mathcal {D}}\), which removes this erroneous eigenfunction.

In our experience, erroneous eigenfunctions tend to appear in one of two places. Sometimes they are associated with one of the slowly decaying eigenvalues and, in that case, often have low mode energies (i.e., the value of the corresponding Koopman eigenfunction is small when the corresponding Koopman mode is normalized). Otherwise, they typically form a “cloud” of rapidly decaying eigenvalues, which are ignored as we are primarily interested in the leading Koopman eigenvalues. One pragmatic method for determining whether an eigenvalue is spurious is to apply the EDMD procedure to subsets of the data and compare the resulting eigenvalues; while all of the eigenvalues will be perturbed, the erroneous ones tend to have larger fluctuations. Unfortunately, missing eigenvalues are more difficult to detect without prior knowledge, but they tend to live in the same part of the complex plane that the erroneous eigenvalues do, as both are caused by a lack of dictionary elements. In practice, we take the cloud of erroneous eigenvalues as the cutoff point below which the EDMD method cannot be trusted to produce accurate results.
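The subset-based consistency check described above can be sketched as follows; the helper name, the half-sized subsets, and the nearest-eigenvalue matching are implementation assumptions.

```python
import numpy as np

def eigenvalue_spread(psi_x, psi_y, n_splits=5, seed=0):
    """Run EDMD on random halves of the snapshot pairs and report, for
    each eigenvalue of the full-data operator, the average distance to
    the closest eigenvalue across the subsets.  Large spreads flag
    likely spurious eigenvalues (cf. the heuristic described above)."""
    rng = np.random.default_rng(seed)
    m = psi_x.shape[0]

    def koopman_eigs(px, py):
        g = px.conj().T @ px / len(px)
        a = px.conj().T @ py / len(px)
        return np.linalg.eigvals(np.linalg.pinv(g) @ a)

    full = koopman_eigs(psi_x, psi_y)
    spread = np.zeros(len(full))
    for _ in range(n_splits):
        idx = rng.choice(m, m // 2, replace=False)
        sub = koopman_eigs(psi_x[idx], psi_y[idx])
        # distance from each full-data eigenvalue to its nearest subset one
        spread += np.abs(full[:, None] - sub[None, :]).min(axis=1)
    return full, spread / n_splits
```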

For this linear system, the contribution of the remaining *numerically computed* eigenfunctions in reconstructing the full-state observable is negligible (i.e., \(\Vert \varvec{v}_k\Vert \approx 10^{-11}\) for \(k\ne 1,3\)), so the Koopman/EDMD analysis reduces to an eigenvalue/eigenvector decomposition once numerical errors are taken into consideration.

Although EDMD reveals a richer set of Koopman eigenfunctions that are analytically known to exist, their associated Koopman modes are zero, and hence, they can be neglected. Our goal in presenting this example is not to demonstrate any new phenomenon, but rather to demonstrate that there is good quantitative agreement between the analytically obtained Koopman modes, eigenvalues, and eigenfunctions and the approximations produced by EDMD. Furthermore, it allowed us to highlight the types of errors that appear when \({\mathcal {F}}_{{\mathcal {D}}}\) is not an invariant subspace of \(\mathcal {K}\), which results in erroneous eigenfunctions, or when the dictionary is missing elements, which results in missing eigenfunctions.

### 4.2 The Duffing Equation

In this section, we will compute the Koopman eigenfunctions for the unforced Duffing equation, which, for the parameter regime of interest here, has two stable spirals and a saddle point whose stable manifold defines the boundary between the basins of attraction. Following Mauroy et al. (2013) and the references contained therein, the eigenvalues of the linearizations about the fixed points in the system are known to be a subset of the Koopman eigenvalues, and for each stable spiral, the magnitude and phase of the associated Koopman eigenfunction parameterize the relevant basin of attraction. Additionally, because basins of attraction are forward-invariant sets, there will be two eigenfunctions with \(\lambda = 0\), each of which is supported on one of the two basins of attraction in this system (or, equivalently, there will be a trivial eigenfunction and another eigenfunction with \(\lambda = 0\) whose level sets denote the basins of attraction). Ultimately, we are not interested in recovering highly accurate eigenfunctions in this example. Instead, we will demonstrate that the eigenfunctions computed by EDMD are *accurate enough* that they can be used to identify and parameterize the basins of attraction that are present in this problem for the region of interest.

Consider the components of the full-state observable, \(g_i(\varvec{x}) = \varvec{e}_i^*\varvec{x}\), where \(\varvec{e}_i\) is the *i*th unit vector \(\varvec{e}_i\). For any initial condition \(\varvec{x}(0)\), the values of these observables will relax to either 0 or \(\pm 1\) depending on *i* and the initial condition. This relaxation implies that the eigenvalues used in the expansion in (6) should not have positive real parts; otherwise, exponential growth would be observed.

This is shown in Gaspard and Tasaki (2001) and Gaspard et al. (1995) for the pitchfork and Hopf bifurcations, where the eigenfunctions and *eigendistributions* can be obtained analytically. Unlike those examples, we do not have analytical expressions for the eigenfunctions (or eigendistributions) here, but we will appeal to the same intuition. In particular, we expect EDMD to approximate the Koopman eigenfunctions associated with the attractors at \((\pm 1, 0)\) because those quantities, at least away from the saddle, can lie in or near the subspace spanned by our dictionary. There will also be eigenvalues associated with the unstable equilibrium at the origin, but they are paired with eigendistributions [e.g., Dirac delta distributions and their derivatives (Gaspard and Tasaki 2001; Gaspard et al. 1995)], which are not in the span of any dictionary we will choose and therefore could not be recovered even if we had a large quantity of data near that equilibrium.

Therefore, we use a set of data that consists of \(10^3\) trajectories with 11 samples each, taken with a sampling interval of \(\Delta t = 0.25\) (i.e., \(\varvec{X},\varvec{Y}\in \mathbb {R}^{2\times 10^4}\)), and initial conditions uniformly distributed over \(x,\dot{x}\in [-2, 2]\). With this sampling rate and initialization scheme, many trajectories will *approach* the stable spirals, but few will have (to numerical precision) reached the fixed points. As a result, the basins of attraction cannot be determined simply by observing the last snapshot in a given trajectory. Instead, EDMD will be used to “stitch” together this ensemble of trajectories to form a single coherent picture.
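A sketch of generating such an ensemble, assuming the unforced Duffing form \(\ddot{x} = -\delta \dot{x} + x - x^3\); the damping value \(\delta = 0.5\) and the integrator settings are assumptions, chosen so that \((\pm 1, 0)\) are stable spirals.

```python
import numpy as np

def duffing_rhs(state, delta=0.5):
    """Unforced Duffing oscillator x'' = -delta*x' + x - x^3
    (delta = 0.5 is an assumed damping value)."""
    x, v = state
    return np.array([v, -delta * v + x - x ** 3])

def rk4_step(state, dt, n_sub=25):
    """Classical RK4 with n_sub substeps per sampling interval."""
    h = dt / n_sub
    for _ in range(n_sub):
        k1 = duffing_rhs(state)
        k2 = duffing_rhs(state + 0.5 * h * k1)
        k3 = duffing_rhs(state + 0.5 * h * k2)
        k4 = duffing_rhs(state + h * k3)
        state = state + (h / 6) * (k1 + 2 * k2 + 2 * k3 + k4)
    return state

def snapshot_pairs(n_traj=1000, n_samp=11, dt=0.25, seed=0):
    """Build X, Y from n_traj trajectories of n_samp samples each, with
    initial conditions uniform on [-2, 2]^2, as described above."""
    rng = np.random.default_rng(seed)
    xs, ys = [], []
    for _ in range(n_traj):
        s = rng.uniform(-2, 2, size=2)
        for _ in range(n_samp - 1):
            s_next = rk4_step(s, dt)
            xs.append(s)
            ys.append(s_next)
            s = s_next
    return np.array(xs).T, np.array(ys).T  # each 2 x (n_traj*(n_samp-1))
```

With the text's values (\(10^3\) trajectories, 11 samples), each trajectory contributes 10 snapshot pairs, giving the stated \(2\times 10^4\) data matrices.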

However, because there are multiple basins of attraction, the leading eigenfunctions appear to be discontinuous (Mauroy et al. 2013) and supported only on the appropriate basin of attraction. In principle, the computation could be done “all at once” using a single \({\mathcal {D}}\) and applying EDMD to the complete data set, but to enforce the compactly supported nature of the eigenfunctions regardless of which dictionary we use, we will proceed in a two-tiered fashion. First, the basins of attraction will be identified using all of the data and a dictionary with support everywhere we have data. Once we have identified these basins, both the state space and the data will be partitioned into subdomains based on the numerically identified basins. The EDMD procedure will then be run on each subdomain and the corresponding partitioned data set individually.

#### 4.2.1 Locating Basins of Attraction

To locate the basins, the dictionary consisted of thin plate spline RBFs, and *k*-means clustering (Bishop 2006) on the full data set was used to choose the RBF centers. RBFs were chosen here because of the geometry of the computational domain; indeed, RBFs are often a fundamental component of “mesh-free” methods that avoid the non-trivial task of generating a computational mesh (Liu 2010).

The leading (continuous-time) eigenvalue is \(\lambda _0 \approx -10^{-14}\), which corresponds to the constant function. The second eigenfunction, shown in the leftmost image of Fig. 5, has \(\lambda _1 \approx -10^{-3}\), which should be considered an approximation of zero. The discrepancy between the numerically computed eigenfunction and the theoretical one is due to the choice of the dictionary. The analytical eigenfunction possesses a discontinuity on the edge of the basin of attraction (i.e., the stable manifold of the saddle point at the origin), but discontinuous functions are not in the space spanned by RBFs. Therefore, the numerically computed approximation “blurs” this edge as shown in Fig. 5.

The scatter plot in the center of Fig. 5 shows the data points colored by the first non-trivial eigenfunction. There is good qualitative agreement between the numerically computed basin of attraction and the actual basin. By computing the mean value of \(\varphi _1\) on the data and using that value as the threshold that determines which basin of attraction a point belongs to, the EDMD approach misclassifies only 46 of the \(10^4\) data points, an error of roughly 0.5 % as shown by the rightmost plot. As a result, the leading eigenfunctions computed by EDMD are sufficiently accurate to produce a meaningful partition of the data.

#### 4.2.2 Parameterizing a Basin of Attraction

Within each of the identified basins, the *k*-means procedure was run again to select a new set of 1000 RBF centers, and this “adjusted” basis along with the constant function comprised the \({\mathcal {D}}\) used by EDMD. Figure 6 shows the amplitude and phase of the eigenfunction with eigenvalue closest to \(-0.25 + 1.387\imath \) computed using the data in each basin of attraction. The computed eigenvalues agree favorably with the analytically obtained eigenvalue; the basin of the spiral at (1, 0) has the eigenvalue \(-0.237 + 1.387\imath \), and the basin of the spiral at \((-1, 0)\) has the eigenvalue \(-0.24 + 1.35\imath \).

Figure 6 demonstrates that the amplitude and phase of a Koopman eigenfunction form something analogous to an “action–angle” parameterization of the basin of attraction. Due to the nonlinearity in the Duffing oscillator, this parameterization is more complicated than an appropriately shifted polar coordinate system and is, therefore, not the parameterization that would be generated by linearization about either \((\pm 1, 0)\). The level sets of the amplitude of this eigenfunction are the so-called isostables (Mauroy et al. 2013). One feature predicted in that manuscript is that the zero-level set of the isostables is the fixed point in the basin of attraction; this feature is reflected in Fig. 6 by the blue region, which corresponds to small values of the eigenfunction near the fixed points at \((\pm 1, 0)\). Additionally, a singularity in the phase can be observed there. The EDMD approach produces noticeable numerical errors near the edges of the basin. These errors can be due to a lack of data or due to the singularities in the eigenfunctions that can occur at unstable fixed points (Mauroy and Mezic 2013).

In this section, we applied the EDMD procedure to deterministic systems and showed that it produces an approximation of the Koopman operator. With a sensible choice of data and \({\mathcal {D}}\), we showed that EDMD generates a quantitatively accurate approximation of the Koopman eigenvalues, eigenfunctions, and modes for the linear example. In the second example, we used the Koopman eigenfunctions to identify and parameterize the basins of attraction of the Duffing equation. Although the EDMD approximation of the eigenfunctions could be made more accurate with more data, it is still accurate enough to serve as an effective parameterization. As a result, the EDMD method can be useful with limited quantities of data and should be considered an enabling technology for data-driven approximations of the Koopman eigenvalues, eigenfunctions, and modes.

## 5 Stochastic Data and the Kolmogorov Backward Equation

The EDMD approach is entirely data driven and will produce an output regardless of the nature of the data given to it. However, if the results of EDMD are to be meaningful, then certain assumptions must be made about the dynamical system that produced the data used. In the previous section, it was assumed that the data were generated by a deterministic dynamical system; as a result, EDMD produced approximations of the tuples of Koopman eigenfunctions, eigenvalues, and modes.

Another interesting case to consider is when the underlying dynamical system is a Markov process, such as a stochastic differential equation (SDE). For such systems, the evolution of an observable is governed by the Kolmogorov backward (KB) equation [33], whose “right-hand side” has been called the “stochastic Koopman operator” (SKO) (Mezić 2005). In this section, we will show that EDMD produces approximations of the eigenfunctions, eigenvalues, and modes of the SKO if the underlying dynamical system happens to be a Markov process.

To accomplish this, we will prove that the EDMD method converges to a Galerkin method in the large-data limit. After that, we will demonstrate its accuracy with finite amounts of data by applying it to the model problem of a one-dimensional SDE with a double-well potential, where the SKO eigenfunctions can be computed using standard numerical methods.

Another proposed application of the Koopman operator is for the purposes of *model reduction*, which was first explored in Mezić (2005) and later work such as Froyland et al. (2014). Model reduction based on the Koopman eigenfunctions is equally applicable in both deterministic and stochastic settings, but we choose to present it for stochastic systems to highlight the similarities between EDMD and *manifold learning techniques* such as diffusion maps (Nadler et al. 2005; Coifman and Lafon 2006). In particular, we apply EDMD to an SDE defined on a “Swiss roll,” which is a nonlinear manifold often used to test manifold learning methods (Lee and Verleysen 2007). The purpose of this example is twofold: First, we show that a data-driven parameterization of the Swiss roll can be obtained using EDMD, and second, we show that this parameterization will preferentially capture “slow” dynamics on that manifold before the “fast” dynamics when the noise is made anisotropic.

### 5.1 EDMD with Stochastic Data

Let the dynamics now be given by a *discrete-time Markov process*, \(\varvec{x} \mapsto \varvec{F}(\varvec{x}; \varvec{\omega })\), and define the SKO by \((\mathcal {K}\psi )(\varvec{x}) = {\mathbb {E}}\left[ \psi (\varvec{F}(\varvec{x}; \varvec{\omega }))\right] \), where \(\varvec{\omega }\) is an element of the probability space associated with the stochastic dynamics with measure *P*, \({\mathbb {E}}\) denotes the expected value over that space, and \(\psi \in {\mathcal {F}}\) is a scalar observable. The SKO (Mezić 2005) takes an observable of the full system state and returns the conditional expectation of the observable “one timestep in the future.” Note that this definition is compatible with the deterministic Koopman operator because \({\mathbb {E}}[\psi (\varvec{F}(\varvec{x}))]=\psi (\varvec{F}(\varvec{x}))\) if \(\varvec{F}\) is deterministic.

The entries of \(\varvec{G}\) and \(\varvec{A}\) are assembled from the snapshot pairs exactly as before and converge to integrals against \(\rho \) when *M* is sufficiently large. Once again, \(\rho \) does not need to be an invariant measure of the underlying dynamical system; it is simply the sampling density of the data. Due to the stochastic nature of the system, there are two probability spaces involved: one related to the samples in \(\varvec{X}\) and another for the stochastic dynamics. Because our system has “process” rather than “measurement” noise, the \(\varvec{x}_i\) are known exactly, and the interpretation of the Gram matrix, \(\varvec{G}\), remains unchanged. Therefore, \(\varvec{G}_{ij} = \frac{1}{M}\sum _{m=1}^{M}\psi _i^*(\varvec{x}_m)\psi _j(\varvec{x}_m) \rightarrow \int _{{\mathcal {M}}}\psi _i^*(\varvec{x})\psi _j(\varvec{x})\,d\rho (\varvec{x})\) when *M* is large enough. This is identical to the deterministic case. However, the definition of \(\varvec{A}\) will change. Assuming that the choices of \(\varvec{\omega }\) and \(\varvec{x}\) are independent, \(\varvec{A}_{ij} = \frac{1}{M}\sum _{m=1}^{M}\psi _i^*(\varvec{x}_m)\psi _j(\varvec{y}_m) \rightarrow \int _{{\mathcal {M}}}\psi _i^*(\varvec{x})\,{\mathbb {E}}\left[ \psi _j(\varvec{F}(\varvec{x};\varvec{\omega }))\right] \,d\rho (\varvec{x})\), with the *second* integral, taken over the probability space that pertains to the stochastic dynamics, producing the expectation of the observable in the expression above.

The accuracy of the resulting method will depend on the dictionary, \({\mathcal {D}}\), the manifold on which the dynamical system is defined, the data, and the dynamics used to generate it. One interesting special case is if the basis functions are indicator functions supported on “boxes.” When this is the case, EDMD is equivalent to the widely used Ulam Galerkin method (Bollt and Santitissadeekorn 2013; Dellnitz et al. 2001). This equivalence is lost for other choices of \({\mathcal {D}}\) and \(\rho \), but as we will demonstrate in the subsequent sections, EDMD can produce accurate approximations of the eigenfunctions for many other choices of these quantities.
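The equivalence with the Ulam Galerkin method can be seen directly in one dimension: with indicator functions on bins, \(\varvec{G}\) counts bin occupancy and \(\varvec{G}^{+}\varvec{A}\) is the row-stochastic transition matrix. A sketch (the bin edges and helper name are illustrative):

```python
import numpy as np

def ulam_via_edmd(x, y, edges):
    """EDMD with indicator functions on the 1-D bins given by `edges`.
    With this dictionary, G is diagonal (bin occupancy) and G^+ A is
    the row-stochastic transition matrix of the Ulam Galerkin method."""
    k = len(edges) - 1

    def indicators(pts):
        idx = np.clip(np.searchsorted(edges, pts, side="right") - 1, 0, k - 1)
        return np.eye(k)[idx]            # one-hot rows

    px, py = indicators(x), indicators(y)
    m = len(x)
    g = px.T @ px / m                    # diagonal: fraction of points per bin
    a = px.T @ py / m                    # co-occurrence of source/target bins
    return np.linalg.pinv(g) @ a
```

For a deterministic half-period shift on \([0, 1)\), for instance, the returned matrix is exactly the permutation that moves each bin by half the domain.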

The “stochastic Koopman modes” can then be computed using (20), but they too must be reinterpreted as the weights needed to reconstruct the *expected value* of the full-state observable using the eigenfunctions of the SKO. Due to the stochastic nature of the dynamics, the Koopman modes can no longer exactly specify the state of the system. However, they can be used as approximations of the Koopman modes that would be obtained in the “noise-free” limit when some appropriate restrictions are placed on the nature of the noise and the underlying dynamical system. Indeed, these are the modes we are truly computing when we apply DMD or EDMD to experimental data, which by its very nature contains some noise.

### 5.2 A Stochastic Differential Equation with a Double-Well Potential

In this section, we will show that the EDMD procedure is capable of accurately approximating the eigenfunctions of the stochastic Koopman operator by applying it to an SDE with a double-well potential. Although we do not have analytical solutions for the eigenfunctions, the problem is simple enough that we can accurately compute them using standard numerical methods.

#### 5.2.1 The Double-Well Problem and Data

The governing equation is the SDE \(dx = -\nabla U(x)\,dt + \sigma \,dW_t\), where *x* is the state, \(-\nabla U(x)\) is the drift, \(\sigma \) is the (constant) diffusion coefficient, and \(W_t\) is a Wiener process. Furthermore, no-flux boundary conditions are imposed at \(x = \pm 1\). For this problem, we let \(U(x) = -2 (x^2 - 1)^2x^2\) as shown in Fig. 7.

The data are \(10^6\) initial points on \(x\in [-1, 1]\) drawn from a uniform distribution, which constitute \(\varvec{X}\), and their positions after \(\Delta t = 0.1\), which constitute \(\varvec{Y}\). The evolution of each initial condition was accomplished through \(10^2\) steps of the Euler–Maruyama method (Higham 2001; Kloeden and Platen 1992) with a timestep of \(10^{-3}\) using the double-well potential in Fig. 7. Note that only the initial and final points in this trajectory were retained, so we have access to \(10^6\) and not \(10^8\) snapshot pairs. The dictionary chosen is a discontinuous spectral element basis that splits \(x\in [-1,1]\) into four equally sized subdomains with up to tenth-order Legendre polynomials on each subdomain (see Sect. 3) for a total of forty degrees of freedom.
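The snapshot-pair generation can be sketched as follows. The drift \(-U'(x) = 4x(x^2-1)(3x^2-1)\) follows from the stated potential; reflecting excursions past \(x = \pm 1\) is one way (an implementation choice) to impose the no-flux boundary condition.

```python
import numpy as np

def drift(x):
    """-U'(x) for U(x) = -2*(x^2 - 1)^2 * x^2 (the potential in Fig. 7)."""
    return 4.0 * x * (x ** 2 - 1.0) * (3.0 * x ** 2 - 1.0)

def em_snapshots(n=10_000, sigma=1.0, dt=1e-3, n_steps=100, seed=0):
    """Euler-Maruyama snapshot pairs for dx = -U'(x) dt + sigma dW_t on
    [-1, 1]; only the initial and final states are retained, and the
    no-flux boundary is imposed by reflection."""
    rng = np.random.default_rng(seed)
    x0 = rng.uniform(-1.0, 1.0, n)
    x = x0.copy()
    for _ in range(n_steps):
        x = x + drift(x) * dt + sigma * np.sqrt(dt) * rng.normal(size=n)
        # reflect at the boundaries x = +-1 (no-flux condition)
        x = np.where(x > 1.0, 2.0 - x, x)
        x = np.where(x < -1.0, -2.0 - x, x)
    return x0, x
```

As in the text, each pair is produced by \(10^2\) Euler–Maruyama steps of size \(10^{-3}\), so only the states separated by \(\Delta t = 0.1\) are kept.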

#### 5.2.2 Recovering the Koopman Eigenfunctions and Eigenvalues

Because the Koopman operator is infinite dimensional, we will clearly be unable to approximate all of the tuples. Instead, we focus on the leading (i.e., most slowly decaying) tuples, which govern the long-term dynamics of the underlying system. In this example, we seek to demonstrate that our approximation is: (a) quantitatively accurate and (b) valid over a range of coefficients, \(\sigma \), and not solely in the small (or large) noise limits.

Of particular interest is the leading *non-trivial* eigenfunction, which has the eigenvalue \(\lambda _1\). The change in \(\varphi _1\) as a function of \(\sigma \) is shown in Fig. 9. As with the eigenvalue, there is good agreement between EDMD and the directly computed eigenfunctions at different values of \(\sigma \). For all values of \(\sigma \), \(\varphi _1\) is an odd function; what changes is how rapidly \(\varphi _1\) transitions from its maximum to its minimum. When \(\sigma \) is small, this transition is rapid, and \(\varphi _1\) will approach a step function as \(\sigma \rightarrow 0\). When \(\sigma \) grows, this eigenfunction is “smoothed out” and the transition becomes slower. In the limit as \(\sigma \rightarrow \infty \), the dynamics of the problem are dominated by the diffusion term, and \(\varphi _1\) will be proportional to \(\sin (\pi x/2)\), the leading odd eigenfunction of the Laplacian on \([-1,1]\) with no-flux boundary conditions, as is implied by the rightmost plot in the figure.

In many system-identification algorithms [e.g., Juang (1994)], one often constructs deterministic governing equations from inherently stochastic data (either due to measurement or due to process noise). Similarly, methods like DMD have been applied to noisy sets of data to produce an approximation of the Koopman modes and eigenvalues with the assumption that the underlying system is deterministic. In this example, this is equivalent to using the output of EDMD with data taken with \(0<\sigma \ll 1\) as an approximation of the Koopman tuples that would be obtained with \(\sigma = 0\).

For certain tuples, this is a reasonable approach. As \(\sigma \rightarrow 0\), the eigenvalues \(\lambda _3\) and \(\lambda _4\) and the eigenfunctions \(\varphi _3\) and \(\varphi _4\) are good approximations of their deterministic counterparts. In particular, \(\varphi _3\) and \(\varphi _4\) are one-to-one with their associated basins of attraction and appear to possess a zero at the stable fixed point. However, these approximate eigenfunctions lack some important features, such as the singularity at \(x =0\) that occurs due to the unstable fixed point there. Therefore, both eigenfunctions are good approximations of their \(\sigma = 0\) counterparts, but cannot be “trusted” in the vicinity of an unstable fixed point.

For other tuples, even a small amount of noise can be important. Consider the “slowest” nonzero eigenvalue, \(\lambda _2\), which appears to approach \(-4\) as \(\sigma \rightarrow 0\), but is *not* obtained by the EDMD method when \(\sigma = 0\). Formally, the existence of an eigenvalue of \(-4\) is not surprising. The fixed point at \(x=0\) is unstable with \(\lambda = 4\), and in continuous time, if (\(\lambda _n\), \(\varphi _n\)) is an eigenvalue/eigenfunction pair, then (\(k\lambda _n\), \(\varphi ^k_n\)) is, at least formally, an eigenvalue/eigenfunction pair for any scalar *k*. Using an argument similar to Matkowsky and Schuss (1981), it can be shown that \(\varphi _2(x) = C_0\exp (-4x^2/\sigma ^2) + \mathcal {O}(\sigma ^2)\) as \(\sigma \rightarrow 0\), where \(C_0\) is chosen to normalize \(\varphi _2\). However, this approaches a delta function as \(\sigma \rightarrow 0\) and therefore leaves the subspace of observables spanned by our dictionary. When this occurs, the tuple appears to “vanish,” which is why it does not appear in the \(\sigma = 0\) limit. As a result, when applying methods like EDMD or DMD to noisy data, the spectrum of the finite-dimensional approximation is not necessarily a good approximation of the spectrum that would be obtained with noise-free data. Some of the tuples, such as those containing \(\varphi _1\), \(\varphi _3\), and \(\varphi _4\), have eigenvalues that closely approximate the ones found in the deterministic problem. However, others, such as the tuple containing \(\varphi _2\), do not. Furthermore, the only way to determine that \(\lambda _2\) can be neglected is to examine the eigenfunction directly. As a result, when we apply methods like DMD/EDMD to noisy data with the purpose of using the spectrum to determine the timescales and behaviors of the underlying system, we must keep in mind that not all of the eigenvalues obtained with noisy data will be present if “clean” data are used instead.

#### 5.2.3 Rate of Convergence

Among other things, the performance of the EDMD method is dependent upon the number of snapshots provided to it, the distribution of the data, the underlying dynamical system, and the dictionary. In this section, we examine the convergence of EDMD to a Galerkin method as the number of snapshots increases in order to provide some intuition about the “usefulness” of the eigenfunctions obtained without an exhaustive amount of data. To do so, we generated a larger set of data consisting of \(10^7\) initial conditions chosen from a spatially uniform distribution for the case with \(\sigma = 1\). Each initial condition was propagated using the Euler–Maruyama method described in the previous section, but only the initial and terminal states were retained. Then, we applied EDMD using the same dictionary to subsets of the data, computed the leading nontrivial eigenvalue and eigenfunction, and compared the results to the “true” leading eigenfunction and eigenvalue computed using a finite-difference approximation of the stochastic Koopman operator.

Figure 10 shows the error in the leading nontrivial eigenvalue and eigenfunction as a function of *M*. In the rightmost plot, we define the error as \(\Vert \varphi _{1,\text {EDMD}} - \varphi _{1,\text {True}}\Vert \) after both eigenfunctions have been normalized so that \(\Vert \varphi _{1,\text {EDMD}}\Vert _2 = \Vert \varphi _{1,\text {True}}\Vert _2\). As expected, EDMD is inaccurate when *M* is small (here, \(M<100\)); there are not enough data to accurately approximate the scalar products. For \(M > 10^3\), the eigenfunction produced by EDMD has the right shape, and the eigenvalue approaches its true value. For \(M > 10^4\), there is no “visible” difference in the leading eigenvalue, and the error in the leading eigenfunction is less than \(10^{-3}\).

To quantify the rate of convergence, we fit a line to the plot of error versus *M* in the right panel of Fig. 10. As expected, EDMD converges like \(M^{-0.49}\), which is very close to the predicted \(\mathcal {O}(M^{-0.5})\) rate associated with Monte Carlo integration. Because this problem is stochastic, uniform sampling cannot improve the rate of convergence (the integral over the probability space associated with the stochastic dynamics will still converge like \(\mathcal {O}(M^{-1/2})\)), even though it is a simple method for enhancing the rate of convergence in deterministic problems.
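The slope estimate is an ordinary least-squares fit in log–log coordinates. The sketch below uses synthetic \(M^{-1/2}\) errors in place of the EDMD results, so the recovered slope is the idealized Monte Carlo rate rather than the fitted \(-0.49\) from Fig. 10.

```python
import numpy as np

# Fit a line to log10(error) vs. log10(M); the slope is the convergence rate.
# Synthetic errors err = 3/sqrt(M) stand in for the measured EDMD errors.
M = np.logspace(2, 6, 9)
err = 3.0 / np.sqrt(M)
slope, intercept = np.polyfit(np.log10(M), np.log10(err), 1)
# slope is approximately -0.5, the Monte Carlo rate
```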

The eigenfunctions produced by EDMD improve with *M*, but there are clearly some numerical issues at the edges of the domain and near \(x=0\), where discontinuities in the numerically computed eigenfunctions can occur with our choice of dictionary. To obtain a more quantitatively accurate solution, additional data points are required. When \(M=14384\), the numerical issues at the boundaries and the discontinuity at \(x=0\) have diminished. As shown in the plot with \(M = 1128837\), this process continues until the EDMD eigenfunction is visually identical to the true eigenfunction. In principle, EDMD can produce highly accurate approximations of the leading Koopman eigenfunctions, but because EDMD for stochastic systems is a Monte Carlo method, the relatively slow rate of convergence may make the amount of data required to reach this level of accuracy infeasible in practice.

### 5.3 Parameterizing Nonlinear Manifolds and Reducing Stochastic Dynamics

In this section, we briefly demonstrate how the EDMD method can be used to parameterize nonlinear manifolds and reduce stochastic differential equations defined on those manifolds. Everything done here could also be done for a deterministic system; we chose an SDE rather than an ODE only to highlight the similarities between EDMD and nonlinear manifold learning techniques such as diffusion maps, not because of any restriction on the Koopman approach. We proceed in two steps. First, we show that data from an SDE defined on the Swiss roll, a nonlinear manifold often used as a test of nonlinear manifold learning techniques (Nadler et al. 2005, 2006; Coifman and Lafon 2006; Lee and Verleysen 2007), can, in conjunction with the EDMD procedure, generate a data-driven parameterization of that manifold. For this first example, isotropic diffusion is used, so there is no “fast” or “slow” solution component that can be meaningfully neglected. Instead, we show that the leading eigenfunctions are one-to-one with the “length” and “width” of the Swiss roll. Then, we alter the SDE and introduce a “fast” component by making the diffusion term anisotropic. In this case, EDMD “picks” the slower components before the faster ones. We should stress that in this application, EDMD could be used as a replacement for (rather than in conjunction with) other methods such as diffusion maps (DMAPs) and its variants (Nadler et al. 2005, 2006; Coifman and Lafon 2006; Dsilva et al. 2013). Although there are advantages to *combining diffusion maps and Koopman* (Budisic and Mezic 2012), exploration of those approaches is beyond the scope of this manuscript.

#### 5.3.1 Parameterizing a Nonlinear Manifold with a Diffusion Process

Note that *the EDMD approach is applied to the three-dimensional, transformed variables and not the two-dimensional, true variables*. Our objective here is to determine a two-parameter description of what initially appears to be three-dimensional data, directly from the data.

The data given to EDMD were generated from \(10^4\) initial conditions uniformly distributed in \(s_1\) and \(s_2\) that were evolved for a total time of \(\Delta t = 0.1\) using the Euler–Maruyama method with 100 timesteps. Both the initial and terminal states of the system were then mapped into three dimensions using (42). Next, a dictionary must be defined. However, \({\mathcal {M}}\) is unknown (indeed, parameterizing \({\mathcal {M}}\) is the entire point), so the domain \(\Omega \) is taken to be the “box” in \(\mathbb {R}^3\) with \(x\in [-3\pi -0.1, 3\pi +0.1]\), \(y\in [0, 2\pi ]\), and \(z\in [-3\pi -0.1, 3\pi +0.1]\). In this larger domain, we employ a spectral element basis consisting of 4096 rectangular subdomains (16 each in *x*, *y*, and *z*) with up to linear polynomials in each subdomain. Because \({\mathcal {M}}\subset \Omega \), extraneous and redundant functions are expected, and \(\varvec{G}\) is often ill-conditioned.
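The mapping (42) is not reproduced in this excerpt; the sketch below uses a standard Swiss-roll embedding, \((s_1, s_2) \mapsto (s_1\cos s_1,\; s_2,\; s_1\sin s_1)\), as a *hypothetical stand-in*, chosen only because it lands inside the quoted bounding box. The manuscript's exact map may differ.

```python
import numpy as np

# Hypothetical stand-in for Eq. (42): a standard Swiss-roll embedding whose
# image fits the quoted box x, z in [-3*pi, 3*pi], y in [0, 2*pi].
def embed(s1, s2):
    return np.stack([s1 * np.cos(s1), s2, s1 * np.sin(s1)], axis=-1)

rng = np.random.default_rng(1)
M = 10_000
s1 = rng.uniform(0.0, 3 * np.pi, M)   # "length" coordinate of the roll
s2 = rng.uniform(0.0, 2 * np.pi, M)   # "width" coordinate of the roll
X = embed(s1, s2)                     # three-dimensional snapshots given to EDMD
```

EDMD then sees only the three-dimensional points `X` (and their propagated counterparts), never the underlying \((s_1, s_2)\).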

The procedure for incorporating new data points is straightforward: The embedding for any \(\tilde{\varvec{x}}\in {\mathcal {M}}\) can be obtained simply by evaluating the relevant eigenfunctions at \(\tilde{\varvec{x}}\). It should be stressed that although the \(\varphi \) are defined on all of \(\Omega \), their values are only meaningful on (or very near) \({\mathcal {M}}\) because that is where the dynamical system is defined. Therefore, these new points must be elements of \({\mathcal {M}}\) if the resulting embedding is to have any meaning.
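This out-of-sample extension is just a dictionary evaluation followed by an inner product with the EDMD eigenvectors, \(\varphi_j(\tilde{\varvec{x}}) = \varvec{\Psi}(\tilde{\varvec{x}})\,\varvec{\xi}_j\); no re-solution of the EDMD problem is required. In the sketch below, `psi` and `xi` are hypothetical stand-ins for the dictionary and the computed eigenvectors.

```python
import numpy as np

# Embed a new on-manifold point by evaluating the dictionary there and
# projecting onto the EDMD eigenvectors (columns of xi).
def psi(x):                              # toy monomial dictionary on R^1
    return np.array([1.0, x, x**2, x**3])

rng = np.random.default_rng(2)
xi = rng.standard_normal((4, 2))         # stand-in: two EDMD eigenvectors

def embed_point(x_new):
    # phi_j(x_new) = psi(x_new) @ xi[:, j]
    return psi(x_new) @ xi

coords = embed_point(0.5)                # 2-D embedding of the new point
```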

#### 5.3.2 Reducing Multiscale Dynamics

In the previous example, the noise was isotropic, so the dynamics were equally “fast” in both directions. Because we are interested in the most slowly decaying eigenfunctions, the eigenfunction that is one-to-one with \(s_1\) is more important because it decays more slowly than the eigenfunction that is one-to-one with \(s_2\). In this example, we introduce anisotropic diffusion and therefore create “fast” and “slow” directions on the nonlinear manifold. The purpose of this example is to show that EDMD will “reorder” its ranking of the eigenfunctions and recover the slower component before the faster one if the level of anisotropy is large enough.

Because the anisotropy alters the timescales of the dynamics but not the geometry of the manifold, the eigenfunctions themselves should not change, but the *eigenvalues* associated with each of the eigenfunctions should.

This section explored the application of the EDMD method to data taken from a Markov process. Algorithmically, the method remains the same regardless of how the data were generated, but as demonstrated here, EDMD computes an approximation of the tuples associated with the stochastic Koopman operator rather than the Koopman operator. To demonstrate the effectiveness of EDMD, we applied the method to a simple SDE with a double-well potential and to an SDE defined on a Swiss roll, a nonlinear manifold often used as a benchmark for manifold learning techniques. One advantage of the Koopman approach for applications such as manifold learning or model reduction is that the Koopman tuples take into account both the geometry of the manifold, through the eigenfunction and mode, and the dynamics, through the eigenvalue. As a result, the approach taken here is aware of both geometry and dynamics and does not focus solely on one or the other.

## 6 Conclusions

In this manuscript, we presented a data-driven method that computes approximations of the Koopman eigenvalues, eigenfunctions, and modes (what we call Koopman tuples) directly from a set of snapshot pairs. We refer to this method as extended dynamic mode decomposition (EDMD). The finite-dimensional approximation generated by EDMD is the solution to a least-squares problem and converges to a Galerkin method with a large amount of data. While the usefulness of the Galerkin method depends on the sampling density and dictionary selected, several “common sense” choices of both appear to produce useful results.
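The least-squares construction summarized above can be stated in a few lines: given snapshot pairs and a dictionary \(\varvec{\Psi}\), form \(\varvec{G} = \varvec{\Psi}_X^*\varvec{\Psi}_X/M\) and \(\varvec{A} = \varvec{\Psi}_X^*\varvec{\Psi}_Y/M\), and set \(\varvec{K} = \varvec{G}^+\varvec{A}\). The sketch below is a minimal illustration on a toy linear map with an assumed monomial dictionary, not the manuscript's implementation.

```python
import numpy as np

# Minimal EDMD sketch: K = pinv(G) @ A solves the least-squares problem
# min_K || Psi_Y - Psi_X K ||.  Toy data: one step of the linear map y = 0.9 x
# with the dictionary {1, x, x^2}, which is invariant under this map.
def psi(x):
    return np.stack([np.ones_like(x), x, x**2], axis=-1)   # shape (M, 3)

rng = np.random.default_rng(3)
x = rng.uniform(-1.0, 1.0, 500)
y = 0.9 * x

PX, PY = psi(x), psi(y)
G = PX.T @ PX / len(x)
A = PX.T @ PY / len(x)
K = np.linalg.pinv(G) @ A            # finite-dimensional Koopman approximation

mu = np.linalg.eigvals(K)            # expect {1, 0.9, 0.81}: powers of 0.9
```

Because the dictionary is invariant under the toy map, the computed eigenvalues are exact; for generic nonlinear systems the quality of \(\varvec{K}\) depends on the dictionary and the sampling, as discussed above.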

We demonstrated the effectiveness of the method with four examples: two dealt with deterministic data and two with stochastic data. First, we applied EDMD to a linear system whose Koopman eigenfunctions are known analytically. Direct comparison of the EDMD eigenfunctions with the analytic values demonstrated that EDMD can be highly accurate with the proper choice of data and dictionary. Next, we applied EDMD to the unforced Duffing equation, for which the Koopman eigenfunctions are not known explicitly. Although more data would increase the accuracy of the resulting eigenfunctions, they appeared to be accurate enough to effectively partition the domain of interest and parameterize the resulting partitions.

The final two examples used data generated by Markov processes. First, we applied EDMD to data taken from an SDE with a double-well potential and demonstrated the accuracy of the method by comparing those results with a direct numerical approximation of the stochastic Koopman operator over a range of diffusion parameters. Next, we applied EDMD to data from a diffusion process on a “Swiss roll,” which is a nonlinear manifold commonly used as an example for nonlinear dimensionality reduction. Similar to those methods (see e.g., Coifman and Lafon (2006) and Lee and Verleysen (2007)), EDMD generated an effective parameterization of the manifold using the leading eigenfunctions. By making the diffusion anisotropic, we then demonstrated that EDMD extracts a parameterization that is dynamically, rather than only geometrically, meaningful. Due to the simplicity of this problem, the eigenfunctions remain unchanged despite the anisotropy; the difference appears in the temporal evolution of the eigenfunctions, which is dictated by the corresponding set of eigenvalues. As a result, the purpose of that example was to show that EDMD “ordered” the eigenvalues of each tuple differently.

The Koopman operator governs the evolution of observables defined on the state space of a dynamical system. By judiciously selecting how we observe our system, we can generate *linear models* that are valid on all of (or, at least, a larger subset of) state space rather than just some small neighborhood of a fixed point; this could allow algorithms designed for linear systems to be applied even in nonlinear settings. However, the tuples of eigenvalues, eigenfunctions, and modes required to do so are decidedly non-trivial to compute. Data-driven methods such as EDMD have the potential to allow accurate approximations of these quantities to be computed without knowledge of the underlying dynamics or geometry. As a result, they could be a practical method for enabling Koopman-based analysis and model reduction in large nonlinear systems.

## Acknowledgments

The authors would like to thank Igor Mezić, Jonathan Tu, Maziar Hemati, and Scott Dawson for interesting and useful discussions on dynamic mode decomposition and the Koopman operator. M.O.W. gratefully acknowledges support from NSF DMS-1204783. I.G.K acknowledges support from AFOSR FA95550-12-1-0332 and NSF CMMI-1310173. C.W.R acknowledges support from AFOSR FA9550-12-1-0075.

## References

- Bagheri, S.: Effects of weak noise on oscillating flows: linking quality factor, Floquet modes and Koopman spectrum. Phys. Fluids **26**, 094104 (2014)
- Bagheri, S.: Koopman-mode decomposition of the cylinder wake. J. Fluid Mech. **726**, 596–623 (2013)
- Belytschko, T., Krongauz, Y., Organ, D., Fleming, M., Krysl, P.: Meshless methods: an overview and recent developments. Comput. Methods Appl. Mech. Eng. **139**(1), 3–47 (1996)
- Bishop, C.M.: Pattern Recognition and Machine Learning (Information Science and Statistics). Springer-Verlag, New York (2006)
- Bollt, E.M., Santitissadeekorn, N.: Applied and Computational Measurable Dynamics. SIAM, Philadelphia (2013)
- Boyd, J.P.: Chebyshev and Fourier Spectral Methods. Courier Dover Publications, New York (2013)
- Budišić, M., Mohr, R., Mezić, I.: Applied Koopmanism. Chaos Interdiscip. J. Nonlinear Sci. **22**(4), 047510 (2012)
- Budišić, M., Mezić, I.: Geometry of the ergodic quotient reveals coherent structures in flows. Phys. D **241**, 1255–1269 (2012)
- Chen, K.K., Tu, J.H., Rowley, C.W.: Variants of dynamic mode decomposition: boundary condition, Koopman, and Fourier analyses. J. Nonlinear Sci. **22**(6), 887–915 (2012)
- Coifman, R.R., Lafon, S.: Diffusion maps. Appl. Comput. Harmonic Anal. **21**(1), 5–30 (2006)
- Dean, J., Ghemawat, S.: MapReduce: simplified data processing on large clusters. Commun. ACM **51**(1), 107–113 (2008)
- Dellnitz, M., Froyland, G., Junge, O.: The algorithms behind GAIO – set oriented numerical methods for dynamical systems. In: Fiedler, B. (ed.) Ergodic Theory, Analysis, and Efficient Simulation of Dynamical Systems, pp. 145–174. Springer, Berlin (2001)
- Dsilva, C.J., Talmon, R., Rabin, N., Coifman, R.R., Kevrekidis, I.G.: Nonlinear intrinsic variables and state reconstruction in multiscale simulations. J. Chem. Phys. **139**(18), 184109 (2013)
- Eisenhower, B., Maile, T., Fischer, M., Mezić, I.: Decomposing building system data for model validation and analysis using the Koopman operator. In: Proceedings of the National IBPSA-USA Conference, New York, USA (2010)
- Erban, R., Frewen, T.A., Wang, X., Elston, T.C., Coifman, R., Nadler, B., Kevrekidis, I.G.: Variable-free exploration of stochastic models: a gene regulatory network example. J. Chem. Phys. **126**(15), 155103 (2007)
- Froyland, G.: Statistically optimal almost-invariant sets. Phys. D Nonlinear Phenom. **200**(3), 205–219 (2005)
- Froyland, G., Gottwald, G.A., Hammerlindl, A.: A computational method to extract macroscopic variables and their dynamics in multiscale systems. SIAM J. Appl. Dyn. Syst. **13**(4), 1816–1846 (2014)
- Froyland, G., Padberg, K.: Almost-invariant sets and invariant manifolds – connecting probabilistic and geometric descriptions of coherent structures in flows. Phys. D Nonlinear Phenom. **238**(16), 1507–1523 (2009)
- Froyland, G., Padberg, K., England, M.H., Treguier, A.M.: Detection of coherent oceanic structures via transfer operators. Phys. Rev. Lett. **98**(22), 224503 (2007)
- Gaspard, P., Nicolis, G., Provata, A., Tasaki, S.: Spectral signature of the pitchfork bifurcation: Liouville equation approach. Phys. Rev. E **51**(1), 74 (1995)
- Gaspard, P., Tasaki, S.: Liouvillian dynamics of the Hopf bifurcation. Phys. Rev. E **64**(5), 056232 (2001)
- Givon, D., Kupferman, R., Stuart, A.: Extracting macroscopic dynamics: model problems and algorithms. Nonlinearity **17**(6), R55 (2004)
- Hansen, P.C.: Truncated singular value decomposition solutions to discrete ill-posed problems with ill-determined numerical rank. SIAM J. Sci. Stat. Comput. **11**(3), 503–518 (1990)
- Higham, D.J.: An algorithmic introduction to numerical simulation of stochastic differential equations. SIAM Rev. **43**(3), 525–546 (2001)
- Hirsch, C.: Numerical Computation of Internal and External Flows: The Fundamentals of Computational Fluid Dynamics, vol. 1. Butterworth-Heinemann (2007)
- Holmes, P., Lumley, J.L., Berkooz, G.: Turbulence, Coherent Structures, Dynamical Systems and Symmetry. Cambridge University Press, Cambridge (1998)
- Jovanović, M.R., Schmid, P.J., Nichols, J.W.: Sparsity-promoting dynamic mode decomposition. Phys. Fluids **26**, 024103 (2014)
- Juang, J.-N.: Applied System Identification. Prentice Hall, Englewood Cliffs (1994)
- Karniadakis, G., Sherwin, S.: Spectral/hp Element Methods for Computational Fluid Dynamics. Oxford University Press, Oxford (2013)
- Kloeden, P.E., Platen, E.: Numerical Solution of Stochastic Differential Equations. Springer, Berlin (1992)
- Koopman, B.O.: Hamiltonian systems and transformation in Hilbert space. Proc. Natl. Acad. Sci. U.S.A. **17**(5), 315 (1931)
- Koopman, B.O., Neumann, J.V.: Dynamical systems of continuous spectra. Proc. Natl. Acad. Sci. U.S.A. **18**(3), 255 (1932)
- Lee, J.A., Verleysen, M.: Nonlinear Dimensionality Reduction. Springer, Berlin (2007)
- Lehoucq, R.B., Sorensen, D.C., Yang, C.: ARPACK Users' Guide: Solution of Large-Scale Eigenvalue Problems. SIAM, Philadelphia (1998)
- Liu, G.-R.: Meshfree Methods: Moving Beyond the Finite Element Method. CRC Press, Boca Raton (2010)
- Matkowsky, B., Schuss, Z.: Eigenvalues of the Fokker–Planck operator and the approach to equilibrium for diffusions in potential fields. SIAM J. Appl. Math. **40**(2), 242–254 (1981)
- Mauroy, A., Mezić, I.: On the use of Fourier averages to compute the global isochrons of (quasi)periodic dynamics. Chaos Interdiscip. J. Nonlinear Sci. **22**(3), 033112 (2012)
- Mauroy, A., Mezić, I.: A spectral operator-theoretic framework for global stability. In: Proceedings of the 52nd IEEE Conference on Decision and Control (CDC), pp. 5234–5239 (2013)
- Mauroy, A., Mezić, I., Moehlis, J.: Isostables, isochrons, and Koopman spectrum for the action-angle representation of stable fixed point dynamics. Phys. D Nonlinear Phenom. **261**, 19–30 (2013)
- Mezić, I.: Spectral properties of dynamical systems, model reduction and decompositions. Nonlinear Dyn. **41**(1–3), 309–325 (2005)
- Monaghan, J.J.: Smoothed particle hydrodynamics. Annu. Rev. Astron. Astrophys. **30**, 543–574 (1992)
- Muld, T.W., Efraimsson, G., Henningson, D.S.: Flow structures around a high-speed train extracted using proper orthogonal decomposition and dynamic mode decomposition. Comput. Fluids **57**, 87–97 (2012)
- Nadler, B., Lafon, S., Kevrekidis, I.G., Coifman, R.R.: Diffusion maps, spectral clustering and eigenfunctions of Fokker–Planck operators. Adv. Neural Inf. Process. Syst. **18**, 955–962 (2005)
- Nadler, B., Lafon, S., Coifman, R.R., Kevrekidis, I.G.: Diffusion maps, spectral clustering and reaction coordinates of dynamical systems. Appl. Comput. Harmonic Anal. **21**(1), 113–127 (2006)
- Rowley, C.W., Mezić, I., Bagheri, S., Schlatter, P., Henningson, D.S.: Spectral analysis of nonlinear flows. J. Fluid Mech. **641**, 115–127 (2009)
- Santitissadeekorn, N., Bollt, E.: The infinitesimal operator for the semigroup of the Frobenius–Perron operator from image sequence data: vector fields and transport barriers from movies. Chaos Interdiscip. J. Nonlinear Sci. **17**(2), 023126 (2007)
- Schmid, P.J.: Dynamic mode decomposition of numerical and experimental data. J. Fluid Mech. **656**, 5–28 (2010)
- Schmid, P., Li, L., Juniper, M., Pust, O.: Applications of the dynamic mode decomposition. Theor. Comput. Fluid Dyn. **25**(1–4), 249–259 (2011)
- Schmid, P.J., Violato, D., Scarano, F.: Decomposition of time-resolved tomographic PIV. Exp. Fluids **52**(6), 1567–1579 (2012)
- Seena, A., Sung, H.J.: Dynamic mode decomposition of turbulent cavity flows for self-sustained oscillations. Int. J. Heat Fluid Flow **32**(6), 1098–1110 (2011)
- Sirisup, S., Karniadakis, G.E., Xiu, D., Kevrekidis, I.G.: Equation-free/Galerkin-free POD-assisted computation of incompressible flows. J. Comput. Phys. **207**(2), 568–587 (2005)
- Stengel, R.F.: Optimal Control and Estimation. Courier Dover Publications, New York (2012)
- Susuki, Y., Mezić, I.: Nonlinear Koopman modes and coherency identification of coupled swing dynamics. IEEE Trans. Power Syst. **26**(4), 1894–1904 (2011)
- Susuki, Y., Mezić, I.: Nonlinear Koopman modes and a precursor to power system swing instabilities. IEEE Trans. Power Syst. **27**(3), 1182–1191 (2012)
- Susuki, Y., Mezić, I.: Nonlinear Koopman modes and power system stability assessment without models. IEEE Trans. Power Syst. **29**(2), 899–907 (2014)
- Todorov, E.: Optimal control theory. In: Doya, K. (ed.) Bayesian Brain: Probabilistic Approaches to Neural Coding, pp. 269–298. MIT Press, Cambridge (2007)
- Trefethen, L.N.: Spectral Methods in MATLAB, vol. 10. SIAM, Philadelphia (2000)
- Tu, J.H., Rowley, C.W., Luchtenburg, D.M., Brunton, S.L., Kutz, J.N.: On dynamic mode decomposition: theory and applications. J. Comput. Dyn. **1**(2), 391–421 (2014)
- Wendland, H.: Meshless Galerkin methods using radial basis functions. Math. Comput. **68**(228), 1521–1531 (1999)
- Wynn, A., Pearson, D., Ganapathisubramani, B., Goulart, P.: Optimal mode decomposition for unsteady flows. J. Fluid Mech. **733**, 473–503 (2013)