Greedy Kernel Methods for Center Manifold Approximation

For certain dynamical systems it is possible to significantly simplify the study of stability by means of the center manifold theory. This theory allows to isolate the complicated asymptotic behavior of the system close to a non-hyperbolic equilibrium point, and to obtain meaningful predictions of its behavior by analyzing a reduced dimensional problem. Since the manifold is usually not known, approximation methods are of great interest to obtain qualitative estimates. In this work, we use a data-based greedy kernel method to construct a suitable approximation of the manifold close to the equilibrium. The data are collected by repeated numerical simulation of the full system by means of a high-accuracy solver, which generates sets of discrete trajectories that are then used to construct a surrogate model of the manifold. The method is tested on different examples which show promising performance and good accuracy.


Introduction
Center manifold theory plays an important role in the study of the stability of dynamical systems when the equilibrium point is not hyperbolic. It isolates the complicated asymptotic behavior by locating the center manifold which is an invariant manifold tangent to the subspace spanned by the eigenspace of eigenvalues on the imaginary axis. Then, the dynamics of the original system will be essentially determined by the restriction of this dynamics on the center manifold since the local dynamic behavior "transverse" to this invariant manifold is relatively simple as it corresponds to the flows in the local stable (and unstable) manifolds. In practice, one does not compute the center manifold and its dynamics exactly since this requires the resolution of a quasilinear partial differential equation which is not easily solvable. In most cases of interest, an approximation of degree two or three of the solution is sufficient. Then, the reduced dynamics on the center manifold can be determined, its stability can be studied and then conclusions about the stability of the original system can be obtained [6,8,4,1,3].
In this article, we use greedy kernel methods to construct a data-based approximation of the center manifold. The present work is a preliminary study that is intended to introduce our concept and algorithm, and to test it on some examples.

Background
Consider a dynamical systeṁ . Suppose x = 0 is an equilibrium, i.e. f (0) = 0, and denote as σ R (F ) the sequence of real parts of the eigenvalues of F . A classical stability result states that if F has all its eigenvalues with negative real parts, i.e., σ R (F ) ⊂ R<0, then the origin is asymptotically stable; and if F has some eigenvalues with positive real parts, then the origin is unstable. If σ R (F ) ⊂ R ≤0 , the linearization fails to determine the stability properties of the origin.
After a linear change of coordinates, we havė Intuitively we expect the stability of the equilibrium to only depend on the nonlinear termsf1(x, y). The center manifold theorem correctly formalizes this intuition.
A center manifold is an invariant manifold 1 , y = h(x), for (2)-(3), such that h is smooth and for i = 1, 2, and the eigenvalues of F1 have zero real parts, and all the eigenvalues of F2 have negative real parts, then there exists a neighbourhood Ω ⊂ R d of the origin 0 ∈ R d , such that x2 = h(x1) is a center manifold for (2)-(3).

Sinceẋ
= F1x +f1(x, y), y = F2y +f2(x, y) and x2 = h(x1), we deduce that h satisfies the PDE The center manifold theorem ensures that there are smooth solutions to this PDE. It also allows to deduce the stability of the origin of the full order system (2)-(3) from the stability of the origin of a reduced order system called the center dynamics.

Theorem 2. [1] (Center Manifold Theorem)
The equilibria x = 0, y = 0 of the original dynamics is locally asymptotically stable (resp. unstable) iff the equilibria x = 0 of the center dynamics (dynamics on the center is locally asymptotically stable (resp. unstable).
After solving the PDE (6), the problem of analyzing the stability properties of the system (2) reduces to analyzing the nonlinear stability of a lower dimensional system (7).
But the PDE (6) need not be solved exactly, since frequently it suffices to compute the low degree terms of the Taylor series expansion of h around x = 0, i.e., h(x) = h [1] x + h [2] (x) + h [3] (x) + . . .

where (·)
[k] is the degree k part of the Taylor series. The PDE can then be rewritten as a set of algebraic equations • etc.
The Taylor expansion of h to degree d − 1 determines the center manifold dynamics to degree d, . .
, and this may be enough to determine its local asymptotic stability. This methodology is valid for parameterized dynamical systems and is used to study the stability of dynamical systems with bifurcations. Our goal in this paper is to find a data-based approximation of the center manifold in view of a data-based version of the center manifold theorem.

Kernel Approximation
We want to build a surrogate model s h : Ω → R m which approximates the center manifold h on a suitable set for all x ∈ Ω. This model is constructed in a data-based way, i.e., we assume the knowledge of the map h on a finite set of input parameters, or training data. In practice, such values are computed from high-fidelity numerical approximations, which will be discussed in details in the following.
The surrogate is based on kernel approximation, which allows the use of scattered data, i.e., we do not require any grid structure on the set of training data. Moreover, since the unknown function h is vector-valued, we employ here matrix-valued kernels. Details on kernel-based approximation can be found e.g. in [9], and the extension to the vectorial case is detailed e.g. in [5,10]. We recall here only that a positive definite matrix-valued kernel on Ω is a function K : Ω × Ω → R m×m such that K(x, y) = K(y, x) T for all x, y ∈ Ω and [K(xi, xj)] N i,j=1 ∈ R mN ×mN is positive semidefinite for any set {x1, . . . , xN } ⊂ Ω of pairwise distinct points, for all N ∈ N. Associated to a positive definite kernel there is a unique Hilbert space H of functions Ω → R m , named native space, where the kernel is reproducing, meaning that K(·, x)α is the Riesz representer of the directional point evaluation δ α We consider here a twice continuously differentiable matrix-valued kernel k on Ω, and we use a specific functional formulation for our approximation and a specific cost function, in order to construct a surrogate that is well suited for the particular approximation task.
In details, the approximant takes the form 1 , . . . , x (2) n 2 and coefficient vectors αi, βi,j ∈ R m . Here the superscript ∂ (2) denotes that the derivative with regards to the second kernel component is taken.
We assume to have sufficiently many data XN * = {x1, . . . , xN * } and YN * = {y1, . . . , yN * } which, for example, are generated by running a numerical scheme to compute discrete trajectories for different initial values (x0, y0). For this step, we need to assume that the variable splitting (2)-(3) is known in advance. Note that this is not a severe restriction, as for a general ODE (1) the required state transformation can be determined by eigenvalue decomposition of F .
Observe that we do not know if a data pair (xi, yi) lies on the center manifold, i.e. if yi = h(xi) holds. We only know that the data converges asymptotically to the center manifold as xi → 0. Thus, an interpolationbased surrogate which interpolates the data on a given subset X ⊂ XN * seems ill-suited for our purposes. We consider instead another set of conditions to define the approximant. First, we still require the conditions in (4) to be satisfied by our approximation. Moreover, for the given subsets X = {x1, . . . , xN } and Y = {y1, . . . , yN }, we compute our approximant by minimizing the following functional J : H → R under the constraint s(0) = 0, Ds(0) = 0: Here ωi ∈ R m×m is a positive definite weight matrix. It can be shown that (8) has a unique minimizer s h (see [11]). In particular s h and its derivative Ds h have the form where we set xN+1 := 0. The coefficient vectors αi, βi can be computed by solving the system with A := (K(xi, xj)) i,j ∈ R m(N +1)×m(N +1) , The weight matrices ωi can either be chosen manually, or a regularizing function r : Ω → R m×m can be prescribed such that ωi = r(xi) is symmetric and positive definite. In our numerical examples in section 4 we chose a constant regularization function, i.e. ωi = r(xi) = λIm for some λ > 0. However, one might consider a more general approach, where the weight increases as the data tends to the origin, i.e. ωi ωj if xi ≤ xj .

Greedy approximation
If the technique of the previous section is used as it is, the surrogate (9) is given by an expansion with N * terms, where N * is the number of points in the training set. Therefore, the model evaluation can be not efficient enough if the model is built using a too large dataset. Furthermore, the computation of the coefficients in (9) requires the solution of the linear system (10), whose size again scales with the size of the training set, and which can be severely ill-conditioned for not well-placed points.
To mitigate both problems, we employ an algorithm that aims at selecting small subsets XN , YN of points such that the surrogate computed with these sets is a sufficiently good approximation of the one which uses the full sets. The algorithm selects the points in a greedy way, i.e., one point at a time is selected and added to the current training set. In this way, it is possible to identify a good set without the need to solve a nearly infeasible combinatorial problem.
The selection is performed using the P -greedy method of [2] applied to the kernel K, such that the set of points is selected before the computation of the surrogate. The number of points, and therefore the expansion size and evaluation speed, is depending on the prescribed target accuracy ε tol > 0. For details on the method implementation and its convergence properties we refer to [7].

Numerical Examples
We test now our method on three different examples. In each of them, we specify the setting and the parameters used to build the surrogate. We compare the surrogates with the the true manifold and we compute the pointwise residual ri(x) = Dsi(x)f1(x, si(x)) − f2(x, si(x)), i = 1, 2, which measures the error when the surrogate is used as a replacement of the center manifold.
In all the three examples, the greedy algorithm is used to select a suitable subset of the points, and in all cases the procedure is stopped with a prescribed ε tol . In the first two examples we set ε tol := 10 −15 , while ε tol := 10 −10 in the last one.

Example 1
We consider the 2-dimensional systeṁ We generate the training data by solving (11) with an implicit Euler scheme for initial time t0 = 0, final time T = 1000 and with the time step ∆t = 0.1. We initiate the numerical procedure with initial values (x0, y0) ∈ {±0.8} × {±0.8} and store the resulting data pairs in X and Y after discarding all data whose x-values are not contained in the neighborhood [−0.1, 0.1] which results in N * = 38248 data pairs. We run the greedy algorithm for the kernels k1(x, y) := (1 + xy/2) 4 and k2(x, y) = e −(x−y) 2 /2 . This results in the sets X1 and X2 which contain 14 and 6 points, respectively. The corresponding approximations s1 and s2 for the constant regularization function r ≡ 10 −10 are plotted in Figure 1, left. The center manifold approximations are plotted over the domain [−0.1, 0.1]. The pointwise residual is depicted in Fig. 1, right.

Example 2
We consider the 2-dimensional systeṁ The training data is generated the same way as in Example 1. We again use the kernels k1 and k2. The greedy algorithm gives sets X1 and X2 of size 12 and 6, respectively. The evaluation of the approximations s1 and s2 over the neighborhood [−0.1, 0.1] can be seen on the left in Figure  2, while the respective pointwise residuals are plotted in the right part of Figure 2.

Example 3
We consider the (2 + 1)-dimensional systeṁ We generate the training data in a similar fashion as before. We again use the implicit Euler scheme with start time t = 0, final time T = 1000 and with time step ∆t = 0.1. The Euler method is performed for initial data (x0, y0) ∈ {±0.8} 3 and the resulting trajectories are stored in X and Y , where only data with x ∈ [−0.1, 0.1] 2 was considered; this leads to N * = 78796 data pairs. We use the kernels k1(x, y) = (1 + x T y/2) 4 and k2(x, y) = e − x−y 2 2 /2 , and the greedy-selected sets have the size 21 (for k1) and 25 (for k2), respectively. The approximations s1,s2 and their corresponding residuals r1 and r2 computed over the domain [−0.1, 0.1] 2 . The results can be seen in Figure 3.  Figure 3: Approximation of the center manifold and residuals for Example 3 using kernels k 1 and k 2 .
We remark that in all the three experiments both kernels give comparable results in terms of error magnitude, and they both provide a good approximation of the manifold.

Conclusions
In this paper we introduced a novel algorithm to approximate the center manifold of a given ODE using a data-based surrogate.
This algorithm computes an approximation of the manifold from a set of numerical trajectories with different initial data. It is based on kernel methods, which allow the use of the scattered data generated by these simulations as training points. Moreover, an application-specific ansatz and cost function have been employed in order to enforce suitable properties on the surrogate.
Several numerical experiments suggested that the present method can reach a significant accuracy, and that it has the potential to be used as an effective model reduction technique. It seems promising to apply this approach to high dimensional systems as the approximation technique straightforwardly can be extended and is less prone to the curse of dimensionality than grid-based approximation techniques. An interesting extension would consist of determining the decomposition (2)-(3) in a data-based fashion by suitable processing of the trajectory data.