1 Introduction

In this paper, we highlight how invariant foliations [15, 29] of dynamical systems can be used to derive reduced order models (ROM) either from data or physical models. We consider dynamics about equilibria only and assume a deterministic process, such that future states of the system are fully determined by initial conditions. An invariant foliation is a decomposition of the state space into a family of manifolds, called leaves, such that the dynamics brings each leaf into another (see Fig. 1). If a leaf is brought into itself, then it is also an invariant manifold. A foliation is generally characterised by its co-dimension, which equals the number of parameters needed to describe the family of leaves so that it covers the state space. The dynamics that maps one leaf of an invariant foliation into another leaf has the same dimensionality as the co-dimension of the foliation. We call this mapping the conjugate dynamics; it is lower dimensional than the dynamics of the underlying system and therefore suitable to be used as an ROM. Such an ROM treats all initial conditions within one leaf as equivalent to each other and characterises the dynamics of the whole system. In contrast, the conjugate dynamics (ROM) on an invariant manifold captures the dynamics only on a low-dimensional subset of the state space. The conjugate dynamics on an invariant manifold, however, describes the exact evolution of initial conditions taken from the invariant manifold, while the conjugate dynamics of an invariant foliation is imprecise about the evolution: it can only tell which leaves a trajectory passes through. This ambiguity about the state has an advantage: for every initial condition there is a leaf and hence a valid reduced dynamics. In contrast, when using invariant manifolds, the initial condition must come from the invariant manifold in order to have a valid prediction.

Multiple foliations can act as a coordinate system about the equilibrium. When individual leaves from different foliations intersect in a single point, the dynamics can be fully reconstructed from the foliations. Therefore invariant foliations fully parallel linear modal analysis of mechanical systems [11]: they allow both the decomposition of the system and the reconstruction of the full dynamics. To reconstruct the dynamics, one needs to find intersection points of leaves from different foliations, which is more complicated than adding vibration modes of a linear system. However, such composability is not at all possible with invariant manifolds or any other nonlinear normal mode (NNM) definition [13, 16, 24, 25]. Therefore, an invariant foliation seems to be the closest nonlinear alternative to linear modal analysis. The concept of composition is illustrated in Fig. 1.

Fig. 1

Two foliations act as a coordinate system. An initial condition (red dots) is mapped forward by \(\varvec{F}\); however, each leaf of a foliation is brought forward by the lower-dimensional maps \(\varvec{S}_{1}\) and \(\varvec{S}_{2}\). Due to invariance of the foliation, the full trajectory can be reconstructed from the two maps \(\varvec{S}_{1}\) and \(\varvec{S}_{2}\) and the leaves of the foliations. (Color figure online)

Invariant foliations can be directly fitted to time-series data, because the foliation acts as a projection, much like linear modes. This allows for another parallel to be drawn with modal testing [11], which identifies linear vibration modes from data. Direct fitting of the manifold invariance equation to data is not possible, because the likelihood of data points falling onto the manifold is zero. Instead, in [27] a two-step process was used to find invariant manifolds in vibration data: first a high-dimensional black-box model was identified and then the invariant manifold was extracted. In contrast, a foliation covers all of the phase space where the data lives; hence, all available data can be used for fitting. Moreover, a leaf of a foliation that is mapped into itself, or equivalently (in our case) contains the equilibrium, is an invariant manifold. Therefore finding two complementary invariant foliations, one transversal to an invariant manifold and another containing the invariant manifold as a leaf, can substitute for calculating the invariant manifold and the ROM.

The condition for uniqueness of invariant foliations is different from invariant manifolds. Only invariant manifolds about equilibria that are sufficiently smooth are unique. Unique invariant manifolds about equilibria, periodic or quasi-periodic orbits are called spectral submanifolds (SSM) [13]. The theory behind SSMs was mainly developed in [8], generalised to infinite dimensions in [7] and applied to mechanical systems in [13]. For an SSM to exist, non-resonance conditions need to be satisfied and the dynamics must be smoother than a so-called spectral quotient, which is calculated from the eigenvalues of the Jacobian about an equilibrium. For an SSM to be interesting, it must contain the slowest dynamics, so that it captures long-term behaviour, rather than just transients (see R2 in [14]). It turns out that the spectral quotient of such an SSM is also the highest and therefore the SSM requires the highest order of smoothness to be unique. While the concept of smoothness is theoretically well-understood, it is almost impossible to quantify numerically or determine from data. This is one of the reasons why it is challenging to calculate SSMs numerically (in contrast to series expansion [22]) in a reproducible manner. Invariant foliations, as explained below, also need to satisfy non-resonance conditions to exist and be sufficiently smooth to be unique. We call a unique invariant foliation tangential to an invariant linear subspace about an equilibrium an invariant spectral foliation (ISF). In contrast to SSMs, ISFs that capture the long-term dynamics require the lowest order of smoothness among all ISFs. This, however, does not mean that the smoothness requirements of SSMs can be circumvented by extracting an SSM as the leaf of the ISF going through the origin. In order to obtain the slowest SSM, one would need to calculate the fastest ISF, both of which require the same high order of smoothness for uniqueness.

The existing literature on invariant foliations is rich and difficult to summarise without distracting too much from the purpose of the paper (see, e.g., [3, 15, 23]). However, the setting used here is also different from most of the literature in that we are not dealing with stable or unstable fibres and hyperbolicity is not an important aspect either. The closest results in the literature are the remarks of de la Llave in Sect. 7.3 of [8] and Sect. 2 of [7], that generalise the parametrisation method to foliations.

The plan of the paper is as follows. We start by introducing invariant foliations and describing their properties. We then state and prove theorems for the existence and uniqueness of ISFs both for discrete-time systems and vector fields. Finally, we describe a simple method that allows finding ISFs from time-series data, which is then tested on a simple example alongside two other approaches.

2 Invariant foliations

Consider a dynamical system that is defined by the \(C^{r}\) map \(\varvec{F}:\mathbb {R}^{n}\rightarrow \mathbb {R}^{n}\). A trajectory of the dynamical system is obtained by recursively applying \(\varvec{F}\) to the initial condition \(\varvec{x}_{0}\), such that successive points along a trajectory are generated by

$$\begin{aligned} \varvec{x}_{k+1}=\varvec{F}\left( \varvec{x}_{k}\right) ,\quad k=0,1,\ldots . \end{aligned}$$
(1)

We assume that the origin is a fixed point, that is \(\varvec{F}\left( \varvec{0}\right) =\varvec{0}\) and the Jacobian at the origin, \(\varvec{A}=D\varvec{F}\left( \varvec{0}\right) \), is semisimple. The eigenvalues of \(\varvec{A}\) are denoted by \(\mu _{i}\), \(i=1,\ldots ,n\) and we have a full set of left and right eigenvectors, \(\varvec{v}_{i}^{\star }\) and \(\varvec{v}_{i}\), that satisfy \(\varvec{v}_{i}^{\star }\varvec{A}=\mu _{i}\varvec{v}_{i}^{\star }\) and \(\varvec{A}\varvec{v}_{i}=\mu _{i}\varvec{v}_{i}\), respectively. For convenience we also assume that the eigenvectors are scaled such that \(\varvec{v}_{i}^{\star }\varvec{v}_{i}=1\). Let us denote the linear subspace spanned by the first \(\nu \) eigenvectors as \(E=\mathrm {span}\left\{ \varvec{v}_{1},\ldots ,\varvec{v}_{\nu }\right\} \) and the dual subspace \(E^{\star }=\mathrm {span}\left\{ \varvec{v}_{1}^{\star },\ldots ,\varvec{v}_{\nu }^{\star }\right\} \). Finally, we assume that \(\varvec{A}\) is a contraction, that is, \(\left| \mu _{i}\right| <1,\,\forall i=1,\ldots ,n\).
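The spectral setup above is straightforward to reproduce numerically. The following sketch (our illustration, using a hypothetical contraction \(\varvec{A}\), not an example from the paper) computes the left and right eigenvectors with numpy and verifies the scaling \(\varvec{v}_{i}^{\star }\varvec{v}_{i}=1\):

```python
import numpy as np

# A hypothetical 3x3 contraction with distinct (hence semisimple) spectrum.
A = np.array([[0.9, 0.1, 0.0],
              [0.0, 0.7, 0.1],
              [0.0, 0.0, 0.4]])

mu, V = np.linalg.eig(A)     # columns of V: right eigenvectors, A v_i = mu_i v_i
Vstar = np.linalg.inv(V)     # rows of Vstar: left eigenvectors, v*_i A = mu_i v*_i
# The inverse-based construction automatically gives v*_i v_j = delta_ij,
# i.e. the scaling v*_i v_i = 1 assumed in the text.

for i in range(3):
    assert np.allclose(Vstar[i] @ A, mu[i] * Vstar[i])  # left eigenvalue relation
    assert np.allclose(A @ V[:, i], mu[i] * V[:, i])    # right eigenvalue relation
    assert np.isclose(Vstar[i] @ V[:, i], 1.0)          # normalisation
assert np.all(np.abs(mu) < 1)                           # A is a contraction
```

The subspaces \(E\) and \(E^{\star }\) are then spanned by selected columns of `V` and rows of `Vstar`, respectively.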

We are interested in how codimension-\(\nu \) sets about the origin are brought into each other by \(\varvec{F}\). The family of sets is parametrised by a \(\nu \)-dimensional parameter \(\varvec{z}\in \mathbb {R}^{\nu }\) and a single set at point \(\varvec{z}\) is denoted by \(\mathcal {L}_{\varvec{z}}\). We assume that each \(\mathcal {L}_{\varvec{z}}\) is a differentiable manifold and that \(\mathcal {L}_{\varvec{z}}\) and \(\mathcal {L}_{\tilde{\varvec{z}}}\) are disjoint if \(\varvec{z}\ne \tilde{\varvec{z}}\). In technical terms this is called a codimension-\(\nu \) foliation of \(\mathbb {R}^{n}\) [18] and each \(\mathcal {L}_{\varvec{z}}\) is a leaf. The foliation is the collection of leaves, that is \(\mathcal {F}=\left\{ \mathcal {L}_{\varvec{z}}:\varvec{z}\in \mathbb {R}^{\nu }\right\} \).

A foliation \(\mathcal {F}\) is invariant under \(\varvec{F}\) if there is a map \(\varvec{S}:\mathbb {R}^{\nu }\rightarrow \mathbb {R}^{\nu }\), which brings the leaves into each other in the same way as the high-dimensional dynamics, that is

$$\begin{aligned} \varvec{F}\left( \mathcal {L}_{\varvec{z}}\right) \subset \mathcal {L}_{\varvec{S}\left( \varvec{z}\right) }. \end{aligned}$$
(2)

A foliation can be represented by a function \(\varvec{U}:\mathbb {R}^{n}\rightarrow \mathbb {R}^{\nu }\), called submersion, such that a leaf is the pre-image of the parameter \(\varvec{z}\) under the submersion \(\varvec{U}\), that is,

$$\begin{aligned} \mathcal {L}_{\varvec{z}}=\left\{ \varvec{x}\in \mathbb {R}^{n}:\varvec{U}\left( \varvec{x}\right) =\varvec{z}\right\} . \end{aligned}$$
(3)

Using definition (3), we find that the inclusion (2) translates into an algebraic equation for the submersion \(\varvec{U}\),

$$\begin{aligned} \varvec{U}\left( \varvec{F}\left( \varvec{x}\right) \right) =\varvec{S}\left( \varvec{U}\left( \varvec{x}\right) \right) , \end{aligned}$$
(4)

which is called the invariance equation. Similar to invariant manifolds, we require a tangency condition to a linear subspace. To consider the dynamics corresponding to the linear subspace \(E^{\star }\), we require that

$$\begin{aligned} \varvec{U}\left( \varvec{0}\right) =\varvec{0}\;\text {and}\;\mathrm {span}\,D\varvec{U}\left( \varvec{0}\right) =E^{\star }, \end{aligned}$$
(5)

which means that \(D\varvec{U}\left( \varvec{0}\right) \) consists of \(\nu \) linearly independent row vectors from the dual space of \(\mathbb {R}^{n}\) that span \(E^{\star }\).
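In the linear case these objects can be written down explicitly: with \(\varvec{F}\left( \varvec{x}\right) =\varvec{A}\varvec{x}\), the submersion \(\varvec{U}\left( \varvec{x}\right) =\varvec{W}\varvec{x}\), whose rows are the first \(\nu \) left eigenvectors, and the conjugate map \(\varvec{S}\left( \varvec{z}\right) =\varvec{\varLambda }\varvec{z}\) satisfy the invariance equation (4) exactly, and the leaves (3) are parallel affine subspaces. A minimal numerical sketch of this (our illustration, with a hypothetical matrix):

```python
import numpy as np

# Hypothetical linear dynamics F(x) = A x
A = np.array([[0.9, 0.1, 0.0],
              [0.0, 0.7, 0.1],
              [0.0, 0.0, 0.4]])
mu, V = np.linalg.eig(A)
Vstar = np.linalg.inv(V)          # rows: left eigenvectors

nu = 2                            # codimension of the foliation
W = Vstar[:nu, :]                 # D U(0); its rows span E*
Lam = np.diag(mu[:nu])            # linear conjugate dynamics S(z) = Lam z

# Invariance equation (4): U(F(x)) = S(U(x)) for U(x) = W x, F(x) = A x.
rng = np.random.default_rng(0)
x = rng.standard_normal(3)
assert np.allclose(W @ (A @ x), Lam @ (W @ x))
# A leaf L_z = {x : W x = z} is an affine subspace of codimension nu.
```

For a nonlinear \(\varvec{F}\), the same identity must hold with nonlinear \(\varvec{U}\) and \(\varvec{S}\) whose linear parts are `W` and `Lam`.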

Fig. 2

Invariant foliation. The leaf \(\mathcal {L}_{\varvec{z}}\) (green solid line) is mapped onto \(\mathcal {L}_{\varvec{S}\left( \varvec{z}\right) }\) (red solid line) by \(\varvec{F}\). Leaf \(\mathcal {L}_{\varvec{0}}\) (black solid line) is an invariant manifold, because it contains the origin and it is mapped onto itself by \(\varvec{F}\). Dashed lines are other leaves. (Color figure online)

Figure 2 shows the geometry of an invariant foliation. Each leaf is mapped into another, in particular, the green solid line representing \(\mathcal {L}_{\varvec{z}}\) is mapped into the red solid line by \(\varvec{F}\). A leaf that corresponds to a fixed point of \(\varvec{S}\) is an invariant manifold as it is mapped into itself. In particular, we have \(\varvec{S}\left( \varvec{0}\right) =\varvec{0}\); hence \(\mathcal {L}_{\varvec{0}}\) is an invariant manifold.

The solution of the invariance equation (4) with the tangency condition (5) is not unique for a number of reasons. Firstly, assuming that there exists a pair of functions \(\varvec{U}\) and \(\varvec{S}\) satisfying (4) and (5), a large class of diffeomorphisms \(\varvec{\varPhi }:\mathbb {R}^{\nu }\rightarrow \mathbb {R}^{\nu }\) can be used, such that \(\tilde{\varvec{U}}=\varvec{\varPhi }\circ \varvec{U}\) and \(\tilde{\varvec{S}}=\varvec{\varPhi }\circ \varvec{S}\circ \varvec{\varPhi }^{-1}\) are also solutions of (4) and (5). However, if two pairs of solutions of (4) and (5) are conjugate through a diffeomorphism \(\varvec{\varPhi }\), they represent the same invariant foliation \(\mathcal {F}\). The kind of non-uniqueness that is problematic arises when multiple solutions of (4) and (5) are not conjugate and therefore do not represent the same foliation. To fix possible non-uniqueness, we impose extra smoothness conditions on the submersion \(\varvec{U}\) in addition to being differentiable and tangent to \(E^{\star }\). We then call the smoothest and unique foliation the invariant spectral foliation (ISF) corresponding to the linear subspace E.

2.1 Vector fields

In many applications the dynamics is defined by a vector field. Here we recall that there is a one-to-one relationship between invariant foliations of maps and vector fields [2]. Consider the vector field \(\varvec{\dot{x}}=\varvec{G}\left( \varvec{x}\right) \), which has a fundamental solution \(\varvec{\varPhi }_{t}\left( \varvec{x}\right) \), such that

$$\begin{aligned} \frac{d}{dt}\varvec{\varPhi }_{t}\left( \varvec{x}\right) =\varvec{G}\left( \varvec{\varPhi }_{t}\left( \varvec{x}\right) \right) ,\quad \varvec{\varPhi }_{0}\left( \varvec{x}\right) =\varvec{x}. \end{aligned}$$

Here \(\varvec{\varPhi }_{t}\) is a one-parameter group, because \(\varvec{\varPhi }_{t}\left( \varvec{\varPhi }_{s}\left( \varvec{x}\right) \right) =\varvec{\varPhi }_{t+s}\left( \varvec{x}\right) \) and \(\varvec{\varPhi }_{0}\left( \varvec{x}\right) =\varvec{x}\). If \(\varvec{G}\) is \(C^{r}\) smooth then so is \(\varvec{\varPhi }_{t}\). We can now define the map \(\varvec{F}\left( \varvec{x}\right) =\varvec{\varPhi }_{t}\left( \varvec{x}\right) \), which brings the invariance equation (4) into

$$\begin{aligned} \varvec{U}\left( \varvec{\varPhi }_{t}\left( \varvec{x}\right) \right) =\varvec{S}_{t}\left( \varvec{U}\left( \varvec{x}\right) \right) . \end{aligned}$$
(6)

The conjugate dynamics \(\varvec{S}\) must also be a one-parameter group with \(\varvec{S}_{t+s}\left( \varvec{x}\right) =\varvec{S}_{t}\left( \varvec{S}_{s}\left( \varvec{x}\right) \right) \) and \(\varvec{S}_{0}\left( \varvec{x}\right) =\varvec{x}\) in order to satisfy the invariance equation, that is,

$$\begin{aligned} \varvec{U}\left( \varvec{\varPhi }_{t}\left( \varvec{\varPhi }_{s}\left( \varvec{x}\right) \right) \right)&=\varvec{S}_{t}\left( \varvec{U}\left( \varvec{\varPhi }_{s}\left( \varvec{x}\right) \right) \right) \\ \varvec{U}\left( \varvec{\varPhi }_{t+s}\left( \varvec{x}\right) \right)&=\varvec{S}_{t}\left( \varvec{S}_{s}\left( \varvec{U}\left( \varvec{x}\right) \right) \right) \\ \varvec{U}\left( \varvec{\varPhi }_{t+s}\left( \varvec{x}\right) \right)&=\varvec{S}_{t+s}\left( \varvec{U}\left( \varvec{x}\right) \right) . \end{aligned}$$

The infinitesimal generator of the group \(\varvec{S}\) is denoted by \(\varvec{R}\), such that \(\frac{d}{dt}\varvec{S}_{t}\left( \varvec{x}\right) =\varvec{R}\left( \varvec{S}_{t}\left( \varvec{x}\right) \right) \). On the other hand \(\varvec{U}\) must be independent of time, if it is to define an invariant foliation. We now take the derivative of the invariance equation (6) with respect to time and find

$$\begin{aligned} D\varvec{U}\left( \varvec{\varPhi }_{t}\left( \varvec{x}\right) \right) \varvec{G}\left( \varvec{\varPhi }_{t}\left( \varvec{x}\right) \right) =\varvec{R}\left( \varvec{S}_{t}\left( \varvec{U}\left( \varvec{x}\right) \right) \right) . \end{aligned}$$
(7)

Setting \(t=0\) in Eq. (7), we get the invariance equation for vector fields in the form of

$$\begin{aligned} D\varvec{U}\left( \varvec{x}\right) \varvec{G}\left( \varvec{x}\right) =\varvec{R}\left( \varvec{U}\left( \varvec{x}\right) \right) . \end{aligned}$$
(8)

The next example, which aims to illustrate non-uniqueness of foliations, also shows that occasionally, it is easier to find an invariant foliation using (8) than using (4).

2.2 Example: smoothness and uniqueness of foliations

Let us consider the discrete-time map

$$\begin{aligned} \begin{pmatrix}x_{k+1}\\ y_{k+1} \end{pmatrix}=\begin{pmatrix}\mathrm {e}^{-\lambda }x_{k}\\ \mathrm {e}^{-\mu }y_{k} \end{pmatrix},\,\lambda>0,\mu >0 \end{aligned}$$

for which we can find an equivalent vector field in the form of

$$\begin{aligned} \begin{pmatrix}\dot{x}\\ \dot{y} \end{pmatrix}=\begin{pmatrix}-\lambda x\\ -\mu y \end{pmatrix}, \end{aligned}$$
(9)

such that \(x_{k}=x\left( k\right) \) and \(y_{k}=y\left( k\right) \). The solutions of system (9) lie on the curves \(y\left( x\right) =cx^{\mu /\lambda }\), \(c\in \mathbb {R}\), \(x\ge 0\), as we only consider the right half-plane. The invariance equation (8), when (9) is substituted, becomes

$$\begin{aligned} -\lambda xD_{1}u\left( x,y\right) -\mu yD_{2}u\left( x,y\right) =r\left( u\left( x,y\right) \right) , \end{aligned}$$

where r describes the dynamics among the leaves of the invariant foliation. Here, we have used non-bold, lower-case letters to represent \(\varvec{U}=u\) and \(\varvec{R}=r\), because they assume scalar values. Without restricting generality, we prescribe the parametrisation of the foliation by setting \(u\left( x,0\right) =x\), which implies that \(r\left( x\right) =-\lambda x\). We note that any other parametrisation for which \(\hat{u}\left( x,0\right) \) is a strictly monotone (invertible) and smooth function of x can be brought into the special parametrisation that we have just chosen, that is \(u\left( x,y\right) =\hat{u}\left( \hat{u}^{-1}\left( x,0\right) ,y\right) \). Using this parametrisation, the invariance equation then simplifies to

$$\begin{aligned} -\lambda xD_{1}u\left( x,y\right) -\mu yD_{2}u\left( x,y\right) =-\lambda u\left( x,y\right) . \end{aligned}$$
(10)

The solution of (10) is sought in the form of \(u\left( x,y\right) =xw\left( x,y\right) \), where w has to satisfy the somewhat simpler equation

$$\begin{aligned} -\lambda xD_{1}w\left( x,y\right) -\mu yD_{2}w\left( x,y\right) =0. \end{aligned}$$

Using the method of characteristics and assuming the boundary condition \(w\left( 1,y\right) =f\left( y\right) \) gives the general solution

$$\begin{aligned} u\left( x,y\right) =xf\left( x^{-\mu /\lambda }y\right) , \end{aligned}$$
(11)

where f is an unknown, continuously differentiable function with \(f\left( 0\right) =1\) due to the constraint on the parametrisation.
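The general solution (11) can be checked symbolically. The following sympy sketch (our illustration) verifies that \(u\left( x,y\right) =xf\left( x^{-\mu /\lambda }y\right) \) satisfies the invariance equation (10) for an arbitrary differentiable f:

```python
import sympy as sp

x, y, lam, mu = sp.symbols('x y lambda mu', positive=True)
f = sp.Function('f')

# General solution (11) of the invariance equation (10)
u = x * f(x**(-mu/lam) * y)

# Residual of (10): -lambda*x*D1(u) - mu*y*D2(u) - (-lambda*u)
residual = (-lam * x * sp.diff(u, x)
            - mu * y * sp.diff(u, y)
            + lam * u)
assert sp.simplify(residual) == 0   # vanishes identically for any f
```

The chain-rule terms involving \(f'\) cancel exactly, which is why f remains a free function of the characteristic variable \(x^{-\mu /\lambda }y\).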

We now assume that f is m times differentiable, such that \(f\left( x\right) =1+\sum _{k=1}^{m}a_{k}x^{k}+\mathcal {O}\left( x^{m+1}\right) \) and that \(\lambda /\mu >m\). In this case the k-th order term of f leads to a term of order \(1+k\left( 1-\mu /\lambda \right) >1+k-k/m\) in u, which is continuously differentiable if and only if \(k\le m\). This implies that if \(m<\lambda /\mu \le m+1\), function f must assume the form

$$\begin{aligned} f\left( x\right) =1+\sum _{k=1}^{m}a_{k}x^{k} \end{aligned}$$

for u to be once differentiable. This means that the foliation is non-unique and has m free parameters. Repeating the same argument but stipulating that the foliation must be m-times continuously differentiable, we find that \(f=1\), which has no free parameters, and therefore the invariant foliation becomes unique. Indeed, after differentiating (11) m times, a k-th order term in f results in a term of order \(1+k\left( 1-\mu /\lambda \right) -m>1+k-m-k/m\) in \(D^{m}u\); hence none of the terms apart from the constant one leads to an m-times differentiable u, and the only solution is \(f=1\), meaning that the unique submersion is \(u\left( x,y\right) =x\). We also note that in this example the ISF is as smooth as the vector field, that is, analytic.

The x variable represents the slow dynamics if \(\lambda <\mu \). In this case \(\lambda /\mu <1\), which means that a differentiable foliation is already unique. The result of this section is graphically illustrated in Fig. 3.

Fig. 3

Uniqueness of foliations. Dashed blue lines are trajectories of (9), red continuous lines are the contours of \(u\left( x,y\right) \) and represent the leaves of the foliation. a \(f=1+x/5\), \(\lambda =2\), \(\mu =3\), the resulting u does not define a differentiable foliation; b \(f=1+x/5\), \(\lambda =3\), \(\mu =2\), the foliation is once differentiable but not unique; c \(f=1\), \(\lambda =2\), \(\mu =3\) leads to the unique and differentiable foliation. (Color figure online)

3 Existence and uniqueness of invariant foliations

In this section we generalise the findings from the example in Sect. 2.2 and provide a sufficient condition for the existence of a unique invariant foliation, i.e., an ISF. We start with a definition.

Definition 1

The number

$$\begin{aligned} \beth _{E^{\star }}=\frac{\min _{k=1\ldots \nu }\log \left| \mu _{k}\right| }{\max _{k=1\ldots n}\log \left| \mu _{k}\right| } \end{aligned}$$

is called the ISF spectral quotient of the linear subspace \(E^{\star }\) about the origin.
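Definition 1 is simple to evaluate numerically. The sketch below (our illustration) computes \(\beth _{E^{\star }}\) for the map of Sect. 2.2 with \(\lambda =3\), \(\mu =2\), where the foliation tangent to the x-direction has spectral quotient \(\lambda /\mu =3/2\), consistent with the example needing twice differentiability for uniqueness:

```python
import numpy as np

def isf_spectral_quotient(mu_E, mu_all):
    """ISF spectral quotient of Definition 1: minimum of log|mu_k| over the
    eigenvalues associated with E*, divided by the maximum of log|mu_k| over
    the full spectrum. All |mu_k| < 1, so both logarithms are negative."""
    return min(np.log(np.abs(m)) for m in mu_E) / \
           max(np.log(np.abs(m)) for m in mu_all)

# Example of Sect. 2.2 with lambda = 3, mu = 2: eigenvalues e^{-3}, e^{-2};
# E* corresponds to the x-direction (eigenvalue e^{-3}).
beth = isf_spectral_quotient([np.exp(-3.0)], [np.exp(-3.0), np.exp(-2.0)])
assert np.isclose(beth, 1.5)   # equals lambda/mu
```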

Theorem 1

Assume that \(\max _{k=1\ldots n}\left| \mu _{k}\right| <1\) and that there exists an integer \(2\le \sigma \le r\), such that \(\beth _{E^{\star }}<\sigma \). Further assume that

$$\begin{aligned} \prod _{k=1}^{n}\mu _{k}^{m_{k}}\ne \mu _{j},\;j=1,\ldots ,\nu \end{aligned}$$
(12)

for all integers \(m_{k}\ge 0\), \(1\le k\le n\) with at least one \(m_{l}\ne 0\), \(\nu +1\le l\le n\) and with \(2\le \sum _{k=1}^{n}m_{k}\le \sigma -1\).

Then the following are true:

  1.

    In a sufficiently small neighbourhood of the origin there exists an invariant foliation \(\mathcal {F}\) tangent to the invariant linear subspace \(E^{\star }\) of the \(C^{r}\) map \(\varvec{F}\). The foliation \(\mathcal {F}\) is unique among the \(\sigma \)-times differentiable foliations and it is also \(C^{r}\) smooth.

  2.

    The conjugate dynamics of the invariant foliation \(\mathcal {F}\), given by the map \(\varvec{S}\) in Eq. (4), can be represented by a polynomial of order \(\sigma -1\). In its simplest form \(\varvec{S}\) must include terms \(\prod _{k=1}^{\nu }z_{k}^{m_{k}}\) in dimension j for which

    $$\begin{aligned} \prod _{k=1}^{\nu }\mu _{k}^{m_{k}}=\mu _{j},\;j=1,\ldots ,\nu ,\;2\le \sum _{k=1}^{\nu }m_{k}\le \sigma -1. \end{aligned}$$
    (13)

Proof

The proof is carried out in appendix A. \(\square \)

Remark 1

From the conditions of theorem 1 it follows that \(\beth _{E^{\star }}\ge 1\). In case \(\beth _{E^{\star }}=1\), an invariant foliation is unique if it is at least twice differentiable. This is, however, not a necessary condition, because, for example, for system (9) of Sect. 2.2 once differentiability already implied uniqueness.

Remark 2

In contrast to SSMs, \(D\varvec{F}\left( \varvec{0}\right) \) does not have to be invertible; only \(D\varvec{S}\left( \varvec{0}\right) \) has to be invertible, that is, \(\mu _{k}\ne 0\) for \(k=1\ldots \nu \). If one were to extend the theory to Banach spaces, where typical dynamics is not invertible (e.g., delay equations, analytic semigroups, etc.), the lack of an invertibility requirement would allow wider application of ISFs than SSMs. For example, in [17], the requirement of invertibility demanded a special choice of damping added to a beam model for an SSM to exist.

Remark 3

For simplicity of presentation, the paper focusses on equilibria. However, theorem 1 is also applicable to periodic orbits of vector fields, both autonomous and periodically forced, where \(\varvec{F}\) is the Poincaré map associated with the periodic orbit.

The proof of theorem 1 follows the same lines that Cabré et al. [7] employ. First, a low-order series expansion is carried out avoiding possible resonances. For higher-order terms, where no resonance is possible, Banach’s contraction mapping principle is applied to find a unique correction. Since the series expansion allows a number of free parameters, we also show that the choice of these parameters does not influence the geometry of the foliation, only its parametrisation. This results in a unique foliation. \(C^{r}\) smoothness follows from choosing \(\sigma =r\).
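The non-resonance condition (12) and the internal resonances (13) can be enumerated mechanically for a given spectrum. A minimal sketch (our illustration; the tolerance-based comparison reflects that numerically only near-resonances can be detected):

```python
import itertools
import numpy as np

def resonances(mu, nu, sigma, tol=1e-8):
    """List multi-indices m with 2 <= |m| <= sigma-1 for which the product
    prod_k mu_k^{m_k} coincides (within tol) with some mu_j, j <= nu.
    Indices with m_l != 0 for some l > nu violate condition (12); indices
    supported on k <= nu are internal resonances in the sense of (13)."""
    n, hits = len(mu), []
    for order in range(2, sigma):
        for m in itertools.product(range(order + 1), repeat=n):
            if sum(m) != order:
                continue
            prod = np.prod([mu[k] ** m[k] for k in range(n)])
            for j in range(nu):
                if abs(prod - mu[j]) < tol:
                    hits.append((m, j + 1))
    return hits

# A hypothetical complex pair mu_1 = conj(mu_2) on the unit circle: the cubic
# term z^2 zbar resonates with mu_1, the mechanism behind Eq. (16) in Sect. 4.
mu1 = np.exp(0.3j)
hits = resonances([mu1, np.conj(mu1)], nu=2, sigma=4)
assert ((2, 1), 1) in hits   # mu_1^2 mu_2 = mu_1 since |mu_1| = 1
```

Here \(\nu =n=2\), so only internal resonances appear; for \(n>\nu \), any hit involving \(m_{l}\ne 0\), \(l>\nu \), would violate (12).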

Theorem 1 also applies to \(C^{r}\) vector fields \(\dot{\varvec{x}}=\varvec{G}\left( \varvec{x}\right) \). Again, we assume that the origin is the equilibrium, that is \(\varvec{G}\left( \varvec{0}\right) =\varvec{0}\) and that the Jacobian \(\varvec{B}=D\varvec{G}\left( \varvec{0}\right) \) is semisimple. The eigenvalues of \(\varvec{B}\) are denoted by \(\lambda _{i}\), \(i=1,\ldots ,n\) and we have a full set of left and right eigenvectors, \(\varvec{v}_{i}^{\star }\) and \(\varvec{v}_{i}\), that satisfy \(\varvec{v}_{i}^{\star }\varvec{B}=\lambda _{i}\varvec{v}_{i}^{\star }\) and \(\varvec{B}\varvec{v}_{i}=\lambda _{i}\varvec{v}_{i}\), respectively. The invariant linear subspaces are defined as before: \(E=\mathrm {span}\left\{ \varvec{v}_{1},\ldots ,\varvec{v}_{\nu }\right\} \) and \(E^{\star }=\mathrm {span}\left\{ \varvec{v}_{1}^{\star },\ldots ,\varvec{v}_{\nu }^{\star }\right\} \).

Using the spectral mapping theorem for the equivalence \(\varvec{A}=\exp \varvec{B}\tau \), \(\tau >0\), we find that the ISF spectral quotient for a vector field is

$$\begin{aligned} \beth _{E^{\star }}=\frac{\min _{k=1\ldots \nu }\mathfrak {R}\lambda _{k}}{\max _{k=1\ldots n}\mathfrak {R}\lambda _{k}}. \end{aligned}$$

Due to the equivalence between discrete-time dynamics and vector fields, the following corollary is a direct consequence of theorem 1.

Corollary 1

Assume that \(\max _{k=1\ldots n}\mathfrak {R}\lambda _{k}<0\) and that there exists an integer \(2\le \sigma \le r\), such that \(\beth _{E^{\star }}<\sigma \). Further assume that

$$\begin{aligned} \sum _{k=1}^{n}m_{k}\lambda _{k}\ne \lambda _{j},\;j=1,\ldots ,\nu \end{aligned}$$
(14)

for all integers \(m_{k}\ge 0\), \(1\le k\le n\) with at least one \(m_{l}\ne 0\), \(\nu +1\le l\le n\) and with \(2\le \sum _{k=1}^{n}m_{k}\le \sigma -1\).

Then the following are true:

  1.

    In a sufficiently small neighbourhood of the origin there exists an invariant foliation \(\mathcal {F}\) tangent to the invariant linear subspace \(E^{\star }\) of the \(C^{r}\) vector field \(\varvec{G}\). The foliation \(\mathcal {F}\) is unique among the \(\sigma \)-times differentiable foliations and it is also \(C^{r}\) smooth.

  2.

    The conjugate dynamics of the invariant foliation \(\mathcal {F}\), given by the vector field \(\varvec{R}\) in Eq. (8), can be represented as a polynomial of order \(\sigma -1\). In its simplest form \(\varvec{R}\) must include terms \(\prod _{k=1}^{\nu }z_{k}^{m_{k}}\) in dimension j for which

    $$\begin{aligned} \sum _{k=1}^{\nu }m_{k}\lambda _{k}=\lambda _{j},\;j=1,\ldots ,\nu ,\;2\le \sum _{k=1}^{\nu }m_{k}\le \sigma -1. \end{aligned}$$
    (15)

Definition 2

We say that the invariant foliation has an internal resonance if there exist non-negative integers \(m_{k}\), \(k=1,\ldots ,\nu \) for which (13) (or (15) for vector fields) holds.

4 Fitting a codimension-two ISF to data

In this section we outline how to find the submersion \(\varvec{U}\) and conjugate map \(\varvec{S}\) from a time series without identifying the map \(\varvec{F}\) first. The procedure is based on the proof of theorem 1 in appendix A, which uses normalising conditions to find a unique solution of the invariance equation (4). Here, we make the normalising conditions applicable to a wide class of representations of the submersion \(\varvec{U}\) and conjugate map \(\varvec{S}\), and not just polynomials.

4.1 Normalising the solution of the invariance equation

The construction of the fitting process is centred around near internal resonances. We assume a pair of complex conjugate eigenvalues \(\mu _{1}=\overline{\mu }_{2}\) corresponding to the invariant linear subspace \(E^{\star }\). If the dynamics is slow on the ISF compared to the rest of the system, we have \(\left| \mu _{1}\right| \approx 1\), which implies that

$$\begin{aligned} \left. \begin{array}{rl} \mu _{1} &{} \approx \mu _{1}^{p+1}\mu _{2}^{p}\\ \mu _{2} &{} \approx \mu _{1}^{p}\mu _{2}^{p+1} \end{array}\right\} \end{aligned}$$
(16)

for integers \(1\le p<\sigma \). According to theorem 1, we can choose to represent the dynamics on the ISF in complex coordinates as

$$\begin{aligned} \tilde{\varvec{S}}\left( z,\overline{z}\right) =\begin{pmatrix}\begin{array}{l} \mu _{1}z+\sum _{p=1}^{\left\lfloor \sigma /2\right\rfloor }a_{p}z^{p+1}\overline{z}^{p}\\ \mu _{2}\overline{z}+\sum _{p=1}^{\left\lfloor \sigma /2\right\rfloor }\overline{a}_{p}z^{p}\overline{z}^{p+1} \end{array}\end{pmatrix}, \end{aligned}$$
(17)

where \(z,a_{p}\in \mathbb {C}\). The choice of terms in (17) avoids diverging terms in the submersion \(\varvec{U}\) when \(\left| \mu _{1}\right| \approx 1\) as illustrated by formula (63) in the proof of theorem 1. Using the transformation \(z=z_{1}+iz_{2}\), with \(z_{1},z_{2}\in \mathbb {R}\), the dynamics on the ISF (17) can be written in real coordinates as

$$\begin{aligned}&\hat{\varvec{S}}\left( z_{1},z_{2}\right) \nonumber \\&\quad =\begin{pmatrix}\begin{array}{l} z_{1}\sum _{p=0}^{\left\lfloor \sigma /2\right\rfloor }b_{p}\left( z_{1}^{2}+z_{2}^{2}\right) ^{p}-z_{2}\sum _{p=0}^{\left\lfloor \sigma /2\right\rfloor }c_{p}\left( z_{1}^{2}+z_{2}^{2}\right) ^{p}\\ z_{1}\sum _{p=0}^{\left\lfloor \sigma /2\right\rfloor }c_{p}\left( z_{1}^{2}+z_{2}^{2}\right) ^{p}+z_{2}\sum _{p=0}^{\left\lfloor \sigma /2\right\rfloor }b_{p}\left( z_{1}^{2}+z_{2}^{2}\right) ^{p} \end{array}\end{pmatrix},\nonumber \\ \end{aligned}$$
(18)

where \(b_{0}=\mathfrak {R}\mu _{1}\), \(c_{0}=\mathfrak {I}\mu _{1}\) and \(b_{p}=\mathfrak {R}a_{p}\), \(c_{p}=\mathfrak {I}a_{p}\), \(p=1,\ldots ,\left\lfloor \sigma /2\right\rfloor \). To generalise even further, and allow the limit \(\left| \mu _{1}\right| \rightarrow 1\), we can also write that

$$\begin{aligned} \varvec{S}\left( z_{1},z_{2}\right) =\begin{pmatrix}\begin{array}{l} z_{1}f_{r}\left( z_{1}^{2}+z_{2}^{2}\right) -z_{2}f_{i}\left( z_{1}^{2}+z_{2}^{2}\right) \\ z_{1}f_{i}\left( z_{1}^{2}+z_{2}^{2}\right) +z_{2}f_{r}\left( z_{1}^{2}+z_{2}^{2}\right) \end{array}\end{pmatrix}, \end{aligned}$$
(19)

where \(f_{r}\) and \(f_{i}\) are unknown functions. We note that theorem 1 does not cover the case \(\left| \mu _{1}\right| =1\); however, the invariance equation can be solved by the asymptotic expansion described in appendix A.1 up to any order of accuracy even when \(\left| \mu _{1}\right| =1\). This suggests that the invariance equation can also be solved numerically up to any order of accuracy, when (12) holds and near internal resonances are taken into account, even though the existence of a unique solution is not known.
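The representation (19) is straightforward to implement. The sketch below (our illustration, with hypothetical coefficients) also confirms the structural property exploited in (20): the image radius depends on \(z_{1},z_{2}\) only through \(r^{2}=z_{1}^{2}+z_{2}^{2}\):

```python
import numpy as np

def S(z1, z2, fr, fi):
    """Normal-form map (19): a radius-dependent rotation-scaling,
    with fr, fi functions of r^2 = z1^2 + z2^2."""
    r2 = z1**2 + z2**2
    return (z1 * fr(r2) - z2 * fi(r2),
            z1 * fi(r2) + z2 * fr(r2))

# Hypothetical coefficients: linear part mu_1 = 0.99 e^{0.2i} plus one
# cubic normal-form term (p = 1 in Eq. (18)).
fr = lambda r2: 0.99 * np.cos(0.2) - 0.05 * r2
fi = lambda r2: 0.99 * np.sin(0.2) + 0.01 * r2

z1, z2 = 0.3, 0.4                       # so r = 0.5
w1, w2 = S(z1, z2, fr, fi)
# Radial dynamics decouples, as in (20): new radius = r * sqrt(fr^2 + fi^2).
r2 = z1**2 + z2**2
assert np.isclose(np.hypot(w1, w2),
                  np.sqrt(r2) * np.hypot(fr(r2), fi(r2)))
```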

The dynamics on the ISF can be further analysed by introducing the polar parametrisation \(z_{1}=r\cos \theta \) and \(z_{2}=r\sin \theta \). In these coordinates Eq. (19) is transformed into

$$\begin{aligned} \breve{\varvec{S}}\left( r,\theta \right) =\begin{pmatrix}\begin{array}{l} {\displaystyle r\sqrt{f_{r}^{2}\left( r^{2}\right) +f_{i}^{2}\left( r^{2}\right) }}\\ {\displaystyle \theta +\tan ^{-1}\frac{f_{i}\left( r^{2}\right) }{f_{r}\left( r^{2}\right) }} \end{array}\end{pmatrix}. \end{aligned}$$
(20)

For a similar analysis see [27, Sect. 6]. The radial dynamics in (20) is decoupled from the angular motion; therefore \(r=0\) is a fixed point, and all solutions \(r>0\) of \(f_{r}^{2}\left( r^{2}\right) +f_{i}^{2}\left( r^{2}\right) =1\) represent periodic orbits. We assume that each iteration of \(\varvec{F}\) and of \(\breve{\varvec{S}}\) accounts for a period of time T and therefore the instantaneous angular frequency of rotation about the fixed point is given by

$$\begin{aligned} \omega _{E^{\star }}\left( r\right) =T^{-1}\tan ^{-1}\frac{f_{i}\left( r^{2}\right) }{f_{r}\left( r^{2}\right) }. \end{aligned}$$
(21)

We also define the instantaneous damping ratio by

$$\begin{aligned} \zeta _{E^{\star }}\left( r\right) =-\frac{\log \sqrt{f_{r}^{2}\left( r^{2}\right) +f_{i}^{2}\left( r^{2}\right) }}{T\omega _{E^{\star }}\left( r\right) }, \end{aligned}$$
(22)

which agrees with the damping ratio of the linear dynamics about the equilibrium at \(r=0\). Unfortunately we cannot easily determine what vibration amplitude r represents, because there is no unique closed curve in the phase space that is mapped by the submersion \(\varvec{U}\) to the circle \(r\times [0,2\pi )\). This means that we cannot define a backbone curve in the same way as in [27, Sect. 6]. Instead, we define a surrogate for the amplitude in Sect. 5.3.
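To illustrate Eqs. (21) and (22), the following minimal sketch evaluates the instantaneous frequency and damping ratio for hypothetical low-order polynomials \(f_{r}\) and \(f_{i}\); the coefficients and the sampling period \(T\) are illustrative assumptions, not values identified from data.

```python
import math

# Illustrative polynomials in r^2; in practice f_r and f_i are identified
# from data, with f_r(0) + i f_i(0) approximating the eigenvalue mu_1.
def f_r(r2):
    return 0.99 + 0.01 * r2

def f_i(r2):
    return 0.10 - 0.02 * r2

T = 0.01  # assumed sampling period of the map F

def omega(r):
    # instantaneous angular frequency, Eq. (21)
    return math.atan2(f_i(r ** 2), f_r(r ** 2)) / T

def zeta(r):
    # instantaneous damping ratio, Eq. (22)
    amp = math.sqrt(f_r(r ** 2) ** 2 + f_i(r ** 2) ** 2)
    return -math.log(amp) / (T * omega(r))
```

Since \(\sqrt{f_{r}^{2}(0)+f_{i}^{2}(0)}<1\) in this sketch, the damping ratio at the origin is positive, consistent with a stable equilibrium.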

Similarly, the submersion of the ISF needs to be normalised, because in the case of an internal resonance it is not fully specified. We are now looking for a \(\tilde{\varvec{U}}\), which together with \(\tilde{\varvec{S}}\) satisfies the invariance equation (4) and also takes into account the near internal resonances (16). In order to uncover the constraints on \(\tilde{\varvec{U}}\) that eliminate the terms corresponding to near internal resonances, we write that

$$\begin{aligned} \tilde{\varvec{U}}\left( \varvec{x}\right) =\varvec{U}\left( \varvec{v}_{1}^{\star }\varvec{x},\varvec{v}_{2}^{\star }\varvec{x},\ldots ,\varvec{v}_{n}^{\star }\varvec{x}\right) , \end{aligned}$$

where \(\varvec{U}=\left( U_{1},U_{2}\right) ^{T}\) and \(U_{1},U_{2}\) form a complex conjugate pair, which have real and imaginary parts, such that \(U_{1}=U_{r}+iU_{i}\). Note that \(\varvec{U}\) is the same submersion that is used in appendix A, where \(\varvec{F}\) was assumed to have a diagonal Jacobian at the origin. Similarly, we decompose \(\tilde{\varvec{U}}=\left( \tilde{U}_{1},\tilde{U}_{2}\right) ^{T}\) and \(\tilde{U}_{1}=\tilde{U}_{r}+i\tilde{U}_{i}\) and define \(\hat{\varvec{U}}=\left( \hat{U}_{1},\hat{U}_{2}\right) \overset{ def }{=}\left( \tilde{U}_{r},\tilde{U}_{i}\right) \), which together with \(\hat{\varvec{S}}\) or \(\varvec{S}\) of Eqs. (18) and (19), respectively, must satisfy the invariance equation (4). Due to our assumptions, the left and right eigenvectors satisfy \(\varvec{v}_{j}^{\star }\varvec{v}_{k}=\delta _{jk}\), where \(\delta _{jk}\) is the Kronecker delta; hence, we can write that

$$\begin{aligned} \tilde{\varvec{U}}\left( \sum _{j=1}^{n}\varvec{v}_{j}z_{j}\right) =\varvec{U}\left( z_{1},z_{2},\ldots ,z_{n}\right) . \end{aligned}$$

As in appendix A.1, we recognise that the terms corresponding to internal resonances are

$$\begin{aligned} z_{1}^{p+1}z_{2}^{p}\;\text {and}\;z_{1}^{p}z_{2}^{p+1},\;p\ge 1 \end{aligned}$$

in \(U_{1}\) and \(U_{2}\), respectively, whose coefficients need to vanish as per Eq. (62). To remove these terms, we set \(z_{1}=r\mathrm {e}^{i\theta }\), \(z_{2}=r\mathrm {e}^{-i\theta }\), which turns the internally resonant terms into \(r^{2p+1}\mathrm {e}^{i\theta }\) and \(r^{2p+1}\mathrm {e}^{-i\theta }\); these are the only terms with \(\mathrm {e}^{i\theta }\) and \(\mathrm {e}^{-i\theta }\) components in the Fourier expansion of \(\varvec{U}\). In particular, for \(U_{1}\) only the linear term (\(r\mathrm {e}^{i\theta }\) for \(p=0\)) is allowed to contribute to the coefficient of \(\mathrm {e}^{i\theta }\), which means that the first Fourier coefficient must be

$$\begin{aligned} \int _{0}^{2\pi }\mathrm {e}^{-i\theta }\cdot U_{1}\left( r\mathrm {e}^{i\theta },r\mathrm {e}^{-i\theta },0,\ldots ,0\right) \mathrm {d}\theta =2\pi r, \end{aligned}$$
(23)

where we assumed the normalisation \(D_{k}U_{1}\left( 0,\ldots ,0\right) =\delta _{1k}\). Since \(U_{1}\) and \(U_{2}\) form a complex conjugate pair, there is no need for a similar condition on \(U_{2}\). Instead, we expand the constraint (23) using the real valued submersion \(\hat{\varvec{U}}\), such that the final form of the constraint becomes

$$\begin{aligned} \left. \begin{array}{rl} {\displaystyle \int _{0}^{2\pi }\hat{U}_{1}\left( \varvec{v}_{r}r\cos \theta -\varvec{v}_{i}r\sin \theta \right) \cos \theta +\hat{U}_{2}\left( \varvec{v}_{r}r\cos \theta -\varvec{v}_{i}r\sin \theta \right) \sin \theta \mathrm {d}\theta } &{} =2\pi r\\ {\displaystyle \int _{0}^{2\pi }\hat{U}_{2}\left( \varvec{v}_{r}r\cos \theta -\varvec{v}_{i}r\sin \theta \right) \cos \theta -\hat{U}_{1}\left( \varvec{v}_{r}r\cos \theta -\varvec{v}_{i}r\sin \theta \right) \sin \theta \mathrm {d}\theta } &{} =0 \end{array}\right\} , \end{aligned}$$
(24)

where \(\varvec{v}_{r}=\mathfrak {R}\varvec{v}_{1}\) and \(\varvec{v}_{i}=\mathfrak {I}\varvec{v}_{1}\). In what follows the constraints (24) will turn into penalty terms added to the loss function of the optimisation problem, whose minimum is the approximate pair of functions \(\varvec{U}\) and \(\varvec{S}\).

4.2 The optimisation problem

Let us assume a set of data points, given by \(\left\{ \left( \varvec{x}_{k},\varvec{y}_{k}\right) ,\,k=1,\ldots ,N\right\} \), with the constraint that \(\varvec{y}_{k}=\varvec{F}\left( \varvec{x}_{k}\right) \). In practice, \(\left( \varvec{x}_{k},\varvec{y}_{k}\right) \) may be part of a set of trajectories, such that \(\varvec{y}_{k}=\varvec{x}_{k+1}\) for ranges of subsequent indices \(K_{j}\le k<K_{j+1}\), \(1=K_{1}<K_{2}<\cdots <K_{M}=N\). We also assume that there is an approximate knowledge of the Jacobian of \(\varvec{F}\) about the equilibrium. To find the Jacobian one can use standard linear regression that fits a linear model to the data in the neighbourhood of the equilibrium [6].

We further assume parametric representations of the submersion \(\varvec{U}\) and the map \(\varvec{S}\), such that \(\varvec{U}\left( \varvec{0}\right) =\varvec{0}\) and \(\varvec{S}\) has the form of (19). In particular, we use the notation \(\varvec{U}\left( \varvec{x}\right) =\varvec{U}\left( \varvec{x};\varvec{\varTheta }_{\varvec{U}}\right) \) and \(\varvec{S}\left( \varvec{z}\right) =\varvec{S}\left( \varvec{z};\varvec{\varTheta }_{\varvec{S}}\right) \), where \(\varvec{\varTheta }_{\varvec{U}}\) and \(\varvec{\varTheta }_{\varvec{S}}\) are the parameters we are looking for. Functions \(\varvec{U}\) and \(\varvec{S}\) must satisfy the invariance equation (4) at each point along the time-series with the smallest possible residual error \(\varvec{r}_{k}\), that is

$$\begin{aligned} \varvec{U}\left( \varvec{y}_{k};\varvec{\varTheta }_{\varvec{U}}\right)&=\varvec{S}\left( \varvec{U}\left( \varvec{x}_{k};\varvec{\varTheta }_{\varvec{U}}\right) ;\varvec{\varTheta }_{\varvec{S}}\right) +\varvec{r}_{k}. \end{aligned}$$

An obvious strategy to minimise the residual \(\varvec{r}_{k}\) is to use the least-squares method. In particular, we use the scaled norm (70) from the proof of theorem 1 in appendix A.2, which guarantees a unique solution. The loss term from the invariance equation is then

$$\begin{aligned}&L_{i}\left( \varvec{\varTheta }_{\varvec{U}},\varvec{\varTheta }_{\varvec{S}}\right) \nonumber \\&\quad =\sum _{k=1}^{N}\left| \varvec{x}_{k}\right| ^{-2\sigma }\left| \varvec{U}\left( \varvec{y}_{k};\varvec{\varTheta }_{\varvec{U}}\right) -\varvec{S}\left( \varvec{U}\left( \varvec{x}_{k};\varvec{\varTheta }_{\varvec{U}}\right) ;\varvec{\varTheta }_{\varvec{S}}\right) \right| ^{2}.\nonumber \\ \end{aligned}$$
(25)
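As a minimal sketch (not the paper's Julia implementation), the scaled least-squares loss (25) can be written for generic callables; the helper name `invariance_loss` and the linear test data below are our own illustrative assumptions.

```python
import numpy as np

def invariance_loss(U, S, theta_U, theta_S, xs, ys, sigma):
    # L_i of Eq. (25): |x_k|^{-2 sigma} weighted residual of the
    # invariance equation U(y_k) = S(U(x_k))
    total = 0.0
    for x, y in zip(xs, ys):
        r = U(y, theta_U) - S(U(x, theta_U), theta_S)
        total += np.linalg.norm(x) ** (-2 * sigma) * np.dot(r, r)
    return total

# Sanity check on linear data y_k = A x_k with linear U and S: the
# residual vanishes at the true parameters, so the loss is numerically zero.
rng = np.random.default_rng(0)
A = 0.99 * np.array([[np.cos(0.3), -np.sin(0.3)],
                     [np.sin(0.3),  np.cos(0.3)]])
xs = rng.normal(size=(100, 2))
ys = xs @ A.T
U = lambda x, th: th @ x
S = lambda z, th: th @ z
assert invariance_loss(U, S, np.eye(2), A, xs, ys, sigma=3) < 1e-16
```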

We also need to ensure that the normalising conditions (24) are satisfied. We choose a two-dimensional mesh in polar coordinates, that is \(r_{j}=r_{\mathrm {max}}j/N_{r}\), \(\theta _{k}=2\pi k/N_{\theta }\) and \(\varvec{v}_{jk}=\varvec{v}_{r}r_{j}\cos \theta _{k}-\varvec{v}_{i}r_{j}\sin \theta _{k}\) and define

$$\begin{aligned}&L_{n}\left( \varvec{\varTheta }_{\varvec{U}}\right) =\sum _{j=1}^{N_{r}}\left( r_{j}^{-1}\sum _{k=1}^{N_{\theta }}\left( U_{1}\left( \varvec{v}_{jk};\varvec{\varTheta }_{\varvec{U}}\right) \cos \theta _{k}\right. \right. \nonumber \\&\qquad \qquad \left. \left. +U_{2}\left( \varvec{v}_{jk};\varvec{\varTheta }_{\varvec{U}}\right) \sin \theta _{k}\right) -\frac{N_{\theta }}{2}\right) ^{2}\nonumber \\&\quad +\sum _{j=1}^{N_{r}}\left( r_{j}^{-1}\sum _{k=1}^{N_{\theta }}\left( U_{2}\left( \varvec{v}_{jk};\varvec{\varTheta }_{\varvec{U}}\right) \cos \theta _{k}\right. \right. \nonumber \\&\qquad \qquad \left. \left. -U_{1}\left( \varvec{v}_{jk};\varvec{\varTheta }_{\varvec{U}}\right) \sin \theta _{k}\right) \right) ^{2}. \end{aligned}$$
(26)

The value of \(r_{\mathrm {max}}\) is proportional to \(\max _{k}\left| \varvec{x}_{k}\right| \). Our version of the least-squares optimisation problem can be written as

$$\begin{aligned} \varvec{\varTheta }_{\varvec{U}},\varvec{\varTheta }_{\varvec{S}}=\arg \min \left( L_{i}\left( \varvec{\varTheta }_{\varvec{U}},\varvec{\varTheta }_{\varvec{S}}\right) +\beta L_{n}\left( \varvec{\varTheta }_{\varvec{U}}\right) \right) , \end{aligned}$$
(27)

where \(\beta >0\) is sufficiently large so that \(\varvec{U}\) continues to satisfy the normalising conditions (24). The optimisation must be initialised such that

$$\begin{aligned} D_{1}\varvec{U}\left( \varvec{0};\varvec{\varTheta }_{\varvec{U}} \right) \approx \begin{pmatrix}\begin{array}{l} \mathfrak {R}\varvec{v}_{1}^{\star }\\ \mathfrak {I}\varvec{v}_{1}^{\star } \end{array}\end{pmatrix}\;\text {and}\;f_{r}\left( 0\right) \approx \mathfrak {R}\mu _{1},\;f_{i}\left( 0\right) \approx \mathfrak {I}\mu _{1}. \end{aligned}$$
(28)

Remark 4

An alternative to the normalising conditions (24) is to fix the norm of \(D_{1}\varvec{U}\left( \varvec{0};\varvec{\varTheta }_{\varvec{U}}\right) \), by defining

$$\begin{aligned} L_{n}\left( \varvec{U}\right) =\left( \left\| D_{1}\varvec{U}\left( \varvec{0};\varvec{\varTheta }_{\varvec{U}} \right) \right\| ^{2}-1\right) ^{2}. \end{aligned}$$
(29)

In this case, the optimisation (27) will not yield a unique result for \(\varvec{\varTheta }_{\varvec{U}},\varvec{\varTheta }_{\varvec{S}}\); however, according to theorem 1, the foliation defined by the resulting \(\varvec{U}\) still represents the unique foliation. The non-uniqueness comes from the possible choices of terms in \(\varvec{U}\) and \(\varvec{S}\) relative to each other at near internal resonances, as described in appendix A.1.

4.3 Polynomial representation for optimisation

Here we use a polynomial representation to carry out the optimisation given by Eq. (27). We represent the unknown functions as polynomials of finite order \(\alpha \), such that

$$\begin{aligned} \varvec{U}\left( \varvec{x};\varvec{U}^{\varvec{m}_{1}},\ldots ,\varvec{U}^{\varvec{m}_{\#\left[ n,\alpha \right] }}\right)&=\sum _{\varvec{m}\in M_{n,\alpha }}\varvec{U}^{\varvec{m}}\varvec{x}^{\varvec{m}},\\ \varvec{S}\left( \varvec{z};\varvec{S}^{\varvec{m}_{1}},\ldots ,\varvec{S}^{\varvec{m}_{\#\left[ 2,\alpha \right] }}\right)&=\sum _{\varvec{m}\in M_{2,\alpha }}\varvec{S}^{\varvec{m}}\varvec{z}^{\varvec{m}}, \end{aligned}$$

where the finite set is \(M_{n,\alpha }=\left\{ \varvec{m}\in \mathbb {N}^{n}:1\le \sum _{k=1}^{n}m_{k}\le \alpha \right\} \), the unique elements of \(M_{n,\alpha }\) are denoted by \(\varvec{m}_{1},\varvec{m}_{2},\ldots ,\varvec{m}_{\#\left[ n,\alpha \right] }\) and \(\#\left[ n,\alpha \right] =\left( {\begin{array}{c}n+\alpha \\ n\end{array}}\right) -1\) is the cardinality of \(M_{n,\alpha }\). The scalar values \(\varvec{x}^{\varvec{m}}\) are defined as

$$\begin{aligned} \varvec{x}^{\varvec{m}}=x_{1}^{m_{1}}\cdots x_{n}^{m_{n}} \end{aligned}$$
(30)

and \(\varvec{U}^{\varvec{m}},\varvec{S}^{\varvec{m}}\in \mathbb {R}^{2}\). We further define that \(\left| \varvec{m}\right| =\sum _{k=1}^{n}m_{k}\). The multi-index notation implies that coefficients of linear terms have indices given by unit vectors

$$\begin{aligned} {\varvec{e}}_{k}=\left( \underset{1}{0},\ldots , \underset{k-1}{0},\underset{k}{1},\underset{k+1}{0}\ldots ,\underset{n{\text { or }}\nu }{0}\right) . \end{aligned}$$
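For concreteness, the multi-index set \(M_{n,\alpha }\), its cardinality \(\#\left[ n,\alpha \right] \) and the monomials (30) can be enumerated as follows; this is a small sketch with helper names of our own choosing.

```python
from itertools import product
from math import comb, prod

def multi_indices(n, alpha):
    # M_{n, alpha}: exponent vectors m with 1 <= |m| <= alpha
    return [m for m in product(range(alpha + 1), repeat=n)
            if 1 <= sum(m) <= alpha]

def monomial(x, m):
    # the scalar x^m as defined in Eq. (30)
    return prod(xi ** mi for xi, mi in zip(x, m))

n, alpha = 3, 4
M = multi_indices(n, alpha)
# cardinality #[n, alpha] = C(n + alpha, n) - 1
assert len(M) == comb(n + alpha, n) - 1
assert monomial((2.0, 3.0, 1.0), (1, 2, 0)) == 18.0
```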

Matrices are consequently denoted as multi-indexed vectors: the element of a matrix in the j-th row and k-th column is written as \(U_{j}^{\varvec{e}_{k}}\), and the k-th column vector of a matrix is simply \(\varvec{U}^{\varvec{e}_{k}}\). In order to arrive at the form of \(\varvec{S}\) given by (18), we need to set

$$\begin{aligned} \left. \begin{array}{ll} S_{1}^{\left( 1+2p,2\left( k-p\right) \right) } &{} ={\displaystyle \left( {\begin{array}{c}k\\ p\end{array}}\right) b_{k}}\\ S_{1}^{\left( 2p,1+2\left( k-p\right) \right) } &{} ={\displaystyle -\left( {\begin{array}{c}k\\ p\end{array}}\right) c_{k}}\\ S_{2}^{\left( 1+2p,2\left( k-p\right) \right) } &{} ={\displaystyle \left( {\begin{array}{c}k\\ p\end{array}}\right) c_{k}}\\ S_{2}^{\left( 2p,1+2\left( k-p\right) \right) } &{} ={\displaystyle \left( {\begin{array}{c}k\\ p\end{array}}\right) b_{k}} \end{array}\right\} \quad 0\le k\le \left\lfloor \alpha /2\right\rfloor ,\;0\le p\le k. \end{aligned}$$

Finally, as per the notation of Sect. 4.2, the parameter arrays are given by

$$\begin{aligned} \varvec{\varTheta }_{\varvec{S}}&=\left( b_{0},\ldots b_{\left\lfloor \alpha /2\right\rfloor },c_{0},\ldots c_{\left\lfloor \alpha /2\right\rfloor }\right) ,\\ \varvec{\varTheta }_{\varvec{U}}&=\left( \varvec{U}^{\varvec{m}_{1}},\ldots ,\varvec{U}^{\varvec{m}_{\#\left[ n,\alpha \right] }}\right) . \end{aligned}$$

The optimisation is initialised using the eigenvalues and left eigenvectors of the Jacobian at the origin,

$$\begin{aligned} U_{1}^{\varvec{e}_{k}}=\left[ \mathfrak {R}\varvec{v}_{1}^{\star }\right] _{k},\;U_{2}^{\varvec{e}_{k}}=\left[ \mathfrak {I}\varvec{v}_{1}^{\star }\right] _{k},\;b_{0}=\mathfrak {R}\mu _{1},\;c_{0}=\mathfrak {I}\mu _{1}, \end{aligned}$$
(31)

while the rest of the parameters can be initialised either randomly or to zero. During the optimisation the values (31) are allowed to change to fit the data; the initialisation merely ensures that the ISF converges to the chosen linear subspace \(E^{\star }.\)

The polynomial representation of the objective function in the optimisation problem (27) can be written as

$$\begin{aligned}&loss \left( \varvec{\varTheta }_{\varvec{U}},\varvec{\varTheta }_{\varvec{S}}\right) =\beta L_{n}\left( \varvec{\varTheta }_{\varvec{U}}\right) \nonumber \\&\quad +\sum _{k=1}^{N}\left| \varvec{x}_{k}\right| ^{-2\sigma }\left| \sum _{\varvec{m}\in M_{n,\alpha }}\varvec{U}^{\varvec{m}}\varvec{y}_{k}^{\varvec{m}}-\sum _{\varvec{p}\in M_{2,\alpha }}\varvec{S}^{\varvec{p}}\left( \sum _{\varvec{m}\in M_{n,\alpha }}\varvec{U}^{\varvec{m}}\varvec{x}_{k}^{\varvec{m}}\right) ^{\varvec{p}}\right| ^{2},\nonumber \\ \end{aligned}$$
(32)

where

$$\begin{aligned}&L_{n}\left( \varvec{\varTheta }_{\varvec{U}}\right) =\sum _{j=1}^{N_{r}}\left( \sum _{\varvec{m}\in M_{n,\alpha }}\sum _{k=1}^{N_{\theta }}\left( U_{1}^{\varvec{m}}\varvec{c}_{jk}^{\varvec{m}}+U_{2}^{\varvec{m}}\varvec{s}_{jk}^{\varvec{m}}\right) -\frac{N_{\theta }}{2}\right) ^{2}\nonumber \\&\quad +\sum _{j=1}^{N_{r}}\left( \sum _{\varvec{m}\in M_{n,\alpha }}\sum _{k=1}^{N_{\theta }}\left( U_{2}^{\varvec{m}}\varvec{c}_{jk}^{\varvec{m}}-U_{1}^{\varvec{m}}\varvec{s}_{jk}^{\varvec{m}}\right) \right) ^{2} \end{aligned}$$
(33)

and

$$\begin{aligned} \varvec{c}_{jk}^{\varvec{m}}&=r_{j}^{\left| \varvec{m}\right| -1}\left( \varvec{v}_{r}\cos \theta _{k}-\varvec{v}_{i}\sin \theta _{k}\right) ^{\varvec{m}}\cos \theta _{k},\\ \varvec{s}_{jk}^{\varvec{m}}&=r_{j}^{\left| \varvec{m}\right| -1}\left( \varvec{v}_{r}\cos \theta _{k}-\varvec{v}_{i}\sin \theta _{k}\right) ^{\varvec{m}}\sin \theta _{k}. \end{aligned}$$

The penalty term \(L_{n}\) uses the approximate right eigenvectors \(\varvec{v}_{r}\pm i\varvec{v}_{i}\), which do not adapt during the optimisation. We do not expect that this causes inaccuracies, because this is just one possible way of normalising the submersion \(\varvec{U}\) which still represents the unique ISF. An inaccuracy of the a priori estimated eigenvectors \(\varvec{v}_{r}\pm i\varvec{v}_{i}\), however, will affect the conjugate map \(\varvec{S}\).

Another issue is that for a large \(\beta \) the penalty term can overshadow the actual loss function \(L_{i}\), which may cause inaccuracies. On the other hand, for smaller values of \(\beta \) the constraint (24) may not hold accurately. It is, however, much less important to satisfy the constraint accurately than to find the minimum of \(L_{i}\), because the constraint (24) only affects the parametrisation of the ISF and not its geometry. Therefore the value of \(\beta \) can be limited so that the minimum of the penalised loss function remains close to the minimum of \(L_{i}\). Alternatively, one can use constrained optimisation, such as sequential quadratic programming [20], to take (24) into account with full numerical accuracy.

From experience with other model identification studies, we believe that accuracy can be improved if not just two consecutive points, but multiple points along a trajectory are taken into account. This leads to a so-called multiple shooting technique [4], which will be part of a further investigation.

In our implementation we use the Optim.jl [19] package of the Julia programming language and choose the BFGS method to find an optimal solution. This only requires the gradient of \( loss \), which can be calculated by automatic differentiation.

5 Analysis of ISFs

5.1 Reconstructing the dynamics

Two or more carefully selected ISFs can act as a nonlinear coordinate system of the state space and therefore can be used to reconstruct the dynamics of \(\varvec{F}\). Let \(E_{j}^{\star }\), \(j=1,\ldots ,q\) be invariant linear subspaces, satisfying the conditions of theorem 1, such that

$$\begin{aligned} \left. \begin{array}{rl} E_{j}^{\star }\cap E_{k}^{\star } &{} =\left\{ \mathbf {0}\right\} ,\;\forall j\ne k\\ E_{1}^{\star }\oplus E_{2}^{\star }\cdots \oplus E_{q}^{\star } &{} =\mathbb {R}^{n} \end{array}\right\} \end{aligned}$$
(34)

and let the corresponding submersions of the ISFs be denoted by \(\varvec{U}^{j}\). Further, assume trajectories \(\varvec{x}_{k}\), and \(\varvec{z}_{j,k}\), \(k\in \mathbb {N}\) satisfying \(\varvec{x}_{k+1}=\varvec{F}\left( \varvec{x}_{k}\right) \), \(\varvec{z}_{j,k+1}=\varvec{S}^{j}\left( \varvec{z}_{j,k}\right) \) with matching initial conditions, that is, \(\varvec{z}_{j,0}=\varvec{U}^{j}\left( \varvec{x}_{0}\right) \). Because of the invariance of ISFs, the trajectories satisfy the equation

$$\begin{aligned} \begin{pmatrix}\varvec{z}_{1,k}\\ \vdots \\ \varvec{z}_{q,k} \end{pmatrix}=\widehat{\varvec{U}}\left( \varvec{x}_{k}\right) \overset{ def }{=}\begin{pmatrix}\varvec{U}^{1}\left( \varvec{x}_{k}\right) \\ \vdots \\ \varvec{U}^{q}\left( \varvec{x}_{k}\right) \end{pmatrix},\;k=1,2,\ldots \;. \end{aligned}$$
(35)

Due to our assumptions about \(E_{j}^{\star }\), we can invert \(\widehat{\varvec{U}}\) in a neighbourhood of the origin, and therefore there exists a function \(\varvec{h}\), such that

$$\begin{aligned} \widehat{\varvec{U}}\left( \varvec{h}\left( \varvec{z}_{1},\ldots ,\varvec{z}_{q}\right) \right) =\begin{pmatrix}\varvec{z}_{1}\\ \vdots \\ \varvec{z}_{q} \end{pmatrix}. \end{aligned}$$
(36)

Using \(\varvec{h}\), the equivalence of the trajectories is expressed as

$$\begin{aligned} \varvec{x}_{k}=\varvec{h}\left( \varvec{z}_{1,k},\ldots ,\varvec{z}_{q,k}\right) ,\;k=1,2,\ldots \;. \end{aligned}$$
(37)

Equation (37) can be used to reconstruct the full dynamics of the system from the lower-order conjugate dynamics of the ISFs. The equivalence is the same between the trajectories of the vector fields \(\varvec{G}\), \(\varvec{R}^{j}\), except that the subscript k is replaced by time t.

Function \(\varvec{h}\) can be obtained by a fixed point iteration in a small neighbourhood of the origin. Let us denote \(\varvec{C}=D\widehat{\varvec{U}}\left( \varvec{0}\right) \) and decompose \(\widehat{\varvec{U}}\left( \varvec{x}\right) =\varvec{C}\varvec{x}+\widehat{\varvec{U}}_{N}\left( \varvec{x}\right) \), such that \(\widehat{\varvec{U}}_{N}\left( \varvec{x}\right) =\mathcal {O}\left( \left| \varvec{x}\right| ^{2}\right) \). Due to our assumptions (34), the matrix \(\varvec{C}\) is invertible, and therefore the following iteration is well defined:

$$\begin{aligned}&\varvec{h}_{l+1}\left( \varvec{z}_{1},\ldots ,\varvec{z}_{q}\right) \nonumber \\&\quad =\varvec{C}^{-1}\begin{pmatrix}\varvec{z}_{1}\\ \vdots \\ \varvec{z}_{q} \end{pmatrix}-\varvec{C}^{-1}\widehat{\varvec{U}}_{N}\left( \varvec{h}_{l}\left( \varvec{z}_{1},\ldots ,\varvec{z}_{q}\right) \right) ,\;\varvec{h}_{0}\left( \varvec{z}_{1},\ldots ,\varvec{z}_{q}\right) =\varvec{0}.\nonumber \\ \end{aligned}$$
(38)

The iteration (38) converges in a neighbourhood of the origin, where \(\varvec{C}^{-1}\widehat{\varvec{U}}_{N}\) is a contraction [1]. For polynomials of a given order the iteration always converges in a finite number of steps if the resulting polynomial is truncated to a finite order at each iteration.
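Evaluated at a single point \(\varvec{z}\), iteration (38) is an ordinary fixed point iteration. The sketch below checks convergence on a toy two-dimensional example; the matrix \(\varvec{C}\) and the quadratic tail are purely illustrative.

```python
import numpy as np

# Toy decomposition U_hat(x) = C x + U_hat_N(x) with a quadratic tail.
C = np.array([[1.0, 0.2],
              [0.0, 1.0]])
U_hat_N = lambda x: 0.1 * np.array([x[0] ** 2, x[0] * x[1]])
U_hat = lambda x: C @ x + U_hat_N(x)
Cinv = np.linalg.inv(C)

def h_point(z, iters=50):
    # h_{l+1} = C^{-1} z - C^{-1} U_hat_N(h_l), starting from h_0 = 0
    x = np.zeros_like(z)
    for _ in range(iters):
        x = Cinv @ z - Cinv @ U_hat_N(x)
    return x

z = np.array([0.05, -0.02])
x = h_point(z)
assert np.linalg.norm(U_hat(x) - z) < 1e-12  # indeed U_hat(h(z)) = z
```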

Finding the function \(\varvec{h}\) recovers all the SSMs of the system at the same time. Let \(j\in \left\{ 1,\ldots ,q\right\} \). It is straightforward to show that

$$\begin{aligned} \varvec{W}^{j}\left( \varvec{z}_{j}\right) =\varvec{h}\left( \varvec{0},\ldots ,\varvec{0},\varvec{z}_{j},\varvec{0},\ldots ,\varvec{0}\right) \end{aligned}$$
(39)

is the immersion of the SSM and \(\varvec{S}^{j}\) is the SSM conjugate dynamics. Indeed, applying \(\varvec{W}^{j}\) to the invariance Eq. (4) from both sides gives

$$\begin{aligned} \varvec{W}^{j}\circ \varvec{U}^{j}\circ \varvec{F}\circ \varvec{W}^{j}=\varvec{W}^{j}\circ \varvec{S}^{j}\circ \varvec{U}^{j}\circ \varvec{W}^{j}, \end{aligned}$$

where we notice that \(\varvec{U}^{j}\circ \varvec{W}^{j}\) is the identity, by construction, and \(\varvec{W}^{j}\circ \varvec{U}^{j}\) is a projection, and also the identity on the range of \(\varvec{F}\circ \varvec{W}^{j}\). Therefore we are left with the SSM invariance equation

$$\begin{aligned} \varvec{F}\circ \varvec{W}^{j}=\varvec{W}^{j}\circ \varvec{S}^{j}, \end{aligned}$$

which proves our statement.

Remark 5

The leaves of the ISF can be explicitly constructed using the function \(\varvec{h}\), that is

$$\begin{aligned} \mathcal {L}_{\varvec{z}}^{j}= & {} \left\{ \varvec{h}\left( \varvec{c}_{1},\ldots ,\varvec{c}_{j-1},\varvec{z},\varvec{c}_{j+1},\ldots ,\varvec{c}_{q}\right) :\right. \\&\qquad \qquad \left. \varvec{c}_{l}\in \mathbb {R}^{\nu _{l}},l=1,\ldots ,q,l\ne j\right\} . \end{aligned}$$

However, the information about the foliation \(\mathcal {F}^{j}\) is already encoded in the submersion \(\varvec{U}^{j}\), hence finding a full set of foliations satisfying (34) and then calculating \(\varvec{h}\) is inefficient. In the next section, we develop a more efficient technique to find explicit expressions for \(\mathcal {L}_{\varvec{z}}^{j}\).

5.2 The leaves of an ISF

Each leaf of an ISF is given implicitly by (3). It is, however, possible to describe a leaf explicitly as the image of a manifold immersion without relying on the inefficient construction of remark 5. An explicit expression for a leaf allows us to find an SSM as \(\mathcal {L}_{\varvec{0}}\) or to visualise the leaves of the foliation as surfaces (or lines). It will also aid us in defining backbone curves in Sect. 5.3.

We construct the family of immersions \(\varvec{W}_{\varvec{z}}:\mathbb {R}^{n-\nu }\rightarrow \mathbb {R}^{n}\) from a submersion \(\varvec{U}:\mathbb {R}^{n}\rightarrow \mathbb {R}^{\nu }\), such that a leaf within a foliation is given by

$$\begin{aligned} \mathcal {L}_{\varvec{z}}=\left\{ \varvec{W}_{\varvec{z}}\left( \varvec{y}\right) :\varvec{y}\in \mathbb {R}^{n-\nu }\right\} . \end{aligned}$$
(40)

To achieve this we solve the under-determined equation

$$\begin{aligned} \varvec{z}=\varvec{U}\left( \varvec{W}_{\varvec{z}}\left( \varvec{y}\right) \right) \end{aligned}$$
(41)

under additional constraints that allow a unique solution. We assume that the immersion has the form

$$\begin{aligned} \varvec{W}_{\varvec{z}}\left( \varvec{y}\right) =\varvec{V}_{\perp }\varvec{y}+\varvec{V}_{\parallel }\varvec{g}\left( \varvec{z},\varvec{y}\right) , \end{aligned}$$
(42)

where \(\varvec{g}:\mathbb {R}^{\nu }\times \mathbb {R}^{n-\nu }\rightarrow \mathbb {R}^{\nu }\) is an unknown function. First we choose matrices \(\varvec{V}_{\perp }\) and \(\varvec{V}_{\parallel }\), such that

$$\begin{aligned} D\varvec{U}\left( \varvec{0}\right) \varvec{V}_{\perp }=\varvec{0},\;D\varvec{U}\left( \varvec{0}\right) \varvec{V}_{\parallel }=\varvec{I},\;\varvec{V}_{\perp }^{T}\varvec{V}_{\parallel }=\varvec{0}\;\text {and}\;\varvec{V}_{\perp }^{T}\varvec{V}_{\perp }=\varvec{I}. \end{aligned}$$
(43)

The constraints (43) ensure that \(\varvec{g}\) in formula (42) is uniquely determined by the defining equation (41). Note that the linear subspace \(E_{\parallel }\) spanned by \(\varvec{V}_{\parallel }\) can also be defined as

$$\begin{aligned} E_{\parallel }=\left\{ \arg \min _{\varvec{x}}\left| D\varvec{U}\left( \varvec{0}\right) \varvec{x}-\varvec{\xi }\right| :\varvec{\xi }\in \mathbb {R}^{\nu }\right\} . \end{aligned}$$
(44)

The construction of \(\varvec{W}_{\varvec{z}}\) is illustrated in Fig. 4. We also decompose the submersion \(\varvec{U}\) into a linear and nonlinear part, such that \(\varvec{U}\left( \varvec{x}\right) =D\varvec{U}\left( \varvec{0}\right) \varvec{x}+\varvec{U}_{N}\left( \varvec{x}\right) \), then expand Eq. (41) into

$$\begin{aligned} \varvec{z}=\varvec{g}\left( \varvec{z},\varvec{y}\right) +\varvec{U}_{N}\left( \varvec{V}_{\perp }\varvec{y}+\varvec{V}_{\parallel }\varvec{g}\left( \varvec{z},\varvec{y}\right) \right) . \end{aligned}$$
(45)

Equation (45) can be rearranged into a contraction mapping iteration [1], that is

$$\begin{aligned} \varvec{g}_{j+1}\left( \varvec{z},\varvec{y}\right) =\varvec{z}-\varvec{U}_{N}\left( \varvec{V}_{\perp }\varvec{y}+\varvec{V}_{\parallel }\varvec{g}_{j}\left( \varvec{z},\varvec{y}\right) \right) ,\;\varvec{g}_{0}\left( \varvec{z},\varvec{y}\right) =\varvec{z}. \end{aligned}$$
(46)

Due to \(\varvec{U}_{N}\left( \varvec{x}\right) =\mathcal {O}\left( \left| \varvec{x}\right| ^{2}\right) \), the iteration is indeed a contraction within a sufficiently small neighbourhood of the origin. If \(\varvec{U}\) is a polynomial of order \(\alpha \), and we seek \(\varvec{g}\) as another polynomial of order \(\alpha \), then the iteration (46) finishes in \(\alpha \) steps.
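The iteration (46) can likewise be evaluated pointwise. In the toy example below (our own construction, not from the paper) \(n=3\), \(\nu =2\) and \(D\varvec{U}\left( \varvec{0}\right) =\begin{pmatrix}\varvec{I}&\varvec{0}\end{pmatrix}\), so that matrices \(\varvec{V}_{\parallel }\) and \(\varvec{V}_{\perp }\) satisfying (43) can be written down directly.

```python
import numpy as np

# DU(0) = [I | 0] with n = 3, nu = 2; these V matrices satisfy (43).
V_par = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [0.0, 0.0]])
V_perp = np.array([[0.0],
                   [0.0],
                   [1.0]])
U_N = lambda x: 0.1 * np.array([x[2] ** 2, x[0] * x[2]])  # illustrative tail
U = lambda x: x[:2] + U_N(x)

def g_point(z, y, iters=30):
    # g_{j+1}(z, y) = z - U_N(V_perp y + V_par g_j(z, y)),  g_0(z, y) = z
    g = z.copy()
    for _ in range(iters):
        g = z - U_N(V_perp @ y + V_par @ g)
    return g

z = np.array([0.04, -0.03])
y = np.array([0.05])
W = V_perp @ y + V_par @ g_point(z, y)  # point on the leaf, Eq. (42)
assert np.linalg.norm(U(W) - z) < 1e-12  # solves Eq. (41)
```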

Fig. 4
figure 4

Finding the immersion \(\varvec{W}_{\varvec{z}}\) of a leaf \(\mathcal {L}_{\varvec{z}}\) in the form of Eq. (42). The leaf \(\mathcal {L}_{\varvec{z}}\) is represented as a graph over the linear subspace spanned by \(\varvec{V}_{\perp }\). This representation breaks down at points where the tangent space of \(\mathcal {L}_{\varvec{z}}\) is parallel with \(E_{\parallel }\). The SSM tangent to the linear subspace E is also illustrated, which coincides with \(E_{\parallel }\) under the conditions outlined in remark 6. When \(E_{\parallel }=E\), \(\varvec{W}_{\varvec{z}}\left( \varvec{0}\right) \) is linearly asymptotic to the SSM at the origin

We now show that \(\varvec{V}_{\perp }\) and \(\varvec{V}_{\parallel }\) can be found using singular value decomposition [28]. The singular value decomposition in our case can be written as

$$\begin{aligned} \begin{pmatrix}D\varvec{U}\left( \varvec{0}\right) \\ \varvec{0}_{\left( n-\nu \right) \times n} \end{pmatrix}=\begin{pmatrix}\varvec{\varUpsilon }_{\parallel } &{} \varvec{0}\\ \varvec{0} &{} \varvec{\varUpsilon }_{\perp } \end{pmatrix}\begin{pmatrix}\varvec{\varSigma } &{} \varvec{0}\\ \varvec{0} &{} \varvec{0} \end{pmatrix}\begin{pmatrix}\tilde{\varvec{V}}_{\parallel }&\tilde{\varvec{V}}_{\perp }\end{pmatrix}^{T}, \end{aligned}$$
(47)

where \(\varvec{\varSigma }\) is diagonal, \(\varvec{\varUpsilon }_{\parallel }\), \(\varvec{\varUpsilon }_{\perp }\) and \(\begin{pmatrix}\tilde{\varvec{V}}_{\parallel }&\tilde{\varvec{V}}_{\perp }\end{pmatrix}^{T}\) are orthonormal matrices. We now multiply (47) by \(\begin{pmatrix}\tilde{\varvec{V}}_{\parallel }&\tilde{\varvec{V}}_{\perp }\end{pmatrix}\) from the right to check the constraints (43) and we find that

$$\begin{aligned} D\varvec{U}\left( \varvec{0}\right) \tilde{\varvec{V}}_{\parallel }=\varvec{\varUpsilon }_{\parallel }\varvec{\varSigma },\;D\varvec{U}\left( \varvec{0}\right) \tilde{\varvec{V}}_{\perp }=\varvec{0}, \end{aligned}$$

where \(\varvec{\varUpsilon }_{\parallel }\varvec{\varSigma }\) is invertible. Therefore, we find that \(\varvec{V}_{\parallel }=\tilde{\varvec{V}}_{\parallel }\left( \varvec{\varUpsilon }_{\parallel }\varvec{\varSigma }\right) ^{-1}\) and \(\varvec{V}_{\perp }=\tilde{\varvec{V}}_{\perp }\).
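A numerical sketch of this construction, assuming a generic full-rank example of \(D\varvec{U}\left( \varvec{0}\right) \), using numpy's SVD:

```python
import numpy as np

rng = np.random.default_rng(1)
n, nu = 5, 2
DU0 = rng.normal(size=(nu, n))   # random full-rank stand-in for DU(0)

# Full SVD: DU0 = Ups @ [diag(sig) 0] @ VT, as in Eq. (47)
Ups, sig, VT = np.linalg.svd(DU0)
V_par = VT[:nu].T @ np.linalg.inv(Ups @ np.diag(sig))
V_perp = VT[nu:].T               # orthonormal basis of the kernel of DU0

# Verify the constraints (43)
assert np.allclose(DU0 @ V_perp, 0)
assert np.allclose(DU0 @ V_par, np.eye(nu))
assert np.allclose(V_perp.T @ V_par, 0)
assert np.allclose(V_perp.T @ V_perp, np.eye(n - nu))
```

Note that the resulting \(\varvec{V}_{\parallel }\) coincides with the Moore–Penrose pseudoinverse of \(D\varvec{U}\left( \varvec{0}\right) \).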

Remark 6

Here we justify our choice of representation (42) for lightly damped mechanical systems. For other kinds of systems a different representation may be necessary. First we note that the range of matrix \(\varvec{V}_{\perp }\) is always invariant under the Jacobian \(\varvec{A}\); however, the range of \(\varvec{V}_{\parallel }\) (i.e., \(E_{\parallel }\)) in general is not. The range of \(\varvec{V}_{\perp }\) is invariant if there exists a matrix \(\varvec{P}_{\perp }\), such that \(\varvec{A}\varvec{V}_{\perp }=\varvec{V}_{\perp }\varvec{P}_{\perp }\). In the most general case, we have the decomposition

$$\begin{aligned} \varvec{A}\varvec{V}_{\perp }=\varvec{V}_{\perp }\varvec{P}_{\perp }+\varvec{V}_{\parallel }\varvec{P}_{\parallel }. \end{aligned}$$
(48)

Applying \(D\varvec{U}\left( \varvec{0}\right) \) from the left to (48) and noticing that

$$\begin{aligned} D\varvec{U}\left( \varvec{0}\right) \varvec{A}\varvec{V}_{\perp }=D\varvec{S}\left( \varvec{0}\right) D\varvec{U}\left( \varvec{0}\right) \varvec{V}_{\perp }=\varvec{0}, \end{aligned}$$

we find that \(\varvec{P}_{\parallel }=\varvec{0}\), which proves the invariance of \(\varvec{V}_{\perp }\). A similar calculation can be carried out for \(\varvec{V}_{\parallel }\) by using the decomposition

$$\begin{aligned} \varvec{A}\varvec{V}_{\parallel }=\varvec{V}_{\perp }\varvec{Q}_{\perp }+\varvec{V}_{\parallel }\varvec{Q}_{\parallel }. \end{aligned}$$
(49)

Applying \(\varvec{V}_{\perp }^{T}\) to (49), we find that \(\varvec{Q}_{\perp }=\varvec{V}_{\perp }^{T}\varvec{A}\varvec{V}_{\parallel }\). If \(\varvec{V}_{\perp }\) is invariant under \(\varvec{A}^{T}\), we have \(\varvec{Q}_{\perp }=\varvec{0}\), which implies that \(E_{\parallel }\) coincides with E.

We now assume an undamped mechanical system with an equilibrium at the origin, such that the Jacobian of the vector field \(\dot{\varvec{x}}=\varvec{G}\left( \varvec{x}\right) \) at the origin has the form

$$\begin{aligned} \varvec{B}=\begin{pmatrix}\varvec{0} &{} \varvec{I}\\ -\varvec{K} &{} \varvec{0} \end{pmatrix}=D\varvec{G}\left( \varvec{0}\right) , \end{aligned}$$
(50)

where the stiffness matrix \(\varvec{K}\) is symmetric and positive definite. If \(\left( \varvec{v},\lambda \varvec{v}\right) ^{T}\) is a right eigenvector of \(\varvec{B}\), then \(\left( \lambda \varvec{v}^{T},\varvec{v}^{T}\right) \) is a left eigenvector corresponding to the same eigenvalue \(\lambda \), where \(\varvec{v}\) is a real valued vector. Therefore, if the eigenvector \(\left( \varvec{v},\lambda \varvec{v}\right) ^{T}\) being in the range of \(\varvec{V}_{\perp }\) implies that the eigenvector \(\left( \varvec{v},\overline{\lambda }\varvec{v}\right) ^{T}\) is also in the range of \(\varvec{V}_{\perp }\), then \(\varvec{V}_{\perp }\) is invariant under \(\varvec{B}^{T}\). This is because the pairs of vectors \(\left( \varvec{v},\lambda \varvec{v}\right) \), \(\left( \varvec{v},\overline{\lambda }\varvec{v}\right) \) and \(\left( \lambda \varvec{v},\varvec{v}\right) \), \(\left( \overline{\lambda }\varvec{v},\varvec{v}\right) \) span the same linear subspace. In other words, if \(\lambda _{j}^{2}-\lambda _{k}^{2}\ne 0\) holds for \(k=1,\ldots ,\nu \) and \(j=\nu +1,\ldots ,n\), then pairs of complex conjugate eigenvectors are part of the range of \(\varvec{V}_{\perp }\), which makes \(\varvec{V}_{\perp }\) invariant under \(\varvec{B}^{T}\) and further implies that \(E_{\parallel }\) coincides with E. Using the relation \(\varvec{A}=D\varvec{F}\left( \varvec{0}\right) =\exp \varvec{B}\tau \), where \(\tau \) is the sampling period, we find that if \(\mu _{j}/\mu _{k}\ne 1\) for \(k=1,\ldots ,\nu \) and \(j=\nu +1,\ldots ,n\), then \(E_{\parallel }=E\). If light damping is introduced into the mechanical system, such that

$$\begin{aligned} \varvec{B}=\begin{pmatrix}\varvec{0} &{} \varvec{I}\\ -\varvec{K} &{} -\varvec{C} \end{pmatrix}, \end{aligned}$$

where \(\left\| \varvec{C} \right\| \) is small, \(E_{\parallel }\) remains close to E due to the continuity of eigenvectors with respect to the underlying matrix.
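The eigenvector structure claimed above is easy to check numerically. The sketch below builds \(\varvec{B}\) from an illustrative symmetric positive definite \(\varvec{K}\) (the specific matrix is an assumption, not taken from the paper) and verifies that \(\left( \varvec{v},\lambda \varvec{v}\right) ^{T}\) is a right eigenvector while \(\left( \lambda \varvec{v}^{T},\varvec{v}^{T}\right) \) is a left eigenvector for the same \(\lambda \).

```python
import numpy as np

# Illustrative symmetric positive definite stiffness matrix (an assumption).
K = np.array([[2.0, -1.0],
              [-1.0, 2.0]])
n = K.shape[0]
B = np.block([[np.zeros((n, n)), np.eye(n)],
              [-K, np.zeros((n, n))]])

# Eigenvectors v of K give the eigenvalues of B as lam = +/- i*sqrt(mu).
mu, V = np.linalg.eigh(K)
for j in range(n):
    v = V[:, j]
    lam = 1j * np.sqrt(mu[j])
    r = np.concatenate([v, lam * v])    # candidate right eigenvector (v, lam*v)
    l = np.concatenate([lam * v, v])    # candidate left eigenvector (lam*v, v)
    assert np.allclose(B @ r, lam * r)  # B r = lam r
    assert np.allclose(l @ B, lam * l)  # l^T B = lam l^T
print("eigenvector structure verified")
```

The check relies only on \(\varvec{K}\varvec{v}=\mu \varvec{v}\) and \(\lambda ^{2}=-\mu \), so any symmetric positive definite \(\varvec{K}\) works.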

5.3 The backbone and damping curves of an ISF

We can accurately identify the dynamics on an ISF and determine its instantaneous damping ratio (22) and angular frequency (21). It is, however, not possible to attach a unique amplitude to a leaf within a foliation. In this section we work around this restriction and define a surrogate for the amplitude, which measures the distance of a leaf from the equilibrium. This surrogate is extracted purely from the submersion \(\varvec{U}\); therefore it does not measure the amplitude itself, only an approximation of it, as explained in remark 8.

In Sect. 4.1 we have parametrised the ISF in polar coordinates as \(\varvec{z}=\left( r\cos \theta ,r\sin \theta \right) \). Then in Sect. 5.2 we described the leaves of an ISF as an immersion. Picking a point on the leaf \(\mathcal {L}_{\varvec{z}}\) and taking its norm can act as an instantaneous amplitude. The simplest option is to pick the intersection point \(\mathcal {L}_{\varvec{z}}\cap E_{\parallel }\), which is \(\varvec{W}_{\varvec{z}}\left( \varvec{0}\right) \) as per definition (42) and illustrated in Fig. 4. Using the same polar parametrisation that describes the instantaneous natural frequency and damping, we define our surrogate for the amplitude as

$$\begin{aligned} \varDelta _{E^{\star }}\left( r\right) =\sup _{\theta \in [0,2\pi )}\left| \varvec{W}_{\left( r\cos \theta ,r\sin \theta \right) }\left( \varvec{0}\right) \right| . \end{aligned}$$
(51)
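In practice the supremum in (51) can be approximated on a grid of \(\theta \) values. The sketch below does this for a hypothetical toy immersion standing in for \(\varvec{W}_{\varvec{z}}\left( \varvec{0}\right) \); in an actual computation this point comes from the leaf parametrisation of Sect. 5.2.

```python
import numpy as np

def W(z):
    """Hypothetical leaf intersection point W_z(0) for z in R^2 (toy cubic)."""
    z1, z2 = z
    return np.array([z1 + 0.1 * z1**3, z2 - 0.05 * z1 * z2, 0.2 * z2**2])

def surrogate_amplitude(r, n_theta=256):
    """Approximate Delta(r) = sup_theta |W_{(r cos t, r sin t)}(0)| on a grid."""
    thetas = np.linspace(0.0, 2 * np.pi, n_theta, endpoint=False)
    return max(np.linalg.norm(W((r * np.cos(t), r * np.sin(t))))
               for t in thetas)

# The backbone curve (52) pairs this amplitude with omega(r) for 0 <= r < r_max.
print(surrogate_amplitude(0.5))
```

The grid resolution trades accuracy of the supremum against cost; for a polynomial \(\varvec{W}\) a few hundred samples are ample.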

Definition 3

We call the parametrised curve

$$\begin{aligned} \mathscr {B}_{E^{\star }}=\left\{ \omega _{E^{\star }}\left( r\right) ,\varDelta _{E^{\star }}\left( r\right) :0\le r<r_{\mathrm {max}}\right\} \end{aligned}$$
(52)

the ISF backbone curve of the dynamics associated with the codimension-two ISF corresponding to the linear subspace \(E^{\star }\).

We can similarly construct a curve that describes instantaneous damping.

Definition 4

We call the parametrised curve

$$\begin{aligned} \mathscr {D}_{E^{\star }}=\left\{ \zeta _{E^{\star }}\left( r\right) ,\varDelta _{E^{\star }}\left( r\right) :0\le r<r_{\mathrm {max}}\right\} \end{aligned}$$
(53)

the ISF damping curve of the dynamics associated with the codimension-two ISF corresponding to the linear subspace \(E^{\star }\).

If a full set of ISFs is calculated that satisfies the conditions (34), and one is willing to solve Eq. (36) for the function \(\varvec{h}\) or for its values at a set of arguments, then the SSM backbone and damping curves can also be calculated. The amplitude of a vibration represented by the conjugate dynamics \(\varvec{S}^{j}\) on the corresponding SSM is given by

$$\begin{aligned} \varDelta _{E_{j}}\left( r\right) =\sup _{\theta \in [0,2\pi )}\left| \varvec{W}^{j}\left( r\cos \theta ,r\sin \theta \right) \right| , \end{aligned}$$

where \(\varvec{W}^{j}\) is defined by (39). This allows us to make the following definitions.

Definition 5

We call the parametrised curve

$$\begin{aligned} \mathscr {B}_{E_{j}}=\left\{ \omega _{E_{j}}\left( r\right) ,\varDelta _{E_{j}}\left( r\right) :0\le r<r_{\mathrm {max}}\right\} \end{aligned}$$
(54)

the SSM backbone curve of the dynamics associated with the two-dimensional SSM corresponding to the linear subspace \(E_{j}\).

We can similarly construct a curve that describes instantaneous damping.

Definition 6

We call the parametrised curve

$$\begin{aligned} \mathscr {D}_{E_{j}}=\left\{ \zeta _{E_{j}}\left( r\right) ,\varDelta _{E_{j}}\left( r\right) :0\le r<r_{\mathrm {max}}\right\} \end{aligned}$$
(55)

the SSM damping curve of the dynamics associated with the two-dimensional SSM corresponding to the linear subspace \(E_{j}\).

Remark 7

The backbone and damping curves are not unique; they depend on the choice of parametrisation of the ISF or SSM. This is illustrated by the fact that \(\varvec{S}\) can be chosen linear if there are no internal resonances in the strict sense of (13), which is the case for most damped systems. For a linear \(\varvec{S}\), the damping and backbone curves are straight lines, which is not the expected result for a nonlinear system. In [27] a special parametrisation was chosen, such that all near resonances are fully represented in the conjugate dynamics on the SSM, which made the backbone curves unique. We use an equivalent normalisation in the optimisation problem (27), which results in unique backbone and damping curves. However, the alternative normalising loss function (29) can leave near internally resonant terms in the submersion \(\varvec{U}\), which leads to non-unique representations of the unique ISF. The amount of variation in the submersion \(\varvec{U}\) and the map \(\varvec{S}\) can be reduced if, during optimisation, the terms of the submersion \(\varvec{U}\) assume magnitudes similar to those of the nonlinear terms of \(\varvec{S}\). This strategy leads to smaller variations as the linear damping vanishes and the near internal resonances get closer to strict internal resonances. Therefore the uncertainty in the location of the backbone curve also vanishes as the damping vanishes, making the backbone curve unique in the limit, if the limit exists. We must stress that this argument only concerns the linear damping, that is, only \(\zeta _{E^{\star }}\left( 0\right) \rightarrow 0\) is assumed; therefore the damping curve need not vanish.

Remark 8

In general, there is no connection between the ISF and SSM backbone curves, except for lightly damped mechanical systems. According to remark 6, for undamped mechanical systems E and \(E_{\parallel }\) coincide; hence, due to the construction of \(\varvec{W}_{\varvec{z}}\), the surrogate amplitude \(\varDelta _{E^{\star }}\) is linearly asymptotic to the SSM amplitude \(\varDelta _{E_{j}}\) at the equilibrium. If small damping is introduced, E and \(E_{\parallel }\) remain close to each other. This implies that \(\varDelta _{E^{\star }}\) remains nearly linearly asymptotic to the SSM amplitude \(\varDelta _{E_{j}}\), and the ISF and SSM backbone curves stay close to each other near the equilibrium.

6 Examples

We illustrate the application of the theory on two examples: one is based on a mathematical model, the other is purely data driven.

Table 1 The residual of the fitting procedure is calculated as \( res =\frac{1}{N}\sum _{k=1}^{N}\left| \varvec{x}_{k}\right| ^{-1}\left| \varvec{U}\left( \varvec{y}_{k}\right) -\varvec{S}\left( \varvec{U}\left( \varvec{x}_{k}\right) \right) \right| \), which is compared for the training and testing data
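The residual reported in Table 1 can be sketched as follows, assuming \(\varvec{U}\) and \(\varvec{S}\) are callables and \((\varvec{x}_{k},\varvec{y}_{k})\) are sampled state pairs with \(\varvec{y}_{k}\) the successor of \(\varvec{x}_{k}\). The toy linear system below is hypothetical and only serves to exercise the formula.

```python
import numpy as np

def fitting_residual(U, S, X, Y):
    """res = (1/N) * sum_k |x_k|^{-1} * |U(y_k) - S(U(x_k))|."""
    N = len(X)
    return sum(np.linalg.norm(U(y) - S(U(x))) / np.linalg.norm(x)
               for x, y in zip(X, Y)) / N

# Toy check: for a diagonal linear map, projecting onto the first two
# coordinates gives an exactly invariant (U, S) pair, so res ~ 0.
A = np.diag([0.9, 0.8, 0.5, 0.4])
U = lambda x: x[:2]            # submersion: project onto slow coordinates
S = lambda z: A[:2, :2] @ z    # conjugate (reduced) dynamics
X = [np.array([0.1, 0.2, 0.05, -0.1]), np.array([-0.3, 0.1, 0.2, 0.0])]
Y = [A @ x for x in X]
print(fitting_residual(U, S, X, Y))  # ~0 for an exactly invariant pair
```

For fitted polynomial maps the same function measures how well the invariance equation holds on held-out data.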

6.1 Shaw-Pierre example

We use a modified version of the two-degree-of-freedom oscillator studied by Shaw and Pierre [25], which also appeared in [27]. The modification makes the damping matrix proportional to the stiffness matrix in the linearised problem. The first-order equations of motion are

$$\begin{aligned} \left. \begin{array}{rl} \dot{x}_{1} &{} =v_{1},\\ \dot{x}_{2} &{} =v_{2},\\ \dot{v}_{1} &{} =-cv_{1}-k_{0}x_{1}-\kappa x_{1}^{3}-k_{0}(x_{1}-x_{2})-c(v_{1}-v_{2}),\\ \dot{v}_{2} &{} =-cv_{2}-k_{0}x_{2}-k_{0}(x_{2}-x_{1})-c(v_{2}-v_{1}), \end{array}\right\} \end{aligned}$$
(56)

where the parameters are \(c=0.003\), \(k_{0}=1\), and \(\kappa =0.5\). The natural frequencies and damping ratios are

$$\begin{aligned}&\omega _{1}=\sqrt{k_{0}},\qquad \omega _{2}=\sqrt{3k_{0}},\qquad \zeta _{1}=\frac{c}{2\sqrt{k_{0}}},\\&\zeta _{2}=\frac{\sqrt{3}c}{2\sqrt{k_{0}}}, \end{aligned}$$

yielding the complex eigenvalues

$$\begin{aligned}&\lambda _{1,2}=-\frac{c}{2}\pm i\sqrt{k_{0}\left( 1-\frac{c^{2}}{4k_{0}}\right) },\\&\lambda _{3,4}=-\frac{3c}{2}\pm i\sqrt{3k_{0}\left( 1-\frac{3c^{2}}{4k_{0}}\right) }, \end{aligned}$$

where we have assumed that both modes are underdamped, i.e., \(c<2\sqrt{k_{0}/3}.\) The spectral quotients corresponding to these natural frequencies are

$$\begin{aligned} \beth _{E_{1}^{\star }} =1, \quad \beth _{E_{2}^{\star }}=3. \end{aligned}$$
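The eigenvalues quoted above can be checked against the linearisation of Eq. (56) at the origin, where the cubic term drops out of the Jacobian. The sketch below does this for the stated parameter values.

```python
import numpy as np

# Jacobian of Eq. (56) at the origin, in coordinates (x1, x2, v1, v2).
c, k0 = 0.003, 1.0
J = np.array([[0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0],
              [-2 * k0, k0, -2 * c, c],
              [k0, -2 * k0, c, -2 * c]])

eigs = np.linalg.eigvals(J)
lam12 = -c / 2 + 1j * np.sqrt(k0 * (1 - c**2 / (4 * k0)))
lam34 = -3 * c / 2 + 1j * np.sqrt(3 * k0 * (1 - 3 * c**2 / (4 * k0)))

assert any(np.isclose(e, lam12) for e in eigs)
assert any(np.isclose(e, lam34) for e in eigs)
# The decay rates separate by a factor of three: Re(lam34)/Re(lam12) = 3.
assert np.isclose(lam34.real / lam12.real, 3.0)
print("eigenvalues verified")
```

The 3:1 ratio of the real parts is the spectral separation underlying the quotients above.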

The data for this problem was generated from 100 trajectories of 16 points each with time step \(T=0.8\). The initial conditions for each trajectory were uniformly drawn from a cube of width 0.4 about the origin and scaled, such that \(\varvec{x}_{0}\mapsto \varvec{x}_{0}/\left| \varvec{x}_{0}\right| ^{2}\). This ensures a higher density of data about the origin and that \(\max \left| \varvec{x}_{k}\right| \le 0.2\). Testing data was created by the same procedure in order to check whether we overfit the data. The fitting procedure used \(\sigma =2\) and \(\sigma =3\) with order 3, 5 and 7 polynomials representing the submersion \(\varvec{U}\) and dynamics \(\hat{\varvec{S}}\) in Eq. (32). The optimisation was carried out using the first-order BFGS method. The parameters for the penalty term (33) were \(N_{r}=10\), \(N_{\theta }=24\) and \(r_{\mathrm {max}}=0.2\). The accuracy of the fitting can be seen in Table 1, which also shows that, as the order of the polynomials grows, the ratio between the testing and training residuals slightly increases.
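A minimal data-generation sketch for Eq. (56) is shown below: the vector field is integrated with a fixed-step RK4 scheme (our choice; the paper does not specify the integrator) and 16 points per trajectory are sampled at \(T=0.8\). The density-adjusting rescaling of the initial conditions described above is omitted here for simplicity.

```python
import numpy as np

c, k0, kappa = 0.003, 1.0, 0.5

def G(x):
    """Vector field of Eq. (56) in coordinates (x1, x2, v1, v2)."""
    x1, x2, v1, v2 = x
    return np.array([
        v1,
        v2,
        -c * v1 - k0 * x1 - kappa * x1**3 - k0 * (x1 - x2) - c * (v1 - v2),
        -c * v2 - k0 * x2 - k0 * (x2 - x1) - c * (v2 - v1),
    ])

def sample_trajectory(x0, T=0.8, n_points=16, substeps=40):
    """Integrate with classical RK4 and return n_points samples, T apart."""
    h = T / substeps
    xs, x = [np.array(x0, float)], np.array(x0, float)
    for _ in range(n_points - 1):
        for _ in range(substeps):
            k1 = G(x); k2 = G(x + h / 2 * k1)
            k3 = G(x + h / 2 * k2); k4 = G(x + h * k3)
            x = x + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        xs.append(x.copy())
    return np.array(xs)

rng = np.random.default_rng(0)
x0 = rng.uniform(-0.2, 0.2, size=4)   # cube of width 0.4 about the origin
traj = sample_trajectory(x0)
print(traj.shape)  # (16, 4)
```

Stacking 100 such trajectories yields the \((\varvec{x}_{k},\varvec{y}_{k})\) pairs used by the fitting procedure.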

In Fig. 5 various ISF backbone and damping curves are compared to each other and to the SSM backbone and damping curves. We treat the order 7 SSM calculation as a reference. It can be seen that the ISF backbone curves are very close to the SSM backbone curve. The ISF damping curves seemingly display a larger variation; however, that is due to the scale of the horizontal axis, and the relative error is small.

Fig. 5

Backbone and damping curves of Eq. (56). The curves were identified using SSMs and using series-expanded ISFs, calculated from the vector field and identified from data. The relative error of the backbone and damping curves is roughly the same, but due to the scaling of the figure the damping curves appear less accurate. VF means that the vector field was used to calculate the result, DATA means that the ISF was directly fitted to the data, and \(O(\alpha )\) indicates that order \(\alpha \) polynomials were used

The calculated ISFs can be used to reconstruct the full dynamics. The accuracy of this reconstruction is illustrated in Fig. 6 for a single trajectory. We compare the sampled trajectory \(\varvec{x}_{k}=\varvec{x}\left( kT\right) \), which is the solution of the differential Eq. (56), to the reconstructed dynamics using the map \(\varvec{h}\) as defined by (36). The initial conditions for the reduced order models are set by \(\varvec{z}_{j,0}=\varvec{U}^{j}\left( \varvec{x}_{0}\right) \) and then iterated under the reduced models, such that \(\varvec{z}_{j,k+1}=\varvec{S}^{j}\left( \varvec{z}_{j,k}\right) \). First, we quantify the error in fitting the invariance equation by

$$\begin{aligned} \mathrm {err}_{k}^{ fw }=\left| \varvec{x}_{k}\right| ^{-1}\left| \left( \varvec{z}_{1,k}-\varvec{U}^{1}\left( \varvec{x}_{k}\right) ,\varvec{z}_{2,k}-\varvec{U}^{2}\left( \varvec{x}_{k}\right) \right) \right| , \end{aligned}$$
(57)

where the subscript \( fw \) refers to forward prediction. The result of this can be seen in Fig. 6a. Second, we use Eq. (37) to reconstruct the dynamics from the two ISFs and compare the reconstructed trajectories to the solution of the differential Eq. (56). The relative reconstruction error is calculated as

$$\begin{aligned} \mathrm {err}_{k}^{ bw }=\left| \varvec{x}_{k}\right| ^{-1}\left| \varvec{x}_{k}-\varvec{h}\left( \varvec{z}_{1,k},\varvec{z}_{2,k}\right) \right| \end{aligned}$$
(58)

and illustrated in Fig. 6b. When comparing Figs. 6a and 6b, one can see that the accuracy of satisfying the invariance equation is better than the accuracy of the reconstruction, which is due to the added inaccuracy of the post-processing step that produces the map \(\varvec{h}\). It is also clear that the error in the invariance equation increases by about one order of magnitude over the 32 steps of the comparison, while the reconstruction error remains roughly constant, at least for order 3 and 5 polynomials. We note that the described behaviour is consistent with other trajectories; however, the absolute magnitude of the errors increases with \(\left| \varvec{x}_{0}\right| \). The dependence of the errors on \(\left| \varvec{x}_{0}\right| \) can be controlled by the value of \(\sigma \). However, the error also depends on the distribution of the data within the state space, which we may not have control over. We note that the errors in Fig. 6b can be reduced to the errors displayed in Fig. 6a if Eq. (36) is solved using Newton’s method instead of the iteration (38) (data not shown as it is indistinguishable from Fig. 6a).
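The two error measures (57) and (58) can be sketched as below. The maps \(\varvec{U}^{1}\), \(\varvec{U}^{2}\) and \(\varvec{h}\) are hypothetical placeholders for the fitted submersions and the reconstruction map of (36); the toy linear choice only exercises the formulas.

```python
import numpy as np

def forward_error(xk, z1k, z2k, U1, U2):
    """err_fw of Eq. (57): invariance mismatch relative to |x_k|."""
    d = np.concatenate([z1k - U1(xk), z2k - U2(xk)])
    return np.linalg.norm(d) / np.linalg.norm(xk)

def reconstruction_error(xk, z1k, z2k, h):
    """err_bw of Eq. (58): reconstruction mismatch relative to |x_k|."""
    return np.linalg.norm(xk - h(z1k, z2k)) / np.linalg.norm(xk)

# Toy check with exact (hypothetical) maps, where both errors vanish:
U1 = lambda x: x[:2]
U2 = lambda x: x[2:]
h = lambda z1, z2: np.concatenate([z1, z2])
x = np.array([0.1, -0.2, 0.3, 0.05])
assert forward_error(x, U1(x), U2(x), U1, U2) == 0.0
assert reconstruction_error(x, U1(x), U2(x), h) == 0.0
print("error metrics check out on the toy maps")
```

In the actual comparison, \(\varvec{z}_{j,k}\) comes from iterating the reduced models, not from re-applying \(\varvec{U}^{j}\) to \(\varvec{x}_{k}\), so the errors accumulate over the trajectory.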

Fig. 6

Reconstruction error as a function of time. Solid lines correspond to the ISF obtained directly from the vector field (56). The \(\times \) markers denote the error from the ISFs of the identified map and the \(\triangledown \) corresponds to the directly identified ISFs. The same comparison is carried out for orders \(\alpha =3\) (black) \(\alpha =5\) (green) and \(\alpha =7\) (red) polynomial expansions. The scaling order parameter is \(\sigma =3\) and the initial condition is \(\varvec{x}=\left( 1.1088\times 10^{-4},1.9023\times 10^{-5},-0.0739,-0.0126\right) \). a errors of reconstruction using Eq. (57) and b using (58). (Color figure online)

6.2 Clamped-clamped beam

Here we analyse the free-decay vibration of a clamped-clamped beam. The data was collected by Ehrhardt and Allen [10] using the device depicted in Fig. 7. The data contains three tracks of velocity information, measured at the midpoint of the beam, which correspond to the first three vibration modes of the structure. The initial conditions were set by applying carefully tuned forcing that compensates for the damping within the structure and is intended to recover the sustained vibration that would have occurred if the structure had no damping. Such vibration is thought to be near an SSM [27], which makes this data set unusual, as impact hammer tests would not single out specific modes of vibration. The data was re-sampled with time period \(T=0.97656\) ms. We use the same phase-space reconstruction through delay-embedding of the velocity data as in [27], where full justification is given for the choice of phase-space dimensionality.
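The delay-embedding step can be sketched as follows: stack d consecutive velocity samples into one state vector. The embedding dimension and the stand-in signal below are illustrative assumptions; [27] justifies the dimension actually used for the beam data.

```python
import numpy as np

def delay_embed(signal, d):
    """Return an array of shape (len(signal) - d + 1, d) of delay vectors."""
    return np.stack([signal[i:i + d] for i in range(len(signal) - d + 1)])

v = np.sin(0.3 * np.arange(100))   # stand-in for the measured velocity track
X = delay_embed(v, d=5)            # d = 5 is an illustrative choice
print(X.shape)  # (96, 5)
```

Consecutive rows of X then serve as the \((\varvec{x}_{k},\varvec{y}_{k})\) pairs for fitting the submersion.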

Fig. 7

The clamped-clamped beam whose free vibration was measured and used in our analysis. Reproduced from [10]

Table 2 Residuals of the fitting process, calculated as \( res =\frac{1}{N}\sum _{k=1}^{N}\left| \varvec{x}_{k}\right| ^{-1}\left| \varvec{U}\left( \varvec{y}_{k}\right) -\varvec{S}\left( \varvec{U}\left( \varvec{x}_{k}\right) \right) \right| \). O(\(\alpha \)) means that the conjugate map \(\varvec{S}\) is an order \(\alpha \) polynomial, while the submersion \(\varvec{U}\) is always an order 3 polynomial
Table 3 Natural frequencies of the three ISFs are compared to the estimates in [10]
Table 4 Damping ratios and ISF spectral quotients estimated by polynomial fitting are compared to [10]

We have fitted ISFs to all three modes of vibration captured by the data; however, we only show the first and third backbone curves, which can be compared to the analysis in [10]. We have used order 3 polynomials for the submersion \(\varvec{U}\), order 3, 5 and 7 polynomials for the conjugate map \(\varvec{S}\) and set \(\sigma =1\) throughout the calculation. Setting a higher value of \(\sigma \) would over-emphasise the importance of the data near the equilibrium, and therefore the backbone curves would follow the actual frequency variations at higher amplitudes less accurately. We have found that using higher-order polynomials for the submersion \(\varvec{U}\) makes the composite map \(\widehat{\varvec{U}}\) of Eq. (36) non-invertible close to the equilibrium, which is a likely symptom of overfitting. The parameters for the penalty term (33) were \(N_{r}=12\), \(N_{\theta }=24\) and \(r_{\max }=0.7\). The residuals of the fitting process are gathered in Table 2. When a polynomial model is first fitted to the data, just as in [27], and the ISF is directly calculated from the model, the residuals are high, because the ISF only asymptotically satisfies the invariance equation (4) about the equilibrium with respect to the fitted model. When the ISF is directly fitted to the data, the loss function (32) is minimised, which is closely related to the residual.

The fitting procedure also recovers the natural frequencies and damping ratios of the linear modes of the structure. The identified natural frequencies can be seen in Table 3; they show very little variation from the linearly identified values in [10]. The damping ratios in Table 4 show a wider variation, with what appears to be a systematic error of a factor of \(2\ldots 3\). The ISF spectral quotients are also shown in Table 4; considering the results of the order 5 and 7 fittings, they indicate that all ISFs are unique if they are twice differentiable.

Using the fitted ISFs, we have calculated the backbone curves corresponding to the first and third vibration modes. The ISF calculations are compared to the force-appropriation results and free-decay analysis of [10], denoted by ‘Forcing’ and ‘Decay’ in Fig. 8, respectively. We have also calculated the SSMs and ISFs indirectly, from a third-order polynomial model fitted to the data, as in [27], which is denoted by ‘MAP’ in Fig. 8. To obtain the backbone curves, we have applied the post-processing steps of Sects. 5.2 and 5.3. In Figs. 8a and b the leaves of the foliation were recovered as polynomials as described in Sect. 5.2; in Figs. 8c and d, however, Eq. (45) was solved for \(\varvec{g}\) in a pointwise manner with fixed \(\varvec{y},\varvec{z}\) values using Newton’s method, which gives accurate results. In Figs. 8e and f, we have used Newton’s method to solve the even more accurate Eq. (36), and calculated the SSM backbone curves from the collection of three ISFs. It can be seen in Figs. 8c and e that Newton’s method is unable to find a solution for higher vibration amplitudes in the vicinity of previous iterations. For Fig. 8c this indicates that some leaves of the foliation become tangential to \(E_{\parallel }\) [defined by (44)]; for Fig. 8e it indicates that the leaves of the three ISFs do not always intersect. We believe that the latter problem can be partly blamed on the lack of data outside the neighbourhoods of the three SSMs. In these regions of the phase space the fitting method picks the submersions arbitrarily, so they can be highly distorted. This problem was not encountered in Sect. 6.1, where the data was better distributed in the phase space.

Fig. 8

Backbone curves for the clamped-clamped beam. Definition (52) and \(\varvec{W}_{\varvec{z}}\) calculated by the polynomial iteration (46) were used in (a, b). Definition (52) and \(\varvec{W}_{\varvec{z}}\) calculated by Newton’s method from (45) were used in (c, d). Definition (54) was used in (e, f) with \(\varvec{W}^{j}\) calculated using Newton’s method. MAP means a polynomial model fit was carried out first, DATA means that the ISF was directly fitted to the data, \(O(\alpha )\) indicates that order \(\alpha \) polynomials were used

7 Discussion and conclusions

The paper has introduced invariant spectral foliations (ISF) as a tool to derive reduced order models (ROM) of dynamical systems about equilibria. The major advantage of this approach over other methods is that the full dynamics can be reconstructed from a set of ISFs. ISFs can also be fitted to data directly, as opposed to SSMs. Direct fitting ensures that the invariance equation is satisfied for the data points with maximum accuracy, without using intermediate representations, such as a black-box model. We have shown that indirect fitting of ISFs can result in high residuals compared to direct fitting. The major disadvantage of an ISF is that it requires a submersion (function) that depends on the same number of parameters as the dimensionality of the phase space. Therefore, for high-dimensional problems a polynomial representation is not suitable, because the number of parameters required to represent the ISF is on the order of \(\nu \left( {\begin{array}{c}n+\alpha \\ n\end{array}}\right) \), where n is the dimension of the phase space, \(\nu \) is the co-dimension of the ISF and \(\alpha \) is the order of the polynomial. SSMs, in contrast, are represented by immersions that depend on a small number of parameters even though they map into high-dimensional spaces, so the required number of parameters is about \(n\left( {\begin{array}{c}\nu +\alpha \\ \nu \end{array}}\right) \), where \(\nu \) is the dimension of the SSM. Therefore SSMs can be represented efficiently. However, finding SSMs requires a model, be it black-box or physical, which might itself be difficult to represent efficiently. In essence, the problem of dimensionality occurs at different levels of representation with ISFs and SSMs.
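The parameter-count comparison can be made concrete. The sketch below evaluates the two binomial estimates for an illustrative phase-space dimension; the specific values of n, \(\nu \) and \(\alpha \) are our assumptions, not taken from the examples.

```python
import math

def isf_params(n, nu, alpha):
    """ISF submersion coefficients: roughly nu * C(n + alpha, n)."""
    return nu * math.comb(n + alpha, n)

def ssm_params(n, nu, alpha):
    """SSM immersion coefficients: roughly n * C(nu + alpha, nu)."""
    return n * math.comb(nu + alpha, nu)

n, nu, alpha = 100, 2, 7   # illustrative: phase-space dim, (co)dimension, order
print(isf_params(n, nu, alpha))  # grows combinatorially with n
print(ssm_params(n, nu, alpha))  # stays modest for small nu
```

For these illustrative values the ISF representation needs on the order of \(10^{10}\) coefficients while the SSM needs only a few thousand, which is the gap that motivates the neural-network discussion below.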
One promising approach to representing submersions with a minimum number of parameters is to use deep neural networks [5, 12], or other kinds of nonlinear approximation methods [9], which can represent high-dimensional functions reasonably efficiently, in contrast to polynomials. The challenge with nonlinear approximations, in particular with neural networks, is that they can be difficult to fit to data, because parameter values that provide small improvements in accuracy can lie far apart and are therefore not easy to find [21]. Nevertheless, deep neural networks have enabled great advances in many fields of engineering, and therefore this approach will be explored elsewhere.

One important aspect of any calculation or prediction is whether it is repeatable. In particular, the mathematically defined object should be the same regardless of the numerical method used to calculate it. This is determined by the uniqueness properties of the mathematical object one wants to calculate. One particularly desirable feature of an ISF is that its representation only needs to be once differentiable when it is calculated for the slowest dynamics (as in Sect. 2.2), or twice differentiable as per theorem 1, to be unique. This is in contrast with SSMs, where the SSM containing the slowest dynamics must be many times differentiable (as given by the SSM spectral quotient). The required order of smoothness gets higher if there is a time-scale separation with increasingly fast dynamics in the system. As smoothness is difficult to quantify numerically, calculating unique SSMs can be a challenge. In this respect, ISFs offer a theoretical advantage over SSMs, which needs to be verified in practical examples.

The practical aspects of finding an ISF remain to be investigated. One question is what type of data is required to obtain an accurate ISF. To answer this question, a rigorous statistical analysis of how the accuracy of the ISF depends on the amount, type and uncertainty of the data is necessary. For mechanical systems, impact hammer tests seem to be a simple way to obtain free-decay vibration data. However, such data might need to be pre-processed in order to achieve a certain distribution of data points within the phase space. In some cases it might not be possible to obtain data from certain parts of the phase space, as in the clamped-clamped beam example; hence the effect of missing data also needs to be explored.