Predicting Spatio-temporal Time Series Using Dimension Reduced Local States

We present a method for both cross-estimation and iterated time series prediction of spatio-temporal dynamics based on local modelling and dimension reduction techniques. Assuming homogeneity of the underlying dynamics, we construct delay coordinates of local states and then further reduce their dimensionality through Principal Component Analysis. The prediction uses nearest neighbour methods in the space of dimension reduced states to either cross-estimate or iteratively predict the future of a given frame. The effectiveness of this approach is shown for (noisy) data from a (cubic) Barkley model, the Bueno-Orovio-Cherry-Fenton model, and the Kuramoto-Sivashinsky model.


Introduction
In many experiments some variables of the system are more easily observable than others. If the underlying dynamics is deterministic, the observable of interest is in general nonlinearly related to other variables of the system which might be more accessible. In such cases one may try to estimate any observable which is difficult to measure from time series of those variables which are at one's disposal. Another task frequently encountered with observed time series is forecasting the dynamical evolution of the system and the time series. To cope with both tasks in the case of multivariate time series from extended spatio-temporal systems, we present an approach for cross estimation and iterated time series prediction using local state reconstructions, dimension reduction, and nearest neighbour methods for local modelling.
Local state reconstruction is motivated by the fact that it is often impractical to predict the behaviour of systems with a large spatial extent all at once. If instead one combines a spatial and temporal neighbourhood around each measurement to find a description of the local system state, it becomes possible to make predictions for each point in space independently. For performing cross estimation or prediction based on local states one can either use nearest neighbour methods (also called local modelling) [20] or employ some other black-box modelling approach like, for example, Echo State Networks [21,27]. In the following, we shall use local modelling by selecting for each reconstructed reference state similar states from a training data set whose relations to other observables and/or future temporal evolutions are known and can be exploited for cross estimation or time series prediction.
Successful reconstruction of high-dimensional dynamics in extended systems, however, requires very large embedding dimensions, which is a major challenge in particular for nearest neighbour methods. Therefore, a crucial point in making the conceptually simple nearest neighbours algorithm efficient is dimension reduction. As a means of dimension reduction to find a lower dimensional representation of the local states, we employ principal component analysis (PCA), which turns out to improve performance in particular for noisy data.

Predicting Spatio-Temporal Time Series
In this section we shall introduce the main concepts for predicting spatio-temporal time series, including local delay coordinate states, linear dimension reduction, and nearest neighbour methods for local modelling of the dynamical evolution or any other relation between observed time series.

Local Modelling
Let $x_t$ be a state of some dynamical system evolving in time $t$ and let $s_t = h(x_t)$ be a signal which can be observed or measured. Furthermore, let us assume that the dynamical equations generating the flow in state space and the measurement function $h$ are unknown, and that only a set $S$ of $M$ states $x_{t_m}$ and corresponding time series values $s_{t_m}$ for $t_1, \ldots, t_M$ is available, for which the future values $x_{t_m+T}$ and $s_{t_m+T}$ are also known (due to previous measurements, for example). This data set $S$ can be used to predict the future value $x_{t+T}$ of a given state $x_t$ or to estimate the corresponding time series values $s_t$ and $s_{t+T}$, by selecting the $k$ nearest neighbours of $x_t$ in $S$ and using their future values (or the corresponding time series values) for approximating $x_{t+T}$ (or $s_t$ and $s_{t+T}$), for example, by (distance weighted) averaging.
In most practical applications of this kind of local nearest neighbour modelling the required states are reconstructed from a measured time series using the concept of delay coordinates (to be introduced in the next section). Local modelling in (reconstructed) state space is a powerful tool for purely data driven time series prediction [3,13]. Its main ingredients are a proper state space representation of the measured time series, fast nearest neighbour searches, and local models such as low order polynomials which can accurately interpolate and predict the (nonlinear) relation between (reconstructed) states and target values.
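The distance-weighted averaging described above can be sketched in a few lines. This is a minimal illustration, not the implementation used in this paper: the function name `knn_predict`, the inverse-distance weighting, and the brute-force search are assumptions chosen for brevity.

```python
import math

def knn_predict(query, states, targets, k=3, eps=1e-12):
    """Estimate the target of `query` as a distance-weighted average over
    the targets of its k nearest neighbours in `states` (brute force)."""
    ranked = sorted((math.dist(query, s), i) for i, s in enumerate(states))[:k]
    # Inverse-distance weights (an assumed choice); eps avoids division by zero.
    weights = [1.0 / (d + eps) for d, _ in ranked]
    weighted = sum(w * targets[i] for w, (_, i) in zip(weights, ranked))
    return weighted / sum(weights)

# Toy training set S: states with known target values (here x -> 2x).
states = [(0.0,), (1.0,), (2.0,), (3.0,)]
targets = [0.0, 2.0, 4.0, 6.0]
estimate = knn_predict((1.5,), states, targets, k=2)
```

With `k=2` the query at 1.5 is equidistant from the states at 1.0 and 2.0, so the estimate is the plain average of their targets.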

Delay Coordinates
The most important part of time series based local modelling is the representation of data, i.e. proper reconstruction of states from data. Typically this representation is found utilizing delay coordinates and Takens' Embedding Theorem [25,22,15,2,8] such that a scalar time series $\{s_t\}$ is reconstructed to state vectors $x_t = (s_{t-\gamma\tau}, \ldots, s_{t-\tau}, s_t)$ by including $\gamma$ past measurements each separated by $\tau$ time steps. For multivariate time series $\{\mathbf{s}_t\}$ one can do the same for each of the components, resulting in state vectors $x_t = (s_{1,t-\gamma\tau}, \ldots, s_{1,t}, s_{2,t-\gamma\tau}, \ldots, s_{2,t})$.
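As a small illustration of the delay vectors defined above, the following sketch builds them from a list of samples. The helper names `delay_embed` and `delay_embed_multivariate` are hypothetical and only serve this example.

```python
def delay_embed(series, gamma, tau):
    """Delay vectors (s[t - gamma*tau], ..., s[t - tau], s[t]) for every
    time index t that has a complete history."""
    start = gamma * tau
    return [
        tuple(series[t - j * tau] for j in range(gamma, -1, -1))
        for t in range(start, len(series))
    ]

def delay_embed_multivariate(components, gamma, tau):
    """Concatenate the delay vectors of each component time series."""
    per_component = [delay_embed(c, gamma, tau) for c in components]
    return [sum(vectors, ()) for vectors in zip(*per_component)]

series = [0, 1, 2, 3, 4, 5, 6]
vectors = delay_embed(series, gamma=2, tau=2)
# The first vector belongs to t = 4 and contains (s[0], s[2], s[4]).
```

Each vector has $\gamma + 1$ entries, and the first $\gamma\tau$ samples yield no vector because their history is incomplete.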

Spatial Embedding
In principle, delay embedding could also be employed to reconstruct (global) states of high-dimensional spatially extended systems using multivariate time series sampled at many spatial locations. Such global state vectors are (and have to be) very high dimensional (for systems exhibiting extensive chaos). The runtime of nearest neighbour searches, however, and in particular the memory usage of such reconstructions grows rapidly with the dimension of the reconstructed global states. To avoid this issue it has been proposed [19,20,18] to reconstruct (spatially) local states and to predict spatially extended systems point by point instead of the whole global state at once. This approach is motivated by the fact that all spatially extended physical systems possess a finite speed at which information travels. Therefore the future value of any of the variables depends solely on its past and its spatial neighbours. Instead of trying to reconstruct the state of the whole system into one vector, we limit ourselves to reconstructing small neighbourhoods of all points that carry enough information to predict one point one time step into the future. As an additional benefit, the infeasibly large embedding dimension that would result from embedding the entire space into a single state is greatly reduced. The idea of local state reconstruction was first applied to spatially one-dimensional systems [19,20,18] and was used, for example, to anticipate extreme events in extended excitable systems [7].
In the following we will present the embedding procedure for spatio-temporal time series represented by $u_{t,\alpha}$, where $t$ denotes time and $\alpha$ a point in space. For 2D space $\alpha$ takes the values $\alpha = (i, j)$. In the most general case such an embedding could consist of arbitrary combinations of neighbours in all directions of space and time. For practical purposes we will limit ourselves to a certain set of parameters to describe which neighbours will be included into a reconstruction. We parameterize an embedding with the number $\gamma$ of past time steps and their respective temporal delay (or time lag) $\tau$. All neighbours in space that are within the radius $r$, referring to Euclidean distances on a unit grid, will be included as well. The resulting shape of the embedding is comparable to a cylinder in 2+1D space-time. To make this clearer, a visualization of the spatial embedding in a two-dimensional system is displayed in Fig. 1 for different radii $r$.
In the following we shall assume that the dynamics underlying the observed spatio-temporal time series is invariant with respect to translations, i.e. that the system is homogeneous. In this case states reconstructed at different locations can be combined into a single training set providing the data base for cross estimation or time series prediction, as will be discussed in more detail in Section 2.4.
However, even if the dynamical rules are the same for all locations, special care needs to be taken at the boundaries. This becomes obvious when trying to include nonexistent neighbours from outside the grid. For periodic boundary conditions the canonical solution is to wrap around at the edges, but for constant boundaries the solution is not so obvious. In many cases the dynamics near the boundary may also differ from the dynamics far from it. It is therefore desirable to treat boundaries separately during nearest neighbour predictions. A solution proposed in Ref. [20] is to fill the missing values at the boundary with an additional parameter, a large constant number. If the parameter is significantly larger than typical values of the internal dynamics, the states reconstructed from the boundary fill regions in state space isolated from state vectors of internal dynamics. This has the desired effect, as nearest neighbour searches will always find boundary states when given a boundary state as query, and similarly for internal states.
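The cylinder-shaped embedding and the constant boundary fill can be sketched as follows for a two-dimensional grid. The function names, the nested-list data layout, and the ordering of entries within a state are assumptions made for this illustration only.

```python
def neighbourhood(r):
    """Offsets (di, dj) of all grid points within Euclidean distance r."""
    m = int(r)
    return [(di, dj) for di in range(-m, m + 1) for dj in range(-m, m + 1)
            if di * di + dj * dj <= r * r]

def local_state(frames, t, i, j, gamma, tau, r, boundary=200.0):
    """Local state at time t and site (i, j): all sites within radius r at
    the times t - gamma*tau, ..., t - tau, t. Sites outside the grid are
    filled with the large constant `boundary`."""
    rows, cols = len(frames[0]), len(frames[0][0])
    state = []
    for lag in range(gamma, -1, -1):
        frame = frames[t - lag * tau]
        for di, dj in neighbourhood(r):
            a, b = i + di, j + dj
            inside = 0 <= a < rows and 0 <= b < cols
            state.append(frame[a][b] if inside else boundary)
    return state

def embedding_dimension(gamma, r):
    """Dimension D_E of a local state in 2D: (gamma + 1) time slices."""
    return (gamma + 1) * len(neighbourhood(r))

# Three constant 3x3 frames with values 0.0, 1.0, 2.0 as toy data.
frames = [[[float(t)] * 3 for _ in range(3)] for t in range(3)]
corner = local_state(frames, t=2, i=0, j=0, gamma=1, tau=1, r=1)
```

As a consistency check, `embedding_dimension(12, 4)` gives 13 × 49 = 637 and `embedding_dimension(500, 0)` gives 501, which agree with the values of $D_E$ quoted in the captions of Fig. 9 and Fig. 13.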

Dimension Reduction
The feasibility of any nearest neighbour search depends heavily on the memory consumption, as the whole data set needs to lie in memory. A crucial part of our algorithm is therefore about creating a proper low dimensional reconstruction. In most numerical experiments choosing just very few neighbours to include in the reconstruction did not prove to be very effective. Therefore, instead of choosing a small embedding dimension from the start, we propose to perform some means of dimension reduction on the resulting reconstructed state vectors. For this task we choose Principal Component Analysis (PCA) as it is a straightforward standard technique for (linear) dimension reduction, where the reconstructed states $x_t$ are projected onto the eigenvectors of the covariance matrix corresponding to the largest eigenvalues [14]. In the field of nonlinear time series analysis PCA was first used by D. Broomhead and G. King [9], who suggested applying dimension reduction to high dimensional delay reconstructions with time series densely sampled in time.
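Let $\{x^n\}$ be the set of all $N$ reconstructed states $x^n = (x^n_1, \ldots, x^n_{D_E}) \in \mathbb{R}^{D_E}$ (at different times $t$ and locations $\alpha$, assuming stationary and spatially homogeneous dynamical rules). To perform PCA, first the mean value $\bar{x} = \frac{1}{N} \sum_{n=1}^{N} x^n$ is subtracted from all reconstructed states, giving shifted states $\tilde{x}^n = x^n - \bar{x}$ from which the covariance matrix $C_X$ and the PCA projection matrix $P$ are computed. The whole data set can thus be embedded into the space with reduced dimension $D_R$ by embedding each point of the data set into the high dimensional space $\mathbb{R}^{D_E}$ and projecting it into the low dimensional space $\mathbb{R}^{D_R}$ using the PCA projection matrix $P$ computed beforehand.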
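A minimal sketch of this projection, using only the Python standard library, is given below. It is not the implementation of this paper: the eigenvectors are approximated by power iteration with deflation (an assumed stand-in for a proper eigensolver), and the names `fit_pca` and `project` are hypothetical.

```python
def fit_pca(states, n_components, iterations=200):
    """Return (mean, P, explained): the mean state, a projection matrix P
    whose rows approximate the leading eigenvectors of the covariance
    matrix, and the fraction of variance carried by each row."""
    n, dim = len(states), len(states[0])
    mean = [sum(x[d] for x in states) / n for d in range(dim)]
    # Accumulate the covariance matrix from the shifted states.
    cov = [[0.0] * dim for _ in range(dim)]
    for x in states:
        shifted = [x[d] - mean[d] for d in range(dim)]
        for a in range(dim):
            for b in range(dim):
                cov[a][b] += shifted[a] * shifted[b] / n
    total_variance = sum(cov[d][d] for d in range(dim))
    P, explained = [], []
    for _ in range(n_components):
        vec = [1.0 / (d + 1) for d in range(dim)]  # deterministic start vector
        for _ in range(iterations):
            new = [sum(cov[a][b] * vec[b] for b in range(dim)) for a in range(dim)]
            norm = sum(c * c for c in new) ** 0.5
            if norm < 1e-15:
                break
            vec = [c / norm for c in new]
        eigval = sum(vec[a] * cov[a][b] * vec[b]
                     for a in range(dim) for b in range(dim))
        P.append(vec)
        explained.append(eigval / total_variance)
        # Deflation: remove the component just found from the matrix.
        for a in range(dim):
            for b in range(dim):
                cov[a][b] -= eigval * vec[a] * vec[b]
    return mean, P, explained

def project(state, mean, P):
    """Shift a state by the mean and project it onto the rows of P."""
    shifted = [s - m for s, m in zip(state, mean)]
    return [sum(p * s for p, s in zip(row, shifted)) for row in P]

# Toy data: large variance along (1, 1, 0), small variance along (1, -1, 0).
data = [(2.0, 2.0, 0.0), (-2.0, -2.0, 0.0), (0.5, -0.5, 0.0), (-0.5, 0.5, 0.0)]
mean, P, explained = fit_pca(data, n_components=2)
reduced = project((2.0, 2.0, 0.0), mean, P)
```

The list `explained` can be used to pick the reduced dimension so that a chosen fraction of the original variance is preserved. The sketch assumes the data have nonzero variance; for large $D_E$ a library eigensolver would be the practical choice.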
For the subsequent prediction process the projected reconstruction vectors $y^n$ are then fed into a tree structure such as a kd-tree [5,11] for fast nearest neighbour searching.
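To illustrate the kind of tree structure meant here, the following is a compact kd-tree with a single nearest neighbour query. It is a teaching sketch under simplifying assumptions (tuple-based nodes, recursion, no k > 1 queries) and is not the search structure used by the software of this paper.

```python
def build_kdtree(points, indices=None, depth=0):
    """Recursively build a kd-tree; a node is (index, axis, left, right)."""
    if indices is None:
        indices = list(range(len(points)))
    if not indices:
        return None
    axis = depth % len(points[0])
    indices.sort(key=lambda i: points[i][axis])
    mid = len(indices) // 2
    return (indices[mid], axis,
            build_kdtree(points, indices[:mid], depth + 1),
            build_kdtree(points, indices[mid + 1:], depth + 1))

def nearest(node, points, query, best=None):
    """Return (squared distance, index) of the point closest to `query`."""
    if node is None:
        return best
    idx, axis, left, right = node
    d2 = sum((a - b) ** 2 for a, b in zip(points[idx], query))
    if best is None or d2 < best[0]:
        best = (d2, idx)
    diff = query[axis] - points[idx][axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, points, query, best)
    # Visit the far side only if the splitting plane is closer than the best hit.
    if diff * diff < best[0]:
        best = nearest(far, points, query, best)
    return best

# Deterministic toy point set standing in for the projected states y^n.
points = [((7 * i) % 11, (5 * i) % 13, (3 * i) % 17) for i in range(40)]
tree = build_kdtree(points)
dist2, index = nearest(tree, points, (5, 6, 7))
```

Because the returned index refers to the original ordering of the points, the associated target value can be looked up directly, which is what the prediction step relies on.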
One issue arises with states near boundaries. Since the dynamics close to the boundaries may differ from the rest of the system, they were separated from other reconstructed vectors in phase space. This was achieved by setting the non-existent neighbours of boundary points to a large constant value [20]. The power of PCA, however, relies on its assumption of a single cloud of points in (state) space close to a low dimensional linear subspace. This is no longer the case when constant boundaries come into play. To sidestep this issue we suggest changing the second step of the procedure described above: simply exclude all boundary states from the computation of the projection matrix $P$ but project them with the resulting matrix $P$ nonetheless. In principle this could eliminate the offset meant to separate internal and boundary dynamics, but in practice the projection matrices rarely possess zero-valued entries. Therefore it is highly unlikely that this would become a problem as long as boundary offset values are chosen large enough.

Prediction Algorithm
While the values and the dimension of the state vectors have changed in the dimension reduction process, their ordering $(t, \alpha) \leftrightarrow n$ within the reconstructed space and the search tree is known. It is therefore sufficient to find the indices of nearest neighbours in the dimension reduced reconstructed training set in $\mathbb{R}^{D_R}$. To make predictions we assign each reconstructed state $x_{t,\alpha}$ a target value $z_{t,\alpha}$ from the original training data, and the only difference between temporal prediction and cross estimation lies in the choice of these target values.
For time series prediction we choose $x_{t,\alpha} \to u_{t+1,\alpha}$, where $x_{t,\alpha}$ are the reconstructed vectors from the spatio-temporal time series $\{u_{t,\alpha}\}$ and $u_{t+1,\alpha}$ the target values. The prediction process then consists of reconstructing $x_{T,\alpha}$ from the end of the time series by applying the same embedding parameters, subsequent dimension reduction using the projection matrix $P$ that was computed for the training set, and local nearest neighbour modelling providing the target values $u_{T+1,\alpha}$. Once a prediction for each point (denoted by $\alpha$) has been made, all future values $u_{T+1,\alpha}$ of the (input) field $u$ are known and the procedure can be repeated for predicting $u_{T+2,\alpha}$. Using this kind of iterated prediction, spatio-temporal time series can, in principle, be forecast for any period of time (with the well-known limits of predictability of chaotic dynamics).
The case of cross estimation is even simpler than time series prediction. Here we are given a training set of two fields: an input variable $u_{t,\alpha}$ and a target variable $v_{t,\alpha}$. The values of the input field $u_{t,\alpha}$ are reconstructed into delay vectors $x_{t,\alpha}$. Using PCA and nearest neighbour search we find similar reconstructed states in the training set for which the corresponding values of the target variables are known and can be used for estimating the current target $v_{t,\alpha}$.
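The iterated prediction loop can be sketched on a toy example: a periodic one-dimensional pattern that moves one site to the right per time step. To keep the sketch short it uses purely spatial local states ($\gamma = 0$), omits the PCA step, and searches neighbours by brute force; all function names are hypothetical.

```python
def local_states(frame, r):
    """Spatial local states of a periodic 1D frame (gamma = 0)."""
    n = len(frame)
    return [tuple(frame[(i + d) % n] for d in range(-r, r + 1)) for i in range(n)]

def predict_frame(frame, train_states, train_targets, r):
    """One-step prediction: 1-nearest-neighbour lookup for every site."""
    result = []
    for state in local_states(frame, r):
        best = min(range(len(train_states)),
                   key=lambda m: sum((a - b) ** 2
                                     for a, b in zip(state, train_states[m])))
        result.append(train_targets[best])
    return result

def iterate_prediction(frame, train_states, train_targets, r, steps):
    """Feed each predicted frame back in as input for the next step."""
    frames = []
    for _ in range(steps):
        frame = predict_frame(frame, train_states, train_targets, r)
        frames.append(frame)
    return frames

def shift(frame):
    """Toy dynamics: the pattern moves one site to the right."""
    return frame[-1:] + frame[:-1]

history = [[0, 1, 3, 2, 0, 0, 5, 4]]
for _ in range(10):
    history.append(shift(history[-1]))

# Training pairs: local state at time t -> value of the same site at t + 1.
r = 1
train_states, train_targets = [], []
for now, nxt in zip(history[:-1], history[1:]):
    train_states.extend(local_states(now, r))
    train_targets.extend(nxt)

forecast = iterate_prediction(history[-1], train_states, train_targets, r, steps=3)
```

For these noise-free toy dynamics every query state has an exact match in the training set, so the forecast reproduces the true evolution. For chaotic data the errors of each step feed into the next one, which is what limits the prediction horizon.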
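Cross estimation differs from temporal prediction only in the targets attached to the training states, as the following sketch shows. The hidden relation between the two fields is invented for the example, PCA is again omitted, and the function names are hypothetical.

```python
def local_states(frame, r):
    """Spatial local states of a periodic 1D frame (gamma = 0)."""
    n = len(frame)
    return [tuple(frame[(i + d) % n] for d in range(-r, r + 1)) for i in range(n)]

def build_training_set(u_frames, v_frames, r):
    """Pair every local state of the input field u with the value of the
    target field v at the same time and site."""
    states, targets = [], []
    for u, v in zip(u_frames, v_frames):
        states.extend(local_states(u, r))
        targets.extend(v)
    return states, targets

def cross_estimate(u_frame, states, targets, r, k=1):
    """Estimate v at every site from the k nearest training states."""
    estimate = []
    for query in local_states(u_frame, r):
        ranked = sorted(range(len(states)),
                        key=lambda m: sum((a - b) ** 2
                                          for a, b in zip(query, states[m])))
        estimate.append(sum(targets[m] for m in ranked[:k]) / k)
    return estimate

def hidden_field(u):
    """Invented relation for this example: v depends on u and its neighbours."""
    n = len(u)
    return [u[i - 1] + 2 * u[i] + u[(i + 1) % n] for i in range(n)]

u_train = [[0, 1, 3, 2, 0, 0, 5, 4], [2, 2, 1, 0, 4, 3, 0, 1]]
v_train = [hidden_field(u) for u in u_train]
states, targets = build_training_set(u_train, v_train, r=1)

u_test = [0, 5, 4, 0, 1, 3, 2, 0]  # cyclic shift of the first training frame
v_estimate = cross_estimate(u_test, states, targets, r=1)
```

Since the test frame is a shifted copy of a training frame, each local state is found exactly and the estimate coincides with the hidden field. With real data the neighbours are only approximately equal and averaging over several of them (`k > 1`) may help.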

Error Measures
In the next section we will test the presented prediction methods on the model systems described in section 3. For evaluation we compare any predicted field $\hat{v}$ with the corresponding correct values (i.e. test values) $v$ by considering spatial averages over all sites $\alpha$. This so-called Mean Squared Error (MSE) is then normalized by the MSE obtained when using the (spatial) mean value $\bar{v}$ for prediction. The resulting Normalized Root Mean Squared Error (NRMSE) is defined as

$$\mathrm{NRMSE}(v, \hat{v}) = \sqrt{\frac{\mathrm{MSE}(v, \hat{v})}{\mathrm{MSE}(v, \bar{v})}}, \qquad \mathrm{MSE}(v, x) = \frac{1}{A} \sum_{\alpha} (v_\alpha - x_\alpha)^2, \tag{1}$$

where $A$ is the number of spatial sites $\alpha$ taken into account. Any good estimate or forecast should be (much) better than the trivial prediction using mean values and result in NRMSE values (much) smaller than one.
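A short sketch of this error measure for a single frame, flattened into a list of site values, is given below. It assumes the root variant suggested by the acronym NRMSE; dropping the square root gives the squared variant. The function name `nrmse` is chosen for this example.

```python
import math

def nrmse(truth, prediction):
    """MSE of the prediction, normalized by the MSE of the trivial
    prediction that uses the spatial mean of the true field (root variant,
    assumed here)."""
    A = len(truth)
    mean = sum(truth) / A
    mse_prediction = sum((v - p) ** 2 for v, p in zip(truth, prediction)) / A
    mse_mean = sum((v - mean) ** 2 for v in truth) / A
    return math.sqrt(mse_prediction / mse_mean)

truth = [0.0, 1.0, 2.0, 3.0]
trivial = [1.5] * 4  # spatial mean of `truth` at every site
error_of_trivial = nrmse(truth, trivial)
```

By construction the trivial mean prediction scores one and a perfect prediction scores zero. The function is undefined for a spatially constant true field, where the normalization vanishes.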

Software
All software used in this paper has been published in the form of an open source software library under the name of TimeseriesPrediction.jl [1] along with extensive documentation and various examples. It is written using the programming language Julia [6] with extensibility in mind, such that it is compatible with different spatial dimensions as well as arbitrary spatio-temporal embeddings. This is made possible through a modular design and Julia's multiple dispatch.

Model Systems
The Kuramoto-Sivashinsky (KS) model [17,23,24] has been devised for modelling flame fronts and will in our case be used as a benchmark system for iterated time series prediction. The Barkley model [4] describes an excitable medium that shows chaotic interplay of traveling waves. The third and most complex model is the Bueno-Orovio-Cherry-Fenton (BOCF) model [10], which is composed of four coupled fields describing electrical excitation waves in the heart muscle.

Kuramoto-Sivashinsky System
The Kuramoto-Sivashinsky (KS) system [17,23,24] is defined by the following partial differential equation:

$$\frac{\partial u}{\partial t} = -\frac{\partial^2 u}{\partial x^2} - \frac{\partial^4 u}{\partial x^4} - u \frac{\partial u}{\partial x}, \tag{2}$$

typically integrated with periodic boundary conditions. It is widely used in the literature [20,21] because it is a simple system consisting of just one field while still showing high-dimensional chaotic dynamics. The dynamics were simulated with an ETDRK4 algorithm [12,16], and the parameters for integration are the time step $\Delta t = 0.25$ and the system size $L$ with spatial sampling $Q$. Two example evolutions with $L = 22$, $Q = 64$ and $L = 200$, $Q = 512$ are shown in Fig. 2.

Barkley Model
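The Barkley model [4] is a simple system that exhibits excitable dynamics. We will use a modification with a cubic $u^3$ term that can be used to generate spatio-temporal chaos such that:

$$\frac{\partial u}{\partial t} = D \nabla^2 u + \frac{1}{\varepsilon} u (1 - u) \left( u - \frac{v + b}{a} \right), \qquad \frac{\partial v}{\partial t} = u^3 - v, \tag{3}$$

where the parameter set $a = 0.75$, $b = 0.06$, $\varepsilon = 0.08$ and $D = 0.02$ leads to chaotic behavior. For integration we used $\Delta t = 0.01$ and $\Delta x = 0.1$ in combination with an optimized FTCS scheme like the one described in [4].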
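For orientation, a plain (not optimized) FTCS step can be sketched as follows. The sketch assumes the common form of the cubic Barkley equations, $\partial_t u = D \nabla^2 u + \varepsilon^{-1} u (1 - u)(u - (v + b)/a)$ and $\partial_t v = u^3 - v$, a five-point Laplacian, and zero-gradient edges; the boundary treatment and the function name are assumptions of this example and not necessarily those of the scheme in [4].

```python
def barkley_step(u, v, a=0.75, b=0.06, eps=0.08, D=0.02, dt=0.01, dx=0.1):
    """One explicit Euler (FTCS) step of the cubic Barkley model on a 2D
    grid stored as nested lists. Edges use a zero-gradient condition,
    which is an assumption of this sketch."""
    rows, cols = len(u), len(u[0])
    new_u = [[0.0] * cols for _ in range(rows)]
    new_v = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            # Clamp indices so that the ghost cell equals the edge value.
            up, down = u[max(i - 1, 0)][j], u[min(i + 1, rows - 1)][j]
            left, right = u[i][max(j - 1, 0)], u[i][min(j + 1, cols - 1)]
            laplacian = (up + down + left + right - 4.0 * u[i][j]) / dx ** 2
            reaction = u[i][j] * (1.0 - u[i][j]) * (u[i][j] - (v[i][j] + b) / a) / eps
            new_u[i][j] = u[i][j] + dt * (D * laplacian + reaction)
            new_v[i][j] = v[i][j] + dt * (u[i][j] ** 3 - v[i][j])
    return new_u, new_v

# Uniform excited state u = 1, v = 0 on a small 3x3 grid.
u0 = [[1.0] * 3 for _ in range(3)]
v0 = [[0.0] * 3 for _ in range(3)]
u1, v1 = barkley_step(u0, v0)
```

With the quoted parameters the diffusion number $D \Delta t / \Delta x^2 = 0.02$ is far below the explicit stability limit of 0.25 for the two-dimensional five-point stencil.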

Bueno-Orovio-Cherry-Fenton Model
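The Bueno-Orovio-Cherry-Fenton (BOCF) model [10] is a more advanced set of equations that serves as a realistic but relatively simple model of (chaotic) cardiac dynamics. It consists of four coupled fields that can be integrated as PDEs on various geometries. For the sake of simplicity we consider a two-dimensional square. The four variables $u, v, w, s$ are given by the following equations:

$$\begin{aligned}
\frac{\partial u}{\partial t} &= D \nabla^2 u - (J_{si} + J_{fi} + J_{so}), \\
\frac{\partial v}{\partial t} &= \frac{(1 - H(u - \theta_v))(v_\infty - v)}{\tau_v^-} - \frac{H(u - \theta_v)\, v}{\tau_v^+}, \\
\frac{\partial w}{\partial t} &= \frac{(1 - H(u - \theta_w))(w_\infty - w)}{\tau_w^-} - \frac{H(u - \theta_w)\, w}{\tau_w^+}, \\
\frac{\partial s}{\partial t} &= \frac{1}{\tau_s} \left( \frac{1 + \tanh(k_s (u - u_s))}{2} - s \right),
\end{aligned} \tag{4}$$

where the currents $J_{si}$, $J_{fi}$ and $J_{so}$ and all other parameters are defined in the appendix. Variable $u$ represents the voltage across the cell membrane and provides spatial coupling due to the diffusion term, whereas $v$, $w$, and $s$ are governed by local ODEs without any spatial coupling. Fig. 4 shows a snapshot of all four fields. To make it easier to tell the different fields apart, each one has been assigned its own color map that will be used consistently.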
For simulation we used an implementation by Roland Zimmermann [27] that simulates the dynamics of the BOCF model using an FTCS scheme on a 500 × 500 grid with integration parameters $\Delta x = 1$, $\Delta t = 0.1$, diffusion constant $D = 0.2$, no-flux boundary conditions and a temporal sampling of $t_{\text{sample}} = 2.0$. The dense spatial sampling is needed for integration but impractical for our use. Therefore the software by Zimmermann coarse-grains the data to a grid of size 150 × 150.

Cross Estimation
For cross estimation we analyze the Barkley model and the BOCF model. In the beginning both systems are simulated for more than 10000 time steps so that different subsets can be chosen for model training and testing. All training sets consisted of 5000 consecutive time

Barkley Model
For the Barkley model (3) only the $u$ variable has a diffusion term. Therefore the dynamics of $v$ solely depends on $u$ and its past. This significantly reduces the reconstruction parameter space, as spatial neighbourhoods may only be needed for noise reduction during PCA and can likely be small. For the prediction direction $u \to v$ the best embedding found by optimization, i.e. the one with the least prediction error, was $\gamma = 500$, $\tau = 1$ and $r = 0$. These parameters produce a highly redundant embedding which allows PCA to efficiently filter out noise. The other direction $v \to u$ needs spatial neighbourhoods for effective cross prediction, and the embedding parameters were $\gamma = 30$, $\tau = 5$ and $r = 3$.
The results evaluated according to the error measure (1) are listed in Table 1, where $D_E$ is the initial embedding dimension (a direct result of the reconstruction parameters chosen) and $D_R$ is the reduced dimension used to make the prediction. The data were sampled with $t_{\text{sample}} = 0.01$, the constant value 200 was used for the pixels beyond the boundary in both predictions, and the errors are averaged over 20 predicted frames. A visualization of the predictions is shown in Fig. 6 along with additional predictions performed with identical parameters but for noiseless input.

BOCF Model
Similarly to the Barkley model, only the $u$ variable of the BOCF model (4) has a diffusion term, which simplifies the predictions of $u \to \{v, w, s\}$. All embedding parameters, found with a stochastic gradient descent procedure, are listed along with the prediction errors in Table 2. In most of these cases we observe a very large temporal embedding with a small spatial neighbourhood. This is likely due to the dense temporal sampling relative to the propagation speed of wavefronts within the simulated medium. In this way the highly redundant embedding and PCA for dimension reduction provide an effective method of noise reduction. The $w$ field however presents itself as a somewhat smeared out version of the other variables, thus requiring a larger spatial neighbourhood to recover the positions of wavefronts.
To visualize a few results we chose the best and worst performing estimations. Figure 7 contains results for $w_{\text{noisy}} \to \{u, v, s\}$ and Fig. 8 shows estimations from a noisy $u$ field to all other variables. The NRMSE values in Table 2 indicate that the estimations from field $w$ perform about one order of magnitude worse than the estimations from field $u$. Figures 7 and 8 on the other hand reveal that, even in the latter estimations, the erroneous pixels are concentrated around the wavefronts. Thus the overall prediction for most of the area is very accurate in both cases.

Iterated Time Series Prediction
In the following we will analyze the performance of local modelling for spatially extended systems in the context of iterated time series prediction. For this we use the Kuramoto-Sivashinsky model (2) and the Barkley model (3). The obvious performance measure in this case is the time it takes before the prediction errors exceed a certain threshold. Time however is not an absolute concept in dimensionless systems. Therefore we will also define characteristic timescales of each system which will give a context to the prediction times.

Predicting Barkley Dynamics
The datasets used during cross estimation were sampled with $t_{\text{sample}} = 0.01$, which could be considered nearly continuous relative to the timescale of the dynamics. To provide a useful example for temporal prediction with a reasonable number of predicted frames we use a larger time step $t_{\text{sample}} = 0.2$ (the simulation time step was still sufficiently small for accurate numerical integration).
Figure 9 shows one such prediction of the $u$ variable in the Barkley model. The figure consists of seven subplots where the top two rows show the system state at the prediction time steps $n = 25, 50$ as well as the corresponding iterated predictions. The rightmost column displays the absolute errors of the prediction defined by $|u_{t,\alpha} - \hat{u}_{t,\alpha}|$. At the bottom is the time evolution of the NRMSE for the prediction. Looking closely at the snapshots in the figure reveals that the maximum prediction error indeed increases quickly, as can be seen from the dark spots in the error plots (c) and (f). The overall error however increases much more slowly, which is confirmed by comparing the original state with the prediction.
To put the above results into perspective we calculate a characteristic timescale for the Barkley model. Here we will use the average time between two consecutive local maxima for each pixel, which to a good approximation gives the average period of the rotating spiral waves. Averaging over 100 × 100 pixels and 4000 time steps gave this time as $t_c = 4.81$. This means that the error of the $u$ field prediction increased to $\mathrm{NRMSE}(u, 2t_c) \approx 0.5$ within two characteristic times.
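The characteristic time described above can be estimated along the following lines. The sketch assumes strict local maxima and averages the per-pixel mean intervals with equal weight; both choices, as well as the function names, are assumptions of this example.

```python
def mean_time_between_maxima(series, dt):
    """Average time between consecutive strict local maxima of one pixel,
    or None if fewer than two maxima are present."""
    peaks = [t for t in range(1, len(series) - 1)
             if series[t - 1] < series[t] > series[t + 1]]
    if len(peaks) < 2:
        return None
    return dt * (peaks[-1] - peaks[0]) / (len(peaks) - 1)

def characteristic_time(frames, dt):
    """Average the per-pixel result over all pixels that show at least
    two maxima. `frames` is a list of 2D grids (nested lists)."""
    rows, cols = len(frames[0]), len(frames[0][0])
    times = []
    for i in range(rows):
        for j in range(cols):
            value = mean_time_between_maxima([f[i][j] for f in frames], dt)
            if value is not None:
                times.append(value)
    return sum(times) / len(times)

# Toy signal with a period of four samples, sampled with dt = 0.2.
signal = [0, 1, 0, -1, 0, 1, 0, -1, 0, 1, 0]
period = mean_time_between_maxima(signal, dt=0.2)
```

For the toy signal the maxima lie four samples apart, so the estimated period is 0.8 time units. Noisy data would typically need smoothing before the maxima are counted.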

Predicting Kuramoto-Sivashinsky Dynamics
The Kuramoto-Sivashinsky (KS) model (2) is a one dimensional system that has just a single field. As in the iterated time series prediction of the Barkley model, we will need a characteristic timescale for the dynamics of the KS model to assess the quality of the forecast. The approach of measuring wave-front-return-times proved not to be as effective. It is possible to integrate the KS model with different sizes $L$ and spatial samplings $Q$. We will attempt to predict the time evolution for $L = 22$, $Q = 64$ and a larger system with $L = 200$ and $Q = 512$. The smaller one of the two has just 64 points and thereby could be predicted by reconstructing either local or global states, where the latter are given by combining samples from all $Q$ sites in a state vector. The global states have a higher dimension and need larger training sets to densely fill the reconstructed space, but in return each vector represents the state of the whole system. Both approaches are compared in Fig. 10.
A notable observation with the KS model is its variable predictability, as it strongly depends on the initial conditions, i.e. the current position on the chaotic attractor. Figure 11 shows two predictions using $10^5$ training states and identical (reconstruction) parameters. The only difference is that one training set, chosen from pre-generated data, was offset by 4000 states relative to the other. The results differ greatly. In one case the errors stay small for roughly 8 Lyapunov times while the other diverges already after 3 Lyapunov times.
Similar variations in predictability were not observed in the KS system with $L = 200$ and $Q = 512$, which may be due to the significantly larger extent of the system. Instead predictions rarely stayed correct for more than two Lyapunov times. An example is shown in Fig. 12. The issue of variations in predictability of the KS model hinders direct comparisons to the work of Pathak et al. [21], who did not address this problem. In the small system we saw initial conditions where predictions outperformed the ones by Pathak et al. but also others that were much worse. The larger system however has so far been harder to predict and we did not match the prediction accuracy of the approach of Pathak et al.

Benchmark of PCA
In this paper we use principal component analysis for two reasons. The obvious purpose is to find a low-dimensional representation of the high-dimensional embedding. One very much wanted side-effect is noise reduction. All of the above presented examples used highly redundant embeddings to allow for noise reduction.
To evaluate how well PCA is suited for this purpose we test two things. First, does PCA find a low dimensional representation? This is tested in Fig. 13. We see the dependence of the prediction error on the output dimension of PCA in a cross estimation of $u \to v$ in the Barkley model. It is evident that in this case no more than about 5 to 7 dimensions are needed to encode all information relevant to the prediction.
To test whether PCA also successfully eliminated the noise in the test set we compare the two panes in Fig. 13 where the results in the right pane were computed using a 20 times less redundant embedding.The noiseless predictions perform similarly well in both cases, indicating that the additional embedding dimensions are indeed redundant and do not add much information to the reconstructed states.Comparing the noisy predictions highlights the effectiveness of PCA in this case as predictions from the redundant embedding (Fig. 13a) are consistently better by one order of magnitude (comp.

Conclusions
The combination of local modelling and principal component analysis for dimension reduction provides a conceptually simple yet effective approach to both cross estimation and temporal prediction of complex spatially extended dynamics. The equations for all three model systems (Barkley model, BOCF model, KS model) were only needed for data generation, and as such the approach could well be applied to real world data where the underlying dynamics are not known. Adding noise to the input data naturally reduces prediction quality, but in section 5.3 it was shown that PCA can restore accuracy from a more redundant embedding.

Acknowledgements
The authors thank R. Zimmermann for allowing the use of BOCF model simulation and S. Luther for scientific discussions and continuous support.

Appendix
Bueno-Orovio-Cherry-Fenton Model
$H(\cdot)$ denotes the Heaviside function. The currents $J_{fi}$, $J_{so}$, $J_{si}$ of the Bueno-Orovio-Cherry-Fenton (BOCF) model (4) are defined as in [10]. The $\tau$ parameters entering the model are not constant but rather functions of the cell membrane voltage variable $u$. All other parameters are listed in Table 3; the parameter set is the one from [10] that imitates the Ten Tusscher-Noble-Noble-Panfilov model [26].

Figure 1: Visualization of spatial embedding for radii $r \in \{1, 1.5, 4\}$. All points within the circle spanned by $r$ are included in the embedding.
The covariance matrix $C_X$ is accumulated by iteratively reconstructing individual shifted delay vectors $\tilde{x}^n$ from the data set and summing the terms $(\tilde{x}^n)^{tr} \cdot \tilde{x}^n$ into the preallocated matrix $C_X$. Local states $y^n$ of lower dimension $D_R \le D_E$ are obtained by projecting the shifted states, $y^n = P \tilde{x}^n$, using a (globally valid) $D_R \times D_E$ projection matrix $P$ whose rows are given by the $D_R$ eigenvectors of the matrix $C_X$ corresponding to the largest $D_R$ eigenvalues. The dimensionality $D_R$ of the subspace spanned by the eigenvectors to be taken into account can either be set explicitly or determined such that some percentage of the original variance of the embedding is preserved.

Figure 2: Temporal evolution of the KS model (2) for two different system sizes. Pane (a) has parameters $L = 22$ and $Q = 64$, while the larger system (b) has $L = 200$ and $Q = 512$.

Figure 3: Snapshot of the chaotic Barkley model (3) on a grid of size 150 × 150 with constant boundary conditions and after transients decayed. The $u$ variable is displayed in (a) and $v$ in (b).

Figure 4: Snapshot of the BOCF model simulated on a 500 × 500 grid and coarse grained to a 150 × 150 grid using the software by Zimmermann [27].

Figure 5: Snapshots of the variables $u$ and $v$ of the Barkley model and the variables $u$ and $w$ of the BOCF model after addition of normally distributed noise.

Figure 6: Cross estimation of data generated by the Barkley model (3) from a noisy $v$ field to $u$ and vice versa. (a)-(d) show estimates of the $u$ field where (a) is the actual $u$ field, (b) the predicted field $\hat{u}$ (from noisy input), (c) the absolute difference between the two, and (d) a reference estimation error for noiseless input with identical embedding parameters and training set. Panes (e)-(h) show the same for the field $v$. The embedding parameters are listed in Table 1.

Figure 7: Cross estimation of data generated by the BOCF model (4) from $w_{\text{noisy}}$ to all three other variables. (a)-(d) show estimates of the $u$ field where (a) is the actual $u$ field, (b) the predicted field $\hat{u}$ (from noisy input), (c) the absolute difference between the two, and (d) a reference estimation error for noiseless input with identical embedding parameters and training set. Panes (e)-(h) and (i)-(l) show the same for the fields $v$ and $s$, respectively. The embedding parameters are listed in Table 2.

Figure 8: Cross estimation of data generated by the BOCF model (4) from $u_{\text{noisy}}$ to all three other variables. (a)-(d) show estimations for the $v$ field where (a) is the actual $v$ field, (b) the predicted field (from noisy input), (c) the absolute difference between the two, and (d) a reference estimation error for noiseless input with identical embedding parameters and training set. Panes (e)-(h) and (i)-(l) show the same for the fields $w$ and $s$, respectively. The embedding parameters are listed in Table 2.

Figure 9: Predicting field $u$ of the Barkley model with system size 150 × 150 and a training set of 5000 states. The embedding parameters are $\gamma = 12$, $\tau = 2$, $r = 4$, boundary value 200. PCA reduced the dimension from $D_E = 637$ to $D_R = 15$. Panes (a) and (d) show the true evolution at time $t = 5$ and $t = 10$. Panes (b) and (e) contain the iterated prediction at that time and (c) and (f) the corresponding absolute error. (g) shows the accumulation of the NRMSE in the prediction. The dashed lines mark $t_c$, the bullets mark the times 5 and 10.
In Fig. 10 both approaches are compared using the same training set of $9 \cdot 10^5$ states. For smaller training sets the global state prediction fails altogether, as the high dimensional embedding space is too sparsely sampled and nearest neighbour searches always find the same state over and over again. The local state prediction on the other hand still predicts well for some time, as is shown in the same figure.

Figure 10: Predictions of the KS dynamics using PCA and 5 nearest neighbours. Shown are: in (a) and (f) actual evolutions, below them in (b) and (g) predictions from local states with parameters $\gamma = 7$, $\tau = 1$, $r = 10$, and at the bottom in (d) and (i) predictions using global states ($\gamma = 0$, $r = 32$), each along with its errors. The left panel was generated using a training set consisting of $9 \cdot 10^5$ states, while the right one used $1 \cdot 10^5$ training states.

Figure 11: Two predictions of the KS dynamics with $L = 22$, $Q = 64$. All parameters were identical: $\gamma = 7$, $\tau = 10$, $r = 10$ using $10^5$ training states, PCA, and 1 nearest neighbour for local modelling. The only difference lies in the initial condition. (a) and (d) show the true evolution, (b) and (e) the predictions, and (c) and (f) the corresponding absolute errors.

Figure 12: Prediction of the KS dynamics with system parameters $L = 200$, $Q = 512$ and a training set of 70000 states. The embedding parameters are $\gamma = 2$, $\tau = 4$, $r = 10$ with PCA and just one nearest neighbour for local modelling.

Figure 13: Normalized Mean Squared Errors of cross estimation $u \to v$ of Barkley variables vs. reduced dimension $D_R$ for clean and noisy input signals $u$ with different embedding parameters ((a) $\gamma = 500$, $\tau = 1$; (b) $\gamma = 25$, $\tau = 20$) such that $\gamma\tau$ remains constant. The estimation error increases for very small values of the reduced dimension $D_R$, but becomes almost constant for $D_R > 5$. For noisy data, PCA based dimension reduction of the higher dimensional embedding with $D_E = \gamma + 1 = 501$ in (a) enables a more efficient noise reduction than the reconstruction with $D_E = \gamma + 1 = 26$ in (b).

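Table 1: Optimal embedding parameters and cross estimation errors for noisy data from the Barkley model (3) with temporal sampling $t_{\text{sample}} = 0.01$. $D_E$ is the initial embedding dimension (a direct result of the reconstruction parameters chosen) and $D_R$ is the reduced dimension used to make the prediction. For both predictions we used the constant value of 200 for the pixels beyond the boundary. The errors are averaged over 20 predicted frames.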

Table 2: Embedding parameters and cross estimation errors, averaged over 20 frames, for noisy data from the BOCF model (4) with temporal sampling of $t_{\text{sample}} = 2.0$. A value of 200 was used for the pixels beyond the boundary. $D_E$ is the original reconstruction dimension and $D_R$ is the reduced dimension used for nearest neighbour searches.

Table 3: Parameter set for the BOCF model [10] that imitates the Ten Tusscher-Noble-Noble-Panfilov model [26].