Trends in recurrence analysis of dynamical systems

The last decade has witnessed a number of important and exciting developments that had been achieved for improving recurrence plot-based data analysis and to widen its application potential. We will give a brief overview about important and innovative developments, such as computational improvements, alternative recurrence definitions (event-like, multiscale, heterogeneous, and spatio-temporal recurrences) and ideas for parameter selection, theoretical considerations of recurrence quantification measures, new recurrence quantifiers (e.g. for transition detection and causality detection), and correction schemes. New perspectives have recently been opened by combining recurrence plots with machine learning. We finally show open questions and perspectives for futures directions of methodical research.


Introduction
Recurrence in dynamical systems is a fundamental feature, indicating different types of dynamics, such as periodic, chaotic, or random variations, or predictable and unpredictable variability.The study of recurrences in dynamical systems by recurrence plot (RP) based methods 1 , such as recurrence quantification analysis (RQA) and recurrence networks (RNs) is receiving a growing interest in many different scientific disciplines , well represented by the increasing number of publications (Fig. 1) and the diverse scientific disciplines these studies cover (Fig. 2A).The increase in the number of studies citing the seminal works introducing RPs, RQA, and RNs [31-33, 2, 34, 35] is even stronger (Fig. 1), which can be interpreted as a growing general popularity of these methods not limited to the (still small although growing) community of researchers.Nowadays, studies which are actually not using RP based methods refer to them, e.g., as alternative useful approaches or some kind of standard methods.Obviously, RP based methods are meanwhile well-accepted in data science.
A growing number of available software is supporting this positive development (Fig. 2B).Progress in theoretical understanding of RP analysis, GPU based computa e-mail: marwan@pik-potsdam.de 1 A RP is a matrix Ri,j = Θ (ε − ∥⃗ xi − ⃗ xj∥), representing all the times j when a state at time i is recurring.Further information on RPs and RQA can be found, e.g., in this special issue in [1] or in the review [2].ing, and software development in general have allowed very efficient and fast packages for Python and Julia (cp.Subsect.2.1).Such packages are beneficial for working with the challenges of big data and integrating them to machine learning approaches.A list of software is available at [37].[36], May 2022, see also the notes in Appendix A.2); (B) software for RP based analysis is available as standalone applications and as packages for the most frequently used high-level programming languages (based on information at [37], May 2022).
Big and ever growing data sets, multi-scale and spatial data, very long or very short data, data with gaps, irregular sampling and uncertainties are challenges in many scientific disciplines.Novel ideas and concepts are required to answer the research questions of today.The ongoing technical developments of RP based approaches in both theoretical and practical domains provide tailored tools for the specific challenges.Here we have selected a multitude of directions, ranging from computational developments, over new theoretical insight and new recurrence definitions, to novel extensions and applications of RP based research.It allows the interested reader to catch up on hot topics and recent developments in RP based analysis.
2 Trends and novel directions

3). (B)
The approximative RQA allows calculation times of a few seconds for time series of length larger than 1 million data points, where standard single-thread calculations need hours (details can be found in [38]).
The recurrence matrix R i,j = Θ(ε − ∥⃗ x i − ⃗ x j ∥) is the basis for RP, RQA, and RN, but the calculation of this recurrence matrix is an N × N pairwise test (with N the length of the phase space trajectory ⃗ x i , i = 1, . . ., N ), thus, comes with large computational costs in the order of O(N 2 ).Some of the subsequent quantification (RQA measures, network measures) add a further amount of computational complexity, usually an additional O(N 2 ).Therefore, long time series and big data applications require a fast and efficient calculation of the recurrence matrix and the RP based quantifiers.Several approaches would be possible: an efficient implementation, a parallelisation of the computation, and approximation of the calculations.
Fast calculations can be performed, e.g., using Python, a widely used software framework in the scientific community.The pyunicorn package [39] uses a very efficient implementation based on Cython and provides recurrence network measures.
Recently, the Julia language was introduced with the aim to provide a fast and very efficient tool for scientific computations.In this line, the Julia package Recurrence-Analysis.jl (meanwhile integrated into DynamicalSystems.jl)was developed which also provides calculations for RPs and the main RQA measures [40].The calculation of RPs and RQA measures using Julia is much faster than comparable implementations in R, MATLAB, and Python, in particular for longer time series N > 10, 000 (Fig. 3A).
Much shorter calculation times can be achieved by parallelising the computations.For example, the Python package PyRQA uses a divide & recombine approach to distribute the computations on multi-core processors or on an array of graphics processing units (GPUs).The improvements can be of several magnitudes of reduced calculation time (Fig. 3A).
A completely different approach is using an approximation of the RQA measures [41,38].Instead of pairwise testing the distance between all points of phase space trajectory, the recurrences are estimated using a coarse graining of the phase space, leading, e.g., to the recurrence rate with m the current (embedding) dimension and h X the histogram of the phase space points.The line based RQA measures can be estimated by approximative RQA, e.g., for determinism (similar approximations for the other RQA measures are available, see [41,38]).The computational complexity including the RQA measures is, thus, reduced to O(N log N ), resulting in an extreme reduction of the calculation time (Fig. 3B).However, this acceleration of RQA calculations comes with the cost of some inaccuracies in the results.

Alternative recurrence definitions for recurrence plots
The original definition of a recurrence in phase space for creating a RP was to consider a certain number of nearest neighbours [31].This was soon changed to define a recurrence in terms of a thresholded distance between points in phase space2 [43,44].
For most applications both definitions work very well.Later, extensions were suggested to add further criteria.Recurring points should lie on a perpendicular plane [45] or phase space trajectories should be parallel [46], aiming to reduce the effect of sojourn points.Order patterns are also a very powerful extension [47,48], reducing the effects of non-stationarity, or to characterise the dynamics (cf.Subsect.2.3).In the last years, some additional ideas were suggested for specific research questions.Specific applications require tailored recurrence definitions.For the identification of laminar regimes or to have a variance-independent distance measure, it can be helpful to apply the exponential function to the actual distance D i,j = ∥⃗ x i − ⃗ x j ∥ between states ⃗ x i and ⃗ x j [49,50] This transformation of the distances D i,j provides values between 0 and 1, where 1 represents the closest and 0 the longest distances.Therefore, the thresholding is now opposite.Such modification is, e.g., used to identify laminar regimes (cf.Subsect.2.5).
If only phase differences are of interest, e.g., in material testing using ultrasonic signal processing, or in acoustic signal analysis, the actual amplitude should be neglected.Here, the angular distance is a better recurrence criterion than the spatial distance in phase space [51] where α is the phase difference between both points ⃗ x i and ⃗ x j .Although the spatial difference between ⃗ x i and ⃗ x j can be large, they can be considered to be recurrent because of a very small phase difference α (Fig. 4).Such a recurrence criterion is particularly useful in the analysis of ultrasonic waves for material testing or in diagnosing atrial fibrillations [52,53].4. Instead of using the spatial distance, the angular distance, represented as the angle α between two states in phase space, can be used to define recurrences.Although the spatial difference between two points at the phase space trajectory is large, both points can be considered to be recurrent because of the similar phase, indicated by very small α.
Another specific type of data, where the construction of a phase space and measuring of distances between states at different time points might be difficult or even impossible, are event like data.For such data, Suzuki et al. [54] have introduced the edit distance metric, which is based on transforming one sequence S i of events into another one S j (thus, the time series of events x i is segmented into short sequences of length w of events S i = {x i , x i+1 , . . ., x i+w−1 }).The cost for the minimum operations required for such a transformation is an appropriate distance measure.The edit distance has been further extended to better understand the parameters within this measure [55], with I and J the set of indices of events in sequences S i and S j , C is the set of pairs of event indices in I and J, t i (α) and t j (β) are the time points of the events in S i and S j .The parameter τ can be used to set the maximal delay between events when considering them as recurrent.This edit distance can be further interpreted as a difference filter, allowing us to construct an equidistantly sampled time series from an irregularly sampled time series, as typical in different geoscience and astrophysics applications [12,56].Consequently, a further application would even allow us to construct RPs directly from irregularly sampled time series using such edit distance measure [57,58].Such data from geosciences and astrophysics (and not only from there) has often a certain fraction of uncertainties, e.g., from age uncertainties in palaeoclimate archives.Instead of a series of scalar values, a time series would then be a series of probabilities p(x, t).A recent development has combined a Bayesian approach with RPs to derive a RP which explicitly represents also the uncertainties [59].Instead of a binary recurrence matrix, we get a matrix with probabilities of recurrences Q i,j (ε) = p (∥⃗ x i − ⃗ x j ∥ < ε).Although such representation is very helpful for data with uncertainties, the quantification is not as straight forward as for binary recurrence matrices.It is still an open question how line based measures could be defined in most reliable way (there are already some suggestions [60]).Nevertheless, complex network based analysis is, of course, possible, as it was used to identify palaeoclimate regime changes, changes in the sea surface temperature distribution of the equatorial central Pacific, or in financial markets [59].
An alternative for data with uncertainties are fuzzy recurrences.Here, a fuzzy objective function is minimized and the fuzzy cluster membership is used to define a recurrence [61] and is beneficial when working with physiological data.This approach can also be used for creating recurrence networks [62] and for cross recurrence analysis [63].
Finally, for analysing spatio-temporal recurrences, we need a recurrence criterion that considers the spatial variability in a temporal sequence of images X(t).A promising distance measure is based on the mapogram m b,i,j of an image X which can be compared to another image X ′ using the Bhattacharyya distance [64] with X = X(t 1 ) an image at time t 1 , X ′ = X(t 2 ) an image at time t 2 , i and j the indices of a pixel in an image, N i and N j the size of the image, n b the histogram of grey values in the image (with B histogram bins), and m b,i,j the mapogram (indicating the class of a pixel with respect to the histogram).See Subsect.2.7 for further details on spatio-temporal recurrence analysis.

Theoretical and parametric RQA and testing
The first years of RP based method development were founded by empirical findings and mainly lacking some theoretical background, although some connections between dynamical properties, line lengths, and recurrence times were already framed 1983 by Grassberger and Procaccia [65,66].Meanwhile, several theoretical findings directly related to RPs have been achieved.A fundamental finding was elaborated by Grendár et al. [67], who mathematically developed the connection between correlation sum C (m) and the RQA measures recurrence rate RR (m) , determinism DET (m) , and average diagonal line length L (m) .Even more important are their formulation of the asymptotic behaviour of these measures, i.e., to which values these measures will converge when the length of the considered data goes to infinity.They also show analytically that DET and L for Gaussian white noise do not depend on the embedding dimension.These considerations have been further elaborated by Ramdani et al. [68,69], which have further derived the analytical expressions for several RQA measures for certain stochastic processes, fractional Gaussian noise, and correlated noise (first analytical solutions were already given in [70]).Analytical and asymptotical expressions for RQA measures of specific random processes are important for defining baselines for benchmarking and testing.Moreover, the fundamental relationship between RR (m) and other RQA measures can be used to define approximative RQA measures, such as Eq.(2).
Further research has considered to derive empirical distributions for testing serial dependencies [71,72] and estimate KS entropy from recurrence times [73].
Another remarkable development is the use of RP based analysis to characterise stochastic dynamics, although the original intention of RPs was to investigate the evolution of a phase space trajectory of a deterministic dynamical system.Based on very specific distribution of recurrence points in geometric pattern in the RP, the type of the stochastic process can be determined [74].It was shown that this approach can be used to distinguish stochastic from deterministic dynamics and works even for short data.Order patterns π i are also useful for this purpose, because some order patterns are very unlikely to occur for certain dynamics, called "forbidden order patterns" [75,48,76].For example, an order pattern RP can be coloured by the specific recurring order pattern [75], providing the information about the distribution of occurring order patterns (Fig. 5).From a more mathematical perspective, an independence test for stochastic data was proposed, based on recurrence rate and the Cramér-von Mises functional applied to a U -process defined from these recurrence rates [77].The test works very well in comparison to alternative tests, like Pearson, Spearman, or Kendall correlations, or even more advanced tests (e.g., covariance distance).
The idea of identifying slow driving forces from time series using RPs has also been regularly considered [78][79][80].Riedl et al. [79] combined the approach by Casdagli [81] with spatial RPs (cp.Subsec.2.7) to identify an external forcing on marine ecological data.A novel concept to infer driving forces from data feeds the RP as an imagelike data representation of the original time series into a deep learning framework [80].The presented preliminary results are rather promising (see also Subsect.2.9 for further combinations with machine learning approaches).
Few studies have investigated the small-scale structures of RPs and found links to characteristic dynamics.We mention here two examples: First, the shape of the block patterns in RPs is related to specific types of intermittency [82].Second, because of the very different time scales in slow-fast dynamics, such dynamics causes thickening of lines or even short lines in the RP almost perpendicular to the main diagonal line direction [83] (Fig. 6).

Causal and directed relationships
Different RP based approaches have been proposed and successfully implemented to detect causal and directed relationships in data.Among them are network based approaches [85] and joint RPs [86][87][88].The network approach uses the inter-system recurrence network, a combination of individual RPs for both systems X and Y and their cross RPs with CR Y X = CR XY T being the cross RPs between systems X and Y .Applying geometrical considerations, the cross-transitivity coefficient (and similar crossnetwork measures) quantify how information flows between the systems, providing an indicator on the coupling direction [85].
Approaches using joint RPs are closely related to mutual information [2].The recurrence measure of dependence (RMD) is a recurrence based probability measure similar to transfer entropy [89].Its extension is a conditional version, the recurrence measure of conditional dependence [87] (with JR XY Z the joint RP between systems X, Y, Z), which can be used to study indirect couplings or even causal dependencies (when considering lagged values of one variable, e.g., Z(t) = Y (t + τ )).A similar approach is conditioning already the joint RP [88] This conditional joint RP can be easily extended to more variables.RM CD and CJR have been shown to indicate the correct causality relationship for different kinds of challenging data [87,88].

New RQA measures and phase space segmentation based recurrences
Although the quantification of RPs has its roots in the early 1990s, there are still some aspects that require innovative ideas for quantifying the apparently different visual impression of RPs.Inspired by the research on fractal geometries, the lacunarity measure was adopted to RPs [90].It characterises the homogeneity of the RP and allows to detect characteristic time scales, such as periodicities or extended laminar regimes (Fig. 7).2) and multifractal Gaussian noise), as well as a RP with characteristic temporal scales (Rössler system).Technical details can be found in [90].
Laminar regimes or transient trapping of states are represented in the RP by extended blocks of recurrence points.Usually, a sliding windowing procedure is applied to identify the changes between different dynamics.A new measure has been suggested that can identify the temporal variation of transient trapping without windowing.It is based on a block invariant measure [50] where t | (i), t ∥ (i), t ⊥ (i) are the geometric extensions of the blocks in the RP.This promising new measure was successfully applied to detect transient trapping events in intracellular and plasma membrane compartmentalisation.
In data analysis it can be important to identify the times of a specific dynamical behaviour.This corresponds to a segmentation of the phase space.beim Graben and Hutt [91] have suggested several approaches to segment the phase space into recurrence domains, e.g., using the chain of transitions from one recurrence to another one, calling it recurrence grammars.Such recurrence grammars are related to Markov chain description of the data and can be used to symbolise the RP or to specify a new RP based entropy measure (cf.Subsect.2.8 for an application of this specific entropy).
The suggested method by Yang and Chen [92] goes in a similar direction, which also segments the phase space but using a Q-tree segmentation.The result is a classification of recurrences to delineate heterogeneous recurrences, an interesting concept to reveal the fractal nature of state transitions.

Border effects, tangential motion & alternative RP definitions
In RQA, border effects and tangential motion (sojourn points) can heavily bias diagonal line based characteristics.In a finite size RP these lines can be cut by the borders of the RP, thus, bias the length distribution of diagonal lines and, consequently, the line based RQA measures.Moreover, temporal correlations in the data, especially when highly sampled flow data is used, noise and an insufficient embedding of the time series combined with the effect of discretization and an inadequate choice of parameters needed to construct the RP can cause a thickening of diagonal lines ("tangential motion").High sampling can cause tangential motion, a thickening of diagonal lines (orange circles).Modified after [93].
Both effects can have a substantial impact on certain RQA quantifiers, e.g., the diagonal line length entropy ENTR (cf.Subsec.3.4), especially for regular dynamics.The border effects can be tackled in two ways.Either by a special treatment in the according histogram of any diagonal line which "touches" the RP-border [93] or by rotating the RP by 45°("window masking") [93,94], in order to distribute the induced bias equally on all lines.For the histogram correction, we investigated in a previous study [93] the window masking together with the ideas to either discard all border lines (dibo correction), to only keep the longest of all border lines (kelo correction), or to replace all border lines by the length of the line of identity, which had already been proposed by Censi et al. [95] (cp.Fig. 9 for a simple sinusoidal signal).In general, for noise free or slightly noise corrupted map data all these correction schemes solve the problem of the biased diagonal line length entropy due to lines cut by the borders of the RP.
However, for flow data the effect of tangential motion has a much bigger influence on the entropy bias than the border effects.Alternative criteria of defining the RP were proposed to solve this problem (cp.Subsect 2.2).The already mentioned perpendicular A corresponding enlargement of (E) does qualitatively look the same, but with reduced frequencies, due to the smaller effective window size.For a better visibility, we enlarged single bars in (B) to (E) and limited the view to a frequency range [0 3] in (A) to (E) (in (F) the full range is used).Modified after [93].
RP [45] contains only those points of the d-dimensional phase space trajectory that fall into the neighbourhood of a reference point and lie in the (d − 1)-dimensional subspace (Poincaré section) that is perpendicular to the phase space trajectory at the reference point (Fig. 10B) .In practice, an additional parameter is needed to account for a certain deviations of a reference point being exact on that surface of section.The iso-directional RP [46] also promises to cope with the tangential motion, but needs two additional parameters (Fig. 10C) .In this approach two points in phase space are denoted recurrent, if their mutual distance falls within the recurrence threshold ε and the distance of their trajectories throughout T consecutive time steps falls within another recurrence threshold ε 2 .A further idea is the true RP [96] counts only those points to be recurrent, which first enter the ε-neighbourhood of a reference point (Fig. 10D) .Finally, a definition of recurrences by means of local minima was suggested (LM2P approach) [97,98].In the latter approach only local minima of the distance matrix make up the RP (Fig. 10E) .In practice a local minima detection method needs to be defined including an additional parameter τ m , which regulates the tolerated spacing in between consecutive minima.In addition to these recurrence criteria, a more geometric based approach was proposed using a skeletonisation schema [93].Since a "thickened" line consists of many adjacent diagonal lines, this parameter-free algorithm shrinks all "thickened" diagonal lines in a RP down to the longest line contained in such a "thickened" line.The result is a RP, which only consists of diagonal lines with unity width (Fig. 10F).
Even though the true RP and the LM2P RP (Fig. 10D, E) do not look to differ much from the skeletonised RP, in practice the computation of the skeletonised RP yields the most robust results.Together with the border effect corrections of the line length histograms this approach yields meaningful estimates for the diagonal line length entropy ENTR.Furthermore, when computing the so corrected ENTR for increasing minimum considered line lengths ℓ min > 2 the noise level can be estimated and the skeletonised RP can then be used as a noise filter.The effect of these corrections on other RQA-quantifiers, including those based on white vertical lines (recurrence times) need to be studied as further described in Subsec.3.4.

Spatio-temporal recurrence analysis
The fast development of the computational power of computers makes the application of RPs and RQA for spatial and spatio-temporal data analysis possible.A simple idea considered only static images and transformed the two-dimensional images to one-dimensional series of grey values [99,100].Unfortunately, the RQA based on this approach is influenced by the orientation of the spatial structures.The more advanced approach is to compare each spatial direction of the image, finally resulting in a RP of four or even six dimensions (for two-dimensional or three-dimensional data, respectively) [101].This latter concept is challenging because the quantification of the recurrence structures in such dimensions is not trivial.Moreover, although all different spatial directions are compared, objects with rotational symmetries still have an impact on the results.Consequently, an extension was introduced by incorporating rotations and allowing the identification of irregular circular patterns [102].Recently, another extension was suggested to weight the grey value distances by the distances between the pixel values [103] w with D(i, j) a Gaussian weighted distance function of the spatial distance between pixels i and j.The authors have used this recurrence criterion to construct and analyse recurrence networks of spatial data.Spatio-temporal data such as surveillance videos or satellite data are another interesting application field of RP based analysis.The most simple approach would be to compare the images pixel-wise (each pixel forms the component of phase space vector), but this would be a very sensitive approach resulting in very low detection rates of recurrences.An alternative would be to compare the grey value histograms.However, here the spatial information in an image is completely lost.A powerful approach combining both concepts was suggested by Agustí et al. [64].They suggest to apply mapograms to compare images (cp.Subsect.2.2).Mapograms come with a scaling factor which even allows the specific focus on different spatial scales that can be used in a multi-scale analysis [104].As already mentioned, RPs based on mapograms can be used to infer the driving force from spatio-temporal data [102].
The identification of spatio-temporal recurrences becomes challenging when only a small part of an image represents a dynamical pattern.Bonizzi et al. [105] proposed to apply a singular value decomposition (SVD) to identify the regions of interest (i.e., the regions with some variability) and use only the data within these regions in a regular RP.All the pixels in such a region are considered to be the components of a phase space vector (like the simple approach mentioned before).

Selection of the recurrence threshold
Discussions on selecting the recurrence threshold ε have been included already several times in many publications [2,106,49,[107][108][109].This shows the importance of this topic, as the selection of ε is a trade-off from having as small threshold as possible but at the same time a sufficient number of recurrences which strongly depends on the research question.
An easy approach which helps in most cases is to use a quantile of the distance distribution D i,j = ∥⃗ x i − ⃗ x j ∥ (Fig. 11A).Selecting the threshold by using the 5%quantile, ε = D 0.05 , would result in a recurrence rate of 5%.This approach provides a robust recurrence characteristics for different embedding dimensions [108].
Another criterion for selecting ε is based on topological similarity, where such a value for ε is selected where small changes ε±δε have minimal impact on the structures in the RP.We can think about several criteria that measure the topological similarity of RPs.One idea for testing this is based on measuring the Hamming distance ∆ H between the RPs thresholded with ε, ε − δε, and ε + δε, i.e., Andreadis et al. [110] have suggested to chose such an ε where the difference is minimal (Fig. 11B).A similar idea is to consider the RP as a RN and find network modules, again for threshold ε and small deviations in the threshold ε − δε and ε + δε [107].The first criterion is to have exactly the same number C of modules, i.e., C R(ε The second criterion tries to minimize the difference in the size (number of nodes) of a given module k in R(ε) and R(ε + ∆ε), with M k the k th module in the network and |M k | the size of the module (the number or nodes or phase space states in this module).This procedure identifies such thresholds where structures in a RP do not change much for small deviation in ε.
The next criterion which was suggested by several authors tries to maximize the homogeneity of RPs.We had already seen the symbolisation based on the recurrence grammars in Subsect.2.5.beim Graben and Hutt [91] suggest to select ε in a way to have the distribution of the symbols as uniform as possible.This corresponds to a maximisation of the entropy of the symbol distribution.A very similar approach was suggested by Prado et al. [111], which is using local recurrence patterns of specific size (e.g., n = 2, corresponding to {R i,j , R i,j+1 , R i+1,j , R i+1,j+1 }), so called micro-states.The criterion is to maximise the diversity of structures/patterns in the RP, i.e., the micro-states should be equally distributed, leading to the criterion that the entropy of the micro-states distribution should be maximal As an alternative to recurrence grammars, the transition probabilities between recurrence domains can be used [112].Again, we find an optimal ε where the entropy of these transition probabilities is maximal, ensuring equally frequent transitions between different recurrence domains.Fig. 11.Finding optimal recurrence thresholds ε using (A) quantiles, (B) topological similarity, and (C) network connectivity for the Roessler system (using 2,000 values of the x-component, embedded into 3-dimensional phase space using time delay embedding).The quantile approach (with 5% quantile) suggests ε = 3.54, the topological similarity ε = 6.0, and the network connectivity ε = 1.84.
Whereas the preceding suggestions for selecting ε are mainly based on empirical arguments and without specifying for which research question it might work or not, Medrano et al. [109] elaborated a procedure with a deliberate theory.The goal is to estimate dynamical invariants, like correlation dimension C 2 or K 2 entropy.Usually, such measures should be estimated in the limit ε → 0. However, Medrano et al. [109] could show that there will be a lower limit required, i.e., ε ∈ [βε opt , ε opt ], with 0 < β < 1.Moreover, they found that ε should be selected in such a range which minimises the estimation errors of C 2 (ε) (the estimation errors when estimating K 2 can also be used).
As the final approach for selecting ε we mention a method derived from complex networks.In networks, the eigenvalues of the Laplace matrix L i,j = δ i,j j A i,j − A i,j (with A i,j = R i,j − δ i,j the RN) provide information about the connectivity of the network [113].As soon as the second smallest eigenvalue λ 2 becomes larger than 0, the corresponding ε ensures that the RN will not have isolated parts, but is a connected network (Fig. 11C).This approach is related to former ideas of a percolation threshold suggested for network based recurrence analysis [33,114].

Recurrence and machine learning
Machine learning is currently a very fast-growing field.Not surprisingly that recurrence analysis and machine learning approaches are combined and tailored to specific research questions.Generally speaking, computing a RP of a time series is one way of transforming a sequence of data into an image, called "time series imaging".This transformation is even more complicated, when the time series gets embedded into a higher dimensional space beforehand (c.f., Subsect.3.1).The image, i.e., the RP, can be the starting point of a consecutive machine learning workflow (Fig. 12A).This seems the natural way to go, since many machine learning tools, such as convolutional neural networks (CNN), were developed for image classification.Of course, other image encoding techniques such as Gramian angular fields or Markov transition fields instead of RPs are possible and have also been used [115].
Starting from the RP many different ways of setting up a ML workflow are possible and researchers combined several established ML-methods for classification and prediction tasks.First attempts started more than 15 years ago using RQA measures as features in support vector machines (SVM) for regression and classification purposes [116,117] (Fig. 12B).Features based on RQA measures are meanwhile frequently used for classification purposes using SVMs, CNNs, k-nearest neighbour or random forest classifications [118][119][120][121][122][123][124] and the ML-toolbox offers a variety of other methods for clustering and feature classification (Fig. 12B).Also other RP based quantifiers, such as based on JRPs for synchronisation can serve as powerful features for ML classification methods [125].Instead of using the physically motivated RQA measures, which use certain RP-structures, such as diagonal lines, automated image feature extraction techniques, e.g., spatial bag-of-features (SBoF) or internal layer representation of a pre-trained CNN, are possible and have been used for forecasting in combination with another neural net, e.g., a long short term memory (LSTM) [126].All suitable time series features can be used for forecast model averaging [e.g., 127], and RP-based features appear to be a valuable complement to established features such as mean, autocorrelation, etc. [128].The basic idea is to use the features in a regression model for estimating weights of a number of given forecast models, such that the weighted model forecast minimizes the prediction error (Fig. 12B).Technically, a given set of forecast models (e.g., ARIMA3 , ETS4 , NAIVE, etc.) are fitted to the training period of each time series of the training data and produce forecasts for the Fig. 12. Simplified exemplary and schematic machine learning workflows for classification and prediction using (A) the RP as an image of the time series.(B) Features of the RP can be extracted via RQA yielding established features like the recurrence rate (RR), determinism (DET), laminarity (LAM), etc. or via an autonomous image feature extraction algorithm, e.g., spatial bag-of-features (SBoF), or pretrained convolutional neural network layers (CNN).These features can be used for classification or -in combination with another regression algorithm -for averaging/weighting of prediction models (e.g., ARIMA, ETS, NAIVE, etc.) in order to obtain an optimally weighted prediction model.(C) The RP can also be used directly as input to a CNN in order to classify or predict the underlying time series.
corresponding test periods.At the same time features are computed/extracted for the training period of each time series of the training data.Finally the features and the prediction errors for each of the given forecast model are used to compute optimal weights of the models via a regression model (e.g., XGBoost [127,128]).Assuming that the time series from the training data and the actual data to be predicted are generated by the same process, these weights are finally used to produce the final, improved prediction.
Of course the RP can be used directly as an input for a CNN (Fig. 12C).Either the CNN is trained to classify different RPs [129][130][131][132], or to predict time series values [115].Such combinations of RPs and RQA measures with machine learning were successfully applied for transition detection, monitoring, and anomaly detection [133,80,[134][135][136].
Reservoir computing (e.g., liquid state machines, echo state networks) is a specific approach of recurrent neural networks to predict the future states of a dynamical system based on time series without a model [137,138].RQA was used to evaluate the results of such model-free prediction [139].But more interesting are, of course, combinations of the learning algorithm with recurrence features.A promising approach is to use the RQA for fine-tuning of parameters in the learning [140].
So far, we have seen examples where RPs and RQA can help to improve the ML applications.There are only a few studies that use ML approaches to improve the recurrence analysis.One idea is to use a learning algorithm to classify the RP with respect to the underlying dynamics [141].

Perspectives for future research
The trends in the methodological developments of RP based methods and their applications show the perspectives for future research.

The embedding problem
The RP considers recurrences of the trajectory {⃗ x i } N i=1 (with ⃗ x i = ⃗ x(t i )) of the considered dynamical system's phase space.However, in most applications, ⃗ x cannot be measured directly or completely, and only a subset of observables is available.In such cases, ⃗ x must be reconstructed from the measured observables.All of the numerous published methods for reconstruction of the phase space (e.g.[142][143][144][145][146]) introduce a certain number of parameters on which, consequently, the calculated RP and the RQA depend.This is a current field of research with the aim in automatising this process and making it robust with respect to a subsequent recurrence analysis (e.g., recently introduced PECUZAL embedding algorithm [142]).However, it has been shown that the optimization of embedding parameters does depend on the actual research question [147], like computing dynamical invariants or prediction [148][149][150][151][152].
The PECUZAL algorithm can occasionally suggest contradictory embedding parameters.For example, the logistic map is clearly a deterministic system and, therefore, the used test statistic (L-statistic, [145], related to the false nearest neighbor statistic [153][154][155][156][157]), should recommend an embedding with dimension m > 1.However, in chaotic regime, PECUZAL suggests no embedding and treats the input as a stochastic signal.For other maps, e.g., the Ikeda or Hénon map, this is not the case.This does not seem to be a problem of the specific test statistic or the PECUZAL algorithm.When running the "stochastic indicator" proposed by Cao [155], it also values the chaotic logistic map as a stochastic source and does not suggest any embedding.A similar problem arises when analyzing map-like data in a geoscientific context.These time series are often interpolated and despite their inherent non-stationarity we should be able to embed small pieces with approximately constant parameters.In many cases, ranging from drill core data under a certain age model to climate index data such as the Southern Oscillation Index (SOI) and to Earth system models of intermediate complexity (EMICs), PECUZAL does not suggest any embedding and also other stochastic indicators would treat the signals as stochastic.
Therefore, the following research questions should be addressed in the future: (1) How does interpolation affect the estimation of the embedding parameters?(2) How does the sampling resolution affect the estimation of the embedding parameters (flow-like vs. map-like data)?(3) Countless real world processes can be described by a Langevin equation.Yet, to our best knowledge there is no study which systematically investigates the embedding of systems described by such a stochastic differential equation.(4) The impact of the embedding on phase space based causality measures such as convergent cross mapping [158] (and its extensions), joint recurrences [87,159], or recurrence networks [85] have only been investigated briefly [147,160].Since causality analysis is of great interest in many scientific (and commercial) fields, more thorough research on this topic is of high importance.

Recurrence definitions
In order to visualise recurrences of a phase space trajectory, we have to define what recurrence or actually similarity of states actually means in the context of the current research question.For this purpose, dynamical similarity is mostly measured in terms of some metric distance D i,j = ∥⃗ x i − ⃗ x j ∥ defined in the underlying system's d-dimensional phase space.However, specific data or research questions can require modifications of this similarity measure (Subsect.2.2).Depending on the further growing field of recurrence analysis and ever new applications as such as in machine learning, novel similarity measures or metrics will be required, such as for comparing field data and spatial patterns, or time series with uncertainties and gaps [e.g., 161].

Recurrence threshold
Even though many studies (Subsect.2.8) have considered the question of how to objectively find an optimal recurrence threshold, this is still not yet answered satisfactorily.In most applications where comparisons or relative results are of interest, fixing the recurrence rate at a certain value and adjusting the threshold accordingly [108] will be appropriate.For other research questions (such as characterising the specific dynamical properties), a reasonable, very specific threshold should be selected.Although several ideas for an objective selection were suggested [107, 109-111, 113, 162], they are mainly based on heuristic ideas and the used criteria miss an objective physical foundation (e.g., why should be a topological invariance desirable, why should be the diversity of structures in RP maximised, why should be the recurrence network connected?).An objective criterion should either minimise the estimation error for the dynamical invariants [109] or be a trade-off of maximising the number and length of diagonal line structures and minimising the threshold value itself.Besides the specific selection criterion, a systematic overview generalising typical applications of RP based analysis and best suited threshold selections would be helpful in particular for new users of the method.

Analytical RQA
The analytical explanation of various RQA measures has made great progress in recent years (Subsec.2.3).However, the relation between the line structures in the RP and dynamical invariants has not yet been satisfactorily answered.For example, as shown in [93], the analytically derived relation between ENTR and K 2 [163] does not yield meaningful results for real-world time series -neither for the border effect corrected [93], nor for the uncorrected ENTR.
For an analytical expression of ENTR, we use the limit of infinitely large RPs, thus, infinitely long diagonal lines ℓ max = ∞ [163]: with p(ℓ) being the theoretical probabilities of observing a line of length ℓ.By using the scaling property of the correlation sum with the correlation entropy K 2 [65], p(ℓ) can be expressed in terms of K 2 , p(ℓ) = 1 − e −K2 e −K2(ℓ−1) , thus, we find a theoretical expression for the diagonal line length entropy [163] EN with γ = (1 − e −K2 ).For increasing K 2 , ENTR will decrease (Fig. 13).Eq. ( 17) holds only in the limit of N → ∞, but in real world applications, we have finite time series lengths, i.e., N ≪ ∞, thus, the upper limit in the sum of Eq. ( 16) is ℓ max ≪ ∞.This results in significant deviations of ENTR from the theoretical value in the weak chaotic regime, 0 ≤ K 2 ≤ 0.01 (Fig. 13) and is especially important for real world applications with data set lengths < 5, 000.In principle, it should be possible to get the "right" approximation by considering the length of the available data.16), for different upper limits of the maximum encountered diagonal line ℓmax.The case ℓmax = ∞ corresponds to the analytical expression in Eq. (17).
However, when calculating ENTR from RPs and comparing it with the approximated values derived from Eq. ( 16), we find strong discrepancies in particular for K 2 < 0.3 (Fig. 14).These differences remain also for many different parameter settings (e.g., very small ε) and different systems (the correction for border effects, as briefly discussed in Subsec.2.6, also do not improve this negative result).Of course, the main problem when using real world data or even model flow data is that it is not trivial to estimate K 2 properly, which could be the potential reason for such strong deviation.But even for a very simplistic system like the logistic map, where we can analytically compute K 2 by the positive Lyapunov exponent λ(r) = log(|r − 2rx|) , the theoretical relationship as visible in Fig. 13 cannot be approximated.Relationship between ENTR and K2 for the logistic map, calculated from time series of length N = 2, 000 embedded in two dimensions with unity lag, a fixed recurrence threshold ε = 0.05, a minimum line length ℓmin = 2, and the kelo correction applied.For lower choices of the recurrence thresholds the graphs looked similar and only for higher thresholds the agreements with the expected values got worse.
Addressing the following questions would be helful in order to make advances in transition and bifurcation detection as well as classifying regimes: (1) Further elaborate the relationships between structures in RPs (diagonal and vertical lines, recurrence times) and dynamical invariants; compare the different estimations based, e.g., on line length distributions [164,165], recurrence rate [65], recurrence entropy [163], or recurrence times [73].( 2) Investigate the sampling effect on these relations [166] and clarify why some of these relations (such as the EN T R − K 2 -relation, Eq. ( 17) and similar the DET − K 2 -relation) do not match observational data.(3) A thorough study on the impact of the correction schemes [93] (see Subsec.2.6) on the estimation of dynamical invariants is needed.

Significance tests for RQA
In cases where the experimental design allows the acquisition of distributions of RQA characteristics, it is possible to make statements about the significance of the results.In most passive experiment setups, as it is often the case in medical applications, astrophysics, or geoscience, this is not possible.Hypotheses testing on observations of a system should then be performed using known test models (which correspond to the null).Recent theoretical work which derived the theoretical values for RQA measures of specific systems (mainly stochastic systems) will help in evaluating and benchmarking results [68,69].However, this approach works only for specific nullhypotheses (e.g., testing against noise).For more general hypothesis testing, we will rely on surrogate data, an appropriate Monte Carlo sample of the underlying data for a given null hypothesis.Surrogates are generated by keeping characteristics of the observed system related to the null, but induce randomness at the same time (constrained randomisation) [167].In the context of recurrence analysis this translates into the question of how to construct surrogate phase space trajectories, distance or recurrence matrices, which are consistent with the null.It would, thus, be beneficial to construct surrogates of phase space trajectories in order to obtain distributions of corresponding RQA statistics, which could then be used for statistical testing.
A promising method is using twin surrogates [168], which constructs surrogates from (1) identifying twins in the phase space trajectory (points which share the same neighbourhood) and ( 2) randomly jump to one of the possible futures of the existing twins.The drawback is, of course, that for proper statistical testing we would seek around 1, 000 surrogates or more and in the described method this number is determined by the total number of twins, which is a property of the data and is often too small.
Another idea for line based RQA statistics in the running window approach for transition detection is based on bootstrapping line structures [169].To estimate the unknown variance of the diagonal line length distribution of a RP, surrogate line length distributions are bootstrapped from the cumulative line length distribution of all windows.Although this approach is working well in most cases, the resulting confidence intervals are sensitive to the number of bootstrapped lines.There is no objective way to determine this number, because the number of lines can vary between the windows.Thus, there is still an urgent need for robust methods that construct RP surrogates, which preserve the basic properties (correlation structure) of, e.g., the RP or the underlying state space trajectory.This would affect all existing RQA measures and would allow to make statements about the statistical relevance a measured RQA statistic has, even in passive experiments with single runs.

Machine learning combined with recurrence analysis
Machine learning (ML) approaches become more and more accepted and used also in complex systems science.RPs and RQA are already used as features in ML applications mainly for classification purposes, but also automated feature extraction methods are increasingly used (Subsect.2.9).The main question here is whether the RQA features, some of which have a relationship to dynamical invariants (i.e., have physical meaning), are a useful preprocessing step before applying a particular ML method for classification or prediction.Or whether suitable image feature extraction methods, such as CNNs are the way to go.Certainly, the computation of RQA features does not depend on too many free parameters and, thus, does not require any additional hyperparameter optimization or training.This is an important point, as multiple stacked ML methods easily become unmanageably complex models that are potentially prone to overfitting and additionally require a large amount of (stationary) training data.For the direct application of CNNs to the time series image (Fig. 12C) a sound study is also needed examining the difference between feeding a RP or the unthresholded distance matrix.
New directions in using ML approaches are time series based predictions using reservoir computing, which might benefit by applying concepts from RPs and the according recurrence networks.
The future developments with respect to ML and RPs will see further crossfertilisations.For example, ideas of time series imaging used for ML based classifications such as gramian summation fields and markov transition fields [170,171] could provide new definitions for recurrences.

Conclusions
Methodical research on recurrence plots (RPs) and recurrence quantification analysis (RQA) is still a lively field.The last years have revealed a number of important new solutions for specific research questions, but also gave some answers to more general challenges in RP based data analysis.Nevertheless, there are still further open ends and directions which should be considered in the future.The search results are available via https://www.webofscience.com/wos/woscc/citation-report/bd5b7e1a-63b3-4345-9626-215d23f8e7e1-35998ece

A.2 Subjects
The database of publications (N = 3, 618 by May 2022) on or using RP based methods [36] is used to retrieve the Scopus subjects of them.This has been performed using a Python script to get this information via the Altmetric web service.Some of the Scopus subjects were summarised because of significant overlap (Tab.1).A publication can cover multiple Scopus subjects, therefore, the information in Fig. 2B does not mean exclusive subjects per publication and the total sum of presented subjects does not correspond to the total number of publications.

A.3 Measuring the calculation time for recurrence analysis
We measured the calculation time for creating a RP and calculation of the standard RQA measures depending on data length N for the Rössler system with the standard parameters (a = 0.25, b = 0.25, and c = 4) [172] and a sampling time of ∆t = 0.05.We used only the x-component of the Rössler system after removing the first 1,000 points as transients and applied a simple time delay embedding with m = 3 and τ = 6.The RP and RQA calculations were implemented in MATLAB (Version R2022a), R (Version 4.0.2),Julia (Version 1.6.4),and Python (Version 3.8.8).For MATLAB we used the rp code v1.1 provided by [173], for R the crqa package v2.0.2 [174], for Julia the package DynamicalSystems.jlv1.4.0 (RecurrenceAnalysis v1.5.2) [40], for Python the pyunicorn v0.6.1 package [39], as well as the PyRQA v8.0.0 package [175].The CRP Toolbox for MATLAB was not used, because the implementation is interwoven with a graphical user interface and, thus, the new rendering engine of MATLAB is strongly interfering and slowering the calculations since its introduction in 2014 [176].
The calculations were performed on a 2.3 GHz Quad-Core Intel Core i7 with 16GB RAM, except the calculations using the PyRQA package, which were performed on a Nvidia GPU Tesla V100 with OpenCL 1.2.

Fig. 1 .
Fig. 1.Number of publications and studies using recurrence plot based methods (based on the database available at [36], May 2022) and citations referring to seminal studies as retrieved from a Web of Science search (May 2022, details in Appendix A.1).

Fig. 2 .
Fig. 2. (A) Subjects covered by publications using RP based methods (based on Scopus subject classification database in[36], May 2022, see also the notes in Appendix A.2); (B) software for RP based analysis is available as standalone applications and as packages for the most frequently used high-level programming languages (based on information at[37], May 2022).

Fig. 3 .
Fig. 3. (A) Computation speed for recurrence plots and recurrence quantification measures for the Rössler system (details in Appendix A.3). (B)The approximative RQA allows calculation times of a few seconds for time series of length larger than 1 million data points, where standard single-thread calculations need hours (details can be found in[38]).

Fig. 5 .
Fig. 5. Order pattern RPs coloured with the corresponding order pattern for (A) Gaussian white noise, (B) autoregressive process of 1 st order, and (C) x-component of the Rössler system.Length of order pattern m = 3 and delay τ = 1, resulting in six different order patterns πi, i.e., six different colours.The fraction of the specific recurring order pattern on all recurrences is provided in brackets.

Fig. 6 .
Fig. 6. (A) Slow-fast dynamics derived from the Izhikevich model with a = 0.15, b = 0.2, c = −65, d = 8, I = 5, and smapling time ∆t = 0.1 [84].(B) The very different time scales in the data of the Izhikevich model cause small appendages at the diagonal lines that look like sword-like structures.

Fig. 7 .
Fig.7.Lacunarity for different prototypical systems representing more homogeneous (white noise) and quite heterogneous RPs (AR(2) and multifractal Gaussian noise), as well as a RP with characteristic temporal scales(Rössler system).Technical details can be found in[90].

Fig. 8 .
Fig. 8. Parallel and close parts of a phase space trajectory (A) correspond to diagonal lines of length ℓ in a RP (B).Diagonal lines can be cut by the border of the RP (green circles).High sampling can cause tangential motion, a thickening of diagonal lines (orange circles).Modified after[93].

FrequencyFig. 9 .
Fig.9.Diagonal line length histograms of (A) the conventional line length computation and (B) to (E) of the correction schemes proposed in[93] for a monochromatic time-delay embedded sinusoidal with an oscillation period T = 100 time units (m = 2, τ = T /4).(F) Enlargement of the histograms from panels (A) to (D), focusing on the shorter line lengths.A corresponding enlargement of (E) does qualitatively look the same, but with reduced frequencies, due to the smaller effective window size.For a better visibility, we enlarged single bars in (B) to (E) and limited the view to a frequency range [0 3] in (A) to (E) (in (F) the full range is used).Modified after[93].

Fig. 10 .
Fig. 10.Different approaches for avoiding the effect of tangential motion in a recurrence plot (RP), exemplary shown for the Rössler system (with parameters a = 0.15, b = 0.2, c = 10, sampling time ∆t = 0.2).(A) Standard RP with fixed recurrence threshold ensuring 4% global recurrence rate as a basis to all other RPs shown in this figure.(B) Perpendicular RP with angle threshold φ = 15°, (C) isodirectional RP with T = 5 [sampling units] and ε2 = ε/2, (D) true recurrence point RP (TRP) with Tmin = 5 [sampling units], which coincides with the first minimum of the mutual information, (E) thresholded local minima approach with two parameters (LM2P) and τm = 5, and (F) "skeletonized" diagonal RP.Modified after [93].

2 Fig. 13 .
Fig.13.Theoretical values of the diagonal line length entropy, Eq. (16), for different upper limits of the maximum encountered diagonal line ℓmax.The case ℓmax = ∞ corresponds to the analytical expression in Eq.(17).
Fig.14.Relationship between ENTR and K2 for the logistic map, calculated from time series of length N = 2, 000 embedded in two dimensions with unity lag, a fixed recurrence threshold ε = 0.05, a minimum line length ℓmin = 2, and the kelo correction applied.For lower choices of the recurrence thresholds the graphs looked similar and only for higher thresholds the agreements with the expected values got worse.