1 Introduction

The Lagrangian formulation, namely when the position of sensors dispersed in the flow is recorded over time, is a well-recognised framework to investigate the spatio-temporal features of fluid flows (Sreenivasan and Schumacher 2010; Haller 2015), with a large variety of practical applications for which experimental particle tracking is carried out. In fact, a broad range of complex flows have been investigated through—and have benefited from—a Lagrangian-based perspective with strong technological as well as social implications (Serra et al. 2020; Bourouiba 2021). They include, among others, atmospheric flows (Rosi et al. 2014; Shnapp et al. 2019), biological and cardiovascular flows (Tallapragada et al. 2011; Zhang et al. 2021; Di Labbio et al. 2022), oceanic flows (Tew Kai et al. 2009; Froyland and Padberg-Gehle 2015; Karrasch et al. 2015; Ser-Giacomi et al. 2021), as well as more general turbulent and vortical flows (Schneide et al. 2018; Neamtu-Halic et al. 2019; Martins et al. 2021; Mowlavi et al. 2022). Figure 1a illustrates some key examples where the Lagrangian framework finds application.

Fig. 1
figure 1

a Schematic of four realistic problems in which very-sparse Lagrangian data may be obtained: (i) sea surface circulation and dispersion; (ii) deep ocean circulation; (iii) atmospheric flows and dispersion; and (iv) engineering applications. b Workflow of the proposed methodology. The MSRN is built on individual trajectories (dashed red lines), and characteristic (dominant) scales are identified through recurrence peaks

Under realistic measurement conditions, e.g. in large-scale environmental flows (Fig. 1a), only a limited number of tracers are typically available due to technological limitations and associated costs of producing, seeding, and tracking a large number of Lagrangian sensors. Sensor sparsity is especially the case in atmospheric and oceanic flows, where large-scale motions have to be characterised by a limited number of sensors (Froyland and Padberg-Gehle 2015; Ser-Giacomi et al. 2021; Galler and Rival 2023). However, tracer sparsity is a more general issue that extends to other research domains such as, among others, cardiovascular treatments (Di Labbio et al. 2022), autonomous robot navigation (Yoerger et al. 2021), micro-fluidic systems in laboratory-on-a-chip devices (Chase et al. 2022), as well as path planning and remote sensing (Shen et al. 2015; Krishna et al. 2022). Moreover, data sparsity is particularly challenging for data-driven methodologies (Rasht-Behesht et al. 2022), due to the strong dependence of data-driven approaches on the amount (and quality) of the training dataset (Manohar et al. 2018; Fukami et al. 2021; Güemes et al. 2022).

Quantitatively, data sparsity is defined here in terms of the ratio \(l_m/l_{\text {ref}}\), where \(l_m\) is the mean inter-particle distance between sensors, while \(l_{\text {ref}}\) is a characteristic flow scale (see Fig. 1a, scenario (ii)); sparse and very-sparse Lagrangian data can then be defined as follows: \(l_m/l_{\text {ref}} \ge 0.1\) and \(l_m/l_{\text {ref}}>1\), respectively.

In spite of its practical importance, identifying significant flow features from limited (i.e. sparse) information is a challenging task and still represents an active research area (Fukami et al. 2021; Martins et al. 2021; Güemes et al. 2022). In this study, in order to tackle the sparsity issue and exploit very-sparse Lagrangian data from two-dimensional (2D) and three-dimensional (3D) flows typical of realistic conditions, we propose a data-driven approach that takes the inter-particle Lagrangian information to extremes by exploiting only single trajectories. To date, a multitude of methodologies have been developed in order to identify salient features of practical relevance (such as coherent flow structures) from Lagrangian trajectories (Haller 2015; Hadjighasem et al. 2017). However, such Lagrangian-based methodologies have rarely been tested on very-sparse datasets, while medium- to densely-seeded tracer distributions have been used instead (Hadjighasem et al. 2017). Dense distributions, in fact, allow for an Eulerian field reconstruction, e.g. via spatial interpolation of Lagrangian quantifiers, which is not possible for the case of (very-)sparse trajectories. In this regard, coherent motion indicators from individual particle trajectories have already been proposed (Rypina et al. 2011; Dong et al. 2011; Li et al. 2011; Lumpkin 2016; Neamtu-Halic et al. 2019; El Aouni et al. 2020; Haller et al. 2021), but they have not always been tested for very-sparse datasets and usually require additional parameter definitions and settings (Rypina et al. 2011; Dong et al. 2011; Li et al. 2011; Lumpkin 2016; El Aouni et al. 2020).

To attain the aforementioned goal, we develop a methodology referred to as multiscale spatial recurrence networks (MSRNs) (see Fig. 1b). The recurrence-based criterion proposed here relies on the physical intuition that tracer trajectories moving in a vortex-like structure tend to form a loop, that is detected for a specific (spatial) recurrence threshold R. In contrast, strongly-advected tracers do not necessarily display a looping behaviour for any spatial scale R. Accordingly, a larger number of recurrence links are expected to be activated in MSRNs built from trajectories displaying a looping structure, while the opposite is expected for strongly-advected tracers.

Although one of the major challenges with data-driven approaches—based on, e.g. a machine learning (Brunton et al. 2020) or a complex network (Iacobello et al. 2020) framework—is to provide physical connections between the method and the underlying flow features, we demonstrate that our MSRN methodology corroborates the physical intuition, allowing us to highlight dominant scales in both 2D and 3D flows. A similar physical intuition has been adopted in previous studies identifying trajectory loops (Dong et al. 2011; Li et al. 2011), but such methodologies were limited to 2D trajectories and required multiple parameters to identify loops and discern them from advection. In contrast, both 2D and 3D looping structures are detected through the novel MSRN methodology in each flow configuration considered without setting any parameters a priori (see Sect. 4). This advantage of MSRN over previous methodologies results from the fact that looping structures in MSRNs naturally emerge at particular spatial scales as highly interconnected (i.e. highly recurrent) sub-graphs (see Fig. 1b).

We showcase the potential of MSRNs to extract dominant flow scales from very-sparse Lagrangian data by testing the proposed methodology on two numerical and two experimental datasets. The numerically-simulated flows consist of a 2D double-gyre flow and 3D forced isotropic turbulence, while experimental data are collected for drifter positions over the North Atlantic ocean and an unsteady starting vortex. The double-gyre flow is a widely used setup to test Lagrangian-based coherent structure detection tools and consists of two 2D counter-rotating vortices expanding and contracting periodically in one direction, thus representing a simplified pattern frequently occurring in geophysical flows (Balasuriya et al. 2018). Forced isotropic turbulence is also investigated as an exemplifying case of 3D, fully developed turbulent motion. Drifter positions over the North Atlantic ocean are obtained from the NOAA/AOML Environmental Data Service as an example of field measurements. The dataset comprises drifters with varying starting times and lifespans, as well as trajectories with missing data (broken tracks), thus representing a challenging example of realistic field data where very-sparse trajectories can be exploited, and a multiscale recurrence analysis can be carried out. A laboratory-based starting vortex is also considered as a simple but effective case of an unsteady, three-dimensional and turbulent vortical flow, which is a paradigm of several relevant flow applications including, for instance, breath puffs (Abkarian et al. 2020) or bio-inspired flows (Cummins et al. 2018; Rival 2022). The starting vortex flow also provides an example of advected flows, which represent a challenge for loop identification schemes (Dong et al. 2011).

Therefore, our data-driven methodology contributes to providing an alternative (multiscale) paradigm for Lagrangian flow modelling of sparse and very-sparse data, which takes advantage of the emerging developments in network science (Boccaletti et al. 2006; Zou et al. 2019) to uncover the persistent spatio-temporal features of Lagrangian trajectories.

2 Methods

2.1 Multiscale spatial recurrence networks

Recurrence networks have been introduced as a network-based representation of (multivariate or univariate) time series to identify persistent dynamical behaviours in complex systems (Marwan et al. 2009; Donner et al. 2011; Subramaniyam et al. 2015). Unlike the classical idea of recurrence-based analysis to map time series into trajectories of a phase space (Lekscha and Donner 2019), in this study, we shift from the phase-space perspective and apply the recurrence network framework in the physical space, such that Lagrangian tracers tracked along their path in the flow are used to build distance-based recurrence networks. Moreover, by performing a multiscale analysis, we avoid the burden of finding an optimal recurrence threshold which is typical of classical recurrence-based approaches (Kraemer et al. 2018). It is worth mentioning that recurrence networks and recurrent neural networks represent different methodologies, in spite of the similar terminology. Recurrent neural networks, indeed, refer to a class of machine learning architectures where recurrent feedback loops are present (Sherstinsky 2020).

Fig. 2
figure 2

Schematic of the proposed MSRN methodology. a An example of a broken particle trajectory with 16 points and two spheres of radius R centred into two representative points, \(i=\lbrace 2, 5\rbrace\). b A graph representation of the nested recurrence network corresponding to the exemplifying trajectory in panel (a) is also illustrated. c An example of anisotropic reference geometry (grey ellipsoid) for the generalised MSRN, with a local reference frame such that \(x^\prime\) is aligned with the main direction of advection.

A recurrence-based network structure is obtained from a single tracer trajectory, \(\varvec{X}_i \equiv \varvec{X}(t_i)=\left( x(t_i)\,, y(t_i)\,, z(t_i)\right)\), starting at \(\varvec{X}(t_0)\) and then evaluated at \(N_t\) discrete times. First, the network nodes are defined so that each node i (with \(i=1,\dots , N_t\)) corresponds to a discrete point in space \(\varvec{X}_i\). As an example, Fig. 2a shows a schematic of a trajectory (red line) with \(N_t =16\) nodes (black and cyan dots). Given a spatial scale \(R>0\), a recurrence link from a node i and another node \(j>i\) is established if the trajectory exits at a time \(t_k>t_i\) the R-radius ball (namely the R-radius disc in 2D and the R-radius sphere in 3D) centred into i and then re-enters the ball at time \(t_j>t_k\).

In the example of Fig. 2a, the trajectory exits at \(k=4\) from the orange sphere centred in node \(i=2\) but never re-enters into it. On the other hand, a recurrence link is established between nodes \(i=5\) and \(j=13\) for the given R value, because the trajectory exits from the green sphere centred in \(i=5\) and re-enters at \(j=13\). It is worth stressing that links here are established between points (nodes) belonging to the same trajectory. Inter-trajectory connections might also be considered but, since we focus on very-sparse seeding in which two trajectories are unlikely to occupy nearby positions, they are excluded in this study (see also a discussion in Sect. 5). Figure 2b provides a graph representation of the network structure of the corresponding schematic in Fig. 2a, where nodes are depicted as filled circles and the recurrence link as a green line.

We further refine the network construction procedure by implementing a nestedness criterion, so that only links leading to a locally-nested network structure are allowed. Specifically, if a link is activated between two nodes i and j (with \(t_j>t_i\)), any other intermediate node k (i.e. such that \(i<k<j\)) can only establish a recurrence link with a node h also in between i and j, namely such that \(i<k<h<j\). In Fig. 2, since a recurrence link between \(i=5\) and \(j=13\) is first established (green segment in Fig. 2b), the link between \(k=6\) and \(h=14\) (black dashed line in Fig. 2b) is discarded because the link between nodes k and h does not provide a nested structure (i.e. links intersect with each other).

Following the interpretation of recurrence network structure in terms of vortex-like flow structures, the nestedness criterion gains a phenomenological interpretation, where the multiscale recurrence network can be associated with the presence of larger eddies containing smaller eddies in a nested structure. Furthermore, the nestedness criterion allows one to retain only relevant links for a given scale R, thus leading to computational benefits (as the total number of links diminishes).

The key feature of our methodology is represented by the ability of MSRNs to capture looping flow structures as densely interconnected sub-graphs, i.e. sets of nodes associated with a high number of recurrent links. Therefore, characteristic scales of fluid flows are identified through the appearance of peaks in the number of recurrence links, \(N_{\text {rec}}(R)\), for varying spatial scales, \(R\in \left[ R_{\text {min}}, R_{\text {max}}\right]\). Specifically, to account for different track lengths (typical of Lagrangian datasets), we normalise the number of recurrence links as follows:

$$\begin{aligned} N^\prime _{\text {rec}}=\frac{N_{\text {rec}}}{L/l_{\text {ref}}}, \end{aligned}$$
(1)

where \(L=\sum _i \Vert \varvec{X}_{i+1} - \varvec{X}_i\Vert _2\), with \(i=1,\dots ,N_t -1\), is the track length, while \(l_{\text {ref}}\) is a characteristic flow scale. This normalisation accounts for the fact that trajectories with longer tracks, L, can span a larger range of scales \(R<L\); accordingly, \(N^\prime _{\text {rec}}\) is interpreted as an average number of recurrence links per unit (normalised) trajectory length.

By using the scale R as a variable (in the range \(\left[ R_{\text {min}}, R_{\text {max}}\right]\)), we, in fact, avoid the need for defining an optimal value a priori to initiate our analysis (Kraemer et al. 2018). The values of \(R_{\text {min}}\) and \(R_{\text {max}}\) can be determined by geometrical and measurement constraints. In general, \(R_{\text {max}}\) is dictated by the largest spatial size of the domain, while \(R_{\text {min}}\) can be associated with the measurement resolution (or, for simplicity, bounded to zero). In practice, when trajectories are available, \(R_{\text {min}}\) and \(R_{\text {max}}\) can be easily calculated as the minimum and maximum values, respectively, of the pairwise distance between each trajectory point. Although the number of different R values in-between \(R_{\text {min}}\) and \(R_{\text {max}}\) depends on the problem, specifically on the result resolution desired, we suggest as a rule of thumb that \(O(10^2)\) different R values spanning the range \(R_{\text {min}}\) and \(R_{\text {max}}\) can be an appropriate order of magnitude to capture dominant flow scales while not significantly increasing the computational cost.

2.2 Generalised MSRNs: application to strongly-advected looping trajectories

The MSRN methodology described above can be generalised to explicitly account for the effect of significant flow advection on the trajectories. The effect of advection on single-trajectory loop identification schemes is a known issue arising in realistic applications. Such an issue has been typically overcome by deforming the trajectories through the subtraction of an average advection displacement (Dong et al. 2011). However, this pre-processing implies the knowledge of the direction and mean velocity of advection that can both change over time for unsteady problems.

To tackle this issue, in this work, we adopt a fully-automatic strategy where the advection direction is extracted from the trajectory data rather than defining it a priori. Specifically, to account for the dominant advection effect on the tracer trajectory, an anisotropic reference geometry—that is an ellipsoid with semi-axis distances \(R_x\), \(R_y\), and \(R_z\) measured in a local reference system \((x^\prime , y^\prime , z^\prime )\)—is used instead of a sphere with radius R (isotropic geometry), as sketched in Fig. 2c. Here \(x^\prime\) is conventionally assumed as the main direction of advection and is obtained as the principal axis of a principal component analysis (PCA) of the trajectory positions, i.e. the direction of maximal variance of trajectory points. This choice to identify \(x^\prime\) is justified by the fact that the tracer positions in a stretched trajectory (as an effect of the significant flow advection) occupy a larger spatial domain in the advection direction, thus naturally leading to a higher position variance along \(x^\prime\).

The amount of recurrence links can now be plotted against \(R_x\), \(R_y\), and \(R_z\) to highlight the looping structures in the three local Cartesian directions. In fact, due to a significant advection displacement (compared with the rotational motion in the \(y^\prime\)-\(z^\prime\) plane), recurrence links cannot appear at any isotropic scale R (i.e. when a sphere is used as a reference geometry) while they are captured by an anisotropic reference geometry. While the choice of the relative proportions between \(R_x\), \(R_y\) and \(R_z\) is generally problem-dependent (e.g. based on boundary conditions), here, we vary \(R_x\), \(R_y\) and \(R_z\) proportionally to the average (Euclidean) distance of tracer positions along the x, y, and z directions, respectively, as in Iacobello et al. (2019).

3 Test cases and experimental setup

3.1 Two-dimensional double-gyre flow

The double-gyre flow consists of two counter-rotating vortices, unsteadily oscillating in a 2D domain. The velocity field \(\varvec{u}=(u, v)\) is defined in a rectangular domain, \((x, y) \in [0, 2]d_G \times [0, 1]d_G\), as

$$\begin{aligned} u(x, y; \;t)&= -\pi A \sin (\pi h(x; \;t)) \cos (\pi y), \end{aligned}$$
(2)
$$\begin{aligned} v(x, y; \;t)&= \pi A \cos (\pi h(x; t)) \cos (\pi y) \frac{\partial h(x; \;t)}{\partial x}, \end{aligned}$$
(3)
$$\begin{aligned} h(x; \;t)&= \epsilon \sin (\omega t)x^2 + (1-2\epsilon \sin (\omega t))x, \end{aligned}$$
(4)

where \(d_G =1\) is the characteristic length scale. In the present study, \(A=1\), \(\omega = 0.2\pi\) (all variables are in simulation units). The parameter \(\epsilon\) represents the degree of unsteadiness of the system, and it is typically set to be in the range 0.2–0.3 (Balasuriya et al. 2018). Here, the values 0.1 and 0.7 were used to highlight the effect of weaker and stronger unsteadiness, respectively, on tracer motion. Pointwise Lagrangian trajectories, \(\varvec{X}(t)=\left( x(t), y(t)\right)\), were then computed through a first-order temporal integration of the local (Eulerian) velocity field, \(\varvec{u}\) (evaluated at the particle position using Eqs. (24)), so that \(\varvec{u}(t)=\Delta \varvec{X}/\Delta t\). The integration was carried out with a constant time step \(\Delta t=0.2\) for a total time \(T_G=200\), corresponding to twenty times the characteristic time \(2\pi /\omega =10\) (so that \(\Delta t\,\omega /(2\pi )=1/50\), justifying the choice of first-order integration). The starting tracer position, \(\varvec{X}(t=0)\), was set randomly in the domain.

3.2 Three-dimensional isotropic turbulence

A direct numerical simulation (DNS) of 3D, forced isotropic turbulence at Taylor scale Reynolds number, \(Re_\lambda \equiv \lambda u_{\text {rms}}/\nu =433\), was also used to extract Lagrangian trajectories. Here \(\lambda\) is the Taylor micro-scale, while \(u_{rms}\) and \(\nu\) are the root-mean-square velocity and fluid kinematic viscosity, respectively (all variables are in simulation units). Velocity and pressure are computed into a 1024×1024×1024 grid spanning a \(2\pi \times 2\pi \times 2\pi\) domain with periodic boundary conditions. After the statistically stationary state is reached, velocity and pressure are stored for a temporal window \(T_{\text {iso}} =10.056\), corresponding to about five large-eddy turnover times. The dataset is available online from the Johns Hopkins Turbulence Database (Li et al. 2008), and further details on the simulation can be found in the Johns Hopkins Turbulence Database (2022). A characteristic length scale here is the integral length scale, \(d_{\text {iso}}\), which is \(d_{\text {iso}} =1.364\).

Pointwise Lagrangian trajectories, \(\varvec{X}(t)=\left( x(t), y(t), z(t)\right)\), were then obtained through the getPosition function available on the online database (see Yu et al. (2012) for further details on getPosition). Temporal integration over \(T_{\text {iso}}\) was performed using the simulation time step (equal to \(2\times 10^{-3}\)) and an 8th-order Lagrangian interpolation scheme in space. The starting position, \(\varvec{X}(t=0)\), was set randomly in the domain, and trajectory points are collected with a time step \(\Delta t=1.6\times 10^{-2}\).

3.3 Oceanic drifter trajectories

Drifter data are extracted from the Global Ocean Drifter Program available at NOAA/AOML Drifter Data Assembly Center (NOAA 2022), where Lagrangian data have been collected from 1979, with drifter positions given every 6 hours. In this work, we consider drifters having a minimum lifetime of 6 months, within the same 5-year time span (2005–2009) adopted by Froyland and Padberg-Gehle (2015). We limited the area of observation to drifters with starting position in the North Atlantic Ocean, i.e. with starting latitude in the range [\(15^\circ\) N, \(60^\circ\) N] and starting longitude in the range [\(75^\circ\) W, \(0^\circ\)]. As a result, a dataset containing 549 trajectories was obtained, corresponding to drifters starting and terminating at different times, and trajectories with missing data, thus providing a realistic and challenging scenario. In order to establish recurrence links, the shortest distances between point pairs over the ocean surface (great circle distance) were computed using latitude and longitude angles and then converted into kilometres using a spherical approximation for Earth with an average radius of 6371 km (Froyland and Padberg-Gehle 2015).

Fig. 3
figure 3

Schematic of the experimental setup consisting of an open-jet vertical wind tunnel specifically designed for unsteady pulses (Galler and Rival 2021, 2023). The wind tunnel frame is displayed, which includes, within the black frame, two turbulence grids and a honeycomb flow straightener (not shown).

3.4 Starting vortex experiment

The experimental setup consists of an open-jet vertical wind tunnel from the Queen’s University laboratory (Kingston, Canada) specifically designed for unsteady pulses and that has been used to test the response of biological seeds in gusts (Galler and Rival 2021, 2023). Figure 3 illustrates a schematic of the setup. The wind tunnel has a square outlet cross-section of side length \(d_V =0.115\,\)m and a contraction ratio 6:1. The flow was generated by suddenly activating 9 fans located at the wind tunnel inlet, achieving a steady-state velocity of \(U_c\approx 1.8\,\text {m}\,\text{s}^{-1}\) at the nozzle exit centreline. A Photron Fastcam Mini WX 100 high-speed camera (focal length \(35\,\)mm and pixel size \(10\,\mu\)m) with the axis normal to the x-y plane (see Fig. 3) was mounted to capture tracers’ motion in a cubic measurement volume of about 1-m side. A single-camera approach was used to track soap bubbles in the air (Hou et al. 2021; Kaiser and Rival 2023), which were used as tracers illuminated by a LED light source placed perpendicular to both cameras. The bubble generator was designed to produce bubbles without an additional air supply, as well as acceptable consistency in the bubble size. A preliminary sensitivity analysis showed that over \(95\%\) of all measured bubble diameters were within \(\pm 5\%\) of the mean (nominal) diameter of \(30\,\)mm (Kaiser and Rival 2023).

Additionally, PIV data were collected to provide a ground truth in terms of the (Eulerian) vorticity field. PIV data were collected using a Photron SA4 camera (\(1024\times 1024\) pixel resolution) with the axis normal to the (xz) plane. A 60-mJ-per-pulse Photonics Nd:YAG laser (producing an approximately \(2\,\)mm thick laser sheet) was used to illuminate seeding fog (approximate particle diameter of \(3 \mu\)m) produced by a Degree Controls C-Breeze fog machine (Galler and Rival 2021, 2023), resulting into a PIV field-of-view of about \(65\,\text {cm}\times 65\,\)cm. Velocity values in the x-z plane are then evaluated at 500-fps frame rate for \(1\,\)s, in an \(86\times 86\) regular grid with spacing 7.91 mm, and averaged over 15 realisations.

4 Results

4.1 Application to numerical flows

The numerically-simulated double-gyre and isotropic turbulence flows are first analysed, owing to their simplicity with respect to experimental data, and results are reported in Fig. 4.

Fig. 4
figure 4

ad Double-gyre flow; e, f forced isotropic turbulence. Four particle trajectories in the double-gyre flow are considered for two unsteadiness parameter values: (a) \(\epsilon =0.1\); (c) \(\epsilon =0.7\). (b, d Normalised number of recurrences, \(N^\prime _{\text {rec}}\), against the recurrence scale, \(R/d_G\), for particle motion in panels (a) and (c) with matching colours. (e) A 3D visualisation of a particle trajectory (red arrow) in the forced isotropic turbulence domain. A transparent grey sphere of diameter equal to \(d_{\text {iso}}\) is also shown. (f) \(\langle N^\prime _{\text {rec}} \rangle\) against \(R/d_{\text {iso}}\) evaluated over 100 sparsely-distributed, particle trajectories. The bars refer to the standard error, \(\epsilon _{\text {iso}}\), at each normalised scale, while the black lines refer to the median (50th percentile) and the 90th percentile for the whole ensemble. The inset in (f) shows the normalised number of recurrences for the trajectory depicted in (e)

For the double-gyre flow, four particles are randomly seeded in the domain, and their behaviour is investigated for two unsteadiness parameter values, \(\epsilon\). Figure 4a and c show the particle trajectories in the physical plane; results do not significantly change for different particle starting positions (not shown). For weak unsteadiness (\(\epsilon = 0.1\), Fig. 4a), trajectory paths clearly trace the shape of the two gyres mainly located at \(0<x/d_G<1\) and \(1<x/d_G<2\). In contrast, in the case of stronger unsteadiness (\(\epsilon =0.7\), Fig. 4c), trajectory paths clearly display more chaotic behaviours. Figure 4b and d show the normalised number of recurrence links, \(N^\prime _{\text {rec}}\) (with \(l_{\text {ref}} \equiv d_G\)), as a function of the (normalised) recurrence threshold, \(R/d_G\), for both \(\epsilon\) configurations. A unique peak of \(N^\prime _{\text {rec}}\) can be clearly detected at \(R/d_G = 1\) for the case of \(\epsilon =0.1\), as a consequence of the presence of a vortex-like flow with characteristic size \(d_G\). For \(\epsilon =0.7\), a single peak of \(N^\prime _{\text {rec}}\) can also be identified. However, this peak is less sharp and occurs at \(R/d_G \approx 0.5\), as a consequence of the more chaotic particle motion resulting from stronger underlying flow unsteadiness.

Furthermore, we observe that—although recurrence networks are built on single trajectories—the behaviour of \(N^\prime _{\text {rec}}\) is consistent among all particles. Such consistency appears even in the case of \(\epsilon =0.7\) (Fig. 4d) where particles follow spatial paths with notable spatial differences.

Capabilities of the recurrence-based approach are then assessed on a 3D flow of forced isotropic turbulence. For this test case, 100 particles are randomly seeded in the flow (particle density is \(l_m/l_{\text {ref}} \approx 3.1>1\), that is very-sparse seeding, with \(l_{\text {ref}} \equiv d_{iso}\)) and tracked for the full simulation time, \(T_{\text {iso}}\). This temporal window does not allow one to analyse very long tracks, thus posing further obstacles to flow feature detection. An example of a particle trajectory from the 3D isotropic turbulence is shown in Fig. 4e as a red line (the arrow indicates increasing time). Due to the flow isotropy and stationarity, multiple trajectories can be assumed to belong to different realisations of the same flow. Accordingly, an ensemble average, \(\langle N^\prime _{\text {rec}}\rangle\), of \(N^\prime _{\text {rec}}\) values is computed and shown in Fig. 4f as a function of \(R/d_{\text {iso}}\). A peak of \(\langle N^\prime _{\text {rec}}\rangle\) can be clearly detected for \(R/d_{\text {iso}} \approx 1\), namely for recurrence thresholds R comparable with the integral length scale, \(d_{\text {iso}}\). An uncertainty characterisation is also provided in Fig. 4f, by illustrating the standard error, \(\epsilon _{\text {iso}}\), at each scale \(R/d_{\text {iso}}\), as well as the 50th and 90th percentiles: All curves statistically corroborate the main finding by consistently highlighting a peak at \(R/d_{\text {iso}} \approx 1\). To better highlight this outcome, a transparent sphere of diameter equal to \(d_{\text {iso}}\) is also illustrated in Fig. 4e: The particle trajectory in red follows a loop-like path whose characteristic size is comparable to the sphere size. This outcome is supported, in the inset of Fig. 4f, by the appearance of a peak in the behaviour of \(N^\prime _{\text {rec}}\) corresponding to the particle trajectory of Fig. 4e.

In summary, results from the numerical test cases provide some preliminary insights about the ability of MSRNs to detect characteristic flow scales from very-sparse data, through the appearance of recurrence peaks.

Fig. 5
figure 5

a Drifter trajectories over the North Atlantic ocean (\(l_m/d_O\approx 27\)) corresponding to scenario (i) in Fig. 1a. b Ensemble-averaged number of recurrence links, \(\langle N^\prime _{\text {rec}} \rangle\), for three randomly-picked subsets of drifters, as a function of the (normalised) recurrence scale (\(R/d_O\)). Note that drifter trajectories in panel (a) correspond to Subset 3 of panel b. c Two snapshots of the starting vortex (using fog visualisation) in the xz plane, where z is the vertical direction. The starting vortex is representative of many common problems, e.g. scenarios (ii)-(iv) in Fig. 1a. d \(\langle N^\prime _{rec} \rangle\) against \(R/d_V\) evaluated over 100, 3D bubble trajectories.

4.2 Application to experimental data

In this section, MSRNs are tested on the more challenging, realistic flow from experimental data. Oceanic drifter trajectories are considered first; in particular, three randomly-chosen subsets of 10 tracers are extracted from the full dataset (549 trajectories) representing very-sparse conditions. Figure 5a shows drifter trajectories for one of the three subsets, while the ensemble-averaged number of recurrence links is reported in Fig. 5b. Although recurrence networks are built on individual tracks, multiple trajectories can be assumed to belong to different realisations of the same flow, owing to the arbitrary selection of drifter subsets. Results highlight the appearance of peaks in the number of recurrence links for \(R/d_O \le 1\), where \(l_{\text {ref}}\equiv d_O=100\) km is the characteristic length scale, corresponding to the conventional upper-bound spatial scale for the oceanic mesoscale motions (Martínez-Moreno et al. 2021). The outcome from the recurrence plot of oceanic drifters in Fig. 5b is in accordance with typical mesoscale oceanic motion scales, which span in the range \(0.1<R/d_O<1\) and drive the drifter motion (Martínez-Moreno et al. 2021).

It is worth noting that, although \(\langle N^\prime _{\text {rec}}\rangle\) spans a range of scales since drifters are affected by flow structures of different sizes, the methodology is able to detect the dominant (mesoscopic) scale through a recurrence peak. Consistent results are obtained here for different drifter subsets, as illustrated in Fig. 5b, thus corroborating the methodology’s robustness against the choice of different trajectories, as shown for numerical test cases. Moreover, Fig. 5a–b exemplify a typical situation in which the R range can be narrowed to avoid unnecessary calculations for very large spatial scales, since \(R_{\text {max}}=1000\) km is much smaller than the domain size but is sufficient to capture characteristic oceanic motion scales.

Figure 5c shows two snapshots of the starting vortex at two different times, while the ensemble-averaged number of recurrence links, \(\langle N^\prime _{\text {rec}}\rangle\), for a set of 100 three-dimensional trajectories is displayed in Fig. 5d. Similarly to drifter trajectories, an ensemble average can be performed owing to the experimental reproducibility of the starting vortex flow. Results highlight the appearance of a main peak in the number of recurrence links for \(R/d_V=0.5\), thus identifying the presence of the characteristic vortical structure in the starting vortex flow, while local peaks at \(R/d_V>1\) suggest the presence of weakly-advected rotational motion. In what follows, the starting vortex is further exploited as a representative flow to showcase the potential of MSRNs in distinguishing between vortical-dominated and advection-dominated particle motion from both 2D and 3D trajectories (Sect. 4.3), and how our methodology can be generalised to explicitly account for effects of unsteadiness and strong advection (Sect. 4.4).

4.3 Detailed MSRN analysis of 2D and 3D bubble trajectories

Four representative, two-dimensional, bubble tracks are shown in Fig. 6a, while Fig. 6b shows the corresponding \(N^\prime _{\text {rec}}\) plots, as a function of \(R/d_V\). For bubble 1 (blue lines in Fig. 6a and b), we observe an \(N^\prime _{\text {rec}}\) peak at very small \(R/d_V\) values, as the bubble is mainly advected (a small loop can be detected at \(x/d_V\approx 1.5\) and \(z/d_V\approx 0.75\)). In contrast, an \(N^\prime _{rec}\) peak at \(R/d_V\approx 0.5\) can be observed for bubbles 2 and 3 (red and green lines, respectively). These peaks result from larger loops experienced by the corresponding tracks (Fig. 6a) and are resonant of the presence of a vortex with comparable size as also highlighted in Fig. 6b. In fact, the vortex core—quantified as the area where the vorticity \(\omega _y\) is larger than \(50\%\) of its maximum value—shown as yellow contours in Fig. 6c has a characteristic size of \(l/d_V \approx 0.48\). Bubble 2 also displays a second \(N^\prime _{\text {rec}}\) peak at \(R/d_V \approx 4\) that is related to the fact that the starting vortex moves upwards, and, as a consequence, the same bubble experiences the presence of the (counterclockwise) vortex more than once. However, since the loop centred at \(z/d_V\approx 5.5\) is not closed due to the sudden bubble collapse, the second \(N^\prime _{\text {rec}}\) peak is more significant than expected. Due to the significant effect of advection, a time-varying MSRN analysis will, therefore, provide more refined insights by correctly distinguishing the main loop for \(R/d_V \approx 0.5\) from the bubble advection (see Fig. 7a and b).

Fig. 6
figure 6

a Two-dimensional bubble trajectories and b the corresponding (normalised) number of recurrences; arrows indicate increasing time. c A snapshot of particle image velocimetry (PIV) data averaged over 15 realisations, illustrated as velocity black arrows and a vorticity colour map. Yellow contours correspond to \(\omega _y d_V/U_c=\pm 4\times 10^{-3}\) and highlight the vortex core whose size is comparable to the spatial scale of maximum recurrence. d Three-dimensional bubble trajectories and e the corresponding (normalised) number of recurrences, corroborating the outcomes from 2D trajectories (panels (a) and (b)). f A zoomed-in visualisation of the broken track of bubble 8

Three-dimensional Lagrangian trajectories are also used to build recurrence networks. Although conceptually similar to the 2D analysis, the third dimension affects the pairwise distance values and, in turn, the recurrence link activation for a given R. Four representative bubbles are used, whose tracks are shown in Fig. 6d. The corresponding \(N^\prime _{\text {rec}}\) behaviour is displayed in Fig. 6e, corroborating the findings for 2D tracks. In fact, we observe an \(N^\prime _{\text {rec}}\) peak at \(R/d_V\ll 1\) for bubble 5, which is mainly advected and displays only a small-scale loop. Bubble 6 shows a peculiar behaviour with multiple local peaks in the range \(0.1<R/d_V <1.8\), corresponding to loops of different sizes (cyan line in Fig. 6d). As with bubble 4, bubble 6 is subjected to multiple influences of the clockwise-rotating vortex (red areas in Fig. 6c). In contrast, results for bubble 7 do not reveal any looping for any \(R/d_V\) as this bubble is mainly advected by the jet-like core of the starting vortex.

The capabilities of our methodology against broken trajectories are also assessed, i.e. when some temporal data are missing from the trajectory (as happens in some experimental data). Figure 6d shows an example of a broken trajectory (grey line, see also the zoomed-in plot in Fig. 6f), namely where the track includes missing data; the corresponding \(N^\prime _{\text {rec}}\) versus \(R/d_V\) plot is reported in Fig. 6e. High values of \(N^\prime _{\text {rec}}\) are still identified around \(R/d_V =0.5\), further confirming the ability of the MSRN approach to detect the characteristic scale of flow motion even for incomplete Lagrangian data. Such ability comes from the fact that MSRNs highlight the presence of vortical structures by accumulating recurrence links along the looping intervals of the trajectory (see Sect. 2.1).

The results discussed so far for the starting vortex flow corroborate the ensemble-averaged result of Fig. 5d, as well as the methodology’s robustness displayed for numerical test cases, but also provide a richer picture of the behaviour of individual bubble trajectories elucidating the origin of the main and secondary peaks in Fig. 5d.

4.4 Time-dependent and generalised MSRNs

The MSRN-based analysis discussed in the previous sections focused on the full-trajectory length. Although the methodology was able to successfully highlight dominant scales, considering the full-trajectory length leads to an averaging effect that might not be suitable for unsteady flows. Likewise, highly-advected trajectories would require an anisotropic approach to distinguish between looping and advective motion.

The potential of the MSRN approach for the study of unsteady flows is first assessed by performing a time-dependent analysis of the starting vortex flow. In Fig. 7a and b, we report results for two different strategies to investigate the behaviour of bubble 2 of Fig. 6a, whose trajectory is also shown in the inset of Fig. 7a where colours indicate the bubble lifespan as an increasing percentage of time, \(T/T_{\text {tot}}\). A first time-dependent strategy consists in cumulatively increasing the temporal window over which the links are established, while a second strategy consists in fixing a temporal window, \(\Delta T\), sliding over the total bubble lifespan.

Figure 7a shows the normalised number of recurrences, \(N^\prime _{\text {rec}}\), as a function of scales \(R/d_V\) and increasing time, \(T/T_{\text {tot}}\) (i.e. the first strategy). As time cumulatively increases, a peak appears at \(T/T_{\text {tot}} \approx 0.5\) owing to the fact that the bubble loop is fully captured within this temporal window. In Fig. 7b, instead, we adopt the second strategy by using a fixed sliding window, whose temporal length is increased as \(\Delta T/T_{\text {tot}}=0.05, 0.1, \dots , 0.95, 1\). A main peak in the number of recurrences is observed in Fig. 7b for \(\Delta T/T_{\text {tot}}\approx 20\%\) owing to the fact that the closed loop of the bubble starts at \(T/T_{\text {tot}}\approx 30\%\) and terminates at \(T/T_{\text {tot}}\approx 50\%\) (see inset in Fig. 7a), such that the most appropriate temporal window to capture the loop is at \(\Delta T/T_{\text {tot}}\approx 20\%\). The wake-like effect observed in Fig. 7a and b after the main peak is the result of attenuation in the relative number of recurrence links for increasing temporal windows (and hence an increase in the track length). Note that the behaviour of \(N^\prime _{\text {rec}}\) for \(T/T_{\text {tot}}=100\%\) and \(\Delta T/T_{\text {tot}} =100\%\) in Fig. 7a and b is the same and coincide with the corresponding (red) \(N^\prime _{\text {rec}}\) plot in Fig. 6b.

Fig. 7
figure 7

a, b Time-dependent analysis of the normalised number of recurrence links: (a) a cumulatively increasing time window \(T/T_{\text {tot}}\) strategy and (b) a fixed sliding window strategy with increasing size, \(\Delta T/T_{\text {tot}}\). Results are illustrated for bubble 2 of Fig. 6a, which is also shown in the inset of panel (a) as a coloured track highlighting the bubble travelling time percentage, \(T/T_{\text {tot}}\). cf Generalised MSRN results for bubble 5 of Fig. 6d. (c) Normalised number of recurrence links as a function of \(R/d_V\) (reference/isotropic approach), for three uniformly spanned time intervals, \(T_{1,2,3}\); the inset shows the bubble trajectory with colours associated with the three time intervals. The normalised number of recurrence links is shown as a function of (d) \(R_x/d_V\), (e) \(R_y/d_V\), and (f) \(R_z/d_V\). The same legend shown in c holds also for panels (df)

We then use the generalised MSRN framework to account for the trajectory stretching that results from the effect of strong flow advection. As mentioned in Sect. 2.2, the number of recurrence links is now plotted against spatial scales \(R_x\), \(R_y\), and \(R_z\) such that (similar to the analysis carried out so far) peaks in \(N_{\text {rec}}\) will reveal the presence of characteristic vortical motion in the flow. As a representative example, in Fig. 7c–f, we show results for the 3D trajectory corresponding to bubble 5 of Fig. 6d. A time-varying approach is pursued, where \(N^\prime _{\text {rec}}\) is reported for three consecutive intervals \(T_{1,2,3}\) with constant duration \(\Delta T/T_{\text {tot}}=1/3\).

Figure 7c shows that the isotropic MSRN approach reveals a significant loop with scale \(R/d_V\approx 0.15\) in the second time interval (\(T_2\)), while the bubble is advected for the remaining time. This outcome is consistent with the results from the generalised (anisotropic) MSRN approach (Fig. 7d–f) and with a visual inspection of the bubble track (Fig. 7c). However, the anisotropic MSRN approach provides a more detailed picture of the bubble’s motion. In fact, at \(T_2\), the looping structure is strongly anisotropic as \(R_x/d_V \approx 0.35\), \(R_y/d_V \approx 0.23\), and \(R_z/d_V\approx 0.06\). Moreover, two very small-scale looping structures are detected through the generalised MSRN approach at \(T_1\) and \(T_3\), revealed by the \(N^\prime _{rec}\) peaks for \(R_y/d_V \approx 0.06\) and \(R_z/d_V \approx 0.02\). Finally, in Fig. 7d, the \(N^\prime _{\text {rec}}\) peaks are present at scales \(R_x/d_V\) comparable with the trajectory length (i.e. \(L/d_V\approx 1\)) at time intervals \(T_1\) and \(T_3\), as \(x^\prime\) is assumed to be the main direction of advection. In contrast, the \(N^\prime _{\text {rec}}\) peak shifts back towards lower \(R_x/d_V\) at \(T_2\) (red line in Fig. 7d) as the looping behaviour is dominant over advection at this time interval.

5 Discussion and conclusions

Very-sparse Lagrangian datasets place significant challenges on the identification of coherent flow features in myriad applications. Specifically, due to the lack of a large number of trajectories, traditional identification techniques (e.g. based on velocity gradients) simply cannot rely on detailed spatial information as used for dense Lagrangian analyses. While previously proposed approaches have been successfully employed on dense or semi-sparse data, in this study, a spatial recurrence-based network approach is proposed to tackle the problem of very-sparse data. The aim here is to overcome the issue of very-sparse Lagrangian datasets, by exploiting the information enclosed in single Lagrangian trajectories and by highlighting the presence of coherent vortical motion in the flow over a broad range of spatial scales.

The proposed methodology displays some desirable features. First, in contrast with previous loop identification schemes, the present approach does not require a priori parameter setting, only relies on tracer positions, and can deal with both 2D and 3D unsteady flows. The current approach can easily account for broken trajectories, as well as highly-convective flows. Building the networks on individual trajectories also allows one to potentially ensemble-averaged results, overcoming some issues of spatial interpolation (like data sparsity or the presence of solid obstacles in the flow). These MSRN features make the proposed approach a suitable tool for the analysis of realistic (steady and unsteady) flows that, beyond the examples considered here, can range from micro-scale to geophysical flows where very-sparse data are the norm rather than the exception.

As the network nodes correspond to the trajectory (discrete) points, the computational cost to build a recurrence network increases with the number of time steps in which the trajectories are evaluated, as well as the number of scales considered. However, the number of time steps does not represent a major issue in most practical applications where short tracks are usually available. Moreover, similar to other Lagrangian-based methods, the accuracy of the present approach is related to the tracer size as well as their inertial effects in the flow. This aspect will be the focus of future research, as it is relevant for experiments where tracers can only be approximated as infinitely small points. Hence, owing to the capabilities of networks to model complex flow systems (Iacobello et al. 2020; Taira and Nair 2022), future work will be carried out to account for additional physical effects (e.g. particle inertia or rotation) into the recurrence framework, as well as to test it on other flow configurations.

It is worth noting that the present methodology could be further generalised, e.g. to model inter-trajectory connections as in previous works (e.g. Rypina and Pratt 2017; Iacobello et al. 2019; Martins et al. 2021; Schlueter-Kuck and Dabiri 2017). In fact, a multi-layer network structure can be obtained (Zou et al. 2019), where different trajectories correspond to different layers, and inter-layer (i.e. inter-trajectory) links are activated, e.g. based on trajectory spatial proximity (Schneide et al. 2018; Iacobello et al. 2019) or trajectory similarity (Schlueter-Kuck and Dabiri 2017).

In conclusion, the MSRN features potentially make our multi-disciplinary approach an effective and versatile modelling framework for the characterisation of fluid flows even under challenging scenarios. As such, following the recent developments in network-based modelling (Fernex et al. 2021), results presented here are expected to pave the way for the MSRN-based analysis of a variety of increasingly challenging datasets collected through state-of-the-art sensors (Galler and Rival 2023), in order to gain insights otherwise unavailable via very-sparse trajectories.