1 Introduction

Understanding and predicting turbulent processes in geophysical and laboratory setups is key to many important questions [1,2,3,4]. Although observation technologies are progressing, the quality and quantity of available data are still inadequate in many respects, the most important limitations being measurement sensitivity and spatial and temporal sparsity [5,6,7]. As a result, different tools have been developed to reconstruct the whole ‘dense’ field from sparse/noisy/limited measurements. Gappy Proper Orthogonal Decomposition (GPOD) is based on a linear reconstruction in eigenmodes [8]; it was later improved and applied in [9] to the simple, idealized and prototypical case of a flow past a circular cylinder with random spatio-temporal gappiness. More recently, inspired by the success of Convolutional Neural Networks (CNNs) in computer vision tasks, a series of proof-of-concept studies have used a Generative Adversarial Network (GAN) [10] to reconstruct two-dimensional (2D) snapshots of three-dimensional (3D) rotating turbulence with a large gap [11, 12]. In our previous work [12], POD- and GAN-based reconstructions were systematically compared in terms of point-wise error and turbulent statistical properties, for gaps of different sizes and geometries. For super-resolution tasks, CNNs and GANs have been applied to recover high-resolution laminar or turbulent flows from low-resolution data, coarse in space and time [13,14,15,16]. Moreover, CNNs and GANs have recently been proposed to reconstruct 3D velocity fields from several 2D sections [17] or from a cross-plane of unpaired 2D velocity observations [18].

In this paper, we deal with different inference problems, connected to gappiness in the variety of physical fields that can be measured. In particular, we ask how much one can use the measurement of one component of the 3D turbulent velocity field to predict another. The problem is important for many field and laboratory applications where sensors/observations can collect fluctuations only along some preferred directions, e.g., Particle Image Velocimetry (PIV) [19, 20] and wind observation from satellite infrared images [21,22,23,24,25]. Inference problems can be attacked by Extended Proper Orthogonal Decomposition (EPOD), a linear decomposition on a predefined basis of independent functions [26]. Recently, for a turbulent open-channel flow, it has been shown that the 2D velocity-fluctuation fields at different wall-normal locations can be predicted with good statistical accuracy from the wall-shear-stress components and the wall pressure, using a fully nonlinear CNN [27] or GAN [28]. Here we ask a more complex, multi-objective question, checking how well one can infer the unknown field in terms of (i) the point-to-point \(L_2\) error and (ii) the statistical properties, e.g., on the basis of the Jensen–Shannon (JS) divergence between the probability density function (PDF) of the ground truth and that of the predicted velocity component.

Fig. 1 Energy spectra of different velocity components and the total energy spectrum from the TURB-Rot database [39]. The Kolmogorov dissipative wave number, \(k_\eta =32\), is determined as the scale where the total energy spectrum starts to decay exponentially

To be concrete and connect with important applications, we concentrate on the case of 3D turbulent flows under rotation, a system of great physical interest and of high relevance to geophysical and engineering problems [29,30,31]. Under strong rotation, the flow tends to become quasi-2D, with the formation of large-scale coherent vortical structures parallel to the rotation axis [32,33,34]. Moreover, strong non-Gaussian, 3D and intermittent fluctuations exist at small scales because of the forward energy cascade [35,36,37,38]. The tendency toward quasi-2D configurations implies that the out-of-plane velocity component, \(u_z\), behaves almost as a passive scalar advected by the in-plane components, \(u_x\) and \(u_y\). This is further illustrated in Fig. 1 by the energy spectra of the different velocity components from the database used in this study: the energy of \(u_x\) or \(u_y\) dominates that of \(u_z\) over the whole range of scales, indicating that \(u_z\) is almost decoupled from the leading dynamics. Therefore, \(u_x\) and \(u_y\) are well correlated with each other, while \(u_z\) is expected to show weaker correlations (Fig. 2). In this work, we assess the potential of EPOD, CNN and GAN methods to infer the velocity components on 2D slices perpendicular to the rotation axis. We limit the analysis to the scenario where only instantaneous measurements are provided, without any temporal sequence information. Two tasks of different difficulty are studied, as shown in Fig. 2: (I) using one in-plane component, \(u_x\), to predict the other, \(u_y\), and (II) using an in-plane component, \(u_x\), to predict the out-of-plane one, \(u_z\).

Fig. 2 Examples of the two inference tasks on 2D slices of 3D turbulent rotating flows: (I) using \(u_x\) to infer \(u_y\) and (II) using \(u_x\) to infer \(u_z\). The rotation axis is along the z-direction. Note that the in-plane components, \(u_x\) and \(u_y\), are strongly correlated with each other, while the out-of-plane component, \(u_z\), is less correlated with both \(u_x\) and \(u_y\). Here, \(\varvec{u}_S\) and \(\varvec{u}_G\), respectively, represent the measured quantities and those to be inferred

The organization of this paper is as follows. In Sect. 2, we describe the dataset used to evaluate the inference tools and introduce the EPOD- and GAN-based methods. The CNN used in this study is the generator part of the GAN. Results of the above tools on the two inference tasks are given in Sect. 3. Finally, we present the conclusions in Sect. 4.

2 Methodology

2.1 Dataset

To evaluate the different tools on the inference of velocity components, we use the TURB-Rot database [39], obtained from Direct Numerical Simulations (DNS) of the Navier–Stokes equations for an incompressible rotating fluid in a 3D periodic domain, which can be written as

$$\begin{aligned} \frac{\partial \varvec{u}}{\partial t}+\varvec{u}\cdot \varvec{\nabla }\varvec{u}+2\varvec{ \Omega }\times \varvec{u}=-\frac{1}{\rho }\varvec{\nabla }{\tilde{p}}+\nu \Delta \varvec{u}+\varvec{f}, \end{aligned}$$
(1)

where \(\varvec{u}\) is the incompressible velocity, \(\varvec{ \Omega }= \Omega \hat{\varvec{z}}\) is the rotation vector, \({\tilde{p}}\) is the modified pressure in the rotating frame, and the forcing \(\varvec{f}\), generated from a second-order Ornstein–Uhlenbeck process [40, 41], acts at scales around \(k_f=4\). To enlarge the inertial range, the dissipation \(\nu \Delta \varvec{u}\) is modeled by a hyperviscous term \(\nu _h\nabla ^4\varvec{u}\). Moreover, a linear friction term \(\beta \Delta ^{-1}\varvec{u}\) is used at large scales to reduce the formation of a large-scale condensate [37]. The Rossby number in the stationary regime is \(Ro={{{\mathcal {E}}}}^{1/2}/( \Omega /k_f)\approx 0.1\), where \({{{\mathcal {E}}}}\) is the flow kinetic energy. The Kolmogorov dissipative wave number, \(k_\eta =32\), is chosen as the scale where the total energy spectrum E(k) starts to decay exponentially (Fig. 1). Given the domain length \(L_0\), the integral length scale is \(L={{{\mathcal {E}}}}/\int kE(k)\,\textrm{d}k\approx 0.15L_0\) and the integral time scale is \(T=L/{{{\mathcal {E}}}}^{1/2}\approx 0.185\). See [39] for further details of the simulation.

Although all the inference tools in this study can be applied to 3D data, here we restrict ourselves to 2D horizontal slices, in order to make contact with geophysical observations and PIV experiments, and to limit the amount of data to be used for training (see [14, 15, 17, 18] for a few applications to 3D datasets). To extract data from the simulation, we first sampled snapshots of the full 3D velocity field during the stationary regime. Snapshots are separated by a large temporal interval, \(\Delta t_s=5.41T\), to decrease their correlation in time. There are 600 snapshots, sampled over 3243T at early times, for training and validation, and 160 snapshots, sampled over 865T at later times, for testing. The time separation between the two samplings is more than 3459T. Second, the resolution of the sampled fields is downsized from \(256^3\) to \(64^3\) by a Galerkin truncation in Fourier space:

$$\begin{aligned} \varvec{u}(\varvec{x})=\sum _{\Vert \varvec{k}\Vert \le k_\eta }\hat{\varvec{u}}(\varvec{k})\textrm{e}^{\textrm{i}\varvec{k}\cdot \varvec{x}}, \end{aligned}$$
(2)

where the cutoff wave number is chosen to be the Kolmogorov dissipative wave number, so as to eliminate only the fully dissipative degrees of freedom, for which the dynamics is linear to a good approximation. This choice balances the need to reduce the amount of data to be analyzed against the requirement of not reducing the complexity of the task. For each downsized configuration, we selected 16 x-y planes at different z levels, and each plane is augmented to 11 (for training/validation) or 8 (for testing) different ones using the periodic boundary conditions [39]. After independent random shuffles of the two sets of planes, the Train/Validation/Test split of the dataset is 84480/10560/20480, corresponding to 73.1%, 9.1% and 17.7%, respectively.
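For illustration, the downsizing step can be sketched in a few lines of Python. This is a minimal sketch under our own conventions: the function name, the FFT normalization and the handling of the Nyquist modes are illustrative choices, not part of the TURB-Rot pipeline.

```python
import numpy as np

def galerkin_truncate(u, k_cut=32, n_out=64):
    """Keep only Fourier modes with ||k|| <= k_cut and re-sample on a coarser grid.

    u     : real field of shape (n, n, n) on a 2*pi-periodic domain
    k_cut : cutoff wave number (here the dissipative one, k_eta = 32)
    n_out : target resolution (here 64)
    """
    n = u.shape[0]
    u_hat = np.fft.fftn(u, norm="forward")          # Fourier coefficients of u
    k1 = np.fft.fftfreq(n, d=1.0 / n)               # integer wave numbers
    kx, ky, kz = np.meshgrid(k1, k1, k1, indexing="ij")
    u_hat[kx**2 + ky**2 + kz**2 > k_cut**2] = 0.0   # Galerkin projector, Eq. (2)
    # Copy the retained low-|k| modes into the spectral layout of the coarse grid.
    # The few modes with k_i = +n_out/2 sit at the Nyquist edge of the coarse grid
    # and are dropped here for simplicity (they carry little energy at k_eta).
    h = n_out // 2
    idx = np.r_[0:h, n - h:n]
    u_hat_c = u_hat[np.ix_(idx, idx, idx)]
    return np.fft.ifftn(u_hat_c, norm="forward").real
```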

2.2 EPOD inference

Denote by \(\varvec{u}_S\) the vector of measured quantities and by \(\varvec{u}_G\) the quantities to be inferred, both defined on the 2D spatial domain, \(\Omega \). Note that \(\varvec{u}_S\) or \(\varvec{u}_G\) reduces to a scalar when only one quantity is measured or to be inferred. For inference task (I), one uses \(u_x\) to predict \(u_y\), which can be expressed as

$$\begin{aligned} \varvec{u}_S{:}u_x\rightarrow \varvec{u}_G{:}u_y, \end{aligned}$$
(3)

and for the inference task (II),

$$\begin{aligned} \varvec{u}_S{:}u_x\rightarrow \varvec{u}_G{:}u_z{,} \end{aligned}$$
(4)

see Fig. 2. The first step of the EPOD method is to compute the correlation matrix

$$\begin{aligned} C_S(\varvec{x},\varvec{x}')=\langle \varvec{u}_S(\varvec{x})\otimes \varvec{u}_S(\varvec{x}')\rangle , \end{aligned}$$
(5)

where with \(\langle \cdot \rangle \) we denote an average over the configurations selected for the training set. Then we solve the eigenvalue problem

$$\begin{aligned} \int _\Omega C_S(\varvec{x},\varvec{x}')\varvec{\phi }_S^{(n)}(\varvec{x}')\,\textrm{d}\varvec{x}'=\sigma _n\varvec{\phi }_S^{(n)}(\varvec{x}), \end{aligned}$$
(6)

where \(n=1,\ldots ,{N_\Omega }\) and \({N_\Omega }\) is the number of physical points in the domain \(\Omega \), to obtain the eigenvalues \(\sigma _n\) and the POD eigenmodes \({\varvec{\phi }}_S^{(n)}(\varvec{x})\). Any realization of the observed fields can then be decomposed as

$$\begin{aligned} \varvec{u}_S(\varvec{x})=\sum _{n=1}^{{N_\Omega }}b^{(n)}_S\varvec{\phi }_S^{(n)}(\varvec{x}), \end{aligned}$$
(7)

where the POD coefficients are given by the inner product:

$$\begin{aligned} b_S^{(n)}=\int _\Omega \varvec{u}_S(\varvec{x}){\cdot }\varvec{\phi }_S^{(n)}(\varvec{x})\,\textrm{d}\varvec{x}. \end{aligned}$$
(8)

Furthermore, exploiting the orthogonality of the POD eigenmodes one can also write the following identity [26]:

$$\begin{aligned} \varvec{\phi }_S^{(n)}(\varvec{x})=\langle b^{(n)}_S\varvec{u}_S(\varvec{x})\rangle /\sigma _n. \end{aligned}$$
(9)

The idea of the Extended POD modes is to generalize the exact relation (9) to the case where, on the r.h.s., we use the quantities to be inferred, i.e., replacing \(\varvec{u}_S\) with \(\varvec{u}_G\) while keeping the same coefficients \(b_S^{(n)}\) defined in (8):

$$\begin{aligned} \varvec{\phi }_E^{(n)}(\varvec{x})=\langle b_S^{(n)}\varvec{u}_G(\varvec{x})\rangle /\sigma _n. \end{aligned}$$
(10)

Once the training protocol is completed and the set of EPOD modes (10) is defined, one can start the inference procedure for any new configuration outside the training set, by taking the measured components \(\varvec{u}_S(\varvec{x})\), calculating the coefficients (8) and defining the predicted/inferred field as

$$\begin{aligned} \varvec{u}_G^{(p)}(\varvec{x})=\sum _{n=1}^{{N_\Omega }}b_S^{(n)}\varvec{\phi }_E^{(n)}(\varvec{x}). \end{aligned}$$
(11)

\(\varvec{u}_G^{(p)}\) represents the prediction given by an inference method (here EPOD). Note that \(b_S^{(n)}\) in (11) is calculated from (8) using the measurements of a test configuration, while \(b_S^{(n)}\) in (10) refers to the training set.
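For concreteness, the whole EPOD pipeline of Eqs. (5)–(11) can be sketched in Python as follows. Here the discrete sums over grid points stand in for the integrals over \(\Omega \), with the uniform grid measure absorbed into the modes; all array names and shapes are illustrative.

```python
import numpy as np

# U_S : (n_train, N) flattened training measurements (e.g., u_x on 64x64 slices, N = 4096)
# U_G : (n_train, N) flattened training targets      (e.g., u_y on the same slices)

def epod_fit(U_S, U_G):
    C = (U_S.T @ U_S) / U_S.shape[0]             # correlation matrix, Eq. (5)
    sigma, phi_S = np.linalg.eigh(C)             # eigenvalues and POD modes, Eq. (6)
    sigma, phi_S = sigma[::-1], phi_S[:, ::-1]   # sort by decreasing eigenvalue
    b_S = U_S @ phi_S                            # training coefficients, Eq. (8)
    # Extended modes, Eq. (10); in practice one keeps only the modes whose
    # eigenvalue sigma_n lies above a small threshold to avoid dividing by ~0.
    phi_E = (b_S.T @ U_G) / U_G.shape[0] / sigma[:, None]
    return phi_S, phi_E

def epod_predict(u_S, phi_S, phi_E):
    b_S = u_S @ phi_S                            # coefficients of the new measurement, Eq. (8)
    return b_S @ phi_E                           # inferred field, Eq. (11)
```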

2.3 GAN-based inference with context encoders

The GAN [10] was originally proposed to generate 2D images out of random input sets. The architecture consists of two neural networks: a generator and a discriminator. The generator tries to create new data that are—statistically—similar to the training data, while the discriminator tries to distinguish between real and generated data. Both networks are trained together in a game-like fashion, where the generator tries to fool the discriminator and the discriminator tries to accurately tell real from generated data. Over the learning phase, the generator gets better at creating realistic data that can fool the discriminator, leading to high-quality generated data. For the task attacked here, inferring data from some input configuration, the typical GAN architecture must be slightly adapted, using a context encoder [42] that takes as input the measured data (instead of a random vector) and adds a second loss measuring the \(L_2\) point-to-point distance between the GAN output and the ground-truth configuration. These architectures have already been used to reconstruct gappy 2D turbulent configurations in our previous works [11, 12]. For the velocity-component inference tasks studied here, the GAN architecture is shown in Fig. 3.

Fig. 3 Architecture of the generator and the discriminator for the velocity inference tasks. Each convolution (up-convolution) layer has a kernel size of 4, followed by a Leaky Rectified Linear Unit (ReLU) activation function, except for the last layer of the generator, which uses a Tanh activation function. The number of channels in the layer \(\varvec{u}_S\), or in the layers \(\varvec{u}_G^{(p)}\) and \(\varvec{u}_G^{(t)}\), can be increased when more than one component/quantity is measured or to be inferred

The generator is a functional \(GEN(\cdot )\) that first encodes the supplied measurements, \(\varvec{u}_S\), into a latent feature representation and then, through a decoder, expands this latent representation to produce a candidate for the missing measurements, \({GEN(\varvec{u}_S)=}\varvec{u}_G^{(p)}\), while \(\varvec{u}_G^{(t)}\) denotes the true missing turbulent configuration. The discriminator acts as a ‘referee’ functional \(D(\cdot )\), which takes either \(\varvec{u}_G^{(t)}\) or \(\varvec{u}_G^{(p)}\) and outputs the probability that the supplied configuration is sampled from the true turbulent dataset. The loss function of the generator is

$$\begin{aligned} {\mathcal {L}}_{GEN}=(1-\lambda _{adv}){\mathcal {L}}_\textrm{MSE}+\lambda _{adv}{\mathcal {L}}_{adv}, \end{aligned}$$
(12)

where the \(L_2\) loss

$$\begin{aligned} {\mathcal {L}}_\textrm{MSE}{=}\langle \frac{1}{{A_\Omega }} \int _I \Vert \varvec{u}_G^{(p)}(\varvec{x}){-}\varvec{u}_G^{(t)}(\varvec{x})\Vert ^2 \,\textrm{d}\varvec{x} \rangle \end{aligned}$$
(13)

is defined as the mean squared error (MSE) averaged over the spatial domain of area \({A_\Omega }\). The hyper-parameter \(\lambda _{adv}\) is called the adversarial ratio and the adversarial loss is

$$\begin{aligned} {\mathcal {L}}_{adv}&=\langle \log (1-D(\varvec{u}_G^{(p)}))\rangle \nonumber \\&=\int p(\varvec{u}_S)\log [1-D(GEN(\varvec{u}_S))]\,\textrm{d}\varvec{u}_S \nonumber \\&=\int p_p(\varvec{u}_G)\log (1-D(\varvec{u}_G))\,\textrm{d}\varvec{u}_G, \end{aligned}$$
(14)

where \(p(\varvec{u}_S)\) is the probability distribution of the known input measurements in the training set and \(p_p(\varvec{u}_G)\) is the probability distribution of the predicted fields from the generator. The discriminator is trained simultaneously with the generator to maximize the cross-entropy between the ground truth and generated samples

$$\begin{aligned} {\mathcal {L}}_{DIS}&=\langle \log (D(\varvec{u}_G^{(t)}))\rangle +\langle \log (1-D(\varvec{u}_G^{(p)}))\rangle \nonumber \\&=\int [p_t(\varvec{u}_G)\log (D(\varvec{u}_G)) \nonumber \\&\quad +p_p(\varvec{u}_G)\log (1-D(\varvec{u}_G))]\,\textrm{d}\varvec{u}_G. \end{aligned}$$
(15)

Here \(p_t(\varvec{u}_G)\) is the probability distribution of the real (ground-truth) fields, \(\varvec{u}_G^{(t)}\). It is possible to show that generative-adversarial training with \(\lambda _{adv}=1\) in (12) minimizes the JS divergence between the generated and the true probability distributions, \(\textrm{JSD}(p_t\parallel p_p)\) [10, 43]. Therefore, the adversarial loss helps the generator to produce predictions with the correct turbulent statistics, while the generator trained alone (\(\lambda _{adv}=0\)) would minimize only the \(L_2\) loss (13), which is mainly sensitive to the large, energy-containing scales. Compared with the linear EPOD, the GAN not only takes advantage of the nonlinear expressivity of the generator, but can also optimize a loss that accounts for both the MSE and the generated probability distribution.
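A schematic PyTorch training step implementing the losses (12)–(15) could look as follows; this is a sketch, with the network definitions, optimizers and data pipeline assumed elsewhere, and the value of \(\lambda _{adv}\) below is purely illustrative (see Appendix 1 for the value actually used).

```python
import torch
import torch.nn.functional as F

def train_step(gen, dis, opt_g, opt_d, u_S, u_G_true, lam_adv=1e-3, eps=1e-8):
    """One optimization step for the losses (12)-(15).

    `dis` is assumed to end with a sigmoid, returning probabilities in (0, 1);
    `eps` guards the logarithms against numerical underflow.
    """
    # Discriminator: maximize Eq. (15), i.e., minimize its negative.
    u_G_pred = gen(u_S).detach()                       # freeze the generator here
    d_loss = -(torch.log(dis(u_G_true) + eps).mean()
               + torch.log(1.0 - dis(u_G_pred) + eps).mean())
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator: minimize Eq. (12) = (1 - lam) * MSE + lam * adversarial loss.
    u_G_pred = gen(u_S)
    mse = F.mse_loss(u_G_pred, u_G_true)               # Eq. (13)
    adv = torch.log(1.0 - dis(u_G_pred) + eps).mean()  # Eq. (14)
    g_loss = (1.0 - lam_adv) * mse + lam_adv * adv
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()
```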

Fig. 4 Prediction visualization of inference task (I) by the different tools for an instantaneous field. In the 1st row, the input velocity component, \(u_x\), is shown in the 1st column, while the 2nd to 5th columns show the ground truth and the velocity component \(u_y\) inferred by EPOD, CNN and GAN. The corresponding gradient in the x-direction, \(\partial u_y/\partial x\), is shown in the 2nd row

More details of the GAN are provided in Appendix 1, including the architecture, hyper-parameters and the training schedule.

3 Results

In this section, we systematically compare the EPOD-, CNN- and GAN-based methods on the two velocity-component inference tasks. The CNN has the same architecture as the generator of the GAN; it thus corresponds to our GAN with \(\lambda _{adv}=0\) (without the discriminator). In this way, we are also able to judge the importance of adding/removing the objective of a small JS divergence with the ground truth for the inferred field. For the choice of the adversarial ratio of the GAN, readers can refer to Appendix 1.

To quantify the inference error, we define the normalized mean squared error (MSE) as

$$\begin{aligned} \textrm{MSE}(\varvec{u}_G)=\langle \Delta _{\varvec{u}_G}\rangle /E_{\varvec{u}_G}, \end{aligned}$$
(16)

where

$$\begin{aligned} \Delta _{\varvec{u}_G}=\frac{1}{A(I)}\int _I\Vert \varvec{u}_G^{(p)}(\varvec{x})-\varvec{u}_G^{(t)}(\varvec{x})\Vert ^2\,\textrm{d}\varvec{x} \end{aligned}$$
(17)

is the spatially averaged \(L_2\) error for one flow configuration and \(\langle \cdot \rangle \) represents now the average over the test data. The normalization factor is defined as

$$\begin{aligned} E_{\varvec{u}_G}=\sigma _G^{(p)}\sigma _G^{(t)}, \end{aligned}$$
(18)

where

$$\begin{aligned} \sigma _G^{(t)}=\langle \frac{1}{A(I)}\int _I\Vert \varvec{u}_G^{(t)}(\varvec{x})\Vert ^2\,\textrm{d}\varvec{x}\rangle ^{1/2} \end{aligned}$$
(19)

and \(\sigma _G^{(p)}\) is defined similarly. This form of \(E_{\varvec{u}_G}\) ensures that a prediction with too small or too large an energy gives a large MSE.
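In practice, this metric reduces to a few lines of Python (a minimal sketch; array shapes and names are illustrative):

```python
import numpy as np

def normalized_mse(U_pred, U_true):
    """Normalized MSE of Eqs. (16)-(19); arrays have shape (n_test, ny, nx)."""
    delta = np.mean((U_pred - U_true) ** 2, axis=(1, 2))  # Eq. (17), one value per snapshot
    sig_p = np.sqrt(np.mean(U_pred ** 2))                 # sigma_G^(p), Eq. (19)
    sig_t = np.sqrt(np.mean(U_true ** 2))                 # sigma_G^(t), Eq. (19)
    return delta.mean() / (sig_p * sig_t)                 # Eq. (16)
```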

We use JS divergence to evaluate the PDF of the inferred velocity components. For two probability distributions, P(x) and Q(x), JS divergence measures their similarity and is defined as

$$\begin{aligned} \textrm{JSD}(P\parallel Q)=\frac{1}{2}\textrm{KL}(P\parallel M)+\frac{1}{2}\textrm{KL}(Q\parallel M), \end{aligned}$$
(20)

where \(M=\frac{1}{2}(P+Q)\) and

$$\begin{aligned} \textrm{KL}(P\parallel Q)=\int _{-\infty }^\infty P(x)\log \left( \frac{P(x)}{Q(x)}\right) \,\textrm{d}x \end{aligned}$$
(21)

is the Kullback–Leibler (KL) divergence. Two close probability distributions give a small JS divergence, and vice versa.
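A simple histogram-based estimate of (20) and (21) can be sketched in Python as follows (the number of bins is an arbitrary illustrative choice):

```python
import numpy as np

def jsd_from_samples(x_true, x_pred, nbins=128):
    """JS divergence, Eqs. (20)-(21), between PDFs estimated on common histogram bins."""
    lo = min(x_true.min(), x_pred.min())
    hi = max(x_true.max(), x_pred.max())
    bins = np.linspace(lo, hi, nbins + 1)
    p, _ = np.histogram(x_true, bins=bins, density=True)
    q, _ = np.histogram(x_pred, bins=bins, density=True)
    dx = bins[1] - bins[0]
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0                      # 0 * log 0 := 0; m > 0 wherever a > 0
        return np.sum(a[mask] * np.log(a[mask] / b[mask])) * dx

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```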

3.1 Inference task (I)

3.1.1 Prediction at large scales and small scales

For inference task (I), Fig. 4 presents an inference experiment on a test configuration never shown during training. The aim is to use the velocity component \(u_x\) (1st row, 1st column) to predict the velocity component \(u_y\); the ground truth and the predictions from the different tools are shown in the other columns of the 1st row. In the 2nd row, we show the corresponding gradients in the x-direction, \(\partial u_y/\partial x\). EPOD keeps only the part correlated with the given information, and its prediction looks meaningful because of the correlation between \(u_x\) and \(u_y\). With its capability of expressing nonlinear correlations, the CNN predicts better results, close to the ground truth. However, the CNN predictions are blurry, lacking high-frequency information, while the GAN generates realistic turbulent configurations thanks to the adversarial training.

To analyze the predictions quantitatively, Fig. 5(top) shows \(\textrm{MSE}(u_y)\) and the JS divergence between the PDFs of the original and the predicted \(u_y\), denoted as \(\textrm{JSD}(u_y)=\textrm{JSD}(\textrm{PDF}(u_y^{(t)})\parallel \textrm{PDF}(u_y^{(p)}))\). We divide the test data into batches of size 128 (or 2048), calculate the MSE (or JS divergence) over these batches, and indicate with the error bound its range of fluctuation. The EPOD approach has \(\textrm{MSE}(u_y)\) around 0.6, while CNN and GAN reach smaller values, around 0.1.

Fig. 5 MSE (left y-axis) and JS divergence between PDFs (right y-axis) for the velocity component to be inferred, \(u_y\) (top), and its gradient, \(\partial u_y/\partial x\) (bottom). Results are obtained from EPOD, CNN and GAN for the inference task (I)

Figure 6a shows the PDF of the spatially averaged \(L_2\) error, \(\Delta _{u_y}\), over different configurations; the locations of its peaks are consistent with the MSE values. The JS divergence of the inferred component can be read together with Fig. 6b, which compares the PDFs of the predicted velocity component with that of the original data. EPOD has the largest \(\textrm{JSD}(u_y)\): the predicted \(\textrm{PDF}(u_y)\) has the correct shape but deviates from the original one around \(|u_y|=\sigma (u_y)\), where \(\sigma (u_y)\) is the standard deviation of the original data. CNN and GAN generate nearly perfect PDFs with small values of \(\textrm{JSD}(u_y)\). Moreover, the GAN does better with the help of the discriminator, at the price of a slight increase of \(\textrm{MSE}(u_y)\).

Fig. 6 a PDFs of the spatially averaged \(L_2\) error over different configurations and b PDFs of the predicted and the original velocity components, \(u_y\), where \(\sigma (u_y)\) is the standard deviation of the original data. Results are obtained from EPOD, CNN and GAN for the inference task (I)

Figure 5(bottom) shows the MSE and the JS divergence between PDFs for the gradient of \(u_y\), \(\partial u_y/\partial x\). \(\textrm{MSE}(\partial u_y/\partial x)\) behaves similarly to \(\textrm{MSE}(u_y)\): EPOD gives a large value, while CNN and GAN give smaller, comparable values. For the gradient, however, both EPOD and CNN give large values of the JS divergence and only the GAN predicts a small one, indicating the importance of the adversarial training for high-order quantities.

Fig. 7 W-MSE for the velocity component to be inferred, \(u_y\), at different wave numbers \(k_j\). Results are obtained from the different tools for the inference task (I)

3.1.2 Multi-scale prediction error

Here, we evaluate the predictions from a multi-scale perspective with the help of wavelet analysis [44,45,46]. For a field defined on a uniform grid of size \(2^N\times 2^N\), the 2D wavelet decomposition gives

$$\begin{aligned} u_y(\varvec{x})={\bar{u}}_y+\sum _{j=0}^{N-1}u_y^{(k_j)}(\varvec{x}), \end{aligned}$$
(22)

where \({\bar{u}}_y\) is the mean value and

$$\begin{aligned} u_y^{(k_j)}(\varvec{x})=\sum _{i_x=0}^{2^j-1}\sum _{i_y=0}^{2^j-1}\sum _{\sigma }c_{j,i_x,i_y}^{(\sigma )}\psi _{j,i_x,i_y}^{(\sigma )}(\varvec{x}) \end{aligned}$$
(23)

is the wavelet contribution at wave number \(k_j=2^j\), corresponding to the length scale \(1/k_j\). Here \(\sigma \in \lbrace x,y,d\rbrace \) labels the wavelet orientation, \(c_{j,i_x,i_y}^{(\sigma )}\) are the wavelet coefficients and

$$\begin{aligned} \begin{aligned} \psi _{j,i_x,i_y}^{(x)}(x,y)=\psi _{j,i_x}(x)\phi _{j,i_y}(y), \\ \psi _{j,i_x,i_y}^{(y)}(x,y)=\phi _{j,i_x}(x)\psi _{j,i_y}(y), \\ \psi _{j,i_x,i_y}^{(d)}(x,y)=\psi _{j,i_x}(x)\psi _{j,i_y}(y), \end{aligned} \end{aligned}$$
(24)

where \(\phi (\cdot )\) and \(\psi (\cdot )\) are the Haar scaling function and associated wavelet, respectively. To measure the inference error at different scales, we define the normalized wavelet mean squared error (W-MSE) as

$$\begin{aligned} \text {W-MSE}(k_j,u_y)=\textrm{MSE}(u_y^{(k_j)}), \end{aligned}$$
(25)
which is the MSE between the wavelet contributions at \(k_j\) of the predicted and original data, following definition (16). We stress that the normalization factor given by (18) and (19) is different for each \(k_j\). Figure 7 displays the W-MSE obtained from the different methods: at all scales, EPOD has the largest prediction error, while CNN and GAN produce close values of W-MSE, large at the smallest scales, \(k_j=2^6\), and varying between 0 and 0.4 for \(1\le k_j\le 2^5\). The wave number \(k_j\) with the minimum W-MSE, i.e., the scale where the prediction is proportionally most accurate, corresponds to the maximum of the energy spectrum, as can be seen by comparison with Fig. 8a.

Fig. 8 a The energy spectrum and b the flatness of the predictions and the ground truth of the velocity component, \(u_y\). The unit of r is the grid width, \(w_g=2\pi /64\). Results are obtained from the different tools for the inference task (I)
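For reference, the per-scale contributions of Eqs. (22) and (23) can be extracted with standard wavelet libraries; the following Python sketch uses the PyWavelets package rather than a hand-coded Haar transform (function and variable names are ours):

```python
import numpy as np
import pywt

def wavelet_contributions(u):
    """Per-scale Haar contributions of Eqs. (22)-(23) for a 2^N x 2^N field.

    Returns {k_j: u^(k_j)} such that mean(u) + the sum of all contributions == u.
    """
    n_levels = int(np.log2(u.shape[0]))
    coeffs = pywt.wavedec2(u, "haar", level=n_levels)  # [cA, (cH, cV, cD), ...] coarse -> fine
    out = {}
    for lev in range(1, len(coeffs)):                  # position 0 holds the mean (cA)
        kept = [np.zeros_like(coeffs[0])]
        kept += [tuple(np.zeros_like(a) for a in c) for c in coeffs[1:]]
        kept[lev] = coeffs[lev]                        # keep a single scale only
        k_j = 2 ** (lev - 1)                           # 2^j coefficients at wave number 2^j
        out[k_j] = pywt.waverec2(kept, "haar")
    return out
```

The W-MSE of (25) is then obtained by applying the normalized MSE of (16) to the predicted and true contributions at each \(k_j\).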

Fig. 9 Prediction visualization of inference task (II) by the different tools for an instantaneous field. In the 1st row, the input velocity component, \(u_x\), is shown in the 1st column, while the 2nd to 5th columns show the ground truth and the velocity component \(u_z\) inferred by EPOD, CNN and GAN. The corresponding gradient in the x-direction, \(\partial u_z/\partial x\), is shown in the 2nd row

3.1.3 Spectra and flatness

To further study the statistical properties of different predictions, in Fig. 8a we have measured the energy spectrum for the different inferred fields, namely

$$\begin{aligned} E_{u_y}(k)=\sum _{k\le \Vert \varvec{k}\Vert <k+1}\frac{1}{2}\langle {\hat{u}}_y(\varvec{k}){\hat{u}}_y^*(\varvec{k})\rangle , \end{aligned}$$
(26)

where \(\varvec{k}=(k_x, k_y)\) is the wave number, \({\hat{u}}_y(\varvec{k})\) is the Fourier transform of \(u_y(\varvec{x})\) and \({\hat{u}}_y^*(\varvec{k})\) is its complex conjugate. Since EPOD extracts only the part correlated with the supplied information, it predicts an energy spectrum with a similar shape but smaller energy at all wave numbers compared with the original one. The CNN benefits from its nonlinearity and predicts the correct energy spectrum at small wave numbers. With the help of the discriminator, the GAN predicts an energy spectrum close to the original one at all wave numbers.
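A minimal Python sketch of the shell summation in (26), assuming \(2\pi \)-periodic \(64\times 64\) fields, reads:

```python
import numpy as np

def energy_spectrum(U):
    """Shell-summed 2D energy spectrum, Eq. (26); U has shape (n_samples, n, n)."""
    n = U.shape[-1]
    u_hat = np.fft.fft2(U, norm="forward")           # Fourier coefficients
    e = 0.5 * np.mean(np.abs(u_hat) ** 2, axis=0)    # average over the test samples
    k1 = np.fft.fftfreq(n, d=1.0 / n)
    kx, ky = np.meshgrid(k1, k1, indexing="ij")
    k_mag = np.sqrt(kx**2 + ky**2)
    k_bins = np.arange(n // 2 + 1)
    E = np.array([e[(k_mag >= k) & (k_mag < k + 1)].sum() for k in k_bins])
    return k_bins, E
```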

Figure 8b plots the flatness

$$\begin{aligned} F_{u_y}(r)=\langle (\delta _r u_y)^4\rangle /\langle (\delta _r u_y)^2\rangle ^2, \end{aligned}$$
(27)

where \(\delta _r u_y=u_y(x+r,y)-u_y(x,y)\). The results are consistent with those of the energy spectrum: EPOD fails at all scales, the CNN predicts the correct flatness only at large scales and the GAN gives satisfactory results at all scales.
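For periodic fields, (27) reduces to a compact Python estimate (a sketch; the increment is taken along the last array axis):

```python
import numpy as np

def flatness(U, r):
    """Flatness of x-increments at separation r grid points, Eq. (27)."""
    du = np.roll(U, -r, axis=-1) - U       # periodic increment u(x + r) - u(x)
    return np.mean(du ** 4) / np.mean(du ** 2) ** 2
```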

3.2 Inference task (II)

We now move to inference task (II), which, as illustrated in the Introduction, is more challenging. Figure 9 presents an inference experiment similar to that of Fig. 4 but for task (II). The GAN predictions are clearly realistic and well correlated with the ground truth. However, although the predicted structures are correct, the GAN predictions can have wrong values, even with opposite signs in some vortices (5th column). The EPOD method is not able to infer meaningful results, due to the complexity of the task. The 4th column shows that the CNN predicts blurry blobs that capture the positions of the large-scale vortices but cannot correctly predict their signs. This limited capability of the CNN is likely the reason why the GAN cannot predict the correct signs of the vortices either, as the GAN uses the same CNN as its generator.

Figure 10 shows the MSEs of the component of interest, \(u_z\), and of its gradient, \(\partial u_z/\partial x\).

Fig. 10 MSE (left y-axis) and JS divergence between PDFs (right y-axis) for the velocity component to be inferred, \(u_z\) (top), and its gradient, \(\partial u_z/\partial x\) (bottom). Results are obtained from the different tools for the inference task (II)

Compared with inference task (I), the MSEs of all the tools are larger, which indicates the difficulty of the task. For both \(\textrm{MSE}(u_z)\) and \(\textrm{MSE}(\partial u_z/\partial x)\), EPOD and CNN give large values, while the best values, around 2, come from the GAN prediction. The PDFs of the spatially averaged \(L_2\) error, \(\Delta _{u_z}\), over flow configurations are shown in Fig. 11a: the GAN PDF peaks at the smallest \(\Delta _{u_z}\), ahead of CNN and EPOD, consistently with Fig. 10(top).

Fig. 11 a PDFs of the spatially averaged \(L_2\) error over different configurations and b PDFs of the predicted and the original velocity components, \(u_z\), where \(\sigma (u_z)\) is the standard deviation of the original data. Results are obtained from EPOD, CNN and GAN for the inference task (II)

We also present the JS divergences of \(u_z\) and \(\partial u_z/\partial x\) in Fig. 10. Only the GAN, with its turbulent predictions, gives small values of the JS divergence. Figure 11b further shows the PDFs of the predicted \(u_z\): the GAN has the PDF closest to that of the original data.

To quantitatively study the predicted statistical properties, in Fig. 12 we present the energy spectrum \(E_{u_z}(k)\) and the flatness \(F_{u_z}(r)\) for the different inferred fields.

Fig. 12 a The energy spectrum and b the flatness of the predictions and the ground truth of the velocity component, \(u_z\). The unit of r is the grid width, \(w_g=2\pi /64\). Results are obtained from the different tools for the inference task (II)

Only the GAN predicts an energy close to that of the original data at all wave numbers, indicating that it generates inferences with satisfactory multi-scale properties (Fig. 12a). Figure 12b shows that the flatness from the GAN has a shape similar to the original one: they are close at small and intermediate scales, but the GAN values are smaller at large scales. None of the other tools predicts a satisfactory flatness at any scale.

4 Conclusion

We studied a problem with practical relevance to geophysical and engineering applications: using one velocity component to infer another on 2D snapshots of rotating turbulent flows. This problem is also of theoretical interest, being connected to feature ranking for turbulence: identifying which degrees of freedom are dominant in turbulent flows.

Linear (EPOD) and nonlinear (CNN & GAN) methods were systematically compared on two inference tasks of different complexity, which depends on the correlation between the input component and the one to be inferred. The EPOD method performs a POD analysis of the supplied component and gives the correlated part as the prediction. The CNN & GAN methods are based on fully nonlinear approximators, without and with an adversarial component, respectively.

For the simpler inference task (I), EPOD generates meaningful predictions, which are nevertheless unsatisfactory in terms of the MSE and the JS divergence with respect to the real data. An obvious improvement is reached with nonlinear mappings: the CNN gives good predictions, with small MSE and JS divergence with respect to the ground truth. Compared with the CNN, the GAN makes the inference more realistic, with fine structures, at the cost of a slight increase of the MSE; the training cost is also higher for the GAN. For inference task (II), the supplied component is not well correlated with the one to be inferred, and the EPOD method cannot make meaningful inferences because of this complexity. With their nonlinear capabilities, CNN and GAN can instead recognize the positions of the coherent structures, although they cannot predict the correct values (or even signs) inside the vortices. Moreover, the discriminator is very important for this task: the CNN predicts only smooth blobs, while the GAN generates turbulent configurations correlated with the ground truth. In conclusion, we have shown that GANs simultaneously optimize the \(L_2\) point-to-point distance, snapshot by snapshot, and the cross-entropy between the ground-truth data and the generated ones, leading to both instantaneous and statistical reconstruction optima. EPOD, on the other hand, minimizes only the variance through the eigenvalues of each single field, which yields worse statistical properties.

Note that the EPOD-, CNN- and GAN-based methods can be straightforwardly generalized to inference tasks with multiple components/quantities supplied and/or to be inferred. The inference task where

$$\begin{aligned} \varvec{u}_S{:}(u_x,u_y)\rightarrow \varvec{u}_G{:}u_z \end{aligned}$$

was also conducted, and it shows results very similar to those of task (II), because of the high correlation between the two in-plane components. Future research could give more consideration to practical applications such as PIV. As the current study focuses on fully resolved turbulent flows, future work could address inference from noisy and/or Fourier-filtered data. Additionally, it would be worth investigating the potential benefits of machine learning techniques [47,48,49] that exploit the temporal correlations present in the data. Since the current study is limited to 2D fields, it is impossible to impose physical constraints such as incompressibility; a possible avenue for future research is to consider fields in a 3D volume, which may better accommodate the imposition of such prior knowledge.