1 Introduction

Super resolution reconstructs spatially high-resolution field data \({{{\varvec{q}}}}_{\textrm{HR}}\) from its low-resolution counterpart \({{{\varvec{q}}}}_\textrm{LR}\) [1,2,3]. This problem has traditionally been tackled in computer vision with various techniques including interpolation [4,5,6,7], example-based internal learning [8,9,10,11], high-frequency transfer [12,13,14,15], neighbor embedding [16,17,18,19,20], and sparse coding [21,22,23,24,25]. Although these techniques are straightforward to implement, it is generally challenging for them to reconstruct high-wavenumber content. To address this difficulty, machine learning has been used for accurate super-resolution reconstruction of images [26,27,28]. Machine learning can find a nonlinear relationship between input and output data even under ill-posed conditions. This approach can be applied to a pair of low- and high-resolution images, recovering finely resolved images from extremely coarse ones [29].

Machine-learning techniques in general [30,31,32] have been considered for a range of applications in fluid mechanics including turbulence modeling [33,34,35,36,37], reduced-order modeling [38,39,40,41,42], data reconstruction [43,44,45,46], and flow control [47,48,49,50,51,52]. Super-resolution reconstruction with machine learning is no exception. The low barrier to accessing open-source codes from image science and implementing models also enables fluid mechanicians to apply these methods to fluid flow data by replacing the RGB components (red, green, and blue) with the velocity components \(\{u,v,w\}\).

Table 1 Representative studies on machine-learning-based super-resolution reconstruction methods for fluid flows

While super resolution can be regarded as an image-based data recovery technique, it is also a general framework for a broad range of applications in fluid mechanics. For instance, a low-resolution fluid flow image can be interpreted as a set of sparse sensor measurements. In this regard, the inverse problem of global field reconstruction from local measurements is an extension of super-resolution analysis [64, 68, 69]. If we consider low-resolution fluid flow data as noisy experimental measurements, super-resolution analysis can also be extended to the denoising problem [60, 70, 71]. Furthermore, large-eddy simulation (LES) can incorporate super-resolution reconstruction to reveal finer structures inside a low-resolution grid cell [63, 66].

This paper surveys the current status and challenges of machine-learning-based super-resolution analysis for vortical flows. We first cover several machine-learning models and their applications to super resolution of fluid flows. We then offer case studies using supervised-learning-based super resolution for an example of two-dimensional decaying isotropic turbulence. We consider embedding physics into the model design to successfully reconstruct a high-resolution vortical flow from low-resolution data. We further discuss the challenges and outlooks of machine-learning-based super resolution in fluid flow applications. The present paper is organized as follows. We introduce machine-learning approaches for super-resolution reconstruction of vortical flows in Sect. 2. Applications of these machine-learning techniques are discussed in Sect. 3. We perform case studies in Sect. 4. Extensions of super-resolution analysis for fluid dynamics are discussed in Sect. 5. Concluding remarks with outlooks are provided in Sect. 6.

2 Approaches

A variety of machine-learning models have been proposed for the super-resolution reconstruction of vortical flows, as summarized in Table 1. Machine-learning-based approaches can find a nonlinear relationship between the low-resolution input and the corresponding high-resolution output from a large collection of data through training. In super-resolution analysis, the dimension of the input (low-resolution data) \({{{\varvec{q}}}}_{\textrm{LR}} \in {\mathbb {R}}^{m}\) is smaller than that of the high-resolution output \({{{\varvec{q}}}}_{\textrm{HR}} \in {\mathbb {R}}^{n}\) with \(m\ll n\),

$$\begin{aligned} {{{\varvec{q}}}}_{\textrm{HR}} = F({{{\varvec{q}}}}_{\textrm{LR}}), \end{aligned}$$
(1)

where F is the super-resolution model. Depending on the flow of interest and the size of data, the machine-learning model should be carefully chosen. In Sect. 2.1, we introduce three types of machine-learning models that are widely used. We also discuss the use of physics-based loss functions in Sect. 2.2.

2.1 Machine-learning models

2.1.1 Fully connected network (multi-layer perceptron)

Fig. 1 Fully connected model-based super resolution

The fully connected network, also called the multi-layer perceptron [72], is the most basic neural network model. Nodes between adjacent layers are fully connected with each other, as illustrated in Fig. 1. The minimum unit of a fully connected network is called a perceptron. For each perceptron, the inputs from layer \((l-1)\), \(c_j^{(l-1)}\), are combined linearly with weights \({{{\varvec{w}}}}\), yielding the output at layer \((l)\), \(c_i^{(l)}\),

$$\begin{aligned} {c}^{(l)}_{i}=\varphi \left( \sum _{j}{w}_{ij}^{(l)}{c}^{(l-1)}_{j} + b_i^{(l)}\right) , \end{aligned}$$
(2)

where \(\varphi \) is the activation function and b is the bias added at each layer. We can choose a nonlinear function for \(\varphi \), enabling the network to capture the nonlinear relationship between the input and the output.

A fully connected model can be used for supervised machine-learning-based super resolution. The training process for supervised machine-learning models is cast as an optimization problem to determine the weights \({{{\varvec{w}}}}\) inside the model F. The weights \({{{\varvec{w}}}}\) are optimized by minimizing the loss function \({{\mathcal {E}}}\) through backpropagation [73]. This optimization procedure is described as

$$\begin{aligned} {{{\varvec{w}}}}={\textrm{argmin}}_{{{\varvec{w}}}}~{{{\mathcal {E}}}}({{{\varvec{w}}}}). \end{aligned}$$
(3)

Since super-resolution reconstruction aims to obtain a high-resolution image \({{{\varvec{q}}}}_{\textrm{HR}}\) from the corresponding low-resolution data \({{{\varvec{q}}}}_{\textrm{LR}}\), the loss function (error) can be formulated as

$$\begin{aligned} {{{\mathcal {E}}}}&= ||{{{\varvec{q}}}}_{\textrm{HR}}-F({{{\varvec{q}}}}_\textrm{LR})||_P,~ \end{aligned}$$
(4)

where P indicates the norm. While the \(L_2\) norm is widely used, other norms such as the \(L_1\) norm and the logarithmic norm can be considered depending on the data characteristics. The \(L_1\) norm yields models that are less sensitive to outliers in the data. The logarithmic norm is suitable for cases where underestimation should be avoided.

As mentioned above, the difference in data dimension between the input and the output in super-resolution analysis is substantial. Hence, models generally take a decoder-type structure [55, 74], in which the number of nodes gradually increases towards the output layer. This is especially the case for high-dimensional inverse problems such as super-resolution reconstruction of fluid flows. As a result, the number of nodes and their connections increases drastically, leading to prohibitively expensive computational costs and failures of non-convex optimization, known as the curse of dimensionality [75]. Users should be mindful of the computational time and memory requirements of fully connected models.
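
To make the decoder-type structure concrete, the following is a minimal sketch in Python (PyTorch) of a fully connected super-resolution model trained by minimizing Eq. 4 through backpropagation, per Eq. 3. The layer widths, field sizes, and training settings are illustrative assumptions rather than values from any surveyed study.

```python
# Minimal sketch of a decoder-type fully connected super-resolution model.
# Layer widths, data shapes, and optimizer settings are illustrative
# assumptions, not taken from the surveyed studies.
import torch
import torch.nn as nn

m, n = 16 * 16, 128 * 128  # low-/high-resolution dimensions, m << n

# The number of nodes grows toward the output layer (decoder-type structure).
model = nn.Sequential(
    nn.Linear(m, 1024), nn.ReLU(),
    nn.Linear(1024, 4096), nn.ReLU(),
    nn.Linear(4096, n),
)

loss_fn = nn.MSELoss()  # L2-type loss (Eq. 4); nn.L1Loss() is less
                        # sensitive to outliers in the data
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

def train_step(q_lr, q_hr):
    """One step of the optimization in Eq. 3: w = argmin_w E(w)."""
    optimizer.zero_grad()
    loss = loss_fn(model(q_lr), q_hr)  # E = ||q_HR - F(q_LR)||
    loss.backward()                    # backpropagation
    optimizer.step()
    return loss.item()

# q_lr: (batch, m) flattened coarse fields; q_hr: (batch, n) fine fields
```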

2.1.2 Convolutional neural network

To address the computational burden associated with fully connected models, convolutional neural networks (CNNs) [76] have been widely utilized in super-resolution analysis of fluid flows. CNNs share filter weights across the spatial domain of the data, enabling the processing of large vortical flow data without encountering the curse of dimensionality [77].

A CNN is generally composed of convolutional layers, pooling layers, and upsampling layers. The convolutional layer depicted in Fig. 2 captures the nonlinear relationship between input and output data by extracting spatial features of the supplied data through filtering operations. This operation is expressed as

$$\begin{aligned} q^{(l)}_{ijn}=\varphi \left( \sum _{m=1}^M\sum _{p=0}^{H-1}\sum _{q=0}^{H-1}h^{(l)}_{pqmn}q^{(l-1)}_{i+p-G,j+q-G,m}+b_n^{(l)}\right) , \end{aligned}$$
(5)

where \(G=\lfloor H/2\rfloor \), H is the width and height of the filter, M is the number of input channels, n is the output channel index, b is the bias, and \(\varphi \) is the activation function. As in the fully connected models, a nonlinear function can be chosen for \(\varphi \) to account for nonlinearities in the machine-learning model.

Fig. 2 Convolutional neural network-based super resolution

In addition to the convolutional layer, the pooling layer also plays an important role in CNN-based analysis. The pooling layer downscales the data, reducing the data dimension. For regression tasks, it is useful for reducing spatial sensitivity, producing a CNN model that is robust against noisy inputs [78]. It is also possible to expand the data dimension through the upsampling layer, which copies each value onto a surrounding region. This function is especially useful for aligning the data dimensions inside the network.

For super resolution, in which the dimension of the output \({\mathbb {R}}^{n}\) is larger than that of the input \({\mathbb {R}}^{m}\), there are several ways to treat the difference in dimensions between the input and the output. For example, upsampling can be used inside a network to expand the dimension [79]. One can also apply a resize or interpolation function to the input data to align its size with that of the output [29, 54, 80]. This avoids the use of pooling or upsampling operations, reducing the complexity of the model.
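
As an illustration of the second strategy, the sketch below first interpolates the low-resolution input to the output size and then applies convolutional layers, in the spirit of the resize-then-convolve approaches cited above [29, 54, 80]. Filter counts and kernel sizes are our assumptions for illustration.

```python
# Sketch of a CNN super-resolution model that interpolates the input to the
# target size before convolving, so no pooling/upsampling layers are needed
# inside the network. Filter counts and kernel sizes are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResizeThenConvolve(nn.Module):
    def __init__(self, channels=1):
        super().__init__()
        # padding = floor(H/2) keeps the spatial size fixed, cf. G in Eq. 5
        self.net = nn.Sequential(
            nn.Conv2d(channels, 64, kernel_size=9, padding=4), nn.ReLU(),
            nn.Conv2d(64, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv2d(32, channels, kernel_size=5, padding=2),
        )

    def forward(self, q_lr, out_size=(128, 128)):
        # Align the input dimension with the output by bicubic interpolation
        q = F.interpolate(q_lr, size=out_size, mode='bicubic',
                          align_corners=False)
        return self.net(q)

# Example: (batch, 1, 16, 16) coarse field -> (batch, 1, 128, 128)
model = ResizeThenConvolve()
q_hr_hat = model(torch.randn(8, 1, 16, 16))
```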

2.1.3 Generative adversarial network

In addition to supervised fully connected networks and convolutional networks, unsupervised learning with generative adversarial networks (GANs) [81] has also been proposed for super-resolution analysis of fluid flows [53, 56, 59, 62, 82]. A GAN is attractive for cases in which it is difficult to prepare paired input and output data. For example, super resolution applied to LES corresponds to this scenario: a model trained with pairs of high-fidelity DNS data and subsampled low-resolution data may not directly support super-resolution reconstruction for LES data. Super resolution of PIV measurements with limited spatio-temporal resolution (without corresponding high-resolution solution images) also needs to be treated carefully.

A GAN is composed of two networks, namely, a generator (G) and a discriminator (D). The generator produces, from random noise \({\varvec{n}}\), a fake image that resembles the real data. The discriminator, in contrast, judges whether a given image is likely to be real by returning a probability between 0 (fake) and 1 (real). The generator usually possesses a decoder-type structure to expand the data dimension from noise to images, while the discriminator is composed of an encoder-type network that reduces the data dimension from images to the probability. Throughout the training process, the weights inside the generator are updated to deceive the discriminator by producing images increasingly similar to the real data. Fake images produced by the generator eventually become high-quality images that cannot be distinguished from the real ones.

Fig. 3 Generative adversarial network-based super resolution

These processes can be mathematically expressed with regard to the cost function V(DG),

$$\begin{aligned} \underset{G}{\textrm{min}} \underset{D}{\textrm{max}}~V(D,G) = {\mathbb {E}}_{{{{\varvec{d}}}} \sim p_{\textrm{data}}({{{\varvec{d}}}})}[{\textrm{log}}D({{{\varvec{d}}}})] + {\mathbb {E}}_{{{{\varvec{n}}}} \sim p_{{{\varvec{n}}}}({{{\varvec{n}}}})}[{\textrm{log}}(1-D(G({\varvec{n}})))],~ \end{aligned}$$
(6)

where \({\varvec{d}}\) is a real data set and \(p_{\textrm{data}}\) is the probability distribution of the real data. The parameters in the generator G are trained in the direction in which \(D(G({{\varvec{n}}}))\) approaches 1. On the other hand, the weights in the discriminator D are updated so that \(D({{{\varvec{d}}}})\) returns a value close to 1; as the discriminator becomes wiser through training, \(D(G({{\varvec{n}}}))\) returns a value close to 0. In summary, the parameters inside the generator G are optimized by minimizing the cost function, while those of the discriminator D are adjusted by maximizing it, which is referred to as competitive learning [83]. Once the training ends, the trained generator can produce an output of indistinguishable quality compared to the real data. For super-resolution problems, we can use low-resolution data as the input for the generator G instead of random noise \({\varvec{n}}\), as illustrated in Fig. 3. A generator in super-resolution reconstruction provides a statistically plausible high-resolution output by learning the relationship between the low-resolution input data set and the high-resolution data set, which need not be paired.
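
The competitive update of Eq. 6 can be sketched as below in PyTorch. The generator and discriminator architectures are assumed to exist (any decoder-/encoder-type networks, with a sigmoid output for D); the generator update shown uses the common non-saturating variant, which drives \(D(G({{{\varvec{q}}}}_{\textrm{LR}}))\) toward 1 rather than literally minimizing Eq. 6.

```python
# Sketch of competitive GAN training for super resolution, with the
# low-resolution field replacing the random-noise input of the generator.
# G and D are assumed to be predefined networks; D outputs a probability.
import torch
import torch.nn as nn

bce = nn.BCELoss()

def train_step(G, D, opt_G, opt_D, q_lr, q_hr_real):
    batch = q_lr.shape[0]
    ones = torch.ones(batch, 1)    # label for "real"
    zeros = torch.zeros(batch, 1)  # label for "fake"

    # Discriminator: maximize log D(d) + log(1 - D(G(q_LR)))  (Eq. 6)
    opt_D.zero_grad()
    loss_D = bce(D(q_hr_real), ones) + bce(D(G(q_lr).detach()), zeros)
    loss_D.backward()
    opt_D.step()

    # Generator: drive D(G(q_LR)) toward 1 to deceive the discriminator
    # (non-saturating form of the minimization over G in Eq. 6)
    opt_G.zero_grad()
    loss_G = bce(D(G(q_lr)), ones)
    loss_G.backward()
    opt_G.step()
    return loss_D.item(), loss_G.item()
```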

2.2 Choice of loss function

Here, let us discuss the choice of loss (cost) function for machine-learning-based super-resolution analysis. In the standard formulation, the cost function is defined by Eq. 4 or 6. However, super-resolved flow fields obtained by direct application of machine-learning models do not necessarily satisfy physical conditions such as the conservation laws. To address this issue, loss functions that embed physics laws can be utilized [84, 85]. Together with the original data-based cost \({{{\mathcal {E}}}}_d\) from Eq. 4 or 6, the loss function \(\mathcal{E}\) incorporating a physics-inspired term \({{{\mathcal {E}}}}_p\) for super-resolution analysis can take the form

$$\begin{aligned} {{{\mathcal {E}}}}&= {{{\mathcal {E}}}}_d + \beta {{{\mathcal {E}}}}_p, \end{aligned}$$
(7)

where \(\beta \) sets the balance between \({{{\mathcal {E}}}}_d\) and \(\mathcal{E}_p\).

There are several approaches to introducing a physics-based loss term for fluid flows. For instance, if data for all of the state variables are available, we can directly substitute a reconstructed high-resolution field \({{{\varvec{q}}}}_{\textrm{HR}}\) into the governing equation [85],

$$\begin{aligned} {{{\mathcal {E}}}}_p = ||{{{\mathcal {N}}}}({{{\varvec{x}}}}, {{{\varvec{q}}}}_{\textrm{HR}}({{{\varvec{x}}}},t))||_P, \end{aligned}$$
(8)

where \({{{\mathcal {N}}}}\) is an operator from governing equations. Minimizing a loss function incorporating only certain terms of the Navier–Stokes equation [38] can also be considered,

$$\begin{aligned}&{{{\mathcal {E}}}}^j_p = ||{{{\mathcal {N}}}}_j({{{\varvec{q}}}}_{\textrm{Ref}})-{{{\mathcal {N}}}}_j({{\varvec{q}}}_{\textrm{HR}})||_P,~~~{{{\mathcal {N}}}} = \sum _j {{{\mathcal {N}}}}_j, \end{aligned}$$
(9)

where \({{{\mathcal {N}}}}_j\) is a term in the governing equation and \({{\varvec{q}}}_{\textrm{Ref}}\) is reference data. It is known that these physics-based loss functions help in reconstructing flows with a small amount of data [60]; such terms in the loss function better constrain the solution space [86, 87]. This is similar in concept to semisupervised learning, which combines a small amount of labeled data with a large amount of unlabeled data [88]. In the present paper, we demonstrate the effectiveness of training with a small data set for super-resolution reconstruction of turbulent vortices in Sect. 4. We should, however, note that such physics-inspired analysis can suffer from large numerical error if \({{{\varvec{q}}}}_{\textrm{HR}}\) contains error or noise. This approach should be used with caution, as it assumes that \({{{\varvec{q}}}}_{\textrm{HR}}\) can be used to evaluate the terms involved.
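
As a concrete instance of Eqs. 7 and 8, the sketch below penalizes the divergence of a super-resolved incompressible velocity field, evaluated with central finite differences on a periodic grid. The grid spacing, the weight \(\beta \), and the field layout are illustrative assumptions.

```python
# Sketch of a physics-based loss (Eq. 7): data loss plus the residual of
# the incompressibility constraint div(u) = 0 evaluated on the model
# output (Eq. 8 with N = divergence). Settings are illustrative.
import torch

def divergence(u, v, dx):
    # Central differences on a periodic grid via torch.roll
    dudx = (torch.roll(u, -1, dims=-1) - torch.roll(u, 1, dims=-1)) / (2 * dx)
    dvdy = (torch.roll(v, -1, dims=-2) - torch.roll(v, 1, dims=-2)) / (2 * dx)
    return dudx + dvdy

def loss_fn(q_pred, q_true, dx=1.0 / 128, beta=0.1):
    # q_*: (batch, 2, ny, nx) with channels (u, v)
    e_data = torch.mean((q_pred - q_true) ** 2)       # E_d (Eq. 4, L2)
    div = divergence(q_pred[:, 0], q_pred[:, 1], dx)
    e_phys = torch.mean(div ** 2)                     # E_p (Eq. 8)
    return e_data + beta * e_phys                     # Eq. 7
```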

3 Applications

In this section, we survey recent super-resolution applications for fluid flows using supervised (Sect. 3.1) and semisupervised/unsupervised learning (Sect. 3.2).

3.1 Supervised learning

In machine-learning-based super-resolution reconstruction of fluid flows, supervised techniques are often used. Supervised learning requires pairs of input and output flow field data for training. For super-resolution analysis, a high-resolution reference flow field and the corresponding low-resolution data need to be available for training models. To avoid the curse of dimensionality, CNN models are often used for image-based super resolution of fluid flows rather than fully connected models.

Fukami et al. [54, 89, 90] proposed a CNN-based super-resolution reconstruction for fluid flows in a supervised manner. The CNN-based model was applied to examples of a two-dimensional cylinder wake, two-dimensional isotropic turbulence, and three-dimensional turbulent channel flow. To capture multi-scale physics in turbulent vortical flows, they also proposed the hybrid downsampled skip-connection/multi-scale (DSC/MS) model based on the CNN. The model is composed of the up-/downsampling operations, the skip connection [91], and CNNs with various sizes of filters. While up-/downsampling operations support robustness against rotation and translation of vortical structures, the skip connection provides stability of the learning process [91]. Moreover, the multi-scale CNN aims to capture a variety of length scales in turbulent flows. Especially for the examples of turbulence, it was shown that the DSC/MS model is effective in accurately preserving the energy spectrum.

Following this study, supervised CNN-based super-resolution analysis has been actively studied for a range of flows. Obiols-Sales et al. [57] proposed a CNN-based super-resolution model called SURFNet and tested its performance on wakes around various NACA-type airfoils, ellipses, and cylinders. SURFNet includes a transfer learning-based augmentation [92]: the model is first trained using only low-resolution flow data, and the pre-trained weights are then transferred to training with high-resolution data sets. Transfer learning over multiple levels of spatial resolution can improve the accuracy of super-resolution reconstruction [93], which is also related to multi-fidelity learning [94]. A U-Net-based model (illustrated in Fig. 4) can also reduce the training cost for super-resolution reconstruction of turbulent flows since the size of the fluid flow data is reduced through an autoencoder-type model structure [95].

Fig. 4 U-Net-based model for super-resolution reconstruction of vortical flows

Incorporating physical insights and domain knowledge into model construction further supports or enhances supervised-learning-based super-resolution reconstruction in vortical flows. For instance, accounting for spatial length scales of the flow structures in the models improves reconstruction [54]. Kong et al. [96] developed a multiple path super-resolution CNN with several connections inside the model to capture variations of spatial temperature distribution in a supersonic combustor. They reported that the proposed multiple-path CNN provides enhanced reconstruction of temperature fields compared to a regular CNN. Incorporating the time history of flow fields is also useful for super-resolving vortical flows in a supervised manner. Liu et al. [58] compared two types of supervised CNN-based models for super-resolution analysis: namely the static CNN (SCNN) and the multiple temporal paths CNN (MTPC). While the SCNN model uses instantaneous flow snapshots as the input, the MTPC model considers a time series of velocity fields as the input to read spatial and temporal information simultaneously. With examples of forced isotropic turbulence and turbulent channel flow, they found that the MTPC model can improve the reconstruction of turbulence statistics such as kinetic energy spectra and the second and third invariants of the velocity gradient tensor.

Once supervised models are trained, machine-learning models can be used for data compression, since only the input data need to be saved to recover high-resolution flow fields. Matsuo et al. [97] proposed an adaptive super-resolution analysis. They focused on how a low-resolution field is prepared in training a supervised model. While max- and average-pooling operations are generally used for preparing low-resolution data sets, they considered the spatial standard deviation in arbitrary subdomains of a flow field to determine the local degree of downsampling. This accounts for the importance of flow structures in generating low-resolution data sets. They reported that supervised CNN models can reconstruct a high-resolution field of a three-dimensional square-cylinder wake from adaptive low-resolution data, compressing the data to approximately 0.05% of the original size.

Compressing fluid flow data in the time direction can also be considered. Fukami et al. [90] used the DSC/MS model to reconstruct high-resolution turbulent flows from flow data that are coarse in both space and time, inspired by the concept of super-resolution analysis and inbetweening [98]. In their formulation, two spatially coarse flow fields at \(t=n\Delta t\) and \(t=(n+k)\Delta t\) are taken as the input of the first machine-learning model. Once the spatial-reconstruction model provides two super-resolved high-resolution flow fields, these outputs are fed into the second model, which performs inbetweening to provide high-resolution snapshots between the first and last frames. By combining these two models, spatio-temporal high-resolution vortical flows can be obtained from only two coarse snapshots. It should be noted that linear interpolation in time cannot capture advective physics. They demonstrated the model capability with turbulent channel flows and reported that the flow field can be quantitatively reconstructed, achieving 0.04% data compression. Arora and Shrivastava [99] have recently combined this super-resolution/inbetweening idea with a physics-informed neural network [85] to improve the reconstruction accuracy, demonstrating it with an example of a mixed-variable elastodynamics system.

Furthermore, supervised super-resolution reconstruction can be used to examine how machine learning extracts the relationship between small- and large-scale vortical structures. Kim and Lee [100] considered a CNN-based estimation of the high-resolution heat flux field in a turbulent channel flow from poorly resolved wall-shear stresses and pressure. They revealed that the CNN model focuses on the relationship between vortical structures and the pressure distribution in channel turbulence to estimate the local heat flux from the wall-shear stress. Morimoto et al. [101] have recently examined the effect of inter- and extrapolation of machine-learning-based super-resolution reconstruction with respect to flow parameters. They considered the wakes of two staggered cylinders, whose dynamics are characterized by the diameters of and the distance between the two cylinders. They found that the supervised CNN-based model can quantitatively reconstruct a vortical flow even for untrained parameter cases by preparing the training flow field data based on the lift-coefficient spectrum.

Supervised super-resolution techniques have also been applied to larger-scale meteorological flows [102, 103]. Onishi et al. [102] proposed a CNN-based model for super-resolution analysis of temperature fields in urban environments. The proposed model provides a high-resolution temperature field at a reduced computational time compared with the corresponding high-fidelity simulation, suggesting the potential use of machine-learning models as surrogates for large-scale numerical simulations. To improve the model performance, Yasuda et al. [103] extended the model by incorporating skip connections [91] and channel attention [104]. While skip connections help stabilize the learning process of deep CNNs, channel attention can discover the crucial and irrelevant spatial regions for fluid flow regression. The model trained with temperature fields of one city (Tokyo) provides quantitative reconstruction for test temperature data of another city with a similar climate (Osaka). They also observed that including building-height information as part of the input of the machine-learning model is important for successful temperature reconstruction.

In addition to the aforementioned studies with numerical data, applications to experiments have also been considered [105]. For such cases, the effects of noise in the input data must be carefully considered. Deng et al. [106] developed a machine-learning model to super-resolve PIV measurements. For training the CNN-based model, pairs of high-resolution experimental velocity data collected by PIV with the cross-correlation method and downsampled low-resolution data are used. The model was tested on turbulent flows around a single cylinder and two cylinders. For more complex turbulent flows, Wang et al. [107] proposed a super-resolution neural network for two-dimensional PIV (PIV2DSR) based on CNNs. Having trained the model with velocity fields of turbulent channel flow at Re\(_\tau = 1000\) obtained by direct numerical simulation (DNS), they assessed it with not only numerical channel flow data at a much higher Reynolds number of 5200 but also real experimental PIV data for a turbulent boundary layer at Re\(_\tau = 2200\).

Fig. 5 Extraction of nonlinear modes [108] from a shallow decoder [55] in super-resolution reconstruction for an example of two-dimensional incompressible flow (vorticity field) over a NACA0012 airfoil (\(Re=100\) and \(\alpha =40\) deg)

For the preparation of training data in these experimental studies, cross-correlation methods [109] are generally used to obtain velocity fields from particle images. Instead of feeding a velocity field obtained from the correlation method, one may consider providing a particle image directly to a model to obtain a higher-resolution flow field. Cai et al. [110] used FlowNetS [111] to estimate velocity fields of a cylinder wake, backward-facing step flow, and isotropic turbulence from synthetic particle images. They showed that a machine-learning model provides higher-resolution flow field data than conventional PIV. The proposed method was also tested with experimental particle images of a turbulent boundary layer. Reconstructed flows based on machine learning may capture phenomena that cannot be observed with conventional techniques. This FlowNetS-based method has recently been commercialized as AI-PIV [112]. The super-resolution approach with particle images has also been applied to wakes around bluff bodies to remove the influence of reflection and halation in PIV measurements [113].

Alternatively, a set of sparse sensor measurements can be considered as the input to machine-learning models instead of low-resolution flow data. For instance, Erichson et al. [55] used a fully connected model to reconstruct a global flow field from local sensors. The model was applied to geophysical flow and forced isotropic turbulence. Their fully connected model is a shallow decoder, which compresses the sensor inputs to nonlinearly extract key features, after which the whole field is recovered from these latent representations, as illustrated in Fig. 5. By visualizing the weight distribution between the latent-space representation and the whole field, the shallow decoder provides nonlinear modes that represent the contribution of each latent variable to super-resolution reconstruction, analogous to those captured by nonlinear autoencoders [108, 115,116,117,118,119,120].

As mentioned above, fully connected network-based reconstruction is prohibitively expensive for global flow field reconstruction due to the very large number of parameters in the network [121]. To address this issue, there have also been efforts to estimate low-order representations, such as coefficients obtained through proper orthogonal decomposition (POD), from sparse sensor measurements [122,123,124]. For instance, Nair and Goza [67] proposed a fully connected model-based estimator of POD coefficients and applied it to a laminar wake around a flat plate. Their fully connected model takes vorticity sensors on the plate surface and outputs POD coefficients, as illustrated in Fig. 6. They considered wakes at two different angles of attack and reported that the neural-network model outperforms conventional linear techniques such as Gappy POD [125, 126] and linear stochastic estimation [127]. Similarly, Manohar et al. [128] have recently performed fully connected model- and POD-based sparse reconstruction for the wake interactions of two cylinders. Their model considers the time history of sensor measurements with long short-term memory (LSTM) [129], achieving greater robustness against noisy inputs compared to a regular MLP model. These reduced-order strategies in machine-learning-based vortical flow reconstruction are summarized in Dubois et al. [114]. With flow examples of two- and three-dimensional cylinder wakes and a spatial mixing layer, they discussed the pros and cons of a variety of techniques such as POD [130,131,132], regular autoencoders [133], variational autoencoders [134], linear/nonlinear fully connected networks, support vector machines [135], gradient boosting [136], and library-based reconstruction [137, 138].

Fig. 6 Reduced-order modeling-assisted super-resolution reconstruction [67, 114]

From the viewpoint of reducing the number of parameters inside machine-learning models, a combination of a fully connected model and CNNs has also been leveraged to overcome the limitations of fully connected networks. Morimoto et al. [139] considered a combination of a multi-layer perceptron (MLP) and a CNN (called an MLP-CNN-based estimator) to estimate vortical flows around urban structures and temperature data (DayMET) across North America from sparse sensors. The sensor inputs are first fed into the fully connected part, which extracts feature vectors from the input sensors; these vectors are then given to the convolutional layers. Compared to solely using fully connected layers, the computational cost can be significantly reduced while maintaining the reconstruction accuracy. A similar MLP-CNN model was also considered by Zhong et al. [140, 141] for a vortex-airfoil gust interaction problem. The model estimates a two-dimensional vorticity field from pressure sensor measurements on an airfoil surface. They reported that transfer learning [92, 142] can help in reducing the required amount of training data, while a recurrent neural network (long short-term memory, LSTM [129]) also improves the reconstruction performance for complex transient wake problems.
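
The MLP-CNN idea above can be sketched as follows: fully connected layers lift the sparse sensor vector to a coarse feature map, and convolutional layers with upsampling decode that map into the field. All sizes and layer choices here are illustrative assumptions, not the architectures of the cited studies.

```python
# Sketch of an MLP-CNN hybrid estimator: sensors -> features -> field.
# Layer sizes are assumptions for illustration.
import torch
import torch.nn as nn

class MLPCNN(nn.Module):
    def __init__(self, n_sensors=16, base=8, channels=1):
        super().__init__()
        self.base = base
        self.mlp = nn.Sequential(            # feature extraction from sensors
            nn.Linear(n_sensors, 128), nn.ReLU(),
            nn.Linear(128, 32 * base * base), nn.ReLU(),
        )
        self.cnn = nn.Sequential(            # decode features into a field
            nn.Upsample(scale_factor=4, mode='nearest'),
            nn.Conv2d(32, 16, 3, padding=1), nn.ReLU(),
            nn.Upsample(scale_factor=4, mode='nearest'),
            nn.Conv2d(16, channels, 3, padding=1),
        )

    def forward(self, s):                    # s: (batch, n_sensors)
        z = self.mlp(s).view(-1, 32, self.base, self.base)
        return self.cnn(z)                   # (batch, 1, 128, 128)
```

Keeping the fully connected part small and letting weight-shared convolutions handle the spatial decoding is what reduces the parameter count relative to a purely fully connected reconstruction.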

3.2 Semisupervised and unsupervised learning

In addition to supervised-learning-based efforts, semisupervised and unsupervised learning can be used in super-resolution analysis of fluid flows. Semisupervised learning combines a small amount of labeled data with a large amount of unlabeled data and can also be augmented with prior knowledge incorporated into the loss function. Gao et al. [60] proposed a semisupervised CNN-based super-resolution analysis for fluid flows. Through the investigation of a two-dimensional laminar flow and a cardiovascular flow, they showed that constraints based on the conservation laws and boundary conditions enable successful super-resolution reconstruction without high-resolution labels. These physics-law-based augmentations, inspired by physics-informed neural networks (PINNs) [84, 85, 143], achieve accurate reconstruction while reducing the required amount of training data [65, 144].

There are also several studies on GAN-based semisupervised super resolution. Bode et al. [66] proposed the physics-informed enhanced super-resolution generative adversarial network (PIESRGAN) for applications to subgrid-scale modeling of LES. To incorporate a physics-based loss function, they used the following cost function \({{\mathcal {E}}}\) for training,

$$\begin{aligned} {{{\mathcal {E}}}} = {{{\mathcal {E}}}}_{\textrm{adv}} + \beta _{\textrm{reg}} {{{\mathcal {E}}}}_{\textrm{reg}} + \beta _{\textrm{grad}} {{{\mathcal {E}}}}_{\textrm{grad}} + \beta _{\textrm{cont}} \mathcal{E}_{\textrm{cont}}, \end{aligned}$$
(10)

where \(\beta _{\textrm{reg}}\), \(\beta _{\textrm{grad}}\), and \(\beta _{\textrm{cont}}\) are weighting coefficients for the different loss term contributions. The first term \({{{\mathcal {E}}}}_{\textrm{adv}}\) corresponds to the regular adversarial loss used in GAN-based models, introduced in Eq. 6 [145]. The second term \({{{\mathcal {E}}}}_{\textrm{reg}}\) is a regular supervised loss function, equivalent to Eq. 4. The PIESRGAN also includes the gradient loss \({{{\mathcal {E}}}}_{\textrm{grad}}\), defined as the \(L_2\) error norm of the gradient of the state variables [56]. Weighting the gradient of the flow field promotes a smooth and physically plausible reconstruction [146, 147]. They also considered \({{{\mathcal {E}}}}_{\textrm{cont}}\), the divergence-free error for incompressible flow. Similarly, a combination of a physics-based loss and U-Net (Fig. 4) was proposed by Esmaeilzadeh et al. [148] as MeshfreeFlowNet and was applied to the Rayleigh–Bénard instability problem. Owing to the U-Net-based augmentation, training MeshfreeFlowNet takes less than 4 min with 128 GPUs while achieving quantitative reconstruction. To improve the generalizability of MeshfreeFlowNet [148] for a wide variety of problems, Wang et al. [149] have recently proposed TransFlowNet, which weakens the constraint of initial and boundary conditions compared to MeshfreeFlowNet. TransFlowNet was tested with examples of the shallow water equation and Rayleigh–Bénard convection. The model provides better reconstruction than the original MeshfreeFlowNet, although instability of the training process is also observed due to the complexity of the model.

Fig. 7 a ResBlock [91] comprised of BatchNormalization layer, ReLU activation, and convolutional layer. b Cycle GAN (cGAN) [82, 150]

While incorporating the aforementioned physics losses can promote physically plausible super-resolution solutions, we should be mindful that finding an appropriate balance between the weighting coefficients is challenging. One can consider the use of optimization for finding an optimal set of coefficients, although it is computationally expensive [151]. The influence of the balance between an adversarial error and a regular \(L_2\) reconstruction error for sparse flow reconstruction is discussed in detail by Zhang et al. [152] for an example of flow around building models. Moreover, achieving stable convergence during training is also difficult with such complex loss functions. To alleviate this issue, additional machine-learning techniques such as skip connections [91] and BatchNormalization [153] can be leveraged. In fact, the aforementioned models, such as PIESRGAN [56], MeshfreeFlowNet [148], and TransFlowNet [149], are composed of ResBlocks [91] (illustrated in Fig. 7a), which include both BatchNormalization and skip connections, for stable and successful learning.

In contrast to supervised and semisupervised learning, unsupervised learning, which does not require labeled data sets, has also been used for super-resolution analysis. Kim et al. [59] proposed a cycle generative adversarial network (cGAN)-based framework for unsupervised super-resolution reconstruction of turbulent flows. While a regular GAN is composed of one generator and one discriminator, as presented in Sect. 2.1.3, a cGAN possesses two generators (\(G_1\) and \(G_2\)) and two discriminators (\(D_1\) and \(D_2\)), as illustrated in Fig. 7b. One generator, \(G_1\), attempts to reconstruct high-resolution data \({{\varvec{q}}}_{\textrm{HR}}\) from a low-resolution flow field \({{{\varvec{q}}}}_{\textrm{LR}}\), while the other generator, \(G_2\), produces low-resolution fields from the high-resolution flow data generated by \(G_1\). The discriminators \(D_1\) and \(D_2\) are trained to distinguish the real data from the generated data, as depicted in Fig. 7b. This setup allows the cGAN model to learn common features between low- and high-resolution data that need not be paired [150]. The proposed model can reconstruct a velocity field of turbulent channel flow from its low-resolution counterpart. They also demonstrated that the model trained with DNS data can be applied to LES data.

Following the study by Kim et al. [59], unsupervised GAN-based super resolution has recently been examined for a variety of flows. Wurster et al. [154] proposed a hierarchical GAN to perform super resolution of fluid flows. Analogous to SURFNet [57], the hierarchical GAN is first trained with low-resolution data sets, and the model weights are then transferred to training with higher-resolution flow fields. Güemes et al. [62] combined GAN-based super-resolution reconstruction and state estimation [155,156,157,158] from wall sensor measurements of turbulent channel flow. They first perform super-resolution reconstruction of the wall-shear stresses and wall pressure. Another GAN model is then constructed to estimate wall-parallel velocity fields at several wall-normal locations from the super-resolved wall measurements. The GAN models provide reasonable agreement with the reference simulation data up to \(y^+\approx 50\). Yousif et al. [159] extended a super-resolution GAN model by combining it with a multi-scale CNN [54] and applied it to a turbulent channel flow with large longitudinal ribs. The reconstructed flow fields are shown to retain the temporal correlations and high-order spatial statistics.

Moreover, the use of a CNN-based GAN for three-dimensional super-resolution analysis was examined by Xu et al. [160] for computed tomography (CT) of a turbulent jet combustor. With an example of turbulent atmospheric flow, Hassanaly et al. [161] have comprehensively compared various models for super-resolution reconstruction, including a super-resolution GAN [162], stochastic estimation, a deconvolution GAN [163], and a diversity-sensitive conditional GAN [164]. Although GAN-based models have issues with stability during the learning process, they hold potential for high-wavenumber reconstruction of turbulent flows.

4 Case study: super-resolution reconstruction of turbulence

This section offers details of CNN-based super-resolution reconstruction for fluid flows through a case study. As an example, we consider two-dimensional decaying isotropic turbulence, which serves as a canonical turbulent flow. The flow field data to be studied are generated by a two-dimensional DNS [165], which numerically solves the two-dimensional vorticity transport equation,

$$\begin{aligned} \dfrac{\partial \omega }{\partial t}+{{\varvec{u}}}\cdot \nabla \omega =\dfrac{1}{\textrm{Re}_0}\nabla ^2 \omega , \end{aligned}$$
(11)

where \({{{\varvec{u}}}}=(u,v)\) and \(\omega \) represent the velocity and vorticity fields, respectively. The computational domain is a biperiodic square with \(L_x=L_y=1\). The initial Reynolds numbers for the training/validation and test data sets are, respectively, set to Re\(_0\equiv u^*l_0^*/\nu =\{451,442\}\). Here, \(u^*\) is the characteristic velocity defined as the square root of the spatially averaged initial kinetic energy, \(l_0^*=[2{\overline{u^2}}(t_0)/{\overline{\omega ^2}}(t_0)]^{1/2}\) is the initial integral length, and \(\nu \) is the kinematic viscosity. The number of computational grid points used by the DNS is \(N_x=N_y=512\). For training the baseline networks, we use 1000 snapshots over the eddy turnover times \(t\in [2,6]\) with a time interval of \(\Delta t=0.004\). We consider the vorticity field \(\omega \) as the variable of interest.

We note that our previous studies [54, 90] on machine-learning-based super-resolution reconstruction were performed with two-dimensional decaying turbulence but at lower Reynolds numbers (Re\(_0\approx 80\)) and with a smaller number of grid points (\(N=128\)). The present case study examines how the model can be improved with regard to not only the reconstruction accuracy but also the amount of training data required at a higher Reynolds number.

For the present study, we consider super-resolution reconstruction with a regular CNN and the hybrid downsampled skip-connection/multi-scale (DSC/MS) model [54]. The design of the DSC/MS model is illustrated in Fig. 8. The red portion, the downsampled skip-connection (DSC) model, is composed of up-/downsampling operations and skip connections. The up-/downsampling operations provide robustness against rotation and translation of vortical structures. The skip connection plays a crucial role in hierarchically learning the relationship between the high-resolution output and the low-resolution input, while providing numerical stability during the learning process of the CNN [91]. The present model also incorporates the multi-scale (MS) model [166], corresponding to the blue portion of Fig. 8. This part of the model performs filtering operations with three different filter sizes, capturing a range of spatial length scales in vortical flows.

To accurately reconstruct the two-dimensional higher Reynolds number turbulent flow, we add internal skip connections between the DSC model and the MS model, as depicted by the green and orange boxes in Fig. 8. Each green box in the DSC model connects with each of the orange boxes in the MS model; hence, nine connections are present. With these interconnections, the interconnected DSC/MS model enables the intermediate inputs/outputs of both submodels to correlate with each other through the learning process. Since the coverage of spatial length scales increases with the Reynolds number, the interconnections are expected to be important for the model to learn the relationship between small and large vortical elements. For the activation function \(\varphi \), this study uses the ReLU function [167] to avoid vanishing weight gradients during the training process.

Fig. 8 Interconnected DSC/MS model for super-resolution reconstruction of turbulent flows

Furthermore, we consider a physics-based loss function to examine its effects on machine-learning-based super-resolution reconstruction of turbulent vortical flows. As discussed in Sect. 2.2, the use of physics-inspired loss function may not only promote the physical validity of reconstruction but also reduce the amount of necessary training data in a semisupervised manner [60, 65, 148, 168]. Here, we use the nonlinear advection term and the linear viscous diffusion term in Eq. 11 for the physics-based loss function. The present cost function \({{{\mathcal {E}}}}\) is hence defined as

$$\begin{aligned} {{{\mathcal {E}}}}&= {{{\mathcal {E}}}}_{\omega } + \beta _{\textrm{adv}}{{{\mathcal {E}}}}_{\textrm{adv}} + \beta _{\textrm{visc}}{{{\mathcal {E}}}}_{\textrm{visc}},~~~{\textrm{where}}~\nonumber \\ {{{\mathcal {E}}}}_{\omega }&= ||\omega _{\textrm{DNS}}-F(\omega _{\textrm{LR}})||_2,\nonumber \\ {{{\mathcal {E}}}}_{\textrm{adv}}&= ||{{{\varvec{u}}}}_{\textrm{DNS}}\cdot \nabla \omega _{\textrm{DNS}}-{{{\varvec{u}}}}_{\textrm{ML}}\cdot \nabla \omega _{\textrm{ML}}||_2,\nonumber \\ {{{\mathcal {E}}}}_{\textrm{visc}}&= ||\nabla ^2 \omega _{\textrm{DNS}}-\nabla ^2 \omega _{\textrm{ML}}||_2, \end{aligned}$$
(12)

in which \(\omega _{\textrm{DNS}}\) and \(\omega _{\textrm{LR}}\), respectively, represent the reference (high-resolution) DNS field and the low-resolution input flow field. The coefficients \(\beta _{\textrm{adv}}\) and \(\beta _{\textrm{visc}}\) determine the balance of the terms in the loss function. The terms \((\cdot )_{\textrm{ML}}\) inside \({{{\mathcal {E}}}}_\textrm{adv}\) and \({{{\mathcal {E}}}}_{\textrm{visc}}\) are computed with the super-resolved vorticity field \(F(\omega _{\textrm{LR}})\).
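
A sketch of this loss for the biperiodic domain is given below, with the derivatives evaluated spectrally. Since \({{{\mathcal {E}}}}_{\textrm{adv}}\) requires the velocity, we recover it here from the vorticity through the streamfunction; this recovery step and the numerical details are our assumptions for illustration.

```python
# Sketch of the loss in Eq. 12 on a biperiodic domain with spectral
# derivatives (torch.fft). The velocity is recovered from vorticity via
# the streamfunction (-lap(psi) = omega); omega is an (n, n) tensor.
import torch

def terms(omega, L=1.0):
    n = omega.shape[-1]
    k = 2 * torch.pi * torch.fft.fftfreq(n, d=L / n)
    kx, ky = torch.meshgrid(k, k, indexing='xy')
    k2 = kx**2 + ky**2
    k2_safe = k2.clone()
    k2_safe[0, 0] = 1.0                       # avoid division by zero
    w_hat = torch.fft.fft2(omega)
    psi_hat = w_hat / k2_safe                 # streamfunction modes
    u = torch.real(torch.fft.ifft2(1j * ky * psi_hat))   # u =  d(psi)/dy
    v = torch.real(torch.fft.ifft2(-1j * kx * psi_hat))  # v = -d(psi)/dx
    wx = torch.real(torch.fft.ifft2(1j * kx * w_hat))
    wy = torch.real(torch.fft.ifft2(1j * ky * w_hat))
    adv = u * wx + v * wy                     # u . grad(omega)
    visc = torch.real(torch.fft.ifft2(-k2 * w_hat))      # lap(omega)
    return adv, visc

def loss_fn(w_ml, w_dns, b_adv=0.1, b_visc=0.1):
    adv_ml, visc_ml = terms(w_ml)
    adv_dns, visc_dns = terms(w_dns)
    e_w = torch.linalg.norm(w_dns - w_ml)               # E_omega
    e_adv = torch.linalg.norm(adv_dns - adv_ml)         # E_adv
    e_visc = torch.linalg.norm(visc_dns - visc_ml)      # E_visc
    return e_w + b_adv * e_adv + b_visc * e_visc        # Eq. 12
```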

Fig. 9 Super-resolution reconstruction of two-dimensional decaying homogeneous isotropic turbulence. The value underneath each vorticity contour plot presents the \(L_2\) norm of reconstruction error \(\epsilon \)

Fig. 10 a The linear term \(\nabla ^2\omega \) and b the nonlinear term \({{{\varvec{u}}}}\cdot \nabla \omega \), computed from the reconstructed flow fields for each machine-learned model. The value underneath each contour presents the \(L_2\) norm error \(\epsilon \). Shown results are from the same case as the vorticity snapshots presented in Fig. 9

In what follows, we assess six different machine-learning models:

1. CNN-\(L_2\): a regular CNN model with \({{{\mathcal {E}}}} = {{{\mathcal {E}}}}_{\omega }\),
2. CNN-\(L_{\textrm{phys}}\): a regular CNN model with \({{{\mathcal {E}}}} = {{{\mathcal {E}}}}_{\omega } + \beta _{\textrm{adv}}{{{\mathcal {E}}}}_{\textrm{adv}} + \beta _{\textrm{visc}}{{{\mathcal {E}}}}_{\textrm{visc}}\),
3. DSC/MS-\(L_2\): the original DSC/MS model [54] with \({{{\mathcal {E}}}} = {{{\mathcal {E}}}}_{\omega }\),
4. DSC/MS-\(L_{\textrm{phys}}\): the original DSC/MS model with \({{{\mathcal {E}}}} = {{{\mathcal {E}}}}_{\omega } + \beta _{\textrm{adv}}{{{\mathcal {E}}}}_{\textrm{adv}} + \beta _{\textrm{visc}}{{{\mathcal {E}}}}_{\textrm{visc}}\),
5. IDSC/MS-\(L_2\): the interconnected DSC/MS model with \({{{\mathcal {E}}}} = {{{\mathcal {E}}}}_{\omega }\),
6. IDSC/MS-\(L_{\textrm{phys}}\): the interconnected DSC/MS model with \({{{\mathcal {E}}}} = {{{\mathcal {E}}}}_{\omega } + \beta _{\textrm{adv}}{{{\mathcal {E}}}}_{\textrm{adv}} + \beta _{\textrm{visc}}{{{\mathcal {E}}}}_{\textrm{visc}}\).

These six machine-learning models are tasked to reconstruct the high-resolution vortical flow field of size \(512^2\) from the corresponding low-resolution data of size \(16^2\), generated by average-pooling operations [54]. We set \(\beta _{\textrm{adv}} = \beta _{\textrm{visc}} = 0.1\) to balance the order of magnitude of each term.
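
The low-resolution inputs can be generated with a pooling operation; a minimal sketch of the average-pooling downsampling (a \(32\times 32\) window reduces \(512^2\) to \(16^2\)) is given below.

```python
# Sketch of the average-pooling downsampling used to generate the
# low-resolution training inputs from the DNS vorticity fields.
import torch
import torch.nn.functional as F

def make_low_res(omega_hr, factor=32):
    # omega_hr: (batch, 1, 512, 512) -> (batch, 1, 16, 16)
    return F.avg_pool2d(omega_hr, kernel_size=factor)
```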

Let us consider the reconstructed vorticity fields from the machine-learning-based super-resolution approaches in Fig. 9. The large-scale vortices can be reconstructed with the regular CNN models. However, the reconstructed fields are pixelized around rotation- and shear-dominated structures, which was also observed with regular CNN-based super-resolution reconstruction in our previous study [54]. The \(L_2\) norm error, \(\epsilon = ||{f_{\textrm{DNS}}}-{f_{\textrm{ML}}}||_2/||{f_\textrm{DNS}}||_2\), is found to be larger than 0.5 with the regular CNNs. The DSC/MS model with the \(L_2\)-based optimization provides a better and clearer reconstruction of large vortical structures, with an \(L_2\) norm error of 0.241. This indicates that embedding the physics-inspired DSC functions and the MS filters enables accurate reconstruction of vortical flows.

While the DSC/MS model achieves a qualitative reconstruction of vortical structures, the finer scales of the shear layers that appear around large rotational elements are not recovered well. The reconstruction over these scales, which emerge in higher Reynolds number flows, can be improved by introducing either the physics-based loss function or the interconnections inside the DSC/MS model, as presented in Fig. 9. With DSC/MS-\(L_{\textrm{phys}}\), IDSC/MS-\(L_2\), and IDSC/MS-\(L_{\textrm{phys}}\), these shear layers are more accurately reconstructed than with the regular model, as highlighted by the red boxes in Fig. 9. Hence, both the physics-inspired optimization and the model design greatly assist the reconstruction of higher Reynolds number flows. Note that the difference between the interconnection-based model enhancement and the physics-based loss function lies in their robustness against noisy low-resolution input, as will be discussed later.

We here examine each term in the physics-based loss function, namely the linear term \(\nabla ^2\omega \) and the nonlinear term \({{\varvec{u}}}\cdot \nabla \omega \) of the present super-resolution reconstruction, as shown in Fig. 10. These results are from the same case as the vorticity snapshots presented in Fig. 9. Examination of these terms is a strict test since higher-order derivatives can greatly amplify errors at high wavenumbers. Let us first focus on the estimated linear viscous diffusion term visualized in Fig. 10a. The regular CNN completely fails to estimate \(\nabla ^2\omega \), as expected from the pixelized vorticity reconstruction in Fig. 9. With the DSC/MS model under the regular \(L_2\) optimization, the linear term field also exhibits erroneous profiles comprised of pairwise structures that are not observed in the reference field. These derivative-based assessments are very sensitive and affected by the reconstruction of surrounding local structures. As expected, the estimation is improved by including the physics-based term in the loss function (DSC/MS-\(L_{\textrm{phys}}\)). The accuracy of \(\nabla ^2\omega \) can be further enhanced by using the interconnected DSC/MS models, which present fine-scale structures in the high-order derivative field. This indicates that adding the interconnections inside the machine-learning model enables physically compatible super-resolution reconstruction of turbulent flows, in addition to the physics-loss-based optimization.

The estimation of the nonlinear term is shown in Fig. 10b. The overall trend in reconstruction is analogous to that for the linear term; the interconnected DSC/MS models reconstruct the nonlinear term fields well compared to the reference field. The \(L_2\) errors for the nonlinear term with DSC/MS-\(L_{\textrm{phys}}\), IDSC/MS-\(L_2\), and IDSC/MS-\(L_{\textrm{phys}}\) are higher than those for the linear term, suggesting that the nonlinear term is more difficult to estimate than the linear term.

We also investigate the dependence of the reconstruction error on the number of training snapshots \(n_{\textrm{snapshot}}\). For all models, the error decreases as \(n_{\textrm{snapshot}}\) increases, as shown in Fig. 11. Both the interconnections and the physics-based loss enable a qualitative reconstruction with a reduced number of training snapshots. The observation that physics-inspired optimization reduces the required amount of training data has also been reported in previous studies [60, 143]. The interconnected DSC/MS model reconstructs fine vortical structures even with only \(n_{\textrm{snapshot}} = 50\) (Fig. 11c), while the original DSC/MS model recovers only large-scale structures, as shown in Fig. 11a. This suggests that the present machine-learning model efficiently captures the nonlinear relationship between the under-resolved input and the high-resolution vortical flow from a small amount of training data by capitalizing on the interconnected skip connections.

Fig. 11 Dependence of the reconstruction accuracy on the number of the training snapshots

Fig. 12 Dependence of the reconstruction accuracy on the magnitude of noisy input

The use of the physics-based loss function can also lead to robustness against noisy inputs. Here, let us examine the influence of noise on super-resolution reconstruction. We add Gaussian noise \({{{\varvec{n}}}}\) to the low-resolution input \({\omega }_{\textrm{LR}}\) and assess the reconstruction \(L_2\) error \(\epsilon = ||\omega _{\textrm{HR}} - F(\omega _{\textrm{LR}} + {{{\varvec{n}}}})||_2/||\omega _{\textrm{HR}}||_2\), where the magnitude of the noise is given by \(\gamma = ||{{\varvec{n}}}||/||\omega ||\). Here, the models trained with 1000 snapshots are used. The relationship between the error and the noise magnitude is shown in Fig. 12. For all cases, the error increases with the noise magnitude \(\gamma \). The reconstructed flow fields generally recover the large-scale vortices, while the finer scales are affected by the noisy input. Especially for \(\gamma >0.3\), the DSC/MS models with the physics-based loss function are observed to be more robust than the models trained with the simple \(L_2\) error optimization. Hence, the physics-based loss function helps in devising models that are robust against noisy measurements.
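
The robustness test can be sketched as follows: noise is scaled to the prescribed relative magnitude \(\gamma \) before being added to the input, and the error \(\epsilon \) is evaluated against the reference field. The normalization of \(\gamma \) by the low-resolution input is our assumption.

```python
# Sketch of the noise-robustness assessment: Gaussian noise with relative
# magnitude gamma = ||n|| / ||omega_LR|| is added to the coarse input.
import torch

def noisy_error(model, w_lr, w_hr, gamma):
    noise = torch.randn_like(w_lr)
    noise *= gamma * torch.linalg.norm(w_lr) / torch.linalg.norm(noise)
    w_rec = model(w_lr + noise)
    return (torch.linalg.norm(w_hr - w_rec) / torch.linalg.norm(w_hr)).item()
```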

5 Extensions

In the above sections, we surveyed various machine-learning-based super-resolution approaches and their applications to vortical flows. Here, we discuss extensions of machine-learning-based super-resolution analysis beyond their basic applications.

5.1 Changing input variable setups

Fig. 13 Applications of machine-learning-based super-resolution analysis for moving sensor and unstructured grid conditions. a Convolution on point clouds [169, 170]. b Graph neural network [171, 172]. c Coordinate transformation [173]. d Voronoi tessellation-based projection [64]

When a machine-learning model is trained, the size, or more specifically the setup, of the input and output variables is fixed. If the setup is changed, the machine-learning model generally needs to be completely retrained, which is a heavy burden. This issue is in fact a limitation of many machine-learning models, and machine-learning-based super-resolution models are no exception. If the number of pixels or their locations differs from that used in the training process, the trained model cannot be used without retraining for the changed input size. Preprocessing different-size input data with interpolation may work, but care should be taken since such an approach generally loses information. Unstructured grids and randomly sampled data also require care since standard CNN-based models may not be appropriate.

There are several approaches to address these challenges. For instance, PointNet [170] is able to handle unorganized and sparse data as a point cloud. Although it was originally developed for image classification and segmentation tasks, Kashefi et al. [169] have recently applied it to fluid flows. In their formulation, sensors on the grid can be treated directly, and a model can learn the relationship between the sensors and the outputs, as illustrated in Fig. 13a.

To handle spatially irregular sensor arrangements, graph neural networks (GNNs) [121] can be considered. A GNN performs convolutional operations on unstructured mesh data, similar to those inside CNNs. Such GNN-based methods can be applied to machine-learning-based super-resolution reconstruction by modifying the setup of the data dimensions between input and output, as shown in Fig. 13b.

Coordinate transformation can also be used so that regular machine-learning models can be applied directly to vortical flows. PhyGeoNet [173] includes a coordinate transformation from an irregular domain to a structured mesh space for fluid flow regression, allowing convolution on flow fields, as illustrated in Fig. 13c. Finding an appropriate coordinate transformation may be a challenge for complex flow domain geometries.

We can also generalize super-resolution analysis by considering sensor measurements in the flow field as the input for machine-learning models to reconstruct the flow field. For a fixed number of sensors with unchanged positions, regular machine-learning models developed in image science can often be used directly. However, when sensors go online or offline, changing their number, or move spatially over time, machine-learning models cannot be applied without special care. The Voronoi-tessellation-based CNN [64] can handle an arbitrary number of moving sensors with a single model. In this formulation, sparse sensor measurements are projected onto grids generated from Voronoi tessellation, as illustrated in Fig. 13d. The flow field discretized with Voronoi tessellation is then used as an input for CNN-based super-resolution reconstruction. This approach provides robust real-time super-resolution analysis for vortical flows.
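
The projection step can be sketched with a nearest-neighbor assignment, which realizes the Voronoi partition of the sensor locations; the actual formulation in [64] also supplies a sensor-location mask as an additional input channel. Domain and sizes here are illustrative.

```python
# Sketch of the Voronoi-tessellation projection of scattered sensors onto
# a structured grid: each grid point takes the value of its nearest
# sensor, which is exactly the Voronoi partition of the sensor set.
import numpy as np
from scipy.interpolate import griddata

def voronoi_input(sensor_xy, sensor_vals, nx=128, ny=128):
    x = np.linspace(0.0, 1.0, nx)
    y = np.linspace(0.0, 1.0, ny)
    X, Y = np.meshgrid(x, y)
    return griddata(sensor_xy, sensor_vals, (X, Y), method='nearest')

# Example: 30 arbitrarily placed sensors -> (128, 128) CNN input image
xy = np.random.rand(30, 2)
vals = np.random.randn(30)
field = voronoi_input(xy, vals)
```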

5.2 Super resolution for turbulent flow simulations

With the ability to recover fine-scale flow structures from coarse images of the flow field, it is natural to ask whether machine-learning-based super-resolution analysis can be incorporated into numerical simulations to improve turbulent flow simulations. From a broader perspective, this question translates to whether super-resolution analysis can be implemented in a simulation of multi-scale physical phenomena to accurately reconstruct the subgrid-scale physics [63, 66].

For super-resolution analysis to reconstruct a physically accurate high-resolution flow field, it is generally necessary that the low-resolution input data be accurate on its own coarse grid. If the coarse flow field input is provided by a turbulent flow simulation (e.g., large-eddy simulation, detached-eddy simulation, or Reynolds-averaged Navier–Stokes simulation [174]), it is important that the coarse flow be accurate to begin with. The super-resolved field would not be physically accurate if the low-resolution input deviates from the true solution. Conservatively speaking, turbulent flow statistics may be predicted well with super resolution, but highly accurate reconstruction of each and every instantaneous flow field would likely be a major difficulty, if not impossible [175]. In other words, we should not expect that LES results (or those from other solvers with turbulence models) can be transformed to yield DNS results.

A worthy question to ask is whether super-resolution analysis can support the development of subgrid-scale models. This differs from turbulence modeling approaches that apply regression to directly determine the subgrid-scale models for turbulent flow simulations. Similar to an approximate deconvolution model [176], which considers the inverse mapping of spatial filters, super resolution could be used to augment subgrid-scale models. Furthermore, it remains to be seen whether super-resolution analysis can simultaneously nudge the low-resolution field and recover the subgrid-scale flow structures. Again, the success of such simultaneous corrections will likely require the low-resolution flow field to be fairly accurate on its own grid. Alternatively, GAN-based techniques may also provide interesting approaches to achieving super resolution for turbulence.

Ongoing research developments in super-resolution analysis of turbulent flows and data-driven turbulence models [33, 35] may address the issues identified here in the coming years. As super-resolution methods are extended and incorporated into turbulent flow analysis and simulations, it is important to ensure that the derived super-resolution method is generalizable over a range of Reynolds numbers and turbulent flow problems to confirm robust and reliable performance. This is critical if these techniques are to be implemented in general-use turbulent flow simulators.

5.3 Applications to real-world problems

Super-resolution analysis holds great potential for fluid dynamics, as discussed above. However, there still exist some challenges, especially toward applications to real-world problems. This section discusses the current challenges and possible future directions of machine-learning-based fluid flow super resolution.

One of the major challenges of machine-learning-based super-resolution reconstruction for fluid flows is the need for a certain amount of training data. While unsupervised learning with GANs and semisupervised learning assisted by physics-inspired loss functions, both introduced in the present survey, can mitigate this issue, existing techniques still require learning the relationship between coarse data and high-resolution vortical flows from either unpaired or paired training data for successful reconstruction. Since the majority of real-world problems do not have access to the ground truth and only sparse, noisy measurements are available, one can consider the use of data assimilation [177,178,179] to improve super-resolution reconstruction by incorporating the latest observations into a short-range real-time forecast.

Yasuda and Onishi [180] have recently proposed four-dimensional super-resolution data assimilation and demonstrated its performance with a two-dimensional periodic channel flow. The proposed method considers the temporal evolution of the system from low-resolution simulations with the aid of an ocean model, while a trained machine-learning model simultaneously performs data assimilation and super resolution. Since a huge amount of historical weather and climate reanalysis data is available, the unification of super resolution with data assimilation or pre-existing models would be an interesting research direction.

In addition, most existing studies focus on designing a reconstruction model for a particular flow problem, variable, or data shape. From this aspect, it would be desirable to simultaneously leverage a variety of multi-modal data, such as point-wise measurements, image-based data, and online measurements such as LiDAR-type data. Prediction of unavailable parameters from such sparse and noisy, but available, measurements may also become an interesting direction for super-resolution studies in fluid dynamics.

6 Conclusions

We provided a survey on machine-learning-based super-resolution reconstruction of vortical flows. Several machine-learning approaches and the use of physics-based cost functions for super-resolution analysis were discussed. We further performed case studies of super-resolution reconstruction of turbulent flows with convolutional neural network (CNN)-based methods. We demonstrated that a super-resolution model with physics-based loss function or physics-inspired neural network structures can reconstruct vortical flows even with limited training data and noisy inputs. We also discussed extensions and challenges of machine-learning-based super resolution for fluid flows from the aspects of changing input variable setups and applications for turbulent flow simulations.

The insights obtained through the present survey can be leveraged for a variety of machine-learning-based super-resolution models. For instance, the use of multi-scale filters inside CNNs can be generalized not only in supervised learning but also in unsupervised techniques [181]. Physics-informed loss functions can also be extended to various machine-learning models. Moreover, it may also be interesting to develop super-resolution models in wavenumber space to incorporate certain spectral properties.

We remind the reader that the studies surveyed in the present paper are generally based on clean training data. Preparing high-quality input data is essential for successfully reconstructing turbulent flows. It is, moreover, necessary to assess the robustness and sensitivity of the models against noisy inputs [182]. This point will become important as machine-learning-based super-resolution analyses are utilized in industrial applications [183]. Together with the accuracy of the models, quantifying the uncertainty of machine-learning predictions is also required to assess their reliability and limitations. For these reasons, making computational and experimental fluid flow databases [184,185,186] available is critically important for advancing studies on data-driven analysis of vortical flows. We hope that this survey provides some guidance in advancing algorithms and applications of machine-learning-based super-resolution analysis for a variety of fundamental and industrial fluid flow problems.