1 Introduction

Vortex–airfoil interaction is ubiquitous in fluid-based systems, including aircraft [1,2,3,4], wind turbines [5], and pumps [6, 7]. Such interactions can cause unsteady loading, fatigue, and structural damage to these systems. For analyzing vortex–airfoil interactions, it is useful to assess the state of the flow from sparse measurements for understanding the governing dynamics [8], predicting flow disturbances [9], and performing wake flow control [10]. However, it is challenging to identify vortical structures during vortex–airfoil interactions from sparse measurements because of their strongly nonlinear dynamics and the large number of degrees of freedom required to describe the vortical flows.

A number of studies have examined sparse state estimation for aerodynamics. In particular, linear techniques have been studied over the last several decades. For instance, gappy proper orthogonal decomposition [11] has been considered to obtain dominant flow features from spatially incomplete and sparse data sets [12]. Focusing on the characterization of flows and boundary layers near body surface, the applications of four-dimensional variational method [13], linear stochastic estimation [14], and Kalman filters [15] have also been explored. However, these techniques are constrained by their linear formulations, which poses challenges when the applications involve strongly nonlinear dynamics.

To overcome such limitations, nonlinear machine learning approaches have been considered promising for analyzing fluid flows from sparse information. Nonlinear machine learning techniques have been shown to be useful in estimating and modeling high-dimensional flows [16]. For example, Pawar et al. [17, 18] applied a physics-guided machine-learning framework to estimate the lift coefficient of a variety of airfoils. Hui et al. [19] utilized a signed distance function-assisted convolutional neural network (CNN) to predict the pressure distribution over an airfoil surface. For flow field reconstructions, Erichson et al. [20] proposed a shallow decoder based on a multi-layer perceptron (MLP) for a circular cylinder wake, the sea surface temperature, and forced isotropic turbulence. Fukami et al. [21] proposed a CNN-based method to reconstruct the global turbulent flow field from sparse sensors that can be in motion or change in number. In addition to the aforementioned efforts, there are various machine-learning-based flow reconstruction techniques based on super-resolution analysis [22,23,24].

However, there are issues with utilizing nonlinear machine learning techniques for estimating unsteady fluid flows from limited sensor measurements. The most outstanding issue is the high computational cost of machine learning models. Neural network-based models that map low-dimensional inputs to high-dimensional outputs require an enormous number of internal parameters (weights). To determine these parameters, thousands of flow (or sensor) snapshots are generally required, which causes a large computational burden in terms of both training costs and data storage. In our case, if a variety of unsteady flow fields needs to be accurately reconstructed, storage and computing costs can rise significantly if the problem is approached naively. From this aspect, it is crucial to develop a method that can qualitatively reconstruct a flow field with a small amount of training data and a reduced number of tuning parameters. In addition, generalizable models promote a reduction in cost. Most machine learning models can only be used for specific flow fields; for example, a model trained with a laminar flow may not be applicable to the reconstruction of turbulent flow fields. In fact, the data used for testing needs to be similar to the training data to achieve accurate results. If we need to consider different flows over a vast parameter space, it is almost impossible to perform experiments or simulations for each and every case. In this regard, the diversity of the training data needs to be considered so that a single model can effectively predict unsteady flow fields over a large range of parameters.

In this study, we aim to develop machine learning methods that reconstruct dominant wake features from limited sensor measurements and a small set of training data sampled over a vast parameter space. Because the disturbance vortex can be of any size, strength, or position relative to the airfoil, a very large parameter space needs to be explored to capture the complex vortex-impinged airfoil wake dynamics. In this case, the amount of data can be tremendously large. Instead of naively training machine learning models with all parameter combinations, we develop models that are trained with a few cases in the parameter space and use the models to estimate unseen cases. For the machine learning methods, we choose a multi-layer perceptron (MLP) to model the nonlinear relationship between the low-dimensional sensor inputs and the outputs, including the lift coefficient, drag coefficient, and surface pressure coefficient. Moreover, combining a convolutional neural network (CNN) with the MLP allows the reconstruction of the vorticity field over time with modest computational costs. Transfer learning and long short-term memory (LSTM) further help in incorporating the dynamics of the transient flow, which reduces the required training data and improves the flow estimation. The current model is robust for a variety of wake scenarios separate from the training data. We also assess the influence of sensor numbers and placement on flow estimation.

The present paper is organized as follows. The problem setup and data compilation are discussed in Sect. 2. Flow physics of vortex–airfoil wake interactions are presented in Sect. 3. Machine learning techniques utilized in this study are introduced in Sect. 4. Results and discussion of machine learning-based flow reconstruction are presented in Sect. 5. Concluding remarks are provided in Sect. 6.

2 Data compilation

The present objective is to develop a robust machine-learning model for highly disturbed flows around an airfoil from sparse pressure sensors and limited training data. Here, we consider transient flow over a NACA 0012 airfoil at an angle of attack of \(\alpha =12^{\circ }\) experiencing various types of vortical disturbances at a chord-based Reynolds number \(\hbox {Re} \equiv u_{\infty } c/{\nu }_{\infty }=400\) and a Mach number \(M_{\infty }\equiv u_{\infty }/a_{\infty }=0.1\). Here, \(u_{\infty }\) is the free-stream velocity, c is the chord length, \({\nu }_{\infty }\) is the kinematic viscosity, and \(a_{\infty }\) is the freestream sonic speed. The simulated flows have been verified and validated with previous studies [25,26,27].

Fig. 1

a The size and position of the vortical disturbance, and 8 uniform sensors are distributed on the airfoil surface; b the velocity profile of the disturbance vortex

Fig. 2

Randomly distributed 50 training cases (blue) and 10 test cases (red). Example test cases shown: a \((u_{\theta \textrm{max}}/u_{\infty }, R/c, y_{0}/c)=(0.14, 0.15, 0.15)\), b \((u_{\theta \textrm{max}}/u_{\infty }, R/c, y_{0}/c)=(0.78, 0.99, 0.18)\), c \((u_{\theta \textrm{max}}/u_{\infty }, R/c, y_{0}/c)=(-\,0.80, 0.61, 0.08)\), d \((u_{\theta \textrm{max}}/u_{\infty }, R/c, y_{0}/c)=(0.35, 0.95, -\,0.15)\) (color figure online)

The compressible flow solver CharLES [28] is used to simulate the transient flows over the airfoil. For the present vortex–airfoil interaction problem, a single vortical disturbance is initially introduced upstream of the airfoil. This disturbance vortex is given as a compressible Taylor vortex [29], described by

$$\begin{aligned} u_{\theta }=u_{\theta {\max }}{\dfrac{r}{R}}\textrm{exp}\left[ \dfrac{1}{2}\left( {1-\dfrac{r^2}{R^2}}\right) \right] , \end{aligned}$$
(1)

where R is the radius, and \(u_{\theta {\max }}\) is the maximum rotational velocity of the vortex, as shown in Fig. 1. The vortex is initially introduced at \((x_0,y_0)\) with \(x_0=-2c\).
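
The velocity profile in Eq. (1) can be evaluated directly. The following is a minimal sketch, assuming only the formula above; the function name and the sample values of R and \(u_{\theta {\max }}\) are illustrative and not taken from the simulations.

```python
import math

def taylor_vortex_velocity(r, R, u_theta_max):
    """Azimuthal velocity of the Taylor vortex in Eq. (1) at radial distance r."""
    ratio = r / R
    return u_theta_max * ratio * math.exp(0.5 * (1.0 - ratio ** 2))

# Illustrative (assumed) values: R/c = 0.5 and u_theta_max/u_inf = 0.3
R, u_max = 0.5, 0.3
radii = (0.0, 0.25, 0.5, 1.0, 2.0)
profile = [taylor_vortex_velocity(r, R, u_max) for r in radii]
```

The profile vanishes at the vortex center, reaches \(u_{\theta {\max }}\) at \(r=R\), and decays rapidly for \(r>R\).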

The present vortex–airfoil interaction problem exhibits a variety of flow patterns, as shown in Fig. 2. A strong disturbance vortex produces strong unsteadiness in the flow field, and the larger the vortex is, the larger the region it influences. Apart from the radius and the strength, a vortex can either hit the airfoil at the leading edge and thus incite large fluctuations or pass through the airfoil without causing dramatic changes to the flow or aerodynamic characteristics. Detailed discussion on the flow is offered in Sect. 3. The present study examines whether the flow field generated over the wide parameter space can be recovered with the machine-learning model trained with only a very few cases.

In the present study, we choose eight sensors distributed on both sides of the airfoil surface to capture the vortex passing around an airfoil, as shown in Fig. 1. These sensors are labeled 1–8, with the respective x-locations of the sensors being (0.00, 0.26, 0.48, 0.72, 0.99, 0.23, 0.46, 0.71)c. The three parameters that describe the disturbance vortex are the maximum rotational velocity (\(u_{\theta {\max }}\)), the radius (R), and the initial vertical location (\(y_0\)). The training data sets are composed of \(u_{\theta {\max }}/u_{\infty } \in \{-0.9, -0.7, -0.5, -0.3, -0.1, 0.1, 0.3, 0.5, 0.7, 0.9\}\), \(R/c\in \{0.125, 0.25, 0.5, 0.75, 1\}\), and \(y_{0}/c\in \{-0.3, -0.1, 0, 0.1, 0.3\}\). Here, a positive value of \(u_{\theta {\max }}/u_{\infty }\) indicates a counterclockwise rotation. The magnitude of the maximum rotational velocity of the vortex \(|u_{\theta {\max }}|\) ranges from \(0.1 u_{\infty }\) to \(0.9 u_{\infty }\). The choices for the vortex radius R and \(y_0\) are carefully determined so that vortices can pass over or below the airfoil while significantly influencing the airfoil wake. In Sect. 5, we consider 25, 50, and 100 training cases out of the large number of parameter combinations, then test the models with untrained cases. Parameter combinations of test cases are randomly chosen over the aforementioned ranges. Note that the training data covers only a small proportion of all parameter combinations. No test case overlaps with the training cases.
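
To give a sense of the size of the parameter space, the listed values can be enumerated as a grid. This is only a sketch: the text does not spell out the sampling procedure, so drawing training cases without replacement from the grid, the fixed seed, and the variable names are all assumptions.

```python
import itertools
import random

# Parameter values as listed in the text
u_values = [-0.9, -0.7, -0.5, -0.3, -0.1, 0.1, 0.3, 0.5, 0.7, 0.9]  # u_theta_max / u_inf
R_values = [0.125, 0.25, 0.5, 0.75, 1.0]                             # R / c
y0_values = [-0.3, -0.1, 0.0, 0.1, 0.3]                              # y_0 / c

# Every (u_theta_max/u_inf, R/c, y_0/c) combination on the grid
grid = list(itertools.product(u_values, R_values, y0_values))

# Assumed sampling: draw 50 distinct training cases with a fixed seed
rng = random.Random(0)
training_cases = rng.sample(grid, 50)
```

With these lists the grid holds 10 × 5 × 5 = 250 combinations, so 50 training cases correspond to one fifth of the grid points.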

For each case, we collect 500 snapshots of the flow field for \(u_{\infty }t/c \in [0.85,5.1]\), which covers the process from the vortex approaching the airfoil to its moving away from the trailing edge. Here, \(u_{\infty }t/c=0\) refers to the initial time at which the vortex is at \(x_{0}/c=-2\). The snapshots for \(u_{\infty }t/c \in [0,0.85)\) are not used in the present analysis to remove the start-up period of the simulation. For a single parameter set \((u_{\theta {\max }}/u_{\infty },R/c,y_{0}/c)\), the data sizes of aerodynamic force coefficients, pressure over surface, and two-dimensional vorticity field data amount to approximately 1 MB, 15 MB, and 500 MB, respectively. If we use 100 training cases with all 500 snapshots of two-dimensional wake data, the training data size becomes approximately 50 GB for a single machine learning model, which is quite large with respect to storage and computation.
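
The storage estimate above follows from simple arithmetic. The sketch below reproduces it under two assumptions that are not stated in the text: 1 GB is taken as 1000 MB, and the data size is taken to scale linearly with the number of snapshots retained per case. The names are illustrative.

```python
# Approximate per-case data sizes quoted in the text, in MB (all 500 snapshots)
SIZE_PER_CASE_MB = {"forces": 1, "surface_pressure": 15, "vorticity": 500}

def training_set_size_gb(n_cases, per_case_mb, snapshot_fraction=1.0):
    """Estimated storage in GB (1 GB = 1000 MB) for n_cases training cases.

    snapshot_fraction is the assumed share of the 500 snapshots that is kept.
    """
    return n_cases * per_case_mb * snapshot_fraction / 1000.0

# 100 cases with all 500 vorticity snapshots each
full = training_set_size_gb(100, SIZE_PER_CASE_MB["vorticity"])  # 50.0

# 50 cases with 50 of the 500 snapshots each
reduced = training_set_size_gb(50, SIZE_PER_CASE_MB["vorticity"], 50 / 500)
```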

3 Flow physics

The present vortex-impinged airfoil wake exhibits rich dynamics influenced by the vortex velocity, size, and position. In this section, we present the flow physics induced by a variety of vortex disturbances.

Fig. 3

Effect of the largest rotational velocity of vortical disturbance. a Lift coefficients, b drag coefficients, and c vorticity fields for vortical disturbances of \((R/c, y_{0}/c)=(0.5, 0.1)\) and \(u_{\theta \textrm{max}}/u_{\infty }=-0.7,-0.3,0.3\), and 0.7

The maximum rotational velocity of the vortex disturbance is one of the most important characteristics affecting the vortex–airfoil interaction. Here, we investigate the influence of the maximum rotational velocity of the vortex on \(C_L\), \(C_D\), and vorticity fields when \((R/c,y_{0}/c)=(0.5,0.1)\). As depicted in Fig. 3a, b, a positive (counterclockwise) vortex generally induces a transient increase in \(C_L\) and \(C_D\) when it impinges on the leading edge of the airfoil. A secondary negative peak is then introduced when the center of the vortical disturbance passes the center of the airfoil. A similar but reversed trend is observed for a negative (clockwise) vortex: the initial decrease in lift is followed by the vortex tail-induced lift increase.

For a positive vortex with two different magnitudes of the vortex rotational velocity, the first peaks of \(C_L\) are reached at nearly the same time, as presented in Fig. 3a. However, the magnitude difference causes a temporal shift of the secondary peak: the peak with \(u_{\theta \textrm{max}}/u_{\infty }=0.7\) is reached at \(u_{\infty }t/c\approx 2.6\) while that with \(u_{\theta \textrm{max}}/u_{\infty }=0.3\) is reached at \(u_{\infty }t/c\approx 3.0\). This is because a stronger positive vortex produces a stronger interaction with the pre-existing negative vorticity on the suction side of the airfoil, forming a large negative vortex that detaches from the airfoil afterward. Similar to the positive disturbance cases, the larger fluctuation induced by a stronger negative vortex gives rise to an earlier secondary peak. For \(C_D\), we observe a time history similar to that of \(C_L\) for the positive disturbance, while the magnitudes of variation are much smaller than those of \(C_L\).

Fig. 4

Effect of vortex size. a Lift coefficients, b drag coefficient, and c vorticity fields for vortical disturbances of \((u_{\theta \textrm{max}}/u_{\infty }, y_{0}/c)=(0.3, 0.1)\) and \(R/c=0.125,0.25,0.5\), and 0.75

The dependence of the flow field response on the vortex size is also examined, as shown in Fig. 4. We choose the same vortex strength and vertical position as \((u_{\theta \textrm{max}}/u_{\infty }, y_{0}/c)=(0.3, 0.1)\) for comparison. The \(C_L\) and \(C_D\) histories experience the same trend of first increasing and then decreasing among different vortex sizes. By increasing the vortex size, the first peaks of \(C_L\) and \(C_D\) appear earlier because a vortex with a larger radius encounters the airfoil earlier.

The changes in the vorticity fields caused by the different sizes of vortices are also presented in Fig. 4c. When a small vortical disturbance (\(R/c=0.125\)) impinges on the airfoil, the whole vortex passes over the suction side of the airfoil and induces mild fluctuation in the flow field. As the size of the vortex becomes larger, the vortex splits into two structures that advect over the suction side and the pressure side. The positive vorticity around the trailing edge is rolled up and interacts with the wakes, thus affecting the evolution of the wake region.

Fig. 5

Effect of vortex position. a Lift coefficients, b drag coefficients, and c vorticity fields for vortical disturbances of \((u_{\theta \textrm{max}}/u_{\infty }, R/c)=(-0.5, 0.5)\) and \(y_{0}/c=0.3,0\), and \(-0.3\)

In addition to the maximum rotational velocity and the size of the vortical disturbance, the transient dynamics are also strongly influenced by whether the disturbance vortex passes above or below the airfoil. Here, we investigate three vertical positions \(y_{0}/c\in \{-0.3,0,0.3\}\) with a negative vortical disturbance \((u_{\theta \textrm{max}}/u_{\infty }, R/c)=(-0.5, 0.5)\), as shown in Fig. 5. For \(y_{0}/c=0.3\), the disturbance passes over the airfoil, where a large portion of negative disturbance passes through the suction side of the airfoil, introducing a large jump in \(C_D\) as the first peak. For \(y_{0}/c=0\), the negative vortical disturbance is split into two parts as it passes around the airfoil. At \(u_{\infty }t/c=2.55\), the large positive vorticity attached on the pressure side of the airfoil produces the second peak in \(C_L\). For the case of \(y_{0}/c=-0.3\) where the disturbance passes below the airfoil, the variation is mostly dominated by the interaction along the pressure side of the airfoil, and the drop and the increment of \(C_L\) and \(C_D\) occur at the same time.

4 Methods

We develop machine-learning models to estimate aerodynamic characteristics that cover a variety of force and wake dynamics from sparse sensors. Constructing a robust model suitable for the vast parameter space in Fig. 2 is challenging. To estimate different types of nonlinear wake responses from limited training data, we consider several strategies with regard to machine-learning model design and training methods for reproducing the transient dynamics. For all machine-learning models used in the current study, three-fold cross-validations are performed, ensuring the convergence of the estimations in terms of data distribution.
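
A three-fold cross-validation split can be sketched as follows. The text does not state whether folds are formed over cases or over snapshots; this sketch assumes folds over case indices, and the function name, seed, and shuffling are illustrative choices.

```python
import random

def three_fold_splits(n_cases, seed=0):
    """Return three (training, validation) index splits over the case indices."""
    indices = list(range(n_cases))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into three folds of near-equal size
    folds = [indices[k::3] for k in range(3)]
    splits = []
    for k in range(3):
        validation = folds[k]
        training = [i for j, fold in enumerate(folds) if j != k for i in fold]
        splits.append((training, validation))
    return splits

# Example: 50 training cases, each used for validation exactly once
splits = three_fold_splits(50)
```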

Fig. 6

Overview of the present estimation problems. The inputs are pressure sensor measurements on the airfoil surface, and the outputs are a \(C_D\) or \(C_L\), b \(C_P\), and c the vorticity field. \(C_L\), \(C_D\), and \(C_P\) are estimated using separate multi-layer perceptron models, while the vorticity field is estimated using the combination of a multi-layer perceptron and a convolutional neural network

An overview of the present machine-learning-based estimation approaches is shown in Fig. 6. The input is the sensor measurements \({\varvec{s}}^{n\mathrm {\Delta } t}\) spanning over \(n\mathrm {\Delta } t\). We first consider a multi-layer perceptron (MLP) to build the relationship between the sensor measurements and the aerodynamic force coefficients over time. Since the degrees of freedom of the input and output are \(\mathcal{O}(1)\), we can easily employ a fully-connected neural network to construct such a relationship. Similarly, we also use an MLP to estimate the pressure distribution over the airfoil surface. However, an MLP can be challenging to use for problems with high degrees of freedom due to its fully-connected structure [21, 30]. To access the two-dimensional vorticity field (with degrees of freedom \(\approx \mathcal{O}(10^{3}-10^{4})\)), a model that can effectively extract spatial information at a manageable computational cost is required. To address this point, we incorporate a two-dimensional convolutional neural network (CNN) to provide qualitative estimations while maintaining a low computational cost. When the MLP is coupled with the CNN, the machine-learning model can reconstruct the flow field from a limited number of sensor measurements. Moreover, due to the transient nature of the current vortex–airfoil interaction problem, incorporating the dynamics into the model construction aids in accurate estimation. For this reason, the long short-term memory (LSTM) algorithm [31] assisted with transfer learning serves as an effective method to estimate the flow fields from time traces. Hence, we embed LSTM into the aforementioned MLP and MLP-CNN models. In what follows, we introduce the algorithms of these machine learning methods.

Fig. 7

a A minimum unit of perceptron. b Two-dimensional convolutional operation

4.1 Multi-layer perceptron

In the present study, the input sensor measurements are first fed into a multi-layer perceptron (MLP) [32]. For the estimation of aerodynamic forces (Sect. 5.1) and pressure distribution over the airfoil surface (Sect. 5.2), the MLP \(\mathcal{M}\) is used as a function approximator between the input sensor measurements \({\varvec{s}}\) and the output variables \({\varvec{q}}\) such that \({\varvec{q}}\approx \mathcal{M}({\varvec{s}})\). For the estimation of the two-dimensional vorticity field \(\omega \) (Sect. 5.3), MLP plays a role of a nonlinear function mapping the low-dimensional sensor information \({\varvec{s}}\in {\mathbb {R}}^{n_{\varvec{s}}}\) to the high-dimensional variable in the model. In addition, we incorporate LSTM [31] into the machine-learning models to capitalize on the dynamical information of sensors.

In MLP, the input at layer \((l-1)\) is multiplied by weights \(\varvec{W}\), then linearly combined, and passed through a nonlinear activation function \(\varphi \) as an output to the next layer (l),

$$\begin{aligned} {q}^{(l)}_{i}=\varphi \left( \sum _{j}{W}_{ij}^{(l)}{q}^{(l-1)}_{j} + b_i^{(l)}\right) , \end{aligned}$$
(2)

where b is a bias added at each layer as illustrated in Fig. 7a. We utilize the ReLU function [33] for \(\varphi \), which is known to be effective for addressing the vanishing gradient problems in deep neural networks. For determining the weights W, the Adam algorithm [34] is utilized. In the present model training, early stopping [35] with 20 training epochs is also applied to avoid overfitting the machine-learning model.
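
The layer update in Eq. (2) can be written compactly with NumPy. This is a minimal sketch of the forward pass only, not the trained model: the layer sizes, the random initialization, and the use of ReLU on the output layer are assumptions, since the text does not specify the architecture or the output activation.

```python
import numpy as np

def relu(x):
    """ReLU activation, used here as the function phi in Eq. (2)."""
    return np.maximum(x, 0.0)

def mlp_forward(s, weights, biases):
    """Apply Eq. (2) layer by layer: q^(l) = phi(W^(l) q^(l-1) + b^(l))."""
    q = np.asarray(s, dtype=float)
    for W, b in zip(weights, biases):
        q = relu(W @ q + b)
    return q

# Assumed layer sizes: 8 sensor inputs, two hidden layers, 1 output
rng = np.random.default_rng(0)
layer_sizes = [8, 32, 32, 1]
weights = [rng.normal(scale=0.1, size=(n_out, n_in))
           for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n_out) for n_out in layer_sizes[1:]]

s = rng.normal(size=8)  # placeholder sensor vector, not simulation data
q = mlp_forward(s, weights, biases)
```

In practice the weights would be fitted with an optimizer such as Adam rather than drawn at random, and a regression output layer is often left linear.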

4.2 Convolutional neural network

Since the full flow field estimation requires a large number of spatial grid points (high spatial degrees of freedom), the computational burden is substantial for the direct application of MLP to the full flow field reconstruction [36, 37]. To address this issue, we combine MLP and a two-dimensional convolutional neural network (CNN) [38]. The CNN enables regression while greatly reducing computational costs through filter sharing. The two-dimensional convolutional operation is illustrated in Fig. 7b, whose internal procedure is expressed as

$$\begin{aligned} q^{(l)}_{ijg}=\varphi \left( \sum _{k=1}^F\sum _{p=0}^{H-1}\sum _{r=0}^{H-1}h^{(l)}_{prkg}q^{(l-1)}_{i+p-C,j+r-C,k}+b_g^{(l)}\right) , \end{aligned}$$
(3)

where \(C=\lfloor H/2\rfloor \), H is the width and height of the filter, F is the number of input channels, g is the index of the output channel, b is the bias, and \(\varphi \) is the activation function. The input sensor measurements \({\varvec{s}}\in {\mathbb {R}}^{n_{\varvec{s}}}\) are transformed to a high-dimensional representation \(\hat{\varvec{q}} \in {\mathbb {R}}^{n_{\hat{\varvec{q}}}}\) through the MLP for the wake estimation. This representation is then reshaped into a two-dimensional matrix form \(\hat{\varvec{q}} \in {\mathbb {R}}^{n_{\hat{x}} \times n_{\hat{y}}}\) so that the data can be handled with a two-dimensional CNN, as illustrated in Fig. 6a. Through the CNN process in Eq. 3 and the upsampling operation, the present model extracts the relationship between the input sensors and the vorticity field \({\omega } \in {\mathbb {R}}^{n_x \times n_y}\). As with the MLP training, we apply the ReLU function [33] as the nonlinear activation function, the Adam algorithm [34] for updating filters, and early stopping [35] to prevent overfitting.
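
The convolution in Eq. (3) can be sketched with explicit loops over the output grid. The sketch assumes zero padding at the boundaries and a ReLU activation; the padding choice, the array layout, and the function name are assumptions made for illustration.

```python
import numpy as np

def conv2d_layer(q_in, h, b):
    """One convolutional layer following Eq. (3).

    q_in: input of shape (nx, ny, F), with F input channels
    h:    filters of shape (H, H, F, G), with G output channels
    b:    biases of shape (G,)
    Returns an array of shape (nx, ny, G).
    """
    nx, ny, _ = q_in.shape
    H, G = h.shape[0], h.shape[3]
    C = H // 2
    # Zero padding (assumed) so that indices i+p-C and j+r-C stay in range
    padded = np.pad(q_in, ((C, C), (C, C), (0, 0)))
    q_out = np.empty((nx, ny, G))
    for i in range(nx):
        for j in range(ny):
            # padded[i+p, j+r, k] corresponds to q_in[i+p-C, j+r-C, k]
            patch = padded[i:i + H, j:j + H, :]
            q_out[i, j, :] = np.tensordot(patch, h, axes=([0, 1, 2], [0, 1, 2])) + b
    return np.maximum(q_out, 0.0)  # ReLU activation

# Example: a 3x3 filter that passes the central value through unchanged
q_in = np.arange(12.0).reshape(3, 4, 1)
h = np.zeros((3, 3, 1, 1))
h[1, 1, 0, 0] = 1.0
q_out = conv2d_layer(q_in, h, np.zeros(1))
```

Because the same filter h is applied at every grid point, the number of parameters is independent of the field size, which is the filter sharing mentioned above.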

4.3 Long short-term memory-assisted transfer learning

To improve the present estimation, we also utilize the long short-term memory (LSTM) algorithm [31]. LSTM is one of the recurrent neural network methods, which is suitable for predicting temporal behaviors from time-series data. Since LSTM can hold the time-series data as memory inside the function referred to as cell, the implementation of LSTM can greatly help with the present problem that is dependent on past flow states due to its transient nature.

An LSTM layer is constructed from four functions: a cell C, an input gate d, an output gate o, and a forget gate g. These functions play important roles in deciding how past information is incorporated to predict the output variables. The input gate d determines how much of the current information from the input of cell \(e_t\) is used for prediction,

$$\begin{aligned} d_t&=\sigma (W_d\cdot [\tilde{q}_{t-1}, e_t]+\beta _d), \end{aligned}$$
(4)

where \(\tilde{q}\) is the output of the cell, W and \(\beta \) represent the weights and the bias, respectively, for each gate denoted by its subscript; the subscripts t and \(t-1\) represent the time indices, and \(\sigma \) is the sigmoid function. Here, the concatenation of two inputs m and n in a model is denoted as [m, n]. In parallel, the LSTM also considers how much of the past information is kept from the previous cell state \(C_{t-1}\) using the forget gate g,

$$\begin{aligned} g_t&=\sigma (W_g\cdot [\tilde{q}_{t-1}, e_t]+\beta _g). \end{aligned}$$
(5)

With the temporal cell state at the current time step,

$$\begin{aligned} \widetilde{C}_{t}&=\tanh ({W_c\cdot [\tilde{q}_{t-1}, e_t]+\beta _c}), \end{aligned}$$
(6)

and the previous cell state \(C_{t-1}\), the current cell state \(C_t\) is determined by balancing the input gate d and the forget gate g,

$$\begin{aligned} C_t&=g_t C_{t-1}+d_t\widetilde{C}_t. \end{aligned}$$
(7)

Note that the sigmoid functions used for the input and the output gates play important roles in avoiding gradient vanishing problems. At the output of the LSTM layer, the amount of information at the cell state \(C_t\) being leveraged for short-term prediction (i.e., the output at the next step \({\tilde{q}}_t\)) is assessed using the output gate o with

$$\begin{aligned} o_t= & {} \sigma (W_o\cdot [\tilde{q}_{t-1}, e_t]+\beta _o), \end{aligned}$$
(8)
$$\begin{aligned} {\tilde{q}}_t= & {} o_t\tanh ({C_t}). \end{aligned}$$
(9)

With this formulation, the LSTM is able to predict the variable at the next step \({\tilde{q}}_t\) while considering the long-term memory influence with the concept of cell state C.
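
Equations (4)–(9) amount to a single time-step update, which can be sketched as follows. The parameter dictionary, its key names, and the dimensions are hypothetical and chosen only to mirror the symbols above; the weights are random placeholders rather than trained values.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(e_t, q_prev, C_prev, params):
    """One LSTM update following Eqs. (4)-(9)."""
    z = np.concatenate([q_prev, e_t])                         # [q_{t-1}, e_t]
    d_t = sigmoid(params["W_d"] @ z + params["beta_d"])       # input gate, Eq. (4)
    g_t = sigmoid(params["W_g"] @ z + params["beta_g"])       # forget gate, Eq. (5)
    C_tilde = np.tanh(params["W_c"] @ z + params["beta_c"])   # Eq. (6)
    C_t = g_t * C_prev + d_t * C_tilde                        # cell state, Eq. (7)
    o_t = sigmoid(params["W_o"] @ z + params["beta_o"])       # output gate, Eq. (8)
    q_t = o_t * np.tanh(C_t)                                  # cell output, Eq. (9)
    return q_t, C_t

# Assumed sizes: 8 inputs (e.g. sensor values) and 4 hidden units
n_in, n_hidden = 8, 4
rng = np.random.default_rng(0)
params = {}
for gate in ("d", "g", "c", "o"):
    params["W_" + gate] = rng.normal(scale=0.1, size=(n_hidden, n_hidden + n_in))
    params["beta_" + gate] = np.zeros(n_hidden)

# Run three time steps, carrying the output and the cell state forward
q_t, C_t = np.zeros(n_hidden), np.zeros(n_hidden)
for _ in range(3):
    e_t = rng.normal(size=n_in)  # placeholder input
    q_t, C_t = lstm_step(e_t, q_t, C_t, params)
```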

Fig. 8

Long short-term memory-assisted transfer learning

Here, we combine the high-dimensional representation of the input measurements obtained through the MLP \(\tilde{\varvec{q}}^{n\Delta t}\) with two previous time sequences extracted by LSTMs \(\{\tilde{\varvec{q}}^{(n-1)\Delta t},\tilde{\varvec{q}}^{(n-2)\Delta t}\}\) such that \({\tilde{\varvec{q}}=[\tilde{\varvec{q}}^{n\Delta t}+\tilde{\varvec{q}}^{(n-1)\Delta t}+\tilde{\varvec{q}}^{(n-2)\Delta t}}]\), as illustrated in Fig. 8. This combined vector \(\tilde{\varvec{q}}\) with three time steps is then provided to the MLP layer of the force estimation and the \(C_p\) estimation, or the two-dimensional CNN layer of the vorticity reconstruction task.

Moreover, we utilize the concept of transfer learning for the LSTM-assisted network. Transfer learning can facilitate the training process by setting appropriate initial weights [39]. The present strategy of the LSTM-assisted transfer learning is graphically summarized in Fig. 8. In the present study, the weights of the pre-trained MLP \({\varvec{w}}_\mathcal{M}\) are adopted as initial weights of the second model \(\mathcal{F}_2\), which has two sensor input gates \({\varvec{s}}^{(n-1)\Delta t}\) and \({\varvec{s}}^{n\Delta t}\). The high-dimensional feature of input sensor measurements \(\hat{\varvec{q}}\) from the MLP part of the model is merged with that from the LSTM. Once the training for the second model \(\mathcal{F}_2\) is completed, the optimized weights of the second model \({\varvec{w}}_{\mathcal{F}_2}\) are in turn transferred to the third model \(\mathcal{F}_3\), which considers sensor measurements at three different time steps \({\varvec{s}}^{(n-2)\Delta t}\), \({\varvec{s}}^{(n-1)\Delta t}\), and \({\varvec{s}}^{n\Delta t}\). The weight optimizations through these operations are mathematically expressed as

$$\begin{aligned} {\varvec{w}}_{\mathcal{F}_1}&= \textrm{argmin}_{{\varvec{w}}_{\mathcal{F}_1}}||{\varvec{q}}-{\mathcal{F}_1}({\varvec{s}}^{n\Delta t};{\varvec{w}}_{\mathcal{F}_1})||_2, \end{aligned}$$
(10)
$$\begin{aligned} {\varvec{w}}_{\mathcal{F}_2}&= \textrm{argmin}_{{\varvec{w}}_{\mathcal{F}_2}}||{\varvec{q}}-\mathcal{F}_2([{\varvec{s}}^{n\Delta t},{\varvec{s}}^{(n-1)\Delta t}]; {{\varvec{w}}_{\mathcal{F}_2}}({\varvec{w}}'_{\mathcal{F}_1}))||_2, \end{aligned}$$
(11)
$$\begin{aligned} {\varvec{w}}_{\mathcal{F}_3}&= \textrm{argmin}_{{\varvec{w}}_{\mathcal{F}_3}}||{\varvec{q}}-\mathcal{F}_3([{\varvec{s}}^{n\Delta t},{\varvec{s}}^{(n-1)\Delta t},{\varvec{s}}^{(n-2)\Delta t}]; {\varvec{w}}_{\mathcal{F}_3}({\varvec{w}}'_{\mathcal{F}_2}))||_2, \end{aligned}$$
(12)

where \({\varvec{w}}'_{\mathcal{F}_1}\) denotes the weights assigned to the common part of the first MLP-CNN model and the second model, and \({\varvec{w}}'_{\mathcal{F}_2}\) represents the weights assigned to the common part of the second MLP-LSTM-CNN model and the third model. Since transfer learning can aid in the computational reduction by enabling fast convergence of weights [40, 41], we can expect accurate flow reconstruction with minimal training costs using transfer learning with LSTM.
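
The weight transfer between successive models can be sketched as copying the parameters of the shared part before training resumes. The sketch assumes that model parameters are stored in dictionaries keyed by layer name and that the common part is identified by matching names and shapes; the layer names and sizes are hypothetical.

```python
import numpy as np

def transfer_weights(source, target):
    """Initialize `target` with the parameters it shares with `source`.

    Parameters are copied when both the name and the shape match; all other
    entries of `target` keep their own initial values.
    """
    initialized = {name: w.copy() for name, w in target.items()}
    for name, w in source.items():
        if name in initialized and initialized[name].shape == w.shape:
            initialized[name] = w.copy()
    return initialized

rng = np.random.default_rng(0)

# Hypothetical pre-trained first model: MLP part only
w_F1 = {"mlp_W1": rng.normal(size=(16, 8)), "mlp_b1": rng.normal(size=16)}

# Hypothetical second model: same MLP part plus a new LSTM branch
w_F2_init = {
    "mlp_W1": np.zeros((16, 8)),
    "mlp_b1": np.zeros(16),
    "lstm_W_d": rng.normal(size=(16, 24)),
}

# Starting point for training the second model
w_F2_start = transfer_weights(w_F1, w_F2_init)
```

The same helper could be applied again to pass the trained weights of the second model on to the third.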

5 Results and discussion

5.1 Aerodynamic forces

Let us first present the machine-learning-based estimation of \(C_L\) and \(C_D\) from the pressure sensor inputs. Here, we prepare machine learning models \(\mathcal{F}\) for each coefficient such that \({C_L}=\mathcal{F}_L({\varvec{s}}(t))\) and \({C_D}=\mathcal{F}_D({\varvec{s}}(t))\). The estimation results for \(C_D\) and \(C_L\) are shown in Fig. 9. When training with only 50 training cases with each case having 50 snapshots, the model achieves a qualitative estimation of \(C_D\). Here, we denote the number of cases as \(n_\textrm{case}\), and the number of snapshots per case as \(n_\textrm{ss}\). We also quote the \(L_2\) error norm \(\varepsilon \equiv ||{\varvec{f}}_\textrm{Ref}-{\varvec{f}}_\textrm{ML}||_2/||{\varvec{f}}_\textrm{Ref}-\overline{\varvec{f}}||_2\), where \({\varvec{f}}_\textrm{Ref}\) and \({\varvec{f}}_\textrm{ML}\) are the reference and the machine-learning-based estimation, respectively, of variable \({\varvec{f}}\). Note that this error is normalized by the fluctuation of the variable about its steady-state value \(\overline{\varvec{f}}\). For \(C_L\) and \(C_D\), the error is measured over the time range \(u_{\infty }t/c \in [0.85,5.1]\) for each case.
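
The error norm \(\varepsilon \) defined above can be computed as follows. This is a minimal sketch; the function name and the sample values are illustrative and are not results from the paper.

```python
import numpy as np

def normalized_l2_error(f_ref, f_ml, f_bar):
    """Error norm ||f_ref - f_ml||_2 / ||f_ref - f_bar||_2.

    f_bar is the steady-state value of the variable (scalar or array).
    """
    f_ref = np.asarray(f_ref, dtype=float)
    f_ml = np.asarray(f_ml, dtype=float)
    return np.linalg.norm(f_ref - f_ml) / np.linalg.norm(f_ref - f_bar)

# Illustrative time series of a coefficient around a steady-state value of 2.0
f_ref = np.array([1.0, 2.0, 3.0])
perfect = normalized_l2_error(f_ref, f_ref, 2.0)                 # exact estimate
baseline = normalized_l2_error(f_ref, np.full(3, 2.0), 2.0)      # steady-state guess
```

With this normalization, an estimate that simply returns the steady-state value gives an error of one, so values well below one indicate that the fluctuation itself is being captured.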

Fig. 9

Estimation of \(C_D\) and \(C_L\) obtained with MLP and MLP-LSTM models for the case of \((u_{\theta \textrm{max}}/u_{\infty }, R/c, y_{0}/c)=(-0.19, 0.83, 0.28)\)

The estimation results show that the positions of the peak and trough of \(C_D\) induced by the vortex–airfoil wake interaction are qualitatively predicted, yet the exact values deviate from the DNS result. Increasing the number of training cases \(n_\mathrm{{case}}\) improves the estimation performance. Enhanced agreement between the estimation and DNS is also achieved when increasing the number of snapshots to 500, as illustrated in Fig. 9a. The enhancement in the data diversity leads to a drastic decrease in the prediction error. Compared with 50 training cases, utilizing 100 training cases yields a \(67\%\) reduction in test error. The expansion of training cases is beneficial for prediction performance because the machine learning model can cover a larger parameter space, which assists in better predicting unseen test cases. Yet, considering the vast parameter space, 100 training cases are very few.

To obtain an accurate reconstruction while using as little data as possible, we then incorporate the transfer-learning-based LSTM into the model for \(C_D\), as shown in Fig. 9a. Due to the transient nature of the current vortex–airfoil interaction problem, the present transfer-learning-based LSTM is able to build a reliable connection between sensor input and output based on historical information. For all three examples, using the MLP-LSTM model gives rise to a 10% decrease in the estimation error.

Estimation for \(C_L\) is presented in Fig. 9b. Enhancement in the reconstruction of \(C_L\) from increasing the amount of training data is also shown in Fig. 9b. The new MLP-LSTM model reduces the test error to 0.215, 0.158, and 0.148 for \((n_\textrm{case}, n_\textrm{ss})=(50,50)\), \((n_\textrm{case}, n_\textrm{ss})=(50,500)\) and \((n_\textrm{case}, n_\textrm{ss})=(100,50)\), respectively. Similar to the \(C_D\) estimation, the transfer-learned-LSTM architecture is also useful in the estimation of \(C_L\). We also note that the reconstruction of \(C_L\) is usually better than that of \(C_D\), which is due to the variation of \(C_L\) over time being much larger than that of \(C_D\).

5.2 Surface pressure distribution

Next, we perform the MLP-based estimation of the pressure distribution \(C_p\) over the airfoil surfaces. Representative snapshots of the vorticity field for a test case when a vortical disturbance passes around the airfoil are shown in the first row of Fig. 10. Similar to the aerodynamic forces \(C_L\) and \(C_D\), the reconstruction performance is strongly influenced by the number of cases \(n_\textrm{case}\), the number of snapshots per case \(n_\textrm{ss}\), as well as whether transfer-learned LSTM is incorporated. When training the MLP-LSTM model with 50 cases and 250 snapshots per case, a qualitative reconstruction is achieved for \(C_p\). As shown in the first row of Fig. 10, the estimated \(C_p\) at both the upper and lower surfaces of the airfoil is in agreement with the DNS. As we increase the number of cases from 50 to 100 without utilizing transfer-learned LSTM, this machine-learning model also reconstructs \(C_p\) in a reasonable manner, as shown in the second row of Fig. 10. However, by comparing the reconstruction of \(C_p\) for \((n_\textrm{case}, n_\textrm{ss})=(100,250)\) without LSTM against the results with LSTM implemented, it is found that the use of transfer-learned LSTM greatly improves reconstruction, achieving enhanced performance with only half of the training data. Increasing the number of snapshots from 250 to 500 for 100 training cases further improves the estimation and achieves a performance similar to that of \((n_\textrm{case}, n_\textrm{ss},\textrm{LSTM})=(50,250,\textrm{Y})\), as shown in the last row of Fig. 10. Note that although we are using 100 training cases and 500 snapshots per case, the training data is still small compared to the broad parameter space of the test cases.

Fig. 10

Machine-learning-based estimation of the pressure distribution over an airfoil surface. The pressure coefficient \(C_p\) at \(u_{\infty }t/c=\) (a) 0.85, (b) 3.82, and (c) 4.67. Results are shown for the case of \((u_{\theta \textrm{max}}/u_{\infty }, R/c, y_{0}/c)=(0.35, 0.95, -0.15)\)

5.3 Vorticity field

We employ the present machine-learning techniques to reconstruct the two-dimensional vorticity field from sensor measurements on an airfoil using the MLP-CNN model. Analogous to the results in Sect. 4.2, the combination of MLP and CNN is suitable for estimating the vortical flow from sensors. The reconstruction of the spatially discretized vorticity field \({\omega }\in {\mathbb {R}}^{100\times 200}\) is summarized in Fig. 11. The present model successfully captures the vortical disturbance at \(u_{\infty }t/c=0.85\). The location and the strength of the vortex are well reconstructed. The interaction between the vortex disturbance and the flow field around the airfoil at \(u_{\infty }t/c=2.12\) is also reproduced well. This is approximately the time at which \(C_L\) and \(C_D\) drop to their minimum values, serving as an important dynamic transition point. However, the wakes behind the trailing edge at \(u_{\infty }t/c=5.10\) are not accurately reconstructed because these wake structures are farther away from the airfoil during this period. Because the sensors on the airfoil surface do not observe a sizeable change in pressure, reconstructing far-field wakes is difficult, as expected.

Fig. 11

Reconstructed vorticity flow field with large region training and windowed region training. Results are shown for \((u_{\theta \textrm{max}}/u_{\infty }, R/c, y_{0}/c)=(0.65, 0.40, 0)\)

Let us now focus on the critical near-wake region around the airfoil, since this region primarily determines the unsteady loading. Considering only the near-field region enables us to greatly reduce the size of the training data and the associated computational costs. Results from training with a smaller region are presented in Fig. 11. The windowed training model also provides improved estimations with lower error. The averaged \(L_2\) error for the test case decreases from 0.329 (large region training) to 0.261 (windowed region training), which shows that the size of the region influences the accuracy of the wake-field estimation and that a smaller window achieves a lower error at modest computational cost.
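
As a minimal sketch of how a near-field window could be extracted and compared against the full domain, the snippet below crops a \(100\times 200\) field and evaluates a relative \(L_2\) error on both regions. The error definition \(\Vert \omega _\textrm{ref}-\omega _\textrm{est}\Vert _2/\Vert \omega _\textrm{ref}\Vert _2\), the window bounds, the function names, and the synthetic fields are all assumptions made for illustration. The snippet only shows the cropping and the error evaluation, not the windowed training itself.

```python
import numpy as np

def relative_l2_error(ref, est):
    """Relative L2 error, assumed here to be ||ref - est||_2 / ||ref||_2."""
    return float(np.linalg.norm(ref - est) / np.linalg.norm(ref))

def crop_window(field, row_range, col_range):
    """Keep only the rows/columns inside a rectangular near-field window."""
    return field[row_range[0]:row_range[1], col_range[0]:col_range[1]]

# Synthetic 100 x 200 stand-in for a vorticity snapshot (not DNS data).
x, y = np.meshgrid(np.linspace(-1.0, 3.0, 200), np.linspace(-1.0, 1.0, 100))
omega_ref = np.exp(-((x - 0.5) ** 2 + y ** 2) / 0.05)
# Synthetic estimate whose only error is a spurious far-wake structure.
omega_est = omega_ref + 0.1 * np.exp(-((x - 2.5) ** 2 + y ** 2) / 0.05)

rows, cols = (25, 75), (0, 125)  # hypothetical window bounds (grid indices)
err_full = relative_l2_error(omega_ref, omega_est)
err_window = relative_l2_error(crop_window(omega_ref, rows, cols),
                               crop_window(omega_est, rows, cols))
print(f"full domain: {err_full:.3f}, window: {err_window:.3e}")
```

In this synthetic setting the error is confined to the far wake by construction, so the windowed error is much smaller than the full-domain one; with real data the difference depends on where the estimation errors actually occur.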

Fig. 12

Dependence of the reconstruction accuracy on the present enhancement methods with windowed training for the vorticity wake problem. Results are shown for the case \((u_{\theta \textrm{max}}/u_{\infty }, R/c, y_{0}/c)=(0.72, 0.64, -0.10)\)

Moreover, the reconstruction of the vorticity field can also be enhanced by increasing the amount of training data and utilizing the transfer-learned LSTM, as summarized in Fig. 12. For all time series, a qualitative and insightful reconstruction of the vorticity field is achieved with as few as 10 snapshots per case, as shown in the second column of Fig. 12. When we apply the transfer-learned LSTM to the same dataset, a reduction of up to 33% in the \(L_2\) error is achieved. Additionally, increasing the number of snapshots to 50 or increasing the data diversity by using 100 training cases produces further improvements.

It is worth noting that the reconstruction accuracy is not uniform over the interaction process. For example, at \(u_{\infty }t/c=1.70\) and \(u_{\infty }t/c=2.55\), when the center of the disturbance is near the airfoil, the \(L_2\) errors are relatively high. The strong interaction causes complex morphological changes in the vorticity field, which increase the error. However, this does not indicate that the machine-learning model is unable to extract the crucial features of the flow field. Instead, the errors are partially due to a modest displacement of the vortical structures. In addition, the reconstructed vorticity field at \(u_{\infty }t/c=5.10\) shows that the transfer-learned LSTM is superior to increasing the data amount or diversity in estimating the small fluctuations behind the trailing edge. Based on the insights gained from this study, we deduce that the accuracy of the reconstruction improves when the influence of the disturbance is greater (strong disturbances with large sizes that interact close to the airfoil). Here again, the transfer-learned LSTM greatly improves the estimation for the overall dynamic process.
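
The sensitivity of a pointwise \(L_2\) error to small displacements can be illustrated with a synthetic example. The sketch below assumes an idealized Gaussian vortex of radius \(R\), a relative \(L_2\) error of the form \(\Vert \omega _\textrm{ref}-\omega _\textrm{est}\Vert _2/\Vert \omega _\textrm{ref}\Vert _2\), and hypothetical shift values \(d\); none of these are taken from the present data set.

```python
import numpy as np

def gaussian_vortex(x, y, x0, y0, radius):
    """Idealized vorticity blob exp(-r^2 / R^2) centered at (x0, y0)."""
    return np.exp(-((x - x0) ** 2 + (y - y0) ** 2) / radius ** 2)

def relative_l2_error(ref, est):
    """Relative L2 error, assumed here to be ||ref - est||_2 / ||ref||_2."""
    return float(np.linalg.norm(ref - est) / np.linalg.norm(ref))

# 100 x 200 grid; the coordinate ranges are illustrative assumptions.
x, y = np.meshgrid(np.linspace(-1.0, 3.0, 200), np.linspace(-1.0, 1.0, 100))
radius = 0.2
ref = gaussian_vortex(x, y, 0.5, 0.0, radius)

# The "estimate" has the exact shape and strength, but is shifted by d.
shifts = [0.0, 0.05, 0.1, 0.2]
errors = [relative_l2_error(ref, gaussian_vortex(x, y, 0.5 + d, 0.0, radius))
          for d in shifts]

for d, e in zip(shifts, errors):
    print(f"shift d/R = {d / radius:.2f} -> relative L2 error = {e:.3f}")
```

For this idealized profile on an unbounded domain, the relative error is \(\sqrt{2\,(1-e^{-d^2/(2R^2)})}\), so a shift of one quarter of the radius already yields an error of about 0.25 even though the vortex shape and strength are reproduced exactly. This is one reason why inspecting the reconstructed fields alongside the scalar error is informative.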

5.4 Influence of the sensor positions

Next, let us examine the estimation performance of the machine-learning models trained with different numbers and placements of sensors. As shown in Fig. 13, we consider 8 sensors (case 1), 3 sensors around the leading edge (case 2), 3 sensors on the top surface (case 3), 3 sensors around the trailing edge (case 4), and 3 sensors on the bottom surface (case 5). With 8 sensors (case 1), the lowest \(L_2\) error is achieved compared to the cases with 3 sensors, as expected. Among the 3-sensor configurations, case 5 with the bottom-surface sensors usually presents a lower error than cases 2–4 over the whole time range. This is likely because sensors on the pressure side can sense the vortical structures approaching the airfoil more easily and earlier than sensors on the suction side.

Fig. 13

Dependence of the \(L_2\) errors on the sensor positions. Cases 1–5 denote 8 uniform sensors, 3 leading edge sensors, 3 top surface sensors, 3 trailing edge sensors, and 3 bottom surface sensors, respectively. The machine-learning model with the condition of \((n_\textrm{case},n_\textrm{ss},\textrm{LSTM})=(50,50,\textrm{Y})\) is used

Fig. 14

Dependence of the reconstruction accuracy on the location of sensors for the reconstructed vorticity wake. The machine-learning model with the condition of \((n_\textrm{case},n_\textrm{ss},\textrm{LSTM})=(50,50,\textrm{Y})\) is used. As a test case, we use \((u_{\theta \textrm{max}}/u_{\infty }, R/c, y_{0}/c)=(0.65, 0.40, 0)\)

We also assess the estimation performance over time in Fig. 13. Before the vortical disturbance impinges on the airfoil (\(u_{\infty }t/c<2\)) and after the vortex moves away from the trailing edge of the airfoil (\(u_{\infty }t/c>4\)), we observe a relatively low \(L_2\) error. For \(u_{\infty }t/c \in [2,4]\), the complex interactions between the disturbance and the airfoil make the estimation more difficult than at other times. However, we note that the present model still achieves qualitative reconstructions even for the strong vortex–airfoil wake interaction process, as depicted in Fig. 14. These reconstructed snapshots correspond to the moment \(u_{\infty }t/c=2.12\) for \((u_{\theta \textrm{max}}/u_{\infty }, R/c, y_{0}/c)=(0.65, 0.40, 0)\). This implies that monitoring not only the scalar error measurement but also the reconstructed flow fields is essential for an appropriate assessment of machine-learning-based flow estimations. These results also provide practical insights into the choice of sensor locations. It is recommended that sensors be placed on both the suction and pressure sides for the present problem.
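
The comparison across time periods described above can be sketched as a phase-wise average of an error time series. The snippet below is illustrative only: the function name `phase_averaged_error`, the inclusive treatment of the interval boundaries, and the synthetic error curve are assumptions rather than quantities from the present study.

```python
import numpy as np

def phase_averaged_error(t, err, t_start=2.0, t_end=4.0):
    """Mean error before, during, and after the interaction interval.

    t is the convective time u_inf * t / c; the interval [t_start, t_end]
    is treated as inclusive, which is an assumption of this sketch.
    """
    t = np.asarray(t, dtype=float)
    err = np.asarray(err, dtype=float)
    masks = {
        "pre": t < t_start,
        "interaction": (t >= t_start) & (t <= t_end),
        "post": t > t_end,
    }
    return {name: float(err[mask].mean()) for name, mask in masks.items()}

# Synthetic placeholder for an L2 error history (not results from the paper).
t = np.linspace(0.0, 6.0, 61)
err = 0.1 + 0.2 * np.exp(-((t - 3.0) / 0.7) ** 2)

phases = phase_averaged_error(t, err)
for name, value in phases.items():
    print(f"{name}: mean L2 error = {value:.3f}")
```

Such a summary could be computed for each sensor configuration to compare cases 1–5 phase by phase, although a scalar average should still be read together with the reconstructed fields.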

5.5 Robustness against noisy sensor measurements

Let us evaluate the robustness of the machine-learning model against noisy sensor measurements. We add Gaussian noise \(\varvec{n}\) to the sensor input \(\varvec{s}\). Hence, the estimated output is expressed as

$$\begin{aligned} {\varvec{q}}_{n} = \mathcal{F}({\varvec{s}}+ \varvec{n}) \end{aligned}$$
(13)

where \({\varvec{q}}_{n}\) is the output of the model, \(\mathcal{F}\) is the model trained without noisy inputs, and \(\gamma = \Vert {\varvec{n}}\Vert /\Vert {\varvec{s}}\Vert \) is the magnitude of the noise.
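
As a minimal sketch of how a noise vector with a prescribed relative magnitude \(\gamma \) could be generated, the snippet below draws a Gaussian sample and rescales it so that \(\Vert {\varvec{n}}\Vert /\Vert {\varvec{s}}\Vert \) matches the target. It assumes the Euclidean norm over the sensor vector; the function name `add_scaled_noise`, the rescaling step, and the sensor values are hypothetical and not taken from the present study.

```python
import numpy as np

def add_scaled_noise(s, gamma, rng):
    """Return s + n, with Gaussian n rescaled so that ||n|| / ||s|| = gamma."""
    n = rng.standard_normal(s.shape)
    norm_n = np.linalg.norm(n)
    if gamma == 0.0 or norm_n == 0.0:
        return s.copy()  # noise-free input
    n *= gamma * np.linalg.norm(s) / norm_n
    return s + n

rng = np.random.default_rng(0)
# Eight hypothetical surface-pressure readings (placeholder values).
s = np.array([0.8, -0.3, 0.1, 0.5, -0.6, 0.2, 0.4, -0.1])

s_noisy = add_scaled_noise(s, 0.178, rng)
gamma_check = np.linalg.norm(s_noisy - s) / np.linalg.norm(s)
print(f"realized gamma = {gamma_check:.3f}")
```

The noisy vector would then be passed to the model trained on clean inputs, \(\mathcal{F}({\varvec{s}}+\varvec{n})\), and the output compared against the reference.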

The estimation performance for \(C_L\), \(C_D\), \(C_p\), and the vorticity field with noisy inputs (pressure measurements) is shown in Fig. 15. For all estimations, the error increases with the magnitude of the input noise, as expected. The reconstructed \(C_L\), \(C_D\), \(C_p\), and vorticity field are shown in Figs. 16, 17, and 18. In Fig. 16, the reconstructed \(C_L\) and \(C_D\) are smooth curves in the absence of input noise (\(\gamma = 0\)). With increasing \(\gamma \), \(C_L\) and \(C_D\) exhibit larger fluctuations, resulting in a larger \(L_2\) error, although the overall trend is still well reproduced.

Fig. 15

Machine-learning model robustness against noisy sensor measurements. (a) \(C_L\) and \(C_D\), \((n_\textrm{case},n_\textrm{ss},\textrm{LSTM})=(100,50,\textrm{Y})\), (b) \(C_p\), \((n_\textrm{case},n_\textrm{ss},\textrm{LSTM})=(100,500,\textrm{Y})\), and (c) vorticity field \(\omega \), \((n_\textrm{case},n_\textrm{ss},\textrm{LSTM})=(100,100,\textrm{Y})\)

Fig. 16

Reconstruction of \(C_L\) and \(C_D\) subjected to different levels of input noise. Results are shown for \((u_{\theta \textrm{max}}/u_{\infty }, R/c, y_{0}/c)=(0.96, 0.57, 0.21)\)

Fig. 17

Comparison of \(C_p\) subjected to different levels of input noise. Results are shown for \((u_{\theta \textrm{max}}/u_{\infty }, R/c, y_{0}/c)=(0.35, 0.93, -0.15)\)

Fig. 18

Comparison of estimated vorticity fields subject to different levels of input noise

The estimated \(C_p\) from noisy pressure inputs is depicted in Fig. 17. While the error increases with the noise magnitude, we also find that the error at \(u_{\infty }t/c=2.55\) is larger than that at \(u_{\infty }t/c=0.85\). This is caused by the intense wake-vortex gust interaction at \(u_{\infty }t/c=2.55\), which induces rapid changes in the pressure distribution on the airfoil surface. Although the error reaches approximately 0.5 at \(\gamma =0.178\), the overall trend of the \(C_p\) curve is well estimated, supporting the robustness of the present machine-learning model.

The reconstructed vorticity fields for the different levels of input noise are shown in Fig. 18. We show two cases of the vortical disturbance, \((u_{\theta \textrm{max}}/u_{\infty }, R/c, y_{0}/c)=(0.35, 0.95, -0.15)\) and \((-0.60, 0.91, -0.09)\). A large positive disturbance is introduced in the former case, while a negative vortical gust travels over the airfoil in the latter case. In the case of \((u_{\theta \textrm{max}}/u_{\infty }, R/c, y_{0}/c)=(0.35, 0.95, -0.15)\), the estimated vorticity field retains the primary vortical features for \(\gamma \le 0.178\). The estimated flow field deviates from the reference DNS field at \(\gamma =0.28\). For the case of \((u_{\theta \textrm{max}}/u_{\infty }, R/c, y_{0}/c)=(-0.60, 0.91, -0.09)\), a spurious negative structure attached to the trailing edge vortex emerges beyond \(\gamma =0.178\), although the overall flow is reconstructed well. At \(\gamma =0.28\), although the \(L_2\) error norm is relatively high, the main wake structures are nonetheless reconstructed. These results suggest that the present machine-learning models that incorporate dynamics are robust against noisy pressure measurements even with a small amount of training data.

6 Concluding remarks

High-fidelity machine-learning-based reconstructions are developed for aerodynamic force coefficients, the pressure distribution over the airfoil, and two-dimensional vorticity fields of flows that experience an impact with a disturbance vortex. Such reconstruction using sparse sensor measurements and a modest amount of training data is extremely challenging due to the strong nonlinearities and the transient nature of the flow fields, which require a vast parameter space to be covered during the learning process. For accurate reconstruction, we developed machine-learning models that are suitable for estimating the transient flow features. A multi-layer perceptron (MLP) is chosen for its ability to construct the nonlinear relation between limited sensor measurements and the aerodynamic force coefficients as well as the pressure over the airfoil surface. A convolutional neural network (CNN) coupled with the MLP efficiently estimates the information-rich vorticity fields through its filtering process. To better capture dynamical features in time, long short-term memory (LSTM)-assisted transfer learning, which passes information from historical scenarios, is embedded in the two aforementioned model structures. Due to the transient nature of the vortex–airfoil interaction problem, the use of LSTM greatly improves the estimation with as few as 10 training snapshots.

The main contribution of the present study is to show how time-varying flows with a vast parameter space can be reconstructed accurately. For this study, the parameter space comprises the maximum rotational velocity, radius, and position of the disturbance vortex. As shown in this paper, careful sampling of training data and incorporation of dynamics into the machine-learning model are important. Based on our study, we also showed that accurate reconstruction of vortical structures is easier to accomplish for high-intensity interaction processes between the vortical disturbance and the airfoil (strong vortex with large size, interacting close to the airfoil). In addition, we assessed proper sensor locations over different time periods. We expect that the present machine-learning-based reconstruction method will be useful in predicting and controlling flows associated with vortex–airfoil interactions in the future.