1 Introduction

Offshore wind turbine installations are projected to accelerate rapidly in the near future, driven by better wind resources and higher social acceptance compared to onshore sites [24]. However, a major economic limitation of offshore wind turbines is their high operational and maintenance expenditure (OPEX), which amounts to about 34 % of the levelized cost of energy (LCOE) [19]. It is caused by lower reliability due to harsher environmental conditions and by time-consuming replacement or repair due to difficult site access and dependency on good weather conditions. A major contributor to OPEX is the geared drivetrain, which suffers from frequent failures and long downtimes and is thus the subject of current research [23].

Digital Twin (DT) is an emerging technology with prospects of decreasing operational and maintenance expenditures and improving the market competitiveness of offshore wind farms. The wind turbine drivetrain DT proposed by the authors in [redacted] would enable monitoring of fatigue loads at otherwise inaccessible locations such as bearings and gear contacts using ’virtual sensors’ and thus support Remaining Useful Life (RUL) assessment based on the true load history. A framework with the three components Data, Virtual model and Decision support is envisioned for this objective (Fig. 1). Data is collected continuously by sensors of the Supervisory Control and Data Acquisition system (SCADA) and the condition monitoring system (CMS), and from other sources such as metocean forecasts and maintenance logs. The Virtual model comprises decoupled simulation models that represent the physics at different scales. Aerodynamics and structural dynamics of tower and blades are captured with the aeroelastic model. The resulting hub loads and nacelle motions are imposed as boundary conditions on the drivetrain model, which produces local bearing and gear forces. These are used as input for the fatigue damage model consisting of stress cycle counting algorithms and S‑N curves. The virtual model and its physical counterpart are synchronized with real-time field measurements using state estimators such as the Kalman Filter. The synchronization, also referred to as data fusion or Digital Twinning, is essential because it makes the virtual sensor measurements available in the synchronized model. The virtual sensor measurements are converted to value-adding information for the turbine operator in the component called Decision support. In this study the focus is on RUL assessment of drivetrain components, which is necessary to move from corrective to predictive maintenance strategies.

Fig. 1 Digital Twin framework for continuous remaining useful life estimation in wind turbine drivetrain components

Preliminary investigations on the feasibility of the proposed DT have been conducted in a numerical case study in [redacted]. In this study the proposed DT is further validated with field measurements of the DOE 1.5 MW turbine instrumented by NREL [18]. Main bearing loads estimated with the proposed virtual sensing method are compared to field measurements obtained from shaft strain gauges under different operational conditions.

Other studies in the literature are concerned with estimating main shaft loads using a virtual sensing approach. Several works pursue an inverse approach for real-time estimation of the rotor torque based on SCADA measurements. When the drivetrain dynamics are simplified to a torsional system with two degrees of freedom (DOF), the SCADA signals for generator torque and for the low-speed shaft (LSS) and high-speed shaft (HSS) speeds contain enough information to predict the unknown rotor torque [3, 12, 15, 17]. The drivetrain model may be constructed without knowledge of the manufacturer’s specifications using data-driven system identification techniques such as least-squares estimators [15]. Data fusion is realized with state estimators, for example Kalman Filters (KF) [15], or regularization methods [17].

Several works are concerned with virtual sensing of rotor thrust, but with the objective of structural health monitoring of the tower rather than drivetrain components. A common approach involves constructing a dynamic, flexible tower model and applying state estimation methods based on tower top acceleration and/or strain gauge measurements [20].

Notable publications that fit the proposed Digital Twin framework (Fig. 1) are presented by Branlard et al. and Azzam et al. [2, 4]. Branlard et al. present a holistic wind turbine DT capable of estimating both thrust and torque based on SCADA measurements and a linearized aeroelastic model [4]. The DT is validated with both simulations and field measurements; however, the scope is limited to the structural dynamics of the tower and blades. Azzam et al. present a DT that also considers drivetrain dynamics [2]. The DT is constructed by regression on aeroelastic and drivetrain multi-body simulations and serves the purpose of estimating all six main shaft load components based on SCADA measurements. Their work, however, is purely numerical and not supported by experimental or field measurements.

The novel contributions of this paper in comparison to existing research are summarized as follows.

  • Validating the concept of virtual sensing of drivetrain loads with field measurements rather than numerical simulations

  • Monitoring fatigue damage in drivetrain components rather than the tower or blades

  • Leveraging high-frequency CMS vibration measurements as opposed to using only SCADA measurements

The remainder of this paper is organized as follows. Sect. 2 presents the proposed methodology for virtual sensing of drivetrain loads and remaining useful life estimation. Sect. 3 discusses the errors between field measurements and the predictions of the virtual sensors. Concluding remarks are given in Sect. 4.

2 Methodology

The overall methodology is illustrated in Fig. 2. Several SCADA and CMS signals, described in Sect. 2.1, are filtered and postprocessed to extract statistical features (Sect. 2.2). Data-driven regression models are then trained to map the SCADA and CMS features onto measured aerodynamic hub loads (torque, yaw moment, pitch moment, thrust), as detailed in Sect. 2.3. Local forces at the main bearing and the high-speed shaft generator side (HSS-GS) bearing are then calculated with a low-fidelity, physics-based drivetrain model, presented in Sect. 2.4. The remaining useful life (RUL) is estimated based on the fatigue damage model of ISO 281 (Sect. 2.5). Lastly, the RUL is scaled with a safety factor to account for uncertainties in the load estimation (Sect. 2.6).

2.1 Description of dataset

The dataset was acquired as part of a field measurement campaign with the U.S. Department of Energy 1.5 MW (DOE 1.5) turbine at the National Renewable Energy Laboratory (NREL) [18]. The DOE 1.5 turbine is based on a commercial GE 1.5 SLE turbine with a custom configuration. In addition to a standard SCADA system and drivetrain CMS the turbine is equipped with strain gauges at the tower base, tower top, blade roots and the main shaft to fully monitor multiaxial aerodynamic loads.

Fig. 2 Overall methodology for virtual sensing of drivetrain loads and remaining useful life estimation

The sensor signals used in this study are listed in Table 1. The SCADA signals considered are those reported to be sensitive to the main shaft loading: active power, HSS and LSS speed, nacelle wind speed and tower top acceleration. The CMS sensors are installed in a typical configuration and positioned on the housing of the generator (Gen), the high-speed gear stage (HSS), the planetary gear stage (PL) and each of the torque arms (TA). The aerodynamic loads at the rotor hub, including the torque, pitch moment, yaw moment and thrust, are measured with strain gauges at the main shaft downwind of the main bearing and at the tower base. The calibration of the strain gauges is described in [18].

The dataset used in this study comprises a total of 830 measurements of 10 min length recorded from 31 Oct 2018 to 5 Dec 2018. The sampling frequency is 50 Hz for all signals.

Table 1 Field measurements of the DOE 1.5 turbine [18] used in this study

2.2 Data postprocessing

The dataset is filtered for normal power production, which is identified by three criteria:

  • Main shaft speed \(> 10.5\) rpm

  • Blade 1 pitch angle \(<50^{\circ}\)

  • Active power \(> 0\) kW

In addition, a moving average filter with window size of 1 s is applied on the recorded strain gauge signals.
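The selection criteria and the smoothing step can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function and argument names are hypothetical, the criteria are assumed to be evaluated on the 10 min mean values of each segment, and a trailing window is assumed for the moving average, since neither choice is stated above. The window of 1 s corresponds to 50 samples at the 50 Hz sampling rate given in Sect. 2.1.

```python
from collections import deque


def is_normal_production(shaft_speed_rpm, pitch_deg, power_kw):
    """Three criteria for normal power production.

    Assumption: the inputs are 10 min mean values of one segment.
    """
    return shaft_speed_rpm > 10.5 and pitch_deg < 50.0 and power_kw > 0.0


def moving_average(signal, fs_hz=50.0, window_s=1.0):
    """Trailing moving average (the window alignment is an assumption)."""
    n = max(1, int(round(fs_hz * window_s)))  # samples per window
    window = deque(maxlen=n)
    running_sum = 0.0
    smoothed = []
    for x in signal:
        if len(window) == n:
            running_sum -= window[0]  # oldest sample leaves the window
        window.append(x)
        running_sum += x
        smoothed.append(running_sum / len(window))
    return smoothed
```

During the first second the window is only partially filled, so the first samples are averaged over fewer points.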

Best practice in drivetrain condition monitoring is the extraction of statistical features, which are indicative of faults or damage [16]. The recorded SCADA and CMS measurements are partitioned into 10 min segments and the features listed in Table 2 are calculated for each segment. These include a wide range of the most commonly used features in the time domain (\(x\)) and frequency domain (\(S_{ \textit{xx}}\)). The features that are eventually utilized as input for the regression models are determined by a sensitivity analysis in Sect. 3.2.

Table 2 Statistical features calculated for SCADA and CMS signals
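The segmentation and feature extraction can be illustrated with a short sketch. The features below (mean, standard deviation, RMS, peak, crest factor, kurtosis) are common time-domain examples chosen for illustration only; Table 2 remains the authoritative list, and frequency-domain features are omitted here. All function names are hypothetical.

```python
import math


def segment(signal, fs_hz=50.0, length_s=600.0):
    """Split a signal into non-overlapping segments (default: 10 min at 50 Hz)."""
    n = int(fs_hz * length_s)
    return [signal[i:i + n] for i in range(0, len(signal) - n + 1, n)]


def time_domain_features(x):
    """Illustrative time-domain features of one segment."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n  # population variance
    rms = math.sqrt(sum(v * v for v in x) / n)
    peak = max(abs(v) for v in x)
    nan = float("nan")
    return {
        "mean": mean,
        "std": math.sqrt(var),
        "rms": rms,
        "peak": peak,
        "crest_factor": peak / rms if rms > 0 else nan,
        "kurtosis": (sum((v - mean) ** 4 for v in x) / n) / var ** 2 if var > 0 else nan,
    }
```

Each segment then yields one feature vector per signal, which is the input format assumed for the regression step.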

2.3 Data-driven regression models

Regression models are used in this study to map the predictors, the SCADA and CMS statistical features, onto the targets, the aerodynamic hub loads. Several linear and non-linear regression models are investigated for this purpose, including Linear Regression (LR), Support Vector Machine (SVM) and Tree ensembles, as described in Table 3. For a detailed description of each regression model type the reader is referred to [6]. Implementation and training are realized with MATLAB’s Statistics and Machine Learning Toolbox [9]. The dataset is partitioned 80/20 into training data and test data, and the models are regressed onto the training data using least squares regression and five-fold cross validation. Hyperparameters are not optimized and are set to the default values provided by MATLAB.

Table 3 Statistical regression models for predicting hub loads based on SCADA and CMS measurements
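The partitioning and validation scheme can be sketched in a few lines. The study itself uses MATLAB; the Python sketch below only illustrates the 80/20 split and the five-fold cross validation for the simplest case of a linear model with a single predictor, fitted on synthetic data. All names, the random seeds and the synthetic relationship are assumptions made for illustration.

```python
import random


def fit_line(x, y):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def rmse(y_true, y_pred):
    """Root mean square error."""
    return (sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)) ** 0.5


def train_test_split(n, test_fraction=0.2, seed=0):
    """Shuffle the indices 0..n-1 and split them into (train, test)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = int(round(test_fraction * n))
    return idx[n_test:], idx[:n_test]


def cross_val_rmse(x, y, idx, k=5):
    """Mean RMSE over k folds of the training indices."""
    scores = []
    for fold in (idx[i::k] for i in range(k)):
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        a, b = fit_line([x[i] for i in train], [y[i] for i in train])
        scores.append(rmse([y[i] for i in fold], [a + b * x[i] for i in fold]))
    return sum(scores) / k


# Synthetic stand-in for 830 segments: one feature, one target (not field data).
rng = random.Random(1)
x = [rng.uniform(0.0, 1.0) for _ in range(830)]
y = [2.0 + 3.0 * xi + rng.gauss(0.0, 0.1) for xi in x]
train_idx, test_idx = train_test_split(len(x))
cv_rmse = cross_val_rmse(x, y, train_idx)
```

The held-out 20 % is not touched during cross validation and remains available for a final check of the selected model.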

2.4 Physics-based drivetrain model

The DOE 1.5 MW turbine is instrumented with strain gauges at the blade roots, the main shaft and the tower top and base to monitor the multiaxial loading of the turbine. The aerodynamic loads at the rotor hub, as well as loads at the main bearing and HSS-GS bearing are calculated from strain gauge measurements using an analytical model, presented in [1, 5]. The analytical model assumes steady state operation and neglects any torsional or bending dynamics of the main shaft and the tower. With this assumption it is possible to determine bearing loads by moment balances.

First, the aerodynamic moments including torque \(M_{ \textit{a,x}}\), pitch moment \(M_{ \textit{a,y}}\) and yaw moment \(M_{ \textit{a,z}}\) are determined from the measured main shaft moments \(M_{ \textit{ms}}\) by moment balance around the main bearing (Fig. 3) and expressed in the fixed coordinate frame at the hub. The thrust \(F_{ \textit{a,x}}\) is calculated from the tower base moment \(M_{ \textit{tb,y}}\) by moment balance around the tower base (Fig. 3)

$$\begin{aligned}M_{ \textit{a,x}}=M_{ \textit{ms,x}}\end{aligned}$$
(1)
$$\begin{aligned}M_{ \textit{a,y}}=M_{ \textit{ms,y}}-M_{ms,y0}\end{aligned}$$
(2)
$$\begin{aligned}M_{ \textit{a,z}}=M_{ \textit{ms,z}}\end{aligned}$$
(3)
$$\begin{aligned}F_{ \textit{a,x}}=-\frac{M_{ \textit{a,y}}+M_{ \textit{tb,y}}-M_{tb,y0}}{h\cos(\alpha)-d_{1}\sin(\alpha)}\end{aligned}$$
(4)

where \(M_{ms,y0}\) and \(M_{tb,y0}\) are gravitational moments due to the rotor overhang expressed at the main bearing and tower base respectively, \(h\) denotes the tower height, \(d_{1}\) the distance from the hub to the tower top and \(\alpha\) the main shaft tilt angle (Table 4).
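Eqs. 1–4 translate directly into a small function. The sketch below is illustrative only: the function name and argument order are hypothetical, the tilt angle is assumed to be given in radians, and the numerical values of the parameters must be taken from Table 4.

```python
import math


def hub_loads(M_ms, M_tb_y, M_ms_y0, M_tb_y0, h, d1, alpha_rad):
    """Aerodynamic hub loads from strain gauge moments (Eqs. 1-4).

    M_ms is the tuple (M_ms_x, M_ms_y, M_ms_z) of measured main shaft moments.
    Returns (M_a_x, M_a_y, M_a_z, F_a_x).
    """
    M_ms_x, M_ms_y, M_ms_z = M_ms
    M_a_x = M_ms_x                       # Eq. 1: torque
    M_a_y = M_ms_y - M_ms_y0             # Eq. 2: pitch moment minus gravity offset
    M_a_z = M_ms_z                       # Eq. 3: yaw moment
    lever = h * math.cos(alpha_rad) - d1 * math.sin(alpha_rad)
    F_a_x = -(M_a_y + M_tb_y - M_tb_y0) / lever  # Eq. 4: thrust
    return M_a_x, M_a_y, M_a_z, F_a_x
```

Consistent units are required, for example moments in kNm and lengths in m, which gives the thrust in kN.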

Fig. 3 Definition of forces and moments

Table 4 Parameters of analytical model for bearing load calculation according to [1, 5]

The installed main bearing is an SKF 240/600 BC spherical roller bearing in a 3-point configuration and thus considered to only support radial and axial forces. The torque arms are also considered to only experience radial and axial forces and the stiffness of the generator coupling is neglected. In steady state operation the main bearing forces \(F_{ \textit{mb}}\) are then calculated as follows (Eqs. 5–7). The axial force is governed by thrust, while the radial force is governed by the yaw and pitch moments.

$$\begin{aligned}F_{ \textit{mb,x}}=F_{mb,x0}+F_{ \textit{a,x}}\end{aligned}$$
(5)
$$\begin{aligned}F_{ \textit{mb,y}}=-M_{ \textit{a,z}}/d_{ \textit{GB}}\end{aligned}$$
(6)
$$\begin{aligned}F_{ \textit{mb,z}}=F_{mb,z0}-M_{ \textit{a,y}}/d_{ \textit{GB}}\end{aligned}$$
(7)

where \(F_{mb,x0}\) and \(F_{mb,z0}\) are the rotor, shaft and gearbox weight projected onto the x and z axes respectively and \(d_{ \textit{GB}}\) is the distance from the main bearing to the torque arms (Table 4).
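A minimal sketch of Eqs. 5–7 is given below. The function name is hypothetical, and combining the y and z components into a radial resultant with the Euclidean norm is an assumption added here for use in the equivalent dynamic load; it is not stated explicitly in the text.

```python
import math


def main_bearing_forces(F_a_x, M_a_y, M_a_z, F_mb_x0, F_mb_z0, d_GB):
    """Quasi-static main bearing forces (Eqs. 5-7).

    Returns (F_mb_x, F_mb_y, F_mb_z, F_rad).
    """
    F_mb_x = F_mb_x0 + F_a_x             # Eq. 5: axial force from thrust and weight
    F_mb_y = -M_a_z / d_GB               # Eq. 6: lateral force from yaw moment
    F_mb_z = F_mb_z0 - M_a_y / d_GB      # Eq. 7: vertical force from weight and pitch moment
    F_rad = math.hypot(F_mb_y, F_mb_z)   # assumed radial resultant
    return F_mb_x, F_mb_y, F_mb_z, F_rad
```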

The HSS-GS bearing is a SKF NU232 cylindrical roller bearing and thus only supports radial forces. The radial force is governed by the transmitted gear force at the high-speed gear stage, which is calculated from the rotor torque by neglecting all torsional dynamics

$$\begin{aligned}F_{ \textit{HSS,rad}}=\frac{M_{ \textit{a,x}}}{i_{ \textit{GB}}r_{b}}\frac{d_{ \textit{RS}}}{d_{ \textit{GS}}}\end{aligned}$$
(8)
$$\begin{aligned}F_{ \textit{HSS,x}}=0\end{aligned}$$
(9)

where \(i_{ \textit{GB}}\) denotes the gearbox ratio, \(r_{b}\) the base radius of the pinion and \(d_{ \textit{RS}}\), \(d_{ \textit{GS}}\) the distances from the rotor-side and generator-side HSS bearings to the pinion center (Table 4).
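Eq. 8 can be written as a one-line function. The sketch is illustrative; the function name is hypothetical and the parameter values are those of Table 4.

```python
def hss_gs_radial_force(M_a_x, i_GB, r_b, d_RS, d_GS):
    """Radial force at the HSS-GS bearing from the rotor torque (Eq. 8)."""
    gear_force = M_a_x / (i_GB * r_b)    # transmitted force at the pinion base circle
    return gear_force * d_RS / d_GS      # lever ratio of the two HSS bearings
```

Because all torsional dynamics are neglected, the bearing force is strictly proportional to the rotor torque in this model.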

2.5 Fatigue damage and remaining useful life

The bearing fatigue damage and remaining useful life are calculated according to ISO 281 [7], which defines the equivalent dynamic load \(P\) for cylindrical roller bearings (CRB) and tapered roller bearings (TRB) as

$$\begin{aligned}\text{for CRB:}\quad P=F_{\text{rad}}\end{aligned}$$
(10)
$$\begin{aligned}\text{for TRB:}\quad P=\begin{cases}F_{\text{rad}}+Y_{1}F_{ \textit{ax}},&\text{if }F_{ \textit{ax}}/F_{\text{rad}}\leq e\\ 0.67F_{\text{rad}}+Y_{2}F_{ \textit{ax}},&\text{otherwise}\end{cases}\end{aligned}$$
(11)

where \(Y_{1}\), \(Y_{2}\), \(e\) are bearing specific parameters (Table 4). The equivalent dynamic load is calculated with 10 min average load estimates denoted as \(\overline{P}_{i}\). For each 10 min section indexed by \(i\) the permissible stress cycles \(N_{i}\) are then calculated with the bearing lifetime equation

$$N_{i}=10^{6}\left(\frac{C}{\overline{P}_{i}}\right)^{m}$$
(12)

where \(C\) is the basic dynamic load rating and \(m\) equals 10/3 for roller bearings. The experienced stress cycles \(n_{i}\) are determined using the load duration distribution (LDD) method, which counts one stress cycle per shaft revolution due to entering and exiting the bearing load zone [13]. Using 10 min average shaft speeds \(\overline{\omega}_{i}\) the LDD method simplifies to

$$n_{i}=\overline{\omega}_{i}\Delta t,$$
(13)

where \(\Delta t\) equals 10 min. The dimensionless short-term fatigue damage \(D^{ \textit{ST}}_{ \textit{i}}\) is then defined as the ratio of experienced to permissible stress cycles

$$D^{ \textit{ST}}_{ \textit{i}}=n_{i}/N_{i}$$
(14)

The long-term damage \(D^{ \textit{LT}}(t)\) is obtained with the Palmgren-Miner linear damage hypothesis by summation of all previous short-term damage and is updated in 10 min intervals for real-time damage monitoring

$$D^{ \textit{LT}}(t)=\sum_{i=0}^{t/\Delta t}D^{ \textit{ST}}_{ \textit{i}}$$
(15)

By definition, the bearing has consumed its damage reserves and reached its end of life at \(D^{ \textit{LT}}=1\). With a nominal life \(t_{ \textit{nom}}\) of 20 years the remaining useful life RUL is then calculated as follows

$$\text{RUL}(t)=t_{ \textit{nom}}\left(1-D^{ \textit{LT}}(t)\right)$$
(16)
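The chain from equivalent dynamic load to remaining useful life (Eqs. 10–16) is summarized in the following sketch. It is an illustration under stated assumptions: function names are hypothetical, forces are passed as non-negative magnitudes with \(F_{\text{rad}}>0\), the shaft speed is given in rpm so that its product with \(\Delta t\) in minutes yields revolutions, and the bearing parameters \(C\), \(e\), \(Y_{1}\), \(Y_{2}\) come from Table 4.

```python
def equivalent_load_crb(F_rad):
    """Eq. 10: purely radial equivalent dynamic load."""
    return F_rad


def equivalent_load_combined(F_rad, F_ax, e, Y1, Y2):
    """Eq. 11: equivalent dynamic load for combined radial and axial load."""
    if F_ax / F_rad <= e:
        return F_rad + Y1 * F_ax
    return 0.67 * F_rad + Y2 * F_ax


def short_term_damage(P_mean, speed_rpm, C, m=10.0 / 3.0, dt_min=10.0):
    """Damage of one 10 min segment (Eqs. 12-14)."""
    N = 1e6 * (C / P_mean) ** m   # Eq. 12: permissible stress cycles
    n = speed_rpm * dt_min        # Eq. 13: one stress cycle per revolution
    return n / N                  # Eq. 14


def remaining_useful_life(short_term_damages, t_nom_years=20.0):
    """Palmgren-Miner sum (Eq. 15) and RUL in years (Eq. 16)."""
    D_LT = sum(short_term_damages)
    return t_nom_years * (1.0 - D_LT)
```

For online monitoring the list of short-term damages grows by one entry every 10 min, so the sum can equally be kept as a running total.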

2.6 Damage uncertainty model

Using 10 min average load estimates for the damage calculation reduces computational costs and enables real-time monitoring. However, it introduces uncertainties by neglecting high-frequency load fluctuations, which may originate in the aerodynamics or internal drivetrain dynamics. The damage is generally underestimated with averaged loads, since load peaks are disproportionately more damaging than load minima due to the exponentiation with \(m\) (Eq. 12). The uncertainty \(\chi_{ \textit{avg}}\) is expressed as the ratio of the true short-term damage \(D^{ \textit{ST}}_{\text{50\,Hz}}\) measured at 50 Hz and the short-term damage based on 10 min average load estimates \(D^{ \textit{ST}}_{\text{10\,min-avg}}\). The fluctuations of the equivalent dynamic load within a 10 min period are modelled with a random variable \(X\sim N(\mu,\sigma)\), which is normally distributed with mean value \(\mu\) and standard deviation \(\sigma\). It is further assumed that variations of the shaft speed are negligible, such that Eq. 13 remains valid. It follows for the uncertainty \(\chi_{ \textit{avg}}\)

$$\chi_{ \textit{avg}}:=\frac{D^{ \textit{ST}}_{\text{50\,Hz}}}{D^{ \textit{ST}}_{\text{10\,min-avg}}}=\frac{E(X^{m})}{[E(X)]^{m}}$$
(17)

where the expected values \(E\) are given by the law of the unconscious statistician (LOTUS) [8] using the standard normal statistical variable \(Z=\frac{X-\mu}{\sigma}\)

$$\begin{aligned}E(X^{m})=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}(\mu+\sigma z)^{m}\exp(-z^{2}/2)dz\end{aligned}$$
(18)
$$\begin{aligned}[E(X)]^{m}=\mu^{m}\end{aligned}$$
(19)

It is evident that the uncertainty \(\chi_{ \textit{avg}}\) is only a function of the 10 min mean and standard deviation, which are both estimated with data-driven regression models (Sect. 2.3).
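The factor \(\chi_{ \textit{avg}}\) of Eq. 17 has no simple closed form for \(m=10/3\), but the integral in Eq. 18 is easily evaluated numerically. The sketch below uses the trapezoidal rule; the function names, the integration limits and the step count are assumptions. Negative loads, for which the non-integer power is undefined, are truncated to zero, which is only negligible if \(\sigma\ll\mu\). A second-order Taylor expansion around \(\mu\), \(\chi_{ \textit{avg}}\approx 1+\tfrac{1}{2}m(m-1)(\sigma/\mu)^{2}\), is included as a plausibility check.

```python
import math


def chi_avg(mu, sigma, m=10.0 / 3.0, z_max=8.0, steps=4000):
    """Ratio E(X^m) / [E(X)]^m for X ~ N(mu, sigma) (Eqs. 17-19)."""
    if sigma == 0.0:
        return 1.0
    h = 2.0 * z_max / steps
    total = 0.0
    for k in range(steps + 1):
        z = -z_max + k * h
        load = max(mu + sigma * z, 0.0)          # assumption: truncate negative loads
        weight = 0.5 if k in (0, steps) else 1.0  # trapezoidal end-point weights
        total += weight * load ** m * math.exp(-0.5 * z * z)
    expected_xm = total * h / math.sqrt(2.0 * math.pi)  # Eq. 18
    return expected_xm / mu ** m                          # Eq. 17 with Eq. 19


def chi_avg_second_order(mu, sigma, m=10.0 / 3.0):
    """Second-order Taylor approximation of chi_avg, valid for sigma << mu."""
    return 1.0 + 0.5 * m * (m - 1.0) * (sigma / mu) ** 2
```

Since \(x^{m}\) is convex for \(m>1\), the factor is always at least one, which is consistent with the statement that averaged loads underestimate the damage.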

3 Results and Discussion

3.1 Measured hub loads and fatigue damage

Shown in Fig. 4 are the distributions of the measured hub loads for verification of the results. The mean torque follows the analytical thrust curve and levels out at rated torque, while the highest torque variance is observed at about 10 m/s slightly below rated wind speed of 14 m/s. This behaviour is similarly reported in other works [13] and is likely due to effects of the pitch controller, which frequently activates and deactivates in this region causing high torque amplitudes. The aerodynamic pitch moment is predominantly a result of thrust differences between the upper and lower rotor disk due to the vertical wind profile. Positive trends of its mean and variance with respect to wind speed are observed. The yaw moment is similarly a result of aerodynamic imbalance, predominantly yaw misalignment. Contrary to the pitch moment, the yaw moment is centered around zero mean and is independent of wind speed. The measured thrust agrees well with simulated thrust curves, as demonstrated in [5]. The highest variance in thrust is measured at around 8 m/s, which is slightly lower than the peak of torque variance.

Fig. 4 10 min mean and standard deviation of measured aerodynamic loads in fixed frame of the rotor hub

Fig. 5 10 min fatigue damage at the main bearing and HSS-GS bearing based on measured hub loads

The bearing damage calculated from the measured hub loads is presented in Fig. 5. It is emphasized here that rotating machine elements such as bearings and gears experience stress cycles even at stationary environmental loads due to the shaft rotation. For this reason the LDD method [13] is used in this study for stress cycle counting as opposed to the rainflow counting method commonly used for (non-rotating) structural elements. The fatigue damage at the main bearing comprises two components, an axial component \(D_{ \textit{ax}}=XF_{ \textit{ax}}/P\cdot D\) due to thrust and a radial component \(D_{\text{rad}}=YF_{\text{rad}}/P\cdot D\) due to gravity and pitch moments. The maximum in fatigue damage is observed at 11 m/s and coincides with the thrust peak. In this operational region the axial forces dominate and amount to about 66% of the equivalent dynamic load \(P\). At wind speeds above 16 m/s the contribution of radial forces due to pitch moments becomes dominant, and below 8 m/s, where aerodynamic loads are relatively low, the contribution of gravity forces becomes dominant.

The HSS-GS bearing experiences only radial forces, which are considered to be proportional to the rotor torque (Eq. 8). Thus, the fatigue damage is governed by the mean rotor torque and reaches its maximum above rated wind speeds.

3.2 Sensitivity analysis

A sensitivity analysis is conducted with the objective of dimensionality reduction of the predictor variables. The sensor signals (Table 1) and statistical features (Table 2) that are the best predictors of hub loads according to the Pearson correlation coefficient are selected. Presented in Fig. 6 are the ten best performing SCADA signals (in blue) and CMS signals (in red) for each hub load component.
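The ranking step can be sketched as follows. The code is illustrative: the function names and the dictionary layout of the features are hypothetical, and ranking by the absolute value of the correlation coefficient is an assumption, so that strongly negative correlations are also treated as informative.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rank_predictors(features, target, top=10):
    """Return the `top` feature names with the highest |R| against the target.

    `features` maps a feature name to its values over all 10 min segments.
    """
    scores = {name: abs(pearson_r(values, target)) for name, values in features.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:top]
```

The Pearson coefficient only captures linear dependence, so a feature with a strong non-linear relation to a hub load may still receive a low rank.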

Fig. 6 SCADA (blue) and CMS signals (red) ranked according to their correlation with mean and standard deviation of hub loads

The generator torque is, as expected, an excellent predictor of both the mean and the standard deviation of rotor torque with \(R> 0.99\). Prediction of the absolute values of the bending moments, on the other hand, is challenging, as none of the SCADA or CMS signals shows statistically significant correlation (\(R<0.5\)). However, the moment standard deviations show high correlation (\(R=0.88\)) with the wind speed, as well as with other SCADA signals. The torque correlates well with all SCADA signals, as well as the CMS vibrations at the HSS, the generator and the nacelle frame.

3.3 Hub load estimation

Several regression models, as described in Table 3, are used to map sensor measurements onto the aerodynamic hub loads. Two different scenarios of sensor input are considered: (1) only SCADA signals, (2) combined SCADA and CMS signals. This serves the purpose of assessing the added value of CMS vibration data and validating the novel approach of virtual sensing based on vibration measurements. The metric for model performance is the root mean square error (RMSE) between measured and predicted loads using 5‑fold cross validation. Shown in Fig. 7 is the RMSE normalized to the maximum value of each hub load.
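The performance metric can be written compactly. In the sketch below the normalization uses the maximum absolute measured value of the respective load, which is an interpretation of "maximum value" and therefore an assumption; the function name is hypothetical.

```python
def normalized_rmse(measured, predicted):
    """RMSE between measured and predicted loads, normalized to max |measured|."""
    n = len(measured)
    rmse = (sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n) ** 0.5
    return rmse / max(abs(m) for m in measured)
```

Multiplying the result by 100 gives the percentage values quoted in this section.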

Fig. 7 Normalized RMSE between measured and predicted hub loads with different regression models and different SCADA/CMS input

It is evident that the estimation of the mean and standard deviation of the rotor torque is accurate, with minimum RMSE of 0.24% and 0.46% respectively. The best performance is observed with a simple linear regression model, due to the high linear correlation of the rotor torque with the measured generator torque.

Concerning the bending moments, it appears that estimating the mean value is much more challenging than estimating the standard deviation. A possible reason is that the mean bending moments, unlike torque and thrust, do not show a clear trend with respect to wind speed (Fig. 4). The inclusion of CMS vibration data slightly improves the prediction accuracy of bending moments in most cases. Non-linear regression models are preferable, since the relationship between bending moments and dynamic drivetrain responses appears to be non-linear.

The mean thrust and its standard deviation are estimated with relatively low RMSE of 8.0% and 6.0% respectively. It is clear that non-linear regression models are necessary to capture non-linear behaviour such as the thrust-wind speed curve (Fig. 4). In this case CMS vibration data do not appear to increase performance.

3.4 Fatigue damage and remaining useful life

The measured and estimated hub loads discussed in the previous section are converted into short-term (10 min) fatigue damage in the main bearing and the HSS-GS bearing using Eqs. 5–16. The RMSE of the fatigue damage normalized to its maximum value is presented in Fig. 8.

Fig. 8 Normalized RMSE between measured and predicted bearing damage with different regression models and different SCADA/CMS input

Fig. 9 Measured and predicted bearing RUL with the best performing regression model (quadratic SVM/LR)

The damage in the main bearing is estimated with high accuracy (RMSE = 6.4%) despite high uncertainty in estimating the bending moments. These results suggest that the damage in the main bearing is governed by thrust, which can be estimated more accurately. The best performance is achieved by the quadratic SVM, which is able to capture the non-linear behaviour best. For monitoring the damage in the HSS-GS bearing a linear regression model suffices, which results in RMSE of 1.1%.

It appears that the inclusion of high-frequency CMS vibration measurements does not provide much value for monitoring bearing fatigue damage and that the considered 10 min average SCADA measurements are sufficient to estimate the damage within a 6.4% error margin.

Fig. 9 presents the measured and predicted RUL with the best performing regression model. During the recorded time frame of 138.3 h the measured RUL of the main and HSS-GS bearing is reduced by only 20.8 h and 17.5 h respectively. The discrepancy can be attributed to conservative design, for example in the selection of design load cases (DLC), which are more severe than the actual experienced environmental conditions. Furthermore, the sample size is relatively small and the recording period from 31 Oct to 5 Dec is not representative of seasonal effects.

The RUL is overestimated significantly despite high accuracy in the predicted 10 min average loads. This is caused by high-frequency load dynamics for example from turbulence or internal drivetrain dynamics, which are not accounted for with 10 min average load estimates. The discrepancy is partially compensated with the damage uncertainty model (Sect. 2.6). A good agreement with the measured RUL is observed at the HSS-GS bearing, while at the main bearing there remains a larger error possibly due to higher uncertainties in predicting bending moments and thrust.

3.5 Method limitations and further work

While the presented Digital Twin exhibits high accuracy in predicting aerodynamic loads and bearing damage, it is crucial to discuss the method assumptions and associated uncertainties, which limit the applicability of this method.

Field measurements: The data used in this study (Sect. 2.1) are representative for commercial SCADA and CMS measurements with the exception of the wind speed. The wind speed data were acquired with a MET mast about 150 m downwind of the turbine. Commercial wind turbine SCADA systems, however, mostly rely on nacelle mounted anemometers, which suffer from greater inaccuracies due to wake effects. The additional measurement uncertainty can be estimated with a coefficient of variation (COV) of \(1-3\%\) according to Toft et al. [21].

Aeroelastic model: The presented regression model (Sect. 2.3) relies on a training data set of aerodynamic loads, in this case obtained by strain gauge measurements, which are not available in commercial wind turbines. Alternatively, it is possible to emulate field measurements with measurements from high-fidelity simulation models, similar to the approach of Azzam et al. [2]. Naturally, this shifts the challenge to the model construction and validation and introduces additional uncertainties due to modelling errors. Such uncertainties can be approximated with a COV of \(5\%\) according to Nejad et al. [13], however it is difficult to make generalized statements. In further studies it is planned to compare the data-driven regression models with an aeroelastic model of the DOE 1.5 turbine, which has been developed and validated by other authors [5], in order to quantify modelling uncertainties.

Drivetrain model: State-of-the-art drivetrain models are highly complex multibody simulation (MBS) models [14, 22], and are not suitable as Digital Twin models, as discussed in [11]. First, the high number of degrees of freedom (DOF) makes them numerically expensive and not capable of real-time simulation, which is necessary for online monitoring purposes. Secondly, wind turbine operators do not have the means of developing and validating complex models, since the drivetrain specifications are largely confidential to the OEMs. For this reason, a relatively simple drivetrain model is used in this article, which assumes a quasi-static transmission of torque and neglects all internal dynamics including effects of component flexibility, multi-body interaction and excitations from gear meshing or roller bearings (Sect. 2.4). The effects of internal dynamics on bearing fatigue damage are expected to be relatively small, as suggested by the results of a previous numerical case study [10], where RMSE of 5–15% in the bearing fatigue damage were observed. However, the scope of the numerical case study was limited to the high-speed gear stage bearings, normal power production at rated wind speed and one drivetrain configuration. Further numerical investigations are scheduled to better quantify the modelling errors of such reduced order drivetrain models.

4 Conclusion

This paper presents a Digital Twin for virtual sensing of wind turbine hub loads based on SCADA and CMS measurements, as well as monitoring the accumulated fatigue damage and remaining useful life in the main and HSS-GS bearing. The Digital Twin is constructed for the DOE 1.5 research turbine [18] and evaluated with field measurements. Several data-driven regression models including linear regression models, support vector machines and tree ensembles are trained on field measurements for the aerodynamic hub load estimation. For calculation of local bearing loads a low-fidelity physics-based model is constructed with the assumption of steady-state operation. The remaining useful life is calculated based on the consumed fatigue damage reserves according to ISO 281 [7].

While the estimation of rotor torque and thrust is accurate with RMSE of 0.24% and 6.0%, it proves to be much more challenging to estimate the yaw and pitch bending moments. The measured bending moments appear to be highly stochastic and do not show statistically significant correlation (\(R<0.5\)) with any of the considered SCADA and CMS measurements.

Nonetheless, a relatively low RMSE of 6.4% in the 10 min fatigue damage is observed at the main bearing despite the high uncertainty in the bending moment estimates. It appears that the damage in the main bearing is governed by thrust, which is estimated much more accurately than the bending moments. The damage at the HSS-GS bearing is assumed to depend only on the drivetrain torque and can thus be estimated with high accuracy (RMSE \(=\) 1.1%).

The main contribution of this article is the knowledge transfer of the virtual sensing concept from wind turbine structural elements to drivetrain components, and validation of the concept with field measurements. With regard to the quality and availability of physical sensor measurements the proposed virtual sensors are feasible. SCADA and CMS data contain sufficient information for accurate monitoring of bearing fatigue damage. Challenges are identified in the multi-body drivetrain dynamics, which are much more complex than the dynamics of the tower and blades. However, developing and validating models to capture complex drivetrain dynamics is difficult based on the information that is available to wind turbine operators. Low-fidelity, quasi-static models, which largely neglect internal drivetrain dynamics, are shown to produce low errors of bearing fatigue damage, and are thus proposed for virtual sensing purposes. Further investigations are planned to quantify the uncertainties introduced by quasi-static drivetrain models.