1 Introduction

The power train system, comprising the rotor, main bearings, gearbox, generator and power converter, accounts for 57% of total turbine failures and 65% of total turbine downtime [1]. The resulting costs are expected to be higher for floating wind turbines (FWTs) because marine operations are costly and slow. In FWTs, the power train is exposed to a wider range of excitations than in bottom-fixed turbines because of the combined effects of wind, wake, currents and vibrations induced by structural motions [2]. Predictive maintenance is proposed in the literature as a complement to time-based maintenance, both to reduce unexpected shutdowns and the consequent downtime and to help optimize inspection intervals. The motivation of this research is to increase wind turbine (WT) availability through predictive maintenance of the drivetrain system based on digital twin models.

Digital twins are highly accurate yet computationally fast models of a system that can run online alongside the real machine. The model should be able to update itself from online operational measurements in order to capture physical variations in the system over time [3]. In this paper, a digital twin is proposed as a tool for lifetime monitoring of the drivetrain system, and more specifically the gearbox, using torsional vibration measurements. These measurements can be incorporated into the existing farm Supervisory Control and Data Acquisition (SCADA) system by using higher-resolution encoders. The input data may also be provided by additional instrumentation, e.g. proximity sensors and angular accelerometers. Correa et al. [4] review condition monitoring and fault prediction techniques for drivetrain components, both in general and based on SCADA data. The digital twin is a proven technology, used by Siemens Gamesa for predicting drivetrain loads and subsequently improving the drivetrain design [5]. Applying a digital twin to monitor drivetrain lifetime may be regarded as an expensive solution. However, this work shows that innovative approaches based on the analysis of torsional measurements can provide computationally fast solutions for monitoring the lifetime of critical drivetrain components, with the potential to be integrated into currently available control and monitoring systems at minimal additional cost. In this context, a digital twin means the combination of a model, online measurements and a remaining useful lifetime (RUL) model, as defined and suggested by [6].

Rebouças et al. [7] suggest a 14-DOF torsional model as a fair compromise between complexity and accuracy when the structural integrity of the gearbox is considered. Structural integrity is mainly measured in terms of the contact fatigue stresses of the gear pairs, so this model can serve as an indicator of the gear contact loads. To this end, this work investigates the applicability of a 14-DOF lumped-parameter torsional model of the drivetrain as the digital twin model for monitoring degradation in a high-speed drivetrain with a three-stage gearbox consisting of two planetary stages and one parallel stage. The algorithm for estimating the digital twin model parameters is data-driven: it operates on real-time drivetrain torsional measurements and on the dynamic properties estimated from the torsional response. The dynamic properties, namely natural frequencies and damping, are estimated from the torsional response using the modal estimation approach proposed by [8]. To estimate the moments of inertia of the components in the equivalent model, an optimization problem is defined based on the least-squares error between the digital twin model and the online measurements, which consist of the torsional response and the estimated input torque. Finally, the stiffness parameters of the model are estimated from the equations that express the natural frequencies as functions of the equivalent model parameters (stiffness and inertia). It is worth noting that the equivalent lumped-parameter drivetrain model in the proposed digital twin approach is linear, whereas the eigenfrequencies of the drivetrain are nonlinear functions of the equivalent model parameters. Since the assumption that all dynamic states of the system are accessible is optimistic, the problem is extended to a more general class of problems with only partial knowledge of the system modes.

The parameters of the digital twin model are updated using real-time operational data. The most prevalent failures of the large gears in WT drivetrain systems are due to gear tooth root bending and pitting fatigue damage [9]. For monitoring the RUL of the gears in this work, gear degradation due to pitting fatigue damage and the contact loads of the gear pairs are taken into consideration. The contact loads on the gear pairs are estimated by load observers designed for the gear pairs, based on the digital twin model estimated in real time and on torsional measurements. The degradation model is model-based and follows the stress-life method. The contact stress estimated online feeds a time-domain cycle counting approach based on the rainflow method for online estimation of fatigue cycles, and the fatigue damage of each gear pair is then estimated using Miner’s rule. All simulation cases use the floating NREL 5 MW WT with a spar support substructure, a common FWT concept. A floating turbine is chosen because the motivation for predictive maintenance is higher for FWTs.

This paper is a proof of concept for near real-time estimation of the load, and subsequently the residual life, of drivetrain components using computationally efficient equivalent models that are estimated from torsional measurements and supported by real-time values of the torsional response. The proposed algorithm can estimate the equivalent model parameters from only a few measurement samples, which feed a robust linear regression estimator. On this basis, the main contributions of this work are:

  • Proposing the 14-DOF equivalent lumped-parameter model as the digital twin of the drivetrain system for remaining useful lifetime monitoring of the gearbox,

  • Proposing an algorithm for the near real-time estimation of the equivalent model parameters by using the torsional measurements,

  • Designing computationally inexpensive load observers for the near real-time estimation of the contact load and stress on the gears in planetary and parallel stages of the drivetrain gearbox, by using equivalent model parameters and real-time torsional responses,

  • Proposing a physics-based degradation model for near real-time estimation of the residual life of the gears in the drivetrain gearbox by using the online estimated equivalent model, real-time measurements, designed load observers and contact stress estimation approach,

  • Simulation studies to evaluate the proposed digital twin approach for estimation of fatigue damage of the gears, and validation of the results by high-fidelity simulation models.

2 Methodology

2.1 Test case

The NREL 5 MW reference wind turbine [10] with a spar floating support substructure is selected for this study. The wind turbine specification and the overall characteristics of the floating platform are obtained from [11]. This model is able to capture the global dynamics of a spar floating wind turbine arising from its interaction with the environmental loads. The 5 MW reference drivetrain [12] is employed in this study.

2.2 Global simulation and estimation of drivetrain loads

A decoupled simulation approach is used for the drivetrain studies in this work. This means that the turbine global simulation results, namely the rotor aerodynamic load and the bedplate motions, are applied to the detailed local drivetrain model in a secondary software. The local drivetrain load effects are then obtained for further postprocessing aimed at monitoring the remaining lifetime of the drivetrain. More details on the decoupled approach and the global simulation model are available in [11]. The turbulent wind field is based on the Kaimal spectrum. The turbulence intensity at hub height \(I\) (−) is assumed to be 0.14 for all wind speeds, according to IEC 61400‑1 for class B turbines. The waves are modelled stochastically in the global analysis using two parameters, namely the significant wave height \(H_{s}\) (m) and the peak period \(T_{p}\) (s).

2.3 Drivetrain torsional model as the digital twin model for predictive maintenance

The NREL 5 MW reference drivetrain proposed in [12] is considered in this study, considering only its torsional behavior. Fig. 1 contains a schematic representation of the model used in this work. The gearbox is assumed to be clamped to the nacelle’s bedplate by rigid torque arm supports, leading to no rotation of the ring gears. This model is based on [13], where the resonant behavior of a single-stage planetary gearbox is discussed. The equations of motion are shown in Eq. (1).

$$\mathbf{J}\ddot{\mathbf{\theta }}+\mathbf{D}\dot{\mathbf{\theta }}+\mathbf{K}\mathbf{\theta }=\mathbf{T},$$
(1)
$$\begin{aligned}\mathbf{\theta }= \; & \; \left[\begin{array}{ccccc} \theta _{R} & \boldsymbol{\theta }_{1}^{T} & \boldsymbol{\theta }_{2}^{T} & \boldsymbol{\theta }_{3}^{T} & \theta _{G} \end{array}\right]^{T},\\ \mathbf{T}= \; & \; \left[\begin{array}{ccccc} T_{R} & 0 & \cdots & 0 & T_{G} \end{array}\right]^{T},\end{aligned}$$
(2)
Fig. 1 Representation of the torsional model [7]

In these equations, \(\theta _{k}\) and \(T_{k}\) (\(k\in \{R,1,2,3,G\}\)) represent the torsional coordinates and external torques, respectively. The indices \(R\) and \(G\) are associated with the rotor and generator variables, respectively. The sub-vectors \(\boldsymbol{\theta }_{i}\) (\(i\in \{1,2,3\}\)) refer to the gear stages, see Eqs. (3) and (6). The global inertia and stiffness matrices \(\mathbf{J}\) and \(\mathbf{K}\) are obtained from their local counterparts. The damping is defined using the Rayleigh parameters \(\alpha\) and \(\beta\) and is proportional to the inertia and stiffness matrices, i.e. \(\mathbf{D}=\alpha \mathbf{J}+\beta \mathbf{K}\).
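The Rayleigh damping relation can be illustrated with a minimal sketch. The 2-DOF matrices and the values of \(\alpha\) and \(\beta\) below are hypothetical and chosen only for illustration; they are not the NREL 5 MW parameters.

```python
def rayleigh_damping(J, K, alpha, beta):
    """Return D = alpha*J + beta*K for square matrices given as nested lists."""
    n = len(J)
    return [[alpha * J[r][c] + beta * K[r][c] for c in range(n)] for r in range(n)]

# Hypothetical 2-DOF values, for illustration only.
J = [[4.0, 0.0], [0.0, 2.0]]             # inertia matrix (kg m^2)
K = [[100.0, -100.0], [-100.0, 100.0]]   # stiffness matrix (N m/rad)
D = rayleigh_damping(J, K, alpha=0.5, beta=0.01)
print(D)
```

Because \(\mathbf{D}\) is a linear combination of \(\mathbf{J}\) and \(\mathbf{K}\), it inherits their symmetry and sparsity pattern.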

The local matrices for the planetary stages are given in Eqs. (3) to (5) and are based on the matrices used in [13], while the local matrices for the parallel stage, given in Eq. (6), were obtained by isolating the sun-planet gear pair of the planetary stage. In these equations, \(\theta\), \(m\), \(k\) and \(r\) represent the angular displacement, mass, stiffness and base radius, respectively. The indices \(c\), \(p\), \(s\), \(P\) and \(W\) represent the carrier, planet, sun, pinion and wheel elements of the \(i^{th}\) stage, respectively. The mesh stiffness between the sun-planet, ring-planet and pinion-wheel gears is represented by \(k_{sp}\), \(k_{rp}\) and \(k_{PW}\), respectively. In this work, the mesh stiffness of each gear pair is calculated according to ISO 6336‑2 [14] and is taken as constant. The center distance for the planetary stages is represented by \(a_{w}\) [7]. Shafts connect adjacent inertia elements, such as the rotor, generator and gear stages. The shafts were modelled using finite element theory, leading to the shaft inertia and stiffness matrices \(\boldsymbol{J}_{S}\) and \(\boldsymbol{K}_{S}\) in Eq. (7). The assembly of the global matrices from the local matrices is illustrated in Fig. 2. In Eq. (7), \(D\), \(L\) and \(G\) represent the shaft’s diameter, its length and the shear modulus of its material. The indices \(a\) and \(b\) on the shaft’s displacement vector \(\boldsymbol{\theta }_{S}\) represent the DOFs adjacent to the shaft. For the high-speed intermediate shaft (HSIS), shown in Fig. 1, \(a=s_{2}\) and \(b=W\).

$$\boldsymbol{\theta }_{i}=\left[\begin{array}{ccccc} \theta _{{c_{i}}} & \theta _{{p_{1i}}} & \theta _{{p_{2i}}} & \theta _{{p_{3i}}} & \theta _{{s_{i}}} \end{array}\right]^{T},i=1,2$$
(3)
$$\boldsymbol{J}_{i}=\text{diag}\left(\left[\begin{array}{ccccc} m_{c}a_{w}^{2} & m_{p}r_{p}^{2} & m_{p}r_{p}^{2} & m_{p}r_{p}^{2} & m_{s}r_{s}^{2} \end{array}\right]\right),$$
(4)
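[Equation (5): local stiffness matrix \(\boldsymbol{K}_{i}\) of the planetary stages, not recoverable from the source image]
(5)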
$$\begin{aligned}\boldsymbol{\theta }_{3}= \; & \;\left[\begin{array}{cc} \theta _{W} & \theta _{P} \end{array}\right]^{T}, \\ \boldsymbol{J}_{3}= \; & \; \text{diag}\left(\left[\begin{array}{cc} m_{W}r_{W}^{2} & m_{P}r_{P}^{2} \end{array}\right]\right),\\ \boldsymbol{K}_{3}= \; & \; k_{PW}\left[\begin{array}{cc} r_{W}^{2} & r_{P}r_{W}\\ r_{P}r_{W} & r_{P}^{2} \end{array}\right]\end{aligned}$$
(6)
$$\begin{aligned}\boldsymbol{\theta }_{S}= & \left[\begin{array}{cc} \theta _{a} & \theta _{b} \end{array}\right]^{T}, \\ \boldsymbol{J}_{S}= & \frac{mD^{2}}{48}\left[\begin{array}{cc} 2 & 1\\ 1 & 2 \end{array}\right],\\ \boldsymbol{K}_{S}= & \frac{\uppi GD^{4}}{32L}\left[\begin{array}{cc} 1 & -1\\ -1 & 1 \end{array}\right],\end{aligned}$$
(7)
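As a quick check of Eq. (7), the following sketch builds the two local shaft matrices. It assumes a solid circular shaft, and the numerical values are hypothetical rather than the NREL 5 MW shaft data.

```python
import math

def shaft_matrices(mass, diameter, length, shear_modulus):
    """Local 2x2 inertia and stiffness matrices of a shaft element, Eq. (7)."""
    j_coeff = mass * diameter**2 / 48.0
    k_coeff = math.pi * shear_modulus * diameter**4 / (32.0 * length)
    J_S = [[2 * j_coeff, j_coeff], [j_coeff, 2 * j_coeff]]
    K_S = [[k_coeff, -k_coeff], [-k_coeff, k_coeff]]
    return J_S, K_S

# Hypothetical steel shaft (illustrative values only).
J_S, K_S = shaft_matrices(mass=500.0, diameter=0.4, length=1.0, shear_modulus=80e9)

# Summing all entries of J_S gives the rigid-body inertia m*D^2/8 of a solid cylinder.
total_inertia = sum(sum(row) for row in J_S)
# Each row of K_S sums to zero: a rigid rotation produces no elastic torque.
row_sums = [sum(row) for row in K_S]
```

The two checks reflect that the consistent inertia matrix redistributes, but does not change, the total polar inertia of the shaft, and that the stiffness matrix only resists relative twist between the two ends.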
Fig. 2 Assembling the global matrices from their local equivalents [13]

The generator torque is realized by a proportional-integral (PI) speed controller, defined as

$$T_{G}=K_{P}e+K_{I}\int _{0}^{t}e\,\mathrm{d}\tau ,\quad e=\omega _{G}-\dot{\theta }_{G},$$
(8)

where the error \(e\) is given as the difference between reference and measured generator speeds \(\omega _{G}\) and \(\dot{\theta }_{G}\), respectively. \(K_{P}=2200\) is the proportional gain and \(K_{I}=220\) is the integral gain.
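A discrete-time version of the controller in Eq. (8) might look as follows. The gains are those quoted above. The time step, the rectangular integration rule and the speed values are assumptions made for this sketch only.

```python
class PISpeedController:
    """Discrete-time PI speed controller after Eq. (8), with rectangular integration."""

    def __init__(self, kp, ki, dt):
        self.kp, self.ki, self.dt = kp, ki, dt
        self.integral = 0.0  # running approximation of the integral of the error

    def torque(self, omega_ref, omega_meas):
        error = omega_ref - omega_meas
        self.integral += error * self.dt
        return self.kp * error + self.ki * self.integral

controller = PISpeedController(kp=2200.0, ki=220.0, dt=0.01)  # dt is an assumed value
t1 = controller.torque(omega_ref=10.0, omega_meas=9.0)    # speed error of 1 rad/s
t2 = controller.torque(omega_ref=10.0, omega_meas=10.0)   # zero error, integral term remains
```

In the second call the proportional term vanishes, so the commanded torque comes entirely from the accumulated integral of the earlier error.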

2.4 Estimation of model parameters from torsional response

Drivetrain modal estimation

The torsional response residual function between the inertias \(j_{m}\) and \(j_{n}\), referred to the coordinate of inertia \(m\), is defined as

$$e_{m,n}^{\alpha }\triangleq \alpha _{m}-{u_{m,n}}\alpha _{n}, \text{for } m,n\in \left\{1,\ldots ,14\right\},$$
(9)

where \(\alpha\) is the angular acceleration and \(u_{m,n}\) is the relative gear ratio between \(j_{m}\) and \(j_{n}\), which refers both to the same coordinate. By definition, the gear ratio \(u_{m,n}\) is \(\frac{N_{m}}{N_{n}}\), where \(N_{m}\) and \(N_{n}\) are the speeds of the \(m^{th}\) and \(n^{th}\) inertias. The theoretical basis for estimating the natural frequencies from the frequency spectrum of \(e_{m,n}^{\alpha }\) is provided by [8]. Different residual functions can be defined using the responses obtained at different locations of the drivetrain, and these offer different levels of visibility of the drivetrain torsional modes.
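The idea behind Eq. (9) can be sketched on synthetic data: the rigid-body part of the motion cancels in the residual, so its spectrum is dominated by the torsional mode. The signals, the 40 Hz mode and the gear ratio below are hypothetical, and a plain discrete Fourier transform stands in for the modal estimation approach of [8].

```python
import cmath
import math

def residual(alpha_m, alpha_n, u_mn):
    """Eq. (9): e = alpha_m - u_mn * alpha_n, sample by sample."""
    return [am - u_mn * an for am, an in zip(alpha_m, alpha_n)]

def dominant_frequency(signal, fs):
    """Frequency (Hz) of the largest non-DC bin of a plain discrete Fourier transform."""
    n = len(signal)
    best_bin, best_mag = 1, 0.0
    for k in range(1, n // 2):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(signal))
        if abs(coeff) > best_mag:
            best_bin, best_mag = k, abs(coeff)
    return best_bin * fs / n

fs, n, u = 256.0, 256, 3.0                                # assumed sampling rate and gear ratio
t = [i / fs for i in range(n)]
rigid = [math.sin(2 * math.pi * 2.0 * ti) for ti in t]    # 2 Hz rigid-body acceleration
flex = [math.sin(2 * math.pi * 40.0 * ti) for ti in t]    # 40 Hz torsional mode
alpha_n = [r + 0.2 * f for r, f in zip(rigid, flex)]
alpha_m = [u * r - 0.5 * f for r, f in zip(rigid, flex)]  # rigid part scaled by the gear ratio

e_mn = residual(alpha_m, alpha_n, u)
f_peak = dominant_frequency(e_mn, fs)                     # peak lies at the 40 Hz mode
```

Although the 2 Hz rigid-body component dominates both acceleration signals, it is absent from the residual, which is why the peak search recovers the torsional mode.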

Estimation of moment of inertia matrix

Summing the moments on each inertia of the lumped-parameter model yields 14 equations of the form

$$\begin{aligned} & j_{i}\ddot{\theta}_{i}+D_{i}\left(\dot{\theta}_{i}-\dot{\theta}_{i-1}\right)-D_{i+1}\left(\dot{\theta}_{i+1}-\dot{\theta}_{i}\right) \\ & \quad +k_{i}\left(\theta_{i}-\theta_{i-1}\right)-k_{i+1}\left(\theta_{i+1}-\theta_{i}\right)=0, \quad \text{for } i=1,\ldots ,14.\end{aligned}$$
(10)

In matrix form, this set of equations can be written as

$$\mathbf{J}\ddot{\boldsymbol{\Theta }}\left(t\right)+\mathbf{D}\dot{\boldsymbol{\Theta }}\left(t\right)+\mathbf{K}\boldsymbol{\Theta }\left(t\right)=\mathbf{T}\left(t\right).$$
(11)

Assuming that the load and response time series are known, parameter estimation reduces to minimizing the \(L_{2}\) norm of the error. The error function is defined by

$$\boldsymbol{E}\left(t\right)\triangleq \hat{\mathbf{J}}\ddot{\boldsymbol{\Theta }}\left(t\right)+\hat{\mathbf{D}}\dot{\boldsymbol{\Theta }}\left(t\right)+\hat{\mathbf{K}}\boldsymbol{\Theta }\left(t\right)-\mathbf{T}\left(t\right).$$
(12)

The least-squares estimator is defined by

$$\begin{array}{c} \hat{\mathbf{J}}^{\mathbf{LS}},\hat{\mathbf{K}}^{\mathbf{LS}},\hat{\mathbf{D}}^{\mathbf{LS}}\in \arg \min \left\{\left\| \boldsymbol{E}\right\| ^{\mathbf{2}}\right\},\\ \mathbf{J},\mathbf{K},\mathbf{D}\geq \mathbf{0} \end{array}$$
(13)

where \(\mathbf{J}\) is a diagonal matrix, and \(\mathbf{K}\) and \(\mathbf{D}\) are nondiagonal but symmetric matrices. The matrices \(\mathbf{D}\) and \(\mathbf{K}\) are not full rank, which causes computational difficulty for the above quadratic matrix optimization problem, so convergence is not guaranteed. To remove the coupling between these equations due to the \(\mathbf{K}\) and \(\mathbf{D}\) terms, and to reduce the computational complexity by reducing the number of variables, an equivalent scalar optimization problem is constructed by summing the individual equations of all inertias. This leads to the following error function in terms of the variables \(j_{i}\), with the rotor as the reference of the rotational coordinate.

$$\begin{aligned} \boldsymbol{e}\left(t\right)= \; & \; j_{1}\alpha _{\mathbf{1}}\left(t\right)+\ldots + a_{1,i}j_{i}\alpha _{i}\left(t\right)+\ldots + \\ \; & \; a_{1,14}j_{14}\alpha _{\mathbf{14}}\left(t\right)-\tau ^{\mathbf{Rot}}\left(t\right)-{a_{1,14}}\tau ^{\mathbf{Gen}}\left(t\right),\end{aligned}$$
(14)

where \(\alpha _{i}\) is the time series of the angular acceleration of the \(i^{th}\) body, and \(\tau ^{\mathbf{Rot}}\) and \(\tau ^{\mathbf{Gen}}\) are the time series of the rotor and generator torques, respectively. The sign of \(a_{1,i}\) is determined by the direction of rotation of \(j_{i}\). This leads to the following quadratic scalar optimization problem:

$$\begin{array}{c} \hat{\boldsymbol{j}}^{\mathit{LS}}=\arg \min \left\{\left\| \boldsymbol{e}\right\| ^{\mathbf{2}}\right\},\\ j\geq 0. \end{array}$$
(15)

This estimator is robust to measurement noise and can provide a good approximation even with fewer than 14 input data samples (underdetermined case). With more than 14 samples (overdetermined case), this estimator gives a more accurate estimate than solving the linear equations directly when the input measurements are subject to independent identically distributed (i.i.d.) Gaussian noise. In other words, the total LS technique is able to correct the system with minimum perturbation [15]. The above convex optimization problem is solved numerically with Matlab CVX, and the global optimizer \(j^{LS}=\{j_{1},\ldots ,j_{14}\}\) is estimated.
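The structure of the scalar problem in Eqs. (14) and (15) can be sketched for a reduced case with only two inertias, solved through the normal equations instead of CVX. All signals, the coefficient `a12` and the true inertias below are hypothetical. The constraint \(j\geq 0\) is not enforced because the unconstrained solution is already positive in this example.

```python
import math

def estimate_two_inertias(alpha1, alpha2, tau_rot, tau_gen, a12):
    """Least-squares fit of j1, j2 in j1*alpha1 + a12*j2*alpha2 = tau_rot + a12*tau_gen."""
    x1 = alpha1
    x2 = [a12 * v for v in alpha2]
    y = [tr + a12 * tg for tr, tg in zip(tau_rot, tau_gen)]
    # Normal equations (X^T X) j = X^T y for the two unknowns.
    s11 = sum(v * v for v in x1)
    s12 = sum(v * w for v, w in zip(x1, x2))
    s22 = sum(w * w for w in x2)
    b1 = sum(v * yi for v, yi in zip(x1, y))
    b2 = sum(w * yi for w, yi in zip(x2, y))
    det = s11 * s22 - s12 * s12
    return (b1 * s22 - b2 * s12) / det, (s11 * b2 - s12 * b1) / det

# Synthetic, noise-free data generated from assumed "true" inertias.
j1_true, j2_true, a12 = 5.0, 0.5, 2.0
t = [0.01 * i for i in range(200)]
alpha1 = [math.sin(3.0 * ti) for ti in t]
alpha2 = [math.cos(7.0 * ti) + 0.3 * math.sin(11.0 * ti) for ti in t]
tau_gen = [1.0 + 0.1 * math.sin(2.0 * ti) for ti in t]
tau_rot = [j1_true * a1 + a12 * j2_true * a2 - a12 * tg
           for a1, a2, tg in zip(alpha1, alpha2, tau_gen)]

j1_hat, j2_hat = estimate_two_inertias(alpha1, alpha2, tau_rot, tau_gen, a12)
```

With noise-free data the fit recovers the assumed inertias to numerical precision. For the full 14-inertia problem with noisy measurements and the nonnegativity constraint, a constrained quadratic solver such as the one used in the paper is needed.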

Estimation of stiffness matrix

The undamped torsional frequencies of the system are a nonlinear function of the inertia and stiffness:

$$\omega _{i}=\sqrt{\text{eig}\,\left(-\boldsymbol{J}^{-1}\mathbf{K}\right)},\quad \text{for } i=1,\ldots ,14.$$
(16)

Using the natural frequencies estimated by the modal estimation approach together with the inertia matrix \(\mathbf{J}\) estimated from the LS optimization problem, the stiffness matrix \(\mathbf{K}\) is obtained as the root of \(g_{i}\), which is defined by the nonlinear equation

$$g_{i}={\omega _{i}}^{2}-\,\text{eig}\,\left(-\boldsymbol{J}^{-1}\mathbf{K}\right), (\text{for}\; i=1, \ldots, 14).$$
(17)

In general, the above equation does not determine a unique matrix \(\mathbf{K}\) for a known set of eigenvalues \(\lambda _{i}=\{{\omega _{1}}^{2},\ldots ,{\omega _{14}}^{2}\}\) of the matrix \(-\boldsymbol{J}^{-1}\mathbf{K}\). However, by imposing on \(\mathbf{K}\) the sparsity and symmetry of the lumped model, a unique matrix \(\mathbf{K}\) can be calculated numerically using the Matlab fsolve solver. This also reduces the computation cost of this algebraic matrix equation by reducing the number of variables from \(N^{2}\) to \(N\). The matrix \(\boldsymbol{J}^{-1}\mathbf{K}\) is not symmetric in the general case, which may give the impression that this equation has multiple solutions for \(\mathbf{K}\). However, the fact that \(-\boldsymbol{J}^{-1}\mathbf{K}\) always has positive eigenvalues (it is positive definite) leads us to believe that this matrix is a small perturbation of a symmetric matrix with positive eigenvalues. A small perturbation keeps the eigenvalues positive [16].
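The effect of imposing the lumped-model structure on \(\mathbf{K}\) can be illustrated with the smallest possible case: two inertias joined by one spring, for which the eigenvalue relation can be inverted in closed form. This is a sketch, not the fsolve procedure of the paper. It assumes the sign convention in which the stiffness matrix \(k\left[\begin{smallmatrix}1 & -1\\ -1 & 1\end{smallmatrix}\right]\) is positive semidefinite, so that the squared frequencies are the eigenvalues of \(\boldsymbol{J}^{-1}\mathbf{K}\). All numerical values are hypothetical.

```python
import math

def torsional_frequency(j1, j2, k):
    """Nonzero natural frequency (rad/s) of two inertias joined by one torsional spring."""
    return math.sqrt(k * (j1 + j2) / (j1 * j2))

def stiffness_from_frequency(j1, j2, omega):
    """Invert the relation above: stiffness from known inertias and a measured frequency."""
    return omega**2 * j1 * j2 / (j1 + j2)

j1, j2, k_true = 4.0, 1.0, 1000.0            # hypothetical inertias and stiffness
omega = torsional_frequency(j1, j2, k_true)   # plays the role of the estimated frequency
k_hat = stiffness_from_frequency(j1, j2, omega)
```

A general symmetric 2x2 stiffness matrix has three free entries, but the lumped-model pattern leaves a single unknown, which one measured frequency fixes uniquely. The same mechanism underlies the reduction from \(N^{2}\) to \(N\) unknowns described above.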

In practice, the conditions for the estimation problem are usually more restrictive. It is possible that only some of the eigenfrequencies of the drivetrain system can be estimated with the modal estimation approach described above, because the higher eigenfrequencies in particular are excited with lower energy by the input torque. In this case, the matrix \(\mathbf{K}\) can still be estimated from the first \(n\) eigenfrequencies using the following least-squares error estimator:

$$\begin{array}{c} \hat{\boldsymbol{k}}^{\mathit{LS}}=\arg \min \left\{\left\| \lambda _{n}-eig\left(\mathbf{\Lambda },n\right)\right\| ^{\mathbf{2}}\right\},\\ \mathbf{\Lambda }\in \Lambda \end{array}$$
(18)

where \(\mathbf{\Lambda }\) is the variable of this problem, which is a function of the unknown matrix \(\mathbf{K}\) through \(\mathbf{\Lambda }=-\boldsymbol{J}^{-1}\mathbf{K}\). \(\hat{\boldsymbol{k}}^{LS}\) is the set of nonzero elements of the matrix \(\mathbf{K}\), which are estimated by the above nonlinear matrix optimization problem. The signs of the elements of \(\boldsymbol{k}\) are enforced in the optimization problem. \(\lambda _{n}\) is the set of the \(n\) (\(n\in \{1,\ldots ,14\}\)) smallest-magnitude eigenvalues, which are known from the modal estimation, \(\lambda _{n}=\{{\omega _{1}}^{2},\ldots ,{\omega _{n}}^{2}\}\). \(eig(\mathbf{\Lambda },n)\) is the set of the \(n\) smallest-magnitude eigenvalues defined in terms of the matrix \(\mathbf{J}\) and the unknown matrix \(\mathbf{K}\). The feasible set \(\Uplambda\) is defined by

$$\Uplambda =\left\{\mathbf{\Lambda }\colon \mathbf{\Lambda }\in R^{14\times 14},\mathbf{K}\geq \mathbf{0},\boldsymbol{\Lambda }_{l,m}=0,\forall \boldsymbol{\Lambda }_{l,m}\in S^{\boldsymbol{\Lambda }}\right\},$$
(19)

where \(S^{\boldsymbol{\Lambda }}\) is the sparsity pattern of the matrix \(\mathbf{\Lambda }\). The positive definiteness and sparsity of \(\mathbf{\Lambda }\) are the nonlinear constraints imposed on this problem. For the set of positive semidefinite matrices, this problem is convex and the solution is the global optimizer. However, \(\mathbf{\Lambda }\) is not symmetric in general, so the problem as defined is nonconvex for numerical solvers, and convex optimization tools cannot solve it numerically. For this reason, the Matlab fmincon solver, a powerful tool for the general class of nonlinear nonconvex problems, is used.

2.5 Estimation of gearbox loads, and defining degradation model

Estimating contact stresses and loads at each stage

The free-body diagram for the gears in planetary and parallel stages can be seen in Fig. 3. For wind turbine gearboxes, the input and output torques for planetary (parallel) stages are the carrier (pinion) torque \(T_{C}\left(T_{P}\right)\) and the sun (wheel) torque \(T_{s}\left(T_{W}\right)\), respectively. From Fig. 3, one obtains Eqs. (20) and (21) as

Fig. 3 Free-body diagrams for (a–c) planetary and (d, e) parallel gear stages

Planetary stage:

$$\begin{array}{c} J_{s}\ddot{\theta }_{s}=T_{S}-N_{p}r_{s}F_{sp},\\ J_{p}\ddot{\theta }_{p}=r_{p}\left(F_{\mathrm{pr}}-F_{\mathrm{sp}}\right)=-J_{p}\frac{r_{s}}{r_{p}}\ddot{\theta }_{s},\\ m_{p}a_{w}\ddot{\theta }_{c}=F_{\mathrm{pr}}+F_{\mathrm{sp}}-F_{\mathrm{pc}}=m_{p}a_{w}\frac{r_{s}}{r_{s}+r_{r}}\ddot{\theta }_{s},\\ J_{c}\ddot{\theta }_{c}=T_{c}-N_{p}r_{s}F_{\mathrm{sp}}=J_{c}\frac{r_{s}}{r_{s}+r_{r}}\ddot{\theta }_{s}, \end{array}$$
(20)

Parallel stage:

$$\begin{array}{c} J_{P}\ddot{\theta }_{P}=T_{P}-r_{P}F_{PW}=-J_{P}\frac{r_{P}}{r_{W}}\ddot{\theta }_{W},\\ J_{W}\ddot{\theta }_{W}=T_{W}-r_{W}F_{PW}, \end{array}$$
(21)

One can obtain the relationships between the input and output torques by eliminating the internal forces between the elements. Differentiating the speed relations with respect to time allows the inertial torques to be expressed in terms of a single torsional acceleration and an equivalent mass moment of inertia. For the planetary stage, one obtains

$$\begin{array}{c} T_{s}=J_{EQ}\ddot{\theta }_{s}-T_{c}\frac{r_{s}}{2a_{w}},\text{ where }\\ J_{EQ}=J_{s}+\frac{N_{p}r_{s}^{2}}{2}\left(\frac{J_{p}}{r_{p}^{2}}+m_{p}\frac{a_{w}}{r_{s}+r_{r}}\right)+J_{c}\frac{r_{s}^{2}}{2a_{w}\left(r_{s}+r_{r}\right)}, \end{array}$$
(22)

with \(J_{EQ}\) as the equivalent mass moment of inertia, having contributions from the sun, \(N_{p}\) planet gears and planet carrier. Similarly, for the parallel stage one has

$$\begin{array}{c} T_{W}=J_{EQ}\ddot{\theta }_{W}+T_{P}\frac{r_{W}}{r_{P}},\\ J_{EQ}=J_{W}+J_{P}\left(\frac{r_{W}}{r_{P}}\right)^{2}, \end{array}$$
(23)

with \(J_{EQ}\) having contributions from both pinion and wheel gears. The term multiplying the torque at the pinion is the gear ratio for parallel gears.
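Eqs. (22) and (23) translate directly into a few small functions. The sketch below evaluates them for hypothetical gear data, which are not the NREL 5 MW values; the function and argument names are chosen here for illustration.

```python
def planetary_equivalent_inertia(J_s, J_p, J_c, m_p, n_p, r_s, r_p, r_r, a_w):
    """Equivalent inertia of a planetary stage referred to the sun gear, Eq. (22)."""
    planets = n_p * r_s**2 / 2.0 * (J_p / r_p**2 + m_p * a_w / (r_s + r_r))
    carrier = J_c * r_s**2 / (2.0 * a_w * (r_s + r_r))
    return J_s + planets + carrier

def planetary_sun_torque(J_eq, acc_sun, T_c, r_s, a_w):
    """Sun torque from carrier torque and sun acceleration, Eq. (22)."""
    return J_eq * acc_sun - T_c * r_s / (2.0 * a_w)

def parallel_equivalent_inertia(J_W, J_P, r_W, r_P):
    """Equivalent inertia of a parallel stage referred to the wheel, Eq. (23)."""
    return J_W + J_P * (r_W / r_P) ** 2

def parallel_wheel_torque(J_eq, acc_wheel, T_P, r_W, r_P):
    """Wheel torque from pinion torque and wheel acceleration, Eq. (23)."""
    return J_eq * acc_wheel + T_P * r_W / r_P

# Hypothetical stage data with a_w = r_s + r_p and r_r = r_s + 2*r_p.
J_eq_planetary = planetary_equivalent_inertia(
    J_s=2.0, J_p=1.0, J_c=50.0, m_p=100.0, n_p=3,
    r_s=0.2, r_p=0.3, r_r=0.8, a_w=0.5)
T_sun_steady = planetary_sun_torque(J_eq_planetary, acc_sun=0.0, T_c=1000.0, r_s=0.2, a_w=0.5)
J_eq_parallel = parallel_equivalent_inertia(J_W=10.0, J_P=1.0, r_W=0.6, r_P=0.2)
```

At zero acceleration the inertial term drops out and the sun torque reduces to the carrier torque scaled by \(r_{s}/(2a_{w})\), which is the purely kinematic torque ratio of the stage.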

Torque transfer between stages is made via shafts, and can be estimated by

$$T_{\mathrm{out}}=T_{in}-J_{S}\ddot{\theta }_{S},$$
(24)

where \(J_{S}\ddot{\theta }_{S}\) is the inertial torque of the shaft [17]. Therefore, for the NREL 5 MW drivetrain, one should have

$$\begin{array}{cc} T_{in1}=T_{c1}=T_{R}-J_{\mathrm{LSS}}\ddot{\theta }_{R}, & T_{\mathrm{out}1}=T_{s1}=J_{EQ1}\ddot{\theta }_{s1}-T_{in1}\frac{r_{s1}}{2a_{w1}},\\ T_{in2}=T_{c2}=T_{\mathrm{out}1}-J_{\mathrm{ISS}}\ddot{\theta }_{s1}, & T_{\mathrm{out}2}=T_{s2}=J_{EQ2}\ddot{\theta }_{s2}-T_{in2}\frac{r_{s2}}{2a_{w2}},\\ T_{in3}=T_{P}=T_{G}-J_{\mathrm{HSS}}\ddot{\theta }_{G}, & T_{\mathrm{out}3}=T_{W}=J_{EQ3}\ddot{\theta }_{W}-T_{in3}\frac{r_{W}}{r_{P}}, \end{array}$$
(25)

The input torques estimated above can be used to estimate the stresses at the different gear stages as described below.

Gear contact stresses are analyzed in this work following ISO 6336-2:2019 [14]. According to this standard, the contact stresses are defined as

$$\sigma _{Hi}=Z_{BD}Z_{H}Z_{E}Z_{\epsilon }Z_{\beta }\sqrt{K_{A}K_{\upgamma \mathrm{i}}K_{vi}K_{H\upbeta \mathrm{i}}K_{H\upalpha \mathrm{i}}}\sqrt{\frac{2000T_{i}}{bd_{1}^{2}}\frac{u+1}{u}},$$
(26)

where \(u\) is the gear ratio of the pair, \(d_{1}\) and \(b\) are reference diameter and face width of the pinion, respectively, \(T_{i}\) is the input torque. The other parameters account for different aspects of the problem, such as contact relations \(Z_{BD}\) and ratios \(Z_{\epsilon }\), material properties \(Z_{E}\), helix angle \(Z_{\beta }\), mesh load \(K_{\gamma }\), gear speed \(K_{v}\), load distributions \(K_{H\beta i}\) and \(K_{H\alpha i}\). These factors are discussed in [15] and references therein. Eq. (26) can be rewritten in a compact form as

$$\sigma _{Hi}=C\sqrt{T_{i}},$$
(27)

where \(C\) represents the \(K\) and \(Z\) design parameters mentioned above. This parameter, unknown to the maintenance engineer, can be roughly estimated from nominal conditions as \(C=\sigma _{HN}/\sqrt{T_{iN}}\). Therefore, Eq. (27) turns to

$$\sigma _{Hi}=\sigma _{HN}\sqrt{\frac{T_{i}}{T_{iN}}},$$
(28)

where \(\sigma _{Hi}\) is defined in terms of nominal contact stress \(\sigma _{HN}\) and torque \(T_{iN}\), which have a clear physical meaning. Additionally, one can also expand Eq. (28) using Taylor series around the nominal torque \(T_{iN}\) as

$$\begin{aligned} \sigma _{Hi}-\sigma _{HN}= & \left(T_{i}-T_{iN}\right)\frac{\sigma _{HN}}{2T_{iN}}-\left(T_{i}-T_{iN}\right)^{2}\frac{\sigma _{HN}}{8T_{iN}^{2}} \\ + & O\left(\left(T_{i}-T_{iN}\right)^{3}\right).\end{aligned}$$
(29)

This formula gives a polynomial relationship between stress and load deviations. The accuracy of this expression for larger torque deviations can be increased by including further terms of the Taylor series. The nominal stresses are of order \(10^{3}\) and the nominal torques vary between \(10^{4}\) and \(10^{6}\), thus unitary torque deviations lead to 0.05 to 5% stress deviations if only the first term of the series in Eq. (29) is considered. A limitation of this derivation is that \(C\) depends on parameters that may change with \(T_{i}\), but these variations should be negligible.
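The quality of the truncated series in Eq. (29) can be checked numerically against Eq. (28). The nominal stress, nominal torque and the 10% overload below are hypothetical values used only to show the orders of magnitude.

```python
import math

def contact_stress(T, T_nom, sigma_nom):
    """Exact square-root scaling of Eq. (28)."""
    return sigma_nom * math.sqrt(T / T_nom)

def contact_stress_taylor(T, T_nom, sigma_nom, order=2):
    """First- or second-order Taylor approximation of Eq. (29)."""
    dT = T - T_nom
    sigma = sigma_nom + dT * sigma_nom / (2.0 * T_nom)
    if order >= 2:
        sigma -= dT**2 * sigma_nom / (8.0 * T_nom**2)
    return sigma

sigma_nom, T_nom = 1000.0, 1.0e5   # assumed nominal stress and torque
T = 1.1 * T_nom                    # assumed 10% torque overload

exact = contact_stress(T, T_nom, sigma_nom)
first = contact_stress_taylor(T, T_nom, sigma_nom, order=1)
second = contact_stress_taylor(T, T_nom, sigma_nom, order=2)
```

For this overload, the first-order estimate is about 1.2 stress units above the exact value, and the second-order term reduces the error to roughly 0.06 units.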

Remaining useful lifetime estimation

The real-time accumulated damage is estimated as follows. First, the time-varying transmitted loads of the gear pairs of the three gearbox stages are estimated using the torque observers designed as explained in the previous part. This estimation is based on the real-time aerodynamic and generator torques and on the time-varying digital twin model parameters. Then the gear tooth surface pitting stress is estimated for the different gears as explained above. The number of gear tooth contact stress cycles at different stress levels is counted using the time-domain rainflow cycle counting approach [18]. The outputs are the stress amplitude levels \(\sigma _{s}\) and the number of stress cycles at \(\sigma _{s}\) for \(s=1,\ldots ,S\). To account for the influence of a nonzero mean stress level, the Goodman rule is employed to calculate the effective stress (the equivalent zero-mean alternating stress) [19]:

$$\sigma _{s}^{e}=\frac{\sigma _{s}}{1-\frac{\sigma _{m}}{\sigma _{u}}},\forall \left(\mathrm{s}\in \left\{1,\ldots ,S\right\}\right),$$
(30)

where \(\sigma _{m}\) and \(\sigma _{u}\) are the mean stress and material yield strength, respectively. The accumulated pitting damage for the data block \(t\) with \(S\) different stress levels \(\sigma _{s}^{e}\) (\(s\in \{1,\ldots ,S\}\)) is calculated by using Miner’s rule as

$$d^{t}=\sum _{s=1}^{S}\frac{n_{s}}{N_{s}},$$
(31)

where \(n_{s}\) is the number of cycles at the stress level \(\sigma _{s}^{e}\) and \(N_{s}\) is the number of cycles to yield at stress level \(\sigma _{s}^{e}\), where \(N_{s}=k(\sigma _{s}^{e})^{-m}\). The total absolute online accumulated damage will then be calculated by

$$D=\sum _{t=1}^{T}d^{t},$$
(32)

where \(T\) denotes the last data block, which represents the current time. This method can also be used to estimate the relative damage between different operational periods, giving insight into how degradation varies from one period to another.
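The damage calculation of Eqs. (30) to (32) can be sketched as follows, assuming that the rainflow counting step has already produced a list of (amplitude, mean, count) triples for each data block. The material constants `k`, `m` and `sigma_u` and the cycle data are hypothetical.

```python
def goodman_effective_stress(sigma_amp, sigma_mean, sigma_u):
    """Equivalent zero-mean alternating stress, Eq. (30)."""
    return sigma_amp / (1.0 - sigma_mean / sigma_u)

def allowable_cycles(sigma_eff, k, m):
    """Stress-life curve N_s = k * sigma^(-m)."""
    return k * sigma_eff ** (-m)

def block_damage(cycles, sigma_u, k, m):
    """Miner's sum for one data block, Eq. (31); cycles holds (amplitude, mean, count)."""
    damage = 0.0
    for amp, mean, count in cycles:
        sigma_eff = goodman_effective_stress(amp, mean, sigma_u)
        damage += count / allowable_cycles(sigma_eff, k, m)
    return damage

# Hypothetical material data and rainflow output for two data blocks.
sigma_u, k, m = 1500.0, 8.0e12, 3.0
blocks = [
    [(100.0, 750.0, 1000)],   # amplitude 100 at mean 750: effective stress 200
    [(200.0, 0.0, 500)],      # amplitude 200 at zero mean: effective stress 200
]
block_damages = [block_damage(b, sigma_u, k, m) for b in blocks]
total_damage = sum(block_damages)   # Eq. (32)
```

Both blocks act at the same effective stress, so their damages differ only through the cycle counts. Comparing `block_damages` entries directly gives the relative damage between operational periods mentioned above.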

3 Simulation results

3.1 Aerodynamic and generator torques, and drivetrain torsional response

The input aerodynamic torque obtained from SIMO-RIFLEX-AeroDyn and the performance of the designed controller, which adjusts the generator torque under variable input torque to regulate the shaft speed, are shown in Fig. 4. The drivetrain model responses are the angular displacement, velocity and acceleration of the different bodies in the 14-DOF model described above. As an example, the angular velocity responses of the rotor and generator are shown in Fig. 4. For demonstration purposes, the torques and responses are scaled with the gear ratio. The turbine is assumed to operate near rated conditions.

Fig. 4 Drivetrain model loads and responses: (a) rotor and generator torques, (b) torsional response

3.2 Estimation of dynamic properties and digital twin model parameters

The undamped natural frequencies of the 14-DOF model for the healthy system are listed in Table 1. The torsional response error function can be defined between different bodies in the drivetrain model. For example, the undamped natural frequencies estimated from the angular acceleration error function of the 2nd gearbox stage are shown in Fig. 5. The main advantage of angular acceleration over angular velocity and displacement is that it amplifies the higher natural frequencies in the response. By defining the angular acceleration error functions in terms of different pairs of bodies in the model, it is possible to estimate the different drivetrain modes. However, the prominence of each mode differs between error functions defined for different bodies. As can be seen in Fig. 5, the 6th and 7th modes are not visible in the error function of the 2nd stage, but they can be observed in the error function defined between the rotor and generator bodies. The natural frequencies estimated by the modal estimation approach based on the angular acceleration error function are listed in Table 1. The maximum relative estimation error is less than 1.5%. In practice, the torsional response in commercially available turbines is only accessible through the encoders placed on the rotor and possibly the generator shafts, so some of the modes may not be observable using only the error function between rotor and generator.

Table 1 14-DOF drivetrain model natural frequencies (in Hz)
Fig. 5 Estimation of torsional modes from gearbox 2nd stage error function

The influence of pitting on the parameters of the drivetrain equivalent model is usually represented by a decrease in the mesh stiffness of the faulty gear pair [20, 21]. The NREL 5 MW drivetrain presented in Sect. 2.3 is used to illustrate this behavior, assuming that the early occurrence of pitting at the sun gear leads to a 10% decrease in the sun-planet mesh stiffness. A 10% reduction of the faulty gear mesh stiffness may be high for a very early stage fault, but this assumption is made to demonstrate more clearly the influence of the fault on the drivetrain dynamic properties. The results for this fault in the first and second gear stages are given in the second and third rows of Table 2: pitting in the first stage sun gear mainly affects the 7th, 8th and 9th resonances, whereas pitting in the second stage sun gear mainly affects the 14th resonance. Specifically, a 10% reduction in the sun-planet mesh stiffness of the first gear stage results in around 2.5%, 2.5% and 4.5% reduction of the 7th, 8th and 9th drivetrain natural frequencies, respectively, and a 10% reduction in the sun-planet mesh stiffness of the second gear stage results in about 5% reduction of the highest drivetrain natural frequency. The influence of a pitting fault in the ring gear of the first and second gear stages on the drivetrain natural frequencies is listed in the fourth and fifth rows of Table 2, respectively. A 10% reduction in the ring-planet mesh stiffness of the first gear stage results in around 5%, 3% and 3% reduction of the 6th, 7th and 8th natural frequencies, and a 10% reduction in the ring-planet mesh stiffness of the second gear stage results in 3.5%, 4% and 5% reduction of the 10th, 11th and 13th natural frequencies, respectively.
Since the relative error of estimating the natural frequencies can be as high as 1.5%, variations of the natural frequencies which are less than 1.5% cannot be used as the indicator of fault. More detailed models of pitting fault can be engaged to capture more precisely the dynamics of fault [22] and consequent influence on the drivetrain dynamic properties, which is not the scope of this work. The influence of drivetrain faults at system-level on dynamic properties is analytically studied by [23]. Monitoring the variations in the drivetrain system natural frequencies can support the fault detection of the drivetrain at component-level, e.g. faults in the gears of the gearbox gear stages, which is also not the main scope of this work.
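The sensitivity of the natural frequencies to a mesh stiffness reduction can be illustrated with a minimal sketch. The code below is not the 14-DOF model of this paper: it assumes a hypothetical three-body torsional chain with illustrative inertia and stiffness values, and the function name `natural_frequencies_hz` is introduced here for illustration only. It solves the undamped eigenvalue problem and compares the healthy case with a 10% reduction of one stiffness. Since the eigenvalues scale with stiffness, a 10% reduction of a single stiffness element can lower any natural frequency by at most \(1-\sqrt{0.9}\approx 5.1\%\), which is consistent with the magnitude of the largest shifts reported above.

```python
import numpy as np

def natural_frequencies_hz(inertias, stiffnesses):
    """Undamped natural frequencies (Hz) of a free-free torsional chain.

    inertias:    moments of inertia J_i of the bodies.
    stiffnesses: torsional stiffness k_i between body i and body i+1.
    """
    J = np.asarray(inertias, dtype=float)
    k = np.asarray(stiffnesses, dtype=float)
    n = J.size
    K = np.zeros((n, n))
    # Assemble the stiffness matrix of the chain, one spring at a time.
    for i, ki in enumerate(k):
        K[i, i] += ki
        K[i + 1, i + 1] += ki
        K[i, i + 1] -= ki
        K[i + 1, i] -= ki
    # Symmetric form M^(-1/2) K M^(-1/2) has the same eigenvalues as M^(-1) K.
    scale = 1.0 / np.sqrt(J)
    A = K * np.outer(scale, scale)
    eigvals = np.clip(np.linalg.eigvalsh(A), 0.0, None)  # drop round-off negatives
    return np.sqrt(eigvals) / (2.0 * np.pi)

# Hypothetical values, chosen only for illustration.
J = [100.0, 2.0, 1.0]
k = [1.0e4, 2.0e4]

healthy = natural_frequencies_hz(J, k)
faulty = natural_frequencies_hz(J, [k[0], 0.9 * k[1]])  # 10% stiffness loss

# Skip the first (rigid-body) mode, whose frequency is zero.
relative_shift = (faulty[1:] - healthy[1:]) / healthy[1:]
print("healthy [Hz]:", healthy[1:])
print("faulty  [Hz]:", faulty[1:])
print("relative shift:", relative_shift)
```

Comparing `relative_shift` against the 1.5% estimation error bound gives a simple rule for deciding which modes carry usable fault information in such a model.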

Table 2 14-DOF drivetrain model natural frequencies in the different fault cases (in Hz)

The actual values of the moments of inertia of the different bodies in the 14-DOF model, and the performance of the designed LS estimator in estimating those parameters from the torsional measurements based on the theory elaborated in Sect. 2, are given in Table 3. The accuracy increases with the number of input samples, while data outliers are filtered to improve the estimation. The number of input samples can be selected to reach a good tradeoff between accuracy and computational speed. The actual values of the diagonal elements of the stiffness matrix, as the main stiffness parameters, are also listed in Table 3. If the exact values of the eigenfrequencies and inertia parameters are available, the solution of Eq. (16) gives the exact values of the stiffness parameters. The estimated values of the main-diagonal stiffness parameters, which include the estimation errors of both the natural frequencies and the inertia parameters, are listed in Table 3. EstimatedM1 denotes the first proposed stiffness estimation method, which applies when all the eigenfrequencies are available. EstimatedM2 denotes the second stiffness estimation method, which applies when some of the frequency modes are not observable. To calculate EstimatedM2, two cases are simulated in which only the first ten and the first eleven estimated modes are available, and the optimization problem defined by the LS estimator in Eq. (18) is solved. The results are listed in Table 4. The estimation error of the associated underdetermined LS estimator decreases as more frequency modes are known. It is worth noting that our extensive simulations show that large errors in the input data of the stiffness estimation problem, namely the estimated eigenfrequencies and inertia parameters, may cause instability of the nonlinear numerical solver of Eq. (16). 
From the stability perspective, the LS estimator outperforms the first method which is based on solving the nonlinear Eq. (16), because the eigenfrequencies of Λ in the LS estimator are forced to be positive, which ensures the convergence of the solver. In other words, EstimatedM2 is more robust to the input data, and is recommended also when all the frequency modes are available.
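The dependence of the LS estimation accuracy on the number of input samples can be illustrated with a generic sketch. It does not reproduce the estimator of Sect. 2: it assumes a single rigid body obeying \(T = J\alpha\), synthetic measurements with Gaussian noise, and a hypothetical helper `estimate_inertia`. All numerical values are illustrative.

```python
import numpy as np

def estimate_inertia(alpha, torque):
    """Least-squares estimate of J in torque = J * alpha (single body)."""
    A = np.asarray(alpha, dtype=float).reshape(-1, 1)
    b = np.asarray(torque, dtype=float)
    solution, *_ = np.linalg.lstsq(A, b, rcond=None)
    return float(solution[0])

rng = np.random.default_rng(0)   # fixed seed for repeatability
J_true = 250.0                   # hypothetical inertia value
noise_std = 5.0                  # hypothetical torque measurement noise

errors = {}
for n_samples in (14, 30, 300):
    alpha = rng.uniform(-2.0, 2.0, n_samples)             # angular acceleration
    torque = J_true * alpha + rng.normal(0.0, noise_std, n_samples)
    J_est = estimate_inertia(alpha, torque)
    errors[n_samples] = abs(J_est - J_true) / J_true

for n_samples, err in errors.items():
    print(f"{n_samples:4d} samples: relative error = {err:.4%}")
```

In this simplified setting the standard deviation of the estimate decreases roughly with the square root of the number of samples, which is the tradeoff between accuracy and computational speed mentioned above.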

Table 3 Estimation of the digital twin model parameters by using the proposed approach
Table 4 Estimated stiffness by EstimatedM2

3.3 Loads, contact stress and accumulated damage

Contact load and stress validation on the three different gear stages

The input and output torques derived in Sect. 2.5 and given by Eq. (25) are shown in Fig. 6. All torques present a similar oscillating pattern around their nominal values, and there is little difference between the output and input torques at adjacent stages, see Fig. 6d–b and e–c.

Fig. 6 Comparison between nominal and estimated torques. a–c input torques and d–f output torques at the 1st, 2nd, and 3rd stages, respectively

The stresses at a sun-planet gear pair can be estimated by inserting the input torques shown in Fig. 4 into the first-order Taylor series approximation based on Eq. (12). The results are shown in Fig. 7, together with results from the high-fidelity Simpack multi-body simulation model of the NREL 5 MW drivetrain reported in [12]. The simplified torsional model shows reasonable agreement with the high-fidelity multi-body simulation platform.

Fig. 7 Sun-planet contact stress comparison at second stage: results using estimated input torque and 1st order Taylor series (see Sect. 2.5) against Simpack simulation

Calculation of average accumulated damage

The SN curve parameters for pitting fatigue damage calculations depend on material, load and gear geometry. The two SN curve parameters, k and m, used for the pitting fatigue damage estimated in this paper are obtained based on ISO 6336-2:2019 [14]. Only one region is assumed for the SN curve. For instance, k is \(3.051\times 10^{44}\) and m is 12 for the sun gear of the 2nd stage. The number of stress cycles of the sun gear is multiplied by 3, because it meshes with 3 gears simultaneously at each revolution. The accumulated damage of the 1st stage sun, 2nd stage sun and 3rd stage pinion gears due to contact fatigue for one hour of operation, obtained with the proposed digital twin-based remaining useful lifetime monitoring approach, is listed in Table 5. The online accumulated damage D of the 2nd stage sun gear is shown in Fig. 8.
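The damage accumulation step can be sketched as follows. The sketch assumes a single-slope SN curve of the form \(N = k\,\sigma^{-m}\) combined with linear (Miner) damage summation, with stress units assumed consistent with k. The function name `miner_damage` and the binned stress levels and cycle counts are hypothetical and are not results of this paper. Only k, m and the factor 3 for the sun gear are taken from the text above.

```python
import numpy as np

def miner_damage(stress_levels, cycle_counts, k, m, mesh_multiplier=1):
    """Linear damage sum D = sum(n_i / N_i) with N_i = k * sigma_i**(-m).

    mesh_multiplier scales the counted cycles, e.g. 3 for a sun gear that
    meshes with three planets per revolution.
    """
    sigma = np.asarray(stress_levels, dtype=float)
    n = np.asarray(cycle_counts, dtype=float) * mesh_multiplier
    cycles_to_failure = k * sigma ** (-m)
    return float(np.sum(n / cycles_to_failure))

K_SN = 3.051e44   # SN curve parameter k for the 2nd stage sun gear (from the text)
M_SN = 12         # SN curve slope m (from the text)

# Hypothetical output of a cycle counting step: stress levels and counts.
stress_levels = [900.0, 1000.0, 1100.0]
cycle_counts = [4000, 2500, 500]

damage = miner_damage(stress_levels, cycle_counts, K_SN, M_SN, mesh_multiplier=3)
print(f"accumulated damage over the block: {damage:.3e}")
```

With m = 12 the damage per cycle is very sensitive to the stress level: under this SN curve form, a 10% increase in stress multiplies the damage per cycle by \(1.1^{12}\approx 3.1\), which is why the accuracy of the estimated contact stress matters for the damage estimate.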

Table 5 Accumulated damage for 3600 s of operation
Fig. 8 Accumulated damage of the sun gear of the gearbox 2nd stage over one hour of operation

3.4 Practical implementation challenges and real-time performance

Challenges to overcome for practical implementation

In practical turbines, uncertainties are an important issue. The different sources of uncertainty and their influence on the digital twin-based degradation estimation approach, grounded on drivetrain torsional equivalent models for estimating the residual life of the drivetrain shafts, are investigated by Moghadam and Nejad in [24]. That study uses stochastic models and statistical approaches, where uncertainties in both the model estimation and the real-time measurements are addressed to mitigate their influence and improve the efficacy of the digital twin approach in the presence of uncertainties. The statistical uncertainties in the fatigue calculation due to material uncertainties can be accounted for by stochastic modelling of damage, as explained in [24]. It is also worth noting that the employed least-squares estimator is robust to measurement noise, so that it can mitigate the influence of measurement noise in general and cancel the influence of Gaussian noise [25]. The other issue in practical turbines is the need for computationally fast and inexpensive health monitoring techniques. The proposed algorithm is based on inexpensive linear equivalent models of the drivetrain, and its computational time depends on the sampling frequency and the number of input measurement samples. The algorithm can be executed in the order of one second. The method relies on a linear regression-based estimator using a few measurement samples, linear torque observers, and real-time cycle counting of the estimated stress. The selected linear 14-DOF drivetrain model is a compromise between simplicity and accuracy, and yields load estimates similar to those of high-fidelity models for the purpose of estimating the degradation of the gears. 
Practical implementation of the algorithm, integrated with the fully automated turbine control and monitoring systems, requires automated pre-processing of the turbine measurements to continuously feed updated input data into the proposed system identification approach, which estimates the real-time equivalent model of the drivetrain in the digital twin framework. This task aims at optimizing the data streaming and continuous processing architectures that handle the experimental data aspect of the digital twin framework. Within the pre-processing stage, a significant step will be the use of time signal correlation methods to verify and ensure time synchronicity when the data comes from different sources, by mapping the signals onto common spaces and dealing with the effect of different sample rates. Another practical issue is the possible need for new sensor installations to access additional torsional measurements, communication links with the main operation control unit, and the processing power to execute the algorithm in near real-time.
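One possible way of dealing with different sample rates in the pre-processing stage is to resample all signals onto a common time grid. The sketch below is an assumption made for illustration, not the pre-processing used in this work: it uses linear interpolation with `numpy.interp`, and the function name `resample_to_common_grid`, the sampling rates and the synthetic ramp signals are all hypothetical. It also presumes that the time stamps of both sources already refer to the same clock.

```python
import numpy as np

def resample_to_common_grid(t_a, x_a, t_b, x_b, fs_common):
    """Linearly interpolate two signals onto one uniform time grid.

    The grid covers only the time interval where both signals overlap.
    """
    t_start = max(t_a[0], t_b[0])
    t_end = min(t_a[-1], t_b[-1])
    # Small offset guards against floating-point round-off in the product.
    n = int(np.floor((t_end - t_start) * fs_common + 1e-9)) + 1
    t = t_start + np.arange(n) / fs_common
    return t, np.interp(t, t_a, x_a), np.interp(t, t_b, x_b)

# Hypothetical encoder signals: rotor side at 100 Hz, generator side at 300 Hz.
t_rotor = np.arange(0.0, 1.0, 1.0 / 100.0)
t_gen = np.arange(0.0, 1.0, 1.0 / 300.0)
speed_rotor = 2.0 * t_rotor          # synthetic ramp signals, illustration only
speed_gen = 2.0 * t_gen + 0.5

t, rotor_rs, gen_rs = resample_to_common_grid(
    t_rotor, speed_rotor, t_gen, speed_gen, fs_common=300.0
)
error_signal = gen_rs - rotor_rs     # sample-wise difference on the common grid
print(len(t), error_signal[:3])
```

Linear interpolation is the simplest choice. For signals with significant content near the lower Nyquist frequency, a band-limited resampling method would be more appropriate.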

Real-time performance metrics

As discussed in detail in Sect. 2, the digital twin algorithm consists of three main components, namely drivetrain equivalent model identification, load estimation and damage calculation. Therefore, for the simulation-based studies of this paper, the run-time of algorithm can be estimated by the summation of data recording time, time for estimation of equivalent model parameters, time for estimation of load and time for calculation of damage.

The model identification computation time, \(t^{\text{model}},\) is mainly determined by the linear regression estimator, whose computational time depends on the model complexity and the number of input samples. For the 14-DOF model, the model identification algorithm can estimate the model parameters with less than 5% error using only 14 data samples, and the estimation error can be reduced by increasing the number of input samples. Assuming 30 data samples and a sampling frequency of 300 Hz, the length of the recorded data block is 0.1 s. The processing time depends on the processing power, but can be kept to a fraction of a second. The load and stress estimation computation time, \(t^{\mathrm{load}},\) is determined by the designed torque observers, which are simple arithmetic functions of the estimated model parameters and the measured real-time response, with a computational time very close to zero. The damage calculation computation time, \(t^{\text{damage}}\), depends on the real-time cycle counting operation on the data blocks of the stress time series, which can be executed in a fraction of a second.

Two real-time metrics are employed to evaluate the algorithm run-time, namely the real-time factor (RTF) and the run time (TR), which can be estimated as [26]

$$\begin{array}{c} \mathrm{RTF}=\frac{t^{\text{process}}}{t^{\text{input}}},\\ \mathrm{TR}=t^{\text{input}}+t^{\text{process}},\\ t^{\text{process}}=t^{\text{model}}+t^{\mathrm{load}}+t^{\text{damage}}. \end{array}$$
(33)

These equations can be used to provide a rough estimation of the real-time performance of the algorithm, based on testing it with a system with 1.9 GHz Intel Core i7 CPU and 16 GB memory, which leads to

$$\begin{aligned} \mathrm{RTF}&=\frac{0.8\,\mathrm{s}+0.1\,\mathrm{s}+0.2\,\mathrm{s}}{0.1\,\mathrm{s}}=11,\\ \mathrm{TR}&=0.1\,\mathrm{s}+0.8\,\mathrm{s}+0.1\,\mathrm{s}+0.2\,\mathrm{s}=1.2\,\mathrm{s}.\end{aligned}$$

Generally speaking, for a system to be considered real-time, RTF should be ≤ 1 [27], which the tested implementation does not satisfy. The run-time of the algorithm can be compensated in the proposed digital twin approach by reusing the results of the data processing phase for all the data blocks captured during the processing time. This is expected to provide sufficiently accurate results for the fatigue damage estimation of the drivetrain components, since degradation usually develops over periods much longer than one second. Hardware-in-the-loop (HIL) simulation of the algorithm, to study the feasibility of executing it in real time in practice, is left for future work.
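The arithmetic behind the two metrics can be reproduced with a short sketch. The function name `real_time_metrics` is introduced here for illustration, and the assignment of the three measured times (0.8 s, 0.1 s, 0.2 s) to the model, load and damage steps is an assumption based on the order of the terms in Eq. (33).

```python
def real_time_metrics(t_input, t_model, t_load, t_damage):
    """Return (RTF, TR) following the definitions in Eq. (33)."""
    t_process = t_model + t_load + t_damage
    rtf = t_process / t_input
    tr = t_input + t_process
    return rtf, tr

# Data block length: 30 samples recorded at 300 Hz.
n_samples = 30
sampling_frequency_hz = 300.0
t_input = n_samples / sampling_frequency_hz   # 0.1 s

# Measured times from the text; their mapping to the steps is assumed.
rtf, tr = real_time_metrics(t_input, t_model=0.8, t_load=0.1, t_damage=0.2)

print(f"RTF = {rtf:.1f}, TR = {tr:.1f} s")
print("real-time" if rtf <= 1.0 else "not real-time (RTF > 1)")
```

An RTF of about 11 means that roughly eleven new data blocks are recorded while one block is being processed, which is the situation the compensation strategy described above is meant to handle.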

4 Conclusion

The possibility of using a 14-DOF lumped parameter model as the drivetrain digital twin model for monitoring the remaining useful lifetime of the gears in the different gearbox stages under contact fatigue stress was investigated. An algorithm for near real-time estimation of the parameters of the drivetrain equivalent model from torsional measurements was proposed and tested by simulation studies. The contact load and stress estimated by the designed load observers in the proposed digital twin approach were validated against results from a Simpack high-fidelity simulation model, and showed fair agreement. At the same time, the proposed digital twin approach, being based on a linear torsional model, is computationally fast and can be integrated with fully automated turbine control and monitoring systems for online monitoring of gearbox components. The estimated stress was then used for near real-time estimation of the fatigue damage of the gears by means of a physics-based degradation model. The influence of pitting faults on the system's resonances was demonstrated by simulation studies, which can support the fault diagnosis of the gearbox.

Hardware-in-the-loop (HIL) simulation of the algorithm to check the feasibility of executing it in real time for fault prognosis, addressing the described practical issues, and implementing the algorithm in an operational wind turbine drivetrain system are the next steps to be investigated in future work.