Reduced-order model-inspired system identification of geometrically nonlinear structures: application to a nonlinear cantilever-type structure

In the field of structural dynamics, system identification usually refers to building mathematical models from an experimentally obtained data set. To build reliable models using the measurement data, the mathematical model must be representative of the structure. In this work, attention is given to robust identification of geometrically nonlinear structures, particularly those with large inertial effects. We draw inspiration from reduced-order modelling to determine a suitable model for the system identification. There are large similarities between reduced-order modelling and system identification fields, i.e. both are used to replicate the dynamics of a system using a mathematical model with low complexity. Reduced-order models (ROMs) can accurately capture the physics of a system with a low number of degrees of freedom; thus, in system identification, a model based on the form of a ROM is potentially more robust. Nonlinear system identification of a structure is presented, where inspiration is taken from a novel ROM to form the model. A finite-element model of the structure is built to simulate an experiment, and the identification is performed. It is shown how the ROM-inspired model in the system identification improves the accuracy of the predicted response, in comparison with a standard nonlinear model. As the data are gathered from simulations, system identification is first demonstrated on the high-fidelity data, and then, the fidelity of data is reduced to represent a more realistic experiment. A good response agreement is achieved when using the ROM-inspired model, which accounts for the kinetic energy of un-modelled modes. The estimated parameters of this model are also shown to be more robust and rely on the underlying physics of the system.


Introduction
The trend for aesthetic engineering structures, together with improvements in the design methodologies, has resulted in more slender, flexible and lightweight structures. These structures might satisfy the static loading resilience criteria but they often fail under certain dynamic loads, as dynamic loads can cause highamplitude vibrations which push the structures to behave beyond their linear regime [1]. This brings the need for nonlinear dynamic analysis to accurately represent the behaviour of such structures [2].
For nonlinear dynamic analysis of a system, a reliable mathematical model is required. Nonlinear system identification methods can be used to build a model of the physical structure using data measured experimentally. The discipline of nonlinear data-driven modelling has attracted attention from a wide range of research fields; here we consider this from structural dynamics perspective. One of the earliest works on nonlinear system identification was presented by Masri [3]. Throughout the years, the field has witnessed considerable growth with development of new methodologies and the rise of numerous challenges. Mainly, nonlinear system identification can be divided into following main steps: data collection and testing, nonlinearity detection, characterisation and parameter estimation [4][5][6]. In the data collection step, data are gathered from the system, typically from physical experiments; in detection, the behaviour of the system is examined to identify whether the system behaves nonlinearly, whilst in the characterisation step, the functional form of nonlinearity is determined and a model is selected for the system, and in parameter estimation, the parameters of the selected model are estimated using the data [4]. All these phases have undergone some development, whilst parameter estimation phase has gained significant attention as researchers have progressively found the importance of parameters' robustness [4]. One of the main challenges of parameter estimation in nonlinear system identification is the individualistic nature of nonlinear systems compared to linear systems which have a well-defined functional form [6]. This can add uncertainty in the parameters of the selected model, and typically uncertainty quantification techniques are required to identify the confidence bound and distribution of parameters. Physics-based parametric approaches are used where a functional form is available or synthesised for the system and parameters are estimated [7][8][9]. Meanwhile, approaches such as Bayesian probabilistic framework and Markov chain Monte Carlo have been put forward to tackle the model uncertainty of nonlinear systems in identification [10][11][12][13][14]. Similarly, nonparametric techniques of deep learning and neural networks have also opened paths to nonlinear system identification [15,16]. These approaches can facilitate an optimum model to be selected from a set of candidate models [17] or allow representation of complex nonlinear dynamical functions through finding their intrinsic features [16]. However, artificial neural network techniques mostly deliver a blind approximation of closed-form expressions of a blackbox model [18]. More recently, it has been shown that prior knowledge and physics-based information can be embedded into deep learning and neural network-based methods [19,20]. In [19], an ordinary differential equation model formulation is considered, where a feedforward neural network is used to include a discrepancy term in the model. Similarly, in [21], a more rigorous discussion of embedding physics-based information into machine learning algorithms for inverse problems is given. These may yield a hybrid model which is composed of physics-based terms and neural network terms. However, the neural network terms remain as black-box expressions. In [18], Quaranta and coworkers present a range of computational techniques which are generally used in nonlinear system identification. Genetic programming and other artificial intelligence algorithms such as particle swarm optimisation are applied on a range of nonlinear dynamical systems [18]. These approaches have imposed a change of paradigm in nonlinear system identification in the recent years [4].
In contrast to neural network-based techniques, sparse regression methods are shown to be appealing for obtaining analytical expressions of the underlying dynamics of the systems [22,23]. These methods, which are referred to as sparse identification of nonlinear dynamics (SINDy), emerged as effective tools to discover the governing equations representing dynamical systems and different modifications have been proposed over recent years [24][25][26][27]. The models included in the design matrix are assumed based on a prior knowledge, and the size of the matrix can be large enough to include many assumed models [27]. As the dynamics of structures get more complex, and the degrees of freedom (DoFs) of the model increase, so does the number of nonlinear parameters and assuming candidate models may require expert domain knowledge. This can further increase uncertainty in the estimated parameters. Models would be more reliable when there is a meaningful relation among the nonlinear parameters and physical parameters of a system.
Reduced-order models (ROMs) for nonlinear structural dynamics were proposed as a means to improve the computational efficiency for nonlinear systems [28]. They are used as a smaller-sized representative model of a complex structure through a set of secondorder differential equations [29]. Similarly, in system identification, the dynamics of a system are repre-sented by a simplified model. In this paper, we use this observation and use ROM-inspired models in nonlinear system identification. Specifically, we note that the form of the model used in identification must have the same form as the reduced-order model. This not can only make the identification process more reliable, but will also make the response simulation computationally cheaper than an identified model that relies on a large number of DoFs to achieve good accuracy. Also, there is a consequential relationship between these nonlinear terms and underlying physics of the system [30]. Building smaller sized models for dynamical systems using normal forms is also explored [31,32]. Normal form methods are mainly based on the definition of nonlinear normal modes (NNMs) as invariant manifolds [33]. A nonlinear change of coordinates is required to transform the reduced equation of motion to normal coordinates [34]. Model reduction methods based on invariant manifolds, referred to as spectral sub-manifolds, are demonstrated in [35]. However, measuring the normal coordinates may be a difficult task in an inverse problem case of system identification. In [32], Vizzaccaro and co-workers demonstrated graph style parametrisation of invariant manifolds into modal coordinates, where the name "graph style" comes from defining a functional graph relationship between master and slave coordinates. However, it was found that the response diverges from the true solution at high amplitudes. A comparison of invariant manifold-based model reduction methods with condensation methods can be found in [36,37].
In this paper, we aim to reduce the uncertainty associated with the model, hence, dealing with the characterisation step in the identification. We consider nonlinear system identification of an example geometrically nonlinear cantilever-type structure. The lower-frequency bending mode is coupled with highfrequency axial modes which can be captured by a single-DoF ROM. Similarly in structures like cantilever systems, the nonlinearity can be dominated by the inplane nonlinear inertia [38,39]. System identification is applied using ROM-inspired models, and their results are compared to the case where standard nonlinear stiffness model is used. We use backbone curves for response comparison and as a validation tool. Backbone curves show the shift in the frequency of the system versus amplitude or energy [40][41][42]. Numerous techniques are available in the literature on measuring the backbone curves of nonlinear systems, [43][44][45][46].
Here, the backbone curve of the system is measured using the response decay of the full FE model [43]. We use the response decay data for the nonlinear system identification.
In Sect. 2, the structural system and its FE model are described, followed by the identification using standard nonlinear stiffness models and a nonlinear model of a cantilever beam from the literature in Sect. 3. Section 4 shows system identification using the ROM-inspired model with its results discussed. Section 5 demonstrates the ROM-inspired identification on pseudoexperimental data. Finally, conclusions are made in Sect. 6.

Cantilever-type beam system
In this section, the structure on which the system identification will be demonstrated is described. The system is a cantilever-type structure with a spring attached axially at its free end, as shown in Fig. 1. As cantilever beams are known to have larger nonlinear inertia effects with less stiffness nonlinearity, we include the axial spring to impose a stiffening effect at high-amplitude vibrations. This causes a larger frequency shift at higher amplitudes and hence makes it convenient to validate the results using backbone curves. Similar cantilever beams with an additional element at the tip are presented in [47,48]. The spring is stretched at equilibrium and has a mass of 0.0008 kg, whilst the system includes force due to a gravitational acceleration of 9.81 ms −2 applied downwards. This results in an initial static tip deflection of around 0.02L at the equilibrium position, which also causes the linear natural frequency to increase by a factor of 1.003, as also demonstrated in [49]. The corresponding parameters of the system are as summarised in Table 1. The system is modelled in FE package Abaqus [50] and meshed using Timoshenko beam elements (three-node quadratic beam, B32). A total of 150 beam elements are used, resulting in 1800 DoFs. Additionally, to allow resonant decay to be used to measure the backbone curves, the mass and stiffness proportional Rayleigh damping coefficients are applied as α = 0.37s −1 and η = 1×10 −5 s, respectively. These values are selected to obtain around 0.5% damping ratio for the first mode and having relatively slight increase in damping ratios for higher modes.
The generalised physical coordinates are given in the N × U matrix from the FE model, which we call  X, N representing the DoFs and U being the number of time steps during the decay responses. As we will consider models expressed in modal coordinates, we must perform a linear modal transform. By applying a modal analysis in Abaqus FE package, we get the eigenvalues matrix ( ) as an N × N matrix containing nth squared natural frequency (ω 2 n ) in the nth element along its diagonal. Similarly, the mass-normalised modeshape matrix ( ) is obtained as an N × N matrix. In a physical experiment, these can be obtained through an experimental modal analysis. For the modal transformation, we apply: where q is N × U matrix of modal coordinates. The modeshapes for the first two bending modes and the To demonstrate the coupling between the modes of the system, the steady-state undamped modal displacement responses of the system, over a period of one second, are shown in Fig. 3a. A static modal force is applied to the first bending mode of the system to get a tip displacement of around 1/3rd of the beam length. The resulting dynamic response, after the static load is released, is dominated by the first mode, whilst the second highest amplitude response is from the 30th mode, corresponding to the first axial mode. Due to membrane stretching, there exists some significant coupling between these two modes, as demonstrated in Fig. 3b, which shows half a cycle of the steady-state responses plotted against each other. Additionally, as the coupling between the first mode and other modes in the system is asymmetric, this may indicate that the system is not symmetric (due to the sag under gravity). Nevertheless, the asymmetry due to the initial sag is quite small.
We assume that, when lightly damped, the decaying response of the system follows its backbone curve; therefore, the backbone curve of the system can be constructed using its decay response [43]. As this system has well-separated modes, internal resonance is not considered. Note that the presence of internal resonance would require other techniques such as control-based continuation [44,51], for backbone measurement. As the dynamics of the structure is mainly governed by the first bending mode with (ω 1 = 38.5 rad s −1 ), the resonant decay is performed by applying a tip force, which is easiest to apply on a real system, on the damped structure, and is displaced to around 1/3rd of the beam length, L. The force is released to capture the decaying response over a period of 25 seconds, transformed into first mode coordinates as shown in Fig. 4. Note that the physical coordinates (X) are measured about the equilibrium position. Using this decay response, the backbone curve for the first mode of the system is con-structed. The time at every other zero-crossing is found using interpolation to identify exact zero-crossings, as T 0 , T 1 , . . . , T Z , with T 0 as the time of the initial zerocrossing and T Z being the time at last zero-crossing point in the decay response. Note that additional steps are required when the response is gathered from a physical experiment, such as filtering, to remove multiple zero-crossings around the crossing points in noisy data. These steps are necessary to reduce the error in the frequency assessment procedure [43]. The maximum amplitudes are measured over the window of each cycle (T z−1 ≤ t ≤ T z ), and the time period of the cycle z is approximated using zero-crossing times (i.e. T z − T z−1 ). The response frequencies of each periodic cycle are then mapped with its corresponding maximum amplitude as the backbone curve of the system shown in Fig. 5.
This synthetic data set will be used to perform system identification and is referred to as full FE measurements in the rest of this paper. We note that whilst we have selected the decay response here, there are other experimental techniques that can be used to measure backbone curves. In the following section, system identification is illustrated using a standard nonlinear stiffness model for the system.

System identification using a standard nonlinear stiffness model
In this section, system identification is performed using standard nonlinear stiffness models and a nonlinear cantilever beam model from the literature. We first consider the model with a nonlinear stiffness polynomial, which is commonly used to model geometric nonlinearity. When the knowledge of the nonlinear behaviour of the system is limited, this model can be a good starting point. However, in the case of a cantilever beam, the standard nonlinear stiffness model does not include some terms that arise from nonlinear inertial effects [52]. As the response of the system is dominated by the first mode, we firstly select a single-mode model with nonlinear stiffness, given by:q whereq r , ω r and q r are the acceleration, natural frequency and displacement of mode r , respectively. As is common in the literature, we approximate the geometric stiffness nonlinearity using a polynomial function of up to Mth order with γ m being mth nonlinear parameter. Therefore, this equation is referred to as the standard stiffness nonlinear model. For the example considered here, we select nonlinearity of up to 5th order as the base model (M = 5) and considering a single-mode model representing the first mode (r = 1). Equation (2) can be written as below: Both even-and odd-order terms are included in the model, as the asymmetry due to sag is captured by even-order terms [1,53]. The significance of the nonlinear terms is demonstrated later on in the paper. The damping is assumed to be small, and as we are mostly interested in the nonlinear behaviour of the underlying conservative system, no damping term is included in Eq. (3). In system identification, the model is fitted into the measured data to identify the nonlinear parameters γ m . Considering Eq. (3), we know the acceleration (q 1 ) and displacement responses of the first mode (q 1 ) from the backbone measurements. The remaining parameters are treated as unknown. Note that although the linear natural frequency of the system (ω 1 ) is known, in this particular case, from the linear modal analysis, it is still included in the estimation. In an experimental setting, there would be uncertainty associated with the linear frequency estimation prior to nonlinear model fitting. To account for this, we compare the cases where: (i) the linear natural frequency is fixed, to represent the case where the frequency is known (i.e. taken from an accurate FE model); (ii) the linear natural frequency is estimated along with the nonlinear parameters, to account for any uncertainty.
To identify the unknowns in Eq. (3), we propose a method based on using the Fourier information of the measurement signal. We include certain coefficients of the complex exponential form of the response measurements. Li and colleagues [16] have also demonstrated that using the frequency content of the measurements removes some sensitivity associated with the time domain measurements in the identification of different nonlinear systems.
As the backbone measurement is constructed using the decay response, each point on the curve can be assumed to be a periodic cycle of the decay. A moving window is used to extract the Fourier components of each cycle z. The lengths of moving windows are defined after the zero-crossing times (T z ) of each cycle z, whilst the Fourier components are achieved using fast Fourier transform (FFT) algorithm in MATLAB [54], which gives the two-sided spectral components. Applying the Fourier transformation, the time domain Eq. (3) can be written as below for the initial periodic cycle: shows the time length of the window for the first periodic cycle. T 0 is the time of the beginning and T 1 the time of the end zero-crossing of the cycle 1, as described previously. F H indicates the column vector of complex Fourier components of a general term (w(t)) and can be defined as below with T being its period: In the above, H = hω, h indicates the Fourier component and ω is the fundamental frequency of the system. Once the column vector of Fourier components for each periodic cycle is extracted, the real and imaginary parts of certain components (i.e. h = [h min : h max ]) of all data points are stacked on top of each other as a single column vector. Now, Eq. (3) can be written as the following matrices: where G is the design matrix including all the known terms in the model, d is the vector of known variables, and P is the vector of unknown parameters. These matrices can be represented by relation below: To identify the unknown parameters ( P), Eq. (7) can be applied in a least squares regression form. For this example, we use up to the 5th complex Fourier components in the identification as higher harmonics of each data point are negligible. Also, odd and even harmonics including the DC (0 Hz) components are used [1,53], so h = [0; 1; 2; 3; 4; 5] in Eq. (5). This allows the fitting process to reach high levels of accuracy as the measurements are simulated and there are not much error associated with the data. High harmonics may not be reliable for a real, physical experiment subject to noise. The real ( ) and imaginary ( ) parts of the components (h) of the complex exponential Fourier series of periodic cycles of data points on the backbone curve are stacked as column vectors (note that the DC component does not have an imaginary part) such that the matrix G has 11× Z rows (Z being number of periodic cycles). The model is fitted to the full FE measurement data, using data across a pre-determined amplitude range to which the backbone curve fit is required (and later plotted). As the matrix G consists of multiple Fourier components for a whole range of amplitude levels, it was conditioned before applying the regression. Each column of the matrix is normalised by its root-mean-square (rms) value, and then, the resulting parameters are re-scaled using the reciprocal of rms values. These estimated parameters are summarised in Table 2.
Using the estimated parameters, the response of identified model from Eq. (3) is simulated and compared with the full FE measurements backbone curve. Note that the estimated parameters relate to the case where linear frequency is included in the estimation. However, we also show the backbone for the case where the linear frequency is fixed using the value extracted from the FE model. The backbone curves are computed using the MATLAB-based continuation toolbox-Continuation Core (CoCo) [55]. This comparison is shown in Fig. 6.
From the backbone curves in Fig. 6, it is clear that the identified standard model is a poor representation of the full measurements, despite the near-perfect data used for fitting. It is also clear that the linear natural frequency is estimated with error for the fifth-order model. This can be due to the model limitation giving a wrong estimate of the linear natural frequency term, and hence, the response fails to converge at different levels of amplitude. Similarly, when the linear frequency is fixed, the response diverges from the measurements at higher amplitudes. The estimated parameters in Table 2 cannot be relied upon as they fail to  Comparison between the backbone curves of the 3rd-, 7th-, 9th-order identified standard nonlinear models, and the full FE measurements (ω 1 is included in the estimation) produce the correct backbone curve. To verify that this poor representation is not due to an insufficient order of the nonlinearity in the stiffness model, we also include different order terms in the standard model-monomial terms of up to 9th order (O(9)) are considered and the identified models responses are compared with the full FE measurements. The backbone curves of identified models are shown in Fig. 7 when linear frequency is included in the estimation and in Fig. 8 when it is fixed. Fig. 8 Comparison between the backbone curves of the 3rd-, 7th-, 9th-order identified standard nonlinear models and the full FE measurements (ω 1 is fixed as a known parameter) Figures 7 and 8 show that even the higher-order standard models cannot capture the true response of the system and still diverge from the true measurements. There is a large error in the linear natural frequency estimation for the third-order model, and the error slightly diminishes for higher orders. This large error between the identified and true backbone curves, even for very high-order models, implies that the standard models are unable to capture the true physics-based information about the system. We also noted that the even-order models (4th, 6th and 8th) showed small differences to the odd-order models (3rd, 5th and 7th) and so are not shown.
The poor fit is likely to be due to the fitting limitation imposed by the choice of mathematical model for this structure-i.e. Eq. (2). Although polynomial terms of different orders could represent stiffness-based nonlinearity, the response of cantilever-type systems is dominated by longitudinal inertia and large curvature [38,39,52,56]. However, we note that such knowledge of the underlying dynamics is a feature of the structure selected in this paper and is not generally known for other structures-the method in this paper does not exploit such information. Furthermore, conservative terms were found to improve the response of the model relative to the experimental results in [56], whilst Urasaki and Yabuno [46] identified the backbone curves of a cantilever-type system by including quadratic velocity terms in a mathematical model derived through Hamilton's principle. Therefore, in line with these works we also consider a model which is derived from the beam theory for a cantilever beam.

Nonlinear cantilever beam model identification
The oscillator-like nonlinear model we consider in this section is similar to that in [38,52] which, alongside the nonlinear stiffness terms, also has some additional terms of up to cubic order, accounting for the nonlinear inertia. This model was derived from the general beam theory [52] for a cantilever beam, so should be correct and accurate. The single-mode model for the governing dynamics of the system can be written as: In the above model, in addition to the quadratic and cubic order stiffness terms, two additional terms appear. Whilst the nonlinear stiffness terms represent the geometric nonlinearity, the cubic terms in (q 2 1q 1 and q 1q 2 1 ) capture the nonlinear inertia of the cantilever beam. To capture the asymmetry of the system, we only consider a single quadratic term in (q 2 1 ). This model is now fitted to the backbone curve measurements, using the same procedure described for the standard nonlinear model, to identify the unknown parameters P * = [ω 2 1 ,Ā,B,C,D] T . Equation (8) is applied in a least squares sense whilst using the harmonic components of the known variables as h = [0; 1; 2; 3]. We also consider the case with a fixed linear natural frequency from the prior linear modal analysis. The identified models are simulated, and the backbone curves are compared with the measurements, as shown in Fig. 9.
The backbone curve of identified models gives good match with measurements, as expected. However, for the estimated linear natural frequency case there is rela-tively a bigger error at lower amplitudes. Fixing the linear frequency as a known parameter requires an accurate measurement of the linear natural frequency. Nevertheless, the beam model with additional cubic inertia terms is able to provide a good response match for the cantilever beam. Most of these works are case specific and are particularly aimed to model the nonlinear inertia in a cantilever beam [38,39,46,52,56]. The nonlinear cantilever beam model is usually limited to cubic order which can face limitations at higher amplitudes of the backbone curve. It can also be pushed to higher orders with rigorous algebra, which is not done yet in the literature.
In parallel to these works, we are taking a generalised approach to embed physics-based information into the model of systems with large curvature (such as the cantilever-type system used to demonstrate this here). We consider a ROM structure, derived from the general Lagrangian, in line with [57]. This ROM has additional terms accounting for the kinetic energy of un-modelled modes with the ability to expand to any arbitrary order. The ROM-inspired model with its identification is described in the next section.

System identification using a ROM-inspired model
The previous section demonstrated that the standard nonlinear stiffness model is insufficient for the system identification of the cantilever-type system. The cantilever beam model in the literature showed improved response match; however, this model is specific to a cantilever beam and may not be applicable to general nonlinear systems with large inertia. From the reducedorder modelling study in [57], it is shown that additional inertial-based terms are necessary to account for the kinetic energy of un-modelled modes of a cantilever system. Note that whilst [57] also considers a cantilever system, the form of the ROM is general and provides a good description of any system, reducible to a single mode, with large inertia. We use this model to inspire the form of the equations used in the system identification of a cantilever-type system, in this section. Note that these equations are general, they ensure kinetic energy is captured whilst requiring no underlying knowledge of the system-i.e. no system-specific model is required.
The main mathematical structure of the ROM, which is referred to as inertially compensated ROM (IC-ROM), a derivation of which is presented in Appendix B, can be expressed as: Note that this ROM model structure is derived for a single mode (q 1 ), as the dynamics of the system is mainly governed by the first mode only. The method is limited in accounting for internal resonance [57], and there is no evidence of internal resonance in the measured dynamic response of the beam system. This can be extended to multi-mode systems by considering further modes in the reduction basis of Eq. (B2), following the work presented in [57]. As the damping of the system is low and we are interested in the nonlinear behaviour of the underlying conservative system, the damping term is neglected from the ROM model [43].
No dynamic interaction is detected from the response of the system, so internal resonance is not accounted for and the modal coupling is approximated by statically constraining coupled modes to the reduced mode using a function g s which can be described as: Compared to the standard model, two additional terms appear in Eq. (9), where the partial derivatives of a function g s with respect to q 1 are included for each un-modelled mode (s). The function g s represents the coupling between the reduced mode and each coupled mode s (i.e. s = 2, 3, . . . , N ; and N = 1800 for the example considered). β s,m represents the mthorder coupling function coefficient relating to the sth un-modelled coupled mode.
We have a large number of parameters β s,m to estimate. To reduce the number of independent parameters, we first write the function g s in the following vector form: where β is S × K (with S = N − 1 and K = M − 1) coefficients matrix andq is K × 1 variables vector, derivatives of which with respect to q 1 may be written as: By substituting these into the inertial compensation terms of Eq. (9), we get: In (13), β T β will end up in a K × K matrix with β parameters for each coupled mode (s) in each element of the matrix. As this will be a symmetric matrix, for simplification the β T β =β matrix may be written as below with number of β parameters dramatically reduced from S × K parameters to K × K parameters (with K S): Now usingβ, (13) gives: We select an IC-ROM of up to 5th order for both the coupling function (g s ) and nonlinear terms. Whilst using (15) and (14), the inertial compensation terms are derived as below with K = 4: This can be further simplified as below, with fewer parameters to estimate: ϒ j represents the jth-order (the total order of the term) inertial compensation parameter which relates tō β parameters in Eq. (16) as below: ϒ 3 = 4β 1,1 , ϒ 4 = 6β 1,2 , ϒ 5 = 16β 1,3 + 9β 2,2 , ϒ 6 = 10β 1,4 + 12β 2,3 , ϒ 7 = 30β 2,4 + 16β 3,3 , 4,4 We include both odd-and even-order terms in the model, and as we only have up to 5th-order stiffness nonlinearity, we only include third-, fourth-and fifthorder IC terms (relating to ϒ 3 , ϒ 4 and ϒ 5 ) into (9) with terms relating to γ 2 , γ 3 , γ 4 and γ 5 . This will construct a fifth-order IC-ROM as the model of the structural system as below: This model can now be fitted to the measurements to estimate the unknown parameters. The acceleration, velocity and displacement (q 1 ,q 1 and q 1 ) are known, and the remaining parameters are treated as unknown. As previously, linear natural frequency is included in the estimation and is compared with the case where linear frequency is fixed. Note that the simulated data contain no error, however, the identification using data which is closer to a real experiment will be demonstrated later in the paper.
Similar to the standard model fitting, we use the complex exponential Fourier components (F H ) of the measurements in the estimation so the identical data set as previous section is used. F H is computed for each term in Eq. (18) after Eq. (5). By applying the transformation, the time domain Eq. (18) can be written in matrix form as: where each column of the matrixḠ corresponds to the terms in (18), in the same order as unknown parameters in vectorP, and the known variables vector d remain the same as standard model fitting described in Eq. (6b). The 1st to the 5th Fourier components including the DC component of measurements are added in fitting process (h = [0; 1; 2; 3; 4; 5]) [1,53]. As such, the real ( ) and imaginary ( ) components of the complex Fourier components (F H ) of each periodic cycles are included in the identification. This generates theḠ matrix with 11 × Z rows-note that the 0Hz component does not have an imaginary part. To identify the parametersP inḠP = d, we apply the problem in a least squares form, using the same approach as in the previous section. The model is fitted to the simulated data at all levels of amplitude of the backbone curve, whilst the identification algorithm remains similar to the previous section. The design matrix (Ḡ) is conditioned as previously, by normalising each column using its rms value and re-scaling the resulting parameters. The parameters of the model are estimated as summarised in Table  3. The parameters from direct reduced-order modelling [57] are also included as the true parameters. The existence of an FE model of the structural system has allowed the true parameters to be obtained; however, in system identification cases, only a physical system is available, so direct reduced-order modelling cannot be applied.
In Table 3, the relative error (R E) between the true (P true ) and estimated (P) parameters is calculated using the relation below: The relative errors for most parameters appear small; however, the error in ϒ 4 and ϒ 5 is significantly bigger than the others and will be discussed later in this section. Now using the estimated parameters we reconstruct the model in Eq. (18). The response of the identified models is simulated in the MATLAB-based continuation toolbox CoCo, and compared with the case of fixed linear natural frequency identified model and full FE measurements as shown in Fig. 10.
From the backbone comparison, a good match is shown between the full FE and the identified ROMinspired models. However, there is a small difference in the estimated linear natural frequency of the identified model. Now, the estimated parameters which were  . 10 Comparison between the backbone curves of the 5thorder identified ROM-inspired model (estimated and fixed linear frequency) and the full FE measurements defined based on the physics of the system can be reliable.
As was demonstrated, identifying the cantilevertype structural system using a standard model which contains different order nonlinear terms does not give good match with the measurements of the system. From the results of the standard nonlinear stiffness model, adding nonlinear terms of up to 9th order showed that the identified responses diverge from the true measurements at all levels of amplitudes. This is mainly due to the model not being able to capture the nonlinearity of the beam and indicates that identifying a structure using a model with different candidate model terms can give a set of parameters which does not represent the structure. The nonlinear cantilever beam model, which has some additional nonlinear inertia terms, gives improved results. However, this cubic model is justified through analysis of underlying equations for a cantilever, and such underlying equations may not be known for other structures. Similarly, adding some additional terms containing displacement-acceleration and displacement-velocity terms as inertial compensation for the kinetic energy of un-modelled modes has resulted in accurate response match with the measurements. Figure 10 shows that the 5th-order ROM-based model gives good match with the measurements at all levels. However, the number of parameters in the ROMbased model is similar to the standard model, and the second with the same number of parameters was unable to capture the true dynamics of the system. It should be noted that the ROM-inspired model derived in this work is generalised and is also expandable to any order.
To better understand the contribution of additional terms in the ROM-based model, the significance of the nonlinear terms is shown in Fig. 11. The absolute relative magnitude (RM) of the nonlinear terms (i.e. the estimated parameter times its corresponding response term) is shown as bars at highest, middle, lower backbone response amplitude levels. It is shown that the contribution of the second-order nonlinear stiffness term is dominant at low amplitude, whilst the third-order stiffness nonlinear term (γ 3 q 3 1 ) is significant across different levels of amplitude. Similarly, the significance of the fourth-and fifth-order terms (γ 4 q 4 1 and γ 5 q 5 1 ) is considerable at higher amplitude, slightly reduces at midlevel and diminishes at lower level. This demonstrates that the coupling between the lower-frequency bending and high-frequency axial modes increase in higher amplitudes, thus, higher-order terms (γ 5 q 5 1 here), is needed to capture the dynamic coupling. This also demonstrates that higher-order nonlinearity in ROM leads to better accuracy in the response. Although the Fig. 11 The relative magnitudes of nonlinear terms of the ROMinspired model shown at different amplitude levels of the full FE backbone curve coupling is less in lower levels of amplitude, hence, the higher-order term has least significance.
In Fig. 11, the absolute relative magnitudes of inertial compensation terms (ϒ 3 (q 1 q 2 1 +q 2 1 q 1 ), ϒ 4 (2q 1 q 3 1 + 3q 2 1 q 2 1 ) and ϒ 5 (q 1 q 4 1 + 2q 2 1 q 3 1 )) are shown at high, mid and lower backbone response amplitudes. The inclusion of higher-order terms is deemed to be necessary to capture for the high inertial nonlinearity in high-amplitude levels. At amplitudes on the backbone curve, the contribution of third-order IC-term (ϒ 3 (q 1 q 2 1 +q 2 1 q 1 )) is significant compared to the fourthand fifth-order IC terms (ϒ 4 (2q 1 q 3 1 + 3q 2 1 q 2 1 ) and ϒ 5 (q 1 q 4 1 + 2q 2 1 q 3 1 )). The fourth-and fifth-order terms further diminish at mid-level, whilst the only considerable magnitude of IC terms goes to third order at a lower response amplitude-meaning that inertial compensation is always needed for this system. This also shows that including higher-order inertial compensation terms is necessary to capture the response of the structural system at higher levels of amplitude. Also, from Table 3 the estimation of ϒ 4 and ϒ 5 is associated with significantly more error. Given their low RM at all the levels, the amount of error has a negligible influence on the response.
The estimated parameters summarised in Table 3 can be more robust and carry information about the underlying physics of the structural system. Also, these terms are derived based on the physics of the structural system; hence, the uncertainty in the model of the structural system has been tackled throughout this work and the estimated parameters can be more reliable.
Considering that data were perfectly clean out of the FE model and not close to reality, in the next section we demonstrate the methodology whilst using a data set that is more representative of a real physical test.

ROM-inspired identification using a simulated experiment
The data used in previous sections consisted of 300 node measurements, and each node had 3 translational and 3 rotational movements. This resulted in 1800 measurements in physical coordinates. In this section, to replicate a more realistic physical experimental scenario, the measurement set is dramatically reduced, such that measurements are taken from only 5 points in two DoFs along the length of the beam as shown as circular dots in Fig. 12. Having multiple measurement points is necessary to obtain the modeshapes of different modes from linear modal analysis perspective [58]. Whilst having the modeshapes of multiple modes can make the modal transformation process more accurate, there is no need to measure the response of high-frequency modes. These measurement points are located at 30 mm, 135 mm, 240 mm, 345 mm and 450 mm from the fixed boundary on the left of the beam. The structure sags under gravity with the spring stretched in its equilibrium position. It is given an initial displacement similar to the previous section to initiate the decay, and for each measurement point, decay response measurements are extracted in vertical and axial coordinates of each point, as aN ×Ū matrix denotedX. (Ū represents number of time steps andN = 10 as the total number of coordinates.) These displacements are transformed into modal coordinates using a reduced modeshape matrix¯ , which is extracted from the FE model.¯ is a 10 × 3 matrix which captures the three initial bending modes of the system, as these would be the easiest to measure in a physical test. To make the data more representative of a physical test, we also add white Gaussian noise ( ) to both modeshapes and physical coordinates with signal to noise ratio of 35 decibels. Now, we can estimate the modal displacement responses using below: where is the noise which is added to both physical coordinates and modeshapes. ( †) represents the pseudo-inverse. The first mode decay response is shown for the first 18 seconds in Fig. 13a, which is similar to the decay response using 300 node measurements in Fig. 4. Note that lower amplitude responses were cut from this signal as they were significantly affected by noise. The first mode response is then filtered using a 4th-order Butterworth filter with a cut-off frequency of above ten times of the first mode natural frequency ( f c = 100 Hz), to remove the noise in the signal. The filter is applied to the signal using MAT-LAB command f ilt f ilt which filters the data forward and backwards to avoid a phase shift in the signal [54]. The filtered displacement response of the first mode is as shown in Fig. 13, whilst the velocity and acceleration responses are estimated from the displacement using differentiation. The backbone curve of the first mode is constructed using the filtered decay response of the first mode, following the procedure described previously. The backbone curve is shown in Fig. 15, where there are some errors associated with the approximation of the frequency of low-amplitude responses due to the noise in the data. This resulted in a noisy backbone curveespecially at lower amplitudes where noise becomes more significant.
The ROM-inspired model in Eq. (18) is now fitted to the simulated experimental (SE) data following the methodology proposed in the previous section. Again, we use the exponential Fourier components of the measurements for estimating the unknown parameters. The 0th, 1st, 2nd, 3rd, 4th and 5th Fourier coefficients of measurements are added in the model. The estimated parameters are given in Table 4 which are compared to the parameters derived based on direct reducedorder modelling as true parameters. The relative error between the two sets is also shown in table using Eq. (20).
The estimated parameters for simulated experiment shown in Table 4 are comparable to those achieved for the high-fidelity FE data. As previously discussed, the high error percentage in ϒ 4 and ϒ 5 is likely to be due to their low magnitude at different levels of backbone curve. Similarly, ϒ 4 and ϒ 5 are more sensitive to the change in data which can be due to their less accuracy. The relative error in other parameters is all at reasonable levels, with lowest (0.6%) for the linear natural frequency estimations. The R E in both set of estimated nonlinear parameters (full FE data and SE data) can be shown in Fig. 14, where the RM for each nonlinear parameter is also shown at the mid-level amplitude of the backbone curve. It is prominent that for the parameters with less error, the RM has higher significance, whilst for parameters with larger error, the significance reduces.
The model is reconstructed using the estimated parameters of SE data, and its backbone curve is com- Fig. 14 Relative error in estimated nonlinear parameters for full FE data and simulated experiment data with their relative magnitudes Fig. 15 Comparison between the backbone curves of the 5thorder identified ROM-inspired model for simulated experiment data (estimated and fixed linear frequency) and the simulated experiment puted using CoCo, which is compared with the case when linear frequency is fixed and the simulated experiment in Fig. 15.
From the plot, it is clear that the identified ROMbased model backbone curve using the erroneous data with fewer measurement points still matches the simulated experiment measurements very closely. The estimated parameters and identified model backbone curve are similar to the previous section.

Conclusions
In this paper, a method for generating a physically appropriate model, based on ROMs, that can be used in system identification was proposed. The system identification method was demonstrated on a nonlinear cantilever-type structure. We highlighted a fundamental link between nonlinear system identification and reduced-order modelling.
An FE model of the system was constructed, and its decay response was simulated to provide synthetic data to test the proposed identification technique. It was shown that using nonlinear models with only nonlinear stiffness terms does not lead to an accurate identified model. Using the cantilever beam nonlinear model showed improved response; however, the estimated linear frequency case showed mismatch at lower amplitudes. In contrast, the identification based on the ROMinspired model showed good match with the measurements taken from the structure at different amplitude levels of the backbone curve. We assume that this is because the set of estimated ROM-inspired model parameters carry more meaningful information about the physics of the structure and ensure appropriate energy balancing across modes can occur. Furthermore, the ROM-based model has been constructed using the physics of the structural system rather than assuming an arbitrary set of models. The ROM-inspired model can be derived to any order of choice and is generalised for structures with large inertia. The cantilever beam model can also be expanded to higher orders but with the best knowledge of authors is only limited to cubic order in the literature. The results were shown for a set of high-fidelity data and also for a set of measurements from fewer points on the structure to represent a more realistic test. It was shown that the methodology also works when the synthetic data are polluted with noise. We conclude that ROM-inspired models can robustly represent a structural system through a set of meaningful terms and are ideally suited for use in system identification. Additionally, there exists a direct relationship between the ROM parameters and an FE model [30,59]. This would give one the ability to map between the physical parameters of the FE model and the identified ROM-inspired model parameters, in cases of FE model updating. The method proposed in this work is applicable to structures with large inertial nonlinearity, whilst as the damping is neglected, this work is only applicable to weakly damped systems-the nonlinear damping inclusion in the ROM-inspired model remains as a prospective future work. It is also worth noting that, based on the assumptions used to generate ROM, the proposed approach does not take into account internal resonances [57], which was acceptable for the example considered. As a future work, this method will be taken to real experimental investigations, where the data measurements can contain more uncertainties. (In this work, the data measured from the FE model were clean and were artificially polluted in the last section.) This work could also be further expanded to more complex models, which account for internal resonances and the frequency difference between reduced and constrained modes, described as slow and fast decomposition in [37], by continuing to draw inspiration from the reduced-order modelling.
Funding The authors acknowledge the support from UK Engineering and Physical Sciences Research Council (EPSRC). S.A.N. was supported through EPSRC grant no. EP/R006768/1. M.W.A. was supported through EPSRC doctoral training scholarship.
Data availability Data and the system identification algorithm coded in MATLAB can be accessed through Zenodo repository at [60].

Conflict of interest
The authors declare that they have no conflict of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/ by/4.0/.

Appendix A: ROM-inspired model fitting design matrix
The design matrixḠ for the ROM-based model fitting can be shown as below:

Appendix B: IC-ROM derivation
The full FE equation of motion of unforced undamped system transformed into the modal space using (X = q) can be written as: q r q s + r 0 0 s q r q s + f r (q r , q s ) f s (q r , q s ) = 0 0 (B2) In the above, q r represents the response of the reduced mode (r = 1 which is governing the dynamics of the system example considered in this paper), and q s represents the response of all the remaining modes which do not have governing dynamics but are statically coupled with the reduced mode. (Here, s = 2, 3, . . . , N ; and N = 1800 as total number of modes.) These couplings are approximated using the vector of polynomial functions (g) as quasi-static, following the study in [57]: this is an approximation of the coupled modes, and there are other assumptions which consider both displacement and velocity as independent variables [33,35]. We use the polynomial approximation as a function of displacement only. The Lagrangian (L) of the system in terms of the mass-normalised modal coordinates can be written as: where T and V are the kinetic and potential energies of the system such that: ∂V ∂q 1 = 1 q 1 + f 1 (q 1 , g) ∂V ∂q s = s q s + f s (q 1 , g) Now, using Eq. (B3) andq s = ( ∂ g ∂q 1 )q 1 we can write: The partial derivatives of (L) with respect toq 1 and q 1 can be written as: q 1 , g) ( s g + f s (q 1 , g)) = 0 (B8) From here, as only the reduced modes are directly forced, as per the ICE method, and the responses of statically coupled modes are captured implicitly [29,57], g(q 1 ) must satisfy: s g + f s (q 1 , g) = 0 ( B 9 ) and g being a function of q 1 , we can write: Now, the single-mode IC-ROM model form, with each statically coupled mode (s) represented through a function (g s ), linear potential energy term as 1 = ω 2 1 and nonlinear restoring forces represented as polynomial terms of up to Mth order, can be written as below: