1 Introduction

1.1 Background

The dynamic analysis of railway vehicle collisions, a vital issue in passive safety, has been an indispensable research topic in recent years [1, 2]. Although real train crash experiments are the most reliable and preferred approach, the extensive human labour and substantial costs of full-size tests have led researchers to adopt scaled tests instead [3, 4]. Nowadays, dynamic simulation methods, including finite element (FE) simulation and multi-body dynamic simulation, have significantly reduced research and labour costs [5, 6]. However, the application of these dynamic simulation methods still needs to be improved in practice for the following reasons: (1) Complex modelling. Traditional simulation methods are generally expensive because of the complexity of the modelling process and the time needed to simulate the system. (2) Stochastic factors. Stochastic factors are inevitable in experiments and simulations [7], which reduces accuracy and causes the results to follow a random distribution (Fig. 1a).

Fig. 1

Stochastic process analysis: a graph showing the stochastic process in engineering [5]; b graphical model of the Gaussian process regression (GPR), where the connectivity between values of the function fi is denoted by a loop around the plate [16]; c graphical model representing the stochastic processes using stochastic variational inference, with inducing variables u working as global variables and dependencies between the observations [18, 19]

1.2 State-of-the-art development

To address the challenge of balancing accuracy and efficiency, data-driven modelling methods as a reverse approach have recently received more interest in the engineering field. For instance, Tang et al. [8,9,10] proposed a data-driven collision modelling method where useful force–displacement curve models are extracted from the existing FE simulation data, to predict the collision dynamic response under various collision conditions. Müller et al. [11] proposed a machine learning method to quickly predict and estimate the severity of crushing caused by collisions based on the FE simulation data. Li et al. [12] proposed a vehicle collision mathematical model (VCMM) to predict the collision response of vehicles at unknown velocities, and the accuracy and efficiency of the method are verified by comparison with the FE simulation results.

In addition, many researchers are committed to dynamic simulations that consider stochastic factors and have put forward several modelling methods for stochastic dynamic simulation. For example, Luo et al. [13] applied the normal distribution to describe different stochastic parameters, and then predicted the evolution of wheel profile wear and the related vehicle dynamics for high-speed trains. Lu et al. [14] discussed in detail the dynamic effects of vehicle speed, load, road surface roughness and tire stiffness by analysing the tire dynamic load and the dynamic load coefficient. Souffran et al. [15] analysed the relationship between stochastic characteristics and variables (i.e. vehicle speed, acceleration, and road gradient), and then proposed a methodology for modelling real-world vehicle missions. The application of novel probability methods is another popular approach for assessing vehicle dynamics. In this regard, Xu et al. [16] proposed a probability reduction method that considers the random combination of different random variables in vehicle–track coupled systems; Hao et al. [17] applied a specific prior distribution to quantify the probability by checking all modal activities between consecutive data points. Furthermore, Gaussian process regression (GPR) [18] has gradually become a baseline approach for stochastic processes owing to its relatively high descriptive accuracy; however, this approach needs to store the full training dataset to make predictions (Fig. 1b), which means that its computational cost grows with the size of the dataset.

To improve efficiency, data-driven approaches combined with variational inference have been developed to analyse stochastic processes [19,20,21,22], in which an efficient prediction function is derived from mini-batch training data (Fig. 1c). Such data-driven stochastic methods have been widely used in engineering, for example in health monitoring [23], real-time monitoring [24], and vehicle speed prediction [25], and their applicability has been independently verified.

Although the above-mentioned methods have improved effectiveness, few studies have set up empirical or physical models that incorporate stochastic process analysis in collision dynamic simulations. As a result, traditional dynamic simulation methods either require a lot of calculation time or lack consideration of randomness and physical interpretability. Therefore, the existing stochastic process analysis methods for collision dynamic simulations of railway trains still face many challenges in maintaining high efficiency and accuracy while ensuring physical interpretability.

1.3 Motivation, contribution and structure

Considering the different kinds of stochastic noise factors [26,27,28] in the dynamic simulation of railway vehicle collisions, we propose a new data-driven stochastic process modelling (DSPM) approach to reduce the process disturbance and acquire accurate results using a simpler model. The major contributions of our study include:

  • Developing four kinds of kernels in the stochastic process model to describe different disturbances, namely, the trend uncertainty, the time uncertainty, the smaller irregularities and the data sources uncertainty.

  • Using stochastic variational inference and mini-batch algorithms so that the model can describe the stochastic process in detail.

  • Proposing a DSPM approach for collision dynamic simulation of railway vehicles to accurately describe the stochastic processes.

  • Obtaining a higher computing accuracy and efficiency of the DSPM approach compared with that of GPR and FE methods.

The rest of this paper is organized as follows. In Sect. 2, the FE model of railway vehicles is established for collision simulation. A detailed introduction of the DSPM approach is given in Sect. 3. In Sect. 4, an overview of how the DSPM approach can be applied to stochastic dynamic simulation is provided. Next, two collision scenarios are used to evaluate the performance of our proposed model, and a detailed comparative analysis of three simulation methods, namely, the DSPM approach, the GPR method and the FE method, is presented in Sect. 5. In Sect. 6, the challenges faced by the DSPM approach are analysed. Finally, conclusions and a discussion of this research work are given in Sect. 7.

2 FE simulation

2.1 Modelling

In this paper, we take a representative subway vehicle with a maximum design speed of 100 km/h as the main research object. The car body mainly consists of two side walls, an underframe, a roof frame, and two end walls. In addition, the structure of the body-in-white (BIW) and the chassis must account for static and dynamic loads. The end areas are laid out with ample space and designed as energy-absorbing zones to prevent large deformations, and the passenger area is located in the middle part. Figure 2a shows three views (i.e. bottom, left, and front) of the FE model of the railway vehicle. The plates and beams have eleven different cross-sectional thicknesses, which are indicated by different colours for easy inspection.

Fig. 2

FE model of railway vehicles collision: a three views of the FE model; b composition of the FE model; c FE model of the bogie

The FE simulations of railway vehicle collision scenarios are conducted using full-scale detailed models which contain a total of 182 different components. The car body has a length, width, and height of 24,000, 3200, and 3100 mm, respectively, and a distance of 20,000 mm between the centres of the bogies. Specifically, the front section and the end of the driver's cab are deformable and consist of quadrilateral shell elements with a size of 20 mm, and the rest of the vehicle consists of quadrilateral shell elements with a size of 50 mm. The composition of the FE model is shown in Fig. 2b. In FE modelling, the mass of the bogie and appropriate boundary conditions are considered, as the bogie (Fig. 2c) is the critical component supporting the vehicle. In this case, the FE model is composed of 819,389 quadrilateral shell elements (approximately 95.5% of the total number of elements) and 37,742 rigid connection elements (welding, joining, etc.). In addition, the weights of the railway vehicle and the BIW are 46 t and 7.03 t (about 15.2% of the total weight), respectively.

In the FE model, there are two connection methods: common nodes and RBE2 elements. The static and dynamic friction coefficients of the wheel-rail contact are taken as 0.15 and 0.1, respectively, and the corresponding friction coefficients of the other contacts are taken as 0.2 and 0.15, respectively. The contact forms in this FE model include (i) surface-to-surface contact (the energy-absorbing device and the rigid wall, wheel-rail, etc.) and (ii) single-surface contact (the energy-absorbing device itself, etc.).

The FE model of the energy-absorbing device, which contains anti-climbing teeth, energy-absorbing beams, aluminium honeycombs, ribs, etc., is established. In this FE model, the aluminium honeycomb, the energy-absorbing beam, and the rib structure are meshed with element sizes of 3, 4, and 4 mm, respectively, and the total mass is 75 kg. The FE simulation of the energy-absorbing device impacting the fixed rigid wall at an initial velocity V0 was then carried out (Fig. 3a). In the FE model of the railway vehicle, the Z degree of freedom between the bogie and the car body is constrained to avoid overall Euler buckling deformation, and all 6 degrees of freedom of the rigid wall are constrained. Finally, the FE model of the lead car impacting the rigid wall at an initial velocity V0 was established, as shown in Fig. 3b. The material parameters of the components in these two FE models are listed in Table 1.

Fig. 3

Collision condition: a energy-absorbing component to a rigid wall; b lead car to a rigid wall

Table 1 Material parameters of the components employed in FE simulation

Meanwhile, the FE model contains nonlinear connections for the coupler buffer device and for the anti-climbing device. As shown in Fig. 4a, the solid line and the dashed line represent the loading and unloading processes of the coupler buffer device during the collision, respectively. Using the nonlinear hysteresis characteristic curve (Fig. 4b), a mathematical model of the anti-climbing device can be built to simulate its mechanical characteristics.

Fig. 4

Mechanical characteristics of the nonlinear connection: a characteristics of the coupler buffer device; b nonlinear hysteresis characteristic curve of the anti-climbing energy-absorbing device

2.2 Model verification

Considering that railway vehicle collision is a complex problem involving material, contact, and geometric nonlinearity [29], we conducted a crash experiment on the energy-absorbing device to verify the accuracy of the FE model. A frontal collision experiment was conducted with the energy-absorbing device at an initial speed of 16.21 m/s, with a rebound speed of 0.95 m/s. Accordingly, the instantaneous velocity V0 of the energy-absorbing device before collision in the FE simulation was set to 16.21 m/s. A transient impact force test system was used to measure the impact force in real time, and a high-speed camera system was used to record the whole impact process. Afterwards, sequence image tracking was performed on the energy-absorbing device.

Finally, dynamic responses such as the displacement, velocity, acceleration, and impact force during the collision process were obtained. The FE simulations of the energy-absorbing device (Fig. 5a–b) and the representative railway vehicle (Fig. 5c–d) were run with LS-DYNA on a Windows platform with a Core i7 3.4 GHz processor and 8 GB of RAM. Figure 6 compares the experimental and FE simulation results of the energy-absorbing device in detail.

Fig. 5

Collision experiments and FE simulation: a FE model of energy absorbing device, t = 0 ms; b FE model simulation results of energy absorption device at an initial speed of 60 km/h, t = 50 ms; c FE model of railway vehicle, t = 0 ms; d FE simulation results of the railway vehicle at 60 km/h, t = 50 ms

Fig. 6

Comparison results: a displacement; b velocity; c acceleration; d collision force; e force–displacement curve; f absorbed energy

As can be seen in Fig. 6, the collision dynamic responses of the FE simulation are highly consistent with the results of the experimental test, indicating the accuracy of the simulation results. The maximum displacements of the experimental test and the FE simulation are 461.99 and 459.58 mm, respectively, and the maximum error is 0.52% (Fig. 6a). Figure 6b shows that the collision velocities of the experimental test and the FE simulation drop rapidly to zero in the initial stage, and then the device rebounds at a low velocity until the collision process ends. The acceleration results of the experiment are consistent with those of the FE simulation, as shown in Fig. 6c. Figure 6d shows that after SAE 1400 Hz filtering, the average values of the simulated and tested collision forces are 491.90 and 450.81 kN, respectively, and the error between them is 6.8%; the variation trend of the simulated collision force agrees well with that of the experimental collision force. Figure 6e–f shows that the absorbed energies of the experimental test and the FE simulation are 199.69 and 188.36 kJ, respectively, and the error is 6.01%, which is within the acceptable range. As a result, the FE model can faithfully reflect the collision process of railway vehicles.

3 DSPM approach

In FE simulation, collision dynamic responses are often obtained by inputting relevant characteristics such as the initial velocity. Inspired by this modelling theory, a data-driven stochastic process model was built to estimate \({\varvec{Y}} = f\left( {\varvec{X}} \right)\), where \(f\left( \cdot \right)\) is the base function that converts the input vector \({\varvec{X}}\) into the output collision dynamic response \({\varvec{Y}}\). The modelling process of the DSPM approach can be divided into three stages: data collection and extraction, model training, and model prediction. As shown in Fig. 7, the training data were collected from FE simulations and then extracted and recognized in the data collection and extraction stage. In the model training stage, the parameters were initialized by the maximum likelihood estimation (MLE) algorithm using mini-batches. In the model prediction stage, the parameters of the trained model were used for prediction. Finally, the estimated output values, including the mean and deviation at a new point, can be obtained.

Fig. 7

Framework and data flow of DSPM approach

3.1 Stochastic process analysis

In the DSPM approach, we suppose that there exist sets of sample data \({\varvec{X}} = (x_{1} ,x_{2} , \ldots ,x_{n} )\) and \({\varvec{X}}^{{\prime }} = (x_{1}^{{\prime }} ,x_{2}^{{\prime }} , \ldots ,x_{n}^{{\prime }} )\) with the corresponding set of observed outputs \({\varvec{Y}} = (y_{1} ,y_{2} , \ldots ,y_{n} )\), and the prior assumption can be expressed as

$$y\left( {\varvec{X}} \right) \sim {\mathcal{G}\mathcal{P}}\left( {{\varvec{u}},k\left( {{\varvec{X}},{\varvec{X}}^{{\prime }} ;{\varvec{\theta}}} \right)} \right),$$

where \(y\left( \cdot \right)\) is the output function, \({\mathcal{G}\mathcal{P}}\left( \cdot \right)\) represents the traditional Gaussian process, \({\varvec{u}}\) is the mean value, and \(k\left( {{\varvec{X}},{\varvec{X}}^{{\prime }} ;{\varvec{\theta }}} \right)\) is the covariance function, also called the kernel function [30], with the vector of hyper-parameters \({\varvec{\theta}}\). In addition, we postulate a dataset \(\left\{ {{\varvec{H}},{\varvec{Z}}} \right\}\) with

$${\varvec{Z}} \sim {\mathcal{N}}\left( {{\varvec{m}},{\varvec{S}}} \right),$$

where Z is the hypothetical data, \({\mathcal{N}}\left( {{\varvec{m}},{\varvec{S}}} \right)\) represents a normal distribution with mean value \({\varvec{m}}\) and covariance matrix \({\varvec{S}}\). Here, \({\varvec{H}} = \left\{ {{{h}}^{i} } \right\}_{i = 1}^{n}\) and \({\varvec{Z}} = \left\{ {{{z}}^{i} } \right\}_{i = 1}^{n}\). Then, a stochastic process \(f\left( {\varvec{X}} \right)\) can be obtained by the conditional distribution, as follows:

$$f\left( {\varvec{X}} \right) = y\left( {\varvec{X}} \right){\mid }{\varvec{m}},{\varvec{S}},$$
$$f\left( {\varvec{X}} \right) \sim {\mathcal{G}\mathcal{P}}\left( {\mu \left( {{\varvec{X}};{\varvec{\theta}},{\varvec{m}}} \right),\Sigma \left( {{\varvec{X}},{\varvec{X}}^{{\prime }} ;{\varvec{\theta}},{\varvec{S}}} \right)} \right),$$


$$\mu \left( {{\varvec{X}};{\varvec{\theta}},{\varvec{m}}} \right) = k\left( {{\varvec{X}},{\varvec{H}};{\varvec{\theta}}} \right)k({\varvec{H}},{\varvec{H}};{\varvec{\theta}})^{ - 1} {\varvec{m}},$$
$$\begin{aligned} \Sigma \left( {{\varvec{X}},{\varvec{X}}^{{\prime }} ;{\varvec{\theta}},{\varvec{S}}} \right) = & k\left( {{\varvec{X}},{\varvec{X}}^{{\prime }} ;{\varvec{\theta}}} \right) - k\left( {{\varvec{X}},{\varvec{H}};{\varvec{\theta}}} \right)k({\varvec{H}},{\varvec{H}};{\varvec{\theta}})^{ - 1} k\left( {{\varvec{H}},{\varvec{X}}^{{\prime }} ;{\varvec{\theta}}} \right) \\ & \quad + k\left( {{\varvec{X}},{\varvec{H}};{\varvec{\theta}}} \right)k({\varvec{H}},{\varvec{H}};{\varvec{\theta}})^{ - 1} {\varvec{S}}k({\varvec{H}},{\varvec{H}};{\varvec{\theta}})^{ - 1} k\left( {{\varvec{H}},{\varvec{X}}^{{\prime }} ;{\varvec{\theta}}} \right). \\ \end{aligned}$$
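The conditional mean and covariance of Eqs. (5) and (6) can be sketched numerically as follows. This is a minimal illustration rather than the authors' implementation: the isotropic RBF kernel, the function names `rbf` and `conditional_mean_cov`, and the small `jitter` term added for numerical stability are all assumptions made for the example.

```python
import numpy as np

def rbf(A, B, gamma=1.0, w=5.0):
    """Isotropic RBF kernel between row-stacked inputs A (n, d) and B (m, d).
    gamma and w are illustrative hyper-parameter values."""
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return gamma**2 * np.exp(-0.5 * w**2 * sq_dist)

def conditional_mean_cov(X, Xp, H, m, S, kernel=rbf, jitter=1e-8):
    """Evaluate mu(X; theta, m) of Eq. (5) and Sigma(X, X'; theta, S) of Eq. (6)."""
    K_hh = kernel(H, H) + jitter * np.eye(len(H))  # k(H, H), stabilised (assumption)
    K_xh = kernel(X, H)                            # k(X, H)
    K_hxp = kernel(H, Xp)                          # k(H, X')
    A = np.linalg.solve(K_hh, K_xh.T).T            # k(X, H) k(H, H)^{-1}
    B = np.linalg.solve(K_hh, K_hxp)               # k(H, H)^{-1} k(H, X')
    mu = A @ m
    Sigma = kernel(X, Xp) - K_xh @ B + A @ S @ B
    return mu, Sigma

# Hypothetical data: five inducing inputs on [0, 1] and two query points.
H = np.linspace(0.0, 1.0, 5).reshape(-1, 1)
X = np.array([[0.2], [0.7]])
mu, Sigma = conditional_mean_cov(X, X, H, np.zeros(5), rbf(H, H))
```

With the initial choice \({\varvec{m}} = 0\) and \({\varvec{S}} = k({\varvec{H}},{\varvec{H}};{\varvec{\theta}})\), the last two terms of Eq. (6) cancel, so the sketch returns the prior: a zero mean and the plain kernel covariance.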

3.2 Parameter Initialization with MLE

To update the hyper-parameters \({\varvec{\theta}}\) and the noise variance parameters \(\eta_{*}^{2}\), we apply gradient ascent on the log-marginal-likelihood or, equivalently, gradient descent on its negative. The negative log-marginal-likelihood of the initial parameters can be expressed as follows [31]:

$${\mathcal{L}}\left( {{{\varvec{\uptheta}}},\eta_{*}^{2} } \right) = \frac{1}{2}{\varvec{m}}^{{\text{T}}} k({\varvec{H}},{\varvec{H}};{\varvec{\theta}})^{ - 1} {\varvec{m}} + \frac{1}{2}{\text{log}}\left| {k\left( {{\varvec{H}},{\varvec{H}};{\varvec{\theta}}} \right)} \right| + \frac{n}{2}{\text{log}}\left( {2{\uppi }} \right).$$

Let \(\partial {\mathcal{L}}\left( {{\varvec{\theta}},\eta_{*}^{2} } \right)/\partial \mu \left( {{\varvec{H}};{\varvec{\theta}},0} \right) = 0\) and \(\partial {\mathcal{L}}\left( {{\varvec{\theta}},\eta_{*}^{2} } \right)/\partial \Sigma \left( {{\varvec{X}},{\varvec{X}}^{{\prime }} ;{\varvec{\theta}},{\varvec{S}}} \right) = 0\). We can initialize the training procedure by hypothesizing that \({\varvec{m}}_{0} = 0\) and \({\varvec{S}}_{0} = k\left( {{\varvec{H}},{\varvec{H}};{\varvec{\theta}}_{0} } \right),\) where \({\varvec{\theta}}_{0}\) is the initial set of hyperparameters, \({\varvec{m}}_{0}\) is the initial mean value, and \({\varvec{S}}_{0}\) is the initial covariance matrix. Then \(\mu \left( {{\varvec{H}};{\varvec{\theta}},{\varvec{m}}} \right) = {\varvec{m}}\) and \(\Sigma \left( {{\varvec{H}},{\varvec{H}};{\varvec{\theta}},{\varvec{S}}} \right) = {\varvec{S}}\) of MLE can be obtained by optimizing the hyper-parameters and noise variance parameters of the model.
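As an illustration, the expression in Eq. (7), which has the form of a negative log-marginal likelihood (smaller values indicate a better fit), can be evaluated with a Cholesky factorization. The sketch below is only one reasonable way to compute it; the function name and the test values are hypothetical.

```python
import numpy as np

def neg_log_marginal_likelihood(K, m):
    """Eq. (7): 0.5 m^T K^{-1} m + 0.5 log|K| + (n/2) log(2 pi).

    K is the kernel matrix k(H, H; theta) and m the mean vector.
    """
    n = len(m)
    L = np.linalg.cholesky(K)                   # K = L L^T
    alpha = np.linalg.solve(L, m)               # alpha . alpha = m^T K^{-1} m
    log_det = 2.0 * np.log(np.diag(L)).sum()    # log|K| from the Cholesky factor
    return 0.5 * alpha @ alpha + 0.5 * log_det + 0.5 * n * np.log(2.0 * np.pi)

# Hypothetical 2 x 2 example: K = 2 I and m = (2, 0).
value = neg_log_marginal_likelihood(2.0 * np.eye(2), np.array([2.0, 0.0]))
```

For this example the three terms are 1, log 2, and log(2π), so a gradient-based optimizer would adjust \({\varvec{\theta}}\) to reduce their sum.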

3.3 Model prediction

The mean \({\varvec{m}}\) and covariance matrix \({\varvec{S}}\) of the stochastic process are the key parameters of our DSPM approach, and they can be estimated after parameter initialization. Rather than processing the whole dataset at once, it is convenient to train efficiently on mini-batches. Therefore, we update the mean \({\varvec{m}}\) and covariance \({\varvec{S}}\) of the hypothetical dataset \(\left\{ {{\varvec{H}},{\varvec{Z}}} \right\}\) by adopting the posterior distribution given a mini-batch of data \(\left\{ {{\varvec{I}},{\varvec{J}}} \right\}\) of size \(M\), as shown in Eqs. (8) and (9), where the factor multiplying \(\sigma_{\epsilon }^{2}\) is the \(M \times M\) identity matrix rather than the mini-batch inputs:

$$\mu \left( {{\varvec{H}};{\varvec{\theta}},{\varvec{m}}} \right) + \Sigma \left( {{\varvec{H}},{\varvec{I}};{\varvec{\theta}}, {\varvec{S}}} \right)\left( {\Sigma \left( {{\varvec{I}},{\varvec{I}};{\varvec{\theta}}, {\varvec{S}}} \right) + \sigma_{\epsilon }^{2} {\varvec{I}}} \right)^{ - 1} \left[ {{\varvec{J}} - \mu \left( {{\varvec{I}};{\varvec{\theta}},{\varvec{m}}} \right)} \right] \to {\varvec{m}},$$
$$\Sigma \left( {{\varvec{H}}, {\varvec{H}};{\varvec{\theta}}, {\varvec{S}}} \right) - \Sigma \left( {{\varvec{H}},{\varvec{I}};{\varvec{\theta}}, {\varvec{S}}} \right)\left( {\Sigma \left( {{\varvec{I}},{\varvec{I}};{\varvec{\theta}}, {\varvec{S}}} \right) + \sigma_{\epsilon }^{2} {\varvec{I}}} \right)^{ - 1} \Sigma \left( {{\varvec{I}}, {\varvec{H}};{\varvec{\theta}}, {\varvec{S}}} \right) \to {\varvec{S}}.$$

In this manner, we can predict the solution at a new test point \({\varvec{x}}^{*}\) through the predicted mean \(\mu \left( {{\varvec{x}}^{*} ;{\varvec{\theta}},{\varvec{m}}} \right)\) and the predicted variance \(\Sigma \left( {{\varvec{x}}^{*} ,{\varvec{x}}^{*} ;{\varvec{\theta}},{\varvec{S}}} \right)\), based on the mean value \({\varvec{m}}\) and covariance matrix \({\varvec{S}}\) of the hypothetical data, where \(\mu\) and \(\Sigma\) are obtained from Eq. (5) and Eq. (6), respectively. The complete procedure is summarized as pseudocode in Table 2.
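One mini-batch update of Eqs. (8) and (9) might look like the following sketch. It is written under stated assumptions, not taken from the paper: an RBF kernel with fixed hyper-parameters, a `jitter` term for numerical stability, and the hypothetical names `cond`, `minibatch_update`, `X_b`, and `y_b` for the conditional moments and the mini-batch \(\left\{ {{\varvec{I}},{\varvec{J}}} \right\}\).

```python
import numpy as np

def rbf(A, B, gamma=1.0, w=5.0):
    """Isotropic RBF kernel; gamma and w are illustrative values."""
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return gamma**2 * np.exp(-0.5 * w**2 * sq_dist)

def cond(X, Xp, H, m, S, jitter=1e-8):
    """Conditional mean (Eq. 5) and covariance (Eq. 6) given {H, m, S}."""
    K_hh = rbf(H, H) + jitter * np.eye(len(H))
    K_xh = rbf(X, H)
    A = np.linalg.solve(K_hh, K_xh.T).T          # k(X, H) k(H, H)^{-1}
    B = np.linalg.solve(K_hh, rbf(H, Xp))        # k(H, H)^{-1} k(H, X')
    return A @ m, rbf(X, Xp) - K_xh @ B + A @ S @ B

def minibatch_update(H, m, S, X_b, y_b, noise_var):
    """Apply Eqs. (8) and (9) once for a mini-batch (X_b, y_b)."""
    mu_h, S_hh = cond(H, H, H, m, S)             # mu(H), Sigma(H, H)
    mu_b, S_bb = cond(X_b, X_b, H, m, S)         # mu(I), Sigma(I, I)
    _, S_hb = cond(H, X_b, H, m, S)              # Sigma(H, I)
    G = S_bb + noise_var * np.eye(len(X_b))      # Sigma(I, I) + sigma^2 * identity
    gain = np.linalg.solve(G, S_hb.T).T          # Sigma(H, I) G^{-1}
    m_new = mu_h + gain @ (y_b - mu_b)           # Eq. (8)
    S_new = S_hh - gain @ S_hb.T                 # Eq. (9)
    return m_new, S_new

# Hypothetical usage: start from m0 = 0, S0 = k(H, H) and absorb one batch.
H = np.linspace(0.0, 1.0, 5).reshape(-1, 1)
m, S = np.zeros(5), rbf(H, H)
X_b = np.array([[0.1], [0.4], [0.9]])
y_b = np.sin(2.0 * np.pi * X_b[:, 0])
m, S = minibatch_update(H, m, S, X_b, y_b, noise_var=1e-2)
```

Because only \({\varvec{m}}\) and \({\varvec{S}}\) are carried from batch to batch, the memory cost depends on the number of hypothetical points rather than on the size of the full training set.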

Table 2 Pseudocode of our data-driven stochastic process modelling approach

4 Stochastic dynamic simulation of railway vehicle collisions

A detailed overview of the stochastic dynamic simulation of railway vehicle collisions using the DSPM approach is presented in Fig. 8. As can be seen from the schematic diagram, the framework can be divided into four sections, i.e. data collection and normalization, stochastic process modelling, DSPM training, and stochastic dynamic simulation and prediction. Specifically, data collection and normalization comprises raw data collection and data preprocessing; stochastic process modelling comprises the core extractor, stochastic model training, and hyper-parameter determination; DSPM training comprises model training and evaluation; and stochastic dynamic simulation and prediction comprises model validation and prediction. An outline of the main procedures of each section is given below.

Fig. 8

Framework of stochastic dynamic simulation of railway vehicle collision

4.1 Data collection and normalization

The database of our DSPM approach contains \(l\) calculation cases. Case \(i\) is calculated at the initial velocity \({\varvec{v}}_{0}^{i} \left( {i = 1,2, \ldots ,l} \right)\); then the dynamic response curve can be extracted and expressed as \({\varvec{p}}_{i} \left( t \right) = \left( {{\varvec{d}}_{i} \left( t \right),{\varvec{v}}_{i} \left( t \right),{\varvec{f}}_{i} \left( t \right),{\varvec{E}}_{i} \left( t \right)} \right)\left( {t = 1,2, \ldots ,\varepsilon} \right),\) where \({\varvec{d}}_{i} \left( t \right)\), \({\varvec{v}}_{i} \left( t \right)\), \({\varvec{f}}_{i} \left( t \right)\) and \({\varvec{E}}_{i} \left( t \right)\) represent the displacement, velocity, interface force, and internal energy at time step \(t\), respectively, while \(\varepsilon\) is the final time step. If we express the dynamic response curve in case \(i\) as \(g_{i} \left( \cdot \right)\), we can then write

$${\varvec{p}}_{i} \left( t \right) = g_{i} \left( {t,{\varvec{v}}_{0}^{i} } \right)\left( {t = 1,2, \ldots ,\varepsilon; i = 1,2, \ldots ,l} \right).$$

Denoting the new velocity value by \({\varvec{v}}_{0}^{{{\text{New}}}}\), the undetermined dynamic response curve for a new case can be expressed as

$${\varvec{p}}^{{{\text{New}}}} \left( t \right) = g^{{{\text{New}}}} \left[ {t^{{{\text{New}}}} ,{\varvec{v}}_{0}^{{{\text{New}}}} } \right],$$

where \({\varvec{p}}^{{{\text{New}}}} \left( t \right)\) represents the time series of the dynamic response when a new initial velocity \({\varvec{v}}_{0}^{{{\text{New}}}}\) and time \(t^{{{\text{New}}}}\) are given.

After gathering the sample data, we need to normalize the data to reduce the sensitivity of the DSPM approach. Then the feature extraction is performed on the sample data after post-processing to create the training data.
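The text does not state which normalization is used, so the sketch below simply assumes min-max scaling to [0, 1] and shows how the inputs \(\left( {t,{\varvec{v}}_{0}^{i} } \right)\) of all cases could be assembled into one training set. The function names and the example velocities are hypothetical.

```python
def min_max_normalize(values):
    """Scale a list of numbers linearly to [0, 1] (assumed normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no information; map it to zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def build_training_inputs(velocities, n_steps):
    """Pair every time step t = 1..n_steps with every initial velocity.

    Returns rows (t_normalized, v0_normalized), ordered case by case.
    """
    t_norm = min_max_normalize(list(range(1, n_steps + 1)))
    v_norm = min_max_normalize(velocities)
    return [(t, v) for v in v_norm for t in t_norm]

# Hypothetical example: three cases (10, 35, 60 km/h) with three time steps each.
rows = build_training_inputs([10.0, 35.0, 60.0], n_steps=3)
```

Scaling both inputs to a common range keeps a single kernel length-scale meaningful for time and velocity alike, which is one plausible reason for normalizing before training.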

4.2 Stochastic process modelling

In this paper, we combine four kernels, whose hyperparameters are optimized, to represent different noise properties [30,31,32,33].

  • The temporal trend is described by a radial basis function (RBF) kernel [32], which is given by

    $$k_{{{\text{RBF}}}} \left( {{\varvec{X}},{\varvec{X}}^{{\prime }} ;{\varvec{\theta}}} \right) = \gamma^{2} {\text{exp}}\left( { - \frac{1}{2}\sum w^{2} \left( {{\varvec{X}} - {\varvec{X}}^{{\prime }} } \right)^{2} } \right),$$

    where \(\gamma^{2}\) is a variance parameter and \(w\) is the scalar inverse length-scale of the isotropic variant of the kernel.

  • The time component is described by the periodic ExpSineSquared kernel with a periodicity parameter \(q\;(q > 0)\) [33], which is given by

    $$k_{{{\text{ESS}}}} \left( {{\varvec{X}},{\varvec{X}}^{{\prime }} ;{\varvec{\theta}}} \right) = \gamma^{2} {\text{exp}}\left( { - 2w^{2} \sin^{2} \left( {{\uppi }\left| {{\varvec{X}} - {\varvec{X}}^{{\prime }} } \right|/q} \right)} \right).$$

  • Smaller irregularities are described by a Rational Quadratic kernel, which captures them better than an RBF kernel component because it can accommodate several length scales [34]. It is parameterized by a scale mixture parameter \(\alpha > 0\) and is defined as follows:

    $$k_{{{\text{RQ}}}} \left( {{\varvec{X}},{\varvec{X}}^{{\prime }} ;{\varvec{\theta}}} \right) = \gamma^{2} \left( {1 + \frac{{w^{2} \left( {{\varvec{X}} - {\varvec{X}}^{{\prime }} } \right)^{2} }}{2\alpha }} \right)^{ - \alpha } .$$

  • Finally, the White kernel is applied as part of the sum-kernel to describe the noise as independently and identically normally distributed [35].

Overall, these four kernels make it possible to describe the noise present in the real-world collision process.
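As a rough illustration of how the four kernels combine, the sketch below evaluates each one for scalar inputs and adds them into a single sum-kernel. It assumes the standard textbook forms of the ExpSineSquared and Rational Quadratic kernels, with \(w\) acting as an inverse length-scale; the function names, the parameter dictionary, and the numerical values are hypothetical and not taken from the paper.

```python
import math

def k_rbf(x, xp, gamma, w):
    """RBF kernel: smooth long-term trend."""
    return gamma**2 * math.exp(-0.5 * w**2 * (x - xp) ** 2)

def k_ess(x, xp, gamma, w, q):
    """ExpSineSquared kernel: periodic component with period q > 0."""
    return gamma**2 * math.exp(-2.0 * w**2 * math.sin(math.pi * abs(x - xp) / q) ** 2)

def k_rq(x, xp, gamma, w, alpha):
    """Rational Quadratic kernel: mixture of length scales, alpha > 0."""
    return gamma**2 * (1.0 + w**2 * (x - xp) ** 2 / (2.0 * alpha)) ** (-alpha)

def k_white(x, xp, noise_var):
    """White kernel: i.i.d. noise, non-zero only for identical inputs
    (a simplification; implementations usually compare sample indices)."""
    return noise_var if x == xp else 0.0

def k_sum(x, xp, p):
    """Sum of the four kernels; p holds one parameter set per component."""
    return (k_rbf(x, xp, **p["rbf"]) + k_ess(x, xp, **p["ess"])
            + k_rq(x, xp, **p["rq"]) + k_white(x, xp, **p["white"]))

# Hypothetical hyper-parameter values, chosen only for demonstration.
params = {
    "rbf": {"gamma": 1.0, "w": 2.0},
    "ess": {"gamma": 1.0, "w": 1.0, "q": 0.5},
    "rq": {"gamma": 1.0, "w": 3.0, "alpha": 1.5},
    "white": {"noise_var": 0.1},
}
prior_variance = k_sum(0.3, 0.3, params)
```

At coincident inputs every component reduces to its variance parameter, so the prior variance is the sum of the three \(\gamma^{2}\) values and the white-noise variance.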

4.3 Model training

The available data are generally divided into three sample sets, i.e. a training set, a validation set, and a testing set. Using the DSPM approach, we can predict the confidence intervals using just two sample sets, namely, the training and validation sets. To improve the accuracy of our model, we employ the k-fold cross-validation method [36, 37], in which the training data are split into k subsets; in each round, k−1 subsets are used for training and the remaining subset is used for validation. Specifically, in this paper, the FE simulation results obtained at different initial impact velocities are used as training data for our model. Several training sessions are performed to train and validate the model; then the hyperparameters referred to in Sect. 3 are determined according to the validation results. Consequently, a stochastic model with a 95% confidence interval is selected during the training sessions, and the relationship between the inputs and the outputs is built by training the data-driven stochastic process model.
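The k-fold splitting step can be sketched in a few lines. The generator below is a generic illustration with a hypothetical name; it assumes the samples are indexed 0 to n−1 (for example, one index per FE simulation case) and that folds are taken as contiguous blocks without shuffling.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation.

    Each sample index appears in exactly one validation fold; the other
    k-1 folds form the training set of that round.
    """
    indices = list(range(n_samples))
    # Distribute the remainder so that fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size

# Hypothetical example: six simulation cases split into three folds.
folds = list(k_fold_indices(6, 3))
```

Averaging the validation error over all rounds gives a less optimistic estimate than a single split, which is why the hyperparameters are selected from the validation results.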

4.4 Simulations and evaluation

To validate this model, the parameters are optimized to minimize the mean square error (MSE) between the output value and the expected value. The optimization continues until the MSE meets the required tolerance, which yields a data-driven model that not only predicts the dynamic characteristics but also gives the distribution of the dynamic response considering stochastic factors.

Let \({\varvec{X}}_{{\text{t}}} = \left( {x_{{\text{t}}}^{\left( 1 \right)} ,x_{{\text{t}}}^{\left( 2 \right)} , \ldots ,x_{{\text{t}}}^{{\left( {M} \right)}} } \right)^{{\text{T}}}\) and \({\varvec{Y}}_{{\text{t}}} = \left( {y_{{\text{t}}}^{\left( 1 \right)} ,y_{{\text{t}}}^{\left( 2 \right)} , \ldots ,y_{{\text{t}}}^{{\left( {M } \right)}} } \right)^{{\text{T}}}\) denote the inputs and outputs of test samples, respectively, where \(M\) is the size of the test samples, and \(\hat{y}_{{\text{t}}}^{\left( i \right)} = \hat{g}\left( {x_{{\text{t}}}^{\left( i \right)} } \right)\) denote the ith predicted output of the data-driven model.

We employ four different metrics to quantitatively characterize the prediction accuracy of our proposed model, among which the MSE [38] is calculated as follows:

$${\text{MSE }} = \frac{1}{M}\mathop \sum \limits_{i = 1}^{M} \left( {\hat{y}_{\text{t}}^{\left( i \right)} - y_{\text{t}}^{\left( i \right)} } \right)^{2} .$$

By taking the square root of the MSE, we obtain the root mean squared error (RMSE) [39], as in Eq. (16):

$${\text{RMSE }} = \sqrt {\frac{1}{M}\mathop \sum \limits_{i = 1}^{M} \left( {\hat{y}_{\text{t}}^{\left( i \right)} - y_{\text{t}}^{\left( i \right)} } \right)^{2} }.$$

Another metric we apply to assess the model’s prediction accuracy is the mean absolute error (MAE) [40] as defined in Eq. (17):

$${\text{MAE }} = \frac{1}{M}\mathop \sum \limits_{i = 1}^{M} \left| {\hat{y}_{\text{t}}^{\left( i \right)} - y_{\text{t}}^{\left( i \right)} } \right|.$$

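Finally, the goodness of fit \(R^{2}\) [41] can be obtained in accordance with Eq. (18), where \(\overline{y}_{\text{t}}\) is the mean of the test outputs:

$$R^{2} = 1 - \frac{{\mathop \sum \nolimits_{i = 1}^{M} \left( {y_{\text{t}}^{\left( i \right)} - \hat{y}_{\text{t}}^{\left( i \right)} } \right)^{2} }}{{\mathop \sum \nolimits_{i = 1}^{M} \left( {y_{\text{t}}^{\left( i \right)} - \overline{y}_{\text{t}} } \right)^{2} }}.$$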
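The four metrics of Eqs. (15) to (18) can be computed together, as in the following sketch. The function name and the sample values are hypothetical; the formulas follow the definitions above.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R^2 for paired test outputs and predictions."""
    M = len(y_true)
    residuals = [yp - yt for yt, yp in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)           # sum of squared residuals
    mse = ss_res / M                                 # Eq. (15)
    rmse = math.sqrt(mse)                            # Eq. (16)
    mae = sum(abs(r) for r in residuals) / M         # Eq. (17)
    y_bar = sum(y_true) / M                          # mean of the test outputs
    ss_tot = sum((yt - y_bar) ** 2 for yt in y_true)
    r2 = 1.0 - ss_res / ss_tot                       # Eq. (18)
    return {"MSE": mse, "RMSE": rmse, "MAE": mae, "R2": r2}

# Hypothetical example: one of four predictions is off by 1.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
# -> MSE = 0.25, RMSE = 0.5, MAE = 0.25, R2 = 0.8
```

MSE and RMSE penalize large deviations more strongly than MAE, while \(R^{2}\) relates the residual scatter to the spread of the test outputs themselves.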

In this approach, the initial conditions are used as the model input, and the predicted mean and deviation are the output. Once the model has been trained, it can be used to predict the mean values and deviations of the dynamic response under varying initial conditions. Finally, two illustrative examples are provided, and the prediction results are compared with those obtained by two baseline methods, i.e. the FE simulation and GPR, to illustrate the accuracy of our model. A detailed discussion regarding the stochastic model is presented in Sect. 5.

5 Illustrative example analysis

We used two collision scenarios in this section to investigate the dynamic simulations of railway vehicle collisions, one in which the lead car collides with a rigid wall and one in which the lead car collides with another lead car. An analysis of the basic trend and distribution range of the results obtained from the stochastic process analysis was conducted to validate the accuracy of the DSPM method. All the cases in this paper were carried out on the Windows platform with Core i7, 3.4 GHz processor, and 8 GB RAM.

5.1 Collision scenario set up

We designed two different collision events in order to encompass the vast majority of aspects of real train collision scenarios. The first scenario involves the collision between the lead car and a rigid wall, and the second one is the collision between one lead car and another lead car, where the moving lead car A crashes with the static lead car B. The specific collision scenario is shown in Fig. 9.

Fig. 9

Two collision scenarios: a lead car to rigid wall; b lead car to lead car

In line with actual operating conditions, the initial velocities are generally limited to below 100 km/h [42]. For a lead car, the effect of a head-on train impact (case 2) with an initial velocity v is almost equivalent to that of a rigid-wall impact (case 1) with an initial velocity of 2v. Since drivers can trigger an emergency brake before the collision happens, the initial velocity is unlikely to reach the maximum running velocity. Therefore, the initial velocity range for case 1 was set from 10 to 60 km/h, and the collision time was set to 300 ms to simulate a high-speed collision, covering a history of 300 sample states (sampling period = 1 ms). Similarly, case 2 represents the general case in which the initial velocity ranges from 20 to 50 km/h [43], for which the collision time was set to 100 ms, covering 1000 sample states (sampling period = 1 ms). The generated data were then fed into the DSPM approach described in Sect. 3 to predict the dynamic response curve with deviations. Table 3 lists the parameters of the two collision scenarios in our work.

Table 3 Parameters of the collision scenario

5.2 Model training

In case 1, we applied the FE method and the DSPM approach to calculate the dynamic response of the railway vehicle (displacement, velocity, interface force, internal energy, and kinetic energy) at six different impact velocities (Table 3). The two sets of responses are compared in Fig. 10.

Fig. 10 Dynamic responses of the lead car in case 1: a displacement; b velocity; c interface force; d internal energy; e kinetic energy; f average goodness-of-fit of case 1

Figure 10 illustrates that the displacement of the lead car gradually increases over time and then decreases slowly, with a tendency to become flat after 0.3 s. Within 0.15 s after the collision, the car's velocity gradually decreases until the vehicle comes to a standstill. The internal energy gradually increases and reaches its maximum value at 0.1 s, while the kinetic energy gradually decreases to its minimum value at the same time. The interface force curve generally goes through three stages: a rapid increase, a stable stage, and a rapid decrease. When the impact velocity exceeds 50 km/h, the interface force rises rapidly to a peak and then quickly decreases, so the curve evolves from three stages to two. The fluctuation of the interface force stays within the 95% confidence interval, and the dynamic responses of our model are fundamentally consistent with the trend of the FE results for both validation and prediction. Figure 10f shows that the goodness-of-fit (R²) of displacement, velocity, interface force, internal energy, and kinetic energy for case 1 is 0.998, 0.992, 0.961, 0.996, and 0.987, respectively, implying that the model fits these dynamic responses well.
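The goodness-of-fit values above can be read as coefficients of determination. The formula used is not given in this section, so the sketch below assumes the standard definition R² = 1 − SS_res/SS_tot, with the FE curve as the reference; the function name and the sample values are illustrative only.

```python
def r_squared(reference, predicted):
    """Coefficient of determination of `predicted` against `reference`."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    if ss_tot == 0:
        raise ValueError("reference is constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


# Made-up numbers standing in for an FE curve and a model prediction.
fe_curve = [0.0, 1.0, 2.0, 3.0]
model_curve = [0.1, 0.9, 2.1, 2.9]
print(round(r_squared(fe_curve, model_curve), 3))  # approximately 0.992
```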

In case 2, we also applied the FE method and the DSPM approach to calculate the dynamic responses of lead car A (displacement, velocity, interface force, internal energy, and kinetic energy) at three different impact velocities. The results of the two methods for lead car A are compared in Fig. 11.

Fig. 11 Dynamic responses of lead car A in case 2: a displacement; b velocity; c interface force; d internal energy; e kinetic energy; f average goodness-of-fit in case 2

As shown in Fig. 11, over the 1000 ms of the collision, the displacement of lead car A shows an increasing trend, while its velocity shows a decreasing trend. The interface force approaches zero after a series of cyclic fluctuations. The internal energy gradually increases and reaches its maximum value 0.5 s after the collision, while the kinetic energy gradually decreases to its minimum value at the same time. Figure 11a–e also shows that the dynamic responses of our model are essentially consistent with the trend of the FE results, and that the fluctuation of the dynamic responses stays within the 95% confidence interval (marked as the orange area). Figure 11f shows that the goodness-of-fit (R²) of displacement, velocity, interface force, internal energy, and kinetic energy of lead car A in case 2 is 0.999, 0.944, 0.950, 0.993, and 0.927, respectively, indicating that our model performs well in fitting the collision dynamic response curves of railway vehicles. Overall, the comparisons between the DSPM approach and the FE method show that the dynamic responses obtained from the FE method lie essentially within the range predicted by the DSPM approach, especially for displacement, velocity, internal energy, and kinetic energy.
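One way to quantify the statement that the FE responses lie within the 95% confidence interval is to compute the fraction of reference points covered by the predicted band. The sketch below assumes a Gaussian predictive distribution, so that the band is mean ± 1.96 standard deviations; the helper names and the numbers are hypothetical.

```python
Z_95 = 1.96  # two-sided 95% quantile of the standard normal (assumed Gaussian band)


def confidence_band(mean, std, z=Z_95):
    """Return the lower and upper bounds of the band mean +/- z * std."""
    lower = [m - z * s for m, s in zip(mean, std)]
    upper = [m + z * s for m, s in zip(mean, std)]
    return lower, upper


def coverage(reference, mean, std, z=Z_95):
    """Fraction of reference points that fall inside the predicted band."""
    lower, upper = confidence_band(mean, std, z)
    inside = sum(lo <= r <= hi for r, lo, hi in zip(reference, lower, upper))
    return inside / len(reference)


# Illustrative values only: the last reference point lies outside the band.
reference = [1.0, 2.0, 3.0, 4.0]
pred_mean = [1.1, 1.9, 3.0, 5.0]
pred_std = [0.1, 0.1, 0.1, 0.1]
print(coverage(reference, pred_mean, pred_std))  # 0.75
```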

5.3 Model validation

To verify the model, we applied the DSPM approach to predict and analyse the dynamic response of rail vehicles at a velocity of 35 km/h in case 1 and case 2, and then compared the results with those of the FE method. In this section, we focus on the collision dynamic responses of displacement, velocity, internal energy, and kinetic energy. The predicted results for case 1 are shown in Fig. 12, and those for case 2 are shown in Fig. 13.

Fig. 12 Predicted results and relative errors in case 1: a displacement; b velocity; c internal energy; d kinetic energy

Fig. 13 Predicted results and relative errors of lead car A in case 2: a displacement; b velocity; c internal energy; d kinetic energy

In case 1, the predicted mean values (red line in Fig. 12) and deviation ranges (orange area in Fig. 12) of the dynamic responses are obtained by training on 1806 samples. Figure 12 shows that, although the relative errors of displacement and internal energy are relatively large at the beginning because of the numerical changes around the initial stage, the predicted collision dynamic responses are consistent with the trend of the FE results, and the fluctuation of the dynamic responses also stays within the 95% confidence interval. Meanwhile, the average relative errors of displacement, velocity, internal energy, and kinetic energy are 0.022, 0.125, 0.037, and 0.25, respectively, which supports the accuracy of the DSPM approach.

In case 2, the predicted results for lead car A, including the mean values (red line in Fig. 13) and deviation ranges (orange area in Fig. 13), are obtained by training on 3303 samples, and the DSPM and FE results are then compared. The relative errors increase with the fluctuations of the predicted results, which nevertheless remain within the 95% confidence interval (marked in orange). Moreover, the average relative errors of the DSPM approach in displacement, velocity, internal energy, and kinetic energy are 0.04, 0.03, 0.07, and 0.15, respectively, implying that the dynamic response can be accurately described with this approach.
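The average relative errors quoted for the two cases can be computed along the following lines. The exact error definition is not stated here, so this sketch assumes the pointwise form |predicted − reference| / |reference| with the FE curve as reference, and skips points where the reference is numerically zero; the function names, the `eps` threshold, and the sample values are assumptions.

```python
def relative_errors(reference, predicted, eps=1e-12):
    """Pointwise relative errors, skipping points where the reference is ~0."""
    errors = []
    for r, p in zip(reference, predicted):
        if abs(r) < eps:
            continue  # relative error is undefined for a zero reference
        errors.append(abs(p - r) / abs(r))
    return errors


def average_relative_error(reference, predicted, eps=1e-12):
    """Mean of the defined pointwise relative errors."""
    errors = relative_errors(reference, predicted, eps)
    if not errors:
        raise ValueError("no reference value is large enough to define an error")
    return sum(errors) / len(errors)


# Illustrative values: the small early reference value dominates the average.
fe_values = [0.0, 0.01, 1.0, 2.0]
model_values = [0.0, 0.02, 1.1, 2.1]
print(relative_errors(fe_values, model_values))
print(average_relative_error(fe_values, model_values))
```

In this toy example the second point has a relative error of about 1.0 despite an absolute error of only 0.01, which is the kind of effect that inflates relative errors where the reference response is still close to zero.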

5.4 Comparative analysis

With the derived optimum parameters, the corresponding predicted results for case 1 and case 2 can be obtained. The mini-batch and data size hyperparameters are set to 1000 and 500, respectively, and the maximum number of iterations is set to 1000. The final predicted results of the DSPM approach and the average performance indicators are listed in Table 4. Box plots of the errors are presented in Fig. 14a (case 1) and Fig. 15a (case 2), and the goodness-of-fit (R²) results of our method and the GPR method are shown in Fig. 14b (case 1) and Fig. 15b–c (case 2).
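To make the role of the mini-batch and iteration settings concrete, the sketch below draws one random mini-batch of sample indices per iteration, as is typical in stochastic variational inference. It is not the authors' training code: the generator name and the seed are hypothetical, the total of 1806 samples is taken from case 1, and the "data size" setting of 500 is not modelled because its exact role is not spelled out in this section.

```python
import random


def minibatch_indices(n_samples, batch_size, max_iterations, seed=0):
    """Yield (indices, scale) for each iteration of a mini-batch optimiser.

    `scale` = n_samples / batch size is the usual factor by which a
    mini-batch data term is rescaled to estimate the full-data term.
    """
    rng = random.Random(seed)
    size = min(batch_size, n_samples)
    scale = n_samples / size
    for _ in range(max_iterations):
        # Sample without replacement within a batch.
        yield rng.sample(range(n_samples), size), scale


batches = list(minibatch_indices(n_samples=1806, batch_size=1000,
                                 max_iterations=1000))
first_indices, first_scale = batches[0]
print(len(batches), len(first_indices), first_scale)
```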

Table 4 Comparisons between our DSPM approach and FE and GPR methods

Fig. 14 Comparison between the DSPM and GPR methods in case 1: a box plot; b R²

Fig. 15 Comparative results analysis in case 2: a box plot; b R² of car A; c R² of car B

Through comparative analysis of Table 4, Fig. 14 and Fig. 15, the following results are obtained:

  • FE simulation is the most commonly used method in engineering, but its simulation time is almost 10 times longer than that of the other two methods, so it suffers from a high computational cost.

  • The GPR method is generally used to deal with stochastic processes because of its strong fitting ability. However, it is difficult to apply and promote in engineering because of its insufficient predictive ability and high computational complexity.

  • Our proposed DSPM approach, with its lower error and higher R², has a clear advantage in efficiency and accuracy over the FE and GPR methods.

Overall, as can be seen from these two collision cases, the proposed DSPM approach shows higher prediction accuracy and extrapolation capability. Moreover, the comparative results indicate that our proposed method is generally more efficient and robust than the FE and GPR methods.
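For readers who wish to reproduce error box plots of the kind used in this comparison, the statistics behind each box can be obtained as in the sketch below. The error values are invented for illustration, the helper name is hypothetical, and the quartile convention (the inclusive method of Python's `statistics.quantiles`) is an assumption, since the plotting convention used for Figs. 14a and 15a is not stated.

```python
import statistics


def five_number_summary(errors):
    """Minimum, quartiles, and maximum of a list of errors."""
    q1, median, q3 = statistics.quantiles(errors, n=4, method="inclusive")
    return {
        "min": min(errors),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(errors),
    }


# Invented relative errors, used only to exercise the helper.
example_errors = [0.01, 0.02, 0.03, 0.04, 0.05]
print(five_number_summary(example_errors))
```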

6 Challenges faced by DSPM approach

6.1 Insufficient training samples

We still face some challenges, such as how to obtain valuable training samples. In the future, few-shot learning techniques such as meta-transfer learning may be introduced to make better use of limited training samples.

6.2 Inappropriate training strategy

Because of its demanding training requirements, the data-driven approach is not yet applicable in many circumstances. In the future, the training strategy, including the hidden layers, network nodes, transfer functions, and other hyperparameters, should be studied systematically.

6.3 Poor interpretability

Interpretability has always been a key issue affecting the credibility and usability of data-driven approaches. In the future, hybrid modelling methods that combine data science with physical principles will be an important research direction.

7 Conclusion

In this paper, we propose a novel DSPM approach for collision dynamic simulation to accurately predict and analyse the dynamic response with a confidence interval in the collision process. The main conclusions are summarized as follows:

  (1) Considering different kinds of stochastic noise, the DSPM approach can provide high-precision predictions of the dynamic response in collision dynamic simulations of railway vehicles.

  (2) The DSPM approach improves calculation efficiency by encoding a large amount of data into mini-batches in complicated collision dynamic simulations of railway vehicles.

  (3) The comparison results demonstrate that the DSPM approach outperforms the FE and GPR methods in terms of efficiency and accuracy.

Moreover, the scaling of data-driven stochastic processes is, and will remain, an important research area, and it may open up applications of this approach in other stochastic dynamic simulations of railway vehicles. In the future, we plan to address the challenges identified above and to refine the DSPM approach further by obtaining valuable training samples, analysing the training strategy, and improving model interpretability.