Application of Tribological Artificial Neural Networks in Machine Elements

Traditionally, analytical equations used in tribo-dynamic modelling, such as those used for predicting central film thickness within elastohydrodynamic lubricated contacts, have led to timely computations, but tend to lack the accuracy of numerical solvers. However, it can be shown that data-driven solutions, such as machine learning, can significantly improve computational efficiency of tribo-dynamic simulations of machine elements without comprising accuracy relative to the numerical solution. During this study, artificial neural networks (ANNs) are trained using data produced via numerical solutions, which are constrained by the regimes of lubrication to ensure the quality of the training data set. Multiple ANNs are then implemented to predict EHL central film thickness, as well as viscous and boundary friction, in multiple commonly used machine elements, such as a rolling element bearing and a spur gear. The viscous and boundary friction ANN prediction are compared directly against ball-on-disc experimental measurements to validate its accuracy.


Introduction
Tribology plays an important role for the enhancement of sustainable design and reduction of CO 2 production, especially in mechanical power generation and transmission. Tribology is the study of three fundamental phenomena of 1 3 3 Page 2 of 16 interacting surfaces with relative rolling and sliding motions: friction, lubrication, and wear. Advancements in this field could reduce energy losses by up to 8.7% on a global scale over a 15-year period starting from 2019 [1]. The focus of this paper will be the time-efficient prediction of a lubricant's film thickness within mechanical component case studies, including a roller element bearing and a spur gear. The case studies are key sub-components found within automotive applications for both electric propulsion and internal combustion drivetrains alike; however, the methodology described in this work applies to many more industries and their subsequent contacting componentry. The prediction of a lubricant's film thickness is important because it ultimately describes the separation of two surfaces and whether there is partial interaction of two contiguous surfaces.
Historically, complex non-linear problems in tribology like the prediction of lubricant film thickness have been solved by one of two approaches. The first is a numerical approach [2], where systems of partial differential equations are set up to define the state of the contact and are consequently solved iteratively [3]. This method produces accurate results and is applicable to a large range of operating contact conditions. However, it is not only computationally expensive due to its iterative nature, but also there is insufficient initial conditions. The second approach is to use experimental data to create analytical equations that can be applied to specific lubrication regimes [4]. These equations provide quick estimates of certain key parameters, such as central [5] and minimum film thickness [6] and viscous [7] and boundary friction [8]. The drawback of this approach is that each equation has a limited range of applicability, a lower accuracy due to its simplified nature, and requires adequate effort to collect the experimental data. The presented workflow applies the use of Machine Learning (ML) algorithms in the form of Artificial Neural Networks (ANNs) to predict the central film thickness as well as viscous friction to compare their relative performance against analytical equations and numerical solutions. Whilst the ANN lacks the physical understanding of the numerical solution, it performs almost as fast as an analytical solution whilst benefiting from the accuracy of the numerical solution [9]. A methodology is proposed to ensure the ANN is trained within a valid range of the numerical solver.
ANNs were first utilised in tribology simulation [10] in the 1990s to model wear behaviour and predict the life of tools [11], coating materials [12], and mechanical systems [13]. Following these initial studies, they were successfully applied to the prediction of the coefficient of friction and wear rate of composite materials [14][15][16] and tool steels [17] to aid component design. The studies mentioned above modelled friction and wear in dry conditions, but ANNs have also been applied to lubricated contacts to model friction [9,18,19] and lubricant viscosity [20,21] to aid lubricant design. Additional work has been completed to model wear in manufacturing processes [22] including specific applications like the estimation of flank wear in drilling [23] and modelling of the surface roughness of machined parts [24]. These studies worked with fixed material properties and used process-specific inputs (i.e. speed, torque, vibration) instead of material properties as in studies [11][12][13][14][15][16][17][18][19][20][21].
The common feature between studies [11][12][13][14][15][16][17][18][19][20][21][22][23][24] was the use of experimental data to train ANN models, with the goal of interpolating the material behaviour between the data points for varying conditions. The limitation of this approach is that experimental data are expensive and time consuming to generate. As a result, the training dataset for the ANN is limited to approximately a hundred points [14,15] or less [11,12]. To develop more complex ANNs and achieve greater accuracy, it is necessary to have a sufficiently large dataset [25,26]. Wang et al. demonstrated the capability of ANNs to predict maximum pressure in thermohydrodynamic contacts when trained on data obtained from a numerical model using Reynold's equation [27]. This modelling technique achieved an acceptable level of accuracy with lower computational expense than the numerical equivalent model. It is therefore possible to generate vast training datasets from iterative numerical models and create more accurate ANN models covering a wide range of operating conditions for dynamic simulations.
Almqvist demonstrated the use of Physics Informed Neural Networks (PINN) to solve Reynolds boundary value problem [28]; however, whilst not the purpose of the study, noted that the benefit of the method was neither accuracy nor computational efficiency, but these could improve with time and research. When simulating tribology in conjunction with dynamic simulations of machine elements, accurate and time-efficient simulation are of importance, therefore datadriven solutions currently are preferable. Marian et al. were the first to use a Finite Element Method (FEM) to generate EHL film thickness data for the purpose of training an ANN to predict an EHL film distribution [29]; the ANN was a factor of ∼ 25 times more computationally efficient than the FEM solution.
This study uses multiple tribological ANNs trained upon a numerical EHL solution, which weakly couples the elastic deformation and hydrodynamic pressure, alongside a Greenwood and Tripp deterministic asperity friction model. The outputs of the ANNs include EHL central film thickness, viscous, and boundary friction. These ANNs are used explicitly in dynamic simulations of multiple machine element applications, bearings, and gears, which has not been reported hitherto. Numerically computed EHL film thicknesses and pressure distributions are used to generate a dataset for the purpose of training the ANNs. The training data will be based on the applications of machine elements at relatively high speeds. The training data range is constrained to lubricated steel-steel contacts to guarantee its validity and applicability within the Piezo-Rigid and predominate Piezo-Elastic EHL regime. Separate boundary and viscous ANNs will be used to compute the overall friction in boundary to mixed EHL lubrication regimes. The viscous friction ANN will be able to compute within Newtonian and Non-Newtonian regimes. The overall friction computed by the ANNs will be validated against experimentally obtained test results from a ball-on-disc tribometer. Overall friction is then predicted throughout a spur gear's meshing cycle explicitly to show the computational efficiency advantage gained by the use of data-driven ANN solutions.

Methodology Overview
A high-level summary flowchart of the methodology implemented in the following work is presented in Fig. 1. The numerical models for line contact film thickness were first established, alongside the numerical viscous and boundary friction models. The numerical database which includes the necessary input and outputs to be predicted structure is then constructed and assessed via statistical means and validity ranges of the models that are providing the data. The structure of the ANN model is then thoroughly assessed to find the optimum for predicting the output parameters.

Numerical Elastohydrodynamic Lubrication
In roller raceway or gear pair conjunctions the non-conformal nature of the contact leads to high pressures under load. These pressures result in local elastic deformation of the contact and an increase in lubricant viscosity. This regime of fluid film lubrication is known EHL [30].
In roller bearings and spur gears, the contacts are often modelled as a line contact. For a line contact, the contact dimensions are much larger in the direction of side leakage than they are across the contact. Pressure along the major axis is therefore assumed to be constant, hence the analysis can be formed in one dimension.
The contacting geometries can be represented by an equivalent rigid roller, of radius R , against a semi-infinite elastic half space of equivalent elastic modulus, E r . The analytical central film thickness is obtained from [5]: The non-dimensional material, speed, and load parameters are as follows: These types of analytical solutions for EHL are based on experimental or numerical solutions for central and minimum film thickness, some of which are based on only ten data points [31]. Whilst offering a timely solution for EHL calculations, these regressed equations were designed upon entrainments speed below 2 m∕s which brings into question their validity at higher entrainment speeds. Modern electric powertrain bearings can have entrainment speed as high as 20 m∕s which far exceeds the validation region of the analytical equations. Furthermore, for more sophisticated contact analysis, pressure and film thickness distribution can only be obtained from a full numerical solution. The numerical solution for the EHL problem is achieved through solution of Reynolds equation with elastic field and rheological models. The following form is used in this paper, assuming a thin film of Newtonian lubricant: The pressure distribution across the EHL contact differs from the dry Hertzian elliptical profile due to the variation of the lubricant film. The variation of this film is computed using the following equation: where the idealised film thickness parabola, generated by the first and second term, deviates due to the localised contact deformation caused by the pressure at each computational node along the contact domain (third term). Roelands equation [32] is used to model the increase in lubricant viscosity with pressure due to its known accuracy at higher contact EHL pressures: where Lubricant density changes with pressure due to the compressibility of the lubricant, hence the Dowson-Higginson model [33]: An iterative procedure is required to solve the numerical model due to the coupled nature of the film thickness and contact pressure. The method of solution is as follows: 1. The contact load is obtained from the dynamic model of the gear pair or bearing. 2. An initial estimation of lubricant film thickness, h 0 is made. 3. The computational domain is established based on the calculated contact half-width, a . Contact inlet and outlet distances are set to − 4.5 a to 1.5 a to represent a fully flooded contact condition. 4. Pressure and film thickness distributions are obtained through simultaneous solutions of Eqs. (3)(4)(5)(6)(7). The Newton-Raphson method is used for speed and robustness of convergence [34]. The iterative solution requires pressure criteria to be satisfied at each computational node across the contact.
where p = 1 × 10 −5 is the pressure convergence error. Under-relaxation is applied between successive iterations where the criterion is not met, using an under-relaxation factor typically in the region of 0.01 ≤ γ ≤ 0.8.
Convergence criterion is applied where w = 0.001 is the load convergence error.

Boundary and Viscous Friction
One significant consequence of a sliding contact is the resulting friction. For a lubricated system, there can be two constituent components to the friction that occurs: viscous and boundary friction. The former component is a result of the shearing of the fluid film and the latter is from the direct contact of the bounding surfaces.
There are several analytical equations developed to compute viscous friction of sliding-rolling interfaces. These equations have been derived via tribometers such ball-ondisc [35] and twin-disc machines [36]. Evans and Johnson's formulated equations for the maximum traction coefficient that complement the traction maps for four regimes of three lubricants (HVI 650, Santovac 5 and Santotrac 50).
The regimes are given as follows: (1) Linear viscoelastic (Newtonian) (2) Non-linear viscous (Eyring) (3) Non-linear viscoelastic (4) Elastic-plastic Bearing and gear pair traction regimes typically fall within the Newtonian and Eyring regimes. In the present study the numerical viscous friction computations are based upon these two regimes. The shear stress experienced by the lubricant across the EHL computational domain is integrated with respect to the discretised area to compute the viscous friction: The equation used for shear is dependent on the limiting shear case: where = 0.08 is an adjustment coefficient for a steel on steel contact [3].
In the case where the limiting stress is exceeded by the Newtonian shear, the lubricant then exhibits a non-linear response within the Eyring regime.
Within the Newtonian regime for a line contact, the shear of the lubricants can be derived via the Reynold's equation: Thus, the Newtonian shear stress (Eq. 14) can be substituted into Eq. 12: The central differencing method within the numerical EHL solver resolves the film thickness and pressure distributions required in Eq. 14. As the numerical friction solution relies upon the distribution of pressure and film thickness within the contact, computation times are largely dependent on resolving Reynold's equation within the usual iterative manner. The use of an ANN methodology negates the excessive computational times of the numerical solution, whilst maintaining accuracy.
To realise any boundary interaction of the surfaces a classical Greenwood and Tripp model is used [8]. The surface height distributions during this study are considered to be Gaussian, and the mean radius of curvature is assumed for the asperity summits. F 5∕2 ( ) and F 2 ( ) are statistical functions dependent upon the Stribeck Oil Parameter , which can be approximated by polynomial curve fitting [37]. The roughness parameter ζ of combined represented surfaces typically ranges between 0.04 and 0.07 [38]. The asperity contact load and area can be calculated using the following equations: The total boundary friction is obtained via The limiting shear stress 0 acknowledges an adsorbed thin film on the summit of contacting asperities and thus non-Newtonian shear. is typically considered to be 0.17 for steel-steel contact [3].
Total friction is the summation of the boundary and viscous components: 3 Page 6 of 16

Artificial Neural Network Methodology
ANNs consist of a set of interconnected neurons that can adapt to input data to approximate complex, non-linear functions. A schematic of the structure of one of the ANNs used in this work is given in Fig. 2. Previous implementations of ANNs in tribology and film thickness prediction have been limited to between one or three hidden layers [39]. This is due to increasing ANN structural complexity requiring more training time due to increased neuron and layer number. MATLAB 2022a [40] was used for the development of the ANNs.
As previously mentioned, unlike ANNs trained on limited numbers of experimental data, the production of numerical solution databases can provide vast databases in a relatively short space of time. Therefore, a decision between timely generation of an appropriately sized and valid database for any given application, training time and prediction accuracy must be made. The structure of the ANN is denoted in the following way as per [14]: where N represents the number of neurons in a layer. The subscripts 'in' and 'out' represent the input and output layers, whilst subscript 'h1', 'h2', and 'h3' represent hidden layers and 't' is the number of hidden layers.
A sensitivity study of various ANN structures, confined all by the same input ranges, was completed to evaluate their performance. This is summarised in Table 1.
The final selected structures were based on training time, coefficient of determination R 2 , and its potential for over training.

ANN Structure
The input parameters and the target values, in this case the film thickness and viscous friction, were normalised as per a standardised min-max normalisation function: where u and l , where the upper and lower normalised unit values of 1 and −1 , respectively. x denotes the dimensional target input value and x represents the final normalised input or output parameter of the ANN. The output layer of the ANN returns a normalised value, so therefore x max , x min must be stored as a variable to dimensionalise the output later.
The input training data set was split into three categories randomly: a training set, validation set, and test set. Each set contained 70%, 15% and 15% of the training data, respectively. A limit of 1000 epochs was also implemented. During back propagation of the ANN the mean-squared error (MSE) was used to evaluate the network's performance: where N is the number of data points being trained, validated, or tested. The coefficient of determination in line with [29], R 2 Eq. 22, was post-processed on the test data to evaluate the prediction quality of each ANN structure.
where t i and y i are the target value and the predicted value, respectively. y is the mean of the target sample. The numerator of the fraction in Eq. 22 is the residual sum of the squares, whilst the denominator is the total sum of the squares.
As suggested per [29], suitable activation functions of hyperbolic tangent (Eq. 23) and logistic sigmoid (Log-Sig) (Eq. 24) functions were used for the hidden layers. A rectilinear (ReLU) (Eq. 25) function was also evaluated.  A simple linear output activation function was used on the output. Several sizes of training data ( 600, 1000, 2000, and 5000 ) were tested to observe the influence over the prediction quality.
The ANN makes use of early stopping and regularisation in order to prevent statistical overfitting when conducting training [41]. The former improves generalisation of the trained network by observing the performance criteria of MSE of training data set relative to the validation training. Generalisation refers to the ANNs ability to adapt appropriately to previously unseen data drawn out of the training data set also known here as the validation set. If the validation set's error increases, training is stopped early. The latter technique modifies the performance criteria by accounting for the change in the mean square of the network weights and biases (MSW) (Eq. 26). A performance adjustment factor, ′ , can be applied to reduce the weights and biases (Eq. 27) for any given propagation and reduce the likelihood of overfitting.

Numerical Database Construction
ANNs can only perform as well as the quality and quantity of the data that it is provided, therefore the generation of a (23) tanh(x) = 2 1 + e −2x − 1, many quality data points for the database is of high importance. To improve the timeliness of numerical data point generation explicit parallelisation can be implemented, i.e. use of multiple computational cores of the CPU. When using MBD solvers implicitly it is not possible to exploit the use of explicit parallelisation since a particular time step is dependent on the present time step as well as on data of the same time step. The ANN functionality in turn can be embedded with an implicit MBD computation environment to significantly improve the computation times without the need of explicit parallelisation despite benefitting from it indirectly.
One previously developed EHL film thickness database constructed via a Finite Element Method (FEM) solver covered a vast range of lubricants and material properties for relatively low speed conditions ( < 0.4 m∕s of 2D line contacts) [29]. However, if a similar database was to be used for a numerical EHL solver, the stability and ranges of their validity may be exceeded. Therefore, training databases produced via numeric solvers should be constrained in some manner to ensure its validity. Marian et al. used Latin Hypercube Sampling (LHS) for EHL film prediction as an effective method to generate a training database to cover a vast design space with many factors [29].
In this study, LHS was used but was constrained by the Greenwood Regime chart [42] as shown in Fig. 3. The regions of the chart are Isoviscous Rigid (IR), Isoviscous Elastic (IE), Piezo-Elastic (PE), and Piezo-Rigid (PR). These bounded regions are indicational and are in fact areas of transition from one regime to another but are used here as constraints for the numeric databases. The bounding regions classifications are dependent on the geometric, material, and rheological properties of the contact. To ascertain this, dimensionless elasticity and viscosity parameters, G e and G v , respectively, are established. Fig. 3 Line contact Greenwood lubrication regime chart with bearing, training data, and spur gear points plotted [4] 3 Page 8 of 16 The IR region can be described as a hydrodynamic region (lightly loaded contacts). Any data points generated via the LHS methodology that were found to be in this bounded region were redistributed with valid newly generated LHS EHL data points. The Piezoviscous Elastic (PE) regime prevails when the contact pressures are sufficiently high to cause elastic deformation of the bounding surfaces and a rise in lubricant viscosity. This signifies an EHL regime of lubrication and is the most prevalent regime for contact in gear pairs and sufficiently preloaded roller bearings. Piezo-rigid contacts also feature within the training dataset which refers to little to no elastic deformation of the bounding surfaces but there is a rise in lubricant viscosity.
The contacts within two commonly used machine elements, a cylindrical element roller bearing and spur gear, were superimposed on the Greenwood chart. This ensured that the training data covered their respective points of operation. This further confirmed that the lubrication regimes of both case studies were well within the PE region. Maximum and minimum ranges for the input variables were chosen to reflect the ranges over which typical high-speed gears and bearing operations would cover. Table 2 shows the initial ranges of the training data set produced that the LHS methodology was implemented with. Further constraints on the numeric database were to ensure suitable Hertzian pressures over 300 MPa and below 3.5 GPa to ensure the numeric database stayed within the PR and PE regions. Training data sets for the central film thickness are openly available from Figshare [43].
Typically linear regression techniques can be significantly affected by multicollinearity of the input parameters. ANNs however are unaffected as it is an overparameterised methodology able to predict non-linear problems. When multicollinearity is apparent, the highly correlated input variables might require manipulation using methods to combine parameters, such as partial least squares, Principal Component Analysis (PCA), or ridge regression [44]. Figure 4 shows the absolute correlation factors between the input values for the EHL problem with the most significant bivariate correlation being between load and length with a magnitude of 0.23. This indicates a very weak correlation between the two parameters, but it must be noted that this may have been affected by the previous constraining of the input variables.
Interestingly in the study by Marian et al. [29], grouped non-dimensionalised parameters, denoted in Eq. 2 of the current study, were used as inputs and it was found to significantly impeded the ANNs ability to predict the central film thickness; the authors' of the current manuscript drew similar conclusions. ANNs can use principal component analysis similar to linear regression in an effort to reduce the dimensionality of the input space of the problem. The output of PCA conducted on the selected inputs of the present study shown in Fig. 5 shows that each of the variables provide a similar contribution to variance and therefore any reduction in input parameters may impact the ANNs ability to predict the film thickness.

Machine Element Applications
A simple shaft supported by two cylindrical roller bearings has been modelled in a flexible multi-body dynamic software. The shaft is radially loaded at the centre point between the two bearings using a static load application. The shaft is constrained to rotation about its central axis and lateral  Table 3.
The dynamic simulation is run explicitly, and results required as inputs to the ANN are output. Results are generated for an individual rolling element in its orbit around the bearing. These include roller load per unit length, reduced radius of the roller-inner race contact, and contact entrainment velocity and are shown in Fig. 6. The bearing model accounts for the non-linear Hertzian force-deflection relationship. Peak roller loading occurs as the roller enters the lowest point in its orbit, due to the vertical direction of the radial load applied to the shaft. The loading pattern is cyclic in nature as the roller enters and exits the loaded region. Sufficient preload ensures that the roller and raceways remain in contact throughout operation, and an EHL contact is maintained throughout. The contact reduced radius and entraining velocity do not change throughout the rollers orbit as these are a function of the bearing geometry and operating speed, which remain constant.
The lubricant film prediction ANN has been embedded into an explicit workflow with a dynamic simulation of spur gear pair with microgeometry modifications. The relevant dimensions and operating conditions of the spur gear are displayed in Table 4. A common spur gear modification of a linear tip relief has been applied as it reduces mesh impact of the single to double contact during the meshing cycle [45]. Table 5 provides the lubricant parameters used in both the bearing and spur gear case study.
All the relevant input parameters required by the ANN such as reduced radius, entraining speed, and load across a meshing cycle were computed using a one degree of freedom dynamic model as shown in Fig. 7. A single meshing cycle consisted of 198 data points to be computed. The model considers tooth bending and tilt using a Weber-Benesheck method [47] as well as local deformations using a Hertzian model. Peak loading occurs during a period of the meshing cycle where the load is supported via a single flank pair.   Cylindrical gears in automotive applications typically operate under a boundary to mixed EHL regime due to the contact pressures. Under mixed lubrication regimes, viscous friction is a predominate contributing factor to the overall friction.

Results
It is worth noting that the following results were performed on current consumer grade hardware to demonstrate the applicability of the ANN performance in comparison to conventional numerical solutions. (Hardware used: Intel® Core™ i7-9750H CPU 6 cores @ 2.60 GHz, 32 GB RAM; GPU: NVDIA GeForce GTX 1650).

ANN Structure Performance
The initial effort to generate a training database is of importance as the number of data points can influence the ANNs accuracy. During this study, the average time for a numerical data point to be generated was 5.88 s. This led to the database construction, using a single processor core, having a wall time between 0.98 h and 8.16 h for 600 to 5000 data points, respectively. Note the use of multiple processor cores could vastly improve the wall time over a single core. To find the optimum structure a parameter study was completed using over 500 different ANN structures on the same input data. The number of hidden layers is varied between one and four, and the number of neurons between ten and twenty. The three documented activation functions were studied in the various hidden layer configurations including a combination of hyperbolic tangent with the final layer being a logistic tangent function. R 2 is reported only for the test data sets as this gives the best indication of the models performance and generalisation in comparison to the training and validation performances. The full training procedure completion time was also documented for each ANN structure tested.
Values for R 2 using 600 data points and different activation functions, number of layers, and neurons are presented in Tables 6, 7, 8, 9, 10, 11, and 12. The ReLU function performed significantly worse across all structures in comparison to the logistic sigmoid and hyperbolic tangent functions. However, it is worth noting that there is slight variation in the performance of a network with each new training session. The optimum number of layers was between two and three similar to tribological applications of ANNs [39].
It was found that 600 data points were sufficient to train a relatively accurate ANN for central film prediction;   Fig. 7 Inputs to ANN for a spur gear a load, b reduced radius, and c entrainment and sliding speed     however, it must be noted in the current study more data points led to less variation in the R 2 values across all structures without necessarily overfitting. Although it must be noted that training times varied most significantly with increasing the number of data points rather than the network structure complexity. Figure 8a-d and Tables 9, 10, 11, and 12 demonstrate the influence of increasing the data points from 600 to 1000, 2000, and 5000 data points for a logistic sigmoid function. It was concluded during this study that 1000 to 2000 data points typically led to the ANNs ability to reliably predict the film thickness without leading to significant increases in training time. 5000 data points used with the most complex network structures exceeded five minutes with the aforementioned hardware and reached 1000 epochs.

Case Studies Overview
In order to realise the potential of ANNs to predict complex multifaceted problems they require to be demonstrated into scenarios which can benefit from their implementation. In order to assess the ANNs ability and suitability for operation in dynamic machine element modelling a few case studies are presented below. Performances of the ANNs are to be compared relative to the numerical models and experimental data.

Case Study: EHL Film Prediction for Bearings and Gears
The analytical formula from Eq. 1, the numerical solution as described in Section III, and optimum-structured ANN were  used to compute the EHL film thickness over a single meshing cycle of the spur gear and several cycles for the roller bearing. It is worth noting that the bearing and gear case studies operated within the range of the ANNs training dataset. The film thickness computed by each methodology for each of the machine element's angular displacement is displayed in Fig. 9a and b. For both the bearing and spur gear applications Fig. 9a and b demonstrates clearly the ANNs (solid blue) ability to predict a similar central film thickness as the numerical method (long dash red) with minimal MSE error. The analytical (short dash green) method on the other hand over predicts the film by up to 0.1 μm in the spur gear case study, in real terms this could allow for misinterpretation of little to no boundary interaction depending upon the surface roughness. Table 13 demonstrates the relative performance of the analytical and ANN against the numerical solution for the MSE. The computation time per data point was also documented for each of the case studies in Table 12 to demonstrate the timely solution of the spur gear and bearing film thickness using the analytical and ANN methods. The analytical solution is a factor of ~ 75 faster than the ANN but is significantly less accurate as described by the MSE. The ANN represents approx. ~ 1500 factor computation reduction over the numerical solution with marginal error. Despite significant computational gains, it must be noted for widespread adoption of EHL ANNs in MBD simulations and its ability to extrapolate for conditions in which the input range of the original training datasets is exceeded must be appropriately addressed. Whilst this is not the focus of the current study, it should be addressed in the future works to advance the use of this data-driven solution.

Case Study: Friction ANN Validation Against MTM
Throughout a gear meshing cycle, the contact between two flanks changes its slide-to-roll ratio (SRR). A typical maximum SRR of a spur gear is 50% . One way to experimentally explore this contact-level phenomenon in a controlled manner is to use a ball-on-disc tribometer MTM2 (PCS Instruments, London, UK), where friction can be measured. Five tests were conducted, specifically varying the entrainment speed only, using a Group III base oil with a contact pair made of AISI 52,100 steel. Each of the discs were measured using Alicona Infinite Focus microscope (Alicona, Graz, Austria) at four different locations and the topographical parameters averaged.   A reduced order 2D EHL method was used to simulate the contact in discrete slices as per [48]. Table 14 shows the inputs into the model. The numerical methodology to compute the viscous and the deterministic Greenwood and Tripp methodology for boundary friction, as documented in Section IV and V, were used. A training data set of 10,000 was used due to the increased number of factors to consider in the design space. Moreover, to accommodate the slice-reduced order methodology, the load and length training input parameters of Table 2 were reduced to reflect the smaller contact lengths ( 0.01 − 10.0μm ), whilst maintaining the similar load per unit length ratios. Separate viscous and boundary friction ANN's were trained using the following ANN network structures, respectively, using hyperbolic tangent activation functions for the hidden layers: Each ANNs training achieved an R 2 value of 0.9985 and 0.9993 . The MTM friction results compared against the ANN, and numerical solutions are shown in Fig. 10. The Stribeck Oil Parameter is plotted to demonstrate the lubrication regimes experienced, mixed to EHL. One potential reason of both models overprediction of friction at the experiment's higher entrainment speeds at the fixed SRR of 50% could be due to shear thinning.

Case Study: ANN Friction Application for Spur Gears
The numerical and ANN viscous friction methodologies were also used in same spur gear example. As the gear flanks approach the pitch point the sliding velocity reduces towards zero. Therefore, the friction coefficient should be at its minimum. Both presented methodologies are capable of predicting this event as shown in Fig. 11. Literature has at points misused analytical lubricated friction formulas, such as those presented by Evans and Johnson [36] in gear applications where sliding-rolling motion is present. The ANN was ~ 1300 times faster at computing the friction compared the numerical solution.

Conclusion
Tribological ANNs have been applied to two commonly used machine elements, a roller element bearing and a spur gear with microgeometry modification, for the prediction of EHL film thickness as well as friction. During this study, the ANNs have proven to be accurate as well as a fast computation technique for explicit use in dynamic modelling. Despite ANNs shortcomings of having no physical understanding, the data-driven solution has provided similar levels of accuracy to numerical computation techniques. The ANNs numerically generated training data was produced via Latin Hypercube Sampling (LHS) and then constrained via the Greenwood Regime to ensure the quality of the data, which directly affects the ANNs ability to predict with a high level of accuracy. It was also found that at entraining velocities of the gear (~ 4 m/s) and the roller element bearing (~ 9.5 m/s) the analytical equation deviated significantly from the numerical solution for the central film thickness whilst the ANN could replicate it with very small margins of error. During these specific case studies, the ANN was shown to perform ∼ 1500 times faster than the numerical solution.
Multiple regimes of friction are accounted for by the ANNs, including boundary, Newtonian, and non-Newtonian viscous friction. A numerical and ANN approach were compared directly against a lubricated ball-ondisc (MTM) study with a fixed slide-to-roll ratio and an increasing entrainment velocity. The numerical solution used a 2D-reduced order technique to replicate the contact pressure and film thickness distribution to then calculate the viscous and boundary friction. The separately trained viscous and boundary friction ANNs combined to achieve similar levels of accuracy relative to the experimental and numerical result; however, the ANN vastly outperformed the numerical solution with regards to computational efficiency. The ANN was also demonstrated to be capable of predicting a spur gear pair's changing friction coefficient throughout its meshing cycle, where the load, slide-to-roll ratio, and contact geometry varies significantly.
Future work should look to embed a tribological ANN within a dynamic co-simulation to seek the full benefits of the accuracy of a numerical solution but far faster simulation times. For this data-driven solution to be more widely adopted in MBD simulations, its ability to extrapolate beyond its training input data range needs be addressed, as this could affect the stability of the dynamic co-simulation.