Predicting EHL film thickness parameters by machine learning approaches

Non-dimensional similarity groups and analytically solvable proximity equations can be used to estimate integral fluid film parameters of elastohydrodynamically lubricated (EHL) contacts. In this contribution, we demonstrate that machine learning (ML) and artificial intelligence (AI) approaches (support vector machines, Gaussian process regressions, and artificial neural networks) can predict relevant film parameters more efficiently and with higher accuracy and flexibility compared to sophisticated EHL simulations and analytically solvable proximity equations, respectively. For this purpose, we use data from EHL simulations based upon the full-system finite element (FE) solution and a Latin hypercube sampling. We verify that the original input data are required to train ML approaches to achieve coefficients of determination above 0.99. It is revealed that the architecture of artificial neural networks (neurons per layer and number of hidden layers) and activation functions influence the prediction accuracy. The impact of the number of training data is exemplified, and recommendations for a minimum database size are given. We ultimately demonstrate that artificial neural networks can predict the locally-resolved film thickness values over the contact domain 25-times faster than FE-based EHL simulations (R2 values above 0.999). We assume that this will boost the use of ML approaches to predict EHL parameters and traction losses in multibody system dynamics simulations.


Introduction
Reducing friction and wear losses in highly loaded lubricated tribo-contacts of machine elements or mechanical components is essential for developing energy-efficient and reliable systems [1][2][3]. In particular, the modeling of concentrated, elastohydrodynamically lubricated (EHL) contacts (see Fig. 1), in which local elastic deformation of the rubbing surfaces and hydrodynamic fluid film formation are superimposed, is comparatively complex and computationally expensive [4]. For a sufficiently accurate and computationally efficient incorporation of EHL effects into higher-level multibody system dynamics simulations, analytically solvable approximating equations are generally used to estimate integral fluid film parameters [4].
To generalize the EHL calculations and their results, non-dimensional similarity groups were introduced. Dowson and Higginson [5,6]  material, velocity, load, and lubricant film parameters, and proposed curve-fitted regression formulas (hereinafter referred to as "proximity equations") relating those parameters from numerical simulation results. Blok and Moes [7,8] proved that these four parameters could be transformed into three independent parameters, thus introducing a load and a viscosity parameter in addition to the fluid film parameter. Later, Johnson [9] proposed film thickness parameters for elasticity and a pressure viscosity coefficient, which, in turn, can be derived from the Blok/Moes notation. More recently, Habchi et al. [10] suggested using the Weissenberg number, the Nahme-Griffith number, a limiting shear stress-pressure coefficient, and a thermo-viscous regime indicator. However, the non-dimensional groups proposed by Dowson/Higginson and Blok/Moes are the most widely used ones in EHL literature [4,11].
To estimate the minimum lubricant film thickness in an EHL infinite 2D line contact, Dowson and Higginson [5,6] proposed an analytically solvable regression equation as a function of velocity, material, and load parameters. Based upon more advanced isothermal EHL simulations, these parameters have been modified [7,12,13] and extended by various authors to cover 3D circular and elliptical point contacts and estimate the central film thickness [14][15][16][17][18][19][20][21]. Due to the transformability, the Blok/Moes parameters can be used to calculate the EHL film parameters. However, since these are only proximity formulas with a limited validity range, Johnson [9] differentiated four regions depending on the viscosity and material behavior. Besides the classical EHL regime, proximity equations were suggested for the hydrodynamic as well as for two semi-elastohydrodynamic regimes with rigid bodies and iso-viscous fluid behavior, respectively. Finally, Moes et al. [19,22] developed further approximations based upon precise EHL simulations with validity over a wider range of concentrated contacts.
In these correlations, thermal effects, limited oil supply, and shear-thinning fluid behavior have not been considered. One way to account for these aspects is to apply correction factors to the EHL film thickness calculations [4]. Various authors have proposed thermal correction factors [23][24][25][26][27], each with a different range of validity, to adjust the central or minimum EHL film thickness. Similar approaches have also been developed to adjust the EHL film thickness for starvation [28][29][30], fluid compressibility [31,32], non-Newtonian fluid behavior [33][34][35], and surface roughness effects under mixed lubrication conditions [36][37][38].
For more fundamentals and details about non-dimensional groups, film thickness equations, and applicable correction factors, the interested reader is referred to the recent review article by Marian et al. [4]. The latter [4,39] also hypothesized that machine learning (ML) or artificial intelligence (AI) algorithms provide opportunities to predict relevant EHL film parameters more accurately and effectively. Currently, ML and AI methods are receiving growing attention in the field of tribology [40] and involve the development of computing systems that are able to learn from training data (input) and build/refine experience-based models to predict a certain behavior (output) [41]. ML algorithms can be categorized as supervised, unsupervised and reinforcement learning, whereby the selection of suitable approaches is highly task-dependent [40,41]. For instance, support vector machine (SVM) represents an object set by a vector within a vector space. Hyperplanes within the space are used to separate the data points. Kernel functions are used to transform the vector space into www.Springer.com/journal/40544 | Friction any higher-dimensional space so that interlaced vector groups are linearly separable [42,43]. In the context of tribology, SVMs have been successfully applied to tailor composite materials [44,45] as well as in the area of drive technology [46,47] and manufacturing [48].
Furthermore, artificial neural network (ANN) is a flexible and the most widely used approach [40] with successful applications in the field of composite materials [44,[49][50][51], drive technology [52][53][54][55][56], manufacturing [48,57], surface engineering [58][59][60], and lubricant formulation [61,62]. ANN is inspired by natural brains' architecture and involves a number of simple but highly interconnected information processing elements (neurons) [63], see Fig. 2. Transfer or propagation functions determine the neurons' network input based on the output weighting, whereas the calculation of the outputs is achieved by activation functions considering a threshold value [41]. During training, the weightings and thresholds are adjusted to optimize the overall prediction quality. The topology or architecture of ANNs, i.e., how many neurons are parallelly arranged in each layer and how many hidden layers are between the input and output layer, needs to be tailored for the respective application, whereas overfitting must be avoided. ANN has also already been used in the context of predicting the behavior of EHL contacts. Even though there are first physics-informed ML approaches to predict the behavior of lubricated contacts [64], most of the work done so far was data-driven based upon designs of experiments (DoE) [40,65]. Otero et al. [66] trained an ANN with 20 neurons in a single hidden layer to predict the coefficient of friction in micro-textured EHL contacts under various operating conditions and dependent on the textures' dimensions and patterns. Thereby, the underlying data was obtained from experimental ball-on-disk experiments (mini-traction machine). Marian et al. [67,68] demonstrated that numerical EHL simulations could also be utilized for training approximation or meta-models. Nevertheless, ML approaches have not yet been adapted to predict lubricant film parameters in EHL contacts as a function of standard input variables, as is also accomplished by the well-known proximity equations [4,39], see Fig. 2.
In this context, this contribution is based on the hypothesis that ML approaches can predict relevant EHL film parameters such as the central film thickness h c and the minimum lubricant gap h min in 2D line and in 3D circular point contacts more efficiently than sophisticated simulation models and with higher accuracy and flexibility than analytically solvable proximity equations based upon non-dimensional  groups. Thereby, the following research questions will be addressed: 1) How accurate are the predictions of SVMs, Gaussian process regression (GPR), and ANNs compared to the established analytically solvable proximity equations?
2) Does using all available input parameters (see Fig. 2) instead of non-dimensional groups to train the ML approaches affect the prediction accuracy?
3) How do the ANN architecture and structure affect prediction quality? 4)What is the influence of the size of the data base on the prediction quality? 5) Is it possible to determine locally resolved lubricant film thickness distributions, i.e., the lubricant gap as a function of the contact length h(x), with sufficient accuracy and more efficiently than by EHL contact simulation?

Theory and methods
To answer these questions, analytically solvable proximity equations (Section 2.1) are compared with SVM, GPR as well as ANN (Section 2.4) based upon designs of experiments (Section 2.2) and data sets generated by EHL simulations (Section 2.3) with respect to their prediction quality of EHL film parameters. The overall approach is illustrated in Fig. 3.

Analytically solvable proximity equations
The proximity equations used to predict lubricant film parameters in EHL infinite 2D line or 3D circular point contacts are based on the most widely used non-dimensional similarity group proposed by Dowson and Higginson [5,6], as summarized in Table 1. Thereby, E' is the reduced Youngs modulus, R the effective radius, l the contact length, η the fluid viscosity, α p the pressure-viscosity coefficient, u m the entrainment velocity, and F N the normal load. These parameters can be transformed into the notation from Blok and Moes [7,8], see Table 2.
Given their widespread use, practicability and applicability to the overall scope covered within this contribution (see Section 2.2 and Refs. [4,11]), the proximity equations derived by Dowson and Toyoda [14] [5,6].
Load parameter Viscosity parameter 1 4 (2 ) which represent polynomial regression equations, were used to estimate the central and minimum lubricant film thicknesses. Within this study, circular point contacts and, therefore, the same radii in x-(R x ) and in y-direction (R y ) were assumed for the ellipticity parameter 0.64 1.03 www.Springer.com/journal/40544 | Friction

Design of experiments
To generate a sufficient data base to determine correlations between input and output variables with minimum computational effort, statistical DoE methods can be employed to systematically plan and evaluate computer experiments. In addition to full and partial factorial, central composite, or Box-Behnken, designs tailored to the characteristics of numerical simulation and statistical modeling can be used. To generate a broad spectrum of approximation or meta-models, a Latin hypercube design (LHD) or Latin hypercube sampling (LHS) from the group of equally distributed test fields are particularly suitable. Thereby, the data points are distributed so that they fill the factor space as evenly as possible and provide information about almost every region of the factor space with little computational effort (small number of simulations n s ) despite many factors n f . The LHS is based on a modification of a generated LHD, which represents an n s × n f matrix containing random permutations of the numbers {1, 2, 3, ... n f } within its columns. The LHS elements are generated by subtracting a random number between zero and one Z r [0, 1) from each LHD element x ij,LHD and then dividing this value by the number of trial points [69]: Unlike other DoEs, the number of data points is directly specified in the LHS. The resulting test field can then be transformed to the desired factor space. In this work, the desired factor space is delimited by the minimum and maximum values of the EHL parameters, as summarized in Table 3. The ranges of these values were based on typical machine elements that operate under hard EHL conditions, such as rolling bearings or gears [70], and were chosen to cover the parameter space without substantial changes to the individual simulation models (e.g., in terms of numerical stabilization, see Section 2.3) and with as many converged calculations as possible. Thus, the ranges of the non-dimensional Dowson/Higginson parameters (U, G, W) and Blok/Moes parameters (M, L), as shown in Table 3, could be covered. Thereby, the conventional cases of EHL infinite 2D line and 3D circular point contacts were sampled with 1,500 and 1,000 data sets, respectively. The former was able to include more data points due to the lower computational effort. Moreover, it was investigated for the 2D case to what extent a reduced data base (600, 300, and 100 data points) affects the prediction quality. According to Johnson et al. [71], the distances between the data points can be used to assess the quality of the test field regarding uniform distribution and freedom of correlation. The minimum distance between the individual test points was maximized for a suitable and uniformly distributed LHS test field. Considering all distances of the test field d, the MaxiMin criterion was obtained, with the integer, positive, and application-dependent factor ξ [69]. Within the scope of this work, the MATLAB's Statistics and Machine Learning Toolbox was used to generate an LHS optimized according to the MaxiMin criterion.

EHL modeling
Numerical EHL modeling was done by applying the steady-state isothermal Reynolds differential equation [72]: Couette term Poisuille term in a slightly modified notation for hydrodynamics, whereby p is the pressure, h the lubricant film thickness, η the viscosity, ρ the density, θ the fractional film content, u 1 and u 2 the surface velocities (see Fig. 1). The fluid was assumed to be compressible and piezo-viscous with the density following Dowson and Higginson [73]: as well as the dynamic viscosity following Roelands [74]: This is an important prerequisite to be consistent with the assumptions employed to obtain the EHL film thickness proximity equations (see Section 2.1). Cavitation was addressed by a mass-conserving penalty formulation of the fractional film content [75] 2 ( ) where γ(p) is zero if p < 0 and otherwise a sufficiently high algebraic number ξ. The lubricant gap equation was comprised by the rigid body motion h 0 , a quadratic approximation of the undeformed geometry and the elastic deformation δ. The latter was calculated by solving the steady-state linear elasticity equation for an equivalent elastic body with negligible body forces, which can be expressed in the contract matrix (Voigt) notation as  (23) where C is the generalized Hooke's law elasticity matrix, ε L the contracted Lagrangian small-strain tensor, and U the displacement vector. The surface displacement is admitted being the normal component Furthermore, the equivalent body was assumed to be of a homogenous isotropic material with equivalent Youngs modulus and equivalent Poisson's ratio by applying the linear elasticity equation. The integral of the hydrodynamic pressure over the contact domain Ω c balanced the normally applied load to satisfy the force equilibrium: To solve the EHL problem and to ensure good conditioning, the relevant variables were normalized on Hertzian (subscript H) or reference values (subscript 0): The numerical solution scheme was based on the full-system approach [76], whose overall procedure is depicted in Fig. 4(a). After reading the inputs, initial values were determined following the Hertzian theory to define an initial guess for the elastic deformation solution. Subsequently, the Reynolds equation was solved in a weak formulation on the contact domain Ω c and fully (strongly) coupled with the calculation of the elastic deformation in the solution domain Ω based upon FEM. Especially for higher pressures, instabilities can occur in the solution of the Reynolds equation. This is because convection-diffusion equations converge to local oscillations of the solution variables when solved with Galerkin FEM in convectiondominated problems [77]. Partially, computational stability can be improved by finer discretization and higher-order approximation functions. In this work, 5th and 7th order were used for the 2D and 3D models, respectively. Moreover, other stabilization methods can serve as artificial diffusion by introducing additional terms into the transport equations. In this study, the residual-based (consistent) stabilized Galerkin least squares (GLS) method [78] and the inconsistent method of isotropic diffusion (ID) [79] were used, whereby care was taken to minimize the influence of the stabilization on the numerical solution.
Regarding the boundary conditions, zero pressure (Dirichlet) was applied at the contact domain's (Ω c ) in-and outlet. Furthermore, zero displacements on the bottom, the hydrodynamic pressure as normal stress on the top (Ω c ), and free boundary with zero normal and shear stresses on the remaining borders were applied as boundary conditions of the elastic body (Ω). The domains differed between the 2D infinite line contact ( Fig. 4(b)) and the 3D circular point contact case (Fig. 4(c)) and were chosen sufficiently large to avoid numerical starvation, thus ensuring that the results correspond to the ones predicted by an infinite elastic half-space approach. The 2D domain was discretized by triangular elements and the 3D domain by tetrahedral elements with refinements in the contact center of the upper surface. Symmetry boundary conditions were used for the 3D case to reduce computational efforts. The interested reader is referred to Refs. [77,80,81] for more details about the fundamentals of FEM applied to EHL problems and its implementation within the software COMSOL Multiphysics.

Machine learning
Various ML regression methods, including SVM, GPR, and ANN, were employed to predict EHL film parameters using the MATLAB's Machine Learning and Deep Learning Toolbox [82]. All data (input and output values) were normalized to values between -1 and 1 using the MATLAB built-in function "mapminmax". The ML models were developed using training data to provide the best possible predictions for unknown test data. For this purpose, the data sets (see Section 2.2) were divided into 85% for training and 15% for testing. Due to its superior expressiveness [83], the coefficient of determination defined as sum of squaredregression was used to evaluate the prediction quality. The above equation represents the proportion of the variation in the response variable y explained by the independent input variables, whereas y is the arithmetic mean of the target variables, and ŷ is the approximation value [84,85]. The prediction quality of the ML approaches was compared after training with the original input data of the EHL simulations (please refer to Fig. 1 and Table 3) and after training with the non-dimensional parameters U, G, W, and M, L (please refer to Section 2.1 and Table 3), respectively.

Support vector machines
The linear epsilon-insensitive SVM (ε-SVM) regression was implemented in MATLAB's Machine Learning and Deep Learning Toolbox [82]. The goal was to find a function for each training point x n of a multi-variate set of N observations that was as flat as possible through formulation as a convex optimization problem and deviated from the response value y n by a value smaller than ε [82]. The slack variables n  and * n  are included to deal with otherwise infeasible constraints results in the objective function (primal formula) constrained by [82,86]: with the box constraint C, a numeric value to control the penalty imposed on observations outside the ε-margin. The linear ε-insensitive loss function is described by [82]: To computationally simplify this optimization problem, with the nonnegative multipliers α n and * was used to predict new values which depend only on the support vectors [82]. Thereby, G k (x j , x k ) was a linear or polynomial semidefinite kernel function to map x to a higher dimensional space [82]. Finally, the Karush-Kuhn-Tucker (KKT) complementarity conditions were applied to find optimal solutions [82]: The minimization problem was solved by an iterative single data algorithm (ISDA) [82,87].

Gaussian process regression
A GPR non-parametric probabilistic model was implemented in MATLAB's Machine Learning and Deep Learning Toolbox [88]. The GPR explained the response by latent variables f (x i ) from a Gaussian process (GP) and explicit base functions. The covariance function of the latent variables described the smoothness of the response, and base functions h b (x) transferred the inputs into a p-dimensional space. A GP is a set of random variables with Gaussian distribution, and was defined by its mean m(x) and kernel parametrized or hyper-parametrized covariance function k(x, x') [88]: With the error variance σ² and the p-by-1 vector of the base function coefficients β b , the response of the GPR was modeled as [88]: During training, the MATLAB built-in function "fitrgp" estimated the base function coefficients β b , the noise variance σ², and the hyperparameter θ k of the kernel function [88].

Artificial neural networks
Multi-hidden layer feedforward backpropagation ANNs were trained by the MATLAB's app "nnstart" utilizing the Levenberg-Marquart algorithm [89][90][91]. Thereby, the neuron weights were adjusted layer-bylayer, starting backwards with the connection to the output layer [92]. 10% of the training data was used for validation to avoid overfitting. At first, the most favorable architecture of the ANNs was determined, for which the number of neurons per layer varied between 10 and 20, while the number of hidden layers was assumed to lay between zero and three. If not explicitly mentioned, the hyperbolic tangent function  (44) was used as activation function, linear identity as output function and as propagation function [92,93]. After having identified the best architecture, the prediction was compared to training with sigmoid (logistic): and identity activation functions [93]. The results were compared with data obtained after training with reduced data sets (600, 300, and 100 data points) to study the influence of the number of training data. In the FEM simulations used for data generation, the global central and minimum lubricant film thicknesses and the spatially resolved field quantities (e.g., fluid pressure, surface elastic deformation, lubricant film gap) were calculated. To use the ANNs to predict the local lubricant film thickness in the EHL infinite 2D line contact, the position along the contact length (-1.2 ≤ X ≤ 1.2) was used as an additional input variable. With a resolution of 86 uniformly distributed points in this length domain, a total number of 128,914 data sets were obtained, which were divided into training, test and testing data.

EHL simulations
Representative distributions of the lubricant film gap and the hydrodynamic pressure along the contact length of the infinite 2D line contact for four exemplary cases (lower M and higher L, higher M and higher L, lower M and lower L, and higher M and lower L) are shown in Fig. 5. The presented diagrams illustrate typical characteristics of EHL contacts, such as the elastic flattening in the contact center and the additional constriction in the lubricant gap near the contact outlet. Compared to the Hertzian www.Springer.com/journal/40544 | Friction theory, the fluid pressure distribution increased earlier due to lubricant compression at the contact inlet and displayed the additional Petrusevich peak at the contact outlet. The same holds for the 3D circular point contact (not shown for spatial reasons), whereby the film thickness showed the characteristic horseshoe shape. Generally, a higher M led to a fluid pressure distribution approaching the Hertzian theory and, with constant L, to a decrease in the lubricant film thickness. A higher L with constant M, in turn, increased the magnitude of the Petrusevich spike and the fluid film thickness. From the performed calculations within the LHS, 1,499 converged data sets for the 2D case and 764 for the 3D case were ultimately obtained to compare with the proximity equations (Section 3.2.1) and train the ML models (Section 3.2.2).

Analytically solvable film thickness equations
As shown in Table 4, the coefficients of determination of the analytically solvable proximity equations featured good prediction quality despite the DoE slightly exceeding these equations' validity range, see Table 3 and Ref. [4]. All data sets were used for the calculation, which reduces the influence of individual errors and outliers. The R² values were in a similar range as the fits between proximity equations and experimental film thickness measurements using an optical ball-on-disk tribometer as reported by van Leeuwen [11]. However, the comparison with calculated values derived from EHL simulations (Fig. 6) revealed that the film thicknesses tended to be slightly overestimated by the proximity equations, especially for larger lubricant gaps. This can be attributed to the partly different modeling aspects (e.g., the rheological and cavitation models) between the EHL simulations employed here and the original data used to obtain the proximity equations.

Global EHL film parameters
The R 2 values of the central and minimum film thicknesses for the infinite 2D line and 3D circular point contacts against testing data as predicted by SVM and GPE methods are summarized in Table 5 and graphically compared in Fig. 7. With the original input data (see Fig. 1 and Table 3), the values calculated with EHL simulations were accurately predicted by both SVM and GPR methods (Figs. 7(a)-7(h)). The coefficients of determination with values larger than 0.99 (Table 5) reached higher values than the analytically solved proximity equations (Section 3.2.1). An overestimation, as seen for the analytically solved proximity equations, was not verifiable. Comparing both ML approaches, GPR proved to be slightly superior to SVM (Table 5).    When using the non-dimensional groups U, G, W, or M, L instead of the original input data for training purposes, the values were predicted with significantly lower accuracy (Figs. 7(j)-7(l), 7(p)-7(v), and 7(x)) or even totally insufficiently (Figs. 7(m)-7(o) and 7(w)). The R 2 values fell well below 0.9 and almost zero for some cases ( Table 5). The substantially inferior prediction quality can be traced back to redundant input variables for different output variables. Since similar dimensionless parameters can result from different original input data, it is possible that similar dimensionless variables can lead to different film parameters (Table 6). These inconsistencies in the data base led to problems in the training of SVM and, particularly, GPR (Table 5). A possibility to overcome this issue is the use of the dimensionless film thickness H as output of the regression equations for training the ML approaches without prior dimensionalization. The unambigious prediction of the film thickness becomes possible by using only dimensionless parameters as input of the ANN and the dimensionless film thickness H as output since the latter is unique. Subsequently, the film thickness can be dimensionalized in the post-processing using the radius R.
The coefficients of determination for the prediction of the central and minimum film thicknesses for the infinite 2D line and the 3D circular point contacts against testing data using ANNs with 10, 12, 15, or 20 neurons in one, two, or three hidden layers are summarized in Tables 7 and 8. Thereby, the original input values have been used for training. It can be observed that the EHL film parameters were predicted with high accuracy (R 2 values above 0.99) by some ANN configurations. While the number of neurons played a less dominant role, the number of hidden layers had a decisive impact on the prediction quality. Therefore, the lowest R² values resulted in only one hidden layer, followed by three and finally two hidden layers with the highest accuracy. Regarding the number of neurons, the configurations with more than 10 neurons proved to be favorable. Since 12 neurons in 2 hidden layers gave the best results in the more complex 3D circular point contact (Table 8), this configuration was kept constant for the following predictions. The comparison of the calculated and predicted central and minimum film thicknesses for this configuration is depicted in Fig. 8. The prediction was more precise than that of the analytically solvable proximity equations (Section 3.2.1) and comparable to SVM or GPR ( Table 5). The 3D circular point contact case featured higher R 2 values against the testing data compared to the 2D line contact, which may be due to the considerably larger data base for the latter and thus some overfitting.
Similar to SVM and GPR, training the ANN with dimensionless groups instead of the original input parameters resulted in a considerable reduction of prediction quality (Table 9) due to inconsistencies in the training data set ( Table 6).
The coefficients of determination of the prediction with an ANN with 12 neurons in each of two hidden layers after training with hyperbolic tangent, logistic Table 9 Coefficients of determination of the prediction for the central and minimum film thicknesses for the infinite 2D line and 3D circular point contacts against testing data using an ANN with 12 neurons in each of two hidden layers after training with the original EHL inputs, the non-dimensional parameters U, G, W, or M, L. and identity activation functions are compared in Table 10. Please note that the output and propagation functions were kept constant and only the activation function was varied. The sigmoid activation function   To minimize the efforts required to generate the training data base, it is advisable to keep the number of training data as small as possible without affecting the prediction quality. The prediction coefficients of the central and minimum film thicknesses for the infinite 2D line contact with a varying number of training data are shown in Table 11. Even when the original training data set was reduced to 600 data points, the R² values remained at a very high level above 0.99, thus showing slightly better R² values compared to the 1,249 training data sets. This may imply some overfitting for the original 1,249 training data set. For 300 or 100 training data points, the coefficients of determination showed a decreasing tendency but stayed above 0.9. The deteriorated prediction accuracy due to an insufficient data base can also be seen from the comparison between the calculated and predicted values in Fig. 9. It can be concluded that for the present case with 12 varying input variables, at least 600 data points are required for the training. This also fits the data consisting of 600 sets used for the 3D circular point contact at the same number of input variables and explains its sufficient prediction quality.

Local EHL film distribution
For the prediction of the local lubricant film distribution for the infinite 2D line contact by an ANN with 12 neurons in 2 hidden layers, an overall coefficient of determination of 1.000 against testing data including all data sets and positions along the contact length was obtained. Thus, the model exhibited excellent prognosis accuracy, which can be recognized in the direct comparison of the lubricant film distribution calculated by FEM and the values predicted by ANN as depicted in Fig. 10. While the two curves were congruent over almost the entire contact length, including the minimum film thickness, only minor deviations were predicted close to the upstream of the film constriction at the contact outlet (slight overshooting of the ANN prediction as enlarged detail in Fig. 10). Regarding computational time, training and execution tasks must be differentiated. The training of the ANN with the locally resolved data required several minutes. To finally compare the execution time of the FEM simulation and the ANN, the lubricant film thickness for a data set not included in the training data was calculated using both methods on the same computer. Although the ANN was trained with 86 uniformly distributed data points along the contact length, it is possible to query any desired discretization. Since the EHL simulation provided the lubricant film thickness at 8,641 positions in the relevant contact length range, the same number of values was predicted with the ML algorithm for consistency. As illustrated in Table 12, ANN was over 25 times faster than the FEM model, while the calculation time is comparable to the analytically solvable proximity  equations solution. However, ANN offers the possibility to predict the lubricant film distribution locally. Moreover, the computational time could even be further improved when implementing the ANN in programming languages with a stronger focus on computational speed. By incorporating the ANN into multibody system simulations, their accuracy in friction and dynamics analyses or the identification of critical operating conditions could become even more precise without negatively affecting the required computational time.

Conclusions
This contribution demonstrated that ML approaches could predict relevant film parameters more efficiently than sophisticated EHL simulation models. Moreover, we verified that ML outperforms analytically solvable proximity equations based upon non-dimensional groups in terms of accuracy and flexibility. Based upon the presented results, we derive the following conclusions to address the introduced research questions: 1) Compared to the proximity equations according to Dowson/Toyoda/Higginson [5,6,14] for infinite 2D line contacts and Hamrock/Dowson [16] for 3D circular point contacts, all ML approaches (SVM, GPR, and ANNs) reproduced the behavior calculated by the FEM-based EHL simulations more accurately. Thereby, coefficients of determinations up to 1 were verified when predicting EHL film parameters within the operating conditions covered by the training data. Other film thickness equations, e.g., from Nijenbanning et al. [19], Wolf et al. [21], or Moes [22], may also enable accurate predictions. However, the implementation of these equations requires the definition of more complicated analytical expressions to fit the simulation results, which, in turn, increases the computational effort and the proneness to errors. In this regard, the omission of a large range of dimensionless parameters represents another benefit of the ML approaches employed to efficiently predict EHL film thickness without the need to define complex proximity equations. In contrast to proximity equations, the usage of original EHL input parameters is crucial for ML approaches. Using only the parameters G, U, W, or M, L notably decreases the prediction quality. We anticipate that this aspect can be overcome by training ML approaches with dimensionless film thickness as output instead of the dimensional one.
2) The architecture of ANNs and, especially, the number of hidden layers notably influences the accuracy. For the data set of this contribution, ANNs with 12 neurons in two hidden layers were most favorable. Suitable activation functions are hyperbolic tangent and sigmoid functions, while linear identity functions deteriorated the prediction accuracy.
3) The size of the data base affects the prediction quality. For ML approaches trained using 12 input parameters, roughly 600 training data proved sufficient to reach high R² values without overfitting.
4) It should be noted that the trained ML approaches were not physics-informed and, therefore, cannot provide physical explanations. Although not reflected in the results, the predictions have to be considered carefully as it is possible to obtain negative film thicknesses. Nevertheless, the trained ML approaches provided meaningful results when employed for the operating conditions covered in the training data base of this article. In the future, the architecture of ANNs can be further modified to exclude unphysical www.Springer.com/journal/40544 | Friction results by using a strictly positive linear function (e.g., the poslin function in MATLAB) instead of linear identity in combination with a normalization of the ouput data in the range of [0, 1]. This would help to ensure that the ANN will not predict negative film thicknesses for any input parameters.
5) In addition to global parameters, ML approaches can also predict the local lubricant film distribution. Compared to FEM-based EHL simulations, the trained ANNs achieved 25 times shorter computation times at prediction coefficients above 0.999. At present, lubrication behavior and dynamics calculations in multibody system simulations typically employ analytically solvable proximity equations and the Hertzian theory to account for fluid film formation and pressure distribution. We hypothesize that in the future, the incorporation of ML approaches such as ANNs into these higher-level simulations will add value to the engineering system design process. 6) In the future, other contact conditions and physical effects should be considered by training novel ML approaches without the need to define complex proximity equations or introduce further correction factors. Thereby, elliptical contacts, thermal effects and other rheological models should be considered. It is also conceivable that time-dependent squeeze, starvation effects, and the influences of surface roughness and solid asperity contact may be also taken into account.

Declaration of competing interest
The authors have no competing interests to declare that are relevant to the content of this article.