Robust estimation of clinch joint characteristics based on data-driven methods

Given a steadily increasing demand for multi-material lightweight designs, fast and cost-efficient production technologies, such as the mechanical joining process clinching, are becoming more and more relevant for series production. Since the application of such joining techniques is often based on the ability to reach similar or even better joint loading capacities compared to established joining processes (e.g., spot welding), a few contributions have investigated the systematic improvement of clinch joint characteristics. In this regard, the use of data-driven methods in combination with optimization algorithms has already shown high potential for the analysis of individual joints and the definition of optimal tool configurations. However, uncertainties, such as varying material properties, and their impact on clinch joint properties are often not considered, which can lead to poor estimation results and thus to a decreased reliability of the entire joint connection. This can cause major challenges, especially for the design and dimensioning of safety-relevant components, such as in car bodies. Motivated by this, the presented contribution introduces a novel method for the robust estimation of clinch joint characteristics including uncertainties of varying and versatile process chains in mechanical joining. To this end, the utilization of Gaussian process regression models is demonstrated and evaluated regarding the ability to achieve sufficient prediction qualities.


Introduction
In the field of mechanical joining processes, clinching belongs to the group of procedures that do not require auxiliary joining elements. Over the last decades, this cold-forming technique has become more and more relevant for the generation of lightweight designs, such as multi-material car body elements [1,2]. In particular, the opportunity to join two or more sheets with different blank surface conditions (e.g., coated materials) and dissimilar sheet thicknesses provides a highly efficient alternative to established thermal methods, such as resistance welding [3,4].
Christoph Zirngibl, zirngibl@mfk.fau.de
1 Engineering Design, Friedrich-Alexander-Universität Erlangen-Nürnberg, Martensstrasse 9, 91058 Erlangen, Germany
2 Product Life Cycle Management, Technical University Darmstadt, Otto-Berndt-Str. 2, 64287 Darmstadt, Germany
However, to guarantee a high reliability of the designed joining parts, not only the selection of a suitable mechanical joining technique but also the robust and accurate estimation of individual joint properties is crucial. This includes, for instance, the coverage of different joining tool configurations as well as uncertainties within the process chain, such as manufacturing tolerances of blanks or material property variations. Since it is very time- and cost-intensive to determine the influence of such diverse use-cases in experimental studies, it is often necessary to carry out a large number of design iterations until a satisfying product design is reached. In this context, several contributions have already introduced approaches for the numerical identification of improved clinch joint properties by utilizing machine learning methods and optimization algorithms [5]. However, the main focus was on the definition of an optimal joining tool configuration by investigating the influence of varying geometrical parameters, such as punch diameter or die depth.
Since these studies often applied a FE model considering constant material and process settings, the influence of changing conditions (e.g., varying surface states or blank thicknesses) was not represented by the trained regression models. Especially, unknown or unexpected changes in these process conditions can lead to incorrect estimations of the individual joint characteristics and thus to a significant loss of the regression model's ability to achieve high-quality prediction results.
Hence, this contribution introduces a method for the systematic set-up of robust regression models predicting clinch joint characteristics in varying and versatile process chains. This covers the determination of uncertainty distributions of the target variables caused by varying material and process parameters as well as the definition of prediction uncertainties within the trained machine learning algorithm. In summary, the presented results pave not only the way to a reliable and data-driven estimation of individual clinch joint characteristics but also to a more robust design of the entire joint connection.

Related work
Given the challenge of defining optimized tool configurations for improved clinch joint properties, a few contributions have investigated the application of data-driven methods. For instance, Oudjene et al. [6,7] increased the joint's resistance against tensile loading by optimizing the shape of the applied joining tool geometries using a Taguchi L18 design of experiment and response surfaces. This enabled an increase of the tensile force from 612 N to 796 N.
In addition to this work, Lebaal et al. [8] aimed to improve the results of Oudjene et al. [6,7] by applying Kriging interpolations for the building of the response surfaces. Based on this, an increase of the joint's resistance against tensile loading by 34% in comparison to the initial tool design was obtained.
A further development of this procedure was introduced by Roux and Bouchard [9] aiming to determine optimal shapes of the clinching tools taking also the joining history (plastic strain, ductile damage effects, and residual stresses) into account. Therefore, the application of an efficient global optimization algorithm combined with a Kriging metamodel achieved an increase of the mechanical joint strength for both the tensile (+13.5%) and shear (+46.5%) loading.
An approach for the utilization of an artificial neural network (ANN) for the prediction of clinch joint properties considering an extensible die was developed by Lambiase and Ilio [10]. In this regard, the simulation-based generation of training data enabled the initial fitting and validation of the ANN and the subsequent performance of a genetic algorithm to obtain an optimal clinching tool configuration.
In contrast to this, Wen et al. [11] introduced a reshaping process of the generated clinch joint, which decreases the protrusion height of the joint while simultaneously increasing the pull-out strength of the connection. As a result, the joint strengths were improved while the protrusion height was reduced by 60% (1.7 to 0.68 mm).
The authors in [12] applied a Grey-based Taguchi method to improve multiple clinch joint characteristics (interlock, neck, and bottom thickness) considering dissimilar materials. In this regard, the utilization of a Taguchi L27 orthogonal experimental design together with the notion of signal-to-noise ratio provided the opportunity to define an objective function. A subsequent ANOVA identified the impact of each input parameter on the selected quality-relevant clinch joint characteristics. Thus, the results of the study contained an optimal configuration of the geometrical tool parameters of punch, die, and blank holder.
Han et al. [13,14] investigated the impact of varying geometrical tool parameters on the flat-clinching and the clinching process including extensible dies. Therefore, the application of an orthogonal experimental design and the set-up of a finite element (FE) model enabled the definition of an optimal tool configuration joining the aluminum alloy Al5052.
In contrast to this, Wang et al. [15] improved the strength of the clinched joint by applying a shape optimization of the joining tools. For this purpose, the description of the die outline contour through a Bezier curve combined with a direct communication between the finite element model and an implemented genetic algorithm enabled the generation of several tool shapes. As a result, the joint's resistance against tensile (767 N to 1100 N) and shear (500 N to 652 N) loading was significantly increased.
The contribution of Wang et al. [16] introduces an approach to determine optimal clinching tool designs by applying a response surface method with a non-dominated sorting genetic algorithm (NSGA-II). Based on this, the generated Pareto frontier offers the opportunity to select an optimal clinch tool configuration for the investigated use-case.
Schwarz et al. [17] presented a procedure to define an optimal clinching tool design using the principal component analysis (PCA). Through this, it is possible to define a functional relationship between the investigated clinching tool parameters and the resulting geometrical clinch joint characteristics. Subsequently, the implementation of a genetic algorithm identifies optimal tool designs for the improvement of the neck and interlock thickness.
The authors in [18] showed a novel approach for the analysis and definition of improved clinching tool configurations applying a deep reinforcement learning algorithm. Therefore, the initial training and fitting of an agent, represented by an artificial neural network, provide the opportunity to predict the investigated quality-relevant clinch joint properties without the involvement of labeled data. In addition, the utilization of a value-based deep Q-learning algorithm obtains optimized process and tool parameters in an 8-dimensional solution space.
In addition, the application of data-driven methods has shown high potential for the analysis of further mechanical joining technologies. For instance, the authors in [19] introduced the application of an ensemble learning algorithm (XGBoost) for the estimation of the cross-tension strength of self-piercing riveted joints.
In summary, the presented contributions mainly focused on the identification of improved clinch joint characteristics considering data-driven methods and multi-objective optimization algorithms. However, as already explained, the generation of clinch joints is subject to uncertainties based on permissible tolerances of material and process. Therefore, besides the optimization of joint geometries, Drossel et al. [20] applied a sensitivity and robustness analysis in order to gain a deeper understanding of how varying tool and process parameters (e.g., changing surface and lubrication conditions) affect selected critical clinch joint properties, such as the interlock thickness. As a result, the authors introduced an exemplary strategy to reduce the failure probability of the joint connections through a more robust design of the entire joining process. Based on this, Drossel et al. [21] then combined the adaptive response surface method with the results of the sensitivity analysis to enable the identification of optimal tools for a few material combinations. However, the focus of both contributions was mainly on the identification of relevant input parameters and the definition of process design recommendations. The utilization of robust regression models for the prediction of individual clinch joint characteristics in versatile process chains was not investigated.

Research questions
While previous works mainly focused on the data-driven analysis of individual clinch joining tasks, aiming either at the definition of optimal tool configurations or at the prediction of joint characteristics, this contribution additionally integrates the influence of process uncertainties in the estimation procedure. In this context, the focus is especially on the achievement of a sufficient joining safety through a robust and reliable prediction of individual clinch joint properties. To this end, after the introduction of the methodical and theoretical background, the implementation of parameterized finite element simulation models combined with parameter studies provides the opportunity to answer three research questions (RQ).
At the beginning, the ability of Gaussian process regression (GPR) models to estimate quality-relevant clinch joint properties is evaluated in comparison with other machine learning algorithms. This answers the question of whether GPR models are in general applicable to achieve high prediction qualities for the given joining use-case (RQ1).
Then, the subsequent section aims to investigate the impact of process uncertainties caused by varying material and friction parameters on important clinch joint characteristics. For this purpose, the selection of suitable distribution functions (e.g., Gaussian or uniformly distributed) combined with an intelligent design of experiment makes it possible to measure the impact of these parameter variations on the resulting joint properties. Through this, it is possible to answer the question of whether meaningful confidence intervals covering 90% of the calculated target variable's probability distributions can be determined (RQ2).
Based on this, the following sections evaluate the opportunity to combine the calculated uncertainties of prediction and joining process into a comprehensive estimation approach and answer the question of whether the resulting prediction confidence intervals are able to cover experimentally generated data (RQ3). In this context, the joining of the material EN AW-6014-T4 with a nominal sheet thickness of 2.0 mm is used as an exemplary joining task for the validation of the demonstrated approach.

Methodical and theoretical background
The following section provides a brief overview of the developed approach and the applied methods. In this regard, the set-up of a numerical clinching process and the fitting of GPR models are described in more detail. Furthermore, an introduction of the used design of experiment and the implemented distribution functions is given. In addition to this, an overview of the considered input parameters is provided.

General approach
As already mentioned, this contribution aims to introduce a novel approach for the robust and data-driven estimation of clinch joint characteristics. In comparison to established approaches (see Fig. 1), the focus is not only on the set-up and training of suitable regression models but rather on the quantification and integration of prediction and process parameter uncertainties in the estimation process. The latter concerns especially factors whose exact values are mostly unknown or can hardly be controlled. The use of a uniformly distributed and space-filling design of experiment (Latin Hypercube Design) combined with Gaussian process regression models offers the opportunity to not only estimate clinch joint properties but also to calculate an individual uncertainty value on the given prediction for each input data point. In this step, only input parameters that are known or can directly be determined by the product designer, such as the die or punch diameter, are taken into consideration while material and friction settings remain constant. Thus, the fitted GPR models represent the basis for the implementation of joining task data and the following estimation of single prediction values for each joint property. A further main part of the developed method is the investigation of varying process and material parameters concerning their impact on the resulting joint geometries and properties. To determine these effects, a constant tool configuration combined with the selection of suitable distribution functions (e.g., Gaussian or uniformly distributed) provides the basis for the set-up of a DoE and the following sampling of several joint geometries. Since this contribution focuses on varying material and process conditions, neither manufacturing tolerances of the tool nor changes during series production, such as tool wear, are taken into consideration.
The subsequent sensitivity and robustness analysis (introduced in Section 5.2) enables the calculation of a property distribution for each clinch joint characteristic and the following definition of statistical intervals (e.g., confidence intervals representing 90% coverage of the generated output data). Then, the combination of the trained GPR models with the individual uncertainty intervals (Section 5.3) paves the way to a more reliable and robust estimation of clinch joint characteristics considering data-driven methods. As an overview, Fig. 1 illustrates the set-up of meaningful regression models for the estimation of clinch joint characteristics combined with the novel approach integrating uncertainties in the estimation process.

Numerical clinching process
Since experimental studies are highly cost- and time-intensive, the generation of a database relies on a validated and parameterized finite element simulation model. The clinch joining process shows highly nonlinear elastic-plastic material behavior including large deformations and severe element distortions. In this regard, the FE software LS-DYNA provides suitable solutions for the creation of an accurate simulation model based on the approach introduced in [22]. According to that, the basic structure contains a 2D-axisymmetric set-up and the following elements: die, punch, blank holder, die-sided sheet, and punch-sided sheet. As an overview, Fig. 2(a) depicts an illustration of the clinch joining process and the investigated quality-relevant geometrical joint characteristics (neck thickness t_NE, interlock t_IL, and bottom thickness t_BT). Additionally, all considered geometrical joining tool parameters (input factors) are presented in more detail.
To ensure a detailed description of the original clinching process, the model mainly includes elastic and elastic-plastic parts. For instance, the segmentation of punch and blank holder into a rigid (020-Rigid) and an elastic (001-Elastic) main part enables the involvement of elastic tool deformations.
In addition, the die is generated as a purely elastic part (001-Elastic) including a fixed bearing of the bottom nodes. While the material HCT590X (E-modulus: 210 GPa) was selected for the tools, both blanks consist of the aluminum magnesium silicon alloy EN AW-6014-T4, which is frequently used for car body and lightweight designs. With a nominal sheet thickness of 2.0 mm (E-modulus: 70 GPa, Poisson's ratio ν: 0.3), this set-up represents a clinch joining process in which the movement of the punch presses the upper material into the lower material by cold forming and thus creates a form- and force-fitting connection between both blanks. In this regard, the implementation of an elastic-plastic material behavior (024-Piecewise linear plasticity) combined with the use of adaptive and automatic remeshing supports the achievement of accurate results and normal simulation terminations. For this purpose, a maximum edge length of 0.05 mm was defined for remeshing the aluminum blanks. Furthermore, the dimensions of the blank holder, the blank holder force (785 N), and the joining velocity (2 mm/s) remain constant. For the calculation of the generated FE simulation models, the LS-DYNA multiphysics solver smp d R910 was chosen. As an overview, Fig. 2(b) depicts more information regarding selected FE model settings.
Besides the geometrical validation in [22], Fig. 3 shows a comparison of simulation and experimental data. Due to the high agreement between the measured geometrical characteristics as well as the results in [22], the simulative representation of the clinch joining process can be assumed to be sufficiently accurate. Furthermore, to provide a fast and consistent realization of numerous simulation runs, the initial FE model includes several parameterized variables. Based on this, an algorithm loads the particular data from a backend (DoE dataset) and sets up a simulation model for each individual clinching configuration [23]. Afterwards, the automatic determination of the joint characteristics (geometrical and strength properties) enables the generation  [24]. In addition to this, the determination of the joint's resistance against shear loading is mainly based on the model set-up described in [25]. In this context, a 3D model is required in order to map non-axisymmetric deformations and the chosen test procedure [26]. Moreover, the transfer of data, such as plastic strains and stresses as well as node coordinates, ensures the involvement of the initial clinch joining operation. In particular, the implementation of the strain values enables the consideration of strain hardening effects. Furthermore, the low test velocity of 10 mm/min avoids strain rate effects. Similar to [25], the model is validated by the maximum transmittable shear force (see Fig. 4). In this context, no material damage or failures are included in the simulation model, which leads to only a slight decrease of the load force in the presented shear simulation curve (red). Based on the results in [25] and the satisfying agreement of the force-displacement curves in Fig. 4, the applied 3D shear simulation models can be assumed to be sufficiently accurate.

Design of experiment
Since the comprehensive analysis of a technical system requires a significant number of samples, the use of a statistical design of experiment offers an efficient representation of parameter spaces by only considering a defined number and distribution of design points. Furthermore, in comparison to experimental studies, the previous set-up of a parameterized simulation model combined with an interface to the generated DoE provides the opportunity to directly address and specify geometrical tool, material, and process parameters. In this context, it is a crucial step to define all investigated parameters in advance considering boundary values and individual distributions. For instance, the uniform and space-filling distribution of individual tool geometry variables enables a comprehensive coverage of the parameter space and thus a meaningful training of Gaussian process regression models for varying tool configurations. In contrast to this, the investigation of material uncertainties often relies on a Gaussian distribution of the particular variables based on expert knowledge or standards, such as the variation of sheet thicknesses in DIN EN 485-4:2019-05.
As a result, a Latin hypercube design was selected, since it supports both Gaussian and uniformly distributed factors and provides a multidimensional, space-filling design of near-random sample points while simultaneously decreasing unwanted spurious correlations between the input parameters [27,28]. Based on this, it is possible to detect multivariate dependencies between the considered input variables and the investigated clinch joint properties. In summary, the generated DoE provides the foundation for the data sampling and regression model training process.
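To illustrate the space-filling property described above, the following minimal sketch draws a Latin hypercube design for three tool parameters with SciPy. The parameter names and bounds are hypothetical placeholders and are not taken from Table 1; the sketch only shows the stratification of each factor and reports the largest pairwise input correlation.

```python
import numpy as np
from scipy.stats import qmc

# Hypothetical bounds (mm) for punch diameter, die diameter, and die depth.
# These are illustrative assumptions, not the values used in the paper.
lower = np.array([4.0, 6.0, 1.0])
upper = np.array([6.0, 9.0, 2.0])
n_samples = 50

sampler = qmc.LatinHypercube(d=3, seed=0)
unit_samples = sampler.random(n=n_samples)        # points in [0, 1)^3
design = qmc.scale(unit_samples, lower, upper)    # map to the parameter bounds

# Latin hypercube property: each of the n equal-width bins of every
# factor contains exactly one design point.
bins = np.floor(unit_samples * n_samples).astype(int)
one_per_bin = all(len(np.unique(bins[:, j])) == n_samples for j in range(3))

# Largest absolute pairwise correlation between the input columns.
corr = np.corrcoef(design, rowvar=False)
max_abs_corr = float(np.max(np.abs(corr - np.eye(3))))

print(one_per_bin, round(max_abs_corr, 3))
```

A plain Latin hypercube does not guarantee low correlations between factors; if the reported value is too large, an optimized or correlation-controlled variant would be needed.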

Gaussian process regression
Gaussian processes offer a wide range of suitable application fields. For instance, besides the opportunity to solve classification problems, the generic supervised learning algorithm can also be used for regression purposes, such as in [29]. In this regard, while simple regression models mainly utilize linear or polynomial representations of the relationship between input and output variables, GPR models are trained in a more subtle way [30]. The Gaussian process allocates a probability distribution to each admissible input parameter configuration by generating a theoretically infinite number of approximation curves. Subsequently, the curve that indicates the best-fitting Gaussian probability distribution for the investigated training data is chosen to predict unseen data of interest. In summary, the algorithm provides significantly more information on the output data since not only a single value but also a measure of the prediction's uncertainty is provided over all permissible input data. However, the selection of a suitable kernel is crucial to influence the behavior of the Gaussian process. Depending on the prediction task and the type of data, several kernels are available to define, for instance, the handling of outliers or the similarity of data points (covariance) [30,31]. For a better understanding, Fig. 5 illustrates an exemplary nonlinear function and the fitted Gaussian process regression combined with the calculated 95% prediction confidence interval over all input data.
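The behavior illustrated in Fig. 5 can be reproduced in principle with a few lines of scikit-learn. The following sketch is an illustration only: the sine test function, the noise level, and the RBF plus white-noise kernel are assumptions chosen for the example and are not the function or kernel used in this contribution.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Synthetic training data from an assumed nonlinear function with noise.
rng = np.random.default_rng(0)
X_train = rng.uniform(0.0, 10.0, size=(30, 1))
y_train = np.sin(X_train).ravel() + rng.normal(0.0, 0.1, size=30)

# Assumed kernel: smooth RBF covariance plus a white-noise term.
kernel = 1.0 * RBF(length_scale=1.0) + WhiteKernel(noise_level=0.01)
gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True, random_state=0)
gpr.fit(X_train, y_train)

# Prediction returns a mean and a standard deviation for every input.
X_new = np.linspace(0.0, 10.0, 200).reshape(-1, 1)
mean, std = gpr.predict(X_new, return_std=True)

# 95% prediction interval under the Gaussian assumption (mean +/- 1.96 std).
lower = mean - 1.96 * std
upper = mean + 1.96 * std
print(gpr.kernel_)
```

The interval widens in regions with few training points, which is the property exploited later for quantifying the prediction uncertainty of each joint characteristic.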

Robust design of clinched joint connection
The subsequent section demonstrates the general approach for the reliable and data-driven estimation of clinch joint characteristics. Therefore, both the utilization of meaningful Gaussian process regression models and the determination of the joint's sensitivity on varying material and process parameters are explained in more detail. In this regard, different clinching tool, material and process parameters, and suitable distribution functions are taken into account.

Uncertainty of regression model
For the determination of the prediction uncertainty associated with the regression model, an automated data sampling process combined with the subsequent training and evaluation of the Gaussian process regression models is carried out. For this purpose, relevant input parameters and investigated factor spaces are defined in advance. Then, the comparison of the model's prediction ability with other machine learning algorithms evaluates the applicability of the system in the field of mechanical joining.

Data sampling and evaluation process
For the accurate prediction of individual clinch joint characteristics, it is crucial to set up a suitable training database in advance. Therefore, only factors that can be directly adjusted or are exactly known by the product designer, such as tool dimensions and the punch movement distance, are used as input parameters. This enables a high applicability of the regression models, since the consideration of unknown or hardly controllable input parameters, such as varying material properties, can lead to cost- and time-intensive experimental studies and unwanted inaccuracies in the estimation. The investigation of tool geometry variations caused by manufacturing tolerances or occurring during series production is not part of this contribution. As a result, it is possible to fit different regression models and to determine their ability to estimate the individual target variables. Thus, the GPR models' prediction performance can be evaluated for individual clinch joint characteristics with varying tool geometries and constant material settings (e.g., no varying sheet thicknesses or friction values). In summary, 13 input parameters, listed in Table 1, and a Latin Hypercube Design provide the basis for the sampling of 650 numerical joining processes. For reasons of time and consistency, an algorithm [23] enables the automated set-up of the individual simulation models and the subsequent determination of all required joint properties. For the latter, only joints that reached a minimum interlock (> 0.15 mm) and neck (> 0.15 mm) thickness were taken into consideration.
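The filtering step at the end of the sampling process can be sketched as follows. The result values and the array layout are hypothetical; only the two thresholds of 0.15 mm are taken from the text.

```python
import numpy as np

# Hypothetical simulation results: one row per joint,
# columns = interlock thickness and neck thickness (mm).
results = np.array([
    [0.32, 0.41],
    [0.12, 0.38],   # interlock below the threshold
    [0.27, 0.10],   # neck thickness below the threshold
    [0.16, 0.16],
])

MIN_INTERLOCK = 0.15  # mm, threshold stated in the text
MIN_NECK = 0.15       # mm, threshold stated in the text

# Keep only joints that exceed both minimum thicknesses.
valid = (results[:, 0] > MIN_INTERLOCK) & (results[:, 1] > MIN_NECK)
training_rows = results[valid]

print(valid.tolist())  # [True, False, False, True]
```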

Set-up of Gaussian process regression model
To investigate whether it is possible to train meaningful GPR models for the estimation of each joint property, the use of a performance score (Coefficient of Prognosis CoP [32]) is considered in order to demonstrate the individual prediction capability. Therefore, the training dataset contains 80% of the input data and the remaining 20% are used for the subsequent testing of the model's performance. Additionally, to reduce the influence of overfitting models or individual data points, the validation is carried out for 10 different data configurations. Furthermore, the inclusion of simple linear and polynomial regression algorithms as well as the application of artificial neural networks enable a comparison of the GPR's performance to estimate clinch joint characteristics for the given input data. Based on this, Fig. 6 depicts an overview of the individual model performances.
In order to provide a high reliability of the predicted joint properties, a minimum CoP value of 0.8 is required. One can see that the trained GPR models achieved a satisfying estimation performance compared to the other algorithms. In particular, the geometrical joint characteristics indicate almost no impact of different training-test configurations on the prediction performance. Only for the estimation of the shear force did the linear and polynomial (2nd degree) algorithms achieve slightly better performances. However, the ability of Gaussian process regressions to estimate clinch joint characteristics can be confirmed in general. Additionally, the opportunity to allocate a probability distribution to each admissible input parameter configuration guarantees a high applicability of this supervised machine learning method for the scope of this contribution.
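The evaluation procedure (80/20 split, 10 different data configurations, comparison of algorithms) can be sketched as follows. The sketch assumes that the CoP is evaluated as 1 − SSE/SST on held-out data, which is a common reading of the score in [32]; the synthetic data, the kernel, and the restriction to a linear model and a GPR model are illustrative assumptions.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import ShuffleSplit


def cop(y_true, y_pred):
    """Assumed CoP definition: 1 - SSE / SST on held-out data."""
    sse = np.sum((y_true - y_pred) ** 2)
    sst = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - sse / sst


# Synthetic stand-in for the simulation database (3 inputs, 1 target).
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(200, 3))
y = X[:, 0] ** 2 + 0.5 * X[:, 1] + rng.normal(0.0, 0.02, size=200)

# Factories so that every split trains a fresh model.
models = {
    "linear": lambda: LinearRegression(),
    "gpr": lambda: GaussianProcessRegressor(
        kernel=RBF(length_scale=[1.0, 1.0, 1.0]) + WhiteKernel(noise_level=1e-3),
        normalize_y=True,
    ),
}

# 10 random configurations with 80% training and 20% test data.
splits = ShuffleSplit(n_splits=10, test_size=0.2, random_state=0)
scores = {name: [] for name in models}
for train_idx, test_idx in splits.split(X):
    for name, make_model in models.items():
        model = make_model().fit(X[train_idx], y[train_idx])
        scores[name].append(cop(y[test_idx], model.predict(X[test_idx])))

mean_cop = {name: float(np.mean(vals)) for name, vals in scores.items()}
print(mean_cop)
```

On this synthetic target, the quadratic term cannot be captured by the linear model, so its score stays clearly below the 0.8 threshold while the GPR score does not; the real ranking of course depends on the joint property, as Fig. 6 shows for the shear force.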

Analysis of varying material and process parameters
The following section aims to investigate the influence of uncertainties within the joining process chain on the resulting clinch joint connection. For this purpose, the selection of different process and material parameters as well as the set-up of a meaningful design of experiment is explained in more detail. Subsequent to the sampling and evaluation of the generated joint geometries, an estimation can be given about the permissible distribution ranges of the particular target variables and about which input parameters have the highest impact on them.

Data sampling process
Similar to Section 5.1.1, the set-up of a design of experiment requires the previous selection of input parameters combined with the definition of suitable boundary values. Here, experimental studies as well as existing tolerance standards offer the opportunity to capture permissible parameter spaces. However, since not all material and process factors are backed by reliable manufacturing knowledge, the sufficient implementation of individual uncertainties into the estimation process is a crucial step. For instance, while the variation range of sheet thicknesses is precisely described in tolerance specifications, such as in DIN EN ISO 9445, the friction values mainly rely on the respective state of the lubrication or the available surface conditions of sheet and tool. Thus, to generate a meaningful design of experiment, it is important to assign suitable distribution functions to the investigated input parameters. Besides the uniform distribution of data, this also includes the consideration of Gaussian distributed values. For instance, the sheet thickness or ultimate tensile strength of the blanks can vary between defined tolerance specifications whereby each parameter is Gaussian distributed around a mean value. In this regard, Table 2 depicts the individual factors and the related boundary values. For the latter, the minimal and maximal values are defined by truncating the probability distribution so that it covers 99.7% of the data (three standard deviations around the mean).
In contrast to this, the definition of friction limit values for each pair of contacting parts, such as between the punch and the die-sided sheet, relies mainly on expert knowledge and experimental studies. In particular, the initial state of the surfaces or the lubrication highly affects the resulting frictional contact mechanics. Thus, to cover a wide range of initial tool and sheet conditions, the distribution of the friction values, listed in Table 3, is assumed to be uniform over the investigated parameter space. Similar to the previous section, an algorithm subsequently creates the individual simulation models and automatically starts the sampling process. Then, based on the determination and evaluation of the individual joint characteristics, 320 samples provide the database for the following calculation of sensitivity indices and the definition of distribution ranges and confidence intervals.
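One way to realize such a mixed design is to draw a Latin hypercube in the unit cube and map each column through the inverse cumulative distribution function of its factor. The following sketch shows this for one truncated Gaussian factor and one uniform factor; the limit values are hypothetical and do not reproduce Tables 2 and 3, and the inverse-CDF mapping is an assumption about the implementation.

```python
import numpy as np
from scipy.stats import qmc, truncnorm, uniform

# Hypothetical limits; the values of Tables 2 and 3 are not reproduced here.
t_low, t_high = 1.92, 2.08      # sheet thickness limits (mm)
mu_low, mu_high = 0.05, 0.30    # friction coefficient limits

# Gaussian factor: limits placed at mean +/- 3 standard deviations,
# distribution truncated at these limits.
t_mean = 0.5 * (t_low + t_high)
t_sigma = (t_high - t_low) / 6.0
thickness_dist = truncnorm(-3.0, 3.0, loc=t_mean, scale=t_sigma)

# Uniform factor over the full friction range.
friction_dist = uniform(loc=mu_low, scale=mu_high - mu_low)

# Latin hypercube in the unit square, then inverse-CDF mapping per column.
u = qmc.LatinHypercube(d=2, seed=0).random(n=320)
thickness = thickness_dist.ppf(u[:, 0])
friction = friction_dist.ppf(u[:, 1])
design = np.column_stack([thickness, friction])

print(design.shape, round(float(thickness.mean()), 4))
```

Because the stratification happens in probability space, the Gaussian factor is sampled more densely around its mean while the uniform factor is covered evenly.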

Sensitivity indices of varying input parameters
Given the challenges of generating a reliable clinch joint connection, variance-based sensitivity indices, as defined by Saltelli [33] and Sobol [34], offer an efficient way to quantify the relationships between the varying material and friction parameters and the resulting output factors. This provides a deeper understanding of which uncertain factors are most relevant for achieving high-quality clinch joint connections and therefore have to be analyzed in more detail. Based on the generated results, Fig. 7 illustrates the individual sensitivity values (first-order indices) for each target variable.
One can see that especially the variation of the sheet thicknesses (t_I and t_II) changes the geometrical joint characteristics, such as neck, interlock, and bottom thickness. This effect is mainly caused by the different amount of material available for the formation of these characteristics. For instance, a larger total sheet thickness leads to a relatively smaller penetration of the punch into the material and thus to a higher bottom thickness of the resulting joint. Focusing on the shear force, one can see that the friction between the blanks (μ_II) and the initial ultimate tensile strength of the punch-sided material (TS_I) are relevant for the formation of the target variable. In particular, variations of the latter can decrease the maximum tolerable stress and thus lower the resistance against shear loading. In comparison, variations in the friction values μ_I (punch and punch-sided material) and μ_IV (blank holder and punch-sided material) as well as in the initial ultimate tensile strength of the die-sided material (TS_II) have nearly no impact on the investigated clinch joint characteristics.
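To make the meaning of a first-order index concrete, the following sketch estimates the indices with a pick-freeze Monte Carlo scheme for variance-based sensitivity analysis. It is not the implementation used in the study: `toy_model` is a hypothetical linear stand-in for the simulated joint characteristic, the inputs are assumed to be independent and uniform on [0, 1], and the sample size is chosen freely because the toy model is cheap to evaluate.

```python
import numpy as np

def first_order_sobol(model, n_inputs, n_base, rng):
    """Estimate first-order Sobol indices with the pick-freeze scheme.

    `model` maps an (n, n_inputs) array of inputs in [0, 1] to n outputs.
    """
    a = rng.random((n_base, n_inputs))
    b = rng.random((n_base, n_inputs))
    f_a, f_b = model(a), model(b)
    # Centering the outputs leaves the variances unchanged but reduces estimator noise.
    offset = np.mean(np.concatenate([f_a, f_b]))
    f_a, f_b = f_a - offset, f_b - offset
    total_variance = np.var(np.concatenate([f_a, f_b]))

    indices = np.empty(n_inputs)
    for i in range(n_inputs):
        a_bi = a.copy()
        a_bi[:, i] = b[:, i]  # matrix A with column i taken from B
        f_abi = model(a_bi) - offset
        indices[i] = np.mean(f_b * (f_abi - f_a)) / total_variance
    return indices

def toy_model(x):
    # Hypothetical stand-in for a simulated joint characteristic.
    return 2.0 * x[:, 0] + 1.0 * x[:, 1] + 0.0 * x[:, 2]

rng = np.random.default_rng(seed=1)
sobol = first_order_sobol(toy_model, n_inputs=3, n_base=100_000, rng=rng)
print(np.round(sobol, 2))
```

For this linear toy model the analytical indices are 0.8, 0.2, and 0, so the third input plays the role of a parameter with nearly no impact, comparable to μ_I or TS_II in the discussion above.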

Robustness analysis and definition of uncertainty ranges
In the scope of this contribution, the robustness analysis extends the previous calculation of sensitivity indices. The aim is to gain a deeper understanding of the frequency distribution of the individual joint characteristics caused by varying material and friction conditions. Subsequently, threshold values at 1.64 standard deviations from the mean (x̄ − 1.64σ and x̄ + 1.64σ) define a confidence interval that covers 90% of the generated data. For this purpose, a Shapiro-Wilk test is applied in advance to check whether the given set of data can be treated as Gaussian distributed. Fig. 8 provides an overview of the resulting clinch joint property distributions. Since the calculated p values are greater than the chosen alpha level of 0.05, the hypothesis of normality is not rejected and the distributions can be assumed to be Gaussian. Furthermore, one can see that the interval covering 90% of the data spans a considerable range for all joint characteristics. In particular, the resulting values of shear force and bottom thickness depend strongly on the varying material and friction conditions. Thus, the relevance of investigating and implementing joining uncertainties in the estimation process of clinch joint properties can be confirmed. Based on these results, the defined confidence intervals are combined with the individual prediction uncertainties of the GPR models and form the framework for the robust and reliable estimation of clinch joint properties. However, the calculated frequency distributions are only representative of the considered parameter spaces and the defined designs of experiments and can vary for changing input parameter settings.
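The two steps of the robustness analysis, the normality check and the definition of the 90% interval, can be sketched as follows. The shear force values are synthetic placeholders drawn from a Gaussian distribution, not results of the study; the alpha level of 0.05 and the factor 1.64 are taken from the text, and `scipy.stats.shapiro` is used as one available implementation of the Shapiro-Wilk test.

```python
import numpy as np
from scipy import stats

ALPHA = 0.05   # significance level named in the text
Z_90 = 1.64    # multiplier used in the text for 90% coverage

def normality_and_interval(values, alpha=ALPHA, z=Z_90):
    """Return the Shapiro-Wilk p value, the normality decision, and mean +/- z * sigma."""
    _, p_value = stats.shapiro(values)
    mean, sigma = np.mean(values), np.std(values, ddof=1)
    return p_value, p_value > alpha, (mean - z * sigma, mean + z * sigma)

# Hypothetical stand-in for 320 simulated shear forces in N.
rng = np.random.default_rng(seed=2)
shear_force = rng.normal(loc=3000.0, scale=150.0, size=320)

p_value, looks_gaussian, (lower, upper) = normality_and_interval(shear_force)
coverage = np.mean((shear_force >= lower) & (shear_force <= upper))
print(f"p = {p_value:.3f}, Gaussian assumption kept: {looks_gaussian}")
print(f"90% interval: [{lower:.0f}, {upper:.0f}] N, empirical coverage {coverage:.2f}")
```

A p value above the alpha level only means that normality is not rejected. If the test rejects it for a target variable, empirical percentiles (for example the 5th and 95th) would be a more defensible way to define the interval than the factor 1.64.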

Use-case: clinching of similar materials
To evaluate the ability of the novel approach to predict clinch joint properties in a robust and data-driven manner, the joining of the aluminum alloy EN AW-6014-T4 (t_I = 2.0 mm; t_II = 2.0 mm) is used as an exemplary use-case scenario. First, the GPR model predicts the individual joint characteristics for a selected constant tool configuration. Then, based on the calculated uncertainty of the regression model in combination with the individual threshold values of the varying joint characteristics, the predicted numerical values are extended by the respective prediction confidence interval. To evaluate the applicability of the approach, Fig. 9 gives an overview of the estimated clinch joint properties and the calculated 90% confidence interval. Additionally, an experimental study (n = 5) with the same tool configuration, followed by the measurement of the quality-relevant joint characteristics, was carried out. As a result, one can see that the estimated results agree very well with the experimentally determined data, since none of the measured values lies outside the defined prediction confidence intervals. Thus, it can be assumed that the introduced method is suitable for a robust and reliable prediction of selected clinching target variables under conditions of uncertainty.
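The text does not state how the model uncertainty and the process-induced variation are merged into one interval. The sketch below therefore assumes one common rule, namely that both contributions are independent and their standard deviations add in quadrature. This rule, the function names, and all numerical values are assumptions for illustration only and do not reproduce the data of Fig. 9.

```python
import math

Z_90 = 1.64  # multiplier used in the text for 90% coverage

def combined_interval(prediction, sigma_model, sigma_process, z=Z_90):
    """Combine two independent standard deviations in quadrature (assumed rule)."""
    sigma_total = math.sqrt(sigma_model**2 + sigma_process**2)
    return prediction - z * sigma_total, prediction + z * sigma_total

def all_inside(measurements, interval):
    """Check whether every measured value lies within the interval."""
    lower, upper = interval
    return all(lower <= m <= upper for m in measurements)

# Hypothetical numbers for one target variable (neck thickness in mm).
interval = combined_interval(prediction=0.50, sigma_model=0.03, sigma_process=0.04)
measured = [0.47, 0.52, 0.49, 0.55, 0.51]  # placeholder for n = 5 experiments
print(interval, all_inside(measured, interval))
```

A more conservative alternative would add the two half-widths linearly, which yields a wider interval whenever both contributions are non-zero.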

Discussion
Reflecting on RQ1, Fig. 6 shows that the Gaussian process regression is well suited to estimating the investigated clinch joint properties when 13 input parameters are taken into account. Especially for the prediction of geometrical characteristics, hardly any performance differences were observed in comparison to the linear or polynomial regression model. In this regard, the key benefit of the GPR is that the fitted model provides, besides a single numerical estimate, a reliable evaluation of its own prediction uncertainty for each input data point. However, since the performance of a GPR model mainly relies on the selected kernel, it is recommended to consider several covariance functions and to compare their generalization properties. Moreover, Gaussian process regression is a non-parametric method. Thus, the algorithm requires the entire training data for each prediction, which can lead to higher computational costs compared to parametric approaches, such as linear or polynomial regression.
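The points made above (a predictive uncertainty for every query point, the dependence on the kernel, and the need for the full training set at prediction time) can be illustrated with a minimal zero-mean Gaussian process written in NumPy. This is a sketch under simplifying assumptions and not the model of the study: the data are one-dimensional and synthetic, the hyperparameters (`length_scale`, `noise`, unit prior variance) are fixed by hand instead of being optimized, and the two kernels are merely examples of covariance functions that could be compared.

```python
import numpy as np

def rbf(a, b, length_scale):
    """Squared-exponential covariance with unit prior variance."""
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * d2 / length_scale**2)

def matern32(a, b, length_scale):
    """Matern 3/2 covariance with unit prior variance."""
    d = np.sqrt(np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1))
    s = np.sqrt(3.0) * d / length_scale
    return (1.0 + s) * np.exp(-s)

def gpr_predict(kernel, x_train, y_train, x_test, length_scale=1.0, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP."""
    k_tt = kernel(x_train, x_train, length_scale) + noise * np.eye(len(x_train))
    k_ts = kernel(x_train, x_test, length_scale)   # needs all training inputs
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_ts.T @ alpha
    v = np.linalg.solve(chol, k_ts)
    var = 1.0 - np.sum(v**2, axis=0)               # prior variance is 1
    return mean, np.sqrt(np.maximum(var, 0.0))

# Hypothetical one-dimensional data; the study uses 13 inputs.
x_train = np.linspace(0.0, 5.0, 12)[:, None]
y_train = np.sin(x_train[:, 0])
x_test = np.array([[2.5], [50.0]])  # inside and far outside the training range

for name, kernel in {"RBF": rbf, "Matern 3/2": matern32}.items():
    mean, std = gpr_predict(kernel, x_train, y_train, x_test)
    print(name, np.round(mean, 3), np.round(std, 3))
```

The predicted standard deviation is small inside the training range and returns to the prior value far away from it, which is the behavior that makes the GPR uncertainty useful as a warning against extrapolation. The kernel matrix grows with the square of the number of training points, which is the source of the computational cost mentioned above.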
Referring to RQ2, the probability distributions of all investigated clinch joint properties are depicted in Fig. 8. The additional Shapiro-Wilk test supported the assumption of Gaussian distributed output values and thus enabled the calculation of a 90% confidence interval (90% of the values lie within 1.64 standard deviations of the mean). In this context, the additional determination of sensitivity indices showed a high impact of the sheet thicknesses (die- and punch-sided), especially on the formation of the geometrical joint characteristics. Moreover, the friction value between the blanks and the ultimate tensile strength of the punch-sided material influence the joint's resistance against shear loading. In summary, Section 5.2.3 confirmed that the uncertainty in a clinching process caused by varying material and friction values can be calculated. In addition, the determined sensitivity indices provide a deeper understanding of the relevance of the input parameters for the resulting joint properties. However, the generated probability distributions are entirely based on the previous design of experiments, which considers wide ranges of varying material and friction factors. Thus, to obtain a more realistic representation of an actual clinch joining process, it is recommended to choose appropriate boundary values for the permissible parameter variations. For instance, a more precise understanding of the existing friction conditions in combination with the applied manufacturing tolerances of the sheets can already lead to an adjustment of the confidence interval and thus to a more accurate and specific estimation of the individual joint characteristics. In this regard, the involvement of cross-domain knowledge, such as described in [35], can pave the way to a better understanding of the clinching process.
Fig. 9 Evaluation of the predicted joint characteristics and confidence intervals considering experimental data
Furthermore, due to limited time and computational resources, only a limited amount of data was considered for the calculation of the probability distributions. Thus, to increase the accuracy of the defined confidence intervals, especially in the boundary areas, it is recommended to compare the generated results with a significantly larger amount of data. Additionally, the transferability of the defined confidence intervals to other joining tasks has to be evaluated, since different materials or sheet thicknesses, for instance, can lead to a changing material behavior (e.g., strain hardening effects) during the joining process and consequently to changing probability distributions of the target variables.
Turning to RQ3, Section 5.3 introduced an exemplary use-case scenario (joining of a similar material and sheet thickness configuration) to evaluate the robust and reliable estimation of a clinch joint connection using data-driven methods. Here, the combination of the calculated uncertainty ranges caused by the material and friction variations with the individually observed uncertainty of the GPR model yields a comprehensive confidence interval. Thus, the product designer is offered not only a single predicted value for each target variable but also an associated range in which this value can potentially vary. In general, the applicability of the introduced design method was confirmed by validating the results with experimental data.

Conclusion and outlook
Given strict requirements on safety-relevant components or products, such as in the automotive industry, the data-driven design and dimensioning of clinch joint connections requires a high reliability of the estimated individual joint properties. In this context, the presented contribution demonstrates a methodical approach for quantifying and including uncertainties in the estimation process that are caused by the prediction model and by varying material as well as process parameters. Here, Gaussian process regression models showed a high ability to predict individual quality-relevant joint characteristics (neck, interlock, and bottom thickness as well as the maximum transmittable shear load). Combining these results with the determined confidence intervals of each target variable, based on the calculated impact of varying material and process parameters, provides estimation ranges instead of only a single predicted value. Through this, product engineers gain a deeper understanding of achievable best- and worst-case scenarios for a given joining task, which paves the way to a more robust design and dimensioning of clinch joint connections based on data-driven methods. In this regard, a high applicability of the introduced methods was shown for an exemplary joining use-case with experimental data. In summary, the investigations in this contribution identified the following results:
• It is possible to fit Gaussian process regression models for the prediction of individual clinch joint characteristics taking 13 input parameters into account. In particular, the ability to additionally provide an estimate of their own prediction uncertainty for each input data point is a key benefit in comparison to simple regression algorithms, such as linear or polynomial models.
• Furthermore, the investigation of uncertain factors within the clinch joining process indicated a high dependency of the resulting target variables on these parameters. In this context, a deeper process understanding (e.g., exact information about the applied manufacturing tolerances) for the given joining task can lead to a more accurate and specific definition of the particular uncertainty ranges. As a result, the calculated confidence intervals represent 90% of the clinch joint property distributions caused by varying material and process parameters.
• Finally, the introduced method showed high potential for the robust and reliable design of clinch joint connections. This was demonstrated by analyzing an exemplary joining task and the subsequent experimental evaluation of the results.
Outlook: Given the current developments in the field of lightweight designs, it is recommended to also involve multi-material joints and dissimilar sheet thickness combinations in the estimation process. Through this, it will be possible to apply the introduced method in more complex and versatile process chains. Moreover, since this contribution only investigated the impact of varying material and friction conditions, the additional investigation of changing tool settings caused by manufacturing tolerances or wear can pave the way to a reliable and robust design of clinch joint connections in series production.
Funding Open Access funding enabled and organized by Projekt DEAL. This work was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation), TRR 285 B05, Project-ID 418701707. We also acknowledge the collaboration within the TRR 285, especially with A01.

Data availability Not applicable.
Code availability Not applicable.

Conflict of interest The authors declare no competing interests.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.