1 Introduction

Self-piercing riveting is a highly productive and efficient process for joining parts in the metalworking industry (especially in the automotive industry and aircraft construction). It is possible to join sheets of the same or different materials (e.g., aluminum and steel), and connections involving non-weldable materials can be made in a single operation. Compared to welding, this technology joins parts quickly and with low energy consumption and does not introduce thermal stresses or deformations into the material. The objectives are high productivity of the joining process and the best possible fulfillment of the required quality specifications. However, the numerous influencing variables, the high demands on the quality parameters of the manufacturing result, and the consistently complex nonlinear dynamic relationships pose a challenge to modeling as the basis of an optimized process control. For process optimization, artificial intelligence (AI)-oriented modeling methods are an efficient approach. The underlying data can be taken from real experiments or computer simulations. Both types have specific advantages and disadvantages but can form the basis for the fuzzy pattern classification (FPC) with local models described here.

1.1 Process description

Self-pierce riveting with a semi-tubular rivet (SPR) is the most widely used mechanical joining technology for car bodies when combining aluminum and steel. The joining method can be divided into three steps, shown in Fig. 1. In the first step, the rivet and the sheets are positioned between the punch, blank holder, and die (a). When the punch presses the rivet into the punch-sided sheet, the rivet pierces a slug out of the material, which remains inside the die cavity (b). The contour of the die forces the rivet to expand, and an interlock is created (c) [1]. SPR joints are evaluated by certain geometrical criteria (Fig. 1d). These criteria correlate with the strength properties of the joint and must reach specific values for the joint to be considered OK [2].

Fig. 1
figure 1

Self-pierce riveting: a–c process steps, d characteristic values [1]

1.2 System analytical view

In the case of self-piercing riveting, the process results (outputs) to be ensured are, for example, interlock, minimum thickness of the die-sided part, rivet foot diameter, rivet head position, and maximum joining force [2]. In addition to the specified and thus known influencing variables (such as the materials of the sheets to be joined), these quality values depend on influencing variables (inputs) that determine the actual riveting process. These include sheet thickness, flow curve, and rivet and die geometry [2]. The modeling of this multi–input–multi–output (MIMO) problem is thus characterized by a relatively large number of input and output variables (Fig. 2).

Fig. 2
figure 2

Multi–input–multi–output problem

For reasons of transparency and feasibility, modeling is done with a multi-input–single-output (MISO) structure according to Fig. 3.

Fig. 3
figure 3

Modeling the multi–input–multi–output problem through a multi-input–single-output structure

For an optimal process flow, the results (outputs) for given inputs are decisive. Models of different types are used to describe this relationship. Roughly, a distinction is made between experimental, theoretical, and expert-based approaches, whereby recourse to expert knowledge is usually not possible in the case of complex interrelationships (as in the present case). A purely experiment-based, i.e., data-based, approach is only feasible to a limited extent. Reasons are the time and financial effort as well as the need for trained personnel. In addition, the results of the experiments (outputs) can often only be obtained by extensive destructive testing (as, e.g., in welding or coating). Thus, targeted experiments can be realized for the MIMO problem only for relatively few selected material combinations and process settings. The concern is now to incorporate the relatively few assured data (small-data problem), given the complexity, into a suitable approximating model whose generalization should cover the entire control range. In Fig. 4, this is denoted by ModelExp. The model-building methodology should therefore perform this generalization. In fuzzy artificial intelligence methods, this is achieved by a "fuzzification" of the single data, according to which the available secured information is taken as representative of a range.

Fig. 4
figure 4

Parallel paths of the real experiments (with ModelExp) and the computer simulation (ModelSim)

The high-dimensional problem is modeled by a multi-input–single-output (MISO) structure (see Fig. 3). Since the purely experimental approach is only viable to a limited extent, especially in the context of practical applications, theoretical model building is an alternative. It allows a more efficient computer-aided approach (see Fig. 4). Here, explicit physical laws in the form of mathematical relations (systems of differential equations) or their implicit form as computer-aided simulation (e.g., FEM methods) form the basis. The advantage is that the simulation models allow experimentation over wide setting ranges of the variables of interest. Due to the programmable automatic variation and efficient computer-aided processing of the influencing variables within the control ranges of interest, a large number of settings can be generated and tested via this simulation path. With the more extensive database, a big-data task is present that is also amenable to AI modeling. However, on the one hand, a simulation model also cannot capture all properties of reality (loss of information). Reasons are general limits of the underlying correlations, limitations of validity ranges (e.g., linearized or simplified nonlinear approaches), limitations of the number of considered influencing variables, or time variance. For example, a 2D simulation with Simufact V15 [30] is used to simulate self-pierce riveting in [3]. On the other hand, the models may show an intrinsic behavior that does not correspond to reality. Moreover, trained personnel are required for the simulation procedure and the handling of the simulation system [4, 5].

The data generated in this way can now form the basis of an AI model design (ModelSim in Fig. 4) in a similar way as for the experiments (database DB1 in our case). Compared to the numerical computer simulation itself, such an AI model is more flexible and less expensive, can be applied close to the process or integrated into it, and is therefore more practical. If the same modeling method, here local fuzzy modeling (see Sect. 4), is used, the two models determined in parallel for the experiments and the simulation can be combined. In this case, the ranges in ModelSim, which are varied within wide limits, are supplemented and qualified by the experimentally determined results, which are limited in their scope. This path of using data from both approaches is followed in this paper. Another possibility would be the adaptation of the fuzzy structure. In the modeling of the experimental and the simulation results, different types of uncertainty are present. Uncertainties follow from (a) unavoidable stochasticity in the actual measurements due to the used sensors and measuring chains or (b) the limitations of the physical models (2D approaches for 3D problems, linearizations, etc.). Incompleteness of the applied models results from the number and importance of the concretely chosen input and output variables, thus the necessary neglect of further influencing variables, and from the chosen quantization of the settings for the real and computer investigations. Inconsistency becomes apparent when the simulation model delivers results for certain settings that do not correspond to reality. This is the case in the present problem and is evident in the data set DB2, the numerically calculated database. The numerical database, which is based on a statistical design, includes a series of simulation results that are not feasible from a technological point of view. Therefore, this database has been filtered and improved to form DB3, which is presented in Sect. 5.5. Because of these characteristics, modeling of both ModelExp and ModelSim with fuzzy methods, especially fuzzy pattern classification, is appropriate.

2 Problem statement

To produce reliable and robust SPR joints, different users of the technique have defined different joint quality criteria. The criteria in the SPR process are interlock, minimum thickness of the die-sided part, rivet foot diameter, rivet head position, and maximum joining force, which depend on the concrete setup of the process parameters such as material properties, rivet properties, and die geometry (see Fig. 1) [2, 6–8]. Among these criteria, three prominent aspects need to be considered: the rivet head position, the interlock distance, and the minimum remaining bottom material thickness. The interlock distance is an essential quality criterion because it determines the locking strength between the bottom sheet and the rivet. Although the minimum thickness of the die-sided part does not significantly influence the joint strength, it is essential for noise, vibration, harshness, and corrosion. Another essential quality criterion is the rivet head height, which influences the joint strength, the tightness of the joints, and the gaps between the top sheet and the rivet head [9]. The process parameters influence the joint quality and strength; adjusting these parameters, for example selecting the correct parameters for different material stacks, is the main challenge for SPR applications [10]. One of the main parameters that has to be correctly adjusted is the setting force. In order to meet the joint quality standards for a specific material combination, only a specific range of setting forces can be used. Different setting forces produce joints with different qualities and thus different mechanical performances, depending on the material stack and the joint features. For example, a low setting force may lead to low strength caused by a short interlock distance, and a high setting force may lead to a small remaining bottom material thickness, damage, and reduced strength of the top sheet [6, 9]. A further challenge is the friction force between the sheet material and the rivet. The friction force is impossible to measure because the friction occurs locally, and the friction coefficients depend on the geometry, surface texture, surface roughness, pressure, and relative speed [9, 11]. For an optimal process flow, various types of models are used to describe these relationships. One approach is physical and theoretical modeling. However, this approach is not suitable for SPR process optimization because SPR is a complex system with nonlinear behavior, and conventional physical and theoretical modeling yields an imprecise model due to the limited information and lack of precise knowledge about the system. Many of the phenomena in the SPR process are highly complex and interact with so many factors that high process performance cannot be achieved with closed mathematical relationships alone. Simplifying the model forces us to accept a certain amount of imprecision and uncertainty, so that such mathematical models cannot achieve acceptable results for the dynamic behavior of the system. This uncertainty is caused, for example, by the local linearization of nonlinear systems, because the linear model is convenient for parameter estimation [12]. Process parameter variations then cause the model to fail and produce false alarms.

In such situations, a data-driven black-box nonlinear model provides a reasonable approximation of the nonlinear system. These approaches create models based on measured input and output data of the process and require little or no physical or formal information. From the literature, it can be observed that black-box models such as neural networks and fuzzy logic-based models are widely used to build models of manufacturing processes from measured input/output data [2, 11, 13]. These approaches perform better than statistical models [14] and closed mathematical equations. However, artificial neural networks and fuzzy logic have limitations. The main difficulty of neural network models is that they can only work with numerical information and require large amounts of data. Consequently, a neural network cannot be built from a knowledge base reflecting human or engineering expertise, and it struggles with small data. In fuzzy logic, the difficulties lie in constructing the shape of the membership functions and the fuzzy rules, which are usually determined by trial and error by the operator. This problem becomes essential when fuzzy logic is applied to a complex system [15].

A well-established approach for the analysis and control of complex nonlinear systems is the local model network. In order to cover the entire operating range, the manipulated variable is determined by interpolation from the individual, locally valid manipulated variables; these models interpolate between different local models [15]. Another well-known model architecture is the Takagi–Sugeno fuzzy model [16]. The combination of local models and Takagi–Sugeno fuzzy models is advantageous because the parameters of the local models remain separately interpretable and the boundaries between neighboring local models in the fuzzy system are soft [12, 17, 18].

The present work aims to model the nonlinear dynamic multi-variable SPR process based on Takagi–Sugeno fuzzy models with potential functions. The focus is on the targeted partitioning of the input variable range, guided by a standard evaluation of the range of validity and the similarity of the local linear models.

3 Local fuzzy pattern modeling methodology

For complex technological processes in manufacturing engineering (e.g., welding, riveting, forming, coating), modeling nonlinear dynamic relationships with a larger number of input and output variables is required. A decomposition can be done by the p-canonical parallel structure according to Fig. 3. For each MISO path, a closed-form nonlinear parametric modeling approach can traditionally be chosen (e.g., regression) that describes the interrelationships of all input variables with one output variable each. For this purpose, all model parameters have to be estimated in such a way that an error measure between all model values and the given real measured values is minimized in the entire modeling domain. The disadvantages of such high-dimensional nonlinear model approaches are, for example:

  • The requirement of a sufficiently large amount of data for parameter estimation in the high-dimensional space (where the experimental determination of the real measured values is often time-consuming and costly)

  • The high sensitivity of the numerical parameter estimation for the nonlinear approach, which is why often only low-order approaches are chosen

  • The impossibility of selective model adaptation, which is useful, e.g., for local model qualification in a sensitive control design

The methodology of local modeling, which has been used in control engineering for quite a long time [19,20,21,22], avoids these disadvantages. To do so, a set of local (preferably linear) models leads to a highly flexible nonlinear overall structure. For the application described here, it yields average mean error (ME) reductions of 23.8% for all outputs of data set DB1 and 12.5% for data set DB3 (see Sect. 6). Especially for the local fuzzy models, the membership functions express the validity of the local models and lead to continuous transitions between the local domains. In [12, 29], the decomposition into the submodels and the fuzzy description are described in more detail.

3.1 Fuzzy pattern classification (FPC)

Fuzzy pattern classification (FPC) is a fundamental model-building method. FPC is suitable for describing complex relationships based on data or expert knowledge. The theoretical basis is fuzzy set theory with the key concept of membership [23]. Membership is a criteria-related measure, which can be seen as a formal-mathematical membership of an object to a set or as an application-related membership of a product or process state to a class with regard to content (quality class, state class, control class, etc.). The formal description of the classes is done by membership functions (MF), which can be defined in one- or multidimensional information spaces (characteristic, quality, or state space). For practicality, parametric approaches (triangular, trapezoidal, sigmoidal functions) are advantageous and common for MFs; in many applications, a potential function has proven useful. FPC is a starting point for describing complex technical and nontechnical systems and for supporting control or decision-making. Fuzzy pattern classification is an approach for describing classification systems of observations. The measured variables or characteristics are combined into feature vectors in a one- or multidimensional Euclidean feature space and described by fuzzy membership functions [24]. In contrast to other approaches, fuzzy pattern classification works in parallel rather than sequentially. This approach also allows modeling the interdependencies of variables by rotating the multidimensional membership system as a whole in the coordinate system. One classifier can comprise several patterns, which are all semantically interpretable. Fuzzy pattern classifiers can be modeled knowledge-based or data-driven, or as a combination of both [25].

3.1.1 One-dimensional membership functions (MF)

According to fuzzy set theory, each class is defined by an MF in the information space. We use parametric functions of the generalized Aizerman potential function type. Figure 5 shows two one-dimensional asymmetric functions; Eq. (1) defines the function for the variable \(u\).

Fig. 5
figure 5

Two one-dimensional asymmetric membership functions with different parameters

$$\mu \left(u\right)=\left\{\begin{array}{c}\frac{a}{1+\left(\frac{1}{{b}_{l}}-1\right){\left(\frac{r-u}{{c}_{l}}\right)}^{{d}_{l}}} \quad\quad u< r\\ \frac{a}{1+\left(\frac{1}{{b}_{r}}-1\right){\left(\frac{u-r}{{c}_{r}}\right)}^{{d}_{r}}} \quad\quad u\ge r\end{array}\right.$$
(1)

This function is based on a set of eight parameters. The parameter \(r\) denotes the representative position of the MF, which can be determined in various ways, for example, as the center of gravity of the objects constituting the class or as a reference point. The maximum value of the membership parameter \(a\) of this unimodal MF is assigned to \(r\). The membership parameter \(a\ge 1\) can indicate the weight or authenticity of a class. The parameters \({c}_{l}\) and \({c}_{r}\) \(\left({c}_{l/r}>0\right)\) carry the information of the class range (positions of the farthest objects from the center). Therefore, parameter \(c\) represents the greatest distance of the observed objects from \(r\). As shown in Fig. 5, \({c}_{l}\) and \({c}_{r}\) characterize the left- and right-sided expansions of a fuzzy pattern class. The parameters \({b}_{l}\) and \({b}_{r}\) \(\left(0<{b}_{l/r}\le 1\right)\) are factors that determine the value of the MF at the sharp boundaries \({c}_{l/r}\) of the fuzzy pattern class. The parameters \({d}_{l}\) and \({d}_{r}\) \(\left({d}_{l/r}\ge 2\right)\) determine the form of the function and carry information about the object distribution in the corresponding class. In the limit case \({d}_{l/r}\to \infty\), the class changes to a sharply described (crisp) class (red color in Fig. 6) [24].
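As an illustration only (not the authors' implementation), the asymmetric potential function of Eq. (1) can be evaluated numerically as follows; the parameter names mirror the symbols \(r\), \(a\), \(b_{l/r}\), \(c_{l/r}\), and \(d_{l/r}\), and the chosen values are arbitrary examples.

```python
import numpy as np

def membership_1d(u, r, a=1.0, b_l=0.3, b_r=0.3, c_l=1.0, c_r=1.0, d_l=2, d_r=2):
    """Asymmetric Aizerman-type potential membership function, Eq. (1).

    r     : representative of the class (position of the maximum)
    a     : maximum membership value (a >= 1 may act as a class weight)
    b_l/r : membership value at the sharp class borders c_l/r (0 < b <= 1)
    c_l/r : left-/right-sided class expansion (c > 0)
    d_l/r : form parameters controlling the steepness (d >= 2)
    """
    u = np.asarray(u, dtype=float)
    left = a / (1.0 + (1.0 / b_l - 1.0) * ((r - u) / c_l) ** d_l)
    right = a / (1.0 + (1.0 / b_r - 1.0) * ((u - r) / c_r) ** d_r)
    return np.where(u < r, left, right)

# The membership equals a at u = r and drops to a*b_r at the border u = r + c_r.
print(membership_1d([1.0, 2.5], r=1.0, c_l=2.0, c_r=1.5))
```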

Fig. 6
figure 6

Illustration of two-dimensional MFs for four classes

3.1.2 Multidimensional membership functions

The previous considerations only referred to the presentation of fuzzy pattern classification in the one-dimensional (feature) space \({\mathbb{R}}^{1}\). The fuzzy potential function allows the definition of multidimensional MFs in the \(N\)-dimensional space \({\mathbb{R}}^{\mathrm{N}}\). The particular advantage of the potential function is that a suitable conjunctive combination of several one-dimensional MFs can be used to obtain a multivariate MF \(\mu \left(\underline{u}\right)\) in parametric form [24]. The normalized function (\(a=1\)) in \(N\) dimensions is given by

$$\mu \left(\underline{u}\right)=\frac{1}{1+\left(\frac{1}{N}\cdot \sum_{j=1}^{N}{\sigma }_{j}\right)}$$
(2)

where \(j\) indexes the features (\(j = 1, 2, ..., N\); \(N\) is the number of dimensions) and \({\sigma }_{j}\) is given by

$${\sigma }_{j}=\left\{\begin{array}{c}\left(\frac{1}{{b}_{l,j}}-1\right){\left(\frac{{r}_{j}-{u}_{j}}{{c}_{l,j}}\right)}^{{d}_{l,j}} \quad\quad{u}_{j}<{r}_{j}\\ \left(\frac{1}{{b}_{r,j}}-1\right){\left(\frac{{u}_{j}-{r}_{j}}{{c}_{r,j}}\right)}^{{d}_{r,j}}, \quad\quad{u}_{j}\ge {r}_{j}\end{array}\right..$$
(3)

Figure 6 illustrates the basic join operation for two-dimensional fuzzy pattern classes.
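A possible vectorized sketch of Eqs. (2) and (3), assuming a normalized class (\(a=1\)) and per-feature parameter arrays (the concrete values below are arbitrary examples), is:

```python
import numpy as np

def membership_nd(u, r, b_l, b_r, c_l, c_r, d_l, d_r):
    """Multidimensional potential membership function, Eqs. (2) and (3).

    u is the feature vector; all other arguments are arrays of length N
    holding the one-dimensional MF parameters of the individual features.
    """
    u, r = np.asarray(u, float), np.asarray(r, float)
    sigma_l = (1.0 / b_l - 1.0) * ((r - u) / c_l) ** d_l
    sigma_r = (1.0 / b_r - 1.0) * ((u - r) / c_r) ** d_r
    sigma = np.where(u < r, sigma_l, sigma_r)      # sigma_j of Eq. (3)
    return 1.0 / (1.0 + sigma.mean())              # (1/N) * sum_j sigma_j, Eq. (2)

# Two-dimensional example (N = 2), as in Fig. 6
mu = membership_nd(u=[0.4, 1.2], r=[0.0, 1.0],
                   b_l=np.full(2, 0.3), b_r=np.full(2, 0.3),
                   c_l=np.ones(2), c_r=np.ones(2),
                   d_l=np.full(2, 2), d_r=np.full(2, 2))
print(mu)
```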

4 Local fuzzy model structure

Figure 7 shows the local fuzzy model concept. Each segment of the local model consists of two parts: (a) the local linear model, whose parameters are estimated independently for each segment by (polynomial) regression, and (b) the validity function, which carries the membership values of a partition.

Fig. 7
figure 7

Combining three planes to a nonlinear surface

The local linear model \({f}_{k}\left(x\right)\), with \(k=1, 2, ..., K\), where \(K\) is the number of classes, gives the output for segment \(k\) of a partition as

$${f}_{k}\left(x\right)={p}_{k,0}+\sum_{j=1}^{N}{p}_{k,j}\cdot {x}_{j}.$$
(4)

Therefore, the function graph of \({f}_{k}\left(x\right)\) represents a straight line for \(N=1\), a plane for \(N=2\), and a hyperplane for multidimensional features. Besides the parameter estimation of the linear models, it is necessary to calculate the validity function parameters, i.e., the MF parameters. The parameters of each membership function are calculated from Eq. (1) for the one-dimensional and from Eqs. (2) and (3) for the multidimensional feature space.

The global model output \(y(\underline{u})\) is calculated from the aggregation through weighted averaging of all local estimations [12].

$$y\left(\underline{u}\right)=\sum_{k=1}^{K}{y}_{k}\left(\underline{u}\right)\cdot {\nu }_{k}\left(\underline{u}\right),$$
(5)

where

$${\nu }_{k}\left(\underline{u}\right)=\frac{{\mu }_{k}\left(\underline{u}\right)}{\sum_{k=1}^{K}{\mu }_{k}\left(\underline{u}\right)}.$$
(6)

The linear model parameters and the potential membership function parameters are calculated for each local partition; together they approximate the multidimensional nonlinear function on the basis of the potential membership functions.
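A minimal sketch of the model evaluation defined by Eqs. (4) to (6) could look as follows; it assumes that the local hyperplane parameters and the MF parameters of the \(K\) classes have already been estimated and reuses the membership function sketched in Sect. 3.1.2 (or any comparable implementation) as the callable passed in:

```python
import numpy as np

def local_fuzzy_predict(u, local_params, mf_params, membership):
    """Global model output y(u) of the local fuzzy model, Eqs. (4)-(6).

    u            : feature vector of length N
    local_params : list of K arrays [p_k0, p_k1, ..., p_kN], one hyperplane per class
    mf_params    : list of K dicts with the MF parameters of each class
    membership   : callable mu(u, **params), e.g., membership_nd from Sect. 3.1.2
    """
    u = np.asarray(u, dtype=float)
    # local linear estimations y_k(u), Eq. (4)
    y_local = np.array([p[0] + np.dot(p[1:], u) for p in local_params])
    # class memberships mu_k(u) and normalized validity functions nu_k(u), Eq. (6)
    mu = np.array([membership(u, **params) for params in mf_params])
    nu = mu / mu.sum()
    # weighted averaging of the local estimations, Eq. (5)
    return float(np.dot(y_local, nu))
```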

5 Data collection (experimentation and simulation procedure)

5.1 Experimental data collection

In order to validate the simulation models of the SPR process and to generate a separate experimental database for the following considerations, 125 combinations of the materials in Table 1 were identified on the basis of a partial factorial experimental design according to the optimized Latin hypercube sampling method [26] and subsequently joined and evaluated experimentally. Process parameters like the rivet length (between 4 and 6 mm) and the die design (diameter 9 to 11 mm, depth 1.0 to 2.2 mm) have an impact on the joint geometry (see Fig. 1d) and were chosen based on experience. Figure 8 visualizes the results of the experimental joining investigations. Green dots represent joints that fulfill all necessary criteria (Fig. 1d), and red dots represent joints that do not meet at least one of the criteria, plotted over the tensile strength and the sheet thickness ratio of the punch- and die-sided sheets. It can be noted that sheet thickness ratios above 1.7 lead to critical results, which corresponds well with the general advice for SPR that it is better to join thinner into thicker sheets [1]. The average scatter of the geometrical characteristic values interlock and minimum thickness of the die-sided part in the experiments was approx. 20% with three evaluated experiments per test series. This value is used as a quality characteristic for the accuracy of the numerical or data-based prognoses [2].

Table 1 Considered materials
Fig. 8
figure 8

Distribution of OK and not OK joints concerning the tensile strength (\({R}_{m}\)), ultimate tensile strength ratio (\({R}_{m1/2}\)), and sheet thickness ratio (\({t}_{1/2}\)) of the joined parts [2]

5.2 Correlation between experiment vs. simulation

Subsequently, 2D simulation models with the structure shown in Fig. 9 were built up in Simufact V15 [30] for all 125 experimental joints. With the subsequent numerical sensitivity analysis in mind, the simulation models are designed to achieve a good balance between forecasting accuracy and computing time. An average computing time of 13 min per SPR process simulation (information about the mesh structure and the number of elements can be seen in Fig. 9) is achieved on a workstation with 14 cores. The chosen parameters of the combined friction model are the result of a numerical sensitivity analysis by fitting the calculated joint contour and force–displacement curve of the SPR process to the experimental ones. Since the simulation quality with these constant friction values is already sufficiently high, the use of nonlinear approaches is avoided here in order to keep the complexity and computation time low. Flow curves for the sheet and the rivet material are determined by stack compression tests (SCT) due to the good comparability in terms of stress state between SPR and SCT [3].

Fig. 9
figure 9

 2D simulation model for the process simulation of self-pierce riveting with additional information

The decisive criterion for the validation of the simulation models regarding sufficient predictive capability was whether the deviation between simulation and experiment was within the average absolute scatter of interlock and min. thickness of the die-sided part over all experiments. If these conditions were fulfilled and, in principle, good comparability in terms of joint contour and force–displacement progression was given, the simulation models were used for the subsequent variation studies. As a result of the evaluation, 71 models fulfilled the validation criterion and showed the necessary accuracy to be used for the numerical variation study.

5.3 Material combinations

In the study described here, the SPR process of steel and aluminum sheets is investigated. Thereby, the materials and thicknesses described in Table 1 are considered.

5.4 Simulation data collection

The goal of the numerical sensitivity analysis is to generate an extensive database for the SPR process of the steel and aluminum sheets under consideration. In the numerical simulation, process parameters, geometry, and material properties can be varied much more flexibly than in experiments. One hundred fifty parameter variations are performed for each of the 71 validated simulation models, which results in a total of 10,650 SPR joining results. For each material combination, the following input variables are individually varied in relation to the original values:

Process variation of the parameters:

  1. Sheet thickness in [mm]:

    \(\begin{array}{ccc}\pm 0.15& \pm 0.1& \pm 0.05\end{array}\) 

  2. Flow curve variation in [%]:

    \(\begin{array}{ccc}0& \pm 5& \pm 10\end{array}\)  

  3. Rivet head geometry:

    C-form; P-form

  4. Rivet length in [mm]:

    \(\begin{array}{cc}0& \pm 0.25\end{array}\) 

  5. Contour diameter in [mm]:

    \(\begin{array}{cc}0& \pm 0.5\end{array}\) 

  6. Contour depth in [mm]:

    \(\begin{array}{cc}0& \pm 0.25\end{array}\) 

  7. Thorn height in [mm]:

    \(\begin{array}{cccc}0& 0.5& 1.0& 1.5\end{array}\) 

  8. Thorn width in [mm]:

    \(\begin{array}{cccc}0& 2.0& 3.0& 4.0\end{array}\) 

  9. Contour edge radius in [mm]:

    \(\begin{array}{ccc}0.1& 0.25& 0.5\end{array}\) 

Due to the high number of possible parameter variations, a partial factorial design of simulations according to the optimized Latin hypercube sampling (LHS) method [27, 28] was generated. All the required material and geometry data were integrated into the Simufact Joining Optimizer, a special tool for the automation of mechanical joining simulations inside the Simufact Joining software, which allows the multitude of simulations to be built up, carried out, and evaluated automatically. All 10,650 process simulations were calculated in a total time of approx. 630 h on a workstation with 14 cores.
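For illustration only, a (non-optimized) Latin hypercube design over a subset of the varied parameters can be generated with standard tools; the variable selection and ranges below merely mirror the variation limits listed above and do not reproduce the actual Simufact Joining Optimizer workflow:

```python
import numpy as np
from scipy.stats import qmc

# Three exemplary variation parameters: sheet thickness offset [mm],
# flow curve variation [%], rivet length offset [mm]
lower = [-0.15, -10.0, -0.25]
upper = [0.15, 10.0, 0.25]

sampler = qmc.LatinHypercube(d=3, seed=42)
design = qmc.scale(sampler.random(n=150), lower, upper)  # 150 runs per base model
print(design[:3])
```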

5.5 Available databases

Originally, two databases are available for the evaluation of the process data of the SPR of the considered material combinations: the experimental database with 125 joining results (DB 1) and the numerically calculated database with 10,650 joining results (DB 2). Since the numerical database is based on a statistical design, it contains a series of simulations that are incorrect from a technological point of view. Therefore, this database was additionally adjusted in order to improve the later data-based prognosis. The following technological filters were applied:

  • Interlock u1,2 ≥ 0 mm

  • Min. thickness die-sided part tr ≥ 0 mm

  • Rivet foot diameter df ≤ 8.2 mm

  • Rivet head position − 0.5 mm ≤ ph ≤ 0.5 mm

  • Max. joining force 30 kN ≤ FJ ≤ 85 kN

A total of 2376 joining results fulfill these criteria; they are considered in the following as a separate database (DB 3), which was used for creating the model for the simulations (ModelSim in Fig. 4).
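A hypothetical filtering step that applies the technological criteria above to the raw simulation database could, for example, be realized as follows (file and column names are assumptions for illustration):

```python
import pandas as pd

# DB 2: raw simulation results, one row per joining simulation (assumed file/columns)
db2 = pd.read_csv("db2_simulation_results.csv")

db3 = db2[
    (db2["interlock"] >= 0.0)                            # u1,2 >= 0 mm
    & (db2["min_thickness_die_side"] >= 0.0)             # tr >= 0 mm
    & (db2["rivet_foot_diameter"] <= 8.2)                # df <= 8.2 mm
    & (db2["rivet_head_position"].between(-0.5, 0.5))    # -0.5 mm <= ph <= 0.5 mm
    & (db2["max_joining_force"].between(30.0, 85.0))     # 30 kN <= FJ <= 85 kN
]
print(len(db3))   # 2376 joining results remained in the study described here
```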

6 Results and discussions

6.1 Training phase and performance evaluation

As explained in Sect. 5, two types of data are used in the modeling: the experimental (DB1) and the simulation (DB3) data, with 11 inputs and 5 outputs. In this work, the nonlinear multi–input–multi–output (MIMO) problem (see Fig. 2) is expressed by multi-input–single-output (MISO) systems (see Fig. 3), because the aim is to analyze the influence of the inputs on each output.

The local fuzzy model should approximate the nonlinear system with minimal error. Therefore, the membership function parameters have to be estimated. The main challenge, however, is determining the number of local models and thus the number of membership functions. One approach to solving this problem is the multi-resolution constructive algorithm, which increases the number of local models step by step: in every step, the center of the new class is placed in the area with the most significant error [12]. All in all, 82 experimental data samples are used to estimate the number of local fuzzy models with the reclassification approach to calculate the global error.
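The constructive idea can be sketched schematically as follows; the crisp nearest-center assignment and the per-class least-squares fit below are deliberate simplifications that stand in for the fuzzy validity functions and the parameter estimation of Sect. 4:

```python
import numpy as np

def fit_predict_crisp(X, y, centers):
    """Simplified stand-in for the local modeling step: crisp nearest-center
    assignment plus per-class linear least squares (the actual method uses
    the fuzzy validity functions of Sect. 4 instead)."""
    centers = np.asarray(centers)
    dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    labels = dist.argmin(axis=1)
    y_hat = np.full(len(y), y.mean(), dtype=float)
    for k in range(len(centers)):
        mask = labels == k
        if mask.sum() > X.shape[1]:                  # enough points for a hyperplane
            A = np.c_[np.ones(mask.sum()), X[mask]]
            p, *_ = np.linalg.lstsq(A, y[mask], rcond=None)
            y_hat[mask] = A @ p
    return y_hat

def constructive_partitioning(X, y, k_max=10, tol=1e-3):
    """Schematic multi-resolution construction: each new class center is placed
    at the training object with the currently largest residual error."""
    centers = [X.mean(axis=0)]
    best = np.inf
    for _ in range(k_max - 1):
        residuals = np.abs(y - fit_predict_crisp(X, y, centers))
        err = residuals.mean()
        if best - err < tol:                         # no significant improvement
            break
        best = err
        centers.append(X[np.argmax(residuals)])      # new center at the largest error
    return np.array(centers)

# Tiny synthetic example: a noisy nonlinear target over two inputs
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))
y = np.sin(3 * X[:, 0]) + 0.5 * X[:, 1] + 0.05 * rng.standard_normal(200)
print(len(constructive_partitioning(X, y, k_max=10)))
```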

6.2 Experimental and simulation results

6.2.1 Model representativeness: mean maximum membership of training objects

For a model to perform well, it is necessary that the data support the model in a representative way. In fuzzy modeling, this "fit" of (training) objects and model can be determined by the membership values (sympathy vector). The maximum membership value \({\mu }_{\mathrm{max}}\) over all \(K\) submodels, with \(k = 1, ..., K\), indicates to which submodel the object in question dominantly belongs:

$${\mu }_{\mathrm{max}}=\underset{k}{\mathrm{max}}\left({\mu }_{k}\right)$$
(7)

Thus, the mean maximum membership value \({\overline{\mu }}_{max}\) can be determined as a measure of model validity over all (training) objects:

$${\overline{\mu }}_{max}=\frac{1}{M}\sum_{m=1}^{M}{\mu }_{ma{x}_{m}}$$
(8)

where \(m\) indexes the objects (\(m = 1, ..., M\)) and \({\mu }_{ma{x}_{m}}\) is the maximum membership value of the \(m\)-th object. This dominant membership value, averaged over all objects and all submodels, should be sufficiently high, \({\overline{\mu }}_{max}>0.5\).
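In a sketch, Eqs. (7) and (8) reduce to a row-wise maximum followed by averaging over an \(M \times K\) matrix of membership values (the small matrix below is only an illustrative example):

```python
import numpy as np

def mean_max_membership(mu):
    """Eqs. (7) and (8): mu has shape (M objects, K submodels)."""
    return mu.max(axis=1).mean()

# Three objects, two submodels
mu = np.array([[0.8, 0.2],
               [0.4, 0.7],
               [0.6, 0.5]])
print(mean_max_membership(mu))   # -> 0.7
```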

For validation, \(k\)-fold cross-validation was utilized, in which the data set is divided into \(k\) partitions. We set \(k=3\) to evaluate the performance of each model; therefore, one-third of the data set was used for testing. Hence, the modeling was repeated three times for each output with different training data sets and, consequently, thirty times in total for the experimental and simulation data sets.
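The threefold split itself can be sketched with standard tools; the snippet only illustrates the partitioning of the object indices, while the model training per fold follows Sect. 4 (M = 82 is chosen as an example, corresponding to the experimental data samples mentioned in Sect. 6.1):

```python
import numpy as np
from sklearn.model_selection import KFold

M = 82                         # example: number of objects in the data set
kf = KFold(n_splits=3, shuffle=True, random_state=0)
for fold, (train_idx, test_idx) in enumerate(kf.split(np.arange(M)), start=1):
    print(f"fold {fold}: {len(train_idx)} training / {len(test_idx)} test objects")
```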

After validation, it can be seen that the values of \({\overline{\mu }}_{max}\), averaged over the threefold cross-validation, are very balanced, as Table 2 shows.

Table 2 Mean maximum class membership values for the five outputs, averaged over all objects and threefold cross-validation

These values range from 0.64 to 0.74, showing sufficient representativeness of the objects for segmentation. For DB1, the values are lower than for DB3. The reason is the higher number of objects for DB3 and thus the stronger support of the segments.

The mean error (ME) in Eq. (9) and the mean absolute error (MAE) in Eq. (10) are used for the evaluation, and all error values are given as relative values in percent, where \({x}_{m}\) is the measured value, \({\widetilde{x}}_{m}\) is the model output value, and \(\overline{x}\) is the mean value over the \(M\) training objects.

$$ME=\frac{1}{M}\sum_{m=1}^{M}\frac{\widetilde{{x}_{m}}-{x}_{m}}{\overline{x}}$$
(9)
$$MAE=\frac{1}{M}\sum_{m=1}^{M}\frac{\left|\widetilde{{x}_{m}}-{x}_{m}\right|}{\left|\overline{x}\right|}$$
(10)

Furthermore, the mean maximum membership value is included, which represents a measure of the model validity.

The training phase was carried out with the initial values \(\left({b}_{l/r}=0.1, {d}_{l/r}=2\right)\) and \(\left({b}_{l/r}=0.3, {d}_{l/r}=3\right)\). The errors for each number of classes are summarized in Table 3. It can be seen that with 10 classes, the error after reclassification has the lowest value. Hence, this number of local models with the hyperparameters \(\left({b}_{l/r}=0.3, {d}_{l/r}=3\right)\), symmetric and without rotation, is used in the modeling of the experimental and simulation data.

Table 3 The maximum error in percent with the increasing number of partitions in the method of local fuzzy models

6.2.2 Error comparison for the DB1 and DB3 database

Since both approaches are data based, a basis for a comparative discussion of the results of the modeling approaches is given on the basis of the percentage errors ME and MAE for the approaches via DB1 (experiments) and DB3 (computer simulations).

The average ME and MAE errors for each output are shown in Figs. 10 and 11, and the accuracy for both the experimental and simulation data sets with threefold cross-validation is summarized in Tables 4 and 5. The second, fourth, and sixth columns give the mean error (ME) of the model test for each \(k\)-fold training data set. The third, fifth, and seventh columns give the corresponding mean absolute error (MAE). Comparing the results of the local fuzzy modeling listed in Tables 4, 5, 6, and 7, it can be seen that the output differs for each model due to the non-uniformity of the training and test data in each \(k\)-fold evaluation. The smallest ME error for the test objects in the three test folds \({k}_{1}\), \({k}_{2}\), and \({k}_{3}\) is 0 to 15% for the rivet head position in the experimental data and 0% in the simulation data. The highest ME error is 0 to 15% for the interlock in the experimental data and 2 to 7% for the rivet foot diameter in the simulation data.

Fig. 10
figure 10

Comparison of the DB1 vs. DB3 ME values for the five outputs (mean values over the three k-folds)

Fig. 11
figure 11

Comparison of the DB1 vs. DB3 MAE values for the five outputs (mean absolute values over the three k-folds)

Table 4 Experimental prediction results using training data based on the mean error and mean absolute percentage errors
Table 5 Simulation prediction results using training data based on the mean error and mean absolute percentage errors
Table 6 Experimental mean membership value results using training data
Table 7 Simulation mean membership value results using training data

The ME error has the meaning of an average deviation of the nonlinear (here 11-dimensional) modeled map from the real map, averaged over all grid points. The ME errors are between 0 and 7.7% (average: 3.73%) over both databases and all outputs. From Fig. 10, it can be seen that the ME values averaged over the three \(k\)-folds for all five outputs are larger for the DB1 database (experiments) than for DB3 (computer simulations); a weak exception is the rivet foot diameter, i.e., output 3. This is understandable since the DB1 model in the 11-dimensional space is based on the comparatively very small data set of 82 training data, which were, however, fuzzified.

The MAE values have the meaning of the mean absolute deviation of the modeled data from the real data. This value is, of course, higher than the ME. The values for the DB1 and DB3 databases are comparably high for the five outputs. They range from 2.3 to 36.7% (average: 18.2%) across both databases and all outputs (see Fig. 11).

In particular, it becomes clear that the model represents a generalization, which shows itself in a smoothing. Consequently, the absolute errors are enlarged by the peaks. As can be seen from the evaluation of the membership function values for the target outputs, the mean maximum membership for the experimental data is between 0.63 and 0.68 and for the simulation data between 0.70 and 0.76. This means that we face more uncertainty in the experimental data set (see Tables 6 and 7). Model variants with increasing differentiation were further examined (1 to 25 local models). Considering the increasing number of total model parameters with increasing segmentation, the achievable model improvement, and possible overfitting due to the relatively small data volume, a partitioning into ten local models was chosen as a compromise. Table 8 summarizes the reductions of the errors ME and MAE for DB1 and DB3 in percent for one vs. ten (optimized) segments. The calculations are based on the test objects, averaged over all \(k\)-fold variants. Obviously, almost all models improved; the only exception is the MAE value for the maximum joining force, i.e., output five.

Table 8 Percentage reduction in errors ME and MAE (in %) between the model with only one segment and the model with ten (optimized) segments (basis: test data sets)

6.2.3 Comparison with the other modeling approach

For comparison, calculations were performed with the same prerequisites (same training and test data sets for the three \(k\)-folds, evaluation of the error values ME and MAE) for a linear regression modeling approach. The results are summarized in Figs. 12 and 13.

Fig. 12
figure 12

Comparison local fuzzy vs. linear regression of the DB1 (a) and DB3 (b) ME values for the five outputs (mean values over the three \(k\)-folds)

Fig. 13
figure 13

Comparison local fuzzy vs. linear regression of the DB1 (a) and DB3 (b) MAE values for the five outputs (mean absolute values over the three \(k\)-folds)

6.3 Cross-examinations between experimental and simulation

A future research task should be the comparison of models from experiments and corresponding ones from computer simulations. The cross-examinations serve as a first step in this direction. This means that the models built from the small-data experimental set are tested with the big-data simulation data set, and the reverse case is also examined. The introduced ME and MAE errors are used for this purpose as well.

According to the main paths in Fig. 4, the models were designed in parallel with the experimental and simulation data (DB1 and DB3, respectively). An obvious consideration is now the crosswise examination of the models, i.e., the DB1 model with the DB3 data and vice versa (dotted connections of the secondary paths in Fig. 4). The data of the respective other access can be taken as new test data collected by qualitatively different access.

From Table 9, it can be seen that, as expected, testing the models with their associated test data leads to minor errors. The results for the DB3 model (simulated data) with the DB1 foreign data are also acceptable (ME = 7%, MAE = 28%, and \({\overline{\mu }}_{max}=\) 0.58). In contrast, the DB1 model built from the small number of experimental data is unsuitable for the DB3 data (errors exceeding 100%). This unsuitability is also evident from the low value for the mean class membership of \({\overline{\mu }}_{max}=0.25\).

Table 9 Comparison experimental and simulation cross-examination prediction of the local fuzzy model, based on the mean error, mean absolute percentage errors, and mean membership value

As a consequence, a fusion of the two model approaches is required. This can be done by adaptation toward the DB1 model, since it embodies reality. The adaptation requires the design of new simulation results. To limit the simulation effort, the settings for this have to be well planned. This is possible based on the \(\mu\) values, which embody the representativeness with respect to the classes. As a result of the models over the DB1 or DB3 databases, or of an additional subsequent adaptation, the local fuzzy model structure is available. It is transparent and can be used close to the process with little computational effort.

7 Summary and conclusions

Self-piercing riveting is a highly productive and efficient process for joining parts in the metalworking industry (especially in the automotive industry and aircraft construction). It is possible to connect sheets of the same or different materials (e.g., aluminum and steel). Compared to welding, this technology joins parts quickly and with low energy consumption and does not introduce thermal stresses or deformations into the material. Connections involving non-weldable materials are also possible in one operation. The goals are high productivity of the joining process and the best possible fulfillment of the required quality specifications. With self-piercing rivets, such process results (outputs) are, e.g., joining force, undercut, undercut height, material thickness, rivet base diameter, and the strength of the connection. The course and thus the result of the riveting process are determined by several influencing factors, only some of which can be influenced. These comprise the influencing variables determined by the process design and thus known (such as the materials of the sheets to be joined and the sheet thicknesses) and the boundary conditions that cannot be influenced, such as the characteristics of the riveting device. The important influenceable variables (inputs) that determine the actual riveting process enter directly into the modeling. These include the tensile strengths and thicknesses of the sheets, rivet length, die diameter and depth, mandrel height and width, contour radius, and rivet end position. Changes to the riveting device (e.g., wear of the punch or die) and stochastic influences during the fast riveting process also play a role. These influences lead to uncertainties in the modeling, which should at least partially be captured. The modeling of this multi–input–multi–output (MIMO) problem is thus characterized by a relatively large number of input and output variables (Fig. 2). For reasons of modeling technique, the description is based on a multi-input–single-output (MISO) structure according to Fig. 3.

The modeling carried out in this paper is based exclusively on the estimation of the model parameters with the help of measured and computer simulation data. This enables the modeling and control of systems for which no usable mathematical model is available.

The main goal of the current study was to model a nonlinear dynamic system with the help of locally valid linear models whose behavior corresponds locally to that of the nonlinear system. In this work, local fuzzy pattern modeling with multidimensional membership functions is used to predict the outputs of the self-piercing riveting process. The parameter estimation is based on an existing partition, and the partition structure is of much greater importance for the quality of the model. Therefore, finding a proper partition of the input variable range, i.e., the optimal definition of the validity ranges, is significant. The lowest error value after reclassification was obtained with 10 classes. To train the model, threefold cross-validation was used.

This research has shown that the MAE values are higher than the ME values and are comparably high for the five outputs for both DB1 and DB3. It was also shown that the ME values for all five outputs are larger for the experimental database than for the computer simulations. The evaluation of the membership function values has shown that we face more uncertainty in the experimental data set. One of the more significant findings to emerge from this study is the comparison of the experimental model built from small data with the computer simulation model built from big data, in other words, the cross-examination between the experimental and simulation models. This investigation showed that testing the DB3 model (simulated data) with the DB1 foreign data is acceptable and yields minor errors. Conversely, the DB1 model built from the experimental data is unsuitable for the DB3 data. A second significant finding was the comparison between the local fuzzy model and linear regression.

This research will serve as a basis for future studies, divided into application and theoretical perspectives. On the application side, this concept and approach can be transferred to other technical and manufacturing applications, such as machine tools, joining, and surface analysis. On the theoretical side, the suggestion is evolving classification. Evolving classifications can update their structures, components, and parameters according to new process features, system behavior, and operational and environmental conditions. Such systems support modeling of scenarios based on data streams, online measurements, and dynamic data whose nature and characteristics change over time.