1 Introduction

Advances in modern information and communication technology have made it possible to monitor, manage, and store data from industrial systems effectively. As a result, industrial systems in general, and the manufacturing sector in particular, store real-time operation data of the processes carried out for value-added product development. The data stored in the data banks of the manufacturing industries carry the chief characteristics and information of the system and can be deployed for customized, problem-specific analytics, thereby enhancing productivity, smart operation, informed decision-making, and operational excellence.

Conventional process modelling and analytical techniques, such as physical models and regression techniques, are used by the manufacturing community to exploit the operational data of their manufacturing systems [1]. However, physical models built on certain assumptions cannot effectively model a hyperdimensional system that contains highly nonlinear and interactive relationships among the input-output variables [2]. Furthermore, the simulation and optimization of a physical model can be computationally prohibitive, which limits its applications [3]. On the other hand, simple regression-based techniques cannot adequately capture the underlying information and interactions present in large volumes of data; thus, regression models may lead to sub-optimal results for the nonlinear and complex output space of large-scale manufacturing systems [4].

To address the challenges of modelling complex systems, artificial intelligence (AI)-based algorithms have emerged as an alternative modelling paradigm: they not only model complex systems with reasonable accuracy but are also computationally cheaper to develop with the available computational power [5]. AI models are customized for particular applications and can be adapted to similar application domains, which supports their generalized use [6]. In application domains such as energy, material discovery, and process design and optimization, AI is valued for its excellent modelling and feature-detection capabilities. However, a generic analysis framework that depicts the key analysis stages and incorporates AI-based modelling algorithms for the operation analysis of the manufacturing sector is lacking in the literature. Thus, this study proposes a generic analysis framework to determine the optimal operating conditions of a manufacturing system. The proposed AI-built analysis framework is applied to optimize the cutting speed of wire electric discharge machining (WEDM) for stainless clad steel. The non-conventional machining variants, i.e., electric discharge machining (EDM) die sinking and WEDM, play a major role in the manufacturing industry. These non-traditional processes are used to cut materials with high precision and to prepare intricate dies, molds, and parts for aerospace, automotive, and many other industrial applications [7]. Therefore, modelling these machining variants is of great importance for enhancing process productivity and finishing quality. WEDM can machine materials with high wear resistance, strength, and hardness. Conventional machining processes cannot produce complex shapes with high precision, whereas WEDM cuts such shapes with higher accuracy [8, 9].

A literature survey was conducted to identify reported research that uses AI-based modelling and optimization analysis for the WEDM system. Computational methods, including the genetic algorithm (GA), support vector machine (SVM), and artificial neural network (ANN), have gained tremendous popularity for process modelling and optimization with a high degree of precision. The process efficiency results obtained by these models are reported to be more accurate than those of statistical models. These models can flexibly accommodate non-linear datasets and can approximate functions of high-dimensional systems [10]. Paturi et al. [11] investigated the surface roughness (SR) of Inconel 718 in WEDM. The authors developed AI models, i.e., SVM, ANN, and GA, to optimize the SR response. They found that the SR predictions of the machine learning techniques were accurate when compared with the statistical response surface methodology (RSM) model. Moreover, GA gave a 61.31% better surface finish than the RSM model within the defined range of input parameters.

Sivanaga Malleswara Rao et al. [12] determined the influence of the WEDM input parameters, i.e., cutting speed, current, and spark gap, on the SR of high-speed steel. The experimental values were compared with the predictions of the SVM and ANN models. The results showed that the error between the values predicted by the machine learning techniques and the statistical machining results was less than 5%. Amini et al. [13] evaluated the effect of machining parameters (pulse off time (Poff), servo, voltage, wire feed rate (Fw), and powder) in the WEDM of TiB2 nano-ceramic composites by using the GA. The authors found that the proposed neural network gave the highest accuracy for the material removal rate (MRR) and SR. Moreover, GA helped to find the optimal settings for WEDM with 94% accuracy between the experimental and optimized settings. Yusoff et al. [14] determined the influence of machining parameters (Ton, Toff, peak current (Ip), servo voltage (SV), and flushing pressure) in the WEDM of Inconel 718 using the ANN model. The authors reported that AI models are the best tools to minimize machining costs. Moreover, an error of 5.16% was found between the statistical data and the ANN model for the given data set of input and output parameters.

Huang et al. [15] estimated the workpiece height online, in conjunction with the CNC system, to evaluate the machining responses in WEDM by applying the SVM model. The authors used the pulse interval, discharge frequency, and Fr as the input variables for the estimation model (kernel function). An error of less than 2 mm was found for the SVM model compared to the experimental results. The authors also noted that workpieces with varying heights can be machined using an adaptive control unit, but the AI models provided the best solution for unidirectional WEDM. The proficiency of WEDM was determined by Nain et al. [16] by employing gray relational analysis (GRA) and the SVM model in terms of material removal rate (MRR) and SR of the superalloy Udimet L605. The experimental results revealed that SVM gave better outcomes than the non-linear regression models. Moreover, the authors found that the percentage significance of Ton was 60.18% in the case of MRR and 79.10% in the case of SR. Varun and Venkaiah [17] evaluated the optimization strategy for the responses (MRR, cutting width, and SR) by employing GRA with the genetic algorithm for cutting EN 353 in WEDM. The authors reported that Ip is the most influential factor for MRR, cutting width, and SR among the input variables. They found that GRA optimized the situation based on experimental results, whereas the GA optimized a single process parameter based on the other response parameters. Akıncıoğlu [18] used EDM to investigate the machining of titanium alloy in terms of MRR, SR, depth, and electrode wear rate (EWR) under distinct process parameters. The author revealed that amperage was the most significant factor for these response measures, and that the best average magnitude of SR was achieved at a current of 8 A. Nas et al. [19] machined the superalloy Hastelloy C22 with a copper electrode on an electro-erosion machine tool. The experimental study included pulse durations, waiting times, and discharge currents. The minimum mean SR (2.86 μm) was obtained at 20 μs, 10 μs, and 5 A of pulse duration, delay time, and current, respectively. Moreover, the maximum average SR (4.07 μm) was obtained at 10 μs waiting time and 15 A current.

Industry 4.0 involves automating most industrial processes, from machinery status monitoring to production efficiency optimization with robots and digital twins. Predictive maintenance (PdM) uses AI to analyze machinery health data and enable timely repairs, so that the manufacturer can arrange maintenance actions in advance to avoid machine failure. Kotecha et al. [20] discuss AI-enabled machinery defect detection and review classic and new fault detection methods. They also examine current advances in the sensors used and the information extracted for defect detection in case studies.

Jain et al. [21] studied the SR and acoustic signals (AE) of titanium alloy grade-2 in WEDM using a machine learning technique, i.e., ANN. The authors found that neural network training gave results with 70% good prediction compared with the least R-value in the training data of 50 to 60%. Altug et al. [22] optimized the kerf width of titanium alloy in WEDM by employing the GA. The authors found that a fine kerf width was obtained when the sample was quenched, although the quenched sample has lower electrical conductivity, and GA gave the optimal settings for a better kerf width as a function of heat treatment. However, the worst kerf width was obtained when the sample was only tempered to martensite. Chen et al. [23] evaluated the machining parameters for pure tungsten in WEDM by combining a backpropagation neural network (BPNN) and a simulated annealing algorithm (SAA). The authors concluded that the results obtained by these techniques closely matched the actual experimental results. Sunkara et al. [24] determined the proficiency of WEDM in machining holes in aluminum 6061 by employing the GA. The results obtained from actual machining and from the proposed model were compared. The authors found 95% confidence when the GA-predicted results were compared with the experimental results.

Considering the literature cited above, substantial work has been conducted to determine the MRR and SR produced by WEDM when machining different materials. Moreover, distinct algorithms such as GA and RSM have been utilized for these responses (MRR and SR) in WEDM. Various input parameters have been employed to examine their effect on the performance of WEDM, with or without the use of different algorithms. However, the cutting speed of WEDM for stainless clad steel has not yet been investigated with AI-based modelling algorithms (ANN, SVM, and extreme learning machine (ELM)). Moreover, a generic AI-model-based optimization framework for enhancing the performance of the WEDM process is missing; developing such a framework addresses this gap and constitutes one of the novel aspects of this study. Herein, a case study on the machining of clad material through the WEDM process is presented using the proposed AI-based modelling and optimization framework. AI-based computational techniques are implemented to model the cutting speed of the WEDM process and are validated on the experimental conditions, thereby addressing the identified gap. In addition, sensitivity analysis is performed to analyze the importance of the input factors for the cutting speed of WEDM; such an analysis is not documented in the literature and is one of the novel contributions of this case study. Moreover, parametric optimization is utilized to determine the input settings corresponding to higher cutting rates, which are potentially missing in the literature and are of industrial relevance for enhancing the performance and resource competitiveness of WEDM. The increased and widespread use of AI-based analytics leads to smart operation of manufacturing systems, digitalization of processes, and efficient operation management, which contributes to the Industry 4.0 vision.

2 Materials and methods

The approach taken to accomplish the objective of this study is shown in Fig. 1. The first step is the collection of characteristic data for the WEDM process, including observations of its input and output variables; the resulting dataset is then used for data processing and visualization in the second step. The actual WEDM setup used for cutting the clad material is shown in Fig. 2(a). In this work, eight input parameters are considered, namely the layer thickness of stainless steel (LTss), the layer thickness of mild steel (LTMS), servo voltage (SV), wire diameter (Dw), pressure ratio (Pr), wire feed rate (Fw), pulse on time (Pon), and orientation (Or). Two levels are used for the orientation control variable, whereas three levels are employed for the remaining parameters in the experiments. The stainless clad steel material is shown in Fig. 2(b). The data for the cutting speed of the WEDM process are taken from the literature [25]. The WebPlotDigitizer software is used to extract the data from the research paper, and 35 observations covering the whole design space of the input-output variables are collected. Thirty observations (for statistical inference) that reflect the influence of the input variables are used for training the AI models, whereas the remaining five observations are used for validating the trained AI models.

Fig. 1

Methodology implemented on the case study taken from WEDM. A number of stages are devised for the AI based modelling and optimization framework which is implemented on the WEDM process

Fig. 2

a Working setup of WEDM; b stainless clad steel material

The collected data of the cutting speed from the WEDM against the input variables can be represented in the form of box plots as it presents an effective graphical visualization of the data. The investigation of the linear dependence among the variables is imperative to identify the truly independent variables. The independent variables can construct an effective AI-based model that can predict the accurate values of the output variables taken from the hyperdimensional and complex system. Pearson correlation coefficient is a reliable measure to investigate the linear dependence among the variables. The mathematical expression of the Pearson correlation coefficient is given as:

$${R}_{xy}=\frac{\sum_{i=1}^N\left({x}_i-\overline{x}\right)\left({y}_i-\overline{y}\right)}{\sqrt{\sum_{i=1}^N{\left({x}_i-\overline{x}\right)}^2}\sqrt{\sum_{i=1}^N{\left({y}_i-\overline{y}\right)}^2}}$$
(1)

where x denotes the input variable and y the response measure; i = 1, 2, 3, …, N indexes the observations; and \(\overline{x}\) and \(\overline{y}\) are the mean values of x and y, respectively. The value of Rxy ranges from −1 (perfect negative correlation) to +1 (perfect positive correlation), whereas Rxy = 0 indicates no linear dependence between the variables.
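As a minimal sketch of Eq. (1), the coefficient can be computed directly from the sums of deviations. The function name `pearson_r` and the sample values are illustrative assumptions and are not taken from the study's dataset.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences (Eq. 1)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Numerator: sum of products of the deviations from the means
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # Denominator: product of the root sums of squared deviations
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    ss_y = sum((yi - mean_y) ** 2 for yi in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical values for illustration only (not the study's data)
pulse_on_time = [3.0, 4.0, 5.0, 3.0, 5.0]
cutting_speed = [1.6, 2.0, 2.5, 1.7, 2.4]
print(round(pearson_r(pulse_on_time, cutting_speed), 3))
```

Applying the function to every pair of variables gives the entries of a correlation heat map of the kind described in this study.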

After the data-processing and visualization step, the AI-based modelling algorithms, i.e., ANN, SVM, and ELM, are trained in the third step under rigorous hyperparameter tuning. These three AI-based models are considered because they can effectively learn the nonlinear and complex characteristics of the system, as reported in the literature [26,27,28,29]. Further details on the working of ANN, SVM, and ELM can be found in [27, 30].

To assess the effectiveness of trained AI models, three statistical metrics are selected. Coefficient of determination (R2), root-mean-squared-error (RMSE), and mean absolute error (MAE) are commonly deployed to evaluate the predictive performance of the AI models [31]. Below are the mathematical Equations (2)–(4) for the chosen performance parameters:

$${R}^2=1-\frac{\sum_{i=1}^N{\left({y}_i-{\hat{y}}_i\right)}^2}{\sum_{i=1}^N{\left({y}_i-\overline{y}\right)}^2}$$
(2)
$$\textrm{RMSE}=\sqrt{\frac{1}{\textrm{N}}\sum\nolimits_{\textrm{i}=1}^{\textrm{N}}{\left({\hat{\textrm{y}}}_{\textrm{i}}-{\textrm{y}}_{\textrm{i}}\right)}^2}$$
(3)
$$\textrm{MAE}=\frac{1}{\textrm{N}}\sum\nolimits_{\textrm{i}=1}^{\textrm{N}}\left|{\textrm{y}}_{\textrm{i}}-{\hat{\textrm{y}}}_{\textrm{i}}\right|$$
(4)

where N denotes the sample size; \({\hat{y}}_i\) and yi represent the predicted and actual values, respectively; and \(\overline{y}\) denotes the mean of the actual values. R2 measures the goodness of fit: a value of one corresponds to 100% accurate predictions, whereas values close to zero indicate poor prediction accuracy. RMSE and MAE are error terms that measure the difference between the actual values and the model-predicted responses.
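The three metrics of Eqs. (2)–(4) can be sketched in a few lines. The function names and the sample values below are illustrative assumptions, not part of the original study.

```python
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination, Eq. (2)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root-mean-squared error, Eq. (3)."""
    n = len(y_true)
    return math.sqrt(sum((y_hat - y) ** 2 for y, y_hat in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute error, Eq. (4)."""
    n = len(y_true)
    return sum(abs(y - y_hat) for y, y_hat in zip(y_true, y_pred)) / n


# Hypothetical cutting speeds in mm/s, for illustration only
actual = [1.5, 1.9, 2.2, 2.5]
predicted = [1.6, 1.8, 2.2, 2.4]
print(r2_score(actual, predicted), rmse(actual, predicted), mae(actual, predicted))
```

Note that, with this definition, R2 equals zero for a model that always predicts the mean of the actual values and can become negative for a model that performs worse than that baseline on unseen data.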

To evaluate the generalization and prediction performance of the trained AI models, an external validation test (Valext test) is executed in the fourth step. The Valext test consists of predicting unseen data with the trained AI models to evaluate their generalization capability so that a well-performing AI model can be selected. In the fifth step, the comparative prediction capacity of the models is evaluated on the performance metrics R2, RMSE, and MAE, and the better-performing AI model is selected. In the sixth step of the methodology, sensitivity analysis is performed using the better-performing AI model, and the significance order of the input variables for predicting the cutting speed of the WEDM process is established. Finally, in the last step, parametric optimization is utilized to determine the values of the input variables that maximize the cutting speed of the WEDM process.

3 Results and discussion

3.1 Descriptive statistics

After obtaining the training data on cutting speed of WEDM process, the box plots are constructed to visualize the data. Fig. 3 depicts the box plot for the training data of WEDM’s cutting speed for the machining of clad steel material under the input variables. From Fig. 3, it is apparent that the lower values of input variables, i.e., LTss, LTMS, SV, and Dw significantly affect the cutting speed of WEDM for the clad material. However, the rest of the input variables, such as Pr, Fw, and Pon, at their moderate value level, cause an increase in the WEDM’s cutting speed.

Fig. 3

Box plot for the cutting speed of WEDM

A brief description of the input variables is presented here. The material used in this study is clad steel, which is composed of two types of material. LTSS denotes the layer thickness of stainless steel, whereas LTMS denotes the layer thickness of mild steel. SV plays a crucial role in the cutting rate of the material. The Dw used for cutting in WEDM influences the machining characteristics of the material; three different values of Dw (0.2 mm, 0.25 mm, and 0.3 mm) were used for cutting the clad material. Pr indicates the ratio between the movement of the ram and the flushing pressure during Poff. Fw is the wire feed rate for cutting the clad material and is changed from 4 to 10 feed mode measured in mm/s. Pon is the duration for which a spark is generated between the workpiece and the wire to cut the material.

The box plots shown in Fig. 3 indicate the ranges of the eight input variables used in the experiments against the cutting speed of the clad material [25]. It was observed that setting LTSS and LTMS to 2 mm and 6 mm, respectively, significantly improved the cutting speed of the clad material. In addition, the lower magnitudes of Dw (0.2 mm), SV (30 V), and Fw (2 mm/s) also increased the cutting speed of WEDM for the clad material. However, the moderate value of Pr (1.0) and the highest value of Pon (5 μs) mainly increased the cutting speed.

The whiskers of the box plot, which extend to 1.5 × IQR, indicate maximum and minimum cutting speeds of 2.59 mm/s and 1.42 mm/s, respectively. The mean cutting speed lies below the median line of the data. Additionally, 75% of the cutting speed data lie in the range from 2 mm/s to 2.5 mm/s; the remaining 25% range from 1.8 to 2.1 mm/s.
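The box-plot quantities discussed above (quartiles, the 1.5 × IQR fences, and the whisker ends) can be sketched as follows. The quartile convention (`method="inclusive"`), the function name `box_plot_summary`, and the sample values are assumptions made for illustration; plotting packages may use slightly different quartile definitions.

```python
import statistics


def box_plot_summary(values):
    """Quartiles, mean, 1.5 x IQR whisker ends, and outliers of a sample."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme observations that lie inside the fences
    inside = [v for v in values if lower_fence <= v <= upper_fence]
    outliers = sorted(v for v in values if v < lower_fence or v > upper_fence)
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "mean": statistics.fmean(values),
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }


# Hypothetical cutting speeds in mm/s, for illustration only
speeds = [1.42, 1.8, 1.95, 2.0, 2.1, 2.2, 2.3, 2.45, 2.5, 2.59]
print(box_plot_summary(speeds))
```

Comparing the `mean` and `median` entries shows whether the mean lies below the median line, as reported for the cutting speed data.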

Investigating the linear relationship among the input and output parameters is significant before deploying the design space to construct the model by AI algorithms. By evaluating the relationship between input and output variables, the extent of dependence of the input variables on the output variable can be assessed, enabling the removal of strongly dependent input variables. A heat map representing the Pearson correlation coefficient among the variables of WEDM is shown in Fig. 4. The correlation values computed between the variables lie from −0.59 to 0.64 indicating the absence of strong linear dependence among the variables. Thus, the identified input variables of WEDM are independent of each other and can be deployed for constructing the AI-based models.

Fig. 4

Pearson correlation coefficient-based heat map constructed for the variables of the WEDM process

3.2 Development of AI-based models

This section details the development of the AI-based modelling algorithms (ANN, SVM, and ELM) trained in this work. The data were taken from experiments conducted on the stainless clad steel material in WEDM. The Valext test is conducted on the trained AI models, and their modelling performance is compared to select the better AI model. A detailed discussion of model development, sensitivity analysis, and parametric optimization is provided in the subsections of Section 3.

3.2.1 Artificial neural network (ANN) model

ANN is one of the innovative modelling algorithms of AI and is used in various engineering areas, including automobiles, development trade, manufacturing, and energy [30, 32, 33]. ANN is a function approximation algorithm and can construct the effective functional mapping among the variables. The ANN model has three layers (input, hidden and output). All the input variables used in this study are incorporated into the input layer that receives and then transmits the data to the next layer. The general working schematic of the ANN for the WEDM process is shown in Fig. 5.

Fig. 5

Schematic for the working of the ANN model

After being processed, the data move from the input layer of the ANN to the hidden layer, which is the primary processing layer in ANNs. Two design decisions define the hidden-layer architecture. First, the number of neurons in the hidden layer has to be chosen. It is common practice to set the number of hidden layer neurons between one and two and a half times the number of neurons in the input layer [34]. Second, the number of hidden layers to be included in the architecture of the ANN model has to be chosen. The complexity of the system to be modeled determines how many processing layers (one, two, or more) are selected. However, one hidden layer can be enough to ensure reasonable modelling performance, provided that an adequate number of neurons is used [35, 36].

The information compiled at the hidden layer is then sent to the output layer. The information is processed at the output layer, and a value is simulated for the output variable. The error between the actual and the model-simulated response is calculated, and error backpropagation tunes the parameters (weights and biases) to bring the model-simulated response closer to the actual value. The mathematical model of the ANN can be written as:

$${\textrm{Y}}_{\textrm{i}}={f}_2\left(\sum {W}_2\left[{f}_1\left(\sum {X}_i{W}_1+{b}_1\right)\right]+{b}_2\right)$$
(5)

ANN’s output layer value is Yi, and Xi is the input vector, where i = 1,2,3,..., N. ANN’s hidden and output layers have biases (b1, b2) and activation functions (f1, f2), respectively. W1, W2 are the weight matrices that connect the input to the hidden layer and the hidden to the output layer of ANN, respectively.
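A minimal sketch of the forward pass in Eq. (5) for a single-output network is given below. The choice of a tanh hidden activation (f1) and a linear output activation (f2), as well as all weights and inputs, are assumptions made for illustration; the study does not state these details here.

```python
import math


def ann_forward(x, W1, b1, W2, b2, f1=math.tanh, f2=lambda z: z):
    """Single-hidden-layer forward pass, Eq. (5).

    x  : list of input values
    W1 : one row of input weights per hidden neuron
    b1 : one bias per hidden neuron
    W2 : one output weight per hidden neuron
    b2 : scalar output bias
    """
    # Hidden layer: f1 applied to the weighted sum of inputs plus bias
    hidden = [
        f1(sum(w * xi for w, xi in zip(row, x)) + bias)
        for row, bias in zip(W1, b1)
    ]
    # Output layer: f2 applied to the weighted sum of hidden outputs plus bias
    return f2(sum(w * h for w, h in zip(W2, hidden)) + b2)


# Hypothetical two-input, three-neuron network, for illustration only
W1 = [[0.2, -0.4], [0.7, 0.1], [-0.3, 0.5]]
b1 = [0.0, 0.1, -0.2]
W2 = [0.6, -0.1, 0.8]
b2 = 0.05
print(ann_forward([1.0, 2.0], W1, b1, W2, b2))
```

During training, the weights and biases passed to this function are the quantities adjusted by error backpropagation.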

In this work, a three-layered shallow ANN model is constructed for modelling the cutting speed of the WEDM process on the eight input variables. A data-split ratio of 0.8, 0.1, and 0.1 is used for the training, testing, and validation phases of model development, respectively. The ANNs are trained with the number of hidden layer neurons ranging from 8 to 20, which corresponds to one to two and a half times the eight input neurons. The Levenberg-Marquardt method and the sum-of-squared error are used to optimize the parameters of the ANN algorithm. The modelling performance of the trained ANNs is evaluated on the performance metrics R2, RMSE, and MAE.

Fig. 6 shows the modelling performance of the ANN models during the training, testing, and validation phases as the number of hidden layer neurons is varied from eight to twenty. Each trained ANN has a strong prediction performance, with an R2 value greater than 0.8 in all three phases. Furthermore, the ANN model with fifteen neurons in the hidden layer has comparatively better R2 values, i.e., R2_train = 0.99, R2_test = 0.99, and R2_val = 0.98. With respect to RMSE and MAE, the ANN model with fifteen hidden layer neurons also has the lowest modelling errors: RMSE_train = 0.01 mm/s, RMSE_test = 0.04 mm/s, RMSE_val = 0.02 mm/s and MAE_train = 0.005 mm/s, MAE_test = 0.04 mm/s, MAE_val = 0.02 mm/s. Thus, fifteen is suggested as the best number of hidden layer neurons for the ANN, since Fig. 6(a–c) shows that the trained ANN with fifteen hidden layer neurons performs considerably better in the training, testing, and validation phases.

Fig. 6

The modelling performance of ANN constructed on varying hidden layer neurons, i.e., 8 to 20 in a training, b testing, and c validation phase. The ANN model with 15 hidden layer neurons has comparatively better performance metrics for modelling the cutting speed of WEDM

3.2.2 Support vector machine (SVM)

SVM is an AI-based modelling algorithm used to construct a model between the input and output variables. It is also used for the classification and approximation of non-linear and complex data [37, 38]. In its basic form, SVM divides the input data into two separate classes by hyperplanes. There are fundamentally two bounding hyperplanes in the SVM, i.e., the positive hyperplane and the negative hyperplane, and a third one between them that is referred to as the maximum margin hyperplane, as shown in Fig. 7. Every data point lying on the positive or negative hyperplane has some distance from the maximum margin hyperplane, which is taken as the ‘margin’. A larger margin gives better convergence and generalization of the SVM. The structural risk minimization (SRM) principle seeks to reduce the upper bound of the generalization error for the response variable when developing an efficient SVM model. The non-linear interactions between the output and the input variables are extracted after normalizing the training data and transforming them into a robust feature space. Satisfying the Karush-Kuhn-Tucker (KKT) conditions results in the global optimum of the output variable.
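The geometric notion of the margin can be illustrated with a short sketch for a linear separator w·x + b = 0, where the positive and negative hyperplanes are taken as w·x + b = +1 and w·x + b = −1. This canonical scaling, the function names, and the numerical values are assumptions for illustration and are not taken from the study.

```python
import math


def hyperplane_distance(point, w, b):
    """Signed distance from a point to the hyperplane w.x + b = 0."""
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, point)) + b) / norm_w


def margin_width(w):
    """Distance between the hyperplanes w.x + b = +1 and w.x + b = -1."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))


# Hypothetical separator in two dimensions, for illustration only
w, b = [2.0, 0.0], -1.0
support_point = [1.0, 0.0]  # lies on the positive hyperplane (w.x + b = 1)
print(hyperplane_distance(support_point, w, b), margin_width(w))
```

A point on the positive hyperplane sits at half the margin width from the maximum margin hyperplane, so a smaller norm of w corresponds to a wider margin.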

Fig. 7

Schematic for the working of the SVM model

Epsilon, kernel scale, and box constraint are the three significant hyperparameters tuned for the SVM modelling algorithm in this work. The hyperparameters can take values between 0.001 and 1000 for the kernel scale, 0.20859 and 20859.15 for the epsilon, and 0.001 and 1000 for the box constraint. A grid search algorithm is employed to methodically explore various combinations of hyperparameter values, and the model’s performance is assessed using the MAE. To identify the optimal hyperparameter combination, a Bayesian optimizer with the expected-improvement-per-second-plus acquisition function is utilized. This iterative process is repeated for 30 iterations within a 5-fold cross-validation training setup.
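The search procedure can be sketched in outline as a log-spaced grid over the stated hyperparameter ranges, combined with a 5-fold partition of the 30 training observations. This is only a sketch under stated assumptions: the grid resolution, the helper names (`log_grid`, `k_fold_indices`, `grid_search`), and the placeholder scoring function are hypothetical, and the Bayesian optimizer itself is not reproduced here. In practice the scoring function would train an SVM on each training fold and return the mean validation MAE.

```python
import itertools
import math


def log_grid(low, high, num):
    """Return num logarithmically spaced values from low to high."""
    lo, hi = math.log10(low), math.log10(high)
    step = (hi - lo) / (num - 1)
    return [10 ** (lo + i * step) for i in range(num)]


def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        val = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, val
        start += size


def grid_search(grid, cv_score):
    """Return the hyperparameter combination with the lowest score."""
    best = min(itertools.product(*grid.values()), key=cv_score)
    return dict(zip(grid.keys(), best))


# Ranges as stated in the text; 7 points per axis is an assumption
grid = {
    "kernel_scale": log_grid(0.001, 1000, 7),
    "epsilon": log_grid(0.20859, 20859.15, 7),
    "box_constraint": log_grid(0.001, 1000, 7),
}


def placeholder_cv_mae(combo):
    """Hypothetical stand-in for the 5-fold cross-validated MAE of an SVM."""
    kernel_scale, epsilon, box_constraint = combo
    return abs(math.log10(kernel_scale)) + epsilon + abs(math.log10(box_constraint) - 1)


print(grid_search(grid, placeholder_cv_mae))
print([len(val) for _, val in k_fold_indices(30, 5)])
```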

Ten SVM models are trained under rigorous hyperparameters optimization, and the performance metrics of the models are compared to select a better-performing SVM model. Fig. 8 depicts the performance metrics, i.e., R2, MAE, and RMSE calculated for the trained SVM models. Closely observing the modelling performance of the trained SVM models, it is found that a better-performing SVM model has R2 value of 0.87, MAE of 0.02 mm/s, and RMSE of 0.04 mm/s. The performance metrics of the considered SVM model are better (higher R2 and lower MAE and RMSE) than those of the remaining nine SVM models. However, the trained SVM models are deployed to undergo the Valext test.

Fig. 8

Modelling performance of SVM models trained for the cutting speed of WEDM process

3.2.3 Extreme learning machine (ELM)

The ELM is a single hidden layer feedforward neural network utilized for both regression and classification tasks. As shown in Fig. 9, the input weights, hidden layer biases, and number of hidden nodes are all predetermined and assigned before training. The output weights are computed using an approach that undergoes minimal changes during the iteration process [39]. The main objective of ELM is to achieve superior generalization by reducing both the training error and the norm of the output weights. Consequently, the smaller the norm of the output weights, the better the generalization performance of the network tends to be [40]. In practical applications, the ELM model is trained first and then used for prediction. The training process involves assembling the actual outcomes and their relevant factors into data sets, which are then used for training the ELM model.
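A minimal sketch of ELM training is shown below: the input weights and hidden biases are drawn at random and kept fixed, and only the output weights are solved for. The ridge-regularized least-squares solution used here is one common way to reduce both the training error and the norm of the output weights; the tanh activation, the ridge value, the random ranges, and all function names are assumptions for illustration rather than the study's implementation.

```python
import math
import random


def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def hidden_layer(x, W, b):
    """Hidden layer outputs for one sample (tanh activation assumed)."""
    return [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + bias)
        for row, bias in zip(W, b)
    ]


def train_elm(X, y, n_hidden, ridge=1e-6, seed=0):
    """Fix random input weights and biases, then solve for the output weights."""
    rng = random.Random(seed)
    n_inputs = len(X[0])
    W = [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)] for _ in range(n_hidden)]
    b = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
    H = [hidden_layer(x, W, b) for x in X]
    # Regularized normal equations: (H^T H + ridge * I) beta = H^T y
    A = [
        [sum(h[j] * h[k] for h in H) + (ridge if j == k else 0.0) for k in range(n_hidden)]
        for j in range(n_hidden)
    ]
    rhs = [sum(h[j] * yi for h, yi in zip(H, y)) for j in range(n_hidden)]
    beta = solve_linear_system(A, rhs)
    return W, b, beta


def predict_elm(model, x):
    """Predict the response for one sample with a trained ELM."""
    W, b, beta = model
    return sum(bj * hj for bj, hj in zip(beta, hidden_layer(x, W, b)))


# Hypothetical two-input data, for illustration only
X = [[a / 2, c / 2] for a in range(3) for c in range(3)]
y = [x1 + 2 * x2 for x1, x2 in X]
model = train_elm(X, y, n_hidden=10, seed=1)
print(predict_elm(model, [0.5, 0.5]))
```

Because no iterative weight updates are needed, training reduces to a single linear solve, which is why ELM models with different hidden layer sizes can be compared cheaply.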

Fig. 9

Schematic for the working of the ELM model

In this work, ELM models were constructed with the number of hidden layer neurons ranging from 5 to 30. The performance of these models was assessed based on how accurately they predicted the cutting speed of WEDM, with R2, MAE, and RMSE serving as the evaluation measures. The performance of the developed ELM models is shown in Fig. 10. It was found that the ELM model with fifteen hidden layer neurons had the best performance, as indicated by its R2 value of 0.95 and its MAE and RMSE values of 0.02 mm/s and 0.021 mm/s, respectively. These well-performing models were then validated externally.

Fig. 10

Modelling performance of ELM models trained for the cutting speed of WEDM process

3.3 External validation test

The trained AI models (ANN, SVM, and ELM) are subjected to an external validation test that involves gathering a comprehensive dataset covering all practical operating regimes of the system under evaluation. This dataset is unseen by the trained networks, providing a reliable way to evaluate their predictive capabilities. Further details of the test are provided in the subsequent subsections. Determining the ideal number of neurons in the hidden layer is essential for creating an efficient function approximator with an ANN model. To do this, many ANNs are trained and tested using various hidden layer neuron counts. The trained ANNs are then subjected to an independent validation test to determine the number of neurons that yields the best outcomes.

The trained ANNs are subjected to a Valext test with hidden layer neuron numbers ranging from 8 to 20. The performance of the ANNs is evaluated based on statistical measures, as shown in Fig. 11. All ANNs exhibit strong prediction capabilities with an R2 value of at least 0.85. Among them, the ANN with fifteen hidden layer neurons yields the lowest error values of 0.03 mm/s and 0.03 mm/s for MAE and RMSE, respectively. When applied to a Valext dataset, this ANN achieves an R2 value of 0.90. Figs. 6 and 11 indicate that the ANN with fifteen hidden layer neurons performs significantly better than the others in the Valext test with improved performance metrics. Therefore, the optimal number of hidden layer neurons for accurately predicting WEDM cutting speed with reasonable performance metrics is determined to be fifteen.

Fig. 11

Predictive performance of ANN models in external validation test

3.3.1 SVM’s external validation

The prediction accuracy of the trained SVM models is assessed using the same Valext dataset. The performance measures R2, MAE, and RMSE are calculated and presented in Fig. 12. Across the developed SVM models, R2 ranges from 0.21 to 0.36, MAE from 0.06 mm/s to 0.07 mm/s, and RMSE from 0.07 mm/s to 0.09 mm/s. The SVM model with the comparatively better performance has the following metrics: R2 = 0.362, RMSE = 0.07 mm/s, MAE = 0.07 mm/s. The R2 value for the Valext dataset is low, indicating only fair prediction performance of the developed model.

Fig. 12

External validation test of cutting speed of WEDM using SVM model

3.3.2 ELM’s external validation

The prediction accuracy of the trained ELM models is assessed with the identical Valext dataset, and the performance measures R2, RMSE, and MAE are computed. Across the ELM models, R2 ranges from 0.85 to 0.94, MAE from 0.06 to 0.47 mm/s, and RMSE from 0.09 to 0.69 mm/s. A close comparison of the performance metrics shows that the ELM model with fifteen hidden layer neurons has comparatively better R2, MAE, and RMSE than the other ELM models. Its performance metrics are as follows: R2 = 0.88, RMSE = 0.0925 mm/s, MAE = 0.0619 mm/s. Figure 13 shows a graphical comparison of the ELM-predicted speed values.

Fig. 13

Predictive performance of ELM models in external validation test

3.3.3 Performance evaluation and optimum model selection

Figure 14 shows the comparative performance metrics of the trained AI models, i.e., ANN, SVM, and ELM, in predicting the dataset of the Valext test. In Fig. 14, the R2 metric is compared for the predictions of the ANN, SVM, and ELM models on the Valext dataset. The performance metrics for ANN are R2 = 0.903, RMSE = 0.03 mm/s, and MAE = 0.03 mm/s, whereas those for SVM are R2 = 0.36, RMSE = 0.07 mm/s, and MAE = 0.07 mm/s. For ELM, the performance metrics are R2 = 0.8838, RMSE = 0.09 mm/s, and MAE = 0.06 mm/s.

Fig. 14

Comparison of predictive performance of different AI-based models

When comparing the prediction performance of the ANN, SVM, and ELM models, it is clear that the ANN has the highest R2 value of 0.90, greater than that of ELM (0.88) and SVM (0.36). Additionally, the MAE (ANN_MAE = 0.03 mm/s < ELM_MAE = 0.06 mm/s < SVM_MAE = 0.07 mm/s) and RMSE (ANN_RMSE = 0.03 mm/s < SVM_RMSE = 0.07 mm/s < ELM_RMSE = 0.09 mm/s) of the ANN-based predictions are lower than those of SVM and ELM. As a result, the ANN is chosen for the subsequent analyses, which are presented in the sections below.
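The comparison above can be reproduced with a short sketch that defines the three metrics in their standard form and then ranks the models using the values reported in the text. The function names and the tiny check dataset are illustrative assumptions; the metric values in the dictionary are those quoted above for the Valext dataset.

```python
import math

def r2(y_true, y_pred):
    """Coefficient of determination."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

# Tiny made-up example, only to show how the metrics are called.
y_true, y_pred = [1.0, 2.0, 3.0], [1.0, 2.0, 4.0]
print(r2(y_true, y_pred), mae(y_true, y_pred), rmse(y_true, y_pred))

# Metrics reported in the text for the external validation dataset
# (errors in the units used in the text).
reported = {
    "ANN": {"R2": 0.903, "MAE": 0.03, "RMSE": 0.03},
    "SVM": {"R2": 0.36, "MAE": 0.07, "RMSE": 0.07},
    "ELM": {"R2": 0.8838, "MAE": 0.06, "RMSE": 0.09},
}

best_by_r2 = max(reported, key=lambda m: reported[m]["R2"])
best_by_mae = min(reported, key=lambda m: reported[m]["MAE"])
best_by_rmse = min(reported, key=lambda m: reported[m]["RMSE"])
print(best_by_r2, best_by_mae, best_by_rmse)
```

All three criteria point to the same model here, so no weighting between the metrics is needed to make the selection.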

3.4 Sensitivity analysis

The first step in understanding the system is to determine how sensitive the output variable is to each input variable. The constructed ANN model is subjected to a sensitivity analysis to assess the impact of the input variables on the cutting speed of the WEDM process. During the sensitivity analysis, the input variable whose effect on the output is to be evaluated is varied systematically over its operating range (minimum to maximum value), while the other input variables are kept at a constant value (usually the variable's mean value). The total change in the output variable caused by each input variable is computed and normalized to obtain that input variable's percentage significance.
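The one-at-a-time procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions: the "total change" is taken as the span of the output over the sweep, the range midpoint stands in for the mean value of each held input, and toy_model with its coefficients and ranges is a hypothetical placeholder for the trained ANN.

```python
def sensitivity_percentages(model, ranges, baseline, n_steps=21):
    """One-at-a-time sensitivity: each input's share of the total output change."""
    changes = {}
    for name, (low, high) in ranges.items():
        outputs = []
        for i in range(n_steps):
            point = dict(baseline)  # all other inputs stay at their baseline
            point[name] = low + (high - low) * i / (n_steps - 1)
            outputs.append(model(point))
        # Assumption: "total change" is the output span over the sweep.
        changes[name] = max(outputs) - min(outputs)
    total = sum(changes.values())
    return {name: 100.0 * change / total for name, change in changes.items()}

def toy_model(p):
    """Hypothetical stand-in for the trained model (arbitrary coefficients)."""
    return 0.4 * p["Pon"] - 2.0 * p["Dw"] - 0.01 * p["SV"]

# Hypothetical operating ranges, not the ranges used in this study.
ranges = {"Pon": (1.0, 5.0), "Dw": (0.2, 0.3), "SV": (30.0, 50.0)}
# Assumption: the midpoint of each range stands in for the variable's mean.
baseline = {name: (low + high) / 2 for name, (low, high) in ranges.items()}

shares = sensitivity_percentages(toy_model, ranges, baseline)
for name, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {share:.1f}%")
```

Normalizing by the sum of the individual changes makes the percentages add up to 100, which matches the way the significance values are reported.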

Figure 15 shows the percentage significance of the input variables toward the cutting speed of the WEDM process. Pon has the highest percentage significance, i.e., 26.6%, followed by Dw and LTSS, with percentage significances of 17.3% and 16.7%, respectively. Or has the lowest percentage significance, 4.3%, toward the cutting speed of WEDM. The largest contribution of Pon is explained by the fact that it is the duration in which the discharge gap between the wire and the workpiece material is generated. This discharge gap leads to discharge heat, which melts and erodes the material. The higher the value of Pon, the larger the plasma channel produced. A larger pulse-on time therefore produces a wider plasma channel, and the discharge heat then melts and erodes the material from the workpiece. This explains the higher cutting speed at larger Pon values. As mentioned earlier, the base material is a layered material with two orientations (A and B). The input parameter Or, which takes the value A or B, shows the least percentage effect on the cutting speed of the stainless clad steel material.

Fig. 15

The percentage significance of input variables on the cutting speed of WEDM

3.5 Parametric optimization for cutting speed in WEDM

Parametric optimization is a concise and simple design approach in which the set of input variable values that gives enhanced output values is identified within the defined range of the input data. The WEDM process is essentially nonlinear, so the parametric optimization technique is well suited to maximizing the cutting speed of WEDM and determining the optimized values of the input variables. For the parametric optimization, an optimal value was determined for each input variable by examining every level of that variable. The optimized value of each input variable was then used to determine the cutting speed in WEDM for the clad material.
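A level-by-level search of this kind can be sketched as below. This is only one reading of the procedure: each variable is swept over its levels while the others stay at a baseline, the best level per variable is kept, and the combined setting is then evaluated. The function name, toy_model, its coefficients, the levels, and the baseline are hypothetical placeholders and not the trained ANN or the experimental levels.

```python
def parametric_optimum(model, levels, baseline):
    """Pick the best level of each input separately, then evaluate the combination."""
    best = {}
    for name, candidates in levels.items():
        def speed_at(value, name=name):
            point = dict(baseline)  # other inputs stay at their baseline
            point[name] = value
            return model(point)
        best[name] = max(candidates, key=speed_at)
    best_point = {**baseline, **best}
    return best, model(best_point)

def toy_model(p):
    """Hypothetical stand-in for the trained model (arbitrary coefficients)."""
    return 0.4 * p["Pon"] - 2.0 * p["Dw"] - 0.01 * p["SV"]

# Hypothetical levels and baseline, for illustration only.
levels = {"Pon": [3, 4, 5], "Dw": [0.2, 0.25, 0.3], "SV": [30, 40, 50]}
baseline = {"Pon": 4, "Dw": 0.25, "SV": 40}

best_levels, best_speed = parametric_optimum(toy_model, levels, baseline)
print(best_levels, round(best_speed, 3))
```

A per-variable search like this evaluates only the sum of the level counts rather than their product, at the cost of ignoring interactions between variables; a full grid or a global optimizer would be needed if those interactions are strong.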

Figure 16 shows the optimized values of the input variables determined by the parametric optimization analysis for maximizing the cutting speed of the WEDM process. The optimized values of the input variables are as follows: Or = 0, LTSS = 2 mm, LTMS = 6 mm, Dw = 0.2 mm, SV = 30 V, Fw = 4 (60 mm/s), Pr = 0.7, and Pon = 5 μs. The corresponding cutting speed of the WEDM process is 2.90 mm/min, which is 27.3% higher than the maximum speed achieved in the actual dataset. The optimized values of the input variables also comply with the domain knowledge of the WEDM process [25].
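As a quick check of the reported improvement, and assuming the 27.3% is expressed relative to the highest cutting speed in the experimental dataset, the implied dataset maximum can be back-calculated from the optimized speed. The symbols $v_{\mathrm{opt}}$ and $v_{\max}$ are introduced here for illustration only.

```latex
\[
v_{\max} = \frac{v_{\mathrm{opt}}}{1 + 0.273}
         = \frac{2.90\ \text{mm/min}}{1.273}
         \approx 2.28\ \text{mm/min}
\]
```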

Fig. 16

Parametric optimization analysis for maximizing the cutting speed of the WEDM process. The optimized values of input variables are determined

4 Conclusions

Conventional process modelling and analytical techniques used to analyze the performance of manufacturing systems may lead to suboptimal results. Herein, we have presented a generic AI-based process analysis framework for the manufacturing sector that contains the key analysis stages and explains how to carry out the analysis at each stage to enhance the performance of the system under consideration. The suggested analysis framework is applied to the WEDM system to maximize the cutting speed of the WEDM process for the stainless cladding steel material. The conclusions of this case study are summarized as follows:

  1. Rigorous hyperparameter tuning is carried out to train the AI models, i.e., ANN, SVM, and ELM, and their modelling performance is computed. Among the developed models, the ANN model performed comparatively better in predicting the cutting speed of WEDM for the stainless cladding steel. An R2 value of 0.90, an MAE of 0.03 mm/s, and an RMSE of 0.03 mm/s are obtained for the Valext dataset, demonstrating the excellent generalization capability of the trained ANN model.

  2. The ANN model performed comparatively better: the R2 values of the SVM and ELM models are 59.9% and 2.19% lower than that of the ANN model, respectively.

  3. The sensitivity analysis has been conducted to establish the order of significance of the variables for the WEDM process. It shows that Pon is the most significant factor, contributing 26.6% towards the cutting speed of the stainless clad steel material, followed by Dw and LTSS, with percentage significances of 17.3% and 16.7%, respectively.

  4. The variable Or has the least significant effect, 4.3%, on the cutting speed of the clad material in the WEDM process. However, the layered material gave the highest cutting speed when the orientation of the material was set with the stainless steel over the cast iron.

  5. The parametric optimization technique is applied to determine the values of the input variables that correspond to the maximum cutting speed of the WEDM process. The optimized values of the input variables are as follows: Or = 0, LTSS = 2 mm, LTMS = 6 mm, Dw = 0.2 mm, SV = 30 V, Fw = 4 (60 mm/s), Pr = 0.7, and Pon = 5 μs.

  6. The cutting speed in WEDM obtained at the optimal parameters during the parametric optimization is 27.3% higher than the maximum cutting speed achieved during the cutting of the clad material.

  7. The application of AI-based modelling techniques is an emerging area of research in the manufacturing sector. The suggested machine learning models will assist the manufacturing sector in planning, amending, and analyzing machining processes prior to the actual cutting and machining activities. Adopting the suggested framework, which incorporates an AI-based model and an optimization technique, may help to increase productivity, improve quality, and reduce material, financial, and time waste.

This study covers the use of AI-based modelling techniques integrated within an optimization environment to determine the optimum operating conditions for maximizing the cutting speed of the WEDM process. In the future, different materials and wider operating ranges will be investigated to further generalize the use of the developed AI-based modelling and optimization framework. Model-based control of the process will also be investigated.