1 Introduction

The dicing saw is the core machine in the back-end packaging stage of chip manufacturing. It works on the scribing principle of strong grinding: an air-bearing spindle, which serves as the executing element, drives the blade at high speed to cut workpieces made of various materials. Because the dicing saw uses a physical processing method, it can handle most workpieces and has little impact on the edges of the processed part.

With the rapid development of modern technology, electronic products iterate very quickly, which has created new market requirements for packaging precision in the semiconductor industry. To produce smaller and more accurate workpieces, the semiconductor manufacturing industry must constantly adjust and optimize the cutting process [1]. Besides improving workpiece quality, production cost must also be considered, and companies usually reduce cost by narrowing the dicing street [2], which poses a new challenge for the dicing machine. There are many criteria for judging the processing quality of a dicing saw, such as the maximum chipping width generated by the scribing process [3], the maximum chipping area, and the width of the cutter marks. If the maximum chipping width is too large, the internal circuitry of the workpiece is damaged and the workpiece must be scrapped [4, 5]. Therefore, this paper uses the maximum chipping width as the index of cutting quality.

In this study, we examined the factors (including the dicing machine, the workpiece, and the environment) that can affect cutting quality during dicing. Based on expert experience, six parameters were selected for the experiment, with five level values for each parameter, giving an orthogonal table of six factors and five levels. After the experiment, the results are analyzed and a machine learning model is fitted to the experimental data to capture the relationship between the cutting parameters and the cutting quality. A genetic algorithm is then run on this model to find the combination of cutting parameters that gives the best cutting quality. Finally, the optimal parameter combination is applied to the dicing saw in a cutting test to judge whether the expected optimization effect is achieved. The main factors affecting the cutting quality in the scribing process of the dicing saw and their brief explanations are shown in Table 1. The parameter optimization process is shown in Fig. 1.

Table 1 Influencing factors and their brief explanations
Fig. 1 Parameter optimization flow chart

Optimizing the scribing process parameters of dicing saws has been studied intensively by many scholars. Su et al. [8] established a three-factor, two-level orthogonal table using blade spindle revolution, blade depth, and cutting speed, and studied the relationship between the maximum chipping width and the cutting parameters of the dicing saw. They used the most basic parameters to cut a 60 μm dicing street. Although the cutting work was completed, the maximum chipping width reached 41–47 μm with a blade thickness of 27–32 μm, that is, about 1.4 times the blade thickness. In another article [5], Su et al. used a BP neural network to predict the cutting quality of the dicing saw. The model took blade cutting distance, blade exposure, blade wear, and blade loading as inputs, was verified with five groups of test data, and achieved 75% accuracy on a small amount of test data. Kayabasi et al. [9] built a neural network model to predict the cut quality of a wire saw from spool speed, z-axis speed, and the oil ratio in the coolant slurry, and achieved good results.

The above research on cutting process parameters and cutting quality has theoretical significance and is one of the foundations of this paper. Building on these studies, we expanded the research object and used new research methods to find better cutting parameters. Starting from actual cutting requirements, this paper considers the common cutting parameters of dicing saws, so the results can be applied directly to similar products to improve the cutting quality of dicing saws.

This section has presented the research background, research method, and research purpose of this paper and has outlined the research process. Section 2 describes the experimental methods, including the material basis, the data conditions, and the specific process of the experiment. Section 3 gives the experimental results, applies the cutting parameters found to the dicing saw for verification, and discusses the results in comparison with those of other scholars. Section 4 presents the conclusion and summarizes possible extensions of this work.

2 Experiments and methods

2.1 Orthogonal experimental design

Orthogonal experimental design is one of the most commonly used design methods for multi-factor experiments. It can be set up with different numbers of factors and levels according to the actual situation, and it supports statistical analysis of the results of multi-factor experiments.

An orthogonal table is designed to minimize the number of experiments while still providing information comparable to that of a full factorial experiment. For example, a six-factor, five-level experiment requires \(5^{6} = 15{,}625\) trials if the traditional full factorial method is used, whereas only 25 trials are needed with the six-factor, five-level orthogonal table design. Therefore, an orthogonal experiment design can significantly reduce the time and cost of the experiment while still capturing the information about the factors affecting the cutting quality of the dicing saw at minimal cost [10].
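The trial counts above can be checked with a short sketch. It builds one possible \(L_{25}(5^{6})\) array from linear forms modulo 5 and verifies the balance property; this is a standard textbook construction used here as an assumption, and its row and column layout need not match the table actually used in this paper. Levels are coded 0 to 4, and the function names are illustrative.

```python
from itertools import combinations, product

LEVELS = 5   # levels per factor
FACTORS = 6  # number of factors


def l25_orthogonal_array():
    """Build an L25(5^6) array from linear forms over the integers modulo 5."""
    rows = []
    for a, b in product(range(LEVELS), repeat=2):
        # Columns: a, b, a+b, a+2b, a+3b, a+4b (all modulo 5).
        row = [a, b] + [(a + k * b) % LEVELS for k in range(1, LEVELS)]
        rows.append(row)
    return rows


def is_orthogonal(rows):
    """True if every pair of columns contains each level pair exactly once."""
    for i, j in combinations(range(FACTORS), 2):
        pairs = {(r[i], r[j]) for r in rows}
        if len(pairs) != LEVELS * LEVELS:
            return False
    return True


array = l25_orthogonal_array()
full_factorial_runs = LEVELS ** FACTORS
print(len(array), full_factorial_runs, is_orthogonal(array))  # 25 15625 True
```

Because any two columns are linearly independent forms in (a, b), each pair of levels appears together exactly once, which is what lets 25 runs stand in for the 15,625 runs of the full factorial design.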

The dicing saw used in the experiments is the ADT8230 automatic dicing saw manufactured by ADT, Israel, as shown in Fig. 2. The scribing principle of the dicing saw is shown in Fig. 3, and the scribing position is shown in Fig. 4. The high-speed rotating blade cuts through the reserved dicing street on the workpiece but not through the tape at the bottom of the workpiece, so the separated chips remain attached to the tape.

Fig. 2 Israeli ADT8230 automatic dicing saw

Fig. 3 Schematic diagram of scribing and cutting

Fig. 4 Schematic diagram of scribing position

During the scribing process, the blade keeps rotating at high speed in the same position, and the workpiece is moved by the chuck table beneath it to complete the scribing work. According to the actual scribing process conditions of the Israeli ADT8230 dicing saw, six main factors affecting the cutting quality of the dicing saw at room temperature are selected to design an orthogonal table. Five level values are selected for each factor, and the resulting factor levels are shown in Table 2 (factors are the characteristics that can affect the test results, and levels are the different values that can be set for each factor in the test).

Table 2 Table of factor levels of cutting parameters

According to the principle of orthogonal experiment design, this paper uses an \(L_{25} (5^{6} )\) orthogonal table to complete the experiment, and the results obtained are shown in Table 3. The experiment contains 25 groups of slicing schemes, corresponding to 25 rows of slicing parameters in the orthogonal table. In order to exclude the uncertainty caused by chance factors, each group of scribing experiments was repeated three times and the average value was taken as the final experimental result. The average values of each group are shown in the last column of Table 3.

Table 3 Results of the orthogonal experiment

The experimental environment is as follows: the laboratory room temperature is 20 ℃, and the dicing saw is the ADT8230 automatic dicing saw produced by ADT, Israel. The blade thickness is 35 μm and the blade exposure is 940 μm. The workpiece is a 12-inch standard wafer. The blades used in the cutting experiments were all new, which excludes the possibility of greater chipping caused by blade wear [11]. The experiments were completed in single-blade mode [12], and the blade was not changed during cutting, so blade changes could not affect the experimental results.

2.2 Genetic algorithm

2.2.1 Genetic algorithm

A genetic algorithm is an evolutionary algorithm that follows the mechanisms of natural selection and genetics in Darwinian biological evolution. It was proposed by Professor J. Holland of the University of Michigan in the 1970s as a method that searches for optimal solutions by simulating the process of biological evolution.

The genetic algorithm combines Darwin's evolutionary theory with modern computer coding theory to simulate, by computer, the reproduction, crossover, mutation, and elimination that occur in the evolution of natural organisms. The evolutionary principle of "survival of the fittest" is followed in the iterative process. The flow chart of the algorithm is shown in Fig. 5.

Fig. 5 Flow chart of genetic algorithm

To prevent the loss of excellent parental individuals during crossover and selection, a selection method based on a binary union mating pool is used in this paper [13]. The progeny population produced by the crossover and mutation operations and the parental population are placed in the same mating pool for selection.

To enhance the random search capability of the algorithm, this paper uses roulette selection to choose individuals. The probability \(P(x_{i})\) of a single individual being selected is proportional to the value of its fitness function. Here, Iterator represents the current number of completed iterations and N represents the predefined number of iterations.

$$P(x_{i}) = \frac{f(x_{i})}{\sum\nolimits_{j = 1}^{N} f(x_{j})}$$
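A minimal sketch of fitness-proportional (roulette) selection over a combined pool of parents and offspring is given below. The individuals, the predicted chipping widths, and the conversion from a width to be minimized into a non-negative fitness (worst width minus own width plus a constant) are all illustrative assumptions and are not taken from this paper.

```python
import random


def selection_probabilities(fitness_values):
    """P(x_i) = f(x_i) / sum_j f(x_j); fitness values must be non-negative."""
    total = sum(fitness_values)
    if total <= 0:
        raise ValueError("fitness values need a positive sum")
    return [f / total for f in fitness_values]


def roulette_select(pool, fitness_values, k, rng):
    """Draw k individuals with probability proportional to fitness."""
    probs = selection_probabilities(fitness_values)
    return rng.choices(pool, weights=probs, k=k)


# Hypothetical individuals: parents and offspring share one mating pool.
parents = ["p1", "p2"]
offspring = ["c1", "c2"]
pool = parents + offspring

# Hypothetical predicted maximum chipping widths (smaller is better).
widths = [42.0, 40.0, 39.0, 41.0]
# Assumed transformation so that a smaller width gives a larger fitness.
fitness = [max(widths) - w + 1.0 for w in widths]

probs = selection_probabilities(fitness)
selected = roulette_select(pool, fitness, k=2, rng=random.Random(0))
print(probs)
print(selected)
```

With these numbers the individual with the smallest width, c1, receives the largest selection probability, while every individual keeps a non-zero chance of surviving.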

2.2.2 Determination of the fitness function

The fitness function plays a decisive role in the genetic algorithm and is the "brain" of the whole algorithm. A good fitness function is a fundamental guarantee that better individuals are kept and worse ones are eliminated. In this paper, we use the regression model fitted to the orthogonal experiment data as the fitness function of the genetic algorithm, i.e., the optimization goal of the genetic algorithm is to minimize the maximum chipping width.

3 Results and discussion

3.1 Experimental results and regression modeling

The data in Table 3 are analyzed using range analysis (the extreme difference method): the range of each factor is calculated, and the factors are ranked by their effect on the maximum chipping width, as shown in Table 4. The main effects diagram for this result is shown in Fig. 6.
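The extreme difference calculation can be sketched as follows: for each factor, average the response at each level and take the difference between the largest and smallest level means. The small two-level design and the response values below are made up for illustration and are not the data of Table 3.

```python
def range_analysis(design, response, n_levels):
    """Return (factor_index, range) pairs, most influential factor first."""
    n_factors = len(design[0])
    result = []
    for f in range(n_factors):
        level_means = []
        for level in range(n_levels):
            # Responses of all runs in which factor f was set to this level.
            values = [y for row, y in zip(design, response) if row[f] == level]
            level_means.append(sum(values) / len(values))
        result.append((f, max(level_means) - min(level_means)))
    return sorted(result, key=lambda item: item[1], reverse=True)


# Hypothetical L4(2^3) design (levels coded 0 and 1) and made-up responses.
design = [
    [0, 0, 0],
    [0, 1, 1],
    [1, 0, 1],
    [1, 1, 0],
]
response = [10.0, 12.0, 20.0, 22.0]

ranking = range_analysis(design, response, n_levels=2)
print(ranking)  # [(0, 10.0), (1, 2.0), (2, 0.0)]
```

A larger range means that changing the level of that factor moves the mean response more, which is the basis for a ranking such as the one in Table 4.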

Table 4 Analytical results of the orthogonal experiment
Fig. 6 Main effects diagram

As can be seen from Table 4 and Fig. 6, among the factors affecting the chipping width, the spindle speed is the most significant, followed by the blade cooling water flow and feed speed. The cutting depth, two-fluid flow, and shower water flow have less influence on the chipping width.

Based on the above data and analysis results, the following quadratic regression equation is established.

$$\begin{aligned} y & = b_{0} + b_{1} x_{1} + b_{2} x_{2} + b_{3} x_{3} + b_{4} x_{4} + b_{5} x_{5} + b_{6} x_{6} + b_{11} x_{1}^{2} + b_{12} x_{1} x_{2} \\ & \quad+ b_{13} x_{1} x_{3} + b_{14} x_{1} x_{4} + b_{15} x_{1} x_{5} + b_{16} x_{1} x_{6} + b_{22} x_{2}^{2} + b_{23} x_{2} x_{3} + b_{24} x_{2} x_{4} \\ &\quad + b_{25} x_{2} x_{5} + b_{26} x_{2} x_{6} + b_{33} x_{3}^{2} + b_{34} x_{3} x_{4} + b_{35} x_{3} x_{5} + b_{36} x_{3} x_{6} + b_{44} x_{4}^{2} \\ & \quad+ b_{55} x_{5}^{2} + b_{56} x_{5} x_{6} + b_{66} x_{6}^{2} \\ \end{aligned}$$

where \(b_{i}\) and \(b_{ij}\) are the regression coefficients to be determined, and \(x_{1} \sim x_{6}\) represent the values of spindle speed, feed speed, cutting depth, blade cooling water flow, shower water flow, and two-fluid flow, respectively. The 25 rows of data in Table 3 are used as the data set to train the regression model, and the least squares method is used to solve for the regression coefficients. The resulting coefficients of the regression model are shown in Table 5.
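The least squares step can be illustrated with a small sketch. For brevity it fits a two-factor quadratic model to noise-free synthetic data on a 5 × 5 grid rather than the six-factor model and the data of Table 3; the coefficient values, the helper names, and the column order (constant, linear, squared, then interaction terms) are assumptions made for the example.

```python
import numpy as np


def quadratic_design_matrix(X, interaction_pairs):
    """Columns: 1, x_i, x_i^2, then x_i * x_j for the requested pairs."""
    X = np.asarray(X, dtype=float)
    n_factors = X.shape[1]
    columns = [np.ones(len(X))]
    columns += [X[:, i] for i in range(n_factors)]
    columns += [X[:, i] ** 2 for i in range(n_factors)]
    columns += [X[:, i] * X[:, j] for i, j in interaction_pairs]
    return np.column_stack(columns)


def fit_least_squares(X, y, interaction_pairs):
    """Solve min ||A b - y||^2 for the coefficient vector b."""
    A = quadratic_design_matrix(X, interaction_pairs)
    coef, *_ = np.linalg.lstsq(A, np.asarray(y, dtype=float), rcond=None)
    return coef


# Synthetic data: two factors, five levels each (coded 0..4), full grid.
grid = np.array([[a, b] for a in range(5) for b in range(5)], dtype=float)
pairs = [(0, 1)]
# Hypothetical true coefficients: b0, b1, b2, b11, b22, b12.
true_coef = np.array([40.0, -2.0, 1.0, 0.5, 0.3, -0.1])
y = quadratic_design_matrix(grid, pairs) @ true_coef

coef = fit_least_squares(grid, y, pairs)
print(np.round(coef, 6))
```

On noise-free data the fit recovers the generating coefficients. With measured data, the number of coefficients should stay well below the number of runs so that the system remains overdetermined.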

Table 5 Table of regression coefficients

To check that the model is usable, we validate it with the mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), and coefficient of determination (R Squared). The values of each evaluation index are shown in Table 6. MSE and MAE indicate that the overall average error of the model stays within 0.3 µm, while R Squared, a dimensionless evaluation indicator, represents the degree to which the independent variables of the model explain the dependent variable [14]. The R Squared value of the regression model is about 0.925, which means that the model has strong explanatory power for the data in Table 3 and further supports the results of MSE and MAE.

$$MSE = \frac{1}{n}\sum\limits_{i = 1}^{n} (y_{i} - \widehat{y}_{i})^{2}$$
$$RMSE = \sqrt{\frac{1}{n}\sum\limits_{i = 1}^{n} (y_{i} - \widehat{y}_{i})^{2}}$$
$$MAE = \frac{1}{n}\sum\limits_{i = 1}^{n} \left| y_{i} - \widehat{y}_{i} \right|$$
$$R^{2} = 1 - \frac{MSE(\widehat{y}_{i}, y_{i})}{Var(y)}$$
Table 6 Model evaluation results
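The four evaluation indices can be computed directly from their definitions, as in the sketch below. The observed and predicted values are made up for illustration and are not the values behind Table 6; the variance is taken as the population variance so that it matches the 1/n convention of the MSE.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R^2 for paired observations and predictions."""
    n = len(y_true)
    errors = [yt - yp for yt, yp in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_y = sum(y_true) / n
    var_y = sum((yt - mean_y) ** 2 for yt in y_true) / n  # population variance
    return {
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": mae,
        "R2": 1.0 - mse / var_y,
    }


# Made-up observed and predicted values.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

For this toy input the MSE is 0.25, the RMSE 0.5, the MAE 0.25, and R Squared 0.8.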

3.2 Analysis of genetic algorithm results

The established regression model is used as the fitness function of the genetic algorithm (GA), and iterative optimization is performed with the GA. The specific parameters of GA are set as follows: the population size is set to 100, the crossover probability is set to 0.8, the mutation probability is set to 0.2, and the number of iterations is set to 100. maxPE (maximum evaluation times) is the product of the population size and the number of iterations.
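The loop below is a minimal sketch of a GA with the settings listed above (population 100, crossover probability 0.8, mutation probability 0.2, 100 iterations), selecting by roulette from a combined pool of parents and offspring. The real-valued encoding, arithmetic crossover, Gaussian mutation, two-parameter search space, and the stand-in objective `predicted_width` are assumptions for illustration; the fitted regression model of this paper is not reproduced here.

```python
import random

POP_SIZE, P_CROSS, P_MUT, N_ITER = 100, 0.8, 0.2, 100
BOUNDS = [(0.0, 10.0), (0.0, 10.0)]  # hypothetical parameter ranges


def predicted_width(x):
    """Hypothetical stand-in for the regression model (minimum 38 at (3, 7))."""
    return 38.0 + (x[0] - 3.0) ** 2 + (x[1] - 7.0) ** 2


def crossover(p1, p2, rng):
    """Arithmetic crossover: blend two parents with a random weight."""
    w = rng.random()
    c1 = [w * a + (1 - w) * b for a, b in zip(p1, p2)]
    c2 = [(1 - w) * a + w * b for a, b in zip(p1, p2)]
    return c1, c2


def mutate(x, rng):
    """Gaussian perturbation of one random gene, clipped to its bounds."""
    i = rng.randrange(len(x))
    low, high = BOUNDS[i]
    y = list(x)
    y[i] = max(low, min(high, y[i] + rng.gauss(0.0, 0.1 * (high - low))))
    return y


def roulette(pool, costs, k, rng):
    """Roulette selection for minimization: lower cost gives larger weight."""
    worst = max(costs)
    weights = [worst - c + 1e-9 for c in costs]
    return rng.choices(pool, weights=weights, k=k)


def run_ga(seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in BOUNDS] for _ in range(POP_SIZE)]
    best = min(pop, key=predicted_width)
    for _ in range(N_ITER):
        offspring = []
        for i in range(0, POP_SIZE, 2):
            p1, p2 = pop[i], pop[i + 1]
            if rng.random() < P_CROSS:
                c1, c2 = crossover(p1, p2, rng)
            else:
                c1, c2 = list(p1), list(p2)
            for child in (c1, c2):
                if rng.random() < P_MUT:
                    child = mutate(child, rng)
                offspring.append(child)
        pool = pop + offspring  # parents and offspring share one mating pool
        costs = [predicted_width(x) for x in pool]
        pop = roulette(pool, costs, POP_SIZE, rng)
        best = min(pop + [best], key=predicted_width)  # keep best-so-far
    return best, predicted_width(best)


best_params, best_width = run_ga(seed=0)
print(best_params, best_width)
```

Keeping the best individual found so far outside the population is a design choice of this sketch: roulette selection alone can lose the current best, and the union mating pool only reduces that risk.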

The optimization results of GA are shown in Figs. 7 and 8. Fig. 7 shows that the minimum value of the maximum chipping width is obtained when the spindle speed, feed speed, cutting depth, blade cooling water flow, shower water flow, and two-fluid flow are [55,000, 30, 200, 1.4, − 1.23, 1.1], respectively. According to Fig. 8, the minimum chipping width predicted by the model under this optimal combination of parameters is 38.0327 μm.

Fig. 7 Population iteration results

Fig. 8 Value of fitness function of GA

At the same time, in order to verify the correctness of the optimization algorithm, this paper selects Adam [15] (adaptive moment estimation) and PSO [16] (particle swarm optimization) for comparative research.

Adam is one of the most commonly used optimization algorithms in the field of deep learning, so we choose it for comparison. The specific parameters of Adam are set as follows: the number of iterations is set to 300, alpha is set to 0.3, beta1 is set to 0.9, and beta2 is set to 0.999. The optimization results of Adam are shown in Fig. 9. The optimal parameter values calculated are [55,000, 45, 200, 1.1, 1.5, 1.1], and the predicted maximum chipping width is between 39 and 40 μm. Adam's optimization process involves a large amount of computation and takes more than 5 min, a time cost hundreds of times that of GA.
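For reference, the Adam update with the settings quoted above (alpha 0.3, beta1 0.9, beta2 0.999, 300 iterations) can be sketched as follows. The objective `predicted_width`, the starting point, and the central-difference gradient are assumptions made for illustration, and the sketch says nothing about the running time reported here.

```python
import math


def predicted_width(x):
    """Hypothetical stand-in for the regression model (minimum 38 at (3, 7))."""
    return 38.0 + (x[0] - 3.0) ** 2 + (x[1] - 7.0) ** 2


def numerical_gradient(f, x, h=1e-6):
    """Central-difference estimate of the gradient of f at x."""
    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((f(xp) - f(xm)) / (2 * h))
    return grad


def adam_minimize(f, x0, alpha=0.3, beta1=0.9, beta2=0.999, n_iter=300, eps=1e-8):
    x = list(x0)
    m = [0.0] * len(x)  # first-moment (mean) estimate
    v = [0.0] * len(x)  # second-moment (uncentered variance) estimate
    for t in range(1, n_iter + 1):
        grad = numerical_gradient(f, x)
        for i, g in enumerate(grad):
            m[i] = beta1 * m[i] + (1 - beta1) * g
            v[i] = beta2 * v[i] + (1 - beta2) * g * g
            m_hat = m[i] / (1 - beta1 ** t)  # bias correction
            v_hat = v[i] / (1 - beta2 ** t)
            x[i] -= alpha * m_hat / (math.sqrt(v_hat) + eps)
    return x, f(x)


adam_params, adam_width = adam_minimize(predicted_width, [0.0, 0.0])
print(adam_params, adam_width)
```

Unlike the GA, Adam follows a single trajectory driven by gradient information, so on a non-convex model its result depends on the starting point.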

Fig. 9 Value of fitness function of Adam

Both PSO and GA are "early" algorithms proposed by scholars in the last century, and both are still frequently used optimization algorithms, so we choose PSO for comparison. The specific parameters of PSO are set as follows: the population size is set to 100, the weight ω is set to 0.5, and the learning factors c1 and c2 are both set to 1.49445. The optimization results of PSO are shown in Fig. 10. The optimal parameter values calculated are [35,000, 34.8, 40, 1.5, 1.5, 1.5], with a predicted maximum chipping width between 41 and 42 μm. The computational time cost of PSO is comparable to that of GA, but the predicted values do not achieve the expected optimization effect.
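The PSO update with the settings quoted above (100 particles, ω = 0.5, c1 = c2 = 1.49445) can be sketched as follows. The number of iterations, the search bounds, and the objective `predicted_width` are assumptions for illustration, since they are not stated here for PSO.

```python
import random


def predicted_width(x):
    """Hypothetical stand-in for the regression model (minimum 38 at (3, 7))."""
    return 38.0 + (x[0] - 3.0) ** 2 + (x[1] - 7.0) ** 2


def pso_minimize(f, bounds, n_particles=100, n_iter=100,
                 w=0.5, c1=1.49445, c2=1.49445, seed=0):
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [list(p) for p in pos]          # best position of each particle
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = list(pbest[g]), pbest_val[g]  # best position of the swarm
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = max(lo, min(hi, pos[i][d] + vel[i][d]))
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = list(pos[i]), val
                if val < gbest_val:
                    gbest, gbest_val = list(pos[i]), val
    return gbest, gbest_val


pso_params, pso_width = pso_minimize(predicted_width, [(0.0, 10.0), (0.0, 10.0)])
print(pso_params, pso_width)
```

On this smooth convex stand-in the swarm converges easily, so the sketch shows only the update rule and not the behaviour on the actual regression model.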

Fig. 10 Value of fitness function of PSO

In summary, although some scholars consider the genetic algorithm an obsolete evolutionary algorithm, as a pioneer of evolutionary algorithms it still has irreplaceable advantages in some problems.

3.3 Experimental validation

Since the optimal combination of cutting process parameters obtained by the genetic algorithm does not appear in the data in Table 3, this combination of parameters must be verified experimentally. The experimental environment is the same as in the previous experiment, and the experiment is repeated three times, with the average value taken as the final experimental result.

Cutting was completed under this set of cutting parameters, and the knife marks produced were observed with a high-power microscope, as shown in Fig. 11. The maximum chipping widths of the three knife marks are 38.73, 38.31, and 38.57 μm, with an average of 38.54 μm. This experiment shows that the optimized combination of cutting process parameters effectively reduces the chipping width generated by the scribing process. For high-precision processing equipment such as a dicing saw, even a very small amount of chipping may cause the chip to break and be scrapped.

Fig. 11 The effect of knife marks under high magnification microscope

3.4 Discussion of the results

The experimental results show that when spindle speed, feed speed, cutting depth, blade cooling water flow, shower water flow, and two-fluid flow are [55,000, 30, 200, 1.4, − 1.23, 1.1], respectively, the dicing saw completes the cutting task very well, achieving the best cutting performance found in this study and effectively improving cutting quality. With these optimal cutting parameters, the experiment was repeated three times and the three results were averaged, giving a maximum chipping width of 38.54 μm. Compared with the average maximum chipping width of 42.0013 μm in Table 3, the maximum chipping width is reduced by about 3.46 μm, an improvement of about 8.2%.

In the earlier parameter optimization of a dicing saw by Su et al. [8], blades with a thickness of 27–32 μm produced by Disco were used, and the maximum chipping width was 41 μm, about 1.4 times the blade thickness. In this experiment, we used a blade with a thickness of 35 μm produced by Zhengzhou Research Institute for Abrasives & Grinding Co., Ltd, and the maximum chipping width was kept between 38 and 39 μm, only about 1.1 times the blade thickness. We attribute this improvement to expanding the research object of Su et al. and adding a machine learning model and an intelligent optimization algorithm to assist parameter optimization. With the experimental environment kept as similar as possible, we achieved a better optimization effect.
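As a quick check, the figures quoted in this subsection can be recomputed from the rounded values given in the text. The last digit may differ slightly from values computed on unrounded data.

$$\Delta = 42.0013 - 38.54 = 3.4613\;\mu\text{m}, \qquad \frac{\Delta}{42.0013} = \frac{3.4613}{42.0013} \approx 8.2\%$$

$$\frac{38.54\;\mu\text{m}}{35\;\mu\text{m}} \approx 1.10$$

The second ratio is the maximum chipping width expressed as a multiple of the 35 μm blade thickness.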

4 Conclusion

In this paper, we study the relationship between the cutting quality and the cutting parameters of the dicing saw and find a set of cutting parameters with which the dicing saw completes the cutting task very well. Experiments verify that this set of cutting parameters achieves a maximum chipping width of about 1.1 times the blade thickness, which is an excellent result by industry standards, and effectively improves the processing quality of the dicing saw. The parameter optimization method used in this paper can provide some theoretical help for practitioners in the semiconductor manufacturing industry. The method can also be used in other engineering applications involving material processing.

To study the relationship between cutting parameters and cutting quality more accurately, a new blade was used in the experiment. This avoids the negative impact of blade wear on cutting quality, but it cannot be maintained for long in an actual industrial production environment. The spindle speed used in the experiment reached 55,000 rpm, which is very close to the blade's upper speed limit of 60,000 rpm. Therefore, this method is already very close to the theoretical limit of the physical machining mode. In the next step, we will consider other non-contact processing methods to further reduce the maximum chipping width, such as using a laser to realize invisible cutting [17].