1 Introduction

Diesel exhaust emission control has become a challenge for researchers as emission standards become more stringent throughout the world. At the same time, demand for diesel engines is increasing continuously. Thus, there is a need for a technology that can substitute for the existing conventional catalyst-based filtering systems.

Even though non-thermal plasma (NTP) treatment has been proven to be an efficient technique for diesel exhaust pollution control, the studies are still at the laboratory level. To bring it into real-time applications, prior prediction of pollutant concentrations under the treatment would be helpful. If the NOX concentrations are accurately predicted through modeling, the pollutant removal efficiencies can be estimated for those parametric variations that cannot be realized in the laboratory due to experimental limitations [1]. Researchers throughout the world have conducted experimental studies to determine the effects of various operating parameters such as applied voltage [2, 3], flow rate [4], temperature [5], residence time [6], reactor configuration [7] and electrode configuration [8]. While sufficient literature is available on the chemical kinetics of the treatment, not much work has been done to quantify the effects of the different parameters on NOX removal from the exhaust. Further, such modeling helps researchers to plan real-time applications by providing knowledge of the relation between the parameter values and NOX removal.

The multi-criteria decision-making method was used to evaluate diesel exhaust emission characteristics by Hoseinpour et al. [9], while gasoline fumigation was added to reduce pollutant emissions under different operating conditions. A plant-specific multi-year and multi-parameter coal power stack emission model has been proposed by Walvekar and Gurjar [10] using the emission factor-based approach. However, very few studies have been carried out to predict the NOX (NO + NO2) concentrations in the diesel exhaust after the NTP treatment based on the parameter values.

In most of the studies [1, 11, 12], dimensional analysis has been used for this purpose by relating the NOX concentration to two or three other operating parameters of the NTP process. The disadvantage of the models derived using this method is that the root-mean-square error (RMSE) between experimental and predicted NOX concentrations tends to be higher for experimental data other than those with which the model was trained. The reason is that only a small amount of experimental data is required to train such a model. For instance, the RMSE for the test data is 9.58 ppm in earlier studies by the authors of the present study (Allamsetty and Mohapatro [13, 14]).

Deriving a mathematical model and training it with a larger set of experimental data covering a wide range of parameter values can help in accurately predicting the pollutant concentrations for test data as well. Thus, in the present study, an effort has been made to predict the sum of NO and NO2 concentrations in diesel exhaust with NTP treatment using response surface methodology (RSM).

RSM has been used to model various types of parameters such as surface roughness and temperature in turning operations [15, 16], kerf width in laser machining [17] and bending strength of aluminum alloys [18]. It has also been used in [19–22] to predict the pollutant concentrations in the emissions from a diesel engine based on various control parameters such as engine speed, compression ratio, brake power and injection parameters.

In the present study, the sum of NO and NO2 concentrations (NO + NO2) has been predicted using an RSM-based predictive model with respect to changes in the parameters voltage (V), flow rate (Fr), temperature (T), discharge gap (Dg) and the initial sum of NO and NO2 concentrations (NO + NO2)i. The effect of each parameter on the response has been analyzed numerically to identify the dominant parameters. Values of the regression coefficients of the model have been derived, and analysis of variance has been performed on the model. Then, the model has been used to predict the values of (NO + NO2) for the test data, and these have been compared with the corresponding experimental values. The model has also been compared with a model derived using dimensional analysis with respect to its performance during testing.

2 Experimental details

The details of the experimental setup are shown in Fig. 1. The experimental setup, the experiments conducted and the experimental results are discussed in Sects. 2.1–2.3, respectively.

Fig. 1 Experimental setup

2.1 Experimental setup

The exhaust required for the experiments has been taken from a 5-kVA stationary diesel engine using a vacuum pump. The engine has been loaded with a 0–5-kW lamp load to get the required variation in (NO + NO2)i. A filtering unit comprising containers filled with steel wool and drierite (indicative type) and two cascaded particulate filters (5 µm, Make: Ultrafilter) has been used to filter the exhaust before it enters the pump. The outlet of the pump has been connected to the heated hose inlet. By the time the exhaust leaves the pump after passing through the filtering unit, it has cooled to room temperature. Thus, the exhaust has been passed through the heated hose to reach the desired temperature during the experiments. The outlet of the heated hose has been directly connected to the inlet of the plasma reactor so that the set temperature does not drop. Therefore, the flow controller has been placed at the outlet of the reactor to measure and control the exhaust flow rate. Then, the exhaust has been sent to the gas analyzer (Make: Testo, Model: Testo-350) to measure the concentrations of (NO + NO2) before and during the treatment. The concentrations of NO and NO2, along with the concentrations of CO, CO2 and O2, before the treatment at different values of (NO + NO2)i are presented in Table 1.

Table 1 Exhaust composition before treatment

The high voltage applied to the plasma reactor during the NTP treatment has been generated and varied using a high-voltage AC test set (0–30 kV, 50 Hz, Make: Rectifiers & Electronics). This high voltage has been measured using a voltage divider (2000:1 ± 5%, Make: IWATSU, Model: HV-P60A, DC to 50 MHz, within − 3 dB) and a digital storage oscilloscope (Make: RIGOL, Model: DS 1074: 70 MHz).

The inner and outer diameters of the reactor are 15 mm and 17 mm, respectively. A layer of aluminum foil is wrapped over the reactor, between the inlet and outlet, for a length of 280 mm to form the ground electrode. This length can be described as the effective discharge length. The reactor is made of borosilicate glass, which acts as the dielectric barrier when the high voltage is applied. The parameter Dg has been varied during the experiments by changing the diameter of the high-voltage electrode while the inner diameter of the reactor is kept constant at 15 mm. Thus, electrode diameters of 3 mm, 4 mm and 5 mm form discharge gaps of 6 mm, 5.5 mm and 5 mm, respectively.
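The relation between electrode diameter and discharge gap can be checked with a short calculation. The sketch below assumes a coaxial arrangement in which the gap is half the difference between the reactor inner diameter and the electrode diameter; the function and variable names are illustrative and not taken from the study.

```python
def discharge_gap_mm(reactor_inner_diameter_mm, electrode_diameter_mm):
    """Radial gap between a centered rod electrode and the inner tube wall (assumed coaxial)."""
    return (reactor_inner_diameter_mm - electrode_diameter_mm) / 2.0


REACTOR_INNER_DIAMETER_MM = 15.0  # value stated in Sect. 2.1

# Electrode diameters used to set the three levels of Dg.
gaps = {d: discharge_gap_mm(REACTOR_INNER_DIAMETER_MM, d) for d in (3.0, 4.0, 5.0)}
print(gaps)
```

Under this assumption, the three electrode diameters reproduce the 6 mm, 5.5 mm and 5 mm gaps quoted above.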

2.2 Design of experiments

The design of experiments (DoE) is a crucial aspect of RSM, which was originally developed for the model fitting of physical experiments. Later, these strategies were also implemented for numerical experiments with the objective of selecting the input data points at which the response needs to be determined experimentally.

The most widely used DoE is Taguchi’s L-9 method [23], in which the input parameter points are decided based on orthogonal arrays. Even though the number of experiments that need to be conducted decreases drastically to only nine, this method has a considerable disadvantage, i.e., it ignores the interactions between the parameters. Thus, the DoE followed in the present study is the 3^N full factorial design, which allows all possible combinations of the operating parameters to be investigated. Here, N is the number of parameters and 3 is the number of levels considered for each parameter. As N is five in this study, a total of 3^5 = 243 experiments have been conducted. The lower and upper bounds and midpoints of each parameter are chosen as given in Table 2.
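The run count of the full factorial design can be illustrated by enumerating the design points. The following is a minimal sketch that uses coded levels (-1, 0, +1) in place of the physical values of Table 2; the parameter labels and the function name are illustrative assumptions.

```python
from itertools import product

PARAMETERS = ("V", "Fr", "T", "Dg", "NOx_i")  # labels only; physical levels are in Table 2
CODED_LEVELS = (-1, 0, 1)                      # lower bound, midpoint, upper bound


def full_factorial(n_factors, levels=CODED_LEVELS):
    """Return every combination of the given levels over n_factors factors."""
    return list(product(levels, repeat=n_factors))


design = full_factorial(len(PARAMETERS))
print(len(design))  # number of runs in the three-level, five-factor design
```

Each tuple in `design` is one experimental run, and every pairwise combination of levels appears in it, which is what makes the interaction terms estimable.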

Table 2 Operating parameters and their levels

The NTP treatment primarily causes NO to NO2 conversion reactions, leading to a decrement in NO concentration and an increment in NO2 concentration. If the applied high voltage keeps on increasing, the electric field gets intensified, leading to a decrement in NO2 as well. The possible chemical reactions that can take place in the reactor were given, with their rate constants, in a study by Saavedra et al. [24]. Major reaction pathways have been mentioned in a previous study of the authors [25]. The operating parameters have been varied so as to cover all possible combinations of the considered levels of each parameter, and the corresponding (NO + NO2) concentrations have been recorded.

2.3 Experimental results

Figure 2 presents the main effect plots, which display the mean value of (NO + NO2) at each level of a parameter. A reduction in (NO + NO2) with an increase in a parameter value indicates a positive effect of that parameter. Thus, the decrease in (NO + NO2) with an increase in V and the increase in (NO + NO2) with an increase in Fr, which can be seen in Fig. 2a, b, can easily be understood. It can be observed from Fig. 2c that (NO + NO2) decreases with an increase in T. This might be due to the effect of the moisture generated at higher temperatures. As shown in Fig. 2d, (NO + NO2) increases as the electric field in the reactor becomes weaker with an increase in Dg. When (NO + NO2)i increases, a lower percentage of pollutant removal is obtained, which can be noticed in Fig. 2e. Further, the differences in the means of (NO + NO2) between the lower and upper limits of the parameters are 73.7, − 47.4, 16.9, − 12.5 and − 95.2 ppm for V, Fr, T, Dg and (NO + NO2)i, respectively. Thus, it can be said that (NO + NO2)i, followed by V, affects (NO + NO2) more than the other parameters.
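The main effect of a parameter is obtained by averaging the response over all runs that share a given level of that parameter. The sketch below illustrates the calculation on a toy two-factor design with placeholder responses; the data and names are hypothetical and are not the experimental values of this study. The difference is taken as the mean at the lower level minus the mean at the upper level, which is assumed here because it matches the signs quoted above (positive when the response falls as the parameter rises).

```python
from collections import defaultdict
from statistics import mean


def main_effect(runs, responses, factor_index):
    """Mean response at each level of one factor, averaged over all other factors."""
    by_level = defaultdict(list)
    for run, y in zip(runs, responses):
        by_level[run[factor_index]].append(y)
    return {level: mean(values) for level, values in sorted(by_level.items())}


# Toy three-level design in two coded factors with made-up responses (ppm).
runs = [(a, b) for a in (-1, 0, 1) for b in (-1, 0, 1)]
responses = [200 - 30 * a + 20 * b for a, b in runs]

effect_a = main_effect(runs, responses, 0)
delta_a = effect_a[-1] - effect_a[1]  # lower-level mean minus upper-level mean
print(effect_a, delta_a)
```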

Fig. 2 Main effect plots over (NO + NO2) from the experimental data for a voltage, b flow rate, c temperature, d discharge gap, e (NO + NO2)i

3 Model derivation

The order of the model should be as low as possible for solving prediction problems. Choosing a higher-order model can make the regression analysis ill-conditioned. At the same time, the first-order (linear) model generally suffers from lack of fit. For the present problem, the RMSE is found to be as high as 13.5 ppm when the linear model is used. Even though the RMSE decreases to 3.5 ppm when the third-order model is used, that model may not work as a good predictor during testing: despite achieving a good fit with the data, it suffers from rank deficiency. Thus, the commonly used quadratic (second-order polynomial) RSM model, as given in Eq. (1), has been chosen for this study. This model consists of a total of 21 terms, which include linear, interaction and square terms and a constant (C). Here, β1 to β20 are the regression coefficients of the model. The constant and the coefficients of each term of the model have been estimated using statistical analysis, and the results are given in Table 3. The table also presents the standard error (SE) coefficient and the t-stat value for each term.

Table 3 Estimated coefficients of quadratic RSM model
$$ \begin{aligned} \left( {{\text{NO}} + {\text{NO}}_{2} } \right) & = C + \beta_{1} V + \beta_{2} {\text{Fr}} + \beta_{3} T + \beta_{4} {\text{Dg}} + \beta_{5} \left( {{\text{NO}} + {\text{NO}}_{2} } \right)_{i} \\ & \quad + \beta_{6} V{\text{Fr}} + \beta_{7} VT + \beta_{8} V{\text{Dg}} + \beta_{9} V\left( {{\text{NO}} + {\text{NO}}_{2} } \right)_{i} \\ & \quad + \beta_{10} {\text{Fr}}T + \beta_{11} {\text{FrDg}} + \beta_{12} {\text{Fr}}\left( {{\text{NO}} + {\text{NO}}_{2} } \right)_{i} \\ & \quad + \beta_{13} T{\text{Dg}} + \beta_{14} T\left( {{\text{NO}} + {\text{NO}}_{2} } \right)_{i} + \beta_{15} {\text{Dg}}\left( {{\text{NO}} + {\text{NO}}_{2} } \right)_{i} \\ & \quad + \beta_{16} V^{2} + \beta_{17} {\text{Fr}}^{2} + \beta_{18} T^{2} + \beta_{19} {\text{Dg}}^{2} + \beta_{20} \left( {{\text{NO}} + {\text{NO}}_{2} } \right)_{i}^{2} \\ \end{aligned} $$
(1)
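The structure of Eq. (1) can be made concrete by building its design matrix and fitting it by ordinary least squares. The sketch below is an illustration under stated assumptions, not the fitting procedure of this study: it uses coded levels, a synthetic noise-free response generated from a known coefficient vector, and NumPy's least-squares solver. It also shows that pure cubic columns add no rank when each factor has only three levels, which is one possible source of the rank deficiency noted for the third-order model.

```python
from itertools import combinations, product

import numpy as np


def quadratic_terms(x):
    """Expand one run into the 21 terms of Eq. (1): 1, x_i, x_i*x_j (i < j), x_i**2."""
    x = [float(v) for v in x]
    interactions = [a * b for a, b in combinations(x, 2)]
    squares = [v * v for v in x]
    return [1.0] + x + interactions + squares


# Coded three-level full factorial in five factors (243 runs).
runs = np.array(list(product((-1.0, 0.0, 1.0), repeat=5)))
X = np.array([quadratic_terms(run) for run in runs])  # shape (243, 21)

# Placeholder response: known coefficients, no noise, so the fit can be checked.
rng = np.random.default_rng(0)
true_beta = rng.normal(size=X.shape[1])
y = X @ true_beta

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
rank_quadratic = np.linalg.matrix_rank(X)

# Appending pure cubic columns adds no rank: on three levels, x**3 is a
# combination of 1, x and x**2 (in coded units x**3 equals x).
X_cubic = np.hstack([X, runs ** 3])
rank_with_cubes = np.linalg.matrix_rank(X_cubic)

print(X.shape, rank_quadratic, rank_with_cubes)
```

The column order produced by `quadratic_terms` follows Eq. (1): the constant, the five linear terms, the ten interactions starting with V·Fr, and the five squares.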

The SE coefficient is an estimate of the standard deviation of the sampling distribution of the corresponding parameter. In its simplest form (the standard error of a mean), it is the ratio of the standard deviation to the square root of the sample size. A lower value of the SE coefficient indicates a more precise estimate. The t-stat is the ratio of the coefficient to its standard error. It is used to test the null hypothesis that the corresponding coefficient is zero against the alternative that it has some value other than zero, given the other predictors in the model. From the t-stat values obtained, it can be said that the coefficient values are well estimated. The values of both R² and adjusted R² of the model have been found to be 0.993, which indicates that this model fits the experimental data closely. The RMSE is found to be 4.7 ppm. Further, the F value for the model has been found to be 1660 with respect to the constant model, which indicates a significant regression relationship between the response parameter and the operating parameters. The P value is 1.76e−229, which indicates a strong significance of the model. According to these results, it can be said that the model is well derived and suitable for prediction of the response parameter, i.e., (NO + NO2).
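The reported goodness-of-fit figures can be cross-checked against each other using the standard relations between R², adjusted R² and the overall F statistic. The sketch below assumes that all 243 runs were used for training and that the model has 20 terms besides the constant; the function names are illustrative. Because R² is reported to three decimals only, two nearby values are evaluated to show the sensitivity to rounding.

```python
def adjusted_r2(r2, n_runs, n_terms):
    """Adjusted R-squared for a model with n_terms predictors plus a constant."""
    return 1.0 - (1.0 - r2) * (n_runs - 1) / (n_runs - n_terms - 1)


def overall_f(r2, n_runs, n_terms):
    """F statistic of the full model against the constant-only model."""
    return (r2 / n_terms) / ((1.0 - r2) / (n_runs - n_terms - 1))


N_RUNS, N_TERMS = 243, 20  # assumed: full 3^5 design, 20 non-constant terms

for r2 in (0.9930, 0.9934):
    print(r2, round(adjusted_r2(r2, N_RUNS, N_TERMS), 4), round(overall_f(r2, N_RUNS, N_TERMS), 1))
```

Under these assumptions, an F value of 1660 falls between the two evaluated cases, so it is compatible with an R² that rounds to 0.993.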

4 Results and discussion

4.1 Surface plots

The surface plots of the response parameter, shown in Fig. 3a–d, have been plotted using the quadratic model of Eq. (1) by taking two operating parameters at a time. Experimental data points can also be seen in these figures. Voltage has been taken as the common parameter, and one of the remaining four parameters has been taken as the second parameter in each plot. While plotting these graphs, the values of the other three parameters have been held constant.

Fig. 3 Surface plots for the (NO + NO2) predicted using quadratic RSM model with respect to a voltage and flow rate, b voltage and temperature, c voltage and discharge gap, d voltage and (NO + NO2)i

The trend followed by the response parameter, i.e., (NO + NO2), with respect to an increase or decrease in the operating parameters can be observed in these plots. The apparent decrement in (NO + NO2) as V is varied from 16 to 26 kV can be noticed in every subfigure of Fig. 3. It can be seen from Fig. 3a that the effect of V is smaller when Fr is 16 lpm. The effect of the second parameter on the response can be observed in the remaining plots as well, and the effects can be compared with each other. For example, from Fig. 3a, c, it can be noticed that the effect of Dg, when it is varied from its lower level to its upper level, is smaller than that of Fr. Similarly, a decrement in (NO + NO2) with an increase in T and with a decrease in Dg and (NO + NO2)i can be observed from Fig. 3b, c and d, respectively.

4.2 Analysis of variance

The analysis of variance (ANOVA) results of this quadratic RSM model are presented in Table 4. The sum of squares (SS), degrees of freedom (DF) and mean squares (MS) are computed for each term of the model as a part of this study. However, the most important quantities in ANOVA are the F value and the P value. The t-stat indicates whether a single parameter or variable is statistically significant, whereas the F value indicates whether a group of parameters is jointly significant.

Table 4 Analysis of variance for the quadratic RSM model

The calculated F values given in the table for most of the terms of the model are greater than the critical value of F, i.e., 3.88. If the F value of a term is lower than the critical value, the corresponding P value exceeds 0.05 and indicates the insignificance of that particular term. This has happened for a few of the interaction and square terms of the model. The P values are comparatively higher for the interaction terms involving the parameter Dg, i.e., FrDg, TDg and (NO + NO2)iDg, and for the square terms T² and Dg². However, the P values of the linear terms, including the linear term of Dg, are much lower than 0.05, indicating that those terms are statistically significant at the 95% confidence level. In other words, it can be said that the changes in response values are associated with changes in these parameter values and that their inclusion in the model is meaningful.
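The per-term screening described above amounts to comparing each F value with the critical value. The sketch below assumes that every term has one degree of freedom and that the residual degrees of freedom equal the 243 runs minus the 21 model terms; for a one-degree-of-freedom term, F is the square of the t-stat, so the quoted critical F of 3.88 corresponds to a t-stat magnitude of about 1.97. The F values passed to the helper are placeholders, not entries of Table 4.

```python
from math import sqrt

N_RUNS = 243         # full 3^5 design (assumed to be the training set)
N_MODEL_TERMS = 21   # constant plus 20 coefficients of Eq. (1)
F_CRITICAL = 3.88    # critical value quoted in the text

residual_df = N_RUNS - N_MODEL_TERMS

# For a single-degree-of-freedom term, F = t**2.
t_equivalent = sqrt(F_CRITICAL)


def is_significant(f_value, f_critical=F_CRITICAL):
    """True when a term's F value exceeds the critical value (5% level)."""
    return f_value > f_critical


print(residual_df, round(t_equivalent, 2))
print(is_significant(10.0), is_significant(1.2))  # placeholder F values
```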

4.3 Model validation

Obtaining a well-fitted model alone does not serve the purpose in a prediction problem; otherwise, a third-order model could have been chosen, as described earlier. Thus, the validation of the model includes verification of the predicted responses for both training and test data. The model is verified first with the training data itself, which have already been used for calculating the regression coefficients β1 to β20 and the constant C. Before the predicted responses of different data instances are compared, the residuals have to be standardized with respect to their standard deviation. The standardized residual of a data instance is the ratio of the residual at that instance to the standard deviation of the residuals. Figure 4a, b shows the residual plots for the training data. From Fig. 4a, it can be said that the predicted values match well with the experimental values of NO + NO2, as the points (small circles shown in blue) lie close to the red straight line. The standardized residuals plotted against the predicted values should not follow any particular pattern and should be symmetrically distributed about the center line. They should also be clustered around the low single digits of the y-axis. As the plot shown in Fig. 4b follows these rules, it indicates that the model is well fitted. The RMSE for the training data is found to be 4.7 ppm. The root-mean-square percentage error (RMSPE) is 2.5. The mean percentage error (MPE), which is a measure of the bias in the prediction by the model, is − 0.02.
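The validation metrics named above can be computed as in the following sketch. The exact definitions used in the study are not stated, so this assumes residuals taken as measured minus predicted, percentage errors normalized by the measured value and scaled by 100, and the sample standard deviation for standardization. The data are placeholders, not the experimental values.

```python
from math import sqrt
from statistics import mean, stdev


def rmse(measured, predicted):
    """Root-mean-square error, in the units of the response."""
    return sqrt(mean([(m - p) ** 2 for m, p in zip(measured, predicted)]))


def rmspe(measured, predicted):
    """Root-mean-square percentage error (assumed normalized by the measured value)."""
    return 100.0 * sqrt(mean([((m - p) / m) ** 2 for m, p in zip(measured, predicted)]))


def mpe(measured, predicted):
    """Mean percentage error; its sign indicates the direction of the bias."""
    return 100.0 * mean([(m - p) / m for m, p in zip(measured, predicted)])


def standardized_residuals(measured, predicted):
    """Residuals divided by their sample standard deviation."""
    residuals = [m - p for m, p in zip(measured, predicted)]
    s = stdev(residuals)
    return [r / s for r in residuals]


# Placeholder concentrations in ppm.
measured = [100.0, 200.0, 300.0, 400.0]
predicted = [98.0, 203.0, 299.0, 404.0]

print(rmse(measured, predicted), rmspe(measured, predicted), mpe(measured, predicted))
print(standardized_residuals(measured, predicted))
```

A residual plot such as Fig. 4b would place the output of `standardized_residuals` on the y-axis against the predicted values on the x-axis.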

Fig. 4 Residual plots for training data. a Experimental values versus predicted values of (NO + NO2) concentration. b Standardized residuals versus predicted values of (NO + NO2) concentration

The second part of the validation process is testing the quadratic RSM model for an entirely new set of data which has not been used during training or while finding out the regression coefficients. Experiments have been conducted to obtain the test data of twelve sets with a random variation in the operating parameter values.

The model has been used to predict the NO + NO2 concentration for the test data. Figure 5a, b shows the residual plots for the predicted responses of the test data. In this case as well, the points in Fig. 5a lie close to the straight line, and the data points of the standardized residual plot shown in Fig. 5b do not follow any pattern and are symmetrically distributed about the center line. Thus, it can be said that the chosen model is appropriate for the data. For these test data, the RMSE is found to be 5.6 ppm and the RMSPE is 3.29. The MPE is found to be 0.4.

Fig. 5 Residual plots for test data. a Experimental values versus predicted values of (NO + NO2) concentration, b standardized residuals versus predicted values of (NO + NO2) concentration

The operating parameter details of this test data set of twelve experiments are given in Table 5 along with the predicted values of NO + NO2 concentrations. In this table, experimental values of NO + NO2 concentrations are denoted as (NO + NO2)e and the values predicted using the RSM model are denoted as (NO + NO2)p_RSM.

In the experiments with order numbers 3, 7, 9, 11 and 12, the values of the operating parameters have been taken beyond their ranges in the training data. The discharge gap is taken as 6.5 mm, beyond the 5–6 mm range of the training data, in experiments 3 and 9. The temperature is raised above its upper level to 100 °C in experiment 7. Similarly, the voltage and flow rate are taken outside their ranges in experiments 11 and 12, respectively. The errors (ERSM) in the predicted values for these experiments are within ± 9 ppm, which can be observed from Table 5. Observing all these results, it can be said that the model is trained well and can predict the NO + NO2 concentration for any set of operating parameter values.
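Test runs that extrapolate beyond the training ranges can be flagged automatically before their errors are examined. The sketch below uses only the discharge gap range that follows from Sect. 2.1 (5 to 6 mm); the ranges of the other parameters are listed in Table 2 and are left out here. The dictionary keys and the function name are illustrative assumptions.

```python
# Training range of the discharge gap, from the electrode diameters in Sect. 2.1.
TRAINING_BOUNDS = {"Dg_mm": (5.0, 6.0)}


def out_of_range(point, bounds):
    """Names of the parameters whose values lie outside their training range."""
    return [
        name
        for name, value in point.items()
        if name in bounds and not (bounds[name][0] <= value <= bounds[name][1])
    ]


print(out_of_range({"Dg_mm": 6.5}, TRAINING_BOUNDS))  # gap used in test runs 3 and 9
print(out_of_range({"Dg_mm": 5.5}, TRAINING_BOUNDS))  # midpoint of the training range
```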

Table 5 Test data with predicted (NO + NO2) using quadratic RSM and dimensional analysis-based models and their errors

4.4 Comparison with dimensional analysis-based model

As described before, dimensional analysis is the only method used for this application as per the literature [1, 11, 12]. In these studies, the models were not tested on new data; the results shown were the RMSEs for the training data itself. Thus, the model given in Eq. (2) has been derived using dimensional analysis and the experimental data of the present study, following a procedure stated by the authors of this study in their previous papers [13, 14]. Then, the predicted responses of the quadratic RSM model during testing have been compared with those of the dimensional analysis-based model. As said earlier, models derived using dimensional analysis experience difficulty in predicting the responses when entirely new data are given. The reason is that the training data do not capture the interactions between parameters, as the power terms are derived by varying one parameter at a time.

$$ \begin{aligned} {\text{NO}} + {\text{NO}}_{2} & = 0.0178 \times \left[ {\frac{V\sqrt \varepsilon }{{\sqrt[3]{{V_{r} }}\sqrt P }}} \right]^{ - 0.9187} \times \left[ {\frac{\text{Fr}}{{\sqrt[3]{{V_{r}^{2} }}\sqrt P }}} \right]^{0.3104} \\ & \quad \times \left[ {\frac{{TT_{c} }}{{\sqrt[3]{{V_{r} }}\sqrt {P^{3} } }}} \right]^{ - 0.1946} \times \left[ {\frac{\text{Dg}}{{\sqrt[3]{{V_{r} }}}}} \right]^{0.5104} \times \left[ {\left( {{\text{NO}} + {\text{NO}}_{2} } \right)_{i} } \right]^{2.0259} \\ \end{aligned} $$
(2)
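Because Eq. (2) is a product of power terms, scaling one operating parameter while the other quantities in its bracket are held fixed multiplies the predicted (NO + NO2) by that scale factor raised to the bracket's exponent. The sketch below illustrates this using the exponents of Eq. (2); the dictionary keys and the function name are illustrative, and the scale factors are arbitrary examples rather than experimental settings.

```python
# Exponents of the five bracketed groups in Eq. (2).
EXPONENTS = {
    "V": -0.9187,
    "Fr": 0.3104,
    "T": -0.1946,
    "Dg": 0.5104,
    "NOx_i": 2.0259,
}


def response_ratio(parameter, scale_factor):
    """Ratio of predicted responses when one parameter is multiplied by scale_factor."""
    return scale_factor ** EXPONENTS[parameter]


print(response_ratio("V", 1.1))      # effect of a 10% higher voltage
print(response_ratio("NOx_i", 2.0))  # effect of doubling the initial concentration
```

In this form, the effect of each parameter does not depend on the values of the others, which is consistent with the remark above that the dimensional analysis-based model does not capture interactions.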

The predicted values of NO + NO2 concentrations obtained using the dimensional analysis-based model are denoted as (NO + NO2)p_Dim in Table 5. The error in these predicted values is denoted as EDim. It can be observed from this column of Table 5 that the error is lower for the experiments with order numbers 1, 5, 8, 10 and 12 than for the remaining experiments. From this, it can be understood that the error is higher when two or more operating parameters vary simultaneously with respect to the data on which the model is trained.

The values of NO + NO2 predicted using both models have been plotted along with the experimental values of NO + NO2 against the experimental order, as shown in Fig. 6. From this figure, it can be observed that the values predicted using the quadratic RSM model are much closer to the corresponding experimental values. Thus, it can be said that this quadratic RSM model would work well for predicting the NO + NO2 concentrations for any given test data of operating parameters.

Fig. 6 Comparison of quadratic RSM model and dimensional analysis-based model during testing

5 Conclusion

Prior knowledge about the outcome of any experimental process is always helpful in bringing the theory toward practical applications. In the present study, an attempt is made to predict the sum of NO and NO2 concentrations in diesel exhaust during the NTP treatment, with respect to changes in operating parameters, using RSM. The 3^5 full factorial design has been adopted for conducting the experiments with three-level variations in the five operating parameters, namely V, Fr, T, Dg and (NO + NO2)i. From the main effect plots, it is noticed that changes in (NO + NO2)i and V affect (NO + NO2) more than the other parameters.

Regression coefficients of the quadratic RSM model have been obtained by feeding it with the experimental data. The results of the chosen model have been analyzed using the surface plots, ANOVA and model validation. From the F values and P values obtained with ANOVA, the model can be described as significant at the 95% confidence level.

The predicted values are observed to be in good agreement with the corresponding experimental values. The RMSEs are found to be 4.7 ppm and 5.6 ppm for training data and test data, respectively. From all these results, this quadratic RSM model can be described as well suited for the prediction of (NO + NO2) concentrations in diesel exhaust during NTP treatment.