Introduction

The importance of in-flight powder particle characteristics, such as particle velocity and temperature, for coating formation in the complex, nonlinear atmospheric plasma spray (APS) process has long been recognized. Since these characteristics cannot be measured during production, there is great interest in predicting them in order to monitor and improve the spray process and the coating quality.

The advancement of machine learning permits modeling such complex nonlinear relationships in a data-driven way. Guessasma et al. proposed using an artificial neural network (ANN) to develop an expert system for predicting the average spray particle velocity, temperature and diameter for better coating quality control (Ref 1). The input parameters, also referred to as attributes or predictors, included the arc current intensity and the argon and hydrogen gas flow rates. The authors described their ANN development protocol in a follow-up article (Ref 2); in particular, they proposed a database enlargement procedure for cases where the number of experiments is insufficient to adequately train the ANN structure. Kanta et al. employed this protocol and developed an ANN to predict the in-flight particle temperatures and velocities in order to study particle melting and dwell time with respect to the particle diameter (Ref 3). Sha expressed three concerns regarding this ANN development approach: first, about the need for and effectiveness of developing neural network models based on very few experimental data points; second, about the database enlargement procedure; and third, about the wide extrapolation based on the neural network models (Ref 4). Choudhury et al. took the experimental data sets from Ref 1 and suggested an alternative protocol to develop an ANN to predict the in-flight powder particle characteristics of an APS process (Ref 5); more specifically, the authors proposed expanding the experimental dataset using kernel regression instead and reported good model generalization. Later, Choudhury et al. employed an extreme learning machine (ELM), a specific class of ANN, to construct a robust single-hidden-layer feed-forward neural network for the in-flight particle characteristics of the APS process (Ref 6).
This approach reduced the training time and yielded stable performance with respect to changes in the number of hidden layer neurons. Choudhury et al. further proposed implementing the ANN using a modular scheme (Ref 7). The scheme simplified the model structure and improved the overall generalization of the model. Liu et al. employed a nonlinear autoregressive exogenous (NLARX) model combined with a wavelet network to predict the in-flight particle characteristics of a mono-cathode plasma spray torch using a system identification approach (Ref 8). Compared with a conventional neural network, such an approach could be more suitable for dynamic conditions. Recently, Zhu et al. proposed employing a convolutional neural network (CNN) to model the in-flight particle characteristics in atmospheric plasma spray (Ref 9). Having proven successful for image processing, the CNN-based model can also describe the two-dimensional spray distributions along the plane of the substrate, in addition to predicting the average in-flight particle characteristic values as with an ANN. The reversed model can predict the required input control parameters given the two-dimensional sprayed particle distributions. More recently, Bobzin et al. proposed using a Residual Neural Network (ResNet) and a Support Vector Machine (SVM) to provide fast estimations of the in-flight particle properties in APS. Numerical simulations were used to provide the training data. Both the ResNet and SVM models were able to reduce the simulation time from 3 h down to a few seconds with accurate predictions for the average properties. Other machine learning investigations for atmospheric plasma spray include process modeling, monitoring and control (Ref 10-12), coating characteristics prediction (Ref 13-16), and process optimization (Ref 17).

The majority of the predictive models proposed for APS in-flight particle characteristics are neural network based. There are, however, many other nonlinear predictive modeling techniques (Ref 18). In particular, decision tree-based ensemble methods, such as random forest (RF) and gradient boosting (GB), have demonstrated strong performance, often comparable to and sometimes even better than that of ANNs (Ref 19, 20).

The essential idea behind ensemble methods is to combine the outputs from many simple models, referred to as base learners, to yield the final prediction. RF and GB are two popular ensemble methods; both use decision trees as their base learners. The two differ, however, in how the individual trees are constructed and combined. RF generates the trees by training each of them on a random subset of the data: the observations are drawn from the full training set with replacement, and only a random subset of the attributes is considered (Ref 21). The final prediction is the average of the results from all the generated trees. Since the trees are generated independently of one another, the procedure is well suited to parallel execution. RF mainly reduces the prediction variance. GB, on the other hand, constructs the trees sequentially, each one aiming to reduce the prediction error (the residual) left by the previous trees (Ref 22, 23). Subsampling of the observations is generally used, while subsampling of the attributes may also be employed (Ref 24). GB reduces both the prediction variance and the bias. Wolpert argues that, without substantive information about the modeling problem, there is no single model that will always do better than any other model (Ref 25). Could ensemble methods also predict the in-flight particle characteristics well?
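The contrast between the two ensemble constructions can be sketched in a few lines of plain Python. This is only an illustrative toy, not the models used in this work: the base learner is a hypothetical one-dimensional decision stump (`fit_stump`), the data are synthetic, attribute subsampling is omitted because there is a single attribute, and the boosting variant assumes a squared-error loss, for which fitting the residuals is the standard gradient step.

```python
import random

def fit_stump(xs, ys):
    """Fit a depth-1 regression tree (stump) on 1-D inputs by exhaustive split search."""
    best = None
    for threshold in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        if not right:  # the largest x cannot be a split point
            continue
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - lmean) ** 2 for y in left) + sum((y - rmean) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, lmean, rmean)
    if best is None:  # all x identical: fall back to predicting the mean
        mean = sum(ys) / len(ys)
        return lambda x: mean
    _, threshold, lmean, rmean = best
    return lambda x: lmean if x <= threshold else rmean

def bagging(xs, ys, n_trees, rng):
    """RF-style ensemble: independent stumps on bootstrap samples, then averaged."""
    n = len(xs)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # observations drawn with replacement
        trees.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: sum(t(x) for t in trees) / len(trees)

def boosting(xs, ys, n_trees, learning_rate=0.1):
    """GB-style ensemble for squared error: each stump fits the current residuals."""
    base = sum(ys) / len(ys)
    residuals = [y - base for y in ys]
    trees = []
    for _ in range(n_trees):
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        residuals = [r - learning_rate * tree(x) for x, r in zip(xs, residuals)]
    return lambda x: base + learning_rate * sum(t(x) for t in trees)

def mse(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(ys)

rng = random.Random(0)
xs = [i / 10 for i in range(50)]
ys = [x ** 2 + rng.gauss(0, 0.5) for x in xs]  # synthetic data, for illustration only

rf_like = bagging(xs, ys, n_trees=50, rng=rng)
gb_few = boosting(xs, ys, n_trees=5)
gb_many = boosting(xs, ys, n_trees=100)
```

In this sketch, the bagged trees could be fitted in any order (or in parallel), whereas each boosted tree depends on the residuals left by its predecessors, so adding boosting rounds keeps lowering the training error.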

It is well known that the ageing of the torch electrodes greatly affects the relationship between the APS input parameters and the in-flight particle characteristics, on a time scale of hours. The same set of process inputs would be expected to yield different in-flight particle characteristics with a brand new electrode pair than with a used one. Therefore, when modeling the in-flight particle characteristics with torch electrode ageing taken into account, it may be imperative to treat the APS production data as a time series, that is, as data whose observations have an ordered temporal component. To the authors’ best knowledge, the usage time of the torch electrodes (i.e. ageing) has not previously been considered along with other process parameters to predict the in-flight particle characteristics. Can better predictions be made if the APS production data are treated as a time series?

This work aims to explore the applicability of RF and GB to predict and forecast the multivariate APS in-flight particle characteristics, with torch electrode ageing considered as a time series. In particular, two different time series modeling strategies are compared with the baseline approach. The paper is organized as follows: first, the electrode-wearing experiment is described, followed by the data preprocessing. Subsequently, the two time series modeling strategies are compared and discussed. The paper concludes that it is advantageous to treat the APS data as a time series for the in-flight particle characteristic prediction, and that it is beneficial to first make the time series stationary using the traditional differencing technique when modeling with RF for the present application.

Experimental Procedures

APS experiments were performed to record simultaneously the spray process input parameters, the in-flight powder particle characteristics and the electrode usage time. The experiment was carried out using a Metco 3 MB APS torch with a brand new pair of electrodes (model GE, nozzle diameter 5.54 mm). The electrodes were started 135 times, and the experiment was pursued until the torch could no longer sustain the plasma. The main process parameters (e.g. torch current intensity, voltage) were monitored and recorded at a sampling rate of 1 Hz, using in-house built console equipment integrated into LabVIEW (National Instruments). Various spray durations were considered for the torch usage so as to emulate industrial coating spray production settings similar to those employed for thermal barrier coating (TBC) production. A single set of spray parameters was tested throughout the experiment: plasma-forming gases \({N}_{2}\) (50 lpm) as primary gas and \({H}_{2}\) (10 lpm) as auxiliary gas, and a single TBC top coat powder (YSZ, Metco 204BN-S) deposited at a powder feed rate of 20 g/min with a constant carrier gas flow rate of 6 lpm and a stand-off distance of 75 mm. During the course of the experiment, the aim was to keep the net power of the torch constant by adjusting the torch current, where the net power is the raw power minus the heat transfer from the torch to the coolant (hereafter referred to as torch cooling). An AccuraSpray (Tecnar, St-Bruno, Qc, Canada) diagnostic device was used to measure the in-flight particle temperature and velocity at defined time intervals (measurement precision as given for the equipment: 0.5% for the plume relative intensity and 0.1 mm for the plume geometry).

The electrodes began to show weakness in sustaining a constant plasma plume and manifested plasma pulsations after about 26 h of usage. At that point, it was decided that the electrodes had reached their end of life. The evolution of the torch current, voltage and power, and of the corresponding in-flight particle velocity and temperature, is shown in Fig. 1. Even though the spray condition was maintained constant during the experiment, the in-flight particle temperature shows an obvious decrease of about 500 °C from the beginning to the end of the experiment, whereas the in-flight particle velocity shows a smaller reduction. The torch voltage decreases rather quickly within the first 5 h of usage, remains rather stable up to about 17 h of usage and shows a higher level of fluctuation towards the end of the usage time. The effect of electrode ageing on the in-flight particle characteristics can be clearly observed. Figure 2 shows the 3 MB torch electrode pair at the beginning, at 7 h of usage and at the end of the experiment. Figure 3 shows the porosity of the YSZ coating prepared at two different moments corresponding to different torch electrode usage states; the coating prepared with the electrodes closer to the end of their lifetime (right) shows a higher level of porosity due to electrode wear.

Fig. 1 Torch power (a), current (b), voltage (c), as well as the corresponding in-flight particle temperature (d) and velocity (e) with respect to the usage time of the electrodes throughout the experiment. The aim throughout was to maintain the net power of the torch constant by adjusting the torch current, where the net power is the raw power minus the torch cooling

Fig. 2 The 3 MB torch electrode pair used for the experiment: (a) the brand new electrode pair, (b) the electrode pair after 7 h of usage and (c) the electrode pair after 26 h of usage

Fig. 3 APS coating porosity. Increased porosity is observed in the coating when spraying with the electrodes used for 26 h (b) as compared to the ones used for 7 h (a). Here, TC refers to the YSZ top coat and BC refers to the MCrAlY bond coat

Predictive Modeling with Ensemble Methods

Data Preprocessing

The data acquired from the different devices were first cleaned and then integrated based on the time stamp. For the data recorded from the APS controller (e.g. torch current intensity, voltage), only the observations recorded while the plasma was on were considered. A total of about 2600 records with 8 attributes (the torch voltage \(V\) (\(V\)), the current \(I\) (\(A\)), the input raw power \({P}_{raw}\) (\(kW\)), the net power \({P}_{net}\) (\(kW\)), the torch cooling \({q}_{cool}\) (\(kW\)), the flow rates of nitrogen \({Q}_{{N}_{2}}\) (\(lpm\)) and hydrogen \({Q}_{{H}_{2}}\) (\(lpm\)), and the electrode usage time \({t}_{usage}\) (s)) and 2 targets (i.e. in-flight particle temperature \({T}_{p}\) (°C) and velocity \({v}_{p}\) (\(m/s\))) were obtained right after the data fusion. All the attributes and the targets are standardized by subtracting their respective mean and then dividing by their respective standard deviation. This common standardization helps to eliminate numerical artifacts caused by different scales during comparison.
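The standardization step can be illustrated with a minimal sketch. The function names and the sample voltage values are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not state which estimator was used. The mean and standard deviation are returned so that predictions can later be converted back to the original units.

```python
from statistics import mean, stdev

def standardize(values):
    """Return z-scores plus the (mean, std) needed to undo the scaling later."""
    mu = mean(values)
    sigma = stdev(values)  # assumption: sample standard deviation (n - 1 denominator)
    if sigma == 0:
        # A constant column (e.g. a flow rate that was never varied) has no spread.
        return [0.0 for _ in values], (mu, sigma)
    return [(v - mu) / sigma for v in values], (mu, sigma)

def unstandardize(z_scores, params):
    """Map z-scores back to the original unit scale."""
    mu, sigma = params
    return [z * sigma + mu for z in z_scores]

voltage = [72.0, 70.5, 69.0, 68.5, 68.0]  # made-up values, for illustration only
z, params = standardize(voltage)
restored = unstandardize(z, params)
```

After this transformation each column has zero mean and unit standard deviation, so attributes measured in volts, kilowatts or seconds become directly comparable.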

Figure 4 shows the Pearson correlation coefficient among some selected parameters. The Pearson correlation coefficient is a measure of the linear correlation between two parameters. It is determined by dividing the covariance of the two parameters by the product of their standard deviations. This normalized covariance only takes values from −1 to 1. When it is 1, the two parameters are related linearly in a positive manner: when one increases, the other increases too. When it is −1, the two parameters are related linearly in a negative manner: when one increases, the other decreases. As with the covariance, the Pearson correlation coefficient can only reflect whether the two parameters have a linear relationship. Recall that only a single spray condition is considered in the experiment. Several process parameters (e.g. the flow rate of nitrogen) were not varied throughout the experiment. Therefore, their effects on the particle velocity and temperature cannot be observed, and they appear to have no relation to any other parameter. A similar observation holds for the net power, which is purposefully maintained constant. In particular, the Pearson correlation coefficient indicates that the particle temperature is more strongly correlated with the voltage than the particle velocity is.
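As a small illustration of the definition above, the coefficient can be computed directly from the covariance and the standard deviations. The function name `pearson` and the toy inputs are hypothetical; returning NaN for a constant parameter is a choice made for this sketch, reflecting that the coefficient is undefined when one standard deviation is zero.

```python
from math import sqrt

def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator.
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return float("nan")  # undefined for a parameter that never varies
    return cov / (sx * sy)

x = [1.0, 2.0, 3.0, 4.0]
r_pos = pearson(x, [2 * v + 1 for v in x])         # perfectly linear, increasing
r_neg = pearson(x, [-v for v in x])                # perfectly linear, decreasing
r_const = pearson(x, [50.0, 50.0, 50.0, 50.0])     # constant parameter
sym = [-2.0, -1.0, 0.0, 1.0, 2.0]
r_quadratic = pearson(sym, [v ** 2 for v in sym])  # strong but purely nonlinear relation
```

The last case gives a coefficient of zero even though the second parameter is fully determined by the first, which is the limitation noted above: only linear relationships are captured.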

Fig. 4 Pearson correlation coefficient among some selected parameters

A time series has a temporal relationship among its observations (i.e. records): the previous observation affects the current one, which in turn affects the next one, and so on. However, most machine learning predictive models, including RF and GB, consider each observation independently and thus expect that all the information necessary to predict the target can be found in the attributes of that particular observation. Therefore, to forecast a time series using these machine learning models, it is necessary to capture the temporal order information properly as new attributes in each observation itself. There are two general approaches (Ref 26). The first approach simply appends all the information from the considered previous time segments to the observation (Ref 27). The time series may also be normalized (Ref 28), stationarized (Ref 29), or decomposed (Ref 30) a priori. The second approach supplements the observation only with specifically engineered and selected statistical features from the time series (Ref 26). The first approach is adopted here due to its simplicity. One disadvantage of this approach, however, is that the number of attributes increases rather quickly as longer histories are desired, resulting in a higher model training cost. Therefore, it is desirable to include only the relevant parameters in the attribute augmentation.

Feature Selection

Both RF and GB provide a ranking of variable importance as the regression models are developed. These rankings are used to guide the selection of useful attributes.

The present prediction problem is one of multi-target regression, with two outputs: the particle temperature and velocity. The development of multi-target regression for RF is more advanced (Ref 31). Several implementations are already available from popular machine learning platforms (Ref 32, 33). Such models also take the correlation among the targets into consideration. As for GB, multi-target regression is still under active development, and it is not yet possible to develop native multi-target regression using GB. A simple workaround is to build a separate predictive model per target. The correlation among the outputs is, however, then ignored. Alternatively, one may build a set of chained dependent models. Suppose that there are two targets \({y}_{a}\) and \({y}_{b}\). One first builds a model to predict \({y}_{a}\) with the attributes \(\mathop{x}\limits^{\rightharpoonup} = x_{1} ,x_{2} , \ldots ,x_{n}\), and then builds another model to predict \({y}_{b}\) with \(\widehat{{y}_{a}}\) (the predicted \({y}_{a}\) from the first model) as well as the attributes \(\mathop{x}\limits^{\rightharpoonup}\). The performance of such an approach depends upon the model construction order and the correlation among the targets. The benefit of the added complexity is not always apparent.
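The chained construction described above can be sketched as follows. This is a schematic illustration only: the base model is a hypothetical nearest-neighbour regressor (`KNNRegressor`) standing in for a GB model, the helper names `fit_chain` and `predict_chain` are invented for this sketch, and the numbers are made up.

```python
class KNNRegressor:
    """Tiny k-nearest-neighbour regressor, used only as a stand-in base model."""

    def __init__(self, k=2):
        self.k = k

    def fit(self, X, y):
        self.X, self.y = [list(row) for row in X], list(y)
        return self

    def predict(self, X):
        out = []
        for row in X:
            # Squared Euclidean distance to every stored training row.
            dists = sorted(
                (sum((a - b) ** 2 for a, b in zip(row, ref)), i)
                for i, ref in enumerate(self.X)
            )
            nearest = [self.y[i] for _, i in dists[: self.k]]
            out.append(sum(nearest) / len(nearest))
        return out

def fit_chain(X, y_a, y_b, make_model):
    """Fit model A on x, then model B on x augmented with model A's prediction of y_a."""
    model_a = make_model().fit(X, y_a)
    y_a_hat = model_a.predict(X)
    X_aug = [list(row) + [p] for row, p in zip(X, y_a_hat)]
    model_b = make_model().fit(X_aug, y_b)
    return model_a, model_b

def predict_chain(models, X):
    """Predict y_a first, then feed that prediction to the second model."""
    model_a, model_b = models
    y_a_hat = model_a.predict(X)
    X_aug = [list(row) + [p] for row, p in zip(X, y_a_hat)]
    return y_a_hat, model_b.predict(X_aug)

X = [[0.0, 1.0], [1.0, 0.5], [2.0, 0.2], [3.0, 0.1]]  # made-up attributes
y_a = [10.0, 12.0, 15.0, 19.0]                         # first target
y_b = [1.0, 1.1, 1.3, 1.6]                             # second target
models = fit_chain(X, y_a, y_b, lambda: KNNRegressor(k=1))
pred_a, pred_b = predict_chain(models, X)
```

Swapping the roles of the two targets in `fit_chain` gives the other possible chain order, which is why the performance depends on the construction order.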

Both ensemble methods will be used here to compare the features. The dataset is first randomly separated into two parts: 70% for training and 30% for testing.

The testing set is used to verify that the predictive models perform adequately. The hyperparameters of the two models are set based on previous experience as follows: for RF, the maximum number of features per tree is set to 2 and the tree depth is left at its maximum; for GB, the tree depth is set to 2, with a learning rate of 0.1. For GB, the simpler approach of one separate model per target is employed. For both methods, a minimum leaf node size of 5 is used.

The top 6 features ranked using RF and GB are shown in Fig. 5 and 6, respectively. The two rankings are not identical, and neither follows the Pearson correlation coefficient (Fig. 4) exactly. The GB ranking seems to agree better with the Pearson correlation coefficient than the RF ranking does.

Fig. 5 Top 6 features ranked using RF, which is trained to predict both the particle temperature and velocity simultaneously

Fig. 6 Top 6 features ranked using GB. One separate model is trained per target: (a) shows the feature ranking of the GB model trained to predict the particle temperature, and (b) that of the GB model trained to predict the particle velocity

The GB ranking for the in-flight particle velocity, though reasonable according to the Pearson correlation coefficient, is counterintuitive. In this ranking, the net power is the most important feature for the particle velocity prediction. Indeed, according to Fig. 4, the net power is the feature most correlated with the particle velocity after the particle temperature. However, the net power was purposefully maintained constant throughout the experiment. This underlines the importance of taking the interaction among the targets into account. Therefore, in the following, only RF is studied, with the top 5 features ranked in Fig. 5. In particular, the torch net power is not included because, first, the rankings of the torch net power and the torch cooling are very comparable and, second, the torch net power can be determined by simply subtracting the torch cooling from the torch raw power. It is not necessary to include redundant information.

Time Series Attribute Augmentation

To incorporate the temporal order of the data in each observation, it is necessary to add the information from the previous observations to the present observation as new attributes. Suppose that there are two attributes, \({x}_{a}\) and \({x}_{b}\), in the original dataset for a target \(y\). Embedding such a dataset with the data from the two previous time steps results in the following form:

$$\begin{array}{c}{{\varvec{y}}}_{3}\\ {{\varvec{y}}}_{4}\\ {{\varvec{y}}}_{5}\end{array} \Leftarrow \begin{array}{cc}{{\varvec{x}}}_{{\varvec{a}}3}& {{\varvec{x}}}_{{\varvec{b}}3}\\ {{\varvec{x}}}_{{\varvec{a}}4}& {{\varvec{x}}}_{{\varvec{b}}4}\\ {{\varvec{x}}}_{{\varvec{a}}5}& {{\varvec{x}}}_{{\varvec{b}}5}\end{array} \begin{array}{ccc}{y}_{2}& {x}_{a2}& {x}_{b2}\\ {y}_{3}& {x}_{a3}& {x}_{b3}\\ {y}_{4}& {x}_{a4}& {x}_{b4}\end{array} \begin{array}{ccc}{y}_{1}& {x}_{a1}& {x}_{b1}\\ {y}_{2}& {x}_{a2}& {x}_{b2}\\ {y}_{3}& {x}_{a3}& {x}_{b3}\end{array}$$

where the numbers represent the ascending temporal order in the dataset. The original dataset is shown in bold. Note that the target values for the previous observations are also embedded here. For example, \({x}_{a}\) can be the torch current intensity \(I\) and \({x}_{b}\) can be the torch voltage \(V\) with \(y\) being the in-flight particle temperature \({T}_{p}\). The embedded dataset, again considering the two previous time steps, will become as follows:

$$\begin{array}{c}{{\varvec{T}}}_{{\varvec{p}}3}\\ {{\varvec{T}}}_{{\varvec{p}}4}\\ {{\varvec{T}}}_{{\varvec{p}}5}\end{array} \Leftarrow \begin{array}{cc}{{\varvec{I}}}_{3}& {{\varvec{V}}}_{3}\\ {{\varvec{I}}}_{4}& {{\varvec{V}}}_{4}\\ {{\varvec{I}}}_{5}& {{\varvec{V}}}_{5}\end{array} \begin{array}{ccc}{T}_{p2}& {I}_{2}& {V}_{2}\\ {T}_{p3}& {I}_{3}& {V}_{3}\\ {T}_{p4}& {I}_{4}& {V}_{4}\end{array} \begin{array}{ccc}{T}_{p1}& {I}_{1}& {V}_{1}\\ {T}_{p2}& {I}_{2}& {V}_{2}\\ {T}_{p3}& {I}_{3}& {V}_{3}\end{array}$$

For a given dataset, incorporating a longer history introduces more temporal information into the present observation. However, it also reduces the total number of embedded observations available for modeling. The tradeoff is far from trivial, and the necessary lag order is often problem-dependent. For the present exploration, the information (including the targets) from the previous four time steps is embedded into the dataset. All of the top 5 ranked features are included, resulting in a total of 33 attributes in the embedded dataset (5 current attributes plus 4 × (5 attributes + 2 targets) from the previous time steps), referred to as the simple embedded dataset in the following.
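The embedding can be written compactly as a sliding window over the time-ordered records. The sketch below is a minimal illustration, not the authors' implementation; the function name `embed` and the placeholder data are hypothetical. It reproduces the two-lag layout of the example above and the attribute count for four lags.

```python
def embed(attributes, targets, n_lags):
    """Build lag-embedded rows from time-ordered data.

    attributes: list of per-time-step attribute lists
    targets:    list of per-time-step target lists
    Returns (X, Y): row t of X holds the current attributes followed by the
    targets and attributes of steps t-1, ..., t-n_lags; Y holds the current targets.
    """
    X, Y = [], []
    for t in range(n_lags, len(attributes)):
        row = list(attributes[t])
        for lag in range(1, n_lags + 1):
            row += list(targets[t - lag]) + list(attributes[t - lag])
        X.append(row)
        Y.append(list(targets[t]))
    return X, Y

# Toy series mirroring the two-attribute, one-target example in the text.
attrs = [[f"xa{t}", f"xb{t}"] for t in range(1, 6)]
targs = [[f"y{t}"] for t in range(1, 6)]
X, Y = embed(attrs, targs, n_lags=2)

# Attribute count for 5 attributes, 2 targets and 4 lags (placeholder zeros).
wide_attrs = [[0.0] * 5 for _ in range(10)]
wide_targs = [[0.0] * 2 for _ in range(10)]
X33, _ = embed(wide_attrs, wide_targs, n_lags=4)
n_columns = len(X33[0])  # 5 + 4 * (5 + 2)
```

Note that the first `n_lags` records cannot be embedded, so ten records with four lags leave only six usable rows; this is the loss of observations mentioned above.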

During data exploration, it was observed that both the targets and the attributes contain trends; hence, they are non-stationary. Traditional time series forecasting techniques typically first use differencing to make the time series data stationary before model tuning (Ref 34). The benefits of such data preparation for RF are, however, not well reported. Therefore, a second embedded dataset, in which the time series is first differenced before embedding, is prepared for comparison. This is referred to as the differenced embedded dataset below. Note that the predictions of any regression model trained on a differenced embedded dataset must be reverse-differenced to convert them back to the original unit scale.
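A minimal sketch of differencing and its reversal is given below. It assumes first-order differencing (each value minus its predecessor), which the text does not state explicitly; the function names and the temperature values are made up for illustration.

```python
def difference(series):
    """First-order differencing: d[t] = s[t] - s[t-1] (assumed order of differencing)."""
    return [b - a for a, b in zip(series, series[1:])]

def undifference_one_step(previous_value, predicted_change):
    """Reverse differencing for a one-step forecast: add the change to the last known value."""
    return previous_value + predicted_change

temps = [2600.0, 2590.0, 2585.0, 2570.0, 2560.0]  # made-up trending series
diffs = difference(temps)

# Rebuild the original series from its first value and the differences.
rebuilt = [temps[0]]
for d in diffs:
    rebuilt.append(undifference_one_step(rebuilt[-1], d))
```

For one-step prediction, the model trained on differenced data outputs a predicted change, and the forecast in original units is obtained by adding that change to the most recent measured value.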

Methodology for Forecasting Experiment

An RF regression model is developed from each embedded dataset. They are compared with a baseline RF model developed from the original dataset, without embedding or differencing. Since the number of attributes (\(m\)) differs among these datasets, the number of features included per tree is set according to the recommended rule of thumb (\(m/3\)) (Ref 35), whereas the tree depth is left at its maximum as before, with a minimum leaf node size of 5.

Again, the dataset is separated into two parts, according to each AccuraSpray measurement record: the first 70% for training and the remaining 30% for testing. Unlike the random split used for feature selection, this chronological arrangement emphasizes the prediction of future values.
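The chronological split can be sketched as a simple slice of the time-ordered records. The function name and the rounding of the cut point are choices made for this illustration, not details given in the text.

```python
def chronological_split(records, train_fraction=0.7):
    """Split time-ordered records without shuffling: earliest part trains, latest part tests."""
    cut = round(len(records) * train_fraction)
    return records[:cut], records[cut:]

records = list(range(10))  # stand-in for time-ordered measurement records
train, test = chronological_split(records)
```

Because no shuffling takes place, every test record is later in time than every training record, so the test error measures how well the model forecasts unseen future behaviour.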

In this exploratory work, only one-step prediction, which predicts the immediately following target value, is considered. The performance is compared using the mean squared error (MSE) in the original unit scale.
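Since the models operate on standardized data, the predictions have to be mapped back before the MSE is computed in original units. The following sketch illustrates this under assumed values: the mean and standard deviation of the temperature, and the standardized predictions, are hypothetical numbers chosen only for the example.

```python
def mse(actual, predicted):
    """Mean squared error between two equally long sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def to_original_units(z_values, mean_value, std_value):
    """Undo the standardization z = (v - mean) / std."""
    return [z * std_value + mean_value for z in z_values]

mean_T, std_T = 2500.0, 100.0    # hypothetical mean and standard deviation, in °C
actual_z = [0.5, -0.2, 1.0]      # standardized measurements (made up)
predicted_z = [0.4, -0.1, 1.2]   # standardized one-step predictions (made up)

mse_z = mse(actual_z, predicted_z)
mse_orig = mse(
    to_original_units(actual_z, mean_T, std_T),
    to_original_units(predicted_z, mean_T, std_T),
)
```

The mean cancels in the error, so the MSE in original units equals the standardized MSE multiplied by the squared standard deviation of the target.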

Results

General Performance and Ensemble Sizing

For each dataset, a set of RF regression models is first constructed, with the number of trees increasing from 5 to 500. Since the in-flight particle characteristic data capture the effect of electrode ageing, the prediction error is a direct indication of how well each model copes with the electrode usage time. The corresponding out-of-sample MSE (i.e. computed from the testing data) versus the number of trees for the three RF model sets is shown in Fig. 7. The MSE of the simple embedded RF model is much smaller than that of the baseline model, and that of the differenced embedded RF model is substantially smaller still. In all three cases, the MSE initially decreases as the number of trees increases. Beyond a certain point, a further increase in the number of trees no longer reduces the MSE. This transition to the error plateau provides a guideline for the required number of trees in the ensemble for the selected hyperparameters. Figure 7 indicates that the baseline RF model needs about 300 trees, whereas both the simple embedded and the differenced embedded RF models (upon zooming in) need about 200 trees.

Fig. 7 The out-of-sample mean squared error versus the number of trees in the ensemble of the three RF models

Table 1 lists the in-sample MSE (i.e. computed from the training data) together with the out-of-sample MSE (i.e. computed from the testing data) for the three RF models. An out-of-sample error much higher than the in-sample error is an indication of model over-fitting (i.e. the model starts to capture the noisy patterns in the training dataset, in addition to the overall trends as originally intended). The results in Table 1 suggest that the embedding procedure reduces model over-fitting. In particular, the differencing preprocessing essentially “eliminates” model over-fitting for the present choice of hyperparameters.

Table 1 In-sample and out-of-sample mean squared error (MSE) for the three RF models. The closer the out-of-sample error (computed from the testing data) is to the in-sample error (computed from the training data), the less the model over-fits

Important Features

As mentioned above, RF provides a ranking of variable importance as the models are developed. The feature ranking of the baseline RF model is the same as that in Fig. 5, whereas the top 12 features for the simple embedded and the differenced embedded RF models are shown in Fig. 8. The differenced embedded RF model ranks the previous targets (i.e. the previous particle temperatures and velocities) as the most important attribute group, in a mixed order. After the previous targets, the model neatly ranks the torch currents, followed by the torch cooling, then the voltages, the raw powers and finally the electrode usage times. The simple embedded RF model, in contrast, neatly ranks the previous particle temperatures first, followed by the previous particle velocities and then the rest of the attributes in a mixed manner. The rankings from both embedded RF models underline the importance of including the previous targets as attributes. Such consideration is not possible with the baseline RF model.

Fig. 8 Rankings of features for differenced embedded (a) and simple embedded (b) RF models

Performance Comparison

The performance of a baseline RF model with 300 trees is then compared with that of the two embedded RF models with 200 trees each. Here, the tree depth is limited to 3 for all three RF models to better contrast their performance.

The comparison of the in-flight particle temperature predictions is shown in Fig. 9. The predictions are plotted against the actual values. The better the predictions are, the closer they lie to the diagonal line. The scatter of the predictions decreases progressively from the baseline model (\({R}^{2}=0.928\)), through the simple embedded RF model (\({R}^{2}=0.950\)), to the differenced embedded RF model (\({R}^{2}=0.999\)), for which it is the smallest.
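For reference, a minimal sketch of how such an \({R}^{2}\) value can be computed from predicted and actual values is given below. It assumes the usual coefficient of determination (one minus the ratio of the residual sum of squares to the total sum of squares); the text does not specify the exact definition used, and the numbers are made up.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when the actual values are constant")
    return 1.0 - ss_res / ss_tot

actual = [1.0, 2.0, 3.0, 4.0]                   # made-up values
r2_perfect = r_squared(actual, actual)          # all points on the diagonal
r2_mean = r_squared(actual, [2.5] * 4)          # always predicting the mean
```

Predictions lying exactly on the diagonal give a value of 1, while a model that always predicts the mean of the actual values gives 0.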

Fig. 9 Comparison of the in-flight particle temperature predictions of the baseline (a), the simple embedded (b) and the differenced embedded (c) RF models. Note that all the attributes and the targets, including the in-flight particle temperature, are normalized by first subtracting their respective mean and then dividing by their respective standard deviation

Figure 10 shows the comparison of the in-flight particle velocity predictions. A similar improvement trend can be observed, starting from the baseline model (\({R}^{2}=0.822\)), improving with the simple embedded RF model (\({R}^{2}=0.965\)), and reaching the best result with the differenced embedded RF model (\({R}^{2}=0.999\)).

Fig. 10 Comparison of the in-flight particle velocity predictions of the baseline (a), the simple embedded (b) and the differenced embedded (c) RF models. Note that all the attributes and the targets, including the in-flight particle velocity, are normalized by first subtracting their respective mean and then dividing by their respective standard deviation

Selected predictions of the in-flight particle temperatures and velocities are shown in Fig. 11. Interestingly, a few predictions of the baseline model are better than those of the simple embedded RF model here, e.g. the first velocity prediction on the left.

Fig. 11 Selected predictions of the in-flight particle temperatures (a) and velocities (b) with the three RF models

Discussions

In the literature, the majority of the predictive models proposed for APS in-flight particle characteristics are neural network based. The present investigation demonstrates that ensemble methods like RF are also suitable for modeling the APS in-flight particle characteristics. Compared with neural network-based models, an RF model is considerably easier to train. Only a few hyperparameters (e.g. the number of features included per tree, the depth of the tree, the minimum leaf node size) are involved in specifying the layout of an individual decision tree (Ref 36). The ‘forest’ can then be grown in a straightforward manner by repeatedly building a new tree from data randomly sampled with replacement. In contrast, to develop a neural network, one first needs to design the network structure by selecting the number of hidden layers, the number of neurons in each layer, their connection topology, the activation function, etc. (Ref 37, 38). There is no specific rule of thumb for a successful neural network topology. Secondly, one needs to find the weighting parameters for the network components by solving a non-convex optimization problem, for example using backpropagation (Ref 37, 38). In general, an optimal set of weights cannot be computed directly, nor is there any guarantee of global convergence when searching for one. The process is generally computationally expensive. Besides, the complexity of the neural network dictates the minimum amount of data needed for its training. The present work thus provides a straightforward alternative machine learning modeling approach for APS in-flight particle characteristics prediction.

The present work also underlines the benefits of treating the APS data as a time series when modeling the in-flight particle characteristics, since electrode ageing is of concern. The importance of this time-evolving (i.e. time series) aspect of the in-flight particle characteristics as the electrodes age has not been commonly acknowledged and has been underappreciated in the corresponding predictive models. The majority of the predictive models proposed for APS in-flight particle characteristics do not consider the time evolution of the process input parameters. Only a few articles consider this aspect. For instance, Liu et al. examined a time-series-sensitive nonlinear autoregressive exogenous (NLARX) model combined with a wavelet network to predict the in-flight particle characteristics of a mono-cathode plasma spray torch using a system identification approach (Ref 8). It was mentioned that such an approach can capture the dynamic characteristics better than a conventional neural network. Kim examined a linear autoregressive with exogenous inputs (ARX) model and ARX-type neural network models to develop a control-oriented dynamic model of an inductively coupled plasma torch, similar to the work of Liu et al. (Ref 39). The present work uses RF to prepare three models: the baseline, the simple embedded and the differenced embedded. The latter two treat the in-flight particle characteristics as time series via the embedding procedure, which incorporates the data from the previous time steps into the present record, and both perform much better than the baseline model for one-step prediction. In particular, it is highly beneficial to first make the time series stationary using the differencing technique. Since all three models are constructed using RF, the prediction improvement seen in the two embedded models can be attributed to their datasets, which contain information from the previous time steps. 
This provides clear supporting results to illustrate the benefits of considering the in-flight particle characteristics as time series over the lifespan of the electrodes.

Investigations are currently under way to evaluate other multivariate predictive models applicable to time series, as well as to compare the performance of the alternative time series enrichment strategy, which performs feature engineering and selection from the statistics of the time series (e.g. mean, standard deviation). The corresponding findings will be communicated in future work. Although knowledge and technological gaps remain before a production-ready predictive model for the in-flight particle characteristics can be deployed, the present study lays a solid foundation for advancing towards such a goal.

Conclusions

This work explored the applicability of two ensemble methods, namely RF and GB, to predict and forecast the multivariate APS in-flight particle characteristics, with torch electrode ageing considered as a time series. Experiments were performed to record simultaneously the input process parameters, the in-flight powder particle characteristics and the electrode usage time, using a brand new pair of electrodes until they were no longer usable. Only a single spray condition was considered. The in-flight powder particle characteristics clearly reflect the effect of electrode ageing. Two time series embedding strategies were considered. The first simply stacks the attributes and the targets from the previous \(n\) time steps without any modification, while the second first performs differencing to make the time series stationary before the embedding procedure. To select the more influential attributes for the embedding procedure, an initial feature ranking was performed using both RF and GB. This ranking underlines the advantage of models that can take the inter-target correlation into account in multivariate regression modeling. Hence, for the present application, RF is more suitable than GB. The superior prediction performance and the feature rankings of both embedded RF models show that it is better to treat the APS data as a time series for the in-flight particle characteristic prediction. In particular, it is also advantageous to first make the time series stationary using the traditional differencing technique, even when modeling with RF. A comparison with other multivariate regression modeling techniques for time series is currently under way, and the findings will be communicated in future work. Ultimately, the goal is to develop a predictive model for coating characteristics and performance that can serve as a guiding tool for effective torch usage and coating quality control.