1 Introduction

Statistical process monitoring (SPM) is usually employed in industrial processes to detect and eliminate assignable causes that deteriorate the product outcome. SPM is the major field for controlling process variations, leading to lower costs for waste, scrap, rework and claims, better quality, and more insight into the capability of the process. Seven main tools, comprising scatter diagrams, Pareto charts, control charts, histograms, cause-and-effect diagrams, check sheets and stratification, are utilized in SPM to implement inspection and monitoring procedures [1]. Among them, control charts are the most successful and effective tools for quality control of manufacturing processes [2,3,4].

To employ a control chart for process monitoring, two phases, Phase I and Phase II, must be distinguished. Phase I aims at obtaining proper estimates of the process parameters, whereas Phase II monitoring aims to detect assignable causes that move the process from the In-Control (IC) to an Out-of-Control (OC) state [5,6,7,8,9]. The Average Run Length (ARL) and the Standard Deviation of Run Length (SDRL) are two common performance indicators in Phase II. The ARL is the average number of samples taken by the predefined control chart before it triggers an OC signal. Thus, a control chart with larger (smaller) values of the ARL is to be preferred when the underlying process is in the IC (OC) state [the more common notation is ARL0 (ARL1)] [10,11,12]. The SDRL is defined analogously as a secondary criterion in Phase II (i.e., SDRL0 and SDRL1).

There are two common approaches to monitor a manufacturing process with the help of control charts, namely monitoring quality characteristics and profile monitoring [13]. In this paper, we focus on profile monitoring. Here, the quality of a process or product is modelled via a functional relationship between a response (dependent) variable and one or more explanatory (independent) variable(s) [14]. The aim of profile monitoring is to check the stability of a predefined IC relationship (or profile) over time, and it is essential to obtain a true OC signal as soon as possible when the IC model shifts to an unknown OC profile [15].

Different IC models can be employed depending on the nature of the underlying problem, such as circular [16], linear [5, 17,18,19,20,21,22], logistic [23,24,25], nonlinear [26], nonparametric [27, 28], multichannel [29], polynomial [30] or quadratic [31]. Among them, linear profiles have received the most attention in the literature [32, 33]. The majority of previous studies on linear profile monitoring assumes that the error terms are independent both within and between profiles. However, this assumption is often violated in manufacturing practice, where autocorrelated profiles are common, and consequently, conventional approaches may lead to inaccurate outcomes for this type of profiles.

The related literature distinguishes autocorrelated profiles with within- and with between-profile correlation [34,35,36,37]. For the first group, Soleimani, Noorossana and Amiri [38] developed four control charts, including T2 and three well-known Exponentially Weighted Moving Average (EWMA) charts, considering the first-order Autoregressive model, i.e., AR(1). The results showed the superiority of the EWMA-based approaches over T2. Autoregressive Moving Average (ARMA), Vector ARMA (VARMA) etc. are other, more complex models that have been investigated in this field [39,40,41,42,43]. Due to the higher application potential of the second group, this paper focuses on between-profile autocorrelation. To the best of the authors' knowledge, the pioneering work is Noorossana, Amiri and Soleimani [44], in which T2, EWMA of residuals (EWMA/R) and transformed individual EWMA (EWMA-3) control charts were developed for situations where autocorrelation exists between profiles. Similar to Soleimani, Noorossana and Amiri [38], they concluded that the EWMA-based methods outperform T2. Wang and Lai [45] aggregated the individual EWMA statistics into a Multivariate EWMA (MEWMA) control chart for profiles with between-autocorrelation and showed that MEWMA outperforms T2. Khedmati and Niaki [46] considered both linear and polynomial profiles; they first utilized the U statistic to remove the effect of autocorrelation and then developed a T2-based control chart. The experimental results showed that this method performs better than conventional T2 control charts, but comparisons with EWMA are missing. Koosha and Amiri [47] proposed a similar T2-based control chart for monitoring autocorrelated logistic profiles. Wang and Huang [48] modified the estimation procedure of the EWMA approach, and their simulation results demonstrated a faster detection ability than that of the conventional EWMA.

From the literature, it can be inferred that autocorrelation is very likely to occur in practical applications. Early detection of OC conditions is therefore of great practical importance, as a delay in detection may result in the production of nonconformities and additional costs. However, conventional control charts such as T2, EWMA/R and EWMA-3 do not perform well in this respect, as their performance deteriorates in the presence of autocorrelation compared with the uncorrelated case; compare, for example, the ARL1 results in Noorossana, Amiri and Soleimani [44] with those in Kim et al. [20].

Hence, proposing a novel control chart with a tangible reduction of the OC signaling time for autocorrelated profiles is crucial. To address this challenge, several recent studies have incorporated machine learning techniques into the SPM context for monitoring roundness [49], nonlinear [50,51,52], linear [53, 54] and logistic [25, 55, 56] profiles. As a different approach, Chen et al. [57] employed a deep learning technique, called stacked denoising autoencoders, to monitor autocorrelated profiles. Specifically, this scheme extracts a number of features from the process using autoencoders, and the extracted features are then used to develop control charts based on T2 and EWMA. In other words, the main task of their approach is to select proper features from the process, whereas the direct usage of machine learning techniques as a control chart would be more promising. As far as the authors know, there are no further articles in which machine learning techniques are employed for monitoring autocorrelated profiles.

The aim of this paper is to develop a robust control chart based on machine learning techniques that alleviates the above-mentioned challenges, i.e., one that reduces the values of ARL1 and SDRL1 for autocorrelated linear profiles and thereby enables early detection of OC situations in Phase II. To achieve this, three combinations of input features based on the effect of the mean of responses, the mean of errors, and the T2 statistic, each in the current and the previous sample, are defined to be fed into the machine learning techniques for monitoring profiles with between-autocorrelation of first order, i.e., AR(1). Since each machine learning technique performs differently on different problems, six techniques ranging from shallow to deep structures, including the adaptive neuro-fuzzy inference system (ANFIS), an artificial neural network (ANN) with Back-Propagation (BP) training, a Convolutional Neural Network (CNN), a long short-term memory (LSTM) network, a Radial Basis Function (RBF) network and support vector regression (SVR), are employed to find the most appropriate one. To sum up, the main contributions of this paper are as follows:

  • Improving the detection ability of Phase II control charts for monitoring linear autocorrelated profiles with the help of machine learning techniques,

  • Defining different combinations of input features based on the effect of the mean of responses, the mean of errors, and T2 statistic, each in the current and previous sample, for monitoring the between-autocorrelation effect of profiles,

  • Evaluating the performance of the defined input features and finding the best combination using the proposed machine learning-based control chart, and

  • Identification of the most appropriate machine learning technique under the most suitable input combination for this problem.

The rest of this article is organized as follows. In Sect. 2, definitions of autocorrelated linear profiles are discussed. Section 3 presents the framework of the proposed approach. Results of simulation studies regarding performance comparisons are given and discussed in Sect. 4. To show the effectiveness of our method, an illustrative example is given in Sect. 5. Finally, Sect. 6 gives some conclusions and suggests future research directions.

2 Preliminaries

In this section, first, the general relations of linear profiles with between-autocorrelated error terms are presented. Then, three common control charts, namely T2, EWMA/R and EWMA-3, are briefly introduced. Finally, details about the Ordinary Least Squares (OLS) estimation of the parameters are discussed. These basics are needed because (1) our proposed method employs the T2 statistic as an input feature, and (2) these conventional charts serve as benchmarks in our analyses.

2.1 The linear autocorrelated profile in Phase II

A common linear profile, which is the simplest but the most fundamental type of profiles [19], is defined as:

$$\begin{aligned} &Y_{ij} = A_{0} + A_{1} X_{i} + \varepsilon_{ij} ,\varepsilon_{ij} \sim N\left( {0,\sigma^{2} } \right),\\ & i = 1,2, \ldots ,n;\,j = 1,2, \ldots , \end{aligned}$$
(1)

where Xi represents the explanatory variable in a linear profile and the response variable Yij is the quality characteristic under study. The parameters of the above IC model (intercept A0, slope A1 and error variance σ2) are estimated from Phase I samples, and it is usually assumed that the sample size n and the independent variable Xi are fixed for each profile. When there is an AR(1) structure between the random error terms, (1) becomes:

$$ \begin{aligned}&Y_{ij} = A_{0} + A_{1} X_{i} + \varepsilon_{ij} ,\varepsilon_{ij} = \phi \varepsilon_{{i\left( {j - 1} \right)}} + a_{ij} ,a_{ij} \sim N\left( {0,\sigma^{2} } \right),\\&i = 1,2,...,n;\quad j = 1,2, \ldots , \end{aligned}$$
(2)

where ϕ is a constant autocorrelation coefficient, which is assumed to be known in Phase II. To monitor the above IC profile, we briefly present three common approaches in the following subsections.
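
For illustration, the following MATLAB sketch simulates a stream of profiles from model (2). It assumes the IC values used later in Sect. 4 (A0 = 3, A1 = 2, σ = 1 and X = 2, 4, 6, 8), picks ϕ = 0.1 arbitrarily, and starts the error chain from its stationary distribution; all of these choices are illustrative.

```matlab
% Simulate J successive profiles from the AR(1) model in Eq. (2).
A0 = 3; A1 = 2; sigma = 1; phi = 0.1;      % illustrative IC parameters (Sect. 4)
X = [2 4 6 8]'; n = numel(X);              % fixed design points
J = 1000;                                  % number of profiles to generate
Y = zeros(n, J);                           % Y(:, j) holds the j-th profile
err = (sigma/sqrt(1 - phi^2))*randn(n, 1); % stationary start of the error chain
for j = 1:J
    err = phi*err + sigma*randn(n, 1);     % between-profile AR(1) update, a_ij ~ N(0, sigma^2)
    Y(:, j) = A0 + A1*X + err;             % responses of the j-th profile
end
```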

2.2 The T2 control chart for monitoring autocorrelated profiles in Phase II

With some calculations, it can easily be shown that the estimated responses of the jth generated sample are obtained as follows [44]:

$$ \hat{y}_{ij} = \phi y_{{i\left( {j - 1} \right)}} + \left( {1 - \phi } \right)(A_{0} + A_{1} X_{i} ). $$
(3)

Thus, the empirical residuals can be written as:

$$ e_{ij} = \hat{y}_{ij} - y_{ij} . $$
(4)

Noorossana, Amiri and Soleimani [44] used a modified form of the T2 statistic proposed by Kang and Albin [14] for the simple linear profile:

$$ t_{j}^{2} = \underline{e}_{j} \Sigma_{e}^{ - 1} \underline{e}_{j}^{\prime} , \quad \underline{e}_{j} = \left( {e_{1j} ,e_{2j} , \ldots ,e_{nj} } \right), $$
(5)

where Σe is the symmetric n × n matrix σ2I. Since the chart statistic \(t_{j}^{2}\) is nonnegative by construction, it is only compared with a predefined Upper Control Limit (UCLT) to obtain an OC signal; the Lower Control Limit (LCLT) equals 0.
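
As a minimal sketch under the notation above, the T2 monitoring step for one incoming profile may be coded as follows in MATLAB; here, yPrev and yCurr denote the (n × 1) response vectors of samples j−1 and j, and UCLT is assumed to have been calibrated beforehand by simulation.

```matlab
% T2 monitoring step per Eqs. (3)-(5) for one incoming profile.
yhat = phi*yPrev + (1 - phi)*(A0 + A1*X);  % estimated responses, Eq. (3)
e    = yhat - yCurr;                       % empirical residuals, Eq. (4)
t2   = (e'*e)/sigma^2;                     % Eq. (5), using Sigma_e = sigma^2*I
signalT2 = t2 > UCLT;                      % OC signal iff t2 exceeds UCL_T
```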

2.3 The EWMA/R control chart for monitoring autocorrelated profiles in Phase II

In the EWMA/R control chart, two simultaneous statistics monitor the generated profiles. The first statistic is related to the mean of the residuals and is defined as follows [44, 53]:

$$ z_{j} = \theta \overline{e}_{j} + (1 - \theta )z_{(j - 1)} ,\overline{e}_{j} = \frac{1}{n}\sum\limits_{i = 1}^{n} {e_{ij} } , $$
(6)

In (6), it holds z0 = 0, and θ is the EWMA constant that usually has a value between 0.1 and 0.9 [20]. Following previous works [21, 53, 54, 58], θ is set to 0.2 in this paper.

The second statistic of the EWMA/R chart is the range of the empirical residuals defined by [37, 48]:

$$ r_{j} = \max_{i} (e_{ij} ) - \min_{i} (e_{ij} ). $$
(7)

The EWMA/R declares the process as IC if both of the following conditions are met [14]:

$$ \begin{aligned}- L\sigma \sqrt {\frac{\theta }{{\left( {2 - \theta } \right)n}}} < z_{j} < L\sigma \sqrt {\frac{\theta }{{\left( {2 - \theta } \right)n}}} ,\\ \sigma \left( {d_{2} - Ld_{3} } \right) < r_{j} < \sigma \left( {d_{2} + Ld_{3} } \right)\end{aligned} $$
(8)

In (8), the value of L is assigned to reach a predefined ARL0, while d2 and d3 are two constants that depend on the sample size (see Montgomery [1] for more details).
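
A minimal MATLAB sketch of one EWMA/R monitoring step is given below; the residual vector e is computed as in (4), z carries over between samples (z = 0 at start-up), L is assumed to have been calibrated by simulation, and d2 = 2.059, d3 = 0.880 are the standard constants for n = 4.

```matlab
% One EWMA/R update per Eqs. (6)-(8).
theta = 0.2;                                % smoothing constant (see above)
d2 = 2.059; d3 = 0.880;                     % range-chart constants for n = 4
z = theta*mean(e) + (1 - theta)*z;          % EWMA of the mean residual, Eq. (6)
r = max(e) - min(e);                        % residual range, Eq. (7)
wZ  = L*sigma*sqrt(theta/((2 - theta)*n));  % half-width of the z-limits, Eq. (8)
icZ = abs(z) < wZ;                          % first condition in (8)
icR = (r > sigma*(d2 - L*d3)) && (r < sigma*(d2 + L*d3));  % second condition in (8)
signalEWMAR = ~(icZ && icR);                % OC signal if either condition fails
```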

2.4 The EWMA-3 control chart for monitoring autocorrelated profiles in Phase II

To solve the problem of dependency between the estimators in linear profiles, Kim, Mahmoud and Woodall [20] suggested centering the explanatory variables (zero mean), for which the least squares estimators of slope and intercept become independent random variables. With this coding (transformation), the EWMA-3 approach has the IC model

$$ \begin{aligned}Y_{ij} = B_{0} + B_{1} X_{di} + \varepsilon_{ij} ,\quad \varepsilon_{ij} = \phi \varepsilon_{{i\left( {j - 1} \right)}} + a_{ij} ,\quad a_{ij} \sim N\left( {0,\sigma^{2} } \right),\\i = 1,2, \ldots ,n;\quad j = 1,2, \ldots ,\end{aligned} $$
(9)

where the coded explanatory variables (\({X}_{di}={X}_{i}-\overline{X }\)) lead to the transformed IC intercept \(B_{0} = A_{0} + A_{1}\overline{X}\); \({A}_{0}\) and \({A}_{1}\) are defined as in (1). Note that the transformed IC slope is B1 = A1 in this approach. The OLS estimation of the parameters (\({\widehat{B}}_{0j},{\widehat{B}}_{1j},{\widehat{\sigma }}_{j}^{2}\)) generates three separate EWMA-based errors for the intercept (\({e}_{Ij}\)), the slope (\({e}_{Sj}\)) and the standard deviation (\({e}_{ij}\)), as follows:

$$ \begin{aligned} e_{Ij} & = \hat{B}_{0j} - \phi \hat{B}_{{0\left( {j - 1} \right)}} - \left( {1 - \phi } \right)B_{0} , \\ e_{Sj} & = \hat{B}_{1j} - \phi \hat{B}_{{1\left( {j - 1} \right)}} - \left( {1 - \phi } \right)B_{1} , \\ e_{ij} & = y_{ij} - \phi y_{{i\left( {j - 1} \right)}} - \left( {1 - \phi } \right)\left( {B_{0} + B_{1} X_{di} } \right), \\ {\text{MSE}}_{j} & = \frac{1}{n}\mathop \sum \limits_{i = 1}^{n} e_{ij}^{2} . \\ \end{aligned} $$
(10)

Based on the OLS estimation, the Mean Square Error (MSE) of the jth profile is considered as the estimator of the error variance (for details, see Kim et al. [20], Huwang et al. [59] and Yeganeh and Shadman [54]); thus, three chart statistics can be calculated as follows:

$$ \begin{aligned} {\text{EWMA}}_{Ij} & = \theta e_{Ij} + \left( {1 - \theta } \right){\text{EWMA}}_{{I\left( {j - 1} \right)}} , \\ {\text{EWMA}}_{Sj} & = \theta e_{Sj} + \left( {1 - \theta } \right){\text{EWMA}}_{{S\left( {j - 1} \right)}} , \\ {\text{EWMA}}_{Ej} & = \max \left[ {\theta \left( {{\text{MSE}}_{j} - 1} \right) + \left( {1 - \theta } \right){\text{EWMA}}_{{E\left( {j - 1} \right)}} ,0} \right]. \\ \end{aligned} $$
(11)

The control limits of the three separate control charts are designed as:

$$ \begin{aligned} {\text{UCL}}_{I} & = - {\text{LCL}}_{I} = L_{I} \sigma \sqrt {\frac{\theta }{{n\left( {2 - \theta } \right)}}} \\ {\text{UCL}}_{S} & = - {\text{LCL}}_{S} = L_{S} \sigma \sqrt {\frac{\theta }{{n\left( {2 - \theta } \right)}}} \\ {\text{UCL}}_{E} & = L_{E} \sqrt {\frac{{\theta {\text{Var}}\left( {{\text{MSE}}_{j} } \right)}}{{\left( {2 - \theta } \right)}}} \\ \end{aligned} $$
(12)

It is worth noting that \({\mathrm{LCL}}_{E}=0\); suggestions regarding \(\mathrm{Var}({\mathrm{MSE}}_{j})\) can be found in Kim, Mahmoud and Woodall [20], Noorossana et al. [44] and Hosseinifard et al. [53]. EWMA-3 triggers an OC signal when at least one of the statistics exceeds its control limits. The constants LI, LS and LE are usually adjusted to reach a desired value of ARL0 in such a way that each of the separate charts achieves an identical individual ARL0.
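
The sketch below condenses one EWMA-3 monitoring step in MATLAB. B0hat and B1hat (prevB0, prevB1) denote the OLS estimates of the current (previous) profile, Xd the centered design points, the three EWMA statistics carry over between samples (all 0 at start-up), and LI, LS, LE and varMSE (= Var(MSEj)) are design inputs taken from the cited references; all variable names are illustrative.

```matlab
% One EWMA-3 update per Eqs. (10)-(12).
eI = B0hat - phi*prevB0 - (1 - phi)*B0;            % intercept error
eS = B1hat - phi*prevB1 - (1 - phi)*B1;            % slope error
eV = yCurr - phi*yPrev - (1 - phi)*(B0 + B1*Xd);   % n x 1 residual vector
MSEj = mean(eV.^2);                                % Eq. (10)
ewmaI = theta*eI + (1 - theta)*ewmaI;              % Eq. (11)
ewmaS = theta*eS + (1 - theta)*ewmaS;
ewmaE = max(theta*(MSEj - 1) + (1 - theta)*ewmaE, 0);
limIS = @(Lf) Lf*sigma*sqrt(theta/(n*(2 - theta)));          % limits, Eq. (12)
signal3 = abs(ewmaI) > limIS(LI) || abs(ewmaS) > limIS(LS) ...
       || ewmaE > LE*sqrt(theta*varMSE/(2 - theta));         % OC if any chart signals
```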

2.5 OLS estimation of the parameters

Since the parameters estimated via OLS are employed as inputs of the machine learning techniques, some details on the OLS estimators are given below [20, 21, 53]. While the intercept of the original and the transformed model is estimated by \({\widehat{A}}_{0j}={\overline{Y}}_{j}-{\widehat{A}}_{1j}{\overline{X}}\) and \({\widehat{B}}_{0j}= {\overline{Y}}_{j}\), respectively, the slope parameter in both models is estimated via \({\widehat{A}}_{1j} = {\widehat{B}}_{1j} = \frac{ S_{{XY_{j} }} } { S_{XX} }\), where \({S}_{XY_{j}}=\sum\nolimits_{i=1}^{n}Y_{ij}(X_{i}-{\overline{X}})\) and \({S}_{XX}=\sum\nolimits_{i=1}^{n}(X_{i}-{\overline{X}})^{2}\) (with \({\overline{Y}}_{j}=\frac{1}{n}\sum\nolimits_{i=1}^{n}Y_{ij}\) and \({\overline{X}}=\frac{1}{n}\sum\nolimits_{i=1}^{n}X_{i}\)). Note that MSEj as defined in (10) is equivalent to the estimator of the error variance.
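
For completeness, a brief MATLAB sketch of these OLS computations for one profile (response vector y) reads:

```matlab
% OLS estimates of Sect. 2.5 for the j-th profile.
Xbar = mean(X); Ybar = mean(y);
Sxx  = sum((X - Xbar).^2);
Sxyj = sum(y .* (X - Xbar));
B1hat = Sxyj/Sxx; A1hat = B1hat;   % slope (identical in both parameterizations)
B0hat = Ybar;                      % transformed intercept
A0hat = Ybar - A1hat*Xbar;         % original intercept
resid = y - (A0hat + A1hat*X);     % fitted residuals
sigmahat = sqrt(mean(resid.^2));   % error s.d. via the profile-wise MSE (cf. (10))
```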

3 The proposed control chart for monitoring autocorrelated profiles

The basic idea of this paper is to use machine learning techniques instead of statistical control charts for monitoring profiles with between-autocorrelation. For this purpose, several features are extracted from the process and embedded into the machine learning techniques. Using training patterns together with the obtained control limits helps to improve decision-making about the process.

To employ a machine learning technique as a control chart, four main steps are defined in the following. In the first step, the structure of the input features and outputs of a machine learning technique are determined, and then, a training data set based on the input features and outputs is generated by simulating IC and OC profiles in the second step. The third step uses the generated data set to train a machine learning technique, and finally, by the definition of a control limit, the machine learning technique provides information about the process condition in the fourth step. Figure 1 shows a step-by-step flowchart regarding the proposed method in monitoring autocorrelated profiles. Moreover, details of these steps are presented in the following subsections.

Fig. 1 The general step-by-step flowchart of the proposed method

3.1 Defining the structure of input features and outputs

Extracting proper input features is a key step in the implementation of machine learning techniques [60]. In the literature, various strategies have been employed for extracting features. For example, Chen et al. [57] and Sergin and Yan [51] used autoencoders to obtain Phase I information, which is not compatible with this study. Hosseinifard et al. [53] and Yeganeh and Shadman [54] suggested taking the OLS estimates of the linear profile parameters as input features. This input structure performs well for profiles without autocorrelation, but it does not provide reliable results when between-profile autocorrelation is present, since the effect of autocorrelation is not captured by the distribution of the OLS estimators. Therefore, in addition to these inputs (\({\widehat{A}}_{0j}\), \({\widehat{A}}_{1j}\), \({\widehat{\sigma }}_{j}\)), further input features are proposed in the following, which are suitable to account for between-autocorrelation effects of first order (i.e., AR(1) autocorrelation). In particular, the proposed input structure addresses the effect of the mean of responses, the mean of errors, and the T2 statistic, each in the current sample (j) and the previous sample (j−1). It should be mentioned that one of the main benefits of machine learning-based algorithms is their independence from the basic assumptions about the process. Thus, by replacing the estimates with proper ones for other autocorrelation models such as MA, ARMA and ARIMA, the proposed method can easily be applied to other process conditions.

In machine learning, there are several approaches for normalizing and scaling the inputs, one of which is based on the distribution of the parameters. For instance, Yeganeh and Shadman [54] scaled the parameters of simple linear profiles, i.e., intercept, slope and standard deviation, using the normal and chi-square distributions (the relations are not reported for brevity; the interested reader is referred to Eqs. (4) to (9) in Yeganeh and Shadman [54]). Since autocorrelation was not considered in Yeganeh and Shadman [54] for the OLS estimators, we suggest capturing the between-autocorrelation effect of first order by means of deviations between the current and previous OLS estimates and their respective IC values, i.e., \({\widehat{A}}_{0j}-{A}_{0}\), \({\widehat{A}}_{1j}-{A}_{1}\), \({\widehat{\sigma }}_{j}-{\sigma }_{0}\) for the current sample and \({\widehat{A}}_{0(j-1)}-{A}_{0}\), \({\widehat{A}}_{1(j-1)}-{A}_{1}\), \({\widehat{\sigma }}_{(j-1)}-{\sigma }_{0}\) for the previous one. Using this approach, the means of the current and previous responses (\(\overline{y}_{j} ,\overline{y}_{{\left( {j - 1} \right)}}\)) are automatically incorporated in the input structure as they are functions of the OLS estimates \(\left( {\hat{A}_{0j} = \overline{y}_{j} - \hat{A}_{1j} \overline{x},\;\hat{A}_{{0\left( {j - 1} \right)}} = \overline{y}_{{\left( {j - 1} \right)}} - \hat{A}_{{1\left( {j - 1} \right)}} \overline{x}} \right)\).

By considering the above inputs, numerous input combinations can be defined in the proposed method. Having investigated several candidates, we employ the following three input combinations I, II and III with 8, 4 and 10 inputs, respectively, for each machine learning technique, based on their ability to (1) adequately address the specific effects of between-autocorrelation of type AR(1) and (2) incorporate the deviations between the OLS estimates and their respective IC values for efficient Phase II monitoring:

  (I)

    \({\widehat{A}}_{0(j-1)}-{A}_{0}\), \({\widehat{A}}_{1(j-1)}-{A}_{1}\), \({\widehat{\sigma }}_{(j-1)}-{\sigma }_{0}\), \({\widehat{A}}_{0j}-{A}_{0}\), \({\widehat{A}}_{1j}-{A}_{1}\), \({\widehat{\sigma }}_{j}-{\sigma }_{0}\), \({\overline{e}}_{j}\), \({\overline{e}}_{(j-1)}\).

  (II)

    \({\widehat{\sigma }}_{(j-1)}-{\sigma }_{0}\), \({\widehat{\sigma }}_{j}-{\sigma }_{0}\), t2(j−1), t2j.

  (III)

    \({\widehat{A}}_{0(j-1)}-{A}_{0}\), \({\widehat{A}}_{1\left(j-1\right)}-{A}_{1}\), \({\widehat{\sigma }}_{(j-1)}-{\sigma }_{0}\), \({\widehat{A}}_{0j}-{A}_{0}\), \({\widehat{A}}_{1j}-{A}_{1}\), \({\widehat{\sigma }}_{j}-{\sigma }_{0}\), \({\overline{e}}_{j}\), \({\overline{e}}_{(j-1)}\), t2(j−1), t2j.

These input combinations are motivated by their ability to consider effects regarding autocorrelation in various ways. While input combination I addresses the raw AR(1) structure of the underlying model, input combination II aims to isolate the effects of the current and previous T2 statistics extended by an additional consideration of the current and previous error variances. Finally, input combination III is the union of input combinations I and II, and therefore, combines both main effects. In these three input combinations, the following notations are utilized:

  • Estimated parameters via OLS in the previous sample (\({\widehat{A}}_{0(j-1)}\), \({\widehat{A}}_{1(j-1)}\), \({\widehat{\sigma }}_{(j-1)}\)).

  • Estimated parameters via OLS in the current sample (\({\widehat{A}}_{0j}\), \({\widehat{A}}_{1j}\), \({\widehat{\sigma }}_{j}\)).

  • Mean of error terms in the previous sample (\({\overline{e}}_{(j-1)}\)).

  • Mean of error terms in the current sample (\({\overline{e}}_{j}\)).

  • T2 statistic in the previous sample (t2(j−1)).

  • T2 statistic in the current sample (t2j).

Note that only the T2 statistic is added as an input feature, and no further statistics from other conventional control charts, in order to (1) compensate for the comparatively poor performance of the T2 control chart, and (2) avoid overparameterization and undue complexity with regard to the other common competitors.
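
Collecting the quantities above, the feature vector of input combination III for the jth sample may be assembled as in the following sketch; the variable names (prevA0, ebarCurr, t2Prev, etc.) are illustrative, and combinations I and II are simply subsets of this vector.

```matlab
% Feature vector of input combination III (10 inputs) for sample j.
xIII = [prevA0 - A0, prevA1 - A1, prevSigma - sigma0, ...  % previous-sample deviations
        A0hat  - A0, A1hat  - A1, sigmahat  - sigma0, ...  % current-sample deviations
        ebarCurr, ebarPrev, t2Prev, t2Curr];               % error means and T2 terms
xI  = xIII(1:8);          % input combination I (8 inputs)
xII = xIII([3 6 9 10]);   % input combination II (4 inputs)
```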

3.2 Generation of the training data set

To construct the training data set, IC and OC profiles are generated by means of simulation. From the simulated profiles, the inputs are constructed based on the three predefined input structures I, II and III of Sect. 3.1. For example, in input combination III, the inputs of the jth generated profile consist of the 10 features \({\widehat{A}}_{0(j-1)}-{A}_{0}\), \({\widehat{A}}_{1\left(j-1\right)}-{A}_{1}\), \({\widehat{\sigma }}_{(j-1)}-{\sigma }_{0}\), \({\widehat{A}}_{0j}-{A}_{0}\), \({\widehat{A}}_{1j}-{A}_{1}\), \({\widehat{\sigma }}_{j}-{\sigma }_{0}\), \({\overline{e}}_{(j-1)}\), \({\overline{e}}_{j}, t^{2}_{(j-1)}\,{\text{and}}\,t^{2}_{j}\). Following the suggestions of Hosseinifard, Abdollahian and Zeephongsekul [53], equal numbers of IC and OC profiles are generated, with the target values of IC and OC profiles set to 0 and 1, respectively.

Hence, we consider the size of the training data set (number of rows) as 6G. First, 3G IC profiles are generated and their input features are recorded with a target value equal to 0. Then, 3G OC profiles (G profiles with shift in intercept, G profiles with shift in slope, and G profiles with shift in standard deviation) are obtained in the same way with a target value equal to 1. Finally, the training data set has 6G rows and 9 (8 + 1), 5 (4 + 1) and 11 (10 + 1) columns for input combinations I, II and III, respectively (note that the last column represents the target values). For better understanding, pseudo code 1 illustrates the process of data set generation for input combination III (an analogous procedure also applies to input combinations I and II).

Pseudo code 1: Generation of the training data set for input combination III
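
The following MATLAB sketch mirrors pseudo code 1 under illustrative assumptions: G and the OC shift sizes are example values, and simFeatures() is a hypothetical helper that simulates a pair of consecutive profiles from (2) with the given parameters and returns the corresponding 10 features of Sect. 3.1 as a row vector.

```matlab
% Build the 6G x 11 training data set for input combination III.
G = 5000;                               % illustrative choice of G
shifts = [0.4 0 0; 0 0.05 0; 0 0 0.2];  % example shifts in intercept, slope, s.d.
D = zeros(6*G, 11);
for g = 1:3*G                           % 3G IC rows, target value 0
    D(g, :) = [simFeatures(A0, A1, sigma, phi, X), 0];
end
row = 3*G;
for s = 1:3                             % 3G OC rows (G per parameter), target value 1
    for g = 1:G
        row = row + 1;
        D(row, :) = [simFeatures(A0 + shifts(s,1)*sigma, ...
                                 A1 + shifts(s,2)*sigma, ...
                                 sigma*(1 + shifts(s,3)), phi, X), 1];
    end
end
```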

3.3 Training a machine learning technique

After obtaining a training data set, a machine learning technique can be trained on its basis. In this paper, six common machine learning techniques, i.e., ANFIS, ANN, RBF, SVR, CNN and LSTM, with the ability to generate continuous outputs are investigated. For better understanding, we provide a brief description of the parameters and adjustments of each method in the MATLAB software.

  • The ANFIS approach utilizes fuzzy IF–THEN rules and trains its parameters with basic algorithms such as subtractive clustering and grid partitioning, which differ in how the rules are generated. The ‘genfis’ function, which generates a single-output Sugeno fuzzy inference system, is used to obtain a grid partition for the training procedure.

  • The ANN structure, especially a Multi-Layer Perceptron (MLP), with gradient-based optimization is employed here. An important issue in ANNs is related to the adjustment of the number of hidden layers and the neurons. The function ‘feedforwardnet’ utilises a fully connected network architecture using the BP Levenberg–Marquardt training algorithm (‘trainlm’ option). In this study, a single hidden layer with 10 neurons is suggested for training.

  • RBF considers different training approaches based on the idea of clustering. It has only one hidden layer, to which neurons with a distinct spread (radius) are added until the pre-specified error or the maximum number of neurons is reached. Because the training procedure is driven solely by the reduction of the training error, the probability of overfitting is generally high with this approach. The function ‘newrb’ with spread (radius) 1, error goal 0.05 and a maximum of 100 neurons is selected for training purposes.

  • As additional machine learning techniques, SVM and SVR obtain their parameters by constructing a hyperplane in the problem space that minimizes the gaps between the predicted and observed values, using a kernel function to map the inputs into that space. As we aim at a continuous output (regression problem) in this study, the SVR function ‘fitrsvm’ with the Gaussian kernel function is used for training (more details about the classification and regression nature of machine learning-based control charts can be found in Yeganeh and Shadman [54]). The parameter epsilon, which determines the margin between the real and the estimated planes, is an important parameter of the SVR technique. As the common range of epsilon is [0.3, 0.5] [61], the value 0.3 is selected in this paper.

  • The deep learning technique CNN is utilized to investigate its detection ability in the SPM field. Generally, a CNN layer moves filters along the input vertically and horizontally, computes the dot product of the weights and the input, and then adds a bias term to derive novel features from the process. CNNs have several parameters such as padding and filter size. As the inputs of this paper are in vector form, the layers are created with the ‘convolution1dLayer’ function with filter size 5.

  • As a further deep learning technique, an LSTM network is trained to evaluate its performance. Owing to its gated memory units, which explicitly model time dependencies, LSTM can identify time-series-related patterns effectively. The most important parameter of LSTM is the number of hidden neurons in each unit, which is roughly analogous to the number of hidden neurons in common ANNs. Two LSTM layers are defined via the ‘lstmLayer’ function with 40 hidden neurons each, and the Adam optimizer is utilised to obtain the best weights. A sketch of possible instantiations of all six models is given after this list.
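
As a hedged sketch (not the authors' exact scripts), the six models might be instantiated as follows; Xtr denotes the 6G × p feature matrix and Ttr the 6G × 1 target vector of Sect. 3.2, and the Fuzzy Logic, Statistics and Machine Learning, and Deep Learning toolboxes are assumed.

```matlab
% Illustrative instantiation of the six techniques with the settings above.
fis0 = genfis(Xtr, Ttr, genfisOptions('GridPartition'));      % initial Sugeno FIS
fis  = anfis([Xtr Ttr], anfisOptions('InitialFIS', fis0));    % ANFIS training
net  = feedforwardnet(10, 'trainlm');                         % ANN-BP, 10 hidden neurons
net  = train(net, Xtr', Ttr');
rbf  = newrb(Xtr', Ttr', 0.05, 1, 100);                       % RBF: goal 0.05, spread 1, <=100 neurons
svr  = fitrsvm(Xtr, Ttr, 'KernelFunction', 'gaussian', 'Epsilon', 0.3);  % SVR
% Deep models: each feature vector is treated as a one-channel sequence of length p.
cnnLayers  = [sequenceInputLayer(1); convolution1dLayer(5, 16); reluLayer; ...
              globalAveragePooling1dLayer; fullyConnectedLayer(1); regressionLayer];
lstmLayers = [sequenceInputLayer(1); lstmLayer(40); ...
              lstmLayer(40, 'OutputMode', 'last'); ...
              fullyConnectedLayer(1); regressionLayer];
opts = trainingOptions('adam');                               % Adam optimizer
% e.g., lstmNet = trainNetwork(seqTrain, Ttr, lstmLayers, opts);
```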

3.4 Decision on the process condition

Considering the definition of target values, Hosseinifard et al. [53] set the LCL of their proposed method to 0 and denoted the UCL as Cutting Value (CV). The CV is adjusted by simulations to reach the desired value of ARL0. After adjustment of the CV, the output of the considered machine learning technique, e.g., ANN, is compared with the CV to make a decision on the process [43]. If the output of the ANN in the jth sample (Oj) is larger than the CV, this indicates an OC condition (see Fig. 1 in Hosseinifard et al. [53] for more details).

By employing this approach in our proposed framework, we can identify the process condition when comparing the output Oj of a machine learning technique with the respective CV. For better understanding, pseudo code 2 illustrates the procedure of reaching an OC signal in one iteration of simulations when input combination III is used. To compute the ARL and SDRL by means of Monte Carlo simulations, this procedure is iterated 10,000 times.

Pseudo code 2: Reaching an OC signal in one simulation iteration with input combination III
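
A compact MATLAB rendering of pseudo code 2 under stated assumptions follows: mdl is any trained model of Sect. 3.3, evalModel() is a hypothetical wrapper around the respective prediction call (e.g., evalfis for ANFIS, predict for SVR), and simProfile() / makeFeatures() are hypothetical helpers that generate the next profile from (2) and assemble the 10 features of Sect. 3.1.

```matlab
% One simulated run length for a given OC parameter setting.
RL = 0; signal = false;
yPrev = simProfile(A0, A1, sigma, phi, X, []);               % IC warm-up profile
while ~signal
    RL = RL + 1;
    yCurr = simProfile(A0oc, A1oc, sigmaOC, phi, X, yPrev);  % next (OC) profile
    Oj = evalModel(mdl, makeFeatures(yPrev, yCurr));         % model output O_j
    signal = Oj > CV;                                        % OC signal once O_j > CV
    yPrev = yCurr;
end
% Averaging RL over 10,000 such iterations yields ARL1; its standard deviation gives SDRL1.
```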

4 Simulation study

To show the effectiveness of our proposed method, a comprehensive simulation study is conducted in this section. To compare the six machine learning techniques ANFIS, ANNBP, CNN, LSTM, RBF and SVR, the CV of each method is set to reach an ARL0 equal to 200, as this is the most common value in profile monitoring. The next aim is to find the best of the three input combinations I, II and III. To this end, the three input combinations under different parameter settings are used as inputs to the machine learning techniques. Due to the page limit, we only present the results of the three input combinations for the shift in the intercept under ϕ = 0.1 in Table 1; other results are available from the authors upon request. As can be seen, nearly all machine learning techniques produced the best (i.e., lowest) ARL1 values for input combination III. This is due to the fact that input combination III combines both main effects of input combinations I and II, namely:

Table 1 ARL1 comparisons considering all shifts in the intercept (λ) for the input combinations I, II and III under ϕ = 0.1

(1) appropriately addressing the raw AR(1) structure of the underlying model, and

(2) the effects of the current and previous T2 statistics as well as of the current and previous error variances.

In other words, considering only (2), i.e., using input combination II, is not enough to reach proper results. On the other hand, concentrating on (1), i.e., using input combination I, yields better performance on average than (2). The combination of (1) and (2) via input combination III clearly strengthens the effect of (1) and leads to superior results.

Similar results are obtained for the remaining parameter settings; thus, we only present the results of input combination III for the rest of the experiments. In addition, due to the first-priority importance of detecting small shifts in the underlying process, the simulation studies mainly focus on smaller shifts. For larger shifts, generally the same patterns emerge as for smaller shifts.

For comparisons of single shifts in intercept, slope and standard deviation, the IC model is taken from Noorossana et al. [44] and Wang and Huang [48], where A0 = 3, A1 = 2 and σ2 = 1. In addition, the explanatory variables have the values 2, 4, 6, 8 (n = 4), and 0.1, 0.5, 0.9 are considered as fixed values of ϕ. For comparisons of simultaneous shifts, the IC model is extracted from Wang and Lai [45].

In Sect. 4.1, the performance of various machine learning techniques is compared and the method with the best performance is selected. In Sect. 4.2, the selected machine learning technique and conventional statistical control charts in Phase II profile monitoring are compared. Finally, Sect. 4.3 reports the performance of the best approach for simultaneous shifts in profile parameters.

4.1 Comparing different machine learning techniques based on input combination III

In this subsection, three individual shifts are considered for each parameter to compare the performance of the techniques, where the shifted parameters are A0 + λσ, A1 + ησ and γσ. The values of ARL1 associated with the different machine learning techniques are given in Table 2. To provide a fair comparison, the values of SDRL1 are additionally reported in Table 3 for each of the considered shifts. Bold values represent the approach with the best performance.

Table 2 ARL1 comparisons considering small shifts in intercept (λ), slope (η) and standard deviation (γ) for the input combination III and various values of ϕ
Table 3 SDRL1 comparisons considering small shifts in intercept (λ), slope (η) and standard deviation (γ) for the input combination III and various values of ϕ

As can be seen in Table 2, ANFIS obtains the best performance in terms of ARL1 for small shifts in intercept and slope, given ϕ = 0.1 and ϕ = 0.5. As for ϕ = 0.9, mostly ANNBP (λ = 0.6, 0.8, η = 0.075, 0.1, 0.125) and LSTM (λ = 0.2, η = 0.025, 0.05) produce better results. These results are also reflected in the values of SDRL1 with only a few exceptions (see Table 3). As for small shifts in the standard deviation, RBF outperforms other machine learning techniques, with only two exceptions for ϕ = 0.9, γ = 1.8 and 2, where CNN performs marginally better. However, RBF is not a preferable approach in comparison with ANFIS regarding shifts in intercept and slope.

Note that the deep learning techniques CNN and LSTM are generally not able to reach comparable results for most shifts, especially for smaller ones. These findings can be explained against the backdrop that deep neural networks outperform shallow networks mainly in problems with a large number of features, such as image processing and Natural Language Processing (NLP) [62, 63].

Based on the above findings and numerical evidence for a wider range of shifts, ANFIS is the best of the considered machine learning approaches regarding shifts in intercept and slope, and RBF is the superior method regarding shifts in the standard deviation. However, for ϕ = 0.9, ANFIS and RBF are not consistently the best methods. To illustrate this issue, we additionally provide the ARL1 and SDRL1 results of all techniques for ϕ = 0.9 over a wider range of shifts in Tables 4 and 5, respectively.

Table 4 ARL1 comparisons considering a wider range of shifts in intercept (λ), slope (η) and standard deviation (γ) for the input combination III and ϕ = 0.9
Table 5 SDRL1 comparisons considering a wider range of shifts in intercept (λ), slope (η) and standard deviation (γ) for the input combination III and ϕ = 0.9

To reach a better judgment about the results of Tables 4 and 5, the Relative Mean Index (RMI) is used to select the best machine learning technique. This measure is frequently utilized in SPM (see, for example, Han and Tsung [64], Perry [65] and Yeganeh et al. [24]) and considers the average relative difference from the best approach in each treatment; the smaller the RMI, the better the detection ability. Table 6 reports the values of the RMI based on the simulation results given in Tables 4 and 5.
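
For reference, a common formulation of the RMI over N shift scenarios (stated here for ARL1; the SDRL1 version is analogous) reads:

$$ \mathrm{RMI} = \frac{1}{N}\sum_{k = 1}^{N} \frac{\mathrm{ARL}_{1,k} - \mathrm{ARL}_{1,k}^{\min } }{\mathrm{ARL}_{1,k}^{\min } }, $$

where \(\mathrm{ARL}_{1,k}\) is the ARL1 of the chart under study for the kth shift and \(\mathrm{ARL}_{1,k}^{\min }\) is the smallest ARL1 among all competing charts for that shift.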

Table 6 Comparing different machine learning techniques based on the RMI criterion with respect to simulation results given in Tables 4 and 5 (ϕ = 0.9)

The order of machine learning techniques in terms of ARL1 (SDRL1) for shifts in the intercept is RBF, ANNBP, ANFIS, SVR, LSTM, CNN (ANNBP, ANFIS, RBF, SVR, LSTM, CNN). As for shifts in the slope, the order in terms of both ARL1 and SDRL1 is ANNBP, RBF, ANFIS, SVR, LSTM, CNN, while the order for shifts in the standard deviation is CNN, ANFIS, ANNBP, RBF, LSTM, SVR for both ARL1 and SDRL1. To select the machine learning technique with the best average performance regarding all the shifts, we consider the average of the RMI values in terms of ARL1 and SDRL1, respectively. Here, ANFIS (ANNBP) has the best overall average performance because the average of RMI with respect to ARL1 (SDRL1) is 0.570 (1.598), but the difference in SDRL1 between ANNBP and ANFIS (1.598 vs. 1.931) is negligible. According to these results, ANFIS is selected as the best machine learning technique, and thus we use it as benchmark technique in the subsequent simulations.

The superiority of ANFIS could be due to its two main trainable components, the antecedent and the consequent part, which are connected by fuzzy IF–THEN rules. By jointly processing the effects of the current and previous T2 statistics, the current and previous error means, and the current and previous coefficient estimates, ANFIS increases the chance of detecting unnatural trends. A similar conclusion was reached by Aziz Kalteh and Babouei [66], who reported a suitable performance of ANFIS in control chart pattern recognition problems.

4.2 Comparing the best machine learning technique with conventional statistical approaches

In this subsection, the performance of ANFIS as the best machine learning technique is compared with conventional statistical control charts based on individual and simultaneous shifts in intercept, slope and standard deviation for ϕ = 0.1. Because RBF performs better than ANFIS regarding shifts in the standard deviation, RBF is also included in the comparisons. Three competitors, T2, EWMA/R and EWMA-3, are selected following Noorossana et al. [44]. Table 7 shows the ARL1 results for ANFIS and the competitors. Note that the setups are the same as in the previous subsection, and the results of ANFIS and RBF are taken from Table 2.

Table 7 Comparing the performance of ANFIS and RBF with conventional statistical competitors considering small shifts in intercept (λ), slope (η) and standard deviation (γ) in terms of ARL1 (ϕ = 0.1)

As can be seen, ANFIS performs considerably better than the other methods for shifts in intercept and slope. The difference is especially large for smaller shifts; for example, ARL1 is 3.096 for ANFIS given λ = 0.4, while it is 99.7, 21.9 and 19.7 for the conventional competitors T2, EWMA/R and EWMA-3, respectively. However, EWMA-3 and RBF obtain the best results for shifts in the standard deviation; in particular, the statistical methods outperform ANFIS for this type of shift.

4.3 Comparing the best machine learning technique with conventional statistical approaches considering simultaneous shifts

In industrial processes, simultaneous shifts may occur, so a control chart should also be able to detect this type of shift. Wang and Lai [45] conducted several simulations on simultaneous shifts with the IC model proposed by Noorossana, Amiri and Soleimani [44]. In the following, we compare ANFIS with the two schemes reported in Wang and Lai [45], i.e., T2 and MEWMA. MEWMA is an advanced version of the EWMA control chart that integrates the effect of previous samples into one statistic, and some researchers have reported that this approach performs very well in profile monitoring [17, 21, 59]. Table 8 shows the results in terms of ARL1 for simultaneous shifts in intercept and slope. Note that we restrict the comparison to the location parameters and do not consider further simultaneous shifts involving the standard deviation, due to the superior performance of ANFIS regarding shifts in intercept and slope. This is also in line with the approach proposed in Wang and Lai [45].

Table 8 Comparing the performance of ANFIS with conventional statistical competitors considering simultaneous small shifts in intercept (λ) and slope (η) in terms of ARL1 (ϕ = 0.9)

According to Table 8, ANFIS outperforms both other methods. The deviations in terms of ARL1 are tangible; for example, the values of ARL1 are 135.73, 180.42 and 196.15 (1.91, 34.22 and 49.64) for ANFIS, MEWMA and T2, respectively, for the smallest (largest) shift λ = 0.2 and η = 0.025 (λ = 1 and η = 0.125). While there is no distinct trend for the absolute deviations between the values of ARL1 of ANFIS and each of both competitors for increasing shift sizes, there is generally an increasing behaviour for the corresponding relative deviations, i.e., \(\frac{{{\text{ARL}}_{1}^{{{\text{MEWMA}}}}- {\text{ARL}}_{1}^{{{\text{ANFIS}}}} }}{{{\text{ARL}}_{1}^{{{\text{ANFIS}}}} }}\,{\text{and}}\, \frac{{{\text{ARL}}_{1}^{{T^{2} }} - {\text{ARL}}_{1}^{{{\text{ANFIS}}}}}}{{{\text{ARL}}_{1}^{{{\text{ANFIS}}}} }}\)regarding small shifts (see Table 8). That is, the larger the shifts in slope and/or in intercept, the larger the relative deviations.

As for larger shifts in slope and/or intercept (λ>1 and η > 0.125, not tabulated due to lower relevance), the values of ARL1 regarding ANFIS decrease to a small extent, while the values of ARL1 regarding MEWMA and T2 become closer to the respective ARL1 values of ANFIS, i.e., we observe a decreasing behavior for the corresponding relative deviations regarding larger shifts. To sum up, ANFIS clearly outperforms both methods in detecting simultaneous shifts and its detection ability is especially better for lower shift sizes.

Statistical control charts usually require the fulfilment of some principal assumptions to reach their best performance, while complicated patterns in the manufacturing process may invalidate some of these assumptions and thus deteriorate their performance. In contrast, machine learning techniques face fewer challenges, provided that the input combinations and the training procedure are defined properly. From the above results it can be concluded that machine learning-based techniques, and especially ANFIS, perform better than conventional statistical methods when monitoring autocorrelated profiles; however, some computational effort may be required when implementing these approaches. Thanks to online data collection systems in real applications, big data storage and high-performance computers, this challenge is becoming manageable, so that machine learning-based systems can automatically analyze process data to identify OC situations. To this end, the definition of proper input features, dataset development, adequate training adjustment and acceptable false alarm rates are essential tasks. These steps are usually performed in an off-line modelling phase, while the operational (online) phase refers to applying the trained model to online data to determine the process condition [57, 60]. With this procedure, the machine learning-based approach proposed in this paper can improve the monitoring of industrial processes in terms of OC detection ability.

5 Illustrative example

In this section, an illustrative example from a chemical process is presented to demonstrate a real application. In fact, this example can be considered a calibration system in the chemical industry. Sometimes it is necessary to control a chemical process far from the laboratory via remote schemes in which gas sensors act as the controller, and these sensors are used to monitor the chemical process over time. Although this is a beneficial approach, it requires recalibration whenever the sensors' adjustments change, as the variability of gas sensors may affect the performance of the underlying calibration model [67]. Such changes may be caused by different chemical materials, process conditions and equipment movements, so the calibration should be checked over time. Profile monitoring can be applied to address these calibration issues and for online monitoring of the process.

For these reasons, studies such as Mahmood et al. [68] and Nadi et al. [36] suggested applying profile monitoring. A metal oxide (MOX) sensor, a conductometric type of gas sensor, is one of the best options due to its sensitivity, operational ease, cost efficiency, rapid response, and the capability of detecting a large number of volatiles. The authors considered a MOX sensor and monitored the functional relationship between the resistance (R) of the sensor as the dependent variable and the concentration of carbon monoxide at the sensor as the independent variable.

To monitor this functional relationship, they recorded the sensor resistance at different concentration levels over time. Based on the recorded data, the explanatory variables are fixed at 25, 100, 125, and 150 ppm. To reach a better performance, it is suggested to change the process situation with some additives. These substances are blended into a process to accelerate the processability of the polymers, improve characteristics such as durability and stiffness, and enhance the service life. A wide range of additives, such as gas, feed, anti-wear, food, fuel, antioxidant and plastic additives, has been developed to date. Indeed, gas additives are usually added to gas sensor processes to adjust the flow of gas during the experiment [67]. However, previous works showed that the relation between resistance and carbon monoxide concentration might change when additive materials are present. To address these issues, Nadi et al. [36] investigated the situations before and after adding the additive material, where one additive material was added to the process after time 3278; hence, the IC model was estimated from the first 3278 profiles. Considering these profiles, Nadi et al. [36] obtained a simple linear IC model with an autocorrelation effect as follows:

$$ \begin{aligned} &Y_{ij} = 71.741 + 0.0176X_{i} + \varepsilon_{ij} ,\\ &\varepsilon_{ij} = 0.565\varepsilon_{{i\left( {j - 1} \right)}} + a_{ij} ,a_{ij} \sim N\left( {0,0.142} \right),\,i = 1,2,3,4;\,j = 1,2, \ldots \end{aligned} $$
(13)

To show the applicability of the proposed method for monitoring the above IC model, Nadi et al. [36] utilized simulations for OC data generation (instead of using the data after the 3278th profile). Following them, we first generated five IC profiles and then continued with OC profile generation considering a shift in the intercept until an OC signal was reached. The magnitude of the OC shift was set to 0.15 (or 0.5σ). Table 9 shows the response variables of the generated profiles (the black and red values belong to IC and OC profiles, respectively).

Table 9 The IC (black) and OC (red) generated responses in the illustrative example

To assess the detection ability of ANFIS for this data set, it is trained based on the IC model in Eq. (13) and input combination III. Considering ARL0 = 200, the CV is set to 0.615. After adjustment of the CV, the generated data in Table 9 are fed into ANFIS and the output for each input is computed. Table 10 reports the input and output values for the first seven generated profiles. ANFIS needs only two OC samples to trigger a signal: the signal appears at the 7th sample because the final statistic exceeds the CV (red horizontal line in Fig. 2), i.e., O7 = 0.849 > 0.615 = CV.

Table 10 Input and output values for the first seven generated profiles applying ANFIS with input combination III (ϕ = 0.565)
Fig. 2 The final chart statistics of the first seven randomly generated profiles in the illustrative example (ϕ = 0.565)

6 Conclusions

In profile monitoring, the error term often does not follow a simple structure and is affected by autocorrelation. For this reason, a novel monitoring scheme for linear profiles with between-autocorrelation of first order in Phase II of process monitoring has been proposed in this paper. Unlike most existing methods that use common statistical control charts, this paper employed various machine learning techniques, namely ANFIS, ANNBP, CNN, LSTM, RBF and SVR, as control charts. To this aim, four main steps were defined. In the first step, the structure of the input features and outputs of a machine learning technique was determined; then, a training data set based on the input features and outputs was generated by simulating IC and OC profiles in the second step. The third step utilized the generated data set to train a machine learning technique, and finally, by the definition of a control limit, the machine learning technique provided information about the process condition in the fourth step.

The study pursued three main objectives. First, due to the high importance of input features in machine learning, several input features suitable to account for between-autocorrelation effects of first order were defined and compared to determine the most appropriate input combination. The results indicated that input combination III, defined as the union of input combinations I and II and thus combining both of their main effects, is the most appropriate one. Second, different machine learning techniques were compared to identify the most adequate one. Experimental studies showed that ANNBP, CNN, LSTM and SVR were mostly not able to reach a satisfactory detection ability in comparison with ANFIS and RBF. Between ANFIS and RBF, ANFIS was preferable with respect to shifts in intercept and slope, while RBF performed best regarding shifts in the standard deviation. This superiority was obvious for low and moderate autocorrelation coefficients (i.e., ϕ = 0.1 and 0.5), whereas no consistently best method could be identified for a larger value (ϕ = 0.9). To address this issue, we additionally implemented an overall performance measure, the RMI, according to which ANFIS turned out to be the method with the best overall average performance for ϕ = 0.9. Third, machine learning-based techniques were compared with statistical control charts. This comparison showed that the detection ability of ANFIS outperformed all competitors regarding shifts in intercept and slope. However, the detection ability of ANFIS regarding shifts in the standard deviation was inferior to the selected statistical control charts; here, the EWMA-3 control chart performed better, and the best machine learning technique for this purpose was RBF (with a performance hardly worse than that of EWMA-3). Hence, machine learning-based control charts, and ANFIS in the first place, are suggested for profiles characterized by between-sample AR(1) autocorrelation to considerably improve the detection ability of the control chart.

Employing the proposed novel input features with other machine learning techniques and other profile types, such as nonlinear or Generalized Linear Models (GLMs), in the presence of autocorrelation could be a promising avenue for future research. Also, implementing the proposed method for profiles characterized by within-sample autocorrelation, or for profiles with other autocorrelation patterns such as ARMA or VARMA, are further suggestions for potential future directions.