Introduction

In most statistical process control (SPC) applications, the time at which a control scheme triggers an out-of-control signal does not coincide with the actual time of the change in the process. Estimating the time at which the fault first appears (called the change point) is therefore essential, because it narrows the time interval that must be searched to identify the assignable causes. The literature shows that most efforts have focused on change point estimation in univariate processes. In one of the first methods for univariate processes, Samuel et al. (1998) investigated the time of step changes in the X-bar control chart. Two recent studies on univariate and uni-attribute processes are also worth noting. Assareh et al. (2013) applied Bayesian hierarchical models to estimate the change point of a Poisson quality characteristic under a step change, a linear trend and a known number of multiple changes. Amiri et al. (2014) developed a probabilistic neural network (PNN)-based procedure to estimate the variance change point in a univariate process with a normal quality characteristic. For more information, refer to the review paper by Amiri and Allahyari (2012).

In many manufacturing processes, the quality of the products is characterized by multivariate or multi-attribute quality characteristics that are correlated. For example, El-Midany et al. (2010) proposed modular artificial neural networks (ANNs) to recognize abnormal patterns, to identify the variables responsible for the out-of-control signal and to classify the abnormal pattern parameters in multivariate processes. For multi-attribute processes, Niaki and Nafar (2008) proposed a modular ANN-based method to monitor the quality characteristics.

Compared with processes with a single quality characteristic, fewer methodologies have been developed for change point estimation in multivariate and multi-attribute processes. Sullivan and Woodall (2000) proposed a preliminary control chart, based on a likelihood ratio statistic, for change point estimation in multivariate processes with individual observations when the shift occurs in the mean vector, the covariance matrix or both. Nedumaran et al. (2000) developed a maximum likelihood estimator (MLE) approach to estimate the step change point in the mean of a multivariate normal process. Zamba and Hawkins (2006) presented a model to estimate the time of a step change in the mean vector of multivariate processes when the parameters are unknown.

Zarandi and Alaeddini (2010) used a general fuzzy-statistical clustering approach to estimate the time of change in different types of control charts, including univariate, uni-attribute and multivariate control charts with either fixed or variable sampling strategies. Doğu and Kocakoc (2011) estimated the time of a step change in the covariance matrix of multivariate normal processes, where a multivariate control chart based on the sample covariance is used to receive out-of-control signals. Niaki and Khedmati (2012) proposed a new method to derive the maximum likelihood estimator of the time of a step change in the mean vector of multivariate Poisson processes. They first employed two transformations to reduce the inherent skewness of multi-attribute processes, which made the distribution of the quality characteristics almost multivariate normal and diminished the correlations between the attributes. They then employed a \( T^{2} \) control chart for detection and, finally, used a maximum likelihood estimator to find the actual time of the change. Niaki and Khedmati (2013) then proposed a new multi-attribute \( T^{2} \) control chart based on two transformation methods to monitor the parameter vector of multi-attribute Poisson processes. To estimate the process change point, they developed maximum likelihood estimators for both linear trend and step change disturbances. Doğu and Kocakoc (2013) estimated the step change point of multivariate normal processes when joint shifts in the mean vector and covariance matrix occur.

In addition, some researchers have used artificial neural networks to estimate the time of change in multivariate processes. Atashgar and Noorossana (2010) proposed a neural network-based change point estimator to identify the change point in the mean vector of a bivariate normal distribution under monotonic changes. They also diagnosed the variables responsible for the change in the process mean vector. Ahmadzadeh (2009) introduced a neural network to identify the time of change in the mean parameters of a multivariate process. Noorossana et al. (2011) proposed an integrated supervised learning solution to detect out-of-control conditions, estimate the change point when the shift occurs in the mean vector, diagnose the variables contributing to the out-of-control condition and determine the direction of the shift in the mean of each contributing variable. Allahyari and Amiri (2011) studied the step change point estimation problem in multivariate processes using a clustering approach. Atashgar and Noorossana (2011) proposed a supervised learning approach based on artificial neural networks to identify the change point in a bivariate process when the process mean vector changes linearly. They simultaneously diagnosed the variables that contributed to the change in the process mean vector. Ahmadzadeh et al. (2013) developed a multivariate exponentially weighted moving average (MEWMA) control chart that uses a neural network to identify the step change point as well as to diagnose the variable responsible for the change in the multivariate process mean vector.

In some production environments, the quality of the product is expressed by a combination of variable and attribute quality characteristics with a non-zero correlation structure between them. Despite some efforts to monitor multivariate-attribute processes, there is no research in the literature on change point estimation for such processes. As the main contribution of this paper, we propose a modular methodology based on artificial neural networks for estimating the time of step changes in the covariance matrix of multivariate-attribute processes.

A review of the literature on ANNs in different areas of SPC shows that, in almost all studies, the observations are fed directly to the ANN as the input values. One unique aspect of our research is that we link two ANN modules: (1) module one, which is used for detecting variance shifts (fault detection) as well as diagnosing the source of the shifts (fault diagnosis), and (2) module two, which consists of several ANNs used for estimating the time of variance step changes. In the proposed modular methodology, the output values of the neural network designed for detection are used as the input values of the ANN estimators required for change point estimation.

The proposed methodology is presented for Phase II; hence, the parameters of the quality characteristics are assumed known from historical data. The rest of this paper is organized as follows: in “Modified maximum likelihood estimator”, the change point estimator based on MLE with some modifications is presented. In “Proposed model for change point estimation”, the proposed ANN-based methodology, consisting of a fault detection/fault diagnosis module and a change point estimation module, is described. Then, in “Performance evaluation”, the performance of the proposed ANN-based method is compared with that of the modified MLE through simulation experiments on a given multivariate-attribute process. In “Sensitivity analysis”, a sensitivity analysis of the effect of the location of the change point on the performance of the MLE approach is conducted. In the final section, conclusions as well as some recommendations for future research are given.

Modified maximum likelihood estimator

Consider a multivariate-attribute process with p variables and q attributes where the quality characteristics are correlated. Let \( {\mathbf{X_{ij}}} = \left( {X_{ij1} , \ldots , X_{ijp} , X_{ij(p + 1)} , \ldots , X_{ij(p\; + \;q)} } \right)^{T} \) be the jth observation vector (j = 1, 2,…,n) of the ith subgroup (i = 1, 2,…), where the first p elements represent the variables and the last q elements represent the attribute quality characteristics. As the first step of our statistical approach, the process data are transformed to a multivariate normal distribution using the Normal to Anything (NORTA) inverse method. The transformation yields the column vector \( {\mathbf{X^{\prime}_{ij}}} = ( {X^{\prime}_{ij1} , \ldots , X^{\prime}_{ijp} , X^{\prime}_{ij(p\; + \;1)} , \ldots , X^{\prime}_{ij(p\; + \;q)} } )^{T} \) with mean vector \( {\boldsymbol\upmu}_{\mathbf{0}} \) and covariance matrix \( \mathbf{\Sigma}_{\mathbf{0}} \), in which the marginal distribution of each element is normal.
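The transformation step can be illustrated with a minimal sketch. It assumes that the NORTA inverse method maps each marginal through \( \Phi^{-1}(F(x)) \), where F is the marginal CDF and \( \Phi \) is the standard normal CDF; the text does not spell out this detail, so the function names and the clamping constant `eps` below are illustrative assumptions rather than the authors' implementation. Under this assumption, a Poisson(λ = 1) attribute gives a transformed mean and variance close to the first entries of the mean vector and covariance matrix reported in the numerical example.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for its inverse CDF


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam), by direct summation."""
    return sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k + 1))


def to_normal_scale(u, eps=1e-12):
    """Map a CDF value u to the standard normal scale, clamped away from 0 and 1."""
    return STD_NORMAL.inv_cdf(min(max(u, eps), 1.0 - eps))


def transform_poisson(x, lam):
    """Assumed NORTA-inverse step for an attribute: Phi^{-1}(F_Poisson(x))."""
    return to_normal_scale(poisson_cdf(x, lam))


def transform_normal(x, mu, sigma):
    """Assumed NORTA-inverse step for a variable: Phi^{-1}(F_Normal(x)) = (x - mu) / sigma."""
    return (x - mu) / sigma


def transformed_poisson_moments(lam, k_max=30):
    """Mean and variance of the transformed Poisson attribute (sum truncated at k_max)."""
    pmf = [math.exp(-lam) * lam**k / math.factorial(k) for k in range(k_max + 1)]
    z = [transform_poisson(k, lam) for k in range(k_max + 1)]
    mean = sum(p * v for p, v in zip(pmf, z))
    var = sum(p * v * v for p, v in zip(pmf, z)) - mean**2
    return mean, var


if __name__ == "__main__":
    print(transformed_poisson_moments(1.0))
```

Because the Poisson CDF is a step function, the transformed attribute is only approximately normal, which is why its mean is not zero and its variance is below one.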

In this study, we use a control chart in the literature for monitoring the variability of the transformed quality characteristics. This control chart plots the determinant of the sample covariance matrix (|S|) as a statistic and has the following control limits (Montgomery 2005):

$$ {\text{UCL}} = \;|\mathbf{\Sigma}_{\mathbf{0}} |\left( {b_{1} + L\sqrt {b_{2} } } \right), $$
(1)
$$ {\text{LCL}} = \;|\mathbf{\Sigma}_{\mathbf{0}} |\left( {b_{1} - L\sqrt {b_{2} } } \right). $$
(2)

The control limit coefficient L is set so that a predetermined in-control average run length (ARL0) is obtained. The parameters \( b_{1} \) and \( b_{2} \) depend only on the total number of variable and attribute quality characteristics (p + q) and the sample size n, and are given by the following equations, respectively (Montgomery 2005):

$$ b_{1} =(n - 1)^{-(p + q)} \prod\limits_{i\; = \;1}^{p\; + \;q} {(n - i),} $$
(3)
$$b_{2} = (n - 1)^{ - 2(p\; + \;q)} \times \prod\limits_{i\; = \;1}^{p\; + \;q} {(n - i) \times \left[ {\prod\limits_{j\; = \;1}^{p\; + \;q} {(n - j + 2)} - \prod\limits_{j\; = \;1}^{p\; + \;q} {(n - j)} } \right].}$$
(4)

The control chart triggers an out-of-control alarm in the ith subgroup if \( |\mathbf{S}_{i}| \) falls outside the control limits. A negative value of LCL should be replaced by zero.
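As a worked illustration of Eqs. (3) and (4) and of the control limits, the sketch below computes \( b_{1} \), \( b_{2} \), UCL and LCL. It takes the lower limit as \( |\mathbf{\Sigma}_{\mathbf{0}}|(b_{1} - L\sqrt{b_{2}}) \), following Montgomery (2005), and truncates it at zero. The function names and the value L = 3 in the usage lines are assumptions for illustration only; in the paper, L is calibrated to a target ARL0.

```python
import math


def s_chart_constants(n, d):
    """Return (b1, b2) of Eqs. (3)-(4) for subgroup size n and d = p + q characteristics."""
    prod_n_minus_i = math.prod(n - i for i in range(1, d + 1))
    prod_n_minus_j_plus_2 = math.prod(n - j + 2 for j in range(1, d + 1))
    b1 = prod_n_minus_i / (n - 1) ** d
    b2 = prod_n_minus_i * (prod_n_minus_j_plus_2 - prod_n_minus_i) / (n - 1) ** (2 * d)
    return b1, b2


def s_chart_limits(det_sigma0, n, d, L):
    """Return (LCL, UCL) for the |S| chart; a negative LCL is replaced by zero."""
    b1, b2 = s_chart_constants(n, d)
    ucl = det_sigma0 * (b1 + L * math.sqrt(b2))
    lcl = max(0.0, det_sigma0 * (b1 - L * math.sqrt(b2)))
    return lcl, ucl


def signals(det_s, lcl, ucl):
    """True if the plotted statistic |S_i| falls outside the control limits."""
    return det_s > ucl or det_s < lcl


if __name__ == "__main__":
    # Hypothetical inputs: |Sigma_0| = 1, n = 10, p + q = 2, L = 3.
    print(s_chart_constants(10, 2))
    print(s_chart_limits(1.0, 10, 2, 3.0))
```

For n = 10 and p + q = 2, the constants reduce to \( b_{1} = 72/81 \) and \( b_{2} = 72 \times 38/6561 \).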

Let T be the time when the extended control chart triggers an out-of-control alarm. Suppose that, following a step shift in process variability, the covariance matrix of the multivariate-attribute quality characteristics changes from \( \mathbf{\Sigma}_{\mathbf{0}} \) to \( \mathbf{\Sigma}_{\mathbf{1}} \) at the unknown change point τ, and then remains at the new level until the assignable cause is identified and removed. Hence, the determinants of the sample covariance matrices up to time τ, i.e., \( |\mathbf{S}_{1}|, |\mathbf{S}_{2}|, \ldots, |\mathbf{S}_{\tau}| \), come from the in-control state, whereas \( |\mathbf{S}_{\tau + 1}|, \ldots, |\mathbf{S}_{T}| \) come from the out-of-control state of the multivariate-attribute process variability. The MLE of τ, denoted \( \widehat{\tau } \), is the value that maximizes the likelihood function of the observations as follows (Doğu and Kocakoc 2011):

$$ \hat{\tau } = \mathop {\arg \max }\limits_{t} \left\{ {{\text{CP}}_{t} } \right\},\quad t = 0,1, \ldots ,T - 1. $$
(5)

The value of \( {\text{CP}}_{t} \) is computed according to Eq. (6):

$$ \begin{aligned} {\text{CP}}_{t} &= \frac{1}{2}\,{\text{trace}}\left( {\mathbf{\Sigma}_{\mathbf{0}}^{ - 1} \sum\limits_{i = t + 1}^{T} {\sum\limits_{j = 1}^{n} {(\mathbf{X^{\prime}}_{\mathbf{ij}} - {\boldsymbol\upmu}_{\mathbf{0}} )(\mathbf{X^{\prime}}_{\mathbf{ij}} - {\boldsymbol\upmu}_{\mathbf{0}} )^{T} } } } \right) \\ &\quad - \frac{n\,(T - t)}{2}\ln \left| {\frac{{\sum\limits_{i = t + 1}^{T} {\sum\limits_{j = 1}^{n} {(\mathbf{X^{\prime}}_{\mathbf{ij}} - {\boldsymbol\upmu}_{\mathbf{0}} )(\mathbf{X^{\prime}}_{\mathbf{ij}} - {\boldsymbol\upmu}_{\mathbf{0}} )^{T} } } \;\mathbf{\Sigma}_{\mathbf{0}}^{ - 1} }}{n\,(T - t)}} \right| \\ &\quad - \frac{n\,(p + q)\,(T - t)}{2}, \end{aligned} $$
(6)

where trace(A) is the sum of the diagonal elements of matrix A.
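The estimator in Eqs. (5) and (6) can be sketched numerically as follows. This is a minimal illustration that assumes the transformed observations are stored in a NumPy array of shape (T, n, p + q); the function name `mle_change_point` and the simulated data in the usage lines are hypothetical and are not taken from the paper.

```python
import numpy as np


def mle_change_point(X, mu0, sigma0):
    """Evaluate CP_t of Eq. (6) for t = 0, ..., T-1 and return (argmax, CP values).

    X has shape (T, n, d): T subgroups, n observations each, d = p + q characteristics.
    """
    T, n, d = X.shape
    sigma0_inv = np.linalg.inv(sigma0)
    dev = X - mu0
    # Scatter matrix of each subgroup: sum_j (x - mu0)(x - mu0)^T
    scatter = np.einsum("tnj,tnk->tjk", dev, dev)
    # W[t] = scatter summed over subgroups t+1, ..., T (1-based numbering)
    W = np.cumsum(scatter[::-1], axis=0)[::-1]
    cp = np.empty(T)
    for t in range(T):
        N = n * (T - t)
        _, logdet = np.linalg.slogdet(W[t] @ sigma0_inv / N)
        cp[t] = 0.5 * np.trace(sigma0_inv @ W[t]) - 0.5 * N * logdet - 0.5 * N * d
    return int(np.argmax(cp)), cp


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mu0 = np.array([0.546, 0.0])
    sigma0 = np.array([[0.667, 0.268], [0.268, 0.999]])
    chol = np.linalg.cholesky(sigma0)
    T, n, tau = 30, 10, 20
    Z = rng.standard_normal((T, n, 2))
    Z[tau:] *= 5.0  # hypothetical large step shift after subgroup tau
    X = mu0 + Z @ chol.T
    tau_hat, _ = mle_change_point(X, mu0, sigma0)
    print(tau_hat)
```

Accumulating the scatter matrices once from the last subgroup backwards avoids recomputing the double sum of Eq. (6) for every candidate t. The sketch requires n(T − t) ≥ p + q so that the matrix inside the determinant is nonsingular.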

The proposed model for change point estimation

In this section, the proposed modular model for estimating the change point in the covariance matrix of multivariate-attribute processes is described. Module one uses the three-layered perceptron neural network presented by Amiri et al. (2015) for monitoring multivariate-attribute process variability; its purpose is to detect variance shifts and to diagnose the quality characteristics responsible for out-of-control signals. Note that diagnosing the out-of-control quality characteristics in the first module is equivalent to identifying which out-of-control state has occurred. In the second module, we design an artificial neural network for estimating the change point corresponding to each out-of-control state. For a multivariate-attribute process whose quality is characterized by p variables and q attributes, a total of \( \sum\nolimits_{i\; = \;1}^{p\; + \;q} {\binom{p + q}{i}} = 2^{p\; + \;q} - 1 \) ANN estimators are designed in the second module. Based on the out-of-control state diagnosed in the first module, exactly one of these estimators is activated to estimate the change point in the covariance matrix of the multivariate-attribute quality characteristics. Note that using a single ANN for change point estimation instead of modular ANNs would make the network large, and consequently its training would be complex and time consuming. Figure 1 presents the proposed modular ANN-based methodology (it is supposed that the ith out-of-control state is diagnosed in the first module).

Fig. 1 The proposed modular methodology

Structure of ANN modules

In this subsection, the structure of each ANN estimator required for estimating the time of change in each out-of-control state is described. Because multilayer perceptron neural networks have performed successfully in various areas of SPC, this type of network is used for the ANNs of the change point estimation module. To determine the number of nodes in the input layer of each ANN estimator in the second module, the following procedure is recommended. First, in module one, we identify the quality characteristics that contributed to the out-of-control signal. In a multivariate-attribute process with p variables and q attributes, there are \( \binom{p + q}{j},\; j = 1,2, \ldots ,p + q \) states in which exactly j quality characteristics are the sources of variation. In particular, for the situations where a single one of the p + q quality characteristics is responsible for the out-of-control signal, p + q ANNs should be designed for change point estimation.
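The counting argument can be checked with a short sketch that enumerates every non-empty subset of the p + q quality characteristics, one subset per out-of-control state and hence one ANN estimator per subset. The function name and the 1-based labelling of the characteristics are illustrative assumptions.

```python
from itertools import combinations


def out_of_control_states(p, q):
    """List every non-empty subset of the p + q characteristics (labelled 1..p+q)."""
    d = p + q
    labels = range(1, d + 1)
    return [subset for j in range(1, d + 1) for subset in combinations(labels, j)]


if __name__ == "__main__":
    states = out_of_control_states(1, 1)
    print(states)       # states for one variable and one attribute
    print(len(states))  # equals 2**(p + q) - 1
```

The position of a subset in this list can serve as the index k of the estimator that is activated once module one has diagnosed the state.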

The following procedure is used to determine the number of nodes in the input layer of the ANN estimator corresponding to the out-of-control state in which the ith quality characteristic has been diagnosed as the source of variation. First, we generate random samples, each of size n, from a p-variate/q-attribute process in which \( \delta_{i} > 1 \), whereas \( \delta_{j} = 1 \) for j = 1, 2,…, p + q, j ≠ i, where \( \delta_{i} \) is the standard deviation of the ith quality characteristic after a given step shift divided by its in-control standard deviation. Then, we enter the generated random samples into the ANN of the first module until the first out-of-control signal is received. This process is repeated for 10,000 replicates; in each replicate we record the run length (RL) obtained in the first module and save it in a vector \( c_{1} \). Finally, an element \( m_{1} \) of \( c_{1} \) for which the value of Pr(RL > \( m_{1} \)) over the 10,000 simulation replicates is negligible is taken as the number of nodes in the input layer of the neural network designed for change point estimation in the corresponding out-of-control state. Two issues must be balanced in choosing the number of input nodes: (1) increasing the number of nodes in the input layer makes the ANN more complex in both training and testing, and (2) too few input nodes is undesirable, because the ANN module cannot be used whenever the run length obtained by module one exceeds the number of input nodes. The input vector used in the ANN of the first module is a column vector as follows:

$$ {\mathbf{S}} = [s_{1} ,s_{2} , \ldots ,s_{p\; + \;q} ]^{T} , $$
(7)

where \( s_{j} \) is the sample standard deviation of the jth quality characteristic in the simulated random sample. The number of input nodes in the ANN modules required for estimating the time of change in the variance of two quality characteristics (a total of \( \binom{p + q}{2} \) ANNs) is determined in almost the same way as for the previous p + q ANNs. Suppose that the quality characteristics u and v are the sources of variation in the covariance matrix of the multivariate-attribute quality characteristics. First, we generate random samples of size n from a p-variate/q-attribute process where \( \delta_{u}, \delta_{v} > 1 \), whereas \( \delta_{j} = 1 \) for j = 1, 2,…, p + q, j ≠ u, v. Then, based on 10,000 simulation replicates, we obtain the run length values in the first module and save them in a vector \( c_{2} \). Finally, an element \( m_{2} \) of \( c_{2} \) for which the value of Pr(RL > \( m_{2} \)) over the 10,000 simulation replicates is almost zero is taken as the number of nodes in the input layer of these ANNs.
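The rule for choosing the number of input nodes can be sketched as a small function that takes the recorded run lengths and returns the smallest recorded value m whose empirical exceedance probability Pr(RL > m) does not exceed a tolerance. The paper only asks for a negligible probability, so the tolerance `tol`, the choice of the smallest qualifying element and the function name are assumptions made for illustration.

```python
def input_layer_size(run_lengths, tol=0.0):
    """Smallest recorded run length m with empirical Pr(RL > m) <= tol."""
    n_rep = len(run_lengths)
    for m in sorted(set(run_lengths)):
        exceed = sum(1 for rl in run_lengths if rl > m) / n_rep
        if exceed <= tol:
            return m
    return max(run_lengths)  # not reached for tol >= 0; kept for completeness


if __name__ == "__main__":
    # Hypothetical run lengths standing in for the 10,000 values recorded from module one.
    recorded = [3, 5, 2, 8, 4, 6, 3, 7, 5, 4]
    print(input_layer_size(recorded))           # no run length exceeds the result
    print(input_layer_size(recorded, tol=0.1))  # allows 10% of run lengths to exceed it
```

A larger tolerance gives a smaller network at the cost of occasionally receiving a run length that the estimator cannot accommodate, which is the trade-off described above.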

The same procedure is used to determine the number of input nodes in the neural networks designed for change point estimation in the other out-of-control states, where j = 3,…, p + q quality characteristics contribute to the out-of-control situation. The number of nodes in the output layer of each neural network is set equal to the number of nodes in its input layer. There is no standard guideline for determining the number of hidden layers or the number of nodes in each hidden layer, although the literature suggests that one or two hidden layers may be sufficient in designing a neural network. We finalize the number of hidden layers as well as the number of nodes in each one through trial and error experiments. We also use the sigmoid function as the transfer function of all ANNs, which produces outputs in the range [0, 1].

Training procedure of neural networks

The most important issue in training each of the \( 2^{p + q} - 1 \) ANNs for change point estimation is collecting proper training data sets. The training procedure of the kth ANN estimator (k = 1, 2,…, \( 2^{p + q} - 1 \)), which corresponds to the kth out-of-control state and contains \( m_{k} \) nodes in its input layer, is as follows. The value of the first output node of the ANN in the first module is used as the input value of the ANN estimators in the second module.

Assume that the run length value obtained from the first module is equal to h, h = 1, 2,…, \( m_{k} \). First, we generate \( m_{k} - h \) in-control random samples of size n. We also simulate h random samples from a p-variate/q-attribute process in which the kth out-of-control state (k = 1, 2,…, \( 2^{p + q} - 1 \)) has occurred. In the next step, for each of the \( m_{k} \) generated random samples, we calculate the vector S (the input vector of the ANN in the first module) according to Eq. (7). We then enter the generated vectors into the ANN in the first module and record the observed values of the first output node. These \( m_{k} \) values are saved in a column vector, which is taken as the input vector of the kth ANN in the second module. Hence, the input vector of the kth neural network, which contains \( m_{k} \) nodes in its input layer, is a column vector with \( m_{k} \) elements as follows:

$$ {\mathbf {O_{k}}} \; = \;(o_{1} , \ldots ,o_{{m_{k\;} - \;h}} ,\;o_{{m_{k\;} - \;h\; + \;1}} \ldots ,o_{{m_{k} }} )^{T} . $$
(8)

In Eq. (8), the first \( m_{k} - h \) elements of vector \( \mathbf{O}_{k} \) are related to the in-control samples, whereas the last h elements are associated with the out-of-control samples. We repeat this process for 100 replicates for each value of RL, RL = 1,…, \( m_{k} \), so that 100 column vectors are generated for each possible value of RL. Consequently, 100 × \( m_{k} \) column vectors of size \( m_{k} \) × 1 are available as the input data set for training the kth neural network in the change point estimation module. Note that, to generate out-of-control data in the training step, the magnitudes of the variance shifts in the out-of-control quality characteristics are determined randomly. Similar to the input vectors, the target vector of the kth neural network (k = 1, 2,…, \( 2^{p + q} - 1 \)) is an \( m_{k} \) × 1 vector whose elements are all zero except for one element, which is equal to one. The location of the element equal to one represents the time when the in-control state ends, i.e., when the first out-of-control sample is taken. After generating the input vectors and their corresponding target vectors, the ANN estimators are trained using the back-propagation algorithm, which is the most common supervised training algorithm.
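The construction of the training pairs can be sketched as follows. Because the ANN of module one is not reproduced here, the sketch replaces it with a hypothetical callable `detector_output` that returns a value for the first output node given whether the sample is in control; this stand-in, the function names and the toy detector in the usage lines are assumptions for illustration only.

```python
import random


def one_hot_target(m_k, h):
    """Target vector of length m_k with a one at the first out-of-control sample."""
    target = [0.0] * m_k
    target[m_k - h] = 1.0  # 0-based index of sample number m_k - h + 1
    return target


def build_training_set(m_k, detector_output, replicates=100, seed=0):
    """Generate replicates x m_k input/target pairs, one batch per run length h.

    detector_output(in_control, rng) is a hypothetical stand-in for the first
    output node of the module-one ANN evaluated on one simulated sample.
    """
    rng = random.Random(seed)
    inputs, targets = [], []
    for h in range(1, m_k + 1):
        for _ in range(replicates):
            o_k = [detector_output(True, rng) for _ in range(m_k - h)]
            o_k += [detector_output(False, rng) for _ in range(h)]
            inputs.append(o_k)
            targets.append(one_hot_target(m_k, h))
    return inputs, targets


def toy_detector(in_control, rng):
    """Toy stand-in: low outputs in control, high outputs out of control."""
    return rng.uniform(0.0, 0.5) if in_control else rng.uniform(0.5, 1.0)


if __name__ == "__main__":
    inputs, targets = build_training_set(40, toy_detector)
    print(len(inputs), len(inputs[0]))  # 100 x m_k vectors, each with m_k elements
```

With \( m_{k} = 40 \) the sketch yields 100 × 40 input vectors, which agrees with the count of training vectors stated for a 40-node network.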

Utilizing the proposed ANN modules for estimating change point

After designing and training all the neural networks required in the change point estimation module, we apply them to estimate the actual time of change in the multivariate-attribute process variability. As the first step, the out-of-control state is detected using the ANN of the first module. Then, the quality characteristics that contributed to the out-of-control alarm are diagnosed. Recall that the diagnostic process by the ANN in module one is equivalent to determining the out-of-control state. Finally, based on the quality characteristics diagnosed as responsible for the change in the multivariate-attribute process variability, only one ANN estimator is activated to estimate the change point in the quality characteristics whose variances have changed. The flowchart of the proposed algorithm is depicted in Fig. 2.

Fig. 2 The proposed ANN-based algorithm

Recall that the target values of the neural networks in the change point estimation module are zero and one. However, when each ANN is applied in the second module, the observed output values are not exactly equal to zero or one but lie in the range [0, 1] due to errors. To overcome this problem, in this paper the position of the maximum output value of the neural network is taken as the time when the process goes to the out-of-control state. For example, in an ANN with 50 nodes in its output layer, if the maximum observed value is located in the 35th element of the output vector, the 35th sample is taken as the first out-of-control sample and the estimated time of change is \( \widehat{\tau } = 34. \)
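The decoding rule can be written as a one-step sketch: find the 1-based position of the largest output and subtract one. The function name is an illustrative assumption, and ties are resolved in favour of the earliest position, a detail the text does not specify.

```python
def estimate_change_point(outputs):
    """Return tau_hat, the number of samples preceding the largest network output."""
    # 1-based position of the maximum output (earliest position on ties)
    position = max(range(len(outputs)), key=outputs.__getitem__) + 1
    return position - 1


if __name__ == "__main__":
    outputs = [0.01] * 50
    outputs[34] = 0.9  # maximum located in the 35th element
    print(estimate_change_point(outputs))
```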

Performance evaluation

To evaluate the performance of the proposed ANN-based estimator in comparison with the MLE, we present a numerical example based on simulation. In this example, random samples of size n = 10 are used under both methods. Consider a multivariate-attribute process in which the quality of the products is expressed by one attribute and one variable quality characteristic, which are correlated with a correlation coefficient of ρ = 0.357. The numerical example is set in Phase II. Based on the results of a Phase I analysis, we assume that \( x_{1} \) and \( x_{2} \) are the attribute and variable quality characteristics, respectively, with the following distributions when the process is in control:

$$ x_{ 1} \sim {\text{Poisson}}\left( {\lambda \; = \; 1} \right), \;x_{ 2} \sim {\text{Normal}}\left( {\mu \; = \; 3,\, \sigma^{ 2} = \; 4} \right). $$

After applying the NORTA inverse method, the mean vector and the covariance matrix of the in-control transformed data are estimated as \( {\boldsymbol\upmu}_{\mathbf{0}} = \left( {\begin{array}{cc} {0.546} \\ 0 \\ \end{array} } \right) \) and \( {\mathbf\Sigma}_{\mathbf{0}} = \left( {\begin{array}{cc} {0.667} & {0.268} \\ {0.268} & {0.999} \\ \end{array} } \right) \), respectively. To obtain an ARL0 of roughly 200 with the |S| control chart, the control limits are set to UCL = 2.1050 and LCL = 0 (the negative value of LCL is replaced by zero). In the MLE method, we first generate 100 in-control samples and transform them to obtain bivariate normal data. It is assumed that the extended control chart gives no false alarm during these first 100 in-control samples; therefore, if the value of |S| for any of these simulated samples exceeds the UCL, we replace it with another in-control random sample. Starting from the 101st sample, the observations are simulated from an out-of-control state until the extended control chart triggers an out-of-control signal. Then, the change point in the process variability is estimated according to Eqs. (5) and (6). For each variance shift magnitude, we repeat this procedure 10,000 times. In each replicate we calculate the absolute difference between the actual and estimated change points (\( |\widehat{\tau } - \tau | \)), and we use the simulated values of \( |\widehat{\tau } - \tau | \) to assess the performance of the MLE method.

Obviously in this process (p = q = 1), there are three states where the process variability is out of control. The out-of-control states in the process variability include (1) the states in which x 1 contributes to out-of-control signals, (2) states in which x 2 contributes to out-of-control signals and (3) situations in which both x 1 and x 2 contribute to out-of-control signals. Hence, three ANNs are required to be designed for estimating the actual time of change for each out-of-control state of process variability.

Designing the ANN estimators

Using the ANN of module one for detection, and following the procedure in “Proposed model for change point estimation”, the value of Pr(RL > 40) under a small variance shift of \( \delta_{1} = 1.4 \) over 10,000 simulation replicates is negligible (nearly zero). Hence, 40 nodes in the input layer of the first ANN in module two (network A) is a reasonable choice. As noted, the number of nodes in the output layer of network A is also set to 40. The other ANNs (networks B and C), which estimate the change point in variable \( x_{2} \) and in the joint quality characteristics \( x_{1} \) and \( x_{2} \), are designed in almost the same way as network A. Using module one, Pr(RL > 40) under a variance shift of magnitude \( \delta_{2} = 1.4 \) over 10,000 simulation replicates is 0.002. Hence, we use 40 nodes in the input and output layers of network B. Similarly, Pr(RL > 30) under a joint shift of magnitude (\( \delta_{1} = 1.4 \), \( \delta_{2} = 1.4 \)) over 10,000 simulation replicates is zero. Consequently, network C has 30 nodes in its input and output layers. After trial and error experiments, the ANN estimators A, B and C each have a single hidden layer with 22, 24 and 22 nodes, respectively.

Training the ANN estimators

According to “Training procedure of neural networks”, the input vectors and the corresponding target vectors required for training networks A and B are column vectors with 40 elements. To generate the out-of-control data sets for training network A, we use shifts of random magnitude \( \delta_{1} \) in the standard deviation of \( x_{1} \), where \( \delta_{1} \) is uniformly distributed in the range [1.5, 2.5]. Similarly, to generate the out-of-control data sets for training network B, we use shifts of random magnitude \( \delta_{2} \) in the standard deviation of \( x_{2} \), where \( \delta_{2} \) is also uniformly distributed in the range [1.5, 2.5]. The input and target vectors prepared for training network C are column vectors with 30 elements, and we use simultaneous shifts in the standard deviations of both \( x_{1} \) and \( x_{2} \) with random magnitudes distributed uniformly in the range \( \delta_{i} \in [1.5, 2.5] \), i = 1, 2. This gives 4000 input vectors for training each of networks A and B and 3000 for training network C. Finally, the ANN estimators are trained with the generated input vectors and their corresponding target vectors. The mean-squared error (MSE), which is the evaluation criterion in the training stage, is 0.0166, 0.0138 and 0.0142 for neural networks A, B and C, respectively. When each trained network is applied to estimate the time of change in the corresponding quality characteristics, we take the position of the maximum observed output as the estimate of the time when the first out-of-control sample appears in the process.

To compare the performance of the ANN and MLE methods, we use three criteria from the change point estimation literature: the mean, the standard deviation and the empirical distribution of the estimated change point around the actual time of the process change. The performance of neural network A under the step variance shift (\( \delta_{1} \sigma_{1} \), 0) with various magnitudes of \( \delta_{1} \) is investigated through 10,000 replicates and compared with that of the MLE. Table 1 shows the performance of the ANN and MLE estimators in estimating the time of change in the variance of \( x_{1} \). The results in Table 1 show that for every variance shift magnitude in \( x_{1} \), including small, moderate and large shifts, except the shift of magnitude (\( 1.4\sigma_{1} \), 0), the ANN method outperforms the MLE method in all criteria. As the variance shift in the quality characteristic \( x_{1} \) increases, the precision and accuracy of both estimators increase.

Table 1 Performance of the first ANN (Network A) and MLE in estimating change point in the variance of x 1

Table 2 shows the performance of the neural network B as well as the MLE method under the step variance shift (0, δ 2 σ 2) with various values of δ 2 through 10,000 replicates. The computational results in Table 2 indicate that under each out-of-control state considered, the ANN considerably presents more accurate and more precise results in comparison to MLE. We can see that under both estimators, increasing the magnitude of variance shift in the quality characteristic x 2 leads to better estimates.

Table 2 Performance of the second ANN (Network B) and MLE in estimating change point in the variance of x 2

In Table 3, we summarize the results of applying neural network C as well as MLE in estimating change point when joint variance shifts in both quality characteristics have occurred. Similar to neural networks A and B, the neural network C performs adequately in estimating the time of change in the process variability when both quality characteristics are the sources of the out-of-control signals. Moreover, the results of Table 3 show the superior performance of the third ANN estimator (Network C) under all change point criteria in comparison with MLE method.

Table 3 Performance of the third ANN (Network C) in estimating change point in the variances of x 1 and x 2

The results of Tables 1 and 3 show that in all out-of-control scenarios, the MLE approach is better than the ANN in terms of the \( p(|\widehat{\tau } - \tau | = 0) \) criterion. That is, over the 10,000 replicates, the MLE identifies the exact time of change in more simulation runs than the ANN does. However, in some simulation experiments the MLE estimates are imprecise. Consequently, in almost all out-of-control scenarios except (\( 1.4\sigma_{1} \), 0) and (\( 1.4\sigma_{1} \), \( 1.4\sigma_{2} \)), the ANN outperforms the MLE in terms of the \( p(|\widehat{\tau } - \tau | \le i);i = 1,2,3,4,5 \) criterion. In contrast to the results in Tables 1 and 3, Table 2 shows that the ANN outperforms the MLE in terms of \( p(|\widehat{\tau } - \tau | = 0) \). In general, the ANN estimates are more reliable than the MLE estimates, because the ANN modules are trained with a sufficiently large data set before they are applied.
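The three comparison criteria can be computed from a list of simulated estimates with a short sketch. The function name and the dictionary layout are illustrative assumptions, and the sample standard deviation (divisor N − 1) is used because the text does not state which form was applied.

```python
from statistics import mean, stdev


def change_point_criteria(estimates, tau, max_tol=5):
    """Mean, standard deviation and empirical p(|tau_hat - tau| <= i), i = 0..max_tol."""
    errors = [abs(e - tau) for e in estimates]
    n_rep = len(errors)
    return {
        "mean": mean(estimates),
        "std": stdev(estimates),
        "p_within": {i: sum(err <= i for err in errors) / n_rep for i in range(max_tol + 1)},
    }


if __name__ == "__main__":
    # Hypothetical estimates standing in for the 10,000 simulated values of tau_hat.
    result = change_point_criteria([100, 101, 99, 100, 105], tau=100)
    print(result["mean"], result["std"])
    print(result["p_within"])
```

The entry for i = 0 corresponds to \( p(|\widehat{\tau } - \tau | = 0) \), and the entries for i = 1,…, 5 correspond to the cumulative criteria discussed above.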

Sensitivity analysis

In this section, a sensitivity analysis on the location of the change point (τ) in the MLE approach is performed. The results of estimating the time of change considering separate variance shifts in x 1 and x 2 under different values of τ are summarized in Tables 4 and 5, respectively. The results of estimating the change point in both quality characteristics under different values of τ are shown in Table 6. It is concluded that in all out-of-control scenarios including separate and joint shifts under different shift magnitudes, decreasing the location of actual change point τ leads to more accurate and precise estimates.

Table 4 The effect of parameter τ on MLE approach in estimating change point under variance shifts in x 1
Table 5 The effect of parameter τ on MLE approach in estimating change point under variance shifts in x 2
Table 6 The effect of parameter τ on MLE approach in estimating change point under shifts in x 1 and x 2

Conclusions and future research

In this paper, a modular ANN-based methodology for estimating the time when the out-of-control state appears in multivariate-attribute processes was studied. Before designing the ANNs required for change point estimation in the process variability, a three-layered perceptron ANN was applied to detect the variance shifts and to identify the quality characteristics responsible for the out-of-control states. Then, based on the out-of-control quality characteristics diagnosed by this ANN, a three-layered perceptron neural network was trained to estimate the actual time of change in the responsible quality characteristics. We used the back-propagation algorithm to train all ANN modules in the proposed change point estimation methodology. Moreover, we presented the MLE approach for estimating the time of variance shifts. The performance of the proposed ANN-based method and of the modified MLE in estimating the time of change in the contributing quality characteristics was investigated through a numerical example based on simulations. The results show that the proposed ANN-based methodology outperforms the MLE in estimating the time of a step change in the variability of processes with multivariate-attribute quality characteristics. Finally, we investigated the effect of the location of the change point (τ) on the performance of the MLE approach through a sensitivity analysis.

Extending the proposed methods to the case of multiple change points is highly recommended for future research. In addition, developing ANN-based methodologies for estimating drift and isotonic change points could be a fruitful area for future research.