The traditional process monitoring method first projects the measured process data into the principal component subspace (PCS) and the residual subspace (RS), and then calculates the \(\mathrm{T^2}\) and \(\mathrm{SPE}\) statistics to detect abnormality. However, these two statistics detect abnormality through the principal components of the process. Principal components have no specific physical meaning and do not directly help to identify the faulty variable and its root cause. Researchers have proposed many methods to identify the faulty variable accurately based on the projection space. The most popular is the contribution plot, which measures the contribution of each process variable to the principal components (Wang et al. 2017; Luo et al. 2017; Liu and Chen 2014). Moreover, to determine the control limits of the two statistics, their probability distributions must be estimated or assumed to follow a specific form. Fault identification by these statistics is not intuitive enough to directly reflect the role and trend of each variable when the process changes.

In this chapter, direct monitoring in the original measurement space is investigated, in which the two statistics are each decomposed into a unique sum of contributions from the original process variables. Monitoring the original process variables is direct and explicit in physical meaning, but it is relatively complicated and time-consuming because each variable must be monitored in both the \(\mathrm{SPE}\) and \(\mathrm{T^2}\) statistics. To address this issue, a new combined index is proposed and interpreted in geometric space, which differs from other combined indices (Qin 2003; Alcala and Qin 2010). The proposed combined index is intrinsic to the geometry of the PCA decomposition. Compared with traditional latent-space methods, combined index-based monitoring does not require a prior distribution assumption to calculate the control limits. Thus, the monitoring complexity is reduced greatly.

6.1 Two Statistics Decomposition

According to the traditional PCA method, the process variable vector \({\boldsymbol{x}}\) can be decomposed into two parts: the principal component part \({\hat{\boldsymbol{x}}}\) and the residual \({\boldsymbol{e}}\):

$$\begin{aligned} \boldsymbol{x} = \boldsymbol{t}\boldsymbol{P}^\mathrm{T} + \boldsymbol{e} = \hat{\boldsymbol{x}} + \boldsymbol{e}, \end{aligned}$$
(6.1)

where \({\boldsymbol{P}}\) is the matrix of loading vectors that define the latent variable space, \({\boldsymbol{t}}\) is the score vector that contains the coordinates of \({\boldsymbol{x}}\) in that space, and \({\boldsymbol{e}}\) is the residual vector. The \(\mathrm{T^2}\) and SPE statistics are used to measure the distance from the new data to the model data. Generally, the \(\mathrm{T^2}\) and SPE statistics should be analyzed simultaneously so that the cumulative effects of all variables can be utilized. However, most of the literature has considered only the decomposition of \(\mathrm{T^2}\). Therefore, this chapter also considers the \(\mathrm{SPE}\) statistic decomposition so that the original process variables can be monitored in both the \(\mathrm{T^2}\) and \(\mathrm{SPE}\) statistical spaces.
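The decomposition (6.1) is straightforward to reproduce numerically. The following is a minimal Python/NumPy sketch on hypothetical data; the array shapes and the helper name `pca_fit` are illustrative assumptions, not part of the original method description.

```python
# Minimal sketch of the PCA decomposition in (6.1); data are hypothetical.
import numpy as np

def pca_fit(X, m):
    """Return the loading matrix P (J x m) and all J eigenvalues of the
    covariance matrix estimated from the reference data X (I x J)."""
    R = np.cov(X, rowvar=False)              # J x J covariance matrix
    eigvals, eigvecs = np.linalg.eigh(R)     # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]        # reorder to descending
    return eigvecs[:, order[:m]], eigvals[order]

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 11))           # 100 samples, 11 variables
P, eigvals = pca_fit(X, m=3)

x = X[0]                                     # one measurement vector
t = x @ P                                    # scores in the PCS
x_hat = t @ P.T                              # principal component part
e = x - x_hat                                # residual, as in (6.1)
```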

6.1.1 \(\mathrm{T^2}\) Statistic Decomposition

The \(\mathrm{T^2}\) statistic can be reformulated as follows:

$$\begin{aligned} \mathrm{T^2} := \mathrm{D} = \boldsymbol{t}\boldsymbol{\varLambda }^{-1}\boldsymbol{t}^\mathrm {T} = \boldsymbol{x}\boldsymbol{P}\boldsymbol{\varLambda }^{-1}\boldsymbol{P}^\mathrm {T}\boldsymbol{x}^\mathrm {T} = \boldsymbol{x}\boldsymbol{A}\boldsymbol{x}^\mathrm {T} = \sum \limits _{i = 1}^J \sum \limits _{j = 1}^J a_{i,j} x_i x_j \ge 0, \end{aligned}$$
(6.2)

where \({{\boldsymbol{A}}} = {{\boldsymbol{P}}}{\boldsymbol{\varLambda }^{ - 1}}{{{\boldsymbol{P}}}^\mathrm{{T}}} \ge 0\), \({\boldsymbol{\varLambda }}\) is the diagonal matrix of the retained eigenvalues of the covariance matrix estimated from a reference population, and \({a_{i,j}}\) is the \((i,j)\)th element of matrix \({\boldsymbol{A}}\).

One of the \(\mathrm{T^2}\) statistic decompositions (Birol et al. 2002) is given as follows:

$$\begin{aligned} \begin{aligned} \mathrm{D}&= \sum \limits _{ k = 1}^J \frac{a_{k,k}}{2}\left[ \left( x_k - x_k^* \right) ^2 + \left( x_k^2 - x_k^{*2} \right) \right] \\&= \sum \limits _{ k = 1}^J a_{k,k}\left( x_k^2 - x_k^* x_k \right) \\&= \sum \limits _{ k = 1}^J c_k^{D}. \end{aligned} \end{aligned}$$
(6.3)

where \( x_k^*\) is given by

$$\begin{aligned} x_{k}^{*}=-\frac{\sum _{j=1 \atop j \ne k}^{J} a_{k, j} x_{j} }{a_{k, k}}, \end{aligned}$$

and \( c_k^{D}\) is the decomposed \(\mathrm{T^2}\) statistic of each variable \(x_k\). The \(\mathrm{T^2}\) statistic of each variable \(x_k\) can then be calculated as follows:

$$\begin{aligned} c_k^D = a_{k,k}\left( x_k^2 - x_k^* x_k \right) . \end{aligned}$$
(6.4)

The detailed \(\mathrm{T^2}\) statistic decomposition process is not shown here; details can be found in Alvarez et al. (2007, 2010).
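A compact numerical sketch of the per-variable statistics in (6.2)–(6.4) is given below; it assumes \(\boldsymbol{P}\) and the retained eigenvalues come from a fitted PCA model such as the `pca_fit` sketch after (6.1), and the function name is illustrative.

```python
import numpy as np

def t2_contributions(x, P, eigvals_m):
    """Per-variable T^2 statistics c_k^D of (6.3)-(6.4) for one sample x."""
    A = P @ np.diag(1.0 / eigvals_m) @ P.T        # A = P Lambda^{-1} P^T
    J = len(x)
    c = np.empty(J)
    for k in range(J):
        # x_k^* = -(sum_{j != k} a_{k,j} x_j) / a_{k,k}, as below (6.3)
        x_star = -(A[k] @ x - A[k, k] * x[k]) / A[k, k]
        c[k] = A[k, k] * (x[k] ** 2 - x_star * x[k])   # eq. (6.4)
    return c

# The decomposition is exact: c.sum() equals x A x^T, the T^2 of (6.2).
```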

6.1.2 SPE Statistic Decomposition

The SPE statistic, which reflects the change of the random quantity in the residual subspace, also has a quadratic form:

$$\begin{aligned} \begin{aligned} \mathrm{{SPE}}&:=\mathrm{Q} = {{\boldsymbol{e} \boldsymbol{e}^\mathrm {T}}} = {\boldsymbol{x}}\left( {\boldsymbol{I}-\boldsymbol{PP}^\mathrm {T}} \right) \left( {\boldsymbol{I}-\boldsymbol{PP}^\mathrm {T}} \right) ^\mathrm {T}{\boldsymbol{x}^\mathrm {T}}\\&= {{\boldsymbol{x B x}^\mathrm {T}}} = \sum \limits _{i = 1}^J \sum \limits _{j = 1}^J b_{i,j} x_i x_j, \end{aligned} \end{aligned}$$
(6.5)

where \({\boldsymbol{B}} = \left( {\boldsymbol{I}-\boldsymbol{PP}^\mathrm {T}} \right) \left( {\boldsymbol{I}-\boldsymbol{PP}^\mathrm {T}}\right) ^\mathrm {T}\), \(b_{i,j}\) is the element of matrix \({\boldsymbol{B}}\), and \(b_{i,j}=b_{j,i}\). Similar to the decomposition of the \(\mathrm{T^2}\) statistic, the \(\mathrm {SPE}\) statistic can also be decomposed into a series of new statistics, one for each variable.

Firstly, the SPE statistic \(\mathrm Q\) can be reformulated in terms of a single variable \({x}_k\):

$$\begin{aligned} \mathrm{Q} = \mathrm{Q}_k = b_{k,k} x_k^2 + \left( 2\sum \limits _{j = 1,j \ne k}^J b_{k,j} x_j \right) x_k + \sum \limits _{i = 1,i \ne k}^J \sum \limits _{j = 1,j \ne k}^J b_{i,j} x_i x_j. \end{aligned}$$
(6.6)

The minimizer \(x_k^*\) of \(\mathrm{{Q}}_k\) is obtained by setting the derivative with respect to \(x_k\) to zero:

$$\begin{aligned} \frac{\partial \mathrm{Q}_k}{\partial x_k}\bigg |_{x_k = x_k^*} = 2b_{k,k} x_k^* + 2\sum \limits _{j = 1,j \ne k}^J b_{k,j} x_j = 0 \Rightarrow x_k^* = - \sum \limits _{j = 1,j \ne k}^J b_{k,j} x_j /b_{k,k}\end{aligned}$$
(6.7)
$$\begin{aligned} \mathrm{{Q}}_k^{\min }&= - {b_{k,k}} x_k^{*2} + \sum \limits _{i = 1,i \ne k}^J \sum \limits _{j = 1,j \ne k}^J b_{i,j} x_i x_j. \end{aligned}$$
(6.8)

The difference between the SPE statistic \(\mathrm{Q}\) and \(\mathrm{{Q}}_k^{\min } \) is

$$\begin{aligned} \mathrm{Q} - \mathrm{{Q}}_k^{\min } = b_{k,k}\left( {{ x_k} - x_k^*} \right) ^2. \end{aligned}$$
(6.9)

The sum of the \(\mathrm{{Q}}_k^{\min }\) for \(k=1,2,\ldots , J\) is

$$\begin{aligned} \begin{aligned} \sum \limits _{k = 1}^J \mathrm{Q}_k^{\min }&= \sum \limits _{k = 1}^J {\left( { - {b_{k,k}} x_k^{*2} + \sum \limits _{i = 1,i \ne k}^J {\sum \limits _{j = 1,j \ne k}^J {{b_{i,j}}{ x_i}{ x_j}} } } \right) } \\&= \left( {J - 2} \right) \mathrm{Q} + \sum \limits _{k = 1}^J {{b_{k,k}}\left( { x_k^2 - x_k^{*2}} \right) }. \end{aligned} \end{aligned}$$
(6.10)

Combining (6.9) and (6.10), the SPE statistic can be evaluated as the sum of the contributions of each variable \(x_k\):

$$\begin{aligned} \begin{aligned} \mathrm{Q} =&\sum \limits _{k = 1}^J \frac{b_{k,k}}{2}\left[ \left( x_k - x_k^* \right) ^2 + \left( x_k^2 - x_k^{*2} \right) \right] \\ =&\sum \limits _{k = 1}^J b_{k,k}\left( x_k^2 - x_k^* x_k \right) \\ =&\sum \limits _{k = 1}^J q_k^\mathrm{SPE}. \end{aligned} \end{aligned}$$
(6.11)

The per-variable SPE statistic used to monitor the system status is

$$\begin{aligned} q_k^\mathrm{{SPE}} = b_{k,k}\left( x_k^2 - x_k^* x_k \right) . \end{aligned}$$
(6.12)

So the novel SPE statistic can be evaluated as a unique sum of the contributions of each variable \({q}_k^\mathrm{{SPE}}\; (k=1,2,\ldots , J)\), which is used for original process variable monitoring.
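Mirroring the \(\mathrm{T^2}\) case, a minimal sketch of the per-variable SPE statistics in (6.5)–(6.12) follows; the function name and inputs are illustrative assumptions.

```python
import numpy as np

def spe_contributions(x, P):
    """Per-variable SPE statistics q_k^SPE of (6.11)-(6.12) for one sample x."""
    J = len(x)
    resid_proj = np.eye(J) - P @ P.T           # I - P P^T
    B = resid_proj @ resid_proj.T              # eq. (6.5)
    q = np.empty(J)
    for k in range(J):
        x_star = -(B[k] @ x - B[k, k] * x[k]) / B[k, k]   # eq. (6.7)
        q[k] = B[k, k] * (x[k] ** 2 - x_star * x[k])      # eq. (6.12)
    return q

# The decomposition is exact: q.sum() recovers the SPE statistic e e^T of (6.5).
```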

6.1.3 Fault Diagnosis in Original Variable Space

Similar to other PCS monitoring strategies, the proposed original variable monitoring technique consists of two stages, executed offline and online. Firstly, the control limits of the two statistics (\(\mathrm{T^2}\) and SPE) for each time interval are determined from a reference population of normal batches in the offline stage. Next, the two statistics are calculated at each sampling instant during the online stage. If one of the statistics exceeds the established control limit, then a faulty mode is declared.

The historical data of the batch process are composed of a three-dimensional array \({\boldsymbol{X}(I\times J\times K)}\), where I, J, and K are the numbers of batches, process variables, and sampling times, respectively. The three-dimensional process data must be unfolded into two-dimensional forms \({\boldsymbol{X}_k(I\times J)}, k=1,2,\ldots ,K\) before performing the PCA operation. Each unfolded matrix \({\boldsymbol{X}_k}\) is normalized to zero mean and unit variance in each variable. The main nonlinear and dynamic components of the variables remain in the scaled data matrix \({\boldsymbol{X}_k}\).
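The time-slice unfolding and normalization can be sketched as follows; the dimensions are hypothetical, and the training means and standard deviations are kept so that new online samples can later be normalized consistently.

```python
import numpy as np

# Hypothetical historical data: I batches, J variables, K sampling times.
I, J, K = 30, 11, 400
rng = np.random.default_rng(1)
X3 = rng.standard_normal((I, J, K))

# Time-slice unfolding: one (I x J) matrix per sampling instant k,
# normalized to zero mean and unit variance per variable.
means = X3.mean(axis=0)                # (J, K) per-variable, per-time means
stds = X3.std(axis=0, ddof=1)          # (J, K) per-variable, per-time stds
X_slices = [(X3[:, :, k] - means[:, k]) / stds[:, k] for k in range(K)]
```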

The normalized data matrix \({\boldsymbol{X}_k}\) is projected into the principal component subspace by the loading matrix \({\boldsymbol{P}_k}\) to obtain the score matrix \({\boldsymbol{T}_k}\):

$$\begin{aligned} {{{\boldsymbol{X}}}_k} = {{{\boldsymbol{T}}}_k}{{\boldsymbol{P}}}_k^T + {{{\boldsymbol{E}}}_k}, \end{aligned}$$

where \({\boldsymbol{E}_k}\) is the residual matrix. The two statistics associated with the ith batch for the jth variable in the kth time interval are denoted by \( c_{i,j,k}^D\) and \( q_{i,j,k}^\mathrm{{SPE}}\).

The control limit of a continuous process can be determined by the kernel density estimation (KDE) method. Another method has been used to calculate the control limit for batch processes, in which the limit is determined by the mean and variance of each statistic (Yoo et al. 2004; Alvarez et al. 2007). The mean and variance of \( c_{i,j,k}^D\) are calculated as follows:

$$\begin{aligned} \begin{aligned} \bar{c}_{j,k}^D&= \sum \limits _{i = 1}^I {c_{i,j,k}^D} /I\\ \mathrm{{var}} \left( {c}_{j,k}^D\right)&= \sum \limits _{i = 1}^I {{{(c_{i,j,k}^D- \bar{c}_{j,k}^D)}^2}} /(I - 1). \end{aligned} \end{aligned}$$
(6.13)

The control limit of the statistic \( c_{i,j,k}^D\) is estimated as

$$\begin{aligned} c_{j,k}^{{\mathrm {limit}}} = \bar{c}_{j,k}^D + \lambda _1 \left( \mathrm{var} \left( c_{j,k}^D \right) \right) ^{1/2}, \end{aligned}$$
(6.14)

where \(\lambda _1\) is a predefined parameter. Similarly, the control limit of the statistic \( q_{i,j,k}^\mathrm{SPE}\) is

$$\begin{aligned} q_{j,k}^{{\mathrm {limit}}} = \bar{q}_{j,k}^\mathrm{SPE} + \lambda _2 \left( {\mathrm var} \left( q_{j,k}^\mathrm{SPE} \right) \right) ^{\frac{1}{2}}, \end{aligned}$$
(6.15)

where \(\lambda _2\) is a predefined parameter, and

$$\begin{aligned} \begin{aligned} \bar{q}_{j,k}^\mathrm{SPE}&= \sum \limits _{i = 1}^I { q_{i,j,k}^\mathrm{SPE}} /I\\ {\mathop {\mathrm var}} ( q_{j,k}^\mathrm{SPE})&= \sum \limits _{i = 1}^I {{{( q_{i,j,k}^\mathrm{SPE} - \bar{q}_{j,k}^\mathrm{SPE})}^2}} /(I - 1). \end{aligned} \end{aligned}$$
(6.16)
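A sketch of the control-limit estimation in (6.13)–(6.16) is given below; the statistic array and the parameter value are assumptions for illustration only.

```python
import numpy as np

def control_limit(stats, lam):
    """Limits of (6.14)-(6.15): mean + lam * std over the I training batches.
    `stats` has shape (I, J, K): batches x variables x sampling times."""
    mean = stats.mean(axis=0)              # (J, K), as in (6.13)/(6.16)
    var = stats.var(axis=0, ddof=1)        # (J, K) sample variances
    return mean + lam * np.sqrt(var)       # (J, K) per-variable, per-time limits

# Example with hypothetical training statistics and lam = 3 (Shewhart-like):
c_train = np.abs(np.random.default_rng(2).standard_normal((30, 11, 400)))
c_limit = control_limit(c_train, lam=3.0)
```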

As shown above, the control limit calculation is simple. Although the amount of computation increases, the extra calculations can be performed offline, so they impose no restriction on the online monitoring stage. The proposed monitoring technique, comprising the offline and online stages, is summarized as follows:

A. Offline Stage

  1. Obtain the normal process data of I batches \({\boldsymbol{X}}\), unfold them into two-dimensional time-slice matrices \({\boldsymbol{X}}_k\), and then normalize the data.

  2. Perform the PCA procedure on the normalized matrix \({\boldsymbol{X}}_k\) of each time slice and obtain the loading matrices \({\boldsymbol{P}}_k\).

  3. Calculate the statistics \( c_{i,j,k}^D\) and \( q_{i,j,k}^\mathrm{{SPE}}\) of each variable at all time intervals for all batches, then calculate the variable contributions at each time interval using (6.4) and (6.12).

  4. Estimate the control limits of the statistics \( c_{i,j,k}^D\) and \( q_{i,j,k}^\mathrm{{SPE}}\) using (6.14) and (6.15).

B. Online Stage

  1. Collect the new sampling time-slice data \(x_\mathrm{{new}}\), and then normalize them based on the mean and variance of the prior normal I batches (the modeling data).

  2. Use \({\boldsymbol{P}}_k\) to calculate the new statistics \( c_{i,j,k}^D\) and \( q_{i,j,k}^\mathrm{{SPE}}\) of the new sample, and judge whether these statistics exceed their control limits. If one of them exceeds its control limit, fault identification is performed to find the faulty variable, i.e., the one that exceeds its control limit by a much greater margin than the others; if none exceeds the control limit, the current data are normal. A compact sketch of this online check follows the list.
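The following sketch ties the offline outputs to the online check of one normalized sample; the function name and the per-variable limit arrays are illustrative assumptions consistent with the sketches above.

```python
import numpy as np

def monitor_sample(x_new, P, eigvals_m, c_limit_k, q_limit_k):
    """Online stage for one normalized time-slice sample: compute the
    per-variable statistics of (6.4) and (6.12) and flag limit violations."""
    J = len(x_new)
    A = P @ np.diag(1.0 / eigvals_m) @ P.T
    resid_proj = np.eye(J) - P @ P.T
    B = resid_proj @ resid_proj.T
    # c_k = a_kk x_k^2 + x_k * sum_{j != k} a_kj x_j  (equivalent to (6.4))
    c = np.array([A[k, k] * x_new[k] ** 2
                  + x_new[k] * (A[k] @ x_new - A[k, k] * x_new[k])
                  for k in range(J)])
    q = np.array([B[k, k] * x_new[k] ** 2
                  + x_new[k] * (B[k] @ x_new - B[k, k] * x_new[k])
                  for k in range(J)])
    alarms = (c > c_limit_k) | (q > q_limit_k)   # per-variable fault flags
    return alarms, c, q
```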

6.2 Combined Index-Based Fault Diagnosis

The monitoring method in the original process variables can avoid some of the disadvantages of the traditional statistical approach in the latent variable space, such as indirect monitoring (Yoo et al. 2004). However, the original variable monitoring method is relatively complicated because each variable is monitored in both the SPE and \(\mathrm{T^2}\) statistics. Each variable is thus monitored twice, which increases the computational load. Therefore, a new combined index, composed of the SPE and \(\mathrm{T^2}\) statistics, is proposed to decrease the monitoring complexity.

6.2.1 Combined Index Design

In this section, we use the symbol \({\boldsymbol{X}}(I\times J)\) in place of the unfolded process data matrix \({\boldsymbol{X}}_k(I\times J)\) for general analysis. Similarly, \({\boldsymbol{P}}_k, {\boldsymbol{T}}_k, {\boldsymbol{E}}_k\) are replaced by \({\boldsymbol{P,T,E}}\). The process data \({\boldsymbol{X}}\) can be decomposed into the PCS and the RS when performing PCA:

$$\begin{aligned} {\boldsymbol{X}}={\boldsymbol{TP}}^\mathrm {T}+{\boldsymbol{E}} = {\boldsymbol{\hat{X}}}+{\boldsymbol{E}}, \end{aligned}$$
(6.17)

where \({ \hat{\boldsymbol{X}}}\) lies in the PCS and \({\boldsymbol{E}}\) in the RS. If the number of principal components is m, then a PCS of dimension m and an RS of dimension \(J-m\) are obtained. When new data \({\boldsymbol{x}}\) are measured, they are projected into the principal subspace:

$$\begin{aligned} {\boldsymbol{ t}}={\boldsymbol{xP}}. \end{aligned}$$
(6.18)

The principal component (PC) score vector \({\boldsymbol{t}}(1\times m)\) is the projection of new data \({\boldsymbol{x}}\) in the PCS. Subsequently, the PC score vector is projected back into the original process variables to estimate the process data \({\hat{\boldsymbol{x}}=\boldsymbol{t}\boldsymbol{P}^\mathrm {T}}\). The residual vector \({\boldsymbol{e}}\) is defined as

$$\begin{aligned} {\boldsymbol{e}}={\boldsymbol{x}}-{\boldsymbol{\hat{x}}}={\boldsymbol{x}}\left( {\boldsymbol{I}}-{\boldsymbol{PP}}^\mathrm {T}\right) . \end{aligned}$$
(6.19)

Residual vector \({\boldsymbol{e}}\) reflects the difference between new data \({\boldsymbol{x}}\) and modeling data \({\boldsymbol{X}}\) in the RS. A graphical interpretation of \(\mathrm{T^2}\) and SPE statistics is shown in Fig. 6.1.

Fig. 6.1 Graphical representation of \(\mathrm{T^2}\) and SPE statistics

To describe the statistics clearly in geometric terms, the principal component subspace is taken as a hyperplane. The SPE statistic checks the model validity by measuring the distance between the data in the original process variables and their projection onto the model plane. The \(\mathrm{T^2}\) statistic is generally described by the Mahalanobis distance from the projected point \({\boldsymbol{t}}\) to the projection center of the normal process data, which checks whether the new observation is projected within the limits of normal operation. The residual space is perpendicular to the principal hyperplane, and the SPE statistic gives the distance from the new data \({\boldsymbol{x}}\) to the principal component hyperplane.

A new distance index \(\varphi \) from the new data to the principal component projection center of the modeling data is given in the following. It can be used for monitoring instead of the SPE and \(\mathrm{T^2}\) indicators. Consider the singular value decomposition (SVD) of the covariance matrix \({\boldsymbol{R}}_x={\mathbb {E}}\left( {\boldsymbol{X}^\mathrm {T}\boldsymbol{X}}\right) \) for given normal data \({\boldsymbol{X}}\),

$$\begin{aligned} {\boldsymbol{R}}_x = {\boldsymbol{U}\varLambda \boldsymbol{U}^\mathrm {T}}, \end{aligned}$$

where \(\boldsymbol{\varLambda }=\mathrm{diag}\{\lambda _1,\lambda _2,\ldots ,\lambda _m,{\boldsymbol{0}}_{J-m}\}\) contains the eigenvalues of \({{\boldsymbol{R}}}_x\). The full loading matrix \({\boldsymbol{U}}_{J\times J}\) is orthogonal, with \({\boldsymbol{UU}^\mathrm {T}=\boldsymbol{I}}\), so its columns form an orthonormal basis of the measurement space. The basis vectors of the principal component space and the residual space obtained by partitioning \({\boldsymbol{U}}\) are orthogonal to each other. Furthermore,

$$\begin{aligned} {\boldsymbol{U}=\left[ \boldsymbol{P},\; \boldsymbol{P}_e \right] }, \end{aligned}$$
(6.20)

where \({\boldsymbol{P}}\in R^{J\times m}\) is the loading matrix and \({\boldsymbol{P}}_e\in R^{J\times (J-m)}\) can be treated as the loading matrix of the residual space. Thus, \({\boldsymbol{P}}\) and \({\boldsymbol{P}}_e\) can be expressed in terms of \({\boldsymbol{U}}\) as follows:

$$\begin{aligned} {\boldsymbol{P}=\boldsymbol{UF}}_1,\;{\boldsymbol{P}}_e={\boldsymbol{UF}}_2, \end{aligned}$$
(6.21)

where

$$\begin{aligned} {\boldsymbol{F}}_1=\begin{bmatrix}{\boldsymbol{I}}_m\\ {\boldsymbol{0}}_{J-m,m} \end{bmatrix}_{J\times m},\; {\boldsymbol{F}}_2=\begin{bmatrix}{\boldsymbol{0}}_{m,J-m}\\ {\boldsymbol{I}}_{J-m} \end{bmatrix}_{J\times (J-m)}, \end{aligned}$$
(6.22)

where \({\boldsymbol{I}}_m\) and \({\boldsymbol{I}}_{J-m}\) are the m- and \((J-m)\)-dimensional identity matrices, respectively, and \({\boldsymbol{0}}_{J-m,m}\) and \({\boldsymbol{0}}_{m,J-m}\) are zero matrices of the indicated dimensions. Furthermore, the SPE and \(\mathrm{T^2}\) statistics can be expressed in terms of \({\boldsymbol{U}}\):

$$\begin{aligned} \begin{aligned} {\boldsymbol{e}}&= {\boldsymbol{x}\left( \boldsymbol{I-PP}^\mathrm {T}\right) =\boldsymbol{x}\left( \boldsymbol{UU}^\mathrm {T}- \boldsymbol{U}\boldsymbol{F}_1\boldsymbol{F}_1^\mathrm {T}\boldsymbol{U}^\mathrm {T}\right) }\\&={\boldsymbol{x}\left( \boldsymbol{UU}^\mathrm {T}- \boldsymbol{UE}_1 \boldsymbol{U}^\mathrm {T}\right) =\boldsymbol{xU}\left( \boldsymbol{I}-\boldsymbol{E}_1 \right) \boldsymbol{U}^\mathrm {T}=\boldsymbol{xUE}_2\boldsymbol{U}^\mathrm {T}}, \end{aligned} \end{aligned}$$
(6.23)

where

$$\begin{aligned} {\boldsymbol{E}_1} = \begin{bmatrix}{\boldsymbol{I}}_{m} & {\boldsymbol{0}}_{m,J-m}\\ {\boldsymbol{0}}_{J-m,m} & {\boldsymbol{0}}_{J-m}\end{bmatrix},\; {\boldsymbol{E}_2} = \begin{bmatrix}{\boldsymbol{0}}_{m} & {\boldsymbol{0}}_{m,J-m}\\ {\boldsymbol{0}}_{J-m,m} & {\boldsymbol{I}}_{J-m}\end{bmatrix}. \end{aligned}$$
(6.24)

Define \({\boldsymbol{y}=\boldsymbol{xU}}\), then

$$\begin{aligned} \begin{aligned} \mathrm{SPE}&:=\mathrm{Q} = {\boldsymbol{ee}^\mathrm {T}=\boldsymbol{xUE}} _2{\boldsymbol{U}^\mathrm {T}\boldsymbol{UE}}_2 {\boldsymbol{U}^\mathrm {T}\boldsymbol{x}^\mathrm {T}}\\&={\boldsymbol{xUE}}_2{\boldsymbol{U}^\mathrm {T}\boldsymbol{x}^\mathrm {T}= \boldsymbol{yE}}_2 {\boldsymbol{y}^\mathrm {T}} = \sum _{i=m+1}^J y_i^2. \end{aligned} \end{aligned}$$
(6.25)

Similarly, we can describe the \(\mathrm{T^2}\) statistic as follows:

$$\begin{aligned} \begin{aligned} \mathrm{T^2}&:=\mathrm{D} = {\boldsymbol{t}\varLambda }^{-1}_m {\boldsymbol{t}^\mathrm {T}=\boldsymbol{xP}} {\boldsymbol{\varLambda }}^{-1}_m {\boldsymbol{P}^\mathrm {T}\boldsymbol{x}^\mathrm {T}}\\&={\boldsymbol{xUF}}_1{\boldsymbol{\varLambda }}^{-1}_m {\boldsymbol{F}}_1^\mathrm {T}{\boldsymbol{U}^\mathrm {T}\boldsymbol{x}^\mathrm {T}=\boldsymbol{xU\varLambda }}^{-1}{\boldsymbol{U}^\mathrm {T}\boldsymbol{x}^\mathrm {T}}\\&={\boldsymbol{y}\varLambda }^{-1} {\boldsymbol{y}^\mathrm {T}} = \sum _{i=1}^m y_i^2\sigma _i^2, \end{aligned} \end{aligned}$$
(6.26)

where

$$\begin{aligned} {\boldsymbol{\varLambda }} _m^{ - 1} = \mathrm{diag}\{ \sigma _1^2,\sigma _2^2, \ldots ,\sigma _m^2\},\quad \sigma _i^2 = 1/\lambda _i,\quad {{\boldsymbol{\varLambda }} ^{ - 1}} = \begin{bmatrix} {\boldsymbol{\varLambda }} _m^{ - 1} & {\boldsymbol{0}}\\ {\boldsymbol{0}} & {\boldsymbol{0}}_{J - m} \end{bmatrix}. \end{aligned}$$

The new combined index could be obtained directly by composing the two statistics as

$$\begin{aligned} \varphi = D + \mathrm{Q} = \sum _{i=1}^m y_i^2\sigma _i^2+ \sum _{i=m+1}^J y_i^2. \end{aligned}$$
(6.27)

The derivation above proves that the two statistics can be added together directly in the geometric sense: \(\mathrm{T^2}\) and SPE combine without weighting factors, which is an intrinsic property of the PCA geometry. Thus, the combined index is a more general and geometric representation than other combined indices. The monitoring strategy with the novel index is introduced in the next subsection.
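Equations (6.20)–(6.27) can be verified numerically. The sketch below, on hypothetical normalized data, checks that \(\varphi \) computed from the rotated coordinates \(\boldsymbol{y}=\boldsymbol{xU}\) equals \(\mathrm{T^2}+\mathrm{SPE}\) computed from their usual definitions.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((200, 11))           # hypothetical normalized data
m = 3

R = np.cov(X, rowvar=False)                  # covariance matrix R_x
eigvals, U = np.linalg.eigh(R)
U, eigvals = U[:, ::-1], eigvals[::-1]       # descending eigenvalue order
P = U[:, :m]                                 # loading matrix, eq. (6.20)

x = rng.standard_normal(11)                  # one new sample
y = x @ U                                    # coordinates in the U basis
T2 = np.sum(y[:m] ** 2 / eigvals[:m])        # eq. (6.26)
SPE = np.sum(y[m:] ** 2)                     # eq. (6.25)
phi = T2 + SPE                               # combined index, eq. (6.27)

# Cross-check against the standard definitions of the two statistics.
assert np.isclose(SPE, x @ (np.eye(11) - P @ P.T) @ x)
assert np.isclose(T2, (x @ P) @ np.diag(1.0 / eigvals[:m]) @ (x @ P))
```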

6.2.2 Control Limit of Combined Index

In Sect. 6.1, the \(\mathrm{T^2}\) and SPE statistics were decomposed into two new statistics for each variable. To reduce the calculation of process monitoring, the two new statistics are combined into a new statistic \(\varphi \) to monitor the process:

$$\begin{aligned} {\varphi _{i,j,k}} = c_{i,j,k}^\mathrm{D} + q_{i,j,k}^\mathrm{SPE}, \end{aligned}$$
(6.28)

where \(\varphi _{i,j,k}\) is the combined statistic of the ith batch for the jth variable at sampling time k. The method described in Sect. 6.1.3 can be used to calculate the control limit of the new statistic,

$$\begin{aligned} \varphi _{j,k}^{\mathrm{{limit}}} = \bar{\varphi }_{j,k} + \kappa \left( \mathrm{var} \left( \varphi _{j,k} \right) \right) ^{1/2}, \end{aligned}$$
(6.29)

where \(\kappa \) is a predefined parameter, and

$$\begin{aligned} \begin{aligned} \bar{\varphi }_{j,k}&= \sum \limits _{i = 1}^I \varphi _{i,j,k} /I \\ \mathrm{var} \left( \varphi _{j,k}\right)&= \sum \limits _{i = 1}^I \left( \varphi _{i,j,k} - \bar{\varphi }_{j,k}\right) ^2 /(I - 1). \end{aligned} \end{aligned}$$
(6.30)

Online process monitoring is then performed by comparing the new statistic with its control limit. Several points should be highlighted when the proposed control limit is used. Firstly, the mean and variance may be inaccurate for a small number of samples; thus, a sufficient number of training samples should be collected during the offline stage. Secondly, the predefined parameter is important, and it is chosen by engineers according to the actual process conditions. The tuning of \(\kappa \) is similar to that of the Shewhart control chart. Equation (6.29) shows that \(\kappa \) scales the influence of the variance term, so the sample-to-sample fluctuation of the control limit also depends on \(\kappa \): the control limit is smooth when \(\kappa \) is small and fluctuates when \(\kappa \) is large.

If the combined statistic of the new sample differs significantly from those of the reference data set, then a fault is detected, and a fault isolation procedure is launched to find the fault roots. This fault response process is one of the advantages of original process variable monitoring, as each variable has a unique formulation and physical meaning. The proposed monitoring steps are similar to those in Sect. 6.1.3.

6.3 Case Study

A fed-batch penicillin fermentation process is considered in the case study; its detailed mathematical model is given in Birol et al. (2002). A detailed description of the fed-batch penicillin fermentation process is available in Chap. 4.

6.3.1 Variable Monitoring via Two Statistics Decomposition

Firstly, the original process variable monitoring algorithm of Sect. 6.1.3 is tested. Presenting the monitoring results of all variables would be interminable and tedious, so only several typical variables are shown here for demonstration and comparison. The monitoring result of variable 1 in a test normal batch is shown in Fig. 6.2. Neither of the two statistics (\( c_{1,k}^D\) and \( q_{1,k}^\mathrm{{SPE}}\)) exceeds its control limit, and the statistics (\( c_{j,k}^D\) and \( q_{j,k}^\mathrm{{SPE}}, j=2,\ldots ,11\)) of all the other variables do not exceed their control limits either. The monitoring results of the other variables are similar to that of variable 1 and are omitted for brevity. These results show that the proposed algorithm raises no false alarm when monitoring the normal batch.

Fig. 6.2 Original variable monitoring for a normal batch (variable 1)

Next, the fault batch data are used to test the proposed monitoring algorithm of the original process variables, and two types of faults are chosen here.

Fault 1: step type, i.e., a 20% step decrease is added to variable 3 at 200–250 h.

The monitoring results are shown as follows. Figure 6.3 shows the monitoring result of variable 1 for Fault 1; the statistics change noticeably while the fault is present. However, they do not exceed the control limits, i.e., the process status changes, but variable 1 is not the fault source. The monitoring results of variables 2, 4, 8, 9, and 11 are almost the same as that of variable 1 and are not presented here.

Fig. 6.3 Monitoring result for Fault 1 (variable 1)

Fig. 6.4 Monitoring result for Fault 1 (variable 3)

Fig. 6.5 Monitoring result for Fault 1 (variable 5)

The monitoring results of variables 3 and 5 are shown in Figs. 6.4 and 6.5, respectively. Both variables' statistics exceed the control limit at the sampling time of 200 h. The statistics of variables 6, 7, and 10 also exceed the control limit, and their results are nearly the same as those of variable 5 (not presented here).

The question is: which variable is the fault source, variable 3, variable 5, or another? From the amplitudes in Figs. 6.4 and 6.5, it is easy to see that the two statistics of variable 3 exceed the control limits to a much greater extent than those of variable 5 and the other variables. In particular, the \(\mathrm Q\) statistic of variable 3 is 40 times greater than its control limit. From this perspective, variable 3 can be concluded to be the fault source, as it contributes to the statistics far more strongly than the others. Note that there is no smearing effect in the proposed method. The smearing effect means that non-faulty variables exhibit larger contribution values, while the contributions of faulty variables are smaller. Because the statistics are decomposed into a unique sum of variable contributions, each monitoring figure is plotted against the decomposed statistics of a single variable. The proposed method may identify several faulty variables if they have large contributions of similar magnitude.

To confirm the monitoring conclusion, the relative statistical contribution rate of the jth variable at time k is defined as

$$\begin{aligned} R_c^{j,k}&= c_{j,k}^\mathrm{D}/\sum \limits _{j = 1}^J { c_{j,k}^\mathrm{D}} \\ R_q^{j,k}&= q_{j,k}^\mathrm{SPE}/\sum \limits _{j = 1}^J { q_{j,k}^\mathrm{SPE}}. \end{aligned} $$
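These rates are simple normalizations of the decomposed statistics; a minimal sketch with hypothetical inputs is given below.

```python
import numpy as np

def relative_contributions(c_jk, q_jk):
    """Relative contribution rates R_c and R_q over the J variables at one
    sampling instant, from the decomposed statistics c_{j,k} and q_{j,k}."""
    return c_jk / c_jk.sum(), q_jk / q_jk.sum()

# Usage: the variable with the dominant rate is the candidate fault source.
c_jk = np.array([0.1, 0.2, 4.0, 0.3])        # hypothetical c statistics
q_jk = np.array([0.2, 0.1, 3.5, 0.2])        # hypothetical q statistics
R_c, R_q = relative_contributions(c_jk, q_jk)
fault_var = int(np.argmax(R_c))              # index of the suspected variable
```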

The relative statistic contribution rates of the 11 variables are shown in Figs. 6.6 and 6.7. It is clear that variable 3 is the source of Fault 1. Variables 9, 10, and 11 still show higher contributions after the fault is eliminated, because the fault in variable 3 causes changes in the other process variables. The effects on the whole process persist even after the fault is eliminated, and the fault evolves from the original variable 3 to other process variables.

Fig. 6.6 Relative contribution rate of \(R_c\) for Fault 1

Fig. 6.7 Relative contribution rate of \(R_q\) for Fault 1

Fig. 6.8 Fault 2 monitoring by the c statistic (variable 3)

Fig. 6.9 Fault 2 monitoring by the q statistic (variable 3)

Fault 2: ramp type, i.e., a ramp increase with a slope of 0.3 is added to variable 3 at 20–80 h.

The two monitoring statistics of variable 3 are shown in Figs. 6.8 and 6.9. Both statistics exceed the control limits at approximately 50 h. The alarm lags the fault occurrence time (20 h) because this fault variable changes gradually. When the fault is eliminated after 80 h, the relationships among the variables return to normal. The \(\mathrm{T^2}\) statistic clearly declines below the control limit, while the SPE statistic still exceeds the control limit because the error caused by Fault 2 persists.

6.3.2 Combined Index-Based Monitoring

The same test data as in Sect. 6.3.1 are used to test the monitoring effectiveness of the new combined index. Considering a normal batch, the monitoring result of the \(\varphi \) statistic is shown in Fig. 6.10. Variable 1 is again monitored in this section, as in Sect. 6.3.1, for comparison. The new index \(\varphi \) of variable 1 stays far below its control limit, as do the new index values of the other variables. The method performs well, and the number of false alarms is zero in normal batch monitoring. The new index is more stable than the two separate statistics and is easier for operators to observe.

Fig. 6.10 Original variable monitoring based on the combined index for a normal batch (variable 1)

Fig. 6.11 Fault 1 monitoring based on the combined index (variable 1)

Fault 1: step type, i.e., a 20% step decrease is added to variable 3 at 200–250 h.

Fig. 6.12 Fault 1 monitoring based on the combined index (variable 3)

Fig. 6.13 Fault 1 monitoring based on the combined index (variable 5)

The new statistic \(\varphi \) of variable 1 does not exceed the control limit in Fig. 6.11, although it changes from 200 to 250 h during the fault. The \(\varphi \) values of variables 2, 4, 8, 9, and 11 also do not exceed the control limit; the corresponding monitoring plots are omitted here. Thus, these variables have no direct relationship with the fault variable, i.e., they are not the fault source.

Furthermore, the monitoring results of variables 3 and 5 are shown in Figs. 6.12 and 6.13, respectively. The statistics of variables 3 and 5 clearly exceed their control limits, as do those of variables 6, 7, and 10. As discussed in Sect. 6.3.1, the statistic \(\varphi \) of variable 3 changes to a much greater extent than those of the other variables, so variable 3 is the potential fault source. This result shows that the proposed approach is an efficient technique for fault detection.

The relative contribution rate of the new statistic, defined as follows, is used to confirm the fault source:

$$\begin{aligned} R_\varphi ^{j,k} =\varphi _{j,k}/\sum \limits _{j = 1}^J {\varphi _{j,k}}. \end{aligned}$$
Fig. 6.14 Relative contribution rate of the \(\varphi \) statistic for Fault 1

The relative contribution of variable 3 is nearly 100%, as shown in Fig. 6.14, so variable 3 is confirmed as the fault source. Variables 9, 10, and 11 still show higher contributions after the fault is eliminated, because the fault in variable 3 causes changes in the other process variables and its effect on the whole process persists even after the fault is eliminated.

Note that the relative contribution plot (RCP) is an auxiliary tool to locate the fault roots. It is only used for comparison with the proposed monitoring method to confirm the diagnostic conclusions. Furthermore, the RCP in this work is completely different from the traditional contribution plot: it is calculated from the original process variables, so it has no smearing effect, and the contribution of each variable is independent of the other variables. Therefore, the proposed method is a novel and helpful approach to original process variable monitoring. Furthermore, the color map of the fault contributions is intuitive; it promotes the operator's initiative to find the fault source, and engineers can extract useful information from it to avoid more serious accidents.

Fault 2: ramp type, i.e., a ramp increase with a slope of 0.3 is added to variable 3 at 20–80 h.

The monitoring result of variable 3 is shown in Fig. 6.15. It can be seen that the new statistic \(\varphi \) exceeds the control limit at approximately 50 h, and then it falls below the control limit after 80 h. The result shows that the combined index can detect different faults.

Fig. 6.15 Fault 2 monitoring of variable 3 by the \(\varphi \) statistic

6.3.3 Comparative Analysis

The monitoring performances of different methods are compared using several performance indices. False alarms (FA) is the number of false alarms during the operation life. Time detected (TD) is the time at which the statistic first exceeds the control limit under faulty operation, which reflects the detection sensitivity.
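Under these definitions, the two indices can be computed from a boolean alarm sequence as sketched below; the function name and the uniform sampling assumption are illustrative.

```python
import numpy as np

def fa_and_td(alarms, fault_start, sample_interval=1.0):
    """FA: number of alarms before the fault occurs (false alarms).
    TD: time of the first alarm at or after the fault start, or None.
    `alarms` is a boolean array over the K sampling instants."""
    k_fault = int(fault_start / sample_interval)
    fa = int(np.sum(alarms[:k_fault]))
    hits = np.flatnonzero(alarms[k_fault:])
    td = (k_fault + hits[0]) * sample_interval if hits.size else None
    return fa, td
```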

The monitoring results of the proposed method are compared with those of the traditional sub-PCA method (Lu et al. 2004) in the latent space and the soft-transition sub-PCA (Wang et al. 2013) to illustrate its effectiveness. The FA and TD results for 12 other faults are presented in Tables 6.1 and 6.2, respectively. The fault variable numbers (1, 2, and 3) represent the aeration rate, agitator power, and substrate feed rate, as given in Chap. 4. The fault type and occurrence time for each variable are given in Table 6.1, and the input conditions are the same as those in Sects. 6.3.1 and 6.3.2.

It can be seen from Table 6.1 that the traditional sub-PCA method produces multiple false alarms, while the original process variable monitoring method based on the combined index \(\varphi \) proposed in this chapter shows fewer false alarms. Among the three indices of original-space monitoring, the c and q statistics may produce a large number of false alarms for different reasons, but the new combined index \(\varphi \) is more accurate because it balances the two.

Table 6.2 indicates that original process variable monitoring yields accurate and timely detection compared with the other two methods. The detection delay exceeds 10 h for Faults 4, 7, 8, and 11 under the traditional sub-PCA and the soft-transition sub-PCA; such a delay is unacceptable in a complex industrial process. In contrast, the difference between the detection time and the actual fault time for the proposed approach is less than 10 h, except for Fault 4. This result is helpful and meaningful in practice, as the proposed approach can provide more timely process information to operators. Thus, the proposed combined index-based monitoring method shows the advantages of rapid detection and fewer false alarms compared with the traditional and soft-transition sub-PCA approaches, which monitor in the latent space rather than in the original measurement space.

Table 6.1 FA monitoring results for the other faults
Table 6.2 Comparison of fault detection times

6.4 Conclusions

A new multivariate statistical method for the monitoring and diagnosis of batch processes, operating on the original process variables, was presented in this chapter. The proposed monitoring method is based on the decomposition of the \(\mathrm{T^2}\) and SPE statistics into a unique sum of individual variable contributions. However, problems may arise when the number of variables is large. To reduce the monitoring workload, a new combined index was proposed, and a mathematical derivation proved that the two decomposed statistics can be added together directly. Compared with the traditional PCA method in the latent space, the proposed method is direct, and only one statistical index is needed, thereby decreasing the computational burden.

The new original variable space monitoring method can detect a fault with a clear, per-variable result. The fault source can be determined directly from the statistical index rather than through the traditional contribution plot. Furthermore, the control limit of the new combined statistic is very simple to compute and does not require assuming a specific probability distribution. The simulation results show that the new combined statistic detects faults efficiently. As the new index combines the two decomposed statistics, it avoids many problems introduced by the use of a single statistic, such as false alarms or missed alarms.