Pareto Distribution-Based Shewhart Control Chart for Early Detection of Process Mean Shifts

Saghir, Aamir; Rao, Gadde Srinivasa; Aslam, Muhammad; Janjua, Azhar Ali

doi:10.1007/s44199-024-00071-1


Research Article
Open access
Published: 19 February 2024

Volume 23, pages 26–43, (2024)

Journal of Statistical Theory and Applications


Aamir Saghir¹,
Gadde Srinivasa Rao²,
Muhammad Aslam ORCID: orcid.org/0000-0003-0644-1950³ &
…
Azhar Ali Janjua⁴


Abstract

The Pareto distribution is of paramount importance in actuarial science, wealth distribution, finance, etc. This paper introduces a control chart inspired by Shewhart's methodology, designed for monitoring shifts in the Pareto distribution through a repetitive sampling approach. The chart employs a modified statistic that combines shape and threshold parameters as its plotting statistic. Coefficients for the Shewhart-type Pareto chart are computed for two-phase limits. The performance of the suggested chart is assessed in terms of run length characteristics, assuming a shift in the process mean. Additionally, we conduct an efficiency comparison with existing control charts. The findings suggest that, on average, the proposed Pareto chart demonstrates greater efficiency in promptly detecting changes compared to alternative methods. To illustrate the practical application of our approach, we present an example using revenue data.


1 Introduction

Control charts are widely used for monitoring in statistical process control (SPC). Typically, the quality characteristic of interest is assumed to follow a normal distribution. In practice this assumption is often not met, as discussed by various authors, including Saghir and Lin [12]. In such cases, relying on existing control charts may mislead industrial engineers and lead to an increase in non-conforming items; see Montgomery [9]. Many authors have developed variable control charts based on non-normal distributions, including, for example, Bai and Choi [3], Al-Oraini and Rahim [1], Lin and Chou [7], McCracken and Chakraborti [8], Saghir and Lin [12] and Saghir et al. [11].

Control charts that assume the quality characteristic follows a Pareto distribution, which is characterized by exceptionally heavy tails, have been formulated by researchers such as Guo and Wang [5], Nasiru [10], and Balamurali and Jeyadurga [4]. These studies collectively underscore the significance of the Pareto distribution in statistical process monitoring. Aslam et al. [2] introduced a variance control chart utilizing repetitive sampling, specifically when the average sample size aligns with a fixed sample size. Saleh et al. [13] similarly advocated the use of repetitive sampling under the condition that the average sample size equals the fixed sample size.

The primary drawback of the Shewhart control chart is its limited ability to detect small shifts in the process mean. Additional interpretation rules have been proposed to improve its sensitivity to such shifts, and repetitive sampling, used here to construct a Pareto control chart, is another way to address the problem. After a thorough examination of the existing literature, and to the best of our knowledge, no research has considered a control chart for the Pareto distribution that employs repetitive sampling. This paper addresses this gap by developing a Shewhart Pareto control chart within a repetitive sampling framework. Building upon the groundwork laid by Guo and Wang [5], we extend their work by incorporating repetitive sampling. The goal of this study is to design a control chart for the Pareto distribution using repetitive sampling, particularly when the average sample size aligns with a fixed sample size.

In sampling plans, subgroups are used to decide whether a lot is accepted or rejected, whereas in control charts subgroups are used to monitor process variation while the process is running. Based on the outcome of the selected subgroup and the behavior of earlier subgroups, the quality supervisor is instructed either to leave the process as it is or to defer the subgroup and make process adjustments. A change in the process configuration is addressed when many samples are deferred because they are not plotted, so the suggested approach is to cap the number of deferred subgroups in repetitive sampling. Because repetitive sampling-based control charts have two additional limits (the repetitive limits) inside the usual control limits, the limits reported in the literature fall between the 2-sigma and 3-sigma limits or, in many cases, inside the 1-sigma limits. The efficacy of the proposed control chart will be demonstrated in comparison to existing counterparts, and we will showcase its practical application.

2 The Proposed Pareto Control Chart Using Repetitive Sampling

In this section, we introduce a framework for a control chart that employs repetitive sampling under the assumption of a Pareto-distributed quality characteristic. Because only one Shewhart-type Pareto chart is available in the literature, the chart of Guo and Wang [5], we compare the proposed chart with it and discuss the results with reference to it. We also derive the performance measures for the proposed control chart.

2.1 Design of the Proposed Control Chart

Assume that X is a random variable following the Pareto distribution with shape parameter a and threshold parameter b. Then its probability density function and cumulative distribution function are, respectively, given by

$$f\left(x\right)= \frac{a{b}^{a}}{{x}^{a+1}}, \quad x>b,\; a>0,$$

(1)

and

$$F\left(x\right)= 1-{\left(\frac{{\text{b}}}{{\text{x}}}\right)}^{{\text{a}}}.$$

(2)

The mean and variance of this distribution are μ = ab/(a − 1) for a > 1 and σ² = ab²/[(a − 1)²(a − 2)] for a > 2, respectively.
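The density, distribution function and first two moments above can be evaluated directly. The following is a minimal sketch, not the authors' code (the paper reports using R); the function names are hypothetical and chosen only for illustration.

```python
def pareto_pdf(x, a, b):
    """Pareto density a*b^a / x^(a+1) for x > b, and 0 otherwise (Eq. 1)."""
    if a <= 0 or b <= 0:
        raise ValueError("shape a and threshold b must be positive")
    return a * b**a / x**(a + 1) if x > b else 0.0


def pareto_cdf(x, a, b):
    """Pareto distribution function 1 - (b/x)^a for x > b, and 0 otherwise (Eq. 2)."""
    if a <= 0 or b <= 0:
        raise ValueError("shape a and threshold b must be positive")
    return 1.0 - (b / x)**a if x > b else 0.0


def pareto_mean(a, b):
    """Mean a*b/(a-1); it exists only for a > 1."""
    if a <= 1:
        raise ValueError("the mean requires a > 1")
    return a * b / (a - 1)


def pareto_variance(a, b):
    """Variance a*b^2 / ((a-1)^2 (a-2)); it exists only for a > 2."""
    if a <= 2:
        raise ValueError("the variance requires a > 2")
    return a * b**2 / ((a - 1)**2 * (a - 2))


# Example with one of the parameter settings used later in the paper.
print(pareto_mean(5.0, 2.0), pareto_variance(5.0, 2.0))
```

The explicit parameter checks make the existence conditions (a > 1 for the mean, a > 2 for the variance) visible at the point of use.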

Further, consider the statistic $Y=\overline{x }\pm ks$, where $\overline{x }$ and $s$ are the sample mean and sample standard deviation, respectively, and k is a suitable constant for a given type-I error. It is well known that when the underlying distribution is normal, $\overline{x }$ and $s$ are independent, with $\overline{x }$ normally distributed and $s^{2}$ following a scaled chi-square distribution, and $Y$ is approximately normally distributed. When the underlying distribution is non-normal, the asymptotic distribution of $Y$ is normal with mean ${\upmu }_{y}= \upmu \pm k\sigma$ and variance

$${\sigma }_{y}^{2} = \frac{{\sigma }^{2}}{n} \left[1 + \frac{{k}^{2}}{4}\left({\alpha }_{4}- 1\right)\pm k{\alpha }_{3}\right].$$

where ${\alpha }_{3}$ and ${\alpha }_{4}$ are the skewness and kurtosis of the underlying distribution and k = 3 (as the asymptotic distribution of Y is normal). For the Pareto distribution, these measures, which characterize the asymmetry and tail weight of the distribution, are

$$\alpha_{3} = \frac{2\left(1 + a\right)}{a - 3}\sqrt{\frac{a - 2}{a}}\quad \text{for}\; a > 3$$

and

$$\alpha_{4} = \frac{3\left(a - 2\right)\left(3a^{2} + a + 2\right)}{a\left(a - 3\right)\left(a - 4\right)}\quad \text{for}\; a > 4.$$

(3)
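As an illustration of how the asymptotic mean and standard deviation of Y follow from the Pareto parameters, the sketch below evaluates the skewness, the kurtosis and the variance expression for $\sigma_y^2$. It is a hedged example rather than the authors' implementation: the function names and the `sign` argument (selecting the + or − branch of $Y=\overline{x}\pm ks$) are assumptions, and the kurtosis uses the standard Pareto result $3(a-2)(3a^{2}+a+2)/[a(a-3)(a-4)]$.

```python
import math


def pareto_skewness(a):
    """Skewness of the Pareto distribution; defined for a > 3."""
    if a <= 3:
        raise ValueError("skewness requires a > 3")
    return 2.0 * (1.0 + a) / (a - 3.0) * math.sqrt((a - 2.0) / a)


def pareto_kurtosis(a):
    """(Non-excess) kurtosis of the Pareto distribution; defined for a > 4."""
    if a <= 4:
        raise ValueError("kurtosis requires a > 4")
    return 3.0 * (a - 2.0) * (3.0 * a**2 + a + 2.0) / (a * (a - 3.0) * (a - 4.0))


def y_moments(a, b, n, k=3.0, sign=1):
    """Asymptotic mean and standard deviation of Y = xbar + sign*k*s.

    sign=+1 gives the upper branch and sign=-1 the lower branch.
    """
    mu = a * b / (a - 1.0)
    sigma2 = a * b**2 / ((a - 1.0)**2 * (a - 2.0))
    a3, a4 = pareto_skewness(a), pareto_kurtosis(a)
    mu_y = mu + sign * k * math.sqrt(sigma2)
    var_y = sigma2 / n * (1.0 + k**2 / 4.0 * (a4 - 1.0) + sign * k * a3)
    return mu_y, math.sqrt(var_y)


mu_y, sd_y = y_moments(a=5.0, b=2.0, n=5)
print(mu_y, sd_y)
```

Because both moment formulas require a > 4 through the kurtosis, the sketch raises an error for smaller shape values instead of returning a meaningless number.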

This work develops a Pareto control chart for the statistic $Y$ using the repetitive sampling scheme for decision making. The proposed control chart therefore has two pairs of control limits, outer and inner. The process is declared:

(i) out-of-control if the plotted statistic $Y$ falls outside the outer control limits;
(ii) in-control if the plotted statistic $Y$ falls within the inner control limits;
(iii) otherwise, undecided, and resampling is carried out with the same sample size.

The outer bound is

$$\begin{aligned} {\text{UCL}}_{1} & = \upmu_{y} + {\text{k}}_{1} {\upsigma }_{{\text{Y}}} \\ {\text{LCL}}_{1} & = \upmu_{y} - {\text{k}}_{1} {\upsigma }_{{\text{Y}}} \\ \end{aligned}$$

and the inner bound is

$$\begin{aligned} {\text{UCL}}_{2} & = \upmu_{y} + {\text{k}}_{2} {\upsigma }_{{\text{Y}}} \\ {\text{LCL}}_{2} & = \upmu_{y} - {\text{k}}_{2} {\upsigma }_{{\text{Y}}} \\ \end{aligned}$$

(4)

where ${{\text{k}}}_{1}$ and ${{\text{k}}}_{2}$ are the control chart coefficients, to be calculated for fixed ARL₀, sample size n and $\lambda$, and ${\upmu }_{y}$ and ${\upsigma }_{{\text{Y}}}$ are the mean and standard deviation of the statistic Y. When the parameters are unknown, the sample average $\overline{Y }$ and the sample standard deviation ${S}_{y}$ of ${\text{Y}}$, calculated from initial in-control samples, are used in Eq. (4).
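The two pairs of limits in Eq. (4) and the three-way decision rule can be sketched as follows. This is an illustrative example under stated assumptions: the function names and the returned labels are hypothetical, and the numeric inputs in the usage lines are arbitrary, not values from the paper.

```python
def control_limits(mu_y, sigma_y, k1, k2):
    """Outer (k1) and inner (k2) control limits of Eq. (4); requires k1 >= k2 > 0."""
    if not (k1 >= k2 > 0):
        raise ValueError("expected k1 >= k2 > 0")
    return {
        "LCL1": mu_y - k1 * sigma_y,
        "LCL2": mu_y - k2 * sigma_y,
        "UCL2": mu_y + k2 * sigma_y,
        "UCL1": mu_y + k1 * sigma_y,
    }


def decide(y, limits):
    """Repetitive-sampling decision for one plotted statistic y."""
    if y >= limits["UCL1"] or y <= limits["LCL1"]:
        return "out-of-control"
    if limits["LCL2"] <= y <= limits["UCL2"]:
        return "in-control"
    # y lies between an inner and an outer limit: draw a new sample of size n.
    return "resample"


limits = control_limits(mu_y=10.0, sigma_y=2.0, k1=3.0, k2=1.0)
for y in (10.0, 13.0, 17.0):
    print(y, decide(y, limits))
```

Setting k1 = k2 removes the resampling region, so the rule then reduces to an ordinary Shewhart-type chart with a single pair of limits.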

2.2 The Performance Measure

The average run length (ARL), defined as the average number of samples until the chart signals an out-of-control condition, is the most commonly used performance measure for control charts. When the process is in control, the ARL, denoted by ${{\text{ARL}}}_{0}$, should be sufficiently large to avoid frequent false alarms, whereas when the process is out of control, the ARL, denoted by ${{\text{ARL}}}_{1}$, should be sufficiently small.

2.2.1 In-Control ARL

The in-control ARL, denoted by ${{\text{ARL}}}_{0}$, is defined as:

$${{\text{ARL}}}_{0}=\frac{1}{{{\text{P}}}_{{\text{out}}}},$$

(5)

where ${{\text{P}}}_{{\text{out}}}$ denotes the probability that the process is declared out-of-control when it is actually in control. Under a repetitive sampling scheme it is given by:

$${{\text{P}}}_{{\text{out}}}=\frac{{{\text{P}}}_{{\text{out}}}^{0}}{1-{{\text{P}}}_{{\text{Res}}}}$$

where ${{\text{P}}}_{{\text{out}}}^{0}$, the probability that the process is declared out of control based on a single sample, is:

$${{\text{P}}}_{{\text{out}}}^{0}={\text{P}}\left({\text{Y}}\ge {{\text{UCL}}}_{1}|{\upmu }_{{\text{Y}}}={\upmu }_{0}\right)+\mathrm{ P}({\text{Y}}\le {{\text{LCL}}}_{1}|{\upmu }_{{\text{Y}}}={\upmu }_{0})$$

(6)

where ${\upmu }_{0}$ is the in-control mean of the Pareto process. As mentioned earlier, the asymptotic distribution of ${\text{Y}}$ is normal with mean ${\upmu }_{y}= \upmu \pm k\sigma$ and variance ${\sigma }_{y}^{2} = \frac{{\sigma }^{2}}{n} [1 + \frac{{k}^{2}}{4}({\alpha }_{4}- 1) \pm k{\alpha }_{3}]$. Thus, when the process is in-control,

$$\begin{aligned} {\text{P}}_{{\text{out}}}^{0} & = {\text{P}}\left( {\text{Y}} \ge {\text{UCL}}_{1} | \upmu_{{\text{Y}}} = \upmu_{0} \right) + {\text{P}}\left( {\text{Y}} \le {\text{LCL}}_{1} | \upmu_{{\text{Y}}} = \upmu_{0} \right) \\ & = {\text{P}}\left( \frac{{\text{Y}} - \upmu_{y}}{\sigma_{y}} \ge \frac{{\text{UCL}}_{1} - \upmu_{y}}{\sigma_{y}} | \upmu_{{\text{Y}}} = \upmu_{0} \right) + {\text{P}}\left( \frac{{\text{Y}} - \upmu_{y}}{\sigma_{y}} \le \frac{{\text{LCL}}_{1} - \upmu_{y}}{\sigma_{y}} | \upmu_{{\text{Y}}} = \upmu_{0} \right) \\ & = {\text{P}}\left( {\text{Z}} \ge {\text{k}}_{1} \right) + {\text{P}}\left( {\text{Z}} \le - {\text{k}}_{1} \right) \\ & = 1 - {\text{F}}\left( ., {\text{k}}_{1} \right) + {\text{F}}\left( ., - {\text{k}}_{1} \right) \\ \end{aligned}$$

(7)

where ${\text{Z}}$ is a standard normal variable and ${\text{F}}(.,\mathrm{ t})$ is the cumulative probability of ${\text{Z}}$ up to the point $t$.

The probability that the process is declared to be in control based on a single sample is:

$${{\text{P}}}_{{\text{in}}}^{0}={\text{P}}\left({\text{Y}}\le {{\text{UCL}}}_{2}|{\upmu }_{{\text{Y}}}={\upmu }_{0}\right)-\mathrm{ P}({\text{Y}}\le {{\text{LCL}}}_{2}|{\upmu }_{{\text{Y}}}={\upmu }_{0})$$

(8)

and when the process is in-control

$${{\text{P}}}_{{\text{in}}}^{0}={\text{P}}\left({\text{Z}}\le {{\text{k}}}_{2}\right)-\mathrm{ P}\left({\text{Z}}\le {-{\text{k}}}_{2}\right)={\text{F}}\left(.,{{\text{k}}}_{2}\right)-{\text{F}}\left(.,{-{\text{k}}}_{2}\right)$$

(9)

The probability of resampling, when the process is actually in control, is defined as:

$$\begin{aligned} {\text{P}}_{{\text{Res}}} & = {\text{P}}\left( {\text{UCL}}_{2} < Y < {\text{UCL}}_{1} | \upmu_{{\text{Y}}} = \upmu_{0} \right) + {\text{P}}\left( {\text{LCL}}_{1} < Y < {\text{LCL}}_{2} | \upmu_{{\text{Y}}} = \upmu_{0} \right) \\ & = {\text{P}}\left( {\text{k}}_{2} < Z < {\text{k}}_{1} \right) + {\text{P}}\left( - {\text{k}}_{1} < Z < - {\text{k}}_{2} \right) \\ & = {\text{F}}\left( ., {\text{k}}_{1} \right) - {\text{F}}\left( ., {\text{k}}_{2} \right) + {\text{F}}\left( ., - {\text{k}}_{2} \right) - {\text{F}}\left( ., - {\text{k}}_{1} \right) \\ \end{aligned}$$

(10)

Thus,

$${{\text{P}}}_{{\text{out}}}=\frac{{{\text{P}}}_{{\text{out}}}^{0}}{1-{{\text{P}}}_{{\text{Res}}}}=\frac{1-{\text{F}}\left(.,{{\text{k}}}_{1}\right)+{\text{F}}\left(.,{-{\text{k}}}_{1}\right)}{1-{\text{F}}\left(.,{{\text{k}}}_{1}\right)+{\text{F}}\left(.,{{\text{k}}}_{2}\right)-{\text{F}}\left(.,{-{\text{k}}}_{2}\right)+{\text{F}}\left(.,{-{\text{k}}}_{1}\right)}$$

(11)

The average sample size (ASS) for the proposed control chart is defined by

$${\text{AS}}{S}_{0}=\frac{n}{1-{{\text{P}}}_{{\text{Res}}}}$$

(12)
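The in-control quantities can be computed numerically from the standard normal distribution function. The sketch below is a hedged illustration, not the authors' R code: the function names are hypothetical, and the resampling probability is taken as $P(k_2<Z<k_1)+P(-k_1<Z<-k_2)$, so that $P_{out}=P_{out}^{0}/(1-P_{Res})$.

```python
import math


def norm_cdf(t):
    """Standard normal distribution function F(., t), computed via erfc."""
    return 0.5 * math.erfc(-t / math.sqrt(2.0))


def in_control_measures(k1, k2, n):
    """Return (ARL0, ASS0) for outer coefficient k1 and inner coefficient k2."""
    # Probability of an out-of-control signal from a single sample.
    p_out0 = (1.0 - norm_cdf(k1)) + norm_cdf(-k1)
    # Probability that a single sample falls between the inner and outer limits.
    p_res = (norm_cdf(k1) - norm_cdf(k2)) + (norm_cdf(-k2) - norm_cdf(-k1))
    p_out = p_out0 / (1.0 - p_res)
    return 1.0 / p_out, n / (1.0 - p_res)


# With k1 = k2 there is no resampling region, so the chart reduces to the
# classical 3-sigma case with ARL0 of about 370.4 and ASS0 equal to n.
arl0, ass0 = in_control_measures(k1=3.0, k2=3.0, n=5)
print(arl0, ass0)
```

Widening the gap between k2 and k1 raises the resampling probability, which increases ASS0 above n; this is the trade-off the chart coefficients have to balance.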

2.2.2 Out-of-Control ARL

Suppose that the process mean shifts from its in-control level ${\upmu }_{0}$ to ${\upmu }_{1}={\upmu }_{0}+\updelta {\upsigma }_{{\text{y}}}$, that is, a shift of $\updelta$ standard units in the mean of Y. Then the out-of-control ${{\text{ARL}}}_{1}$ is:

$${{\text{ARL}}}_{1}=\frac{1}{{{\text{P}}}_{{\text{out}},{\text{shift}}}}$$

(13)

where

$${{\text{P}}}_{{\text{out}},{\text{shift}}}=\frac{{{\text{P}}}_{{\text{out}},{\text{shift}}}^{1}}{1-{P}_{Res1}}$$

(14)

The probability that a single sample gives an out-of-control signal after the shift is

$$\begin{aligned} {\text{P}}_{{\text{out}},{\text{shift}}}^{1} & = {\text{P}}\left( {\text{Y}} \ge {\text{UCL}}_{1} | {\upmu }_{{\text{Y}}} = {\upmu }_{1} \right) + {\text{P}}\left( {\text{Y}} \le {\text{LCL}}_{1} | {\upmu }_{{\text{Y}}} = {\upmu }_{1} \right) \\ & = {\text{P}}\left( \frac{{\text{Y}} - \upmu_{1}}{\sigma_{y}} \ge \frac{{\text{UCL}}_{1} - \upmu_{1}}{\sigma_{y}} \right) + {\text{P}}\left( \frac{{\text{Y}} - \upmu_{1}}{\sigma_{y}} \le \frac{{\text{LCL}}_{1} - \upmu_{1}}{\sigma_{y}} \right) \\ & = {\text{P}}\left( {\text{Z}} \ge {\text{k}}_{1} - {\updelta } \right) + {\text{P}}\left( {\text{Z}} \le - {\text{k}}_{1} - {\updelta } \right) \\ & = 1 - {\text{F}}\left( ., {\text{k}}_{1} - {\updelta } \right) + {\text{F}}\left( ., - {\text{k}}_{1} - {\updelta } \right) \\ \end{aligned}$$

(15)

and the probability of resampling for the shifted mean is:

$$\begin{aligned} P_{Res1} & = {\text{P}}\left( {\text{k}}_{2} - {\updelta } < Z < {\text{k}}_{1} - {\updelta } \right) + {\text{P}}\left( - {\text{k}}_{1} - {\updelta } < Z < - {\text{k}}_{2} - {\updelta } \right) \\ & = {\text{F}}\left( ., {\text{k}}_{1} - {\updelta } \right) - {\text{F}}\left( ., {\text{k}}_{2} - {\updelta } \right) + {\text{F}}\left( ., - {\text{k}}_{2} - {\updelta } \right) - {\text{F}}\left( ., - {\text{k}}_{1} - {\updelta } \right) \\ \end{aligned}$$

(16)

Thus,

$${{\text{P}}}_{{\text{out}},{\text{shift}}}=\frac{1-{\text{F}}\left(.,{{\text{k}}}_{1}-\updelta \right)+{\text{F}}\left(.,{-{\text{k}}}_{1}-\updelta \right)}{1-{\text{F}}\left(.,{{\text{k}}}_{1}-\updelta \right)+{\text{F}}\left(.,{{\text{k}}}_{2}-\updelta \right)-{\text{F}}\left(.,{-{\text{k}}}_{2}-\updelta \right)+{\text{F}}\left(.,{-{\text{k}}}_{1}-\updelta \right)}$$

(17)

The average sample size (ASS) for the proposed control chart under the shifted process is defined by

$${\text{AS}}{S}_{1}=\frac{n}{1-{P}_{Res1}}$$

(18)
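The out-of-control measures can be evaluated in the same way as the in-control ones. The following is a minimal sketch with hypothetical function names; it assumes the standardized limits are $\pm k_1-\delta$ and $\pm k_2-\delta$, which is what standardizing the limits about the shifted mean $\mu_1=\mu_0+\delta\sigma_y$ gives, so the lower-tail signal probability is $F(., -k_1-\delta)$.

```python
import math


def norm_cdf(t):
    """Standard normal distribution function F(., t), computed via erfc."""
    return 0.5 * math.erfc(-t / math.sqrt(2.0))


def shifted_measures(k1, k2, n, delta):
    """Return (ARL1, ASS1) when the mean of Y shifts by delta standard units."""
    # Single-sample probability of a signal beyond either outer limit.
    p_out1 = (1.0 - norm_cdf(k1 - delta)) + norm_cdf(-k1 - delta)
    # Single-sample probability of landing in either resampling band.
    p_res1 = (norm_cdf(k1 - delta) - norm_cdf(k2 - delta)) + (
        norm_cdf(-k2 - delta) - norm_cdf(-k1 - delta)
    )
    p_out_shift = p_out1 / (1.0 - p_res1)
    return 1.0 / p_out_shift, n / (1.0 - p_res1)


# Illustrative coefficients only (not taken from the paper's tables).
for delta in (0.0, 0.5, 1.0, 2.0):
    arl1, ass1 = shifted_measures(k1=3.0, k2=3.0, n=5, delta=delta)
    print(delta, round(arl1, 2), ass1)
```

Setting delta = 0 recovers the in-control values, which gives a quick consistency check on any implementation of Eqs. (13) to (18).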

The following algorithm was applied to obtain the values of ASS and ARL.

Step-1: Fix the sample size $n$ and the target in-control ARL, say ${r}_{0}$.

Step-2: Find the values of ${{\text{k}}}_{1}$ and ${{\text{k}}}_{2}$ such that ${{\text{ARL}}}_{0}\ge {r}_{0}$ and ${\text{AS}}{S}_{0}$ is close to $n$.

Step-3: Determine ${{\text{ARL}}}_{1}$ and ${\text{AS}}{S}_{1}$ using ${{\text{k}}}_{1}$ and ${{\text{k}}}_{2}$ for various values of the shift.
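One possible way to carry out Step-2 is sketched below. It is an assumption about how such a search might be organized, not the authors' procedure: for each candidate k2 on a user-chosen grid, k1 is found by bisection so that ARL0 meets the target r0, and the resulting ASS0 is reported so that a pair with an acceptable average sample size can be selected. All function names and the grid are hypothetical.

```python
import math


def norm_cdf(t):
    """Standard normal distribution function, computed via erfc."""
    return 0.5 * math.erfc(-t / math.sqrt(2.0))


def p_res(k1, k2):
    """In-control probability of resampling."""
    return (norm_cdf(k1) - norm_cdf(k2)) + (norm_cdf(-k2) - norm_cdf(-k1))


def arl0(k1, k2):
    """In-control average run length under repetitive sampling."""
    p_out0 = 2.0 * norm_cdf(-k1)  # equals 1 - F(k1) + F(-k1) by symmetry
    return (1.0 - p_res(k1, k2)) / p_out0


def solve_k1(k2, r0, upper=10.0, iterations=200):
    """Smallest k1 >= k2 (to numerical precision) with ARL0 >= r0, by bisection."""
    lo, hi = k2, upper
    if arl0(lo, k2) >= r0:
        return lo
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if arl0(mid, k2) < r0:
            lo = mid
        else:
            hi = mid
    return hi


def design(n, r0, k2_grid):
    """List (k2, k1, ARL0, ASS0) for each candidate inner coefficient k2."""
    rows = []
    for k2 in k2_grid:
        k1 = solve_k1(k2, r0)
        rows.append((k2, k1, arl0(k1, k2), n / (1.0 - p_res(k1, k2))))
    return rows


for row in design(n=5, r0=370.0, k2_grid=[1.0, 1.5, 2.0, 2.5]):
    print(row)
```

The bisection relies on ARL0 being increasing in k1 for a fixed k2, which holds because the signal probability shrinks as the outer limits widen while the in-control acceptance probability is unchanged.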

3 Results and Discussion

The proposed control chart depends on two constants, ${{\text{k}}}_{1}$ and ${{\text{k}}}_{2}$, called the control charting constants. Their numerical values can be determined from Eq. (11) by fixing the parameters of the Pareto distribution and the in-control average run length ARL₀. Once the control charting constants are determined and the Pareto control chart is constructed, the performance of the chart can be investigated when a shift occurs in a parameter of the Pareto distribution. Let ${\upmu }_{1}={\upmu }_{0}+\updelta {\upsigma }_{{\text{y}}}$ be the shifted mean of the process when a shift of $\updelta$ standard units occurs in the process mean, according to the definition $Y= \overline{x }\pm ks$; k = 3 is used in this work. The statistical software R was used for all computations.

The performance of the Shewhart Pareto chart under the repetitive sampling scheme is evaluated in terms of ${{\text{ARL}}}_{1}$ by fixing ${{\text{ARL}}}_{0}$, the sample size and the parameters of the Pareto distribution. Some of the results are reported in Tables 1, 2, 3 and 4 for various choices. The values of ${{\text{ARL}}}_{1}$ and ${\text{AS}}{S}_{1}$ for any other choice of parameters, sample size or ${{\text{ARL}}}_{0}$ can be obtained easily using Eqs. (13) and (18).

Table 1 The ARLs for the proposed chart when a = 4.5, b = 2 and ARL₀ = 370

Table 2 The ARLs for the proposed chart when a = 5.0, b = 2 and ARL₀ = 370

Table 3 The ARLs for the proposed chart when a = 4.5, b = 3 and ARL₀ = 370

Table 4 The ARLs for the proposed chart when a = 5, b = 3 and ARL₀ = 370

Tables 1, 2, 3 and 4 reveal that

1. The control charting constants (k₁ and k₂) differ markedly from the standard value 3.
2. ${{\text{ARL}}}_{1}$ decreases as the shift size decreases, whereas ${\text{AS}}{S}_{1}$ increases slightly, for fixed values of a, b, n and ARL₀.
3. ${{\text{ARL}}}_{1}$ decreases as the sample size increases for fixed values of a, b and ARL₀.
4. ${{\text{ARL}}}_{1}$ decreases as the location parameter increases with the other factors held constant. This is expected, as our focus is on detecting changes in the location parameter of the Pareto distribution while the shape parameter is assumed to be controlled.
5. There is no significant effect on ${{\text{ARL}}}_{1}$ when the shape parameter increases for fixed values of n, b and ARL₀.
6. There is no significant effect on the performance of the proposed chart in terms of ${{\text{ARL}}}_{1}$ when both parameters change simultaneously for fixed values of n and ARL₀.

Thus, the proposed Pareto control chart can effectively detect changes in the location parameter, and its efficiency increases with a moderate sample size. Note that it is desirable for the fixed sample size to be close to ${\text{AS}}{S}_{1}$. The tables show that when the sample size is small, the values of ${\text{AS}}{S}_{1}$ are slightly away from the fixed sample size when $\updelta$ is greater than 0.60. It is worth noting that when sampling is repeated, a sample of the same fixed size is taken from the production process.

4 Performance Comparison with Existing Chart

In this section, the proposed control chart is compared with the existing Shewhart-type Pareto control chart proposed by Guo and Wang [5], which corresponds to $k_{1} = k_{2} = L$. The control chart with the smaller out-of-control ARL is said to be more efficient, so the comparison is based on ARL values. For a fair comparison, both charts are set to ARL₀ = 370 when the process is in control; the out-of-control ARLs of the Guo and Wang [5] chart and the proposed Pareto control chart for different mean shifts, with n = 5, a = 5 and b = 2, are presented in Table 5.

Table 5 The ARLs for the proposed chart and the existing Shewhart-type Pareto control chart when n = 5, a = 5, b = 2 and ARL₀ = 370

According to these results, the proposed control chart gives smaller ARL values for the different shifts ($\delta$); hence, it detects shifts more quickly than the existing Shewhart-type Pareto control chart [5]. For example, when ARL₀ = 370, n = 5, a = 5, b = 2 and $\delta$ = 0.7, Table 5 shows ARL₁ = 109.87 for the proposed control chart, whereas ARL₁ = 114.74 for the Guo and Wang [5] Pareto control chart. The ARL curves of the proposed Pareto control chart using repetitive sampling and of the Guo and Wang [5] control chart for the specified mean shifts are displayed in Fig. 1. The ARL curves make it clear that the proposed control chart using repetitive sampling performs better than the Shewhart-type control chart [5]. To reinforce this finding, a simulation study is carried out in Section 6. Similar results have been observed for other choices of parameters and sample sizes.

5 A Real Example Based on a Tax Revenue Data

The developed control chart is demonstrated in this section using a real data set taken from Klakattawi [6]. The data are monthly records of the tax revenue of Egypt between January 2006 and November 2010, in 1000 million Egyptian pounds. The data, along with summary statistics, are reported in Table 6. The tax revenue data are well described by a Pareto distribution with parameters $\hat{a}$ = 1.707 and $\hat{b}$ = 6.641: the maximum distance between the empirical distribution of the data and the fitted Pareto distribution, obtained from the Kolmogorov–Smirnov test, is 0.094, with a p-value of 0.6978. The goodness of fit is illustrated in Fig. 2, which shows the histogram with the theoretical density and the P–P plot of the Pareto distribution for the tax revenue data. The chart coefficients for the estimated parameters a and b with n = 5 and ARL₀ = 370 are k₁ = 9.8274 and k₂ = 0.8366. Using these chart coefficients, the inner control limits are ${{\text{LCL}}}_{2}=-3.9007$ and ${{\text{UCL}}}_{2}=28.6571$, and the outer control limits are ${{\text{LCL}}}_{1}=-178.847$ and ${{\text{UCL}}}_{1}=203.6039$. Figure 3 presents the proposed Pareto control chart using repetitive sampling for the tax revenue data. The chart shows that some points lie between the inner and outer control limits; hence, sampling should be repeated with the same sample size.
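As a quick consistency check on the reported limits, Eq. (4) implies that both pairs of limits share the same center ${\upmu }_{y}$ and that each half-width divided by its coefficient equals ${\upsigma }_{{\text{Y}}}$. The short sketch below recovers these two quantities from the published numbers only; the variable names are hypothetical and nothing beyond the values quoted in the text is used.

```python
# Chart coefficients and limits as reported in the text for the tax revenue data.
k1, k2 = 9.8274, 0.8366
lcl1, ucl1 = -178.847, 203.6039
lcl2, ucl2 = -3.9007, 28.6571

# Center line implied by each pair of limits: (UCL + LCL) / 2.
center_outer = (ucl1 + lcl1) / 2.0
center_inner = (ucl2 + lcl2) / 2.0

# Standard deviation of Y implied by each pair: (UCL - LCL) / (2 k).
sigma_outer = (ucl1 - lcl1) / (2.0 * k1)
sigma_inner = (ucl2 - lcl2) / (2.0 * k2)

print("center:", round(center_outer, 4), round(center_inner, 4))
print("sigma_Y:", round(sigma_outer, 4), round(sigma_inner, 4))
```

Agreement between the two implied centers and between the two implied standard deviations indicates that the inner and outer limits were built from the same ${\upmu }_{y}$ and ${\upsigma }_{{\text{Y}}}$.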

Table 6 The monthly records of tax revenue of Egypt along with statistics


6 Application Using Simulated Tax Revenue Data

In this section, we study the performance of the proposed control chart using simulated data. To compare the proposed control chart with the existing control chart, 40 subgroups are generated. The first 20 subgroups, each of size 5, are generated from the Pareto distribution with in-control parameters a = 5 and b = 2, and the last 20 subgroups, each of size 5, are generated from the Pareto distribution with a = 5 and out-of-control threshold b₁ = $\delta b = 0.8 \times 2 = 1.6$. That is, the process mean is shifted after 20 subgroups, with a shift of $\delta$ = 0.8.
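A data set of this form can be generated as in the sketch below. It is illustrative only and will not reproduce the authors' simulated values: the seed, the function names and the choice of the upper branch $Y=\overline{x}+ks$ with k = 3 as the plotted statistic are assumptions. Python's `random.paretovariate(a)` draws from a Pareto distribution with threshold 1, so multiplying by b gives threshold b.

```python
import random
import statistics


def simulate_subgroups(a, b, m, n, rng):
    """Generate m subgroups of size n from a Pareto(a, b) distribution."""
    return [[b * rng.paretovariate(a) for _ in range(n)] for _ in range(m)]


def plotting_statistic(sample, k=3.0):
    """Assumed plotted statistic: sample mean plus k sample standard deviations."""
    return statistics.mean(sample) + k * statistics.stdev(sample)


rng = random.Random(2024)  # arbitrary seed, for repeatability only

# First 20 subgroups in control (a = 5, b = 2), last 20 with b1 = 0.8 * 2 = 1.6.
in_control = simulate_subgroups(a=5.0, b=2.0, m=20, n=5, rng=rng)
shifted = simulate_subgroups(a=5.0, b=0.8 * 2.0, m=20, n=5, rng=rng)
subgroups = in_control + shifted

y_values = [plotting_statistic(s) for s in subgroups]
for i, y in enumerate(y_values, start=1):
    print(i, round(y, 4))
```

Each value in `y_values` would then be compared with the inner and outer limits of Eq. (4) to decide between in-control, out-of-control and resampling.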

The chart coefficients for n = 5 and ARL₀ = 370 are available in Table 5. The proposed control chart using repetitive sampling, with $k_{1}$ = 3.3974 and $k_{2}$ = 1.7061, and the Shewhart-type control chart, with L = 4.4198, are displayed in Figs. 4 and 5. The Shewhart-type control chart fails to detect the shift, whereas the proposed control chart under the repetitive sampling scheme gives out-of-control signals at subgroup numbers 28, 30, 36 and 39. This example shows that the proposed control chart under a repetitive sampling scheme is more effective in discovering a process shift than the Shewhart-type control chart.

7 Conclusions and Recommendations

In conclusion, this paper has introduced a novel Shewhart Pareto control chart designed for monitoring shifts in the Pareto distribution through a repetitive sampling approach. The chart, utilizing a modified statistic that combines shape and threshold parameters, has been evaluated for its performance in terms of run length characteristics, assuming a shift in the process mean. Through a comprehensive efficiency comparison with existing control charts, our findings indicate that the proposed Pareto chart demonstrates superior efficiency in promptly detecting changes, on average, compared to alternative methods. The practical application of our approach has been exemplified through an illustrative example employing revenue data.

We have highlighted the significance of the Pareto distribution in statistical process monitoring, particularly emphasizing the importance of considering heavy-tailed distributions in control chart design. The extension of the work by incorporating repetitive sampling adds depth to the methodology. The absence of prior research on control charts for the Pareto distribution employing repetitive sampling was addressed in this paper, contributing to the existing body of knowledge in statistical process control. By building upon the groundwork of Guo and Wang [5], we have presented a comprehensive design for a Shewhart Pareto control chart under a repetitive sampling scheme.

As we move forward, this work opens avenues for further exploration and refinement of control chart methodologies under non-normal distributions, ensuring robust and accurate statistical process control in diverse industrial settings. The proposed control chart faces a constraint as it is not suitable for data conforming to a normal distribution. Subsequent research endeavors could focus on developing and exploring an exponentially weighted moving average (EWMA) scheme tailored for the Pareto distribution through repetitive sampling. Additionally, investigating the efficacy of the proposed Pareto chart by employing a cost model represents a promising avenue for future research.

Availability of Data and Materials

The data is given in the paper.

References

1. Al-Oraini, H.A., Rahim, M.: Economic statistical design of X control charts for systems with Gamma (λ, 2) in-control times. Comput. Ind. Eng. 43(3), 645–654 (2002)
2. Aslam, M., Khan, N., Jun, C.-H.: A new S² control chart using repetitive sampling. J. Appl. Stat. 1–12 (2015). (ahead-of-print)
3. Bai, D., Choi, I.: X and R control charts for skewed populations. J. Qual. Technol. 27(2), 120–131 (1995)
4. Balamurali, S., Jeyadurga, P.: An attribute np control chart for monitoring mean life using multiple deferred state sampling based on truncated life tests. Int. J. Reliab. Qual. Saf. Eng. 26(01), 1950004 (2019)
5. Guo, B.-C., Wang, B.-X.: Control charts for the Pareto distribution. Appl. Math. J. Chin. Univ. 30(4), 379–396 (2015)
6. Klakattawi, H.S.: The Weibull-gamma distribution: properties and applications. Entropy 21(5), 438 (2019)
7. Lin, Y.-C., Chou, C.-Y.: Non-normality and the variable parameters X control charts. Eur. J. Oper. Res. 176(1), 361–373 (2007)
8. McCracken, A., Chakraborti, S.: Control charts for joint monitoring of mean and variance: an overview. Quality Technol. Quant. Manag. 10(1), 17–36 (2013)
9. Montgomery, D.C.: Introduction to Statistical Quality Control. Wiley, New York (2020)
10. Nasiru, S.: One-sided cumulative sum control chart for monitoring shifts in the shape parameter of Pareto distribution. Int. J. Product. Quality Manag. 19(2), 160–167 (2016)
11. Saghir, A., Akber Abbasi, S., Faraz, A.: The exact method for designing the Maxwell chart with estimated parameter. Commun. Stat.-Simul. Comput. 50(1), 270–281 (2021)
12. Saghir, A., Lin, Z.: Designing of Gini-chart for exponential, t, logistic and Laplace distributions. Commun. Stat.-Simul. Comput. 44(9), 2387–2409 (2015)
13. Saleh, N.A., Mahmoud, M.A., Woodall, W.H.: A re-evaluation of repetitive sampling techniques in statistical process monitoring. Quality Technol. Quant. Manag. 1–19 (2023). https://doi.org/10.1080/16843703.2023.2246770


Acknowledgements

The authors are deeply thankful to the reviewers and the editor for their valuable suggestions to improve the quality and presentation of the paper.

Funding

Not applicable.

Author information

Authors and Affiliations

Department of Statistics, Mirpur University of Science and Technology (MUST), Mirpur, 10250, AJK, Pakistan
Aamir Saghir
Department of Mathematics and Statistics, the University of Dodoma, PO Box: 338, Dodoma, Tanzania, The University of Dodoma, P.O. Box: 259, Dodoma, Tanzania
Gadde Srinivasa Rao
Department of Statistics, Faculty of Science, King Abdulaziz University, 21551, Jeddah, Saudi Arabia
Muhammad Aslam
Deputy Director Colleges Hafizabad Higher Education Department, Punjab, Pakistan, Higher Education Department, Hafizabad, Pakistan
Azhar Ali Janjua


Contributions

AS, GSR, MA and AAJ wrote the paper.

Corresponding author

Correspondence to Muhammad Aslam.

Ethics declarations

Conflict of interest

No conflict of interest regarding the paper.

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


About this article

Cite this article

Saghir, A., Rao, G.S., Aslam, M. et al. Pareto Distribution-Based Shewhart Control Chart for Early Detection of Process Mean Shifts. J Stat Theory Appl 23, 26–43 (2024). https://doi.org/10.1007/s44199-024-00071-1


Received: 06 September 2023
Accepted: 16 January 2024
Published: 19 February 2024
Issue Date: March 2024
DOI: https://doi.org/10.1007/s44199-024-00071-1
