1 Introduction

Control charts are widely used for monitoring in statistical process control (SPC). Typically, the quality characteristic of interest is assumed to follow a normal distribution. In practice, however, this assumption is often violated, as discussed by various authors, including Saghir and Lin [12]. In such cases, relying on existing control charts may mislead industrial engineers and increase the number of non-conforming items; see Montgomery [9]. Many authors have developed variable control charts based on non-normal distributions, for example Bai and Choi [3], Al-Oraini and Rahim [1], Lin and Chou [7], McCracken and Chakraborti [8], Saghir and Lin [12] and Saghir et al. [11].

Control charts that assume the quality characteristic follows a Pareto distribution, which is characterized by exceptionally heavy tails, have been developed by Guo and Wang [5], Nasiru [10], and Balamurali and Jeyadurga [4]. These studies collectively underscore the importance of the Pareto distribution in statistical process monitoring. Aslam et al. [2] introduced a variance control chart using repetitive sampling for the case where the average sample size equals a fixed sample size. Saleh et al. [13] similarly advocated repetitive sampling under the same condition.

The primary drawback of the Shewhart control chart is its limited ability to detect small shifts in the process mean. Additional interpretation rules have been proposed to improve its sensitivity to such shifts, and repetitive sampling offers another way to overcome the problem. To the best of our knowledge, no control chart for the Pareto distribution based on repetitive sampling has been reported in the literature. This paper addresses that gap by developing a Shewhart Pareto control chart within a repetitive sampling framework. Building upon the groundwork laid by Guo and Wang [5], we extend their chart by incorporating repetitive sampling. The goal of this study is to design a control chart for the Pareto distribution using repetitive sampling, particularly when the average sample size is close to a fixed sample size.

In sampling plans, subgroups are drawn to decide whether to accept or reject a lot, whereas in control charts subgroups are used to monitor process variation while the process is running. Based on the outcome of the selected subgroup and the behavior of earlier subgroups, the quality supervisor either leaves the process as it is or defers the decision on the subgroup and makes process adjustments. When many subgroups are deferred, and therefore not plotted, a change in the process configuration should be considered; the suggested approach is to cap the number of deferred subgroups in repetitive sampling. Because repetitive sampling-based control charts have two additional limits (the repetitive limits) inside the usual control limits, the limits reported in the literature fall between the 2-sigma and 3-sigma limits or, in many cases, inside the 1-sigma limits.

The efficacy of the proposed control chart is demonstrated by comparison with its existing counterpart, and its practical application is illustrated.

2 The Proposed Pareto Control Chart Using Repetitive Sampling

In this section, we introduce a control chart that employs repetitive sampling under the assumption of a Pareto-distributed quality characteristic. Because only one Shewhart-type Pareto chart is available in the literature (the Guo and Wang [5] control chart), we compare the proposed chart with it and discuss the results with reference to it. We also derive the performance measures for the proposed control chart.

2.1 Design of the Proposed Control Chart

Assume that X is a random variable following the Pareto distribution with parameters a and b. Then, its probability density function and the cumulative distribution function are, respectively, given by

$$f\left(x\right)= \frac{a{b}^{a}}{{x}^{a+1}}, \quad x>b,\; a> 0,$$
(1)

and

$$F\left(x\right)= 1-{\left(\frac{{\text{b}}}{{\text{x}}}\right)}^{{\text{a}}}.$$
(2)

The mean and the variance of the underlying distribution are given by \(\mu = ab/(a - 1)\) for \(a > 1\) and \({\sigma }^{2} = a{b}^{2}/[{(a - 1)}^{2}(a - 2)]\) for \(a > 2\), respectively.
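As a quick numerical check of Eqs. (1) and (2) and the two moments above, the following minimal Python sketch evaluates them directly. The function names are illustrative and are not taken from the original study, which reports using R.

```python
def pareto_pdf(x, a, b):
    """Density a * b^a / x^(a+1) for x > b, zero otherwise (Eq. 1)."""
    return a * b**a / x**(a + 1) if x > b else 0.0

def pareto_cdf(x, a, b):
    """Distribution function 1 - (b/x)^a for x > b, zero otherwise (Eq. 2)."""
    return 1.0 - (b / x)**a if x > b else 0.0

def pareto_mean(a, b):
    """Mean a*b/(a-1); only defined for a > 1."""
    if a <= 1:
        raise ValueError("the mean requires a > 1")
    return a * b / (a - 1)

def pareto_variance(a, b):
    """Variance a*b^2 / ((a-1)^2 (a-2)); only defined for a > 2."""
    if a <= 2:
        raise ValueError("the variance requires a > 2")
    return a * b**2 / ((a - 1)**2 * (a - 2))

# Illustrative parameter choice a = 5, b = 2
print(pareto_mean(5, 2), pareto_variance(5, 2), pareto_cdf(4, 5, 2))
```

The explicit checks on a make the existence conditions of the moments visible: for shape values at or below 2 the variance, and hence any chart built on it, is undefined.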

Further, consider the statistic \(Y=\overline{x }\pm ks\), where \(\overline{x }\) and \(s\) are the sample mean and sample standard deviation, respectively, and k is a constant chosen for a given type-I error. It is well known in the literature that when the underlying distribution is normal, \(\overline{x }\) and \(s\) are independent, with \(\overline{x }\) normally distributed and \((n-1){s}^{2}/{\sigma }^{2}\) chi-square distributed, and \(Y\) is approximately normally distributed. When the underlying distribution is non-normal, the asymptotic distribution of \(Y\) is normal with mean \({\upmu }_{y}= \upmu \pm k\sigma\) and variance

$${\sigma }_{y}^{2} = \frac{{\sigma }^{2}}{n} \left[1 + \frac{{k}^{2}}{4}\left({\alpha }_{4}- 1\right)\pm k{\alpha }_{3}\right].$$

where \({\alpha }_{3}\) and \({\alpha }_{4}\) represent the skewness and kurtosis of the underlying distribution and k = 3 (as the asymptotic distribution of Y is normal). For the Pareto distribution, the skewness and kurtosis, which measure the asymmetry and tail heaviness of the distribution, are

$$\alpha_{3} = \left( \frac{2\left( 1 + a \right)}{a - 3} \right)\sqrt{\frac{a - 2}{a}} \quad \text{for}\; a > 3$$

and

$$\alpha_{4} = \frac{3\left( a - 2 \right)\left( 3a^{2} + a + 2 \right)}{a\left( a - 3 \right)\left( a - 4 \right)} \quad \text{for}\; a > 4.$$
(3)
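The asymptotic variance of Y can be evaluated numerically once the skewness and kurtosis are available. The sketch below is one way to do so, under two stated assumptions: the kurtosis uses the standard (non-excess) Pareto expression with the quadratic term \(3a^{2} + a + 2\), and the `sign` argument (a name introduced here for illustration) selects the plus or minus version of \(Y=\overline{x }\pm ks\).

```python
import math

def pareto_skewness(a):
    """Skewness of the Pareto distribution; requires a > 3."""
    if a <= 3:
        raise ValueError("skewness requires a > 3")
    return 2 * (1 + a) / (a - 3) * math.sqrt((a - 2) / a)

def pareto_kurtosis(a):
    """Standard (non-excess) kurtosis of the Pareto distribution; requires a > 4."""
    if a <= 4:
        raise ValueError("kurtosis requires a > 4")
    return 3 * (a - 2) * (3 * a**2 + a + 2) / (a * (a - 3) * (a - 4))

def var_y(a, b, n, k=3.0, sign=+1):
    """Asymptotic variance of Y = xbar + sign*k*s for Pareto(a, b) data."""
    sigma2 = a * b**2 / ((a - 1)**2 * (a - 2))   # process variance
    a3, a4 = pareto_skewness(a), pareto_kurtosis(a)
    return sigma2 / n * (1 + k**2 / 4 * (a4 - 1) + sign * k * a3)

# Illustrative values: a = 5, b = 2, n = 5
print(pareto_skewness(5), pareto_kurtosis(5), var_y(5, 2, 5))
```

Because the kurtosis only exists for a > 4, the variance formula is usable only for shape values above 4; the function raises an error otherwise rather than returning a meaningless number.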

This work develops a Pareto control chart for the random variable Y using the repetitive sampling scheme for decision making. The proposed control chart therefore has two pairs of control limits, the outer and the inner control limits. The process is declared:

  (i) out-of-control if the plotted statistic \(Y\) falls outside the outer control limits;

  (ii) in-control if the plotted statistic \(Y\) falls within the inner control limits;

  (iii) otherwise, resampling should be made with the same sample size,

where the outer bound is

$$\begin{aligned} {\text{UCL}}_{1} & = \upmu_{y} + {\text{k}}_{1} {\upsigma }_{{\text{Y}}} \\ {\text{LCL}}_{1} & = \upmu_{y} - {\text{k}}_{1} {\upsigma }_{{\text{Y}}} \\ \end{aligned}$$

and the inner bound is

$$\begin{aligned} {\text{UCL}}_{2} & = \upmu_{y} + {\text{k}}_{2} {\upsigma }_{{\text{Y}}} \\ {\text{LCL}}_{2} & = \upmu_{y} - {\text{k}}_{2} {\upsigma }_{{\text{Y}}} \\ \end{aligned}$$
(4)

where \({{\text{k}}}_{1}\) and \({{\text{k}}}_{2}\) are the control chart coefficients to be determined for a fixed \({{\text{ARL}}}_{0}\), sample size (n) and \(\lambda\), and \({\upmu }_{y}\) and \({\upsigma }_{{\text{Y}}}\) are the mean and standard deviation of the random variable Y. When the parameters are unknown, the sample average \(\overline{Y }\) and the sample standard deviation \({S}_{y}\) of the statistic \({\text{Y}}\), calculated from initial samples of the in-control process, are used in Eq. (4).
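The decision rule and the limits in Eq. (4) can be sketched in a few lines of Python. The function names and the numerical inputs are hypothetical, and treating a point exactly on an inner limit as in-control is an assumption made here, since the text does not specify boundary handling.

```python
def control_limits(mu_y, sigma_y, k1, k2):
    """Outer (k1) and inner (k2) limits of Eq. (4); requires 0 <= k2 <= k1."""
    if not 0 <= k2 <= k1:
        raise ValueError("need 0 <= k2 <= k1")
    return {
        "LCL1": mu_y - k1 * sigma_y,
        "LCL2": mu_y - k2 * sigma_y,
        "UCL2": mu_y + k2 * sigma_y,
        "UCL1": mu_y + k1 * sigma_y,
    }

def decide(y, limits):
    """Classify one plotted statistic under the repetitive sampling rule."""
    if y > limits["UCL1"] or y < limits["LCL1"]:
        return "out-of-control"
    if limits["LCL2"] <= y <= limits["UCL2"]:
        return "in-control"
    return "resample"   # between inner and outer limits: draw a new sample of size n

# Hypothetical values, for illustration only
limits = control_limits(mu_y=10.0, sigma_y=2.0, k1=3.0, k2=1.0)
for y in (10.0, 14.0, 17.0):
    print(y, decide(y, limits))
```

With these illustrative inputs the three test points fall in the in-control, resampling, and out-of-control regions, respectively.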

2.2 The Performance Measure

The average run length (ARL), defined as the average number of samples until the chart gives an out-of-control signal, is the most commonly used measure for evaluating the performance of control charts. When the process is in control, the ARL should be sufficiently large to avoid frequent false alarms; it is denoted by \({{\text{ARL}}}_{0}\). When the process is out of control, the ARL should be sufficiently small; it is generally denoted by \({{\text{ARL}}}_{1}\).

2.2.1 In-Control ARL

The in-control ARL, denoted by \({{\text{ARL}}}_{0}\), is defined as:

$${{\text{ARL}}}_{0}=\frac{1}{{{\text{P}}}_{{\text{out}}}},$$
(5)

where \({{\text{P}}}_{{\text{out}}}\) denotes the probability that the process is declared out-of-control when it is actually in control. In a repetitive sampling scheme it is defined as:

$${{\text{P}}}_{{\text{out}}}=\frac{{{\text{P}}}_{{\text{out}}}^{0}}{1-{{\text{P}}}_{{\text{Res}}}}$$

where \({{\text{P}}}_{{\text{out}}}^{0}\), the probability that the process is declared to be out of control based on a single sample, is:

$${{\text{P}}}_{{\text{out}}}^{0}={\text{P}}\left({\text{Y}}\ge {{\text{UCL}}}_{1}|{\upmu }_{{\text{Y}}}={\upmu }_{0}\right)+\mathrm{ P}({\text{Y}}\le {{\text{LCL}}}_{1}|{\upmu }_{{\text{Y}}}={\upmu }_{0})$$
(6)

where \({\upmu }_{0}\) is the in-control mean of the Pareto process. As mentioned earlier, the asymptotic distribution of \({\text{Y}}\) is normal with mean \({\upmu }_{y}= \upmu \pm k\sigma\) and variance \({\sigma }_{y}^{2} = \frac{{\sigma }^{2}}{n} [1 + \frac{{k}^{2}}{4}({\alpha }_{4}- 1) \pm k{\alpha }_{3}]\). Thus, when the process is in-control,

$$\begin{aligned} {\text{P}}_{{\text{out}}}^{0} & = {\text{P}}\left( {\text{Y}} \ge {\text{UCL}}_{1} |\upmu_{{\text{Y}}} = \upmu_{0} \right) + {\text{P}}\left( {\text{Y}} \le {\text{LCL}}_{1} |\upmu_{{\text{Y}}} = \upmu_{0} \right) \\ & = {\text{P}}\left( \frac{{\text{Y}} - \upmu_{y}}{\sigma_{y}} \ge \frac{{\text{UCL}}_{1} - \upmu_{y}}{\sigma_{y}} |\upmu_{{\text{Y}}} = \upmu_{0} \right) + {\text{P}}\left( \frac{{\text{Y}} - \upmu_{y}}{\sigma_{y}} \le \frac{{\text{LCL}}_{1} - \upmu_{y}}{\sigma_{y}} |\upmu_{{\text{Y}}} = \upmu_{0} \right) \\ & = {\text{P}}\left( {\text{Z}} \ge {\text{k}}_{1} \right) + {\text{P}}\left( {\text{Z}} \le - {\text{k}}_{1} \right) \\ & = 1 - {\text{F}}\left( ., {\text{k}}_{1} \right) + {\text{F}}\left( ., - {\text{k}}_{1} \right) \\ \end{aligned}$$
(7)

where \({\text{Z}}\) is a standardized normal variable and \({\text{F}}(.,\mathrm{ t})\) is the cumulative probability of the standard normal random variable \({\text{Z}}\) up to point \(t\).

The probability that the process is declared to be in control based on a single sample is:

$${{\text{P}}}_{{\text{in}}}^{0}={\text{P}}\left({\text{Y}}\le {{\text{UCL}}}_{2}|{\upmu }_{{\text{Y}}}={\upmu }_{0}\right)-\mathrm{ P}({\text{Y}}\le {{\text{LCL}}}_{2}|{\upmu }_{{\text{Y}}}={\upmu }_{0})$$
(8)

and when the process is in-control

$${{\text{P}}}_{{\text{in}}}^{0}={\text{P}}\left({\text{Z}}\le {{\text{k}}}_{2}\right)-\mathrm{ P}\left({\text{Z}}\le {-{\text{k}}}_{2}\right)={\text{F}}\left(.,{{\text{k}}}_{2}\right)-{\text{F}}\left(.,{-{\text{k}}}_{2}\right)$$
(9)

The probability of resampling, when the process is actually in control, is defined as:

$$\begin{aligned} {\text{P}}_{{\text{Res}}} & = {\text{P}}\left( {\text{UCL}}_{2} < Y < {\text{UCL}}_{1} |\upmu_{{\text{Y}}} = \upmu_{0} \right) + {\text{P}}\left( {\text{LCL}}_{1} < Y < {\text{LCL}}_{2} |\upmu_{{\text{Y}}} = \upmu_{0} \right) \\ & = {\text{P}}\left( {\text{k}}_{2} < Z < {\text{k}}_{1} \right) + {\text{P}}\left( - {\text{k}}_{1} < Z < - {\text{k}}_{2} \right) \\ & = {\text{F}}\left( ., {\text{k}}_{1} \right) - {\text{F}}\left( ., {\text{k}}_{2} \right) + {\text{F}}\left( ., - {\text{k}}_{2} \right) - {\text{F}}\left( ., - {\text{k}}_{1} \right) \\ \end{aligned}$$
(10)

Thus,

$${{\text{P}}}_{{\text{out}}}=\frac{{{\text{P}}}_{{\text{out}}}^{0}}{1-{{\text{P}}}_{{\text{Res}}}}=\frac{1-{\text{F}}\left(.,{{\text{k}}}_{1}\right)+{\text{F}}\left(.,{-{\text{k}}}_{1}\right)}{1-{\text{F}}\left(.,{{\text{k}}}_{1}\right)+{\text{F}}\left(.,{{\text{k}}}_{2}\right)-{\text{F}}\left(.,{-{\text{k}}}_{2}\right)+{\text{F}}\left(.,{-{\text{k}}}_{1}\right)}$$
(11)

The average sample size (ASS) for the proposed control chart is defined by

$${\text{AS}}{S}_{0}=\frac{n}{1-{{\text{P}}}_{{\text{Res}}}}$$
(12)
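The in-control quantities can be computed directly from the standard normal distribution. The sketch below is a minimal illustration, not the authors' R code: it takes the resampling probability as the probability mass between the inner and outer limits, \({\text{P}}(k_{2} < Z < k_{1}) + {\text{P}}(-k_{1} < Z < -k_{2})\), and the helper name `upper_tail` is introduced here for convenience.

```python
import math

def upper_tail(t):
    """P(Z >= t) for a standard normal Z, computed via erfc for tail accuracy."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))

def in_control_measures(k1, k2, n):
    """Single-sample probabilities, ARL0 and ASS0 for coefficients k1 >= k2 >= 0."""
    p_out0 = 2.0 * upper_tail(k1)                     # beyond the outer limits
    p_res = 2.0 * (upper_tail(k2) - upper_tail(k1))   # between inner and outer limits
    p_out = p_out0 / (1.0 - p_res)
    return {"P_out0": p_out0, "P_res": p_res,
            "ARL0": 1.0 / p_out, "ASS0": n / (1.0 - p_res)}

# With k1 = k2 there is no resampling region and the chart reduces to a
# Shewhart-type chart; k1 = k2 = 3 recovers the familiar ARL0 of about 370.
print(in_control_measures(3.0, 3.0, n=5))
print(in_control_measures(3.2, 1.0, n=5))   # illustrative coefficients only
```

Setting the two coefficients equal is a convenient sanity check, because the resampling probability vanishes and the average sample size equals the fixed sample size n.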

2.2.2 Out-of-Control ARL

Suppose that the process mean shifts away from its in-control level \({\upmu }_{0}\), and let \({\upmu }_{1}={\upmu }_{0}+\updelta {\upsigma }_{{\text{y}}}\) be the shifted mean, where \(\updelta\) is the size of the shift in standard units, depending upon the definition of Y. Then, the out-of-control \({{\text{ARL}}}_{1}\) is:

$${{\text{ARL}}}_{1}=\frac{1}{{{\text{P}}}_{{\text{out}},{\text{shift}}}}$$
(13)

where

$${{\text{P}}}_{{\text{out}},{\text{shift}}}=\frac{{{\text{P}}}_{{\text{out}},{\text{shift}}}^{1}}{1-{P}_{Res1}}$$
(14)

The probability that the process is declared out of control based on a single sample when a shift occurs is

$$\begin{aligned} {\text{P}}_{{\text{out}},{\text{shift}}}^{1} & = {\text{P}}\left( {\text{Y}} > {\text{UCL}}_{1} |{\upmu }_{{\text{Y}}} = {\upmu }_{1} \right) + {\text{P}}\left( {\text{Y}} < {\text{LCL}}_{1} |{\upmu }_{{\text{Y}}} = {\upmu }_{1} \right) \\ & = {\text{P}}\left( \frac{{\text{Y}} - \upmu_{1}}{\sigma_{y}} \ge \frac{{\text{UCL}}_{1} - \upmu_{1}}{\sigma_{y}} \right) + {\text{P}}\left( \frac{{\text{Y}} - \upmu_{1}}{\sigma_{y}} \le \frac{{\text{LCL}}_{1} - \upmu_{1}}{\sigma_{y}} \right) \\ & = {\text{P}}\left( {\text{Z}} \ge {\text{k}}_{1} - {\updelta } \right) + {\text{P}}\left( {\text{Z}} \le - {\text{k}}_{1} - {\updelta } \right) \\ & = 1 - {\text{F}}\left( ., {\text{k}}_{1} - {\updelta } \right) + {\text{F}}\left( ., - {\text{k}}_{1} - {\updelta } \right) \\ \end{aligned}$$
(15)

and the probability of resampling for the shifted mean is:

$$\begin{aligned} P_{Res1} & = {\text{P}}\left( {\text{k}}_{2} - {\updelta } < Z < {\text{k}}_{1} - {\updelta } \right) + {\text{P}}\left( - {\text{k}}_{1} - {\updelta } < Z < - {\text{k}}_{2} - {\updelta } \right) \\ & = {\text{F}}\left( ., {\text{k}}_{1} - {\updelta } \right) - {\text{F}}\left( ., {\text{k}}_{2} - {\updelta } \right) + {\text{F}}\left( ., - {\text{k}}_{2} - {\updelta } \right) - {\text{F}}\left( ., - {\text{k}}_{1} - {\updelta } \right) \\ \end{aligned}$$
(16)

Thus,

$${{\text{P}}}_{{\text{out}},{\text{shift}}}=\frac{1-{\text{F}}\left(.,{{\text{k}}}_{1}-\updelta \right)+{\text{F}}\left(.,{-{\text{k}}}_{1}-\updelta \right)}{1-{\text{F}}\left(.,{{\text{k}}}_{1}-\updelta \right)+{\text{F}}\left(.,{{\text{k}}}_{2}-\updelta \right)-{\text{F}}\left(.,{-{\text{k}}}_{2}-\updelta \right)+{\text{F}}\left(.,{-{\text{k}}}_{1}-\updelta \right)}$$
(17)

The average sample size (ASS) for the proposed control chart under the shifted process is defined by

$${\text{AS}}{S}_{1}=\frac{n}{1-{P}_{Res1}}$$
(18)
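The out-of-control measures follow the same pattern as the in-control ones, with every standardized limit moved by \(\updelta\). The sketch below assumes the additive shift \({\upmu }_{1}={\upmu }_{0}+\updelta {\upsigma }_{{\text{y}}}\) stated above, so the standardized outer limits become \(k_{1}-\updelta\) and \(-k_{1}-\updelta\); the function names are illustrative.

```python
import math

def phi(t):
    """Standard normal cumulative probability F(., t)."""
    return 0.5 * math.erfc(-t / math.sqrt(2.0))

def shifted_measures(k1, k2, n, delta):
    """ARL1 and ASS1 for a mean shift of delta standard units (k1 >= k2 >= 0)."""
    # Beyond the outer limits: P(Z >= k1 - delta) + P(Z <= -k1 - delta)
    p_out1 = phi(delta - k1) + phi(-k1 - delta)
    # Between inner and outer limits on either side
    p_res1 = (phi(k1 - delta) - phi(k2 - delta)) + (phi(-k2 - delta) - phi(-k1 - delta))
    p_out_shift = p_out1 / (1.0 - p_res1)
    return {"ARL1": 1.0 / p_out_shift, "ASS1": n / (1.0 - p_res1)}

# delta = 0 reproduces the in-control values; larger shifts shorten the run length.
for delta in (0.0, 0.5, 1.0):
    print(delta, shifted_measures(3.0, 3.0, n=5, delta=delta))
```

The first term of `p_out1` uses the identity \({\text{P}}(Z \ge t) = {\text{F}}(., -t)\), which avoids subtracting two nearly equal numbers in the far tail.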

The following algorithm was applied to obtain the values of ASS and ARL.

Step-1: Fix the sample size \(n\) and the target in-control ARL, say \({r}_{0}\).

Step-2: Find the values of \({{\text{k}}}_{1}\) and \({{\text{k}}}_{2}\) such that \({{\text{ARL}}}_{0}\ge {r}_{0}\) and \({\text{AS}}{S}_{0}\) is close to \(n\).

Step-3: Determine \({{\text{ARL}}}_{1}\) and \({\text{AS}}{S}_{1}\) using \({{\text{k}}}_{1}\) and \({{\text{k}}}_{2}\) for various values of the shift.
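One possible reading of the three steps is sketched below in Python. It is an assumption-laden illustration rather than the procedure used to produce the tables: for each candidate \(k_{2}\), the coefficient \(k_{1}\) is obtained in closed form from the normal-approximation expression for \({{\text{ARL}}}_{0}\), and the final choice among candidate pairs is left to the analyst. The names `measures` and `solve_k1` and the candidate grid are hypothetical.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def measures(k1, k2, n, delta=0.0):
    """Return (ARL, ASS) for coefficients k1 >= k2 and a shift of delta."""
    p_out = Z.cdf(delta - k1) + Z.cdf(-k1 - delta)            # beyond outer limits
    p_res = (Z.cdf(k1 - delta) - Z.cdf(k2 - delta)) \
          + (Z.cdf(-k2 - delta) - Z.cdf(-k1 - delta))         # resampling region
    return (1.0 - p_res) / p_out, n / (1.0 - p_res)

def solve_k1(k2, r0):
    """Smallest k1 >= k2 whose in-control ARL is at least r0 (requires k2 > 0)."""
    if k2 <= 0:
        raise ValueError("k2 must be positive")
    c = Z.cdf(-k2)                               # P(Z >= k2)
    if 2.0 * c * r0 <= 1.0:                      # k1 = k2 already meets the target
        return k2
    u = (1.0 - 2.0 * c) / (2.0 * (r0 - 1.0))     # required P(Z >= k1)
    return -Z.inv_cdf(u)

n, r0 = 5, 370                                   # Step 1 (illustrative values)
for k2 in (0.5, 1.0, 1.5, 2.0):                  # Step 2: candidate inner coefficients
    k1 = solve_k1(k2, r0)
    arl0, ass0 = measures(k1, k2, n)
    print(f"k2={k2:.1f}  k1={k1:.4f}  ARL0={arl0:.1f}  ASS0={ass0:.2f}")

k2 = 1.5                                         # Step 3 for one chosen pair
k1 = solve_k1(k2, r0)
for delta in (0.25, 0.5, 1.0):
    arl1, ass1 = measures(k1, k2, n, delta)
    print(f"delta={delta:.2f}  ARL1={arl1:.2f}  ASS1={ass1:.2f}")
```

The closed form follows from setting \(2u/(1 - 2c + 2u) = 1/r_{0}\), where \(u = {\text{P}}(Z \ge k_{1})\) and \(c = {\text{P}}(Z \ge k_{2})\). Larger values of \(k_{2}\) shrink the resampling region and bring \({\text{AS}}{S}_{0}\) closer to n.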

3 Results and Discussion

The proposed control chart depends on two constants, \({{\text{k}}}_{1}\) and \({{\text{k}}}_{2}\), called the control charting constants. Their numerical values can be determined from Eq. (11) by fixing the parameters of the Pareto distribution and the in-control average run length \({{\text{ARL}}}_{0}\). Once the control charting constants are determined and the Pareto control chart is constructed, the performance of the proposed chart can be investigated when a shift occurs in the parameter of the Pareto distribution. Let \({\upmu }_{1}={\upmu }_{0}+\updelta {\upsigma }_{{\text{y}}}\) be the shifted mean of the process when a shift of \(\updelta\) standard units occurs in the process mean, depending upon the definition of \(Y= \overline{x }\pm ks\); k = 3 is used in this work. All computations in this study were performed with the statistical software R.

The performance of the Shewhart Pareto chart using a repetitive sampling scheme is evaluated in terms of \({{\text{ARL}}}_{1}\) by fixing \({{\text{ARL}}}_{0}\), the sample size and the parameters of the Pareto distribution. Some of the results are reported in Tables 1, 2, 3 and 4 for various choices. The values of \({{\text{ARL}}}_{1}\) and \({\text{AS}}{S}_{1}\) for any other choice of parameters, sample size or \({{\text{ARL}}}_{0}\) can be obtained easily using Eqs. (13) and (18).

Table 1 The ARLs for the proposed chart when a = 4.5, b = 2 and ARL0 = 370
Table 2 The ARLs for the proposed chart when a = 5.0, b = 2 and ARL0 = 370
Table 3 The ARLs for the proposed chart when a = 4.5, b = 3 and ARL0 = 370
Table 4 The ARLs for the proposed chart when a = 5, b = 3 and ARL0 = 370

Tables 1, 2, 3 and 4 reveal that

  1. The control charting constants (k1 and k2) differ markedly from the standard value 3.

  2. \({{\text{ARL}}}_{1}\) decreases as shift size decreases, while \({\text{AS}}{S}_{1}\) increases slightly, for fixed values of a, b, n and ARL0.

  3. \({{\text{ARL}}}_{1}\) decreases as the sample size increases for fixed values of a, b and ARL0.

  4. \({{\text{ARL}}}_{1}\) decreases as the value of the location parameter increases with the other factors held fixed. This is expected, as our focus is to detect changes in the location parameter of the Pareto distribution, assuming the shape parameter to be controllable.

  5. There is no significant effect on \({{\text{ARL}}}_{1}\) when the shape parameter increases for fixed values of n, b and ARL0.

  6. There is no significant effect on the performance of the proposed chart in terms of \({{\text{ARL}}}_{1}\) when both parameters change simultaneously for fixed values of n and ARL0.

Thus, the proposed Pareto control chart can effectively detect a change in the location parameter, and its efficiency increases with a moderate sample size. Note that it is desirable for the sample size to be close to \({\text{AS}}{S}_{1}\). The tables show that when the sample size is small, the values of \({\text{AS}}{S}_{1}\) are somewhat further from the fixed sample size when \(\updelta\) is greater than 0.60. It is worth noting that, when sampling is repeated, a sample of the same fixed size is taken from the production process.

4 Performance Comparison with Existing Chart

In this section, the proposed control chart is compared with the existing Shewhart-type Pareto control chart proposed by Guo and Wang [5], which corresponds to the case \(k_{1} = k_{2} = L\). For the same in-control ARL, the control chart with the smaller out-of-control ARL is said to be more efficient. Hence, the performance of the proposed control chart is studied based on ARL values. For this comparison, with the process in control at ARL0 = 370, the out-of-control ARLs of the Guo and Wang [5] chart and the proposed Pareto control chart for different mean shifts, with n = 5, a = 5 and b = 2, are presented in Table 5.

Table 5 The ARLs for the proposed chart and the existing Shewhart-type Pareto control chart when n = 5, a = 5, b = 2 and ARL0 = 370

According to these results, the proposed control chart gives smaller ARL values for the different shifts (\(\delta\)); hence, the proposed control chart detects shifts more quickly than the existing Shewhart-type Pareto control chart of Guo and Wang [5]. For example, when ARL0 = 370, n = 5, a = 5, b = 2 and \(\delta\) = 0.7, Table 5 gives ARL1 = 109.87 for the proposed control chart, whereas ARL1 = 114.74 for the Guo and Wang [5] Pareto control chart. The ARL curves of the proposed Pareto control chart using repetitive sampling and of the Guo and Wang [5] control chart for the specified mean shifts are displayed in Fig. 1. The ARL curves make it clear that the proposed control chart using repetitive sampling performs better than the Shewhart-type control chart [5]. To reinforce this finding, a simulation study is carried out in Section 6. Similar results have been observed for other choices of parameters and sample sizes.
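The kind of comparison described above can be sketched under the normal-approximation formulas of Section 2.2. The Python snippet below calibrates a Shewhart-type chart (\(k_{1} = k_{2} = L\)) and a repetitive-sampling chart to the same \({{\text{ARL}}}_{0}\) and compares their \({{\text{ARL}}}_{1}\) values for an additive shift \(\updelta\). The inner coefficient \(k_{2} = 1\) is an arbitrary illustrative choice, and the printed numbers come from these formulas only; they are not the Table 5 values.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def arl(k1, k2, delta=0.0):
    """ARL of a repetitive-sampling chart under the normal approximation."""
    p_out = Z.cdf(delta - k1) + Z.cdf(-k1 - delta)
    p_res = (Z.cdf(k1 - delta) - Z.cdf(k2 - delta)) \
          + (Z.cdf(-k2 - delta) - Z.cdf(-k1 - delta))
    return (1.0 - p_res) / p_out

r0 = 370.0
L = -Z.inv_cdf(1.0 / (2.0 * r0))       # Shewhart-type chart: k1 = k2 = L

k2 = 1.0                               # hypothetical inner coefficient
k1 = -Z.inv_cdf((1.0 - 2.0 * Z.cdf(-k2)) / (2.0 * (r0 - 1.0)))

print(f"L={L:.4f}, k1={k1:.4f}, k2={k2:.1f}")
for delta in (0.1, 0.3, 0.5, 0.7, 1.0):
    print(f"delta={delta:.1f}  Shewhart ARL1={arl(L, L, delta):8.2f}  "
          f"repetitive ARL1={arl(k1, k2, delta):8.2f}")
```

Both charts are matched at \({{\text{ARL}}}_{0} = 370\), so any difference in the printed \({{\text{ARL}}}_{1}\) columns reflects the effect of the resampling region alone.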

Fig. 1 Logarithmic ARL1 curves of the proposed control chart and the Shewhart-type control chart when n = 5, a = 5, b = 2 and ARL0 = 370

5 A Real Example Based on Tax Revenue Data

The developed control chart is demonstrated in this section using a real data set taken from Klakattawi [6]. The data are monthly records of the tax revenue of Egypt between January 2006 and November 2010, in 1000 million Egyptian pounds. The data, along with summary statistics, are reported in Table 6. The tax revenue data are well described by a Pareto distribution with parameters \(\hat{a}\) = 1.707 and \(\hat{b}\) = 6.641: the Kolmogorov–Smirnov test gives a maximum distance of 0.094 between the empirical and fitted distributions, with a p-value of 0.6978. The goodness of fit is illustrated in Fig. 2, which shows the histogram with the theoretical density and the P–P plot of the fitted Pareto distribution for the tax revenue data. The chart coefficients for the estimated parameters a and b, with n = 5 and ARL0 = 370, are k1 = 9.8274 and k2 = 0.8366. Using these chart coefficients, the inner control limits are \({{\text{LCL}}}_{2}=-3.9007\) and \({{\text{UCL}}}_{2}=28.6571\), and the outer control limits are \({{\text{LCL}}}_{1}=-178.847\) and \({{\text{UCL}}}_{1}=203.6039\). Figure 3 presents the proposed Pareto control chart using repetitive sampling for the tax revenue data. The chart shows that some points lie between the inner and outer control limits; hence, sampling should be repeated with the same sample size.

Table 6 The monthly records of tax revenue of Egypt along with statistics
Fig. 2 Visual presentation of tax revenue data

Fig. 3 Proposed Pareto control chart using repetitive sampling for tax revenue data

6 Application Using Simulated Tax Revenue Data

In this section, we study the performance of the proposed control chart using simulated data. To compare the proposed control chart with the existing control chart, 40 subgroups are generated from the Pareto distribution. The first 20 subgroups, each of size 5, are generated from the Pareto distribution with in-control parameters a = 5 and b = 2, and the last 20 subgroups, each of size 5, are generated from the Pareto distribution with a = 5 and out-of-control b1 = \(\delta b = 0.8 \times 2 = 1.6\). That is, the process is shifted after the first 20 subgroups, with a shift of \(\delta\) = 0.8 in b.
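The data-generation step can be reproduced in spirit with inverse-CDF sampling from Eq. (2). The following Python sketch is illustrative only: the seed, the function names, and the choice of plotting \(Y = \overline{x } + ks\) with k = 3 (the plus version of the statistic) are assumptions, so the resulting values will not match the authors' simulated data.

```python
import random
import statistics

def pareto_sample(a, b, size, rng):
    """Draw Pareto(a, b) values by inverting F(x) = 1 - (b/x)^a."""
    # rng.random() lies in [0, 1), so 1 - u lies in (0, 1] and x >= b
    return [b / (1.0 - rng.random()) ** (1.0 / a) for _ in range(size)]

def simulate_subgroups(seed=1, a=5.0, b0=2.0, delta=0.8, n=5, m=20):
    """m in-control subgroups (b = b0) followed by m shifted ones (b = delta * b0)."""
    rng = random.Random(seed)
    in_control = [pareto_sample(a, b0, n, rng) for _ in range(m)]
    shifted = [pareto_sample(a, delta * b0, n, rng) for _ in range(m)]
    return in_control + shifted

def plotted_statistic(group, k=3.0):
    """Y = xbar + k*s for one subgroup (assumed 'plus' version of the statistic)."""
    return statistics.mean(group) + k * statistics.stdev(group)

groups = simulate_subgroups()
for i, g in enumerate(groups, start=1):
    print(i, round(plotted_statistic(g), 4))
```

Each printed value would then be compared with the inner and outer control limits to decide between in-control, resampling, and out-of-control.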

The chart coefficients for n = 5 and ARL0 = 370 are available in Table 5. The proposed control chart using repetitive sampling, with \(k_{1}\) = 3.3974 and \(k_{2}\) = 1.7061, is displayed in Fig. 4, and the Shewhart-type control chart with L = 4.4198 is presented in Fig. 5. Figure 5 shows that the Shewhart-type control chart fails to detect the shift. From Fig. 4, it can be observed that the proposed control chart under a repetitive sampling scheme gives out-of-control signals at subgroup numbers 28, 30, 36 and 39. This example shows that the proposed control chart under a repetitive sampling scheme is more effective in detecting a process shift than the Shewhart-type control chart.

Fig. 4 Proposed Pareto control chart using repetitive sampling for simulated data

Fig. 5 Existing Shewhart-type Pareto control chart of Guo and Wang [5] for simulated data

7 Conclusions and Recommendations

In conclusion, this paper has introduced a Shewhart Pareto control chart for monitoring shifts in a Pareto-distributed process through repetitive sampling. The chart, which uses a modified statistic that combines the shape and threshold parameters, has been evaluated in terms of its run length characteristics, assuming a shift in the process mean. An efficiency comparison with the existing control chart indicates that the proposed Pareto chart detects changes more quickly, on average. The practical application of the approach has been illustrated with an example based on tax revenue data.

We have highlighted the significance of the Pareto distribution in statistical process monitoring, and in particular the importance of considering heavy-tailed distributions in control chart design. To the best of our knowledge, no prior research addressed control charts for the Pareto distribution under repetitive sampling; by building upon the groundwork of Guo and Wang [5], this paper fills that gap with a design for a Shewhart Pareto control chart under a repetitive sampling scheme. This work also opens avenues for further exploration and refinement of control chart methodologies under non-normal distributions in diverse industrial settings.

A limitation of the proposed control chart is that it is not suitable for data that follow a normal distribution. Future research could develop an exponentially weighted moving average (EWMA) scheme for the Pareto distribution under repetitive sampling. Investigating the efficacy of the proposed Pareto chart with a cost model is another promising avenue for future research.