Improved Approximation Scales for Unreplicated Factorial Experiments

Aboukalam, F.; Alharbi, M.; Bhatti, M. Ishaq

doi:10.1007/s44199-022-00049-x


Research Article
Open access
Published: 25 September 2022

Volume 21, pages 200–216, (2022)

Journal of Statistical Theory and Applications


Abstract

Quick and powerful methods for assessing the sizes of active contrasts in unreplicated factorial and fractional factorial experiments are required for analyzing big data in various areas of research. One of the older methods, due to Lenth (1989), is still used in some statistical and data-analytic applications; it is fast but less efficient. We propose a new class of tests that are simpler, faster, and more powerful, using the location median-function (${\psi }_{med}\left({\varvec{x}}\right)$) after it has been skipped one and/or two times. An empirical simulation study computing the critical values, sizes, and powers for various sample sizes demonstrates the superiority of our methods. The proposed methods are illustrated with examples and can be employed in various fields of research for data analytics using high computing power and machine learning.


1 Introduction

In recent years, social and physical scientists have been searching for quick and powerful data analytics techniques, based on computing technologies, the Internet of Things (IoT), and machine learning, to analyze big data. For example, Salmaso et al. [16] suggested a two-step workflow in which design of experiments (DOE) is conducted prior to the usual big data analytics and machine learning modeling phase, and demonstrated their DOE on an industrial application. Others who have pointed out the use of unreplicated factorials in DOE and other areas include Deepa et al. [9], who give an interesting review of the literature on the subject, and Guerra-Zubiaga and Luong [10], who analyze the energy consumption parameters of industrial robots using factorial DOE, but without due regard to the factors and factor interactions that arise when designing DOE methodologies with powerful and speedy tests. This paper fills this gap in the literature and presents applications, with various examples, to demonstrate that the sizes and powers of the proposed tests are better than those of competitors such as Lenth [14] and Aboukalam [2].

It is common in data analytics that the ‘response’ in many practical experiments may be affected by multiple factors, although most of the factors, or some of the factor interactions, are often not active (see, for example, [5, 16], Tsai [17], Hassan et al. (2022),^{Footnote 1} Wosiak et al. [18], and the references in Deepa et al. (2022)). Considering experimentation cost, time, effort, limited data resources, and missing values in the data, researchers usually estimate and test the main and interaction effects with each factor at ‘two levels’ and do not replicate experiments. Therefore, researchers employ unreplicated ${2}^{p}$ factorial experiments and use only one observation per treatment combination. There are many methods in the literature for analyzing unreplicated ${2}^{p}$ factorial experiments; some of them are discussed in Lenth [14], Hamada and Balakrishnan [12], Loughin and Noble [15], Haaland and O'Connell (1995), and Aboukalam [2], among others. In general, Lenth's method seems popular and is used by many researchers. Aboukalam and his associates improved Lenth’s method (see Aboukalam and Al-Shiha [1], Aboukalam [2], and Al-Shiha and Aboukalam [3]), using two kinds of power together with size as numerical criteria for the comparison. Given the recent revolution in computing power, along with IoT and machine learning, these methods are no longer as powerful as reported in the earlier literature.

Now, owing to advances in computing power and the free availability of computing software such as R, computational speed and complexity are no longer an issue. This paper attempts to demonstrate how to use both powerful statistical tools to obtain better results. We employ Huber’s skew-symmetric function ${\varvec{\psi}}$, used in the location step, to find a scale S that solves the one-step equation of the scale part of the M-estimate^{Footnote 2} given below:

$$S^{2} = S_{0}^{2} \mathop \sum \limits_{j = 1}^{n} \left[ {{\varvec{\psi}}\left( {\frac{{x_{j} }}{{S_{0} }}} \right)} \right]^{2} \left[ {\frac{1}{{\left( {n - 1} \right) \times { }B}}} \right]$$

(1)

where ${S}_{0}$ is an initial robust scale estimate and the constant B is chosen equal to $\mathrm{E}\left({{\varvec{\psi}}}^{2}\right)$ to make the estimate consistent under normality. The quick, powerful methods of Al-Shiha and Aboukalam [3] and Aboukalam [2] used ${\psi }_{\mathrm{med}}\left({\varvec{x}}\right)=Sign\left(x\right)$ (the sign of x), the location median-function, after being skipped one time and two times, respectively. Here, we aim to apply some useful approximations to Eq. (1) with ${\psi }_{\mathrm{med}}\left({\varvec{x}}\right)$ to obtain new forms that are both quick and powerful. The improved forms are called SKM1 and SKM2 because one and two skippings of ${\psi }_{\mathrm{med}}\left({\varvec{x}}\right)$ are applied, respectively. The improved form SKM2 enables us to generate a class of three competitive quick methods, SKM2($k\times A$) with $k=1,2,3$, where A is a constant. This was not possible with the old form used in Aboukalam [2]. The results of SKM1 and SKM2(A) update those of Al-Shiha and Aboukalam [3] and Aboukalam [2], and are compared with the old results obtained without approximations. An empirical comparison showed that both sets of results are comparable, which gives us confidence that the approximations are successful and safe.

This paper shows that the ranking of the quick methods under study, from lowest to highest quality, is: Lenth, SKM1, SKM2(A), SKM2(2A), and SKM2(3A). The superiority of the new methods SKM2(2A) and SKM2(3A) over Lenth may sometimes be several times that of SKM2(A) over Lenth. Thus, the added value of these two new quick methods is remarkably high. Moreover, the study gathers in one place all the author's quick methods competing with Lenth's method. Finally, the necessary derivations are given, and numerical simulation experiments are conducted under given significance levels and four selected sample sizes often used in the field to compare the methods. Tables of critical points, sizes, and powers are computed empirically.

2 Theoretical Objectives

Suppose that ${\widehat{\beta }}_{1},{\widehat{\beta }}_{2},\dots ,{\widehat{\beta }}_{n}$ are the best linear unbiased estimators (BLUE) of the main and interaction effects obtained from an unreplicated ${2}^{p}$ factorial experiment in the standard order, where the effects are denoted by ${\beta }_{1},{\beta }_{2},\dots ,{\beta }_{n}$. Under the usual assumptions about the experimental errors (normality, independence, and a common unknown variance ${\sigma }^{2}$), the estimates ${\widehat{\beta }}_{1},{\widehat{\beta }}_{2},\dots ,{\widehat{\beta }}_{n}$ are independent, normally distributed random variables with a common unknown variance ${\uptau }^{2}={\upsigma }^{2}/{2}^{\mathrm{p}}$ and means ${\beta }_{1},{\beta }_{2},\dots ,{\beta }_{n}$, respectively. The statistical inference problem may be stated formally as follows. There are n normal distributions with unknown means ${\beta }_{1},{\beta }_{2},\dots ,{\beta }_{n}$ and a common unknown variance ${\uptau }^{2}$. From each distribution, a single observation ${\widehat{\beta }}_{j}$ $(j=1,2,\dots ,n)$ is obtained. The objective is to infer whether any of the ${\beta }_{j}$ differs from zero at a given level of significance. In other words, the objective is to determine whether the hypothesis ${H}_{0}:{\beta }_{j}=0$ can be rejected at significance level $\alpha$. This is done on the basis of the observation ${\widehat{\beta }}_{j}$ and an estimate of the scale $\tau$, call it S. The hypothesis ${H}_{0}$ is rejected if the j-th standardized absolute estimate ${SAE}_{j}=\left|{\widehat{\beta }}_{j}\right|/S$ is larger than a critical point $cr(\alpha ,n)$. As a result, once a good scale estimate has been chosen, the active effects ${\beta }_{j}$ corresponding to large values of ${SAE}_{j}$ can be identified. Lenth [14] proposed the following simple expression for the scale:

$$S_{L} = 1.5 \times Med\left\{ {\left| {\hat{\beta }_{j} } \right|:\left| {\hat{\beta }_{j} } \right| \le 2.5 \times S_{0} } \right\}$$

(2)

given that ${S}_{0}=1.5\times Med\left\{\left|{\widehat{\beta }}_{j}\right|\right\}$ is an initial scale.
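As a rough illustration of Eq. (2) and of the rejection rule based on ${SAE}_{j}$, the following Python sketch computes Lenth's scale and flags the effects whose standardized absolute estimate exceeds a critical point. The function names `lenth_scale` and `active_flags` are hypothetical, and any effects or critical point passed to them are the caller's own inputs, not values prescribed by the paper.

```python
import statistics


def lenth_scale(effects):
    """Lenth's scale S_L of Eq. (2).

    S_0 = 1.5 x median(|effects|); S_L = 1.5 x median of the |effects|
    that do not exceed 2.5 x S_0.
    """
    abs_effects = [abs(b) for b in effects]
    s0 = 1.5 * statistics.median(abs_effects)         # initial scale S_0
    kept = [a for a in abs_effects if a <= 2.5 * s0]  # drop large, likely active effects
    return 1.5 * statistics.median(kept)


def active_flags(effects, scale, critical_point):
    """Flag effect j as active when SAE_j = |effect_j| / scale > critical_point."""
    return [abs(b) / scale > critical_point for b in effects]
```

For example, `active_flags(effects, lenth_scale(effects), cr)` returns one Boolean per effect, where `cr` would be taken from a table of critical points such as the one described in Section 4.1.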

We aim to produce new, effective, and quick forms of the scale.

3 Skipped Median (SKM) scale

Let ${\widehat{\beta }}_{1},{\widehat{\beta }}_{2},\dots ,{\widehat{\beta }}_{n}$ be n estimated effects from the scale model $F\left(\widehat{\beta }/\tau \right)$, and let ${S}_{0}$ be an initial robust scale estimate, such as ${S}_{0}=1.5 \times Med\left|\widehat{{\beta }_{j}}\right|$. Equation (1), the one-step scale part of M-estimates, then reads:

$$S^{2} = S_{0}^{2} \mathop \sum \limits_{j = 1}^{n} \left[ {{\varvec{\psi}}\left( {\frac{{\widehat{\beta }_{j} }}{{S_{0} }}} \right)} \right]^{2} \left[ {\frac{1}{{\left( {n - 1} \right) \times B}}} \right]$$

(3)

Indeed, simple forms of the scale can be produced by using the location median-function:

$$\psi_{med} \left( {\varvec{x}} \right) = sign\left( x \right)$$

after being skipped one time (SKM1) at $\pm 2.5$ as follows:

$$\psi_{SKM1} \left( x \right) = sign\left( x \right) \times 1_{{\left[ {0,2.5} \right]}} \left( {\left| x \right|} \right)$$

(4)

or two times (SKM2) at $\pm 1$ and $\pm 2.5$ as follows:

$$\psi_{SKM2} \left( x \right) = sign\left( x \right)\left[ {1_{{\left[ {0,1} \right]}} \left( {\left| x \right|} \right) + \frac{1}{2} \times 1_{{\left( {1,2.5} \right]}} \left( {\left| x \right|} \right)} \right]$$

(5)

The cutoff points $\pm 2.5$ are chosen to facilitate comparison with ${S}_{L}$, the scale of Lenth.
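The skipped functions of Eqs. (4) and (5), and the one-step scale of Eq. (3) that they are plugged into, can be sketched in Python as follows. The function names are hypothetical, and the consistency constant `b` (the paper's B) is left as a parameter because its numerical value is not given in the text.

```python
import math


def sign(x):
    """Sign of x: -1, 0, or +1."""
    return (x > 0) - (x < 0)


def psi_skm1(x):
    """Eq. (4): sign(x) for |x| <= 2.5, zero beyond."""
    return sign(x) if abs(x) <= 2.5 else 0


def psi_skm2(x):
    """Eq. (5): sign(x) for |x| <= 1, half of it for 1 < |x| <= 2.5, zero beyond."""
    ax = abs(x)
    if ax <= 1.0:
        return sign(x)
    if ax <= 2.5:
        return 0.5 * sign(x)
    return 0.0


def one_step_scale(effects, psi, s0, b):
    """Eq. (3): one-step scale from initial scale s0, psi-function psi, and constant b."""
    n = len(effects)
    total = sum(psi(e / s0) ** 2 for e in effects)
    return math.sqrt(s0 ** 2 * total / ((n - 1) * b))
```

Passing `psi_skm1` or `psi_skm2` as the `psi` argument gives the unsimplified scales that the next two subsections approximate.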

Next, we approximate expression (3) with ${\psi }_{SKM1}\left(x\right)$ and ${\psi }_{SKM2}\left(x\right)$.

3.1 Skipped Median Scale One Time (SKM1)

If ${\psi }_{SKM1}$ is used in expression (3), then:

$${\text{s}}_{SKM1}^{2} = S_{0}^{2} \mathop \sum \limits_{i = 1}^{n} \left[ { \psi_{SKM1} \left( {\frac{{\widehat{\beta }_{i} }}{{S_{0} }}} \right)} \right]^{2} \left[ {\frac{1}{{\left( {n - 1} \right) \times B}}} \right]$$

(6)

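Let ${n}_{0}=\#\left\{{\widehat{\beta }}_{i}:\left|{\widehat{\beta }}_{i}\right|>2.5\,{S}_{0}\right\}$ be the number of estimated effects whose absolute value exceeds $2.5\,{S}_{0}$. Since each squared term equals 1 for a nonzero effect with $\left|{\widehat{\beta }}_{i}\right|\le 2.5\,{S}_{0}$ and 0 otherwise, it is clear that: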

$$\mathop \sum \limits_{i = 1}^{n} \left[ { \psi_{SKM1} \left( {\frac{{\widehat{\beta }_{i} }}{{S_{0} }}} \right)} \right]^{2} = n - n_{0} ,$$

(7)

So:

$${\text{s}}_{SKM1}^{2} = \left( {1.5 \times Med\left| {\widehat{{\beta_{i} }}} \right|} \right)^{2} \left( {\frac{{n - n_{0} }}{n}} \right)\left[ {\frac{n}{{\left( {n - 1} \right) \times B}}} \right]$$

(8)

Since simplicity is the aim, we drop the constant $\frac{n}{\left(n-1\right)\times B}$; this affects nothing except the critical point. Hence:

$${\varvec{s}}_{SKM1} = 1.5 \times Med\left| {\widehat{{\beta_{i} }}} \right| \times \left( {\frac{{n - n_{0} }}{n}} \right)^{\frac{1}{2}} .$$

(9)

Since ${\left(\frac{n-{n}_{0}}{n}\right)}^{1/2}\approx 1-\frac{{n}_{0}}{2n}$ to first order in ${n}_{0}/n$, ${{\varvec{s}}}_{SKM1}$ takes the simpler form:

$$S_{SKM1} \approx Med\left| {\widehat{{\beta_{i} }}} \right| \times (1.5 - A_{0} \times n_{0} ),$$

(10)

where ${A}_{0}=\frac{0.75}{n}$.
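To see how close the simplified form (10) stays to the square-root form (9), both can be evaluated side by side. The sketch below is illustrative only; the function name `skm1_scales` is hypothetical.

```python
import math
import statistics


def skm1_scales(effects):
    """Return the SKM1 scale from Eq. (9) and its approximation from Eq. (10)."""
    abs_effects = [abs(b) for b in effects]
    n = len(abs_effects)
    med = statistics.median(abs_effects)
    s0 = 1.5 * med                                # initial scale S_0
    n0 = sum(a > 2.5 * s0 for a in abs_effects)   # effects skipped at 2.5 x S_0
    eq9 = 1.5 * med * math.sqrt((n - n0) / n)     # square-root form
    eq10 = med * (1.5 - (0.75 / n) * n0)          # linearized form with A_0 = 0.75 / n
    return eq9, eq10
```

When no effect exceeds $2.5\,{S}_{0}$ (so ${n}_{0}=0$), both forms reduce to ${S}_{0}$; the gap between them grows with the fraction ${n}_{0}/n$.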

3.2 Skipped Median Scale Two Times (SKM2)

If ${\psi }_{SKM2}$ is used in expression (3), then:

$${\text{s}}_{SKM2}^{2} = S_{0}^{2} \mathop \sum \limits_{{{\text{i}} = 1}}^{{\text{n}}} \left[ {{\varvec{\psi}}_{SKM2} \left( {\frac{{\widehat{{{\upbeta }_{{\text{i}}} }}}}{{{\text{S}}_{0} }}} \right)} \right]^{2} \left[ {\frac{1}{{\left( {n - 1} \right) \times { }B}}} \right]$$

(11)

Let ${n}_{0}=\#\left\{{\widehat{\beta }}_{i}:\left|{\widehat{\beta }}_{i}\right|>2.5\times {S}_{0}\right\}$ and ${n}_{00}=\#\left\{{\widehat{\beta }}_{i}:\left|{\widehat{\beta }}_{i}\right|>{S}_{0}\right\}$. Then:

$$\mathop \sum \limits_{{{\text{i}} = 1}}^{{\text{n}}} \left[ {{\varvec{\psi}}_{SKM2} \left( {\frac{{\widehat{{{\upbeta }_{{\text{i}}} }}}}{{{\text{S}}_{0} }}} \right)} \right]^{2} = \left( {n - n_{{00{ }}} } \right) + \left( {\frac{{n_{{00{ }}} {-}n_{{0{ }}} }}{4}} \right) = n - \frac{{ 3n_{00 } + n_{0 } }}{4}$$

(12)

So:

$${\text{s}}_{SKM2}^{2} = \left( {1.5{ } \times Med\left| {\widehat{{\beta_{i} }}} \right|} \right)^{2} \left( {1 - \frac{{ 3n_{00 } + n_{0 } }}{4n}} \right)\left[ {\frac{n}{{\left( {n - 1} \right) \times { }B}}} \right]$$

(13)

Also, neglecting $\frac{n}{\left(n-1\right)\times B}$ affects only the critical point, hence:

$$s_{SKM2} = 1.5 \times Med\left| {\widehat{{\beta_{i} }}} \right| \times \left( {1 - \frac{{ 3n_{00 } + n_{0 } }}{4n}} \right)^{1/2}$$

(14)

Since ${\left(1-\frac{3{n}_{00}+{n}_{0}}{4n}\right)}^{1/2}\approx 1-\frac{3{n}_{00}+{n}_{0}}{8n}$ to first order, ${{\varvec{s}}}_{SKM2}$ takes the simpler form:

$$s_{SKM2} \approx Med\left| {\widehat{{\beta_{i} }}} \right| \times \left[ {1.5 - A \times \left( {{ }3n_{{00{ }}} + n_{{0{ }}} } \right)} \right],$$

(15)

where $A=\frac{1.5}{8 n}=\frac{0.1875}{n}$.
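The approximation step can be spelled out as a first-order binomial (Taylor) expansion. The display below is a sketch of that step; the symbol $u$ is introduced here only for illustration, and the remark on the sign of the error is a standard property of the square-root series rather than a statement from the paper.

$$\left( {1 - u} \right)^{1/2} = 1 - \frac{u}{2} - \frac{{u^{2} }}{8} - \cdots \approx 1 - \frac{u}{2},\quad u = \frac{{3n_{00} + n_{0} }}{4n},$$

$$1.5 \times \left( {1 - \frac{u}{2}} \right) = 1.5 - \frac{1.5}{8n} \times \left( {3n_{00} + n_{0} } \right) = 1.5 - A \times \left( {3n_{00} + n_{0} } \right).$$

Because every omitted term of the series is negative, the linear form slightly overestimates the square root, and the gap grows with $u$. The same expansion with $u = n_{0}/n$ gives Eq. (10) and ${A}_{0}=0.75/n$.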

Overall, we propose the class of scales ${s}_{SKM2\left(k\times A\right)}$, $k=1,2,3$, obtained by multiplying the constant $A$ by the factor $k$:

$$s_{SKM2\left( A \right)} = Med\left| {\widehat{{\beta_{i} }}} \right| \times \left[ {1.5 - A \times \left( {{ }3n_{{00{ }}} + n_{{0{ }}} } \right)} \right]$$

(16)

$$s_{{SKM2\left( {2A} \right)}} = Med\left| {\widehat{{\beta_{i} }}} \right| \times \left[ {1.5 - 2A \times \left( {3n_{00} + n_{0} } \right)} \right]$$

(17)

$$s_{{SKM2\left( {3A} \right)}} = Med\left| {\widehat{{\beta_{i} }}} \right| \times \left[ {1.5 - 3A \times \left( {{ }3n_{{00{ }}} + n_{{0{ }}} } \right)} \right]$$

(18)

The values 2 and 3 of the factor k can be seen as reflecting other suitable shapes of the ${\varvec{\psi}}$-function used, or simply as heuristic proposals that are justified by how well they perform.
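The class of scales in Eqs. (16) to (18) differs only in the multiplier k, so a single function covers all three. The following Python sketch assumes nothing beyond those equations; the function name `skm2_scale` is hypothetical.

```python
import statistics


def skm2_scale(effects, k=1):
    """SKM2(kA) scale of Eqs. (16)-(18) for k = 1, 2, or 3."""
    abs_effects = [abs(b) for b in effects]
    n = len(abs_effects)
    med = statistics.median(abs_effects)
    s0 = 1.5 * med                                 # initial scale S_0
    n0 = sum(a > 2.5 * s0 for a in abs_effects)    # count beyond the outer skip point
    n00 = sum(a > s0 for a in abs_effects)         # count beyond the inner skip point
    a_const = 0.1875 / n                           # A = 1.5 / (8 n)
    return med * (1.5 - k * a_const * (3 * n00 + n0))
```

Calling `skm2_scale(effects, k)` for k = 1, 2, 3 gives progressively smaller scales whenever some effect exceeds ${S}_{0}$, which is how the larger multipliers counter the inflation of the scale by active effects.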

Finally, to help the experimenter remember all the aforesaid quick scales we propose to read ${s}_{SKM1}$ and ${s}_{SKM2\left(kA\right)}$ using the unified form:

$$S_{SKM} = {\text{Part of }}1.5 \times {\text{Med of Entire}}\left\{ {\hat{\beta }_{j} } \right\},$$

whereas Lenth's scale is read in the parallel way:

$$S_{L} = {\text{Entire }}1.5 \times {\text{Med of Part of}}\left\{ {\hat{\beta }_{j} } \right\}.$$

4 Simulation Studies and Results

4.1 Critical Points of the Test Statistics

Scales are usually compared under equal probabilities ($=\alpha$) of rejecting the null hypothesis ${H}_{0}: {\beta }_{j}=0$ when none of the n effects ${\beta }_{j}$ is active, that is, when ${H}_{0}$ is true. To fulfill this requirement, we compute the empirical critical point, $cr(\alpha ,n)$, of the test statistic under ${H}_{0}$. The value of $cr(\alpha ,n)$ is the 100(1–$\alpha$)% quantile of the distribution of the test statistic ${SAE}_{j}$. The reference distribution of this statistic is built empirically from nr = 10,000 runs of samples of size n from the standard normal distribution. Hence, the empirical distribution comprises nr × n values of the test statistic. For some values of $\alpha$ and four selected sample sizes n, the empirical critical points of the Lenth, SKM1, SKM2(A), SKM2(2A), and SKM2(3A) test statistics are given in Table 1 in the appendix.
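The procedure just described can be sketched as a short Monte Carlo routine. This is a minimal sketch under stated assumptions, not the authors' code: the function names, the random seed, and the quantile convention (the order statistic at rank ⌈(1 − α) · nr · n⌉) are choices made here for illustration, and Lenth's scale is used only as an example of a scale function.

```python
import math
import random
import statistics


def lenth_scale(abs_effects):
    """Lenth's scale, Eq. (2), computed from absolute effects."""
    s0 = 1.5 * statistics.median(abs_effects)
    kept = [a for a in abs_effects if a <= 2.5 * s0]
    return 1.5 * statistics.median(kept)


def empirical_critical_point(scale_fn, n, alpha, nr=10_000, seed=0):
    """Empirical 100(1 - alpha)% quantile of SAE_j under H0.

    Pools the nr x n values of |effect| / scale obtained from nr standard
    normal samples of size n, as described in the text.
    """
    rng = random.Random(seed)
    sae = []
    for _ in range(nr):
        abs_effects = [abs(rng.gauss(0.0, 1.0)) for _ in range(n)]
        s = scale_fn(abs_effects)
        sae.extend(a / s for a in abs_effects)
    sae.sort()
    idx = math.ceil((1.0 - alpha) * len(sae)) - 1   # assumed quantile convention
    return sae[idx]
```

Any of the other scales can be substituted for `lenth_scale` by passing a function with the same signature. Results vary slightly with the seed and with nr.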

4.2 Numerical Criteria; Sizes and Powers

The five techniques, Lenth, SKM1, SKM2(A), SKM2(2A), and SKM2(3A), are assessed and compared in this study using numerical criteria. The main criterion is power, defined as the probability of declaring an effect active given that it is active. The power of a test is considered together with its size, by which we mean the probability of declaring an effect active given that it is inactive. As is well known, the size must be close to the significance level $\alpha$ when ${H}_{0}$ is true. Here, we regard $\alpha$ as an error premium that is paid in the null case; many experimenters may not be concerned about the size in the non-null case as long as it stays within $\alpha$. In general, a technique is best if its powers are higher and its sizes lower than those of the competing tests.^{Footnote 3} A decline in power indicates that the test misses some truly active effects. This usually happens if the band $S\times cr(\alpha ,n)$ becomes larger, that is, if the scale S suffers from the inflation deficiency. An increase in size indicates that the test falsely declares some truly inactive effects as active. This usually happens if the band becomes smaller, that is, if the scale S suffers from the shrinkage deficiency. Powers should be examined first. If two techniques have comparable powers, the better one is the technique with the lower sizes, because it raises fewer false alarms and, consequently, further investigations cost less.

The relevant powers and sizes were assessed empirically using nr = 10,000 runs of standard normal random samples of sizes n = 15, 31, 63, and 127. Let na be the number of active effects planted in each sample. In each run, the first na effects, $\{{\beta }_{j}, j=1,2,\dots ,na\}$, were made active and the remaining (n − na) were kept inactive. The number na took the values 1, 2, 4, 6, and 7 for n = 15; 3, 6, 9, 12, and 15 for n = 31; 6, 12, 18, 24, and 31 for n = 63; and 12, 24, 36, 48, and 63 for n = 127. The effect ${\beta }_{j}$ is made active by adding a shift $\Delta$ to the observation $\widehat{{\beta }_{j}}$ (i.e., $\widehat{{\beta }_{j}}+\Delta$). The value $\Delta =0$ corresponds to the null case, and we consider the other values $\Delta =2, 3, 4, 5,$ and $6$. The simulation, like those in the authors' 2001, 2006, and 2007 papers, covers only violations in the shift $\Delta$. This kind of violation is the one commonly considered in the field.

To verify the correctness of the critical points, the simulation was also conducted for the null case $\Delta =0$, in which the sizes and the powers of all methods come out close to the predefined level $\alpha$. The relatively high significance levels $\alpha =0.15$ and $0.20$ are often selected to give the desired active effects more chances of appearing. Finally, the empirical sizes and powers are defined by:

$$\left[ {{\text{Size}}} \right]\left\{ {{\text{Pow}}} \right\} = \frac{{{\text{Num of declared actives}} |{\text{they are}} \left[ {{\text{inactives}}} \right]\left\{ { {\text{actives}} } \right\} {\text{in all runs}}}}{{\left[ {nr \times \left( {n - na} \right)} \right]\left\{ {nr \times na} \right\}}}$$

We also need a global picture of the sizes and powers, without extensive detail on each value of $na$. Therefore, we increase the number of runs to nr = 50,000, and in each run j the value ${na}_{j}$ is selected randomly from the set of all its possible values. The global powers (GPow) and global sizes (GSize) are then computed as:

$$\left[ {{\text{GSize}}} \right]\left\{ {{\text{GPow}}} \right\} = \frac{{{\text{Num of declared actives}} |{\text{they are}} \left[ {{\text{inactives}}} \right]\left\{ { {\text{actives}} } \right\}{\text{in all runs}}}}{{\left[ {nr \times n - \mathop \sum \nolimits_{j = 1}^{j = nr} na_{j} } \right]\left\{ {\mathop \sum \nolimits_{j = 1}^{j = nr} na_{j} } \right\}}}$$

The GSize and GPow of the Lenth, SKM1, SKM2(A), SKM2(2A), and SKM2(3A) analyses for α = 0.15 and n = 15, 31, 63, 127 are given in Table 2 and plotted in Fig. 1. The behavior of the results for α = 0.20 is not different.
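The empirical size and power for a fixed number of planted active effects can be sketched as below. This is an illustrative reconstruction of the described experiment, not the authors' program: the function names and seed are hypothetical, Lenth's scale stands in for any of the five scales, and the critical point must be supplied by the caller.

```python
import random
import statistics


def lenth_scale(abs_effects):
    """Lenth's scale, Eq. (2), computed from absolute effects."""
    s0 = 1.5 * statistics.median(abs_effects)
    kept = [a for a in abs_effects if a <= 2.5 * s0]
    return 1.5 * statistics.median(kept)


def simulate_size_power(scale_fn, cr, n, na, delta, nr=10_000, seed=0):
    """Empirical (size, power) with the first na of n effects shifted by delta.

    Size  = declared actives among inactive effects / (nr * (n - na)).
    Power = declared actives among active effects   / (nr * na).
    Requires 1 <= na < n.
    """
    rng = random.Random(seed)
    false_alarms = 0
    hits = 0
    for _ in range(nr):
        effects = [rng.gauss(0.0, 1.0) for _ in range(n)]
        for j in range(na):
            effects[j] += delta                     # plant an active effect
        abs_effects = [abs(b) for b in effects]
        band = scale_fn(abs_effects) * cr           # rejection band S x cr
        flags = [a > band for a in abs_effects]
        hits += sum(flags[:na])
        false_alarms += sum(flags[na:])
    return false_alarms / (nr * (n - na)), hits / (nr * na)
```

The global versions GSize and GPow follow the same pattern, with na drawn at random in each run and the denominators replaced by the accumulated counts of inactive and active effects.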

4.3 Results Discussion

Figure 1 shows that the GPow values of the Lenth and SKM1 methods are comparable, while GPow grows progressively for the SKM2(A), SKM2(2A), and SKM2(3A) methods, in that order. The superiority of SKM2(3A) over Lenth may sometimes reach as much as 12 units, as in the case n = 127 and $\Delta =5$, whereas the superiority of SKM2(A) over Lenth does not exceed 4 units. The GSize values of SKM1, SKM2(A), and SKM2(2A) are lower than that of Lenth. For SKM2(3A), the GSize is similar to Lenth's at low shifts and lower otherwise. Briefly, the quick methods under study can be ranked from lowest to highest protection against the shrinkage and inflation deficiencies as follows: Lenth, SKM1, SKM2(A), SKM2(2A), and SKM2(3A). The applications that follow illustrate this property.

4.4 Applications

Application (1): Fig. 2 shows the normal probability plot of the ordered absolute effects: 0.42, 0.73, 0.94, 0.98, 1.11, 1.19, 2.50, 2.73, 4.70, 5.10, 5.11, 5.58, 5.80, 6.65, and 8.42. The data comprise 15 effects, of which the last 7 are active.

The initial quantities needed for all methods are:

$$Med\left| {\hat{\beta }_{j} } \right| = 2.73\quad \& \quad {\varvec{S}}_{0} = 1.5 \times 2.73 = 4.095$$

$$n_{0} = \#\left\{ {\hat{\beta }_{i} :\left| {\hat{\beta }_{i} } \right| > 2.5 \times S_{0} = 10.2375} \right\} = 0\quad \& \quad n_{00} = \#\left\{ {\hat{\beta }_{i} :\left| {\hat{\beta }_{i} } \right| > S_{0} = 4.095} \right\} = 7$$

$$A_{0} = \frac{0.75}{n} = 0.05\quad \& \quad A = \frac{0.1875}{n} = 0.0125$$

Lenth–scale:

$${\varvec{S}}_{L} = 1.5 \times Med\left\{ {\left| {\hat{\beta }_{j} } \right|:\left| {\hat{\beta }_{j} } \right| \le 2.5 \times S_{0} = 10.2375} \right\} = 1.5{ } \times { }2.73 = { }4.095,$$

$${\text{Lenth}}{-}{\text{ bands }} = { } \pm { }{\varvec{S}}_{L} \times cr\left( {0.15,15} \right) = { } \pm 4.095 \times 1.444741{ } = { } \pm 5.916{ }.$$

Here, only 2 active effects are detected, and the other 5 active effects are lost.

SKM1 – scale:

$${\varvec{S}}_{SKM1} = 2.73 \times \left( {1.5 - 0.05 \times 0} \right) = 4.095,$$

$$SKM1{-}{\text{ bands}} = { } \pm {\varvec{S}}_{SKM1} \times cr\left( {0.15,15} \right) = { } \pm 4.095 \times 1.429107{ } = \pm 5.852.$$

Here, only 2 active effects are detected, and the other 5 active effects are lost.

SKM2 (A)– scale:

$${\varvec{S}}_{SKM2\left( A \right)} = 2.73 \times \left[ {1.5 - 0.0125 \times \left( {3 \times 7 + 0} \right)} \right] = 3.378,$$

$$SKM2 \, \left( A \right){-}bands = \pm {\varvec{S}}_{SKM2\left( A \right)} \times cr\left( {0.15,15} \right) = \pm 3.378 \times 1.601896 = \pm 5.411.$$

Here, only 4 active effects are detected, and the other 3 active effects are lost.

SKM2 (2A)– scale:

$$S_{{SKM2\left( {2A} \right)}} = 2.73 \times \left[ {1.5 - 0.025 \times \left( {3 \times 7 + 0} \right)} \right] = 2.662,$$

$$SKM2 \, \left( {2A} \right) \, {-} \, bands = { } \pm {\varvec{S}}_{{SKM2\left( {2A} \right)}} \times cr\left( {0.15,15} \right){ } = { } \pm 2.662 \times 1.878985{ } = \pm 5.002.$$

Here, 6 active effects are detected, and the remaining active effect is lost.

SKM2 (3A)– scale:

$$S_{{SKM2\left( {3A} \right)}} = 2.73 \times \left[ {1.5 - 0.0375 \times \left( {3 \times 7 + 0} \right)} \right] = 1.945,$$

$$SKM2 \, \left( {3A} \right){-} \, bands = { } \pm S_{{SKM2\left( {3A} \right)}} \times cr\left( {0.15,15} \right){ } = { } \pm 1.945 \times 2.258468 = \pm 4.393.$$

Here, all 7 active effects are detected, and nothing is lost. Figure 2 displays the bands and the inflation deficiency of Lenth in comparison with SKM1 and SKM2. We now consider another application.

Application (2): Fig. 3 shows the normal probability plot for Example II of Box and Meyer [8]. The ordered absolute effects of this example are as follows:

0.03, 0.05, 0.13, 0.13, 0.13, 0.15, 0.15, 0.30, 0.37, 0.37, 0.40, 0.40, 0.42, 2.15, and 3.10.

There are 15 effects, of which the last two are likely to be truly active. The initial quantities needed for all methods are:

$$Med\left| {\hat{\beta }_{j} } \right| = { }0.30\quad \& \quad {\varvec{S}}_{0} = 1.5{ } \times { }0.30 = { }0.45,$$

$$n_{0} = \#\left\{ {\hat{\beta }_{i} :\left| {\hat{\beta }_{i} } \right| > 2.5 \times S_{0} = 1.125} \right\} = 2\quad \& \quad n_{00} = \#\left\{ {\hat{\beta }_{i} :\left| {\hat{\beta }_{i} } \right| > S_{0} = 0.45} \right\} = 2$$

$$A_{0} = \frac{0.75}{n} = 0.05\quad \& \quad A = \frac{0.1875}{n} = 0.0125$$

Lenth– scale:

$${\varvec{S}}_{L} = 1.5 \times Med\left\{ {\left| {\hat{\beta }_{j} } \right|:\left| {\hat{\beta }_{j} } \right| \le 2.5 \times S_{0} = 1.125} \right\} = 1.5{ } \times { }0.15 = { }0.225,$$

$${\text{Lenth }}{-}{\text{ bands }} = { } \pm { }{\varvec{S}}_{L} \times cr\left( {0.15,15} \right) = { } \pm { }0.225 \times 1.444741{ } = { } \pm 0.3251{ }.$$

Here, 7 effects are detected as actives and 5 of them are false.

SKM1 – scale:

$${\varvec{S}}_{SKM1} = 0.30 \times \left( {1.5 - 0.05 \times 2} \right) = 0.42,$$

$$SKM1{-}{\text{ bands }} = { } \pm {\varvec{S}}_{SKM1} \times cr\left( {0.15,15} \right) = { } \pm 0.42 \times 1.429107{ } = \pm 0.6002.$$

Here, 2 effects are detected as actives, and nothing is false.

SKM2 (A)– scale:

$${\varvec{S}}_{SKM2\left( A \right)} = 0.30 \times \left[ {1.5 - 0.0125 \times \left( {3 \times 2 + 2} \right)} \right] = 0.42,$$

$$SKM2 \, \left( A \right){-}{\text{bands}} = { } \pm {\varvec{S}}_{SKM2\left( A \right)} \times cr\left( {0.15,15} \right) = { } \pm 0.42 \times 1.601896{ } = \pm 0.6728.$$

Here, 2 effects are detected as actives, and nothing is false.

SKM2 (2A)– scale:

$${\varvec{S}}_{{SKM2\left( {2A} \right)}} = 0.30 \times \left[ {1.5 - 0.025 \times \left( {3 \times 2 + 2} \right)} \right] = 0.39,$$

$$SKM2 \, \left( {2A} \right){-}{\text{bands}} = { } \pm {\varvec{S}}_{{SKM2\left( {2A} \right)}} \times cr\left( {0.15,15} \right) = { } \pm 0.39 \times 1.878985{ } = \pm 0.7328.$$

Here, 2 effects are detected as actives, and nothing is false.

SKM2 (3A)– scale:

$$S_{{SKM2\left( {3A} \right)}} = 0.30 \times \left[ {1.5 - 0.0375 \times \left( {3 \times 2 + 2} \right)} \right] = 0.36,$$

$$SKM2 \, \left( {3A} \right){-}{\text{bands}} = \pm S_{{SKM2\left( {3A} \right)}} \times cr\left( {0.15,15} \right) = \pm 0.36 \times 2.258468 = \pm 0.8130.$$

Here, 2 effects are detected as actives, and nothing is false. Figure 3 displays the bands and the shrinkage deficiency of Lenth in comparison with SKM1 and SKM2.
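The hand calculations in both applications can be checked with a few lines of Python. The sketch below uses the ordered absolute effects and the critical points $cr(0.15,15)$ quoted above; the function and variable names are hypothetical, and an effect is counted as detected when its absolute value exceeds the band $S\times cr$.

```python
import statistics

# Critical points cr(0.15, 15) as quoted in the two applications.
CR = {"Lenth": 1.444741, "SKM1": 1.429107, "SKM2(A)": 1.601896,
      "SKM2(2A)": 1.878985, "SKM2(3A)": 2.258468}

APP1 = [0.42, 0.73, 0.94, 0.98, 1.11, 1.19, 2.50, 2.73,
        4.70, 5.10, 5.11, 5.58, 5.80, 6.65, 8.42]
APP2 = [0.03, 0.05, 0.13, 0.13, 0.13, 0.15, 0.15, 0.30,
        0.37, 0.37, 0.40, 0.40, 0.42, 2.15, 3.10]


def all_scales(abs_effects):
    """Lenth, SKM1, and SKM2(kA) scales for one set of absolute effects."""
    n = len(abs_effects)
    med = statistics.median(abs_effects)
    s0 = 1.5 * med
    n0 = sum(a > 2.5 * s0 for a in abs_effects)
    n00 = sum(a > s0 for a in abs_effects)
    a0, a = 0.75 / n, 0.1875 / n
    kept = [x for x in abs_effects if x <= 2.5 * s0]
    scales = {"Lenth": 1.5 * statistics.median(kept),
              "SKM1": med * (1.5 - a0 * n0)}
    for k, name in ((1, "SKM2(A)"), (2, "SKM2(2A)"), (3, "SKM2(3A)")):
        scales[name] = med * (1.5 - k * a * (3 * n00 + n0))
    return scales


def detected_counts(abs_effects):
    """Number of effects outside the band S x cr for each method."""
    return {name: sum(x > s * CR[name] for x in abs_effects)
            for name, s in all_scales(abs_effects).items()}
```

Running `detected_counts(APP1)` and `detected_counts(APP2)` reproduces the detection counts reported for the five methods in the two applications.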

5 Concluding Remarks

This paper proposes a new class of simpler, faster, more efficient, and more powerful approximation methods based on the location median-function (${\psi }_{med}\left({\varvec{x}}\right)$). The speed comes from the closed-form approximations obtained after skipping the median-function one and/or two times. We observe that the SKM2(3A) technique performs far better than its competitors recorded in the literature, including Lenth's technique. The proposed technique is simpler, faster, and free from the inflation and/or shrinkage deficiencies that may occur. The comparison with previous studies is supported by mathematical derivations and simulated tables of critical values, sizes, and powers. The proposed methods are illustrated with examples to demonstrate their application in data analytics.

Availability of Data and Material

This is a simulation-based empirical study; the data can be provided on request.

Notes

Applied to studying the effects of catalyst loading, catalyst selection, porous transport layer type, and conductive additive content in relation to a water quality issue. Ba et al. [5] also applied it to multifactor unreplicated factorial experiments, Bhatti and Al-Shanfari (2017) to clustering, and Wosiak et al. [18] to measuring the influence on the energy consumption of a vertical electrolyser under natural convection.
For details and the basics of this estimate, readers are referred to [13], page 147.
See [6] for details on designing optimal tests for cluster effects, computing test sizes and critical values, and carrying out empirical power comparisons in search of uniformly most powerful tests. It was cited by Angelopoulos et al. [4] in various clustering applications and used by Bhatti and Al-Shanfari (2017).

Abbreviations

Med :: Median
IoT :: Internet of Things
DOE :: Design of experiments
S :: Scale
${S}_{0}$ :: Initial robust scale estimate
${S}_{L}$ :: Scale of Lenth
$F\left(x/\sigma \right)$ :: Scale model
$\alpha$ :: Significance level
${H}_{0}$ :: Null hypothesis
${\varvec{\psi}}$ :: Huber’s skew-symmetric function for location
B :: $\mathrm{E}\left({{\varvec{\psi}}}^{2}\right)$
${\psi }_{med}\left({\varvec{x}}\right)$ :: ${\varvec{\psi}}$ of the median = $Sign\left(x\right)$ (the sign of x)
SKM1 :: Skipping ${\psi }_{med}\left({\varvec{x}}\right)$ one time
SKM2 :: Skipping ${\psi }_{med}\left({\varvec{x}}\right)$ two times
n :: Number of effects
nr :: Number of runs
na :: Number of active effects
A :: Constant $0.1875/n$
${A}_{0}$ :: Constant $0.75/n$
SKM2($k\times A$), $k=1,2,3$ :: SKM2 at $k\times A$
${\widehat{\beta }}_{i}$ :: Best linear unbiased estimator (BLUE) of ${\beta }_{i}$
${\sigma }^{2}$ :: Variance
cr :: Critical point
$\Delta$ :: Shift
${n}_{0}$ :: Number of absolute observations greater than $2.5\,{S}_{0}$
${n}_{00}$ :: Number of absolute observations greater than ${S}_{0}$
$Pow$ :: Power
$GPow$ :: Global power
$GSize$ :: Global size

References

Aboukalam, M.A.F., Al-Shiha, A.A.: A robust analysis for unreplicated factorial experiments. Comput. Stat. Data Anal. 36(1), 31–46 (2001)
Article MathSciNet MATH Google Scholar
Aboukalam, F.: More on quick analysis of unreplicated factorial designs avoiding shrinkage and inflation deficiencies. Int. J. Reliabil. Appl. 7, 167–175 (2006)
Google Scholar
Al-Shiha, A.A., Aboukalam, M.A.F.: Quick and easy analysis of unreplicated factorial designs avoiding shrinkage deficiency. J. Stat. Theory Appl. 6, 35–43 (2007)
MathSciNet Google Scholar
Angelopoulos, P., Koukouvinos, C., Skountzou, A.: Clustering effects in unreplicated factorial experiments. Commun. Stat. Simul. Comput. 42(9), 1998–2007 (2013)
Article MathSciNet MATH Google Scholar
Ba, I., Bl, A., Gm, O.: A proposed method of identifying significant effects in un-replicated factorial experiments. Annals. Comput. Sci. Series 17, 2 (2019)
Google Scholar
Bhatti, M.I.: Cluster effects in mining complex data. Nova Science Publishers, Incorporated, NY, USA (2012)
Google Scholar
Bhatti, M.I., Al-Shanfari, H.: Econometric analysis of model selection and model testing. Routledge (2017)
Book Google Scholar
Box, G.E., Meyer, R.D.: An analysis for unreplicated fractional factorials. Technometrics 28, 11–18 (1986)
Article MathSciNet MATH Google Scholar
Deepa, N., Pham, Q.V., Nguyen, D.C., Bhattacharya, S., Prabadevi, B., Gadekallu, T.R., Pathirana, P.N.: A survey on blockchain for big data: approaches, opportunities, and future directions. Future Gener. Comput. Syst. Appear 131, 209 (2022)
Article Google Scholar
Guerra-Zubiaga, D.A., Luong, K.Y.: Energy consumption parameter analysis of industrial robots using design of experiment methodology. Int. J. Sustain. Eng. 14(5), 996–1005 (2021)
Haaland, P.D., O'Connell, M.A.: Inference for effect-saturated fractional factorials. Technometrics 37, 82–93 (1995)
Hamada, M., Balakrishnan, N.: Analyzing unreplicated factorial experiments: a review with some new proposals. Stat. Sin. 8, 1–41 (1998)
Huber, P.J.: Robust Statistics. Wiley, New York (1981)
Lenth, R.V.: Quick and easy analysis of unreplicated factorials. Technometrics 31, 469–473 (1989)
Loughin, T.M., Noble, W.: A permutation test for effects in an unreplicated factorial design. Technometrics 39, 180–190 (1997)
Salmaso, L., Pegoraro, L., Giancristofaro, R.A., Ceccato, R., Bianchi, A., Restello, S., Scarabottolo, D.: Design of experiments and machine learning to improve robustness of predictive maintenance with application to a real case study. Commun. Stat. Simul. Comput. 51(2), 570–582 (2022)
Tsai, S.F.: Analyzing dispersion effects from replicated order-of-addition experiments. J. Qual. Technol. 1, 1–18 (2022)
Wosiak, G., da Silva, J., Sena, S.S., Carneiro-Neto, E.B., Lopes, M.C., Pereira, E.: Investigation of the influence of the void fraction on the energy consumption of a vertical electrolyser under natural convection. J. Environ. Chem. Eng. 10(3), 107577 (2022)


Acknowledgements

We thank two anonymous referees and the handling editor for constructive comments that improved the quality of the paper. We take full responsibility for any remaining errors.

Funding

Not applicable.

Author information

Authors and Affiliations

Department of Statistics and Operations Research, College of Sciences, King Saud University, P. O. Box 2455, Riyadh, 11451, Saudi Arabia
F. Aboukalam & M. Alharbi
SP Jain School of Global Management, Sydney and La Trobe Business School, Melbourne, NSW, 2141, Australia
M. Ishaq Bhatti

Authors

F. Aboukalam
M. Alharbi
M. Ishaq Bhatti

Contributions

Author initials are used as follows: Fayz Aboukalam (FA), Maryam Alharbi (MA), M. Ishaq Bhatti (MIB). FA developed the initial model based on earlier work, participated in the sequence alignment, and drafted the manuscript. MA carried out the simulations under the supervision of FA and MIB. All three authors (FA, MA and MIB) participated in the design of the study and performed the statistical analysis. All authors read and reviewed the final draft and approved the revised manuscript for submission to JSTA.

Corresponding author

Correspondence to M. Ishaq Bhatti.

Ethics declarations

Competing interests

Not applicable.

Ethics Approval and Consent to Participate

Not applicable.

Consent for Publication

All authors give consent for publication.

Appendix

See Tables 1 and 2 here.

Table 1 Critical values $cr(\alpha, n)$ for Lenth, SKM1, SKM2 (A), SKM2 (2A) and SKM2 (3A)


Table 2 Global Sizes and Powers for Lenth, SKM1, SKM2 (A), SKM2 (2A) and SKM2 (3A) for α = 0.15, and selected n = 15, 31, 63, 127


Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


About this article

Cite this article

Aboukalam, F., Alharbi, M. & Bhatti, M.I. Improved Approximation Scales for Unreplicated Factorial Experiments. J Stat Theory Appl 21, 200–216 (2022). https://doi.org/10.1007/s44199-022-00049-x


Received: 13 April 2022
Accepted: 14 September 2022
Published: 25 September 2022
Issue Date: December 2022
DOI: https://doi.org/10.1007/s44199-022-00049-x
