
A comparison of different synchronized permutation approaches to testing effects in two-level two-factor unbalanced ANOVA designs

  • Regular Article
  • Statistical Papers

Abstract

Analysis of variance (ANOVA) is used to compare the means of several samples. Parametric ANOVA approaches assume normally distributed error terms within subsamples. Permutation tests, such as synchronized permutation tests, are computationally intensive but distribution-free procedures and hence overcome this limitation of parametric methods. Unbalanced designs with differing subsample sizes are frequent in many disciplines. There is a broad literature on parametric testing in unbalanced designs, whereas for permutation tests the topic has only recently received some attention. This paper extends the synchronized permutation method to unbalanced two-level ANOVA designs. A simulation study investigates the behavior of different procedures for various types of unbalanced designs and compares the results with other permutation approaches. The synchronized permutation method yields results comparable to those of the best-performing competing permutation approaches. However, the approach is limited to certain kinds of unbalanced designs.



Acknowledgments

The authors gratefully acknowledge the syntax for the WTPS approach provided by Frank Konietschke.

Author information

Corresponding author

Correspondence to Luigi Salmaso.

Appendices

Appendix 1: Comparing restricted and simple weights

This appendix explains why the restricted weights approach and the fixed weights approach lead to the same results in the simple unbalanced and the crossed unbalanced cases.

The weights in the restricted weights approach are:

$$\begin{aligned}&w_{11}^{*}=\frac{\big [n_{12}n_{22} -\nu (n_{12}+n_{22})\big ]n_{21}}{\big [n_{12}n_{22}-\nu (n_{12}+n_{22})\big ]n_{21} +\big [n_{11}n_{21}-\nu (n_{11}+n_{21}) \big ]n_{22}+\big [n_{12}n_{22}-\nu (n_{12} +n_{22})\big ]n_{11}+\big [n_{11}n_{21} -\nu (n_{11}+n_{21})\big ]n_{12}},\end{aligned}$$
(26)
$$\begin{aligned}&w_{12}^{*}=\frac{\big [n_{11}n_{21}-\nu (n_{11}+n_{21})\big ]n_{22}}{\big [n_{12}n_{22}-\nu (n_{12}+n_{22})\big ]n_{21}+\big [n_{11}n_{21}-\nu (n_{11}+n_{21})\big ]n_{22}+\big [n_{12}n_{22}-\nu (n_{12}+n_{22})\big ]n_{11}+\big [n_{11}n_{21}-\nu (n_{11}+n_{21})\big ]n_{12}},\end{aligned}$$
(27)
$$\begin{aligned}&w_{21}^{*}=\frac{\big [n_{12}n_{22}-\nu (n_{12}+n_{22})\big ]n_{11}}{\big [n_{12}n_{22}-\nu (n_{12}+n_{22})\big ]n_{21}+\big [n_{11}n_{21}-\nu (n_{11}+n_{21})\big ]n_{22}+\big [n_{12}n_{22}-\nu (n_{12}+n_{22})\big ]n_{11}+\big [n_{11}n_{21}-\nu (n_{11}+n_{21})\big ]n_{12}},\end{aligned}$$
(28)
$$\begin{aligned}&w_{22}^{*}=\frac{\big [n_{11}n_{21}-\nu (n_{11}+n_{21})\big ]n_{12}}{\big [n_{12}n_{22}-\nu (n_{12}+n_{22})\big ]n_{21}+\big [n_{11}n_{21}-\nu (n_{11}+n_{21})\big ]n_{22}+\big [n_{12}n_{22}-\nu (n_{12}+n_{22})\big ]n_{11}+\big [n_{11}n_{21}-\nu (n_{11}+n_{21})\big ]n_{12}}. \end{aligned}$$
(29)

Note that we can compute these weights only when the denominator is not zero.
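As a minimal sketch of how Eqs. 26–29 could be evaluated, the following Python function computes the four restricted weights with exact rational arithmetic and returns `None` when the common denominator is zero. The function name `restricted_weights` and the dictionary keys are illustrative choices, and \(\nu\) is treated as a plain integer input because its definition lies in the main text rather than in this appendix.

```python
from fractions import Fraction

def restricted_weights(n11, n12, n21, n22, nu):
    """Evaluate Eqs. 26-29 exactly; return None if the denominator is zero."""
    # The two bracketed factors that appear in the numerators.
    a = n12 * n22 - nu * (n12 + n22)  # used in w11* and w21*
    b = n11 * n21 - nu * (n11 + n21)  # used in w12* and w22*
    numerators = {
        "w11": a * n21,
        "w12": b * n22,
        "w21": a * n11,
        "w22": b * n12,
    }
    # The common denominator is the sum of the four numerators.
    denominator = sum(numerators.values())
    if denominator == 0:
        return None  # weights are undefined in this case
    return {key: Fraction(value, denominator) for key, value in numerators.items()}

# Hypothetical cell sizes, chosen only for illustration.
weights = restricted_weights(4, 6, 5, 7, nu=1)
undefined = restricted_weights(2, 2, 2, 2, nu=1)
```

Because the denominator is the sum of the numerators, the four weights add up to one whenever they are defined. The second call illustrates a setting in which both bracketed factors vanish, so no weights can be computed.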

As in the simple unbalanced design \(n_{11}=n_{12}\) and \(n_{21}=n_{22}\), Eq. 26 can be written in the following way:

$$\begin{aligned} w_{11}^{*}= & {} \frac{\Big [n_{11}n_{21}-\nu (n_{11}+n_{21})\Big ]n_{21}}{\Big [n_{11}n_{21}-\nu (n_{11}+n_{21})\Big ]n_{21}+\Big [n_{11}n_{21}-\nu (n_{11}+n_{21})\Big ]n_{21}+\Big [n_{11}n_{21}-\nu (n_{11}+n_{21})\Big ]n_{11}+\Big [n_{11}n_{21}-\nu (n_{11}+n_{21})\Big ]n_{11}}\end{aligned}$$
(30)
$$\begin{aligned}= & {} \frac{\Big [n_{11}n_{21}-\nu (n_{11}+n_{21})\Big ]n_{21}}{2 \Big [n_{11}n_{21}-\nu (n_{11}+n_{21})\Big ](n_{11}+n_{21})}. \end{aligned}$$
(31)

Since all cell frequencies are positive, and provided the denominator is not zero, the common bracketed factor cancels and Eq. 31 simplifies to

$$\begin{aligned} w_{11}^{*}&=\frac{n_{21}}{2(n_{11}+n_{21})}\\&=\frac{n_{21}}{n_{11}+n_{12}+n_{21}+n_{22}}. \end{aligned}$$

We can apply the same procedure to the other weights and also to the weights when imposing the sample size restrictions of the crossed unbalanced design. For both cases we can finally write the weights in the following way:

$$\begin{aligned} w_{11}^{*}&=\frac{n_{21}}{n_{11}+n_{12}+n_{21}+n_{22}},\end{aligned}$$
(32)
$$\begin{aligned} w_{12}^{*}&=\frac{n_{22}}{n_{11}+n_{12}+n_{21}+n_{22}},\end{aligned}$$
(33)
$$\begin{aligned} w_{21}^{*}&=\frac{n_{11}}{n_{11}+n_{12}+n_{21}+n_{22}},\end{aligned}$$
(34)
$$\begin{aligned} w_{22}^{*}&=\frac{n_{12}}{n_{11}+n_{12}+n_{21}+n_{22}}. \end{aligned}$$
(35)
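The reduction to Eqs. 32–35 can be checked numerically. The sketch below compares the restricted weights of Eqs. 26–29 with the reduced form for a few values of \(\nu\). The cell sizes are hypothetical, and the crossed unbalanced design is assumed here to mean \(n_{11}=n_{22}\) and \(n_{12}=n_{21}\); that restriction is not restated in this appendix, so treat it as an assumption of the example.

```python
from fractions import Fraction

def restricted_weights(n11, n12, n21, n22, nu):
    """Eqs. 26-29 as a tuple (w11, w12, w21, w22), or None if undefined."""
    a = n12 * n22 - nu * (n12 + n22)
    b = n11 * n21 - nu * (n11 + n21)
    numerators = (a * n21, b * n22, a * n11, b * n12)
    denominator = sum(numerators)
    if denominator == 0:
        return None
    return tuple(Fraction(x, denominator) for x in numerators)

def reduced_weights(n11, n12, n21, n22):
    """Eqs. 32-35: each weight is a cell size divided by the total sample size."""
    total = n11 + n12 + n21 + n22
    return tuple(Fraction(x, total) for x in (n21, n22, n11, n12))

designs = {
    "simple": (4, 4, 7, 7),   # n11 = n12 and n21 = n22
    "crossed": (4, 7, 7, 4),  # assumed: n11 = n22 and n12 = n21
}
checks = {}
for name, cells in designs.items():
    checks[name] = all(
        restricted_weights(*cells, nu) == reduced_weights(*cells)
        for nu in range(1, 4)
        if restricted_weights(*cells, nu) is not None
    )
print(checks)

# A design without either restriction does not reduce in this way.
general_matches = restricted_weights(4, 6, 5, 7, 1) == reduced_weights(4, 6, 5, 7)
```

In both restricted designs the two bracketed factors coincide, which is why they cancel. In the general design they differ, so the equality fails.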

When we compare Eqs. 32–35 to the simple weights approach (\(w_{11}^{*}=n_{21}\), \(w_{12}^{*}=n_{22}\), \(w_{21}^{*}=n_{11}\), and \(w_{22}^{*}=n_{12}\)), it is obvious that they are identical except for the denominator, which is not present in the simple weights approach. As the denominator is the total sample size, it is a positive constant that is the same for every permutation. Hence, the permuted test statistics of the two approaches differ only by a constant positive factor. The order of the permuted test statistics does not change, and so the achieved p value is the same. It follows that the simple weights approach should be used in the simple unbalanced and the crossed unbalanced designs: it yields the same results as the restricted weights approach when the denominator of the latter is not zero, and it is also applicable when that denominator is zero.
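The invariance argument can be illustrated with a small sketch. It does not implement the synchronized permutation statistic; it uses a generic two-sample difference in means, with made-up integer data, only to show that multiplying a statistic by a positive constant leaves a permutation p value unchanged. The function name, the data, and the number of random permutations are all illustrative assumptions.

```python
from fractions import Fraction
import random

def permutation_p_value(x, y, scale, n_perm=2000, seed=0):
    """Monte Carlo permutation p value for scale * |mean(x) - mean(y)|."""
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    n_x = len(x)

    def stat(values):
        # Exact rational arithmetic avoids rounding effects in the comparison.
        mean_x = Fraction(sum(values[:n_x]), n_x)
        mean_y = Fraction(sum(values[n_x:]), len(values) - n_x)
        return scale * abs(mean_x - mean_y)

    observed = stat(pooled)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random reassignment of observations to groups
        if stat(pooled) >= observed:
            hits += 1
    return Fraction(hits + 1, n_perm + 1)

x = [12, 15, 11, 18, 14]
y = [9, 10, 13, 8, 7, 11]
total = len(x) + len(y)

p_unscaled = permutation_p_value(x, y, scale=Fraction(1))
p_scaled = permutation_p_value(x, y, scale=Fraction(1, total))
print(p_unscaled == p_scaled)
```

Both calls use the same seed and therefore the same sequence of permutations, so every comparison of a permuted statistic with the observed one has the same outcome whether or not the factor \(1/N\) is applied.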

Appendix 2: Extended simulation results

During the review process we were asked to include further simulation results for more data settings. To save space in the article, these are provided in this appendix.

We extended the second part of the previous simulation study with regard to the following aspects:

  • More subsample size settings: We included subsample sizes that differed by zero, two, five, and nine observations.

  • Effect sizes: We included settings in which the respective effect was inactive. In some of the remaining settings the effect was slightly smaller, and in others slightly larger, than in the previous simulations. We selected the effect sizes such that the procedures did not achieve too high a power, because in that case the behavior of the procedures is almost indistinguishable.

Further, we included the interaction effect as an effect of interest.

We present the results for testing the main effect and the interaction term separately. Each plot contains the results for different minimal subsample sizes, different active effects and different error term distributions. Further, the plots contain the results when there was no effect (solid lines) and when there was an active effect (dashed lines for the smaller effect and dotted lines for the larger effect) in the data.
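For readers who want to reproduce such curves, the quantity plotted for one setting can be sketched as follows: the empirical rejection rate is the share of simulated data sets whose p value does not exceed the nominal level. It estimates the Type I error rate when the effect is inactive and the power when it is active. The level of 0.05, the function names, and the toy p values below are assumptions for illustration only and are not results from the study.

```python
from math import sqrt

def rejection_rate(p_values, alpha=0.05):
    """Share of simulated data sets whose p value is at most alpha."""
    rejections = sum(1 for p in p_values if p <= alpha)
    return rejections / len(p_values)

def monte_carlo_se(rate, n_sim):
    """Binomial standard error of an estimated rejection rate."""
    return sqrt(rate * (1 - rate) / n_sim)

# Toy p values standing in for the output of one simulated setting.
toy_p_values = [0.01, 0.20, 0.04, 0.60, 0.03, 0.50, 0.90, 0.07]
rate = rejection_rate(toy_p_values)
se = monte_carlo_se(rate, len(toy_p_values))
print(rate, se)
```

The standard error gives a rough idea of how much an estimated rate can deviate from the nominal level through simulation noise alone.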

For the main effect A, Fig. 3 presents the results for equally sized samples, Fig. 4 presents the results for samples where the difference between the subsample sizes was two observations, and Figs. 5 and 6 correspond to settings in which the difference in subsample sizes was five and nine observations, respectively.

Fig. 3

Empirical Type I error rates (solid lines) and power curves (dashed and dotted lines) for testing the main effect A in balanced designs with different error term distributions. The gray line corresponds to the nominal \(\alpha \)-level. AN: parametric ANOVA; CF: CSP with fixed weights; UF: USP with fixed weights; RA: permutation of raw data using ANOVA F-statistic; RE: permutation of residuals; WTPS: Wald-type permutation statistic.

Fig. 4

Empirical Type I error rates (solid lines) and power curves (dashed and dotted lines) for testing the main effect A in unbalanced designs with a difference of two observations in the subsample sizes and different error term distributions. The gray line corresponds to the nominal \(\alpha \)-level. AN: parametric ANOVA; CF: CSP with fixed weights; UF: USP with fixed weights; RA: permutation of raw data using ANOVA F-statistic; RE: permutation of residuals; WTPS: Wald-type permutation statistic.

Fig. 5

Empirical Type I error rates (solid lines) and power curves (dashed and dotted lines) for testing the main effect A in unbalanced designs with a difference of five observations in the subsample sizes and different error term distributions. The gray line corresponds to the nominal \(\alpha \)-level. AN: parametric ANOVA; CF: CSP with fixed weights; UF: USP with fixed weights; RA: permutation of raw data using ANOVA F-statistic; RE: permutation of residuals; WTPS: Wald-type permutation statistic.

Fig. 6

Empirical Type I error rates (solid lines) and power curves (dashed and dotted lines) for testing the main effect A in unbalanced designs with a difference of nine observations in the subsample sizes and different error term distributions. The gray line corresponds to the nominal \(\alpha \)-level. AN: parametric ANOVA; CF: CSP with fixed weights; UF: USP with fixed weights; RA: permutation of raw data using ANOVA F-statistic; RE: permutation of residuals; WTPS: Wald-type permutation statistic.

Similarly, for the interaction effect, Fig. 7 presents the results for equally sized samples, and Figs. 8, 9, and 10 correspond to settings in which the difference in subsample sizes was two, five, and nine observations, respectively.

Fig. 7

Empirical Type I error rates (solid lines) and power curves (dashed and dotted lines) for testing the interaction effect in balanced designs with different error term distributions. The gray line corresponds to the nominal \(\alpha \)-level. AN: parametric ANOVA; CF: CSP with fixed weights; UF: USP with fixed weights; RA: permutation of raw data using ANOVA F-statistic; RE: permutation of residuals; WTPS: Wald-type permutation statistic.

Fig. 8

Empirical Type I error rates (solid lines) and power curves (dashed and dotted lines) for testing the interaction effect in unbalanced designs with a difference of two observations in the subsample sizes and different error term distributions. The gray line corresponds to the nominal \(\alpha \)-level. AN: parametric ANOVA; CF: CSP with fixed weights; UF: USP with fixed weights; RA: permutation of raw data using ANOVA F-statistic; RE: permutation of residuals; WTPS: Wald-type permutation statistic.

Fig. 9

Empirical Type I error rates (solid lines) and power curves (dashed and dotted lines) for testing the interaction effect in unbalanced designs with a difference of five observations in the subsample sizes and different error term distributions. The gray line corresponds to the nominal \(\alpha \)-level. AN: parametric ANOVA; CF: CSP with fixed weights; UF: USP with fixed weights; RA: permutation of raw data using ANOVA F-statistic; RE: permutation of residuals; WTPS: Wald-type permutation statistic.

Fig. 10

Empirical Type I error rates (solid lines) and power curves (dashed and dotted lines) for testing the interaction effect in unbalanced designs with a difference of nine observations in the subsample sizes and different error term distributions. The gray line corresponds to the nominal \(\alpha \)-level. AN: parametric ANOVA; CF: CSP with fixed weights; UF: USP with fixed weights; RA: permutation of raw data using ANOVA F-statistic; RE: permutation of residuals; WTPS: Wald-type permutation statistic.

The pattern throughout the different settings is similar to the findings in the article: both the CSP and the USP algorithms need sufficiently large sample sizes to work properly. The new results show that a larger total sample size improves the power behavior of the USP approach. The explanation for this is that the number of possible permutations increases.
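To give a feeling for how quickly the number of rearrangements grows with the cell sizes, the sketch below counts the ways of exchanging \(\nu\) observations between two cells of sizes \(n_1\) and \(n_2\). This is a generic combinatorial count under the assumption that an exchange picks \(\nu\) units from each cell. It is not claimed to equal the cardinality of the CSP or USP permutation space, which depends on details given in the main text. The function names are illustrative.

```python
from math import comb

def exchange_count(n1, n2, nu):
    """Ways to pick nu units from each of two cells and swap them."""
    return comb(n1, nu) * comb(n2, nu)

def total_exchanges(n1, n2):
    """Sum over all feasible nu, from no exchange up to min(n1, n2)."""
    return sum(exchange_count(n1, n2, nu) for nu in range(min(n1, n2) + 1))

# Hypothetical cell sizes for illustration.
small = total_exchanges(3, 3)
medium = total_exchanges(5, 5)
unbalanced = total_exchanges(5, 7)
print(small, medium, unbalanced)
```

Even modest increases in cell size raise the count substantially, which makes the attainable p values finer and is consistent with the explanation given above.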


Cite this article

Hahn, S., Salmaso, L. A comparison of different synchronized permutation approaches to testing effects in two-level two-factor unbalanced ANOVA designs. Stat Papers 58, 123–146 (2017). https://doi.org/10.1007/s00362-015-0690-2


  • DOI: https://doi.org/10.1007/s00362-015-0690-2
