Nonparametric Statistics in Human–Computer Interaction

Wobbrock, Jacob O.; Kay, Matthew

doi:10.1007/978-3-319-26633-6_7

Part of the book series: Human–Computer Interaction Series (HCIS)

Abstract

Data not suitable for classic parametric statistical analyses arise frequently in human–computer interaction studies. Various nonparametric statistical procedures are appropriate and advantageous when used properly. This chapter organizes and illustrates multiple nonparametric procedures, contrasting them with their parametric counterparts. Guidance is given for when to use nonparametric analyses and how to interpret and report their results.

Notes

1. The Mann-Whitney U test has multiple and sometimes confusing names. It is also known as the Wilcoxon-Mann-Whitney test, the Mann-Whitney-Wilcoxon test, and the Wilcoxon rank-sum test. None of these should be confused with the Wilcoxon signed-rank test, which is for one-factor two-level within-subjects designs.
2. Holm’s sequential Bonferroni procedure for three pairwise comparisons uses a significance threshold of \(\alpha = 0.05/3\) for the lowest p-value, \(\alpha = 0.05/2\) for the second lowest p-value, and \(\alpha = 0.05/1\) for the highest p-value. Should a p-value compared in that ascending order fail to be statistically significant, the procedure halts and any subsequent comparisons are regarded as statistically nonsignificant.
3. Rather than using traditional repeated measures ANOVAs, ARTool uses mixed-effects analyses of variance, explained below in the section on Generalized Linear Mixed Models.
4. General Linear Models are often called “linear models” and may be abbreviated “LM.” These should not be confused with Generalized Linear Models, which may be abbreviated “GLM.” However, some texts use “GLM” for linear models and “GZLM” for generalized models. Readers should take care when encountering this family of abbreviations.
5. While not covered in this chapter, LMs and GLMs also offer the ability to use continuous independent variables, not just categorical independent variables (see Chap. 11).
6. Multinomial logistic regression, when used with dichotomous responses such as Yes/No, True/False, Success/Fail, Agree/Disagree, or 1/0, is called “binomial regression.” The GLM for binomial regression uses a “binomial” distribution and “logit” link function. It can be conducted using the glm function in much the same way as Poisson regression explained below, except with the parameter family=binomial.
7. Given data with a large number of zeroes, it is prudent to consider an extension to Poisson regression called “zero-inflated” Poisson regression. This model incorporates binomial regression to predict the probability of a zero alongside Poisson regression to model counts. See the zeroinfl function in the pscl package.
8. Although the canonical link function for the Gamma distribution is actually the “inverse” function, the “log” function is often used because the inverse function can be difficult to estimate due to discontinuity at zero. The two functions provide similar results.
9. This model uses an intercept-only random effect. There are other types of random effects such as slopes-and-intercept random effects that are described in Chap. 11.
10. The ANOVA type indicates how the sums-of-squares are computed. In general, Type III ANOVAs are preferred because they can support conclusions about main effects in the presence of significant interactions. For Type I and Type II ANOVAs, significant main effects cannot safely be interpreted in the presence of significant interactions.
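
As an illustration of the procedure described in Note 2, the following is a minimal Python sketch of Holm's sequential Bonferroni procedure. The function name holm and the example p-values are hypothetical and chosen only for illustration; they do not come from the chapter. The sketch treats a p-value equal to its threshold as significant, which is one common convention.

```python
def holm(p_values, alpha=0.05):
    """Return a list of booleans, one per input p-value, that is True
    where the comparison is significant under Holm's procedure."""
    m = len(p_values)
    # Indices of the p-values in ascending order of p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    significant = [False] * m
    for rank, idx in enumerate(order):
        # Thresholds are alpha/m, alpha/(m-1), ..., alpha/1.
        threshold = alpha / (m - rank)
        if p_values[idx] <= threshold:
            significant[idx] = True
        else:
            # Halt: this and all larger p-values are nonsignificant.
            break
    return significant


# Three pairwise comparisons with hypothetical p-values.
print(holm([0.030, 0.004, 0.020]))
print(holm([0.040, 0.030, 0.001]))
```

In the second call, the p-value 0.030 exceeds its threshold of 0.05/2, so the procedure halts and 0.040 is also treated as nonsignificant even though it is below 0.05.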
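
To make the zero-inflated Poisson model in Note 7 concrete, the sketch below computes its probability mass function in Python as a mixture: with probability pi_zero an observation is a "structural" zero, and otherwise it is drawn from an ordinary Poisson distribution with mean lam. The names zip_pmf, poisson_pmf, lam, and pi_zero are hypothetical and are not the interface of the zeroinfl function; in a fitted regression model, both lam and pi_zero would themselves be predicted from the independent variables.

```python
import math


def poisson_pmf(k, lam):
    """Probability of observing count k under a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def zip_pmf(k, lam, pi_zero):
    """Probability of count k under a zero-inflated Poisson distribution.

    pi_zero is the probability of a structural zero (assumed name);
    with probability 1 - pi_zero the count follows Poisson(lam).
    """
    if k == 0:
        # A zero can arise from either mixture component.
        return pi_zero + (1 - pi_zero) * poisson_pmf(0, lam)
    return (1 - pi_zero) * poisson_pmf(k, lam)


# With 30% structural zeros, zeros are more likely than under plain Poisson.
print(zip_pmf(0, 2.0, 0.3), poisson_pmf(0, 2.0))
```

Setting pi_zero to 0 recovers the ordinary Poisson distribution, which is why the zero-inflated model can be seen as an extension of Poisson regression.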

References

Anderson TW, Darling DA (1952) Asymptotic theory of certain “goodness of fit” criteria based on stochastic processes. Ann Math Stat 23(2):193–212
Anderson TW, Darling DA (1954) A test of goodness of fit. J Am Stat Assoc 49(268):765–769
Brown GW, Mood AM (1948) Homogeneity of several samples. Am Stat 2(3):22
Brown GW, Mood AM (1951) On median tests for linear hypotheses. In: Proceedings of the second Berkeley symposium on mathematical statistics and probability, Berkeley, California. University of California Press, Berkeley, California, pp 159–166
Conover WJ, Iman RL (1981) Rank transformations as a bridge between parametric and nonparametric statistics. Am Stat 35(3):124–129
D’Agostino RB (1986) Tests for the normal distribution. In: D’Agostino RB, Stephens MA (eds) Goodness-of-fit techniques. Marcel Dekker, New York, pp 367–420
Dixon WJ, Mood AM (1946) The statistical sign test. J Am Stat Assoc 41(236):557–566
Fawcett RF, Salter KC (1984) A Monte Carlo study of the F test and three tests based on ranks of treatment effects in randomized block designs. Commun Stat Simul Comput 13(2):213–225
Fisher RA (1921) On the “probable error” of a coefficient of correlation deduced from a small sample. Metron 1(4):3–32
Fisher RA (1922) On the interpretation of \(\chi^{2}\) from contingency tables, and the calculation of P. J R Stat Soc 85(1):87–94
Fisher RA (1925) Statistical methods for research workers. Oliver and Boyd, Edinburgh
Friedman M (1937) The use of ranks to avoid the assumption of normality implicit in the analysis of variance. J Am Stat Assoc 32(200):675–701
Gilmour AR, Anderson RD, Rae AL (1985) The analysis of binomial data by a generalized linear mixed model. Biometrika 72(3):593–599
Greenhouse SW, Geisser S (1959) On methods in the analysis of profile data. Psychometrika 24(2):95–112
Higgins JJ, Blair RC, Tashtoush S (1990) The aligned rank transform procedure. In: Proceedings of the conference on applied statistics in agriculture. Kansas State University, Manhattan, Kansas, pp 185–195
Higgins JJ, Tashtoush S (1994) An aligned rank transform test for interaction. Nonlinear World 1(2):201–211
Higgins JJ (2004) Introduction to modern nonparametric statistics. Duxbury Press, Pacific Grove
Hodges JL, Lehmann EL (1962) Rank methods for combination of independent experiments in the analysis of variance. Ann Math Stat 33(2):482–497
Holm S (1979) A simple sequentially rejective multiple test procedure. Scand J Stat 6(2):65–70
Kolmogorov A (1933) Sulla determinazione empirica di una legge di distribuzione. Giornale dell’Istituto Italiano degli Attuari 4:83–91
Kramer CY (1956) Extension of multiple range tests to group means with unequal numbers of replications. Biometrics 12(3):307–310
Kruskal WH, Wallis WA (1952) Use of ranks in one-criterion variance analysis. J Am Stat Assoc 47(260):583–621
Lehmann EL (2006) Nonparametrics: statistical methods based on ranks. Springer, New York
Levene H (1960) Robust tests for equality of variances. In: Olkin I, Ghurye SG, Hoeffding H, Madow WG, Mann HB (eds) Contributions to probability and statistics. Stanford University Press, Palo Alto, pp 278–292
Liang K-Y, Zeger SL (1986) Longitudinal data analysis using generalized linear models. Biometrika 73(1):13–22
Mann HB, Whitney DR (1947) On a test of whether one of two random variables is stochastically larger than the other. Ann Math Stat 18(1):50–60
Mansouri H (1999a) Aligned rank transform tests in linear models. J Stat Plann Inference 79(1):141–155
Mansouri H (1999b) Multifactor analysis of variance based on the aligned rank transform technique. Comput Stat Data Anal 29(2):177–189
Mansouri H, Paige RL, Surles JG (2004) Aligned rank transform techniques for analysis of variance and multiple comparisons. Commun Stat Theory Methods 33(9):2217–2232
Massey FJ (1951) The Kolmogorov-Smirnov test for goodness of fit. J Am Stat Assoc 46(253):68–78
Mauchly JW (1940) Significance test for sphericity of a normal n-variate distribution. Ann Math Stat 11(2):204–209
McCullagh P (1980) Regression models for ordinal data. J R Stat Soc Ser B 42(2):109–142
Mehta CR, Patel NR (1983) A network algorithm for performing Fisher’s exact test in r \(\times\) c contingency tables. J Am Stat Assoc 78(382):427–434
Nelder JA, Wedderburn RWM (1972) Generalized linear models. J R Stat Soc Ser A 135(3):370–384
Pearson K (1900) On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling. Philos Mag Ser 5 50(302):157–175
Razali NM, Wah YB (2011) Power comparisons of Shapiro-Wilk, Kolmogorov-Smirnov, Lilliefors and Anderson-Darling tests. J Stat Model Anal 2(1):21–33
Richter SJ (1999) Nearly exact tests in factorial experiments using the aligned rank transform. J Appl Stat 26(2):203–217
Salter KC, Fawcett RF (1985) A robust and powerful rank test of treatment effects in balanced incomplete block designs. Commun Stat Simul Comput 14(4):807–828
Salter KC, Fawcett RF (1993) The ART test of interaction: a robust and powerful rank test of interaction in factorial models. Commun Stat Simul Comput 22(1):137–153
Sawilowsky SS (1990) Nonparametric tests of interaction in experimental design. Rev Educ Res 60(1):91–126
Shapiro SS, Wilk MB (1965) An analysis of variance test for normality (complete samples). Biometrika 52(3/4):591–611
Smirnov H (1939) Sur les écarts de la courbe de distribution empirique. Recueil Mathématique (Matematiceskii Sbornik) 6:3–26
Sokal RR, Rohlf FJ (1981) Biometry: the principles and practice of statistics in biological research. W. H. Freeman, Oxford
Stewart WM (1941) A note on the power of the sign test. Ann Math Stat 12(2):236–239
Stiratelli R, Laird N, Ware JH (1984) Random-effects models for serial observations with binary response. Biometrics 40(4):961–971
Student (1908) The probable error of a mean. Biometrika 6(1):1–25
Tukey JW (1949) Comparing individual means in the analysis of variance. Biometrics 5(2):99–114
Tukey JW (1953) The problem of multiple comparisons. Princeton University, Princeton
von Bortkiewicz L (1898) Das Gesetz der kleinen Zahlen (The law of small numbers). Druck und Verlag von B.G. Teubner, Leipzig
Wald A (1943) Tests of statistical hypotheses concerning several parameters when the number of observations is large. Trans Am Math Soc 54(3):426–482
Welch BL (1951) On the comparison of several mean values: an alternative approach. Biometrika 38(3/4):330–336
White H (1980) A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica 48(4):817–838
Wilcoxon F (1945) Individual comparisons by ranking methods. Biomet Bull 1(6):80–83
Wobbrock JO, Findlater L, Gergle D, Higgins JJ (2011) The Aligned Rank Transform for nonparametric factorial analyses using only ANOVA procedures. In: Proceedings of the ACM conference on human factors in computing systems (CHI ’11), Vancouver, British Columbia, 7–12 May 2011. ACM Press, New York, pp 143–146
Zeger SL, Liang K-Y, Albert PS (1988) Models for longitudinal data: a generalized estimating equation approach. Biometrics 44(4):1049–1060

Author information

Authors and Affiliations

The Information School, University of Washington, Seattle, WA, 98195-2840, USA
Jacob O. Wobbrock
Department of Computer Science and Engineering, University of Washington, Seattle, WA, 98195-2350, USA
Matthew Kay

Corresponding author

Correspondence to Jacob O. Wobbrock.

Editor information

Editors and Affiliations

Moray School of Education, Edinburgh University, Edinburgh, United Kingdom
Judy Robertson
Donders Centre for Cognition, Radboud University Nijmegen, Tilburg, The Netherlands
Maurits Kaptein

About this chapter

Cite this chapter

Wobbrock, J.O., Kay, M. (2016). Nonparametric Statistics in Human–Computer Interaction. In: Robertson, J., Kaptein, M. (eds) Modern Statistical Methods for HCI. Human–Computer Interaction Series. Springer, Cham. https://doi.org/10.1007/978-3-319-26633-6_7

DOI: https://doi.org/10.1007/978-3-319-26633-6_7
Published: 23 March 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-26631-2
Online ISBN: 978-3-319-26633-6
eBook Packages: Computer Science, Computer Science (R0)