Statistical Significance of the Contribution of Variables to the PCA solution: An Alternative Permutation Strategy

Psychometrika

Abstract

In this paper, the statistical significance of the contribution of variables to the principal components in principal components analysis (PCA) is assessed nonparametrically by the use of permutation tests. We compare a new strategy to a strategy used in previous research consisting of permuting the columns (variables) of a data matrix independently and concurrently, thus destroying the entire correlational structure of the data. This strategy is considered appropriate for assessing the significance of the PCA solution as a whole, but is not suitable for assessing the significance of the contribution of single variables. Alternatively, we propose a strategy involving permutation of one variable at a time, while keeping the other variables fixed. We compare the two approaches in a simulation study, considering proportions of Type I and Type II error. We use two corrections for multiple testing: the Bonferroni correction and controlling the False Discovery Rate (FDR). To assess the significance of the variance accounted for by the variables, permuting one variable at a time, combined with FDR correction, yields the most favorable results. This optimal strategy is applied to an empirical data set, and results are compared with bootstrap confidence intervals.
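The one-variable-at-a-time strategy and the FDR correction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that the variance accounted for (VAF) of a variable is the sum of its squared loadings on the retained components of a correlation-matrix PCA, that the p-value is computed as (number of permuted VAF values at least as large as the observed one + 1) / (number of permutations + 1), and that FDR control uses the Benjamini-Hochberg step-up procedure. The function names, the simulated data, and all parameter values are hypothetical.

```python
import numpy as np


def vaf_per_variable(X, n_components):
    """VAF per variable: sum of squared loadings over the retained components.

    Assumption: PCA is performed on the correlation matrix of X.
    """
    R = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)
    top = np.argsort(eigvals)[::-1][:n_components]
    loadings = eigvecs[:, top] * np.sqrt(np.clip(eigvals[top], 0.0, None))
    return (loadings ** 2).sum(axis=1)


def permute_one_variable_pvalues(X, n_components=1, n_perm=199, seed=0):
    """Permute one column at a time while keeping the other columns fixed."""
    rng = np.random.default_rng(seed)
    observed = vaf_per_variable(X, n_components)
    n_vars = X.shape[1]
    exceed = np.zeros(n_vars)
    for j in range(n_vars):
        Xp = X.copy()
        for _ in range(n_perm):
            # Only column j is shuffled; the remaining structure is intact.
            Xp[:, j] = rng.permutation(X[:, j])
            if vaf_per_variable(Xp, n_components)[j] >= observed[j]:
                exceed[j] += 1
    return (exceed + 1) / (n_perm + 1)


def benjamini_hochberg(pvals, q=0.05):
    """Benjamini-Hochberg step-up procedure; returns a boolean rejection mask."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    below = p[order] <= q * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])  # largest rank meeting its threshold
        reject[order[:k + 1]] = True
    return reject


# Hypothetical data: four variables sharing one factor, two pure-noise variables.
data_rng = np.random.default_rng(1)
factor = data_rng.normal(size=(200, 1))
signal = factor + 0.5 * data_rng.normal(size=(200, 4))
noise = data_rng.normal(size=(200, 2))
X = np.hstack([signal, noise])

p_values = permute_one_variable_pvalues(X, n_components=1, n_perm=199)
significant = benjamini_hochberg(p_values, q=0.05)
print(p_values)
print(significant)
```

Permuting all columns independently, the strategy used in previous research, would correspond to shuffling every column of `Xp` inside the inner loop rather than column `j` alone. The Bonferroni alternative would compare each p-value against `q / m` instead of the rank-dependent thresholds.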


Author information

Corresponding author

Correspondence to Mariëlle Linting.

About this article

Cite this article

Linting, M., van Os, B.J. & Meulman, J.J. Statistical Significance of the Contribution of Variables to the PCA solution: An Alternative Permutation Strategy. Psychometrika 76, 440–460 (2011). https://doi.org/10.1007/s11336-011-9216-6

  • DOI: https://doi.org/10.1007/s11336-011-9216-6
