Common permutation methods in animal social network analysis do not control for non-independence

Hart, Jordan D. A.; Weiss, Michael N.; Brent, Lauren J. N.; Franks, Daniel W.

doi:10.1007/s00265-022-03254-x


Methods
Open access
Published: 29 October 2022

Volume 76, article number 151, (2022)

Behavioral Ecology and Sociobiology




Abstract

The non-independence of social network data is a cause for concern among behavioural ecologists conducting social network analysis. This has led to the adoption of several permutation-based methods for testing common hypotheses. One of the most common types of analysis is nodal regression, where the relationships between node-level network metrics and nodal covariates are analysed using a permutation technique known as node-label permutations. We show that, contrary to accepted wisdom, node-label permutations do not automatically account for the non-independences assumed to exist in network data, because regression-based permutation tests still assume exchangeability of residuals. The same assumption also applies to the quadratic assignment procedure (QAP), a permutation-based method often used for conducting dyadic regression. We highlight that node-label permutations produce the same p-values as equivalent parametric regression models, but that in the presence of non-independence, parametric regression models can also produce accurate effect size estimates. We also note that QAP only controls for a specific type of non-independence between edges that are connected to the same nodes, and that appropriate parametric regression models are also able to account for this type of non-independence. Based on this, we suggest that standard parametric models could be used in the place of permutation-based methods. Moving away from permutation-based methods could have several benefits, including reducing over-reliance on p-values, generating more reliable effect size estimates, and facilitating the adoption of causal inference methods and alternative types of statistical analysis.


Introduction

Social network analysis is a central tool in the study of animal sociality. Social networks characterise the structure of social connections between individuals and are useful for answering a wide range of biological questions related to social structure, the evolution of sociality, information and disease transmission, and more (Farine and Whitehead 2015). Social networks are usually analysed quantitatively at three levels: nodal, dyadic, or global. Nodal metrics describe each node’s position in the network relative to the other nodes; dyadic metrics describe each edge’s position in the network; and global network metrics characterise features of the entire network, such as connection density or longest path (Butts 2008). Two common types of hypotheses in animal social network analysis can be characterised as follows: ‘nodal metrics are related to nodal covariates’, and ‘the presence or metric of edges is related to dyadic covariates’ (Dekker et al. 2007; Croft et al. 2011). The types of analyses used to test these hypotheses are known by various names, but we will refer to them as nodal regression and dyadic regression respectively. These analyses usually use permutation-based regression techniques such as node-label permutations or the quadratic assignment procedure (QAP). Node-label permutations have typically been applied to nodal regression and QAP to dyadic regression (Farine 2017). The justification for the use of permutation-based regression tests over parametric regression models is that network data are inherently non-independent and therefore break the assumptions of parametric regression.

The problem of non-independence

Many conventional statistical analyses assume that data are independent (Cohen 1992). This assumption is key to reliable data analysis because it defines the source and nature of noise in data generating processes and is therefore closely linked to null hypothesis significance testing and calculation of p-values. In the case of regression analysis, a noise term is included in the model to account for non-systematic, independent random noise present in the data (Draper and Smith 1998). This assumption is convenient because it has appealing mathematical properties, but in practice it can rarely be met. In the presence of known sources of non-independence, statisticians often use explicit models of the sources of non-independence; for example, autocorrelation models are frequently deployed in time series analysis to account for the known temporal dependencies in sequential data (Wei 2013).

In network data, dependencies are assumed to be more complex. A common example is that undirected node strength is explicitly related to the node strength of every other node in the network, even for nodes that are not directly connected to the node of interest (Sosa et al. 2021). Therefore, noise in the data may be linked to various structural features of the network and would be poorly modelled by an independent noise term. Whether or not the p-value of a statistical analysis can be trusted depends on how well the process that generates noise in the data is described by the model, which in the case of parametric regression models requires independent residuals.

Inappropriate noise terms in statistical models are a major problem when scientific hypotheses are evaluated using null hypothesis significance testing (Anderson and Robinson 2001). Null hypothesis significance testing is based on constructing a null model that describes the data under the assumption that there is no relationship between the variables of interest, so that any apparent relationship between them is due to chance alone (Wasserman 2004). Tests are usually conducted by calculating the p-value, which is the probability, under the null hypothesis, of obtaining coefficient estimates at least as extreme as those observed. Parametric regression tests use the noise term in the model to estimate what coefficient values are likely ‘by chance’, and subsequently to calculate the p-value. If the noise term in the regression model does not approximately match the process that generates noise in the data, then the p-value will not reflect what is expected by chance and therefore will not be reliable.

Permutation tests

The interconnectedness of social networks appears to break the independence assumptions of parametric regression models. This has been a long-term concern of behavioural ecologists conducting social network analyses (Croft et al. 2010). Because permutation tests relax some assumptions about the distributions of noise terms, they have been widely adopted with the aim of enabling regression analysis in the presence of non-independence (Croft et al. 2011). The notion behind permutation regression tests is that if there is no effect, nodal (or dyadic) covariates are equally likely to belong to any node (or dyad). When using node-label permutations, parametric regression is applied to the network and a test statistic such as the coefficient estimate or t-value is recorded. Then the node labels are swapped at random and the test statistic is re-estimated from the new dataset with permuted node labels. This permutation step is repeated many times to build a distribution of test statistic values under the null hypothesis of no relationship between node centrality and nodal covariate. The observed test statistic can then be compared to the null distribution to calculate statistical significance.
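The node-label permutation procedure described above can be sketched in a few lines. This is a minimal illustration in Python using only the standard library, not the code used in this study. The function names `ols_slope` and `node_label_permutation_test` are hypothetical, and the add-one correction in the p-value is an assumption that the text does not specify.

```python
import random

def ols_slope(x, y):
    """Slope of the simple linear regression of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx

def node_label_permutation_test(covariate, metric, n_perm=1000, seed=0):
    """Return the observed slope of metric ~ covariate and a two-sided p-value."""
    rng = random.Random(seed)
    observed = ols_slope(covariate, metric)
    shuffled = list(covariate)
    extreme = 0
    for _ in range(n_perm):
        # Swapping node labels reassigns covariate values to nodes at random.
        rng.shuffle(shuffled)
        if abs(ols_slope(shuffled, metric)) >= abs(observed):
            extreme += 1
    # Assumption: count the observed arrangement once so that p is never zero.
    return observed, (extreme + 1) / (n_perm + 1)
```

Note that the shuffle treats every reassignment of covariates to nodes as equally likely, which is exactly the exchangeability assumption discussed in this section.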

In permutation tests, some confounds can be accounted for by constraining permutations so that values are only swapped between certain data points (Winkler et al. 2015). Constraining permutations is the key notion behind QAP, which works in much the same way as node-label permutations, but because the dyad is the unit of analysis, relabelling nodes effectively permutes all connections of a node at the same time. This controls for any dependence between edges that connect to the same node. Constraining permutations in this way means that the model that calculates the observed test statistic does not account for the confounds being used as constraints and subsequently does not take them into account when calculating effect size estimates. Consequently, effect size estimates computed in this way will be incorrect, to the extent that they may even have the wrong sign (Franks et al. 2021).
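The way QAP moves all of a node's edges together can be illustrated with a short sketch. This is an assumed, simplified version in Python (standard library only) that permutes the response matrix and regresses the dyads above the diagonal with an intercept. The names `upper_triangle`, `slope`, `relabel_nodes`, and `qap_test` are hypothetical, and the sketch does not reproduce variants such as double semi-partialling.

```python
import random

def upper_triangle(m):
    """Flatten the dyads i < j of a square matrix into a list."""
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(i + 1, n)]

def slope(x, y):
    """Slope of the simple linear regression of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def relabel_nodes(m, perm):
    """Apply one node permutation to rows and columns simultaneously."""
    n = len(m)
    return [[m[perm[i]][perm[j]] for j in range(n)] for i in range(n)]

def qap_test(x, y, n_perm=1000, seed=0):
    """Return the observed dyadic slope and a two-sided QAP p-value."""
    rng = random.Random(seed)
    xs = upper_triangle(x)
    observed = slope(xs, upper_triangle(y))
    perm = list(range(len(y)))
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(perm)
        # Each node keeps its full set of edge values; only the labels move.
        permuted = slope(xs, upper_triangle(relabel_nodes(y, perm)))
        if abs(permuted) >= abs(observed):
            extreme += 1
    # Assumption: add-one correction so that p is never exactly zero.
    return observed, (extreme + 1) / (n_perm + 1)
```

Because `relabel_nodes` only reorders rows and columns, the set of edge values attached to each node is preserved, which is the one form of dependence that QAP respects.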

Instead of explicitly assuming a parametric noise term, permutation tests assume that under the null hypothesis, any rearrangement of the data is equally likely (Good 2000). In a regression setting, this generates the null hypothesis of no relationship between the response and covariates. Thus, permutations have the benefit of removing the need for some assumptions about the distributions of noise in data generating processes. The assumption that all permutations of the data must be equally likely under the null hypothesis is known as exchangeability of data points. This means that data points must be freely exchangeable under the null hypothesis without changing their joint probability, which depends on the underlying dependence structure of the data points. In the presence of dependence between data points, unconstrained permutations of the data do not preserve dependence structure (see Fig. 1). This breaks the exchangeability assumption of permutation tests for much the same reason as non-independence breaks the assumptions of parametric regression (Winkler et al. 2015). This is illustrated in Fig. 1A where the data points 1 and 2 are independent, but data points 1 and 3, and 2 and 3 are dependent on each other. This forms a dependence structure that must not be broken by permutations, but node-label permutations freely permute data points and thus break any dependence structure in the data.
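As a compact restatement of the exchangeability condition described above, written in standard notation that is not taken from the original text: for data points Y_1, …, Y_n and any permutation π of their indices, the null hypothesis must imply

$$P\left(Y_1, \dots, Y_n\right) = P\left(Y_{\pi(1)}, \dots, Y_{\pi(n)}\right)$$

Independent and identically distributed data satisfy this for every π, whereas data with a dependence structure such as that in Fig. 1A will generally satisfy it only for permutations that map the dependence structure onto itself.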

The exchangeability condition also applies to QAP, though QAP makes the explicit assumption that dyads are dependent on the nodes to which they are attached. This assumption means that QAP controls for one specific type of non-independence but is not immune to more complex dependencies, such as dyads depending on other aspects of network substructure. Figure 1B shows how QAP restricts permutations on networks to move multiple edges at once, preserving the original dependency structure. Hypothetically speaking, if in QAP edges were permuted freely, as nodes are in node-label permutations, the dependency structure would not be preserved, and invalid permutations would be generated (as shown in Fig. 1C). Therefore, permutation tests do not automatically correct for non-independence, meaning node-label permutations will produce equivalent p-values to comparable parametric regressions, and QAP will provide equivalent p-values to comparable parametric regressions with a term for node dependence (Good 2000).

In this paper, we provide examples to illustrate that, in practice, node-label permutations and parametric regression yield the same true and false positive rates. We show that QAP correctly accounts for a specific type of non-independence, but that alternative non-permutation models are also capable of accounting for such non-independence. We also show that in the presence of non-independence that is not explicitly accounted for, both node-label permutations and QAP yield inflated false positive rates, highlighting that permutations do not automatically control for non-independence. Finally, we discuss the potential benefits of using standard parametric models for regression analysis on network data, such as facilitating the adoption of causal inference.

Methods

In this section, we use network simulations to illustrate that node-label permutations achieve the same true positive rates (power) and false positive rates (type I error) as ordinary least squares in nodal regression to detect trait-based differences in a common node-based measure of centrality. We also use simulations to show that network substructure can introduce dependence structure in the data that neither node-label permutations nor QAP can account for. Finally, to demonstrate that parametric statistical models are able to account for specific types of non-independence in the same way as QAP, we compare QAP to both ordinary least squares and a multimembership linear model that includes a node dependence term.

Simulations: nodal regression

Trait-based strength differences

To demonstrate that node-label permutations perform the same as parametric regression, we compared a standard simple linear regression (LM) to node-label permutations where an LM is used to calculate the test statistics. Note that the LM used here is equivalent to a basic Gaussian generalised linear model with a single predictor. To generate the data, we used the simulation model described by Farine and Whitehead (2015). The simulations assigned a gregariousness score to each individual in the population of size n = 20 from a Poisson distribution. Individuals were then assigned a sex either according to their gregariousness (effect), or at random (no effect). Sampling periods were simulated where the probability of a pair interacting in a sampling period was proportional to the combined gregariousness scores of the two individuals, giving a weighted, undirected network. Node strength was calculated as the sum of each node’s connection strengths. Node strength was regressed against sex using simple linear regression. Node-label permutations were conducted with 10,000 permutations on the networks to generate the null distribution using the slope coefficient (β estimates) as the test statistic. The observed coefficient was compared to the null distribution to compute a two-sided p-value and effect size estimate for the null hypothesis of no effect. This was repeated 1000 times in the presence of both an effect and no effect, and the true positive and false positive rates were computed.

Nodal dependence on clique membership

The non-independence of network data can take many forms, but to demonstrate one possible form, we considered the case where a network is formed from two unknown underlying cliques. Our simulations assigned nodes to one of two cliques at random, with equal probability of being assigned to either clique. Dyads of nodes that were in the same clique had an 80% chance of having a non-zero edge, whereas dyads of nodes that were in different cliques only had a 40% chance of having a non-zero edge. Edge weights were drawn from a uniform U(0,1) distribution. Nodal covariates were assigned according to a linear combination of node strength, a clique dependence variable, and a random noise term drawn from a uniform distribution U(0,1). The clique dependence variables were drawn from a uniform distribution U(0,1) and were used to create an effect of clique membership on nodal covariates. If no effect was being simulated, the coefficient of node strength was set to zero to remove the effect; otherwise, it was set to 0.05. This simulation creates an effect of non-independence because, under the null hypothesis, the size of the cliques will affect node strength, and clique membership affects the nodal covariates. The strength of a node will depend on the size of its clique, which is generated by a stochastic process, so there is potential for spurious correlation between node strengths and nodal covariates. This simulation is designed to simulate the effect of substructures in the network that may be difficult or even impossible to detect either manually or computationally. The simulation was repeated 1000 times with and without the effect, and the two-sided p-values and effect size estimates for each method were recorded.
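One draw from the clique simulation described above might look like the following sketch in Python (standard library only). Several details are assumptions because the text does not state them: the network size n = 20, a single clique dependence value per clique, and unit coefficients on the clique dependence and noise terms. The function name `simulate_clique_network` is hypothetical.

```python
import random

def simulate_clique_network(n=20, beta=0.05, seed=0):
    """Simulate one two-clique network and its nodal covariates.

    beta is the coefficient of node strength: 0.05 for an effect, 0 for none.
    """
    rng = random.Random(seed)
    # Each node joins one of two cliques with equal probability.
    clique = [rng.randrange(2) for _ in range(n)]
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            p_edge = 0.8 if clique[i] == clique[j] else 0.4
            if rng.random() < p_edge:
                # Non-zero edges receive a U(0,1) weight; the network is undirected.
                weights[i][j] = weights[j][i] = rng.random()
    strength = [sum(row) for row in weights]
    # Assumption: one U(0,1) dependence value shared by all members of a clique.
    clique_effect = [rng.random(), rng.random()]
    # Assumption: unit coefficients on the clique term and on the U(0,1) noise.
    covariate = [
        beta * strength[i] + clique_effect[clique[i]] + rng.random()
        for i in range(n)
    ]
    return clique, weights, strength, covariate
```

With `beta=0` the covariate has no direct dependence on strength, yet both quantities still depend on clique membership, which is the source of the spurious correlation discussed above.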

Simulations: dyadic regression

Dyadic dependence on nodes

To demonstrate the performance of QAP against parametric regression, we designed simulations based on those described by Dekker et al. (2007). Specific simulation choices such as distributions and intercepts were made in line with the original study, but the theory holds regardless of these minor details. Simulations were carried out by simulating the response and predictor matrices as being partially dependent on a node-level vector:

$$x_{ij} = r_i + r_j + x'_{ij}$$

$$y_{ij} = \beta' x_{ij} + \left(1 - \beta'\right)\left(s_i + s_j + y'_{ij}\right)$$

where x and y are the observed matrices, r and s are the node dependencies for x and y respectively, x′ and y′ are the true, underlying social preferences, and β′ describes the relationship between x and y. This creates a relationship between x and y when β′ ≠ 0. The underlying matrices x′ and y′ were symmetric of size n = 20, with elements drawn from a uniform U(0,1) distribution. The node dependence vectors r and s were also drawn from a uniform U(0,1) distribution, and the effect parameter β′ was set to either β′ = 0 to simulate no effect, or to β′ = 0.20 to simulate a moderate effect. In line with Dekker et al. (2007), intercepts were not included in the simulation or model, but this does not affect the generality of the results as the resulting model corresponds to a mean-centred response variable.
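The two generating equations above translate directly into code. The following is a minimal sketch in Python (standard library only), assuming that the diagonal of each matrix is left at zero and that the U(0,1) draws are made for the underlying preference values. The function name `simulate_dyadic` is hypothetical.

```python
import random

def simulate_dyadic(n=20, beta=0.2, seed=0):
    """Simulate symmetric predictor and response matrices with node dependence."""
    rng = random.Random(seed)
    r = [rng.random() for _ in range(n)]  # node dependence for x
    s = [rng.random() for _ in range(n)]  # node dependence for y
    x = [[0.0] * n for _ in range(n)]
    y = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            x_pref = rng.random()  # underlying preference x'_ij
            y_pref = rng.random()  # underlying preference y'_ij
            x[i][j] = x[j][i] = r[i] + r[j] + x_pref
            y[i][j] = y[j][i] = beta * x[i][j] + (1 - beta) * (s[i] + s[j] + y_pref)
    return x, y, r, s
```

Setting `beta=0` removes x from the second equation entirely, so any association then detected between x and y is a false positive.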

Previous studies have demonstrated that QAP is effective at accounting for node dependencies in dyadic regression (Dekker et al. 2007). This is not because QAP is a permutation test, but because QAP makes explicit assumptions about the sources of non-independence. The same assumption can also be built into parametric models using random effects. Since each edge depends on two nodes, and each pair of nodes has only one edge between them, conventional random effects cannot be used to control for node dependence, as this would use one random effect per unit of analysis (per dyad). Instead, a random effect is used for each node, and the random effects of the two nodes that an edge connects are both included in the model for that edge. This type of mixed model is often referred to as a multimembership model (Rushmore et al. 2013; Boyland et al. 2016). We implement the following multimembership linear model:

$$y_{ij} = \beta x_{ij} + \left(u_i + u_j\right) + \epsilon_{ij}, \quad \epsilon_{ij} \sim N\left(0, \sigma^2\right)$$

where x and y are the predictor and response matrices, u is a random effect vector describing the influence of each node on its connected dyads, and ϵ is an independent noise term. The vector u is treated as a set of parameters to be learned, which introduces a considerable number of parameters to the model. For computational reasons, the model was fit using numerical least squares with the optim function in R, but these types of models are also supported in R packages such as brms and MCMCglmm (Hadfield 2010; Bürkner 2018; R Core Team 2022).
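To make the least-squares fit concrete, the sketch below estimates β and the node effects u by minimising the squared residuals of y_ij − βx_ij − u_i − u_j over all dyads. It is an illustration in Python (standard library only) and swaps in block coordinate descent for the general-purpose optimiser used in the study. The function name `fit_mmlm` and the fixed number of sweeps are assumptions, and the sketch returns point estimates only, without p-values.

```python
def fit_mmlm(x, y, n_sweeps=500):
    """Least-squares fit of y_ij = beta * x_ij + u_i + u_j (no intercept).

    x and y are symmetric square matrices; only dyads with i != j are used.
    """
    n = len(x)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    sxx = sum(x[i][j] ** 2 for i, j in pairs)
    beta, u = 0.0, [0.0] * n
    for _ in range(n_sweeps):
        # Update beta with the node effects held fixed.
        beta = sum(x[i][j] * (y[i][j] - u[i] - u[j]) for i, j in pairs) / sxx
        # Update each node effect in turn with everything else held fixed:
        # the minimiser is the mean residual over that node's dyads.
        for i in range(n):
            resid = [y[i][j] - beta * x[i][j] - u[j] for j in range(n) if j != i]
            u[i] = sum(resid) / (n - 1)
    return beta, u
```

Each update solves its own one-parameter least-squares problem exactly, so the objective never increases from one sweep to the next. In practice, a mixed-model package that treats u as a random effect would also provide uncertainty estimates.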

The simulated matrices x and y were regressed against each other using the following three methods: a simple linear regression (LM), QAP, and the multimembership linear model described previously (MMLM). The p-values and effect size estimates from each were recorded. The QAP method used 1000 permutations to generate the null distribution. As with the previous simulations, this was repeated 1000 times in the presence of both an effect and no effect, and true positive and false positive rates were computed.

Dyadic dependence on clique membership

As with the simulation of the effect of network substructure on nodal regression, the aim of this simulation was to demonstrate how dependence on network substructures can affect the performance of dyadic regression. To demonstrate the potentially subtle nature of non-independence in dyadic regression, we introduce dependence in a different way to the nodal simulation. In this simulation, we assume that subgraphs of 4 nodes form cliques that affect both the strengths of edges and dyadic covariates within the cliques. Naturally, a dyad may belong to multiple cliques, so it may have a complex structure of dependencies. Cliques of size 4 are used because they are the smallest possible subgraph that does not follow the assumptions of QAP. The rest of the simulation proceeds in the same way as the previous simulation with the three models: LM, MMLM, and QAP.

Results

Plots of the distributions of p-values in the presence and absence of an effect are shown for each of the four simulations in Fig. 2. Under the null hypothesis of no effect, the p-values should be uniformly distributed, whereas under the alternative hypothesis, the p-values should be concentrated towards zero (Wasserman 2004). The distributions of effect size estimates for the dyadic regression simulations are shown in Fig. 3 and should be centred around 0.2 when there is an effect and centred around zero when there is no effect.
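The true and false positive rates referred to throughout can be computed from the recorded p-values as the proportion falling below a significance threshold. The sketch below, in Python with the standard library only, assumes a threshold of 0.05, which this section does not state explicitly. The function name `positive_rate` is hypothetical.

```python
import random

def positive_rate(p_values, alpha=0.05):
    """Proportion of simulations whose p-value falls below alpha.

    This is the false positive rate when no effect was simulated
    and the true positive rate (power) when an effect was simulated.
    """
    return sum(p < alpha for p in p_values) / len(p_values)

# Illustration: p-values that are uniform on (0, 1), as expected from a
# well-calibrated test under the null, give a rate close to alpha.
rng = random.Random(0)
null_p_values = [rng.random() for _ in range(10000)]
null_rate = positive_rate(null_p_values)
```

A false positive rate well above the chosen threshold under the null is the signature of unaccounted non-independence that the simulations are designed to reveal.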