Comparing models of information transfer in the structural brain network and their relationship to functional connectivity: diffusion versus shortest path routing

Neudorf, Josh; Kress, Shaylyn; Borowsky, Ron

doi:10.1007/s00429-023-02613-2

Original Article
Open access
Published: 01 February 2023

Volume 228, pages 651–662, (2023)

Brain Structure and Function

Josh Neudorf¹,
Shaylyn Kress¹ &
Ron Borowsky¹

Abstract

The relationship between structural and functional connectivity in the human brain is a core question in network neuroscience, and a topic of paramount importance to our ability to meaningfully describe and predict functional outcomes. Graph theory has been used to produce measures based on the structural connectivity network that are related to functional connectivity. These measures are commonly based on either the shortest path routing model or the diffusion model, which carry distinct assumptions about how information is transferred through the network. Unlike shortest path routing, which assumes the most efficient path is always known, the diffusion model makes no such assumption, and lets information diffuse in parallel based on the number of connections to other regions. Past research has also developed hybrid measures that use concepts from both models, which have better predicted functional connectivity from structural connectivity than the shortest path length alone. We examined the extent to which each of these models can account for the structure–function relationship of interest using graph theory measures that are exclusively based on each model. This analysis was performed on multiple parcellations of the Human Connectome Project using multiple approaches, which all converged on the same finding. We found that the diffusion model accounts for much more variance in functional connectivity than the shortest path routing model, suggesting that the diffusion model is better suited to describing the structure–function relationship in the human brain at the macroscale.

Introduction

Graph theory analyses of structural brain connectivity have been vital to providing breakthroughs in our understanding of how the underlying structure of the brain can influence the patterns of coordinated functional activity (see Avena-Koenigsberger et al. 2018 for a review; see also Goñi et al. 2014; Neudorf et al. 2020a, b, 2022). Defining this relationship between structural and functional connectivity using advanced techniques including graph theory has recently been highlighted as an important frontier in neuroscience (Suárez et al. 2020). When it comes to choosing graph theory measures of connectivity, important assumptions must be made about how information is transferred through the structural network, and the effectiveness of these measures for predicting functional connectivity is dependent on the accuracy of these assumptions about the human brain. Two primary graph theory models of information transfer in the brain include shortest path routing and diffusion. The shortest path routing model relies on the calculation of the shortest path to the destination region. This model is straightforward to calculate and underlies many useful graph theory measures that have been helpful in describing brain networks and networks in general (e.g., characteristic path length, Watts and Strogatz 1998; global efficiency as an indicator of small-worldness, Latora and Marchiori 2001; nodal and local efficiency, Latora and Marchiori 2001; van den Heuvel and Sporns 2013; etc.). One problem with the shortest path routing model when it comes to brain networks is that it assumes each region has whole-brain level knowledge about the most efficient path to use (Avena-Koenigsberger et al. 2019; Seguin et al. 2018, 2022; Zamani Esfahlani et al. 2022).

An alternative graph theory model has been proposed that does not assume whole-brain knowledge about the shortest path but instead assumes that information diffuses along random paths in the network influenced by the relative weighting of each path. Under these model assumptions, information propagates through the network as a “random walker” that is constrained by the structural architecture. Furthermore, information can be transferred in parallel, whereas shortest path routing describes information traveling along a single path to the destination (Fornito et al. 2016a).

Novel graph theory metrics combining both diffusion and shortest path routing models have been developed for use in brain research and applied to the task of predicting functional connectivity from the underlying structural connectivity (Goñi et al. 2014). Search information was developed as a measure of how many distractor paths may lead a random walker away from the shortest path, while path transitivity measures how likely a random walker on a detour will end up back on the shortest path. While these measures were more successful than the shortest path length alone at predicting functional connectivity from structural connectivity, there are other graph theory measures of connectivity that consider the full range of possible paths based on a diffusion model, rather than hinging on what happens around the shortest path during information transfer. One such measure of diffusion efficiency is the mean first passage time (Wang and Pei 2008), which calculates the number of steps it takes a random walker on average to travel from region A to region B. This measure has been used to show that biological brain networks typically display a balance between diffusion efficiency and global efficiency (sensitive to shortest path length; Goñi et al. 2013). Another measure relying on the diffusion model of information transfer is communicability (Estrada and Hatano 2008), which takes into consideration all possible walks from region A to region B. Walks with fewer edges, n, are weighted much more heavily than those with more, as each walk is weighted by the factor 1/n!. Communicability is described as reflecting the capacity for a network to transfer information in parallel assuming a diffusion model of information transfer (Fornito et al. 2016a, b). This measure has been useful in distinguishing patients from controls in conditions including stroke (Crofts et al. 2011) and multiple sclerosis (Li et al. 2013).

Considering past success with the hybrid measures combining the diffusion and shortest path routing models of information transfer (Goñi et al. 2014), this research will apply the exclusively diffusion-based measures of mean first passage time and communicability, as well as the shortest path routing measure of shortest path length, to structural connectivity. The results will be used to predict functional connectivity to determine to what extent these measures are able to account for variance in functional connectivity. Crucially, this research will extend past research that has examined the ability of multiple graph theory communication measures to predict functional connectivity from structural connectivity (Betzel et al. 2022; Vázquez-Rodríguez et al. 2019; Zamani Esfahlani et al. 2022) and that has benchmarked the ability of different communication measures to predict functional connectivity (Seguin et al. 2018, 2020, 2022), by directly comparing two commonly used models (diffusion and shortest path routing) using multiple linear regression analyses, partial least squares regression, and principal components analysis to determine which graph theory model is most important in this relationship. Research suggests that brain networks (at both the macroscale and microscale) typically demonstrate a balance of diffusion efficiency and global efficiency (Goñi et al. 2013), while also suggesting that this balance may lean more towards dominance of diffusion efficiency in human brains, in which case we expect that the diffusion measures examined here will be more relevant than shortest path length to the structure–function relationship in the brain.

Methods

Dataset

MRI data for 998 subjects from the Human Connectome Project (HCP; Van Essen et al. 2013) were used, including diffusion tensor imaging (DTI) and resting state functional magnetic resonance imaging (rsfMRI). We used the preprocessed version of the rsfMRI data, which had been preprocessed using FSL FIX (Salimi-Khorshidi et al. 2014). The DTI data used were also preprocessed. The HCP pipelines for preprocessing are described by Glasser et al. (2013). The Automated Anatomical Labelling 90 region atlas (AAL; Tzourio-Mazoyer et al. 2002) was used as well as the Brainnetome 246 region atlas (Fan et al. 2016). Activation at each rsfMRI acquisition was used to calculate the mean activation for the atlas regions. The rsfMRI sessions were standardized using a z-score for the regions for each session separately. The activation in these regions was then submitted to bandpass filtering (separately for each session), allowing only frequencies between 0.01 Hz and 0.1 Hz (see Hallquist et al. 2013).
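
The standardization and filtering steps described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the filter type is not specified in the text, so an ideal FFT-based bandpass is assumed here, and the function names, the toy signals, and the repetition time of 0.72 s are all hypothetical.

```python
import numpy as np

def zscore_session(ts):
    """Standardize each region's time series (rows = acquisitions, columns = regions)."""
    return (ts - ts.mean(axis=0)) / ts.std(axis=0)

def bandpass_fft(ts, tr, low=0.01, high=0.1):
    """Zero every Fourier component outside [low, high] Hz along the time axis.

    Assumption: an ideal (brick-wall) filter; the source does not name the filter used.
    """
    freqs = np.fft.rfftfreq(ts.shape[0], d=tr)
    spectrum = np.fft.rfft(ts, axis=0)
    spectrum[(freqs < low) | (freqs > high)] = 0.0
    return np.fft.irfft(spectrum, n=ts.shape[0], axis=0)

# Toy session: 1200 acquisitions, 3 "regions"; tr is an assumed repetition time in seconds.
tr = 0.72
t = np.arange(1200) * tr
slow = np.sin(2 * np.pi * 0.05 * t)   # inside the pass band
fast = np.sin(2 * np.pi * 0.40 * t)   # outside the pass band
session = np.column_stack([slow + fast, slow, fast])

filtered = bandpass_fft(zscore_session(session), tr)
```

In this toy case the slow oscillation survives almost unchanged while the fast one is almost entirely removed. Each session would be processed separately, as in the text.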

Connectivity measures

To calculate the functional connectivity measures for each combination of regions, we calculated the Pearson correlation coefficient using all of the 4800 acquisitions. To calculate the structural connectivity measures, DSI Studio (http://dsi-studio.labsolver.org) was used with quantitative anisotropy (Yeh et al. 2013) as the termination index to calculate the streamline count. Generalized q-sampling (Yeh et al. 2010) was used, and tracking used 1 million fibers, 75° maximum angular deviation, and a 20 mm minimum and 500 mm maximum fiber length. To calculate the structural connectivity matrix containing the number of streamlines for each cell, a whole brain seed was used. The structural and functional connectivity values were then averaged (mean) across all subjects. The weighted structural connectivity density (the sum of connection weights divided by the total possible connection weights, where each weight has a maximum of 1.0) was 0.016 for the AAL atlas and 0.004 for the Brainnetome atlas.
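
As a rough sketch of the two quantities defined above, the snippet below computes a Pearson functional connectivity matrix per subject, averages across subjects, and computes a weighted density. The function names and toy data are hypothetical. How the weights were scaled to a maximum of 1.0 is not stated in the text, so dividing by the largest weight is an assumption.

```python
import numpy as np

def functional_connectivity(ts):
    """Pearson correlation between every pair of regional time series (rows = acquisitions)."""
    fc = np.corrcoef(ts, rowvar=False)
    np.fill_diagonal(fc, 0.0)          # self-connections are not of interest
    return fc

def weighted_density(w):
    """Sum of connection weights divided by the maximum possible sum.

    Assumption: weights are scaled by the largest weight so that each is at most 1.0.
    """
    iu = np.triu_indices(w.shape[0], k=1)
    scaled = w[iu] / w[iu].max()
    return scaled.sum() / scaled.size

# Toy data: 5 subjects, 200 acquisitions, 4 regions.
rng = np.random.default_rng(1)
subjects = [rng.standard_normal((200, 4)) for _ in range(5)]
group_fc = np.mean([functional_connectivity(ts) for ts in subjects], axis=0)

# Toy streamline-count matrix (symmetric, zero diagonal).
w = np.array([[0, 10, 0, 2],
              [10, 0, 5, 0],
              [0, 5, 0, 0],
              [2, 0, 0, 0]], dtype=float)
density = weighted_density(w)
```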

Graph theory structural connectivity measures of mean first passage time (Wang and Pei 2008) and communicability (Estrada and Hatano 2008) were calculated as diffusion model measures (also discussed in Fornito et al. 2016a, b). Mean first passage time was calculated as

$$MFPT_{ij} = \sum_{n = 1}^{N} \left[ \left( I - U_{j} \right)^{-1} \right]_{ni},$$

(1)

where I is the identity matrix, i is the starting node, j is the destination node, and N is the number of regions in the network, and

$$U_{j} = WS^{-1},$$

(2)

but with the jth row set to zero so that a random walker is unable to enter j. W is the weighted structural connectivity adjacency matrix, and

$$S = \left[ \begin{array}{ccc} s_{1} & 0 & 0 \\ 0 & \ddots & \vdots \\ 0 & \ldots & s_{N} \end{array} \right],$$

(3)

with ${s}_{n}$ representing the strength (weighted number of connections) of region n. Communicability was calculated as

$$Com_{ij} = \left[ e^{S^{-1/2} WS^{-1/2}} \right]_{ij},$$

(4)

where $e^{S^{-1/2}WS^{-1/2}}$ is the matrix exponential of $S^{-1/2}WS^{-1/2}$, the reduced structural connectivity adjacency matrix (see Crofts et al. 2011). Long walks are weighted more weakly (by a factor of 1/n!, where n is the number of steps) in this formula, as the series expansion equates to

$$Com_{ij} = \sum_{n = 0}^{\infty} \frac{\left[ \left( S^{-1/2} WS^{-1/2} \right)^{n} \right]_{ij}}{n!}.$$

(5)
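
Equations 1 to 5 can be illustrated with a short NumPy sketch. The function names are hypothetical and this is not the authors' code. It assumes W is symmetric with every region having at least one connection (non-zero strength), computes the matrix exponential through an eigendecomposition (valid because the reduced matrix is symmetric), and sets the diagonal of the mean first passage time matrix to zero as a convention.

```python
import numpy as np

def mean_first_passage_time(w):
    """Eqs. 1-3: expected number of steps for a random walker to first reach j from i."""
    n = w.shape[0]
    strength = w.sum(axis=0)               # s_n, the diagonal of S
    u = w / strength                       # U = W S^{-1}: column k is divided by s_k
    mfpt = np.zeros((n, n))
    for j in range(n):
        u_j = u.copy()
        u_j[j, :] = 0.0                    # zero the jth row: the walker cannot enter j
        fundamental = np.linalg.inv(np.eye(n) - u_j)
        mfpt[:, j] = fundamental.sum(axis=0)   # sum over n of [.]_{ni}, for every start i
    np.fill_diagonal(mfpt, 0.0)            # assumed convention: i == j is not of interest
    return mfpt

def communicability(w):
    """Eq. 4: matrix exponential of the reduced adjacency matrix S^{-1/2} W S^{-1/2}."""
    d = 1.0 / np.sqrt(w.sum(axis=0))
    reduced = d[:, None] * w * d[None, :]
    evals, evecs = np.linalg.eigh(reduced)     # symmetric matrix, so eigh applies
    return (evecs * np.exp(evals)) @ evecs.T   # V diag(e^lambda) V^T

# Toy network: an unweighted path 0 - 1 - 2.
path = np.array([[0, 1, 0],
                 [1, 0, 1],
                 [0, 1, 0]], dtype=float)
mfpt = mean_first_passage_time(path)   # mfpt[0, 2] is 4 steps, mfpt[1, 2] is 3 steps
com = communicability(path)
```

On the path network the walker needs on average 4 steps to get from one end to the other, which is longer than the shortest path of 2 edges because of detours back toward the start. Note that mean first passage time is not symmetric (1 step from node 0 to node 1, but 3 steps from node 1 to node 0).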

Shortest path length was calculated as the shortest path routing model measure using the NetworkX python library (Hagberg et al. 2008; function shortest_path_length, using the Dijkstra algorithm described by Dijkstra 1959, and given the inverse value of structural connectivity edges so that the edges represent resistance in the network).
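
The inversion of edge weights into resistances can be sketched as below. The analysis itself used the NetworkX function shortest_path_length; the standard-library Dijkstra implementation here is only a stand-in so that the example runs without extra dependencies, and the function name and toy matrix are hypothetical.

```python
import heapq
import numpy as np

def shortest_path_lengths(w):
    """All-pairs Dijkstra on resistances 1 / w (zero weights mean no edge)."""
    n = w.shape[0]
    with np.errstate(divide="ignore"):
        resistance = np.where(w > 0, 1.0 / w, np.inf)
    dist = np.full((n, n), np.inf)
    for source in range(n):
        dist[source, source] = 0.0
        heap = [(0.0, source)]
        while heap:
            d, node = heapq.heappop(heap)
            if d > dist[source, node]:
                continue                      # stale heap entry
            for nbr in np.flatnonzero(w[node] > 0):
                candidate = d + resistance[node, nbr]
                if candidate < dist[source, nbr]:
                    dist[source, nbr] = candidate
                    heapq.heappush(heap, (candidate, int(nbr)))
    return dist

# Toy network: the strong 0-1 and 1-2 connections make the detour through
# region 1 (0.25 + 0.5) cheaper than the weak direct 0-2 connection (1.0).
w = np.array([[0, 4, 1],
              [4, 0, 2],
              [1, 2, 0]], dtype=float)
spl = shortest_path_lengths(w)
```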

Permutation testing was performed using 500 null models created from the structural connectivity following the generalized Maslov–Sneppen (Maslov and Sneppen 2002) rewiring algorithm developed by Rubinov and Sporns (2011) for use with weighted networks to control for node strength and degree while randomizing the connection weights. Permutation p-values were calculated as the proportion of null models that accounted for the same or more variance than the model based on the empirical structural connectivity.
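
The p-value definition above amounts to a simple count. The sketch below uses a hypothetical function name and made-up null values purely for illustration; generating the rewired null networks themselves is not shown.

```python
import numpy as np

def permutation_p(observed_r2, null_r2):
    """Proportion of null models with the same or better variance accounted for."""
    null_r2 = np.asarray(null_r2, dtype=float)
    return np.count_nonzero(null_r2 >= observed_r2) / null_r2.size

# Made-up example: 6 of 500 null models match or beat the observed value.
null_values = np.concatenate([np.full(494, 0.05), np.full(6, 0.20)])
p = permutation_p(0.165, null_values)
```

With 500 null models the smallest non-zero value this can return is 1/500 = 0.002, so a result in which no null model matches the empirical model is reported as p < 0.002.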

Results

AAL

Linear regression models

Linear regression models were computed for each log-transformed independent variable (mean first passage time, communicability, and shortest path length) with functional connectivity as the dependent variable using the lm function from the lme4 library (Bates et al. 2015) in R (R Core Team 2018). Mean first passage time demonstrated an inverse relationship with functional connectivity, whereby a high mean first passage time was associated with poorer functional connectivity, as expected, R(4003) = −0.376, p < 0.001 (null model permutation p = 0.044; see Fig. 1A). Communicability demonstrated a positive relationship with functional connectivity, whereby high communicability was associated with better functional connectivity, as expected, R(4003) = 0.316, p < 0.001 (null model permutation p = 0.014; see Fig. 1B). Shortest path length demonstrated an inverse relationship with functional connectivity, whereby a high shortest path length was associated with poorer functional connectivity, as expected, R(4003) = −0.379, p < 0.001 (null model permutation p < 0.002; see Fig. 1C). The magnitude of the structure–function relationship for each of these measures was relatively comparable, so additional multiple linear regression approaches were also taken to determine which measures are primarily driving the relationship between structural and functional connectivity.
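
For a single predictor, the regression reduces to a Pearson correlation over the unique region pairs. The sketch below is illustrative only (the analysis was run in R): the function names and toy data are hypothetical, and the natural logarithm is assumed because the base of the log transform is not stated.

```python
import numpy as np

def upper_triangle(m):
    """Values for each unique pair of regions (excluding the diagonal)."""
    return m[np.triu_indices(m.shape[0], k=1)]

def log_pearson_r(measure, fc):
    """Correlate a log-transformed structural measure with functional connectivity."""
    x = np.log(upper_triangle(measure))   # assumption: natural log, strictly positive values
    y = upper_triangle(fc)
    return np.corrcoef(x, y)[0, 1], x.size - 2

# Toy data with 90 regions, built so that larger values of the measure go with weaker FC.
rng = np.random.default_rng(2)
n_regions = 90
measure = rng.uniform(1.0, 50.0, size=(n_regions, n_regions))
measure = (measure + measure.T) / 2
noise = rng.standard_normal((n_regions, n_regions))
fc = -0.1 * np.log(measure) + 0.2 * (noise + noise.T) / 2
r, df = log_pearson_r(measure, fc)
```

With 90 regions there are 90 × 89 / 2 = 4005 unique connections, which gives the 4003 degrees of freedom reported above.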

Multiple linear regression models

Multiple linear regression models were then investigated starting with mean first passage time and shortest path length included in the model as independent variables, with functional connectivity as the dependent variable. These models were again calculated in R using the lm function from the lme4 library, as well as spcor from the ppcor library to calculate the semi-partial correlation (Kim 2015) and vif from the car library to calculate the variance inflation factor (Fox and Weisberg 2019). Prior to this, the correlation matrix of these measures was examined, which indicated that there were no extreme correlations (e.g., greater than 0.9) between the independent variables, with the highest value being R = 0.841 between mean first passage time and shortest path length (see Table 1). This correlation is theoretically interesting though, as it indicates there is a high level of redundancy between mean first passage time and shortest path length, suggesting that information in the diffusion model naturally follows paths that are similarly efficient when compared to the shortest path. This potential for decentralized information transfer strategies to take advantage of the shortest paths in the network has been noted in past research (Avena-Koenigsberger et al. 2017; Goñi et al. 2014; Seguin et al. 2018; Vázquez-Rodríguez et al. 2020). As seen in Table 2, mean first passage time and shortest path length both produced significant effects, with the shortest path length having a slightly larger semi-partial correlation (see Fig. 2A for predicted vs. empirical functional connectivity). However, when adding communicability to the model as seen in Table 3, the overall variance accounted for increased, and the semi-partial correlation of shortest path length was greatly reduced (though still significant), while the diffusion-based measures of mean first passage time and communicability had a much larger combined magnitude of semi-partial correlation.
This model accounted for more variance than any of the measures independently (R² = 0.165; see Fig. 2B for predicted vs. empirical functional connectivity). It should be noted that the variance inflation factor (VIF) for the shortest path length in model 2 was greater than 5 (VIF = 5.567), indicating that multicollinearity between the independent variables may have affected the variance of the shortest path length coefficient. To address this, we also examined these variables using partial least squares regression, which is robust against multicollinearity.
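
The two diagnostics used here can be written out in a few lines. This is a sketch of the standard definitions rather than the ppcor or car implementations, with hypothetical function names and toy data: the semi-partial correlation of a predictor is the correlation between the dependent variable and the part of that predictor not explained by the other predictors, and the VIF is 1 / (1 − R²) from regressing that predictor on the others.

```python
import numpy as np

def residualize(target, others):
    """Residuals of target after least-squares regression on others (with intercept)."""
    design = np.column_stack([np.ones(len(target)), others])
    beta, *_ = np.linalg.lstsq(design, target, rcond=None)
    return target - design @ beta

def semipartial_and_vif(X, y):
    """Semi-partial correlation with y and variance inflation factor for each column of X."""
    sp, vif = [], []
    for k in range(X.shape[1]):
        resid = residualize(X[:, k], np.delete(X, k, axis=1))
        sp.append(np.corrcoef(y, resid)[0, 1])
        r2_k = 1.0 - resid.var() / X[:, k].var()   # R^2 of predictor k on the others
        vif.append(1.0 / (1.0 - r2_k))
    return np.array(sp), np.array(vif)

# Toy predictors sharing a common source, so they are strongly correlated.
rng = np.random.default_rng(5)
shared = rng.standard_normal(500)
X = np.column_stack([shared + 0.5 * rng.standard_normal(500) for _ in range(3)])
y = X @ np.array([-0.3, 0.2, -0.1]) + rng.standard_normal(500)
sp, vif = semipartial_and_vif(X, y)
```

The squared semi-partial correlation equals the drop in R² when that predictor is removed from the full model, which is why it is a natural way to compare the unique contribution of each measure.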

Table 1 AAL independent variable correlation matrix

Table 2 AAL multiple linear model 1, with dependent variable functional connectivity. R² = 0.157, R_adj² = 0.157 (null model permutation p = 0.012)

Table 3 AAL multiple linear model 2, with dependent variable functional connectivity. R² = 0.165, R_adj² = 0.165 (null model permutation p = 0.016)

Partial least squares regression

A partial least squares regression analysis was conducted with a dependent variable of functional connectivity and independent variables of mean first passage time, communicability, and shortest path length, using the plsr function from the pls library in R (Mevik and Wehrens 2007). The independent variables were log transformed and standardized to have a mean of 0 and a standard deviation of 1. To validate the model and check for overfitting, a k-fold cross-validation scheme was used with 10 folds. The number of components was chosen as the point at which additional components no longer substantially decreased the root mean squared error of prediction. With 2 components included, the root mean squared error of prediction reached its minimum of 0.161, so 2 components were used. Cross-validation determined that the model was able to account for 16.3% (R² = 0.163, R_adj² = 0.162) of the variance in functional connectivity of novel validation samples, while the model accounted for 16.5% (R² = 0.165, R_adj² = 0.164) of the variance in functional connectivity when predicting the data for all connections (null model permutation p = 0.012). These cross-validation results indicate that over-fitting is minimal. Finally, by investigating the coefficients for each of the independent variables, the pattern of results seen in the multiple linear regression model can be confirmed. For the diffusion measures, mean first passage time had a coefficient of −0.037 and communicability had a coefficient of 0.018, whereas shortest path length had a coefficient of −0.025. Note that the negative relationship between structure and function for mean first passage time and shortest path length was expected, as a higher value for these structural measures indicates weaker connectivity, while a positive relationship was expected for communicability as higher values indicate stronger connectivity.
These coefficients support what was observed for the multiple linear regression, indicating that the effects of the diffusion model-based measures were greater in combined magnitude than that of the shortest path length.
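
To make the procedure concrete, the following is a minimal single-response partial least squares sketch (a NIPALS-style PLS1) with 10-fold cross-validation of the root mean squared error of prediction. It is not the pls package's implementation, and the function names, fold assignment, and toy data are all hypothetical.

```python
import numpy as np

def pls1_coefficients(X, y, n_components):
    """Regression coefficients of a PLS1 model fitted on centered X and y."""
    Xk = X - X.mean(axis=0)
    yk = y - y.mean()
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        w = w / np.linalg.norm(w)          # weight vector
        t = Xk @ w                         # component scores
        tt = t @ t
        p = Xk.T @ t / tt                  # X loadings
        q_a = yk @ t / tt                  # y loading
        Xk = Xk - np.outer(t, p)           # deflate X
        yk = yk - q_a * t                  # deflate y
        W.append(w)
        P.append(p)
        q.append(q_a)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    return W @ np.linalg.solve(P.T @ W, q)

def kfold_rmsep(X, y, n_components, k=10, seed=0):
    """Root mean squared error of prediction over k held-out folds."""
    idx = np.random.default_rng(seed).permutation(len(y))
    sq_err = []
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)
        b = pls1_coefficients(X[train], y[train], n_components)
        pred = (X[fold] - X[train].mean(axis=0)) @ b + y[train].mean()
        sq_err.append((y[fold] - pred) ** 2)
    return np.sqrt(np.concatenate(sq_err).mean())

# Toy data: three correlated, standardized predictors.
rng = np.random.default_rng(3)
base = rng.standard_normal(400)
X = np.column_stack([base + 0.5 * rng.standard_normal(400) for _ in range(3)])
X = (X - X.mean(axis=0)) / X.std(axis=0)
y = X @ np.array([-0.04, 0.02, -0.03]) + 0.1 * rng.standard_normal(400)

rmsep_by_ncomp = [kfold_rmsep(X, y, a) for a in (1, 2, 3)]
coefficients = pls1_coefficients(X, y, 2)
```

Comparing the entries of rmsep_by_ncomp mirrors the component-selection rule described above. When as many components as predictors are kept, the coefficients coincide with ordinary least squares, so the benefit under multicollinearity comes from keeping fewer components.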

Principal components analysis

The principal components analysis for functional connectivity, mean first passage time, communicability, and shortest path length shown in Fig. 3 demonstrates the unique component space occupied by each measure. This analysis was conducted using the prcomp function of the core stats library in R. To aid the interpretation of the principal component loadings of each variable, mean first passage time and shortest path length were multiplied by −1 so that larger values indicate better connectivity for all measures. In particular, Principal Component 1 seems to be sensitive to the variance in common between functional connectivity and the graph theory measures, as these all have loadings in the same direction (Fig. 3A and Table 4). Conversely, functional connectivity loads strongly onto Principal Component 2, while the graph theory measures load weakly and in the opposite direction, suggesting that this component identifies variance in functional connectivity that is not well accounted for by the graph theory measures (Fig. 3A and Table 4). Finally, functional connectivity and shortest path length load very weakly onto Principal Component 3, while the loadings for mean first passage time and communicability are strong and in opposite directions, suggesting that this component speaks to the unique position in the component space of these diffusion model measures (Fig. 3B and Table 4).
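
The sign flip and the loadings can be illustrated with a singular value decomposition. This is a sketch with a hypothetical function name and synthetic data, not the prcomp output. Whether the variables were scaled before the analysis is not stated, so standardizing each column is an assumption here.

```python
import numpy as np

def pca_loadings(data):
    """Loadings (one column per component) and proportion of variance explained."""
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)   # assumed standardization
    _, sing, vt = np.linalg.svd(z, full_matrices=False)
    explained = sing ** 2 / np.sum(sing ** 2)
    return vt.T, explained

# Synthetic measures driven by one shared factor.
rng = np.random.default_rng(4)
common = rng.standard_normal(1000)
fc = 0.4 * common + rng.standard_normal(1000)
mfpt = -common + 0.5 * rng.standard_normal(1000)
com = common + 0.5 * rng.standard_normal(1000)
spl = -common + 0.5 * rng.standard_normal(1000)

# Multiply mean first passage time and shortest path length by -1 so that
# larger values indicate better connectivity for every measure.
data = np.column_stack([fc, -mfpt, com, -spl])
loadings, explained = pca_loadings(data)
```

In this synthetic case all four variables load on the first component in the same direction, which is the pattern that the sign flip is meant to make easy to read. The overall sign of each component is arbitrary.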

Table 4 AAL principal components analysis loadings for all 3 principal components
