Abstract
Brain connectome analysis suffers from the high dimensionality of connectivity data, often forcing a reduced representation of the brain at a lower spatial resolution or parcellation. This is particularly true for graph-based representations, which are increasingly used to characterize connectivity gradients, capturing patterns of systematic spatial variation in the functional connectivity structure. However, maintaining a high spatial resolution is crucial for enabling fine-grained topographical analysis and preserving subtle individual differences that might otherwise be lost. Here we introduce a computationally efficient approach to establish spatially fine-grained connectivity gradients. At its core, it leverages a set of landmarks to approximate the underlying connectivity structure at the full spatial resolution without requiring a full-scale vertex-by-vertex connectivity matrix. We show that this approach reduces computational time and memory usage while preserving informative individual features and demonstrate its application in improving brain-behavior predictions. Overall, its efficiency can remove computational barriers and enable the widespread application of connectivity gradients to capture spatial signatures of the connectome. Importantly, maintaining a spatially fine-grained resolution facilitates the characterization of the spatial transitions inherent in the core concept of gradients of brain organization.
Introduction
Graph-based models have become mainstream tools for the visualization and characterization of relationship structures within complex systems in machine learning, computer vision, and bioinformatics. Unfortunately, their application is not without challenges, particularly when dealing with representations at higher resolutions, as their computational and storage demands can exceed the capacity of most computing resources. Data reduction becomes indispensable when these demands make graph-based models infeasible. Such reductions can be achieved through statistical techniques (e.g., principal component or factor analysis), data downsampling approaches (e.g., local averaging based on spatial or topographic structure), or both. Here, using graph-based models in neuroimaging as an example, we demonstrate the need for careful consideration of the order in which these two reduction strategies are applied, and suggest an optimized workflow when both are pursued.
In neuroimaging, graph-based representations are increasingly used to characterize topographic patterns of gradual transitions and boundaries between systems, i.e., connectivity gradients1,2,3. In contrast to typical functional connectivity analysis, which focuses on examining relationships between brain regions4,5, connectivity gradients capture patterns of systematic spatial variation in the functional connectivity structure1,6. They leverage an affinity matrix, which encodes the spatial similarities between region-wise connectivity profiles, to uncover the underlying low-dimensional principles of functional brain organization within the high-dimensional connectivity data. While functional connectivity focuses on the connections between specific brain regions, connectivity gradients provide a more global and continuous representation of functional brain organization1,7. However, maintaining a full spatial representation of the connectome is challenging. The number of connections scales quadratically with the number of nodes: a common spatial resolution of the brain comprising 60,000 cortical nodes (vertices) yields 1,799,970,000 unique pairwise connections (edges). This makes it difficult to characterize the full-scale connectivity structure on consumer-grade hardware without access to dedicated computational resources. A common practice to overcome this challenge is to reduce the spatial resolution of the data, typically by averaging across vertices within spatial regions of interest (ROIs) defined using a parcellation template8. However, such parcellation approaches discard individual-specific detail and may thereby lose meaningful information; there is also little agreement on the appropriate choice of parcellation templates in connectome studies9,10.
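To make the quadratic scaling concrete, the following minimal sketch counts the unique vertex pairs and estimates the memory of a dense connectivity matrix. The helper names and the 4-byte (float32) storage are assumptions chosen for illustration, not details reported in this study.

```python
def n_edges(n_nodes: int) -> int:
    """Number of unique undirected pairwise connections among n_nodes."""
    return n_nodes * (n_nodes - 1) // 2


def dense_matrix_gib(n_rows: int, n_cols: int, bytes_per_value: int = 4) -> float:
    """Memory of a dense n_rows-by-n_cols matrix in GiB (float32 assumed by default)."""
    return n_rows * n_cols * bytes_per_value / 2**30


print(n_edges(60_000))                    # 1799970000
print(dense_matrix_gib(59_412, 59_412))   # full vertex-by-vertex matrix
print(dense_matrix_gib(59_412, 3_000))    # vertex-by-landmark matrix with 3000 landmarks
```

Under these assumptions, the full matrix needs on the order of 13 GiB, whereas a vertex-by-landmark matrix with 3000 landmarks stays below 1 GiB.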
Moreover, the capability to maintain a fine-grained spatial resolution allows researchers to fully characterize the individualized spatial layout of functional regions, an important and often overlooked aspect of modeling brain connectivity11,12,13, which could improve brain-behavior associations. The emphasis on spatial specificity aligns with the core principle of connectivity gradients—capturing axes of systematic connectivity change which delineate gradual transitions across brain areas7. Discrete parcellation schemes might obscure the topography of such smooth transitions and unclear border regions, failing to fully capture the continuous and gradual nature of connectivity gradients1.
Here, to facilitate connectivity gradients at a fine-grained spatial resolution, we propose Fast Connectivity Gradient Approximation (FCGA). Instead of using a full-scale vertex-by-vertex connectivity matrix, at its core, FCGA leverages a set of landmarks to approximate the underlying connectivity structure at full spatial resolution, with an efficiency that makes it possible to run on common computer hardware. These landmarks are flexible, and can, for example, be based on individual vertices or predefined ROIs.
Results and discussion
A schematic workflow of our proposed approach is displayed in Fig. 1a. After (i) establishing a set of k landmarks, (ii) the functional connectivity between all n vertices (or voxels) and the k landmarks is calculated, resulting in an n-by-k connectivity matrix (CMn-by-k) between all pairs of vertices and landmarks. Following current practices when calculating connectivity gradients1,14, we quantify the spatial similarity of vertex-wise connectivity profiles. To do so, (iii) a k-by-k (landmark-by-landmark) connectivity matrix (CMk-by-k) that characterizes the functional connectivity between all pairs of landmarks is established. Then, row-wise thresholding, retaining the 10% strongest positive connections, is applied to both CMn-by-k and CMk-by-k. Subsequently, (iv) the cosine similarity between each row of CMn-by-k (a vertex) and each row of CMk-by-k (a landmark) is calculated. This results in an n-by-k affinity matrix (Wn-by-k) that characterizes the spatial similarity between each vertex-wise connectivity profile and the connectivity profile of each landmark. Finally, (v) we use Principal Component Analysis (PCA) for dimensionality reduction of Wn-by-k, establishing the approximated connectivity gradients.
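Steps (i) to (v) can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the released FCGA implementation: functional connectivity is assumed to be Pearson correlation, landmarks are assumed to be single vertices, PCA is computed through an SVD of the column-centered affinity matrix, and all function names are hypothetical.

```python
import numpy as np


def corr(a, b):
    """Pearson correlation between the columns of a (t-by-p) and b (t-by-q)."""
    az = (a - a.mean(0)) / a.std(0)
    bz = (b - b.mean(0)) / b.std(0)
    return az.T @ bz / a.shape[0]


def row_threshold(cm, keep=0.10):
    """Per row, keep the strongest `keep` fraction of values if positive; zero the rest."""
    cutoff = np.quantile(cm, 1.0 - keep, axis=1, keepdims=True)
    return np.where((cm >= cutoff) & (cm > 0), cm, 0.0)


def unit_rows(x):
    """Scale each row to unit length (all-zero rows are left unchanged)."""
    norm = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.where(norm == 0, 1.0, norm)


def fcga(ts, landmark_idx, n_gradients=5, keep=0.10):
    """ts: t-by-n time series; landmark_idx: indices of the k landmark vertices."""
    lm = ts[:, landmark_idx]                     # (i) landmark time series
    cm_nk = row_threshold(corr(ts, lm), keep)    # (ii) n-by-k connectivity
    cm_kk = row_threshold(corr(lm, lm), keep)    # (iii) k-by-k connectivity
    w_nk = unit_rows(cm_nk) @ unit_rows(cm_kk).T # (iv) cosine-similarity affinity
    w_centered = w_nk - w_nk.mean(0)             # (v) PCA via SVD
    u, s, _ = np.linalg.svd(w_centered, full_matrices=False)
    return u[:, :n_gradients] * s[:n_gradients]  # n-by-n_gradients component scores


# Synthetic example: 200 time points, 500 vertices, 50 evenly spaced landmarks.
rng = np.random.default_rng(0)
ts = rng.standard_normal((200, 500))
landmarks = np.linspace(0, 499, 50).astype(int)
gradients = fcga(ts, landmarks)
print(gradients.shape)  # (500, 5)
```

The largest intermediate arrays in this sketch are n-by-k, so the n-by-n connectivity matrix is never formed.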
We evaluated the validity of FCGA on the group and individual level, using high-resolution fMRI data (59,412 cortical vertices) from the Human Connectome Project (HCP)15 and the Nathan Kline Institute-Rockland Sample (NKI-RS)16. We parametrically varied the amount of downsampling (number of landmarks) across sampling choices (vertex- and ROI-level).
At the group level, we used Spearman rank correlation to compare the spatial similarity between the approximated connectivity gradients (GFCGA) and the connectivity gradients based on the full 59,412-by-59,412 connectivity matrix (Gfull_fc). The spatial similarity between GFCGA and Gfull_fc increased with the number of landmarks used to calculate the approximated gradients (Fig. 1b). Remarkably, using 1000 ROIs as landmarks (~1.7% of the full connectivity matrix), the average spatial similarity across 25 connectivity gradients reached ρ = 0.86. Using 3000 (~5%) uniformly distributed vertices, a spatial similarity of ρ = 0.98 was achieved with <10% of the computational time and memory usage as compared to the calculation of Gfull_fc (Fig. 1c). This was replicated on the individual level, where across 100 individuals, GFCGA with landmarks based on ~5% uniformly distributed vertices yielded an average spatial similarity of ρ = 0.9 (Supplementary Fig. S1). The spatial topography of the top connectivity gradients, shown in Fig. 1d, further illustrates the high spatial similarity between GFCGA and Gfull_fc across a number of sampling choices. Vertex-wise similarities between the gradient profiles of GFCGA and Gfull_fc further emphasized the accuracy of the gradient approximation (Supplementary Fig. S2).
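The spatial similarity measure used here can be illustrated as follows. This is a minimal sketch with hypothetical function names and synthetic data. The simple ranking assumes no tied values, which is reasonable for continuous gradient coefficients; a library routine with tie handling would be the safer choice in practice.

```python
import numpy as np


def ranks(x):
    """Ranks 0..n-1 of a 1-D array (assumes no ties)."""
    r = np.empty(len(x), dtype=float)
    r[np.argsort(x)] = np.arange(len(x))
    return r


def spearman(a, b):
    """Spearman rank correlation, computed as the Pearson correlation of the ranks."""
    return np.corrcoef(ranks(a), ranks(b))[0, 1]


def gradient_similarity(g_approx, g_full):
    """Spearman correlation per gradient for two n-by-m gradient sets (same component order)."""
    return np.array([spearman(g_approx[:, i], g_full[:, i])
                     for i in range(g_full.shape[1])])


# Synthetic stand-ins for full and approximated gradients (not study data).
rng = np.random.default_rng(0)
g_full = rng.standard_normal((1000, 3))
g_approx = g_full + 0.1 * rng.standard_normal(g_full.shape)
rho = gradient_similarity(g_approx, g_full)
print(rho.mean())
```

Because the sign of a PCA component is arbitrary, a strongly negative correlation in such a comparison can still indicate a closely matching spatial pattern.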
At the individual level, reliability and discriminability analysis confirmed the repeatability and the preservation of individual features with GFCGA. To quantify the agreement between GFCGA and Gfull_fc, we calculated the intraclass correlation coefficient (ICC) and discriminability for the same acquisition. We observed an average vertex-wise ICC of >0.9 from 1000 (~1.7%) landmarks onwards, and discriminability was already nearly perfect with 300 (<1%) landmarks (Fig. 2a). Next, we compared the ICC and the discriminability of GFCGA and Gfull_fc across sessions (Fig. 2b). We observed that landmarks based on single vertices (uniformly or randomly distributed) yielded slightly lower ICC and discriminability for GFCGA than for Gfull_fc. However, landmarks based on predefined parcels on the group or individual level yielded a slightly better discriminability and a similar ICC for GFCGA when compared to Gfull_fc (Fig. 2b).
Finally, we evaluated the practical implications of our FCGA approach on brain-wide association studies by predicting age and full-scale intelligence (FSIQ) across the lifespan using the first five gradients. Specifically, we compared the predictive performance of coarse-grained gradients calculated from parcellated time series with that of fine-grained gradients constructed with FCGA and subsequently summarized parcel-wise (Fig. 2c). Importantly, we observed that fine-grained gradients (i.e., GFCGA) subjected to parcel-averaging outperformed coarse-grained gradients for predicting both age and FSIQ across all tested parcellations (pBonferroni < 0.05 in 32/32 comparisons) (Fig. 2c). Notably, for FSIQ we observed that gradients calculated on a low number of parcels did not perform better than chance. The improved prediction for parcel averaging of spatially fine-grained gradients was replicated in the HCP sample across age-adjusted composite cognition scores and an estimate for fluid intelligence (Supplementary Fig. S3). This indicates that constructing connectivity gradients from parcellated data might lose meaningful information that is otherwise preserved in the fine-grained gradient topography.
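The parcel-wise summary of fine-grained gradients could be assembled along the following lines. This is a sketch under assumptions: the function names are hypothetical, label 0 is assumed to mark unassigned vertices, and the gradients and parcel labels are synthetic.

```python
import numpy as np


def parcel_average_gradients(gradients, labels):
    """Average vertex-wise gradients (n-by-m) within each parcel; returns p-by-m."""
    parcel_ids = np.unique(labels[labels > 0])   # label 0 assumed to mean "unassigned"
    return np.stack([gradients[labels == p].mean(axis=0) for p in parcel_ids])


def feature_vector(gradients, labels, n_gradients=5):
    """Flatten the parcel-averaged first n_gradients into one feature vector per individual."""
    return parcel_average_gradients(gradients[:, :n_gradients], labels).ravel()


# Synthetic example: 2000 vertices, 10 gradients, 40 parcels plus label 0.
rng = np.random.default_rng(0)
vertex_gradients = rng.standard_normal((2000, 10))
parcel_labels = rng.integers(0, 41, size=2000)
features = feature_vector(vertex_gradients, parcel_labels)
print(features.shape)
```

In this layout the number of features is the number of parcels times the number of retained gradients, which is one reason to restrict the analysis to a few leading gradients.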
The proposed approximation framework is not limited to the landmarks we used in the current study and is flexible to incorporate study-specific landmarks if needed. For example, landmarks based on individualized parcellations might yield higher discriminability across sessions, indicating potential benefits of optimized landmarks for capturing individual differences. While identifying optimal regional homologies across development or aging requires further research, we demonstrated that the use of uniformly or randomly distributed vertices as landmarks already offers a highly accurate approximation on the group and individual levels.
It is also important to note that a vertex-wise representation is not necessarily superior to parcellation-based representations when the quality of the data is suboptimal17,18. Region-wise averaging helps to increase the signal-to-noise ratio, and spatially fine-grained data can entail computational challenges for further analyses. Nonetheless, our findings indicate that summarizing the gradient coefficients of a vertex-level representation is more informative than calculating the gradients directly on parcellated (downsampled) data. While future research is necessary to evaluate the potential applications for connectivity gradients14,19, an intermediate vertex-level representation might benefit further parcellation-based gradient analysis.
Taken together, our results demonstrate the feasibility of establishing spatially fine-grained connectivity gradients while mitigating the computational burden of vertex-level functional connectivity data by reducing computational time and memory usage. The ability to fully appreciate the spatial layout of functional regions in gradient representations at full resolution could improve associations between functional imaging data and cognitive traits11, and inform structure-function relationships20. Importantly, the advantage of preserving informative individual features in gradients of spatially fine-grained connectivity data was emphasized by the improved prediction of age and intelligence over gradients based on coarse-grained connectivity data. Overall, FCGA strengthens the core concept of connectivity gradients by maintaining the full spatial resolution to study spatial transitions of brain organization. Finally, its efficiency also removes computational barriers and can enable the widespread application of connectivity gradients to capture signatures of the connectome.
Methods
Data
To evaluate the validity of FCGA on the group and individual levels, we used resting-state fMRI data from the HCP15. The HCP data was acquired at Washington University in St. Louis on a customized Siemens 3T Connectome Skyra scanner. Institutional Review Board approval was obtained at Washington University in St. Louis, and written informed consent was obtained for all study participants. All ethical regulations relevant to human research participants were followed. Resting-state fMRI was acquired with a multiband factor of 8, 2 mm isotropic resolution, and a repetition time of 0.72 s for a duration of 14.4 min, resulting in 1200 volumes per acquisition. Participants were asked to relax, keep their eyes open and fixated on a crosshair, and not fall asleep. Four resting-state runs were collected in two sessions on different days (REST1 and REST2), where each session comprised two runs with different phase encoding directions (LR and RL). We used the minimally preprocessed fMRI data provided by the HCP21. Briefly, the resting-state data has been motion corrected, minimally spatially smoothed (2 mm), high-pass filtered (2000 s cutoff), denoised for motion-related confounds and artifacts using independent component analysis22, and spatially aligned to the 2 mm standard CIFTI grayordinates space21,23.
At the group level, we used the dense group-average functional connectome provided as part of the HCP S1200 data release. In brief, this connectivity matrix was constructed from the minimally preprocessed data across 1003 individuals that each had ~1 h (4 × 14.4 min) of resting-state fMRI acquisitions. The dense connectivity matrix of size 91,282 × 91,282 grayordinates (59,412 cortical vertices and 31,870 subcortical voxels) was calculated based on components of an incremental group-PCA. For the group-average connectome of the HCP, we averaged the precomputed functional connectivity measures for each landmark instead of recalculating landmark-based connectivity. At the individual level, we used the 100 unrelated subjects cohort from the HCP (54F/46M, age = 29 ± 3.7 years) and two repeated resting-state fMRI runs (REST1_LR and REST2_LR) for each individual. We used the minimally preprocessed data as provided by the HCP. For both the group and individual levels, we focused our analysis on the 59,412 cortical vertices.
We used a cross-sectional lifespan sample with 313 healthy participants (214 female, age: 6–85 years, 42.2 ± 22.4 years) to evaluate the practical implications of our approach on brain-wide association studies. Participants were selected from the NKI-RS16 if they had no diagnosis of any mental or neurological disorder and passed quality control based on a head motion criterion (mean framewise displacement < 0.25 mm). The NKI-RS data was acquired at the Nathan Kline Institute on a Siemens TrioTim 3 Tesla scanner. Institutional review board approval was obtained at the Nathan Kline Institute, and written informed consent was obtained for all study participants. All ethical regulations relevant to human research participants were followed. Resting-state fMRI was acquired with a multiband factor of 4, 3 mm isotropic resolution, and a repetition time of 0.645 s for a duration of 9.7 min, which resulted in 900 volumes per run. Preprocessing was performed with the Connectome Computational System24 and included discarding the first five time points, compressing temporal spikes, slice timing correction, motion correction, 4D global mean intensity normalization, nuisance regression (Friston’s 24 model, cerebrospinal fluid and white matter), linear and quadratic detrending, band-pass filtering (0.01–0.1 Hz), as well as global signal regression. The preprocessed data were then projected on the 32k fsLR surface template with 32,492 vertices per hemisphere.
Connectivity landmarks
We used a varying number of landmarks across different sampling strategies. We utilized randomly and uniformly distributed vertices across the cortex, where the selected vertices were consistent across individuals. On the ROI-level, connectivity landmarks were defined by the average time series based on group parcellations25,26,27 and individualized parcellations for the HCP sample28.
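Two of these landmark types can be sketched as follows: randomly selected vertices that are kept identical across individuals by fixing the random seed, and ROI-level landmarks defined by parcel-averaged time series. The function names, the seed, and the convention that label 0 marks unassigned vertices are assumptions for illustration. Uniformly distributed vertices would additionally require the surface geometry and are not shown.

```python
import numpy as np


def random_vertex_landmarks(n_vertices, k, seed=0):
    """Draw k distinct vertex indices; a fixed seed yields the same landmarks for everyone."""
    rng = np.random.default_rng(seed)
    return np.sort(rng.choice(n_vertices, size=k, replace=False))


def roi_landmark_timeseries(ts, labels):
    """Average a t-by-n time series within each parcel; returns t-by-p landmark time series."""
    parcel_ids = np.unique(labels[labels > 0])   # label 0 assumed to mean "unassigned"
    return np.stack([ts[:, labels == p].mean(axis=1) for p in parcel_ids], axis=1)


landmarks = random_vertex_landmarks(59_412, 3_000)   # ~5% of the cortical vertices

# Synthetic example for ROI-level landmarks: 100 time points, 1000 vertices, 20 parcels.
rng = np.random.default_rng(1)
ts = rng.standard_normal((100, 1_000))
labels = rng.integers(0, 21, size=1_000)
roi_ts = roi_landmark_timeseries(ts, labels)
print(landmarks.shape, roi_ts.shape)
```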
Comparing gradients
For the group-level analysis and comparisons, no alignment or reordering of the gradients (PCA components) was performed. For the analysis on the individual level, we used one set of reference gradients based on the HCP dense connectome for both the approximated and full-scale gradients. For each individual, sampling strategy, and gradient construction approach, the gradients were aligned to this reference with orthogonal Procrustes alignment29. Orthogonal Procrustes finds the orthogonal transformation (rotation and reflection) that best maps one set of gradients onto another, so that the two sets have a matched component order and matched coefficient signs.
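Orthogonal Procrustes alignment has a closed-form solution based on the singular value decomposition. The sketch below uses a hypothetical function name and synthetic gradients; it illustrates the general technique rather than the exact code used in this study.

```python
import numpy as np


def procrustes_align(source, target):
    """Map `source` (n-by-m) onto `target` (n-by-m) with the best-fitting orthogonal matrix."""
    u, _, vt = np.linalg.svd(source.T @ target)
    rotation = u @ vt   # m-by-m orthogonal matrix minimizing ||source @ R - target||_F
    return source @ rotation, rotation


# Synthetic reference gradients and an "individual" whose components are
# reordered and sign-flipped relative to the reference.
rng = np.random.default_rng(0)
reference = rng.standard_normal((1000, 3))
reorder_and_flip = np.array([[0.0, -1.0, 0.0],
                             [1.0, 0.0, 0.0],
                             [0.0, 0.0, -1.0]])
individual = reference @ reorder_and_flip
aligned, rotation = procrustes_align(individual, reference)
print(np.allclose(aligned, reference))  # True
```

In this example the recovered matrix undoes the reordering and sign flips exactly; with real data the alignment is only the best orthogonal fit in the least-squares sense.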
Intraclass correlation and discriminability
We compared the reliability and discriminability of the approximated gradients (GFCGA) and the full-scale gradients (Gfull_fc) within and across sessions, using the two repeated resting-state acquisitions from the 100 unrelated individuals of the HCP sample. Within-session analysis treated GFCGA and Gfull_fc of the same acquisition as repeated measures. Across-session analysis evaluated differences in the reliability and reproducibility between GFCGA and Gfull_fc. Discriminability30 was used to measure the similarity of connectivity gradients. It is a nonparametric multivariate statistic that quantifies the degree to which repeated measurements of the same individual (e.g., gradients of different approaches or sessions) are more similar to each other than to measurements of other individuals. At the vertex level, we quantified the ICC, a univariate measure of the degree of absolute agreement31,32.
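Both statistics can be illustrated with a short sketch. The ICC variant shown is ICC(2,1) of Shrout and Fleiss (two-way random effects, absolute agreement, single measurement), and discriminability is computed with Euclidean distances. Both choices are assumptions for illustration; the tools linked under Code availability may use different estimators, and the data below are synthetic.

```python
import numpy as np


def icc_2_1(y):
    """ICC(2,1) for an n-subjects-by-k-measurements array (absolute agreement)."""
    n, k = y.shape
    grand = y.mean()
    row_means, col_means = y.mean(axis=1), y.mean(axis=0)
    msr = k * np.sum((row_means - grand) ** 2) / (n - 1)   # between-subject mean square
    msc = n * np.sum((col_means - grand) ** 2) / (k - 1)   # between-measurement mean square
    resid = y - row_means[:, None] - col_means[None, :] + grand
    mse = np.sum(resid ** 2) / ((n - 1) * (k - 1))         # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def discriminability(x, subject_ids):
    """Fraction of cases in which a within-subject distance is smaller than
    the between-subject distances (ties count half)."""
    x, ids = np.asarray(x, dtype=float), np.asarray(subject_ids)
    dist = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=2)
    scores = []
    for i in range(len(ids)):
        same = ids == ids[i]
        between = dist[i, ~same]
        for j in np.flatnonzero(same):
            if j != i:
                scores.append(np.mean(between > dist[i, j])
                              + 0.5 * np.mean(between == dist[i, j]))
    return float(np.mean(scores))


# Synthetic example: 30 subjects, 50 vertices, two noisy repeated measurements.
rng = np.random.default_rng(0)
truth = rng.standard_normal((30, 50))
m1 = truth + 0.1 * rng.standard_normal(truth.shape)
m2 = truth + 0.1 * rng.standard_normal(truth.shape)
icc_vertex0 = icc_2_1(np.column_stack([m1[:, 0], m2[:, 0]]))
disc = discriminability(np.vstack([m1, m2]), np.tile(np.arange(30), 2))
print(icc_vertex0, disc)
```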
Prediction of individual-specific measures
We used the NKI-RS lifespan sample to evaluate implications of our proposed approach for the prediction of individual-specific measures such as age and FSIQ, which were only weakly correlated (r = 0.13). As described above, connectivity gradients are often calculated on a reduced data representation on a parcel level. In this study, we compared the predictive performance of connectivity gradients calculated on parcellated time series (parcellation-to-gradients) to a parcel-wise averaging of fine-scale gradients constructed with FCGA (gradients-to-parcellation). Connectivity gradients of the parcellated data are established as in prior work1, using all parcels as landmarks instead of a subset. We evaluated distinct, commonly used parcellations25,26,27, and focused on the first five gradients as features to avoid excessively increasing the feature space. For prediction, we used ridge regression (L2 regularization) as implemented in Glmnet33, and a nested tenfold cross-validation scheme for hyperparameter (lambda) selection. The predictions of each fold were aggregated, and performance was measured with mean absolute error and correlation to the true age and FSIQ values. We repeated the tenfold cross-validation runs 100 times, with random splits for each fold. Additionally, we tested the prediction results against a baseline of 100 prediction runs with randomly shuffled labels to evaluate if the predictive performance is greater than chance.
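The nested cross-validation scheme can be sketched as follows. Note that this sketch swaps in a closed-form ridge solution in NumPy as a stand-in for Glmnet; the lambda grid, function names, and synthetic data are assumptions for illustration, and only a single cross-validation run and a single shuffled-label run are shown.

```python
import numpy as np


def ridge_fit(x, y, lam):
    """Closed-form ridge regression on standardized features; returns a predict function."""
    mu, sd = x.mean(0), x.std(0)
    sd = np.where(sd == 0, 1.0, sd)
    xz, y_mean = (x - mu) / sd, y.mean()
    beta = np.linalg.solve(xz.T @ xz + lam * np.eye(x.shape[1]), xz.T @ (y - y_mean))
    return lambda x_new: ((x_new - mu) / sd) @ beta + y_mean


def kfold(n, k, rng):
    """Random split of the positions 0..n-1 into k folds."""
    return np.array_split(rng.permutation(n), k)


def nested_cv_predict(x, y, lambdas, k=10, seed=0):
    """Out-of-fold predictions; lambda is chosen within each training set only."""
    rng = np.random.default_rng(seed)
    pred = np.empty(len(y))
    for test in kfold(len(y), k, rng):
        train = np.setdiff1d(np.arange(len(y)), test)
        inner_err = np.zeros(len(lambdas))
        for inner_pos in kfold(len(train), k, rng):       # inner folds index into `train`
            inner_val = train[inner_pos]
            inner_train = np.setdiff1d(train, inner_val)
            for li, lam in enumerate(lambdas):
                predict = ridge_fit(x[inner_train], y[inner_train], lam)
                inner_err[li] += np.sum((predict(x[inner_val]) - y[inner_val]) ** 2)
        best_lam = lambdas[int(np.argmin(inner_err))]
        pred[test] = ridge_fit(x[train], y[train], best_lam)(x[test])
    return pred


# Synthetic example: 200 individuals, 20 features, linear signal plus noise.
rng = np.random.default_rng(0)
x = rng.standard_normal((200, 20))
y = x @ rng.standard_normal(20) + 0.5 * rng.standard_normal(200)
lambdas = [0.01, 0.1, 1.0, 10.0]
pred = nested_cv_predict(x, y, lambdas)
mae = np.mean(np.abs(pred - y))
r = np.corrcoef(pred, y)[0, 1]
# Chance baseline: the same procedure with shuffled labels.
y_null = rng.permutation(y)
r_null = np.corrcoef(nested_cv_predict(x, y_null, lambdas), y_null)[0, 1]
print(mae, r, r_null)
```

Repeating both the real and the shuffled-label runs with different seeds would give the distributions needed to compare predictive performance against chance.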
Statistics and reproducibility
Spatial similarity between the approximated connectivity gradients (GFCGA) and the connectivity gradients based on the full connectivity matrix (Gfull_fc) was quantified using Spearman correlation at both group- and individual levels. A tenfold cross-validation scheme was used for the predictive modeling, utilizing a nested cross-validation within the training set for hyperparameter selection. The tenfold cross-validation was repeated 100 times with random splits. Predictive modeling was performed on the NKI-RS lifespan sample and replicated with the HCP sample.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
The datasets used in this work are publicly available at the Human Connectome Project (https://www.humanconnectome.org) and the enhanced Nathan Kline Institute-Rockland Sample data repository (https://fcon_1000.projects.nitrc.org/indi/enhanced). The source data underlying all figures in the manuscript can be found in the Supplementary Data.
Code availability
The preprocessing code for the NKI data is available at https://github.com/zuoxinian/CCS, the code for discriminability is available at https://github.com/neurodata/discriminability, and the code for the intraclass correlation coefficient is available at https://github.com/TingsterX/Reliability_Explorer. Glmnet is available at https://glmnet.stanford.edu/index.html. The Fast Connectivity Gradient Approximation code is available at https://github.com/khne/FastConnectivityGradientApproximation.
References
1. Margulies, D. S. et al. Situating the default-mode network along a principal gradient of macroscale cortical organization. Proc. Natl Acad. Sci. USA 113, 12574–12579 (2016).
2. Bernhardt, B. C., Smallwood, J., Keilholz, S. & Margulies, D. S. Gradients in brain organization. Neuroimage 251, 118987 (2022).
3. Haak, K. V., Marquand, A. F. & Beckmann, C. F. Connectopic mapping with resting-state fMRI. Neuroimage 170, 83–94 (2018).
4. Bullmore, E. & Sporns, O. Complex brain networks: graph theoretical analysis of structural and functional systems. Nat. Rev. Neurosci. 10, 186–198 (2009).
5. Bastos, A. M. & Schoffelen, J.-M. A tutorial review of functional connectivity analysis methods and their interpretational pitfalls. Front. Syst. Neurosci. 9, 175 (2015).
6. Vos de Wael, R. et al. BrainSpace: a toolbox for the analysis of macroscale gradients in neuroimaging and connectomics datasets. Commun. Biol. 3, 1–10 (2020).
7. Huntenburg, J. M., Bazin, P. L. & Margulies, D. S. Large-scale gradients in human cortical organization. Trends Cogn. Sci. 22, 21–31 (2018).
8. Arslan, S. et al. Human brain mapping: a systematic comparison of parcellation methods for the human cerebral cortex. Neuroimage 170, 5–30 (2018).
9. Domhof, J. W. M., Jung, K., Eickhoff, S. B. & Popovych, O. V. Parcellation-induced variation of empirical and simulated brain connectomes at group and subject levels. Netw. Neurosci. 5, 798 (2021).
10. Bryce, N. V. et al. Brain parcellation selection: an overlooked decision point with meaningful effects on individual differences in resting-state functional connectivity. Neuroimage 243, 118487 (2021).
11. Bijsterbosch, J. D. et al. The relationship between spatial configuration and functional connectivity of brain regions. eLife 7, e32992 (2018).
12. Seitzman, B. A. et al. Trait-like variants in human functional brain networks. Proc. Natl Acad. Sci. USA 116, 22851–22861 (2019).
13. Gordon, E. M. et al. Precision functional mapping of individual human brains. Neuron 95, 791 (2017).
14. Hong, S.-J. et al. Toward a connectivity gradient-based framework for reproducible biomarker discovery. Neuroimage 223, 117322 (2020).
15. Van Essen, D. C. et al. The WU-Minn human connectome project: an overview. Neuroimage 80, 62–79 (2013).
16. Nooner, K. B. et al. The NKI-rockland sample: a model for accelerating the pace of discovery science in Psychiatry. Front. Neurosci. 6, 32787 (2012).
17. Betzel, R. F. & Bassett, D. S. Multi-scale brain networks. Neuroimage 160, 73–83 (2017).
18. Bijsterbosch, J. et al. Challenges and future directions for representations of functional brain organization. Nat. Neurosci. 23, 1484–1495 (2020).
19. Kong, R. et al. Comparison between gradients and parcellations for functional connectivity prediction of behavior. Neuroimage 273, 120044 (2023).
20. Haenelt, D. et al. High-resolution quantitative and functional MRI indicate lower myelination of thin and thick stripes in human secondary visual cortex. eLife 12, e78756 (2023).
21. Glasser, M. F. et al. The minimal preprocessing pipelines for the Human Connectome Project. Neuroimage 80, 105–124 (2013).
22. Salimi-Khorshidi, G. et al. Automatic denoising of functional MRI data: combining independent component analysis and hierarchical fusion of classifiers. Neuroimage 90, 449–468 (2014).
23. Robinson, E. C. et al. MSM: a new flexible framework for multimodal surface matching. Neuroimage 100, 414–426 (2014).
24. Xu, T., Yang, Z., Jiang, L., Xing, X.-X. & Zuo, X.-N. A connectome computation system for discovery science of brain. Sci. Bull. 60, 86–95 (2015).
25. Glasser, M. F. et al. A multi-modal parcellation of human cerebral cortex. Nature 536, 171–178 (2016).
26. Gordon, E. M. et al. Generation and evaluation of a cortical area parcellation from resting-state correlations. Cereb. Cortex 26, 288–303 (2016).
27. Schaefer, A. et al. Local-global parcellation of the human cerebral cortex from intrinsic functional connectivity MRI. Cereb. Cortex 28, 3095–3114 (2018).
28. Kong, R. et al. Individual-specific areal-level parcellations improve functional connectivity prediction of behavior. Cereb. Cortex 31, 4477 (2021).
29. Langs, G., Golland, P. & Ghosh, S. S. Predicting activation across individuals with resting-state functional connectivity based multi-atlas label fusion. Med. Image Comput. Comput. Assist. Interv. 9350, 313–320 (2015).
30. Bridgeford, E. W. et al. Eliminating accidental deviations to minimize generalization error and maximize replicability: applications in connectomics and genomics. PLoS Comput. Biol. 17, e1009279 (2021).
31. Shrout, P. E. & Fleiss, J. L. Intraclass correlations: uses in assessing rater reliability. Psychol. Bull. 86, 420–428 (1979).
32. Xu, T. et al. ReX: an integrative tool for quantifying and optimizing measurement reliability for the study of individual differences. Nat. Methods https://doi.org/10.1038/s41592-023-01901-3 (2023).
33. Friedman, J., Hastie, T. & Tibshirani, R. Regularization paths for generalized linear models via coordinate descent. J. Stat. Softw. 33, 1–22 (2010).
Acknowledgements
This work was supported by the National Institutes of Health (NIH) grants RF1MH128696, R01AG047596, P50MH109429, R01MH124045, and R01MH120482.
Contributions
KH.N. Conceptualization, Formal analysis, Methodology, Software, Visualization, Writing – original draft, Writing – review & editing. T.X. Conceptualization, Methodology, Writing – review & editing. A.T. Writing – review & editing. A.R.F. Resources, Writing – review & editing. D.S.M. Writing – review & editing. S.J.C. Funding acquisition, Resources, Writing – review & editing. M.P.M. Conceptualization, Funding acquisition, Resources, Writing – review & editing.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Communications Biology thanks Ma Feilong and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Benjamin Bessieres.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Nenning, KH., Xu, T., Tambini, A. et al. Fast connectivity gradient approximation: maintaining spatially fine-grained connectivity gradients while reducing computational costs. Commun Biol 7, 697 (2024). https://doi.org/10.1038/s42003-024-06401-4
DOI: https://doi.org/10.1038/s42003-024-06401-4