Introduction

Autism spectrum disorder (ASD) is a developmental syndrome that affects a wide array of cognitive functions, ranging from core deficits in social-communication and restricted and repetitive behaviors to atypical sensory information processing [1, 2]. Researchers have had a lot of success using resting-state fMRI (rs-fMRI) functional connectivity methods to understand how such ASD-related deficits relate to atypical functional connectivity in specific brain networks. Resting-state functional connectivity data are especially well-suited to studying atypical neurophysiological dynamics in ASD because collecting it does not require researchers to design tasks to selectively probe the wide array of atypical cognitive functions and behaviors associated with ASD. In addition, resting-state functional connectivity data from multiple imaging sites can be aggregated and shared, as has been done in the large Autism Brain Imaging Data Exchange (ABIDE) data-sharing resource [3, 4]. However, despite impressive advances made using rs-fMRI to understand how ASD-related behavioral deficits correspond to functional networks in the brain, the field still lacks a whole-brain parcellation of functional networks in ASD individuals, thus leaving researchers to study brain connectivity in ASD by imposing network boundaries and regions of interest from brain maps that are based on data from typically developing (TD) individuals [5,6,7]. A parcellation of the ASD brain is needed to provide a spatial map of whole-brain functional networks that is specific to the ASD group and can be used to make comparisons with parcellations of the TD brain.

Functional parcellations of human cortex have provided useful maps for studying the organization and function of the brain in TD individuals [5,6,7,8]. It is well established that rs-fMRI activity is highly correlated within functional networks and these high correlations between regions within a network reflect direct or indirect anatomical connections [9,10,11,12,13,14,15]. Thus, functional networks are identified and differentiated from one another in parcellation-based maps by grouping together brain regions that have similar patterns of rs-fMRI activity covariance with the whole brain. In this way, rs-fMRI parcellations reflect stable relationships between brain regions that can be used to map the functional organization of the human brain [11, 14, 16,17,18]. In the current study, we provide an ASD-specific functional parcellation of the whole brain using rs-fMRI data.

Our parcellation uses high-quality rs-fMRI data from seventy high-functioning ASD individuals (referred to simply as “ASD individuals” from here on) that were collected using an imaging sequence specifically optimized to achieve high temporal signal-to-noise ratio (tSNR) across the whole-brain, including regions that usually suffer from relatively poor tSNR and distortions due to their close spatial proximity to the sinuses [19, 20]. We used a recently developed parcellation routine that intrinsically incorporates internal consistency and repeatability of the parcellation by keeping only network distinctions that agree across halves of the data over multiple random iterations [21]. We also performed this parcellation routine on a control group of seventy TD individuals that were tightly matched to the ASD group in tSNR, in-scanner motion, age, and IQ, so that we could compare the functional networks identified in the ASD group with the results from the TD-group parcellation. After functionally parcellating the ASD and TD brains, we compared them on measures of network stability and differentiation of subnetworks.

Methods

Participants

Seventy individuals [mean (SD) age = 19 (3.8) years; 14 female] who met the DSM-V criteria for ASD [1], as assessed by a trained clinician, were recruited for this experiment. Specifically, all seventy ASD participants are accurately described as high-functioning individuals with ASD, as they all met Diagnostic and Statistical Manual-IV diagnostic criteria as assessed by an experienced clinician on or near the date of their fMRI scan. Specifically, participants in the ASD group received the autism diagnostic interview (ADI or ADI-R) [22, 23] and the autism diagnostic observation schedule (ADOS, module 3 or 4) [24], administered by a trained, research-reliable clinician. All scores from participants with autism spectrum disorders met cut-off criteria for the category designated as ‘broad autism spectrum disorders’ according to criteria established by the NICHHD/NIDCD Collaborative Programs for Excellence in Autism [25]. In addition, seventy individuals with no history of psychiatric or neurological disorders [mean (SD) age = 19.7 (3.7) years; 19 female] served as the TD control group. There were no significant differences between the two groups in age (t(69) = 1.14, p = 0.26) or overall IQ, as measured by the Wechsler Abbreviated Scale of Intelligence [26]. that was administered within one year of the scanning session to all participants [mean (SD), Full-score IQ, ASD: 114.2 (12.9); t(69) = 1.15; TD: 116.1 (11), p = 0.25]. (Please see Table 1 for a listing of demographic and diagnostic information for the study participants.) Subsets of the resting-state data from these individuals have been used in a number of our previous studies [20, 21, 27,28,29,30,31].

Table 1 Demographic and diagnostic characteristics of study participants.

MRI data acquisition and procedure

Scanning was completed on a General Electric Signa HDxt 3.0 T scanner (GE Healthcare) at the NIH Clinical Center NMR Research Facility. For each participant, T2*-weighted blood oxygen level-dependent (BOLD) images covering the whole brain were acquired using an 8-channel receive-only head coil and a gradient echo single-shot echo planar imaging sequence (repetition time = 3500 ms, echo time = 27 ms, flip angle = 90°, 42 axial contiguous interleaved slices per volume, 3.0-mm slice thickness, 128 × 128 acquisition matrix, single-voxel volume = 1.7 × 1.7 × 3.0 mm, field of view = 22 cm). An acceleration factor of 2 (ASSET) was used to reduce gradient coil heating during the session. In addition to the functional images, a high-resolution T1-weighted anatomical image (magnetization-prepared rapid acquisition with gradient echo—MPRAGE) was obtained (124 axial slices, 1.2 mm3 single-voxel volume, 224 × 224 acquisition matrix, field of view = 24 cm).

During the resting scans, participants were instructed to relax and keep their eyes fixated on a central cross. Each resting scan lasted eight minutes and ten seconds for a total of 140 consecutive whole-brain volumes. Independent measures of cardiac and respiratory cycles were recorded during scanning for later artifact removal.

fMRI data preprocessing

All data were preprocessed using the AFNI software package [32]. First, the initial three TRs from each EPI scan were removed to allow for T1 equilibration. Next, 3dDespike was used to bound outlying time points in each voxel within 4 standard deviations of the time series mean and 3dTshift was used to adjust for slice acquisition time within each volume (to t = 0). 3dvolreg was then used to align each volume of the scan series to the first retained volume of the scan. White matter and large ventricle masks were created from the aligned MPRAGE scan using Freesurfer [33]. These masks were then resampled to EPI resolution, eroded by one voxel to prevent partial volume effects with gray matter voxels, and applied to the volume-registered data to generate white matter and ventricle nuisance regressors prior to spatial blurring. Scans were then spatially blurred by a 6-mm Gaussian kernel (full width at half maximum) and divided by the mean of the voxelwise time series to yield units of percent signal change.

The data were denoised using the ANATICOR preprocessing approach [34]. Nuisance regressors for each voxel included: six head-position parameter time series (three translation, three rotation), one average eroded ventricle time series, one “localized” eroded white matter time series (averaging the time series of all white matter voxels within a 15 mm-radius sphere), eight RETROICOR time series (four cardiac, four respiration) calculated from the cardiac and respiratory measures taken during the scan [35], and five Respiration Volume per Time (RVT) time series to minimize end-tidal CO2 effects from deep breaths [36]. All regressors were detrended with a fourth-order polynomial prior to denoising and the same detrending was applied during nuisance regression to the voxel time series. Finally, the residual time series were spatially transformed to standard anatomical space (Talairach-Tournoux).

To ensure that the fMRI data from both groups were high quality and matched, we measured the temporal signal-to-noise-ratio (tSNR) across the whole brain and a summary of in-scanner head motion using the @1dDiffMag program in AFNI. We calculated the tSNR in each voxel as the time series mean divided by time series standard deviation and selected participants from both groups that had high tSNR values across the whole brain. We used Diffmag (comparable to mean Framewise Displacement [37]), which estimates the average of first differences in frame-to-frame motion across each scan run, to exclude participants with scores greater than 0.2 mm/TR. Both tSNR and in-scanner head motion were matched between the groups (Fig. 1).

Fig. 1: High-quality fMRI data were matched between the TD and ASD groups.
figure 1

A Both groups had high temporal signal-to-noise ratio across the whole brain (tSNR – i.e., time series mean divided by time series SD). B There were no significant differences in tSNR between the groups when averaged separately within the cortex and subcortex masks. Head motion was low in all participants and matched across groups (as measured using the DiffMag program in AFNI). Black horizontal lines in the violin plots represent the mean of each measure in each group.

Resting-state parcellation routine

First, we used Freesurfer’s automated segmentation algorithm—that assigns an anatomical label to each voxel in an MRI brain volume based on probabilistic information estimated from a manually labeled training set—to make two masks [29, 38, 39]: a cortical mask that includes cerebellar voxels and a subcortical mask that includes brain stem voxels (Fig. 2A). Voxels with poor tSNR (<10) and prominent blood vessel signal (identified from a standard deviation map of the volume-registered EPI data [40]) were removed from the masks. The cortical mask was then downsampled to 6 mm3-resolution to speed up analysis run times, while the subcortical mask was downsampled to 3 mm3-resolution, because of its smaller starting volume.

Fig. 2: The initial parcellation routine.
figure 2

A The parcellation focused on the cortical (left) and subcortical (right) masks separately. The cortical mask included all cortical and cerebellar voxels, while the subcortical mask included all voxels in the subcortex and brain stem. B The spilt-half agreement curves were constructed across thresholds, picking the threshold that maximized proportion of coverage (i.e., number of voxels assigned a network prototype label) and the number of detected network prototypes (separately for cortical and subcortical masks). After ten iterations, one average parcellation of the retained network prototypes was formed, keeping any network that occurred in at least 50% of iterations. The proportion of coverage (top) and number of detected networks (bottom) were jointly optimized at the 90% threshold in the cortical mask and at the 85% threshold for the subcortical mask in both groups. The error in the line plots represents \(\pm\)1 SEM. C At this stage of the parcellation, we combined the masks so that all network prototypes were in the same space. This ensured that when we next ran the best-match procedure, so that every voxel in the whole brain was assigned a network label, any voxel could have a label that originated in either the cortical or subcortical mask.

We searched for functional network prototypes (i.e., sets of voxels in the group-averaged data with similar patterns of whole brain connectivity) across each mask using the InfoMap clustering algorithm [41, 42]. On each of ten iterations, the seventy participants per group were randomly split in half, and group-average correlation matrices between the mask and whole-brain voxels were calculated for each half of data (done separately for the cortical and subcortical masks). These matrices were made square by correlating each column of the whole-brain x cortical (or subcortical) matrix with themselves. The real-valued correlation matrices were then thresholded into binary (0 or 1) undirected matrices at a range of threshold values (Fig. 2B). The thresholded matrices of each half were then clustered using the InfoMap algorithm to form optimal two-level partitions found over one hundred searches. We chose to use the two-level partition option (with the clusters at the top level and the nodes/voxels that belong to each cluster at the bottom level), instead of a multi-level partitioning, because evaluating whether two hierarchical trees are similar (across halves of data) is a difficult problem to solve, while a flat partitioning of nodes (bottom level) into modules (top level) is sufficient for identifying brain networks and easy to compare across halves of the data. A network prototype was counted as repeating across halves on each iteration if the Dice coefficient [Dice(x,y) = (2*(x∩y))/(x + y)] was \(\ge\) 0.5, and the volume of the intersection was at least 2% of the size of the cortical or subcortical mask, respectively. The intersection of each network prototype that repeated across the two halves of data was retained for that iteration. After repeating the above steps for each of the ten iterations, one average parcellation of the retained network prototypes was formed, keeping voxels from any prototype that co-occurred in 50% or more of the iterations. Agreement curves were constructed across thresholds, and the threshold with the optimal proportion of coverage and number of detected prototypes was identified in each mask. We found that the split-half agreement and the number of detected prototypes were jointly optimized at the 90% threshold in the cortical mask and at the 85% threshold for the subcortical mask in both the TD and ASD group. The jointly optimized threshold was the one at which both the number of parcels retained and the proportion of coverage in each respective mask were at a “stable” point in the agreement curve (Fig. 2B). For example, we chose the 0.85 threshold in the subcortex, because this is the point in the curves where the proportion of coverage is at a local maximum just before it starts a steeper decline (i.e., an unacceptable loss in the number of voxels kept in the mask), while the number of parcels retained is at a relatively flat part of the curve just before a steep increase in the parcels retained that indicates unstable fractionation within the mask. At this stage of the parcellation, every voxel is not guaranteed to have a network label due to the stringent requirements for a parcel to appear in both halves of the data across iterations. Thus, we next used a best-match criterion to ensure that all voxels were labeled in the end.

The detected network prototypes at the optimized thresholds in the cortical and subcortical masks were combined and then assigned to each voxel in the original 2 mm3 whole-brain mask using a best-match criterion. To do so, we first calculated the pattern of connectivity between each network prototype and the whole brain. The pattern of whole-brain functional connectivity for each network prototype was then compared with the pattern of connectivity from each voxel in the whole brain, and we assigned the label of the network prototype with the most similar pattern (Pearson correlation) to that voxel, provided the best match was within a threshold level of similarity (R2 > 0.5). Since the cortical and subcortical voxels were combined before assigning a final network label to each voxel, cortical voxels could, in principle, be labeled as belonging to a subcortical network, and vice versa, according to the best-match criterion.

Calculating the \({{\mathbf{\Delta}}}\) eta2 coefficient

We calculated the eta2 coefficient for every pair of voxels across the whole brain in each participant from both groups [43]. The eta2 coefficient is defined as the ratio of variance in one variable that is explained by another variable. Thus, eta2 varies from 0 when there is no similarity between the variables and 1 when the variables are identical—i.e., a high eta2 indicates that two variables are similar to one another. We used eta2 to compare the whole-brain correlation maps between every pair of voxels and stored the eta2 coefficient in the first voxel location. If a pair of voxels are labeled a and b, then:

$${{eta}}^{2}=1-\frac{{{SS}}_{{within}}}{{{SS}}_{{total}}}=1-\frac{{\sum }_{i=1}^{n}\left[{({a}_{i}-{m}_{i})}^{2}+{({b}_{i}-{m}_{i})}^{2}\right]}{{\sum }_{i=1}^{n}\left[{({a}_{i}-\bar{M})}^{2}+{({b}_{i}-\bar{M})}^{2}\right]}$$
(1)

where ai and bi represent the position i in the correlation maps for a and b, respectively; mi is the mean value of the two maps at position i; and \(\bar{M}\) is the grand mean value across the mean of the two correlation maps (i.e., m). To calculate the \(\Delta\) eta2 coefficient for each voxel, we averaged the eta2 coefficients between it and all other voxels from the same network (within-network eta2 coefficient) and separately averaged the eta2 coefficients between it and all other voxels from outside of the network (between-network eta2 coefficient). We then separately averaged the within- and between-network values across all voxels within a network and then subtracted the between-network eta2 coefficient from the within-network eta2 coefficient to get the \(\Delta\) eta2 coefficient. Thus, if the parcellation identified meaningful functional boundaries in a group, then the \(\Delta\) eta2 coefficient will be significantly positive. We calculated the \(\Delta\) eta2 coefficient for each network in each participant. We then compared the average \(\Delta\) eta2 coefficients between the ASD and TD groups.

Calculating mean differences and null distributions to quantify group differences

To quantify the group differences in the number of networks found and the number of network-specific cortical voxels, respectively, we randomly split the data in half an additional one hundred times in each group and then compared the halves on each iteration. We did this separately in the cortical and subcortical masks. Doing so allowed us to compare one hundred observations from each group to obtain mean differences for our empirical observations between the groups. We also created null distributions by randomly labeling the two hundred halves of data either ASD or TD 25,000 times and generating a mean difference each time. We then evaluated the significance of our findings with permutation testing by comparing the real mean differences to the null distributions generated from the randomly shuffled split halves.

Predicting social and communication symptoms in ASD

We utilized both multiple regression and Ridge regression [44] with leave-one-out cross-validation to evaluate the ability to predict ASD social and communication symptoms from functional connectivity data. Social and communications symptoms in ASD were assessed with both the Social Responsiveness Scale 2 total score (SRS-2, a parent-report survey) [45] and the ADOS combined social and communication score (an in-person assessment by a trained clinician) [24, 25]. Both measures have been used in prior studies examining brain-behavior correlations in ASD [20, 46, 47]. Missing values for individual behavioral assessments were rare (SRS was available for 69/70 ASD participants; ADOS scores were available for 68/70 participants) and were estimated through K-nearest-neighbor imputation (implemented in Matlab by Khan, 2021) over a broader set of demographic and behavioral variables (e.g., Age, Sex, WASI scores, SRS, ADI, and ADOS) [48, 49].

Multiple regression was used first to assess prediction significance, and Ridge regression was used to estimate which beta coefficients were most important in the prediction. Ridge regression is often used to estimate regression coefficients when intercorrelations exist among a large number of predictor variables, as is frequently the case in neuroimaging studies with voxelwise or region-wise measurements. Prior to the main analyses, Age, Motion, and tSNR were regressed out of the average parcel-to-parcel functional connectivity data across participants (using the ASD group parcellation), with each residual variable having 0 mean. The mean of the behavioral variable was also subtracted to yield zero-mean dependent variables, removing any need to fit an intercept in the regression models. The unique combinations of parcel-to-parcel functional connectivity (including average within-parcel connectivity) in 69 of the 70 ASD participants served as independent variables (66 variables in total), with the behavioral score (either SRS or ADOS) serving as the dependent variable. The formed regression model (no intercept variable) was then used to predict the left-out participant’s behavioral score. Across all left-out participants, predicted scores were then correlated with actual scores (Spearman correlation), with chance estimated through random permutation (i.e., the entire process was repeated for randomly shuffled behavioral scores over 5000 iterations, with the rank of the original correlation between predicted/actual values in the permuted distribution determining the p-value of chance predictions).

Ridge regression was then used to estimate the most important beta coefficients in predicting behavior for any measure that was predicted successfully in the prior analyses. The first step in Ridge regression is to estimate the optimal Ridge parameter K, which was accomplished by performing a grid search over the values of K (initial grid: 1,10,100,1000,10,000; then follow-up searching in steps of 1000 between 1000 and 10,000), with each value of K repeated 10 times for stability. During this search, leave-one-out cross-validation was used along with random permutation to estimate chance predictions (5000 iterations). The optimal value of K was taken to be the one yielding the prediction with the lowest chance likelihood.

Finally, Ridge regression with bootstrap resampling over all 70 participants (10,000 iterations) was used to estimate the sampling distributions of the regression coefficients at the optimal K, permitting estimates of which coefficients differed significantly from 0. Multiple comparisons were corrected by False Discovery Rate [50]. This analysis served to identify which parcel combinations were most important to the behavioral prediction, and thereby, which brain networks were most involved.

Results

Weaker differentiation of cerebellar networks in the ASD group

After detecting network prototypes in the subcortical and cortical masks separately (Fig. 2C), we then combined the masks and found the best match to each prototype in every voxel across the whole brain in each group. In the TD group, we identified twelve whole-brain functional networks—six of the networks originated from prototypes in the cortical mask and the other six from prototypes in the subcortical mask (Fig. 3A). By contrast, in the ASD group, we identified eleven whole-brain functional networks—five of the networks originated from prototypes in the cortical mask and the other six from prototypes in the subcortical mask (Fig. 3B). The difference in the number of prototypes between the groups is due to the TD parcellation identifying two network prototypes in the cerebellum—roughly speaking, an anterior and posterior (Crus I/II and VIIB) prototype (dark green and brown, respectively, in Fig. 3A, C)—while the ASD parcellation returned just one network prototype in the cerebellum (dark green voxels in Fig. 3B). Next, we used a permutation test to ask whether there is a difference in the number of cortical parcels between the groups. We compared the mean of group differences across an additional one hundred split halves of the data with a null distribution created by randomly shuffling group membership 25,000 times. Consistent with our initial results, the real mean difference in the cortex was greater than all 25,000 randomly shuffled mean differences in the null distribution (Fig. 3D: mean difference = 0.99, p < 10–5), while the same permutation method in the subcortex revealed that there was not a significant difference between the real mean difference and the null distribution (Fig. 3D: mean diff. = −0.01, p = 0.76). Our finding of one less functional prototype in the cerebellum of the ASD group suggests that the cerebellar networks are weakly differentiated compared to the TD group. One consequence of this atypical differentiation of functional networks in the cerebellum of the ASD group is that an area of the posterior cerebellum is included in a cortical network that overlaps the default mode network (light blue in Fig. 3B).

Fig. 3: Resting-state parcellation of the whole brain in TD and ASD groups.
figure 3

A Twelve networks were identified in the TD group parcellation – six originated from cortical prototypes and the other six from subcortical prototypes. Inflated brains were created using ther HCP Workbench (Marcus et al., 2011). B Eleven networks were identified in the ASD group parcellation – five originated from cortical prototypes and the other six from subcortical prototypes. C The cerebellar networks displayed on a flattened map of the cerebellum. The cerebellum was flattened using the SUIT toolbox (Diedrichsen 2006; Diedrichsen et al., 2009, 2011, 2015). D The difference in the number of network prototypes between the groups was quantified by comparing the mean difference derived from one hundred random split halves of the data from each group (separately in cortical and subcortical masks) with a null distribution of 25,000 comparisons of the split-halves in which the group labels were randomly shuffled before obtaining the mean difference (black dots). The red dots are the actual mean difference in the number of prototypes between the groups. Positive values reflect a greater number of TD prototypes, while negative values correspond to more ASD prototypes.

Next, we examined three properties of the functional networks in both groups: (1) the degree of internal cohesion within each network, (2) the presence of differentiated subnetworks within each network, and (3) the spatial coverage of each network in the cortical and subcortical masks.

Weaker network stability in the ASD group

We used the \(\Delta\) eta2 coefficient as a measure of the degree to which the patterns of whole-brain connectivity are more similar for voxels within the same network compared to voxels from different networks. Thus, higher positive \(\Delta\) eta2 coefficients reflect more cohesive patterns of whole-brain connectivity across voxels from the same network. We found an overall significant decrease in the \(\Delta\) eta2 coefficient in the ASD compared to TD group when averaging across all networks (independent samples t(138) = 3.61, p < 0.001, Cohen’s d = 0.61) and this result was significant in both the cortical and subcortical masks when networks were averaged separately in each mask (both t’s > 2.20, both p’s < 0.05, both d’s > 0.37). We next asked whether decreases of the \(\Delta\) eta2 coefficient in the ASD group were significant in all functional networks or only a subset of networks. We found that the \(\Delta\) eta2 coefficient was significantly lower in the ASD group within five functional networks (Fig. 4: all t’s > 2.48, p’s < 0.01, d’s > 0.42, FDR-corrected, q < 0.05): hippocampal-cortical (pale orange), subcortico-cortical (red), sensorimotor (dark blue), fronto-parietal (pink), and anterior cerebellar (dark green). These results suggest that the whole-brain connectivity patterns from voxels within each of these networks, respectively, are less cohesive in the ASD compared to TD group. We confirmed this interpretation by showing that the reduced \(\Delta\) eta2 coefficient in these regions of the ASD group are due to a greater decrease in within-network eta2 coefficients compared to between-network eta2 coefficients in the ASD group (Supplementary Fig. 1). Next, we tested whether this relative lack of network cohesion influences the organization of subnetworks within each of the affected large-scale networks.

Fig. 4: The ∆ eta2 coefficients from each functional network in the TD and ASD groups.
figure 4

Asterisks represent a significant difference between the groups (p < 0.01). The networks that originated from cortical prototypes are on the top row and the networks that originated from subcortical prototypes are on the bottom row. In both rows, the networks are ordered by size (i.e., number of voxels). Note the sixth cortical network that corresponds to the posterior cerebellum is not shown because it is present in the TD group only.

A relative lack of differentiated subnetworks in subcortex and hippocampus of the ASD group

To understand how weaker network cohesion in some large-scale networks of the ASD group might influence the organization of subnetworks within them, we used our parcellation method on each functional network in turn and evaluated the number of resultant subnetworks. Each large-scale network was treated as a mask and subjected to the same parcellation routine that was used to identify networks in the whole brain. We found differences in the number of subnetworks in the subcortical network that primarily overlaps the thalamus, putamen, and caudate nucleus (red mask in Fig. 3) and the hippocampus (pale orange mask in Fig. 3), but not in the sensorimotor and fronto-parietal networks (dark blue and pink, respectively, in Fig. 3) that also showed a significantly lower \(\Delta\) eta2 coefficient. The subcortex in the TD group divided the thalamus, putamen, and caudate nucleus into three subnetworks, while in the ASD group, the subcortex did not divide into subnetworks—i.e., it remained as one undifferentiated network (Fig. 5A). Similarly, the hippocampus in the ASD group was also less differentiated compared to the TD group (three vs. four subnetworks). Figure 5A shows that this difference in the number of subnetworks between the groups is most apparent on the long axis of the hippocampus. In both the subcortex and hippocampus, the mean difference in number of subnetworks between the groups (2 and 1, respectively) was greater than all 25,000 randomly shuffled mean differences in the null distribution (Fig. 5B – both p’s < 10–5). Note that we restricted our analysis space for these networks to the subcortex mask because, as will be demonstrated in the next section, each of these functional networks differ significantly in area of cortical coverage between the groups.

Fig. 5: Differences in the number of subnetworks identified in the subcortex and hippocampus between the groups.
figure 5

A As expected, the thalamus, putamen, and caudate nucleus were separated into three subnetworks in the TD group (top left), while the hippocampus was divided into four subnetworks along the long axis (top right). By contrast, the subcortex of the ASD group did not divide into subnetworks (i.e., it remained as one undifferentiated network – bottom left) and the hippocampus comprised one less subnetwork compared to the TD group (bottom right). B The difference in the number of subnetworks between the groups was quantified by comparing the mean difference derived from one hundred random split halves of the data from each group (separately in the subcortex and hippocampus) with a null distribution of 25,000 comparisons of the split-halves in which the group labels were randomly shuffled before obtaining the mean difference (black dots). The red dots are the actual mean difference in the number of prototypes between the groups. Positive values reflect a greater number of TD subnetworks, while negative values correspond to more ASD subnetworks.

Atypical subcortico-cortical and hippocampo-cortical integration in the ASD group

In addition, to weaker local differentiation of subnetworks in the subcortex and hippocampus of the ASD group, we next evaluated how the ASD and TD groups differed in the location and size of the neocortical areas that are integrated with the subcortical and hippocampal functional networks. To do so, we overlapped each network from the groups and calculated a ratio of voxels that intersected across groups versus voxels that were specific to one or the other group. We did this separately in the cortical and subcortical masks. For all networks with voxels in the subcortical mask, the ratio of intersecting to non-intersecting voxels was greater than half, so no further analyses were conducted on subcortical voxels. In the cortical mask, the ratio of intersecting to non-intersecting voxels was less than half only for the subcortical and hippocampal networks—i.e., these networks included more group-specific cortical voxels than voxels that intersect between the groups (Fig. 6A). For each network, the mean difference between the number of cortical voxels in the TD and ASD group was greater than all 25,000 randomly shuffled mean differences in the null distribution (Fig. 6B, mean differences, subcortical = 1231.5 voxels, hippocampus = 4974.6 voxels, both p’s < 10–5). A map of the cortical voxels belonging to the subcortical network shows that there are more voxels that independently belong to the ASD group than belong to the TD group or intersect between the groups (Fig. 7A). These ASD-specific voxels are mostly located in and around the dorsolateral temporal cortex and the insula. By contrast, a map of the cortical voxels belonging to the hippocampal network shows that there are more voxels that independently belong to the TD group than belong to the ASD group or intersect between the groups (Fig. 7B). These TD-specific voxels are mostly located in lateral parieto-occipital cortex, around the retrosplenial cortex and parieto-occipital sulcus, and anterior lateral temporal cortex. These results show that the subcortical and hippocampal functional networks in individuals with ASD exhibit atypical connectivity patterns to the cortex.

Fig. 6: The percentage of intersecting and group-specific cortical voxels in each network.
figure 6

A Of the eleven networks present in both groups (excluding the posterior cerebellar network not found in ASD), seven networks were present in the cortex, while four subcortical networks did not include more than fifty cortical voxels. All but two of the networks that were present in the cortex included more voxels that were intersecting between the groups than were exclusive to either group. By contrast, the network originating from a prototype primarily overlapping the thalamus, putamen, and caudate nucleus (red) included more cortical voxels that were exclusive to the ASD group, while the network originating from a hippocampal prototype included more cortical voxels that were exclusive to the TD group. B The difference in the number of cortical voxels in the subcortical and hippocampal functional networks, respectively, between the groups was quantified by comparing the mean difference derived from one hundred random split halves of the data from each group (separately in the subcortex and hippocampus) with a null distribution of 25,000 comparisons of the split-halves in which the group labels were randomly shuffled before obtaining the mean difference (black dots). The red dots are the actual mean difference in the number of cortical voxels between the groups. Positive values reflect a greater number of TD subnetworks, while negative values correspond to more ASD subnetworks.

Fig. 7: Group-specific cortical voxels in the subcortical and hippocampal networks.
figure 7

A A network originating in the subcortex, primarily overlapping the thalamus, putamen, and caudate nucleus, is connected to more cortical voxels in the ASD than TD group. B A network primarily overlapping the hippocampus, is connected to more cortical voxels in the TD than ASD group.

Functional connectivity within the ASD parcellation predicts social/communication symptoms

In the above analyses, we have established that ASD and TD control whole-brain parcellations differ in several respects. However, we have not established relevance of the ASD parcellation to clinical symptoms. We, therefore, examined whether functional connectivity among the ASD parcels defined above successfully predicts social and communication symptoms in our ASD participants, as measured by the Social Responsiveness Scale 2 total score (SRS-2, a parent-report survey) [20, 28, 45] and the ADOS combined social and communication score (an in-person assessment by a trained clinician) [24, 25, 47]. Average parcel-to-parcel functional connectivity was calculated for each ASD participant from the voxel-level data, yielding an 11 × 11 parcel matrix. Using all unique combinations of the parcels (i.e., the upper triangle of this matrix and the diagonal), we first employed multiple regression with leave-one-out cross-validation to compare predicted scores for each left-out participant with their actual scores. Chance levels of prediction were assessed using random permutation (5000 iterations) by randomly shuffling the behavioral scores on each iteration across participants. Predictions were significant for the ADOS combined social and communication score (r = 0.4977, P < 0.0223 by permutation test; see Fig. 8A) but not for SRS total score (r = –0.0295, P > 0.7). To examine the most important regression coefficients contributing to the successful prediction of the ADOS scores, we applied Ridge regression, a method often used to estimate regression coefficients in the context of large numbers of intercorrelated predictor variables [44]. After initially determining the Ridge parameter K that optimized the behavioral prediction of ADOS scores in leave-one-out cross-validation (at K = 2000; Fig. 8B), we estimated which beta coefficients in the regression differed significantly from 0. We found that 5 out of the 66 predictor variables differed from 0 and survived correction for multiple comparisons (P < 0.0028, FDR-corrected to q < .05), corresponding to combinations of the thalamus, striatum, fronto-parietal, and the brainstem/pons parcels (see Fig. 8C, D). These beta coefficients also matched the largest partial correlations calculated between the parcel-to-parcel functional connectivity and ADOS combined social and communication score, having removed the covariation with Age, Motion, and tSNR (Fig. 8C). Taken together, these results establish that the ASD whole-brain parcellation is indeed useful for relating the fMRI data to social and communication symptoms in the ASD participants.

Fig. 8: Functional connectivity values from the ASD group parcellation predict social and communication symptoms measured by the ADOS.
figure 8

A Typical least squares multiple regression with LOO cross-validation was successfully used to predict each individual participant’s ADOS combined social+communication score, indicating that information about these symptoms is robustly present among the parcel-to-parcel functional connectivity values. B A search over the range of the Ridge parameter K was performed to identify the optimal Ridge regression model (the P-value is based on permutation, corresponding to the portion of iterations with shuffled behavioral scores that had predictions better than or equal to the original data). C The partial correlation matrix of parcel-to-parcel functional connectivity and the ADOS combined social+ communication score (partialling Age, Motion, and tSNR). Overlaid on the partial correlations are the highlighted parcel-to-parcel combinations with beta weights that differ significantly from 0 for the optimal Ridge regression model (K = 2000). P-values for the betas were estimated using bootstrap resampling (10,000 samples), and then thresholded to FDR-corrected values (P ≤ 0.0028, q < 0.05). There were five parcel-to-parcel functional connectivity relationships that were significantly involved in the prediction for the optimal model, highlighted with squares. The colored circles next to the x- and y-axes in panel (C) match the colored parcels rendered in the brain volume in panel (D). D The five parcels that were significantly involved in the prediction for the optimal model included the thalamus, striatum, frontoparietal, and pons.

Discussion

We used high-quality rs-fMRI data and a robust parcellation routine to identify functional networks across the whole brain in high-functioning individuals with ASD and tightly matched TD controls. We compared the functional networks from each group and focused on three atypical features of the ASD brain: (1) whole-brain connectivity patterns are less stable across voxels within select functional networks, (2) the cerebellum, subcortex, and hippocampus all show weaker differentiation of functional subnetworks, and (3) subcortical structures and the hippocampus are atypically integrated with the neocortex. These results were statistically robust and suggest that patterns of network connectivity between the neocortex and the cerebellum, subcortical structures, and hippocampus are atypical in ASD individuals. We also demonstrated that the ASD-specific parcellation predicts social and communication symptoms in the ASD group.

The results mentioned above seem to be related in a straightforward way. Our finding of weaker cohesion within select networks of the ASD brain indicates that the patterns of whole-brain connectivity from voxels across each of these networks are less stable compared to the TD group. This weaker cohesion is likely to be responsible for the relative lack of differentiation of subnetworks in the subcortical structures and hippocampus in the ASD group using our parcellation method. Interestingly, however, the lack of differentiation of the subcortical structures and hippocampus are coupled with opposite patterns of connectivity to the cortex—i.e., the subcortical structures are connected to more cortical voxels, while the hippocampus is connected to less cortical voxels compared to TD controls. The pattern of cortical connectivity from subcortical structures that is exclusive to the ASD group in our analysis overlaps with cortical regions that exhibited hyper-connectivity during rest and social tasks in prior reports—e.g., the insula and temporal lobes [3, 29, 51, 52]. The pattern of cortical connectivity from the hippocampus that is exclusive to the TD group in our analysis overlaps with cortical regions that exhibited hypo-connectivity between the hippocampus and cortex during episodic memory retrieval tasks [53, 54]. Intriguingly, some of the cortical regions missing from the hippocampal network in the ASD group in our analysis seem to overlap with scene-selective regions of cortex (i.e., retrosplenial and lateral occipitoparietal cortices [55]), thus suggesting that this atypical network may be a neurobiological underpinning of reported behavioral deficits in scene construction and allocentric navigation in individuals with ASD [56]. Overall, our results are more consistent with findings of atypical domain-specific network organization [20, 57,58,59], rather than differences between the groups in global organizing principles, such as distance and strength of connectivity more generally [60,61,62].

In addition to the analyses presented here, the functional network map of the ASD brain can serve multiple functions in future studies. Since we demonstrated that the whole-brain functional network organization is significantly different between the ASD and TD groups, future studies can use our results to identify group-specific networks on which to focus their analyses, rather than combining data from the groups beforehand to identify networks common between them (as is typical of group ICA studies) or simply using parcels identified in TD groups in prior studies. The ASD-specific network map also provides a common spatial framework (or template) for integrating findings from studies that choose different regions (or networks) and/or behavioral deficits of interest. For example, several prior studies have reported unique patterns of behavioral correlates with each of the atypical networks that we focused on: The cerebellum–especially lobules Crus I/II and VIIB that we find undifferentiated in the ASD group—has been linked to deficits in social processing and communication [63,64,65,66]. Subcortical structures – especially the thalamus—have been linked to deficits in social functions and sensory processing issues [67,68,69]. The hippocampus has been linked to deficits in episodic memory [53, 54, 70]. For these reasons, the functional network maps of the ASD and TD brains from this study are freely available online (OSF: https://doi.org/10.17605/OSF.IO/YT87Z).