Cortico-amygdalar connectivity and externalizing/internalizing behavior in children with neurodevelopmental disorders

Background Externalizing and internalizing behaviors contribute to clinical impairment in children with neurodevelopmental disorders (NDDs). Although associations between externalizing or internalizing behaviors and cortico-amygdalar connectivity have been found in clinical and non-clinical pediatric samples, no previous study has examined whether similar shared associations are present across children with different NDDs. Methods Multi-modal neuroimaging and behavioral data from the Province of Ontario Neurodevelopmental Disorders (POND) Network were used. POND participants aged 6–18 years with a primary diagnosis of autism spectrum disorder (ASD), attention-deficit/hyperactivity disorder (ADHD) or obsessive–compulsive disorder (OCD), as well as typically developing children (TDC) with T1-weighted, resting-state fMRI or diffusion weighted imaging (DWI) and parent-report Child Behavioral Checklist (CBCL) data available, were analyzed (total n = 346). Associations between externalizing or internalizing behavior and cortico-amygdalar structural and functional connectivity indices were examined using linear regressions, controlling for age, gender, and image-modality specific covariates. Behavior-by-diagnosis interaction effects were also examined. Results No significant linear associations (or diagnosis-by-behavior interaction effects) were found between CBCL-measured externalizing or internalizing behaviors and any of the connectivity indices examined. Post-hoc bootstrapping analyses indicated stability and reliability of these null results. Conclusions The current study provides evidence towards an absence of a shared linear relationship between internalizing or externalizing behaviors and cortico-amygdalar connectivity properties across a transdiagnostic sample of children with different primary NDD diagnoses and TDC. Different methodological approaches, including incorporation of multi-dimensional behavioral data (e.g., task-based fMRI) or clustering approaches may be needed to clarify complex brain-behavior relationships relevant to externalizing/internalizing behaviors in heterogeneous clinical NDD populations. Supplementary Information The online version contains supplementary material available at 10.1007/s00429-022-02483-0.

Internalizing and externalizing behaviors have been linked to alterations in various cortico-amygdalar networks, such as the parieto-amygdalar network (Karlsgodt et al. 2018;Chahal et al. 2020), default mode network (Umbach & Tottenham 2020;Sato et al. 2016), and the fronto-amygdalar network (Ameis et al. 2014;Vijayakumar et al. 2017). Frontal cortical regions in particular have been implicated in decision-making (Rushworth et al. 2011), behavioral regulation (Rushworth et al. 2011), and emotional regulation (Albaugh et al. 2016;Ducharme et al. 2011) which provide top-down modulation of amygdalar activity (Etkin et al. 2006;Hariri et al. 2003).This cortico-amygdalar network is connected through two main white matter tracts: the uncinate fasciculus (UF) and the cingulum bundle (CB) (Catani et al. 2013). In typically developing children (TDC), increased internalizing behavior has been associated with altered structural covariance between the prefrontal cortex and amygdala (Vijayakumar et al. 2017), decreased fractional anisotropy (FA) of the UF and CB (Albaugh et al. 2016;Mohamed Ali et al. 2019), and increased functional connectivity between the ventromedial prefrontal cortex and amygdala (Qin et al. 2014). Also in TDC, increased externalizing behavior has been associated with altered cortico-amygdalar structural covariance (Ameis et al. 2014), decreased FA of the UF (Andre et al. 2020), and altered functional connectivity between amygdala and frontal cortical regions (Aghajani et al. 2017(Aghajani et al. , 2016Saxbe et al. 2018). Broad cortico-amygdalar network alterations have also been found in studies of children with primary internalizing (e.g., major depressive disorder) or externalizing (e.g., oppositional defiant disorder) disorders (Castellanos-Ryan et al. 2014;Luking et al. 2011;Noordermeer et al. 2016;Paulesu et al. 2010;Stoycos et al. 2017) compared to TDC. Shared continuous associations between task-based fMRI and behavioral measures (parent report and in-scanner assessments) have also been found across children with different clinical diagnoses (i.e., disruptive behavior disorders, anxiety disorders, or ADHD) (Ibrahim et al. 2019;Stoddard et al. 2017). Taken together, these studies suggest that corticoamygdalar connectivity properties may be associated with both externalizing and internalizing behaviors, which often co-occur (Korhonen et al. 2014;Reef et al. 2011), and may relate to these behaviors along a continuum cutting across TDC and different mental health diagnoses.
As of yet, we know of no study that has investigated whether cortico-amygdala network properties relate to internalizing or externalizing behaviors across children with different NDDs, which would suggest common neurobiological underpinnings contributing to these behaviors across diagnoses. The present study investigated linear associations between externalizing or internalizing behaviors and indices of cortico-amygdalar network connectivity (i.e., separately evaluated structural covariance, resting-state functional connectivity, and white matter connectivity) in a large sample, including TDC and children and youth with primary diagnoses of ASD, OCD, or ADHD. We hypothesized that greater externalizing or internalizing behaviors would be associated with reduced cortico-amygdalar structural and functional connectivity indices across our transdiagnostic sample.

Participants
Participants included in the current study participated in the Province of Ontario Neurodevelopmental Disorders (POND) Network; recruitment was carried out at different sites across the province of Ontario, Canada, including the Hospital for Sick Children (SickKids), Holland Bloorview Kids Rehabilitation Hospital, Lawson Health Research Institute, McMaster University and Queen's University between June 2012 and January 2020. Children and youth were eligible to participate in POND if they had a primary clinical diagnosis of ASD, ADHD or OCD, sufficient English language comprehension to complete the behavioral assessments, and no contraindications for MRI (e.g., metal implants). The Parent Interview for Child Symptoms (Ickowicz et al. 2006) was used to confirm ADHD diagnosis, the Schedule for Affective Disorders-Children's Version (Kiddie-SADS) and Children's Yale-Brown Obsessive-Compulsive Scale (Scahill et al. 1997) for OCD, and the Autism Diagnostic Observation Schedule-2 (Lord et al. 2000) and the Autism Diagnostic Interview-Revised (Lord et al. 1994) for ASD. TDC participants were recruited through flyers posted at each recruitment site as well as through word-of-mouth. The exclusion criteria for TDC included: history of premature birth (< 35 weeks), presence of an NDD, first-degree relative with an NDD, psychiatric or neurologic diagnosis, confirmed via parental screening. Age-appropriate Wechsler scales were used to estimate full-scale IQ (Littell 1960). Participating institutions received approval for this study from their respective research ethics boards. Primary caregivers and participants provided either written informed consent or assent after a complete description of the study was provided. As of January 2020, MRI and behavioral data were available for 611 children and youth with ASD, ADHD, OCD, or TDC (n = 286 ASD; n = 159 ADHD; n = 68 OCD; n = 98 TDC) who completed MRI scanning at Sick-Kids (Toronto, Canada). The present study analyzed data from a subset of these 611 POND participants who met all of the following criteria: (i) they had successfully completed a T1-weighted, resting-state, or single-shell DWI scan, (ii) were between the ages of 6 and 18 years at time of brain scan, and (iii) had Child Behavior Checklist (CBCL) data available that was collected within 12 months of their MRI scan (see Fig. 1).

Measurement of externalizing and internalizing behaviors
Externalizing and internalizing behavioral scores were measured using the parent-report CBCL (ages 6-18), a standardized, well-established instrument (Achenbach and Ruffle 2000) that has been widely used for brain-behavior analyses in pediatric samples (Albaugh et al. 2016;Ameis et al. 2014;Ducharme et al. 2014Ducharme et al. , 2011Ibrahim et al. 2019). The CBCL provides continuous measures of externalizing (calculated by combining rule-breaking and aggressive CBCL subscales) and internalizing behavior (calculated by combining withdrawn, anxious/depressed and somatic CBCL subscales), with a domain specific t-score (standardized by age and gender) > 70 indicating clinically significant symptoms.

MRI protocol
Participants were scanned on a 3 T Siemens Tim Trio at Sick-Kids that was upgraded to the Siemens PrismaFIT in June 2016. All T1-weighted brain imaging consisted of a 5-min scan using an MPRAGE sequence with grappa parallelization (Tim Trio: (1 × 1x1)mm 3 , TR = 2,300 ms, TE = 2.96 ms, TI = 900 ms, Flip Angle = 9°, FOV = 224 × 224mm 2 , Fig. 1 Diagrams presenting the overall POND imaging samples which includes children with autism spectrum disorder (ASD), attention-deficit/hyperactivity disorder (ADHD), obsessive compulsive disorder (OCD) and typically developing children (TDC) scanned at the Hospital for Sick Children as of January 2020. Imaging data from T1-weighted (T1w), resting state fMRI (rsfMRI) and diffusion weighted imaging (DWI) sequences are presented. The reasons for exclusion presented for level 1: participants being outside the 6-18 age range at time of scan, a greater than 12 month time gap between scan and CBCL administration, and missing CBCL data; level 2: persistent processing errors at any point within the processing pipeline (e.g. errors in the fMRIprep pipeline); level 3: exclusion based on quality control (QC; details presented in the paper and supplement). The numbers for the final analysed sample for each imaging modality are presented. For the T1w and rs-fMRI samples, participants were scanned on a 3 T Siemens Tim Trio scanner prior to June 2016 when the scanner was upgraded to the PrismaFIT. For rs-fMRI acquisitions, participants scanned on the Tim Trio selected a movie to watch and participants scanned on the PrismaFIT viewed a naturalistic film (inscapes). The study includes only single-shell DWI acquisitions (n = 262) completed on the Tim Trio scanner 1 3 240 Slices, GRAPPA = 2, 12-channel head coil; Pris-maFIT: (0.8 × 0.8x0.8)mm 3 , TR = 1,870 ms, TE = 3.14 ms, TI = 945 ms, Flip Angle = 9°, FOV = 222 × 222mm 2 , 240 Slices, GRAPPA = 2, 20-channel head coil).
Single-shell DWI scans were acquired as 3 consecutive sequences with 19, 20 or 21 gradient directions (for a total of 60 directions) and 3 B0's per acquisition sequence. Scan parameters were as follows: ((2 × 2x2)mm3, TR = 3800 ms, TE = 73 ms, Flip Angle = 90° FOV = 244 × 244mm2, 69 volumes, B0 = 1000). Multi-shell DWI data were acquired post scanner upgrade to PrismaFIT. Multi-shell data were not analyzed in the current study due to the challenges of harmonization across different DWI scan acquisition protocols and concerns regarding measurement variability given substantial differences in the pre-to-post hardware upgrade sequence design (Tax et al. 2019).

Image pre-processing
Prior to pre-processing, the acquired brain scans of participants who had multiple image acquisitions underwent quality assessment and the higher quality scan was pre-processed. Visual examination of the raw brain scan was used to assess the quality of T1-weighted and DWI acquisitions. Quality metric comparisons (e.g., mean framewise displacement [FD]) from the MRIQC pipeline (Esteban et al. 2017) was used to assess rs-fMRI acquisitions.

Structural MRI
T1-weighted brain images were pre-processed using the fMRIprep pipeline (Esteban et al. 2019) which runs Free-Surfer and performs intensity non-uniformity correction, skull stripping, calculates spatial normalization based on an MNI template, tissue segmentation and surface reconstruction. Images were also run through the MRIQC pipeline (Esteban et al. 2017) to extract quality metrics used in the quality control (QC) procedure. Left and right amygdala volumes from each participant were extracted using the amygdala region-of-interest (ROI) defined by the Desikan-Killiany Atlas (Desikan et al. 2006). The ciftify pipeline ((Dickie et al. 2019); https:// github. com/ edick ie/ cifti fy) was used to transform the images from the FreeSurfer format to the Connectivity Informatics Technology Initiative (CIFTI) format. From there, the 40,962 vertices in each hemisphere were extracted based on FreeSurfer's white and pial surfaces. This pipeline registered cortical surfaces to an average surface to establish correspondence between participants.
Cortical thickness values at each vertex were smoothed with a Gaussian kernel of 12 mm full width half maximum (FWHM).

Resting state fMRI
The rs-fMRI acquisitions were pre-processed through fMRIPrep (Esteban et al. 2019). Within fMRIPrep, the data was slice timed and motion corrected. Distortion correction was performed using field maps; the functional image was co-registered to the corresponding T1-weighted image using FreeSurfer with boundary-based registration with 9 degrees of freedom. Nonlinear transformation to the MNI152 template was calculated via FSL's FNIRT (based on the T1-weighted image) and applied to the functional data. These data were then transformed onto the cortical surface and converted to the CIFTI format (Dickie et al. 2019). The first three TRs were dropped, and voxel time series underwent mean-based intensity normalization, linear and quadratic detrending, temporal bandpass filtering (0.009-0.08 Hz), and confound regression for 6 head motion parameters, white matter signal, CSF signal and global signal plus their lags, their squares, and the squares of their lags (i.e. a 24HMP + 8 Phys + 4 GSR confounds) . Global signal regression was employed as it has been shown to reduce sources of noise and reduce correlations between mean FD and functional connectivity (Parkes et al. 2018). Spatial smoothing was then performed on the cortical surface data (FWHM = 8 mm).

DWI
DWI scans from the three separate runs were concatenated. Diffusion data were denoised using random field theory and upsampled to a (1 × 1x1) mm 3 voxel size using the MRtrix3 dwidenoise and mrresize commands, respectively (Veraart et al. 2016). Using fieldmaps, images were corrected for motion artefacts accounting for field inhomogeneities and eddy current induced artefacts using FSL's (Smith et al. 2004) eddy function . Deterministic tractography was used to delineate the UF and CB via the Slicer dMRI software (https:// github. com/ Slice rDMRI). The software registered the tracts for all participants using a dataset-specific atlas based on a representative subset (n = 21) from the current sample (selected based on age, gender and diagnosis) (Fedorov et al. 2012). Within the Slicer software, fiber clusters were manually appended to create the white matter tracts of interest (CB and UF). The atlas was registered to all participants' DWI acquisitions and FA and mean diffusivity (MD) metrics were extracted.

Quality control (QC)
To reduce potential bias of image artefacts, a rigorous a priori QC procedure was applied for all imaging modalities (supplementary Sect. 1; Figure S1). T1-weighted images were assessed for motion artefacts using a visual QC approach (HTML visual outputs from the fMRIPrep pipeline) in addition to quantitative QC (MRIQC-derived quality metrics). For the rs-fMRI sequence, participants that did not complete the ~ 5-min scan were excluded based on prior research indicating this time duration is required for stable estimations of correlation strengths (Van Dijk et al. 2010). Quality of rs-fMRI acquisitions were assessed based on mean FD and excluded based on in-scanner motion at mean FD > 0.5 mm as implemented in prior studies (Satterthwaite et al. 2012;Choi et al. 2020) given that children have high levels of in-scanner head motion (Pardoe et al. 2016). DWI acquisitions were assessed for slice dropouts, poor V1 directions and residuals using an in-house standardized pipeline in addition to quantitative quality metrics. See Figure S1 and supplement for QC procedure details.

Brain-behavior relations
Each modality (T1w, rs-fMRI, and DWI) underwent a distinct statistical analytical pipeline. Separate linear regression models were fit to examine the presence of an association between externalizing or internalizing behavioral scores and cortico-amygdalar connectivity metrics from the three included modalities: structural covariance, seed based rs-fMRI, FA and MD of the UF and CB (see below and supplementary for details). Covariates for the primary regression models included age, gender, and scanner (if acquisitions from a modality included a scanner upgrade). Prior to fitting the brain-behavior regression models, linear regression models were fit between age, age-squared, and brain and behavior indices. The better fitting age term was included as a covariate in the main analyses (Table S2; supplementary Sect. 5.1). If the better fitting age term was quadratic, then linear and quadratic age terms were included in the model (see supplementary; Table S2, Figure S6). Across image modalities, if the primary regression models were significant, subsequent models were planned to sequentially fit the following covariates: (i) the alternate broad-band CBCL score (e.g., internalizing behavior as a covariate when externalizing behavior is the predictor variable) to account for shared variability (Zald and Lahey 2017), (ii) mean FD for functional connectivity models (Power et al. 2012;Satterthwaite et al. 2012) or an estimate of overall noise for white matter connectivity models (Anderson 2001) (see details in S3.2.1), and (iii) medication status (taking medication, not taking medication, unknown).

Cortico-amygdalar structural covariance
To be consistent with the approach used in prior studies examining the relationship between cortico-amygdalar connectivity and internalizing/externalizing behavior (Ameis et al. 2014;Ducharme et al. 2017;Albaugh et al. 2017;Vijayakumar et al. 2017), in the current study we assessed structural covariance using a vertex-wise approach. Using a partial regression, an interaction term (independent variable) between internalizing or externalizing behavior scores and left or right amygdala volume (e.g., externalizing behavior*left amygdala volume) was regressed onto each cortical vertex (with thickness at each vertex as the dependent variable) controlling for age, parent reported gender (boy/girl), intracranial volume (Buckner et al. 2004;Raz et al. 2004) and scanner (i.e., Tim Trio pre-upgrade or PrismaFIT post-upgrade). Analyses were carried out using FSL's Permutation Analysis of Linear Models (PALM) package. Clusters of vertex-wise significance were determined using 2000 permutations with the threshold free cluster enhancement (TFCE) approach (Smith and Nichols 1996). Considering the high number of cortical vertices and consequent linear models, group results were thresholded at p < 0.05 FDR-corrected for the number of vertices in each hemisphere, and further corrected for separate runs of PALM for each hemisphere (critical level a = 0.025). Eight models were fit with cortical thickness at each vertex as the dependent variable to account for the different combinations between behavioral scores and amygdala volume and the behavior-by-diagnosis terms. Similar partial regression models were used for analyses examining rs-fMRI and DWI metrics as dependent variables. See below an example of one of the linear regression models examining associations between left cortico-amygdalar structural covariance and externalizing behavior.

Resting-state functional connectivity
Similar to prior studies examining the relationship between cortico-amygdalar connectivity and internalizing/externalizing behavior (Ibrahim et al. 2019;Stoddard et al. 2017), we assessed functional connectivity using a seed-based functional connectivity approach with the left and right amygdala as seed ROIs. Mean time series for each amygdala ROI were correlated with the time series of each cortical vertex using the ciftify_seed_corr function from the ciftify pipeline. Externalizing or internalizing behavior was regressed onto the functional connectivity between the amygdala ROI and each cortical vertex, controlling for age, gender, and scanner. PALM with TFCE was used to control for multiple comparisons across cortical vertices.

White matter connectivity
Using R (version 3.5.0), internalizing and externalizing behavioral scores were regressed onto left or right UF and CB for FA and MD metrics, controlling for age and gender. An FDR correction was applied to the primary analyses examining associations between behavior and left or right CB and UF diffusion metrics.

Interaction effects
To examine whether association patterns differed between diagnostic groups, behavior-by-diagnosis interaction terms were fit in separate models to examine whether brain-behavior relationships were influenced by diagnostic status.

Planned subsample analysis
Given the potential for considerable behavioral and brain change over time in a developing sample (Bos et al. 2018), a sensitivity analysis was conducted in a subset of participants whose brain scan was obtained within one month of completion of their CBCL data (see Figure S2 in supplemental materials for subsample details).

Post-hoc age-by-behavior interaction
Given that prior work has found age-specific relationships between externalizing/internalizing behaviors and corticoamygdalar connectivity properties (Andre et al. 2019;Ducharme et al. 2014;Vijayakumar et al. 2017), age-by-behavior interaction terms were examined (see details in supplementary Sect. 8).

Post-hoc bootstrap resampling analysis
In light of recent calls for increased efforts to assess reliability of reported results (Button et al. 2013) due to the lack of replicability of neuroimaging research findings (Ioannidis, 2018;Simmons et al. 2011;He et al. 2020), we used bootstrap resampling to assess the stability and reliability for the brain-behavior models which address the main aims of the current study (i.e., models that examined the main effect of externalizing or internalizing behavior across cortico-amygdalar connectivity indices in the current sample). Using a case-resampling bootstrap (Monte Carlo) approach, 1000 iterations of each data (i.e., design) matrix were generated and used to perform repeated linear regressions for each generated sample. Each iteration of the data matrix randomly selected participants with replacement until the total sample size was reached (e.g., n = 346 for the main structural covariance models). Stability of the models were assessed using the bootstrap resampled standard errors of the regression coefficients (McIntosh and Lobaugh 2004;Efron and Tibshirani 1986). The reliability of assessed models was evaluated by examining the distributions (i.e., standard deviations) of the resampled model parameter estimates (i.e., regression coefficients, t-statistics, and effect sizes; Himberg et al. 2004). Stable and reliable models feature near-zero standard errors and low standard deviations of parameter estimates (McIntosh and Lobaugh 2004;Efron and Tibshirani 1986). For the structural covariance and functional connectivity models, bootstrap resampling was conducted in PALM and parameter estimates were calculated at each vertex. For the white matter connectivity models, each model was analyzed in RStudio and the standard errors of each model regression coefficients were calculated (see supplementary Sect. 7 for more details about the bootstrap resampling analysis). In addition to the bootstrap resampling analysis, a post-hoc power analysis was used to confirm that our study was adequately powered (see supplementary Sect. 8 for details and results of this analysis).

Participant information
Characteristics of the analyzed sample are shown in Table 1. Following removal of participants outside the 6-18 year age range, those with a time window greater than 12 months between acquired scan and behavioral assessment date and those who failed QC, a total of 346 participants were included for structural covariance, 299 participants for resting-state functional connectivity and 157 participants for white matter connectivity analyses (see Fig. 1 for consort diagram). Across all image modalities, there were no significant differences in externalizing and internalizing behavior scores between the total POND sample scanned at SickKids by January 2020 (n = 611) and the subsample analyzed in the current study, nor were there any differences in age, gender, or diagnostic composition (see supplementary Sect. 4 for further details). A significant positive correlation between age and internalizing behaviors was found within each of the analyzed samples (T1w sample: r = 0.17, p < 0.001, rsfMRI sample: r = 0.15, p = 0.009, and DWI sample: r = 0.22, p = 0.005). A significant negative correlation between age and externalizing behaviors was found in the T1w sample (r = − 0.11, p = 0.044). There were no significant differences in age (F = 1.44, p = 0.24), internalizing (F = 0.03, p = 0.87) or externalizing (F = 0.41, p = 0.66) behaviors between participants included in T1w, rs-fMRI, and DWI samples. There were significant relationships between internalizing and externalizing scores across each of the imaging samples: T1w (r = 0.49, p < 0.001), rs-fMRI (r = 0.51, p < 0.001), and DWI samples (r = 0.53, p < 0.001). In the T1 and rs-fMRI samples, there was a significant difference in diagnostic composition across the scanner upgrade (X 2 = 32.48, p < 0.001), consistent with the increased number of TDC participants that were scanned post upgrade.

Cortico-amygdalar structural covariance
No significant interaction effects between either internalizing or externalizing behavior score and left or right amygdala volume on vertex-wise cortical thickness were found. No significant effects were found when diagnostic status was included as an interaction term (i.e., diagnostic status-byexternalizing/internalizing behavior-by-left/right amygdala volume on cortical thickness). Figure 2 illustrates the unthresholded p-maps of the relationship between externalizing and internalizing behavior and left cortico-amygdalar structural covariance/functional connectivity.

Functional and white matter connectivity
No significant associations were found between either externalizing or internalizing behavior and the time series correlations between left or right amygdala volume and each cortical vertex (Fig. 2), nor were there significant interaction effects found between externalizing or internalizing and diagnostic status on functional connectivity. Similarly, there were no significant associations found between either behavioral domain and FA or MD of the left or right UF or CB (Table 2, Fig. 3). No diagnosis-by-behavior interaction effects were found across the functional and white matter connectivity models.

Planned subsample analysis
Findings remained the same for the structural covariance and functional connectivity analyses among the subset of participants with a time-gap of one month or less between imaging acquisition and behavioral assessments. Among the white matter connectivity models, there was a significant association between MD of the right UF and externalizing behavior, however, this association no longer remained when participants with outlier MD values were removed ( Figure   Fig. 2 Unthresholded spatial p-map of the relationship between externalizing/internalizing behaviors and cortico-amygdalar structural and functional connectivity. A Unthresholded spatial p-maps depicting the relationship between the interaction of externalizing and internalizing behavior and left amygdala volume on each cortical vertex.
B Unthresholded spatial p-maps depicting the relationship between externalizing and internalizing behavior and functional connectivity between the left amygdala seed and each cortical vertex. A logp value of 1.6 is considered significant. As seen in the figure, none of the results reached this significance threshold

S7).
There was a main effect of externalizing behavior on the MD of the right CB, such that higher MD was associated with greater externalizing behavior ( Figure S8) which did not survive the FDR significance threshold (F 3,106 = 16.6, p Model < 0.001, t Externalizing = 2.19, p Externalizing = 0.03). There were no other significant associations found between FA or MD of the left or right UF or CB.

Post-hoc bootstrap resampling analysis
See Fig. 4 for structural covariance and functional connectivity bootstrap resampling analyses (and Figure S9 for other models) plotting the regression coefficients and their standard errors at each vertex (Panel A). Across all models, the regression coefficients based on 1000 generated bootstrapped resampled models were near zero (< 0.01; low signal) with respective standard errors near zero (< 0.001; low noise). Vertices with a t-statistic greater than an absolute value of 4 served as a proxy for a high signal vertex and are depicted in pink in Fig. 4. The figure illustrates the low standard errors of the regression coefficients (< 0.001) present for both high and low signal vertices, indicating that the results of the models examined in the current study are stable. As can be seen in Panels B-D, providing an example of bootstrap parameter estimates for models examining associations between externalizing behavior and cortico-left amygdalar structural and functional connectivity, the distribution (standard deviations) of the three mean model parameter estimates (i.e., regression coefficients, t-statistics and effect sizes, averaged across 1000 iterations) are centered around zero for both structural covariance and functional connectivity analyses. For the white matter connectivity models, the bootstrapped standard errors of the regression coefficients were also near zero and nearly all models featured bootstrapped confidence intervals which included zero (Table S5). See Figure S10 and Table S6 for bootstrap resampled results for the sensitivity analyses. The power analysis revealed that the current study was powered to detect effect

Post-hoc age-by-behavior interaction effects
There were no age-by-externalizing or internalizing behavior interaction effects found on cortico-amygdalar structural covariance, functional connectivity, or white matter Fig. 4 This figure depicts the bootstrap resampling results for the models examining associations between the externalizing behavior-left amygdala interaction term and whole brain structural covariance and functional connectivity. All other models examined feature similar results as those depicted here (see supplementary). In Panel A, the scatterplots illustrate associations between the mean regression coefficient (averaged across 1000 bootstrapped resamples) and the bootstrapped standard errors of the regression coefficients of each vertex for the structural covariance and functional connectivity models. Pink points depict the vertices with a higher signal (a t-statistic greater than 4). Blue points depict the vertices with low signal (t-statistic less than 4). The low standard errors found for both high and low signal vertices indicate stable results across resampling. Panel B depicts the histogram of the mean regression coefficients of each vertex across the 1000 bootstrapped resampled analyses (distribution-structural covariance: 2.56e-06 ± 1.37e-05; functional connectivity: 0.0002 ± 0.004). Panel C depicts the histogram of the mean t-statistic of each vertex across the 1000 bootstrapped resampled analyses (distribution-structural covariance: 0.357 ± 1.91; functional connectivity: 0.119 ± 1.99). Panel D depicts the histogram of the mean effect size of the model at each vertex across the 1000 bootstrapped resampled analyses (distribution-structural covariance: 8.45e-06 ± 4.66e-05; functional connectivity: 0.00045 ± 0.009). Note, all model parameter distributions (B-D) are centred around zero. The density y-axis in panels B-D figures is the number of points (i.e., vertices) that are in each histogram bin connectivity metrics (see supplementary Sect. 8, Figure S11, Table S7).

Discussion
Using a multi-modal imaging framework, the current study did not find continuous (dimensional) linear relationships between cortico-amygalar connectivity properties and externalizing or internalizing behaviors across a transdiagnostic sample including TDC and children with ASD, ADHD, or OCD. Further, our results do not suggest that the brainbehavior patterns examined differed by diagnostic group.
The results of our post-hoc bootstrap resampling analyses indicate that the null results found in the present study are both stable and reliable. Previous neuroimaging studies have found significant linear associations between internalizing or externalizing behavior and cortico-amygdalar connectivity metrics. Table S3 highlights studies which have examined this relationship with a focus on those that characterized externalizing and internalizing behaviors using the CBCL. Prior studies examining these relationships typically included small sample sizes (range: n = 21-291, with most studies including less than 100 participants). The majority of these studies conducted their analysis on typically developing samples, which feature limited variability of behavioral scores, when compared to a large transdiagnostic sample. Among TDC samples, lower rates of externalizing and internalizing behaviors have been shown to be associated with greater cortico-amygdalar connectivity across different modalities (e.g., increased structural covariance or functional connectivity between the amygdala and cortical regions). Importantly, studies that investigated this relationship in transdiagnostic samples found positive and negative relationships between externalizing/internalizing behaviors and functional connectivity (Chabernaud et al. 2012;Ibrahim et al. 2018), suggesting that brain-behavior relationship profiles may vary within heterogeneous samples. Among the prior studies included in Table S3, smaller effect sizes were found among studies with larger samples (Albaugh et al. 2016;Ameis et al. 2014;Vijayakumar et al. 2017). Further, many of these studies featured a narrower range of internalizing and externalizing behavioral scores across examined samples (e.g., current study range = 33-87 vs. T-score range ~ 30-70 in Ameis et al. 2014;Saxbe et al. 2018;Chabernaud et al. 2012;Karlsgodt et al. 2017;Qin et al. 2014). It is possible that the narrower behavioral heterogeneity compared to that included in the current study sample may have contributed to differences between the results reported here and findings of prior studies. Thus, the null results of the current study may not contradict prior findings as, to our knowledge, this is the only study examining the relationship between parent-reported externalizing/internalizing behaviors and multimodal cortico-amygdalar connectivity in a transdiagnostic NDD sample.
The current study featured a moderate-to-large sample size and was powered to detect very small effect sizes for each individual linear model supplementary Sect. 8), thus indicating that our null results were unlikely due to lack of power. Further, we examined the stability and reliability of our null results using bootstrap resampling. Using the bootstrap resampling analysis, instability of the model findings (which could be due to underpowered models and/or highly noisy data) can be detected through the standard errors of the model parameters. In the current study, our post-hoc bootstrap resampling analysis showed near-zero standard errors of regression coefficients for the main models (across both low and high signal vertices), indicating stability of the findings of the current study. The bootstrap resampling analysis indicated that the standard deviations for the model parameter estimates examined were also consistently centered around zero with low standard deviations, suggesting that the null models found in the current study are also reliable (McIntosh and Lobaugh, 2004;Efron and Tibshirani 1986;Himberg et al. 2004). Thus, the results of the power analysis and our bootstrap resampling provide further confidence that the current study was unlikely to be underpowered. Instead, the results of the power analysis in addition to the small effect sizes found across the bootstrap resampled models (d < 0.001; Fig. 4), could suggest that there is no meaningful linear relationship between externalizing or internalizing behavior, as measured, and cortico-amygdalar connectivity properties present across a heterogeneous transdiagnostic clinical sample of children with different NDDs.
Although the current study examined cortico-amygdalar connectivity across three imaging modalities, it is important to note that behavioral traits were assessed through a single broad-band parent-report measure. Delineating brain-behavior relations relevant to internalizing or externalizing behaviors in heterogeneous clinical samples may benefit from incorporating multi-modal measures of behavior (e.g., taskbased fMRI using behavioral relevant tasks). Multi-modal measures of behavior may enhance measurement precision compared to the use of a parent-report behavioral measure alone. Two previous studies have examined brain-behavior relationships in transdiagnostic samples using both symptom measures and task-based fMRI. Ibrahim et al. found a negative association between cortico-amygdalar connectivity during an emotion perception task and externalizing behavior across children with ASD, with or without co-occurring disruptive behavior disorders (Ibrahim et al. 2019). Stoddard et al. found that amygdala-prefrontal cortical connectivity during viewing of intensely angry faces was associated with different behavioral profiles across a sample of children with ADHD, disruptive behavior disorders, anxiety disorders or 1 3 TDC, whereby decreased connectivity was associated with high levels of anxiety and irritability, and lower connectivity was associated with high anxiety but low irritability (Stoddard et al. 2017).
A number of strengths and limitations of the current study require consideration. First, considering the growing concern of statistical practices which may contribute to false positive results or inflated effect sizes (Marek et al. 2020;Poldrack et al. 2017), we made use of non-parametric statistics (Eklund et al. 2016) (i.e. TFCE; (Smith and Nichols 1996)) to reduce risk of inflation for any potential associations found. Additionally, a standardized QC protocol was implemented across all imaging modalities to reduce the likelihood for findings to be driven by artefacts or motion (Backhausen et al. 2016;Pardoe et al. 2016). This resulted in an exclusion rate of 16.8-23.7% across imaging modalities based on image QC (that is, following initial exclusion of participants based on age, missing data and > 12-month time-gap between imaging and behavioral assessments), which is comparable to previous studies examining pediatric samples or using standardized QC approaches (Ameis et al. 2014;Ducharme et al. 2014;Xia et al. 2018) and in line with higher in-scanner motion in pediatric and clinical samples (Pardoe et al. 2016). While applying this rigorous QC approach is beneficial (particularly in a pediatric clinical sample), this limited our ability to leverage more of the total data available from POND. T1-weighted and rs-fMRI acquisitions for this sample were collected across a scanner upgrade, potentially introducing scanner-related confounds not captured by our statistical approaches. Further, while the parent-report behavioral measures of internalizing or externalizing behaviors used here have been used in prior brainbehavior studies (Albaugh et al. 2016;Ameis et al. 2014;Ducharme et al. 2014Ducharme et al. , 2011Ibrahim et al. 2019), inclusion of additional measures (e.g., self-report behavioral or cognitive) may provide a more sensitive proxy of the behavioral domain of interest. Finally, given the increased variance present in our heterogeneous transdiagnostic NDD sample, which is likely present at both the behavioral and brain level (see Table 1, S3) (Dajani et al. 2016;Dickie et al. 2018;Fair et al. 2012), other analytic approaches including clustering methods and other data-driven algorithms (e.g., canonical correlation analysis or supervised machine learning (Lombardo et al. 2019;Feczko et al. 2019;Xia et al. 2019)) may be advantageous in future brain-behavior research than the more conventional univariate approaches (as in the linear models used here). Such analyses would need to establish whether newly identified clusters are robust and clinically meaningful. While employing such approaches is outside the scope of the aims of the current study, initial reports from cluster analyses applied to the POND sample indicate the potential for data-driven approaches to be useful in identifying subgroups of children with different NDD diagnoses (Jacobs et al. 2020;Kushki et al. 2019).

Conclusion
Producing consistent results that are generalizable and replicable has been challenging in clinical and cognitive neuroscience (Ioannidis, 2018;Simmons et al. 2011) as suggested by reports of non-replication (He et al. 2020) and inconsistent findings (Uddin et al. 2017) (Dajani et al. 2019;Masouleh et al. 2019). Null reports are necessary to refine methodological approaches which can inform future research. Contrary to our hypotheses, the stability and reliability of the null result found across three imaging modalities in the current study provides support for the absence of a dimensional linear association between externalizing or internalizing behavior and cortico-amygdalar connectivity across a heterogeneous group of children with different NDD diagnoses and TDCs. Future work exploring brain-behavior relations relevant to internalizing and externalizing domains in transdiagnostic samples may benefit from the use of additional clinical/cognitive/behavioral assessments (including multi-informant reports or relevant task-based fMRI), and data-driven analytical approaches to delineate subgroups with different brain-behavior profiles.
Toronto. NJF received funding from the Centre for Addiction and Mental Health Discovery Fund Postdoctoral Grant. M-CL receives funding from the Ontario Brain Institute via the POND Network, Canadian Institutes of Health Research, the Academic Scholars Award from the Department of Psychiatry, University of Toronto, and CAMH Foundation. PS has received royalties from Guilford Press. RS has consulted to Highland Therapeutics, Eli Lilly and Co., and Purdue Pharma. He has commercial interest in a cognitive rehabilitation software company, "eHave". PDA receives funding from the Alberta Innovates Translational Health Chair in Child and Youth Mental Health and holds a patent for 'SLCIAI Marker for Anxiety Disorder' granted May 6, 2008. EA receives funding from the Canadian Institutes of Health Research, National Institutes of Health, Ontario Brain Institute, Brain Canada, Azrieli foundation, Autism Speaks, Health Resources and Services Administration. She has served as a consultant to Roche and Quadrant, has received grant funding from Roche, holds a patent for the device, "Anxiety Meter", has received editorial honoria from Wiley and royalties from APPI and Springer. SHA currently receives funding from the National Institute of Mental Health (R01MH114879), Canadian Institutes of Health Research, the Academic Scholars Award from the Department of Psychiatry, University of Toronto, Autism Speaks Canada and the CAMH Foundation.
Data and code availability See https:// github. com/ hajer nakua/ corti coamygd alar2 019 for the code used in our analyses (pre-processing and analysis scripts) and QC procedures. The POND Network has made a commitment to release the POND data. Data release is controlled and managed by the Ontario Brain Institute (OBI). The OBI POND data will be released via the Brain-CODE portal. For further details please see https:// www. brain code. ca/.

Conflict of interest
Other authors report no related funding support, financial or potential conflicts of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http:// creat iveco mmons. org/ licen ses/ by/4. 0/.