Participants
Structural magnetic resonance imaging and diffusion-weighted imaging were carried out on 76 typically developing children and young adults aged between 6 and 25 years. The participants were randomly selected from a longitudinal study (three time-points, each 2 years apart), and all were drawn from the population registry in Nynäshamn, Sweden (Söderqvist et al. 2010). The study was approved by the local ethics committee of the Karolinska University Hospital, Stockholm, Sweden. Informed consent was obtained from all individual participants included in the study; for children younger than 18 years, it was provided by their parents.
Genotyping and quality control steps
Genotyping was performed on the Affymetrix Genome-wide Human SNP array 6.0, including more than 906,600 SNPs and more than 946,000 probes for detecting copy number variation. The genotyping was done on one batch with DNA extracted from blood (n = 63) and two batches with DNA extracted from saliva (n = 19 and n = 8) using standard methods and commercial kits. Two individuals were removed from the blood batch in initial quality control. DNA from these individuals was reextracted from saliva, and genotyping was redone using saliva DNA.
The genotypes were called using Birdsuite version 1.5.5 (Korn et al. 2008) separately on each batch. The output files were converted to PLINK-compatible format using the Birdsuite to PLINK pipeline version 1.6.6 (https://www.broadinstitute.org/ftp/pub/mpg/birdsuite/Birdsuite_Pipeline.pdf). The reference genome assembly used was hg18.
Quality control was done using PLINK version 1.07 (Purcell et al. 2007) and conducted on two levels: exclusion of individuals and exclusion of SNPs. No individuals were removed due to low genotype call rates. The average genotype call rate of individuals was over 98 % in all of the batches. In total, 13 individuals were removed from the analysis based on gender discordance detected using heterozygosity rates of X-chromosomal SNPs (one individual), excessive heterozygosity detected by calculating inbreeding coefficients (three individuals), possible cryptic relatedness estimated by pairwise identity-by-state analysis (two individuals), or known relatedness (seven individuals). Out of the known and suspected sib-pairs, the sibling with the larger proportion of missing genotypes was removed from the analysis.
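As an illustration of the heterozygosity check, a method-of-moments inbreeding coefficient can be written as F = (O − E) / (N − E), where O is the observed number of homozygous genotypes, E the number expected under Hardy–Weinberg proportions, and N the number of genotyped SNPs. The sketch below is a simplified version of that idea, not the exact computation performed by the software used in the study: it ignores any small-sample adjustment, the function names are hypothetical, and the cutoff that was applied to F is not stated in the text. Strongly negative F values point to excess heterozygosity.

```python
def expected_homozygotes(allele_freqs):
    """Expected homozygous genotype count under Hardy-Weinberg: sum of 1 - 2pq."""
    return sum(1 - 2 * p * (1 - p) for p in allele_freqs)


def inbreeding_coefficient(observed_hom, expected_hom, n_genotyped):
    """Method-of-moments F = (O - E) / (N - E)."""
    return (observed_hom - expected_hom) / (n_genotyped - expected_hom)


# Toy example (made-up numbers): four SNPs, each with allele frequency 0.5.
freqs = [0.5, 0.5, 0.5, 0.5]
e_hom = expected_homozygotes(freqs)            # 2.0
f_value = inbreeding_coefficient(3, e_hom, len(freqs))
print(f_value)                                 # 0.5
```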
Before quality control, 909,622 SNPs were available for analysis. SNPs with a proportion of missing genotypes >0.05 or with a missing genotype for more than one individual in any batch were removed from all batches (48,451 SNPs in total). Also, the following SNPs were removed: 137,753 SNPs with minor allele frequency (MAF) <0.01, 26 SNPs showing deviation from Hardy–Weinberg equilibrium (p < 1 × 10⁻⁵), 39 SNPs showing non-random missingness with respect to neighboring genotype, and 37 SNPs showing association with batch membership. Overall, 79.5 % (723,316/909,622) of the markers passed all quality control filters.
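The reported SNP counts can be cross-checked with a short tally. The sketch below is only an arithmetic check on the numbers given in the text, assuming each removed SNP is counted under exactly one filter; the dictionary keys are illustrative labels, not fields produced by any tool.

```python
# Counts reported in the text; key names are illustrative labels only.
snps_before_qc = 909_622
removed = {
    "missingness": 48_451,
    "maf_below_0.01": 137_753,
    "hardy_weinberg": 26,
    "nonrandom_missingness": 39,
    "batch_association": 37,
}

snps_after_qc = snps_before_qc - sum(removed.values())
pass_rate = 100 * snps_after_qc / snps_before_qc

print(snps_after_qc)        # 723316
print(round(pass_rate, 1))  # 79.5
```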
After filtering, 76 individuals (41 males and 35 females) and 723,316 diploid SNPs remained. The average individual call rate was 99.7 %, and the lowest individual call rate was 97.4 %.
Selection of ROBO1 SNPs
From the quality-controlled genotype data, we first selected all SNPs from the genomic interval chr3:78,720,000–80,000,000 (hg18), in total 201 SNPs. This area encompasses all of the annotated transcript variants of the ROBO1 gene (NM_002941.3, NM_133631.3 and NM_001145845.1) as well as roughly 300 kb upstream and 10 kb downstream of the longest variant (NM_002941.3).
A subset of these SNPs was chosen for a two-phase association study; in the first phase, we selected tagging SNPs to efficiently capture the common genetic variation in the area while keeping the multiple testing burden to a minimum, and in the second phase, we refined the association analysis by selecting additional, more closely spaced SNPs within the genomic locations that contained associated SNPs in the first phase.
In the first phase, Haploview (Barrett et al. 2005) was used to construct a haplotype map using the 201 SNPs in the genomic region of ROBO1. Pairwise comparisons of markers >1000 kb apart were ignored. The resulting 19 haplotype blocks were subjected to Haploview’s Tagger algorithm to find the best tagging SNPs for them, with a minor allele frequency (MAF) threshold of >0.1 and a maximum of 20 tags to pick. The selected 20 tagging SNPs (rs3773216, rs9875094, rs3773232, rs1457659, rs416551, rs7629522, rs162870, rs162871, rs162262, rs162429, rs7631406, rs12497294, rs6770483, rs9835692, rs9876238, rs4856291, rs4856447, rs12488868, rs6768880, and rs9830013) captured 98 alleles (58 % of the 169 alleles with MAF > 0.1) at r² ≥ 0.8 and 57 % of alleles with mean r² of 0.937.
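For readers unfamiliar with the r² measure used for tagging, the following minimal sketch computes it for one pair of biallelic SNPs from haplotype frequencies, using the standard definition r² = D² / (pA(1 − pA) pB(1 − pB)) with D = pAB − pA·pB. It is not Haploview's implementation; the function name and the toy frequencies are assumptions made for illustration. The last line simply re-derives the reported capture percentage.

```python
def ld_r_squared(p_ab, p_a, p_b):
    """r^2 between two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and B at SNP 2
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


print(ld_r_squared(0.5, 0.5, 0.5))   # 1.0 (perfect LD)
print(ld_r_squared(0.25, 0.5, 0.5))  # 0.0 (no LD)

# Share of alleles captured at r^2 >= 0.8, from the counts in the text.
print(round(100 * 98 / 169))         # 58
```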
In the second phase, we selected additional SNPs within and between the two haplotype blocks whose tagging SNPs (rs17396958 and rs1393375) had shown association in the first phase. We chose MAF >10 % as cutoff. The SNPs that did not show association in the first phase and the SNPs tagged by them were excluded. Of those SNPs that were in perfect linkage disequilibrium (LD) with each other, only one SNP was selected. Between rs17396958 and rs1393375, we also dropped all but one SNP per group from groups of SNPs that were in strong LD with each other (r² > 0.8). In total, there were 28 SNPs analyzed in the second phase: rs6770755, rs7651370, rs7631357, rs4564923, rs6548621, rs9832405, rs7637338, rs6548628, rs9853895, rs9820160, rs7432676, rs9309825, rs13071586, rs13072324, rs6771681, rs7618126, rs7432306, rs6548650, rs7644521, rs1995402, rs17380584, rs11917376, rs11706346, rs1393360, rs1502298, rs10511118, rs1502305, and rs10511119.
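One simple way to keep a single SNP per group of strongly correlated SNPs is a greedy pass over the candidates. The sketch below is an assumption about how such pruning could be done, not a description of the procedure actually followed; the SNP labels and r² values are made up, and the function name is hypothetical.

```python
def prune_by_ld(snps, r2, threshold=0.8):
    """Greedily keep a SNP only if its r^2 with every kept SNP is <= threshold.

    snps: candidate SNP identifiers, in order of preference
    r2: dict mapping frozenset({snp1, snp2}) to their pairwise r^2
        (pairs missing from the dict are treated as r^2 = 0)
    """
    kept = []
    for snp in snps:
        if all(r2.get(frozenset((snp, k)), 0.0) <= threshold for k in kept):
            kept.append(snp)
    return kept


# Hypothetical SNP labels and r^2 values, for illustration only.
pairwise_r2 = {
    frozenset(("snp_a", "snp_b")): 0.95,
    frozenset(("snp_b", "snp_c")): 0.30,
    frozenset(("snp_a", "snp_c")): 0.20,
}
print(prune_by_ld(["snp_a", "snp_b", "snp_c"], pairwise_r2))  # ['snp_a', 'snp_c']
```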
Structural brain imaging and voxel-based morphometry
We applied a three-dimensional magnetization prepared rapid gradient echo (MP-RAGE) sequence with TR = 2300 ms and TE = 2.92 ms to collect structural MRI data with a 256 × 256 mm² field of view (FOV), 256 × 256 matrix size, 176 sagittal slices, and 1-mm³ isotropic voxel size. The structural images were then processed using the Diffeomorphic Anatomical Registration Through Exponentiated Lie Algebra (DARTEL) method, which segmented the brain into gray matter, white matter, and cerebrospinal fluid.
DARTEL, as a part of the SPM5 software, was performed on structural data collected at all three time-points, and white matter was segmented after all images were registered to the template generated by iterative registration of all individuals’ T1-weighted images. To preserve the total amount of signal from different regions in the brain, the modulation step was also performed. The modulated, segmented white matter (white matter density) images were then smoothed with an 8-mm Gaussian kernel and fed into a higher level statistical analysis in SPM5 for detecting any SNP associations with white matter structure.
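Smoothing kernels in neuroimaging are conventionally specified by their full width at half maximum (FWHM). Assuming the 8-mm figure is an FWHM, which the text does not state explicitly, the corresponding Gaussian standard deviation follows from σ = FWHM / (2·√(2·ln 2)). The helper name below is hypothetical.

```python
import math


def fwhm_to_sigma(fwhm_mm):
    """Convert a Gaussian kernel's FWHM (mm) to its standard deviation (mm)."""
    return fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


sigma_mm = fwhm_to_sigma(8.0)
print(round(sigma_mm, 3))  # 3.397
```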
Diffusion tensor imaging
Diffusion tensor imaging (DTI) data, with the scanning parameters of 230 × 230 mm² FOV, 128 × 128 matrix size, 40 slices with a thickness of 2.5 mm, and a b-value of 1000 s/mm² in 64 gradient directions, were collected at the third round of longitudinal data collection (Söderqvist et al. 2010). Eddy-current distortions and head motion were corrected by affine registration of all diffusion-weighted images to a reference volume using FSL software (http://fsl.fmrib.ox.ac.uk/fsl/fslwiki/). The diffusion tensor parameters were then estimated for each voxel, and subsequently, the DTI and fractional anisotropy (FA) data were constructed. Non-linear registration was carried out using Tract-Based Spatial Statistics, TBSS v1.2 (http://fsl.fmrib.ox.ac.uk/fsl/fslwiki/TBSS), to align all FA images to the mean FA skeleton.
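For reference, FA at a voxel is a function of the three eigenvalues (λ1, λ2, λ3) of the estimated diffusion tensor: FA = √(1/2) · √((λ1 − λ2)² + (λ2 − λ3)² + (λ3 − λ1)²) / √(λ1² + λ2² + λ3²). The short sketch below implements this standard definition; the function name and the handling of all-zero voxels are illustrative choices, not taken from the software used in the study.

```python
import math


def fractional_anisotropy(l1, l2, l3):
    """FA from the three eigenvalues of a diffusion tensor."""
    denom = l1 * l1 + l2 * l2 + l3 * l3
    if denom == 0:
        return 0.0  # empty voxel: define FA as 0 to avoid division by zero
    num = (l1 - l2) ** 2 + (l2 - l3) ** 2 + (l3 - l1) ** 2
    return math.sqrt(0.5 * num / denom)


print(fractional_anisotropy(1.0, 1.0, 1.0))  # 0.0 (isotropic diffusion)
print(fractional_anisotropy(1.0, 0.0, 0.0))  # 1.0 (diffusion along one axis)
```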
Probabilistic fiber tracking of CC
To find the CC white matter fibers, the body of CC was selected as seed region and probabilistic tractography was performed on all individuals’ DTI data, initiating from all voxels within the seed masks using the probtrackx tool of FDT v2.0, FSL (http://fsl.fmrib.ox.ac.uk/fsl/fsl-4.1.9/fdt/fdt_probtrackx.html). The fiber tracking parameters were left at their defaults: 5000 streamline samples, a step length of 0.5 mm, and a curvature threshold of 0.2. At the individual level, the probabilistic connectivity maps were first normalized by dividing by the corresponding waytotal value, which is the total number of generated tracts. The number of generated tracts for each individual was related to the size of the seed (i.e., the body of CC). The normalized probability maps were then thresholded by 5 % of the samples to remove the voxels with low probability of connection (Leh et al. 2006). In the next step, all of the traced white matter pathways were aligned using the TBSS method for non-FA images and then binarized and averaged across all subjects. To define a mask of CC fibers, the group probability map of the tracts was finally thresholded at the group level by keeping the pathways that were present in 90 % of the cases. This mask was later used as a region of interest for small volume correction in higher level statistical analysis of the white matter structure in association with ROBO1 SNPs.
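The normalization, thresholding, binarization, and group-level overlap steps can be sketched on toy data as follows. This is an illustration under stated assumptions, not the pipeline that was run: maps are represented as flat lists of voxel values already aligned across subjects, the 5 % threshold is assumed to apply to the waytotal-normalized values, and the function names and numbers are hypothetical.

```python
def subject_mask(streamline_counts, waytotal, threshold=0.05):
    """Normalize one subject's connectivity map by waytotal, then binarize."""
    normalized = [count / waytotal for count in streamline_counts]
    return [1 if value >= threshold else 0 for value in normalized]


def group_mask(subject_masks, min_fraction=0.9):
    """Keep voxels present in at least min_fraction of the subjects."""
    n_subjects = len(subject_masks)
    n_voxels = len(subject_masks[0])
    overlap = [sum(m[i] for m in subject_masks) / n_subjects
               for i in range(n_voxels)]
    return [1 if fraction >= min_fraction else 0 for fraction in overlap]


# Toy data: 10 subjects, 3 voxels, waytotal of 1000 for every subject.
masks = [subject_mask([500, 200, 0], 1000) for _ in range(8)]
masks += [subject_mask([500, 10, 0], 1000) for _ in range(2)]
print(group_mask(masks))  # [1, 0, 0]
```

In this toy case the second voxel passes the individual threshold in only 8 of 10 subjects, so it is excluded from the group mask.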
Probabilistic fiber tracking for segmenting CC
Five different cortical regions of interest, including anterior frontal, superior frontal, parietal, temporal, and occipital cortex, were selected bilaterally as target regions for probabilistic fiber tracking of the CC, in order to divide this large white matter tract into smaller segments. These cortical regions were defined based on the Harvard–Oxford cortical atlas, and the body of CC was used as the seed region for initiating the fiber tracking.
In the next step, the white matter fibers found from all five regions of interest were thresholded at 5 % of the maximum values to exclude the tracts with a probability of connection lower than the threshold. After segmenting the CC pathways based on the probabilistic fiber tracking of the CC, the averages of two different indices were computed in these five segmented white matter tracts. The first index was the probability of connection to each cortical region, which indicates the structure of white matter pathways (such as number, thickness, size, and the myelination of axons). The second was FA, which reflects the organization and packing of the axons as well as myelination.
Cortical thickness measurements
To assess the association of ROBO1 SNPs with thickness of cortex, the cortical thickness of the structural MRI data was computed using the automatic longitudinal stream in FreeSurfer (Reuter et al. 2012). All structural data were first registered to a within-subject template (Reuter et al. 2010; Reuter and Fischl 2011). After applying several processing steps (Dale et al. 1999; Fischl and Dale 2000), including skull stripping, template transformation, and atlas registration, the images were segmented into white matter, gray matter, and pial surface based on intensity and neighborhood voxel restrictions. Thickness of cortex was computed as the distance between the white matter surface and the pial surface. The cortical thickness of the cortical regions of interest (including the left and right anterior frontal, superior frontal, parietal, temporal, and occipital) was then calculated using the workflow described in http://surfer.nmr.mgh.harvard.edu/fswiki/VolumeRoiCorticalThickness.
Statistical analyses
The segmented white matter images were analyzed by higher level SPM analysis, using a flexible factorial design (http://www.fil.ion.ucl.ac.uk/spm/software/spm8), to assess the association of ROBO1 SNPs with white matter structure. In the first phase of assessment, all 20 SNPs were entered separately as a main factor in the model. Subjects and testing time-points were also included as factors in the flexible factorial model to account for the repeated measures. Age, gender, handedness, and total white matter volume were used as covariates, and the interactions of SNP, as the main factor, with age and gender were also added. Two of the SNPs (rs17396958 and rs1393375) were significantly associated with white matter density in the posterior part of the corpus callosum. In the second phase, we analyzed the 28 SNPs selected within and between the two haplotype blocks of these two SNPs. The exploratory analysis was done within the white matter masked by the group probability map of the CC tracts, with non-stationary cluster extent correction, at FDR-corrected cluster level (p value of 0.05). We applied a Bonferroni correction for the number of SNPs tested (28 SNPs) and, accordingly, set the threshold of significant p values at 0.0018.
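The Bonferroni threshold quoted above follows directly from dividing the nominal significance level by the number of SNPs tested, as this short check shows (variable names are illustrative):

```python
alpha = 0.05   # nominal significance level
n_snps = 28    # SNPs tested in the second phase

bonferroni_threshold = alpha / n_snps
print(round(bonferroni_threshold, 4))  # 0.0018
```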
After fiber tracking and segmenting the CC into five parts, the mean values of two indices (probability of connection and FA) were computed in these five segments. These measures were then tested for association with the SNPs that were significantly associated with white matter density within the CC in the second phase of the analysis, using linear regression in IBM SPSS Statistics 21.0 software. For each test, the brain measure was included as the dependent variable, and age, gender, handedness, and genotype were included as independent variables.
The same linear regression analyses were also performed for the measures of cortical thickness in all five cortical regions of interest, using the thickness measures as the dependent variable and the same independent variables as described above.
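The regressions described above were run in SPSS; the following sketch only illustrates the form of such a model with a small ordinary least squares routine. Several things here are assumptions rather than details from the study: genotype is coded additively as 0, 1, or 2 copies of an allele, only age and genotype are used as predictors to keep the example short (further covariates such as gender and handedness would be added as extra columns), and the data are synthetic.

```python
def ols(rows, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y.

    rows: list of predictor lists (without intercept); y: list of outcomes.
    Returns [intercept, b1, b2, ...].
    """
    X = [[1.0] + [float(v) for v in r] for r in rows]
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * yi for x, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Synthetic data: (age in years, genotype coded 0/1/2), noise-free outcome.
predictors = [(6, 0), (10, 1), (14, 2), (18, 0), (22, 1), (25, 2)]
outcome = [0.4 + 0.01 * age + 0.05 * g for age, g in predictors]
coefficients = ols(predictors, outcome)
print([round(c, 3) for c in coefficients])  # [0.4, 0.01, 0.05]
```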