General population samples
GCTA and subsequent genome-wide analysis were conducted in children from ALSPAC, a UK population-based longitudinal pregnancy-ascertained birth-cohort (estimated birth date: April 1991–December 1992) (Boyd et al. 2013; Fraser et al. 2013), which is representative of the general population (~96 % White mothers). The initial cohort included 14,541 pregnancies and additional children eligible using the original enrolment definition (i.e. based on the same delivery dates) were recruited up to the age of 18 years, increasing the total number of pregnancies to 15,247. Information on children is available from questionnaires, clinical assessments, linkage to health and administrative records as well as biological samples including genetic and epigenetic information. Ethical approval was obtained from the ALSPAC Law-and-Ethics Committee (IRB00003312) and the Local Research Ethics Committees. The study website contains details of all available data (http://www.bris.ac.uk/alspac/researchers/data-access/data-dictionary).
Further GCTA, twin analyses, and a follow-up study of selected signals from the genome-wide screen in ALSPAC were carried out in TEDS, a large longitudinal sample of twins born in England and Wales between 1994 and 1996 (Haworth et al. 2013). The collected measures focus on cognitive and behavioural development, including difficulties in the context of normal development (www.teds.ac.uk). TEDS began when multiple births were identified from birth records and the families were invited to take part in the study; 16,810 pairs of twins were originally enrolled in TEDS. More than 10,000 of these twin pairs remain enrolled in the study to date. DNA has been collected for more than 7,000 pairs, and genome-wide genotyping data for two million DNA markers are available for around 3,500 individuals. Information is available on the twins using a combination of parent, teacher, and child rated questionnaire measures, home visits, linkage of records and online tests of cognition and behaviour. The TEDS families have taken part in studies roughly once every 2 years since the twins were 18 months of age. Ethical approval for each stage of TEDS has been obtained from the Institute of Psychiatry Ethics Committee, and informed consent was collected from the parents for each assessment. Further details about the composition and representativeness of the sample, and an overview of the measures collected are available elsewhere (Haworth et al. 2013).
Measurement of peer problems
Problematic peer relationships in ALSPAC and TEDS children were measured with the parent-completed 5-item peer problems subscale of the Strengths and Difficulties Questionnaire (SDQ; Goodman 1997). The SDQ is a widely used (http://www.sdqinfo.org/py/sdqinfo/f0.py), short behavioural screening instrument applicable to children and adolescents ranging from 4 to 16 years (Goodman 1997). It was developed as a screening instrument to predict several childhood developmental conditions (Goodman et al. 2003). The reliability of the SDQ peer problem scale is sufficient (internal consistency as measured by Cronbach’s α = 0.57) (Goodman 2001). The validity of the SDQ has been assessed by how strongly the subscales are associated with the presence of psychiatric disorders (Goodman 2001), and high SDQ scores have been associated with a substantial increase in psychiatric risk. For the peer problem subscale, there was a prevalence of a DSM-IV diagnosis of 6.4 % in the low-risk group and 31.3 % in the high-risk group (i.e. in the extreme 10 % of the population) (Goodman 2001). Different SDQ scoring profiles (including items of the peer problem scale) have been shown in patients with different clinical diagnoses, including, for example, elevated levels of peer problems and emotional difficulties, and fewer prosocial behaviours in children with ASD compared to children with ADHD (Iizuka et al. 2010).
The peer problem subscale includes the items: (I) “Rather solitary, tends to play alone”; (II) “Has at least one good friend”; (III) “Generally liked by other children”; (IV) “Picked on or bullied by other children”; and (V) “Gets on better with adults than with other children”. Each item was rated as “not true” (0), “somewhat true” (1) or “certainly true” (2) and items (II) and (III) were reverse-coded (Goodman 1997). All items were eventually summed to give a final peer problem score (score-range 0–10) with higher scores reflecting more peer-related problems. Quantitative mother-reported SDQ peer problem scores in ALSPAC children and adolescents were measured at 4, 7, 8, 10, 12, 13 and 17 years of age, and in TEDS participants parent-reported scores are available at 4, 7, 9 and 11 years (Table 1). Correlations between the scales at different ages showed modest to moderate stability in both ALSPAC (Spearman’s rho (ρ): 0.22 < ρ < 0.57; Supplementary Table S1) and TEDS (Spearman’s rho: 0.27 < ρ < 0.49; Supplementary Table S2). As expected, assessments closer in age were more strongly correlated than those that spanned the entire developmental period (Supplementary Tables S1 and S2).
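The scoring rule described above can be illustrated with a minimal sketch. The function name, the item ordering (I to V as listed in the text) and the input layout are assumptions made for illustration; the handling of missing items is not described in the text and is therefore not covered.

```python
# Zero-based positions of items (II) and (III), which are reverse-coded.
# The ordering I..V mirrors the list in the text (an assumption about data layout).
REVERSE_CODED = {1, 2}


def peer_problem_score(ratings):
    """Sum five item ratings (each 0, 1 or 2) into a 0-10 peer problem score."""
    if len(ratings) != 5:
        raise ValueError("expected exactly five item ratings")
    total = 0
    for position, rating in enumerate(ratings):
        if rating not in (0, 1, 2):
            raise ValueError("each rating must be 0, 1 or 2")
        # Reverse-code positively worded items so that 2 always means more problems.
        total += (2 - rating) if position in REVERSE_CODED else rating
    return total
```

For example, a child rated “certainly true” on “Has at least one good friend” and “Generally liked by other children”, and “not true” on the remaining items, receives a score of 0.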
Twin analyses were used to estimate the relative contribution of genetic and environmental influences to individual differences in quantitative peer problem scores. Twin intraclass correlations were calculated (Shrout and Fleiss 1979), providing an initial indication of the relative contributions of additive genetic (A), shared environmental (C), and non-shared environmental (E) factors. Additive genetic influence, also commonly known as heritability, is estimated as twice the difference between the identical and fraternal twin correlations. In twin analysis, additive genetic influences (A) include all additive genetic effects from both rare and common variants. GCTA, by contrast, provides a lower-limit estimate of heritability (A), because genetic influences due to causal variants that are not highly correlated with the common SNPs on genotyping arrays, including rare variants, are not captured (Yang et al. 2010; Plomin et al. 2013). The contribution of the shared environment, which makes members of a family similar, is estimated as the difference between the identical twin correlation and heritability. Although the influence of shared environment (C) was non-significant (see “Results”), twin analysis was carried out using the full ACE model to allow for comparison with GCTA estimates. Removing the influence of shared environment (C) from the analysis model could have inflated the effect of additive influences (A) and thus affected the comparison with additive influences (A) as provided by GCTA (Trzaskowski et al. 2013). The contribution of the non-shared environment, i.e. environments specific to individuals, is estimated as one minus the identical twin correlation, because non-shared environments are the only source of variance making identical twins different. Estimates of the non-shared environment also include measurement error.
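The rules for deriving initial A, C and E estimates from twin intraclass correlations can be written out as a short sketch. The function name and the example correlations are hypothetical and are not taken from the study data; the sketch only covers the correlation-based first approximation, not the model-fitting analyses.

```python
def ace_from_twin_correlations(r_mz, r_dz):
    """Initial ACE estimates from identical (MZ) and fraternal (DZ) twin correlations."""
    a = 2.0 * (r_mz - r_dz)  # heritability: twice the MZ-DZ difference
    c = r_mz - a             # shared environment: MZ correlation minus heritability
    e = 1.0 - r_mz           # non-shared environment (includes measurement error)
    return a, c, e


# Hypothetical correlations, for illustration only.
a, c, e = ace_from_twin_correlations(r_mz=0.70, r_dz=0.40)
```

With these illustrative inputs the estimates are approximately A = 0.60, C = 0.10 and E = 0.30, and the three components sum to one by construction.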
Maximum likelihood structural equation model-fitting analyses were carried out to allow for more complex analyses of the relative contribution of A, C and E (Rijsdijk and Sham 2002) and standard twin model-fitting analyses were conducted using the Mx software (Neale et al. 2006). All twin analyses were carried out using untransformed peer problem scores at 4, 7, 9 and 11 years of age that were ascertained in up to 7,366 TEDS twin pairs. Detailed information on the analysed twin sample can be found in Table 1.
Multivariate (longitudinal) twin analyses were used to go beyond estimating the cross-sectional importance of genetic and environmental factors and to consider the degree to which genes and environments important at one age are also important at later ages (Neale et al. 2006). We used a standard Cholesky decomposition, converted to the mathematically equivalent correlated factors solution, to estimate the degree of genetic and environmental overlap between our longitudinal measures. In univariate twin analyses, we break down the phenotypic variance into genetic and environmental sources. The exact same logic is used in multivariate analyses to decompose the covariance between traits (or, as in the present case, between the ‘same’ trait at different ages) into genetic and environmental sources. The main outcome measures from these twin analyses are indices of genetic, shared and non-shared environmental correlations between our measured peer problems scales at ages 4, 7, 9 and 11. These correlations can range from −1 to +1, and the point estimates are independent of the magnitude of the genetic and environmental influence on each trait. Therefore, it is possible to have, for example, a high shared environmental correlation between ages even when the shared environmental influence at each age is small in magnitude, although the confidence intervals for correlations based on small proportions of variance are typically large. Such a result would mean that of the limited shared environmental variance present at each age, most of this variance also influences the later age. It is, therefore, important to interpret genetic and environmental correlations within the context of the cross-sectional magnitude of the A, C and E factors.
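The conversion from a Cholesky decomposition to the correlated factors solution can be sketched as follows. This is a generic illustration of the algebra, not the Mx script used in the study; the function names and path coefficients are hypothetical. Given a lower-triangular matrix of path coefficients L for one variance component (A, C or E), the component covariance matrix is L·Lᵀ, and standardising it gives the correlations between ages.

```python
import math


def covariance_from_cholesky(paths):
    """Return L * L^T for a lower-triangular matrix of path coefficients."""
    n = len(paths)
    return [[sum(paths[i][k] * paths[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]


def correlations_from_covariance(cov):
    """Standardise a covariance matrix (all variances must be non-zero)."""
    n = len(cov)
    return [[cov[i][j] / math.sqrt(cov[i][i] * cov[j][j]) for j in range(n)]
            for i in range(n)]


# Hypothetical genetic paths for two ages; the second row is then scaled down
# to mimic a much smaller genetic variance at the later age.
paths_large = [[0.6, 0.0], [0.3, 0.4]]
paths_small = [[0.6, 0.0], [0.03, 0.04]]
r_large = correlations_from_covariance(covariance_from_cholesky(paths_large))[0][1]
r_small = correlations_from_covariance(covariance_from_cholesky(paths_small))[0][1]
```

Both `r_large` and `r_small` are approximately 0.6, which illustrates that the point estimate of the correlation does not depend on how much variance the component explains at each age.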
Genotyping and imputation
ALSPAC children were genotyped using the Illumina HumanHap550 quad-chip array. Genotypes were cleaned as previously described using standard quality control methods (Paternoster et al. 2012). In summary, single nucleotide polymorphisms (SNPs) with a minor allele frequency (MAF) <1 %, a call rate <95 % or evidence for violations of Hardy–Weinberg equilibrium (P < 5.0 × 10⁻⁷) were excluded. Individual participant samples were removed on the basis of sex mismatches, minimal or excessive heterozygosity, disproportionate levels of individual missingness, cryptic relatedness, insufficient sample replication and non-European ancestry. Using 464,311 directly genotyped SNPs, genotypes for 8,365 independent individuals (irrespective of available phenotypic data) were imputed to HapMap CEU (Utah residents with Northern and Western European ancestry from the Centre d’Etude du Polymorphisme Humain collection) individuals (Rel 22) using MACH (Li et al. 2010).
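The SNP-level exclusion criteria listed above (MAF, call rate and Hardy–Weinberg equilibrium) can be illustrated with a minimal sketch based on genotype counts. The function name and argument layout are hypothetical. The text does not state which Hardy–Weinberg test was applied, so the 1-df chi-square goodness-of-fit test used here is an assumption; exact tests are a common alternative.

```python
import math


def snp_passes_qc(n_aa, n_ab, n_bb, n_missing,
                  maf_min=0.01, call_rate_min=0.95, hwe_p_min=5.0e-7):
    """Apply MAF, call-rate and HWE filters to genotype counts for one SNP."""
    n_called = n_aa + n_ab + n_bb
    if n_called == 0:
        return False
    call_rate = n_called / (n_called + n_missing)
    freq_a = (2 * n_aa + n_ab) / (2 * n_called)
    maf = min(freq_a, 1.0 - freq_a)
    if maf < maf_min or call_rate < call_rate_min:
        return False
    # Assumed test: chi-square goodness of fit against HWE expectations (1 df).
    expected = [n_called * freq_a ** 2,
                2 * n_called * freq_a * (1.0 - freq_a),
                n_called * (1.0 - freq_a) ** 2]
    chi2 = sum((obs - exp) ** 2 / exp
               for obs, exp in zip((n_aa, n_ab, n_bb), expected))
    # Upper-tail probability of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return p_value >= hwe_p_min
```

The default thresholds mirror the values quoted for ALSPAC; other cohorts would pass their own thresholds as arguments.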
TEDS children were genotyped at the Affymetrix service laboratory using the Affymetrix GeneChip 6.0 and data were cleaned as previously described (Davis et al. 2014). In brief, 3,665 DNA samples from unrelated children (one member of a twin pair) were successfully genotyped. Individual samples were excluded because of low call rate or heterozygosity outliers, intensity outliers, ancestry outliers, relatedness/duplicates, gender mismatches or low concordance (<90 % after re-genotyping on a panel of 30 SNPs using Sequenom). SNPs were excluded based on minor allele frequency (MAF < 1 %) and violations of Hardy–Weinberg equilibrium (P < 10⁻⁶). SNPs with greater probability of a null call were down-weighted in the analysis, thresholding at 0.9. Imputation was carried out using the IMPUTE2 software on clean genotype data by a two-stage approach with both a haploid reference panel (HapMap2 and HapMap3 SNP data on the 120 unrelated HapMap CEU trios (Rel 22)) and a diploid reference panel (5,175 WTCCC2 controls) as previously described (Davis et al. 2014).
To increase the effective sample size and power of our analysis, we used imputed genotype data for the genetic association analysis. This allows the exchange and combination of genotype data in a uniformly exchangeable format (de Bakker et al. 2008), even when genotypes are collected using different genotyping platforms.
In addition, ancestry-specific principal components were calculated with Eigenstrat (Price et al. 2006) within each cohort (using raw genotypes), to correct for subtle differences in population structure.
All reported LD-measures are based on HapMap CEU (Rel22).
Estimation of GCTA-heritability
Using GCTA (Yang et al. 2010), we estimated the proportion of additive phenotypic variation explained by all genotyped SNPs together, both in ALSPAC (at 4, 7, 8, 10, 12, 13 and 17 years of age) and in TEDS (at 4, 7, 9, and 11 years of age). Pertinent to this study, GCTA was carried out using untransformed peer problem scores in each cohort and the most likely imputed as well as direct genotypes from autosomal SNPs (ALSPAC: N_SNPs = 2,449,665, MAF ≥ 0.01, imputation accuracy MACH-R² > 0.3; TEDS: N_SNPs = 1,588,650, MAF ≥ 0.01, INFO score > 0.7). For sensitivity analysis, GCTA was also performed using peer problem scores adjusted for age, sex and the two most significant principal components, in addition to adjusted and subsequently rank-transformed scores. GCTA estimates from ALSPAC and TEDS were combined using fixed-effects inverse-variance meta-analysis, and evidence for overall heterogeneity was tested using Cochran’s Q-test.
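The combination of cohort-specific estimates can be sketched as a fixed-effects inverse-variance meta-analysis with Cochran’s Q. The function names and the numerical inputs are hypothetical and are not study results. The P value helper is restricted to the two-cohort case (1 degree of freedom), which matches the ALSPAC and TEDS setting and avoids a dependency on a statistics library.

```python
import math


def fixed_effects_meta(estimates, standard_errors):
    """Inverse-variance weighted pooled estimate, its SE, and Cochran's Q."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, estimates))
    return pooled, pooled_se, q


def cochran_q_pvalue_two_studies(q):
    """Upper-tail chi-square probability for Q with 1 df (two studies only)."""
    return math.erfc(math.sqrt(q / 2.0))


# Hypothetical heritability estimates and standard errors from two cohorts.
pooled, pooled_se, q = fixed_effects_meta([0.10, 0.20], [0.05, 0.05])
p_heterogeneity = cochran_q_pvalue_two_studies(q)
```

With these illustrative inputs the pooled estimate is approximately 0.15, and Q is approximately 2.0 on 1 degree of freedom.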
We also used bivariate GCTA (Lee et al. 2012) to estimate the extent to which the same genes contribute to the observed phenotypic correlation between two variables. These estimations are based on the genetic covariance between untransformed peer problem measures at different ages, which is due to common measured genetic variation.
Genetic association analysis
Selecting peer problem scores with the highest GCTA-heritability during development (10, 13 and 17 years of age), we conducted three single time-point GWASs on ~2.45 million (N = 2,449,665) common imputed and genotyped SNPs (MAF ≥ 0.01, imputation accuracy MACH-R² > 0.3) within ALSPAC. Association analyses were performed using a quasi-Poisson regression model, which can accommodate overdispersion (Faraway 2006) (R ‘stats’ library). Specifically, counts of peer problems were regressed on allele dosage as well as age, sex and the two most significant ancestry-informative principal components (to correct for subtle differences in population structure (Price et al. 2006)). Regression estimates (β) thus represent changes in log-counts of peer problems per effect allele, based on SNP dosage scores. All single time-point findings were subjected to genomic control (GC)-correction (Devlin and Roeder 1999). Follow-up analyses were carried out in TEDS using a similar quasi-Poisson regression framework as described for ALSPAC including two ancestry-informative principal components.
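The GC-correction step can be illustrated with a minimal sketch operating on 1-df chi-square association statistics. The function name is hypothetical, and leaving statistics unchanged when the inflation factor is below 1 is a common convention that is assumed here rather than taken from the text.

```python
import statistics

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_control(chi2_stats):
    """Return the inflation factor lambda and GC-corrected chi-square statistics."""
    lam = statistics.median(chi2_stats) / CHI2_1DF_MEDIAN
    # Assumed convention: only deflate test statistics, never inflate them.
    divisor = max(lam, 1.0)
    corrected = [stat / divisor for stat in chi2_stats]
    return lam, corrected
```

Equivalently, standard errors of the regression estimates can be multiplied by the square root of the inflation factor, which leaves the effect estimates themselves unchanged.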
Exploratory analysis of population-based association signals in two autism samples
Population-based signals were also investigated in the Autism Genetic Resource Exchange (AGRE) pedigrees and the Autism Case–Control (ACC) cohort in an exploratory search for autism QTL. Within the AGRE pedigrees, there are three diagnostic categories based on the Autism Diagnostic Interview–Revised (ADI–R) (Lord et al. 1994): Autism, Broad Spectrum or Not Quite Autism. All of them were utilised to define ‘cases’ in this study, and have been previously described in detail (Wang et al. 2009). 4,444 unique AGRE individuals from 943 families were genotyped on the Illumina HumanHap550 K BeadChip (Wang et al. 2009). Cleaned genome-wide data (Wang et al. 2009) were obtained from Autism Speaks (data set prepared by JK Lowe). Additional data cleaning steps of this multiethnic sample have been described in detail in previous publications (St Pourcain et al. 2014) including the removal of SNPs (>10 % missingness, violations of Hardy–Weinberg equilibrium (P < 0.001) and MAF <1 %) as well as the exclusion of individuals (>10 Mendelian errors, monozygotic twins, sample duplicates, individuals with >10 % missing data, individuals with known chromosomal abnormalities including Trisomy 21 and Fragile X syndrome, individuals of non-European ancestry). The final data set included 3,299 individuals (793 pedigrees) and 513,312 SNPs. Genotypes were imputed to HapMap CEU (release 22) using MaCH, excluding all imputed genotypes with a per-genotype posterior probability <0.9. Selected population-based signals were investigated with FBAT, a family-based association test (Lange and Laird 2002), using the most likely genotype call and an empirical variance for the test statistic (to account for linkage within pedigrees).
The ACC cohort includes 1,453 patients with either a positive ADI/ADI–R score or an Autism Diagnostic Observation Schedule (Lord et al. 2000) diagnosis or both, as well as 7,070 controls without a history of ASD. All individuals were genotyped on the Illumina HumanHap550 K BeadChip. The data cleaning was largely similar to the cleaning of the AGRE sample (see above) and has been described previously (Wang et al. 2009). The final clean data set included 1,204 ASD cases and 6,491 controls of European ancestry, as well as 480,530 SNPs (Wang et al. 2009). Genotype imputation was performed to HapMap CEU (release 22) using MaCH as previously reported (Wang et al. 2009). Genetic association for selected follow-up SNPs was analysed using SNPTEST by converting MaCH imputation files into SNPTEST input formats.
Genetic association analysis was conducted using de-identified genetic data. Ethical approval for the analysis of the AGRE and ACC samples was obtained through the IRB Protocol 10-007590 from the Children’s Hospital of Philadelphia.
Longitudinal modelling of DNA signals
All population-based signals in ALSPAC with tentative support for autism QTL were furthermore modelled longitudinally. For this, we used a mixed Poisson regression framework (R ‘lme4’ library), where overdispersion can be modelled through the random error part (Gelman and Hill 2007). Models included random intercept and slope, and SNP effects (i.e. allele dosages) were adjusted for sex, age, age² and two ancestry-sensitive principal components. In addition, we modelled age-specific SNP effects using SNP × age and SNP × age² interaction terms, and selected the best-fitting model based on likelihood-ratio tests. Thus, for each SNP, the final longitudinal model could include none, one (SNP × age) or two (SNP × age and SNP × age²) interaction effects. In the presence of SNP–age interaction effects, we modelled the SNP effect at different ages spanning early childhood (4 years) and later adolescence (17 years), by centering age at the respective age. We considered a SNP signal with P < 5 × 10⁻⁸ at any age (including combined effects from main and interaction effects) within the longitudinal modelling framework as genome-wide significant.
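Centering age at a given value makes the SNP main effect equal to the SNP effect at that age. The same quantity can be derived from a single fitted model as a linear combination of the fixed-effect SNP coefficients, as sketched below. The function name, coefficient values and covariance matrix are hypothetical, and the sketch assumes a model with both interaction terms; it is an algebraic illustration rather than the lme4 analysis itself.

```python
import math


def snp_effect_at_age(coefs, cov, age, centre):
    """SNP effect (log-count scale) and its SE at a given age.

    coefs: fixed effects for SNP, SNP x age, SNP x age^2 (age centred at `centre`).
    cov:   3 x 3 covariance matrix of those three estimates.
    """
    d = age - centre
    contrast = [1.0, d, d * d]
    effect = sum(c * b for c, b in zip(contrast, coefs))
    variance = sum(contrast[i] * cov[i][j] * contrast[j]
                   for i in range(3) for j in range(3))
    return effect, math.sqrt(variance)


# Hypothetical estimates from a model with age centred at 10 years.
coefs = [0.10, 0.02, -0.001]
cov = [[4.0e-4, 0.0, 0.0],
       [0.0, 1.0e-6, 0.0],
       [0.0, 0.0, 1.0e-8]]
effect_17, se_17 = snp_effect_at_age(coefs, cov, age=17, centre=10)
```

With these illustrative values the SNP effect at 17 years is approximately 0.191, and evaluating the function at the centering age returns the SNP main effect and its standard error unchanged.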