Autism spectrum disorders (ASDs) encompass a heterogeneous group of clinical descriptors whose core features include deficits in cognition, communication and social acuity, coupled with stereotypical behaviors. Currently, 1 in 88 children in the United States has an ASD diagnosis, with boys being affected at a ratio of 5:1 compared to girls [1]. The etiology of ASDs is, with very few exceptions, unknown, although a growing number of clinical and basic research studies have implicated immunological abnormalities as being associated with, and potentially responsible for, the cognitive and behavioral deficits seen in children with ASD [2–4].

A judicious and efficient way to identify genetic variation predisposing to complex diseases is the application of a hypothesis-driven framework that incorporates prior biological knowledge. Unlike agnostic genome-wide association studies (GWAS), this approach narrows the hypothesis space to provide a more focused and powerful examination of the data. Indeed, several studies have shown that a hypothesis-driven approach of selecting candidate genes increases the reliability and likelihood of finding genes that are truly associated with disease [5, 6]. In accord with these considerations, we tested for associations between variants in immune-related genes and ASD.


We used Ingenuity Pathway Analysis (IPA) [7], a bioinformatics database, to search for molecules with known immune function. IPA offers the largest and most comprehensive set of functional annotations, integrating manually curated data from other databases, and a broad range of scientific literature. Briefly, we searched for “immune” under Functions and Diseases, exported the molecule annotations, and used MatchMiner [8] to get the positions of the corresponding 2,012 immune-function genes (Additional file 1: Table S1).

We then used the genotypic data available at the Autism Genetic Resource Exchange (AGRE) [9] repository to perform a family-based association analysis between variants in those 2,012 immune loci and ASD. These data have previously been used in several GWAS [10–12], and the families used in our analysis are, with the exception of minor differences due to slightly different quality control criteria, the same ones as reported in Ma et al. [10], Wang et al. [11] and Weiss et al. [12]. These data and families also overlap with those reported by Voineagu et al. [13], who, as part of their expression profiling study, used the AGRE data to search for evidence of a genetic enrichment of immune loci. We searched for SNPs in a region including 5 kb up- and downstream of each immune gene. After applying quality control filters, we identified 22,904 SNPs genotyped in the AGRE collection of 1,510 trios on the Illumina Hap550 platform. We tested these SNPs for association with ASD using the standard Transmission Disequilibrium Test (TDT), as implemented in PLINK v1.07 [14]. In addition, given that the 1,510 trios are part of 1,057 independent nuclear families, we also used the Pedigree Disequilibrium Test (PDT), which uses data from related nuclear families and discordant sibships from extended pedigrees [15]. The following exclusion filters were applied: minor allele frequency (MAF) < 0.01, SNP genotyping missing rate > 10%, individuals with > 10% missing genotypes, Hardy-Weinberg Equilibrium (HWE) P-value < 0.001 (founders), and Mendelian error rate > 5% per family and > 4% per SNP.
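To illustrate the logic of the TDT, the following is a minimal sketch of the McNemar-type statistic computed for a single biallelic SNP from the transmissions of heterozygous parents. It is not PLINK's implementation; the function name `tdt` and the example counts are hypothetical and chosen only for illustration.

```python
import math


def tdt(transmitted: int, untransmitted: int):
    """Basic TDT for one biallelic SNP (illustrative sketch only).

    transmitted   -- heterozygous parents who passed the tested allele
                     to the affected child (b)
    untransmitted -- heterozygous parents who passed the other allele (c)
    Returns (chi-square with 1 df, P-value, transmission odds ratio).
    """
    n = transmitted + untransmitted
    if n == 0:
        raise ValueError("no informative (heterozygous-parent) transmissions")
    # McNemar-type statistic: (b - c)^2 / (b + c)
    chi2 = (transmitted - untransmitted) ** 2 / n
    # Upper tail of a 1-df chi-square: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    # Ratio of transmitted to untransmitted alleles
    odds_ratio = transmitted / untransmitted if untransmitted else float("inf")
    return chi2, p_value, odds_ratio


# Hypothetical counts, not taken from the AGRE data
chi2, p_value, odds_ratio = tdt(60, 40)
print(f"chi2 = {chi2:.2f}, P = {p_value:.4f}, OR = {odds_ratio:.2f}")
```

Because only transmissions from heterozygous parents contribute, the comparison is made within families, which is why the test is robust to population substructure.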

Since we hypothesized that children with ASD may be enriched for risk loci in immune genes, we tested for a potential enrichment of significant P-values for SNPs in immune-related genes. We tested for an enrichment of association signals in these genes using INRICH [16], which corrects the empirical gene-set P-value using bootstrapping-based re-sampling. In addition, we compared the distribution of P-values in the linkage-disequilibrium (LD)-pruned immune gene set with that of random LD-pruned SNPs, using a threshold of r² > 0.2 for LD-based SNP pruning. We did not observe an enrichment of immune-gene associations with either test (P = 1.0 and P = 0.45, respectively).
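The LD-based pruning step can be sketched as a greedy filter on the squared genotype correlation. This is a simplified illustration under stated assumptions: genotypes are coded as minor-allele counts (0, 1, 2) with no missing values, all SNP pairs are compared rather than only those within a sliding window, and the names `r_squared`, `ld_prune` and the toy genotypes are hypothetical. It is not the procedure implemented in PLINK or INRICH.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors (0/1/2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        # Monomorphic SNP: correlation is undefined, treat as uncorrelated
        return 0.0
    return sxy * sxy / (sxx * syy)


def ld_prune(genotypes, threshold=0.2):
    """Greedily keep SNPs whose r^2 with every kept SNP is <= threshold."""
    kept = []
    for name, calls in genotypes.items():
        if all(r_squared(calls, genotypes[k]) <= threshold for k in kept):
            kept.append(name)
    return kept


# Toy genotypes for six individuals (hypothetical, for illustration only)
toy = {
    "snpA": [0, 1, 2, 1, 0, 2],
    "snpB": [0, 1, 2, 1, 0, 2],  # identical to snpA, so r^2 = 1
    "snpC": [2, 0, 1, 1, 0, 2],
}
print(ld_prune(toy))
```

In this toy example, snpB is removed because it is in complete LD with snpA, whereas snpC falls below the r² threshold and is retained.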

In order to adjust for multiple testing we applied a Benjamini and Hochberg false discovery rate (FDR) correction [17] to the 22,904 SNPs analyzed (P-FDR). Only three SNPs remained statistically significant (P-FDR < 0.05) (Table 1). Under an additive model, and at alpha = 1.0E-05, this sample has >80% power to detect the associations reported in Table 1.
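For readers unfamiliar with the Benjamini and Hochberg procedure, the adjustment can be sketched as follows. This is a minimal illustration, not the code used in the analysis; the function name `benjamini_hochberg` and the example P-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, scaling each by m / rank and
    # enforcing that adjusted values never increase as the rank decreases
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P-values for four SNPs
raw = [0.01, 0.04, 0.03, 0.005]
for p, q in zip(raw, benjamini_hochberg(raw)):
    print(f"P = {p:.3f}  ->  P-FDR = {q:.3f}")
```

A SNP is declared significant when its adjusted value falls below the chosen FDR level, which is 0.05 in this study.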

Table 1 Most significant immune loci from the association analyses

Finally, in order to fine-map the novel loci, we used BEAGLE v3.3.2 [18] and the 1000 Genomes Project Phase 1 (version 3) reference panel to impute all SNPs in the regions of the top genes.

Novel immune genes

The lack of an enrichment of significant associations in immune genes corroborates the findings of Voineagu et al. [13], who used a complementary approach to test for evidence of a genetic component for the up-regulation of immune response genes in the autistic brain using the same AGRE genotype data. Nevertheless, we uncovered a few associations in immune-related genes that remained significant after FDR adjustment.

We observed a significant association in the CD99 molecule-like 2 gene (CD99L2) (rs11796490, P = 4.01 × 10⁻⁶, OR = 0.68 (0.58-0.80)) (Table 1). An imputed variant in LD with this genotyped SNP showed even stronger association (rs7880807, P = 8.26 × 10⁻⁷, OR = 0.64 (0.53-0.76)) (r² = 1.0 in CEU from HapMap release 21) (Figure 1). As is evident from Figure 1, other variants in this gene also show evidence of association with ASD. Although more modest, these associations with neighboring SNPs corroborate the association observed at rs11796490, which is, therefore, less likely to be spurious. This variant is located in the first intron of CD99L2. The product of this gene functions as an adhesion molecule during leukocyte extravasation, in particular at the diapedesis step.

Figure 1 Regional plots of immune ASD loci. Genotyped (circles) and imputed (triangles) SNPs are plotted with their P-values (as -log10 values) as a function of genomic position (Human Genome Build 18) within a region surrounding the most significant SNP (purple color). Recombination rates from the HapMap Phase II CEU are plotted in blue to reflect the regional LD structure. In each region, the index SNP is represented by a large purple symbol, and the color of all other SNPs indicates LD with the index SNP based on pairwise r² values from HapMap CEU (red, r² > 0.8; orange, r² = 0.6 to 0.8; green, r² = 0.4 to 0.6; light blue, r² = 0.2 to 0.4; dark blue, r² < 0.2). Known human genes in the UCSC Genome Browser are shown at the bottom.

A variant in the first intron of the jumonji AT rich interactive domain 2 (JARID2) gene showed association with ASD (rs13193457, P = 2.71 × 10⁻⁶, OR = 0.61 (0.49-0.75)) (Table 1). Despite having a borderline Mendelian error rate, this SNP is of interest. This gene has previously been implicated in schizophrenia [19, 20] and, more recently, was reported in a GWAS for ASD that included the same AGRE dataset used here [12]. The variant reported by Weiss et al. (rs7766973) does not show LD with rs13193457 (r² = 0.13 in CEU from 1000 Genomes Pilot 1). The jumonji domain functions by removing methyl marks on histones that are associated with gene regulation [21]. JARID2 is required for neural tube formation and is essential for normal heart development and function. It also acts as a transcriptional repressor of ANF by binding to both GATA4 and NKX2-5 and repressing their transcriptional activator activities.

An intronic variant in the thyroid peroxidase gene (TPO) also showed association with ASD (rs1514687, P = 5.72 × 10⁻⁶, OR = 1.46 (1.24-1.72)). This association is corroborated by those of neighboring SNPs (Figure 1). TPO has been associated with hypothyroidism [22]. It encodes a membrane-bound glycoprotein that acts as an enzyme and plays a central role in thyroid gland function. The protein functions in the iodination of tyrosine residues in thyroglobulin and in phenoxy-ester formation between pairs of iodinated tyrosines to generate the thyroid hormones thyroxine and triiodothyronine. Mutations in this gene are associated with several disorders of thyroid hormonogenesis, including congenital hypothyroidism, congenital goiter and thyroid hormone organification defect IIA. Multiple transcript variants encoding distinct isoforms have been identified for this gene.


Despite the reported high heritability of ASDs, the identification of genetic risk factors predisposing to these disorders has been difficult. To date, only a small number of loci are considered established (that is, have met genome-wide significance and have been replicated in independent cohorts). GWAS are agnostic by design, and increased statistical power can be achieved with more focused hypotheses. Given the increasing speculation about a significant role for immune genes in the etiology and pathogenesis of ASD, we undertook a search for genetic variation associated with ASD in genes with immune functions. We selected this hypothesis-driven candidate gene approach, based on prior knowledge, because it increases the power, reliability and likelihood of finding genes that are associated with disease [5, 6].

A lack of enrichment of genetic associations in immune-related genes has recently been reported [13] in the same dataset we used, and this conclusion is supported here using another statistical approach. The associations we uncovered in immune-related genes and the observed dysregulation of immune genes in ASD [13] suggest that variation in a limited number of immune-function genes may be responsible for the observed up-regulation of their downstream immune targets.

The transmission/disequilibrium test (TDT) is largely robust to population substructure because it is a within-family comparison. The individuals in these data are predominantly of European descent (76%), with Asian (3%) and African (2%) ancestry also represented; the Autism Genetic Resource Exchange (AGRE) repository lists a further 14% and 5% of the individuals as being of unknown or admixed ancestry, respectively. One limitation of using the TDT in our study is the lack of total independence between the trios analyzed: the 1,510 trios used in the TDT are part of 1,101 independent nuclear families. The correlation among children in the same family can inflate the type I error probability. We therefore also applied the PDT, which is likewise a within-family test that is robust to population substructure, and the PDT analysis corroborates the TDT analysis.


SNPs within the CD99 molecule-like 2 (CD99L2) gene, the jumonji AT rich interactive domain 2 (JARID2) gene, and the thyroid peroxidase (TPO) gene were associated with ASD after adjustment for multiple comparisons. Understanding how these genetic factors might contribute to pathogenesis should ultimately lead to important opportunities for discovering therapeutic targets that can be used to treat ASD.