Background

Autism spectrum disorder is a complex, heterogeneous, behaviorally defined disorder with a 4:1 male:female sex ratio [1]. Although environmental factors, such as peri- and post-natal stress, have been reported to contribute to the development of autism, monozygotic twin studies, along with evidence of chromosomal abnormalities, mutations in single genes, and multiple gene polymorphisms in autistic individuals, clearly show that autism is a largely genetic disorder [2–5].

Single mutations in neuroligin 3 and 4, cell adhesion molecules present at the post-synaptic side of the synapse, and in SHANK3, a scaffolding protein found in excitatory synapses, have been described in autistic individuals [6, 7]. In the majority of cases, however, an overall lack of Mendelian inheritance suggests the involvement of multiple genes [5, 8]. Indeed, genome-wide screens and candidate gene approaches have identified a number of chromosomal regions and genes linked with autism [9–19]. For example, a strong association between autism and SLC25A12, a gene encoding the mitochondrial aspartate/glutamate carrier AGC1 expressed in neurons and in neural stem cells, has been reported in 2 separate studies [17, 19]. Similarly, an analysis of chromosome 16p revealed an association between autism and the protein kinase C beta gene (PRKCB1), which is expressed in granule cells of the brain and in B lymphocytes [20].

Although most genetic analyses to date have focused on genes expressed in the brain, the pathophysiology of autism suggests that other systems, such as the immune system and the pituitary-hypothalamic axis, may be involved [21, 22]. In some autistic individuals, for example, abnormal secretion of pro-opio-melanocortin (POMC), adrenocorticotropin (ACTH), cortisol, and beta-endorphin has been noted [22–25].

We recently performed a genome-wide linkage scan in a group of families with autism and phrase speech delay identified in the Autism Genetic Resource Exchange (AGRE) DNA repository [20]. Among the 7 genomic regions that showed a significant increase in identity-by-descent sharing in sibling pairs with autism, 4 corresponded to regions that have previously been linked to autism (chromosomes 5, 13, 16, and 17) [20]. Based on these results, in the current study we focused on chromosome 5q31 and identified 3 genes that we hypothesized could be involved in the development of autism: 1) paired-like homeodomain transcription factor 1 (PITX1), which is a key regulator of hormones within the pituitary-hypothalamic axis such as ACTH, cortisol, and beta-endorphin [22, 26, 27]; 2) histone family member H2AFY, which is involved in X-inactivation in females and is therefore a positional candidate that could explain the 4:1 male:female ratio observed in autism [1, 28]; and 3) Neurogenin 1 (NEUROG1), which is a transcription factor involved in neurogenesis [29]. Using single point association analyses and haplotype analyses, we found significant evidence for an association of autism with PITX1 but not with H2AFY or NEUROG1.

Methods

Subjects

Two hundred seventy-six families, comprising 1086 individuals including 530 affected children, were selected from the AGRE program. AGRE has been approved by an Institutional Review Board (IRB) to ensure the protection of research participants [30]. Each family had at least two affected children. No unaffected children were included in the study. In each family, affected siblings met the diagnostic criteria for autism according to the Autism Diagnostic Interview-Revised (ADI-R) [31]. No families with known genetic defects such as fragile X, Rett syndrome, or other monogenic forms of autism were included. Written informed consent was obtained from all parents included in the study. The 116 families initially used in the linkage study were included in this study [20]. In these families, the Autism Diagnostic Observation Schedule (ADOS) evaluation was performed in 48% of the affected individuals, and the concordance rate between ADI-R and ADOS was 94%.

Candidate gene, SNP selection, and genotyping

Genes were annotated according to build 35 of the NCBI database [32]. In a first step, the three candidate genes were genotyped in the same sample of 116 families that was used for the linkage study. A second step of analysis was performed using tightly linked SNPs covering the gene(s) that showed suggestive association at a nominal level in the first step. SNPs were genotyped according to the manufacturer's recommendations using oligoligation assays (SNPlex) or TaqMan (Applied Biosystems, Foster City).

Statistical analysis

Pairwise LD between SNPs was evaluated by the measure D' using the LDheatmap package. Hardy-Weinberg equilibrium (HWE) was tested in parents using the exact chi-square test [33].
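As a minimal illustration of the two quantities named above, the following Python sketch computes D' and r² for a pair of biallelic SNPs and a Hardy-Weinberg goodness-of-fit statistic. It is not the LDheatmap implementation or the exact test used in the study: it assumes haplotype frequencies are already known (with unphased genotypes they would first have to be estimated), it uses the asymptotic 1-df chi-square test in place of the exact test, and all function names and example values are hypothetical.

```python
import math

def pairwise_ld(p_ab, p_a, p_b):
    """Return (D', r^2) for two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and allele B at SNP 2
    p_a, p_b: frequencies of alleles A and B (assumed known, not estimated here)
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both SNPs must be polymorphic")
    d = p_ab - p_a * p_b
    # The maximum attainable |D| depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r2

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Asymptotic 1-df chi-square test of HWE from genotype counts.

    Returns (chi2, p_value). This is a simplification; an exact test is
    preferable for small counts or rare alleles.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical frequencies: complete LD by D' but a modest r^2.
d_prime, r2 = pairwise_ld(p_ab=0.3, p_a=0.6, p_b=0.3)
# Hypothetical parental genotype counts with a heterozygote deficit.
chi2, p_value = hwe_chi_square(30, 40, 30)
```

The first example shows why D' and r² are reported side by side: D' reaches 1 whenever one of the four possible haplotypes is absent, whereas r² approaches 1 only when the two SNPs also have similar allele frequencies.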

Single SNP and haplotype association tests were carried out using FBAT version 1.7.3 [34]. Haplotype-specific associations were also tested using the HBAT option in the FBAT package. Both single SNP and haplotype analyses were performed using the empirical variance option to account for linkage in the region studied. Nominal p-values were provided for each haplotype with more than ten informative families. First step analyses were not corrected for multiple testing because they were performed under a strong candidate gene hypothesis to identify potential association with the disease at the nominal level. However, p-values were corrected for multiple testing in the second step. Multiple hypothesis testing was controlled using the false discovery rate (FDR) approach proposed by Benjamini and Hochberg [35]. Because of the non-independence of tests, this method is conservative (but less so than the Bonferroni correction) and generally tends to underestimate statistical significance.
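For readers unfamiliar with the Benjamini-Hochberg procedure, the following Python sketch shows how FDR-adjusted p-values can be obtained from a list of nominal p-values. It is a generic illustration of the step-up adjustment, not the code used in the study; the function name and the example p-values are hypothetical and are not taken from the tables of this paper.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical nominal p-values for four markers.
nominal = [0.01, 0.04, 0.03, 0.005]
corrected = benjamini_hochberg(nominal)
```

Each adjusted value is the smallest FDR level at which the corresponding test would be declared significant, so a marker is retained at a 5% FDR when its adjusted p-value is at most 0.05.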

To estimate the relative risk (RR) of the marker(s) identified as associated with autism in the present paper, we used a conditional likelihood based method [36]. This method estimates haplotype RRs under an additive model from unphased data, can also estimate single marker (SNP) risks, and provides unbiased RRs in case of deviation from HWE.

Results

Gene specific single point association analysis

Of the 16 known genes in the 1.2 Mb region of chromosome 5q31 (Table 1), which was identified in a previous genome-wide linkage analysis for autism [20], 3 were selected as candidate genes based on function: PITX1, H2AFY, and NEUROG1 (Figure 1).

Table 1 Genes identified in a 1.2 Mb linkage region of chromosome 5q31
Figure 1 Genomic organization of the linkage region on chromosome 5q31 and haplotype heat-map of PITX1. A) Five clones in the region showed elevated identity-by-descent sharing values. The most significant p-value was observed for clone FE0DBACA4ZA08 (p = 6.40 × 10⁻⁷, position (build 34) = chr5:134,467,793–134,639,841). Orientation of gene transcription is indicated by arrows. The bottom panel shows the exon-intron structure of the PITX1 gene. Two transcripts have been described for the gene, which make use of the same start and stop codons but differ in their respective 5' and 3' UTR regions.

A two-step procedure was used to screen the candidate genes for association with autism. The average genotyping success rate was > 92%. All markers except rs474853 were in Hardy-Weinberg equilibrium (Table 2). One family was removed from the analysis because of Mendelian incompatibilities.

Table 2 Association analyses with tag-SNPs selected in candidate genes. Single point SNP analysis results of the first step analysis of PITX1, H2AFY, and NEUROG1 in 116 families from the initial linkage scan. Base positions are indicated according to build 35 of the human genome sequence. MAF, minor allele frequency; perm, permutations; SNP, single nucleotide polymorphism; HWE, Hardy-Weinberg equilibrium probability test. Association analysis was performed using the FBAT package.

Single marker analysis revealed nominally significant p-values for 3 of the 5 SNP markers selected for PITX1: rs28330 (p = 0.013), rs3805663 (p = 0.0061), and rs1700488 (p = 0.0191) (Table 2). Marker rs1700488 is positioned 4.4 kb upstream of the first exon of PITX1. Marker rs28330 is located 11 kb 3' of the last exon of PITX1. Marker rs3805663 lies between exons 3 and 4 of PITX1. Mutation screening by direct sequencing did not show any amino-acid or other mutational changes (data not shown).

To further investigate PITX1, 9 SNPs, including new markers not genotyped in the first step, were selected to fully capture the polymorphic information of the gene and were genotyped in an extended set of 276 AGRE families. Two markers showed strongly significant evidence for association (rs11959298, p = 2 × 10⁻⁴ and rs6596189, p = 1 × 10⁻⁴), and 3 other markers (rs1131611, rs6872664, rs6596188) reached significance even after correcting for multiple testing (Table 3).

Table 3 Extended SNP analysis in the PITX1 gene. In the second step, additional markers were added to the PITX1 gene analysis, and 160 further families were added to the analysis of PITX1. Thus step 2 included a total of 276 families for genotyping. Markers rs28330, rs31210, rs474853, rs3805663, rs1700488, rs13163460, rs3776203, rs6596238, rs245128, rs2249596, rs1131611, rs254550, and rs254551 were genotyped using SNPlex. Markers rs6865399, rs2292011, rs1393082, rs657223, rs7700313, and rs39882 were genotyped using TaqMan. Base positions are indicated according to build 35 of the human genome sequence. MAF, minor allele frequency; perm, permutations; SNP, single nucleotide polymorphism; HWE, Hardy-Weinberg equilibrium probability test. Association analysis was performed using the FBAT package.

Linkage disequilibrium and haplotype analysis of the PITX1 gene

Haplotype analysis was conducted to extract more inheritance information from the set of markers. As a first step, the LD structure among the 9 markers in PITX1 was evaluated (Figure 2). We observed two blocks of LD, a four-SNP block (block 1) and a two-SNP block (block 2). Block 1 is composed of SNPs that individually were not significant after correction for multiple testing. Grouping them into haplotypes did not increase significance after correction, apart from a slight tendency toward a protective effect of haplotype C-T-A-G (Table 4). The second LD block contains two SNPs that showed significant association, with p = 0.0084 (pcorrected = 0.0178) for the two haplotype alleles having a frequency greater than 0.05. This block is bordered by two SNPs (rs11959298 and rs6596189) that are in strong LD with each other (D' = 1.0 and r² = 0.98) and in milder LD (D' = 0.83 or 0.85) with markers from block 2 (Figure 2). These two SNPs displayed the strongest single point association and, combined into haplotypes, also provided strong evidence for association, with p = 0.0004 (pcorrected = 0.0017) for haplotype A-C (Z-score = 3.530). Defining allele A-C as the risk allele and homozygous carriers of allele G-T as the reference genotype, we estimated the RR at 1.59 (95% confidence interval [1.26; 2.02]) for heterozygous carriers and 2.54 ([1.58; 4.09]) for homozygous carriers.
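Two of the figures reported above can be cross-checked with simple arithmetic, sketched here in Python. The sketch rests on two assumptions that the text does not spell out: that the FBAT p-value is the two-sided normal tail probability of the Z-score, and that the additive model is additive on the log scale, so that the homozygous RR equals the square of the heterozygous RR. The function names are hypothetical, and this is a consistency check rather than a re-implementation of the conditional likelihood method [36].

```python
import math

def two_sided_p_from_z(z):
    """Two-sided p-value of a standard normal Z-score."""
    return math.erfc(abs(z) / math.sqrt(2.0))

def homozygote_rr_log_additive(rr_het):
    """Homozygous RR implied by a heterozygous RR when risk is
    additive on the log scale (assumption, see lead-in)."""
    return rr_het ** 2

# Z-score reported for haplotype A-C; the result is about 0.0004.
p_ac = two_sided_p_from_z(3.530)

# Heterozygous RR and its confidence limits as reported in the text.
rr_hom = homozygote_rr_log_additive(1.59)      # about 2.53
ci_low = homozygote_rr_log_additive(1.26)      # about 1.59
ci_high = homozygote_rr_log_additive(2.02)     # about 4.08
```

Under these assumptions the squared values fall within rounding distance of the reported homozygous estimate of 2.54 ([1.58; 4.09]), which suggests the small differences come from rounding of the published heterozygous values.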

Table 4 Results of haplotype association analysis
Figure 2 Linkage disequilibrium (LD) map of the PITX1 gene. Thirteen SNP markers within the 6.5 kb of genomic sequence span the 5' and 3' UTRs of PITX1. Pair-wise LD among SNPs was investigated using D'. Haplotype blocks were determined by identifying the first and last markers in a block, which are in strong LD with all intermediate markers. The structure and position of the PITX1 gene, the positions of the 5 SNPs from the first step genotyping, and the positions of the 9 SNPs from the second step genotyping (SNP markers in bold) are indicated. LD, linkage disequilibrium; SNP, single nucleotide polymorphism; UTR, untranslated region.

Discussion

In this study, we have found a significant association between autism and polymorphisms of PITX1, a paired-like homeodomain transcription factor involved in hormonal regulation [26, 27]. Using a two-step procedure, we initially identified evidence for association of marker rs3805663 with autism. In the second step, additional markers in the PITX1 gene were genotyped in an extended sample set of 276 families in total. Although marker rs3805663 no longer reached a significant p-value in this extended set, several additional markers showed highly significant results, with the most significant result for rs6596189 (p = 1 × 10⁻⁴). Haplotype analyses yielded a couple of highly significant pairings, although none of these showed higher significance than rs6596189 alone. Individuals homozygous or heterozygous for the risk allele were 2.69 and 1.74 fold more likely, respectively, to be autistic than individuals who were not carrying the allele.

PITX1 is a key regulator of hormonal genes in the pituitary-hypothalamic axis. Its putative involvement in autism is supported by evidence documenting abnormal levels of hormones such as ACTH, beta-endorphin, and cortisol in autistic individuals and by the fact that these hormones are downstream of PITX1 [22–27]. Deregulation of POMC and high levels of beta-endorphin in the morning, for example, have been shown to be involved in certain maladaptive behaviors, such as self-injurious behaviors, which are often seen in autistic individuals [37]. The ACTH-cortisol system, which also plays an important role in stress-related responses, is impaired in autistic individuals, in whom lower cortisol levels and higher ACTH levels have been reported [24].

Linkage between chromosome 5q31-32 and autism is consistent with previously published studies. Both the IMGSAC (1998) genome-wide linkage scan and the screen performed by Risch and colleagues found modestly elevated LOD scores in this region, and exclusion mapping analyses did not clearly exclude this region [10, 12].

Interestingly, this genomic region was recently identified as a potential locus for attention deficit/hyperactivity disorder (ADHD) [38–40]. In a genome-wide scan using large multi-generational pedigrees, Arcos-Burgos and colleagues established an exclusion map for the region and defined a critical interval from 119 to 135 Mb, which encompasses the PITX1 gene region [38]. Their most significant family-specific microsatellite marker, D5S2117 at 133.4 Mb, is less than 1 Mb from the PITX1 gene locus.

Mutation screening of the coding region of PITX1 revealed a single known missense mutation with a MAF of 0.33. As the single point association results did not show any evidence for association, the mutation seems unlikely to be directly involved in susceptibility to autism. The most positive SNPs in the single marker analysis, as well as the positive haplotypes, are all situated in the first intron of the PITX1 gene. Regulatory site analysis predicts an alternative promoter at bases 134394831–134395400, which overlaps exactly with the region covered by the four most significant SNPs. However, to our knowledge, activity of this predicted promoter has not been shown. These SNPs are also in strong LD with SNPs within the 5' region of the PITX1 gene and the designated promoter region. However, one SNP within the PITX1 promoter that we genotyped, rs7700313, did not show significant evidence for association (pnominal = 0.0578, Table 3). Therefore, at this stage it is not possible to postulate where the functional variant is situated or whether any of the SNPs genotyped is functional.

The linkage region also covers the H2AFY and NEUROG1 loci, which, in view of their functions in X-chromosome inactivation and neuronal development, respectively, were considered reasonable candidates [28, 29, 40, 41]. We found no association for either the 5 SNPs selected for H2AFY or the 2 SNPs selected for NEUROG1. Moreover, mutation screening by direct sequencing did not show any amino-acid or other mutational changes within these genes.

Conclusion

Although the mechanisms by which PITX1 may contribute to the susceptibility to autism are yet to be explored, the genetic association between PITX1 polymorphisms and autism, described here, could provide an explanation for the abnormal level of hormones of the pituitary-adrenal axis reported in the literature.