A novel gene derived from a segmental duplication shows perturbed expression in Alzheimer’s disease
- Cite this article as:
- Avramopoulos, D., Wang, R., Valle, D. et al. Neurogenetics (2007) 8: 111. doi:10.1007/s10048-007-0081-5
Alzheimer’s disease (AD) is a disabling neurodegenerative disorder with onset commonly in late life. Three genes have been identified causing earlier onset AD, and a fourth has been shown to be a risk factor for late onset AD (LOAD), while many more yet unrecognized genes are thought to contribute to susceptibility. Many studies have reported linkage to LOAD on human chromosome 10, where we have identified a parent of origin effect [Bassett SS, Avramopoulos D, Perry RT, Wiener H, Watson B Jr, Go RC, Fallin MD. Am J Med Genet B Neuropsychiatr Genet 141:537–540, (2006), Bassett SS, Avramopoulos D, Fallin D. Am J Med Genet 114:679–686, (2002)]. In this paper, we report on a gene in this region that shows reduced expression with increasing age, reduced expression in females across ages, and further reduction in LOAD patients. In concordance with the observed parent of origin effect on the linkage, this reduction is more pronounced in patients with an affected mother. We discovered this gene while studying the alkaline ceramidase gene (ASAH2); it is a partial paralog of ASAH2, and we call it ASAH2L. It is the result of a partial duplication of ASAH2 on chromosome 10q11.23, just downstream from the sequence with promoter activity. ASAH2L has a polymorphic start codon with a single nucleotide change of the original ASAH2 sequence plus other putative translation start sites that might produce novel proteins. It is expressed in all the tissues we tested including the brain and is an interesting example of the generation of a new gene. Comparison of primate and other mammal genomes suggests that ASAH2L is human specific. Further research would be necessary to determine the function of the ASAH2L transcript and explore any possible involvement in neurodegeneration.
Keywords: Brain, Alzheimer disease, Gene, Gene duplication, Gene expression
Alzheimer’s disease (AD) is characterized by progressive cognitive decline over a course of about 8 years from diagnosis to incapacitating dementia and death. Risk factors for the disease include increased age, positive family history, and female sex [3, 4]. With the exception of a few cases presenting before age 65 and inherited in an autosomal dominant manner, the vast majority of AD cases are of later onset. Despite a high heritability estimated at about 70% [5, 6], late onset AD (LOAD) does not show a clear mode of transmission, and it is believed to follow complex genetics with several genes influencing the risk.
Initial successes in the field of AD genetics led to the identification of three genes for the early onset forms (APP, PSEN1, and PSEN2) and one gene, APOE, for the common later onset form. Subsequently reported genes have shown weaker associations and inconsistent replications. Several genome scans have been performed implicating many regions that might harbor susceptibility genes, but none has provided strong evidence or has been consistently replicated in independent scans. Nevertheless, a few regions show more promise, as they have provided positive results in more than one study, including chromosomes 6, 9, 10, and 12. In 2002, we published strong linkage for LOAD on chromosome 10q11–q21 with an observed parent of origin effect, where families with an affected mother showed significantly stronger linkage. We later increased our sample size, gained further support for the parent of origin effect, and reported a logarithm of the odds score of 3.7 in 68 maternal pedigrees. Chromosome 10q has shown positive results in several linkage studies, and it is one of the most promising locations for LOAD susceptibility genes today [12, 13]. Our region of interest is complicated by the presence of numerous complex duplications that cover a total of ∼3 Mb. Such duplications often include entire genes, and they have been associated with the creation of new genes as the function of paralogous genes diverges [14–16]. They also provide a substrate for non-allelic homologous recombination leading to chromosomal rearrangements. Variations in the gross structure of the region, including possible copy number differences between individuals, could be of importance for disease susceptibility. At the same time, the repetitive nature of these sequences renders genotyping for association studies difficult.
In our ongoing search for susceptibility genes for LOAD, we screened 12 functional candidate genes in 10q11–q21 (ACF, ALOX5, ASAH2, CREM, CUL2, CXCL12, GDF2, MAPK8, PARG, SLC18A3, TIMM23, and UBE2D1) for differences in their expression in the temporal lobe of the brain of LOAD cases and healthy controls. One gene showed lower expression in females and an inverse correlation between RNA levels and age. LOAD patients showed an even further reduction in age- and sex-adjusted expression. Consistent with our observation of increased linkage in families with affected mothers, the expression reduction was more pronounced in patients from those maternal families. Unexpectedly, we found that the transcript was not the product of the originally selected candidate gene (ASAH2) but rather originated from a partial paralog located 520 Kb toward the telomere, which we call ASAH2L. We show that this novel and possibly human-specific transcript is abundantly expressed in all tissues, including the brain, in contrast with ASAH2, which is not expressed at significant levels in the human brain. Our data suggest a possible involvement of ASAH2L in AD.
Materials and methods
[Table of sample characteristics; fields: number of cases, average age (±SD), sex (N of males), race (N of African American). Values not recovered.]
We extracted DNA from brain tissue using a Puregene Tissue DNA extraction kit (Gentra Systems, Minneapolis) and RNA using two methods: a commercial kit based on filter retention (Qiagen cat. no. 75842) and a method utilizing the TRIzol Reagent (Invitrogen, Carlsbad, CA, USA) according to the manufacturer’s instructions. Reverse transcriptions (RT) were performed using the TaqMan reverse transcription kit (Applied Biosystems cat. no. N808 0234) and according to the manufacturer’s instructions. We input 300 ng of total RNA in a 20-μl reaction and used random primers for the generation of complementary DNA (cDNA).
[Table 2: Primers used for the PCR amplifications. Only one entry name was recovered: Exon1-52.17 Mb specific-F.]
We analyzed the raw data using the SDS 2.1 software (Applied Biosystems) to obtain quantities relative to the standard dilution curve. When outliers were detected in the triplicates (one replicate showing a more than twofold difference from the other two, observed in <10% of triplicates), we deleted the outlier data and averaged the remaining replicate values. These values represent expression ratios; therefore, they are not expected to have a normal distribution. For further analysis, we used the base 2 logarithm of these values, which was normally distributed (Kolmogorov–Smirnov test p > 0.2 for all measurements). To reduce random noise due to sampling and/or other experimental variables, we combined the three independent measurements by averaging after centering each on the corresponding mean. We further analyzed the data using the statistical software package Statistica, employing a generalized linear model to measure the independent effect of each variable on the expression levels. The base 2 logarithm of the normalized expression of the transcript was the dependent variable, while age and postmortem delay (PMD) were continuous predictors, and sex and affection status were categorical predictors along with the corresponding two-way interaction term.
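The replicate handling described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the SDS 2.1 or Statistica workflow itself: the function names are hypothetical, an outlier is taken to be a replicate that differs more than twofold from both of the others while those two agree with each other, and "centering on the corresponding mean" is read as subtracting each measurement run's own mean.

```python
import math
from statistics import mean


def drop_outlier(triplicate, fold=2.0):
    """Return the replicates to average, dropping at most one outlier.

    Assumes all values are positive expression ratios. A replicate is
    dropped only if it differs more than `fold` from both other
    replicates and those two agree with each other within `fold`.
    """
    for i, value in enumerate(triplicate):
        others = [v for j, v in enumerate(triplicate) if j != i]
        far_from_both = all(max(value, o) / min(value, o) > fold for o in others)
        others_agree = max(others) / min(others) <= fold
        if far_from_both and others_agree:
            return others
    return list(triplicate)


def log2_ratio(triplicate):
    """Average the retained replicates, then take the base 2 logarithm."""
    return math.log2(mean(drop_outlier(triplicate)))


def combine_runs(runs):
    """Combine independent measurement runs.

    `runs` is a list of dicts mapping sample id -> log2 value. Each run
    is centered on its own mean, then the centered values are averaged
    per sample across runs.
    """
    centered = []
    for run in runs:
        run_mean = mean(list(run.values()))
        centered.append({s: v - run_mean for s, v in run.items()})
    shared = set.intersection(*(set(r) for r in centered))
    return {s: mean([r[s] for r in centered]) for s in sorted(shared)}
```

Centering before averaging removes run-to-run offsets, so a sample's combined value reflects its position relative to the other samples rather than the absolute scale of any single run.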
The primers used in various combinations to amplify the ASAH2 cDNA in the experiments described below are shown in Table 2. PCR amplifications were performed on cDNA using 0.8 units of Taq DNA polymerase from Promega (cat. no. M166F) and the manufacturer’s buffer, 100 μM deoxyribonucleotide triphosphate, and 200 nM of each primer in 25-μl reactions, with 35 cycles of denaturing at 94°C for 30 s, annealing at 60°C for 30 s, and extension for a varying time (from 30 s to 3 min) depending on the expected product length. Sequencing reactions on PCR-amplified DNA were performed using the Sanger method with fluorescently labeled di-deoxy terminators and reagents from Applied Biosystems (cat. no. 4336919). The products were treated with exonuclease and shrimp alkaline phosphatase and, after ethanol precipitation, were subjected to capillary electrophoresis and visualized using a 3100 automated sequencing apparatus. Northern blot analyses were performed on a commercially available multi-tissue blot (Ambion, item no. 3140) using Rapid-hyb™ hybridization buffer (GE Healthcare, cat. no. RPN1635) and the manufacturer’s protocol. We generated probes by PCR using primer pairs Exon16-forward and Exon20-reverse as well as Exon1-forward and Exon4-reverse on cDNA templates. The products were inserted into the pCR2.1-TOPO vector and transformed into Escherichia coli One Shot Top10 chemically competent cells using reagents and protocols provided by Invitrogen. The probe identity was verified by sequencing, and probes were labeled using the Rediprime™ II DNA labeling system (GE Healthcare cat. no. RPN1633) according to the manufacturer’s protocol.
Expression levels analyses
In our initial screen of 12 genes in 12 LOAD cases and 9 controls, we observed no difference in expression for ALOX5, CREM, CUL2, CXCL12, MAPK8, PARG, TIMM23, and UBE2D1 (all p > 0.1), while ALOX5, GDF2, and SLC18A3 failed to amplify. ASAH2, however, showed a difference in expression levels (p = 0.008), which prompted us to acquire an additional 16 cases and 16 controls. The complete set of cases and controls showed modest differences in terms of age at death (cases, 83.5 ± 4.7; controls, 74.6 ± 16.3; p = 0.01), postmortem delay (cases, 7.8 ± 4.1; controls, 10.6 ± 4.8; p = 0.03), and male–female composition (5/28 males in cases vs 11/25 in controls, Fisher’s exact test p = 0.03). We examined the effect of these variables in a control only analysis, and subsequently, we corrected for their possible effect by using a generalized linear model that regressed the measured expression levels on the age at death and the PMD and included sex and affection status as categorical predictors. We did not include race in the model, as there was only one African American case and three controls. Analyses on the white samples alone gave similar results to those reported below both in terms of effect sizes and significance.
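The regression described above can be illustrated with an ordinary least squares fit, which is what a general linear model with a normally distributed response reduces to. The sketch below is not the Statistica analysis: the function names, the 0/1 coding of sex and affection status, and the single sex by status interaction column are assumptions made for illustration, and the package's own parameterization may differ.

```python
def design_row(age, pmd, is_female, is_case):
    """One row of the design matrix (hypothetical 0/1 dummy coding).

    Columns: intercept, age at death, postmortem delay, sex,
    affection status, sex x status interaction.
    """
    return [1.0, float(age), float(pmd), float(is_female),
            float(is_case), float(is_female and is_case)]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Assumes `a` is square and non-singular (full-rank design).
    """
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_ols(rows, y):
    """Least squares coefficients from the normal equations X'X b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)
```

Given one `design_row` per sample and the log2 expression values as `y`, `fit_ols` returns one coefficient per column; the coefficient on the affection status column is then the case versus control difference adjusted for age, PMD, and sex.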
[Table: Statistical analysis results. Recovered row labels: number of samples, age of death. Values not recovered.]
With this result in mind, we reviewed the genome databases and observed that 16 of 18 expressed sequence tags (ESTs) from the ASAH2 gene extended from exons 16 to 20. Additionally, 12 out of a total of 16 ESTs including all three brain-derived ESTs (all from the hippocampus) aligned better at 52.17 Mb than with ASAH2, as they contained leading sequences from regions adjacent to the partial copy of ASAH2 (see Fig. 2). We used one of the reported hippocampus ESTs (BI553338) and bioinformatics tools to examine the possibility of transcription from this location. We aligned BI553338 to the genome and extended the sequence 5 Kb in the 5′ direction to include possible promoter sequences. We used this extended sequence and the web-based software Dragon Promoter Finder (http://www.research.i2r.a-star.edu.sg/promoter/promoter1_5/DPF.htm) and found a CpG island with a putative promoter and a predicted transcription initiation site 100 bp upstream from the beginning of BI553338. This, together with the failure to amplify the upstream exons, increased our suspicion that the transcript we were investigating for expression might be transcribed from the partial copy of the ASAH2 gene at 52.17 Mb, rather than the complete ASAH2 at 51.65 Mb. The Vega Human Genome database (http://www.vega.sanger.ac.uk) annotates, on the basis of bioinformatics and sequence evidence, a putative transcript at that location (OTTHUMG00000018239), which corresponds to our prediction. To experimentally confirm the presence of the transcript from this putative new gene, which we refer to as ASAH2L, we designed specific primers complementary to sequences present only in ASAH2L (primers exon1-52.17 Mb specific-forward and exon5-52.17 Mb-reverse). We were able to amplify cDNA from all tissues tested (Fig. 3b), and nucleotide sequencing showed that the amplified fragment corresponded to ASAH2L.
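For readers unfamiliar with what qualifies a stretch of sequence as a CpG island, the following sketch applies a simple, widely used rule of thumb: length of at least 200 bp, GC fraction of at least 0.5, and an observed/expected CpG ratio of at least 0.6. This is not how Dragon Promoter Finder works, which uses its own promoter model; the function names and default thresholds here are illustrative assumptions.

```python
def cpg_stats(seq):
    """Return (GC fraction, observed/expected CpG ratio) for a sequence.

    Observed/expected CpG = (number of CG dinucleotides * length)
                            / (number of C * number of G).
    """
    seq = seq.upper()
    n = len(seq)
    c, g = seq.count("C"), seq.count("G")
    cpg = seq.count("CG")
    gc_fraction = (c + g) / n if n else 0.0
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp


def looks_like_cpg_island(seq, min_len=200, min_gc=0.5, min_obs_exp=0.6):
    """Apply the three rule-of-thumb thresholds to one candidate window."""
    gc_fraction, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_fraction >= min_gc and obs_exp >= min_obs_exp
```

In practice such a check would be run over sliding windows of the extended upstream sequence, with overlapping windows that pass merged into one candidate island.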
To assess the possible functionality of ASAH2L, we searched for open reading frames (ORFs) beginning with putative translation start codons. We used the sequence of the Vega-annotated transcript ASAH2B-002 and found four ORFs starting with a methionine codon and extending more than 50 codons. The first, starting at position 291, was 123 codons long but had a putative initiation methionine codon that did not conform well to the Kozak consensus [22, 23], as it lacked a purine at position −3 relative to the ATG codon, a residue shown to be of critical importance for start codon recognition. The putative protein product of this ORF shows no matches or homologies with any human or non-human protein in GenBank according to our basic local alignment search tool (BLAST) searches, except for a hypothetical transcript derived from the chimpanzee DNA sequence, which does not come from the region syntenic to the human ASAH2 gene locus and its paralogs. A second ORF starts at position 785 at a methionine codon with a strong Kozak consensus sequence. This ORF encodes a 160-amino-acid peptide corresponding to the carboxy-terminus region of alkaline ceramidase. One of the residues of this start codon, however, was different in the sequence of the EST BI553338. We used specific primers for the 52.17 Mb repeat location and found by nucleotide sequencing that this nucleotide is polymorphic (C/T) in ASAH2L. The derived allele (T) creates a translation start codon (ATG), and it is present in ∼17% of the population (allele frequency ∼8.5%). We found no evidence of differences in the frequency of its occurrence between cases and controls (5/28 controls and 4/25 cases had the T variant). Additionally, by sequencing both genomic DNA and cDNA from heterozygotes for this variant, we observed expression of both alleles, thus no evidence of imprinting.
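The relationship between the two frequencies quoted above can be checked with a short calculation. The sketch assumes that every carrier is heterozygous, which the text does not state explicitly; under that assumption each carrier contributes one T allele out of two chromosomes. The function name is hypothetical, and the counts (5 + 4 carriers among 28 + 25 individuals) are taken from the paragraph above.

```python
from fractions import Fraction


def carrier_and_allele_frequency(carriers, individuals):
    """Return (carrier frequency, allele frequency) as exact fractions.

    Assumes every carrier is heterozygous, so the number of variant
    alleles equals the number of carriers and the number of
    chromosomes is twice the number of individuals.
    """
    carrier_freq = Fraction(carriers, individuals)
    allele_freq = Fraction(carriers, 2 * individuals)
    return carrier_freq, allele_freq


carrier_freq, allele_freq = carrier_and_allele_frequency(5 + 4, 28 + 25)
print(f"carriers: {float(carrier_freq):.1%}, T allele: {float(allele_freq):.1%}")
```

If some carriers were homozygous for T, the allele frequency would be higher than half the carrier frequency.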
Although the balanced distribution of the polymorphic start site argues against a possible involvement of the corresponding transcript in AD, its presence might prevent the use of a downstream transcript, thus reducing its levels in heterozygotes. Consistent with this, we found that cases with this start site had 15% higher transcript levels than the rest of the cases (not a statistically significant difference), and exclusion of cases and controls carrying this start site slightly increased the observed effects (from 0.38 in Table 3 to 0.41). A third ORF at position 1,960 is the only one present in all individuals and has a putative initiation methionine codon that conforms to the Kozak consensus. It encodes a novel polypeptide of 109 amino acids that shows no homology to known proteins.
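The ORF search described above can be sketched as a scan of the forward strand for ATG codons followed by an in-frame run of more than 50 codons, with a check of the start codon context. The two criteria taken from the text are the 50-codon minimum and the purine at position −3; the additional check for G at position +4 is a commonly cited feature of the Kozak consensus that is added here as an assumption. The function name and output fields are hypothetical, and this is not the tool the authors used.

```python
STOPS = {"TAA", "TAG", "TGA"}
PURINES = {"A", "G"}


def find_orfs(seq, min_codons=50):
    """List ATG-initiated ORFs on the forward strand.

    Positions are 1-based, with the A of ATG at +1. Only the first ATG
    of each open stretch is reported per frame; in-frame ATGs inside an
    already scanned ORF are skipped. An ORF that runs off the end of
    the sequence without a stop codon is still reported. The codon
    count includes the ATG and excludes the stop codon.
    """
    seq = seq.upper()
    frame_end = [0, 0, 0]  # end of the last ORF scanned in each frame
    orfs = []
    for start in range(len(seq) - 2):
        frame = start % 3
        if seq[start:start + 3] != "ATG" or start < frame_end[frame]:
            continue
        pos = start
        while pos + 3 <= len(seq) and seq[pos:pos + 3] not in STOPS:
            pos += 3
        frame_end[frame] = pos
        n_codons = (pos - start) // 3
        if n_codons > min_codons:
            orfs.append({
                "start": start + 1,
                "codons": n_codons,
                # position -3 is three bases upstream of the A of ATG
                "purine_at_minus3": start >= 3 and seq[start - 3] in PURINES,
                # position +4 is the base right after the ATG
                "g_at_plus4": start + 3 < len(seq) and seq[start + 3] == "G",
            })
    return orfs
```

Run on a transcript sequence, the function returns one record per qualifying ORF, so start codons lacking a purine at −3 can be flagged as weak contexts in the way the first ORF is described above.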
We report the identification of a novel transcript, which we refer to as ASAH2L, a transcript derived from a 5′ truncated paralog of the ASAH2 gene. The expression of ASAH2L is lower in females, decreases with age, and appears further reduced in Alzheimer’s patients. Female sex and increasing age are both recognized risk factors for LOAD [3, 4]. Our data are consistent with a model where low expression of ASAH2L is a risk factor for LOAD; however, further studies are needed to test this model. ASAH2L is a novel gene of unknown function resulting from a duplication of a fragment of ASAH2 and insertion at a new location next to promoter sequences.
Human alkaline ceramidase was first cloned by El Bawab et al. Their data supported strong expression of ASAH2 in the brain; however, they used a probe from the part of the 3′ end of the gene that we now know to be duplicated. The major transcript they detected was ∼3.5 Kb, and we now show that it most likely corresponded to the ASAH2L transcript. We do not observe the 3.5-Kb fragment when using a probe specific to ASAH2. The predicted Vega transcript ASAH2B-002 (ID, OTTHUMT00000048083) that corresponds to ASAH2L at 52.17 Mb is 3.7 Kb long, consistent with our observations. El Bawab et al. also observed a larger transcript, which they sized at ∼7 Kb, as opposed to our sizing of the larger transcript we observed at 5.3 Kb. If one disregards the sizing difference, their blot (Fig. 4 in their paper) looks very similar to ours (Fig. 4a). In our data, the larger transcript is the same size as that seen strongly in colon with the ASAH2-specific probe (Fig. 4b). This larger transcript was barely detectable in human brain RNA in the El Bawab et al. study, while it was observed in RNA from kidney and pancreas. In our Northern blot, the 5.3-Kb transcript is strong in the intestine and very weak, but detectable, in all other tissues tested including kidney and pancreas.
Our RT-PCR results are consistent with Northern blot results on mouse tissues published by Choi et al., who detected the presence of an ASAH2 transcript in roughly the same set of tissues as we did, without strong expression in the brain. Our results suggest strong ASAH2 expression only in the intestine. This is consistent with recent literature on a major role of the ASAH2 gene product, alkaline ceramidase, in the intestine for the degradation of sphingolipids [27, 28]. Nevertheless, alkaline ceramidase activity has been reported in the brain of rats and zebrafish [28, 29]. In contrast with the results of Choi et al., Tani et al. also observed it in mouse brain. The latter discrepancies may be explained by possible differences in the included brain regions, while differences across species could also exist. Our RT-PCR experiments failed to amplify an ASAH2-specific transcript in the temporal lobe, while after increasing the PCR cycles, we weakly amplified the transcript in cDNA from whole human brain, suggesting low expression limited to brain regions other than the temporal lobe.
Alkaline ceramidase is involved in the hydrolysis of ceramide in the neutral to alkaline pH range, and it is involved in programmed cell death. It has been suggested that it localizes in the mitochondria, although recently, this has been challenged. As ceramide and ceramidase are strongly involved in apoptosis signaling and the mitochondria are maternally inherited and also known to be important in apoptosis, involvement of ASAH2 in AD would fit the maternal effect observation. We showed, however, that the downregulated transcript is not the product of ASAH2 but rather ASAH2L, a novel gene that includes sequences from the 3′ end of ASAH2 corresponding to the carboxy-terminus of ceramidase. This transcript contains many potential translation start sites including one that is polymorphic and would produce a fragment of alkaline ceramidase. Further study is needed to show if any of these putative proteins are actually produced and have a function. The possibility that the RNA itself has some function influencing the alkaline ceramidase gene also cannot be excluded.
Our observation of an AD-related expression difference for ASAH2L has been robust to increases in sample size and use of different quantification approaches. The downregulation with age and the lower expression in females, both of which are risk factors for AD, are consistent with the hypothesis that reduced ASAH2L transcription might increase susceptibility to AD. The localization of ASAH2L under a strong linkage peak also argues for this possibility, especially as the reduced expression is more pronounced in the maternal cases, consistent with the observed increased linkage in maternal families. Nevertheless, the expression data alone cannot establish an involvement of this gene in AD, and further investigation is necessary.
There are a number of limitations that need to be taken into account in reference to our conclusions. One could argue that the observed differences might be due to the reference gene, ACTB, rather than the target. This, however, is very unlikely, as none of the other genes tested using the same reference showed expression differences. The examination of 12 genes, on the other hand, raises a multiple comparisons question. The observed significance levels can withstand correction at least regarding the disease effect (corrected p = 0.0096), which was the only effect studied across all genes. Additionally, 11 genes were not tested on the extended sample, and three of them did not provide data; thus, even this correction is conservative. The observed gene expression differences are based on statistical analyses and, as with any such study, type I errors cannot be excluded with certainty. A replication study would be very valuable to this end. It also remains to be investigated whether the reduction in expression constitutes a risk factor or is a consequence of the disease process. The localization of the gene under a linkage peak and the parent of origin effect are more consistent with a causative role; however, showing causation would require further experimental evidence including identification of the relevant DNA variant(s) and the biological mechanism involved.
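The text does not name the multiple comparisons procedure that was used. If it were a Bonferroni adjustment over the 12 screened genes, the corrected value would simply be the raw p value multiplied by 12 and capped at 1. The sketch below shows that arithmetic; the function name is hypothetical, and the raw value of 0.0008 in the usage line is an illustrative input chosen because 12 × 0.0008 = 0.0096, not a figure quoted in the text.

```python
def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted p value: raw p times the number of tests, capped at 1."""
    return min(1.0, p_value * n_tests)


# Hypothetical raw p value, chosen only to illustrate the arithmetic.
print(round(bonferroni(0.0008, 12), 4))
```

Because Bonferroni treats all 12 tests as if each had been run on the full sample, it is the conservative choice the paragraph alludes to when noting that several genes either yielded no data or were not tested on the extended sample.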
Beyond the possible implications for AD, we find the emergence of a novel transcript from a partial duplication of a previously existing gene intriguing. The rearrangement that took place in this region is possibly human specific, as according to our genome database searches, the corresponding sequences in the chimpanzee, rhesus monkey, and other mammals are not known to be duplicated. Consistent with this, Choi et al. and Tani et al. did not observe the new transcript in their work on mouse ceramidase. Another interesting observation regarding this new transcript is the existence of a derived polymorphic allele that creates the strongest translation start site, likely leading to variation in the translation of the corresponding protein and of proteins corresponding to downstream translation start sites. We find this new gene to be a very attractive target of additional study, as it might teach us more about gene evolution and the birth of new genes, while it will also likely advance our knowledge of neurodegeneration and Alzheimer’s disease.
This work was supported by grants RO1AG022099 and RO1AG021804 from the National Institute on Aging to DA and SSB, respectively. All described experiments comply with the laws of the United States of America.