Introduction

Sometimes dismissed as junk DNA, transposable elements in fact have a great impact on their host genomes in several ways. Mobilized transposable elements can insertionally mutate the genes in which they land [1-3]. In addition, transposable element sequences in the genome can modulate gene expression by serving as promoters, enhancers, silencers, sites of epigenetic modification, and alternative splicing sites [4-6]. Transposon-encoded genes can also be adopted by the host to perform cellular functions, a process referred to as 'molecular domestication' [7-12]. Because many copies of integrated elements exist in a genome, they can serve as locations for recombination events that produce deletions, duplications, inversions, or translocations [13-15].

There are two basic types of transposable elements: retrotransposons, which mobilize via an RNA intermediate in a 'copy and paste' reaction; and DNA transposons, which mobilize by a 'cut and paste' reaction. To date, there have been no published accounts of naturally occurring active DNA transposons in mammalian genomes, although many copies of inactive fossil DNA transposons are present [16]. There are also a great many copies of retrotransposons in our genomes. In contrast to DNA elements, a few of these retroelements are active and capable of retrotransposition. The human genome harbors about 80 to 100 potentially active long interspersed nuclear element (LINE)-1 (L1) retrotransposons, although it is estimated that only about one in ten of these is highly active [17]. As a consequence, about one in 50 individuals will carry a new L1 insertion that arose by retrotransposition either in the parental germline or during early development [5]. In addition to the autonomous LINE elements, the human genome contains active short interspersed nuclear elements (SINEs). SINEs are non-autonomous elements and rely on the activity of proteins encoded by LINEs to retrotranspose [18]. Members of the Alu family of SINEs are present in more than one million copies in the human genome, and new Alu insertions are relatively frequent, occurring in approximately one in 20 individuals [19]. Although transposition in the germline is the only way for new insertions to become fixed in the population, there is increasing evidence that transposable elements can have great impact on their hosts by their presence or activity in somatic cells. Here, we review several mechanisms by which transposable elements can have an impact on somatic cells, including active mobilization, the activities of domesticated transposases, and effects on genomic rearrangements.

Insertional mutagenesis in somatic cells

Part 1: retroelements

L1 elements are a large family of retrotransposons that encode two proteins that can catalyze target-primed reverse transcription and integration of transcripts. The L1 promoter is found in the 5' untranslated region of the element and initiates transcription within a few base pairs of base 1 [20, 21]. The L1 promoter is known to be active in some somatic cell types [22-26] and therefore new L1 insertions could, in theory, accumulate in somatic cells. The ability of L1 transposition in somatic cells to result in gene mutation was appreciated when a somatic L1 insertion in the MYC proto-oncogene was discovered in a human breast ductal adenocarcinoma [27]. However, somatic mobilization of endogenous L1 elements may contribute to cancer formation only rarely. The only other known example of a somatic insertion of an L1 element in a tumor was found in the tumor suppressor gene APC in a colon cancer [28].

Further research will be necessary to determine how often L1 gene insertions occur in human cancer because many techniques currently used to detect mutations in cancer, for example exon resequencing, would not always detect this kind of mutation. In fact, there is evidence that epigenetic regulation of endogenous L1 elements may influence tumor progression. Many human tumor genomes become globally hypomethylated upon cancer progression [29], and genome hypomethylation can result in upregulated retrotransposon transcription in cancer cells [30, 31]. L1 promoter hypomethylation has been documented in progression of chronic myeloid leukemia to blast crisis. In chronic myeloid leukemia, L1 hypomethylation was associated with an upregulation of L1 transcript levels and poorer long-term survival of patients [31]. If the levels of transcripts from active L1 elements increase, this could lead to the accumulation of new insertional mutations in cancer genes. However, activation of L1 transcription could also contribute to tumor initiation or progression by additional mechanisms. Studies have shown that inhibition of L1-encoded reverse transcriptase using RNA interference or small molecules reduces proliferation and promotes differentiation of melanoma, thyroid, and prostate cancer cell lines [32, 33]. It has been speculated that L1 reverse transcriptase can regulate endogenous gene expression by some unknown mechanism, such as chromatin modification or nuclear repositioning [33]. In addition, because L1 expression in cell culture can generate double strand breaks [34], it is possible that L1 activity in somatic cells could contribute to genomic instability.

Although the action of transposable elements can be either beneficial or detrimental to the organism, those insertions that harm the organism, by promoting cancer for instance, will be the easiest to identify. However, recent work indicates the potential for somatic mobilization of L1 elements to have a positive impact on the organism by generating diversity within a cell type [26]. Specifically, a human L1 element driven from its endogenous promoter was shown to retrotranspose in vitro in rodent neural progenitor cells (NPCs) and in vivo in mouse brain. Furthermore, these insertions were capable of generating mutations that apparently have an impact on cell differentiation.

This study of L1 activity in neuronal cells took advantage of a retrotransposition indicator to detect retrotransposition. This indicator is a transgene that expresses a modified L1 element from its endogenous promoter. The L1 transgene is engineered to contain a reporter that consists of an enhanced green fluorescent protein (EGFP) expression cassette in the opposite orientation to that of the L1 element. The EGFP cassette is interrupted by an intron in the same orientation as L1 transcription, so that EGFP can only be expressed when the element has undergone transcription, splicing, and integration into a new location in the genome. EGFP expression could be detected after nucleofection of the retrotransposition indicator into NPCs in vitro and in neurons of mice harboring the retrotransposition indicator as a transgene. In vitro, cells that had acquired new retrotransposon insertions could be differentiated into the three major neural cell types, namely neurons, astrocytes, and oligodendrocytes. In vivo, cells that expressed EGFP also expressed markers for neurons but not markers for astrocytes or oligodendrocytes. In vitro, the genomic locations of several retrotransposed L1s were cloned using inverse polymerase chain reaction-based techniques. Some of the transposed elements had landed in neuronally expressed genes, including Psd-93. The cell clone harboring the Psd-93 insertion expressed higher levels of the Psd-93 transcript than the parental cells, and downregulating Psd-93 via small interfering RNA knockdown in this clone resulted in a less differentiated phenotype. This indicates that, at least in vitro, L1 transposition can change somatic genomes in a way that has phenotypic consequences.
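The logic of the retrotransposition indicator can be sketched as a toy model. This is only an illustration under simplifying assumptions: the transgene is represented as a string, the antisense orientation of the EGFP cassette is abstracted away, and all names (INTRON, make_indicator, retrotranspose, egfp_expressible) are hypothetical rather than taken from the study.

```python
INTRON = "<intron>"  # placeholder for the intron that interrupts the EGFP cassette


def make_indicator() -> str:
    """Toy L1 indicator transgene: the EGFP cassette is split by the intron."""
    return "L1[" + "EG" + INTRON + "FP" + "]"


def retrotranspose(transgene: str) -> str:
    """Model one retrotransposition cycle and return the newly integrated copy."""
    transcript = transgene                    # the L1-sense RNA still carries the intron
    spliced = transcript.replace(INTRON, "")  # intron is in the L1 orientation, so it is removed here
    return spliced                            # reverse transcription and integration give an intron-free copy


def egfp_expressible(copy: str) -> bool:
    """EGFP can be made only from a cassette whose coding sequence is uninterrupted."""
    return "EGFP" in copy
```

In this sketch the original transgene never scores as EGFP-positive, whereas a copy that has passed through transcription, splicing, and integration does, which is the property that makes EGFP a readout for new insertions.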

Future work will be required to determine whether endogenous L1s (for instance, mouse L1s in mouse cells or human L1s in human cells) transpose in NPCs and whether this truly promotes diversity among these cells. Because retrotransposition of Alu elements in trans by LINE elements has been detected in cell culture [18], it will also be interesting to determine whether somatic activity of L1 promotes somatic retrotransposition of Alu or other non-autonomous elements in vivo.

Part 2: RAG 'transposases'

Although there are no known active DNA transposons in mammals, genes that were domesticated from these elements are present in mammalian genomes [7]. The recombination-activating gene (RAG)1 and RAG2 proteins are important for generating somatic diversity in the immune system because they play an indispensable role in V(D)J recombination during lymphocyte development. The RAG1 gene is hypothesized to have originated from the transposase encoded by an ancient Transib superfamily transposon [35, 36]. In fact, recent evidence demonstrates that the RAG proteins still possess the capability to transpose genomic sequences excised during V(D)J recombination to new places in the lymphocyte genome [37].

V(D)J recombination is a complex event, and both RAG proteins as well as other factors are involved in the process. Recombination in developing lymphocytes results in the joining of variable (V), joining (J), and sometimes diversity (D) segments together to form a mature B-cell receptor (BCR) or T-cell receptor (TCR) gene. The recombination events are catalyzed between recombination signal sequences (RSSs). Each RSS is composed of unique heptamer and nonamer sequences separated by a spacer of 12 or 23 base pairs. Recombination occurs only between an RSS with a 12 base pair spacer and an RSS with a 23 base pair spacer. Recombination removes sequences between V and J, or V, D, and J segments, releasing a circularized intervening DNA called the signal joint. V(D)J recombination events share biochemical similarity with the cut-and-paste reaction of DNA transposons, such as Hermes and other elements of the hobo, Activator, Tam3 family [38]. In fact, the excised fragment with its RSS ends is similar to a DNA transposon, but unlike a DNA transposon it is circularized.
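The spacer requirement described above can be written as a small check. This is a minimal sketch; the function name can_recombine and the choice to reject other spacer lengths with an error are assumptions made for illustration.

```python
VALID_SPACERS = (12, 23)  # spacer lengths, in base pairs, between heptamer and nonamer


def can_recombine(spacer_a: int, spacer_b: int) -> bool:
    """Return True only when one RSS has a 12 bp spacer and the other a 23 bp spacer."""
    for spacer in (spacer_a, spacer_b):
        if spacer not in VALID_SPACERS:
            raise ValueError(f"unexpected RSS spacer length: {spacer}")
    return {spacer_a, spacer_b} == {12, 23}
```

For example, can_recombine(12, 23) and can_recombine(23, 12) are both true, whereas a pair of RSSs with matching spacers, such as can_recombine(12, 12), is not a permitted substrate in this model.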

The extrachromosomal array, or signal joint, excised during V(D)J recombination can be reintegrated by the RAG proteins into artificial DNA targets [39-42] and has been detected in vivo at the HPRT locus in T cells isolated from normal human donors [43]. However, only recently has an assay been developed that allows measurement of the rate at which RAGs can catalyze insertion of the extrachromosomal array into the host genome [37]. This was accomplished by generating a recombination substrate in a pre-B cell line that could detect both rearrangement and transposition. This recombination substrate consists of a puromycin resistance gene that is interrupted by a zeocin resistance-green fluorescent protein (GFP) marker. The zeocin-GFP marker is flanked by RSSs so that it mimics the substrate for V(D)J recombination and is a substrate for RAG activity. When this reporter undergoes RAG-initiated recombination, the zeocin-GFP marker with its RSSs is excised as a signal joint, and the puromycin resistance reading frame is restored, allowing the cells to grow in puromycin selection. The excised signal joint DNA consisting of the zeocin-GFP fragment flanked by RSSs resembles a transposon. One potential fate for this fragment is simply to be lost upon cell division. However, if the fragment undergoes a transposition-like reaction and is reintegrated into the host genome, the cell will also become zeocin resistant. Because zeocin resistance can also be acquired if the fragment randomly integrates into DNA, true transposition events are distinguished by the characteristic integration-associated target DNA repeat that they generate.
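The readout of this reporter can be summarized as a decision procedure. The sketch below is only an illustration of the logic described in the text; the function name, its parameters, and the wording of the returned labels are hypothetical, and in practice the target repeat would have to be established by analyzing the integration site.

```python
def classify_clone(puromycin_resistant: bool,
                   zeocin_resistant: bool,
                   has_target_repeat: bool) -> str:
    """Interpret a clone from the recombination/transposition reporter assay."""
    if not puromycin_resistant:
        # The puromycin reading frame was not restored, so no excision is scored.
        return "no recombination detected"
    if not zeocin_resistant:
        # The marker was excised but not retained in the genome.
        return "recombination; excised fragment lost"
    if has_target_repeat:
        # Reintegration with the characteristic target DNA repeat.
        return "recombination; transposition"
    # Reintegration without the repeat is attributed to random integration.
    return "recombination; random integration"
```

A clone that is resistant to both drugs is therefore counted as a bona fide transposition only when the target repeat is also present, for example classify_clone(True, True, True).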

Using this assay, it was determined that the fragment excised by RAG activity during V(D)J recombination reintegrates in this cell line in one out of every 13,000 recombination events. Some of these events are random integrations, but the estimated rate of bona fide transposition events is one in every 50,000 recombination events. If this rate held for human lymphocytes in vivo, then it would translate into approximately 10,000 transpositions per day.
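The arithmetic behind these figures can be made explicit. The sketch below uses only the three numbers quoted above; the constant and function names are hypothetical, and the two derived values (the share of reintegrations that are bona fide transpositions, and the daily number of recombination events implied by the estimate) are back-calculations rather than figures reported in the study.

```python
RECOMBINATIONS_PER_REINTEGRATION = 13_000  # one reintegration per 13,000 recombination events
RECOMBINATIONS_PER_TRANSPOSITION = 50_000  # one bona fide transposition per 50,000 events
TRANSPOSITIONS_PER_DAY = 10_000            # approximate in vivo estimate quoted in the text


def bona_fide_fraction() -> float:
    """Share of reintegration events that are bona fide transpositions."""
    # (1 / 50,000) divided by (1 / 13,000)
    return RECOMBINATIONS_PER_REINTEGRATION / RECOMBINATIONS_PER_TRANSPOSITION


def implied_recombinations_per_day() -> int:
    """Daily recombination events implied by the per-day transposition estimate."""
    return TRANSPOSITIONS_PER_DAY * RECOMBINATIONS_PER_TRANSPOSITION
```

Under these numbers, roughly a quarter of reintegrations (13,000/50,000 = 0.26) would be bona fide transpositions, and the estimate of 10,000 transpositions per day corresponds to about 5 x 10^8 recombination events per day.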

In addition, the rate detected in the B-cell line could actually be an underestimation because the assay only detects transposition events that insert the zeocin-GFP construct in a genomic location that allows its expression. Nevertheless, the critical action of the RAG proteins in V(D)J rearrangement also results in a potential negative consequence: insertional mutation of the host genome via integration of the excised 'transposon'. The consequences of these events for the genomes of developing lymphocytes remain to be determined but they should be investigated, particularly because they may relate to B-cell or T-cell malignancies.

Genomic rearrangements: RAG-related and Alu-influenced

Although the consequence of RAG-mediated insertion of 'transposons' for the host lymphocyte is still unknown, the ability of V(D)J recombination to promote tumor formation through the generation of chromosomal rearrangements has been well studied [44, 45]. The formation of chimeric fusion oncoproteins can occur when RAG proteins generate double strand breaks at sites in the genome that are similar in primary sequence, or that adopt a similar structure, to their normal RSS sites in the TCR or BCR loci. When these breaks are recombined with legitimate breaks at the TCR or BCR loci, they can result in rearrangements wherein a proto-oncogene such as LMO2 or BCL2 becomes abnormally expressed under the control of enhancers and promoters at TCR and BCR loci, respectively. Events such as these are responsible for the generation of some oncogenic chromosomal rearrangements, but they are clearly not responsible for all. Genome fragile sites, repair of double strand breaks by nonhomologous end joining (NHEJ), and homologous recombination (HR) also play a role in generating oncogenic chromosomal rearrangements.

Recombination events that produce oncogenic translocations, deletions, and other rearrangements frequently occur at or near Alu sequences in the genome [14]. Indeed, part of the Alu core sequence is similar to χ (Crossover Hot-spot Instigator [chi]) motifs in Escherichia coli that are thought to be sites of Rec-mediated recombination [46]. It has therefore been hypothesized that this site in an Alu element could be involved in binding proteins involved in HR [14]. Although there are many documented cases of Alu-Alu HR mediating germline deletion of DNA, specific examples of Alu-Alu recombination in somatic cells are rare [47], with the most famous example being the partial internal tandem duplications of part of the MLL gene found in cases of acute myeloid leukemia [48]. In addition, recombination between Alu elements has been detected in a translocation involving the TRE oncogene in Ewing's sarcoma [49].

Although many cancer-associated translocations have breakpoints in or near Alu elements, the sequences found at the fusion sites do not indicate that HR has occurred [14, 47]. To investigate how Alu elements can influence translocation formation, a system was developed to monitor translocation formation in murine embryonic stem cells [50]. This was accomplished by knocking two constructs into different chromosomes in embryonic stem cells. Each construct contained parts of drug resistance markers and part of an Alu element. Because it is believed that most translocations are initiated by formation of double strand breaks [51], both constructs contain a recognition site for the SceI restriction endonuclease. The SceI recognition site is not found naturally anywhere in the mouse genome, so expression of the SceI restriction enzyme will result in double strand break formation only at the sites introduced into the genome. Translocations following SceI-induced double strand break formation would bring the two halves of the drug resistance markers together, allowing drug resistance to be used as a readout for translocation formation. Furthermore, by sequencing the translocation junction from drug resistant cells, the mechanism of DNA repair can be inferred. Repair after DNA double strand break formation can occur via HR, NHEJ, or single strand annealing (SSA). SSA also relies on homologous sequences, but it proceeds by annealing of those sequences rather than by the strand invasion that is a part of HR [52].

Using this system, it was determined that the presence of Alu elements near the induced double strand break does not appreciably influence the rate at which translocations are formed [47, 50, 53]. However, Alu elements can influence the type of DNA repair that occurs. The presence of identical Alu sequence on both chromosomes results in repair by SSA, whereas NHEJ predominates in the presence of divergent Alu sequences [50]. This evidence from embryonic stem cells supports the observation that although translocations in cancer cells may form near or in Alu elements, they rarely contain junction sequences that imply direct Alu-Alu homologous recombination [14, 47]. However, the embryonic stem cell system contains artificial sites for DNA double strand break formation and translocation, and it remains to be determined whether similar results are obtained when sequences from the sites of frequent chromosomal translocations are used.

Conclusion

Clearly, the presence of transposable elements can generate heritable mutations or chromosomal rearrangements, some of which become fixed over evolutionary time scales. However, the recent observation that transposable elements are active in somatic cells opens the possibility that transposable elements can generate diversity among somatic cells with the same genome. The nature of this diversity includes upregulated expression of transposon proteins, which could affect endogenous gene regulation. The process of transposition by retrotransposons may create insertion mutations that influence neural differentiation or cause cancer. The very high copy number of Alu elements provides substrate to influence the outcomes of repair of double strand breaks that result in chromosomal translocations. These and other data indicate that transposons cannot be ignored as important factors that influence the behavior of somatic cells in the human genome.