Introduction

Ribosomal RNAs are the most abundant RNA molecules found in eukaryotic cells. In yeast, they account for about 80 % of all RNA molecules, with tRNAs and mRNAs making up the remaining 15 and 5 %, respectively (Warner 1999). This overwhelming abundance is not only the result of their efficient transcription but is also a consequence of the fact that the genes coding for these RNAs are almost invariably found in numerous copies in eukaryotic genomes (recently reviewed in Torres-Machorro et al. 2010). Although most eukaryotic species have more than 100 copies of both 5S rRNA genes and 18S, 5.8S and 28S rRNA genes, the number of 5S rRNA genes varies from only three in Plasmodium falciparum to as many as a million in Euplotes eurystomus. The number of rDNA units, coding for 18S, 5.8S and 28S rRNAs, also varies from as few as two in Theileria parva to 5,000 in Acetabularia mediterranea. Finally, there is no relationship between the number of 5S rRNA genes and the number of rDNA units. Although some species, such as Saccharomyces cerevisiae with its 100–200 linked copies of both 5S rRNA genes and rDNA units, have equal amounts of both types of genes, other species have many more of either 5S rRNA genes or rDNA units. For example, whereas Trypanosoma brucei has almost 30 times more 5S rRNA genes than rDNA units (1,500 vs. 56, respectively), Tetrahymena thermophila has 60 times more rDNA units than 5S rRNA genes in its macronucleus (9,000 vs. 150, respectively; Torres-Machorro et al. 2010).

Given their high copy number and high level of expression, one might expect that both their sequence and organization would be conserved during evolution (Drummond et al. 2005). Although the sequences of rRNA genes are indeed known to be very highly conserved, and are thus often used to infer the phylogenetic relationships of distantly related species, their organization has been shown to be quite variable (Woese et al. 1990; Moreira et al. 2007; Torres-Machorro et al. 2010). As recently reviewed by Torres-Machorro et al. (2010), ribosomal RNA genes can be dispersed, organized in tandem repeats with or without other genes or located in linear or circular extrachromosomal units. They can also contain introns, insertion sequences, transposable elements, protein-coding genes or additional spacers (Eickbush and Eickbush 2007; Torres-Machorro et al. 2010).

Here, we focus on the organization of 5S rRNA genes in protist genomes. Because eukaryotic 5S rRNA genes are transcribed by RNA polymerase III, whereas the other three eukaryotic rRNA genes (coding for 18S, 5.8S and 28S rRNA; henceforth referred to as rDNA genes or units) are transcribed by RNA polymerase I, one would not expect these two types of genes to be associated with one another (Archambault and Friesen 1993; Vannini and Cramer 2012). This is what is observed in most eukaryotic species, where 5S rRNA genes are either found dispersed as independent copies throughout genomes or found in one or more clusters of tandemly repeated copies (Gerbi 1985; Drouin and Moniz de Sá 1995). However, as described in the following, 5S rRNA and rDNA genes have repeatedly been found to be linked with one another. Furthermore, 5S rRNA genes are also often found linked to genes transcribed by RNA polymerase II, as well as to other genes transcribed by RNA polymerase III. Here, we argue that these diverse gene arrangements do not provide any selective advantage but simply represent the nonadaptive fixation of haphazard recombination events in species with short generation times and frequent founder events.

Data

We used three methods to obtain data concerning the organization of 5S rRNA genes in protist species. We first used PubMed to search published studies and recorded this information. As described in the Supplemental Tables, these studies used a variety of methods to obtain these data, including Southern blot analyses, PCR amplifications with or without subsequent sequencing and genome sequence analyses. We also searched entries in NCBI’s GenBank database (http://www.ncbi.nlm.nih.gov/) for linkages using ‘5S ribosomal RNA’ and ‘5S RNA’ as query terms for searching through GenBank’s protist nucleotide sequence database. Finally, we used BLAST with a variety of 5S rRNA sequences to search protist sequences in NCBI’s GenBank database (Altschul et al. 1997).

Types of Linkages

In protists, when 5S rRNA genes are found linked to rDNA genes, they are most commonly linked to rDNA units made up of 18S, 5.8S and 28S rRNA genes. Such linkages are observed in all protist groups so far studied, i.e., stramenopile, haptophyte, cryptophyte, metamonada, alveolate, euglenozoa and amoebozoa species (Figs. 1, 2, 3, 4 and 5). In contrast, other 5S rRNA gene linkages are only observed in some protist taxa. For example, 5S rRNA gene linkages to spliced leader (SL) sequences have only been found in dinoflagellate and euglenozoa species, and 5S rRNA gene linkages to tRNA genes have only been found in Entamoeba species (Figs. 3, 4 and 5). In extreme cases, some types of 5S rRNA gene linkages have only been found in single species. For example, 5S rRNA gene linkages to ubiquitin genes and pseudogenes have been found in only one of three Trichomonas species, the linkage of two 5S rRNA genes to ubiquitin genes has only been found in the ciliate species Tetrahymena pyriformis, 5S rRNA gene linkages to U6 snRNA genes and SL genes have only been found in the dinoflagellate species Karenia brevis, and 5S rRNA gene linkages to U5, U1, U2 snRNA genes and SL genes have only been found in the euglenid species Entosiphon sulcatum (Figs. 1, 3 and 4). In all cases, 5S rRNA genes are found within spacer sequences and never within the other genes with which they are linked (Figs. 1, 2, 3, 4 and 5).

Fig. 1

Phylogenetic distribution of 5S gene arrangements in Stramenopile, Haptophyte, Cryptophyte and Metamonada species. Blue arrows represent rDNA units, red arrows represent 5S rRNA genes and white arrows indicate ubiquitin genes or pseudogenes (ψ). The presence of a single arrow represents genes that are tandemly repeated numerous times but not linked to any other genes. The presence of two arrows represents genes which are linked with one another to form a unit which is tandemly repeated numerous times. The arrows point in the direction of transcription. The presence of the word ‘and’ represents the fact that two types of units are found in this species. The references and accession numbers of the genes shown are listed in Supplemental Table 1. The phylogenetic relationships shown are based on the studies of Dawson and Pace (2002), Kleina et al. (2004), Baldauf (2008), Brown and Sorhannus (2010), Beakes et al. (2011) and Malik et al. (2011) (Color figure online)

Fig. 2

Phylogenetic distribution of 5S gene arrangements in Pythium species. Blue arrows represent rDNA units and red arrows represent 5S rRNA genes. The presence of a single arrow represents genes that are tandemly repeated numerous times but not linked to any other genes. The presence of two or three arrows represents genes which are linked to form a unit which is tandemly repeated numerous times. The arrows point in the direction of transcription. Red boxes represent 5S rRNA genes for which the direction of transcription is not known. Note that most of these arrangements were deduced from PCR amplifications and that the two 5S rRNA genes found in some species might not both be functional (see Belkhiri and Klassen 1996). The gene arrangements are from Bedard et al. (2006) and the phylogenetic relationships shown are based on the study of Lévesque and de Cock (2004) (Color figure online)

Fig. 3

Phylogenetic distribution of 5S gene arrangements in Alveolates (Ciliates, Apicomplexa and Dinoflagellates). Blue arrows represent rDNA units, red arrows represent 5S rRNA genes, green arrows represent spliced-leader genes, the black arrow represents U6 snRNA genes and the white arrow represents ubiquitin genes. The presence of a single arrow represents genes that are tandemly repeated numerous times but not linked to any other genes. The presence of two or more arrows represents genes which are linked to form a unit which is tandemly repeated numerous times. The arrows point in the direction of transcription. The presence of the word ‘and’ represents the fact that two types of units (or three types of units, when the word ‘and’ appears twice) are found in this species. The references and accession numbers of the genes shown are listed in Supplemental Table 2. The phylogenetic relationships shown are based on the work of Zhang et al. (2007), Baldauf (2008), Reeb et al. (2009), Janouškovec et al. (2010), Templeton et al. (2010), Bachvaroff et al. (2011) and Zhang et al. (2011) (Color figure online)

Fig. 4

Phylogenetic distribution of 5S gene arrangements in Euglenozoa (Kinetoplastids, Diplonemids and Euglenids) species. Green arrows represent spliced-leader sequences, red arrows represent 5S rRNA genes, blue arrows represent rDNA units and black arrows represent snRNA genes. The presence of a single arrow represents genes that are tandemly repeated numerous times but not linked to any other genes. The presence of two or more arrows represents genes which are linked to form a unit which is tandemly repeated numerous times. The arrows point in the direction of transcription. The presence of the word ‘and’ represents the fact that two types of units (or three types of units, when the word ‘and’ appears twice) are found in this species. Red boxes represent 5S rRNA genes for which the direction of transcription is not known and ψ indicates 5S pseudogenes. The star (*) next to the L. seymouri green arrow indicates that several undescribed isolates related to this species have been shown to contain a 5S rRNA sequence within their spliced-leader repeats (Westenberger et al. 2004). The references and accession numbers of the genes shown are listed in Supplemental Table 3. The phylogenetic relationships shown are based on the studies of Stevens et al. (1999), Frantz et al. (2000), Preisfeld et al. (2001), Hughes and Piontkivska (2003a, b), Hamilton et al. (2004), Moreira et al. (2004), Simpson and Roger (2004), von der Heyden et al. (2004), Baldauf (2008), D’Avila-Levy et al. (2009) and Deschamps et al. (2011) (Color figure online)

Fig. 5

Phylogenetic distribution of 5S gene arrangements in Amoebozoa species. For Entamoeba species, the arrangements of 5S genes relative to the tandemly repeated tRNA arrays are shown. In these arrays, each tRNA is represented using the single-letter amino acid code. 5S genes are represented by the number 5 shown in red. Numbers 1 and 2 in parentheses represent the fact that the VQ5 array is found in two different locations in the genome of E. terrapinae. X represents the presence of a gene encoding an unidentified small RNA common to all species. Each column represents the organization of a given tRNA gene in all five Entamoeba species. For example, 5S genes are not linked to AlaCGC tRNA in any Entamoeba species except E. terrapinae. Each row represents the linkages between 5S and tRNA genes in each genome. For example, 5S genes are found in two different arrays in the E. invadens genome, i.e., those coding for VMEDR5E and FW5. For simplicity, the orientation of the Entamoeba genes is not shown. For Dictyostelium species, the arrangements of 5S genes (red arrows) relative to the tandemly repeated ribosomal RNA genes (blue arrows) are shown. These two types of genes form a unit which is tandemly repeated numerous times in the genome of these two species. The arrows point in the direction of transcription. The references and accession numbers of the genes shown are listed in Supplemental Table 4. The phylogenetic relationships are from Clark et al. (2006) and Cole et al. (2010) (Color figure online)

The first reports of 5S gene linkages were those of Rubin and Sulston (1973) and Maizels (1976), who showed that these genes were linked to rDNA units in Saccharomyces cerevisiae and Dictyostelium discoideum, respectively. Because these species were considered to be ‘lower’ eukaryotes, and because their 5S rRNA gene arrangement was similar to that of eubacterial species (where 16S, 5S and 23S ribosomal RNAs are linked), this arrangement was at first interpreted as representing some sort of transitional stage between the linked eubacterial arrangement and the unlinked ‘higher’ eukaryote arrangement (Gerbi 1985; Drouin et al. 1987; Belkhiri et al. 1992; Howlett et al. 1992; Drouin and Moniz de Sá 1995). However, a large variety of 5S rRNA gene linkages have since been found and this interpretation is no longer tenable. As described earlier, in protists, these include linkages to SL genes, snRNA genes, ubiquitin genes and pseudogenes, and tRNA genes. Furthermore, a large variety of 5S rRNA gene linkages are also found in multicellular eukaryotes. These include linkages to ribosomal RNA genes in land plant, nematode, spider and crustacean species, linkages to SL genes in nematode and hydrozoan species and linkages to histone and/or snRNA genes in fish, crustacean, insect and mollusk species (reviewed in Drouin and Moniz de Sá 1995; Pelliccia et al. 2001; Eirín-López et al. 2004; Cross and Rebordinos 2005; Manchado et al. 2006; Derelle et al. 2010; Cabral-de-Mello et al. 2011; Vierna et al. 2011; Wicke et al. 2011). The fact that 5S rRNA genes are found linked to such a variety of other genes argues against the linkage of 5S rRNA genes and ribosomal RNA genes being the primitive condition, not only in protists but also in other eukaryote species.

Variation of 5S rRNA Gene Linkages at Different Taxonomic Levels

The best way to study the evolution of 5S rRNA gene linkages is to map them to independently derived phylogenies (Drouin and Moniz de Sá 1995). These analyses show that 5S rRNA gene linkages are variable within large taxonomic groups, within genera and even within species (Figs. 1, 2, 3, 4 and 5).

In some large taxonomic groups, such as stramenopiles and apicomplexa (Figs. 1, 2 and 3), 5S rRNA genes are only found linked to rDNA units whereas in others, such as dinoflagellates and euglenids, they are found linked to a larger variety of other genes, such as SL and snRNA genes (Figs. 3 and 4). However, the arrangement of 5S rRNA genes changes during evolution, even in taxonomic groups where 5S rRNA genes are only found linked to rDNA units. Whereas most autotrophic stramenopiles have 5S rRNA genes linked to rDNA units, three species, S. petersenii, D. brightwellii and N. palea, have lost this linkage (Fig. 1). Conversely, whereas most apicomplexa species do not have 5S rRNA genes linked to their rDNA units, the rDNA units of two species, T. gondii and A. taiwanensis, have independently acquired 5S rRNA genes (Fig. 3).

5S rRNA gene linkages are also variable within genera. Within the metamonada, 5S rRNA genes are not linked to other gene families in two Trichomonas species whereas they are linked to ubiquitin genes and pseudogenes in T. foetus (Fig. 1). In oomycetes, the well-studied Pythium genus shows that the 5S rRNA gene arrangement within rDNA units is highly variable even between closely related species (Fig. 2). This includes variation in the number of linked 5S rRNA genes, from zero to two, as well as variation in the orientation of these 5S rRNA genes. In dinoflagellates, one Prorocentrum species has its 5S rRNA genes linked to rDNA units, whereas the other does not (Fig. 3). In kinetoplastids, closely related Trypanosoma species either have or do not have 5S rRNA genes linked to their SL genes (Fig. 4). In Entamoeba species, 5S rRNA genes are linked to different tRNA genes (Fig. 5).

The within-species variation in 5S rRNA gene arrangement suggests that the linkage of 5S rRNA genes to other genes is a dynamic process. For example, the fact that some species, such as the dinoflagellate species K. veneficum and P. piscicida and the euglenid species M. pellucidum, have linked 5S rRNA and SL genes as well as independent clusters of tandemly repeated 5S rRNA genes and SL genes suggests that neither the linked nor the unlinked arrangement has yet been fixed in these species and that both types of arrangement coexist. Such coexistence also supports the conclusion that the presence or absence of such linkages has no functional purpose (see the following).

Mechanisms: Transposition and Homogenization

The numerous types of 5S rRNA gene linkages observed at different taxonomic levels clearly show that, in protists, there is no such thing as a universal primitive 5S rRNA gene arrangement. Rather, 5S rRNA genes move in and out of a variety of tandemly repeated gene families. This raises three questions: Is it the 5S rRNA genes or the other genes that move around? How do they move around? And how are they homogenized throughout the gene families into which they transpose?

Given the wide variety of genes with which 5S rRNA genes are found associated, such as ribosomal RNA, ubiquitin, histone, SL, snRNA and tRNA genes, the most parsimonious possibility is that it is the 5S rRNA genes that are transposed to other gene families and not the opposite. Furthermore, at least two possible mechanisms have been proposed to explain how 5S rRNA genes can be transposed to other loci: the insertion of extrachromosomal covalently closed circular DNAs (cccDNAs) and RNA-mediated transposition (reviewed in Drouin and Moniz de Sá 1995). cccDNAs are thought to be formed by homologous recombination between adjacent repeated sequences, and cccDNAs containing 5S rRNA genes have been observed in many eukaryotic species including plant, invertebrate and vertebrate species (reviewed in Cohen and Segal 2009; Cohen et al. 2010). Given that 5S rRNA genes are often found as tandemly repeated sequences in diverse eukaryotic genomes, it is likely that such molecules exist in most eukaryotic genomes.

As discussed in more detail in Drouin and Moniz de Sá (1995), RNA-mediated transposition of 5S rRNA genes is also a likely transposition mechanism because their promoter sequences are located within their coding regions. Therefore, RNA-mediated transposed 5S rRNA genes can still be expressed because they contain the necessary promoter sequences. In fact, such expressed reverse-transcribed 5S rRNA gene sequences have been found dispersed in the mouse and rat genomes (Drouin 2000). The fact that these genes have been shown to be expressed suggests that they are not deleterious and that their dispersed arrangement does not preclude them from being expressed. The 5S rRNA gene of T. vivax, which is linked to SL genes, also has a 16-bp-long poly-A tract at its 3′-end, which suggests that it transposed to the SL locus through an RNA intermediate (Fig. 4; Roditi 1992). Therefore, the retrotranscription of 5S rRNA molecules into cDNA molecules by an endogenous reverse transcriptase, followed by their insertion back into the genome, is a likely mechanism of 5S rRNA gene transposition. In contrast, most of the other genes with which 5S rRNA genes are found linked, such as ribosomal RNA genes and protein-coding genes, would most likely not be functional if they were retrotransposed because, given that their promoter sequences are located upstream from their coding regions, they would lack their necessary promoter sequences (Drouin and Moniz de Sá 1995; Paule and White 2000; Lenhard et al. 2012). The fact that we know of no examples of linkages between rDNA genes, ubiquitin, histone and/or spliced leader genes supports the suggestion that such genes do not frequently move around genomes.

As discussed further in the following, the reason why 5S rRNA genes are found linked with a wide variety of genes is likely not a functional one because 5S rRNA genes have no obvious functional relationship with most of these genes (i.e., apart from their relationship with the other ribosomal RNA genes). However, the genes with which 5S rRNA genes are found linked do have a common feature: They are all gene families that are encoded by tandemly repeated genes. This immediately suggests that the mechanisms by which 5S rRNA genes move in and out of a variety of tandemly repeated gene families are those involved in the concerted evolution of gene families: gene conversion and unequal crossing-over (Dover 1982; Arnheim 1983; Dover 1993; Elder and Turner 1995; Graur and Li 2000). In the case of tandemly repeated gene families, it is often difficult to determine which of these two mechanisms is the most important for their concerted evolution because all repeats are almost identical (Eickbush and Eickbush 2007). Furthermore, both mechanisms can alter gene copy numbers between generations (e.g., Szostak and Wu 1980; Pukkila and Skrzynia 1993). A change in copy numbers cannot therefore be assumed to necessarily represent the effect of unequal crossing-over events. For example, gene conversions have been shown to be responsible for the gain and loss of rDNA units in the yeast genome (Gangloff et al. 1996). Smith (1974), using computer simulations, calculated that a tandemly repeated gene family having n genes would require n² unequal crossing-over events to be homogenized. The homogenization of an rDNA gene family like that of Saccharomyces cerevisiae, which contains some 100–200 tandemly repeated rDNA units at a single locus on chromosome 12, would therefore only require some 10,000–40,000 generations (Bergeron and Drouin 2008). This predicted number of generations is of the same order as that obtained in the experimental work of Szostak and Wu (1980), who determined that some 56,000 generations were required to homogenize the rDNA repeat units of S. cerevisiae. For yeasts, this could represent as little as 15 years because, under laboratory conditions, they can have up to ten generations a day (Ganley and Kobayashi 2011). The recent study of Ganley and Kobayashi (2011) showed that a yeast rDNA unit can be gained or lost as frequently as once every cell division and that duplications and deletions occurred stochastically and at similar rates. They also found that large rDNA deletions are frequent events which are likely to further increase the efficiency of homogenization. Rapid homogenization of plant ribosomal RNA units has also been observed (Kovarik et al. 2005). These fast rates of homogenization are likely responsible for the fact that, for example, so little intra-genomic variation is observed between the rDNA units of fungal species and the SL genes of trypanosome species (Ganley and Kobayashi 2007; Thomas et al. 2005). Such a high degree of homogenization is expected to be most pronounced for clustered tandemly repeated gene families because intrachromosomal exchanges are more frequent than interchromosomal exchanges (Schlötterer and Tautz 1994; Kuhn et al. 2011).
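
The arithmetic behind these estimates can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it takes the n² approximation attributed to Smith (1974) at face value, treats one unequal crossing-over event as one generation (as the estimates quoted above implicitly do), and uses hypothetical function names introduced here for illustration.

```python
# Back-of-the-envelope check of the homogenization estimates quoted in the text.
# Function names are hypothetical; the n**2 rule is used as stated, not derived.

def generations_to_homogenize(n_repeats: int) -> int:
    """Approximate generations needed to homogenize n tandem repeats (n squared)."""
    return n_repeats ** 2


def generations_to_years(generations: int, generations_per_day: float) -> float:
    """Convert a number of generations into years at a fixed daily division rate."""
    return generations / generations_per_day / 365.0


# Yeast rDNA array: some 100-200 repeats at a single locus.
low_estimate = generations_to_homogenize(100)
high_estimate = generations_to_homogenize(200)

# Experimental figure quoted in the text: about 56,000 generations,
# at up to ten generations a day under laboratory conditions.
observed_years = generations_to_years(56_000, 10)

print(low_estimate, high_estimate, round(observed_years, 1))
```

Under these assumptions the predicted range is 10,000 to 40,000 generations, and 56,000 generations at ten generations a day corresponds to a little over 15 years.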

A Case of Nonadaptive Evolution

As humans, we tend to search for meaning every time we observe correlations or relationships. It is therefore not surprising that functional explanations have been proposed to explain the diverse 5S rRNA gene linkages observed. As mentioned earlier, given that the first 5S rRNA gene linkages observed in eukaryotes were with the ribosomal RNA genes of ‘lower’ eukaryotic species, this arrangement was at first interpreted as representing a transitional stage between the linked arrangement of ribosomal genes in eubacteria and the unlinked arrangement observed in most ‘higher’ eukaryotes. Given that 5S rRNA genes are now known to be linked to a wide variety of other genes, and that such linkages are observed in a wide variety of eukaryotic taxa, this explanation is no longer tenable. Other researchers also suggested that the observed 5S rRNA linkages were adaptive because they allowed the coordinated expression of the linked genes (Aksoy et al. 1992; Howlett et al. 1992; Andrews et al. 1987; Bedard et al. 2006; Wicke et al. 2011). However, the fact that 5S rRNA genes (transcribed by RNA polymerase III) have been found linked to ribosomal RNA genes (transcribed by RNA polymerase I), SL, ubiquitin and histone genes (transcribed by RNA polymerase II) as well as tRNA (transcribed by RNA polymerase III) and snRNA genes (transcribed by RNA polymerase II and III) does not support this hypothesis (Campbell et al. 2000; Thomas and Chiang 2006; Carter and Drouin 2009; Martínez-Calvillo et al. 2010). Furthermore, the fact that 5S rRNA genes are linked to both ribosomal RNA and SL genes in the genome of Euglena gracilis (Fig. 4) suggests that coordinated expression does not provide a selective advantage. Therefore, the absence of conservation in the arrangement of 5S rRNA genes, as well as the absence of support for the coordinated expression hypothesis, suggests that the arrangement of 5S rRNA genes is not subject to selective forces.

As mentioned earlier, the most obvious feature shared by the genes with which 5S rRNA genes are found linked is that they are all tandemly repeated in eukaryotic genomes. Because tandemly repeated genes evolve in a concerted fashion, a 5S rRNA gene sequence moving in or out of one of these repeats is quickly homogenized throughout all repeats of the gene family (see the preceding). For example, most autotrophic stramenopiles, and most of their sister taxa, have their 5S rRNA genes linked to their ribosomal RNA genes (and in the same orientation; Fig. 1). However, in autotrophic stramenopiles, the linked 5S rRNA genes were lost twice independently, once in the S. petersenii lineage and once in the lineage leading to D. brightwellii and N. palea (Fig. 1). Conversely, in the Trichomonas genus, 5S rRNA genes are not linked to any other genes in two species, but are linked to ubiquitin genes and pseudogenes in T. foetus (Fig. 1). Furthermore, changes include not only 5S rRNA gene losses or gains but also inversions. For example, in oomycetes (heterotrophic stramenopiles), the orientation of 5S rRNA genes relative to the ribosomal RNA genes was independently inverted repeatedly (Figs. 1 and 2).

Obviously, these deductions assume that the phylogenies on which we mapped the 5S rRNA gene arrangements are accurate. Although we cannot estimate how many of the phylogenetic relationships are accurate, the fact that different gene organizations are observed within well-supported (monophyletic) taxa demonstrates that such gene organization changes do occur irrespective of the actual branching orders. For example, different gene organizations exist within the Trichomonas, Pythium, Karlodinium, Prorocentrum, Herpetomonas, Trypanosoma (including the well-supported T. brucei and T. cruzi clades) and Entamoeba genera (Figs. 1, 2, 3, 4 and 5).

As discussed earlier, these changes can occur relatively quickly because only n² generations, where n is the number of repeats in a gene family, are required to homogenize a new variant. As these tandemly repeated gene families usually contain from about 100 to 300 members (Nelson et al. 1983; Muhich et al. 1987; Bellofatto et al. 1988; Keller et al. 1992; Keeling and Doolittle 1995), homogenization therefore requires some 10,000–100,000 generations, a relatively short evolutionary time for most protist species. Furthermore, the study of Zhang et al. (2009), one of the few studies that investigated intraspecific variation in the organization of 5S and SL genes, found that each of the six dinoflagellate species they investigated contained many different types of SL repeats containing complete or partial SL sequences, some of which also contained 5S rRNA and U6 snRNA genes (Fig. 3). Such extensive intraspecific variation shows that different gene organizations do co-exist in many species. Therefore, even though at any one time one of the variants might increase in frequency, often to the point where other organizations are not easily detectable (e.g., Bedard et al. 2006), other organizations may later become dominant. This is particularly true for many protozoan species where founder events, such as when a trypanosome species is transferred between vertebrate or invertebrate species living in the same environment, are likely to lead to the fixation of new gene organizations (Stevens et al. 2001; Stevens 2008). Such founder events not only likely explain the high diversity of these species but are also likely responsible for the numerous reversions in 5S rRNA gene organization observed during protist evolution (Westenberger et al. 2004; Figs. 1, 2, 3, 4 and 5). For example, the recurring presence and absence of 5S rRNA genes linked to the SL genes of trypanosome species (Fig. 4) likely does not represent independent losses and gains of 5S rRNA genes in these lineages, but simply a change in the abundance of pre-existing repeats with and without 5S rRNA genes.
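
The way founder events can fix one of two pre-existing arrangements without any selection can be illustrated with a toy simulation. The sketch below is not a model used in any of the studies cited here: it is a minimal neutral Wright-Fisher-style resampling in which each individual is assumed to carry either the linked or the unlinked arrangement, ignoring intragenomic homogenization. The founder size, starting frequency, number of replicates and all function names are arbitrary assumptions chosen for illustration.

```python
import random

# Toy illustration (assumed parameters, hypothetical names): neutral drift of two
# pre-existing arrangements, "linked" and "unlinked", in small founder populations.

def drift_until_fixation(pop_size, start_freq, rng, max_generations=100_000):
    """Resample a two-variant population each generation until one variant is fixed.

    Returns (final frequency of the linked variant, generations elapsed).
    """
    count = round(pop_size * start_freq)
    generations = 0
    while 0 < count < pop_size and generations < max_generations:
        freq = count / pop_size
        # Each individual of the next generation inherits the linked
        # arrangement with probability equal to its current frequency.
        count = sum(1 for _ in range(pop_size) if rng.random() < freq)
        generations += 1
    return count / pop_size, generations


def founder_outcomes(replicates, founder_size, start_freq, seed=0):
    """Count how many independent founder populations fix each arrangement."""
    rng = random.Random(seed)
    linked = unlinked = 0
    for _ in range(replicates):
        final_freq, _ = drift_until_fixation(founder_size, start_freq, rng)
        if final_freq == 1.0:
            linked += 1
        elif final_freq == 0.0:
            unlinked += 1
    return linked, unlinked


fixed_linked, fixed_unlinked = founder_outcomes(
    replicates=200, founder_size=10, start_freq=0.5
)
print(fixed_linked, fixed_unlinked)
```

In this sketch both arrangements start at equal frequency and neither has any advantage, yet independent founder populations end up fixed for one or the other, which is the pattern of recurring presence and absence described above.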

In summary, the features of 5S rRNA genes, such as their high copy number and expression level, as well as their internal promoters that allow transposed copies to be expressed, allow them to be functional when a copy serendipitously transposes into an unrelated tandemly repeated gene family. Although they most likely transpose haphazardly throughout genomes, some of these unique transposition events become noticeable when they become homogenized within a tandemly repeated gene family. Furthermore, as recently emphasized by Lynch (2007a, b), selection need not be invoked to explain numerous features of eukaryotic genomes. Rather, these features can readily be explained by the action of nonadaptive processes such as recombination and drift. The high variability in 5S rRNA gene arrangements observed between eukaryotic species therefore likely represents the nonadaptive fixation of haphazard recombination events in species with short generation times and frequent founder events.