Molecular analyses of mitochondrial pseudogenes within the nuclear genome of arvicoline rodents
- Cite this article as:
- Triant, D.A. & DeWoody, J.A. Genetica (2008) 132: 21. doi:10.1007/s10709-007-9145-6
Nuclear sequences of mitochondrial origin (numts) are common among animals and plants. The mechanism(s) by which numts transfer from the mitochondrion to the nucleus is uncertain, but their insertions may be mediated in part by chromosomal repair mechanisms. If so, then lineages where chromosomal rearrangements are common should be good models for the study of numt evolution. Arvicoline rodents are known for their karyotypic plasticity, and numt pseudogenes have been discovered in this group. Here, we characterize a 4 kb numt pseudogene in the arvicoline vole Microtus rossiaemeridionalis. This sequence is among the largest numts described for a mammal lacking a completely sequenced genome. It encompasses three protein-coding and six tRNA pseudogenes that span ∼25% of the entire mammalian mitochondrial genome. It is bordered by a dinucleotide microsatellite repeat and contains four transposable elements within its sequence and flanking regions. To determine the phylogenetic distribution of this numt among the arvicolines, we characterized one of the mitochondrial pseudogenes (cytochrome b) in 21 additional arvicoline species. The average rate of nucleotide substitution in this arvicoline pseudogene is estimated as 2.3 × 10⁻⁸ substitutions per site per year. Furthermore, we performed comparative analyses among all species to estimate the age of this mitochondrial transfer at nearly 4 MYA, predating the origin of most arvicolines.
Keywords: Chromosomal rearrangements, Cytochrome b, Microtus, Mitochondrial genome, numt, Transposable elements, Rodent, Vole
The historic transfer of proto-mitochondrial DNA to the nucleus and the subsequent integration of mitochondrial genes into the nuclear genome has been one of the predominant forces driving the evolution of these two distinct genomes (Margulis 1970; Lang et al. 1999). Although the two genomes remain physically separated, insertion of mitochondrial DNA into the nucleus has continued and has resulted in the accumulation of translocated mitochondrial fragments within eukaryotic nuclear genomes. Despite their absolute numbers, numts (nuclear mitochondrial pseudogenes; Lopez et al. 1994) usually comprise less than 0.1% of the nuclear genome because they are typically small in length. However, numts can range in size from 30 bp in rice (Fukuchi et al. 1991) to 14.7 kb in humans (Mourier et al. 2001) and their average size can vary by orders of magnitude across taxa (Leister 2005). Numt integration seems to be more extensive in plants as their sizes can be hundreds of kb in length (Stupar et al. 2001; Noutsos et al. 2005). The size or abundance of numts does not correlate with absolute genome size, gene density or abundance of mitochondrial transcripts, which could suggest that the accumulation of numts is lineage-specific (Woischnik and Moraes 2002; Richly and Leister 2004).
Numts often integrate into non-coding regions of the nuclear genome (Leister 2005), but Ricchetti et al. (2004) found that in humans, recently transferred numts preferentially insert within or near coding regions and may alter gene function. This result is consistent with the idea that numt transfer may be mediated by chromosomal repair mechanisms. Transcription can induce DNA breaks (Aguilera 2002; Gonzales-Barrera et al. 2002); therefore, highly expressed genes may be associated with chromosomal breaks and may be targets for numt insertions. Indeed, there is empirical evidence that numts are often coupled with chromosomal breaks and end-joining mechanisms (Blanchard and Schmidt 1996). Willet-Brozick et al. (2001) analyzed a breakpoint junction of a chromosomal translocation and discovered a 42 bp numt insertion of the mitochondrial 12S rRNA gene, while Ricchetti et al. (1999) reported that numts were transferred to yeast chromosomes during the repair of double-strand breaks.
Numts have been described in a variety of animal and plant taxa, but rodents are an appealing animal model for their study. In particular, rodent genomes evolve rapidly relative to other mammals at both the DNA sequence level and at the chromosomal level (Contreras et al. 1990; Li et al. 1996; Cooper et al. 2003; Rat Genome Sequencing Project Consortium 2004). More specifically, arvicoline rodents are an especially attractive model for the study of numts because cursory descriptions of their numts have been published (DeWoody et al. 1999; Jaarola and Searle 2004) and their karyotype is exceedingly plastic (Matthey 1973; Modi 1987; Mazurok et al. 2001). Furthermore, arvicolines have evolved much faster than most rodents—over 60 species of the vole genus Microtus have evolved in less than 2 million years (Chaline et al. 1999). The elevated rate of speciation in Microtus is mirrored by the rapid rate of nucleotide substitution in its mitochondrial genome (Conroy and Cook 2000; Triant and DeWoody 2006).
The rodent subfamily Arvicolinae (voles, lemmings, and muskrats) includes >130 species distributed among >20 genera (Musser and Carleton 2005) although the exact number of species has varied widely. The earliest arvicoline rodent is thought to have arisen during the early Pliocene of North America and Eurasia (∼5–6 MYA; Repenning 1987; Martin 1989) but most of the contemporary diversity within the Arvicolinae is found within the genus Microtus. This genus underwent a rapid diversification and now encompasses more than 50% of the species within the entire subfamily. The accelerated rate of speciation seen within Microtus has been attributed to chromosomal rearrangements (Reig 1989), and the rapid rate of karyotypic change is illustrated by diploid numbers (2n) that range from 17 to 64 (Matthey 1973; Maruyama and Imai 1981). Whether these chromosomal changes are driving speciation is unclear, but if numt insertions are driven by chromosomal repair mechanisms, the chromosomal rearrangements that have occurred throughout the evolutionary history of arvicoline rodents would seem to provide numerous opportunities for nuclear integrations.
Once integrated into nuclear DNA, the evolutionary rate of mammalian numts should slow because of the difference in substitution rates between the mitochondrial and nuclear genomes. As such, numts may represent an ancestral form of their corresponding mitochondrial fragment and reflect the underlying mutational process (Li et al. 1981; Perna and Kocher 1996). Substitution patterns can reveal whether numts arose from multiple insertion or duplication events and allow for the examination of relative rates of evolution between the nuclear and mitochondrial genome. These types of comparative analyses, however, can be confounded by recombination and indel insertions among numt sequences. For example, complete genomic sequences provide a means to identify entire ranges of numts within an organism but reports of human numt abundance have been conflicting (Tourmen et al. 2002; Woischnik and Moraes 2002; Richly and Leister 2004). The prevalence of indels has likely contributed to alignment ambiguities and discrepancies among numt abundance estimates.
Here, we address some of the challenges involved in isolating numts from multiple species and assessing the evolutionary history of those sequences. We investigate the molecular evolution and structural composition of a large arvicoline numt that contains the cytochrome b pseudogene (ψcytb) originally described by DeWoody et al. (1999) in the Eurasian species M. arvalis. We further extend the original analysis of ψcytb in M. rossiaemeridionalis (the sister taxon of M. arvalis) to the entire numt by characterizing the transposable and repetitive elements that flank it. By using a comparative approach to date the original arvicoline ψcytb transfer, we evaluate the rate of molecular evolution within the nuclear genome and compare it to the accelerated rate of molecular evolution in arvicoline mtDNA (Triant and DeWoody 2006).
Materials and methods
Isolation of ψcytb
Table 1 Arvicoline species used in this study and their museum accession numbers (table body not recovered; surviving row labels: Southern red-backed vole, Northern red-backed vole, Southern bog lemming)
Genomic DNA was extracted with a standard proteinase K/phenol–chloroform protocol (Sambrook and Russell 2001). PCRs were performed in a final volume of 25 μl and included 1× ThermoPol Buffer (New England Biolabs), 0.2 mM dNTPs, 0.25 μM each primer, 2.5 U Taq DNA polymerase (New England Biolabs), and 0.03 U Pfu DNA polymerase (Stratagene). The thermal profile consisted of an initial denaturation at 94°C for 2 min; 32 cycles of 94°C for 1 min, 55°C for 30 s, and 72°C for 1 min; and a final elongation step at 72°C for 4 min. PCR products were cleaned with QIAquick purification kits (Qiagen). We then identified restriction sites specific to each fragment (nuclear and mitochondrial) and digested the amplicons with the restriction enzyme Rsa I (New England Biolabs) to determine whether the products were nuclear in origin prior to direct sequencing. Putative nuclear products were sequenced in both directions with the amplification primers and two internal sequencing primers (PcytbSeq1: 5′-TTCAGTAGACAAAGTCACTC-3′, PcytbSeq2: 5′-GGAATAGTAGGAGAACTAAT-3′) using BigDye v.3.1 (Applied Biosystems) following the manufacturer’s protocol modified to one-eighth reactions, then cleaned with a sodium acetate precipitation. We aligned each amplicon with its corresponding mitochondrial cytochrome b sequence and concluded that the amplicons were numts by the presence of stop codons or indels under both the universal and mammalian mitochondrial genetic codes.
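The stop-codon and indel screen described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names and the toy sequences are hypothetical, and the only biological facts assumed are the stop codons of the standard ("universal") code and of the vertebrate mitochondrial code (NCBI translation table 2), which is the code used by mammalian mitochondria.

```python
import re

# Stop codons under the two genetic codes named in the text.
# "vertebrate_mito" is NCBI translation table 2: TGA encodes Trp,
# while AGA and AGG terminate translation.
STOPS = {
    "standard": {"TAA", "TAG", "TGA"},
    "vertebrate_mito": {"TAA", "TAG", "AGA", "AGG"},
}

def in_frame_stops(seq, code="vertebrate_mito", frame=0):
    """Return 0-based codon indices of in-frame stop codons (gaps removed first)."""
    seq = seq.upper().replace("-", "")
    stops = STOPS[code]
    hits = []
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in stops:
            hits.append((i - frame) // 3)
    return hits

def frameshift_gaps(aligned_seq):
    """Return lengths of gap runs that are not multiples of 3 (frameshifting indels)."""
    return [len(m.group()) for m in re.finditer(r"-+", aligned_seq)
            if len(m.group()) % 3 != 0]

# Toy sequence (hypothetical): codons ATG TGA AGA TAA
toy = "ATGTGAAGATAA"
standard_hits = in_frame_stops(toy, "standard")       # TGA and TAA
mito_hits = in_frame_stops(toy, "vertebrate_mito")    # AGA and TAA
shifts = frameshift_gaps("ATG--CTTA---G")             # only the 2-base gap shifts the frame
```

A sequence with internal stops under both codes, or with gap runs whose lengths are not multiples of three relative to the mitochondrial gene, is a candidate pseudogene; a terminal stop alone is not diagnostic.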
After identifying the ψcytb numt in the original four Microtus species, we sampled additional taxa within the Arvicolinae: 12 North American endemic Microtus species, one Asian species, three species from the genus Clethrionomys (the putative sister genus to Microtus), the lemming genus Synaptomys, and the muskrat genus Ondatra, one of the more primitive genera within the arvicoline rodents. It has been reported that multiple numts that span a similar portion of the mtDNA molecule can independently transfer to the nucleus (Mirol et al. 2000; Mundy et al. 2000). Thus, whenever possible we used the same primers and PCR conditions in an attempt to isolate orthologous ψcytb amplicons from all taxa sampled. However, our PCR amplifications were sub-optimal when used with the genus Clethrionomys, so we designed new primers specific to the problematic taxa.
Elongation of ψcytb in M. rossiaemeridionalis
We then attempted to determine whether the ψcytb pseudogene was part of a larger numt and used a Genome Walker kit (Clontech) to define the insertion boundaries in M. rossiaemeridionalis, the sister taxon of M. arvalis. The Genome Walker kit utilizes a linker-ligation/primer-walking protocol to identify unknown flanking sequences. The Pcytb primer sequences described above were elongated to create a set of nested primers on either end of ψcytb. We followed the manufacturer’s suggested protocols, but modified the PCR conditions in that we used 5 U Taq (New England Biolabs) and 0.05 U Pfu DNA polymerase (Stratagene) in all reactions to help reduce fidelity errors (Cline et al. 1996). All amplicons were cleaned, bidirectionally sequenced, and aligned with the sequence of the M. rossiaemeridionalis mitochondrial genome (DQ015676; Triant and DeWoody 2006).
Comparative sequence analyses
We constructed two sets of sequence alignments. One dataset consisted of the ∼1 kb ψcytb nuclear pseudogenes and their corresponding mitochondrial cytochrome b sequences from multiple arvicoline species (n = 22, Table 1). The second dataset, from M. rossiaemeridionalis only, consisted of the full-length numt sequence (4 kb, which we term “Mr_numt”) and its corresponding mitochondrial sequence. Each arvicoline ψcytb pseudogene is presumably part of a larger numt that is orthologous to Mr_numt.
Table 2 Divergence between putative ψcytb nuclear sequences and their mitochondrial cytochrome b counterparts (table body not recovered; surviving column header: Indels (# nucleotides); surviving row label: Microtus rossiaemeridionalis (contained within Mr_numt))
Because pseudogenes are thought to be associated with transposable and repetitive elements (Mishmar et al. 2004), we used RepeatMasker (Smit et al. 1996–2004) to identify any such elements in nuclear sequences. We used MEGA 3.1 (Kumar et al. 2004) and PAUP* 4.0b10 (Swofford 2003) to generate estimates of nucleotide variability. We estimated the number of nucleotide differences per site and the ratio of transitions/transversions for all mtDNA/numt pairwise comparisons. Additionally, we estimated pairwise divergences among mitochondrial and among ψcytb sequences according to the Kimura 2-parameter model. To assess saturation in arvicoline mitochondrial sequences, we plotted uncorrected pairwise transitions and transversions against corrected pairwise divergences. We then calibrated divergence estimates within Microtus (excluding M. oregoni; see “Results/Discussion”) using 1.5 MYA as the date of the microtine radiation (Repenning 1990) to obtain an average rate of nucleotide substitution in these pseudogenes. The precise date of the microtine radiation is unclear and it is not known with certainty how many radiation events occurred (Repenning 1980; Chaline et al. 1999). Therefore, we calculated nucleotide substitution rates across the range of plausible dates (0.5–2.0 MYA).
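As a rough illustration of the divergence and rate calculations described above, the following Python sketch computes a Kimura 2-parameter distance from two aligned sequences and converts a divergence into a per-lineage substitution rate. The study itself used MEGA and PAUP*; the function names, the toy sequences, the example divergence of 0.06, and the convention rate = d / (2T) are assumptions made here for illustration and are not taken from the paper.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance: d = -1/2 * ln((1 - 2P - Q) * sqrt(1 - 2Q)),
    where P and Q are the proportions of transitions and transversions."""
    n = ts = tv = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        n += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            ts += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            tv += 1
    p, q = ts / n, tv / n
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))

def substitution_rate(d, t_years):
    """Per-lineage rate, assuming both lineages diverged t_years ago (d / 2T)."""
    return d / (2 * t_years)

# Toy alignment (hypothetical): one transition and one transversion in 10 sites.
d_toy = k2p_distance("AAAAAAAAAA", "GAAAAAAAAC")

# Hypothetical divergence of 0.06 evaluated across the calibration range in the text.
rates = {t: substitution_rate(0.06, t * 1e6) for t in (0.5, 1.0, 1.5, 2.0)}
```

Evaluating the same divergence at 0.5 and 2.0 MYA changes the inferred rate fourfold, which is why reporting rates across the whole range of plausible calibration dates matters.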
To date the original translocation of the progenitor Mr_numt sequence from the source mitochondrion to the nucleus, we used QDate 1.1 (Rambaut and Bromham 1998). This method utilizes a maximum-likelihood quartet approach and user-specified calibration dates to estimate divergence times between two monophyletic groups with different rates of evolution. We used quartet dating because rates of nucleotide substitution are expected to differ between the mitochondrion and the nucleus. In addition, this method allows for examination of each pairwise comparison within our dataset to identify numt sequences that may be non-orthologous. Because our ψcytb amplification primers were designed to be gene specific, we assumed that all sequences were orthologs unless the phylogenetic analyses suggested otherwise (see below).
Using maximum parsimony (MP) and maximum likelihood (ML) approaches as implemented in PAUP* 4.0b10 (Swofford 2003), we conducted phylogenetic analyses of the ψcytb pseudogenes and corresponding mitochondrial cytochrome b sequences (Table 1) to identify potentially non-orthologous sequences. MP analyses were performed with heuristic searches using tree-bisection-reconnection (TBR) branch swapping with 100 random addition sequences and 1,000 bootstrap replicates. Strict and majority-rule consensus trees were constructed from all equally parsimonious trees. ML analyses were conducted under the GTR + I + G model with a shape parameter of 0.7625 as determined by Modeltest 3.7 (Posada and Crandall 1998) under the hLRT and AIC criteria. We performed heuristic searches with TBR branch swapping, the as-is addition sequence, and 100 bootstrap replicates. Because saturation was observed (see “Results”), we conducted the same analyses with third positions excluded. MP analyses were conducted under the same conditions, while ML analyses were performed under the TrN + I + G model with a shape parameter of 0.8087. Indels were removed prior to all analyses. Numt pseudogenes were identified in the more primitive arvicoline species (see “Results”); thus, all trees were rooted with mitochondrial cytochrome b sequences of Cricetus cricetus and C. griseus, as the Cricetinae subfamily of Palearctic hamsters is thought to be the sister group to Arvicolinae (Steppan et al. 2004).
Isolation of ψcytb
We successfully amplified the presumptive ψcytb locus in 22 arvicoline species: 20 voles (Microtus and Clethrionomys), one lemming (Synaptomys) and one muskrat (Ondatra) (Table 2). In most cases, the pseudogene-specific primers PcytbF2 and PcytbR generated robust amplicons and unambiguous sequences with no apparent contamination by mitochondrial sequences. However, unique primers were developed for the genus Clethrionomys because of inconsistent results with PcytbF2 and PcytbR. Amplicon sizes ranged from 446 to 1,369 bp across species. In all cases, pairwise comparisons with mitochondrial cytochrome b sequences revealed numerous stop codons and frameshift indels within the pseudogene sequences.
Elongation of ψcytb in M. rossiaemeridionalis
Comparative sequence analyses
Table 3 Transposable elements associated with putative ψcytb sequences or their flanking regions (table body not recovered; surviving column header: Fraction of sequence (%); surviving row labels: M. rossiaemeridionalis (Mr_numt), 3′ flanking region, 5′ flanking region)
Pairwise divergences among Microtus mitochondrial genes ranged from 0.02 to 0.19 (mean = 0.14), while those for nuclear pseudogene sequences ranged from 0.01 to 0.16 (mean = 0.08). For most comparisons, pairwise divergences among pseudogene sequences were 2–6× lower than divergences among mitochondrial sequences (i.e., the pseudogenes seem to be evolving more slowly than their mitochondrial counterparts). However, this apparent reduction in the rate of pseudogene evolution did not hold true for Clethrionomys, Ondatra, and M. oregoni. There was evidence of saturation (both transitions and transversions) in most taxa (data not shown).
We assessed results generated under the 1-rate and 2-rate substitution models for each possible quartet (n = 210). We tested the 1-rate model to confirm that our quartets were not evolving at the same rate (as expected when considering both mitochondrial and nuclear sequences) and most quartets (n = 159) were indeed rejected. Under the 2-rate substitution model, we first discounted 5 of the 210 quartets because of model nonconformity and another 51 of the 210 quartets where QDate could not establish accurate 95% confidence intervals. Most of the discounted quartets included Clethrionomys, Ondatra, and M. oregoni. We cross-referenced the remaining 154 quartets against the 51 quartets that fit the 1-rate model and found that 25 quartets fit both models and thus were discounted from further analysis. Of these 25 quartets, all contained Clethrionomys, Ondatra, and M. oregoni. We removed the remaining quartets belonging to these taxa (n = 14) because of concerns about orthology (see below), leaving only quartets derived from Microtus. The remaining 115 quartets were used to estimate the mean divergence between mitochondrial and nuclear sequences at a value of 3.99 MYA (standard deviation 0.70; median 3.88; 95% confidence intervals 3.86–4.12). Quartet analyses conducted with only first and second positions did not provide enough resolution as most quartets (n = 168) had to be discounted because they fit both the 1-rate and 2-rate models. The remaining 42 quartets provided an estimate of 4.38 MYA (standard deviation 1.08; median 4.06; 95% confidence intervals 4.04–4.72).
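The quartet bookkeeping in this paragraph can be checked with a few lines of arithmetic. The Python sketch below uses only the counts and summary statistics reported in the text; the variable names are hypothetical, and the confidence interval uses a normal approximation (mean ± 1.96 × SD/√n), which is an assumption here because the text does not state how the intervals were calculated.

```python
import math

# Quartet counts reported in the text.
total_quartets = 210
rejected_one_rate = 159
fit_one_rate = total_quartets - rejected_one_rate   # quartets consistent with a single rate

# Two-rate model: discount model non-conformity and unusable confidence intervals.
after_two_rate = total_quartets - 5 - 51
after_both_models = after_two_rate - 25             # drop quartets that fit both models
retained = after_both_models - 14                   # drop remaining suspect-taxon quartets

def normal_ci95(mean, sd, n):
    """Approximate 95% confidence interval for a mean (normal approximation)."""
    half_width = 1.96 * sd / math.sqrt(n)
    return mean - half_width, mean + half_width

# Full dataset: mean 3.99 MYA, SD 0.70, over the retained quartets.
lo, hi = normal_ci95(3.99, 0.70, retained)
```

With these inputs the retained count comes to 115 and the interval rounds to 3.86–4.12 MYA, in agreement with the values reported above. Applying the same approximation to the 42-quartet analysis gives a slightly narrower interval than the reported 4.04–4.72, as would be expected if a small-sample t quantile was used there.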
The original ψcytb pseudogene, described in M. arvalis by DeWoody et al. (1999), is here shown to be widespread among arvicoline rodents. Although we did not characterize the flanking region of ψcytb in all 22 taxa surveyed, we did so in M. rossiaemeridionalis and found that the entire numt (Mr_numt) spans ∼4 kb, approximately 25% of the entire mitochondrial genome. Multiple stop codons and frameshift indels throughout Mr_numt strongly suggest that it is non-coding. Thus, Mr_numt is almost certainly located in the nuclear genome. Richly and Leister (2004) determined that the average size of rodent numts is 193 bp. Although mammalian numts encompassing multiple mitochondrial genes have been described [e.g., in cats (Lopez et al. 1994; Kim et al. 2006)], most mammalian numts are fragments of single mitochondrial genes (Bensasson et al. 2001). A number of long numts (some encompassing more than 75% of the mtDNA genome) have been identified bioinformatically in sequenced genomes (e.g., humans, Mourier et al. 2001), but Mr_numt is one of the largest numts yet described in animals lacking a complete genome sequence.
Whether numts preferentially insert into nuclear genomes and where these insertion sites occur has been debated in the human literature. Mishmar et al. (2004) suggested that transposable elements influence the integration of mitochondrial sequences and their duplication within the nuclear genome. They analyzed 247 numt flanking regions in the human genome and found that 59% of them were within 150 bp of a repetitive element (∼6 times more often than expected), while 14% of them were inserted directly into repetitive elements. In contrast, Ricchetti et al. (2004) reported that recently transferred numts are associated with coding regions, while others associate numts with chromosome break points (Ricchetti et al. 1999; Willet-Brozick et al. 2001). We identified a number of repetitive and transposable elements within Mr_numt and its flanking regions that comprise ∼25% of the total sequence. Mr_numt appears to have been inserted directly into a dinucleotide repeat and is associated with four transposable elements, one within Mr_numt itself and three within its flanking sequence. The distribution of repetitive elements within arvicoline genomes is not known, but ∼40% of the mouse and rat genomes consist of repetitive DNA derived from mobile elements (Mouse Genome Sequencing Consortium 2002; Rat Genome Sequencing Project Consortium 2004). Therefore, the association of arvicoline numts with repetitive and mobile elements does not seem extraordinarily high. Three Microtus species had an unidentified 66-base insertion in their ψcytb sequences. These inserted sequences consist of repetitive regions and are flanked by short direct repeats but do not match any known sequences (Fig. 3), perhaps suggesting that they are lineage specific.
We were able to isolate ψcytb in most arvicoline species using the same set of primers, but this alone does not indicate orthology. One method for confirming that numts are the result of the same insertion event is to identify insertion boundaries and compare flanking sequences (e.g., Schmitz et al. 2005). However, Mr_numt is flanked by dinucleotide and mononucleotide repeats that proved difficult to sequence. Because the inclusion of paralogous sequences within our dataset could bias many of our estimates, we employed an alternative method for assessing orthology—one of phylogenetic concordance. Our analyses indicate that most of the Microtus ψcytb pseudogenes described herein seem to be orthologs.
The quartet analysis was conducted not only to date the mtDNA transfer but also to identify suspect sequences/taxa. Some of the arvicoline species examined were problematic and led us to discount a number of quartets because of rate homogeneity found within the quartet, or because divergence estimates were contained within the 95% confidence intervals. The majority of those rejected quartets belonged to Clethrionomys, M. oregoni, and Ondatra, suggesting that the numt sequences isolated from these taxa are not orthologous to ψcytb in the other Microtus species; thus, quartets involving those taxa were disregarded.
The estimate of ∼4 MYA for the Mr_numt translocation date would predate the initial appearances of almost all of the arvicoline species used in this study. Ondatra is the oldest genus in our dataset and fossil records indicate a mid-Pliocene origin (∼3.7 MYA; Repenning 1987). If our estimated date of ∼4 MYA is accurate, then the transfer to the nucleus might have coincided with the diversification of modern arvicoline rodents and be unique to that lineage.
Within our phylogenetic tree, most numts have shorter branch lengths than their mtDNA counterparts, suggesting that the sequences are nuclear and evolving more slowly than those within the mtDNA genome. There was poor resolution within both mtDNA and numt clades, but this is not surprising as most North American Microtus species have shown little phylogenetic resolution within mtDNA cytochrome b sequences (Conroy and Cook 2000; Jaarola et al. 2004). The rapid microtine radiation, coupled with the complications of combining mtDNA and nuclear sequences with different rates of evolution into a single analysis, is likely contributing to the poor resolution of our phylogeny. M. oregoni should cluster with other North American voles (Jaarola et al. 2004), but among numt sequences it did not even group within the Microtus clade. It instead clustered with the more basal arvicoline taxa, suggesting a more ancient numt transfer or (more likely) that our sequence from M. oregoni is not ψcytb but a paralog (Fig. 6).
The relationships among numt and among mtDNA sequences were similar; however, the O. zibethicus mtDNA cytochrome b sequence clustered with the numt sequences rather than the mtDNA sequences (Fig. 6). Although bootstrap support was not high, this may reflect the putative date of numt transfer (∼4 MYA) coinciding with the emergence of this species within the fossil record (Repenning 1987). The pseudogene and mtDNA sequences did recover the same clade consisting of M. montanus, M. townsendii, and M. pennsylvanicus. All three species share a synapomorphic 66-base insertion (Table 3) and are otherwise known to be closely related taxa (Conroy and Cook 2000). Note that our phylogenetic analyses were not conducted specifically to recover the systematic relationships of arvicolines, but were used to identify potentially non-orthologous sequences (e.g., those of M. oregoni).
Our tests for saturation among arvicoline mtDNA cytochrome b sequences revealed evidence of saturation at both transitions and transversions (data not shown). Saturation at transitions would be expected because of the transition bias found in animal mtDNA that has been reported previously for rodents (Yang and Yoder 1999), but saturation at transversions is surprising; transversions typically increase linearly with time in the animal mitochondrial genome (Moritz et al. 1987). Pairwise divergences among arvicoline ψcytb sequences suggest that they are evolving more slowly than their mitochondrial counterparts, as is typical of mammalian numts (Zhang and Hewitt 1996). However, pairwise comparisons involving Ondatra, Clethrionomys, and M. oregoni suggest that their numt sequences were evolving almost as rapidly as their mitochondrial DNA. This seems unlikely, and a more parsimonious explanation is that the sequences from these taxa represent independent numts that are not orthologs of ψcytb. The scatter plot of nuclear pairwise divergences vs. mitochondrial divergences showed a clustering pattern among genera that reflected the phylogenetic relationships, with M. oregoni grouped among the more basal taxa (Fig. 4). While our analyses suggest that the numts from M. oregoni, Clethrionomys and Ondatra are not orthologs of the Microtus numts, they could be orthologs with respect to each other.
By disregarding suspected paralogs, our estimates of substitution rates should be conservative. Microtus is known to exhibit an elevated evolutionary rate within its mitochondrial genome (Triant and DeWoody 2006), but it is unknown whether the mtDNA rate is concordant with the evolutionary rate of the nuclear genome. Our estimate of 2.3 × 10⁻⁸ substitutions per site per year provides the first evidence of the evolutionary rate within the Microtus nuclear genome (Fig. 5), and is greater than that described for other mammals. Kumar and Subramanian (2002) estimated the average substitution rate in mammalian genomes to be 2.2 × 10⁻⁹ per year, whereas estimates of the rate of neutral substitution are 2.0 × 10⁻⁹ per year for humans and 4.5 × 10⁻⁹ per year for mice (Mouse Genome Sequencing Consortium 2002). Li et al. (1981) estimated the average rate of nucleotide substitution for mouse, human and rabbit pseudogenes as 4.7 × 10⁻⁹. In contrast to Kumar and Subramanian (2002) but consistent with our results, Nachman and Crowell (2000) used processed pseudogenes to estimate the human mutation rate, equivalent to the substitution rate for neutral genes, as 2.5 × 10⁻⁸. The causes of the disparity among these three studies are not immediately apparent, but one obvious reason might be sampling error: we sampled only a single pseudogene in over 20 taxa, Nachman and Crowell (2000) sampled 12 autosomal pseudogenes in humans and chimpanzees, and the genome sequence estimates (Mouse Genome Sequencing Consortium 2002) are based on ancestral repeats. Further research is needed to determine if the rate of nucleotide substitution in the Microtus nuclear genome is, like the mtDNA genome, accelerated relative to other mammals.
Prior to this study, numt pseudogenes had been reported within the arvicoline rodents (DeWoody et al. 1999; Jaarola et al. 2004; Jaarola and Searle 2004), but only the ψcytb numt in M. arvalis had been described and characterized. Although the driving forces behind the integration of mtDNA sequences into nuclear genomes (and their subsequent dispersal and accumulation) are not well understood, they may be associated with chromosome breaks. If so, then the rampant chromosomal fission/fusion events in the Microtus lineage have afforded countless opportunities for numt insertions, and thus voles could prove to be a valuable model.
We are grateful to the Natural Science Research Laboratory in The Museum of Texas Tech University, The University of Alaska Museum, The Museum of Southwestern Biology, and The Museum of Vertebrate Zoology at Berkeley for loaning us the tissue necessary for this study. We thank David Bos, Joe Busch, Jill Detwiler, Dave Glista, David Gopurenko, Maarit Jaarola, Emily Latch, Jamie Rudnick, Sara Turner, Rod Williams and an anonymous reviewer for comments on an earlier version of this manuscript. Our lab is funded in part by the USDA, the NSF, and Purdue University. This work is contribution number 2006-17842 from Purdue University.