Background

Approximately 48,000 mite and tick species (Arthropoda: Chelicerata: Arachnida: Acari) [1, 2] have been described to date. Since the number of undescribed species is thought to be twenty-fold higher, this subclass is by far the most species-rich group among the Arachnida [2]. Acari diversified 400 million years ago and currently three major lineages are recognised: Opilioacariformes, Acariformes and Parasitiformes [2, 3]. The Acariformes comprise two major groups, the Trombidiformes and Sarcoptiformes [2, 4]. Two of the most prominent members of the Sarcoptiformes are the European house dust mite Dermatophagoides pteronyssinus (Trouessart, 1897) and the American house dust mite Dermatophagoides farinae (Hughes, 1961), both belonging to the family Pyroglyphidae (cohort Astigmata). Pyroglyphid mites are typical inhabitants of animal nests. In the human environment, they are mainly found in upholstery, textile floor coverings and bedding, where they primarily feed on the skin scale fraction in house dust [5]. About 40 years ago, house dust mites were first recognised as one of the major sources of allergens in house dust [6]. The allergenic proteins are found in high concentrations in mite faeces, which, after drying and pulverising, become airborne and can be inhaled. In sensitive persons, exposure to these allergens can cause diseases such as asthma, dermatitis and rhinitis [7, 8]. In countries with a temperate climate, 6 to 35 per cent of the population is sensitive to house dust mite-derived allergens [9].

Complete mitochondrial (mt) genome sequences are becoming increasingly important for effective evolutionary and population studies. Mt genome sequences are not only more informative than shorter sequences of individual genes, they also provide sets of genome-level characters, such as the relative position of different genes, RNA secondary structures and modes of control of replication and transcription [10–13]. However, the applicability of mt genomes as a marker of highly divergent lineages is still controversial [14, 15] and remains to be elucidated [16]. In addition, unravelling mt genomes can be of economic importance as well, since several chemical classes of pesticides target mt proteins. Well-known acaricides like acequinocyl and fluacrypyrim affect mt electron transport through the inhibition of the mt encoded cytochrome b in complex III [17]. Also, the economically important class of METI (Mitochondrial Electron Transfer Inhibitors)-acaricides targets the mt complex I, although their exact molecular target has not yet been elucidated. Recently, resistance to the acaricide bifenazate was shown to be caused by mutations in the mt encoded cytochrome b and to have evolved rapidly through a short stage of mt heteroplasmy [18].

At present, the mt genomes of 20 species belonging to the Acari are available at NCBI ([19], status January 10, 2009). Most of the submitted sequences have the typical features of metazoan mt genomes. They are circular, between 13 and 20 kb in length, contain a coding region with 37 genes (22 tRNAs, 2 rRNAs and 13 protein coding genes) and a relatively small non-coding region. The latter is mostly AT-rich and fulfils a role in the initiation of replication and transcription [20, 21]. Compared to this typical configuration, the mt genomes of Steganacarus magnus, Metaseiulus occidentalis and Leptotrombidium pallidum show some abnormal features. S. magnus lacks 16 of the 22 tRNAs normally present in mt genomes [22]. M. occidentalis has an unusually large mt genome (24.9 kb) resulting from a duplication of a large fragment of the coding region. Despite its large size, genes coding for nad6 and nad3 were not found during the initial annotation process [23]. L. pallidum, on the other hand, has 38 mt genes due to a duplication of the 16S-rRNA [24].

In this study, we analyse the complete mt genome of a member of the Sarcoptiformes, the European house dust mite D. pteronyssinus, after obtaining the complete sequence using a long PCR approach.

Results and discussion

Genome organisation

The mt genome of D. pteronyssinus was amplified, using long PCR, in three overlapping fragments. The final assembled sequence was 14,203 bp [GenBank: EU884425; Fig. 1], making it the fifth smallest sequenced genome within the Acari. Only the mt genomes of Tetranychus urticae (13,103 bp), Leptotrombidium akamushi (13,698 bp), Leptotrombidium deliense (13,731 bp) and S. magnus (13,818 bp) are smaller (Table 1 [25–27]). As non-specific amplification artefacts and incomplete coverage of genes are well-known drawbacks of a PCR approach [28], we checked the genome size by restriction digest on rolling circle amplified mtDNA (Fig. 2). This approach confirmed the sequence size, considering that the relative mobility of mtDNA restriction fragments can show slight (5–12%) deviations compared to their sequence length [29]. The mt genome of D. pteronyssinus is the first mt sequence of a mite belonging to the Astigmata and, together with the mt genome of S. magnus, one of only two representatives of the order Sarcoptiformes. Adding this genome to the database resulted in 21 publicly available Acari mtDNA sequences. Twelve belong to species in the superorder Parasitiformes, whereas nine, among which D. pteronyssinus, belong to species in the superorder Acariformes.
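The size check by restriction digest can be illustrated with a minimal sketch. It assumes a circular molecule of the reported 14,203 bp; the cut positions and the observed band size are hypothetical values chosen for illustration, not the actual Xmn I or Eco RI sites, and the 12% tolerance simply takes the upper bound of the mobility deviation quoted above.

```python
def circular_fragments(genome_length, cut_positions):
    """Return fragment lengths (largest first) after cutting a circular molecule."""
    cuts = sorted(set(p % genome_length for p in cut_positions))
    if not cuts:
        return [genome_length]  # an uncut circle stays one molecule
    fragments = []
    for i, start in enumerate(cuts):
        end = cuts[(i + 1) % len(cuts)]
        # A single cut linearises the circle into one full-length fragment.
        fragments.append((end - start) % genome_length or genome_length)
    return sorted(fragments, reverse=True)


def within_tolerance(expected, observed, tolerance=0.12):
    """True if an observed band size deviates from the expected size by <= tolerance."""
    return abs(observed - expected) / expected <= tolerance


GENOME_LENGTH = 14203                    # assembled size reported in the text
hypothetical_cuts = [1200, 6500, 11000]  # assumption: illustrative positions only

fragments = circular_fragments(GENOME_LENGTH, hypothetical_cuts)
print(fragments)                     # fragment sizes always sum to the genome length
print(within_tolerance(5300, 5800))  # a band read as 5.8 kb is compatible with 5,300 bp
```

The fragment lengths of a circular digest always add up to the genome size, which is why the summed band sizes on the gel can serve as an independent estimate of the sequence length.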

Table 1 Nucleotide composition of completely sequenced mt genomes of Acari and Limulus polyphemus*.
Figure 1

Schematic representation of the mt genome of D. pteronyssinus. Except for atp8 (= 8) and nad4L (= 4L), protein coding and ribosomal genes are presented as outlined in the abbreviations section. tRNA genes are abbreviated using the one-letter amino acid code, with L1 = CUN; L2 = UUR; S1 = AGN; S2 = UCN. RNAs on the N-strand are underlined. Numbers at gene junctions indicate the length of small non-coding regions, where negative numbers indicate overlap between genes. A-, T-, G- and C-content of the mt genome is represented using a red, blue, green and purple colour graded circle, respectively. Black curved lines on the outside of these circles represent mt genome coverage by Dermatophagoides ESTs (see additional file 5 for sequences of Dermatophagoides ESTs covering the mt genome of D. pteronyssinus).

Figure 2

Restriction digest of rolling circle amplified mitochondrial DNA of D. pteronyssinus. Rolling circle amplified mtDNA, undigested (lane 3) and digested with Xmn I (lane 2) and Eco RI (lane 4). The molecular marker used was MassRuler DNA Ladder Mix (Fermentas) (lane 1).

All 37 genes present in a standard metazoan mt genome could be identified (Fig. 1). Gene overlap exists between trnD/atp8 (1 bp), trnR/nad3 (17 bp), trnM/trnS2 (12 bp), trnP/trnV (11 bp), trnV/trnK (7 bp), trnW/trnY (1 bp), trnY/nad1 (4 bp), trnI/trnQ (1 bp) and trnL1/trnC (5 bp). No overlap was found between protein coding genes. Small non-coding regions (≥ 20 bp) are present between trnS2/trnA (25 bp), trnA/trnP (20 bp) and nad1/nad6 (28 bp). A large non-coding region is positioned between trnF and trnS1 (286 bp). Twenty-five genes of the mt genome of D. pteronyssinus are transcribed on the majority strand (J-strand), whereas the others are oriented on the minority strand (N-strand).
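Overlaps and spacers of this kind follow directly from annotated gene coordinates. The sketch below uses the sign convention of Fig. 1 (negative values indicate overlap). The gene names geneA to geneC and their coordinates are hypothetical and only reproduce a 17 bp overlap and a 25 bp spacer for illustration; the genome is treated as linear, so the junction that closes the circle is not reported.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def junction_lengths(genes):
    """Return {(left, right): length} for neighbouring genes.

    Positive values are non-coding spacers, zero means directly adjacent,
    negative values are overlaps (as in the numbers printed in Fig. 1).
    """
    ordered = sorted(genes, key=lambda g: g.start)
    return {
        (left.name, right.name): right.start - left.end - 1
        for left, right in zip(ordered, ordered[1:])
    }


# Assumption: invented coordinates, not the real annotation of EU884425.
genes = [
    Gene("geneA", 100, 160),
    Gene("geneB", 144, 480),  # starts 17 bp before geneA ends
    Gene("geneC", 506, 570),  # preceded by a 25 bp spacer
]

junctions = junction_lengths(genes)
for pair, length in junctions.items():
    kind = "overlap" if length < 0 else "spacer"
    print(pair, abs(length), "bp", kind)
```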

The mt genome of the horseshoe crab Limulus polyphemus is considered to represent the ground pattern for arthropod mt genomes [30, 31]. Comparing the D. pteronyssinus genome to this sequence revealed that only 11 of the 38 gene boundaries in L. polyphemus are conserved in D. pteronyssinus (Fig. 3 [32]). Moreover, by making use of the pattern search function in the Mitome-database ([33], status January 10, 2009), the mt gene order of D. pteronyssinus appeared to be unique among arthropods. Remarkably, the relative position of trnL2 (between nad1 and 16S-rRNA), which differentiates the Chelicerata, Myriapoda and Onychophora from the Insecta and Crustacea according to Boore [12, 34], is not conserved. However, Boore's hypothesis was based on mt genome data from only 2 Chelicerata that were available in 1998. At present, 41 complete chelicerate mt genomes are available in the NCBI-database ([19], status January 10, 2009). Out of these, only 29 depict the specific arrangement of trnL2 between nad1 and 16S-rRNA (see additional file 1 for an overview of gene arrangements of chelicerate mt genomes). This illustrates that care should be taken when general rules are deduced from limited datasets.
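Counting conserved gene boundaries between two genomes can be sketched as follows. This is a minimal illustration under stated assumptions, not necessarily the procedure used for Fig. 3: a boundary is taken to be a pair of neighbouring genes with their strands, the genome is circular, and a boundary read on the opposite strand counts as the same boundary. The two five-gene orders are toy examples, not the real arrangements.

```python
def adjacencies(order):
    """Return the set of gene boundaries of a circular, signed gene order.

    `order` is a list of (gene_name, strand) tuples with strand +1 or -1.
    Each boundary is stored in a canonical form so that reading it from
    either strand gives the same key.
    """
    boundaries = set()
    n = len(order)
    for i in range(n):
        a, b = order[i], order[(i + 1) % n]
        forward = (a, b)
        reverse = ((b[0], -b[1]), (a[0], -a[1]))
        boundaries.add(min(forward, reverse))
    return boundaries


def conserved_boundaries(order1, order2):
    """Number of gene boundaries shared by two circular gene orders."""
    return len(adjacencies(order1) & adjacencies(order2))


# Assumption: toy gene orders for illustration only.
reference = [("cox1", 1), ("cox2", 1), ("atp8", 1), ("atp6", 1), ("cox3", 1)]
rearranged = [("cox1", 1), ("cox2", 1), ("cox3", 1), ("atp6", -1), ("atp8", -1)]

print(conserved_boundaries(reference, rearranged))
```

In the toy example the cox1/cox2 boundary and the inverted atp8/atp6 boundary are retained, while the other three boundaries are lost.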

Figure 3

Mitochondrial gene arrangement of Limulus polyphemus, Dermatophagoides pteronyssinus and Steganacarus magnus. Graphical linearisation of mt genomes is presented according to [32]. Gene sizes are not drawn to scale. J stands for majority and N for minority strand. Protein coding and rRNA genes are abbreviated as in the abbreviations section. tRNA genes are abbreviated using the one-letter amino acid code, with L1 = CUN; L2 = UUR; S1 = AGN; S2 = UCN. White boxes represent genes with the same relative position as in the arthropod ground pattern, L. polyphemus. Light-gray boxes represent genes that changed positions relative to L. polyphemus; dark-gray boxes represent genes that changed both position and orientation. Circular dots between the genes of D. pteronyssinus represent conserved gene boundaries compared to L. polyphemus. Square dots between the genes of S. magnus represent conserved gene boundaries compared to D. pteronyssinus.

Mt gene arrangements have already provided strong support toward the resolution of several long-standing controversial phylogenetic relationships [12]. Surprisingly, the mt gene order of D. pteronyssinus differs considerably from that of other mites (see additional file 1). Comparing the D. pteronyssinus mt genome to the mt sequence of the oribatid S. magnus [22], the closest relative of D. pteronyssinus, revealed that only 6 of the 22 gene boundaries in S. magnus are conserved in D. pteronyssinus (Fig. 3).

Extending this analysis to the other Acari mt genomes showed that in several cases the set of neighbouring genes that were not separated during evolution [35] was greater between members of different superorders (e.g. D. pteronyssinus (Acariformes) and Rhipicephalus sanguineus (Parasitiformes)) than between members of the same superorder (e.g. D. pteronyssinus (Acariformes) and T. urticae (Acariformes)) (Table 2). Exclusion of tRNAs in our analysis showed a similar trend, suggesting that protein coding genes were also involved in mt gene rearrangements. These results indicate that mt gene orders seem less useful for deduction of phylogenetic relationships between superorders within the Acari. However, comparing gene order might be more powerful to establish phylogenetic relations within families, as was previously proposed [14, 36]. In the case of the Ixodidae family, it was shown that the division of Prostriata (Ixodes sp.) and Metastriata (R. sanguineus, Amblyomma triguttatum, Haemaphysalis flava) could be linked to mt gene arrangements [37, 38].

Table 2 Pairwise common interval distance matrix of mt gene orders of Acari*.
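The idea behind a common interval measure can be sketched in a simplified form. A common interval is a set of two or more genes that lie next to each other, in any internal order, in both genomes; the more common intervals two gene orders share, the more similar they are. The version below is an assumption-laden simplification: it treats the gene orders as linear and ignores strand, so it will not reproduce the values of Table 2, and the gene orders are toy examples.

```python
def count_common_intervals(order1, order2):
    """Count gene sets (size >= 2) that are contiguous in both gene orders."""
    if sorted(order1) != sorted(order2) or len(set(order1)) != len(order1):
        raise ValueError("both orders must contain the same genes exactly once")
    position = {gene: i for i, gene in enumerate(order1)}
    count = 0
    n = len(order2)
    for i in range(n):
        lo = hi = position[order2[i]]
        for j in range(i + 1, n):
            p = position[order2[j]]
            lo, hi = min(lo, p), max(hi, p)
            # order2[i..j] is contiguous in order1 exactly when the span of
            # its positions in order1 equals its own length.
            if hi - lo == j - i:
                count += 1
    return count


# Assumption: toy gene orders for illustration only.
order_a = ["cox1", "cox2", "atp8", "atp6", "cox3"]
order_b = ["cox2", "cox1", "atp6", "atp8", "cox3"]

print(count_common_intervals(order_a, order_a))  # maximum for five genes
print(count_common_intervals(order_a, order_b))
```

For n genes the maximum is n(n - 1)/2, reached when both orders are identical; rearrangements lower the count.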

Base composition and codon usage

The overall AT-content of the mt genome of D. pteronyssinus is 72.6% (Table 1). This is within the range of the average AT-content of Acari mt genomes (74.6 +/- 4.0%). The high AT-content is reflected in the codon usage (Table 3 [39]) with nucleotides 'A' and 'T' preferred over 'C' and 'G' on the wobble position and the predominant use of codons deficient in 'C' or 'G'. For example, the most frequently used codons are TTT (F) (105 codons per 1000 codons) and TTA (L2) (78 codons per 1000 codons).

Table 3 Relative synonymous codon usage (RSCU) and number of codons per 1000 codons (NC1000) in the protein coding genes of the mitochondrial genome of D. pteronyssinus.
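The two quantities in Table 3 can be computed from codon counts as sketched below. RSCU is the observed count of a codon divided by the count expected if all synonymous codons of that amino acid were used equally; NC1000 is the codon count scaled to 1000 codons. The sketch assumes an in-frame coding sequence and takes the synonymous codon families as an input; only two families (F and L2) are supplied here, and the input sequence is a made-up example.

```python
from collections import Counter


def codon_counts(cds):
    """Count in-frame codons of a coding sequence (trailing partial codon ignored)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


def nc1000(counts):
    """Number of occurrences of each codon per 1000 codons."""
    total = sum(counts.values())
    return {codon: 1000 * n / total for codon, n in counts.items()}


def rscu(counts, synonymous_families):
    """Relative synonymous codon usage for the supplied codon families."""
    values = {}
    for family in synonymous_families.values():
        family_total = sum(counts[codon] for codon in family)
        if family_total == 0:
            continue  # amino acid not used; RSCU undefined
        expected = family_total / len(family)
        for codon in family:
            values[codon] = counts[codon] / expected
    return values


# Assumption: only two codon families are listed, for illustration.
families = {"F": ["TTT", "TTC"], "L2": ["TTA", "TTG"]}
example_cds = "TTTTTTTTCTTATTATTG"  # hypothetical: TTT TTT TTC TTA TTA TTG

counts = codon_counts(example_cds)
usage = rscu(counts, families)
per_thousand = nc1000(counts)
print(usage)
print(per_thousand)
```

An RSCU above 1 marks a codon that is used more often than its synonyms, which is the pattern expected for A- or T-ending codons in an AT-rich genome.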

Metazoan mt genomes usually present a clear strand bias in nucleotide composition [40]. This is probably due to asymmetric patterns of mutations during transcription and replication when one strand remains transiently in a single-stranded state, making it more vulnerable to DNA damage [41]. However, in the case of mtDNA-replication, this hypothesis is not without controversy [42–45]. The strand bias in nucleotide composition can be measured as GC- and AT-skews ((G%-C%)/(G%+C%) and (A%-T%)/(A%+T%), respectively) [46]. The overall GC- and AT-skews of the J-strand of the D. pteronyssinus mt genome are 0.194 and -0.199, respectively. These are the most extreme values encountered within mite mt genomes to date (Table 1) and they are reversed compared to the usual strand biases of metazoan mtDNA (negative GC-skew and positive AT-skew for the J-strand). Moreover, a positive GC-skew for mite mt genomes seems to be rare since, at present, it has only been encountered in Varroa destructor. Although hypothetical, it could be the result of a strand swap of the control region [40]. This region contains all initiation sites for transcription [47] and an inversion of the control region is expected to produce a global reversal of asymmetric mutational constraints in the mtDNA, resulting over time in a complete reversal of strand compositional bias [40]. The asymmetrical directional mutation pressure is also reflected in the codon usage of genes oriented in opposite directions [48]. Whereas NNG and NNU codons are preferred over NNA and NNC codons on the J-strand, genes on the N-strand show the exact opposite trend (see additional file 2 for an across-strand (N and J) comparison of frequencies of codons ending with the same nucleotide).
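The skew formulas given above translate directly into code. The sketch below works on raw nucleotide counts, which gives the same result as using percentages because the total cancels out; the input sequence is a made-up example, not part of the D. pteronyssinus genome.

```python
def skews(sequence):
    """Return (GC-skew, AT-skew) of one strand of a DNA sequence.

    GC-skew = (G - C) / (G + C); AT-skew = (A - T) / (A + T).
    """
    seq = sequence.upper()
    a, c, g, t = (seq.count(base) for base in "ACGT")
    if g + c == 0 or a + t == 0:
        raise ValueError("skew is undefined without G/C or A/T bases")
    return (g - c) / (g + c), (a - t) / (a + t)


# Hypothetical sequence: 5 T, 3 G, 3 A and 1 C.
gc_skew, at_skew = skews("TTTTTGGGAAAC")
print(gc_skew, at_skew)  # positive GC-skew, negative AT-skew
```

A strand richer in G than C and richer in T than A, as in the example, has the same sign pattern as the J-strand values reported above (0.194 and -0.199).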

Protein coding genes

Nine proteins are encoded by genes on the J-strand (cox1, cox2, cox3, atp6, atp8, nad3, nad4, nad4L, nad5), while four are encoded by genes on the N-strand (nad1, nad6, nad2, cytB). The total length (10,826 bp) and AT-content (71.61%) of the protein-coding genes are within the range of values typical for Acari (10,639.0 +/- 272.0 bp; 74.0 +/- 4.0%, respectively) (Table 1). Compared to other mite mt proteins, cox1, cox2 and cytB are best conserved. On the other hand, atp8, nad6 and nad4L showed lowest similarity values (see additional file 3 for the average identity and similarity % of mt proteins of D. pteronyssinus).

Start and stop codons were determined based on alignments with the corresponding genes and proteins of other mite species. In the case of stop codons, we could also benefit from available expressed sequence tags (ESTs) of D. pteronyssinus (n = 1797) and D. farinae (n = 1735) (Fig. 1) [49]. As for other metazoan mt proteins, unorthodox initiation codons are used [20] (see additional file 4 for start and stop codons of protein coding genes of Acari mt genomes). Eight genes (cox2, atp6, cox3, nad3, nad6, nad4L, nad4, cytB) use the standard ATG start codon, 3 genes (cox1, nad1, nad2) start with ATA and nad5 initiates with ATT. atp8 most likely starts with codon TTG.

Eleven genes employ a complete translation termination codon, either TAG (cox1, cox3) or TAA (cox2, atp8, atp6, nad1, nad3, nad6, nad4L, nad5, cytB). With the exception of nad3, atp8 and nad4L, D. pteronyssinus ESTs for all these genes confirmed the position of the stop codon (Fig. 1, see additional file 5 for sequences of Dermatophagoides ESTs covering the mt genome of D. pteronyssinus). Berthier et al. [50] showed that the adjacent genes, nad4L/nad4 and atp8/atp6, were transcribed and translated as a bicistronic mRNA in the model organism Drosophila melanogaster. However, as no ESTs were found that aligned with the nad4L/nad4 and atp8/atp6 gene boundaries, it could not be confirmed whether this was also the case for D. pteronyssinus. Despite its efficiency, the use of sequence alignments to determine the position of stop codons resulted in several cases in overlapping genes. For example, based on a highly conserved tryptophan at the C-terminal end of Acari nad3 proteins, a stop codon was positioned despite the resulting 17 bp overlap with trnR. The two remaining genes (nad2 and nad4) are likely equipped with a truncated stop codon (T). Polyadenylation of the mRNA is needed in these cases to form a fully functional TAA stop codon [51]. Although this remains speculative, ESTs of D. farinae support the truncated stop codon of nad4 (Fig. 1, see additional file 5).
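How polyadenylation completes a truncated stop codon can be made concrete with a small sketch. It assumes that the reading frame starts at the first nucleotide and that the poly-A tail is added directly after the last encoded nucleotide; the sequences are invented examples, not the real nad2 or nad4 genes.

```python
STOP_CODONS = {"TAA", "TAG"}  # complete stop codons mentioned in the text


def terminal_codon_after_polyadenylation(cds, poly_a_length=20):
    """Return the last in-frame codon once a poly-A tail is appended to `cds`."""
    cds = cds.upper()
    remainder = len(cds) % 3
    if remainder == 0:
        return cds[-3:]  # the gene already ends on a complete codon
    missing = 3 - remainder
    if poly_a_length < missing:
        raise ValueError("poly-A tail too short to complete the codon")
    return cds[-remainder:] + "A" * missing


# Hypothetical sequences: ATG GCT followed by T, TA or a complete TAG.
for example in ("ATGGCTT", "ATGGCTTA", "ATGGCTTAG"):
    codon = terminal_codon_after_polyadenylation(example)
    print(example, codon, codon in STOP_CODONS)
```

A gene ending in a single in-frame T therefore gains a complete TAA stop codon only at the mRNA level, which is why transcript data such as ESTs are needed to support the annotation.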

Transfer RNAs

Fourteen tRNAs are encoded on the J-strand and 8 on the N-strand (Fig. 1). Secondary structures were predicted for all tRNAs (Fig. 4). With the exception of trnS1 (UCU instead of GCU in L. pallidum) and trnP (UGG instead of AGG in S. magnus), all anticodon sequences were identical to those of L. pallidum and S. magnus, the only acariform mites for which tRNA secondary structures have been reported [22, 24]. Usually, T is in the first anticodon position for tRNAs that recognise either four-fold degenerate codon families or NNR codons. G is usually in this position only to specifically recognize NNY-codons [52]. Except for trnM, all of the D. pteronyssinus mt tRNAs follow this pattern. trnM has the anticodon CAT (to recognise both ATG and ATA), which is the case for almost all animal mt systems [52] (Fig. 4).

Figure 4

Inferred secondary structures of the 22 mitochondrial tRNAs from D. pteronyssinus. tRNAs are shown in the order of occurrence in the mt genome starting from cox1. Locations of adjacent gene boundaries are indicated with arrows. Green font indicates that the sequence is part of the adjacent gene. Inferred Watson-Crick bonds are illustrated by lines, whereas GU bonds are illustrated by dots.

Only one tRNA lacks the D-arm: trnS1, as is common for most metazoans. With the exception of trnC, trnV and trnS1, all tRNAs have T-arm variable loops (TV replacement loops) instead of the T-arm. Similar structures were found for tRNAs of L. pallidum [24] and S. magnus [22]. The absence of the T-arm is a typical feature for tRNAs of Chelicerata belonging to the orders of the Araneae, Scorpiones and Thelyphonida. However, other taxa within the Chelicerata (Amblypygi, Opiliones, Ricinulei, Solifugae and ticks) possess typical metazoan cloverleaf tRNAs [13]. Masta and Boore [13] suggested a multi-step evolutionary process in an attempt to understand how so many tRNAs in these chelicerate groups could lose their T-arm. According to this speculative theory, changes in mt ribosomes, resulting in the fact that the loss of arms from tRNAs was tolerated [53, 54], and/or changes in specific elongation factors [55–57] are considered as a first step in this process.

Only 7 of the 22 tRNAs have a completely matched 7 bp acceptor stem (trnG, trnW, trnH, trnF, trnE, trnL1 and trnL2). A maximum of 3 mismatches in this stem is found in trnR. In contrast, almost all tRNAs (18) possess a completely matched 5 bp anticodon stem. trnC, trnS1 and trnN have a single mismatch whereas trnY has two mismatches in this stem. All tRNAs, except trnL2, have a symmetric anticodon loop consisting of 2 nucleotides upstream and 2 nucleotides downstream of the 3-nucleotide anticodon. The anticodon loop of trnL2 consists of 2 nucleotides preceding the anticodon and 3 nucleotides immediately following it. This kind of aberrant anticodon loop has also been reported for the two-humped camel Camelus bactrianus ferus (trnS2) [58] and the scorpion Mesobuthus gibbosus (trnH and trnN) [59]. As mentioned before, sequences of some tRNAs overlap with neighbouring genes. The extreme examples are trnR, trnS2 and trnV. trnR overlaps with the adjacent gene nad3 on the same strand for 17 bp at its 3'-end whereas trnS2 overlaps with the adjacent gene trnM on the same strand for 12 bp at its 3'-end. trnV overlaps with the adjacent gene trnP on the opposite strand for 11 bp at its 3'-end and with trnK on the opposite strand for 7 bp at its 5'-start. Despite these overlaps, we consider these genes not likely to be pseudogenes. First of all, their sequence is relatively well conserved when compared to corresponding genes of other Acari. Secondly, besides sequence conservation they depict a conserved secondary structure. Thirdly, an EST [GenBank: CB284825] of the related species D. farinae was found corresponding to the region covering trnR, trnM and trnS2 of D. pteronyssinus, indicating that the genes are expressed (see additional file 6 for an alignment of trnR, trnM and trnS2 of D. pteronyssinus with an EST of D. farinae). Finally, and most importantly, stem mismatches and sequence overlap are not uncommon for mt tRNAs of arachnids [13, 60], and are probably repaired by a post-transcriptional editing process [54, 61].

Non-coding regions

The largest non-coding region (286 bp) is flanked by trnF and trnS1. It is highly enriched in AT (91.61%) and can form stable stem-loop secondary structures. Based on these features, it possibly functions as a control region [20, 62]. With the exception of T. urticae (95.45%), it has the highest AT-content of all Acari mt control regions (Table 1). The position of the non-coding region differs from most insect and arachnid mt genomes, where the region is mostly located in close proximity to 12S-rRNA ([62], see additional file 1).

Based on the sequence pattern, the control region can be subdivided into a repeat region and a stem-loop region. The first region (11,491–11,528 bp) contains several AT-repeats. In order to verify the exact number of repeats we resequenced this region. For this purpose, two flanking primers, Dp-Ms-F and Dp-Ms-R, were synthesised spanning approximately 700 bp. The PCR product was cloned and ten independent clones were sequenced. This revealed that the number of AT-repeats varied from 7 to 28, suggesting that this domain can be considered as a microsatellite [63]. This is remarkable as a mt microsatellite has never been reported before for species belonging to the Chelicerata. In metazoan mtDNA, too, such microsatellites are rare and have, to our knowledge, only been reported for butterflies [64], a dragonfish, Scleropages formosus [65], a bat, Myotis bechsteinii [66], a turtle, Pelomedusa subrufa [67] and several seal species [68–70].
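Counting the AT-repeats in sequenced clones is a simple pattern search. The sketch below reports the longest uninterrupted (AT)n run in a sequence; the clone sequences are invented for illustration and only mimic the reported range of 7 to 28 repeats, with arbitrary flanking bases.

```python
import re


def longest_at_repeat(sequence):
    """Return (start_index, repeat_count) of the longest uninterrupted (AT)n run.

    Returns None if the sequence contains no AT dinucleotide.
    """
    best = None
    for match in re.finditer(r"(?:AT)+", sequence.upper()):
        repeats = len(match.group()) // 2
        if best is None or repeats > best[1]:
            best = (match.start(), repeats)
    return best


# Assumption: hypothetical clone inserts with made-up flanking sequence.
clones = {
    "clone_01": "GGC" + "AT" * 7 + "CCG",
    "clone_02": "GGC" + "AT" * 28 + "CCG",
}
repeat_numbers = {name: longest_at_repeat(seq)[1] for name, seq in clones.items()}
print(repeat_numbers)
```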

The second region (11,529–11,768 bp) holds two short palindromic sequences, TACAT and ATGTA, which are conserved in mt genomes of mammals [71] and fishes [65, 72]. They can form a stable stem-loop structure (Fig. 5-A2), which might be involved as a recognition site for the arrest of J-strand synthesis [71]. Near this region other stem-loop structures could be folded (Fig. 5-A) but none of them had flanking sequences similar to those that are conserved in the control region of the mt genome of insects [62] and metastriate ticks [37].
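That TACAT and ATGTA can pair to form a stem follows from the fact that they are reverse complements of each other. The sketch below checks this and searches a sequence for such motif pairs separated by a loop. It considers sequence complementarity only and makes no statement about thermodynamic stability; the minimum loop length of 3 nucleotides and the test sequence are assumptions for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence."""
    return sequence.upper().translate(COMPLEMENT)[::-1]


def find_all(sequence, motif):
    """Return all start positions (0-based) of `motif`, including overlapping ones."""
    positions, start = [], sequence.find(motif)
    while start != -1:
        positions.append(start)
        start = sequence.find(motif, start + 1)
    return positions


def stem_candidates(sequence, motif, min_loop=3):
    """Return (i, j) pairs where `motif` at i can pair with its reverse complement at j."""
    seq, motif = sequence.upper(), motif.upper()
    partner = reverse_complement(motif)
    return [
        (i, j)
        for i in find_all(seq, motif)
        for j in find_all(seq, partner)
        if j - (i + len(motif)) >= min_loop
    ]


print(reverse_complement("TACAT"))
# Hypothetical sequence: the two motifs separated by a 4-nucleotide loop.
print(stem_candidates("GGTACATTTTTATGTACC", "TACAT"))
```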

Figure 5

Secondary structures of non-coding regions of the mt genome of D. pteronyssinus. Secondary structure of non-coding regions between (A) trnF and trnS1 (large non-coding region); (B) trnS2 and trnA; (C) trnA and trnP; (D) nad1 and nad6. All structures were constructed using Mfold [103]. Inferred Watson-Crick bonds are illustrated by lines, whereas GU bonds are illustrated by dots.

As described before, three other stretches of non-coding nucleotides were found outside the control region. These short sequences can fold into stable stem-loop structures (Fig. 5-B, C, D), which may function as splicing recognition sites during processing of the transcripts [73].

Ribosomal RNAs

12S-rRNA and 16S-rRNA are located on the J-strand. This does not coincide with their position in most Chelicerata, where they are located on the N-strand (see additional file 1). The AT-contents of both genes are comparable (72.9% and 76.1% for the 12S- and 16S-rRNA, respectively) and are within the range of rRNAs of other Acari (76.5 +/- 4.2%; 78.0 +/- 4.1%, respectively). The sizes of the rRNAs (665 bp and 1078 bp) are slightly larger than those of other acariform mite rRNAs (626.0 +/- 29.9 bp and 1018.5 +/- 21.3 bp) but are shorter than those found in the Parasitiformes (706.0 +/- 17.5 bp and 1207.3 +/- 31.4 bp) (Table 1).

The 12S-rRNA and 16S-rRNA genes of Leptotrombidium species (Acariformes: Trombidiformes: Trombiculidae) are 23.4% and 23.5% shorter than their counterparts in Drosophila yakuba. This substantial reduction is mainly caused by the loss of stem-loop structures at the 5'-end of the rRNA genes [74]. To identify whether similar domains are absent in the rRNAs of D. pteronyssinus, we constructed their secondary structures (Fig. 6 [75, 76]). This revealed that, compared to D. yakuba, the D. pteronyssinus 12S-rRNA indeed lacks similar stem-loops to those missing in L. pallidum. The structure also revealed 1 additional stem-loop (stem-loop 1) not present in the 12S-rRNA of L. pallidum. As in L. pallidum, one stem-loop replaces three stem-loops (24, 25 and 26) whereas another replaces a region of four stem-loops (39, 40, 41 and 42) of the D. yakuba 12S-rRNA [74]. Based on the modelled structure in combination with an alignment of other acariform 12S-rRNAs, the greatest sequence conservation was found in the loop region of stem-loops 21 and 27 and the region between stem-loops 48 and 50.

Figure 6

16S-rRNA and 12S-rRNA secondary structures of the mitochondrial genome of D. pteronyssinus. The numbering of the stem-loops is after de Rijk et al. [75] for 16S-rRNA and after van de Peer et al. [76] for 12S-rRNA. Blue coloured nucleotides show 100% identity when aligned to 12S-rRNA and 16S-rRNA genes from other Acariformes (as listed in Table 1). Inferred Watson-Crick bonds are illustrated by lines, whereas GU bonds are illustrated by dots.

In analogy to the 16S-rRNA gene of L. pallidum, the main deletions of the D. pteronyssinus 16S-rRNA are located at the 5'-end. With the exception of D19, all stem-loops of L. pallidum are present in D. pteronyssinus. We also discovered three additional stem-loops (C1, E2 and E19) which are absent in the 16S-rRNA of L. pallidum. The 3'-end of the 16S-rRNA structure is best conserved compared to other acariform 16S-rRNAs. This is in agreement with the idea that this region is the main component of the peptidyl-transferase centre, and as such least tolerant of mutations [73]. Recently, the 12S-rRNA and 16S-rRNA secondary structures of S. magnus have been published [22]. The 12S-rRNA structure of S. magnus has 5 extra stem-loops (2, 4, 5, 40 and 42) compared to that of D. pteronyssinus, whereas the 16S-rRNA lacks 6 stem-loops (D4, D16, E1, E2, E19 and G9) and has 5 stem-loops (cd1, D1, D17, D19, G13) not present in the 16S-rRNA of D. pteronyssinus.

It is still an open question how relatively well-conserved structures such as rRNAs can dramatically decrease in size while remaining functional. Wolstenholme et al. [53] and Masta [54] suggested a correlation between the occurrence of truncated rRNAs (compared to Drosophila) and the loss of the T-arm in tRNAs. The coincidence of short rRNAs and missing T-arms in tRNAs was also observed in S. magnus, L. pallidum and D. pteronyssinus. Other acariform mites like T. urticae, Ascoschoengastia sp. and Walchia hayashii also exhibit short rRNAs (Table 1) and the prediction of their tRNA secondary structures could further support this hypothesis. However, examples contradicting this hypothesis also occur: pulmonate gastropods, for instance, have tRNAs lacking T-arms but no truncated rRNAs. Therefore, it remains possible that truncation of both tRNA and rRNA genes only reflects an independent trend towards minimisation of the mt genome, as suggested by Yamazaki et al. [77].

Phylogenetic analysis

A phylogenetic tree was constructed based on nucleotide and amino acid sequences from all mt protein coding genes of Acari. The ILD-test [78] indicated a significant incongruence (P = 0.01) among data set partitions for nucleotide alignments and low congruence (P = 0.07) among data set partitions for amino acid alignments. A considerable debate exists on the utility of this test [79–84]. However, the principle of Kluge [85] implies that all data should always be included in a combined analysis for any phylogenetic problem and therefore we combined data partitions for both amino acid and nucleotide alignments for phylogenetic analysis. A maximum parsimony (MP) analysis based on nucleotide alignments (data not shown) grouped V. destructor (Parasitiformes) within the Acariformes, close to D. pteronyssinus. This is in contrast with the generally accepted view on the phylogeny of the Acariformes and Parasitiformes [2, 3]. As mentioned before, V. destructor and D. pteronyssinus both have a reversal of asymmetrical mutation pattern. When such reversals occurred independently, D. pteronyssinus and V. destructor could have acquired a similar base composition and as a consequence group together due to the long-branch attraction (LBA) phenomenon [40, 86]. Model-based methods such as maximum likelihood (ML) and Bayesian inference (BI) are less sensitive to LBA [40, 87] and were for this reason considered for phylogenetic analysis.

ML and BI analysis performed on the amino acid data set resolved trees with an identical topology (Fig. 7-A) in which D. pteronyssinus clusters with S. magnus, forming a sister group of the Trombidiformes. This is in agreement with the most recent views on the classification of the Acariformes [24]. The nucleotide data set resulted in similar trees, confirming the evolutionary position of D. pteronyssinus (Fig. 7-B). The only major inconsistency between the trees was the position of T. urticae. Although this species is generally considered a member of the Trombidiformes [24], it clustered with the sarcoptiform mites D. pteronyssinus and S. magnus in the trees based on the nucleotide dataset (Fig. 7-B). However, the position in the different trees is questionable as it is supported by low bootstrap values/Bayesian posterior probabilities (Fig. 7-A/B). Adding mt genome data from closely related taxa of T. urticae and from taxa located between T. urticae and Trombiculidae would probably position T. urticae with higher support values within the Trombidiformes.

Figure 7

Phylogenetic trees of Acari relationships. Trees were inferred from amino acid (A) and nucleotide (B) datasets. All protein coding gene sequences were aligned and concatenated; ambiguously aligned regions were omitted by Gblocks 0.91b [105]. Trees were rooted with two outgroup taxa (L. polyphemus and L. migratoria). Numbers behind the branching points are percentages from Bayesian posterior probabilities (left) and ML bootstrapping (right). Accession numbers for the different Acari mt genomes are listed in Table 1.

In the trees based on the nucleotide dataset, H. flava is more closely related to R. sanguineus than A. triguttatum is, while in the trees based on the amino acid dataset the opposite holds. However, as the clustering of H. flava and R. sanguineus is in agreement with the most recent views on the classification of the Ixodida [88, 89], we consider the nucleotide topology the more likely one. Murrell et al. [90] consider the Parasitiformes to be paraphyletic with respect to the Opilioacariformes, but as there are no complete mt genomes of Opilioacariformes available, we were not able to verify this hypothesis.

Conclusion

This is the first description of a complete mt genome of a species belonging to the Astigmata, a cohort within the Sarcoptiformes. Although the length, gene content and AT-content are similar to those of other Acari mtDNA, the mt genome of D. pteronyssinus exhibits some interesting features. The gene order of D. pteronyssinus differs considerably from that of other Acari mt genomes. Gene order comparison indicated that mt gene orders seem less useful for deduction of phylogenetic relationships between superorders within the Acari. GC- and AT-skews of the J-strand were very large and reversed compared to those found in most metazoan mtDNA.

Compared to parasitiform mites, both D. pteronyssinus rRNAs were considerably shorter and almost all transfer RNAs lacked the T-arm. It would be interesting to investigate whether the occurrence of truncated rRNAs and the loss of the T-arm in tRNAs are correlated or just a trend toward minimisation of the mt genome. Finally, phylogenetic analysis using concatenated mt gene sequences succeeded in recovering Acari relationships concordant with traditional views of phylogeny of Acari.

Methods

Mite identification

Upon arrival in the laboratory, mites were identified as D. pteronyssinus by J. Witters (ILVO, Belgium) and F. Th. M. Spieksma (Laboratory of Aerobiology, LUMC, The Netherlands) using morphological characteristics. To back up this identification, molecular techniques were applied. For this purpose DNA was extracted and used as a template for PCR. Primers 12SID-F and 12SID-R (see additional file 7 for primer sequences) successfully amplified a 316 bp fragment. BLASTn searches against non-redundant nucleotide sequences using the amplified fragment as query resulted in a perfect match with a mt 12S-rRNA sequence of D. pteronyssinus [GenBank: AF529911].

Mite strain, mass rearing and isolation

The initial D. pteronyssinus culture was provided by D. Bylemans (Janssen Pharmaceutica, Belgium). Mites were cultured on a 1:1 mixture of Premium Gold (Vitacraft, Germany) and beard shavings at 75% R.H., 25°C and permanent dark conditions [91, 92]. Mites were isolated from the colony using a modified heat-escape technique [93, 94]. Briefly, mite cultures were transferred to small plastic petri dishes (75 mm in diameter, 28 mm high) with a lid on top. These dishes were placed in the dark on a hot plate set at 45°C (Bekso, Belgium). After 15–20 minutes the mites moved away from the heat source, formed groups on the lid of the petri dish and could be collected using a fine hair brush.

DNA extraction

Approximately 1000 D. pteronyssinus mites were collected in an Eppendorf tube and were ground in 800 μl SDS-lysis buffer (400 mM NaCl, 200 mM TRIS, 10 mM EDTA, 2% SDS) using a small sterile plastic pestle (Eppendorf, Germany). After incubation for 30 min at 60°C under continuous rotation, a standard phenol-chloroform extraction was performed [95]. Total genomic DNA was precipitated with 0.7 volumes of isopropanol at 4°C for 1 hour, centrifuged for 45 minutes at 21,000 × g and washed with 70% ethanol. Precipitated DNA was dissolved in 50 μl 0.1 M Tris pH 8.2.

PCR

Standard PCR (amplicon < 500 bp) was performed in 50 μl volumes (38.5 μl double-distilled water; 5 μl buffer; 2 mM MgCl2; 0.2 mM dNTP-mix; 0.2 μM of each primer; 1 μl template DNA and 0.5 μl Taq polymerase (Invitrogen, Belgium)). PCR conditions were as follows: 2' 94°C, 35 × (20" 92°C, 30" 53°C, 1' 72°C) and 2' 72°C. The annealing time was extended to 1 minute and the primer concentration was increased to 2 μM when degenerate primers were used. Long PCR (amplicon > 500 bp) was performed with the Expand Long Range Kit (Roche, Switzerland) in 50 μl volumes (28.5 μl double-distilled water; 10 μl buffer; 0.5 mM dNTP-mix; 0.3 μM of each primer; 4 μl 100% DMSO; 1 μl template DNA and 1 μl enzyme-mix). PCR conditions were: 2' 94°C, 10 × (10" 92°C, 20" at a temperature that varies depending on the primers, 1'/kb 58°C), 25 × (10" 92°C, 20" at a temperature that varies depending on the primers, 1'/kb 58°C with 20" added for every consecutive cycle) and 7' 58°C. All PCR products were separated by electrophoresis on a 1% agarose gel and visualised by EtBr staining. Fragments (amplicon < 1000 bp) of interest were excised from the gel, purified with the QIAquick PCR Purification Kit (Qiagen, Belgium) and cloned into the pGEM-T vector (Promega, Belgium). After heat-shock transformation of E. coli (DH5α) cells, plasmid DNA was obtained by miniprep and inserted fragments were sequenced with SP6 and T7-primers. Long PCR products were sequenced by primer-walking. All sequencing reactions were performed by AGOWA sequencing service.
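As a worked illustration of the long PCR programme, the total extension time can be tallied for a given amplicon length. The sketch below is not part of the original protocol. The function name is hypothetical, and it assumes that the 20" increment is cumulative and already applies to the first cycle of the second block of 25 cycles.

```python
def long_pcr_extension_seconds(amplicon_bp, first_block=10, second_block=25,
                               increment_s=20):
    """Total extension time (in seconds) over both cycling blocks."""
    # Base extension time: 1 minute per kb of amplicon.
    base = 60 * amplicon_bp / 1000
    # First block: constant extension time in every cycle.
    block1 = first_block * base
    # Second block (assumption): cycle i (i = 1..second_block) extends for
    # base + i * increment_s seconds.
    block2 = sum(base + increment_s * i for i in range(1, second_block + 1))
    return block1 + block2


if __name__ == "__main__":
    total = long_pcr_extension_seconds(2200)
    print(f"2.2 kb amplicon: {total:.0f} s of extension in total")
```

Under these assumptions, a 2.2 kb amplicon spends 11,120 s (roughly 185 minutes) in the extension step, most of it contributed by the per-cycle increments of the second block.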

Amplification of the mt genome

Primers COXI-F and 12S-R, based on partial D. pteronyssinus cox1 and 12S-rRNA sequences [GenBank: AY525570 and AF529911, respectively] (see additional file 7 for primer sequences), successfully amplified a 4.6 kb sequence of the mt genome of D. pteronyssinus. Degenerate primers CYTB-F-Deg and CYTB-R-Deg (see additional file 7 for primer sequences), designed on conserved regions of Acari cytB, amplified a partial cytB sequence from D. pteronyssinus. A specific primer COXI-R, designed from the 3'-end of the 4.6 kb sequence, in combination with the primer CYTB-F, designed from the partial cytB sequence, successfully amplified a 2.2 kb sequence. Another primer CYTB-R, designed from the 5'-end of this 2.2 kb sequence, in combination with the primer 12S-F successfully amplified an 8.6 kb sequence, making the mt genome sequence complete.

Annotation and bioinformatics analysis

The complete genomic sequence was assembled and annotated using VectorNTI (Invitrogen, Belgium) according to Masta and Boore [60]. Open reading frames (ORFs) were identified with the program Getorf from the EMBOSS-package [96]. The obtained ORFs were used as query in BLASTp [97] searches against the non-redundant protein database at NCBI. Two large non-protein-coding regions were candidates for the rRNAs (16S and 12S, respectively). The boundaries were identified based on alignments and secondary structures of rRNA genes of other mite species. Sixteen of the 22 tRNAs were identified by tRNAscan-SE [98] with a COVE cutoff score of 0.1 and the tRNA-model set to "nematode mito". The remaining six tRNAs (trnM, trnV, trnY, trnS1, trnI, trnC) were determined in the unannotated regions by sequence similarity to tRNAs of other mite species. In order to obtain additional information on mt gene boundaries, BLASTn [97] searches of D. pteronyssinus tRNA, rRNA and protein encoding nucleotide sequences were carried out against ESTs [49] restricted to Dermatophagoides sequences (n = 3532). ESTs with statistically significant matches (E-value cutoff: 0.1) were collected, checked for vector contamination and aligned by Clustal W [99] as implemented in BioEdit 7.0.1 [100] against the appropriate nucleotide sequence of D. pteronyssinus. MatGAT 2.02 was used to calculate similarity and identity values [101] of mt proteins. The identification of gene subsets that appear consecutively in different genomes was performed by common interval distance analysis using CREx [102] (see additional file 8 for input data of the CREx program).

Construction of secondary structures of RNAs and non-coding regions

Secondary structures of tRNAs were determined following the method of Masta and Boore [60]. Secondary structures of tRNAs were drawn with CorelDraw 12.0 (Corel Corporation, Canada). The rRNA genes of D. pteronyssinus were aligned with those of other Acariformes and conserved areas were identified. These regions were mapped on the published structures of L. pallidum rRNA [74]. Regions lacking significant homology were folded using Mfold [103]. Secondary structures of rRNAs were drawn using the RnaViz2 program [104] and afterwards modified with CorelDraw 12.0 (Corel Corporation, Canada). Secondary structures of non-coding regions were folded using Mfold [103]. When multiple secondary structures were possible, the most stable (lowest free energy (-ΔG)) one was preferred. Drawing and editing of these structures was done in a similar way as for rRNA secondary structures.

Rolling circle amplification and restriction enzyme digestion

Extraction and rolling circle amplification of the mtDNA of D. pteronyssinus was done according to Van Leeuwen et al. [18]. Rolling circle amplified mtDNA was digested with two enzymes (Xmn I and Eco RI; New England Biolabs) following the manufacturer's instructions. Restriction digests were fractionated by agarose gel electrophoresis as described before.
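The fragment pattern expected from such a double digest can be predicted in silico from a circular sequence. The following is a minimal sketch with hypothetical names, applied to a toy sequence rather than to the D. pteronyssinus genome. It assumes the standard recognition sites G^AATTC for Eco RI and GAANN^NNTTC for Xmn I; because both sites are palindromic, only one strand is scanned.

```python
import re

# Recognition pattern, site length and cut offset on the scanned strand.
ENZYMES = {
    "EcoRI": ("GAATTC", 6, 1),           # G^AATTC
    "XmnI": ("GAA[ACGT]{4}TTC", 10, 5),  # GAANN^NNTTC
}


def circular_fragments(seq, enzymes=("EcoRI", "XmnI")):
    """Fragment sizes (largest first) after digesting a circular molecule."""
    seq = seq.upper()
    n = len(seq)
    cuts = set()
    for name in enzymes:
        pattern, site_len, offset = ENZYMES[name]
        # Append the start of the sequence so that sites spanning the
        # arbitrary origin of the circle are also found.
        extended = seq + seq[:site_len - 1]
        # The lookahead allows overlapping sites to be reported.
        for match in re.finditer("(?=" + pattern + ")", extended):
            cuts.add((match.start() + offset) % n)
    ordered = sorted(cuts)
    if len(ordered) < 2:
        # Zero cuts leave the circle intact, one cut linearises it;
        # either way a single molecule of full length remains.
        return [n]
    sizes = [b - a for a, b in zip(ordered, ordered[1:])]
    sizes.append(n - ordered[-1] + ordered[0])  # fragment across the origin
    return sorted(sizes, reverse=True)


if __name__ == "__main__":
    toy = "GAAACGTTTC" + "CC" + "GAATTC" + "CC"
    print(circular_fragments(toy))
```

Comparing predicted sizes of this kind with the bands observed on the gel offers an independent check on the assembled sequence length.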

Phylogenetic analysis

Sequence data were obtained from 21 Acari species (for GenBank accession numbers see Table 1) and two outgroup taxa (Limulus polyphemus [GenBank: NC_003057] and Locusta migratoria [GenBank: NC_001712]). Only mite species with a completely sequenced mt genome were selected. Alignments from all mt protein-coding genes were used in phylogenetic analysis. Amino acid sequences and nucleotide sequences were aligned by Clustal W [99] as implemented in BioEdit 7.0.1 [100]. The nucleotide alignment was generated based on the protein alignment using codon alignment. Ambiguously aligned parts were omitted from the analysis by making use of Gblocks 0.91b [105], with default block parameters except for changing "allowed gap positions" to "with half". Abascal et al. [106] recently presented evidence that some insects and ticks use a modified mitochondrial code, with AGG coding for lysine rather than serine as in the standard invertebrate mitochondrial code. As 10 out of 20 Acari species in our dataset are ticks, all positions aligning to AGG codons in the final amino acid alignment were removed.
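The two alignment manipulations described above, deriving a codon alignment from a protein alignment and discarding columns that contain AGG codons, can be sketched as follows. The function names and toy sequences are hypothetical, and the sketch only illustrates the principle; it does not reproduce the tools that were actually used.

```python
def thread_codons(aligned_protein, cds):
    """Place the codons of an ungapped CDS under a gapped protein row."""
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    residues = sum(1 for aa in aligned_protein if aa != "-")
    if len(codons) < residues:
        raise ValueError("coding sequence is shorter than the protein")
    supply = iter(codons)
    # Each gap becomes a triple gap, each residue consumes the next codon.
    return "".join("---" if aa == "-" else next(supply)
                   for aa in aligned_protein)


def columns_with_codon(codon_rows, codon="AGG"):
    """Indices of alignment columns in which any row carries the codon."""
    n_cols = len(codon_rows[0]) // 3
    return {c for row in codon_rows for c in range(n_cols)
            if row[3 * c:3 * c + 3].upper() == codon}


def drop_columns(rows, columns, width=1):
    """Remove columns (width 1 for protein rows, 3 for codon rows)."""
    return ["".join(row[i * width:(i + 1) * width]
                    for i in range(len(row) // width) if i not in columns)
            for row in rows]


if __name__ == "__main__":
    proteins = ["MKL", "MSL"]                # toy aligned proteins
    cds = ["ATGAAACTT", "ATGAGGCTT"]         # matching toy coding sequences
    codon_rows = [thread_codons(p, c) for p, c in zip(proteins, cds)]
    agg = columns_with_codon(codon_rows)
    print(drop_columns(proteins, agg), drop_columns(codon_rows, agg, width=3))
```

Removing the whole column rather than recoding AGG avoids having to decide, for each taxon, which amino acid the codon specifies.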

For the nucleotide alignments the "codons" option was used in Gblocks 0.91b [105]. Due to the results of a saturation analysis [107] on single codon positions, implemented in DAMBE 4.2.13 [108], third codon positions were eliminated from the nucleotide alignment. An incongruence length difference test (ILD-test) [78] as implemented in PAUP* (version 4.0b10; [109]) was used to assess congruence among gene partitions.
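Eliminating third codon positions from an in-frame nucleotide alignment is a simple slicing operation. A minimal sketch with a hypothetical function name, assuming every row starts at a first codon position and has a length that is a multiple of three:

```python
def drop_third_positions(codon_row):
    """Keep first and second codon positions of an in-frame aligned row."""
    if len(codon_row) % 3 != 0:
        raise ValueError("row length must be a multiple of three")
    return "".join(codon_row[i:i + 2] for i in range(0, len(codon_row), 3))


if __name__ == "__main__":
    print(drop_third_positions("ATGAAACTT"))  # ATG AAA CTT -> AT AA CT
```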

Model selection was done with ProtTest 1.4 [110] for amino acid sequences and with Modeltest 3.7 [111] for nucleotide sequences. According to the Akaike information criterion, the mtART+G+I+F model was optimal for phylogenetic analysis with amino acid alignments and the GTR+I+G model was optimal for analysis with nucleotide alignments.
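For reference, the Akaike information criterion is computed as AIC = 2k - 2 ln L, where k is the number of free parameters and ln L the maximised log-likelihood; the model with the lowest AIC is preferred. The sketch below uses hypothetical model labels and invented log-likelihoods purely for illustration. They are not results of this study, and ProtTest and Modeltest perform considerably more than this comparison.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def best_model(candidates):
    """Name of the candidate with the lowest AIC.

    `candidates` maps a model name to (log_likelihood, n_params).
    """
    return min(candidates, key=lambda name: aic(*candidates[name]))


if __name__ == "__main__":
    # Invented values for two hypothetical models, for illustration only.
    toy = {"model_A": (-100.0, 3), "model_B": (-99.0, 5)}
    print(best_model(toy), {name: aic(*v) for name, v in toy.items()})
```

In the toy example the richer model gains one log-likelihood unit at the cost of two extra parameters, so its AIC is higher (208 versus 206) and the simpler model is retained.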

Two different analyses were performed: (1) maximum likelihood (ML) analysis was performed using Treefinder [112], with bootstrapping over 1000 pseudoreplicates; (2) Bayesian inference (BI) was done with MrBayes 3.1.2 [113]. As the mtART model is not implemented in the current version of MrBayes, the mtREV+G+I model was used for phylogenetic analysis with the amino acid alignment. Four chains were run for 1,000,000 generations, with trees sampled every 100 generations. The burn-in was determined as the point at which the average standard deviation of split frequencies had declined to < 0.01. The remaining trees were used to calculate Bayesian posterior probabilities (BPP).
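Sampling every 100 generations over 1,000,000 generations yields on the order of 10,000 sampled trees per run. The posterior probability of a clade is then the fraction of post-burn-in trees in which that clade occurs. The following is a minimal sketch with hypothetical names and toy data, in which each sampled tree is represented simply as a set of clades (each clade a frozenset of taxon labels); it does not reproduce the MrBayes implementation.

```python
def clade_support(sampled_trees, clade, burnin):
    """Fraction of post-burn-in trees that contain the given clade."""
    kept = sampled_trees[burnin:]
    if not kept:
        raise ValueError("no trees left after discarding the burn-in")
    target = frozenset(clade)
    return sum(target in tree for tree in kept) / len(kept)


if __name__ == "__main__":
    ab = frozenset({"taxon_a", "taxon_b"})
    ac = frozenset({"taxon_a", "taxon_c"})
    # Five toy samples; the first one is discarded as burn-in.
    samples = [{ab}, {ab}, {ac}, {ab}, {ab}]
    support = clade_support(samples, {"taxon_a", "taxon_b"}, burnin=1)
    print(f"BPP = {100 * support:.0f}%")
```

Multiplying the fraction by 100 gives the percentage form in which BPP values are shown on the trees.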