Background

Beauveria Vuill. is a globally distributed genus of soil-borne entomopathogenic hyphomycetes that is preferred as a model system for the study of entomopathogenesis and the biological control of pest insects [1]. The most abundant species of the genus is Beauveria bassiana, found in a wide host range of nearly 750 insect species, with extended studies on host-pathogen interactions at the molecular level and all the prerequisite knowledge for its commercial production [2]. B. brongniartii, the second most common species of the genus, has narrow host specificity and is well-studied as the pathogen of the European cockchafer (Melolontha melolontha), a pest in permanent grasslands and orchards [3]. Strains of both fungal species have been exploited as biological control agents (BCAs) [4, 5].

As is usually the case for most mitosporic fungi, morphological characters are inadequate for delimiting species within a genus and this creates a continuing demand of screening for additional taxonomic characters. Consequently, through the years, several efforts have been made to genetically characterize or differentiate Beauveria species and strains, using various tools, including isozyme markers [6], karyotyping [7], vegetative compatibility groups [8], RAPD markers [9, 10], rRNA gene sequencing and intron analyses [11, 12], RFLPs and AFLPs [1315], subtilisin protease genes [16], microsatellites [17, 18] and combinations of rRNA gene complex and other nuclear genes [1, 19, 20]. These approaches provided valuable information on polymorphisms in populations of B. bassiana, with ITS sequences combined with other nuclear gene sequences being more reliable in taxonomic and phylogenetic studies [1, 20, 21]. Consequently, earlier assumptions that Beauveria is strictly asexual have been severely hampered by the recent discoveries of Cordyceps teleomorphs associated with Beauveria [1, 22, 23]. Thus, the extent to which the entire Beauveria genus is correlated with sexual Cordyceps remains to be examined and proved [1].

Mitochondrial DNA (mtDNA), due to its properties to evolve faster than the nuclear DNA, to contain introns and mobile elements and to exhibit extensive polymorphisms, has been increasingly used to examine genetic diversity within fungal populations [2426]. In other mitosporic entomopathogenic fungi, such as Metarhizium [27], Lecanicillium [28] and Nomurea [29], mtDNA data compared favourably to data based on ITS combined with a single nuclear gene, for applications in phylogeny, taxonomy and species or strain -identification. In Beauveria, the use of mtDNA RFLPs or partial mtDNA sequences suggested that mtDNA can be equally useful for such studies [2, 30].

In recent years, molecular techniques have revolutionized taxonomical studies and have provided strong evidence that some morphologically defined species consist of a number of cryptic species that are independent lineages with restricted distributions, for example, Metarhizium anisopliae [31], Neurospora crassa [32], and Pleurotus cystidiosus [33]. This has urged mycologists to extend their studies on large samples of individuals throughout the world, in order to establish robust phylogenies from the congruence of genealogies based on appropriately polymorphic gene sequences and to test hypotheses regarding the processes responsible for distribution patterns. Thus, the notion of phylogenetic species recognition and phylogeography was introduced as a powerful method for answering questions about distribution in an evolutionary context [3436]. Phylogeography or phylogenetic biogeography emerge as the field that aims to understand the processes shaping geographic distributions of lineages using genealogies of populations and genes [37]. It is therefore, particularly important for genera like Beauveria for which only a few studies exist on strain variability and their geographic distribution and phylogenetic origins [6, 13, 16, 17, 20].

This work was undertaken to serve a dual purpose. Firstly, to further assess the usefulness of mtDNA sequences as species diagnostic tool, alone or in combination with the more commonly studied rRNA gene sequences (ITS), and secondly to infer relationships among a large population of Beauveria species and strains from different geographic origins, habitats and insect hosts. To achieve these targets we have analyzed the complete mt genomes of B. bassiana and B. brongniartii, selected the two most variable intergenic regions and constructed the phylogenetic relationships of a number of isolates for determining their biogeographic correlation.

Results

Gene content and genome organization

The mt genomes of the two Beauveria species had similar sizes, i.e., B. brongniartii IMBST 95031 33,926 bp and B. bassiana Bb147 32,263 bp, and both mapped circularly (Fig. 1). They contained all the expected genes found in typical mt genomes of ascomycetes (see Fig. 1; and Additional File 1, Table S1). Both genomes were compact and preserved the four synteny units proposed for Sordariomycetes, i.e., rns-trn(1-5)-cox 3-trn(1-5)-nad 6-trn(2-9); nad 1-nad 4-atp 8-atp 6; rnl-trn(11-12)-nad 2-nad 3 and nad 4L-nad 5-cob-cox 1 [38]. Important deduced differences in the gene content of the two genomes were found only when the intron number and insertion sites were included. This was also the case for mtDNA genome sequence of another B. bassiana isolate (Bb13) from China, recently deposited in GenBank (EU371503; 29.96 kb). When compared with our Bb147 mtDNA genome sequence, the two genomes were identical in gene order and nucleotide sequence (98-100%), for most of their sequence (approx. 28.1 kb). The difference in size -approx. 2.3 kb- was due to the absence from Bb13 of two introns, located in cox 1 and nad 1 genes of Bb147. Minor sequence differences were mostly in the intergenic regions with a preference to AT-rich areas, and were to a large extent SNP transitions (A/G and C/T) or single nucleotide insertions or deletions. The remaining differences were due to small insertions or deletions of 5-6 bp. The largest deletion (15bp) and the lowest sequence homology (86%) were observed in the intergenic region cox 1- trn R2 (see Fig. 1).

Figure 1
figure 1

Genetic organization of (a) B. bassiana strain Bb147 and (b) B. brongniartii strain IMBST 95031 mtDNA. Protein-coding genes are marked with black arrows, and all other genes with gray arrows. Introns are shown with white arrows. Arrows indicate transcription orientation.

Introns

B. bassiana Bb147 contained five and B. brongniartii six introns, contributing to their total mtDNA genome size by 20.3% and 24.7%, respectively. All introns were group-I members, located in rnl, cob, cox 1, cox 2 and nad 1 (Fig. 1; for details on exact positions of insertion and type of intron sub-group see Additional File 1, Table S1). All introns contained ORFs, i.e., the Rps3 homolog within the rnl gene (Bbrnl I and Bbrrnl I2), putative GIY-YIG homing endonucleases (Bbcob I1, cox 2I1 and nad 1I1) and the LAGLI-DADG endonuclease (Bbcox 1I1 and Bbrcox 1I1). The insertion positions of these introns were found to be conserved (identical sequences for at least 10 bp upstream and downstream of the insertion) for all known fungal complete mt genomes examined (36 in total). The only exception was the cox 2 intron which was rarely encountered in other fungi. Interestingly, the additional intron detected in rnl of the B. brongniartii IMBST 95031 mt genome (positions 806-2102 of NC_011194 and Additional File 1, Table S1), was inserted at site not encountered before among the other complete mt genomes, i.e., the stem formed in domain II of rnl 's secondary structure. The target insertion sequence for the intron was GATAAGGTTG↓TGTATGTCAA and its intronic ORF encoded for a GIY-YIG endonuclease which shared homology (57% identity at the amino acid level) with I-PcI endonuclease of Podospora curvicolla (Acc. No. CAB 72450.1).

Intergenic regions

Both mt genomes contained 39 intergenic regions amounting for 5,985 bp in B. bassiana and 5,723 bp in B. brongniartii, and corresponding to 18.6% and 16.9% of their total mt genome, respectively. The A+T content was very similar for these regions in both mt genomes (~74.5%) and the largest intergenic region was located between cox 1-trn R2 with sizes 1,314 bp for B. bassiana and 1,274 bp for B. brongniartii, respectively. Analysis of these particular regions revealed large unique putative ORFs (orf 387 and orf 368 for both genomes) with no significant similarity to any other ORFs in Genbank. Additionally, many direct repeats were also located in the same regions (maximum length 37 bp and 53 bp for B. bassiana and B. brongniartii, respectively). All other intergenic regions in the two mt genomes had approximately the same sizes but with reduced nucleotide identity (sometimes as low as 78%). Therefore, the potential usefulness of mt intergenic sequence variation for intra- and inter- species discrimination and phylogenetic studies of Beauveria was examined following an in silico analysis based on criteria of size, complexity and suitability (for designing primers) of all Beauveria mt intergenic regions. More specifically, smaller than 200 bp interenic regions were excluded due to the few informative characters they contained, whereas ideal regions were considered those with sizes between 200-800 bp because they can be easily cloned and/or obtained by PCR. Regions containing trn genes -due to their cloverleaf structures- and regions with dispersed repetitive elements were avoided because their structures make them unsuitable for designing primers for PCR amplification (for details of all intergenic regions see Additional File 1, Table S1). Thus, the most suitable intergenic regions following the above criteria for the population analyses were nad 3-atp 9 and atp 6-rns.

Population and phylogenetic studies based on ITS1-5.8S-ITS2 and intergenic mt region sequences

PCR amplicons for the ITS1-5.8S-ITS2 region showed little variation in size, being almost identical for all B. bassiana (480-482 bp) and B. brongniartii (478-481 bp) isolates, but with sizeable differences for the other Beauveria species (471-512 bp). On the contrary, the intergenic nad 3-atp 9 and atp 6-rns amplicons exhibited a much greater variability in sizes even within B. bassiana isolates, ranging from 259-332 bp for the former and 283-483 bp for the latter (Additional File 2, Table S2 and Additional File 3, Table S3), thus providing excellent tools for species or species-group identification. For example, using high-resolution agarose electrophoresis (data not shown), nad 3-atp 9 B. bassiana amplicons can be easily differentiated from the other Beauveria species and at the same time can be grouped into Clades A and C according to their sizes and in congruence to the classification proposed earlier [1] (Additional File 3, Table S3). Variability for the other Beauveria species was even greater, ranging from 84-302 bp and 249-441 bp for the nad 3-atp 9 and atp 6-rns, respectively. When analyzed, these differences were found to be mainly due to deletions and/or additions of 3-5 nucleotides for nad 3-atp 9, scattered throughout this region, and rarely due to single point mutations. The atp 6-rns sequence differences were primarily due to a 4-bp repeat (GCTT) inserted in the corresponding sequence up to 13 times (e.g., R184-483bp), thus providing in many cases excellent tools for isolate identification.

Amplicon sequences from all isolates listed in Additional File 2, Table S2 were used to draw phylogenetic trees deduced from NJ analyses (Fig. 2, 3, 4 and 5), and parsimony and Bayesian methods were applied to examine the sensitivity of the resulting trees and tree topologies. Trees remained largely invariant to these manipulations and topologies were similar to a significant extent for each gene region tested independently of the phylogenetic method applied (symmetric difference values between the trees obtained with different methods for the same dataset are shown in Additional File 4, Table S4). Trees were rooted using as outgroups Aschershonia sp. and/or Simplicillium lamelicolla (both members of Hypocreales). Specifically, the phylogenetic tree produced from the ITS1-5.8S-ITS2 sequences obtained in this work and known related sequences from the databanks, divided the majority of B. bassiana strains into two major clades (Clade A and C), with marginal support of each clade (Fig. 2). The only exception was three strains (namely U259, O46 and IR582) that grouped together, at the base of the remaining B. bassiana strains with significant bootstrap (99 and 84% for the NJ and MP analyses, respectively) and Posterior Probability support (99% for the BI analysis). Similarly, the three B. brongniartii strains, grouped with the respective sequences obtained from GeneBank and produced a sister clade to B. bassiana, whereas the B. vermiconia and B. amorpha strains were basal to B. bassiana and B. brongniartii (Fig. 2). They all clearly clustered to a group different from the other species of the order Hypocreales, with significant NJ (97%) and MP (90%) bootstrap support. Based on 265 informative characters, 2,700 most parsimonious trees were constructed with tree length of 1,106 steps [Consistency Index (CI) = 0.56, Homoplasy Index (HI) = 0.44, Retention Index (RI) = 0.86, Rescaled Consistency Index (RC) = 0.48]. The relatively small number of informative characters may explain the marginal MP bootstrap and PP support. The remaining previously known Beauveria species (B. geodes, B. nubicola, B. tundrense and B. parasiticum) grouped well with other Tolypocladium species as expected according to known taxonomic criteria [39, 40].

Figure 2
figure 2

Phylogenetic trees constructed from unambiguously aligned ITS1-5.8S-ITS2 domain, as produced by NJ analysis. Clade credibility using NJ calculated from 1K replicates (upper numbers in roman), parsimony BS support calculated from 100 replicates (first lower numbers in italics) using PAUP and PPs produced by 2M generations (second lower numbers - in bold) using MrBayes, are shown. In the phylogenetics analysis of the ITS1-5.8S-ITS2 region, fungal species names and sequences obtained from GenBank are shown with their accession numbers in the figure. Fungal hosts are indicated as follows: in a circle, A, Araneida; C, Coleoptera; D, Diptera; H, Hemipetra; L, Lepidoptera; N, Nematoda; O, Orthoptera, T, Thysanoptera, R, Rotifera; ?, not known; in a square, H, Hymenoptera and no indication from soil or air. Geographic location is provided next to each isolate together with blue, orange, green, purple and magenta colour for the isolates originated from Europe, Asia, America, Africa and Oceania, respectively. B. brongniartii isolates are shown in yellow while isolates from all other Hypocreales species are provided in black.

Figure 3
figure 3

Phylogenetic trees constructed from unambiguously aligned nad 3- atp 9 intergenic region, as produced by NJ analysis. Clade credibility using NJ calculated from 1K replicates (upper numbers in roman), parsimony BS support calculated from 100 replicates (first lower numbers in italics) using PAUP and PPs produced by 1M generations (second lower numbers - in bold) using MrBayes, are shown. Fungal hosts, geographic locations and colour designations as in Fig. 2.

Figure 4
figure 4

Phylogenetic trees constructed from unambiguously aligned atp6-rns intergenic region, as produced by NJ analysis. Clade credibility using NJ calculated from 1K replicates (upper numbers in roman), parsimony BS support calculated from 100 replicates (first lower numbers in italics) using PAUP and PPs produced by 1M generations (second lower numbers - in bold) using MrBayes, are shown. Fungal hosts, geographic locations and colour designations as in Fig. 2.

Figure 5
figure 5

Phylogenetic trees constructed from unambiguously aligned combined DNA sequences of the mt interegenic regions and the ITS domain as produced by NJ analysis. Clade credibility using NJ calculated from 1K replicates (numbers in roman), parsimonial BS support calculated from 100 replicates (numbers in italics) using PAUP and PPs produced by 2M generations (numbers in bold) using MrBayes, are shown. Fungal hosts, geographic locations and colour designations as in Fig. 2. The 3 symbol Köppen-Geiger climate classification is also provided as follows: Af, Tropical Rain Forest; Am, Tropical Monsoon climate; Aw, Tropical wet and dry; BWh, Dry (arid and semiarid) desert low latitude climate; BWk, Dry (arid and semiarid) desert middle latitude climate; BSh, Dry (arid and semiarid) steppe low latitude climate; BSk, Dry (arid and semiarid) steppe middle latitude climate; Csa/Csb, Temperate Mediterranean climate; Cfa/Cwa, Temperate humid subtropical climate; Cfb/Cwb/Cfc, Temperate Maritime climate; Cwb, Temperate with dry winters climate; Cfc, Temperate Maritime Subarctic climate; Dfa/Dwa/Dsa, Hot summer Continental climate; Dfb/Dwb/Dsb, Warm summer Continental climate; Dfc/Dwc/Dsc, Continental Subarctic climate; Dfd/Dwd, Continental Subarctic climate with extremely severe winters [41].

Both mt intergenic regions were more variable than the nuclear ITS1-5.8S-ITS2 for the B. bassiana strains. MP analyses were based on 232 and 343 informative characters and produced 7,700 most parsimonious trees with tree lengths 750 (CI = 0.71, HI = 0.29, RI = 0.87, RC = 0.62) and 1,085 steps (CI = 0.68, HI = 0.37, RI = 0.87, RC = 0.59) for the nad 3-atp 9 and atp 6-rns regions, respectively. B. bassiana strains clustered into the same two groups (Clade A and C) and again the three isolates (SP IR582, SP O46 and SP U259) were placed as a separate group, as in the ITS1-5.8S-ITS2 trees (Fig. 3 and 4). Strains of B. brongniartii were basal to those of B. bassiana with a significant bootstrap and posterior probability support (94%, 99% and 78% for NJ, MP bootstrap and BI, respectively) in the nad 3-atp 9 analysis (Fig. 3), while in the atp 6-rns tree they presented an identical topology to the ITS dataset, as a sister species to Clade A with a 100% support for all methods applied (Fig. 4). Here again, Beauveria species were clearly differentiated from other Hypocreales species, with significant support (Fig. 3 and 4). In addition, mt datasets provided better support of Clade C B. bassiana strains than their nuclear counterpart, i.e., NJ (98%) and MP (90%) bootstrap support for the nad 3-atp 9 dataset (Fig. 3), and 83% and 100%, respectively, for atp 6-rns (Fig. 4). For both mt intergenic regions Clade C B. bassiana strains clustered as a sister group with the two B. vermiconia strains (i.e., IMI 320027 and IMI 342563), with the addition of the three independent B. bassiana isolates in the case of nad 3-atp 9.

In relation to insect host order, a "loose host-associated cluster" was observed only for Clade A strains, whereas Clade C B. bassiana strains were more diverse and no relation to host origin could be detected. Interestingly, the association of B. bassiana strain clusters with their insect host origin was more consistent with the nad 3-atp 9 data, than with data derived from atp 6-rns analysis.

Concatenated sequence analysis and evidence for host and climate associations of the clades

To fully integrate and exploit all the above information, a tree was constructed based on the concatenated ITS1-5.8S-ITS2, atp 6-rns and nad 3-atp 9 sequences. Parsimony analysis provided more than 10,000 trees after exploiting 575 informative characters and the tree length was based on 1,895 steps (CI = 0.612, HI = 0.388, RI = 0.858, RC = 0.576). Analysis of the same dataset with NJ and BI methods produced similar trees with identical topologies wherever there was a strong support (Fig. 5). As in every tree produced by the analysis of a single gene region, B. bassiana strains grouped again into the same two major groups. The three isolates that were placed basally to the remaining B. bassiana remained independent, with significant bootstrap support (NJ: 99%, Fig. 5; see also DNA sequence percentage identity in comparisons of members of Clade A2 with members of Clades A and C in Additional File 5, Table S5). The most interesting feature of the concatenated data tree was that B. bassiana strains of Clade A could be divided further into seven distinct sub-groups that showed a "loose" association with their host (Fig. 5). This association was strengthened if the fungi were clustered according to their geographic and climatic origin (Fig. 6). More precisely, sub-groups 1, 3, 4 and 6 contained strains from Europe with five, nine, three and twelve members, respectively (Additional File 3, Table S3). Sub-group 1 strains were derived from France, Hungary and Spain (with a single strain from China). They were all isolated from a Lepidopteran host and originated from temperate maritime and continental microthermal climates (Cfb/Dwb) according to the Köppen-Geiger classification [41]. The three strains of sub-group 2 were isolated from Oceania (one from Australia and two from Papua New Guinea). To these, an Indian (Cfa), a Chinese (Cfa) and a Spanish (Csa) strain were also added, i.e., fungal strains from regions with temperate humid subtropic and Mediterranean climates, resembling the climate of the Oceanic Cfa [41]. Sub-groups 3 and 4 consisted almost exclusively of European strains (9 and 3, respectively) from regions with Mediterranean climate, such as Spain, Portugal and Italy. On the other hand, 12 strains from regions of Europe with maritime temperate climates (Cfb) formed a well-supported group (87 and 92% NJ and MP bootstrap and 94% PP support) presented as sub-group 6. All nine strains of sub-group 5 were from regions with dry arid, semiarid (BSh, BSk and BWk) and temperate (Csa and Csb) climates in Asia and Europe, while the South American (6) from tropic (Af, Am and Aw) and dry arid/semiarid (BSh) climates may be named as sub-group 7.

Figure 6
figure 6

Grouping of B. bassiana sensu lato strains (Clade Α) as well as Clade C and A 2 , according to their geographic distribution, climate conditions and molecular data (concatenated datasets from ITS1-5.8S-ITS2, nad 3- atp 9 and atp 6- rns ). The 3 symbol Köppen-Geiger climate classification is as shown in Fig. 5.

Discussion

Fungal mt genome size shows high divergence among the Pezizomycotina, ranging from 100.3 Kb for Podospora anserina (NC_001329) to 24.5 Kb for the entomopathogen Lecanicillium muscarium (AF487277). Beauveria mt genomes sizes were similar to those of other fungi of the order Hypocreales, e.g., Fusarium oxysporum (34.5 Kb; AY945289) and Hypocrea jecorina (42.1 Kb; NC_003388), but they were significantly larger (~40%) than the mt genomes of the other two known entomopathogenic fungi of the order, i.e., M. anisopliae (24.7 kb) [27] and L. muscarium (24.5 kb) [42]. Since the Beauveria mtDNAs contained the same protein and rRNA coding genes -also identical in sizes- with all above mt genomes, their larger sizes can be attributed to more introns and to longer intergenic regions.

Compared to mt genomes of plants and animals, fungal mt genomes are significantly richer in group I and II introns [43]. Divergence in intron content is a common feature among mt genomes of Pezizomycotina. At one extreme is the mt genome of P. anserina which contains 41 introns [44] and at the other are several fungi that contain a single intron in the rnl genes of their mt genomes (i.e., L. muscarium and M. anisopliae). The recently released mt genome of another B. bassiana isolate (EU371503) also presented fewer introns than the genomes that we analyzed. These data support and extend previous evidence for intronic variability among strains of the same Beauveria species [14, 16]. However, introns usually share common futures like insertion sites, encoded ORFs and total length [43, 45]. Exceptions are noteworthy, not only because they suggest tools for the discrimination of the fungus but also because they provide information valuable to our understanding of fungal evolution [4648]. In that respect, intron Bbrrnl 1 inserted within domain II of rnl's secondary structure was located in a novel (unique) site amongst the 36 Ascomycota complete mt genomes examined (Additional File 6, Table S6). Even though introns have been found in the same domain in Basidiomycota, for example Agrocybe aegerita [49], the uniqueness of this insertion site is of great importance to ascomycetes, as it may be a result of horizontal intron transfer. The fact that this intron encodes for a GIY-YIG homing endonuclease which shares homology with ORFs in introns located in different genes in other fungal genomes further strengthens the hypothesis of horizontal transfer. Yet, such a hypothesis remains to be experimentally tested.

Recently, a thorough attempt was made to determine associations of morphological characteristics with molecular data in Beauveria species [1]. Based on ITS1-5.8S-ITS2 and EF-1a sequences 86 exemplar isolates were examined and assigned to six major clades (A-F), where all known Beauveria species were included. B. bassiana isolates were grouped into two unrelated and morphologically indistinguishable clades (Clades A and C), while B. brongniartii formed a third sister clade to the other two (designated as Clade B). A new species, B. malawiensis, was later introduced and placed as sister clade to clade E [50], and several other B. bassiana isolates pathogenic to the coffee berry borer from Africa and the Neotropics were added to Clades A and C [22]. Our results from the ITS1-5.8S-ITS2 dataset are in full agreement with the grouping into Clades A-C and this division of B. bassiana isolates into two distinct clades is further supported by the mt intergenic region and the concatenated datasets with the best so far known bootstrap values. Mt genomes present different evolutionary rates compared to the nuclear [51] and topologies provided by one evolutionary pathway may not always indicate the correct relationships. As indicated by our findings, combining information from two independent heritages (nuclear and mt) may offer the possibility to resolve phylogenetic ambiguities. Thus, the two unrelated and morphologically indistinguishable B. bassiana clades proposed by Rehner and Buckley [1], i.e., the "B. bassiana s.l.", which contains the authentic B. bassiana (Clade A), and the "pseudobassiana" clade, which remains to be described (Clade C), are fully supported by our combined mt and nuclear data. Equally well supported by bootstrap is the placement of B. brongniartii strains as a sister clade to B. bassiana. The consistent clustering of the three B. bassiana isolates (our Clade A2 in Fig. 5 and Additional File 5, Table S5), which grouped basally to other B. bassiana with any datasets, indicates that this group is possibly a cryptic complex of B. bassiana. Experimental work with these and other similar isolates will be needed to substantiate this hypothesis.

A generally accepted notion that insect hosts are related to certain genotypes of entomopathogenic fungi has been tested in several occasions in the past for B. bassiana and B. brongniartii. However, only a few cases supported a host - fungal genotype specificity. For instance, associations have been reported between B. brongniartii and Melolontha melolontha, M. hippocastani or Hoplochelus marginalis [17, 52]. A common B. bassiana genotype was detected in isolates from Ostrinia nubilalis [10] and from Alphitobius diaperinus [53]. More often, B. bassiana isolates collected from the same insect species were found to be genetically dissimilar [54, 55] or showed cross-infectivity [56]. Similarly, fungal isolates derived from different insect species, families or orders clustered together [57]. Our results from the concatenated mt and nuclear gene datasets come to an agreement with the latter view, since molecular variability showed no general correlation between strains and host and/or geographic origin. This indicates that B. bassiana is a generalized insect pathogen, and is in agreement which its world-wide distribution, the vast variety of hosts from which it has been isolated and its entomopathogenic and/or endophytic characteristics [1, 58]. It is only in rare occasions that a particular genotype, like Clade A sub-group 1 isolates (Fig. 6; Table 1), may be associated with a particular host (Ostrinia nubilalis). In the case of B. brongniartii and under the light of previous analyses of larger fungal populations [17, 52], the association between fungal genotypes and a particular host seem to be stricter.

Table 1 Data from the phylogenetic analyses

An increasing number of studies point towards a broad correlation of fungal isolates with their place of origin and/or habitats [e.g., [18, 21, 30, 59, 60]]. Obviously, the factors that can influence B. bassiana population structures are many and can include: climate conditions, the range of temperatures in which the various isolates can grow in nature, humidity levels, UV exposure, habitat, cropping system and soil properties [18, 27, 59, 61]. In addition, collection strategies, numbers of isolates tested, gene loci used and molecular methods applied for population analyses can greatly contribute to the variability recorded. Thus, although the description of the effects of a single variable on the population of entomopathogenic fungi in a habitat can give significant and useful ecological and agronomical information, there may be relationships among the different variables that must be studied in detail to adequately understand the source of genetic variability in these fungi [59, 61]. Therefore, to increase our potential to detect correlations between molecular markers and environmental variables, we incorporated climate conditions in our analyses, based on the most widely accepted classification system, the Köppen-Geiger climate classification [41]. This approach allowed fungal isolates that were otherwise outside of a particular cluster to be embodied in this cluster. Also, with few exceptions, strains isolated from distant geographic regions, which however shared similar climatic conditions, clustered together. If an explanation had to be proposed, the isolation by distance (allopatry) cannot be ruled out [22]. During the last decade molecular phylogenetic studies concerning fungal taxa which are considered to be widespread have resulted in the recognition of allopatric cryptic sibling species [33, 62]. The suggestion that some morphologically defined species consist of a number of cryptic species that are independent lineages with restricted distributions [36], may explain the phylogeographic distribution of the three B. bassiana isolates designated in group A2 in this work. In other words, even though they are morphologically indistinguishable from the rest B. bassiana isolates, all three have the same host and are originated from Asia (i.e., Iran, Turkey and Uzbekistan) with similar climate (Bsk/Csa/Dsa).

It may be argued, and indeed it is the case, that the fungal isolates studied in this work are geographically "biased", since they are predominantly isolated from insects found in Europe (40) and Asia (19), and to a lesser extend from other places in North and South America, Africa and Oceania (16 isolates). However, even with this worldwide distribution of the isolates studied, continental drifts, geological barriers, host restrictions and human activities may contribute to long-distance dispersal and result to mixed sub-grouping classification. For instance, sub-group 2 (Fig. 6) contains the Oceanic isolates, one from India and one from Britain. While the "Indian" isolate may be considered as an evolutionary result of the opening of the Weddell Sea when eastern (including Australia, New Zealand and India) and western Gondwana (including Africa and Northern South America) separated [63], the "British" isolate may only be explained by accepting long-distance dispersal due to the human intervention as the most probable way. In similar studies where fungal distribution was caused by the breakup of Pangaea to New and Old World, like the Pleurotus cystidiosus group [33], Schizophyllum [64] and Lentinula [65], several exceptions to this pattern were observed in each study and they were usually explained by rare, but recent long - distance dispersal. Thus, gene flow among geographically distant populations of B. bassiana may be attributed to the long-distance dispersal of fungal spores through a variety of different direct or indirect means including wind, migratory insect vectors, rainfall, flooding and human traffic. On the other hand, the fact that several B. bassiana isolates belonging to different phylogenetic clades have been found in the same geographic location (e.g., Fig. 5, clades 3 and 4) may indicate a sympatric diversification. There appears to be no single morphological, physiological, host range, or genetic marker characteristic that can alone resolve molecular phylogenies in B. bassiana. Therefore, a strictly vicariant scenario may be not supported with these datasets and the occurrence of long - distance dispersal may be an alternate feasible scenario which renders the genus Beauveria cosmopolitan with several cryptic species, as already have been shown in other fungal taxa [6668]. Nevertheless, in view of the ecological complexities of this entomopathogenic fungus, it is evident that terminal lineages can only be found if experiments are performed using more hierarchical parameters (climate, habitat, ecology and biogeography) in combination with multiple gene analyses that include data both from nuclear and mitochondrial genes.

Conclusions

The complete mt genomes of B. bassiana and B. brongniartii analysed in this work had the typical gene content and organization found in other Ascomycetes of the order Hypocreales, but contained more introns and longer intergenic regions. The latter features can serve as tools for inter- and intra- species specific analysis within the genus Beauveria. Two mt intergenic regions (nad 3-atp 9 and atp 6-rns) provided valuable sequence information and good support for the discrimination of Beauveria species and the division of 76 B. bassiana isolates into two groups, namely the B. bassiana sensu lato and the B. bassiana "pseudo-bassiana". These findings were in agreement with phylogenetic inferences based on ITS1-5.8S-ITS2 and demonstrated that mt sequences can be equally useful with the universally approved ITS1-5.8S-ITS2 for phylogenetic analysis. Further, mt sequence phylogenies constantly supported the formation of a third B. bassiana group, clearly differentiated from the rest, thus hinting for the presence of cryptic species within B. bassiana. Concatenated data sets of sequences from the three regions studied (i.e., the two mt and the nuclear ITS sequences) supported the above conclusions and often combined with criteria of isolate and geographic and climatic origins offered a better resolution of the B. bassiana s.l. strains and showed for the first time in entomopathogenic fungi, that B. bassiana s.l. strains can be subdivided into seven distinct sub-groups with common climate characteristics.

Methods

Strains, growth conditions, and DNA extraction

Seventy six strains of Beauveria bassiana, 3 of B. brongniartii and 14 strains of 9 other Beauveria species, together with one representative from each of 11 species belonging to the order Hypocreales were examined and are listed in Additional File 2, Table S2 (a fungal collection kept in the Department of Genetics and Biotechnology, Athens University, Greece). All fungal isolates were derived from single conidial spores grown on Potato Dextrose Agar (PDA) plates and all cultures were started from single spore isolations. Liquid cultures were in 250 ml flasks containing 50 ml of medium, inoculated with a spore suspension to reach 105/ml final spore concentration, on an orbital shaker at 150 rev min-1, 25°C, for 3-4 days. Mycelia were removed by vacuum filtration, lyophilized for 2-4 days, and ground in liquid nitrogen using a mortar and pestle. Small quantities of ground mycelia (50-100 mg) were used for the extraction of DNA as described [69].

Construction of libraries, PCR amplification and sequencing of the complete mt genomes

Isolation and digestion of nuclear and mtDNA from B. bassiana strain Bb147 and B. brongiartii strain IMBST 95031 were performed as previously described [69]. Eco RI and Hind III restricted fragments of CsCl-purified mtDNA were ligated into vector pBluescript KS+ (Stratagene, Cedar Creek, TX), analysed, subcloned and sequenced, thus covering over 78-80% of their complete mtDNA. The rest of the mtDNA and overlapping junctions were determined through sequence analysis of long-expand PCR amplicons. For this purpose, previously designed primers were used as follows: nad1B, cox3B, atp6A [42], cox2R, LSUER [27], LSUSF [38], and NMS1, NMS2 [70]. The primer pairs and respective amplicon sizes are shown in Additional File 7 (Additional File 7, Table S7). No sequence differences were observed between cloned fragments and PCR amplicons for the overlapping regions. PCR amplifications were performed with the proof-reading polymerase Herculase (Stratagene), in a PTC-200 Gradient Peltier Thermal Cycler (MJ Research, Waltham, MA), according to the manufacturer's instructions. PCR products were cloned in vector pDrive (QIAGEN, Hilden, Germany), subcloned as smaller fragments to pBluescript SKII and sequenced. Sequencing was performed with the Thermo Sequenase Primer Cycle Sequencing kit (Amersham Biosciences, Amersham, UK), and the reactions were analyzed at a LICOR 4200 IR2 automated sequencer. All fragments were sequenced in both directions. DNA similarity searches were performed with Basic Local Alignment Search Tool (BLAST 2.2.14) [71]. The tRNAs were predicted by tRNAscan-SE 1.21 [72]. Intron identification and characterization utilized the intron prediction tool RNAweasel [73].

Phylogenetic analysis

The ITS1-5.8S-ITS2 region of the nuclear rRNA gene-complex and two mtDNA intergenic regions, namely nad 3-atp 9 and atp 6-rns, were examined in all isolates. DNA extracts from each isolate were used as templates for amplification with primers VLITS1 with VLITS2 for the ITS region [74], atp6F and rnsR for the atp 6-rns, and nad3F with atp9R for the nad 3-atp 9 mt intergenic regions [27]. All reactions were performed with Herculase polymerase (Stratagene) in a PTC-200 (MJ Research) thermocycler according to the manufacturer's protocol, with a minor modification involving lower annealing temperature (45°C) for all three pairs. Sequencing was performed as above. DNA sequence alignments were made using CLUSTALW [75] with the multiple alignment parameters set to default and then edited by visual inspection (the matrix of the concatenated dataset and its partitions is provided in Additional File 8). Maximum parsimony (MP), Neighbor-Joining (NJ) and Bayesian inference (BI) analyses were employed in order to create the phylogenetic trees. The PAUP* program ver. 4.0b10 [76] was used for NJ (using the Kimura-2 parameter model) and MP analyses with 1,000 and 10,000 replicates, respectively, and with random addition of taxa and tree-bisection reconnection branch swapping [76]. Reliability of nodes was assessed using 1,000 and 100 bootstrap iterations for NJ and MP analyses, respectively. The BI was performed with MrBayes ver. 3.1 [77]. A gamma distribution model of site variation was used, calculated with PAML [78]. For ITS1-5.8S-ITS2, nad 3-atp 9, atp 6-rns and concatenated data sets, alpha (a) was 0.529, 0.966, 1.311 and 0.693, respectively. Two independent MCMCMC searches were run for each data set using different random starting points (number of generations: 1,000,000 for all datasets except for the concatenated set, where 2 million generations were needed) with sampling every 100 generations. Convergence was checked visually by plotting likelihood scores vs. generation for the two runs [the first 25% samples from the cold chain (relburnin = yes and burninfrac = 0.25) were discarded]. Based on this analysis, the burn-in was set to 10,000, as this was found to be clearly sufficient for the likelihood and the model parameters to reach equilibrium. Distances between trees produced by the same dataset but different method were computed with the Symmetric Difference of Robinson and Foulds [79] as implemented in program Treedist of the PHYLIP v.3.69 package [80].

Nucleotide sequence accession numbers

The complete sequence of B. bassiana strain Bb147 and B. brongniartii strain IMBST 95031 have been submitted to GenBank [GenBank: EU100742 and GenBank: NC_011194, respectively]. Also, nucleotide sequences for ITS and mtDNA intergenic regions were submitted to GenBank database [GenBank: FJ972917-FJ972972, GenBank: FJ973054-FJ973076 and GenBank: EU086417-EU086434 for the ITS region, GenBank: FJ972973-FJ973028, GenBank: FJ973077-FJ973101 and GenBank: EU086435-EU086455 for intergenic region nad 3-atp 9 and GenBank: FJ972862-FJ972916, GenBank: FJ973029-FJ973053 and GenBank: EU086396-EU086416 for the intergenic region atp 6-rns].