Background

Offspring vertically inherit both nuclear and non-nuclear genetic material from their mothers [1]. Among the non-nuclear material inherited are intracellular bacteria which are transferred vertically from mother to offspring and often live in symbiosis with their hosts [2]. These symbionts may be obligate (essential for host survival) or facultative, in which case they can increase or decrease host fitness [3, 4]. Obligate symbionts are found within specialized cells and typically share a long evolutionary history with their hosts [5], whereas facultative symbionts tend to have more recently formed host associations. Wolbachia (Alphaproteobacteria: Rickettsiales: Rickettsiaceae) is a genus of facultative endosymbiont common among arthropods that is estimated to have infected more than half of arthropod species [6], including two-thirds of all extant insect species [7]. As with other facultative endosymbionts, Wolbachia has been thought to primarily undergo vertical transmission from mother to offspring with high fidelity [5]. However, symbionts can also develop host associations via horizontal transmission between different host species [2, 4, 8]. Horizontal transmission is thought to be the most likely explanation for closely related symbionts occurring in phylogenetically distant insect lineages [2, 8–13]. There have been multiple phylogenetic and transinfection studies reporting evidence of Wolbachia transmission between both phylogenetically close and distant hosts [9, 14–18]; it is therefore probable that horizontal transmission of Wolbachia is occurring between some arthropod taxa [4].

Butterflies and moths (Lepidoptera) constitute one of the most diverse insect orders with nearly 158,000 described species [19]. Lepidoptera play an important role in ecosystems and serve primarily as pollinators and herbivores, though some species feed on blood and other animal secretions [20–23]. The order includes many significant agricultural pests, and some species serve as models for many biological disciplines [24]. Furthermore, lepidopteran larvae are hosts to other major insect radiations: the parasitic flies and wasps [25–27]. Despite the diversity of Lepidoptera and their many associations with other organisms, little is known about their bacterial community.

Wolbachia are some of the most widespread endosymbiotic microbes [6, 28–30]. In nematodes, Wolbachia interact mutualistically with their hosts [28], and in arthropods, Wolbachia most commonly interact with their hosts via a parasitic manipulation of the reproductive system [28]. Consequently, Wolbachia has been thought to undergo vertical transmission much more frequently than horizontal transmission [28]. Wolbachia most commonly affects Lepidoptera via reproductive manipulation and can induce multiple phenotypes including feminization, male killing, and cytoplasmic incompatibility [31–33]. One strain of Wolbachia enhances the susceptibility of its lepidopteran host to baculovirus, rendering it a potential biological control agent against the agricultural pest Spodoptera exempta [34]. It was recently estimated that approximately 80 % of Lepidoptera species are infected with Wolbachia [29], a prediction that is considerably higher than the 52 % estimated infection frequency across arthropods [6]. However, the reported mean prevalence (27 %) in Lepidoptera [29] does not significantly differ from the estimated prevalence in arthropods (24 %) [6]. The high incidence and low prevalence may reflect opportunities for substantial horizontal transfer of Wolbachia in Lepidoptera.

After Wolbachia undergoes stable horizontal transmission from natural to novel hosts, there are multiple possible phenotypic effects. We define “phenotype” as the set of observable characteristics of a host that result from its interaction with Wolbachia. The Wolbachia phenotype can become stronger, become weaker, or remain the same, and in some cases it can change to a phenotype that is novel to the host [35]. Additionally, once Wolbachia has successfully established a close relationship with its novel host, it may transfer a gene from its genome to the host genome over time [28]. This is known as lateral gene transfer (LGT) [36, 37], and LGT is thought to be responsible for the presence of Wolbachia genes in 70 % of arthropod and nematode genomes [36, 38, 39]. A recent study showed evidence of ancient LGT of Enterococcus bacteria in Lepidoptera [40].

In this study, we 1) analyzed all published multilocus sequence typing (MLST) strains of Wolbachia, including those from lepidopteran hosts, in order to explore potential instances of horizontal transmission, 2) surveyed transinfection experiments in Lepidoptera to detail the factors underlying the host phenotype after horizontal transmission has occurred, and 3) searched for evidence of LGT between Wolbachia and Lepidoptera genomes. Our analyses reflect the complex dynamics of transmission between Wolbachia and their lepidopteran hosts.

Methods

Data collection

We used multilocus sequence typing (MLST) strains based on five loci to identify and explore Wolbachia strain diversity. MLST provides a universal and unambiguous tool for strain typing, population genetics, and molecular evolutionary studies [41]. MLST was developed as a universal genotyping tool for Wolbachia and was found effective for detecting diversity among strains within a single host species, as well as for identifying closely related strains found in different arthropod hosts [41]. We downloaded and analyzed all 345 publicly available strains of Wolbachia in arthropods and nematodes on March 31, 2014 from the PubMLST website (http://pubmlst.org/Wolbachia/) developed by Jolley and Maiden [42] (Additional file 1: Table S1). Approximately 26 % of these strains (90/345) were associated with lepidopteran hosts: 81 were strictly found in lepidopteran hosts whereas nine strains were found in both lepidopteran and non-lepidopteran arthropod hosts (Additional file 2: Table S2). Some of the strains from lepidopteran hosts (16/90) were unnamed and incomplete because not all five of the MLST loci (gatB, coxA, hcpA, ftsZ, and fbpA) were sequenced; these strains were designated as unassigned (UA) strains (Additional file 3: Table S3), and we included them in our analysis as such.
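The rule for flagging a strain as unassigned can be sketched in a few lines of Python. This is only an illustration: the record layout (a dict keyed by locus name) and the allele numbers are hypothetical and do not reflect the PubMLST export format.

```python
# The five Wolbachia MLST loci named in the text.
MLST_LOCI = ("gatB", "coxA", "hcpA", "ftsZ", "fbpA")


def classify_profile(profile):
    """Return 'UA' if any of the five loci is missing, else 'complete'.

    `profile` is a hypothetical dict mapping locus name -> allele id,
    with None (or an absent key) meaning the locus was not sequenced.
    """
    missing = [locus for locus in MLST_LOCI if profile.get(locus) is None]
    return "UA" if missing else "complete"


# Illustrative records only; the allele numbers are made up.
example_complete = {"gatB": 1, "coxA": 2, "hcpA": 3, "ftsZ": 4, "fbpA": 5}
example_partial = {"gatB": 1, "coxA": 2, "hcpA": None, "ftsZ": 4}

# Share of strains with lepidopteran hosts, using the counts given in the text.
lepidopteran_fraction = 90 / 345
```

With the counts quoted above, `lepidopteran_fraction` rounds to the 26 % stated in the text.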

Sequence alignment and datasets

For ingroups, we included 345 MLST strains based on five MLST loci (gatB, coxA, hcpA, ftsZ, and fbpA) of Wolbachia. For outgroups, we included bacteria closely related to Wolbachia: Anaplasma marginale (NCBI Genome accession no. NC_022760), Ehrlichia ruminantium (NC_006831) and Rickettsia slovaca (NC_017065), and extracted the five MLST loci from these genomes. These three outgroups and 345 ingroups were downloaded and aligned with the G-INS-i algorithm in MAFFT [43]. Geneious v8 [44] was used to trim, align, and concatenate the five MLST loci. The best model and partitioning scheme were chosen using the Bayesian Information Criterion (BIC) in PartitionFinder v1.0.1 [45] and resulted in two partitions (combined first and second codon positions [nt12], and third codon positions only [nt3]).

Phylogenetic analysis

Maximum likelihood (ML) phylogenetic analyses were conducted in RAxML v8 [46] using a GTR + G model for each partition. To estimate the best ML tree in RAxML, we used the “–f a” option to estimate 1000 bootstraps and perform a likelihood search, as well as 200 “–f d” searches that started from a randomly generated parsimony tree, following the general methods of Kawahara et al. [47]. We also estimated SH-like branch support [48] for the best topology in RAxML v8. We used the same method to construct a second ML tree for a smaller dataset of 51 strains found only in lepidopteran hosts, using three different outgroups: ID 37 from Supergroup D (host Brugia malayi, Nematoda), ID 505 from Supergroup C (host Onchocerca cervipedis, Nematoda) and ID 260 from Supergroup F (host Odontotermes horni, Isoptera).

A phylogeny of Wolbachia strains was also inferred with ClonalFrame v1.2 [49] without outgroups. ClonalFrame uses information on substitution as well as recombination events and is therefore suitable for reconstructing bacterial evolution based on multilocus data [49]. We performed ten separate runs, each with a burn-in of 250,000 generations and a sampling period of 750,000 generations, with a sampling frequency of 100. We chose the two runs with the highest mean log likelihood values and compared these to assess convergence of chains using the methods of Gelman and Rubin [50]. Trees of the posterior samples of the converged runs were then combined to compute a majority-rule consensus. We also calculated the ratio of nucleotide changes from recombination to nucleotide changes from point mutations (r/m).
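For readers unfamiliar with the Gelman and Rubin diagnostic, the following is a minimal Python sketch of the potential scale reduction factor for a single scalar parameter. It is not ClonalFrame's own implementation; the function name and the input layout (one list of post-burn-in samples per run) are assumptions made for illustration.

```python
from statistics import mean, variance


def gelman_rubin(chains):
    """Potential scale reduction factor (R-hat) for one scalar parameter.

    `chains` is a list of equal-length lists of post-burn-in samples,
    one list per independent run. Values close to 1 suggest convergence.
    """
    m = len(chains)
    n = len(chains[0])
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length >= 2")
    chain_means = [mean(c) for c in chains]
    # W: mean within-chain variance; B: between-chain variance scaled by n.
    w = mean([variance(c) for c in chains])
    b = n * variance(chain_means)
    # Pooled estimate of the posterior variance.
    var_hat = (n - 1) / n * w + b / n
    return (var_hat / w) ** 0.5
```

Two runs that sample the same region give a value near 1, whereas runs stuck in different regions give a value well above 1.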

Gene networks

Statistical parsimony network analysis has been shown to be useful for assessing species-level delimitation and for identifying breaks in network connectivity [51–53]. Here we used breaks in network connectivity to delimit groups of Wolbachia strains [54, 55]. In the present study, 90 strains were analyzed using a parsimony network approach [56] with TCS v.1.21 [57] using a 95 % cut-off [51]. The resulting networks identify both the relationships between the different haplotypes and the number of substitutions among connecting haplotypes [58].
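The idea of a break in network connectivity can be illustrated with a deliberately simplified stand-in for statistical parsimony: haplotypes are linked when they differ by no more than a fixed number of substitutions, and each connected component is treated as one network. This is not the TCS algorithm, which derives its connection limit from the 95 % criterion; here `limit` is a hypothetical user-supplied parameter and the sequences are toy data.

```python
def hamming(a, b):
    """Number of differing positions between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def networks(haplotypes, limit):
    """Group haplotypes into connected components.

    `haplotypes` maps a name to an aligned sequence; two haplotypes are
    linked when they differ by at most `limit` substitutions.
    """
    names = list(haplotypes)
    unseen = set(names)
    groups = []
    for start in names:
        if start not in unseen:
            continue
        unseen.discard(start)
        stack, group = [start], [start]
        while stack:
            current = stack.pop()
            linked = [
                n for n in names
                if n in unseen
                and hamming(haplotypes[current], haplotypes[n]) <= limit
            ]
            for n in linked:
                unseen.discard(n)
                group.append(n)
                stack.append(n)
        groups.append(sorted(group))
    return groups


# Toy alignment, not real MLST data.
toy = {"h1": "AAAAAA", "h2": "AAAAAT", "h3": "AAAATT", "h4": "GGGGGG"}
```

In the toy data, `h1`, `h2` and `h3` chain together through single substitutions, while `h4` falls into a separate network.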

Mantel test

A Mantel test was used to compute the Pearson correlation coefficient R using XLSTAT 2014 (http://www.xlstat.com). The test was performed on the pairwise node distance matrix of lepidopteran families from Regier et al.’s lepidopteran tree [59] and Wolbachia strains to test for significant association between matrices [60, 61].
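A permutation-based Mantel test can be sketched in plain Python as follows. The analysis above was run in XLSTAT; this sketch only illustrates the general technique, and the function names, the number of permutations, and the two-sided p-value convention are assumptions.

```python
import random
from math import sqrt


def _upper(matrix):
    """Flatten the upper triangle (excluding the diagonal) of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def _pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel(dist_a, dist_b, permutations=999, seed=0):
    """Return (r, p) for two symmetric distance matrices of the same size.

    Rows and columns of `dist_b` are permuted together; p is the share of
    permutations whose |r| is at least as large as the observed |r|.
    """
    n = len(dist_a)
    rng = random.Random(seed)
    x = _upper(dist_a)
    r_obs = _pearson(x, _upper(dist_b))
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)
        permuted = [[dist_b[order[i]][order[j]] for j in range(n)]
                    for i in range(n)]
        if abs(_pearson(x, _upper(permuted))) >= abs(r_obs):
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)
```

Permuting rows and columns together keeps each matrix internally consistent, which is why the test is valid for distance matrices whose entries are not independent.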

Co-phylogenetic analysis

Wolbachia strains from eight families of Lepidoptera were tested for codivergence.

We mapped the Wolbachia ClonalFrame tree onto the Lepidoptera phylogeny of Regier et al. [59] using JANE v4 [62]. We reconstructed codivergence patterns with default cost values for cospeciation (0), duplication (1), duplication and host switch (2), loss (1), and failure to diverge (1). The JANE analysis was performed using 500 generations and a population size of 100. We selected an edge-based cost model and a node-based cost model; these models differ in how they count the number of events related to cospeciation, duplication and failure to diverge.

Divergence time estimation

To compare the age of Wolbachia divergence to previously published Lepidoptera divergence time estimations, we dated the splits of all Wolbachia strains found in lepidopteran species. Divergence time estimation analyses were performed in BEAST v2.1.3 [63] and two independent calibrations were used to cross-validate our estimates [64]. We applied the following calibration approaches: 1) using a recently published evolutionary rate of Wolbachia, estimated from Wolbachia genomes [65] and 2) using the age of a monophyletic set of strains shown to have strictly cospeciated with their hosts (bees) [66]. We tested for the presence of a strict clock for nt12 and nt3 datasets using a likelihood ratio test (LRT) [67] in PAUP* v4.0 [68]. Since the LRT can be affected by recombination, we also used the relative-rate test (RRT) of Posada [69] in HyPhy [70], which can discriminate between strict and relaxed clock models in the presence of recombination. Because the RRT requires that the outgroup taxa are recombination free, we used 3SEQ [71], implementing the full run mode for each gene to ensure that the outgroup taxa did not have any recombinant genes. RRT analyses included taxa with unique sequences and no missing MLST loci and used two different outgroup MLST strains (13_Ekue_A_Ephestia_Pyralidae, 22_Aenc_B_Ugardan_Acraea_Nymphalidae). For the RRT, an alpha of ≤ 0.05 with a Bonferroni correction was treated as significant, and if any test was significant, the strict clock was rejected [56]. Since both the LRT and RRT rejected the strict clock, we estimated divergence times using a relaxed lognormal clock and applied one of the two calibrations to cross-validate estimates.

The first calibration scheme was based on the median rate (substitutions per site per generation) of the Wolbachia genome [65] reported in generations of Drosophila melanogaster, which we converted to years (10 generations per year) and scaled to substitutions per site per million years (nt12: 6.42 × 10^-3 [2.76 × 10^-3 to 1.29 × 10^-2, 95 % HPD]; nt3: 6.87 × 10^-3 [2.88 × 10^-3 to 1.29 × 10^-2, 95 % HPD]). We set lognormal priors that spanned the 95 % HPD of the previous rate estimations (for nt12: lognormal M = 0.00642 and S = 0.45; for nt3: M = 0.00687 and S = 0.44). The second calibration scheme was based on the divergence time of MLST Wolbachia strains (wNLeu, wFla, wNPan) from Gerth et al. [66]. The MRCA of these MLST strains is estimated at 1.7 mya (0.86–2.61, 95 % HPD) [72]. We included these three strains in our divergence time analysis and calibrated the age of this group with a lognormal prior set to span the estimated HPD (M = 1.6, S = 0.33).
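The unit conversion behind the first calibration can be written out as a short Python sketch. The per-generation value used in the example (6.42e-10) is simply back-calculated from the nt12 rate quoted above and is not taken from [65]. The second helper assumes that M is the mean of the lognormal in real space, which is one of two conventions BEAST offers and is an assumption here; it returns a central 95 % interval, not an HPD.

```python
from math import exp


def rate_per_site_per_myr(rate_per_generation, generations_per_year=10):
    """Convert substitutions/site/generation to substitutions/site/million years."""
    return rate_per_generation * generations_per_year * 1_000_000


def lognormal_central_95(mean_real, sigma):
    """Central 95 % interval of a lognormal with real-space mean `mean_real`.

    Assumes the prior is parameterised by its mean in real space and by
    the standard deviation `sigma` of the underlying normal distribution.
    """
    # The median of a lognormal is exp(mu), with mu = ln(mean) - sigma^2 / 2.
    median = mean_real * exp(-sigma ** 2 / 2)
    z = 1.959964  # 97.5th percentile of the standard normal
    return median * exp(-z * sigma), median * exp(z * sigma)


# Illustrative, back-calculated per-generation rate (not quoted from [65]).
nt12_rate = rate_per_site_per_myr(6.42e-10)
nt12_low, nt12_high = lognormal_central_95(0.00642, 0.45)
```

Comparing the returned interval with the HPD reported for the rate is one quick way to check that a chosen S is wide enough for the prior to span that HPD.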

For each calibration scheme, we ran two BEAST analyses for a total of 4 runs using default settings for the remaining priors. We ran the MCMC chains for 150,000,000 generations, sampling every 1000th generation, and used Tracer [73] to ensure that the runs converged and had ESS values >200. For comparison with Wolbachia divergences, we applied the published divergence times of lepidopteran families [74, 75].

Evidence of LGT

MUMmer [76] was used to align Wolbachia and Lepidoptera genomes to search for evidence of LGT events. We used the following nine Wolbachia genomes: wBm (D) (host: Nematoda: Brugia malayi; AE017321) [77], wBol (B) (Lepidoptera: Nymphalidae: Hypolimnas bolina; CAOH01000001-CAOH0100014) [78], wMel (A) (Diptera: Drosophilidae: Drosophila melanogaster; NC_002978) [79], wPip (B) (Diptera: Culicidae: Culex quinquefasciatus; NC_010981) [80], wRi (A) (Diptera: Drosophilidae: Drosophila simulans; NC_012416) [81], wAlb (B) (Diptera: Culicidae: Aedes albopictus; CAGB01000001-CAGB01000165) [82], wVit (B) (Hymenoptera: Pteromalidae: Nasonia vitripennis; AERW00000000) [83], wHa (A) (Diptera: Drosophilidae: Drosophila simulans; CP003884) [84], and wNo (B) (Diptera: Drosophilidae: Drosophila simulans; CP003883) [84]. At the time of this study, there were nine available Lepidoptera genomes that were used to search for possible LGT events: Bombyx mori [85], Danaus plexippus [86], Heliconius melpomene [87], Manduca sexta (http://agripestbase.org/manduca), Melitaea cinxia [88], Papilio glaucus [89], P. polytes, P. xuthus [90] and Plutella xylostella [91].

Results

MLST strain diversity in Lepidoptera

All Wolbachia strains with known associated lepidopteran hosts were grouped in either Supergroup A or B (Additional file 2: Table S2). The majority of lepidopteran strains (76 total, representing 32 unique MLST strains) belong to Supergroup B; the remaining strains (14 total, representing 6 unique MLST strains) belong to Supergroup A.

Phylogenetic analysis of MLST strains

ClonalFrame and RAxML analyses yielded similar topologies overall. The few differences between the trees might be due to recombination or to differences in outgroup selection (Fig. 1a, b), although the chance of recombination is likely negligible. The ratio of nucleotide changes from recombination to nucleotide changes from point mutations (r/m) was, on average, 1.48 (0.97–2.1, 95 % credibility region), which is considerably lower than the average (r/m = 3.5) seen in other Wolbachia MLST studies [92]. Some strongly supported clades in the ML analysis were also recovered in the ClonalFrame analysis of the dataset, including all currently available MLST profiles (Fig. 1a, b).

Fig. 1

a Maximum likelihood (ML) tree based on the concatenated five Wolbachia MLST loci (2079 bp). ML bootstrap values are placed to the left of the hyphen and SH-like branch support values to the right of the hyphen. Bootstrap values >60 % are placed by nodes; 100 % bootstrap values are indicated by asterisks. Outgroups were removed for simplicity. A-H refer to Supergroups A-H. b Majority-rule ClonalFrame genealogy based on the concatenated five Wolbachia MLST loci (2079 bp) from nematodes and arthropods. Labels correspond to Wolbachia strains and host species, families and geographic localities. Support values represent the percentage of trees from the posterior sample in which each node was present. Bootstrap values from ML analyses based on 1000 pseudoreplicates are shown

In total, 345 Wolbachia strains were analyzed from insect hosts (Coleoptera, Diptera, Hemiptera, Hymenoptera, Isoptera, Orthoptera, Lepidoptera) and distantly related invertebrates (Arachnida, Crustacea, and Nematoda). The ML and ClonalFrame phylogenetic trees were divided into six major clades (Supergroups A-D, F, and H). The ClonalFrame tree also contained an additional clade with strains in Supergroups A, B, C and F; this likely represents sequences that underwent the most recombination. Supergroup A is closely related to Supergroup B (Fig. 1a, b). The strain wExe3, which has a lepidopteran host, was originally classified as A. However, it is basal to clade B with 98 % bootstrap support in the ML tree, and it is denoted on Fig. 1 as “A*”. In addition, in the ML tree, strain wHyl, which has an arachnid host, was highly supported (bootstrap = 99) as being basal to the strain wExe3 (labeled “A**”, Fig. 1b). Supergroups A and B, along with A* and A**, were sister to a clade of strains previously classified as Supergroup H, which further connects to Supergroup D and to Supergroup F. Supergroup C has high support (bootstrap = 85) as being a basal group near the outgroup (Fig. 1b). Most lepidopteran strains were classified in Supergroup B in both the ML and ClonalFrame trees (Fig. 1a). However, in the ClonalFrame tree, A* and A** were grouped in Supergroup A. In the ClonalFrame tree, Supergroup D has high support (bootstrap = 90) and is placed close to outgroups (Fig. 1a).

Gene network analyses of unique Wolbachia strains in Lepidoptera

We performed genetic network analyses for 38 unique Wolbachia strains in Lepidoptera belonging to Supergroups A and B. Strains were divided into different networks based on a 95 % parsimony cut-off. Strains of Supergroup B were placed into four networks. Network 1 contained 29 strains; four of these strains were shared strains because they were found in multiple host species, and 25 strains were singletons because they were found only in single host species. These 29 strains were connected together in one network (Fig. 2a). Strain ST41 was found in 11 butterfly species (from three families) and was shared with a dipteran (Fig. 2a). Similarly, ST146 was found in two butterfly species from two different families, and ST125 was shared between two butterflies and one moth (Fig. 2a). ST37 was shared between one butterfly and two wasps: the egg parasitoid, Tetrastichus coeruleus (Eulophidae) and the social wasp, Polistes dominula (Vespidae) (Fig. 2a). Network 2 contained one shared strain, ST40, found to be present in Eurema hecabe, E. mandarina, and Surendra vivarna. Network 3 contained two strains from two butterflies in different families: Acraea encedon (Nymphalidae) and Catopsilia pomona (Pieridae). Network 4 contained one lepidopteran strain, found on the lycaenid butterfly Brangas felderi (Fig. 2a).

Fig. 2

Statistical parsimony genetic network analysis (95 % confidence limit) showing genealogical relationships of Wolbachia strains in Lepidoptera. a Genetic network of Wolbachia Supergroup B strains in Lepidoptera. b Genetic network of Wolbachia Supergroup A strains in Lepidoptera. For (a and b), letters in green at the top of each strain name indicate known phenotypes for that strain; CI = Cytoplasmic Incompatibility, FI = Feminization Induction, MK = Male Killing. Grey indicates a strain that is inter-specific, inter-generic, inter-familial, or inter-ordinal. “Un” is used for unknown geographical locations

Strains in Supergroup A were grouped into four networks. Two networks contained only one strain: Network 3 had the lepidopteran strain ST92 (from Ephestia kuehniella [Pyralidae]) and Network 4 had the lepidopteran strain ST223 (from Spodoptera exempta [Noctuidae]). Network 2 contained two strains, both with lycaenid butterfly hosts, that were separated by two mutations: ST38 (Jamides alecto) and ST110 (Iraota rochana; Fig. 2b). Network 1 contained nine strains. ST19 was found in eight strains from eight host species: Ephestia kuehniella (Lepidoptera: Pyralidae), Ornipholidotos peucetia (Lepidoptera: Lycaenidae), Ceutorhynchus neglectus (Coleoptera: Curculionidae), and five ant species (Leptogenys sp., Leptomyrmex sp., Pheidole plagiara, P. planifrons, Technomyrmex albipes). The ninth strain, ST91, occurred on the nymphalid butterfly Hypolimnas bolina and was separated by just one mutation from strain ST19.

Comparison of Wolbachia and Lepidoptera phylogenies

There was no strong congruence between the Wolbachia and lepidopteran phylogenies in the Mantel test. Analysis of the ML topology for Wolbachia and the ML tree from Regier et al.’s [59] lepidopteran phylogeny using JANE, at a p-value of 0.05, yielded reconstructions (cost = 92) with only 9 cospeciation events, 22 duplications, 19 duplications with host switching, 22 losses and 10 failures to diverge (Additional file 4: Figure S1).
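The reported reconstruction cost follows directly from the event counts above and the default event costs given in the Methods. A short Python check of that arithmetic (the dictionary keys are illustrative names, not JANE identifiers):

```python
# Default event costs as stated in the Methods.
COSTS = {
    "cospeciation": 0,
    "duplication": 1,
    "duplication_host_switch": 2,
    "loss": 1,
    "failure_to_diverge": 1,
}

# Event counts reported for the reconstruction.
EVENTS = {
    "cospeciation": 9,
    "duplication": 22,
    "duplication_host_switch": 19,
    "loss": 22,
    "failure_to_diverge": 10,
}


def total_cost(events, costs):
    """Sum of event counts weighted by their per-event cost."""
    return sum(events[name] * costs[name] for name in events)


reconstruction_cost = total_cost(EVENTS, COSTS)  # 0 + 22 + 38 + 22 + 10 = 92
```

Because cospeciation carries zero cost, the total of 92 is driven entirely by the non-cospeciation events, with duplication plus host switching contributing the largest share (38).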

The Mantel test analysis indicated that there were no strong correlations between the genetic distances of Wolbachia and host Lepidoptera (r = −0.072, P = 0.081 [indigenous]; r = 0.107, P < 0.0001 [comparing the Wolbachia ClonalFrame tree with the ML tree of Regier et al. [59]]; r = 0.069, P = 0.019 [comparing the Wolbachia ML tree with the ML tree of Regier et al. [59]]).

A phylogeny based only on unique strains of Wolbachia in lepidopteran hosts showed that distantly related strains were found in the same host family. Most of the Wolbachia strains were found in three butterfly families (Lycaenidae, Nymphalidae, Pieridae). These three were closely related [59, 93], yet they contain distantly related strains (Fig. 3). Strains ST3, ST40, ST41, and ST146 transferred horizontally across these three sister families of butterflies. Strain ST125 was found in both butterflies (Lycaenidae, Nymphalidae) and moths (Noctuidae). Strain ST19 was found in a lycaenid, pyralid, and in two non-Lepidopteran insect orders (Coleoptera, Hymenoptera), and strains ST37 and ST41 were found in multiple orders (Diptera, Lepidoptera) (Fig. 3, Additional file 4: Figure S1).

Fig. 3

Comparison of phylogenies of Wolbachia and their lepidopteran hosts. a ML tree based on the concatenated data of the five Wolbachia MLST loci. The tree was rooted with three strains from Supergroups C, D and F. ML bootstrap values ≥50 % are shown on branches. b Phylogeny of Lepidoptera according to Regier et al. [59]. Colors correspond to Lepidoptera family names. Grey indicates a strain that is inter-familial or inter-ordinal

Divergence time estimation

Both the LRT (nt12: df = 91, LRT value = 565.16, P-value = 0; nt3: df = 91, LRT = 1833.43, P-value = 0) and RRT (outgroup: Ephestia sp., nt12: 112/351, nt3: 140/428; outgroup: Acraea sp., nt12: 118/351, nt3: 272/428) rejected a strict clock. In BEAST, all run pairs converged and the ESS values were above 200. Analyses using different calibrations resulted in overlapping HPD divergence time intervals at the root, with a mean of 12.67 mya (26.86–4.76 mya, 95 % HPD) using the clade calibration prior and a mean of 10.67 mya (22.6–4.7 mya, 95 % HPD) using the evolutionary rate of Wolbachia as a prior. Both calibrations also provided overlapping HPDs for the age of the MRCA of (wNLeu, wFla, wNPan), with the run that calibrated this clade at 0.55–1.89, 95 % HPD and the run using a rate prior at 0.0097–1.84, 95 % HPD. We compared divergence times of all lepidopteran Wolbachia strains (10.16–22.5-0 mya, 95 % HPD) with the divergence times of lepidopteran families of Wahlberg et al. [74], who found the youngest divergence between families at 87 mya (76–98, 95 % HPD) and the oldest divergence, between moths and butterflies, at 116 mya (127–105 mya, 95 % HPD) (Fig. 4). In a more recent study of insect phylogenomics, the mean divergence time between butterflies and moths was much younger, estimated at ~58 mya [75] compared to 116 mya in a prior study [74]. If either of these Lepidoptera time estimates is correct, all switches between lepidopteran families are likely to be due to horizontal transmission. Two identical Wolbachia strains, ST19 and ST125, shared between butterflies and moths are clear cases of a horizontal Wolbachia jump. Wolbachia strains ST37 and ST41 were identical in Diptera and Lepidoptera, orders whose estimated divergence time is approximately 289.65 mya (328.62–244.11 mya, 95 % HPD) [75]. Coleoptera and Lepidoptera, with an estimated split of 326.69 mya (353.05–301.86 mya, 95 % HPD) [75], and Hymenoptera and Lepidoptera, with an estimated split of approximately 344.68 mya (372.43–317.79 mya, 95 % HPD), share the ST19 strain [75].

Fig. 4

Estimated divergence times (a) of Lepidoptera based on Wahlberg et al. [74], and (b) of Wolbachia Supergroups A and B based on the evolutionary rate of MLST genes [65]. Three samples (wNLeu, wNFla, wNPa) under W_Bees in (b) were taken from Gerth et al. [66] to calibrate and cross-validate the divergence estimation

Geography of shared strains

Geographical distributions of six shared strains (ST19, ST37, ST40, ST41, ST125, ST146) were surveyed (Fig. 5). The seventh shared strain, ST3, was not included in this analysis due to the uncertainty of the sampling location of its host species. Strain ST41 was found in one unidentified species of calyptrate fly from the United States, and ten butterfly species from six countries: Lycaenidae: Azanus mirza (Ghana), Celastrina argiolus (United States), Nacaduba angusta (Malaysia), Pseudozizeeria maha, Zizeeria knysna (India); Pieridae: Delias eucharis, Ixias pyrene, Pareronia valeria (India), Eurema hecabe and its subspecies E. h. mandarina (India, Japan, Taiwan); Nymphalidae: Neptis hylas (India). Strain ST37 was found in one Malaysian butterfly species (Anthene emolus), the American wasp species Polistes dominula, and the wasp Tetrastichus coeruleus, which was sampled in the United States, the Netherlands and France. Strain ST125 was found in a butterfly species from India (Telicada nyseus), a butterfly species from French Polynesia and Japan (Hypolimnas bolina), and a moth species from Tanzania (Spodoptera exempta). Strain ST146 was found in two different species in India (Junonia lemnonias, T. nyseus). Strain ST40 was found in one Japanese butterfly species (E. hecabe) and one Malaysian butterfly (Surendra vivarna). Strain ST19 of Supergroup A was found in four countries spanning four continents; this strain was present in one species of weevil from Canada (Ceutorhynchus neglectus), three species of ants from Thailand (Leptogenys sp., Pheidole planifrons, P. plagiara), one ant species from Australia (Leptomyrmex sp.) and one butterfly from South Africa (Ornipholidotos peucetia).

Fig. 5

Geographical distribution of Lepidoptera-related Wolbachia strains. The six strains that were shared among lepidopteran and non-lepidopteran species are plotted. Each color represents one strain (Blank world map was taken from www.freeusandworldmaps.com)

Summary of previous transinfection studies in Lepidoptera

The horizontal transmission of Wolbachia can facilitate the induction of unknown phenotypes in the novel host. In the last two decades, there have been multiple transinfection studies reporting evidence of Wolbachia transmission between phylogenetically close and distant species [94–101]. In the present study, we surveyed previous studies on transinfection of Wolbachia in Lepidoptera and attempted to classify them according to the possible factors involved in the induction of phenotypes after the transinfection (Table 1). Our survey reveals that the stability of Wolbachia infection and induction of its phenotypes in novel hosts is determined by three factors: 1) type of strain, 2) type of host species/population, and 3) collective effects of both the host and the Wolbachia strain [94–101].

Table 1 Results of published transinfection experiments of Wolbachia strains performed on lepidopteran hosts

Lateral gene transfer (LGT)

We found one possible case of LGT between the Wolbachia strain wHa of Drosophila simulans and the genome of the butterfly Melitaea cinxia. The portion of the Wolbachia gene found in the genome of M. cinxia was 350 bp with > 96 % identity. We trimmed that hit from the respective scaffold 391, between 44,255 and 44,603 bp in the genome of M. cinxia, and used BLAST to confirm that it is part of the Wolbachia genome (between 662,982 and 663,331 bp) with 100 % query cover and > 96 % identity (337/350 bp) with an e-value of 4e-160. The BLAST search also showed that this portion of the gene is part of the locus wHa_05420, which is associated with a hypothetical protein (AGJ99989.1). We did not find any evidence of LGT in the other eight genomes of Lepidoptera aligned against available genomes of Wolbachia. However, we found four hits in P. xylostella ranging between 544 and 569 bp in length with 81–83 % similarity. We used BLAST on those hits and found that they matched Enterobacter sp. with > 97 % identity (Table 2).
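The identity figure and the coordinate spans quoted above can be checked with a few lines of Python. The helper names are illustrative, and the coordinates are assumed to be 1-based and inclusive, which the text does not state.

```python
def percent_identity(matches, aligned_length):
    """Percentage of aligned positions that are identical."""
    return 100 * matches / aligned_length


def inclusive_length(start, end):
    """Number of positions in a 1-based, inclusive coordinate range."""
    return end - start + 1


hit_identity = percent_identity(337, 350)             # about 96.3 %
wolbachia_span = inclusive_length(662_982, 663_331)   # wHa genome coordinates
scaffold_span = inclusive_length(44_255, 44_603)      # M. cinxia scaffold 391
```

Under that assumption the Wolbachia coordinates span exactly 350 positions, while the scaffold coordinates span 349; a one-position difference of this kind could arise from an alignment gap or a different coordinate convention, but the text does not say which.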

Table 2 Comparisons of genomes of Wolbachia and Lepidoptera to test for traces of LGT

Discussion

Previously, vector-mediated interspecific transmission was observed in Wolbachia through shared food sources [2, 102–105], ectoparasitic mites [106, 107], and parasitoids [4]. Our study revealed that inter-specific, inter-familial, and inter-ordinal horizontal transmission is also common in Lepidoptera. Using phylogenetic, co-phylogenetic and network analyses, we found at least seven probable cases of horizontal transmission among 31 host species, both within Lepidoptera and between Lepidoptera and other arthropods. Three strains (ST3, ST40, ST146) were shared among three butterfly families (Lycaenidae, Nymphalidae, Pieridae). One strain (ST125) was shared between two butterfly families (Lycaenidae, Nymphalidae) and the distantly related moth family Noctuidae. Since the majority of lepidopteran larvae feed on plant tissue, and adults obtain nectar from flowers or tree sap, the close association of Lepidoptera with plants might lead to increased infection through host plant mediation [105]. Strain ST41 is the most widespread Wolbachia strain in butterflies; it was shared among eleven butterfly species in three families (Lycaenidae, Nymphalidae, Pieridae) and, interestingly, it was also shared with one unidentified species of calyptrate fly. There are a number of known hymenopteran parasitoids that are found on both lepidopteran and dipteran hosts, and thus parasitoids may have mediated horizontal transfer [108].

Another strain, ST37, was found to be shared between the egg parasitoid Tetrastichus coeruleus, the social wasp Polistes dominula, and the lycaenid butterfly Anthene emolus. Tetrastichus coeruleus is not known to parasitize lepidopterans. However, it parasitizes eggs of the common asparagus beetle, Crioceris asparagi [109], which shares a host plant with Lepidoptera such as the pest species Spodoptera exigua [110]. Perhaps Wolbachia was transferred into a lepidopteran host through this shared host plant. Larvae of Polistes dominula are parasitoids of Chalcoela iphitalis (Lepidoptera: Pyralidae) [111], which could serve as a possible route of Wolbachia transfer to a lepidopteran host. The Malaysian lycaenid butterfly, Anthene emolus, is symbiotic with the ant species Oecophylla smaragdina. These ants guard A. emolus larvae and protect them from predators and parasites [112]. We postulate that any one of these lepidopteran-hymenopteran interactions could potentially enable inter-ordinal transfer of ST37.

Strain ST19 also exhibits inter-ordinal transfer. It is shared among three different insect orders: Lepidoptera (the lycaenid butterfly Ornipholidotos peucetia, and the pyralid moth, Ephestia kuehniella), Hymenoptera (the ant species Leptogenys sp., Leptomyrmex sp., Pheidole planifrons, P. plagiara, and Technomyrmex albipes), and Coleoptera (the weevil Ceutorhynchus neglectus). Horizontal transmission of Wolbachia is also possible when an uninfected insect eats an infected one [113]. Ceutorhynchus neglectus is parasitized by multiple wasps [114]; weevils also feed on flower pollen and nectar [115]. It is thus possible that ST19 jumped across three insect orders either through shared host plants or via shared parasitoids.
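The distinction between inter-familial and inter-ordinal sharing used above can be made explicit with a small sketch. The host records below are taken from the strains described in the text; the function name `sharing_level` and the data layout are hypothetical, and the families Formicidae (ants) and Curculionidae (weevils) are filled in here for illustration rather than quoted from the text.

```python
# Host records as (order, family) pairs. Orders and lepidopteran families
# follow the text; Formicidae and Curculionidae are added for illustration.
SHARED_STRAINS = {
    "ST3": [
        ("Lepidoptera", "Lycaenidae"),
        ("Lepidoptera", "Nymphalidae"),
        ("Lepidoptera", "Pieridae"),
    ],
    "ST125": [
        ("Lepidoptera", "Lycaenidae"),
        ("Lepidoptera", "Nymphalidae"),
        ("Lepidoptera", "Noctuidae"),
    ],
    "ST19": [
        ("Lepidoptera", "Lycaenidae"),
        ("Lepidoptera", "Pyralidae"),
        ("Hymenoptera", "Formicidae"),
        ("Coleoptera", "Curculionidae"),
    ],
}


def sharing_level(hosts):
    """Return the broadest taxonomic level spanned by a strain's hosts."""
    orders = {order for order, _ in hosts}
    families = set(hosts)
    if len(orders) > 1:
        return "inter-ordinal"
    if len(families) > 1:
        return "inter-familial"
    return "within-family"


for strain, hosts in SHARED_STRAINS.items():
    print(strain, sharing_level(hosts))
```

A strain whose hosts span more than one order is reported only at the broadest level, so ST19 is labelled inter-ordinal even though it is also shared between two lepidopteran families.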

The Mantel test revealed only a weak correlation between the genetic make-up of lepidopteran hosts and that of their endosymbiotic bacteria, Wolbachia, which further supports horizontal transmission of Wolbachia within Lepidoptera. Co-phylogenetic analysis revealed frequent losses, duplications and host switches of Wolbachia strains within Lepidoptera.
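For readers unfamiliar with the Mantel test, the following is a minimal sketch of a one-tailed permutation version, assuming two symmetric pairwise distance matrices over the same set of hosts. It is not the implementation used in this study; the function names and the toy distance values are hypothetical.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("distances must not all be identical")
    return sxy / sqrt(sxx * syy)


def mantel(host_dist, symb_dist, permutations=999, seed=0):
    """Return (r, p) for a one-tailed Mantel test on two distance matrices."""
    n = len(host_dist)
    pairs = list(combinations(range(n), 2))
    host_vec = [host_dist[i][j] for i, j in pairs]

    def r_for(order):
        # Relabel the symbiont matrix rows and columns according to `order`.
        symb_vec = [symb_dist[order[i]][order[j]] for i, j in pairs]
        return pearson(host_vec, symb_vec)

    order = list(range(n))
    observed = r_for(order)
    rng = random.Random(seed)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        rng.shuffle(order)
        if r_for(order) >= observed:
            hits += 1
    return observed, hits / (permutations + 1)


# Toy example: four hosts with made-up host and symbiont distances.
host = [[0.0, 0.1, 0.4, 0.5],
        [0.1, 0.0, 0.45, 0.5],
        [0.4, 0.45, 0.0, 0.2],
        [0.5, 0.5, 0.2, 0.0]]
symb = [[0.0, 0.02, 0.01, 0.03],
        [0.02, 0.0, 0.03, 0.01],
        [0.01, 0.03, 0.0, 0.02],
        [0.03, 0.01, 0.02, 0.0]]
r, p = mantel(host, symb)
print(f"r = {r:.3f}, p = {p:.3f}")
```

A correlation near zero, or one that is not significant, means that host relatedness predicts little about symbiont relatedness, which is the pattern expected under frequent horizontal transmission.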

We performed divergence time analyses on all available Wolbachia strains from Lepidoptera using two independent calibrations [65, 66]. The results from the two calibrations cross-validate our divergence time estimates and suggest that the conclusions are robust. Our analysis suggests that Wolbachia was recently introduced into Lepidoptera, at a maximum age of ~23 mya. The Wolbachia divergence times, compared with the divergence times estimated by Wahlberg et al. [74], suggest that the lepidopteran families currently known to carry Wolbachia had already diversified before they became Wolbachia hosts. A recent study on insect evolution [75] suggests that the divergence between butterflies and moths, and between Lepidoptera and other insect orders (Diptera, Coleoptera and Hymenoptera), took place between ~344 and ~58 mya, whereas the identical strains shared between them were acquired recently, at a maximum of ~23 mya. Our divergence time analysis, in light of the most comprehensive calibrated Lepidoptera phylogeny, suggests that Wolbachia strains ST3, ST19, ST40, ST41, ST125 and ST146 are likely cases of inter-familial horizontal transmission, and that ST125 and ST19 are cases of inter-superfamilial horizontal transmission [74, 75]. We also found that ST19, ST37 and ST41 are clear cases of inter-ordinal horizontal transmission. The cospeciation events predicted in the co-phylogenetic analysis seem to be invalidated, given the lepidopteran divergence times estimated by Wahlberg et al. [74].
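The reasoning behind this comparison can be stated as a simple rule: a strain shared by two host lineages cannot have been inherited from their common ancestor if the strain is younger than the split between those lineages. A minimal sketch follows, using the approximate ages quoted above; the function name and return strings are hypothetical.

```python
def transmission_inference(strain_max_age_mya, host_split_age_mya):
    """Compare the maximum age of a shared strain with the age of the host split."""
    if strain_max_age_mya < host_split_age_mya:
        # The hosts diverged before the strain existed, so vertical
        # inheritance from their common ancestor is ruled out.
        return "horizontal transmission"
    return "co-divergence not excluded"


WOLBACHIA_MAX_AGE_MYA = 23   # maximum age of Wolbachia in Lepidoptera (this study)
YOUNGEST_HOST_SPLIT_MYA = 58  # youngest host divergence quoted from [75]
OLDEST_HOST_SPLIT_MYA = 344   # oldest host divergence quoted from [75]

for split in (YOUNGEST_HOST_SPLIT_MYA, OLDEST_HOST_SPLIT_MYA):
    print(split, transmission_inference(WOLBACHIA_MAX_AGE_MYA, split))
```

Because even the youngest host split considered here is older than the maximum strain age, the same inference holds across the whole range of host divergence times.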

Facultative endosymbionts have already been shown to change host fitness or biology; pea aphids (Acyrthosiphon pisum) have facultative symbionts that protect their hosts against entomopathogenic fungi and parasitoid wasps, ameliorate the detrimental effects of heat, and influence host plant suitability [2, 116–118]. One main consequence of horizontal transmission is the induction of previously unobserved Wolbachia phenotypes in the novel host [28]. A recently discovered Wolbachia strain confers fitness benefits by increasing resistance against natural pathogens in fruit flies [119]. All previously published transinfection experiments in lepidopteran hosts arrived at the similar conclusion that phenotype induction after transinfection is determined by two factors: the strain and the host type [94–101]. It is necessary to investigate each strain’s genotype and phenotype in its natural host, as well as in other possible hosts to which it may have been transferred through shared resources. In some cases, host suppressors of a Wolbachia-induced phenotype can lead to loss of that phenotype [100]. Therefore, some strains that do not currently induce a phenotype may have done so in the past, implying that more species have had their biology affected by Wolbachia than previously estimated [100]. In other cases, novel hosts can suppress the Wolbachia-mediated phenotype and enable the appearance of hidden phenotypes [100, 101]. Together, these studies suggest that Wolbachia strains possess the genetic makeup to induce multiple phenotypes [28].

The spread of endosymbionts in field populations by horizontal transmission has received little attention. The mechanisms driving horizontal transmission remain mostly unclear; even the effects induced by common cases of horizontal transmission are currently unknown [2, 3]. Since there is no way to control horizontal transmission in the field, routes of transmission must be thoroughly studied in order to investigate the genotypes and phenotypes of strains in both natural and novel hosts.

Recently, a complete copy of the Wolbachia genome was found within the genome of Drosophila ananassae, and large segments were found in seven other Drosophila species [36]. During the original whole genome sequencing of the nematode Brugia malayi, extensive lateral gene transfer (LGT) from its Wolbachia endosymbiont was identified [36]. LGT from the Wolbachia genome to the nuclear genome of its eukaryotic hosts is widespread [38, 39]. In a search of sequence data archives, about 70 % of arthropods and nematodes showed evidence of LGT from Wolbachia [36, 38, 39]. We found one instance of possible Wolbachia LGT between strain wHa and the butterfly Melitaea cinxia. This result must be confirmed with PCR to rule out the possibility of a genome-sequencing error or contamination. We did not find any evidence of LGT from the Wolbachia genome in the other eight available genomes of Lepidoptera. Even Plutella xylostella, the only one of these eight species known to harbor a Wolbachia infection, did not yield any evidence of LGT in our analysis of its genome. For M. cinxia, the evidence of LGT that we found suggests that it is or has been infected with Wolbachia. The method we used to search for possible LGT has previously been used effectively to trace LGT from Wolbachia [36] and from other bacterial species [40]. The lack of evidence of LGT in the remaining genomes is also consistent with our inference of a recent introduction of Wolbachia into Lepidoptera. Though these results are sound based on currently available data, they are not conclusive; future studies should examine additional genomes and methods to trace LGT in Lepidoptera. Genome assemblies of eukaryotes often filter out bacterial sequences as contaminants, so Wolbachia genes may be present in the original sequencing reads but absent from the finished genome assemblies [120]. We suggest that future studies examine the raw sequencing reads rather than assembled genomes to detect genes that may have been filtered out during assembly.
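As an illustration of what screening raw reads could look like, the sketch below flags reads that share short subsequences (k-mers) with a reference Wolbachia sequence on either strand. This is a toy stand-in, not the method used in this study; a real screen would rely on an established alignment tool. All names, thresholds and sequences here are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def kmers(seq, k):
    """Return the set of all k-mers in a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def screen_reads(reads, reference, k=8, min_shared=3):
    """Return (read name, shared k-mer count) for reads resembling the reference."""
    ref_kmers = kmers(reference, k) | kmers(reverse_complement(reference), k)
    flagged = []
    for name, seq in reads:
        shared = len(kmers(seq, k) & ref_kmers)
        if shared >= min_shared:
            flagged.append((name, shared))
    return flagged


# Made-up sequences for illustration only.
reference = "ATGGCTAAAGGTTCTGACCGTATTGAAC"
reads = [
    ("read_1", reference[5:25]),          # fragment of the reference
    ("read_2", "CCCCCCCCGGGGGGGGTTTT"),   # unrelated sequence
]
print(screen_reads(reads, reference))
```

Reads flagged in this way would still need to be checked for flanking host sequence, since a read that matches Wolbachia alone cannot distinguish a nuclear insertion from a live infection or contamination.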

Ahmed et al. [29] found geographic patterns in the infection status of Wolbachia; however, this survey did not find any such patterns in strain distribution. We frequently found strains distributed across continents, such as ST19, ST37, and ST41, which have been found in multiple hosts across Asia, Africa, Australia and North America. There is no generally accepted theory for how these strains were transferred between various hosts across continents, partially due to the difficulty in tracing the strains’ natural hosts. The comparison of the phylogenies of Wolbachia and host Lepidoptera indicates that closely related strains have phylogenetically diverse hosts and vice versa. These examples of shared strains across distantly related families suggest that horizontal jumps might be the result of recent acquisition of Wolbachia.

Currently, only eight families of Lepidoptera have published Wolbachia strain data. These include three moth families (Crambidae, Noctuidae, Pyralidae) and five butterfly families (Hesperiidae, Lycaenidae, Nymphalidae, Papilionidae, Pieridae) that represent three Lepidoptera superfamilies (Noctuoidea, Pyraloidea, Papilionoidea), which contain about 50 % of described lepidopteran species [19]. It would be interesting to explore the Wolbachia strains from other butterfly and moth families in order to obtain a comprehensive estimate of the full extent of Wolbachia diversity and mode of transmission within this order.

Conclusions

We found evidence for several new instances of Wolbachia horizontal transmission in Lepidoptera. Our findings suggest that specific shared food sources and shared natural enemies are possible routes of horizontal transmission, but further studies are needed to conclusively determine these routes. We uncover evidence of Wolbachia inducing new phenotypes in novel hosts after horizontal transmission from natural hosts. However, Wolbachia-induced phenotypes have not been well studied for most natural hosts and potential novel hosts. Therefore, it is crucial to study additional Wolbachia-infected organisms in order to determine which species are natural hosts for each strain. It is also important to perform additional transinfection experiments to determine which species can sustain a stable infection. Data from these experiments will yield information about the phenotypes in both natural and novel hosts, revealing new insights into the mechanisms of Wolbachia-induced phenotypic change. Finally, further research into host genotypes should be conducted by analyzing additional genomes of potential hosts to search for the presence of inserted Wolbachia loci, in order to elucidate the function of these laterally transferred genes.

Ethics

Not applicable.

Consent to publish

Not applicable.

Availability of data and materials

The data are available at LabArchives (DOI: 10.6070/H48913W9).