Background

Chlorops oryzae (Diptera, Chloropidae) is an important pest of paddy rice plants, which inflicts significant economic damage to rice crops throughout Asia. However, the dispersal ability, potential for long-distance dispersal, pattern of migration, ecological amplitude, and population size of C. oryzae are still uncertain. Most research on this species has focused on the physiology and ecology [1,2,3], and its genetics is relatively unstudied [4].

Frequent C. oryzae outbreaks in recent years have caused the species to become a major pest in some regions. The propensity for outbreaks may itself play an important role in homogenizing genetic variation and intensifying gene flow between pest populations [5, 6]. We hypothesized that frequent gene flow between populations enhances the species’ overall adaptability, promoting the frequent outbreaks that occur today. In other words, the frequent outbreaks of C. oryzae are associated with the species’ genetic diversity, population demography and high rate of gene flow between populations.

To test this hypothesis, we evaluated the genetic structure of different geographic populations of C. oryzae and the level of gene flow between them. We quantified the genetic diversity and degrees of genetic differentiation of 20 different geographical populations, which may provide a scientific basis for predicting future outbreaks.

We used two effective and promising DNA marker systems, mitochondrial DNA (mtDNA) and inter-simple sequence repeats (ISSR), to examine between-population differences. Studies of genetic variation between pest populations can not only provide information on population structure in different geographical regions, but also allow the demographic history of a species to be inferred [7,8,9]. Yi et al. used microsatellite and mtDNA loci to investigate the genetic divergence and dispersal ability of Bactrocera dorsalis (Hendel) on six offshore islands in South China; their results indicated that these populations have high genetic diversity, frequent gene flow and low to medium levels of genetic differentiation. Thus, the geographic isolation of the six islands is no barrier to the dispersal of B. dorsalis [10]. Research on the genetic diversity and population structure of Leucinodes orbonalis collected from a variety of agro-climatic conditions found almost no genetic diversity and no significant genetic variation among the mitochondrial Cytochrome Oxidase I (COI) gene sequences of the populations examined. However, a few genetically distinct populations were associated with specific habitat requirements [11]. Similarly, genetic differentiation in ISSR markers and the COI gene among Iranian populations of Hishimonus phycitis may have been induced by geographical and ecological isolation and may affect the vectoring capability of this insect [12].

Our results not only provide information on the genetic structure and phylogeography of C. oryzae, but also provide a potential scientific basis for monitoring and controlling this pest.

Results

COI gene analysis

Genetic diversity and differentiation

In total, 432 individuals collected from different locations were used to amplify 684 bp of the COI gene sequence, which defined 47 haplotypes. Across the 47 haplotypes there were 43 variable sites, comprising 26 singleton variable sites and 17 parsimony-informative sites. The mean nucleotide frequencies of A, T, C and G in the sequences from the 20 populations were 29.98%, 36.56%, 16.95% and 16.51%, respectively, which shows an obvious AT bias (66.54%). The transition/transversion rate ratio was higher for purines (36.158) than for pyrimidines (19.381). The overall transition/transversion bias was 12.022.
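
The AT bias quoted above is simply the sum of the A and T frequencies (29.98% + 36.56% = 66.54%). As a minimal sketch of how such frequencies can be tallied from aligned sequences, the following uses a toy alignment that is hypothetical and not C. oryzae data; the function name `base_composition` is likewise an illustrative assumption.

```python
from collections import Counter


def base_composition(sequences):
    """Return the percentage of A, T, C and G pooled over all sequences."""
    counts = Counter()
    for seq in sequences:
        # Count only unambiguous bases; gaps and ambiguity codes are skipped.
        counts.update(base for base in seq.upper() if base in "ATCG")
    total = sum(counts.values())
    return {base: 100.0 * counts[base] / total for base in "ATCG"}


# Hypothetical toy alignment, used only to exercise the function.
toy_alignment = ["ATTAGC", "ATTTGC", "AATAGC"]
composition = base_composition(toy_alignment)
at_content = composition["A"] + composition["T"]

# Check of the figure reported in the text: A% + T%.
reported_at = round(29.98 + 36.56, 2)
```

Pooling bases over all sequences weights every site equally; averaging per-population frequencies instead would give slightly different values when sample sizes differ.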

Genetic diversity parameters of the 20 populations and the results of neutrality tests are shown in Table 1. Haplotype diversity (Hd) for each population ranged from 0 to 0.71739 and the average number of nucleotide differences (k) ranged from 0 to 1.35145. Nucleotide diversity (π) for each population ranged from 0 to 0.00198. Tajima’s D and Fu’s Fs neutrality tests gave negative values for 19 populations. When all samples were treated as one population, Tajima’s D and Fu’s Fs values were negative and significant at the 0.1% level, which is strong evidence of population expansion.
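
The three diversity statistics in this paragraph have simple definitions, sketched below under the standard Nei formulas: Hd = n/(n − 1) × (1 − Σp²) over haplotype frequencies p, k is the mean number of pairwise differences, and π = k divided by the sequence length. The sequences are hypothetical toy data, and the sketch ignores refinements such as the exclusion of gapped sites, so it is not meant to reproduce DnaSP output exactly.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Nei's haplotype diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(seqs)
    if n < 2:
        return 0.0
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1.0 - sum(f * f for f in freqs))


def mean_pairwise_differences(seqs):
    """k: average number of differing sites over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        return 0.0
    diffs = [sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs]
    return sum(diffs) / len(pairs)


def nucleotide_diversity(seqs):
    """pi: mean pairwise differences per site."""
    return mean_pairwise_differences(seqs) / len(seqs[0])


# Hypothetical aligned sequences (three haplotypes among four individuals).
toy = ["ACGTACGT", "ACGTACGT", "ACGTACGA", "ACGAACGT"]
hd = haplotype_diversity(toy)
k = mean_pairwise_differences(toy)
pi = nucleotide_diversity(toy)
```

A population in which every individual carries the same haplotype gives Hd = k = π = 0, which corresponds to the lower bound of the ranges reported above.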

Table 1 Genetic diversity of 20 C. oryzae populations

Based on our sequence database, genetic distances between and within C. oryzae populations were 0.00012–0.00184 and 0–0.00198, respectively (Table 2), which indicates no significant genetic differentiation. The Fst values between populations ranged from −0.0595 to 0.1174 (Table 3), indicating that inter-population differences are relatively low. AMOVA results suggest that 97.28% of all genetic variation is within populations and only 2.72% between them (Table 4).

Table 2 Genetic distances between (below diagonal) and within C. oryzae populations (on diagonal) based on COI sequences
Table 3 Genetic differentiation (pairwise FST) of 20 C. oryzae populations based on nucleotide sequences of mitochondrial DNA. Lower left diagonal represents the
Table 4 AMOVA analysis of mtDNA COI gene sequences in 20 C. oryzae populations

Haplotype network and population tree

Evolutionary relationships among the haplotypes were depicted using the median-joining network method (Fig. 1). Among the 47 haplotypes, H1 was the most frequent; it occupied the central position of the network, and the other 46 haplotypes radiated from it. Notably, H23 and H45 could each be derived from two different haplotypes by a single mutational step. The topology of the population Neighbor-Joining tree suggested that there were no independent groups among the populations (Fig. 2). The NX population was the most distant on both the haplotype network and the population tree; however, there are no distinctive landscape features around the NX location, and the factor preventing the emigration of individuals from this site remains unknown.

Fig. 1

Median-joining network based on the single gene of COI haplotypes. Each circle represents a haplotype, and the area of a circle is proportional to the number of individuals with that haplotype. Colors within nodes refer to C. oryzae sampling regions

Fig. 2

Neighbor-joining tree (model Kimura-2 parameter) of the phylogenetic relationships among 20 C. oryzae populations based on COI gene variation (1000 bootstrap replicates)

ISSR-PCR analysis

Genetic diversity

A total of 197 fragments, ranging in size from 250 to 2000 bp, were generated from the 9 primers, an average of 21.89 fragments per primer. The percentage of polymorphic bands (PPB) was 100%, showing high genetic diversity at the species level. Among the 20 populations, PPB values ranged from 89.85% to 99.49%. The observed number of alleles (Na) and effective number of alleles (Ne) were 2.0000 ± 0.0000 (1.8985 ± 0.3028 to 1.9949 ± 0.0712 among populations) and 1.6460 ± 0.2373 (1.4992 ± 0.3270 to 1.6908 ± 0.2784 among populations), respectively. Nei’s gene diversity (H) and Shannon’s information index (I) were 0.3785 ± 0.0984 (0.2992 ± 0.1588 to 0.3893 ± 0.1196) and 0.5609 ± 0.1151 (0.4547 ± 0.2091 to 0.5692 ± 0.1457), respectively, within all samples (Table 5).
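
For readers unfamiliar with how these indices are derived from a band matrix, the sketch below computes PPB, Na, Ne, H and I from a binary presence/absence matrix. It rests on assumptions that the text does not state: each ISSR band is treated as a dominant two-allele locus in Hardy-Weinberg equilibrium, so the null-allele frequency is q = √(proportion of individuals lacking the band), and Shannon's index uses natural logarithms. The matrix is hypothetical toy data and the function names are illustrative.

```python
from math import log, sqrt


def locus_stats(column):
    """Per-locus statistics for one band scored as 1 (present) or 0 (absent)."""
    n = len(column)
    absent = column.count(0)
    q = sqrt(absent / n)  # assumed null-allele frequency under Hardy-Weinberg
    p = 1.0 - q
    polymorphic = 0 < absent < n
    na = 2 if polymorphic else 1                    # observed number of alleles
    ne = 1.0 / (p * p + q * q)                      # effective number of alleles
    h = 1.0 - p * p - q * q                         # Nei's gene diversity
    i = -sum(x * log(x) for x in (p, q) if x > 0)   # Shannon's index (natural log)
    return polymorphic, na, ne, h, i


def summarize(matrix):
    """Average the per-locus statistics over all loci (rows are individuals)."""
    columns = [list(col) for col in zip(*matrix)]
    stats = [locus_stats(col) for col in columns]
    n_loci = len(stats)
    return {
        "PPB": 100.0 * sum(s[0] for s in stats) / n_loci,
        "Na": sum(s[1] for s in stats) / n_loci,
        "Ne": sum(s[2] for s in stats) / n_loci,
        "H": sum(s[3] for s in stats) / n_loci,
        "I": sum(s[4] for s in stats) / n_loci,
    }


# Hypothetical matrix: 4 individuals x 2 bands (band 1 polymorphic, band 2 fixed).
toy_matrix = [[1, 1], [1, 1], [1, 1], [0, 1]]
summary = summarize(toy_matrix)
```

Under these assumptions a band at intermediate frequency contributes at most H = 0.5 and I = ln 2 ≈ 0.693, which gives a sense of scale for the species-level values of 0.3785 and 0.5609 reported above.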

Table 5 Genetic variation among C. oryzae populations

Genetic differentiation

The Gst was 0.0997, which indicates that 90.03% of the genetic variation is within populations and only 9.97% between populations. The estimated Nm value was 4.5165 (Table 6). These results suggest that genetic differentiation among populations of C. oryzae is impeded by high gene flow. Table 7 lists the genetic identity (above diagonal) and genetic distance (below diagonal) among populations.
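
The link between Gst, the partition of variation and Nm can be made explicit with a short calculation. The sketch assumes the commonly used estimator Nm = 0.5 × (1 − Gst) / Gst; the text does not state which formula was applied, so this is an assumption rather than a description of the software's internals.

```python
def variation_partition(gst):
    """Return (% of variation within populations, % between populations)."""
    return 100.0 * (1.0 - gst), 100.0 * gst


def gene_flow_from_gst(gst):
    """Assumed estimator: Nm = 0.5 * (1 - Gst) / Gst."""
    if not 0.0 < gst < 1.0:
        raise ValueError("Gst must lie strictly between 0 and 1")
    return 0.5 * (1.0 - gst) / gst


gst = 0.0997  # value reported in the text
within_pct, between_pct = variation_partition(gst)
nm = gene_flow_from_gst(gst)
```

With the rounded Gst of 0.0997 this gives Nm of about 4.515, close to the reported 4.5165; the small gap is plausibly due to rounding of the printed Gst. Values of Nm above 1 are conventionally read as enough gene flow to counteract divergence by drift.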

Table 6 Population genetic differentiation coefficients and gene flow among C. oryzae populations
Table 7 Nei’s genetic identity (above diagonal) and genetic distance (below diagonal) among C. oryzae populations

The relationship between genetic and geographic distance is shown in Fig. 3. A Mantel test revealed no significant correlation between genetic and geographic distance (r = 0.54675, p = 0.9992) among C. oryzae populations, and there was therefore no evidence of significant isolation by distance.
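
A Mantel test correlates the off-diagonal entries of two distance matrices and judges significance by permuting the population labels of one matrix. The following is a minimal permutation sketch with hypothetical toy matrices; it reports a one-tailed p-value (the proportion of permutations with a correlation at least as large as the observed one). Programs differ in how they report this probability, so the sketch is not meant to reproduce NTSYSpc output.

```python
import random
from math import sqrt


def upper_triangle(matrix, order):
    """Off-diagonal entries after relabelling rows and columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel(genetic, geographic, permutations=999, seed=0):
    """Return (observed r, one-tailed permutation p-value)."""
    rng = random.Random(seed)
    identity = list(range(len(genetic)))
    geo_vec = upper_triangle(geographic, identity)
    r_obs = pearson(upper_triangle(genetic, identity), geo_vec)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # permute population labels of the genetic matrix
        if pearson(upper_triangle(genetic, order), geo_vec) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Hypothetical data: five populations on a line, genetic distance proportional
# to geographic distance (perfect isolation by distance).
geo = [[100.0 * abs(i - j) for j in range(5)] for i in range(5)]
gen = [[0.01 * abs(i - j) for j in range(5)] for i in range(5)]
r, p = mantel(gen, geo, permutations=199)
```

Rows and columns are permuted together because the entries of a distance matrix are not independent observations; shuffling individual cells would overstate significance.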

Fig. 3

Relationship between genetic distance and geographic distance of 20 C. oryzae populations

A UPGMA dendrogram constructed based on genetic identity (Fig. 4) grouped the 20 populations into two major clusters. The dendrogram did not reveal any major geographic structure for these populations.
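
UPGMA builds a dendrogram by repeatedly merging the two closest clusters and replacing their distances to every other cluster with a size-weighted average. The sketch below works on a distance matrix; if the input is Nei's genetic identity I, one common conversion (assumed here, not stated in the text) is D = −ln I. The labels and distances are hypothetical.

```python
def upgma(dist, labels):
    """Return the merge steps as (members_a, members_b, node_height) tuples."""
    n = len(labels)
    clusters = {i: (labels[i],) for i in range(n)}
    # Pairwise distances keyed by (smaller_id, larger_id).
    d = {(i, j): dist[i][j] for i in range(n) for j in range(i + 1, n)}
    merges = []
    next_id = n
    while len(clusters) > 1:
        (a, b), d_ab = min(d.items(), key=lambda item: item[1])
        size_a, size_b = len(clusters[a]), len(clusters[b])
        updated = {}
        for c in clusters:
            if c in (a, b):
                continue
            d_ac = d[(min(a, c), max(a, c))]
            d_bc = d[(min(b, c), max(b, c))]
            # Arithmetic average weighted by the sizes of the merged clusters.
            updated[c] = (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
        d = {pair: v for pair, v in d.items() if a not in pair and b not in pair}
        for c, value in updated.items():
            d[(c, next_id)] = value  # new ids are always the largest
        merges.append((clusters[a], clusters[b], d_ab / 2.0))
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return merges


# Hypothetical distances among four populations.
labels = ["A", "B", "C", "D"]
dist = [
    [0.00, 0.02, 0.10, 0.12],
    [0.02, 0.00, 0.10, 0.12],
    [0.10, 0.10, 0.00, 0.04],
    [0.12, 0.12, 0.04, 0.00],
]
steps = upgma(dist, labels)
```

In this toy case A joins B, C joins D, and the two pairs merge last, giving two major clusters in the same sense as the dendrogram described above.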

Fig. 4

UPGMA dendrogram of 20 C. oryzae populations

Discussion

The rapid development of molecular techniques has made it possible to directly measure genetic differentiation and genetic diversity among populations [13]. The relatively fast mutation rate and relatively conserved sequence of the mtDNA COI gene make it ideally suited to species identification via DNA barcoding [14]. Analysis of mtCOI gene variation is regarded as an important and reliable tool for defining cryptic species [15], evaluating biodiversity [16], identifying samples [17], and distinguishing closely related species [18].

Various surveys have demonstrated the reliability of ISSR markers, which can generate more polymorphisms than either RAPD or RFLP [19]. For example, Dioscorea hispida accessions were grouped into 10 main groups based on information provided by ISSR markers, demonstrating significant variation among germplasm specimens. D. hispida shows a high level of genetic diversity among accessions, which suggests that ISSR markers are very effective in detecting polymorphism in this species [20]. ISSR primers have been used to determine the potential for the diversification of cassava crops in Angola, revealing genetic diversity within populations and shared genetic information among the three main taxa [21]. The ISSR molecular marker technique has also been used to distinguish between citrus rootstock species, and to reveal the broad genetic base and high genetic variability among them [22].

Genetic diversity

Genetic diversity, also known as gene diversity, is a foundation of biodiversity and underpins the evolution of species. The percentage of polymorphism, haplotype diversity and nucleotide diversity are used to evaluate population genetic diversity [10]. High levels of genetic diversity are indicative of the strong viability and adaptability of populations [23]. Results from this study indicate that the genetic diversity of C. oryzae was high across the sampled populations, which may increase the fitness of populations under changing conditions. Crawford and Whitney [24] also showed that genetic diversity increases the ability of species to colonize on a short-term ecological timescale by increasing the probability of population survival, growth and reproduction in novel environments. European populations of Episyrphus balteatus and Sphaerophoria scripta adapt successfully to changing environmental conditions and have great colonization abilities owing to their high genetic diversity [25]. Tajima’s D and Fu’s Fs values were negative for 19 populations. When all samples were treated as one population, Tajima’s D and Fu’s Fs values were negative and significant at the 0.1% level, strongly indicating population expansion. This result is consistent with the interpretation that negative Tajima’s D and Fu’s Fs values reflect a demographic expansion in these populations.

Genetic differentiation

Fst values between C. oryzae populations ranged from −0.0595 to 0.1174, and the Gst value was 0.0997, both of which are indicative of low genetic differentiation between populations. This suggests frequent gene flow between populations, which may increase the species’ adaptability to environmental change [26]. Moreover, gene flow not only reflects the probable genetic differentiation and genetic infiltration among populations, but also reduces the genetic differences among them [27]. In this research, the Nm value of C. oryzae was 4.5165, which indicates high gene flow and low to medium genetic differentiation among some populations. High gene flow may impede genetic differentiation in C. oryzae. Gene flow (or the lack of it) plays a crucial role in genetic differentiation and affects both the overall adaptation of a species and adaptive divergence between populations [28]. It has traditionally been considered a homogenizing force that limits adaptive differences [29, 30], but recent studies have shown that it can also promote adaptation to local environmental conditions [31, 32]. For example, moderate gene flow increases the capacity of Rhagoletis cerasi populations (which occupy different habitats in fragmented landscapes) to adapt to local habitats, thus preventing them from becoming extinct due to genetic processes [33, 34].

An AMOVA based on Fst values indicates that most of the genetic variation resulted from differences within populations. Furthermore, a Mantel test indicates no significant correlation between genetic and geographic distance. This result is consistent with the findings of Yang et al. [35], who compared the symmetric matrices of geographic and genetic distances to test for isolation among populations of Odontotermes formosanus in different regions. These authors found no significant correlation between geographic distance and genetic distance and no significant isolation by distance. Overall, we found a high level of genetic diversity and a low degree of differentiation among populations of C. oryzae, and gene flow was unaffected by geographic distance. Similarly, geographic distance did not appear to affect gene flow between 10 geographically separated populations of Oedaleus infernalis [36]. Fst and Gst values for these populations were low and Nm was high, indicating a low level of genetic differentiation and high gene flow among populations [36]. The correlation between genetic and geographic distance was not significant [36].

Our results provide important new information on the genetic diversity and genetic differentiation of C. oryzae, and suggest that high gene flow between populations contributes to the now frequent outbreaks of this pest. However, further research on both additional geographical populations and different genetic markers is necessary before definitive conclusions can be reached. Furthermore, future work could include more comprehensive ecological and behavioural research to understand the natural history of C. oryzae in greater detail.

Conclusions

This study showed that the now frequent outbreaks of C. oryzae may be due to high gene flow between populations. We found that these populations have high genetic diversity at the species level but exhibit low genetic differentiation. High genetic diversity and frequent gene flow between populations may enhance the tolerance of populations to environmental variability and increase their adaptability to novel environmental pressures, contributing to the frequent outbreaks that have occurred and that may occur on a large scale in the future.

Methods

Sample collection and DNA extraction

A total of 400 specimens of C. oryzae were collected from different parts of Hunan province, China, and an additional 32 specimens from Zhejiang and Guizhou provinces (Fig. 5, Additional file 1: Table S1). Samples were soaked in 100% ethanol and stored at −20 °C until their genomic DNA (gDNA) was isolated. After removing the residual ethanol, DNA was extracted from each individual using an Ezup Column Animal Genomic DNA Extraction Kit as per the manufacturer’s instructions (Sangon Biotech, Shanghai, China).

Fig. 5

Collection sites of populations. ZJ and GZ are populations from Zhejiang and Guizhou province, about 900 km and 810 km from Hunan province, respectively. They are not shown on the figure. The figure was created by Ailin Zhou

COI PCR amplification and sequencing

The COI was amplified with the COIF (5′-CTAGGTGCTCCAGATATAGCATTTC-3′) and COIR (5′-GGCTAAAACAACTCCTGTTAATCC-3′) primers from isolated DNA. PCR was performed in 20 μL volumes comprised of 10 μL PrimeSTAR Max DNA Polymerase (TaKaRa, Tokyo, Japan), 1 μL of each primer (10 mmol/L), 1 μL of template DNA solution (70 ng/μL), and 7 μL double distilled water. Amplifications were conducted as follows: 34 cycles of denaturation at 94 °C for 30 s, annealing at 59 °C for 30 s, and extension at 72 °C for 1 min. All the PCR products were checked by electrophoresis on a 1.2% agarose gel and bidirectional sequencing was completed by TSINGKE (Beijing, China).

ISSR PCR amplification

A total of sixteen primers from the University of British Columbia Biotechnology Laboratory Primer kit No.9 were tested for PCR, and the nine (Additional file 1: Table S2) that produced reproducible, clear, polymorphic electrophoretic bands were chosen for further analysis. PCR was performed in 20 μL volumes comprising 10 μL Premix Taq™ (Ex Taq™ Version 2.0 plus dye) (TaKaRa, Tokyo, Japan), 2 μL of primer (10 mmol/L), 1 μL of template DNA solution (70 ng/μL), and 7 μL double distilled water. Amplifications were carried out as follows: an initial denaturation at 94 °C for 3 min, followed by 34 cycles of denaturation at 94 °C for 30 s, annealing at an optimized temperature for 30 s, and extension at 72 °C for 1 min, with a final extension at 72 °C for 7 min. All PCR products were electrophoretically separated on 2% agarose.

Data analysis

COI gene data analysis

COI sequences were edited manually with BioEdit v.7.0.9 to produce consensus sequences of 685 bp for each specimen [37]. All indices for sequence polymorphic sites, DNA polymorphism, genetic differentiation, neutrality tests (Tajima’s D [38] and Fu’s Fs [39]), and haplotype analyses were calculated using DnaSP v.5.10 [40]. A haplotype network, which included haplotype frequencies, was constructed using Network v.4.6 [41]. Genetic distances within and between populations and transition/transversion ratios in each codon were computed based on COI gene sequences using MEGA v.7.0 [42]. A population phylogenetic tree based on genetic distances was constructed using the Neighbor-Joining method in MEGA v.7.0. Analysis of molecular variance (AMOVA) was performed with Arlequin software v.3.5.2 [43].
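
The Neighbor-Joining tree in Fig. 2 is captioned as using the Kimura 2-parameter model. As a minimal sketch of that distance, assuming the standard formula d = −½ ln[(1 − 2P − Q) √(1 − 2Q)], where P and Q are the proportions of sites differing by a transition and by a transversion, the following compares two hypothetical aligned sequences. It skips sites with gaps or ambiguous bases and is not intended to match MEGA's handling of such sites.

```python
from math import log, sqrt

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        sites += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1    # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    p = transitions / sites
    q = transversions / sites
    return -0.5 * log((1.0 - 2.0 * p - q) * sqrt(1.0 - 2.0 * q))


# Hypothetical sequences differing by a single C<->T transition in 10 sites.
seq_a = "ACGTACGTAC"
seq_b = "ACGTACGTAT"
distance = k2p_distance(seq_a, seq_b)
```

For very similar sequences, such as the COI haplotypes here that differ by one or two sites in roughly 680 bp, the corrected distance is almost identical to the raw proportion of differing sites.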

ISSR data analysis

Amplified ISSR fragments were scored as present (1) or absent (0) according to their molecular weight (bp), and the resulting matrix of binary values was used for further analyses. The observed number of alleles (Na), effective number of alleles (Ne), Nei’s gene diversity (H), Shannon’s information index (I), the percentage of polymorphic bands (PPB), total gene diversity (Ht), genetic diversity within populations (Hs), coefficient of gene differentiation (Gst), and gene flow (Nm) were calculated using POPGENE v.1.31 [44]. Cluster analysis was used to construct dendrograms using UPGMA (unweighted pair-group method with arithmetic averages) in NTSYSpc v.2.1 [45]. To test for isolation by distance among the 20 C. oryzae populations, the geographic distances between them were calculated using the google-maps-distance-calculator (www.daftlogic.com/projects-google-maps-distance-calculator.htm) (Additional file 1: Table S3). The correlation between geographic and genetic distances was then analyzed with a Mantel test using the MXCOMP program in NTSYSpc v.2.1.
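
Pairwise geographic distances of the kind used for the Mantel test can also be computed directly from site coordinates. The sketch below uses the haversine great-circle formula as a stand-in for the online calculator; the site names and coordinates are hypothetical placeholders, not the sampling locations of this study, and a spherical Earth of radius 6371 km is assumed.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = radians(lon2 - lon1)
    a = sin(d_phi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(d_lambda / 2) ** 2
    return 2.0 * EARTH_RADIUS_KM * asin(sqrt(a))


def distance_matrix(sites):
    """Return (site names, symmetric matrix of pairwise distances in km)."""
    names = list(sites)
    matrix = [[haversine_km(*sites[a], *sites[b]) for b in names] for a in names]
    return names, matrix


# Hypothetical sites as (latitude, longitude); replace with real coordinates.
sites = {"P1": (0.0, 0.0), "P2": (0.0, 1.0), "P3": (1.0, 0.0)}
names, geo_km = distance_matrix(sites)
```

The resulting matrix has the same layout as a genetic distance matrix, so the two can be compared row for row once the populations are listed in the same order.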