Background

Phlebotomus papatasi sand flies are vectors of Leishmania major parasites, the causative agents of cutaneous leishmaniasis in the Middle East and North Africa. A wide geographical range, the extensive use of insecticides, climate change, wars and natural catastrophes could all affect the population dynamics of vectors of infectious diseases [1,2,3,4,5,6]. Like most other sand flies, P. papatasi has received little attention from population geneticists. Molecular genetic studies on this species using various markers have been documented [7,8,9,10], but no new microsatellites have been developed apart from five polymorphic markers developed by our group in 2006 [11,12,13].

Owing to their high polymorphism information content and fast mutation rate, microsatellites have been used successfully for population analysis of various insects, including sand flies such as P. papatasi [13,14,15,16,17,18,19,20,21]. Like other nuclear DNA markers, microsatellites found in expressed sequence tags (ESTs) are of great value because they represent a set of functional markers. The high mutation rates and simple Mendelian inheritance of these loci make them appropriate for investigations of population dynamics, breeding patterns and phylogeny [22, 23]. Although selection can be expected to operate on a small percentage of EST markers, this drawback can be largely overcome by using a sufficient number of markers. In addition, markers shown to be under selection (non-neutral) should be removed from the analysis.

Research based on EST analysis has suggested that microsatellites in some organisms are more frequent than expected, show a reduced occurrence of null alleles, and are highly transferable to other species [24, 25]. In this study, we describe the identification of a new panel of 14 polymorphic microsatellites based on our previously mined P. papatasi EST simple sequence repeats [16].

Methods

One hundred and one flies originating from 19 locations in six countries were analyzed, including two laboratory colonies and one field population from Egypt, one laboratory colony and seven field populations from Turkey, two field populations from Tunisia, three field populations from Iran, two field populations from Afghanistan, and one laboratory colony from Cyprus. DNA was extracted from five individual flies in each population using a DNA extraction kit (Invitrogen, Carlsbad, CA, USA), following the manufacturer’s instructions. The EST primers were selected from a list previously mined by our group [16] according to the following criteria: a number of tandem repeat motifs ≥ 5, no compound motifs, and loci located on different contigs to avoid linkage disequilibrium.
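The three selection criteria above can be illustrated with a minimal Python sketch. The record layout, the locus and contig names, and the rule of keeping the first qualifying locus per contig are all assumptions made for illustration; the source does not describe how candidates were stored or how one locus was chosen when several shared a contig.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SsrCandidate:
    locus: str          # hypothetical locus name
    contig: str         # hypothetical contig identifier
    repeat_region: str  # sequence of the repeat tract only


def perfect_repeat(region, min_repeats=5):
    """Return (motif, count) if `region` is a single motif (1-6 bp)
    repeated at least `min_repeats` times, otherwise None.
    Compound tracts such as (AC)5(GT)5 return None."""
    for size in range(1, 7):
        if len(region) % size:
            continue
        motif = region[:size]
        count = len(region) // size
        if count >= min_repeats and motif * count == region:
            return motif, count
    return None


def select_loci(candidates, min_repeats=5):
    """Keep simple repeats with enough units, at most one per contig
    (assumption: the first qualifying candidate on a contig is kept)."""
    selected = []
    seen_contigs = set()
    for cand in candidates:
        if cand.contig in seen_contigs:
            continue
        if perfect_repeat(cand.repeat_region, min_repeats) is None:
            continue
        seen_contigs.add(cand.contig)
        selected.append(cand)
    return selected


# Made-up candidates, not data from the study.
candidates = [
    SsrCandidate("locus_A", "contig_1", "ACACACACAC"),            # (AC)5: kept
    SsrCandidate("locus_B", "contig_1", "GTGTGTGTGTGT"),          # same contig as locus_A
    SsrCandidate("locus_C", "contig_2", "ACACACAC"),              # (AC)4: too few repeats
    SsrCandidate("locus_D", "contig_3", "ACACACACACGTGTGTGTGT"),  # compound (AC)5(GT)5
    SsrCandidate("locus_E", "contig_4", "CAGCAGCAGCAGCAGCAG"),    # (CAG)6: kept
]
selected = select_loci(candidates)
print([c.locus for c in selected])  # ['locus_A', 'locus_E']
```

Restricting the panel to one locus per contig is a cheap proxy for physical independence; it does not replace the formal linkage disequilibrium test applied to the genotypes.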

The PCR reactions were carried out in a 25 μl reaction mixture containing 2.5 μl 10× PCR buffer, 0.5 μl dNTP mixture, 0.15 μl of TaKaRa Taq, 1.2 μl of template DNA, and 0.5 μM of each primer. For PCR amplification, DNA was denatured at 94 °C for 5 min, followed by 35 cycles (94 °C for 45 s, annealing for 40 s, 72 °C for 45 s) and a final extension at 72 °C for 7 min. Polymorphisms were evaluated by separating PCR products on a high-resolution 3.5% MetaPhor agarose gel (Lonza, Rockland, ME, USA). For accurate sizing of the polymorphic PCR products, the forward primers were labeled with 5′ fluorescent dyes (D2-D4). The PCR products were then run on the automated CEQ™ 8000 sequencer (Beckman Coulter, Fullerton, CA, USA), and the fragment sizes were determined using its fragment analysis tool.

Estimates of heterozygosity, inbreeding coefficient (FIS), and allele counts were obtained using the software package FSTAT version 2.9.3.2 [26]. Because null alleles can lead to overestimation of FIS values, the Bayesian individual inbreeding model (IIM) implemented in the program INEST 2.0 [27, 28] was used to estimate the presence of null alleles and inbreeding coefficients simultaneously. INEST was run using the nfb (null alleles, inbreeding coefficients, and genotyping failures) and nb (null alleles and genotyping failures) models to detect inbreeding effects in our dataset. The number of cycles (MCMC iterations) was set to 500,000 and the ‘burn-in’ to 50,000. Tests for Hardy-Weinberg equilibrium and linkage disequilibrium were performed using the GenAlEx package [29].

Results and discussion

Out of 721 potential microsatellites mined in our previous work [16], 85 primer pairs were selected and optimized. Thirty-four primer pairs successfully amplified the target sequence and generated a single band of the correct size in preliminary screening by agarose gel electrophoresis. A total of 14 microsatellite markers were found to be polymorphic when tested on P. papatasi flies from different countries (Table 1).

Table 1 Primer sequences and locus characteristics

For all loci, the expected heterozygosity (He) was higher than the observed heterozygosity (Ho), ranging from 0.083 to 0.514 (Table 2), which suggests a heterozygote deficiency of the kind reported previously for P. papatasi microsatellites [11]. A gap between Ho and He values can reflect null alleles, isolation, genetic drift, population substructuring (Wahlund effect) or inbreeding [30]. In this dataset, the gap may be due to high inbreeding, as indicated by the relatively high positive FIS values calculated by the FSTAT and INEST 2.0 programs.
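To make the relationship between Ho, He and FIS concrete, the following Python sketch computes the three quantities for a single locus in a single sample. It uses Nei's unbiased gene diversity and the simple ratio FIS = 1 - Ho/He, which is a simplification: the estimators implemented in FSTAT may differ, and the genotypes shown are invented for illustration rather than taken from the study.

```python
from collections import Counter


def heterozygosity_stats(genotypes):
    """Return (Ho, He, FIS) for one locus.

    `genotypes` is a list of (allele_a, allele_b) tuples, one per
    diploid individual, with alleles coded as fragment sizes.
    """
    n = len(genotypes)
    if n < 2:
        raise ValueError("need at least two genotyped individuals")

    # Observed heterozygosity: fraction of individuals with two different alleles.
    ho = sum(a != b for a, b in genotypes) / n

    # Allele frequencies over the 2n gene copies.
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = 2 * n
    sum_p2 = sum((c / total) ** 2 for c in counts.values())

    # Nei's unbiased expected heterozygosity (gene diversity).
    he = (total / (total - 1)) * (1.0 - sum_p2)

    # Simple inbreeding coefficient; positive values mean heterozygote deficit.
    fis = 1.0 - ho / he if he > 0 else float("nan")
    return ho, he, fis


# Invented genotypes: four homozygotes and one heterozygote.
sample = [(100, 100), (100, 100), (104, 104), (104, 104), (100, 104)]
ho, he, fis = heterozygosity_stats(sample)
print(f"Ho={ho:.3f} He={he:.3f} FIS={fis:.3f}")  # Ho=0.200 He=0.556 FIS=0.640
```

A positive FIS computed this way cannot by itself distinguish inbreeding from null alleles or a Wahlund effect, which is why a model-based comparison is needed to separate these explanations.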

Table 2 Summary of descriptive statistics of P. papatasi microsatellite markers

The deviance information criterion (DIC) calculated for the “nfb” model (23,612.759) was lower than that for the “nb” model (24,696.659), supporting the inbreeding model and its strong effect rather than the null allele model (Additional file 1: Table S1 and Additional file 2: Table S2).
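The model comparison amounts to picking the model with the smallest DIC and reporting the difference to the alternative. A minimal Python sketch using the two DIC values reported above follows; the dictionary layout is only an illustrative assumption and is not INEST output.

```python
# DIC values as reported in the text; lower DIC indicates the preferred model.
dic = {
    "nfb": 23612.759,  # null alleles + inbreeding + genotyping failures
    "nb": 24696.659,   # null alleles + genotyping failures (no inbreeding)
}

best_model = min(dic, key=dic.get)
delta_dic = {model: value - dic[best_model] for model, value in dic.items()}

print(best_model)                  # nfb
print(round(delta_dic["nb"], 3))   # 1083.9
```

The difference of roughly 1,084 DIC units between the two models is what underlies the statement that including the inbreeding term improves the fit.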

One limitation of EST-SSRs is that they are generally considered less polymorphic than other microsatellite marker types. However, they are efficient and economical to develop and show a reduced occurrence of null alleles because the DNA sequences flanking SSRs in transcribed regions are relatively stable [25]. Therefore, the markers described here are very promising and can be used with confidence for population structure studies of this sand fly vector.

Three loci (PPEST73, PPEST10, and PPEST43) deviated significantly from Hardy-Weinberg expectations and should therefore be used with caution. None of the loci were in linkage disequilibrium (LD); all genotypic disequilibrium comparisons showed P-values above the adjusted 5% nominal level (0.00055). The number of alleles per locus ranged from 9 to 29; the high number of alleles observed in our study is likely due to the higher resolution of fluorescence-based genotyping and the inclusion of many field-caught flies. These markers may be transferable to other species. However, transferability should be tested across sand fly species to extend the usefulness of these markers to interspecies studies.
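The source does not state how the 0.00055 level was derived. One reading that reproduces the number is a Bonferroni-style adjustment of the 5% level for all pairwise comparisons among the 14 loci; the short Python sketch below shows that arithmetic under this assumption.

```python
from math import comb


def adjusted_nominal_level(n_loci, alpha=0.05):
    """Return (number of locus pairs, alpha divided by that number).

    Assumption: a Bonferroni-style correction over all pairwise
    linkage disequilibrium tests; the source does not confirm this.
    """
    n_pairs = comb(n_loci, 2)
    return n_pairs, alpha / n_pairs


n_pairs, threshold = adjusted_nominal_level(14)
print(n_pairs)               # 91
print(round(threshold, 5))   # 0.00055
```

With 14 loci there are 91 locus pairs, so a pair would need a P-value below about 0.00055 to be declared in significant linkage disequilibrium under this correction.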

Mining EST sequences is an effective strategy for identifying functional microsatellites in P. papatasi sand flies. The polymorphic microsatellite markers discovered in this study will be useful for further population structure analysis, comparative mapping between populations or species, and determining the changes that have occurred as a result of selection.

Conclusions

Lower development costs and a lower frequency of null alleles are significant benefits of EST microsatellites, which are considered valuable and appropriate markers for future population genetic studies and comparative mapping in P. papatasi. Transferability should be evaluated in order to extend the benefits of these markers to other sand fly species.