Background

Climatic oscillations over the past few million years have had profound effects on the demography of, and patterns and levels of genetic diversity in, many extant species [14]. Temperate taxa inhabiting alpine habitats often shift their ranges along an altitudinal gradient in response to climatic changes. Therefore, populations on different mountains may experience alternating periods of isolation and connectivity [25]. Periods of isolation may result in genetic divergence among populations, whereas periods of connectivity allow for dispersal and gene flow [1, 2]. Because species' responses to climate change are influenced by interactive factors including ecology, demography and local topography, the pattern and tempo of diversification may differ between taxa and from place to place [14]. For example, the amount of time that populations are isolated and their effective population size will determine whether they sort into independent evolutionary lineages, whereas the level of divergence and ecological characteristics of the species often influence whether they remain distinct or merge back into a common gene pool upon secondary contact [2, 6]. Studies of genetic diversification in alpine regions are currently limited to a few areas [714]. Additional research in other alpine regions is therefore required to confirm the generality of existing hypotheses on how Pleistocene climatic change affected species diversification and demography.

The eastern Himalayan mountain range, which has perhaps the most complex topography on Earth, offers an unusual opportunity to study interactions among geography, climatic changes and diversification. These mountains rose rapidly with the uplift of the Tibetan Plateau during the past three million years [1518]. Mountains are characterized as a series of parallel alpine ranges climbing to altitudes over 5 000 m above sea level (m.a.s.l.) with the differences in altitude from valley to mountaintops often exceeding 2 000 m.a.s.l. This broadly altitudinal range has created dramatic ecological stratification and resulted in geographic isolation for many taxa [1921]. Tremendous variations in altitude, climate and vegetation may have created an archipelago of high elevation sky islands [22, 23]. The Elliot's laughing thrush (Garrulax elliotii) is endemic to the eastern Himalayan region [2326], where it generally occupies shrubland habitats at altitudes from 2 000 to 4 000 m.a.s.l. Since the montane habitat in this region is divided by deep valleys into distinct eco-subregions or clusters of mountains [19, 20], it is plausible that these geographic barriers restrict gene flow between G. elliotii populations thus contributing to genetic diversity in this bird.

High altitude parts of the eastern Himalayas were subject to repeated glaciations during the Pleistocene [27]. These glacial cycles and accompanying climate changes probably had a large influence on species' distribution and demography [28, 29]. The range of G. elliotii is predicted to have expanded and contracted along an altitudinal gradient resulting in historic periods of connectivity and isolation. Given the potential for repeated contacts between populations on different mountains, it is possible that populations merged into a single gene pool during historic periods of connectivity [1, 2, 6]. However, many studies in other montane regions suggest that gene flow among populations remained restricted during the Pleistocene, leading to each population remaining a distinct evolutionary lineage [814].

We used mitochondrial DNA and microsatellites to address the effects of Pleistocene climatic changes on genetic diversification in the eastern Himalayas. We first evaluate whether Elliot's laughing thrush from each eco-subregion or cluster of mountains comprises a distinct evolutionary lineage. Second, we investigate the contribution of geographic and environmental barriers to G. elliotii's distribution. Third, we test whether the timing of diversification within G. elliotii is consistent with climatic shifts induced by Pleistocene glacial cycles. Fourth, we use coalescent simulations to test whether populations from different mountains have descended from a single ancestor (single refugium), or, alternatively, remained isolated on individual mountains during down-slope expansions (multiple refugia). Finally, given that the range of G. elliotii is predicted to have expanded and contracted in response to Pleistocene climatic fluctuations, we also examine historical demography of this species.

Materials and methods

Study area and sampling sites

The eastern Himalayan mountain region occupies western and north-western Yunnan, western Sichuan, south-eastern Tibet, southern Qinghai and south-western Gansu. Steep mountain ranges and major rivers in deep valleys divide this region into several distinct eco-subregions [19, 20] (see Figure 1). Tang [24] and Zhang [30] described diverse floristic constitution of these subregions. The northern subregion, which is comprised a part of the Tibetan plateau, is dominated by a plateau cold and dry climatic condition [2427, 30]. The vegetation type is characterized by alpine meadows and grasslands. This northern subregion is considered as 'Plateau avifauna' with the endemic avian species as Pseudopodoces humilis, Montifringilla adamsi, Pyrgilauda ruficollis, Urocynchramus pylzowi and Kozlowia roborowskii [24, 31, 32]. The southern subregion, on the other hand, is influenced by much warmer and wetter climatic condition, and the vegetation is characterized by tropical, or subtropical, broadleaved forests with a mixture of evergreen and deciduous species. This subregion is represented by Himalaya-Hengduanshan Mountain avifauna with many endemics babbles (Timallinae) and pheasants [2426]. The eastern subregion consists of a cluster of mountains including of Qinling, Min, Mang and Daba mountains. The vegetation is dominated by temperate evergreen coniferous and broadleaved mixed forests. Endemic avian species include Chrysolophus pictus, Spizixos semitorques, Garrulax sukatschewi, G. davidi, Yuhina diademata, Paradoxornis przewalskii, P. paradoxus and Parus davidi [2426, 3133]. Subregion-specific species have also been observed for other taxa, such as plants [34], ants [35], fishes [36], reptiles [37] and mammals [38].

Figure 1
figure 1

Topographic map of the eastern Himalayan region including all elevations between 2 000 and 4 000 m.a.s.l.; the altitudinal range of Elliot's laughing thrush ( Garrulax elliotii ). The species' range is indicated by the thin dashed line as in [110]. Sampling locations are indicated by the abbreviation of each population's name (Table 1). The bold dashed line represents the boundary between recognized ecological subregions of the eastern Himalayan mountain range [20]. Insert: The shaded area represents the study region.

These three subregions are further divided into thirteen biogeographic sectors corresponding to clusters of mountains [20]. The sites from which samples of G. elliotii were obtained are shown in Figure 1 and Table 1. These sample sites were distributed across the range of the species, and grouped into ten populations according to the biogeographic sectors described in Li [20].

Table 1 Nucleotide polymorphism in each population; S, number of segregating sites; Nhap, number of haplotypes; Hd, haplotype diversity; π, nucleotide diversity

Mitochondrial DNA amplification and sequencing

DNA was extracted from tissue samples using the DNeasy Tissue kit (QIAGEN). The cytochrome b (cyt-b), cytochrome c oxidase subunit I and II (COI and COII), NADH dehydrogenase subunit 2 and 6 (ND2 and ND6), ATP synthetase subunit 8 (ATP8) and control region (CR) genes were amplified via the polymerase chain reaction (PCR) [3944]. The same primers were used for amplification and sequencing reactions. Methods of purification and sequencing were as described in Qu et al. [45]. All sequences are accessible at GenBank (Accession nos: HM601543-602022).

Microsatellite genotyping

PCR amplifications were achieved in a 10 ul reaction volume containing 10-30 ng DNA, 0.5 U Taq DNA polymerase, 0.2 nmol dNTPs, 1-3 nmol MgCl2 and 5 pmol primers. These loci were originally developed for the following passerine species: Hirundo rustica (HrU5-A) [46], Loxia scotica (LOX1) [47], Catharus ustulatus (Cum02, Cum10, Cum28 and Cum32) [48], Poecile atricapillus (Pat43) [49], Phylloscopus occipitalis (Pocc6 and Pocc8) [50] and Dendroica petechia (Dpu16) [51]. Original PCR conditions were optimized with variable MgCl2, template DNA concentrations and annealing temperatures. Microsatellites were amplified with fluorescently labelled forward primers (FAM and HEX dyes). Fragment lengths were analyzed with the internal size marker GENESCAN-500 ROX (Applied Biosystems), and scored with GENEMARKER 3.7 (SoftGenetics).

Genetic polymorphism

For mitochondrial data, we calculated the number of segregating sites, haplotype diversity and nucleotide diversity for each population and all populations combined using DnaSP 4.0 [52]. McDonald & Kreitman's [53] test was used to examine the selective neutrality of the mitochondrial DNA protein-coding fragments. Two additional neutrality tests, Fu's FS [54] and Fu & Li's [55]D were used to detect departures from the mutation-drift equilibrium that would be indicative of changes in historical demography and natural selection. All three tests were implemented in DnaSP.

For microsatellite data, parameters such as the number of alleles per locus (NA), average allelic richness (AR), observed heterozygosity (HO), heterozygosity expected from the Hardy-Weinberg assumption (HE) and exact tests of linkage disequilibrium (LD) between pairs of loci for each population were calculated with ARLEQUIN 3.5 [56] and FSTAT 2.9.4 [57].

Phylogeography and population structure

The maximum-likelihood algorithm implemented in PHYML 2.4.4 [58] was used to reconstruct phylogenetic relationships among mitochondrial haplotypes. Based on the Akaike Information Criterion (AIC) [59], we used the model GTR+I+G as suggested by Modeltest 3.7 [60]. We set the proportion of invariant sites and the shape of Gamma distribution as 0.43 and 1.048, respectively (estimated by Modeltest). The base frequency and ratio of transition/transversion were optimized by the maximum-likelihood criterion in PHYML. Outgroups were the black-faced laughing thrush (G. affinis), exquisite laughing thrush (G. formosus) and streaked laughing thrush (G. lineatus), which are considered closely related to G. elliotii [61].

We used two methods to identify genetically distinct groups among microsatellite genotypes. First, genetic relationships between individual samples were quantified using Nei's standard genetic distance in GENETIX 4.05 [62]. The resultant matrix was then converted into a dendrogram using the Neighbour-Joining algorithm [63] provided with PHYLIP 3.5 [64] and graphically displayed with TREEVIEW 1.6 [65]. Second, we used STRUCTURE 2.3.2 [66] with a burn-in of 5 × 107 and 5 × 108 iterations without prior population information and following the admixture model. We conducted ten replicate runs for each value of K (most likely number of populations) from 1 to 10. The most likely K was identified using the maximal values of Pr(X/K), typically used for STRUCTURE analysis [66] and ΔK based on the rate of change in the posterior probability of data between successive K-values [67].

Landscape genetics and environmental variables correlation analyses

Suitable habitats for G. elliotii were identified in order to investigate whether intraspecific lineages were separated by areas of unsuitable environmental conditions. The habitat-mapping model was developed in ARCGIS 9.0 (Environmental Systems Research Institute) using vegetation, topographic layers and the known distribution range of G. elliotii. Shrublands were identified as suitable vegetation types [2426]. We consider an area suitable if these vegetation types are intersected by suitable altitude distribution (2 000 to 4 000 m.a.s.l.). These constructed areas were then projected onto the current distribution of G. elliotii to generate a map of suitable habitats. Gaps between patches of suitable habitat were considered habitat gaps. GENELAND 1.0.5 [68], a computer program in the R-PACKAGE [69], was implemented to verify our definition of G. elliotii populations and locate areas of genetic discontinuity. GENELAND used microsatellite genotype data together with geographic information (the locations from which individuals were sampled) to estimate population structure. Areas of genetic discontinuity were detected as geographic areas with low posterior probability of membership. Following the approach of Coulon et al. [70], we allowed K to vary (from 1 to 10) and inferred the most probable K using five replicate runs with 5 × 105 Markov chain Monte Carlo (MCMC) iterations. For these analyses the maximum rate of the Poisson process was fixed at 100 with no uncertainty in the spatial coordinates, and the maximum number of nuclei in the Poisson-Voronoi tessellation was fixed at 300. The Dirichlet model was used to estimate allele frequencies.

Additionally, a distance-based redundancy analysis [71, 72] was used to examine whether environmental factors might explain genetic diversification among populations. In the case of mitochondrial data, we used ten locations as spatial units and the uncorrected p distance and pairwise ΦST as the measures of genetic differentiation. In the case of microsatellite data, genetic distance was measured as the FST and Nei's Ds. An array of predictor variables was grouped into six sets: (i) coordinate (latitude and longitude); (ii) vegetation (alpine steppe; Tibet alpine meadow & shrublands; subalpine grassland & shrublands; temperate grasslands & shrublands and subtropical shrublands); (iii) subregion (southern; northern and eastern); (iv) elevation; (v) temperature (mean annual temperature; mean January temperature and mean July temperature); (vi) mean annual rainfall. All predictor variables except vegetation and subregion were continuous, therefore these were treated as categorical variables with two states; 1 if a sample was located in a given vegetation type or subregion, and 0 if it was located in another vegetation type or subregion. Thus, each vegetation type or subregion was regarded as a vector corresponding to five vegetation types and three subregions, respectively. Vegetation or subregion was analyzed as a set, i.e. these variables were combined in a single test.

Environmental variables were taken from the WWF Terrestrial Ecoregion and WorldClimate (http://www.worldclimate.com) databases. Subregions were defined according to Li [20]. In order to identify variables correlated with genetic distance, individual sets of predictor variables were analyzed with the marginal test in DISTLM 5.0 [73]. The P values in these analyses were obtained using 9 999 simultaneous permutations of the rows and columns of the distance matrix. To examine which subset of predictor variables provided the best model explaining genetic differentiations among G. elliotii populations, the forward selection procedure in the program DISTLM forward [74] was performed on all sets of variables. The forward selection procedure consists of sequential tests, fitting each set of variables one at a time, conditional on the variables already included in the model. To examine whether the tested variables were correlated, output files also included information on the correlations among all pairs of explanatory variables. This provided a further check on potential multi-collinearity issue. As in the previous analysis, P values were obtained using 9 999 permutations of the rows and columns of the multivariate residual matrix under the reduced model.

Divergence time estimate

In order to overcome the limitations of conventional estimates of divergence time based on FST values (e.g. estimation of divergence time assuming isolation without gene flow) [7577], we employed a coalescent-based MCMC simulation to estimate the divergence times for the main mitochondrial lineages in the program IM [78, 79]. As recommended by Hey & Nielsen [78] we first performed multiple runs, with an increasing number of steps and using wide priors and heating schemes to ensure that the complete posterior distribution could be obtained. We finally performed four independent runs of 5 × 108 steps with a burn-in of 5 × 107 steps and a linear heating scheme (g1 = 0.05). IM was run under the HKY substitution model (estimated by Modeltest 3.7). Convergence on stationary distributions was assessed by monitoring the similarity of posterior distributions from independent runs and by assessing the autocorrelation parameter values over these runs [80]. The peaks of the resulting posterior distributions were taken as estimates of parameters [81]. Credibility intervals were recorded as the 90% highest posterior density (HPD). Because no fossil data were available to calibrate the mutation rate, we assumed a conventional molecular clock for the avian mitochondrial DNA cytochrome b gene (1 × 10-8 per site per year) [82, 83]. The mutation rate was modulated by multiplying the ratio of the average distance for the combined sequences versus that for cyt b alone to deduce the mutation rate for all fragments combined. We used this modified mutation rate to convert the divergence time estimate into calendar years.

For microsatellite data, timing of population splits was estimated using the IMa2 [81] with burn-ins of 5 × 107 and 5 × 108 iterations under the SSM model with initial short runs to provide prior estimates. A geometric heating scheme was employed and the program run four times with different random seed numbers to ensure convergence of parameter estimates. The IMa2 output included mean values of the parameters for two sets after the burn-in period, representing the first and second half of the total generations of the post burn-in run. We ran our analyses until the peak location set values were equal for both sets. This, along with the high effective sample size and low autocorrelation estimate, suggested the Markov chains reached convergence after a sufficiently long burn-in period and were sampled from the appropriate likelihood space. A generation time of 1 year [26] and a mutation rate of 10-5 to 10-4 per generation for birds [8489] were used to convert divergence time into calendar years. Given the uncertainty in mutation rates, the resultant estimates should be interpreted with caution.

Historical biogeography

To determine whether populations remained isolated in individual habitats (multiple refugia), or merged into a single gene pool (single refugium) during down-slope dispersal, we used coalescent simulations in MESQUITE 2.5 [90] to test three phylogeographic models of diversification. The first hypothesis posited that all populations derive from a single source population (single-refugium hypothesis). The second hypothesis postulated that populations were isolated into southern and north-eastern refugia (two-refugia hypothesis). The geographic structure was inferred from mitochondrial data. A third, three-refugia hypothesis, which was consistent with three-subregion division and inferred from microsatellite data (see RESULTS), predicted that populations derive from three independent refugia.

For coalescent simulations, we first estimated Ne for individual populations of G. elliotii using values for θ calculated in MIGRATE 2.3.2 [91] under the following parameters: 10 short chains (500 trees used out of 10 000 sampled) and three long chains (5 000 trees used out of 100 000 sampled) with four adaptive heating chains (1, 3, 5 and 7). Maximum-likelihood estimates (MLE) were calculated under a stepping-stone model four times to ensure convergence upon similar values for θ. We converted θ to Ne using the equation for maternally inherited mitochondrial data θ = 2 Neμ, using same mutation rate in divergence time estimate. The estimates of Ne for all populations were summed to calculate Total Ne and scaled the branch widths of our hypothesized population trees using the proportion of Total Ne that each population comprised. The N e of the refugial population was constrained to a size proportional to the relative N e of the population sampled from the sites of the putative refugia. Thus, if samples from the site of the putative refugium had an effective population size of one-third that of the Total Ne, we would constrain the population size in the simulation prior to population expansion to one-third the total Ne. During coalescent simulations, both Total Ne and lower and upper bounds of the 95% CI for N e were used as model parameters in order to encompass a wide range of potential Ne values The divergence time specified for each model corresponded to the approximate age estimated using IM and IMa2 as follows: 0.1 mya for the southern and north-eastern populations (two-refugia model), 0.015 mya for northern and eastern populations (three-refugia model). In the single refugium model, we specified divergence time as 0, which assumed a panmixia population derived from a single refugium. For converting coalescent time (in generations) to absolute time, we assumed a generation length of 1 year [26].

The amount of discordance between a gene tree and population model was measured by S, the minimum number of sorting events required to produce the genealogy within a given model of divergence [92]. The S value is a measure of the number of parsimony steps in characters for a reconstructed gene tree, such that more discordance between the population model and gene tree leads to a higher S value. To obtain a distribution of S values, 10 000 gene trees were simulated, constrained within each model of population divergence under a neutral coalescent process, and the amount of discordance between the simulated genealogy and population model was determined. Overall, this produced a null distribution of 10 000 S values based on reconstructed genealogy for each model of population divergence. The S value for observed genealogy was calculated and compared to the distributions of S values from coalescent simulations to determine whether the observed genealogy could have been generated under a given model.

We also examined the ancestral area distributions for deeper nodes of the mitochondrial haplotype phylogeny with an event-based method in DIVA 1.1 [93]. Each individual in the phylogeographic analysis was coded to one of the ten biogeographic sectors. We used an equal likelihood for the rate of change among sectors for estimating ancestral areas because we had no information on the dispersal rate among sectors.

Historical demography

The past population dynamics of mitochondrial lineages of G. elliotii were estimated using the Bayesian skyline plot method implemented in BEAST 1.4.6 [94]. This approach incorporates uncertainty in the genealogy using MCMC integration under a coalescent model, in which the timing of dates provides information about effective population sizes through time. Chains were run for 100 million generations, and the first 10% of which were discarded as 'burn-in'. The substitution model was selected according the result of Modeltest 3.7. We applied 10 grouped coalescent intervals and constant growth rate for the skyline model. The same mutation rate in divergence time estimate was used. Pilot analyses showed that the ucld.stdev parameters were close to zero, thus a strict clock model was employed. Demographic history through time was reconstructed using Tracer 1.4 [95].

The exponential growth rate (g) was also estimated for each mitochondrial lineage by FLUCTUATE 1.4 [96]. FLUCTUATE was initiated with a Watterson [97] estimate of theta (θ) and a random topology, performing 10 short chains, sampled every 20 genealogies for 200 steps, and two long chains, sampled every 20 genealogies for 20 000 steps. FLUCTUATE analyses were repeated five times and the mean and standard deviation of g calculated from the results of these separate runs. Because this genealogical method yields estimates of g with an upward bias [96], we corrected g values following the conservative approach of Lessa et al. [98] and only considered the g value indicative of population growth when g > 3 SD (g).

Results

Genetic polymorphism

A total of 4 029 bp of mitochondrial DNA was obtained, which contained 170 polymorphic sites, 125 of which were parsimony informative. These polymorphic sites defined 67 unique haplotypes, 58 of which occurred only once. Seven haplotypes were shared among individuals within the same population and two were shared between neighbouring populations. Within sampling locations, haplotype diversity values were nearly maximal, from 0.905 to 1, and nucleotide diversity ranged from 0.0028 to 0.005 (Table 1). None of the coding regions sequenced deviated significantly from neutrality (McDonald & Kreitman's test, P > 0.05, Table 2). The results of Fu's FS test and Fu & Li's D test indicated an over-abundance of singleton mutations and rare alleles for most fragments (negative D or FS values with P < 0.02, Table 2). Contradictory results between two neutrality tests and the McDonald & Kreitman's test suggest that intraspecific polymorphism of these sequences may have been mainly influenced by historical demography rather than selection.

Table 2 Nucleotide polymorphism and results of neutrality tests of each gene and all genes combined

For microsatellite data, there were 1 to 9 alleles per locus across all populations, and observed (HO) and expected heterozygosity (HE) ranged from 0.01 to 0.909 and 0.091 to 0.905, respectively. There was no evidence of linkage disequilibrium after adjusting the significance level for multiple comparisons (P < 0.001). Five locus-population combinations showed significant deviation from the Hardy-Weinberg expectation after Bonferroni correction, which involved four populations and all due to heterozygosity deficiency (Table 3).

Table 3 Summary of genetic variation at ten microsatellite loci for G. elliotii populations

Phylogeography and population structure

A reconstructed maximum-likelihood tree of mitochondrial haplotypes suggested that G. elliotii was composed of two major clades with support rates of 82% and 71%, respectively (1000 replicates). The geographic distributions of the two clades appeared uneven, with the majority of first clade's haplotypes found mainly in populations in the northern and eastern subregions (north-eastern clade), whereas haplotypes of second clade were most common in southern subregion (southern clade) (Figure 2). Time to most recent common ancestor (TMRCA) for both clades fell within the late Pleistocene glacial period (Marine Isotope Stage, MIS6): 0.139 (95% HPD, 0.093-0.194) and 0.149 (95% HPD, 0.097-0.205) mya for north-eastern and southern clades, respectively.

Figure 2
figure 2

Maximum-likelihood reconstructed phylogenetic tree of G. elliotii mitochondrial haplotypes. Branch support values are shown above nodes.

Both Neighbour-Joining and STRUCTURE analyses detected similar population structures in the microsatellite data. Samples from the southern, northern and eastern subregions separated into three distinct clusters. The first cluster (southern lineage) was comprised of two populations from the southern subregion (DL and ZD), the second cluster (eastern lineage) consisted of the five eastern populations (WX, QL, HB, BC and YA), and the third cluster (northern lineage) included the three northern populations (CD, MK and RT) (Figure 3 and 4). Yet, STRUCTURE indicated that populations from the eastern subregion showed minor admixture with samples from the northern subregion (Figure 4).

Figure 3
figure 3

NJ tree based on a matrix of Nei's genetic distance between G. elliotii individuals (calculated by GENETIX 4.05) [62]. Colours indicate the distributions of individuals in relation to their geographical locations; green represents the southern subregion, blue the northern subregion, and yellow the eastern subregion.

Figure 4
figure 4

Bar plot derived from Bayesian-based cluster analysis implemented in STRUCTURE 2.3.2. [66]. Each column along the × axis represents one G. elliotii individual grouped by locations in the same order as in Table 1. The Y-axis represents the assignment probability of each individual into three clusters (K = 3).

To assess the genetic admixture, we carried out an exclusion method implemented in the program GENECLASS 2.0 to identify potential admixture individuals. Using the previously determined three genetic clusters and geographic sampling locations as prior population information, GENECLASS identified a considerable proportion of admixture individuals (25%), most of which were assigned to the geographically intermediate populations MK, YA and BC (method and result see Additional file 1, Table S1).

Landscape genetics and environmental variables correlation analyses

When geo-referenced microsatellite genotypes were analysed using GENELAND, we found that three Bayesian population clusters received the highest probable support (55% of estimated K-values from GENELAND) (Figure 5). Additionally, GENELAND revealed that a three-subregion structure was the most probable subdivision of G. elliotii, which verified the population structure inferred by STRUCTURE and the NJ tree. Each identified cluster was spatially contiguous and areas of steep turnover in posterior probability of population membership were presumed to reflect barriers to gene flow. Geographic information system (GIS) revealed that the three clusters are separated from adjacent clusters by areas where environmental conditions are unsuitable for G. elliotii. Comparing genetic divergence with spatial landscape pattern shows that gene flow barriers coincide with the spatial distribution of habitat gaps (Figure 5a).

Figure 5
figure 5

Population structure of G. elliotii revealed by GENELAND 1.0.5 [68]. (a) Map of suitable habitats of G. elliotii. Blue areas indicate suitable habitats that include alpine steppe, alpine and subalpine grasslands and shrublands, while green and yellow areas indicate suitable habitats that include temperate and subtropical grasslands and shrublands, respectively. (b) Northern group. (c) Eastern group. (d) Southern group. Probability of population membership increases as shading intensity decreases; solid circles show the sampling locations.

The marginal tests revealed that three sets of environmental variables, subregion, vegetation and coordinate, were significantly correlated with genetic distance of both mitochondrial and microsatellite data. The forward selection procedure that classified variables according to the proportion of explained variation also recognized these variables sets as important variables in the multiple regression models. A combination of the three sets explained 77 to 97% of the genetic variation between locations (Table 4).

Table 4 Effects of environmental variables on genetic differentiation of G. elliotii populations based on mtDNA and microsatellite data analyses

Divergence time estimate

As the four runs of IM and IMa2 gave similar results, we report below the estimates from the run with the highest effective sample size (ESS) for the parameter t (the parameter with the lowest ESS value in every run). For mitochondrial data, posterior probability for divergence time peaked at time t = 3.415 (90% HPD, 2.765-4.3345). When converted to a scale of years, the divergence time between north-eastern and southern lineages was estimated to be 0.109 (0.088-0.138) mya. The migration parameters estimated by IM (m1 and m2), which represented the number of migrants per mutation (m = m/μ), were converted to population migration rates (M = 2 Nm = θm/2). The migration rate per generation was estimated to be 0.43 (0.09-0.65) from the southern to north-eastern lineage, and 0.37 (0.07-0.63) from the north-eastern to southern lineage.

For microsatellite data, the posterior probability of parameter t peaked at 0.975 (0.185-2.325) for the southern and north-eastern lineages, and at 0.15 (0.045-0.315) for the northern and eastern lineages. When converted to a scale of years, the divergence time between the southern and north-eastern lineages was estimated from 0.097 (0.018-0.232) to 0.01 (0.002-0.023) mya. For the northern and eastern lineages, divergence time was estimated between 0.015 (0.004-0.032) and 0.0015 (0.0005-0.003 2) mya. The migration rate per generation (2 Nm) was estimated at 0.83 (90% HPD, 0.48-1.8) from the north-eastern to southern lineage, and 1.01 (0.38-2.2) from the southern to north-eastern lineage. From the eastern to northern lineage, the migration value was estimated to be 1.34 (0.56-1.8), and 0.33 (0.02-1.14) from the northern to eastern lineage.

Historical biogeography

All gene trees were simulated within population trees with an effective population size of Ne = 2 994 038 (95% CI: 2752885 - 3235192), which equated to a MLE estimate of θtotal of 0.0467 with lower and upper bounds of 0.0429 and 0.0505, respectively. For the observed gene tree we computed Slatkin & Maddison's S = 40. The discordance predicted by coalescent simulations rejected the single refugium hypothesis (mean S = 47, SD = 4, P < 0.05) across a range of Ne. However, coalescent tests supported both the two-refugia (mean = 42, SD = 5, P = 0.16) and the three-refugia divergence models (mean = 40, SD = 4.5, P = 0.3) in three Ne values.

Reconstruction of ancestral areas indicates that the north-eastern clade might have originated from YA and MK, while the southern clade probably derived from ZD and BC (Figure 6). Relatively high probabilities for multiple areas being the ancestral areas at the deeper nodes in the tree supported results from the coalescent simulation that suggest diversification occurred via multiple refugia.

Figure 6
figure 6

Reconstructed ancestral areas for main internal nodes of the maximum-likelihood mitochondrial haplotype phylogeny using DIVA.

Historical demography

The historical population trend inferred by the Bayesian skyline plot seems a relatively good fit to the climate trend since the late Pleistocene glaciations (Figure 7). Past population dynamics of two mitochondrial lineages coherently indicate continuous population growth over the last 0.125 mya. Log transformed theta values were around 0.02 near the end of the MIS6, after which populations began to increase rapidly until they reached their current sizes (theta around 3.5). Recent population growth is also supported by maximum-likelihood estimates of positive growth rates (corrected g: southern clade: 1604; north-eastern clade: 3226).

Figure 7
figure 7

Bayesian skyline plot representing historical demographic trends in two mitochondrial DNA lineages of G. elliotii. The × axis is in units of millions of years ago and is estimated based on a rate of 0.78 × 10-8 substitutions per site per year. The Y axis is equal to N fe τ (the product of the effective female population size and the generation time in years (log transformed)). Estimates of means are joined by a solid line while the dashed lines delineate the 95% HPD limits. Blue represents the north-eastern lineage and green the southern lineage. The abbreviation MIS refers to marine isotope stage.

Discussion

Pattern of distribution and biogeography

Geographic complexity and environmental heterogeneity are likely to have shaped the genetic structure among G. elliotii populations. Such a landscape effect is evident in the eastern Himalayan region. Although G. elliotii has a rather restricted geographic distribution, mitochondrial data support the separation of a southern and a north-eastern lineage with incomplete gene sorting, while microsatellites indicate a clear subdivision into three lineages, a southern, a northern and an eastern. Despite intermixing of mitochondrial haplotypes, none of the haplotypes are geographically widespread. All shared haplotypes occur within the same subregional population. In this case, the steep mountains and deep valleys of the Kangding-Muli-Baoxin Divide, recognized as a geographic barrier by Zhang et al. [19] and Li [20], appear to effectively have prevented gene flow among subregional populations. Across the entire sampling area, the correlation between phylogeographic pattern and subregion division is strong. Furthermore, our regression model reveals that the general pattern of subdivision could be attributed to geographic and environmental differentiations, e.g. subregion and vegetation.

Mountains, like sky islands, and are separated by areas of unsuitable habitats that act as barriers to gene flow [14, 15]. Species restricted to sky islands commonly have high levels of interpopulation genetic divergence [815]. In the eastern Himalayas, G. elliotii is restricted to shrublands in the three subregions separated by deep valleys. The habitat-mapping model shows that these subregions or lineages are separated by areas where the environmental conditions are unsuitable for G. elliotii. Consistent with the expectation that these valleys prevent or restrict gene flow, each subregional population represents an evolutionary lineage. Thus, the contrast in environmental conditions at high and low elevations in the eastern Himalayas may have created a "sky island situation" for G. elliotii.

Although many sky island species in other regions show high levels of isolation in clusters of mountains [714], our data indicate that diversification of G. elliotii has occurred on a broader geographic scale, eco-subregions. G. elliotii is an alpine bird found between 2 000 and 4 000 m.a.s.l. [2426]. This broad altitudinal range implies large ecological plasticity. Considering that most previously studied sky island species belong to less mobile groups such as beetles and plants, this difference in scale probably reflects the fact that historical processes of isolation are more easily reconstructed in species with less dispersal capability than in highly mobile species [14].

While our study may demonstrate newly discovered sky island diversification in a previously unstudied region, sky islands also have the extra connotation that mountain ranges are equivalent ecologically and they share the same biogeographic history. Since three eco-subregions in the eastern Himalayas present different ecological systems, our result should be considered cautiously. Whether the eastern Himalayan region is a sky island situation ultimately depends on the ecology of the organisms under study. Further research on more organisms inhabiting this region, especial the same eco-climatic subregions is required to clarify this.

Effects of Pleistocene glaciations on population divergence

The Pleistocene glacial cycles have been considered to induce environmental shifts in the eastern Himalayas resulting in the isolation of G. elliotii populations on different subregions. Congruent with this, our divergence time estimates indicate that lineage diversification occurred during the late Pleistocene interglacial period (MIS5). Climatic fluctuations during the late Pleistocene resulted in several glacial-interglacial cycles in the eastern Himalayas [27, 28, 99101]. Glaciations were restricted to relatively high altitudes and did not affect the lower slopes or valleys [27]. Palaeoclimatic and palynological studies from this region reveal a vegetational shift over the Pleistocene glacial cycles [102, 103]; during the glacial periods, cool-temperate vegetations, such as shrublands, expanded to lower elevations, and contracted to high elevations during warmer and wetter interglacial periods [104, 105]. Because alpine birds are often strongly associated with their preferred habitat, range expansion and contraction of shrublands probably mirrored that of G. elliotii. It is plausible therefore that the glacial and interglacial cycles would allow G. elliotii to disperse among subregions during glacial periods, but isolate population in different subregions during the interglacial periods.

Given the potential for repeated contacts of populations from subregions during glacial expansions, it is possible that subregional populations merged into a common gene pool. However, our analyses of diversification pattern in G. elliotii appear to reflect historical differentiation in multiple refugia. Coalescent simulations and reconstruction ancestral areas of the mtDNA phylogeny suggested a divergence model involving the multiple refugia, while microsatellite data identify a distinct genetic structure concordant with the geographically separated southern, northern and eastern subregions. These results suggested that subregional populations of G. elliotii were isolated into separated refugia during down-slope expansions, although we could not distinguish between the two-refugia and three-refugia model. Furthermore, while coalescent simulation assumes that the discordance between the reconstructed gene tree and multi-refugia population trees reflects the retention of ancestral variation, the discordance could also result from migration among subregional populations (see below).

Genetic admixture among intraspecific lineages

In contrast to the deep phylogeographic partitions found in other sky island species [814], we only found a shallow phylogeographic division in G. elliotii. This is consistent with genetic admixture among lineages occurring during glacial periods. The two partially sympatric mitochondrial lineages show shared ancestry, and microsatellite data indicate gene flow among the three subregional populations. Furthermore, GENECLASS identified a considerable proportion of admixed individuals, most of which were assigned to the geographically intermediate populations MK, YA and BC. Interestingly, despite the relatively high degree of genetic admixture, these locations were also identified as the potential refugia that most likely defined the boundary of each genetic lineage.

Based on these observations, it is probable that isolated populations of G. elliotii periodically expanded to lower elevation where they mixed. Thus, the three intraspecific lineages likely established a contact zone along their boundaries. Because these lineages are geographically relatively close and are likely to have experienced repeated glacial cycles, this has resulted in multiple episodes of genetic admixture. Over time, this could have defined a centre of gene flow, such as populations YA, MK and BC. This process might cause the mixing of genetic lineages thereby obscuring the pattern of genetic variation [42].

Nevertheless, the rather large degree of genetic separation among lineages indicates that these have been isolated from each other for a relatively long period. Although the southern and north-eastern mitochondrial lineages are intermixed, no haplotypes are shared between them. This lack of sharing suggests that the two lineages are at an intermediate stage of divergence along the continuum from complete panmixia to paraphyly and ultimately to reciprocal monophyly [106, 107]. In contrast, differences in microsatellite allele frequencies among the three lineages are more distinct. The discrepancy between these two markers may indicate that admixture among these lineages is relatively ancient, and hence, that the mixing signature is relatively weak.

Historical demography

Climatic changes that resulted in population expansion and contraction are also expected to lead to demographic changes over time between lineages confined to different subregions. Congruent with this, the Bayesian skyline plot revealed an increasing population size in each genealogical lineage since 0.125 mya, i.e. during the warmer interglacial MIS5 period. During the Quaternary, the eastern Himalayas were uplifted to 4 000 - 4 500 m.a.s.l. Climatic conditions responsible for high rainfall moved away from the centre of mountains and mountain glaciers shrank [27, 99]. The maximum extent of glacial development in this region occurred during MIS6 and MIS4, with ice being restricted during the global Late Glacial Maximum, LGM (MIS2) [2729]. Palynological research has indicated that a series of prolonged, mild interglacials supported vegetation similar to the present-day flora of East Asia during MIS5 (0.11 to 0.071 mya) [42, 108, 109]. It is likely that Pleistocene climatic stability might have allowed the persistence of vegetation similar to that observed today in the eastern Himalayas, especially at moderate or low altitudes [109]. Therefore, our results, combined with the palaeoclimatic and palynological data, suggest that G. elliotii experienced population growth during the warmer MIS5 period.

Contrary to the expectation that profound ecological upheavals during cooler periods would have reduced the population size of G. elliotii, the Bayesian skyline plot revealed a stable population during the cooler MIS4 and MIS2 periods. Maintenance of a rather stable population size can probably be attributed to the frequency and location of glaciations in this region. Unlike high latitude regions of Europe and North America that were covered by heavy ice during much of Pleistocene, glaciations in the eastern Himalayas were restricted to relatively high altitudes and did not affect the lower slopes or valleys [26]. It is likely that this relatively milder climate was less stressful for cold-tolerant alpine birds than the extremes experienced by both European and North American birds. Although temperatures might have been lower elsewhere during the MIS4 and MIS2, it has been inferred that the climate of the eastern Himalayas, particularly at low altitudes, was warmer [42, 107109]. Therefore, the resultant stable niches might have provided suitable habitats for G. elliotii even during periods of climatic change, making it possible for this species to maintain a stable population size during the cooler MIS4 and MIS2.

Conclusion

The genetic structure observed in G. elliotii indicates that lineages have been isolated on individual subregions since the late Pleistocene interglacial period (MIS 5). Despite being isolated in multiple refugia during glacial advance, gene flow periodically occurred when populations expanded their ranges to lower altitudes. The resultant genetic admixture might have caused the mixing of genetic lineages thereby obscuring the pattern of genetic variation. The Bayesian skyline plot shows a gradual increase in population size in each mitochondrial lineage since the late Pleistocene interglacial period (MIS 5). Our results provide new evidence that climatic changes during the Pleistocene glaciations had profound effects on lineage diversification and demography in a bird species in a previously unstudied montane region - the eastern Himalayas.