Background

Mitochondrial markers are widely and frequently used as the first approach to determining evolutionary relationships among vertebrates at different hierarchical scales [1, 2]. Cytochrome b and cytochrome oxidase I are among the most common markers; they are particularly useful in phylogeography and species delineation studies [3,4,5,6,7,8].

Although a number of methods and types of data are available, mitochondrial markers are a common tool to support descriptions of new taxa in phylogenetic analyses. This is probably due to the nature of the mitochondrial genome, i.e. as the smaller effective population size that will cause mtDNA haplotypes of a given group to coalesce four times more quickly than nuclear markers [2].

Simulation analyses show that increasing sampling of the taxon has a clear and strong effect on the accuracy of the inferred trees and therefore on hypothesized phylogenetic relationships (and eventually on new taxonomic proposals), which should be considered in the design of studies [9]. In systematics, for example, a pattern of reciprocal monophyly may be interpreted as evidence of no gene flow at the genetic locus of interest, and major factors influencing the probability of this pattern include time since isolation and effective population size. The probability of recovering a monophyletic group also depends on random processes related to the number of sequences sampled per population [2, 10]. Thus there is a probability that by sampling a few individuals a false monophyly pattern may be exhibited, therefore requiring additional samples to strengthen the hypothesis. For example, a study showed that if gene flow ceased very recently between two populations there is a higher probability of finding monophyly when few sequences are considered [11]. Also, the number of individuals required to evaluate a hypothesis adequately may vary according to the substitution rate of the marker analyzed. More individuals seem to be necessary in the analysis when the sequenced marker evolves more quickly [12]. Several empirical and simulations studies have shown that taxon sampling may be critical to achieve accurate phylogenetic results, either to retrieve the correct topology or improve bootstrap support [13,14,15,16,17,18,19,20]. Therefore, the decision of including or excluding certain taxa may result in different topologies, which in turns, may lead to new taxonomic rearrangements and/or phylogenetic relationships.

The lizard genus Liolaemus is widely distributed in southern South America and includes more than 250 described species [21]. This genus is distributed from arid Patagonian to high-altitude Andean environments, including valleys and coastal ranges. The adaptive radiation of Liolaemus has produced interesting patterns of genetic and morphological variation, resulting in a highly species-rich genus [22]. In Chile, 96 species of Liolaemus are currently recognized [23], which inhabits most environments from the arid desert in the north to the cold southern Patagonia, as well as Coast Range and Andean environments from 0 to 4500 m [24]. Thus, the number of taxa and the distribution of Liolaemus species/populations represents a good model for testing different hypotheses in systematics. Taxonomy within Liolaemus is dynamic [25], with new species reported almost every year. Most of the studies use traditional morphological analyses to diagnose the new taxa, and some include integrative approaches with other data such as molecular phylogenetic relationships with sister or close taxa. We used two Liolaemus species as focal species to evaluate their monophyly and the influence of adding new samples from closely related taxa in the resulting phylogenies.

Liolaemus nigroviridis is a saxicolous species distributed in central Chile between latitudes ca. 32° to 34°S (Region of Valparaiso to O’Higgins) above 1100 m [26]. The species’ distribution is discontinuous, inhabiting both Andean and coastal mountain environments and absent in low elevations and valleys [27]. In the Coast Range it has been reported in the Cantillana, Chicauma, La Campana and El Roble mountains. Because of the allopatric distribution of its populations, three subspecies were described: L. n. nigroviridis in the Valle del Río San Francisco (Andes), L. n. minor in the Valle del Río Volcán (Andes), and L. n. campanae in the Coast Range [28, 29]. However, based on similarities in morphology and chromosomal characters, a subsequent study did not recognize the polytypic status of the species [30]. A new species was described recently in central Chile, Liolaemus uniformis, sister to L. nigroviridis [31]. The phylogenetic inference for the description of L. uniformis included L. nigroviridis from one population, the type locality Farellones (33°21′S, 70°17′W). A previous study reported at least four major lineages allowing to infer strong phylogeographic structure within L. nigroviridis [27]; therefore, an increase taxon sampling seems recommendable to determine the position of the new proposed taxon (L. uniformis) related to its sister species (L. nigroviridis). The second model, L. monticola, is a saxicolous lizard species from the mediterranean Andes [32, 33] and the Coast Range [34] in central Chile from between 600 m to 1900 m. This species displays an extensive variation and complexity in chromosome number in a latitudinal gradient [35, 36], with riverine barriers playing a major role in the divergence of populations [37, 38]. Phylogenetic inferences show that L. nitidus is sister to L. monticola [3, 39]. However, intensive sampling showed paraphyly of L. monticola populations, with two samples (one haplotype) of L. nitidus placed within the L. monticola clade, which was suggested as potential mitochondrial introgression [40].

In this study, we performed phylogenetic analyses using mitochondrial Cytochrome b (cyt-b) DNA to determine the influence of taxon sampling in finding monophyletic groups in Liolaemus, a main result for proposing new taxa. We also discuss the importance of including as much as information available from several sources and databases (i.e. Genbank), and discuss the limitations of the studies which do not take into account evolutionary population processes that may hide the history of species differentiation.

Methods

Specimens analyzed

Mitochondrial gene sequence data of cyt-b were obtained from 141 lizards of different species of Liolaemus, 32 sequences generated in this study and 109 sequences downloaded from Genbank (Fig. 1, See Additional file 1). Genbank accession numbers KY575287-KY575318. The species used were L. nigroviridis, L. uniformis, L. monticola, and L. nitidus. Other species included were L. confusus, L. curicensis, L. lemniscatus, L. tenuis, L. pseudolemniscatus, L. schroederi and L. chiliensis based on phylogenetic studies with mitochondrial and nuclear genes [31, 39, 41]. Due to a deep phylogeographic structure reported for L. nigroviridis [27], we added 14 sequences from a new locality (La Campana, 32° 57′S, 71° 07′W). We included L. uniformis to test the monophyly/paraphyly within L. nigroviridis.

Fig. 1
figure 1

Study area. Map of 41 localities of focal species. The colors represent the four different species: L. nigroviridis (green), L. uniformis (pink), L. monticola (orange) and L. nitidus (light blue). The Maipo River is shown to represent the geographic barrier to the northern and southern clades of L. monticola reported by [38]

DNA extraction, amplification and sequencing

Total genomic DNA was extracted from liver/muscle tissues preserved in 80% ethanol using the QIA quick DNeasy Blood & Tissue kit (Qiagen Inc., Valencia, CA, USA) following the protocol provided by the manufacturer. A fragment of 775bp of the mtDNA region encoding the cyt-b gene was amplified via polymerase chain reaction (PCR) using the light strand primer GLUDG (5′-TGACTTGAARAACCAYCGTTG-3′) and the heavy strand primer CB3 (5′ GGCAAATAGGAARTATCATTC-3′) previously used [40]. The thermocycling conditions were previously documented [27]; amplification reactions were carried out in 11.5 μl reaction mixture containing 1.5μl template DNA (1–50 ng), 2μl 5X Taq buffer, 1.2 μl 25 mM MgCl2, 0.2 μl dNTPs (10 mM), 0.124 μl each primer (10 mM) and 0.1 μl Taq polymerase (5 U/μl Promega Corp.). Double-stranded PCR-amplified products were checked by electrophoresis on a 1% agarose gel in 1x Tris-Borate-EDTA (TBE) TBE running buffer and stained with Gel Red (Biotum, Inc., Hayward CA, USA). Purification and sequencing reactions were conducted by Macrogen Inc, (South Korea). All sequences were edited and aligned using the BioEdit Sequence Alignment Editor [42] and checked by eye. Missing data were coded as “?” and cyt-b sequences were translated to check for premature stop codons using DnaSP [43].

Phylogenetic analyses

To assess interspecific relationships of Liolaemus species focusing on L. nigroviridis and L. monticola and to determine changes in the topologies with different numbers of sequences, we performed Maximum-likelihood (ML) and Bayesian analyses using three approaches: 1) Eleven Liolaemus species (114 haplotypes) that included L. nigroviridis (N = 78), L. monticola (N = 36) plus their sister species and outgroups (see below). Cytochrome b sequences were downloaded from Genbank and obtained in this study (See Additional file 1). 2) L. nigroviridis as focal species that included seven populations, plus two sequences of L. uniformis (sister species of L. nigroviridis; [31], resulting in 64 cyt-b haplotypes. 3) L. monticola as focal species from 31 localities plus L. nitidus (sister species of L. monticola; [3] from 5 localities, resulting in 44 cyt-b haplotypes.

Phylogenetic reconstructions were rooted with the outgroup criterion, using L. chiliensis and L. schroederi for all Lioalemus species included in this study (first approach), L. lemniscatus for analyses with L. nigroviridis as focal species (second approach), and L. confusus and L. curicensis for L. monticola as focal species [31, 39, 41] (third approach). Outgroups used for second and third approaches were based in the phylogenetic relationships resulting from the first approach (see results). Maximum-likelihood (ML) analyses were performed in the PhyML 3.0 server (http://www.atgc-montpellier.fr/phyml/) using the best fit model of sequence evolution according to the Bayesian Inference Criterion (other criteria such as Akaike Information Criterion resulted in the same model of sequence evolution) identified by jModelTest [44] for each data set. We evaluated node support using 1000 bootstrap pseudoreplicates [45]. The selected best models of sequence evolution were: For the first approach, Hasegawa-Kishino-Yano (HKY; [46] + invariable sites (0.55) + Gamma shape parameter (1.4866). For the second approach HKY + invariable sites (0.614), and for the third approach, HKY + invariable sites (0.533) + Gamma shape parameter (1.003).

Bayesian analyses were performed using MrBAYES v.3.1.2b [47], using as prior parameters values obtained from jModeltest [44]. Computational resources available in CIPRES Science Gateway were used to run all Bayesian analyses [48]. We ran Metropolis-coupled Markov Chain Monte Carlo simulations (MCMCMC) with four incrementally heated chains. From random starting trees, four independent runs (two replicas of two simultaneous, independent runs each) of 1 x108 generations each were performed, with the resulting trees sampled every 10000 generations. Mixing and convergence were assessed using Tracer v1.6 [49] and the first 1000 trees of the sample were removed. The last 9000 trees were used to compute a 50% majority rule consensus tree. The percentage of samples that recover any particular clade in this tree represents that clade’s posterior probability; we considered BPP (Bayesian Posterior Probability) = 95% as evidence for significant support [50]. Trees were visualized using the Fig Tree v1.4.2 program [51].

Results

Major relationships of the L. nigroviridis and L. monticola (focal species of this study) were very similar in the phylogenetic inferences (Figs. 2, 3 and 4), thus we refer to the ML topology for a discussion of the results. Figure 2 shows the results including all eleven Liolaemus species (approach 1). The ML topology shows two major groups, one supported by maximum likelihood bootstrap (MLB = 92) and by Bayesian posterior probabilities (BPP = 0.99; clade A), and a second clade not supported by maximum likelihood bootstrap (MLB = 47) but supported by Bayesian posterior probabilities (BPP = 0.96; clade B). The difference in the support for the clade B may result from a variety of causes, including the different nature of both analyses and overestimation of posterior probabilities [52,53,54,55]. The two sequences of L. uniformis were placed in a supported clade (MLB = 100, BPP = 1.0) within the L. nigroviridis group, related (but unsupported) to the supported clade that included samples from La Campana, El Roble, and Chicauma (Fig. 2). The second group (clade B) included L. monticola, L. nitidus, L. confusus, L. curicensis and L. lemniscatus (Fig. 2). Within this clade, L. monticola, L. nitidus and L. confusus plus L. curicensis were recovered in a supported clade (MLB = 87; BPP = 0.99). The focal species L. monticola was recovered within an unsupported clade together with L. nitidus. However, all sequences from the northern distribution of L. monticola (northern race, see [38] were placed in a supported clade (MLB = 99, BPP = 1.0; clade B1) with sequences of L. nitidus (from Chicauma, El Yeso, Dunas de Ritoque and Cantillana localities; See Additional file 1). Sequences of L. monticola of the southern race were recovered within a supported clade together with L. nitidus from Cantillana (MLB = 93, BPP = 0.99; clade B2). One L. nitidus sequence (L. nitidus EU22083) was placed closer to all L. monticola from the northern race (Fig. 2). Figure 3 shows the relationships of L. nigroviridis as focal species (approach 2). The ML topology does not differ from Fig. 2, but differs in the position and support of the L. nigroviridis sequences from Cantillana. In Fig. 3, 13 out of 14 sequences from Cantillana were placed in a supported clade (MLB = 89; BPP = 0.99), but in Fig. 2 all sequences were recovered in a supported (MLB = 99, BPP = 1.0) clade. Similar to Fig. 2, sequences from L. uniformis were placed in supported clade (MLB = 99, BPP = 1.0) with an unresolved position compared to all L. nigroviridis clades. Figure 4 shows the relationships with L. monticola as focal species. L. confusus and L. curicensis were used as outgroups. The topology recovered was similar to that depicted in Fig. 2, but the clade including all L. monticola plus L. nitidus sequences was supported (MLB = 96, BPP = 1.0). The sequence of L. nitidus (EU22083) was recovered unresolved with L. monticola northern race sequences but in an unsupported clade.

Fig. 2
figure 2

Maximum likelihood tree of eleven species of the genus Liolaemus. Maximum likelihood tree of eleven species of the genus Liolaemus based on the mitochondrial gene cytochrome b. Numbers on the branches indicate bootstrap values for maximum likelihood and posterior probabilities for Bayesian inference. Branches of the focal species are shown in colors: L. nigroviridis (green) and L. monticola (orange): L. uniformis is shown in pink and L. nitidus in light blue. Information on haplotypes is given in Additional file 1: Table S1

Fig. 3
figure 3

Maximum likelihood tree for the focal species L. nigroviridis. Maximum likelihood tree based on cytochrome b gene for the focal species L. nigroviridis. Numbers on the branches indicate bootstrap values for maximum likelihood and posterior probabilities for Bayesian inference. Colors of the branches represent L. nigroviridis (in green) and L. uniformis (in pink) haplotypes

Fig. 4
figure 4

Maximum likelihood tree for the focal species L. monticola. Maximum likelihood tree based on cytochrome b gene for the focal species L. monticola. Numbers on the branches indicate bootstrap values for maximum likelihood and posterior probabilities for Bayesian inference. Colors of the branches represent L. monticola (in orange) and L. nitidus (in light blue) haplotypes

Discussion

Phylogenetic analyses are widely used in systematics to support species status complimentarily with other types of data (e.g. isozyme electrophoresis [56], traditional morphology-based methods [57], chromosomes [34, 58], and some physiological and ecological traits [59, 60]), especially when monophyly is recovered. A number of studies have emphasized the role of using intense taxon sampling of species and the number of individuals to elucidate spurious monophyletic groups [2, 9, 10, 12], and the degree of polymorphism in diagnosing new or hidden taxa [61]. Reciprocal monophyly may not be achieved in taxa with different evolutionary trajectories [62, 63] leading to inconclusive support for species designation. The genus Liolaemus provides several examples of this situation [5, 12, 64,65,66,67], with a number of taxa still waiting to be studied or validated. Here we used a higher number of samples that included sister species for phylogenetic reconstructions, with two species (L. uniformis and L. nitidus) placed within the clades of the two focal species, L. nigroviridis and L. monticola, respectively.

Topologies for L. nigroviridis with ML and Bayesian inference recovered all haplotypes in a supported clade, with strong phylogeographic structure congruent with previous results [27]. Almost all sampled populations were recovered in a supported clade, with L. uniformis placed within the L. nigroviridis clade. Liolaemus uniformis was described from one locality (Laguna Chepical, 32°15′S, 70°30′W), and the analysis of the phylogeny included sequences from 37 Liolaemus species, with 11 sequences of L. nigroviridis from the locality of Farellones [31]. These taxa were inferred as sister species. Morphological descriptions and analyses were also performed including other species from the L. nigroviridis group to support the new proposed taxon. L. nigroviridis and L. uniformis were reported to be sympatric, with significant differences in a few morphological characters [31]. The phylogenetic placement of L. uniformis within the L. nigroviridis clade may be explained by several processes including mitochondrial introgression, coalescent stochasticity, imperfect taxonomy, inadequate phylogenetic information, paralogy, hybridization, the taxon sampling effect and incomplete lineage sorting [68,69,70,71,72]. Discriminating among these processes may be difficult, particularly in groups with complex evolutionary histories and/or widely distributed species inhabiting diverse landscapes. Liolaemus nigroviridis inhabits mountain areas of central Chile, and several populations are currently in allopatry, inhabiting both in Coast Range and Andean environments [27, 29]. Although it is not the goal of this study to determine the main processes that explain a potential paraphyly of L. nigroviridis, our results show that increasing the sampling from additional localities of closer (e.g. sister) taxa may result in absence of monophyly. According to our results and a previous study [27], the molecular divergence of L. nigroviridis populations and the placement of L. uniformis suggests cryptic divergence within L. nigroviridis. However, our study is limited to one mitochondrial locus, and the robustness of phylogenetic inferences (i.e. for delimiting species) increases when more independent loci are considered [2, 61, 73]. Therefore, to evaluate the processes that may explain this pattern additional molecular analyses coupled with morphology and ecology are required [74,75,76,77].

Our topologies also show polyphyly of L. monticola (Figs. 2 and 4). Haplotypes of L. monticola were placed in two divergent clades (northern and southern races; [37, 38], with several haplotypes of L. nitidus within L. monticola clades. Eight haplotypes of L. nitidus were sister (and supported; clade B1) to a clade that included all L. monticola haplotypes that belong to the northern race plus one (supported) additional L. nitidus haplotype (H96; Fig. 2). Two haplotypes of L. nitidus were closer to haplotypes of L. monticola that belong to the southern race (clade B2). Several studies using molecular phylogenetic analyses have inferred L. monticola and L. nitidus as sister species [3, 31, 39, 60, 78], a result that might contribute to the complexity of determining the main process (either deterministic or stochastic) that explains our results. Incomplete lineage sorting and mitochondrial introgression have been discussed as probable causes to explain paraphyletic patterns in Liolaemus [4, 5, 40, 64, 66]. We found that almost all haplotypes of L. nitidus and L. monticola (except L. nitidus H34; Fig. 4) were placed in divergent and supported clades, which would not support the incomplete lineage sorting hypothesis. L. nitidus is widely distributed in Central Chile, including Administrative Regions IV to VIII and from 0 to 3050 m [60], and this distribution partially overlaps that of L. monticola. In our study two out of five localities (El Yeso from the Andes and Cantillana from Coast Range) included samples of both species; these populations are in contact and may potentially exchange mitochondrial genes. Therefore, alternative processes such as historical mitochondrial introgression should be further evaluated.

Strikingly, the eight haplotypes of L. nitidus from the locality of Cantillana were placed in two divergent clades: six haplotypes (haplotypes 35, 38, 41, 42, 43, 44; Fig. 4) were closer to the clade that included the L. monticola northern race, and two other haplotypes (Haplotypes 36, 37; Fig. 4) were closer to the clade of the L. monticola southern race. Despite our sampling bias, the high molecular divergence in L. nitidus may indicate a substantial history of genetic isolation among populations [68]. Quaternary fluctuations may have played a major role in the current population structure of lizards in Chile [6, 38, 79], connecting populations that may have been allopatric in the past [80].

The divergence found in L. nitidus may be a result of ancestral allopatric populations that converged in Cantillana as a result of historical climatic changes. However, the latter hypothesis does not explain the overall phylogenetic pattern found in L. monticola and L. nitidus.

Imperfect taxonomy has also been suggested as a hypothesis to explain paraphyletic patterns. However, these species have strong differences in morphology (size, shape, scales, and color; [32], chromosome number [34, 58], and certain ecological and physiological traits [60, 81]. Molecular analyses have revealed several examples of cryptic diversity that have impacted the taxonomy of some species (e.g. [82, 83]); our topologies show that clade divergences may hide greater diversity than expected, which should be further evaluated within each of the focal species (i.e. adding more independent loci).

Conclusion

Based on a mitochondrial marker, we showed that using intensive taxon sampling for phylogenetic reconstructions, two lizard species (L. uniformis and L. nitidus) are placed within the clades of two different species (L. nigroviridis and L. monticola, respectively). Our study confirms the importance of taxon sampling to infer more accurate phylogenetic relationships, particularly to reveal hidden polyphyly or paraphyly, which may have a strong impact in taxonomic proposals and inference of cryptic diversity.