Introduction

Accurate species delimitation plays a crucial role in assessing, monitoring, and conserving biodiversity [1,2,3]. This is made more difficult by cryptic species, defined as two or more distinct species that are erroneously classified under one species [4]. These may be morphologically barely distinguishable, yet on different evolutionary trajectories [4, 5]. These hidden or unrecognized species represent a substantial fraction of biodiversity, posing challenges to taxonomy, biodiversity estimation and conservation efforts [6,7,8,9]. Cryptic diversity can arise through various mechanisms [5, 10], such as recent divergence [11,12,13], convergent evolution driven by similar environmental pressures [14, 15], hybridization [6, 16, 17], and polyploidization [18, 19]. Over the past two decades, DNA barcodes (standard DNA regions across taxa) have been widely used in discriminating species [20,21,22,23] and discovering cryptic species [24, 25]. However, the standard plant DNA barcodes are not always effective, especially for recently radiated taxa or those with complex evolutionary histories [26, 27].

The plastid genome in almost all land plants exhibits a highly conserved quadripartite structure [28], generally ranging from 120 to 160 kb in size and containing 110–130 distinct genes including ~ 80 protein-coding genes, 30 transfer RNA (tRNA) genes, and four ribosomal RNA (rRNA) genes [29, 30]. Plastid genomes are predominantly maternally inherited in plants [31]. Due to their high copy number within cells, plastid genomes can be sequenced, assembled, and annotated more easily and cost-effectively than nuclear genomes [32, 33], aiding their widespread use in elucidating the evolutionary history of green plants [34,35,36]. Moreover, plastid genomes have emerged as super DNA barcodes or ultra-barcodes [37, 38], containing a higher number of informative sites and exhibiting greater discriminatory power than standard plant DNA barcodes [39, 40]. Plastid genomes have recently been used in discovering cryptic species and screening taxon-specific DNA barcodes for particular plant lineages [19, 41, 42].

The Hengduan Mountains Region (HDM), also known as the Mountains of Southwest China, is recognized as one of the world’s biodiversity hotspots [43, 44]. It is known for harboring the richest temperate alpine flora in the world [45], and as a center of diversity for numerous plant lineages [46,47,48,49]. Its topography is characterized by a series of north-south oriented alpine mountain ranges separated by deep river gorges [47], which act as genetic barriers for some plant taxa [50, 51] and therefore have contributed to the high species diversity in this region.

The genus Triplostegia Wall. ex DC. comprises two traditionally recognized species: T. grandiflora Gagnep. (1901) is confined to the HDM in northern Yunnan and western Sichuan, whereas T. glandulifera Wall. ex DC. (1830) is widely distributed in the mountains of southwestern and central China, extending to Taiwan, Bhutan and Nepal [52] (Fig. 1). Recent evidence placed Triplostegia within subfamily Dipsacoideae of Caprifoliaceae [53,54,55], but its affinities have long been controversial [56,57,58]. Morphologically, Triplostegia grandiflora differs from T. glandulifera in its sessile (not petiolate) leaves, longer corolla and more elongated inflorescence branches [52], but recent studies based on molecular and morphological evidence have proposed merging T. grandiflora into T. glandulifera [59, 60]. However, our own investigations in the southern region of the HDM have revealed a third taxonomic entity, here termed Triplostegia sp. A, which often occurs sympatrically with T. grandiflora, but differs from both recognized species in its glabrous and slender lateral taproot, petiolate leaves with serrate margins, and corollas usually 1–2 mm in length. Furthermore, whereas T. grandiflora occurs in Pinus yunnanensis and P. armandii forests at 2066–3128 m, Triplostegia sp. A has a larger elevation range, 2651–3954 m according to our fieldwork, occurring in Pinus, Quercus, Abies, and Picea forests as well as on roadsides, riversides, and alpine meadows.

Fig. 1

Occurrences of Triplostegia glandulifera (yellow dots), Triplostegia sp. A (red triangles), and T. grandiflora (green squares). The areas circled with yellow, red, and green lines are the geographical distribution areas of T. glandulifera, Triplostegia sp. A, and T. grandiflora, respectively. (The map was created by the authors using ArcGIS software)

In this study, we performed genome skimming on 22 individuals collected from 14 populations of Triplostegia sp. A and eight populations of T. grandiflora, all from the southern HDM within the northwest region of Yunnan Province. To these were added 11 recently published Triplostegia plastomes covering the entire distribution range of Triplostegia [59]. Furthermore, we recorded morphological and functional traits of Triplostegia species from herbarium specimens and wild plants. Our main objectives were to address the following questions. (1) What is the phylogenetic position of Triplostegia? (2) How many distinct species exist within Triplostegia? (3) When did diversification occur among Triplostegia species? (4) Are there any highly variable regions in the plastid genome that could be used as taxon-specific DNA barcodes for discriminating Triplostegia species? (5) Did any geographical features play a role in promoting diversification within Triplostegia?

Materials and methods

Taxon sampling

One individual was randomly selected from each of 8 and 14 populations of T. grandiflora and Triplostegia sp. A in northwest Yunnan Province, respectively (Table S1; Fig. 6). Healthy and fresh leaves were collected and immediately dried using silica gel. Vouchers were deposited in the Herbarium of Nanchang University. In addition, 11 sequences of Triplostegia were downloaded from the NCBI Sequence Read Archive (SRA) for analysis, comprising seven samples of T. glandulifera, one of T. grandiflora, and three of the unrecognized species Triplostegia sp. A (Table S1).

DNA isolation and sequencing

Total genomic DNA was extracted from silica-gel-dried leaves using a modified CTAB method [61]. The DNA samples were then sheared into fragments and used to construct 500 bp libraries by the Molecular Biology Experiment Center, Germplasm Bank of Wild Species in Southwest China, following the manufacturer’s manual (Illumina, San Diego, CA, USA). Paired-end sequencing of 150 bp was performed on an Illumina HiSeq 2500 platform.

Plastid genome and nrDNA assembly and annotation

The raw sequence reads were quality-checked with FastQC [62] and filtered using Trimmomatic v0.3.2 [63]. The plastid genomes and nrDNA sequences were assembled from high-quality paired reads using GetOrganelle v1.6.0 [64] with default settings. The connections between contigs were evaluated and visualized using Bandage v0.7.1 [65]. The assembled plastid genomes were annotated using GeSeq (MPI-MP CHLOROBOX) [66], followed by manual adjustments in Geneious v9.05 (http://www.geneious.com/) using the published plastid genome of T. glandulifera (GenBank accession: NC_045051) as a reference. The assembled nrDNA sequences were annotated in Geneious using the nrDNA of Scabiosa canescens Waldst. & Kit. (MT735330) as a reference. Finally, the graphical maps of the Triplostegia plastid genomes were generated using Organellar Genome DRAW v1.3.1 [67].

Genome comparison and structural analysis

The plastid genomes were aligned and visualized using mVISTA in Shuffle-LAGAN mode [68]. To investigate potential rearrangements in the plastid genomes, multiple sequence alignment was performed using MAUVE [69]. Comparisons of boundaries between the single-copy regions and the inverted repeat (IR) regions among the plastid genomes were performed using IRscope [70].

Nucleotide diversity and genetic differentiation analysis

We used nucleotide diversity (π) to assess the levels of plastid genomic divergence within Triplostegia and to identify highly variable plastid DNA regions. The analysis was run in DnaSP v6.0 [71] with a window length of 600 bp and a step size of 200 bp.
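The sliding-window calculation can be illustrated with a minimal Python sketch. It is not the DnaSP implementation: the function names are hypothetical, and gaps and ambiguous bases are handled here by pairwise deletion, which is an assumption that may differ from the DnaSP settings used in this study.

```python
from itertools import combinations

VALID_BASES = set("ACGT")


def pairwise_diffs(a, b):
    """Return (differing sites, compared sites), skipping gaps and ambiguity codes."""
    diffs = sites = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in VALID_BASES and y in VALID_BASES:
            sites += 1
            diffs += x != y
    return diffs, sites


def nucleotide_diversity(seqs):
    """Mean proportion of differing sites over all pairs of aligned sequences."""
    per_pair = []
    for a, b in combinations(seqs, 2):
        diffs, sites = pairwise_diffs(a, b)
        if sites:
            per_pair.append(diffs / sites)
    return sum(per_pair) / len(per_pair) if per_pair else 0.0


def sliding_pi(alignment, window=600, step=200):
    """Return (window midpoint, pi) pairs along an alignment of equal-length sequences."""
    length = len(alignment[0])
    results = []
    for start in range(0, max(length - window, 0) + 1, step):
        chunk = [seq[start:start + window] for seq in alignment]
        results.append((start + window // 2, nucleotide_diversity(chunk)))
    return results
```

Windows whose π stands well above the genome-wide background are the candidates for highly variable regions.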

Genetic differentiation (FST) and gene flow (Nm) among species, as well as within-species genetic diversity, were also estimated using DnaSP v6.0 [71], with all samples belonging to each taxon being considered a population.
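For orientation, the relationship between FST and Nm under Wright's island model can be sketched as follows. This is an illustrative assumption about how the two quantities relate, not a description of the estimator DnaSP applied here: FST ≈ 1/(1 + 4Nm) is the usual form for diploid nuclear loci, and FST ≈ 1/(1 + 2Nm) for haploid, uniparentally inherited genomes such as plastids. The function name is hypothetical.

```python
def gene_flow_nm(fst, haploid=True):
    """Estimate Nm from FST under Wright's island model.

    haploid=True  uses FST = 1 / (1 + 2Nm), e.g. for plastid data (assumption).
    haploid=False uses FST = 1 / (1 + 4Nm), e.g. for diploid nuclear data.
    """
    if not 0.0 < fst <= 1.0:
        raise ValueError("FST must lie in the interval (0, 1]")
    factor = 2.0 if haploid else 4.0
    return (1.0 / fst - 1.0) / factor
```

Under either form, high FST values correspond to Nm well below 1, which is conventionally read as very limited gene flow.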

Phylogenetic analysis

Phylogenetic relationships among the 33 Triplostegia samples (Table S1) were examined using maximum likelihood (ML) analysis and Bayesian inference (BI), based on three datasets: the complete plastid genomes, plastid protein-coding sequences (CDS), and nrDNA sequences. As outgroups we used Pterocephalus hookeri (C.B.Clarke) Airy Shaw & M.L.Green, Dipsacus asper Wall. ex DC., Scabiosa tschiliensis Grüning, Kolkwitzia amabilis Graebn., and Patrinia heterophylla Bunge, representing genera closely related to Triplostegia [59]. Three samples were randomly selected from each of the three clades formed by the 33 samples of Triplostegia, and used to determine the position of the genus in the phylogeny of Dipsacales via ML and BI analyses. For this, 57 complete plastid genomes were obtained, including the above outgroups and covering the recognized families within the order (Table S2). Sesamum indicum L., Mentha spicata L., Pittosporum kerrii Craib, and Apium graveolens L. were chosen as outgroups to Dipsacales based on previous studies [53,54,55].

Multiple sequence alignments were performed using MAFFT v7.409 [72]. The best-fit nucleotide substitution model was selected using ModelTest v.3.7 [73] with the Akaike information criterion (AIC). ML analysis was performed using RAxML v 8.2.12 [74] under the GTRGAMMA model with 1000 bootstrap replicates. BI analysis was performed using MrBayes v3.2.6 [75] with the Markov Chain Monte Carlo (MCMC) algorithm, running for 2,000,000 generations with the first 25% of trees discarded as burn-in, and thereafter sampling trees every 1000 generations, and using these to construct majority-rule consensus trees. Furthermore, phylogenetic networks of plastid genomes and nrDNA sequences were visualized using SplitsTree v4.14.6 [76].

Species discrimination analysis

We assessed the effectiveness of standard plant DNA barcodes, including rbcL, matK, and ITS, and the barcode ycf1 suggested by Dong et al. (2015) [77], plus the highly variable plastid DNA regions of Triplostegia and their combinations, in discriminating Triplostegia species using tree-based methods. ML trees for each marker were constructed using RAxML with the same settings as previously described. A species was considered to be correctly resolved when all the individuals of that species formed a monophyletic group with >70% bootstrap support [78]. The standard DNA barcode trnH-psbA was not included in the analysis due to its insufficient number of informative sites for species discrimination. In addition, we generated Neighbour-Joining (NJ) trees using MEGA v10 [79] based on the highly variable plastid DNA regions and the ITS region, using the p-distance model with 1000 bootstrap replicates.

In addition to the tree-based analyses, we also conducted distance-based analyses following Hollingsworth et al. (2009) [22]. Pairwise interspecific and intraspecific genetic distances were calculated under the Kimura 2-parameter (K2P) model in MEGA v10 [79]. A species was considered to be successfully discriminated if the minimum interspecific K2P distance involving that species was greater than its maximum intraspecific K2P distance.
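The distance-based criterion can be sketched in a few lines of Python. This is an illustration rather than the MEGA workflow: the function names are hypothetical, sites with gaps or ambiguity codes are skipped pairwise (an assumption), and the K2P distance is computed as d = −½ ln[(1 − 2P − Q)·√(1 − 2Q)], where P and Q are the proportions of transitions and transversions.

```python
import math
from itertools import combinations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
VALID_BASES = PURINES | PYRIMIDINES


def k2p_distance(a, b):
    """Kimura 2-parameter distance between two aligned sequences."""
    transitions = transversions = sites = 0
    for x, y in zip(a.upper(), b.upper()):
        if x not in VALID_BASES or y not in VALID_BASES:
            continue  # skip gaps and ambiguity codes
        sites += 1
        if x == y:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if sites == 0:
        raise ValueError("no comparable sites")
    p, q = transitions / sites, transversions / sites
    # math.log raises ValueError if the sequences are too divergent (saturated)
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def discriminated_species(samples):
    """samples maps species name -> list of aligned sequences.

    A species passes if its minimum interspecific distance exceeds
    its maximum intraspecific distance.
    """
    result = {}
    for species, seqs in samples.items():
        intra = [k2p_distance(a, b) for a, b in combinations(seqs, 2)]
        inter = [
            k2p_distance(a, b)
            for other, other_seqs in samples.items()
            if other != species
            for a in seqs
            for b in other_seqs
        ]
        result[species] = bool(inter) and min(inter) > max(intra, default=0.0)
    return result
```

A species represented by a single sequence has no intraspecific distance; the sketch treats its maximum intraspecific distance as zero.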

Divergence time estimation

To estimate divergence times, we obtained plastome sequences from GenBank (Table S3) for a total of 27 species of Dipsacales, two species of Apiales (Apium graveolens L. and Pittosporum kerrii Craib), and two species of Lamiales (Sesamum indicum L. and Mentha spicata L.). ModelTest analysis indicated that the GTR + I + G nucleotide substitution model performed best (Table S4). We used BEAST v2.6.6 [80] to estimate divergence times under a relaxed lognormal clock and the GTR + I + G nucleotide substitution model. Markov Chain Monte Carlo (MCMC) searches were performed for 500,000,000 generations, sampling every 25,000 generations. The tree prior was specified as a Yule process. Tracer v.1.5 [81] was used to assess chain convergence and to ensure that the effective sample sizes (ESS) were greater than 200. The maximum clade credibility (MCC) tree with median heights was computed using TreeAnnotator v2.6.6. Four calibration points were used: (1) the crown age of Dipsacales was set to 103 million years ago (Ma), with a normal prior (mean = 103 Ma, SD = 1.0), based on previous studies [82, 83]; (2) the earliest fossil record of Viburnum from the late Paleocene to early Eocene [84, 85] was used to calibrate the crown group of Adoxaceae, with a lognormal prior (mean = 0, SD = 1.0, offset = 56 Ma), following Moore and Donoghue (2007) [86]; (3) the divergence between Weigela and its sister group Diervilla was calibrated with a lognormal prior (mean = 0, SD = 1.0, offset = 23 Ma), following Wang et al. (2015) [83]; and (4) the fossil fruits of Diplodipelta (36 Ma) [87] were used to calibrate the stem age of Dipelta, with a lognormal prior (mean = 0, SD = 1.0, offset = 36 Ma), following Wang et al. (2015) [83].
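To make the three offset lognormal calibrations more concrete, the following Python sketch computes the median and central 95% interval each prior implies. It assumes that mean = 0 and SD = 1.0 are parameters in log space, which the text does not state; if the mean were specified in real space the numbers would differ. The function and dictionary names are hypothetical.

```python
import math
from statistics import NormalDist


def offset_lognormal_quantile(prob, offset, log_mean=0.0, log_sd=1.0):
    """Quantile (in Ma) of an offset lognormal prior with log-space parameters."""
    z = NormalDist().inv_cdf(prob)
    return offset + math.exp(log_mean + log_sd * z)


def prior_summary(offset):
    """Return the (2.5%, 50%, 97.5%) quantiles of the prior for a given offset."""
    return tuple(offset_lognormal_quantile(p, offset) for p in (0.025, 0.5, 0.975))


# Offsets (Ma) taken from calibrations (2) to (4) in the text
CALIBRATIONS = {
    "Adoxaceae crown": 56.0,
    "Weigela-Diervilla split": 23.0,
    "Dipelta stem": 36.0,
}

if __name__ == "__main__":
    for name, offset in CALIBRATIONS.items():
        low, median, high = prior_summary(offset)
        print(f"{name}: median {median:.2f} Ma, 95% interval {low:.2f}-{high:.2f} Ma")
```

Under this assumption each prior places its median 1 Myr older than the fossil offset and has a long tail toward older ages, so the fossil acts as a hard minimum bound.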

Morphological and functional traits analyses

We collected morphological trait data of Triplostegia species by measuring specimens across their distribution range. For 63, 27, and 55 specimens of T. glandulifera, T. grandiflora, and Triplostegia sp. A, respectively, we measured 10 morphological traits: plant height, taproot length and width, leaf length and width, petiole length, leaf fission depth, corolla length, and fruit length and width. These traits were chosen because most of them showed disparities among the Triplostegia species according to our observations. In addition, 121 and 149 individuals from 8 and 14 populations of T. grandiflora and Triplostegia sp. A, respectively (identified by S-L Tan), sampled where the two taxa co-occur in northwest Yunnan (Fig. 6), were examined for eight morphological and functional traits: plant height, leaf chlorophyll content, leaf area, leaf thickness, leaf dry mass, specific leaf area (SLA), corolla length, and corolla width, following previously applied protocols [88,89,90]. We conducted principal component analysis (PCA) based on these morphological traits. Kruskal-Wallis tests and pairwise Wilcoxon rank sum tests were used to assess the differences in each trait among the three Triplostegia taxa. All statistical analyses were conducted using R version 4.1.3 [91].
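The analyses themselves were run in R; purely as an illustration of what the Kruskal-Wallis test computes, a minimal Python sketch of the tie-corrected H statistic is given below. The function names are hypothetical, and the p-value helper is valid only for three groups (two degrees of freedom), where the chi-square survival function reduces to exp(−H/2).

```python
import math
from collections import Counter


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H statistic for a list of samples."""
    pooled = sorted(value for group in groups for value in group)
    n = len(pooled)
    # Assign each distinct value the average of the ranks it occupies
    rank_of = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        i = j
    rank_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12.0 / (n * (n + 1)) * rank_term - 3.0 * (n + 1)
    tie_sum = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1.0 - tie_sum / (n ** 3 - n)
    if correction == 0.0:
        raise ValueError("all observations are identical")
    return h / correction


def p_value_three_groups(h):
    """Chi-square approximation of the p-value for exactly three groups (df = 2)."""
    return math.exp(-h / 2.0)
```

With three taxa, a significant H indicates only that at least one taxon differs; the pairwise Wilcoxon tests then identify which pairs differ.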

Species distribution modelling

We used MaxEnt v.3.4.1 [92] to assess the suitable climate envelopes of T. glandulifera, T. grandiflora, and Triplostegia sp. A across the past, present, and future periods. Species occurrence records were obtained from the Chinese Virtual Herbarium (http://www.cvh.ac.cn/), the Global Biodiversity Information Facility (https://www.gbif.org/), and our own field collections. To ensure data quality, we refined the occurrence records following the criteria described by Qiu et al. (2023) [93] by removing: 1) duplicate records, 2) records lacking spatial coordinates or specific locations, 3) specimens with identification errors, and 4) unreliable records located in cities or bodies of water. To reduce the effect of spatial autocorrelation and the consequent overfitting, occurrence records within five kilometers of another were filtered out. Our final dataset for species distribution modelling consisted of 64, 30, and 67 occurrence records for T. glandulifera, T. grandiflora, and Triplostegia sp. A, respectively (Fig. 1).
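The 5 km filtering step can be sketched as a greedy spatial thinning. The text does not say which tool or algorithm was used, so the approach below (keep a record only if it lies at least 5 km from every record already kept, using great-circle distance) and the function names are assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def thin_records(records, min_km=5.0):
    """Greedily keep (lat, lon) records that are at least min_km from all kept records."""
    kept = []
    for lat, lon in records:
        if all(haversine_km(lat, lon, k_lat, k_lon) >= min_km for k_lat, k_lon in kept):
            kept.append((lat, lon))
    return kept
```

Because the procedure is greedy, the result depends on the input order; exact duplicates are removed as a side effect, since their distance is zero.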

Nineteen bioclimatic variables were downloaded from WorldClim v1.4 for the Last Interglacial (LIG; 120,000–140,000 years ago), the Last Glacial Maximum (LGM; 22,000 years ago), and the Mid-Holocene (MH; 6000 years ago) periods, and from WorldClim v2.1 (https://www.worldclim.org/) for the present (1970–2000) and future (2090: average of 2081–2100) (Table S5), with a resolution of 2.5 arc-minutes (approximately 5 km at the equator). To provide both a conservative and a comparatively larger estimate of species distribution change under future climate conditions, we used two Shared Socioeconomic Pathways (SSPs) for future climatic conditions: SSP2–4.5 (moderate climate change) and SSP5–8.5 (pessimistic climate change) from the CMIP6 (BCC-CSM2-MR) climate model [94]. To avoid multicollinearity of variables, we performed a Pearson’s correlation test for the 19 bioclimatic variables for each species, and for any pair of variables with Pearson’s r > 0.8, the variable with the higher percentage contribution was retained. The Area Under the Receiver-Operating Characteristic (ROC) Curve (AUC) values were used to evaluate the accuracy of the species distribution models [95]. AUC values range from 0.5 to 1 and are categorized as failing (0.5–0.6), poor (0.6–0.7), fair (0.7–0.8), good (0.8–0.9), and excellent (0.9–1) [96]. Jackknife analysis was used to determine the relative importance of each bioclimatic variable [97]. To determine the potential distribution of each species, we reclassified the MaxEnt output file using the 10-percentile training presence logistic threshold value (10TPL) [98, 99]. We used the SDM toolbox v2.4 in ArcGIS 10.2 to calculate the suitable area changes between different periods.
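The variable-screening rule can be illustrated with a short Python sketch. It is one possible reading of the procedure, not the exact workflow used: variables are visited in order of decreasing percentage contribution, and a variable is dropped if it is strongly correlated with one already retained. The sketch applies the threshold to the absolute value of r, which is an assumption, and all names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a variable with zero variance has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


def filter_correlated(values, contribution, threshold=0.8):
    """Retain variables so that no retained pair has |r| above the threshold.

    values:       variable name -> values extracted at the occurrence points
    contribution: variable name -> percentage contribution from a preliminary model
    """
    kept = []
    for name in sorted(values, key=lambda v: contribution[v], reverse=True):
        if all(abs(pearson_r(values[name], values[k])) <= threshold for k in kept):
            kept.append(name)
    return kept
```

Visiting variables in order of contribution guarantees that, within any strongly correlated pair, the variable with the higher contribution is the one that survives.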

To quantify niche similarity between species, we used the software ENMTools v1.3 [100] to estimate the niche overlap among Triplostegia species using two metrics: Schoener’s D [101] and Warren’s I [102]. Both metrics range from 0 to 1, with values closer to 1 indicating a higher degree of niche overlap between the species.
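Both metrics compare two suitability surfaces after each has been normalised to sum to 1 across grid cells. The sketch below uses the commonly cited forms D = 1 − ½ Σ|p_x − p_y| and I = 1 − ½ Σ(√p_x − √p_y)²; it is an illustration with hypothetical function names, not the ENMTools code, and takes suitability values as plain lists over the same set of cells.

```python
import math


def _normalise(suitability):
    """Scale non-negative suitability scores so that they sum to 1."""
    total = sum(suitability)
    if total <= 0:
        raise ValueError("suitability scores must have a positive sum")
    return [value / total for value in suitability]


def schoener_d(suit_x, suit_y):
    """Schoener's D: 1 minus half the summed absolute differences."""
    if len(suit_x) != len(suit_y):
        raise ValueError("both surfaces must cover the same grid cells")
    px, py = _normalise(suit_x), _normalise(suit_y)
    return 1.0 - 0.5 * sum(abs(a - b) for a, b in zip(px, py))


def warren_i(suit_x, suit_y):
    """Warren's I: 1 minus half the squared Hellinger-type distance."""
    if len(suit_x) != len(suit_y):
        raise ValueError("both surfaces must cover the same grid cells")
    px, py = _normalise(suit_x), _normalise(suit_y)
    return 1.0 - 0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(px, py))
```

Identical surfaces give a value of 1 for both metrics and completely non-overlapping surfaces give 0; for intermediate cases I is typically larger than D because the square-root transform down-weights large differences.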

Results

General features of the Triplostegia plastid genomes

The 33 plastid genomes of Triplostegia examined (Table S1) were highly conserved in gene content, gene order, and GC content (Table 1; Fig. S1), all exhibiting a typical quadripartite structure composed of a large single-copy (LSC) region, a small single-copy (SSC) region, and two inverted repeat (IR) regions (Fig. S1). The plastid genome lengths varied as follows: 154,230–155,445 bp for T. glandulifera, 155,041–155,410 bp for Triplostegia sp. A, and 155,638–155,706 bp for T. grandiflora (Table 1). The complete plastid genome consists of 113 unique genes, including 79 protein-coding genes, 30 tRNA genes, and four rRNA genes. Among these, the following 17 genes were found to be duplicated in the IR regions: rrn16, rrn23, rrn4.5, rrn5, trnI-CAU, trnL-CAA, trnV-GAC, trnI-GAU, trnA-UGC, trnR-ACG, trnN-GUU, rps12, rps7, rpl23, ndhB, ycf2, and ycf1. Additionally, 16 genes contained one intron (trnK-UUU, trnL-UAA, trnV-UAC, trnI-GAU, trnA-UGC, rps16, rps12, rpl16, rpl2, rpoC1, petB, petD, atpF, ndhB, ndhA, ycf2), whereas two genes (clpP and ycf3) contained two introns (Table S6).

Table 1 General characteristics of chloroplast genome of Triplostegia species

Comparative analysis of plastid genome structures

According to mVISTA (Fig. S2) and MAUVE (Fig. S3) analysis, the plastid genome structures were highly conserved in Triplostegia, with no inversion or rearrangement detected. The LSC and SSC regions were more variable than the IR regions, and the non-coding regions were more variable than the coding regions. The IR/SSC and IR/LSC junction regions of Triplostegia contained seven genes: rpl2, rpl23, trnN, ndhF, ycf1, trnI, and trnH (Fig. S4).

Phylogenetic analyses and divergence time estimation

Our phylogenetic analyses of Dipsacales based on plastid genomes revealed that Triplostegia is a strongly supported (BSML = 100%, PPBI = 1.00) monophyletic genus within the subfamily Dipsacoideae (Caprifoliaceae), forming a sister clade to a group consisting of Dipsacus, Scabiosa, and Pterocephalus (Fig. 2). Within Triplostegia, ML and BI analyses of both complete plastid genomes (Fig. 3) and plastid CDS (Fig. S5) resolved three well-supported clades, corresponding to T. glandulifera, T. grandiflora, and Triplostegia sp. A. The phylogeny based on nrDNA sequences was similar, except that T. glandulifera did not form a monophyletic clade (Fig. S6). The three taxa also formed distinct clusters according to Neighbor-net analysis of concatenated complete plastid genomes and nrDNA sequences, with T. glandulifera displaying high levels of intraspecific genetic variation (Fig. 4).

Fig. 2

Phylogenetic relationships of Dipsacales constructed using RAxML based on complete chloroplast genome sequences. The maximum likelihood (ML) tree is presented, with maximum likelihood bootstrap support values (BS) and Bayesian inference posterior probability (PP) values given for each node. Nodes with a ‘*’ symbol represent nodes that received maximum support from ML or BI analysis (‘*’: 100% or 1.0). Nodes without values represent maximal support in both ML and BI methods (BSML = 100%, PPBI = 1.00)

Fig. 3

Phylogenetic relationships of 33 samples of Triplostegia species based on complete chloroplast genome sequences. The phylogenetic tree was constructed using both maximum likelihood (ML) and Bayesian inference (BI) methods. The maximum likelihood (ML) tree is presented. Numbers along the branch indicate bootstrap support values from ML analysis (based on 1000 replicates) and Bayesian posterior probabilities from BI analysis

Fig. 4

a Unrooted neighbour-joining (NJ) tree of Triplostegia based on the p-distance calculated from three highly variable plastid DNA regions (ndhF, ndhF-trnN, rpoB-trnC) and nuclear ITS sequences. b Neighbor-net analysis of Triplostegia based on complete chloroplast genome and nrDNA sequences. Bootstrap values (based on 1000 replicates) are indicated along the branches for each cluster

Molecular dating analysis (Fig. 5) estimated the stem and crown ages of Triplostegia to be 39.96 Ma (95% highest posterior density, HPD: 13.91–55.05) and 7.94 Ma (95% HPD: 1.59–22.68), respectively. Triplostegia glandulifera was the first taxon to diverge, and T. grandiflora diverged from Triplostegia sp. A around 1.05 Ma (95% HPD: 0.028–6.58).

Fig. 5

BEAST analysis of divergence times based on protein-coding region sequences of the chloroplast genome. Calibration points are indicated by yellow circles. The blue bars indicate the 95% highest posterior density (HPD) credibility intervals for node ages

Nucleotide diversity and genetic differentiation

Among the three Triplostegia species, T. glandulifera displayed the highest plastome genetic diversity (π = 1.17 × 10⁻³), with that of Triplostegia sp. A around eight times lower (π = 1.4 × 10⁻⁴), and that of T. grandiflora about seven times lower again (π = 2 × 10⁻⁵).

A total of 814 polymorphic loci were identified in the plastomes of Triplostegia, with average π values of 0.00146, 0.00246, and 0.00059 for the LSC, SSC, and IR regions, respectively (Fig. S7). Three highly variable regions were detected: ndhF (π = 0.03212) in the SSC region, ndhF-trnN (π = 0.02944) in the IRb region, and rpoB-trnC (π = 0.00682) in the LSC region.

For both the plastid and nrDNA data, the degree of genetic differentiation (FST) was relatively high among the three Triplostegia species (Table 2). FST was highest between T. grandiflora and Triplostegia sp. A (0.89533 for plastids, 0.93251 for nrDNA), followed by T. grandiflora vs. T. glandulifera (0.80408 for plastids, 0.78292 for nrDNA), and Triplostegia sp. A vs. T. glandulifera (0.77473 for plastids, 0.65421 for nrDNA). However, our phylogenetic analyses based on plastid genomes detected no correlation between the phylogenetic relatedness of samples and their geographical distribution on either the same or opposite sides of the Jinsha River, for either T. grandiflora or Triplostegia sp. A (Fig. 6).

Table 2 Genetic differentiation (FST) and gene flow (Nm) among the three Triplostegia species based on complete plastid genomes and nrDNA sequences. All the samples of each species are regarded as one population
Fig. 6

Correlations between phylogenetic relatedness and geographical distributions of samples from T. grandiflora (a) and Triplostegia sp. A (b) in the alpine-gorge region of the Great Two Bends of the Jinsha River. The phylogenetic relationships were inferred using the maximum likelihood (ML) method based on complete chloroplast genome sequences. (The maps were created by the authors using ArcGIS software)

Species discrimination based on standard DNA barcodes and highly variable cpDNA regions

In tree-based analyses, none of the standard plant DNA barcodes (rbcL, matK, trnH-psbA, and ITS), whether used singly or in combination, could successfully discriminate all three Triplostegia taxa (Table 3). However, the highly variable cpDNA region rpoB-trnC alone successfully discriminated all three species with relatively high node support values (100% for T. glandulifera, 87% for T. grandiflora, and 98% for Triplostegia sp. A), as did the ycf1 gene. The highly variable cpDNA regions ndhF and ndhF-trnN could not distinguish all three Triplostegia species, either alone or in combination. All other combinations of two to four of the regions ndhF, ndhF-trnN, rpoB-trnC, and ITS could successfully discriminate all three Triplostegia taxa, except for ndhF + ndhF-trnN and ndhF-trnN + ITS. In particular, the combination of rpoB-trnC and ITS discriminated all three Triplostegia species with maximum support values. Furthermore, all T. grandiflora samples contained an identical 66 bp insertion in their ycf1 gene, while all T. glandulifera samples contained an identical 18 bp insertion in ycf1. Distance-based analyses revealed that any of the three highly variable plastid regions or the ycf1 gene alone successfully distinguished all three Triplostegia species (Table S7).

Table 3 Tree-based species discrimination rates of Triplostegia by using highly variable plastid DNA regions, ndhF, ndhF-trnN, rpoB-trnC, and standard plant DNA barcodes, rbcL, matK, and ITS singly or in combinations

Morphological and functional traits

Except for fruit length, all of the measured morphological traits exhibited significant differences among the three Triplostegia species (Table 4). Triplostegia sp. A was the shortest plant (28.61 ± 16.28 cm), while T. grandiflora was the tallest (44.01 ± 12.56 cm). The taproot of T. grandiflora was significantly shorter but thicker than that of its two congeners. Triplostegia sp. A had serrated leaves, whereas T. glandulifera and T. grandiflora had pinnatifid and pinnatilobate leaves, respectively. The petiole was 26.55 ± 9.88 mm in T. glandulifera, 14.77 ± 5.2 mm in Triplostegia sp. A, and absent in T. grandiflora. In addition, the corolla of T. grandiflora (6.61 ± 1.11 mm) was approximately three times longer than that of T. glandulifera (2.31 ± 0.52 mm) and Triplostegia sp. A (2.24 ± 0.77 mm) (Table 4). The PCA showed that the 10 morphological traits clearly distinguished T. grandiflora from its two congeners, but the other two taxa formed overlapping clusters (Fig. 7; Table S8). Ecologically, Triplostegia sp. A typically occurred at higher elevations (3229 ± 382 m) than T. glandulifera (2284 ± 555 m) and T. grandiflora (2450 ± 548 m) (Table 4).

Table 4 Differences in elevation and morphological traits among the three Triplostegia species
Fig. 7

Principal component analysis (PCA) of 10 morphological traits of the three Triplostegia species. Morphological trait data were collected by measuring specimens of Triplostegia

Seven of the eight morphological and functional traits measured in the field showed significant differences between T. grandiflora and Triplostegia sp. A, the exception being leaf dry mass (Fig. S8-9; Table S9-10). The plants of T. grandiflora usually inhabited lower elevations (mean elevation ± SD: 2787 ± 312 m) with a narrower elevation range (2066–3128 m) compared to Triplostegia sp. A (3073 ± 306 m; elevation range: 2651–3954 m). The plant height of T. grandiflora (36 ± 12 cm) was significantly greater than that of Triplostegia sp. A (32 ± 13 cm), although Triplostegia sp. A exhibited considerable variation in plant height in northwestern Yunnan Province. Furthermore, the leaves of T. grandiflora were significantly smaller (mean leaf area of 4.78 ± 2.31 cm²) but thicker than those of Triplostegia sp. A (8.74 ± 5.47 cm²). Triplostegia grandiflora had a higher leaf chlorophyll content (47.70 ± 5.93) than Triplostegia sp. A (29.83 ± 5.67). The SLA of T. grandiflora (169.72 ± 34.85 cm²/g) was significantly smaller than that of Triplostegia sp. A (281.96 ± 72.49 cm²/g). Furthermore, the mean corolla length of T. grandiflora (6.19 ± 1.52 mm) was about 3.9 times that of Triplostegia sp. A (1.57 ± 0.25 mm) (Fig. S9; Table S9).

Species distribution modeling

The AUC values for all models in this study were > 0.99, indicating high model performance (Table S11). Precipitation of the warmest quarter (Bio18) was the most important bioclimatic variable in determining the geographical distribution of all three Triplostegia species, with a particularly strong influence on T. glandulifera. The mean temperature of the coldest quarter (Bio11) was the second most important bioclimatic variable for T. glandulifera, while both the mean temperature of the coldest quarter (Bio11) and isothermality (Bio3) were the next most important variables for both T. grandiflora and Triplostegia sp. A (Fig. S10; Table S12–14).

The potential suitable habitats for Triplostegia sp. A were similar to those of T. grandiflora during each time period, with the current predicted range largely confined to the HDM and the Himalaya. However, the potential suitable habitat for T. glandulifera was larger, extending to East Asia (Fig. S11). The projected past suitable habitats for each Triplostegia species were much smaller than their current suitable habitats, especially for T. glandulifera during the LIG and LGM. Moreover, all three Triplostegia species are projected to experience pronounced habitat shrinkage by 2090. Under the moderate (SSP2–4.5) and pessimistic (SSP5–8.5) climate change scenarios, the suitable areas for T. glandulifera are estimated to decrease by 19.4 × 10⁴ km² (19.2%) and 17.54 × 10⁴ km² (17.36%), respectively; those for T. grandiflora by 9.13 × 10⁴ km² (19.7%) and 20.79 × 10⁴ km² (44.89%); and those for Triplostegia sp. A by 6.49 × 10⁴ km² (13.94%) and 9.37 × 10⁴ km² (20.13%) (Fig. S12; Table S15).

The niche overlap between T. grandiflora and Triplostegia sp. A was the largest, followed by that between T. glandulifera and Triplostegia sp. A, while T. grandiflora and T. glandulifera showed the smallest niche overlap (Fig. 8; Table S16).

Fig. 8

Niche overlap between Triplostegia species measured using Warren’s I (a) and Schoener’s D (b) indices in different time periods and future climate change scenarios, including the Last Interglacial (LIG), Last Glacial Maximum (LGM), Mid-Holocene (MH), current, and 2090 under the SSP2–4.5 and SSP5–8.5 scenarios

Discussion

Confirmation of a third species in Triplostegia

Our phylogenetic analyses based on datasets of complete plastid genomes (Fig. 3), plastid CDS (Fig. S5), and highly variable plastid DNA regions consistently indicate that Triplostegia contains three well-supported monophyletic species: T. glandulifera (BSML = 100%, PPBI = 1.00 from complete plastid genome data), T. grandiflora (BSML = 100%, PPBI = 1.00), and an undescribed species, Triplostegia sp. A (BSML = 100%, PPBI = 1.00). Of these, T. glandulifera branched off first, making the other two sister species (Fig. 3). This topology is consistent with a previous study [59], which examined fewer accessions and did not distinguish Triplostegia sp. A (their accessions SRS3196660, SRS3196661, and SRS3196663) from T. grandiflora. Hence, molecular phylogenetic analyses may produce accurate topologies but cannot, on their own, detect cryptic species with certainty. Our Neighbor-net analysis of combined plastid and nuclear data likewise indicated the division of Triplostegia into three distinct clusters (Fig. 4). Nuclear data alone did not conflict with the plastome-based topology, but samples of T. glandulifera did not form a well-supported monophyletic clade (Fig. S6), most likely due to limited resolution from the small part of the genome sampled.

Triplostegia sp. A is clearly defined by morphology as well as by plastid data. It differs from its closest relative, T. grandiflora, in seven morphological traits (Table 4), whereas three traits (plant height, petiole length and degree of leaf division) provide consistent differences among all three species (Table 4). Ecologically, there is a clear separation by altitude, with sp. A occurring from 1800 to 4342 m, compared to 1800–3200 m for T. grandiflora and 1250–3400 m for T. glandulifera. The wider altitude range of the high-altitude sp. A is consistent with Rapoport’s Rule [103], which postulates that species at higher elevations tend to have larger elevation ranges. The differing altitude ranges might also contribute to the differing geographical ranges of the three species (Fig. 1). Although T. grandiflora and T. glandulifera are the most similar pair in altitude range, our ecological niche modeling results indicated that the greatest interspecific niche overlap was between T. grandiflora and sp. A (Fig. 8), suggesting a correlation between niche overlap and relatedness, and hence phylogenetic conservatism [104]. Niche differentiation therefore likely played a significant role in the species diversification of Triplostegia.
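
The altitude comparison above can be made explicit with simple interval arithmetic. The sketch below uses only the ranges quoted in this paragraph; the Jaccard-style similarity (shared span divided by combined span) is an illustrative choice made here, not a metric used in the original analysis.

```python
from itertools import combinations

# Altitude ranges in metres, as given in the text above.
ranges = {
    "sp. A": (1800, 4342),
    "T. grandiflora": (1800, 3200),
    "T. glandulifera": (1250, 3400),
}


def interval_jaccard(a, b):
    """Shared altitude span divided by the total span covered by either range."""
    shared = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    total = (a[1] - a[0]) + (b[1] - b[0]) - shared
    return shared / total


# Width of each species' altitude range.
widths = {name: hi - lo for name, (lo, hi) in ranges.items()}

# Pairwise similarity of altitude ranges.
similarity = {
    pair: interval_jaccard(ranges[pair[0]], ranges[pair[1]])
    for pair in combinations(ranges, 2)
}

for name, width in widths.items():
    print(f"{name}: range width {width} m")
for pair, value in similarity.items():
    print(f"{pair[0]} vs {pair[1]}: {value:.2f}")
```

Under this simple measure, sp. A has the widest range (2542 m) and T. grandiflora and T. glandulifera form the most similar pair, in line with the statements above.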

Functional trait differences between sp. A and T. grandiflora appear to be consistent with their ecological separation: a higher chlorophyll content in T. grandiflora indicates greater photosynthetic capacity [105], whereas its lower SLA (Fig. S9; Table S9) would normally indicate a resource-stressed environment [88]. T. grandiflora also has sessile leaves that are smaller but thicker, and a thicker taproot for water and nutrient storage (Table 4, Fig. S8-9; Table S9), consistent with it occupying a warmer and drier habitat.

Although sp. A occasionally coexists with T. grandiflora in Yunnan where their altitude ranges overlap, we found no morphological intermediates or other evidence of hybridization. The two species therefore appear able to maintain distinct populations even where sympatric. Based on an integrative examination of molecular, morphological and ecological data, we conclude that sp. A represents an undescribed third species within the genus.

The phylogenetic position of Triplostegia has long been controversial [106, 107], but our phylogenetic analysis of Dipsacales based on plastid genomes provides compelling evidence that Triplostegia is a monophyletic genus, sister to a clade comprising Dipsacus, Scabiosa, and Pterocephalus (Fig. 2). This is consistent with previous phylogenetic reconstructions of Dipsacales based on plastid genomes, which sampled fewer Triplostegia individuals [53, 55].

Geography, climate and causes of speciation

Recent rapid speciation is a feature of the Hengduan Mountains Region (HDM) [48, 49], thought to be driven by the uplift of the Hengduan Mountains and the late Miocene to Pliocene intensification of the Asian monsoon [46,47,48, 108, 109]. Such rapid uplifts create new niches at high altitude which newly formed species may inhabit, e.g. the homoploid hybrid species Pinus densata [110, 111]. Triplostegia sp. A and T. grandiflora represent a high/low altitude species pair in the HDM region, similar to Roscoea humeana and R. cautleoides [112]. The altitude ranges of T. glandulifera and T. grandiflora are fairly similar (Table 4), which indicates that the most recent common ancestor (MRCA) of the genus, and also the sp. A-T. grandiflora species pair, probably occupied lower altitudes. If so, the speciation event that produced sp. A might have involved an incursion into colder and/or higher altitude conditions. The timing of this split, around 1 million years ago, indicates that it might have come about due to Quaternary climate fluctuations, with one lineage adapting to cooling conditions coming out of an interglacial while the other moved to lower altitudes or latitudes, tracking the climate.

Large rivers may act as barriers to gene flow [113,114,115,116], with examples within China for animals [117, 118], fungi [119], and plants [120, 121]. However, both T. grandiflora and Triplostegia sp. A occur on both sides of the steep-sided Jinsha River gorge (Fig. 6), indicating that this river gorge is easily traversed, as found for Roscoea [112]. We observed that the glandular pubescent fruit of Triplostegia [52] can easily attach to animal fur and human clothing, potentially facilitating dispersal across the river barrier. River gorges serve as strong barriers for Vitex negundo [120] and Parrotia subaequalis [121], both of which have seeds apparently dispersed by gravity, whereas certain species in the Amazon region of South America are not affected by river barriers to dispersal [116]. These results suggest that the barrier effect of river gorges largely depends on the specific dispersal traits of plants [115]. A more comprehensive analysis of dispersal and gene flow in Triplostegia will require the sampling of more than one individual per population, however.

Plastid genome features and nucleotide diversity

Newly originated species typically have a narrow geographical range and lower levels of genetic diversity compared to more ancient and widespread congeners [122]. Consistent with this, T. glandulifera, which diverged ~ 7.94 Ma, has the widest distribution range (Fig. 1) and, according to Neighbor-net analysis (Fig. 4) and nucleotide diversity (π), exhibits much higher levels of intraspecific genetic variation than the other two species, which diverged from each other ~ 1.05 Ma. Likewise, plastid genome size varied by 1215 bp within T. glandulifera (Table 1, Fig. S1), due to expansion and contraction of the IR/SC boundary regions (Fig. S4), which is a major mechanism underlying plastid genome size variation in plants [28, 39, 123]. Length variation within T. grandiflora was only 68 bp, and this species had the least genetic variation (π) overall, consistent with it having the smallest geographical range of the three (Fig. 1), whereas Triplostegia sp. A was intermediate for both range and genetic variation. Otherwise, Triplostegia had a high level of conservation in plastid genome structure, gene order, gene content, and genome size, consistent with previous work on the genus [59] and family (Caprifoliaceae) [54, 55, 124,125,126].
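
Nucleotide diversity (π) is conventionally defined as the average number of nucleotide differences per site between all pairs of sequences in a sample. The following minimal sketch illustrates that definition for a small alignment; the function name `nucleotide_diversity`, the handling of gaps and ambiguous bases, and the toy sequences are assumptions made for illustration and may differ from the software settings used in this study.

```python
from itertools import combinations


def nucleotide_diversity(aligned_seqs):
    """Mean pairwise differences per site (pi) for aligned sequences.

    Sites where either sequence of a pair has a gap ('-') or an 'N'
    are skipped for that pair.
    """
    seqs = [s.upper() for s in aligned_seqs]
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")

    per_pair = []
    for s1, s2 in combinations(seqs, 2):
        compared = 0
        diffs = 0
        for a, b in zip(s1, s2):
            if a in "-N" or b in "-N":
                continue
            compared += 1
            diffs += a != b
        if compared:
            per_pair.append(diffs / compared)
    if not per_pair:
        raise ValueError("no comparable sites in any pair")
    return sum(per_pair) / len(per_pair)


# Toy alignment (hypothetical sequences, not data from this study).
toy = ["ACGTACGT", "ACGTACGA", "ACGAACGT"]
pi = nucleotide_diversity(toy)
print(f"pi = {pi:.4f}")
```

Computed separately for the accessions of each species, a larger π indicates more intraspecific variation, which is the sense in which the three species are compared above.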

DNA barcodes for species discrimination

The standard plant DNA barcodes, including rbcL, matK, trnH-psbA, and ITS [23], have been widely used in fields such as community ecology [78, 127], invasive species management [128], and forensic identification [129, 130]. However, they are not always effective, especially for taxa that have recently diverged or possess complex evolutionary histories [26, 27], and none of these, either singly or in combination, was able to discriminate all three Triplostegia species. In contrast, both the complete plastid genomes and the highly variable cpDNA region rpoB-trnC alone discriminated all three Triplostegia species with high bootstrap values (Table 3; Table S7). The rpoB-trnC locus is also highly variable in the plastid genomes of other plant lineages, such as Papaver [131], Dioscorea [132], and Debregeasia [133]. In addition, the ycf1 gene, which is highly variable in flowering plants [77], contained species-specific insertions of 66 bp for T. grandiflora and 18 bp for T. glandulifera, making it a powerful DNA barcode that could discriminate all three Triplostegia species (Table 3; Table S7). The specific function of the ycf1 gene remains to be explored [134, 135], but it or other plastome variation could be linked to the differences in leaf chlorophyll content between T. grandiflora and Triplostegia sp. A, and to other divergences in photosynthesis-related functions. Therefore, either rpoB-trnC or ycf1 can be used as a taxon-specific DNA barcode for discriminating Triplostegia species. Our results highlight the potential of developing taxon-specific barcodes for recently diverged taxa based on plastid genome data, an approach that has been successfully applied in many other plant taxa [42, 136].
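
The discrimination results above appear to rest on tree-based support values. As a simpler illustration of what it means for a barcode to discriminate a species, the sketch below applies a distance-based "barcoding gap" criterion instead: a species counts as discriminated when its smallest distance to any other species exceeds its largest within-species distance. This is a swapped-in technique for illustration, not the method of this study, and the function names and toy sequences are hypothetical.

```python
from itertools import combinations


def p_distance(s1, s2):
    """Proportion of differing sites between two aligned sequences (gaps skipped)."""
    pairs = [(a, b) for a, b in zip(s1, s2) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def discriminated_species(samples):
    """Return the set of species that show a barcoding gap.

    `samples` is a list of (species_label, aligned_sequence) tuples.
    """
    result = set()
    for sp in {label for label, _ in samples}:
        own = [seq for label, seq in samples if label == sp]
        other = [seq for label, seq in samples if label != sp]
        # Largest distance among accessions of the same species.
        max_intra = max(
            (p_distance(a, b) for a, b in combinations(own, 2)), default=0.0
        )
        # Smallest distance to any accession of another species.
        min_inter = min(
            (p_distance(a, b) for a in own for b in other), default=float("inf")
        )
        if min_inter > max_intra:
            result.add(sp)
    return result


# Toy barcode alignment (hypothetical labels and sequences).
toy_samples = [
    ("sp1", "ACGTACGT"),
    ("sp1", "ACGTACGA"),
    ("sp2", "TTGTACGT"),
    ("sp2", "TTGTACGT"),
    ("sp3", "ACGTACGA"),
]
print(discriminated_species(toy_samples))
```

In this toy case only sp2 is discriminated, because sp1 and sp3 share a haplotype. That is analogous to a barcode that fails to separate two closely related species.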