Background

Uncaria rhynchophylla (Miq.) Jacks is used to treat convulsion, hypertension, epilepsy, eclampsia, migraine, and cerebral diseases [1–3]. Rhynchophylline, isorhynchophylline, corynoxeine, and isocorynoxeine are the major components of U. rhynchophylla [4]. Oleanane- and ursane-type triterpenes (including uncarinic acids, ursolic acid, 3-hydroxyurs-12-en-27,28-dioic acid, hyperin, and catechin) were found in Uncaria [1, 5]. Uncaria comprises 34 species [6], 10 of which are found in the Guangxi Zhuang Autonomous Region. Among the 10 species of Uncaria in Guangxi, U. rhynchophylla and Uncaria macrophylla are the most widely and abundantly distributed [7]. Stems with hooks from several species of Uncaria, including U. rhynchophylla, U. macrophylla, Uncaria hirsuta, Uncaria sinensis, and Uncaria sessilifructus, have been used in Chinese medicine (CM) preparations and are known as Gouteng in Chinese. According to the Chinese Pharmacopeia (10th edition), only these five species of the genus Uncaria can serve as the botanical origins of Gouteng [8]. Adulterants of Gouteng include Uncaria laevigata, Uncaria lancifolia, Uncaria scandens, Uncaria rhynchophylloides, and Uncaria homomalla [7, 9], because their organoleptic characteristics are similar to those of U. rhynchophylla. However, their chemical constituents and therapeutic effects are distinct from those of U. rhynchophylla [2, 10, 11].

DNA barcoding can accurately identify species on the basis of short standardized genes or DNA regions [12, 13], without confounding factors such as environmental influence, growth phase, and morphological diversity within species [14–16]. The mitochondrial gene encoding cytochrome c oxidase subunit 1 (co1) is a potential DNA barcode in most animal species as well as some fungal species. However, the co1 gene and other mitochondrial genes from plants have limited use in identifying plant species across a wide range of taxa, due to their low genetic variations and variable mitochondrial genomes [17]. Several DNA regions, such as ITS2, psbA–trnH, matK, rbcL, ITS, ycf5, and rpoC1 [14, 18–21], have been evaluated as potential DNA barcodes in medicinal plants. Among these candidate barcoding loci, the ITS2 locus not only had the highest identification efficiency among all tested regions, but also discriminated a wide range of plant taxa [14, 22]. By contrast, ITS1 was a useful barcode for identifying Salvia species [23]. The psbA–trnH intergenic region was a suitable DNA marker for identification of flowering plants [17, 18], pteridophytes [24], Lonicera japonica Thunb from Caprifoliaceae [21], and aquatic plant species [25].

The authentication of the botanical origins of Gouteng is based on the morphological characteristics, microscopic structures, or chemical components of specimens [26]. The accuracy is often affected by environmental and subjective factors, especially for dry medicinal materials from different origins [26]. Chemical analysis methods, such as high-performance liquid chromatography (HPLC) and HPLC coupled with quadrupole time-of-flight mass spectrometry, have also been studied [27]. Multiple genetic molecular markers have been used to screen Uncaria, such as random amplified polymorphic DNA (RAPD) and rDNAs (including 5.8S rDNA, ITS1, and ITS2) [28].

This study aims to determine suitable markers for DNA barcoding of the genus Uncaria. Five candidate loci (ITS2, rbcL, psbA–trnH, ITS, and matK) were tested for their potential as DNA barcodes for Uncaria.

Methods

Plant materials

Fifty-four sequences were obtained in our laboratory (all submitted to GenBank); the material included 12 samples of six species of Uncaria (U. rhynchophylla, U. macrophylla, U. sessilifructus, U. hirsuta, U. lancifolia, and U. homomalla) that are used as Gouteng in CM markets. The samples were collected in 2009 and 2010 by Professor Ruisong Huang from areas in Guangxi Province, including Rongshui, Sanjiang, Shanglin, Ningming, and Jinxi counties, Sitang town in Nanning, and the Guangxi Medicinal Botanical Garden. The plant species were identified by Shouyang Liu, Yiling Zhu, and Kejian Yan through morphological characteristics and analysis of microscopic structures [7, 10]. All of the voucher specimens (voucher numbers are listed in Table 1) were deposited in the Key Laboratory of Biological Molecular Medicine Research of Guangxi Higher Education, Guangxi Medical University.

Table 1 Uncaria information used in this study

In total, 257 accessions related to the five candidate DNA barcoding loci (ITS2, rbcL, psbA–trnH, ITS, and matK) from 89 samples belonging to 15 species of Uncaria were analyzed in this study. All accession data were downloaded from GenBank, except for the above 54 sequences, which were amplified and sequenced in our laboratory. All datasets of Uncaria species used in the study contained more than two samples, except for Uncaria africana, Uncaria guianensis, and Uncaria lanosa. Accessions whose sequences contained undetermined bases or came from "sp." entries (taxa whose species was unclear or unnamed) were not selected. The correctness of the accessions downloaded from GenBank was tested by BLAST comparison with sequences from congeneric plants. Only sequences with both a similarity ratio and a query cover ratio higher than 90 % within the same species were selected. However, some accessions containing inversion sequences were retained in the dataset because they could influence the sequence divergence and supply some important genetic characters [29]. The total data and sample information used in this study are shown in Table 1.
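The selection rules described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the `Accession` record, its field names, and the idea that the BLAST similarity and query cover have already been parsed into numbers are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Accession:
    """Hypothetical record for one GenBank accession (field names are assumptions)."""
    accession_id: str
    species: str
    sequence: str
    similarity: float   # percent identity of the best same-species BLAST hit
    query_cover: float  # percent query coverage of that hit


def is_unnamed(species: str) -> bool:
    """True for entries such as 'Uncaria sp.' whose species is unclear or unnamed."""
    tokens = species.lower().replace(".", " ").split()
    return any(token in {"sp", "spp", "aff"} for token in tokens)


def has_undetermined_bases(sequence: str) -> bool:
    """True if any base is something other than A, C, G, or T (for example N)."""
    return any(base not in "ACGT" for base in sequence.upper())


def keep(acc: Accession, threshold: float = 90.0) -> bool:
    """Apply the three filters: named species, determined bases, BLAST ratios above 90 %."""
    return (
        not is_unnamed(acc.species)
        and not has_undetermined_bases(acc.sequence)
        and acc.similarity > threshold
        and acc.query_cover > threshold
    )
```

A list of records could then be reduced with `[a for a in accessions if keep(a)]`. Accessions with inversions pass this filter unchanged, consistent with the decision to retain them.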

DNA extraction, PCR amplification, and sequencing

In this study, genomic DNA was extracted from the freshly dried leaves of Uncaria plants by the improved protocol of a new rapid Plant Genomic DNA Extraction Kit (centrifugal column type, DP3112; Bioteke Corporation, Beijing, China). The Uncaria leaves were ground in liquid nitrogen, and the cell nuclear separation solution (3 ml for 0.5 g sample) was immediately added to the samples to remove impurities from the cytoplasm before the cell nuclei were lysed [30]. PCR amplification of the five candidate DNA barcode sites was performed in a Tprofessional Gradient 96 Type (Biometra, Göttingen, Germany) with approximately 30 ng of genomic DNA as a template in a 25-µL reaction mixture. Each reaction contained 1 × PCR buffer (2.0 mM MgCl2, 0.2 mM each dNTP, 0.1 µM each primer; synthesized by Sangon Biotech, Co., Ltd., Shanghai, China), and 1.0 U Taq DNA polymerase (TaKaRa Biotechnology Co., Ltd., Dalian, China). The primers and reaction conditions used were the same as those used by Chen et al. [14]. The PCR products were electrophoresed in a 1.5 % agarose gel in 1 × TAE buffer, then purified with a TIANGel Midi Purification Kit (Tiangen Biotech Co. Ltd, Beijing, China). The purified PCR products were bidirectionally sequenced with appropriate amplification primers (Additional file 1) in an ABI-PRISM3730 instrument (Thermo Fisher Scientific, MA, USA) by Sangon Biotech, Co., Ltd., Shanghai, China.

Sequence alignment and data analysis

Sequence assembly and consensus sequence generation were performed with CodonCode Aligner 3.7.1 (CodonCode Co., MA, USA) after trimming the low-quality sequence and primer areas. The matK and rbcL regions were delimited by alignment with known sequences in databases using CodonCode Aligner. After removal of the psbA and trnH genes at the ends of psbA–trnH, the boundary of the psbA–trnH intergenic spacer was determined according to the annotations of similar sequences in GenBank. The five candidate DNA barcodes were aligned by ClustalW (EMBL-EBI, Heidelberg, Germany). Kimura 2-Parameter (K2P) genetic distances were computed with Molecular Evolutionary Genetics Analysis 6.0 (MEGA6; The Biodesign Institute, AZ, USA) [31]. All interspecific and intraspecific distances, including theta prime, minimum interspecific distance, theta, and coalescent depth for all accessions of each locus, were calculated under the K2P model and compared to evaluate the interspecific divergence and intraspecific variation. Meanwhile, the distributions of genetic distances among sequences were compared with the Wilcoxon signed-rank test to assess the barcoding gap for the candidate loci, using SPSS software (SPSS 16.0: International Business Machines Corporation Statistical Product and Service Solutions, Armonk, New York, USA); the test statistics W+ and W− were calculated for a two-sided test, as described previously [14, 22]. The BLAST1 and nearest distance methods were used to evaluate the species identification efficiency [32, 33].
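For readers unfamiliar with the K2P model, the pairwise distance is d = −½ ln(1 − 2P − Q) − ¼ ln(1 − 2Q), where P and Q are the proportions of transitional and transversional differences between two aligned sequences. The sketch below is a minimal stand-alone illustration, not MEGA6's implementation; skipping gapped or ambiguous sites pair by pair is an assumption of this example.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura 2-parameter distance between two aligned sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Assumption: sites with a gap or ambiguous base in either sequence are skipped.
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent for the model.
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)
```

Applying this function to every pair of accessions at a locus yields the distance matrix from which intraspecific and interspecific distances are then separated.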

Results

PCR amplification and base composition of the five loci of Uncaria

The sequence length and GC content of the five candidate loci (ITS2, rbcL, psbA–trnH, ITS, and matK) were obtained from the CodonCode Aligner and ClustalW alignment results (Table 2). The GC content of psbA–trnH was the lowest, while that of ITS2 was the highest. The variability of the length range of the psbA–trnH intergenic spacer was greater than that of the other candidates. The psbA–trnH region of U. macrophylla was more divergent than that of the other Uncaria plants.
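The per-locus summary of length and GC content is simple to reproduce. The sketch below is only an illustration of the calculation; the function names and the dictionary keys are assumptions, and the study itself took these values from CodonCode Aligner and ClustalW output.

```python
def gc_content(sequence: str) -> float:
    """Percentage of G and C among the determined bases of one sequence."""
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence has no determined bases")
    return 100.0 * sum(b in "GC" for b in bases) / len(bases)


def summarize_locus(sequences: list[str]) -> dict[str, float]:
    """Length range, mean length, and mean GC content for one locus."""
    if not sequences:
        raise ValueError("no sequences given")
    # Alignment gaps are removed before measuring length.
    lengths = [len(s.replace("-", "")) for s in sequences]
    return {
        "min_len": min(lengths),
        "max_len": max(lengths),
        "mean_len": sum(lengths) / len(lengths),
        "mean_gc": sum(gc_content(s) for s in sequences) / len(sequences),
    }
```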

Table 2 Analysis of the five candidate barcode loci of Uncaria

Genetic interspecific divergence and intraspecific variation

Six parameters (Table 3) represented the genetic divergences of species in Uncaria. In a comparison of the intraspecific distances of the five candidate barcodes among Uncaria species, the intraspecific distance of psbA–trnH was higher than that of the other loci at the species level. Meanwhile, the psbA–trnH intergenic spacer exhibited the highest interspecific divergence according to the interspecific distance, theta prime, and minimum interspecific distance. The interspecific distance of ITS2 was the second highest after psbA–trnH. The interspecific divergences of ITS2, psbA–trnH, and ITS were all much higher than the corresponding intraspecific divergences. Furthermore, the overall mean distance of psbA–trnH was the highest among the five loci (Fig. 1).

Table 3 Calculation of interspecific and intraspecific divergences for Uncaria
Fig. 1 Distribution of overall mean distance for all sequence pairs among the five loci. The number on the right y axis is the estimated average evolutionary divergence over all sequence pairs for each locus, expressed as the number of base substitutions per site. Analyses were conducted by the maximum composite likelihood method in MEGA6 [31]

The psbA–trnH intergenic spacer had the highest interspecific divergence among all the loci based on the Wilcoxon signed-rank test, followed by ITS2. The interspecific divergence of psbA–trnH was higher than that of ITS2, ITS, matK, and rbcL (all P < 0.001), and that of ITS2 was higher than that of ITS, matK, and rbcL (all P < 0.001, Table 4). Furthermore, the intraspecific divergences of ITS and matK, rbcL and matK, ITS2 and matK, psbA–trnH and matK, and ITS and rbcL did not differ significantly (P > 0.05, Table 5).
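The W+ and W− statistics reported in Tables 4 and 5 are the sums of ranks of the positive and negative paired differences. The sketch below shows how these two sums are obtained for paired distance values from two loci; it is a simplified illustration rather than the SPSS procedure used in the study, it does not compute a P value, and the function name and the exact-equality treatment of ties are assumptions.

```python
def signed_rank_sums(x: list[float], y: list[float]) -> tuple[float, float]:
    """Return (W+, W-) for paired samples x and y."""
    if len(x) != len(y):
        raise ValueError("samples must be paired")
    # Pairs with zero difference are dropped, as in the classical test.
    diffs = [a - b for a, b in zip(x, y) if a != b]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied absolute differences.
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus
```

A W+ much larger than W− indicates that the first locus tends to give larger distances than the second for the same sequence pairs.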

Table 4 Wilcoxon signed-rank test for interspecific divergences
Table 5 Wilcoxon signed-rank test for intraspecific divergences

Analysis of barcoding gaps

For a locus to serve as a barcode for identifying botanical species, the divergence between species should be higher than the variation within species [34]. Although the histogram of K2P genetic distances showed that the intraspecific and interspecific divergences of ITS2 and psbA–trnH partially overlapped (Fig. 2), the intraspecific variation of psbA–trnH and ITS2 was considerably lower than their interspecific divergence. The genetic divergence distribution of ITS was similar to that of ITS2. No clear “barcoding gap” was observed for the rbcL or matK loci, for which the genetic distance of more than 90 % of accessions was less than 0.020. However, the distribution of the interspecific divergence of ITS2 and psbA–trnH provided a better resolution than that of rbcL and matK.
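The barcoding-gap criterion can be made concrete with a small sketch that separates a pairwise distance matrix into intraspecific and interspecific values and compares their extremes. This is an illustrative simplification of the histogram-based analysis; the function names and the strict "minimum interspecific greater than maximum intraspecific" rule are assumptions of this example.

```python
from itertools import combinations


def split_distances(species: list[str], dist: list[list[float]]):
    """Split pairwise distances into intraspecific and interspecific lists."""
    intra, inter = [], []
    for i, j in combinations(range(len(species)), 2):
        (intra if species[i] == species[j] else inter).append(dist[i][j])
    return intra, inter


def barcoding_gap(species: list[str], dist: list[list[float]]) -> dict:
    """Report the largest intraspecific and smallest interspecific distance."""
    intra, inter = split_distances(species, dist)
    if not intra or not inter:
        raise ValueError("need both intraspecific and interspecific pairs")
    return {
        "max_intra": max(intra),
        "min_inter": min(inter),
        # A strict gap exists only if the two distributions do not overlap.
        "gap": min(inter) > max(intra),
    }
```

A partial overlap such as that seen for ITS2 and psbA–trnH would give `gap` equal to `False` under this strict rule even though most intraspecific distances are far below the interspecific ones, which is why the full distributions are inspected as histograms.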

Fig. 2 Distribution of interspecific and intraspecific genetic distances for the five candidate barcoding loci among 257 accessions. a ITS2; b rbcL; c psbA–trnH; d ITS; e matK. x axis, genetic distance; y axis, distribution of genetic divergence

Identification efficiency and characteristics of Clustal W alignment

The BLAST and nearest distance methods were employed to test the applicability of the five loci for species identification of Uncaria. psbA–trnH presented 95.9 % identification efficiency with both the BLAST and nearest distance methods at the species or genus level. ITS2 exhibited 92.2 % identification efficiency by the nearest distance method and 87 % by the BLAST method, whereas rbcL showed only 76.2 % by the nearest distance method and 42.9 % by the BLAST method (Table 2). Meanwhile, psbA–trnH of U. macrophylla exhibited more distinctive characteristics than that of U. rhynchophylla and the other species tested (Figs. 3, 4, 5). Two insertion fragments existed in the psbA–trnH sequence of U. macrophylla: a run of seven A bases at 171–177 bp, and a double repeat of “ATTAAA” at 234–247 bp. The psbA–trnH intergenic spacer can therefore be used as a barcode for the identification of Uncaria plants. The phylogeny of Uncaria ITS2 (computed model: Maximum Composite Likelihood) [31] showed that only four accessions (4/77 accessions) were in an incorrect taxonomic category (Fig. 6), fewer than for the other loci tested. Thus, ITS2 could be another suitable DNA barcode for Uncaria.
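The idea behind the nearest distance method can be sketched as follows: each accession is treated as a query, its closest other accession is found in the distance matrix, and the identification counts as correct when that neighbor belongs to the same species. This is a simplified illustration and not the exact procedure of the cited methods; in particular, counting ties between species as failures is an assumption of this example.

```python
def nearest_distance_efficiency(species: list[str], dist: list[list[float]]) -> float:
    """Percentage of accessions whose nearest neighbor(s) all share their species."""
    n = len(species)
    if n < 2:
        raise ValueError("need at least two accessions")
    correct = 0
    for i in range(n):
        others = [j for j in range(n) if j != i]
        best = min(dist[i][j] for j in others)
        # Species of every accession tied for the smallest distance.
        nearest_species = {species[j] for j in others if dist[i][j] == best}
        # Assumption: a tie that involves another species is counted as a failure.
        if nearest_species == {species[i]}:
            correct += 1
    return 100.0 * correct / n
```

Under this scheme a species represented by a single accession can never be identified correctly, which is one reason why datasets with more than two samples per species are preferable.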

Fig. 3 ClustalW alignment of psbA–trnH of Uncaria plants. Identical positions are shown as dots and indels as dashes. The red box marks a seven-A repeat inserted at 171–177 bp, which distinguishes U. macrophylla from U. rhynchophylla and other Uncaria species

Fig. 4 ClustalW alignment of psbA–trnH of Uncaria plants. Identical positions are shown as dots and indels as dashes. The red box marks a cis-repeat ATTAAA insertion at 234–239 bp, which distinguishes U. macrophylla from U. rhynchophylla and other Uncaria species

Fig. 5 ClustalW alignment of psbA–trnH of Uncaria plants. Identical positions are shown as dots and indels as dashes. The red box marks a cis-repeat ATTAAA insertion at 241–247 bp, which distinguishes U. macrophylla from U. rhynchophylla and other Uncaria species

Fig. 6 Phylogenetic tree of Uncaria ITS2. The evolutionary history was inferred using the neighbor-joining method, and the evolutionary distances were computed using the maximum composite likelihood model. Only the four accessions labeled with triangular, square, or circular symbols were placed in an incorrect taxonomic category

Discussion

Significance of authentication of Uncaria by DNA barcoding

Gouteng is commonly used as the major ingredient herb of CM prescriptions for hypertension or migraine treatment [2, 35]. The demand for stems with hooks of U. rhynchophylla (Gouteng) in traditional clinical practice and pharmaceutical production has increased, while the natural resources of U. rhynchophylla, U. hirsuta, and U. macrophylla, which can serve as botanical origins of Gouteng, have become limited as collection has increased. Other species of the genus Uncaria, such as U. laevigata, U. lancifolia, and U. scandens, are often collected to adulterate Gouteng [7]. Therefore, the correct genotypic identification of Uncaria plant material is essential for protecting public health and for industrial production.

Although some methods have been developed to distinguish Uncaria plants based on morphotype, microcharacter, or physical and chemical reactions [8, 9], these depend on taxonomy experts. Currently, the reported genetic molecular markers for the genus Uncaria involve RAPD, rDNA, and ITS, while DNA barcoding assays have not yet been reported. This study included 11 of the 12 species of Uncaria in China; only U. rhynchophylloides was missing from the screen for suitable DNA barcodes for Uncaria.

In the present study, psbA–trnH presented 95.9 % identification efficiency for the Uncaria accessions tested with both the BLAST and nearest distance methods at the species or genus level. ITS2 also exhibited high identification efficiency, at 92.2 % with the nearest distance method and 87 % with the BLAST method.

Quality and amplification efficiency of DNA from Uncaria

DNA is difficult to extract efficiently from Uncaria because of the large amounts of polysaccharides, polyphenols, and alkaloids present in the samples. A cell nuclear separation solution was therefore used to remove the impurities from genomic DNA [30]. The quality of the DNA extracted from the Uncaria plants satisfied the requirements for PCR amplification and sequencing. The efficiency of both PCR amplification and sequencing for psbA–trnH was the highest among the five candidate loci. Specifically, PCR amplification showed 96.7 % efficiency, while sequencing showed 100 % efficiency. Because the average GC content of ITS2 was 66.3 %, which was higher than that of the other loci, the ITS2 region was slightly more difficult to amplify.

Selection of candidate DNA barcodes

In this study, the length of psbA–trnH of Uncaria ranged from 235 to 315 bp (mean 287 bp), which was longer than that of ITS2, but shorter than that of rbcL, ITS, and matK. Additionally, psbA–trnH of Uncaria exhibited the highest interspecific divergence among the five loci tested, based on the six parameters of the K2P model and the Wilcoxon signed-rank test of interspecific divergence. The interspecific divergence of psbA–trnH was higher than the corresponding intraspecific variation. Furthermore, psbA–trnH of U. macrophylla was clearly distinct from that of U. rhynchophylla and the other species because of two insertion fragments: one was a seven-A repeat inserted at 171–177 bp and the other was two cis-repeats of ATTAAA at 233–247 bp (Figs. 3, 4, 5). Although one TAAAAAA repeat was observed at 171–177 bp in psbA–trnH from Uncaria yunnanensis, no double cis-repeats of ATTAAA were observed at 233–247 bp. Meanwhile, one inversion sequence of length 73–74 bp with identity ratios of more than 98 % was found in psbA–trnH of Uncaria in this study (Additional file 2). The intragenic variation of the genus Uncaria was large because of this inversion in psbA–trnH. This situation was also observed in psbA–trnH of Aconitum L. [29]. The characteristics of the insertion sequences in psbA–trnH could effectively authenticate Uncaria species.
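An inversion of the kind described above can be recognized by comparing one segment with the reverse complement of the corresponding segment in another accession. The sketch below illustrates that check; it assumes that the reported identity of more than 98 % refers to this reverse-complement comparison, and the function names and the 98 % default threshold are choices made for this example only.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence: str) -> str:
    """Reverse complement of a DNA sequence made of A, C, G, and T."""
    return sequence.upper().translate(COMPLEMENT)[::-1]


def identity(seq_a: str, seq_b: str) -> float:
    """Percent identity of two equal-length segments, position by position."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("segments must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def is_inverted(segment_a: str, segment_b: str, threshold: float = 98.0) -> bool:
    """True if segment_b matches the reverse complement of segment_a.

    The inverted comparison must reach the threshold and also beat the direct
    comparison, so that palindromic segments are not reported as inversions.
    """
    inverted = identity(reverse_complement(segment_a), segment_b)
    direct = identity(segment_a, segment_b)
    return inverted >= threshold and inverted > direct
```

Because an unrecognized inversion inflates pairwise distances, flagging such segments helps explain unusually large variation within psbA–trnH.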

ITS2 was another suitable locus for distinguishing different species of Uncaria. The length range of ITS2 was 210–221 bp (mean 219.9 bp), which was the shortest among the five loci. Consequently, 95.8 % efficiency could be reached by PCR amplification. In a comparison of the interspecific genetic distances of the five candidate barcodes among Uncaria species, the mean interspecific distance of ITS2 was higher than its mean intraspecific divergence, and the values were second only to those of psbA–trnH (Table 3). Based on the phylogenetic analysis of ITS2 by the neighbor-joining method, with evolutionary distances computed by the Maximum Composite Likelihood model, more than 93 % of the Uncaria accessions in this study were grouped into monophyletic clades corresponding to recognized species. Among 77 accessions of ITS2, comprising 14 species of Uncaria, only four accessions were in an incorrect taxonomic category according to the phylogenetic tree for ITS2 (Fig. 6). Uncaria manifested complex morphological features and genetic backgrounds, and even some specimens with obvious differences in appearance possessed similar ITS sequences [28]. This could explain why some accessions appeared in clades different from their original morphological taxa. Some species submitted to GenBank may also have been wrongly categorized. Sequences with lengths of less than 100 bp, those with ambiguous bases containing more than one “N”, or those belonging to unnamed species (such as those with spp. and aff. in the species name) were excluded [20] from this study to guarantee the reliability of the selected sequences.

A better “barcoding gap” was observed between the interspecific divergence and intraspecific variation of ITS2 compared with the other loci. ITS, which contained three fragments (ITS1, 5.8S rDNA, ITS2), exhibited a similar identification efficiency to that of ITS2. Both rbcL and matK were unsuitable genetic loci for authentication of the botanical origins of Gouteng, because no clear barcoding gap was found between their interspecific divergence and intraspecific variation under the K2P model. The overall mean distance of rbcL was only 0.002 and that of matK was 0.005, as computed by the Maximum Composite Likelihood model (Fig. 1). Moreover, we found that the combination of psbA–trnH with ITS2 would provide a better result for the authentication of Uncaria plants, and could even distinguish between incorrect and correct taxa or identify some cryptic species. Currently, a preliminary system for DNA barcoding of herbal materials has been established based on a two-locus combination of ITS2 and psbA–trnH barcodes [36]. Recently, ITS2 was successfully exploited in a survey involving commercial Rhodiola products, including decoction pieces [37].

psbA–trnH and ITS2 also exhibited high authentication power for different species of Uncaria. Both psbA–trnH and ITS2 revealed the distinct divergence of U. macrophylla from U. rhynchophylla and the other species at the species level.

Conclusion

While psbA–trnH and ITS2 (used alone) were applicable barcodes for species authentication of Uncaria, psbA–trnH was a more suitable barcode for authentication of U. macrophylla.