Transcriptomic and evolutionary analysis of the mechanisms by which P. argentatum, a rubber producing perennial, responds to drought
Guayule (Parthenium argentatum Gray) is a drought tolerant, rubber producing perennial shrub native to northern Mexico and the US Southwest. Hevea brasiliensis, currently the world’s only source of natural rubber, is grown as a monoculture, leaving it vulnerable to both biotic and abiotic stressors. Isolation of rubber from guayule occurs by mechanical harvesting of the entire plant. It has been reported that environmental conditions leading up to harvest have a profound impact on rubber yield. The link between rubber biosynthesis and drought, a common environmental condition in guayule’s native habitat, is currently unclear.
We took a transcriptomic and comparative genomic approach to determine how drought impacts rubber biosynthesis in guayule. We compared transcriptional profiles of stem tissue, the location of guayule rubber biosynthesis, collected from field-grown plants subjected to water-deficit (drought) and well-watered (control) conditions. Plants subjected to the imposed drought conditions displayed an increase in production of transcripts associated with defense responses and water homeostasis, and a decrease in transcripts associated with rubber biosynthesis. An evolutionary and comparative analysis of stress-response transcripts suggests that more anciently duplicated transcripts shared among the Asteraceae, rather than recently derived duplicates, are contributing to the drought response observed in guayule. In addition, we identified several deeply conserved long non-coding RNAs (lncRNAs) containing microRNA binding motifs. One lncRNA in particular, with origins at the base of Asteraceae, may be regulating the vegetative to reproductive transition observed in water-stressed guayule by acting as a miRNA sponge for miR166.
These data represent the first genomic analyses of how guayule responds to drought like conditions in agricultural production settings. We identified an inverse relationship between stress-responsive transcripts and those associated with precursor pathways to rubber biosynthesis suggesting a physiological trade-off between maintaining homeostasis and plant productivity. We also identify a number of regulators of abiotic responses, including transcription factors and lncRNAs, that are strong candidates for future projects aimed at modulating rubber biosynthesis under water-limiting conditions common to guayules’ native production environment.
KeywordsGuayule Transcriptome Drought stress Rubber biosynthesis lncRNAs Comparative genomic
Hidden Treasure 1
Late elongated hypocotyl
long non-coding RNA
phased, secondary, small interfering RNAs.
Plasma membrane intrinsic protein
transcript per kilobase million
United States Dollars
Natural rubber is a crucial material with a myriad of uses and applications, making it invaluable to a wide range of industries, and contributing to its economic footprint of ~ 12.7 billion USD (DESA/UNSD). Natural rubber production, which is predominantly sourced from the rubber tree (Hevea brasiliensis), is currently threatened posing socioeconomic risks to industries relying on it as raw material (1). Because the species is clonally propagated and is grown as a geographically concentrated monoculture, it is vulnerable to diseases such as South American leaf blight (Microcyclus ulei), a fungal pathogen endemic to Hevea’s center of origin in the Amazon (2). Due to these growing concerns for the future stability of Hevea populations, scientists have continually searched for alternative sources of natural rubber (3,4). One such species, guayule (Parthenium argentatum A. Gray), has already been shown to be an attractive source of natural rubber that may be able to help address projected future shortages (3, 4, 5, 6).
Guayule grows throughout northern Mexico and much of the American southwest and thus is naturally adapted to arid environments (6, 7). Because of this, producers first considered guayule as an alternative source of natural rubber in the early 1900’s. Subsequent utilization of guayule as a rubber source has progressed through multiple “boom and bust” phases largely influenced by world markets and import costs surrounding rubber from H. brasiliensis (6). Due to over a century of sporadic but intense efforts to harness guayule’s rubber producing potential, it is now understood that the crop has practical advantages over Hevea; for example, as a hypoallergenic alternative for those that have adverse reactions to latex rubber (8).
Crop scientists are now aware of unique challenges that guayule poses, particularly from a plant breeding perspective. Indeed, genetic improvement of guayule is complicated because the species has two different modes of reproduction and is able to exist as either facultatively apomictic, polyploid individuals or as sporophytic, self-incompatible diploid individuals (9, 10, 11, 12). Due to this inherent biological complexity, a modern breeding approach that uses molecular techniques in tandem with traditional phenotypic selection may be the most effective way to increase the rate of genetic gain in the crop, particularly under stress conditions (13).
While the biological purpose for rubber biosynthesis and accumulation in the plant is unclear, its production is believed to be linked to the way the plant responds to abiotic stressors such as drought and temperature (14, 15, 16, 17). Abiotic stress has been shown to elicit a dramatic and highly tissue specific reprogramming of the transcriptional profile in many plant systems (18). For instance, reproductive tissue in maize exhibits down regulation of genes associated with cell division and DNA replication during drought exposure, consistent with the observed delay in ear growth (19). In sunflower (Helianthus annuus), a naturally drought tolerant relative of guayule, exposure to drought conditions resulted in elevated levels of genes associated with osmotic adjustment in leaf and vasculature tissue (20), as well as decreases in genes associated with oil metabolism in the seed (21). While guayule exhibits similar drought tolerant characteristics as sunflower, it is unclear if it responds to stress in a similar molecular manner.
One critical, but until recently, overlooked aspect of the stress response in plants lies in the non-coding RNAs (microRNAs and long non-coding RNAs) that help sense and regulate the response to stress (22, 23, 24). These non-coding RNAs act at the pre- and post-transcriptional level to modulate expression and activity of other genes necessary for the stress response (25). Due to their highly specific expression patterns and species specificity sequence conservation, these transcript classes are important targets for understanding the unique ways in which plants have evolved to respond to changes in their environment (26, 27, 28). As such, all aspects of the transcriptome must be examined to fully understand the link between rubber biosynthesis and drought stress in guayule.
To uncover the molecular mechanisms that facilitate the drought response in guayule, we used a transcriptomic approach to identify differentially expressed transcripts between plants grown under both well-watered and water limited conditions. We used a phylogenetic approach to gain some insight into whether recent or more ancient gene duplications were contributing to the observed stress response. Finally, we uncovered a number of stress-responsive, long non-coding RNAs, several of which harbor conserved miRNA binding motifs, including two miRNAs with known roles in flowering and drought responses. These lncRNAs add an additional layer of regulatory complexity to the guayule drought response. Thus, we present a first glimpse at how guayule responds to drought and offer some molecular targets for plant breeders wishing to study the trade-off between rubber biosynthesis and water conservation.
Examining the impact of drought at the transcriptome-wide level in guayule
Differentially expressed genes were identified by comparing the I25% irrigation treatment to the I100%. Of the 63,672 transcripts, 42,711 were expressed (minimum of 0.5 TPM in all replicates) in the control conditions and 43,002 in the samples grown under the limited water. Of these, 251 transcripts were upregulated under the water-limited irrigation regime whereas 393 were downregulated (Fig. 2b and Additional file 3: Table S2; adjusted p-value of 0.01). The transcript most significantly upregulated in the water-limited treatment, GFTW01080018.1 (Fig. 2c), was expressed 23-fold compared to the control treatment (~ 9 –fold increase observed with qRT-PCR, Additional file 4: Figure S2). In contrast, the transcript most significantly downregulated, GFTW01080137.1 (Fig. 2d), was reduced more than 200-fold to near imperceptible detection levels, a value confirmed by qRT-PCR (Additional file 4: Figure S2).
Next, the most differentially expressed transcripts were assessed. The most significant, highly upregulated transcript, GFTW01080018.1, appears to be orthologous to the Arabidopsis PIP2s (specifically PIP2A, B, and C; Additional file 6: Figure S3), a family of aquaporins important for hydraulic regulation (34). Despite the recovery of numerous PIP2 paralogs in the genomes of Helianthus annuus and Lactuca sativa, two close relatives of guayule within the Asteraceae (35); Additional file 6: Figure S3), and three paralogs in the guayule transcriptome, only one aquaporin was differentially expressed in response to water deficit (I25%). The most significantly down-regulated transcript, GFTW01080137.1, shares sequence similarity to Arabidopsis Cold Regulated Gene 27 (COR27; AT5G42900). Interestingly, in Arabidopsis, COR27 and another cold regulated gene with little sequence similarity, COR28, are positive regulators of flowering (36). In guayule, putative orthologs for both COR27 and COR28 (GFTW01080137.1 and GFTW01127972.1, respectively) are both significantly repressed under water limited conditions, despite the near uniform flowering that was observed for these plants (Additional file 1: Figure S1). Finally, GFTW01028919.1, the transcript that displayed the greatest decrease in transcription (although not the most significant), at > 900-fold (adjusted p-value <2E-12; Additional file 7: Figure S4) is a putative ortholog of Arabidopsis Terpene Synthase 3 (AT4G16740) and is one of 12 downregulated guayule transcripts involved in isoprenoid/terpenoid biosynthesis (Fig. 3). In sum, guayule’s transcriptomic response to water-limited conditions includes a dramatic increase in aquaporin production and defense response genes, as well as a decrease in terpenoid biosynthesis, carbohydrate metabolism, and oxidation reduction mechanisms.
Examining the evolutionary history of duplicated drought-responsive transcripts
The GO-term analysis revealed that some of the differentially expressed guayule transcripts displayed similarity to the same Arabidopsis gene, suggesting one of three possibilities: 1) an ancient expansion in a stress-responsive gene family, 2) that the transcripts are paralogs that emerged following the cross-hybridization and polyploidy event that gave rise to AZ-3, or 3) that the transcripts contain the same functional domain but bear no phylogenetic relationship. Specifically, 127 guayule stress-responsive transcripts clustered, in sets of 2–4 transcripts each, with 56 Arabidopsis genes. For example, the downregulated guayule terpene synthase ortholog (GFTW01028919.1) groups with AT4G16740 along with two other guayule transcripts (GFTW01072004.1 and GFTW01017460.1). We first determined if the guayule transcripts were indeed the product of a gene duplication by examining codon-guided multiple sequence alignments. Transcripts associated with roughly half (n = 27) of the Arabidopsis gene clusters either did not share a recent evolutionary past (sequence identity < 50%) or there was not enough evidence to support a gene duplication (e.g., guayule gene fragments that did not overlap one another in the alignment). The three guayule transcripts within the terpene synthase cluster with AT4G16740 shared sufficient sequence similarity to proceed forward to phylogenetic analysis, whereas three guayule transcripts that shared similarity with an Arabidopsis mitogen-activated protein kinase (MAPK16, AT5G19010) exhibited little to no similarity outside of the kinase domain and were not considered further.
Duplication and expression do not necessarily imply that the resulting transcript is capable of encoding for a protein. Pseudogenization or neo-functionalization of a locus (protein-coding gene - > long non-coding RNA) can occur through the disruption of a protein-coding gene’s open reading frame (ORF). We examined each of the gene clusters for loss of ORF integrity in at least one (but not all) of the duplicate guayule transcripts. We found that 6/20 of the down-regulated gene clusters had experienced a pseudogenization event that left them with a single protein-coding gene, whereas 7/9 up-regulated gene clusters were left with a single protein-coding transcript (Fig. 4c, tan bars). Thus, it appears that a number of stress-responsive paralogs with intact ORFs have been retained through multiple speciation events, suggesting they may help guayule mount a response to drought conditions.
A role for long non-coding RNAs in guayule’s drought response
To identify putative lncRNAs, we focused on the set of differentially expressed transcripts that bore no similarity to any known protein domains (Fig. 5b). We then removed potential transposable elements (TEs) and known housekeeping RNAs (rRNAs and spliceosomal RNAs). To be conservative in our lncRNA identification, we also removed any transcripts that overlapped a protein-coding gene in the H. annuus genome, as these guayule transcripts may reflect incompletely assembled protein-coding genes resulting from technical difficulties of de novo transcriptome assembly. Following these filters, we recovered 31 putative lncRNAs that were down-regulated and 39 that were up-regulated in response to drought (a complete list can be found in Additional file 8: Table S4).
We then took an evolutionary approach to identify putative lncRNAs for which we could recover sequence homologs in other species under the premise that conservation implies functionality (26). Of the 70 guayule putative lncRNAs, we identified a sequence homolog for 14 in the sunflower genome (Fig. 5c). We uncovered evidence of conservation for three lncRNAs in the lettuce genome, suggesting that these loci emerged at least ~ 39 million years ago. Four of the fourteen sunflower conserved lncRNAs were also annotated as lncRNAs in that system, with one also annotated as a lncRNA in lettuce, lending additional confidence in their lncRNA designation (Fig. 5c).
Next, an attempt to assign a function to these putative lncRNAs beyond “stress-responsive” was made. Our experimental design lacked depth to attempt a “guilt-by-association” analysis, and the absence of a guayule genome precludes the association between a lncRNA and the neighboring protein-coding gene it might regulate. Therefore, we focused on whether the set of guayule lncRNAs might be involved in sequestering miRNAs away from their intended targets, or in miRNA or phasiRNA, biogenesis. Using psRNAtarget (46), we predicted whether miRNAs might bind to the 14 lncRNAs for which we identified sequence homologs in sunflower. We then scanned the homologous locus in sunflower (and in lettuce) for conservation of the miRNA binding site. Using this approach we identified six lncRNAs with conserved miRNA binding sites (Fig. 5c; Additional file 8: Table S4). One of the guayule lncRNAs conserved and annotated as a lncRNA in both sunflower and lettuce, GFTW01168370.1, harbors a completely conserved binding site for miR166 (Fig. 5d), a microRNA associated with tissue development and whose knockdown in Arabidopsis leads to an enhanced drought response (47). As a miRNA sponge, GFTW01168370.1 would act to recruit miR166 away from its intended target, in short mimicking the knockdown response reported in Arabidopsis. Thus, within the dataset of drought-responsive transcripts, a subset were identified that showed the hallmarks of being lncRNAs. Several of these lncRNAs contain conserved miRNA binding sites, with one in particular likely helping to mediate the guayule drought response.
Transcriptome analyses reveal a suite of drought-responsive genes in guayule
As a perennial shrub native to the American Southwest and northern Mexico, guayule is well adapted to long periods of little to no water. Using next-generation sequencing, we examined the molecular mechanisms by which guayule responded to simulated drought conditions via imposed irrigation treatments. By examining stem tissue, the primary location of rubber biosynthesis in guayule, we were also able to consider the impact of drought on this metabolic pathway. We performed our analyses using a published transcriptome for guayule, taking steps to collapse potential isoforms and miss-assembled transcripts. As expected, we identified a number of differentially expressed transcripts involved in signal transduction pathways (e.g., protein phosphorylation), transcriptional regulation, and transmembrane transport. We identified more than 20 up or down-regulated transcripts with similarity to Arabidopsis transcription factors associated with circadian clock regulation. Interestingly, many of these transcripts are annotated as cell-to-cell mobile in Arabidopsis (48), perhaps indicating that our transcriptomic analysis in stem tissue is generating a snapshot of circadian regulation occurring elsewhere in the plant. Regardless, while drought conditions dramatically impact both flowering and the circadian clock in guayule, due to the abundance of transcripts, it is unclear which transcript might be the regulator/sensor that is connecting drought to flowering.
The most upregulated guayule transcript is orthologous to the Arabidopsis aquaporin PIP2 family. Interestingly, despite recent duplications in close relatives, sunflower and lettuce, that are likely shared with guayule, we only observed differential expression for a single aquaporin out of three observed in the transcriptome, suggesting that it is the key regulator of water transport in stem tissue. We also observed twelve transcripts associated with rubber biosynthesis that were down-regulated under water-limited conditions. Although guayule rubber biosynthesis is known to be induced by cold temperatures, little is known about the mechanistic impact drought has on this pathway. However, given the abundance of terpene biosynthesis-associated transcripts and their almost complete down-regulation suggests that guayule modulates precursors to the rubber biosynthesis pathway when faced with water deficit conditions. This is in agreement with the observation that I100% plants contained twice the rubber content of those grown at I25% even though water use efficiency was equivalent (29).
WGD events have added to the complexity of the guayule drought response
Gene duplication, when the resulting duplicate is retained, can result in increased nuance in how plants perceive and respond to abiotic stress (49). The presence of duplicated transcripts in guayule are not surprising, given the multiple reported whole genome duplication (WGD) events leading up to the speciation event of guayule (35). A whole genome triplication event occurred at the base of the Asteraceae and is shared among all family members. More recently, a whole genome duplication has been observed in the formation of the guayule accession used in this analysis, AZ-3. AZ-3 is a complex polyploid formed by the likely hybridization of diploid P. argentatum and an unknown Parthenium species. Tetraploid guayule reportedly has increased biomass, rubber yield, and vigor compared to its diploid relatives. Thus, both of these polyploidization events raise the possibility that some of the duplicated genes may be mediating a successful response to drought stress or are contributing to increased vigor in the species.
We searched for evidence of duplication in the stress-responsive transcripts using a parsimony based approach to infer when those duplications occurred. It should be noted that we are not observing all duplicate genes here, only the ones that continue to be stress-responsive following duplication. These transcripts likely retain conservation in their regulatory domains (e.g., promoter elements), but in the absence of a genome, we focused on retention of protein-coding capacity. We were able to infer duplication events for 29 clusters of 68 stress-responsive guayule transcripts, with most (18/29) duplication events shared across Asteraceae. ORFs were retained in a majority of these transcripts (16/29), which, when combined with the shared pattern of differential expression between paralogs and their deep conservation, suggests that these duplicates are functional. However, as most of the observed retained duplicates appear to be shared across Asteraceae, they likely cannot explain the vigor associated with tetraploid guayule.
LncRNAs are helping to mediate the drought response in guayule
Long non-coding RNAs add an additional layer of complexity to plant stress responses through their ability to act as pre- and post-transcriptional regulators of gene expression. Interestingly, we recovered a homolog of HID1, a lncRNA that helps mediate shade avoidance in Arabidopsis. Although HID1 is conserved across land plants, this is the first Asterid homolog identified. In agreement with previous reports on HID1 conservation, guayule HID1 was conserved in the 5′ region believed to be important for protein-binding. Given the role of HID1 in light signaling it is perhaps not surprising that its expression was not responsive to drought. However, we were able to identify 70 putative lncRNAs that were differentially expressed in response to drought, 14 of which were conserved in the sunflower genome. De novo transcriptome assembly routinely produces fragmented transcripts with disrupted ORFs that would appear to look like a lncRNA. Thus, we took a more conservative approach than is typically taken when a reference genome is available by filtering out any transcripts that shared sequence similarity with protein-coding genes from related species. Four of the sunflower-conserved lncRNAs were also annotated as lncRNAs in sunflower, lending further support to their classification in guayule. Based on conservation and their stress-responsiveness, we would predict that these lncRNAs are likely functioning to modulate the drought response in guayule.
Functional prediction for lncRNAs is difficult in the absence of genomic context clues or without the ability to apply guilt-by-association strategies through many experimental time points or conditions. Thus, we focused on one functional class of lncRNA, that of miRNA sponge/precursor, as miRNA binding sites are fairly easy to predict computationally. Again, using sequence conservation as a means of boosting predictive confidence, we identified conserved miRNA binding sites in six guayule lncRNAs. One of these putative miRNA sponges in particular harbors a binding site for miR166, a microRNA involved in vegetative growth, floral morphogenesis, and regulating responses to salinity and drought. The lncRNA containing the miR166 binding site is upregulated under drought conditions and therefore could be mediating either the observed floral transition or the drought response.
As a drought tolerant, rubber producing perennial crop, guayule represents a remarkable natural resource for meeting industrial demands for raw products. In the present work, a transcriptomic and comparative evolutionary analysis approach was taken to identify and characterize the molecular response of guayule to drought-like conditions. We found that rubber biosynthesis-associated transcripts were dramatically down-regulated in the plants subjected to water-limited conditions in comparison to the plants in the well-watered control treatment. These results demonstrate that even given guayule’s inherent drought tolerance, there is a molecular trade-off occurring between rubber biosynthesis and the plants ability to maintain hydration status and homeostasis. These findings suggest that water and other crop inputs need to be optimized with respect to rubber yield to find an economic balance for potential producers.
Plant growth and tissue collection
Guayule (P. argentatum AZ-3) seed was obtained from the USDA-ARS National Plant Germplasm System (NPGS; https://www.ars-grin.gov/) using ID PI 599676. Guayule plants were grown in the field under subsurface drip irrigation at the University of Arizona, Maricopa Agricultural Center in Maricopa, Arizona as described in (29). On the day of final harvest when plants were 29 months-old (March 2015), 10–15 mm diameter-stem segments from each plant were harvested and immediately frozen in liquid nitrogen and then stored at − 80 °C until used. Three biological replicates for each treatment were harvested.
RNA extraction and Illumina library preparation
Approximately 2 g of stem tissue was used for total RNA extracted following the Laudencia et al. 2007 (50) protocol with the following modifications: (i) acid phenol:chloroform MB grade (Ambion, USA) was used for the phenol:chloroform extraction step; (ii) the precipitated RNA was further cleaned with Qiagen RNeasy Plant Mini Kit (Qiagen, USA); and (iii) the cleaned RNA was treated with DNA-free™ kit (Ambion, USA). PolyA-RNA was prepared employing Qiagen RNeasy/QIAshredder protocols (Qiagen, USA). RNA-sequencing libraries were prepared using the KAPA stranded mRNA-seq kit for Illumina (KK8420) according to the manufacturer’s protocol (KR0960 - v3.15). RNA-sequencing was performed on the Illumina HiSeq2000 with 150 bp paired-end reads. A total of 98,430,986 reads was generated for the six samples.
A condensed version of the Stonebloom and Scheller transcriptome was generated by initially filtered using CD-HIT-EST v.4.6.8 (51) with a global sequence identity of 1 (100%). To identify potentially identical transcripts that contained a single mis-aligned read, 150 nts were removed from either the 5′ or 3′ end of the transcript, and if the resulting transcript was greater than 150 nts, was used as query in a BLASTn (52) against all other transcripts. Hits against self were removed, and then all other hits with 100% coverage of one of the sequences, as well as 100% identity, were collapsed into one transcript, with the longest transcript being retained. Read mapping and quantification was performed using Salmon v0.81 (53) in CyVerse’s Discovery Environment (54). Quantified reads were prepared for differential expression analysis using the tximport (55) package in R. Differential expression was determined using DESeq2 (56) with an adjusted p-value of 0.01 as the cutoff for significance.
Quantitative PCR (qPCR) analysis
Stem bark tissue was the source of RNA for qPCR analysis. For each of three biological replicates (of both water-deficit and well-watered plants), total RNA was extracted with RNAqueous™ kit (Invitrogen, USA) and traces of DNA removed with DNA-free™ kit (Life Technologies, USA). Two micrograms of total RNA were the template for oligo (dT)20-generated cDNA with SuperScriptIII First-Strand Synthesis System for qPCR (Life Technologies, USA) following manufacturer instructions. The qPCR reactions were carried out using Applied Biosystems 7500 Fast Real Time PCR System and SYBR Green chemistry (Life Technologies, USA) in 20 μl volume reactions containing 400 ng of template cDNA, 900 nM of each forward and reverse primer, 10 μl of Fast SYBR® Green Master Mix, and water as needed. The following combinations of forward/reverse primers were used: for GFTW01080018.1, 5′- TGCCGTATTCATGGTTCACTTG -3′ / 5′- GGGCCGGGTTGATTCC − 3′; for GFTW01080137.1, 5′- TTTGTGGAGCAGGAGGAGAG-3′ / 5′-GCCAGATGAAACTGTATCAGAGC-3′; for GFTW01028919.1, 5′-ACTTCAAAGGTCGTTCCAAGAC-3′ / 5′-TGCCTCGCATTTTTCTCCAG-3′; and for Pa18S, 5′- TACTATGGTGGTGACGGGTG -3′/5′- ATTGTCACTACCTCCCCGTG − 3′. Thermocycler temperature regime was 95 °C for 20s, followed by 40 cycles of 95 °C for 3 s and 60 °C for 30s. Data were analyzed using the 7500 Fast System Detection Software (Life Technologies, USA) with manually set threshold. Expression of each target gene was calculated with the Livak and Schmittgen 2001 method (57), normalized to expression of the endogenous reference gene Pa18S, and then to its expression in a calibrator (well-watered control plant). Four technical replicates qPCR reactions were run for each target gene, and the whole experiment was performed twice using the same RNA but freshly synthesized cDNA.
GO terms for the differentially expressed transcripts were identified using BLAST2GO (58). First, BLASTx was performed against a database of Arabidopsis protein-coding genes with an e-value of 1E-3 and word size of 3. Protein domains were identified using InterProScan with default parameters. For guayule transcripts sharing similarity with an Arabidopsis protein-coding gene as determined by BLASTx analysis, but for which no functional annotation was obtained through BLAST2GO, we extracted Biological Processes directly from TAIR (59).
Duplication event timing and phylogenetic analysis
To determine timing of duplication, gene families were first generated by identifying sequences in the H. annuus (CoGe ID 37147) and L. sativa (CoGe ID 37106) genomes that shared sequence similarity with both the Arabidopsis and guayule sequences using CoGe BLAST with default parameters and an E-value of 1E-10 (60). Coding sequences were extracted from the top five unique loci in each genome using CoGeBLAST’s view FASTA feature. Sequences were aligned using MAFFT (61) in Geneious (62). The 5′ and 3′ UTRs of guayule transcripts were trimmed based on the multiple sequence alignment so that all sequences started with an “ATG” and ended with a stop codon. These alignments were then used to infer phylogenetic relationships with RAxML (v7.2.8, (63)) with the GTR GAMMA substitution model and 100 bootstraps. Trees with poor support (< 70) specifically at the guayule-sunflower node were realigned with fewer sequences or different MAFFT parameters until the support increased above 70. The sister branch to the query guayule sequence, whether it was sunflower or a guayule paralog, was used to infer timing of the duplication event based on the known organismal phylogeny.
LncRNA identification, conservation, and functional assessment
Differentially expressed lncRNAs were identified by first filtering out differentially expressed transcripts that displayed any similarity with known proteins or annotated protein domains (BLASTx, 1E-3). Transcripts were then filtered using Evolinc (64), filtering based on length, coding capacity (using CPC; (65)), and similarity to annotated proteins from the same species (using the set of differentially expressed transcripts predicted to be protein-coding). Sequence homologs for these lncRNAs were identified in the sunflower and lettuce genomes using CoGe BLAST, with an E-value of 1E-20 (26). Guayule transcripts for which a sequence homolog in another species corresponded to an annotated protein-coding gene were removed. These cleared transcripts were then scanned for miRNA motifs using psRNATarget (2017 update). Putative miRNA motifs were examined for conservation using multiple sequence alignments generated by MAFFT and visualized in Geneious.
The authors would like to thank Greg Leake (USDA-ARS, retired) for field harvest assistance; Naxin Hou (University of California, Davis) for laboratory assistance; and anonymous reviewers for helpful suggestions on the manuscript. Mention of trade names or commercial products in this publication is solely for the purpose of providing specific information and does not imply recommendation or endorsement by the USDA. The USDA is an equal opportunity employer.
ADLN - led and oversaw transcriptomic analyses and prepared the manuscript; GP - sample collection, library preparation, and processing of RNAseq data; CM - conceived and designed experiments and collected samples; DCI - analysis of RNAseq data; NAP - bioinformatic support and preparation of manuscript; EDS and DJH - experimental design and management of the field trial and sample collection; and DP - overall project management, analysis of data, and preparation of manuscript. All authors contributed to revising and editing the manuscript. All authors have read and approved of the final manuscript.
Field trial and RNA sequencing was supported by the USDA Biomass Research and Development Initiative (BRDI) grant USDA-NIFA 201210006–19391 OH. Analyses were funded by USDA-NIFA Coordinated Agricultural Program, Sustainable Bioeconomy for Arid Regions (SBAR) 2017–68005-26867 and National Science Foundation grant IOS-1758532 to ADLN. DP was supported by the University of Arizona start-up funds and SBAR.
Ethics approval and consent to participate
Consent for publication
- 6.Ray DT. Guayule: A source of natural rubber. New crops [Internet]. 1993;33843. Available from: http://hort.purdue.edu/newcrop/proceedings1993/V2-338.html
- 7.Thompson AE, Ray DT. Breeding guayule. Plant Breed Rev. 1989;6:93–165.Google Scholar
- 9.Estilai A, Ray DT. Genetics, cytogenetics, and breeding of guayule. Guayule Natural Rubber A Technical Publication with Emphasis on Recent Findings. 1991:47–92.Google Scholar
- 20.Moschen S, Di Rienzo JA, Higgins J, Tohge T, Watanabe M, González S, et al. Integration of transcriptomic and metabolic data reveals hub transcription factors involved in drought stress response in sunflower (Helianthus annuus L.). Plant Mol Biol. 2017 Jul;94(4–5):549–64.PubMedCrossRefPubMedCentralGoogle Scholar
- 26.Nelson ADL, Forsythe ES, Devisetty UK, Clausen DS, Haug-Batzell AK, Meldrum AMR, et al. A Genomic Analysis of Factors Driving lincRNA Diversification: Lessons from Plants. G3 . 2016 Sep 8;6(9):2881–2891.Google Scholar
- 30.Eranki PL, El-Shikha D, Hunsaker DJ, Bronson KF, Landis AE. A comparative life cycle assessment of flood and drip irrigation for guayule rubber production using experimental field data [Internet]. , Industrial Crops and Products. 2017 Vol. 99. p. 97–108. Available from: https://doi.org/10.1016/j.indcrop.2017.01.020 CrossRefGoogle Scholar
- 39.Yan Q, Wu F, Yan Z, Li J, Ma T, Zhang Y, et al. Differential co-expression networks of long non-coding RNAs and mRNAs in Cleistogenes songorica under water stress and during recovery. BMC Plant Biol 2019 Jan 11;19(1):23.Google Scholar
- 43.Bazin J, Romero N, Rigo R, Charon C, Blein T, Ariel F, et al. Nuclear speckle RNA binding proteins remodel alternative splicing and the non-coding Arabidopsis Transcriptome to regulate a cross-talk between Auxin and immune responses. Front Plant Sci. 2018 Aug 21;9:1209.PubMedPubMedCentralCrossRefGoogle Scholar
- 44.Cho J, Paszkowski J. Regulation of rice root development by a retrotransposon acting as a microRNA sponge. Elife [Internet]. 2017 Aug 26;6. Available from: https://doi.org/10.7554/eLife.30038
- 50.Laudencia-Chingcuanco DL, Stamova BS, You FM, Lazo GR, Beckles DM, Anderson OD. Transcriptional profiling of wheat caryopsis development using cDNA microarrays [Internet], Plant Molecular Biology. 2007. Vol. 63. p. 651–668. Available from: https://doi.org/10.1007/s11103-006-9114-y PubMedCrossRefPubMedCentralGoogle Scholar
- 53.Srivastava A, Malik L, Sarkar H, Zakeri M, Soneson C, Love MI, et al. Alignment and mapping methodology influence transcript abundance estimation [Internet]. Available from: https://doi.org/10.1101/657874
- 60.Castillo AI, Nelson ADL, Haug-Baltzell AK, Lyons E. A tutorial of diverse genome analysis tools found in the CoGe web-platform using Plasmodium spp. as a model [Internet], Database. 2018. Vol. 2018. Available from: https://doi.org/10.1093/database/bay030
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.