Computational Analysis of “-omics” Data to Identify Transcription Factors Regulating Secondary Metabolism in Rauvolfia serpentina

  • Original Paper
Plant Molecular Biology Reporter

Abstract

Rauvolfia serpentina is known to produce therapeutically important indole alkaloids used in the treatment of various diseases. Despite its medicinal importance, a complete understanding of its secondary metabolism remains challenging because of the complex interplay among transcription factors (TFs) and genes. Weighted co-expression analysis of the transcriptome, integrated with metabolomics data, can elucidate the topological properties of complex regulatory interactions in secondary metabolism. We aimed to implement an integrative strategy using “-omics” data to identify TFs of “unknown function” and to exemplify their role in the regulation of valuable metabolites as well as metabolic traits. A total of 69 TFs were identified by applying significance thresholds and removing false positives based on cis-regulatory motif analysis. Network-biology-inspired analysis of the co-expression network led to the generation of four statistically significant and biologically robust modules. Consistent with the known regulatory roles of the WRKY and AP2-EREBP TF families in Catharanthus roseus, this study indicated that they regulate the synthesis of alkaloids in R. serpentina as well. Moreover, TFs in module 4 were observed to regulate the steps connecting primary and secondary metabolic pathways in the synthesis of terpenoid indole alkaloids. Integration of metabolomics data further highlighted the significance of module 1, since it was statistically predicted to be involved in the synthesis of specialized metabolites, and its associated genes may be physically clustered on the genome. Importantly, putative TFs in module 1 may modulate the synthesis of the major indole alkaloids in response to various environmental stimuli. The methodology implemented herein may provide a useful reference for identifying and exploring the functions of transcriptional regulators.

Abbreviations

TFs: Transcription factors
PCC: Pearson correlation coefficient
MPGR: Medicinal plant genomics resource
TAIR10: The Arabidopsis Information Resource 10
PlnTFDB: Plant Transcription Factor Database
ND: Network density
AGRIS: Arabidopsis gene regulatory information server
TFBS: Transcription factor binding sites
MCL: Markov cluster
GO: Gene ontology
KEGG: Kyoto Encyclopedia of Genes and Genomes
PMR: Plant and microbial metabolomics resource
TIA: Terpenoid indole alkaloid

Acknowledgments

We acknowledge the computational infrastructure provided in the form of project MLP0076 by CSIR-Institute of Himalayan Bioresource Technology (CSIR-IHBT), a constituent national laboratory of Council of Scientific and Industrial Research, India, and Department of Biotechnology, Government of India for infrastructural support in the form of Bioinformatics Infrastructure Facility (BIF) as well. The authors are thankful to Dr. Paramvir Singh Ahuja for encouragement and constant support. Shivalika Pathania is grateful to the Department of Science and Technology (DST) for INSPIRE fellowship. We are also thankful to Ashwani Jha and Vinay Randhawa for technical help in manuscript preparation. The CSIR-IHBT communication number for this article is 3777.

Author information

Corresponding author

Correspondence to Vishal Acharya.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Table S1 (XLSX 20 kb)
Table S2 (XLSX 10090 kb)
Table S3 (DOCX 668 kb)
Table S4 (DOCX 15 kb)
Table S5 (XLSX 53 kb)
Table S6 (XLSX 43 kb)
Table S7 (XLSX 36 kb)
Table S8 (XLSX 36 kb)
Table S9 (DOCX 13 kb)

Fig. S1

Gene ontology (GO) annotation of the complete R. serpentina transcriptome. Pie charts representing GO-based annotation of the complete transcriptome for the (a) biological process and (b) molecular function categories (GIF 92 kb)

High resolution image (TIFF 3751 kb)

Fig. S2

Gene ontology (GO) and KEGG pathway annotation of the complete R. serpentina transcriptome. Pie charts representing (a) GO-based annotation for the cellular component category and (b) KEGG pathway annotation of the complete transcriptome (GIF 88 kb)

High resolution image (TIFF 682 kb)

Fig. S3

Threshold selection. (a) The actual number of edges and the number of all possible edges among non-singleton nodes as a function of PCC cutoff values. (b) The actual number of edges and the number of non-singleton nodes as a function of PCC cutoff values. The “igraph” R package is used to obtain these plots (GIF 13 kb)

High resolution image (TIFF 188 kb)
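
The quantities plotted for threshold selection (edges, non-singleton nodes, and all possible edges among them) can be illustrated with a minimal Python sketch. This is not the authors' igraph workflow: the function names, the toy expression profiles, and the use of the absolute PCC value as the edge criterion are all assumptions made for illustration. Network density (ND) is taken here as the ratio of actual to possible edges among non-singleton nodes.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient (PCC) of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / (sx * sy)


def sweep_cutoffs(profiles, cutoffs):
    """Summarize the network obtained at each PCC cutoff.

    An edge joins two transcripts when |PCC| >= cutoff (assumption: the
    absolute value is used). Nodes without any edge are singletons and
    are excluded from the node and possible-edge counts.
    """
    names = sorted(profiles)
    pcc = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            pcc[(a, b)] = pearson(profiles[a], profiles[b])

    rows = []
    for cutoff in cutoffs:
        edges = [pair for pair, r in pcc.items() if abs(r) >= cutoff]
        nodes = {name for pair in edges for name in pair}
        possible = len(nodes) * (len(nodes) - 1) // 2
        density = len(edges) / possible if possible else 0.0
        rows.append({
            "cutoff": cutoff,
            "edges": len(edges),
            "nodes": len(nodes),
            "possible_edges": possible,
            "density": density,
        })
    return rows


# Hypothetical toy profiles (four tissues per transcript), not data from the paper.
TOY_PROFILES = {
    "g1": [1, 2, 3, 4],
    "g2": [2, 4, 6, 8],
    "g3": [4, 3, 2, 1],
    "g4": [1, 3, 2, 4],
    "g5": [1, -1, -1, 1],
}

if __name__ == "__main__":
    for row in sweep_cutoffs(TOY_PROFILES, [0.5, 0.7, 0.9]):
        print(row)
```

Plotting the "edges", "nodes", and "possible_edges" columns against the cutoff gives curves of the kind described in panels (a) and (b); the cutoff actually chosen in the study is not stated on this page.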

Fig. S4

The weighted co-expression network follows a power-law degree distribution. Data from the weighted co-expression network are represented by black filled circles, and the degree distribution adheres to a power law, as all these circles lie on or close to the red line, which is the graph of a function of the form ax^(-k). The “igraph” R package is used to obtain this plot (GIF 12 kb)

High resolution image (TIFF 202 kb)
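
As a rough illustration of what fitting a function of the form ax^(-k) to a degree distribution involves, the sketch below estimates a and k by ordinary least squares on log-transformed values. This is a simplification chosen for clarity, not a description of how the published plot was produced; the function names and the example edge list are hypothetical.

```python
import math
from collections import Counter


def degree_distribution(edges):
    """Return {degree: fraction of nodes with that degree} for an undirected edge list."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    counts = Counter(degree.values())
    total = len(degree)
    return {d: c / total for d, c in sorted(counts.items())}


def fit_power_law(dist):
    """Fit p(x) = a * x**(-k) by least squares on log(p) = log(a) - k*log(x).

    `dist` maps positive degrees x to positive frequencies p. Returns (a, k).
    """
    xs = [math.log(x) for x in dist]
    ys = [math.log(p) for p in dist.values()]
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two distinct degrees to fit a line")
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    return math.exp(intercept), -slope


if __name__ == "__main__":
    # Hypothetical star-shaped network: one hub connected to three leaves.
    toy_edges = [("hub", "n1"), ("hub", "n2"), ("hub", "n3")]
    dist = degree_distribution(toy_edges)
    print(dist, fit_power_law(dist))
```

A straight line on log-log axes is only a visual check; for real data, the quality of the fit should be assessed with an appropriate goodness-of-fit test.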

Fig. S5

Hierarchical tree representing significantly enriched GO terms for the ABI3VP1 TF family. These over-represented GO terms for the biological process category are obtained using agriGO. Each GO term is represented by a box labeled with its GO ID, term definition, and statistical information. The degree of color saturation of a box is positively correlated with the enrichment level of the term (GIF 48 kb)

High resolution image (TIFF 1335 kb)

Fig. S6

Hierarchical tree representing significantly enriched GO terms for the bHLH TF family. These over-represented GO terms for the biological process category are obtained using agriGO. Each GO term is represented by a box labeled with its GO ID, term definition, and statistical information. The degree of color saturation of a box is positively correlated with the enrichment level of the term (GIF 43 kb)

High resolution image (TIFF 1193 kb)

Fig. S7

Hierarchical tree representing significantly enriched GO terms for the HB TF family. These over-represented GO terms for the biological process category are obtained using agriGO. Each GO term is represented by a box labeled with its GO ID, term definition, and statistical information. The degree of color saturation of a box is positively correlated with the enrichment level of the term (GIF 30 kb)

High resolution image (TIFF 724 kb)

Fig. S8

Hierarchical tree representing significantly enriched GO terms for the MYB TF family. These over-represented GO terms for the biological process category are obtained using agriGO. Each GO term is represented by a box labeled with its GO ID, term definition, and statistical information. The degree of color saturation of a box is positively correlated with the enrichment level of the term (GIF 58 kb)

High resolution image (TIFF 1096 kb)

Fig. S9

Hierarchical tree representing significantly enriched GO terms for the MYB-related TF family. These over-represented GO terms for the biological process category are obtained using agriGO. Each GO term is represented by a box labeled with its GO ID, term definition, and statistical information. The degree of color saturation of a box is positively correlated with the enrichment level of the term (GIF 55 kb)

High resolution image (TIFF 1060 kb)

Fig. S10

Hierarchical tree representing significantly enriched GO terms for the WRKY TF family. These over-represented GO terms for the biological process category are obtained using agriGO. Each GO term is represented by a box labeled with its GO ID, term definition, and statistical information. The degree of color saturation of a box is positively correlated with the enrichment level of the term (GIF 51 kb)

High resolution image (TIFF 1004 kb)

Fig. S11

Hierarchical tree representing significantly enriched GO terms for the AP2-EREBP TF family. These over-represented GO terms for the biological process category are obtained using agriGO. Each GO term is represented by a box labeled with its GO ID, term definition, and statistical information. The degree of color saturation of a box is positively correlated with the enrichment level of the term (GIF 33 kb)

High resolution image (TIFF 560 kb)

Fig. S12

Hierarchical tree representing significantly enriched GO terms for the bZIP TF family. These over-represented GO terms for the biological process category are obtained using agriGO. Each GO term is represented by a box labeled with its GO ID, term definition, and statistical information. The degree of color saturation of a box is positively correlated with the enrichment level of the term (GIF 7 kb)

High resolution image (TIFF 187 kb)

Fig. S13

Hierarchical graph representing significantly enriched GO terms for the MADS TF family. These over-represented GO terms for the biological process category are obtained using agriGO. Each GO term is represented by a box labeled with its GO ID, term definition, and statistical information. The degree of color saturation of a box is positively correlated with the enrichment level of the term (GIF 6 kb)

High resolution image (TIFF 192 kb)

Fig. S14

Clustering of the weighted co-expression network. A total of 42 modules are obtained by clustering the weighted co-expression network using the MCL algorithm. Colored filled circles represent TFs in each module. Cytoscape is used to visualize the network (GIF 89 kb)

High resolution image (TIFF 3417 kb)
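
For readers unfamiliar with Markov clustering, the following sketch shows the core loop of the MCL algorithm (expansion, inflation, and column normalization) on a small weighted graph. It is a simplified, dense-matrix illustration in pure Python, not the MCL software used in the study; the function names, the inflation value of 2.0, the added self-loop weight of 1.0, and the toy edge weights are assumptions.

```python
def _normalize_columns(m):
    """Scale each column to sum to 1 (column-stochastic matrix)."""
    size = len(m)
    sums = [sum(m[i][j] for i in range(size)) for j in range(size)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(size)]
            for i in range(size)]


def _matmul(a, b):
    """Plain dense matrix product for square lists of lists."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def mcl(weights, inflation=2.0, max_iter=100, tol=1e-9):
    """Cluster an undirected weighted graph given as {(u, v): weight}.

    Returns a list of sets of node names. Self-loops of weight 1.0 are
    added to every node, a common convention (assumed here).
    """
    nodes = sorted({n for pair in weights for n in pair})
    idx = {n: i for i, n in enumerate(nodes)}
    size = len(nodes)
    m = [[0.0] * size for _ in range(size)]
    for (u, v), w in weights.items():
        m[idx[u]][idx[v]] = m[idx[v]][idx[u]] = float(w)
    for i in range(size):
        m[i][i] = 1.0
    m = _normalize_columns(m)

    for _ in range(max_iter):
        expanded = _matmul(m, m)  # expansion: flow spreads along paths
        inflated = _normalize_columns(  # inflation: strong flow is boosted
            [[x ** inflation for x in row] for row in expanded])
        change = max(abs(x - y) for r1, r2 in zip(inflated, m)
                     for x, y in zip(r1, r2))
        m = inflated
        if change < tol:
            break

    # Each row that retains flow acts as an attractor; the columns it
    # draws flow from form one cluster. Identical clusters are reported once.
    clusters = []
    for row in m:
        members = {nodes[j] for j, x in enumerate(row) if x > 1e-6}
        if members and members not in clusters:
            clusters.append(members)
    return clusters


if __name__ == "__main__":
    # Hypothetical co-expression weights forming two separate triangles.
    toy = {("a", "b"): 1.0, ("b", "c"): 1.0, ("a", "c"): 1.0,
           ("d", "e"): 1.0, ("e", "f"): 1.0, ("d", "f"): 1.0}
    print(mcl(toy))
```

Higher inflation values generally produce more, smaller clusters; the parameter settings that yielded the 42 modules are not given on this page.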

Fig. S15

Heat maps of transcripts in the four significant modules (1–4) using expression data from different tissues. The heat maps represent tissue-specific expression of transcripts in the significant modules: (a) module 1 in roots, (b) module 2 in young leaves, (c) module 3 in flowers, and (d) module 4 in mature leaves, where average expression is calculated from normalized transcriptomics data. The “gplots” R package is used to plot the heat maps (GIF 100 kb)

High resolution image (TIFF 565 kb)

Fig. S16

Most enriched KEGG pathway annotations for the four significant modules (1–4). Each pie segment is labeled with a significant KEGG pathway and the percentage of annotations associated with that pathway. (a) Module 1, (b) module 2, (c) module 3, and (d) module 4 (GIF 65 kb)

High resolution image (TIFF 1171 kb)

Fig. S17

Hierarchical graph representing significantly enriched GO terms for module 2. These over-represented GO terms for the biological process category are obtained using agriGO. Each GO term is represented by a box labeled with its GO ID, term definition, and statistical information. The degree of color saturation of a box is positively correlated with the enrichment level of the term (GIF 20 kb)

High resolution image (TIFF 326 kb)

Fig. S18

Hierarchical tree representing significantly enriched GO terms for module 3. These over-represented GO terms for the biological process category are generated through agriGO. Each GO term is represented by a box labeled with its GO ID, term definition, and statistical information. The degree of color saturation of a box is positively correlated with the enrichment level of the term (GIF 40 kb)

High resolution image (TIFF 630 kb)

Fig. S19

Hierarchical tree representing significantly enriched GO terms for module 4. These over-represented GO terms for the biological process category are obtained using agriGO. Each GO term is represented by a box labeled with its GO ID, term definition, and statistical information. The degree of color saturation of a box is positively correlated with the enrichment level of the term (GIF 28 kb)

High resolution image (TIFF 457 kb)

Fig. S20

Network representing the significantly enriched pathways in module 1. The GO term-based network is obtained from enrichment analysis of module 1 against the reference model Arabidopsis thaliana. Functionally grouped pathways are mainly associated with monocarboxylic acid biosynthetic process, hormone-mediated signaling pathway, and defense response. ClueGO, a Cytoscape plugin, is used to generate this GO term-based network (GIF 100 kb)

High resolution image (TIFF 3725 kb)

Fig. S21

Network representing the significantly enriched pathways in module 2. The GO term-based network is obtained from enrichment analysis against the reference model Arabidopsis thaliana. Functionally grouped pathways are mainly associated with “regulation of gene expression, epigenetic,” “RNA metabolic process,” and “macromolecular modification,” which also complements the agriGO enrichment result. ClueGO, a Cytoscape plugin, is used to generate this GO term-based network (GIF 95 kb)

High resolution image (TIFF 3086 kb)

Fig. S22

Network representing the significantly enriched pathways in module 3. The GO term-based network is obtained from enrichment analysis against the reference model Arabidopsis thaliana. Functionally grouped pathways are mainly associated with “pollen exine formation,” “stamen development,” and “external encapsulating structure organization,” which also complements the agriGO enrichment result. ClueGO, a Cytoscape plugin, is used to generate this GO term-based network (GIF 27 kb)

High resolution image (TIFF 714 kb)

Fig. S23

Network representing the significantly enriched pathways in module 4. The GO term-based network is obtained from enrichment analysis against the reference model Arabidopsis thaliana. Functionally grouped pathways are mainly associated with “photosynthesis,” “plastid organization,” and “monocarboxylic acid biosynthetic process,” which also complements the agriGO enrichment result. ClueGO, a Cytoscape plugin, is used to generate this GO term-based network (GIF 80 kb)

High resolution image (TIFF 2365 kb)

About this article

Cite this article

Pathania, S., Acharya, V. Computational Analysis of “-omics” Data to Identify Transcription Factors Regulating Secondary Metabolism in Rauvolfia serpentina. Plant Mol Biol Rep 34, 283–302 (2016). https://doi.org/10.1007/s11105-015-0919-1

  • DOI: https://doi.org/10.1007/s11105-015-0919-1
