Background

Polycystic ovary syndrome (PCOS) is currently among the most burdensome heterogeneous endocrine disorders, affecting 6–20% of women of reproductive age globally [1]. Its clinical features are ovulatory dysfunction with chronic anovulation, which inhibits the development of follicles, hyperandrogenism (androgen excess causing alopecia, hirsutism and acne), and cysts in both ovaries detected by ultrasound [2], accompanied by menstrual irregularities, insulin resistance, obesity and infertility. Elevated plasma insulin levels that compensate for insulin resistance occur in 70–80% of obese women with PCOS, compared with 30–40% of lean women with PCOS. A recently described mechanism links this hyperinsulinaemia to hyperandrogenism: insulin directly triggers androgen synthesis in the theca cells of the ovaries and suppresses production of sex hormone-binding globulin (SHBG) in the liver, thereby raising the concentration of free testosterone in the circulation and worsening the effects of the syndrome [3, 4]. Female health conditions that centre on infertility, for instance endometriosis and PCOS, remain insufficiently researched, which raises concerns about delayed diagnosis and inconclusive treatment options.

The therapeutic successes of omics-based polypharmacology in modulating the molecular networks of complex disorders highlight the growing importance of traditional herbal medicines for developing novel drugs against complex, multi-aetiology syndromes. One such indigenous perennial medicinal woody climber, Gymnema sylvestre of the family Asclepiadaceae, is acclaimed as an antidiabetic herb in ancient Indian medical systems such as Siddha, Ayurveda and Unani, and is also supported by reports in modern medicine. The Sushruta Samhita, one of the oldest Sanskrit texts (sixth century BCE) [5], is of immense value to medicine because its numerous chapters describe the surgical instruments and procedures on which the modern science of surgery is built. According to it, G. sylvestre was named gur-mar, meaning "sugar killer", and aids in curing Madhumeha, an ancient Sanskrit term referring to hyperglycaemia and diabetes mellitus, as well as other urinary ailments [6]. The literature to date suggests that G. sylvestre has a wealth of benefits, acting as an anthelmintic, liver restorative, anti-inflammatory, emetic, laxative, antipyretic, gastrointestinal agent, diuretic, thermogenic, cardiotonic, expectorant and stimulant, and it is even considered a “uterus tonic”. These therapeutic effects are attributed to secondary metabolites, predominantly triterpene saponins, acidic glycosides, gymnemic acids and anthraquinones [7]. In a preclinical study, an ethanolic leaf extract of G. sylvestre was reported to restore regular menstrual cycles [8]. A recent study evaluating the therapeutic potential of G. sylvestre leaf extract in PCOS rat models found that 28 days of continuous ingestion reduced blood glucose levels, improved hormonal irregularity, reversed abnormal cysts and atretic follicles, and produced mature follicles with corpora lutea, indicating a regular oestrous cycle and prevention of ovarian damage [9]. In light of these studies, G. sylvestre might be considered an alternative remedy for the reproductive and metabolic complications of PCOS, and this in silico study explores its therapeutic potential further.

Methods

A major prerequisite for in silico pharmacology is to identify the key target proteins involved in the aetiology of PCOS. The conventional goal of RNA-Seq data analysis is to find differentially expressed genes (DEGs), that is, genes whose expression varies under an altered cellular state. Such transcriptomic data are deposited as datasets that can be retrieved and studied to investigate potential molecular targets and to support novel drug discovery or drug repurposing.

Screening and selection of mRNA datasets

mRNA datasets were mined from the Gene Expression Omnibus (GEO) repository [10], which stores curated expression profiles from many in vivo and clinical studies of varied disorders worldwide. A query with the keywords “PCOS” and “mRNA” returned 31 datasets, of which 27 were from humans. The selection criteria were as follows: (i) datasets should have been obtained through high-throughput sequencing, (ii) datasets must have been recorded recently, from 2020 onwards, (iii) datasets must include a minimum of 3 control and 3 patient samples, and (iv) samples should come from various tissues to give an overall representation of the pathophysiology of PCOS. Three datasets, GSE155489, GSE156067 and GSE226146, were selected for further investigation; details are given in Table 1.
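Criteria (i) to (iii) are mechanical and can be expressed as a simple filter. The following is a minimal sketch only: the `GeoSeries` fields and the example records are hypothetical placeholders, not the actual GEO search results, and criterion (iv) on tissue diversity is left to manual judgement.

```python
from dataclasses import dataclass


@dataclass
class GeoSeries:
    """Minimal metadata for one GEO series (all fields are illustrative)."""
    accession: str
    organism: str
    platform_type: str  # e.g. "high-throughput sequencing" or "array"
    year: int
    n_control: int
    n_patient: int


def passes_criteria(s: GeoSeries) -> bool:
    """Apply selection criteria (i)-(iii) from the text to a human series."""
    return (
        s.organism == "Homo sapiens"
        and s.platform_type == "high-throughput sequencing"  # criterion (i)
        and s.year >= 2020                                   # criterion (ii)
        and s.n_control >= 3 and s.n_patient >= 3            # criterion (iii)
    )


# Hypothetical records, NOT the real GEO search results.
candidates = [
    GeoSeries("GSE_EXAMPLE_1", "Homo sapiens", "high-throughput sequencing", 2021, 4, 4),
    GeoSeries("GSE_EXAMPLE_2", "Homo sapiens", "array", 2022, 5, 5),
    GeoSeries("GSE_EXAMPLE_3", "Homo sapiens", "high-throughput sequencing", 2018, 6, 6),
    GeoSeries("GSE_EXAMPLE_4", "Mus musculus", "high-throughput sequencing", 2023, 3, 3),
    GeoSeries("GSE_EXAMPLE_5", "Homo sapiens", "high-throughput sequencing", 2023, 3, 3),
]
selected = [s.accession for s in candidates if passes_criteria(s)]
```

Only the first and last placeholder records survive the filter; the others fail on platform, year or organism.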

Table 1 Description of mRNA datasets retrieved from GEO database

GEO2R analysis and identification of differentially expressed genes (DEGs)

To identify DEGs in PCOS patients, the interactive web tool GEO2R [11] was used to visualise and assess the mRNA data by comparing healthy controls with patients. It returned a table of the top deregulated genes ordered by p value and a collection of graphical plots that ease the assessment of significantly deregulated genes.
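The ordering by p value and the adjusted p value cut-off can be illustrated with a short sketch. It assumes the Benjamini-Hochberg false discovery rate adjustment, which is the default option in GEO2R; whether the default was kept in this study is an assumption. Gene names and p values below are made up for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    # Indices of p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def top_table(genes, p_values, alpha=0.05):
    """List (gene, p, adjusted p) ordered by p, keeping adjusted p < alpha."""
    adj = benjamini_hochberg(p_values)
    rows = sorted(zip(genes, p_values, adj), key=lambda r: r[1])
    return [r for r in rows if r[2] < alpha]


# Made-up p values for hypothetical genes, for illustration only.
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
pvals = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(pvals)
table = top_table(genes, pvals)
```

The adjusted values are never smaller than the raw p values, so a gene that passes the adjusted cut-off also passes the raw one.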

Classification of common significantly deregulated genes

After manual sorting, the top deregulated genes of the 3 datasets were compared by constructing a Venn diagram with an online tool [12], which identified the common significant genes that might be involved in the pathophysiology across patients differing in tissue sampled and disease status.
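The comparison that a three-set Venn diagram performs amounts to set intersections and differences. A minimal sketch, using placeholder gene symbols rather than the DEG lists of this study:

```python
def common_genes(*gene_sets):
    """Genes present in every set (the central region of a Venn diagram)."""
    sets = [set(g) for g in gene_sets]
    return set.intersection(*sets) if sets else set()


def venn_regions(a, b, c):
    """Sizes of the seven regions of a three-set Venn diagram."""
    a, b, c = set(a), set(b), set(c)
    return {
        "only_a": len(a - b - c),
        "only_b": len(b - a - c),
        "only_c": len(c - a - b),
        "a_and_b_only": len((a & b) - c),
        "a_and_c_only": len((a & c) - b),
        "b_and_c_only": len((b & c) - a),
        "all_three": len(a & b & c),
    }


# Placeholder symbols, NOT the DEG lists of the study.
d1 = {"G1", "G2", "G3", "G4"}
d2 = {"G2", "G3", "G5"}
d3 = {"G2", "G3", "G4", "G6"}
shared = common_genes(d1, d2, d3)
regions = venn_regions(d1, d2, d3)
```

The region sizes always add up to the size of the union of the three sets, which is a quick sanity check on any manually curated Venn diagram.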

Establishing PPI network of key genes and functional enrichment studies to identify potentially involved pathways

The common significantly deregulated genes, together with the top 20 deregulated genes of each of the 3 datasets, were then used to establish a protein–protein interaction (PPI) network using the STRING database [13], which evaluated the interactions and returned direct (physical) as well as indirect (functional) associations of proteins in disease-causing pathways. The STRING results also identified the major Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways and Gene Ontology (GO) terms linked to PCOS, but did not return functional enrichment for GO molecular function. To overcome this, and to aid visualisation of the associated pathways, ShinyGO 0.77 [14] was used; it is better at summarising and clustering networks and gave insight into disease associations based on the genes involved [15].
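Functional enrichment of this kind is usually an over-representation analysis. The sketch below assumes a hypergeometric test and a simple fold enrichment ratio; the exact statistics and corrections applied by STRING and ShinyGO may differ, and the counts used are toy numbers.

```python
from math import comb


def fold_enrichment(k, n, K, N):
    """Pathway share in the gene list divided by its share in the background.

    k: query genes in the pathway, n: query genes,
    K: background genes in the pathway, N: background genes.
    """
    return (k / n) / (K / N)


def hypergeom_p(k, n, K, N):
    """P(X >= k) when drawing n genes from N, of which K belong to the pathway."""
    upper = min(n, K)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


# Toy numbers: 2 of 2 query genes fall in a 5-gene pathway, background of 10.
fe = fold_enrichment(k=2, n=2, K=5, N=10)
p = hypergeom_p(k=2, n=2, K=5, N=10)
```

In practice the resulting p values would still need a multiple-testing correction across all pathways tested.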

Identification, retrieval, purification, and structural validation of target proteins

From the PPI network, 4 important protein targets were identified: mitochondrial 39S ribosomal protein L21 (MRPL21), cytochrome c oxidase subunit 6C (COX6C), NADH:ubiquinone oxidoreductase subunit A12 (NDUFA12) and mitochondrial peptidyl-tRNA hydrolase (ICT1). Their experimentally determined 3D structures, solved by electron microscopy, were downloaded in .pdb format from the Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB) [16] under PDB IDs 6ZSA, 5Z62, 5XTD and 3J7Y, with resolutions of 4.00 Å, 3.60 Å, 3.70 Å and 3.40 Å, respectively. The proteins were purified using Biovia Discovery Studio [17]: water molecules, heteroatoms, extra bound ligands and other protein chains were removed, and polar hydrogens were added to stabilise the protein chemistry. Validating the purified structures before molecular docking is essential, as it gives valuable insight into protein folding, stability, topology and structural irregularities. The structures were assessed with Ramachandran plots generated by the PDBsum Generate tool [18], in which the Ramachandran-favoured regions of all proteins met the accepted criterion (~ 90%), and with the ProSA tool [19], which gave Z scores for the purified proteins ranging from − 0.76 to − 10.57. On this basis, the purified structures were taken forward for docking against potential phytocompounds.
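The removal step can be pictured as a filter over the fixed-width records of a PDB file. This is an illustrative sketch only, not the Discovery Studio procedure used in the study: it keeps the `ATOM` records of one chain and drops `HETATM` records (waters, ligands, ions). The helper `pdb_line` and the sample coordinates are hypothetical, and hydrogens are not added here.

```python
def pdb_line(record, serial, name, resname, chain, resseq, x, y, z):
    """Format one fixed-width PDB coordinate record (ATOM or HETATM)."""
    return (f"{record:<6}{serial:>5} {name:<4} {resname:>3} {chain}"
            f"{resseq:>4}    {x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{0.0:6.2f}")


def clean_pdb(pdb_text, keep_chain):
    """Keep only protein ATOM records of one chain.

    Drops HETATM records and atoms of other chains.
    Hydrogens are NOT added; that needs a dedicated tool.
    """
    kept = []
    for line in pdb_text.splitlines():
        record = line[:6].strip()                  # columns 1-6: record name
        if record == "ATOM" and len(line) > 21 and line[21] == keep_chain:
            kept.append(line)                      # column 22: chain identifier
    kept.append("END")
    return "\n".join(kept) + "\n"


# Hypothetical three-record structure, for illustration only.
sample = "\n".join([
    pdb_line("ATOM", 1, "N", "MET", "A", 1, 11.104, 13.207, 2.100),
    pdb_line("ATOM", 2, "CA", "MET", "B", 1, 12.000, 14.000, 3.000),
    pdb_line("HETATM", 3, "O", "HOH", "A", 101, 5.000, 5.000, 5.000),
])
cleaned = clean_pdb(sample, keep_chain="A")
```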

Retrieval of ligands

Based on ancient literature and recent animal model studies, G. sylvestre is credited as an effective therapeutic against PCOS, especially the secondary metabolites obtained from its leaf extract. About 88 unique phytocompounds found in the leaves of G. sylvestre, excluding repeated entries, were retrieved from the Indian Medicinal Plants, Phytochemistry and Therapeutics (IMPPAT) database [20], an exceptional collection of the phytochemicals of Indian medicinal plants and their therapeutic uses.

Primary screening of phytocompounds

Primary screening of the 88 unique phytocompounds was based on two criteria, non-permeability to the blood–brain barrier and compliance with Lipinski's rule of five for drug-likeness, assessed with the SwissADME analysis tool [21]. Screening was necessary because compounds that violate Lipinski's rule of five are not ideal drug candidates and may cause adverse effects in the body. Results were downloaded from SwissADME and manually curated against the screening criteria, leaving 34 ligands for docking. The canonical SMILES (Simplified Molecular Input Line Entry System), identifiers (PubChem CID) and 2D structures of all 34 ligands were downloaded from the PubChem chemical database in SDF (Structure Data File) format, an accepted chemical structure format in which the coordinates of atoms and bonds, along with hybridisation, are encoded in plain text that the docking software then decodes [22].
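The two screening criteria can be written as a small filter. This is a sketch under stated assumptions: it uses the classical rule-of-five thresholds (SwissADME applies its own implementation, so values may differ slightly), it tolerates one violation, which is an assumption rather than the authors' stated cut-off, and the dictionary keys and compound descriptors are hypothetical.

```python
def lipinski_violations(mw, logp, h_donors, h_acceptors):
    """Count violations of Lipinski's rule of five."""
    return sum([
        mw > 500,          # molecular weight above 500 Da
        logp > 5,          # partition coefficient (log P) above 5
        h_donors > 5,      # more than 5 hydrogen-bond donors
        h_acceptors > 10,  # more than 10 hydrogen-bond acceptors
    ])


def passes_primary_screen(compound, max_violations=1):
    """Keep compounds that are not BBB permeant and are Lipinski-compliant.

    max_violations=1 is an assumption; a stricter screen would use 0.
    """
    v = lipinski_violations(compound["mw"], compound["logp"],
                            compound["hbd"], compound["hba"])
    return (not compound["bbb_permeant"]) and v <= max_violations


# Hypothetical descriptor values, NOT SwissADME output for real compounds.
compounds = [
    {"name": "cmpd_1", "mw": 426.7, "logp": 7.0, "hbd": 1, "hba": 1, "bbb_permeant": False},
    {"name": "cmpd_2", "mw": 204.4, "logp": 4.5, "hbd": 0, "hba": 0, "bbb_permeant": True},
    {"name": "cmpd_3", "mw": 807.0, "logp": 1.0, "hbd": 9, "hba": 14, "bbb_permeant": False},
]
kept = [c["name"] for c in compounds if passes_primary_screen(c)]
```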

Molecular docking

For structure-based drug design of phytocompounds against the identified potential PCOS biomarkers, the most crucial step was molecular docking, performed with the virtual screening and docking software PyRx (version 0.8), a conventionally used open-source computer-aided drug design program for screening libraries of chemical molecules against promising drug targets. PyRx suits structure-based drug design owing to its chemical spreadsheet-like functionality [23]. With its AutoDock Vina plugin, the purified protein structures were designated as macromolecules, whereas the 2D SDF files of the 34 ligands were converted to pdbqt format after energy minimisation with the universal force field. Blind docking was performed against all 4 target proteins, and the best docked complexes were studied in depth by visualisation.
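In blind docking the search space covers the whole protein rather than a predefined pocket. A minimal sketch of how such a box could be derived from the atomic coordinates is given below; the resulting centre and size correspond to the `center_x/y/z` and `size_x/y/z` search-space parameters of AutoDock Vina. The padding value and the two-atom sample are assumptions for illustration, and in PyRx the box is normally set through the graphical interface.

```python
def blind_docking_box(pdb_text, padding=5.0):
    """Return (center, size) of a box enclosing all ATOM records.

    padding (in angstroms) is added on every side; its value is an assumption.
    """
    coords = []
    for line in pdb_text.splitlines():
        if line.startswith("ATOM"):
            # Fixed PDB columns 31-38, 39-46 and 47-54 hold x, y and z.
            coords.append((float(line[30:38]), float(line[38:46]), float(line[46:54])))
    if not coords:
        raise ValueError("no ATOM records found")
    mins = [min(c[i] for c in coords) for i in range(3)]
    maxs = [max(c[i] for c in coords) for i in range(3)]
    center = tuple((lo + hi) / 2 for lo, hi in zip(mins, maxs))
    size = tuple((hi - lo) + 2 * padding for lo, hi in zip(mins, maxs))
    return center, size


def atom_line(serial, x, y, z):
    """Hypothetical helper: one fixed-width ATOM record for a glycine CA atom."""
    return f"{'ATOM':<6}{serial:>5} {'CA':<4} {'GLY':>3} A{serial:>4}    {x:8.3f}{y:8.3f}{z:8.3f}"


# Two made-up atoms spanning the corners of a 10 x 20 x 30 angstrom volume.
sample = "\n".join([atom_line(1, 0.0, 0.0, 0.0), atom_line(2, 10.0, 20.0, 30.0)])
center, size = blind_docking_box(sample, padding=5.0)
```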

Visualization

A key aspect of computational modelling is understanding how phytocompounds interact with the target protein, which is achieved through visualisation. In this study, visualisation was done with Biovia Discovery Studio for the top ligands of each target protein [24].

Pharmacological analysis of top phytocompounds showing drug-likeness, including ADMET properties

Unfavourable pharmacokinetics and toxicity of candidate compounds are common reasons for drug development failure. Hence, the absorption, distribution, metabolism, excretion and toxicity (ADMET) properties, along with the medicinal chemistry, of the best docked phytocompounds were evaluated using ADMETlab 2.0 [25], and conclusions on therapeutic benefit were drawn.

Results

GEO2R analysis and identification of differentially expressed genes (DEGs)

The results were generated by comparing the designated control samples with the test samples. They were reported as significantly deregulated genes and represented in volcano plots of statistical significance (−log10 P) against magnitude of change (log2 fold change), as depicted in Fig. 1. The numbers of deregulated genes were manually sorted and documented in Table 2.
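The way a gene is placed and coloured on a volcano plot can be sketched as follows. The adjusted p value cut-off of 0.05 and the red/blue convention follow the Fig. 1 caption; the function names and the example rows are hypothetical, and no fold-change threshold is applied because none is stated in the text.

```python
import math


def volcano_point(log2_fc, p_value, adj_p, alpha=0.05):
    """Return (x, y, label) for one gene on a volcano plot."""
    y = -math.log10(p_value)        # significance axis
    if adj_p < alpha and log2_fc > 0:
        label = "up"                # drawn red in Fig. 1
    elif adj_p < alpha and log2_fc < 0:
        label = "down"              # drawn blue in Fig. 1
    else:
        label = "ns"                # not significant
    return log2_fc, y, label


def count_deregulated(rows, alpha=0.05):
    """Count up-regulated, down-regulated and non-significant genes."""
    counts = {"up": 0, "down": 0, "ns": 0}
    for log2_fc, p_value, adj_p in rows:
        counts[volcano_point(log2_fc, p_value, adj_p, alpha)[2]] += 1
    return counts


# Made-up (log2 fold change, p, adjusted p) rows, for illustration only.
rows = [(2.0, 0.001, 0.01), (-1.5, 0.002, 0.03), (0.2, 0.5, 0.8), (3.0, 0.04, 0.2)]
counts = count_deregulated(rows)
```

The last example row has a large fold change but fails the adjusted p value cut-off, so it is counted as non-significant.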

Fig. 1 Volcano plot analysis with an adjusted p value cut-off of 0.05 (red = upregulated, blue = downregulated) for the datasets a GSE155489, b GSE156067 and c GSE226146

Table 2 GEO2R analysis results

Classification of common significantly deregulated genes

From the significantly deregulated genes of the 3 datasets, the common genes were identified with a Venn diagram, as shown in Fig. 2. These common genes, along with the top 20 deregulated genes of each dataset, were used to construct the PPI network.

Fig. 2 Venn diagram of the 3 datasets, showing 20 genes common between the datasets

Establishing PPI network of key genes and functional enrichment studies to identify potentially involved pathways

STRING database

At a high confidence level, the interactions between nodes were clear and the key players, illustrated in Fig. 3, were identified as MRPS28, MRPS33, MRPL20, MRPL21, ICT1, NDUFA8, NDUFA12, VDAC3, APOO, TOMM5, COX6C, COX7B, EDN1 and EDN3, which helped to clarify the molecular mechanisms in PCOS. The MRP family of genes, which mainly code for mitochondrial ribosomal proteins, featured prominently in the network, suggesting that mitochondrial translation is deregulated. Interestingly, ICT1 had the maximum number of interactions; as a crucial termination factor in mitochondria, its involvement suggests errors in protein biosynthesis and inappropriate mitochondrial ribosome assembly, intracellular signalling and apoptosis [26]. In addition, the involvement of COX6C as a major interactor implicates deviations in oxidative phosphorylation, and recent studies have reported aberrant COX6C levels in diabetes, uterine leiomyoma, familial hypercholesterolaemia and other chronic disorders such as cancers and kidney failure [27]. An association with mitochondrial and neurodegenerative disorders is suspected because NADH:ubiquinone oxidoreductase subunit A12 is part of the respiratory chain complex, dysfunction of which leads to a distinctive range of disorders, especially neuromuscular ones [28]. The detailed functional enrichment is given in Table 3, in which the KEGG pathways point to major neurochemical disorders that may underlie the stress-associated symptoms of PCOS.

Fig. 3 PPI network constructed using the STRING database at a confidence score cut-off of 0.7, giving 65 nodes and 20 edges, with an average node degree of 0.615 and a PPI enrichment p-value of 0.000316
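The average node degree quoted in the caption of Fig. 3 follows directly from the node and edge counts, since every edge of an undirected network contributes to the degree of two nodes. The sketch below reproduces that figure and also shows how a hub (the node with the most interactions) could be picked from an edge list; the edge list uses placeholder names and is not the STRING output.

```python
from collections import Counter


def average_node_degree(n_nodes, n_edges):
    """Mean degree of an undirected graph: each edge touches two nodes."""
    return 2 * n_edges / n_nodes


def degree_counts(edges):
    """Number of interactions per node in an undirected edge list."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg


# Node and edge counts taken from the Fig. 3 caption.
avg = average_node_degree(n_nodes=65, n_edges=20)

# Hypothetical edge list with placeholder names, NOT the STRING network.
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3")]
hub, hub_degree = degree_counts(edges).most_common(1)[0]
```

Rounded to three decimals, `avg` matches the value of 0.615 reported for the network.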

Table 3 Functional enrichment of network obtained from STRING database

ShinyGO

The major pathways identified in this study were oxidative phosphorylation, non-alcoholic fatty liver disease and diabetic cardiomyopathy, which agree with the literature on the known pathophysiology of PCOS, alongside various neurodegenerative disease and metabolic pathways, as depicted in Fig. 4. The biological processes, cellular locations and molecular functions affected in PCOS were inferred from the GO networks obtained, in which two pathways (nodes) that share genes are connected by an edge. The networks illustrated in Fig. 5a–c were interpreted according to the stated standards: bigger nodes represent larger gene sets, darker nodes represent more significantly enriched gene sets, and thicker edges represent more overlapping genes [15]. The diseases associated with these genes, determined through their phenotypic correlations, are shown in Fig. 5d. Notably, there is a stronger emphasis on glaucoma, metabolic acidosis and hypogonadism.

Fig. 4 KEGG pathways generated with the ShinyGO web server, where a shows the network of pathways at an edge cut-off of 0.2 and b is a bar chart of statistically significant pathways with their fold enrichment levels

Fig. 5 Functional enrichment networks obtained from ShinyGO at an edge cut-off of 0.1, where a GO biological processes, b GO cellular component, c GO molecular functions and d Disease alliances are represented

Molecular docking

The virtual screening software PyRx treats the macromolecules (target proteins) as rigid and the ligands as flexible, and the efficacy of the phytocompounds was estimated from binding affinity scores, with more negative values indicating better binding. The best docked phytocompounds among the 34 subjected to blind docking are documented with their binding affinities against the target proteins in Table 4. Lupeol showed the highest affinity for all the potential targets, making it the most suitable therapeutic candidate, followed by Beta-Elemene, Stigmasterol and Gymnestrogenin, which have been reported to manage insulin resistance in patients with PCOS.
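Because a more negative score means better binding, selecting the best ligand per target is a minimum over the affinity values. The sketch below also converts a binding free energy into an implied dissociation constant through the standard relation ΔG = RT ln Kd; docking scores are only rough estimates of ΔG, so such values are indicative at best. Target names, ligand names and scores are placeholders, not the entries of Table 4.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def estimated_kd(delta_g, temperature=298.15):
    """Dissociation constant (mol/L) implied by a binding free energy in kcal/mol."""
    return math.exp(delta_g / (R_KCAL * temperature))


def best_ligand_per_target(scores):
    """scores maps target -> {ligand: affinity}; the lowest affinity wins."""
    return {target: min(ligands, key=ligands.get) for target, ligands in scores.items()}


# Placeholder affinities in kcal/mol, NOT the docking results of this study.
scores = {
    "target_1": {"ligand_A": -7.2, "ligand_B": -9.1},
    "target_2": {"ligand_A": -8.0, "ligand_B": -6.5},
}
best = best_ligand_per_target(scores)
```

A difference of about 1.4 kcal/mol in score corresponds to roughly a tenfold change in the implied dissociation constant at room temperature.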

Table 4 Ligands with the best binding scores against the target proteins, in terms of binding affinity calculated via blind docking

Visualization

The interactions of lupeol with the amino acids of the target proteins were studied in depth by visualisation. Details are shown in Fig. 6, where lupeol occupies the binding pockets of the protein targets and the interacting amino acids are labelled.

Fig. 6 Visualization of the best phytocompound, lupeol, against target proteins: a mitochondrial 39S ribosomal protein L21, b NADH:ubiquinone oxidoreductase subunit A12, c cytochrome c oxidase subunit 6C and d mitochondrial peptidyl-tRNA hydrolase

Pharmacological analysis of top phytocompounds showing drug-likeness, including ADMET properties

The bioactive phytocompounds derived from G. sylvestre have shown a promising therapeutic role in the literature to date, and this was also reflected in the present study. The focused ADMET screening of absorption and distribution across the systemic circulation is summarized in Table 5; the plasma protein binding findings indicate that Beta-Elemene and Gymnestrogenin bind less to plasma proteins, suggesting better tissue penetration but faster excretion. Table 6 presents the medicinal chemistry data, clearance rates and half-life values of the compounds. Lupeol displayed slower clearance and a longer half-life, making it a strong candidate for drug delivery systems. The industrial feasibility of drug manufacturing and processing was evaluated using the Synthetic Accessibility Score (SA Score) in Table 6. The scores ranged from 4 to 5.3 for all six compounds, indicating moderate ease of synthesis, with potential alterations to improve their solubility and other biochemical properties. To assess the drug-likeness and safety of the phytocompounds, toxicity studies were conducted, and the results are summarized in Table 7. The recommended daily doses for all six phytocompounds were higher than 500 mg, implying a lower risk of overdose by ingestion. Additionally, mutagenicity and drug-induced liver injury were estimated to be very low, supported by the high lethal dose of these compounds, which suggests a low risk of toxic effects even at high doses.

Table 5 Absorption and distribution properties of best docked phytocompounds when considered to have drug-likeness
Table 6 Medicinal chemistry of best docked phytocompounds having drug-likeness
Table 7 Toxicity analysis of promising phytocompounds

Discussion

According to statistical reports, the prevalence of adult women suffering from menstrual abnormality, infertility and hirsutism is higher in some countries than in others, for example Turkey (33%), France (23%), Portugal (18%), Greece (9%) and India (6%), owing to varied alleles of genes across specific populations [29], and the regulation of these genes in PCOS points more to androgen excess in spite of adrenal insufficiency. Based on recent studies, mutations within the cytochrome P450 oxidoreductase (POR) gene, CYP9 and CYP11 are believed to be related to the phenotypic spectrum of symptoms observed in PCOS, such as low sex steroid hormone levels, infertility and amenorrhoea [30]. Although multiple genes are implicated, the aetiology is still unclear, and for one third of women with PCOS the path from initial complaints to a precise diagnosis can take as long as 10 years and end in invasive procedures such as laparoscopy; hence, studies to understand the molecular mechanism in depth are essential. The current study revealed the involvement of oxidative stress, impaired mitochondrial translation and, chiefly, impaired protein metabolism, based on gene expression profiles of PCOS patients that reflect deregulation of the MRP family, COX, NDUF and EDN genes. To summarise the deregulated pathways, an illustration was generated using the Reactome pathway database [31] (Fig. 7); in line with the literature, it suggests a role for insulin-growth regulators, which explains the predominant insulin resistance in PCOS patients.

Fig. 7 Summary of pathways whose imbalanced regulation is correlated with PCOS, focusing on metabolic errors of protein synthesis and impaired repair of IGF1 leading to insulin resistance

This interpretation also helps to explain why current therapeutic drugs target phenotypic symptoms: if insulin resistance, obesity and hyperandrogenism are brought to normal levels, PCOS can be effectively managed. Ongoing therapies for PCOS, such as Letrozole and Clomiphene Citrate for menstrual irregularities, and Oral Contraceptive Pills alongside Metformin and Gonadotropins, are segmented and prescribed to treat symptoms, but fail against the multiple pathophysiological alterations involved. Moreover, they have limitations: ovarian gonadotropin stimulation promises elevated ovarian function in PCOS, but carries a risk of multiple births and ovarian hyperstimulation syndrome (OHSS) [32]. Similarly, Clomiphene Citrate is an ovarian stimulant but raises the chance of multiple pregnancies and does not affect the other metabolic and psychosocial indicators of PCOS, which limits its potential as a stand-alone drug. Letrozole hinders the formation of oestrogen from androgen, and the most commonly prescribed contraceptives are third-generation formulations with lower oestrogen levels, given alongside Isotretinoin, Flutamide and Finasteride to treat hyperandrogenism, but their long-term cardio-metabolic effects remain unanswered [33]. Despite drugs being available in most cases, surgical interventions such as Laparoscopic Ovarian Drilling (LOD), which transiently alters ovarian function, are still practised [34]. Hence, the scientific community has recently turned towards alternative medicines, such as plant-based drug therapies, as a lasting solution for PCOS.

As mentioned, prior investigations suggest that G. sylvestre extract is highly efficient against hyperglycaemia and also has powerful antioxidant activity [35, 36]. Having both properties, it is particularly suited to treating PCOS, and the findings of this study agree. In terms of affinity towards the target proteins, a few phytocompounds were exceptionally good, and current research supports their therapeutic value. For example, Gymnemasaponins (aglycone saponins) exhibit dual mechanisms, elevating blood plasma levels of insulin and lowering free testosterone in the circulation [6]. Beta-Amyrin and Beta-Elemene possess primarily hepatoprotective properties, along with antigastritic and antihomocystinuric effects. In terms of dosage, a higher dose of Stigmasterol is relatively safe and helps to manage osteoarthritic, hypercholesterolaemic, cytotoxic and hypoglycaemic conditions through its antioxidant and anti-inflammatory effects [37]. This study highlights Lupeol and its potential to become the most efficient drug compound against PCOS, because it is pharmacologically safe, with a low risk of overdose by ingestion. Lupeol has also been reported to treat malaria, rheumatism and urinary tract infections owing to its antimicrobial properties [38]. A novel in vivo study on reverting endometrial carcinoma by blocking matrix metalloproteinase-9 (MMP-9), a potent target involved in metastasis, used Lupeol as a natural anticancer and anti-inflammatory phytocompound with minimal side effects, and its results supported Lupeol's therapeutic value [39]. Its efficacy against an array of metabolic disorders was demonstrated in a remarkable study in which it acted as an α-glucosidase inhibitor [40].

Further investigations that validate the protein targets and their binding to natural phytocompounds, with comparable efficacy in animal model (in vivo) studies, are recommended; after satisfactory observations these can be translated to clinical trials, overcoming the main limitation of this study, namely the sample size analysed. The availability of mRNA datasets was also very limited, which indicates that more RNA-sequencing should be carried out with larger sample sizes and deposited as datasets so that more conclusive studies can be carried out in the near future.

Conclusion

Investigating gene expression levels in PCOS patients revealed deregulation of the MRPS28, MRPS33, MRPL20, MRPL21, ICT1, NDUFA8, NDUFA12, VDAC3, APOO, TOMM5, COX6C, COX7B, EDN1 and EDN3 genes, whose functional enrichment suggests that erroneous protein metabolism and impaired mitochondrial translation cause the phenotypic spectrum of symptoms observed in PCOS. The therapeutic potential of G. sylvestre phytocompounds is high, especially for Lupeol, Betaine and Stigmasterol, against the potential molecular targets (MRPL21, ICT1, NDUFA12 and COX6C); hence, they are eligible candidates for drug development.