Introduction

Cervical cancer, the fourth most common cancer in women, is a cause of concern in countries with a low human development index (HDI). According to Globocan 2020, there were 604,127 new cases and 341,831 deaths across the globe; 85% of these deaths occurred in the developing world [1,2], and more than one third of the cases were from India and China. Common risk factors for developing cervical cancer include a weakened immune system, poor nutritional status, multiparity, multiple sexual partners, poor hygiene practices, smoking, uncontrolled use of oral contraceptives, poor diagnosis and genetic predisposition. Human papillomavirus (HPV) is the primary causal agent of this cancer [3]. Conventional treatment involves chemotherapy, surgery and radiotherapy, which are not only associated with severe side effects but are also unaffordable for the economically weaker sections of society. Against this backdrop, the rapid development of an inexpensive and safe therapeutic regimen to treat this virus-induced cancer is of paramount importance.

HPV is a non-enveloped DNA virus of the family Papillomaviridae with a diameter of 50–55 nm. As many as 300 genotypes of this virus are known, of which 200 are pathogenic to humans. Among these genotypes, HPV16 is considered a high-risk type. With a genome of 7.9 kb, HPV16 codes for several viral oncoproteins, including E6 and E7. E6 and E7 are largely responsible for integration and viral replication within the host cell and induce several cancer markers such as angiogenesis, proliferation, metastasis and telomerase hyperactivity [3]. At the same time, anti-apoptotic proteins such as BCL2, XIAP and LIVIN [4,5], which are upregulated in HPV-infected cancer cells and promote cell survival, can serve as targets for therapy.

To date, therapeutic options for the treatment of HPV-induced cervical cancer are scarce. Vaccines are the most promising strategy for preventing HPV persistence. However, high cost and the presence of multiple variants of the virus currently restrict the universal application of such vaccines, especially in developing nations [6,7]. Recently, Gao et al. (2020) [8] introduced a poly(amide-amine)-poly(β-amino ester) hyperbranched copolymer/linear poly(β-amino ester)-CRISPR/Cas9 therapeutic plasmid polyplex to treat papillomavirus-infected cervical cancer. Zamulaeva et al. (2021) [9] concluded that radiation might not be very successful against such carcinogenesis due to multiple factors. In a review article, Moga et al. (2021) [10] showed that several bioactive molecules from seaweeds, such as glucans, lutein, carotene and stypoldione, could be potential anti-HPV and anti-cancer agents.

Recent advances in computer-aided techniques have accelerated drug discovery screening. Including in silico docking experiments to screen candidate drugs before wet-lab validation can expedite the process and reduce the cost significantly. Bioinformatics has found many applications in the drug discovery process [11,12]. Natural products are considered safe and effective therapeutics when compared with their synthetic counterparts. However, the purification of natural compounds, especially those of botanical origin, is expensive and time-consuming. Therefore, shortlisting a few potential bioactive compounds from among many molecules offers a significant advantage in drug discovery, and targeted molecular docking is well suited to this task. A few studies on the activity of natural products against HPV proteins have been reported. Prasasty et al. (2017) [13] studied five natural products against HPV proteins by molecular docking. Recently, Kotadiya et al. (2020) [14] screened 17,944 herbal molecules to identify anti-HPV E6 and E7 agents using a similar strategy.

Certain phytochemical classes, such as labdane diterpenes, anthraquinones, long-chain fatty acids and amino alcohols, are reported to have significant anti-cancer activities. Hsieh et al. (2020) [15] showed that labdane diterpenes could induce apoptosis in 5-fluorouracil-resistant human oral cancer cells. Furthermore, the role of the anthraquinone moiety against a range of cancers, such as breast carcinoma, colon carcinoma, prostate carcinoma and neuroblastoma, has been reported [16]. The role of long-chain fatty acids and amino alcohols as anti-cancer therapeutic options [17] has also been reported. Although significant literature is available on the anti-cancer properties of these phytochemical classes, their therapeutic potential against HPV-induced cervical cancer is yet to be established. The botanical family Zingiberaceae is considered a powerhouse of phytochemicals and is reported to contain these classes in abundance [18]. Amomum subulatum belongs to Zingiberaceae and has substantial pharmacological properties, including antioxidant, anti-inflammatory, anti-viral and anti-cancer activities [19,20].

However, to the best of our knowledge, scientific data on the effect of the major phytochemicals of large cardamom on HPV-induced cervical cancer are limited.

In this context, our study aimed to evaluate the anti-HPV potential of four uncommon compounds identified from greater cardamom (Amomum subulatum), namely rhein, phytosphingosine, n-hexadecanoic acid and coronarin E. The compounds were evaluated against the target oncoproteins HPVE6 and HPVE7 and the anti-apoptotic proteins BCL2, XIAP and LIVIN by in silico docking. Pharmacophore screening, using the phytochemicals as templates, was performed to find novel candidate chemical entities against the target proteins. Finally, the relationships among the evaluated molecules were studied by principal component analysis (PCA) of the docking scores.

Materials and Methods

Materials

Plant Material

Rhizomes of Amomum subulatum var. Golsey were obtained from the Kabi research farms of the Indian Cardamom Research Institute in Pangthang, Sikkim, and the identified plant specimen was deposited in the Calcutta University Herbarium (CUH), Kolkata, India (accession number 20087).

Chemicals

Ethanol (LC–MS grade, SRL, India), hexane (GC–MS grade, SRL, India) and deionized double-distilled water (Milli Q) were used.

Methods

Preparation of Plant Extract

The rhizomes were cut from the main stem, washed in tap water to remove mud and soil, cut into small pieces and air-dried until moisture loss was complete. The dried pieces were ground to powder in an electronic grinder. Solvent extraction was carried out in 95% ethanol with the ground sample in a 1:10 ratio for over 10 days, with three intermittent solvent changes by filtration through non-absorbent cotton. The Amomum subulatum ethanolic extracts were then adjusted to the desired concentrations by air drying and re-dissolving the residue in the solvent.

LC/MS (Liquid Chromatography/Mass Spectrometry) Analysis

LC/MS analysis was carried out at the Sophisticated Analytical Instrument Facility (SAIF), Indian Institute of Technology, Mumbai. The ethanolic extract of A. subulatum (1 mg/mL) was analysed on a liquid chromatography system (Agilent Technologies) coupled to a Q-TOF (quadrupole time-of-flight) mass spectrometer (model G6550A) with a dual AJS ESI ion source and a HiP sampler with binary pump (model G422013). The run time was 30 min at a flow rate of 0.3 mL/min with a pressure limit of 1200 bar. The injection volume was 5 μL, and the chromatography column was a Hypersil Gold (100 × 2.1 mm, 3 μm particle size). The mobile phase was maintained at 95% water and 5% acetonitrile for 0–18 min, 100% acetonitrile for 18–26 min and 95% water and 5% acetonitrile for 26–30 min. Phytochemicals were identified using the Agilent LC/MS phytochemical library.

GC/MS (Gas Chromatography/Mass Spectrometry) Analysis

A total of 2 g of dried extract was dissolved in 10 mL of hexane at 37 °C for 3 days. Residual water was removed by adding a pinch of sodium sulphate (Merck), followed by low-speed centrifugation and filtration through a 0.45-micron nylon membrane for gas chromatography/mass spectrometry (GC–MS) analysis. The analysis was performed on a GC interfaced with a mass spectrometer using an SH-Rxi™-5Sil (Shimadzu) column. The oven temperature ranged from 70 to 260 °C. A total of 2 μL of the filtered extract was injected. A 49-min run was programmed with a 5 °C/min ramp, an initial hold time of 1 min and a final hold time of 10 min. Helium was used as the carrier gas at a flow rate of 1 mL/min. The injection port was set at 250 °C. MS operating parameters included electron impact ionization at 70 eV with a mass range of 50–600 amu. Compounds were identified using the NIST mass spectral library 2017 (Version 1.0) [21].
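As a quick consistency check, the stated 49-min run time can be recomputed from the ramp and hold settings given above. The sketch below assumes a single linear ramp between the start and end oven temperatures; the function name is illustrative and is not part of any instrument software.

```python
def gc_run_time_min(start_c, end_c, ramp_c_per_min, initial_hold_min, final_hold_min):
    """Total run time (min) of a single-ramp GC oven programme."""
    if ramp_c_per_min <= 0:
        raise ValueError("ramp rate must be positive")
    ramp_time = (end_c - start_c) / ramp_c_per_min  # minutes spent ramping
    return initial_hold_min + ramp_time + final_hold_min


# Values from the method text: 70 to 260 °C at 5 °C/min,
# 1 min initial hold and 10 min final hold.
total = gc_run_time_min(70, 260, 5, 1, 10)
print(total)  # 49.0
```

The ramp contributes 38 min, so the holds and the ramp together account for the full programmed run.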

Pathway Analysis

KEGG, the Kyoto Encyclopedia of Genes and Genomes (https://www.genome.jp/kegg/), is an open-source database containing high-level functional information on biological systems such as the cell, the organism and the ecosystem [22,23]. Diverse pathways based on large-scale molecular and high-throughput experimental data are available in this database. For this study, we identified the individual pathways involving the target proteins (HPVE6, HPVE7, XIAP, LIVIN and BCL2) of HPV-induced carcinogenesis and, based on these pathways, constructed an integrated pathway.

Preparation of Ligands

Based on the GC/MS and LC/MS results, four major phytochemicals were selected: coronarin E (CID 9971144), n-hexadecanoic acid (CID 985), phytosphingosine (CID 122121) and rhein (CID 10168). Three-dimensional (3D) structures of the selected phytochemicals and of the control kaempferol (CID 5280863) were downloaded from the PubChem database. PubChem (https://pubchem.ncbi.nlm.nih.gov/) is an open-source database of the National Institutes of Health (NIH) and contains data on chemical structures, identifiers, chemical and physical properties, biological activities, patents, health, safety, toxicity, etc. [24]. Furthermore, structural optimization was performed in Avogadro 1.1 by applying the universal force field (UFF). Avogadro utilizes frequencies, single-point energies and geometric optimization data [25].
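The ligand retrieval step can also be scripted. The following is a minimal sketch, assuming the structures are requested through the PubChem PUG REST interface with its `record_type=3d` option; the text does not say whether the downloads were made this way or manually through the website, and the helper names are illustrative. A request may fail if PubChem holds no 3D conformer for a given CID.

```python
from urllib.request import urlopen

PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"

# PubChem CIDs as listed in the text (four phytochemicals plus the control).
LIGAND_CIDS = {
    "coronarin E": 9971144,
    "n-hexadecanoic acid": 985,
    "phytosphingosine": 122121,
    "rhein": 10168,
    "kaempferol": 5280863,
}


def sdf_3d_url(cid: int) -> str:
    """Build the PUG REST URL for the 3D SDF record of one compound."""
    return f"{PUG_REST}/compound/cid/{cid}/SDF?record_type=3d"


def fetch_sdf(cid: int, timeout: float = 30.0) -> str:
    """Download the 3D SDF text for a CID (needs network access)."""
    with urlopen(sdf_3d_url(cid), timeout=timeout) as response:
        return response.read().decode("utf-8")


urls = {name: sdf_3d_url(cid) for name, cid in LIGAND_CIDS.items()}
for name, url in urls.items():
    print(f"{name}: {url}")
```

Only the URLs are printed here; calling `fetch_sdf` would return SDF text that can be saved and passed on to geometry optimization.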

Target Proteins (Receptors)

Three-dimensional (3D) structures of HPV (human papillomavirus) oncoprotein E6 (PDB ID 4GIZ, 2.55 Å, X-ray diffraction), BCL-2 (PDB ID 2XA0, 2.70 Å, X-ray diffraction) and the BIR-3 domain of XIAP (PDB ID 1F9X, solution NMR) were downloaded from the RCSB Protein Data Bank (https://www.rcsb.org/).

Because 3D structures of human papillomavirus oncoprotein E7 and LIVIN were unavailable in the RCSB PDB, we modelled these proteins with the SWISS-MODEL server (https://swissmodel.expasy.org/) in two steps. First, the protein sequences of E7 (accession no. AAD33253.1) and LIVIN (accession no. AAQ89195.1) were downloaded from NCBI protein resources (https://www.ncbi.nlm.nih.gov/protein). Second, the sequences were uploaded to the SWISS-MODEL server for modelling. SWISS-MODEL relies on identification of structural template(s), alignment of the target sequence with the template structure(s), model building and quality evaluation to build the model from the template library. It uses the modelling engine ProMod3 and advanced tools such as Monte Carlo techniques, the graph-based TreePack algorithm, the CHARMM22/CMAP force field and SCWRL4. The quality of the modelled structures was evaluated by Ramachandran plot and QMEAN [26]. Structural rendering was performed with the open-source software UCSF Chimera 1.17 [27].

Pharmacophore Screening for Phytochemical Analogues

A pharmacophore is defined as the three-dimensional arrangement of chemical features that governs the molecular interaction between a ligand and its receptor, such as hydrogen bond (H-bond) donors and acceptors, ionic charges, lipophilic and aromatic contacts and hydrophobic groups. These chemical features, or descriptors, of the ligand are largely responsible for the biological and functional response towards the receptor or target protein. Here, the identified phytochemicals were used as pharmacophore candidates, and their chemical features were deduced by uploading them to the ZINCPharmer server (http://zincpharmer.csb.pitt.edu/pharmer.html) [28] and using the ‘add feature’ function. ZINCPharmer is an open-access server used to screen small-molecule entities against submitted candidate pharmacophore models. The ZINC database contains 22,724,825 purchasable molecules. From the output, the top three small molecules were selected for each phytochemical input based on RMSD values, and the three-dimensional structures of these ZINC ligands were downloaded from the database.
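To make the selection criterion concrete, the sketch below computes a root mean square deviation between two sets of matched pharmacophore feature coordinates and keeps the three hits with the lowest RMSD. It assumes the feature points are already paired and aligned; the coordinates and hit names are hypothetical placeholders, not ZINCPharmer output, and the server's own alignment procedure is not reproduced here.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD (same units as the input) between paired 3D feature points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty point sets of equal length")
    squared = sum(
        (a - b) ** 2
        for point_a, point_b in zip(coords_a, coords_b)
        for a, b in zip(point_a, point_b)
    )
    return math.sqrt(squared / len(coords_a))


def top_hits(hits, n=3):
    """Return the n (identifier, rmsd) pairs with the lowest RMSD."""
    return sorted(hits, key=lambda hit: hit[1])[:n]


# Hypothetical query features and a copy shifted by 1 Å along x.
query = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 2.0, 0.0)]
shifted = [(x + 1.0, y, z) for x, y, z in query]
print(rmsd(query, query))    # 0.0
print(rmsd(query, shifted))  # 1.0

# Hypothetical screening hits as (identifier, RMSD in Å).
hits = [("hit_A", 0.61), ("hit_B", 0.24), ("hit_C", 0.55), ("hit_D", 0.70)]
print(top_hits(hits))  # [('hit_B', 0.24), ('hit_C', 0.55), ('hit_A', 0.61)]
```

A lower RMSD means the candidate's features sit closer to the query pharmacophore, which is why the ranking is ascending.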

Evaluation of Drug Likeness

The canonical SMILES of the four selected phytochemicals and of the top three ZINC outputs for each were downloaded from PubChem and uploaded to the open-source SwissADME server for evaluation of drug-like properties. The drug-like properties of the molecules were evaluated using multiple parameters such as lipophilicity (iLogP), water solubility (ESOL), Lipinski’s rule of five and topological polar surface area (TPSA). Furthermore, lipophilicity (WLOGP) and TPSA were plotted in the Boiled-Egg graphical interface. This graph further describes the bioavailability of a drug through its PGP (P-glycoprotein) response. PGP is a membrane-bound transporter that effluxes its substrates (PGP +), thereby reducing their intracellular concentration and limiting the drug’s bioavailability [29].
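The rule-based part of this evaluation can be illustrated with a short sketch. It counts violations of the classic Lipinski thresholds (molecular weight ≤ 500, logP ≤ 5, H-bond donors ≤ 5, H-bond acceptors ≤ 10) and checks a TPSA window of 20–130 Å². SwissADME's own implementation may estimate logP differently, so this is an approximation of the idea rather than a reproduction of the server; the example descriptor values are hypothetical placeholders, not results from this study.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    mol_weight: float       # g/mol
    log_p: float            # octanol/water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int
    tpsa: float             # topological polar surface area, Å^2


def lipinski_violations(d: Descriptors) -> int:
    """Count violations of the classic rule-of-five thresholds."""
    return sum([
        d.mol_weight > 500,
        d.log_p > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ])


def tpsa_in_range(d: Descriptors, low: float = 20.0, high: float = 130.0) -> bool:
    """Check whether TPSA lies within the window used in the text."""
    return low <= d.tpsa <= high


# Hypothetical descriptor values, for illustration only.
example = Descriptors(mol_weight=256.4, log_p=6.4, h_bond_donors=1,
                      h_bond_acceptors=2, tpsa=37.3)
print(lipinski_violations(example))  # 1
print(tpsa_in_range(example))        # True
```

A molecule with at most one violation, as in this example, would still fall within the 0–1 violation limit commonly accepted for drug-likeness.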

Molecular Docking

After preparation of the ligands and proteins, molecular docking was performed with the web server DockThor (https://dockthor.lncc.br/v2/) [30,31]. DockThor uses the in-house tools MMFFLigand and PdbThorBox for its docking algorithm, along with the MMFF94S force field [32]. The grid parameters used for the docking set-up are shown in Table 1. Two anti-cancer drugs, kaempferol and ciglitazone, were used as natural and synthetic controls, respectively.

Table 1 Grid parameters

Principal Component Analysis (PCA) and Molecular Pharmacophore Alignment

PCA is a multivariate statistical technique used to reduce the dimensionality of a large data set. It transforms the measured variables into uncorrelated ones, the principal components (PCs). The PCs are mutually orthogonal, and PC1 captures the maximum variability of the data set, followed by the subsequent PCs (PC2, PC3, etc.). PCA aids in clustering data points based on their inherent variation [33,34]. In this study, the binding energies (kcal mol−1) from the docking experiment were used as inputs for PCA, which clustered the ligand molecules based on their similar binding profiles. Furthermore, for the best clusters identified, molecular pharmacophore alignment was performed with the open-source PharmaGist web server (https://bioinfo3d.cs.tau.ac.il/PharmaGist/php.php) [35] to determine the common chemical features of the compounds. PharmaGist relies on the structure of the ligands rather than that of the receptor protein.
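The PCA step can be sketched as follows, treating the docking results as a ligand-by-protein matrix of binding energies. This is a minimal illustration using NumPy's singular value decomposition, assuming the columns are mean-centred but not scaled; the software and preprocessing used in the study are not stated, and the score matrix below is randomly generated placeholder data rather than the reported docking scores.

```python
import numpy as np


def pca(scores, n_components=2):
    """PCA via SVD on a (ligands x proteins) matrix of binding energies.

    Returns the ligand coordinates on the leading PCs, the fraction of
    variance explained by each of them, and the PC loadings.
    """
    x = np.asarray(scores, dtype=float)
    centred = x - x.mean(axis=0)                 # centre each protein column
    u, s, vt = np.linalg.svd(centred, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)          # variance ratio per PC
    coords = u[:, :n_components] * s[:n_components]
    return coords, explained[:n_components], vt[:n_components]


# Placeholder data: 18 ligands x 5 target proteins, values in kcal/mol.
rng = np.random.default_rng(0)
scores = -7.0 + rng.normal(size=(18, 5))

coords, ratio, loadings = pca(scores)
print(coords.shape)  # (18, 2)
print(ratio)         # fraction of variance explained by PC1 and PC2
```

Plotting the two columns of `coords` against each other gives the kind of score plot in which ligands with similar binding profiles fall close together.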

Protein Flexibility-Molecular Dynamic (MD) Simulation

The flexibility of the proteins in the top-ranked ligand–protein complexes was evaluated with the CABS-flex 2.0 server (http://biocomp.chem.uw.edu.pl/CABSflex2) and is presented as RMSF (root mean square fluctuation). CABS-flex offers fast simulation of protein flexibility at greatly reduced system requirements. Flexibility simulations obtained from this server were reported to correlate highly with NMR results [36,37]. CABS-flex provides high-resolution (10-ns) near-native protein dynamics simulation and is hence effective for rapid evaluation of protein–ligand stability. Simulations in CABS-flex were run with default parameters, with 50 cycles.
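For reference, the RMSF of a residue is the square root of the mean squared deviation of its position from its average position over all frames of the simulation. The sketch below computes this for a toy trajectory, assuming the frames are already superimposed and each residue is represented by a single point; it illustrates the quantity only and is not CABS-flex code.

```python
import math


def rmsf(trajectory):
    """Per-residue RMSF from trajectory[frame][residue] = (x, y, z)."""
    n_frames = len(trajectory)
    n_residues = len(trajectory[0])
    result = []
    for i in range(n_residues):
        # Average position of residue i over all frames.
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames
                for k in range(3)]
        # Mean squared deviation from that average position.
        msd = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        result.append(math.sqrt(msd))
    return result


# Toy trajectory: two frames, two residues (coordinates in Å).
frames = [
    [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)],
    [(2.0, 0.0, 0.0), (5.0, 0.0, 0.0)],
]
print(rmsf(frames))  # [1.0, 0.0]
```

The first residue moves 1 Å either side of its mean position and the second does not move at all, which is what the two values reflect.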

Results and Discussion

Identification of Phytochemicals

In this study, four compounds, namely coronarin E (labdane diterpene), n-hexadecanoic acid/palmitic acid (long-chain fatty acid), phytosphingosine (amino alcohol) and rhein (anthraquinone), were identified from the rhizome extract of Amomum subulatum. The chromatograms are shown in the supplementary figures (Figs. S1 and S2).

Sirat et al. (1994) [38] isolated two labdane diterpenes, namely labda-8(17),12-diene-15,16-dial and coronarin E, from the chloroform extract of Alpinia javanica rhizomes. Furthermore, Subramanyan et al. (2020) [39] reported a coronarin compound from Hedychium flavescens rhizome extract. A similar finding was reported from the ethyl acetate fraction of Amomum maximum [40]. We identified coronarin E (0.4%) in the ethanolic extract of Amomum subulatum rhizome by GC–MS, which is in agreement with these findings. In addition, we found n-hexadecanoic acid (palmitic acid) in abundance (9.49%) in the extract. The presence of palmitic acid has been commonly reported in various plants. Ali et al. (2019) [41] found a high content of palmitic acid in Piper nigrum, Nigella sativa, Cinnamomum zeylanicum and Elettaria cardamomum extracts.

LC–MS analysis in our study revealed the presence of two major compounds, rhein and phytosphingosine, in the ethanolic extract of A. subulatum. Rhein has been reported in a diverse range of plants. Cunha et al. (2017) [42] reported rhein as one of the major compounds in the ethanolic extract of Cassia bakeriana. Furthermore, Thanh-Tam Ho et al. (2019) [43] identified rhein in the plant Polygonum multiflorum. Nevertheless, to the best of our knowledge, this is the first report of this compound (2.06%) from A. subulatum. Finally, we report the presence of an uncommon sphingoid base, phytosphingosine (3.2%), for the first time from A. subulatum rhizome. However, phytosphingosine was reported earlier from other plants. Agil et al. (2020) [44] identified 2.5% phytosphingosine in the ethyl acetate fraction of Marsilea crenata Presl. leaves. This compound was also reported in a homoeopathic preparation of Gymnema sylvestre [45].

Pathway Analysis

Human papillomavirus (HPV) proteins E6 and E7 play a crucial role in the molecular pathogenesis of cervical cancer. The literature indicates that these proteins interact with various cell cycle regulatory proteins to drive cervical cancer progression [46,47,48]. BCL2 is considered one of the most critical anti-apoptotic proteins in lymphoma development [49]. This protein is also reported to be directly related to the manifestation of cervical cancer. Wang et al. (2019) [50] showed that upregulation of BCL2 expression through the Homeobox C6 (HOXC6) transcription factor led to cervical cancer progression. Two other proteins, XIAP (X-linked inhibitor of apoptosis) and LIVIN, are also known to have a significant role in cervical cancer progression. XIAP is a member of the inhibitor of apoptosis protein (IAP) family and acts by blocking the maturation and activation of caspases 3, 7 and 9 [51,52]. Overexpression of XIAP was reported to inhibit caspase activity directly or indirectly, thereby interfering with the related pathways of cell death, cell cycle, autophagy and cell migration [53]. Vascular endothelial cell angiogenesis develops new blood vessels that supply oxygen and nutrients to the tumour cells. This angiogenesis was directly related to the overexpression of HPV E6 and E7 proteins [54,55]. Furthermore, it was observed that extracellular vesicles might play an essential role in cancer progression by transferring their content from donor cells to cancerous cells. In cervical cancer, overexpression of anti-apoptotic proteins such as XIAP and LIVIN was reported in these extracellular vesicles [56]. LIVIN, a member of the IAP family, was first reported in 2001, and the authors showed that silencing of this protein could trigger apoptosis specifically in cancer cell lines [57]. We used KEGG pathways to illustrate the role of the anti-apoptotic and HPV proteins in the development of cervical cancer. Three KEGG pathways, namely cancer (map05200), apoptosis (map04210) and HPV infection (map05165), were selected (Figs. S3–S5), and a combined pathway was constructed (Fig. 1). The resulting diagram indicates that targeting these proteins may inhibit the overall HPV-induced cervical cancer pathway.

Fig. 1 Integrated HPV-induced canonical cervical cancer pathway derived from KEGG pathways, with the corresponding therapeutic consequences of drug application leading to apoptotic cell death or cell cycle arrest

Pharmacophore Screening for Phytochemical Analogues

Pharmacophore-based models were screened using the descriptors of the identified phytochemicals. Such techniques are commonly applied to phytochemicals in the literature [58,59]. Using pharmacophore-based screening, Sharma et al. (2020) [60] filtered ten compounds and identified the phytochemical schinilenol as the best lead compound against cyclin-dependent kinase 2 (CDK2). Friday et al. (2020) [61] designed ten analogues based on the quercetin pharmacophore, and 7-(2,3-dihydroxycyclopropyl)-2-(3,4-dihydroxyphenyl)-3,5-dihydroxy-4H-1-benzopyran-4-one was selected as the best compound. Similarly, based on root mean square deviation (RMSD), we identified three analogues for each of the four phytochemicals using ZINCPharmer. The functional descriptors selected for the pharmacophore-based model search are shown in Fig. 2. We attempted to cover the maximum number of descriptors (hydrophobic, aromatic, H-bond acceptor and donor, negative ion) for each phytochemical. The RMSD values of the small molecules for coronarin E, n-hexadecanoic acid, phytosphingosine and rhein were in the ranges 0.235–0.683, 0.544–0.715, 0.511–0.606 and 0.044–0.435 Å, respectively (Table 2). The two-dimensional structures of the selected phytochemical analogues are shown in Fig. 3.

Fig. 2 Functional descriptors of the phytochemicals identified from Amomum subulatum rhizome, used as pharmacophores, as generated by PubChem and ZINCPharmer

Table 2 ZINC small molecules screened based on phytochemical pharmacophores

Fig. 3 2D structures of selected phytochemicals and associated ZINC analogues

Evaluation of Drug Likeness

SwissADME is an open-source server that predicts the absorption, distribution, metabolism and excretion properties of candidate drugs through parameters such as physicochemical descriptors, pharmacokinetic properties, drug-like nature and medicinal chemistry friendliness [29,62]. The result is represented in the ‘Boiled Egg’ graphical interface (Fig. 4). The Boiled Egg graph plots TPSA (topological polar surface area) and WLogP (lipophilicity) on the x and y axes, respectively. TPSA is indicative of drug properties such as gastro-intestinal absorption (GIA) and brain permeability (BP), while WLogP reflects the lipophilicity of the compounds. Except for two compounds, the other sixteen (16) molecules showed high gastro-intestinal absorption, and the coronarin E analogue ZINC09794889 was predicted to permeate brain tissue. These parameters (GIA and BP) are associated with physiological drug metabolism, and the submitted candidates, including the controls, were found to pass one or both criteria.

Fig. 4 Evaluation of drug-like properties of ligands, represented by the Boiled-Egg graph; yellow (yolk): blood–brain barrier permeation; white: gastro-intestinal absorption

Furthermore, the values of other drug-likeness criteria, namely TPSA, iLogP and XlogP3, ESOL class and ESOL Log S, P-glycoprotein (PGP) substrate status and Lipinski’s violations, are presented in Table 3. TPSA values should range between 20 and 130 Å² [63]. iLogP and XlogP3 are both determinants of lipophilicity. iLOGP is based on the GB/SA (generalized Born and solvent-accessible surface area) model, which relies on free energies of solvation in n-octanol and water. XlogP3, on the other hand, is an atomistic method with correction factors. All the compounds were found to be well within the lipophilicity range (iLOGP: 0.19 to 4.37 and XlogP3: 1.48 to 6.19). All sixteen candidate drugs and the two controls were found to comply broadly with these ranges. Log S describes the water solubility of a drug, and compounds having values less than 0 are highly water-soluble, as determined in our study for all submitted molecules. PGP is a membrane-bound transporter that actively effluxes drug molecules from cells. Therefore, PGP + compounds may not be bioavailable to the cells [64]. Among the eighteen compounds, only four were found to be PGP +. However, these compounds passed all the other criteria. Finally, all the molecules were found to be well within the limits of Lipinski criteria violations (0–1 violations) [65].

Table 3 Evaluation of drug likeness by SwissADME

Molecular Docking

Molecular docking is considered a promising tool for obtaining a fast, preliminary understanding of the drug-like interaction properties of compounds against target proteins. Atabaki et al. (2021) [66] showed through molecular docking experiments that phytochemical compounds from Jurinea macrocephala could inhibit the HPV18 E6 oncoprotein. Furthermore, an in silico study reported that the phytochemical quinizarin could prevent oncogenesis through inhibition of the anti-apoptotic protein BCL2 [67]. DockThor was used here for the molecular docking experiment, and binding affinities (kcal mol−1) lower, i.e. more negative, than that of the control were considered optimal. DockThor has been used substantially in the literature. Singh et al. (2021) [68] used this software to show the inhibitory potential of phytochemicals of Schefflera vinosa extract against the target protein histone deacetylase (HDAC) of the rice pathogen Magnaporthe oryzae. Recently, DockThor was used to establish the anti-viral properties of various medicinal plants, including in our previous study on SARS-CoV-2 [69,70,71]. In this study, we found that the phytochemical n-hexadecanoic acid and its analogues, such as ZINC13368607, had the highest overall efficacy against most of the target proteins when compared with the controls. The n-hexadecanoic acid analogue ZINC13368607 scored the best against both XIAP and LIVIN (−9.158 and −8.130 kcal mol−1, respectively). The phytosphingosine analogue ZINC08964497 and the n-hexadecanoic acid-based compounds ZINC08603481 and ZINC13368607 showed the best scores against BCL2, E6 and E7, respectively (Table 4). In most of the phytochemical analogue interactions, we found interacting amino acids in common with those of the control (Table 5). Overall, the phytochemical analogues performed better than the original phytochemicals. The details of the amino acid interactions of the top-ranked ligands are presented in Fig. 5.
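The ranking logic used here (a lower, more negative binding energy is better, and candidates are compared against a control) can be sketched in a few lines. Only the two ZINC13368607 values below are taken from the text; the other entries, including the control, are hypothetical placeholders added so the example runs, and the function names are illustrative.

```python
def best_ligand_per_target(scores):
    """Map each protein to its (ligand, energy) pair with the lowest energy."""
    return {
        protein: min(ligands.items(), key=lambda item: item[1])
        for protein, ligands in scores.items()
    }


def better_than_control(ligand_scores, control):
    """Ligands whose binding energy is lower than that of the control."""
    reference = ligand_scores[control]
    return sorted(
        name for name, energy in ligand_scores.items()
        if name != control and energy < reference
    )


# Binding energies in kcal/mol. ZINC13368607 values are from the text;
# "placeholder_ligand" and "placeholder_control" are hypothetical.
scores = {
    "XIAP": {"ZINC13368607": -9.158, "placeholder_ligand": -7.5,
             "placeholder_control": -8.0},
    "LIVIN": {"ZINC13368607": -8.130, "placeholder_ligand": -7.9,
              "placeholder_control": -7.0},
}

best = best_ligand_per_target(scores)
print(best["XIAP"])   # ('ZINC13368607', -9.158)
print(best["LIVIN"])  # ('ZINC13368607', -8.13)
print(better_than_control(scores["XIAP"], "placeholder_control"))
```

Keeping the scores in a protein-keyed dictionary makes it straightforward to reuse the same matrix as input for the PCA described later.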

Table 4 Binding affinities (kcal mol−1) of the phytochemicals and the corresponding pharmacophore-modelled compounds with the target proteins, as obtained by DockThor
Table 5 Amino acid interactions between the best-ranked ligands and proteins

Fig. 5 Interacting amino acids between the best-ranked ligands and proteins

Principal Component Analysis and Molecular Pharmacophore Alignment

PCA is a multivariate statistical tool used to understand the correlation among different variables and has found diverse applications in scientific research. Hajdari et al. (2020) [72] grouped eight plant species of Lamiaceae into four principal components based on their volatile constituents. Recently, Meftahizadeh et al. (2021) [73] showed that the Dalgan genotype of Hibiscus sabdariffa had the most suitable morphological and phytochemical traits among the genotypes studied. In our previous study on the dengue virus, we applied PCA to substantiate the beneficial role of piperine in inhibiting dengue proteins [74]. The comparative docking results (binding affinity, kcal mol−1) of the 18 ligands against the target proteins are presented as a heat map in Fig. 6a, and PCA was performed on these docking scores. The first (PC1) and second (PC2) components explained 65.20% and 17.0% of the variance, respectively. We observed four clusters in the PCA. Cluster 1 comprised four compounds (ZINC13368240, ZINC08603481, ZINC09794889 and ZINC43742965), and cluster 2 had five (ZINC59385771, ZINC03875972, ZINC34053106, n-hexadecanoic acid and the natural control kaempferol). Cluster 3 comprised four compounds (ZINC08964497, ZINC12231395, ZINC13368607 and coronarin E), and cluster 4 had five ligands (ZINC08781724, ZINC03977762, rhein, phytosphingosine and the synthetic control ciglitazone) (Fig. 6b). PCA clusters 1 and 3 contained the largest number of top-ranking ligands (high binding affinity) (Fig. 6). Molecular alignment of these two clusters was performed with PharmaGist, and three common chemical features were observed in each: one aromatic group and two H-bond acceptors in cluster 1, and one aromatic group, one H-bond acceptor and one hydrophobic group in cluster 3 (Fig. 7). Details of the alignment are shown in Figs. S6 and S7.

Fig. 6 a Heat map of the binding affinities (kcal mol−1) of ligands and proteins. b Principal component analysis of the docking results (kcal mol−1)

Fig. 7 Molecular alignment based on pharmacophores, as determined by PharmaGist. a Common aligned functional groups (dotted line) of PCA cluster 1, with A1–A4 showing the cluster 1 compounds and their aligned functional groups (dotted line). b Common aligned functional groups (dotted line) of PCA cluster 3, with B1–B4 showing the cluster 3 compounds and their aligned functional groups (dotted line)

Protein Flexibility-Molecular Dynamic (MD) Simulation

Evaluation of protein flexibility after ligand binding was deemed necessary to assess the structural stability of the protein–ligand complexes after docking. The CABS-flex 2.0 server was used for this purpose. Tumskiy et al. (2021) [75] applied CABS-flex 2.0 to study the stability of the SARS-CoV-2 main protease in complex with a few novel inhibitors. Furthermore, Dey et al. (2021) [76] showed the stability of the complex between the angiotensin-converting enzyme 2 (ACE2) receptor and the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) S protein using this software. We selected the best-ranking ligand for each protein (XIAP-ZINC13368607, LIVIN-ZINC13368607, BCL2-ZINC08964497 and E6-ZINC08603481) for the dynamic simulation study (Fig. 8). We observed minimal residue fluctuations for all the complexes, with RMSF values below 10 Å throughout. Therefore, all the protein–ligand complexes evaluated were found to be stable under the simulated conditions.

Fig. 8 RMSF profiles of protein–ligand complexes as obtained by CABS-flex 2.0

Conclusion

We identified four compounds, namely coronarin E, n-hexadecanoic acid, phytosphingosine and rhein, from the rhizome extracts of Amomum subulatum (greater cardamom) and screened them against HPV-induced cervical cancer proteins using an in silico molecular docking approach. Furthermore, based on pharmacophore modelling, three analogous molecules were identified for each of the phytochemicals. All the compounds were found to possess adequate drug-like properties. Overall, among the sixteen molecules screened, n-hexadecanoic acid and its analogues were predicted to inhibit the target proteins HPVE6, HPVE7, XIAP, BCL2 and LIVIN most efficiently. Statistical evaluation based on principal component analysis (PCA) showed four clusters of compounds, including the controls (kaempferol and ciglitazone). Pharmacophore alignment further revealed that mainly hydrophobic, aromatic and H-bond acceptor groups explained the top-ranked ligands’ inhibitory potential. Finally, it can be concluded that large cardamom possesses considerable therapeutic potential against HPV-induced cervical cancer; however, validation under wet-lab conditions is warranted.