Introduction

In the recent past, the world witnessed the COVID-19 pandemic caused by SARS-CoV-2 (Hu et al. 2021). Nipah virus (NiV), a deadly emergent infectious agent with the potential to give rise to the next pandemic, causes severe respiratory illness and fatal encephalitis (Ahmad 2014; Thakur et al. 2022). Research on the live virus is difficult because it is a biosafety level 4 (BSL-4) pathogen (Tigabu et al. 2014). NiV belongs to the genus Henipavirus of the order Mononegavirales, family Paramyxoviridae, and subfamily Paramyxovirinae (Chua et al. 2000). The virus was first detected in 1998 and was named after the village of Kampung Sungai Nipah in Malaysia (Chattu et al. 2018). Fruit-eating bats of Pteropus spp. are the primary reservoirs of the virus (Chua et al. 2002). This zoonotic agent is transmitted from infected animals to humans and, in turn, from one infected human to another, mainly through respiratory droplets, body fluids, blood and urine (Gurley et al. 2007; de Wit and Munster 2015). So far, 639 human cases of Nipah virus infection have been reported from Malaysia, Bangladesh, the Philippines, Singapore and India (Devnath and Masud 2021). Despite being short-lived, past outbreaks of the Nipah virus claimed many human lives and were associated with high mortality rates (40–75%) (Vanitha et al. 2019). In its 2018 research and development blueprint, the World Health Organisation (WHO) stressed the gravity of the threat to humanity and called for immediate attention to effective drug development (Mehand et al. 2018). Nipah is an enveloped virus with a negative-sense, single-stranded RNA genome. The non-segmented viral genome is about 18.2 kb in size and codes for several structural and non-structural proteins. Analysis of the viral genome shows an arrangement of six genes, namely nucleocapsid (N), phosphoprotein (P), matrix (M), fusion glycoprotein (F), attachment glycoprotein (G) and long polymerase (L).
The P gene encodes P protein and three important accessory proteins, C, W and V (Martinez-Gil et al. 2017). Among all the proteins of Nipah, the attachment glycoprotein, NiV G plays a vital role in attachment with the host receptor, Ephrin B2 or B3. This protein primarily drives the spread of the infection by cell-to-cell fusion. Therefore, the interruption of viral entry into the cell can be brought about by blocking the active site of NiV G responsible for host receptor interaction. This makes NiV G an attractive target for identifying potential anti-Nipah drugs, as currently, there are no approved therapeutics for Nipah virus infection (Geisbert et al. 2021). The treatment of infected patients is limited to supportive care (Ang et al. 2018).

Nevertheless, antiviral medications such as Ribavirin and Acyclovir were used to treat Nipah infection during past outbreaks in Malaysia and Singapore. However, they were not fully effective in curing Nipah-infected individuals (Sharma et al. 2019). The vaccines being developed by scientists worldwide are yet to be time-tested to validate their effectiveness against the infection. Initial deployment of bioinformatics tools for the preliminary screening of drugs may reduce the cost and improve the turnover time significantly. Many drug candidates can be screened quickly, efficiently and cost-effectively using a bioinformatics approach like computer-aided drug design (CADD) (Gaieb et al. 2019). A few in silico studies showed the potential of small molecules against the target proteins of the Nipah virus. However, these studies were not backed by statistical analysis (Ropón-Palacios et al. 2020; Kalbhor et al. 2021; Glaab et al. 2021).

Phytochemicals have long been used to develop novel drugs and identify potential drug candidates against emerging infectious diseases considering their efficacy and safety compared with their synthetic counterparts. Anaemia, jaundice, and teratogenic effects are some side effects of Ribavirin (Chong et al. 2001). The side effects of Acyclovir are nausea, vomiting and headaches (Miserocchi et al. 2007). Plant-derived compounds are known for their usage as medicines traditionally (Nandagoapalan et al. 2016). Phytochemical classes such as flavonoids, terpenoids, phenols, xanthophylls, carotenoids, and essential oils are known for their immunomodulatory, antitumor, antimicrobial, and antioxidant properties. These compounds have successfully entered the modern world of drug development, one of the reasons being their prominent antiviral activities (Byler et al. 2016; Ben-Shabat et al. 2020; Pandey et al. 2021). We previously published a few research articles on the role of phytochemicals against viral targets using in silico strategy. Curcumin, for example, identified in our in-silico study, was later found to be effective against the SARS-CoV-2 variant in a clinical study (Pawar et al. 2021; Nag et al. 2021b, 2022a). This presents the scope for extending this approach to identify anti-Nipah compounds. Identification of phytocompounds as potential anti-Nipah drug candidates requires evaluation of the chemistry and stability of their binding interaction with Nipah proteins which can be facilitated through the integration of molecular docking study, molecular alignment of ligands (pharmacophore) and analysis of molecular dynamic simulation.

In this work, fifty-three (53) phytochemicals and the control drug, Ribavirin were tested for Nipah G protein inhibition potential through an in silico molecular docking study. Further, drug-like properties of the selected phytochemicals were also evaluated. We combined statistical and pharmacophore approaches to identify functional descriptors responsible for protein–ligand interaction. Finally, the stability of the ligand–protein complex was studied using the molecular dynamic simulation technique.

Materials and methodology

Preparation of the protein

The three-dimensional structure of the Nipah virus attachment glycoprotein (NiV G) in complex with the human cell receptor Ephrin B2 (PDB id 2VSM, chains A and B, X-ray diffraction, resolution 1.8 Å) was downloaded from the RCSB PDB (https://www.rcsb.org/). This structure was selected for its high resolution (1.8 Å), determined by X-ray diffraction (Berman et al. 2002). Chain A of the Nipah virus attachment glycoprotein (NiV G:A) was selected as the target protein for this study using UCSF Chimera software (Pettersen et al. 2004). The heteroatoms and water molecules in the receptor protein were removed using Discovery Studio 2021 (Makhloufi et al. 2022). Maestro 13.3 and SwissPDB viewer were used to edit the protein structure, which included removing non-polar hydrogens, adding polar hydrogens to the receptor protein, and energy minimization at physiological pH (7.4) (Trott and Olson 2009). Additionally, the three-dimensional structure of the human cell receptor Ephrin B2 (NiV G:B) was optimised for molecular docking following the same methodology used for NiV G:A.

Preparation of ligands

Fifty-three (53) phytochemicals, along with the control drugs Ribavirin and Acyclovir, were downloaded from the PubChem database (https://pubchem.ncbi.nlm.nih.gov/) and considered as ligands for this study (Kim et al. 2019). Ribavirin is commonly used as an antiviral agent in the literature. Tang et al. (2019) used Ribavirin as a control drug for inhibiting viral entry in vivo. Further, in another study, it was found to inhibit Nipah virus glycoprotein in silico (Ghimire et al. 2022). Considering this evidence, Ribavirin was selected as the control drug for our work. Acyclovir was administered to workers in Singapore during the Nipah virus outbreak in 1999. Hence, Acyclovir was also selected as a control drug in this study (Hauser et al. 2021). The three-dimensional structures of the phytochemicals were downloaded from PubChem in “SDF” format. The phytochemicals were structurally optimized and converted to “PDB” format using Avogadro software, a prerequisite for conducting molecular docking studies. For optimization, the universal force field (UFF) was applied for energy minimization of the ligands, followed by the addition of polar hydrogens at pH 7.4 (Hanwell et al. 2012; Nag et al. 2021b, 2023; Cho et al. 2022). Two-dimensional structures of the ligands are represented in Fig. 1.

Fig. 1

Two-dimensional structures of the fifty-three phytochemicals, along with the control drugs (Ribavirin and Acyclovir)

Active site prediction and molecular docking

Protein–protein docking

The protein–protein docking studies for the Nipah virus attachment glycoprotein (NiV G) in complex with the human cell receptor Ephrin B2 were performed using the ClusPro 2.0 docking server (https://cluspro.bu.edu/login.php) (Kozakov et al. 2017). The PIPER docking program used in ClusPro relies on the Fast Fourier Transform correlation approach. PIPER represents the interaction energy between two proteins using the following expression:

$$E = w_{1} E_{\text{rep}} + w_{2} E_{\text{attr}} + w_{3} E_{\text{elec}} + w_{4} E_{\text{DARS}}$$

where E_rep and E_attr denote the repulsive and attractive contributions to the van der Waals interaction energy, E_elec denotes the electrostatic energy, E_DARS represents the pairwise structure-based potential, and the coefficients w1, w2, w3 and w4 set the weights of the corresponding energy terms (Bhattacharya et al. 2020). The PDBSum web server (http://www.ebi.ac.uk/pdbsum) was used to visualise and analyse the interacting amino acids and bonds of the protein–protein complex (Laskowski 2009).
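As a minimal illustration of the weighted sum in the expression above, the following Python sketch combines four energy terms with four coefficients. The function name `piper_score` and all numeric values are hypothetical placeholders chosen for illustration; they are not ClusPro coefficient sets or outputs.

```python
def piper_score(e_rep, e_attr, e_elec, e_dars, weights):
    """Weighted sum E = w1*E_rep + w2*E_attr + w3*E_elec + w4*E_DARS."""
    w1, w2, w3, w4 = weights
    return w1 * e_rep + w2 * e_attr + w3 * e_elec + w4 * e_dars


# Hypothetical weights and energy terms, for illustration only.
example_weights = (0.4, -0.4, 600.0, 1.0)
example_energy = piper_score(
    e_rep=10.0, e_attr=50.0, e_elec=-0.5, e_dars=-100.0,
    weights=example_weights,
)
```

Changing the weights shifts the balance between shape complementarity, electrostatics and the knowledge-based potential without altering the form of the score.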

Protein–phytochemical docking

Discovery Studio 2021 (BIOVIA, San Diego, USA) was used to predict the active sites in the receptor protein NiV G:A. The grid box encompassed the residues GLN369A, GLU389A, TYR391A, and ILE398A (corresponding to GLN559, GLU579, TYR581, and ILE588 as per Kalbhor et al. 2021), which had a high binding affinity for Ephrin B2. The NiV G-Ephrin B2 complex and Ephrin B2 alone were also used as target proteins for molecular docking studies with the phytochemicals and the control drugs, Ribavirin and Acyclovir. The grid parameters for the respective receptor proteins are listed in Table 1. Protein-phytochemical molecular docking studies were performed with the DockThor web program (https://dockthor.lncc.br/v2/). DockThor uses the Santos Dumont supercomputer (Santos et al. 2020) to conduct virtual screening experiments. It houses docking tools such as MMFF Ligand and PdbThorBox. The docking algorithm applies the MMFF94S force field to the protein inputs (Nag et al. 2021a). The results of the protein–ligand docking studies (average binding energies, kcal mol−1) were compared with those of the control drug complexes.

Table 1 Grid parameters for protein–protein and protein–ligand docking studies

Evaluation of drug-like properties and toxicity analysis

The pharmacological and drug-likeness properties of the phytochemicals were evaluated by uploading their canonical SMILES to the SwissADME site (http://www.swissadme.ch/). The drug-likeness parameters evaluated were topological polar surface area (TPSA), lipophilicity (iLOGP and WLOGP), estimated water solubility (ESOL Log S), Lipinski's rule of five, gastrointestinal (GI) absorption and PGP substrate status. The membrane-bound transporter P-glycoprotein (PGP) mediates the efflux of structurally unrelated compounds, which alters the bioavailability of drugs. An ideal drug should follow the physicochemical guideline of Lipinski’s rule of five (RO5). As per RO5, an orally administered, biologically active compound should have a molecular weight less than 500 g mol−1, a log P (hydrophobicity) value less than 5, no more than 5 hydrogen bond donors, and no more than 10 hydrogen bond acceptors (Lipinski et al. 2001; Doak et al. 2014). SwissADME is a freely accessible web server that facilitates drug discovery by predicting drug-likeness parameters. It also computes the physicochemical descriptors and pharmacokinetic properties of chemical entities (Daina et al. 2017). Toxicity is one of the important reasons for failure in the drug development pipeline. Hence, it is crucial to test the toxicity endpoints of compounds. Toxicological properties, namely AMES toxicity, rat oral acute toxicity, skin sensitization and respiratory toxicity, were evaluated using the ADMETlab 2.0 web server (https://admetmesh.scbdd.com/) (Xiong et al. 2021).
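The RO5 thresholds quoted above can be expressed as a short check. The Python sketch below counts rule-of-five violations from four descriptors; the function name and the descriptor values in the example are hypothetical and are not taken from Table 3 or from SwissADME output.

```python
def lipinski_violations(mol_weight, log_p, h_donors, h_acceptors):
    """Count rule-of-five violations using the thresholds quoted in the text."""
    checks = [
        mol_weight < 500,    # molecular weight below 500 g/mol
        log_p < 5,           # log P below 5
        h_donors <= 5,       # at most 5 hydrogen bond donors
        h_acceptors <= 10,   # at most 10 hydrogen bond acceptors
    ]
    return sum(1 for passed in checks if not passed)


# Hypothetical descriptor sets, for illustration only.
small_molecule = lipinski_violations(300.0, 2.1, 2, 5)
large_glycoside = lipinski_violations(580.5, -0.4, 8, 14)
```

In this sketch the first example passes all four criteria, while the second fails on molecular weight, donors and acceptors, which is the typical pattern for large glycosylated flavonoids.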

Principal component analysis and protein–ligand interaction analysis

Ten of the fifty-three phytochemicals, along with the control drug, were selected based on the highest docking scores and the evaluation of drug-likeness properties, and were then grouped by Principal Component Analysis (PCA) using Minitab 18 statistical software. PCA simplifies the interpretation of large data sets by reducing their dimensionality with minimal loss of information. Each principal component depicts an individual dimension of variation in the measured data. Principal components (PCs) are uncorrelated variables, where PC1 captures the maximum variation, followed by PC2, PC3 and so on (Jolliffe and Cadima 2016; Lever et al. 2017). In this study, binding energy (kcal mol−1), molecular weight, and the drug-likeness parameters TPSA, iLOGP, WLOGP and ESOL Log S of the phytochemicals were used as inputs for the PCA. The outputs of PCA were observed as different clusters. One cluster was selected for the molecular alignment study based on the presence of the phytochemical with the optimal binding potential to the active site of the target protein. The protein–ligand docking complexes associated with the phytochemicals of the selected cluster and the control drug were used for amino acid-ligand interaction analysis. The BIOVIA Discovery Studio Visualizer (Dassault Systèmes) software was used to visualise and analyse the interacting amino acids and bonds of the protein-phytochemical complexes.
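To make the PCA step concrete, the sketch below standardizes a small descriptor table, builds its covariance matrix, and extracts the first two principal components by power iteration with deflation. It is a plain-Python illustration and not the Minitab 18 procedure used in this study; the descriptor rows are hypothetical values, and power iteration is simply one convenient way to obtain the leading eigenvectors.

```python
import math


def standardize(rows):
    """Scale every column to zero mean and unit sample variance."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
            for j in range(m)]
    return [[(r[j] - means[j]) / stds[j] for j in range(m)] for r in rows]


def covariance(centred):
    """Sample covariance matrix of column-centred data."""
    n, m = len(centred), len(centred[0])
    return [[sum(r[i] * r[j] for r in centred) / (n - 1) for j in range(m)]
            for i in range(m)]


def mat_vec(matrix, vec):
    """Matrix-vector product for nested lists."""
    return [sum(a * v for a, v in zip(row, vec)) for row in matrix]


def leading_components(matrix, k, iterations=500):
    """Top-k eigenpairs of a symmetric matrix via power iteration with deflation."""
    m = len(matrix)
    work = [row[:] for row in matrix]
    pairs = []
    for _component in range(k):
        vec = [1.0 / (i + 1) for i in range(m)]  # arbitrary non-zero start vector
        for _step in range(iterations):
            nxt = mat_vec(work, vec)
            norm = math.sqrt(sum(x * x for x in nxt))
            vec = [x / norm for x in nxt]
        value = sum(v * w for v, w in zip(vec, mat_vec(work, vec)))
        pairs.append((value, vec))
        # Remove the component just found so the next pass converges to the next one.
        work = [[work[i][j] - value * vec[i] * vec[j] for j in range(m)]
                for i in range(m)]
    return pairs


# Hypothetical rows: binding energy (kcal/mol), molecular weight, TPSA.
descriptors = [
    [-9.2, 580.5, 225.1],
    [-8.7, 610.5, 269.4],
    [-8.3, 286.2, 111.1],
    [-8.2, 272.3, 86.9],
    [-7.0, 244.2, 143.7],
]

scaled = standardize(descriptors)
cov = covariance(scaled)
pairs = leading_components(cov, k=2)
total_variance = sum(cov[i][i] for i in range(len(cov)))
explained = [value / total_variance for value, _vec in pairs]
# PC1/PC2 coordinates of each ligand, i.e. the points of a score plot.
scores = [[sum(x * v for x, v in zip(row, vec)) for _value, vec in pairs]
          for row in scaled]
```

The `explained` list holds the fraction of variance carried by PC1 and PC2, and `scores` gives the coordinates in which ligands can be grouped into clusters by eye or with a clustering method.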

Phytochemical alignment and identification of pharmacophores

Based on the multivariate PCA modelling results, the cluster comprising phytochemicals with the optimum binding affinity and drug-like properties was subjected to molecular alignment using an open-source server, PharmaGist (https://bioinfo3d.cs.tau.ac.il/PharmaGist/). The alignment of phytochemicals facilitated the identification of common descriptors or functional groups. PharmaGist utilizes the DUD (directory of useful decoys) data set, which consists of 2950 active ligands for 40 different receptors and 36 decoy compounds for each active ligand. The alignment scores of the input ligands from the chosen PCA cluster were generated based on their pivot and conformational results after pairwise alignments (Schneidman-Duhovny et al. 2008). The output was chosen based on the highest alignment score. The common descriptors (pharmacophores) responsible for effective ligand–protein interaction were analysed using ZincPharmer (http://zincpharmer.csb.pitt.edu/). The structures were visualized using PyMol 2.5 software (Schrödinger). PyMol is known to generate high-quality three-dimensional images of biological macromolecules, including proteins and small molecules. It includes the OpenGL Extension Wrangler Library (GLEW) and FreeGLUT and is capable of solving Poisson-Boltzmann equations. It uses Python as its programming language (Seeliger and de Groot 2010).

Molecular dynamic (MD) simulation

The MD simulation of NiV G:A and the phytochemical with the top binding affinity was performed using the GROMACS-2019.2 based bio-molecular package of Simlab, the University of Arkansas for Medical Sciences (UAMS), Little Rock, USA. The simulation utilised the GROMOS96 43a1 force field. PRODRG software was used to generate the ligand topology file (Schüttelkopf and van Aalten 2004). A grid box was specified for the protein–ligand complex. An environment of SPC water and 0.15 M counter ions (Na+/Cl−) was specified for the simulation. The set-up parameters, NVT/NPT ensembles at a temperature of 300 K and a pressure of 1 bar, were applied to the system. The Parrinello-Rahman barostat and the Parrinello-Donadio-Bussi thermostat were used to maintain the pressure and temperature, respectively (Huang et al. 2017). Energy minimization of the output model was performed with 5000 steps of the steepest descent integrator. Based on the available literature (Tadayon and Garkani-Nejad 2019; Alamri et al. 2020; Keretsu et al. 2020; Basu et al. 2020; Kushwaha et al. 2021; Yadav et al. 2021; da Cruz Freire et al. 2022), the run time for the MD simulation was fixed at 50 ns.

The results of the MD simulation studies were expressed in terms of Root Mean Square Deviation (RMSD), Radius of Gyration (Rg), Root Mean Square Fluctuation (RMSF), and Ligand-H bonds. RMSD measures the average deviation of atomic positions from a reference structure over the course of the simulation. The radius of gyration indicates the compactness of the protein structure in the free and bound forms (Nag et al. 2023). RMSF represents the differences in flexibility among residues with respect to the average molecular dynamic simulation conformation (Rao et al. 2020). Additionally, a comparative analysis of structural changes between the ligand-bound and ligand-free proteins was performed using the PyMol measurement wizard.
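The three structural metrics named above have simple definitions, sketched here in plain Python for clarity. This is not the GROMACS implementation: the sketch assumes the frames are already superimposed on the reference (no least-squares fitting is performed), and the two-atom coordinates are hypothetical values in nm.

```python
import math


def rmsd(coords, reference):
    """Root mean square deviation between two equally sized coordinate sets."""
    squared = sum((a - b) ** 2
                  for p, q in zip(coords, reference)
                  for a, b in zip(p, q))
    return math.sqrt(squared / len(coords))


def radius_of_gyration(coords, masses):
    """Mass-weighted root mean square distance of atoms from the centre of mass."""
    total = sum(masses)
    centre = [sum(m * p[k] for m, p in zip(masses, coords)) / total
              for k in range(3)]
    squared = sum(m * sum((p[k] - centre[k]) ** 2 for k in range(3))
                  for m, p in zip(masses, coords))
    return math.sqrt(squared / total)


def rmsf(frames):
    """Per-atom fluctuation around each atom's mean position across frames."""
    n_frames, n_atoms = len(frames), len(frames[0])
    result = []
    for i in range(n_atoms):
        mean = [sum(f[i][k] for f in frames) / n_frames for k in range(3)]
        squared = sum((f[i][k] - mean[k]) ** 2
                      for f in frames for k in range(3))
        result.append(math.sqrt(squared / n_frames))
    return result


# Hypothetical two-atom system (coordinates in nm), for illustration only.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
shifted = [(0.0, 0.0, 0.0), (1.2, 0.0, 0.0)]

deviation = rmsd(shifted, reference)
compactness = radius_of_gyration(reference, [1.0, 1.0])
fluctuation = rmsf([reference, shifted])
```

A low, flat RMSD trace, a steady Rg and small per-residue RMSF values are the usual signs that a complex remains stable over the trajectory.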

Free energy analysis by MM-PBSA (molecular mechanics-Poisson–Boltzmann solvent-accessible surface area) calculation

The free energies of the top ligand-NiV G:A complex were estimated by the Molecular Mechanics-Poisson–Boltzmann Solvent-Accessible Surface Area (MM-PBSA) method using the g_mmpbsa package (Kumari et al. 2014; Nag et al. 2022a, b) for the final 10 ns of the simulation time frame. The free energy included ΔG_van der Waals, ΔG_Electrostatic, ΔG_Polar, ΔG_Non-Polar, ΔG_Binding and residual contribution energy parameters.

The calculation of ΔG_Bind (kJ mol−1) was represented by the following equation:

$$\Delta G_{\text{Bind}} = G_{\text{Comp}} - \left( G_{\text{Prot}} + G_{\text{Lig}} \right)$$

The value of ΔG_Bind (kJ mol−1) was derived by subtracting the sum of the individual energy values of the protein (G_Prot) and ligand (G_Lig) from the energy of the protein–ligand complex (G_Comp).
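The subtraction described above is typically evaluated per trajectory frame and then averaged. The sketch below shows that arithmetic only; the function names and the per-frame energies (kJ mol−1) are hypothetical and do not come from the g_mmpbsa output of this study.

```python
def binding_free_energy(g_complex, g_protein, g_ligand):
    """Delta G_Bind = G_Comp - (G_Prot + G_Lig) for a single frame."""
    return g_complex - (g_protein + g_ligand)


def mean_binding_free_energy(frames):
    """Average Delta G_Bind over (G_Comp, G_Prot, G_Lig) tuples, one per frame."""
    values = [binding_free_energy(*frame) for frame in frames]
    return sum(values) / len(values)


# Hypothetical per-frame energies in kJ/mol, for illustration only.
example_frames = [
    (-1250.0, -1000.0, -100.0),
    (-1270.0, -1010.0, -110.0),
]
example_mean = mean_binding_free_energy(example_frames)
```

A more negative average indicates more favourable binding under this approximation.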

Results and discussion

Selection of phytochemicals

Fifty-three phytochemicals and the control drug Ribavirin were selected for this study based on the literature listed in Table 2.

Table 2 List of compounds selected for the study and the corresponding literature evidence

Drug-likeness of selected compounds

The drug-like properties were predicted by SwissADME for all fifty-three phytochemicals, along with the control drugs, Ribavirin and Acyclovir (Table 3). The bioavailability of drug candidates was represented by Topological Polar Surface Area (TPSA), for which the suggested range is 20 to 140 Å² (Ertl et al. 2000). All the phytochemicals except for a few (Albaspidin AA, Beta-carotene, Chlorogenic acid, Riboflavin, Neosappanone A, Agrimol D, Forsythoside D, Verbascoside, Myricetin, Quercetin 3-galactoside, Kuzubutenolide A, Kaempferol-3-glucoside, Asebotin, Rutin, Mulberrofuran B) satisfied this criterion. Hou and Wang (2008) noted that in some cases there might be no correlation between TPSA and other drug-like parameters. Water solubility also plays a vital role in determining drug-like properties. The solubility of drugs calculated by applying the ESOL model should be less than 6. The ESOL model establishes a linear relationship between log S and four molecular parameters, namely molecular weight, the number of rotatable bonds, the fraction of heavy aromatic atoms and Daylight's CLOGP (Daina et al. 2017). All the phytochemicals were found to be within the recommended ESOL Log S value. Absorption in the gastrointestinal (GI) tract is representative of the transcellular passive diffusion parameter, which is crucial to the cellular permeability of the drug candidates (Nag et al. 2022b). In this study, Malic acid, Linalol, Cinnamic acid, Salicylic acid, Vanillic acid, p-Coumaric acid, Caffeic acid, Sinapinic acid, Ferulic acid, Pentadecylic acid, Myristic acid, Coniferol, Oleic acid, Isorhamnetin, Pseudocarpaine, Linolenic acid, Thiamine, Obovatol, Prunasin, Moupinamide, Dehydrocarpaine I, Dehydrocarpaine II, Carpaine, Kaempferol, Morin, Isoarundinin II, Quercetin, Naringenin, Fisetin, Acyclovir and Tribulusamide B showed high GI absorption capacity. The calculation of iLOGP depends on the Gibbs free energy of solvation.
The Gibbs free energy of solvation is computed with the Generalized-Born and solvent-accessible surface area (GB/SA) model in water and n-octanol. The iLOGP values for all the phytochemicals in this work, with the exception of a few compounds (Stigmasterol, Beta-carotene, β-Cryptoxanthin, Lutein), were found to meet the recommended value (less than 5) (Ibrahim et al. 2021). The extracellular efflux of a wide range of structurally unrelated drugs is mediated by the membrane-bound transporter PGP (P-glycoprotein). PGP-induced efflux of various substrates against a concentration gradient leads to a decrease in the intracellular concentration of the substrates, impacting their oral bioavailability (Constantinides and Wasan 2007). The phytochemicals Lutein, β-Cryptoxanthin, Quinic acid, Albaspidin AA, Beta-carotene, Naringin, Thiamine, Dehydrocarpaine I, Mulberrofuran B, Dehydrocarpaine II, Forsythoside D, Verbascoside, Naringenin, Kuzubutenolide A, Asebotin and Rutin were found to be PGP substrates (PGP+), and the rest were PGP negative. Minimal Lipinski violations (0–3) were observed for all the phytochemicals. In general, the literature indicates that the drug-likeness of a compound should not be judged on a single parameter, nor is a compound expected to pass all the drug-likeness tests (Ertl et al. 2000). Based on this understanding, all the phytochemicals passed one or more criteria and qualified as candidate drugs.

Table 3 Evaluation of drug-likeness of the phytochemicals as determined by SwissADME

The toxicity analysis of the selected compounds is presented in Table 4. Four test parameters, namely the Ames test, rat oral acute toxicity, skin sensitization and respiratory toxicity, were studied to evaluate the toxicity of the compounds. The Ames test for mutagenicity indicates the carcinogenic potential of the compounds, while skin sensitization relates to conditions such as allergies and contact dermatitis. The other test parameters, rat oral acute toxicity and respiratory toxicity, are related to morbidity and mortality. Monitoring such parameters should be given importance to avoid drug withdrawal (Dong et al. 2018; Xiong et al. 2021). The results indicated that most of the compounds selected in this study were safe and could be used as candidate drugs.

Table 4 Toxicity analysis of selected compounds as determined by the ADMETlab 2.0 web server

Molecular docking

In the present study, the docking results of all fifty-three phytochemicals and the control drugs, Ribavirin and Acyclovir, with the target protein (NiV G:A) were analysed (Table 5). The control drug Acyclovir showed lower binding affinity (– 6.55 kcal mol−1) towards NiV G:A than Ribavirin (– 6.95 kcal mol−1). The literature shows the efficacy of Ribavirin against Nipah virus particles in vitro (Wright et al. 2005; Aljofan et al. 2009). Acyclovir, on the other hand, was administered to nine (9) Nipah-infected workers in Singapore; however, its role in curing the patients was unclear (Paton et al. 1999). Considering our current results and the lack of evidence in the literature for anti-Nipah drugs, Ribavirin was used as the control drug in our study.

Table 5 The results of molecular docking in terms of binding affinity (kcal mol−1) between compounds and target protein as generated by the DockThor server

The evaluation of the top ten (10) docking scores showed that the phytochemicals Naringin, Tribulusamide B, Mulberrofuran B, Rutin, Asebotin, Kaempferol-3-glucoside, Kuzubutenolide A, Fisetin, Naringenin and Quercetin 3-galactoside bound more strongly to the target NiV G:A (– 9.19, – 9.01, – 8.84, – 8.71, – 8.56, – 8.51, – 8.32, – 8.28, – 8.25 and – 8.23 kcal mol−1, respectively) than the control drug, Ribavirin. The medicinal importance of these compounds is tabulated in Table 2. Only two compounds, Naringin and Tribulusamide B, showed a binding energy more favourable than – 9 kcal mol−1. In the toxicity analysis of our study, Naringin was found to be safer than Tribulusamide B. Additionally, the affinity of the phytochemicals towards the Ephrin B2-NiV G:A complex and the human Ephrin B2 protein alone was also evaluated (Tables S1 and S2). The results indicated that the top ligand, Naringin, in particular could selectively bind NiV G:A (– 9.19 kcal mol−1) and the NiV G-Ephrin B2 complex (– 10.27 kcal mol−1) over the human Ephrin B2 protein alone (– 7.09 kcal mol−1). However, Naringin showed higher affinity towards Ephrin B2 than the control drug Ribavirin (– 6.974 kcal mol−1). Ephrin B2 is a B-class ephrin and potential henipaviral receptor, which facilitates viral entry and activates virus-host fusion (Priyadarsinee et al. 2022). Naringin has been reported to have significant anti-viral properties. In a recent study, Hussen et al. (2020), through molecular studies, showed that Naringin, with a binding affinity of – 10.2 kcal mol−1, could strongly inhibit the activity of SARS-CoV-2 main protease (Amin Huseen 2020). Our results suggest that Naringin could disrupt the NiV G:A-Ephrin B2 complex. Owing to its comparative selectivity towards the viral protein over the human receptor, post-infection clearance may be possible.
The three-dimensional structures of Naringin-protein complexes are shown in Fig. 2.
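The ranking described in this section can be reproduced with a few lines of Python. The sketch below uses only the NiV G:A binding energies quoted in the text for the top ten phytochemicals and for Ribavirin; the variable and function names are illustrative and are not part of DockThor.

```python
# Binding energies towards NiV G:A (kcal/mol) as quoted in the text.
docking_scores = {
    "Naringin": -9.19,
    "Tribulusamide B": -9.01,
    "Mulberrofuran B": -8.84,
    "Rutin": -8.71,
    "Asebotin": -8.56,
    "Kaempferol-3-glucoside": -8.51,
    "Kuzubutenolide A": -8.32,
    "Fisetin": -8.28,
    "Naringenin": -8.25,
    "Quercetin 3-galactoside": -8.23,
}
RIBAVIRIN_SCORE = -6.95  # control drug, kcal/mol


def rank_by_affinity(scores):
    """Ligand names sorted from most to least favourable binding energy."""
    return sorted(scores, key=scores.get)


def better_than_control(scores, control):
    """Ligands whose binding energy is more negative than the control's."""
    return [name for name in rank_by_affinity(scores) if scores[name] < control]


ranked = rank_by_affinity(docking_scores)
outperforming = better_than_control(docking_scores, RIBAVIRIN_SCORE)
below_minus_nine = [name for name in ranked if docking_scores[name] < -9.0]
```

Because binding energies are negative, "stronger" binding corresponds to a smaller (more negative) number, which is why the comparison uses `<`.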

Fig. 2

Three-dimensional structures of ligand–protein complexes. a Nipah virus attachment glycoprotein (NiV G:A) in complex with Ephrin B2 (PDB id 2VSM) b NiV G:A-Ephrin B2-Naringin complex c Ephrin B2-Naringin complex d NiV G:A-Naringin complex

Principal component analysis

In this work, the first principal component (PC1) and the second component (PC2) explained approximately 49.70 and 24.20% of the variance, respectively. We observed four clusters in the PCA (Fig. 3). Cluster 4 consisted of Naringin, Quercetin 3-galactoside, Rutin and Mulberrofuran B, while cluster 1, comprised the control drug Ribavirin along with Kaempferol-3-glucoside and Kuzubutenolide A. Fisetin and Naringenin were observed in cluster 2 and Asebotin and Tribulusamide B were grouped into cluster 3. There were marginal differences observed concerning the selected parameters (binding affinities and drug-likeness parameters) for clusters 3 and 4. Clusters 3 and 4 grouped top-ranked ligands in binding affinities towards NiV G:A, which also had higher molecular weights, and similar ranges of values for iLOGP and ESOL Log S, respectively. Clusters 1 and 2 consisted of low-scoring ligands in binding affinities towards NiV G:A, which also had differences in molecular weights, WLOGP, iLOGP, and ESOL Log S. The TPSA values for phytochemicals of cluster 4 were the highest, followed by cluster 3, 1 and 2. The observed differences associated with selected parameters for cluster 1 and 2 were more when compared to the differences related to the parameters for clusters 3 and 4. The multidimensional data analysis tool of Principal Component Analysis (PCA) can be used to achieve insightful information on the protein–ligand binding affinities (Nag et al. 2022b). The molecular interactions of Mur enzymes of Mycobacterium tuberculosis by utilizing the advanced PCA tool were explored by Kumari et al. 2021. The uniqueness of PCA lies in the conversion of measured variables into principal components. The use of the PCA tool for categorization of ligands and thereby establishing a correlation between the ligands was successfully implemented in our previous works (Nag et al. 2021a, 2022b, 2023). 
Considering the robustness of the PCA tool and based on our results, the compounds in cluster 4 were selected for pharmacophore study.

Fig. 3

Principal Component Analysis of the selected top ten phytochemicals showing four clusters (1, 2, 3 and 4); the x and y axes represent the first and second components, respectively

Interaction analysis

The docked complexes of the PCA cluster 4 ligands with NiV G:A and Ephrin B2 were selected for the amino acid-ligand interaction analysis (Fig. 4). The phytochemicals Naringin, Rutin, Mulberrofuran B and Quercetin 3-galactoside interacted with the amino acid residues of the target protein through conventional hydrogen bonds, carbon-hydrogen bonds, alkyl bonds, pi-pi T-shaped bonds and pi-sulfur bonds. Some of the residues involved in the interactions, GLN369A, GLU389A, TYR391A, and ILE398A (corresponding to GLN559, GLU579, TYR581, and ILE588 as per Kalbhor et al. 2021), had a strong binding affinity towards Ephrin B2. The control drug, Ribavirin, interacted with NiV G:A through conventional hydrogen, pi-sulfur, pi-alkyl and pi-pi T-shaped bonds. Among the various amino acids, ALA342A and CYS50A of NiV G:A were common to all interactions, including those of the control drug, Ribavirin. For Ephrin B2, two amino acid residues, THR114B and LYS60B, were found to be common when compared with the control drug, Ribavirin. The details of the ligand–protein interactions are presented in Table 6. Further, a comparison of the interactions in the Naringin-NiV G:A-Ephrin B2 and NiV G:A-Ephrin B2 complexes revealed GLU36A and PRO122B as common residues. Also, Naringin was found to be placed within the attachment site of NiV G:A and Ephrin B2 (PHE376A, LEU377A, LEU378A, LYS379A; GLY37A, TYR38A; LYS116B, GLN118B, GLU119B, PHE120B, SER121B; SER257A, LEU258A) (Fig. S1).

Fig. 4

Two-dimensional representation of protein–ligand interactions (PCA cluster 4): a1 to e1 Interaction with Nipah virus attachment glycoprotein (NiV G:A, PDB id 2VSM, Chain A) [a1 Naringin, b1 Mulberrofuran B, c1 Quercetin 3-galactoside, d1 Rutin, e1 Ribavirin]; a2 to e2 Interaction with Ephrin B2 (PDB id 2VSM, Chain B) [a2 Naringin, b2 Mulberrofuran B, c2 Quercetin 3-galactoside, d2 Rutin, e2 Ribavirin]

Table 6 Amino acid residues of target proteins interacting with the phytochemicals

Phytochemical alignment and identification of pharmacophores

Based on the PCA results, as explained earlier, cluster 4 (Naringin, Rutin, Mulberrofuran B and Quercetin 3-galactoside) was selected for the pharmacophore (descriptor) dependent molecular alignment. Phytochemical alignment reveals the molecular features responsible for the biological properties of a drug. The functional descriptors include hydrogen bond donors (HBD), hydrogen bond acceptors (HBA), positive features, negative features, aromatic rings and hydrophobic features (Mohammed 2021). The interactions with the target proteins are carried out by the descriptors present in the ligands. In our recent study, a similar technique was used to align phytochemicals in Camellia sinensis L. based on their descriptors (Nag et al. 2022b).

Further, we have deployed molecular alignment elsewhere to evaluate the interaction of phytochemicals with the residues of the target protein (Nag et al. 2021b, 2023; Nag and Banerjee 2021). All these studies combined PCA and molecular alignment. Recently, Glaab et al. (2021) performed a pharmacophore modelling-based study which involved molecular alignment of ligands to virtually screen SARS-CoV-2 viral protease 3CLpro inhibitors from the ZINC database, the SWEETLEAD library and the MolPort library. In the current study, we identified four H bond acceptors, one H bond donor and two aromatic groups as the common descriptors for all cluster 4 phytochemicals (Fig. 5a), with a high alignment score of 32.245, as provided by the PharmaGist webserver. While evaluating the interacting groups of the ligands with the target proteins, we observed that the identified common descriptors of Naringin, Rutin, Mulberrofuran B and Quercetin 3-galactoside were the functional pharmacophores responsible for the protein–ligand interaction, as shown in Fig. 5b.

Fig. 5

Molecular alignment of phytochemicals and pharmacophore interactions: a Identification of pharmacophores of four aligned phytochemicals (Naringin, Mulberrofuran B, Rutin and Quercetin 3-galactoside); b1-b2-b3-b4 Three-dimensional representation of protein–ligand pharmacophore interactions (ACC: Acceptor, DON: Donor, AR: Aromatic)

Molecular dynamic (MD) simulation

Molecular dynamic simulation is commonly used to validate the conformation and molecular stability of docked complexes under physiological conditions (Dhiman and Purohit 2022). In the present study, MD simulation was used to examine the stability and conformational dynamics of the top-ranked Naringin-NiV G:A complex. The stability of this complex was compared with that of Ribavirin-NiV G:A using four parameters collectively, namely RMSD, RMSF, radius of gyration and ligand–protein H bonds. The results showed that the binding profiles of Naringin and the control drug Ribavirin to the target protein were similar. The MD simulation technique is widely applied in the literature. Kumar et al. (2023) recently applied a robust MD simulation technique to evaluate quinoline molecule-mediated structure restoration and aggregate inhibition of the V30M mutant transthyretin protein. In another study using this technique, Singh and Purohit (2023) screened thirty-two 3-methyleneisoindolin-1-one molecules and reported M24 as the top cyclin-dependent kinase 4/6 (CDK4/6) inhibitor in comparison with the drug Palbociclib.

The average RMSD values were 0.26, 0.24 and 0.23 nm for the apo protein, Ribavirin-NiV G:A and Naringin-NiV G:A, respectively. The RMSD fluctuations observed for both complexes were minimal (around ± 0.05 nm) and comparable to those of the apo (ligand-free) protein (Fig. 6a). The literature indicates that an RMSD value of less than 2 Å represents a stable protein structure, that is, one with little deviation (Singh et al. 2022). Further, conformational changes resulting from protein–ligand interactions are known to affect RMSD values, so slight fluctuations in the RMSD profiles are expected during a simulation (Alamri et al. 2020). Randhawa et al. (2022) established the stability of docked complexes of Nipah protein and selected phytochemicals in a 5 ns simulation and also observed RMSD fluctuations during the simulation. Overall, our RMSD results for both complexes and the apo protein agreed with these literature reports, indicating the stability of the ligand–protein complexes in our study.
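As a minimal sketch of the RMSD measure discussed above, the following Python snippet computes the RMSD of one frame against a reference structure, assuming the frame has already been superimposed on the reference. The function name `rmsd`, the constant `NM_TO_ANGSTROM` and the coordinates are illustrative assumptions, not the workflow or data of this study. The unit conversion is included only because the simulation values are given in nm while the cited threshold is in Å.

```python
import math

NM_TO_ANGSTROM = 10.0  # 1 nm = 10 Å


def rmsd(coords, ref_coords):
    """Root-mean-square deviation between two equally sized coordinate lists.

    Each list holds (x, y, z) tuples. No fitting is done here: the frame is
    assumed to be superimposed on the reference beforehand.
    """
    if not coords or len(coords) != len(ref_coords):
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (a - b) ** 2
        for p, q in zip(coords, ref_coords)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(coords))


# Hypothetical three-atom example, coordinates in nm.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
frame = [(0.1, 0.0, 0.0), (1.0, 0.1, 0.0), (0.0, 1.0, 0.1)]

value_nm = rmsd(frame, reference)
value_angstrom = value_nm * NM_TO_ANGSTROM
print(f"RMSD = {value_nm:.3f} nm = {value_angstrom:.2f} Å")
```

In practice, trajectory analysis tools perform a least-squares fit of each frame to the reference before this calculation.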

Fig. 6

Molecular dynamic (MD) simulations of apo protein Nipah Virus Attachment Glycoprotein (PDB id 2VSM), 2VSM-Ribavirin complex, 2VSM-Naringin complex: a Root-Mean-Square Deviation (RMSD), b Radius of Gyration, c Root-Mean-Square Fluctuation (RMSF) and d Ligand–Protein H bonds

The Rg values ranged from 1.67 to 1.70 nm, demonstrating that ligand binding did not affect the compactness of the protein (Fig. 6b). We observed limited fluctuation in the radius of gyration of the Naringin-bound protein at around 30–40 ns. Minor fluctuations of the radius of gyration are commonly observed in biological systems. In an in silico molecular dynamic simulation study of ATP-binding cassette super-family G member 2 enzyme and 2,4-disubstituted pyridopyrimidine derivatives, lower fluctuations in the radius of gyration between 45 and 50 ns were reported (Tadayon and Garkani-Nejad 2019). Such fluctuations in the RMSD and Rg parameters have also been reported elsewhere (Hasan et al. 2022). The root mean square fluctuations (RMSF) of individual amino acid residues were analysed to understand the residue-level mobility of the ligand-bound and unbound forms (Kumar et al. 2023). The average RMSF values for the apo protein, Naringin-NiV G:A and Ribavirin-NiV G:A ranged from 0.1 to 0.5 nm (Fig. 6c). The structural stability of a complex is affected by the formation of stable hydrogen bonds between the ligand and the protein. In their interaction with NiV G:A, Naringin and Ribavirin formed a maximum of six and four hydrogen bonds, respectively. At least one hydrogen bond was long-lived throughout the 50 ns simulation for both the Naringin and Ribavirin complexes (Fig. 6d). The molecular dynamic simulation results showed that Naringin could form an effective and stable interaction with the target protein NiV G:A at its active site and could thereby inhibit Nipah virus infection.
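The two remaining structural measures can be sketched in the same way. The snippet below gives textbook definitions of the mass-weighted radius of gyration of a single frame and the per-atom RMSF over a set of frames. The function names and the toy coordinates are assumptions made for illustration only; they are not taken from the simulation described here.

```python
import math


def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration of one frame.

    coords: list of (x, y, z) tuples; masses: list of atomic masses.
    """
    total_mass = sum(masses)
    centre = tuple(
        sum(m * p[k] for p, m in zip(coords, masses)) / total_mass
        for k in range(3)
    )
    weighted = sum(
        m * sum((p[k] - centre[k]) ** 2 for k in range(3))
        for p, m in zip(coords, masses)
    )
    return math.sqrt(weighted / total_mass)


def rmsf(trajectory):
    """Per-atom root-mean-square fluctuation about the time-averaged position.

    trajectory: list of frames, each a list of (x, y, z) tuples, assumed to
    be fitted to a common reference beforehand.
    """
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    values = []
    for i in range(n_atoms):
        mean = tuple(
            sum(frame[i][k] for frame in trajectory) / n_frames for k in range(3)
        )
        mean_sq = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        values.append(math.sqrt(mean_sq))
    return values


# Hypothetical data: four equal-mass atoms on a unit circle (nm).
square = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, -1.0, 0.0)]
rg = radius_of_gyration(square, [12.0] * 4)

# Hypothetical two-frame trajectory: atom 0 moves, atom 1 stays fixed.
toy_trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
    [(0.2, 0.0, 0.0), (1.0, 1.0, 1.0)],
]
fluctuations = rmsf(toy_trajectory)
print(rg, fluctuations)
```

Residue-level RMSF, as plotted in Fig. 6c, is usually obtained by applying the same definition to one representative atom per residue or by averaging over the atoms of each residue.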

Free energy analysis by MM-PBSA calculation

In agreement with our molecular docking result, the free energy analysis showed that Naringin (−218.664 kJ mol−1) had a higher binding affinity than the control drug, Ribavirin (−83.812 kJ mol−1). A low (more negative) binding energy in MM-PBSA analysis represents a strong binding potential of the ligand for the target protein. The free energy can be estimated using the advanced MM-PBSA methodology. This method, although associated with a high computational cost, generates significantly more accurate results than the conventional score-based molecular docking technique (Ren et al. 2015; Singh et al. 2022; Singh and Purohit 2023). Idris et al. (2021) performed MM-PBSA analysis and showed that two ZINC compounds, ZINC64606047 and ZINC05296775, could bind strongly to transmembrane serine protease 2, with van der Waals binding energies of −190.75 ± 16.39 and −140.16 ± 14.93 kJ mol−1, respectively. Similarly, Elkarhat et al. (2022) reported high values of −191.982 kJ mol−1 for the SARS-CoV-2 nsp12–streptolydigin complex and −153.583 kJ mol−1 for the nsp12–VXR complex in a 30 ns MD simulation MM-PBSA analysis. Electrostatic, non-polar and polar interaction parameters are associated with the dynamic stability of a protein–ligand complex. Unfavourable polar solvation energy makes a low contribution to binding (He et al. 2014). In this study, van der Waals interactions contributed more negative energy than their electrostatic counterpart (Table 7). The residues THR25, CYS26, THR28, SER51, GLN369, ILE390, THR393, ASN396, ILE398 and PRO400 made the major contributions to the binding of Naringin to the NiV G protein (Fig. 7). The important residues for the Ribavirin-NiV G complex were CYS50, SER51, TYR391, THR393, ASN396 and ILE398 (Table 7 and Fig. 7). The hydrogen bonds, along with the contribution energies of these residues, are listed in Table 8.
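The way the energy terms named above combine can be sketched as a simple sum. The snippet assumes the common MM-PBSA decomposition into van der Waals, electrostatic, polar solvation and non-polar solvation terms, with the entropic contribution omitted. The component values in `example_terms` are hypothetical placeholders and are not the entries of Table 7. Only the two total binding energies are taken from the text, and the kJ to kcal conversion is added for comparison with the docking scores.

```python
KJ_PER_KCAL = 4.184  # thermochemical calorie


def binding_free_energy(vdw, elec, polar_solv, nonpolar_solv):
    """Sum the MM-PBSA terms (all in kJ/mol); the entropy term is omitted."""
    return vdw + elec + polar_solv + nonpolar_solv


def kj_to_kcal(energy_kj):
    """Convert an energy from kJ/mol to kcal/mol."""
    return energy_kj / KJ_PER_KCAL


# Hypothetical component energies in kJ/mol (placeholders, NOT Table 7 values).
example_terms = {
    "vdw": -200.0,
    "elec": -50.0,
    "polar_solv": 120.0,   # unfavourable (positive) polar solvation
    "nonpolar_solv": -20.0,
}
example_total = binding_free_energy(**example_terms)

# Total binding energies reported in the text (kJ/mol).
reported = {"Naringin": -218.664, "Ribavirin": -83.812}

# Most negative energy first, i.e. strongest predicted binder first.
ranking = sorted(reported, key=reported.get)

for name in ranking:
    print(f"{name}: {reported[name]:.3f} kJ/mol = {kj_to_kcal(reported[name]):.2f} kcal/mol")
```

The positive polar solvation term in the placeholder example shows how an unfavourable contribution reduces the magnitude of the total binding energy.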

Table 7 MM-PBSA calculations of binding free energy for protein–ligand complexes
Fig. 7

Contribution energy plots of interacting amino acids of target proteins (Nipah virus attachment glycoprotein) to Naringin and the control drug, Ribavirin

Table 8 Amino acid residues of target protein NiV G and ligand complexes, with at least one hydrogen bond formed during MD simulation

Structural changes in the native Nipah Virus attachment glycoprotein after binding of control drug and ligand

The distances between randomly flagged amino acids were measured to understand the structural impact on the Nipah virus attachment glycoprotein (apo) of binding with the ligand, Naringin, and the control drug, Ribavirin (Fig. 8). A steady decrease in the distances between amino acids in the ligand–protein complexes, in comparison with the apo (ligand-free) protein, indicated that the alteration of the size of the active site might lead to loss of the enzymatic activity of the target protein NiV G:A (Table 9).

Fig. 8

Structural comparison by PyMol 2.0 software for a Nipah virus attachment glycoprotein chain A (NiV G:A) (Apo); b control drug, Ribavirin + NiV G:A and c Naringin + NiV G:A

Table 9 Distance analysis between flagged amino acids of ligand free protein and selected complexes
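The distance comparison summarised in Table 9 amounts to measuring the Euclidean distance between the same pair of flagged residues in the apo and ligand-bound structures. The sketch below assumes one representative atom per residue and uses hypothetical coordinates in Å. The residue labels `res_a` and `res_b` are placeholders, and none of the numbers are taken from Table 9.

```python
import math


def residue_distance(structure, first, second):
    """Euclidean distance between the representative atoms of two residues."""
    return math.dist(structure[first], structure[second])


# Hypothetical representative-atom coordinates in Å (NOT data from this study).
apo = {"res_a": (0.0, 0.0, 0.0), "res_b": (3.0, 4.0, 0.0)}
bound = {"res_a": (0.0, 0.0, 0.0), "res_b": (2.4, 3.2, 0.0)}

d_apo = residue_distance(apo, "res_a", "res_b")
d_bound = residue_distance(bound, "res_a", "res_b")
change = d_bound - d_apo  # negative when the residues move closer on binding

print(f"apo: {d_apo:.2f} Å, bound: {d_bound:.2f} Å, change: {change:+.2f} Å")
```

A negative change for most flagged pairs would correspond to the steady decrease in distances described above.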

Conclusion

Among the fifty-three screened phytochemicals, Naringin showed the strongest inhibitory potential against the Nipah virus glycoprotein (NiV G) (−9.19 kcal mol−1) when compared with the control drug, Ribavirin (−6.95 kcal mol−1). Three other phytochemicals, Mulberrofuran B, Rutin and Quercetin 3-galactoside, were also found to have strong binding affinities for the target protein. The pharmacophores, namely four H bond acceptors, one H bond donor and two aromatic groups, were found to be responsible for effective protein–ligand interaction. Finally, as revealed by MD simulation, Naringin could form a stable complex with the target protein, NiV G, in a near-native physiological environment.