Introduction

Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) is a highly contagious virus that causes mild to life-threatening respiratory tract infection (Parry 2020). The virus was first detected in Wuhan, China, in December 2019 and subsequently spread throughout the world, affecting about 188 countries. It belongs to the family Coronaviridae in the order Nidovirales; members of the family are classified into the α, β, γ, and δ genera. The World Health Organization (WHO) named the virus 2019 novel coronavirus (2019-nCoV) on 12 January 2020. Later, on 11 February 2020, the International Committee on Taxonomy of Viruses (ICTV) named it Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) on the basis of phylogenetic analysis, in which it formed a sister clade with SARS-CoV (Gorbalenya et al. 2020); at the same time, the disease caused by the virus was named coronavirus disease 2019 (COVID-19). As of 13 June 2020, the virus had infected about 7.5 million people across the globe, with over 421,000 deaths. COVID-19 presents with varying degrees of severity, from mild and moderate to severe, with fever, headache, cough, fatigue, hypoxemia, diarrhea, dyspnea, lymphopenia, acute cardiac injury, rhinorrhea, sneezing, sore throat, pneumonia, and even death (Rothan and Byrareddy 2020).

Following the onset of the pandemic, several complete genome sequences of SARS-CoV-2 isolates were submitted to the National Center for Biotechnology Information (NCBI) (Wu et al. 2020). The sequences encode four essential structural proteins (spike, envelope, nucleocapsid, and membrane proteins) and sixteen non-structural proteins (Nsp1–Nsp16) (Wu et al. 2020). Among the non-structural proteins, Nsp5, also known as the 3C-like main protease (3CLpro), is an essential multifunctional enzyme that plays a vital role in the replication and transcription of the virus by promoting the maturation of the Nsps. Its proteinase activity cleaves the polyprotein at eleven sites to produce the different non-structural proteins that are vital for viral replication (Wu et al. 2020). In contrast to the accessory and structural proteins, which are encoded at the 3′ end of the genome and display significant variability, 3CLpro is comparatively conserved. This makes the protein a suitable target for drug design and discovery. The 3CLpro has three domains: domain I comprises residues 1–100, domain II residues 102–184, and domain III residues 201–303. Domains II and III are joined by a long loop comprising residues 185–200. The active site of the protein is situated between domains I and II and contains two key residues, Cys145 and His41 (Wu et al. 2020). A few protease inhibitors (lopinavir/ritonavir) have shown promising activity against SARS-CoV by inhibiting the activity of the catalytic dyad (Cys145 and His41). However, these inhibitors have numerous disadvantages, including toxicity, side effects due to off-target activity, adverse drug responses, and inadequate potency (Ton et al. 2020).

Medicinal plants have long been used for the treatment of several ailments in Africa. These plants contain numerous Pharmaceutical Active Ingredients (PAIs) that could be used to develop modern drugs with minimal or no adverse effects (ul Qamar et al. 2020). Currently, no FDA-approved protease inhibitors are available for the treatment of COVID-19. Against this background, this study was designed to identify novel inhibitors of SARS-CoV-2 3CLpro from selected African medicinal plants.

Materials and methods

Collection and preparation of the plant materials

Fresh, healthy Zingiber officinale and leaves of Anacardium occidentale were collected within the premises of the University of Maiduguri, Borno State, Nigeria. The plant materials were verified and authenticated at the Department of Biological Science, University of Maiduguri. After authentication, they were washed thoroughly three to four times with tap water and allowed to dry at room temperature for 1 week. The dried plant materials were ground to powder using a grinder. About 200 g of the dried powder was extracted with 500 ml of ethanol in a Soxhlet extractor. A rotary evaporator was used to remove the solvent, and the crude extract was stored at 4 °C for further assays.

Gas chromatography-mass spectrometry (GC–MS) analysis

A 0.01 g portion of the sample was dissolved in 10 mL of its extraction solvent, vortexed vigorously for 2 min, and then centrifuged at 3,000 rpm for 10 min. The clear supernatant was collected into a TSP micro vial for GC–MS analysis, and 1 μL of the sample was injected into the GC. GC–MS analysis of the extracts of Z. officinale and the leaves of A. occidentale was carried out using an Agilent GC (7890B) equipped with a 30 m × 250 μm × 0.25 μm column and coupled to an Agilent MSD (5977A). Helium was used as the carrier gas at a flow rate of 1 ml/min. The GC oven was initially held at 70 °C for 3 min, then ramped at 10 °C/min to 280 °C and held for 9 min. The equilibration time, MSD transfer line, MS source, and MS quadrupole were set at 0.5 min, 250 °C, 230 °C, and 150 °C, respectively. The chemical compounds in the samples were identified and characterized on the basis of GC retention time, and the mass spectra were computer-matched with those of standards available in the NIST mass spectral libraries. The composition of the sample constituents was expressed as a percentage of peak area.
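As a quick arithmetic check of the oven programme and sample preparation described above, the following sketch computes the total run time for a single linear ramp and the nominal sample concentration. The function names are illustrative and are not part of any instrument software.

```python
def oven_run_time_min(initial_hold_min, start_c, end_c, ramp_c_per_min, final_hold_min):
    """Total GC oven programme time (min) for one linear temperature ramp."""
    ramp_min = (end_c - start_c) / ramp_c_per_min
    return initial_hold_min + ramp_min + final_hold_min


def concentration_mg_per_ml(mass_g, volume_ml):
    """Nominal concentration of the dissolved sample in mg/mL."""
    return mass_g * 1000.0 / volume_ml


# Values taken from the method: 70 C held 3 min, 10 C/min to 280 C, held 9 min
run_time = oven_run_time_min(3, 70, 280, 10, 9)
# 0.01 g of extract dissolved in 10 mL of solvent
conc = concentration_mg_per_ml(0.01, 10)

print(f"Oven programme length: {run_time:.1f} min")
print(f"Nominal sample concentration: {conc:.1f} mg/mL")
```

With the stated settings the ramp alone takes 21 min, so the oven programme lasts 33 min per injection, and the injected solution is nominally 1 mg/mL.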

Preparation of crystal structure of the target protein

The crystal structure of the SARS-CoV-2 main protease in complex with 02J (5-methylisoxazole-3-carboxylic acid) and PEJ (composite ligand) (PDB code: 6LU7; resolution 2.16 Å) was retrieved from the Protein Data Bank (PDB) (Berman et al. 2000). The ligand bound to the crystal structure of 3CLpro was removed and the structure was cleaned. Missing atoms, residues, loops, and side chains were checked for and inserted. Incorrect chirality was identified, and disulfide bonds and steric clashes were checked and corrected. All water molecules (except the one near the substrate-binding site) and non-protein residues were removed, and the structure was optimized and energy-minimized using Chimera (Pettersen et al. 2004), Swiss-PdbViewer (Johansson et al. 2012), and the Chiron energy minimization and refinement tool (Ramachandran et al. 2011).
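The ligand and water removal step can be illustrated with a minimal sketch that filters PDB records by their fixed-column fields. This is not the procedure used in the study, which relied on Chimera, Swiss-PdbViewer, and Chiron; the helper names, the toy records, and the retained water number (401) are all hypothetical.

```python
def clean_pdb_lines(lines, keep_water_resseqs=frozenset()):
    """Keep protein ATOM records and selected waters; drop other HETATM records."""
    kept = []
    for line in lines:
        record = line[:6].strip()
        if record in ("ATOM", "TER", "END"):
            kept.append(line)
        elif record == "HETATM":
            res_name = line[17:20].strip()   # residue name, PDB columns 18-20
            res_seq = int(line[22:26])       # residue number, PDB columns 23-26
            if res_name == "HOH" and res_seq in keep_water_resseqs:
                kept.append(line)
    return kept


def pdb_line(record, serial, atom, res_name, chain, res_seq):
    """Build a truncated PDB record (no coordinates) with correct column layout."""
    return f"{record:<6}{serial:>5} {atom:<4} {res_name:>3} {chain}{res_seq:>4}"


# Toy records for illustration only; they are not taken from 6LU7
toy = [
    pdb_line("ATOM", 1, "N", "SER", "A", 1),
    pdb_line("HETATM", 2, "C1", "02J", "C", 1),    # bound ligand, removed
    pdb_line("HETATM", 3, "O", "HOH", "A", 401),   # water to retain (hypothetical)
    pdb_line("HETATM", 4, "O", "HOH", "A", 402),   # bulk water, removed
]
cleaned = clean_pdb_lines(toy, keep_water_resseqs={401})
print(len(cleaned), "records kept")
```

Filtering on fixed columns rather than whitespace-split fields avoids errors when adjacent PDB fields run together.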

Physicochemical analysis of the identified compounds

The compounds identified by GC–MS analysis were screened for physicochemical properties to determine the Pharmaceutical Active Ingredients (PAIs) using the Lipinski rule of five (molecular weight, logarithm of the partition coefficient (LogP), hydrogen bond donors (HBD), and hydrogen bond acceptors (HBA)) (Lipinski et al. 1997), the Veber rule (rotatable bonds and topological polar surface area (TPSA)) (Veber et al. 2002), and the Egan (Pharmacia) filter (LogP and TPSA) (Egan et al. 2000). The screening was performed with the DataWarrior program (Sander et al. 2015) and SwissADME (Daina et al. 2017). All compounds with desirable physicochemical properties were selected for further analysis.

Pan-assay interference structure (PAINS) analysis

The compounds with desirable physicochemical properties were screened for pan-assay interference compounds (PAINS) structural alerts to assess their potential toxicity. The flagged substructures are also called toxicophores, because they contain groups that affect biological processes by interfering with DNA or proteins, which can lead to serious outcomes such as carcinogenicity and hepatotoxicity (Baell and Holloway 2010). All compounds with zero PAINS structural alerts were selected for further analysis.

Pharmacokinetic analysis

The compounds with desirable physicochemical and PAINS properties were further filtered for pharmacokinetic properties, namely absorption, distribution, metabolism, excretion, and toxicity (ADMET), using the AdmetSAR tool (Cheng et al. 2012), the DataWarrior program (Sander et al. 2015), and SwissADME (Daina et al. 2017). The properties analyzed were Human Intestinal Absorption (HIA), Blood–Brain Barrier (BBB) penetration, Cytochrome P450 (CYP450 2D6) inhibition, mutagenicity, tumorigenicity, AMES toxicity, and reproductive effects. These properties are essential because they determine the exposure of the human body to the inhibitor, which in turn affects its pharmacological activity and performance.

Molecular docking analysis

Molecular docking was performed with AutoDock 4.2 (Morris et al. 1998) to ascertain the binding conformation of each protein–ligand complex and thereby estimate the binding energy between 3CLpro and the selected ligands. The previously bound ligands 02J and PEJ were also docked to 3CLpro, and their binding energies were compared with those of the selected ligands. The free binding energies were calculated using a Lamarckian genetic algorithm, and the root mean square deviation (RMSD) was analyzed. The 3CLpro was protonated with polar hydrogens and assigned fixed Kollman charges. The PDBQT file derived from 3CLpro contained information on partial charges, atom types, and torsional degrees of freedom. The ligand side chains and torsional bonds were kept flexible, while 3CLpro was kept rigid. All ligands were docked to the residues involved in catalytic activity, with the grid centred at x, y, and z coordinates of −13.539, 18.826, and 63.171, respectively. The grid box was set at 60 Å × 60 Å × 60 Å with a spacing of 0.375 Å. A total of 10 runs were carried out with a maximum of 27,000 generations, a maximum of 2,500,000 energy evaluations, and a population size of 150. The free binding energy (∆Gbind) was calculated from the van der Waals energy (∆Gvdw), the electrostatic energy (∆Gelect), the hydrogen bond and desolvation energy (∆Ghbond), the final total internal energy (∆Gconform), the torsional free energy (∆Gtor), and the unbound system energy (∆Gsolv).
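The way these terms combine can be sketched as follows. The sketch follows the convention that AutoDock 4.2 prints in its docking log, in which the intermolecular, internal, and torsional terms are added and the unbound-system energy is subtracted; the text above lists the terms without signs, so that sign is an assumption here. The function name and all numerical values are hypothetical.

```python
def estimated_binding_energy(vdw, hbond_desolv, electrostatic,
                             internal, torsional, unbound):
    """Combine AutoDock-style energy terms (kcal/mol) into a binding energy.

    intermolecular = vdw + hbond_desolv + electrostatic
    binding        = intermolecular + internal + torsional - unbound
    """
    intermolecular = vdw + hbond_desolv + electrostatic
    return intermolecular + internal + torsional - unbound


# Hypothetical component energies for one docked pose (kcal/mol)
dg_bind = estimated_binding_energy(
    vdw=-6.0,
    hbond_desolv=-2.5,
    electrostatic=-0.4,
    internal=-1.2,
    torsional=1.5,
    unbound=-1.2,
)
print(f"Estimated free energy of binding: {dg_bind:.2f} kcal/mol")
```

In this toy case the internal and unbound terms cancel, so the estimate reduces to the intermolecular energy plus the torsional penalty.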

Molecular dynamics simulations

The best docked receptor–ligand complexes were subjected to refinement and molecular dynamics simulation (MDS) using CHARMM (Brooks et al. 2009) and VMD (Humphrey et al. 1996), respectively. The complexes, supplied as .pdb files, were converted into .psf files, and the trajectory files retrieved were then used to minimize, solvate, neutralize, and refine the complex structures. Generalized Born Molecular Mechanics (GBMM) was deployed to approximate the results obtained in an explicit solvent. NVT dynamics, which holds temperature and volume constant, was used. The Nosé–Hoover thermostat was set to 300 K, and the entire simulation was executed in 1000 steps for 50 ns. Topology and force field parameters were assigned from the CHARMM27 protein-lipid parameter set (MacKerell et al. 1998) for the proteins and from the CHARMM General Force Field (CGenFF) parameter set for the small-molecule ligands (Vanommeslaeghe et al. 2010). After refinement, the best simulated and refined complexes were subjected to interaction analysis with PLIP (Salentin et al. 2015) to check whether the protein–ligand interactions differed before simulation and after refinement.

Results and discussions

GC–MS analysis

GC–MS analysis of the methanolic extracts of Z. officinale and the leaves of A. occidentale was carried out to determine the phytochemical constituents of the plant materials. The composition of the phytochemicals, in terms of compound name, molecular formula, peak, retention time (RT), and area, is presented in Tables 1 and 2. The GC–MS chromatogram of the methanolic extract of Z. officinale indicates the presence of eighteen compounds across major and minor peaks (Fig. 1). Similarly, the GC–MS chromatogram of the methanolic extract of the leaves of A. occidentale showed eleven (11) peaks corresponding to eleven compounds (Fig. 2). The identified compounds were searched in the PubChem database, and their three-dimensional (3D) structures were downloaded in SDF format. These structures were converted to PDB format using PyMol (1.7.4.5 Edu) (DeLano 2002). A total of twenty-nine compounds were obtained from the extracts of both plants and used in this study. All the compounds are identified by their PubChem IDs in Tables 1 and 2.

Table 1 Compounds obtained from the GC–MS analysis of Zingiber officinale
Table 2 Compounds obtained from the GC–MS analysis of the leaves of Anacardium occidentale
Fig. 1

GC–MS chromatogram of the methanolic extract of Zingiber officinale

Fig. 2

GC–MS chromatogram of the methanolic extract of Anacardium occidentale

Physicochemical and Pan-assay interference structure analyses

The physicochemical properties were analyzed to assess the efficient metabolism, therapeutic safety, and precision of the identified compounds (Saleh-e-In et al. 2019) using molecular weight, hydrogen bond donors, hydrogen bond acceptors, logarithm of the partition coefficient (LogP), molar refractivity, number of rotatable bonds, topological polar surface area (TPSA), and PAINS alerts (Table 3). The analyses were based on the drug-likeness rules used in drug design and discovery. The Lipinski rule of five states that, for a compound to have good membrane permeability, suitable oral bioavailability, and efficient gastrointestinal absorption in humans, it must have a molecular weight ≤ 500 Da, LogP ≤ 5, HBD ≤ 5, and HBA ≤ 10 (Lipinski et al. 1997). The Egan (Pharmacia) rule suggests that a therapeutic compound with LogP ≤ 5.88 and TPSA ≤ 131 Å² will have high oral bioavailability (Egan et al. 2000). Similarly, the Veber rule proposes that a molecule with ≤ 10 rotatable bonds and TPSA ≤ 140 Å² will have better oral bioavailability (Veber et al. 2002).
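The three filters can be expressed as simple threshold checks. The sketch below applies the cut-offs exactly as stated above; the `Descriptors` class, the function names, and the example descriptor values are hypothetical and do not correspond to any compound in Table 3.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    mol_weight: float       # Da
    logp: float             # logarithm of the partition coefficient
    hbd: int                # hydrogen bond donors
    hba: int                # hydrogen bond acceptors
    rotatable_bonds: int
    tpsa: float             # topological polar surface area


def passes_lipinski(d):
    return d.mol_weight <= 500 and d.logp <= 5 and d.hbd <= 5 and d.hba <= 10


def passes_veber(d):
    return d.rotatable_bonds <= 10 and d.tpsa <= 140


def passes_egan(d):
    return d.logp <= 5.88 and d.tpsa <= 131


def is_drug_like(d):
    """True only if all three filters are satisfied."""
    return passes_lipinski(d) and passes_veber(d) and passes_egan(d)


# Hypothetical descriptor values for illustration
example = Descriptors(mol_weight=276.4, logp=3.2, hbd=1, hba=3,
                      rotatable_bonds=9, tpsa=46.5)
lipophilic = Descriptors(mol_weight=276.4, logp=5.6, hbd=1, hba=3,
                         rotatable_bonds=9, tpsa=46.5)

print("example drug-like:", is_drug_like(example))
print("lipophilic passes Lipinski:", passes_lipinski(lipophilic))
print("lipophilic passes Egan:", passes_egan(lipophilic))
```

Because the Egan LogP cut-off (5.88) is looser than the Lipinski one (5), a compound with LogP between the two values fails only the Lipinski check in this sketch.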

Table 3 Physicochemical and Pan-assay interference structure analyses of the compounds obtained from GC-MS

The results of the analyses indicated that all the compounds obeyed the Lipinski rule of five and the Egan rule except CID_10503282, CID_56598867, CID_99615, CID_11697907, and CID_21121725, whose LogP values are greater than five; nevertheless, these compounds possess drug-like properties with good membrane permeability and suitable oral bioavailability (Table 3). Similarly, all the compounds satisfied the Veber rule, which indicates their drug-like potential. PAINS analysis flags substructures that may make a molecule toxic; all the compounds had zero PAINS structural alerts, which suggests that they are non-toxic (Table 3). Therefore, all the identified phytochemical compounds analyzed in this study possessed drug-like properties and were used for further study.

Pharmacokinetic analysis

Pharmacokinetic properties such as absorption, distribution, metabolism, excretion, and toxicity are principal considerations in drug design and discovery because they guide the initial evaluation of in vivo effectiveness and drug safety (Pricopie et al. 2019). These properties strongly affect the degree of biological activity of an active compound toward its target protein as well as its side effects (Chandrasekaran et al. 2018). They also help to determine whether the active molecule or ligand has desirable properties, such as suitability for oral administration and good absorption, so that late-stage failure can be avoided (Pricopie et al. 2019). In this study, the AdmetSAR tool (Cheng et al. 2012) was used to predict the pharmacokinetic properties of the selected compounds. The predicted properties were Human Intestinal Absorption (HIA), Blood–Brain Barrier (BBB) penetration, Cytochrome P450 (CYP450 2D6) inhibition, mutagenicity, tumorigenicity, AMES toxicity, and reproductive effects. All the compounds were predicted to cross the blood–brain barrier, which favors their druggability. Except for CID_21121725, all the selected compounds were predicted to be absorbed in the human intestine (Table 4). Similarly, except for CID_612550, all the compounds were non-inhibitors of Cytochrome P450, which makes them less susceptible to side effects mediated by drug–drug interactions. Regarding toxicity, one compound (CID_620007) was predicted to be Ames toxic, while four compounds (CID_6184, CID_454, CID_612550, and CID_6058) were predicted to be mutagenic to a high or low degree. In addition, three compounds (CID_3083834, CID_612550, and CID_622163) were predicted to be tumorigenic, while six compounds (CID_6184, CID_454, CID_312134, CID_612550, CID_6058, and CID_21121725) were predicted to have reproductive effects. All the compounds with undesirable pharmacokinetic properties were therefore eliminated from further consideration.
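The elimination step amounts to taking the union of the compounds flagged under each criterion. The sketch below uses only the compound IDs named in the paragraph above; the dictionary keys and function names are illustrative, and Table 4 remains the authoritative record of the predictions.

```python
# Compounds flagged in the text, grouped by the criterion they failed
flags = {
    "poor_intestinal_absorption": {"CID_21121725"},
    "cyp450_2d6_inhibitor": {"CID_612550"},
    "ames_toxic": {"CID_620007"},
    "mutagenic": {"CID_6184", "CID_454", "CID_612550", "CID_6058"},
    "tumorigenic": {"CID_3083834", "CID_612550", "CID_622163"},
    "reproductive_effect": {"CID_6184", "CID_454", "CID_312134",
                            "CID_612550", "CID_6058", "CID_21121725"},
}


def flagged_compounds(flag_table):
    """Union of all compounds that failed at least one criterion."""
    return set().union(*flag_table.values())


def reasons_for(cid, flag_table):
    """Sorted list of criteria under which a compound was flagged."""
    return sorted(name for name, cids in flag_table.items() if cid in cids)


excluded = flagged_compounds(flags)
print(len(excluded), "compounds named in the text carry at least one flag")
for cid in sorted(excluded):
    print(cid, "->", ", ".join(reasons_for(cid, flags)))
```

Using sets makes compounds that fail several criteria, such as CID_612550, count only once toward the exclusion list.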

Table 4 Pharmacokinetic analysis of the phytocompounds identified from Zingiber officinale and leaves of Anacardium occidentale

Molecular docking analysis

A total of twenty-nine (29) phytocompounds were obtained from GC–MS analysis of the two plant extracts. These compounds were screened for physicochemical and pharmacokinetic properties to determine their drug-likeness. Of the 29 compounds, only nineteen (19) possessed drug-like properties with efficient oral bioavailability and low toxicity. These compounds were then used for molecular docking analysis to determine their binding energies with 3CLpro. The previously bound ligands (02J and PEJ) were also docked to 3CLpro, and their binding energies were compared with those of the selected ligands. The free binding energies of the compounds ranged between −5.08 and −10.24 kcal/mol, better than the binding energies of 02J (−4.10 kcal/mol) and PEJ (−5.07 kcal/mol) (Fig. 3). CID_99615 (Fig. 4m) had the lowest binding energy, −10.24 kcal/mol; it fitted into the active cavity of 3CLpro and was stabilized by two hydrogen bonds (Thr26 and Glu166) and nine van der Waals interactions (Thr24, Thr26, Leu27, Thr54, Asn142, Gly143, Asp187, Arg188, and Gln189). In addition, Met49, Cys145, and Met165 (Pi-alkyl interactions) and His41 (Pi-sigma interaction) were found to form hydrophobic interactions (Table 5).
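To give a sense of scale for these energies, a docking free energy can be converted into an estimated inhibition constant through the standard thermodynamic relation Ki = exp(∆G / RT). The sketch below is not a result reported in this study: it assumes T = 298.15 K, takes R = 1.987 × 10⁻³ kcal/(mol·K), and treats the docking energies as rough estimates. The function name is illustrative.

```python
import math

GAS_CONSTANT_KCAL = 1.987e-3  # kcal/(mol*K)


def inhibition_constant_molar(delta_g_kcal_per_mol, temperature_k=298.15):
    """Estimated inhibition constant (mol/L) from a binding free energy."""
    return math.exp(delta_g_kcal_per_mol / (GAS_CONSTANT_KCAL * temperature_k))


# Binding energies quoted in the text (kcal/mol)
for label, dg in [("CID_99615", -10.24), ("02J", -4.10)]:
    ki = inhibition_constant_molar(dg)
    print(f"{label}: dG = {dg:.2f} kcal/mol, estimated Ki = {ki:.2e} M")
```

Because the relation is exponential, a difference of about 6 kcal/mol between two ligands corresponds to a difference of several orders of magnitude in the estimated Ki.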

Fig. 3

The free binding energies of the selected phytocompounds with 3CLpro

Fig. 4

Hydrogen bonds, van der Waals, and hydrophobic interactions between 3CLpro and the selected ligands a CID_621914, b CID_1201518, c CID_10503282, d CID_10099, e CID_520909, f CID_6431015, g CID_160799, h CID_1197255, i CID_586455, j CID_31211, k CID_44631539, l CID_56598867, m CID_99615, n CID_3981360, o CID_620012, p CID_11697907, q CID_49865032, r CID_9910474, s CID_550857, t 02J, u PEJ

Table 5 Free binding energies, hydrogen bonds, van der Waals, and hydrophobic interactions of the selected ligands

CID_3981360 (Fig. 4n) forms two hydrogen bonds with the catalytic site of 3CLpro, while CID_9910474, CID_11697907, CID_10503282, and CID_620012 form 5, 2, 2, and 2 hydrogen bonds, respectively, with the substrate-binding site of 3CLpro. These interactions are shown in Table 5. CID_3981360 had a free binding energy of −9.79 kcal/mol and formed two hydrogen bonds, with Cys145 and Glu166, while Tyr54, Phe140, Asn142, His163, His164, Met165, His172, Asp187, Arg188, and Gln189 formed van der Waals interactions and Met49 and His41 formed hydrophobic interactions with the ligand. The interaction between Cys145 and CID_3981360 is particularly important because this residue is part of the catalytic dyad of the protein; binding of the ligand to it will therefore impede the catalytic activity of 3CLpro.

CID_9910474 had a binding energy of −9.14 kcal/mol and formed five hydrogen bonds, with Phe140, Leu141, Ser144, Cys145, and Glu166, as well as hydrophobic interactions with Thr25, Thr26, His41, Cys44, Thr45, Ser46, Met49, Asn142, Gly143, His164, Met165, and His172. The 3CLpro–CID_9910474 complex was further stabilized by two residues (His163 and Cys145) involved in hydrophobic interactions (Fig. 4r).

Of the remaining ligands, two compounds (CID_620012 and CID_9910474) formed hydrogen bonds with Cys145. Another twelve compounds (CID_10503282, CID_10099, CID_520909, CID_6431015, CID_1197255, CID_586455, CID_44631539, CID_56598867, CID_11697907, CID_49865032, CID_9910474, and CID_550857) showed hydrophobic (Pi-sulfur and Pi-alkyl) interactions with Cys145, His41, or both (Fig. 4a–s). Therefore, all the identified ligands have the potential to inhibit 3CLpro, as described above.

Molecular dynamic simulations analysis

Based on the molecular docking analysis, six compounds with good binding energies (CID_99615 = −10.24 kcal/mol, CID_3981360 = −9.75 kcal/mol, CID_9910474 = −9.14 kcal/mol, CID_11697907 = −9.10 kcal/mol, CID_10503282 = −9.09 kcal/mol, and CID_620012 = −8.53 kcal/mol) were selected and subjected to MD simulation to determine the stability of the protein–ligand complexes. After simulation and refinement, all the complexes had been refined as far as possible, the overall energy of each complex had stabilized, and all the structures had good RMSD scores. Table 6 summarizes the best refined complexes, which can be evaluated further clinically as candidates for treatment against the novel coronavirus (SARS-CoV-2). From our analysis, the most stable complexes are CID_9910474 and CID_10503282, as their overall energy is better than that of the other complexes and their RMSD score is 0.0 after refinement. The superimpositions of the initial structures of the CID_9910474 and CID_10503282 complexes on their simulated structures are shown in Figs. 5 and 6, respectively. The refined drug candidates point to the presence of some important residues that affect the binding of the ligands in the complexes. The superimposition and comparison of the complexes before and after refinement show binding mechanisms with the lowest free energy (LFE) barriers in the same direction.
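For reference, the RMSD between an initial and a simulated structure can be computed as sketched below. The sketch assumes that both coordinate sets list the same atoms in the same order and have already been superimposed; it is not the routine used by CHARMM or VMD, and the coordinates are toy values.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equal-length lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy coordinates in angstroms (hypothetical, not taken from the complexes)
initial = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
shifted = [(x + 1.0, y, z) for x, y, z in initial]

print("identical structures:", rmsd(initial, initial))
print("rigid 1 A shift without re-superposition:", rmsd(initial, shifted))
```

An RMSD of exactly 0.0 means the two coordinate sets coincide atom for atom, which is why the value depends on whether the structures were superimposed before comparison.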

Fig. 5

The superimposition of initial structure of the CID_9910474 complex (blue color) and simulated structure (green color) with the drug compounds

Fig. 6

The superimposition of initial structure of the CID_10503282 complex (blue color) and simulated structure (green color) with the drug compounds. The drug ligand

Protein–ligand interaction analysis was carried out with PLIP for the two most stable complexes to check whether the interactions between protein and ligand changed after refinement. Before molecular dynamics simulation, the CID_10503282 complex had three hydrogen bonds, with the ligand interacting with residues Thr24, Ser144, and Gly143 in chain A of the protein; after the simulation, only one hydrogen bond remained, at residue Ser144 in chain A. Similarly, the CID_9910474 complex initially had a single hydrophobic interaction at residue 166 (Glu) and five hydrogen bonds at residues 140, 143, 144, 145, and 164, all in chain A; after simulation this changed drastically to four hydrophobic interactions and five hydrogen bonds with different residues. Table 7 summarizes the interactions between the protein receptor (3CLpro) and the two ligands (CID_10503282 and CID_9910474) before and after molecular dynamics simulation, and Figure 7 shows the interactions formed. This drastic change highlights the key residues, namely Ser144, required for hydrogen bonding with CID_10503282, and Cys145, essential for hydrogen bonding with CID_9910474, which govern the binding and unbinding of the ligands CID_10503282 and CID_9910474, respectively, in the target receptor 3CLpro.

Table 6 Scores of the simulated complexes
Fig. 7

Interactions formed between the target receptor and the two best phytochemicals

Table 7 Interaction analysis—before and after molecular dynamics simulation (MDS)

Conclusion

A total of twenty-nine compounds were obtained from GC–MS analysis of the extracts of Z. officinale and the leaves of A. occidentale. These compounds were filtered for physicochemical and pharmacokinetic properties to determine their drug-likeness. Of the 29 compounds, only nineteen had drug-like properties with effective oral bioavailability and low toxicity. These compounds were then used for molecular docking analysis to determine their binding energies with 3CLpro. The previously bound ligands (02J and PEJ) were also docked to 3CLpro, and their binding energies were compared with those of the selected ligands. The free binding energies of the compounds ranged between −5.08 and −10.24 kcal/mol, lower (more favorable) than the binding energies of 02J (−4.10 kcal/mol) and PEJ (−5.07 kcal/mol). Six compounds with good binding energies (CID_99615 = −10.24 kcal/mol, CID_3981360 = −9.75 kcal/mol, CID_9910474 = −9.14 kcal/mol, CID_11697907 = −9.10 kcal/mol, CID_10503282 = −9.09 kcal/mol, and CID_620012 = −8.53 kcal/mol) were selected and subjected to MD simulation to ascertain the stability of the protein–ligand complexes. The phytochemicals CID_9910474 and CID_10503282 were found to be highly stable and robustly bound to the target receptor 3CLpro, as indicated by their RMSD scores and the hydrogen bonds formed between the phytochemicals and the amino acid residues of 3CLpro. Molecular dynamics simulation (MDS) also suggests that the overall energy of these two complexes is much better than that of the others, indicating that there is no systematic drift in the values and thus supporting the affinity of these two complexes. Furthermore, the interaction analysis highlights the potential role of residues 144A (Ser) and 145A (Cys) in the binding and unbinding of drug candidates in the target receptor 3CLpro, which is a reasonable finding in the search for novel drug candidates against the novel coronavirus (SARS-CoV-2). Therefore, the stable complexes CID_9910474 and CID_10503282 can be further validated clinically against novel coronavirus (SARS-CoV-2) target receptors.