Introduction

A novel coronavirus termed as severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is responsible for the respiratory disease in China. The SARS-CoV-2 outbreak was declared as an international concern of public health emergency by the World Health Organization [1,2,3]. Most of the infected patients have mild symptoms such as fever and cough, but it can be fatal leading to pneumonia and acute respiratory failure, particularly in older males with comorbidities [4]. Currently, any specific therapeutic alternatives for the infection is not present, and the treatment depends on the symptom along with repurposing of antiviral drugs such as ritonavir along with antibiotics for treatment of secondary infections [1] The obstruction of proteolytic processing of polyproteins by inhibiting necessary viral proteases has been a fruitful strategy in the treatment of human immunodeficiency virus (HIV) and hepatitis C [5, 6]. This proves that treatment of viral infections by inhibiting the protease will be a successful approach, since the main protease of SARS-CoV-2 is assumed to play an essential role in viral replication, therefore considered as a promising target [7, 8]. The main protease is composed of 306 amino acids with enzymatic activity for replication in coronavirus. It also plays a significant role in processing of the polypeptide into functional proteins [9]. The SARS-CoV-2 main protease crystal structure was recently solved it is a homodimer (designated protomer A and B). Each protomer is composed of three domains: domain I (residues 8–101) and domain II (residues 102–184) comprises an antiparallel β-barrel structure, while domain III (residues 201–303) has five α-helices arranged into a largely antiparallel globular cluster, and it is connected to domain II by a long loop region (residues 185–200). The substrate-binding site in SARS-CoV-2 Mpro has a Cys-His catalytic dyad and is located in a cleft between domain I and domain II [10], which enables the designing of inhibitory compounds. SARS-CoV-2 and SARS-CoV-1 proteases’ sequence similarity is 96.1% [7, 11], so SARS-CoV-2 might be treated with the drugs developed against SARS-CoV-1. [12]. But these compounds are not the approved medicine [13]. Moreover, the differences in single amino acids might affect its effectiveness for the novel virus [14]. Hence, SARS-CoV-2 main protease inhibitor development is an urgent need. Crystal structure analysis proposes that the residues in the active site of SARS-CoV-2 Mpro are Thr25, Thr26, Leu27, His41, Ser46, Met49, Tyr54, Phe140, Leu141, Asn142, Gly143, Cys145, His163, Met165, Glu166, Leu167, Pro168, Phe185, Asp187, Gln189, Thr190, Ala191 and Gln192. These residues play an important role in the development of effective inhibitors against SARS-CoV-2 main protease [15]. Various studies have reported different inhibitors like ligand N3 [10], compounds 11a and 11b [16] and taxifolin [17]. However, all these compounds have to pass several phases before clinical use. So FDA-approved drugs are great choice. In this study we have screened FDA-approved drug library against the main protease of SARS-CoV-2 and further validated the selected compounds by computational method.

Result and discussion

Virtual screening against Covid-19 protease

The virtual screening is one of the most frequently used techniques which helps to minimize the time and cost [18, 19]. Screening of the potentially active selected drug against biological targets was carried out using GOLD software. Among 6054 compounds screened from superdrug2 database, six have been selected for further validation. These six molecules are selected on the basis of gold score and number of hydrogen bonds. The highest GOLD fitness score among the selected compound was found 79.7 for amprenavir, whereas other compounds have GOLD fitness of 70.6 (cefoperazone), 64.52 (Riboflavin), 64.25 (Diosmin), 63.57 (Nadide) and 61.39 (Troxerutin) as shown in Table 1.

Table 1 Ligand structure, gold score, AutoDock Vina binding energy and amino acid involved in various type of interaction

Binding modes generated by molecular docking

Binding modes of protein and ligands were studied by measuring the binding energy obtained from docking tool AutoDock Vina and inspection of physical and chemical bonding between protein and drug molecules. Docking scores of all proteins are shown in Table 1. Different types of interactions namely hydrophobic, hydrogen bonding and other interactions were monitored during protein–ligand complexes which are listed in Table 1. The hydrogen bonds with taxifolin were formed by residues Glu166, Thr190, Tyr54, Asp187 and Pro168. Notably, in all possible inhibitors (except cefoperazone), Cys145 and oxyanion hole residues Asn142 and Gly143 were playing an important role in stabilising complex by forming hydrogen bond. Gly143 stabilised the complex of protease-amprenavir, protease- riboflavin and protease-troxerutin, while Asn142 played role in stability of protease-nadide complex, protease-diosmin complex and protease-riboflavin. However, in protease-cefoperazone complex, Thr45, Ser46 and Glu166 were found to have a crucial role in stability. The protease-nadide complex is additionally stabilised by Leu141 and Glu189. In addition, residues Thr26, Gln189 and Met165 are playing significant role in stability of protease-diosmin complex. Furthermore, in protease- riboflavin, His163, His41, Met49 and Met165 stabilise the complex. Moreover, His163, Leu141, Thr190, His41 and Met165 have crucial role in stability of protease-troxerutin complex.

Binding modes refined by MD simulation

MD simulation is extensively applied to monitor the stability of protein-drug/protein/DNA complexes, to restore the binding energy of protein complexes and to study the conformational changes occurring during ligand binding and unbinding, protein folding and protein dynamics. Here MD simulation was executed for all protein ligand complexes to refine and check the binding modes obtained by molecular docking through AutoDock Vina. The 2D structure of all the ligands are shown in Fig. 1.

Key determinants for drug specificity are hydrogen bonds [20], and all inhibitors have an optimal number of average hydrogen bonds as taxifolin, which proved them as good inhibitor molecules. Higher average number of hydrogen bonds in diosmin, nadide and troxerutin were observed in comparison to taxifolin and chloroquine (Fig. 2).

Fig. 1.
figure 1

2D diagram of ligand structure: (a) troxerutin, (b) amprenavir, (c) cefperazone, (d) chloroquine, (e) diosmin, (f) nadide, (g) riboflavin, and (h) taxifolin

Fig.2
figure 2figure 2figure 2

Binding site active site amino acid involved in binding and hydrogen bond analysis of protein–ligand complex, (a) protease-taxifolin complex, (b) protease-riboflavin complex, (c) protease-amprenavir complex, (d) protease- nadide complex, (e) protease-troxerutin complex, (f) protease-diosmin complex, (g) protease-cefoperazone complex and (h) protease-chloroquine complex

The protein structure conformational stability during the course of the simulation was determined by the RMSD. The predicted docked complexes RMSD analysis showed stable behaviour, confirming the strong binding affinity and the interaction between docked complexes. All the inhibitor-bound complexes attained equilibrium during 100 ns trajectory which was evident with RMSD plot. RMSD graph was stable for cefoperazone, diosmin, riboflavin, taxifolin, chloroquine and troxerutin, while amprenavir and nadide have shown slight deviation (Fig. 3). A minimum RMSD of backbone atoms strongly suggests stable dynamic behaviour of docked complex. It was clear that trajectories of all docked complexes remained stable with proper conformations throughout all the three replica of 100 ns, indicating that ligands were stable at the active site of protein during interactions (Fig. 3 and S1).

Fig. 3
figure 3

Backbone root mean square deviation (RMSD) values in nanometer (Y-axis) along with the time frame in nanosecond (X-axis)

Compactness and shape of protease inhibitor complexes were observed by measuring the Rg value. If protein unfolds, Rg change with time, while a stably folded protein maintains a steady value of Rg. Nadide seems to form the most stable complex than the other inhibitors with the lowest Rg value. All other inhibitors have similar Rg as that of taxifolin and chloroquine (Fig. 4). Rg values for all protein–ligand are almost the same, meaning no change in the compactness and stability was maintained during MD simulation of all three 100 ns replica as shown in the Fig. (4 and S2).

Fig. 4
figure 4

Radius of gyration of protease inhibitor complexes

SASA (solvent accessible surface area) values of all complexes are similar to taxifolin and chloroquine complex (Fig. 5). The lower values of SASA specify the contracted nature of protein–ligand complex. We observed that the SASA plot has shown stability in docked complexes throughout simulation, which is significant for drug designing (Fig. 5 and S3).

Fig. 5
figure 5

SASA values (Y-axis) along with the time frame (X-axis)

The mobility of different residues of a protein in presence of drug was examined by measuring the RMSF values. RMSF results indicate that fluctuations were observed in amprenavir, cefoperazone, diosmin, nadide, riboflavin and troxerutin compared with taxifolin and chloroquine. This showed that there are some conformational changes in protease to accommodate these inhibitors for creating a stable complex (Fig. 6).

Fig. 6
figure 6

RMSF plot of protease inhibitor complex with respect to amino acids residues

The molecular dynamic simulation data supported the studies of molecular docking. It has provided important understanding of the interaction at the molecular level of all the inhibitor against main protease of COVID-19 that might be useful in drug development.

On the other hand, all the screened compounds show good binding energy (Table 1).

Covalent docking

Approximately 30% of the commercial drugs which target enzymes are covalent inhibitors [21]. The pharmacological studies reported benefits of covalent inhibitors that it can achieve longer drug residence times than non-covalent inhibitors [22] and has better target selectivity [23, 24]. RMSD lower than 2 Å is generally considered as a principle for effective covalent docking [25].

Analysis of covalent interactions between inhibitors and main protease of coronavirus was done by using discovery studio software. There is a rule for effective covalent docking that the RMSD should be less than 2 Å [25]. The RMSD value of diosmin-coronavirus main protease complex were found below 2.0 Å, while RMSD values of other inhibitors were found to be more than 2.0 Å (Table 2). The results suggested that diosmin could form covalent bond with the Glu 166 of coronavirus main protease and may be considered as a good inhibitor (Fig. 7).

Table 2 RMSD value of covalent docking
Fig. 7
figure 7

Covalent interaction of diosmin with active site residue Glu 166 of SARS-CoV-2 main protease

Method

  1. 1.

    Preparation of protein

    The SARS-CoV-2 main protease structures (PDB ID:6LU7) was retrieved from RCSB Protein Data Bank (www.rcsb.org). The co-crystallized ligands and other heteroatoms were removed. All the water molecules were removed from the original crystal structure before the protein preparation process; to analyse the structure and the bond order assigned, hydrogen atoms were added to the enzyme using Discovery Studio 2.5 package [26].

  2. 2.

    Preparation of ligand

    Ligand database was downloaded from superdrug2 [27, 28] containing 6054 FDA-approved molecules. Hydrogen atoms were added in all the molecules. All the molecules were prepared by using Discovery Studio 2.5 package [26]. MM2 energy minimization tool of Chem3D17 7.1 software was used for minimization.

  3. 3.

    Selection of positive control

    We have taken positive control to taxifolin (a natural compound) which is reported to have good in silico result against SARS-CoV-2 [29] and chloroquine, which is the main protease inhibitor of SARS-CoV-2[30].

  4. 4.

    High throughput virtual screening and molecular docking

    High throughput virtual screening was performed by using GOLD (Genetic Optimization for Ligand Docking) 5.0 software [31]. Docking annealing parameters for van der Waals and hydrogen bonding were set to 5.0 and 2.5, respectively. The parameters used for genetic algorithm were considered as population size 100, selection pressure 1.2, number of operations 1,00,000, number of islands five, niche size 2, migrate 10, mutate 100 and crossover 100. The selection of the compounds was based on the GOLD fitness score, favourable binding and molecular interactions with the active site residue. Further confirmation of the results was done by using AutoDock Vina software [32]. The framework for screening of the molecules was GOLD fitness score and binding energy from AutoDock Vina. Prior performing docking in AutoDock Vina, both receptors and ligands were prepared in AutoDock tools. Polar hydrogens were added to the receptor molecules, whereas non-polar hydrogens were merged. The binding site of the ligand was obtained from the available protein-ligand complex in PDB which was used as a reference to set the dimensions of grid box in all receptor molecules.

  5. 5.

    MD simulation and binding energy calculation

    Molecular dynamics (MD) simulations for protein (one protomer)-ligand complex were performed by GROMACS (Groningen Machine for Chemical Simulation) 5.0 suite [33, 34] in combination with GROMOS96 43a1 force field [35, 36]. The ligand topologies were generated using the PRODRG webserver [37, 38]. Each docked complex was solvated with the extended simple point charge (SPC/E) water model in the cubic box. Each system was neutralized by adding counter ions which were followed by energy minimization. The equilibration steps, that is, NVT (isothermal-isochoric) then NPT (isothermal-isobaric) equilibration, were performed for 100 ps time [39]. Finally, three replicas of 100ns production run were carried out. Root mean square deviation (RMSD), root mean square fluctuation (RMSF), radius of gyration (Rg), and hydrogen bond (H-bond) were calculated using gmx rms, gmx rmsf, gmx gyrate and gmx hbond modules of GROMACS utility. All 2D and 3D figures of proteins and ligands for visualisation were rendered in Discovery Studio Visualizer (Dassault Systems Biovia Corp).

    Binding-free energy calculations for all the docked complex trajectories were performed by Molecular Mechanics/Poisson-Boltzmann Surface Area (MM/PBSA) method of g_mmpbsa module [40, 41].

  6. 6.

    Covalent docking

Protein and ligand files were set up with the link atom identified and appear in both the protein and ligand input files. The complex PDB file and Mol2 file, which present receptor and ligand, respectively, were prepared with default parameters. Then, the link atom in the ligand was forced to attach to the link atom in the protein. The possible covalent interactions between the inhibitors and the protease were determined by performing covalent docking. It was done by Genetic Optimization for Ligand Docking (GOLD) 5.0 version [31]. RMSD value was calculated and result was analysed by the use of Discovery Studio 2.5 software [26].

Conclusion

Amprenavir, cefoperazone, riboflavin, diosmin, nadide and troxerutin approved for human therapeutic usage by FDA, found to act as COVID-19 main protease inhibitors, representing potential treatment option. All these drugs bind to the SARS-CoV-2 main protease conserved residues of substrate-binding pocket and form a remarkable number of non-covalent interactions. We have identified diosmin as an inhibitor which is binding covalently to the COVID-19 main protease, inhibiting the infection pathway of COVID-19.

In conclusion the predicted drugs can be used in current clinical trials to check effectiveness of molecules against COVID-19. The results of this study confirms initial reports however additional experiments are necessary to investigate the efficacy of these inhibitors by in vitro and in vivo studies followed by clinical trials against COVID-19 patients. But publishing these molecules in public domain may help community to fight with COVID-19 if any potential investigator proceeds for their trial. We recommend that these drug candidates be experimentally tested and used as a starting point for further design of a high-efficacy drug candidate.