Background

One of the major challenges facing medicinal chemists is the discovery of epidermal growth factor receptor (EGFR) inhibitors for treating lung cancer, especially non-small cell lung cancer (NSCLC) with EGFR tyrosine kinase mutations [1]. Targeting the EGFR tyrosine kinase to control NSCLC has become an urgent therapeutic requirement because mutations give rise to drug resistance [2].

Lung cancer is one of the world’s leading cancer problems. It causes a large number of deaths every year, estimated at around one third of all cancer deaths. Non-small cell lung cancer (NSCLC) is the main subset, accounting for about 85% of lung cancer cases [3]. Overexpression of the epidermal growth factor receptor kinase has been identified as a common cause of NSCLC. Among patients with NSCLC, the reported figure is approximately 10–15% in Caucasian populations and approximately 30–40% in Asian populations [3].

NSCLC therapeutic agents show a high response rate in patients whose tumors carry EGFR modifications. These agents, the EGFR inhibitors, are classified into reversible inhibitors (first generation) and irreversible inhibitors (second and third generation). Unfortunately, the period of efficacy of the first-generation EGFR inhibitors (gefitinib and erlotinib) is shortened by drug resistance arising from the secondary mutation T790M [4]. Second-generation irreversible EGFR inhibitors, such as afatinib and canertinib, were subsequently developed to treat NSCLC carrying the EGFR T790M mutation [5]. However, because of severe side effects such as skin rash and diarrhea, the second-generation inhibitors did not show a significant advantage over the first-generation reversible inhibitors. Their activity against wild-type EGFR is believed to limit their usefulness in patients with the T790M mutation [1, 6,7,8].

To address this unmet clinical need, several third-generation irreversible EGFR inhibitors, such as WZ4002, rociletinib, olmutinib, and osimertinib, were designed to inhibit the T790M resistance mutation while sparing wild-type EGFR [1, 9,10,11,12].

In structure-based design, molecular docking is used to screen a library of compounds and identify those with higher affinities toward the target protein [13]. Prediction of ADME and drug-likeness properties also plays a vital role, because it establishes the pharmacokinetic profiles of drug-like compounds at an early stage of drug development [14].

The purpose of this work is to apply structure-based design to a set of fifty quinazoline derivatives (EGFR inhibitors) previously reported in the literature, to design new quinazoline derivatives (new EGFR inhibitors) that may have better binding affinities than the reported compounds, and to evaluate their pharmacokinetic properties.

Method

Computational environment and tools

This computational work was carried out on a Dell laptop with the following specifications: Intel® Core™ i7 Dual CPU, M330 @ 2.75 GHz, and 8 GB of RAM. The following software was used: PyRx virtual screening software, UCSF Chimera, Discovery Studio, and the online web tool SwissADME.

Compounds under investigation

A set of fifty quinazoline derivatives reported as EGFR inhibitors by Zhang et al. was collected and used in this work [15]. All of the compounds were synthesized under the same conditions. The 2D structures of the compounds were drawn with ChemDraw software and are presented in Supplementary Table 1 [16, 17].

Ligand data preparation

Ligand preparation is an important step in a molecular docking study. In this work, the ligands were prepared by optimizing the geometries of all the compounds under investigation at the B3LYP/6-311G* level of theory with the density functional theory (DFT) method, using Spartan 14 software (Wavefunction) [18]. The optimized conformations were then saved in Protein Data Bank (PDB) file format for the next step. A prepared 3D conformation of one of the EGFR inhibitors under investigation is shown in Fig. 1.

Fig. 1 A 3D conformation of a prepared EGFR inhibitor

EGFR enzyme preparation

The crystal structure of the epidermal growth factor receptor tyrosine kinase covalently bound to WZ4002 (PDB entry: 3IKA) was retrieved from the RCSB Protein Data Bank. The enzyme was then prepared for the molecular docking simulation using Discovery Studio Visualizer. During preparation, the co-crystallized ligand (WZ4002) and the water molecules present in the structure were deleted. Before that, polar hydrogens were added to the crystal structure. The 3D structure of the prepared EGFR tyrosine kinase enzyme is shown in Fig. 2.

Fig. 2 3D structure of the prepared EGFR tyrosine kinase enzyme
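
The removal of the co-crystallized ligand and water molecules described for the receptor can be illustrated in plain Python. This is a minimal sketch, not the Discovery Studio Visualizer procedure used in this work: it assumes the standard fixed-width PDB column layout, treats any HETATM record named HOH as water, and takes the ligand residue name as a parameter. The function names, the helper pdb_line, and the placeholder residue name LIG are hypothetical, and the addition of polar hydrogens is not covered.

```python
def strip_waters_and_ligand(pdb_text, ligand_resname):
    """Drop HETATM records for water (HOH) and for the named ligand."""
    kept = []
    for line in pdb_text.splitlines():
        is_hetatm = line.startswith("HETATM")
        resname = line[17:20].strip()  # residue name, PDB columns 18-20
        if is_hetatm and resname in ("HOH", ligand_resname):
            continue
        kept.append(line)
    return "\n".join(kept)


def pdb_line(record, serial, atom, resname, chain, resseq):
    """Hypothetical helper: build the first 26 fixed-width columns of a PDB record."""
    return f"{record:<6}{serial:>5} {atom:<4} {resname:>3} {chain}{resseq:>4}"


# Illustrative records only; "LIG" is a placeholder, not the ligand code in 3IKA.
example = "\n".join([
    pdb_line("ATOM", 1, "N", "GLY", "A", 696),
    pdb_line("HETATM", 2001, "O", "HOH", "A", 901),
    pdb_line("HETATM", 2002, "C1", "LIG", "A", 900),
])
cleaned = strip_waters_and_ligand(example, ligand_resname="LIG")
print(cleaned)
```

Only the protein ATOM record survives in this example. For a real entry, the ligand residue name would have to be read from the downloaded file rather than assumed.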

Molecular docking execution

The ligands (the EGFR inhibitors under investigation) were docked into the active site of the EGFR tyrosine kinase enzyme using AutoDock Vina as implemented in PyRx virtual screening software [19]. UCSF Chimera was used to recombine each docked ligand with the receptor. Discovery Studio Visualizer was then used to view the resulting complexes and examine the nature of the interactions between the ligands and the enzyme.
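
Once docking has finished, the best score for each ligand can be read from the docking output. The sketch below assumes that each docked pose in an AutoDock Vina output PDBQT file carries a line beginning with REMARK VINA RESULT: whose first number is the binding affinity in kcal/mol. The function name and the sample text are illustrative, and the second pose value is made up for the example; PyRx reports the same scores through its own results table.

```python
def best_vina_affinity(pdbqt_text):
    """Return the most favorable (most negative) affinity found in Vina output text."""
    scores = []
    for line in pdbqt_text.splitlines():
        if line.startswith("REMARK VINA RESULT:"):
            # Fields after the tag: affinity, RMSD lower bound, RMSD upper bound
            scores.append(float(line.split()[3]))
    if not scores:
        raise ValueError("no 'REMARK VINA RESULT:' lines found")
    return min(scores)


# Illustrative output fragment with two poses (values are examples only).
sample_output = """MODEL 1
REMARK VINA RESULT:      -9.3      0.000      0.000
ENDMDL
MODEL 2
REMARK VINA RESULT:      -8.9      1.742      2.310
ENDMDL
"""
print(best_vina_affinity(sample_output))
```

Taking the minimum rather than the first entry keeps the function correct even if the poses are not listed in order of score.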

Pharmacokinetic profile prediction

The drug-likeness evaluation and ADME screening were performed using SwissADME, a free web tool developed by the Swiss Institute of Bioinformatics and available at http://www.swissadme.ch [20]. All the compounds under investigation were subjected to this in silico screening, with compliance with Lipinski’s rule of five (RO5) used as the filtering criterion [21, 22].

Structure-based design

Structure-based drug design is an important and robust technique in drug discovery. Also called direct drug design, it involves acquiring information on the three-dimensional structure of the molecular target (protein) through methods such as X-ray crystallography, NMR spectroscopy, or homology modeling, followed by the design of suitable drug candidates based on their binding affinities and selectivity for the target. Structure-based drug design comprises several steps: protein structure retrieval and preparation, ligand library preparation, docking, and manual design of new compounds [23].

Results

Molecular docking simulation

The molecular docking results for the top ten compounds are presented in Table 1 and Fig. 3a, b.

Table 1 The binding affinities and different interactions of the top ten compounds and EGFR receptor
Fig. 3 The 2D structures of (a) complex 6 and (b) complex 8, drawn with Discovery Studio

Drug-likeness and ADME property prediction

The results of the drug-likeness and ADME property predictions for the top ten compounds are presented in Tables 2 and 3, respectively.

Table 2 Drug-likeness properties of ten top compounds
Table 3 ADME properties of ten top compounds

Design

The structures and binding affinities of the template and the newly designed compounds are given in Table 4.

Table 4 The structure and binding affinities of newly designed compounds and the template

Molecular docking investigation of the newly designed compounds

The molecular docking results for the newly designed compounds are presented in Table 5 and Fig. 4a, b.

Table 5 The interactions of the designed compounds in the active site of the EGFR receptor
Fig. 4 2D structures of the designed compounds (a) SED10 and (b) SED14 in complex with the EGFR enzyme, drawn with Discovery Studio

Drug-likeness and ADME property prediction of the newly designed compounds

The results of the drug-likeness and ADME properties prediction for the designed compounds are presented in Tables 6 and 7, respectively.

Table 6 Drug-likeness properties of the newly designed compounds
Table 7 ADME properties of the newly designed compounds

Discussion

Molecular docking

Molecular docking-based virtual screening, one of the methods applied in structure-based drug design, was used to screen the fifty EGFR inhibitors and identify a hit compound that could serve as the basis for designing new EGFR inhibitors, by investigating their binding interactions in the active site of the EGFR tyrosine kinase enzyme (3IKA) (Supplementary Table 2). The binding affinities from this docking study, presented in Supplementary Table 2, range from − 7.1 kcal/mol to − 9.3 kcal/mol. All the interactions (hydrogen bond, hydrophobic, and electrostatic) between the ligands and the target protein are shown in the same table. The results for the top ten compounds, those with the highest binding affinities (lowest docking scores), are presented in Table 1, and the best three are discussed here.

The best three hit compounds were compound 6, with the highest binding affinity of − 9.3 kcal/mol, followed by compounds 5 and 8, each with a binding affinity of − 9.1 kcal/mol. Compound 6 bound to the EGFR tyrosine kinase receptor through four types of interaction: conventional hydrogen bond, carbon-hydrogen bond, electrostatic, and hydrophobic interactions. Nitrogen one (N1) of the quinazoline ring of compound 6 formed a conventional hydrogen bond with the GLN791 residue at a bond distance of 2.87813 Å. The compound formed carbon-hydrogen bonds with the PRO794 and GLY796 residues at bond distances of 3.44185 Å and 3.37399 Å, respectively. Different parts of the compound also formed three hydrophobic interactions, with the PHE723, LEU718, and VAL726 residues. Furthermore, electrostatic interactions were observed between compound 6 and the LYS745 and ASP855 residues in the active site.

Compound 5 bound to the active site of the EGFR tyrosine kinase receptor through two types of hydrogen bond (conventional and carbon-hydrogen), four hydrophobic interactions, and two electrostatic interactions. The conventional hydrogen bonds were formed by the hydrogen attached to one of the nitrogen atoms of the quinazoline ring, the oxygen of the acrylamide moiety, and the fluorine on the benzyl ring of compound 5 with the ASP855 (2.8768 Å), LYS745 (2.83195 Å), and CYS797 (2.80635 Å) residues of the receptor, respectively. The carbon-hydrogen bonds were formed by the carbon and oxygen attached to the tetrahydrofuran-3-yl moiety and a carbon of the fluorobenzyl ring with the ASN842, PRO794, and GLY796 residues, at bond distances of 3.65564 Å, 3.22827 Å, and 3.2556 Å, respectively. Other parts of the ligand bound to the binding pocket through two electrostatic and four hydrophobic interactions with the ASP855 (2), LEU718, VAL726 (2), and PHE723 residues.

Compound 8 formed two conventional hydrogen bonds, with the GLN791 and ASP855 residues, at bond distances of 2.92147 Å and 2.69743 Å, respectively. Carbon-hydrogen bonds were also observed between the compound and the PRO794 (3.3526 Å) and GLY796 (3.30683 Å) residues in the active site. The compound further bound to the active site of the receptor through hydrophobic and electrostatic interactions with the LYS745, LEU718, and VAL726 residues.

The amino acid residues common to these hit compounds were GLN791, LYS745, PRO794, GLY796, LEU718, and VAL726. These residues might be responsible for their higher binding affinities. Fig. 3a, b shows the 2D structures of the complexes of compounds 6 and 8, drawn with Discovery Studio Visualizer.

Drug-likeness and ADME property prediction of the studied compounds

The drug-likeness and ADME properties of all the compounds under investigation were predicted using the SwissADME online web tool and are presented in Supplementary Tables 3 and 4, assessed against Lipinski’s rule of five. The rule states that an orally administered compound is more likely to permeate well if it satisfies the following conditions: (i) hydrogen bond donors (OH and NH groups) ≤ 5, (ii) hydrogen bond acceptors (N and O atoms) ≤ 10, (iii) molecular weight < 500, (iv) calculated WLOGP < 5, and (v) TPSA ≤ 140 Å². Any compound that violates more than one of these conditions may have bioavailability-related problems. The results for the top ten compounds are presented in Tables 2 and 3.
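
The five conditions listed above translate directly into a violation count. The following is a minimal sketch of that filter, not the SwissADME implementation: the class and function names are hypothetical, the descriptor values are invented for illustration rather than taken from the tables, and the permitted number of violations is left as a parameter (set here to one).

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    hbd: int      # hydrogen bond donors (OH and NH groups)
    hba: int      # hydrogen bond acceptors (N and O atoms)
    mw: float     # molecular weight
    wlogp: float  # calculated WLOGP
    tpsa: float   # topological polar surface area, in square angstroms


def count_violations(d):
    """Count how many of conditions (i)-(v) the compound fails."""
    checks = [d.hbd <= 5, d.hba <= 10, d.mw < 500, d.wlogp < 5, d.tpsa <= 140]
    return sum(1 for ok in checks if not ok)


def passes_filter(d, max_violations=1):
    """True when the compound stays within the permitted number of violations."""
    return count_violations(d) <= max_violations


# Hypothetical descriptor values, for illustration only.
small = Descriptors(hbd=2, hba=7, mw=486.9, wlogp=4.6, tpsa=97.4)
large = Descriptors(hbd=3, hba=8, mw=578.0, wlogp=6.1, tpsa=109.4)
print(count_violations(small), passes_filter(small))
print(count_violations(large), passes_filter(large))
```

In this example the second compound fails the molecular weight and WLOGP conditions, so it is flagged when only one violation is tolerated.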

All the compounds satisfied Lipinski’s rule of five with no more than one violation, except compound 16, which has two violations. This predicts good permeability, transport, absorption, and diffusion. The numbers of hydrogen bond donors and acceptors for all the compounds under investigation were less than 5 and 10, respectively, as required by the RO5. From these predicted parameters, all the compounds that obey the RO5, including the three hit compounds, are expected to be orally bioavailable and orally active.

For a drug to be orally active, it is expected to have high gastrointestinal absorption. All the compounds under investigation, including the three hit compounds, showed high gastrointestinal absorption, except compounds 2, 16, 21, 34, and 39, which showed low gastrointestinal absorption. None of them was predicted to permeate the blood-brain barrier, indicating lower toxicity. The bioavailability score, which reflects the amount of drug present in the plasma, was taken as the most important indicator of good absorption. All the compounds, including the three hit compounds, had a bioavailability score of 0.55, except compound 16, which had a lower score of 0.17. P-glycoprotein (P-gp) protects the central nervous system (CNS) from xenobiotics, so P-gp substrate status was also predicted for all the compounds; some were predicted to be P-gp substrates and others were not. The synthetic accessibility score indicates how easily a compound can be synthesized in a laboratory, on a scale from 1 (easy) to 10 (hard). The synthetic accessibility scores of all the compounds, including the three hit compounds, were below 5, showing that they can be easily synthesized in the laboratory. Overall, the compounds under investigation, including the three hit compounds, were predicted to have good pharmacokinetic profiles and drug-likeness, except compound 16 [24, 25].

Designed compounds

Based on the molecular docking virtual screening and pharmacokinetic studies of the quinazoline derivatives, compound 6, with the highest binding affinity of − 9.3 kcal/mol, a good pharmacokinetic profile, and good drug-likeness, was identified as the best hit among the analogs and was used as the template for designing new compounds. Sixteen new compounds (Table 4) were designed by structural modification at the meta position of the fluorobenzyl ring of compound 6. Among the designed compounds, the addition of phenyl-amino rings and halo-substituted phenyl-amino rings at the meta position of the fluorobenzyl ring attached to the “oxy-phenyl amino ring” moiety of the template significantly increased the binding affinities.

Molecular docking investigation of the newly designed compounds

The newly designed compounds were also docked into the active site of the EGFR tyrosine kinase receptor (PDB code: 3IKA). Table 5 shows the docking results for all of them. Their binding affinities range from − 9.5 kcal/mol to − 10.2 kcal/mol. The designed compounds therefore showed better binding affinities than both the template (− 9.3 kcal/mol) and the control afatinib (− 7.9 kcal/mol).
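
The comparison with the template and the control amounts to ranking the scores and taking differences. A short sketch follows, using only affinities quoted in the text (− 10.2 kcal/mol for the best designed compound, SED10, − 9.3 kcal/mol for the template, and − 7.9 kcal/mol for afatinib); the function names are hypothetical, and the differences are rounded to one decimal place to match the precision of the reported scores.

```python
# Binding affinities in kcal/mol, as quoted in the text.
affinities = {
    "SED10": -10.2,
    "compound 6 (template)": -9.3,
    "afatinib (control)": -7.9,
}


def rank_by_affinity(scores):
    """Sort from most favorable (most negative) to least favorable."""
    return sorted(scores.items(), key=lambda item: item[1])


def gain_over(reference, candidate):
    """Improvement of the candidate over the reference, in kcal/mol."""
    return round(reference - candidate, 1)


ranking = rank_by_affinity(affinities)
gain_vs_template = gain_over(affinities["compound 6 (template)"], affinities["SED10"])
gain_vs_control = gain_over(affinities["afatinib (control)"], affinities["SED10"])
print([name for name, _ in ranking])
print(gain_vs_template, gain_vs_control)
```

A positive gain means the candidate has the more negative, and hence more favorable, docking score.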

The designed compound SED10 was the best of the designed compounds, with a binding affinity of − 10.2 kcal/mol. It interacted with the binding pocket of the EGFR tyrosine kinase receptor through three conventional hydrogen bonds, two carbon-hydrogen bonds, four halogen bonds, three electrostatic interactions, and nine hydrophobic interactions involving the following amino acid residues: ARG841, GLN791, GLY796, LYS745 (2), ASP855, LEU844, LEU718 (2), PRO877 (2), LYS875, VAL726, ALA743, and LEU858 (Table 5).

The second-best of the designed compounds was SED14. It bound to the binding pocket of its target through three carbon-hydrogen bonds, four halogen bonds, three electrostatic interactions, and nine hydrophobic interactions involving the following amino acid residues: GLN791, GLY796, ASP837 (2), LYS745 (3), ASP855, LEU844, PHE723, PRO877, LEU718 (2), ALA743, VAL726, and LEU858.

The amino acid residues GLN791, GLY796, LYS745, ASP855, LEU844, LEU718, PRO877, VAL726, ALA743, and LEU858 were common to the best two designed compounds, which may explain their higher binding affinities. Furthermore, afatinib, the positive control, was used to validate the docking procedure in this study and was then compared with the designed compounds. The designed compounds showed better binding affinities than afatinib, which has a binding affinity of − 7.9 kcal/mol. The 2D structures of the designed compounds SED10 and SED14 in complex with the receptor are presented in Fig. 4a, b.
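
The residues shared by the two best designed compounds can be recovered as a set intersection of the residue lists given above for SED10 and SED14. The sketch below does this in plain Python; the variable names are illustrative, and the repeat counts shown in parentheses in the text are dropped because only residue identity matters for the comparison.

```python
# Interacting residues as listed in the text (repeat counts omitted).
sed10_residues = {
    "ARG841", "GLN791", "GLY796", "LYS745", "ASP855", "LEU844",
    "LEU718", "PRO877", "LYS875", "VAL726", "ALA743", "LEU858",
}
sed14_residues = {
    "GLN791", "GLY796", "ASP837", "LYS745", "ASP855", "LEU844",
    "PHE723", "PRO877", "LEU718", "ALA743", "VAL726", "LEU858",
}

# Residues contacted by both compounds, and those unique to each.
common = sed10_residues & sed14_residues
only_sed10 = sed10_residues - sed14_residues
only_sed14 = sed14_residues - sed10_residues

print(sorted(common))
print(sorted(only_sed10), sorted(only_sed14))
```

The set differences show which contacts distinguish the two compounds: ARG841 and LYS875 for SED10, and ASP837 and PHE723 for SED14.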

Drug-likeness and ADME prediction of the newly designed compounds

The drug-likeness of the newly designed compounds was also predicted following Lipinski’s rule of five (Table 6). None of the designed compounds violated more than two of the limits set by the rule of five filter for small molecules. On that basis, their permeability across the cell membrane, transport, absorption, and diffusion were predicted [24, 26].

The ADME properties of the newly designed compounds were also predicted (Table 7). All were predicted to have low gastrointestinal absorption, but none was predicted to permeate the blood-brain barrier, indicating lower toxicity. All the designed compounds have a bioavailability score of 0.55, except compounds 9, 13, and 14, which have a lower score of 0.17. All the designed compounds have a synthetic accessibility score below 5, except compound 9 with a score of 5.16, which indicates that the designed compounds can be easily synthesized in the laboratory [25, 27].

Conclusion

Molecular docking-based virtual screening of fifty quinazoline derivatives (EGFR inhibitors) revealed that compound 6 was the best hit among the compounds investigated. It was therefore retained as the template for structural modification in the design of new EGFR inhibitors.

The predicted pharmacokinetic profiles of the hit compounds and the remaining compounds were also examined. Except for compound 16, the compounds were predicted to be orally bioavailable, with good absorption, low toxicity, and good permeability.

Sixteen new EGFR inhibitors with better binding affinities than the template and afatinib (the positive control) were designed using the best hit compound 6 as a template. The structural modification, the addition of a phenyl-amino ring or a halo-substituted phenyl-amino ring at the meta position of the fluorobenzyl ring attached to the “oxy-phenyl amino ring” moiety of the template, might be responsible for the significant increase in the binding affinities of the designed compounds.

None of the designed compounds was found to exceed the permissible number of violations set by the RO5, which predicts good transport, absorption, and diffusion. Moreover, the designed compounds were found to have good synthetic accessibility, which indicates that they can be synthesized in the laboratory.