Introduction

Lung cancer is the most common cancer worldwide, according to a comprehensive global report that recorded ∼1.8 million new cases in 2012 (ref. 1). It is also one of the leading causes of cancer-related mortality worldwide (19.4% of all cancers). Additionally, it is more prominent in developing countries (58%) than in developed countries1. The abnormal activation of epidermal growth factor receptor (EGFR) tyrosine kinase promotes various tumor types, including lung cancer and breast cancer, via an increase in the levels of extracellular ligand, hetero-dimerization of EGFR or its mutational activation2,3. The most common EGFR mutations reported so far in non-small cell lung cancer (NSCLC) are deletions in exon 19 and the substitution mutation L858R in exon 21, both of which lead to constitutive tyrosine kinase activity independent of ligand binding4.

Given the role of EGFR in tumor progression, targeting it is an effective approach for NSCLC treatment. To this end, various small-molecule tyrosine kinase inhibitors, such as erlotinib, gefitinib and lapatinib, have been developed and are used as US Food and Drug Administration (FDA)-approved drugs in breast cancer and NSCLC treatment regimens5. Gefitinib has emerged as a therapeutic molecule that effectively impairs the tyrosine kinase activity of EGFR. This impairment blocks downstream signaling and thus inhibits the tumor-proliferative activity of EGFR6,7,8. The drug is administered orally at a dosage of 250–500 mg/day and is used as first-, second- and third-line therapy in NSCLC9.

Gefitinib has side effects such as nausea, vomiting, diarrhea and interstitial lung disease10. These adverse effects may be accounted for by inhibition of EGFR, of drug off-targets, or of both11. Analyzing the off-targets of this drug should therefore reveal both its benefits and its harms and help guide rational modifications that minimize the side effects. Additionally, early identification of adverse drug effects can be a crucial step in establishing efficient drug-based therapies, as up to 40% of drug failures during development are linked to adverse events in pre-clinical trials and to pharmacokinetics12.

Identifying drug off-targets by in vitro counter-screening of compounds against numerous receptors and enzymes is expensive and time-consuming13,14. In contrast, in silico analysis of drug off-targets is safe, time-efficient and economical, and provides a deeper understanding of the molecular mechanisms of protein-drug interactions. It has been shown that off-targets can be identified in silico on the basis of the structure-activity relationships of small molecules15,16. Various structure-based tools have also been developed for comparing the small-ligand binding sites of distantly related proteins17.

In the present study, we identified gefitinib off-targets using a structure-based systems biology approach. We confirmed the binding of the identified off-targets to gefitinib using a reverse docking approach. Additionally, through comparative re-docking of the identified off-targets with gefitinib and with their respective experimentally characterised ligands (ligands that were present in the crystal structure of the protein), we observed that a few off-targets may bind gefitinib more efficiently than their previously reported and experimentally validated ligands. Furthermore, literature survey and data mining showed that several of the identified off-targets had been validated in previously reported in vitro studies. Together, these observations suggest that the off-targets of gefitinib identified in this study might be true off-targets and could be involved in the molecular mechanisms underlying the possible side effects of this drug. Interestingly, our study suggested not only negative side effects but also positive roles of gefitinib. We also identified non-human off-targets that may be relevant to the treatment of pathogen-based diseases.

Results

Prediction of gefitinib off-targets through molecular interaction field (MIF) similarity search

To carry out the molecular interaction field (MIF) similarity search, the crystal structure of EGFR kinase with gefitinib (PDB id: 4WKQ) was used as the query structure. This structure contains several bound molecules in the native state, viz. gefitinib (IRE), 2-(N-morpholino)-ethanesulfonic acid and a sodium ion. The binding pocket of EGFR kinase was calculated within a distance of 3 Å around gefitinib and was found to be defined by residues Leu718, Gly719, Ala743, Ile744, Lys745, Glu762, Met766, Leu788, Thr790, Gln791, Leu792, Met793, Pro794, Phe795, Gly796, Arg841, Asn842, Leu844, Thr854, Asp855 and Phe856 (Fig. 1A,B). Gefitinib interacts with Met793 in the EGFR binding pocket via an H-bond to the nitrogen atom of the quinazoline ring and via van der Waals interactions.

Figure 1: Identification of the Gefitinib binding pocket in EGFR kinase domain.

(A) Schematic illustration of the EGFR receptor on the cell membrane. (B) The gefitinib binding pocket was calculated within a distance of 3 Å using Maestro. The identified ligand binding pocket is shown in 2D. Residues are colored according to their properties (red, negatively charged; blue, positively charged; cyan, polar; green, hydrophobic; and white, neutral). The H-bond is shown with a magenta arrow.

The MIFs of the ligand binding cavities of proteins listed in the sc-PDB database (containing 8077 protein structures) were calculated using six properties (H-bond donor, H-bond acceptor, aromatic, hydrophobic, positively charged and negatively charged interactions) and compared with the query MIF (i.e. the gefitinib binding pocket of EGFR). All the analyzed protein structures were ranked according to their Tanimoto scores (designated as MIF ranking; Supplementary Table S1). In total, 128 protein structures had Tanimoto scores of ≥0.35. These were considered putative off-targets of gefitinib and were selected for further analysis. The selected structures represent 50 proteins (Table 1). These hits belong to the following species: human (41), rat (3), Xenopus laevis, Pseudomonas putida, Toxoplasma gondii, Cryptosporidium parvum, E. coli, Betula pendula and Zea mays (1 each). For subsequent analysis, we focused on the 41 identified human proteins (Table 1). Most of the putative off-target hits (107) belong to the protein kinase family (Pfam ID: PF00069 and PF07714). The remaining hits belong to Pfam domains such as the Ephrin type-A receptor 2 transmembrane domain, EF-hand domain, Cartilage oligomeric matrix protein, transforming growth factor beta type I GS-motif, MAATS-type transcriptional repressor, Dihydroorotate dehydrogenase and Protein Kinase C terminal domain. The Pfam IDs of these domains are listed in Table 1, together with the known functions and subcellular localization of the selected proteins.
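The ranking and cut-off step can be illustrated with a minimal sketch. It computes a Tanimoto coefficient between two sets of interaction-field points and keeps the hits scoring at or above 0.35. The set-based definition, the function names and the toy data are assumptions made for illustration only; the actual scoring used in the study operates on graph-matched probe grids and is not reproduced here.

```python
def tanimoto(query_points, target_points):
    """Tanimoto coefficient of two sets: |A & B| / |A | B|."""
    a, b = set(query_points), set(target_points)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def rank_hits(query_points, candidates, threshold=0.35):
    """Return (name, score) pairs with score >= threshold, best first."""
    scored = [(name, tanimoto(query_points, pts)) for name, pts in candidates.items()]
    kept = [(name, score) for name, score in scored if score >= threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Toy data: each point is a hypothetical (grid index, probe type) pair.
query = {(0, "hydrophobic"), (1, "donor"), (2, "acceptor"), (3, "aromatic")}
candidates = {
    "hit_A": {(0, "hydrophobic"), (1, "donor"), (2, "acceptor"), (3, "aromatic")},
    "hit_B": {(0, "hydrophobic"), (1, "donor"), (5, "positive")},
    "hit_C": {(7, "negative")},
}
ranked = rank_hits(query, candidates)
```

In this toy case hit_A and hit_B pass the cut-off while hit_C, which shares no points with the query, is discarded.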

Table 1 The details of off-targets of gefitinib identified through MIF similarity search analysis.

As anticipated, the top-ranked structure in the MIF search analysis was that of the mutated EGFR kinase domain (G719S/T790M) in complex with gefitinib. These data indicate that the method used for binding pocket similarity analysis is appropriate and quite accurate. The other top-ranked structures, such as serine/threonine-protein kinase Chk1 (CHEK1; MIF rank 2) in complex with 3-(Indol-2-yl) indazoles and mitogen-activated protein kinase 14 (MAPK14; MIF rank 3) bound to the inhibitor 4-[3-methylsulfanylanilino]-6,7-dimethoxyquinazoline (PDB IDs: 2HOG and 1DI9, respectively), show high similarity to the EGFR binding site and might be true off-target proteins.

The superimposed structures obtained from the detailed MIF analysis of the query (EGFR) and the top-ranked off-targets (CHEK1 and MAPK14) are shown for five probes (colored spheres; Fig. 2). The probes of the query and off-target proteins are shown by bigger and smaller spheres, respectively. Hydrophobic probes are abundant around the ligands of both the query protein and the off-targets (cyan spheres in Fig. 2A,B). H-donor probes (blue spheres) of the query and off-targets surround the nitrogen (N)-containing C-rings of the ligands. In the case of MAPK14, H-donor residues surround O-31 and the N-containing C-ring of the bound ligand, in the same way as they surround the N-containing C-rings and the F atom of the side group in gefitinib. Positive and negative probes (green and magenta spheres, respectively) surround the N-containing C-ring and the S-21-containing C-ring in the CHEK1- and MAPK14-bound ligands, respectively. This indicates that the binding pockets of EGFR and the identified off-targets are very similar in nature, and it could therefore be argued that gefitinib might be able to bind these off-targets efficiently and modulate their functions.

Figure 2: Binding site structure similarity between query protein and off-targets CHEK1 and MAPK14.

The superimposed structures of query and off-target proteins in complex with their respective ligands are shown (A,B; top, left panel). The similarity of the binding pockets in query (green) and off-targets (cyan) is shown (A,B; top, right panel). The hydrophobic, donor, acceptor, negative and positive binding site probes are shown separately and are represented by cyan, blue, red, magenta and green colored spheres, respectively (A,B; middle and lower panel). Large spheres represent the query binding probes while smaller ones represent the off-target binding probes.

Identification and characterization of binding pockets of off-targets

For binding pocket estimation, we considered only the pocket in which the ligand of the respective off-target protein was bound. We defined the pocket at a threshold distance of 4 Å from the ligand, which limits it to short-range interactions. A pocket with a score ≥0.5 was considered highly druggable. After estimation of the pockets in the 128 off-target structures, 120 pockets were found to have a druggability probability greater than 0.5. Only 8 pockets had values of less than 0.5 (Supplementary Table S2). Since all the identified pockets came from crystal structures and had at least one bound ligand, the pockets with a predicted druggability value of less than 0.5 could not be disregarded and were kept in the analysis. The PockDrug server also calculated 66 physicochemical properties (such as hydrophobicity, polarity and aromaticity) of the pockets; selected parameters are shown in Supplementary Table S2. The binding pocket volumes ranged from 257.99 to 1766.88.

In silico confirmation of off-targets using ligand-protein docking

To further investigate the off-targets identified through MIF similarity searches, we used a reverse docking approach. Gefitinib was docked with each of the 128 off-target structures. A docking score was calculated for each gefitinib binding pose (range −1.224 to −12.025), and each pose was then subjected to local refinement and binding energy calculation (MM-GBSA method). The 128 structures were re-ranked according to the MM-GBSA binding energy (G-rank; lowest binding energy first; Supplementary Table S3). Notably, the mutant EGFR kinase domain bound with gefitinib (G719S/T790M; PDB id 3UG2), which was ranked first in the MIF similarity search, also showed an efficient docking score and binding energy (−6.818 and −77.11 kcal/mol, respectively). The ligand interacts with Met793 and Asp800 via H-bonds and with Gln791 via polar contacts (Fig. 3A). Seven human off-targets (i.e. MAPK10, PIM-1, DHODH, ERBB-4, HSD17B1, CHK2, CHK1) were found to bind gefitinib with equal or better binding energy than EGFR (Supplementary Table S3). The binding energy for these off-targets ranges from −103.446 to −94.712 kcal/mol. The residues involved in the binding of gefitinib to these off-targets are shown in Fig. 3B–H.
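The re-ranking by binding energy can be sketched as follows. This is only an illustration of ordering structures from the most negative (most favourable) energy upward and of listing those that match or beat a reference structure; the function names are hypothetical, and only the two energies labelled as quoted come from this article.

```python
from typing import Dict, List, Tuple


def g_rank(energies: Dict[str, float]) -> List[Tuple[int, str, float]]:
    """Rank structures by binding energy, most negative (most favourable) first."""
    ordered = sorted(energies.items(), key=lambda item: item[1])
    return [(rank, name, energy) for rank, (name, energy) in enumerate(ordered, start=1)]


def at_least_as_good(energies: Dict[str, float], reference: str) -> List[str]:
    """Names whose energy is equal to or lower than that of the reference structure."""
    cutoff = energies[reference]
    return [name for name, e in energies.items() if e <= cutoff and name != reference]


# 3UG2 and 2UXH energies (kcal/mol) are quoted in the article;
# "toy_weak_binder" is a made-up entry for illustration.
energies = {"3UG2": -77.11, "2UXH": -100.838, "toy_weak_binder": -40.0}
ranking = g_rank(energies)
better = at_least_as_good(energies, "3UG2")
```

Sorting on the raw energy keeps the convention used in the text, where a lower (more negative) MM-GBSA value means more favourable binding.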

Figure 3

Molecular interactions of gefitinib with selected human off-targets: (A) mutant EGFR kinase domain; (B) PIM-1; (C) MAPK10; (D) CHEK1; (E) DHODH; (F) ERBB4; (G) CHK2 and (H) HSD17B1. PDB codes are shown in brackets. The H-bonds and polar interactions are represented by black lines and green lines, respectively.

Comparison of binding efficiency of identified off-targets with gefitinib and reported ligands

To compare and verify the data, we also performed reverse docking of the identified off-targets with their previously reported ligands, i.e. those already present in the crystal structures of the off-targets. The docking scores and binding energies of these bound ligands are shown in Supplementary Table S3. To determine the difference in binding efficiency between gefitinib (∆Ggef) and the respective bound ligand (∆Glig) for each off-target structure, the ratio of binding energies (∆Ggef/∆Glig) was calculated and plotted against the average binding energy (Fig. 4). Off-targets with a ∆Ggef/∆Glig value ≥ 1.5 were considered to bind gefitinib significantly more efficiently than their respective reported ligand (Fig. 4, red spheres). Compared with the bound ligand, gefitinib showed more efficient binding with 15 human and 1 non-human off-target structures (Fig. 4; Table 2). This observation is in agreement with various in vitro studies reporting that the ligands present in the co-crystals of some of these off-target structures bound less efficiently (IC50) than other inhibitors used in the respective in vitro studies (Table 2). This indicates that gefitinib might also be a potent inhibitor of the identified off-targets.
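A small sketch of the ratio calculation follows. It assumes both binding energies are negative, so that a ratio above 1 means gefitinib has the more favourable energy; the function names and the energies in the toy data are hypothetical.

```python
def binding_ratio(dg_gef, dg_lig):
    """Ratio of the gefitinib binding energy to that of the co-crystallised ligand."""
    if dg_lig == 0:
        raise ValueError("ligand binding energy must be non-zero")
    return dg_gef / dg_lig


def flag_preferential(pairs, threshold=1.5):
    """Return {name: (ratio, mean energy)} for structures at or above the threshold."""
    flagged = {}
    for name, (dg_gef, dg_lig) in pairs.items():
        ratio = binding_ratio(dg_gef, dg_lig)
        if ratio >= threshold:
            flagged[name] = (ratio, (dg_gef + dg_lig) / 2)
    return flagged


# Hypothetical energies in kcal/mol: (gefitinib, native ligand).
pairs = {"toy_1": (-90.0, -45.0), "toy_2": (-60.0, -80.0)}
flagged = flag_preferential(pairs)
```

Here toy_1 has a ratio of 2.0 and is flagged, whereas toy_2 has a ratio of 0.75 and is not. The mean energy returned alongside the ratio corresponds to the second axis of the plot described above.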

Figure 4: Comparison of the binding energies of gefitinib (∆Ggef) and the respective bound ligand (∆Glig) with the off-target structures.

The ratio of gefitinib to reported ligand binding energies (∆Ggef/∆Glig) was plotted against the average binding energy of both. The dotted line shows the threshold ratio of 1.5. The red spheres mark ratios above the threshold value.

Table 2 The list of characterized off-targets having significant binding efficiency towards gefitinib compared to their respective reported ligands.

Retrospective studies of identified off-targets

Various studies have been published on the comprehensive analysis of kinase inhibitors, including gefitinib, for their selectivity18,19. The quantitative inhibition data for gefitinib were extracted from previous studies as well as from the curated databases DSigDB and ChEMBL18,19,20,21. These data were compared with the data obtained in the present study. The comparison validated the proteins ERBB4, PIM1, MAPK10, MAPK14, ALK, LCK, BTK, ABL1, SRC, STK10, TNK2, KIT, IGF1R, SLK, CHK2, MET, STK17B and SYK as true off-targets of gefitinib (Table 3). These data support the specificity of the in silico prediction of gefitinib off-targets. Additionally, we found the DHODH, HSD17B1, BMPR1B, NTRK1, ACVRL1 and TTK proteins to be new off-targets of gefitinib that were not included in previous reports (Table 3). Furthermore, we curated the published quantitative inhibition data to identify other off-targets. In total, 22 off-targets that were identified in previous in vitro studies were not found among the top 128 hits in our study (Supplementary Table S4). These proteins were also included in further analysis to assess the effects of gefitinib on molecular pathways and diseases.
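The retrospective comparison amounts to a set comparison between predicted and experimentally reported off-targets, which can be sketched as below. The function name and the entry "toy_kinase" (standing in for an in vitro hit absent from the predictions) are hypothetical; the other gene names are taken from the paragraph above.

```python
def compare_with_in_vitro(predicted, reported):
    """Split off-targets into confirmed, newly predicted and missed groups."""
    predicted, reported = set(predicted), set(reported)
    return {
        "confirmed": sorted(predicted & reported),  # predicted and seen in vitro
        "new": sorted(predicted - reported),        # predicted only
        "missed": sorted(reported - predicted),     # seen in vitro only
    }


predicted = ["ERBB4", "PIM1", "MAPK10", "DHODH", "HSD17B1"]
reported = ["ERBB4", "PIM1", "MAPK10", "toy_kinase"]
summary = compare_with_in_vitro(predicted, reported)
```

The "missed" group plays the role of the off-targets known from earlier in vitro work that did not appear among the top hits.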

Table 3 Binding energies of identified off-targets with gefitinib and their comparison with data previously reported in high throughput in vitro studies.

Biological pathways analysis

Biological processes were predicted on the basis of gene ontology and the pathways were ranked according to the p-value calculated using Genomatrix software. In total, 971 pathways were found to be significantly correlated with the input off-target genes (Supplementary Table S5). In this analysis, cellular processes such as protein phosphorylation and related pathways were detected among the top 50 hits. Cellular proliferation pathways such as cell growth and apoptosis were strongly associated with gefitinib off-targets. Other biological processes, such as cell differentiation, cell communication, stress response, and developmental and metabolic processes, were also found among the top hits according to the p-value. Signaling pathways such as the MAPK cascade, immune-response-regulating signalling pathway, serine/threonine kinase pathways and neurotrophin TRK receptor signalling pathway were the major pathways that could be correlated with the major reported side effects (Supplementary Fig. S1)22.

Associated disease analysis

Prediction of associated clinical diseases is crucial for explaining the clinical outcome of the side effects of gefitinib. Sixty clinical diseases that were significantly correlated with the gefitinib off-target proteins were predicted using the Genomatrix curated database (Supplementary Table S6). These diseases were grouped into the following broad categories: (i) different cancer types, (ii) blood disorders, (iii) bone diseases and (iv) reproductive disorders (Fig. 5). Other discrete diseases and abnormalities of the pituitary, endocrine system, hypothalamus, gastrointestinal tract and bone marrow were also predicted. These results suggest that gefitinib might have side effects that play a role in the above-mentioned diseases.

Figure 5

Prediction of clinical diseases that might be modulated by gefitinib-induced side effects.

Discussion

In the present study, we have carried out a comprehensive analysis of gefitinib off-targets using a systems biology approach; most of the identified off-targets could be validated by retrospective analysis of previously reported studies. In addition, we could also identify a few new off-targets such as DHODH, BMPR1B, NTRK1 and HSD17B1. Together, these observations could be useful for defining the molecular basis of gefitinib-induced side effects and might help in rational improvement of the drug for better treatment.

In our analysis, the mutant EGFR kinase domain interacts with gefitinib through the same residues as the wild-type EGFR. Notably, the wild-type EGFR crystallised with an imidazo[2,1-b]thiazole inhibitor (PDB id 3LZB) also showed efficient binding energy with gefitinib. However, in this structure the interacting residues were different and the binding energy was the lowest. This suggests that the pocket of the EGFR kinase domain may adopt different conformations for interaction with gefitinib.

Interestingly, the top-ranked off-targets showed the highest binding site similarity but not the most efficient binding energy. For example, docking of gefitinib to EGFR gives a binding energy of −88.354 kcal/mol, and gefitinib acts as a strong EGFR inhibitor (∼97% inhibition; Ki 0.4 nM) (Table 3). Another off-target, ERBB4, showed the maximum affinity for gefitinib but was ranked lower in the MIF analysis. Notably, ERBB4 is efficiently inhibited by gefitinib (∼76% inhibition; Kd 410 nM) (Table 3). Additionally, other proteins, e.g. DHODH and BMPR1B (which were not reported in previous in vitro analyses), also showed efficient binding energy with gefitinib. These observations suggest that reverse docking might be a suitable approach for confirming binding affinities. It has also been shown previously that the binding energy calculated on docked poses is useful for predicting the binding affinity of a ligand for its receptor23.

During our analyses, a few non-human off-targets such as src (Gallus gallus domesticus), ttgr (Pseudomonas putida), aurkb-a (Xenopus laevis), ack2 (Zea mays), cdpk1 (Toxoplasma gondii/Cryptosporidium parvum) and erk2 (Rattus rattus) were also found to exhibit efficient binding energy with gefitinib. Amongst the non-human off-targets, ttgr (PDB id: 2UXH), a helix-turn-helix type transcriptional regulator and antibiotic-binding repressor of Pseudomonas putida, exhibited highly efficient binding with gefitinib (binding energy −100.838 kcal/mol). This observation suggests that gefitinib might be explored as part of a combination therapy against antibiotic-resistant strains of Pseudomonas putida.

Previously reported in vitro, in vivo and clinical studies have suggested that gefitinib may induce side effects such as pro-apoptosis and cell-cycle inhibition, possibly by interacting with off-targets24. A recent study has demonstrated that gefitinib is able to induce cardiac hypertrophy through differential expression of apoptotic and oxidative stress genes25. In contrast, a few previous studies have also demonstrated positive effects of gefitinib, such as bone pain relief during bone metastasis and brain metastasis26,27,28. However, the underlying molecular basis of such effects is poorly understood. In the present study, we found that gefitinib off-targets are associated with different biological pathways that may explain the molecular mechanisms of such positive and negative effects of gefitinib. The identified off-targets ACVR1, DHODH, BTK, FGFR1, EGFR, FGFR2 and CHEK2 are linked with bone diseases. Similarly, an earlier report demonstrating resistance to EGFR therapy in rectal diseases and non-small cell lung cancer through dysregulation of EGFR endocytosis can be explained via the identified off-targets LYN, SRC, ABL2, ABL1, SYK, TNK2, MAPKAPK2, GAK and MAPK1, which are involved in the endocytosis process.

This study has identified off-targets that bind gefitinib, suggesting molecular mechanisms for the side effects of this drug. The biological processes regulated by these off-targets merit attention in future studies. Notably, this study also suggested positive roles of gefitinib in the treatment regimen. System-wide in silico approaches may facilitate the identification of side effects of preclinical and commercial drugs arising from both the target and off-targets. This strategy may have important applications in the rational improvement of drug design and development.

Methods

Druggable proteome data set

The sc-PDB database v.2013 (http://cheminfo.u-strasbg.fr/scPDB/) was used as the source of the druggable proteome for off-target analysis. The data set contains 8077 structures with druggable binding sites that represent 3678 proteins and 5608 different HET ligands29.

Binding site similarity search

To compare physicochemical similarities in the binding pockets of different proteins, MIFs within the binding site volumes and pairwise MIF similarities between binding sites were calculated using the IsoMIF Finder (http://bcb.med.usherbrooke.ca/isomif)30. In this study, the crystal structure of EGFR kinase with gefitinib (PDB id 4WKQ; 1.85 Å) was selected as the query protein. The binding pocket within a distance of 3 Å around gefitinib was cropped using the GetCleft tool30 and subsequently used for the MIF similarity search against the sc-PDB database. The parameters used for the analysis were as follows: grid spacing 1.5 Å and geometric distance threshold 3.0 Å. Structures were ranked by Tanimoto score, and this rank was designated the MIF rank. The top 128 hits were selected as putative gefitinib binding targets for further studies. The details of the 128 PDB structures, with references, are listed in Supplementary Table S8.

Pocket estimation and characterization

The pockets of the top 128 hits were estimated based on ligand proximity within a fixed distance threshold from the bound ligand. To extract the residues located within the threshold distance, the “PockDrug-Server” (http://pockdrug.rpbs.univ-paris-diderot.fr) was used. The PDB files were uploaded to the server and the “prox” method was selected to estimate the pocket using a threshold distance of 4 Å. The ligand information in HET code was also supplied during prediction31.
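The idea of a proximity-based pocket can be sketched in a few lines: a residue belongs to the pocket if any of its atoms lies within the threshold distance of any ligand atom. This is a generic illustration and not the server's implementation; the function name is hypothetical and the coordinates are made up (only the residue labels are borrowed from the EGFR pocket described earlier).

```python
import math


def residues_near_ligand(protein_atoms, ligand_coords, cutoff=4.0):
    """Return sorted residue labels having any atom within `cutoff` Å of the ligand.

    protein_atoms: iterable of (residue_label, (x, y, z)) tuples.
    ligand_coords: iterable of (x, y, z) tuples for the ligand atoms.
    """
    ligand_coords = list(ligand_coords)
    near = set()
    for residue, coord in protein_atoms:
        if any(math.dist(coord, lig) <= cutoff for lig in ligand_coords):
            near.add(residue)
    return sorted(near)


# Made-up coordinates in Å, for illustration only.
protein_atoms = [
    ("Met793", (0.0, 0.0, 3.0)),
    ("Leu718", (0.0, 3.5, 0.0)),
    ("Asp855", (10.0, 0.0, 0.0)),
]
ligand_coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
pocket_4A = residues_near_ligand(protein_atoms, ligand_coords, cutoff=4.0)
pocket_3A = residues_near_ligand(protein_atoms, ligand_coords, cutoff=3.0)
```

Tightening the cut-off from 4 Å to 3 Å shrinks the toy pocket from two residues to one, which shows how the choice of threshold controls the pocket definition.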

Protein-Ligand docking

The potential off-targets identified from the MIF similarity search were further processed for binding analysis with gefitinib and with their previously characterized ligands. The Glide 6.9 ligand-receptor docking program (Schrödinger 10.4; Schrödinger Inc, USA) was used for docking gefitinib to each off-target structure. The ligand library of gefitinib was prepared with the LigPrep tool from the Schrödinger program using the OPLS-2005 force field. A receptor grid was generated in the vicinity of the bound ligand of each identified off-target crystal structure using the Glide Receptor Grid Generation tool with default parameters. Ligand docking was performed with the extra precision (XP) Glide docking module. The binding energies of the docking poses were calculated using the MM-GBSA method (Prime, Schrödinger Inc, USA) with default parameters.

Literature mining and retrospective studies

PubMed and Google Scholar were used to search for publications and research studies relevant to gefitinib. These reports were analyzed to determine the selectivity of gefitinib. We also searched the DSigDB database, a collection of small molecules, including drug-based compounds, and their quantitative inhibition data, for the analysis of gefitinib inhibition32.

To analyze the biological pathways corresponding to the identified off-targets, Genomatrix software (Genomatrix, Munich, Germany) with the Gene Ranker and GePS (Pathway system) modules was used. The identified off-targets were used as the input query, network pathways were constructed and the identified pathways were ranked according to p-value.
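Ranking pathways by p-value can be illustrated with a generic over-representation test. The sketch below uses a one-sided hypergeometric test, which is an assumption chosen for illustration: the statistic actually used by the commercial software is not described in this article, and the function names, gene labels and background size are all hypothetical.

```python
from math import comb


def hypergeom_pvalue(population, annotated, sample, hits):
    """P(X >= hits) when `sample` genes are drawn from `population` genes,
    of which `annotated` belong to the pathway."""
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


def rank_pathways(pathways, population, query_genes):
    """Return (pathway, hits, p-value) tuples sorted by ascending p-value."""
    query = set(query_genes)
    rows = []
    for name, members in pathways.items():
        hits = len(query & set(members))
        p = hypergeom_pvalue(population, len(set(members)), len(query), hits)
        rows.append((name, hits, p))
    return sorted(rows, key=lambda row: row[2])


# Toy background of 20 genes and two made-up pathways.
pathways = {
    "pathway_1": ["A", "B", "C", "D"],
    "pathway_2": ["W", "X", "Y", "Z"],
}
ranked_pathways = rank_pathways(pathways, population=20, query_genes=["A", "B", "C"])
```

In the toy example all three query genes fall in pathway_1, so it receives the smaller p-value and is ranked first, while pathway_2 contains none of them and gets a p-value of 1.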

Additional Information

How to cite this article: Verma, N. et al. Identification of gefitinib off-targets using a structure-based systems biology approach; their validation with reverse docking and retrospective data mining. Sci. Rep. 6, 33949; doi: 10.1038/srep33949 (2016).