1 Introduction

Tuberculosis (TB) is an airborne contagious disease caused by Mycobacterium tuberculosis that affects about one-third of the world’s population [1]. According to the World Health Organization (WHO), tuberculosis continues to cause considerable morbidity and mortality worldwide despite the availability of an effective and economical drug regimen [2,3,4,5]. The emergence and spread of Multi-Drug Resistant Tuberculosis (MDR-TB) [6, 7] and Extensively Drug Resistant Tuberculosis (XDR-TB), together with the deadly complication of tuberculosis co-infection with Human Immunodeficiency Virus (HIV), demand the discovery and development of new antitubercular agents with good efficacy, effectiveness and safety, focused on new drug targets with innovative mechanisms of action [8,9,10].

There are usually three reasons for needing new antituberculosis drugs: (i) to improve current treatment by shortening the total duration of treatment and/or by providing more widely spaced intermittent treatment [11, 12], (ii) to improve the treatment of MDR-TB, and (iii) to provide more effective treatment of latent tuberculosis infection (LTBI) [13].

Computational approaches are used to design inhibitors of the recently identified enzyme Decaprenyl-phosphoribose 2′-oxidase (DprE1), which catalyzes an essential step in mycobacterial cell wall metabolism [14, 15]. The cell wall is a functional and protective interface between the external and internal environments of the organism. Disruption or inhibition of its synthesis prevents the growth and multiplication of the organism. Mycobacterium tuberculosis (MTB) has a special cell wall arrangement, with layers of outer lipids, mycolic acid, polysaccharides (arabinogalactan), peptidoglycan, plasma membrane, lipoarabinomannan (LAM), and phosphatidyl inositol mannoside. The polysaccharide arabinogalactan is the basic precursor for bacterial cell wall synthesis. Decaprenyl Phosphoryl-β-D-ribose 2′-Epimerase (DprE1) is an oxidase enzyme involved in the biosynthesis of Decaprenyl Phosphoryl-D-Arabinose (DPA), which acts as a donor of D-arabinofuranosyl residues for the synthesis of arabinogalactan [16, 17]. DprE1 is a flavoprotein that, along with Decaprenylphosphoryl-2-keto-ribose reductase (DprE2), catalyses the epimerization of Decaprenylphosphoryl-D-ribose (DPR) to Decaprenylphosphoryl-D-arabinose (DPA) through the intermediate Decaprenylphosphoryl-2-keto-ribose (DPX) (Fig. 1). This NADP dependent enzymatic reaction makes DprE1 an essential component for cell growth and survival of the bacteria [18,19,20]. Hence DprE1 has emerged as an important drug target: inhibition of this enzyme blocks bacterial cell wall synthesis and leads to the death of the bacteria [21].

Computational approaches are used to develop new drug moieties with high activity and minimal toxicity. Computational modelling of drugs is based on information about the ligand and the target receptor [22]. Depending on the published information available about the ligands and the receptor, either structure-based or ligand-based molecular design approaches are used to correlate biological activity with the chemical structure of the ligands [23]. Computational studies are considered effective tools in medicinal chemistry and are useful in speeding up the drug design process [24].

Molecular modelling represents molecular structures numerically and describes their behaviour through the equations of quantum mechanics. In the present work, a Quantitative Structure Activity Relationship (QSAR) study, pharmacophore modelling, molecular docking and in silico ADME prediction are performed on a series of DprE1 inhibitors. The QSAR study includes the development of two-dimensional (2D) and three-dimensional (3D) models, in which the molecular structures are taken in their most stable state and used to calculate the descriptors [25]. The developed models are validated using different statistical parameters [26]. The validated models help to optimize the lead compounds and provide information about the correlation between structural properties and activity [27]. Pharmacophore modelling and molecular docking are performed to understand and interpret the binding interactions between the ligands and the receptor. In silico ADMET prediction helps to assess the pharmacokinetic (PK) profile and drug likeliness of the molecules [28]. The combination of docking and pharmacophore modelling with QSAR studies can be applied to gain more precise information on the interactions between the ligand and the receptor [29,30,31].

Based on the developed models, novel active DprE1 inhibitors with greater selectivity, efficacy and safety can be rationally designed. The results of the pharmacophore and docking studies can improve the binding of ligands to their receptor and provide insights into the structural features related to the activities of new drug compounds.

2 Materials and methods

2.1 Data set

To perform the present computational study, a set of 50 compounds with reported IC50 values was taken from the available literature [32, 33], excluding compounds without well-defined biological activities. The selected compounds share the same activity and assay procedure, with significant variations in their structure and potency. The inhibitory potencies of the compounds in the data set have IC50 values ranging from 0.005 to 56.7 µM, which were converted to pIC50 using Eq. 1:

$${\text{pIC}}_{50} = - \log_{10} \left( {\text{IC}}_{50} \right)$$
(1)
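
As a quick check of Eq. 1, the conversion can be sketched in a few lines of Python. The function name `pic50` is illustrative, and the sketch assumes that Eq. 1 is applied directly to the IC50 value in the unit used in the data set; the text does not state whether a unit conversion precedes the logarithm.

```python
import math


def pic50(ic50):
    """Apply Eq. 1 literally: pIC50 = -log10(IC50).

    Assumption: `ic50` is given in the same unit as the data set values.
    """
    if ic50 <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50)


# The two extremes of the reported potency range (0.005 and 56.7).
most_potent = pic50(0.005)
least_potent = pic50(56.7)
print(round(most_potent, 3), round(least_potent, 3))
```

Under this assumption the least potent compounds receive a negative pIC50, since their IC50 value is greater than 1.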

The structures of all the compounds in the data set are sketched using the molecular sketching facilities provided in the V-Life MDS software [34]. Energy minimization of the compounds is carried out in the same software with the Merck molecular force field (MMFF) [35, 36], fixing the dielectric constant at 1.0 and the root mean square (RMS) gradient at 0.0001. Energy minimization is performed so that the compounds are in a stable conformation for effective binding with the target receptor. The data set is divided into training and test sets using the Sphere Exclusion algorithm, so that the activities of the selected test set are distributed throughout the activity range of the compounds; the distribution curve for the test and training compounds is given in Fig. 2. The QSAR models are developed and validated using 36 and 14 molecules as training and test set compounds, respectively. The chemical structures and pIC50 values are given in Table 1.
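
The Sphere Exclusion step can be illustrated with a minimal Python sketch. This is a generic version of the idea and not the V-Life MDS implementation: the function name, the Euclidean distance, the `radius` parameter and the rule for picking the next centre are all assumptions made for illustration.

```python
import math


def sphere_exclusion(points, radius, start=0):
    """Split compound indices into sphere centres and compounds covered by a sphere.

    points: list of descriptor vectors (hypothetical values).
    radius: dissimilarity radius of each sphere (assumed Euclidean).
    """
    if not points:
        return [], []
    remaining = list(range(len(points)))
    centres, covered = [], []
    current = start
    while True:
        remaining.remove(current)
        centres.append(current)
        outside = []
        for j in remaining:
            # Compounds inside the sphere around the current centre are set aside.
            if math.dist(points[j], points[current]) <= radius:
                covered.append(j)
            else:
                outside.append(j)
        remaining = outside
        if not remaining:
            break
        # Assumption: the next centre is simply the first compound still uncovered.
        current = remaining[0]
    return centres, covered


toy = [(0.0,), (0.1,), (1.0,), (1.05,), (2.0,)]
centres, covered = sphere_exclusion(toy, radius=0.2)
print(centres, covered)
```

Which of the two groups serves as the training set and which as the test set depends on the software settings and is not specified in the text.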

Table 1 Chemical structure and pIC50 values of the compounds having DprE1 inhibition activity

2.2 QSAR study

QSAR study is performed to find the correlation between the activity and structural features (descriptors) of the data set molecules. In this method, we try to find structural parameters that relate to the inhibition activity through mathematical equations [37, 38].

2.2.1 Generation of 2D-QSAR models

In the present study, 2D QSAR models are developed between activity and descriptors such as Retention Index (chi), Atomic valence connectivity index (chiv), Path Count, Chi Chain, Path Cluster, Element Count, Dipole Moment, topological descriptors, Estate Contributions, Information Theory Index, Extended Topochemical Atom (ETA) based descriptors and Polar Surface Area, which are considered physicochemical descriptors; T_2_O_7, T_2_N_5, T_N_N_5, T_2_2_6, T_C_O_1, T_O_Cl_5 etc., which are Alignment Independent (AI) descriptors; and MMFF atom type descriptors. 3D descriptors such as Electrostatic, Distance Based Topological Indices, SemiEmpirical and hydrophobicity-based logP descriptors are excluded from the study. A total of 415 molecular descriptors are calculated using V-Life MDS software before developing the QSAR models. For the alignment independent descriptors, the attributes (2, T, C, N, O, F, S, Cl) are used with a range from 0 to 7, and the structure descriptors are set as Topological in the software. After obtaining the descriptor values for all the compounds, descriptors that have a constant value for all the molecules are discarded. Four QSAR models are developed using the Multiple Regression (MR), Principal Component Regression (PCR), Partial Least Square Regression (PLSR) and Partial Least Square associated with Sphere Exclusion (PLS-SE) methods, taking all the calculated descriptors as independent variables and biological activity as the dependent variable.
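
The removal of constant descriptors described above can be sketched as follows. The function name and the toy descriptor table are hypothetical; only the rule (discard any descriptor whose value is identical for all molecules) comes from the text.

```python
def drop_constant_descriptors(table):
    """Keep only descriptors whose values differ across the compounds.

    table: dict mapping a descriptor name to its list of values, one per compound.
    """
    return {name: values for name, values in table.items() if len(set(values)) > 1}


# Hypothetical values for three compounds.
descriptors = {
    "T_N_O_4": [1, 2, 0],
    "T_O_Cl_5": [0, 0, 0],  # constant, so it carries no information
    "SaaCHcount": [5, 7, 6],
}
print(sorted(drop_constant_descriptors(descriptors)))
```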

2.3 Generation of 3D-QSAR models

3D QSAR models for the above data set are developed using the k-Nearest Neighbour Molecular Field Analysis (kNN-MFA) principle. The values of 3D descriptors such as electrostatic and steric parameters are calculated by setting the dielectric constant to 1.0 and the charge type to Gasteiger-Marsili, and by using an sp3 carbon probe atom with charge 1.0. Cut-off energies of 10.0 kcal/mol and 30 kcal/mol are set as the defaults for electrostatic and steric energies, respectively. A total of 2080 field descriptors (1040 each for electrostatic and steric fields) are calculated using MDS software. 3D QSAR models are developed by setting the cross-correlation limit to 0.5, the number of variables in the equation to 4, the term selection criterion to q2, and the F-test in and out values to 4 and 3.99, respectively. Three models are developed, using the Step Wise variable selection method (SW-kNN MFA), the Simulated Annealing variable selection method (SA-kNN MFA) and the Genetic Algorithm variable selection method (GA-kNN MFA).

2.3.1 Model validation

To validate the developed QSAR models, the data set is divided into training and test sets. This division is based on the substitution groups and the inhibitory activity of the compounds. The training set is used to build the QSAR model, and the test set is used to validate the quality of the developed models. The statistical parameters of the developed models, together with internal and external validation, are used to test the fitness, stability and predictive ability of the QSAR models. Both the 2D and 3D QSAR models are validated using statistical parameters such as the number of compounds in the regression (n), the number of variables (k), the degrees of freedom, the squared correlation coefficient (r2), the cross-validated correlation coefficient (q2), Fisher’s value (F test) and the r2 for the external test set (pred_r2). The internal predictive ability of the model is assessed by the Leave One Out (LOO) method and expressed as the value of q2 (cross-validated explained variance) [39].

External validation of the developed QSAR models is performed by measuring the predictive power of the current models on the external test set by calculating the pred_r2 value as given in Eq. 2, which gives the statistical correlation between predicted and actual activities of the test set compounds.

$${\text{pred}}\_{\text{r}}^{2} = 1 - \frac{\sum {\left( {\text{y}}_{\text{i}} - \widehat{y_{i}} \right)}^{2}}{\sum {\left( {\text{y}}_{\text{i}} - y_{mean} \right)}^{2} }$$
(2)

where \(y_{i}\), \(\widehat{y_{i}}\) and \(y_{mean}\) are the actual activity and the predicted activity of the ith molecule in the test set and the average activity of all the molecules in the test set, respectively.
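
Equation 2 translates directly into code. The sketch below follows the definition given above, with the mean taken over the test set as stated; the function name and the toy activity values are illustrative only.

```python
def pred_r2(actual, predicted):
    """Eq. 2: 1 - sum((y_i - yhat_i)^2) / sum((y_i - y_mean)^2) on the test set."""
    y_mean = sum(actual) / len(actual)
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted))
    ss_tot = sum((y - y_mean) ** 2 for y in actual)
    return 1 - ss_res / ss_tot


# Hypothetical test-set activities and model predictions.
print(pred_r2([1.0, 2.0, 3.0], [1.5, 2.0, 2.5]))
```

A value of 1 corresponds to perfect prediction, and a value of 0 means the model does no better than predicting the mean activity for every test compound.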

Internal validation of the developed QSAR models is performed by calculating the q2 value as given in Eq. 3, which gives the statistical correlation between predicted and actual activities of the training set compounds.

$$q^{2} = 1 - \frac{{\sum {\left( {{\text{y}}_{{\text{i}}} - \widehat{{y_{i} }}} \right)^{2} } }}{{\sum {\left( {{\text{y}}_{{\text{i}}} - y_{mean} } \right)}^{2} }}$$
(3)

where \(y_{i}\), \(\widehat{y_{i}}\) and \(y_{mean}\) are the actual activity and the predicted activity of the ith molecule in the training set and the average activity of all the molecules in the training set, respectively.
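
The Leave One Out procedure behind Eq. 3 can be sketched as follows. To keep the example self-contained, a single-descriptor least-squares line stands in for the MR, PCR and PLS models of the study; that substitution, the function names and the toy data are assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def loo_q2(xs, ys):
    """Eq. 3 with each prediction made by a model refitted without that molecule."""
    predictions = []
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        predictions.append(slope * xs[i] + intercept)
    y_mean = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - y_mean) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical descriptor values and activities for four training compounds.
print(loo_q2([0.0, 1.0, 2.0, 3.0], [1.0, 3.5, 4.5, 7.0]))
```

Because every prediction comes from a model that never saw the molecule being predicted, q2 is normally lower than the fitted r2 of the same model.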

2.4 Pharmacophore generation

The development of a pharmacophore model is an important task in drug design and bioactivity prediction. A pharmacophore model describes the set of three-dimensional features that are necessary for bioactive ligands [40, 41]. It captures the nature of functional groups such as hydrogen bond donors and acceptors, hydrophobic areas, charge interactions and non-covalent bonding, and the distances between them, all of which affect ligand-target interactions. The MolSign module in VLifeMDS provides tools for aligning small organic molecules based on their three-dimensional pharmacophore features. Pharmacophore modelling is performed by taking TCA-1 as the reference compound and all 50 compounds for alignment. The minimum number of pharmacophore features generated for an alignment is set to 4. The tolerance, which defines the flexibility allowed when comparing two feature combinations across two molecules, is set to 10 Å. The maximum distance allowed between two features is set to 10.

2.5 Molecular docking

Molecular docking is a computational approach for finding a ligand that fits both geometrically and energetically into the binding site of a target. Docking helps to predict ligand-receptor interactions by identifying suitable active sites in the protein, obtaining the best geometry of the ligand-receptor complex and calculating the interaction energy for different ligands, so that more effective ligands can be designed [42, 43]. In the present work, docking is performed for all 50 compounds with the DprE1 enzyme. The whole study is carried out with the Biopredicta tool of V-Life MDS software version 4.6. The X-ray crystal structure of M. tuberculosis DprE1 in complex with the inhibitor TCA-1 (PDB ID: 4KW5, resolution 2.612 Å), obtained from the RCSB Protein Data Bank, is used for the docking study. The co-crystallized ligand (TCA-1, dock score −3.453) is removed, and the missing loops are added with the help of the homology modelling modules of the software. During preparation, bond orders of the ligands are assigned, hydrogen atoms are added and water molecules that are not involved in the interaction are deleted. The TCA-1 bound cavity is used for docking the selected 50 compounds. Finally, the best-docked structures are selected using their dock score. The interacting amino acids in the binding site of the target enzyme are identified as Val-120, Thr-117, Arg-57, Pre-117, Gly-116, His-131, Ser-122, Tyr-284 and Lys-330.

2.6 Drug likeliness and in silico ADME prediction

Early prediction of the ADMET properties of drug molecules is of great help in drug discovery. This information helps to assess the pharmacokinetic (PK) profile of molecules. The PK properties of the molecules depend on their chemical descriptors, which determine their ADMET. The PK parameters are calculated with ADMETlab, a user-friendly, freely available web interface [44,45,46]. Several mathematical predictive models are available for different PK parameters, such as Aqueous Solubility, Apparent Caco-2 permeability, log Kp for skin permeability, Blood–brain barrier (BBB) penetration, Volume of Distribution (Vd), Plasma Protein Binding, Metabolism, Elimination {Half-life (T1/2), Clearance rate (CL)} and Toxicity, and these are used to predict the ADMET properties of the drug molecules. The drug-likeness (DL) analysis module includes five commonly used drug-likeness rules (Lipinski, Ghose, Oprea, Veber and Varma) with parameters such as molecular weight (MW) ≤ 500 amu, logP ≤ 5, hydrogen bond donors ≤ 5, hydrogen bond acceptor sites (N and O atoms) ≤ 10, number of rotatable bonds ≤ 10 and topological polar surface area (TPSA) ≤ 140 Å². The significant predicted pharmacokinetic and physicochemical descriptors account for the druggability of a molecule [47, 48].
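
The drug-likeness thresholds listed above can be applied as a simple filter. The sketch below only encodes the six limits quoted in the text; it is not the ADMETlab implementation, and the dictionary keys, function name and property values are hypothetical.

```python
# Upper limits quoted in the text (MW in amu, TPSA in square angstroms).
LIMITS = {
    "mw": 500,
    "logp": 5,
    "hbd": 5,
    "hba": 10,
    "rotatable_bonds": 10,
    "tpsa": 140,
}


def violations(properties):
    """Return the names of the properties that exceed their limit."""
    return [name for name, limit in LIMITS.items() if properties[name] > limit]


# Hypothetical property values for one compound.
compound = {"mw": 350.4, "logp": 3.2, "hbd": 2, "hba": 6,
            "rotatable_bonds": 5, "tpsa": 90.0}
print(violations(compound))
```

An empty list means the compound satisfies all six criteria; individual rule sets such as Lipinski or Veber use different subsets of these limits.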

3 Results and discussion

3.1 Development and validation of 2D-QSAR models

2D QSAR models were developed by considering all the calculated two-dimensional descriptors as independent variables and biological activity as the dependent variable. For internal and external validation of the developed models, the data set was divided into test and training sets of 14 and 36 compounds, respectively. The correlation between actual and predicted activity for both training and test set compounds is shown in Table 2. Unicolumn statistics were computed for both the training and test sets to check the spread of the data, and the results are presented in Table 3. The results show that the test set is interpolative, i.e. the activities of the test set lie within the activity range of the training set. The mean and standard deviation of the training and test sets provide insight into the relative difference of the means and the point density distribution of the two sets. The average of the test set is higher than that of the training set, which indicates the presence of relatively more active molecules than inactive ones in the test set.

Table 2 Observed and predicted activities (pIC50) for the Training and Test set compounds
Table 3 Unicolumn statistics of activity (pIC50) for Training and Test set compounds for 2D-QSAR

2D QSAR models are developed using four methods: multiple regression (MR), principal component regression (PCR), partial least square regression (PLSR) and partial least square associated with sphere exclusion (PLS-SE). The correlation equations between activity (pIC50) and the selected parameters are given as Eqs. 4, 5, 6 and 7, respectively. The developed QSAR models are then validated to check both internal and external predictive power, which gives a quantitative assessment of model robustness. The four models are assessed on the values of the statistical parameters studied; the results are given in Table 4.

$${\text{pIC}}_{50} = 0.5850\left( \pm 0.0742 \right)\,{\text{T}}\_{\text{O}}\_{\text{O}}\_5 - 0.0524\left( \pm 0.0143 \right)\,{\text{T}}\_2\_2\_4 + 0.9615$$
(4)
$${\text{pIC}}_{50} = 0.0000\left( \pm 0.0000 \right)\,{\text{Ipc}} + 0.4412$$
(5)
$${\text{pIC}}_{50} = 1.1867\,{\text{T}}\_{\text{N}}\_{\text{O}}\_4 - 0.0174\,{\text{SdOE-index}} - 1.9355$$
(6)
$${\text{pIC}}_{50} = - 0.2473\,{\text{SssssCE-index}} + 1.0914\,{\text{T}}\_{\text{N}}\_{\text{O}}\_4 + 11.5647\,{\text{Most-vePotential}} - 0.6078\,{\text{SsssNE-index}} - 0.0162\,{\text{SAHydrophilicArea}} - 0.0588\,{\text{T}}\_{\text{T}}\_{\text{Cl}}\_4 - 0.1418\,{\text{SaaCHcount}} + 2.094$$
(7)
Table 4 Statistical validation results of the developed 2D-QSAR models

From the data given in Table 4, it is clear that the QSAR model developed by the Partial Least Square method associated with Sphere Exclusion (PLS-SE) is statistically more significant than the others: the r2 and r2_se values for the training set and the corresponding coefficient for the external test set (pred_r2) are 0.8917, 0.2407 and 0.5935, respectively. The low standard error of estimation and the F-test value of 85.0374 indicate an overall internal statistical significance level better than 99.9%. This model accounts for 89% of the variance in the inhibitory activity. The cross-validated squared correlation coefficient (q2) of 0.7499 suggests good predictive ability of the model. The model relates the activity to the parameters SssssCE-index, T_N_O_4, Most-vePotential, SsssNE-index, SAHydrophilicArea, T_T_Cl_4 and SaaCHcount; the contribution plot of these parameters towards activity is presented in Fig. 3. The positive coefficients of T_N_O_4 and Most-vePotential show that antitubercular activity increases with the number of nitrogen atoms separated from an oxygen atom by 4 bonds and with an increase in the negative potential on the van der Waals surface of the molecule. The negative coefficients of SssssCE-index, SsssNE-index, SAHydrophilicArea, T_T_Cl_4 and SaaCHcount show that activity increases with a decrease in the electrotopological state indices for carbon atoms and –NH groups connected with 4 and 3 single bonds, respectively, in the vdW hydrophilic surface area, in the number of chlorine atoms separated by 4 bonds, and in the total number of carbon atoms connected with a hydrogen along with 2 aromatic bonds.
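
To make the sign discussion concrete, Eq. 7 can be evaluated for a given set of descriptor values. The coefficients below are copied from Eq. 7; the function name and the descriptor values passed to it are hypothetical and do not correspond to any compound in the data set.

```python
# Coefficients and intercept of the PLS-SE model (Eq. 7).
PLS_SE_COEFFICIENTS = {
    "SssssCE-index": -0.2473,
    "T_N_O_4": 1.0914,
    "Most-vePotential": 11.5647,
    "SsssNE-index": -0.6078,
    "SAHydrophilicArea": -0.0162,
    "T_T_Cl_4": -0.0588,
    "SaaCHcount": -0.1418,
}
INTERCEPT = 2.094


def predict_pic50(descriptors):
    """Evaluate Eq. 7; descriptors missing from the input are treated as zero."""
    return INTERCEPT + sum(
        coefficient * descriptors.get(name, 0.0)
        for name, coefficient in PLS_SE_COEFFICIENTS.items()
    )


# Raising T_N_O_4 increases the predicted activity; raising SaaCHcount lowers it.
print(predict_pic50({"T_N_O_4": 1}), predict_pic50({"T_N_O_4": 2}))
print(predict_pic50({"SaaCHcount": 5}), predict_pic50({"SaaCHcount": 6}))
```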

Fig. 1 NAD dependent biochemical reaction catalysed by DprE1 and DprE2 enzymes in Mycobacterium

The fitness plot between actual and predicted activity for the training and test set compounds, given in Fig. 4, indicates how well the model is trained and how well it predicts the activity of the external test set. Further, the distribution curves of actual and predicted activity for the training and test set compounds of this model are shown in Fig. 5a, b, depicting the closeness between the actual and predicted activity of the compounds in both sets.

Fig. 2 Distribution curve of Test (Green Dot) and Training set (Red Dot) compounds
Fig. 3 Contribution plot of parameters towards activity

3.2 Development and validation of 3D-QSAR models

3D-QSAR models for the above data set are developed using the k-Nearest Neighbour Molecular Field Analysis (kNN-MFA) principle. Three models are developed, using the Step Wise (SW-kNN MFA), Simulated Annealing (SA-kNN MFA) and Genetic Algorithm (GA-kNN MFA) variable selection methods, by considering 3D descriptors such as electrostatic and steric parameters. The QSAR models for the three methods are given in Eqs. 8, 9 and 10, respectively. To check the predictivity of the developed models, the data set is divided into training and test sets of 34 and 16 compounds, respectively. The correlation between actual and predicted activity for both training and test set compounds is shown in Table 2.

$${\text{pIC}}_{50} = {\text{E}}\_698\left( -6.1424,\ -5.7807 \right) + {\text{E}}\_225\left( 6.7907,\ 7.1363 \right) + {\text{S}}\_532\left( -0.4929,\ -0.4784 \right)$$
(8)
$${\text{pIC}}_{50} = {\text{S}}\_412\left( -0.6886,\ -0.3326 \right) + {\text{E}}\_276\left( -2.2760,\ -0.7668 \right) + {\text{E}}\_165\left( 1.8097,\ 2.9967 \right) + {\text{E}}\_585\left( -8.0366,\ -0.8839 \right)$$
(9)
$${\text{pIC}}_{50} = {\text{E}}\_207\left( -9.0489,\ -6.7334 \right) + {\text{S}}\_448\left( 30.0000 \right) + {\text{E}}\_682\left( -1.0329,\ -0.8132 \right)$$
(10)

A unicolumn statistics study is performed on the training and test sets; the result, given in Table 5, indicates that the test set contains more active molecules and is uniformly distributed within the min–max range of the training set.

Table 5 Unicolumn statistics of activity (pIC50) for Training and Test set compounds for 3D QSAR

Validation of the three developed models is performed to determine the best model that correlates the activity with the descriptors. The result of the validation study is given in Table 6.

Table 6 Statistical Validation results of the developed 3D-QSAR models

The validation results for the developed 3D-QSAR models suggest that the model developed by the SW-kNN MFA method (Eq. 8) is statistically more significant than the other two in terms of internal (q2 = 0.8198) and external (pred_r2 = 0.6109) predictivity, corresponding to a predictive ability of ~ 82% and ~ 61% for the training and test sets, respectively. The contributing descriptors of this model are E_698, E_225 and S_532, spread as field points; the correlation plot is shown in Fig. 6. The electrostatic fields at E_698 (−6.1424, −5.7807) and E_225 (6.7907, 7.1363) lie in the negative and positive ranges, respectively, near the ring, indicating that substitution of electronegative and electropositive groups at these sites enhances the activity. Further, the negative range of the steric descriptor S_532 (−0.4929, −0.4784) indicates that substitution with a less bulky group in this region is preferable for increasing activity. The fitness plot between the actual and predicted activity of the developed model for the training and test set compounds is shown in Fig. 7 and illustrates its good predictivity. The distribution curves of actual and predicted activity for the training and test set compounds are given in Fig. 8a and b.
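
The k-nearest-neighbour principle underlying kNN-MFA can be sketched as follows: the activity of a query compound is estimated from the activities of the training compounds closest to it in the space of the selected field descriptors. The unweighted mean, the Euclidean distance, the function name and the toy field values are assumptions; the weighting scheme used by the MDS software is not described in the text.

```python
import math


def knn_predict(train_fields, train_activity, query_fields, k=2):
    """Predict activity as the mean activity of the k nearest training compounds."""
    order = sorted(
        range(len(train_fields)),
        key=lambda i: math.dist(train_fields[i], query_fields),
    )
    nearest = order[:k]
    return sum(train_activity[i] for i in nearest) / k


# Hypothetical values of two selected field descriptors for three training compounds.
fields = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
activity = [1.0, 3.0, 10.0]
print(knn_predict(fields, activity, (0.2, 0.0), k=2))
```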

Fig. 4 Fitness plot for 2D QSAR model developed by PLS-SE method
Fig. 5 a and b Actual vs predicted activity of Training and Test set for 2D QSAR developed by PLS-SE method
Fig. 6 Field points exhibiting contributing descriptors for 3D-QSAR model by SW-kNN MFA method

3.3 Pharmacophore modelling

In the present work, pharmacophore modelling for all the compounds in the series is carried out by taking TCA-1 as the reference compound. Pharmacophore modelling provides useful information for designing and synthesising novel potent DprE1 inhibitors. The pharmacophore model is developed by taking four features necessary for ligand activity; the results are shown in Fig. 9a, b and Table 7. The obtained pharmacophore model contains two aromatic (Aro) centres (yellow spheres), one aliphatic (Ala) carbon centre (orange sphere) and one hydrogen bond donor (Hdr) centre (green sphere), indicating that these features are necessary for DprE1 inhibitory activity.

Fig. 7 Fitness curve for 3D-QSAR model by SW-kNN MFA method

Table 7 Result of pharmacophore identification study

3.4 Molecular docking studies

Molecular docking is carried out for all 50 compounds at the binding site of the target DprE1 enzyme. The grid docking scores of all compounds are given in Table 8. Based on the grid dock score, five compounds (8, 15, 16, 27 and 35) showing good binding efficiency with the target enzyme are selected for further study. The binding modes of these compounds are given in Fig. 10a–e, respectively. The docking study reveals that these molecules interact with amino acid residues such as Gly-116, His-131, Arg-118, Thr-117 and Gln-299 at the active site of the target enzyme by forming H-bonds with them. The two-dimensional ligand interaction plots in Fig. 11a–e show these interactions, which arise from hydrogen bonds (H-bonds) formed between the active-site amino acids and the O and N atoms in the chemical structures of these compounds; the interaction results are given in Table 9. The docking study suggests that substitution of electron donating groups at these particular sites increases the binding efficacy by forming H-bonds with the target site and potentiates the DprE1 inhibiting action. Hence, it helps towards the design and development of potent and selective lead molecules with DprE1 inhibiting antitubercular action.

Table 8 Docking score of compounds
Fig. 8 a and b Actual vs predicted activity of Training and Test set for 3D-QSAR developed by SW-kNN MFA method
Fig. 9 a Pharmacophore hypothesis. b Distance based pharmacophore identification
Fig. 10 a–e Binding model of compounds 8, 15, 16, 27 and 35 with DprE1 target cavity
Fig. 11 a–e Two-dimensional ligand interaction plot showing the interaction of ligands (8, 15, 16, 27 and 35) with different amino acid residues present at the active site of DprE1 enzyme

Table 9 Ligand- target interaction result

3.5 Drug likeliness, in silico ADME and toxicity study

Knowing the ADME features of a compound in advance is important for drug discovery, since poor pharmacokinetics (PK) is a major cause of the failure of drug candidates in clinical trials. Knowledge of ADME properties at an early stage therefore helps to generate promising candidates that can avoid late-stage elimination and pass through clinical trials more easily. With this aim, all 50 compounds in the series are used to predict pharmacokinetic (ADME) parameters, drug toxicity and drug likeliness features using the ADMETlab web interface. The predicted pharmacokinetic results are presented in Table 10, and the toxicity and drug likeliness results in Table 11.

Table 10 Pharmacokinetic features (Insilico ADME) prediction results
Table 11 Predicted toxicity risk parameters and Lipinski’s rule of five drug likeliness of compounds

The predicted results show that all the compounds satisfy Lipinski's rule of five for drug likeliness and oral bioavailability. The values of the distribution coefficient (LogD) and the partition coefficient (LogP) are within the optimal range for all the compounds, which suggests that they are suitable drug candidates. The solubility (LogS) values are in the optimum range, suggesting good dissolution and absorption. The optimum values of the other absorption-related descriptors suggest good intestinal absorption and skin permeability. The optimum values of topological polar surface area (< 140 Å²) and rotatable bonds (0–15) favour oral bioavailability of these compounds. The predicted results show good plasma protein binding and Blood-Brain Barrier (BBB) penetration ability, and a low half-life (T1/2) and rate of clearance (CL), for all compounds.

The toxicity risk calculator locates fragments within the structure of a molecule that indicate a potential toxicity risk. Toxicity risk parameters such as hERG K+-channel blocking, Human Hepatotoxicity (H-HT), Ames Mutagenicity (AMES), skin sensitization and Drug-Induced Liver Injury (DILI) are computed for all the compounds. Compounds 3, 4, 5, 13, 14, 15, 27, 28 and 29 show low hERG K+-channel blocking activity, and all compounds except 27 and 28 show mild hepatotoxicity at high dose. The Ames mutagenicity prediction indicates that all compounds except 23, 24, 27, 28, 29 and 32–50 show mutagenicity and induce revertant colony growth. The skin sensitization prediction shows that compounds other than 12, 13, 18, 19, 20, 25, 26 and 31 are skin non-sensitizers. Overall, the compounds are predicted to have mild toxicity risk levels. The predicted acute toxicity LD50 values for all compounds except 23, 41, 42, 43, 45, 46, 47, 48 and 50 are within the permissible limit (> 500 mg/kg), indicating lower toxicity, whereas these 9 compounds have LD50 values between 51 and 500 mg/kg and fall in the toxic range. The predicted drug likeliness and the optimum synthetic accessibility scores for all the compounds suggest good druggability and ease of synthesis.

4 Conclusion

A combined computational approach is applied to give insight into the structural basis and inhibition mechanism of a series of compounds acting as DprE1 inhibitor antitubercular agents. The statistically significant 2D and 3D QSAR models provide a structural framework for understanding the relationship between chemical structure and activity, and exhibit good correlation, predictive ability and satisfactory agreement between experimental and predicted activity for the training and test set molecules. The validated 2D-QSAR model identifies the estate contribution, hydrophobicity, electrostatic and alignment independent requirements around the moiety for increasing activity, whereas the 3D-QSAR model suggests that substitution of electronegative, electropositive and less bulky groups at particular sites is preferable for antitubercular activity. Two aromatic rings, one aliphatic group and one hydrogen bond donor group are the key pharmacophoric features for inhibition of the DprE1 enzyme. The molecular docking results show that five compounds (8, 15, 16, 27 and 35) interact significantly with amino acid residues such as Gly-116, His-131, Arg-118, Thr-117 and Gln-299 at the active site of the target enzyme through H-bonds, which suggests that H-bond forming atoms are required for interactions between the ligands and the peptide residues. The in silico predictions of drug likeliness and ADME-T risk profile are within acceptable limits, which supports good druggability of these compounds with mild toxicity risk at high dose. The present computational approach will help to design new DprE1 inhibitors based on the results of the QSAR studies. Thus, these compounds rationalize the possible structural requirements for better binding interactions with the target site and need further lead optimization for the design of more potent DprE1 inhibitors.