Structure based classification for bile salt export pump (BSEP) inhibitors using comparative structural modeling of human BSEP

Jain, Sankalp; Grandits, Melanie; Richter, Lars; Ecker, Gerhard F.

doi:10.1007/s10822-017-0021-x

Open access
Published: 19 May 2017

Volume 31, pages 507–521, (2017)
Journal of Computer-Aided Molecular Design

Abstract

The bile salt export pump (BSEP) actively transports conjugated monovalent bile acids from the hepatocytes into the bile. This facilitates the formation of micelles and promotes digestion and absorption of dietary fat. Inhibition of BSEP leads to decreased bile flow and accumulation of cytotoxic bile salts in the liver. A number of compounds have been identified to interact with BSEP, which results in drug-induced cholestasis or liver injury. Therefore, in silico approaches for flagging compounds as potential BSEP inhibitors would be of high value in the early stage of the drug discovery pipeline. Up to now, due to the lack of a high-resolution X-ray structure of BSEP, in silico based identification of BSEP inhibitors focused on ligand-based approaches. In this study, we provide a homology model for BSEP, developed using the corrected mouse P-glycoprotein structure (PDB ID: 4M1M). Subsequently, the model was used for docking-based classification of a set of 1212 compounds (405 BSEP inhibitors, 807 non-inhibitors). Using the scoring function ChemScore, a prediction accuracy of 81% on the training set and 73% on two external test sets could be obtained. In addition, the applicability domain of the models was assessed based on Euclidean distance. Further, analysis of the protein–ligand interaction fingerprints revealed certain functional group-amino acid residue interactions that could play a key role for ligand binding. Though ligand-based models, due to their high speed and accuracy, remain the method of choice for classification of BSEP inhibitors, structure-assisted docking models demonstrate reasonably good prediction accuracies while additionally providing information about putative protein–ligand interactions.

Introduction

Transmembrane transport proteins selectively aid in the translocation of molecules across biological membranes by binding the substrate molecules followed by a conformational change [1]. Members of the ATP-binding cassette (ABC) superfamily facilitate the transport of their solutes by using the energy from hydrolysis of ATP. While some ABC-transporters allow specific passage of inorganic ions, others facilitate ATP-dependent transport of organic compounds including xenotoxins, short peptides, lipids, bile acids, glutathione, and glucuronide conjugates. Therefore, ABC-transporters affect the absorption, distribution, metabolism, excretion and toxicity of numerous pharmacological agents. Genetic variations in the genes that encode these transporters lead to disorders such as cystic fibrosis, cholesterol and bile transport defects, as well as neurological diseases [2].

The bile salt export pump (BSEP, gene ABCB11) is a canalicular-specific exporter predominantly expressed in the cholesterol-rich apical membrane of hepatocytes [3]. BSEP facilitates secretion of bile salts from the liver into the bile canaliculi [4,5,6]. The main function of bile acids is to promote digestion and absorption of dietary fat via formation of micelles [7]. Apart from this, they are increasingly being shown to have hormonal actions throughout the body [8, 9]. Variations in the ABCB11 gene result in different forms of progressive familial intrahepatic cholestasis (PFIC) [10, 11]. PFIC is characterized by an early onset of cholestasis and eventually leads to liver cirrhosis and failure [12,13,14].

Inhibition of BSEP can result in accumulation of bile salts in the liver, which is considered to be a primary mechanism leading to drug-induced cholestasis—one of the reasons for drug-induced liver injury (DILI) [15,16,17]. By inhibiting BSEP, drugs such as bosentan, rifampicin and troglitazone cause intracellular accumulation of bile salts and decreased bile flow [18]. Dysfunction due to suppression of gene expression, disturbed signaling or steric inhibition are other important factors leading to DILI [19]. In its Guideline on the Investigation of Drug Interactions (effective: January 2013), the European Medicines Agency (EMA) indicated that BSEP inhibition assessment should be “preferably investigated”. Additionally, EMA states: “If in vitro studies indicate BSEP inhibition, adequate biochemical monitoring including serum bile salts is recommended during drug development” [20]. Furthermore, studies indicate that a majority of drugs that showed in vitro inhibition of BSEP have led to DILI, suggesting that decreased BSEP inhibition is likely to be associated with reduced risk for DILI [17, 21, 22].

With growing recognition of the importance of ABC-transporters for ADMET, in silico models for predicting ligand-transporter interactions have also become available [23]. With respect to BSEP, Warner et al. applied QSAR modeling, in which a support vector machine (SVM) model provided the highest accuracy of 87% in the classification of BSEP inhibitors and non-inhibitors on a dataset of 624 compounds [24]. Our group recently published a classification model based on a set of 670 compounds, which allowed the identification of bromocriptine as a BSEP inhibitor [25]. With the first X-ray structures of ABC-transporters being published, structure-based models also became available. Bikadi et al. used SVM to predict P-gp substrate binding modes [26, 27]. Dolghih et al. separated P-gp binders from non-binders by applying induced fit docking into the crystal structure of mouse P-gp, using the docking score for classification [28]. High area under the curve (AUC) values of 0.93 and 0.90 were observed for two independent datasets (126 and 64 compounds, respectively). Chan et al. [29] also evaluated the predictive capability of docking using 245 P-gp substrates and non-substrates, but the classes were not clearly separated based on the Glide docking scores.

Klepsch et al. [30] showed that docking of a set of propafenones into a homology model of human P-gp reveals poses consistent with QSAR data, and that this can be exploited for the identification of new P-gp inhibitors [31]. Recently, this was enhanced towards a structure-based classification of almost 2000 compounds [32]. Although the docking-based classification showed significantly lower performance than ligand-based models derived from machine learning, it offers information on the molecular basis of protein ligand interaction.

Up to now, due to the lack of a high-resolution X-ray structure of BSEP, no structure-based studies have been performed for this protein. In the present study, we use comparative modeling [33] to create a protein homology model for BSEP by using the corrected mouse P-glycoprotein structure (PDB ID: 4M1M) as template. Subsequently, we developed structure-based classification models using a dataset comprising 408 compounds (113 inhibitors and 295 non-inhibitors) as training set and two external test sets containing 166 compounds (44 inhibitors and 122 non-inhibitors) and 638 compounds (248 inhibitors and 390 non-inhibitors), respectively.

Materials and methods

Dataset

A set of 408 compounds (113 inhibitors and 295 non-inhibitors) from the work of Warner et al. [24] was used as the training set and another set containing 166 compounds (44 inhibitors and 122 non-inhibitors) from Pedersen et al. [34] was used as external test set. Both studies provide in vitro inhibition data on human BSEP. While Warner et al. classified compounds with a mean IC₅₀ ≤ 300 μM as BSEP inhibitors, in our study we decided to use a much lower threshold (mean IC₅₀ ≤ 10 μM) in order to retain only strong inhibitors. Compounds with mean IC₅₀ > 300 μM were considered non-inhibitors, and the remaining compounds were excluded from the dataset. This resulted in a total of 113 strong inhibitors and 295 non-inhibitors. The Pedersen et al. data set is based on inhibition of bile salt export pump (BSEP)-mediated taurocholate (TA) transport in inverted membrane vesicles. After removal of compounds that overlapped with those in our training set, we had a total of 166 compounds (44 strong inhibitors and 122 non-inhibitors) to be used as external test set. In addition, a dataset provided by AstraZeneca within the framework of the IMI project eTOX (http://www.etoxproject.eu) was used as a second external test set to further evaluate our models. The data were measured in a [3H]-taurocholate transport assay performed in Sf21 membrane vesicles using the protocol described by Dawson et al. [17] and contain the BSEP inhibitory potencies of 1092 compounds as IC₅₀ values. Removing the compounds that overlapped with the first two datasets resulted in 638 compounds (248 inhibitors and 390 non-inhibitors). All datasets were standardized using the protocol previously described in Montanari et al. [25] and Pinto et al. [35].
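The labelling rule described for the training set can be sketched as follows. This is a minimal illustration, assuming each record is a hypothetical `(compound_id, mean_ic50_um)` pair; the function names and the `exclude_ids` argument (used here to mimic removal of overlapping compounds) are not from the original study.

```python
from typing import Optional

# Thresholds as stated in the text for the training set (mean IC50 in uM).
INHIBITOR_MAX_UM = 10.0       # mean IC50 <= 10 uM  -> strong inhibitor
NON_INHIBITOR_MIN_UM = 300.0  # mean IC50 > 300 uM  -> non-inhibitor


def label_compound(mean_ic50_um: float) -> Optional[str]:
    """Return the class label, or None for compounds in the excluded range."""
    if mean_ic50_um <= INHIBITOR_MAX_UM:
        return "inhibitor"
    if mean_ic50_um > NON_INHIBITOR_MIN_UM:
        return "non-inhibitor"
    return None  # 10 < IC50 <= 300 uM: dropped from the dataset


def build_dataset(records, exclude_ids=frozenset()):
    """Label records and skip compounds already present in another set.

    `records` is an iterable of (compound_id, mean_ic50_um) pairs; both the
    record layout and `exclude_ids` are illustrative assumptions.
    """
    labelled = {}
    for compound_id, ic50 in records:
        if compound_id in exclude_ids:
            continue
        label = label_compound(ic50)
        if label is not None:
            labelled[compound_id] = label
    return labelled
```

Compounds with intermediate potency return `None` and are left out, which mirrors the exclusion of the 10 to 300 μM range described above.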

Homology modeling

For human BSEP (UNIPROT ID: O95342), based on sequence identity and atomic resolution, the corrected mouse P-glycoprotein structure (PDB ID: 4M1M) was selected as the most structurally related template protein. Multiple homology models were constructed using MODELLER 9.13 [36] and the Prime module in Maestro [37, 38]. Energy-minimized models were then evaluated using the DOPE score [39] and the GA341 score [40, 41]. The quality of the stereochemical parameters and the normality of the structures were checked using the PROCHECK program included in the PDBsum analysis [42]. Finally, the Ramachandran plot [43], G-factor [44] and QMEAN score [45, 46] were evaluated to identify the top-ranked homology model.

Molecular dynamics simulation

Molecular dynamics (MD) simulation was carried out in Gromacs 5.0.4 [47,48,49,50] using the GROMOS 54a7 force field [51]. The protein was placed inside a rectangular box of size 16 × 16 × 16 nm³ including approximately 34,000 simple point charge (SPC) water molecules [52]. Sodium and chloride ions were added to obtain a neutral system. Energy minimization was carried out with a maximum force of 1000 kJ/mol/nm using the steepest descent algorithm. After the minimization, an NVT equilibration was performed at a constant temperature of 300 K for 100 ps, followed by an NPT equilibration step for 1 ns at a constant pressure of 1 atm and a constant temperature of 300 K. The production simulation was performed at 300 K for 20 ns. The LINCS algorithm [53] was used to constrain the covalent bonds and PME [54] was used to calculate the electrostatic interactions during the simulation. The stability of the protein structure was evaluated by calculating the secondary structure over the simulation time according to the Kabsch and Sander rules [55] and the root-mean-square fluctuation (RMSF) of active site residues (Fig. S1 in the supplementary material). All graphs were created using the XMGrace tool [56].

Molecular docking and scoring

In order to avoid any bias in the docking studies, the binding site was defined as the complete TM region, taking 20 Å around the coordinate of the center point to allow subsequent flexible docking studies of a series of BSEP inhibitors. The protein was prepared using the Protein Preparation Wizard of the Schrödinger Suite (2015) [57, 58]. During this process, hydrogen atoms were added, and optimal protonation states and ASN/GLN/HIS flips were determined. To assess their correct protonation states, ligands were prepared using the LigPrep module of the Schrödinger Suite [58, 59], which produces low-energy 3D structures that can be further used for docking studies. The OPLS_2005 force field was used for the minimization of the structures. Different ionization states were generated by adding or removing protons from the ligand at a target pH of 7.0 ± 2.0 using Epik version 3.1 [60, 61]. Tautomers were generated for each ligand. For stereoisomers, the chirality information from the input file of each ligand was retained as is for the entire calculation. This gave a dataset of 1865 structures (318 inhibitors and 1547 non-inhibitors) for the training set, 2009 structures (858 inhibitors and 1151 non-inhibitors) for the external test set from Pedersen et al. and 1560 structures (668 inhibitors and 892 non-inhibitors) for the external test set from AstraZeneca, which were used for docking with the genetic algorithm-based GOLD suite (version 5.2.0) [62, 63].

All the docking runs were performed in high-throughput mode with GOLD. The fitness functions GoldScore (GS) and ChemScore (CS) were used. GlideXP [64, 65] docking from Maestro was also used in order to compare different scoring functions. Finally, all the poses were rescored using an external scoring function, XScore [66]. To gain deeper insights on the binding modes of BSEP inhibitors and non-inhibitors, the protein–ligand interaction fingerprints (PLIF) of the resultant complexes were retrospectively analyzed.

Machine learning-based model building

The open source software WEKA (version 3.7.10) [67] was used for building binary classification models. The machine learning classifiers: J48, Random Forest, REPTree, LibSVM and Naive Bayes were used with the default parameters along with tenfold internal cross-validation.

Network-based representation of the dataset

Tanimoto (Tc) similarities between the inhibitors and non-inhibitors of the training set were calculated using MACCS fingerprints [68]. A chemical space network (CSN) [69, 70] was constructed and analyzed in order to assess the structural similarity shared by the compounds of both groups. To show connections between the compounds, a threshold value of 0.7 was set based on the average of Tanimoto max-similarity in the dataset.
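The edge definition of the chemical space network can be illustrated with a short sketch. It assumes each fingerprint is represented as a set of on-bit indices (for example from MACCS keys computed elsewhere) and that an edge is drawn when Tc is at least 0.7; whether the original workflow used "at least" or "greater than" is not stated, and the function names are hypothetical.

```python
from itertools import combinations


def tanimoto(fp_a: frozenset, fp_b: frozenset) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bits."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0


def csn_edges(fingerprints: dict, threshold: float = 0.7):
    """Return (id_a, id_b, similarity) for every pair at or above threshold.

    `fingerprints` maps a compound identifier to its set of on-bits.
    Compounds that appear in no edge are the unconnected nodes of the CSN.
    """
    edges = []
    for (id_a, fp_a), (id_b, fp_b) in combinations(sorted(fingerprints.items()), 2):
        similarity = tanimoto(fp_a, fp_b)
        if similarity >= threshold:
            edges.append((id_a, id_b, similarity))
    return edges


# Toy fingerprints (invented bit indices, not real MACCS keys).
toy = {
    "A": frozenset({1, 2, 3, 4}),
    "B": frozenset({1, 2, 3, 4, 5}),
    "C": frozenset({7, 8}),
}
edges = csn_edges(toy)
```

In this toy example only A and B are linked, while C remains an isolated node.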

Functional group analysis

Functional group analysis was performed in two stages. First, the substructure patterns of 100 functional groups in SMARTS notation were extracted from the Daylight website (http://www.daylight.com/dayhtml_tutorials/languages/smarts/smarts_examples.html#GROUP). Next, the pattern matching was performed using the SMARTSQueryTool implemented in the Chemistry Development Kit (CDK) [71]. For each functional group, the occurrences of the fragments in a given set of molecules were calculated.

Protein ligand interaction fingerprints (PLIF)

A PLIF summarizes the interactions between a ligand and a protein using a fingerprint scheme. Here we generated three types of PLIFs that differ in the information encoded. In the first approach, each bit of the PLIF encodes a residue involved in an interaction with the ligand. The second one encodes not only the residue but also the nature of the interaction (e.g. hydrogen bond donor) with the ligand. The third category encodes the functional group of the ligand that interacts with the residue. All the PLIF bits were calculated with the MOE [72] built-in function CalculateRawInteractions using a 1% threshold for molecular interactions and a 20% threshold for surface contacts. The function was embedded in an in-house SVL script, and its output was post-processed to enable the calculation of functional group PLIFs.
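As a rough illustration of the first (residue-level) PLIF type, the sketch below turns a set of contacted residues into a bit vector and summarizes how often each residue is contacted across ligands. It does not reproduce the MOE/SVL workflow; the function names and the toy contacts are assumptions, and only the residue labels are taken from the article.

```python
from collections import Counter


def residue_plif(contacts, residues):
    """Bit vector with one bit per residue: 1 if the ligand contacts it."""
    return [1 if residue in contacts else 0 for residue in residues]


def contact_frequency(ligand_contacts):
    """Fraction of ligands that contact each residue.

    `ligand_contacts` is a list with one collection of contacted residues per
    ligand. Using (residue, interaction_type) or (residue, functional_group)
    tuples as keys instead of plain residue names would give the second and
    third fingerprint types described in the text.
    """
    counts = Counter()
    for contacts in ligand_contacts:
        counts.update(set(contacts))
    n_ligands = len(ligand_contacts)
    if n_ligands == 0:
        return {}
    return {key: count / n_ligands for key, count in counts.items()}


# Invented contacts for two hypothetical ligands.
residues = ["Phe334", "Leu364", "Tyr772"]
poses = [{"Phe334"}, {"Phe334", "Tyr772"}]
bits = [residue_plif(pose, residues) for pose in poses]
freq = contact_frequency(poses)
```

Comparing such frequencies between inhibitors and non-inhibitors is one simple way to look for residues that are contacted more often by one class.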

Applicability domain assessment

An applicability domain (AD) analysis was performed to evaluate if the chemical space covered by the training set used for developing the model is applicable to predict the outcomes of the test sets used to evaluate the model performance. Therefore, AD could provide a first hint if a new chemical structure is covered within the chemical structures or descriptor space of the training set. Many approaches were proposed to estimate AD, for instance based on descriptor ranges, Euclidean distance or probability density, each having their pros and cons. In this study, we implemented the Euclidean distance approach using the KNIME [73] node APD [74, 75] to evaluate if the test sets are within the AD of the training set.
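A Euclidean-distance applicability domain check can be sketched as below. The threshold used here, the mean plus Z times the standard deviation of the training-set pairwise distances that do not exceed the overall mean (with Z = 0.5), is one commonly used formulation and is an assumption; the text does not state the exact settings of the KNIME node, and all function names are hypothetical.

```python
import math
from itertools import combinations
from statistics import mean, pstdev


def euclidean(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def apd_threshold(train, z=0.5):
    """Distance threshold derived from the training set (assumed formula)."""
    distances = [euclidean(a, b) for a, b in combinations(train, 2)]
    overall_mean = mean(distances)
    # Keep only the pairwise distances at or below the overall mean.
    close = [d for d in distances if d <= overall_mean]
    return mean(close) + z * pstdev(close)


def in_domain(query, train, threshold):
    """A query is inside the AD if its nearest training compound is close enough."""
    nearest = min(euclidean(query, t) for t in train)
    return nearest <= threshold


# Toy two-descriptor training set (invented values).
train = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
threshold = apd_threshold(train)
```

In practice the descriptors would be scaled before distances are computed, since unscaled descriptors with large ranges would otherwise dominate the distance.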

Performance evaluation

In order to evaluate the quality of our classification models based on the docking studies, we used standard parameters such as the counts of true positives (TP), false positives (FP), true negatives (TN) and false negatives (FN). Sensitivity (Eq. 1), specificity (Eq. 2) and accuracy (Eq. 3) values were calculated for each model based on the aforementioned parameters to estimate its performance in classifying inhibitors and non-inhibitors. To measure the overall quality of the model, the G-mean (Eq. 4), which takes into account both sensitivity and specificity, and the Matthews correlation coefficient (MCC, Eq. 5) were also calculated.

$$\mathrm{Sensitivity} = \frac{TP}{TP + FN} \tag{1}$$

$$\mathrm{Specificity} = \frac{TN}{TN + FP} \tag{2}$$

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN} \tag{3}$$

$$G\text{-}\mathrm{mean} = \sqrt{\mathrm{Sensitivity} \times \mathrm{Specificity}} \tag{4}$$

$$\mathrm{MCC} = \frac{(TP \times TN) - (FP \times FN)}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \tag{5}$$
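Equations 1 to 5 translate directly into code. The sketch below is a plain transcription with guards against empty denominators; the function name and the example counts are invented for illustration and are not results from the study.

```python
import math


def _ratio(numerator, denominator):
    """Safe division that returns 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp, fp, tn, fn):
    """Compute sensitivity, specificity, accuracy, G-mean and MCC (Eqs. 1-5)."""
    sensitivity = _ratio(tp, tp + fn)
    specificity = _ratio(tn, tn + fp)
    accuracy = _ratio(tp + tn, tp + fp + tn + fn)
    g_mean = math.sqrt(sensitivity * specificity)
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = _ratio(tp * tn - fp * fn, mcc_denominator)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "g_mean": g_mean,
        "mcc": mcc,
    }


# Invented confusion-matrix counts, for illustration only.
metrics = classification_metrics(tp=8, fp=2, tn=18, fn=2)
```

Because the datasets are imbalanced (more non-inhibitors than inhibitors), G-mean and MCC are more informative than accuracy alone.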

Calculating the probability of prediction

We examined the distribution of docking scores [ChemScore, GoldScore, GlideXP, XScore (ChemScore) and XScore (GoldScore)] for the training set molecules. Based on the minimum and maximum score values, the scores were binned into intervals. Each bin is characterized by the corresponding number of inhibitors and non-inhibitors. Based on these counts, we calculated the probability for a molecule in a given bin to be an inhibitor or a non-inhibitor. A p value (chi-square test) was calculated for each bin to identify the scoring range that best separates inhibitors from non-inhibitors.
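The binning and per-bin significance test might look like the sketch below. It assumes half-open bins and a 2 × 2 contingency table per bin (in-bin versus out-of-bin against inhibitor versus non-inhibitor) with one degree of freedom and no continuity correction; how the original chi-square test was set up is not specified, so this construction and all names are assumptions.

```python
import math


def bin_counts(scores, labels, edges):
    """Count [inhibitors, non-inhibitors] per bin [edges[i], edges[i+1])."""
    counts = [[0, 0] for _ in range(len(edges) - 1)]
    for score, is_inhibitor in zip(scores, labels):
        for i in range(len(edges) - 1):
            if edges[i] <= score < edges[i + 1]:
                counts[i][0 if is_inhibitor else 1] += 1
                break
    return counts


def chi2_p_2x2(a, b, c, d):
    """P value of the chi-square test for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 1.0  # degenerate table: no evidence of association
    chi2 = n * (a * d - b * c) ** 2 / denominator
    # Survival function of the chi-square distribution with 1 d.o.f.
    return math.erfc(math.sqrt(chi2 / 2.0))


def bin_statistics(counts):
    """Return (P(inhibitor | bin), p value) for every bin."""
    total_inh = sum(c[0] for c in counts)
    total_non = sum(c[1] for c in counts)
    stats = []
    for inh, non in counts:
        n_bin = inh + non
        prob_inhibitor = inh / n_bin if n_bin else None
        p_value = chi2_p_2x2(inh, non, total_inh - inh, total_non - non)
        stats.append((prob_inhibitor, p_value))
    return stats


# Invented scores and labels (True = inhibitor).
scores = [36, 37, 38, 12, 14, 16, 39, 11]
labels = [True, True, True, False, False, False, False, True]
counts = bin_counts(scores, labels, edges=[10, 20, 30, 40])
stats = bin_statistics(counts)
```

With real data, the bin that combines a high inhibitor probability with a small p value would be the candidate scoring range for separating the classes.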

Results and discussion

Chemical space network of the dataset

Figure 1 shows the CSN with well-resolved community structures for a set of inhibitors and non-inhibitors from the training set. The representative compounds of some communities are shown in Fig. S2 in the supplementary material. Major community structures [69] (communities with at least five representative members) were algorithmically detected and are color-coded. For our CSN designs, the Fruchterman–Reingold algorithm [76] was applied. The node size is proportional to the activity value (pIC₅₀) i.e. the more active the compound, the bigger the node size and vice versa.

A majority of the nodes do not have a connection, indicating a high structural diversity in the training dataset. The test dataset from Pedersen et al. showed only three clusters in the CSN with at least five representative members (Fig. S3 in the supplementary material).

Homology modeling

Using the Prime module from Maestro (Schrödinger, Inc. V-10.1.013), a set of homology models of BSEP was created and refined, using the refined mouse P-gp structure as template (PDB ID: 4M1M). The sequence alignment was done using Prime’s alignment program STA in Maestro [37, 38] (Fig. S4 in the supplementary material). The models were analyzed with the structure assessment program PROCHECK [42]; the best model had a normalized DOPE score of −0.625, a G-factor of −0.12, and a QMEAN score of 0.597. Furthermore, the Ramachandran plot (Fig. S5 in the supplementary material) showed excellent results, with only 1.9% of residues in generously allowed or disallowed regions. These were all located in the nucleotide binding domains (NBD) or extracellular loops (ECL), and are therefore not involved in drug binding (Fig. S6 in the supplementary material). Based on the study by Mochizuki et al., Asn109, Asn116, Asn122, and Asn125 are predicted to be potential glycosylation sites in extracellular loop No. 1 (EL No. 1) of human BSEP [77]. In our final BSEP homology model (Fig. 2), these residues were also found in EL No. 1, thus occurring in the correct region of the transmembrane domain (TMD, Fig. S7 in the supplementary material). For further validation, the best model based on normalized DOPE score and QMEAN score was subjected to molecular dynamics simulations for 20 ns. Both the secondary structure of the protein (Fig. 3) and the root mean square fluctuation (RMSF < 0.25 nm) of active site residues indicated that the structure was stable.

Docking (structure-based classification)

We recently demonstrated that a validated homology model of P-glycoprotein allowed docking-based classification of inhibitors and non-inhibitors with reasonable performance [32]. In this study, we therefore extended this approach to BSEP, using a set of 408 compounds (113 inhibitors and 295 non-inhibitors) published by Warner et al. [24] as training set and two data sets as external test sets (see “Materials and methods” section). The scores obtained from different fitness functions were binned and the intersection point of the curves for inhibitors and non-inhibitors in the training set served as classification criterion (Fig. 4). The respective confusion matrix parameters and other performance measures are summarized in Table 1. The ChemScore docking run using XScore as rescoring function yielded the best performing model, with AUC (0.918) and MCC (0.689) measures comparable to the models developed by Warner et al. [24] and Montanari et al. [25]. This model correctly predicted 88% of the training set compounds and 72% of the external test set compounds derived from Pedersen et al. [34] as well as 77% of a set of AstraZeneca internal compounds. The area under the ROC curve (AUC) measure, being independent of class distribution [78, 79], is a good metric for evaluating the performance of virtual screening approaches. High AUC values (above 0.8) were observed, indicating a high capacity of the model to rank compounds by their probability of being inhibitors of BSEP (Figs. S8–S12 in the supplementary material). The results from the AD assessment also show that all compounds from both test sets were within the chemical domain of the training compounds (Table S1 in the supplementary material). Interestingly, the accuracy of predictions did not improve when a consensus of different scoring functions was used.

Table 1 Models obtained from different scoring functions based on the training set

Probability of prediction

For the training set using ChemScore scoring, the 35–40 bin contained the maximum number of inhibitors. In this range, 88% of the compounds were inhibitors and 12% were non-inhibitors, with a p value of 5.9 × 10⁻⁸. For both test sets, at least 75% of the inhibitors were found to be in this range. Results for different scoring functions can be found in Table S2 in the supplementary material. With the rescoring of ChemScore poses using XScore, a particular range could also be defined which significantly distinguishes between inhibitors and non-inhibitors. However, this is not the case for GoldScore scoring. With this scoring function, no particular docking score range could be identified for the three sets (training set, both test sets) to differentiate between the two classes of compounds with a significant p value. Similar results were obtained using the GlideXP scoring function.

Analysis of protein ligand interactions

The Maestro tool allows the computation of different molecular interactions between binding site residues and the corresponding ligand conformation. In this study, the receptor–ligand interaction fingerprint analysis was performed both for the true positives (TPs) and for the true negatives (TNs) on the basis of the docking poses generated. For the training set (Fig. 5) and the two external test sets (Figs. S13, S14 in the supplementary material), the inhibitors showed significantly more hydrophobic interactions with Phe334, Leu364, Tyr772, Phe776 and Leu1026 than non-inhibitors. More than 75% of the inhibitors in the training set and the external test sets showed hydrophobic interactions with Phe334 and Tyr772 (Fig. 5a). In contrast, non-inhibitors showed a higher number of hydrogen bond interactions than inhibitors (Fig. 5b), which points towards the fact that non-inhibitors are more hydrophilic.

The significant contribution of hydrophobic interactions prompted us to assess the importance of simple molecular descriptors such as logP and molecular weight. Figure 6 represents the distribution of molecular weight and logP(o/w), respectively, for the training set compounds. Similar distributions were observed with the external test sets from Pedersen et al. [34] (Fig. S15 in the supplementary material) and from AstraZeneca (Fig. S16 in the supplementary material). As proposed by Warner et al. [24], molecular properties such as molecular weight (MW) and logP(o/w) could separate the groups quite well (Table 2). At the intersection points of MW = 390 and logP(o/w) = 3.6, 79% and 77% of the compounds, respectively, were classified correctly. Accordingly, compounds with a molecular weight of 390 or higher or a logP of 3.6 or higher were considered as inhibitors while others were considered as non-inhibitors.
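The property-based rule can be written as a two-line classifier. The cut-offs are those given in the text, and the "or" combination follows the wording of the rule literally; the function name is hypothetical, and the three example compounds use the property values quoted in the article's discussion of misclassified compounds.

```python
MW_CUTOFF = 390.0   # molecular weight at the intersection point
LOGP_CUTOFF = 3.6   # logP(o/w) at the intersection point


def predict_by_properties(mol_weight: float, logp: float) -> str:
    """Predict the class from molecular weight and logP(o/w) alone."""
    if mol_weight >= MW_CUTOFF or logp >= LOGP_CUTOFF:
        return "inhibitor"
    return "non-inhibitor"


# Property values as quoted in the article for three compounds.
examples = {
    "Ebselen": (274.0, 2.74),
    "Benzylpenicillin": (333.38, 1.74),
    "Phytomenadione": (450.70, 9.05),
}
predictions = {name: predict_by_properties(mw, logp) for name, (mw, logp) in examples.items()}
```

For these three compounds the rule gives the opposite of the measured class in every case, which is consistent with their being reported as misclassified by the scoring functions as well.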

Table 2 Models based on physicochemical properties

The models based on docking scores (ChemScore and XScore) in combination with molecular weight and logP(o/w) (each normalized) outperformed the other models in terms of MCC and precision. ChemScore- and XScore-based models, when combined with the physicochemical properties [molecular weight and logP(o/w)], correctly predicted 87% and 88% of training set compounds, giving MCC values of 0.673 and 0.701, respectively. These models also showed high accuracies compared to other models for the two external test sets. Detailed accuracy measures are presented in Table S3 in the supplementary material.

Also when poses, generated with GoldScore scoring function and rescored with XScore, were combined with the normalized molecular weight and logP(o/w), it provided accuracies comparable to the former models (Table S3 in the supplementary material). This indicates that considering physicochemical properties of molecules that influence their activity significantly improves the performance of structure-based prediction models.

Distributions of BSEP inhibitors and non-inhibitors using different scoring functions, alone and in combination with physicochemical properties (molecular weight, logP), are presented in Figs. S17–S32 in the supplementary material. A single intersection point could not be obtained when the rescoring using XScore (pose generated with GoldScore) was combined with logP(o/w); this combination was therefore not used for the classification of inhibitors and non-inhibitors (Fig. S31 in the supplementary material).

Using the best performing docking scores (ChemScore, XScore) and the descriptors (molecular weight and logP(o/w)) as parameters, we additionally developed machine-learning based binary classification models using J48, Random Forest, REPTree, LibSVM and Naive Bayes in WEKA [67]. These models performed well, with accuracies and MCC values (Table S4 in the supplementary material) comparable to those of the machine-learning based classification models of Warner et al. [24] and our previously developed models [25].

Analysis of functional groups and protein–ligand interactions

Next, we investigated the distribution of functional groups between inhibitors and non-inhibitors to identify structural features that are responsible for differences in the activity (inhibitor vs. non-inhibitor). About 70 SMARTS patterns representing the most common functional groups were extracted from the Daylight website (http://www.daylight.com/dayhtml_tutorials/languages/smarts/smarts_examples.html). Basically, groups such as halide/halogen, ether, carbonyl, vinyl carbons (sp2 hybridized) and amide were more frequently found in the inhibitors compared to the non-inhibitors (Fig. 7, S33 in the supplementary material). This further points towards more hydrophobic-driven interactions for inhibitors.

In addition, we identified the most frequently occurring interactions between residues and functional groups for the training set compounds. A heat map (Fig. 8a) was generated to illustrate the outcomes of the PLIF analysis by displaying the contact residues against the functional groups of the interacting ligands. The color scale represents the number of ligands involved in each interaction. The most significant interactions between a specific residue and a specific functional group could therefore be detected visually.

We found that interactions of arene and carbonyl functional groups with tyrosine and leucine are more prominent among the inhibitors than among the non-inhibitors. We then retrospectively assessed the docking results to check for the presence of the aforementioned interactions and evaluated the chances of prioritizing a compound as a BSEP inhibitor. Figure 8b represents the docking pose of Glimepiride (yellow), in which its carbonyl groups interact with the residues Tyr337, Tyr772 and Asn996. The residue Leu364 shows a hydrophobic interaction with the arene moiety of the ligand. Similarly, the functional group-residue interactions were confirmed to be present in the docking results of both external test datasets (Figs. S34–S36 in the supplementary material).

Although the functional group analysis suggests that halide/halogen, carbonyl, ether, vinyl and amide groups were significantly overrepresented in the inhibitors, only the carbonyl and amide groups were found to frequently interact with the protein. According to the heat map (Fig. 8a), halide/halogen and vinyl groups do not appear to have a significant number of contacts with the residues. At the same time, arene was found at a similar rate in inhibitors (nearly 95%) and non-inhibitors (nearly 85%), but the PLIF analysis revealed that the arene moiety participates in a significant number of interactions with residues such as Leu364 and Leu1026. This indicates that significant differences in the functional group composition between inhibitors and non-inhibitors (Fig. 7) do not necessarily reflect the nature of the interactions. The latter depends rather on the position of these functional groups in the molecular structure, the nature of the binding site residues and the size of the binding pocket.

Finally, preliminary results show that the PLIF can also be used as predictor for inhibitor/non inhibitor properties by calculating the Tanimoto distance to known inhibitors. A more detailed description of this approach can be found in the supplementary material.

Analysis of misclassified compounds

Nearly 90 compounds, altogether from different datasets, were incorrectly classified by all the four scoring functions used in the study. More than 59% of the training set compounds and 48% of the test set compounds were correctly classified by all the scoring functions. Of the 19 misclassified compounds from the training set, nine were predicted as inhibitors and ten were predicted as non-inhibitors.

The training set compound Ebselen was wrongly predicted as a non-inhibitor by all scoring functions. Examining its molecular properties revealed that both molecular weight (274) and logP (2.74) fall in the range of non-inhibitors (Table 2). Moreover, Ebselen was found to be structurally more similar to a set of non-inhibitors than to the set of inhibitors. Benzylpenicillin (Penicillin G) also belongs to the property space of non-inhibitors (molecular weight = 333.38 and logP = 1.74). Interestingly, both Ebselen and Benzylpenicillin are strong inhibitors (IC₅₀ < 10 μM) [24]. On the other hand, Phytomenadione (molecular weight = 450.70, logP = 9.05), despite being a non-inhibitor (IC₅₀ > 1000), was always misclassified as an inhibitor. A similar trend was noticed in both external test sets. In total, six inhibitors and 13 non-inhibitors were misclassified from the Pedersen et al. [34] dataset. Interestingly, all six inhibitors were found to be strongly hydrophobic, and the molecular properties of about 80% of the non-inhibitors fall in the range of inhibitors. This supports the inclusion of these physicochemical properties in the classification model.
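The selection of compounds that are misclassified by every scoring function can be sketched as a set intersection. The snippet below is a minimal illustration, not the analysis code of the study. The labels for Ebselen and Phytomenadione follow the text above, whereas `compound_A`, the names `function_3` and `function_4`, and all predictions for `compound_A` are hypothetical placeholders.

```python
from typing import Dict, Set

INHIBITOR, NON_INHIBITOR = 1, 0


def consensus_misclassified(
    true_labels: Dict[str, int],
    predictions: Dict[str, Dict[str, int]],
) -> Set[str]:
    """Return the compounds that every scoring function predicts incorrectly."""
    wrong_sets = [
        {name for name, pred in preds.items() if pred != true_labels[name]}
        for preds in predictions.values()
    ]
    return set.intersection(*wrong_sets) if wrong_sets else set()


true_labels = {
    "Ebselen": INHIBITOR,             # strong inhibitor according to the text
    "Phytomenadione": NON_INHIBITOR,  # non-inhibitor according to the text
    "compound_A": INHIBITOR,          # hypothetical compound
}

# One prediction table per scoring function (placeholder names for two of them).
predictions = {
    "ChemScore": {"Ebselen": 0, "Phytomenadione": 1, "compound_A": 1},
    "XScore": {"Ebselen": 0, "Phytomenadione": 1, "compound_A": 0},
    "function_3": {"Ebselen": 0, "Phytomenadione": 1, "compound_A": 1},
    "function_4": {"Ebselen": 0, "Phytomenadione": 1, "compound_A": 0},
}

always_wrong = consensus_misclassified(true_labels, predictions)
false_negatives = {c for c in always_wrong if true_labels[c] == INHIBITOR}
false_positives = always_wrong - false_negatives
```

Splitting the consensus errors into false negatives and false positives makes it easy to compare their molecular weight and logP with the property ranges of the opposite class.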

Combining ligand- and structure-based classification (sequential modeling)

Although the structure-based models performed reasonably well, ligand-based methods are considerably faster and perform equally well. Thus, we evaluated whether a sequential approach, which starts with a ligand-based method and then screens the predicted positives using structure-based models, would improve the precision by reducing the number of false positives (FPs). We used an external test set containing 39 inhibitors and 113 non-inhibitors as a starting point. After applying ligand-based classification using the workflow from Montanari et al. [25], 30 inhibitors were correctly predicted (true positives, TPs) and there were nine FPs, which corresponds to a precision of 0.77. After application of our structure-based model based on ChemScore and rescoring using XScore, the precision improved to 0.83, and the number of FPs was reduced to five. Further performance measures on the sequential approach are provided in Table 3. Thus, combining ligand- and structure-based models in a sequential setting increased the precision and reduced the calculation time. This might be a versatile approach to reduce the number of FPs when performing large-scale in silico screening.
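The sequential setting and the precision of the first stage can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the two classifiers are passed in as plain callables, and the toy compounds with their `ligand` and `docking` flags are hypothetical. Only the counts of 30 TPs and nine FPs for the ligand-based stage are taken from the text.

```python
from typing import Callable, Dict, Iterable, List

Compound = Dict[str, object]


def precision(tp: int, fp: int) -> float:
    """Precision = TP / (TP + FP)."""
    if tp + fp == 0:
        raise ValueError("precision is undefined without predicted positives")
    return tp / (tp + fp)


def sequential_screen(
    compounds: Iterable[Compound],
    ligand_model: Callable[[Compound], bool],
    structure_model: Callable[[Compound], bool],
) -> List[Compound]:
    """Keep only compounds flagged by the ligand-based model AND the docking model.

    The slower structure-based model is applied only to the ligand-based positives.
    """
    ligand_hits = [c for c in compounds if ligand_model(c)]
    return [c for c in ligand_hits if structure_model(c)]


# Ligand-based stage as reported above: 30 TPs and 9 FPs.
stage1_precision = precision(tp=30, fp=9)

# Hypothetical toy data to show the control flow of the two stages.
toy_set = [
    {"name": "a", "ligand": True, "docking": True},
    {"name": "b", "ligand": True, "docking": False},
    {"name": "c", "ligand": False, "docking": True},
]
final_hits = sequential_screen(
    toy_set,
    ligand_model=lambda c: bool(c["ligand"]),
    structure_model=lambda c: bool(c["docking"]),
)
```

Because only the ligand-based positives are docked, the docking cost scales with the number of first-stage hits rather than with the size of the full screening set.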

Table 3 Ligand-based and structure-based classification


Conclusion

Development of structure-based methods for transmembrane transporters of the ABC-family has been less pronounced due to the limited availability of experimentally determined 3D structures. However, recent efforts that used homology models of P-glycoprotein provide promising evidence that structure-based classification methods can be applied to these highly flexible and promiscuous proteins. In this study, we used comparative modeling to generate a homology model for the ABC-transporter BSEP and developed structure-based models to classify inhibitors and non-inhibitors. Including logP and molecular weight as an additional layer of information besides the scoring function further increased the performance of the models. PLIF analysis revealed certain functional group-residue interactions that could help to understand the molecular basis of inhibition of the transporter protein by a wide range of ligands. The applicability domain of the models was assessed using Euclidean distance. Furthermore, we estimated the probability of prediction by employing a binning scheme and identified a docking score range that can distinguish a majority of inhibitors from non-inhibitors with high confidence. Finally, combining the structure-based model with our previously published ligand-based classification model in a sequential order provided additional improvement.

Combining ligand- and structure-based models to enhance the performance of virtual screening is of course not a new approach. For receptors and enzymes, the identification of new ligands quite often starts with a pharmacophore-based screening followed by docking of the top-ranked hits to further refine the shopping list [80]. However, in the case of ABC-transporters such as P-glycoprotein, which shows a pronounced polyspecificity in its ligand profile, there is a broad variety of pharmacophore models available. This would render a sequential approach quite challenging. Furthermore, due to the eminent role of ABC-transporters like P-gp, BSEP, and the breast cancer resistance protein (BCRP) in ADME and toxicity, the focus of in silico screening lies more on flagging potentially toxic compounds than on identifying new inhibitors for further development as drug candidates. In this setting, machine learning-based classification models might be a better tool for a first computational pre-screening. Therefore, a workflow comprising pre-screening with simple descriptors, classification by machine learning techniques and post-processing by structure-based methods might be the approach of choice to provide accurate predictions combined with additional information on the molecular basis of compound-transporter interaction.

References

1. Lodish H, Berk A, Zipursky SL, Matsudaira P, Baltimore D, Darnell J (2000) Overview of membrane transport proteins. In: Lodish H (ed) Molecular cell biology, 4th edn. W. H. Freeman, New York
2. Dean M, Rzhetsky A, Allikmets R (2001) The human ATP-binding cassette (ABC) transporter superfamily. Genome Res 11:1156–1166
3. Kim S-R, Saito Y, Itoda M, Maekawa K, Kawamoto M, Kamatani N, Ozawa S, Sawada J (2009) Genetic variations of the ABC transporter gene ABCB11 encoding the human bile salt export pump (BSEP) in a Japanese population. Drug Metab Pharmacokinet 24:277–281
4. Glavinas H, Krajcsi P, Cserepes J, Sarkadi B (2004) The role of ABC transporters in drug resistance, metabolism and toxicity. Curr Drug Deliv 1:27–42
5. Giacomini KM, Huang S-M, Tweedie DJ et al (2010) Membrane transporters in drug development. Nat Rev Drug Discov 9:215–236
6. Cheng X, Buckley D, Klaassen CD (2007) Regulation of hepatic bile acid transporters Ntcp and Bsep expression. Biochem Pharmacol 74:1665–1676
7. Hofmann AF, Borgström B (1964) The intraluminal phase of fat digestion in man: the lipid content of the micellar and oil phases of intestinal content obtained during fat digestion and absorption. J Clin Invest 43:247–257
8. Fiorucci S, Mencarelli A, Palladino G, Cipriani S (2009) Bile-acid-activated receptors: targeting TGR5 and farnesoid-X-receptor in lipid and glucose disorders. Trends Pharmacol Sci 30:570–580
9. Kuipers F, Groen AK (2008) Chipping away at gallstones. Nat Med 14:715–716
10. Strautnieks SS, Byrne JA, Pawlikowska L et al (2008) Severe bile salt export pump deficiency: 82 different ABCB11 mutations in 109 families. Gastroenterology 134:1203–1214
11. Perez M-J, Briz O (2009) Bile-acid-induced cell injury and protection. World J Gastroenterol 15:1677–1689
12. Amer S, Hajira A (2014) A comprehensive review of progressive familial intrahepatic cholestasis (PFIC): genetic disorders of hepatocanalicular transporters. Gastroenterol Res 7:39–43
13. Alonso EM, Snover DC, Montag A, Freese DK, Whitington PF (1994) Histologic pathology of the liver in progressive familial intrahepatic cholestasis. J Pediatr Gastroenterol Nutr 18:128–133
14. Jansen P, Muller M (2000) The molecular genetics of familial intrahepatic cholestasis. Gut 47:1–5
15. Drug Transport. In: Sigma–Aldrich. http://www.sigmaaldrich.com/technical-documents/articles/biofiles/drug-transport.html. Accessed 17 March 2015
16. Kosters A, Karpen SJ (2008) Bile acid transporters in health and disease. Xenobiotica Fate Foreign Compd Biol Syst 38:1043–1071
17. Dawson S, Stahl S, Paul N, Barber J, Kenna JG (2012) In vitro inhibition of the bile salt export pump correlates with risk of cholestatic drug-induced liver injury in humans. Drug Metab Dispos Biol Fate Chem 40:130–138
18. Sahi J, Sinz MW, Campbell S et al (2006) Metabolism and transporter-mediated drug-drug interactions of the endothelin-A receptor antagonist CI-1034. Chem Biol Interact 159:156–168
19. Kullak-Ublick GA, Stieger B, Meier PJ (2004) Enterohepatic bile salt transporters in normal physiology and liver disease. Gastroenterology 126:322–342
20. Guideline on the investigation of drug interactions. http://www.ema.europa.eu/docs/en_GB/document_library/Scientific_guideline/2012/07/WC500129606.pdf
21. Morgan RE, Trauner M, van Staden CJ, Lee PH, Ramachandran B, Eschenberg M, Afshari CA, Qualls CW, Lightfoot-Dunn R, Hamadeh HK (2010) Interference with bile salt export pump function is a susceptibility factor for human liver injury in drug development. Toxicol Sci Off J Soc Toxicol 118:485–500
22. Kis E, Ioja E, Rajnai Z, Jani M, Méhn D, Herédi-Szabó K, Krajcsi P (2012) BSEP inhibition: in vitro screens to assess cholestatic potential of drugs. Toxicol Vitro Int J Publ Assoc BIBRA 26:1294–1299
23. Montanari F, Ecker GF (2015) Prediction of drug–ABC-transporter interaction: recent advances and future challenges. Adv Drug Deliv Rev 86:17–26
24. Warner DJ, Chen H, Cantin L-D, Kenna JG, Stahl S, Walker CL, Noeske T (2012) Mitigating the inhibition of human bile salt export pump by drugs: opportunities provided by physicochemical property modulation, in silico modeling, and structural modification. Drug Metab Dispos Biol Fate Chem 40:2332–2341
25. Montanari F, Pinto M, Khunweeraphong N et al (2016) Flagging drugs that inhibit the bile salt export pump. Mol Pharm 13:163–171
26. Bikadi Z, Hazai I, Malik D et al (2011) Predicting P-glycoprotein-mediated drug transport based on support vector machine and three-dimensional crystal structure of P-glycoprotein. PLoS ONE 6:e25815
27. Blower PE, Yang C, Fligner MA, Verducci JS, Yu L, Richman S, Weinstein JN (2002) Pharmacogenomic analysis: correlating molecular substructure classes with microarray gene expression data. Pharmacogenomics J 2:259–271
28. Dolghih E, Bryant C, Renslo AR, Jacobson MP (2011) Predicting binding to P-glycoprotein by flexible receptor docking. PLoS Comput Biol 7:e1002083
29. Chen L, Li Y, Yu H, Zhang L, Hou T (2012) Computational models for predicting substrates or inhibitors of P-glycoprotein. Drug Discov Today 17:343–351
30. Klepsch F, Chiba P, Ecker GF (2011) Exhaustive sampling of docking poses reveals binding hypotheses for propafenone type inhibitors of P-glycoprotein. PLoS Comput Biol 7:e1002036
31. Prokes K (2012) Development of "in silico" models for identification of new ligands acting as pharmacochaperones for P-glycoprotein. Diploma Thesis, University of Vienna, Austria
32. Klepsch F, Vasanthanathan P, Ecker GF (2014) Ligand and structure-based classification models for prediction of P-glycoprotein inhibitors. J Chem Inf Model 54:218–229
33. Xiang Z (2006) Advances in homology protein structure modeling. Curr Protein Pept Sci 7:217–227
34. Pedersen JM, Matsson P, Bergström CAS, Hoogstraate J, Norén A, LeCluyse EL, Artursson P (2013) Early identification of clinically relevant drug interactions with the human bile salt export pump (BSEP/ABCB11). Toxicol Sci 136:328–343
35. Pinto M, Trauner M, Ecker GF (2012) An in silico classification model for putative ABCC2 substrates. Mol Inform 31:547–553
36. Eswar N, Webb B, Marti-Renom MA, Madhusudhan MS, Eramian D, Shen M-Y, Pieper U, Sali A (2007) Comparative protein structure modeling using MODELLER. Curr Protoc Protein Sci Chap 2:2.9
37. Jacobson MP, Pincus DL, Rapp CS, Day TJF, Honig B, Shaw DE, Friesner RA (2004) A hierarchical approach to all-atom protein loop prediction. Proteins 55:351–367
38. Jacobson MP, Friesner RA, Xiang Z, Honig B (2002) On the role of the crystal environment in determining protein side-chain conformations. J Mol Biol 320:597–608
39. Shen M, Sali A (2006) Statistical potential for assessment and prediction of protein structures. Protein Sci Publ Protein Soc 15:2507–2524
40. Melo F, Sánchez R, Sali A (2002) Statistical potentials for fold assessment. Protein Sci Publ Protein Soc 11:430–448
41. John B, Sali A (2003) Comparative protein structure modeling by iterative alignment, model building and model assessment. Nucleic Acids Res 31:3982–3992
42. Laskowski R, Macarthur M, Moss D, Thornton J (1993) PROCHECK: a program to check the stereochemical quality of protein structures. J Appl Cryst 26:283–291
43. Zhou AQ, O'Hern C, Regan L (2011) Revisiting the Ramachandran plot from a new angle. Protein Sci Publ Protein Soc 20:1166–1171
44. Engh R, Huber R (1991) Accurate bond and angle parameters for X-ray protein structure refinement. Acta Cryst A 47:392–400
45. Benkert P, Künzli M, Schwede T (2009) QMEAN server for protein model quality estimation. Nucleic Acids Res 37:W510–W514
46. Benkert P, Tosatto SCE, Schomburg D (2008) QMEAN: a comprehensive scoring function for model quality assessment. Proteins 71:261–277
47. van der Spoel D, Lindahl E, Hess B, Groenhof G, Mark AE, Berendsen HJC (2005) GROMACS: fast, flexible, and free. J Comput Chem 26:1701–1718
48. Hess B, Kutzner C, van der Spoel D, Lindahl E (2008) GROMACS 4: algorithms for highly efficient, load-balanced, and scalable molecular simulation. J Chem Theory Comput 4:435–447
49. Lindahl E, Hess B, van der Spoel D (2001) GROMACS 3.0: a package for molecular simulation and trajectory analysis. Mol Model Annu 7:306–317
50. Berendsen HJC, van der Spoel D, van Drunen R (1995) GROMACS: a message-passing parallel molecular dynamics implementation. Comput Phys Commun 91:43–56
51. Schmid N, Eichenberger AP, Choutko A, Riniker S, Winger M, Mark AE, van Gunsteren WF (2011) Definition and testing of the GROMOS force-field versions 54A7 and 54B7. Eur Biophys J 40:843–856
52. Berendsen HJC, Postma JPM, van Gunsteren WF, Hermans J (1981) Interaction models for water in relation to protein hydration. In: Pullman B (ed) Intermolecular forces. Springer, Dordrecht, pp 331–342
53. Hess B, Bekker H, Berendsen HJC, Fraaije JGEM (1997) LINCS: a linear constraint solver for molecular simulations. J Comput Chem 18:1463–1472
54. Essmann U, Perera L, Berkowitz ML, Darden T, Lee H, Pedersen LG (1995) A smooth particle mesh Ewald method. J Chem Phys 103:8577–8593
55. Kabsch W, Sander C (1983) Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 22:2577–2637
56. Turner PJ (2005) XMGRACE. Center for Coastal and Land-Margin Research. Oregon Graduate Institute of Science and Technology, Beaverton
57. Sastry GM, Adzhigirey M, Day T, Annabhimoju R, Sherman W (2013) Protein and ligand preparation: parameters, protocols, and influence on virtual screening enrichments. J Comput Aided Mol Des 27:221–234
58. Schrödinger Release 2015-1 (2015) Maestro, version 10.1. Schrödinger, LLC, New York
59. Schrödinger Release 2015-1 (2015) LigPrep, version 3.3. Schrödinger, LLC, New York
60. Shelley JC, Cholleti A, Frye LL, Greenwood JR, Timlin MR, Uchimaya M (2007) Epik: a software program for pK(a) prediction and protonation state generation for drug-like molecules. J Comput Aided Mol Des 21:681–691
61. Greenwood JR, Calkins D, Sullivan AP, Shelley JC (2010) Towards the comprehensive, rapid, and accurate prediction of the favorable tautomeric states of drug-like molecules in aqueous solution. J Comput Aided Mol Des 24:591–604
62. Verdonk ML, Cole JC, Hartshorn MJ, Murray CW, Taylor RD (2003) Improved protein–ligand docking using GOLD. Proteins Struct Funct Bioinform 52:609–623
63. Jones G, Willett P, Glen RC, Leach AR, Taylor R (1997) Development and validation of a genetic algorithm for flexible docking. J Mol Biol 267:727–748
64. Halgren TA, Murphy RB, Friesner RA, Beard HS, Frye LL, Pollard WT, Banks JL (2004) Glide: a new approach for rapid, accurate docking and scoring. 2. Enrichment factors in database screening. J Med Chem 47:1750–1759
65. Friesner RA, Murphy RB, Repasky MP, Frye LL, Greenwood JR, Halgren TA, Sanschagrin PC, Mainz DT (2006) Extra precision glide: docking and scoring incorporating a model of hydrophobic enclosure for protein–ligand complexes. J Med Chem 49:6177–6196
66. Wang R, Lai L, Wang S (2002) Further development and validation of empirical scoring functions for structure-based binding affinity prediction. J Comput Aided Mol Des 16:11–26
67. Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH (2009) The WEKA data mining software: an update. SIGKDD Explor Newsl 11:10–18
68. MACCS Structural keys 2011, Accelrys, San Diego
69. Vogt M, Stumpfe D, Maggiora GM, Bajorath J (2016) Lessons learned from the design of chemical space networks and opportunities for new applications. J Comput Aided Mol Des 30:191–208
70. Zwierzyna M, Vogt M, Maggiora GM, Bajorath J (2015) Design and characterization of chemical space networks for different compound data sets. J Comput Aided Mol Des 29:113–125
71. Steinbeck C, Han Y, Kuhn S, Horlacher O, Luttmann E, Willighagen E (2003) The Chemistry Development Kit (CDK): an open-source Java library for chemo- and bioinformatics. J Chem Inf Comput Sci 43:493–500
72. Molecular Operating Environment (MOE), 2013.08. Chemical Computing Group Inc., Montreal, Canada
73. Berthold MR, Cebron N, Dill F, Gabriel TR, Kötter T, Meinl T, Ohl P, Thiel K, Wiswedel B (2009) KNIME - the Konstanz information miner: version 2.0 and beyond. SIGKDD Explor Newsl 11:26–31
74. Melagraki G, Afantitis A, Sarimveis H, Igglessi-Markopoulou O, Koutentis PA, Kollias G (2010) In silico exploration for identifying structure-activity relationship of MEK inhibition and oral bioavailability for isothiazole derivatives. Chem Biol Drug Des 76:397–406
75. Afantitis A, Melagraki G, Koutentis PA, Sarimveis H, Kollias G (2011) Ligand-based virtual screening procedure for the prediction and the identification of novel β-amyloid aggregation inhibitors using Kohonen maps and Counterpropagation Artificial Neural Networks. Eur J Med Chem 46:497–508
76. Fruchterman TMJ, Reingold EM (1991) Graph drawing by force-directed placement. Softw Pract Exp 21:1129–1164
77. Mochizuki K, Kagawa T, Numari A, Harris MJ, Itoh J, Watanabe N, Mine T, Arias IM (2007) Two N-linked glycans are required to maintain the transport activity of the bile salt export pump (ABCB11) in MDCK II cells. Am J Physiol Gastrointest Liver Physiol 292:G818–G828
78. Fawcett T (2006) An introduction to ROC analysis. Pattern Recogn Lett 27:861–874
79. Bradley AP (1997) The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recogn 30:1145–1159
80. Küblbeck J, Jyrkkärinne J, Poso A, Turpeinen M, Sippl W, Honkakoski P, Windshügel B (2008) Discovery of substituted sulfonamides and thiazolidin-4-one derivatives as agonists of human constitutive androstane receptor. Biochem Pharmacol 76:1288–1297


Acknowledgements

Open access funding provided by Austrian Science Fund (FWF). We gratefully acknowledge financial support provided by the Austrian Science Fund, grants #F03502 (SFB35), W1232 (MolTag) and by the Innovative Medicines Initiative Joint Undertaking under grant agreement n°115002 (eTOX). The computational results presented have been achieved in part using the Vienna Scientific Cluster (VSC3).

Author information

Authors and Affiliations

Department of Pharmaceutical Chemistry, University of Vienna, Althanstrasse 14, 1090, Vienna, Austria
Sankalp Jain, Melanie Grandits, Lars Richter & Gerhard F. Ecker


Corresponding author

Correspondence to Gerhard F. Ecker.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1 (DOCX 2267 kb)

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.


About this article

Cite this article

Jain, S., Grandits, M., Richter, L. et al. Structure based classification for bile salt export pump (BSEP) inhibitors using comparative structural modeling of human BSEP. J Comput Aided Mol Des 31, 507–521 (2017). https://doi.org/10.1007/s10822-017-0021-x


Received: 27 November 2016
Accepted: 30 April 2017
Published: 19 May 2017
Issue Date: June 2017
DOI: https://doi.org/10.1007/s10822-017-0021-x
