Homology modeling
To find the most suitable protein templates for the A3AR model, the receptor sequence was obtained from the UniProt database [13] (sp_P33765) and used as a BLAST query in two online tools: SwissModel [14] and ProteinBlast (NCBI) [15]. In both cases, the default search settings were used to find the most similar PDB crystal structures. After comparison of the results, three templates were chosen: 3EML (2.60 Å, 39.86% identity) [16], 2YDV (2.60 Å, 42.6% identity) [17] and 3VG9 (2.70 Å, 43.34% identity) [18]. To increase the chances of obtaining a reliable model, the templates were selected from the hits of the two independent BLAST searches according to highest crystallographic resolution, subject to the crystal structures available at the time.
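For illustration only, the sequence identity between a target and a template can be computed from a pairwise alignment as sketched below. The function name `percent_identity` and the toy sequences are hypothetical, and the denominator used here (columns in which both sequences carry a residue) is an assumption; alignment tools differ in this convention, so such values need not match the identities quoted above.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where both sequences have a residue.

    Assumption: gap columns ('-') are excluded from the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * identical / compared


# Toy alignment (not real receptor sequences): 4 of 5 compared columns match.
toy_identity = percent_identity("ACDE-F", "ACDQWF")
```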
Protein structures were pre-processed using PyMOL [19]: ligands, co-crystallization agents (2YDV, 3EML) and the lysozyme insertion replacing ICL3 (3EML) were removed. The resulting protein sequences were aligned using the PROMALS3D [20] online tool. After visual inspection (position of transmembrane domains, possible disulfide bridges), the resulting alignment was used as input for MODELLER [21, 22].
Each of the 10 output models was then aligned to the 3EML crystal structure and inspected visually using UCSF Chimera [23]. In particular, the orientation of the side chain of ASN250^6.55 (superscript numbers denote Ballesteros–Weinstein numbers [24]) and of other binding pocket amino acids was examined, and their acceptable rotamers (according to the Dunbrack library [25]) were determined. This analysis was supported by mutagenesis data [3, 4]. Similarly, the transmembrane domains were inspected for gaps, obvious steric clashes and unnatural side chain conformations, and the disulfide bond between CYS83^3.25 and CYS166^45.50 was checked for preservation.
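The Ballesteros–Weinstein labels quoted in this section imply the residue number of each X.50 reference position (for example, ASN250 at 6.55 places 6.50 at residue 245). A minimal sketch of this arithmetic follows; the function `bw_label` and the dictionary `X50_RESIDUE` are hypothetical helpers, and the reference positions are back-calculated from the labels given in the text rather than taken from a sequence alignment.

```python
# Residue numbers of the X.50 reference positions, inferred from labels in the text:
#   ASN250 = 6.55   -> 6.50 is residue 245
#   CYS83  = 3.25   -> 3.50 is residue 108
#   CYS166 = 45.50  -> reference position of the ECL2 segment
X50_RESIDUE = {"3": 108, "6": 245, "45": 166}


def bw_label(residue_number: int, segment: str) -> str:
    """Return the Ballesteros-Weinstein label 'segment.position' for a residue.

    The position is 50 plus the sequence offset from the segment's X.50 residue.
    """
    offset = residue_number - X50_RESIDUE[segment]
    return f"{segment}.{50 + offset}"


# Cross-check against labels used elsewhere in the section.
trp243 = bw_label(243, "6")    # "6.48"
phe168 = bw_label(168, "45")   # "45.52"
```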
Known ligand database preparation
The next step was to test the enrichment of ligands over non-binders in the orthosteric binding pockets of the selected models. For this purpose, two sets of compounds were obtained from the ChEMBL database [26]. The “ligands” set consisted of approx. 1500 molecules described as A3AR ligands with Ki ≤ 100 nM. The “decoys” set consisted of approx. 800 molecules tested against the A3AR and described as inactive for this target. Structures for both sets were obtained from the ZINC database [27] by searching for the ZINC IDs corresponding to the compounds extracted from ChEMBL. High-quality 3D conformer ensembles of both sets were generated using the OMEGA module [28, 29] of the OEDocking software package (maximum number of conformers = 100; RMS = 0.5).
Model refinement
The final A3AR homology model used in this study was obtained through a refinement process comprising three consecutive strategies.
Strategy 1
ZM241385, the ligand co-crystallized in the 3EML structure, served as the reference ligand for docking: it was placed in the A3AR receptor models after their alignment to the 3EML structure, making sure that the hydrogen bonds with ASN250^6.55 were formed. The two sets of ligands were then docked to the prepared receptor homology models using the HYBRID module (one pose per ligand, max. hitlist size 500 molecules) implemented in the OEDocking software [30,31,32,33]. After docking, the top 500 poses were inspected visually, and receiver operating characteristic (ROC) curves were generated, together with the area under the curve (AUC), using an in-house script.
Docked ligands were minimized using the SZYBKI module (OEDocking) [34], and the homology model of the protein (model 1) was minimized (with the ligand present in the binding site) using CHARMM [35]. As several poses with unfavorable energies and similar docking behavior were observed for the set of tested ligands, another modeling approach was undertaken.
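The in-house ROC/AUC script is not part of this text. As a hedged sketch of how such a calculation can be done, the code below ranks docked molecules by score, builds the ROC points and integrates them with the trapezoidal rule. The function names, the toy scores and the convention that a lower (more negative) score is better are assumptions, and the handling of the 500-molecule hitlist cutoff in the original script is not known.

```python
def roc_points(scores, is_active):
    """Return ROC points (FPR, TPR) for a ranking in which LOWER score = better.

    Molecules with tied scores are processed together, so ties contribute a
    diagonal segment instead of an arbitrary ordering.
    """
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0])
    n_actives = sum(1 for _, active in ranked if active)
    n_decoys = len(ranked) - n_actives
    if n_actives == 0 or n_decoys == 0:
        raise ValueError("need at least one active and one decoy")

    points = [(0.0, 0.0)]
    true_pos = false_pos = 0
    i = 0
    while i < len(ranked):
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1]:
                true_pos += 1
            else:
                false_pos += 1
            j += 1
        points.append((false_pos / n_decoys, true_pos / n_actives))
        i = j
    return points


def auc_trapezoid(points):
    """Area under a piecewise-linear curve given as (x, y) points sorted by x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy data (not from the study): two actives, two decoys.
toy_scores = [-12.0, -11.0, -10.0, -9.0]
toy_labels = [True, False, True, False]
toy_auc = auc_trapezoid(roc_points(toy_scores, toy_labels))
```

An AUC of 0.5 corresponds to a random ranking of actives and decoys, and 1.0 to a ranking in which every active scores better than every decoy.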
Strategy 2
Because ZM241385 appears to be inactive towards A3AR and might therefore unduly bias the shape of the binding pocket during modeling, in a second round of modeling the previously identified ligand ZINC12533962, which is potent and selective towards A3AR (A3AR Ki = 40 nM), was placed manually in the crystal structure of 3EML instead, retaining similar ligand-receptor interactions. The ligand conformation was taken from the pose previously obtained by docking to the A1AR [9]. The protein conformation prepared in this way served as the template, using the same input alignment for MODELLER as for model 1 but excluding the 2YDV and 3VG9 X-ray structures and including the ligand and its position during modeling.
From the ten output models, the best-scoring one (according to MODELLER’s scoring functions; model 2) was chosen for evaluation.
Strategy 3
Because model 2 showed possible steric clashes between the ligand and TRP243^6.48 in the bottom part of the binding pocket, as well as PHE168^45.52 (ECL2) at its top, the position of ZINC12533962 in the 3EML structure was corrected manually, and the receptor prepared in this way again served as a template for MODELLER. The resulting model (model 3) also showed potential steric clashes between the ligand and PHE168^45.52; the ligand position in the template protein was therefore corrected once more, and the protein was remodeled. After visual inspection, the sets of actives and decoys were docked to the output model (model 4). This model, without further refinement or minimization, was chosen for all further docking studies. ECL2 was not remodeled, as it aligned well with the reference 3EML structure. The binding pockets of all four homology models are presented in Fig. 1.
Docking
To see whether the experimentally obtained binding profiles could be reproduced in silico, the previously described set of 39 ligands (test set) [9] was next docked to all three receptor subtypes (four A1AR homology models [9], the crystal structure of 3EML for A2AAR and the A3AR homology model). All ligands were prepared according to the procedure described above for the “binders/decoys” sets and docked to all receptors using the HYBRID module. Because HYBRID docks multiconformer molecules into receptor-ligand complexes using an exhaustive search that systematically samples rotations and translations of each ligand conformer within the active site (defined by the “bound” ligand), no docking grid/sphere was set beforehand. For all docked sets, the preservation of a hydrogen bond with ASN^6.55 and the orientation of the ligands in the binding pockets were inspected visually.
In silico screening evaluation
To quantify the in silico/in vitro correlation and, quite literally, to check how far the results of the two screenings lie from each other, an approach combining Taxicab geometry (City Block Distance, CBD) [36] with a traffic light system was used. Instead of the usual Euclidean distance, Taxicab geometry defines a metric in which the distance (d1) between two points p and q is the sum of the absolute differences of their Cartesian coordinates.
$${d}_{1}\left(p,q\right)={\Vert p-q\Vert }_{1}=\sum\limits_{i=1}^{n}\left|{p}_{i}-{q}_{i}\right|$$
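As a small worked example with illustrative values that are not taken from the study, consider the two-dimensional points p = (1, 2) and q = (4, 6). The Taxicab distance sums the coordinate-wise differences, whereas the Euclidean distance (d2) measures the straight line between the points:
$${d}_{1}\left(p,q\right)=\left|1-4\right|+\left|2-6\right|=3+4=7,\qquad {d}_{2}\left(p,q\right)=\sqrt{{3}^{2}+{4}^{2}}=5$$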
Besides a variety of everyday applications, the CBD can also be used to assess differences between discrete frequency distributions. In our study, in vitro (v = Ki) and in silico (s = docking or rescoring score) values were used in place of Cartesian coordinates.
$${d}_{1}\left(v,s\right)={\Vert v-s\Vert }_{1}=\sum\limits_{i=1}^{n}\left|{v}_{i}-{s}_{i}\right|$$
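The sum above translates directly into code. The sketch below is illustrative: the function name `city_block_distance` and the example vectors are hypothetical, and the inputs are assumed to be already mapped to the 0–2 class values described below, one entry per ligand.

```python
def city_block_distance(v, s):
    """Taxicab (L1) distance: sum of absolute element-wise differences."""
    if len(v) != len(s):
        raise ValueError("v and s must contain one value per ligand")
    return sum(abs(v_i - s_i) for v_i, s_i in zip(v, s))


# Toy class values for four ligands (not data from the study).
in_vitro_classes = [2, 1, 0, 2]
in_silico_classes = [2, 0, 0, 1]
toy_cbd = city_block_distance(in_vitro_classes, in_silico_classes)
```

A CBD of 0 means that every ligand received the same class in vitro and in silico; larger values indicate more, or larger, disagreements.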
To organize the results of both the in vitro and the in silico screenings, they were first classified empirically (Fig. 2) according to the following key:
for in vitro values: green for Ki values below 1000 nM, yellow for Ki values higher than 1000 nM but still at a measurable level, red for no detectable binding;
for in silico values: green for the first 20% of the obtained docking score range, yellow for the next 20% of the range, red for the remaining 60% of the range (preliminary partitioning).
CBD values (0–2) were assigned to the colors as follows: 0 for red, 1 for yellow and 2 for green (CBD calculation run 1; CR-1). The division of the in vitro data remained unchanged in all further calculations. However, it has been shown that the position of the ligand pose closest to the native pose is distributed rather randomly among all generated poses when these are ordered by docking score [37]. Hence, the docking results were rescored using DSX-Online and the color scheme was adapted as follows: green < −100, −100 < yellow < −90, red > −90 (Fig. 2; CR-2).
The next step in evaluating the proposed platform was re-docking of the whole set of ligands to all four adenosine A1 receptor models, using ZINC12533962 as the reference ligand instead of ZM241385, in order to obtain a fair comparison for the docking procedure. This was done by overlaying the obtained A3AR homology model onto the backbone of the A1AR models and preserving the coordinates of ZINC12533962. Using the same data partitioning as in the preliminary calculations, a CBD value was calculated (CR-3). Rescoring was performed as described above (CR-4).
To eliminate potential boundary effects arising from the in vitro/in silico data partitioning, the system was changed to a binary distribution (active or inactive for the biological target; CBD = 1 for previous greens and yellows, 0 for reds) and the values were recalculated (CR-5 to CR-8, corresponding to CR-1 to CR-4, respectively; Fig. 4). Moreover, to determine the relative CBD value (CBDrel = CBD/CBDmax), the maximal possible CBD (CBDmax) was calculated for each distribution. Because CBD cannot exceed CBDmax, CBDrel ranges from 0 to 1, and lower values indicate closer agreement between the in silico and in vitro classifications.
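The partitioning and the relative CBD can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used: all function names are hypothetical; the treatment of boundary values (Ki exactly 1000 nM, rescoring values of exactly −100 or −90) is not specified in the text and is chosen arbitrarily here; the "first 20%" of the docking score range is assumed to start at the most favorable (lowest) score; and CBDmax is assumed to be the largest distance reachable from the given in vitro classes.

```python
RED, YELLOW, GREEN = 0, 1, 2


def classify_in_vitro(ki_nm):
    """Traffic-light class from Ki in nM; None means no detectable binding."""
    if ki_nm is None:
        return RED
    # Assumption: a Ki of exactly 1000 nM is treated as yellow.
    return GREEN if ki_nm < 1000 else YELLOW


def classify_by_score_range(score, best, worst):
    """Preliminary partitioning: first 20% of the score range is green,
    the next 20% yellow, the remaining 60% red (lower score = better, assumed)."""
    if worst <= best:
        raise ValueError("worst must be larger than best")
    fraction = (score - best) / (worst - best)
    if fraction <= 0.2:
        return GREEN
    if fraction <= 0.4:
        return YELLOW
    return RED


def classify_rescore(rescore_value):
    """Adapted scheme for rescored poses: green < -100, yellow < -90, else red."""
    if rescore_value < -100:
        return GREEN
    if rescore_value < -90:
        return YELLOW
    return RED


def to_binary(level):
    """Binary distribution: previous greens and yellows become 1, reds stay 0."""
    return 1 if level in (GREEN, YELLOW) else 0


def cbd(v, s):
    """City Block Distance between two equally long class vectors."""
    return sum(abs(v_i - s_i) for v_i, s_i in zip(v, s))


def cbd_max(v, top):
    """Assumed CBDmax: each ligand's in silico class as far as possible
    from its in vitro class (classes run from 0 to top)."""
    return sum(max(v_i, top - v_i) for v_i in v)


def cbd_rel(v, s, top):
    """Relative CBD, between 0 (full agreement) and 1 (maximal disagreement)."""
    return cbd(v, s) / cbd_max(v, top)
```

With three classes, `top` is 2; after conversion with `to_binary`, `top` is 1.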
Method validation
To test the usability and versatility of the described method, a library of 88 selective ligands previously described by Katritch et al. [38] was used. This ligand library was prepared according to the procedure described above for the “binders/decoys” sets and docked to all receptors using the HYBRID module. Likewise, the data partitioning system described in the “In silico screening evaluation” subsection was applied. However, owing to the high affinity of these ligands, a second partitioning system was also applied.
Detailed information on the set used, along with the partitioning systems applied, can be found in the Supplementary Material.