Starting from the two CYP 2D6 structures of Hritz et al. [10] used previously [2], MD simulations and LIE calculations were set up and performed according to the settings described in the Methods section. Compared to the setup used in reference [2], this setup is optimized to include a larger number of ligands and of starting poses per ligand (up to eight per protein conformation in the current work). Hence, a correspondingly larger number of simulations enters Eq. 3, which allows the potential of the filtering methods presented here to be explored thoroughly.
To study the efficiency and accuracy of the proposed filtering method, the minimal length L of the time window used to calculate average interaction energies was varied, while values for the gradient cut-off and noise-filtering frequency were maintained identical in all models. The filtered (thermal) noise level and gradient cut-off were optimized here based on visual inspection of the results of the fitting for a small set of selected energy trajectories. In further studies, a way to optimize values for the filtering settings through training, and the subsequent effect on the accuracy and efficiency of the resulting LIE models, could be investigated.
LIE models calibrated using average interaction energies that were obtained by successively following steps (1), (2), and (3) as described in the Methods section (with L=200, 400, or 600 ps) are presented as ‘filtered’ models in Tables 1 to 3. Root-mean-square errors (RMSEs) and standard deviations of prediction errors (SDEPs) are shown in Table 1, and α and β values in Table 2. SDEP values were calculated from a leave-one-out cross-validation test. The last three columns in Table S1 of the Supplementary Material show that, for most compounds, α, β and the RMSE of the filtered model with L set to 200 ps do not change substantially when single compounds are left out of the training set.
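The calibration and leave-one-out test described above can be illustrated with a minimal sketch. It assumes a simplified two-parameter linear form, ΔG = α·ΔV(vdW) + β·ΔV(el), with one pair of energy differences per compound; it does not reproduce Eq. 3, which combines several simulations i per compound. The function names and the definition of SDEP as the root-mean-square of the leave-one-out prediction errors are assumptions made for illustration.

```python
import numpy as np

def fit_lie(d_vdw, d_el, dg_exp):
    """Least-squares fit of dG = alpha * d_vdw + beta * d_el (no intercept).

    Simplified stand-in for the calibration step; not the paper's Eq. 3.
    """
    X = np.column_stack([d_vdw, d_el])
    coef, *_ = np.linalg.lstsq(X, np.asarray(dg_exp, dtype=float), rcond=None)
    return coef  # array([alpha, beta])

def rmse(pred, obs):
    """Root-mean-square error between predicted and observed values."""
    diff = np.asarray(pred, dtype=float) - np.asarray(obs, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def loo_sdep(d_vdw, d_el, dg_exp):
    """SDEP from leave-one-out cross-validation (assumed definition:
    RMS of the errors obtained when each compound is predicted by a
    model trained on all other compounds)."""
    d_vdw, d_el, dg_exp = (np.asarray(a, dtype=float) for a in (d_vdw, d_el, dg_exp))
    n = len(dg_exp)
    preds = np.empty(n)
    for k in range(n):
        keep = np.arange(n) != k          # leave compound k out
        alpha, beta = fit_lie(d_vdw[keep], d_el[keep], dg_exp[keep])
        preds[k] = alpha * d_vdw[k] + beta * d_el[k]
    return rmse(preds, dg_exp)
```

For a linear least-squares model, the leave-one-out SDEP is never smaller than the RMSE of the full fit, which is why both quantities are informative when reported side by side.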
Table 1 Root-mean-square error (RMSE) and standard deviation in error prediction (SDEP) values for LIE models with \(\left \langle V^{el}_{lig-surr} \right \rangle _{bound,i}\)’s and \(\left \langle V^{vdW}_{lig-surr} \right \rangle _{bound,i}\)’s in Eq. 3 averaged over various time spans of simulations i
Table 2 α and β values for LIE models with \(\left \langle V^{el}_{lig-surr} \right \rangle _{bound,i}\)’s and \(\left \langle V^{vdW}_{lig-surr} \right \rangle _{bound,i}\)’s in Eq. 3 averaged over various time spans of simulations i
Table 3 Average time per simulation i in Eq. 3 (sim.) needed to calibrate the models reported in Tables 1 and 2
The properties of the filtered models were compared with LIE models calibrated using interaction energies averaged over the first 200, 400, and 600 ps of each simulation (referred to as ‘unfiltered’ models in Tables 1 to 3). As a reference, an LIE model is also presented that uses interaction energies averaged over 1000 ps of the individual production simulations (hereafter referred to as the ‘ns’ model, last column in Tables 1 to 3). Note that the RMSE, SDEP, α and β values for this model differ from those of the model presented in reference [2], due to differences in the docking and clustering algorithms used, in the set of training compounds used, and in the force field employed during MD simulations.
In addition to the filtered models, LIE models were calibrated in which step (2) of the protocol in the Methods section was omitted and interaction energies were averaged over the full time window selected under step (1) (or (3)) of the protocol. In the typical example displayed in Fig. 1, this means that for L set to 400 ps, interaction energy averages are taken over the time span ranging from 500 ps to 1000 ps (instead of from 500 ps to 900 ps). In Tables 1 to 3, the models that make use of the extended time window are referred to as ‘filter+ext’.
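The difference between the ‘filtered’ and ‘filter+ext’ averages can be sketched as follows, assuming that the selected window is already known (here given by hypothetical `win_start` and `win_end` arguments) and that window boundaries are inclusive. The window selection itself (steps (1) to (3) of the protocol) is not reproduced; the function names are illustrative only.

```python
import numpy as np

def window_average(times_ps, energies, t_start, t_end):
    """Mean energy over frames with t_start <= t <= t_end (inclusive, assumed)."""
    times_ps = np.asarray(times_ps, dtype=float)
    energies = np.asarray(energies, dtype=float)
    mask = (times_ps >= t_start) & (times_ps <= t_end)
    if not mask.any():
        raise ValueError("no frames fall inside the requested window")
    return float(energies[mask].mean())

def filtered_and_extended(times_ps, energies, win_start, win_end, L):
    """Return (filtered, filter+ext) averages for one energy trajectory.

    filtered   : average over the first L ps of the selected window
    filter+ext : average over the full selected window
    """
    filtered = window_average(times_ps, energies, win_start, win_start + L)
    extended = window_average(times_ps, energies, win_start, win_end)
    return filtered, extended
```

With the values of the Fig. 1 example (`win_start=500`, `win_end=1000`, `L=400`), the first average covers 500 to 900 ps and the second 500 to 1000 ps.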
The first row of Table 1 shows that for the unfiltered LIE models, longer sampling times lead to slightly more accurate predictions. With increasing simulation time, RMSE and SDEP values decrease, but the increase in accuracy is limited (with a maximum decrease in RMSE and SDEP values of less than 0.4 kJ mol\(^{-1}\)). For the filtered and ‘filter+ext’ models (Table 1), the RMSE and SDEP also decrease slightly or remain similar with increasing simulation time. Models calibrated using filtered energy trajectories to calculate average interaction energies perform at least as well as the ns model. This is demonstrated not only by the differences in RMSE and SDEP values (Table 1) but also by the correlations between experimental and calculated \(\Delta G_{bind}\) values (cf. Fig. 2 and Table S1 of the Supplementary Material, which reports individual values for and errors in the calculated \(\Delta G_{bind}\) values). This indicates that, as expected, the noise due to possible conformational changes during simulation is reduced. In general, the filtering has a positive impact on the affinity prediction for individual compounds, as illustrated in Fig. 2 for the filtered model with L=200 ps. Although predictions for some ligands become less accurate when the ns model is recalibrated after filtering the energy trajectories, several ligands for which the prediction by the ns model deviates by more than 5 kJ mol\(^{-1}\) from experiment are predicted with increased accuracy in the filtered models.
Table 2 shows that in terms of α and β values, the filtered models are similar to the unfiltered ones: the filtered models have α and β values within 1–2 % of those of the ns model. In addition, the similarity of the three filtered models (Table 2) indicates that their α and β values are less sensitive to the length of the simulations used than those of the unfiltered models. The similarity between the filtered and ‘filter+ext’ models also shows that, once a time window is selected, the length L (i.e., the length of local sampling) has limited influence on model calibration. Upon filtering, only windows are used during which interaction energies fluctuate (thermally) around a relatively constant value. Therefore, the average values for the energies are decoupled from the degree of sampling during the individual simulations i in Eq. 3. In conclusion, filtering allows shorter simulations to be used to calibrate iterative LIE models without negatively influencing the quality of the model.
To evaluate the gain in computational efficiency achieved with our filtering approach, the simulation times needed to develop the models are summarized in Table 3. For every simulation i in Eq. 3, the average simulation time needed to evaluate \(\left \langle V^{el}_{lig-surr} \right \rangle _{bound,i}\) and \(\left \langle V^{vdW}_{lig-surr} \right \rangle _{bound,i}\) includes the time before reaching the time span (over which interactions are averaged) and the time span itself. In practice, when no window with a length ≥L can be selected (step (3) in the protocol in the Methods section), it is only possible to conclude that all time windows are shorter than L once the time of simulation i reaches 1000 ps. For this reason, Table 3 also reports corrected average simulation times (corr.) that include the full length of the simulation in those cases. The corrected time is representative of the average over individual simulation times needed to train the filtered (or ‘filter+ext’) models. According to Table 3, the average simulation time needed to calibrate the filtered model with L=200 ps is only 28 % of the time needed for the ns model, corresponding to a gain in efficiency of 72 %. For L=400 ps, a gain of 34 % was obtained. Note that for the system studied here, this corresponds to a reduction of 300 and 100 ns in total simulation time, respectively. Looking in detail at the number of simulations for which a window of at least length L was found according to the protocol in the Methods section, our data show that for more than 90 % of the individual simulations a time span with L=200 ps could be found. For 57 % and 37 % of the simulations, a time span of 400 ps or 600 ps was found, respectively. This correlates with the probability of finding a time span of a given length within a simulation of fixed length, under the assumption of random transition frequencies and occurrences.
In addition, it demonstrates that for the CYP 2D6-aryloxypropanolamine system, especially for relatively small L (200 ps), a significant gain in computational efficiency could be obtained.
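The time accounting behind the uncorrected and corrected averages can be sketched as below. The inputs are hypothetical: `span_end_ps` holds, per simulation, the time at which the averaging span ends (time before the span plus the span itself), and `window_found` flags whether a window of length ≥L was found. Expressing the efficiency gain relative to the 1000 ps of the ns model is an assumption consistent with the 28 % and 72 % figures quoted above.

```python
def average_times(span_end_ps, window_found, total_ps=1000.0):
    """Return (uncorrected, corrected) average simulation time in ps.

    Corrected: simulations without a window of length >= L are charged
    the full simulation length, since the absence of such a window is
    only known once the simulation has finished.
    """
    if not span_end_ps or len(span_end_ps) != len(window_found):
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(span_end_ps)
    uncorrected = sum(span_end_ps) / n
    corrected = sum(
        end if found else total_ps
        for end, found in zip(span_end_ps, window_found)
    ) / n
    return uncorrected, corrected

def efficiency_gain(avg_time_ps, reference_ps=1000.0):
    """Fraction of simulation time saved relative to the reference length."""
    return 1.0 - avg_time_ps / reference_ps
```

For example, an average of 280 ps per simulation relative to a 1000 ps reference gives `efficiency_gain(280.0)` of 0.72, i.e. the 72 % gain that corresponds to using 28 % of the reference time.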