Analysis of available protein crystal structures
As a first step, the Spindlin1 crystal structures deposited in the Protein Data Bank (PDB) were analyzed to investigate the conformational flexibility of the binding pocket residues. Attention was given to the second domain and specifically to the aromatic cage residues (Phe141, Trp151, Tyr170, Tyr177), as this cage is responsible for the recognition of the trimethylated lysine and of mimetic moieties such as the positively charged pyrrolidine. The binding sites of the PDB structures were aligned and the protein residues were colored by PDB B-factor. Several X-ray structures are depicted in Fig. 2 as examples of the different binding pocket conformations that were observed (PDB IDs: 2NS2, 4H75, 4MZF, 5Y5W, 5JSG, 5JSJ, 6I8Y, 6QPL [7, 8, 10, 11, 12, 13, 34]). The analysis highlighted that, among the four residues of the aromatic cage, Phe141 and Trp151 show higher temperature factor values, reflecting greater uncertainty in the positions of their atoms in the protein crystal structures and hence indicating a higher degree of flexibility of these two amino acids. The side chain of Phe141 can adopt two different orientations, which lead to two different shapes of the aromatic cage: either a closed cage (Fig. 2, PDB IDs: 2NS2, 4H75, 4MZF, 4MZH, 5JSG, 5Y5W) or an open cage (Fig. 2, PDB IDs: 5JSJ, 6QPL, 6I8Y). The open conformation is observed only in ligand-bound forms; the one exception among the ligand-bound structures is the crystal structure of Spindlin1 with the bivalent inhibitor EML405 (PDB ID: 5JSG), where Phe141 shows a closed conformation. This suggests that ligands can generally induce the flip of the Phe141 side chain. In contrast, in the apo-form and peptide-bound crystal structures, the side chain of Phe141 adopts the closed conformation. Moreover, Trp151 displays slightly different orientations among the crystal structures to better interact with the positively charged moieties of the co-crystallized ligand or peptide.
Additionally, the B-factor values indicated that, even in the presence of a ligand or peptide, Phe141 and Trp151 can have poorly defined electron density, so their positions are less clearly determined.
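The per-residue comparison of temperature factors described above can be illustrated with a minimal sketch. It assumes standard fixed-column PDB ATOM records (residue number in columns 23-26, B-factor in columns 61-66) and a single chain. The helper `atom_line` and all B-factor values in the demo are invented for illustration and are not taken from any deposited Spindlin1 structure.

```python
from collections import defaultdict

def atom_line(serial, name, res_name, chain, res_seq, b_factor):
    """Build a fixed-column PDB ATOM record (hypothetical helper; coordinates set to zero)."""
    return (f"ATOM  {serial:5d} {name:<4s} {res_name:>3s} {chain}{res_seq:4d}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{b_factor:6.2f}")

def mean_b_factor_per_residue(pdb_lines, residue_numbers):
    """Average the B-factor over all ATOM records of each selected residue number."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        res_seq = int(line[22:26])          # columns 23-26: residue sequence number
        if res_seq in residue_numbers:
            totals[res_seq] += float(line[60:66])  # columns 61-66: temperature factor
            counts[res_seq] += 1
    return {res: totals[res] / counts[res] for res in totals}

# Invented demo records: two atoms each for Phe141 and Tyr170.
demo_lines = [
    atom_line(1, "CG", "PHE", "A", 141, 40.0),
    atom_line(2, "CZ", "PHE", "A", 141, 50.0),
    atom_line(3, "CG", "TYR", "A", 170, 20.0),
    atom_line(4, "CZ", "TYR", "A", 170, 22.0),
]
means = mean_b_factor_per_residue(demo_lines, {141, 151, 170, 177})
```

In a real comparison across structures, B-factors would usually be normalized per structure first, since absolute values depend on resolution and refinement.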
Molecular dynamics simulation of apo-form structure
To further evaluate the flexibility of the aromatic cage as well as of the other binding site residues, and to test whether the open conformation can be obtained starting from the closed conformation, the apo-form crystal structure (PDB ID: 2NS2) was subjected to a 50 ns MD simulation using the Desmond package [34, 35].
The root-mean-square deviation (RMSD) plots (Fig. 3a) showed that while the backbone atoms of the whole protein show a relatively high RMSD fluctuation of 2.5–4 Å, the second domain remains rather stable throughout the simulation (RMSD < 2.5 Å). We then focused our attention on the binding pocket and analyzed the root-mean-square fluctuation (RMSF) values of the heavy atoms of specific amino acids that constitute the pocket (Fig. 4). A closer look at those residues revealed that some amino acids are quite steady (His139, Tyr170, Tyr177, Tyr179). On the other hand, Phe141 and Trp151 show higher fluctuations (RMSF of 1.0 and 0.8 Å, respectively), in agreement with the B-factor analysis of the crystal structures. Nevertheless, the deviations of the latter two residues are still small. Of note, the high RMSF values of Asp184 can be attributed to the flip of its carboxyl group.
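For reference, the RMSF of atom i is the square root of the time-averaged squared displacement of that atom from its mean position. The following is a minimal sketch of this definition, assuming the trajectory frames have already been superimposed on a common reference. The function name and the two-atom demo trajectory are illustrative assumptions and do not represent the Desmond analysis tools used in this work.

```python
import math

def rmsf_per_atom(frames):
    """Return the RMSF of each atom.

    frames: list of frames, each a list of (x, y, z) tuples in the same atom order.
    The frames are assumed to be pre-aligned; no superposition is done here.
    """
    n_frames = len(frames)
    n_atoms = len(frames[0])
    result = []
    for i in range(n_atoms):
        # Mean position of atom i over the trajectory.
        mean = [sum(frame[i][k] for frame in frames) / n_frames for k in range(3)]
        # Mean squared displacement from that mean position.
        msf = sum(math.dist(frame[i], mean) ** 2 for frame in frames) / n_frames
        result.append(math.sqrt(msf))
    return result

# Invented demo: atom 0 never moves, atom 1 oscillates by 1 Å around the origin.
demo_frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
]
demo_rmsf = rmsf_per_atom(demo_frames)
```

A per-residue value, as plotted for the pocket residues, could then be obtained by averaging the RMSF over the heavy atoms of each residue.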
The stability of most of the binding site residues can be explained by the hydrogen bond network established in the pocket. Indeed, hydrogen bonds formed among the amino acids contribute to stabilizing their side chains. In Fig. 3c, the binding pocket residues and the hydrogen bonds observed in the crystal structure (PDB ID: 2NS2) are shown. To corroborate the assumption that the hydrogen bond network contributes to the stability of the binding pocket residues, we analyzed the occupancy of these interactions during the MD simulation; values are detailed in Fig. 3c. The majority of the hydrogen bonds are preserved during the simulation (occupancy rates greater than 94%) and can hence play a role in stabilizing some of the pocket residues. Only the interaction between Trp151 and Asp149 shows a low occupancy rate (15.1%). Given the nature of Phe, no hydrogen bond can be formed that could stabilize its side chain.
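The occupancy of an interaction is the percentage of trajectory frames in which it is present. The sketch below illustrates the idea under a simplifying assumption: a hydrogen bond is counted whenever the donor and acceptor heavy atoms are within a distance cutoff (3.5 Å here, an illustrative value). Geometric criteria used by MD analysis tools typically also include angle terms, so this is not the exact definition applied in the simulations. All coordinates are invented.

```python
import math

def hbond_occupancy(donor_xyz, acceptor_xyz, cutoff=3.5):
    """Percentage of frames with donor-acceptor heavy-atom distance <= cutoff (in Å).

    donor_xyz, acceptor_xyz: per-frame (x, y, z) positions of the two atoms.
    Distance-only criterion (assumption); angle criteria are ignored.
    """
    if len(donor_xyz) != len(acceptor_xyz) or not donor_xyz:
        raise ValueError("need the same non-zero number of frames for both atoms")
    hits = sum(1 for d, a in zip(donor_xyz, acceptor_xyz)
               if math.dist(d, a) <= cutoff)
    return 100.0 * hits / len(donor_xyz)

# Invented demo: the contact is present in three of four frames.
donor = [(0.0, 0.0, 0.0)] * 4
acceptor = [(2.8, 0.0, 0.0), (3.0, 0.0, 0.0), (4.2, 0.0, 0.0), (2.9, 0.0, 0.0)]
demo_occupancy = hbond_occupancy(donor, acceptor)
```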
Most interestingly, the open cage conformation was not observed at any point during the 50 ns MD simulation. Instead, pi-pi interactions between Phe141 and Trp151 are established, leading to a more closed pocket. In fact, after circa 3 ns, the side chain of Phe141 moves toward Trp151, and the orientation of these two aromatic residues is mostly stabilized by face-to-face pi-pi stacking interactions. Edge-to-face stacking interactions between Phe141 and Tyr170 are also observed during the simulation, but to a much lesser degree. Therefore, Phe141 does not flip during the simulation to generate the open cage conformation but rather moves deeper into the cage. Fig. 3d depicts the superimposition of the reference X-ray structure (PDB ID: 2NS2) and a representative frame of the new closed cage conformation. To check the different Phe141 orientations sampled during the simulation and to quantify their occupancy, the trajectory was clustered based on the RMSD of Phe141. The clustering analysis provided further evidence that the more closed conformation is predominant during the simulation, with an occupancy rate of 82.2%. A total of three clusters were obtained, which highlighted that Phe141 and Trp151 mainly move closer together as they undergo pi-pi stacking interactions. A representative frame for each cluster and their occupancy rates, as well as the reference X-ray structure, are reported in Figure S1.
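To make the notion of cluster occupancy concrete, the following sketch clusters frames by the RMSD of a selected set of atoms and reports the percentage of frames in each cluster. It uses a simple greedy "leader" scheme with an arbitrary RMSD threshold, which is an assumption made for illustration; the clustering algorithm and settings actually used for the trajectory may differ. Frames are assumed to be pre-aligned, and the coordinates are invented.

```python
import math

def rmsd(a, b):
    """RMSD between two flattened coordinate tuples (x1, y1, z1, x2, ...)."""
    return math.dist(a, b) / math.sqrt(len(a) / 3)

def leader_cluster(frames, threshold):
    """Greedy clustering: each frame joins the first cluster whose leader
    (founding frame) lies within `threshold` Å RMSD, else it founds a new cluster.
    Returns a list of clusters, each a list of frame indices."""
    leaders, members = [], []
    for idx, frame in enumerate(frames):
        for c, leader in enumerate(leaders):
            if rmsd(frame, leader) <= threshold:
                members[c].append(idx)
                break
        else:
            leaders.append(frame)
            members.append([idx])
    return members

def occupancies(members, n_frames):
    """Cluster occupancies in percent, most populated cluster first."""
    return sorted((100.0 * len(m) / n_frames for m in members), reverse=True)

# Invented demo: one atom, five frames, one outlier conformation.
demo_frames = [
    (0.0, 0.0, 0.0),
    (0.1, 0.0, 0.0),
    (3.0, 0.0, 0.0),
    (0.2, 0.0, 0.0),
    (0.0, 0.1, 0.0),
]
demo_members = leader_cluster(demo_frames, threshold=1.0)
demo_occupancies = occupancies(demo_members, len(demo_frames))
```

Leader clustering depends on frame order and on the threshold, so the number of clusters it returns should be read as a function of those choices rather than as an intrinsic property of the trajectory.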
We carried out a second, extended MD simulation (500 ns) to test whether other aromatic cage conformations can be observed over a longer simulation time. However, the analysis confirmed the same trend observed in the shorter simulation (50 ns). The protein backbone atoms are stable, with relatively higher fluctuation for the whole protein (RMSD 2.5–4.5 Å) and a rather stable second domain (RMSD < 2.7 Å) (Fig. 3b). Among the binding site residues, Phe141 and Trp151 still show the highest RMSF values (Figure S2b). Clustering of the trajectory based on the RMSD of Phe141 resulted in a greater number of clusters (16) than in the shorter simulation. Nevertheless, in the vast majority of the clusters, Phe141 and Trp151 still exhibit face-to-face pi-pi stacking. Figure S2d shows a representative frame for each of the four most populated clusters, while the occupancy values of all clusters are reported in Table S1. Only in two clusters does Phe141 display a different orientation; however, the aromatic cage is either closed or distorted. Indeed, in cluster number 8 (occupancy 4.9%), Phe141 is flipped, but it interacts with Trp151 by edge-to-face pi-pi stacking, leading to a different type of closed cage conformation in which the binding pocket is blocked. In cluster number 12 (occupancy 2.9%), by contrast, Trp151 is totally open, and no classical aromatic cage is observed. To numerically assess the difference between the obtained clusters and the ligand-bound open cage form (PDB ID: 6I8Y), the RMSD of the aromatic cage heavy atoms was computed. The values retrieved are in the range of 1.7–3.2 Å, highlighting that the pockets obtained from the MD simulation adopt a conformation different from that observed in the X-ray structure of the ligand-bound form.
To conclude, the MD simulation of the apo-form confirmed the stability of some binding pocket residues and the flexibility of others. However, it did not generate the open conformation observed in most ligand-bound structures, since Phe141 and Trp151 mostly interact with each other and move closer together during the simulation.
Docking and induced fit docking studies of A366
After investigating the pocket flexibility, we tested the ability of induced fit docking (IFD) to correctly reproduce the experimentally determined X-ray binding mode of A366 (PDB ID: 6I8Y), regardless of whether an open or a closed conformation was used as the starting point [11, 36, 37]. Three protein structures were used: two with Phe141 in the closed cage conformation (apo-form, PDB ID: 2NS2; peptide-bound form, PDB ID: 4H75) and one with the open cage (ligand-bound form, PDB ID: 6QPL) [7, 12, 34]. Docking studies using Glide SP (rigid-receptor docking, with the protein kept rigid in its original conformation) were also performed to highlight that, in some cases, this approach can fail if residues in the pocket exhibit flexibility upon ligand binding. In these situations, treating the protein as a rigid entity can be a limiting factor.
Not surprisingly, when A366 was docked into the closed cage conformation using Glide SP, its experimentally determined binding mode (as observed in PDB ID: 6I8Y) could not be reproduced. Instead, different binding hypotheses were obtained in which the pyrrolidine moiety is always embedded in the aromatic cage and undergoes cation-pi interactions, while the core is solvent-exposed. Additional interactions with distinct residues are formed depending on the orientation adopted by the ligand. As examples, the top ranked poses are illustrated in Figs. 5a and 5b: the amidine moiety interacts either with Asp95 (Fig. 5a) or with Asp149 and Glu142 (Fig. 5b). On the other hand, when A366 was docked into the open cage conformation (not its native crystal structure), the binding interactions and the X-ray binding mode were closely reproduced (RMSD of 0.30 Å, heavy atoms). Fig. 5c shows the top ranked docking pose superimposed on the X-ray ligand structure (PDB ID: 6I8Y). As in the crystal structure, salt bridge interactions between the amidine moiety and Asp184, the intramolecular hydrogen bond, and cation-pi interactions between the pyrrolidine moiety and the surrounding amino acids of the aromatic cage are established.
We next performed IFD of A366 in the three selected crystal structures: apo-form (PDB ID: 2NS2), peptide-bound form (PDB ID: 4H75) and ligand-bound form (PDB ID: 6QPL). Three different IFD settings were investigated with the aim of establishing a protocol that is relatively fast and efficient. Specifically, we started by treating the seven residues that constitute the pocket as flexible; we then tested the plasticity of the aromatic cage only (residues: Phe141, Trp151, Tyr170, Tyr177). Since our structural analysis and MD simulation results indicated that Phe141, Trp151 and Asp184 are the most flexible residues of the pocket, we also performed IFD in which only these three residues were treated as flexible.
All three IFD settings, applied to all three proteins, yielded docking poses that closely reproduce the binding interactions and the X-ray pose of A366 with low RMSD values (< 1.8 Å, heavy atoms). The top ranked poses retrieved when either seven or four amino acids were treated as flexible are reported in the Supporting Information (Figure S3). The poses obtained by treating three residues as flexible are shown in Fig. 6 and discussed below.
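The heavy-atom RMSD values quoted for docking poses compare the docked ligand with the crystallographic ligand in the frame of the receptor, without refitting the ligand onto the reference. A minimal sketch of this measure follows, assuming both structures list the same atoms in the same order and ignoring symmetry-equivalent atom mappings. The function name and the three-atom demo are hypothetical and do not reproduce the docking software's own RMSD routine.

```python
import math

def heavy_atom_rmsd(pose, reference):
    """In-place RMSD over non-hydrogen atoms.

    pose, reference: lists of (element, (x, y, z)) with identical atom order.
    No superposition is applied, so rigid shifts of the ligand count as error.
    """
    if len(pose) != len(reference):
        raise ValueError("pose and reference must contain the same atoms")
    squared = [
        math.dist(p_xyz, r_xyz) ** 2
        for (p_el, p_xyz), (r_el, r_xyz) in zip(pose, reference)
        if p_el != "H"                      # skip hydrogens
    ]
    return math.sqrt(sum(squared) / len(squared))

# Invented demo: one carbon displaced by 1 Å, one nitrogen in place,
# and a hydrogen far away that must not influence the result.
demo_pose = [("C", (1.0, 0.0, 0.0)), ("N", (0.0, 2.0, 0.0)), ("H", (5.0, 5.0, 5.0))]
demo_ref = [("C", (0.0, 0.0, 0.0)), ("N", (0.0, 2.0, 0.0)), ("H", (0.0, 0.0, 0.0))]
demo_rmsd = heavy_atom_rmsd(demo_pose, demo_ref)
```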
Interestingly, the flip of Phe141 was always induced by A366. When the apo-form was used as the starting conformation, a pose that overlaps almost perfectly with the experimentally determined binding mode was generated (RMSD of 0.61 Å, heavy atoms; Fig. 6a). Besides the salt bridge and cation-pi interactions, the intramolecular hydrogen bond between the NH+ of the positively charged pyrrolidine moiety and the methoxy group is also observed. We then tested the peptide-bound conformation as the starting point. The IFD protocol generated good results, with an RMSD of 1.45 Å (heavy atoms) for the top ranked pose (Fig. 6b). However, some deviations from the experimentally observed binding mode of A366 could be detected. The pyrrolidine moiety, although still placed in the aromatic cage, is more tilted; the linker group shows a more extended conformation; and the methoxy group is oriented differently. Consequently, the intramolecular hydrogen bond between the pyrrolidine NH+ and the methoxy group cannot be formed. Notably, the role of this intramolecular interaction has been investigated through the design and biological testing of A366 analogs that lack the intramolecular hydrogen bond and are no longer active (data not shown; to be published elsewhere). When IFD was applied to the open cage structure, the open conformation was maintained and the binding mode was reproduced as observed in the X-ray structure (RMSD of 0.48 Å, heavy atoms; Fig. 6c).
As described in the next section, the docking poses obtained by IFD were further investigated by running short MD simulations. We specifically wanted to investigate whether the IFD pose obtained in the peptide-bound structure (PDB ID: 4H75; Fig. 6b) could be optimized and stabilized into the experimentally determined binding mode by running a short MD simulation. Furthermore, the stability of the binding modes predicted in the apo-form (PDB ID: 2NS2) through rigid-receptor docking (Glide SP) and through IFD was also verified by means of MD simulations.
Analysis of the predicted binding modes by MD simulations
To verify the stability of the predicted binding modes obtained from rigid-receptor docking (Glide SP) and IFD, the resulting protein-ligand complexes were subjected to MD simulations using the Desmond package. Specifically, we wanted to investigate whether the binding modes were stable during the MD simulations and in line with the experimentally determined binding mode of A366. Moreover, since the pose obtained from 4H75 with IFD did not show the intramolecular hydrogen bond, we tested whether the binding pose could be optimized by running a short MD simulation. The docking results reported in Fig. 5a (A366-2NS2_Docking), Fig. 6a (A366-2NS2_IFD) and Fig. 6b (A366-4H75_IFD) were used as initial coordinates for the generation of the MD systems. The analysis of the simulations focused primarily on binding mode stability; thus, RMSD and RMSF values were calculated and plotted in Fig. 7 and Fig. 8, respectively.
The analysis of the MD simulation of rigid-body docking of A366 in the apo-form (A366-2NS2_Docking, Fig. 5a) highlighted that the binding mode predicted into the closed aromatic cage is highly unstable during the simulation (Fig. 7a). While the pyrrolidine moiety remains in the cage, the core, which is more solvent exposed, fluctuates and generates diverse binding modes (Fig. 7b). The RMSD values are indeed very high (Fig. 7a) as well as the RMSF of the majority of the ligand atoms (Fig. 8). Phe141 does not flip during the simulation and the experimentally determined pose of A366 is not reproduced.
In contrast, the IFD pose of A366 in the apo-form structure (A366-2NS2_IFD), whose binding mode closely reproduces the experimentally determined X-ray conformation of A366 (Fig. 6a), is highly stable during the MD simulation, with the initial pose being maintained throughout the simulation time (Fig. 7c and 7d). The intramolecular hydrogen bond is preserved (occupancy rate of 78.8%), and only marginal fluctuations of the ligand atoms are detected (Fig. 8).
Finally, the MD simulation performed for the IFD pose of A366 in the peptide-bound structure (A366-4H75_IFD, Fig. 6b) showed that the ligand quickly adopts the binding conformation observed in the A366 crystal structure (Fig. 7f). The pyrrolidine moiety and the methoxy group reorient to form the intramolecular hydrogen bond, which is then conserved during the simulation (occupancy rate of 73.6%). The initial IFD binding pose is thus optimized, and minimal fluctuations of the ligand atoms are observed (Fig. 8).
Hence, these results demonstrate that either IFD alone or IFD combined with a short MD simulation can be used to reproduce the experimentally determined binding mode of A366 starting from closed aromatic cage conformations.