Introduction

The Eps15 homology domain (EH domain), a highly conserved region of about 100 residues, is found in organisms ranging from yeast to mammals (Santolini et al., 1999; Miliaras and Wendland, 2004). Most EH domains are located at the N-terminus of Eps15 and Eps15-related proteins that are involved in internalization events (Fazioli et al., 1993; Benmerah et al., 1995; 1998; Wong et al., 1995; Carbone et al., 1997; Delft et al., 1997; Polo et al., 2003). The EH domain is also found at the C-terminus of certain proteins that regulate several critical endocytic events, such as the internalization and recycling of various receptors (Naslavsky and Caplan, 2005; Grant and Caplan, 2008). EHD1, one of the four highly homologous C-terminal EH domain-containing paralogs expressed in mammalian cells, is well characterized. It controls the recycling of various receptors from the endocytic recycling compartment (ERC) to the plasma membrane, including the transferrin receptor (TfR) (Lin et al., 2001), major histocompatibility complex (MHC) class I proteins (Caplan et al., 2002), and β-integrins (Jovic et al., 2007). Derailed internalization and recycling routes have been reported to be features of cancer (Mosesson et al., 2008). For example, an increased recycling of β-integrins from the ERC to the plasma membrane, which is regulated by EHD1, has been observed in motile cancer cells (Caswell and Norman, 2008; Mosesson et al., 2008). Aberrant expression of EHD1 has also been observed in many other human diseases (Maher et al., 2001; Galindo et al., 2003; Hansel et al., 2004; Cortez et al., 2006; Shin et al., 2007; Ammann and Goodman, 2009; Jansen et al., 2009; Tripathi et al., 2009; Dervan et al., 2010; Naslavsky and Caplan, 2011). EHD1 and other proteins involved in endocytic events are thus becoming potential cancer targets for peptide inhibitor design. 
Recently, several cyclic peptides have been designed with higher affinities toward the EHD1 EH domain (Kamens et al., 2014).

The main binding partner of the C-terminal EH domain is the NPF (asparagine-proline-phenylalanine) motif (Guilherme et al., 2004; Naslavsky et al., 2004; 2006; Smith et al., 2004; Xu et al., 2004; Braun et al., 2005; Shi et al., 2007; Doherty et al., 2008; Sharma et al., 2009). Based on the crystal structure of mouse EHD2, an internal GPF (glycine-proline-phenylalanine) motif that was predicted to bind to the C-terminal EH domain of the opposing dimeric EHD2 was identified (Daumke et al., 2007). Kieken et al. (2009) first studied the structure of the EHD1 EH domain-DPF (aspartic acid-proline-phenylalanine) motif and quantitatively compared the binding ability of the EHD1 EH domain to these three peptide motifs (NPF, DPF, and GPF). However, the molecular mechanisms of different interactions of the three peptide motifs with the EHD1 EH domain remain unclear. Elucidation of different binding affinities of the three peptide motifs in terms of structures and energies is of significance for the better design of future peptide inhibitors.

In the present study, molecular dynamics (MD) simulations for the three EHD1 EH domain/peptide complexes were conducted for the comprehensive comparison of structural and energetic terms related to the different binding affinities. Alanine scanning for the ensemble of the nuclear magnetic resonance (NMR) structure of each complex was performed to identify binding hot spot residues. The results can provide essential information for the rational design of functional peptide inhibitors and provide some guidance for relevant inhibitor design for other proteins.

Materials and methods

Molecular dynamics simulation

Four EHD1 EH domain-peptide complexes (PDB IDs: 2KFF, 2KFG, 2KFH, and 2KSP) are available in the Protein Data Bank (PDB) (Rose et al., 2013). The protein is the same in all four complexes. The peptides of 2KFF, 2KFG, and 2KFH contain the NPF, DPF, and GPF motifs, respectively, and have exactly the same flanking residues (FNYESTNPFTAK, FNYESTDPFTAK, and FNYESTGPFTAK). The peptide of 2KSP contains the NPF motif but has entirely different flanking residues (LESKPYNPFEEEEED). To keep the comparison as controlled as possible, we chose 2KFF, 2KFG, and 2KFH, which differ least from one another. The tertiary structures of these three complexes were obtained from the PDB. To make the peptide motif explicit, the complexes are referred to below as 2KFFNPF, 2KFGDPF, and 2KFHGPF rather than by their PDB IDs alone. For each complex, the first model of the NMR ensemble was used as the initial coordinates for the MD simulation, and the Ca2+ ion present in the structure was retained.
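The statement that the three chosen peptides differ only minimally can be checked directly from the sequences. The following minimal Python sketch aligns the three 12-residue peptides position by position and reports where they differ. The numbering from 143 to 154 is taken from the figure legends of this article; the function names are illustrative assumptions.

```python
# Peptide sequences as listed in the text for the three chosen complexes.
PEPTIDES = {
    "2KFF": "FNYESTNPFTAK",  # NPF motif
    "2KFG": "FNYESTDPFTAK",  # DPF motif
    "2KFH": "FNYESTGPFTAK",  # GPF motif
}
FIRST_RESIDUE = 143  # peptide residues are numbered 143-154 in the figure legends


def numbered(sequence, start=FIRST_RESIDUE):
    """Map residue number -> one-letter amino acid code."""
    return {start + i: aa for i, aa in enumerate(sequence)}


def differing_positions(sequences, start=FIRST_RESIDUE):
    """Residue numbers at which equal-length sequences are not all identical."""
    columns = zip(*sequences)
    return [start + i for i, column in enumerate(columns) if len(set(column)) > 1]


variable_sites = differing_positions(list(PEPTIDES.values()))
npf_numbering = numbered(PEPTIDES["2KFF"])

print(variable_sites)                                   # [149]
print(npf_numbering[145], npf_numbering[150], npf_numbering[151])  # Y P F
```

Under this numbering the only variable site is residue 149, and Tyr145, Pro150, and Phe151 fall where the later sections place them.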

MD simulations were performed using the GROMACS package (ver. 4.5.4) (Hess et al., 2008). The Amber03 force field (Duan et al., 2003) and the TIP3P water model (Jorgensen et al., 1983) were used for all simulations. Each complex was placed in the center of a dodecahedron box with a distance of 12 Å between the solute and the box edge. Na+ counterions were added to neutralize the system. The system was energy-minimized with the steepest-descent method, using a maximum force of 100 kJ/(mol·nm) on any atom as the convergence criterion. The solvated system was then equilibrated in two steps. First, it was equilibrated for 1 ns in the constant-volume ensemble (NVT) with harmonic position restraints applied to the heavy atoms of the solute. Second, it was equilibrated for another 1 ns in the constant-pressure ensemble (NPT) without any restraint. The production simulation was run for 60 ns in the NPT ensemble. All bonds containing hydrogen atoms were constrained using the default linear constraint solver (LINCS) algorithm (Hess et al., 1997). The coupling algorithm of Berendsen et al. (1984) was used to maintain the temperature (300 K) and pressure (1 atm, 101 325 Pa), with coupling time constants of 0.1 and 1.0 ps, respectively. Electrostatic interactions were treated with the particle-mesh-Ewald (PME) method (Darden et al., 1993; Essmann et al., 1995). A cutoff of 14 Å was applied in the calculation of the van der Waals interactions. Periodic boundary conditions were used in all three directions. The time step was 2 fs and a snapshot was saved every 1 ps.
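The protocol above fixes several derived quantities, such as the number of integration steps and the number of saved snapshots. The short Python sketch below works them out from the values quoted in the text. It is plain arithmetic, not a GROMACS input file, and the variable names are illustrative assumptions.

```python
# Values quoted in the simulation protocol; times are kept in femtoseconds
# so that all step counts come out of exact integer arithmetic.
FS_PER_PS = 1_000
FS_PER_NS = 1_000_000

time_step_fs = 2
production_fs = 60 * FS_PER_NS
save_interval_fs = 1 * FS_PER_PS
equilibration_fs = (1 + 1) * FS_PER_NS  # 1 ns NVT followed by 1 ns NPT

production_steps = production_fs // time_step_fs         # integration steps
steps_between_frames = save_interval_fs // time_step_fs  # output stride in steps
saved_frames = production_fs // save_interval_fs         # snapshots written
equilibration_steps = equilibration_fs // time_step_fs

# The text gives lengths in angstroms, whereas GROMACS works in nanometres.
ANGSTROM_TO_NM = 0.1
box_margin_nm = 12 * ANGSTROM_TO_NM
vdw_cutoff_nm = 14 * ANGSTROM_TO_NM

print(production_steps, steps_between_frames, saved_frames)  # 30000000 500 60000
```

The 60 ns production run therefore corresponds to 30 million steps and 60 000 saved snapshots, with the box margin and the van der Waals cutoff equal to 1.2 and 1.4 nm.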

Binding free energy calculation

Two thousand snapshots, taken every 10 ps from the 40–60 ns segment of the GROMACS trajectory, were converted to a trajectory format readable by AMBER using VMD 1.9.1 (Humphrey et al., 1996). For each complex, the molecular mechanics/generalized Born surface area (MM/GBSA) method (Kollman et al., 2000) implemented in AmberTools 13 (Case et al., 2012) was applied to all 2000 snapshots to calculate and decompose the binding free energy:

$$\Delta {G_{{\rm{binding}}}} = {G_{{\rm{complex}}}} - ({G_{{\rm{protein}}}} + {G_{{\rm{peptide}}}}),$$
(1)

where Gcomplex, Gprotein, and Gpeptide are the absolute free energies of the complex, protein, and peptide, respectively. The protein and peptide trajectories used to calculate Gprotein and Gpeptide were extracted directly from the trajectory of the complex (the single-trajectory method). Each absolute free energy was estimated as the sum of the gas phase free energy (Egas), the solvation free energy (Gsolv), and the entropy term (−TS; T is temperature and S is entropy):

$$G = {E_{{\rm{gas}}}} + {G_{{\rm{solv}}}} - TS,$$
(2)

where Egas is the sum of the internal energy (Eint), the electrostatic energy (Eele), and the van der Waals energy (Evan):

$${E_{{\rm{gas}}}} = {E_{{\rm{int}}}} + {E_{{\rm{ele}}}} + {E_{{\rm{van}}}}.$$
(3)

The solvation free energy (Gsolv) is the sum of the polar free energy and the nonpolar free energy:

$${G_{{\rm{solv}}}} = {G_{{\rm{polar}}}} + {G_{{\rm{nonpolar}}}},$$
(4)

where the polar free energy was estimated using the GB model (IGB=5) with dielectric constants of 1 for the solute and 80 for water, while the nonpolar free energy was computed from Eq. (5), in which SASA is the solvent-accessible surface area calculated using the linear combination of pairwise overlaps (LCPO) model (Weiser et al., 1999) and γ, which has the dimension of surface tension, was set to 0.0072 kcal/(mol·Å²) (default unit; 1 kcal=4.184 kJ):

$${G_{{\rm{nonpolar}}}} = \gamma \cdot {\rm{SASA}}.$$
(5)

As entropy calculation using normal mode analysis (NMA) is very expensive for large systems and the three complexes are highly similar systems, the entropic contribution (−TS) to the absolute free energy was ignored just as reported in other literature (Wang and Kollman, 2001; Lindahl et al., 2006; Zhang et al., 2009; Lu et al., 2011). However, in order to verify the accuracy of our calculation through direct comparison with the experimental value, we calculated the entropy term (−TS) of 2KFFNPF using NMA with 40 snapshots (collected once for every 50 snapshots). The translational entropy, rotational entropy, and vibrational entropy constitute the entropic contribution to the free energy. Thus, the binding free energy is estimated by Eq. (6) below:

$$\Delta {G_{{\rm{binding}}}} = \Delta {E_{{\rm{gas}}}} + \Delta {G_{{\rm{solv}}}} - T\Delta S,$$
(6)

where,

$$\Delta {E_{{\rm{gas}}}} = \Delta {E_{{\rm{ele}}}} + \Delta {E_{{\rm{van}}}},$$
(7)

as ∆Eint equals zero in a single trajectory method. ∆Eele and ∆Evan are the electrostatic interaction energy and the van der Waals interaction energy between the protein and the peptide, respectively.

$$\Delta {G_{{\rm{solv}}}} = \Delta {G_{{\rm{polar}}}} + \Delta {G_{{\rm{nonpolar}}}},$$
(8)

where ∆Gpolar and ∆Gnonpolar are the polar solvation free energy and the nonpolar solvation free energy during the binding process of the protein and the peptide, respectively.
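As a minimal sketch of how Eqs. (5)–(8) fit together and how the snapshots are selected, the Python fragment below assembles a binding free energy from its components and reproduces the snapshot counts given in the text. The function names are illustrative and are not part of AmberTools. Treating the 40–60 ns window as half-open is an assumption made so that the count comes out at exactly 2000.

```python
GAMMA = 0.0072      # kcal/(mol·Å²), surface-tension coefficient quoted in the text
KCAL_TO_KJ = 4.184  # 1 kcal = 4.184 kJ


def nonpolar_term(sasa):
    """Eq. (5): nonpolar solvation term (kcal/mol) from a SASA value or SASA change in Å²."""
    return GAMMA * sasa


def binding_free_energy(d_ele, d_vdw, d_polar, d_nonpolar, minus_t_ds=0.0):
    """Eqs. (6)-(8), all terms in kcal/mol; minus_t_ds is the -TΔS term."""
    d_gas = d_ele + d_vdw               # Eq. (7); ΔEint = 0 with a single trajectory
    d_solv = d_polar + d_nonpolar       # Eq. (8)
    return d_gas + d_solv + minus_t_ds  # Eq. (6)


def snapshot_times(start_ps, stop_ps, stride_ps):
    """Snapshot times in ps over the half-open window [start_ps, stop_ps)."""
    return list(range(start_ps, stop_ps, stride_ps))


mmgbsa_frames = snapshot_times(40_000, 60_000, 10)  # 40-60 ns, one snapshot per 10 ps
nma_frames = mmgbsa_frames[::50]                    # every 50th snapshot for NMA

print(len(mmgbsa_frames), len(nma_frames))  # 2000 40
```

Leaving `minus_t_ds` at zero corresponds to the entropy-free estimate used for the comparison among the three complexes. Passing a normal-mode estimate gives the total binding free energy of Eq. (6).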

Alanine scanning with FoldX

Alanine scanning calculations were performed with FoldX (ver. 3.0 Beta 6) (Guerois et al., 2002; Schymkowitz et al., 2005), which has been widely used (Stein and Aloy, 2008; London et al., 2010), especially for the prediction of hot spot residues. All 10 conformations of the NMR ensemble of each complex were evaluated at a temperature of 300 K and the results were averaged. First, the “RepairPDB” command was used to repair residues with bad torsion angles, van der Waals clashes, or unfavorable total energy (option VdWDesign=2). Then, the “complex_alascan” command was used to estimate the contribution of each peptide residue on the interaction interface to protein-peptide binding, as the difference in binding free energy (∆∆G) after mutating that residue to Ala (option VdWDesign=0). Details of the formula for the binding free energy (∆G) are given in the supporting information (Formula S1).

$$\Delta \Delta G = \Delta {G_{{\rm{mutate}}}} - \Delta {G_{{\rm{wild}}}},$$
(9)

where ∆Gmutate and ∆Gwild are the binding free energies of the mutant complex and the wild-type complex, respectively.
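The averaging over NMR models and the hot-spot cutoff can be sketched as follows. This is not FoldX code, and the residue labels and numbers in the usage example are arbitrary placeholders, not results from this study. How a residue absent from the interface in some models should be handled is also an assumption here: such models are marked `None` and skipped. The 1 kcal/mol threshold is the one applied in the Results.

```python
from statistics import mean

HOT_SPOT_THRESHOLD = 1.0  # kcal/mol, cutoff applied in the Results


def ddg(dg_mutant, dg_wild):
    """Eq. (9): change in binding free energy upon mutation to Ala."""
    return dg_mutant - dg_wild


def average_ddg(per_model):
    """Average ddG per residue over NMR models.

    per_model maps a residue label to a list with one entry per model;
    None marks a model in which the residue is not on the interface (assumption).
    """
    averaged = {}
    for residue, values in per_model.items():
        present = [v for v in values if v is not None]
        averaged[residue] = mean(present) if present else None
    return averaged


def hot_spots(averaged, threshold=HOT_SPOT_THRESHOLD):
    """Residues whose averaged ddG exceeds the threshold."""
    return sorted(r for r, v in averaged.items() if v is not None and v > threshold)


# Placeholder input with two models per residue (made-up numbers for illustration only).
example = {"ResA": [1.5, 2.5], "ResB": [0.2, 0.4], "ResC": [None, None]}
example_avg = average_ddg(example)
example_hot = hot_spots(example_avg)
```

A residue that never appears on the interface, like the placeholder `ResC`, ends up with no averaged value and is never reported as a hot spot.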

Results

System stability during MD simulations

Three production simulations were carried out in total. The root mean square deviation (RMSD) of the backbone, relative to the initial minimized structure, was calculated to assess the stability of each system during the production simulation. The RMSD of the 2KFFNPF complex becomes stable after about 20 ns, whereas the 2KFGDPF and 2KFHGPF complexes become stable after about 25 and 35 ns, respectively (Fig. 1). In 2KFGDPF and 2KFHGPF, the RMSD of the peptide fluctuates slightly more than that of the complex and the protein, probably because of the high flexibility of the peptide in these complexes. Because all complexes, proteins, and peptides were relatively stable over the last 20 ns, this part of the trajectory was used for the calculation and decomposition of the binding free energy. Fluctuations of the total energy of the whole system, including the solvent, were less than 0.1% of its average value for each complex. Together with the small fluctuations of temperature, volume, and density during the production simulation (data not shown), this indicates that all three systems were reasonably well equilibrated during the last 20 ns.

Fig. 1

RMSD of the backbone atoms of the complex, protein, and peptide during the 60 ns production simulation

Black: complex; Red: protein; Blue: peptide (Note: for interpretation of the references to color in this figure legend, the reader is referred to the web version of this article)

Binding free energy calculation and energy decomposition

The binding free energy (Table 1) was calculated to characterize the thermodynamics of binding. Similar behavior was observed for the three complexes. The polar components (∆Eele and ∆Gpolar) together are unfavorable for binding, while the nonpolar components (∆Evan and ∆Gnonpolar) are both favorable. In all three complexes, the van der Waals interaction energy (∆Evan) is nearly 8 times more negative than the nonpolar solvation free energy (∆Gnonpolar), which suggests the importance of the van der Waals interaction for binding. The binding free energy without the entropy contribution is listed as GBTOT in Table 1. The experimental dissociation constants (Kd) of the three complexes were taken from Kieken et al. (2009) (2KFFNPF, 245 µmol/L; 2KFGDPF, 1.2 mmol/L; 2KFHGPF, 2.4 mmol/L). Correlation analysis of the calculated GBTOT values and pKd showed a relatively high correlation (r=−0.89) between them. The calculated total binding free energy of 2KFFNPF, which includes the entropic contribution (−TS, 31.02 kcal/mol), is −4.71 kcal/mol. This value is very close to the experimental binding free energy (−4.93 kcal/mol) converted from the dissociation constant (Kieken et al., 2009). Both results indicate that our energy calculation is reliable.
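The conversion from a dissociation constant to a binding free energy uses the standard relation ∆G = RT ln(Kd/1 mol/L). The Python sketch below applies it to the three Kd values quoted above. The temperature of 298.15 K is an assumption, chosen because it reproduces the quoted −4.93 kcal/mol; at the simulation temperature of 300 K the same Kd gives about −4.96 kcal/mol. The implied GBTOT is simple arithmetic on the two numbers quoted in the text, not a value read from Table 1.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol·K)


def dg_from_kd(kd_molar, temperature_k=298.15):
    """Binding free energy (kcal/mol) from Kd in mol/L, with a 1 mol/L standard state."""
    return R_KCAL * temperature_k * math.log(kd_molar)


def pkd(kd_molar):
    """pKd = -log10(Kd), with Kd in mol/L."""
    return -math.log10(kd_molar)


# Experimental Kd values quoted in the text (Kieken et al., 2009), in mol/L.
KD_MOLAR = {"2KFF": 245e-6, "2KFG": 1.2e-3, "2KFH": 2.4e-3}

dg_exp = {name: dg_from_kd(kd) for name, kd in KD_MOLAR.items()}
pkd_exp = {name: pkd(kd) for name, kd in KD_MOLAR.items()}

# Numbers quoted in the text for the NPF complex, in kcal/mol.
MINUS_T_DS = 31.02  # entropic term, -TΔS
DG_TOTAL = -4.71    # calculated total binding free energy
gbtot_implied = DG_TOTAL - MINUS_T_DS  # entropy-free part implied by the two values

print(round(dg_exp["2KFF"], 2))  # -4.93
```

The weaker binders give less negative ∆G and smaller pKd, so the ordering NPF > DPF > GPF is preserved in both representations.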

Table 1 Binding free energy and energy terms computed with MM/GBSA method

Per-residue energy decomposition (Table 2) was performed to estimate the contribution of each residue to binding. More detailed information about the contribution of each energetic component in this decomposition is given in the supporting information (Table S1).

Table 2 Per-residue energy decomposition of the peptide residues

Structure analysis of the three complexes

Since the protein is the same in the three complexes and the binding modes are similar, the complex 2KFFNPF is shown as an example in Figs. 2a and 2b to illustrate the structure of the EHD1 EH domain and the binding mode of the protein and peptide. The EHD1 EH domain contains two helix-loop-helix structures connected by an antiparallel β-sheet formed between the two short loops. Peptides bind to the pocket formed between helix 2 and helix 3. To compare the three complexes in detail, we selected protein residues 68–102 around the binding pocket, together with the whole peptide (Fig. 2c), and monitored their secondary structure during the 40–60 ns production simulation using DSSP (Kabsch and Sander, 1983) through GROMACS. The secondary structures of protein residues 68–94 and 101–102 are nearly the same among the three complexes, and the structures of protein residues 95–100 are also similar (Fig. 3). However, obvious structural differences were observed among the three peptides (Fig. 3). In addition, superimposing the last snapshots of the three production simulations showed that the positions of the peptide residues relative to the binding pocket also differ among the three complexes, especially for the flanking residues far from the NPF/DPF/GPF motifs (residues 149–151) (Fig. 2d). Compared with the motif residues Pro150 and Phe151, which are almost completely buried in the binding pocket, the flanking residues may be only partially in contact with the protein interface because of their relatively high mobility and flexibility.

Fig. 2

Structures of the complex 2KFFNPF and aligned structures of complexes 2KFFNPF, 2KFGDPF, and 2KFHGPF

Structures of complex 2KFFNPF were viewed from the side (a) and the top (b) of the binding pocket. (c) The structure of peptide and protein residues numbered 68–102. The partition of the residues is consistent with the structure module in Fig. 3. (d) This picture was obtained by superimposing the residues on the protein interface numbered 68–102 of the three complexes. Protein and peptide were displayed in Surf and NewCartoon styles, respectively. The highlighted parts of the peptides represent Pro150 and Phe151

Fig. 3

Secondary structure evolution of peptide and protein residues numbered 68–102 during 40–60 ns dynamics simulation

Zero in the ordinate, as a chain separator, is the boundary of the protein and peptide residues. The upper part represents the peptide residues numbered 143–154, while the lower part represents the protein residues numbered 68–102

Hydrogen bond interaction network at the interaction interface

The hydrogen bond interaction network formed between protein residues 68–102 and the peptide during the 40–60 ns simulation was analyzed for each complex (Fig. S1). A hydrogen bond was counted when the donor-acceptor distance was no greater than 0.35 nm and the hydrogen-donor-acceptor angle was no greater than 30°. In general, from 2KFFNPF through 2KFGDPF to 2KFHGPF, the number of hydrogen bonds decreases and the hydrogen bonds become increasingly concentrated on a few peptide residues within the motif (such as Pro150 and Phe151) or close to it (Fig. S1).
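The geometric criterion above can be written as a short test on three atomic positions. The following Python sketch is not the GROMACS implementation; it only applies the two cutoffs quoted in the text to coordinates given in nm. The function names and the simple occupancy measure are illustrative assumptions.

```python
import math

MAX_DA_DISTANCE_NM = 0.35  # donor-acceptor distance cutoff
MAX_HDA_ANGLE_DEG = 30.0   # hydrogen-donor-acceptor angle cutoff


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _norm(v):
    return math.sqrt(sum(x * x for x in v))


def hda_angle_deg(donor, hydrogen, acceptor):
    """Angle at the donor between the donor->hydrogen and donor->acceptor vectors."""
    dh = _sub(hydrogen, donor)
    da = _sub(acceptor, donor)
    cos_angle = sum(x * y for x, y in zip(dh, da)) / (_norm(dh) * _norm(da))
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding errors
    return math.degrees(math.acos(cos_angle))


def is_hydrogen_bond(donor, hydrogen, acceptor):
    """True when both the distance and the angle criteria are satisfied."""
    distance = _norm(_sub(acceptor, donor))
    angle = hda_angle_deg(donor, hydrogen, acceptor)
    return distance <= MAX_DA_DISTANCE_NM and angle <= MAX_HDA_ANGLE_DEG


def occupancy(frames):
    """Fraction of (donor, hydrogen, acceptor) frames in which the bond is present."""
    if not frames:
        return 0.0
    return sum(is_hydrogen_bond(d, h, a) for d, h, a in frames) / len(frames)
```

For example, a collinear arrangement with the acceptor 0.30 nm from the donor passes both tests, whereas moving the acceptor to 0.40 nm or tilting it 45° away from the donor-hydrogen axis fails.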

Intermolecular hydrogen bonds formed between Asn149/Asp149/Gly149 of the peptides and their protein partners (Fig. 4) were analyzed further because of their possible importance for the binding affinity, as suggested by Kieken et al. (2009). In 2KFFNPF, a relatively stable main-chain/side-chain hydrogen bond between the carbonyl oxygen of Gly87 and the amino hydrogen of Asn149 was observed, with an average bond length of 0.31 nm. In 2KFGDPF, a relatively unstable side-chain/side-chain hydrogen bond between the carboxyl oxygen of Asp149 and the amino hydrogen of Lys91 was observed, with an average bond length of 0.29 nm. In 2KFHGPF, only an extremely unstable, and therefore negligible, main-chain/side-chain hydrogen bond between the carbonyl oxygen of Gly149 and the amino hydrogen of Lys73 was observed. The existence of all these hydrogen bonds is consistent with the report of Kieken et al. (2009).

Fig. 4

Hydrogen bond formed between Asn149/Asp149/Gly149 of the peptide and its protein partner during the 40–60 ns production simulation

Figure below shows the lifetime of the hydrogen bond. Red, the hydrogen bond is present at that time; white, not present. Zero in the ordinate is the hydrogen bond index and represents the hydrogen bond formed between Asn149 and Gly87 in 2KFFNPF, the hydrogen bond formed between Lys91 and Asp149 in 2KFGDPF, and the hydrogen bond formed between Lys73 and Gly149 in 2KFHGPF, respectively. Figure upper left shows the hydrogen bond. Protein is colored cyan; peptide, yellow. Residues forming hydrogen bonds are drawn in CPK mode. White represents hydrogen; blue, nitrogen; red, oxygen; magenta, carbon. The blue dotted line represents hydrogen bond. Figure upper right shows the distance distribution of donor-acceptor of hydrogen bond during 40–60 ns simulation

Alanine scanning with FoldX

As the entropy contribution cannot be decomposed in the energy decomposition analysis with AMBER and the hydrogen bond energy cannot be evaluated separately by MM/GBSA, alanine scanning with FoldX was conducted as a supplement of quantitative analysis.

Taking ∆∆G>1 kcal/mol as the threshold to define hot spot residues (London et al., 2010), Asn149, Pro150, and Phe151 were identified to be the hot spot residues of 2KFFNPF, while only Pro150 and Phe151 were identified to be the hot spot residues of both 2KFGDPF and 2KFHGPF (Table 3). The value of ∆∆G of Asp149 of 2KFGDPF is also relatively large. However, Gly149 of 2KFHGPF was not even identified on the interaction interface in any conformation, which suggests that this residue does not contribute much to the protein-peptide binding even though it is a key component of the motif.

Table 3 Binding free energy difference (∆∆G) measured by FoldX

Discussion

EHD1 plays an important role in endocytic transport through interactions mediated by its EH domain and has thus become a new target for peptide inhibitor design. Previous work has shown that the EHD1 EH domain has different binding affinities for three peptide motifs (NPF, DPF, and GPF) (Kieken et al., 2009). However, the molecular mechanism behind these different binding affinities is still unclear. Therefore, we investigated the structural and energetic bases of the different binding affinities of the three complexes via MD simulations.

Pro150 and Phe151 are buried in the binding pocket of the protein (Fig. 2d) and therefore contribute the most in all three complexes (consistently in Tables 2 and 3). According to Table 3, whose energy terms are more comprehensive than those of Table 2 because they include the entropy contribution, the total contributions of these two residues are similar among the three complexes. Asn149 and Asp149 are the third largest contributors in 2KFFNPF and 2KFGDPF, respectively, with the former contributing slightly more than the latter, while Gly149 of 2KFHGPF makes no favorable contribution to binding (consistently in Tables 2 and 3; values from Table 2: 2KFFNPF, −2.17 kcal/mol; 2KFGDPF, −2.05 kcal/mol; 2KFHGPF, 0.41 kcal/mol). The total contribution of the flanking residues (Phe143 to Thr148 and Thr152 to Lys154) is much larger in 2KFFNPF (−4.47 kcal/mol) than in 2KFGDPF (−1.47 kcal/mol) and 2KFHGPF (−1.66 kcal/mol) (Table 2). A similar trend occurs in Table 3 even though only two flanking residues at most were measured. Accordingly, we suggest that the different contributions of Asp149 and Gly149 are the main reason for the different binding affinities of 2KFGDPF and 2KFHGPF, while the different contributions of the flanking residues are the main reason for the different binding affinities of 2KFFNPF and 2KFGDPF. The smaller contributions of both Gly149 and the flanking residues result in a much lower binding affinity of 2KFHGPF compared with 2KFFNPF.

Asn149, Asp149, and Gly149 differ mainly in the hydrogen bond formed at the interaction interface (Fig. 4), in polarity and charge, and in the size of the side chain. As AMBER includes the hydrogen bond energy in the electrostatic interaction energy (∆Eele) and cannot isolate it, we obtained it from the decomposition of ∆∆G by FoldX. The contributions of the intermolecular hydrogen bonds are 0.41 and 0.32 kcal/mol for Asn149 and Asp149, respectively. The former is slightly larger than the latter owing to the different types and stabilities of the two hydrogen bonds (Fig. 4). In addition, even if we assume that the entire electrostatic interaction energy of Asn149 comes from the hydrogen bond, the energy of this stronger hydrogen bond is only −1.38 kcal/mol, and that of the weaker hydrogen bond formed by Asp149 would be even smaller in magnitude. However, although the hydrogen bond contribution decreases from Asn149 to Asp149, the electrostatic interaction energy becomes much more favorable, changing from −1.38 to −13.01 kcal/mol (Table S1, ∆Eele). As the binding pocket of the protein is highly positively charged (Kieken et al., 2007) and the distances between residue 149 and the binding pocket are nearly the same in the two complexes (distance between the two centers of mass: 2KFFNPF, 0.69 nm; 2KFGDPF, 0.68 nm), this gain in electrostatic contribution should be attributed to the difference between the polar but neutral residue Asn149 and the negatively charged residue Asp149. Finally, as Gly149 of 2KFHGPF was not identified as being on the interaction interface under the threshold defined by FoldX, we could not obtain the intermolecular hydrogen bond energy of this residue from the decomposition of ∆∆G. However, we expect the hydrogen bond energy of Gly149 to be very small, as this hydrogen bond is extremely unstable (Fig. 4). 
In addition, as the distance between Gly149 and the binding pocket is larger than that between Asn149 and the binding pocket (distance between the two centers of mass: 2KFFNPF, 0.69 nm; 2KFHGPF, 1.02 nm), it is somewhat surprising that the electrostatic interaction of the nonpolar amino acid Gly149 (−1.95 kcal/mol) is slightly more favorable than that of the polar amino acid Asn149 (−1.38 kcal/mol) (Table S1). Pairwise energy decomposition (data not shown) indicated that, although the electrostatic attraction between Gly149 and some protein residues is weaker than the corresponding attraction for Asn149, the electrostatic repulsion between Gly149 and the remaining protein residues is much weaker, which gives a slightly more favorable net electrostatic contribution. On the whole, however, the electrostatic contributions (Table S1, ∆Eele) of all three residues at position 149 are more than cancelled by the polar solvation free energies (Table S1, ∆Gpolar), which become larger as the electrostatic contributions increase. The polar components (∆Eele+∆Gpolar) together are unfavorable for binding, by about 2 kcal/mol for Asn149 and Asp149 and by 0.79 kcal/mol for Gly149 (Table S1). These values show that the polar components of the two polar amino acids Asn149 and Asp149 are more unfavorable for binding than those of the nonpolar amino acid Gly149. As a result, the van der Waals interactions of the three residues at position 149 become the major favorable contributions to binding (Table S1). 
Because Gly149 is the farthest from the binding pocket (distance between the two centers of mass: 2KFFNPF, 0.69 nm; 2KFGDPF, 0.68 nm; 2KFHGPF, 1.02 nm) and has the smallest side chain, its van der Waals interaction is the weakest among the three residues at position 149 (∆Evan: Asn149, −3.74 kcal/mol; Asp149, −3.72 kcal/mol; Gly149, −0.35 kcal/mol) and does not offset the unfavorable contribution of its own polar components (∆Eele+∆Gpolar: Gly149, 0.79 kcal/mol) (Table S1). In contrast, the van der Waals interaction energies of Asn149 and Asp149, which are about ten times more negative than that of Gly149, outweigh the unfavorable contributions of their respective polar components (∆Eele+∆Gpolar: Asn149, 2.03 kcal/mol; Asp149, 2.01 kcal/mol; Table S1) and, together with the favorable nonpolar free energies, make net favorable contributions to binding. Therefore, compared with Gly149 of 2KFHGPF, Asn149 and Asp149 contribute to the higher binding affinities of 2KFFNPF and 2KFGDPF through their much stronger van der Waals interactions.
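The balance described above can be reproduced with a few lines of arithmetic on the values quoted in the text. The Python sketch below sums the van der Waals term and the net polar term for each residue at position 149. The nonpolar solvation term is left out because the text does not quote it per residue, so these sums are not the Table 2 totals. The variable names are illustrative assumptions.

```python
# (ΔEvan, ΔEele + ΔGpolar) in kcal/mol, as quoted in the text from Table S1.
RESIDUE_149 = {
    "Asn149": (-3.74, 2.03),
    "Asp149": (-3.72, 2.01),
    "Gly149": (-0.35, 0.79),
}

# Net of van der Waals and polar terms; negative values favor binding.
net_vdw_plus_polar = {name: vdw + polar for name, (vdw, polar) in RESIDUE_149.items()}

# How much stronger the Asn149 van der Waals term is than that of Gly149.
vdw_ratio_asn_to_gly = RESIDUE_149["Asn149"][0] / RESIDUE_149["Gly149"][0]

for name, value in net_vdw_plus_polar.items():
    verdict = "favorable" if value < 0 else "unfavorable"
    print(f"{name}: {value:+.2f} kcal/mol ({verdict})")
```

With these inputs the net is about −1.71 kcal/mol for both Asn149 and Asp149 and about +0.44 kcal/mol for Gly149, and the Asn149 van der Waals term is roughly ten times that of Gly149.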

Generally, the van der Waals interaction is also the main component of the total contributions of the flanking residues (Table S1). However, it may be affected by the intermolecular hydrogen bonds, whose presence differs among the three complexes because of the different peptide structures observed in Figs. 2d and 3. A typical flanking residue is Tyr145. Its van der Waals interaction energy in 2KFFNPF (−2.11 kcal/mol) is three times that in 2KFGDPF (−0.70 kcal/mol) and more than twice that in 2KFHGPF (−0.92 kcal/mol), as shown in Table S1. The differences come mainly from the van der Waals interaction between Tyr145 and Lys97, which is −1.47, −0.14, and −0.002 kcal/mol for 2KFFNPF, 2KFGDPF, and 2KFHGPF, respectively, according to pairwise energy decomposition. Further analysis showed that three hydrogen bonds exist between Tyr145 and Lys97 in 2KFFNPF, whereas only two less stable hydrogen bonds exist between these residues in 2KFGDPF and none occurs in 2KFHGPF (Fig. S1). By anchoring the two residues, the intermolecular hydrogen bonds restrict the flexibility of Tyr145 and Lys97 to varying degrees, which leads to different interaction strengths. On the whole, intermolecular hydrogen bonds shorten the distance between the peptide and the protein. The center of mass of flanking residues 143–148 lies progressively farther from that of the protein binding pocket from 2KFFNPF (1.44 nm) through 2KFGDPF (1.47 nm) to 2KFHGPF (1.53 nm), as the intermolecular hydrogen bonds become fewer and weaker (Fig. S1). 
Correspondingly, the contributions of the van der Waals interactions (2KFFNPF, −6.88 kcal/mol; 2KFGDPF, −4.83 kcal/mol; 2KFHGPF, −1.79 kcal/mol; Table S1) and then the total contributions of these residues (2KFFNPF, −2.33 kcal/mol; 2KFGDPF, −0.84 kcal/mol; 2KFHGPF, 0.64 kcal/mol; Tables 2 and S1) decrease in turn. This correlation between hydrogen bonds and energy contributions is also true for the flanking residues numbered 152–154. Thus, 2KFFNPF, which has the largest number of intermolecular hydrogen bonds, has the largest contributions of flanking residues.

Kieken et al. (2007) initially predicted that, because of its highly positive surface potential, the EHD1 EH domain would have a higher binding affinity for peptides containing the more negatively charged DPF motif than for those containing the NPF motif. However, their subsequent study estimating the binding affinities of 2KFFNPF and 2KFGDPF showed that the opposite is true (Kieken et al., 2009), which implies that binding affinity cannot be reliably inferred from surface potential and residue charge state alone. The reasons are those discussed above: the peptide structures of 2KFFNPF and 2KFGDPF differ even though only one residue is different, and the influence of charge on binding affinity is small because ∆Eele and ∆Gpolar offset each other. A similar situation occurs for the neutral/negatively charged N-terminal EH domains. Taking the structure of EH2-NPF as reference, de Beer et al. (2000) predicted that the second EH domain of Eps15 (EH2) cannot bind to peptides containing the DPF motif because of electrostatic repulsion between Glu170 of EH2 and the negatively charged Asp residue. Shortly afterwards, Kim et al. (2001) showed that the Reps1 EH domain, which has nearly the same overall binding-pocket conformation as EH2 and the same side-chain orientation of the critical conserved hydrophobic residues, can bind to a DPF-containing peptide because of a different arrangement of the gate residues Glu55 (Glu170 in EH2) and Lys37 of the EH domain. Later, the EH domain of POB1, which is closely related but not identical to that of Reps1, was also confirmed to bind to the DPF motif (Santonico et al., 2007). All these cases show the importance of considering sequence differences, structural changes, and critical energy terms together when analyzing binding. In addition, it is worth noting that the Reps1 EH domain prefers the NPF motif over the DPF motif when all flanking residues are the same (Kim et al., 2001). 
This situation is similar to the EHD1 EH domain here. However, explaining this preference through exact energy calculations just like here is currently not possible because of the lack of structures of Reps1 EH domain-NPF peptide and Reps1 EH domain-DPF peptide.

However, the flanking residues account for only 21.5%, 8.0%, and 11.0% of the total contributions of all residues in 2KFFNPF, 2KFGDPF, and 2KFHGPF, respectively (Table 2). The three motifs are still the most important contributors to binding. This is consistent with a previous systematic study, which showed that the motif and the flanking residues contribute on average 79% and 21% of the global binding energy, respectively, in protein-peptide interactions (Stein and Aloy, 2008). Hence, special attention should be paid to the choice of motif residues when designing peptide inhibitors with high binding affinities. Residues that can form stronger van der Waals interactions with the protein partner should be considered first. For proteins with multiple binding pockets or for oligomers, repeating the motif is another way to strengthen binding, as in some EH domain-containing protein (EHD protein)/peptide complexes found in vivo (Braun et al., 2005; Naslavsky et al., 2006). Nevertheless, the flanking residues always deserve attention, both because they are responsible for the specificity of EH domain/peptide interactions (Grant and Caplan, 2008; Naslavsky and Caplan, 2011) and because their energy contributions improve binding affinity. Previous studies have found that EHD proteins prefer acidic residues following the NPF motif in the +1, +2, and +3 positions because of salt bridges or entropic cost (Henry et al., 2010; Kieken et al., 2010). Here, we emphasize the structural importance of the intermolecular hydrogen bonds of the flanking residues. Peptides that have acidic residues in the +1, +2, and +3 positions following the NPF motif, and residues with a strong ability to form intermolecular hydrogen bonds at the other flanking positions, may bind to EHD proteins better. In addition, the design of cyclic peptides may be another way to obtain high affinity. 
Studies have shown that cyclic NPF-containing peptides bind to the N-terminal EH domain with higher affinities than the linear ones (Yamabhai et al., 1998; de Beer et al., 2000). And a new cyclic peptide designed for the EHD1 EH domain has obtained nearly 4-fold improvement in affinity in contrast to a typical linear peptide (Kamens et al., 2014). Usually, the β-turn conformation that is adopted by the NPF motif of bound state was well stabilized in these cyclic peptides. An excellent conformational fit between protein and peptide was obtained. On the other hand, the less flexible cyclic peptide is more likely to lose less entropy upon binding thermodynamically, and thus should bind more tightly, all other things being equal. However, just as in the observation of Kim et al. (2001), the rigid cyclic peptide may induce larger conformational changes of its protein partner, slow the association rate, and thus get a lower affinity. Weaker binding affinities were also observed when increasing or decreasing the ring size of a good cyclic peptide (Kamens et al., 2014). Therefore, how to design a specific cyclic peptide with an appropriate conformation is a big challenge for the future.

In conclusion, we have investigated the molecular mechanisms behind the different binding affinities of three complexes formed between the EHD1 EH domain and peptides containing the NPF, DPF, and GPF motifs, from structural and energetic perspectives, via MD simulations. Our results emphasize the importance of the van der Waals interactions and of the intermolecular hydrogen bonds of the flanking residues in the interactions of the EHD1 EH domain with peptides, and provide guidance for the design of peptide inhibitors of the EHD1 EH domain and possibly of other related proteins.

Compliance with ethics guidelines

Hua YU, Mao-jun WANG, Nan-xia XUAN, Zhi-cai SHANG, and Jun WU declare that they have no conflict of interest.

This article does not contain any studies with human or animal subjects performed by any of the authors.