The structural analysis of the mutants is performed with respect to the ES and LS structural characteristics, using the VR model and the “fuzzy oil drop” model. Distance entropy is applied to quantify the structural differences between two structures under consideration.
Structural analysis of proteins under consideration
A structural analysis of proteins under consideration with respect to the ES and LS is presented in Table 1.
Table 1 The ES and LS characteristics of proteins under consideration. The position of mutations is given in the second column, followed by the value of D (distance), which measures the accordance with the adopted model of the ES intermediate. D_average expresses the mean distance between the projected and observed values of the parameters that describe the structure of the ES intermediate. Proteins with D_average values below 1.0 are considered consistent with the ES model. A protein for which O/T < O/R is interpreted as accordant with the LS model. O/T denotes the Kullback-Leibler entropy calculated for the observed (O) distribution of hydrophobicity density with the theoretical one (T) treated as the target distribution; O/R expresses the distance between the observed distribution and the random one (R) treated as the target distribution. For structures determined by NMR, chain A was taken for analysis. The values given in bold denote accordance with the appropriate model
Applicability of the ES model
According to the ES model, structure is generated according to backbone preferences in terms of the V-angle and the R-radius of curvature. This is why the values of the V-angle and the R-radius of curvature (in logarithmic units), as they appear in the crystal structures of the proteins under consideration, were analyzed versus the idealized curve. The D distance between the projected and observed values of these parameters was calculated. It was arbitrarily assumed that proteins with D_average below 1 exhibit a structure consistent with the model. However, because the available structures are the final (LS stage) ones, a D_average value above 1 does not imply that the model is inadequate. A low value of D suggests that the structural elements characteristic of the ES structural form have been preserved to a large degree in the native (LS) structure. All helical fragments are present in both the ES and the LS. That is why low values of D_average may suggest a large participation of secondary structures of the helical type.
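The D_average criterion described above can be illustrated with a minimal sketch. It assumes that a D value is available for each residue and that D_average is their arithmetic mean; the function name `summarize_es_accordance` and the sample values are hypothetical and are not taken from Table 1.

```python
from statistics import mean

ES_THRESHOLD = 1.0  # cutoff stated in the text as an arbitrary assumption


def summarize_es_accordance(per_residue_d, threshold=ES_THRESHOLD):
    """Return (D_average, consistent_with_ES, residue indices with D above threshold).

    per_residue_d: one D value per residue (assumed input; how each D is
    derived from V and ln(R) is not reproduced here).
    """
    d_average = mean(per_residue_d)
    outliers = [i for i, d in enumerate(per_residue_d) if d > threshold]
    return d_average, d_average < threshold, outliers


# Illustrative values only, not data from the paper.
d_values = [0.2, 0.4, 1.6, 0.3, 0.5]
d_avg, consistent, outliers = summarize_es_accordance(d_values)
```

The list of outlier indices corresponds to the residues that would be highlighted when visualizing the fragments that deviate from the ES model.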
Two proteins representing the extreme cases (high and low D values) are shown as examples in Fig. 2, which compares the distribution of the observed (V, ln(R)) values with the idealized approximation curve.
In the 3-D structures, residues with D_average above 1 are marked in red in order to visualize the character of the structural motifs that are not consistent with the adopted model (Fig. 3).
The accordance of the crystal structure with the ES model is not typically expected. On the other hand, the crystal structure is usually consistent with the LS model: the ES to LS transition is a change of the optimal backbone conformation toward the presence of a hydrophobic core. Thus, ES characteristics may be lost in the LS intermediate, although this is not always the case. 1J5B is the only type I antifreeze protein among those discussed. Its structure is entirely helical and appears to be highly consistent with the ES model. The distribution of hydrophobicity in this molecule is much closer to the random distribution than to the Gaussian one.
Applicability of the LS model
The LS model assumes that hydrophobicity distribution in the protein molecule is consistent with the idealized one, expressed by the three-dimensional Gauss function. The profile showing the hydrophobic interactions collected by effective atoms of each residue as the effect of interactions with other amino acids is shown in Fig. 4.
Fig. 5 shows a 3-D presentation of two protein molecules, selected to represent the best and the worst accordance with the model under consideration. The residues with the strongest hydrophobic interactions, which are responsible for the generation of the hydrophobic core, are marked in white.
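As a rough illustration of the idealized distribution assumed by the LS model, the sketch below evaluates a normalized three-dimensional Gauss function at a set of effective-atom positions. It assumes that the coordinates have already been centered on the molecule and oriented along the axes to which the three sigma values refer; the function name, the sigma values and the coordinates are hypothetical.

```python
import math


def idealized_hydrophobicity(coords, sigmas):
    """Normalized 3-D Gaussian evaluated at each effective-atom position.

    coords: list of (x, y, z) tuples, assumed centered at the origin.
    sigmas: (sigma_x, sigma_y, sigma_z), assumed to describe the drop size.
    """
    sx, sy, sz = sigmas
    raw = [
        math.exp(
            -(x * x) / (2 * sx * sx)
            - (y * y) / (2 * sy * sy)
            - (z * z) / (2 * sz * sz)
        )
        for x, y, z in coords
    ]
    total = sum(raw)
    # Normalize so that the values sum to 1 and can be compared
    # with a normalized observed distribution.
    return [value / total for value in raw]


# Illustrative positions only: one central point and three peripheral ones.
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
theoretical = idealized_hydrophobicity(coords, sigmas=(1.0, 2.0, 3.0))
```

The central position receives the highest value, which reflects the expected concentration of hydrophobicity in the core and its decrease toward the surface.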
The Kullback-Leibler distance entropy
The accordance between the observed and the idealized hydrophobic density distributions was expressed quantitatively using the Kullback-Leibler distance entropy (as described in Materials and methods). The values measuring the distance between the observed and the idealized (O/T) and between the observed and the random (O/R) distributions are given in Table 1. The analysis of these values suggests that the structural changes do not influence the status of the structure (accordance with the idealized model is preserved). Some proteins, however, undergo changes that result in a structure no longer consistent with the adopted model, which suggests that the mutations destroy the hydrophobic core responsible for stabilizing the molecule.
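The O/T versus O/R comparison can be sketched as follows. This is a minimal example under stated assumptions: all distributions are normalized to sum to 1, the logarithm is taken to base 2, and the random distribution R is taken to be uniform (1/N for each of N residues). The function names and the sample values are hypothetical.

```python
import math


def kl_distance(observed, target):
    """Kullback-Leibler distance: sum_i O_i * log2(O_i / target_i)."""
    return sum(
        o * math.log2(o / t) for o, t in zip(observed, target) if o > 0
    )


def compare_with_ls_model(observed, theoretical):
    """Return (O/T, O/R, accordant), where accordant means O/T < O/R."""
    n = len(observed)
    random_ref = [1.0 / n] * n  # assumption: R is the uniform distribution
    o_t = kl_distance(observed, theoretical)
    o_r = kl_distance(observed, random_ref)
    return o_t, o_r, o_t < o_r


# Illustrative normalized profiles only, not data from the paper.
observed = [0.50, 0.30, 0.15, 0.05]
theoretical = [0.45, 0.35, 0.15, 0.05]
o_t, o_r, accordant = compare_with_ls_model(observed, theoretical)
```

Neither value is meaningful on its own; only the relation between them is interpreted, which is why the function returns both distances together with the comparison.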
A particular mutation at position 16, present in 2MSI to 7MSI relative to 1MSI, appeared to affect the hydrophobic core to such an extent that it lost its initial structure and became inconsistent with the idealized core structure.
The effect of substituting Pro at positions 64 and 65 with Ala, a substitution absent from the other investigated proteins and their mutants, suggests that prolines play a critical role in hydrophobic core generation.
The investigated molecules are classified in Table 2 according to their accordance with the ES and LS models.
Table 2 Protein classification with respect to the criteria describing/defining the early stage (ES) and late stage (LS) intermediates
Although the majority of the proteins under consideration are very similar (both in terms of sequence and structure), only one (1KDF, the minimized averaged NMR structure) satisfies the conditions of both models (ES and LS). This may suggest that the initial ES intermediate was not destroyed in the transition to LS.
The accordance with the LS model is strongest in the 1KDE structure. The structural fluctuation of dynamic forms seems to be limited by the stabilization imposed by the hydrophobic core (in accordance with the three-dimensional Gauss function).
On the other hand, its four mutants (2MSI, 3MSI, 4MSI, 5MSI) are examples in which the mutation prevented the formation of the hydrophobic core, which is present in all other structural forms of the mutants of this protein.
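The two criteria behind this classification can be combined in a short sketch. It assumes that D_average, O/T and O/R have already been computed for a protein; the function name, the class labels and the sample numbers are hypothetical and do not reproduce Table 2.

```python
def classify_protein(d_average, o_t, o_r, es_threshold=1.0):
    """Assign a protein to one of four classes using the ES and LS criteria."""
    es_ok = d_average < es_threshold  # ES criterion: D_average below 1
    ls_ok = o_t < o_r                 # LS criterion: O/T < O/R
    if es_ok and ls_ok:
        return "ES and LS"
    if es_ok:
        return "ES only"
    if ls_ok:
        return "LS only"
    return "neither"


# Illustrative inputs only, not values from Table 1.
labels = [
    classify_protein(0.8, 0.10, 0.30),
    classify_protein(0.8, 0.40, 0.30),
    classify_protein(1.5, 0.10, 0.30),
    classify_protein(1.5, 0.40, 0.30),
]
```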
Structural differences in pair-wise comparison
A comparison of the intensity of structural changes upon mutation in relation to other proteins of the same group is shown in Table 3. Such a ranking allows contrastive analysis, even more significantly so in this case due to identical (or similar) polypeptide chain length.
Table 3 Pair-wise comparison of selected mutants (AMI). The values below the diagonal are the RMS-D measurements; the values above the diagonal are the D_KL distance entropy between two proteins (according to the column and row headers)
The comparative structural analysis based on the LS model was performed using the Kullback-Leibler distance entropy, treating one of the compared proteins as the target. The values obtained from these calculations were compared with the traditionally used similarity scale expressed by RMS-D values. The appropriate values for selected mutants (group AMI) are given in Table 3.
The correlation coefficient for D_KL versus RMS-D, as calculated using the STATISTICA program, equals 0.2268 with p < 0.0001. This relation is presented graphically in Fig. 6.
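A coefficient of this kind can be approximated outside STATISTICA with a few lines of Python. The sketch below assumes a Pearson correlation and a square matrix laid out like Table 3 (RMS-D below the diagonal, D_KL above it); it does not compute the p value. The function names and the small matrix are hypothetical and do not contain data from the paper.

```python
import math


def split_pairwise_table(matrix):
    """Pair up entries of a Table 3 style matrix.

    For each pair i < j, matrix[i][j] (above the diagonal) is taken as D_KL
    and matrix[j][i] (below the diagonal) as RMS-D.
    """
    rmsd, dkl = [], []
    n = len(matrix)
    for i in range(n):
        for j in range(i + 1, n):
            dkl.append(matrix[i][j])
            rmsd.append(matrix[j][i])
    return rmsd, dkl


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Illustrative matrix only: D_KL above the diagonal, RMS-D below it.
table = [
    [0.0, 0.10, 0.30],
    [0.5, 0.0, 0.20],
    [1.5, 1.0, 0.0],
]
rmsd_values, dkl_values = split_pairwise_table(table)
r = pearson_r(dkl_values, rmsd_values)
```

Because D_KL depends on which protein is treated as the target, a table of this layout holds only one direction per pair, and the sketch uses exactly that value.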