The Protein Journal, Volume 26, Issue 8, pp 556–561

QHELIX: A Computational Tool for the Improved Measurement of Inter-Helical Angles in Proteins

Authors

  • Hui Sun Lee
    • Department of Biological Sciences, Research Center for Women’s Diseases (RCWD), Sookmyung Women’s University
  • Jiwon Choi
    • Department of Biological Sciences, Research Center for Women’s Diseases (RCWD), Sookmyung Women’s University
  • S. Yoon
    • Department of Biological Sciences, Research Center for Women’s Diseases (RCWD), Sookmyung Women’s University
Article

DOI: 10.1007/s10930-007-9097-9

Cite this article as:
Lee, H.S., Choi, J. & Yoon, S. Protein J (2007) 26: 556. doi:10.1007/s10930-007-9097-9

Abstract

Knowledge of how secondary structure elements assemble in proteins is essential to understanding protein folding and function. In particular, the analysis of helix geometry is required to study how helices pack against the rest of the protein and how supersecondary structures, such as coiled coils and helix bundles, are formed by the packing of two or more helices. Here we present an improved computational method, QHELIX, for calculating the orientation angles between helices. Since a large number of helices are known to be curved, an appropriate definition of helical axes is a prerequisite for calculating the orientation angle between helices. The present method provides a quantitative measure of the irregularity of helical shape, which makes it possible to discriminate irregularly shaped helices from helices with an ideal geometry in a large-scale analysis of helix geometry. It also assigns the direction of orientation angles in a straightforward and consistent way. These improvements are expected to provide new insight into the assembly of protein secondary structure.

Keywords

Helical axis; Inter-helical angle; Protein structure; QHELIX, computational tool

Abbreviations

PDB: Protein data bank

NMR: Nuclear magnetic resonance

SpA: Staphylococcal protein A

1 Introduction

Elucidating how secondary structural elements assemble into a tertiary structure is an important step in understanding the folding and functionality of proteins [1, 2]. The inter-helical angle, defined as the tilt angle between two different helices (Fig. 1), has been a major geometric determinant for annotating and comparing the structural characteristics of proteins. In addition, for proteins that undergo large conformational changes, determining the inter-helical angle may permit a detailed description of the structural rearrangements that occur during these events [3, 4]. Many computational tools for the analysis of protein structure thus include functionality to calculate the inter-helix orientation [5, 6]. An appropriate definition of the helix axis is a prerequisite for determining the inter-helical angle. One approach to this problem was proposed by Chou et al., in which the axes of regular and non-regular helices are defined as the least-squares line for which the sum of the squares of the distances of all Cα atoms from the line is a minimum [7]. This approach, however, does not consider the variation of helical shape in determining the axis, because it cannot quantify the irregularity of the helical shape. Although the α-helix is a well-defined common secondary element [8], relatively long helices frequently show non-linear, irregular shapes. Nevertheless, the influence of helical irregularity on the angle calculation has not been quantitatively considered so far. For example, Lee and Chirikjian recently used the distance-dependent inter-helical angle to study orientational preferences between interacting helices within globular proteins [9]. In their large-scale study, however, the irregularity of helical shapes was not considered because their approach could not discriminate irregularly shaped helices from helices with an ideal geometry.
https://static-content.springer.com/image/art%3A10.1007%2Fs10930-007-9097-9/MediaObjects/10930_2007_9097_Fig1_HTML.gif
Fig. 1

Inter-helical tilt angle θ determined by two helices in a protein structure. Each helical axis is represented by an arrow, with the head of the arrow pointing toward the C-terminus of the helix

The purpose of our present study was to develop a method to define helical angles with a quantitative consideration of the shape irregularity of helices in determining helical axes. Using the algorithm proposed by Kahn [10], we attempted to quantify the deviation of the fitting line from the non-linear axis of an irregularly curved helix. Several other programs using Kahn's algorithm or a similar approach have been reported for analyzing helix structures. HELANAL [5] provides a detailed analysis of the structural and position-dependent features of each helix using successive local helix axes along a window of four Cα atoms. TRAJELIX calculates helix axes using Kahn's algorithm [6]; it has been used to monitor relative local distortions in helices between sampled and reference structures during molecular dynamics simulations of a single protein. Here, we report an automated computer program, QHELIX, which permits fast determination of inter-helical orientation and quantification of the irregularity of helix shape, thus distinguishing irregularly curved helices from helices with an ideal geometry in the angle calculation. Since the algorithm takes only Cα coordinates for the calculation of geometric axes, it is independent of the type of amino acid and can be applied to the geometric analysis of any kind of secondary structure element.

An additional critical point in the large-scale comparison of inter-helical orientation in proteins is that the direction of orientation between two helices should be considered over the range (−180°, 180°). When a reference helix is not given, determining the sign of the angle between two helices is not trivial. In a previous report, the reference helix was selected based on its relative distance to the origin of the coordinate system [7]. This method does not select the reference helix consistently under rotational or translational movement of the protein. A better way is to determine the sign based only on the geometric relationship between the two helices used in the angle calculation. Here, we propose a simple method for the consistent determination of the sign of inter-helical orientation without consideration of the origin of coordinates or the surrounding tertiary context.

2 Methods

QHELIX takes a PDB file as the input coordinates for a protein. Strict mathematical regularity of the helices is not required; the user only needs to supply the residue ranges corresponding to the helices of the protein. The program provides two optional methods for determining helix axes. In the first method (Qhelix-1), the helix axis is defined as the least-squares line calculated from all Cα atoms in the helix (hereafter referred to as the Chou algorithm) [7]. In any given (x, y, z) coordinate system, the equation of the least-squares axis can be written as
$$ (x - x^{ * } )/l = (y - y^{ * } )/m = (z - z^{ * } )/n $$
(1)
$$ x^{*} = \frac{1}{n_{r}}\sum\limits_{i = 1}^{n_{r}} x_{i} \quad\quad y^{*} = \frac{1}{n_{r}}\sum\limits_{i = 1}^{n_{r}} y_{i} \quad\quad z^{*} = \frac{1}{n_{r}}\sum\limits_{i = 1}^{n_{r}} z_{i} $$
(2)
where (xi, yi, zi) are the coordinates of the ith Cα atom of a helix (i = 1, 2, …, nr), (x*, y*, z*) is the centroid of these atoms, and l, m and n are the direction cosines of the least-squares axis, obtained by solving the least-squares problem.
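The least-squares axis of Eqs. (1) and (2) can be sketched in a few lines. The example below is a minimal illustration, not the QHELIX source: it assumes that the direction cosines are taken as the dominant eigenvector of the scatter matrix of the centred coordinates (the standard solution for a line that minimizes the summed squared perpendicular distances), and it uses power iteration only to stay free of external libraries. The function name `fit_axis` and its parameters are hypothetical.

```python
import math

def fit_axis(points, iterations=200):
    """Least-squares line through 3D points.

    Returns (centroid, direction): the centroid of Eq. (2) and a unit
    vector of direction cosines (l, m, n) as in Eq. (1).
    Assumes at least two distinct points, listed from N- to C-terminus.
    """
    n = len(points)
    centroid = [sum(p[k] for p in points) / n for k in range(3)]

    # 3x3 scatter matrix of the centred coordinates.
    m = [[sum((p[a] - centroid[a]) * (p[b] - centroid[b]) for p in points)
          for b in range(3)] for a in range(3)]

    # Power iteration: repeated multiplication by the scatter matrix
    # converges to the eigenvector with the largest eigenvalue.
    # The first-to-last vector is used as a starting guess (an assumption;
    # it fails only if it is exactly orthogonal to the true axis).
    span = [points[-1][k] - points[0][k] for k in range(3)]
    v = span
    for _ in range(iterations):
        w = [sum(m[a][b] * v[b] for b in range(3)) for a in range(3)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]

    # Orient the axis from the first point toward the last one.
    if sum(v[k] * span[k] for k in range(3)) < 0:
        v = [-x for x in v]
    return centroid, v
```

The same routine applies unchanged to any set of reference points, whether Cα atoms or local axis points.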
The second method (Qhelix-2) uses a segment-based algorithm for defining the helical axis. The segment-based helical axis definition was proposed by Kahn (hereafter referred to as the Kahn algorithm) [10, 11]. A segment is a short axis determined using simple vector operations on the coordinates of four consecutive residues. The angle formed by three consecutive Cα atoms is bisected, giving a vector V1 that is perpendicular to the helix axis. Repeating this process with the next trio gives a second vector, V2, that is also perpendicular to the helix axis. Let P1 and P2 be the central atoms of the two trios. If the radius r and the distance d along the axis between its intersections with the extensions of V1 and V2 are known, then the axis segment can be calculated. The algorithm scans along the chain one residue at a time, yielding a set of axis segments for sets of four consecutive residues. To minimize the perpendicular distances between the segment-axis coordinates and the fitted line, Kahn also applied simple non-iterative linear procedures for finding the least-squares line in three dimensions [11]. The sum of the squared distances between the set of axis-segment points and the fitted line is given by
$$ \sum d_{i}^{2} = \sum \left[ (x_{i} - \alpha_{1} t_{i})^{2} + (y_{i} - \alpha_{2} t_{i})^{2} + (z_{i} - \alpha_{3} t_{i})^{2} \right] $$
(3)
where α1, α2 and α3 are the direction cosines of the fitted line with respect to the x, y and z axes, respectively, and ti is the distance along the line from the origin to the foot of the perpendicular dropped from data point i onto the fitted line. Differentiating Eq. (3) with respect to α and the ti, and converting to matrix notation yields
$$ (X - \alpha t^{\prime} )t = 0 $$
(4)
$$ \alpha^{\prime} (X - \alpha t^{\prime}) = 0 $$
(5)
where X is a matrix of the data coordinates such that the first row contains xi, the second yi and the third zi. α is the direction cosine vector, and t is the vector of ti. The prime indicates the transpose. Modifying Eq. (4) and (5),
$$ XX^{\prime}\alpha = (t^{\prime} t)\,\alpha $$
(6)
is obtained. The secular determinant of Eq. (6) is formulated from the data as follows.
$$ \left| XX^{\prime} \right| = \left( \begin{array}{ccc} \sum x_{i}^{2} & \sum x_{i} y_{i} & \sum x_{i} z_{i} \\ \sum x_{i} y_{i} & \sum y_{i}^{2} & \sum y_{i} z_{i} \\ \sum x_{i} z_{i} & \sum y_{i} z_{i} & \sum z_{i}^{2} \end{array} \right) $$
(7)
The solution yields three eigenvalues and the corresponding eigenvectors. Normalizing the eigenvector corresponding to the largest eigenvalue yields the direction cosines for the fitted line.
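The segment construction described above (bisectors V1 and V2, central atoms P1 and P2, radius r and rise d) can be illustrated as follows. This is a hedged sketch rather than Kahn's or QHELIX's code: the expression for r is derived here from the requirement that both axis points lie at distance r along their bisectors, and the ideal α-helix parameters used for the demonstration (radius 2.3 Å, rise 1.5 Å, 100° per residue) are textbook values assumed for illustration. All function names are hypothetical.

```python
import math

def sub(a, b): return tuple(x - y for x, y in zip(a, b))
def add(a, b): return tuple(x + y for x, y in zip(a, b))
def scale(a, s): return tuple(x * s for x in a)
def dot(a, b): return sum(x * y for x, y in zip(a, b))
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])
def unit(a): return scale(a, 1.0 / math.sqrt(dot(a, a)))

def bisector(prev, mid, nxt):
    """Unit vector bisecting the angle prev-mid-nxt (points from mid toward the axis).
    Undefined if the three atoms are collinear."""
    return unit(add(unit(sub(prev, mid)), unit(sub(nxt, mid))))

def local_axis(ca):
    """Axis segment for four consecutive C-alpha positions ca[0..3].
    Returns (h1, h2, direction): the two axis points and the unit axis direction."""
    p1, p2 = ca[1], ca[2]
    v1 = bisector(ca[0], ca[1], ca[2])
    v2 = bisector(ca[1], ca[2], ca[3])
    h = unit(cross(v1, v2))          # perpendicular to both bisectors
    step = sub(p2, p1)
    d = dot(step, h)                 # rise along the axis between P1 and P2
    # Radius from |P2-P1|^2 = d^2 + 2 r^2 (1 - V1.V2) and (P2-P1).V2 = -r (1 - V1.V2)
    r = (dot(step, step) - d * d) / (2.0 * abs(dot(step, v2)))
    return add(p1, scale(v1, r)), add(p2, scale(v2, r)), h

def ideal_helix(n, radius=2.3, rise=1.5, turn_deg=100.0):
    """C-alpha trace of an idealized alpha-helix wound around the z-axis (assumed parameters)."""
    phi = math.radians(turn_deg)
    return [(radius * math.cos(i * phi), radius * math.sin(i * phi), rise * i)
            for i in range(n)]
```

For the idealized helix, `local_axis(ideal_helix(4))` returns two points on the z-axis and a direction along +z, as expected; sliding the four-residue window along a real helix yields the set of axis points that is then fitted by the least-squares line.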
The angle between two fitted axes can be easily calculated using general vector algebra.
$$ \left| \mathrm{Angle} \right| = \cos^{-1}(E_{1} \cdot E_{2}) $$
(8)
where E1 and E2 are unit vectors parallel to the axes of helices 1 and 2, respectively. Equation (8), however, does not give the sign of the angle. To determine the sign, the concept of dihedral angles of protein structures is used. First, the cross product of the unit vectors of the two helical axes is calculated. The resulting vector is perpendicular to both axes. Using a rotation matrix, the cross-product vector is aligned with the z-axis, i.e., e = (0, 0, 1). After this rotation, the unit vectors of the helical axes lie in the xy-plane (their z components are zero), and the two helical axes are parallel to the xy-plane (although their z-coordinates are not necessarily zero). The z-coordinates of the N-terminal points of the two helical axes are then compared. This measure represents the distance of each helix from the origin of the cross product. Of the two axes, the one farther from the origin of the cross product becomes the base. The clockwise angle between the base and the other vector has a positive sign and the anti-clockwise angle has a negative sign (Fig. 2). Although the Chou method also used the concept of dihedral angles to define the sign of the inter-helical angle [12], the reference (i.e., the base) was assigned to the helix that was closer to the origin of the coordinate system. Thus, the sign of the angle between two helices may change when the coordinates are shifted. Our present method assigns a consistent sign to the angle between two given helices under rotational or translational movement of the whole protein in the given coordinate system.
https://static-content.springer.com/image/art%3A10.1007%2Fs10930-007-9097-9/MediaObjects/10930_2007_9097_Fig2_HTML.gif
Fig. 2

Determination of the sign of an inter-helical angle. The cross product of the two axes, α1 and α2, is calculated first. Then, of the two axes, the one whose N-terminal coordinate is farther from the origin of the cross product is assigned as the base. The clockwise angle between the base and the other vector has a positive sign and the anti-clockwise angle has a negative sign
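The sign rule can be made concrete with a small sketch. It follows one plausible reading of the description above and is not taken from the QHELIX source: instead of rotating the cross product onto the z-axis, it projects the offset between the two N-terminal points onto the cross product, which gives the same comparison of z-coordinates. The viewing direction used to decide "clockwise" is an assumption, so the overall sign convention may be the mirror image of the program's. What the sketch is meant to show is that the result depends only on the two helices, not on the coordinate origin or on which helix is listed first. The function name `signed_angle` is hypothetical.

```python
import math

def dot(a, b): return sum(x * y for x, y in zip(a, b))
def sub(a, b): return tuple(x - y for x, y in zip(a, b))
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def signed_angle(e1, n1, e2, n2):
    """Signed inter-helical angle in degrees.

    e1, e2: unit axis vectors pointing from N- to C-terminus.
    n1, n2: N-terminal points of the two axes.
    """
    # Magnitude from Eq. (8); the clamp guards against rounding outside [-1, 1].
    magnitude = math.degrees(math.acos(max(-1.0, min(1.0, dot(e1, e2)))))

    normal = cross(e1, e2)  # perpendicular to both axes
    # Separation of the helices along the common normal. Only the difference
    # n1 - n2 enters, so a translation of the whole protein cancels out.
    offset = dot(normal, sub(n1, n2))

    # Assumed convention: with helix 1 farther along the normal, the turn from
    # e1 to e2 is anti-clockwise when viewed down the normal, hence negative.
    # Parallel or intersecting axes (offset == 0) have no defined sign;
    # the positive magnitude is returned in that case.
    return -magnitude if offset > 0 else magnitude
```

Swapping the two helices reverses both the cross product and the offset vector, so the sign is unchanged, which is the consistency property the method aims for.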

3 Results and Discussion

Qhelix was used to characterize inter-helical angles in domains of Staphylococcal protein A (SpA), a well-known pathogenicity factor from the bacterium Staphylococcus aureus (Fig. 3). Since the 3D structure and inter-helical orientations of its domains have already been reported [13, 14, 15, 16], we were able to validate the results of the Qhelix calculation by comparing them with previous data. In the present analysis, two different algorithms for defining the helical axis, i.e., the Chou algorithm for Qhelix-1 and the Kahn algorithm for Qhelix-2, were used to determine the inter-helical angles in the chain folds of the Z, B, E and Fc-bound B domains of SpA (Table 1). The results were compared with those of Tashiro et al., who implemented the Chou algorithm for the helix definition [16]. All the calculated angles showed excellent agreement among the three methods. Since helices 1, 2 and 3 of the SpA domains are relatively short and linear, the different methods for helical axis definition did not produce any significant variation in the calculated helical angles. The signs of the angles also agree perfectly between the Qhelix and Tashiro calculations. In previous methods, the reference helix must be defined in advance in order to determine the sign of an inter-helical angle [7, 16]. The reference helix was determined based on its relative distance to the origin of the coordinate system, which may cause inconsistency in determining the sign during translational or rotational movement of the protein. To avoid this ambiguity in determining the reference helix, Qhelix calculates the cross product of the unit vectors of the two given helical axes. The resulting vector is perpendicular to both axes. Of the two axes, the one farther from the origin of the cross product is automatically assigned as the reference helix for determining the sign of the inter-helical angle (see Sect. 2 for details).
In NMR structural data where two helical axes lie very close to each other in parallel, the reference helix might be inconsistent among multiple conformations. As a result, the sign of the angle could change from one conformation to another in the ensemble. Nevertheless, our approach for determining the sign of the inter-helical orientation is strict and convenient because there is no need to manually choose a reference helix or to number helices sequentially. This feature is an advantage when analyzing helical geometry on a large scale.
https://static-content.springer.com/image/art%3A10.1007%2Fs10930-007-9097-9/MediaObjects/10930_2007_9097_Fig3_HTML.gif
Fig. 3

Ribbon diagram of the three- and two-helix bundle domains of Staphylococcal protein A. Helix 1, helix 2 and helix 3 are depicted as α1, α2 and α3, respectively. (a) Z domain (PDB entry: 2SPZ). (b) B domain (PDB entry: 1BDC). (c) E domain (PDB entry: 1EDL). (d) Fc-bound B domain (PDB entry: 1FC2)

Table 1

Comparison of inter-helical angles (in degrees) in the Z domain, B domain, Fc-bound B domain and E domain of staphylococcal protein A

| Domain | θ12 Tashiro | θ12 Qhelix-1 | θ12 Qhelix-2 | θ13 Tashiro | θ13 Qhelix-1 | θ13 Qhelix-2 | θ23 Tashiro | θ23 Qhelix-1 | θ23 Qhelix-2 |
|---|---|---|---|---|---|---|---|---|---|
| Z domain | −170 | −167 | −169 | +16 | +19 | +12 | +173 | +171 | +173 |
| B domain | −148 | −149 | −143 | +48 | +46 | +51 | −168 | −161 | −163 |
| E domain | +175 | +166 | +166 | +16 | +12 | +15 | −177 | −168 | −168 |
| Fc bound B domain | −171 | −171 | −173 | | | | | | |

θ12, θ13 and θ23 are the inter-helical angles between helix 1 and 2, helix 1 and 3, and helix 2 and 3, respectively. Angles reported by Tashiro et al. were calculated using the Chou algorithm [16]. The Qhelix calculation was carried out using two different helix axis definition algorithms (the Chou algorithm in Qhelix-1 and the Kahn algorithm in Qhelix-2; see Sect. 2 for details). For the Z, B and E domains, the average structure of ten NMR conformers was used in the calculation. The helical axes and angles were computed for residue ranges 7–17, 24–36 and 41–54, corresponding to helix 1, 2 and 3. For the single crystal structure of the Fc-bound B domain, the residue ranges for helix 1 and 2 were 128–136 and 144–155

A significant number of helices in proteins have an irregular, curved shape. To annotate the helical geometry appropriately, the irregularity of the helical shape should be measured quantitatively in addition to the inter-helical angles. We applied the Qhelix methods to determine the inter-helical angle in a structure containing an irregular helix (Fig. 4). Helix 16 of acetohydroxy acid isomeroreductase (PDB entry: 1YVE) has a long, curved shape, while helix 19 of the protein is short and regular. The inter-helical angle between these two helices is −35.25° by the Chou method and −36.15° by the Kahn method (Table 2), so the two methods agree to within about 1° in the angle determination. To estimate the irregularity of the helices in this angle calculation, we measured the average perpendicular distance (hir in Table 2) between the reference data points and the fitted axis. The reference data points used in the calculation of the helical axis are the coordinates of all Cα atoms in the Chou algorithm, whereas they are the set of segment axes in the Kahn algorithm (see Sect. 2 for details). The Qhelix-1 calculation based on the Chou algorithm gave a difference in hir (Δhir) between helix 16 and helix 19 of 0.54 Å, while Qhelix-2 based on the Kahn algorithm gave a difference of 1.49 Å. Although helix 16 is significantly curved in comparison with helix 19, the Chou algorithm intrinsically does not provide a way to quantify the deviation. In contrast, the result calculated by the Kahn method shows that hir for helix 16 is distinctly different from that of helix 19. Helix 16, with an irregularly curved geometry, has an hir of 1.72 Å, whereas that of helix 19, with an ideal α-helical geometry, is near zero (0.23 Å). This result shows that Qhelix-2, using the Kahn algorithm, is able to quantitatively discriminate irregularly shaped helices from helices with an ideal geometry.
https://static-content.springer.com/image/art%3A10.1007%2Fs10930-007-9097-9/MediaObjects/10930_2007_9097_Fig4_HTML.gif
Fig. 4

Graphical presentation of helix axes in a protein with an irregularly curved long helix. The axes of helices 16 (residue range 379–408, 30 residues) and 19 (residue range 462–481, 20 residues) of acetohydroxy acid isomeroreductase (PDB entry: 1YVE) are calculated by (a) the Chou method and (b) the Kahn method. The calculated axes are shown as bold lines. The curvature calculated by the Kahn method is depicted as black spheres in (b)

Table 2

Comparison of the inter-helical angle, θ16–19, and the helix irregularity, hir, between an irregularly curved helix (helix 16) and a helix with an ideal geometry (helix 19)

| Method | θ16–19 (°) | hir, Helix 16 (Å) | hir, Helix 19 (Å) | Δhir (Å) |
|---|---|---|---|---|
| Qhelix-1 | −35.25 | 2.90 | 2.36 | 0.54 |
| Qhelix-2 | −36.15 | 1.72 | 0.23 | 1.49 |

hir is the average displacement of the reference data points from the fitted axis

The Qhelix calculation was carried out using two different helix axis definition algorithms (the Chou algorithm in Qhelix-1 and the Kahn algorithm in Qhelix-2). The 3D structures of helices 16 and 19 are displayed in Fig. 4
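The average displacement hir reported in Table 2 is a mean point-to-line distance, which can be sketched as below. This is an illustrative reading of the definition given in the text, not the QHELIX source; the function name `mean_axis_distance` is hypothetical, and the idealized helix parameters (radius 2.3 Å, rise 1.5 Å, 100° per residue) are assumed textbook values.

```python
import math

def mean_axis_distance(points, origin, direction):
    """Average perpendicular distance from points to the line origin + t * direction.

    direction must be a unit vector; origin is any point on the fitted axis
    (for example the centroid of the reference points).
    """
    total = 0.0
    for p in points:
        rel = [p[k] - origin[k] for k in range(3)]
        t = sum(rel[k] * direction[k] for k in range(3))      # position along the axis
        perp = [rel[k] - t * direction[k] for k in range(3)]  # component off the axis
        total += math.sqrt(sum(x * x for x in perp))
    return total / len(points)

# C-alpha atoms of an idealized helix wound around the z-axis (assumed parameters).
ideal = [(2.3 * math.cos(math.radians(100 * i)),
          2.3 * math.sin(math.radians(100 * i)),
          1.5 * i) for i in range(20)]
print(round(mean_axis_distance(ideal, (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)), 2))  # 2.3
```

When the reference points are Cα atoms, even a perfectly regular helix gives a value close to the helix radius, which is consistent with the 2.36 Å reported for helix 19 by Qhelix-1. When the reference points are local axis points, a regular helix gives a value near zero, so the measure reflects curvature only.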

Orientational preferences between interacting helices within proteins have been studied extensively over the years [9, 17]. However, the influence of helical irregularity on the angle calculation has not been considered quantitatively so far. Angles between irregular helices can be defined in various ways, and are qualitatively different from those between regular helices. Therefore, for a more accurate evaluation of the orientation preferences between helices on a large scale, the problem of non-linear helical axes should be appropriately considered in the angle calculation. What are the qualitative differences between helix pairs with similar inter-helical angles but different helical shapes? What residue ranges should be used to define the helix axes and to calculate the inter-helical angle? While the two optional methods that our Qhelix program provides produce similar inter-helical angles, the estimation of hir by the Kahn method provides further insight into the orientation between interacting helices. It is expected that QHELIX will find useful applications in proteome-wide analysis of helical geometry in proteins. In particular, in problems of protein folding, the relationship between secondary structure elements should be understood in terms of the surrounding tertiary context. QHELIX, which provides an improved method for the analysis of various helical shapes and their spatial relationships, will contribute to elucidating the relationship between tertiary interactions and local secondary structure formation in protein folding.

4 Software Availability

QHELIX was developed in the C language and compiled using the GNU C compiler (gcc). The source code of the program is provided as Supplement 1. In addition to the source code, all the math library components and instructions for compilation are freely available on our web site, http://compbio.sookmyung.ac.kr/~qhelix/.

Acknowledgment

This research was supported by Sookmyung Women’s University Research Grants 1-0603-0149.

Supplementary material

10930_2007_9097_MOESM1_ESM.pdf (21 kb)

Copyright information

© Springer Science+Business Media, LLC 2007