The Protein Journal, Volume 25, Issue 5, pp 301–315

Quantitative Analysis of the Conservation of the Tertiary Structure of Protein Segments

  • Jishou Ruan
  • Ke Chen
  • Jack A. Tuszynski
  • Lukasz A. Kurgan


The crystallographic structure of calmodulin offers an example suggesting that many protein sequence segments may adopt more than one 3D structure; we refer to such segments as multi-structural segments. To examine this, this paper presents a statistical analysis of the uniqueness of the 3D structure of all protein sequence segments stored in the Protein Data Bank (PDB, release 103, January 2003) that occur at least twice and are longer than 10 amino acids (AAs). We refined this set by keeping only segments that are not part of longer segments, which yielded 9297 segments, called the sponge set. Adding 8197 signature segments, which occur only once in the PDB, to the sponge set produced a benchmark set. Statistical analysis of the sponge set shows that three operations described in the text (rotating, missing, and disarranging) cause segments to become multi-structural. Missing segments, however, do not change the shape of the 3D structure of a multi-structural segment. We use the root mean square distance for unit vector sequence (URMSD) as an improved measure to characterize hinge rotations, missing segments, and disarranging segments. We estimated the occurrence of rotating and disarranging segments in the sponge set and divided it by the number of sequences in the benchmark set; the resulting rate is less than 0.85%. Because two of the structure-changing operations affect a negligible number of segments and the third has no impact on the structure, we conclude that the 3D structure of proteins is statistically conserved for more than 98% of the segments. The remaining 2% of the sequences may nevertheless pose problems for structure prediction methods based on sequence alignment.
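
The segment selection described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that "not part of a longer segment" means a repeated segment that is not a substring of any longer repeated segment, and that overlapping occurrences count separately. The function names `repeated_segments` and `sponge_set` and the toy chains are hypothetical.

```python
from collections import Counter


def repeated_segments(chains, length):
    """Return every substring of exactly `length` residues that occurs at least twice.

    Occurrences are counted over all chains; overlapping occurrences
    within one chain are counted separately (an assumption of this sketch).
    """
    counts = Counter(
        seq[i:i + length]
        for seq in chains.values()
        for i in range(len(seq) - length + 1)
    )
    return {seg for seg, n in counts.items() if n >= 2}


def sponge_set(chains, min_len=11):
    """Collect repeated segments longer than 10 AAs that are not part of a longer one.

    A repeated segment of length L lies inside a longer repeated segment
    exactly when it is a prefix or suffix of some repeated segment of
    length L + 1, so comparing adjacent lengths is sufficient.
    """
    maximal = set()
    length = min_len
    current = repeated_segments(chains, length)
    while current:
        longer = repeated_segments(chains, length + 1)
        covered = {t[:-1] for t in longer} | {t[1:] for t in longer}
        maximal |= current - covered
        current, length = longer, length + 1
    return maximal


# Toy input (hypothetical sequences, not taken from the PDB).
chains = {
    "A": "MKTAYIAKQRQISFVKSH",
    "B": "GGMKTAYIAKQRQISWW",
    "C": "ACDEFGHIKLMNPQRSTVWY",
}
print(sponge_set(chains))  # → {'MKTAYIAKQRQIS'}
```

Chains A and B share the 13-residue segment `MKTAYIAKQRQIS`. Its 11- and 12-residue substrings also occur twice, but they are discarded because they are parts of the longer segment. The sketch recounts substrings at every length, which is adequate for an illustration but would be slow on a full PDB release.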


Multi-structural segments; protein structure; protein structure comparison; protein structure conservation; URMSD


  • AA: amino acid
  • PDB: Protein Data Bank
  • RMSD: root mean square distance
  • URMSD: root mean square distance for unit vector sequence


Copyright information

© Springer Science+Business Media, Inc. 2006

Authors and Affiliations

  • Jishou Ruan (1)
  • Ke Chen (4)
  • Jack A. Tuszynski (2, 3)
  • Lukasz A. Kurgan (4)

  1. Chern Institute of Mathematics, College of Mathematical Science & LPMC, Nankai University, Tianjin, P. R. China
  2. Department of Physics, University of Alberta, Edmonton, Canada
  3. Department of Experimental Oncology, Cross Cancer Institute, Edmonton, Canada
  4. Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Canada
