A Modular Perspective of Protein Structures: Application to Fragment Based Loop Modeling

  • Narcis Fernandez-Fuentes
  • Andras Fiser
Part of the Methods in Molecular Biology book series (MIMB, volume 932)


Proteins can be decomposed into supersecondary structure modules. We used a generic definition of supersecondary structure elements, so-called Smotifs, each composed of two regular secondary structure elements connected by a loop, to explore the evolution and current variety of structural building blocks. Here, we discuss recent observations about the saturation of Smotif geometries in protein structures and how this saturation opens new avenues in protein structure modeling and design. As a first application of these observations, we describe our loop conformation modeling algorithm, ArchPred, which takes advantage of the Smotif classification. Instead of focusing on specific loop properties, the method narrows down possible template conformations in other, often nonhomologous, structures by identifying the most likely supersecondary structure environment that cradles the loop. Beyond identifying the correct starting supersecondary structure geometry, it takes into account the fit of anchor residues, steric clashes, the match between predicted and observed dihedral angle preferences, and local sequence signal.
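The filter-then-rank idea described above can be illustrated with a minimal sketch. This is not the ArchPred implementation: the `Candidate` fields, the `select_templates` function, the geometry labels, the thresholds, and the ranking order are all hypothetical and chosen only to show how the listed criteria (supersecondary structure geometry, anchor fit, steric clashes, dihedral agreement, sequence signal) might be combined.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """A hypothetical loop template fragment from a Smotif-style library."""
    name: str
    geometry_class: str    # illustrative label of the Smotif geometry bin
    anchor_rmsd: float     # fit of anchor residues to the query stems (angstroms)
    n_clashes: int         # steric clashes with the rest of the structure
    dihedral_match: float  # 0..1 agreement of predicted vs observed dihedral classes
    sequence_score: float  # local sequence similarity, higher is better


def select_templates(candidates: List[Candidate],
                     query_geometry: str,
                     max_anchor_rmsd: float = 1.0,  # assumed cutoff, not from the source
                     max_clashes: int = 0) -> List[Candidate]:
    """Keep candidates that share the query geometry, fit the anchors and
    avoid clashes, then rank the survivors."""
    kept = [
        c for c in candidates
        if c.geometry_class == query_geometry
        and c.anchor_rmsd <= max_anchor_rmsd
        and c.n_clashes <= max_clashes
    ]
    # Rank by dihedral agreement, then sequence signal, then tighter anchor fit.
    return sorted(
        kept,
        key=lambda c: (-c.dihedral_match, -c.sequence_score, c.anchor_rmsd),
    )
```

In this sketch the geometry label does most of the pruning before any loop-specific score is consulted, which mirrors the emphasis on the supersecondary structure environment rather than on the loop alone.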

Key words

Secondary structure; Supersecondary structure; Smotif; Loop modeling; Protein structure evolution; Protein structure modeling; Protein structure design



This work was supported by NIH grant R01GM096041. This review is partially based on our previous publications (refs. 17, 23, 25, and 57). NFF acknowledges support from the Research Councils UK under the RCUK Academic Fellowship scheme.


References

  1. Murzin AG, Brenner SE, Hubbard T et al (1995) SCOP: a structural classification of proteins database for the investigation of sequences and structures. J Mol Biol 247:536
  2. Hadley C, Jones DT (1999) A systematic comparison of protein structure classifications: SCOP, CATH and FSSP. Struct Fold Des 7:1099
  3. Lupas AN, Ponting CP, Russell RB (2001) On the evolution of protein folds: are similar motifs in different protein folds the result of convergence, insertion, or relics of an ancient peptide world? J Struct Biol 134:191
  4. Alva V, Remmert M, Biegert A et al (2010) A galaxy of folds. Protein Sci 19:124–130
  5. Holm L, Sander C (1993) Protein structure comparison by alignment of distance matrices. J Mol Biol 233:123
  6. Boutonnet NS, Kajava AV, Rooman MJ (1998) Structural classification of alphabetabeta and betabetaalpha supersecondary structure units in proteins. Proteins 30:193–212
  7. Wintjens RT, Rooman MJ, Wodak SJ (1996) Automatic classification and analysis of alpha alpha-turn motifs in proteins. J Mol Biol 255:235–253
  8. Presnell SR, Cohen BI, Cohen FE (1992) A segment-based approach to protein secondary structure prediction. Biochemistry 31:983
  9. Berezovsky IN, Grosberg AY, Trifonov EN (2000) Closed loops of nearly standard size: common basic element of protein structure. FEBS Lett 466:283–286
  10. Trifonov EN, Frenkel ZM (2009) Evolution of protein modularity. Curr Opin Struct Biol 19:335–340
  11. Chintapalli SV, Yew BK, Illingworth CJ et al (2010) Closed loop folding units from structural alignments: experimental foldons revisited. J Comput Chem 31:2689–2701
  12. Papandreou N, Berezovsky IN, Lopes A et al (2004) Universal positions in globular proteins. Eur J Biochem 271:4762–4768
  13. Friedberg I, Godzik A (2005) Connecting the protein structure universe by using sparse recurring fragments. Structure 13:1213–1224
  14. Voigt CA, Martinez C, Wang ZG et al (2002) Protein building blocks preserved by recombination. Nat Struct Biol 9:553–558
  15. Tsai CJ, Maizel JV Jr, Nussinov R (2000) Anatomy of protein structures: visualizing how a one-dimensional protein chain folds into a three-dimensional shape. Proc Natl Acad Sci USA 97:12038–12043
  16. Tsai CJ, Polverino de Laureto P et al (2002) Comparison of protein fragments identified by limited proteolysis and by computational cutting of proteins. Protein Sci 11:1753–1770
  17. Fernandez-Fuentes N, Oliva B, Fiser A (2006) A supersecondary structure library and search algorithm for modeling loops in protein structures. Nucleic Acids Res 34:2085–2097
  18. Oliva B, Bates PA, Querol E et al (1997) An automated classification of the structure of protein loops. J Mol Biol 266:814
  19. Zemla A (2003) LGA: a method for finding 3D similarities in protein structures. Nucleic Acids Res 31:3370–3374
  20. Andreeva A, Howorth D, Chandonia JM et al (2008) Data growth and its impact on the SCOP database: new developments. Nucleic Acids Res 36:D419–425
  21. Kabsch W, Sander C (1983) Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 22:2577–2637
  22. Moult J (2005) A decade of CASP: progress, bottlenecks and prognosis in protein structure prediction. Curr Opin Struct Biol 15:285–289
  23. Fernandez-Fuentes N, Fiser A (2006) Saturating representation of loop conformational fragments in structure databanks. BMC Struct Biol 6:15
  24. Orengo CA, Pearl FM, Bray JE et al (1999) The CATH Database provides insights into protein structure/function relationships. Nucleic Acids Res 27:275
  25. Fernandez-Fuentes N, Dybas JM, Fiser A (2010) Structural characteristics of novel protein folds. PLoS Comput Biol 6:e1000750
  26. Zhang Y (2007) Template-based modeling and free modeling by I-TASSER in CASP7. Proteins 69(Suppl 8):108–117
  27. Das R, Baker D (2008) Macromolecular modeling with rosetta. Annu Rev Biochem 77:363–82
  28. Fiser A, Feig M, Brooks CL III, Sali A (2002) Evolution and physics in comparative protein structure modeling. Acc Chem Res 35:413–421
  29. Baker D, Sali A (2001) Protein structure prediction and structural genomics. Science 294:93–96
  30. Blouin C, Butt D, Roger AJ (2004) Rapid evolution in conformational space: a study of loop regions in a ubiquitous GTP binding domain. Protein Sci 13:608–616
  31. Fiser A, Simon I, Barton GJ (1996) Conservation of amino acids in multiple alignments: aspartic acid has unexpected conservation. FEBS Lett 397:225
  32. Kim ST, Shirai H, Nakajima N et al (1999) Enhanced conformational diversity search of CDR-H3 in antibodies: role of the first CDR-H3 residue. Proteins 37:683–696
  33. Saraste M, Sibbald PR, Wittinghofer A (1990) The P-loop–a common motif in ATP- and GTP-binding proteins. Trends Biochem Sci 15:430–434
  34. Kawasaki H, Kretsinger RH (1995) Calcium-binding proteins 1: EF-hands. Protein Profile 2:297–490
  35. Wierenga RK, Terpstra P, Hol WG (1986) Prediction of the occurrence of the ADP-binding beta alpha beta-fold in proteins, using an amino acid sequence fingerprint. J Mol Biol 187:101–107
  36. Tainer JA, Thayer MM, Cunningham RP (1995) DNA repair proteins. Curr Opin Struct Biol 5:20–26
  37. Johnson LN, Lowe ED, Noble ME et al (1998) The eleventh datta lecture. The structural basis for substrate recognition and control by protein kinases. FEBS Lett 430:1–11
  38. Wlodawer A, Miller M, Jaskolski M et al (1989) Conserved folding in retroviral proteases: crystal structure of a synthetic HIV-1 protease. Science 245:616–621
  39. Fiser A, Do RK, Sali A (2000) Modeling of loops in protein structures. Protein Sci 9:1753
  40. Fine RM, Wang H, Shenkin PS et al (1986) Predicting antibody hypervariable loop conformations. II: Minimization and molecular dynamics studies of MCPC603 from many randomly generated loop conformations. Proteins 1:342
  41. Moult J, James MN (1986) An algorithm for determining the conformation of polypeptide segments in proteins by systematic search. Proteins 1:146
  42. Bruccoleri RE, Karplus M (1987) Prediction of the folding of short polypeptide segments by uniform conformational sampling. Biopolymers 26:137
  43. Jones TA, Thirup S (1986) Using known substructures in protein model building and crystallography. EMBO J 5:819
  44. Chothia C, Lesk AM (1987) Canonical structures for the hypervariable regions of immunoglobulins. J Mol Biol 196:901
  45. Fidelis K, Stern PS, Bacon D, Moult J (1994) Comparison of systematic search and database methods for constructing segments of protein structure. Protein Eng 7:953
  46. Deane CM, Blundell TL (2001) CODA: a combined algorithm for predicting the structurally variable regions of protein models. Protein Sci 10:599
  47. Martin AC, Cheetham JC, Rees AL (1989) Modeling antibody hypervariable loops: a combined algorithm. Proc Natl Acad Sci USA 86:9268–9272
  48. Greer J (1981) Comparative model-building of the mammalian serine proteases. J Mol Biol 153:1027
  49. Gunasekaran K, Ramakrishnan C, Balaram P (1997) Beta-hairpins in proteins revisited: lessons for de novo design. Protein Eng 10:1131–1141
  50. Michalsky E, Goede A, Preissner R (2003) Loops in proteins (LIP)–a comprehensive loop database for homology modelling. Protein Eng 16:979
  51. Heuser P, Wohlfahrt G, Schomburg D (2004) Efficient methods for filtering and ranking fragments for the prediction of structurally variable regions in proteins. Proteins 54:583–595
  52. Kolaskar AS, Kulkarni-Kale U (1992) Sequence alignment approach to pick up conformationally similar protein fragments. J Mol Biol 223:1053–1061
  53. Shortle D (2002) Composites of local structure propensities: evidence for local encoding of long-range structure. Protein Sci 11:18–26
  54. Fiser A, Sali A (2003) ModLoop: automated modeling of loops in protein structures. Bioinformatics 19:2500
  55. Du P, Andrec M, Levy RM (2003) Have we seen all structures corresponding to short protein fragments in the Protein Data Bank? An update. Protein Eng 16:407
  56. Choi Y, Deane CM (2010) FREAD revisited: accurate loop structure prediction using a database search algorithm. Proteins 78:1431–1440
  57. Fernandez-Fuentes N, Zhai J, Fiser A (2006) ArchPRED: a template based loop structure prediction server. Nucleic Acids Res 34:W173–176

Copyright information

© Springer Science+Business Media New York 2012

Authors and Affiliations

  1. Institute of Biological, Environmental and Rural Sciences (IBERS), Aberystwyth University, Aberystwyth, Ceredigion, UK
  2. Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, USA
  3. Department of Biochemistry, Albert Einstein College of Medicine, Bronx, USA
