“Pinning strategy”: a novel approach for predicting the backbone structure in terms of protein blocks from sequence

Journal of Biosciences

Abstract

Protein 3D structures can be described with a library of 3D fragments, called a structural alphabet. Our structural alphabet is composed of 16 small protein fragments, each 5 Cα in length, called protein blocks (PBs). It gives an efficient approximation of 3D protein structures and a correct prediction of the local structure. The 72 most frequent series of 5 consecutive PBs, called structural words (SWs), cover more than 90% of the 3D structures. Successions of PBs are strongly constrained, because only a limited number of transitions between them occur. In this study, we propose a new method, called the “pinning strategy”, that uses this feature to predict long protein fragments. Its goal is to define highly probable successions of PBs: it starts from the most probable SW, which is then extended with overlapping SWs. Starting from an initial prediction rate of 34.4%, the use of SWs instead of PBs allows a gain of 4.5%. The pinning strategy simply applied to the SWs increases the prediction accuracy to 39.9%. In a second step, the sequence-structure relationship is optimized and the prediction accuracy reaches 43.6%.
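The extension step described in the abstract (pin the most probable SW, then grow the prediction with overlapping SWs) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the scoring function or the stopping rule, so the per-position score tables, the `min_score` threshold, the greedy choice of the best compatible word, and the toy PB letters below are all assumptions made for illustration. Two words at neighbouring positions are treated as compatible when they share 4 of their 5 PBs.

```python
from typing import Dict, List, Tuple

WORD_LEN = 5  # a structural word (SW) spans 5 consecutive protein blocks


def pin_and_extend(
    scores: List[Dict[str, float]], min_score: float = 0.0
) -> Tuple[int, str]:
    """Greedy sketch of a pin-then-extend prediction.

    scores[i] maps each candidate SW starting at position i to a
    hypothetical score (higher is better). Returns the start position
    of the predicted fragment and its PB string.
    Raises ValueError if no candidate word is supplied at all.
    """
    # 1. Pin: take the best-scoring (position, word) pair overall.
    start, word = max(
        ((i, w) for i, cand in enumerate(scores) for w in cand),
        key=lambda iw: scores[iw[0]][iw[1]],
    )
    pbs = word

    # 2. Extend to the right: the next word must begin with the
    #    last 4 PBs of the current word.
    pos, cur = start, word
    while pos + 1 < len(scores):
        nxt = {w: s for w, s in scores[pos + 1].items()
               if w[:-1] == cur[1:] and s >= min_score}
        if not nxt:
            break
        cur = max(nxt, key=nxt.get)
        pbs += cur[-1]          # only one new PB is added per step
        pos += 1

    # 3. Extend to the left: the previous word must end with the
    #    first 4 PBs of the current word.
    pos, cur = start, word
    while pos - 1 >= 0:
        prv = {w: s for w, s in scores[pos - 1].items()
               if w[1:] == cur[:-1] and s >= min_score}
        if not prv:
            break
        cur = max(prv, key=prv.get)
        pbs = cur[0] + pbs
        pos -= 1

    return pos, pbs


# Toy input with made-up scores and letters, one table per start position.
toy_scores = [
    {"mmmmm": 0.2, "ddddd": 0.1},
    {"mmmmm": 0.9},               # highest score: this word is the pin
    {"mmmmn": 0.5, "mmmmm": 0.4},
    {"mmmno": 0.3},
    {"ddddd": 0.8},               # does not overlap, so extension stops
]
print(pin_and_extend(toy_scores))
```

On the toy input the fragment grows from the pin at position 1 in both directions and stops on the right when the only candidate no longer overlaps the current word. A real predictor would also need a rule for restarting from a new pin in the uncovered regions, which this sketch leaves out.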

Abbreviations

PB: protein block

PSOWs: preferential succession of overlapping structural words

SF: sequence family

SWs: structural words

Author information

Corresponding author

Correspondence to A G de Brevern.

Additional information

Both authors contributed equally to this work.

About this article

Cite this article

de Brevern, A.G., Etchebest, C., Benros, C. et al. “Pinning strategy”: a novel approach for predicting the backbone structure in terms of protein blocks from sequence. J Biosci 32, 51–70 (2007). https://doi.org/10.1007/s12038-007-0006-3

  • DOI: https://doi.org/10.1007/s12038-007-0006-3
