Abstract
Owing to advances in sequencing technologies, the gap between the number of known protein sequences and the number of experimentally determined protein structures keeps widening. Community-wide initiatives such as CASP have spurred considerable effort in developing computational methods that accurately model protein structures from sequences. Sequence-based prediction of super-secondary structure has direct application in protein structure prediction, and significant effort has gone into it over the last decade. In this chapter, we first introduce the protein structure prediction problem and highlight important progress in the field. Next, we discuss recent methods for predicting super-secondary structures. We then discuss applications of super-secondary structure prediction to the structure prediction and analysis of proteins, including the prediction of protein structures composed of simple super-secondary structure repeats and of those composed of complex super-secondary structure repeats. Finally, we discuss recent trends in the field.
Funding
D.B.K.C. is partly supported by a start-up grant from the Department of Computational Science and Engineering at North Carolina A&T State University. D.B.K.C. is also partly supported by NSF grant no. 1564606 and NSF grant no. 1647884.
Copyright information
© 2019 Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
MacCarthy, E., Perry, D., KC, D.B. (2019). Advances in Protein Super-Secondary Structure Prediction and Application to Protein Structure Prediction. In: Kister, A. (eds) Protein Supersecondary Structures. Methods in Molecular Biology, vol 1958. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-9161-7_2
DOI: https://doi.org/10.1007/978-1-4939-9161-7_2
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-9160-0
Online ISBN: 978-1-4939-9161-7
eBook Packages: Springer Protocols