Unsolved Problems of Ambient Computationally Intelligent TBM Algorithms

Chapter
Part of the Studies in Computational Intelligence book series (SCI, volume 611)

Abstract

Structural and functional characterization of protein sequences is an important area of biological research. Currently, the number of experimentally solved protein structures in the Protein Data Bank (PDB) is small compared with the considerably larger number of sequences available in UniProtKB/Swiss-Prot. Ambient template-based modelling (TBM) algorithms computationally predict the conformational details of a protein sequence on the basis of its evolutionary similarity to experimentally solved protein structures. Despite several improvements, shortcomings still limit the accuracy of every step of a TBM algorithm. In this study, we discuss these shortcomings, along with probable corrective measures, for the major steps of a TBM algorithm: search and selection of reliable templates, construction of an accurate target–template alignment, model building and sampling, and model assessment for selecting the best conformation.

Keywords

CASP, TBM, Domain, HMM, MODELLER, TM_Score, GDT

Copyright information

© Springer India 2016

Authors and Affiliations

  1. Department of Biological Sciences, Birla Institute of Technology and Science, Pilani, India
