Abstract
Protein structure prediction (PSP) is a grand challenge in bioinformatics, drug discovery, and related fields. PSP is computationally challenging because the conformational space to be searched is astronomically large and the energy function to be minimised is unknown and highly complex. To obtain a given protein’s structure, template-based PSP approaches adopt the known structure of a similar protein, while template-free PSP approaches work when no similar protein’s structure is known. Currently, proteins with known structures are greatly outnumbered by proteins with unknown structures. Template-free PSP has made significant progress recently through machine learning and search-based optimisation approaches. However, structures accurate enough for effective drug design are yet to be achieved for complex proteins. Moreover, ab initio prediction of a protein’s structure from its amino acid sequence alone remains unsolved. Furthermore, the number of protein sequences with unknown structures is growing rapidly. Hence, to make further progress in PSP, more sophisticated artificial intelligence (AI) approaches are needed. However, getting involved in PSP research is difficult for AI researchers because they lack a comprehensive understanding of the whole problem, along with the background and the literature of all related sub-problems. Unfortunately, existing PSP review papers cover PSP research at a very high level, address only some parts of PSP, and do so from a single viewpoint. Using a systematic approach, this review paper provides a comprehensive survey of state-of-the-art template-free PSP research to fill this knowledge gap. Moreover, covering the required PSP preliminaries and computational formulations, this paper presents PSP research from AI perspectives, discusses the challenges, provides our commentaries, and outlines future research directions.
References
Adhikari B (2020) DEEPCON: protein contact prediction using dilated convolutional neural networks with dropout. Bioinformatics 36(2):470–477
Adhikari B (2020) A fully open-source framework for deep learning protein real-valued distances. Sci Rep 10(1):1–10
Adhikari B, Cheng J (2016) Protein residue contacts and prediction methods. In: Data Mining Techniques for the Life Sciences, pp. 463–476. Springer, Switzerland
Adhikari B, Cheng J (2018) CONFOLD2: improved contact-driven ab initio protein structure modeling. BMC Bioinf 19(1):1–5
Adhikari B, Bhattacharya D, Cao R, Cheng J (2015) CONFOLD: residue-residue contact-guided ab initio protein folding. Proteins 83(8):1436–1449
Adhikari B, Hou J, Cheng J (2018) DNCON2: improved protein contact prediction using two-level deep convolutional neural networks. Bioinformatics 34(9):1466–1472
Adhikari B, Shrestha B, Bernardini M, Hou J, Lea J (2021) DISTEVAL: a web server for evaluating predicted protein distances. BMC Bioinf 22(1):1–9
AlQuraishi M (2019) End-to-end differentiable learning of protein structure. Cell Syst 8(4):292–301
AlQuraishi M (2021) Machine learning in protein structure prediction. Curr Opin Chem Biol 65:1–8
Alford RF, Leaver-Fay A, Jeliazkov JR, O’Meara MJ, DiMaio FP, Park H, Shapovalov MV, Renfrew PD, Mulligan VK, Kappel K et al (2017) The rosetta all-atom energy function for macromolecular modeling and design. J Chem Theory Comput 13(6):3031–3048
Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic acids research 25(17):3389–3402
Anfinsen CB (1973) Principles that govern the folding of protein chains. Science 181(4096):223–230
Anishchenko I, Baek M, Park H, Dauparas J, Hiranuma N, Mansoor S, Humphrey I, Baker D (2020) Protein structure prediction guided by predicted inter-residue geometries. In: Fourteenth Meeting of Critical Assessment of Techniques for Protein Structure Prediction, p. 30
Atari M, Majd N (2022) 2D HP protein folding using quantum genetic algorithm. In: 2022 27th International Computer Conference, Computer Society of Iran (CSICC), pp. 1–8 . IEEE
Baek M, DiMaio F, Anishchenko I, Dauparas J, Ovchinnikov S, Lee GR, Wang J, Cong Q, Kinch LN, Schaeffer RD et al (2021) Accurate prediction of protein structures and interactions using a three-track neural network. Science 1:1
Bagaria A, Jaravine V, Güntert P (2013) Estimating structure quality trends in the Protein Data Bank by equivalent resolution. Comput Biol Chem 46:8–15
Bairoch A, Bougueleret L, Altairac S, Amendolia V, Auchincloss A, Puy GA, Axelsen K, Baratin D, Blatter M-C, Boeckmann B et al (2008) The universal protein resource (uniprot). Nucleic Acids Res 36:190–195
Bastian M, Heymann S, Jacomy M (2009) Gephi: an open source software for exploring and manipulating networks. In: Proceedings of the International AAAI Conference on Web and Social Media, vol. 3(1)
Belda I, Madurga S, Tarragó T, Llorà X, Giralt E (2007) Evolutionary computation and multimodal search: a good combination to tackle molecular diversity in the field of peptide design. Mol Divers 11(1):7–21
Benkert P, Tosatto SC, Schomburg D (2008) QMEAN: a comprehensive scoring function for model quality assessment. Proteins 71(1):261–277
Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Rapp BA, Wheeler DL (2000) Genbank. Nucleic Acids Res 28(1):15–18
Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE (2000) The protein data bank. Nucleic Acids Res 28(1):235–242
Berrera M, Molinari H, Fogolari F (2003) Amino acid empirical contact energy definitions for fold recognition in the space of contact maps. BMC Bioinf 4(1):1–26
Bhattacharya D (2019) refineD: improved protein structure refinement using machine learning based restrained relaxation. Bioinformatics 35(18):3320–3328
Bhattacharya D, Cheng J (2013) 3Drefine: consistent protein structure refinement by optimizing hydrogen bonding network and atomic-level energy minimization. Proteins 81(1):119–131
Bhattacharya D, Cao C (2016) Renzhi, Jianlin: UniCon3D: de novo protein structure prediction using united-residue conformational search via stepwise, probabilistic sampling. Bioinformatics 32(18):2791–2799
Bhattacharya D, Adhikari B, Li J, Cheng J (2016) Fragsion: ultra-fast protein fragment library generation by iohmm sampling. Bioinformatics 32(13):2059–2061
Bhattacharya D, Nowotny J, Cao R, Cheng J (2016) 3Drefine: an interactive web server for efficient protein structure refinement. Nucleic Acids Res 44(W1):406–409
Bianchi L, Dorigo M, Gambardella LM, Gutjahr WJ (2009) A survey on metaheuristics for stochastic combinatorial optimization. Natural Computing 8(2):239–287
Biehn SE, Lindert S (2022) Protein structure prediction with mass spectrometry data. Ann Rev Phys Chem 73:1–19
Billings WM, Morris CJ, Della Corte D (2021) The whole is greater than its parts: ensembling improves protein contact prediction. Sci Rep 11(1):1–7
Blum C, Roli A (2003) Metaheuristics in combinatorial optimization: Overview and conceptual comparison. ACM computing surveys (CSUR) 35(3):268–308
Borguesan B, eSilva MB, Grisci B, Inostroza-Ponta M, Dorn M (2015) APL: an angle probability list to improve knowledge-based metaheuristics for the three-dimensional protein structure prediction. Comput Biol Chem 59:142–157
Bradley P, Misura KM, Baker D (2005) Toward high-resolution de novo structure prediction for small proteins. Science 309(5742):1868–1871
Brooks BR, Brooks CL III, Mackerell AD Jr, Nilsson L, Petrella RJ, Roux B, Won Y, Archontis G, Bartels C, Boresch S et al (2009) CHARMM: the biomolecular simulation program. J Comput Chem 30(10):1545–1614
Brunger AT (2007) Version 1.2 of the crystallography and nmr system. Nat Protocols 2(11):2728–2733
Brünger AT, Adams PD, Clore GM, DeLano WL, Gros P, Grosse-Kunstleve RW, Jiang J-S, Kuszewski J, Nilges M, Pannu NS et al (1998) Crystallography & NMR system: a new software suite for macromolecular structure determination. Acta Crystallogr Sect D 54(5):905–921
Cai Y, Li X, Sun Z, Lu Y, Zhao H, Hanson J, Paliwal K, Litfin T, Zhou Y, Yang Y (2020) SPOT-Fold: fragment-free protein structure prediction guided by predicted backbone structure and contact map. J Comput Chem 41(8):745–750
Canutescu AA, Shelenkov AA, Dunbrack RL Jr (2003) A graph-theory algorithm for rapid protein side-chain prediction. Protein 12(9):2001–2014
Cao Y, Song L, Miao Z, Hu Y, Tian L, Jiang T (2011) Improved side-chain modeling by coupling clash-detection guided iterative search with rotamer relaxation. Bioinformatics 27(6):785–790
Cao R, Adhikari B, Bhattacharya D, Sun M, Hou J, Cheng J (2017) QAcon: single model quality assessment using protein structural and contact information with machine learning techniques. Bioinformatics 33(4):586–588
Case DA, Cheatham TE III, Darden T, Gohlke H, Luo R, Merz KM Jr, Onufriev A, Simmerling C, Wang B, Woods RJ (2005) The amber biomolecular simulation programs. J Comput Chem 26(16):1668–1688
Cavanagh J, Fairbrother WJ, Palmer AG III, Skelton NJ (1996) Protein NMR spectroscopy: principles and practice. Academic Press, New York
Chan T, Jankovic B, Le V, Naverniouk I (2004) Comparative Study of Hydrophobic-Polar and Miyazawa-Jernigan Energy Functions in Protein Folding on a Cubic Lattice Using Pruned-Enriched Rosenbluth Monte Carlo Algorithm
Chaudhury S, Lyskov S, Gray JJ (2010) PyRosetta: a script-based interface for implementing molecular modeling algorithms using rosetta. Bioinformatics 26(5):689–691
Chen P, Li J (2010) Prediction of protein long-range contacts using an ensemble of genetic algorithm classifiers with sequence profile centers. BMC structural biology 10(1):1–13
Chen K, Kurgan L (2012) Computational prediction of secondary and supersecondary structures. In: Protein Supersecondary Structures, pp. 63–86. Springer, Switzerland
Chen X, Song S, Ji J, Tang Z, Todo Y (2020) Incorporating a multiobjective knowledge-based energy function into differential evolution for protein structure prediction. Information Sciences 540:69–88
Chen C, Wu T, Guo Z, Cheng J (2021) Combination of deep neural network with attention mechanism enhances the explainability of protein contact prediction. Proteins 89(6):697–707
Cheng J, Baldi P (2005) Three-stage prediction of protein \(\beta \)-sheets by neural networks, alignments and graph algorithms. Bioinformatics 21(suppl_1), 75–84
Cheng J, Tegge AN, Baldi P (2008) Machine learning methods for protein structure prediction. IEEE Rev Biomed Eng 1:41–49
Chi PB, Kim D, Lai JK, Bykova N, Weber CC, Kubelka J, Liberles DA (2018) A new parameter-rich structure-aware mechanistic model for amino acid substitution during evolution. Proteins 86(2):218–228
Chuang C-C, Chen C-Y, Yang J-M, Lyu P-C, Hwang J-K (2003) Relationship between protein structures and disulfide-bonding patterns. Proteins 53(1):1–5
Colbes J, Corona RI, Lezcano C, Rodríguez D, Brizuela CA (2017) Protein side-chain packing problem: is there still room for improvement? Brief Bioinf 18(6):1033–1043
Comellas G, Rienstra CM (2013) Protein structure determination by magic-angle spinning solid-state NMR, and insights into the formation, structure, and stability of amyloid fibrils. Ann Rev Biophys 42:515–536
Correa L, Borguesan B, Farfán C, Inostroza-Ponta M, Dorn M (2016) A memetic algorithm for 3D protein structure prediction problem. IEEE/ACM Trans Comput Biol Bioinf 15(3):690–704
Dal Palu A, Dovier A, Fogolari F, Pontelli E (2011) Exploring protein fragment assembly using CLP. In: IJCAI, pp. 2590–2595
Damm W, Frontera A, Tirado-Rives J, Jorgensen WL (1997) OPLS all-atom force field for carbohydrates. J Comput Chem 18(16):1955–1970
DasGupta D, Kaushik R, Jayaram B (2015) From ramachandran maps to tertiary structures of proteins. J Phys Chem B 119(34):11136–11145
de Lima Corrêa L, Dorn M (2020) A multi-population memetic algorithm for the 3D protein structure prediction problem. Swarm Evol Comput 55:100677
de Lima Corrêa L, Borguesan B, Krause MJ, Dorn M (2018) Three-dimensional protein structure prediction based on memetic algorithms. Comput Oper Res 91:160–177
de Oliveira SH, Shi J, Deane CM (2015) Building a better fragment library for de novo protein structure prediction. PLoS ONE 10(4):0123998
Dehghani T, Naghibzadeh M, Eghdami M (2019) BetaDL: a protein beta-sheet predictor utilizing a deep learning model and independent set solution. Computers in Biology and Medicine 104:241–249
Dhingra S, Sowdhamini R, Cadet F, Offmann B (2020) A glance into the evolution of template-free protein structure prediction methodologies. Biochimie 1:1
Di Lena P, Nagata K, Baldi P (2012) Deep architectures for protein contact map prediction. Bioinformatics 28(19):2449–2457
Dill KA (1985) Theory for the folding and stability of globular proteins. Biochemistry 24(6):1501–1509
Ding W, Gong H (2020) Predicting the real-valued inter-residue distances for proteins. Adv Sci 7(19):2001314
Ding W, Mao W, Shao D, Zhang W, Gong H (2018) DeepConPred2: an improved method for the prediction of protein residue contacts. Comput Struct Biotechnol J 16:503–510
Dotu I, Cebrian M, Van Hentenryck P, Clote P (2011) On lattice protein structure prediction revisited. IEEE/ACM Trans Comput Biol Bioinf 8(6):1620–1632
Dou J, Vorobieva AA, Sheffler W, Doyle LA, Park H, Bick MJ, Mao B, Foight GW, Lee MY, Gagnon LA et al (2018) De novo design of a fluorescence-activating \(\beta \)-barrel. Nature 561(7724):485–491
Do Duc D, Dinh P.T., Anh VTN, Linh-Trung N (2018) An efficient ant colony optimization algorithm for protein structure prediction. In: 2018 12th International Symposium on Medical Information and Communication Technology (ISMICT), pp. 1–6. IEEE
Du Z, Su H, Wang W, Ye L, Wei H, Peng Z, Anishchenko I, Baker D, Yang J (2021) The trrosetta server for fast and accurate protein structure prediction. Nat Protocols 16(12):5634–5651
Eickholt J, Cheng J (2012) Predicting protein residue-residue contacts using deep networks and boosting. Bioinformatics 28(23):3066–3072
Eickholt J, Cheng J (2013) A study and benchmark of dncon: a method for protein residue-residue contact prediction using deep networks. In: BMC Bioinf, vol. 14, pp. 1–10 . BioMed Central
Ekeberg M, Lövkvist C, Lan Y, Weigt M, Aurell E (2013) Improved contact prediction in proteins: using pseudolikelihoods to infer potts models. Phys Rev E 87(1):012707
Fang C (2018) Applications of deep neural networks to protein structure prediction. PhD thesis, University of Missouri-Columbia
Fang C, Shang Y, Xu D (2018) Prediction of protein backbone torsion angles using deep residual inception neural networks. IEEE/ACM Trans Comput Biol Bioinf 16(3):1020–1028
Fang C, Shang Y, Xu D (2018) MUFOLD-SS: new deep inception-inside-inception networks for protein secondary structure prediction. Proteins 86(5):592–598
Fielding AH (1999) An introduction to machine learning methods. In: Machine Learning Methods for Ecological Applications, pp. 1–35. Springer, Switzerland
Flot M, Mishra A, Kuchi AS, Hoque MT (2019) StackSSSPred: a stacking-based prediction of supersecondary structure from sequence. Methods Mol Biol (Clifton, NJ) 1958:101–122
Fukuda H, Tomii K (2020) DeepECA: an end-to-end learning framework for protein contact prediction from a multiple sequence alignment. BMC Bioinf 21(1):1–15
Gao S, Song S, Cheng J, Todo Y, Zhou M (2017) Incorporation of solvent effect into multi-objective evolutionary algorithm for improved protein structure prediction. IEEE/ACM Trans Comput Biol Bioinf 15(4):1365–1378
Gao Y, Wang S, Deng M, Xu J (2018) RaptorX-Angle: real-value prediction of protein backbone dihedral angles through a hybrid method of clustering and deep learning. BMC Bioinf 19(4):73–84
Gao J, Yang Y, Zhou Y (2018) Grid-based prediction of torsion angle probabilities of protein backbone and its application to discrimination of protein intrinsic disorder regions and selection of model structures. BMC Bioinf 19(1):1–8
Garza-Fabre M, Kandathil SM, Handl J, Knowles J, Lovell SC (2016) Generating, maintaining, and exploiting diversity in a memetic algorithm for protein structure prediction. Evol Comput 24(4):577–607
Glover FW, Kochenberger GA (2006) Handbook of Metaheuristics, vol 57. Springer, Switzerland
Glusker J (2009) X-ray crystallography of proteins. Methods Biochem Anal 1:1–72
Goldberg DE (1989) Genetic algorithms in search. Optimization, and MachineLearning
Gordon DB, Mayo SL (1999) Branch-and-terminate: a combinatorial optimization algorithm for protein design. Structure 7(9):1089–1098
Greener JG, Kandathil SM, Jones DT (2019) Deep learning extends de novo protein modelling coverage of genomes using iteratively predicted structural constraints. Nat Commun 10(1):1–13
Gront D, Kulp DW, Vernon RM, Strauss CE, Baker D (2011) Generalized fragment picking in rosetta: design, protocols and applications. PLoS ONE 6(8):23294
Guo Z, Wu T, Liu J, Hou J, Cheng J (2021) Improving deep learning-based protein distance prediction in casp14. Bioinformatics 37(19):3190–3196
Görmez Y, Aydin Z (2022) IGPRED-MultiTask: a deep learning model to predict protein secondary structure, torsion angles and solvent accessibility. IEEE/ACM Trans Comput Biol Bioinf
Görmez Y, Sabzekar M, Aydin Z (2021) IGPRED: combination of convolutional neural and graph convolutional networks for protein secondary structure prediction. Proteins
Haas J, Roth S, Arnold K, Kiefer F, Schmidt T, Bordoli L, Schwede T (2013) The protein model portal-a comprehensive resource for protein structure and model information. Database 2013
Hanson J, Paliwal K, Litfin T, Yang Y, Zhou Y (2018) Improving prediction of protein secondary structure, backbone angles, solvent accessibility and contact numbers by using predicted contact maps and an ensemble of recurrent and residual convolutional neural networks. Bioinformatics 35(14):2403–2410
Hanson J, Paliwal K, Litfin T, Yang Y, Zhou Y (2018) Accurate prediction of protein contact maps by coupling residual two-dimensional bidirectional long short-term memory with convolutional neural networks. Bioinformatics 34(23):4039–4045
He B, Mortuza S, Wang Y, Shen H-B, Zhang Y (2017) NeBcon: protein contact map prediction using neural network training coupled with naïve bayes classifiers. Bioinformatics 33(15):2296–2306
Heffernan R, Paliwal K, Lyons J, Dehzangi A, Sharma A, Wang J, Sattar A, Yang Y, Zhou Y (2015) Improving prediction of secondary structure, local backbone angles and solvent accessible surface area of proteins by iterative deep learning. Sci Rep 5(1):1–11
Heffernan R, Yang Y, Paliwal K, Zhou Y (2017) Capturing non-local interactions by long short-term memory bidirectional recurrent neural networks for improving prediction of protein secondary structure, backbone angles, contact numbers and solvent accessibility. Bioinformatics 33(18):2842–2849
Heffernan R, Paliwal K, Lyons J, Singh J, Yang Y, Zhou Y (2018) Single-sequence-based prediction of protein secondary structures and solvent accessibility by deep whole-sequence learning. J Comput Chem 39(26):2210–2216
Heo L, Feig M (2018) Experimental accuracy in protein structure refinement via molecular dynamics simulations. Proc Natl Acad Sci USA 115(52):13276–13281
Heo L, Feig M (2020) High-accuracy protein structures by combining machine-learning with physics-based refinement. Proteins 88(5):637–642
Hiranuma N, Park H, Baek M, Anishchenko I, Dauparas J, Baker D (2021) Improved protein structure refinement guided by deep learning based accuracy estimation. Nat Commun 12(1):1–11
Hou J, Wu T, Cao R, Cheng J (2019) Protein tertiary structure modeling driven by deep learning and contact distance prediction in CASP13. Proteins 87(12):1165–1178
Hu X-Z, Long H-X, Ding C-J, Gao S-J, Hou R (2020) Using random forest algorithm to predict super-secondary structure in proteins. J Supercomput 76(5):3199–3210
Huang H, Gong X (2020) A review of protein inter-residue distance prediction. Curr Bioinf 15(8):821–830
Huang X, Han K, Zhu Y (2013) Systematic optimization model and algorithm for binding sequence selection in computational enzyme design. Protein 22(7):929–941
Huang J, Rauscher S, Nawrocki G, Ran T, Feig M, De Groot BL, Grubmüller H, MacKerell AD (2017) Charmm36m: an improved force field for folded and intrinsically disordered proteins. Nat Methods 14(1):71–73
Huang X, Pearce R, Zhang Y (2020) FASPR: an open-source tool for fast and accurate protein side-chain packing. Bioinformatics 36(12):3758–3765
Hussain K, Salleh MNM, Cheng S, Shi Y (2019) Metaheuristic research: a comprehensive survey. Artif Intell Rev 52(4):2191–2233
Høie MH, Kiehl EN, Petersen B, Nielsen M, Winther O, Nielsen H, Hallgren J, Marcatili P (2022) NetSurfP-3.0: accurate and fast prediction of protein structural features by protein language models and deep learning. Nucleic Acids Res 1:1
Ingraham J, Riesselman A, Sander C, Marks D (2018) Learning protein structure with a differentiable simulator. In: International Conference on Learning Representations
Irbäck A, Peterson C, Potthast F, Sommelius O (1997) Local interactions and protein folding: a three-dimensional off-lattice approach. J Chem Phys 107(1):273–282
Jain A, Terashi G, Kagaya Y, Subramaniya SRMV, Christoffer C, Kihara D (2021) Analyzing effect of quadruple multiple sequence alignments on deep learning based protein inter-residue distance prediction. Sci Rep 11(1):1–13
Jana ND, Das S, Sil J (2018) A metaheuristic approach to protein structure prediction. Springer, Cham
Jauch R, Yeo HC, Kolatkar PR, Clarke ND (2007) Assessment of casp7 structure predictions for template free targets. Proteins 69(S8):57–67
Jayaram B, Bhushan K, Shenoy SR, Narang P, Bose S, Agrawal P, Sahu D, Pandey V (2006) Bhageerath: an energy based web enabled computer software suite for limiting the search space of tertiary structures of small globular proteins. Nucleic Acids Res 34(21):6195–6204
Ji S, Oruç T, Mead L, Rehman MF, Thomas CM, Butterworth S, Winn PJ (2019) DeepCDpred: inter-residue distance and contact prediction for improved prediction of protein structure. PLoS ONE 14(1):0205214
Jiang Q, Jin X, Lee S-J, Yao S (2017) Protein secondary structure prediction: a survey of the state of the art. Journal of Molecular Graphics and Modelling 76:379–402
Jing X, Xu J (2020) Improved protein model quality assessment by integrating sequential and pairwise features using deep learning. Bioinformatics 36(22–23):5361–5367
Jing X, Dong Q, Lu R, Dong Q (2019) Protein inter-residue contacts prediction: methods, performances and applications. Curr Bioinf 14(3):178–189
Jisna V, Jayaraj P (2021) Protein structure prediction: conventional and deep learning perspectives. Protein J 1:1–23
Jones DT, McGuffin LJ (2003) Assembling novel protein folds from super-secondary structural fragments. Proteins 53(S6):480–485
Jones DT, Kandathil SM (2018) High precision in protein contact prediction using fully convolutional neural networks and minimal sequence features. Bioinformatics 34(19):3308–3315
Jones DT, Buchan DW, Cozzetto D, Pontil M (2012) PSICOV: precise structural contact prediction using sparse inverse covariance estimation on large multiple sequence alignments. Bioinformatics 28(2):184–190
Jones DT, Singh T, Kosciolek T, Tetchner S (2015) Metapsicov: combining coevolution methods for accurate prediction of contacts and long range hydrogen bonding in proteins. Bioinformatics 31(7):999–1006
Jumper JM, Faruk NF, Freed KF, Sosnick TR (2018) Accurate calculation of side chain packing and free energy with applications to protein molecular dynamics. PLoS Comput Biol 14(12):1006342
Jumper J, Evans R, Pritzel A, Green T, Figurnov M, Tunyasuvunakool K, Ronneberger O, Bates R, Žídek A, Bridgland A, et al. (2020) High accuracy protein structure prediction using deep learning. Fourteenth Critical Assessment of Techniques for Protein Structure Prediction (Abstract Book) 22, 24
Jumper J, Evans R, Pritzel A, Green T, Figurnov M, Ronneberger O, Tunyasuvunakool K, Bates R, Žídek A, Potapenko A et al (2021) Highly accurate protein structure prediction with alphafold. Nature 1:1–11
Kabsch W (1976) A solution for the best rotation to relate two sets of vectors. Acta Crystallogr Sect A 32(5):922–923
Kabsch W, Sander C (1983) Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 22(12):2577–2637
Kaján L, Hopf TA, Kalaš M, Marks DS, Rost B (2014) FreeContact: fast and free software for protein contact prediction from residue co-evolution. BMC Bioinf 15(1):1–6
Kalisman N, Levi A, Maximova T, Reshef D, Zafriri-Lynn S, Gleyzer Y, Keasar C (2005) MESHI: a new library of Java classes for molecular modeling. Bioinformatics 21(20):3931–3932
Kamisetty H, Ovchinnikov S, Baker D (2013) Assessing the utility of coevolution-based residue-residue contact predictions in a sequence-and structure-rich era. Proc Natl Acd Sci USA 110(39):15674–15679
Kandathil SM, Greener JG, Jones DT (2019) Prediction of interresidue contacts with DeepMetaPSICOV in CASP13. Proteins 87(12):1092–1099
Karasikov M, Pagès G, Grudinin S (2019) Smooth orientation-dependent scoring function for coarse-grained protein quality assessment. Bioinformatics 35(16):2801–2808
Kelley LA, Sternberg MJ (2009) Protein structure prediction on the web: a case study using the phyre server. Nat Protocols 4(3):363–371
Khanduja N, Bhushan B (2020) Recent advances and application of metaheuristic algorithms: A survey. Metaheuristic and Evolutionary Computation, 207
Khor BY, Tye GJ, Lim TS, Choong YS (2015) General overview on structure prediction of twilight-zone proteins. Theoret Biol Med Model 12(1):1–11
Kingsford CL, Chazelle B, Singh M (2005) Solving and analyzing side-chain positioning problems using linear and integer programming. Bioinformatics 21(7):1028–1039
Kinjo AR, Nakamura H (2008) Nature of protein family signatures: insights from singular value analysis of position-specific scoring matrices. PLoS ONE 3(4):1963
Klausen MS, Jespersen MC, Nielsen H, Jensen KK, Jurtz VI, Soenderby CK, Sommer MOA, Winther O, Nielsen M, Petersen B et al (2019) NetSurfP-2.0: improved prediction of protein structural features by integrated deep learning. Proteins 87(6):520–527
Klepeis JL, Wei Y, Hecht MH, Floudas CA (2005) Ab initio prediction of the three-dimensional structure of a de novo designed protein A double-blind case study. Proteins 58(3):560–570
Kotowski K, Smolarczyk T, Roterman-Konieczna I, Stapor K (2021) ProteinUnet-an efficient alternative to SPIDER3-single for sequence-based prediction of protein secondary structures. J Comput Chem 42(1):50–59
Kou G, Feng Y (2015) Identify five kinds of simple super-secondary structures with quadratic discriminant algorithm based on the chemical shifts. J Theoret Biol 380:392–398
Kryshtafovych A, Monastyrskyy B, Fidelis K, Schwede T, Tramontano A (2018) Assessment of model accuracy estimations in CASP12. Proteins 86:345–360
Kugunavar S, Prabhakar C (2021) Convolutional neural networks for the diagnosis and prognosis of the coronavirus disease pandemic. Visual Computing for Industry, Biomedicine, and Art 4(1):1–14
Kuhlman B, Bradley P (2019) Advances in protein structure prediction and design. Nat Rev Mol Cell Biol 20(11):681–697
Kukic P, Mirabello C, Tradigo G, Walsh I, Veltri P, Pollastri G (2014) Toward an accurate prediction of inter-residue distances in proteins using 2D recursive neural networks. BMC Bioinf 15(1):1–15
Källberg M, Wang H, Wang S, Peng J, Wang Z, Lu H, Xu J (2012) Template-based protein structure modeling using the raptorx web server. Nat Protocols 7(8):1511–1522
Lavor C, Alves R, Figueiredo W, Petraglia A, Maculan N (2015) Clifford algebra and the discretizable molecular distance geometry problem. Advances in Applied Clifford Algebras 25(4):925–942
Leaver-Fay A, Tyka M, Lewis SM, Lange OF, Thompson J, Jacak R, Kaufman KW, Renfrew PD, Smith CA, Sheffler W, et al. (2011) ROSETTA3: an object-oriented software suite for the simulation and design of macromolecules. In: Methods in Enzymology vol. 487, pp. 545–574. Elsevier, Amsterdam
Lee GR, Won J, Heo L, Seok C (2019) GalaxyRefine2: simultaneous refinement of inaccurate local regions and overall protein structure. Nucleic Acids Res 47(W1):451–455
Lee D, Xiong D, Wierbowski S, Li L, Liang S, Yu H (2022) Deep learning methods for 3D structural proteome and interactome modeling. Curr Opin Struct Biol 73
Levinthal C (1968) Are there pathways for protein folding? Journal de chimie physique 65:44–45
Li Y, Roy A, Zhang Y (2009) HAAD: a quick algorithm for accurate prediction of hydrogen atoms in protein structures. PLoS ONE 4(8):6701
Li Y, Fang Y, Fang J (2011) Predicting residue-residue contacts using random forest models. Bioinformatics 27(24):3379–3384
Li C, Wang X-F, Chen Z, Zhang Z, Song J (2015) Computational characterization of parallel dimeric and trimeric coiled-coils using effective amino acid indices. Mol BioSyst 11(2):354–360
Li Y, Hu J, Zhang C, Yu D-J, Zhang Y (2019) ResPRE: high-accuracy protein contact prediction by coupling precision matrix with deep residual neural networks. Bioinformatics 35(22):4647–4655
Li Y, Zhang C, Zheng W, Zhou X, Bell E, Yu D, Zhang Y (2020) Learning deep statistical potentials for protein folding. CASP 14:72–73
Li Y, Zheng W, Zhang C, Bell E, Huang X, Pearce R, Zhou X, Zhang Y (2020) Protein 3D structure prediction by DI-TASSER in CASP14. CASP 14:339–341
Li Y, Zhang C, Bell EW, Zheng W, Zhou X, Yu D-J, Zhang Y (2021) Deducing high-accuracy protein contact-maps from a triplet of coevolutionary matrices through deep residual convolutional networks. PLoS Comput Biol 17(3):1008865
Liljas A, Liljas L, Lindblom G, Nissen P, Kjeldgaard M, Ash Mr (2016) Textbook of Structural Biology vol. 8. World Scientific, Singapore
Lin HH, Tseng LY (2010) DBCP: a web server for disulfide bonding connectivity pattern prediction without the prior knowledge of the bonding state of cysteines. Nucleic Acids Res 38(suppl_2), 503–507
Liu Z, Jiang L, Gao Y, Liang S, Chen H, Han Y, Lai L (2003) Beyond the rotamer library: genetic algorithm combined with the disturbing mutation process for upbuilding protein side-chains. Proteins 50(1):49–62
Liu Y, Palmedo P, Ye Q, Berger B, Peng J (2018) Enhancing evolutionary couplings with deep convolutional neural networks. Cell Syst 6(1):65–74
Liu Z-L, Hu J-H, Jiang F, Wu Y-D (2020) CRiSP: accurate structure prediction of disulfide-rich peptides with cystine-specific sequence alignment and machine learning. Bioinformatics 36(11):3385–3392
Liu J, Zhou X-G, Zhang Y, Zhang G-J (2020) CGLFold: a contact-assisted de novo protein structure prediction using global exploration and loop perturbation sampling algorithm. Bioinformatics 36(8):2443–2450
Liu S, Wang T, Xu Q, Shao B, Yin J, Liu T-Y (2021) Complementing sequence-derived features with structural information extracted from fragment libraries for protein structure prediction. BMC Bioinf 22(1):1–18
Liwo A, Khalili M, Scheraga HA (2005) Ab initio simulations of protein-folding pathways by molecular dynamics with the united-residue model of polypeptide chains. Proc Natl Acad Sci USA 102(7):2362–2367
Lodish H, Berk A, Zipursky SL, Matsudaira P, Baltimore D, Darnell J (2000) Hierarchical structure of proteins. In: Molecular Cell Biology. 4th Edition. WH Freeman, Macmillan Higher Education, US
Lu M, Dousis AD, Ma J (2008) OPUS-PSP: an orientation-dependent statistical all-atom potential derived from side-chain packing. J Mol Biol 376(1):288–301
Lyons J, Dehzangi A, Heffernan R, Sharma A, Paliwal K, Sattar A, Zhou Y, Yang Y (2014) Predicting backbone c\(\alpha \) angles and dihedrals from protein sequences by stacked sparse auto-encoder deep neural network. J Comput Chem 35(28):2040–2046
Ma J, Wang S, Wang Z, Xu J (2015) Protein contact prediction by integrating joint evolutionary coupling analysis and supervised learning. Bioinformatics 31(21):3506–3513
Ma Y, Liu Y, Cheng J (2018) Protein secondary structure prediction based on data partition and semi-random subspace method. Sci Rep 8(1):1–10
Mabrouk M, Werner T, Schneider T, Putz I, Brock O (2015) Analysis of free modelling predictions by RBO aleph in CASP11. Proteins 84:87–104
MacCarthy E, Perry D, Kc DB (2019) Advances in protein super-secondary structure prediction and application to protein structure prediction. Methods Mol Biol (Clifton, NJ) 1958:15–45
Maghrabi AH, McGuffin LJ (2017) ModFOLD6: an accurate web server for the global and local quality estimation of 3D protein models. Nucleic Acids Res 45(W1):416–421
Magnan CN, Baldi P (2014) SSpro/ACCpro 5: almost perfect prediction of protein secondary structure and relative solvent accessibility using profiles, machine learning and structural similarity. Bioinformatics 30(18):2592–2597
Maguire JB, Haddox HK, Strickland D, Halabiya SF, Coventry B, Griffin JR, Pulavarti SVK, Cummins M, Thieker DF, Klavins E et al (2021) Perturbing the energy landscape for improved packing during computational protein design. Proteins 89(4):436–449
Mariani V, Biasini M, Barbato A, Schwede T (2013) lDDT: a local superposition-free score for comparing protein structures and models using distance difference tests. Bioinformatics 29(21):2722–2728
Marks DS, Colwell LJ, Sheridan R, Hopf TA, Pagnani A, Zecchina R, Sander C (2011) Protein 3D structure computed from evolutionary sequence variation. PLoS ONE 6(12):28766
Marsland S (2015) Machine Learning: an Algorithmic Perspective. CRC Press, Boca Raton, Florida
Mataeimoghadam F, Newton MH, Dehzangi A, Karim A, Jayaram B, Ranganathan S, Sattar A (2020) Enhancing protein backbone angle prediction by using simpler models of deep neural networks. Sci Rep 10(1):1–12
Meiler J, Müller M, Zeidler A, Schmäschke F (2001) Generation and evaluation of dimension-reduced amino acid parameter representations by artificial neural networks. Molecular modeling annual 7(9):360–369
Miao Z, Cao Y, Jiang T (2011) RASP: rapid modeling of protein side chain conformations. Bioinformatics 27(22):3117–3122
Michel M, Menéndez Hurtado D, Elofsson A (2019) PconsC4: fast, accurate and hassle-free contact predictions. Bioinformatics 35(15):2677–2679
Mignan A, Broccardo M (2019) One neuron versus deep learning in aftershock prediction. Nature 574(7776):1–3
Mirabello C, Pollastri G (2013) Porter, PaleAle 4.0: high-accuracy prediction of protein secondary structure and relative solvent accessibility. Bioinformatics 29(16):2056–2058
Mirabello C, Wallner B (2019) RAWMSA: end-to-end deep learning using raw multiple sequence alignments. PLoS ONE 14(8):0220182
Mirdita M, Schütze K, Moriwaki Y, Heo L, Ovchinnikov S, Steinegger M (2022) ColabFold: making protein folding accessible to all. Nat Methods 1:1–4
Mishra A, Iqbal S, Hoque MT (2016) Discriminate protein decoys from native by using a scoring function based on ubiquitous phi and psi angles computed for all atom. J Theoret Biol 398:112–121
Mishra A, Kabir MWU, Hoque MT (2021) diSBPred: a machine learning based approach for disulfide bond prediction. Comput Biol Chem 91:107436
Mittal A, Jayaram B, Shenoy S, Bawa TS (2010) A stoichiometry driven universal spatial organization of backbones of folded proteins: are there chargaff’s rules for protein folding? Journal of Biomolecular Structure and Dynamics 28(2):133–142
Miyazawa S, Jernigan RL (1985) Estimation of effective interresidue contact energies from protein crystal structures: quasi-chemical approximation. Macromolecules 18(3):534–552
Miyazawa S, Jernigan RL (1996) Residue-residue potentials with a favorable contact pair term and an unfavorable high packing density term, for simulation and threading. J Mol Biol 256(3):623–644
Morcos F, Pagnani A, Lunt B, Bertolino A, Marks DS, Sander C, Zecchina R, Onuchic JN, Hwa T, Weigt M (2011) Direct-coupling analysis of residue coevolution captures native contacts across many protein families. Proc Natl Acad Sci USA 108(49):1293–1301
Mortuza S, Zheng W, Zhang C, Li Y, Pearce R, Zhang Y (2021) Improving fragment-based ab initio protein structure assembly using low-accuracy contact-map predictions. Nat Commun 12(1):1–12
Moscato P, Cotta C (2010) A modern introduction to memetic algorithms. In: Handbook of Metaheuristics, pp. 141–183. Springer, Cham
Moscato P et al (1989) On evolution, search, optimization, genetic algorithms and martial arts: towards memetic algorithms. Caltech concurrent computation program, C3P Rep 826, 1989
Moult J, Pedersen JT, Judson R, Fidelis K (1995) A large-scale experiment to assess protein structure prediction methods. Wiley Online Library
Mufassirin MM, Ragel RG (2018) A novel filter-wrapper based feature selection approach for cancer data classification. In: 2018 IEEE International Conference on Information and Automation for Sustainability (ICIAfS), pp. 1–6. IEEE
Mullard A (2021) What does alphafold mean for drug discovery? Nature reviews, Drug discovery
Nagata K, Randall A, Baldi P (2012) SIDEpro: a novel machine learning approach for the fast and accurate prediction of side-chain conformations. Proteins 80(1):142–153
Narloch PH, Parpinelli RS (2017) The protein structure prediction problem approached by a cascade differential evolution algorithm using ROSETTA. In: 2017 Brazilian Conference on Intelligent Systems (BRACIS), pp. 294–299. IEEE
Nazmul R, Chetty M, Chowdhury AR (2020) Multimodal memetic framework for low-resolution protein structure prediction. Swarm Evol Comput 52:100608
Newton M, Mataeimoghadam F, Zaman R, Sattar A (2022) Secondary structure specific simpler prediction models for protein backbone angles. BMC Bioinf 23(1):1–14
Newton MH, Zaman R, Mataeimoghadam F, Rahman J, Sattar A (2022) Constraint guided beta-sheet refinement for protein structure prediction. Comput Biol Chem 1:107773
Newton MH, Rahman J, Zaman R, Sattar A (2022) Enhancing protein contact map prediction accuracy via ensembles of inter-residue distance predictors. Comput Biol Chem 1:107700
Niu S, Huang T, Feng K-Y, He Z, Cui W, Gu L, Li H, Cai Y-D, Li Y (2013) Inter-and intra-chain disulfide bond prediction based on optimal feature selection. Protein Peptide Lett 20(3):324–335
Ovchinnikov S, Park H, Kim DE, DiMaio F, Baker D (2018) Protein structure prediction using rosetta in casp12. Proteins 86:113–121
O’Meara MJ, Leaver-Fay A, Tyka MD, Stein A, Houlihan K, DiMaio F, Bradley P, Kortemme T, Baker D, Snoeyink J et al (2015) Combined covalent-electrostatic model of hydrogen bonding improves structure prediction with rosetta. J Chem Theory Comput 11(2):609–622
Park H, Bradley P, Greisen P Jr, Liu Y, Mulligan VK, Kim DE, Baker D, DiMaio F (2016) Simultaneous optimization of biomolecular energy functions on features from small molecules and macromolecules. J Chem Theory Comput 12(12):6201–6212
Park H, Ovchinnikov S, Kim DE, DiMaio F, Baker D (2018) Protein homology model refinement by large-scale energy optimization. Proc Natl Acad Sci USA 115(12):3054–3059
Pearce R, Zhang Y (2021) Deep learning techniques have significantly impacted protein structure prediction and protein design. Curr Opin Struct Biol 68:194–207
Pearce R, Zhang Y (2021) Toward the solution of the protein structure prediction problem. J Biol Chem 1:1
Pereira J, Simpkin AJ, Hartmann MD, Rigden DJ, Keegan RM, Lupas AN (2021) High-accuracy protein structure prediction in CASP14. Proteins 89(12):1687–1699
Persson O, Danell R, Schneider JW (2009) How to use bibexcel for various types of bibliometric analysis. Celebrating scholarly communication studies: a Festschrift for Olle Persson at his 60th Birthday 5:9–24
Peterson RW, Dutton PL, Wand AJ (2004) Improved side-chain prediction accuracy using an ab initio potential energy function and a very large rotamer library. Protein Sci 13(3):735–751
Rahman J, Newton MH, Hasan MAM, Sattar A (2022) A stacked meta-ensemble for protein inter-residue distance prediction. Comput Biol Med 148:105824
Rahman J, Newton M, Islam MKB, Sattar A (2022) Enhancing protein inter-residue real distance prediction by scrutinising deep learning models. Sci Rep 12(1):1–13
Rakhshani H, Idoumghar L, Ghambari S, Lepagnot J, Brévilliers M (2021) On the performance of deep learning for numerical optimization: an application to protein structure prediction. Appl Soft Comput 110:107596
Ramyachitra D, Ajeeth A (2017) MODCSA-CA: a multi objective diversity controlled self adaptive cuckoo algorithm for protein structure prediction. Gene Rep 8:100–106
Rashid MA, Khatib F, Hoque MT, Sattar A (2015) An enhanced genetic algorithm for ab initio protein structure prediction. IEEE Trans Evol Comput 20(4):627–644
Rashid MA, Shatabda S, Newton MH, Hoque MT, Pham DN, Sattar A (2012) Random-walk: a stagnation recovery technique for simplified protein structure prediction. In: Proceedings of the ACM Conference on Bioinformatics, Computational Biology and Biomedicine, pp. 620–622
Rashid MA, Newton MH, Hoque MT, Sattar A (2013) A local search embedded genetic algorithm for simplified protein structure prediction. In: 2013 IEEE Congress on Evolutionary Computation, pp. 1091–1098. IEEE
Rashid MA, Newton M, Hoque M, Sattar A et al (2013) Mixing energy models in genetic algorithms for on-lattice protein structure prediction. BioMed Res Int 2013
Remmert M, Biegert A, Hauser A, Söding J (2012) HHblits: lightning-fast iterative protein sequence searching by hmm-hmm alignment. Nature methods 9(2):173
Richmond TJ (1984) Solvent accessible surface area and excluded volume in proteins: Analytical equations for overlapping spheres and implications for the hydrophobic effect. J Mol Biol 178(1):63–89
Rives A, Meier J, Sercu T, Goyal S, Lin Z, Liu J, Guo D, Ott M, Zitnick CL, Ma J et al (2021) Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences. Proc Natl Acad Sci USA 118:15
Rodriguez C, Chowriappa P, Dua S et al (2019) Local similarity matrix for cysteine disulfide connectivity prediction from protein sequences. IEEE/ACM Trans Comput Biol Bioinf 17(4):1276–1289
Rose GD, Geselowitz AR, Lesser GJ, Lee RH, Zehfus MH (1985) Hydrophobicity of amino acid residues in globular proteins. Science 229(4716):834–838
Rost B (2001) Protein secondary structure prediction continues to rise. Journal of structural biology 134(2–3):204–218
Roy A, Kucukural A, Zhang Y (2010) I-tasser: a unified platform for automated protein structure and function prediction. Nat Protocols 5(4):725–738
Santos KB, Trevizani R, Custódio FL, Dardenne LE (2015) Profrager web server: Fragment libraries generation for protein structure prediction. In: Proceedings of the International Conference on Bioinformatics & Computational Biology (BIOCOMP), p. 38. The Steering Committee of The World Congress in Computer Science, Computer
Sathyapriya R, Duarte JM, Stehr H, Filippis I, Lappe M (2009) Defining an essence of structure determining residue contacts in proteins. PLOS Comput Biol 5(12):1000584
Scott WR, Hünenberger PH, Tironi IG, Mark AE, Billeter SR, Fennen J, Torda AE, Huber T, Krüger P, van Gunsteren WF (1999) The GROMOS biomolecular simulation program package. J Phys Chem A 103(19):3596–3607
Seemayer S, Gruber M, Söding J (2014) CCMpred-fast and precise prediction of protein residue-residue contacts from correlated mutations. Bioinformatics 30(21):3128–3130
Senior AW, Evans R, Jumper J, Kirkpatrick J, Sifre L, Green T, Qin C, Žídek A, Nelson AW, Bridgland A et al (2020) Improved protein structure prediction using potentials from deep learning. Nature 577(7792):706–710
Shatabda S, Newton MH, Sattar A (2013) Simplified lattice models for protein structure prediction: how good are they? In: Twenty-Seventh AAAI Conference on Artificial Intelligence
Shatabda S, Newton M, Rashid MA, Pham DN, Sattar A (2014) How good are simplified models for protein structure prediction? Adv Bioinf 2014
Shen T, Wu J, Lan H, Zheng L, Pei J, Wang S, Liu W, Huang J (2021) When homologous sequences meet structural decoys: accurate contact prediction by tfold in casp14-(tfold for casp14 contact prediction). Proteins 89(12):1901–1910
Shonkwiler RW, Mendivil F (2009) Explorations in Monte Carlo Methods. Springer, Switzerland
Shrestha A, Mahmood A (2019) Review of deep learning algorithms and architectures. IEEE Access 7:53040–53065
Shuchun Y, Xianxiang L, Xue T, Ming P (2022) Protein structure prediction based on particle swarm optimization and tabu search strategy. BMC Bioinf 23(10):1–10
Shuid AN, Kempster R, McGuffin LJ (2017) ReFOLD: a server for the refinement of 3D protein models guided by accurate quality estimates. Nucleic Acids Res 45(W1):422–428
Shuvo MH, Bhattacharya S, Bhattacharya D (2020) QDeep: distance-based protein model quality estimation by residue-level ensemble error classifications using stacked deep residual neural networks. Bioinformatics 36(Supplement-1):285–291
Simons KT, Kooperberg C, Huang E, Baker D (1997) Assembly of protein tertiary structures from fragments with similar local sequences using simulated annealing and bayesian scoring functions. J Mol Biol 268(1):209–225
Singh J, Litfin T, Paliwal K, Singh J, Hanumanthappa AK, Zhou Y (2021) SPOT-1D-Single: improving the single-sequence-based prediction of protein secondary structure, backbone angles, solvent accessibility and half-sphere exposures using a large training set and ensembled deep learning. Bioinformatics
Singh J, Paliwal K, Litfin T, Singh J, Zhou Y (2022) Reaching alignment-profile-based accuracy in predicting protein secondary and tertiary structural properties without alignment. Sci Rep 12(1):1–9
Smialowski P, Martin-Galiano AJ, Cox J, Frishman D (2007) Predicting experimental properties of proteins from sequence by machine learning techniques. Current Protein and Peptide Science 8(2):121–133
Somvanshi M, Chavan P, Tambade S, Shinde S (2016) A review of machine learning techniques using decision tree and support vector machine. In: 2016 International Conference on Computing Communication Control and Automation (ICCUBEA), pp. 1–7. IEEE
Song S, Gao S, Chen X, Jia D, Qian X, Todo Y (2018) Aimoes: Archive information assisted multi-objective evolutionary strategy for ab initio protein structure prediction. Knowl Based Syst 146:58–72
Song S, Ji J, Chen X, Gao S, Tang Z, Todo Y (2018) Adoption of an improved pso to explore a compound multi-objective energy function in protein structure prediction. Appl Soft Comput 72:539–551
Steinegger M, Söding J (2017) MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets. Nat Biotechnol 35(11):1026–1028
Steinegger M, Meier M, Mirdita M, Vöhringer H, Haunsberger SJ, Söding J (2019) HH-suite3 for fast remote homology detection and deep protein annotation. BMC Bioinf 20(1):1–15
Su H, Wang W, Du Z, Peng Z, Gao SH, Cheng MM, Yang J (2021) Improved protein structure prediction using a new multi-scale network and homologous templates. Adv Sci 1:2102592
Takahashi T, Chikenji G, Tokita K (2021) Lattice protein design using bayesian learning. Phys Rev E 104(1):014404
Talbi E-G (2009) Metaheuristics: from Design to Implementation, vol 74. John Wiley & Sons, Hoboken, New Jersey, U.S
Torrisi M, Kaleel M, Pollastri G (2019) Deeper profiles and cascaded recurrent and convolutional neural networks for state-of-the-art protein secondary structure prediction. Sci Rep 9(1):1–12
Torrisi M, Pollastri G, Le Q (2020) Deep learning methods in protein structure prediction. Comput Struct Biotechnol J 18:1301
Tunyasuvunakool K, Adler J, Wu Z, Green T, Zielinski M, Žídek A, Bridgland A, Cowie A, Meyer C, Laydon A et al (2021) Highly accurate protein structure prediction for the human proteome. Nature 596(7873):590–596
Varela D, Santos J (2017) A protein folding model using the face-centered cubic lattice model. In: Proceedings of the Genetic and Evolutionary Computation Conference Companion, pp. 1674–1678
Varela D, Santos J (2022) Niching methods integrated with a differential evolution memetic algorithm for protein structure prediction. Swarm Evol Comput, 101062
Venske SM, Gonçalves RA, Benelli EM, Delgado MR (2016) ADEMO/D: an adaptive differential evolution for protein structure prediction problem. Exp Syst Appl 56:209–226
Vieira A, Ribeiro B (2018) Introduction to deep learning business applications for developers. Springer, Cham
Walsh I, Baù D, Martin AJ, Mooney C, Vullo A, Pollastri G (2009) Ab initio and template-based prediction of multi-class distance maps by two-dimensional recursive neural networks. BMC Struct Biol 9(1):1–20
Wang G, Dunbrack RL Jr (2003) Pisces: a protein sequence culling server. Bioinformatics 19(12):1589–1591
Wang G, Dunbrack RL Jr (2005) PISCES: recent improvements to a PDB sequence culling server. Nucleic Acids Res 33(suppl_2):94–98
Wang Z, Xu J (2013) Predicting protein contact map using evolutionary and physical constraints by integer programming. Bioinformatics 29(13):266–273
Wang Z, Eickholt J, Cheng J (2010) Multicom: a multi-level combination approach to protein structure prediction and its assessments in casp8. Bioinformatics 26(7):882–888
Wang S, Li W, Liu S, Xu J (2016) RaptorX-Property: a web server for protein structure property prediction. Nucleic Acids Res 44(W1):430–435
Wang S, Peng J, Ma J, Xu J (2016) Protein secondary structure prediction using deep convolutional neural fields. Sci Rep 6:18962
Wang X, Zhou Y, Yan R (2015) AAFreqCoil: a new classifier to distinguish parallel dimeric and trimeric coiled coils. Mol BioSyst 11(7):1794–1801
Wang Y, Mao H, Yi Z (2017) Protein secondary structure prediction by using deep learning method. Knowledge-Based Systems 118:115–123
Wang S, Sun S, Li Z, Zhang R, Xu J (2017) Accurate de novo prediction of protein contact map by ultra-deep learning model. PLoS Comput Biol 13(1):1005324
Wang T, Yang Y, Zhou Y, Gong H (2017) Lrfraglib: an effective algorithm to identify fragments for de novo protein structure prediction. Bioinformatics 33(5):677–684
Wang T, Qiao Y, Ding W, Mao W, Zhou Y, Gong H (2019) Improved fragment sampling for ab initio protein structure prediction using deep neural networks. Nat Mach Intell 1(8):347–355
Wardah W, Khan MG, Sharma A, Rashid MA (2019) Protein secondary structure prediction using neural networks and deep learning: A review. Comput Biol Chem 81:1–8
Won J, Baek M, Monastyrskyy B, Kryshtafovych A, Seok C (2019) Assessment of protein model structure accuracy estimation in CASP13: challenges in the era of deep learning. Proteins 87(12):1351–1360
Wood CW, Woolfson DN (2018) CCBuilder 2.0: powerful and accessible coiled-coil modeling. Protein Sci 27(1):103–111
Wu T, Hou J, Adhikari B, Cheng J (2020) Analysis of several key factors influencing deep learning-based inter-residue contact prediction. Bioinformatics 36(4):1091–1098
Wu Q, Peng Z, Anishchenko I, Cong Q, Baker D, Yang J (2020) Protein contact prediction using metagenome sequence data and residual neural networks. Bioinformatics 36(1):41–48
Wu T, Guo Z, Hou J, Cheng J (2021) DeepDist: real-value inter-residue distance prediction with deep residual convolutional network. BMC Bioinf 22(1):1–17
Wu T, Liu J, Guo Z, Hou J, Cheng J (2021) MULTICOM2 open-source protein structure prediction system powered by deep learning and distance prediction. Sci Rep 11(1):1–9
Xia X (2018) Hidden markov models and protein secondary structure prediction. In: Bioinformatics and the Cell, pp. 145–172. Springer, Switzerland
Xia Y-H, Peng C-X, Zhou X-G, Zhang G-J (2021) A sequential niche multimodal conformational sampling algorithm for protein structure prediction. Bioinformatics 37(23):4357–4365
Xiong D, Zeng J, Gong H (2017) A deep learning framework for improving long-range residue-residue contact prediction using a hierarchical strategy. Bioinformatics 33(17):2675–2683
Xu J (2019) Distance-based protein folding powered by deep learning. Proc Natl Acad Sci USA 116(34):16856–16865
Xu J, Berger B (2006) Fast and accurate algorithms for protein side-chain packing. J ACM 53(4):533–557
Xu J, Wang S (2019) Analysis of distance-based protein structure prediction by deep learning in CASP13. Proteins 87(12):1069–1081
Xu D, Zhang Y (2012) Ab initio protein structure assembly using continuous structure fragments and optimized knowledge-based force field. Proteins 80(7):1715–1735
Xu G, Ma T, Zang T, Sun W, Wang Q, Ma J (2017) OPUS-DOSP: a distance-and orientation-dependent all-atom potential derived from side-chain packing. J Mol Biol 429(20):3113–3120
Xu G, Ma T, Du J, Wang Q, Ma J (2019) OPUS-Rota2: an improved fast and accurate side-chain modeling method. J Chem Theory Comput 15(9):5154–5160
Xu J, Mcpartlon M, Li J (2021) Improved protein structure prediction by deep learning irrespective of co-evolution information. Nat Mach Intell 1:1–9
Xu G, Wang Q, Ma J (2020) OPUS-Rota 3: Improving protein side-chain modeling by deep neural networks and ensemble methods. J Chem Inf Model 60(12):6691–6697
Xu G, Wang Q, Ma J (2020) OPUS-TASS: a protein backbone torsion angles and secondary structure predictor based on ensemble neural networks. Bioinformatics (Oxford, England)
Yang Y, Zhou Y (2008) Specific interactions for ab initio folding of protein terminal regions with secondary structures. Proteins 72(2):793–803
Yang Y, Gao J, Wang J, Heffernan R, Hanson J, Paliwal K, Zhou Y (2018) Sixty-five years of the long march in protein secondary structure prediction: the final stretch? Briefings in bioinformatics 19(3):482–494
Yang J, He B-J, Jang R, Zhang Y, Shen H-B (2015) Accurate disulfide-bonding network predictions improve ab initio structure prediction of cysteine-rich proteins. Bioinformatics 31(23):3773–3781
Yang Y, Heffernan R, Paliwal K, Lyons J, Dehzangi A, Sharma A, Wang J, Sattar A, Zhou Y (2017) SPIDER2: a package to predict secondary structure, accessible surface area, and main-chain torsional angles by deep neural networks. In: Prediction of Protein Secondary Structure, pp. 55–63. Springer, Cham
Yang H, Wang M, Yu Z, Zhao X-M, Li A (2020) GANcon: protein contact map prediction with deep generative adversarial network. IEEE Access 8:80899–80907
Yang J, Anishchenko I, Park H, Peng Z, Ovchinnikov S, Baker D (2020) Improved protein structure prediction using predicted interresidue orientations. Proc Natl Acad Sci USA 117(3):1496–1503
Yanofsky C, Horn V, Thorpe D (1964) Protein structure relationships revealed by mutational analysis. Science 146(3651):1593–1594
Zaman AB, Shehu A (2019) Balancing multiple objectives in conformation sampling to control decoy diversity in template-free protein structure prediction. BMC Bioinf 20(1):1–17
Zaman R, Newton MH, Mataeimoghadam F, Sattar A (2022) Constraint guided neighbour generation for protein structure prediction. IEEE Access
Zemla A (2003) LGA: a method for finding 3D similarities in protein structures. Nucleic acids research 31(13):3370–3374
Zemla A, Venclovas Č, Moult J, Fidelis K (2001) Processing and evaluation of predictions in CASP4. Wiley Online Library
Zhang Y, Skolnick J (2004) Scoring function for automated assessment of protein structure template quality. Proteins 57(4):702–710
Zhang J, Zhang Y (2010) A novel side-chain orientation dependent potential derived from random-walk reference state for protein fold selection and structure prediction. PLoS ONE 5(10):15386
Zhang C, Zhang Y (2020) Protein 3D structure prediction by d-quark in CASP14. In: Fourteenth Meeting of Critical Assessment of Techniques for Protein Structure Prediction, p. 220
Zhang G-J, Ma L-F, Wang X-Q, Zhou X-G (2018) Secondary structure and contact guided differential evolution for protein structure prediction. IEEE/ACM Trans Comput Biol Bioinf 17(3):1068–1081
Zhang L, Ma H, Qian W, Li H (2020) Protein structure optimization using improved simulated annealing algorithm on a three-dimensional ab off-lattice model. Comput Biol Chem 85:107237
Zhang H, Bei Z, Xi W, Hao M, Ju Z, Saravanan KM, Zhang H, Guo N, Wei Y (2021) Evaluation of residue-residue contact prediction methods: From retrospective to prospective. PLOS Comput Biol 17(5):1009027
Zheng W, Li Y, Zhang C, Pearce R, Mortuza S, Zhang Y (2019) Deep-learning contact-map guided protein structure prediction in CASP13. Proteins 87(12):1149–1164
Zheng W, Li Y, Zhang C, Zhou X, Pearce R, Bell EW, Huang X, Zhang Y (2021) Protein structure prediction using deep learning distance and hydrogen-bonding restraints in casp14. Proteins 89(12):1734–1751
Zhong W, Gu F (2020) Predicting local protein 3D structures using clustering deep recurrent neural network. IEEE/ACM Trans Comput Biol Bioinf
Zhou H, Skolnick J (2011) GOAP: a generalized orientation-dependent, all-atom statistical potential for protein structure prediction. Biophys J 101(8):2043–2052
Zhou X-G, Zhang G-J, Hao X-H, Yu L (2016) A novel differential evolution algorithm using local abstract convex underestimate strategy for global optimization. Computers & Operations Research 75:132–149
Zou D, He Z, He J, Xia Y (2011) Supersecondary structure prediction using Chou’s pseudo amino acid composition. J Comput Chem 32(2):271–278
Acknowledgements
This work is partly supported by the Australian Research Council Discovery Grant DP180102727 and the AHEAD OPERATIONS Project of Sri Lanka.
Author information
Contributions
M.M.M.M. and M.A.H.N. contributed equally to all parts of the work and are joint first authors. A.S. took part in discussions and reviewed the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
About this article
Cite this article
Mufassirin, M.M.M., Newton, M.A.H. & Sattar, A. Artificial intelligence for template-free protein structure prediction: a comprehensive review. Artif Intell Rev 56, 7665–7732 (2023). https://doi.org/10.1007/s10462-022-10350-x
DOI: https://doi.org/10.1007/s10462-022-10350-x