Artificial intelligence for template-free protein structure prediction: a comprehensive review

Artificial Intelligence Review

Abstract

Protein structure prediction (PSP) is a grand challenge in bioinformatics, drug discovery, and related fields. PSP is computationally challenging because an astronomically large conformational space must be searched and an unknown, highly complex energy function must be minimised. To obtain a given protein’s structure, template-based PSP approaches adopt a similar protein’s known structure, while template-free PSP approaches apply when no similar protein’s structure is known. Currently, proteins with known structures are greatly outnumbered by proteins with unknown structures. Template-free PSP has made significant progress recently through machine learning and search-based optimisation approaches. However, structures accurate enough for effective drug design have yet to be achieved for complex proteins. Moreover, ab initio prediction of a protein’s structure from its amino acid sequence alone remains unsolved. Furthermore, the number of protein sequences with unknown structures is growing rapidly. Hence, further progress in PSP needs more sophisticated artificial intelligence (AI) approaches. However, getting involved in PSP research is difficult for AI researchers, because it requires a comprehensive understanding of the whole problem along with the background and literature of all related sub-problems. Unfortunately, existing PSP review papers cover PSP research only at a very high level, address only some parts of PSP, and do so from a single viewpoint. Using a systematic approach, this review provides a comprehensive survey of state-of-the-art template-free PSP research to fill this knowledge gap. Moreover, covering the required PSP preliminaries and computational formulations, it presents PSP research from AI perspectives, discusses the challenges, provides our commentary, and outlines future research directions.

References

  • Adhikari B (2020) DEEPCON: protein contact prediction using dilated convolutional neural networks with dropout. Bioinformatics 36(2):470–477

    Google Scholar 

  • Adhikari B (2020) A fully open-source framework for deep learning protein real-valued distances. Sci Rep 10(1):1–10

    Google Scholar 

  • Adhikari B, Cheng J (2016) Protein residue contacts and prediction methods. In: Data Mining Techniques for the Life Sciences, pp. 463–476. Springer, Switzerland

  • Adhikari B, Cheng J (2018) CONFOLD2: improved contact-driven ab initio protein structure modeling. BMC Bioinf 19(1):1–5

    Google Scholar 

  • Adhikari B, Bhattacharya D, Cao R, Cheng J (2015) CONFOLD: residue-residue contact-guided ab initio protein folding. Proteins 83(8):1436–1449

    Google Scholar 

  • Adhikari B, Hou J, Cheng J (2018) DNCON2: improved protein contact prediction using two-level deep convolutional neural networks. Bioinformatics 34(9):1466–1472

    Google Scholar 

  • Adhikari B, Shrestha B, Bernardini M, Hou J, Lea J (2021) DISTEVAL: a web server for evaluating predicted protein distances. BMC Bioinf 22(1):1–9

    Google Scholar 

  • AlQuraishi M (2019) End-to-end differentiable learning of protein structure. Cell Syst 8(4):292–301

    Google Scholar 

  • AlQuraishi M (2021) Machine learning in protein structure prediction. Curr Opin Chem Biol 65:1–8

    Google Scholar 

  • Alford RF, Leaver-Fay A, Jeliazkov JR, O’Meara MJ, DiMaio FP, Park H, Shapovalov MV, Renfrew PD, Mulligan VK, Kappel K et al (2017) The rosetta all-atom energy function for macromolecular modeling and design. J Chem Theory Comput 13(6):3031–3048

    Google Scholar 

  • Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic acids research 25(17):3389–3402

    Google Scholar 

  • Anfinsen CB (1973) Principles that govern the folding of protein chains. Science 181(4096):223–230

    Google Scholar 

  • Anishchenko I, Baek M, Park H, Dauparas J, Hiranuma N, Mansoor S, Humphrey I, Baker D (2020) Protein structure prediction guided by predicted inter-residue geometries. In: Fourteenth Meeting of Critical Assessment of Techniques for Protein Structure Prediction, p. 30

  • Atari M, Majd N (2022) 2D HP protein folding using quantum genetic algorithm. In: 2022 27th International Computer Conference, Computer Society of Iran (CSICC), pp. 1–8 . IEEE

  • Baek M, DiMaio F, Anishchenko I, Dauparas J, Ovchinnikov S, Lee GR, Wang J, Cong Q, Kinch LN, Schaeffer RD et al (2021) Accurate prediction of protein structures and interactions using a three-track neural network. Science 1:1

    Google Scholar 

  • Bagaria A, Jaravine V, Güntert P (2013) Estimating structure quality trends in the Protein Data Bank by equivalent resolution. Comput Biol Chem 46:8–15

    Google Scholar 

  • Bairoch A, Bougueleret L, Altairac S, Amendolia V, Auchincloss A, Puy GA, Axelsen K, Baratin D, Blatter M-C, Boeckmann B et al (2008) The universal protein resource (uniprot). Nucleic Acids Res 36:190–195

    Google Scholar 

  • Bastian M, Heymann S, Jacomy M (2009) Gephi: an open source software for exploring and manipulating networks. In: Proceedings of the International AAAI Conference on Web and Social Media, vol. 3(1)

  • Belda I, Madurga S, Tarragó T, Llorà X, Giralt E (2007) Evolutionary computation and multimodal search: a good combination to tackle molecular diversity in the field of peptide design. Mol Divers 11(1):7–21

    Google Scholar 

  • Benkert P, Tosatto SC, Schomburg D (2008) QMEAN: a comprehensive scoring function for model quality assessment. Proteins 71(1):261–277

    Google Scholar 

  • Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Rapp BA, Wheeler DL (2000) Genbank. Nucleic Acids Res 28(1):15–18

    Google Scholar 

  • Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE (2000) The protein data bank. Nucleic Acids Res 28(1):235–242

    Google Scholar 

  • Berrera M, Molinari H, Fogolari F (2003) Amino acid empirical contact energy definitions for fold recognition in the space of contact maps. BMC Bioinf 4(1):1–26

    Google Scholar 

  • Bhattacharya D (2019) refineD: improved protein structure refinement using machine learning based restrained relaxation. Bioinformatics 35(18):3320–3328

    Google Scholar 

  • Bhattacharya D, Cheng J (2013) 3Drefine: consistent protein structure refinement by optimizing hydrogen bonding network and atomic-level energy minimization. Proteins 81(1):119–131

    Google Scholar 

  • Bhattacharya D, Cao C (2016) Renzhi, Jianlin: UniCon3D: de novo protein structure prediction using united-residue conformational search via stepwise, probabilistic sampling. Bioinformatics 32(18):2791–2799

    Google Scholar 

  • Bhattacharya D, Adhikari B, Li J, Cheng J (2016) Fragsion: ultra-fast protein fragment library generation by iohmm sampling. Bioinformatics 32(13):2059–2061

    Google Scholar 

  • Bhattacharya D, Nowotny J, Cao R, Cheng J (2016) 3Drefine: an interactive web server for efficient protein structure refinement. Nucleic Acids Res 44(W1):406–409

    Google Scholar 

  • Bianchi L, Dorigo M, Gambardella LM, Gutjahr WJ (2009) A survey on metaheuristics for stochastic combinatorial optimization. Natural Computing 8(2):239–287

    MATH  MathSciNet  Google Scholar 

  • Biehn SE, Lindert S (2022) Protein structure prediction with mass spectrometry data. Ann Rev Phys Chem 73:1–19

  • Billings WM, Morris CJ, Della Corte D (2021) The whole is greater than its parts: ensembling improves protein contact prediction. Sci Rep 11(1):1–7

    Google Scholar 

  • Blum C, Roli A (2003) Metaheuristics in combinatorial optimization: Overview and conceptual comparison. ACM computing surveys (CSUR) 35(3):268–308

    Google Scholar 

  • Borguesan B, eSilva MB, Grisci B, Inostroza-Ponta M, Dorn M (2015) APL: an angle probability list to improve knowledge-based metaheuristics for the three-dimensional protein structure prediction. Comput Biol Chem 59:142–157

    Google Scholar 

  • Bradley P, Misura KM, Baker D (2005) Toward high-resolution de novo structure prediction for small proteins. Science 309(5742):1868–1871

    Google Scholar 

  • Brooks BR, Brooks CL III, Mackerell AD Jr, Nilsson L, Petrella RJ, Roux B, Won Y, Archontis G, Bartels C, Boresch S et al (2009) CHARMM: the biomolecular simulation program. J Comput Chem 30(10):1545–1614

    Google Scholar 

  • Brunger AT (2007) Version 1.2 of the crystallography and nmr system. Nat Protocols 2(11):2728–2733

    Google Scholar 

  • Brünger AT, Adams PD, Clore GM, DeLano WL, Gros P, Grosse-Kunstleve RW, Jiang J-S, Kuszewski J, Nilges M, Pannu NS et al (1998) Crystallography & NMR system: a new software suite for macromolecular structure determination. Acta Crystallogr Sect D 54(5):905–921

    Google Scholar 

  • Cai Y, Li X, Sun Z, Lu Y, Zhao H, Hanson J, Paliwal K, Litfin T, Zhou Y, Yang Y (2020) SPOT-Fold: fragment-free protein structure prediction guided by predicted backbone structure and contact map. J Comput Chem 41(8):745–750

    Google Scholar 

  • Canutescu AA, Shelenkov AA, Dunbrack RL Jr (2003) A graph-theory algorithm for rapid protein side-chain prediction. Protein 12(9):2001–2014

    Google Scholar 

  • Cao Y, Song L, Miao Z, Hu Y, Tian L, Jiang T (2011) Improved side-chain modeling by coupling clash-detection guided iterative search with rotamer relaxation. Bioinformatics 27(6):785–790

    Google Scholar 

  • Cao R, Adhikari B, Bhattacharya D, Sun M, Hou J, Cheng J (2017) QAcon: single model quality assessment using protein structural and contact information with machine learning techniques. Bioinformatics 33(4):586–588

    Google Scholar 

  • Case DA, Cheatham TE III, Darden T, Gohlke H, Luo R, Merz KM Jr, Onufriev A, Simmerling C, Wang B, Woods RJ (2005) The amber biomolecular simulation programs. J Comput Chem 26(16):1668–1688

    Google Scholar 

  • Cavanagh J, Fairbrother WJ, Palmer AG III, Skelton NJ (1996) Protein NMR spectroscopy: principles and practice. Academic Press, New York

    Google Scholar 

  • Chan T, Jankovic B, Le V, Naverniouk I (2004) Comparative Study of Hydrophobic-Polar and Miyazawa-Jernigan Energy Functions in Protein Folding on a Cubic Lattice Using Pruned-Enriched Rosenbluth Monte Carlo Algorithm

  • Chaudhury S, Lyskov S, Gray JJ (2010) PyRosetta: a script-based interface for implementing molecular modeling algorithms using rosetta. Bioinformatics 26(5):689–691

    Google Scholar 

  • Chen P, Li J (2010) Prediction of protein long-range contacts using an ensemble of genetic algorithm classifiers with sequence profile centers. BMC structural biology 10(1):1–13

    Google Scholar 

  • Chen K, Kurgan L (2012) Computational prediction of secondary and supersecondary structures. In: Protein Supersecondary Structures, pp. 63–86. Springer, Switzerland

  • Chen X, Song S, Ji J, Tang Z, Todo Y (2020) Incorporating a multiobjective knowledge-based energy function into differential evolution for protein structure prediction. Information Sciences 540:69–88

    MATH  MathSciNet  Google Scholar 

  • Chen C, Wu T, Guo Z, Cheng J (2021) Combination of deep neural network with attention mechanism enhances the explainability of protein contact prediction. Proteins 89(6):697–707

    Google Scholar 

  • Cheng J, Baldi P (2005) Three-stage prediction of protein \(\beta \)-sheets by neural networks, alignments and graph algorithms. Bioinformatics 21(suppl_1), 75–84

  • Cheng J, Tegge AN, Baldi P (2008) Machine learning methods for protein structure prediction. IEEE Rev Biomed Eng 1:41–49

    Google Scholar 

  • Chi PB, Kim D, Lai JK, Bykova N, Weber CC, Kubelka J, Liberles DA (2018) A new parameter-rich structure-aware mechanistic model for amino acid substitution during evolution. Proteins 86(2):218–228

    Google Scholar 

  • Chuang C-C, Chen C-Y, Yang J-M, Lyu P-C, Hwang J-K (2003) Relationship between protein structures and disulfide-bonding patterns. Proteins 53(1):1–5

    Google Scholar 

  • Colbes J, Corona RI, Lezcano C, Rodríguez D, Brizuela CA (2017) Protein side-chain packing problem: is there still room for improvement? Brief Bioinf 18(6):1033–1043

    Google Scholar 

  • Comellas G, Rienstra CM (2013) Protein structure determination by magic-angle spinning solid-state NMR, and insights into the formation, structure, and stability of amyloid fibrils. Ann Rev Biophys 42:515–536

    Google Scholar 

  • Correa L, Borguesan B, Farfán C, Inostroza-Ponta M, Dorn M (2016) A memetic algorithm for 3D protein structure prediction problem. IEEE/ACM Trans Comput Biol Bioinf 15(3):690–704

    Google Scholar 

  • Dal Palu A, Dovier A, Fogolari F, Pontelli E (2011) Exploring protein fragment assembly using CLP. In: IJCAI, pp. 2590–2595

  • Damm W, Frontera A, Tirado-Rives J, Jorgensen WL (1997) OPLS all-atom force field for carbohydrates. J Comput Chem 18(16):1955–1970

    Google Scholar 

  • DasGupta D, Kaushik R, Jayaram B (2015) From ramachandran maps to tertiary structures of proteins. J Phys Chem B 119(34):11136–11145

    Google Scholar 

  • de Lima Corrêa L, Dorn M (2020) A multi-population memetic algorithm for the 3D protein structure prediction problem. Swarm Evol Comput 55:100677

    Google Scholar 

  • de Lima Corrêa L, Borguesan B, Krause MJ, Dorn M (2018) Three-dimensional protein structure prediction based on memetic algorithms. Comput Oper Res 91:160–177

    MATH  MathSciNet  Google Scholar 

  • de Oliveira SH, Shi J, Deane CM (2015) Building a better fragment library for de novo protein structure prediction. PLoS ONE 10(4):0123998

    Google Scholar 

  • Dehghani T, Naghibzadeh M, Eghdami M (2019) BetaDL: a protein beta-sheet predictor utilizing a deep learning model and independent set solution. Computers in Biology and Medicine 104:241–249

    Google Scholar 

  • Dhingra S, Sowdhamini R, Cadet F, Offmann B (2020) A glance into the evolution of template-free protein structure prediction methodologies. Biochimie 1:1

    Google Scholar 

  • Di Lena P, Nagata K, Baldi P (2012) Deep architectures for protein contact map prediction. Bioinformatics 28(19):2449–2457

    Google Scholar 

  • Dill KA (1985) Theory for the folding and stability of globular proteins. Biochemistry 24(6):1501–1509

    Google Scholar 

  • Ding W, Gong H (2020) Predicting the real-valued inter-residue distances for proteins. Adv Sci 7(19):2001314

    Google Scholar 

  • Ding W, Mao W, Shao D, Zhang W, Gong H (2018) DeepConPred2: an improved method for the prediction of protein residue contacts. Comput Struct Biotechnol J 16:503–510

    Google Scholar 

  • Dotu I, Cebrian M, Van Hentenryck P, Clote P (2011) On lattice protein structure prediction revisited. IEEE/ACM Trans Comput Biol Bioinf 8(6):1620–1632

    Google Scholar 

  • Dou J, Vorobieva AA, Sheffler W, Doyle LA, Park H, Bick MJ, Mao B, Foight GW, Lee MY, Gagnon LA et al (2018) De novo design of a fluorescence-activating \(\beta \)-barrel. Nature 561(7724):485–491

    Google Scholar 

  • Do Duc D, Dinh P.T., Anh VTN, Linh-Trung N (2018) An efficient ant colony optimization algorithm for protein structure prediction. In: 2018 12th International Symposium on Medical Information and Communication Technology (ISMICT), pp. 1–6. IEEE

  • Du Z, Su H, Wang W, Ye L, Wei H, Peng Z, Anishchenko I, Baker D, Yang J (2021) The trrosetta server for fast and accurate protein structure prediction. Nat Protocols 16(12):5634–5651

    Google Scholar 

  • Eickholt J, Cheng J (2012) Predicting protein residue-residue contacts using deep networks and boosting. Bioinformatics 28(23):3066–3072

    Google Scholar 

  • Eickholt J, Cheng J (2013) A study and benchmark of dncon: a method for protein residue-residue contact prediction using deep networks. In: BMC Bioinf, vol. 14, pp. 1–10 . BioMed Central

  • Ekeberg M, Lövkvist C, Lan Y, Weigt M, Aurell E (2013) Improved contact prediction in proteins: using pseudolikelihoods to infer potts models. Phys Rev E 87(1):012707

    Google Scholar 

  • Fang C (2018) Applications of deep neural networks to protein structure prediction. PhD thesis, University of Missouri-Columbia

  • Fang C, Shang Y, Xu D (2018) Prediction of protein backbone torsion angles using deep residual inception neural networks. IEEE/ACM Trans Comput Biol Bioinf 16(3):1020–1028

    Google Scholar 

  • Fang C, Shang Y, Xu D (2018) MUFOLD-SS: new deep inception-inside-inception networks for protein secondary structure prediction. Proteins 86(5):592–598

    Google Scholar 

  • Fielding AH (1999) An introduction to machine learning methods. In: Machine Learning Methods for Ecological Applications, pp. 1–35. Springer, Switzerland

  • Flot M, Mishra A, Kuchi AS, Hoque MT (2019) StackSSSPred: a stacking-based prediction of supersecondary structure from sequence. Methods Mol Biol (Clifton, NJ) 1958:101–122

    Google Scholar 

  • Fukuda H, Tomii K (2020) DeepECA: an end-to-end learning framework for protein contact prediction from a multiple sequence alignment. BMC Bioinf 21(1):1–15

    Google Scholar 

  • Gao S, Song S, Cheng J, Todo Y, Zhou M (2017) Incorporation of solvent effect into multi-objective evolutionary algorithm for improved protein structure prediction. IEEE/ACM Trans Comput Biol Bioinf 15(4):1365–1378

    Google Scholar 

  • Gao Y, Wang S, Deng M, Xu J (2018) RaptorX-Angle: real-value prediction of protein backbone dihedral angles through a hybrid method of clustering and deep learning. BMC Bioinf 19(4):73–84

    Google Scholar 

  • Gao J, Yang Y, Zhou Y (2018) Grid-based prediction of torsion angle probabilities of protein backbone and its application to discrimination of protein intrinsic disorder regions and selection of model structures. BMC Bioinf 19(1):1–8

    Google Scholar 

  • Garza-Fabre M, Kandathil SM, Handl J, Knowles J, Lovell SC (2016) Generating, maintaining, and exploiting diversity in a memetic algorithm for protein structure prediction. Evol Comput 24(4):577–607

    Google Scholar 

  • Glover FW, Kochenberger GA (2006) Handbook of Metaheuristics, vol 57. Springer, Switzerland

    MATH  Google Scholar 

  • Glusker J (2009) X-ray crystallography of proteins. Methods Biochem Anal 1:1–72

    Google Scholar 

  • Goldberg DE (1989) Genetic algorithms in search. Optimization, and MachineLearning

  • Gordon DB, Mayo SL (1999) Branch-and-terminate: a combinatorial optimization algorithm for protein design. Structure 7(9):1089–1098

    Google Scholar 

  • Greener JG, Kandathil SM, Jones DT (2019) Deep learning extends de novo protein modelling coverage of genomes using iteratively predicted structural constraints. Nat Commun 10(1):1–13

    Google Scholar 

  • Gront D, Kulp DW, Vernon RM, Strauss CE, Baker D (2011) Generalized fragment picking in rosetta: design, protocols and applications. PLoS ONE 6(8):23294

    Google Scholar 

  • Guo Z, Wu T, Liu J, Hou J, Cheng J (2021) Improving deep learning-based protein distance prediction in casp14. Bioinformatics 37(19):3190–3196

    Google Scholar 

  • Görmez Y, Aydin Z (2022) IGPRED-MultiTask: a deep learning model to predict protein secondary structure, torsion angles and solvent accessibility. IEEE/ACM Trans Comput Biol Bioinf

  • Görmez Y, Sabzekar M, Aydin Z (2021) IGPRED: combination of convolutional neural and graph convolutional networks for protein secondary structure prediction. Proteins

  • Haas J, Roth S, Arnold K, Kiefer F, Schmidt T, Bordoli L, Schwede T (2013) The protein model portal-a comprehensive resource for protein structure and model information. Database 2013

  • Hanson J, Paliwal K, Litfin T, Yang Y, Zhou Y (2018) Improving prediction of protein secondary structure, backbone angles, solvent accessibility and contact numbers by using predicted contact maps and an ensemble of recurrent and residual convolutional neural networks. Bioinformatics 35(14):2403–2410

    Google Scholar 

  • Hanson J, Paliwal K, Litfin T, Yang Y, Zhou Y (2018) Accurate prediction of protein contact maps by coupling residual two-dimensional bidirectional long short-term memory with convolutional neural networks. Bioinformatics 34(23):4039–4045

    Google Scholar 

  • He B, Mortuza S, Wang Y, Shen H-B, Zhang Y (2017) NeBcon: protein contact map prediction using neural network training coupled with naïve bayes classifiers. Bioinformatics 33(15):2296–2306

    Google Scholar 

  • Heffernan R, Paliwal K, Lyons J, Dehzangi A, Sharma A, Wang J, Sattar A, Yang Y, Zhou Y (2015) Improving prediction of secondary structure, local backbone angles and solvent accessible surface area of proteins by iterative deep learning. Sci Rep 5(1):1–11

    Google Scholar 

  • Heffernan R, Yang Y, Paliwal K, Zhou Y (2017) Capturing non-local interactions by long short-term memory bidirectional recurrent neural networks for improving prediction of protein secondary structure, backbone angles, contact numbers and solvent accessibility. Bioinformatics 33(18):2842–2849

    Google Scholar 

  • Heffernan R, Paliwal K, Lyons J, Singh J, Yang Y, Zhou Y (2018) Single-sequence-based prediction of protein secondary structures and solvent accessibility by deep whole-sequence learning. J Comput Chem 39(26):2210–2216

    Google Scholar 

  • Heo L, Feig M (2018) Experimental accuracy in protein structure refinement via molecular dynamics simulations. Proc Natl Acad Sci USA 115(52):13276–13281

    Google Scholar 

  • Heo L, Feig M (2020) High-accuracy protein structures by combining machine-learning with physics-based refinement. Proteins 88(5):637–642

    Google Scholar 

  • Hiranuma N, Park H, Baek M, Anishchenko I, Dauparas J, Baker D (2021) Improved protein structure refinement guided by deep learning based accuracy estimation. Nat Commun 12(1):1–11

    Google Scholar 

  • Hou J, Wu T, Cao R, Cheng J (2019) Protein tertiary structure modeling driven by deep learning and contact distance prediction in CASP13. Proteins 87(12):1165–1178

    Google Scholar 

  • Hu X-Z, Long H-X, Ding C-J, Gao S-J, Hou R (2020) Using random forest algorithm to predict super-secondary structure in proteins. J Supercomput 76(5):3199–3210

    Google Scholar 

  • Huang H, Gong X (2020) A review of protein inter-residue distance prediction. Curr Bioinf 15(8):821–830

    Google Scholar 

  • Huang X, Han K, Zhu Y (2013) Systematic optimization model and algorithm for binding sequence selection in computational enzyme design. Protein 22(7):929–941

    Google Scholar 

  • Huang J, Rauscher S, Nawrocki G, Ran T, Feig M, De Groot BL, Grubmüller H, MacKerell AD (2017) Charmm36m: an improved force field for folded and intrinsically disordered proteins. Nat Methods 14(1):71–73

    Google Scholar 

  • Huang X, Pearce R, Zhang Y (2020) FASPR: an open-source tool for fast and accurate protein side-chain packing. Bioinformatics 36(12):3758–3765

    Google Scholar 

  • Hussain K, Salleh MNM, Cheng S, Shi Y (2019) Metaheuristic research: a comprehensive survey. Artif Intell Rev 52(4):2191–2233

    Google Scholar 

  • Høie MH, Kiehl EN, Petersen B, Nielsen M, Winther O, Nielsen H, Hallgren J, Marcatili P (2022) NetSurfP-3.0: accurate and fast prediction of protein structural features by protein language models and deep learning. Nucleic Acids Res 1:1

    Google Scholar 

  • Ingraham J, Riesselman A, Sander C, Marks D (2018) Learning protein structure with a differentiable simulator. In: International Conference on Learning Representations

  • Irbäck A, Peterson C, Potthast F, Sommelius O (1997) Local interactions and protein folding: a three-dimensional off-lattice approach. J Chem Phys 107(1):273–282

    Google Scholar 

  • Jain A, Terashi G, Kagaya Y, Subramaniya SRMV, Christoffer C, Kihara D (2021) Analyzing effect of quadruple multiple sequence alignments on deep learning based protein inter-residue distance prediction. Sci Rep 11(1):1–13

    Google Scholar 

  • Jana ND, Das S, Sil J (2018) A metaheuristic approach to protein structure prediction. Springer, Cham

    MATH  Google Scholar 

  • Jauch R, Yeo HC, Kolatkar PR, Clarke ND (2007) Assessment of casp7 structure predictions for template free targets. Proteins 69(S8):57–67

    Google Scholar 

  • Jayaram B, Bhushan K, Shenoy SR, Narang P, Bose S, Agrawal P, Sahu D, Pandey V (2006) Bhageerath: an energy based web enabled computer software suite for limiting the search space of tertiary structures of small globular proteins. Nucleic Acids Res 34(21):6195–6204

    Google Scholar 

  • Ji S, Oruç T, Mead L, Rehman MF, Thomas CM, Butterworth S, Winn PJ (2019) DeepCDpred: inter-residue distance and contact prediction for improved prediction of protein structure. PLoS ONE 14(1):0205214

    Google Scholar 

  • Jiang Q, Jin X, Lee S-J, Yao S (2017) Protein secondary structure prediction: a survey of the state of the art. Journal of Molecular Graphics and Modelling 76:379–402

    Google Scholar 

  • Jing X, Xu J (2020) Improved protein model quality assessment by integrating sequential and pairwise features using deep learning. Bioinformatics 36(22–23):5361–5367

    Google Scholar 

  • Jing X, Dong Q, Lu R, Dong Q (2019) Protein inter-residue contacts prediction: methods, performances and applications. Curr Bioinf 14(3):178–189

    Google Scholar 

  • Jisna V, Jayaraj P (2021) Protein structure prediction: conventional and deep learning perspectives. Protein J 1:1–23

    Google Scholar 

  • Jones DT, McGuffin LJ (2003) Assembling novel protein folds from super-secondary structural fragments. Proteins 53(S6):480–485

    Google Scholar 

  • Jones DT, Kandathil SM (2018) High precision in protein contact prediction using fully convolutional neural networks and minimal sequence features. Bioinformatics 34(19):3308–3315

    Google Scholar 

  • Jones DT, Buchan DW, Cozzetto D, Pontil M (2012) PSICOV: precise structural contact prediction using sparse inverse covariance estimation on large multiple sequence alignments. Bioinformatics 28(2):184–190

    Google Scholar 

  • Jones DT, Singh T, Kosciolek T, Tetchner S (2015) Metapsicov: combining coevolution methods for accurate prediction of contacts and long range hydrogen bonding in proteins. Bioinformatics 31(7):999–1006

    Google Scholar 

  • Jumper JM, Faruk NF, Freed KF, Sosnick TR (2018) Accurate calculation of side chain packing and free energy with applications to protein molecular dynamics. PLoS Comput Biol 14(12):1006342

    Google Scholar 

  • Jumper J, Evans R, Pritzel A, Green T, Figurnov M, Tunyasuvunakool K, Ronneberger O, Bates R, Žídek A, Bridgland A, et al. (2020) High accuracy protein structure prediction using deep learning. Fourteenth Critical Assessment of Techniques for Protein Structure Prediction (Abstract Book) 22, 24

  • Jumper J, Evans R, Pritzel A, Green T, Figurnov M, Ronneberger O, Tunyasuvunakool K, Bates R, Žídek A, Potapenko A et al (2021) Highly accurate protein structure prediction with alphafold. Nature 1:1–11

    Google Scholar 

  • Kabsch W (1976) A solution for the best rotation to relate two sets of vectors. Acta Crystallogr Sect A 32(5):922–923

    Google Scholar 

  • Kabsch W, Sander C (1983) Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 22(12):2577–2637

    Google Scholar 

  • Kaján L, Hopf TA, Kalaš M, Marks DS, Rost B (2014) FreeContact: fast and free software for protein contact prediction from residue co-evolution. BMC Bioinf 15(1):1–6

    Google Scholar 

  • Kalisman N, Levi A, Maximova T, Reshef D, Zafriri-Lynn S, Gleyzer Y, Keasar C (2005) MESHI: a new library of Java classes for molecular modeling. Bioinformatics 21(20):3931–3932

    Google Scholar 

  • Kamisetty H, Ovchinnikov S, Baker D (2013) Assessing the utility of coevolution-based residue-residue contact predictions in a sequence-and structure-rich era. Proc Natl Acd Sci USA 110(39):15674–15679

    Google Scholar 

  • Kandathil SM, Greener JG, Jones DT (2019) Prediction of interresidue contacts with DeepMetaPSICOV in CASP13. Proteins 87(12):1092–1099

    Google Scholar 

  • Karasikov M, Pagès G, Grudinin S (2019) Smooth orientation-dependent scoring function for coarse-grained protein quality assessment. Bioinformatics 35(16):2801–2808

    Google Scholar 

  • Kelley LA, Sternberg MJ (2009) Protein structure prediction on the web: a case study using the phyre server. Nat Protocols 4(3):363–371

    Google Scholar 

  • Khanduja N, Bhushan B (2020) Recent advances and application of metaheuristic algorithms: A survey. Metaheuristic and Evolutionary Computation, 207

  • Khor BY, Tye GJ, Lim TS, Choong YS (2015) General overview on structure prediction of twilight-zone proteins. Theoret Biol Med Model 12(1):1–11

    Google Scholar 

  • Kingsford CL, Chazelle B, Singh M (2005) Solving and analyzing side-chain positioning problems using linear and integer programming. Bioinformatics 21(7):1028–1039

    Google Scholar 

  • Kinjo AR, Nakamura H (2008) Nature of protein family signatures: insights from singular value analysis of position-specific scoring matrices. PLoS ONE 3(4):1963

    Google Scholar 

  • Klausen MS, Jespersen MC, Nielsen H, Jensen KK, Jurtz VI, Soenderby CK, Sommer MOA, Winther O, Nielsen M, Petersen B et al (2019) NetSurfP-2.0: improved prediction of protein structural features by integrated deep learning. Proteins 87(6):520–527

    Google Scholar 

  • Klepeis JL, Wei Y, Hecht MH, Floudas CA (2005) Ab initio prediction of the three-dimensional structure of a de novo designed protein A double-blind case study. Proteins 58(3):560–570

    Google Scholar 

  • Kotowski K, Smolarczyk T, Roterman-Konieczna I, Stapor K (2021) ProteinUnet-an efficient alternative to SPIDER3-single for sequence-based prediction of protein secondary structures. J Comput Chem 42(1):50–59

    Google Scholar 

  • Kou G, Feng Y (2015) Identify five kinds of simple super-secondary structures with quadratic discriminant algorithm based on the chemical shifts. J Theoret Biol 380:392–398

    MATH  Google Scholar 

  • Kryshtafovych A, Monastyrskyy B, Fidelis K, Schwede T, Tramontano A (2018) Assessment of model accuracy estimations in CASP12. Proteins 86:345–360

    Google Scholar 

  • Kugunavar S, Prabhakar C (2021) Convolutional neural networks for the diagnosis and prognosis of the coronavirus disease pandemic. Visual Computing for Industry, Biomedicine, and Art 4(1):1–14

    Google Scholar 

  • Kuhlman B, Bradley P (2019) Advances in protein structure prediction and design. Nat Rev Mol Cell Biol 20(11):681–697

    Google Scholar 

  • Kukic P, Mirabello C, Tradigo G, Walsh I, Veltri P, Pollastri G (2014) Toward an accurate prediction of inter-residue distances in proteins using 2D recursive neural networks. BMC Bioinf 15(1):1–15

    Google Scholar 

  • Källberg M, Wang H, Wang S, Peng J, Wang Z, Lu H, Xu J (2012) Template-based protein structure modeling using the raptorx web server. Nat Protocols 7(8):1511–1522

    Google Scholar 

  • Lavor C, Alves R, Figueiredo W, Petraglia A, Maculan N (2015) Clifford algebra and the discretizable molecular distance geometry problem. Advances in Applied Clifford Algebras 25(4):925–942

    MATH  MathSciNet  Google Scholar 

  • Leaver-Fay A, Tyka M, Lewis SM, Lange OF, Thompson J, Jacak R, Kaufman KW, Renfrew PD, Smith CA, Sheffler W, et al. (2011) ROSETTA3: an object-oriented software suite for the simulation and design of macromolecules. In: Methods in Enzymology vol. 487, pp. 545–574. Elsevier, Amsterdam

  • Lee GR, Won J, Heo L, Seok C (2019) GalaxyRefine2: simultaneous refinement of inaccurate local regions and overall protein structure. Nucleic Acids Res 47(W1):451–455

  • Lee D, Xiong D, Wierbowski S, Li L, Liang S, Yu H (2022) Deep learning methods for 3D structural proteome and interactome modeling. Curr Opin Struct Biol 73

  • Levinthal C (1968) Are there pathways for protein folding? Journal de chimie physique 65:44–45

  • Li Y, Roy A, Zhang Y (2009) HAAD: a quick algorithm for accurate prediction of hydrogen atoms in protein structures. PLoS ONE 4(8):6701

  • Li Y, Fang Y, Fang J (2011) Predicting residue-residue contacts using random forest models. Bioinformatics 27(24):3379–3384

  • Li C, Wang X-F, Chen Z, Zhang Z, Song J (2015) Computational characterization of parallel dimeric and trimeric coiled-coils using effective amino acid indices. Mol BioSyst 11(2):354–360

  • Li Y, Hu J, Zhang C, Yu D-J, Zhang Y (2019) ResPRE: high-accuracy protein contact prediction by coupling precision matrix with deep residual neural networks. Bioinformatics 35(22):4647–4655

  • Li Y, Zhang C, Zheng W, Zhou X, Bell E, Yu D, Zhang Y (2020) Learning deep statistical potentials for protein folding. CASP 14:72–73

  • Li Y, Zheng W, Zhang C, Bell E, Huang X, Pearce R, Zhou X, Zhang Y (2020) Protein 3D structure prediction by DI-TASSER in CASP14. CASP 14:339–341

  • Li Y, Zhang C, Bell EW, Zheng W, Zhou X, Yu D-J, Zhang Y (2021) Deducing high-accuracy protein contact-maps from a triplet of coevolutionary matrices through deep residual convolutional networks. PLoS Comput Biol 17(3):1008865

  • Liljas A, Liljas L, Lindblom G, Nissen P, Kjeldgaard M, Ash Mr (2016) Textbook of Structural Biology vol. 8. World Scientific, Singapore

  • Lin HH, Tseng LY (2010) DBCP: a web server for disulfide bonding connectivity pattern prediction without the prior knowledge of the bonding state of cysteines. Nucleic Acids Res 38(suppl_2):503–507

  • Liu Z, Jiang L, Gao Y, Liang S, Chen H, Han Y, Lai L (2003) Beyond the rotamer library: genetic algorithm combined with the disturbing mutation process for upbuilding protein side-chains. Proteins 50(1):49–62

  • Liu Y, Palmedo P, Ye Q, Berger B, Peng J (2018) Enhancing evolutionary couplings with deep convolutional neural networks. Cell Syst 6(1):65–74

  • Liu Z-L, Hu J-H, Jiang F, Wu Y-D (2020) CRiSP: accurate structure prediction of disulfide-rich peptides with cystine-specific sequence alignment and machine learning. Bioinformatics 36(11):3385–3392

  • Liu J, Zhou X-G, Zhang Y, Zhang G-J (2020) CGLFold: a contact-assisted de novo protein structure prediction using global exploration and loop perturbation sampling algorithm. Bioinformatics 36(8):2443–2450

  • Liu S, Wang T, Xu Q, Shao B, Yin J, Liu T-Y (2021) Complementing sequence-derived features with structural information extracted from fragment libraries for protein structure prediction. BMC Bioinf 22(1):1–18

  • Liwo A, Khalili M, Scheraga HA (2005) Ab initio simulations of protein-folding pathways by molecular dynamics with the united-residue model of polypeptide chains. Proc Natl Acad Sci USA 102(7):2362–2367

  • Lodish H, Berk A, Zipursky SL, Matsudaira P, Baltimore D, Darnell J (2000) Hierarchical structure of proteins. In: Molecular Cell Biology. 4th Edition. WH Freeman, Macmillan Higher Education, US

  • Lu M, Dousis AD, Ma J (2008) OPUS-PSP: an orientation-dependent statistical all-atom potential derived from side-chain packing. J Mol Biol 376(1):288–301

  • Lyons J, Dehzangi A, Heffernan R, Sharma A, Paliwal K, Sattar A, Zhou Y, Yang Y (2014) Predicting backbone C\(\alpha\) angles and dihedrals from protein sequences by stacked sparse auto-encoder deep neural network. J Comput Chem 35(28):2040–2046

  • Ma J, Wang S, Wang Z, Xu J (2015) Protein contact prediction by integrating joint evolutionary coupling analysis and supervised learning. Bioinformatics 31(21):3506–3513

  • Ma Y, Liu Y, Cheng J (2018) Protein secondary structure prediction based on data partition and semi-random subspace method. Sci Rep 8(1):1–10

  • Mabrouk M, Werner T, Schneider T, Putz I, Brock O (2015) Analysis of free modelling predictions by RBO aleph in CASP11. Proteins 84:87–104

  • MacCarthy E, Perry D, Kc DB (2019) Advances in protein super-secondary structure prediction and application to protein structure prediction. Methods Mol Biol (Clifton, NJ) 1958:15–45

  • Maghrabi AH, McGuffin LJ (2017) ModFOLD6: an accurate web server for the global and local quality estimation of 3D protein models. Nucleic Acids Res 45(W1):416–421

  • Magnan CN, Baldi P (2014) SSpro/ACCpro 5: almost perfect prediction of protein secondary structure and relative solvent accessibility using profiles, machine learning and structural similarity. Bioinformatics 30(18):2592–2597

  • Maguire JB, Haddox HK, Strickland D, Halabiya SF, Coventry B, Griffin JR, Pulavarti SVK, Cummins M, Thieker DF, Klavins E et al (2021) Perturbing the energy landscape for improved packing during computational protein design. Proteins 89(4):436–449

  • Mariani V, Biasini M, Barbato A, Schwede T (2013) lDDT: a local superposition-free score for comparing protein structures and models using distance difference tests. Bioinformatics 29(21):2722–2728

  • Marks DS, Colwell LJ, Sheridan R, Hopf TA, Pagnani A, Zecchina R, Sander C (2011) Protein 3D structure computed from evolutionary sequence variation. PLoS ONE 6(12):28766

  • Marsland S (2015) Machine Learning: an Algorithmic Perspective. CRC Press, Boca Raton, Florida

  • Mataeimoghadam F, Newton MH, Dehzangi A, Karim A, Jayaram B, Ranganathan S, Sattar A (2020) Enhancing protein backbone angle prediction by using simpler models of deep neural networks. Sci Rep 10(1):1–12

  • Meiler J, Müller M, Zeidler A, Schmäschke F (2001) Generation and evaluation of dimension-reduced amino acid parameter representations by artificial neural networks. Molecular modeling annual 7(9):360–369

  • Miao Z, Cao Y, Jiang T (2011) RASP: rapid modeling of protein side chain conformations. Bioinformatics 27(22):3117–3122

  • Michel M, Menéndez Hurtado D, Elofsson A (2019) PconsC4: fast, accurate and hassle-free contact predictions. Bioinformatics 35(15):2677–2679

  • Mignan A, Broccardo M (2019) One neuron versus deep learning in aftershock prediction. Nature 574(7776):1–3

  • Mirabello C, Pollastri G (2013) Porter, PaleAle 4.0: high-accuracy prediction of protein secondary structure and relative solvent accessibility. Bioinformatics 29(16):2056–2058

  • Mirabello C, Wallner B (2019) RAWMSA: end-to-end deep learning using raw multiple sequence alignments. PLoS ONE 14(8):0220182

  • Mirdita M, Schütze K, Moriwaki Y, Heo L, Ovchinnikov S, Steinegger M (2022) ColabFold: making protein folding accessible to all. Nat Methods 1:1–4

  • Mishra A, Iqbal S, Hoque MT (2016) Discriminate protein decoys from native by using a scoring function based on ubiquitous phi and psi angles computed for all atom. J Theoret Biol 398:112–121

  • Mishra A, Kabir MWU, Hoque MT (2021) diSBPred: a machine learning based approach for disulfide bond prediction. Comput Biol Chem 91:107436

  • Mittal A, Jayaram B, Shenoy S, Bawa TS (2010) A stoichiometry driven universal spatial organization of backbones of folded proteins: are there chargaff’s rules for protein folding? Journal of Biomolecular Structure and Dynamics 28(2):133–142

  • Miyazawa S, Jernigan RL (1985) Estimation of effective interresidue contact energies from protein crystal structures: quasi-chemical approximation. Macromolecules 18(3):534–552

  • Miyazawa S, Jernigan RL (1996) Residue-residue potentials with a favorable contact pair term and an unfavorable high packing density term, for simulation and threading. J Mol Biol 256(3):623–644

  • Morcos F, Pagnani A, Lunt B, Bertolino A, Marks DS, Sander C, Zecchina R, Onuchic JN, Hwa T, Weigt M (2011) Direct-coupling analysis of residue coevolution captures native contacts across many protein families. Proc Natl Acad Sci USA 108(49):1293–1301

  • Mortuza S, Zheng W, Zhang C, Li Y, Pearce R, Zhang Y (2021) Improving fragment-based ab initio protein structure assembly using low-accuracy contact-map predictions. Nat Commun 12(1):1–12

  • Moscato P, Cotta C (2010) A modern introduction to memetic algorithms. In: Handbook of Metaheuristics, pp. 141–183. Springer, Cham

  • Moscato P et al (1989) On evolution, search, optimization, genetic algorithms and martial arts: towards memetic algorithms. Caltech concurrent computation program, C3P Rep 826, 1989

  • Moult J, Pedersen JT, Judson R, Fidelis K (1995) A large-scale experiment to assess protein structure prediction methods. Wiley Online Library

  • Mufassirin MM, Ragel RG (2018) A novel filter-wrapper based feature selection approach for cancer data classification. In: 2018 IEEE International Conference on Information and Automation for Sustainability (ICIAfS), pp. 1–6. IEEE

  • Mullard A (2021) What does alphafold mean for drug discovery? Nature reviews, Drug discovery

  • Nagata K, Randall A, Baldi P (2012) SIDEpro: a novel machine learning approach for the fast and accurate prediction of side-chain conformations. Proteins 80(1):142–153

  • Narloch PH, Parpinelli RS (2017) The protein structure prediction problem approached by a cascade differential evolution algorithm using ROSETTA. In: 2017 Brazilian Conference on Intelligent Systems (BRACIS), pp. 294–299. IEEE

  • Nazmul R, Chetty M, Chowdhury AR (2020) Multimodal memetic framework for low-resolution protein structure prediction. Swarm Evol Comput 52:100608

  • Newton M, Mataeimoghadam F, Zaman R, Sattar A (2022) Secondary structure specific simpler prediction models for protein backbone angles. BMC Bioinf 23(1):1–14

  • Newton MH, Zaman R, Mataeimoghadam F, Rahman J, Sattar A (2022) Constraint guided beta-sheet refinement for protein structure prediction. Comput Biol Chem 1:107773

  • Newton MH, Rahman J, Zaman R, Sattar A (2022) Enhancing protein contact map prediction accuracy via ensembles of inter-residue distance predictors. Comput Biol Chem 1:107700

  • Niu S, Huang T, Feng K-Y, He Z, Cui W, Gu L, Li H, Cai Y-D, Li Y (2013) Inter-and intra-chain disulfide bond prediction based on optimal feature selection. Protein Peptide Lett 20(3):324–335

  • Ovchinnikov S, Park H, Kim DE, DiMaio F, Baker D (2018) Protein structure prediction using rosetta in casp12. Proteins 86:113–121

  • O’Meara MJ, Leaver-Fay A, Tyka MD, Stein A, Houlihan K, DiMaio F, Bradley P, Kortemme T, Baker D, Snoeyink J et al (2015) Combined covalent-electrostatic model of hydrogen bonding improves structure prediction with rosetta. J Chem Theory Comput 11(2):609–622

  • Park H, Bradley P, Greisen P Jr, Liu Y, Mulligan VK, Kim DE, Baker D, DiMaio F (2016) Simultaneous optimization of biomolecular energy functions on features from small molecules and macromolecules. J Chem Theory Comput 12(12):6201–6212

  • Park H, Ovchinnikov S, Kim DE, DiMaio F, Baker D (2018) Protein homology model refinement by large-scale energy optimization. Proc Natl Acad Sci USA 115(12):3054–3059

  • Pearce R, Zhang Y (2021) Deep learning techniques have significantly impacted protein structure prediction and protein design. Curr Opin Struct Biol 68:194–207

  • Pearce R, Zhang Y (2021) Toward the solution of the protein structure prediction problem. J Biol Chem 1:1

  • Pereira J, Simpkin AJ, Hartmann MD, Rigden DJ, Keegan RM, Lupas AN (2021) High-accuracy protein structure prediction in CASP14. Proteins 89(12):1687–1699

  • Persson O, Danell R, Schneider JW (2009) How to use bibexcel for various types of bibliometric analysis. Celebrating scholarly communication studies: a Festschrift for Olle Persson at his 60th Birthday 5:9–24

  • Peterson RW, Dutton PL, Wand AJ (2004) Improved side-chain prediction accuracy using an ab initio potential energy function and a very large rotamer library. Protein Sci 13(3):735–751

  • Rahman J, Newton MH, Hasan MAM, Sattar A (2022) A stacked meta-ensemble for protein inter-residue distance prediction. Comput Biol Med 148:105824

  • Rahman J, Newton M, Islam MKB, Sattar A (2022) Enhancing protein inter-residue real distance prediction by scrutinising deep learning models. Sci Rep 12(1):1–13

  • Rakhshani H, Idoumghar L, Ghambari S, Lepagnot J, Brévilliers M (2021) On the performance of deep learning for numerical optimization: an application to protein structure prediction. Appl Soft Comput 110:107596

  • Ramyachitra D, Ajeeth A (2017) MODCSA-CA: a multi objective diversity controlled self adaptive cuckoo algorithm for protein structure prediction. Gene Rep 8:100–106

  • Rashid MA, Khatib F, Hoque MT, Sattar A (2015) An enhanced genetic algorithm for ab initio protein structure prediction. IEEE Trans Evol Comput 20(4):627–644

  • Rashid MA, Shatabda S, Newton MH, Hoque MT, Pham DN, Sattar A (2012) Random-walk: a stagnation recovery technique for simplified protein structure prediction. In: Proceedings of the ACM Conference on Bioinformatics, Computational Biology and Biomedicine, pp. 620–622

  • Rashid MA, Newton MH, Hoque MT, Sattar A (2013) A local search embedded genetic algorithm for simplified protein structure prediction. In: 2013 IEEE Congress on Evolutionary Computation, pp. 1091–1098. IEEE

  • Rashid MA, Newton M, Hoque M, Sattar A et al (2013) Mixing energy models in genetic algorithms for on-lattice protein structure prediction. BioMed Res Int 2013

  • Remmert M, Biegert A, Hauser A, Söding J (2012) HHblits: lightning-fast iterative protein sequence searching by hmm-hmm alignment. Nature methods 9(2):173

  • Richmond TJ (1984) Solvent accessible surface area and excluded volume in proteins: Analytical equations for overlapping spheres and implications for the hydrophobic effect. J Mol Biol 178(1):63–89

  • Rives A, Meier J, Sercu T, Goyal S, Lin Z, Liu J, Guo D, Ott M, Zitnick CL, Ma J et al (2021) Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences. Proc Natl Acad Sci USA 118(15)

  • Rodriguez C, Chowriappa P, Dua S et al (2019) Local similarity matrix for cysteine disulfide connectivity prediction from protein sequences. IEEE/ACM Trans Comput Biol Bioinf 17(4):1276–1289

  • Rose GD, Geselowitz AR, Lesser GJ, Lee RH, Zehfus MH (1985) Hydrophobicity of amino acid residues in globular proteins. Science 229(4716):834–838

  • Rost B (2001) Protein secondary structure prediction continues to rise. Journal of structural biology 134(2–3):204–218

  • Roy A, Kucukural A, Zhang Y (2010) I-tasser: a unified platform for automated protein structure and function prediction. Nat Protocols 5(4):725–738

  • Santos KB, Trevizani R, Custódio FL, Dardenne LE (2015) Profrager web server: Fragment libraries generation for protein structure prediction. In: Proceedings of the International Conference on Bioinformatics & Computational Biology (BIOCOMP), p. 38. The Steering Committee of The World Congress in Computer Science, Computer

  • Sathyapriya R, Duarte JM, Stehr H, Filippis I, Lappe M (2009) Defining an essence of structure determining residue contacts in proteins. PLOS Comput Biol 5(12):1000584

  • Scott WR, Hünenberger PH, Tironi IG, Mark AE, Billeter SR, Fennen J, Torda AE, Huber T, Krüger P, van Gunsteren WF (1999) The GROMOS biomolecular simulation program package. J Phys Chem A 103(19):3596–3607

  • Seemayer S, Gruber M, Söding J (2014) CCMpred-fast and precise prediction of protein residue-residue contacts from correlated mutations. Bioinformatics 30(21):3128–3130

  • Senior AW, Evans R, Jumper J, Kirkpatrick J, Sifre L, Green T, Qin C, Žídek A, Nelson AW, Bridgland A et al (2020) Improved protein structure prediction using potentials from deep learning. Nature 577(7792):706–710

  • Shatabda S, Newton MH, Sattar A (2013) Simplified lattice models for protein structure prediction: how good are they? In: Twenty-Seventh AAAI Conference on Artificial Intelligence

  • Shatabda S, Newton M, Rashid MA, Pham DN, Sattar A (2014) How good are simplified models for protein structure prediction? Adv Bioinf 2014

  • Shen T, Wu J, Lan H, Zheng L, Pei J, Wang S, Liu W, Huang J (2021) When homologous sequences meet structural decoys: accurate contact prediction by tfold in casp14-(tfold for casp14 contact prediction). Proteins 89(12):1901–1910

  • Shonkwiler RW, Mendivil F (2009) Explorations in Monte Carlo Methods. Springer, Switzerland

  • Shrestha A, Mahmood A (2019) Review of deep learning algorithms and architectures. IEEE Access 7:53040–53065

  • Shuchun Y, Xianxiang L, Xue T, Ming P (2022) Protein structure prediction based on particle swarm optimization and tabu search strategy. BMC Bioinf 23(10):1–10

  • Shuid AN, Kempster R, McGuffin LJ (2017) ReFOLD: a server for the refinement of 3D protein models guided by accurate quality estimates. Nucleic Acids Res 45(W1):422–428

  • Shuvo MH, Bhattacharya S, Bhattacharya D (2020) QDeep: distance-based protein model quality estimation by residue-level ensemble error classifications using stacked deep residual neural networks. Bioinformatics 36(Supplement-1):285–291

  • Simons KT, Kooperberg C, Huang E, Baker D (1997) Assembly of protein tertiary structures from fragments with similar local sequences using simulated annealing and bayesian scoring functions. J Mol Biol 268(1):209–225

  • Singh J, Litfin T, Paliwal K, Singh J, Hanumanthappa AK, Zhou Y (2021) SPOT-1D-Single: improving the single-sequence-based prediction of protein secondary structure, backbone angles, solvent accessibility and half-sphere exposures using a large training set and ensembled deep learning. Bioinformatics

  • Singh J, Paliwal K, Litfin T, Singh J, Zhou Y (2022) Reaching alignment-profile-based accuracy in predicting protein secondary and tertiary structural properties without alignment. Sci Rep 12(1):1–9

  • Smialowski P, Martin-Galiano AJ, Cox J, Frishman D (2007) Predicting experimental properties of proteins from sequence by machine learning techniques. Current Protein and Peptide Science 8(2):121–133

  • Somvanshi M, Chavan P, Tambade S, Shinde S (2016) A review of machine learning techniques using decision tree and support vector machine. In: 2016 International Conference on Computing Communication Control and Automation (ICCUBEA), pp. 1–7. IEEE

  • Song S, Gao S, Chen X, Jia D, Qian X, Todo Y (2018) Aimoes: Archive information assisted multi-objective evolutionary strategy for ab initio protein structure prediction. Knowl Based Syst 146:58–72

  • Song S, Ji J, Chen X, Gao S, Tang Z, Todo Y (2018) Adoption of an improved pso to explore a compound multi-objective energy function in protein structure prediction. Appl Soft Comput 72:539–551

  • Steinegger M, Söding J (2017) MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets. Nat Biotechnol 35(11):1026–1028

  • Steinegger M, Meier M, Mirdita M, Vöhringer H, Haunsberger SJ, Söding J (2019) HH-suite3 for fast remote homology detection and deep protein annotation. BMC Bioinf 20(1):1–15

  • Su H, Wang W, Du Z, Peng Z, Gao SH, Cheng MM, Yang J (2021) Improved protein structure prediction using a new multi-scale network and homologous templates. Adv Sci 1:2102592

  • Takahashi T, Chikenji G, Tokita K (2021) Lattice protein design using bayesian learning. Phys Rev E 104(1):014404

  • Talbi E-G (2009) Metaheuristics: from Design to Implementation, vol 74. John Wiley & Sons, Hoboken, New Jersey, U.S

  • Torrisi M, Kaleel M, Pollastri G (2019) Deeper profiles and cascaded recurrent and convolutional neural networks for state-of-the-art protein secondary structure prediction. Sci Rep 9(1):1–12

  • Torrisi M, Pollastri G, Le Q (2020) Deep learning methods in protein structure prediction. Comput Struct Biotechnol J 18:1301

  • Tunyasuvunakool K, Adler J, Wu Z, Green T, Zielinski M, Žídek A, Bridgland A, Cowie A, Meyer C, Laydon A et al (2021) Highly accurate protein structure prediction for the human proteome. Nature 596(7873):590–596

  • Varela D, Santos J (2017) A protein folding model using the face-centered cubic lattice model. In: Proceedings of the Genetic and Evolutionary Computation Conference Companion, pp. 1674–1678

  • Varela D, Santos J (2022) Niching methods integrated with a differential evolution memetic algorithm for protein structure prediction. Swarm Evol Comput, 101062

  • Venske SM, Gonçalves RA, Benelli EM, Delgado MR (2016) ADEMO/D: an adaptive differential evolution for protein structure prediction problem. Exp Syst Appl 56:209–226

  • Vieira A, Ribeiro B (2018) Introduction to deep learning business applications for developers. Springer, Cham

  • Walsh I, Baù D, Martin AJ, Mooney C, Vullo A, Pollastri G (2009) Ab initio and template-based prediction of multi-class distance maps by two-dimensional recursive neural networks. BMC Struct Biol 9(1):1–20

  • Wang G, Dunbrack RL Jr (2003) Pisces: a protein sequence culling server. Bioinformatics 19(12):1589–1591

  • Wang G, Dunbrack RL (2005) PISCES: recent improvements to a pdb sequence culling server. Nucleic Acids Res 33(suppl_2):94–98

  • Wang Z, Xu J (2013) Predicting protein contact map using evolutionary and physical constraints by integer programming. Bioinformatics 29(13):266–273

  • Wang Z, Eickholt J, Cheng J (2010) Multicom: a multi-level combination approach to protein structure prediction and its assessments in casp8. Bioinformatics 26(7):882–888

  • Wang S, Li W, Liu S, Xu J (2016) RaptorX-Property: a web server for protein structure property prediction. Nucleic Acids Res 44(W1):430–435

  • Wang S, Peng J, Ma J, Xu J (2016) Protein secondary structure prediction using deep convolutional neural fields. Sci Rep 6:18962

  • Wang X, Zhou Y, Yan R (2015) AAFreqCoil: a new classifier to distinguish parallel dimeric and trimeric coiled coils. Mol BioSyst 11(7):1794–1801

  • Wang Y, Mao H, Yi Z (2017) Protein secondary structure prediction by using deep learning method. Knowledge-Based Systems 118:115–123

  • Wang S, Sun S, Li Z, Zhang R, Xu J (2017) Accurate de novo prediction of protein contact map by ultra-deep learning model. PLoS Comput Biol 13(1):1005324

  • Wang T, Yang Y, Zhou Y, Gong H (2017) Lrfraglib: an effective algorithm to identify fragments for de novo protein structure prediction. Bioinformatics 33(5):677–684

  • Wang T, Qiao Y, Ding W, Mao W, Zhou Y, Gong H (2019) Improved fragment sampling for ab initio protein structure prediction using deep neural networks. Nat Mach Intell 1(8):347–355

  • Wardah W, Khan MG, Sharma A, Rashid MA (2019) Protein secondary structure prediction using neural networks and deep learning: A review. Comput Biol Chem 81:1–8

  • Won J, Baek M, Monastyrskyy B, Kryshtafovych A, Seok C (2019) Assessment of protein model structure accuracy estimation in CASP13: challenges in the era of deep learning. Proteins 87(12):1351–1360

  • Wood CW, Woolfson DN (2018) CCBuilder2.0: powerful and accessible coiled-coil modeling. Protein Sci 27(1):103–111

  • Wu T, Hou J, Adhikari B, Cheng J (2020) Analysis of several key factors influencing deep learning-based inter-residue contact prediction. Bioinformatics 36(4):1091–1098

  • Wu Q, Peng Z, Anishchenko I, Cong Q, Baker D, Yang J (2020) Protein contact prediction using metagenome sequence data and residual neural networks. Bioinformatics 36(1):41–48

  • Wu T, Guo Z, Hou J, Cheng J (2021) DeepDist: real-value inter-residue distance prediction with deep residual convolutional network. BMC Bioinf 22(1):1–17

  • Wu T, Liu J, Guo Z, Hou J, Cheng J (2021) MULTICOM2 open-source protein structure prediction system powered by deep learning and distance prediction. Sci Rep 11(1):1–9

  • Xia X (2018) Hidden markov models and protein secondary structure prediction. In: Bioinformatics and the Cell, pp. 145–172. Springer, Switzerland

  • Xia Y-H, Peng C-X, Zhou X-G, Zhang G-J (2021) A sequential niche multimodal conformational sampling algorithm for protein structure prediction. Bioinformatics 37(23):4357–4365

  • Xiong D, Zeng J, Gong H (2017) A deep learning framework for improving long-range residue-residue contact prediction using a hierarchical strategy. Bioinformatics 33(17):2675–2683

  • Xu J (2019) Distance-based protein folding powered by deep learning. Proc Natl Acad Sci USA 116(34):16856–16865

  • Xu J, Berger B (2006) Fast and accurate algorithms for protein side-chain packing. J ACM 53(4):533–557

  • Xu J, Wang S (2019) Analysis of distance-based protein structure prediction by deep learning in CASP13. Proteins 87(12):1069–1081

  • Xu D, Zhang Y (2012) Ab initio protein structure assembly using continuous structure fragments and optimized knowledge-based force field. Proteins 80(7):1715–1735

  • Xu G, Ma T, Zang T, Sun W, Wang Q, Ma J (2017) OPUS-DOSP: a distance-and orientation-dependent all-atom potential derived from side-chain packing. J Mol Biol 429(20):3113–3120

  • Xu G, Ma T, Du J, Wang Q, Ma J (2019) OPUS-Rota2: an improved fast and accurate side-chain modeling method. J Chem Theory Comput 15(9):5154–5160

  • Xu J, Mcpartlon M, Li J (2021) Improved protein structure prediction by deep learning irrespective of co-evolution information. Nat Mach Intell 1:1–9

  • Xu G, Wang Q, Ma J (2020) OPUS-Rota 3: Improving protein side-chain modeling by deep neural networks and ensemble methods. J Chem Inf Model 60(12):6691–6697

  • Xu G, Wang Q, Ma J (2020) OPUS-TASS: a protein backbone torsion angles and secondary structure predictor based on ensemble neural networks. Bioinformatics (Oxford, England)

  • Yang Y, Zhou Y (2008) Specific interactions for ab initio folding of protein terminal regions with secondary structures. Proteins 72(2):793–803

  • Yang Y, Gao J, Wang J, Heffernan R, Hanson J, Paliwal K, Zhou Y (2018) Sixty-five years of the long march in protein secondary structure prediction: the final stretch? Briefings in bioinformatics 19(3):482–494

  • Yang J, He B-J, Jang R, Zhang Y, Shen H-B (2015) Accurate disulfide-bonding network predictions improve ab initio structure prediction of cysteine-rich proteins. Bioinformatics 31(23):3773–3781

  • Yang Y, Heffernan R, Paliwal K, Lyons J, Dehzangi A, Sharma A, Wang J, Sattar A, Zhou Y (2017) SPIDER2: a package to predict secondary structure, accessible surface area, and main-chain torsional angles by deep neural networks. In: Prediction of Protein Secondary Structure, pp. 55–63. Springer, Cham

  • Yang H, Wang M, Yu Z, Zhao X-M, Li A (2020) GANcon: protein contact map prediction with deep generative adversarial network. IEEE Access 8:80899–80907

  • Yang J, Anishchenko I, Park H, Peng Z, Ovchinnikov S, Baker D (2020) Improved protein structure prediction using predicted interresidue orientations. Proc Natl Acad Sci USA 117(3):1496–1503

  • Yanofsky C, Horn V, Thorpe D (1964) Protein structure relationships revealed by mutational analysis. Science 146(3651):1593–1594

  • Zaman AB, Shehu A (2019) Balancing multiple objectives in conformation sampling to control decoy diversity in template-free protein structure prediction. BMC Bioinf 20(1):1–17

  • Zaman R, Newton MH, Mataeimoghadam F, Sattar A (2022) Constraint guided neighbour generation for protein structure prediction. IEEE Access

  • Zemla A (2003) LGA: a method for finding 3D similarities in protein structures. Nucleic acids research 31(13):3370–3374

  • Zemla A, Venclovas Č, Moult J, Fidelis K (2001) Processing and evaluation of predictions in CASP4. Wiley Online Library

  • Zhang Y, Skolnick J (2004) Scoring function for automated assessment of protein structure template quality. Proteins 57(4):702–710

  • Zhang J, Zhang Y (2010) A novel side-chain orientation dependent potential derived from random-walk reference state for protein fold selection and structure prediction. PLoS ONE 5(10):15386

  • Zhang C, Zhang Y (2020) Protein 3D structure prediction by d-quark in CASP14. In: Fourteenth Meeting of Critical Assessment of Techniques for Protein Structure Prediction, p. 220

  • Zhang G-J, Ma L-F, Wang X-Q, Zhou X-G (2018) Secondary structure and contact guided differential evolution for protein structure prediction. IEEE/ACM Trans Comput Biol Bioinf 17(3):1068–1081

  • Zhang L, Ma H, Qian W, Li H (2020) Protein structure optimization using improved simulated annealing algorithm on a three-dimensional ab off-lattice model. Comput Biol Chem 85:107237

  • Zhang H, Bei Z, Xi W, Hao M, Ju Z, Saravanan KM, Zhang H, Guo N, Wei Y (2021) Evaluation of residue-residue contact prediction methods: From retrospective to prospective. PLOS Comput Biol 17(5):1009027

  • Zheng W, Li Y, Zhang C, Pearce R, Mortuza S, Zhang Y (2019) Deep-learning contact-map guided protein structure prediction in CASP13. Proteins 87(12):1149–1164

  • Zheng W, Li Y, Zhang C, Zhou X, Pearce R, Bell EW, Huang X, Zhang Y (2021) Protein structure prediction using deep learning distance and hydrogen-bonding restraints in casp14. Proteins 89(12):1734–1751

  • Zhong W, Gu F (2020) Predicting local protein 3D structures using clustering deep recurrent neural network. IEEE/ACM Trans Comput Biol Bioinf

  • Zhou H, Skolnick J (2011) GOAP: a generalized orientation-dependent, all-atom statistical potential for protein structure prediction. Biophys J 101(8):2043–2052

  • Zhou X-G, Zhang G-J, Hao X-H, Yu L (2016) A novel differential evolution algorithm using local abstract convex underestimate strategy for global optimization. Comput Oper Res 75:132–149

  • Zou D, He Z, He J, Xia Y (2011) Supersecondary structure prediction using Chou’s pseudo amino acid composition. J Comput Chem 32(2):271–278

Acknowledgements

This work is partly supported by the Australian Research Council Discovery Grant DP180102727 and the AHEAD OPERATIONS Project of Sri Lanka.

Author information

Contributions

M.M.M.M. and M.A.H.N. contributed equally to all parts of the work and are joint first authors. A.S. took part in discussions and reviewed the manuscript.

Corresponding author

Correspondence to M. M. Mohamed Mufassirin.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Mufassirin, M.M.M., Newton, M.A.H. & Sattar, A. Artificial intelligence for template-free protein structure prediction: a comprehensive review. Artif Intell Rev 56, 7665–7732 (2023). https://doi.org/10.1007/s10462-022-10350-x

  • DOI: https://doi.org/10.1007/s10462-022-10350-x
