A Composite Approach to Protein Tertiary Structure Prediction: Hidden Markov Model Based on Lattice

Abstract

The biological function of a protein depends mainly on its tertiary structure, which is determined by its amino acid sequence through the process of protein folding. Predicting protein structure from the amino acid sequence is one of the most prominent problems in computational biology. This work combines two basic methodologies for protein structure prediction: the ab initio method (a 3-D space lattice) and the fold recognition method (a hidden Markov model). The primary structure of proteins and the 3-D coordinates of their amino acid residues are placed together in one hidden Markov model, which learns the path of the amino acid residues in 3-D space from the first atom to the last atom of each protein of each fold. Each model therefore holds the 3-D path information of the amino acids of one fold. The proposed method is compared with fold recognition methods that are also based on hidden Markov models but use only the amino acid sequence or the secondary structure. To validate the proposed method, the models are assessed on three datasets. Results show that the proposed models outperform 7-HMM and 3-HMM on the same dataset. The face-centered cubic lattice, which is the most compact 3-D lattice, achieved the highest classification accuracy in all experiments when compared with the most effective version of the optimized 3-HMM and with SAM 3.5, the latest version of SAM. The results indicate that the 3-D coordinates of the atoms of amino acids play an important role in prediction, and that in fold classification they carry considerable hidden information compared with the secondary structure of proteins.
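
As a rough illustration of how a 3-D path can be expressed on a face-centered cubic lattice, the sketch below maps each step between consecutive residue coordinates to the closest of the 12 nearest-neighbour directions of that lattice. It is not the authors' implementation: the function name `encode_path`, the choice of one coordinate per residue, and the nearest-direction rule are all assumptions made for this example. The resulting symbol sequence is the kind of discrete observation that a hidden Markov model could, in principle, be trained on alongside the amino acid sequence.

```python
import itertools

# The 12 nearest-neighbour directions of a face-centered cubic lattice:
# every distinct permutation of (+-1, +-1, 0).
FCC_DIRECTIONS = sorted(
    {
        p
        for s1 in (1, -1)
        for s2 in (1, -1)
        for p in itertools.permutations((s1, s2, 0))
    }
)


def encode_path(coords):
    """Map a list of (x, y, z) points to indices into FCC_DIRECTIONS.

    Hypothetical helper: each step between consecutive points is assigned
    the lattice direction with the largest dot product. All 12 directions
    have the same length, so this picks the smallest angle to the step.
    """
    symbols = []
    for (x0, y0, z0), (x1, y1, z1) in zip(coords, coords[1:]):
        step = (x1 - x0, y1 - y0, z1 - z0)
        best = max(
            range(len(FCC_DIRECTIONS)),
            key=lambda i: sum(a * b for a, b in zip(step, FCC_DIRECTIONS[i])),
        )
        symbols.append(best)
    return symbols


# Example with made-up coordinates (not taken from any real protein).
toy_coords = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (1.0, 2.0, 1.0)]
toy_symbols = encode_path(toy_coords)
```

A path of N points yields N - 1 direction symbols, one per step, so the encoding is independent of the absolute position of the structure in space.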

Author information

Corresponding author

Correspondence to Alimohammad Latif.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical Approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Cite this article

Peyravi, F., Latif, A. & Moshtaghioun, S.M. A Composite Approach to Protein Tertiary Structure Prediction: Hidden Markov Model Based on Lattice. Bull Math Biol 81, 899–918 (2019). https://doi.org/10.1007/s11538-018-00542-4

Keywords

  • Protein structure prediction
  • Tertiary structure
  • Fold recognition
  • Hidden Markov model
  • Bravais lattice