DFS Based Partial Pathways in GA for Protein Structure Prediction

  • Md Tamjidul Hoque
  • Madhu Chetty
  • Andrew Lewis
  • Abdul Sattar
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5265)


Nondeterministic conformational search techniques, such as Genetic Algorithms (GAs), are promising for solving the protein structure prediction (PSP) problem. The crossover operator of a GA can underpin the formation of promising conformations by exchanging and sharing potential sub-conformations. However, because an optimum PSP conformation is usually compact, crossover can produce many invalid conformations (those that are not self-avoiding walks). While a conformation converging through crossover suffers from limited pathways, combining crossover with depth-first search (DFS) can partially reveal potential pathways. Compared with generation based on random moves only, DFS generates random conformations faster, and its advantage grows with the length of the protein sequence. Random conformations are frequently used in many GA variations, both for initialization and for maintaining diversity.
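To illustrate how a depth-first search can produce a random, valid (self-avoiding) conformation, here is a minimal sketch. It is not the authors' implementation: the 2D square lattice, the function names `dfs_conformation` and `is_self_avoiding`, and the move set are all assumptions chosen for brevity. The search tries the moves at each residue in random order and backtracks when every neighbouring site is occupied, instead of discarding the whole walk as a random-move-only generator would.

```python
import random

# Assumption: a 2D square lattice; the abstract does not fix the lattice here.
MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def dfs_conformation(n, rng=None):
    """Return a random self-avoiding walk of n lattice sites, or None.

    Hypothetical helper: an iterative depth-first search with backtracking.
    """
    rng = rng or random.Random()
    path = [(0, 0)]
    occupied = {(0, 0)}
    # One shuffled list of not-yet-tried moves per residue already placed.
    untried = [rng.sample(MOVES, len(MOVES))]
    while path and len(path) < n:
        if not untried[-1]:
            # Dead end: every move from the last residue failed, so backtrack.
            untried.pop()
            occupied.remove(path.pop())
            continue
        dx, dy = untried[-1].pop()
        x, y = path[-1]
        nxt = (x + dx, y + dy)
        if nxt in occupied:
            continue  # would break self-avoidance; try the next move
        path.append(nxt)
        occupied.add(nxt)
        untried.append(rng.sample(MOVES, len(MOVES)))
    return path if len(path) == n else None


def is_self_avoiding(path):
    """Check that no site repeats and consecutive sites are lattice neighbours."""
    unit_steps = all(
        abs(ax - bx) + abs(ay - by) == 1
        for (ax, ay), (bx, by) in zip(path, path[1:])
    )
    return unit_steps and len(set(path)) == len(path)
```

For example, `dfs_conformation(30, random.Random(0))` returns a list of 30 lattice coordinates that `is_self_avoiding` accepts. Such a walk could seed an initial GA population or replace individuals when diversity drops.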


Depth-first search, protein structure prediction, genetic algorithm, lattice model



Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Md Tamjidul Hoque (1)
  • Madhu Chetty (2)
  • Andrew Lewis (1)
  • Abdul Sattar (1)
  1. Institute for Integrated and Intelligent Systems (IIIS), Griffith University, Nathan, Australia
  2. Gippsland School of Information Technology (GSIT), Monash University, Churchill, Australia
