Generalized Schemata Theorem Incorporating Twin Removal for Protein Structure Prediction

  • Md Tamjidul Hoque
  • Madhu Chetty
  • Laurence S. Dooley
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4774)


The schemata theorem, on which the working of the Genetic Algorithm (GA) is based, has in its current form a fallacious selection procedure and an incomplete crossover operation. This paper generalizes the schemata theorem by correcting these limitations. The analysis shows that similarity growth within a GA population is inherent, owing to the algorithm's stochastic nature. While the stochastic property helps the GA converge, the similarity growth is responsible for stalling, and it becomes more prevalent for hard optimization problems such as protein structure prediction (PSP). Although it is essential that the GA explore the vast and complicated search landscape, in practice it often gets stuck in local minima. This paper shows that removing members of the population that have a certain percentage of similarity keeps the GA performing better: it keeps the convergence property intact while avoiding stalling.
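The twin-removal idea can be illustrated with a minimal Python sketch. The abstract does not specify the similarity measure, the threshold, or how removed members are replaced, so everything here is an assumption made for illustration: chromosomes are equal-length strings, similarity is the fraction of matching positions, the population is assumed to be ordered best-first so that the fitter of two twins survives, and the names `similarity`, `remove_twins` and `threshold` are hypothetical.

```python
def similarity(a, b):
    """Fraction of positions at which two equal-length chromosomes agree.

    Assumption: a simple position-wise match ratio; the paper may use
    a different measure.
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("chromosomes must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def remove_twins(population, threshold=0.8):
    """Return the population with 'twins' removed.

    A member counts as a twin if its similarity to an already kept
    member is at least `threshold` (hypothetical parameter).
    Assumption: `population` is ordered best-first, so the earlier
    (fitter) member of each twin pair is the one that is kept.
    """
    kept = []
    for chromosome in population:
        if all(similarity(chromosome, other) < threshold for other in kept):
            kept.append(chromosome)
    return kept


if __name__ == "__main__":
    population = ["11110000", "11110001", "00001111", "11110000"]
    # Threshold 0.8 drops the near-duplicate and the exact duplicate.
    print(remove_twins(population, threshold=0.8))
    # Threshold 1.0 drops exact duplicates only.
    print(remove_twins(population, threshold=1.0))
```

In a full GA the removed slots would have to be refilled, for example with fresh random chromosomes, to keep the population size constant; that step is left out here because the abstract does not describe it.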


Schemata theorem, twin removal, protein structure prediction, similarity in population, hard optimization problem



Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Md Tamjidul Hoque (1)
  • Madhu Chetty (1)
  • Laurence S. Dooley (1)
  1. Gippsland School of Information Technology (GSIT), Monash University, Churchill VIC 3842, Australia
