Abstract
The schemata theorem, on which the working of the Genetic Algorithm (GA) is based, has in its current form a fallacious selection procedure and an incomplete crossover operation. This paper generalizes the schemata theorem by correcting these limitations. The analysis shows that similarity growth within a GA population is inherent, owing to the algorithm's stochastic nature. While the stochastic property helps the GA converge, the similarity growth is responsible for stalling and becomes more prevalent for hard optimization problems such as protein structure prediction (PSP). Although it is essential that the GA explore the vast and complicated search landscape, in practice it often becomes stuck in local minima. This paper shows that removing population members whose similarity reaches a certain percentage improves GA performance: it keeps the convergence property intact while avoiding stalling.
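The idea of removing highly similar population members can be illustrated with a minimal sketch. It is not the authors' implementation: the similarity measure (fraction of matching positions between equal-length chromosomes), the function names `similarity` and `remove_twins`, the 0.8 threshold, and the toy move-string chromosomes are all assumptions made for illustration.

```python
from typing import List, Sequence


def similarity(a: Sequence, b: Sequence) -> float:
    """Fraction of positions at which two equal-length chromosomes agree.

    Assumption: a position-wise match ratio stands in for whatever
    similarity measure the paper actually uses.
    """
    if len(a) != len(b):
        raise ValueError("chromosomes must have equal length")
    if len(a) == 0:
        return 1.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def remove_twins(population: List[Sequence], threshold: float = 0.8) -> List[Sequence]:
    """Keep a chromosome only if its similarity to every chromosome
    already kept is below `threshold` (a hypothetical default).

    Earlier entries win, so pass the population sorted best-first if the
    fitter member of each similar pair should survive.
    """
    kept: List[Sequence] = []
    for candidate in population:
        if all(similarity(candidate, other) < threshold for other in kept):
            kept.append(candidate)
    return kept


# Toy population of move strings (hypothetical encoding), assumed best-first.
population = ["LLRRFF", "LLRRFF", "LLRRFL", "RFLFRL", "FFFFFF"]
survivors = remove_twins(population, threshold=0.8)
print(survivors)
```

In this toy run the exact duplicate and the near-duplicate `"LLRRFL"` (5 of 6 positions shared with `"LLRRFF"`) are dropped, while the two dissimilar chromosomes are kept. A full GA would also need a policy for refilling the population after removal, which this sketch leaves out.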
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hoque, M.T., Chetty, M., Dooley, L.S. (2007). Generalized Schemata Theorem Incorporating Twin Removal for Protein Structure Prediction. In: Rajapakse, J.C., Schmidt, B., Volkert, G. (eds) Pattern Recognition in Bioinformatics. PRIB 2007. Lecture Notes in Computer Science, vol 4774. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75286-8_9
DOI: https://doi.org/10.1007/978-3-540-75286-8_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-75285-1
Online ISBN: 978-3-540-75286-8
eBook Packages: Computer Science (R0)