ICONIP 2010: Neural Information Processing. Models and Applications, pp. 217-225
iGAPK: Improved GAPK Algorithm for Regulatory DNA Motif Discovery
Abstract
Computational DNA motif discovery is one of the major research areas in bioinformatics and helps in understanding the mechanisms of gene regulation. We recently developed a genetic algorithm (GA)-based motif discovery algorithm, named GAPK, which makes use of some identified transcription factor binding sites extracted from orthologs in developing the algorithm. Within the GAPK framework, technical improvements to background filtering, evolutionary computation, or model refinement can each contribute to better performance. This paper aims to improve the GAPK framework by introducing a new fitness function, termed the relative model mismatch score (RMMS), which characterizes the conservation and rareness properties of DNA motifs simultaneously. Other technical contributions include a rule-based system for filtering background data and a “most one-in-out” (MOIO) strategy for motif model refinement. Comparative studies are carried out on eight benchmark datasets against the original GAPK and two other GA-based motif discovery algorithms, GAME and GALF-P. The results show that the improved GAPK method outperforms the others on the testing datasets.