iGAPK: Improved GAPK Algorithm for Regulatory DNA Motif Discovery

  • Dianhui Wang
  • Xi Li
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6444)

Abstract

Computational DNA motif discovery is one of the major research areas in bioinformatics and helps in understanding the mechanisms of gene regulation. Recently, we developed a GA-based motif discovery algorithm, named GAPK, which makes use of some identified transcription factor binding sites extracted from orthologs in algorithm development. Within the GAPK framework, technical improvements to background filtering, evolutionary computation, or model refinement can contribute to better performance. This paper aims to improve the GAPK framework by introducing a new fitness function, termed the relative model mismatch score (RMMS), which characterizes the conservation and rareness properties of DNA motifs simultaneously. Other technical contributions include a rule-based system for filtering background data and a “most one-in-out” (MOIO) strategy for motif model refinement. Comparative studies are carried out on eight benchmark datasets against the original GAPK and two GA-based motif discovery algorithms, GAME and GALF-P. The results show that the improved GAPK method outperforms the others on the testing datasets.

References

  1. Hu, J., Li, B., Kihara, D.: Limitations and potentials of current motif discovery algorithms. Nucleic Acids Res. 33, 4899–4913 (2005)
  2. Neuwald, A.F., Liu, J.S., Lawrence, C.E.: Gibbs motif sampling: detection of bacterial outer membrane protein repeats. Protein Science 4, 1618–1632 (1995)
  3. Bailey, T.L., Elkan, C.: Unsupervised learning of multiple motifs in biopolymers using EM. Machine Learning 21, 51–80 (1995)
  4. Tompa, M., Li, N., Bailey, T.L., et al.: Assessing computational tools for the discovery of transcription factor binding sites. Nature Biotechnology 23, 137–144 (2005)
  5. Bailey, T.L., Elkan, C.P.: The value of prior knowledge in discovering motifs with MEME. Intell. Sys. Mol. Biol. 3, 21–29 (1995)
  6. Li, L.P., Liang, Y., Bass, R.L.L.: GAPWM: a genetic algorithm method for optimizing a position weight matrix. Bioinformatics 23, 1188–1194 (2007)
  7. Wang, T., Stormo, G.D.: Combining phylogenetic data with co-regulated genes to identify regulatory motifs. Bioinformatics 19, 2369–2380 (2003)
  8. Narang, V., Mittal, A., Sung, W.-K.: Localized motif discovery in gene regulatory sequences. Bioinformatics 26, 1152–1159 (2010)
  9. Wei, Z., Jensen, S.T.: GAME: detecting cis-regulatory elements using a genetic algorithm. Bioinformatics 22, 1577–1584 (2006)
  10. Chan, T.-M., Leung, K.-S., Lee, K.-H.: TFBS identification based on genetic algorithm with combined representations and adaptive post-processing. Bioinformatics 24, 341–349 (2008)
  11. Wang, D.H., Li, X.: GAPK: Genetic algorithms with prior knowledge for motif discovery in DNA sequences. In: CEC 2009: IEEE Congress on Evolutionary Computation 2009, Trondheim, Norway, pp. 277–284 (2009)
  12. Wang, D.H., Lee, N.K.: MISCORE: mismatch-based matrix similarity scores for DNA motif detection. In: Köppen, M., Kasabov, N., Coghill, G. (eds.) ICONIP 2008. LNCS, vol. 5506, pp. 478–485. Springer, Heidelberg (2009)
  13. Wang, D.H.: Characterization of regulatory motif models. Technical Report, La Trobe University, Australia (October 2009)
  14. Stormo, G.D., Fields, D.S.: Specificity, free energy and information content in protein-DNA interactions. Trends in Biochemical Sciences 23, 109–113 (1998)
  15. Thijs, G., Lescot, M., Marchal, K., Rombauts, S., De Moor, B., Rouzé, P., Moreau, Y.: A higher-order background model improves the detection of promoter regulatory elements by Gibbs sampling. Bioinformatics 17, 1113–1122 (2001)

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Dianhui Wang (1)
  • Xi Li (1, 2)
  1. Department of Computer Science and Computer Engineering, La Trobe University, Melbourne, Australia
  2. Department of Primary Industries, Bioscience Research Division, Victorian AgriBiosciences Centre, Bundoora, Australia
