Computing Exact p-Value for Structured Motif

  • Jing Zhang
  • Xi Chen
  • Ming Li
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4580)

Abstract

Extracting motifs from a set of DNA sequences is an important problem in computational biology. Occurrence probability is a commonly used statistic for evaluating the statistical significance of a motif. A main problem is how to calculate the occurrence probability of a motif under a random model of DNA sequences efficiently and accurately. In this paper, we are interested in a particular motif model that is useful in studying the transcription process. This motif, called a structured motif, is composed of two motif words over the single-nucleotide alphabet, with fixed spacers between them. We present an efficient algorithm to calculate the exact occurrence probability of a structured motif on a given sequence. It is the first non-trivial algorithm to calculate the exact p-value for this kind of motif.
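To make the quantity in question concrete, the following is a minimal sketch of a naive exact computation. It is not the algorithm of the paper, which is not reproduced on this page. It assumes an i.i.d. model over the letters A, C, G, T and a single spacer of fixed length `d`; the function name `structured_motif_pvalue` and all of its parameters are hypothetical and chosen only for illustration. The state space grows as 4^(L-1) for total motif length L, so the sketch is practical only for short motifs.

```python
from itertools import product


def structured_motif_pvalue(w1, w2, d, n, probs=None):
    """Probability that w1 + (any d letters) + w2 occurs at least once
    in a random sequence of length n.

    Assumptions (not from the paper): letters are i.i.d. with the given
    probabilities, and the spacer has one fixed length d.
    """
    alphabet = "ACGT"
    if probs is None:
        probs = {c: 0.25 for c in alphabet}  # uniform model by default

    total_len = len(w1) + d + len(w2)
    if n < total_len:
        return 0.0  # the motif cannot fit in the sequence

    def matches(window):
        # window has length total_len; the d spacer positions are free
        return window[:len(w1)] == w1 and window[len(w1) + d:] == w2

    # State: the last total_len - 1 letters of a prefix with no occurrence yet.
    states = {}
    for letters in product(alphabet, repeat=total_len - 1):
        p = 1.0
        for c in letters:
            p *= probs[c]
        states["".join(letters)] = p

    # Extend the sequence one letter at a time, discarding any probability
    # mass that completes an occurrence of the motif.
    for _ in range(n - (total_len - 1)):
        next_states = {}
        for suffix, p in states.items():
            for c in alphabet:
                window = suffix + c
                if matches(window):
                    continue  # first occurrence ends here
                key = window[1:]
                next_states[key] = next_states.get(key, 0.0) + p * probs[c]
        states = next_states

    # Whatever mass remains corresponds to sequences with no occurrence.
    return 1.0 - sum(states.values())
```

As a sanity check, for `w1 = "A"`, `w2 = "T"`, `d = 1` and `n = 3` there is exactly one window, so the probability should be 1/4 · 1/4 = 1/16 under the uniform model, which is what `structured_motif_pvalue("A", "T", 1, 3)` computes.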

Keywords

Pattern and motif discovery; exact p-value; structured motif; dynamic programming

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Jing Zhang (1)
  • Xi Chen (1)
  • Ming Li (2)
  1. Computer Science, Tsinghua University, Beijing 100084, China
  2. School of Computer Science, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada
