Computing Exact p-Value for Structured Motif
Extracting motifs from a set of DNA sequences is an important problem in computational biology. Occurrence probability is a commonly used statistic for evaluating the statistical significance of a motif. A central problem is how to calculate the occurrence probability of a motif under a random model of DNA sequences both efficiently and accurately. In this paper, we are interested in a particular motif model that is relevant to the transcription process. This motif, called a structured motif, is composed of two motif words over the single-nucleotide alphabet, with fixed spacers between them. We present an efficient algorithm that calculates the exact occurrence probability of a structured motif in a given sequence. It is the first non-trivial algorithm for calculating the exact p-value of this kind of motif.
Keywords: Pattern and motif discovery, exact p-value, structured motif, dynamic programming
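To make the quantity in the abstract concrete, the following is a minimal illustrative sketch, not the algorithm of the paper. It assumes an i.i.d. background model, a structured motif made of two exact words `w1` and `w2` separated by a single fixed gap of `gap` arbitrary nucleotides, and a hypothetical function name `occurrence_probability`. It computes the exact probability that the motif occurs at least once in a random sequence of length `n`, using a simple dynamic program whose state is the last `m - 1` characters seen (where `m` is the total span of the motif). The state space grows as 4^(m-1), so this is only practical for short motifs.

```python
from itertools import product


def occurrence_probability(w1, gap, w2, n, probs):
    """Exact P(motif occurs at least once) in an i.i.d. sequence of length n.

    Assumptions (illustrative, not taken from the paper):
      - w1 and w2 are non-empty exact words; `gap` is one fixed spacer length.
      - probs maps each nucleotide to its background probability.
    """
    m = len(w1) + gap + len(w2)  # total span of one motif occurrence
    if n < m:
        return 0.0
    alphabet = sorted(probs)
    tail_start = len(w1) + gap  # where w2 begins inside a window of length m

    # states[s]: probability that the last m-1 characters are s
    # and no occurrence has been seen so far.
    states = {}
    for chars in product(alphabet, repeat=m - 1):
        p = 1.0
        for c in chars:
            p *= probs[c]
        states["".join(chars)] = p

    hit = 0.0  # probability mass absorbed by a first occurrence
    for _ in range(n - m + 1):
        nxt = {}
        for s, p in states.items():
            for c in alphabet:
                window = s + c
                q = p * probs[c]
                if window.startswith(w1) and window[tail_start:] == w2:
                    hit += q  # first occurrence ends at this position
                else:
                    key = window[1:]
                    nxt[key] = nxt.get(key, 0.0) + q
        states = nxt
    return hit


uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
p = occurrence_probability("A", 1, "C", 4, uniform)
```

With a uniform background, the motif `A`, one spacer, `C` can start at two positions in a sequence of length 4, and the two events involve disjoint positions, so the result equals 1 - (15/16)^2. A brute-force enumeration over all 4^n sequences is a convenient way to check such a sketch for small `n`.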