A fast algorithm for the construction of universal footprinting templates in DNA
We introduce and give a complete description of a new graph for use in DNA sequencing questions. Its advantage over the classical de Bruijn graph is that it fully accounts for the double-stranded nature of DNA, rather than dealing with single strands. Technically, our graph may be thought of as the quotient of the de Bruijn graph under the natural involution that sends a DNA strand to its complementary strand. However, this involution has fixed points, which complicate the structure of the quotient graph; we have therefore modified the quotient construction herein.
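To make the involution and its fixed points concrete, here is a minimal Python sketch. It assumes that "complementary strand" means the reverse complement (the opposite strand read in the same chemical direction) and that strands are written over the alphabet A, C, G, T. The names `reverse_complement` and `strand_classes` are hypothetical and are not taken from the paper; the sketch only groups n-mers into orbits and does not build the modified graph itself.

```python
from itertools import product

# Watson-Crick pairing: A with T, C with G.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(strand: str) -> str:
    """Return the opposite strand, read in the same 5'-to-3' direction."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def strand_classes(n: int):
    """Group all n-mers into orbits of the involution.

    Returns a pair (classes, fixed): `classes` holds one representative
    per orbit (the lexicographically smaller of an n-mer and its reverse
    complement), and `fixed` lists the n-mers equal to their own reverse
    complement, i.e. the fixed points of the involution.
    """
    classes = set()
    fixed = []
    for letters in product("ACGT", repeat=n):
        strand = "".join(letters)
        partner = reverse_complement(strand)
        classes.add(min(strand, partner))
        if strand == partner:
            fixed.append(strand)
    return classes, fixed


if __name__ == "__main__":
    for n in range(1, 5):
        classes, fixed = strand_classes(n)
        print(n, len(classes), len(fixed))
```

Under these assumptions, fixed points occur only for even n: the middle base of an odd-length strand would have to be its own complement, which no base is. This is one way to see why the quotient behaves differently in the even and odd cases.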
As an application and motivating example, we give an efficient algorithm for constructing universal footprinting templates for n-mers. This problem may be formulated as the task of finding a shortest possible segment of DNA that contains every possible sequence of base pairs of some fixed length n. Previous work by Kwan et al. attacked this problem numerically and generated minimal-length universal footprinting templates for n=2, 3, 5, 7, together with unsubstantiated candidates for the case n=4. We show that their candidates for n=4 are indeed minimal-length universal footprinting templates.
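The definition of a universal footprinting template can be checked mechanically. The sketch below is a brute-force verifier, not the construction algorithm of the paper. It assumes that an n-mer counts as present if it occurs on either strand, so each length-n window of the template accounts for one n-mer together with its reverse complement. The function names (`covered_classes`, `is_universal`, `length_lower_bound`) are hypothetical, and the lower bound is only a simple counting argument that is not claimed to be attained for every n.

```python
from itertools import product

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(strand: str) -> str:
    """Return the opposite strand, read in the same 5'-to-3' direction."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def canonical(strand: str) -> str:
    """Pick one representative for an n-mer and its reverse complement."""
    return min(strand, reverse_complement(strand))


def all_classes(n: int) -> set:
    """Every n-mer over A, C, G, T, identified with its reverse complement."""
    return {canonical("".join(p)) for p in product("ACGT", repeat=n)}


def covered_classes(template: str, n: int) -> set:
    """Classes of n-mers seen in some length-n window of the template."""
    return {
        canonical(template[i:i + n])
        for i in range(len(template) - n + 1)
    }


def is_universal(template: str, n: int) -> bool:
    """True if every n-mer occurs on one of the two strands."""
    return covered_classes(template, n) == all_classes(n)


def length_lower_bound(n: int) -> int:
    """Counting bound: one new class per window at best, plus n - 1."""
    return len(all_classes(n)) + n - 1


if __name__ == "__main__":
    example = "AACCGATAGCA"  # hand-built illustration for n = 2
    print(is_universal(example, 2), len(example), length_lower_bound(2))
```

The example string was built by hand for illustration and is not taken from Kwan et al. A verifier like this is useful for testing candidate templates, but showing that a candidate is as short as possible requires a separate argument, which is the purpose of the graph described above.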
Keywords or phrases: DNA sequencing, Universal footprinting template, de Bruijn graph, Eulerian graphs
- 1. Bollobas, B.: Graph Theory: an introductory course. Graduate Texts Math. 63, Springer-Verlag, New York, 1979
- 3. Galas, D.J., Schmitz, A.: DNAase footprinting – Simple method for detection of protein-DNA binding specificity. Nucleic Acids Res. 5, 3157–3170 (1978)
- 4. Guille, M.J., Kneale, G.: Methods for the analysis of DNA-protein interactions. Molecular Biotechnology 8, 35–52 (1997)
- 5. Kwan, A.H.Y., Czolij, R., Mackay, J.P., Crossley, M.: Pentaprobe: a comprehensive sequence for the one-step detection of DNA-binding activities. Nucleic Acids Res. 31, e124 (2003)