Phylogenetic Footprinting to Find Functional DNA Elements

  • Austen R.D. Ganley
  • Takehiko Kobayashi
Part of the Methods in Molecular Biology™ book series (MIMB, volume 395)


Phylogenetic footprinting is a powerful technique for finding functional elements from sequence data. Functional elements are thought to have greater sequence constraint than nonfunctional elements and thus undergo a slower rate of sequence change through time. Phylogenetic footprinting uses comparisons of homologous sequences from closely related organisms to identify “phylogenetic footprints,” regions with slower rates of sequence change than background. Because this does not require prior characterization of the sequence in question, it can be used in a wide range of applications. In particular, it is useful for identifying functional elements in noncoding DNA, which are traditionally difficult to detect. Here, we describe in detail how to perform a simple yet powerful phylogenetic footprinting analysis. As an example, we use ribosomal DNA repeat sequences from various Saccharomyces yeasts to find functional noncoding DNA elements in the intergenic spacer. We also explain critical considerations in performing phylogenetic footprinting analyses, including the number of species, the species range, and some of the available software. Our methods are broadly applicable and should appeal to molecular biologists with little experience in bioinformatics.
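The core idea, scanning aligned homologous sequences for stretches that change more slowly than background, can be illustrated with a minimal Python sketch. This is not the chapter's protocol. It assumes the sequences have already been aligned to equal length by a separate tool, scores each column by the fraction of sequences sharing the most common base, and averages that score over a sliding window. The function names, the window size, and the threshold are all hypothetical choices made for illustration.

```python
from typing import List, Tuple


def column_identity(column: str) -> float:
    """Fraction of sequences sharing the most common base in one column.

    Gaps ("-") never count as matches, so gapped columns score lower.
    """
    bases = [b for b in column.upper() if b != "-"]
    if not bases:
        return 0.0
    most_common = max(bases.count(b) for b in set(bases))
    return most_common / len(column)


def windowed_conservation(alignment: List[str], window: int = 5) -> List[float]:
    """Mean per-column identity in each sliding window along the alignment."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    if window < 1 or window > length:
        raise ValueError("window must be between 1 and the alignment length")
    per_column = [
        column_identity("".join(seq[i] for seq in alignment))
        for i in range(length)
    ]
    return [
        sum(per_column[i:i + window]) / window
        for i in range(length - window + 1)
    ]


def footprints(scores: List[float], threshold: float) -> List[Tuple[int, int]]:
    """Runs of consecutive windows scoring at or above the threshold.

    Each run is returned as (start, end) window indices, end exclusive.
    """
    regions = []
    start = None
    for i, score in enumerate(scores):
        if score >= threshold and start is None:
            start = i
        elif score < threshold and start is not None:
            regions.append((start, i))
            start = None
    if start is not None:
        regions.append((start, len(scores)))
    return regions
```

In practice the threshold would be chosen relative to the background conservation of the region being studied rather than fixed in advance, and the number and range of species compared strongly affect how clearly footprints stand out.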


Phylogenetic footprinting; noncoding; functional DNA element; Saccharomyces; ribosomal DNA



This work was supported by grants 13141205, 17080010, and 17370065 from the Ministry of Education, Science and Culture, Japan, and by a Human Frontier Science Program grant.



Copyright information

© Humana Press Inc. 2007

Authors and Affiliations

  • Austen R.D. Ganley (1)
  • Takehiko Kobayashi (2)
  1. Division of Cytogenetics, National Institute of Genetics and Department of Genetics, Sokendai, USA
  2. Division of Cytogenetics, National Institute of Genetics and Department of Genetics, Sokendai, USA
