A Nonalignment Approach for Genome-Scale Discovery of DNA and mRNA Regulatory Elements Using Network-Level Conservation
  • Olivier Elemento
  • Saeed Tavazoie
Part of the Methods in Molecular Biology™ book series (MIMB, volume 395)


Here, we describe the usage of Fastcompare, a simple and efficient comparative approach for finding short noncoding DNA (e.g., transcription factor binding sites) and mRNA (e.g., microRNA target sites) sequences that are globally conserved between two genomes. Fastcompare is based on the network-level conservation principle, according to which the connectivity of transcriptional regulatory networks should be largely conserved between two closely related genomes. We describe here the procedure for applying Fastcompare to large genomes (with an emphasis on metazoan genomes), including scoring of exhaustive motif lists, determination of conservation threshold using sequence randomizations, and discovery of interactions between regulatory elements.
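
As a rough illustration of the network-level conservation principle, the sketch below scores a single k-mer by asking whether the ortholog pairs that carry it in one genome also tend to carry it in the other. The hypergeometric tail used as the score, the function names (`genes_with_kmer`, `hypergeom_tail`, `conservation_score`), and the toy sequences are assumptions made for illustration only; they are not Fastcompare's interface and may not match its exact scoring.

```python
from math import comb, log10

def revcomp(seq):
    """Reverse complement of a DNA string over the alphabet ACGT."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def genes_with_kmer(upstream, kmer):
    """Ids of ortholog pairs whose sequence contains kmer on either strand."""
    rc = revcomp(kmer)
    return {gene for gene, seq in upstream.items() if kmer in seq or rc in seq}

def hypergeom_tail(overlap, n1, n2, total):
    """P(X >= overlap) when n2 genes are drawn from total, n1 of them marked."""
    upper = min(n1, n2)
    hits = sum(comb(n1, i) * comb(total - n1, n2 - i)
               for i in range(overlap, upper + 1))
    return hits / comb(total, n2)

def conservation_score(up_a, up_b, kmer):
    """-log10 of the chance overlap probability (an assumed scoring choice)."""
    set_a = genes_with_kmer(up_a, kmer)
    set_b = genes_with_kmer(up_b, kmer)
    p = hypergeom_tail(len(set_a & set_b), len(set_a), len(set_b), len(up_a))
    return -log10(p)

# Hypothetical toy data: both dicts are keyed by the same ortholog-pair id.
up_a = {"g1": "AATGACTCTT", "g2": "CCGAGTCAGG", "g3": "ACACACACAC",
        "g4": "GGGGCCCCAA", "g5": "TTTTAAAACC", "g6": "ATATATATAT"}
up_b = {"g1": "GGTGACTCAA", "g2": "TTGAGTCATT", "g3": "CACACACACA",
        "g4": "CCCCGGGGTT", "g5": "AAAATTTTGG", "g6": "TATATATATA"}

score = conservation_score(up_a, up_b, "TGACTC")
print(round(score, 3))  # 1.176, i.e. -log10(1/15)
```

In the spirit of the procedure outlined above, a conservation threshold could then be chosen by re-scoring the same k-mers on randomized sequences and comparing the two score distributions.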

Key Words

Transcription factor binding sites; microRNA target sites; computational method; network-level conservation; comparative genomics; metazoan genomes.



The authors are grateful to Chang S. Chan and Kellen Olszewski for critical reading of preliminary versions of this document, and to members of the Tavazoie group for insightful discussions. Saeed Tavazoie is supported by the National Institutes of Health, the National Science Foundation, and the Defense Advanced Research Projects Agency.



Copyright information

© Humana Press Inc. 2007

Authors and Affiliations

  • Olivier Elemento (1)
  • Saeed Tavazoie (1)
  1. Department of Molecular Biology and Lewis-Sigler Institute for Integrative Genomics, Princeton University, US
