Multiple-Sequence Alignment to Predict Functional Elements in Genomic Sequences
  • Gabriela G. Loots
  • Ivan Ovcharenko
Part of the Methods in Molecular Biology™ book series (MIMB, volume 395)


Multiple sequence alignment analysis is a powerful approach for translating the evolutionary selective power into phylogenetic relationships to localize functional coding and noncoding genomic elements. The tool Mulan has been designed to effectively perform the multiple comparisons of genomic sequences necessary to facilitate bioinformatics-driven biological discoveries. The Mulan network server is capable of comparing both closely and distantly related genomes to identify conserved elements over a broad range of evolutionary time. Several novel algorithms are brought together in this tool: the tba multisequence aligner program, used to rapidly identify local sequence conservation, and the multiTF program, used to detect evolutionarily conserved transcription factor binding sites in alignments. Mulan is integrated with the ECR Browser and the UCSC Genome Browser for quick uploads of available sequences, and it supports two-way communication with the GALA database to overlay GALA functional genome annotation with sequence conservation profiles. Local multiple alignments computed by Mulan ensure reliable representation of short- and large-scale genomic rearrangements in distant organisms. Recently, we have also introduced the ability to handle duplications to permit the reliable reconstruction of evolutionary events that underlie the genome sequence data. Here, we describe the main features of the Mulan tool, which include the interactive modification of critical conservation parameters, visualization options, and dynamic access to sequence data from visual graphs for flexible and easy-to-perform analysis of differentially evolving genomic regions.
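
To make the idea of "conservation parameters" concrete, the following is a minimal sketch of how conserved elements might be called from a pairwise alignment using two thresholds: a minimum window length and a minimum fraction of identical columns. It is not Mulan's implementation; the function name `conserved_regions`, the parameter names, and the default values are hypothetical and chosen only for illustration.

```python
def conserved_regions(aln_a, aln_b, min_length=100, min_identity=0.7):
    """Return half-open (start, end) alignment intervals that pass the thresholds.

    aln_a and aln_b are two rows of one alignment, with '-' marking gaps.
    min_length and min_identity are illustrative placeholder values,
    not defaults taken from any particular tool.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    n = len(aln_a)
    if min_length <= 0 or n < min_length:
        return []

    # 1 where the column holds two identical non-gap bases, else 0.
    match = [
        int(a == b and a != "-")
        for a, b in zip(aln_a.upper(), aln_b.upper())
    ]

    regions = []
    window_matches = sum(match[:min_length])
    for start in range(n - min_length + 1):
        if start > 0:
            # Slide the window one column to the right.
            window_matches += match[start + min_length - 1] - match[start - 1]
        if window_matches >= min_identity * min_length:
            end = start + min_length
            if regions and start <= regions[-1][1]:
                # Overlapping or adjacent passing windows form one region.
                regions[-1] = (regions[-1][0], end)
            else:
                regions.append((start, end))
    return regions


if __name__ == "__main__":
    a = "ACGTACGTAC" + "AAAAAAAAAA"
    b = "ACGTACGTAC" + "CCCCCCCCCC"
    print(conserved_regions(a, b, min_length=5, min_identity=1.0))
```

Raising `min_identity` or `min_length` makes the call stricter and reports fewer, more strongly conserved intervals; lowering them is more permissive, which may suit comparisons between distantly related genomes.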

Key Words

Multiple alignment; alignment tool; evolutionary conservation; conserved elements; conserved transcription factor binding sites



Copyright information

© Humana Press Inc. 2007

Authors and Affiliations

  • Gabriela G. Loots (1)
  • Ivan Ovcharenko (1)
  1. Lawrence Livermore National Laboratory, Livermore
