Comparative Genomics

A Tool to Functionally Annotate Human DNA
  • Jan-Fang Cheng
  • James R. Priest
  • Len A. Pennacchio
Part of the Methods in Molecular Biology book series (MIMB, volume 366)


The availability of an increasing number of vertebrate genomes has enabled comparative methods that infer functional sequences from evolutionary constraint. This approach has proved powerful for gene identification, and significant progress has also been made in uncovering gene regulatory sequences such as distant-acting transcriptional enhancers. These pursuits have led to the development of a variety of valuable databases and resources that should serve as a routine toolbox for biological discovery.

Key Words

Comparative genomics; gene regulation; enhancer; transcription; databases



Copyright information

© Humana Press Inc. 2007

Authors and Affiliations

  • Jan-Fang Cheng (1, 2)
  • James R. Priest (1, 2)
  • Len A. Pennacchio (1, 2)
  1. Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, USA
  2. Joint Genome Institute, U.S. Department of Energy, Walnut Creek, USA
