Genome size and the accumulation of simple sequence repeats: implications of new data from genome sequencing projects
- John M. Hancock
The relationship between the level of repetitiveness in genomic sequences and genome size has been re-investigated using the rapidly growing database of complete eubacterial and archaeal genome sequences, combined with the fragmentary but now large amount of data from eukaryotic genomes. Relative simplicity factors (RSFs), which measure the repetitiveness of sequences, were calculated, and significantly simple motifs (SSMs), which identify the kinds of sequences that are repeated, were identified. A previously reported correlation between genome size and repetitiveness was confirmed, but the higher RSFs seen in eukaryotic genomes were shown also to reflect a generally higher level of repetitiveness that is independent of genome size differences. Differences in genome size are responsible for about 10% of the variance in RSF seen between species. The spectrum of SSMs seen within a genome differed markedly within the eubacteria but less so in eukaryotes and, particularly, in archaea. Species with SSM spectra that differ from the norm also tend to have high RSFs for their genome size and to be pathogens that use repetitive sequences to avoid host defence responses. Some of the variance in repetitiveness seen in other species may therefore also reflect the action of selection, although other forces, such as variation in the effectiveness of mechanisms for regulating slippage errors of replication, may also be important.
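The statement that genome size accounts for about 10% of the variance in RSF can be read as a coefficient of determination of roughly 0.10, which corresponds to a correlation coefficient of about 0.32 in magnitude. This reading assumes a simple linear fit; the abstract does not state which model was used. The sketch below shows how such a figure could be computed. The function name `variance_explained` and the input values are hypothetical placeholders, not data or code from the paper.

```python
import math


def variance_explained(x, y):
    """Return r**2: the share of variance in y explained by a linear fit on x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    r = sxy / math.sqrt(sxx * syy)  # Pearson correlation coefficient
    return r * r


# Hypothetical placeholder values, NOT data from the paper.
log_genome_size = [6.0, 6.3, 6.6, 7.0, 8.0, 9.5]  # assumed log10(base pairs)
rsf = [1.00, 1.10, 1.02, 1.15, 1.08, 1.30]        # assumed RSF per species
r2 = variance_explained(log_genome_size, rsf)
print(f"share of RSF variance explained by genome size: {r2:.2f}")
```

A value near 0.10 from the real data would mean that most of the between-species variation in RSF is left to other factors, which is consistent with the abstract's appeal to selection and to variation in slippage regulation.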
Volume 115, Issue 1, pp 93-103
- Kluwer Academic Publishers
- genome size
- relative simplicity factor
- sequence repetitiveness
- simple sequences
- John M. Hancock (1)
- Author Affiliations
- 1. Department of Computer Science, Royal Holloway University of London, UK (Phone