Journal of Molecular Evolution, Volume 67, Issue 6, pp 696–704

Extensive Reorganization of the Plastid Genome of Trifolium subterraneum (Fabaceae) Is Associated with Numerous Repeated Sequences and Novel DNA Insertions


  • Zhengqiu Cai
    • The University of Texas at Austin
  • Mary Guisinger
    • The University of Texas at Austin
  • Hyi-Gyung Kim
    • The University of Texas at Austin
  • Elizabeth Ruck
    • The University of Texas at Austin
  • John C. Blazier
    • The University of Texas at Austin
  • Vanity McMurtry
    • The University of Texas M. D. Anderson Cancer Center
  • Jennifer V. Kuehl
    • DOE Joint Genome Institute
  • Jeffrey Boore
    • DOE Joint Genome Institute
    • Genome Project Solutions
    • The University of Texas at Austin
    • Section of Integrative Biology and Institute of Cellular and Molecular Biology, The University of Texas at Austin

DOI: 10.1007/s00239-008-9180-7

Cite this article as:
Cai, Z., Guisinger, M., Kim, H. et al. J Mol Evol (2008) 67: 696. doi:10.1007/s00239-008-9180-7


The plastid genome of Trifolium subterraneum is 144,763 bp, about 20 kb longer than those of closely related legumes, which also lost one copy of the large inverted repeat (IR). The genome has undergone extensive genomic reconfiguration, including the loss of six genes (accD, infA, rpl22, rps16, rps18, and ycf1) and two introns (clpP and rps12) and numerous gene order changes, attributable to 14–18 inversions. All endpoints of rearranged gene clusters are flanked by repeated sequences, tRNAs, or pseudogenes. One unusual feature of the Trifolium subterraneum genome is the large number of dispersed repeats, which comprise 19.5% (ca. 28 kb) of the genome (versus about 4% for other angiosperms) and account for part of the increase in genome size. Nine genes (psbT, rbcL, clpP, rps3, rpl23, atpB, psbN, trnI-cau, and ycf3) have also been duplicated either partially or completely. rpl23 is the most highly duplicated gene, with portions of this gene duplicated six times. Comparisons of the Trifolium plastid genome with the Plant Repeat Database and searches for flanking inverted repeats suggest that the high incidence of dispersed repeats and rearrangements is not likely the result of transposition. Trifolium has 19.5 kb of unique DNA distributed among 160 fragments ranging in size from 30 to 494 bp, greatly surpassing the other five sequenced legume plastid genomes in novel DNA content. At least some of this unique DNA may represent horizontal transfer from bacterial genomes. These unusual features provide direction for the development of more complex models of plastid genome evolution.
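The size figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the published analysis: the function and constant names are hypothetical, and it assumes that "19.5 kb" of unique DNA means roughly 19,500 bp (the abstract gives only the rounded value).

```python
# Figures taken from the abstract; all names here are illustrative only.
GENOME_BP = 144_763        # reported plastid genome length
REPEAT_FRACTION = 0.195    # dispersed repeats, 19.5% of the genome
UNIQUE_BP = 19_500         # ASSUMPTION: "19.5 kb" read as 19,500 bp
UNIQUE_FRAGMENTS = 160     # number of unique-DNA fragments


def repeat_bp(genome_bp: int, fraction: float) -> float:
    """Base pairs occupied by repeats, given their share of the genome."""
    return genome_bp * fraction


def percent_of_genome(part_bp: float, genome_bp: int) -> float:
    """Share of the genome (in percent) taken up by part_bp base pairs."""
    return 100.0 * part_bp / genome_bp


def mean_fragment_bp(total_bp: float, n_fragments: int) -> float:
    """Average fragment length when total_bp is split into n_fragments."""
    return total_bp / n_fragments


if __name__ == "__main__":
    print(f"repeats: {repeat_bp(GENOME_BP, REPEAT_FRACTION):.0f} bp")
    print(f"unique DNA: {percent_of_genome(UNIQUE_BP, GENOME_BP):.1f}% of genome")
    print(f"mean unique fragment: {mean_fragment_bp(UNIQUE_BP, UNIQUE_FRAGMENTS):.1f} bp")
```

Under these assumptions the repeat content comes to about 28.2 kb, consistent with the "ca. 28 kb" stated above, and the mean unique fragment length falls inside the reported 30 to 494 bp range.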


Fabaceae, Plastid genome, Repeated sequences, Trifolium

Supplementary material

239_2008_9180_MOESM1_ESM.doc (92 kb)

Copyright information

© Springer Science+Business Media, LLC 2008