Journal of Molecular Evolution, Volume 67, Issue 6, pp 696-704

Extensive Reorganization of the Plastid Genome of Trifolium subterraneum (Fabaceae) Is Associated with Numerous Repeated Sequences and Novel DNA Insertions

  • Zhengqiu Cai (The University of Texas at Austin)
  • Mary Guisinger (The University of Texas at Austin)
  • Hyi-Gyung Kim (The University of Texas at Austin)
  • Elizabeth Ruck (The University of Texas at Austin)
  • John C. Blazier (The University of Texas at Austin)
  • Vanity McMurtry (The University of Texas M. D. Anderson Cancer Center)
  • Jennifer V. Kuehl (DOE Joint Genome Institute)
  • Jeffrey Boore (DOE Joint Genome Institute; Genome Project Solutions)
  • Robert K. Jansen (Section of Integrative Biology and Institute of Cellular and Molecular Biology, The University of Texas at Austin; corresponding author)


The plastid genome of Trifolium subterraneum is 144,763 bp, about 20 kb longer than those of closely related legumes, which also lost one copy of the large inverted repeat (IR). The genome has undergone extensive genomic reconfiguration, including the loss of six genes (accD, infA, rpl22, rps16, rps18, and ycf1) and two introns (clpP and rps12) and numerous gene order changes, attributable to 14–18 inversions. All endpoints of rearranged gene clusters are flanked by repeated sequences, tRNAs, or pseudogenes. One unusual feature of the Trifolium subterraneum genome is the large number of dispersed repeats, which comprise 19.5% (ca. 28 kb) of the genome (versus about 4% for other angiosperms) and account for part of the increase in genome size. Nine genes (psbT, rbcL, clpP, rps3, rpl23, atpB, psbN, trnI-cau, and ycf3) have also been duplicated either partially or completely. rpl23 is the most highly duplicated gene, with portions of this gene duplicated six times. Comparisons of the Trifolium plastid genome with the Plant Repeat Database and searches for flanking inverted repeats suggest that the high incidence of dispersed repeats and rearrangements is not likely the result of transposition. Trifolium has 19.5 kb of unique DNA distributed among 160 fragments ranging in size from 30 to 494 bp, greatly surpassing the other five sequenced legume plastid genomes in novel DNA content. At least some of this unique DNA may represent horizontal transfer from bacterial genomes. These unusual features provide direction for the development of more complex models of plastid genome evolution.


Keywords: Fabaceae; Plastid genome; Repeated sequences; Trifolium