Updating of transposable element annotations from large wheat genomic sequences reveals diverse activities and gene associations
- 295 Downloads
Triticeae species (including wheat, barley and rye) have huge and complex genomes due to polyploidization and a high content of transposable elements (TEs). TEs are known to play a major role in the structure and evolutionary dynamics of Triticeae genomes. During the last 5 years, substantial stretches of contiguous genomic sequence from various species of Triticeae have been generated, making it necessary to update and standardize TE annotations and nomenclature. In this study we propose standard procedures for these tasks, based on structure, nucleic acid and protein sequence homologies. We report statistical analyses of TE composition and distribution in large blocks of genomic sequences from wheat and barley. Altogether, 3.8 Mb of wheat sequence available in the databases was analyzed or re-analyzed, and compared with 1.3 Mb of re-annotated genomic sequences from barley. The wheat sequences were relatively gene-rich (one gene per 23.9 kb), although wheat gene-derived sequences represented only 7.8% (159 elements) of the total, while the remainder mainly comprised coding sequences found in TEs (54.7%, 751 elements). Class I elements [mainly long terminal repeat (LTR) retrotransposons] accounted for the major proportion of TEs, in terms of sequence length as well as element number (83.6% and 498, respectively). In addition, we show that the gene-rich sequences of wheat genome A seem to have a higher TE content than those of genomes B and D, or of barley gene-rich sequences. Moreover, among the various TE groups, MITEs were most often associated with genes: 43.1% of MITEs fell into this category. Finally, the TRIM and copia elements were shown to be the most active TEs in the wheat genome. The implications of these results for the evolution of diploid and polyploid wheat species are discussed.
KeywordsTransposable elements Sequence annotation Triticeae Genome evolution
The authors thank Catherine Feuillet and Beat Keller for their corrections and their “bloody cuts”, Alan Schulman for his help on the activity index and his corrections, Jorge Dubcovsky, the reviewers and the editor for their constructive observations and Bikram Gill, John Fellers, Dave Matthews and all the corresponding authors of the previous articles on large genomic sequences for their help.