Background

The proteins present in cells are the product of the blueprint prescribed by the genes [1,2,3]. Collectively, all of the genes (coding and non-coding) present in a cell represent the genome of an organism [4, 5]. The construction of a protein from a gene is a complex procedure and requires the involvement of transfer RNA (tRNA), messenger RNA (mRNA), ribosomes, amino acids, and other molecules [6,7,8,9]. This process is commonly known as translation, which is a fundamental process of living cells [6,7,8,9]. The functional apparatus involved in gene translation is highly conserved across the tree of life [10]. mRNA conveys the blueprint information as triplet codons composed of nucleotides, and tRNAs recognise their cognate codons [11, 12]. Although mRNA and ribosomes represent the two major parts of the machinery responsible for translation, tRNAs are the fundamental units of this translation machinery [13,14,15]. The anti-codon of a tRNA pairs with the codon of the mRNA and supplies the corresponding amino acid to the growing polypeptide chain [3, 8, 15, 16]. Two or more different tRNAs can bind the same amino acid and transfer it to the ribosome [17,18,19,20]. There are 22 different amino acids encoded by 63 codons (including the UGA and UAG codons for selenocysteine and pyrrolysine, respectively); several of the amino acids are encoded by more than one codon and hence by more than one corresponding anti-codon [21,22,23,24,25]. Therefore, a genome can encode more than one tRNA molecule, each with a different anti-codon, to transfer a particular amino acid [21, 26,27,28]. Although codon selection for a corresponding anti-codon is the primary unit of the translation machinery, mutational bias, selection, drift, and codon usage bias also shape the prescribed translation [29,30,31,32]. 
Although there are critical steps for the efficient and proper functioning of the translation machinery, other synonymous codons can also serve as an alternative choice [32,33,34]. The differential use of codons also reflects their natural demand in the protein translation machinery [35, 36]. tRNAs are classified into various gene families based on their isoacceptor anti-codons [17, 19, 20]. The available tRNA pool is maintained at a level that can accommodate the transcript levels present in a cell, thus ensuring efficient and accurate translation. Highly-expressed genes, however, exhibit codon usage bias that reflects the copy number of the corresponding tRNA [37,38,39]. Translational selection acts to maintain the balance between codon usage and tRNA availability [40,41,42]. There is always selection pressure, however, to increase the production of the codons used in highly-expressed genes [32, 43, 44].

Over the course of evolution, the earth has undergone enormous changes and the plant kingdom has been subjected to numerous stresses [45,46,47]. All living organisms had to adapt to a changing environment, which resulted in the increased importance of some protein-coding genes while others became less important [48,49,50]. Accordingly, there was a need to alter the relative number and type of available tRNAs to fulfil the translational requirements of the new and/or modified protein-coding genes [51, 52]. Changes in the relative number and type of tRNA molecules are also associated with a change in the number and type of anti-codons [53, 54]. The role of selection pressure brought about by translational demand and its role in maintaining tRNA pools has not been adequately addressed. Furthermore, the selection pressure that determines the maintenance of low copy tRNA families and anti-codons also remains unclear. Whether translational selection pressure favours optimal codons in particular cases and keeps other codons as non-optimal, and hence in low supply, is unknown. It is also unknown if the amino acid requirements of proteins impact the need to provide specific tRNAs having the required anti-codons, as well as the genes that encode those tRNAs. In the present study, an attempt was made to determine the frequency of anti-codons in the tRNAnome of the Plant Kingdom to better understand the presence of codons and anti-codon frequency. Our objective was to provide information on the link between the presence of codons and their corresponding anti-codons, tRNAs, and the number of amino acids utilized in plant proteomes. Therefore, we analysed the frequency of anti-codons in the tRNA of plant genomes and constructed an anti-codon table of the Plant Kingdom.

Material and methods

Sequence retrieval

The annotated RNA sequence files of all 128 plant species were downloaded from the National Center for Biotechnology Information (NCBI) using the Ensemble genome browser. The downloaded sequence files were scanned for the presence of tRNAs using tRNAscan-SE software on a Linux-based platform. The resulting tRNAscan files were used for further analysis. After the completion of the scanning of individual files, all files were merged to obtain a complete plant tRNAnome file. The frequency of each individual anti-codon was obtained from the tRNAnome file and presented as a number and percentage (%). In the course of the analysis, several tRNASec were identified in different plant genomes and were kept separately for further study.
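The counting step described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes tab-separated tRNAscan-SE tabular output in which the second column is the tRNA index, the fifth the isotype, and the sixth the anti-codon written in the DNA alphabet. The function name `anticodon_frequencies` and the example rows are hypothetical.

```python
from collections import Counter


def anticodon_frequencies(lines):
    """Count anti-codons in tRNAscan-SE style tabular rows.

    Assumption: tab-separated rows, tRNA index in column 2 and
    anti-codon in column 6; header rows are skipped.
    Returns {anti-codon: (count, percentage)}.
    """
    counts = Counter()
    for line in lines:
        fields = [f.strip() for f in line.rstrip("\n").split("\t")]
        if len(fields) < 6:
            continue
        # Data rows carry an integer tRNA index; header rows do not.
        if not fields[1].isdigit():
            continue
        # Convert the DNA alphabet (T) to the RNA alphabet (U).
        anticodon = fields[5].upper().replace("T", "U")
        counts[anticodon] += 1
    total = sum(counts.values())
    return {ac: (n, 100.0 * n / total) for ac, n in counts.items()}


# Hypothetical rows, for illustration only (not data from the study).
example = [
    "Name\ttRNA #\tBegin\tEnd\tType\tCodon\tBegin\tEnd\tScore",
    "chr1\t1\t100\t172\tMet\tCAT\t0\t0\t70.1",
    "chr1\t2\t500\t571\tMet\tCAT\t0\t0\t68.3",
    "chr2\t1\t900\t972\tAsp\tGTC\t0\t0\t72.5",
    "chr2\t2\t1500\t1573\tGly\tGCC\t0\t0\t66.0",
]
freqs = anticodon_frequencies(example)
```

Merging the per-species files before counting would correspond to passing the concatenated rows of all files to the same function.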

Sequence alignment

Multiple sequence alignment of tRNASec genes was conducted using Multalin software with default parameters. To construct the phylogenetic tree, a multiple sequence alignment of tRNAs and tRNASec was conducted using the MUSCLE program in MEGA7 software [55, 56]. The resulting alignment was saved in MEGA file format and subsequently used to construct a phylogenetic tree in MEGA7. Prior to the construction of the phylogenetic tree, model selection was carried out using the following parameters: statistical method (maximum likelihood), substitution type (nucleotides), and gaps/missing data treatment (complete deletion). Based on the lowest BIC score, a phylogenetic tree of tRNAs and tRNASec was constructed. The statistical parameters used to construct the phylogenetic tree were: statistical method (maximum likelihood), test of phylogeny (bootstrap method), no. of bootstrap replicates (1000), substitution type (nucleotides), model/method (Kimura 2-parameter model), rates among sites (gamma distributed), no. of discrete gamma categories (5), gaps/missing data treatment (partial deletion), site coverage cut-off (95%), ML heuristic method (nearest-neighbour-interchange), and branch swap filter (very strong). A separate phylogenetic tree was constructed from all of the tRNASec sequences, using the same statistical approach, to determine deletion and duplication events. The phylogenetic tree of tRNASec genes was exported in Newick file format. Subsequently, a species tree was constructed for all 128 species using the taxonomy browser of NCBI. To determine tRNASec deletion and duplication events, the phylogenetic tree of tRNASec was reconciled with the species tree using Notung software, version 2.9. The reconciled gene and species tree revealed the deletion, duplication, and co-divergence events that occurred in tRNASec genes. The resulting phylogenetic tree of tRNAs (with tRNASec) and the phylogenetic tree of tRNASec were analysed using IcyTree to identify recombination events.

Cluster based grouping of the anti-codons

Anti-codons were grouped based on their percentage frequency in the tRNAnome. To cluster them, the percent frequency of anti-codons was used against each anti-codon. A classical clustering approach was used to cluster the anti-codons using a paired group UPGMA algorithm and Euclidean similarity index with 1000 bootstrap replicates.
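The clustering idea can be sketched in a few lines. The sketch below is an assumption-laden illustration rather than the Past3 implementation: it applies average linkage (UPGMA) to one-dimensional percentage frequencies, where the Euclidean distance reduces to an absolute difference, and it omits the bootstrap step. The function name `upgma_groups` is hypothetical; the eight frequencies are the highest and lowest values reported in the Results.

```python
from itertools import combinations


def upgma_groups(freqs, n_groups):
    """Merge anti-codons by average linkage until n_groups clusters remain."""
    clusters = [[name] for name in sorted(freqs)]

    def avg_distance(a, b):
        # UPGMA: unweighted mean of all pairwise distances between clusters.
        total = sum(abs(freqs[x] - freqs[y]) for x in a for y in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_groups:
        # Find the pair of clusters with the smallest average distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: avg_distance(clusters[p[0]], clusters[p[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)

    # Order groups from highest to lowest mean frequency.
    return sorted(
        (sorted(c) for c in clusters),
        key=lambda c: -sum(freqs[x] for x in c) / len(c),
    )


# Percent frequencies of the four most and four least abundant anti-codons.
reported = {
    "CAU": 5.033, "GUC": 4.274, "GUU": 4.020, "GCC": 3.811,
    "GCG": 0.004, "GAG": 0.009, "CUA": 0.0111, "ACU": 0.019,
}
groups = upgma_groups(reported, 2)
```

With two groups requested, the four abundant anti-codons separate from the four rare ones; requesting five groups on the full table would correspond to the grouping step described above.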

Statistical analysis

Probability plot and linear regression analyses of tRNA gene number per genome and anti-codon frequency were conducted, and a value of p < 0.05 was considered significant. To investigate anti-codon numbers in different lineages and their statistical significance, t-tests were conducted comparing anti-codon number in eudicots vs. monocots, eudicots vs. algae, and monocots vs. algae. Differences were deemed significant at p < 0.05. All of the statistical analyses were conducted using Past3 software.
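For readers who want to reproduce the lineage comparison outside Past3, a two-sample t statistic can be computed as sketched below. The exact test variant is not specified here, so the sketch assumes an unpaired, pooled-variance (Student) form; the function name `pooled_t` and the sample values are hypothetical and are not data from this study.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """Return (t, degrees of freedom) for an unpaired, pooled-variance t-test."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pooled sample variance, weighted by each group's degrees of freedom.
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df


# Hypothetical anti-codon counts per genome for two lineages.
group_one = [50.0, 52.0, 54.0, 56.0, 58.0]
group_two = [40.0, 42.0, 44.0, 46.0, 48.0]
t_value, dof = pooled_t(group_one, group_two)
```

The resulting |t| is then compared with the two-tailed critical value for the given degrees of freedom at p < 0.05, which is the form in which the results are reported in Table 3.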

Results

Genome size is not proportional to the number of tRNA genes

A genome-wide analysis of fully-annotated whole genome sequences of 128 plant species was conducted to identify tRNA genes and to construct an anti-codon table of the plant kingdom (Table 1). The species included in the study varied in the size of their respective genomes (Table 2). A regression analysis was conducted to determine the correlation between genome size and the number of tRNA genes encoded per genome. Results indicated that plant genome size was not correlated (r = 0.5471, y = 0.17892x + 619.76) with the number of tRNA genes per genome (Fig. 1). Ipomoea nil, with a genome size of 735.23 Mb, possesses 6475 tRNA genes, the highest number of tRNA-encoding genes identified among the plant species analysed. Other species with a high number of tRNA genes in their genome were Cucurbita moschata (4062), Cucurbita pepo (3228), Cucurbita maxima (3036), Papaver somniferum (2571), Brassica napus (2180), and Ipomoea triloba (2180). Among the 128 analysed plant species, 22 (16.92%) possessed more than 1000 tRNA genes in their genome. In contrast, Ostreococcus tauri and Phaeodactylum tricornutum each encoded only 41 tRNA genes, the lowest number among the analysed genomes. Other species encoding a low number of tRNA genes were Raphidocelis subcapitata (43), Monoraphidium neglectum (48), and Bathycoccus prasinos (57). The genome sizes of O. tauri, P. tricornutum, R. subcapitata, and M. neglectum were 14.76, 27.4, 51.16, and 69.71 Mb, respectively. These genomes are smaller than those of most of the other plant species that were analysed.
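The regression reported above can be illustrated with a short least-squares sketch. The function name `linear_fit` and the toy data are hypothetical; only the fitted line y = 0.17892x + 619.76 and the I. nil values (735.23 Mb, 6475 tRNA genes) are taken from the text.

```python
from math import sqrt


def linear_fit(x, y):
    """Ordinary least-squares line and Pearson r for paired observations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


# Toy data lying exactly on y = 2x + 1, used only to exercise the function.
slope, intercept, r = linear_fit([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])

# Applying the reported line to I. nil (genome size in Mb).
predicted_inil = 0.17892 * 735.23 + 619.76
residual_inil = 6475 - predicted_inil
```

Under the reported line, a genome of 735.23 Mb would be expected to carry roughly 751 tRNA genes, so the 6475 genes observed in I. nil lie far above the fit, which illustrates how weakly genome size alone predicts tRNA gene number.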

Table 1 Anti-codon table of the plant kingdom with frequency of anti-codons
Table 2 Genomic details of plant anti-codons
Fig. 1

Regression analysis of tRNA gene number against plant genome size. The analysis indicated that the number of tRNA genes was not significantly correlated with plant genome size

CAU (Met) was the most abundant and GCG (Arg) the least abundant anti-codon in the plant kingdom

The occurrence of each anti-codon was analysed separately to determine the frequency of anti-codons in the genomes of the Plant Kingdom. Results indicated that CAU (Met) was the most abundant (5.033%) anti-codon in the Plant Kingdom, followed by GUC (Asp, 4.274%), GUU (Asn, 4.020%), and GCC (Gly, 3.811%) (Table 1, Supplementary File 1). In contrast, GCG (Arg) was identified as the least abundant (0.004%) anti-codon in the Plant Kingdom, followed by GAG (Leu, 0.009%), CUA (Sup, 0.0111%), and ACU (Ser, 0.019%) (Table 1, Supplementary File 1). The least abundant anti-codon (GCG) was only present in Ipomoea nil, Nicotiana attenuata, Papaver somniferum, and Ziziphus jujuba. When the anti-codon frequencies of the different tRNA isoacceptors were considered, however, tRNALeu was found to be the most abundant tRNA isoacceptor (Table 1). Approximately 7.808% of all anti-codons in the Plant Kingdom were found to be encoded by tRNALeu (Table 1). tRNALeu was followed in abundance by tRNASer (7.668%), tRNAGly (7.523%), and tRNAArg (7.284%) (Table 1). tRNALeu, tRNASer, and tRNAArg each have six different isoacceptors, which might be the reason for their higher abundance in the plant genomes. Suppressor tRNA (0.036%) was found to be the least abundant tRNA isoacceptor in the plant genomes, followed by tRNASec (0.066%), tRNAHis (2.109%), and tRNACys (2.547%) (Table 1). The suppressor tRNA (CUA) anti-codon was only found in Ectocarpus siliculosus, Nicotiana sylvestris, and Zea mays (Supplementary File 1).

Anti-codons can be classified into five groups based on their frequency of occurrence in plant genomes

A clustering analysis based on the frequency of the anti-codons in the Plant Kingdom was conducted using the paired group (UPGMA) algorithm and Euclidean similarity index with 1000 bootstrap replicates. The analysis revealed five distinct groups of anti-codons, which were named groups A, B, C, D, and E (Fig. 2). The anti-codons in the different groups were: Group A - CAU, GCC, GUU, and GUC; Group B - CUU, GAA, AAU, AGA, UCC, GCA, GCU, UCC, AAC, CCA, GUA, UUU, UGG, AGC, UUC, and CUC; Group C - UGA, UGU, UAG, UUG, UCU, CAC, AGU, GUG, AAG, AGG, UGC, CAA, and ACG; Group D - CCG, CGU, CGA, CGG, CAG, UAA, CGC, UAU, UCG, CCC, UAC, CCU, and CUG; and Group E - GGU, GGA, AUU, GAU, GAC, AUC, AUG, AAA, ACA, UCA, GGG, ACU, UUA, GGC, ACC, AUA, GAG, CUA, and GCG (Fig. 2). The groups are ordered by the abundance of their anti-codons in plant genomes, from highest (Group A) to lowest (Group E).

Fig. 2

Grouping of plant anti-codons. The clustering was based on the frequency (percentage) of the anti-codons found in the collective genomes of 128 plant species. Groups A, B, C, D, and E were assigned in decreasing order of anti-codon frequency. The clustering was conducted using the UPGMA algorithm and Euclidean distance matrix with 1000 bootstrap replicates in Past3 software

Plant genomes encode 18 to 59 isoacceptors (anti-codons)

The genome-wide analysis of the Plant Kingdom revealed diversity in the number of anti-codons present in the genomes of individual species, which ranged from 18 to 59 (Table 2). Ostreococcus tauri was found to encode only 18 isoacceptors and Micromonas commoda only 26 (Table 2). Ipomoea nil, Papaver somniferum, and Zea mays encoded the highest number of anti-codons, at 59 each. At least 51 (39.53%) species were found to encode 50 or more anti-codons in their genome. On average, plant genomes encode 48.25 anti-codons per genome. A paired two-tailed t-test was conducted to statistically analyse the frequency of anti-codons present in algae, eudicot, and monocot species. The comparison between eudicot and monocot species indicated that the frequency of tRNA anti-codons in these two groups was not significantly different at p < 0.05 (t = 1.2691, below the critical value of 1.984) (Table 3). In contrast, a significant difference in tRNA frequency was observed between eudicots and algae (10.3939 > 1.987), and between monocots and algae (6.2914 > 2.037) (Table 3). Notably, the variance in tRNA frequency in the monocot lineage was much lower than it was in the eudicots and algae.

Table 3 (A) t-test (two tailed) between eudicot and monocot anti-codon numbers. The t-value was smaller than the critical value (1.2691 < 1.984), so the means were not significantly different (p < 0.05). (B) t-test (two tailed) between eudicot and algae anti-codon numbers. The t-value was greater than the critical value (10.3939 > 1.987), so the means were significantly different (p < 0.05). (C) t-test (two tailed) between monocot and algae anti-codon numbers. The t-value was greater than the critical value (6.2914 > 2.037), so the means were significantly different (p < 0.05)

Only a few species have lost tRNA genes

Our analysis revealed that a few species have lost specific tRNA genes (tRNA isotypes) from their genome. These species, with the isotype(s) lost, were Coccomyxa subellipsoidea (tRNATyr), Corchorus capsularis (tRNALys, tRNATyr), Corchorus olitorius (tRNATyr), Klebsormidium nitens (tRNATyr, tRNASer), Monoraphidium neglectum (tRNAThr), Ostreococcus tauri (tRNAPhe, tRNAGln), Picea glauca (tRNASer), Phaeodactylum tricornutum (tRNACys), and Raphidocelis subcapitata (tRNATyr) (Table 2). Understanding the loss of tRNA genes and its functional implications for protein translation is crucial.

Some plant species encode tRNASec in their genomes

Several plant species were found to encode tRNA genes for the amino acid selenocysteine. More specifically, 22 (17.187%) species were found to encode a tRNASec gene in their genome. These species were Aegilops tauschii, Beta vulgaris, Brassica rapa, Cucumis sativus, Cucurbita maxima, Cucurbita moschata, Cucurbita pepo, Ectocarpus siliculosus, Ipomoea nil, Ipomoea triloba, Lactuca sativa, Momordica charantia, Medicago truncatula, Monoraphidium neglectum, Nicotiana tabacum, Papaver somniferum, Picea glauca, Populus euphratica, Salvia splendens, Tarenaya hassleriana, Triticum urartu, and Zea mays (Table 2). The length of tRNASec-encoding genes ranged from 70 to 90 nucleotides, with an average length of 72.93 nucleotides per tRNA. A multiple sequence alignment of tRNASec genes indicated the presence of conserved G-x-C nucleotides at the 30th to 32nd positions and a conserved U-C-A at the 34th, 35th, and 36th positions (Supplementary Figure 1). The pseudo-uridine loop was also found to contain a conserved G-U-U-x2-A-x2-C nucleotide consensus sequence (Supplementary Figure 1). The tRNASec in C. maxima (NW_019272053.1), however, was found to carry a C-U-U nucleotide sequence instead of the conserved G-U-U consensus sequence in its pseudo-uridine loop (Supplementary Figure 1).

Loss of tRNASec occurred to a greater extent than duplication

A phylogenetic tree was constructed to investigate the evolution of tRNASec genes by considering the nucleotide sequences of all 20 tRNA isotypes along with the tRNASec genes. The phylogenetic tree revealed 28 major tRNA groups (Fig. 3). The tRNASec genes were clustered in the middle of the phylogenetic tree and were found in at least six different clusters (Fig. 3). A few tRNASec genes were grouped with tRNALys (CUU), tRNAAsn (GUU), tRNAArg (UCG, CCG), tRNAGly (UCC), and tRNATrp (CCA) (Fig. 3). The distribution of tRNASec across different clusters suggests a role for duplication events in the evolution of tRNASec genes. Therefore, an analysis was conducted to investigate the deletion and duplication events related to tRNASec genes. We found that tRNASec deletion events occurred more frequently than duplication events. A total of 45 duplications, 119 deletions, and 9 co-divergence events were identified within 68 tRNASec genes found in 22 species (Supplementary Figure 2). The role of recombination in the evolution of tRNASec was further analysed. Results indicated that tRNASec genes had undergone recombination events, as did other tRNA genes (Fig. 4). Recombination and duplication of tRNASec genes resulted in the sharing of sequence with other tRNA genes, which may explain why tRNASec was present in different clusters within the phylogenetic tree. A recombination analysis restricted to tRNASec genes also indicated recombination events among the tRNASec genes themselves (Fig. 5). A time tree analysis revealed that the divergence of tRNASec genes in plant species occurred at least 2466.30 million years ago (MYA) (Supplementary Figure 3), and less than 1 MYA in the case of the tRNASec in P. somniferum. The tRNASec in P. somniferum was found to arise from a duplication event, and its recent divergence time indicates that this duplication occurred recently.

Fig. 3

Phylogenetic tree of tRNASec and other tRNA isotypes. The phylogenetic tree with 21 tRNA isotypes revealed at least 28 major phylogenetic groups where tRNASec (red) was placed with different tRNA isotypes. The phylogenetic tree indicates that tRNA has most likely evolved from multiple common ancestors and has also undergone duplication. The evolutionary history was inferred using the Maximum Likelihood method based on the Kimura 2-parameter model. The tree with the highest log likelihood (− 7466.51) is illustrated. The percentage of the branches in which the associated taxa cluster together is shown next to the branches. Initial tree(s) for the heuristic search were automatically obtained applying the Neighbor-Join and BIONJ algorithms to a matrix of pairwise distances estimated using the Maximum Composite Likelihood (MCL) approach, and then selecting the topology with a superior log likelihood value. A discrete Gamma distribution was used to model evolutionary rate differences among the sites [5 categories (+G, parameter = 2.6875)]. The tree is drawn to scale, with branch lengths representing the number of substitutions per site. The analysis utilized 702 nucleotide sequences. All positions with less than 95% site coverage were eliminated. Fewer than 5% alignment gaps, missing data, and ambiguous bases were allowed at any position. Evolutionary analyses were conducted in MEGA7 [2]

Fig. 4

Recombination events in tRNA isotypes. Results indicated that tRNAs have undergone dynamic recombination events during the course of evolution. The recombination study was conducted in IcyTree using the Newick (nwk) file obtained from the phylogenetic tree

Fig. 5

Recombination events in tRNASec genes. Results indicate that tRNASec genes have undergone recombination events during evolution. The recombination study was conducted in IcyTree using the Newick (nwk) file of the phylogenetic tree of tRNASec

tRNASec underwent a switch in anti-codons during evolution

tRNA genes undergo rapid changes during the course of their evolution to meet translational demand. Therefore, an attempt was made to better understand the role of tRNASec genes in plant evolution. tRNASec is known to carry a UCA anti-codon, and this gene was found in different clusters in the phylogenetic tree of tRNAs. An anti-codon switch occurs more frequently with a nucleotide sequence of a tRNA gene with a different anti-codon than with a gene with a similar anti-codon [51]. Therefore, the possibility of an anti-codon switch in the tRNASec gene was examined. tRNASec grouped with tRNALys (CUU), tRNAAsn (GUU), tRNAArg (UCG, CCG), tRNAGly (UCC), and tRNATrp (CCA). The UCA anti-codon of tRNASec was replaced by CUU in tRNALys and by GUU in tRNAAsn; the 2nd and 3rd nucleotides of these two replacement anti-codons are identical (UU). In tRNAArg and tRNAGly, the UCA anti-codon of tRNASec was replaced by UCG and UCC, respectively, where the 1st and 2nd nucleotides (UC) remained constant and the 3rd nucleotide was variable. For the CCG anti-codon of tRNAArg and the CCA anti-codon of tRNATrp, the 1st nucleotide (U) of the UCA anti-codon of tRNASec was replaced with a C nucleotide and the 3rd nucleotide remained variable.
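The position-by-position comparison described above is easy to check mechanically. The following sketch is only an illustration; the function name `shared_positions` and the dictionary keys are hypothetical labels, while the anti-codons are those named in the text.

```python
def shared_positions(reference, other):
    """Return the 1-based anti-codon positions at which two anti-codons agree."""
    return [i + 1 for i, (a, b) in enumerate(zip(reference, other)) if a == b]


SEC_ANTICODON = "UCA"

# Anti-codons of the tRNAs that grouped with tRNASec in the phylogenetic tree.
neighbours = {
    "Lys": "CUU",
    "Asn": "GUU",
    "Arg (UCG)": "UCG",
    "Arg (CCG)": "CCG",
    "Gly": "UCC",
    "Trp": "CCA",
}

comparison = {
    name: shared_positions(SEC_ANTICODON, anticodon)
    for name, anticodon in neighbours.items()
}
```

In this comparison, UCG and UCC retain positions 1 and 2 of UCA, CCA retains positions 2 and 3, CCG retains only position 2, and CUU and GUU share no position with UCA.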

Statistical analysis

The varied number and frequency of anti-codons led us to examine whether the dataset is approximately normally distributed. Therefore, we conducted a normal probability plot analysis of anti-codon numbers (Fig. 6). The normal probability plot correlation coefficient was 0.9632. The correlation coefficient and the approximately straight line indicate that a normal distribution is a good fit for the dataset (Fig. 6). An ordinary least-squares linear regression of anti-codon numbers was conducted to find the best fit for the data by minimizing the sum of the offsets or residuals of points from the plotted line and to understand the behaviour of the dependent variable (Supplementary Figure 4). The method estimates the relationship by minimizing the sum of the squared differences between the observed and predicted values of the dependent variable, modelled as a straight line. At the 95% confidence level and with the intercept at zero, the slope was found to be 34.621 (Supplementary Figure 4). The statistical results of the ordinary least-squares regression were: t = 10.728, standard error a = 3.227, and p (slope) = 6.161E-16. For the 95% bootstrap confidence interval (N = 1999): correlation r = 0.00916, r2 = 8.3917E-05, t = 0.072713, p (uncorr) = 0.94226, and permutation p = 0.9404. The residual standard error of the estimate was 147.
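A normal probability plot correlation coefficient of the kind reported above can be sketched as the Pearson correlation between the sorted data and theoretical normal quantiles. The sketch makes an assumption the text does not confirm: it uses Blom plotting positions, (i - 0.375) / (n + 0.25), whereas Past3 may use a different convention. The function name `normal_ppcc` and both sample lists are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def normal_ppcc(values):
    """Correlation between sorted data and standard normal quantiles."""
    data = sorted(values)
    n = len(data)
    std_normal = NormalDist()
    # Blom plotting positions (an assumption, see the lead-in).
    quantiles = [
        std_normal.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)
    ]
    mq, md = sum(quantiles) / n, sum(data) / n
    sqd = sum((q - mq) * (d - md) for q, d in zip(quantiles, data))
    sqq = sum((q - mq) ** 2 for q in quantiles)
    sdd = sum((d - md) ** 2 for d in data)
    return sqd / sqrt(sqq * sdd)


# Hypothetical samples: one evenly spread, one dominated by an outlier.
even_sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]
skewed_sample = [1, 1, 1, 1, 1, 1, 1, 1, 100]
r_even = normal_ppcc(even_sample)
r_skewed = normal_ppcc(skewed_sample)
```

Values close to 1 indicate points lying near a straight line on the probability plot; strongly skewed data give a markedly lower coefficient.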

Fig. 6
figure 6

Normal probability plot of anti-codon numbers of the plant kingdom with correlation coefficient 0.9636 suggesting the datasets are normally distributed

Discussion

tRNA is an adaptor molecule that becomes charged when it binds an amino acid and subsequently donates it to an elongating peptide chain, as determined by a codon-anti-codon recognition system. Each tRNA contains a characteristic anti-codon sequence which dictates the translation of an mRNA sequence into a protein. In some cases, the same codon can be decoded by different tRNA species, and the same tRNA species can decode different codons, owing to wobble interactions (non-Watson-Crick base pairing) between the first position of the anti-codon and the third position of the codon [26,27,28]. In our analysis of 128 plant species, none were found to encode all 64 anti-codons, which suggests that wobble base pairing exists in all plant species. Wobble interactions involve G:U (guanine-uracil) base pairing and modifications in anti-codons that change the specificity for a codon [57,58,59]. Due to this redundancy, it is not necessary for a plant genome to encode all possible anti-codons; different tRNAs can be utilized as required. The presence of only 29 anti-codons in the genome of Klebsormidium nitens and 31 anti-codons in Bathycoccus prasinos, however, is notable. K. nitens and B. prasinos are algae, and the genome sizes of these species are much smaller than those found in gymnosperms and angiosperms. The small number of anti-codons in these species suggests that the rate of wobble base-pairing might be quite high in these species. Mohanta et al. (2020) reported that species of cyanobacteria possessed 32 to 43 anti-codons per genome [20]. Cyanobacterial genomes are smaller than the genomes of algae and higher plants [60]. The small number of anti-codons in species with smaller genomes may therefore be related to a higher frequency of wobble base-pairing. Ipomoea nil (59), Ipomoea triloba (58), Papaver somniferum (59), Cucurbita pepo (56), and Zea mays (59) possess a high number of anti-codons, and so the occurrence of wobble base pairing may be quite minimal in these species. It will be interesting to determine the factors responsible for the occurrence of high and low frequencies of wobble base-pairing. Zhang et al. reported that a high concentration of amino acids in the nutrient media led to a higher rate of mismatch incorporation of amino acids into the translating protein chain [61]. They also reported that the wobble codon position is less stringent with respect to base-pair mismatches, and that base changes at the 3rd position explained an additional 25% of misincorporations, through either a favourable G(mRNA)/U(tRNA) mismatch or a wobble position mismatch [61]. The G/U mismatch was predominant during codon recognition and is also commonly found in nucleic acid secondary structures [62,63,64].
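The way wobble pairing lets one anti-codon read several codons can be sketched as follows. This is a deliberately simplified illustration: the pairing table covers only unmodified bases at the first anti-codon position (G reads C or U, U reads A or G) and ignores modified nucleotides such as inosine, so it is an assumption rather than a full decoding model. The names `WOBBLE` and `readable_codons` are hypothetical.

```python
# Strict Watson-Crick partners used at anti-codon positions 2 and 3.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Simplified wobble rules for the first anti-codon base (assumption:
# unmodified bases only; G:U pairing is allowed in both directions).
WOBBLE = {"G": "CU", "U": "AG", "C": "G", "A": "U"}


def readable_codons(anticodon):
    """Codons (5'->3') that an anti-codon (5'->3') could read under these rules."""
    first, second, third = anticodon
    # The third and second anti-codon bases pair strictly with the
    # first and second codon bases, respectively.
    prefix = COMPLEMENT[third] + COMPLEMENT[second]
    # The first anti-codon base pairs with the third (wobble) codon base.
    return sorted(prefix + base for base in WOBBLE[first])


asp_codons = readable_codons("GUC")  # tRNAAsp
met_codons = readable_codons("CAU")  # tRNAMet
```

Under these rules the single GUC anti-codon reads both Asp codons (GAC and GAU), whereas CAU reads only AUG, which illustrates why a genome can function without encoding every possible anti-codon.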

The abundance of the CAU anti-codon of tRNAMet was the greatest among all of the anti-codons (Supplementary File 1). Methionine is used to initiate a polypeptide chain and, as a result, almost all proteins require a methionine amino acid. This may explain why the anti-codon for tRNAMet was the most abundant. Additionally, tRNAMet (CAU) was found to have evolved earlier than other tRNAs during the course of evolution [18, 19]. If the abundance of isoacceptors is considered, tRNALeu, which has six isoacceptors (AAG, GAG, CAG, UAG, CAA, UAA), has the highest abundance (7.808% of all anti-codons in the collective plant genomes). Similarly, tRNASer and tRNAArg, both with six isoacceptors, have a high percentage of anti-codon abundance. This finding led us to conclude that the higher the number of isoacceptors for a tRNA isotype, the greater the level of anti-codon sharing in a genome. The study also reveals that plant genomes encode tRNALeu, tRNASer, and tRNAArg more frequently than other tRNAs. A proteome-wide analysis by Mohanta et al. reported a higher abundance of Leu amino acids in the proteomes of the Plant Kingdom [65]. This observation is consistent with the number and abundance of tRNALeu genes in a genome being proportional to the number of Leu amino acids in the proteome. In contrast, a few anti-codons of different tRNA isotypes, including GCG, GAG, GGG, GGC, ACU, ACC, and UCA (Sec) (group E), were found to have a low abundance (Fig. 2). Yona et al. reported that multiple copies of rare tRNAs are deleterious to a cell [51]. They also stated that the effective gene copy number of each tRNA anti-codon set can change during evolution, possibly in response to changes in the balance between demand and supply [51]. A single point mutation in an anti-codon can change one tRNA to another. The least abundant anti-codon, GCG of tRNAArg, may have undergone a point mutation resulting in tRNAArg with ACG, CCG, and UCG, which avoids the deleterious effect of the GCG anti-codon. Previous studies have also noted that rare tRNAs may be essential for co-translational folding, as their low abundance could provide a pause in translation [44, 66].

When plants grow in a multitude of environmental conditions, environmental stress can induce the expression of genes needed for stress adaptation, which may affect codon usage by the transcriptome. This leads to a demand for a different pool of tRNAs to support the change in codon usage and avoid a translational imbalance [52, 67]. If the altered environmental conditions persist, the tRNAs have to undergo changes in their level of expression to meet and respond to the environmental stress-induced changes in gene expression. If the changes in supply-demand continue, it may lead to changes in the genetic pool of the tRNAs that are beneficial and favoured by selection pressures. These novel translational demands can be maintained by shifting nucleotides in the anti-codons rather than by the duplication of genes. The tRNA pool can evolve to maintain the translational requirement by adjusting the number and/or ratio of tRNA isotypes encoding the same amino acid. An anti-codon switch, however, can also dramatically change the ratios of tRNA isoacceptor within a tRNA pool. This can be done by increasing the copy number of one isoacceptor at the expense of others. The high sequence similarity of different anti-codons (anti-codon switch) can be the result of purifying selection that maintains sequence similarity. Sequence similarity, however, can result from concerted evolution that maintains sequence similarity through frequent recombination among members of the same gene family [68, 69]. The presence of a high level of recombination in tRNAs indicates that the evolution of plant tRNAs for anti-codon switch and sequence similarity may be due to concerted evolution. A single point mutation in an anti-codon can result in the encoding of a different tRNA family. It would be interesting to understand the evolutionary constraints that lead to the generation of more members while others have fewer members. 
It has been previously reported that tRNALeu encodes a higher number of tRNA genes in the genome, a feature that is directly related to the higher number of tRNA isoacceptors in tRNALeu [17,18,19,20]. The question remains if purifying selection plays a role in maintaining a low level of certain tRNAs, such as tRNASec, tRNAHis, tRNATrp, and tRNATyr. It is plausible that this purifying selection might be responsible for maintaining the anti-codons of these tRNAs at non-optimal levels. A previous study reported that increasing the copy number of a low copy tRNA gene family in a cell results in proteotoxic stress due to problems in protein folding [51]. In addressing the need for environmental adaptation, tRNA isotypes provide evolutionary plasticity to changes in translational demand due to their presence as a multi-member gene family. A few species have lost tRNA genes for particular tRNA isotypes and anti-codon switch/point mutations of anti-codons may be a factor that contributes to maintaining the function of a genome in the complete absence of a particular gene family.

Selenocysteine (a selenium-containing cysteine analog) is co-translationally inserted in a small fraction of proteins (selenoproteins), a process that depends on tRNASec. Although Sec is found in all three domains of life, it is not universal. Approximately 20% of prokaryotic genomes encode selenoproteins, while in eukaryotes selenoproteins are reported to be more concentrated in the metazoan lineage [70,71,72,73]. The absence of selenoproteins in fungi and land plants has also been reported previously [74] and has been attributed to the lack of a tRNASec gene in their genomes. tRNASec decodes the UGA codon, which otherwise functions as a stop codon. A highly sensitive and efficient method of tRNA identification is therefore needed to find tRNASec. The lack of suitable identification techniques may be the main reason why tRNASec genes were reported to be absent from fungal and plant genomes. Using current technology, however, we were able to identify tRNASec genes in a few of the genomes of the analysed plant species.

Conclusion

The repertoire of tRNAs has a significant impact on the fitness of an organism. The frequency (abundance) of anti-codons, which explains synonymous codon usage in coding genes, has nevertheless remained unexplored. Anti-codon frequency can be directly attributed to the frequency of synonymous codon usage, and an anti-codon table of the Plant Kingdom, along with the percent abundance of each anti-codon, can be very helpful for understanding the relationship between codon and anti-codon frequency in the genome. The tRNASec gene for the 21st amino acid, selenocysteine, has undergone duplication events along with an anti-codon switch. Understanding the mechanisms involved in the loss of tRNA genes in a few species may be crucial to deciphering the translation mechanism in these species. The anti-codons GCG (Arg), GAG (Leu), ACU (Ser), and GGG (Pro) were very low in abundance and appear to be the rarest anti-codons in the Plant Kingdom. Yona et al. reported that multiple copies of rare tRNAs are deleterious to a cell [51], which suggests that large copy numbers of the GCG, GAG, ACU, and GGG anti-codons may be deleterious to plant cells. This may explain why a very low number of these anti-codons are encoded in plant genomes. A few species have completely lost specific tRNA isotype genes from their genome. Additionally, a previous study also reported the loss of tRNA genes in some plant genomes [75].