Genome-wide identification and functional prediction of silicon (Si) transporters in poplar (Populus trichocarpa)

Silicon (Si) enhances plant tolerance to various biotic and abiotic stressors such as salinity, drought, and heat. In addition, Si can be biomineralized within plants to form organic carbon-containing phytoliths that can have ecosystem-level consequences by contributing to long-term carbon sequestration. Si is taken up and transported in plants via different transporter proteins such as influx transporters (e.g., Lsi1, Lsi6) and efflux transporters (e.g., Lsi2). Additionally, the imported Si can be deposited in plant leaves via silicification process using the Siliplant 1 (e.g., Slp1) protein. Functional homologs of these proteins have been reported in different food crops. Here, we performed a genome-wide analysis to identify different Si transporters and Slp1 homologs in the bioenergy crop poplar (Populus trichocarpa Torr. and A. Gray ex W. Hook). We identified one channel-type Si influx transporter (PtLsi1; Potri.017G083300), one Si efflux transporter (PtLsi2; Potri.012G144000) and two proteins like Slp1 (PtSlp1a; Potri.004G168600 and PtSlp1b; Potri.009G129900) in the P. trichocarpa genome. We found a unique sequence (KPKPPVFKPPPVPI) in PtSlp1a which is repeated six times. Repeated presence of this sequence in PtSlp1a indicates that this protein might be important for silicification processes in P. trichocarpa. The mutation profiles of different Si transporters in a P. trichocarpa genome-wide association study population identified significant and impactful mutations in Potri.004G168600 and Potri.009G129900. Using a publically accessible database (http://bar.utoronto.ca/eplant_poplar/), digital expression analysis of the putative Si transporters in P. trichocarpa found low to moderate expression in the anticipated tissues, such as roots and leaves. Subcellular localization analysis found that PtLsi1/PtLsi2 are localized in the plasma membrane, whereas PtSlp1a/PtSlp1b are found in the extracellular spaces. Protein–Protein interaction analysis of PtLsi1/PtLsi2 identified Delta-1-pyrroline-5-carboxylate synthase (P5CS) as one of the main interacting partners of PtLsi2, which plays a key role in proline biosynthesis. Proline is a well-known participant in biotic and abiotic stress tolerance in plants. These findings will reinforce future efforts to modify Si accumulation for enhancing plant stress tolerance and carbon sequestration in poplar.


Introduction
Silicon (Si) is the second-most abundant (28.8%) element in Earth's continental crust after oxygen (Kaur and Greger 2019). The uptake and accumulation of Si in plants is associated with a number of benefits (Fig. 1). For example, Si enhances plant tolerance to various abiotic (e.g., drought, salinity, metal toxicity) and biotic (e.g., pathogen, herbivory) stressors (Souri et al. 2020;Gaur et al. 2020;Ahanger et al. 2020). The accumulation of Si in plants as carbon-containing phytoliths (Si-based structures in plant cells) also contributes to long-term carbon sequestration (Hans Wedepohl 1995). However, Si is considered a non-essential element for most plants. Plants acquire Si as silicic acid [Si(OH) 4 ] from the soil through the root system. Silicic acid is then translocated to the aboveground tissues and organs and deposited as phytoliths through the process of biosilicification (Kaur and Greger 2019).
The uptake, transport and deposition of Si from soil to a plant is mediated by a cooperative action of various transporter genes, such as channel-type influx (e.g., Lsi1/Lsi6) and efflux (e.g., Lsi2) transporters (Kaur and Greger 2019;Nawaz et al. 2020). The first Si transporter gene identified in plants was rice (Oryza sativa L.) low silicon 1 (OsLsi1) (Ma et al. 2006). Functional homologs of OsLsi1 have since been identified in many plant species, including in barley (Hordeum vulgare L.) (Chiba et al. 2009), pumpkin (Cucurbita moschata Duch.) , wheat (Triticum aestivum L.) (Montpetit et al. 2012), soybean (Glycine max L.) (Deshmukh et al. 2013), maize (Zea mays L.) (Mitani et al. 2009b) and cucumber (Cucumis sativus L.) (Sun et al. 2017). Lsi1 belongs to the Nodulin 26-like intrinsic protein (NIP) III subgroup of the aquaporin gene family (Ma and Yamaji 2015). This gene is involved in Si uptake from the soil into root cells in both eudicots and monocots. Functional characterization of the Lsi1 protein in different plant species shows that Si specificity in plants is linked to the two highly conserved Asn-Pro-Ala (NPA) motifs and an aromatic/arginine (ar/R) region formed by four amino acid (AA) residues that act as a selectivity filter (SF). The SFs consist of Gly-Ser-Gly-Arg (GSGR) or Ser-Thr-Ala-Arg (STAR) residues (Mitani-Ueno et al. 2011b;Grégoire et al. 2012). In addition, a precise spacing of 108 AAs between the two NPA motifs is essential for Si permeability in plants (Deshmukh et al. 2015). Lsi2, an efflux transporter gene which was first identified in rice, transports Si from root cells to the root xylem tissue. It belongs to an uncharacterized putative anion transporter protein family. Homologs of Lsi2 have been reported in several plants, such as barley, maize (Mitani et al. 2009a) and pumpkin (Mitani-Ueno et al. 2011a).
Si in xylem sap is deposited as monosilicic acid which is then unloaded by Lsi6, a homolog of Lsi1 (Yamaji et al. 2008). Lsi6 subsequently distributes Si to the aboveground -Forming osmoprotectants (e.g., proline) -Synthesizing heat shock proteins (HSP) -Reducing oxidaƟve damage -Decreasing transpiraƟon rate by thickening cell walls -Increasing stem strength to prevent lodging -Reducing electrolyte leakage -Increasing thermal stability of cell membranes BioƟc stressors (e.g., pathogens, insects): -Producing natural defense compounds (e.g., phytoalexins) -AƩracƟng natural predators and parasitoids during pest aƩack -Strengthening cell walls to prevent pathogen entry

Si content can affect:
-Bioprocessing of bioenergy feedstocks due to low solubility in water and high stability of silica -Carbon sequestraƟon by occluding organic carbon within Si-rich phytoliths tissues where it is deposited as amorphous Si(SiO 2 ) in phytoliths (Kumar et al. 2017). In some plants, there are specialized cells called silica cells (a leaf epidermal cell) where Si is deposited. The molecular mechanism of silica deposition has recently been uncovered in Sorghum with the identification of the protein Siliplant 1 (Slp1) (Kumar et al. 2017). Slp1 contains a short peptide (HKKPVPPKP-KPEPK) that controls the silicification process in Sorghum . A generalized scheme of Si uptake, transport and deposition in plants is depicted in Fig. 1. The expression and subcellular localization of Lsi1 and Lsi2 in roots varies in different plant species, despite these transporters having similar functions. In rice roots, OsLsi1 is found on the distal side of root exodermal and endodermal cells (Ma et al. 2006(Ma et al. , 2007. On the other hand, the maize homolog of OsLsi1, ZmLsi1, is found in seminal and crown roots on the distal side of root epidermal and hypodermal cells, as well as in the cortical cells of lateral roots (Mitani et al. 2009b). In barley, HvLsi1 is localized on the distal side of epidermal and cortical cells in seminal roots and on the distal side of hypodermal cells in lateral roots (Chiba et al. 2009). Like Lsi1, subcellular localization of Lsi2 varies by plant species. For example, in rice, OsLsi2 is localized to the proximal side of root exodermal and endodermal cells (Ma et al. 2006(Ma et al. , 2007, while maize and barley, Lsi2 transporters are only localized to root endodermal cells and show no polar distribution (Mitani et al. 2009a). The difference in cellular localizations of Si transporters in different plant species may be associated with their Si uptake ability. Using a mathematical simulation model, Sakurai et al. (2015Sakurai et al. ( , 2017 concluded that the most efficient Si uptake strategy is with OsLsi1 and OsLsi2 localized to the exodermis and endodermis with a polar distribution. This model simulation is, therefore, predicting that rice a high Si accumulator. To test such predictions, Si transporters need to be experimentally identified and characterized across multiple plant species to better understand how localization affects Si uptake and accumulation. As described above, previous studies on Si accumulation have focused on food crops (e.g., barley, rice, wheat, maize, soybean) (Souri et al. 2020;Ahanger et al. 2020). However, our knowledge on the genetic variability and molecular mechanisms of Si accumulation in bioenergy crops such as switchgrass (Panicum virgatum L.) and poplar (Populus trichocarpa Torr. & A. Gray ex W. Hook) is limited. While Si uptake by plants can have multiple benefits (Fig. 1), there may be negative impacts on the utilization of these crops for bioenergy production as Si in plant tissues with high Si content may inhibit the bioprocessing of bioenergy feedstocks. Thus, manipulating the phytolith-forming pathway, including uptake, transport, and translocation of Si in plants, will directly influence plant productivity and potentially extend to influence bioprocessing and ecosystem-level processes such as carbon sequestration (Fig. 1). In this study, we conducted an in-silico genome analysis to identify the putative transporter genes involved in Si uptake, transport, and deposition in poplar, a predominant bioenergy crop. 2) were obtained from Phytozome v13 (https:// phyto zome-next. jgi. doe. gov/) (Goodstein et al. 2012). The protein sequences translated from the CDSs of individual plant species were combined into a single protein sequence file in fasta format. Then 'all-against-all' sequence alignments in this combined protein sequence dataset were performed using DIAMOND BLASTP (Buchfink et al. 2015) with parameters of "-e 1e-5 -f 6 -k 1000000". The BLASTP output was used for clustering the protein sequences into tribes using TRIBE-MCL (Enright et al. 2002), with various inflation values (i.e., 1.1, 1.2, 1.3, …, 2.0).

Phylogenetic analysis
Phylogenetic analysis of protein sequences was performed as described previously (Yang et al. 2017). Specifically, protein sequences of the six plant species in each gene family containing known Si transporters were aligned using MAFFT (v7.453) (Katoh and Standley 2013) or CLC sequence viewer (http:// www. qiagen. com), with the setting of "auto". The protein sequence alignment was edited using Jalview software (Waterhouse et al. 2009). The protein sequence alignment for each gene family was used to infer maximum-likelihood phylogenetic tree using IQ-TREE 2 (version 2.1.2) (Minh et al. 2020), with the setting of "-B 1000". Finally, the phylogenetic trees were visualized using MEGA X (version 10.1.6) (Kumar et al. 2018).

Gene ontology analysis and pathway annotation
Gene ontology (GO) analysis of protein sequences was performed using DeepGOPlus (version 1.0.1) (Kulmanov and Hoehndorf 2020), with a prediction threshold of 0.3. Pathway annotation of protein sequences in each gene family containing known Si transporters was performed on the KEGG Blast KOALA (Kanehisa et al. 2016), with the KEGG GENES database as "family_eukaryotes".

Gene expression analysis
The gene expression data for the Si transporter genes in P. trichocarpa were obtained from ePlant Poplar (http:// bar. utoro nto. ca/ eplant_ poplar/) (Waese et al. 2017). Briefly, Affymetrix expression data were normalized by the GCOS method, with a TGT value of 500. Duplicate or triplicate samples were analyzed from greenhouse-grown or field-grown material (the latter in the case of the catkins). The seedings were grown on moist filter paper. All materials were grown under a diurnal cycle of 12 h light/12 h dark and sampled at midday, except for the xylem samples, which were sampled at midnight.

Protein 3D structure analysis
The three-dimensional (3D) structures of the Si transporter and silicification proteins were determined with AlphaFold 2 on Google Colab's Notebook (Mirdita et al. 2022). This Colab notebook allows for the prediction of protein structure using a slightly simplified version of AlphaFold v.2.1.0. Colab notebooks are coupled to a special sequence search program, MMSeqs2, thus making this technology more powerful. The predicted structures were downloaded from Google Colab's server in protein databank (PDB) format and visualized with the FirstGlance in Jmol server (http:// first glance. jmol. org).

Analysis of gene duplications
MCScanX (Wang et al. 2012) was used to detect gene synteny among different plant species as well as segmental duplication within species. The tandem gene duplication information was obtained from PLAZA 4.5 (Van Bel et al 2018).

Analysis of gene networks
The protein sequences of P. trichocarpa were used to search the UniProt database through BLAST (https:// www. unipr ot. org/ blast/) to obtain the corresponding proteins with UniProt accession numbers. Then, the UniProt accession numbers were used as queries to search the STRING (Szklarczyk et al. 2021) database for genes connected to each query gene.

Genome-wide association studies (GWAS)
Mutation profiles for each gene were obtained from wholegenome resequencing of 917 unrelated P. trichocarpa genotypes that were used to assemble the GWAS as described by Muchero et al. (2018). Expression analysis for the GWAS panel was performed using RNA-seq as previously described (Zhang et al. 2018). In the work by Muchero et al. (2018) and Zhang et al. (2018), 533 xylem and 470 leaf samples were collected from 4-year-old trees from the Clatskanie field site, Oregon, USA, while 529 root samples were collected greenhouse-grown plants at Oak Ridge National Laboratory, Tennessee, USA.

Channel-type Si influx transporters (SITs)
Until now, two channel-type SIT genes (Lsi1/Lsi6) have been reported in different plant species such as rice, wheat, maize, barley and soybean (Souri et al. 2020). Here, we performed an in-silico analysis to identify the putative channel-type SITs in poplar. We first compared the protein coding genes of A. thaliana, A. trichopoda, O. sativa, P. trichocarpa, S. bicolor and S. lycopersicum at the whole-genome level to identify the putative channel-type SITs in poplar. Phylogenetic analysis of the whole-gene family containing the channel-type SITs (i.e., Lsi1/Lsi6) grouped them into four distinct clades ( Fig. 2A). Further analysis of the subfamily containing the known SIT genes of rice (OsLsi1/OsLsi6) revealed that this clade contained 11 putative SITs in poplar (Fig. 2B). Out of the 11 putative SITs found in the poplar genome, only Potri.017G083300.1 (here referred to as PtLsi1) was grouped into the clade containing the rice SIT, indicating that this gene might be the candidate channel-type SIT in poplar. Further sequence analysis showed that PtLsi1 is ~ 55% similar to OsLsi1 and ~ 53% similar to OsLsi6 at the AA level.
To examine the presence of NPA motifs and the SF in the putative PtLsi1, we aligned the AA sequences of rice channel-type SIT, i.e., OsLsi1/OsLsi6 with the putative PtLsi1 identified in this study. The alignment (Fig. 3A) shows that PtLsi1 contains both NPA motifs and SF, and their locations match the positions of NPA/SF reported in OsLsi1/ OsLsi6. Similar to OsLsi1/OsLsi6, the distance between the two NPA motifs in PtLsi1 is 108 AA. The channel-type SITs in plants have six transmembrane domains (TMDs) (Wallace and Roberts 2004;Ma et al. 2006;Nawaz et al. 2020). We also found six TMDs in PtLsi1 (Fig. 3B), indicating that PtLsi1 is a membrane-bound protein and has an active role in transport across cells. Protein topology analysis showed that the N-terminus and C-terminus of PtLsi1 are located inside the intracellular space whereas in OsLsi1, these regions are in the extracellular space (Fig. 3B). Subcellular localization of PtLsi1 was determined using three different tools, including Wolfpsort (Horton et al. 2007), Cello (Yu et al 2006), and DeepLoc (Almagro Armenteros et al. 2017), and all tools predicted that PtLsi1 is a membrane-bound protein. A previous report in pumpkin  showed that Si transport activity is hampered by a prolineto-leucin mutation in CmLsi1 at position 242 resulting in a mislocalization to the endoplasmic reticulum instead of the plasma membrane, which is essential for its activity. This mutation, however, was not found in PtLsi1.
A previous study (Vatansever et al. 2017) found that sequence similarity and Si transport ability can be used to infer accumulation capacities of different plant species. Applying a similar approach to our findings, it is likely that PtLsi1 is a high Si accumulator given sequence similarity to the high accumulating rice homolog (Vatansever et al. 2017). Indeed, a previous study (Deshmukh et al. 2015) identified "Potri.017G083300.1" as the candidate PtLsi1 in P. trichocarpa and showed that it can accumulate high levels of Si in Xenopus (clawed frog) eggs, similar to that observed with rice OsLsi1. However, when assessed in poplar plants, Deshmukh et al. (2015) found moderate levels of Si accumulation. This might be explained by the low expression of PtLsi1 in poplar roots (see section "Gene expression pattern of Si transporters in poplar" for details). One interesting observation is that the number of influx transporters is highly variable in different plant species. For example, rice contains two influx transporters (Lsi1, Lsi6) whereas maize has as many as four homologs, and both rice and maize accumulate high levels of Si. This is also the case for sunflower (Helianthus annuus L.) which accumulates Si and contains more than one channel-type influx transporter. Whether the number of Si transporters affects a plant species' ability to take up Si has yet to be verified experimentally. Although rice has two Si influx transporter genes, only one homolog of Lsi1 was identified in P. trichocarpa. We did not find any homologs of Lsi6 in P. trichocarpa, which is involved in unloading Si from xylem and transporting it to multiple aboveground tissues. Interestingly, it has been shown that the absence of Lsi6 has no negative effect on Si uptake but can affect the pattern of Si deposition in leaves (Ma et al. 2011;Kaur and Greger 2019).

Si efflux transporter
Lsi2 is the only efflux transporter identified in plants to date. It belongs to an uncharacterized anion transporter family. Our genome-wide analysis identified 22 genes in this family, including five homologs in poplar (Potri.012G144000.5, Potri.015G146900.5, Potri.017G141600.3, Potri.017G 141800.2, Potri.017G141950.1). This phylogenetic analysis grouped the 22 genes into six distinct clades (Fig. 2C). The  clade containing the rice Lsi2 (OsLsi2; LOC_Os02g51110) harbors only one poplar gene Potri.012G144000 (here referred to as PtLsi2). We next aligned the protein sequences of five poplar homologs along with OsLsi2 to visualize their conserved regions (Fig. 4A). The alignments showed that these genes are highly conserved over the full-length of the protein sequences except at positions 230-375. The similarity between the OsLsi2 and the putative PtLsi2 is ~ 50% at the AA level. We found five conserved motifs in the putative Lsi2s (except Potri.017G141950.1) in poplar and in OsLsi2 (Fig. 4B).
Previous studies found that the Lsi2 protein in plants is localized in the plasma membrane, featuring 9-11 TMDs, with extracellular N-and extracellular/cytoplasmic C-terminus regions (Vivancos et al. 2016;Vatansever et al. 2017;Nawaz et al. 2020). Similarly, we identified 9-11 TMDs in the putative poplar Lsi2 homologs except Potri.017G141950.1, which contained five TMDs (Supplementary Fig. 1A). Only Potri.012G144000.5 contained a signal peptide ( Supplementary Fig. 1), further supporting the notion that this gene is the putative poplar Lsi2 homolog corresponding to OsLsi2. A recent study (Nawaz et al. 2020) identified a highly conserved region with a consensus sequence "XXTKHXWFXXCXXXXRX" among the Lsi2 proteins in different plant species such as rice, wheat, and Sorghum. This conserved sequence, however, was not found in the Lsi2 homologs in poplar. Many plants with high Si accumulation capacity, such as Equisetum (horsetail) and pumpkin, do not contain this conserved sequence either (Mitani-Ueno et al. 2011a;Grégoire et al. 2012). Therefore, this conserved sequence might not have any functional importance, but further characterization is warranted. Lsi2 is not an extensively characterized protein like Lsi1, and thus there is a little information available for this protein.

Siliplant 1 (Slp1) protein
Our genome-wide analysis identified 27 proteins in the studied plant species which belong to the Slp1 protein family. Out of the 27 proteins, there were only two poplar proteins which will be referred to as PtSlp1a (Potri.004G168600.10) and PtSlp1b (Potri.009G129900.6) in the remaining text. The isoelectric point (pI value) of these proteins was computed as positively charged AAs are involved in biosilicification (Otzen 2012;Nawaz et al. 2020;Kumar et al. 2020). The predicted pI values of the poplar Slp1 homologs were 9.96 and 9.78 (Table 1) for PtSlp1a and Ptslp1b, respectively. Similar to the known Slp1 protein in Sorghum, PtSlp1a and PtSlp1b are rich in proline (23.5% in PtSlp1a and 32.2% in PtSlp1b) and lysine (13.7% in PtSlp1a and 15.4% in PtSlp1b) (Table 1). However, they are poor in histidine (3.7% in PtSlp1a and 2.1% in PtSlp1b), whereas the known Sorghum Slp1 protein is rich in histidine . We also tested whether these proteins have signal peptides because Sorghum Slp1 is a secretory protein ). Both poplar Slp1 homologs contain a signal peptide ( Supplementary Fig. 1B).
The reported peptide sequence for in vitro silica precipitation is HKKPVPPKPKPEPK ). In addition, the well-characterized Sorghum Slp1 protein has three conserved domains, namely the A-, B-and C-domains. The A-domain is rich in histidine and aspartic acid (i.e., H, D-rich) and the consensus sequence is DXF-HKXHDYHXXXXHFH, whereas the B-and C-domains are rich in proline, lysine, glutamic acid (P, K, E-rich) and proline, threonine, and tyrosine (P, T, Y-rich) with consensus sequences KKPXPXKPKPXPKPXPXPX and YHX-PXPTYXSPTPIYHPPX, respectively ). We looked for the presence of these features in the putative Slp1 poplar homologs. We found that both PtSlp1a and  is in pink, B-domain is in sky blue, C-domain is in red). B Repeat sequences found in the two poplar Slp1 homolog proteins identified in this study PtSlp1a (Potri.004G168600.10) and PtSlp1b (Potri.009G129900.6) the P, K, E-rich and P, T, Y-rich regions which are partially conserved relative to the Sorghum Slp1 protein (Fig. 5A). Absence of the H, D-rich domain suggests that PtSlp1a or PtSlp1b might be a moderate Si accumulator as described by ). Finally, we checked the PtSlp1a/ PtSlp1b protein sequences for the presence of any repeat AA sequences as the Sorghum Slp1 contains multiple repeats (Kauss et al. 2003;Kumar et al. 2020). We found that PtSlp1a contains a sequence (KPKPPVFKPPPVPI) repeated six times, whereas PtSlp1b harbors a sequence (PKPKPPIFK) which is repeated three times (Fig. 5B).
The AA lengths of PtSlp1a and PtSlp1b are also different. PtSlp1a has 513 AA whereas the length of PtSlp1b is 336 AA. The length of the Sorghum Slp1 protein is 524 AA which is almost equivalent to PtSlp1a. These results suggest that PtSlp1a is the best candidate Slp1 homolog in poplar.

Gene expression pattern of Si transporters in poplar
In plants with high Si accumulation capabilities, channeltype SIT genes (i.e., Lsi1/Lsi6) and Lsi2 are predominantly expressed in root and stem tissue (Jadhao et al. 2020), consistent with their role of Si uptake by roots, whereas Slp1 is expressed in the inflorescence and leaves . We examined the expression pattern of the putative channel-type SITs, Lsi2, and Slp1 homologs in poplar using gene expression data from the ePlant poplar (http:// bar. utoro nto. ca/ eplant_ poplar/) database. We found that the expression levels of both PtLsi1 (Potri.017G083300) and PtLsi2 (Potri.012G144000) were lower in the root than in the other parts of the plant (Fig. 6A), such as leaves, suggesting that poplar is a weak Si accumulator. PtSlp1a (Potri.004G168600) is highly expressed in the inflorescence and young leaves (Fig. 6A), similar to the Sorghum Slp1 gene (Nawaz et al. 2020;Kumar et al. 2020). Furthermore, the expression of PtSlp1a was found to be much higher than that of PtSlp1b (Potri.009G129900) in multiple tissue types (Fig. 6A). This expression pattern suggests that PtSlp1a might be the best candidate Slp1 homolog in poplar, which is consistent with our inference based on the analysis of protein sequences in section "Siliplant 1 (Slp1) protein". Similar to observations above, analysis of expression patterns across the P. trichocarpa GWAS population also revealed that expression levels of these four genes was highest in leaves followed by roots and was negligible in xylem tissues (Fig. 6B). Moreover, there was significant variation across genotypes for each gene. When evaluated across the P. trichocarpa tissue Atlas expression profiles, Potri.004G168600 consistently had the highest expression across conditions followed by Potri.009G129900. Across all conditions, Potri.017G083300 consistently showed low expression levels (Fig. 6C).

Potential interaction partners of PtLsi1 and PtLsi2
We identified the putative interaction partners of PtLsi1/ PtLsi2 using public data in the STRING database (Szklarczyk et al. 2021). The putative interaction partners of PtLsi1 (Potri.017G083300) include different types of ion transporters, such as POPTR_0013s05030.2 (Fig. 7A), which is an aquaporin protein. This protein belongs to the major intrinsic protein (MIP) family and small basic intrinsic protein (SIP) subfamily and is involved in the transport of water and ions into the cell (Johanson and Gustavsson 2002;Ishikawa et al. 2005). Other important interaction partners of PtLsi1 are POPTR _0005s20790.1 and POPTR_0002s07550.2, which are involved in a range of metabolic pathways. The interaction partners of PtLsi2 include Delta-1-pyrroline-5-carboxylate synthase (POPTR_0008 s06060.1, POPTR_0010s20590.1), aminomethyltransferase (gdcT1/ gdcT2), and the 26S proteasome regulatory subunit family protein. Among these interaction partners, Delta-1-pyrroline-5-carboxylate synthase (P5Cs) plays a key role in proline biosynthesis in plants which helps protect the plant from different biotic and abiotic stressors (Hayat et al. 2012;Dar et al. 2016). This suggests that PtLsi2 may play a role in stress response.

Evolution of the Slp1 gene family
Gene duplication plays an important role in plant evolution. Specifically, gene family expansion and genomic evolutionary mechanisms mainly depend on gene duplication events. In this study, gene duplication events were identified in the Slp1 gene family using a bioinformatics analysis. All identified Slp1 genes were mapped to their corresponding chromosomes, as shown in Fig. 8. Some of these genes showed tandem gene duplications such as Sobic.001G265900 versus Sobic.001G266100 in Sorghum, Sobic.001G438400 versus Sobic.001G438500 in Sorghum, and LOC_Os03g14130 versus LOC_Os03g14140 on rice chromosome 3 (Fig. 8).
The expansion of the Slp1 gene family partially resulted from segmental duplications, such as a segmental duplication on Sorghum chromosome 1 (e.g., the segment containing Sobic.001G265900 and Sobic.001G266100 versus the segment containing Sobic.001G438400 and Sobic.001G438500) as well as a segmental duplication between the two poplar chromosomes (the segment containing Potri.004G168600 versus the segment containing Potri.009G129900) (Fig. 8). There were clear syntenic genomic blocks conserved among different plant species, such as the block containing LOC_Os03g14130 and LOC_Os03g14140 in rice and the block containing Sobic.001G438400 and Sobic.001G438500 in Sorghum.
These results indicate that the evolution and expansion of the Slp1 gene family was mediated by tandem gene duplication and segmental duplications within the same chromosome and between different chromosomes.

Mutation profiles of Si transporters across the P. trichocarpa GWAS population
Among the poplar Si transporters identified in this study, Potri.004G168600 had the highest number of mutations with three significant mutations based on SNPeff analysis. This gene had a codon deletion mutation at position Chr04:18772192, a non-synonymous mutation (G > A) at position Chr04:18772384, and another nonsynonymous mutation (T > C) at position Chr04:18772686. Potri.009G129900 had the highest impact mutation at position Chr09:10591818 where a cytosine insertion created a frame-shift mutation (GTT > GCTT). The frame-shift mutation occurred only in the heterozygous condition in 103 of the 917 P. trichocarpa genotypes evaluated. The absence of the homozygous state of this allele might suggest deleterious effects if protein function is significantly affected. The second-most impactful mutation occurred in Potri.012G144000 at position Chr12:15654656 with a four-nucleotide deletion from the reference allele creating an alternative splice site (CAT ATA TA > CATA). Of the 917 genotypes, 450 carried the CATA allele in the homozygous state while 366 were homozygous and 101 individuals carried the CAT ATA TA minor allele in a homozygous state. The relative lack of high-impact mutations in these genes might be indicative of their essential role in P. trichocarpa biology. Genotypes exhibiting high-impact mutations identified from this study can be further characterized to evaluate the putative impacts on trait expression. Potri.017G083300 and PtLsi2; Potri.012G144000) and the candidate silicification proteins (PtSlp1a; Potri.004G168600 and PtSlp1b; Potri.009G129900). A Digital expression patterns obtained from gene expression data available at http:// bar. utoro nto. ca/ eplant_ poplar/. B Expression profiles of different Si transporters across the P. trichocarpa GWAS population. C Expression profiles of Si transporters across different tissue types ◂ Fig. 7 Predicted interaction partners of the putative Si transporters in poplar. A PtLsi1 (Potri.017G083300; UniProt accession# B9IJY1) and B PtLsi2 (Potri.012G144000; UniProt accession# U5FXI4). The network was obtained from the STRING database (Szklarczyk et al. 2021) Protein 3D structure analysis Figure 9 shows the 3D structures of the putative Si transporter and silicification proteins of poplar. The 3D structures were predicted with Alphafold and the accuracy of these structures was determined by the predicted local distance difference test (pLDDT) score. pLDDT scores range from 0 to 100 and a higher value indicates a more accurate prediction. Another indicator for assessing the Alphafold predicted protein structure is the predicted aligned error (PAE) which assesses the confidence in the domain packing and largescale topology of the protein. PAE is presented as an interactive 2D plot where dark blue indicates small error and red indicates large error. Alphafold detected a well-defined 3D structure (Fig. 9A) for most of the PtLsi1 with high confidence (pLDDT > 80; Fig. 9H) except the C-terminus (residue 1-32) which was depicted as a ribbon-like structure with a low pLDDT score (pLDDT < 50; Fig. 9G). This ribbonlike structure is likely to be an intrinsically disordered region (IDR). The PtLsi1 protein also exhibited high structural similarity with OsLsi1 except the IDR which is detected in the N-terminus in OsLsi1. Disordered protein domains are often important for intracellular signaling and can transition from a disordered to an ordered state upon binding to other proteins (Ruff and Pappu 2021;David et al. 2022 Fig. 9 3D structures of the Si transporter and silicification proteins in poplar and rice. A-G Alphafold predicted 3D structures with the interactive 2D plots of predicted aligned error (PAE). Dark blue in the PAE plot indicates smaller errors whereas red indicates larger errors. H Interactive 2D plot of predicted local difference distance (pLDDT) scores for different models. The pLDDT plot gives the best information on intra-domain confidence with a value > 90 indicative of a high score different binding partners or function differently. PtLsi2 also showed a well-folded structure predicted by Alphafold and was highly similar to the OsLsi2 3D structure (Fig. 9C, D) with a pLDDT score > 80 (Fig. 9G). Interestingly, the IDR detected in both proteins were found in very similar positions (residues 211-267 in PtLsi2 and 211-295 in OsLsi2). The high structural similarity along with a similar location of the IDR implies that PtLsi2 is the candidate homolog for OsLsi2 in poplar. For silicification proteins, both the candidates in poplar (PtSlp1a and PtSlp1b) and the known SbSlp1 in Sorghum showed highly disordered structures in the Alphafold-predicted 3D model (Fig. 9E-G) with pLDDT scores < 50 (Fig. 9G). Comparing PtSlp1a and PtSlp1b with SbSlp1, it appears that PtSlp1a is structurally more similar to SbSlp1 indicating that PtSlp1a may be the candidate silicification protein in poplar.
The quality and accuracy of the Alphafold prediction model varies. The pLDDT score is a good guide for model quality and it correlates well with model accuracy when assessed against an experimentally determined structure. However, further experimental characterization and structure-based mutational studies are required to validate the structural model. Another limitation of Alphafold is that the predictions yield a static picture of protein structure and contain little or no information on protein dynamics. Conformational changes in proteins are often key to their function and regulation and are frequently triggered by binding to other molecules (e.g., proteins or small molecules/ligands) or by post-translational modifications. The current prediction algorithm is unable to model these conformational changes and is therefore largely unable to provide a crucial aspect of understanding the protein structure-function relationship (Jumper et al. 2021;van Breugel et al. 2022).

Conclusion
Previous studies on Si transporters mainly focused on monocot plants, particularly food crops. In this study, we identified transporter genes putatively involved in Si uptake, transport, and deposition in the woody bioenergy crop, P. trichocarpa. Further, we identified putative interaction partners and expression profiles of Si transporter genes and have described the possible evolution of Si transporter proteins in P. trichocarpa. To the best of our knowledge, this study represents the first attempt to provide an in-depth analysis of Si transporters in P. trichocarpa. The current findings expand our understanding of Si transporters in woody plants and provide knowledge that will greatly enhance genetic engineering strategies aimed at improving carbon sequestration and biomass quality of bioenergy feedstocks. Specifically, different Si transporter genes can be combined and expressed together in P. trichocarpa to examine their effects on phytolith formation. An additional next step would examine if the expression of rice Lsi6, the homolog that was not identified in poplar, improves phytolith formation in P. trichocarpa.
Author contributions MMH, XY and DW conceived the idea. MMH, XY, WM, KF and TBY analyzed the data. MMH, SM, WM and XY led the writing and revision of the manuscript. DW, SM, NAG, GY, MZM contributed to the manuscript revision. All authors accepted the final version of the manuscript.
Funding This work was supported by the Laboratory Directed Research and Development program of Oak Ridge National Laboratory. This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http:// energy. gov/ downl oads/ doe-public-access-plan).

Conflict of interest The authors declare no conflict of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http:// creat iveco mmons. org/ licen ses/ by/4. 0/.