INTRODUCTION

Representatives of the poplar genus (Populus) differ from other woody plants in their high rate of biomass formation [1]; therefore, the poplar can be considered as a source of relatively cheap raw materials for the production of paper and other wood products, as well as a plant with great potential for use in phytoremediation [2]. Poplars are actively used in landscaping, but for this purpose it is necessary to use male plants, since poplar seeds, the so-called poplar fluff, can negatively affect people. Down absorbs pollutants from the air, as well as pollen allergens, thereby increasing allergic reactions in sensitive people and worsening their quality of life. In addition, fluff, by irritating the mucous membranes of the nasopharynx, can contribute to the development of diseases of the upper respiratory tract [3]. Evidence has also been obtained that female poplar plants are less resistant to a number of stress influences [4]. All this indicates the need to use male poplar plants in urban landscaping.

However, the poplar, like other dioecious plants, has a complex system of sex determination, which can vary significantly in different taxonomic groups [5]. It is assumed that in the process of evolution, dioecy arose repeatedly and independently within different taxa [6]. The complexity of the sex-determining region (SDR) structure has been influenced by genome duplication events and the spread of transposable elements [7–9], which have contributed to the complete or partial loss of duplicated genes. Moreover, the activity of the SDR locus can be controlled by epigenetic modifications [10].

The poplar can serve as a model woody plant for the study of SDR loci, which is facilitated by the possibility of conducting various types of studies [11], as well as the development of omics technologies, which make it possible to study the relationship between genotypic and phenotypic variability [12]. In addition, the structure of SDR loci and the functions of their key elements in the poplar have been studied in detail [13–15]. It is also important that in the poplar sex is regulated by only one gene, the ARR16/ARR17 orthologue (Arabidopsis response regulator 16/17) [16].

This review describes the emergence, development, structure, and function of SDRs in members of the family Salicaceae, with greater emphasis on the genus Populus. In addition, modern data are presented on the molecular mechanisms of regulation of sex choice in the poplar and on the possible role in this process of genes whose expression differs in male and female plants.

THE PRACTICAL SIGNIFICANCE OF DETERMINING SEX IN THE POPLAR

Due to their ability to accumulate heavy metals (cadmium, lead, copper, zinc, etc.), as well as mercury [2, 17], various poplar species can be used for air purification in large populated areas, especially those with industrial facilities. Poplars can also be used to assess the pollution of urban areas with heavy metals [17]. The poplar is one of the fastest growing trees [1]; therefore, it forms biomass faster, absorbs more carbon dioxide, and releases more oxygen than other trees, i.e., it effectively improves the composition of the atmosphere in populated areas. In addition, work is underway to increase poplar productivity using genetic approaches [18]. However, the so-called poplar fluff, i.e., the seeds produced by female plants, may be a problem for city dwellers. Fluff itself is not allergenic, but it absorbs other allergens from the air, such as pollen. Moreover, fluff irritates the mucous membranes of the nasopharynx and can contribute to the development of chronic diseases of the upper respiratory tract [3]. Another negative property of fluff is the high fire hazard posed by its large accumulations.

Recent studies have shown that female plants are less resistant than male plants to environmental stressors such as drought [19, 20], increased salinity and soil alkalization [21], oxidative stress [4, 21], and heavy metals [20, 21], as well as to the action of fungal pathogens [23]. Increased resistance of male plants to stress conditions is associated with overexpression and increased activity of protective proteins. For example, under conditions that stimulate the formation of reactive oxygen species, in male poplars the activity of antioxidant defense enzymes such as peroxidase, superoxide dismutase, catalase, glutathione reductase, and ascorbate peroxidase is higher than in female plants [21, 24]. Differences in the activities of antioxidant enzymes may be associated with differences in the expression level of the genes encoding them [25], which, in turn, is due to differences in the methylation of these genes in male and female plants [26]. Thus, in females, genes belonging to the functional categories of proteolysis, oxidation–reduction, phosphorylation, transmembrane transport, and transcription regulation are hypermethylated [26]. It is possible that sex-related differences in the levels of methylation and expression of genes involved in stress responses may be due to sex-specific polymorphisms [27] and differential expression of microRNAs that control the response to stress [28]. Published data indicate that despite increased sensitivity to certain types of stress, female plants may be more resistant to certain combinations of stress influences. An example is the combination of drought and infection, which correlates well with the sex-specific composition of the microbiome associated with plants [29–31].

EVOLUTION OF THE SEX DETERMINATION SYSTEM IN SALICACEAE

The evolution of plants, including Salicaceae, is closely related to a series of successive whole genome duplications (WGDs) [16], which played a key role in the emergence of new taxa. Thus, the families Brassicaceae and Salicaceae, whose representatives are Arabidopsis thaliana and P. trichocarpa, respectively, split about 100–120 million years ago after β-WGD in their common ancestor, which lived about 125 million years ago [16]. In the further evolution of Salicaceae, the Salicoid WGD specific to this family played an important role, which occurred in the Paleocene (approximately 60–65 million years ago), before the division of the genera Salix (willow) and Populus (Fig. 1). As a result of this genomic duplication, the number of chromosomes in the haploid set increased from 7–10 to 16–21 (most modern poplars have 19 chromosomes). It is believed that this WGD allowed representatives of the genera Populus and Salix to occupy many ecological niches in the Northern Hemisphere [32]. Formation of dioecy and sexes (i.e., the SDR locus) in Salicaceae occurred after the Salicoid WGD about 45 million years ago [33–35]. The Salicoid WGD affected approximately 92% of the poplar genome, resulting in approximately 8000 pairs of paralogous genes [36], including genes in the SDR [37]. It is also possible that the complication of the SDR structure could occur due to doubling of genome segments [38] and tandem duplications [39]. As a consequence, each of the 19 chromosomes of the American poplar P. trichocarpa has extensive areas of homology with other chromosomes. For instance, chromosome 19, which contains the SDR, has significant similarity to chromosome 13 [36].

Fig. 1. The phylogeny and evolution of the sex determination system in Salicaceae. During plant evolution, plant genomes have undergone multiple genome-wide duplications, including ζ-WGD, ε-WGD, γ-WGT, τ-WGD, β-WGD, α-WGD, and Salicoid WGD. Representatives of the genera Salix and Populus diverged after the Salicoid WGD of ancestral forms. The genus Populus is divided into Abaso, Turanga, Populus, Leucoides, Aigeiros, and Tacamahaca sections, of which the last three are combined into the ATL group. Among poplars there are species that have both ZW and XY sex determination systems. In some poplars, for example, P. tremula, P. deltoides, and P. balsamifera, there are duplicated fragments of ARR16/17 gene orthologs inside the SDR.

The Salicaceae family comprises more than 50 genera and about 1000 species, of which 330–500 species belong to the genus Salix and 22–45 species to the genus Populus [40]. The genus Salix, which is traditionally used as a comparison group, consists of two clades that diverged during the Miocene. The most intensive speciation of Salix and Populus occurred in the Pliocene [40, 41]. The complexity of genome organization, ease of crossing of closely related species, and reticulate evolution make the study of poplar phylogeny difficult [42]. Traditionally, the genus Populus is divided into six sections: Abaso, Turanga, Leucoides, Populus, Tacamahaca, and Aigeiros. Whole-genome sequencing showed that the Abaso section was the first to separate, then Turanga, and then the Populus section diverged from the ATL group, which unites the Aigeiros, Tacamahaca, and Leucoides sections (Fig. 1).

In addition to WGD, transposable elements played a major role in the evolution of sex chromosomes and SDRs in Salicaceae [43]. The most common transposable elements found in the poplar SDR are DNA transposons of the Helitron group and long terminal repeat retrotransposons (LTR-RT) of the Copia and Gypsy superfamilies [39, 43]. Transposable elements, like other types of repeated sequences, can cause suppression of recombination and hypermethylation of genes located near the site of integration [7]. The associated transcriptional repression [44] may contribute to subsequent degeneration of genes at sex loci [8, 45] and, ultimately, changes in the structure and size of sex chromosomes [9]. In addition, the replication mechanism of Helitron-like transposons allows them to capture and move entire genes or their fragments [46]; thus, these transposons could contribute to an increase in the number of gene copies in SDRs [43].

In dioecious plants, there are both homomorphic (slightly different) and heteromorphic (very different in size and number of active genes) sex chromosomes. In species with heteromorphic sex chromosomes, two systems of sex determination are distinguished: XY and ZW. In plants with an XY system, males are heterogametic and females are homogametic, while in plants with a ZW system males are homogametic and females are heterogametic. Within the genus Populus species have been described that have both XY and ZW systems [43, 44] (Fig. 1).
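The relationship between the two systems described above can be summarized in a minimal Python sketch. The function name and the lookup table are illustrative assumptions and are not taken from the cited studies; the sketch only encodes the rule that the heterogametic sex is male under XY and female under ZW.

```python
# Illustrative lookup: which genotype corresponds to which sex in each system.
# XY: males are heterogametic (XY), females homogametic (XX).
# ZW: females are heterogametic (ZW), males homogametic (ZZ).
SEX_BY_GENOTYPE = {
    "XY": {"XX": "female", "XY": "male"},
    "ZW": {"ZZ": "male", "ZW": "female"},
}


def _normalize(genotype: str) -> str:
    """Make the genotype order-independent, e.g. 'YX' and 'XY' are equal."""
    return "".join(sorted(genotype.upper()))


def phenotypic_sex(system: str, genotype: str) -> str:
    """Return 'male' or 'female' for a genotype under the given system."""
    try:
        table = SEX_BY_GENOTYPE[system.upper()]
    except KeyError:
        raise ValueError(f"unknown sex determination system: {system!r}")
    normalized = {_normalize(g): sex for g, sex in table.items()}
    try:
        return normalized[_normalize(genotype)]
    except KeyError:
        raise ValueError(f"genotype {genotype!r} is not valid for {system!r}")


if __name__ == "__main__":
    print(phenotypic_sex("XY", "XY"))
    print(phenotypic_sex("ZW", "ZW"))
```

The same heterozygous state therefore corresponds to opposite sexes in the two systems, which is why the system must be known before sex-linked markers can be interpreted.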

Such heterogeneity indicates the independent origin of reproductive systems in species of the genus Populus, making sex determination even more difficult. Indeed, only sex determination caused by an ARR16/17 orthologue is conserved, whereas during evolution, rearrangements of sex chromosomes and the transition between the XY and ZW systems occurred repeatedly [55]. At least some willows (e.g., S. purpurea) have the ZW system. Sex determination in the Abaso section has not yet been studied. P. euphratica (section Turanga) and representatives of the ATL group are characterized by the XY system, and both systems were found in the Populus section [55, 59].

The transition from monoecy to dioecy is associated with the appearance of the SDR locus. Currently, two hypotheses for the origin of dioecious plants are dominant. The classic “two-gene model” suggests that the two sexes arose through the spread of two altered genes for male and female sterility. To avoid hermaphroditism or complete sterility, these mutations must be fixed on two homologous chromosomes [47]. This hypothesis is supported by the presence of two sex-determining genes in the SDR of the kiwi (Actinidia deliciosa) and a number of other dioecious plants [48–52]. The “single-gene model” proposed in 2016 postulates that the evolutionary transition from monoecy to dioecy was facilitated by a mutation in a single gene encoding a high-level regulator [53]. Through signaling pathways and effectors, the mutant regulator could control the formation of flowers of a certain sex [54]. There may be several genes involved in the selection of a particular sex, but only one, the main one, determines sex when “on” or “off” [53]. Experimental confirmation of the single-gene model was obtained for the genus Populus. It has been proven that PtRR9, an orthologue of ARR16/ARR17 in A. thaliana, is responsible for sex selection in P. tremula [55].

As a result of evolution, modern poplar species have large (most often about 100 kbp) [27, 56] SDR loci enriched in paralogous genes and repeat elements. Such a structure seriously complicates the determination of the sequence of the entire locus, the identification of its functionally important elements, and the study of molecular genetic mechanisms of sex determination [39, 57]. A striking example is P. trichocarpa, in which early studies suggested the existence of a ZW system, but later, using third-generation genome sequencing technologies (PacBio), it was convincingly shown that this species has an XY system [27].

THE GENERAL STRUCTURE OF THE SDR IN REPRESENTATIVES OF THE GENUS POPULUS

In the vast majority of poplar species, the SDR is located on chromosome 19. In P. trichocarpa and P. nigra (sections Tacamahaca and Aigeiros) the SDR is located in the subtelomeric region of the chromosome [27, 57], and in P. tremula, P. tremuloides, and P. alba (section Populus), in the pericentromeric region [14, 58]. It should be noted that the SDR of P. trichocarpa consists of two subtelomeric regions of chromosome 19: one is located on the left arm and the other on the right [39]. The SDR is mapped to the subtelomeric region of chromosome 14 only in P. euphratica, a representative of the Turanga section [43]. Depending on the method of genome analysis, species, and sex, the experimentally estimated average SDR length varies from 100 [27] to 140 kb [39, 56]. For example, in P. deltoides the length of the SDR on the X chromosome (X-SDR) is 100 kb, whereas the length of the SDR found on the Y chromosome (Y-SDR) is 140 kb [56]. The largest SDR, 1.71 Mb long, is found in the W allele of P. qiongdaoensis [59].

ARR16/ARR17 orthologs are considered key SDR genes. They are members of the family of cytokinin response regulators (RRs), which are effectors of the highly conserved cytokinin signal transduction pathway and regulate the processes of plant growth and development [43, 60, 61]. RR genes are divided by function and sequence into four types: A (RRA), B (RRB), C (RRC), and pseudo-RR (PRR) [62]. Poplar has 33 described RR genes belonging to three types: RRA, RRB, and PRR [37]. RRBs encode transcription factors with the Myb DNA-binding domain that respond positively to the activation of cytokinin receptors, whereas genes of the RRA group, which includes ARR16/ARR17 orthologs, encode negative regulators of cytokinin-dependent processes that do not have their own DNA-binding domain [16, 63–65]. PRRs encode transcription factors with an N-terminal dimerization domain and a C-terminal DNA-binding domain, which regulate circadian rhythms and the processes associated with them [37, 66].

In addition to ARR16/ARR17 orthologs, which control sex, poplar SDRs also contain other sex-linked genes, which are present in both the Y-SDR and the X-SDR and differ in sex-specific polymorphisms (Fig. 2). In early research on the SDRs of P. trichocarpa and P. balsamifera, 13 genes were discovered, including ACD1-LIKE, ATHEMA1, TCP-1/cpn60, ATCLC-C, MET1, RFL1, NB-ARC, and EGM1 [27]. However, the presence of only four known genes was later confirmed: TCP-1, CLC-c, MET1, and NB-ARC [39]. The difficulty in determining the gene composition of the SDR is associated with the abundance of transposable elements, which lead to read assembly artifacts and thereby make it difficult to determine the exact sequence of this genome region. The presence of TCP-1, CLC-c, and MET1 in the SDR has also been confirmed in P. deltoides [56], P. × sibirica [15], and P. tomentosa [25], and their differential sex-specific expression has been shown. The TCP-1 gene (tailless complex polypeptide 1) encodes a cytosolic eukaryotic protein from the group of chaperonins [67]. CLC (Chloride Channel) is a family of proteins including anion channels and anion/cation antiporters that regulate Cl⁻ and NO₃⁻ metabolism in different cellular compartments. In particular, CLC-a, b, c, and g are located on the vacuole membrane, d and f on the membranes of the Golgi complex, and e on the internal membranes of chloroplasts [68]. MET1 apparently encodes the main plant DNA methyltransferase, which supports methylation of CpG islands [69].

Fig. 2. The molecular functions of some non-RR genes mapped to poplar SDRs and the structure of mobile SDR elements. Genes found in both the X- and Y-SDR include TCP-1, which encodes a cytosolic chaperonin; CLC-c, which encodes a proton-chloride antiporter on the surface of the vacuole; and MET1, which encodes a DNA methyltransferase. In P. tremula the full-length TOZ19 gene occurs only on the Y-SDR. In the SDR, LTR-RT mobile elements from the Copia and Gypsy superfamilies are found in large numbers; structurally they differ in the order of gene arrangement.

The P. tremula SDR contains the TOZ19 gene, which, like its orthologue in A. thaliana, is associated with rRNA synthesis in nucleoli and is important for embryonic development [70]. The Y-SDR contains its full-length version, whereas the X-SDR contains only the 3' end [55, 71]. The total number of genes at SDR loci tends to correlate with the overall size of the locus. For example, the SDR of P. trichocarpa, which has a typical size for poplars, contains at least five genes [39], whereas the giant W- and Z-SDRs of P. qiongdaoensis contain 122 and 50 genes, respectively [59]. The functions of most of these genes are unknown.

In the SDR, sex-specific regions containing regulatory genes can also be identified. In the Y-SDR of P. deltoides two long Y-specific hemizygous sequences (YHS) were found: YHS1 is about 35 kb, and YHS2 is about 4.3 kb. YHS1 contains two male-specific genes: one, called FERR-R, contains partial duplications of the FERR gene (another name for an ARR16/ARR17 ortholog) and suppresses the formation of female generative organs, while the other, called MSL [56], belongs to the Gypsy superfamily of retrotransposons. MSL consists of three parts, MSL-1, MSL-2, and MSL-3, in which both strands of DNA are transcribed to form long noncoding RNAs (lncRNAs), which regulate development towards the male phenotype. Sequences homologous to MSL are found on different chromosomes in the genomes of different poplar species, for example, P. deltoides, P. alba, P. trichocarpa, P. tomentosa, and P. euphratica [56, 72].

In the YHS2 region of P. deltoides only one gene, Tn, has been found; it is absent from the X-SDR. The function of this gene in the regulation of sex-related processes is unclear. These data indicate that several genes, including transposable elements, may be involved in the formation of sex and/or in the regulation of sex-associated traits in the poplar [56].

In addition to the differences in the gene compositions of the Y-SDR and X-SDR loci, it is necessary to note numerous single nucleotide polymorphisms (SNPs), including sex-specific ones, which can facilitate the task of sex identification in the poplar. Evolutionarily, the emergence of SNPs is associated with duplication of alleles (due to WGD or transposon-mediated tandem duplications) and their subsequent divergence. The total number of SNPs that differ between sexes can be very large; for example, more than 3 500 000 have been identified in P. trichocarpa and more than 1 000 000 in P. balsamifera [27]. It should be noted that some polymorphisms may be conserved and occur in other representatives of the section, confirming their relationship. Thus, some polymorphisms in P. trichocarpa have been found in other species of the ATL group: P. balsamifera, P. deltoides, and P. nigra; but these SNPs are absent in P. tremuloides, which belongs to the section Populus [73]. Among the conserved SNPs in SDR genes, there are also constitutive SNPs that could potentially be used as sex markers. For example, from 16 to 49 SNPs were discovered in ARR16/ARR17 gene orthologues in P. × sibirica; six of them are always specific for the X- and Y-alleles [74]. Another example is the MET1 gene, which actively accumulates sex-specific SNPs. In P. × sibirica this gene contains from 80 to 179 SNPs, 11 of which are strictly Y-specific [75]. It is possible that some polymorphisms may reduce the activity of the DNA methyltransferase encoded by this gene and lead to a decrease in the overall level of DNA methylation in male plants. It has been shown that the DNA of female P. tomentosa plants has a higher level of methylation than that of males [76]. Thus, constitutive sex-specific SNPs can serve as promising markers for sex determination in the poplar. SNP detection methods are faster, cheaper, and more reliable than the whole-genome sequencing required to determine the entire SDR sequence. However, careful validation of these SNP markers is necessary.
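How a panel of constitutive sex-specific SNPs might be used for sex identification in an XY species can be illustrated with a minimal Python sketch. The marker positions, alleles, function name, and the 0.8 agreement threshold are all hypothetical and are not taken from the cited studies; the sketch only assumes that males are heterozygous (X allele plus Y allele) and females homozygous for the X allele at each marker.

```python
from typing import Dict, Tuple

# Hypothetical marker panel: position -> (X-linked allele, Y-linked allele).
# Real markers would have to be validated for the species in question.
MARKERS: Dict[int, Tuple[str, str]] = {
    101: ("A", "G"),
    245: ("C", "T"),
    530: ("G", "A"),
}


def call_sex(
    genotypes: Dict[int, Tuple[str, str]],
    markers: Dict[int, Tuple[str, str]] = MARKERS,
    min_fraction: float = 0.8,
) -> str:
    """Call sex from diploid genotypes at sex-linked SNP markers (XY system).

    A marker votes 'male' if both the X and the Y allele are observed and
    'female' if only the X allele is observed; other genotypes are ignored.
    """
    male_votes = 0
    female_votes = 0
    for position, (x_allele, y_allele) in markers.items():
        alleles = genotypes.get(position)
        if alleles is None:
            continue  # marker not genotyped in this sample
        observed = set(alleles)
        if observed == {x_allele, y_allele}:
            male_votes += 1
        elif observed == {x_allele}:
            female_votes += 1
    informative = male_votes + female_votes
    if informative == 0:
        return "undetermined"
    if male_votes / informative >= min_fraction:
        return "male"
    if female_votes / informative >= min_fraction:
        return "female"
    return "undetermined"


if __name__ == "__main__":
    sample = {101: ("A", "G"), 245: ("T", "C"), 530: ("G", "A")}
    print(call_sex(sample))
```

Requiring agreement across several markers, rather than relying on a single SNP, is one simple way to reflect the need for validation noted above: conflicting markers yield "undetermined" instead of a forced call.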

The poplar SDR contains a large number of transposable elements, which account for the large size of the SDR and have played an important role in the evolution of sex loci. DNA transposons of the Helitron group and LTR-RT of the Gypsy and Copia superfamilies are often found. Representatives of the last two superfamilies have a similar structure: at the ends there are LTRs, between which there are five genes encoding a group-specific antigen (GAG), protease (PR), integrase (INT), reverse transcriptase (RT), and ribonuclease H (RH). The superfamilies differ only in the order of gene arrangement (in the 5' → 3' direction): in Copia it is GAG, PR, INT, RT, and RH, and in Gypsy it is GAG, PR, RT, RH, and INT (Fig. 2). We note that 30–40% of the poplar genome consists of repeated sequences, more than half of which belong to LTR-RT and, primarily, to the Gypsy and Copia superfamilies [59, 77, 78]. Thus, the appearance of LTR-RT in the SDR is quite natural. The location of LTR-RT as well as Helitron-like elements near RR genes in the SDRs of P. trichocarpa, P. euphratica, and P. alba indicates their strong influence on the structure and functioning of the SDR [39, 43].
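The difference in gene order between the two superfamilies can be expressed as a minimal Python sketch. The function name and input format are illustrative assumptions; the sketch simply compares an annotated 5' → 3' order of coding domains with the two orders given above and also accepts the reversed order for elements annotated on the opposite strand.

```python
from typing import Sequence

# Gene orders (5' -> 3') as described in the text.
COPIA_ORDER = ("GAG", "PR", "INT", "RT", "RH")
GYPSY_ORDER = ("GAG", "PR", "RT", "RH", "INT")


def classify_ltr_rt(domains: Sequence[str]) -> str:
    """Classify an LTR retrotransposon by the order of its coding domains.

    `domains` lists domain names between the two LTRs as annotated along
    the reference strand; an element on the opposite strand appears
    reversed, so both orientations are checked.
    """
    order = tuple(d.upper() for d in domains)
    for candidate in (order, order[::-1]):
        if candidate == COPIA_ORDER:
            return "Copia"
        if candidate == GYPSY_ORDER:
            return "Gypsy"
    return "unclassified"


if __name__ == "__main__":
    print(classify_ltr_rt(["GAG", "PR", "INT", "RT", "RH"]))
    print(classify_ltr_rt(["GAG", "PR", "RT", "RH", "INT"]))
```

Truncated or rearranged elements, which are common in repeat-rich regions such as the SDR, are returned as "unclassified" by this sketch and would need a more tolerant comparison in practice.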

ARR16/ARR17 ORTHOLOGUES AND MOLECULAR MECHANISMS OF SEX DETERMINATION IN THE POPLAR

In the poplar, as in other representatives of the Salicaceae family, the main regulators of sex are ARR16/ARR17 gene orthologues located in the SDR. This is confirmed by the fact that ARR16/ARR17 orthologs are expressed predominantly in reproductive tissues, and their differential methylation and expression are associated with sex [10, 79].

Direct proof is provided by the results of an experiment on P. tremula, in which knockout of the ARR16/ARR17 orthologue using the CRISPR/Cas9 system converted female plants into male ones [55]. WGD and other mechanisms of gene duplication contributed to an increase in the number of full and partial copies of RR genes, including ARR16/ARR17 orthologs, in the SDR of the poplar. The record holder for the number of partial copies of ARR16/ARR17 orthologs among the poplars is P. euphratica: it contains six short (S1, S2, S3, S4, S5, and S6) and four long (L1, L2, L3, and L4) fragments, which apparently appeared as a result of the movement of mobile elements [43]. This assumption is supported by the presence of a Helitron-like transposon in all short sequences except S2, and of elements of the Copia superfamily in all long sequences [43].

A probable evolutionary scenario for the formation of SDRs with a complex structure in the Populus section has been proposed [59]. The common ancestor of the Populus section could have had two SDR alleles: the first (the ancestor of the SDR on the Z chromosome of P. qiongdaoensis and P. alba and the X chromosome of P. tremula) did not contain ARR16/ARR17 orthologues, and the second (the ancestor of the SDR on the W chromosome of P. qiongdaoensis and P. alba and the Y chromosome of P. tremula) had two copies of ARR16/ARR17 orthologues facing each other, the 5' ends of which were adjacent to transposons from the Helitron group [59]. Next, in the evolutionary branch of P. qiongdaoensis, a partial duplication of this pair of genes occurred in the ancestral form of the W chromosome. The newly formed duplicates turned out to be defective, while the activity of the original genes and the sex determination mechanism were preserved [59]. In the evolutionary branch of the ancestors of P. alba and P. tremula, a full copy of the ARR16/ARR17 ortholog was translocated from the pericentromeric to the subtelomeric region of the W chromosome, followed by double duplication. It has been shown that next to each duplicate ARR16/ARR17 ortholog in P. alba there are Helitron- and Copia-like elements, which apparently mediated the translocation and duplication of ARR16/ARR17 orthologues [43]. The original SDR of P. alba was eliminated. Thus, in female P. alba plants (ZW) three copies of ARR16/ARR17 orthologs are located on the W chromosome, whereas males (ZZ) have none. In P. tremula fragments of ARR16/ARR17 orthologues were preserved in the original SDR, one of which was duplicated. Unlike P. alba, P. tremula has formed an XY system in which female plants (XX) have two copies of PtRR9 in the X-SDR, and in male plants (XY) the Y-SDR includes not only two copies of PtRR9, but also fragments of the original genes that now serve as PtRR9 repressors [55, 59].

The molecular mechanisms of sex determination have been studied in most detail in species of the ATL group: P. trichocarpa, P. balsamifera, and P. deltoides. P. trichocarpa has two SDRs located in the subtelomeric regions of the left and right arms of chromosome 19 (XY). In the Y-SDR, five partial duplicates of the ARR16/ARR17 ortholog, named PtRR9, are located on the left arm of the Y chromosome. The full-length PtRR9 is located in the subtelomeric region of the right arm of the Y chromosome (Fig. 3). The X chromosome contains only the full-length PtRR9 gene (in the same position as on the Y chromosome) and does not contain partial duplicates in the SDR. It has been shown that in male plants, lncRNAs complementary to all PtRR9 exons except exons 5 and 6 are transcribed from the partial PtRR9 duplicates in the SDR [39]. This correlates well with the exon methylation status of PtRR9 observed in P. balsamifera: exons 1 and 4 were the most highly methylated in male plants compared to female ones, while exon 5 was not methylated [10]. In turn, a high level of PtRR9 exon methylation corresponds to suppression of the expression of this gene in males. In addition, partial duplicates of the ARR16/ARR17 orthologs form inverted repeats, which means that the RNA transcribed from them can form double-stranded RNA. This RNA can be processed by the RNA interference system to form small RNAs capable of repressing full-length ARR16/ARR17 orthologs, both through DNA methylation and mRNA degradation [10, 39].

Fig. 3. The molecular genetic mechanisms of sex determination in representatives of the ATL group, P. trichocarpa and P. deltoides. The structures of the Y-SDR and the mechanisms of regulation of the expression of ARR16/ARR17 orthologues are schematically presented for P. trichocarpa and P. deltoides. In both species, SDRs are located in the subtelomeric regions of chromosome 19. Partial duplicates of ARR16/ARR17 orthologs are located in the SDR on the left arm (in P. deltoides this area is called FERR-R); they have a common evolutionary origin with the ARR16/ARR17 orthologs located at the opposite end of the same chromosome (in P. deltoides this gene is called FERR).

A similar mechanism of sex determination is described in P. deltoides. The FERR gene (also known as PdRR9 [38], an ARR16/ARR17 orthologue) is located in the subtelomeric region of the right arm of chromosome 19. In males, the Y-SDR is located on the left arm of the same chromosome (the Y chromosome); it is absent from the X chromosome. There are three regions in the Y-SDR [80]. One of these regions contains the MSL and FERR-R genes; the latter consists of partial duplicates of the FERR gene arranged in a head-to-tail order. Part of FERR-R includes eight fragments (S1, S2, S3, S4, S5, S6, S7, and S8), from which long noncoding RNAs are transcribed, causing RNA-dependent methylation of the FERR gene and degradation of its transcripts in the cytoplasm. As a result, flowers of male plants express only FERR-R, and flowers of female plants express only FERR [56].

In addition to repressors of ARR16/ARR17 orthologs, the Y-SDR of some poplar species contains a repressor of the HEMA1 gene, which encodes one of the three glutamyl-tRNA reductases (the others are encoded by the HEMA2 and HEMA3 paralogs), enzymes that catalyze the rate-limiting step in the chlorophyll biosynthesis pathway [80, 81]. The HEMA1 repressor consists of two inverted repeats located in spacer 2 of P. trichocarpa or in the S5 area of P. deltoides on chromosome 9 (Fig. 3) [39, 56]. As well as in P. trichocarpa and P. deltoides, HEMA1 repressors are found in the Y-SDR in P. euphratica and P. pruinosa (section Turanga); their presence is possible in other representatives of the genus Populus [55]. By analogy with ARR16/ARR17 orthologs, it can be assumed that HEMA1 is repressed in males. However, the connection with sex and the functional role of HEMA1 repression are not currently understood.

Understanding the mechanisms of sex determination in the poplar opens up opportunities for targeted modification of this plant. For example, ARR16/ARR17 homologue knockout in P. tremula females using the CRISPR/Cas9 system promoted the development of male flowers [55]. According to unpublished data, insufficient FERR-R expression in genetically male P. lasiocarpa individuals leads to FERR gene hypomethylation, due to which plants acquire an intermediate sexual phenotype with male and female flowers [72]. P. × canescens is capable of a directed sex change in adulthood. About 30% of F1 descendants obtained through crossing ♀ P. alba × ♂ P. tremula, which are genetically female, demonstrated sexual lability throughout life: at first only male flowers developed on them; over time, male, female, and bisexual flowers formed simultaneously; and then only female flowers formed [82]. Cases of deviation from dioecy have previously been described in P. tremuloides, P. tremula, P. trichocarpa, P. deltoides, P. euphratica, P. tomentosa, P. nigra, and other species [82].

Apparently, microRNA-dependent methylation of SDRs and other sex-specific genes located outside the sex chromosomes in generative organs is widespread not only in the poplar [83], but also in other dioecious plants [84–87]. Next, we will consider microRNAs and their target genes, whose differential expression is associated with sex.

microRNAs, THEIR TARGETS, AND DIFFERENTIAL GENE EXPRESSION ASSOCIATED WITH SEX

Differences between female and male individuals in the poplar are due primarily to the expression of ARR16/ARR17 orthologs in the SDR, which is observed only in female reproductive organs [55]. It has also been established that a large number of other genes show sex-specific expression in the generative organs. In P. × sibirica several thousand genes have been discovered whose expression differs in male and female plants, and most of these genes are found in flowers [15]. It should be noted that on the autosomes of the poplar, numerous RRA-group paralogues of the sex-determining ARR16/ARR17 ortholog are found [38]. These autosomal RRA genes are expressed in both vegetative and generative organs, with some of them exhibiting sex-specific expression in generative organs [38]. These data indicate the possible involvement of ARR16/ARR17 paralogs in the formation of primary sexual characteristics in the poplar, most likely due to direct or indirect regulation of sex-specific gene expression.

Cronk et al. studied the relationship between gene expression and the development of male and female generative organs in P. balsamifera from June to October, and then in March and May of the following year [25]. It was shown that the gene expression profile changes slightly in the first months of bud development, which may be associated with summer diapause. In the autumn period (September–October), preparation for winter begins, which leads to an increase in the expression of a number of genes. The expression of even more genes changes in March, and the number of differentially expressed genes peaks in May, when the generative organs themselves are formed. At this time, the expression of more than 2000 genes changes in a sex-specific manner. Among them, genes associated with chromatin modification stand out, which are expressed in five successive waves: early (DDM1, KYP, argonaute, CMT3, CDK1, CDK2, AUR1, AUR3, BRCA1-homologue, ATXR6, JMJ12, JMJ13, RRC2, CLF, and VRN5), early-middle (DRD1, AGO4, SHH1, JMJ27, TAF14B, ATX1, and FLD), middle (CYP71, JMJ16, JMJ17, JMJ20, and HDA6), late (HDA19, ENY2, and UBC2), and very late (SUVH5 (mainly in male plants) and SAHH1). Histone modification and subsequent changes in chromatin structure precede further changes in gene expression [88]. Selected groups of chromatin structure regulators can be effectors of signaling pathways that integrate environmental parameters (light, temperature, etc.) to control sequential changes in the expression of certain sets of other genes, thereby ensuring the staged development of generative organs. The same authors discovered a relatively small group of 110 genes whose expression depends on sex and does not depend on the stage of organ development [25]. First of all, these genes include PbRR9 (an ARR16/ARR17 ortholog), expressed only in female flowers, as well as ARR9 and ARR22 orthologs. Although ARR9 and ARR22 orthologs are not located on the sex chromosome, their expression correlates with PbRR9 activity, and their mRNA levels are higher in male flowers than in female flowers. Possibly, expression of ARR9 and ARR22 orthologs is regulated by a negative feedback mechanism from the PbRR9 homolog. In addition, in male flowers, the expression of two homologs of MADS-box genes, PI and AP3, is increased. In A. thaliana the PI/AP3 heterodimer regulates the development of petals and stamens, including through repression of the GATA21 and GNC genes, which encode a heterodimeric transcription factor [89]. Consistent with these data, GATA21 and GNC homologs in the poplar are more actively expressed in female flowers than in male ones [25]. In male plants, the expression of genes encoding components of cytokinin reception and signal transmission is increased: AHK4, AHP5 (which encode cytokinin receptors), CRF5 (RR regulator type B), and ADA2 (transcription adapter 2). Interestingly, the alleles of genes found in the SDR (MET1, CLC-c, and TCP-1), as well as of CHR11 (which encodes a protein involved in chromatin remodeling), are expressed sex-specifically: one allele is overexpressed in male plants and the second in female plants. Genes associated with disease resistance and response to oxidative stress (genes from the peroxidase superfamily) are also expressed in a sex-specific manner [25]. This may explain the sex-related differences in resistance to pathogens [23] and oxidative stress [4, 21] observed in the poplar, which is important to consider during landscaping of human settlements.

In P. tomentosa, differences between male and female flowers are manifested in the expression of 24 genes, many of which are located on the sex chromosome 19 [76]. Thus, female plants show increased expression of MET1 (chromosome 19) and DMT3, which encode DNA methyltransferases; this correlates well with the increased overall level of DNA methylation in female flowers. The DDM1 gene (chromosome 19), which is also involved in DNA methylation, is overexpressed in male flowers. These data indicate the involvement of DNA methyltransferases in the sex-specific regulation of gene expression. Most of the other sex-specifically expressed genes are involved in the early stages of flower development and in signal transduction pathways triggered by hormones such as cytokinins, gibberellins, indoleacetic acid, and abscisic acid (Fig. 4). For example, the ATA, PM30, MSL3, MYB79, GI, RGF1, TFL1, PIL5, PIN13, SAUR39, GA20ox2, and CKX3 genes are overexpressed in female plants, whereas in males the expression of A9.2, EXPA10, SVP, COL9, AGL24, UFO, SAP, SAUR13, and GA20ox7 is higher [76].

Fig. 4. The functional groups of genes whose expression is sex-specifically controlled by microRNAs. The largest number of microRNA targets has been described for genes responsible for flower development. The group of genes responsible for hormone metabolism ranks second in the number of microRNA-regulated targets.

Interestingly, genes whose methylation is associated with an increase in their expression have been discovered in P. tomentosa. Thus, in female plants the PtGT2 (MF26 homologue) and PtPAL3 (MF29 homologue) genes are fully and half methylated, whereas both genes are unmethylated in males. PtCER4 (MF35 homologue) is fully methylated in male plants and unmethylated in female plants. PtGT2 and PtPAL3 are more actively expressed in female plants, and PtCER4 in males [81].

MicroRNAs are actively involved in regulating the expression of genes and mediate their methylation. There are sex-dependent differences in the structure of microRNAs: in male P. tomentosa flowers miRNAs consist of 21 or 24 nucleotides, while in female flowers miRNAs 21 nucleotides long predominate [90]. In addition, the number of miRNAs expressed only in female flowers is greater than the number expressed only in male flowers (94 versus 40 in the case of canonical miRNAs and 61 versus 11 in the case of novel miRNAs) [90]. It is possible that the increased abundance of miRNAs contributes to the increased overall level of genome methylation observed in female plants [76]. Mapping of methylated regions of the P. tomentosa genome showed that 15.1% of the reads correspond to genes, of which 95.7% are protein-coding and the remaining 4.3% are microRNA genes [91]. In protein-coding genes, CpG islands in the promoter region, as well as the gene body, are more methylated, whereas in microRNA genes, CHH islands, as well as 3'- and 5'-untranslated regions, are more methylated. Sex differences were also revealed in the degree of methylation of CHH islands, which are most often located in intergenic regions and are most heavily methylated in male plants. Sex differences in the methylation of genes and intergenic regions may be associated with differential expression of DNA methyltransferase genes located on the sex chromosomes (for example, MET1, DMT3, and DDM1 [76]). There is an inverse relationship between the expression level of microRNAs and that of their targets. For example, in female plants, compared to male plants, miR169-q, t, u expression is increased, while the mRNA level of their target gene NFYA is reduced; the same applies to miR-164a, e (the mRNA of their targets, CUC1 and CUC, is reduced) and to miR-159a and miR-319f (the mRNA of their targets MYB33, MYB65, MYB101, TCP2, TCP3, TCP10, and TCP24 is reduced); miR-172b, i expression is reduced (and the mRNA of their target AP2 is increased); and miR-156l, k (and the mRNA of their targets SPL9 and SPL10 is reduced) [91]. Most miRNA targets are associated with flower development (COL2, AP2, UFO, and PM36 are negatively regulated by F-specific microRNAs; EPR1 and PFT1 are negatively regulated by M-specific microRNAs). The next largest groups of target genes are associated with the metabolism of phytohormones (PIN1_ARATH, IAA4, GA2ox, EREBF4, SAUR29, ABA3, and ABI3 are negatively regulated by F-specific microRNAs), Ca2+ transport (EDA39 and AT1G21630 are negatively regulated by F-specific microRNAs; CDP is negatively regulated by an M-specific miRNA), and DNA methylation (three miRNAs; DDM1 is negatively regulated by F-specific microRNAs; MET1 is negatively regulated by M-specific miR167) (Fig. 4). It should be noted that a significant portion of microRNA targets are located on the sex chromosome 19.

CONCLUSIONS

Trees of the genus Populus are actively used in landscaping large populated areas, where male plants are preferred. However, a complex evolutionary path has led to the formation of a sex locus in the poplar that is complex in both structure and functional mechanisms and is enriched with mobile elements and duplicates of sex-specific genes. Moreover, the structure of the sex locus and the mechanisms of sex determination vary significantly within the genus, including between sections. Although the key regulator of sex in poplars is a single gene, the ARR16/ARR17 orthologue, the abundance of full and partial copies of this gene and the presence of several levels of regulation of its activity make sex determination in various poplar species a difficult task. Further research aimed at a better understanding of the structure and mechanisms of sex determination is not only important from a practical point of view, but is also of great fundamental interest, since it may bring us closer to understanding the evolutionary mechanisms that contributed to the formation of this diverse and successful group of trees.

ABBREVIATIONS

WGD, whole genome duplication; WGT, whole genome triplication; SDR, sex-determining region; RR, cytokinin response regulators; X-SDR, SDR located on the X chromosome; Y-SDR, SDR located on the Y chromosome; YHS, Y-specific hemizygous sequence; LTR-RT, long terminal repeat retrotransposon; SNP, single nucleotide polymorphism; lncRNA, long non-coding RNA.