Introduction

Homeobox (HB) genes are a class of genes that encode a 60-amino-acid helix-turn-helix DNA-binding motif. This characteristic domain is called the homeodomain (HD) (Desplan et al. 1988). TALE (three-amino-acid-loop-extension) transcription factors are a special class of HB genes. They encode an atypical HD of 63 amino acids, in which three additional residues (PYP, proline-tyrosine-proline) are inserted between the first and second helices (Bertolino et al. 1995).

The TALE gene family consists of the KNOX subfamily and the BELL subfamily. KNOX proteins contain four domains: KNOX1, KNOX2, ELK, and the homeodomain. They can be divided into three types: KNOXI, KNOXII, and KNOXIII. KNOXI genes are mainly expressed in meristems and play important roles in plant growth and development (Hay and Tsiantis 2010). For example, in Arabidopsis thaliana (A. thaliana), the SHOOT-MERISTEMLESS (STM) gene, which encodes a KNOXI protein, is necessary for the formation of the shoot apical meristem (SAM) during embryogenesis (Long et al. 1996). KNOXII genes can regulate the formation of plant secondary walls (Li et al. 2012). KNOXIII genes, a novel class of KNOX genes that lack the HD, have so far been found only in dicots. KNATM, found in A. thaliana, is expressed in the proximal-lateral domain of organ primordia and at the boundary of mature organs (Magnani and Hake 2008). BELL proteins, also known as BLH proteins, contain POX and HD domains. The POX domain, also known as the MID domain, is composed of two domains: SKY and BELL. The BELL subfamily plays an important role in the response to stress, the development of meristems, hormone biosynthesis and signal transduction (Niu and Fu 2022). In previous research on A. thaliana, BELL1 was found to be associated with ovule development (Brambilla et al. 2007). ATH1 has been shown to regulate growth at the interface between the stem, meristem and organ primordia (Gomez-Mena and Sablowski 2008). BLH1 and KNAT3 have been shown to regulate seed germination and early seedling development by directly regulating ABI3 expression (Kim et al. 2013). ABI3 mediates both plant development and the stress response (Xu and Cai 2019). SlBEL11 plays an important role in chloroplast development and chlorophyll synthesis in tomato fruit (Meng et al. 2018).

TALE transcription factors are widespread in plants and have been identified in a variety of species, such as Punica granatum (Wang et al. 2020), Prunus mume (Yang et al. 2022), Gossypium arboretum (Ma et al. 2019), Phyllostachys edulis (Xu et al. 2019), and Ananas comosus (Ali et al. 2019). Many investigations have shown that TALE genes play an important role in plant growth and development and in the response to the external environment (Zhao et al. 2019; Li et al. 2022; Hou et al. 2021). In previous research, TALE genes have been found to be related to various biological processes, such as the maintenance of organ morphology, hormone regulation and tuber formation (Belles-Boix et al. 2006; Shani et al. 2006; Kondhare et al. 2019). In A. thaliana, TALE proteins control meristem formation and/or maintenance, organ position, and several aspects of the reproductive phase. They are also involved in the regulation of several hormonal pathways, providing a link between gene regulatory networks and SAM signaling (Hamant and Pautot 2010). Research has shown that TALE genes play an important role in the fruit quality and maturity of many fruit crops (Brian et al. 2021; Costa et al. 2020; Shahan et al. 2019).

Tobacco (Nicotiana tabacum L.) is a cash crop grown for its leaves and is also an important model plant (Bally et al. 2018). Functional verification of many plant genes is carried out in tobacco (Li et al. 2023; Chen et al. 2023; Zhao et al. 2023). With global warming and frequent natural disasters, tobacco faces many stresses. TALE genes encode transcription factors that participate in the maintenance of organ morphology and respond to a variety of stresses. The study of TALE genes is therefore of great significance for tobacco stress resistance, growth and development. The completion of tobacco genome sequencing has greatly facilitated the study of gene function in tobacco (Sierro et al. 2014). In this study, the physicochemical properties, subcellular localization, signal peptides, conserved motifs, gene structures, cis-acting elements, protein secondary and 3D structures, protein–protein interaction (PPI) networks, and gene expression patterns of NtTALE genes were analyzed, together with evidence of their expansion in the Solanaceae. This study provides a theoretical basis for further revealing the role of TALE transcription factors in tobacco.

Materials and methods

Identification of the TALE genes: To identify the TALE genes, A. thaliana TALE (AtTALE) protein sequences, Nicotiana tabacum cv. TN90 genomic data and Solanum ptychanthum genomic data were downloaded from the TAIR database (https://www.arabidopsis.org/, accessed on 27 March 2023) (Lamesch et al. 2012), the NCBI database (https://www.ncbi.nlm.nih.gov/, accessed on 27 March 2023) (Sayers et al. 2022) and the Solanaceae Genomics Network (https://solgenomics.net/, accessed on 7 March 2024) (Fernandez-Pozo et al. 2015), respectively. TBtools was used to BLAST the AtTALE protein sequences against the tobacco and Solanum ptychanthum proteins, with all parameters left at their defaults (Chen et al. 2023). The domains of candidate NtTALE proteins were identified using the CDD database (https://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml, accessed on 8 March 2024) (Lu et al. 2020). When a gene had multiple transcripts, the longest transcript was selected to represent the TALE gene.
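The longest-transcript rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' script: the `(gene_id, transcript_id, protein_sequence)` record layout, the function name, and the toy identifiers are all hypothetical.

```python
from typing import Dict, Iterable, Tuple


def longest_transcript_per_gene(
    records: Iterable[Tuple[str, str, str]],
) -> Dict[str, Tuple[str, str]]:
    """Keep, for each gene, the transcript with the longest protein sequence.

    records: (gene_id, transcript_id, protein_sequence) tuples (hypothetical layout).
    Ties are resolved in favour of the transcript encountered first.
    """
    best: Dict[str, Tuple[str, str]] = {}
    for gene_id, transcript_id, seq in records:
        current = best.get(gene_id)
        # Replace the stored transcript only if the new one is strictly longer.
        if current is None or len(seq) > len(current[1]):
            best[gene_id] = (transcript_id, seq)
    return best


# Hypothetical toy input: two isoforms of geneA and one of geneB.
example = [
    ("geneA", "geneA.1", "MKT"),
    ("geneA", "geneA.2", "MKTLLV"),
    ("geneB", "geneB.1", "MSS"),
]
selected = longest_transcript_per_gene(example)
```

Here `selected` maps each gene to the single transcript that would be carried forward to domain checking.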

Phylogenetic and orthologous gene analysis of TALE genes: Nicotiana sylvestris, Nicotiana tomentosiformis, Solanum lycopersicum, Capsicum annuum, Nicotiana benthamiana, Oryza sativa subsp. japonica, Oryza sativa subsp. indica, Solanum melongena, Solanum pimpinellifolium, Solanum pennellii, and Solanum tuberosum TALE protein sequences were downloaded from the Plant Transcription Factor Database (https://planttfdb.gao-lab.org/, accessed on 7 March 2024) (Tian et al. 2019). The TALE protein sequences were aligned with ClustalW. The phylogenetic tree was built with MEGA7.0 (Kumar et al. 2016) using the neighbor-joining (NJ) method, with 1000 bootstrap replicates, the Poisson model, and pairwise deletion. The phylogenetic tree was visualized and annotated with the iTOL online website (https://itol.embl.de/, accessed on 15 March 2024) (Letunic and Bork 2021). The GmBLH4 protein sequence was downloaded from the Phytozome database (https://phytozome-next.jgi.doe.gov/, accessed on 18 April 2024) (Goodstein et al. 2012). Orthologous genes from Capsicum annuum, Solanum lycopersicum, Solanum melongena, Solanum tuberosum, Glycine max, Nicotiana benthamiana, Nicotiana sylvestris, Nicotiana tomentosiformis and Nicotiana tabacum were analyzed with OrthoFinder (Emms and Kelly 2019).
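To make the distance settings concrete, the sketch below computes a Poisson-corrected distance, d = -ln(1 - p), between two aligned protein sequences under pairwise deletion, where p is the proportion of differing sites among the sites compared. It illustrates the general technique only and is not MEGA's implementation; the function name, the gap symbols (`-` and `?`), and the toy sequences are assumptions.

```python
import math


def poisson_distance(aligned_a: str, aligned_b: str, gap_chars: str = "-?") -> float:
    """Poisson-corrected amino acid distance with pairwise deletion.

    Sites where either sequence carries a gap symbol are skipped for this
    pair only; p is the fraction of the remaining sites that differ.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    different = 0
    for a, b in zip(aligned_a, aligned_b):
        if a in gap_chars or b in gap_chars:
            continue  # pairwise deletion: ignore this site for this pair
        compared += 1
        if a != b:
            different += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    p = different / compared
    if p >= 1.0:
        return math.inf  # the correction is undefined when every site differs
    return -math.log(1.0 - p)


# Hypothetical toy alignment: 5 comparable sites, 1 difference, so p = 0.2.
d = poisson_distance("MKT-LV", "MKSALV")
```

A matrix of such pairwise distances is the input from which a neighbor-joining tree is then built.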

Physicochemical properties, subcellular localization, and signal peptides of NtTALE genes. The number of amino acids (AA), molecular weight (MW), theoretical pI (pI), aliphatic index (AI), and grand average of hydropathicity (GRAVY) of the NtTALE proteins were predicted with the ExPASy ProtParam online website (https://web.expasy.org/protparam/, accessed on 30 March 2023) (Duvaud et al. 2021). The subcellular localization of the NtTALE proteins was predicted with the online website WoLF PSORT (https://wolfpsort.hgc.jp/, accessed on 30 March 2023) (Horton et al. 2007). The signal peptides of the NtTALE proteins were predicted with the online website SignalP 5.0 (https://services.healthtech.dtu.dk/services/Signal-P-5.0/, accessed on 7 April 2023) (Nielsen et al. 1999).
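For readers who want to see what two of these indices measure, the following sketch computes GRAVY and the aliphatic index from a protein sequence. It follows the standard definitions (the mean Kyte-Doolittle hydropathy value, and AI = X(Ala) + 2.9 X(Val) + 3.9 [X(Ile) + X(Leu)] with X in mole percent); it is not the ProtParam source code, and the function names and toy sequences are illustrative assumptions.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def _check(seq: str) -> str:
    """Upper-case the sequence and reject empty or non-standard input."""
    seq = seq.upper()
    if not seq or any(aa not in KYTE_DOOLITTLE for aa in seq):
        raise ValueError("sequence must contain only the 20 standard residues")
    return seq


def gravy(seq: str) -> float:
    """Grand average of hydropathicity: mean hydropathy per residue."""
    seq = _check(seq)
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq: str) -> float:
    """Aliphatic index from the mole percent of Ala, Val, Ile and Leu."""
    seq = _check(seq)
    n = len(seq)
    x = {aa: 100.0 * seq.count(aa) / n for aa in "AVIL"}
    return x["A"] + 2.9 * x["V"] + 3.9 * (x["I"] + x["L"])


# Hypothetical toy peptides, not NtTALE sequences.
toy_gravy = gravy("AR")            # (1.8 - 4.5) / 2
toy_ai = aliphatic_index("AVIL")   # 25 + 2.9 * 25 + 3.9 * 50
```

A negative GRAVY value indicates a hydrophilic protein on average, which is how the values reported in the Results are interpreted.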

Conserved motifs and gene structure of the NtTALE genes. The conserved motifs of the NtTALE proteins were analyzed with the online website MEME (http://meme-suite.org, accessed on 31 March 2023) (Bailey et al. 2015). The maximum number of motifs was set to 10, and all other parameters were left at their default values. The conserved motifs and gene structures of the NtTALE genes were visualized with TBtools.

The cis-acting elements of the NtTALE promoters. The 2000 bp sequences upstream of the coding regions of all NtTALE genes were extracted. The cis-acting elements were predicted with PlantCARE (https://bioinformatics.psb.ugent.be/webtools/plantcare/html/, accessed on 2 April 2023) (Lescot et al. 2002). The results were visualized with the ggplot2 package.
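Extracting the upstream region depends on strand, which is easy to get wrong. The sketch below computes the promoter interval for a gene on either strand; it is an illustration under assumed conventions (1-based inclusive coordinates, a hypothetical function name and toy coordinates), not the extraction tool used in this study.

```python
from typing import Optional, Tuple


def promoter_interval(
    cds_start: int,
    cds_end: int,
    strand: str,
    chrom_length: int,
    upstream: int = 2000,
) -> Optional[Tuple[int, int]]:
    """Return the 1-based inclusive interval lying upstream of a coding region.

    For "+" genes the promoter precedes cds_start; for "-" genes it follows
    cds_end on the reference, and the extracted sequence must afterwards be
    reverse-complemented. The interval is clipped to the chromosome ends.
    Returns None when no upstream base exists.
    """
    if strand == "+":
        end = cds_start - 1
        start = max(1, end - upstream + 1)
    elif strand == "-":
        start = cds_end + 1
        end = min(chrom_length, start + upstream - 1)
    else:
        raise ValueError("strand must be '+' or '-'")
    if start > end:
        return None
    return start, end


# Hypothetical coordinates on a 100 kb sequence.
plus_region = promoter_interval(5001, 8000, "+", 100_000)
minus_region = promoter_interval(5001, 8000, "-", 100_000)
clipped_region = promoter_interval(501, 900, "+", 100_000)
```

Genes close to a scaffold end yield promoters shorter than 2000 bp, which is worth tracking when element counts are compared between genes.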

Secondary and 3D structure of the NtTALE proteins. The secondary and 3D structures of the NtTALE proteins were predicted using the online websites SOPMA (https://npsa-prabi.ibcp.fr/cgi-bin/npsa_automat.pl?page=npsa%20_sopma.html, accessed on 6 April 2023) and SWISS-MODEL (https://swissmodel.expasy.org/, accessed on 7 April 2023) (Waterhouse et al. 2018), respectively.

Expression patterns of the NtTALE genes. RNA-seq data were downloaded from the SRA database (https://www.ncbi.nlm.nih.gov/sra, accessed on 23 April 2023) to analyze the expression patterns of NtTALE genes under different conditions, including different tissues (SRP029183), cold (SRP097876), dehydration (SRP301492), NaHCO3 (SRP193166), NaCl (SRP193166) and R. solanacearum (SRP336664).

SRA files were converted to FASTQ files with fastq-dump. Trimmomatic was used to remove adapters from the Illumina FASTQ reads and to trim the reads based on base quality values (Bolger et al. 2014). HISAT2 was used to build the index and align the reads. SAM files were converted into BAM files with Samtools (Kim et al. 2019). FPKM values were calculated with StringTie, and read counts were obtained with the prepDE.py3 script (Li et al. 2022). The DESeq2 package was used for differential expression analysis, with the following screening criteria: genes with log2FoldChange ≥ 1 and padj ≤ 0.05 were considered up-regulated, and genes with log2FoldChange ≤ −1 and padj ≤ 0.05 were considered down-regulated (Love et al. 2014). The results were visualized with TBtools. The UpSet plot was made with the ChiPlot online website (https://www.chiplot.online/, accessed on 2 August 2023).
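The screening criteria above amount to a simple decision rule, sketched here in Python for clarity. The function name and the toy values are assumptions; in practice the rule would be applied to the log2FoldChange and padj columns of a DESeq2 results table, where padj can be missing (NA) for some genes.

```python
import math
from typing import Optional


def classify_deg(log2_fold_change: float, padj: Optional[float]) -> str:
    """Label a gene 'up', 'down' or 'not_significant' using the stated cut-offs.

    up:   log2FoldChange >= 1  and padj <= 0.05
    down: log2FoldChange <= -1 and padj <= 0.05
    Genes with a missing padj or fold change are never called significant.
    """
    if padj is None or math.isnan(padj) or math.isnan(log2_fold_change):
        return "not_significant"
    if padj <= 0.05:
        if log2_fold_change >= 1:
            return "up"
        if log2_fold_change <= -1:
            return "down"
    return "not_significant"


# Hypothetical values illustrating each branch of the rule.
labels = [
    classify_deg(1.0, 0.05),     # boundary values are included
    classify_deg(-2.3, 0.001),
    classify_deg(3.0, 0.2),      # large change but not significant
    classify_deg(0.5, 0.01),     # significant but small change
    classify_deg(2.0, None),     # padj missing
]
```

Both thresholds are inclusive, matching the "≥" and "≤" signs in the criteria.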

PPI of the NtTALE genes. The PPIs of the NtTALE proteins were analyzed with the online website STRING (https://string-db.org/, accessed on 20 April 2023) (Szklarczyk et al. 2023). The network was refined and visualized with Cytoscape (Shannon et al. 2003).

Results

Identification and fundamental characteristics of the NtTALE genes. A total of 45 members of the NtTALE gene family were identified, including 27 members of the BELL subfamily and 18 members of the KNOX subfamily. They were renamed NtTALE1-45. The fundamental characteristics of the NtTALE genes were analyzed (Supplementary Table S1). The number of amino acids (AA) ranged from 173 to 771, and the molecular weight (MW) of the proteins ranged from 19509.86 to 84061.97 Da. The theoretical pI (pI) ranged from 4.85 to 9.34; 39 NtTALE proteins had a pI below 7 and are therefore acidic proteins. The grand average of hydropathicity (GRAVY) ranged from −0.908 to −0.407; all values were negative, indicating that all of the proteins are hydrophilic. The aliphatic index (AI) ranged from 57.87 to 82.64.

Subcellular localization prediction indicated that all 45 NtTALE proteins were localized to the nucleus. In addition, no signal peptides were detected in any NtTALE protein. We speculated that the members of this family in tobacco might not have a transmembrane transport function. In summary, all 45 NtTALE proteins might perform their biological functions in the nucleus.

Phylogenetic and orthologous gene analysis of TALE genes. To analyze the classification of the NtTALE genes and their phylogenetic relationships in the Solanaceae, a phylogenetic tree was constructed using 405 TALE genes from A. thaliana, rice and 11 Solanaceae species (Fig. 1). The phylogenetic tree was divided into two subfamilies, namely the BELL subfamily and the KNOX subfamily. The BELL subfamily contained 27 NtTALE genes. The KNOX subfamily was further divided into KNOXI, KNOXII, and KNOXIII, with ten, six, and two NtTALE genes, respectively. It is worth noting that tobacco has two KNOXIII subfamily genes (NtTALE29 and NtTALE45) that were not found in the other Solanaceae plants. OrthoFinder analysis divided the genes into 13 orthologous groups (Supplementary Table 2, Supplementary Table 3). One of the orthologous groups is unique to cultivated tobacco (Nicotiana tabacum) and includes NtTALE29 and NtTALE45. Compared with Nicotiana tomentosiformis and Nicotiana sylvestris, some gene groups of Nicotiana tabacum were expanded and some were contracted (Supplementary Table 4).
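Identifying an orthologous group that is unique to one species reduces to checking which species contribute genes to each group. The sketch below shows that check on a small in-memory table; the nested-dictionary layout, the group identifiers, and the placeholder genes `geneX` and `geneY` are hypothetical and do not reproduce the OrthoFinder output of this study.

```python
from typing import Dict, List


def species_specific_orthogroups(
    orthogroups: Dict[str, Dict[str, List[str]]],
    species: str,
) -> List[str]:
    """Return the IDs of orthogroups whose genes all come from one species.

    orthogroups maps group ID -> {species name -> list of gene IDs};
    an empty list means the species has no gene in that group.
    """
    specific = []
    for group_id, members in orthogroups.items():
        present = {sp for sp, genes in members.items() if genes}
        if present == {species}:
            specific.append(group_id)
    return sorted(specific)


# Hypothetical toy table: the first group contains tobacco genes only.
toy_groups = {
    "OG_a": {
        "Nicotiana tabacum": ["NtTALE29", "NtTALE45"],
        "Solanum lycopersicum": [],
    },
    "OG_b": {
        "Nicotiana tabacum": ["geneX"],
        "Solanum lycopersicum": ["geneY"],
    },
}
tobacco_only = species_specific_orthogroups(toy_groups, "Nicotiana tabacum")
```

The same per-species gene counts can be compared between Nicotiana tabacum and its two progenitor species to flag groups that expanded or contracted.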

Fig. 1. Phylogenetic tree of the TALE genes. Nt, Nicotiana tabacum. Ns, Nicotiana sylvestris. Nto, Nicotiana tomentosiformis. At, Arabidopsis thaliana. Sl, Solanum lycopersicum. Ca, Capsicum annuum. Nb, Nicotiana benthamiana. Osj, Oryza sativa subsp. japonica. Osi, Oryza sativa subsp. indica. Sm, Solanum melongena. Spi, Solanum pimpinellifolium. Spt, Solanum ptychanthum. Spe, Solanum pennellii. St, Solanum tuberosum.

Motif compositions and gene structures of NtTALE genes. Motif analysis of the 45 NtTALE proteins using the online website MEME yielded a total of 10 significant motifs (Fig. 2B; Supplementary Fig. S1). In the KNOX subfamily, all KNOXI members had motif 1 and motif 3, and all except NtTALE30 also had motif 6 and motif 8. KNOXII members had motif 1, motif 3, motif 6, and motif 10. Among them, NtTALE15 and NtTALE19 also had motif 7, and NtTALE14, NtTALE17, NtTALE20, and NtTALE37 also had motif 8. KNOXIII members had only motif 6 and motif 8. All members of the BELL subfamily had motif 2 and motif 4. Motif 1 was present in most NtTALE family members and might play a role in maintaining the basic functions of the family.

Fig. 2. Motif composition and gene structure of NtTALE genes. A Phylogenetic tree of the NtTALE genes. B Conserved motifs of the NtTALE proteins. The colored squares represent different motifs. C Gene structures of the NtTALE genes. The black lines represent introns. UTR, untranslated regions; CDS, coding sequences. The scale bar at the bottom can be used to estimate protein and gene structure size.

The NtTALE gene family had a complex structure (Fig. 2C). The number of exons in NtTALE genes ranged from three to eight. In addition, NtTALE2, NtTALE22, NtTALE30, and NtTALE33 lacked a 5' UTR, whereas the other family members had both 5' and 3' UTRs, and some had multiple 5' UTRs. Differences in gene structure among NtTALE genes may lead to differences in their function.

The cis-acting elements of the NtTALE promoters. PlantCARE was used for cis-acting element prediction. After removing core promoter elements such as the TATA-box and unannotated elements, the remaining elements were divided into three categories: light-responsive elements (Fig. 3A), elements responsive to hormones and abiotic stresses (Fig. 3B), and plant development elements (Fig. 3C). There were a total of 53 elements, including 27 elements that respond to light, 11 elements that respond to hormones, four elements that respond to abiotic stress, and 15 elements that regulate plant development.

Fig. 3. The type, number, and distribution of cis-acting elements in the NtTALE gene family. A Light-responsive elements. B Hormonal and abiotic stress response elements. C Plant development elements. Colored squares represent different response element types.

Among the hormone response elements, there were four kinds of auxin response elements, three kinds of gibberellin response elements, two kinds of methyl jasmonate response elements, one kind of salicylic acid response element and one kind of abscisic acid response element. The abiotic stress response elements responded to low temperature, drought, wounding, and defense and stress. The plant development elements were associated with differentiation of palisade mesophyll cells, meristem development, flavonoid biosynthesis, circadian rhythm control, and other processes. In summary, the NtTALE genes might be involved in hormonal regulatory responses and resistance to abiotic stresses.

Secondary and 3D structure of NtTALE proteins. The biological function of a protein is largely determined by its higher-order structure. Secondary structure prediction of the NtTALE proteins (Supplementary Fig. S2) revealed that the members of this family contained alpha helices, beta turns, extended strands, and random coils. Alpha helices and random coils predominated, with beta turns and extended strands interspersed among them. Multiple alpha helices alternated with beta turns to form a helix-turn-helix structure.

3D structure prediction was performed for all NtTALE proteins using the templates recommended by the website, and the 42 proteins with more than 30% sequence similarity to their template were retained. The 3D structures of proteins of the same subfamily were highly similar. The other three proteins, namely NtTALE29, NtTALE30 and NtTALE45, had low confidence in their structural predictions, so their prediction results were not adopted in this study (Supplementary Fig. S3). Although there were differences in the structures of different NtTALE proteins, they all had spatial structures such as α-helices, β-turns, and random coils, and all had a helix-turn-helix structure. This is consistent with the helix-turn-helix motif encoded by HB genes.

Expression patterns of NtTALE genes in different tissues. Gene function can be inferred from gene expression. In the heatmap generated from the downloaded RNA-seq data, clustering divided the results into two branches. The NtTALE genes were expressed in every tissue, but there were obvious differences in expression among tissues (Fig. 4). The stem had the largest number of highly expressed genes: 11 NtTALE genes, including NtTALE16, NtTALE30 and NtTALE41, were highly expressed in the stem. Of these 11 genes, seven belonged to the KNOXI subfamily and four belonged to the BELL subfamily. The dry capsule had the fewest expressed genes, with only NtTALE43 and NtTALE44 showing relatively high expression. There was little difference in NtTALE gene expression among the different leaf stages. However, the expression of some genes changed markedly between flowering stages. From immature flower to mature flower, the expression of genes such as NtTALE17 and NtTALE26 was upregulated, and the expression of genes such as NtTALE3 and NtTALE45 was down-regulated.

Fig. 4. Expression patterns of NtTALE genes in different tissues. R, Root; S, Stem; YL, Young Leaf; ML, Mature Leaf; SL, Senescent Leaf; IF, Immature Flower; MF, Mature Flower; DC, Dry Capsule.

Expression patterns of NtTALE genes under abiotic and biotic stresses. Different NtTALE genes showed different expression patterns under the different treatments. Under alkali stress (Fig. 5A), the expression of NtTALE6, NtTALE21, NtTALE25 and NtTALE37 was significantly upregulated, and the expression of NtTALE10, NtTALE12, NtTALE27, NtTALE33, NtTALE36 and NtTALE38 was significantly down-regulated. After dehydration treatment (Fig. 5B), the expression of NtTALE3, NtTALE5, NtTALE6, NtTALE17, NtTALE25, NtTALE32, NtTALE37, NtTALE39, and NtTALE42 was significantly upregulated, and the expression of NtTALE1, NtTALE4, NtTALE7, NtTALE9, NtTALE22, NtTALE23, NtTALE27, NtTALE33, NtTALE35, and NtTALE38 was significantly down-regulated. Under cold stress (Fig. 5C), the expression of NtTALE5, NtTALE6, NtTALE17, NtTALE25, NtTALE27, NtTALE37, NtTALE38, and NtTALE42 was significantly upregulated, and the expression of NtTALE3, NtTALE9, NtTALE15, NtTALE32, NtTALE36, and NtTALE40 was significantly down-regulated. Under salt stress (Fig. 5D), NtTALE37 expression was significantly upregulated and NtTALE27 expression was significantly down-regulated.

Fig. 5. Expression patterns of NtTALE genes under abiotic and biotic stresses. A NaHCO3 stress; B Dehydration stress; C Cold stress; D NaCl stress; E R. solanacearum infection. SM, no infection; SI, infection.

After infection of tobacco with Ralstonia solanacearum (R. solanacearum) (Fig. 5E), the expression of NtTALE37, NtTALE43 and NtTALE44 was significantly upregulated, and the expression of NtTALE2, NtTALE4, NtTALE27 and other genes was significantly down-regulated.

PPI of the NtTALE genes. In PPI networks, proteins interact with each other to participate in all aspects of life processes, such as biological signaling, regulation of gene expression, and energy and material metabolism. In this study, the online website STRING was used to analyze the PPIs of the NtTALE proteins (Fig. 6). Proteins that interacted with other NtTALE members were retained. Among them, NtTALE45 was the core member of the protein interaction network of this family and interacted with multiple other members. NtTALE2, NtTALE6, NtTALE7, NtTALE9, NtTALE21, NtTALE29, and NtTALE41 were also key node proteins with important roles. In addition, the Gene Ontology (GO) enrichment results provided by the STRING website showed that the family members were involved in multiple processes, such as transcriptional regulation, meristem maintenance, and shoot system morphogenesis (Supplementary Table 5).
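A simple way to rank nodes in such a network is by degree, that is, the number of distinct partners each protein has. The sketch below computes degree from an undirected edge list; it is an illustration only, the function name is an assumption, and the edges use placeholder protein names rather than the STRING results of this study.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def node_degrees(edges: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Count distinct interaction partners per node in an undirected network.

    Duplicate edges (in either orientation) are counted once and
    self-interactions are ignored.
    """
    neighbours: Dict[str, Set[str]] = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # skip self-loops
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {node: len(partners) for node, partners in neighbours.items()}


# Hypothetical edge list with a duplicate edge and a self-loop.
toy_edges = [("P1", "P2"), ("P1", "P3"), ("P2", "P1"), ("P3", "P3")]
degrees = node_degrees(toy_edges)
hub = max(degrees, key=degrees.get)
```

In this toy network `hub` is the protein with the most partners, which corresponds to the largest node in a degree-scaled network drawing.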

Fig. 6. PPI of NtTALE genes. The network nodes represent different NtTALE genes, the edges represent protein–protein associations, and the larger the node, the more proteins interact with that node.

Discussion

The TALE gene family is a class of regulators composed of the KNOX and BELL subfamilies, which have an important regulatory role in plant growth and development. In this study, the cis-acting elements, expression patterns, and phylogeny of the NtTALE genes were comprehensively analyzed. A total of 45 NtTALE genes were identified in tobacco. In terms of physicochemical properties, the 45 NtTALE proteins differed greatly in AA and MW, and all were hydrophilic. Subcellular localization prediction placed all members of this family in the nucleus, which was consistent with the signal peptide prediction (Supplementary Table S1). This also agrees with the nuclear localization of GmBLH4, the ortholog of NtTALE33 (Supplementary Table S6) (Tao et al. 2018). The cis-acting element analysis showed that the promoter regions of the NtTALE genes contained gibberellin, meristem development and other response elements (Fig. 3). We speculate that the NtTALE genes may be involved in the regulation of multiple hormone responses and of plant growth and development. In previous research, Pisum sativum PsBELL1-2 was shown to interact with PsDELLA1, a protein regulator of the gibberellin pathway that plays an important role in symbiotic development (Dolgikh et al. 2020). Poplar PagKNAT2/6b directly inhibits the synthesis of gibberellin, altering plant architecture and improving drought resistance (Song et al. 2021). We observed that the promoter regions of the NtTALE1, NtTALE3, NtTALE12, NtTALE23, NtTALE32, NtTALE35, and NtTALE38 genes had gibberellin-responsive elements. In particular, the NtTALE32 gene of the BELL subfamily had multiple gibberellin-responsive elements. We hypothesize that these genes play an important role in gibberellin signaling.

The expression patterns of the NtTALE genes differed among treatments (Fig. 5 and Supplementary Fig. S4). In this study, we analyzed RNA-seq data from five stresses (alkali, dehydration, cold, salt, and R. solanacearum), under which 10, 19, 14, two, and 17 genes, respectively, differed significantly from the normal treatment. Among the five stresses, dehydration affected the largest number of NtTALE genes and salt stress the smallest. Multiple genes, such as NtTALE6 and NtTALE38, responded differently to different stresses. It is worth noting that NtTALE27 and NtTALE37 were up- or down-regulated to varying degrees under all five stresses, and we speculate that they are core NtTALE genes that respond to multiple stresses in tobacco. Notably, NtTALE27 was highly expressed only in immature flowers (Fig. 4). NtTALE27 and ATH1 are homologous genes. Previous research has shown that ATH1 controls floral competency as a specific activator of FLOWERING LOCUS C expression (Proveniers et al. 2007). It also modulates growth at the interface between the stem, meristem, and organ primordia and contributes to the compressed vegetative habit of A. thaliana (Gomez-Mena and Sablowski 2008). We speculate that NtTALE27 is not only involved in the response to stress but also has an important regulatory role in floral organs. The function of NtTALE27 could be further studied by gene overexpression or gene silencing (Koeppe et al. 2023; Erdoğan et al. 2023).

Under salt stress (Fig. 5D), one gene was upregulated and one gene was down-regulated, which differs markedly from the number of TALE genes (8/18) identified in sweet orange (Peng et al. 2022). In addition, wheat TaKNOX11-A enhanced the drought and salt tolerance of A. thaliana, and plants overexpressing TaKNOX11-A had reduced malondialdehyde content and increased proline content, enabling them to adapt to unfavorable environments more effectively (Han et al. 2022). The function of NtTALE37 in salt stress could be verified by CRISPR/Cas9 gene editing in the future. It is worth noting that most of the genes that were significantly upregulated under abiotic stresses belong to the BELL subfamily. Members of the BELL subfamily may therefore play an important role in responding to abiotic stresses.

Phylogenetic trees were constructed using TALE genes from A. thaliana, rice, and the Solanaceae (Fig. 1). The NtTALE genes could be divided into four categories, namely BELL (27), KNOXI (10), KNOXII (6), and KNOXIII (2) (Fig. 2A). In Phyllostachys edulis, the BELL subfamily had 31 members and the KNOX subfamily had 24 members (Que et al. 2022). Similar to the results of this study, there were more members of the BELL subfamily than of the KNOX subfamily. In addition, research has shown that members of the two subfamilies can form heterodimeric complexes in order to function (Fig. 6). In A. thaliana, BLH2/SAW1 and BLH4/SAW2 establish leaf shape by repressing growth in specific subdomains of the leaf, at least in part by repressing expression of one or more of the KNOX genes (Bellaoui et al. 2001; Kumar et al. 2007; Hackbusch et al. 2005). Soybean GmBLH4 might heterodimerize with GmSBH1 to form functional complexes that modulate plant growth and development as well as the response to high temperature and humidity stress (Tao et al. 2018). The interactions between members of the two subfamilies could be studied with the yeast two-hybrid system to further understand the function of the NtTALE genes (Wang et al. 2022).

The number of TALE genes in Nicotiana tabacum was 2–29 higher than in the other species. Orthologous gene analysis was performed with OrthoFinder, comparing Nicotiana tabacum with Solanum and Capsicum species. Taking NtTALE2 and NtTALE41 as examples, only one orthologous gene was found in the other species. The expansion of the NtTALE genes may be related to polyploidization. Research has shown that Nicotiana tabacum originated from Nicotiana sylvestris and Nicotiana tomentosiformis (Wang et al. 2024). Compared with Nicotiana sylvestris and Nicotiana tomentosiformis, some gene groups of Nicotiana tabacum were expanded and some were contracted (Supplementary Table 4). We speculate that Nicotiana tabacum lost some genes during its long evolutionary history. The emergence of the KNOXIII subfamily (NtTALE29, NtTALE45) may have resulted from proto-genes that conferred an advantage in adapting to the environment and therefore began to evolve, from point mutations, or from other causes (Carvunis et al. 2012). The unique KNOX-like gene KNATM found in A. thaliana was expressed in the proximal-lateral domain of organ primordia and at the boundary of mature organs (Magnani and Hake 2008). In tobacco, the KNOXIII genes were highly expressed only in immature flowers (Fig. 4). NtTALE45 is also a key node protein in the PPI network (Fig. 6). The functions of NtTALE29 and NtTALE45 could be further studied using genetic transformation and other means (Rahman et al. 2023).

Conclusion

The TALE gene family consists of the KNOX and BELL subfamilies, which play an important role in regulating plant growth and development. This study identified 45 TALE genes in tobacco, analyzed the expression patterns of the NtTALE genes under different stresses, and provided evidence of NtTALE gene expansion. It extends the study of tobacco gene families and provides a reference for the study of gene families in other species. The results will help to explore excellent genetic resources and to reveal stress response mechanisms, and they have theoretical value and practical significance.