Volume 247, Issue 3, pp 745–760

Evolution of intron-poor clades and expression patterns of the glycosyltransferase family 47

  • Junfeng Tan
  • Zhenyan Miao
  • Chengzhi Ren
  • Ruxia Yuan
  • Yunjia Tang
  • Xiaorong Zhang
  • Zhaoxue Han (corresponding author)
  • Chuang Ma (corresponding author)
Original Article


Main conclusion

A large-scale bioinformatics analysis revealed the origin and evolution of the GT47 gene family and identified two clades of intron-poor genes with putative functions in drought stress responses and seed development in maize.

Glycosyltransferase family 47 (GT47) genes encode β-galactosyltransferases and β-glucuronyltransferases that synthesize pectin, xyloglucans and xylan, which are important components of the plant cell wall. In this study, we performed a systematic, large-scale bioinformatics analysis of the GT47 gene family using 352 GT47 proteins from 15 species ranging from cyanobacteria to seed plants. The results showed that the GT47 family may have originated in cyanobacteria and expanded along the evolutionary trajectory to moss. Further analysis of 47 GT47 genes in maize revealed that they can be divided into five clades with diverse exon–intron structures. Of these five clades, two were composed mainly of intron-poor genes, which may have originated in moss. Gene duplication analysis revealed that the expansion of the GT47 gene family in maize was significantly driven by tandem and segmental duplication events. Notably, almost all duplicated genes are intron-poor. Expression analysis indicated that several intron-poor GT47 genes may be involved in the drought stress response and seed development in maize. This work provides insight into the origin, evolutionary process, expansion mechanisms and expression patterns of GT47 genes, and thus facilitates future investigation of their functions.


Keywords

Evolution, Expansion, Origin, Stress, Seed, Transcriptional regulation



This work was supported by the Special Fund for Basic Scientific Research of Central College (QN2011114 and 2452015412), the Fund of Northwest A&F University (Z111021603 and Z111021403), the Youth Talent Program of State Key Laboratory of Crop Stress Biology for Arid Areas (CSBAAQN2016001), and Projects of Youth Technology New Star of Shaanxi Province (2017KJXX-67).

Compliance with ethical standards

Conflict of interest

We declare that we have no competing interests.

Supplementary material

425_2017_2821_MOESM1_ESM.pdf (911 kb)
Supplementary material 1 (PDF 911 kb)
425_2017_2821_MOESM2_ESM.xlsx (17 kb)
Supplementary material 2 FPKM values of gene expression at the transcriptional and translational levels, and translational efficiencies of GT47 genes (XLSX 17 kb)
425_2017_2821_MOESM3_ESM.pdf (425 kb)
Supplementary material 3 (PDF 425 kb)
425_2017_2821_MOESM4_ESM.xlsx (22 kb)
Supplementary material 4 List of predicted regulatory relationships between TFs and GT47 genes with putative functions in drought stress responses and seed development in maize (XLSX 21 kb)



Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2017

Authors and Affiliations

  1. State Key Laboratory of Crop Stress Biology for Arid Areas, College of Life Sciences, Northwest A&F University, Yangling, China
  2. Center of Bioinformatics, College of Life Sciences, Northwest A&F University, Yangling, China
  3. Key Laboratory of Biology and Genetics Improvement of Maize in Arid Area of Northwest Region, Ministry of Agriculture, Northwest A&F University, Yangling, China
  4. Biomass Energy Center for Arid and Semi-Arid Lands, Northwest A&F University, Yangling, China
