Genome-wide analysis of the Zn(II)2Cys6 zinc cluster-encoding gene family in Aspergillus flavus
- Cite this article as:
- Chang, P. & Ehrlich, K.C. Appl Microbiol Biotechnol (2013) 97: 4289. doi:10.1007/s00253-013-4865-2
Proteins with a Zn(II)2Cys6 domain, Cys-X2-Cys-X6-Cys-X5-12-Cys-X2-Cys-X6-9-Cys (hereafter referred to as the C6 domain), form a subclass of zinc finger proteins found exclusively in fungi and yeast. Genome sequence databases of Saccharomyces cerevisiae and Candida albicans have provided an overview of this family of genes. Annotation of this gene family in most fungal genomes is still far from perfect, and refined bioinformatic algorithms are urgently needed. Aspergillus flavus is a saprophytic soil fungus that can produce the carcinogenic aflatoxin. It is the second leading causative agent of invasive aspergillosis. The 37-Mb genome of A. flavus is predicted to encode 12,000 proteins. Two and a half percent of the total proteins are estimated to contain the C6 domain, more than twofold the proportion estimated for yeast, which is about 1 %. The variability in the spacing between cysteines, C3-C4 and C5-C6, in the zinc cluster enables classification of the domains into distinct subgroups, which are also well conserved in Aspergillus nidulans. Sixty-six percent (202/306) of the A. flavus C6 proteins contain a specific transcription factor domain, and 7 % contain a domain of unknown function, DUF3468. Two A. nidulans C6 proteins containing the DUF3468 domain are involved in asexual conidiation and another two in sexual differentiation. In the anamorphic A. flavus, a homolog of one of the latter lacks the C6 domain. A. flavus, being heterothallic and reproducing mainly through conidiation, appears to have lost some components involved in homothallic sexual development. Of the 55 predicted gene clusters thought to be involved in production of secondary metabolites, only about half have a C6-encoding gene in or near the cluster. The features revealed by the A. flavus C6 proteins are likely common to other ascomycete fungi.
Keywords: Aspergillus flavus; Zinc-cluster protein; Genome; Gene cluster; Secondary metabolite; DUF3468
Biological systems contain various groups of DNA-binding proteins that are involved in regulation of many vital cellular processes, such as DNA replication, DNA repair, recombination, and transcription control. The most commonly known DNA-binding proteins include those termed zinc finger, helix-turn-helix, helix-loop-helix, basic leucine zipper, and high mobility group box, which are characterized by the secondary structure of their DNA-binding motifs. The zinc-binding proteins form one of the largest families of transcription factors in eukaryotes. In general, they are categorized into three main classes based on their zinc finger binding motifs (MacPherson et al. 2006), i.e., Cys2His2 (C2H2), Cys4 (C4), and Cys6 (C6). Only fungi and yeast contain the C6 zinc cluster DNA-binding proteins; this class of proteins has not been found in bacteria, plants, or animals. This review summarizes the roles of known fungal C6 proteins and deciphers features of C6-encoding genes in the Aspergillus flavus genome, including subgroups of the C6 domains, functions of a unique domain, DUF3468, and physical association of C6 domain genes with the 55 predicted secondary metabolite gene clusters.
Gal4p, the classical model of C6 zinc cluster DNA-binding protein
The best-studied C6 protein is the Gal4 transcriptional activator of the budding yeast, Saccharomyces cerevisiae (Johnston 1987). Gal4p binds to four related 17-base-pair sequences within an upstream activating sequence to activate transcription of the Gal1 and Gal10 genes, which are required for catabolism of galactose. Studies have identified various functional domains in the 881-amino-acid Gal4 protein; they include a DNA-binding domain (residues 1–65) (Keegan et al. 1986), a dimerization domain (residues 65–94) (Hidalgo et al. 2001; Hong et al. 2008), three acidic activation domains (Ma and Ptashne 1987b), and a region near the C-terminus that binds the inhibitor Gal80p (Ma and Ptashne 1987a). The six cysteine residues bind two Zn(II) ions in a bimetal-thiolate cluster (Pan and Coleman 1990), and the term “binuclear-cluster zinc-finger” DNA-binding domain is therefore used interchangeably with Zn(II)2Cys6 domain. Commonly, Zn(II)2Cys6 DNA-binding domains interact with DNA-binding sites consisting of conserved terminal trinucleotides, which are usually in a symmetrical configuration and are separated by an internal variable sequence of defined length ranging from 2 to 17 nucleotides (MacPherson et al. 2006; Todd and Andrianopoulos 1997).
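The binding-site architecture described above (two conserved terminal trinucleotides in a symmetrical arrangement, separated by a spacer of fixed length) can be illustrated with a short search routine. The following is a minimal sketch in Python, not a method used in this review. It assumes the half-site CGG and an 11-nucleotide spacer, i.e., the CGG-N11-CCG consensus widely reported for Gal4p, which is not stated in the text above; the function names and the toy DNA string are hypothetical.

```python
import re

# Translation table for complementing a DNA string (A<->T, C<->G).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_inverted_repeat_sites(dna, half_site="CGG", spacer=11):
    """Find sites of the form half_site + N{spacer} + revcomp(half_site).

    The defaults (CGG, 11) are an assumption modeled on the commonly
    reported Gal4p consensus; other C6 proteins would need other values.
    A lookahead is used so that overlapping sites are also reported.
    """
    right = reverse_complement(half_site)
    pattern = re.compile(f"(?=({half_site}[ACGT]{{{spacer}}}{right}))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(dna.upper())]


# Toy sequence (hypothetical): flank + CGG + 11 nt spacer + CCG + flank.
toy_dna = "TTA" + "CGG" + "AGTACTGTCCT" + "CCG" + "AT"

for start, site in find_inverted_repeat_sites(toy_dna):
    print(start, site, len(site))
```

With the default parameters each reported site is 3 + 11 + 3 = 17 nucleotides long, matching the 17-base-pair length cited for the Gal4p sites. Changing `spacer` to any value from 2 to 17 covers the range of spacings mentioned for Zn(II)2Cys6 proteins in general.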
Functions of previously characterized fungal C6 proteins
Fungal C6 proteins involved in regulation of carbon and nitrogen utilization
A majority of the fungal C6 proteins involved in the regulation of genes necessary for carbon and nitrogen utilization have been identified in A. nidulans, a fungus that has long been used as a genetic and molecular model. These include AlcR in ethanol metabolism (Felenbok et al. 1988), FacB in acetate utilization (Todd et al. 1997), QutA in quinate utilization (Beri et al. 1987), AmdR in catabolism of acetamide and omega amino acids (Andrianopoulos and Hynes 1990), PrnA in proline utilization (Scazzocchio 1994), UaY in purine catabolism (Suarez et al. 1995), and NirA in nitrate assimilation (Burger et al. 1991). A few homologs of these proteins also have been characterized in another model fungus, Neurospora crassa, such as ACU15 (FacB) (Bibbins et al. 2002), QA1F (QutA) (Baum et al. 1987), PCO1 (UaY) (Liu and Marzluf 2004), and NIT4 (NirA) (Yuan et al. 1991). Only one, HmgR, for tyrosine degradation, has been characterized in the human pathogen Aspergillus fumigatus (Keller et al. 2011). C6 proteins that regulate genes involved in degradation of complex carbohydrates have been reported mainly for industrially important fungi, for example, AmyR of Aspergillus oryzae, which regulates expression of the clustered amylolytic genes agdA (encoding alpha-glucosidase) and amyA (encoding alpha-amylase) (Gomi et al. 2000); XlnR of A. oryzae, which regulates expression of more than 30 xylanolytic and cellulolytic genes in the degradation of beta-1,4-xylan, arabinoxylan, cellulose, and xyloglucan and in the catabolism of monosaccharides (Noguchi et al. 2009); ManR of A. oryzae, which regulates expression of the endo-β-mannanase gene (Ogawa et al. 2012); and InuR of Aspergillus niger, which regulates expression of inulinolytic and sugar transport genes (Yuan et al. 2008).
Fungal C6 proteins involved in regulation of biosynthesis of secondary metabolites
Fungi are capable of producing many low molecular weight, structurally heterogeneous compounds. Because these compounds are not required for growth of the producing fungus, they are considered secondary metabolites. Some secondary metabolites, known as mycotoxins, are toxic to humans and animals, but many others have important pharmacological applications (Brakhage 2012). The C6 proteins that regulate genes involved in secondary metabolite production function as transcription activators that upregulate expression of clustered genes. One of the best-known examples is AflR, which in A. flavus and Aspergillus parasiticus controls expression of pathway genes for the production of aflatoxin and in A. nidulans controls those for the production of sterigmatocystin (Brown et al. 1996; Chang et al. 1995; Payne et al. 1993). A few other C6 regulators involved in mycotoxin production include Fusarium verticillioides FUM21 for the biosynthesis of fumonisins, which cause leukoencephalomalacia in equids and pulmonary edema in swine (Brown et al. 2007), DEP6 of Alternaria brassicicola for the biosynthesis of depudecin, a histone deacetylase inhibitor (Wight et al. 2009), and GliZ of A. fumigatus for the biosynthesis of gliotoxin, an epipolythiodioxopiperazine metabolite and a virulence factor (Bok et al. 2006). SirZ of Leptosphaeria maculans, which is homologous to A. fumigatus GliZ, is required for biosynthesis of the phytotoxin sirodesmin (Fox et al. 2008). In Cercospora nicotianae, CTB8 regulates genes required for the biosynthesis of the host non-selective photoactivated phytotoxin cercosporin (Chen et al. 2007). C6 regulators also are required for the biosynthesis of several therapeutic agents. For example, LovE of Aspergillus terreus is required for the biosynthesis of the cholesterol-lowering compound lovastatin (Huang and Li 2009).
Two LovE homologs, MokH and MlcR, required for the biosynthesis of the cholesterol-lowering metabolites monacolin K and compactin, respectively, also have been studied in Monascus pilosus (Chen et al. 2010) and Penicillium citrinum (Abe et al. 2002). ApdR, AfoA, and MdpE of A. nidulans are required for the biosynthesis of the anti-cancer compounds aspyridones (Bergmann et al. 2007), asperfuranone (Chiang et al. 2009), and monodictyphenone (Chiang et al. 2010), respectively. However, CtnA of Monascus purpureus, a homolog of A. nidulans AfoA, is involved in the biosynthesis of the nephrotoxic polyketide citrinin (Shimizu et al. 2007). Pigments constitute another group of fungal secondary metabolites that have important functions, including infection of hosts and protection of cells from photodamage. Cmr1p of Colletotrichum lagenarium regulates melanin biosynthesis, as do its counterparts Pig1p in Magnaporthe grisea (Tsuji et al. 2000) and BMR1 in Bipolaris oryzae (Kihara et al. 2008). Bik4 is required for biosynthesis of the red pigment bikaverin in Fusarium fujikuroi (Wiemann et al. 2009). GIP2 regulates biosynthesis of the mycelial pigment aurofusarin in Gibberella zeae (anamorph: Fusarium graminearum) (Kim et al. 2006).
Identification of additional genes encoding a Zn(II)2Cys6 domain in the A. flavus genome database
Average distribution of C6-encoding genes in A. flavus, A. nidulans, and yeast genomes
The numbers (in parentheses) of C6 proteins for other Aspergillus species identified through automated annotation and listed in the Aspergillus Comparative Database are as follows: A. clavatus (180), A. fumigatus (186), A. nidulans (243), A. niger (236), A. terreus (181), and A. oryzae (177). The genome size of A. flavus is 36.8 Mb and that of A. nidulans is 30.1 Mb. The estimated total numbers of C6 proteins from the two aspergilli are comparable when the genome size of each species is taken into consideration. If distributed evenly, approximately one C6-encoding gene would reside in each 130-kb genomic region. All other species in the genus Aspergillus whose genomes have been sequenced have eight chromosomes, but their genome sizes vary. The genome size of A. fumigatus is 29.4 Mb, A. niger 37.2 Mb, A. terreus 29.3 Mb, A. clavatus 27.9 Mb, and A. oryzae 37.1 Mb. As for A. flavus, the total numbers of C6 factors for these aspergilli likely have been underestimated. Refinement of the gene-call algorithms and bioinformatic protocols will undoubtedly yield a significant increase in the number of C6-encoding genes identified. In yeast, 54 and 70 C6-containing proteins have been reported for S. cerevisiae (Akache et al. 2001; MacPherson et al. 2006) and C. albicans (Maicas et al. 2005), which have genome sizes of 11.8 and 14.5 Mb, respectively. The C6-encoding gene frequency in the yeast genomes is equivalent to about one in 200 kb, which is much lower than that estimated for the aspergilli. Genome augmentation via duplication/acquisition in fungi probably is responsible for the marked increase in the number of C6 genes, allowing the fungi to cope with a more complex environment and to occupy and adapt to specific living niches.
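The density comparison in this paragraph is simple arithmetic: genome size divided by the number of C6-encoding genes. A minimal sketch of the calculation follows. It takes the genome sizes and yeast counts from the paragraph, the A. nidulans count (243) from the automated annotation list, and, as an assumption, the A. flavus total of 306 from the abstract. The rounded figures quoted in the text (about 130 kb and 200 kb) may rest on slightly different counts, so the output should be read as an order-of-magnitude check only.

```python
# (genome size in Mb, number of C6-encoding genes) per species.
# The A. flavus count of 306 is taken from the abstract; treating it as
# the figure behind the "one per 130 kb" estimate is an assumption.
GENOMES = {
    "A. flavus": (36.8, 306),
    "A. nidulans": (30.1, 243),
    "S. cerevisiae": (11.8, 54),
    "C. albicans": (14.5, 70),
}


def kb_per_gene(size_mb, n_genes):
    """Average genomic span (kb) per C6-encoding gene, assuming even spacing."""
    return size_mb * 1000 / n_genes


for species, (size_mb, n_genes) in GENOMES.items():
    print(f"{species}: one C6 gene per {kb_per_gene(size_mb, n_genes):.0f} kb")
```

Under these inputs the two aspergilli come out at roughly 120 to 125 kb per gene and the two yeasts at a little over 200 kb per gene, which is consistent with the contrast drawn in the text.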
Sub-grouping of Zn(II)2Cys6 domains of A. flavus and A. nidulans
Zinc cluster DNA-binding domains of A. flavus and A. nidulans
Domain of unknown function, DUF3468
As mentioned earlier, among the 94 C6-encoding genes that were predicted by the Conserved Domain search not to encode a fungal-specific TF domain, ten were found to encode a unique domain called DUF3468 (DUF, domain of unknown function). This domain, 350 to 400 amino acids in size, is present in a family of putative fungal transcription factors, typically in the carboxyl region. A “DUF3468” keyword search of the Aspergillus Comparative Database revealed a total of 23 annotated DUF3468 proteins. Manual analyses of the remaining 13 DUF3468 proteins indicate that 10 additional proteins contain a C6 domain. The 20 genes are AFL2G_00121.2, AFL2G_00473.2, AFL2G_01202.2, AFL2G_01693.2, AFL2G_03094.2, AFL2G_03721.2, AFL2G_03753.2, AFL2G_04415.2, AFL2G_06402.2, AFL2G_06574.2, AFL2G_07853.2, AFL2G_07980.2, AFL2G_08040.2, AFL2G_08203.2, AFL2G_09406.2, AFL2G_09466.2, AFL2G_09728.2, AFL2G_09865.2, AFL2G_11881.2, and AFL2G_12301.2. All have the C6 pattern of C-2-C-6-C-6-C-2-C-6-C. The remaining three that do not encode a C6 domain are AFL2G_00885.2, AFL2G_05084.2, and AFL2G_08434.2. Other genomes of aspergilli in the Aspergillus Comparative Database also contain various numbers of genes encoding proteins with a DUF3468 domain.
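The spacing pattern quoted above, C-2-C-6-C-6-C-2-C-6-C, is one instance of the general C6 motif given in the abstract, Cys-X2-Cys-X6-Cys-X5-12-Cys-X2-Cys-X6-9-Cys. A pattern of this kind translates directly into a regular expression, which also makes it easy to read off the spacer lengths used for sub-grouping. The following Python sketch is only an illustration of that idea and is not the search procedure used in this review; the function names and the toy peptide are hypothetical, and the toy peptide is not taken from any A. flavus gene.

```python
import re

# General C6 motif: Cys-X2-Cys-X6-Cys-X5-12-Cys-X2-Cys-X6-9-Cys.
# Each group captures one spacer so that its length can be reported.
# Note: "." also matches cysteine, so spacers may themselves contain C.
C6_MOTIF = re.compile(r"C(.{2})C(.{6})C(.{5,12})C(.{2})C(.{6,9})C")


def find_c6_domains(protein):
    """Return (start, matched_text, spacer_lengths) for each motif hit."""
    hits = []
    for m in C6_MOTIF.finditer(protein.upper()):
        spacers = tuple(len(g) for g in m.groups())
        hits.append((m.start(), m.group(0), spacers))
    return hits


def spacing_label(spacers):
    """Format spacer lengths in the C-2-C-6-... notation used in the text."""
    return "C-" + "-C-".join(str(n) for n in spacers) + "-C"


# Hypothetical toy peptide, for illustration only.
toy_protein = "MKAACDICRLKKLKCSKEKPKCAKCLKNNWECRYS"

for start, text, spacers in find_c6_domains(toy_protein):
    print(start, text, spacing_label(spacers))
```

Because the third and fifth spacers are the variable ones (X5-12 and X6-9), grouping hits by their lengths gives a rough version of the sub-grouping by C3-C4 and C5-C6 spacing mentioned in the abstract. A real survey would still need manual inspection, since a bare pattern match cannot distinguish a functional zinc cluster from a chance arrangement of cysteines.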
C6 regulators with DUF3468 involved in asexual conidiation of A. nidulans and A. flavus
Presence of DUF3468 in C6 regulators for A. nidulans sexual differentiation
Two A. nidulans C6-encoding genes shown to be involved in sexual development, rosA (repressor of sexual development, AJ519682, ANID_05170.1) and nosA (number of sexual spores, AM231027, ANID_01848.1) (Vienken and Fischer 2006; Vienken et al. 2005), also encode proteins that possess C-terminal DUF3468 (Pfam: PF11951) domains, as revealed by our Conserved Domain search. These two DUF3468 domains share 51 % amino acid sequence identity. A. nidulans RosA downregulates expression of the sexual development regulatory genes nsdD, veA, and stuA. Overexpression of rosA resulted in colonies with fluffy, cotton-like hyphae (Vienken et al. 2005). The A. nidulans nosA gene, upregulated during late asexual development, is required for completion of the sexual cycle. Defects in nosA block sexual development at the primordial stage, although mutants occasionally produce minute cleistothecia containing fertile ascospores (Vienken and Fischer 2006). AFL2G_01801.2 of A. flavus is the ortholog of A. nidulans nosA, with 71 % identity and 82 % positives between the predicted amino acid sequences. A. flavus NosA is also a C6 protein with a DUF3468 domain. AFL2G_03812.2 is the ortholog of A. nidulans rosA (48 % identity and 64 % positives), and the encoded RosA has a DUF3468 domain. However, the region corresponding to the C6-containing portion of A. nidulans RosA has been replaced by a PAT1/TFIIA/DUF1421 domain. Being heterothallic and reproducing largely through asexual conidiation, A. flavus appears to have lost some of the components involved in homothallic sexual development. Although sexual reproduction under laboratory conditions has recently been demonstrated with A. flavus strains of different mating types (Horn et al. 2009), strict regulation of the sexual cycle may no longer be necessary for A. flavus.
Physical association of C6-encoding genes with A. flavus secondary metabolite gene clusters
Table: Physical association of C6 domain genes with the 55 gene clusters in the A. flavus genome (column header: Zn(II)2Cys6 gene ID)
With increasing numbers of fungal genomes being sequenced, a wealth of information concerning gene sequence and location is becoming readily available. Bioinformatics has expanded our ability to predict gene function and analyze the organization of gene clusters. Comparative genome studies have been performed to decipher evolutionary relationships among related species (Galagan et al. 2005; Payne et al. 2006; Sato et al. 2011) or among strains of the same species (Borneman et al. 2011). Emphasis must now shift toward examining the functions of annotated groups of genes. Current protocols for automatic gene prediction are still far from perfect. Refinement of bioinformatic algorithms to enhance the accuracy of gene prediction and annotation is therefore a prerequisite for the advance of functional genomics studies. Comparison of C6 domains and the normally conserved downstream basic amino acid dimerization region will spur investigation of mechanisms of phylogenetic diversity among different fungal species. The central role of the C6 proteins, as either activators or repressors that modulate expression of the genes they control, is evident. How C6-encoding genes are activated, and how C6 proteins are posttranslationally modified and interact with co-activators or globally acting transcription factors via the TF or DUF3468 domain, needs to be pursued further. Their roles in basic fungal development and differentiation also are largely unknown. Association of C6 proteins with the ability to infect and colonize host plants (Bluhm et al. 2008; Imazaki et al. 2007) is another new but rarely explored field. C6 proteins have been implicated in multidrug resistance and in response to stress such as heat shock, low pH, and high osmolarity in S. cerevisiae (Akache et al. 2001; MacPherson et al. 2006). However, no studies have probed this important area of transcription regulation, which is critical for fungal survival.
Future studies of this fundamental class of regulators will bring both challenges and surprises.