Gene organization and evolutionary history

Glutathione transferases (GSTs; EC 2.5.1.18) are soluble proteins with typical molecular masses of around 50 kDa, and each is composed of two polypeptide subunits. Classically, GSTs catalyze the transfer of the tripeptide glutathione (γ-glutamyl-cysteinyl-glycine; GSH) to a cosubstrate (R-X) containing a reactive electrophilic center to form a polar S-glutathionylated reaction product (R-SG). These enzymes were first discovered in animals in the 1960s as a result of their importance in the metabolism and detoxification of drugs [1]. Their presence in plants was first recognized shortly afterwards in 1970, when a GST activity from maize was shown to be responsible for conjugating the chloro-S-triazine atrazine with GSH, thereby protecting the crop from injury by this herbicide [2]. Since that time, GST activities, or the corresponding enzymes or gene sequences, have been identified in all animals, plants and fungi analyzed to date [1,3]. In addition to the dimeric soluble GSTs, other proteins have been identified as having a restricted ability to conjugate xenobiotics (foreign organic compounds) with GSH, notably the distantly related mitochondrial kappa GSTs and the trimeric microsomal GSTs of animals [4,5]. These GSTs will not be considered further in this article.

Many GSTs have been purified from animals over the last 30 years and classified by their biochemical and immunological characteristics [1]. Sequencing studies were used to extend this system, and the mammalian GSTs active in drug metabolism are now classified into the alpha, mu and pi classes (Figure 1). Additional classes of GSTs have been identified in animals that do not have major roles in drug metabolism; these include the sigma GSTs, which function as prostaglandin synthases [6]. In cephalopods, however, sigma GSTs are lens S-crystallins [3], giving an indication of the functional diversity of these proteins. An insect-specific delta class has also been described [7], and bacteria contain a prokaryote-specific beta class of GST [8].

Figure 1

Phylogenetic tree illustrating the diversity of GSTs and the relationships between classes. All the GSTs identified from Arabidopsis are shown in black; representative GSTs from other classes and organisms are shown in red, and their names are prefixed with two letters denoting the source organism: Hs, Homo sapiens; Rr, Rattus rattus; Rn, Rattus norvegicus; Ss, Sus scrofa; An, Aspergillus nidulans; Pm, Proteus mirabilis; Ec, Escherichia coli. Branch lengths correspond to the estimated evolutionary distance between protein sequences.

Following the purification and cloning of GSTs active in herbicide detoxification in maize in the 1980s [9], it quickly became apparent that the plant enzymes differed significantly in sequence from their mammalian counterparts [10]. Since that time, a large number of GSTs and GST-like sequences have been cloned from a variety of plants, and in order to make sense of this plethora of genes, a classification system was set up, at first with just three classes, theta, tau and zeta [10]. As our understanding of GST gene families in plants and animals has expanded, this original classification system has had to be refined. To best understand the organization of GSTs in higher plants we can now take advantage of the full genome sequence of Arabidopsis thaliana.

From predicted amino-acid sequence, the soluble dimeric GSTs of Arabidopsis may be grouped into four classes. Extending the accepted nomenclature used in the mammalian GSTs, these are termed the phi, zeta, tau and theta classes (Figure 1). Phylogenetic analysis would suggest that all soluble GSTs have arisen from an ancient progenitor gene. Zeta and theta GSTs are found in both animals and plants, but the tau and phi classes are plant-specific. Searches for GST-like sequences in the Arabidopsis genome identify a further two genes related to soluble GSTs (EMBL AL132970, gene T15C9_60 and EMBL AL162973, gene F9G14_100; our unpublished data). These genes have clearly evolved from a GST progenitor and are known to be co-induced with phi and tau GSTs in cereals following exposure to herbicide safeners, chemicals that increase tolerance to herbicides [11]. We have termed these GST-like genes lambda GSTs.

The Arabidopsis genome [12] contains 48 GST-like genes. The tau and phi GSTs are the most numerous, being represented by 28 and 13 genes respectively, whereas there are only three theta GSTs, two zeta GSTs and two lambda GSTs. By comparison, the genome of the photosynthetic cyanobacterium Synechocystis [13] contains only four GST-like sequences (one phi-like, one zeta-like, one lambda-like, and one most similar to unclassified prokaryotic sequences). The mechanisms giving rise to the multiplicity of GST genes in Arabidopsis become apparent when their positioning on the chromosomes is analyzed (Figure 2): 34 of the GST genes are present in clusters that clearly arose from gene-duplication events. This GST gene duplication has given rise to considerable sequence diversification. At the level of predicted amino acid sequence, identity between classes is typically less than 30%, and even within classes identity can be as low as 30%.
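
The gene counts quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python that uses only the numbers given in the text; the variable and function names are illustrative and do not come from any published analysis.

```python
# Gene counts per class, as stated in the text for the Arabidopsis genome.
class_counts = {"tau": 28, "phi": 13, "theta": 3, "zeta": 2, "lambda": 2}

# Number of GST genes reported to lie in clusters (from the text).
clustered = 34

# The per-class counts should add up to the 48 GST-like genes quoted.
total = sum(class_counts.values())


def share(n: int, whole: int) -> float:
    """Return n as a percentage of whole, rounded to one decimal place."""
    return round(100 * n / whole, 1)


# Percentage of the family contributed by each class.
shares = {name: share(n, total) for name, n in class_counts.items()}

# Percentage of GST genes found in clusters.
clustered_share = share(clustered, total)

# Combined contribution of the two plant-specific classes.
plant_specific_share = share(class_counts["tau"] + class_counts["phi"], total)

if __name__ == "__main__":
    print(f"total genes: {total}")
    for name, pct in shares.items():
        print(f"{name:>6}: {pct}%")
    print(f"clustered: {clustered_share}%")
    print(f"tau + phi: {plant_specific_share}%")
```

On these figures the tau and phi classes together account for well over four-fifths of the family, and roughly seven in ten GST genes sit in clusters.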

Figure 2

Distribution of GST genes in the Arabidopsis genome, showing clustering of GSTs of the same class because of gene duplication. Chromosomes are represented by numbered gray bars; each triangle represents a single GST gene. The organization of the coding sequence of a typical gene in each class is shown below, drawn to scale; intron positions are shown as black lines.

From the size and sequence diversity within the GST superfamily in Arabidopsis, it is clear that there is scope for considerable functional diversification; analysis of expressed sequence tag (EST) databases shows that 41 out of the 48 GST genes are expressed (Felix Mauch, personal communication). It is also probable that several GSTs have overlapping functions, effectively leading to some redundancy. Although genome information is not yet available in the public domain for other plants, analyses of large-scale EST projects in major crops provide valuable additional information on the relative diversity of the GST gene family in plants. In maize, 12 phi, 28 tau and 2 zeta GST sequences were reported, while in soybean, 20 tau, 1 zeta and 4 phi GSTs were identified [11]. The relative abundance of the GSTs from each class in these EST studies is broadly similar to the gene distribution determined in the Arabidopsis genome, although some plants, such as soybean, may contain a smaller complement of phi GSTs than maize or Arabidopsis.

Classification and nomenclature

The large size of the GST family requires an unambiguous system of classification. At the protein level, this complexity is compounded by the possibility of GSTs being composed of either identical or dissimilar subunits. Such a classification system has been developed for animal GSTs and we have proposed to extend its use to plant GSTs [14]. Using the GSTs of Arabidopsis as an example, the nomenclature of the system is explained in Figure 3. A remaining problem lies in the numbering of the subunits. In organisms such as Arabidopsis, for which comprehensive genome information is available, it is possible to assign the numbering of the genes encoding the GST subunits of each class on the basis of their location on the chromosomes (Figure 2). In plants for which genome information is incomplete or unavailable, however, the current numbering system is based on the order of discovery of the GST genes for each class in the given plant species [14].

Figure 3

Nomenclature for Arabidopsis and other plant GSTs, adapted from the mammalian GST classification system [14].
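
The naming pattern used for the enzymes in this article (for example ZmGSTF1-2 or TaGSTU1-1) can be illustrated with a small parser. The following Python sketch is inferred from the names that appear in the text and is not a formal specification of the system in [14]. The class letters F (phi), U (tau) and Z (zeta) follow names used in this article; T (theta) and L (lambda) are assumptions, and the helper names `parse_gst_name` and `GstName` are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Tuple

# Class letters. F, U and Z are taken from names used in the article
# (ZmGSTF..., TaGSTU..., GSTZ); T and L are assumed for theta and lambda.
CLASS_LETTERS = {"F": "phi", "U": "tau", "Z": "zeta", "T": "theta", "L": "lambda"}

# Two-letter species prefix, "GST", one class letter, a subunit number,
# and optionally a second subunit number after a hyphen (for a dimer).
NAME_RE = re.compile(r"^([A-Z][a-z])GST([A-Z])(\d+)(?:-(\d+))?$")


@dataclass(frozen=True)
class GstName:
    species: str                 # e.g. "Zm" for Zea mays
    gst_class: str               # e.g. "phi"
    subunits: Tuple[int, ...]    # one number for a subunit, two for a dimer

    @property
    def is_dimer(self) -> bool:
        return len(self.subunits) == 2

    @property
    def is_heterodimer(self) -> bool:
        return self.is_dimer and self.subunits[0] != self.subunits[1]


def parse_gst_name(name: str) -> GstName:
    """Split a name such as 'ZmGSTF1-2' into species, class and subunits."""
    match = NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a recognizable GST name: {name!r}")
    species, letter, first, second = match.groups()
    if letter not in CLASS_LETTERS:
        raise ValueError(f"unknown class letter {letter!r} in {name!r}")
    if second is None:
        subunits = (int(first),)
    else:
        subunits = (int(first), int(second))
    return GstName(species, CLASS_LETTERS[letter], subunits)


if __name__ == "__main__":
    for example in ("ZmGSTF1-2", "TaGSTU1-1", "ZmGSTF2"):
        print(example, parse_gst_name(example))
```

Under this reading, a single number denotes a subunit (ZmGSTF2), a repeated number a homodimer (TaGSTU1-1) and two different numbers a heterodimer (ZmGSTF1-2).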

Characteristic structural features

Each soluble GST is a dimer of approximately 26 kDa subunits, typically forming a hydrophobic 50 kDa protein with an isoelectric point in the pH range 4-5. In the case of phi and tau GSTs, only subunits from the same class will dimerize [15,16]. Within a class, however, the subunits can dimerize even if they are quite different in amino-acid sequence [15]. As determined for the GSTs active in herbicide metabolism in maize and wheat, the ability to form heterodimers greatly increases the diversity of the GSTs in planta [2], but the functional significance of this mixing and matching of subunits has yet to be determined.

The structural biology of GSTs derived from the different classes has been studied in detail, with high-resolution crystal structures available for the mammalian alpha, mu, pi, zeta, sigma and theta GSTs, as well as the bacterial beta GSTs [3,17]. Structural information on plant GSTs is available for phi GSTs from Arabidopsis [18] and maize [19,20] and for a zeta-class GST from Arabidopsis [21]. Despite the extreme sequence divergence between the GST classes the overall structures of the enzymes are remarkably similar (Figure 4). Some of the structural characteristics of GSTs are also observed in other GSH-dependent proteins, such as glutaredoxin, suggesting a strong evolutionary pressure to retain structural motifs involved in binding GSH at the active site [3,5,17].

Figure 4

Ribbon representations of the structures of GST subunits. The GSTs specific to mammals (alpha, mu, pi and sigma) have a blue background; the plant-specific (phi) and bacteria-specific (beta) GSTs have yellow and white backgrounds, respectively; GSTs (theta and zeta) that have counterparts in both animals and plants have green backgrounds. The structure of a tau GST has yet to be reported. Although there is little sequence similarity between enzymes of different classes, there is significant conservation in overall structure.

Each GST subunit of the protein dimer contains an independent catalytic site composed of two components (Figure 5a). The first is a binding site specific for GSH or a closely related homolog (the G site) formed from a conserved group of amino-acid residues in the amino-terminal domain of the polypeptide. The second component is a site that binds the hydrophobic substrate (the H site), which is much more structurally variable and is formed from residues in the carboxy-terminal domain. Between the two domains is a short variable linker region of 5-10 residues (Figure 5a). The G and H sites of the enzyme are quite mobile when the crystal structure is determined, suggesting that the GST subunits undergo significant conformational changes on binding the substrates. This is demonstrated in the difference in structure of the apoenzyme of phi ZmGSTF3-3 as compared with the ternary complexes (containing GSH and substrate) of other phi class GSTs [19]. Significantly, an induced-fit mechanism for GSH binding has been suggested for other classes of GSTs [3].

Figure 5

Overview of GST dimer structure and substrate binding. (a) A ribbon/surface representation of a typical GST subunit (Z. mays GSTF1, pdb 1BYE), with the amino-terminal domain in green, the linker region in red, the carboxy-terminal domain in blue and the protein surface in gray. A glutathione conjugate of the herbicide atrazine in ball-and-stick representation is shown binding at the active site; the GSH-binding site (G site) is highlighted in yellow and the hydrophobic site (H site) is highlighted in blue. (b) A ribbon/surface representation of the ZmGSTF1 homodimer oriented with the amino-terminal domains at bottom left and top right and the subunits in blue and purple. The atrazine-glutathione conjugates are shown in ball-and-stick representation, bound at the active site of each subunit. The dimer is formed by a ball-and-socket interaction between the amino- and carboxy-terminal domains of the different subunits (see text for further details); the deep cleft between subunits is characteristic of phi GSTs.

The subunits that make up the dimer are related by two-fold symmetry as shown in Figure 5b. The dimer interface is large, with a buried surface area of between 2,700 and 3,400 Å². Most classes of GST have one of two types of subunit interface, either a hydrophobic ball-and-socket interface (alpha, mu, pi, and phi classes; as illustrated in Figure 5), or a hydrophilic interface (theta, sigma and beta classes) [5]. Subunits from different classes of GST are not able to dimerize because of the incompatibility of the interfacial residues. As the active sites of each subunit are normally catalytically independent, the reasons that all classes of active soluble GSTs described so far are dimers, rather than monomers, remain unclear.

Localization and function

Location and regulation

Biochemical and immunological investigations point to a largely cytosolic localization for soluble GSTs in plants [14,22]. This is borne out by genomic analysis of the Arabidopsis GSTs: only one phi GST and one lambda GST show evidence of subcellular targeting to the plastid or mitochondria, and all the other GSTs contain no putative targeting sequence and would be anticipated to be in the cytoplasm. A limited number of accounts, however, report expression of specific GSTs in the nucleus and extracellularly [14].

Although it has been known for some time that GSTs in major cereal crops are very highly expressed, representing up to 2% of the total protein in the foliage, relatively few studies have addressed their tissue-specific expression in plants. In one study carried out in in-bred maize lines, different GST isoenzymes were seen to be expressed in different tissues [23]. Pollen, for example, contained a single GST, whereas the scutellum contained five distinct isoenzymes. Similar specific patterns of GST expression are suggested by EST analyses of cDNA libraries prepared from the differing parts of maize plants [11]. Tissue-specific expression can be overridden by exposing plants to chemical treatments: maize (Zea mays) ZmGSTF2, for example, is normally expressed only in the roots, appearing in the foliage only after exposure to herbicide safeners or chemical treatments [24]. Similar patterns of expression have been determined using the promoter of a soybean tau GST to drive the β-glucuronidase reporter gene in transgenic tobacco [25].

The inducibility of phi and tau GSTs following exposure of plants to biotic and abiotic stresses is a characteristic feature of these genes, and many plant GSTs have been cloned by screening for cDNAs corresponding to stress-induced transcripts using differential or subtractive screening methods [22]. Several tau GSTs are known to be strongly induced during cell division or when plants are exposed to auxin or cytokinin plant hormones [22,26]. In the course of biotic stress, both tau and phi GSTs are known to be induced by infection or by treatments that invoke plant defense reactions, as well as by osmotic stress and extreme temperatures [22]. In some instances it seems likely that GSTs are induced by the general oxidative stress caused by these diverse treatments, but in other cases GST induction is specific to the particular stress [27]. Expression of GSTs is also enhanced following exposure to a range of xenobiotics: again, GSTs may be induced in response to the general cellular injury and oxidative stress caused by herbicides and chemical toxins [22]. Other chemicals can induce the expression of specific phi and tau GSTs without imposing any discernible chemical stress on the plant, however. The best example of this is seen in cereals treated with herbicide safeners, chemicals that enhance herbicide tolerance by increasing the expression of detoxifying enzymes, including a subset of GSTs [28].

From what is known of the regulation of GSTs in response to biotic stress and chemical treatments, it would be anticipated that their expression is regulated predominantly at the level of transcription [22,28]. In some instances, stress treatments can give rise to new GST variants through alternative splicing, though the significance of this is not clear [29]. The transcriptional regulation of individual subunits ultimately influences the range of GST homodimers and heterodimers formed. For example, under constitutive conditions, the dominant GSTs in the foliage of maize and wheat are the tau TaGSTU1-1 and phi ZmGSTF1-1 homodimers, respectively [24,30]. Following treatment with herbicide safeners there is an increased synthesis of specific subunits and novel heterodimers are observed [24,30]. In maize, safeners induce the synthesis of the ZmGSTF2 subunit, which then associates with the constitutively expressed ZmGSTF1 subunit to form the ZmGSTF1-2 heterodimer, one of the major GST isoenzymes in safener-treated tissue [24]. In wheat, the three tau GST subunits, TaGSTU2, TaGSTU3 and TaGSTU4, are induced by safeners and this results in their dimerization with the constitutively produced TaGSTU1 subunit to form the TaGSTU1-2, TaGSTU1-3 and TaGSTU1-4 heterodimers, respectively [30]. Current evidence would suggest that the relative abundance of the safener-induced heterodimers is regulated primarily by the relative abundance of newly synthesized subunits.
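
Because subunits dimerize only within a class, the number of dimers that could in principle form from n subunits of one class is n(n + 1)/2 (n homodimers plus n(n - 1)/2 heterodimers). The following Python sketch enumerates these combinations for the four wheat tau subunits named above. It counts what is combinatorially possible and does not claim that every combination has been observed; the function name `possible_dimers` is hypothetical.

```python
from itertools import combinations_with_replacement
from typing import Iterable, List


def possible_dimers(prefix: str, subunit_numbers: Iterable[int]) -> List[str]:
    """List every homodimer and heterodimer name for subunits of one class.

    prefix is the species and class part of the name, e.g. "TaGSTU".
    """
    numbers = sorted(set(subunit_numbers))
    return [
        f"{prefix}{a}-{b}"
        for a, b in combinations_with_replacement(numbers, 2)
    ]


# The four wheat tau subunits named in the text: TaGSTU1 to TaGSTU4.
wheat_dimers = possible_dimers("TaGSTU", [1, 2, 3, 4])

# Dimers explicitly mentioned in the text for wheat.
reported_in_text = {"TaGSTU1-1", "TaGSTU1-2", "TaGSTU1-3", "TaGSTU1-4"}

# Combinations that are possible on paper but not mentioned in the text.
not_mentioned = [d for d in wheat_dimers if d not in reported_in_text]

if __name__ == "__main__":
    print(len(wheat_dimers), "possible dimers:", ", ".join(wheat_dimers))
    print("not mentioned in the text:", ", ".join(not_mentioned))
```

Four subunits give ten possible dimers, of which the text names four; which of the remaining combinations actually accumulate would depend on the relative abundance of the subunits, as discussed above.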

Cellular functions

Unusually for such a large and well-studied gene family, the functions of many GSTs are, at best, poorly understood. In animals, GSTs in the liver and other organs have a well-proven role in detoxifying ingested or absorbed toxins of both natural and synthetic origin. GSTs also have roles in detoxifying oxidative-stress metabolites, as well as having essential roles in leukotriene biosynthesis [1,3]. In plants, these roles are less obvious, and the description of these enzymes as glutathione transferases may be misleading with respect to the major functions of these proteins in planta. A typical GST reaction involving xenobiotics results in the conjugation of the toxic substrate to form an S-glutathionylated reaction product (Figure 6a). In plants, the S-glutathionylated conjugate is then rapidly transported from the cytosol into the vacuole for further processing through the action of specific transporters of the ATP-binding cassette class [14]. Despite the presence of this specific detoxification system, there is little empirical evidence that natural products are similarly S-glutathionylated and processed. On the premise that GSTs and associated detoxification pathways did not evolve to metabolize synthetic compounds, however, it is reasonable to assume that they must also function in endogenous metabolism, with the caveat that these conjugates are probably formed in small amounts and are rapidly turned over. Recently, GSTs have been assigned a more definitive role in natural-product metabolism following the observation that mutations in a maize tau GST gene, termed Bronze2, and a Petunia hybrida phi GST, termed An9, result in a failure of the plants to deposit flavonoid-derived pigments in the vacuole [14]. Recent studies have shown that these GSTs appear to be involved in the intracellular binding and stabilization of flavonoids [31], rather than in catalyzing their glutathionylation (Figure 6b).

Figure 6

Overview of known GST functions in plants. (a) In secondary metabolism, GSTs detoxify toxins by conjugation with GSH; the conjugates (toxin-SG) are then transported into the vacuole by ABC transporters (shown as circles) prior to proteolytic processing. (b) Some phi and tau class enzymes are also required for transport of flavonoid pigments to the vacuole. (c-e) Roles of GSTs in stress metabolism include (c) acting as glutathione peroxidases that can reduce cytotoxic DNA and lipid hydroperoxides; (d) acting in an antioxidant capacity, protecting against Bax-induced cell death; and (e) participating in stress signaling, playing a role in the induction of chalcone synthase following exposure to ultraviolet light. (f) Finally, zeta GSTs (GSTZ) have a role in primary metabolism as maleylacetoacetate isomerases. Wide arrows denote an induction process; narrow arrows denote enzymatic reactions; thick lines denote inhibition of a reaction; R, an alkyl group.

The idea that GSTs have additional functions not directly derived from their ability to catalyze the formation of GSH conjugates has gained further ground with studies demonstrating that several stress-inducible GSTs protect plants from oxidative injury by functioning as glutathione peroxidases [32,33]. Certain theta, phi and tau GSTs have been shown to have glutathione peroxidase activity, with the GSTs using glutathione to reduce organic hydroperoxides of fatty acids and nucleic acids to the corresponding monohydroxyalcohols (Figure 6c). This reduction plays a pivotal role in preventing the degradation of organic hydroperoxides to cytotoxic aldehyde derivatives. This functionality in GSTs has been demonstrated to be important in tolerance of transgenic tobacco to chilling and salt [32] and in herbicide cross-resistance in black-grass [33]. Interestingly, a further link between GSTs and oxidative-stress tolerance has been established by the finding that when expressed in yeast, a tau GST from tomato can suppress apoptosis induced by the Bax protein [34], apparently by preventing oxidative damage (Figure 6d). GSTs may also function in stress tolerance through a role in cell signaling (Figure 6e), following the observation that the induction of the genes encoding enzymes of flavonoid biosynthesis in parsley by ultraviolet light requires GSH and the expression of a specific tau GST [35]. A further catalytic role that does not involve GSH conjugation has been demonstrated for the zeta GSTs. The Arabidopsis zeta GST catalyzes the GSH-dependent isomerization of maleylacetoacetate to fumarylacetoacetate (Figure 6f), the penultimate step in tyrosine degradation [36].

Enzymatic mechanism

From a consideration of the way in which plant GSTs have adapted to fulfill a diverse range of functions, it is of interest to study the enzyme chemistry of the GSTs. The conserved nature of the G site suggests that the binding and correct orientation of glutathione is of central importance. The G site also facilitates the ionization of the sulphydryl group of GSH to yield the highly reactive thiolate anion through hydrogen bonding with an adjacent hydroxyl group. In mammalian alpha, pi and mu GSTs a tyrosine residue performs this function, whereas in all the plant enzymes this residue is replaced with a serine. For example, in ZmGSTF1-1 the effect of this hydrogen-bonding activation is to lower the pKa of the thiol from 8.7 to 6.2 [37]. In contrast, the beta GSTs have a cysteine in place of the serine/tyrosine residue; this promotes the formation of mixed disulphides with GSH, resulting in a very different catalytic activity from that of the other GSTs. The more variable H site is responsible for accepting a wide range of hydrophobic cosubstrates of diverse chemistries, with the powerful thiolate anion then driving a range of reactions. From what is known of the enzyme kinetics of the glutathione conjugation of model xenobiotic substrates, the reactions would be anticipated to follow a random sequential two-substrate, two-product mechanism with the overall reaction rate being determined by the rate of release of reaction product from the active site [37].
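
The consequence of the pKa shift quoted for ZmGSTF1-1 can be made concrete with the Henderson-Hasselbalch relationship, under which the fraction of a thiol present as thiolate at a given pH is 1 / (1 + 10^(pKa - pH)). The following Python sketch applies this to the two pKa values from the text. The pH of 7.0 is an assumption chosen for illustration, not a value from the source, and the function name `thiolate_fraction` is hypothetical.

```python
def thiolate_fraction(pka: float, ph: float) -> float:
    """Fraction of a thiol present as the thiolate anion at the given pH.

    Derived from the Henderson-Hasselbalch equation:
    pH = pKa + log10([thiolate] / [thiol]).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# pKa values quoted in the text for the GSH thiol with ZmGSTF1-1 [37].
PKA_UNACTIVATED = 8.7   # thiol before hydrogen-bonding activation
PKA_ACTIVATED = 6.2     # thiol activated at the G site

# Assumed pH for illustration only; not taken from the source.
ASSUMED_PH = 7.0

unactivated = thiolate_fraction(PKA_UNACTIVATED, ASSUMED_PH)
activated = thiolate_fraction(PKA_ACTIVATED, ASSUMED_PH)

if __name__ == "__main__":
    print(f"thiolate fraction at pKa {PKA_UNACTIVATED}: {unactivated:.3f}")
    print(f"thiolate fraction at pKa {PKA_ACTIVATED}: {activated:.3f}")
    print(f"enrichment: {activated / unactivated:.0f}-fold")
```

At the assumed pH of 7.0, about 2% of the thiol would be ionized at a pKa of 8.7, compared with about 86% at a pKa of 6.2, which illustrates why activation at the G site matters for catalysis.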


The plant GST family presents a conundrum for functional genomics. The genome and EST databases have allowed us to classify GSTs and study their evolution and sequence diversity, while crystallographic studies have provided powerful insights into their structural biology. The challenge remains, however: what are the functions of these proteins, where are they located and why are they so numerous and subject to such complex regulation? GSTs appear to have many different functions in plants in primary and secondary metabolism, stress tolerance and cell signaling. From complementation studies [14], it is also probable that quite dissimilar GSTs share similar functions. Addressing these issues is now the main challenge for GST functional genomics, and continued analysis of this protein superfamily will no doubt reveal many other examples of their functional diversification.