Gene organization and evolutionary history

P450 superfamily genes are subdivided and classified following recommendations of a nomenclature committee [1,2] on the basis of amino-acid identity, phylogenetic criteria and gene organization. The root symbol CYP is followed by a number for families (generally groups of proteins with more than 40% amino-acid sequence identity, of which there are over 200), a letter for subfamilies (greater than 55% identity) and a number for the gene; for example, CYP4U2. There are also designations for clades of CYP families (clans, which can be defined as groups of genes that clearly diverged from a single common ancestor; clans are named from the lowest family number in the clade) [3] and for specific alleles of a gene (in humans) [4,5]. The diversity of the cytochrome P450 superfamily arose by an extensive process of gene duplications and by probable, but less well documented, cases of gene amplifications, conversions, genome duplications, gene loss and lateral transfers. 'Fossil' evidence of these processes can be found by careful sequence alignments as well as from the presence of many P450 gene clusters in most organisms. These gene clusters can contain up to 15 P450 genes, the orientation and sequence similarity of which sometimes allows a reconstruction of the events leading to the formation of the cluster. For instance in maize, four clustered CYP71C genes are involved in a common biosynthetic pathway for the defense compound 2,4-dihydroxy-1,4-benzoxazin-3-one (DIBOA) [6], but in most cases there is no evidence for a functional link between members of a cluster of P450 genes.
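The naming rule described above can be illustrated with a short Python sketch. It is only an illustration: the function names parse_cyp_name and grouping_from_identity are hypothetical, and the 40% and 55% thresholds are the rule-of-thumb values quoted in the text, whereas actual assignments by the nomenclature committee also weigh phylogeny and gene organization.

```python
import re
from typing import NamedTuple, Optional

# Root symbol CYP, family number, optional subfamily letter(s), optional gene number.
CYP_NAME = re.compile(r"^CYP(\d+)([A-Z]+)?(\d+)?$")


class CypName(NamedTuple):
    family: int
    subfamily: Optional[str]
    gene: Optional[int]


def parse_cyp_name(name: str) -> CypName:
    """Split a CYP-style symbol such as 'CYP4U2' into its three parts."""
    match = CYP_NAME.match(name.strip().upper())
    if match is None:
        raise ValueError(f"not a CYP-style name: {name!r}")
    family, subfamily, gene = match.groups()
    return CypName(int(family), subfamily, int(gene) if gene else None)


def grouping_from_identity(percent_identity: float) -> str:
    """Apply the approximate identity thresholds quoted in the text."""
    if percent_identity > 55.0:
        return "same subfamily"
    if percent_identity > 40.0:
        return "same family"
    return "different family"


if __name__ == "__main__":
    print(parse_cyp_name("CYP4U2"))
    print(grouping_from_identity(47.5))
```

A family-level symbol such as CYP51 parses with the subfamily and gene fields left empty.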

The origin of the P450 superfamily lies in prokaryotes, before the advent of eukaryotes and before the accumulation of molecular oxygen in the atmosphere. Although Escherichia coli has no P450 gene, Mycobacterium tuberculosis has 20 [7], baker's yeast has three, and the fruit fly Drosophila melanogaster has 83 P450 genes [8] and seven pseudogenes. The fly situation seems to be the norm in 'higher' animals, with about 80 P450 genes in the nematode Caenorhabditis elegans and about 55 genes and 25 pseudogenes in humans. Higher plants have many more P450 genes, with about 286 in Arabidopsis thaliana [8]. The plant P450s are polyphyletic, with one large clade, the A-type P450s, representing plant-specific enzymes involved mostly in the biosynthesis of natural products, and several other P450s (the non-A-type) being phylogenetically related to P450s from other phyla, often with some conservation of function (fatty acid or sterol metabolism). Only one P450 gene family, CYP51, is conserved across phyla, from plants to fungi and animals, and a CYP51 ortholog has also been found in the bacterium M. tuberculosis, possibly as a result of lateral gene transfer from its host. CYP51 genes encode the 14α-demethylases of sterols that streamline the α face of sterols [9]. In mammals, CYP51 is also involved in the synthesis of meiosis-activating sterols in oocytes and testis. The complex catalytic function of CYP51 (sequential hydroxylations followed by carbon-carbon bond cleavage) is a highly specialized, and thus derived, trait that evolved early in the eukaryotic lineage. It was lost in insects and nematodes, which are sterol heterotrophs. Various cladograms illustrating the concept of clans or several specific aspects of P450 evolution can be found in [2,6,9,10,11,12].

The exon-intron organization of P450 genes reveals a remarkable diversity of gene structure, and few, if any, intron positions are conserved between divergent P450 families [9,10,11,13]. Thus, the evolutionary history of P450 genes is one of multiple intron gain and loss.

Characteristic structural features

P450s can be divided into four classes depending on how electrons from NAD(P)H are delivered to the catalytic site. Class I proteins require both an FAD-containing reductase and an iron-sulfur redoxin. Class II proteins require only an FAD/FMN-containing P450 reductase for transfer of electrons. Class III enzymes are self-sufficient and require no electron donor, whereas P450s from class IV receive electrons directly from NAD(P)H. This classification of the interactions with redox partners is unrelated to P450 evolutionary history.
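The four classes can be summarized as a small lookup table. The Python sketch below is just one way to hold this information; the names RedoxClass, REDOX_CLASSES and describe are hypothetical, and the entries simply restate the preceding paragraph.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class RedoxClass:
    name: str
    partners: Tuple[str, ...]   # proteins that relay electrons to the P450
    uses_nadph_or_nadh: bool    # False for the self-sufficient class III


# Entries restate the classification given in the text.
REDOX_CLASSES = {
    "I": RedoxClass("I", ("FAD-containing reductase", "iron-sulfur redoxin"), True),
    "II": RedoxClass("II", ("FAD/FMN-containing P450 reductase",), True),
    "III": RedoxClass("III", (), False),
    "IV": RedoxClass("IV", (), True),
}


def describe(class_name: str) -> str:
    """Return a one-line summary of how a class obtains its electrons."""
    entry = REDOX_CLASSES[class_name]
    if not entry.uses_nadph_or_nadh:
        return f"class {entry.name}: self-sufficient, no electron donor"
    if not entry.partners:
        return f"class {entry.name}: electrons directly from NAD(P)H"
    return f"class {entry.name}: NAD(P)H via " + " and ".join(entry.partners)


if __name__ == "__main__":
    for key in REDOX_CLASSES:
        print(describe(key))
```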

Sequence identity among P450 proteins is often extremely low and may be less than 20%, and there are only three absolutely conserved amino acids. The determination of an increasing number of P450 crystal structures, however, shows that this unusual variability does not preclude a high conservation of their general topography and structural fold [14]. Highest structural conservation is found in the core of the protein around the heme and reflects a common mechanism of electron and proton transfer and oxygen activation. The conserved core is formed of a four-helix (D, E, I and L) bundle, helices J and K, two sets of β sheets, and a coil called the 'meander'. These regions comprise (Figures 1,2a): first, the heme-binding loop, containing the most characteristic P450 consensus sequence (Phe-X-X-Gly-X-Arg-X-Cys-X-Gly), located on the proximal face of the heme just before the L helix, with the absolutely conserved cysteine that serves as fifth ligand to the heme iron; second, the absolutely conserved Glu-X-X-Arg motif in helix K, also on the proximal side of heme and probably needed to stabilize the core structure; and finally, the central part of the I helix, containing another consensus sequence considered a P450 signature (Ala/Gly-Gly-X-Asp/Glu-Thr-Thr/Ser), which corresponds to the proton transfer groove on the distal side of the heme.
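The three consensus sequences just described translate directly into one-letter regular expressions. The Python sketch below shows how a protein sequence could be scanned for them; the function name find_motifs and the toy sequence are hypothetical and are not taken from any real P450. Because only a few residues are strictly conserved, and because a pattern as short as Glu-X-X-Arg will often match by chance, hits of this kind are hints rather than proof that a protein is a P450.

```python
import re
from typing import Dict, List

# One-letter versions of the consensus sequences quoted in the text ('.' = any residue).
# Lookaheads are used so that overlapping occurrences are all reported.
MOTIFS = {
    "heme-binding loop": re.compile(r"(?=F..G.R.C.G)"),
    "K-helix Glu-X-X-Arg": re.compile(r"(?=E..R)"),
    "I-helix groove": re.compile(r"(?=[AG]G.[DE]T[TS])"),
}


def find_motifs(protein: str) -> Dict[str, List[int]]:
    """Return 1-based start positions of each motif in a protein sequence."""
    seq = protein.upper()
    return {
        name: [m.start() + 1 for m in pattern.finditer(seq)]
        for name, pattern in MOTIFS.items()
    }


if __name__ == "__main__":
    # Made-up toy sequence that contains each motif once; not a real protein.
    toy = "MAGGLDTTSWETLRLFSFGSGRRSCIGK"
    for name, positions in find_motifs(toy).items():
        print(name, positions)
```

In a real P450 the motifs appear in the order I helix, K helix, heme-binding loop along the sequence, so checking that order is a simple way to discard some chance matches.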

Figure 1

Primary structures of P450 proteins. (a) Typical features of an ER-bound P450 protein (class II enzyme). The functions of the different domains and regions indicated by colors are described in the text. (b) Variants of this canonical structure most commonly found: 1, soluble class I; 2, mitochondrial class I; 3, membrane-bound or plastidial class III. The three-dimensional folding of these structures can be viewed at [17,18]. A good (model) picture of membrane-bound P450 can be seen at [36].

Figure 2

Secondary and tertiary structures of P450 proteins. (a) Topology diagram showing the secondary structure and arrangement of the secondary structural elements of a typical P450 protein (CYP102) [14]. Blue boxes, α helices; groups of cream arrows outlined with dotted lines, β sheets; lines, coils and loops. The sizes of the elements are not in proportion to their length in the primary sequence. There are usually around four β sheets and 13 α helices defining one domain that is predominantly β sheets and one that is predominantly α helices. The first domain is often associated with substrate recognition and the access channel, the second with the catalytic center. Adapted from [14]. (b) A ribbon representation of the distal face of the folded CYP2C5 protein showing its putative association with the ER membrane (purple) [16]. Helices and sheets are labeled as in (a). Heme is in orange, the substrate in yellow. The α domain is on top left, the β domain more closely associated with the membrane at bottom right. Epitopes not accessible for antibody binding when the protein is associated with the ER are shown in red (numbers give their position in the primary sequence). The transmembrane amino-terminal segment, removed for crystallization, and an additional 11 residues that are disordered in the crystal structure, are not shown. Note the I helix above the heme, close to the substrate-binding site. The heme-binding loop is visible behind the heme protoporphyrin. The conserved Glu-X-X-Arg structure in the K helix is also at the back and so is not readily visible. The proximal (back) face of the protein is involved in redox partner recognition and electron transfer to the active site; protons flow into the active site from the distal face (front). The substrate access channel is usually assumed to be located in close contact with the membrane between the F-G loop, the A helix and β strands 1-1 and 1-2.
More pictures showing other aspects of the structure, including reductase and substrate-binding, can be viewed at [17,18]. Another picture (a model) of membrane-bound P450 including the transmembrane domain can be seen at [36]. Reproduced with permission from [37].

The most variable regions are associated with either amino-terminal anchoring or targeting of membrane-bound proteins, or substrate binding and recognition; the latter regions are located near the substrate-access channel and catalytic site and are often referred to as substrate-recognition sites or SRSs [15]. They are described as flexible, moving upon binding of substrate so as to favor the catalytic reaction. Other variations reflect differences in electron donors, reaction catalyzed or membrane localization (Figure 1b). Most eukaryotic P450s are associated with microsomal membranes, and very frequently have a cluster of prolines (Pro-Pro-X-Pro) that form a hinge, preceded by a cluster of basic residues (the halt-transfer signal) between the hydrophobic amino-terminal membrane anchoring segment and the globular part of the protein (Figure 1a). Additional membrane interaction seems to be mediated essentially by a region, located between the F and G helices, that shows increased hydrophobicity [16]. A signature for mitochondrial enzymes is, in addition to an amino-terminal cleavable signal peptide, the presence of two positive charges, usually arginines, at the beginning of the L helix. Such positive charges are also found in soluble bacterial P450s of class I, which receive their electrons from a ferredoxin. Strong variation from consensus in the core region, particularly an I helix lacking the characteristic threonine and surrounding residues, is characteristic of enzymes of class III catalyzing the rearrangement of hydroperoxides. These P450s require neither molecular oxygen nor an electron donor for catalysis, which explains their unusual deviation from the canonical primary structure.

No class III P450 has been crystallized to date. The crystal structures of model proteins belonging to the three other classes are, however, available [17,18], including the substrate, oxygen and CO-bound, and activated reaction-intermediate forms of some enzymes (for example class I: P450cam or CYP101 [19,20]; class II: P450BM3 or CYP102 [16,21,22]; class IV: P450nor or CYP55A1 [23]). Prototype secondary and three-dimensional structures are shown in Figure 2.

Localization and function

P450s in prokaryotes are soluble proteins. Class I P450s require both an FAD-containing NAD(P)H-reductase and an iron-sulfur redoxin as electron donors. Prokaryotic class II P450s require only an FAD/FMN-containing NADPH-P450 reductase, which is fused to the P450 protein. P450s often confer on prokaryotes the ability to catabolize compounds used as carbon sources or to detoxify xenobiotics. Other functions described for prokaryotic P450s include fatty acid metabolism and biosynthesis of antibiotics.

Eukaryotic class I enzymes are found associated with the inner membrane of mitochondria and catalyze several steps in the biosynthesis of steroid hormones and vitamin D3 in mammals. Mitochondrial P450s are also found in insects and nematodes, but so far none has been described in plants. Animal mitochondrial P450s appear to have evolved by mistargeting of an animal microsomal P450, and they are not phylogenetically related to the class I P450s of bacteria despite their analogous electron transport chain. Class II enzymes are the most common in eukaryotes. P450s and NADPH-P450 reductases are dissociated and independently anchored on the outer face of the endoplasmic reticulum (ER) by amino-terminal hydrophobic anchors. An additional carboxy-terminally ER-anchored electron donor, cytochrome b5 which conveys electrons from NAD(P)H, has been found to enhance the activity of some P450 enzymes. Functions of enzymes of class II are extremely diverse. In fungi, they include synthesis of membrane sterols and mycotoxins, detoxification of phytoalexins, and metabolism of lipid carbon sources. In animals, physiological functions include many aspects of the biosynthesis and catabolism of signalling molecules, steroid hormones, retinoic acid and oxylipins [2,24,25]. Class II P450s from plants are involved in biosynthesis or catabolism of all types of hormones, in the oxygenation of fatty acids for the synthesis of cutins, and in all of the pathways of secondary metabolism - in lignification and the synthesis of flower pigments and defense chemicals (which are also aromas, flavors, antioxidants, phyto-estrogens, anti-cancer drugs and other drugs) [12,26]. Class I and class II P450s from all organisms participate in the detoxification or sometimes the activation of xenobiotics. They have been shown to contribute to carcinogenesis, and are essential determinants of drug and pesticide metabolism, tolerance, selectivity and compatibility [24,25,27,28]. 
P450s that actively metabolize xenobiotics often have their expression induced by exogenous chemicals [25,28,29,30]. Their pharmacologic and agronomic impact is thus considerable.

P450s from class III are self-sufficient and do not require molecular oxygen or an external electron source. They catalyze the rearrangement or dehydration of alkylhydroperoxides or alkylperoxides initially generated by dioxygenases [31]. These enzymes are involved in the synthesis of signaling molecules such as prostaglandins in mammals and jasmonate in plants; they seem to have diverse subcellular localizations in plants, including plastidial. A P450 belonging to class IV, which receives its electrons directly from NADH, has also been described. This unique fungal P450 is soluble and reduces NO generated by denitrification to N2O [31]. The latter two classes might be considered as remains of the most ancestral forms of P450 involved in detoxification of harmful activated oxygen species.

There is no rule for the tissue distribution of P450 enzymes. As their functions are extremely diverse, they can be found in all types of tissues, with developmentally regulated patterns of expression, in all types of organisms. They were first described in mammalian liver where they are especially abundant and play an essential role in drug metabolism [32,33].

A list of selected mutant phenotypes that illustrate the various functions of P450 proteins is given in Table 1.

Table 1 Selected examples of P450 mutants characterized in various organisms

Enzyme mechanism

P450 enzymes have been compared to a blowtorch. They catalyze regiospecific and stereospecific oxidative attack of non-activated hydrocarbons at physiological temperatures. Such a reaction, uncatalyzed, would require extremely high temperature and would be nonspecific. The details of the mechanism by which P450s carry out all types of reactions, especially the most complex ones found in plants, are not yet understood. The best documented aspect is the oxygen activation that is common to most P450s, including soluble enzymes for which crystal structures of different forms have been obtained. The active center for catalysis is the iron-protoporphyrin IX (heme) with the thiolate of the conserved cysteine residue as fifth ligand. Resting P450 is in the ferric form and partially six-coordinated with a molecule of solvent (Figure 3).

Figure 3

Catalytic mechanism of P450 enzymes. P450s are usually mono-oxygenases, catalyzing the insertion of one of the atoms of molecular oxygen into a substrate, the second atom of oxygen being reduced to water. The most frequently catalyzed reaction is hydroxylation (O insertion) using the very reactive and electrophilic iron-oxo intermediate (species [C], bottom row). The hydroperoxo form of the enzyme (species [B]-) is also an electrophilic oxidant catalyzing OH+ insertion. Nucleophilic attack can be catalyzed by species [A]2- and [B]- ; reduction, isomerization or dehydration are catalyzed by the oxygen-free forms of the enzyme. This, together with the variety of the apoproteins and intrinsic reactivity of all their substrates explains the extraordinary diversity of reactions catalyzed by P450 enzymes.

The well characterized portion of the catalytic sequence involves four steps, which are indicated in Figure 3. The first step is substrate binding, with displacement of the sixth ligand solvent inducing a shift in the maximum of absorbance, spin state and redox potential of the heme protein system; the second is one-electron reduction of the complex to a ferrous state, driven by the increase in redox potential that results from the previous step; the third is binding of molecular oxygen to give a superoxide complex; and the fourth is a second reduction step leading to an 'activated oxygen species'. The exact nature of the very short-lived activated oxygen species that carries on substrate attack long remained uncertain, but the most recent data, from crystallography and mechanistic probes [20,34], strongly suggest that it is actually a mixture of two electrophilic oxidants ([B]- and [C] in Figure 3). Both iron-peroxo [B]- and iron-oxo [C] complexes are formed by protonation of the two-electrons-reduced dioxygen, a process that is allowed when a water channel forms in the groove of the I-helix upon binding of O2. The oxo (oxyferryl) species, resulting from the cleavage of the O-O bond - one atom of oxygen leaves with the two electrons and two protons as water - is apparently the most abundant. The iron-hydroperoxo species inserts the elements of OH+, producing protonated alcohols that can give cationic rearrangement products. The iron-oxo species inserts an oxygen atom. The result of P450 catalysis is not always insertion of oxygen, but can be a dealkylation, dehydration, dehydrogenation, isomerization, dimerization, carbon-carbon bond cleavage, and even a reduction [31]. The substrate specificity and type of reaction catalyzed are governed by the less conserved regions of the protein and are therefore not well understood.
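The four well characterized steps can be written as a tiny state machine, as in the Python sketch below. The state and event names (State, TRANSITIONS, run_cycle and the event strings) are hypothetical labels chosen for illustration. The sketch also enforces a strict order of events, which is a simplification: the text says only that substrate binding favors the first reduction, not that the reduction is impossible without it.

```python
from enum import Enum


class State(Enum):
    RESTING_FERRIC = "ferric heme, solvent as sixth ligand"
    SUBSTRATE_BOUND = "ferric heme, substrate bound, solvent displaced"
    FERROUS = "ferrous heme after the first one-electron reduction"
    OXY_COMPLEX = "dioxygen bound (superoxide complex)"
    ACTIVATED = "activated oxygen species after the second reduction"


# (current state, event) -> next state, following the four steps in the text.
TRANSITIONS = {
    (State.RESTING_FERRIC, "bind_substrate"): State.SUBSTRATE_BOUND,
    (State.SUBSTRATE_BOUND, "add_electron"): State.FERROUS,
    (State.FERROUS, "bind_O2"): State.OXY_COMPLEX,
    (State.OXY_COMPLEX, "add_electron"): State.ACTIVATED,
}

CANONICAL_ORDER = ["bind_substrate", "add_electron", "bind_O2", "add_electron"]


def run_cycle(events, start=State.RESTING_FERRIC):
    """Apply events in order and return the final state; reject out-of-order events."""
    state = start
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"event {event!r} not allowed in state {state.name}")
        state = TRANSITIONS[key]
    return state


if __name__ == "__main__":
    print(run_cycle(CANONICAL_ORDER).value)
```

The steps after the activated species (protonation, O-O bond cleavage and substrate attack) are deliberately left out, since the text describes them as less certain.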

Carbon monoxide can bind ferrous P450 instead of dioxygen, inducing a shift of the maximum of absorbance of the heme (called the Soret peak) to 450 nm [32,33] (this property is a characteristic of P450 enzymes). CO is bound with high affinity and prevents binding and activation of O2. The result is an inhibition of P450 activity. CO binding and inhibition can be reversed by light, with maximal efficiency at 450 nm. Other ligands, both substrates and inhibitors, also induce shifts of the Soret peak in P450 enzymes. Differential spectrophotometry is thus widely used to monitor binding of such ligands [35]. Substrates that displace the solvent molecule acting as sixth ligand in the resting P450 usually induce a shift to the blue (420 to 390 nm), which reflects a low- to high-spin transition of the iron. Inhibitors with an sp2 hybridized nitrogen (nitrogens in heterocycles such as azoles, pyridines, or pyrimidines) replace the sixth ligand for coordination with iron and induce a shift to the red (400 to 430 nm).
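The direction of a Soret shift is the piece of information used to interpret such binding spectra, and it can be captured in a few lines of Python. In the sketch below the function name classify_soret_shift and the 1 nm tolerance are arbitrary assumptions, and the interpretations are only the typical cases named in the text, not a rule that holds for every ligand.

```python
# Typical interpretations taken from the text; real spectra need expert judgment.
INTERPRETATION = {
    "blue shift": "substrate-like: sixth-ligand solvent displaced, low- to high-spin",
    "red shift": "inhibitor-like: nitrogen ligand coordinated to the heme iron",
    "no shift": "no detectable change in the Soret peak",
}


def classify_soret_shift(before_nm: float, after_nm: float,
                         tolerance_nm: float = 1.0) -> str:
    """Classify a change in Soret peak position by its direction."""
    delta = after_nm - before_nm
    if abs(delta) <= tolerance_nm:
        return "no shift"
    # Shorter wavelength after binding is a shift to the blue.
    return "blue shift" if delta < 0 else "red shift"


if __name__ == "__main__":
    for before, after in [(420.0, 390.0), (400.0, 430.0)]:
        kind = classify_soret_shift(before, after)
        print(before, "->", after, ":", kind, "-", INTERPRETATION[kind])
```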

Frontiers

Issues most studied

Medical, pharmacological and toxicological studies of P450 enzymes and genes for the prediction of drug metabolism and the prevention of adverse drug reactions remain a key field of investigation. For instance, there are many allelic variants of the human CYP2D6 gene, which encodes a liver enzyme that metabolizes a quarter of all known drugs. The CYP2D6 genotype will determine a person's response to important drugs, such as antipsychotics and antidepressants. Another very active field is the tailoring of P450 enzymes for specific functions, via site-directed mutagenesis or in vitro directed evolution. The development of new approaches to improve solubility and implement crystallization of membrane-bound enzymes probably opens a new era for the understanding of P450 mechanisms and substrate specificity, as well as engineering of such catalytic properties.

Major unresolved questions

A major unresolved issue in the field is the physiological function of many of the newly discovered P450 enzymes. Whereas this question can be practically addressed in microorganisms or other organisms amenable to homologous recombination, this is not the case in most eukaryotes such as humans, where the endogenous substrate(s) of many enzymes that have been extensively studied for xenobiotic metabolism are still unknown. Also, the molecular basis and impact of receptor-mediated transcriptional activation of many P450 genes remain a challenging area of research. An interesting point to clarify is the subcellular localization of some P450 enzymes, some of which might have more than one localization. Many P450-catalyzed reactions in plants generate compounds that might be toxic if released in the cytoplasm. There is increasing evidence of the channeling of such compounds inside multi-enzyme complexes. How do P450s interact with such complexes, and do they serve to anchor them on membranes? From a mechanistic point of view, simple hydroxylation reactions are now rather well understood. More complex reactions, for example multi-step reactions or those involving no overt oxygen insertion, are still a very open field of investigation.