Objective

The polyunsaturated fatty acids (PUFAs) arachidonic acid (AA), docosahexaenoic acid (DHA) and eicosapentaenoic acid (EPA) are oxidised by cytochrome P450 (CYP) enzymes to produce metabolically active products that play significant roles in inflammation pathways [1, 2]. Because no crystal structure is available for CYP2J2, the main such enzyme in the human cardiovasculature, the precise mechanism by which it metabolises PUFAs into specific stereo- and regio-epoxyisomers is not fully understood. Consequently, the effect of mutations in the protein sequence arising from non-synonymous single nucleotide polymorphisms found in the population cannot be predicted. This hinders our ability to link genomic information to dysregulation of inflammatory responses and, in turn, to make reliable prognoses of cardiovascular health. In this project, we aimed to understand the binding of PUFAs in the active site of CYP2J2 using computational methods and to leverage this information to investigate the residues essential for ligand positioning and metabolism. In previous work, our groups investigated the interaction of AA with human CYP2J2 and revealed Arg117 as a key player in the recognition of this substrate [3], although these simulations were relatively short (50 ns). Simulations from other studies have come to diverse conclusions about the role of individual residues in the active site [4,5,6]. Here, we investigated this further using much more extensive simulations of both wild type and mutant forms of the enzyme. These new simulations confirmed the importance of Arg117 and, in addition, suggested that Arg111 is necessary for epoxidation. They also pointed to a role for two more arginine residues in the active site, which allow some redundancy in substrate tethering and contribute to the flexibility of the catalytic capabilities of the system.
Expression trials in HEK293T cells to produce CYP2J2 and its mutants were unsuccessful, so the computationally derived hypotheses could not be validated within the lifetime of this project.

Table 1 Overview of data files/data sets

Data description

The data presented here comprise three parts: (i) the results of homology modeling of the human wild type CYP2J2 and the generation of models for a series of mutants [7]; (ii) molecular docking of three eicosanoid ligands (AA, DHA and EPA) to wild type CYP2J2 [7]; and (iii) a series of molecular dynamics simulations of the wild type and mutant enzyme with the three ligands [8,9,10,11,12,13,14,15,16,17,18,19,20]. Each part is briefly described below. More details are available in the Methods document in the main Zenodo repository (data set 1 [7]).

Homology model of CYP2J2

The homology model [7] is based on the UniProt [21] protein sequence with UID P51589. A model of the sequence with the N-terminal transmembrane domain (residues 1–43) trimmed was built using MODELLER version 9.14 [22], with the PDB structures 1SUO [23], 2P85 [24], 3EBS [25] and 1Z10 [26] as templates. A haem molecule was incorporated during model building using the HETATM records from PDB structure 1SUO.

Structural models of CYP2J2 mutants were produced using the homology model of the wild type enzyme as the starting point and changing residues 111, 117, 382 and 446 from arginine to alanine. The expectation was that mutating these residues to a non-charged amino acid would have a noticeable impact on the binding of fatty acid substrates.
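The substitutions above can be written down at the sequence level as a simple bookkeeping aid. The sketch below is illustrative only and is not the procedure used to build the deposited mutant models. It assumes that residue numbers refer to the full-length sequence while the modelled sequence starts at residue 44 (residues 1–43 trimmed). The helper names (`mutate_to_alanine`, `toy_sequence`) and the placeholder sequence are hypothetical; the real sequence would be taken from UniProt P51589.

```python
# Arg -> Ala substitutions named in the text, in full-length residue numbering.
MUTANTS = {
    "R111A": (111,),
    "R117A": (117,),
    "R111A/R117A": (111, 117),
    "R111A/R117A/R382A/R446A": (111, 117, 382, 446),
}

# Assumption: residues 1-43 are trimmed, so the first modelled residue is 44.
FIRST_MODELLED_RESIDUE = 44


def mutate_to_alanine(sequence, positions, first_residue=FIRST_MODELLED_RESIDUE):
    """Return a copy of `sequence` with the given arginines replaced by alanine.

    `positions` use full-length numbering; `first_residue` is the residue
    number of the first character in `sequence`.
    """
    residues = list(sequence)
    for position in positions:
        index = position - first_residue
        if not 0 <= index < len(residues):
            raise ValueError(f"residue {position} is outside the modelled sequence")
        if residues[index] != "R":
            raise ValueError(f"residue {position} is {residues[index]}, not arginine")
        residues[index] = "A"
    return "".join(residues)


def toy_sequence(arginines=(111, 117, 382, 446), last_residue=450):
    """Placeholder sequence (NOT CYP2J2): glycine everywhere, Arg at the given sites."""
    residues = ["G"] * (last_residue - FIRST_MODELLED_RESIDUE + 1)
    for position in arginines:
        residues[position - FIRST_MODELLED_RESIDUE] = "R"
    return "".join(residues)


if __name__ == "__main__":
    wild_type = toy_sequence()
    for name, positions in MUTANTS.items():
        mutant = mutate_to_alanine(wild_type, positions)
        print(name, "arginines left:", mutant.count("R"))
```

Checking that each target position really holds an arginine before substituting guards against off-by-one errors caused by the trimmed N-terminus.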

Docking of PUFAs to CYP2J2

The fatty acids arachidonic acid (AA), docosahexaenoic acid (DHA) and eicosapentaenoic acid (EPA) were investigated in this study. The structure of AA was obtained from the ZINC database version 12 [27]. Structures for DHA and EPA were derived using the Automated Topology Builder version 2.2 [28]. Docking of all ligands to the CYP2J2 models was carried out using AutoDock Vina version 1.1.2 [29]. Five independent docking runs were carried out for each ligand.

Molecular dynamics simulations

MD simulations were carried out using AMBER14 [30] as described in the Methods document (data set 1 [7]). The simulations included the standard minimization, heating, equilibration and production phases. Each of six docked wild type CYP2J2-AA complexes was simulated in four independent runs, each lasting 1 μs [8,9,10]. Simulations of the mutant enzymes started from the same six docked poses of AA, but each pose was simulated in three repeats, each lasting 500 ns. Two single mutants were investigated (Arg111Ala [13, 14], Arg117Ala [15, 16]), followed by a double mutant (Arg111Ala and Arg117Ala [17, 18]) and finally a quadruple mutant (Arg111Ala, Arg117Ala, Arg382Ala and Arg446Ala [19, 20]). Simulations of DHA [12] and EPA [11] were carried out starting from four docked poses, with each simulation repeated three times and lasting 300 ns.
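The simulation design described above can be tallied with a few lines of arithmetic. The sketch below only restates the numbers of poses, repeats and run lengths given in the text. The class and field names are hypothetical, and the totals assume that every planned run reached its full length.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimulationSet:
    label: str
    poses: int       # number of docked starting poses
    repeats: int     # independent runs per pose
    length_ns: int   # length of each run in nanoseconds

    @property
    def per_pose_ns(self):
        """Cumulative sampling from one starting pose."""
        return self.repeats * self.length_ns

    @property
    def total_ns(self):
        """Cumulative sampling over all poses and repeats."""
        return self.poses * self.per_pose_ns


# Values taken from the text; DHA and EPA poses come from docking to wild type.
SETS = [
    SimulationSet("WT + AA", poses=6, repeats=4, length_ns=1000),
    SimulationSet("R111A + AA", poses=6, repeats=3, length_ns=500),
    SimulationSet("R117A + AA", poses=6, repeats=3, length_ns=500),
    SimulationSet("R111A/R117A + AA", poses=6, repeats=3, length_ns=500),
    SimulationSet("R111A/R117A/R382A/R446A + AA", poses=6, repeats=3, length_ns=500),
    SimulationSet("WT + DHA", poses=4, repeats=3, length_ns=300),
    SimulationSet("WT + EPA", poses=4, repeats=3, length_ns=300),
]

if __name__ == "__main__":
    for s in SETS:
        print(f"{s.label}: {s.per_pose_ns} ns per pose, {s.total_ns} ns in total")
    print("all sets:", sum(s.total_ns for s in SETS), "ns")
```

Under these assumptions, the cumulative time per starting pose ranges from 900 ns (DHA and EPA) to 4 μs (wild type with AA).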

The simulations highlighted two residues in the active site (Arg111 and Arg117) that appear to play important roles in anchoring the carboxylate group of the substrate. The simulations also suggested that mutating either of these two residues enhances the role of the other as a hydrogen-bond donor, and that if both are mutated, two more arginine residues (Arg382 and Arg446) can partially make up for the missing charged groups in the active site.
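One simple way to quantify this kind of anchoring is the fraction of trajectory frames in which the substrate carboxylate lies close to an arginine side chain. The sketch below is a distance-only proxy for a hydrogen bond and is not the analysis used to produce the deposited data. The 3.5 Å cutoff, the function names and the toy coordinates are all assumptions made for illustration; real coordinates would be extracted from the trajectories.

```python
import math


def min_distance(group_a, group_b):
    """Smallest distance between any atom in group_a and any atom in group_b."""
    return min(math.dist(a, b) for a in group_a for b in group_b)


def contact_occupancy(frames, cutoff=3.5):
    """Fraction of frames in which the two atom groups are within `cutoff`.

    `frames` is a sequence of (carboxylate_oxygens, arginine_nitrogens) pairs,
    each a list of (x, y, z) coordinates in angstroms.
    The cutoff is an assumed, commonly used donor-acceptor distance.
    """
    frames = list(frames)
    if not frames:
        raise ValueError("no frames supplied")
    hits = sum(1 for oxygens, nitrogens in frames
               if min_distance(oxygens, nitrogens) <= cutoff)
    return hits / len(frames)


# Toy data (hypothetical coordinates): two carboxylate oxygens, one nitrogen.
OXYGENS = [(0.0, 0.0, 0.0), (2.2, 0.0, 0.0)]
TOY_FRAMES = [
    (OXYGENS, [(0.0, 3.0, 0.0)]),     # 3.0 A from the first oxygen: contact
    (OXYGENS, [(0.0, 6.0, 0.0)]),     # too far: no contact
    (OXYGENS, [(2.2, 2.9, 0.0)]),     # 2.9 A from the second oxygen: contact
    (OXYGENS, [(10.0, 10.0, 10.0)]),  # too far: no contact
]

if __name__ == "__main__":
    print("occupancy:", contact_occupancy(TOY_FRAMES))
```

Comparing such occupancies for Arg111, Arg117, Arg382 and Arg446 across wild type and mutant runs would be one way to examine the compensation described above. A stricter hydrogen-bond definition would also include a donor-hydrogen-acceptor angle criterion.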

Limitations

As with all computational studies, the data here should be interpreted with care. The starting CYP2J2 structure used in these simulations is a homology model, i.e. a structure built in silico using information from related proteins whose structures have been deposited in the PDB. Although we built the model using an alignment of multiple, carefully selected structures, it is possible that inaccuracies in the initial structure have affected the final simulations.

Our molecular dynamics simulations (ranging from 900 ns to 4 μs) are, to the best of our knowledge, the longest carried out on human CYP2J2. In addition, multiple repeats from the same starting docked pose of the ligand were used to assess how robust the observations are to differences introduced by the stochastic nature of the algorithm. Despite the length of these simulations and the evidence pointing to reasonable convergence in energy terms, the simulations appeared to sample different conformations of the system, even when different repeats started from the same pose. These MD runs thus point towards a very flexible system that is better described as an ensemble of possible states, whose probability is affected by the nature of the substrate or by mutations in the active site. Longer simulations would have been useful in revealing whether the system can converge to a few distinct conformations.

The haem molecule plays an important role in these simulations. Haem was modeled here in its penta-coordinated high-spin ferric form, but the alternative, highly reactive iron-oxygen species complex should be considered too. Finally, modeling a restricted part of this system around the haem molecule using a quantum mechanical (QM) model would be advisable. A joint QM/MM system could be set up that would offer a more realistic representation of how the intermediate complex between haem and substrate is formed.