Characterization of POLE c.1373A > T p.(Tyr458Phe), causing high cancer risk

The cancer syndrome polymerase proofreading-associated polyposis (PPAP) results from germline mutations in the POLE and POLD1 genes. Mutations in the exonuclease domain of these genes are associated with hyper- and ultra-mutated tumors with a predominance of base substitutions resulting from faulty proofreading during DNA replication. When a new variant is identified by gene testing of POLE and POLD1, it is important to verify whether the variant is associated with PPAP, to guide genetic counseling of mutation carriers. In 2015, we reported the likely pathogenic (class 4) germline POLE c.1373A > T p.(Tyr458Phe) variant, and we have now characterized this variant to verify that it is a class 5 pathogenic variant. For this purpose, we investigated (1) the mutator phenotype in tumors from two carriers, (2) the mutation frequency in cell-based mutagenesis assays, and (3) the structural consequences based on protein modeling. Whole-exome sequencing of two tumors identified an ultra-mutator phenotype with a predominance of base substitutions, the majority of which are C > T. A SupF mutagenesis assay revealed an increased mutation frequency in cells overexpressing the variant of interest as well as in isogenic cells encoding the variant. Moreover, a yeast-based exonuclease repair assay supported a defect in proofreading activity. Lastly, we present a homology model of human POLE to demonstrate the structural consequences leading to the pathogenic impact of the p.(Tyr458Phe) mutation. The three lines of evidence, taken together with updated co-segregation and previously published data, allow the germline variant POLE c.1373A > T p.(Tyr458Phe) to be reclassified as a class 5 variant, meaning that the variant is associated with PPAP. Supplementary Information: The online version contains supplementary material available at 10.1007/s00438-023-02000-w.


S1.1. Materials and Methods
The presented homology model of the human DNA POL ε catalytic subunit A (UniProt ID: Q07864) was based on the crystal structure of Saccharomyces cerevisiae DNA polymerase epsilon (PDB ID: 4PTF) (Jain, Rajashankar et al. 2014). The template was identified through a BLAST search with ExPDB as the target database. The homology model was built interactively using DeepView (Guex and Peitsch 1997). The structural realignment was done using the built-in alignment tool with default settings (scoring matrix: PAM200, open gap penalty: 6, extended gap penalty: 4, minimum score to be similar: 1), resulting in a sequence identity of 55%. Two positions in the alignment were manually altered (Glu112 and Leu1171) so that the respective gaps align to loop regions and not to secondary structure elements. The alignment was inspected visually to ensure that essential residues and regions are correctly aligned. Backbone modeling, loop building, side chain building and energy minimization were performed using the built-in tools in DeepView. Quality control was performed using the Ramachandran plot in DeepView (Fig. S1-1A) in addition to Verify3D (Lüthy, Bowie et al. 1992), ERRAT (Colovos and Yeates 1993), VADAR (Willard, Ranjan et al. 2003) and ProCheck (Laskowski, MacArthur et al. 1993) (Fig. S1-1B). These calculations prompted some manual adjustments of side chain orientations to remove non-bonded contacts and improve quality. For example, the overall quality factor calculated by ERRAT increased from 76.50% to 79.46% (Fig. S1-1C).
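The sequence identity quoted above is a property of the target-template alignment. As a minimal sketch of how such a percentage can be computed, the Python function below counts identical residues over the columns where both sequences have a residue. The choice of denominator (ungapped columns only) is an assumption, since the text does not state how DeepView normalizes the value, and the two short sequences are hypothetical toy data, not the POLE or 4PTF sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither sequence has a gap.

    Assumption: gaps are written as '-' and gap columns are excluded from
    the denominator. Other tools may normalize differently.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Hypothetical toy alignment (not real POLE or template sequence).
target = "MSLR-GDYKAL"
template = "MALRSGD-KAL"
print(round(percent_identity(target, template), 1))  # 88.9 (8 of 9 columns)
```

Reporting the denominator alongside the percentage makes values from different alignment tools easier to compare.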
For each of the variants included in the supF mutagenesis assay and the yeast-based exonuclease repair assay, namely POLE p.Tyr458Phe, p.Leu424Val (Palles, Cazier et al. 2013), and p.Asn363Lys (Rohlin, Zagoras et al. 2014), the mutations, contacts and hydrophobic regions were analyzed using several functions of PyMol (Schrodinger 2015). Two multiple sequence alignments (MSA) were computed using Clustal Omega (McWilliam, Li et al. 2013). The first MSA included the catalytic subunits of the B-family DNA polymerases ε (Q07864), δ (P28340) and α (P09884) in Homo sapiens (see Fig. S1-2). The second MSA included POLE proteins in other organisms: H. sapiens (Q07864), Mus musculus (Q9WVF7), S. cerevisiae (P21951), Schizosaccharomyces pombe (P87154), Arabidopsis thaliana (F4HW04) and Dictyostelium discoideum (Q54RD4) (see Fig. S1-3).

Fig. S1-1 [...] (B) The table lists the main quality scores calculated by the different programs. In general, all programs use a database of known protein structures to calculate their expected values and parameters. In this table, the score of the template is compared to that of the predicted target structure. The 3D-1D score assigns a structural class to each residue based on its location and environment; the number of residues with a satisfactory value is presented. ERRAT bases the quality factor on non-bonded atom-atom interactions; the number of residues with an accepted score is presented. The 3D profile by VADAR is calculated using the same parameters as Verify3D. The average score per residue is presented, which should be above 4 for the structure to be of satisfactory quality. Lastly, the number of residues within each region of the Ramachandran plot, calculated by ProCheck, is listed. (C) Snapshot of the exonuclease domain in the ERRAT plot after manual improvements. The residue numbers are on the x-axis. Residues colored grey are rejected at the 95% confidence level, and residues colored black are rejected at the 99% confidence level.
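Conservation of a residue across the aligned polymerases can be read from the corresponding MSA column. The following Python sketch shows one way to do this: it maps a residue number in a reference sequence to its alignment column and reports the most common residue in that column with its frequency. The function names and the three short aligned sequences are hypothetical illustrations, not the Clustal Omega output used in this study.

```python
from collections import Counter
from typing import Dict, Tuple


def alignment_column(aligned_seq: str, residue_number: int, first_residue: int = 1) -> int:
    """Return the 0-based alignment column holding the given residue number."""
    current = first_residue - 1
    for index, char in enumerate(aligned_seq):
        if char != "-":  # gaps do not advance the residue count
            current += 1
            if current == residue_number:
                return index
    raise ValueError("residue number is outside the aligned sequence")


def column_conservation(msa: Dict[str, str], column: int) -> Tuple[str, float]:
    """Return the most common character in a column and its frequency."""
    residues = [seq[column] for seq in msa.values()]
    residue, count = Counter(residues).most_common(1)[0]
    return residue, count / len(residues)


# Hypothetical toy MSA (not real POLE sequences).
toy_msa = {
    "species_a": "GD-YNSL",
    "species_b": "GD-YNSL",
    "species_c": "GDAYDSL",
}
col = alignment_column(toy_msa["species_a"], residue_number=3)
print(col, column_conservation(toy_msa, col))  # 3 ('Y', 1.0)
```

With a real alignment, the reference entry would be the human sequence and `first_residue` would be set to the number of its first aligned residue.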

S1.2.2. Human homology model of POLE
The homology model includes residues 25 to 1167 (Fig. S1-2), with a small missing region (residues 195 to 216). The gaps in the alignment resulted in a lower quality of the predicted structure for the N-terminal domain, while the high conservation of the exonuclease domain resulted in a high quality of this domain in the predicted structure.

[...] shown as cyan sticks. The DNA, 3 dNTP (red), 1 Na+ ion (purple ball) and 3 Ca2+ ions (grey) are included from the template structure (PDB ID: 4PTF).

S1.2.3. Functional effects in known pathogenic POLE variants are confirmed by structural analysis based on the human homology protein model
We modelled how the two confirmed pathogenic germline POLE mutations (Leu424Val and Asn363Lys) affect the structure. The highly conserved Leu424 is in an α-helix in the Exo IV motif. Leu424 is not directly involved in the active site, but is indirectly involved by maintaining the stability of the hydrophobic core (Fig. S1-3A) (Palles, Cazier et al. 2013). Mutation of residue 424 to valine maintains the polarity, but the decreased size reduces the interaction with the hydrophobic core and may also cause steric clashes with the neighboring amino acid, Tyr362, depending on the valine rotamer (Fig. S1-3B and C). Tyr362 is highly conserved, and changes to its side chain conformation may destabilize the loop region and the two neighboring α-helices, similar to what is seen for the Asn363Lys mutation below. In sum, the Leu424Val mutation likely results in loss of stabilizing interactions with the exonuclease domain active site α-helix and is thereby predicted to have a functional impact.
The highly conserved Asn363 is in a loop region upstream of one α-helix (residues 364 to 379, labelled α-helix 1 in Fig. S1-4) and close to a second α-helix (residues 407 to 415, labelled α-helix 2 in Fig. S1-4) in the Exo II motif. Asn363 is not directly involved in the active site, and neither are the residues in the two α-helices, but it contributes indirectly by stabilizing the opening to the active site (Fig. S1-4A and Fig. S1-5). Mutation to Lys introduces a positive charge and increases the size of the side chain. The positive charge will disrupt the strong helix dipole of α-helix 1, as residue 363 is at the carboxyl-terminal end. α-helix 2 also has a strong helix dipole, with its amino-terminal end facing residue 363. The positive charge introduced at residue 363 will repel this positively charged end. Together, these effects will destabilize both α-helices.
The increased size will create steric clashes with α-helix 2 and its own loop region (Leu408 and Tyr362) and restrict access to the active site (Fig. S1-4B). Restricted access to the exonuclease domain active site, resulting from the substitution of Asn363 by Lys, is predicted to have a pathogenic effect.

Fig. S1-4 Structural changes induced by the POLE Asn363Lys mutation. (A) Asn363 is shown as cyan sticks in part of the exonuclease domain (magenta). The active site residues Asp462 and Glu277 are shown as blue sticks, and the Ca2+ ion is shown in grey. In α-helix 1 and 2, the positively (yellow) and negatively (red) charged residues are marked. (B) Mutation to Lys (white stick) introduces a positive charge, destabilizing the helix dipole in α-helix 1 and 2. The mutation causes steric clashes (red disks) with residues Leu408 and Tyr362 (green sticks) and restricts access to the active site.

[...] Residues Tyr458, Asn363 and Leu424 are colored cyan. The surface of the active site is shown in yellow, demonstrating that all the mutated residues included in the present study are in the active site.

S1.3. Conclusion
We present here a tertiary structure prediction of human POLE. Structural analyses of two POLE variants used as positive controls (Leu424Val and Asn363Lys) indicated structural consequences resulting in pathogenic impact. These results are supported by a recent study of protein stability based on the X-ray crystal structure of S. cerevisiae DNA polymerase POL2 (PDB ID: 4PTF) and double-stranded DNA from the X-ray crystal structure of P. abyssi B family DNA polymerase (PDB ID: 4FLU). Specifically, that study reported a significant increase in Gibbs free energy, sufficient to destabilize the protein structure, when Leu424 is substituted by Val (Hamzaoui, Alarcon et al. 2020). Similarly, we report a loss of stabilizing interactions. Furthermore, Hamzaoui et al. (2020) demonstrated destabilization of DNA binding caused by altered DNA positioning in the exonuclease active site when Asn363 is substituted by Lys. This corresponds well to our findings indicating that access to the exonuclease domain active site is restricted by the Asn363Lys mutation.

[...] HEK293T cells were transfected with pEF6-POLE-EYFP. Images were taken 13, 16, and 19 hours after transfection. POLE-EYFP localization to the cytoplasm, relative to nuclear localization, increases over time, as indicated by arrows.

Table S1. Oligonucleotide sequences used for gene editing of POLE c.1373A>T by CRISPR/Cas9. All sequences are written 5'->3'. The single-stranded oligonucleotide used as a repair template includes two 90-nt homology arms (lower case letters), silent mutations (in red), and the mutation of interest (red, highlighted).
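The link between the coding-DNA notation c.1373A>T and the protein notation p.(Tyr458Phe) follows from simple codon arithmetic, sketched below in Python. Position 1373 falls at the second base of codon 458, and an A>T change at the second base of either tyrosine codon yields a phenylalanine codon. Which of the two tyrosine codons POLE uses at this position is not stated here, so both are shown; the function names are illustrative.

```python
from typing import Tuple

# Only the codons needed for this example (standard genetic code).
CODON_TABLE = {"TAT": "Tyr", "TAC": "Tyr", "TTT": "Phe", "TTC": "Phe"}


def codon_position(cdna_pos: int) -> Tuple[int, int]:
    """Map a 1-based coding-DNA position to (codon number, position in codon)."""
    codon = (cdna_pos - 1) // 3 + 1
    offset = (cdna_pos - 1) % 3 + 1
    return codon, offset


def substitute(codon_seq: str, offset: int, ref: str, alt: str) -> str:
    """Replace the base at the 1-based offset, checking the reference base."""
    if codon_seq[offset - 1] != ref:
        raise ValueError("reference base does not match the codon")
    return codon_seq[: offset - 1] + alt + codon_seq[offset:]


codon, offset = codon_position(1373)
print(codon, offset)  # 458 2

for ref_codon in ("TAT", "TAC"):  # both tyrosine codons
    alt_codon = substitute(ref_codon, offset, "A", "T")
    print(f"codon {codon}: {ref_codon} ({CODON_TABLE[ref_codon]}) "
          f"-> {alt_codon} ({CODON_TABLE[alt_codon]})")
# codon 458: TAT (Tyr) -> TTT (Phe)
# codon 458: TAC (Tyr) -> TTC (Phe)
```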