Introduction

Apolipoprotein B messenger RNA-editing enzyme, catalytic polypeptide-like (APOBEC3) proteins are single-stranded DNA (ssDNA) deoxycytidine deaminases that are among the fastest evolving proteins in the human genome1. APOBEC3s catalyse a cytidine (C) to uridine (U) zinc-dependent deamination reaction2,3,4,5. The seven APOBEC3 enzymes are clustered on chromosome 22 (ref. 6). Although each APOBEC3 has a single catalytic active site, the human genome includes three single-domain (APOBEC3A, C and H) and four double-domain (APOBEC3B, D, F and G) enzymes. The double-domain enzymes consist of a catalytically active C-terminal domain (CTD) and an inactive pseudo-catalytic N-terminal domain (NTD) that can bind but not edit nucleic acids. Four of the seven APOBEC3 enzymes (APOBEC3D, APOBEC3F, APOBEC3G and APOBEC3H) have been implicated as HIV-1 host restriction factors7,8,9,10,11,12,13. The APOBEC3 enzymes act on ssDNA to introduce C-to-U modifications that create G-to-A point mutations on the paired strand as the U is read as T during replication. Such mutations in ssDNA can lead to double-strand breaks that may result in the genomic DNA damage that has been observed in cancer14,15,16,17,18,19,20.
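The relationship between a C-to-U edit on one strand and the G-to-A change on the paired strand can be illustrated with a minimal sketch. The function names are hypothetical and the model is deliberately simplified (one round of copying, U templating A exactly as T would); the 15-mer is the substrate sequence quoted in the Results.

```python
# Minimal sketch: a C-to-U edit on one strand appears as G-to-A on the copy.
# Assumption: a polymerase reads U like T and inserts A opposite it.
PAIRING = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}


def deaminate(ssdna: str, position: int) -> str:
    """Return the strand with the C at `position` (0-based) converted to U."""
    if ssdna[position] != "C":
        raise ValueError("deamination target must be a cytosine")
    return ssdna[:position] + "U" + ssdna[position + 1:]


def copy_strand(template: str) -> str:
    """Synthesize the complementary strand, written 5'->3'."""
    return "".join(PAIRING[base] for base in reversed(template))


edited_strand = "TTTTTTTCTTTTTTT"                      # substrate, 5'->3'
paired_before = copy_strand(edited_strand)             # contains one G
paired_after = copy_strand(deaminate(edited_strand, 7))  # G replaced by A
```

Comparing `paired_before` with `paired_after` shows a single difference, a G replaced by an A at the position opposite the edited cytosine.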

In the last decade, our laboratories21,22,23,24,25,26,27 along with others28,29,30,31,32,33,34,35,36,37,38,39,40,41,42 have solved crystal and nuclear magnetic resonance (NMR) structures of single domains of human APOBEC3s (Supplementary Fig. 1). These proteins share the same overall fold43 and deaminate cytosines in ssDNA, but vary in their substrate specificity, processivity, catalytic rate and ability to restrict HIV-1. All APOBEC3 domains contain a HAEx28Cx2-4C zinc-binding motif. The carboxylate group of the catalytic glutamic acid stabilizes the transition state and proton transfer during catalysis, in which a water coordinated by the catalytic zinc is the sole source of protons for the amino group and N3 atom of cytosine2,44,45. The specificity of different APOBECs has been elucidated by the determination of preferred mutagenic hotspot sequences, 5′-CC/TC-3′ for APOBEC3A (studied here)46, 5′-TC-3′ for APOBEC3F and 5′-CC-3′ for APOBEC3G10,47,48. APOBEC3G deaminates hotspots closer to the 5′-end of ssDNA more efficiently than those closer to the 3′-end28,30,32,49, but the underlying mechanism for this preference is not known. Several alternative ssDNA-binding models for APOBEC3G-CTD and APOBEC3A have been proposed21,29,35,36. Most recently, the crystal structure of the inactive pseudo-catalytic rhesus macaque APOBEC3G-NTD (rA3G-NTD) (Supplementary Fig. 1b) in complex with poly-dT ssDNA has been reported42. However, only one complete deoxythymidine (dT) was resolved in this structure, bound in a shallow cleft far from the pseudo-catalytic zinc-binding motif. This complex did not reveal how substrate (dC) or product (dU) may be accommodated for the deamination reaction. The details of the ssDNA-binding and -editing mechanisms, and the molecular basis underlying the substrate nucleotide sequence specificities of APOBEC3 enzymes, still remain elusive.
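As an illustration of how the HAEx28Cx2-4C motif and the dinucleotide hotspots can be read as sequence patterns, the following sketch translates them into simple string searches. The helper names and the toy protein sequence are hypothetical (the toy sequence is not a real APOBEC3 sequence); only the motif and hotspot definitions come from the text.

```python
import re

# HAEx28Cx2-4C: H-A-E, any 28 residues, C, any 2-4 residues, C.
ZINC_MOTIF = re.compile(r"HAE.{28}C.{2,4}C")


def find_zinc_motif(protein_seq: str):
    """Return the 0-based start of the first motif match, or None."""
    match = ZINC_MOTIF.search(protein_seq)
    return None if match is None else match.start()


def hotspot_targets(ssdna: str, motif: str) -> list:
    """Return 0-based positions of the target C (last base of `motif`),
    including overlapping occurrences."""
    targets = []
    start = ssdna.find(motif)
    while start != -1:
        targets.append(start + len(motif) - 1)
        start = ssdna.find(motif, start + 1)
    return targets


# Toy sequence built only to satisfy the pattern (not a real protein).
toy_protein = "MK" + "HAE" + "G" * 28 + "C" + "PP" + "C" + "K"
motif_start = find_zinc_motif(toy_protein)
tc_targets = hotspot_targets("TTTTTTTCTTTTTTT", "TC")
cc_targets = hotspot_targets("ACCCA", "CC")
```

A pattern match of this kind only flags candidate sites; it says nothing about whether a given domain is catalytically active or how efficiently a hotspot is deaminated.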

APOBEC3A (A3A) is a single-domain enzyme with the highest catalytic activity among the human APOBEC3 proteins50. While the DNA-editing activity inhibiting the replication of retroelements is beneficial for genome stability, increased expression or defective regulation of A3A could lead to mutagenesis of human genome and contribute to carcinogenesis51. The structure of A3A was initially determined by NMR35 and some preference for DNA over RNA was suggested by chemical shift perturbation data36. However, mutations of residues predicted to be involved in DNA targeting had variable effects on deamination activity, and the detailed mechanism by which A3A binds DNA substrate is still elusive35,36.

In this study, we determined the crystal structure of an ssDNA:deaminase complex, that is, a polynucleotide substrate bound at the active site of a catalytic-domain APOBEC3 protein. Previously, we solved the crystal structure of the unliganded, inactivated A3A (ref. 26) and determined a potent binding affinity for substrate ssDNA of ∼60 nM, whereas the product exhibited an order of magnitude lower affinity. Here the crystal structure, at 2.2 Å resolution, of this variant of A3A in complex with a substrate DNA oligonucleotide containing a single 5′-TC-3′ deamination target sequence in a polyT background is presented. The central nucleotides comprising the 5′-TCT-3′ motif are well ordered and bound at the active site, revealing the intermolecular interactions defining specificity for the bases at each of these three positions. The target deoxycytidine (dC0) is bound in a reaction-competent coordination at the active site. This A3A–ssDNA structure elucidates the molecular basis of nucleotide preferences in the substrate motif and provides key insights into the overall molecular mechanisms of DNA editing by cytidine deaminases.

Results

A3A–ssDNA co-crystal structure

A3A (E72A/C171A)26 was used for co-crystallization with ssDNA. E72A inactivates the enzyme permitting the formation of stable complexes and C171A increases solubility. The crystal structure of A3A (E72A/C171A) in complex with ssDNA was determined by molecular replacement at 2.2 Å resolution (Fig. 1a–c; Supplementary Fig. 2). A 15-mer DNA oligonucleotide that binds A3A with ∼60 nM affinity26 with a target deoxycytidine (5′-TTTTTTTCTTTTTTT-3′) was co-crystallized with A3A. The final refinement of the structure resulted in R-factor/R-free of 0.177/0.225, respectively (Table 1).

Figure 1: Crystal structure of A3A in complex with substrate DNA.

(a) A3A structure with a 2FoFc electron density map contoured at 1σ. The protein is presented as a green-coloured ribbon diagram and the bound DNA is in stick representation (carbons and phosphates, orange; nitrogens, blue; oxygens, red). A zinc ion at the active centre is depicted as a magenta-coloured sphere. The side chains of zinc-coordinating residues H70, C101 and C106 are shown as sticks (carbons, green; nitrogens, blue; oxygen, red; sulfurs, yellow). DNA binding at the active site of A3A is presented in (b) ribbon and (c) surface representation. (d) Conformational changes of residues R28, H29 and Y132 upon DNA binding are indicated by arrows, with side chains in stick representation (white and green-coloured carbon for the apo (PDB code 4XXO)26 and DNA-bound forms, respectively). Surface electrostatic potentials of (e) apo and (f) DNA-bound A3A are coloured red to blue for negative and positive charges, respectively, using a scale of −5 to +5 kT e−1.

Table 1 Data collection and refinement statistics (molecular replacement).

There was a single A3A–ssDNA complex in the asymmetric unit, and crystal contacts with symmetry-related complexes did not correspond to the zinc-coordinated dimer interface we observed for the apo A3A crystal structure26. The apo A3A structure included an excess of zinc (50 μM ZnCl) in the crystallization condition, while the A3A–ssDNA complex lacked added zinc, which may have destabilized the dimer within this crystal form. The cooperativity upon DNA binding that we observed in solution and interrogated with site-directed mutagenesis26 suggests that A3A is capable of binding ssDNA in the dimeric form, at least transiently. Nevertheless, cooperativity does not seem to be essential, as the monomeric form of A3A, carrying the H56A mutation, binds substrate DNA with similar affinity26. Most likely both monomer and dimer forms of A3A play a role in recognizing substrates in solution.

The target deoxycytidine (dC0) and flanking deoxythymidines (dT−1 and dT1), as well as one additional deoxyribose at the 5′-end and one phosphate at the 3′-end, were well ordered in the electron density (5′-sugar-dT−1-dC0-dT1-phosphate-3′; Fig. 1a). Of the nearly 1,280 Å2 of surface area on the resolved DNA, ∼620 Å2 is buried in the interface with A3A. The central cytidine (dC0) and the preceding thymidine (dT−1) are accommodated in a deep groove formed by Loops 1, 3, 5 and 7 of A3A (Fig. 1b; Supplementary Fig. 1a). The bound DNA adopts an irregular conformation to encircle the side chain of H29 (Fig. 1c). Compared to apo A3A, there are conformational changes in the rotamers of the side chains of R28 and H29 in Loop 1, and Y132 in Loop 7, accompanied by more subtle reorganization of N57–A72 in Loop 3 (Fig. 1d; Supplementary Figs 1 and 3). The rest of the enzyme including the active site remains essentially unchanged. The groove significantly differs from any of the previously suggested models for how ssDNA binds to A3s (refs 21, 29, 35, 36) including the recent structure of the pseudo-catalytic A3G-NTD in complex with poly-dT ssDNA42. These conformational changes allow the groove to sequester the ssDNA by forming a more complementary molecular surface, both in terms of van der Waals packing and the electrostatic (electropositive) nature of the groove (Fig. 1e,f).

Recognition of the targeted cytidine

The deoxycytidine (dC0), which is the target of the deamination reaction, is well coordinated and buried within the active site of A3A. The cytidine ring is located directly over the hydroxyl group of the T31 side chain, which likely hydrogen bonds to the π-orbital cloud of the base ring and simultaneously coordinates the O4 atom of the deoxyribose (Fig. 2a). Residue Y130 contributes to the dC0 positioning by forming a T-shaped π–π interaction with the pyrimidine ring. The hydroxyl group of Y130 further forms a hydrogen bond with the 5′-phosphate of dC0 (Fig. 2b). The H70 side chain is positioned over the N1 atom of dC0, potentially forming a π–π stacking interaction (Fig. 2a). The backbone NH of A71 hydrogen bonds to O2 of dC0. In addition, the carbonyl oxygen atoms of W98 and S99 form a bifurcated hydrogen bond to NH2 of the cytosine, which appears to both support the dC0 positioning and dictate the specificity for cytosine over thymine.

Figure 2: A3A–ssDNA atomic interactions.

Stereo-view of the interactions between A3A and (a) the target nucleotide base (dC0), (b) the DNA backbone flanking dC0 and (c) the nucleotide at the −1 position (dT−1). (d) Interactions between the H29 side chain and the substrate DNA. Side chains of A3A residues (carbons green) and the DNA (carbons and phosphates orange) are in stick representation, with other atoms coloured as in Fig. 1b. A zinc ion (Zn) at the active centre, the zinc-liganded chlorine (Cl) and water molecule (W) are indicated by spheres coloured magenta, green and red, respectively. Estimated hydrogen bonds and π–orbital interactions are depicted by dashed lines coloured orange and black, respectively.

As expected for a catalytic A3 domain, electron density that fits a zinc ion was observed coordinated by H70, C101 and C106, as well as additional density that fits a Cl ion, with both assignments confirmed by anomalous difference calculations. To prevent catalysis, our A3A construct was inactivated by an E72A mutation, which left the geometry of the active site intact (Fig. 2a). Instead of the E72 side chain, we observe electron density that fits a water molecule. Molecular modelling of E72 into this space shows the side chain would be positioned just proximal to the deamination target, the C4-NH2 moiety of the cytosine (Fig. 3), and poised for the deamination reaction2. After catalysis and subsequent release of NH3, this coordination, along with the interactions with W98 and S99, would be unfavourable for the product uridine. Overall, multiple interactions of the substrate cytosine with A3A active site residues ensure the specific recognition and geometry required for the deamination reaction and product release.

Figure 3: Structural model of the A3A catalytically active site.

The target nucleotide base (dC0) bound at the A3A active site, where the catalytic E72 side chain was modelled in place of the alanine at this position in the crystal structure. Zinc, the coordinated water (W), carbonyl oxygen of dC0 and carboxyl oxygen of the E72 side chain are connected by dashed lines in magenta.

Specificity for pyrimidines at −1 position

The deoxythymidine at the 5′-side of the target (the −1 position; dT−1) has extensive van der Waals contacts with three residues from Loop 7 (Y130, D131 and Y132) and W98 in Loop 5 (Fig. 2c). The Watson–Crick edge of the thymine base faces these Loop 7 residues, and makes three hydrogen bonds: the O2 atom with the Y132 backbone amide, N3 with the D131 side chain carboxylate and O4 with a water molecule. In addition, the D131 side chain has a salt bridge to the R189 side chain in helix 6, which stabilizes the overall hydrogen bonding configuration of Loop 7 to the thymine base. This coordination appears critical as residue 189 is conserved as a basic residue (Arg/Lys) only in catalytically active A3 domains (Supplementary Fig. 1; Supplementary Table 1). At the −1 position, deoxycytidine could form similar, but slightly rearranged, interactions as the N3 atom lacks the proton to hydrogen bond with D131. Indeed, although A3A has dual specificity for 5′-TC-3′ and 5′-CC-3′ (ref. 40), there is a preference for thymidine at the −1 position. However, Loop 7 of A3A, in particular residues Y130 and D131, would likely preclude a larger purine base from fitting in this position, thus defining the T/C specificity of A3A.

The conserved N57 is central to the active site geometry

N57 of A3A is completely conserved among the catalytically active APOBEC protein domains, while inactive pseudo-catalytic A3 domains have a conserved glycine (Supplementary Fig. 1; Supplementary Table 1), and is widely conserved among other cytidine/cytosine deaminases from Escherichia coli through Homo sapiens52. The structure explains this strong conservation, as N57 of A3A is central in recognizing ssDNA through three key distinct interactions. First, the side chain of N57 determines the 5′–3′ directionality of ssDNA binding by forming a hydrogen bond to the O3′ atom of dC0, which helps stabilize the geometry of the DNA backbone and the sugar in a C2′-endo conformation (Fig. 2b; Supplementary Fig. 4a) and induces a backbone deformation due to steric hindrance with O5′ of the target dC0. Second, the N57 side chain forms a hydrogen bond with the backbone NH of T31, positioning the T31 side chain to hydrogen bond to the π-orbital cloud of the dC0 base ring, thus ensuring the geometry of the target nucleotide within the active site (Fig. 2a; Supplementary Fig. 4a). Finally, the N57 side chain packs against both the deoxyribose ring of dC0, stabilizing the orientation of the sugar plane, and H70, which coordinates zinc. Although RNA deaminase activity has been reported for A3A53,54, if the sugar were a ribose, a steric clash between the 2′-OH and H70 would occur, therefore requiring a conformational rearrangement for RNA modification. Thus, these three pivotal interactions of N57 organize the enzyme–substrate complex to be poised for catalytic turnover.

The three central interactions mediated by N57 are strictly conserved in the active site geometry of other cytidine deaminases55,56,57, where the asparagine side chain (1) hydrogen bonds to the substrate backbone, (2) packs to maintain the sugar orientation and (3) packs against the zinc-coordinating residue side chain (Supplementary Fig. 4b–d). The RNA cytidine deaminases replace the zinc-coordinating histidine with a relatively small amino acid, cysteine, which permits a ribose ring to fit (Supplementary Fig. 4c,d). This structure explains why, although N57 is not located directly at the active site, even conservative N57Q or N57D mutations severely disrupt deaminase activity29,52,58. Thus, our A3A–ssDNA structure reveals the conservation of N57 to be critical for proper orientation of the substrate within the active sites of cytidine deaminases.

H29 coordinates the ssDNA binding in the active site

H29 is the other lynchpin in ssDNA binding to A3A. H29 of A3A corresponds to H216 in the catalytic domain of A3G (A3G-CTD), which when mutated to alanine abolishes activity21. Maximal catalytic activity occurs at pH 5.5 for both A3G-CTD59 and A3A51, implying that the histidine is protonated. Interestingly, this His is not completely conserved in other A3s, where this position is sometimes an arginine or asparagine. The H216R mutation in A3G and H29R in A3A resulted in reduced but still significant catalytic activity41,59. In the apo A3A crystal structure26, H29 is involved in crystal contacts and rotated away from the active site (Fig. 1d). In the NMR structure of A3A, the H29 side chain is solvent exposed and the rotamer is not defined in solution (PDB code 2M65)35. Thus, upon ssDNA binding, the side chain of H29 selects a rotamer to interact extensively with the substrate, latching the active site to permit catalysis. Once catalysis occurs, H29 needs to rotate out of this position to release the deaminated product. H29 forms hydrogen bonds to the backbone phosphates of dT−1, dC0 and dT1, and the deoxyribose of dT1 (Figs 1d–f and 2d). The side chain of H29 is crucial in dT1 recognition, with the imidazole ring positioned to form π–π interactions with the pyrimidine ring of dT1. This relatively non-specific stacking interaction explains the apparent lack of specificity at the +1 position. Thus, our structure reveals the unique role of H29 in positioning the substrate ssDNA with a series of coordinated hydrogen bonds and stacking interactions, essentially latching the ssDNA and the target dC0 within the active site.
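The inference that a pH optimum of 5.5 implies a protonated histidine can be made semi-quantitative with the Henderson–Hasselbalch relation. The sketch below assumes a side-chain pKa of 6.0, a typical textbook value for a free histidine imidazole; the actual pKa of H29 in A3A is not reported here and may differ substantially in the DNA-bound state.

```python
def fraction_protonated(ph: float, pka: float) -> float:
    """Henderson-Hasselbalch: fraction of a titratable group in the
    protonated form at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


ASSUMED_HIS_PKA = 6.0  # assumption: generic imidazole pKa, not measured for H29

at_optimum = fraction_protonated(5.5, ASSUMED_HIS_PKA)        # about 0.76
at_physiological = fraction_protonated(7.4, ASSUMED_HIS_PKA)  # about 0.04
```

Under this assumption roughly three quarters of the histidine population would be protonated at pH 5.5, compared with a few per cent at pH 7.4, which is consistent with the pH dependence described above.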

A3A and rA3G-NTD differ in DNA binding

The recent structure of ssDNA bound to the inactive pseudo-catalytic domain rA3G-NTD42 is not that of a substrate complex and displays a binding mode that is incompatible with catalysis. In contrast to our structure, the single base ordered in that structure is not coordinated within the binding pocket (Supplementary Fig. 5), but rather a sugar is partially buried in the pocket. More specifically, comparing the A3A–ssDNA with the rA3G-NTD–ssDNA structure: H70, W98, S99 and Y130 in A3A (H65, W94, S95 and Y125 in rA3G-NTD) are conserved in the two proteins' sequences and interact with ssDNA; however, there are no similarities in their interactions with the ssDNA (Figs 2a and 4; Supplementary Fig. 5). H70 of A3A forms a π-hydrogen bond with dC0, while H65 of rA3G-NTD forms a hydrogen bond with C3′-carbonyl group of the ribose of dT0. W98 and S99 of A3A use their backbone carbonyl groups to hydrogen bond with the amino group of the target cytidine (dC0), while W94 of rA3G-NTD stacks with the pyrimidine of dT0. Y130 of A3A forms a π–π interaction with dC0 and a hydrogen bond with the phosphate between dC0 and dT−1, while Y125 of rA3G-NTD forms a hydrogen bond with C3′-carbonyl group of the ribose of dT1. Many of these interactions preclude the interactions observed in A3A–ssDNA (Figs 1 and 2). Amino acids with more extensive interactions with substrate ssDNA are not conserved in sequence or structure, including H29, T31 and the critical N57, which are D, V and G, respectively, in rA3G-NTD (Supplementary Fig. 1). In addition, interactions at the −1 and +1 positions are not observed in the rA3G-NTD–ssDNA complex structure as only a single dT0 is ordered in the electron density. Critically, the target cytidine (dC0) in the A3A structure is located ready for deamination, while the non-substrate dT0 in the rA3G-NTD structure is not located close to the catalytic Zn2+. This binding mode corresponds to a much lower affinity of the pseudo-catalytic rA3G-NTD for ssDNA (∼1.6 μM)42, consistent with non-specific binding, compared to the ∼60 nM (ref. 26) we observed for substrate ssDNA binding to A3A. While the rA3G-NTD-dT structure may represent mechanisms by which non-substrate ssDNA binds A3 domains, the A3A–ssDNA structure we present here elucidates the mechanism by which ssDNAs are recognized as substrates by catalytically active A3s.
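To put the two dissociation constants on a common energy scale, the affinity ratio can be converted to a difference in binding free energy using ΔG = RT ln Kd. The following is a rough illustration only: the temperature (298.15 K) is an assumption, and the two values were measured for different proteins and DNA ligands, so the result is indicative rather than a controlled comparison.

```python
import math

R_KCAL = 1.987e-3      # gas constant, kcal mol^-1 K^-1
ASSUMED_T = 298.15     # assumption: 25 degrees C; not stated in the text


def delta_delta_g(kd_weak: float, kd_tight: float, temperature: float) -> float:
    """Free-energy difference (kcal/mol) between two dissociation constants
    given in the same units, from dG = RT ln(Kd)."""
    return R_KCAL * temperature * math.log(kd_weak / kd_tight)


fold_difference = 1.6e-6 / 60e-9                       # about 27-fold
ddg = delta_delta_g(1.6e-6, 60e-9, ASSUMED_T)          # about 1.9 kcal/mol
```

A roughly 27-fold difference in Kd therefore corresponds to about 2 kcal mol−1 under these assumptions.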

Figure 4: Structure and substrate-binding similarity between A3A and RNA deaminase TadA.

(a) A3A structure (green ribbon) bound to substrate DNA (orange sticks, as in Fig. 1b). Three DNA nucleotides (dT−1, dC0 and dT1) are displayed. A zinc ion at the active centre is coordinated by H70 (helix α2), C101 and C106 (helix α3). (b) TadA structure60 (PDB code 2B3J) (grey ribbon) bound to substrate RNA (orange sticks). Three RNA nucleotides (rU−1, Neb0 (nebularine) and rC1) are displayed. Zinc-coordinating residues H53 (helix α2), C83 and C86 (helix α3) are shown in stick representation (carbons, white; nitrogens, blue; oxygen, red; sulfurs, yellow). (c) rA3G-NTD structure (grey ribbon) bound to ssDNA (orange sticks); only dT0 has the base, while the backbone of dT1 and the sugar of dT0 are mapped. Surface representation of the nucleotide-binding site of (d) A3A, (e) TadA and (f) A3G-NTD. Close-up view of the active site of (g) A3A, (h) TadA and (i) A3G-NTD. The catalytic glutamic acid side chain was modelled in place of the alanine at position 72 in the A3A crystal structure.

Molecular recognition in polynucleotide deaminases

Our crystal structure of the A3A–ssDNA complex and the crystal structure of Staphylococcus aureus tRNA adenosine deaminase (TadA) in complex with RNA (2B3J)60 are structures of single-stranded polynucleotide deaminases bound to their substrates. Although their substrates are different, as TadA deaminates adenosine at the anti-codon stem-loop of tRNAArg2 and A3A deaminates cytosines in ssDNA, their active sites are similar in that both have a HAEx∼30Cx2-4C zinc-binding motif. We observe the most striking similarity in the phosphate-sugar backbone traces of RNA (TadA) and ssDNA (A3A) (Fig. 4): 5′–3′ directionality is the same, and the polynucleotide is sharply bent with the target nucleotide deep in the active site pocket. Five nucleotides located in the anti-codon stem-loop of tRNAArg2 have adopted C2′-endo ribose conformation that is typical for DNA, explaining how the RNA forms a similar backbone conformation to the ssDNA bound to A3A (Fig. 4a,d,g). This remarkable similarity of the phosphate-sugar backbone, despite different substrates, tRNA for TadA and ssDNA for A3A, implies that the HAEx∼30Cx2-4C type zinc-dependent deaminases have an evolutionary conserved substrate-binding topology as well as catalytic mechanism.

This crystal structure of an ssDNA substrate–enzyme complex reveals how substrate recognition occurs by single-stranded polynucleotide-modifying enzymes, for APOBEC family members and other ssDNA deaminases. This is in contrast with the pseudo-catalytic domain A3G-NTD42, which is not a substrate complex and has a single base ordered in the structure that is only partially buried in the binding pocket, displaying a very dissimilar binding mode (Fig. 4c,f,i). The striking similarity of A3A–ssDNA (Fig. 4a,d,g) with the structure of TadA–tRNA complex (Fig. 4b,e,h) implies structural and mechanistic conservation among single-stranded nucleotide-modifying enzymes that have evolved to acquire distinct specificities. These specificities may be leveraged for specific gene editing. APOBEC1 and other cytidine deaminases were recently combined with CRISPR/Cas9 technology in direct ‘base editing’ to correct point mutations, without the need for a donor template or double-stranded DNA breaks61. By leveraging the directionality, specificity and binding architecture of ssDNA revealed by our A3A–ssDNA complex, base-editing technologies will become even more targeted and specific to expand the scope and effectiveness of genome editing.

Methods

Preparation of protein and DNA

The preparation method of A3A(E72A/C171A) protein was described previously26 as follows: the protein was expressed in E. coli strain BL21 DE3 Star (Stratagene) cells with pCold-GST-A3A(E72A/C171A) vector. Expression was induced with 1 mM isopropyl β-D-1-thiogalactopyranoside at 16 °C for 22 h in lysogeny broth medium containing 100 μg ml−1 ampicillin. Cells were pelleted, resuspended in purification buffer (50 mM Tris-HCl (pH 8.0), 300 mM NaCl and 1 mM dithiothreitol) and lysed through sonication. Cellular debris was separated by centrifugation (45,000g, 30 min, 4 °C). The protein was purified as a GST-fused protein with glutathione-immobilized resin (Clontech). After digesting with HRV 3C protease, the protein was further purified with a size-exclusion column (GE Healthcare) equilibrated with a buffer (10 mM Tris-HCl (pH 8.0), 200 mM NaCl and 1 mM dithiothreitol). The fraction containing the monomeric form was collected and concentrated for crystallization. The purity and integrity of A3A(E72A/C171A) was confirmed by SDS–polyacrylamide gel electrophoresis. E72A inactivates the enzyme, while C171A (distal to the active site) enhances solubility of the expressed protein.

The DNA oligo, d(TTTTTTTTCTTTTTT), was synthesized (Integrated DNA Technologies), and mixed with the purified A3A(E72A/C171A) protein at a molar ratio of 2:1.

Crystallization and data collection

Crystals of the A3A(E72A/C171A)–DNA complex were grown by the hanging-drop vapour-diffusion method over a reservoir of 100 mM MOPS (pH 6.5), 50 mM MgCl2, 50 mM CaCl2, 23% polyethylene glycol 3,350 and 15% 2-methyl-2,4-pentanediol. Drops were formed by mixing 1 μl of A3A(E72A/C171A)–DNA solution (∼20 mg ml−1 of protein concentration) and 1 μl of reservoir solution, with equilibration over the reservoir at 20 °C. Micro-seeding was performed using a cat whisker and larger crystals suitable for X-ray diffraction were obtained. Crystals were flash-frozen directly in the cryogenic stream. Diffraction data were collected using an in-house X-ray source MicroMax-007 HF (Rigaku) with a copper anode at a wavelength of 1.54178 Å and a Saturn 944 HG (Rigaku) detector. The space group of the crystals was I222 with unit cell dimensions of a=56.6 Å, b=72.7 Å, c=115.0 Å (Table 1). The collected intensities were indexed, integrated, corrected for absorption and scaled using HKL2000 (ref. 62).
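For orientation, the reported cell dimensions and wavelength can be related to two derived quantities: the unit cell volume (the product a·b·c for an orthorhombic cell such as I222) and the scattering angle 2θ at which 2.2 Å reflections appear according to Bragg's law, λ = 2d sin θ. The function names below are hypothetical and the calculation is a sketch, not part of the data-processing pipeline.

```python
import math


def orthorhombic_volume(a: float, b: float, c: float) -> float:
    """Unit cell volume in cubic angstroms for a cell with 90-degree angles."""
    return a * b * c


def bragg_two_theta_deg(d_spacing: float, wavelength: float) -> float:
    """Scattering angle 2-theta (degrees) from Bragg's law, first order."""
    sin_theta = wavelength / (2.0 * d_spacing)
    if not 0.0 < sin_theta <= 1.0:
        raise ValueError("d-spacing not reachable at this wavelength")
    return 2.0 * math.degrees(math.asin(sin_theta))


volume = orthorhombic_volume(56.6, 72.7, 115.0)   # about 4.73e5 cubic angstroms
two_theta = bragg_two_theta_deg(2.2, 1.54178)     # about 41 degrees
```

With the copper wavelength used here, reflections at the 2.2 Å resolution limit are recorded at a scattering angle of roughly 41°.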

Structure determination

The protein structure was solved by molecular replacement phasing using a previously determined apo A3A(E72A/C171A) crystal structure (PDB code 4XXO)26 with the program Phaser63. Model building of the protein and bound DNA, and refinements were manually performed using the programs Coot64 and Phenix65,66, respectively. A simulated annealing omit map was calculated to confirm the ssDNA positioning (Supplementary Fig. 6). The first nine residues and the side chains of residues R10, H11, H16, K30, N42, V46, K47, Q50, Q58, K60, L62, L63, F66, Y67, D177, E181 and N196 of A3A(E72A/C171A) were not modelled in due to lack of electron density. Residues N42–T44 and L62–G65 were somewhat disordered; the occupancy values were set to 0.5 for residues N42–T44, L62 and G65, and to 0.75 for residues L63 and C64 due to poor electron density. A density proximal to a zinc ion at the active centre was assigned to chloride considering the statistics of zinc ligand67, resulting in a good fit without phase-error signals (Supplementary Fig. 7). The identification of the active site zinc is further supported by the highest peak, 9.5σ, in an anomalous difference Fourier map at this position. A smaller peak at 5.6σ in this map is present at the assigned chloride position. The final model was refined to R(work)/R(free) values of 0.177/0.225 at 2.20 Å resolution (Table 1). The quality of the final model was assessed by Molprobity68, which indicated that 96.2% of the residues were in the favoured dihedral angle configuration and there were no Ramachandran outliers.
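The R(work) and R(free) values quoted above follow the standard crystallographic definition R = Σ| |Fobs| − |Fcalc| | / Σ|Fobs|, with R(free) computed in the same way over a set of reflections excluded from refinement. The sketch below uses made-up amplitudes and assumes that observed and calculated amplitudes are already on a common scale; refinement programs such as Phenix additionally apply scaling and solvent corrections that are omitted here.

```python
def r_factor(f_obs, f_calc) -> float:
    """Crystallographic R-factor for paired structure-factor amplitudes.
    Assumes both lists are already on the same scale."""
    if len(f_obs) != len(f_calc):
        raise ValueError("amplitude lists must have the same length")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


# Toy amplitudes for illustration only (not data from this study).
toy_obs = [100.0, 50.0, 25.0]
toy_calc = [90.0, 55.0, 25.0]
toy_r = r_factor(toy_obs, toy_calc)   # 15 / 175
```

Applying the same function separately to the working and test reflections would give the R(work) and R(free) pair.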

Structure analysis

Figures of structure models were generated by Pymol69, which was also used to model in the catalytic E72 side chain in Figs 3 and 4. The electrostatic distribution of A3A(E72A/C171A) was calculated and visualized using PDB2PQR server70 and Pymol with the APBS plugin, where the cysteine was modelled as thiolate anion (S) and solutes were excluded. Solvent-accessible and buried surface area was calculated with PISA71. Local root mean square deviation between apo and DNA-bound forms of A3A(E72A/C171A) was calculated using Molmol72. The distance difference matrices between the apo- and DNA-bound forms of A3A(E72A/C171A) were calculated and visualized using a custom-made script in MacOS Xcode (https://developer.apple.com/xcode/).
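The custom script used for the distance difference matrices is not reproduced here. As a minimal sketch of the general technique, and not the authors' implementation, the matrix can be obtained by computing all pairwise distances (for example, between Cα atoms) in each form and subtracting one matrix from the other. The coordinates below are toy values and the function names are hypothetical.

```python
import math


def distance_matrix(coords):
    """All pairwise Euclidean distances between 3D points."""
    return [[math.dist(p, q) for q in coords] for p in coords]


def distance_difference_matrix(coords_a, coords_b):
    """Element-wise difference (form B minus form A) of the two
    intramolecular distance matrices; atoms must be in matching order."""
    if len(coords_a) != len(coords_b):
        raise ValueError("both forms need the same number of atoms")
    dist_a = distance_matrix(coords_a)
    dist_b = distance_matrix(coords_b)
    n = len(coords_a)
    return [[dist_b[i][j] - dist_a[i][j] for j in range(n)] for i in range(n)]


# Toy coordinates: the third atom moves 4 angstroms along y in form B.
toy_apo = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
toy_bound = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 8.0, 0.0)]
ddm = distance_difference_matrix(toy_apo, toy_bound)
```

Because only internal distances are compared, this analysis does not depend on how the two structures are superimposed, which makes it convenient for locating regions that move upon ligand binding.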

Data availability

Atomic coordinates and structure factors for the reported crystal structure have been deposited in the Protein Data Bank (http://www.wwpdb.org/) under the accession number 5KEG. The data that support the findings of this study are available from the corresponding author upon request.

Additional information

How to cite this article: Kouno, T. et al. Crystal structure of APOBEC3A bound to single-stranded DNA reveals structural basis for cytidine deamination and specificity. Nat. Commun. 8, 15024 doi: 10.1038/ncomms15024 (2017).