Protein–protein HADDocking using exclusively pseudocontact shifts
To enhance the structure determination process of macromolecular assemblies by NMR, we have implemented long-range pseudocontact shift (PCS) restraints in the data-driven protein docking package HADDOCK. We demonstrate the efficiency of the method on a synthetic, yet realistic case based on the lanthanide-labeled N-terminal ε domain of the E. coli DNA polymerase III (ε186) in complex with the HOT domain. Docking from the bound forms of the two partners is swiftly executed (interface RMSDs < 1 Å) even when a very large amount of noise is added, while the conformational changes of the free forms still present some challenges (interface RMSDs in the 3.1–3.9 Å range for the ten lowest-energy complexes). Finally, using PCS as the only experimental information, we determine the structure of ε186 in complex with the HOT-homologue θ subunit of the E. coli DNA polymerase III.
Keywords: HADDOCK; Pseudocontact shift; Protein docking; Paramagnetic NMR; DNA polymerase III
Pseudocontact shifts (PCS) are measured as the difference in chemical shifts between two NMR spectra, one of which is recorded with a paramagnetic center attached to the protein of interest. The presence of the paramagnetic center (usually a paramagnetic lanthanide; for a review of paramagnetic labeling techniques, see Su and Otting 2010) changes the reference spectrum in several ways: mainly, observed cross peaks are shifted, while active spins close to the paramagnetic probe (typically closer than 5–10 Å) are no longer detected. The size and sign of the shift in each dimension of the spectrum depend on several factors, including the distance of the spin from the lanthanide and its position with respect to the anisotropic ∆χ-tensor. The axial and rhombic components of the ∆χ-tensor, as well as the orientation of the tensor frame relative to the protein, depend on the type of lanthanide used and on the electronic environment surrounding the paramagnetic center (Bertini et al. 2002). Spectra recorded with different lanthanides therefore provide non-redundant information. Importantly, PCS can be measured up to distances of 40 Å from the paramagnetic center when a strong lanthanide such as Tb3+ or Dy3+ is used, making this effect particularly suitable for obtaining long-range inter-molecular information. The concept of simple PCS-based rigid-body docking was first demonstrated by Ubbink and coworkers (Ubbink et al. 1998). A more general method using lanthanide labeling techniques has since been proposed (Pintacuda et al. 2006), and the protocol was recently reapplied in combination with chemical shift perturbation data (Saio et al. 2010). Atomic-level details, necessary to precisely understand biomolecular interactions or to accurately design candidate drug compounds, can, however, only be revealed by flexible docking approaches, such as the one offered by the data-driven docking package HADDOCK (Dominguez et al. 2003), which uses CNS as its computational engine (Brunger et al. 1998).
We present here the implementation of a PCS energy term in HADDOCK using the PARArestraints module developed by Banci and coworkers (Banci et al. 2004), which we have ported into the structure calculation software CNS. We demonstrate that PCS alone are sufficient to accurately model the structure of a complex. As a test case we used the lanthanide-labeled N-terminal ε domain of the E. coli DNA polymerase III (ε186) in complex with the HOT domain. The active site of ε186 contains a Mn2+/Mg2+ pair that can be substituted by a single lanthanide (Pintacuda et al. 2006). The unpaired electrons of the lanthanide in turn induce intra-molecular PCS on nuclear spins in free ε186, as well as inter-molecular PCS when ε186 is bound to its protein partner. We first investigate the capability of PCS data to drive the docking. The protocol is then applied to model the structure of ε186 in complex with the HOT-homologue θ subunit of the E. coli DNA polymerase III.
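The dependence of the PCS on distance and on position relative to the ∆χ-tensor, as outlined in this introduction, can be made concrete with the standard point-dipole expression, δ = [∆χax(3cos²θ − 1) + 1.5 ∆χrh sin²θ cos2φ] / (12πr³). The Python sketch below evaluates it for a spin whose coordinates are already expressed in the principal frame of the tensor, with the metal at the origin. The function name, the units (Å and 10⁻³² m³) and the example values are illustrative assumptions, not parameters taken from this work or from the HADDOCK/CNS implementation.

```python
import math


def pcs_ppm(spin_xyz, dchi_ax, dchi_rh):
    """Point-dipole PCS (ppm) of a spin in the tensor principal frame.

    spin_xyz : (x, y, z) in angstrom, metal ion at the origin (assumption).
    dchi_ax, dchi_rh : axial and rhombic components in units of 1e-32 m^3.
    """
    x, y, z = spin_xyz
    r2 = x * x + y * y + z * z
    r5 = r2 ** 2.5
    # r^2 * (3 cos^2(theta) - 1) equals 3 z^2 - r^2
    axial = dchi_ax * (3.0 * z * z - r2)
    # r^2 * sin^2(theta) * cos(2 phi) equals x^2 - y^2
    rhombic = 1.5 * dchi_rh * (x * x - y * y)
    # 1e4 = (1e-32 m^3 / 1e-30 m^3 per A^3) * 1e6 ppm
    return 1e4 * (axial + rhombic) / (12.0 * math.pi * r5)


if __name__ == "__main__":
    # Hypothetical tensor: a spin on the z axis, then the same spin twice as far.
    near = pcs_ppm((0.0, 0.0, 20.0), 40.0, 0.0)
    far = pcs_ppm((0.0, 0.0, 40.0), 40.0, 0.0)
    print(f"20 A: {near:.3f} ppm, 40 A: {far:.3f} ppm")
```

The shift falls off as r⁻³ and changes sign with the orientation of the spin relative to the tensor axes, which is why data from several lanthanides with different tensors are complementary.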
Results and discussion
Docking with synthetic data: protocol
The performance of our PCS-driven flexible docking approach was first assessed on the ε186/HOT complex. Artificial PCS data were generated from the crystal structure (PDB id 2IDO, chains C and D) (Kirby et al. 2006) using the ∆χ-tensor parameters that best fit the available experimental PCS data for ε186 (Schmitz et al. 2006), assuming a single fixed location of the paramagnetic center. To keep the data set realistic, a generous flat random noise of ±0.15 ppm was added. The resulting PCS range from −1.87 to 4.48 ppm. Furthermore, PCS that were not observed experimentally were removed. In total, five data sets were created: three for ε186 (Dy3+, Er3+ and Tb3+) and two for HOT (Dy3+ and Er3+). This synthetic data set matches the experimental data set available for the ε186/θ system in terms of the number of lanthanides used, the number of PCS observed, the PCS value range and the level of noise. We used the five data sets in the following docking runs. Note, however, that the Tb3+ data set only helps to refine the location of the lanthanide; it does not help to drive the docking because it contains no inter-molecular information.
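The generation of such a synthetic data set can be sketched in a few lines of Python: back-calculate ideal PCS, keep only the spins that were observed experimentally, and add flat noise of ±0.15 ppm. This is a minimal sketch under stated assumptions, not the script used in this work: coordinates are taken to be already in the tensor principal frame with the metal at the origin, and the spin labels, tensor values and random seed are hypothetical.

```python
import math
import random


def ideal_pcs(xyz, dchi_ax, dchi_rh):
    """Noise-free PCS (ppm); xyz in angstrom, tensor components in 1e-32 m^3."""
    x, y, z = xyz
    r2 = x * x + y * y + z * z
    num = dchi_ax * (3.0 * z * z - r2) + 1.5 * dchi_rh * (x * x - y * y)
    return 1e4 * num / (12.0 * math.pi * r2 ** 2.5)


def synthetic_pcs(spins, dchi_ax, dchi_rh, observed, noise=0.15, seed=0):
    """Return {label: noisy PCS} for the spins listed in `observed`.

    noise : half-width (ppm) of the flat (uniform) error, +/-0.15 ppm here.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    data = {}
    for label, xyz in spins.items():
        if label not in observed:
            continue  # drop PCS that were not observed experimentally
        data[label] = ideal_pcs(xyz, dchi_ax, dchi_rh) + rng.uniform(-noise, noise)
    return data


if __name__ == "__main__":
    # Hypothetical spins (label -> coordinates in the tensor frame).
    spins = {"A10.HN": (0.0, 0.0, 20.0), "A11.HN": (15.0, 5.0, -8.0),
             "A12.HN": (3.0, 2.0, 1.0)}
    print(synthetic_pcs(spins, 40.0, 10.0, observed={"A10.HN", "A11.HN"}))
```

Repeating the call once per lanthanide and per protein, each with its own tensor, would give one data set per lanthanide/protein pair as described above.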
Two docking runs were performed: the first from the bound forms of ε186 and HOT taken from 2IDO, the second from the free forms, consisting of the crystal structure of ε186 [PDB id 1J53 (Hamdan et al. 2002)] and the NMR ensemble of HOT [PDB id 1SE7 (DeRose et al. 2004)]. The axial and rhombic components of the ∆χ-tensors were fitted against the noisy synthetic data of ε186 using the software Numbat (Schmitz et al. 2008) and entered into HADDOCK (the values are given in SI Table 1). Distance restraints were defined between the paramagnetic center and the coordinating residues of ε186 (Hamdan et al. 2002) to maintain the lanthanide ion at its known location. The flexible, disordered termini of the NMR structure of HOT were removed because they can obstruct the docking process. For each run, 1,400 structures were calculated during the rigid-body minimization stage; the 200 lowest-scoring structures were subsequently subjected to semi-flexible simulated annealing in torsion angle space, followed by a final refinement in explicit solvent (water) according to the standard HADDOCK protocol (De Vries et al. 2007).
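To illustrate what fitting a ∆χ-tensor against PCS data involves, note that for a fixed metal position the PCS is linear in the five independent components of the traceless tensor: δ = rᵀ∆χ r / (4πr⁵). The Python sketch below solves the resulting least-squares problem through the normal equations. It is not Numbat's algorithm and makes several assumptions: the metal sits at the origin of the coordinate frame, units are Å and 10⁻³² m³, and all function names are hypothetical. Extracting the axial and rhombic components would additionally require diagonalizing the fitted tensor, which is omitted here.

```python
import math


def design_row(xyz):
    """Coefficients of (chi_xx, chi_yy, chi_xy, chi_xz, chi_yz) for one spin."""
    x, y, z = xyz
    k = 1e4 / (4.0 * math.pi * (x * x + y * y + z * z) ** 2.5)
    # chi_zz = -(chi_xx + chi_yy) because the tensor is traceless
    return [k * (x * x - z * z), k * (y * y - z * z),
            k * 2.0 * x * y, k * 2.0 * x * z, k * 2.0 * y * z]


def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_tensor(coords, pcs):
    """Least-squares tensor components from spin coordinates and PCS (ppm)."""
    rows = [design_row(c) for c in coords]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(5)] for i in range(5)]
    atb = [sum(r[i] * v for r, v in zip(rows, pcs)) for i in range(5)]
    return solve(ata, atb)
```

With noisy input the fitted components deviate from the true ones, which is one reason the tensors in a run like this are fitted against the noisy rather than the ideal data.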
Docking with synthetic data: bound–bound scenario
Docking with synthetic data: unbound–unbound scenario
The unbound–unbound docking run is challenging in several ways. While the free and bound structures of ε186 are similar (1.4 Å backbone RMSD), they differ by a large conformational change in loop 157K–162G, located at the edge of the interface (SI Fig. 2), and HOT undergoes a more global conformational change, between 3.2 and 3.6 Å for the members of the NMR ensemble (SI Fig. 3). Conformational changes of this magnitude are already considered challenging in the docking field (Andrusier et al. 2008; Bastard et al. 2011; Bonvin 2006; Zacharias 2008). Under those conditions, obtaining high-quality predictions has proven difficult, even for docking software that handles flexible segments (Andrusier et al. 2008; Bastard et al. 2011; Bonvin 2006; Zacharias 2008). About one third of the structures produced by the rigid-body stage are below 4 Å i-RMSD, satisfying the acceptable (one-star) criterion of the CAPRI classification (Janin 2005) (Fig. 1, red squares). The next two refinement stages of the HADDOCK run improved the average i-RMSD of the ten lowest-energy complexes by as much as 0.53 Å (with a maximum improvement of 0.96 Å), indicating that the PCS energy term is pulling in the right direction (Fig. 1, green unfilled triangles and blue unfilled circles). Higher-quality predictions would probably require better sampling of the conformational changes at the interface, which is notoriously difficult (Bonvin 2006). For a complex such as ε186/HOT, a HADDOCK run based on PCS restraints is thus expected to generate acceptable to high-quality solutions.
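The way such results are summarized (average i-RMSD of the ten lowest-energy complexes, and CAPRI quality of each model) can be sketched as follows. This is a simplified illustration, not the analysis script used here: it classifies on i-RMSD alone, whereas the full CAPRI criteria also involve the fraction of native contacts and the ligand RMSD. The 4 Å cut-off for acceptable models is the one quoted above; the 2 Å and 1 Å cut-offs are the usual CAPRI medium and high thresholds. The function names and the input values are hypothetical.

```python
def capri_stars(i_rmsd):
    """Star rating from interface RMSD only (simplified CAPRI criteria)."""
    if i_rmsd <= 1.0:
        return 3  # high quality
    if i_rmsd <= 2.0:
        return 2  # medium quality
    if i_rmsd <= 4.0:
        return 1  # acceptable
    return 0      # incorrect


def summarize_lowest_energy(models, n=10):
    """Mean i-RMSD and star ratings of the n lowest-energy models.

    models : iterable of (energy, i_rmsd) pairs, i_rmsd in angstrom.
    """
    top = sorted(models, key=lambda m: m[0])[:n]
    rmsds = [i_rmsd for _, i_rmsd in top]
    mean = sum(rmsds) / len(rmsds)
    return mean, [capri_stars(r) for r in rmsds]


if __name__ == "__main__":
    # Hypothetical (energy, i-RMSD) pairs for four docked models.
    models = [(-120.0, 3.1), (-80.0, 5.2), (-110.0, 3.9), (-100.0, 0.8)]
    mean, stars = summarize_lowest_energy(models, n=3)
    print(f"mean i-RMSD of top 3: {mean:.2f} A, stars: {stars}")
```

Running the same summary after each stage (rigid-body, semi-flexible, water refinement) gives the kind of per-stage comparison discussed above.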
Docking with experimentally observed PCS
We have demonstrated that PCS alone, when combined with flexible docking, are sufficient to generate accurate models of complexes. This approach, implemented in HADDOCK, was applied to model the structure of the ε186/θ complex based on experimental PCS data. It is anticipated that recent progress in paramagnetic labeling techniques (Su and Otting 2010) will increase the popularity of PCS as a source of structural restraints. The inherent flexibility of some paramagnetic tags can easily be modeled by allowing for variation in the distance restraints used to maintain the ∆χ-tensor in place. The flexible, PCS-driven docking protocol described here will be made available in a future release of HADDOCK and will also be implemented in the web server portal (De Vries et al. 2010).
This work was supported by the Dutch Foundation for Scientific Research (NWO) through a VICI grant (no. 700.56.442) to A.M.J.J.B and by the European Community, FP7 Access to Research Infrastructures Bio-NMR project (grant number 261863), and e-Infrastructure “e-NMR” and “WeNMR” projects (grant numbers 213010 and 261572).
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
- Brunger AT, Adams PD, Clore GM, DeLano WL, Gros P, Grosse-Kunstleve RW, Jiang JS, Kuszewski J, Nilges M, Pannu NS, Read RJ, Rice LM, Simonson T, Warren GL (1998) Crystallography and NMR system: a new software suite for macromolecular structure determination. Acta Crystallogr Sect D 54:905–921
- DeLano WL (2002) The PyMOL molecular graphics system. Palo Alto, CA
- DeRose EF, Kirby TW, Mueller GA, Chikova AK, Schaaper RM, London RE (2004) Phage like it HOT: solution structure of the bacteriophage P1-encoded HOT protein, a homolog of the theta subunit of E-coli DNA polymerase III. Structure 12(12):2221–2231