Introduction

Oligomerization plays an important role in protein function and many soluble and membrane proteins form homodimers or higher oligomers (Goodsell and Olson 2000). The structure of oligomeric proteins in solution can be determined by NMR spectroscopy, which is particularly important for low-affinity complexes that are difficult to crystallize (Vaynberg and Qin 2006). In principle, the same NMR approach can be followed for complexes as for monomeric proteins, i.e. analysis of a 3D heteronuclear-edited NOESY (Nuclear Overhauser Enhancement Spectroscopy) on the basis of a nearly complete resonance assignment. However, the difficulty of distinguishing intramonomer from intermonomer correlations, the increasing size of protein–protein complexes and the requirement to determine structures in a cost- and time-efficient manner motivated the development of various rigid-body docking approaches such as HADDOCK (Dominguez et al. 2003). Rigid-body docking requires knowledge of the 3D structure of the individual molecules and can be driven by a small number of NOEs, residual dipolar couplings (RDCs), ambiguous intermolecular distance restraints from chemical shift perturbation, long-range distances derived from saturation transfer or paramagnetic probes and other biological data (Clore 2000; Diaz-Moreno et al. 2005; Dominguez et al. 2003; Matsuda et al. 2004).

For homooligomeric systems, structure determination is complicated in three ways. Firstly, preparation of samples formed by a defined mixture of protein with different isotope labels often requires unfolding and refolding of the homooligomer. Secondly, intermolecular NOEs are especially difficult to observe due to the inherent symmetry of the system. Thirdly, as the protein cannot be obtained easily in a monomeric form, it is generally not possible to map the binding interface using chemical shift perturbation data, hydrogen exchange dynamics or backbone and side chain dynamics (Englander et al. 1997; Kay et al. 1996). To overcome this problem, paramagnetic relaxation agents might be added to the solution (Petros et al. 1990; Sakakura et al. 2005) or saturation transfer experiments in mixtures of unlabeled and 15N/2H-labeled protein may be recorded (Takahashi et al. 2000). Very often, however, the information obtained from these experiments is ambiguous or the experiments fail due to insufficient sensitivity (Liepinsh et al. 2001).

Accurate intermolecular distance and orientational restraints between atom pairs are important for the determination of high-resolution structures of protein–protein complexes. Gaponenko et al. added a paramagnetic probe sub-stoichiometrically to break the symmetry and observe monomer-specific pseudocontact shifts and RDCs in a homodimer (Gaponenko et al. 2002). A drawback of the approach is that the number of signals and the signal overlap are strongly increased due to the broken symmetry and pseudocontact shifts. Alternatively, long-range structural information may be derived from paramagnetic relaxation enhancement (PRE) observed in the presence of a paramagnetic nitroxide radical that has been specifically attached to a diamagnetic protein (Kosen 1989). PRE-derived distances are highly useful for structural characterization of globular (Bertini et al. 1996a, b, 1997; Donaldson et al. 2001; Feeney et al. 2001; Gaponenko et al. 2000) and intrinsically disordered proteins (Dyson and Wright 1998), as well as protein–protein (Iwahara and Clore 2006) and protein–DNA complexes (Iwahara et al. 2004).

When the available experimental data are not sufficient to obtain convergence to the correct complex structure, experimental data might be combined with algorithms designed for ab initio protein-protein docking (Russell et al. 2004). For example, chemical shift perturbation data alone (Morelli et al. 2001) or in combination with RDCs (Dobrodumov and Gronenborn 2003) were applied to filter the correct structure from ab initio docking results. In homooligomeric systems, however, chemical shift perturbation data and monomer specific RDCs are not easily accessible.

RDCs, however, offer an alternative way of rapidly validating structures or models of proteins and protein–protein complexes. RDCs can be observed in proteins that are weakly aligned in an anisotropic environment (Tjandra and Bax 1997). The preferred orientation of the protein and the observed RDC values depend on the three-dimensional shape and electrostatic properties of the biomolecule. Based on this insight, we had developed a simple simulation method, called PALES, that allows prediction of a protein’s alignment tensor with reasonable accuracy from the three-dimensional charge distribution and shape of the macromolecule (Zweckstetter and Bax 2000; Zweckstetter et al. 2004). Recently, we showed that PALES in combination with RDCs that were observed in a charged Pf1 alignment medium can be used to rapidly determine the relative orientation and stoichiometry of coiled-coil proteins in solution. In particular, antiparallel homodimers could be unambiguously distinguished from parallel coiled-coil homodimers (Zweckstetter et al. 2005).

Enterococcus faecalis is one of the major causes of hospital-acquired antibiotic-resistant infections. The 15.4 kDa homodimer CylR2 is part of a two-component system that regulates the production of the exotoxin cytolysin (Gilmore et al. 1990; Murray 1990; Haas et al. 2002). We previously reported the X-ray structure of CylR2, showed its role as a repressor of cytolysin transcription and proposed a model of the CylR2/DNA complex structure (Rumpel et al. 2004). Here we (i) determined the high-resolution structure of CylR2 in solution, (ii) studied the influence of different experimental intermonomer information on the accuracy of the derived structure and (iii) showed that despite the lack of sufficient experimental data the 3D structure of CylR2 can be determined by including information from ab initio docking and prediction of molecular alignment as implemented in PALES.

Materials and methods

NMR sample preparation

Details of cloning, protein overexpression and purification have been described elsewhere (Razeto et al. 2004). Single cysteine mutants of CylR2 (N40C and T55C) were generated by using the QuikChange site-directed mutagenesis kit (Stratagene). The introduced mutations were confirmed by DNA sequencing. 15N- and 13C/15N-labeled samples were prepared from Escherichia coli cells grown in M9-based minimal medium containing 15NH4Cl and/or 13C6-glucose. Samples to determine intermonomer distances were prepared by dissolving 15N-labeled wt and unlabeled mutant in 8 M urea, mixing them in a 1:1 molar ratio and refolding by dialysis against 50 mM HEPES pH 7.0, 600 mM NaCl and 5 mM DTT. Directly before labeling with MTSL ((1-oxy-2,2,5,5-tetramethyl-d-pyrroline-3-methyl)-methanethiosulfonate, Toronto Research Chemicals), DTT was removed by using size exclusion chromatography (PD-10 columns, Amersham Pharmacia Biosciences). Free sulfhydryl groups were modified overnight at room temperature with a 3–5-fold molar excess of MTSL, solubilized in acetone. Unreacted MTSL was removed via a PD-10 column. Complete incorporation of MTSL was confirmed by mass spectrometry. Following NMR analysis in the oxidized form, samples were reduced by adding a 2–3-fold molar excess of 200 mM ascorbic acid.

NMR spectroscopy

NMR samples contained 0.4–0.8 mM CylR2 in 50 mM HEPES pH 7.0, 600 mM NaCl and 5% D2O (v/v). All NMR experiments were acquired at 298 K on Bruker AVANCE 600 or 700 or DRX 600 spectrometers. NMR experiments used for resonance assignment, for measurement of residual dipolar couplings and for calculation of 15N-1H-NOE values were performed as described (Rumpel et al. 2004). For structure determination a 3D [15N,1H]NOESY-HSQC and a [13C,1H]NOESY-HSQC with a mixing time of 120 ms were measured. 2D [15N,1H] HSQC and 15N T 2 relaxation experiments were performed for site-directed spin-labeling studies. The T 2 relaxation times were sampled using seven different 15N relaxation delays: 7.6, 50, 90, 130, 160, 190 and 220 ms. Rotating frame relaxation times (T 1ρ) of backbone nitrogens were estimated from two 1D spectra with relaxation delays of 2 and 60 ms and with a spin-lock power of 2.5 kHz. All spectra were processed using NMRPipe/NMRDraw (Delaglio et al. 1995) and analyzed using Sparky (T. D. Goddard and D. G. Kneller, University of California, San Francisco).

Structure calculation of the CylR2 monomer

The previously reported resonance assignment (Rumpel et al. 2004) and torsion angle restraints as predicted from chemical shifts with the software TALOS (Cornilescu et al. 1999) were used as input for combined automated NOE assignment and structure calculation with the program CYANA (Guntert 2004). For the final CYANA run 19 13C-distances (three short range, nine medium range and seven long range) were assigned manually. The final 20 structures with the lowest target function were used for further refinement in the presence of HN-RDCs and in explicit solvent using Xplor-NIH (Schwieters et al. 2003).

Long-range distances from PRE broadening

For each of the two single-cysteine containing mutants, CylR2N40C and CylR2T55C, two samples were used (molecules carrying spin labels are indicated by a star), a pure 15N- and spin-labeled homodimer (15N-mut(*)/15N-mut(*)) and a 1:1 mixture of 15N-labeled wt and spin-labeled mutant at natural abundance (1:1-mixed 15N-wt/mut(*)). Intra and intermolecular PRE distances were obtained from intensities of cross-peaks of backbone amide proton-nitrogen pairs in 15N-HSQC spectra of the paramagnetic (I para) and diamagnetic (I dia) state (i.e. after addition of ascorbic acid). Intensity ratios I para /I dia were linearly fit for the enhancement of the transverse relaxation rate by the unpaired electron ( \( R_2^{{\text{para}}} \)) (Battiste and Wagner 2000):

$$ \frac{{I_{{\text{para}}} }} {{I_{{\text{dia}}} }} = \frac{{R_2 {\text{exp(}} - R_2^{{\text{para}}} t{\text{)}}}} {{R_2 + R_2^{{\text{para}}} }}{\text{,}} $$
(1)

in which t is the total INEPT evolution time of the 15N-HSQC (∼11.3 ms) and amide proton R 2 values were approximated by experimental amide nitrogen R 2 values (Ishima and Torchia 2003). The distance r between the unpaired electron and each amide proton was determined according to

$$ r = \left[ {\frac{K} {{R_2^{{\text{para}}} }}\left( {4\tau _c + \frac{{3\tau _c }} {{1 + \omega _h^2 \tau _c^2 }}} \right)} \right]^{1/6} {\text{,}} $$
(2)

in which K is 1.23 × 10−32 cm6 s−2 and ω h is the Larmor frequency of the proton. τ c is the correlation time for the electron–nuclear interaction that was assumed to be equal to the global correlation time of CylR2, which was estimated as 6 ns using Stokes’ law (Cavanagh 1996). Changing τ c from 6 to 4 ns did not change the docking results significantly, in agreement with the small (compared to r) influence of τ c on the calculated distance (Eq. 2).
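The two steps described by Eqs. 1 and 2 can be sketched numerically as follows. This is a minimal illustration, not the processing script used in this work: the function names, the bisection solver, the example R 2 of 15 s−1 and the choice of a 600 MHz proton frequency are assumptions made for the example. Only K, t and τ c are taken from the text.

```python
import math

# Constant K from Eq. 2, in cm^6 s^-2 (value quoted in the text).
K_CM6_S2 = 1.23e-32


def intensity_ratio(r2, r2_para, t):
    """Right-hand side of Eq. 1: I_para / I_dia for given relaxation rates."""
    return r2 * math.exp(-r2_para * t) / (r2 + r2_para)


def solve_r2_para(ratio, r2, t, upper=1.0e5):
    """Invert Eq. 1 for R2_para by bisection (hypothetical helper).

    Eq. 1 decreases monotonically from 1 at R2_para = 0, so a ratio
    strictly between 0 and 1 has exactly one solution.
    """
    if not 0.0 < ratio < 1.0:
        raise ValueError("ratio must lie strictly between 0 and 1")
    lo, hi = 0.0, upper
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if intensity_ratio(r2, mid, t) > ratio:
            lo = mid  # predicted ratio too high: R2_para must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)


def pre_distance_angstrom(r2_para, tau_c, omega_h):
    """Eq. 2: electron-proton distance, converted from cm to Angstrom."""
    spectral = 4.0 * tau_c + 3.0 * tau_c / (1.0 + (omega_h * tau_c) ** 2)
    r_cm = (K_CM6_S2 / r2_para * spectral) ** (1.0 / 6.0)
    return r_cm * 1.0e8


if __name__ == "__main__":
    t = 11.3e-3                      # INEPT evolution time in s (from the text)
    r2 = 15.0                        # assumed amide R2 in s^-1 (placeholder)
    omega_h = 2.0 * math.pi * 600e6  # assumed 600 MHz proton frequency, rad/s
    r2_para = solve_r2_para(0.5, r2, t)
    print(pre_distance_angstrom(r2_para, 6e-9, omega_h))
```

Because r depends on the sixth root of the bracketed term in Eq. 2, changing τ c from 6 to 4 ns in this sketch shortens the computed distance by well under 10%, consistent with the weak dependence noted above.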

The 1:1-mixed 15N-wt/mut(*) samples are composed of three different dimers: 50% 15N-wt/mut(*) heterodimer, 25% 15N-wt/15N-wt homodimer and 25% mut(*)/mut(*) homodimer (Fig. 1a). When the chemical shifts of an amide-amide proton pair in the 15N-wt/mut(*) heterodimer are identical to the values in the 15N-wt/15N-wt homodimer, the 15N-wt/15N-wt homodimer contributes 50% of the NMR signal intensity even in the paramagnetic state of the 1:1-mixed 15N-wt/mut(*) sample. This was taken into account by calculation of I para according to

$$ I_{{\text{para}}} = 2{\text{(}}I_{{\text{para}}*} - \frac{{I_{{\text{dia}}} }} {2}{\text{),}} $$
(3)

in which I para* is the signal intensity in the spectrum of the paramagnetic state.
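The reasoning behind Eq. 3 can be made explicit with a short sketch. It assumes that monomers pair randomly during refolding (which reproduces the 25%/50%/25% populations stated above) and that only 15N-labeled monomers are detected. The function names are hypothetical and chosen for illustration only.

```python
def dimer_populations(frac_wt=0.5):
    """Fractions of wt/wt, wt/mut and mut/mut dimers for random pairing."""
    f = frac_wt
    return f * f, 2.0 * f * (1.0 - f), (1.0 - f) * (1.0 - f)


def labeled_signal_shares(frac_wt=0.5):
    """Share of the 15N signal from the wt/wt homodimer and the heterodimer.

    A wt/wt dimer carries two 15N-labeled monomers, a wt/mut dimer one,
    and a mut/mut dimer none (the mutant is at natural abundance).
    """
    wt_wt, wt_mut, _ = dimer_populations(frac_wt)
    homo = 2.0 * wt_wt
    hetero = 1.0 * wt_mut
    total = homo + hetero
    return homo / total, hetero / total


def corrected_para_intensity(i_para_obs, i_dia):
    """Eq. 3: remove the unbroadened wt/wt half from an overlapped peak."""
    return 2.0 * (i_para_obs - i_dia / 2.0)


if __name__ == "__main__":
    print(dimer_populations())       # populations of the three dimers
    print(labeled_signal_shares())   # equal shares for a 1:1 mixture
    print(corrected_para_intensity(0.8, 1.0) / 1.0)  # heterodimer ratio
```

If the mixing ratio deviates from 1:1, `labeled_signal_shares` no longer returns equal shares, and the fixed factor of one half in Eq. 3 becomes an approximation.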

Fig. 1

(a) Overall strategy to derive intermonomer distances from PRE in homodimers. The paramagnetic sample is shown on the left side with a star to indicate paramagnetic subunits and the diamagnetic sample is shown on the right side. The 1:1-mixed samples are composed of equal amounts of 15N-labeled wt (15N-wt, violet) and of paramagnetic mutant (mut(*), white) monomers. The monomers combine into three distinct dimerization pairs: 25% 15N-wt/15N-wt (blue), 50% 15N-wt/mut(*) (green) and 25% mut(*)/mut(*). The former two species contribute equally to the NMR signal while the latter is undetected. For a few residues close to the para or diamagnetic tag across the dimer interface, the chemical shift can be distinguished (peak-doubling), while for all other residues, the 15N-wt/15N-wt and 15N-wt/mut(*) peaks overlap. The PRE distance is derived from the peak intensity ratio (Ipara/Idia) obtained from the paramagnetic and diamagnetic lines (green lines). For the overlapped case, Ipara can be obtained by subtracting Idia/2 according to Eq. 3. The diamagnetic sample can easily be obtained from the paramagnetic sample by ascorbic acid reduction. (b, c) Overlay of 15N-1H-HSQC spectra of paramagnetic (blue) and diamagnetic (red) forms of 15N-mut CylR2T55C (b) and the 1:1 mixed 15N-wt/mut CylR2T55C (c). Residues that disappeared in the paramagnetic state are labeled and doubled peaks in (c) are indicated by ellipses

Determination of the structure of the CylR2 homodimer

To allow usage of PRE-derived intermolecular distances in rigid-body docking, we explicitly included MTSL in the atomic coordinates of the monomeric structure of CylR2. MTSL molecules were attached simultaneously to N40C and T55C. Starting from the structure of CylR2N40C + T55C, we repeated the structure calculation of the monomer of CylR2 using CYANA. In addition to the restraints, which were already used for calculation of the structure of wt CylR2, we included intramolecular PRE distances between the spin label and the amide protons. Intramolecular PRE distances were derived from PRE broadening effects observed in the 15N-mut(*)/15N-mut(*) homodimer sample. To avoid inclusion of intermolecular effects, we analyzed the signals of only those residues that did not show broadening in the 15N-wt/mut(*) sample, resulting in 24 and 31 PRE restraints for CylR2N40C and CylR2T55C, respectively. PRE restraints were enforced as upper limit restraints that were obtained by addition of 5 Å to the distances calculated according to Eq. 2. For peaks broadened beyond detection, the upper distance limit was set to 12 Å in CYANA calculations.

Homodimer structures were calculated using a protocol for rigid-body docking as implemented in Xplor-NIH (Clore 2000). Four different sets of restraints were used: (a) PRE, (b) PRE and RDCs, (c) PRE, NOE and RDCs, and (d) NOE and RDCs. Additional calculations were performed enforcing twofold symmetry using distance difference restraints (Nilges 1993). Thirteen peaks of the 13C-NOESY-HSQC were manually assigned as intermonomer NOEs. For NOE data, upper and lower distances were set to +2.5 and −2 Å of the calculated distances, respectively. With the exception of (d) all calculations were performed using two monomers that had MTSL attached at N40C and T55C (see above). Intermonomer PRE distances were restrained from the nitrogen of the MTSL ring in one monomer to the amide protons of the other monomer. Upper and lower distance bounds were set to ±5 Å of the distances calculated according to Eq. 2. Decreasing the error bounds to ±4 Å resulted in an increased rmsd and in a larger number of violated intermolecular restraints. For peaks broadened beyond detection, distances were set to 7 ± 5 Å. For residues with broadened signals that are in the primary sequence next to a residue, which was not affected by PRE, only a lower distance bound was enforced. For residues that were not broadened in the paramagnetic state, a lower distance bound of 25 Å was used.
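The bounds applied to intermonomer PRE restraints can be summarized in a small sketch. The status labels and the `Bounds` type are hypothetical names introduced here, and the code does not write Xplor-NIH restraint files. The special case of broadened residues adjacent in sequence to an unaffected residue (lower bound only) is deliberately left out.

```python
from typing import NamedTuple, Optional


class Bounds(NamedTuple):
    lower: Optional[float]  # Angstrom; None means no lower bound
    upper: Optional[float]  # Angstrom; None means no upper bound


def pre_bounds(status, distance=None, tolerance=5.0):
    """Map a PRE observation to distance bounds as described in the text.

    status: "measured"   -> distance from Eq. 2, +/- tolerance
            "vanished"   -> peak broadened beyond detection, 7 +/- tolerance
            "unaffected" -> no broadening, lower bound of 25 Angstrom only
    """
    if status == "measured":
        if distance is None:
            raise ValueError("a measured restraint needs a distance")
        return Bounds(distance - tolerance, distance + tolerance)
    if status == "vanished":
        return Bounds(7.0 - tolerance, 7.0 + tolerance)
    if status == "unaffected":
        return Bounds(25.0, None)
    raise ValueError(f"unknown status: {status!r}")


if __name__ == "__main__":
    for status, dist in [("measured", 18.0), ("vanished", None), ("unaffected", None)]:
        print(status, pre_bounds(status, dist))
```

Keeping the tolerance as a parameter makes it easy to repeat the comparison between ±5 Å and ±4 Å bounds mentioned above.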

The structures obtained from rigid-body docking were further refined in explicit water using Xplor-NIH (Schwieters et al. 2003). For this aim the MTSL-containing monomers in the homodimer structure were replaced by the atomic coordinates of the wt protein. To restrain the monomer-to-monomer orientation, all intermolecular HN-HN distances from N40 and T55 to any amide proton of the other subunit were extracted and restrained during refinement (error bounds of ±2 Å). In addition, the intramonomer distance restraints, which had been used for calculation of the monomer structure of the wt protein, were included into the refinement. Coordinates of backbone atoms and atoms of side chains not contributing to the dimer interface (as determined on the basis of the homodimeric structure prior to refinement) were fixed during refinement. The ensemble of 15 lowest energy structures, which was calculated on the basis of intermolecular PRE distances and RDCs, was deposited in the ProteinDataBank database (PDB accession code: 2GZU).

Ab initio docking in case of insufficient experimental restraints

Ab initio docking (i.e. without experimental restraints) of two monomeric CylR2 molecules was performed for both the monomeric mean structure of the NMR ensemble and a monomer of the X-ray structure using the DOT algorithm (Mandell et al. 2001) available on the ClusPro Web server (http://www.nrc.bu.edu/cluster). The symmetry was restricted to C2 (Comeau and Camacho 2005). The docking solutions produced by DOT were ranked using ClusPro, which uses electrostatic and desolvation energies (Comeau et al. 2004).

For each docking model, the distances (d dock) between the Cβ atom of the cysteine, to which the MTSL was attached, and the backbone amide protons of the other monomer were calculated using MOLMOL (Koradi et al. 1996). d dock distances were compared to experimental distances (d exp) obtained from PRE broadening in MTSL-tagged, paramagnetic CylR2N40C and CylR2T55C according to

$$ \sigma = \frac{1} {N}\sum\limits_{i = 1}^N {\sqrt {{\text{(}}d_i^{{\text{dock}}} - d_i^{{\text{exp}}} {\text{)}}^2 } } {\text{,}} $$
(4)

in which N is the number of residues.
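Eq. 4 is the mean absolute deviation between docked and experimental distances, so ranking docking models by σ can be sketched in a few lines. The dictionary layout and function names below are assumptions for illustration. Extracting d dock from coordinates (done with MOLMOL in this work) is not shown.

```python
def pre_deviation(d_dock, d_exp):
    """Eq. 4: mean of |d_dock - d_exp| over all residues with a PRE distance."""
    if len(d_dock) != len(d_exp) or not d_dock:
        raise ValueError("need two non-empty distance lists of equal length")
    return sum(abs(a - b) for a, b in zip(d_dock, d_exp)) / len(d_dock)


def rank_models(models, d_exp):
    """Sort docking models (name -> distance list) by increasing sigma."""
    scored = [(name, pre_deviation(d, d_exp)) for name, d in models.items()]
    return sorted(scored, key=lambda item: item[1])


if __name__ == "__main__":
    # Invented distances in Angstrom, for illustration only.
    d_exp = [12.0, 17.0, 21.0]
    models = {
        "model_1": [20.0, 25.0, 14.0],
        "model_2": [13.0, 16.0, 22.5],
    }
    for name, sigma in rank_models(models, d_exp):
        print(name, round(sigma, 2))
```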

RDCs were predicted using the electrostatic alignment method as implemented in the software PALES (Zweckstetter et al. 2004). The default charge was attached to all ionizable residues, the Pf1 concentration was set to 12 mg ml−1 and the ionic strength was adjusted to 0.5 M NaCl. The agreement of 35 experimental RDCs (located in secondary structure elements) with ab initio docking models was evaluated using Pearson’s linear correlation coefficient.
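The final comparison step, Pearson's linear correlation between experimental RDCs and the values predicted for each docking model, can be sketched as below. The prediction itself is done by PALES and is not reproduced here. The function names and input layout are assumptions for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson's linear correlation coefficient between two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def rank_by_rdc(predicted, experimental):
    """Sort models (name -> predicted RDC list) by decreasing correlation."""
    scored = [(name, pearson_r(rdcs, experimental)) for name, rdcs in predicted.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Invented RDC values in Hz, for illustration only.
    experimental = [4.0, -2.0, 7.5, 1.0]
    predicted = {
        "model_1": [3.5, -1.0, 6.0, 0.5],
        "model_2": [-2.0, 5.0, 0.0, 3.0],
    }
    for name, r in rank_by_rdc(predicted, experimental):
        print(name, round(r, 3))
```

Because the correlation coefficient is insensitive to an overall scaling of the predicted couplings, it compares the pattern of RDCs rather than their absolute magnitude.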

Results and discussion

Structure of the monomeric subunit of CylR2 in solution

The structure of the monomeric subunit of the 66-residue protein CylR2 was solved based on a 98.8% complete chemical shift assignment, 987 interproton distances, 86 dihedral angle restraints and 57 HN-RDCs (Table 1). There were no major differences between the X-ray and the NMR structure (Supplementary text and Fig. S1).

Table 1 Structural statistical data for the monomeric subunit of CylR2

Mutagenesis and spin-labeling of CylR2

To enable measurement of long-range distances in CylR2, single cysteine residues were introduced into wt CylR2. Conservative sites of mutation were chosen at position N40 and T55. N40 and T55 were located in loop regions and on opposite sides of the structure of the monomer (as was known from the NMR structure of the monomeric subunit of CylR2). As the dimer interface is not known initially, mutations might be located in the dimer interface and destabilize the oligomeric structure. Therefore, we compared chemical shifts and 15N transverse relaxation times between wt and mutant CylR2. With the exception of L57 in case of CylR2N40C and residues around the mutation site, averaged amide proton and amide nitrogen chemical shift differences between wt and mutant CylR2 were smaller than 0.16 ppm. T 1ρ values indicated an unchanged state of oligomerization (data not shown). Thus, introduction of a cysteine at N40 and T55 did not strongly perturb the structure of CylR2.

Overnight incubation of CylR2N40C and CylR2T55C with MTSL resulted in efficient attachment of the spin label to the protein. In both the 15N-mut(*)/15N-mut(*) homodimer and 15N-wt/mut(*) sample signal intensities of several residues were attenuated in the paramagnetic state. Overall, intermonomer PRE-broadening as measured in the 1:1-mixed 15N-wt/mut(*) sample was less pronounced. Only the signal of E66 was no longer observed, while one third of all backbone amide signals disappeared in case of 15N-mut(*)/15N-mut(*) due to strong intramolecular PRE (Fig. 1b). In the 1:1-mixed 15N-wt/mut(*) heterodimer, peak-doubling was observed for 10 of the 62 backbone amide signals, indicative of differences in the chemical environment close to the mutation site (Fig. 1c).

Long-range distances from PRE

Distance information derived from PREs has three advantages over NOEs: (i) It is long-range and not limited to the dimer interface, (ii) it can be used in the case of fully deuterated proteins or for proteins for which no side chain assignment can be obtained and, (iii) the number of accessible distances might be increased by attaching spin labels to different sites in the protein (at the expense of an increased amount of biochemical work). Intermonomer distances in CylR2 were derived from peak intensities of HSQC spectra recorded for the paramagnetic and diamagnetic 1:1-mixed 15N-wt/mut(*) sample. Two cases had to be distinguished. For residues affected by doubling of peaks, the peak corresponding to the heterodimeric 15N-wt/mut(*) mutant was identified as the signal that was shifted compared to the 15N-HSQC of wt CylR2 and this peak was used for calculation of the intermonomer distance according to Eq. 2. For residues without doubling of the peak, 50% of the intensity of the peak in the diamagnetic 15N-HSQC, corresponding to the contribution of the 15N-wt/15N-wt homodimer, was subtracted from the intensity of the same peak observed in the 15N-HSQC of the paramagnetic sample (Eq. 3). This approach is valid under the assumption that the sample contains 50% 15N-wt/mut(*) and 25% 15N-wt/15N-wt, i.e. both contribute 50% of the signal intensity (Fig. 1a).

To assess the accuracy of experimentally determined intermonomer PRE distances, we initially compared them to distances present in the X-ray structure of CylR2 (Fig. 2a). Overall there is good agreement and most experimental PRE distances deviate by less than 5 Å from the values observed in the X-ray structure. The remaining deviations can have a variety of sources. (i) The structure of CylR2 in solution deviates slightly from the structure in the crystalline state. (ii) Amide proton T 2 relaxation times were approximated by experimental amide nitrogen T 2 relaxation times (Ishima and Torchia 2005). (iii) The correlation time for the electron–nuclear interaction τ c was assumed to be equal to the global correlation time of CylR2. (iv) Positional averaging of the flexible nitroxide side chain of MTSL (please see “Discussion” below). (v) Errors in the determination of protein concentration and interference of MTSL with dimer formation during refolding from 8 M urea may result in a deviation from the 50% contribution of 15N-wt/mut(*) and 15N-wt/15N-wt to the 15N-HSQC signal, which was assumed in the determination of intermonomer distances for residues that showed a single peak in the 15N-wt/mut(*) sample. Note that errors in peak intensities have a more pronounced influence on calculated distances when the intensity reduction due to the paramagnetic center is small (Fig. 2b).

Fig. 2

(a) Theoretically expected distances from the X-ray structure versus distances calculated from PRE data. The solid line indicates optimal correlation between experimental and expected distances and the dashed line marks the ±5 Å error bounds. Distances calculated with the spin-label at position N40C and T55C are shown as black circles and red triangles, respectively. Distances calculated with a τ c of 6 and 4 ns are indicated as filled and empty symbols, respectively. (b) Measured intensity ratio plotted as a function of the calculated distance. The dashed lines show that for an intensity ratio of 0.85 ± 0.05 the uncertainty of the distance is approximately four times larger than for an intensity ratio of 0.35 ± 0.05

High-resolution structure of the CylR2 homodimer in solution

The structure of the CylR2 homodimer in solution was determined by rigid-body docking of two copies of the high-resolution NMR structure of the monomeric subunit of CylR2. Rigid-body docking of heterodimeric protein–protein (Gray 2006) and protein–DNA complexes (van Dijk et al. 2006) is well established. In particular, the HADDOCK protocol is highly popular and was employed for several applications, in which various types of intermonomer restraints were used (see for example Dominguez et al. 2004; Volkov et al. 2005). We followed a protocol similar to HADDOCK implemented in Xplor-NIH (Clore 2000) to obtain answers to three questions: (i) How do different types of intermonomer restraints influence the accuracy of the structure of CylR2 obtained by rigid-body docking? (ii) What is the high-resolution structure of CylR2 in solution and does it differ from the previously determined X-ray structure? (iii) Is it possible to use PRE distances obtained from only one cysteine mutant or can intermolecular distance information be removed completely and near-native solutions identified using molecular alignment prediction?

Rigid-body docking of CylR2 monomers was performed using (a) PREs, (b) PREs and RDCs, (c) PREs, NOEs and RDCs, and (d) NOEs and RDCs. The backbone of the structure that was calculated using only PREs deviated by 3.0 Å from the X-ray structure (Table 2). Enforcing twofold symmetry using distance difference restraints (Nilges 1993) did not change the accuracy of the structure, as PREs were already defined as symmetric restraints between the two subunits of CylR2 during rigid-body docking (data not shown). Inclusion of HN-RDCs reduced the deviation from the crystal structure to 1.5 Å. This is in agreement with the fact that in the presence of RDCs, one of the principal axes of the alignment tensor must be parallel and the other two orthogonal to the twofold symmetry axis (Bewley and Clore 2000). Combination of PREs and HN-RDCs with 13 intermolecular NOEs slightly further reduced the deviation from the X-ray structure. On the other hand, when only 13 intermolecular NOEs and 57 backbone HN-RDCs were used, the rigid-body docking solutions deviated by about 2 Å from the X-ray structure (Table 2). The results demonstrate the power of combining long-range distance information with RDC-derived orientational information for structure determination of homooligomeric proteins.

Table 2 Influence of different types of intermonomer restraints on the accuracy of the homodimeric structure of CylR2

Structures obtained from rigid-body docking were further refined in explicit water (see “Materials and methods” for details). This resulted in ensembles of 20 lowest energy structures with coordinate precision in the range from 0.54 to 0.59 Å (Fig. 3a). The coordinates of the backbone and side chain atoms of the mean structure deviated by 1.15 and 2.08 Å from the values in the X-ray structure. The rmsd values between the NMR and the X-ray structure were slightly higher for the dimer than for the monomer, indicative of small differences in the orientations of the two monomers within the two structures (Tables 1, 2). Most notable are the differences for the longest helix α4 (residues 43–52) that contributes strongly to the dimer interface and the loop connecting helix α3 and α4 involved in DNA binding (Fig. 3). Within this loop the flexible residue S42 is found. Conformational flexibility in this region is likely to be important for DNA binding. In addition, crystal packing might have influenced the X-ray structure of CylR2.

Fig. 3

High-resolution structure of the CylR2 homodimer in solution. (a) Superposition of the 10 NMR structures with lowest energy. Helices are shown in magenta and β-strands in violet. The calculated average position of MTSL attached to position N40C (green) or position T55C (orange) is indicated for the left subunit. (b) Mean structure of the NMR ensemble (blue) superimposed on the X-ray structure (red). (c) Average backbone rmsd per residue for the 15 NMR structures (solid line) and backbone rmsd per residue between the mean NMR structure and the X-ray structure (dashed line) (Rumpel et al. 2004). The rmsd values between the NMR and the X-ray structure were calculated from the structural fit shown in (b) and are shown for both subunits of the CylR2 homodimer. Secondary structure elements are indicated

Although broadening of signals due to a covalently attached spin label might be measured to high accuracy, the encoded distance information is less precise mainly due to the flexibility of the paramagnetic side chain. Efforts are being made to rigidify the spin label (Leonov et al. 2005) (or lanthanide binding tags attached to any of the termini of the protein (Wohnert et al. 2003)), but averaging of distance information remains a problem. To take into account the mobility of the tag, Clore and coworkers used a multiple-structure representation of the paramagnetic group in simulated annealing calculations (Iwahara et al. 2004). Here we chose a different strategy as the structure of the monomeric subunit of CylR2 could be determined using NOEs, RDCs and torsion angles. We measured PRE broadening in the 15N-mut(*)/15N-mut(*) homodimer sample for residues that did not show any intermolecular PRE effects in the 1:1-mixed 15N-wt/mut(*) sample. The intramolecular PRE broadening observed for these residues was used to determine the position of MTSL within the monomeric subunit of CylR2. Note that this is an average position of MTSL, which is in agreement with the observed intramolecular PRE broadening. For high-affinity complexes averaging of intra and intermolecular PRE broadening is very similar, and the average position of MTSL was kept fixed during rigid-body docking. In addition, unspecific binding of MTSL to the protein can be probed when experimental intramolecular PRE distances are compared with values calculated from the NOE-based structure of the monomeric subunit.

Cysteine mutations were introduced into loop regions on the basis of the 3D structure of the monomeric subunit of CylR2. Accordingly, 15N-HSQC spectra of the 1:1-mixed 15N-wt/mut(*) sample showed two peaks for residues primarily close to the site of mutation and new assignment using triple-resonance spectra was not required (Fig. 1c). On the other hand, when Co2+ was introduced as a paramagnetic probe sub-stoichiometrically into a homodimer, the symmetry was broken, signals from three species (the Co2+-free, the diamagnetic and two non-equivalent monomeric species) were present and resonances in the paramagnetic molecules were shifted due to pseudo contact shifts. Thus, signal overlap was strongly increased even at 900 MHz and a 3D HNCO was required to assign the paramagnetically shifted resonances (Gaponenko et al. 2002).

NMR-based ranking of homodimer structures obtained from ab initio docking

Preparation of single-cysteine mutants is time consuming, mutations can alter the protein structure and they may not be possible due to the presence of essential cysteine residues in the wt protein. Thus, it is desirable to prepare only one single-cysteine mutant of the protein of interest or completely avoid the need for intermolecular distance information. Due to the reduced amount of experimental information, however, convergence to a near native structure using conventional structure calculation protocols such as Xplor-NIH is difficult. In case of CylR2, the Xplor-NIH docking did not converge to the correct solution when only intermolecular distances for one spin-label position together with HN-RDCs and symmetry restraints were used (data not shown). Combination of a small number of intermolecular NOEs with chemical shift perturbation data (Tang and Clore 2006) or combination of intermolecular NOEs with HN-RDCs in case of CylR2 (Table 2) did, however, result in a near native structure. This suggests that the unsuccessful Xplor-NIH docking is due to the fact that all PRE distance restraints in case of a single spin label involve the same atom. In addition, intermolecular NOEs define more precisely the dimer interface due to their short-range information content.

Good progress has been made in ab initio docking of protein complexes including homooligomeric proteins (Gray 2006). Ab initio docking programs like DOT have an optimized energy function that includes electrostatic and non-bonded interactions as well as shape complementarity. For many systems, the algorithms produce ensembles of low-energy docking solutions that contain a structural model deviating by 2–5 Å from the real structure. To improve ranking of ab initio docking models, chemical shift perturbations and RDCs were used (Dobrodumov and Gronenborn 2003; Morelli et al. 2001). For homodimeric complexes, however, the protein cannot generally be obtained in monomeric form and chemical shift perturbations at the dimer interface are not available. Here we compare intermolecular distances obtained from a 1:1-mixed 15N-wt/mut(*) sample of a single spin-labeled CylR2 mutant with distances observed in different docking solutions produced by ab initio docking. In addition, we predict RDCs from the three-dimensional shape and charge distribution of docking solutions using PALES. Note that RDC prediction using PALES simulates how a protein aligns in a charged alignment medium. This is very different from the best-fit of RDCs to the structure of docking solutions that was used for ranking heterooligomeric complexes.

Ab initio rigid-body docking of two monomeric CylR2 molecules was performed for both the mean structure of the NMR ensemble and a monomer taken from the dimeric X-ray structure. 25 docking solutions (dockNMR and dockX-ray), as calculated by the DOT algorithm and ranked by ClusPro (Comeau and Camacho 2005; Comeau et al. 2004), were obtained in each case. The rmsd between the docking solutions and the X-ray structure of the CylR2 homodimer varied between 1.3 and 17.5 Å (Fig. 4a, e). When two copies of the monomer that was extracted from the X-ray structure were docked, the docking solution that was ranked highest (rank 1) had the smallest deviation from the high-resolution structure. In addition, docking solutions with rank 2 and 3 were also very close to the high-resolution crystal structure of CylR2 (Fig. 4e). This is in agreement with previous findings that many ab initio docking algorithms are able to reassemble protein–protein complexes when the structures of the proteins as observed in the complex are used for docking (Gray 2006). On the other hand, when the mean structure of the NMR ensemble was used, the best-ranked homodimeric docking solution deviated by about 14 Å from the high-resolution structure of CylR2 (Fig. 4a). The docking solution with the smallest deviation from the crystal structure (deviation ∼2 Å) was ranked only fifth. These results show that DOT/ClusPro docking is able to produce near-native solutions and that ranking is more reliable when the monomer is taken from the crystal structure of CylR2. At the same time, however, there is no guarantee that the solution that was ranked highest is at all similar to the real structure.

Fig. 4

NMR-based ranking of structural models obtained from ab initio docking for the mean monomer structure of the NMR ensemble (a–d) and a monomer of the X-ray structure of CylR2 (e–h). (a, e) Comparison of the backbone rmsd (residues 3–63) between the X-ray structure and the ab initio model with the rank assigned by ClusPro; (b, f) comparison of σ, which measures the deviation between intermolecular distances derived from PREs for the spin label at position T55C and distances calculated for the docking solutions, with the backbone rmsd to the X-ray structure; (c, g) PRE-based rank of docking solutions versus the rank assigned by ClusPro; (d, h) comparison of the rank derived by prediction of molecular alignment as implemented in PALES with the ClusPro rank of docking solutions. In (c, d, g, h) symbols are colored according to the deviation of the ab initio docking model from the X-ray structure of CylR2

To improve ranking of homodimeric arrangements obtained from ab initio docking, we compare PRE-derived intermolecular distances with values calculated from the DOT/ClusPro solutions. For both spin label positions, the average deviation from the experimental PRE distances increases with increasing deviation of the docking model dockNMR from the X-ray structure (Fig. 4b, f). For CylR2T55C, the model dockNMR with the smallest deviation between experimental and calculated intermolecular distance restraints is the one closest to the crystal structure (deviation ∼2 Å) (Fig. 4b). At the same time, however, the docking model that deviates by 7.2 Å from the X-ray structure fits only slightly worse to the experimental PRE distances. This is due to the estimated error of ±5 Å associated with the experimental PRE distances (see also Fig. 2). In addition, intermolecular distances in the docking solutions were calculated from the Cβ atom of the cysteine residue to which the spin label was attached. Calculation of a more accurate intermolecular distance would require positioning of the spin label using intramolecular PREs (as was done in the Xplor-NIH docking) or averaging over different side chain conformations. For CylR2N40C, the two dockNMR models that fit best to experimental PRE values deviate by about 2 and 6 Å from the crystal structure of CylR2 (Supplementary Fig. S2). On the other hand, the docking model that was ranked highest by ClusPro clearly does not fit the experimental PREs observed for either CylR2N40C or CylR2T55C (average PRE deviations σ of more than 6 Å). To identify the docking solution that is closest to the real structure, we propose to compare the rank assigned by the docking program (“docking rank”) with the rank obtained from the comparison with experimental PREs (“PRE rank”) (Fig. 4c, g). The PRE rank was determined by sorting the docking solutions according to their average PRE deviations σ and assigning the lowest rank to the solution that fits best to the experimental PREs. For both CylR2N40C and CylR2T55C, only two docking models remained for which both the PRE rank and the docking rank were less than seven (Fig. 4c and Supplementary Fig. S2). Both docking models deviated by less than 4 Å from the X-ray structure and the one with the smaller PRE rank was closest to the X-ray structure.
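The PRE-based ranking described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The exact form of σ is not given in this section, so a root-mean-square deviation is assumed here, and the function names (`pre_sigma`, `rank_by_score`, `consensus_models`) as well as any numbers passed to them are hypothetical.

```python
from math import sqrt


def pre_sigma(experimental, calculated):
    """Deviation (in Å) between PRE-derived and model distances.

    Assumption: sigma is taken as a root-mean-square deviation; the
    definition used in the paper may differ.
    """
    if len(experimental) != len(calculated) or not experimental:
        raise ValueError("distance lists must be non-empty and of equal length")
    squared = sum((e - c) ** 2 for e, c in zip(experimental, calculated))
    return sqrt(squared / len(experimental))


def rank_by_score(scores):
    """Map model name -> rank, where rank 1 is the lowest (best) score."""
    ordered = sorted(scores, key=scores.get)
    return {name: position + 1 for position, name in enumerate(ordered)}


def consensus_models(docking_rank, pre_rank, cutoff=7):
    """Models whose docking rank and PRE rank are both below the cutoff."""
    return sorted(
        name
        for name in docking_rank
        if name in pre_rank
        and docking_rank[name] < cutoff
        and pre_rank[name] < cutoff
    )
```

In use, `pre_sigma` would be evaluated once per docking solution, the resulting values passed to `rank_by_score`, and the PRE ranks combined with the ClusPro ranks in `consensus_models`. The default cutoff of seven mirrors the value quoted in the text.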

To improve ranking of docking models in the absence of a paramagnetic center, we took advantage of the possibility to predict molecular alignment tensors from the charge distribution and shape of a protein using a method implemented in the software PALES (Zweckstetter et al. 2004). Pf1 bacteriophage is strongly negatively charged and CylR2, being a DNA-binding protein, contains a patch of positive charge. Thus, the alignment orientation that was predicted by PALES for different ab initio docking models of CylR2 varied strongly. Based on the correlation between experimental RDCs and values predicted by PALES we rank the docking models and compare this PALES-based rank with the rank assigned by the ab initio docking program (Fig. 4d, h). When using the NMR monomer, only a single structure belonged to the best seven structures according to both PALES-based and ab initio ranking (Fig. 4d). This docking model is closest to the X-ray structure with a deviation of about 2 Å. The four docking models that were assigned a better rank according to ClusPro are not in agreement with RDCs predicted by PALES (correlation coefficients below 0.7). There is also one docking model that was assigned a docking rank of seven and a PRE rank of four, but which differs by 8.4 Å from the X-ray structure of CylR2. However, when the linear average of the PRE and docking rank is calculated this docking model would obtain an average rank 2, whereas the best docking model is ranked 1. Ranking of models that were obtained by docking a monomer of the X-ray structure was more reliable using either ClusPro or PALES, resulting in a very reliable identification of three near-native structures (Fig. 4h). The correlation coefficients between experimental RDCs and values predicted from the high-resolution NMR and X-ray structures were 0.84 and 0.80, respectively.
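The RDC-based ranking can be sketched in the same spirit. The following Python example is illustrative only: it assumes that predicted RDCs for each docking model have already been obtained (for instance from PALES) and are supplied as plain lists, it uses the Pearson correlation coefficient as the measure of agreement, and the function names (`pearson`, `rdc_rank`, `average_rank`) are hypothetical. No PALES interface is reproduced here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / sqrt(sxx * syy)


def rdc_rank(experimental, predicted_by_model, min_correlation=0.7):
    """Rank models by correlation with experimental RDCs (rank 1 = highest).

    Models at or below min_correlation are dropped, following the 0.7
    threshold mentioned in the text.
    """
    r = {m: pearson(experimental, p) for m, p in predicted_by_model.items()}
    kept = sorted((m for m in r if r[m] > min_correlation),
                  key=lambda m: r[m], reverse=True)
    return {m: position + 1 for position, m in enumerate(kept)}


def average_rank(rank_a, rank_b):
    """Linear average of two ranks for models present in both rankings."""
    return {m: (rank_a[m] + rank_b[m]) / 2 for m in rank_a if m in rank_b}
```

The ranks returned by `rdc_rank` can be compared with the docking ranks directly or combined with them through `average_rank`, which corresponds to the linear averaging of ranks mentioned in the text.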

In the case of homodimeric coiled-coil proteins, PALES had to distinguish only between the parallel and the antiparallel arrangement (Zweckstetter et al. 2005). Moreover, due to the asymmetric distribution of charges along the chain of coiled-coil proteins, the two arrangements are characterized by very different distributions of the surface charges, enabling a clear distinction by PALES. In the more general case of homodimers composed of monomers with a globular structure, many different arrangements are possible that potentially do not differ strongly in the distribution of surface charges. In addition, PALES is based on a strongly simplified electrostatic model, which might further affect the accuracy of the prediction of molecular alignment. Nevertheless, the combination of the rank assigned by PALES based on prediction of molecular alignment and the rank assigned by ClusPro based on electrostatic and desolvation energies provides a reliable approach for identification of near-native docking models. The reliability of PALES ranking is further improved if only docking solutions are taken into account for which the correlation coefficient between experimental and predicted RDCs is above 0.7. Comparison of Fig. 4c with 4d and of 4g with 4h indicates that PRE-based ranking is not significantly better than ranking by PALES. Thus, it is possible to identify a near-native conformation without experimental information about the dimer interface using a small number of easily accessible HN-RDCs.

Concluding remarks

Our study shows that truly high-resolution structures of homodimeric proteins can be obtained by the combined use of intermolecular long-range distances obtained from paramagnetic relaxation enhancement and orientational information encoded by RDCs. Usage of PRE broadening avoids the need for assignment of side chain resonances and overcomes difficulties of distinguishing inter- and intramonomer contacts in homooligomeric proteins. This is particularly important for trimeric and higher homooligomeric systems and high molecular weight complexes in general, in which side chain resonance assignment becomes increasingly difficult and essential deuteration limits the availability of NOE data. For high molecular weight homodimers, the structure determination of the monomeric unit by conventional NOE-based methods will also be more difficult and intramolecular PREs obtained on the same single-cysteine mutant proteins will be useful. Larger proteins have broad lines already in the diamagnetic state and an increase in line width due to a paramagnetic center may be too small to be measured accurately, especially for longer distances. In this case, longitudinal amide proton relaxation enhancements R1 might be more practical.

It appears that for homooligomeric systems, in which the symmetry can be restrained, the quality of structures obtained by ab initio docking is at least comparable to that obtained from two sets of PRE-based intermolecular distances. Only when RDCs are also included can high-resolution structures be obtained from the experimental restraints. On the other hand, attaching the spin label to only one site in the protein is generally not sufficient to obtain a correct homooligomeric structure in conventional restrained molecular dynamics simulations, even when RDCs are measured. Additional experimental restraints such as intermolecular NOEs or pseudo contact shifts are then required.

Structural models obtained from ab initio rigid-body docking can be reliably ranked using intermolecular distances derived from a single spin-labeled position. Importantly, however, near-native structures can also be identified from a small set of backbone RDCs alone, without chemical shift perturbation data and without intermolecular distances.