Simultaneous single-structure and bundle representation of protein NMR structures in torsion angle space

  • Article
  • Journal of Biomolecular NMR

Abstract

A method is introduced to represent an ensemble of conformers of a protein by a single structure in torsion angle space that lies closest to the averaged Cartesian coordinates while maintaining perfect covalent geometry and, on average, the same steric quality and an equally good fit to the experimental (e.g. NMR) data as the individual conformers of the ensemble. The single representative ‘regmean structure’ is obtained by simulated annealing in torsion angle space with the program CYANA, using as input data the experimental restraints, restraints for the atom positions relative to the average Cartesian coordinates, and restraints for the torsion angles relative to the corresponding principal cluster average values of the ensemble. The method was applied to 11 proteins for which NMR structure ensembles are available, and compared to alternative, commonly used simple approaches for selecting a single representative structure, e.g. the structure from the ensemble that best fulfills the experimental and steric restraints, or the structure from the ensemble that has the lowest RMSD value to the average Cartesian coordinates. In all cases our method found a structure in torsion angle space that is significantly closer to the mean coordinates than the alternatives while maintaining the same quality as the individual conformers. The method is thus suitable for generating single-structure representations of protein structure ensembles in torsion angle space.
Since in the case of NMR structure calculations with CYANA the single structure is calculated in the same way as the individual conformers, except that weak positional and torsion angle restraints are added, we propose to represent new NMR structures by a ‘regmean bundle’ consisting of the single representative structure as the first conformer and all but one of the original individual conformers (the original conformer with the highest target function value is discarded in order to keep the number of conformers in the bundle constant). In this way, analyses that require a single structure can be carried out in the most meaningful way using the first model, while at the same time the additional information contained in the ensemble remains available.
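
The assembly of the regmean bundle described above can be illustrated with a minimal sketch. It only covers the bookkeeping step (putting the representative structure first and discarding the original conformer with the highest target function value); it does not perform any structure calculation. The function name `regmean_bundle` and its arguments are hypothetical and are not part of CYANA.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")  # any object standing in for one conformer


def regmean_bundle(
    regmean: T,
    conformers: Sequence[T],
    target_values: Sequence[float],
) -> List[T]:
    """Return a bundle with the regmean structure as the first conformer.

    Hypothetical helper: the original conformer with the highest target
    function value is dropped, so the bundle keeps its original size.
    """
    if not conformers or len(conformers) != len(target_values):
        raise ValueError("need one target function value per conformer")
    # Index of the conformer with the highest (worst) target function value.
    worst = max(range(len(conformers)), key=lambda k: target_values[k])
    kept = [c for k, c in enumerate(conformers) if k != worst]
    return [regmean] + kept


bundle = regmean_bundle("regmean", ["c1", "c2", "c3"], [1.0, 3.0, 2.0])
print(bundle)
```

In this toy call the conformer labelled `c2` has the highest target function value and is therefore the one replaced, so the bundle still contains three models.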

Acknowledgments

We gratefully acknowledge financial support by the Lichtenberg program of the Volkswagen Foundation and by a Grant-in-Aid for Scientific Research of the Japan Society for the Promotion of Science (JSPS).

Author information

Corresponding author

Correspondence to Peter Güntert.

Appendix

Bimodal averages and standard deviation of torsion angles

If the values \( S = \{ \phi_{1}, \ldots, \phi_{n} \} \) of a torsion angle \( \phi \) are clustered in two separate regions, it makes little sense to determine a single average value. Instead, it is meaningful to split the set \( S \) into two disjoint subsets \( S_{1} \) and \( S_{2} \) for the purpose of computing two bimodal average values \( \overline{\phi}_{1} = \arg \sum\nolimits_{k \in S_{1}} e^{i\phi_{k}} \) and \( \overline{\phi}_{2} = \arg \sum\nolimits_{k \in S_{2}} e^{i\phi_{k}} \) of the torsion angle values in \( S_{1} \) and \( S_{2} \), respectively. The choice of \( S_{1} \) and \( S_{2} \) is optimal if it minimizes the bimodal standard deviation

$$ \sigma_{\phi}^{(2)} = \sqrt{ \frac{1}{n} \sum\limits_{k=1}^{n} \min\left( \left| \phi_{k} - \overline{\phi}_{1} \right|,\; 2\pi - \left| \phi_{k} - \overline{\phi}_{1} \right|,\; \left| \phi_{k} - \overline{\phi}_{2} \right|,\; 2\pi - \left| \phi_{k} - \overline{\phi}_{2} \right| \right)^{2} } $$

that results from summing, for each torsion angle value \( \phi_{k} \), the squared deviation from the closer of the two bimodal average values \( \overline{\phi}_{1} \) and \( \overline{\phi}_{2} \), taking into account the periodicity.
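
As an illustration, the circular average and the bimodal standard deviation defined above can be evaluated with a few lines of Python. This is a minimal sketch, assuming angles in radians; the function names are hypothetical. As a small deviation from the formula, `angular_distance` first reduces the difference modulo 2π so that it also works for angles that are not confined to a single 2π interval.

```python
import cmath
import math
from typing import Iterable, Sequence

TWO_PI = 2.0 * math.pi


def circular_mean(angles: Iterable[float]) -> float:
    """Average angle: the argument of the sum of unit vectors exp(i*phi_k)."""
    return cmath.phase(sum((cmath.exp(1j * a) for a in angles), 0j))


def angular_distance(a: float, b: float) -> float:
    """Periodic deviation min(|a - b|, 2*pi - |a - b|) between two angles."""
    d = abs(a - b) % TWO_PI
    return min(d, TWO_PI - d)


def bimodal_std(angles: Sequence[float], mean1: float, mean2: float) -> float:
    """Root mean squared deviation from the closer of the two averages."""
    total = sum(
        min(angular_distance(a, mean1), angular_distance(a, mean2)) ** 2
        for a in angles
    )
    return math.sqrt(total / len(angles))


# Toy data: two well separated clusters, split by hand into S1 and S2.
s1 = [0.1, -0.1]
s2 = [3.0, 3.2]
sigma = bimodal_std(s1 + s2, circular_mean(s1), circular_mean(s2))
print(sigma)
```

For these toy values every angle deviates by 0.1 rad from the average of its own cluster, so the bimodal standard deviation is 0.1 rad up to rounding.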

It would be computationally inefficient to evaluate \( \sigma_{\phi}^{(2)} \) for each of the \( 2^{n} \) possible choices of the subsets \( S_{1} \) and \( S_{2} \). To determine a good approximation of the optimal bimodal average values in polynomial time, we first calculate the \( n \times n \) matrix of torsion angle differences \( \Delta\phi_{ij} = \min\left( \left| \phi_{i} - \phi_{j} \right|, 2\pi - \left| \phi_{i} - \phi_{j} \right| \right) \). For all pairs \( (i, j) \) with \( \Delta\phi_{ij} > \pi/4 \) (to avoid splitting into two hardly separated clusters), we compute \( \widetilde{\phi}_{1} = \arg \sum\nolimits_{k:\, \Delta\phi_{ki} \le \Delta\phi_{kj}} e^{i\phi_{k}} \) and \( \widetilde{\phi}_{2} = \arg \sum\nolimits_{k:\, \Delta\phi_{ki} > \Delta\phi_{kj}} e^{i\phi_{k}} \). (In the exponential functions \( i \) denotes the imaginary unit \( \sqrt{-1} \), otherwise the index \( i \).) The deviations of the individual torsion angle values \( \phi_{k} \) from \( \widetilde{\phi}_{1} \) and \( \widetilde{\phi}_{2} \) are given by \( \delta_{1k} = \min\left( \left| \phi_{k} - \widetilde{\phi}_{1} \right|, 2\pi - \left| \phi_{k} - \widetilde{\phi}_{1} \right| \right) \) and \( \delta_{2k} = \min\left( \left| \phi_{k} - \widetilde{\phi}_{2} \right|, 2\pi - \left| \phi_{k} - \widetilde{\phi}_{2} \right| \right) \) for \( k = 1, \ldots, n \). The corresponding subsets are \( S_{1} = \{ k \mid \delta_{1k} \le \delta_{2k} \} \) and \( S_{2} = \{ k \mid \delta_{1k} > \delta_{2k} \} \). We choose the optimal subsets \( S_{1} \) and \( S_{2} \) from the pair \( (i, j) \) that yields the largest value of \( \left| \sum\nolimits_{k \in S_{1}} e^{i\phi_{k}} \right| + \left| \sum\nolimits_{k \in S_{2}} e^{i\phi_{k}} \right| \) to obtain the bimodal average values \( \overline{\phi}_{1} \) and \( \overline{\phi}_{2} \).
If \( S_{2} \) contains more elements than \( S_{1} \), we exchange the values of \( \overline{\phi}_{1} \) and \( \overline{\phi}_{2} \) such that \( \overline{\phi}_{1} \) always corresponds to the cluster with the larger number of elements.
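
The pair-based search described above can be sketched in Python as follows. This is an illustrative reading of the procedure, not the CYANA implementation: the function names are hypothetical, angles are assumed to be in radians, and returning `None` when no pair of values is separated by more than π/4 is an assumption, since the text does not say how that case is handled.

```python
import cmath
import math
from itertools import permutations
from typing import List, Optional, Sequence, Tuple

TWO_PI = 2.0 * math.pi


def ang_dist(a: float, b: float) -> float:
    """Periodic angle difference min(|a - b|, 2*pi - |a - b|)."""
    d = abs(a - b) % TWO_PI
    return min(d, TWO_PI - d)


def resultant(angles: Sequence[float], members: Sequence[int]) -> complex:
    """Sum of the unit vectors exp(i*phi_k) over the given indices."""
    return sum((cmath.exp(1j * angles[k]) for k in members), 0j)


def bimodal_averages(
    angles: Sequence[float], min_sep: float = math.pi / 4
) -> Optional[Tuple[float, float]]:
    """Approximate bimodal averages; the larger cluster comes first."""
    idx = range(len(angles))
    best: Optional[Tuple[float, List[int], List[int]]] = None
    for i, j in permutations(idx, 2):
        if ang_dist(angles[i], angles[j]) <= min_sep:
            continue  # skip seeds that would give hardly separated clusters
        # Provisional split: each value goes to the nearer of the two seeds.
        near_i = [k for k in idx
                  if ang_dist(angles[k], angles[i]) <= ang_dist(angles[k], angles[j])]
        near_j = [k for k in idx
                  if ang_dist(angles[k], angles[i]) > ang_dist(angles[k], angles[j])]
        t1 = cmath.phase(resultant(angles, near_i))
        t2 = cmath.phase(resultant(angles, near_j))
        # Reassign each value to the nearer provisional average.
        s1 = [k for k in idx if ang_dist(angles[k], t1) <= ang_dist(angles[k], t2)]
        s2 = [k for k in idx if ang_dist(angles[k], t1) > ang_dist(angles[k], t2)]
        score = abs(resultant(angles, s1)) + abs(resultant(angles, s2))
        if best is None or score > best[0]:
            best = (score, s1, s2)
    if best is None:
        return None  # assumption: no sufficiently separated pair exists
    _, s1, s2 = best
    if len(s2) > len(s1):
        s1, s2 = s2, s1  # first average belongs to the more populated cluster
    return cmath.phase(resultant(angles, s1)), cmath.phase(resultant(angles, s2))


print(bimodal_averages([0.1, -0.1, 0.0, 3.0, 3.2]))
```

Each of the n(n − 1) ordered pairs requires O(n) work, so the search costs O(n³) operations in this form, compared with the exponential number of possible splits.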

About this article

Cite this article

Gottstein, D., Kirchner, D.K. & Güntert, P. Simultaneous single-structure and bundle representation of protein NMR structures in torsion angle space. J Biomol NMR 52, 351–364 (2012). https://doi.org/10.1007/s10858-012-9615-8

  • DOI: https://doi.org/10.1007/s10858-012-9615-8
