Journal of Biomolecular NMR, Volume 53, Issue 4, pp 311–320

An automated system designed for large scale NMR data deposition and annotation: application to over 600 assigned chemical shift data entries to the BioMagResBank from the Riken Structural Genomics/Proteomics Initiative internal database

  • Naohiro Kobayashi
  • Yoko Harano
  • Naoya Tochio
  • Eiichi Nakatani
  • Takanori Kigawa
  • Shigeyuki Yokoyama
  • Steve Mading
  • Eldon L. Ulrich
  • John L. Markley
  • Hideo Akutsu
  • Toshimichi Fujiwara
Article

Abstract

Biomolecular NMR chemical shift data are key information for the functional analysis of biomolecules and for the development of new NMR techniques that use chemical shift statistics. Structural genomics projects are major contributors to the accumulation of protein chemical shift information. Managing the large quantities of NMR data generated by each project in a local database and transferring those data to the public databases remain formidable tasks because of the complicated nature of NMR data. Here we report an automated and efficient system developed for the deposition and annotation of a large number of data sets, including the 1H, 13C and 15N resonance assignments used for the structure determination of proteins. We have demonstrated the feasibility of our system by using it to deposit over 600 entries from the internal database generated by the RIKEN Structural Genomics/Proteomics Initiative (RSGI) into the public database, BioMagResBank (BMRB). We have assessed the quality of the deposited chemical shifts by comparing them with those predicted from the PDB coordinate entry for the corresponding protein. The same comparison has been carried out for other matched BMRB/PDB entries deposited from 2001–2011, and the results suggest that the RSGI entries greatly improved the quality of the BMRB database. Since the entries include chemical shifts acquired under strikingly similar experimental conditions, these NMR data can be expected to be a promising resource for improving current technologies as well as for developing new NMR methods for protein studies.

Keywords

NMR; Chemical shift; Proteomics; Database; BMRB

Notes

Acknowledgments

This work was partially supported by the National Bioscience Database Center (NBDC) of the Japan Science and Technology Agency (JST). We are grateful to Prof. Haruki Nakamura for his encouragement and for his contributions to discussions about this study. We thank Dr. J. Doreleijers for many valuable comments and suggestions and for proofreading the manuscript, and Dr. F. Delaglio for help in establishing the macro-file library for NMRPipe data processing. We also thank Mr. T. Iwata for preparing the web page for downloading the BMRB-related tools.

Supplementary material

10858_2012_9641_MOESM1_ESM.pdf (935 kb)
Supplementary material 1 (PDF 934 kb)

Copyright information

© Springer Science+Business Media B.V. 2012

Authors and Affiliations

  • Naohiro Kobayashi (1, 2)
  • Yoko Harano (1)
  • Naoya Tochio (2)
  • Eiichi Nakatani (1)
  • Takanori Kigawa (2, 3)
  • Shigeyuki Yokoyama (2)
  • Steve Mading (4)
  • Eldon L. Ulrich (4)
  • John L. Markley (4)
  • Hideo Akutsu (1)
  • Toshimichi Fujiwara (1)

  1. Institute for Protein Research, Osaka University, Suita, Japan
  2. RIKEN Systems and Structural Biology Center, Yokohama, Japan
  3. Department of Computational Intelligence and Systems Science, Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology, Yokohama, Japan
  4. Department of Biochemistry, University of Wisconsin-Madison, Madison, USA
