Journal of Biomolecular NMR, Volume 71, Issue 1, pp 11–18

Application of Dirichlet process mixture model to the identification of spin systems in protein NMR spectra

  • Piotr Klukowski
  • Michał Augoff
  • Maciej Zamorski
  • Adam Gonczarek
  • Michał J. Walczak


Analysis of the structure, function and interactions of proteins by NMR spectroscopy usually requires the assignment of resonances to the corresponding nuclei in the protein. This task, although automated by methods such as FLYA or PINE, is still frequently performed manually. To facilitate the manual sequence-specific chemical shift assignment of complex proteins, we propose a method based on a Dirichlet process mixture model (DPMM) that automatically matches groups of signals observed in NMR spectra to the corresponding nuclei in the protein sequence. The model has been extensively tested on 80 proteins retrieved from the BMRB database and has outperformed the reference method.
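The abstract does not describe the model's internals, so the following is only a generic illustration of the prior that any Dirichlet process mixture model builds on, not the authors' implementation. It sketches a truncated stick-breaking construction of mixture weights, which shows why a DPMM does not need the number of components (here, loosely, candidate spin systems) fixed in advance. The function name `stick_breaking_weights`, the concentration `alpha`, and the truncation level are assumptions chosen for illustration.

```python
import random


def stick_breaking_weights(alpha, truncation, rng):
    """Truncated stick-breaking construction of Dirichlet process weights.

    alpha      -- concentration parameter (hypothetical value, not from the paper)
    truncation -- maximum number of mixture components kept
    rng        -- random.Random instance, for reproducibility
    """
    weights = []
    remaining = 1.0  # length of the stick not yet assigned to a component
    for _ in range(truncation - 1):
        v = rng.betavariate(1.0, alpha)  # fraction broken off the remaining stick
        weights.append(remaining * v)
        remaining *= 1.0 - v
    weights.append(remaining)  # last component takes the rest, so weights sum to 1
    return weights


rng = random.Random(0)
weights = stick_breaking_weights(alpha=2.0, truncation=20, rng=rng)

# Count components that carry a non-negligible share of the probability mass.
effective = sum(1 for w in weights if w > 0.01)
print(f"components with weight > 1%: {effective} of {len(weights)}")
```

Smaller values of `alpha` concentrate the mass on fewer components, while larger values spread it over more; in a full mixture model these weights would be combined with per-component distributions over chemical shifts and inferred from the data.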


Chemical shift assignment; Mixture models; Spin system identification



The research has been co-financed by the Ministry of Science and Higher Education, Republic of Poland: Adam Gonczarek, Grant No. 0402/0082/17.

Author contributions

PK designed the model with the support of AG and MA; PK and MA designed the experiments and implemented the model and the experiments; PK, MA, AG and MJW discussed the results and wrote the manuscript; MZ prepared the Dumpling components to make the model publicly available.

Supplementary material

Supplementary material 1: 10858_2018_185_MOESM1_ESM.pdf (PDF, 5.51 MB)



Copyright information

© Springer Science+Business Media B.V., part of Springer Nature 2018

Authors and Affiliations

  1. Faculty of Computer Science and Management, Wrocław University of Science and Technology, Wrocław, Poland
  2. Captor Therapeutics Ltd., Wrocław, Poland
  3. Alphamoon Ltd., Wrocław, Poland
