Estimating the mixture coefficients of protein conformations in solution has applications in understanding protein behavior. We describe a method for maximum a posteriori (MAP) estimation of the mixture coefficients of an ensemble of conformations in a protein mixture solution using measured small angle X-ray scattering (SAXS) intensities. The proposed method builds upon a model for the measurements of crystallographically determined conformations. Assuming that a priori information on the protein mixture is available, and that this prior information follows a Dirichlet distribution, we develop a method to estimate the relative abundances with a MAP estimator. The Dirichlet distribution depends on concentration parameters, which may not be known in practice and thus need to be estimated. To estimate these unknown concentration parameters, we develop an expectation-maximization (EM) method. Adenylate kinase (ADK) was selected as the test bed because its conformations are known (Beckstein et al., Journal of Molecular Biology, 394(1), 160). The known conformations are assumed to form a complete basis that spans the measurement space. In Monte Carlo simulations, we compare the mixture coefficient estimation performance of the MAP estimator with that of the maximum likelihood (ML) estimator, which assumes a uniform prior on the mixture coefficients. We also compare MAP estimators that use known and unknown concentration parameters. The results show that prior knowledge improves estimation accuracy, but performance is sensitive to perturbations in the concentration parameters of the Dirichlet distribution. Moreover, the EM-based estimation method yields results comparable to those obtained with approximately known prior parameters.
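To make the difference between the ML and MAP estimators concrete, the following is a minimal sketch and not the authors' implementation. It assumes a linear measurement model y = A w + noise with Gaussian noise, where the columns of the hypothetical matrix `A` hold the scattering profiles of the known conformations and `w` holds the mixture coefficients. The function name `estimate_weights`, the toy numbers, the softmax parametrization, and the plain gradient ascent solver are all illustrative assumptions. The sketch also assumes every concentration parameter is at least 1, so that the log-posterior is bounded on the simplex.

```python
import math

def softmax(z):
    """Map unconstrained values to positive weights that sum to one."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def estimate_weights(A, y, alpha, sigma=1.0, lr=0.1, n_iter=20000):
    """MAP estimate of mixture weights under assumed Gaussian noise.

    Maximizes  -||y - A w||^2 / (2 sigma^2) + sum_k (alpha_k - 1) log w_k
    over the simplex by gradient ascent on z, with w = softmax(z).
    alpha = [1, ..., 1] is a uniform prior, which gives the ML estimate.
    The fixed step size lr suits this toy problem only; a sharper prior
    or a smaller sigma would need a smaller step.
    """
    K = len(alpha)
    z = [0.0] * K                          # start from uniform weights
    pull = [a - 1.0 for a in alpha]        # Dirichlet log-prior exponents
    total_pull = sum(pull)
    for _ in range(n_iter):
        w = softmax(z)
        # Residual between measured and modeled intensities
        resid = [yi - sum(a * wk for a, wk in zip(row, w))
                 for row, yi in zip(A, y)]
        # Gradient of the log-likelihood with respect to w
        g = [sum(A[i][k] * resid[i] for i in range(len(y))) / sigma ** 2
             for k in range(K)]
        g_bar = sum(wk * gk for wk, gk in zip(w, g))
        # Chain rule through the softmax; the prior term stays bounded
        grad_z = [w[k] * (g[k] - g_bar) + pull[k] - w[k] * total_pull
                  for k in range(K)]
        z = [zk + lr * gk for zk, gk in zip(z, grad_z)]
    return softmax(z)

def max_err(w, w_ref):
    return max(abs(a - b) for a, b in zip(w, w_ref))

# Toy data (hypothetical): 4 measurement points, 3 conformations
A = [[1.0, 0.2, 0.1],
     [0.3, 1.0, 0.2],
     [0.1, 0.3, 1.0],
     [0.5, 0.5, 0.5]]
w_true = [0.6, 0.3, 0.1]
y_clean = [sum(a * w for a, w in zip(row, w_true)) for row in A]
# Fixed perturbation standing in for measurement noise
y_noisy = [v + e for v, e in zip(y_clean, [0.05, -0.05, 0.05, -0.05])]

w_ml = estimate_weights(A, y_noisy, [1.0, 1.0, 1.0])    # uniform prior
w_map = estimate_weights(A, y_noisy, [7.0, 4.0, 2.0])   # prior mode at w_true
err_ml, err_map = max_err(w_ml, w_true), max_err(w_map, w_true)
print("ML :", w_ml, "max error", err_ml)
print("MAP:", w_map, "max error", err_map)
```

In this toy setting the prior is centered on the true coefficients, so the MAP estimate stays closer to them than the ML estimate does. A prior whose concentration parameters are far from the truth would bias the estimate instead, which is the sensitivity the abstract reports.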
Beckstein, O., Denning, E.J., Perilla, J.R., & Woolf, T.B. (2009). Journal of Molecular Biology, 394 (1), 160.
Putnam, C. D., Hammel, M., Hura, G. L., & Tainer, J. A. (2007). Quarterly Reviews of Biophysics, 40(03), 191.
Konarev, P. V., Volkov, V. V., Sokolova, A. V., Koch, M. H., & Svergun, D. I. (2003). Journal of Applied Crystallography, 36(5), 1277.
Onuk, A. E., Akcakaya, M., Bardhan, J. P., Erdogmus, D., Brooks, D. H., & Makowski, L. (2015). IEEE Transactions on Signal Processing, 63(20), 5383.
Onuk, A.E., Akcakaya, M., Bardhan, J., Erdogmus, D., Brooks, D.H., & Makowski, L. (2015). In 2015 IEEE 25th International Workshop on Machine Learning for Signal Processing (MLSP) (pp. 1–5). IEEE.
Makowski, L., Rodi, D. J., Mandava, S., Minh, D. D., Gore, D. B., & Fischetti, R. F. (2008). Journal of Molecular Biology, 375(2), 529.
Duda, R. O., Hart, P. E., & Stork, D.G. (2012). Pattern classification: Wiley.
Stoica, P., & Selen, Y. (2004). IEEE Signal Processing Magazine, 21(4), 36.
Minka, T. (2000). Estimating a Dirichlet distribution.
Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Journal of the Royal Statistical Society. Series B (methodological), 1–38.
Moon, T. K. (1996). IEEE Signal Processing Magazine, 13(6), 47.
Bishop, C.M. (2006). Pattern recognition and machine learning: Springer.
Neumann, J., & Taub, A. (1961). Collected works Vol. V. New York: Pergamon.
Chen, Y. (2005). Statistics & Probability Letters, 72(4), 277.
Svergun, D., Barberato, C., & Koch, M. (1995). Journal of Applied Crystallography, 28(6), 768.
This work was supported by NSF (MSB-1158340) and NIH (R01-GM85648). A paper package containing code and data is available at https://repository.library.northeastern.edu/collections/neu:rx914949p
Cite this article
Onuk, A.E., Akcakaya, M., Bardhan, J. et al. Dirichlet Priors for MAP Inference of Protein Conformation Abundances from SAXS. J Sign Process Syst 90, 167–174 (2018). https://doi.org/10.1007/s11265-016-1141-6
- SAXS intensity
- Bayesian estimation
- Dirichlet prior
- ML estimation
- Adenylate kinase