Accounting for spot matching uncertainty in the analysis of proteomics data from two-dimensional gel electrophoresis

Sankhya B

Abstract

Two-dimensional gel electrophoresis is a biochemical technique that combines isoelectric focusing and SDS-polyacrylamide gel technology to separate protein mixtures simultaneously on the basis of isoelectric point and molecular weight. Upon staining, each protein on a gel can be characterized by an intensity measurement that reflects its abundance in the mixture. These intensity measurements can then, in principle, be used to determine which proteins are differentially expressed under different experimental conditions. We propose an expectation-maximization (EM) approach to identify differentially expressed proteins using an inferential strategy that accounts for uncertainty in matching spots to proteins across gels. The underlying mixture model has trivariate Gaussian components. Applying EM is, however, not straightforward: the main difficulty lies in the E-step calculations, because proteins within each gel are dependent. The usual model-based clustering approach is therefore inapplicable, and an MCMC approach is employed instead. Through data-based simulation, we demonstrate that our proposed method effectively accounts for uncertainty in spot matching and distinguishes differentially from non-differentially expressed proteins more successfully than a naïve t-test that ignores this uncertainty.
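For contrast, here is a minimal sketch of the usual EM iteration for a mixture of trivariate Gaussians when the observations are independent, so that the E-step has a closed form. It is not the authors' algorithm: according to the abstract, dependence among proteins within a gel rules out this closed-form E-step, and an MCMC approach is used instead. The function names, the diagonal-covariance simplification, the starting values and the simulated data are all assumptions made purely for illustration.

```python
import math
import random


def log_gauss_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance (a simplifying assumption)."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def e_step(data, weights, means, variances):
    """Closed-form posterior membership probabilities; valid only for independent points."""
    resp = []
    for x in data:
        logs = [math.log(w) + log_gauss_diag(x, m, v)
                for w, m, v in zip(weights, means, variances)]
        top = max(logs)  # subtract the maximum before exponentiating, for stability
        probs = [math.exp(value - top) for value in logs]
        total = sum(probs)
        resp.append([p / total for p in probs])
    return resp


def m_step(data, resp, floor=1e-6):
    """Weighted updates of mixing proportions, means and per-coordinate variances."""
    n, k, d = len(data), len(resp[0]), len(data[0])
    weights, means, variances = [], [], []
    for j in range(k):
        nj = max(sum(r[j] for r in resp), 1e-12)  # guard against an empty component
        mean = [sum(r[j] * x[c] for r, x in zip(resp, data)) / nj for c in range(d)]
        var = [max(sum(r[j] * (x[c] - mean[c]) ** 2 for r, x in zip(resp, data)) / nj, floor)
               for c in range(d)]
        weights.append(nj / n)
        means.append(mean)
        variances.append(var)
    return weights, means, variances


def fit_em(data, means, n_iter=50):
    """Alternate E- and M-steps for a fixed number of iterations from given starting means."""
    k, d = len(means), len(data[0])
    weights = [1.0 / k] * k
    variances = [[1.0] * d for _ in range(k)]
    resp = []
    for _ in range(n_iter):
        resp = e_step(data, weights, means, variances)
        weights, means, variances = m_step(data, resp)
    return weights, means, variances, resp


# Simulated data (hypothetical): two well-separated groups of 100 trivariate points each.
random.seed(0)
centers = [(0.0, 0.0, 0.0), (5.0, 5.0, 5.0)]
data = [[random.gauss(c, 1.0) for c in center] for center in centers for _ in range(100)]
weights, means, variances, resp = fit_em(data, means=[[0.5, 0.5, 0.5], [4.0, 4.0, 4.0]])
```

The `e_step` function is the part that relies on independence: each point's membership probabilities are computed from that point alone. When memberships are dependent, as the abstract describes for proteins within a gel, those conditional expectations no longer factor point by point.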



Acknowledgements

The authors acknowledge partial support from the National Science Foundation through awards NSF CAREER DMS-0437555, NSF IOS-0236060 and NSF DMS-0502347.

Author information

Corresponding author

Correspondence to Ranjan Maitra.

About this article

Cite this article

Melnykov, V., Maitra, R. & Nettleton, D. Accounting for spot matching uncertainty in the analysis of proteomics data from two-dimensional gel electrophoresis. Sankhya B 73, 123–143 (2011). https://doi.org/10.1007/s13571-011-0016-x

  • DOI: https://doi.org/10.1007/s13571-011-0016-x
