LC-MS Data Analysis for Differential Protein Expression Detection

Part of the book series: Methods in Molecular Biology (MIMB, volume 694)

Abstract

In proteomic studies, liquid chromatography coupled with mass spectrometry (LC-MS) is a common platform for comparing the abundance of peptides that characterize particular proteins in biological samples. Each LC-MS run generates data consisting of thousands of peptide peak intensities, each peak located by its retention time (RT) and mass-to-charge ratio (m/z). In label-free differential protein expression studies, multiple LC-MS runs are compared to identify peptides that differ in abundance between distinct biological groups. This approach is computationally challenging for three reasons: (i) substantial variation in RT across runs, due to LC instrument conditions and the variable complexity of peptide mixtures; (ii) variation in m/z values, due to occasional drift in the calibration of the mass spectrometer; and (iii) variation in peak intensities, caused by factors including noise and variability in sample handling and processing. In this chapter, we present computational methods for quantifying and comparing peptides by label-free LC-MS analysis. We discuss data preprocessing methods for alignment and normalization of LC-MS data. We also present multivariate statistical methods and pattern recognition methods for detecting differential protein expression from preprocessed LC-MS data.
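
As a rough illustration of the normalization and differential-abundance steps described above, the following minimal Python sketch works on peak intensities that are assumed to be already aligned across runs. It is not the chapter's method: the function names are hypothetical, and the specific choices (log2 median centering, an exact permutation test on group means, and Benjamini-Hochberg adjustment) are assumptions made only to keep the example short and dependency-free.

```python
import itertools
import math
import statistics


def log2_median_normalize(runs):
    """Log2-transform each run and subtract the run's median.

    `runs` is a list of runs; each run is a list of positive peak
    intensities for the same peptides in the same order (assumed aligned).
    """
    normalized = []
    for run in runs:
        logged = [math.log2(x) for x in run]
        center = statistics.median(logged)
        normalized.append([value - center for value in logged])
    return normalized


def permutation_pvalue(group_a, group_b):
    """Exact two-sided permutation p-value for a difference in group means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)
    observed = abs(statistics.mean(group_a) - statistics.mean(group_b))
    extreme = 0
    n_perm = 0
    # Enumerate every way of assigning n_a of the pooled values to group A.
    for idx in itertools.combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
        n_perm += 1
    return extreme / n_perm


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

In use, one would normalize all runs, compute one p-value per peptide from the normalized intensities of the two biological groups, and then adjust the resulting p-values across peptides. The exact permutation test enumerates every group assignment, so it is only practical for small numbers of runs per group.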

Acknowledgments

This work was supported in part by the National Science Foundation Grant IIS-0812246 awarded to HWR.

Copyright information

© 2011 Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Varghese, R.S., Ressom, H.W. (2011). LC-MS Data Analysis for Differential Protein Expression Detection. In: Wu, C., Chen, C. (eds) Bioinformatics for Comparative Proteomics. Methods in Molecular Biology, vol 694. Humana Press. https://doi.org/10.1007/978-1-60761-977-2_10

  • DOI: https://doi.org/10.1007/978-1-60761-977-2_10

  • Publisher Name: Humana Press

  • Print ISBN: 978-1-60761-976-5

  • Online ISBN: 978-1-60761-977-2

  • eBook Packages: Springer Protocols