LC-MS Data Analysis for Differential Protein Expression Detection

Varghese, Rency S.; Ressom, Habtom W.

doi:10.1007/978-1-60761-977-2_10


Part of the book series: Methods in Molecular Biology (MIMB, volume 694)


Abstract

In proteomic studies, liquid chromatography coupled with mass spectrometry (LC-MS) is a common platform for comparing the abundance of peptides that characterize particular proteins in biological samples. Each LC-MS run generates thousands of peptide peak intensities, each indexed by its retention time (RT) and mass-to-charge ratio (m/z). In label-free differential protein expression studies, multiple LC-MS runs are compared to identify peptides that are differentially abundant between distinct biological groups. This approach presents a computational challenge for three reasons: (i) substantial variation in RT across runs due to LC instrument conditions and the variable complexity of peptide mixtures, (ii) variation in m/z values due to occasional drift in the calibration of the mass spectrometer, and (iii) variation in peak intensities caused by factors including noise and variability in sample handling and processing. In this chapter, we present computational methods for quantification and comparison of peptides by label-free LC-MS analysis. We discuss data preprocessing methods for alignment and normalization of LC-MS data. We also present multivariate statistical and pattern recognition methods for detecting differential protein expression from preprocessed LC-MS data.
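
As a rough illustration of the normalization and comparison steps named above, the following sketch may help. It is not the chapter's method: it assumes the runs have already been aligned so that each column is the same peptide, it uses per-run median centering of log2 intensities as a stand-in for normalization, and it ranks peptides by Welch's t-statistic. The function names and the toy intensities are hypothetical.

```python
import math
from statistics import mean, median, variance


def median_normalize(run):
    """Log2-transform one run's intensities and subtract the run median."""
    logs = [math.log2(x) for x in run]
    m = median(logs)
    return [v - m for v in logs]


def welch_t(a, b):
    """Welch's t-statistic for two groups; returns 0.0 if both have zero variance."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def rank_peptides(group_a, group_b):
    """Rank aligned peptides (columns) by |t| after per-run normalization.

    Returns (peptide_index, log2_fold_change_B_vs_A, t) tuples,
    sorted with the largest |t| first.
    """
    a = [median_normalize(r) for r in group_a]
    b = [median_normalize(r) for r in group_b]
    results = []
    for j in range(len(a[0])):
        xa = [run[j] for run in a]
        xb = [run[j] for run in b]
        results.append((j, mean(xb) - mean(xa), welch_t(xb, xa)))
    return sorted(results, key=lambda r: abs(r[2]), reverse=True)


# Toy data (assumed): group B runs are about 2x brighter overall, and
# peptide 6 is additionally about 4x more abundant in group B.
group_a = [
    [100, 200, 400, 800, 1600, 3200, 6400],
    [110, 190, 420, 790, 1650, 3100, 6500],
    [95, 210, 390, 820, 1580, 3300, 6300],
]
group_b = [
    [205, 395, 810, 1610, 3150, 6450, 51000],
    [190, 410, 790, 1580, 3250, 6350, 52000],
    [210, 400, 805, 1620, 3190, 6500, 50500],
]

ranked = rank_peptides(group_a, group_b)
top_index, top_log2_fc, top_t = ranked[0]
print(top_index, round(top_log2_fc, 2))
```

In this toy case the global 2x intensity difference is removed by median centering, so only the peptide with a genuine abundance change ranks first. With real data, thousands of peptides are tested at once, so a multiple-testing correction would be needed before calling any peptide differentially abundant.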



Acknowledgments

This work was supported in part by the National Science Foundation Grant IIS-0812246 awarded to HWR.

Author information

Authors and Affiliations

Department of Oncology, Georgetown University Medical Center, Washington, DC, USA
Rency S. Varghese & Habtom W. Ressom


Editor information

Editors and Affiliations

Delaware Biotechnology Institute, Dept. Computer & Information Sciences, University of Delaware, Innovation Way 15, Newark, 19711, Delaware, USA
Cathy H. Wu
Delaware Biotechnology Institute, Dept. Computer & Information Sciences, University of Delaware, Innovation Way 15, Newark, 19711, Delaware, USA
Chuming Chen

About this protocol

Cite this protocol

Varghese, R.S., Ressom, H.W. (2011). LC-MS Data Analysis for Differential Protein Expression Detection. In: Wu, C., Chen, C. (eds) Bioinformatics for Comparative Proteomics. Methods in Molecular Biology, vol 694. Humana Press. https://doi.org/10.1007/978-1-60761-977-2_10

DOI: https://doi.org/10.1007/978-1-60761-977-2_10
Published: 01 November 2010
Publisher Name: Humana Press
Print ISBN: 978-1-60761-976-5
Online ISBN: 978-1-60761-977-2
eBook Packages: Springer Protocols
