Abstract
In this chapter, we first describe the applications of matrix-assisted laser desorption/ionization (MALDI) mass spectrometry (MS) in biomarker discovery. After summarizing the general analysis pipeline for MALDI MS data, we elaborate on each step of the pipeline in detail. In particular, we categorize existing solutions so that the reader can obtain a global picture of the topic. In addition, we show how to apply such an analysis pipeline to protein and glycan profiling, both for biomarker discovery and for a deeper understanding of diseases. Finally, we discuss the limitations of current analysis methods and perspectives for future research.
Acknowledgements
This work was partially supported by the Natural Science Foundation of China under Grant No. 61003176, the General Research Fund 621707 and 662509 from the Hong Kong Research Grant Council, the Area of Excellence Scheme and Special Equipment Grant from the University Grants Committee of Hong Kong, and the Research Proposal Competition Awards RPC07/08.EG25 and RPC10.EG04 from the Hong Kong University of Science and Technology.
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
He, Z., Qi, R.Z., Yu, W. (2012). Bioinformatic Analysis of Data Generated from MALDI Mass Spectrometry for Biomarker Discovery. In: Cai, Z., Liu, S. (eds) Applications of MALDI-TOF Spectroscopy. Topics in Current Chemistry, vol 331. Springer, Berlin, Heidelberg. https://doi.org/10.1007/128_2012_365
DOI: https://doi.org/10.1007/128_2012_365
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35664-3
Online ISBN: 978-3-642-35665-0
eBook Packages: Chemistry and Materials Science; Chemistry and Material Science (R0)