Bioinformatic Analysis of Data Generated from MALDI Mass Spectrometry for Biomarker Discovery

Applications of MALDI-TOF Spectroscopy

Part of the book series: Topics in Current Chemistry ((TOPCURRCHEM,volume 331))

Abstract

In this chapter, we first describe the applications of matrix-assisted laser desorption/ionization (MALDI) mass spectrometry (MS) in biomarker discovery. After summarizing the general analysis pipeline for MALDI MS data, we describe each step of the pipeline in detail. In particular, we categorize existing solutions so that the reader can obtain a global picture of the topic. In addition, we show how to apply such an analysis pipeline to protein and glycan profiling for biomarker discovery and for a deeper understanding of diseases. Finally, we discuss the limitations of current analysis methods and perspectives for future research.


Acknowledgements

This work was partially supported by the Natural Science Foundation of China under Grant No. 61003176, the General Research Fund 621707 and 662509 from the Hong Kong Research Grant Council, the Area of Excellence Scheme and Special Equipment Grant from the University Grants Committee of Hong Kong, and the Research Proposal Competition Awards RPC07/08.EG25 and RPC10.EG04 from the Hong Kong University of Science and Technology.

Author information

Corresponding author

Correspondence to Zengyou He.

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

He, Z., Qi, R.Z., Yu, W. (2012). Bioinformatic Analysis of Data Generated from MALDI Mass Spectrometry for Biomarker Discovery. In: Cai, Z., Liu, S. (eds) Applications of MALDI-TOF Spectroscopy. Topics in Current Chemistry, vol 331. Springer, Berlin, Heidelberg. https://doi.org/10.1007/128_2012_365
