Signal Processing in Proteomics

Protocol, in: Proteome Bioinformatics

Part of the book series: Methods in Molecular Biology™ (MIMB, volume 604)

Abstract

Computational proteomics applications are often imagined as a pipeline in which information is processed in each stage before it flows to the next one. Independent of the type of application, the first stage invariably consists of obtaining the raw mass spectrometric data from the spectrometer and preparing it for use in the later stages by enhancing the signal of interest while suppressing spurious components. Numerous approaches for preprocessing MS data have been described in the literature. In this chapter, we describe both standard techniques originating from classical signal and image processing and novel computational approaches specifically tailored to the analysis of MS data sets. We focus on low-level signal processing tasks such as baseline reduction, denoising, and feature detection.

Notes

  1. Of course, this picture changes as soon as we take posttranslational modifications or labelling techniques into account.

  2. Please note that peptides usually elute over several consecutive time points and therefore appear in several neighbouring scans.

  3. This whole “de-isotoping” step is often seen as part of a later stage of the proteomics pipeline, the identification stage, since it usually operates not on the raw data but on the list of sticks. However, as we will show in a later section, integrating de-isotoping (feature detection) into the signal processing can improve prediction performance by extracting further valuable information from the data that would otherwise be neglected.

  4. Strictly speaking, the peak intensities follow a binomial distribution, which can be approximated by a Poisson distribution.

  5. The adaptive wavelet transform is a slight generalization of the classical wavelet transform in that the wavelet kernel can vary with position; hence, the transform does not correspond to a simple convolution, but rather to a more complicated integral transform.

  6. The sinc function is defined by sinc(x) := sin(x)/x.
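
As a rough illustration of Note 4, the sketch below compares a binomial distribution with its Poisson approximation. It assumes, purely for illustration, that the quantity of interest is the number of heavy isotopes among n atoms of one element, each heavy with a small probability p; the values n = 50 and p = 0.0107 (roughly the natural abundance of 13C) and the function names are hypothetical choices, not taken from the chapter.

```python
import math


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)


def poisson_pmf(k, lam):
    """Poisson probability of observing k events with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def compare(n, p, k_max=5):
    """Return (k, binomial, poisson) triples for k = 0..k_max.

    The Poisson mean is set to lam = n * p, the usual choice when
    approximating a binomial with many trials and a small p.
    """
    lam = n * p
    return [
        (k, binomial_pmf(k, n, p), poisson_pmf(k, lam))
        for k in range(k_max + 1)
    ]


if __name__ == "__main__":
    # Assumed example: 50 atoms, each heavy with probability 0.0107.
    for k, b, q in compare(n=50, p=0.0107):
        print(f"k={k}  binomial={b:.5f}  poisson={q:.5f}  diff={abs(b - q):.5f}")
```

For these assumed values the two distributions differ only slightly for each k. The approximation generally degrades as p grows, which is worth checking before relying on it for a particular element or molecule size.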

Acknowledgments

The authors would like to express their gratitude to Mrs. Anna Katharina Dehof, Mrs. Sophie Weggler, and Mrs. Linda Wolters for critical reading of the manuscript.

Author information

Corresponding author

Correspondence to Andreas Hildebrandt.

Copyright information

© 2010 Humana Press, a part of Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Hussong, R., Hildebrandt, A. (2010). Signal Processing in Proteomics. In: Hubbard, S., Jones, A. (eds) Proteome Bioinformatics. Methods in Molecular Biology™, vol 604. Humana Press. https://doi.org/10.1007/978-1-60761-444-9_11

  • DOI: https://doi.org/10.1007/978-1-60761-444-9_11

  • Publisher Name: Humana Press

  • Print ISBN: 978-1-60761-443-2

  • Online ISBN: 978-1-60761-444-9

  • eBook Packages: Springer Protocols
