Abstract
One of the challenges of using mass spectrometry for metabolomic analyses of samples containing thousands of compounds is peak identification and alignment. This paper addresses the problem of aligning mass spectral data from different samples in order to determine average component m/z peak values. The alignment scheme takes the instrument's m/z measurement error into account to heuristically align two or more samples, using a technique comparable to automated visual inspection and alignment. Results obtained using mass spectral profiles of replicate human urine samples suggest that this heuristic alignment approach is more efficient than approaches based on hierarchical clustering algorithms. The output consists of an average m/z and intensity value for each spectral component, together with the number of matches from the different samples. A major advantage of this alignment strategy is that it eliminates the boundary problem that occurs when predetermined fixed bins are used to identify and combine peaks for averaging. In addition, its efficient runtime allows large datasets to be processed quickly.
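The general idea of tolerance-based alignment can be illustrated with a minimal sketch. This is not the authors' implementation, whose details are not given in the abstract; it is a simple greedy sweep written under stated assumptions. The names `align_peaks`, `AlignedPeak` and `ppm_tol`, the 5 ppm default, and the toy peak values are all hypothetical. Peaks from all samples are pooled, sorted by m/z, and grouped while each new peak stays within the assumed instrument error of the running group mean. Because the window follows the data instead of fixed bin edges, two peaks that sit on either side of a bin boundary can still be matched.

```python
from dataclasses import dataclass


@dataclass
class AlignedPeak:
    mz: float         # mean m/z of the matched peaks
    intensity: float  # mean intensity of the matched peaks
    matches: int      # number of peaks that contributed


def _summarise(group):
    """Collapse a list of (mz, intensity) tuples into one AlignedPeak."""
    n = len(group)
    return AlignedPeak(
        mz=sum(p[0] for p in group) / n,
        intensity=sum(p[1] for p in group) / n,
        matches=n,
    )


def align_peaks(samples, ppm_tol=5.0):
    """Greedy one-pass alignment of peak lists from several samples.

    samples: list of peak lists, each peak an (mz, intensity) tuple.
    ppm_tol: assumed instrument m/z error in parts per million
             (hypothetical default, not taken from the paper).
    """
    # Pool every peak and sort by m/z so neighbours are match candidates.
    pooled = sorted(peak for sample in samples for peak in sample)

    aligned = []
    group = []
    for mz, intensity in pooled:
        if group:
            mean_mz = sum(p[0] for p in group) / len(group)
            # The window is centred on the running group mean,
            # not on a predetermined bin edge.
            if abs(mz - mean_mz) > mean_mz * ppm_tol * 1e-6:
                aligned.append(_summarise(group))
                group = []
        group.append((mz, intensity))
    if group:
        aligned.append(_summarise(group))
    return aligned


# Toy data: three replicate samples with slightly shifted m/z values.
samples = [
    [(200.0009, 1000.0), (350.1200, 400.0)],
    [(200.0011, 1200.0), (350.1203, 600.0)],
    [(200.0010, 1100.0)],
]
for peak in align_peaks(samples, ppm_tol=5.0):
    print(peak)
```

With fixed bins of width 0.001, the peaks near m/z 200.001 in this toy data would straddle a bin edge; the sweep above places them in one group because only their mutual distance matters. A single sorted pass is also cheaper than building a full hierarchical clustering, which is in keeping with the efficiency argument made in the abstract, although this sketch makes no claim to reproduce the paper's measurements.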
Acknowledgments
This study was supported by the National Institutes of Health (P20 GM65764-02), the Department of Defense (N00014-99-1-0905; N00014-99-1-06006) and a Faculty Research Grant from the University of Connecticut.
Cite this article
Kazmi, S.A., Ghosh, S., Shin, DG. et al. Alignment of high resolution mass spectra: development of a heuristic approach for metabolomics. Metabolomics 2, 75–83 (2006). https://doi.org/10.1007/s11306-006-0021-7
DOI: https://doi.org/10.1007/s11306-006-0021-7