
Automated work-flow for processing high-resolution direct infusion electrospray ionization mass spectral fingerprints

Metabolomics

Abstract

Mass spectrometry (MS) is pivotal in analyses of the metabolome, and it presents a major challenge for subsequent data processing. While the last few years have brought new high-performance instruments, there has not been a comparable development in data processing. In this paper we discuss an automated data processing pipeline for comparing large numbers of fingerprint spectra from direct infusion experiments analyzed by high-resolution MS. We describe some of the intriguing problems that have to be addressed, from the conversion and pre-processing of the raw data to the final data analysis. Illustrated on the direct infusion analysis (ESI-TOF-MS) of complex mixtures, the method exploits the full resolution available in the mass spectra. Although the method is illustrated as a new library search method for high-resolution MS, we demonstrate that the output of the pre-processing is also applicable to cluster analysis, discriminant analysis, and related multivariate methods applied directly to mass spectra from direct infusion analysis of crude extracts. This is done to find the relationship between several terverticillate Penicillium species and to identify the ions responsible for the segregation.





Acknowledgments

The authors thank Professor Jens Christian Frisvad (BioCentrum-DTU) for identification of the species. Ellen Kirstine Lyhne and Hanne Jacobsen are gratefully acknowledged for cutting the plugs and for carrying out the extraction and analysis of the samples. The project was supported by the Danish Technical Research Council under the project “Programme for predictive biotechnology: Functional biodiversity in Penicillium and Aspergillus” (grant no. 9901295) and The Danish Research Council (grant no. 274-05-0606).

Author information


Corresponding author

Correspondence to Michael Adsetts Edberg Hansen.

Appendix

Scripts

The software used for extracting data from the MassLynx data files, together with full documentation of the processing scripts, can be obtained by contacting the corresponding author by email: meh@biocentrum.dtu.dk.

Adaptive binning

The adaptive binning algorithm used:

  • Require: \(\boldsymbol{\Phi} = \{\varphi_p^{S_k}\}\)

  • for all \(k \in \{1, \ldots, K\}\) and \(p \in \{1, \ldots, |S_k|\}\) do

  •   find the nearest cluster \(c\) to \(\varphi_p^{S_k}\)

  •   if no cluster is found or distance \(d_{cp} \geq d_{max}\) then

  •    create a new cluster with element \(p\)

  •   else if \(d_{cp} \leq d_{min}\) then

  •    add element \(p\) to cluster \(c\)

  •   end if

  • end for

  • for all clusters \(i\) do

  •   if cluster \(i\) has at least \(N_m\) elements then

  •    update centroid \(c_i\) of cluster \(i\)

  •   else

  •    remove cluster \(i\)

  •   end if

  • end for
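
The pseudocode above can be sketched in Python as follows. This is a minimal illustration and not the authors' implementation: it assumes that each spectrum is a plain list of one-dimensional peak positions (for example m/z values), that the distance is the absolute difference, and that a centroid is the mean of a cluster's members. The function name `adaptive_binning` and the parameter names `d_min`, `d_max` and `n_min` are hypothetical stand-ins for \(d_{min}\), \(d_{max}\) and \(N_m\). Clusters with fewer than `n_min` elements are treated as discarded.

```python
from statistics import mean


def adaptive_binning(spectra, d_min, d_max, n_min):
    """One pass of adaptive binning over 1-D peak positions (illustrative sketch).

    spectra: list of lists of peak positions, one inner list per spectrum.
    Returns the sorted centroids of the clusters that survive pruning.
    """
    centroids = []  # centroid of each cluster (its seed peak until updated)
    members = []    # peak positions assigned to each cluster

    for spectrum in spectra:      # k = 1..K
        for peak in spectrum:     # p = 1..|S_k|
            if centroids:
                # Nearest cluster c; absolute difference is an assumed distance.
                c = min(range(len(centroids)),
                        key=lambda i: abs(centroids[i] - peak))
                d_cp = abs(centroids[c] - peak)
            else:
                c, d_cp = None, None

            if c is None or d_cp >= d_max:
                # No cluster yet, or the peak is far from all clusters.
                centroids.append(peak)
                members.append([peak])
            elif d_cp <= d_min:
                members[c].append(peak)
            # Peaks with d_min < d_cp < d_max stay unassigned in this pass.

    # Keep clusters with at least n_min elements and update their centroids.
    return sorted(mean(m) for m in members if len(m) >= n_min)


if __name__ == "__main__":
    example = [[100.00, 200.00, 300.0], [100.02, 200.01], [99.98, 250.0]]
    print(adaptive_binning(example, d_min=0.05, d_max=1.0, n_min=2))
```

In this sketch the centroids are only recomputed after all peaks have been visited, mirroring the two separate loops of the pseudocode. An iterative variant would repeat both loops until the centroids stop moving.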


About this article

Cite this article

Hansen, M.A.E., Smedsgaard, J. Automated work-flow for processing high-resolution direct infusion electrospray ionization mass spectral fingerprints. Metabolomics 3, 41–54 (2007). https://doi.org/10.1007/s11306-006-0044-0


  • DOI: https://doi.org/10.1007/s11306-006-0044-0
