Abstract
A novel clustering approach named Clustering Objects on Subsets of Attributes (COSA) has been proposed for unsupervised analysis of complex data sets (Friedman and Meulman (2004) Clustering objects on subsets of attributes. J. R. Statist. Soc. B 66:1–25). We demonstrate its usefulness in medical systems biology studies. We describe examples of metabolomics analyses, together with unsupervised clustering applied to studies of disease pathology and intervention effects in rats and humans. Compared with principal components analysis and hierarchical clustering based on Euclidean distance, COSA shows an enhanced capability to trace partial similarities in groups of objects. This enables a new discovery approach in systems biology and offers a distinct way to reveal common denominators of complex multi-factorial diseases in animal and human studies.
References
Bell J.D., Brown J.C.C., Sadler P.J. (1989) NMR studies of body fluids. NMR Biomed. 2:246–256
Camacho D., de la Fuente A., Mendes P. (2005) The origin of correlations in metabolomics data. Metabolomics 1:53–63
Davidov E., Clish C.B., Oresic M., Meys M., Stochaj W., Snell P., Lavine G., Londo T.R., Adourian A., Zhang X., Johnston M., Morel N., Marple E.W., Plasterer T.N., Neumann E., Verheij E., Vogels J.T.W.E., Havekes L.M., van der Greef J., Naylor S. (2004) Methods for the differential integrative omic analysis of plasma from a transgenic disease animal model. OMICS J. Integr. Biol. 8:267–288
Everitt B.S., Landau S., Leese M. (2001) Cluster analysis. Hodder & Stoughton Educational
Friedman J.H., Meulman J.J. (2004) Clustering objects on subsets of attributes. J. R. Statist. Soc. B 66:1–25
Gates S.C., Sweeley Ch.C. (1978) Quantitative metabolic profiling based on gas chromatography. Clin. Chem. 24:1663–1673
Gnanadesikan R., Kettenring J.R., Tsao S.L. (1995) Weighting and selection of variables for cluster analysis. J. Class. 12:113–136
Hastie T., Tibshirani R., Friedman J.H. (2001) The elements of statistical learning: Data mining, inference, and prediction. Springer Verlag, New York
Horning E.C., Horning M.G. (1971) Human metabolic profiles obtained by GC and GC/MS. J. Chromatogr. Sci. 9:129–140
Ideker T., Galitski T., Hood L. (2001) A new approach to decoding life: systems biology. Annu. Rev. Genomics Hum. Genet. 2:343–372
Jackson J.E. (1991) User’s guide to principal components. John Wiley & Sons, New York, NY
Jain A.K., Murty M.N., Flynn P.J. (1999) Data clustering: a review. ACM Computing Surveys 31:264–323
Kohonen T. (2001) Self organizing maps. Springer Verlag
Lee J.A., Lendasse A., Verleysen M. (2004) Nonlinear projection with curvilinear distances: Isomap versus curvilinear distance analysis. Neurocomputing 57:49–76
Mao J., Jain A.K. (1996) A self-organizing network for hyperellipsoidal clustering (HEC). IEEE Trans. Neural Netw. 7:16–29
Moller D.E., Kaufman K.D. (2005) Metabolic syndrome: a clinical and molecular perspective. Annu. Rev. Med. 56:45–62
Nicholson J.K., Wilson I.D. (1989) High resolution proton magnetic resonance spectroscopy of biological fluids. Progress NMR Spectrosc. 21:449–501
Oresic M., Clish C.B., Davidov E.J., Verheij E., Vogels J.T.W.E., Havekes L.M., Neumann E., Adourian A., Naylor S., van der Greef J., Plasterer T. (2004) Phenotype characterization using integrated gene transcript, protein and metabolite profiling. Appl. Bioinformatics 3:205–217
Parsons L., Haque E., Liu H. (2004) Subspace clustering for high dimensional data: a review. SIGKDD Explorations 6:90–105
Pauling L., Robinson A.B., Teranishi R., Cary P. (1971) Quantitative analysis of urine vapor and breath by gas-liquid partition chromatography. Proc. Nat. Acad. Sci. USA 68:2374–2376
Phillips M.S., Liu Q., Hammond H.A., Dugan V., Hey P.J., Caskey C.T., Hess J.F. (1996) Leptin receptor missense mutation in the fatty Zucker rat. Nat. Genet. 13:18–19
Politzer I.A., Dowty B.J., Laseter J.L. (1976) Use of gas chromatography and mass spectrometry to analyze underivatized volatile human and animal constituents of clinical interest. Clin. Chem. 22:1775–1788
Rhodes G., Miller M., McCornell M.L., Novotny M. (1981) Metabolic abnormalities associated with diabetes mellitus, as investigated by gas chromatography and pattern recognition analysis of profiles of volatile metabolites. Clin. Chem. 27:580–585
Sammon J.W. Jr. (1969) A nonlinear mapping for data structure analysis. IEEE Trans. Comp. C-18:401–409
Tas A.C., van der Greef J., de Waart J., Bouwman J., ten Noever de Brauw M.C. (1985) Comparison of direct chemical ionization and direct probe electron impact/chemical ionization pyrolysis for characterization of Pseudomonas and Serratia bacteria. J. Anal. Appl. Pyrolysis 7:249–255
Torgerson W.S. (1952) Multidimensional scaling: I. Theory and method. Psychometrika 17:401–419
van der Greef J., Heijden R.v.d., Verheij E. (2004a) The role of mass spectrometry in systems biology: data processing and identification strategies in metabolomics. In: Brenton G., Monaghan J., Ashcroft A. (Eds.) Advances in Mass Spectrometry, Vol. 16. Elsevier, pp. 145–164
van der Greef J., Stroobant P., Heijden R.v.d. (2004b) The role of analytical sciences in medical systems biology. Curr. Opin. Chem. Biol. 8:559–565
Verhoeckx K.C.M., Bijlsma S., Jespersen S., Ramaker R., Verheij E.R., Witkamp R.F., van der Greef J. (2004) Characterization of anti-inflammatory compounds using transcriptomics, proteomics, and metabolomics in combination with multivariate data analysis. Int. Immunopharmacol. 4:1499–1514
Ward J.H. (1963) Hierarchical grouping to optimize an objective function. J. Am. Stat. Assoc. 58:236–244
Windig W., Kistemaker P.G., Haverkamp J. (1980) Factor analysis of the influence of changes in experimental conditions in pyrolysis-mass spectrometry. J. Anal. Appl. Pyrolysis 2:18
Windig W., Meuzelaar H.L. (1984) Nonsupervised numerical component extraction from pyrolysis mass spectra of complex mixtures. Anal. Chem. 56:2297–2303
Acknowledgments
We would like to thank Ms. Stacey Horrigan (BG Medicine) for her help during this project.
Additional information
Doris Damian, Matej Orešič, and Elwin Verheij contributed equally to this work.
Cite this article
Damian, D., Orešič, M., Verheij, E. et al. Applications of a new subspace clustering algorithm (COSA) in medical systems biology. Metabolomics 3, 69–77 (2007). https://doi.org/10.1007/s11306-006-0045-z
DOI: https://doi.org/10.1007/s11306-006-0045-z