Abstract
Biomedical spectroscopic experiments generate large volumes of data. To yield accurate, robust diagnostic tools, the data must be reduced to only a few characteristic observations per subject, and a large number of subjects must be studied. We describe two current data-analytic approaches to this problem: SIMCA (principal component analysis, partial least squares) and the statistical classification strategy (SCS). We demonstrate the SCS with three examples of its use in analyzing 1H NMR spectra: screening for colon cancer, characterizing thyroid cancer, and distinguishing cancer from cholangitis in the biliary tract.
Abbreviations
- FLD: Fisher’s linear discriminant
- FOBT: Fecal occult blood test
- NMR: Nuclear magnetic resonance
- PC: Principal component
- PCA: Principal component analysis
- PCR: Principal component regression
- PLS: Partial least squares
- PSC: Primary sclerosing cholangitis
- SCS: Statistical classification strategy
- SIMCA: Soft independent modelling of class analogies
- WCVBST: Weighted cross validated bootstrap
Cite this article
Smith, I.C.P., Somorjai, R.L. Deriving biomedical diagnostics from NMR spectroscopic data. Biophys Rev 3, 47–52 (2011). https://doi.org/10.1007/s12551-011-0045-8