Volume 1, Issue 4, pp 201-211
Date: 25 Nov 2009

Creating robust, reliable, clinically relevant classifiers from spectroscopic data

Abstract

I describe in detail the intimately connected feature extraction and classifier development stages of the data-driven Statistical Classification Strategy (SCS) and compare them with current practice in MR spectroscopy. We initially created the SCS for the analysis of MR and IR spectra of biofluids and tissues, and subsequently extended it to analyze biomedical data in general. I focus on explaining how to extract discriminatory spectral features and create robust classifiers that can reliably discriminate diseases and disease states. I discuss our approach to identifying features that retain spectral identity and provisionally relate these features (averaged subregions of the spectra) to specific chemical entities (“metabolites”). Particular emphasis is placed on describing the steps required to create classifiers whose accuracy does not deteriorate significantly when presented with new, unknown samples. A simple but powerful extension of the discovered features to detect metabolite-metabolite (feature-feature) interactions is also sketched. I contrast the advantages and disadvantages of using either spectral signatures or explicit metabolite concentrations derived from the spectra as sets of discriminatory features. At present, no clear-cut preference is evident, and more objective comparisons will be needed. Finally, I argue that clinical requirements and exigencies strongly suggest adopting a two-phase approach to diagnosis/prognosis. In the first phase the emphasis ought to be on providing as accurate a diagnosis as possible, without any attempt to identify “biomarkers.” That should be the goal of the second, research phase, with a view to providing a prognosis of disease progression.
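
To make the idea of features as averaged spectral subregions more concrete, the following is a minimal Python sketch, not the SCS implementation itself. The function names `subregion_averages` and `add_pairwise_products`, the index-window representation of a subregion, and the use of pairwise products as feature-feature interaction terms are all assumptions made for illustration; the abstract does not specify how subregions are chosen or how interactions are formed.

```python
import numpy as np
from itertools import combinations


def subregion_averages(spectra, regions):
    """Average each spectrum over each half-open (start, stop) index window.

    spectra: array of shape (n_samples, n_points)
    regions: list of (start, stop) index pairs (hypothetical representation)
    Returns an array of shape (n_samples, len(regions)).
    """
    spectra = np.asarray(spectra, dtype=float)
    return np.column_stack([spectra[:, a:b].mean(axis=1) for a, b in regions])


def add_pairwise_products(features):
    """Append the product of every feature pair as candidate interaction terms.

    Assumption: interactions are modeled as simple products; this is one
    common choice, not necessarily the one used in the article.
    """
    features = np.asarray(features, dtype=float)
    pairs = list(combinations(range(features.shape[1]), 2))
    if not pairs:
        return features, pairs
    products = np.column_stack([features[:, i] * features[:, j] for i, j in pairs])
    return np.hstack([features, products]), pairs


# Toy data: two "spectra" of four points, split into two subregions.
spectra = np.array([[1.0, 3.0, 5.0, 7.0],
                    [2.0, 2.0, 4.0, 4.0]])
regions = [(0, 2), (2, 4)]

averages = subregion_averages(spectra, regions)
extended, pairs = add_pairwise_products(averages)
```

Because each feature is an average over a contiguous window, it stays tied to a specific stretch of the spectrum, which is what allows it to be provisionally related to a chemical entity. If the windows themselves are chosen using class labels, that selection would need to be repeated inside each cross-validation fold; otherwise the estimated accuracy is likely to be optimistic for new samples.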