Classification with Gene Expression Data
This survey covers the tasks involved in constructing and evaluating classifiers, illustrated on a renal cell cancer data set. Balanced sample splitting, non-specific filtering, linear discriminant analysis, nearest-neighbor prediction, and support vector machines are each illustrated concretely using the MLInterfaces package. Evaluations based on a single random split of the data are compared with evaluations based on multiple random splits. The presentation uses a generic programming format throughout, so that other investigators can readily adapt and vary the techniques used here.
Keywords: Support Vector Machine; Linear Discriminant Analysis; Prediction Rule; Class Prediction; Variable Selection Procedure