Artificial Neural Networks for Reducing the Dimensionality of Gene Expression Data
The use of gene chips and microarrays for measuring gene expression is becoming widespread and is producing enormous amounts of data. As more datasets become available, the need grows for well-defined, robust and interpretable methods to mine and extract knowledge from them. There is currently considerable uncertainty as to which computational and statistical methods to adopt, mainly because of the new challenges that the high dimensionality of gene expression data presents to the data mining community. The dimensionality reduction methods being proposed tend to be increasingly complex and difficult to interpret, and their results are difficult for other researchers to reproduce. We evaluate the application of single layer, feedforward backpropagation artificial neural networks for reducing the dimensionality of both discrete and continuous gene expression data. Such networks also allow classification rules to be extracted from the reduced dataset. We demonstrate how ‘supergenes’ can be extracted from combined gene expression datasets using our method.
Keywords: Chronic Lymphocytic Leukaemia; Gene Expression Data; Output Node; Gene Reduction; Gene Chip
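The abstract does not spell out how the network performs the reduction, so the following is only a minimal sketch of one plausible reading, not the authors' procedure. It assumes that "single layer" means the gene inputs connect directly to one sigmoid output node (in which case backpropagation reduces to the delta rule), and that genes are ranked by the absolute value of their trained weight. The function names, the toy data generator and all parameter values are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Clamp extreme inputs so math.exp cannot overflow.
    if z < -60.0:
        return 0.0
    if z > 60.0:
        return 1.0
    return 1.0 / (1.0 + math.exp(-z))


def train_single_layer(samples, labels, epochs=200, lr=0.1, seed=0):
    """Train one sigmoid output node connected directly to every gene.

    With no hidden layer, the backpropagation update is the delta rule:
    each weight moves by lr * (target - output) * input.
    """
    rng = random.Random(seed)
    n_genes = len(samples[0])
    weights = [rng.uniform(-0.01, 0.01) for _ in range(n_genes)]
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            out = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = y - out  # gradient of cross-entropy loss w.r.t. the net input
            for j in range(n_genes):
                weights[j] += lr * err * x[j]
            bias += lr * err
    return weights, bias


def top_genes(weights, k):
    # Assumption: a gene's importance is the magnitude of its weight.
    order = sorted(range(len(weights)), key=lambda j: abs(weights[j]), reverse=True)
    return order[:k]


def make_toy_data(n_samples=40, n_genes=20, informative=(3, 7), seed=1):
    """Hypothetical continuous expression data: only the `informative`
    genes are shifted up or down according to the class label."""
    rng = random.Random(seed)
    samples, labels = [], []
    for i in range(n_samples):
        y = i % 2
        x = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        for g in informative:
            x[g] += 3.0 if y == 1 else -3.0
        samples.append(x)
        labels.append(y)
    return samples, labels


if __name__ == "__main__":
    samples, labels = make_toy_data()
    weights, bias = train_single_layer(samples, labels)
    print("Genes retained:", sorted(top_genes(weights, 2)))
```

Because each weight belongs to exactly one gene, the retained subset stays directly interpretable, which is in the spirit of the interpretability and reproducibility concerns raised in the abstract. Real microarray data would need normalisation and held-out validation, which this sketch leaves out.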