Integrating Multiple-Platform Expression Data through Gene Set Features
- Cite this paper as:
- Holec M., Železný F., Kléma J., Tolar J. (2009) Integrating Multiple-Platform Expression Data through Gene Set Features. In: Măndoiu I., Narasimhan G., Zhang Y. (eds) Bioinformatics Research and Applications. ISBRA 2009. Lecture Notes in Computer Science, vol 5542. Springer, Berlin, Heidelberg
We demonstrate a set-level approach to the integration of multiple-platform gene expression data for predictive classification and show its utility for boosting classification performance when single-platform samples are rare. We explore three ways of defining gene sets, including a novel way based on the notion of a fully coupled flux related to metabolic pathways. In two tissue classification tasks, we empirically show that the gene-set-based approach is useful for combining heterogeneous expression data. Surprisingly, in experiments constrained to a single platform, biologically meaningful gene sets acting as sample features are often outperformed by random gene sets with no biological relevance.
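As an illustration of the set-level idea, the following minimal sketch maps samples from two platforms, which measure different genes, onto a shared vector of gene set features. The abstract does not state how expression values are aggregated within a set, so the per-sample z-scoring and the mean over measured member genes are assumptions made here, and the gene names, gene sets, and the helpers `standardize` and `gene_set_features` are hypothetical.

```python
from statistics import mean, pstdev


def standardize(sample):
    """Z-score one sample's expression values (dict: gene -> value).

    Assumption: per-sample standardization is used to make platforms
    roughly comparable; the paper may normalize differently.
    """
    values = list(sample.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return {gene: 0.0 for gene in sample}
    return {gene: (value - mu) / sigma for gene, value in sample.items()}


def gene_set_features(sample, gene_sets):
    """Map one sample to a feature vector with one entry per gene set.

    Each feature is the mean standardized expression of the set's genes
    that this sample's platform measures, or None if none are measured.
    Features are ordered by sorted gene set name.
    """
    z = standardize(sample)
    features = []
    for name in sorted(gene_sets):
        measured = [z[gene] for gene in gene_sets[name] if gene in z]
        features.append(mean(measured) if measured else None)
    return features


# Hypothetical toy data: two platforms covering different genes.
gene_sets = {"set_A": {"g1", "g2", "g3"}, "set_B": {"g4", "g5"}}
sample_platform_1 = {"g1": 2.0, "g2": 4.0, "g4": 6.0, "g5": 8.0}
sample_platform_2 = {"g1": 1.0, "g3": 3.0, "g5": 5.0}

f1 = gene_set_features(sample_platform_1, gene_sets)
f2 = gene_set_features(sample_platform_2, gene_sets)
print(f1)
print(f2)
```

Because both samples end up in the same feature space (one value per gene set), they could in principle be pooled into a single training set for a classifier, even though the platforms share only some genes.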