Abstract
Microarray technology enables the simultaneous measurement of the expression levels of thousands of genes, but typically on only a limited number of samples. Data mining methods for classification, prediction, and clustering on these datasets must therefore be tailored to the peculiarities of this domain, which is marked by the so-called ‘curse of dimensionality’. One main characteristic of these specialized algorithms is their intensive use of feature selection to improve performance. A promising feature selection method is Bayesian Model Averaging (BMA), used here to find an optimal subset of genes. This article presents BMA applied to gene selection for classification on two cancer gene expression datasets and for survival analysis on two cancer gene expression datasets. It also explains how case-based reasoning (CBR) can benefit from this model: a hybrid BMA-CBR method for classification or survival prediction provides improved performance and a more expansible model.
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bichindaritz, I., Annest, A. (2010). Case Based Reasoning with Bayesian Model Averaging: An Improved Method for Survival Analysis on Microarray Data. In: Bichindaritz, I., Montani, S. (eds) Case-Based Reasoning. Research and Development. ICCBR 2010. Lecture Notes in Computer Science, vol 6176. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14274-1_26
DOI: https://doi.org/10.1007/978-3-642-14274-1_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14273-4
Online ISBN: 978-3-642-14274-1
eBook Packages: Computer Science (R0)