Abstract
Gene selection aims to select the most informative genes from the whole gene set and is a key step in the discriminant analysis of microarray data. Rough set theory is an efficient mathematical tool for further reducing redundancy. The main limitation of traditional rough set theory is its lack of effective methods for handling real-valued data, whereas gene expression data sets are continuous. This limitation has usually been addressed by discretization, which may result in information loss. This paper investigates an approach that combines feature ranking with feature selection based on tolerance rough set theory. It also explores a second method that uses the information contained in the boundary region to improve classification accuracy on gene expression data. Compared with a gene selection algorithm based on traditional rough set theory, the proposed methods are more effective at selecting highly discriminative genes for cancer classification.
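To make the tolerance-based idea concrete, the following is a minimal sketch of one common formulation from the tolerance rough set literature, not necessarily the definitions used in this paper, which the abstract does not state. It assumes a per-gene similarity of 1 - |a(x) - a(y)| / (a_max - a_min), a threshold `tau`, and a tolerance class that requires similarity on every selected gene. All function names and the toy data are hypothetical.

```python
def tolerance_class(data, features, tau, i):
    """Indices of samples whose similarity to sample i is >= tau
    on every selected feature (assumed similarity: 1 - |diff| / range)."""
    members = []
    for j in range(len(data)):
        similar = True
        for a in features:
            column = [row[a] for row in data]
            span = max(column) - min(column)
            # A constant feature cannot separate samples: treat as fully similar.
            sim = 1.0 if span == 0 else 1.0 - abs(data[i][a] - data[j][a]) / span
            if sim < tau:
                similar = False
                break
        if similar:
            members.append(j)
    return members


def positive_region(data, labels, features, tau):
    """Samples whose tolerance class contains only samples of their own label."""
    return [
        i for i in range(len(data))
        if all(labels[j] == labels[i]
               for j in tolerance_class(data, features, tau, i))
    ]


def boundary_region(data, labels, features, tau):
    """Samples whose tolerance class mixes labels (outside the positive region)."""
    positive = set(positive_region(data, labels, features, tau))
    return [i for i in range(len(data)) if i not in positive]


def dependency(data, labels, features, tau):
    """Fraction of samples in the positive region; higher means the
    selected features discriminate the labels better."""
    return len(positive_region(data, labels, features, tau)) / len(data)


# Hypothetical toy data: 4 samples, 2 real-valued "genes", 2 classes.
toy_data = [[0.0, 5.0], [0.1, 1.0], [0.9, 5.1], [1.0, 0.9]]
toy_labels = [0, 0, 1, 1]

if __name__ == "__main__":
    print(dependency(toy_data, toy_labels, [0], 0.8))
    print(dependency(toy_data, toy_labels, [1], 0.8))
    print(boundary_region(toy_data, toy_labels, [1], 0.8))
```

Because the similarity is computed directly on real values, no discretization step is needed. In this toy data the first gene separates the two classes while the second does not, so a dependency-based ranking would prefer the first. How the boundary samples are then exploited is specific to the paper and is not sketched here.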
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jiao, N. (2011). Evolutionary Tolerance-Based Gene Selection in Gene Expression Data. In: Peters, J.F., et al. Transactions on Rough Sets XIV. Lecture Notes in Computer Science, vol 6600. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21563-6_6
DOI: https://doi.org/10.1007/978-3-642-21563-6_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21562-9
Online ISBN: 978-3-642-21563-6
eBook Packages: Computer Science (R0)