Abstract
Microarray gene datasets are often very high-dimensional, which creates practical problems such as degraded performance in data access, data manipulation and query processing. Dimensionality reduction tackles this problem and helps to visualize the intrinsic properties hidden in the dataset. In this work, rough set theory (RST) is used to select only the relevant attributes of the dataset, called a reduct, which are sufficient to characterize the information system. The investigation was carried out on a publicly available microarray dataset. The analysis shows that rough set theory, using the concept of dependency among genes, is able to extract dominant genes, in the form of reducts, that play an important role in causing the disease. Experimental results show the effectiveness of the algorithm.
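The abstract does not spell out the algorithm, so the following is only a minimal sketch of how a dependency-based reduct can be computed, assuming already discretized expression values and the textbook greedy QuickReduct strategy rather than the authors' exact procedure. The function names (`partition`, `dependency`, `quick_reduct`) and the toy data are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())


def dependency(rows, labels, attrs):
    """Rough-set degree of dependency of the class label on attrs.

    This is the fraction of rows lying in the positive region, i.e. in
    equivalence classes whose members all share one class label.
    """
    if not rows:
        return 0.0
    positive = 0
    for block in partition(rows, attrs):
        if len({labels[i] for i in block}) == 1:
            positive += len(block)
    return positive / len(rows)


def quick_reduct(rows, labels):
    """Greedy QuickReduct sketch (assumed strategy, not the paper's own).

    Repeatedly adds the attribute that yields the highest dependency until
    the subset is as informative as the full attribute set. The result is
    a (not necessarily minimal) reduct given as a list of column indices.
    """
    all_attrs = list(range(len(rows[0])))
    target = dependency(rows, labels, all_attrs)
    reduct = []
    best = dependency(rows, labels, reduct)
    while best < target:
        candidates = [a for a in all_attrs if a not in reduct]
        gains = {a: dependency(rows, labels, reduct + [a]) for a in candidates}
        chosen = max(candidates, key=lambda a: gains[a])
        reduct.append(chosen)
        best = gains[chosen]
    return reduct


# Toy data (invented): six samples, three discretized genes, binary class.
samples = [
    (0, 1, 0),
    (0, 0, 1),
    (1, 1, 0),
    (1, 0, 1),
    (0, 1, 1),
    (1, 0, 0),
]
classes = [1, 0, 1, 0, 1, 0]

print(quick_reduct(samples, classes))  # -> [1]
```

In this toy example the class label is fully determined by the gene in column 1, so that single column already reaches the dependency of the full attribute set. Real microarray data would first need a discretization step, which this sketch leaves out.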
Copyright information
© 2015 Springer India
About this paper
Cite this paper
Das, S., Das, A.K. (2015). An Approach Towards Most Cancerous Gene Selection from Microarray Data. In: Jain, L., Behera, H., Mandal, J., Mohapatra, D. (eds) Computational Intelligence in Data Mining - Volume 3. Smart Innovation, Systems and Technologies, vol 33. Springer, New Delhi. https://doi.org/10.1007/978-81-322-2202-6_58
DOI: https://doi.org/10.1007/978-81-322-2202-6_58
Publisher Name: Springer, New Delhi
Print ISBN: 978-81-322-2201-9
Online ISBN: 978-81-322-2202-6
eBook Packages: Engineering, Engineering (R0)