Abstract
Despite considerable advances in gene expression microarray technology, the main obstacle in analyzing microarray data remains the small number of samples relative to the number of genes measured, which makes it difficult to uncover actual gene function and other valuable information in the data. Analyzing gene expression data can indicate the genes that are differentially expressed in diseased tissue. Because most of these genes play no part in causing the disease of interest, identifying the disease-causing genes can reveal not only the cause of the disease but also its pathogenic mechanism. Many gene selection methods can remove irrelevant genes, but most do not adequately remove redundant genes from microarray data, which increases computational cost and decreases classification accuracy. Combining gene expression data with Gene Ontology information can help identify this redundancy, which can then be removed using the algorithm presented in this work. The gene list obtained after the sequential steps of the algorithm can be analyzed further to obtain the most determinative genes responsible for type 2 diabetes.
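The abstract does not spell out the algorithm's steps, so the following is only a minimal sketch of one common way such a redundancy filter can work, not the authors' method. It assumes each gene already has a relevance score derived from expression data and that pairwise semantic similarities from Gene Ontology annotations are available. The function name `remove_redundant_genes`, the threshold of 0.8, and all gene names and numbers are hypothetical.

```python
from typing import Dict, FrozenSet, List


def remove_redundant_genes(
    relevance: Dict[str, float],
    similarity: Dict[FrozenSet[str], float],
    threshold: float = 0.8,
) -> List[str]:
    """Greedy filter: keep a gene only if it is not too similar to a gene already kept."""
    # Visit genes from most to least relevant (ties broken by name for determinism).
    ranked = sorted(relevance, key=lambda g: (-relevance[g], g))
    kept: List[str] = []
    for gene in ranked:
        # Missing pairs are treated as similarity 0.0 (an assumption of this sketch).
        redundant = any(
            similarity.get(frozenset((gene, other)), 0.0) >= threshold
            for other in kept
        )
        if not redundant:
            kept.append(gene)
    return kept


# Hypothetical toy data: scores and similarities are made up for illustration.
toy_relevance = {"GENE_A": 2.5, "GENE_B": 2.1, "GENE_C": 1.4, "GENE_D": 0.9}
toy_similarity = {
    frozenset(("GENE_A", "GENE_B")): 0.92,  # above threshold: treated as redundant
    frozenset(("GENE_A", "GENE_C")): 0.30,
    frozenset(("GENE_B", "GENE_C")): 0.35,
    frozenset(("GENE_C", "GENE_D")): 0.85,  # above threshold: treated as redundant
}
selected = remove_redundant_genes(toy_relevance, toy_similarity, threshold=0.8)
print(selected)
```

Ranking by relevance first means that, within a group of semantically similar genes, the one with the strongest expression signal is the one retained. A lower threshold removes more genes; the right value would depend on the similarity measure used.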
About this article
Cite this article
Kumar, A., Sharmila, D.J.S. Algorithmic Approach for Removing the Redundancy in Diabetic Gene Categories Based on Semantic Similarity and Gene Expression Data. Interdiscip Sci Comput Life Sci 8, 162–168 (2016). https://doi.org/10.1007/s12539-015-0113-z
DOI: https://doi.org/10.1007/s12539-015-0113-z