A Compressed Sensing Based Feature Extraction Method for Identifying Characteristic Genes
In current molecular biology, identifying characteristic genes that are closely correlated with a key biological process from gene expression data is increasingly important. In this paper, a novel compressed sensing (CS) based feature extraction method, named CSGS, is proposed to identify characteristic genes. The transposed gene expression matrix and the class labels are treated as the sensing matrix and the measurement vector, respectively, and CS reconstruction is performed with the basis pursuit algorithm. Top-ranking genes with high signal weights are retained as the characteristic genes. CSGS is evaluated on a leukemia data set and compared with other sparse methods. The results demonstrate that CSGS is effective in identifying characteristic genes and is not sensitive to its parameters. CSGS could offer a simple approach to feature extraction and provide more clues for biologists.
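The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes basis pursuit is posed as the linear program min ||x||_1 subject to Ax = y and solved with SciPy's `linprog`, whereas the paper's own solver is not stated here. The function names `basis_pursuit` and `select_genes`, the synthetic data, and the planted gene indices are all hypothetical and serve only to show the shape of the computation.

```python
import numpy as np
from scipy.optimize import linprog


def basis_pursuit(A, y):
    """Solve min ||x||_1 subject to A x = y as a linear program.

    x is split into nonnegative parts, x = u - v, so that
    ||x||_1 = sum(u) + sum(v) at the optimum.
    """
    n_genes = A.shape[1]
    c = np.ones(2 * n_genes)          # objective: sum(u) + sum(v)
    A_eq = np.hstack([A, -A])         # A u - A v = y
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:n_genes] - res.x[n_genes:]


def select_genes(expr, labels, top_k):
    """Rank genes by the magnitude of their reconstructed weight.

    expr   : (n_genes, n_samples) gene expression matrix
    labels : (n_samples,) class labels used as the measurement vector
    """
    A = expr.T                        # transposed matrix acts as sensing matrix
    weights = basis_pursuit(A, labels)
    order = np.argsort(-np.abs(weights))
    return order[:top_k], weights


# Synthetic stand-in for real expression data (assumption, not from the paper).
rng = np.random.default_rng(0)
n_genes, n_samples = 200, 30
expr = rng.standard_normal((n_genes, n_samples))
planted = np.zeros(n_genes)
planted[[3, 17, 42]] = [2.0, -1.5, 1.0]        # hypothetical informative genes
labels = np.sign(expr.T @ planted)             # two classes coded as -1 / +1

top, weights = select_genes(expr, labels, top_k=10)
print("top-ranked gene indices:", top)
```

Splitting x into nonnegative parts is a standard way to turn the l1 objective into a linear one. For data with thousands of genes, a dedicated sparse-reconstruction solver would likely scale better than a general-purpose LP solver.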
Keywords: Gene expression data; Characteristic genes; Compressed sensing; Feature extraction
This work was supported by the National Natural Science Foundation of China (Grant Nos. 61502272, 61572284, 61572283); the Award Foundation Project of Excellent Young Scientists in Shandong Province (BS2014DX004, BS2014DX005); the Project of Shandong Province Higher Educational Science and Technology Program (J13LN31); the Scientific Research Foundation of Qufu Normal University (XJ201226); the Science and Technology Planning Project of Qufu Normal University (xkj201524); the Elaborate Experiment Project of Qufu Normal University (jp2015005); and the Innovation and Entrepreneurship Training Project for College Students of Qufu Normal University (2015A059).