Disease-Related Gene Expression Analysis Using an Ensemble Statistical Test Method

  • Bing Wang
  • Zhiwei Ji
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7996)


The development of novel high-throughput experimental techniques makes it possible to comprehensively analyze biological data in health and disease. However, the large volume of data generated poses substantial analytic challenges for discovering ‘signature’ molecules, which are specific to different biological conditions (e.g., normal vs. disease, treated vs. untreated). Current statistical methods are effective only when their underlying assumptions are met. In this paper, we apply an ensemble statistical method to infer significant molecules. In our approach, four well-established and well-understood statistical techniques are applied to the experimental data, and their results are then combined in an ensemble framework to find high-confidence “significant” molecules that can distinguish the different experimental conditions. We evaluate the performance of our approach on a test dataset deposited in the GEO database under accession number GSE45114.
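The ensemble idea described above can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration: the abstract does not name the four tests or the rule used to combine them, so the sketch uses three stand-in two-sample tests (a Welch statistic with a normal approximation, a Mann-Whitney U test with a normal approximation, and a permutation test on the difference of means) and a simple vote threshold. The function names, the `alpha` and `min_votes` parameters, and the toy expression values are all hypothetical and are not taken from the paper or from GSE45114.

```python
import random
from statistics import NormalDist, mean, variance


def welch_z_pvalue(a, b):
    """Two-sided p-value for Welch's statistic (normal approximation, an assumption)."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    if se == 0:
        return 1.0 if mean(a) == mean(b) else 0.0
    z = (mean(a) - mean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def mann_whitney_pvalue(a, b):
    """Two-sided Mann-Whitney U p-value (normal approximation, no tie correction)."""
    n1, n2 = len(a), len(b)
    # U counts pairs where the value from a exceeds the value from b; ties count one half.
    u = sum((x > y) + 0.5 * (x == y) for x in a for y in b)
    mu = n1 * n2 / 2
    sigma = (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5
    z = (u - mu) / sigma
    return 2 * (1 - NormalDist().cdf(abs(z)))


def permutation_pvalue(a, b, n_perm=1000, seed=0):
    """Two-sided permutation p-value for the absolute difference of group means."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(mean(pooled[:len(a)]) - mean(pooled[len(a):])) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Stand-in tests; the paper's actual four tests are not listed in the abstract.
TESTS = [welch_z_pvalue, mann_whitney_pvalue, permutation_pvalue]


def ensemble_significant(expr, groups, alpha=0.05, min_votes=2):
    """Return {molecule: votes} for molecules flagged by at least min_votes tests.

    expr maps a molecule name to one value per sample; groups holds a 0/1
    condition label for each sample, in the same order.
    """
    selected = {}
    for name, values in expr.items():
        a = [v for v, g in zip(values, groups) if g == 0]
        b = [v for v, g in zip(values, groups) if g == 1]
        votes = sum(test(a, b) < alpha for test in TESTS)
        if votes >= min_votes:
            selected[name] = votes
    return selected


# Hypothetical toy data: 8 samples per condition, two molecules.
groups = [0] * 8 + [1] * 8
expr = {
    "gene_shifted": [1.0, 1.2, 0.9, 1.1, 1.3, 0.8, 1.0, 1.1,
                     3.0, 3.2, 2.9, 3.1, 3.3, 2.8, 3.0, 3.1],
    "gene_flat": [1.0, 1.2, 0.9, 1.1, 1.3, 0.8, 1.0, 1.1,
                  1.1, 1.0, 0.8, 1.3, 1.1, 0.9, 1.2, 1.0],
}
result = ensemble_significant(expr, groups)
print(result)
```

On this toy input only `gene_shifted` is selected, because all three stand-in tests reject for it and none rejects for `gene_flat`. Raising `min_votes` to the number of tests gives a stricter, unanimous rule. A real analysis over thousands of molecules would also need a multiple-testing correction, which this sketch omits.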


Statistical tests; Ensemble framework; Signature molecules; Gene expression profile





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Bing Wang (1, 2, 3)
  • Zhiwei Ji (3)
  1. The Advanced Research Institute of Intelligent Sensing Network, Tongji University, Shanghai, China
  2. The Key Laboratory of Embedded System and Service Computing, Ministry of Education, Tongji University, Shanghai, China
  3. School of Electronics and Information Engineering, Tongji University, Shanghai, China
