Abstract
Tumor diagnosis based on the analysis of gene expression profiles has become an active topic in bioinformatics, and the main problem is to identify the genes related to a tumor. This paper proposes a rank sum method, based on the rank sum test in statistics, to identify the related genes. The tumor diagnosis system is then constructed as a support vector machine (SVM) trained on the expression profiles of the selected related genes. Experiments demonstrate that the tumor diagnosis system built with the rank sum method and SVM reaches an accuracy of 96.2% on the colon data and 100% on the leukemia data.
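The gene selection step described in the abstract can be illustrated with a generic Wilcoxon rank sum score per gene. The sketch below is not the paper's exact procedure, which is not given in this section. It assumes a two-class setting (tumor versus normal), uses the normal approximation without tie correction, and ranks genes by the absolute z-score. The function names (`average_ranks`, `rank_sum_z`, `select_genes`) and the toy expression values are hypothetical and introduced only for illustration. The SVM training step is not shown.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_z(tumor, normal):
    """Rank sum of the tumor group as a z-score (normal approximation,
    no tie correction; an assumption of this sketch)."""
    n1, n2 = len(tumor), len(normal)
    ranks = average_ranks(list(tumor) + list(normal))
    w = sum(ranks[:n1])                              # rank sum of tumor samples
    mean = n1 * (n1 + n2 + 1) / 2.0                  # expected rank sum under H0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)   # standard deviation under H0
    return (w - mean) / sd


def select_genes(expr, labels, k):
    """expr[g][s] is the expression of gene g in sample s;
    labels[s] is 1 for tumor and 0 for normal.
    Returns the indices of the k genes with the largest |z|."""
    scores = []
    for g, row in enumerate(expr):
        tumor = [v for v, y in zip(row, labels) if y == 1]
        normal = [v for v, y in zip(row, labels) if y == 0]
        scores.append((abs(rank_sum_z(tumor, normal)), g))
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [g for _, g in scores[:k]]


# Hypothetical toy data: 3 genes, 4 tumor samples followed by 4 normal samples.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
expr = [
    [5.1, 4.8, 5.0, 5.2, 4.9, 5.3, 4.7, 5.05],  # gene 0: groups interleaved
    [9.0, 8.5, 9.2, 8.8, 2.1, 2.5, 1.9, 2.2],   # gene 1: tumor clearly higher
    [3.0, 3.4, 2.9, 3.1, 3.2, 3.3, 2.8, 3.05],  # gene 2: groups interleaved
]
selected = select_genes(expr, labels, k=1)
print(selected)
```

On this toy input, gene 1 is the only gene whose tumor and normal values are fully separated, so it receives the largest absolute score. In a full pipeline, the expression values of the selected genes would then serve as the input features of a classifier such as an SVM.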
About this article
Cite this article
Deng, L., Ma, J. & Pei, J. Rank sum method for related gene selection and its application to tumor diagnosis. Chin. Sci. Bull. 49, 1652–1657 (2004). https://doi.org/10.1007/BF03184138
DOI: https://doi.org/10.1007/BF03184138