Abstract
Apart from high dimensionality, uncertain data quality is another major challenge in Microarray classification. Microarray data contain various levels of noise, often high, and such data lead to unreliable, low-accuracy analysis, compounding the high-dimensionality problem. In this paper, we propose a new Microarray data classification method based on diversified multiple trees (DMDT). The method (1) makes the most of the information in the abundant genes in Microarray data and (2) uses a unique diversity measurement in the ensemble decision committee. The experimental results show that DMDT and the well-known CS4 method, which diversifies trees by using distinct tree roots, are on average more accurate than other well-known ensemble methods, including Bagging, Boosting, and Random Forests. The experiments also indicate that the diversity measurement used in DMDT improves the accuracy of ensemble classification on Microarray data.
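To make the idea of diversity in a decision committee concrete, the following minimal sketch computes a generic pairwise disagreement score (the fraction of samples on which exactly one of two members is correct) and an unweighted majority vote. This is an illustration only: the abstract does not define DMDT's own diversity measurement, so the disagreement score stands in for it, and the function names and toy predictions are hypothetical.

```python
from collections import Counter
from itertools import combinations


def disagreement(pred_a, pred_b, truth):
    """Fraction of samples on which exactly one of two classifiers is correct.

    Generic stand-in measure; NOT the diversity measurement used by DMDT.
    """
    if not (len(pred_a) == len(pred_b) == len(truth)) or not truth:
        raise ValueError("inputs must be non-empty and of equal length")
    differ = sum((a == t) != (b == t) for a, b, t in zip(pred_a, pred_b, truth))
    return differ / len(truth)


def mean_pairwise_disagreement(all_preds, truth):
    """Average disagreement over every pair of committee members."""
    pairs = list(combinations(all_preds, 2))
    if not pairs:
        raise ValueError("need at least two committee members")
    return sum(disagreement(a, b, truth) for a, b in pairs) / len(pairs)


def majority_vote(all_preds):
    """Combine member predictions sample by sample with an unweighted vote."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*all_preds)]


# Toy, made-up predictions from three hypothetical trees on five samples.
truth = ["tumor", "normal", "tumor", "tumor", "normal"]
tree_preds = [
    ["tumor", "normal", "normal", "tumor", "normal"],  # wrong on sample 3
    ["tumor", "tumor", "tumor", "tumor", "normal"],    # wrong on sample 2
    ["normal", "normal", "tumor", "tumor", "normal"],  # wrong on sample 1
]

print(mean_pairwise_disagreement(tree_preds, truth))
print(majority_vote(tree_preds))
```

In this toy case each tree errs on a different sample, so the committee vote recovers every label even though no single member is fully correct, which is the intuition behind preferring diverse members.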
Copyright information
© 2010 Springer Science+Business Media, LLC
About this paper
Cite this paper
Zhang, Z., Li, J., Hu, H., Zhou, H. (2010). A Robust Ensemble Classification Method Analysis. In: Arabnia, H. (eds) Advances in Computational Biology. Advances in Experimental Medicine and Biology, vol 680. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-5913-3_17
DOI: https://doi.org/10.1007/978-1-4419-5913-3_17
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4419-5912-6
Online ISBN: 978-1-4419-5913-3
eBook Packages: Biomedical and Life Sciences (R0)