Design Issues and Comparison of Methods for Microarray-Based Classification
Except in situations where the amount of data is large in comparison to the number of variables, classifier design and error estimation involve subtle issues. This is especially so in applications such as cancer classification where there is no prior knowledge concerning the vector-label distributions involved. It is clearly prudent to try to achieve classification using small numbers of genes and rules of low complexity (low VC dimension), and to use cross-validation when it is not possible to obtain large independent samples for testing. Even when one uses a cross-validation method such as leave-one-out estimation, one is still confronted by the high variance of the estimator. In many applications, large samples are impossible owing to either cost or availability. Therefore, it is unlikely that a statistical approach alone will provide satisfactory results. Rather, one can use the results of classification analysis to discover gene sets that potentially provide good discrimination, and then focus attention on these. In the same vein, one can utilize the common engineering approach of integrating data with human knowledge to arrive at satisfactory systems.
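The point about the high variance of leave-one-out estimation can be illustrated with a small simulation. The sketch below is not taken from the chapter: the two-class Gaussian model, the nearest-mean rule (chosen only as an example of a low-complexity classifier), and all names such as `loo_error` and `loo_spread` are assumptions made for illustration. It draws many small samples, computes the leave-one-out error estimate on each, and reports how widely those estimates scatter.

```python
import random
import statistics


def nearest_mean_predict(train_x, train_y, x):
    """Assign x to the class whose centroid is closer (a low-complexity rule)."""
    centroids = {}
    for label in (0, 1):
        rows = [v for v, l in zip(train_x, train_y) if l == label]
        # Column-wise mean over the training vectors of this class.
        centroids[label] = [statistics.fmean(col) for col in zip(*rows)]

    def sqdist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    return min((0, 1), key=lambda label: sqdist(x, centroids[label]))


def loo_error(xs, ys):
    """Leave-one-out error: hold out each point once, train on the rest."""
    errors = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_mean_predict(train_x, train_y, xs[i]) != ys[i]:
            errors += 1
    return errors / len(xs)


def draw_sample(rng, n_per_class, n_genes=2, shift=1.0):
    """Hypothetical model: unit-variance Gaussians whose means differ by `shift`."""
    xs, ys = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            xs.append([rng.gauss(label * shift, 1.0) for _ in range(n_genes)])
            ys.append(label)
    return xs, ys


def loo_spread(n_per_class, repeats=200, seed=0):
    """Mean and standard deviation of the LOO estimate over repeated samples."""
    rng = random.Random(seed)
    estimates = [loo_error(*draw_sample(rng, n_per_class)) for _ in range(repeats)]
    return statistics.fmean(estimates), statistics.stdev(estimates)


if __name__ == "__main__":
    for n in (10, 40):
        mean, sd = loo_spread(n)
        print(f"n per class = {n}: mean LOO error {mean:.3f}, std {sd:.3f}")
```

Under these assumptions the standard deviation of the estimate is noticeably larger for the smaller sample size, which is the difficulty the abstract describes: a single leave-one-out figure from a small sample can sit far from the true error of the designed classifier.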
Keywords: Design Issue, Epanechnikov Kernel, Partition, Computational Genomics, According