PPAM 2007: Parallel Processing and Applied Mathematics, pp. 1210-1219
A Parallel Classification and Feature Reduction Method for Biomedical Applications
Abstract
Classification is one of the most widely used methods in data mining, with numerous applications in biomedicine. The scope and resolution of the data involved in many real-life applications require very efficient implementations of classification methods, developed to run on parallel or distributed computational systems. In this study we describe SVD-ReGEC, a fully parallel implementation, for distributed memory multicomputers, of a classification algorithm with feature reduction. The classification is based on the Regularized Generalized Eigenvalue Classifier (ReGEC), and the preprocessing stage is a filter method based on Singular Value Decomposition (SVD) that reduces the dimension of the space in which classification is performed. The implementation is tested on random datasets, and the results are discussed using standard parameters.
Keywords
Binary classification, Generalized Eigenvalue Classifier, Feature transformation
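The two stages named in the abstract, SVD-based feature reduction followed by a generalized eigenvalue classifier, can be illustrated with a minimal single-process sketch in NumPy. It is not the parallel SVD-ReGEC implementation: the function names (`svd_basis`, `fit_planes`, `plane_distance`, `predict`, `demo_accuracy`), the synthetic data, and the plain Tikhonov term `delta * I` are all assumptions made for illustration, and the regularization used by ReGEC itself may differ. Each class is represented by a plane x·w − γ = 0 that lies close to its own points and far from the other class, and a point is assigned to the class whose plane is nearer.

```python
import numpy as np


def svd_basis(X, k):
    """Return the column mean of X and its top-k right singular vectors."""
    mean = X.mean(axis=0)
    # Thin SVD of the centered data; rows of Vt are orthonormal directions.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k].T


def fit_planes(A, B, delta=1e-3):
    """Fit one plane near class A and one near class B.

    Solves G z = lambda H z once, with a Tikhonov term delta*I added to
    both matrices (an assumption of this sketch, not the paper's scheme).
    """
    Ae = np.hstack([A, -np.ones((A.shape[0], 1))])  # rows [x, -1]
    Be = np.hstack([B, -np.ones((B.shape[0], 1))])
    eye = np.eye(Ae.shape[1])
    G = Ae.T @ Ae + delta * eye
    H = Be.T @ Be + delta * eye
    # Reduce to a standard symmetric problem via the Cholesky factor of H.
    L = np.linalg.cholesky(H)
    Linv = np.linalg.inv(L)
    _, Y = np.linalg.eigh(Linv @ G @ Linv.T)  # eigenvalues in ascending order
    Z = Linv.T @ Y
    # Smallest ratio: plane near A, far from B. Largest ratio: the reverse.
    return Z[:, 0], Z[:, -1]


def plane_distance(X, z):
    """Euclidean distance of each row of X to the plane x.w - gamma = 0."""
    w, gamma = z[:-1], z[-1]
    return np.abs(X @ w - gamma) / np.linalg.norm(w)


def predict(X, z_a, z_b):
    """Label 0 if a point is nearer the class-A plane, otherwise 1."""
    return np.where(plane_distance(X, z_a) <= plane_distance(X, z_b), 0, 1)


def demo_accuracy(seed=0, n=100, dim=50, k=2):
    """Train and test on synthetic Gaussian classes (hypothetical data)."""
    rng = np.random.default_rng(seed)
    shift = np.zeros(dim)
    shift[:5] = 2.0  # the classes differ only in the first five features

    def sample():
        return (rng.normal(size=(n, dim)) + shift,
                rng.normal(size=(n, dim)) - shift)

    A_train, B_train = sample()
    A_test, B_test = sample()
    # Filter step: learn the reduced basis on training data only.
    mean, V = svd_basis(np.vstack([A_train, B_train]), k)
    z_a, z_b = fit_planes((A_train - mean) @ V, (B_train - mean) @ V)
    X_test = (np.vstack([A_test, B_test]) - mean) @ V
    y_test = np.concatenate([np.zeros(n, dtype=int), np.ones(n, dtype=int)])
    return float(np.mean(predict(X_test, z_a, z_b) == y_test))
```

Calling `demo_accuracy()` returns the fraction of correctly labeled test points. Because the reduction happens first, the eigenproblem has size (k + 1) × (k + 1) rather than (dim + 1) × (dim + 1), which is the motivation for placing the SVD filter ahead of the classifier.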