GPU Computing and Applications, pp. 235–247
Soft Computing Methods for Big Data Problems
Abstract
Big data computing deals with massive, high-dimensional data such as DNA microarray data, financial data, medical imagery, satellite imagery, and hyperspectral imagery. It therefore needs advanced technologies and methods that reduce computational time while extracting valuable information without information loss. In this context, machine learning (ML) algorithms are commonly used to learn and extract useful information from large volumes of data. However, ML algorithms such as neural networks are computationally expensive, and the central processing unit (CPU) is typically unable to cope with these requirements. Faster solutions therefore call for high-performance hardware such as the graphics processing unit (GPU). GPUs provide remarkable performance gains compared to CPUs, and they are relatively inexpensive, widely available, and scalable. Since 2006, NVIDIA has simplified the GPU programming model through the Compute Unified Device Architecture (CUDA), which offers accessible programming interfaces and supports industry-standard languages such as C and C++. Since then, ML algorithms running on general-purpose graphics processing units (GPGPU) have been applied to various applications, including signal and image pattern classification in the biomedical area. The importance of fast analysis for distinguishing cancer from non-cancer cases is the motivation of this study. Accordingly, we propose two soft computing methods, the self-organizing map (SOM) and multiple back-propagation (MBP), for big data, particularly for biomedical classification problems. Big data in the form of gene expression datasets are processed on both a high-performance computer and Fermi-architecture graphics hardware. In our experiments, MBP and SOM on a Tesla GPU achieve shorter computing times than the high-performance computer while delivering feasible classification performance.
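As a rough illustration of the SOM named in the abstract, the following is a minimal CPU-only sketch in plain Python. It is not the authors' GPU implementation: the function names (`best_matching_unit`, `train_som`), the linear decay of the learning rate and neighborhood radius, and the Gaussian neighborhood are all assumptions chosen for brevity. The inner loop over map nodes is the part that a GPU version would typically run in parallel.

```python
import math
import random


def best_matching_unit(weights, sample):
    """Return the index of the node whose weight vector is closest to sample."""
    def sq_dist(w):
        return sum((wi - xi) ** 2 for wi, xi in zip(w, sample))
    return min(range(len(weights)), key=lambda i: sq_dist(weights[i]))


def train_som(data, rows, cols, epochs=20, lr0=0.5, sigma0=None, seed=0):
    """Train a rows x cols SOM with online updates (hypothetical helper).

    Returns (weights, coords): one weight vector and one grid coordinate
    per map node, in the same order.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = sigma0 or max(rows, cols) / 2.0
    coords = [(r, c) for r in range(rows) for c in range(cols)]
    # Random initial weights in [0, 1); assumes inputs are scaled similarly.
    weights = [[rng.random() for _ in range(dim)] for _ in coords]

    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for sample in data:
            # Assumed schedule: linear decay of learning rate and radius.
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)
            sigma = sigma0 * (1.0 - frac) + 1e-3

            bmu = best_matching_unit(weights, sample)
            br, bc = coords[bmu]

            # Pull every node toward the sample, scaled by its grid
            # distance to the best-matching unit (Gaussian neighborhood).
            for i, (r, c) in enumerate(coords):
                grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                w = weights[i]
                for k in range(dim):
                    w[k] += lr * h * (sample[k] - w[k])
            step += 1
    return weights, coords
```

After training, calling `best_matching_unit(weights, sample)` maps a new sample to a node; for classification, each node would additionally need a class label, for example the majority label of the training samples it wins.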
Keywords
GPGPU, Big data, Soft computing, SOM, MBP, Biomedical classification problems
Acknowledgments
This work is supported by the Ministry of Higher Education (MOHE) under the Long Term Research Grant Scheme (LRGS/TD/2011/UTM/ICT/03—4L805). The authors would like to thank the Research Management Centre (RMC), Universiti Teknologi Malaysia (UTM), for its support in R&D, and the Soft Computing Research Group (SCRG) for the inspiration in making this study a success. The authors would also like to thank the anonymous reviewers, who have contributed enormously to this work.