Soft Computing Methods for Big Data Problems

  • Shafaatunnur Hasan
  • Siti Mariyam Shamsuddin
  • Noel Lopes
Chapter

Abstract

Big data computing generally deals with massive, high-dimensional data such as DNA microarray data, financial data, medical imagery, satellite imagery, and hyperspectral imagery. It therefore needs advanced technologies or methods that reduce computational time while extracting valuable information without information loss. In this context, machine learning (ML) algorithms are commonly used to learn and find useful information in large volumes of data. However, ML algorithms such as neural networks are computationally expensive, and the central processing unit (CPU) is typically unable to cope with these requirements. Faster solutions therefore call for high-performance hardware such as the graphics processing unit (GPU). GPUs provide remarkable performance gains compared to CPUs, and they are relatively inexpensive, widely available, and scalable. Since 2006, NVIDIA has simplified the GPU programming model with the Compute Unified Device Architecture (CUDA), which supports accessible programming interfaces and industry-standard languages such as C and C++. Since then, ML algorithms running on general-purpose graphics processing units (GPGPU) have been applied in various applications, including signal and image pattern classification in the biomedical area. The importance of quickly distinguishing cancer from non-cancer cases is the motivation for this study. Accordingly, we propose two soft computing methods, the self-organizing map (SOM) and multiple back-propagation (MBP), for big data, particularly for biomedical classification problems. Gene expression datasets are processed on a high-performance computer and on Fermi-architecture graphics hardware. In the experiments, MBP and SOM on a Tesla GPU achieve faster computing times than on the high-performance computer, with feasible results in terms of speed and classification performance.

Keywords

GPGPU, Big data, Soft computing, SOM, MBP, Biomedical classification problems

Notes

Acknowledgments

This work is supported by the Ministry of Higher Education (MOHE) under the Long Term Research Grant Scheme (LRGS/TD/2011/UTM/ICT/03—4L805). The authors would like to thank the Research Management Centre (RMC), Universiti Teknologi Malaysia (UTM), for its support in R&D, and the Soft Computing Research Group (SCRG) for the inspiration in making this study a success. The authors would also like to thank the anonymous reviewers, who have contributed enormously to this work.

Copyright information

© Springer Science+Business Media Singapore 2015

Authors and Affiliations

  • Shafaatunnur Hasan (1)
  • Siti Mariyam Shamsuddin (1)
  • Noel Lopes (1, 2)

  1. UTM Big Data Centre, Universiti Teknologi Malaysia, Skudai, Malaysia
  2. CISUC, University of Coimbra, Coimbra, Portugal