The Journal of Supercomputing, Volume 72, Issue 10, pp 3868–3886

GPU-enabled back-propagation artificial neural network for digit recognition in parallel

  • Ricardo Brito
  • Simon Fong (corresponding author)
  • Kyungeun Cho
  • Wei Song
  • Raymond Wong
  • Sabah Mohammed
  • Jinan Fiaidhi

Abstract

In this paper, we show that the GPU (graphics processing unit) can be used not only for processing graphics, but also for high-speed computing. We compare the times taken on the CPU and the GPU to train and test a back-propagation artificial neural network. We implemented two neural networks for recognizing handwritten digits: one consists of serial code executed on the CPU, while the other is a GPU-based version of the same system that executes in parallel. As an experiment for performance evaluation, we developed a system for neural network training on the GPU to reduce training time. The system is built on CUDA (compute unified device architecture), a programming environment that allows a programmer to write code that runs on an NVIDIA GPU card. Our results from an experiment in digit image recognition with a neural network confirm the speed-up gained by tapping the resources of the GPU. Our proposed model has the advantage of simplicity, while performing on par with state-of-the-art algorithms.
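
The abstract does not include the authors' kernels, so the following is only a minimal CUDA sketch of the kind of data parallelism such a system exploits: one thread per output neuron computing a single layer's forward pass (weighted sum plus sigmoid). The kernel name forwardLayer and the layer sizes (784 inputs for a 28x28 digit image, 10 outputs) are illustrative assumptions, not the paper's implementation.

    #include <cstdio>
    #include <cmath>
    #include <cuda_runtime.h>

    // Hypothetical example: forward pass of one fully connected layer.
    // One thread per output neuron: each thread accumulates the weighted
    // sum of the previous layer's activations and applies the sigmoid.
    __global__ void forwardLayer(const float *weights, // [nOut x nIn], row-major
                                 const float *input,   // [nIn]
                                 const float *bias,    // [nOut]
                                 float *output,        // [nOut]
                                 int nIn, int nOut)
    {
        int j = blockIdx.x * blockDim.x + threadIdx.x;
        if (j >= nOut) return;

        float sum = bias[j];
        for (int i = 0; i < nIn; ++i)
            sum += weights[j * nIn + i] * input[i];

        output[j] = 1.0f / (1.0f + expf(-sum)); // sigmoid activation
    }

    int main()
    {
        const int nIn = 784, nOut = 10; // e.g., a 28x28 digit image -> 10 classes
        float *w, *x, *b, *y;
        // Unified memory keeps the sketch short; a tuned version would
        // use explicit cudaMalloc/cudaMemcpy transfers.
        cudaMallocManaged(&w, nOut * nIn * sizeof(float));
        cudaMallocManaged(&x, nIn * sizeof(float));
        cudaMallocManaged(&b, nOut * sizeof(float));
        cudaMallocManaged(&y, nOut * sizeof(float));

        for (int i = 0; i < nOut * nIn; ++i) w[i] = 0.01f; // dummy weights
        for (int i = 0; i < nIn; ++i)        x[i] = 0.5f;  // dummy pixel values
        for (int j = 0; j < nOut; ++j)       b[j] = 0.0f;  // zero biases

        int threads = 128;
        int blocks  = (nOut + threads - 1) / threads;
        forwardLayer<<<blocks, threads>>>(w, x, b, y, nIn, nOut);
        cudaDeviceSynchronize();

        for (int j = 0; j < nOut; ++j) printf("y[%d] = %f\n", j, y[j]);

        cudaFree(w); cudaFree(x); cudaFree(b); cudaFree(y);
        return 0;
    }

A full back-propagation trainer would launch similar kernels for the backward pass and the weight updates; in practice the per-thread loop over inputs is usually replaced by a shared-memory tiled reduction or a cuBLAS matrix multiply, which typically yields much larger speed-ups.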

Keywords

Artificial neural networks · Parallel execution · NVIDIA CUDA

Acknowledgments

The authors are grateful for the financial support from the research grant “Peer-production approaches to e-Learning (PPAeL),” Grant No. FDCT 019/2011/A1, offered by the Macau Fundo para o Desenvolvimento das Ciências e da Tecnologia.

Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  • Ricardo Brito (1)
  • Simon Fong (1, corresponding author)
  • Kyungeun Cho (2)
  • Wei Song (3)
  • Raymond Wong (4)
  • Sabah Mohammed (5)
  • Jinan Fiaidhi (5)

  1. Department of Computer and Information Science, University of Macau, Macau SAR, China
  2. Department of Computer and Multimedia Engineering, Dongguk University, Seoul, Korea
  3. College of Information Engineering, North China University of Technology, Beijing, China
  4. School of Computer Science and Engineering, University of New South Wales, Sydney, Australia
  5. Department of Computer Science, Lakehead University, Thunder Bay, Canada
