International Journal of Parallel Programming, Volume 45, Issue 4, pp 760–779

The Parallelization of Back Propagation Neural Network in MapReduce and Spark

  • Yang Liu
  • Lixiong Xu
  • Maozhen Li


Artificial neural networks have proved effective for recognition, regression and classification tasks. A number of neural network implementations have been developed, for example the Hamming network, the Grossberg network and the Hopfield network. Among these, the back propagation neural network (BPNN) has become the most popular owing to its strong function approximation and generalization abilities. However, BPNN is both data intensive and computationally intensive, so its efficiency suffers significantly in current big data research. This paper therefore presents a parallel BPNN algorithm based on data separation in three distributed computing environments: Hadoop, HaLoop and Spark. In addition, ensemble techniques are employed to improve the accuracy of the algorithm. The algorithm is first evaluated on a small-scale cluster and then in a commercial cloud computing environment. The experimental results indicate that the proposed algorithm improves the efficiency of BPNN while maintaining its accuracy.
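The idea of data separation combined with an ensemble can be illustrated with a minimal single-machine sketch. It is not the authors' implementation: the round-robin split, the network size, the learning rate, the synthetic data and the use of majority voting as the ensemble rule are all assumptions made for illustration, and every function name (`split_data`, `train_bpnn`, `majority_vote`, and so on) is hypothetical. Each partition trains its own small BPNN, which plays the role of a mapper, and the predictions are then combined, which plays the role of a reducer.

```python
import math
import random


def sigmoid(z):
    # Clamp the input so math.exp cannot overflow.
    z = max(-60.0, min(60.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def init_network(n_in, n_hidden, rng):
    # Each row holds the weights of one neuron; the last entry is its bias.
    hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
              for _ in range(n_hidden)]
    output = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    return hidden, output


def forward(net, x):
    hidden, output = net
    # zip stops at the shorter sequence, so the bias is added separately.
    h = [sigmoid(sum(w * v for w, v in zip(row, x)) + row[-1]) for row in hidden]
    y = sigmoid(sum(w * v for w, v in zip(output, h)) + output[-1])
    return h, y


def train_bpnn(samples, n_hidden=3, epochs=200, lr=0.5, seed=0):
    # Plain online back propagation with a squared-error loss.
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    net = init_network(n_in, n_hidden, rng)
    hidden, output = net
    for _ in range(epochs):
        for x, target in samples:
            h, y = forward(net, x)
            delta_out = (y - target) * y * (1.0 - y)
            # Hidden deltas are computed before the output weights change.
            delta_hidden = [delta_out * output[j] * h[j] * (1.0 - h[j])
                            for j in range(n_hidden)]
            for j in range(n_hidden):
                output[j] -= lr * delta_out * h[j]
            output[-1] -= lr * delta_out
            for j, row in enumerate(hidden):
                for i in range(n_in):
                    row[i] -= lr * delta_hidden[j] * x[i]
                row[-1] -= lr * delta_hidden[j]
    return net


def split_data(samples, n_parts):
    # Assumed partitioning scheme: a simple round-robin split.
    return [samples[k::n_parts] for k in range(n_parts)]


def predict(net, x):
    return 1 if forward(net, x)[1] >= 0.5 else 0


def majority_vote(votes):
    # Assumed ensemble rule: class 1 wins only with a strict majority.
    return 1 if 2 * sum(votes) > len(votes) else 0


def train_ensemble(samples, n_parts):
    # "Map" step: one sub-network per partition. This runs sequentially here;
    # on a cluster each call would run on a different node.
    return [train_bpnn(part, seed=k)
            for k, part in enumerate(split_data(samples, n_parts))]


def ensemble_predict(nets, x):
    # "Reduce" step: combine the sub-network outputs.
    return majority_vote([predict(net, x) for net in nets])


def make_data(n, seed=42):
    # Synthetic two-class data: label 1 when x0 + x1 > 1.
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = [rng.random(), rng.random()]
        data.append((x, 1 if x[0] + x[1] > 1.0 else 0))
    return data


data = make_data(120)
nets = train_ensemble(data, 3)
correct = sum(ensemble_predict(nets, x) == label for x, label in data)
print("training accuracy:", correct / len(data))
```

In a Hadoop, HaLoop or Spark deployment the list comprehension in `train_ensemble` would be replaced by the framework's own map operation over data partitions; the sketch only shows the division of work, not any framework API.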


Neural network, MapReduce, Hadoop, HaLoop, Spark, Ensemble technique



The authors gratefully acknowledge the support of the National Natural Science Foundation of China (No. 51437003) and the National Basic Research Program (973) of China under Grant 2014CB340404.

Conflict of interest

The authors declare that there is no conflict of interest regarding the publication of this article.



Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  1. School of Electrical Engineering and Information Systems, Sichuan University, Chengdu, China
  2. Department of Electronic and Computer Engineering, Brunel University London, Uxbridge, UK
  3. The Key Laboratory of Embedded Systems and Service Computing, Tongji University, Shanghai, China
