Abstract
To improve the performance of the traditional support vector machine (SVM), this chapter proposes a method, referred to as MR-SVM, that parallelizes SVM training on MapReduce and mitigates the convergence problems introduced by data partitioning and distributed computation. By splitting the large dataset and computing the support vector set of each chunk concurrently across map units, MR-SVM improves processing capability and efficiency. The partial support vector sets are then combined to form the training set for global training in the reduce phase, and the current global optimum obtained by the reduce operation is fed back to each map unit to determine whether MR-SVM should proceed with another pass. This process iterates until MR-SVM converges to the global optimum; it is proved that convergence is reached within a finite number of iterations. Experimental results show that MR-SVM improves the data processing capability and efficiency of its traditional counterpart while maintaining high accuracy.
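The iterative scheme described in the abstract can be sketched as follows. This is a minimal, single-process simulation under stated assumptions, not the authors' implementation: it substitutes a Pegasos-style subgradient solver for the SMO solver, uses a linear kernel with no bias term, and treats points on or inside the margin as a chunk's support-vector set. All function names (`mr_svm`, `train_linear_svm`, `support_vectors`) are illustrative.

```python
def train_linear_svm(data, lam=0.01, epochs=200):
    """Pegasos-style subgradient trainer for a linear SVM (no bias term);
    a stand-in here for the chapter's SMO solver."""
    dim = len(data[0][0])
    w = [0.0] * dim
    t = 0
    for _ in range(epochs):
        for x, y in data:
            t += 1
            eta = 1.0 / (lam * t)
            margin = y * sum(wi * xi for wi, xi in zip(w, x))
            w = [(1.0 - eta * lam) * wi for wi in w]  # regularization shrink
            if margin < 1.0:                          # hinge-loss subgradient step
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w


def support_vectors(w, data, tol=0.1):
    """Points on or inside the margin (y * <w, x> <= 1) approximate the
    support-vector set; the tolerance is loose because the solver is
    approximate."""
    return [(x, y) for x, y in data
            if y * sum(wi * xi for wi, xi in zip(w, x)) <= 1.0 + tol]


def mr_svm(chunks, max_iters=10):
    """MR-SVM-style loop: map units train on their chunk plus the fed-back
    global support vectors; the reduce step retrains on the union of partial
    support-vector sets and iterates until that set stops changing."""
    feedback = []  # current global support vectors, fed back to map units
    w_global = None
    for _ in range(max_iters):
        # Map phase: each chunk trains locally and emits its support vectors.
        partial = []
        for chunk in chunks:
            local = chunk + feedback
            w_local = train_linear_svm(local)
            partial.extend(support_vectors(w_local, local))
        # Reduce phase: deduplicate partial sets and run the global training.
        merged = [(list(x), y) for x, y in sorted({(tuple(x), y)
                                                   for x, y in partial})]
        if not merged:  # degenerate case: fall back to the full dataset
            merged = [p for chunk in chunks for p in chunk]
        w_global = train_linear_svm(merged)
        new_feedback = support_vectors(w_global, merged)
        # Converged once the global support-vector set stabilizes.
        if ({(tuple(x), y) for x, y in new_feedback}
                == {(tuple(x), y) for x, y in feedback}):
            break
        feedback = new_feedback
    return w_global
```

In a real deployment the inner loops over `chunks` would run as MapReduce map tasks and the merge-and-retrain step as the reduce task; only the feedback set crosses iterations, which is what keeps the per-pass communication small.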
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Ma, Y., Wang, L., Li, L. (2014). A Parallel and Convergent Support Vector Machine Based on MapReduce. In: Wong, W.E., Zhu, T. (eds) Computer Engineering and Networking. Lecture Notes in Electrical Engineering, vol 277. Springer, Cham. https://doi.org/10.1007/978-3-319-01766-2_67
DOI: https://doi.org/10.1007/978-3-319-01766-2_67
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-01765-5
Online ISBN: 978-3-319-01766-2
eBook Packages: Engineering (R0)