Abstract
Machine learning algorithms benefit from the continuous improvement of programming models, including MPI, MapReduce, and PGAS. The k-Nearest Neighbors (k-NN) algorithm is a widely used machine learning algorithm, applied to supervised learning tasks such as classification. Several parallel implementations of k-NN have been proposed in the literature and in practice. However, on high-performance computing systems with high-speed interconnects, it is important to further accelerate existing k-NN designs by taking advantage of scalable programming models. To improve the performance of k-NN in large-scale environments with InfiniBand networks, this paper proposes several alternative hybrid MPI+OpenSHMEM designs and performs a systematic evaluation and analysis on typical workloads. The hybrid designs leverage one-sided memory access to overlap communication with computation better than the existing pure MPI design, and introduce better schemes for efficient buffer management. The implementation, based on the k-NN program from the MaTEx toolkit and built with MVAPICH2-X (Unified MPI+PGAS Communication Runtime over InfiniBand), shows up to 9.0% time reduction for training on the KDD Cup 2010 workload over 512 cores, and 27.6% time reduction for a small workload with balanced communication and computation. Experiments with varying numbers of cores show that our design maintains good scalability.
This research is supported in part by National Science Foundation grants #OCI-1148371, #CCF-1213084, #IIS-1447804 and #CNS-1419123.
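For readers unfamiliar with the classification task the abstract refers to, the following is a minimal serial sketch of k-NN classification. It is not the MaTEx implementation and contains none of the MPI or OpenSHMEM communication the paper is about; the function name `knn_classify`, the Euclidean distance metric, the majority-vote rule, and the toy data are all assumptions chosen for illustration.

```python
from collections import Counter
import heapq
import math


def knn_classify(train, query, k):
    """Return the majority label among the k training points nearest to query.

    train is a list of (feature_vector, label) pairs.
    Distances are Euclidean (an assumption; other metrics are possible).
    """
    if k < 1 or k > len(train):
        raise ValueError("k must be between 1 and the number of training points")
    # Select the k nearest points without sorting the whole training set.
    nearest = heapq.nsmallest(
        k, train, key=lambda item: math.dist(item[0], query)
    )
    # Majority vote over the labels of the k nearest points.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated clusters labelled "a" and "b".
TRAIN = [
    ((0.0, 0.0), "a"),
    ((0.1, 0.2), "a"),
    ((1.0, 1.0), "b"),
    ((0.9, 1.1), "b"),
    ((1.2, 0.8), "b"),
]

if __name__ == "__main__":
    print(knn_classify(TRAIN, (0.2, 0.1), k=3))
    print(knn_classify(TRAIN, (1.0, 0.9), k=3))
```

The distance computation over all training points is the dominant cost in this sketch. In a parallel setting the training data is typically partitioned across processes, which is where the communication and computation overlap described in the abstract becomes relevant.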
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Lin, J., Hamidouche, K., Zhang, J., Lu, X., Vishnu, A., Panda, D. (2015). Accelerating k-NN Algorithm with Hybrid MPI and OpenSHMEM. In: Gorentla Venkata, M., Shamis, P., Imam, N., Lopez, M. (eds) OpenSHMEM and Related Technologies. Experiences, Implementations, and Technologies. OpenSHMEM 2014. Lecture Notes in Computer Science, vol 9397. Springer, Cham. https://doi.org/10.1007/978-3-319-26428-8_11
DOI: https://doi.org/10.1007/978-3-319-26428-8_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-26427-1
Online ISBN: 978-3-319-26428-8
eBook Packages: Computer Science (R0)