Abstract
Performance and efficiency have recently become key requirements of computer architectures. Modern computers incorporate Graphics Processing Units (GPUs) to run data mining algorithms as well as other general-purpose computations. In this paper, different parallelization methods are analyzed and compared in order to understand their applicability. From multi-threading on shared memory to NVIDIA’s GPU accelerators, this work discusses the parallelization of data mining algorithms with respect to performance and efficiency. Performance is compared on both many-core systems and GPU accelerators for a distance measure algorithm applied to a relatively large data set. We optimize the way GPUs are used in heterogeneous systems to make them more suitable for big data mining applications with heavy distance calculations. Moreover, we focus on achieving higher utilization of GPU resources and better reuse of data. Our GPU implementation of the content-based similarity algorithm Signature Quadratic Form Distance (SQFD) outperforms its CPU counterparts by up to 50× and multi-threaded CPU implementations by up to 15×.
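For orientation, the distance measure named above can be sketched in a few lines. This is a minimal sequential sketch, not the implementation evaluated in the paper: it assumes the usual definition of SQFD over feature signatures (lists of centroid and weight pairs) with a Gaussian similarity function, and the names `gaussian_similarity`, `sqfd`, and the parameter `alpha` are illustrative choices made here.

```python
import math


def gaussian_similarity(a, b, alpha=1.0):
    """Similarity of two centroids: exp(-alpha * squared Euclidean distance).

    The Gaussian kernel and the default alpha are assumptions for this sketch.
    """
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-alpha * sq_dist)


def sqfd(sig_q, sig_o, alpha=1.0):
    """Signature Quadratic Form Distance between two feature signatures.

    Each signature is a list of (centroid, weight) pairs, where a centroid
    is a tuple of floats. The weights of the second signature are negated
    and both signatures are concatenated before the quadratic form is
    evaluated.
    """
    centroids = [c for c, _ in sig_q] + [c for c, _ in sig_o]
    weights = [w for _, w in sig_q] + [-w for _, w in sig_o]

    # Quadratic form w * A * w^T, where A[i][j] is the similarity of
    # centroid i and centroid j. Every (i, j) term is independent.
    total = 0.0
    for i, ci in enumerate(centroids):
        for j, cj in enumerate(centroids):
            total += weights[i] * weights[j] * gaussian_similarity(ci, cj, alpha)

    # Clamp tiny negative values caused by floating-point rounding.
    return math.sqrt(max(total, 0.0))
```

The nested loop evaluates one similarity per centroid pair, so the cost of a single distance grows quadratically with the combined signature size. Because the terms do not depend on each other, this kind of computation is a natural candidate for the multi-threaded and GPU parallelization discussed in the abstract.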
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Tarakji, A., Hassani, M., Lankes, S., Seidl, T. (2013). Using a Multitasking GPU Environment for Content-Based Similarity Measures of Big Data. In: Murgante, B., et al. Computational Science and Its Applications – ICCSA 2013. ICCSA 2013. Lecture Notes in Computer Science, vol 7975. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39640-3_13
DOI: https://doi.org/10.1007/978-3-642-39640-3_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-39639-7
Online ISBN: 978-3-642-39640-3
eBook Packages: Computer Science, Computer Science (R0)