Abstract
Class-attribute interdependence maximization (CAIM) is one of the state-of-the-art algorithms for discretizing data for which classes are known. However, it may take a long time when run on high-dimensional large-scale data, with large number of attributes and/or instances. This paper presents a solution to this problem by introducing a graphic processing unit (GPU)-based implementation of the CAIM algorithm that significantly speeds up the discretization process on big complex data sets. The GPU-based implementation is scalable to multiple GPU devices and enables the use of concurrent kernels execution capabilities of modern GPUs. The CAIM GPU-based model is evaluated and compared with the original CAIM using single and multi-threaded parallel configurations on 40 data sets with different characteristics. The results show great speedup, up to 139 times faster using four GPUs, which makes discretization of big data efficient and manageable. For example, discretization time of one big data set is reduced from 2 h to \(<\)2 min.
Similar content being viewed by others
Notes
The data sets description, the algorithm’s source code, the experimental settings and results are fully described and publicly available to facilitate the replicability of the experiments and future comparisons at the website: http://www.uco.es/grupos/kdis/kdiswiki/CAIM-GPU.
References
Akl SG (1990) Parallel sorting algorithms., Notes and reports in computer science and applied mathematicsAcademic Press, Orlando
Alcalá-Fdez J, Fernandez A, Luengo J, Derrac J, García S, Sánchez L, Herrera F (2011) KEEL data-mining software tool: data set repository, integration of algorithms and experimental analysis framework. Analysis framework. J Mult Valued Logic Soft Comput 17:255–287
Angiulli F, Pizzuti C (2005) Outlier mining in large high-dimensional data sets. IEEE Trans Knowl Data Eng 17(2):203–215
Bentley JL, McIlroy MD (1993) Engineering a sort function. Softw Pract Exp 23(11):1249–1265
Bernaschi M, Bisson M, Fatica M, Phillips E (2012) An introduction to multi-GPU programming for physicists. Eur Phys J Special Top 210:17–31
Brodtkorb AR, Hagen TR, Stra ML (2013) Graphics processing unit (GPU) programming strategies and trends in GPU computing. J Parallel Distrib Comput 73(1):4–13
Cannataro M, Talia D, Srimani P (2002) Parallel data intensive computing in scientific and commercial applications. Parallel Comput 28(5):673–704
Cano A, Luna JM, Ventura S (2013) High performance evaluation of evolutionary-mined association rules on GPUs. J Supercomput 66(3):1438–1461
Cano A, Zafra A, Ventura S (2012) Speeding up the evaluation phase of GP classification algorithms on GPUs. Soft Comput 16(2):187–202
Cederman D, Tsigas P (2010) GPU-quicksort: a practical quicksort algorithm for graphics processors. J Exp Algorithm 14:4–24
Cerquides J, Mantaras RLD (1997) Proposal and empirical comparison of a parallelizable distance-based discretization method. In: Proceedings of the international conference on knowledge discovery and data mining, pp 139–142
Che S, Boyer M, Meng J, Tarjan D, Sheaffer JW, Skadron K (2008) A performance study of general-purpose applications on graphics processors using CUDA. J Parallel Distrib Comput 68(10):1370–1380
Cios KJ, Pedrycz W, Swiniarski RW, Kurgan LA (2007) Data mining: a knowledge discovery approach. Springer
Cormen TH, Stein C, Rivest RL, Leiserson CE (2001) Introduction to Algorithms. In: 2nd edn. McGraw-Hill
Davidson A, Tarjan D, Garland M, Owens JD (2012) Efficient parallel merge sort for fixed and variable length keys. In: Proceedings of international conference on innovative parallel computing, pp 1–9
Frank A, Asuncion A (2010) UCI machine learning repository
Freitas AA, Lavington SH (1998) Mining very large databases with parallel processing. In: Kluwer international series on advances in database systems, vol 8. Kluwer
García S, Luengo J, Saez J, Lopez V, Herrera F (2013) A survey of discretization techniques: taxonomy and empirical analysis in supervised learning. IEEE Trans Knowl Data Eng 25(4):734–750
Garland M, Le Grand S, Nickolls J, Anderson J, Hardwick J, Morton S, Phillips E, Zhang Y, Volkov V (2008) Parallel computing experiences with CUDA. IEEE Micro 28(4):13–27
Gómez-Luna J, González-Linares J, Benavides J, Guil N (2012) Performance models for asynchronous data transfers on consumer graphics processing units. J Parallel Distrib Comput 72(9):1117–1126
Green I, Robert C, Wang L, Alam M, Formato RA (2012) Central force optimization on a GPU: a case study in high performance metaheuristics. J Supercomput 62(1):378–398
Hoare CAR (1962) Quicksort. Comput J 5(1):10–16
Hoberock J, Bell N (2011) Thrust: a productivity-oriented library for CUDA. In: Chapter 26, Morgan Kaufmann, pp 359–372
Jian L, Wang C, Liu Y, Liang S, Yi W, Shi Y (2013) Parallel data mining techniques on graphics processing unit with compute unified device architecture (CUDA). J Supercomput 64(3):942–967
Khan FG, Khan OU, Montrucchio B, Giaccone P (2011) Analysis of fast parallel sorting algorithms for GPU architectures. In: Proceedings of the international conference on frontiers of information technology, pp 173–178
Kirk DB, Hwu W-MW (2010) Programming massively parallel processors: a hands-on approach. Morgan Kaufmann
Knuth DE (1998) The art of computer programming. In: Sorting and searching, vol 3. 2nd edn. Addison Wesley
Kurgan LA, Cios KJ (2004) CAIM discretization algorithm. IEEE Trans Knowl Data Eng 16(2):145–153
Li M, Deng S, Feng S, Fan J (2011) An effective discretization based on class-attribute coherence maximization. Pattern Recognit Lett 32(15):1962–1973
Merrill D, Grimshaw A (2010) Revisiting sorting for GPGPU stream architectures. In: Proceedings of the international conference on parallel architectures and compilation techniques, pp 545–546
Merrill D, Grimshaw A (2010) Revisiting sorting for GPGPU stream architectures. In: Technical report CS2010-03, University of Virginia, Department of Computer Science, Charlottesville
Merrill D, Grimshaw A (2011) High performance and scalable radix sorting: a case study of implementing dynamic parallelism for GPU computing. Parallel Process Lett 21(2):245–272
Navarro CA, Hitschfeld-Kahler N, Mateu L (2014) A survey on parallel computing and its applications in data-parallel problems using GPU architectures. Commun Comput Phys 15(2):285–329
NVIDIA Corporation (2013) NVIDIA CUDA programming and best practices guide. http://www.nvidia.com/cuda
Owens JD, Luebke D, Govindaraju N, Harris M, Krüger J, Lefohn AE, Purcell TJ (2007) A survey of general-purpose computation on graphics hardware. Comput Graph Forum 26(1):80–113
Parthasarathy S, Ramakrishnan A (2002) Parallel incremental 2d-discretization on dynamic datasets. In: Proceedings of the international conference on parallel and distributed processing systems, pp 247–254
Peters H, Schulz-Hildebrandt O, Luttenberger N (2011) Fast in-place, comparison-based sorting with CUDA: a study with bitonic sort. Concur Comput Pract Exp 23(7):681–693
Rajaraman A, Ullman JD (2011) Mining of massive datasets. In: Cambridge University Press
Satish N, Harris M, Garland M (2009) Designing efficient sorting algorithms for manycore GPUs. In: Proceedings of the IEEE international symposium on parallel & distributed processing, pp 1–10
Schellmann M, Gorlatch S, Meilnder D, Ksters T, Schfers K, Wbbeling F, Burger M (2011) Parallel medical image reconstruction: from graphics processing units (GPU) to grids. J Supercomput 57(2):151–160
Shams R, Sadeghi P, Kennedy R, Hartley R (2010) A survey of medical image registration on multicore and the GPU. IEEE Signal Process Mag 27(2):50–60
Sintorn E, Assarsson U (2008) Fast parallel GPU-sorting using a hybrid algorithm. J Parallel Distrib Comput 68(10):1381–1388
Sriwanna K, Puntumapon K, Waiyamai K (2012) An enhanced class-attribute interdependence maximization discretization algorithm. In: Proceedings of the 8th international conference on advanced data mining and applications, vol 7713. LNAI, pp 465–476
Tatarchuk N, Shopf J, DeCoro C (2008) Advanced interactive medical visualization on the GPU. J Parallel Distrib Comput 68(10):1319–1328
Upadhyaya SR (2013) Parallel approaches to machine learning-a comprehensive survey. J Parallel Distrib Comput 73(3):284–292
Wittek P, Darnyi S (2013) Accelerating text mining workloads in a mapreduce-based distributed GPU environment. J Parallel Distrib Comput 73(2):198–206
Yang Y, Webb GI, Wu X (2010) Discretization methods. In: Data mining and knowledge discovery handbook, pp 101–116
Yulong X, Xiaopeng W, Dawei X (2012) A two step parallel discretization algorithm based on dynamic clustering. In: Proceedings of the international conference on computer science and electronics engineering, vol 3. pp 192–196
Zaki MJ, Ho CT (2000) Large-scale parallel data mining. In: State of the art survey. Lecture notes in artificial intelligence, Springer
Zhang Y, Mueller F, Cui X, Potok T (2011) Data-intensive document clustering on graphics processing unit (GPU) clusters. J Parallel Distrib Comput 71(2):211–224
Zhao Y, Niu Z, Peng X, Dai L (2011) A discretization algorithm of numerical attributes for digital library evaluation based on data mining technology. In: Digital libraries: for cultural heritage, knowledge dissemination, and future creation, vol 7008. LNCS, pp 70–76
Acknowledgments
This work has been supported by the Grant from the National Institutes of Health 1R01HD056235-01A1 (KJC), the Regional Government of Andalusia and the Ministry of Science and Technology project TIN-2011-22408 (SV), and the Ministry of Education FPU AP2010-0042 (AC). The authors also thank Duane Merrill and Sean Baxter from NVIDIA for their advise on improving efficiency of the sorting methods.
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Cano, A., Ventura, S. & Cios, K.J. Scalable CAIM discretization on multiple GPUs using concurrent kernels. J Supercomput 69, 273–292 (2014). https://doi.org/10.1007/s11227-014-1151-8
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11227-014-1151-8