Abstract
Big Data frameworks enable powerful distributed computations, extending the results achievable on a single machine. In this work, we present a novel distributed associative classifier, named BAC, based on ensemble techniques. Ensembles are a popular approach in which several models are built on different subsets of the original dataset and then vote to produce a single classification outcome. Preliminary experiments on Apache Spark show that the proposed ensemble classifier achieves a quality comparable with the single-machine version on popular real-world datasets, and overcomes its scalability limits on large synthetic datasets.
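The general bagging idea mentioned in the abstract (several models trained on different subsets, then a vote) can be illustrated with a minimal single-machine sketch. This is not the BAC algorithm and it does not use Apache Spark: the base learner `train_rule_model` is a hypothetical toy that keeps one rule per (attribute, value) item, and all function names, parameters, and the sample data are assumptions made purely for illustration.

```python
import random
from collections import Counter, defaultdict


def train_rule_model(records):
    """Toy base learner (assumption, not BAC's rule miner): for each
    (attribute, value) item, keep the class it most often co-occurs with."""
    item_class_counts = defaultdict(Counter)
    class_counts = Counter()
    for features, label in records:
        class_counts[label] += 1
        for item in features.items():
            item_class_counts[item][label] += 1
    # Fallback class used when no rule matches a record.
    default_class = class_counts.most_common(1)[0][0]
    rules = {item: counts.most_common(1)[0][0]
             for item, counts in item_class_counts.items()}
    return rules, default_class


def predict_one(model, features):
    """Classify a record with a single model: matching rules vote."""
    rules, default_class = model
    votes = Counter(rules[item] for item in features.items() if item in rules)
    return votes.most_common(1)[0][0] if votes else default_class


def train_bagged(records, n_models=5, sample_fraction=1.0, seed=0):
    """Train n_models base learners, each on a bootstrap sample
    (drawn with replacement) of the original records."""
    rng = random.Random(seed)
    sample_size = max(1, int(len(records) * sample_fraction))
    models = []
    for _ in range(n_models):
        sample = [rng.choice(records) for _ in range(sample_size)]
        models.append(train_rule_model(sample))
    return models


def predict_bagged(models, features):
    """Majority vote over the predictions of all models in the ensemble."""
    votes = Counter(predict_one(model, features) for model in models)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical toy dataset: the class is fully determined by "color".
    data = ([({"color": "red"}, "A")] * 10) + ([({"color": "blue"}, "B")] * 10)
    ensemble = train_bagged(data, n_models=5)
    print(predict_bagged(ensemble, {"color": "red"}))
```

In a distributed setting, each bootstrap sample could in principle be processed by a different worker, since the base models are trained independently and only the final vote combines them; how BAC actually partitions the data and mines rules is described in the paper itself, not in this sketch.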
Acknowledgment
The research leading to these results has received funding from the European Union under the FP7 Grant Agreement n. 619633 (“ONTIC” Project).
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Venturini, L., Garza, P., Apiletti, D. (2016). BAC: A Bagged Associative Classifier for Big Data Frameworks. In: Ivanović, M., et al. New Trends in Databases and Information Systems. ADBIS 2016. Communications in Computer and Information Science, vol 637. Springer, Cham. https://doi.org/10.1007/978-3-319-44066-8_15
DOI: https://doi.org/10.1007/978-3-319-44066-8_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-44065-1
Online ISBN: 978-3-319-44066-8
eBook Packages: Computer Science, Computer Science (R0)