Abstract
The significance of addressing Big Data applications is beyond doubt. The ability to extract interesting knowledge from large volumes of information provides great advantages to both corporations and academia. Researchers and practitioners must therefore deal with the problem of scalability so that Machine Learning and Data Mining algorithms can address Big Data properly. To this end, the MapReduce programming framework is by far the most widely used mechanism for implementing fault-tolerant distributed applications. This framework imposes a divide-and-conquer design in which local models are learned separately in one stage (Map tasks), whereas a second stage (Reduce) is devoted to aggregating all sub-models into a single solution. In this paper, we analyze the behavior of Linguistic Fuzzy Rule Based Classification Systems when embedded into a MapReduce working procedure. By retrieving different information about the rules learned throughout the MapReduce process, we identify some of the capabilities of this particular paradigm that allow these systems to perform well when addressing Big Data problems. In summary, we show that linguistic fuzzy classifiers are a robust approach when scalability is required.
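The two-stage scheme described in the abstract can be illustrated with a minimal, single-machine sketch. It is not the implementation studied in the paper: the three-label uniform partition, the count-based rule weight, the "keep the higher-weight rule" fusion policy, and all function names (`fuzzify`, `map_learn_rules`, `reduce_merge`, `learn_rule_base`) are assumptions made only for illustration, and features are assumed to be scaled to [0, 1].

```python
from collections import Counter, defaultdict
from functools import reduce

# Assumed linguistic labels for every feature (hypothetical choice).
LABELS = ("low", "medium", "high")
CENTRES = (0.0, 0.5, 1.0)  # assumed uniform triangular partition on [0, 1]


def fuzzify(value):
    """Return the label whose triangular membership is highest for `value`."""
    memberships = [max(0.0, 1.0 - abs(value - c) / 0.5) for c in CENTRES]
    return LABELS[memberships.index(max(memberships))]


def map_learn_rules(partition):
    """Map stage: learn a local rule base from one data partition.

    Each rule maps an antecedent (tuple of labels) to (class, weight),
    where the weight is the fraction of covered examples in that class.
    """
    counts = defaultdict(Counter)
    for features, label in partition:
        antecedent = tuple(fuzzify(v) for v in features)
        counts[antecedent][label] += 1
    rules = {}
    for antecedent, class_counts in counts.items():
        consequent, hits = class_counts.most_common(1)[0]
        rules[antecedent] = (consequent, hits / sum(class_counts.values()))
    return rules


def reduce_merge(rules_a, rules_b):
    """Reduce stage: fuse two rule bases.

    Assumed policy: when two rules share an antecedent, keep the one
    with the higher weight.
    """
    merged = dict(rules_a)
    for antecedent, (consequent, weight) in rules_b.items():
        if antecedent not in merged or weight > merged[antecedent][1]:
            merged[antecedent] = (consequent, weight)
    return merged


def learn_rule_base(partitions):
    """Run the Map stage on every partition, then fold with the Reduce stage."""
    return reduce(reduce_merge, map(map_learn_rules, partitions), {})
```

In a real deployment the `map` and `reduce` calls would be executed by the distributed framework over data blocks rather than by Python built-ins; the sketch only shows how independently learned rule bases can be merged into a single one.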
Rights and permissions
This is an open access article under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
About this article
Cite this article
Fernández, A., Altalhi, A., Alshomrani, S. et al. Why Linguistic Fuzzy Rule Based Classification Systems perform well in Big Data Applications?. Int J Comput Intell Syst 10, 1211–1225 (2017). https://doi.org/10.2991/ijcis.10.1.80
DOI: https://doi.org/10.2991/ijcis.10.1.80