Abstract
The term big data describes the recent exponential growth of data, which poses an immense challenge for traditional learning techniques. To deal with big data classification problems, we propose the Chi-FRBCS-BigData algorithm, a linguistic fuzzy rule-based classification system that uses the MapReduce framework to learn and fuse rule bases. It has been developed in two versions that differ in their fusion process. An experimental study shows that the proposal can handle these problems while providing competitive results.
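The abstract does not say how the two fusion processes differ, so the following is only a minimal sketch of what fusing rule bases in a reduce step could look like. It assumes that each map task has already learned a rule base from its data partition, that a rule is a hypothetical `(antecedent, class_label, weight)` tuple, and that conflicts between rules with the same antecedent are resolved either by keeping the highest-weighted rule or by averaging weights per class. The function names `fuse_max` and `fuse_average` are illustrative and not taken from the paper.

```python
from collections import defaultdict

# Hypothetical rule format: (antecedent, class_label, weight), where the
# antecedent is a tuple of linguistic labels, one per input attribute.

def fuse_max(rule_bases):
    """For each antecedent, keep the single rule with the highest weight."""
    best = {}
    for rule_base in rule_bases:
        for antecedent, label, weight in rule_base:
            if antecedent not in best or weight > best[antecedent][1]:
                best[antecedent] = (label, weight)
    return best

def fuse_average(rule_bases):
    """For each antecedent, average the weights per class and keep the
    class with the highest average weight."""
    weights = defaultdict(lambda: defaultdict(list))
    for rule_base in rule_bases:
        for antecedent, label, weight in rule_base:
            weights[antecedent][label].append(weight)
    fused = {}
    for antecedent, by_label in weights.items():
        averages = {label: sum(ws) / len(ws) for label, ws in by_label.items()}
        winner = max(averages, key=averages.get)
        fused[antecedent] = (winner, averages[winner])
    return fused

# Three rule bases standing in for the output of three map tasks.
# All three contain a rule with the same antecedent but disagree on the class.
antecedent = ("low", "high")
rule_bases = [
    [(antecedent, "A", 0.875)],
    [(antecedent, "A", 0.25)],
    [(antecedent, "B", 0.75)],
]

max_fused = fuse_max(rule_bases)
avg_fused = fuse_average(rule_bases)
print(max_fused[antecedent])
print(avg_fused[antecedent])
```

In this toy input the two strategies disagree: the maximum-weight fusion keeps class A because one partition produced a very confident rule, while the averaging fusion prefers class B because the support for A is weaker across partitions.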
Rights and permissions
This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
del Río, S., López, V., Benítez, J.M. et al. A MapReduce Approach to Address Big Data Classification Problems Based on the Fusion of Linguistic Fuzzy Rules. Int J Comput Intell Syst 8, 422–437 (2015). https://doi.org/10.1080/18756891.2015.1017377
DOI: https://doi.org/10.1080/18756891.2015.1017377