Handling Big Data with Fuzzy Based Classification Approach

  • Conference paper
Advance Trends in Soft Computing

Part of the book series: Studies in Fuzziness and Soft Computing (STUDFUZZ, volume 312)

Abstract

Big data is a collection of very large and complex data that is difficult to load into computer memory. The major challenges include searching, categorizing, and analyzing such data. In this paper, a fuzzy-based supervised classifier is proposed to handle the searching, storage, and categorization of big data. Within this classifier, we propose a Random Sampling Iterative Optimization Fuzzy c-Means (RSIO-FCM) clustering algorithm, which partitions the big data into subsets that together cover all instances (the object space). Clustering is then performed on these subsets by feeding the centers of an already clustered subset forward to group the remaining subsets. A classifier based on Bayesian theory then assigns labels to the resulting clusters and predicts the labels of unknown instances. The proposed approach forms effective clusters and eliminates the problem of overlapping cluster centers faced by the algorithm discussed in [1], Simple Random Sampling plus Extension FCM (rseFCM). The effectiveness of the proposed clustering algorithm relative to rseFCM is evaluated on two very large benchmark datasets in terms of the fuzzification parameter m, objective function, computational time, and accuracy. Experimental results demonstrate that RSIO-FCM generates more appropriate cluster center locations and, as a result, achieves better classification accuracy than rseFCM. Thus, it is observed that cluster center locations have a significant impact on classification results.
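
The partition-and-feed-forward idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the details of RSIO-FCM, so the code below is not the authors' algorithm: it assumes a random split into equally sized subsets, standard fuzzy c-means updates on each subset, and the final centers of one subset used as the starting centers for the next. The names `fcm` and `chunked_fcm` and all parameter defaults are hypothetical, and the Bayesian labeling step is not covered.

```python
import random


def fcm(points, centers, m=2.0, max_iter=100, tol=1e-6):
    """Standard fuzzy c-means on `points`, started from the given `centers`."""
    dim = len(points[0])
    for _ in range(max_iter):
        # Membership update from squared distances:
        # u_ik = d2_ik^(-1/(m-1)) / sum_j d2_jk^(-1/(m-1))
        memberships = []
        for x in points:
            d2 = [max(sum((a - b) ** 2 for a, b in zip(x, v)), 1e-12)
                  for v in centers]
            inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
            total = sum(inv)
            memberships.append([w / total for w in inv])
        # Center update: mean of the points weighted by u_ik^m.
        new_centers = []
        for i in range(len(centers)):
            weights = [u[i] ** m for u in memberships]
            wsum = sum(weights)
            new_centers.append([
                sum(w * x[d] for w, x in zip(weights, points)) / wsum
                for d in range(dim)
            ])
        # Stop once no center moves more than `tol`.
        shift = max(
            sum((a - b) ** 2 for a, b in zip(v, w)) ** 0.5
            for v, w in zip(centers, new_centers)
        )
        centers = new_centers
        if shift < tol:
            break
    return centers


def chunked_fcm(data, n_clusters, n_subsets, m=2.0, seed=0):
    """Assumed scheme: cluster random subsets in turn, feeding centers forward."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    # Random partition: every instance lands in exactly one subset.
    subsets = [shuffled[i::n_subsets] for i in range(n_subsets)]
    # Assumption: initial centers are drawn at random from the first subset.
    centers = rng.sample(subsets[0], n_clusters)
    for subset in subsets:
        # The centers found on one subset seed the clustering of the next.
        centers = fcm(subset, centers, m=m)
    return centers


if __name__ == "__main__":
    rng = random.Random(1)
    blob_a = [[rng.gauss(0, 0.5), rng.gauss(0, 0.5)] for _ in range(300)]
    blob_b = [[rng.gauss(5, 0.5), rng.gauss(5, 0.5)] for _ in range(300)]
    print(sorted(chunked_fcm(blob_a + blob_b, n_clusters=2, n_subsets=3)))
```

Only one subset is held in the inner loop at a time, which is the point of such schemes when the full dataset does not fit in memory; a real implementation would stream subsets from disk instead of slicing an in-memory list.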

References

  1. Havens, T.C., Bezdek, J.C., Leckie, C., Hall, L.O., Palaniswami, M.: Fuzzy c-Means Algorithms for Very Large Data. IEEE Trans. Fuzzy Syst. 20(6), 1130–1146 (2012)

  2. Cai, W., Chen, S., Zhang, D.: A Multiobjective Simultaneous Learning Framework for Clustering and Classification. IEEE Trans. on Neural Networks 21(2), 185–200 (2010)

  3. Kaufman, L., Rousseeuw, P.: Finding Groups in Data: An Introduction to Cluster Analysis. Wiley-Blackwell, New York (2005)

  4. Guha, S., Rastogi, R., Shim, K.: CURE: An efficient clustering algorithm for large databases. Inf. Syst. 26(1), 35–58 (2001)

  5. Har-Peled, S., Mazumdar, S.: On coresets for k-means and k-median clustering. In: Proc. ACM Symp. Theory Comput., pp. 291–300 (2004)

  6. Shankar, B.U., Pal, N.: FFCM: An efficient approach for large data sets. In: Proc. Int. Conf. Fuzzy Logic, Neural Nets, Soft Comput., Fukuoka, Japan, p. 332 (1994)

  7. Cheng, T., Goldgof, D., Hall, L.: Fast clustering with application to fuzzy rule generation. In: Proc. Int. Conf. Fuzzy Syst., Tokyo, Japan, pp. 2289–2295 (1995)

  8. Blake, C., Keogh, E., Merz, C.J.: UCI Repository of Machine learning Databases. Dept. Inf. Comput. Sci., Univ. California Irvine, Irvine (1998), http://www.ics.uci.edu/~mlearn/MLRepository.html

Author information

Correspondence to Neha Bharill.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Bharill, N., Tiwari, A. (2014). Handling Big Data with Fuzzy Based Classification Approach. In: Jamshidi, M., Kreinovich, V., Kacprzyk, J. (eds) Advance Trends in Soft Computing. Studies in Fuzziness and Soft Computing, vol 312. Springer, Cham. https://doi.org/10.1007/978-3-319-03674-8_21

  • DOI: https://doi.org/10.1007/978-3-319-03674-8_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-03673-1

  • Online ISBN: 978-3-319-03674-8

  • eBook Packages: Engineering, Engineering (R0)
