Multi-objective Fuzzy-Swarm Optimizer for Data Partitioning

  • Conference paper
Advanced Computing and Intelligent Technologies

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 218)

Abstract

Data partitioning is the backbone of big data applications and is central to their performance. In recent years, many researchers have focused on data science and analysis for real-time applications that integrate big data. Partitioning big data manually is time-consuming, so partitions need to be elastic and scalable when handling a high workload in a distributed system. This paper proposes a multi-objective fuzzy-swarm optimization algorithm for cluster-based data partitioning. It also presents an analysis of the results of different optimization algorithms for data partitioning (that is, reduction or clustering), along with their limitations. The proposed approach aims to improve the efficiency of clustering large, complex data.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Goyal, S.B., Bedi, P., Rajawat, A.S., Shaw, R.N., Ghosh, A. (2022). Multi-objective Fuzzy-Swarm Optimizer for Data Partitioning. In: Bianchini, M., Piuri, V., Das, S., Shaw, R.N. (eds) Advanced Computing and Intelligent Technologies. Lecture Notes in Networks and Systems, vol 218. Springer, Singapore. https://doi.org/10.1007/978-981-16-2164-2_25
