A Hybrid Clustering Technique to Improve Big Data Accessibility Based on Machine Learning Approaches

Ebadati, E. Omid Mahdi; Tabrizi, Mohammad Mortazavi

doi:10.1007/978-81-322-2755-7_43


Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 433)


Abstract

Big data refers to data sets that are larger or more complex than traditional ones and that are, in many cases, unstructured. Accessing a specific value in a huge data set that is not sorted or organized can be time consuming and can require substantial processing. As data grows, clustering becomes one of the most important unsupervised approaches for finding structure in data. In this paper, we demonstrate two approaches to cluster data with high accuracy; we then sort the data with the merge sort algorithm and, finally, use binary search to find a data value within a specific range of the data. This research presents a highly efficient combined method for big data that uses a genetic algorithm and k-means. After clustering with k-means, the total sum of the Euclidean distances is 3.37233e+09 for 4 clusters; after applying the genetic algorithm, this number reduces to 0.0300344 in the best fit. In the second and third stages, we show that after this implementation a particular data value can be accessed much faster and more accurately than with older methods.
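
The pipeline summarized in the abstract (k-means clustering, genetic refinement of the result, merge sort within clusters, then binary search) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy genetic operators (elitist selection, uniform crossover, Gaussian mutation), the parameter defaults, and the random 2-D sample data are all assumptions made for illustration. The objective is the total sum of Euclidean distances from each point to its nearest centroid, as in the abstract; note that standard k-means itself minimizes squared distances.

```python
import math
import random

def nearest(point, centroids):
    """Index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda j: math.dist(point, centroids[j]))

def total_distance(points, centroids):
    """Sum of Euclidean distances from each point to its nearest centroid."""
    return sum(math.dist(p, centroids[nearest(p, centroids)]) for p in points)

def group_by_cluster(points, centroids):
    """Partition points into one list per centroid."""
    groups = [[] for _ in centroids]
    for p in points:
        groups[nearest(p, centroids)].append(p)
    return groups

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd-style k-means; returns the final centroids."""
    centroids = random.Random(seed).sample(points, k)
    for _ in range(iters):
        groups = group_by_cluster(points, centroids)
        # An empty cluster keeps its previous centroid.
        updated = [tuple(sum(x) / len(g) for x in zip(*g)) if g else c
                   for g, c in zip(groups, centroids)]
        if updated == centroids:
            break
        centroids = updated
    return centroids

def genetic_refine(points, centroids, generations=30, pop_size=12, sigma=0.5, seed=0):
    """Toy genetic search over centroid sets, seeded with the k-means result."""
    rng = random.Random(seed)

    def fitness(cs):
        return total_distance(points, cs)

    def mutate(cs):
        return [tuple(x + rng.gauss(0, sigma) for x in c) for c in cs]

    population = [centroids] + [mutate(centroids) for _ in range(pop_size - 1)]
    for _ in range(generations):
        population.sort(key=fitness)
        parents = population[: pop_size // 2]  # elitist selection keeps the best
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            children.append(mutate(child))
        population = parents + children
    return min(population, key=fitness)

def merge_sort(values):
    """Return a new sorted list (top-down merge sort)."""
    if len(values) <= 1:
        return list(values)
    mid = len(values) // 2
    left, right = merge_sort(values[:mid]), merge_sort(values[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]

def binary_search(sorted_values, target):
    """Index of `target` in `sorted_values`, or -1 if absent."""
    lo, hi = 0, len(sorted_values) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_values[mid] == target:
            return mid
        if sorted_values[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

def lookup(index, centroids, target):
    """Return (cluster, position) of `target` in the sorted index, or None."""
    j = nearest(target, centroids)
    pos = binary_search(index[j], target)
    return (j, pos) if pos >= 0 else None

# Hypothetical usage on random 2-D data (not the data set used in the paper).
rng = random.Random(1)
points = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(200)]
refined = genetic_refine(points, kmeans(points, k=4))
index = [merge_sort(g) for g in group_by_cluster(points, refined)]
print(lookup(index, refined, points[17]))  # (cluster, position) of a known point
```

In this sketch a lookup first picks the cluster whose centroid is nearest to the target and then runs binary search only inside that cluster's sorted list, so the search covers one cluster rather than the whole data set. Because the genetic stage always carries the best centroid set forward, its result is never worse than the k-means starting point under the chosen objective.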



Author information

Authors and Affiliations

Department of Mathematics and Computer Science, Kharazmi University, #242, Somayeh Street, Between Qarani & Vila, Tehran, Iran
E. Omid Mahdi Ebadati
Department of Knowledge Engineering and Decision Science, Kharazmi University, #242, Somayeh Street, Between Qarani & Vila, Tehran, Iran
Mohammad Mortazavi Tabrizi


Corresponding author

Correspondence to E. Omid Mahdi Ebadati.

Editor information

Editors and Affiliations

Department of CSE, Anil Neerukonda Ins. of Tech. & Sci., Vishakapatnam, India
Suresh Chandra Satapathy
Kalyani University, Nadia, West Bengal, India
Jyotsna Kumar Mandal
University of Hyderabad, Hyderabad, Andhra Pradesh, India
Siba K. Udgata
Dept. of ECE, Shri Ramswaroop Mem. Group of Prof. Clg, Lucknow, Uttar Pradesh, India
Vikrant Bhateja


About this paper

Cite this paper

Ebadati, E.O.M., Tabrizi, M.M. (2016). A Hybrid Clustering Technique to Improve Big Data Accessibility Based on Machine Learning Approaches. In: Satapathy, S., Mandal, J., Udgata, S., Bhateja, V. (eds) Information Systems Design and Intelligent Applications. Advances in Intelligent Systems and Computing, vol 433. Springer, New Delhi. https://doi.org/10.1007/978-81-322-2755-7_43


DOI: https://doi.org/10.1007/978-81-322-2755-7_43
Published: 06 February 2016
Publisher Name: Springer, New Delhi
Print ISBN: 978-81-322-2753-3
Online ISBN: 978-81-322-2755-7
eBook Packages: Engineering, Engineering (R0)
