Big Data Process Advancement
In the current era, information is maintained across a variety of sources, and data is available in different patterns and forms. Combining and processing all of these dataset types in a single heterogeneous database is nearly impossible, especially when the information moves and changes continuously across many different sources. Information is represented in different modules, and processing data from various sources can lead to critical risk assessment results. Big Data is a concept introduced to cover the use of different techniques that serve the desired goals by processing the given information. Processing a huge amount of data is a major challenge for a single machine. In this paper, we discuss this idea and demonstrate a model of clustered machines that work as a single entity, operating cohesively in parallel to achieve the desired tasks.
The proposed solution combines machines of different specifications and processing power into a single cluster, and then distributes the various kinds of input data fairly, sending each item to the most powerful machine in the cluster that is well suited to its data type.
The distribution of input data and the storage mechanism depend on machine specification, data processing, machine power, load balancing, and data type.
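The distribution criteria above can be illustrated with a minimal sketch. It is not the model developed in the paper: the `Machine` class, the `distribute` function, the single `power` score, and the greedy rule (send each item to the compatible machine with the lowest load relative to its power) are all assumptions chosen only to show how machine power, load balancing, and data type might interact.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    """Hypothetical cluster node; every field is an illustrative assumption."""
    name: str
    power: float                  # relative processing power (higher is faster)
    supported_types: frozenset    # data types this machine is suited to
    load: float = 0.0             # total size of the data assigned so far
    items: list = field(default_factory=list)


def distribute(items, machines):
    """Greedily assign (item_id, data_type, size) tuples to machines.

    Each item goes to the machine that supports its data type and would
    have the lowest load-to-power ratio after receiving the item.
    """
    for item_id, data_type, size in items:
        candidates = [m for m in machines if data_type in m.supported_types]
        if not candidates:
            raise ValueError(f"no machine supports data type {data_type!r}")
        target = min(candidates, key=lambda m: (m.load + size) / m.power)
        target.load += size
        target.items.append(item_id)
    return {m.name: list(m.items) for m in machines}


# Example: machine A is twice as powerful as B and also handles images.
machines = [
    Machine("A", power=4.0, supported_types=frozenset({"text", "image"})),
    Machine("B", power=2.0, supported_types=frozenset({"text"})),
]
items = [("d1", "text", 4.0), ("d2", "text", 2.0),
         ("d3", "image", 1.0), ("d4", "text", 2.0)]
plan = distribute(items, machines)
print(plan)
```

In this example the more powerful machine receives most of the data, while the weaker machine still takes a share once the stronger one is relatively busier. A real cluster would also have to account for the storage mechanism and for changes in machine state over time, which this sketch ignores.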
We present our proposed solution using an Event-B based approach. A key feature of Event-B is the use of set theory as a modelling notation, and we propose using the Rodin modelling tool for Event-B, which integrates modelling and proving.
Keywords: Big data, Clustering, Parallel clustering, Distribution process, Distribution file system, Formal modelling, Event-B, Rodin
This work was supported by the Ministry of Education, Youth and Sports of the Czech Republic within the National Sustainability Programme project No. LO1303 (MSMT‐7778/2014) and by the European Regional Development Fund under the project CEBIA‐Tech No. CZ.1.05/2.1.00/03.0089. It was also supported by grant No. IGA/CebiaTech/2017/007 from the IGA (Internal Grant Agency) of Tomas Bata University in Zlin.