Big Data Process Advancement

  • Roman Jasek
  • Said Krayem
  • Petr Zacek
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 574)


Information today is maintained across a variety of sources, and data is available in many patterns and forms. Combining and processing all of these dataset types in a single heterogeneous database is nearly impossible, especially when the information moves and changes continuously across many sources. Information is represented in different models, and processing data from various sources can produce results that are critical for risk assessment. Big Data is a concept introduced to cover the techniques used to process such information toward a desired goal. Processing a huge amount of data is a major challenge for a single machine. In this paper we discuss this idea and demonstrate a model of clustered machines that work as a single entity, cooperating in parallel to achieve the desired tasks.

The proposed solution combines machines with different specifications and processing power into a single cluster and then distributes input data of various types fairly, sending each item to the machine in the cluster that has the most processing power and is best designed for that data type.

The distribution of input data and the storage mechanism depend on machine specification, data processing requirements, machine power, load balancing, and data type.
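The distribution criteria above can be illustrated with a minimal sketch. It is not the authors' method: the `Machine` fields, the `assign` function, and the scoring rule (projected load divided by processing power, restricted to machines suited to the data type) are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    """Hypothetical description of one cluster node."""
    name: str
    power: float              # relative processing power (arbitrary unit)
    data_types: frozenset     # data types this machine is designed for
    load: float = 0.0         # total size of the work assigned so far
    chunks: list = field(default_factory=list)


def assign(chunk_id, size, data_type, machines):
    """Place one chunk on the suitable machine with the lowest projected relative load."""
    suitable = [m for m in machines if data_type in m.data_types]
    candidates = suitable or machines  # assumption: fall back to any machine
    # Relative load after placement: a more powerful machine absorbs more work.
    best = min(candidates, key=lambda m: (m.load + size) / m.power)
    best.load += size
    best.chunks.append(chunk_id)
    return best.name


cluster = [
    Machine("a", power=4.0, data_types=frozenset({"text", "image"})),
    Machine("b", power=1.5, data_types=frozenset({"text"})),
]
placement = [assign(i, 10, "text", cluster) for i in range(5)]
```

With this rule the more powerful machine receives most of the chunks, while the weaker one still takes work once the stronger one is relatively busier. Any real scheme would also need to account for storage and network costs, which this sketch ignores.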

We present our proposed solution using an Event-B based approach. A key feature of Event-B is its use of set theory as a modelling notation. We propose using the Rodin modelling tool for Event-B, which integrates modelling and proving.
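To give a feel for the set-theoretic modelling style, the following Python sketch imitates an Event-B machine with two carrier sets, one state variable, an invariant, and a single guarded event. The model itself (`AllocationMachine`, `stored_on`, `store`) is hypothetical and is not taken from the paper. It also checks the invariant only at run time, whereas Rodin generates proof obligations and discharges them statically.

```python
class AllocationMachine:
    """Event-B-style machine: carrier sets, a state variable, an invariant, a guarded event."""

    def __init__(self, machines, chunks):
        self.MACHINES = frozenset(machines)  # carrier set (context)
        self.CHUNKS = frozenset(chunks)      # carrier set (context)
        self.stored_on = {}                  # variable: partial function CHUNKS -> MACHINES

    def invariant(self):
        # dom(stored_on) is a subset of CHUNKS and ran(stored_on) is a subset of MACHINES
        return (set(self.stored_on) <= self.CHUNKS
                and set(self.stored_on.values()) <= self.MACHINES)

    def store(self, chunk, machine):
        """Event: fires only when all guards hold; returns whether it fired."""
        guards = (chunk in self.CHUNKS
                  and machine in self.MACHINES
                  and chunk not in self.stored_on)
        if not guards:
            return False
        self.stored_on[chunk] = machine      # action
        assert self.invariant()              # run-time check, not a proof
        return True


model = AllocationMachine({"m1", "m2"}, {"c1", "c2"})
fired = model.store("c1", "m1")
```

A chunk that is already stored, or that lies outside `CHUNKS`, fails a guard, so the event does not fire and the state is left unchanged.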


Big data, Clustering, Parallel clustering, Distribution process, Distribution file system, Formal modelling, Event-B, Rodin



This work was supported by the Ministry of Education, Youth and Sports of the Czech Republic within the National Sustainability Programme project No. LO1303 (MSMT‐7778/2014) and by the European Regional Development Fund under the project CEBIA‐Tech No. CZ.1.05/2.1.00/03.0089. It was also supported by grant No. IGA/CebiaTech/2017/007 from the IGA (Internal Grant Agency) of Tomas Bata University in Zlin.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Faculty of Applied Informatics, Tomas Bata University in Zlin, Zlin, Czech Republic
