
Physics of Particles and Nuclei Letters, Volume 14, Issue 7, pp 1001–1007

Middleware for big data processing: test results

  • I. Gankevich
  • V. Gaiduchok
  • V. Korkhov
  • A. Degtyarev
  • A. Bogdanov
Computer Technologies in Physics

Abstract

Dealing with large volumes of data is resource-intensive work that is increasingly delegated not to a single computer but to an entire distributed computing system. As the number of computers in a distributed system grows, so does the effort required to manage it effectively. When the system reaches a certain critical size, considerable effort must be put into improving its fault tolerance. It is difficult to estimate at what point a particular distributed system needs such facilities for a given workload, so they should instead be implemented in middleware that works efficiently with a distributed system of any size. It is likewise difficult to estimate whether a volume of data is large or not, so the middleware should also work with data of any volume. In other words, the purpose of the middleware is to provide facilities that adapt a distributed computing system to a given workload. In this paper we introduce such a middleware appliance. Tests show that this middleware is well suited for typical HPC and big data workloads and that its performance is comparable with well-known alternatives.
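To make the fault-tolerance facility mentioned above concrete, the following minimal sketch (in Python, not taken from the paper; every name in it, such as Node and run_with_failover, is hypothetical) shows one way a middleware layer can re-run a task on a surviving node when the node executing it fails, so that the application code itself never has to handle node failures.

from typing import Callable, List

class Node:
    """A compute node; `alive` is False once the node has crashed."""
    def __init__(self, name: str, alive: bool = True):
        self.name = name
        self.alive = alive

    def run(self, task: Callable[[], int]) -> int:
        if not self.alive:
            raise RuntimeError(f"node {self.name} is down")
        return task()

def run_with_failover(task: Callable[[], int], nodes: List[Node]) -> int:
    """Run the task on the first responsive node; fall back to the next one on failure."""
    last_error = None
    for node in nodes:
        try:
            return node.run(task)
        except RuntimeError as err:
            last_error = err  # this node failed, try a surviving one
    raise RuntimeError("all nodes failed") from last_error

if __name__ == "__main__":
    cluster = [Node("n0"), Node("n1"), Node("n2")]
    cluster[0].alive = False          # simulate a node failure
    results = [run_with_failover(lambda i=i: i * i, cluster) for i in range(4)]
    print(results)                    # [0, 1, 4, 9]: tasks complete despite the failed node

Applied transparently to every task in the system, this kind of failover is what allows such middleware to run the same workload on a single machine or on a large cluster without changes to the application.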

Copyright information

© Pleiades Publishing, Ltd. 2017

Authors and Affiliations

  • I. Gankevich (1)
  • V. Gaiduchok (1)
  • V. Korkhov (1)
  • A. Degtyarev (1)
  • A. Bogdanov (1)
  1. St. Petersburg State University, St. Petersburg, Russia
