Abstract
To process big data, applications have relied in recent years on data processing libraries that are not optimized to work together efficiently. Intermediate Representations (IRs) have been introduced to unify essential functions behind an abstract interface that supports cross-optimization between applications. Still, the efficiency of an IR depends on the architecture and on the tools required for compilation and execution. In this paper, we present a first look at a framework that provides an IR by creating containers with executable code from structures of data analytics functions described in an input grammar. These containers process data in query lists, and they can be executed either standalone or integrated with other big data analytics applications without compiling the entire framework.
Acknowledgements
This research has been supported by the European Union through the H2020 952215 TAILOR project.
Copyright information
© 2021 IFIP International Federation for Information Processing
About this paper
Cite this paper
Tzouros, G., Tsenos, M., Kalogeraki, V. (2021). Portable Intermediate Representation for Efficient Big Data Analytics. In: Matos, M., Greve, F. (eds) Distributed Applications and Interoperable Systems. DAIS 2021. Lecture Notes in Computer Science(), vol 12718. Springer, Cham. https://doi.org/10.1007/978-3-030-78198-9_5
DOI: https://doi.org/10.1007/978-3-030-78198-9_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-78197-2
Online ISBN: 978-3-030-78198-9
eBook Packages: Computer Science, Computer Science (R0)
Published in cooperation with
http://www.ifip.org/