1.1 Preliminaries

In recent years, the global market has seen a tremendous rise in utility computing, which serves as the backend for practically any new technology or methodology, from healthcare to aerospace. General-purpose GPUs are becoming common currency in datacentres, while specialized FPGA accelerators, ranging from deep-learning accelerators to burst-buffer technologies, are rapidly gaining ground, enormously speeding up application execution and likely to become commonplace in the near future. We are entering a new era of heterogeneous, software-defined, high-performance computing environments. In this context, SODALITE aims to address this heterogeneity by focusing on how to deploy and operate complex software in environments that comprise accelerators/GPUs, configurable processors, and non-x86 CPUs such as ARMv8.

In our view, a complex software system is composed of several different components built for different purposes, featuring different execution models (from microservices to batch jobs) and requiring different QoS. For example, consider a web application that runs an AI inference algorithm to recognize specific objects within some images or to identify the products that a certain user will likely prefer. In this case, a heterogeneous setting would be the best choice for deploying such an application. More specifically, the microservices and web server will find their optimal configuration on the cloud, while at least part of the inference algorithm or its training phase may run more effectively on an HPC cluster equipped, for instance, with GPUs.

Executing an application on a heterogeneous infrastructure can bring several advantages in terms of efficient use of the available resources and effective execution of the system. Nevertheless, effectively deploying and operating application components in a heterogeneous environment today requires in-depth knowledge of each target infrastructure, of the execution models each of them supports, and of the mechanisms that can be exploited to efficiently enable information exchange between the application parts deployed on different types of resources.

In general, Infrastructure as Code (IaC) approaches do support effective deployment of applications but, at the same time, expose a number of challenges. In the next sections, we provide a brief analysis of the state of the art in the main relevant areas of modelling, deploying and operating complex applications (Sect. 1.2), we then highlight the challenges that are left open by the available approaches (Sect. 1.3) and, finally, we present the main innovations offered by SODALITE to cope with these challenges (Sect. 1.4).

1.2 State of the Art Analysis

1.2.1 Application Deployment Modelling

Approaches supporting application deployment assume that the application DevOps team develops a deployment model, that is, a specification of the components belonging to the application and their connectors, as well as their dependencies on a specific technological stack, if any. IaC approaches, such as TOSCA [9] and Ansible, do offer effective means to specify a deployment model. Once this model is available, an orchestrator can execute it and deploy the corresponding components on the available resources.

TOSCA is a standard IaC language designed to support a Cloud information model that can be extended through the definition of new node types and through inheritance. TOSCA itself is implementation agnostic: the implementation of the operations that control the lifecycle of nodes (e.g., creation, scaling, deletion) can be defined in a wide spectrum of languages, ranging from bash scripts and Python to infrastructure management tools like Chef, Puppet or Ansible. These three are all open-source tools mostly designed to help DevOps teams configure and manage infrastructure. Both Chef and Puppet have been designed as agent-master solutions and thus need agents installed on each node for configuration; the IaC language they offer is Ruby-like and is usually considered difficult to learn. Ansible, in contrast, offers a simple and clean declarative IaC language that is widely accepted and easy to learn and adopt. Ansible also benefits from vast community support and probably the largest collection of cloud infrastructure libraries (Ansible Galaxy). It is an inherently simple, agentless approach to remote infrastructure management, implemented on top of the standard Python Paramiko SSH library, which enables DevOps teams to manage any infrastructure accessible through SSH.
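As an illustration, the following minimal sketch (with hypothetical names, paths and values) shows how a TOSCA node template can delegate a lifecycle operation to an Ansible playbook, assuming an orchestrator that accepts playbooks as operation implementations:

tosca_definitions_version: tosca_simple_yaml_1_3

topology_template:
  node_templates:
    vm-host:
      type: tosca.nodes.Compute          # the VM that hosts the component
    web-server:
      type: tosca.nodes.SoftwareComponent
      requirements:
        - host: vm-host                  # hosted-on dependency
      interfaces:
        Standard:
          create:                        # lifecycle operation implemented in Ansible
            implementation: playbooks/install_web_server.yml

The referenced playbook would then contain ordinary Ansible tasks, for example:

# playbooks/install_web_server.yml (hypothetical)
- hosts: all
  become: true
  tasks:
    - name: Install the nginx package
      ansible.builtin.package:
        name: nginx
        state: present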

Wurster et al. [14] propose the essential deployment metamodel (EDMM), which captures the essential parts of declarative deployment models. In a recent survey, Bergmayr et al. [2] reviewed the current approaches to modeling cloud applications. They observed that existing modeling languages lack interoperability and, to cope with this, suggested leveraging the TOSCA standard. The work in [16] identified an EDMM-compliant subset of TOSCA, enabling the transformation of TOSCA-based deployment models into the languages used by industrial IaC tools such as Ansible and Terraform.

As observed in [2, 15], several graphical modeling tools (IDEs) exist for cloud infrastructure and deployment modeling. For example, Vino4TOSCA [5] and OCCIware [17] provide visual notations for TOSCA and OCCI (Open Cloud Computing Interface, a standard for managing any kind of cloud resource) modeling elements, respectively. In contrast, ARGON [11, 12], DICER [1], and SWITCH [13] provide domain-specific languages (DSLs) tailored to specific application domains, namely public cloud infrastructures including their elasticity, data-intensive (big data) applications, and containerised microservice-based cloud-native applications, respectively.

1.2.2 Application Deployment and Operation

A common approach to enacting high-level or visual deployment models is to transform them into artifacts that can be used by an orchestrator or a deployment automation tool. For example, ARGON and DICER employ model-to-model (M2M) transformations to convert the models expressed in their DSLs into deployable IaC artifacts, such as TOSCA blueprints and Ansible playbooks. Brabra et al. [4] also applied M2M transformations to turn TOSCA-based models into Docker and Docker Compose configurations. Bernal et al. [3] proposed a UML profile to model the key elements of a cloud application and its infrastructure, and used M2M transformations to translate UML-based application models into configuration files for a cloud simulator, which enables the analysis of the performance of the application.

As far as the enactment of IaC is concerned, there exist TOSCA- and OCCI-based orchestrators and runtime environments for cloud applications [2, 13, 17], including multi-cloud ones [7, 8]. Two interesting approaches that focus on hybrid cloud and HPC applications are Croupier [6] and INDIGO [10]. Croupier is not fully compatible with the official TOSCA standard, as it uses its own adaptation of the TOSCA model. The INDIGO PaaS Orchestrator allows the instantiation of resources on hybrid virtualized infrastructures (private and public clouds, virtual grid organizations) using the TOSCA YAML Simple Profile v1.0. It is integrated with other INDIGO services to enable the best placement of resources, based on SLAs and monitoring, among the available cloud providers. To deploy, configure and update IaaS resources, the orchestrator uses an Infrastructure Manager (IM) that interfaces with multiple cloud sites in a cloud-agnostic manner. Although the INDIGO PaaS Orchestrator makes it possible to spin up a virtual cluster (e.g., managed by batch systems such as PBS Torque, Slurm, or Mesos) using TOSCA, the workflow management of jobs is not directly supported: it assumes the usage of workflow management systems (e.g., Kepler) on top of the deployed virtual infrastructure. Similarly, partial reconfiguration is performed on IaaS resources and does not operate at the application level.

1.3 Open Challenges

From the brief overview in the previous section, it should be clear that approaches supporting the specification of deployment models, and their execution to orchestrate the deployment of complex applications, do exist. They include TOSCA, a standardization effort that is attracting the interest of multiple organizations in both academia and industry. However, when exploiting such approaches, a number of challenges must be faced.

Fig. 1.1 Graphical representation of the SNOW deployment model, showing three of its components: snow-weather-condition-filter, configuration-demo, and snow-mysql

First, defining a proper deployment model for a complex application is not an easy task. As an example, Fig. 1.1 shows a small portion of a deployment model that describes some components of SNOW, one of the SODALITE use cases. The description exploits the SODALITE Domain Specific Language, but any of the available IaC approaches would produce similar results. The figure is incomplete and refers only to three of the roughly ten components of the whole architecture. The lines in the figure show the various kinds of relationships between the components of the SNOW architecture. Capturing all of them, together with all the needed properties, is mandatory to enable the automation of application deployment and configuration, and gives an idea of the complexity of the specification effort. The problem we see is that current approaches do not provide guidance to the developers of such models, who, as a consequence, must be very experienced.
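To give a flavour of the specification effort, the following TOSCA-style sketch illustrates the kind of inter-component relationships depicted in the figure; the type names and properties are purely illustrative and do not reproduce the actual SNOW model:

topology_template:
  node_templates:
    configuration-demo:
      type: tosca.nodes.Compute                  # VM hosting the other components
      capabilities:
        host:
          properties:
            num_cpus: 4
            mem_size: 8 GB
    snow-mysql:
      type: sodalite.nodes.MySQL                 # hypothetical type name
      properties:
        port: 3306
      requirements:
        - host: configuration-demo               # hosted-on relationship
    snow-weather-condition-filter:
      type: sodalite.nodes.DockerizedComponent   # hypothetical type name
      properties:
        image: snow/weather-filter:latest
      requirements:
        - host: configuration-demo
        - database: snow-mysql                   # connects-to relationship

Even this toy fragment requires knowing which node types exist, which properties each exposes, and which relationships are legal between them; the real model multiplies this effort across every component of the architecture.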

Second, even when an expert who masters TOSCA and Ansible, or any other similar IaC approach, is available, this expert still needs to have at his or her disposal the specification of the resources to be used for deployment. In fact, every resource is assumed to be specified before its usage. This specification must include many peculiarities and details that vary from provider to provider, especially when we want to ensure optimized performance of the application to be executed. In some cases, the amount of available resources is not even known in advance and must be discovered on the fly. This is especially the case when using edge devices.
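The following sketch gives an idea of the level of detail such a resource specification may require; the lower block of property names is illustrative and would in practice correspond to provider-specific type extensions rather than standard TOSCA attributes:

vm-host:
  type: tosca.nodes.Compute
  capabilities:
    host:
      properties:
        num_cpus: 8
        mem_size: 32 GB
        disk_size: 100 GB
  # Provider-specific details (hypothetical keys) that a plain
  # tosca.nodes.Compute node does not standardize:
  properties:
    image: ubuntu-20.04-nvidia-driver-470
    flavor: gpu.large
    availability_zone: nova
    security_groups: [sodalite-default]
    key_name: deployer-key

Each of these values changes from one OpenStack installation or public cloud to another, which is why resource experts, rather than application experts, are best placed to provide them.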

Finally, every new type of resource, even a different traditional cloud IaaS, offers different APIs and different access control mechanisms. Thus, exploiting such resources, monitoring them, and being able to adapt the application based on their status is, per se, a non-trivial task, even if these activities are nowadays supported by various experimental orchestrators and initiatives.

1.4 Innovations Offered by SODALITE

SODALITE tries to address the problems described in the previous section by providing intelligent assistance during the creation of deployment models and by enabling end users to include in a deployment model the information needed to support the definition of QoS constraints, the optimization of the resources used, and a proper configuration of the execution and monitoring environment.

Moreover, SODALITE supports resource experts in modeling their resources and in automating the process of discovering new resources and deriving suitable models for them.

It also offers lightweight execution environments: essentially, cross-platform containers that can be built automatically and that enable users to execute the same application components on heterogeneous resources in a seamless way, albeit with different performance.

Another important aspect of SODALITE is the design-time optimization of applications. To exploit HPC resources in the best possible way, the application code may need to be tuned and/or scaling actions may need to be executed (e.g., increasing the number of cores, accelerating with GPUs or coprocessors, enabling faster storage, etc.). Such actions must be tailored to the type of application components to be deployed, their QoS requirements and the available resources. The SODALITE Application Optimizer, MODAK, focuses on these issues and offers a framework that, given the specification of a few constraints as part of a deployment model, is able to generate the scripts to be executed in an HPC environment to achieve an optimized execution of application components.
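As a purely illustrative sketch (the keys below are hypothetical and do not reproduce the actual MODAK schema), such optimization constraints attached to a component of the deployment model might look as follows:

optimization:
  app_type: ai_training        # class of workload to optimize for
  framework:
    name: tensorflow
    version: "2.1"
  target:
    cpu_type: x86_64
    acc_type: nvidia_gpu       # request GPU acceleration
  autotuning:
    enabled: true

From hints of this kind, an optimizer can select a suitably optimized container image and generate the batch scripts (e.g., for Slurm or PBS Torque) that launch the component on the HPC cluster.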

SODALITE also supports the identification of defects in deployment models and of possible reconfiguration options for running application configurations. Using machine learning, SODALITE analyses the history of deployment models that had to be corrected, thus building a taxonomy of defects that is then used to provide suggestions to DevOps experts. Defects include code smells, errors and anti-patterns.

At runtime, SODALITE enables on-the-fly optimization of applications by dynamically scaling computational resources in and out depending on the specific applications being considered, but also by identifying, through machine learning, possible configurations that perform better than others and by suggesting them to DevOps experts when the monitoring system reveals problems in the current configuration.
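In TOSCA terms, a scaling behaviour of this kind could be declared with a policy along the following lines; the trigger properties are hypothetical, since the standard tosca.policies.Scaling type leaves them to be defined by derived types:

policies:
  - scale-out-on-load:
      type: tosca.policies.Scaling
      targets: [snow-weather-condition-filter]
      properties:
        cpu_upper_bound: 80      # hypothetical trigger: average CPU above 80%
        adjustment: 1            # add one instance per scaling action
        max_instances: 5         # never exceed five instances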

Another aspect concerns the support for data placement-aware deployment and for data movement between HPC, cloud and edge resources. These aspects are very important, as they have a strong impact on application performance. SODALITE optimizes data movement at two different levels: single components and compositions of multiple components. For the former, we explore asynchronous data transfer, caching, and prefetching of data. For the latter, we explore efficient data movement across storage and network to improve workflow performance.

Providing proper identity and access management is a crucial part of protecting both user data and sensitive project information. There are two different facets we consider in the scope of SODALITE. The first concerns the mechanisms that control access to the SODALITE platform itself. This is covered by a role-based Identity and Access Management (IAM) implementation (Keycloak) for SODALITE users, complemented by tools for secret and credential management (e.g., Vault or similar). The second concerns the possibility of modeling privacy- and security-related resources, such as Virtual Private Networks, so that they can be instantiated and reused in the deployment models of specific systems and, thus, their deployment and configuration can be automated as well.

1.5 Book Objectives

The objectives of this book are to: (i) present the approach and tools constituting the SODALITE solution, (ii) describe how the approach can be used by a DevOps team, and (iii) show how it has been adopted in the three SODALITE case studies.