To tackle the identified challenges, we propose a research agenda for an integrated DevOps framework. The authors of this paper, who are involved in the RADON project, will continue to concretely deliver on this proposal and release the results as open-source software.
The RADON framework is envisioned to comprise three environments: (i) the modeling environment, (ii) the runtime environment, (iii) the coding environment (IDE). It will be coupled with a DevOps methodology to coordinate the development and release of FaaS-based applications, and with quality assurance tools to validate a particular solution. Figure 1 illustrates the architecture of the framework.
RADON will rely on a model-driven approach for creating and managing cloud applications that typically exploit the microservices architectural style and the serverless FaaS paradigm. We see OASIS TOSCA as being in the best position to act as the baseline modeling language to describe the topology and orchestration of such applications, on top of which a novel family of RADON models will be defined. At present, the TOSCA language provides no native support for serverless FaaS or data flows. Both are critical to RADON because serverless functions are often used to handle events triggered by data, e.g., real-time streams and diagnostic logs. Preliminary work has been done on extending TOSCA for the modeling and automated deployment of FaaS-based applications. We consider this work a basis for our further research in RADON.
One conceptual difference between RADON and TOSCA models lies in incorporating the behavior specification of FaaS-based applications. For example, RADON aims to simplify the description of a temporal sequence of actions triggered by a certain chain of events via graphical annotations; where graphical complexity becomes a limiting factor, users will be able to annotate temporal predicates through a CDL. This is an intuitive logic-based language to be introduced in RADON for specifying temporal behavior and formal requirements (for performance, costs, security and privacy) at different development stages of an application. The CDL will enable users to relate entities as well as events in RADON models, including those pertaining to data that transit on a pipeline. With CDL annotations, RADON models can retain two core benefits of TOSCA: (i) graphical and textual modeling, thanks to the Eclipse Winery project (footnote 23) and the TOSCA YAML specification (footnote 24); (ii) automated orchestration of cloud applications at runtime.
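To make the idea of a temporal predicate concrete, the following is a minimal sketch of a "response within a deadline" check over a recorded event trace. It is not the CDL, whose syntax and semantics are yet to be defined; the `Event` type, the `responds_within` helper and the event names are hypothetical and chosen only for illustration. The trace is assumed to be sorted by timestamp.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Event:
    name: str         # hypothetical event label, e.g. "object_uploaded"
    timestamp: float  # seconds since the start of the trace


def responds_within(trace: List[Event], trigger: str, response: str,
                    deadline: float) -> bool:
    """Return True if every `trigger` event is followed by a `response`
    event at most `deadline` seconds later (trace sorted by timestamp)."""
    for i, ev in enumerate(trace):
        if ev.name != trigger:
            continue
        answered = any(
            later.name == response
            and 0 <= later.timestamp - ev.timestamp <= deadline
            for later in trace[i + 1:]
        )
        if not answered:
            return False
    return True


# Illustrative trace: two uploads, each followed by a thumbnail event.
trace = [
    Event("object_uploaded", 0.0),
    Event("thumbnail_created", 0.4),
    Event("object_uploaded", 2.0),
    Event("thumbnail_created", 3.5),
]

print(responds_within(trace, "object_uploaded", "thumbnail_created", 2.0))  # True
print(responds_within(trace, "object_uploaded", "thumbnail_created", 1.0))  # False
```

With a 1.0 s deadline the second upload is answered only after 1.5 s, so the predicate fails. This is the kind of violation such an annotation would be expected to surface.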
The RADON runtime will package tools that bridge the modeling environment and IDE of the framework with the cloud, relying on model-driven orchestration and IaC to enact the deployment of FaaS-based applications and data pipelines on multiple target cloud platforms. In particular, the methodology will also account for development in the absence of an operations team, so that the development team can manage the runtime life cycle of an application in a self-service fashion. Model-to-text transformation between graphically built models and TOSCA YAML files will be performed by Eclipse Winery to directly feed the RADON orchestrator.
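As a rough illustration of what a model-to-text step produces, the sketch below renders an in-memory model as TOSCA-style YAML text. It does not reflect how Eclipse Winery implements the transformation. The node type `example.nodes.Function` and its properties are hypothetical, and the tiny emitter handles only nested mappings with scalar leaves.

```python
from typing import Any, Dict, List


def to_yaml_lines(mapping: Dict[str, Any], indent: int = 0) -> List[str]:
    """Render nested dicts with scalar leaves as YAML lines (two-space indent)."""
    pad = "  " * indent
    lines: List[str] = []
    for key, value in mapping.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.extend(to_yaml_lines(value, indent + 1))
        else:
            lines.append(f"{pad}{key}: {value}")
    return lines


# Hypothetical model, as a graphical editor might hold it in memory.
model = {
    "tosca_definitions_version": "tosca_simple_yaml_1_3",
    "topology_template": {
        "node_templates": {
            "resize_function": {
                "type": "example.nodes.Function",  # assumed type name
                "properties": {"runtime": "python3.8", "memory_mb": 256},
            },
        },
    },
}

yaml_text = "\n".join(to_yaml_lines(model))
print(yaml_text)
```

A production transformation would also need to cover lists (e.g., TOSCA `requirements`), quoting and validation against the node type definitions, all of which are omitted here.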
The runtime environment will feature a template library that encapsulates at the proper levels of abstraction all the necessary elements of microservices architectures. This will extend the DICE TOSCA library (footnote 25), a baseline that encodes many TOSCA templates for Big Data processing. The extension for FaaS-based applications includes the node representations of individual functions and relationship representations that express dependencies as well as annotations for events foreseen to trigger a function. In effect, a template defines a directed acyclic graph (DAG) of components that make up an application, inclusive of those pertaining to serverless computing. Purpose-built microservices, called event gateways, will serve as mediators between event producers and consumers across multiple serverless FaaS platforms, realizing an abstraction layer internal to the application. Developers will be able to implement functions at their discretion and store the implementations in a function hub provisioned within both the runtime environment and the graphical modeling tool, allowing the reuse of business logic according to a function life-cycle model devised in RADON.
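Because a template describes a DAG of components, an orchestrator can derive a valid deployment order by topological sorting. The sketch below shows this with Python's standard `graphlib` module (Python 3.9+); the component names and the `deployment_order` helper are hypothetical, and the RADON orchestrator is not claimed to work this way.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Dict, List, Set

# Hypothetical topology: each component maps to the components it depends on.
topology: Dict[str, Set[str]] = {
    "object_store": set(),
    "event_gateway": set(),
    "resize_function": {"object_store", "event_gateway"},
    "api_service": {"resize_function"},
}


def deployment_order(graph: Dict[str, Set[str]]) -> List[str]:
    """Return components so that each appears after all of its dependencies.

    Raises graphlib.CycleError if the graph contains a cycle, i.e. if the
    template is not a DAG.
    """
    return list(TopologicalSorter(graph).static_order())


order = deployment_order(topology)
print(order)
```

Any order returned places `object_store` and `event_gateway` before `resize_function`, and `resize_function` before `api_service`; the relative order of independent components is not significant.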
A special focus will be put on enabling data engineers to carry out the control (ingestion, buffering, scheduling, transfer and storage) as well as the processing (filtering, transformation and analysis) of data across FaaS-based applications using data pipelines, which are essentially composed of specialized microservices (or serverless functions), resources and communication mechanisms. Data pipeline templates will be developed as part of the template library, combining these building blocks into reusable TOSCA components that can be injected into RADON models and consumed by the orchestrator.
Upon deployment of an application, the RADON orchestrator will leverage IaC at the configuration management level to set up and wire up all the components. It will automatically configure monitoring services and connect metric collectors with user-specified storage. This primarily makes passive monitoring useful for the quality assurance of the application, and it also enables the orchestrator to adhere to the performance requirements defined in the template. The orchestrator will be able to dynamically reconfigure the application components and change the number of node instances according to the scaling policies, while the serverless FaaS platform handles the autoscaling of the functions.
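One simple way to turn a scaling policy and a monitored metric into an instance count is a target-tracking rule, sketched below. The `ScalingPolicy` fields and the `desired_instances` helper are assumptions made for illustration; the actual policy format consumed by the orchestrator is not specified here.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingPolicy:
    target_per_instance: float  # desired load per instance, e.g. requests/s
    min_instances: int
    max_instances: int


def desired_instances(policy: ScalingPolicy, observed_load: float) -> int:
    """Smallest instance count keeping per-instance load at or below the
    target, clamped to the policy's [min_instances, max_instances] range."""
    needed = math.ceil(observed_load / policy.target_per_instance)
    return max(policy.min_instances, min(policy.max_instances, needed))


policy = ScalingPolicy(target_per_instance=100.0, min_instances=1, max_instances=10)

print(desired_instances(policy, 250.0))   # 3
print(desired_instances(policy, 0.0))     # 1  (never below the minimum)
print(desired_instances(policy, 5000.0))  # 10 (capped at the maximum)
```

The rule applies only to node instances managed by the orchestrator; functions are left to the autoscaling of the FaaS platform, as stated above.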
The security and privacy policies, e.g., access control and data encryption, will be implemented by the RADON orchestrator to manage the application components both at runtime and during their initialization. The aim is to limit the exposure of critical functionalities and to ensure that both business logic and sensitive data are protected even before the application starts running. The orchestrator will exploit service meshes to enact the security and privacy policies for FaaS-based applications, which consist of serverless functions, microservices and data pipelines. In essence, a service mesh forms a separate control layer for managing and configuring intelligent proxies deployed as sidecars that mediate and route traffic among services.
IDE & DevOps methodology
RADON models will be exposed via a web-based graphical IDE (Eclipse Che, footnote 26). By clicking corresponding graphical elements, users will be able to (i) define serverless functions and microservices (for developers); (ii) design IaC recipes such as TOSCA YAML files (for runtime operators). Data engineers will also rely on the IDE to implement data transformations as serverless functions or microservices and combine these building blocks into reusable data pipeline templates. In team settings, users with different roles will be granted an appropriate level of access.
A DevOps methodology will be investigated, identifying stakeholders, the socio-technical system, barriers to adoption, customization methods, recurrent architectural patterns and the roles of users interacting with the RADON framework. An important feature of the methodology will be the conceptualization of different life cycles at play for the components of a FaaS-based application. This is a richer setting compared to that of a traditional methodology, as one can envision different life cycles for serverless functions, microservices and data pipelines. The identified life cycles will be archived in both user documentation and online help to provide guidance on how to decide tool workflows and make the whole framework easy to use.
Quality assurance tools
Serverless FaaS raises particular concerns for security and privacy. The increased granularity and expressiveness of RADON models will prompt the need to consider security and privacy with high priority in the architectural design. To deliver the business logic through serverless functions or microservices, developers have to make careful trade-offs in terms of non-functional requirements such as performance, costs, security and privacy. Deciding the optimal attack surface exposed by an architecture, and understanding the implications of fine-grained decomposition for non-functional requirements, calls for a rigorous methodology to help engineers select the right level of granularity.
RADON models will combine CDL annotations and generalized TOSCA models that support serverless functions, microservices and data pipelines. Developers will be able to analyze a RADON model using a hierarchy of logic programming and simulation techniques to determine whether the decomposition or aggregation of certain services produces an improvement in satisfying the specified requirements. This mechanism will make it possible to enact progressive decisions on the model, resulting in a solution with the optimal decomposition and the satisfaction of security and privacy policies in addition to usual non-functional requirements for performance and costs. Monitoring feedback will also be available on a dashboard for users to diagnose the runtime behavior of each application component and identify in a semi-automated manner what needs to be prioritized in the design.
The RADON quality assurance tools will be used to validate (i) IaC recipes, (ii) business logic encoded in serverless functions or microservices, (iii) data pipelines.
For IaC recipes, a defect-prediction tool will be developed, combining anti-pattern detection and recent techniques from machine learning. It will address the intrinsic polyglot nature of infrastructure code, being either agnostic of IaC technologies or specific to certain IaC defects and anti-patterns.
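The anti-pattern detection side of such a tool can be approximated, in a technology-agnostic way, by line-level rules applied to the recipe text. The sketch below covers only this rule-based part and not the machine-learning component; the rule set, rule names and sample recipe are hypothetical and far smaller than a real catalogue would be.

```python
import re
from typing import List, Tuple

# Hypothetical, illustrative rules keyed by anti-pattern name.
RULES = {
    # A literal value assigned to a secret-like key; templated values such as
    # "{{ vault_pw }}" or "$DB_PASSWORD" are deliberately not matched.
    "hard-coded secret": re.compile(
        r"(password|secret|token)\s*[:=]\s*['\"]?[^\s'\"{$]+", re.IGNORECASE
    ),
    "unrestricted network access": re.compile(r"0\.0\.0\.0/0"),
    "insecure http url": re.compile(r"http://", re.IGNORECASE),
}


def find_anti_patterns(source: str) -> List[Tuple[int, str]]:
    """Return (line number, rule name) for every rule matching a line."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for name, pattern in RULES.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings


recipe = """\
db_user: admin
db_password: hunter2
allowed_cidr: 0.0.0.0/0
package_url: https://example.org/pkg.tar.gz
"""

print(find_anti_patterns(recipe))
# [(2, 'hard-coded secret'), (3, 'unrestricted network access')]
```

Findings of this kind could serve as input features for a learned defect predictor, although how the two techniques are combined is left open here.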
For business logic, a continuous testing approach will be employed, through execution at the development stage immediately preceding the actual deployment of the application, to help detect unexpected issues before they manifest in production.
For data pipelines, users will be required to model data flows by customizing data generation profiles, which are needed to automatically produce test data with desired statistical characteristics, e.g., tailedness and burstiness, for verifying responsiveness and scalability annotated through the CDL.
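A data generation profile of this kind can be sketched as a small parameter set driving a synthetic event generator. In the example below, tailedness is modeled with Pareto-distributed payload sizes and burstiness with back-to-back bursts separated by exponentially distributed idle periods. The `GenerationProfile` fields and the `generate_events` helper are assumptions for illustration, not a format defined by RADON.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class GenerationProfile:
    tail_index: float       # Pareto shape; smaller values give heavier tails
    burst_size: int         # events emitted back-to-back in one burst
    mean_gap: float         # mean idle time between bursts, in seconds
    intra_burst_gap: float  # fixed spacing between events in a burst, seconds


def generate_events(profile: GenerationProfile, n_events: int,
                    seed: int = 0) -> List[Tuple[float, float]]:
    """Return n_events (timestamp, payload_size) pairs in time order."""
    rng = random.Random(seed)  # seeded for reproducible test data
    events: List[Tuple[float, float]] = []
    now = 0.0
    while len(events) < n_events:
        # Idle period before the next burst (exponential, mean = mean_gap).
        now += rng.expovariate(1.0 / profile.mean_gap)
        for _ in range(profile.burst_size):
            if len(events) == n_events:
                break
            # Heavy-tailed payload size; paretovariate returns values >= 1.
            size = rng.paretovariate(profile.tail_index)
            events.append((now, size))
            now += profile.intra_burst_gap
    return events


profile = GenerationProfile(tail_index=1.5, burst_size=5,
                            mean_gap=10.0, intra_burst_gap=0.01)
events = generate_events(profile, n_events=20, seed=42)
print(len(events), events[0])
```

Replaying such a trace against a deployed pipeline would allow responsiveness and scalability requirements to be checked under controlled load shapes.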