1 Introduction

Service-Oriented Architecture (SOA) provides the conceptual framework for realizing service-oriented systems (SOS’s) by supporting dynamic composition and reconfiguration of software systems from networked software services [1]. Rosen [2] identifies the key motivations for SOA as agility, flexibility, reuse, integration and reduced cost. However, the need to ensure that the systems can adapt quickly and effectively to changing business needs, changes in system quality and changes in their runtime environment is an increasingly important research problem [3]. Effective adaptation ensures the system remains relevant in a changing environment and is an accurate reflection of user expectations.

Taylor [4] defines dynamic adaptation as the ability of a software system’s functionality to be changed at runtime without requiring a system reload or restart. Taylor points out that there is an increasing demand for non-stop systems, as well as a desire to avoid annoying users. However, current approaches for supporting runtime adaptation in service-oriented systems differ widely with respect to the nature of systems they support, the types of system changes they support and their underlying model of adaptation [5, 6]. In addition, it is also unclear how these approaches address the important issue of ensuring the adaptation is effective. A growing consensus amongst researchers is that runtime adaptation in SOA should incorporate a validation element [16, 18].

In their research roadmap for self-adaptive systems, Lemos et al. [7] emphasize the need for feedback control in the life cycle of self-adaptive systems, and the need to perform traditional design-time verification and validation at runtime. In another survey, Salehie et al. [45] note that testing and assurance are probably the phases that have received the least attention in the engineering of self-adaptive software. Papazoglou et al. [18] echo this view, noting that the bulk of research in adaptive service-oriented systems has focused largely on dynamic compositions. Adaptation validation goes beyond verifying that the adaptation conforms to its operational specification. Validation is concerned with verifying the acceptability of an adaptation, often from the point of view of the system user – i.e. is it the right adaptation for the problem, as opposed to whether it is specified right? Validation assesses the effectiveness of an adaptation. Because user requirements are constantly changing, a self-validation process would enable the adaptation system to self-assess and self-evolve in order to remain relevant.

This paper identifies the key factors that influence adaptation in service-oriented systems, and uses them to review 29 approaches intended to support runtime adaptation in service-oriented systems. The survey complements and extends existing surveys on adaptation by examining runtime adaptation from a different but useful perspective, and by extending the review to incorporate validation as a key factor. The paper is organised as follows. Section 2 identifies the factors that influence adaptation and reviews current adaptation approaches. Section 3 proposes an integrated solution. Section 4 provides some concluding thoughts and a look ahead.

2 Factors that Influence Adaptation

Most of the work on self-adapting software systems takes inspiration from control theory and machine learning. Control theory splits the world into a controller and a plant. The controller is responsible for sending signals to the plant, according to a control law, so that the output of the plant follows a reference (the expected ideal output). Figure 1 shows a typical control loop. Although it is difficult to anticipate when and how change occurs in software systems, it is possible to control when and how the adaptation process reacts to change.

Fig. 1. Dynamic physical system
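
The controller and plant split described above can be illustrated with a minimal sketch. It assumes a proportional control law and a toy plant whose output simply moves by the amount of the control signal; the names `plant_step` and `run_loop` and the gain `kp` are hypothetical and do not come from any of the surveyed approaches.

```python
from __future__ import annotations


def plant_step(output: float, signal: float) -> float:
    """Toy plant (assumption): the output moves by the amount of the signal."""
    return output + signal


def run_loop(reference: float, output: float, kp: float = 0.5, steps: int = 20) -> list[float]:
    """Drive the plant towards the reference with a proportional control law."""
    history = [output]
    for _ in range(steps):
        error = reference - output   # deviation from the expected ideal output
        signal = kp * error          # control law: signal proportional to error
        output = plant_step(output, signal)
        history.append(output)
    return history


# Hypothetical run: start at 0 and track a reference of 100.
trace = run_loop(reference=100.0, output=0.0)
```

With this gain the error halves on every iteration, so the plant output approaches the reference without the controller needing to know in advance when the deviation will occur.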

Dynamic adaptive systems require information about the running application as well as control protocols to be able to reconfigure a system. For example, keeping web services up and running for a long time requires collecting information about the current state of the system, analyzing that information to diagnose performance problems or to detect failures, deciding how to resolve the problem (e.g., via dynamic load-balancing or healing), and acting on those decisions. Figure 2(a) shows the control process for a software system equivalent of the physical system shown in Fig. 1. The controller maps onto an adaptation process that reconfigures the runtime system to address the changing needs in its application context. Figure 2(b) shows how the adaptation process can be improved using validation. Validation tracks, assesses and adjusts adaptations to ensure that they reflect user expectations.

Fig. 2. Dynamic software system

Lemos et al. [7] highlight the importance of understanding the factors that influence adaptation. They posit that this helps in the comprehension of how software processes change when developing self-adaptive systems. The key challenges with current approaches include defining models that can represent a wide range of system properties, the need for feedback control loops in the life cycle of self-adaptive systems, and self-validation. The nature and quality of runtime adaptation in service-oriented systems are influenced by system changes (i.e. adaptation triggers), the nature of the application and the logical area where it executes (i.e. application context), the strategy used to reconfigure the system in a particular change context (i.e. adaptation model), and the effectiveness of the adaptation (i.e. validation). Together, these factors represent the what, where, when and how, and right of runtime adaptation. It is also important to note that these factors constantly shift and evolve, making it difficult to specify adequate adaptation rules in advance.

2.1 Change Trigger (What)

A change trigger represents what causes adaptation and the reason for it. Change triggers are a function of changes in the business environment, service failure, and changes in the system quality and its runtime environment.

  • Business Environment Triggers. Changes in the business environment that the system supports may trigger adaptation. This may be caused by changes in user requirements, business rules or platform. Zeng et al. [8] describe an adaptation approach that accepts changes in user requirements and business rules on the fly and composes services to address them. Similarly, Cubo et al. [6] describe an approach that uses changes in the system context and platform to trigger adaptation.

  • Service Quality Triggers. Failures in provided services, for example, incompatibilities, network outages and poor quality, can trigger adaptation. The quality of a service-oriented system depends not only on the quality of service (QoS) provided by services, but also on the interdependencies between services and resource constraints imposed by the runtime environment. This type of corrective adaptation is typical of self-healing systems. Robinson et al. [10] describe an approach that uses a consumer-centred, pluggable brokerage model to track and renegotiate service faults and changes. The framework provides a service monitoring system, which actively monitors the quality of negotiated services for emergent changes, SLA violations and failure. A similar approach, the Personal Mobility Manager described in [29], emphasizes the need for automatic system diagnosis to detect runtime errors. It helps car drivers find the best route in or between towns by suggesting optimal combinations of transportation according to local situations, such as traffic level, weather conditions and opening hours.

  • Runtime Environment Triggers. Changes in the runtime system can also trigger adaptation. Interacting services may impose dependability as well as structural constraints on each other (e.g. performance, availability, cost, and interface requirements). Dustdar et al. [9] describe a self-adaptation technique for managing the runtime integration of diverse requirements arising from interacting services, such as time, performance and cost. Swaminathan et al. [5] propose an adaptation approach based on self-healing as a means for addressing runtime system errors. Runtime resource contentions between services in the orchestration platform can result in significant falls in service quality. This emergent quality of service is difficult to anticipate before system composition, as resource demands are often dynamic and influenced by many factors. Newman and Kotonya [11] propose a resource-aware framework that combines resource monitoring with dynamic service orchestration to provide a runtime architecture that mediates resource contentions in embedded service-oriented systems.

Effective adaptation must address the real cause rather than the symptom. Taiani [13] describes this as a key challenge in adaptive fault tolerant computing. Moyano et al. [14] describe a system that monitors service failure and runtime environment triggers. These are changes in hardware and firmware, including the unpredictable arrival or disappearance of devices and software components. For example, a low memory trigger may be the result of an SLA violation or a runtime environment resource failure. The resolution to the problem might involve replacing the service with a more efficient alternative or optimizing the runtime environment, or both. It is important that the adaptation process is able to find not just a good fit for the problem, but the right fit.

2.2 Application Context (Where)

An application context defines the nature of the application and the logical area where it executes. It helps us understand where adaptation takes place and the constraints involved. Cubo et al. [6] discuss the importance of creating adaptive systems sensitive to their application context (i.e. domain, location, time and activity). Tanaka and Ishida [32] identify an input language and a target language as the application context for a language translation application. Most of the approaches surveyed in this paper were concerned with specific application contexts. Zeng et al. [8], for example, describe a runtime approach for supporting business change in the automotive industry. Newman [11] describes an adaptation framework for embedded resource-constrained environments. Baresi et al. [15] describe an adaptation framework for a smart home system. In their description of the DigiHome architecture, Romero et al. [41] discuss the integration of multi-scale entities. In the DigiHome scenario, they consider several heterogeneous devices that generate isolated events, which can be used to obtain valuable information and to make decisions accordingly. They make use of Complex Event Processing (CEP) to find relationships between a series of simple and independent events from different sources, using previously defined rules.

A few other approaches, including Swaminathan et al. [5], Cardellini et al. [16], and Zeng et al. [17], propose generic application contexts, but they only provide sketchy implementation details. Some approaches promote context variability. For example, Swaminathan et al. describe a context-independent, self-configuring, self-healing model for web services. However, the authors provide no information about the implementation or evaluation of the model. Huang et al. [20] describe an approach for developing self-configuring services using service-specific knowledge. They evaluate their approach on three different systems (i.e. a video streaming service, an interactive search service, and a video-conference service). However, it is evident from their discussion that the context needs to be known before the application is deployed.

2.3 Adaptation Model (When and How)

An adaptation model shows when the adaptation process is carried out and how the model is implemented in relation to the system it manages. A decision on when to conduct adaptation is arrived at depending on when the adaptation requirements are known as well as the availability of the requirements for adaptation. If the requirements are known before the system is running then adaptation can take place at design-time, for instance introducing support for a new network communication protocol, or adding new attributes to a data model. However, if the requirements are only known after the system has started executing then adaptation will take place at runtime. This is the typical situation in ubiquitous and mobile computing scenarios. The availability of the requirements for adaptation, such as system resources can also determine when to conduct adaptation. For example, if the resources are available online then dynamic adaptation can be conducted; otherwise it can be pushed to a later time when they will be available.

Support for adaptation is required statically, where the applications could be taken offline and adapted, and dynamically where going offline is not an available option. An adaptation model may be implemented to carry out adaptation at design-time or at runtime. Most of the work reviewed in Table 1 focuses on runtime approaches to adaptation. There is no apparent attempt to integrate design-time adaptation approaches with runtime adaptation despite some of the benefits presented by design-time adaptation. Papazoglou et al. [18] and Baresi et al. [3] identify the key techniques that can be used to achieve runtime adaptation as self-configuring, self-healing, and self-optimizing techniques.

Table 1. Summary of adaptation approaches

  • Self-Configuring is the automatic re-composition of services to adapt to changes in the service environment. The work of [19, 8, 21] describes self-configuring adaptation techniques.

  • Self-Optimizing is the automatic re-composition of services to improve the quality of a service. The work of [9, 17, 13] describes self-optimizing adaptation techniques.

  • Self-Healing is the automatic re-composition of services to address a service failure. Self-healing techniques detect system malfunctions and initiate policy-based corrective actions without disrupting the runtime environment [18].

Romay’s [21] review of self-adaptation techniques in SOA reveals that current research focuses largely on self-configuring techniques. There is very little research on self-optimizing or self-healing techniques. Bucchiarone et al. [22] note that focusing on only one technique limits the effectiveness of the approach.

2.3.1 Predictive vs. Reactive Models

Adaptation can also occur in response to anticipated changes (predictive) or in response to a change trigger (reactive). Tanaka and Ishida [32] propose a model that focuses on predicting the executability of services (i.e. whether a message request will cause execution failure). Unfortunately, they provide limited detail on the implementation and evaluation of their approach. Wang et al. [24] propose a predictability model based on the Q-Learning algorithm using Markov Decision Processes (MDP). They explain that human-oriented services are rarely predictable, and point out that many service properties keep changing in such a way that prior knowledge of these changes may not be available. Instead, they suggest incorporating reinforcement learning in adaptation techniques to ensure that the techniques remain relevant. Their model uses a decision process that maximizes the expected sum of rewards. While predictive adaptation shows some promise, there is very little research on it.
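
To make the idea of maximizing the expected sum of rewards concrete, the following is a minimal sketch of the standard tabular Q-learning update. It is not the model of Wang et al. [24]; the states, actions, rewards, learning rate `alpha` and discount factor `gamma` are all hypothetical values chosen for illustration.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """Standard Q-learning step:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]


# Hypothetical adaptation choices for a service whose quality has degraded.
ACTIONS = ["keep_service", "swap_service"]
q = defaultdict(float)  # unseen (state, action) pairs start at 0.0

# Assumed experience: swapping restores the service (reward +1),
# keeping it leaves the system degraded (reward -1).
for _ in range(10):
    q_update(q, "degraded", "swap_service", 1.0, "healthy", ACTIONS)
    q_update(q, "degraded", "keep_service", -1.0, "degraded", ACTIONS)

# The learned policy picks the action with the highest estimated value.
best = max(ACTIONS, key=lambda a: q[("degraded", a)])
```

Because the value estimates are updated from observed outcomes rather than fixed in advance, the preferred action can change if the rewards observed at runtime change.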

2.3.2 Model Implementation – Embedded vs. Pluggable

In embedded adaptation, the processes of monitoring and re-composition are an integral part of the adaptive system. In pluggable approaches, data is collected from the running target system with non-invasive probes that report the raw data to an adaptation module that is plugged onto the system. Most of the adaptation approaches in Table 1 are embedded, which limits their portability. The work of Zeng et al. [8] and Cubo et al. [6] are typical examples of this. Garlan et al. [23] state that the use of external control mechanisms for self-adaptivity is a more effective engineering solution than localizing the solution. A pluggable engine can be reused across different systems.
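
The pluggable arrangement can be sketched as follows, assuming a simple callback interface between probes and an external adaptation module. The class names `Probe` and `AdaptationModule`, the `latency_ms` metric and the threshold are hypothetical and are not taken from any of the surveyed frameworks.

```python
from __future__ import annotations

from typing import Callable

Reading = dict  # raw data reported by a probe, e.g. {"latency_ms": 420}


class AdaptationModule:
    """External engine: it sees only probe readings, not system internals."""

    def __init__(self, latency_limit_ms: float) -> None:
        self.latency_limit_ms = latency_limit_ms
        self.requests: list[str] = []

    def on_reading(self, source: str, reading: Reading) -> None:
        # Record a re-composition request when the reported latency is too high.
        if reading.get("latency_ms", 0) > self.latency_limit_ms:
            self.requests.append(f"recompose:{source}")


class Probe:
    """Non-invasive probe: reads a metric and forwards the raw data."""

    def __init__(self, source: str, read: Callable[[], Reading]) -> None:
        self.source = source
        self.read = read
        self.listeners: list[Callable[[str, Reading], None]] = []

    def plug(self, listener: Callable[[str, Reading], None]) -> None:
        self.listeners.append(listener)

    def poll(self) -> None:
        reading = self.read()
        for listener in self.listeners:
            listener(self.source, reading)


# The same module is plugged onto two unrelated (hypothetical) services.
module = AdaptationModule(latency_limit_ms=300)
probes = [
    Probe("billing", lambda: {"latency_ms": 420}),
    Probe("search", lambda: {"latency_ms": 120}),
]
for probe in probes:
    probe.plug(module.on_reading)
    probe.poll()
```

In this sketch the module can be attached to a further system by adding another probe, with no change to the module itself, which is the portability argument made for external control.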

2.4 Adaptation Effectiveness (Right)

A typical adaptation process uses a predefined decision model to select an appropriate adaptation in response to a change trigger. This relationship is often stored as a set of fixed adaptation rules. However, the dynamic nature of service-oriented systems means these factors are constantly changing, which makes it difficult to specify adequate adaptation rules a priori. This is further complicated by the likelihood of competing adaptation requests. This means that rules used to inform adaptation decisions cannot be static and must constantly evolve to remain relevant. Most approaches that support runtime adaptation are based on rules that reconfigure systems based on fixed decision points. This means that most adaptations in service-oriented systems are responses to change rather than anticipation. One way to address the problem is through the validation of adaptation decisions. Validation serves two key roles. First, it provides a mechanism for assessing the effectiveness of an adaptation decision, i.e. how well a recommended adaptation addresses the concerns for which the system is reconfigured. Second, it provides us with insights into the nature of problems for which different adaptations are suited.

Most autonomic systems are underpinned by IBM’s Monitoring, Analysis, Planning, and Execution model (MAPE) [47]. However, while the model is evident in many self-adaptive frameworks for SOA, it lacks a runtime mechanism for supporting validation. The analysis phase of the MAPE model performs reasoning based on the details provided by the monitoring phase. The goal is to arrive at a decision on whether adaptation is required. The planning phase then identifies the appropriate adaptation and the execution phase changes the behavior of the managed resource. The entire process is based on predefined rules.
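
The four phases described above can be sketched as a single cycle driven by a fixed rule table. This is an illustrative outline only, not IBM's reference implementation; the metric names, the 64 MB threshold and the rule table are assumptions made for the example.

```python
# Predefined rules (assumed): diagnosed symptom -> adaptation to apply.
RULES = {
    "sla_violation": "replace_service",
    "low_memory": "optimise_runtime",
}


def monitor(resource):
    """Monitoring: take a snapshot of the managed resource's metrics."""
    return dict(resource)


def analyse(metrics):
    """Analysis: decide whether adaptation is required, and why."""
    if metrics["response_ms"] > metrics["sla_ms"]:
        return "sla_violation"
    if metrics["free_mb"] < 64:  # hypothetical memory threshold
        return "low_memory"
    return None


def plan(symptom):
    """Planning: look up the adaptation in the predefined rules."""
    return RULES.get(symptom)


def execute(resource, action):
    """Execution: change the managed resource (here, just record the action)."""
    resource.setdefault("applied", []).append(action)


def mape_cycle(resource):
    symptom = analyse(monitor(resource))
    if symptom is None:
        return None
    action = plan(symptom)
    execute(resource, action)
    return action


slow = {"response_ms": 900, "sla_ms": 500, "free_mb": 512}
healthy = {"response_ms": 100, "sla_ms": 500, "free_mb": 512}
result_slow = mape_cycle(slow)
result_healthy = mape_cycle(healthy)
```

Note that nothing in the cycle checks whether the executed action actually resolved the symptom or was acceptable to the user, which is the missing validation step discussed in this section.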

None of the approaches surveyed in this paper provided integral support for validation. However, several researchers underscored its centrality to effective adaptation [18, 45]. Skalkowski et al. [25] describe how a clustering algorithm can be used to automatically recognize similar system states and group them into subsets (called clusters). Schumann and Gupta [26] propose a validation method, based on Bayesian probabilities, that calculates safety regions for adaptive systems around the current state of operation. Canfora et al. [27] propose regression testing to check the evolution of service-oriented systems.

2.5 Summary

Table 1 shows the results of our survey of 29 approaches that provide runtime adaptation for SOS’s. It is important to mention that the survey was intended to be representative rather than exhaustive. The approaches were carefully selected to provide a good coverage of current adaptation approaches. Each approach is reviewed in terms of the nature and extent of its support for change triggers, adaptation model, validation and application context. Most of the approaches provide limited support for runtime and service quality triggers. However, they provide comparatively good support for business environment triggers. Only Ivanovic et al. [12] describe an approach for supporting all three adaptation triggers. In their work, they describe the computational cost of service networks as being dependent on internal and external factors. They recognize that triggers for adaptation are due to overlapping factors that are both internal and external to the service. Of the approaches reviewed, only provide support for adaptation validation. However, the support is very limited. There is poor support for diversity, with most approaches designed to support specific application contexts. This limitation may be related to the fact that most of the approaches are embedded.

3 Support for Validation in Runtime Adaptation

We have developed a consumer-centred, pluggable runtime adaptation framework that integrates and extends the strengths of current approaches to support validation [48]. Figure 3 shows the architecture of the framework. The sensor sub-system is responsible for monitoring the system for changes and for the initial decision phase of the adaptation process. When a system change or changes match the conditions specified on a sensor, the sensor manager invokes the adaptation process by passing it the change information. The adaptation manager is responsible for assessing the request and recommending a suitable adaptation solution.

Fig. 3. Adaptation framework with support for validation

The validation sub-system uses machine-learning algorithms to assess past adaptation decisions and to modify the rules that inform the decisions. Figure 3 shows the validation process. We use clustering algorithms to assess and refine the adaptation decisions and deep learning to review and improve the long-term accuracy of predictions. Our framework uses the past behavior of consumers rather than their opinion to gauge reputation and quality of feedback. If similar users in the recent past have repeatedly accepted a particular adaptation decision, then the decision is most likely valid. Users are not prompted for feedback and therefore cannot intentionally manipulate the feedback they provide. Decision logs record the change triggers received and the adaptation decisions made. Machine learning algorithms then assess the user adaptation decisions against adaptation triggers, and form and refine the rules that inform future adaptation decisions.
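
The idea of refining rules from decision logs can be illustrated with a deliberately simplified sketch. It replaces the clustering and deep learning used in the framework with a plain acceptance-rate count, and it assumes each log entry carries an `accepted` flag derived from observed user behavior (for example, the user did not revert the adaptation). The function name `refine_rules`, the log fields and the `min_support` threshold are hypothetical.

```python
from collections import defaultdict


def refine_rules(decision_log, min_support=3):
    """Map each trigger to the adaptation with the highest acceptance rate.

    Trigger/adaptation pairs seen fewer than `min_support` times are ignored.
    """
    counts = defaultdict(lambda: [0, 0])  # (trigger, adaptation) -> [accepted, total]
    for entry in decision_log:
        key = (entry["trigger"], entry["adaptation"])
        counts[key][0] += 1 if entry["accepted"] else 0
        counts[key][1] += 1

    best = {}  # trigger -> (adaptation, acceptance rate)
    for (trigger, adaptation), (accepted, total) in counts.items():
        if total < min_support:
            continue
        rate = accepted / total
        if trigger not in best or rate > best[trigger][1]:
            best[trigger] = (adaptation, rate)
    return {trigger: adaptation for trigger, (adaptation, _) in best.items()}


# Hypothetical decision log: past triggers, decisions and observed acceptance.
log = (
    [{"trigger": "low_memory", "adaptation": "replace_service", "accepted": False}] * 3
    + [{"trigger": "low_memory", "adaptation": "optimise_runtime", "accepted": True}] * 4
    + [{"trigger": "sla_violation", "adaptation": "replace_service", "accepted": True}] * 2
)
rules = refine_rules(log)
```

In this example the rule for `low_memory` shifts to the adaptation that users repeatedly accepted, while `sla_violation` yields no rule yet because too few decisions have been logged.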

4 Conclusions

The paper has discussed the importance of runtime adaptation in SOS’s and identified the factors that influence it. These factors describe the what, where, when and how, and right of adaptation. Specifically, adaptation triggers tell us what causes adaptation, the application context tells us where to adapt, the adaptation models tell us when and how to adapt, and validation tells us how effective the adaptation is.

We have used these factors to review the current state of runtime adaptation in service-oriented systems. Our survey reveals that most of the approaches provide patchy support for the key factors that influence adaptation. Most adaptation approaches are tied to particular application contexts, focus on specific aspects of change and are embedded in the application they manage. It is also clear that there is limited empirical evidence to indicate the effectiveness of the approaches reviewed. Lastly, we have provided a possible solution that integrates and extends the strengths of current approaches to support validation. We believe this paper makes a significant contribution towards understanding and addressing a challenging problem.