Optimal and Automated Deployment for Microservices

Microservices are highly modular and scalable Service Oriented Architectures. They underpin automated deployment practices like Continuous Deployment and Autoscaling. In this paper, we formalize these practices and show that automated deployment - proven undecidable in the general case - is algorithmically treatable for microservices. Our key assumption is that the configuration life-cycle of a microservice is split into two phases: (i) creation, which entails establishing initial connections with already available microservices, and (ii) subsequent binding/unbinding with other microservices. To illustrate the applicability of our approach, we implement an automatic optimal deployment tool and compute deployment plans for a realistic microservice architecture, modeled in the Abstract Behavioral Specification (ABS) language.


Introduction
Inspired by service-oriented computing, microservices structure software applications as highly modular and scalable compositions of fine-grained and loosely coupled services [13]. These features support modern software engineering practices, like continuous delivery/deployment [24] and application autoscaling [3]. Currently, these practices focus on single microservices and do not take advantage of the information on the interdependencies within an architecture. In contrast, architecture-level deployment plans can (i) optimize global scaling, e.g., avoiding the overhead of redundantly detecting inbound traffic and sequentially scaling each microservice in a pipeline, and (ii) avoid "domino" effects due to unstructured scaling, e.g., cascading slowdowns or outages [21,28,32].
In this paper, we formally investigate the problem of automating the deployment and reconfiguration (e.g., horizontal or vertical scaling) of microservice architectures, proving formal properties and presenting an implemented solution.
In our work, we follow the approach taken by the Aeolus component model [8,9,10], which was used to formally define the problem of deploying component-based software systems and to prove that, in the general case, this problem is undecidable [10]. The basic idea of Aeolus is to enrich the specification of components with a finite state automaton that describes their deployment life cycle. Previous work identified decidable fragments of the Aeolus model: e.g., removing replication constraints from Aeolus (e.g., used to specify a minimal number of services connected to a load balancer) makes the deployment problem decidable, but non-primitive recursive [9]; removing also conflicts (e.g., used to express the impossibility to deploy two types of components in the same system) makes the problem PSpace-complete [27] or even poly-time [10], but under the assumption that every required component can be (re)deployed from scratch.
Our intuition is that the Aeolus model can be adapted to formally reason on the deployment of microservices. To achieve our goal, we significantly revisit the formalization of the deployment problem, replacing Aeolus components with a model of microservices. The main difference between our model of microservices and Aeolus components lies in the specification of their deployment life cycle. Here, instead of using the full power of finite state automata (as in Aeolus and other TOSCA-compliant deployment models [6]), we assume microservices to have two states: (i) creation and (ii) binding/unbinding. Concerning creation, we use strong dependencies to express which microservices must be immediately connected to newly created ones. After creation, we use weak dependencies to indicate additional microservices that can be bound/unbound. The principle that guided this modification comes from state-of-the-art microservice deployment technologies like Docker [29] and Kubernetes [23]. In particular, the weak and strong dependencies have been inspired by Docker Compose [11] (a language for defining multi-container Docker applications) where it is possible to specify different relationships among microservices using, e.g., the depends_on (or external_links) modalities that force (or do not force) a specific startup order, similarly to our strong (or weak) dependencies. Weak dependencies are also useful to model horizontal scaling, e.g., a load balancer that is bound to and unbound from many microservice instances during its life cycle.
In our formalization we also consider resource/cost-aware deployments, taking inspiration from the memory and CPU resources found in Kubernetes. We enrich our model of microservices with the specification of the amount of resources they need to run. In a deployment, a system of microservices runs within a set of computation nodes. In our model, nodes represent computational units (e.g., virtual machines in an Infrastructure-as-a-Service Cloud deployment). Each node has a cost and a set of resources available to the microservices it hosts.
On the model above, we define the optimal deployment problem as follows: given an initial microservice system, a set of available nodes, and a new target microservice to be deployed, find a sequence of reconfiguration actions that, once applied to the initial system, leads to a new deployment that includes the target microservice. The optimal deployment has two properties: (a) each used node has at least as many resources as those needed by the hosted microservices; (b) the total cost (i.e., the sum of the costs) of the used nodes is minimal. We show that the optimal deployment problem for microservices is decidable by presenting an algorithm that works in three main phases: (1) generate a set of constraints whose solution indicates the microservices to be deployed and their distribution over available nodes; (2) generate another set of constraints whose solution indicates the connections to be established; (3) synthesize the corresponding deployment plan. The generated sets of constraints are enriched with optimization metrics that minimize the overall cost of the computed deployment.
The algorithm has NEXPTIME complexity because, in the worst case, the length of the deployment plan could be exponential in the size of the input. However, in practice it is reasonable to assume that each node can host at most a polynomial number of microservices, as a consequence of its resource limitations. In this case, the deployment problem is NP-complete and the problem of deploying a system minimizing its total cost is an NP-optimization problem. Moreover, having reduced the deployment problem to constraint solving, we can exploit state-of-the-art constraint solvers [7,17,18] that are frequently used in practice to cope with NP-hard problems.
To concretely evaluate our approach, we consider a real-world microservice architecture, inspired by the reference email processing pipeline from Iron.io [16]. We model that architecture in the Abstract Behavioral Specification (ABS) language, a high-level object-oriented language that supports deployment modeling [25]. We use our technique to compute two types of deployments: an initial one, with one instance for each microservice, and a set of deployments to horizontally scale the system depending on small, medium or large increments in the number of emails to be processed. The experimental results are encouraging in that we were able to compute deployment plans that add more than 30 new microservice instances, assuming availability of hundreds of machines of three different types, and guaranteeing optimality.

The microservice optimal deployment problem
We model microservice systems as aggregations of components with ports exposing provided and required interfaces describing offered and required functionalities, respectively. Microservices are connected by means of bindings indicating which port provides the functionality required by another port. We consider two kinds of requirements: strong required interfaces, that need to be already fulfilled when the microservice is created, and weak required interfaces, that must be fulfilled at the end of a deployment (or reconfiguration) plan. Microservices are enriched with the specification of the resources they need to properly run; such resources are provided to the microservices by nodes. Nodes can be seen as the unit of computation executing the tasks associated to each microservice.
As an example, in Fig. 1 we have reported the representation of the deployment of a microservice system inspired by the email processing pipeline that we will discuss in Section 3. We consider a simplified pipeline. A Message Receiver microservice handles inbound requests, passing them to a Message Analyzer that checks the email content and sends the attachments for inspection to an Attachment Analyzer. The Message Receiver has a port with a weak required interface that can be fulfilled by Message Analyzer instances. The Message Analyzer has instead a port with a strong required interface that can be fulfilled by Attachment Analyzer instances. In the second case, the binding between the Message Analyzer and the corresponding Attachment Analyzer must be established already when the Message Analyzer is created. In the first case, the bindings between Message Receiver and Message Analyzer microservices can be established afterwards.
The possibility to add new bindings is considered in a reconfiguration that, starting from the initial deployment depicted in Fig. 1 with continuous lines, adds the elements depicted with dashed lines. In such a reconfiguration, a couple of new instances of Message Analyzer are deployed. This is done in order to satisfy numerical constraints associated to both required and provided interfaces. For required interfaces, the numerical constraints indicate lower bounds to the outgoing bindings, while for provided interfaces they specify upper bounds to the incoming connections. In our example, the constraint ≥ 3 is associated to the weak required interface of Message Receiver. In order to fulfill such a constraint, at least two new instances of Message Analyzer must be added. On the other hand, the constraint ≤ 2 associated to the interface provided by the Attachment Analyzer implies the creation of a new instance of such microservice, in that the initial one cannot serve all the three Message Analyzers in the final configuration.
We also model resources: each microservice has associated resources that it consumes (see the CPU and RAM quantities associated to the microservices in Fig. 1). Resources are provided by nodes, that we represent as containers for the microservice instances, providing them the resources they require. Notice that nodes have also costs: the total cost of a deployment is the sum of the costs of the used nodes (e.g., in the example the total cost is 598 cents per hour, corresponding to the cost of 4 nodes: 2 C4 large and 2 C4 xlarge virtual machine instances of the Amazon public Cloud).
We now move to the formal definitions. We assume the following disjoint sets: $\mathcal{I}$ for interfaces, $\mathcal{Z}$ for microservices, and a finite set $\mathcal{R}$ for kinds of resources. We use $\mathbb{N}$ to denote natural numbers, $\mathbb{N}^+$ for $\mathbb{N} \setminus \{0\}$, and $\mathbb{N}^+_\infty$ for $\mathbb{N}^+ \cup \{\infty\}$.
Definition 1 (Microservice type). The set $\Gamma$ of microservice types, ranged over by $T_1, T_2, \ldots$, contains 5-ples $\langle P, D_s, D_w, C, R \rangle$ where:
- $P = (\mathcal{I} \to \mathbb{N}^+_\infty)$ are the provided interfaces, defined as a partial function from interfaces to corresponding numerical constraints (indicating the maximum number of connected microservices);
- $D_s = (\mathcal{I} \to \mathbb{N}^+)$ are the strong required interfaces, defined as a partial function from interfaces to corresponding numerical constraints (indicating the minimum number of connected microservices);
- $D_w = (\mathcal{I} \to \mathbb{N})$ are the weak required interfaces (defined as the strong ones, with the difference that also the constraint 0 can be used, indicating that it is not strictly necessary to fulfill a weak interface);
- $C \subseteq \mathcal{I}$ are the conflicting interfaces;
- $R = (\mathcal{R} \to \mathbb{N})$ specifies resource consumption, defined as a total function from resources to corresponding quantities indicating the amount of required resources.
We assume sets $\mathrm{dom}(D_s)$, $\mathrm{dom}(D_w)$ and $C$ to be pairwise disjoint. Notation: given a microservice type $T = \langle P, D_s, D_w, C, R \rangle$, we use the following postfix projections .prov, .reqs, .reqw, .conf and .res to decompose it; e.g., T.reqw returns the partial function associating arities to weak required interfaces. In our example, for instance, the Message Receiver microservice type is such that Message Receiver.reqw(MA) = 3 and Message Receiver.res(RAM) = 4. When the numerical constraints are not explicitly indicated, we assume as default value $\infty$ for provided interfaces (i.e., they can satisfy an unlimited number of ports requiring the same interface) and 1 for required interfaces (i.e., one connection with a port providing the same interface is sufficient).
Notice that in the formal definition we consider also conflicting interfaces: these can be used to express conflicts among microservice types that cannot be both present in a deployment, or cases in which a microservice type can have at most one instance (because each additional instance conflicts with the first one).
We now formalize a well-formedness condition on microservice types by requiring that there are no cycles of dependencies involving only strong required interfaces. Indeed, as strong required interfaces must be already fulfilled at the time microservices are instantiated, it is impossible to deploy microservices that strongly depend on each other.
Definition 2 (Well-formed Universe). Given a finite set of microservice types $U$ (that we also call universe), we define the strong dependency graph of $U$ as follows: $G(U) = (U, V)$ with $V = \{(T, T') \mid T, T' \in U \,.\, \exists p \in \mathcal{I} \,.\, p \in \mathrm{dom}(T.\mathit{reqs}) \cap \mathrm{dom}(T'.\mathit{prov})\}$. The universe $U$ is well-formed if its strong dependency graph $G(U)$ is acyclic.
In the following, we always assume universes to be well-formed. It is worth noting that this does not imply the impossibility to deploy microservice system with circular dependencies. This remains possible, but it is necessary that at least one weak required interface is involved in the cycle.
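As an illustration, the well-formedness check can be sketched in a few lines of Python. The dictionary encoding of a universe (type name mapped to sets of strongly required and provided interface names) and the two sample universes are assumptions made for this sketch; they are not part of the formal model. The check builds the strong dependency graph and relies on the standard-library `graphlib` module to detect cycles.

```python
from graphlib import CycleError, TopologicalSorter


def strong_dependency_graph(universe):
    """Map each type T to the types T' that provide an interface
    which T strongly requires (the edges of G(U))."""
    return {
        t: {u for u, spec_u in universe.items() if spec["reqs"] & spec_u["prov"]}
        for t, spec in universe.items()
    }


def is_well_formed(universe):
    """A universe is well-formed when G(U) has no cycle."""
    try:
        # static_order() raises CycleError if the graph is cyclic.
        list(TopologicalSorter(strong_dependency_graph(universe)).static_order())
        return True
    except CycleError:
        return False


# Illustrative encoding of the running example: only strong required
# ("reqs") and provided ("prov") interfaces matter for this check.
PIPELINE = {
    "MessageReceiver": {"reqs": set(), "prov": set()},
    "MessageAnalyzer": {"reqs": {"AA"}, "prov": {"MA"}},
    "AttachmentAnalyzer": {"reqs": set(), "prov": {"AA"}},
}

# Hypothetical universe with two mutually strongly dependent types.
CYCLIC = {
    "A": {"reqs": {"x"}, "prov": {"y"}},
    "B": {"reqs": {"y"}, "prov": {"x"}},
}
```

Note that the weak required interface MA of the Message Receiver does not appear in the graph, which is why cycles that go through a weak interface do not break well-formedness.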
Definition 3 (Nodes). The set $\mathcal{N}$ of nodes is ranged over by $o_1, o_2, \ldots$ We assume the following information to be associated to each node $o$ in $\mathcal{N}$:
- a function $R = (\mathcal{R} \to \mathbb{N})$ that specifies node resource availability: we use $o.\mathit{res}$ to denote such a function;
- a value in $\mathbb{N}$ that specifies node cost: we use $o.\mathit{cost}$ to denote such a value.
We now define configurations that describe systems composed of microservice instances and bindings that interconnect them. A configuration, ranged over by $C_1, C_2, \ldots$, is given by a set of microservice types, a set of deployed microservices (with their associated type), and a set of bindings. Formally:
Definition 4 (Configuration). A configuration $C$ is a 4-ple $\langle Z, T, N, B \rangle$ where:
- $Z \subseteq \mathcal{Z}$ is the set of the currently deployed microservices;
- $T = (Z \to \mathcal{T})$ are the microservice types, defined as a function from deployed microservices to microservice types;
- $N = (Z \to \mathcal{N})$ are the microservice nodes, defined as a function from deployed microservices to nodes that host them;
- $B \subseteq \mathcal{I} \times Z \times Z$ is the set of bindings, namely 3-ples composed of an interface, the microservice that requires that interface, and the microservice that provides it; we assume that, for $(p, z_1, z_2) \in B$, the two microservices $z_1$ and $z_2$ are distinct and $p \in (\mathrm{dom}(T(z_1).\mathit{reqs}) \cup \mathrm{dom}(T(z_1).\mathit{reqw})) \cap \mathrm{dom}(T(z_2).\mathit{prov})$.
In our example, if we use mr to refer to the instance of Message Receiver, and ma for the initially available Message Analyzer, we will have the binding (MA,mr,ma). Moreover, concerning the microservice placement function N , we have N (mr) = Node1 large and N (ma) = Node2 xlarge.
We are now ready to formalize the notion of correctness of configuration. We first define a provisional correctness, considering only constraints on strong required and provided interfaces. Then, we define a general notion of configuration correctness, considering all kinds of requirements. Intuitively, a configuration is provisionally correct if, considering its microservice bindings, the numerical constraints on both strong required and provided interfaces are satisfied. Similarly, a configuration is correct if it also satisfies the numerical constraints on weak required interfaces and conflicts are not violated.

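Definition 5 (Provisionally correct configuration). A configuration $C = \langle Z, T, N, B \rangle$ is provisionally correct if, for each node $o \in \mathrm{ran}(N)$, it holds that $\forall r \in \mathcal{R} \,.\, o.\mathit{res}(r) \geq \sum_{z \in Z,\, N(z) = o} T(z).\mathit{res}(r)$ and, for each microservice $z \in Z$, both following conditions hold:
- $(p \mapsto n) \in T(z).\mathit{reqs}$ implies that there exist $n$ distinct microservices $z_1, \ldots, z_n \in Z \setminus \{z\}$ such that, for every $1 \leq i \leq n$, we have $\langle p, z, z_i \rangle \in B$ and $p \in \mathrm{dom}(T(z_i).\mathit{prov})$;
- $(p \mapsto n) \in T(z).\mathit{prov}$ implies that there exist no $m$ distinct microservices $z_1, \ldots, z_m \in Z \setminus \{z\}$, with $m > n$, such that, for every $1 \leq i \leq m$, we have $\langle p, z_i, z \rangle \in B$ and $p \in \mathrm{dom}(T(z_i).\mathit{reqs}) \cup \mathrm{dom}(T(z_i).\mathit{reqw})$.
Definition 6 (Correct configuration). A configuration $C = \langle Z, T, N, B \rangle$ is correct if $C$ is provisionally correct and, for each microservice $z \in Z$, both following conditions hold:
- $(p \mapsto n) \in T(z).\mathit{reqw}$ implies that there exist $n$ distinct microservices $z_1, \ldots, z_n \in Z \setminus \{z\}$ such that, for every $1 \leq i \leq n$, we have $\langle p, z, z_i \rangle \in B$ and $p \in \mathrm{dom}(T(z_i).\mathit{prov})$;
- $p \in T(z).\mathit{conf}$ implies that, for each $z' \in Z \setminus \{z\}$, we have $p \notin \mathrm{dom}(T(z').\mathit{prov})$.
Notice that, in the example in Fig. 1, the initial configuration (in continuous lines) is only provisionally correct in that the weak required interface MA (with arity 3) of the Message Receiver is not satisfied (because there is only one outgoing binding). The full configuration (including also the elements in dotted lines) is instead correct: all the constraints associated to the interfaces are satisfied.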
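To make the two notions concrete, the following Python sketch checks the interface-related conditions on the running example. It is a simplification under stated assumptions: node resources are ignored, the dictionary encoding of types and the instance names `ma2`, `ma3` and `aa2` are hypothetical, and the arities are the ones quoted in the text (at least 3 for the weak interface MA, at most 2 for the provided interface AA).

```python
import math


def out_degree(bindings, p, z):
    """Number of distinct providers that z is bound to on interface p."""
    return len({z2 for (q, z1, z2) in bindings if q == p and z1 == z})


def in_degree(bindings, p, z):
    """Number of distinct requirers bound to z on interface p."""
    return len({z1 for (q, z1, z2) in bindings if q == p and z2 == z})


def provisionally_correct(types, inst, bindings):
    """Strong lower bounds and provided upper bounds hold for every instance."""
    for z, t in inst.items():
        spec = types[t]
        if any(out_degree(bindings, p, z) < n for p, n in spec["reqs"].items()):
            return False
        if any(in_degree(bindings, p, z) > n for p, n in spec["prov"].items()):
            return False
    return True


def correct(types, inst, bindings):
    """Additionally check weak lower bounds and conflicts."""
    if not provisionally_correct(types, inst, bindings):
        return False
    for z, t in inst.items():
        spec = types[t]
        if any(out_degree(bindings, p, z) < n for p, n in spec["reqw"].items()):
            return False
        for p in spec["conf"]:
            if any(p in types[t2]["prov"] for z2, t2 in inst.items() if z2 != z):
                return False
    return True


# Assumed encoding of the three types of the running example.
TYPES = {
    "MessageReceiver": {"prov": {}, "reqs": {}, "reqw": {"MA": 3}, "conf": set()},
    "MessageAnalyzer": {"prov": {"MA": math.inf}, "reqs": {"AA": 1}, "reqw": {}, "conf": set()},
    "AttachmentAnalyzer": {"prov": {"AA": 2}, "reqs": {}, "reqw": {}, "conf": set()},
}

INITIAL_INST = {"mr": "MessageReceiver", "ma": "MessageAnalyzer", "aa": "AttachmentAnalyzer"}
INITIAL_B = {("MA", "mr", "ma"), ("AA", "ma", "aa")}

FINAL_INST = {**INITIAL_INST, "ma2": "MessageAnalyzer", "ma3": "MessageAnalyzer",
              "aa2": "AttachmentAnalyzer"}
FINAL_B = INITIAL_B | {("AA", "ma2", "aa"), ("AA", "ma3", "aa2"),
                       ("MA", "mr", "ma2"), ("MA", "mr", "ma3")}
```

With this encoding the initial configuration passes only the provisional check, while the extended one passes both, matching the discussion of Fig. 1.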
We now formalize how configurations evolve by means of atomic actions.

Definition 7 (Actions). The set $\mathcal{A}$ contains the following actions:
- bind$(p, z_1, z_2)$ where $z_1, z_2 \in \mathcal{Z}$, with $z_1 \neq z_2$, and $p \in \mathcal{I}$: add a binding between $z_1$ and $z_2$ on port $p$ (which is supposed to be a weak required interface of $z_1$ and a provided interface of $z_2$);
- unbind$(p, z_1, z_2)$ where $z_1, z_2 \in \mathcal{Z}$, with $z_1 \neq z_2$, and $p \in \mathcal{I}$: remove the specified binding on $p$ (which is supposed to be a weak required interface of $z_1$ and a provided interface of $z_2$);
- new$(z, T, o, B_s)$ where $z \in \mathcal{Z}$, $T \in \Gamma$, $o \in \mathcal{N}$, with $B_s$ (representing bindings from strong required interfaces in $T$ to sets of microservices) being such that, for each $p \in \mathrm{dom}(T.\mathit{reqs})$, it holds $|B_s(p)| \geq T.\mathit{reqs}(p)$: add a new microservice $z$ of type $T$ hosted in $o$ and bind each of its strong required interfaces to a set of microservices as described by $B_s$;
- del$(z)$ where $z \in \mathcal{Z}$: remove the microservice $z$ from the configuration and all bindings involving it.
In our example, assuming that the initially available Attachment Analyzer is named aa, we have that the action to create the initial instance of Message Analyzer is new (ma, MessageAnalyzer, Node2 xlarge, (AA → {aa})). Notice that it is necessary to establish the binding with the Attachment Analyzer because of the corresponding strong required interface.
The execution of actions can now be formalized using a labeled transition system on configurations, which uses actions as labels.
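Definition 8 (Reconfigurations). Reconfigurations are denoted by transitions $C \xrightarrow{\alpha} C'$ meaning that the execution of $\alpha \in \mathcal{A}$ on the configuration $C$ produces a new configuration $C'$. The transitions from a configuration $C = \langle Z, T, N, B \rangle$ follow the description of the actions in Definition 7: bind and unbind respectively add and remove the specified binding in $B$; new extends $Z$, $T$ and $N$ with the new microservice, its type and its node, and adds to $B$ the bindings described by $B_s$; del removes the microservice from $Z$, $T$ and $N$, together with all the bindings in $B$ that involve it.
A deployment plan is simply a sequence of actions that transform a provisionally correct configuration (without violating provisional correctness along the way) and, finally, reach a correct configuration.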
Definition 9 (Deployment plan). A deployment plan $P$ from a provisionally correct configuration $C_0$ is a sequence of actions $\alpha_1, \ldots, \alpha_m$ such that:
- there exist provisionally correct configurations $C_1, \ldots, C_m$, with $C_{i-1} \xrightarrow{\alpha_i} C_i$ for $1 \leq i \leq m$, and
- $C_m$ is a correct configuration.
Deployment plans are also denoted with $C_0 \xrightarrow{\alpha_1} C_1 \xrightarrow{\alpha_2} \cdots \xrightarrow{\alpha_m} C_m$.
In our example, a deployment plan that reconfigures the initial provisionally correct configuration into the final correct one is as follows: a new action to create the new instance of Attachment Analyzer, followed by two new actions for the new Message Analyzers (as commented above, the connection with the Attachment Analyzer is part of these new actions), and finally two bind actions to connect the Message Receiver to the two new instances of Message Analyzer.
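The plan of the example can be replayed with a small interpreter for the four actions. This is a minimal sketch, assuming a tuple encoding of actions and configurations chosen for illustration; action preconditions and provisional correctness are deliberately not checked, and the node names `nodeA`, `nodeB`, `nodeC` as well as the instance names `aa2`, `ma2`, `ma3` are placeholders.

```python
def apply(config, action):
    """Return the configuration reached by executing one action.

    config is a triple (inst, host, bindings): instance -> type name,
    instance -> node name, and a set of (interface, requirer, provider).
    Preconditions of the actions are not checked in this sketch.
    """
    inst, host, bindings = config
    kind = action[0]
    if kind == "new":
        _, z, type_name, node, strong = action
        # Strong bindings are created together with the instance.
        created = {(p, z, z2) for p, targets in strong.items() for z2 in targets}
        return {**inst, z: type_name}, {**host, z: node}, bindings | created
    if kind == "bind":
        return inst, host, bindings | {action[1:]}
    if kind == "unbind":
        return inst, host, bindings - {action[1:]}
    if kind == "del":
        z = action[1]
        return (
            {k: v for k, v in inst.items() if k != z},
            {k: v for k, v in host.items() if k != z},
            {b for b in bindings if z not in b[1:]},
        )
    raise ValueError(f"unknown action: {kind}")


def run(config, plan):
    """Apply the actions of a plan in sequence."""
    for action in plan:
        config = apply(config, action)
    return config


INITIAL = (
    {"mr": "MessageReceiver", "ma": "MessageAnalyzer", "aa": "AttachmentAnalyzer"},
    {"mr": "Node1_large", "ma": "Node2_xlarge", "aa": "nodeA"},
    {("MA", "mr", "ma"), ("AA", "ma", "aa")},
)
PLAN = [
    ("new", "aa2", "AttachmentAnalyzer", "nodeB", {}),
    ("new", "ma2", "MessageAnalyzer", "nodeB", {"AA": {"aa"}}),
    ("new", "ma3", "MessageAnalyzer", "nodeC", {"AA": {"aa2"}}),
    ("bind", "MA", "mr", "ma2"),
    ("bind", "MA", "mr", "ma3"),
]
FINAL = run(INITIAL, PLAN)
```

Creating the second Attachment Analyzer first matters: the third Message Analyzer needs a provider with spare capacity at the moment it is created, since its strong binding is part of the new action.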
We now have all the ingredients to define the optimal deployment problem, that is our main concern: given a universe of microservice types, a set of available nodes and an initial configuration, we want to know whether and how it is possible to deploy at least one microservice of a given microservice type T by optimizing the overall cost of nodes hosting the deployed microservices.
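Definition 10 (Optimal deployment problem). The optimal deployment problem has, as input, a finite well-formed universe $U$ of microservice types, a finite set of available nodes $O$, an initial provisionally correct configuration $C_0$ and a microservice type $T_t \in U$. The output is:
- a deployment plan $P = C_0 \xrightarrow{\alpha_1} C_1 \xrightarrow{\alpha_2} \cdots \xrightarrow{\alpha_m} C_m$ such that every $C_i = \langle Z_i, T_i, N_i, B_i \rangle$, with $1 \leq i \leq m$, uses only microservice types in $U$ and nodes in $O$, and $Z_m$ contains at least one microservice of type $T_t$, if there exists one. In particular, among all deployment plans satisfying the constraints above, one that minimizes $\sum_{z \in Z_m} N_m(z).\mathit{cost}$ (i.e., the overall cost of nodes in the last configuration $C_m$) is output;
- no (stating that no such plan exists), otherwise.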
We are finally ready to state our main result on the decidability of the optimal deployment problem. To prove the result we describe an approach that splits the problem in three incremental phases: (1) the first phase checks if there is a possible solution and assigns microservices to deployment nodes, (2) the intermediate phase computes how the microservices need to be connected to each other, and (3) the final phase synthesizes the corresponding deployment plan.
Theorem 1. The optimal deployment problem is decidable.
Proof. The proof is in the form of an algorithm that solves the optimal deployment problem. We assume that the input to the problem to be solved is given by $U$ (the microservice types), $O$ (the set of available nodes), $C_0$ (the initial provisionally correct configuration), and $T_t \in U$ (the target microservice type). We use $I(U)$ to denote the set of interfaces used in the considered microservice types, namely $I(U) = \bigcup_{T \in U} \mathrm{dom}(T.\mathit{reqs}) \cup \mathrm{dom}(T.\mathit{reqw}) \cup \mathrm{dom}(T.\mathit{prov}) \cup T.\mathit{conf}$. The algorithm is based on three phases.

Phase 1
The first phase consists of the generation of a set of constraints that, once solved, indicates how many instances should be created for each microservice type $T$ (denoted with inst$(T)$), how many of them should be deployed on node $o$ (denoted with inst$(T, o)$), and how many bindings should be established for each interface $p$ from instances of type $T$ (considering both weak and strong required interfaces) to instances of type $T'$ (denoted with bind$(p, T, T')$). We also generate an optimization function that guarantees that the generated configuration is minimal w.r.t. its total cost.
We now incrementally report the generated constraints. The first group of constraints deals with the number of bindings. Constraints 1a and 1b guarantee that there are enough bindings to satisfy all the required interfaces, considering both strong and weak requirements. Symmetrically, constraint 1c guarantees that the number of bindings is not greater than the total available capacity, computed as the sum of the single capacities of each provided interface. In case the capacity is unbounded (i.e., $\infty$), it is sufficient to have at least one instance that activates such port to support any possible requirement (see constraint 1d). Finally, constraint 1e guarantees that no binding is established towards provided interfaces of microservice types that are not deployed.
The second group of constraints deals with the number of instances of microservices to be deployed.
The first constraint 2a guarantees the presence of at least one instance of the target microservice. Constraint 2b guarantees that no two instances of different types will be created if one activates a conflict on an interface provided by the other one. Constraint 2c considers the other case, in which a type activates the same interface both in conflicting and provided modality: in this case, at most one instance of such type can be created. Finally, the constraints 2d and 2e guarantee that there are enough pairs of distinct instances to establish all the necessary bindings. Two distinct constraints are used: the first one deals with bindings between microservices of two different types, the second one with bindings between microservices of the same type.
The last group of constraints deals with the distribution of microservice instances over the available nodes O.
Constraint 3a simply formalizes the relationship among the variables inst$(T)$ and inst$(T, o)$ (the total number of instances of a microservice type should correspond to the sum of the instances locally deployed on each node). Constraint 3b checks that each node has enough resources to satisfy the requirements of all the hosted microservices. The last two constraints define the optimization function used to minimize the total cost: constraint 3c introduces the boolean variable used$(o)$ which is true if and only if node $o$ contains at least one microservice instance; constraint 3d is the function to be minimized, i.e., the sum of the costs of the used nodes.
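For concreteness, one way to write this third group so that it matches the description is sketched below. It is a reading of the prose under the notation already introduced (inst, used, $o.\mathit{res}$, $o.\mathit{cost}$), not necessarily the exact formulation, and it assumes the amsmath package for the align environment.

```latex
\begin{align}
& \bigwedge_{T \in U} \; \mathit{inst}(T) = \sum_{o \in O} \mathit{inst}(T, o) \tag{3a} \\
& \bigwedge_{r \in \mathcal{R}} \bigwedge_{o \in O} \; \sum_{T \in U} \mathit{inst}(T, o) \cdot T.\mathit{res}(r) \leq o.\mathit{res}(r) \tag{3b} \\
& \bigwedge_{o \in O} \; \Big( \sum_{T \in U} \mathit{inst}(T, o) > 0 \Big) \Leftrightarrow \mathit{used}(o) \tag{3c} \\
& \min \sum_{o \in O,\ \mathit{used}(o)} o.\mathit{cost} \tag{3d}
\end{align}
```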
These constraints, and the optimization function, are expected to be given in input to a constraint/optimization solver. If a solution is not found, it is not possible to deploy the required microservice system; otherwise, the next phases of the algorithm are executed to synthesize the optimal deployment plan.
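To give an intuition of what the solver computes in the distribution part of Phase 1, the sketch below enumerates all placements of a fixed list of instances over a handful of nodes and keeps the cheapest feasible one. It is not the approach used in the paper, which delegates the search to a constraint solver; the exhaustive enumeration is exponential and only meant for tiny inputs. Node names, capacities, costs and demands are hypothetical numbers chosen for the illustration.

```python
from itertools import product


def cheapest_placement(demands, nodes):
    """Exhaustively place instances on nodes, minimizing the cost of used nodes.

    demands: list of (cpu, ram) pairs, one per instance to deploy.
    nodes:   {name: (cpu, ram, cost)}.
    Returns (cost, assignment), or None when no feasible placement exists.
    """
    names = list(nodes)
    best = None
    for assignment in product(names, repeat=len(demands)):
        load = {}
        for (cpu, ram), node in zip(demands, assignment):
            c, r = load.get(node, (0, 0))
            load[node] = (c + cpu, r + ram)
        # Capacity check: every used node must cover the hosted demand.
        if any(c > nodes[n][0] or r > nodes[n][1] for n, (c, r) in load.items()):
            continue
        # Only nodes hosting at least one instance contribute to the cost.
        cost = sum(nodes[n][2] for n in load)
        if best is None or cost < best[0]:
            best = (cost, assignment)
    return best


# Hypothetical nodes (cpu, ram, cost) and instance demands (cpu, ram).
NODES = {"small1": (2, 4, 100), "small2": (2, 4, 100), "big1": (4, 8, 199)}
DEMANDS = [(2, 4), (2, 3), (1, 2)]
BEST = cheapest_placement(DEMANDS, NODES)
```

In this toy instance no single node can host all three instances and the two small nodes can host only one instance each, so the cheapest feasible choice uses the big node together with one small node.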

Phase 2
The second phase consists of the generation of another set of constraints that, once solved, indicates the bindings to be established between any pair of microservices to be deployed. More precisely, for each type $T$ such that inst$(T) > 0$, we use $s^T_i$, with $1 \leq i \leq$ inst$(T)$, to identify the microservices of type $T$ to be deployed. We also assume a function $N$ that associates microservices to available nodes $O$, which is compliant with the values inst$(T, o)$ already computed in Phase 1, i.e., given a type $T$ and a node $o$, the number of microservices $s^T_i$ such that $N(s^T_i) = o$ coincides with inst$(T, o)$.
In the constraints below we use the variables $b(p, s^T_i, s^{T'}_j)$ (with $i \neq j$ if $T = T'$): the value of such variables is 1 if there is a connection between the required interface $p$ of $s^T_i$ and the corresponding provided interface of $s^{T'}_j$, 0 otherwise. We also make use of an auxiliary total function limProv$(T, p)$ that extends $T.\mathit{prov}$ associating 0 to the interfaces outside its domain.
Constraint 4a considers the provided interface capacities to fix upper bounds to the bindings to be established, while constraints 4b and 4c fix lower bounds based on the required interface capacities, considering both the weak (see 4b) and the strong (see 4c) ones. Finally, constraint 4d indicates that it is not possible to establish connections on interfaces that are not required.
A solution for these constraints exists because the constraints 1a . . . 2e (already solved during Phase 1) guarantee that the configuration to be synthesized contains enough capacity on the provided interfaces to satisfy all the required interfaces.
Phase 3
In this last phase we synthesize the deployment plan that, when applied to the initial configuration $C_0$, reaches a new configuration $C_t$ with nodes, microservices and bindings as computed in the first two phases of the algorithm. Without loss of generality, in this proof we show the existence of a simple plan that first removes the elements in the initial configuration and then deploys the target configuration from scratch. However, as also discussed in Section 3, in practice it is possible to define more elaborate planning mechanisms that reuse microservices already deployed.
Reaching an empty configuration is a trivial task since it is always possible to perform in the initial configuration unbind actions for all the bindings connected to weak required interfaces. Then the microservices can be deleted: by the well-formedness of the universe, they can be ordered, using a topological sort, so that no removal violates a strong required interface (e.g., first remove the microservices that no remaining microservice strongly requires, and repeat until all the microservices have been deleted).
The deployment of the target configuration follows a similar pattern. Given the distribution of microservices over nodes (computed in the first phase) and the corresponding bindings (computed in the second phase), the microservices can be created by following a topological sort considering the microservice dependencies following from the strong required interfaces. When all the microservices are deployed on the corresponding nodes, the remaining bindings (on weak required ports) may be added in any possible order.
Remark 1. The constraints generated during Phase 2 of the algorithm, in order to establish the microservice bindings, are expected to be given in input to a constraint/optimization solver. One can enrich such constraints with metrics to optimize, e.g., the number of local bindings (i.e., give a preference to the connections among microservices hosted in the same node). Another example, used in the case study discussed in Section 3, is a metric that maximizes the number of bindings, i.e., the sum of all the variables $b(p, s^T_i, s^{T'}_j)$.
From the complexity point of view, it is possible to show that the decision version of the optimization problem solved in Phase 1 is NP-complete, that of Phase 2 is in NP, while the planning in Phase 3 is synthesized in polynomial time.
Unfortunately, due to the fact that numeric constraints can be represented in log space, the output of Phase 2 requiring the enumeration of all the microservices to deploy can be exponential in the size of the output of Phase 1 (indicating only the total number of instances for each type). For this reason, the optimal deployment problem is in NEXPTIME. However, in practice, due to the resource usage of the microservices, the number of microservices to be deployed can be assumed to be polynomial in the size of the input. In this case the optimal deployment problem becomes an NP-optimization problem and its decision version is NP-complete. A formal proof of the complexity of the problem is available in Appendix A.
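The ordering argument used in Phase 3 can be sketched with the standard-library `graphlib` module: creation follows a topological order of the strong bindings, and deletion can follow the reverse order. The instance names and the strong bindings below are illustrative assumptions based on the running example, not output of the actual tool.

```python
from graphlib import TopologicalSorter


def creation_order(strong_bindings):
    """Order instances so that each one is created after its strong providers.

    strong_bindings maps an instance to the set of instances it is strongly
    bound to; instances that only appear as providers are included as well.
    """
    return list(TopologicalSorter(strong_bindings).static_order())


def deletion_order(strong_bindings):
    """Reverse of the creation order: dependents are removed before providers."""
    return list(reversed(creation_order(strong_bindings)))


# Assumed strong bindings of the target configuration of the example.
STRONG = {
    "mr": set(),
    "ma": {"aa"},
    "ma2": {"aa"},
    "ma3": {"aa2"},
}
ORDER = creation_order(STRONG)
```

Weak bindings, such as those from the Message Receiver to the Message Analyzers, do not constrain the order and can be added once all instances exist.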

Application of the technique to the case-study
Given the asymptotic complexity of our solution (NP under the assumption of polynomial size of the target configuration) we have decided to evaluate its applicability in practice by considering a real-world microservice architecture, namely the email processing pipeline described in [16]. The considered architecture separates and routes the components found in an email (headers, links, text, attachments) into distinct, parallel sub-pipelines with specific tasks (e.g., remove malicious attachments, tag the content of the mail). We report in Fig. 2 a depiction of the architecture. From left to right, when an email reaches the Message Receiver, it sends each component into a specific sub-pipeline. In the sub-pipelines, some microservices (e.g., Text Analyzer and Attachment Analyzer) coordinate with other microservices (e.g., Sentiment Analyzer and Virus Scanner) to process their inputs. Each microservice in the architecture has a given resource consumption (expressed in terms of CPU and memory). As expected, the processing of each email component entails a specific load. Some microservices can handle large inputs, e.g., in the range of 40K simultaneous requests, like the Header Analyzer that processes short and uniform inputs. Other microservices sustain heavier computations, like the Image Recognizer, and can handle smaller simultaneous inputs, e.g., in the range of 10K requests.
To model the system above, we use the Abstract Behavioral Specification (ABS) language, a high-level object-oriented language that supports deployment modeling [25]. ABS is agnostic w.r.t. deployment platforms (e.g., Amazon AWS, Microsoft Azure) and technologies (e.g., Docker or Kubernetes) and it offers high-level deployment primitives for the creation of new deployment components and the instantiation of objects inside them. Here, we use ABS deployment components as computation nodes, ABS objects as microservice instances, and ABS object references as bindings.
Finally, to describe the requirements in our model, we use ABS with SmartDepl [19], an extension that supports dependency annotations (e.g., from other classes, available resources) in ABS classes. We use annotations to model strong required interfaces as class dependencies and weak required interfaces as object references, which can be passed to running objects. We define a class for each microservice type, plus one load balancer class for each microservice type. A load balancer distributes requests over a set of microservice instances that can scale horizontally. Finally, we model nodes over three popular Amazon EC2 instances: c4 large, c4 xlarge, and c4 2xlarge (with the corresponding provided resources and costs).
In the table above, we report the result of our algorithm w.r.t. four incremental configurations: the initial one in column 2 and those under incremental loads in columns 3-5. We also consider an availability of 40 nodes for each of the three node types. In the first column of the table, next to a microservice type, we report its corresponding maximum computational load. As visible in columns 2-5, different maximal computational loads imply different scaling factors w.r.t. a given number of simultaneous requests. In the initial configuration we consider 10K simultaneous requests and we have one instance of each microservice type (and of the corresponding load balancer). The other deployment configurations deal with three scenarios of horizontal scaling, assuming three increasing increments of inbound messages (20K, 50K, and 80K). In the three scaling scenarios, we do not implement the planning algorithm described in Phase 3 of the proof of Theorem 1. Instead, we take advantage of the presence of the load balancers and, as described in Remark 1, we achieve a similar result with an optimization function that maximizes the number of bindings of the load balancers. For every scenario, we automatically generated the ABS code for the plan that deploys an optimal configuration, using a time cap of half an hour for every deployment scenario. The ABS code modeling the system and the generated code are publicly available at [5]. A graphical representation of the initial configuration is available in Appendix B.
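To make the relation between maximum computational load and scaling factor concrete, the following is a minimal Python sketch. It assumes that the number of instances of a microservice is the ceiling of the number of simultaneous requests divided by its maximum load, and that each increment is added on top of the initial 10K requests. Both assumptions, the function names, and the dictionary layout are illustrative and not taken from our tool; only the 40K and 10K capacities come from the examples in the text.

```python
# Per-instance capacity in simultaneous requests (example values from the text).
MAX_LOAD = {"HeaderAnalyzer": 40_000, "ImageRecognizer": 10_000}

INITIAL_REQUESTS = 10_000  # load of the initial configuration


def instances_needed(max_load: int, requests: int) -> int:
    """Smallest number of instances whose joint capacity covers `requests`."""
    # Integer ceiling division; at least one instance is always deployed.
    return max(1, -(-requests // max_load))


def scaling_plan(increment: int) -> dict:
    """Instances per microservice type after adding `increment` requests."""
    total = INITIAL_REQUESTS + increment  # assumption: increments are cumulative
    return {name: instances_needed(cap, total) for name, cap in MAX_LOAD.items()}


if __name__ == "__main__":
    for inc in (0, 20_000, 50_000, 80_000):
        print(inc, scaling_plan(inc))
```

Under these assumptions, a microservice with a lower maximum load is replicated earlier and more often than one with a higher maximum load, which is the effect visible across the columns of the table.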

Related Work and Conclusion
With the current popularity of Cloud Computing, the problem of automating application deployment has attracted a lot of attention and many system management tools exist [20,26,30,31]. Those tools support the specification of deployment plans but they do not support the automatic distribution of software instances over the available machines. For these reasons, those tools do not solve the deployment problem as defined in this paper, but are just deployment engines that concretely execute deployment plans.
The proposals closest to ours are those by Feinerer [14] and by Fischer et al. [15]. Both proposals rely on a solver to plan deployments. The first is based on the UML component model, which includes conflicts and dependencies, but lacks the modeling of nodes. The second does not support conflicts in the specification language. Neither proposal supports the computation of optimal deployments. Our work is inspired by the Aeolus component model [8,9], the Zephyrus configuration optimizer [1], and ConfSolve [22]. The Aeolus model paved the way to reason on deployment and reconfiguration, proving some decidability results. Zephyrus is a configuration tool grounded on the Aeolus model and underpins the first phase of our approach. Similarly, ConfSolve relies on constraint solving techniques to propose an optimal allocation of virtual machines to servers, and of applications to virtual machines. Both tools ignore the problem of synthesizing a low-level plan to reach the final configuration which, in the general case, has been proven undecidable. In this work, by considering microservices, we prove that the generation of the plan becomes decidable and thus fully automatable, from the synthesis of the optimal configuration to the generation of the actions to deploy it. We show a practical application of our approach on a non-trivial example of microservice architecture, modeled in the Abstract Behavioral Specification (ABS) language. As a result, we synthesize an optimal initial configuration and different scaling scenarios, generating the deployment actions directly in ABS.
Regarding autoscaling, existing solutions [2,4,12,23] support the automatic increase or decrease of the number of instances of a service/container when some conditions (e.g., CPU average load greater than 80%) are met. Our work is an example of how we can go beyond such single-component horizontal scaling policies: our approach supports the computation of optimal horizontal scaling operations that involve more than one service at the same time, thus enabling reasoning on autoscaling operations at the application level.
As future work, we are interested in investigating local search approaches to speed up the solution of the optimization problems involved in the deployment problem. This would allow us to use our approach at run time, when responses are needed in a short amount of time (e.g., a few minutes, the time it usually takes to start a new virtual machine in a public cloud), at the price of losing the optimality guarantee of the solutions. This is probably an inevitable trade-off due to the NP-hardness of the optimal deployment problem.

A Optimal Deployment Problem Complexity
Theorem 2. The optimal deployment problem is in NEXPTIME. If the number of microservices to be deployed is polynomial in the size of the input, the problem is an NP-optimization (NPO) problem and its decision version is NP-complete.
Proof. The proof derives from the fact that the decision version of the optimization problem solved in phase 1 is NP-complete, the decision version of the optimization problem solved in phase 2 is in NP, and the problem in phase 3 is polynomial.
Due to the fact that numeric constraints can be represented in log space, the input of phase 2 can be exponential in the size of the output of phase 1. This for instance happens when the target component requires an interface p with numerical constraint ≥ n and when all the components providing the interface p have numerical constraint equal to 1. The solution in phase 1 will require the deployment of n microservices and can be represented in O(log(n)) space. However, phase 2 requires the list of microservices to be deployed and this is represented only in O(n) space.
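The gap between the two representations can be illustrated with a small Python sketch. The function names and the `provider_k` naming scheme are hypothetical and only serve the illustration: the first function measures the size of the binary encoding of the numerical constraint n, the second builds the explicit list of microservices that phase 2 takes as input.

```python
def encoded_bits(n: int) -> int:
    """Bits needed to write the numerical constraint n in binary: O(log n)."""
    return max(1, n.bit_length())


def expanded_instances(n: int) -> list:
    """Explicit list of the n providers that phase 2 must receive: O(n)."""
    return [f"provider_{k}" for k in range(n)]


if __name__ == "__main__":
    n = 1_000_000
    # The constraint fits in a handful of bits, the list has one entry per provider.
    print(encoded_bits(n), len(expanded_instances(n)))
```

Since n can be exponential in the number of bits used to encode it, the list handed to phase 2 can be exponentially larger than the original input.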
This makes the optimal deployment problem a NEXPTIME problem. However, when the microservices to be deployed in the final configuration are polynomially bounded in the size of the input, the optimal deployment problem becomes an NPO problem, since its decision version is an NP-complete problem, being equivalent to the execution in sequence of two NP-complete problems.
We will now proceed by proving the complexity of the 3 phases used to solve the optimal deployment problem.
Phase 1 As proven in [8], the constraints in 1a . . . 2e can be linearized. Since the remaining constraints 3a . . . 3d are the standard linear constraints of the bin packing problem, all the constraints of phase 1 are linear and therefore the problem is in NP. The hardness can be proven by reducing the bin packing problem to the considered problem. The reduction is straightforward: bins correspond to nodes and packages are represented by microservices. The size of a package is encoded in the resource consumption of the microservice. The problem of minimizing the number of bins is therefore translated into finding the minimal number of nodes needed to deploy the given microservices. To force the deployment of all the microservices, a new dummy target component of size 0 may be introduced, which uses strong required interfaces to require the deployment of all the other microservices.
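The reduction can be sketched in Python as follows. This is only an illustration under simplifying assumptions: a single resource dimension instead of CPU and memory, identical nodes, and a dictionary layout and function names that are hypothetical and unrelated to our tool. The brute-force search is exponential and is used only to check the correspondence on tiny instances.

```python
from itertools import product


def bin_packing_to_deployment(sizes, capacity):
    """Map a bin packing instance to a (hypothetical) deployment instance."""
    # One microservice per package; its resource consumption is the package size.
    microservices = {f"m{k}": size for k, size in enumerate(sizes)}
    # Dummy target of size 0 that strongly requires every other microservice.
    target = {"size": 0, "strong_requires": list(microservices)}
    return {
        "node_capacity": capacity,    # bins correspond to nodes
        "max_nodes": len(sizes),      # one node per microservice always suffices
        "microservices": microservices,
        "target": target,
    }


def min_nodes(instance):
    """Brute force: fewest nodes that can host all microservices, or None."""
    sizes = list(instance["microservices"].values())
    cap = instance["node_capacity"]
    for n in range(1, instance["max_nodes"] + 1):
        # Try every assignment of microservices to n nodes.
        for assignment in product(range(n), repeat=len(sizes)):
            load = [0] * n
            for size, node in zip(sizes, assignment):
                load[node] += size
            if all(used <= cap for used in load):
                return n
    return None  # some microservice does not fit on any node


if __name__ == "__main__":
    print(min_nodes(bin_packing_to_deployment([4, 4, 3, 3, 2], capacity=8)))
```

The minimal number of nodes returned for the deployment instance coincides with the minimal number of bins of the original bin packing instance, which is the property the hardness argument relies on.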
Phase 2 As far as the decision version of the phase 2 problem is concerned, it is clear that it is in NP due to the linearity of the constraints 4a . . . 4d.
To prove the decidability of the deployment problem, it is not needed to optimize the bindings. This, however, may be useful to express preferences over bindings on the final configuration. In the following we study the complexity of the problem when a metric is used to try to optimize the bindings. We restrict ourselves to consider only linear metrics.
When linear metric constraints are used, the problem becomes NP-hard. By choosing the right metric it is indeed possible to reduce the partition problem to the considered problem.
The partition problem, a well-known NP-complete problem, checks the existence of a partition of a set S into two subsets A, B such that the difference between the sum of elements in A and the sum of elements in B is 0.
This problem can be encoded by using i) a microservice $T_i$ for every number $i \in S$, ii) two microservices $T_A$ and $T_B$ representing the two sets A and B, and iii) a dummy target microservice that requires the deployment of all the others. We can enforce all the microservices to be deployed only once by allowing them to provide and be in conflict with the same interface ($p_i$ for the $T_i$ microservices, $p_a$ for $T_A$, and $p_b$ for $T_B$). Every $T_i$ should provide interfaces $p$ and $q$ with a numerical constraint $\leq 1$. $T_A$ and $T_B$ should instead weakly require the interface $p$ with numerical constraint $\geq 0$ and provide the interface $q$. The dummy target microservice should only require $|S| + 2$ interfaces $q$. With this universe of microservices, it is possible to define a metric that weights with $i$ every connection between $T_A$ and $T_i$, and with $-i$ every connection between $T_B$ and $T_i$. The original partition problem can be solved by checking whether the sum of the weights is 0.
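The role of the metric in this encoding can be illustrated with a short Python sketch. It assumes that every microservice T_i is bound to exactly one of T_A and T_B, and it enumerates all binding choices by brute force, which is exponential and only meant for tiny inputs. The function names are hypothetical.

```python
from itertools import product


def binding_weight(numbers, assignment):
    """Metric value: +i for a binding T_A--T_i, -i for a binding T_B--T_i."""
    return sum(i if side == "A" else -i for i, side in zip(numbers, assignment))


def has_zero_weight_binding(numbers):
    """True iff some choice of bindings has total weight 0, i.e. S can be
    partitioned into two subsets with equal sums."""
    return any(
        binding_weight(numbers, assignment) == 0
        for assignment in product("AB", repeat=len(numbers))
    )


if __name__ == "__main__":
    print(has_zero_weight_binding([3, 1, 1, 2, 2, 1]))
    print(has_zero_weight_binding([1, 2, 4]))
```

A binding choice with total weight 0 corresponds exactly to a partition of S into two subsets with the same sum: the numbers bound to T_A form the set A and those bound to T_B form the set B.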
Phase 3 It is easy to see that the third phase is polynomial: this simply follows from the polynomial complexity of the topological sort over the number of components to be deployed and the set of interfaces $I(U) = \bigcup_{T \in U} \mathrm{dom}(T.\mathrm{req}_s) \cup \mathrm{dom}(T.\mathrm{req}_w) \cup \mathrm{dom}(T.\mathrm{prov}) \cup T.\mathrm{conf}$.

B Graphical representation of the initial configuration
Figure 3 provides the graphical representation of the automatically synthesized initial configuration for our case study. The same image, for visualization purposes, has been split in three and is shown in Figure 4.
In Figure 3, the outermost boxes represent the AWS virtual machines, while the innermost boxes represent the services deployed on those virtual machines. The box names represent the kind of virtual machines used and the kind of objects deployed (preceded by the word default, corresponding to an ABS/SmartDepl parameter that we have not used in our case study).
The red boxes within a microservice A represent the required interfaces (either strong or weak), while the green boxes represent the provided interfaces of A. An arrow from a service A towards a service B represents the fact that A is used at runtime by B and that B needs to know the reference to A.
As can be seen from the image, the optimal initial deployment consists of 24 components, distributed over 5 virtual machines of type 2xlarge, 4 of type xlarge, and 10 of type large.
Fig. 3. Initial configuration of the email microservice system.
Fig. 4. Initial configuration of the email microservice system split in three for visualization purposes.