A projection-based continuous-time algorithm for distributed optimization over multi-agent systems

Multi-agent systems are widely studied owing to their ability to solve complex tasks in many fields, especially in deep reinforcement learning. Recently, the distributed optimization problem over multi-agent systems has drawn much attention because of its extensive applications. This paper presents a projection-based continuous-time algorithm for solving convex distributed optimization problems with equality and inequality constraints over multi-agent systems. The distinguishing feature of such problems is that each agent, with its private local cost function and constraints, can only communicate with its neighbors. All agents aim to cooperatively minimize the sum of the local cost functions. With the aid of the penalty method, the states of the proposed algorithm enter the equality constraint set in fixed time and ultimately converge to an optimal solution of the objective problem. In contrast to some existing approaches, the continuous-time algorithm has fewer state variables, and the verification of consensus is incorporated into the proof of convergence. Finally, two simulations are given to show the viability of the algorithm.


Introduction
Reinforcement learning stems from an 1898 experiment by Thorndike on the behavior of cats [20]. With the development of theory and applications, the concept of deep reinforcement learning (DRL) was proposed. DRL is an interdiscipline of reinforcement learning and deep learning that copes with high-dimensional environments [17]. In the DRL field, multi-agent systems have attracted much attention, since they are able to handle complex tasks through the cooperation of individual agents. Thus, the study of multi-agent systems is important.
Owing to widespread applications in resource allocation [1], machine learning [21], robot control [25], and so on, the distributed optimization problem over multi-agent systems has drawn much attention in recent years. In the solving process, agents can only communicate with their neighbors, and the optimal solutions of distributed optimization problems are computed through this interaction. In the past few decades, many research results have emerged for distributed optimization problems with various constraints (see [3,14,18] and references therein). In [18], a topology network is introduced to solve a distributed optimization problem in which time delays are considered in the communication among agents. Using the augmented Lagrange multiplier method, a nonsmooth distributed optimization problem with inequality constraints is studied in [29]. Moreover, a penalty method is utilized in [14] for distributed optimization to simplify constraints on an undirected topology graph.

(Corresponding author: Sitian Qin, qinsitian@163.com, qinsitian@hitwh.edu.cn; Xingnan Wen, hitwhwxn@163.com. Department of Mathematics, Harbin Institute of Technology, Weihai 264209, People's Republic of China.)
The neurodynamic approach is a powerful tool for distributed optimization problems as well as traditional optimization. For solving optimization problems, Xia et al. [24] proposed a neurodynamic approach with global convergence. In [22], second-order distributed dynamics are introduced to solve an unconstrained optimization problem. In applications of neurodynamic approaches, time delays are often inevitable. To consider the influence of time delays, Zeng et al. presented continuous-time algorithms with time-varying delays and studied their stability in [31,32]. To suit digital computers, discrete-time recurrent algorithms are also considered in [30].
In recent years, with the goal of reducing the number of layers of the related models, the penalty method has attracted great attention among scholars. In [33], a discrete-time algorithm is proposed to solve distributed optimization problems without constraints. The authors of [33] discuss convergence and error tolerance with the help of a penalty parameter estimated in advance. To solve distributed optimization problems with double local constraints, a neurodynamic approach is proposed in [16] based on projection and regularization methods. It is worth noting that the computational burden may increase when the neural network in [16] is applied to distributed optimization problems with complex structures. For broader applicability, Jiang et al. [8] proposed a penalty-like continuous-time algorithm for distributed optimization problems with affine equality and convex inequality constraints, without resorting to the penalty method.
From the above research findings, it can be seen that distributed optimization problems have been studied in detail under various constraints and topology graphs. It is also worth noting that the computational complexity varies among different models. To introduce a proper algorithm that simplifies the calculation and complexity, the penalty method utilized in [14] is also adopted here. In addition, we construct a projection-based continuous-time algorithm to solve distributed optimization problems with equality and inequality constraints over multi-agent systems. The main contributions of this paper are listed as follows:

1. A continuous-time algorithm is proposed to solve the distributed optimization problem over multi-agent systems. Since the local cost functions and inequality constraint functions are studied without smoothness assumptions, the distributed optimization problem here is more general than the one in [15]. Moreover, the algorithm is constructed as a differential inclusion system, which is more flexible than general gradient systems such as those in [6,26].

2. By introducing the penalty method, the distributed optimization problem with equality and inequality constraints is transformed into a new one, to which the proposed algorithm can then be applied. Compared with the conventional approach for distributed optimization problems in [12], the penalty method has the advantage of reducing the dimension of the model.

3. From any initial point, the states of the continuous-time algorithm are guaranteed to enter the equality constraint set in fixed time. This property allows the algorithm to simplify the distributed optimization problem by ignoring one of the constraints after a period of time. Since the fixed entering time is obtained without calculating (AA^T)^{-1}, the continuous-time algorithm is easier to apply than the algorithm in [11].

4. Different from conventional approaches, which rely on introducing the Laplacian matrix to guarantee consensus, the continuous-time algorithm achieves consensus in the process of handling the penalty terms. Compared with the algorithm in [12], the approach herein has fewer state variables, which is preferable in application fields.
The rest of this paper is organized as follows. In the section "Preliminaries and problem description", some necessary preliminaries are introduced and the targeted distributed optimization problem is described. In the section "Problem reformulation and neurodynamic approach", the reformulated problem is given and a continuous-time algorithm is proposed. The convergence of its states is discussed in the section "Convergence analysis". Two numerical examples are presented in the section "Simulations" to verify the performance of the proposed algorithm. In the section "Conclusions", the conclusions of this paper and our future work are summarized.

Preliminaries and problem description
This section presents some basic concepts concerning graph theory and nonsmooth analysis, as well as some necessary lemmas. Then, the studied problem is introduced.

Graph theory and nonsmooth analysis
A graph G(N, E) is utilized in this paper to depict the information sharing among agents, where N = {1, 2, ..., m} is the set of agents and E ⊂ N × N is the edge set. An edge e_ij exists between a pair of agents (i, j) if they can communicate with each other. Denote by N_i = {j ∈ N : e_ij ∈ E} the neighbor set of agent i. A path between agents i and j in graph G is a sequence of edges (i, i_1), (i_1, i_2), ..., (i_k, j), where i_1, i_2, ..., i_k ∈ N. An undirected graph G is said to be connected if there is a path between any pair of agents.
Lemma 1 [4] If Ψ : R^n → R is convex on Ω ⊆ R^n, then for any u_1, u_2 ∈ Ω, the subdifferential of Ψ satisfies:

Ψ(u_1) − Ψ(u_2) ≥ ⟨η, u_1 − u_2⟩, ∀ η ∈ ∂Ψ(u_2).

Due to Proposition 2.3.6 in [4], a convex function Ψ : R^n → R is also regular.
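The subgradient inequality in Lemma 1 can be checked numerically on a simple nonsmooth convex function. The following sketch is a toy example of ours (not from the paper), using Ψ(u) = |u|, whose subdifferential at the origin is the interval [−1, 1]:

```python
# Numerical check of the subgradient inequality of Lemma 1 for the
# nonsmooth convex function psi(u) = |u|.  This toy example is
# illustrative only; the paper works with general convex Psi.

def psi(u):
    return abs(u)

def subdifferential(u):
    """Representative subgradients of |.| at u (a few elements of [-1,1] at 0)."""
    if u > 0:
        return [1.0]
    if u < 0:
        return [-1.0]
    return [-1.0, -0.5, 0.0, 0.5, 1.0]

# Check psi(u1) - psi(u2) >= eta * (u1 - u2) for every sampled eta in d psi(u2).
points = [-2.0, -0.5, 0.0, 0.7, 3.0]
for u1 in points:
    for u2 in points:
        for eta in subdifferential(u2):
            assert psi(u1) - psi(u2) >= eta * (u1 - u2) - 1e-12
print("subgradient inequality verified on sample points")
```

The kink at u = 0 is exactly where the set-valued subdifferential (rather than a single gradient) is needed, which is why the paper's dynamics are posed as a differential inclusion.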

Several necessary lemmas
Consider the system

u̇(t) = −α u(t)^{a/b} − β u(t)^{(2b−a)/b}, u(0) = u_0, (1)

where u(t), u_0 ∈ R^n (the odd powers are taken componentwise), α, β are positive constants, and a, b > 0 are odd integers satisfying a < b.

Definition 2 For any initial point u(0) = u_0 ∈ R^n, if (1) there exists a constant T_max > 0 such that the settling time T satisfies T ≤ T_max independently of u_0, and (2) u(t) − u* = 0 for t > T, where u* is an equilibrium point of system (1), then the equilibrium point of system (1) is said to be fixed-time stable.
Lemma 3 [34] The origin of system (1) is fixed-time stable, and the upper bound of the settling time can be calculated as:

T ≤ bπ / (2(b − a)√(αβ)).

Lemma 5 [34] Suppose that Ω ⊆ R^n is a nonempty closed convex set. Then, for any u ∈ R^n, the projection operator P_Ω(u) = arg min_{v∈Ω} ‖u − v‖ satisfies:

⟨u − P_Ω(u), v − P_Ω(u)⟩ ≤ 0, ∀ v ∈ Ω.
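A scalar fixed-time system of the standard form u̇ = −α·sig(u)^{a/b} − β·sig(u)^{(2b−a)/b}, where sig(u)^p := sign(u)|u|^p, is consistent with the settling-time bound bπ/(2(b−a)√(αβ)) used later in the paper. The following sketch (our own, with illustratively chosen α = β = 2, a = 1, b = 3) integrates this assumed form by forward Euler and confirms that the settling time stays below the bound for several initial conditions:

```python
# Forward-Euler simulation of the assumed scalar fixed-time dynamics
#   du/dt = -alpha*sig(u)^(a/b) - beta*sig(u)^((2b-a)/b),
# checking that trajectories settle before the bound of Lemma 3,
# uniformly in the initial condition.  Parameters are our own choice.
import math

def sig(u, p):
    """Signed power: sign(u) * |u|**p (robust for fractional p and u < 0)."""
    return math.copysign(abs(u) ** p, u)

def settle_time(u0, alpha=2.0, beta=2.0, a=1, b=3, dt=1e-4, tol=1e-3):
    """First time |u(t)| < tol under forward-Euler integration."""
    u, t = u0, 0.0
    while abs(u) >= tol:
        u += dt * (-alpha * sig(u, a / b) - beta * sig(u, (2 * b - a) / b))
        t += dt
    return t

# Lemma 3 bound: b*pi / (2*(b-a)*sqrt(alpha*beta)) with a=1, b=3, alpha=beta=2.
bound = 3 * math.pi / (2 * (3 - 1) * math.sqrt(2.0 * 2.0))
for u0 in (5.0, -40.0, 0.5):        # settling time is uniform in u0
    assert settle_time(u0) < bound
print("all trajectories settle before the fixed-time bound %.3f" % bound)
```

Note that the sublinear term a/b < 1 forces finite-time convergence near the origin, while the superlinear term (2b − a)/b > 1 pulls distant initial states in quickly; together they make the settling time bounded independently of u_0, which is exactly the fixed-time property of Definition 2.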

Distributed optimization problem
In this paper, we consider a network consisting of m agents over an undirected graph G that cooperatively solve the following distributed optimization problem:

min Σ_{i=1}^m f_i(x), s.t. g_i(x) ≤ 0, A_i x = b_i, i = 1, 2, ..., m, (2)

where x ∈ R^n is the decision variable, f_i : R^n → R are local cost functions, g_i(x) = (g_{i1}(x), g_{i2}(x), ..., g_{ip_i}(x))^T : R^n → R^{p_i} may be nonsmooth, A_i ∈ R^{q×n} are row-full-rank matrices, and b_i ∈ R^q. For convenience, we denote S_i = {x ∈ R^n : g_i(x) ≤ 0}. In this paper, we assume by default that there exists at least one minimizer of distributed optimization problem (2). To solve distributed optimization problem (2), some useful assumptions are first introduced here.

Assumption 1
The graph G is undirected and connected.

Assumption 2
The Slater condition of distributed optimization problem (2) holds; i.e., there exists x̂ ∈ R^n such that g_i(x̂) < 0 and A_i x̂ = b_i for i = 1, 2, ..., m. Besides, the sets S_i are bounded.

Assumption 3 The local cost functions f_i(x) are all convex and Lipschitz on R^n with Lipschitz constant L.

Remark 1
Assumptions 1-3 are widely adopted in solving distributed optimization problems, such as in [12,15]. Assumption 1 guarantees that agents can communicate with their neighbors. In addition, Assumptions 2-3 imply that distributed optimization problem (2) is solvable.

Problem reformulation and neurodynamic approach
In this section, based on the penalty method, the distributed optimization problem (2) is equivalently converted into a new one. Besides, a neurodynamic approach is proposed to solve the reformulated distributed optimization problem.

Problem reformulation
Consider the following distributed optimization problem:

min Σ_{i=1}^m ( f_i(x_i) + σ G_i(x_i) ) + λ Σ_{i=1}^m Σ_{j∈N_i} ‖x_i − x_j‖, s.t. x_i ∈ Ω_i, i = 1, 2, ..., m, (3)

where f_i are from distributed optimization problem (2), G_i(x_i) = Σ_{k=1}^{p_i} max{g_{ik}(x_i), 0}, and σ, λ > 0 are penalty parameters.
Proposition 3 Let Assumption 2 hold. Then, the functions G_i(x_i) in distributed optimization problem (3) are coercive, that is, G_i(x_i) → +∞ as ‖x_i‖ → +∞. Let ĝ = − max_{1≤i≤m, 1≤k≤p_i} {g_{ik}(x̂)}, where x̂ is the Slater point in Assumption 2. Then, we have the following conclusion.

Theorem 1 Suppose that Assumptions 1-3 hold. Then, there exist σ_0, λ_0 > 0 such that, for any σ > σ_0 and λ > λ_0, distributed optimization problem (2) is equivalent to distributed optimization problem (3).
Proof Consider a distributed optimization problem as follows: (5) We will first prove that distributed optimization problem (2) is equivalent to problem (5).
Remark 2 The penalty method is a widely adopted approach for solving optimization problems with various constraints, such as in [19,28]. By introducing penalty parameters, one can reduce the dimension of the related algorithms. In this paper, we introduce two penalty parameters to reformulate the original distributed optimization problem into a new one. Compared with the conventional continuous-time algorithm in [12], the model herein has the advantage of fewer state variables.
According to Theorem 1, the equivalence of distributed optimization problems (2) and (3) relies on Ω_i = Ω_j for all i, j ∈ {1, 2, ..., m}, where Ω_i = {x ∈ R^n : A_i x = b_i}. Let Ω denote this common set, i.e., Ω_i = Ω for every i. In the remaining part of this paper, this condition is assumed by default.
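The two penalty terms of the reformulated problem, namely the exact penalty G_i(x_i) = Σ_k max{g_{ik}(x_i), 0} for the inequality constraints and the consensus penalty summing ‖x_i − x_j‖ over neighboring agents, can be sketched as follows. The constraint functions and the three-agent path graph below are illustrative choices of ours, not data from the paper:

```python
# Sketch of the two penalty terms of reformulated problem (3):
# G_i penalizes inequality-constraint violation, and the consensus
# term sums ||x_i - x_j|| over neighbor pairs (i, j) with j in N_i.
import numpy as np

def G(x, g_list):
    """Exact (nonsmooth) penalty for the inequality constraints g_k(x) <= 0."""
    return sum(max(g(x), 0.0) for g in g_list)

def consensus_penalty(xs, neighbors):
    """Sum of ||x_i - x_j|| over every ordered pair with j in N_i."""
    return sum(np.linalg.norm(xs[i] - xs[j])
               for i in neighbors for j in neighbors[i])

# Illustrative data: two inequality constraints and a 3-agent path graph.
g_list = [lambda x: x[0] ** 2 + x[1] ** 2 - 4.0,   # stay inside a disk
          lambda x: -x[0] - 1.0]                   # a half-space
neighbors = {0: [1], 1: [0, 2], 2: [1]}

xs = [np.array([3.0, 0.0]), np.array([0.0, 0.0]), np.array([0.0, 0.0])]
print(G(xs[0], g_list))                  # 5.0: first constraint violated by 5
print(consensus_penalty(xs, neighbors))  # 6.0: ||x_0 - x_1|| counted twice
```

Both terms vanish exactly when every x_i is feasible and all agents agree, which is why minimizing the penalized objective over the equality set alone recovers the original constrained, consensus-coupled problem for sufficiently large σ and λ.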

Projection-based continuous-time algorithm
To solve distributed optimization problem (3), a projection-based continuous-time algorithm is proposed as follows: (17) where the set-valued terms are the subdifferentials of the objective function of distributed optimization problem (3). They can be regarded as a subgradient descent term and ensure that the states of continuous-time algorithm (17) converge to an optimal solution of distributed optimization problem (3).
The projection-based continuous-time algorithm (17) can also be expressed in the following compact form: (18) The continuous-time algorithm (17) is the distributed form of (18), and it can be seen that algorithm (17) is fully distributed.
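To give a concrete feel for this family of dynamics, the following minimal sketch integrates the classical projection-based flow ẋ = −x + P_Ω(x − ∇f(x)) on a toy problem of our own (minimizing ‖x − c‖² over a box). It is not the paper's algorithm (17), whose right-hand side involves subgradients and penalty terms, but it illustrates how an equilibrium of a projected flow coincides with the constrained minimizer, as in Proposition 4:

```python
# Generic projection-based flow  dx/dt = -x + P_Omega(x - grad f(x)).
# Toy problem (ours): minimize ||x - c||^2 over the box Omega = [0,1]^2;
# the minimizer is the projection of c onto the box, here (1, 0).
import numpy as np

c = np.array([2.0, -1.0])

def proj_box(x, lo=0.0, hi=1.0):
    """Projection onto the box [lo, hi]^n (Lemma 5's operator for a box)."""
    return np.clip(x, lo, hi)

def grad_f(x):
    return 2.0 * (x - c)

x = np.array([0.5, 0.5])
dt = 0.05
for _ in range(400):                         # forward-Euler integration
    x = x + dt * (-x + proj_box(x - grad_f(x)))

print(np.round(x, 4))                        # close to the minimizer (1, 0)
assert np.allclose(x, [1.0, 0.0], atol=1e-3)
```

At an equilibrium, x = P_Ω(x − ∇f(x)), which by the projection inequality of Lemma 5 is exactly the first-order optimality condition of the constrained problem; this is the mechanism behind the "equilibrium point = optimal solution" statement below.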

Proposition 4
The equilibrium point of continuous-time algorithm (18) is an optimal solution to distributed optimization problem (3), and vice versa.

Convergence analysis
In this part, with the help of the Lyapunov method and the above preliminaries, we study the convergence of continuous-time algorithm (17).
Theorem 2 Let Assumptions 1-3 hold. For any initial point x(0) = col{x_1^(0), x_2^(0), ..., x_m^(0)} ∈ R^{mn}, the state solution x(t) of neural network (18) satisfies (19) for a.e. t ≥ 0, where ξ_i(t) ∈ u_i(t) is a vector, and x(t) enters the equality constraint set in fixed time.

Proof Consider the Lyapunov function V_i(x). According to Lemma 5 and (19), we can take the derivative of V_i(x) along x_i(t) as follows: (20) for a.e. t ≥ 0. Since ⟨v_i(t), ξ_i(t)⟩ − |ξ_i(t)||v_i(t)| ≤ 0, it then follows from (20) that: (21) and it can be immediately obtained that: (22) By Lemma 3, we have lim_{t→T} V_i(x(t)) = 0 for i = 1, 2, ..., m. Thus, x_i(t) ∈ Ω for t ≥ T, i = 1, 2, ..., m, which means that, for any initial point x(0) = col{x_1^(0), ..., x_m^(0)} ∈ R^{mn}, x(t) will enter the set Ω × Ω × ··· × Ω in fixed time, and T is bounded by T ≤ bπ/(2(b − a)).

Remark 4
It is worth noting that the property of entering one of the constraint sets or the feasible region is possessed by many continuous-time algorithms for solving optimization problems, such as [8,11,19,34]. This property guarantees that continuous-time algorithm (17) has the advantage of simplifying the distributed optimization problem by ignoring one of the constraints once the states enter this set. Compared with [11], continuous-time algorithm (17) is constructed so that the fixed entering time is obtained without calculating (AA^T)^{-1}, which makes it easier to apply in engineering fields.
Lemma 6 [2] (Opial) Suppose that x : [0, +∞) → R^n is a curve. Then, x(t) converges to a point in a nonempty set C ⊆ R^n as t → +∞ if and only if the following two conditions are satisfied: (1) all accumulation points of x(·) lie in C; (2) for any x* ∈ C, lim_{t→+∞} ‖x(t) − x*‖ exists.

Theorem 3 Assume that Assumptions 1-3 hold. Then, for any initial point x(0) = col{x_1^(0), x_2^(0), ..., x_m^(0)} ∈ R^{mn}, the state solution x(t) of neural network (17) will converge to its equilibrium point.
Proof Suppose that x* = col{x_1*, x_2*, ..., x_m*} and C* are an equilibrium point and the equilibrium point set of neural network (18), respectively. According to Theorem 2, from the initial point x_0 = col{x_1^(0), x_2^(0), ..., x_m^(0)} ∈ R^{mn}, the state solution x(t) of neural network (17) will enter the equality constraint set in fixed time, and the fixed time satisfies T ≤ bπ/(2(b − a)). In other words, we immediately get x_i(t) ∈ Ω for t ≥ T. Hence, we have v_i(t) = 0 for t ≥ T. Then, the neural network can be simplified as: (23) for t ≥ T, where u_i(t) is from (17). In the following part, we study the convergence of continuous-time algorithm (23).
Consider the Lyapunov function W(t) on t ∈ [T, +∞). Taking the derivative of W(t), there exists: (25) for a.e. t ≥ T. Since x* is an equilibrium point of algorithm (18), it is also an equilibrium point of algorithm (23). Thus, there exist subgradients such that: (26) for i = 1, 2, ..., m, which means that: (27) Combining (25) and (27), we have: (28) Due to the fact that f_i(x_i), G_i(x_i), and Σ_{j∈N_i} ‖x_i − x_j‖ are convex functions, it follows by Lemma 1 that the corresponding subgradient inequalities hold. Hence, it can be immediately obtained from (28) that Ẇ(t) ≤ 0 for a.e. t ≥ T. Because of the nonnegativity of Σ_{i=1}^m ‖x_i(t) − x_i*‖², we can conclude that lim_{t→+∞} W(t) exists. On the other hand, for the convenience of discussion, let: (29) where K(x) denotes the objective function of the reformulated problem. This means that, for any ε > 0, there exists T' > T such that: (30) Since x* = col{x_1*, x_2*, ..., x_m*} is an equilibrium point of continuous-time algorithm (18), by Proposition 4, x* is also an optimal solution to distributed optimization problem (3). Since x_i(t) ∈ Ω for t ≥ T, we have x(t) ∈ Ω × ··· × Ω for t ≥ T' > T. By (25) and (30), we have: (31) for t ≥ T'. The first inequality holds by Definition 1. By integrating (31) from T' to +∞, we have: (32) which leads to a contradiction. Hence, we obtain lim inf_{t→+∞} K(x(t)) = min_{x ∈ Ω×···×Ω} K(x). Thus, there exists a time sequence {t_n} such that K(x(t_n)) → min_{x ∈ Ω×···×Ω} K(x), which implies by (31) that W(t_n) → 0 as n → +∞. Since lim_{t→+∞} W(t) exists, we get lim_{t→+∞} W(t) = 0. Therefore, x* ∈ C* is an accumulation point of x(t). By Lemma 6, it can be concluded that x(t) of neural network (18) converges to its equilibrium point.
According to Proposition 4 and Theorem 1, the equilibrium point of neural network (17) is an optimal solution to distributed optimization problem (2). Then, we have the following conclusion:

Corollary 1 Assume that Assumptions 1-3 hold. Then, for any initial point x(0) = col{x_1^(0), ..., x_m^(0)} ∈ R^{mn} and any σ > σ_0, λ > λ_0, the states x(t) of continuous-time algorithm (18) will converge to an optimal solution to distributed optimization problem (2).
Remark 5 In [12], distributed optimization problem (2) has been studied under assumptions similar to those in this paper. With the help of the Laplacian matrix, the authors of [12] proposed a differential inclusion system and proved the consensus and convergence of the agents. Different from [12], continuous-time algorithm (17) achieves convergence via a transformed distributed optimization problem. Compared with the algorithm in [12], continuous-time algorithm (17) has fewer state variables, which is preferable in application fields. We also list some algorithms for solving distributed optimization problems in Table 1 to make detailed comparisons.

Simulations
This section presents two numerical examples to show the viability of the above conclusions and of the projection-based continuous-time algorithm. Details are given as follows:

Example 1 A three-agent system communicating over an undirected connected graph is considered in this example, and its interaction topology is presented in Fig. 1. The distributed optimization problem is depicted as: (33) where x_i = (x_{i,1}, x_{i,2}, x_{i,3})^T ∈ R^3 is the decision variable of agent i. The local cost function of distributed optimization problem (33) is defined with a constant q = 2 and M = (1, 3, 4)^T. To be specific, the equality constraint set is Ω = {x_i ∈ R^3 : x_{i,1} − 2x_{i,2} + 3x_{i,3} = 1}; it is obvious that A = (1, −2, 3) is a row-full-rank matrix and b = 1. In addition, the inequality constraint functions g_i are chosen such that the sets S_i = {x_i : g_i(x_i) ≤ 0} are nonempty and satisfy int(Ω) ∩ S_i ≠ ∅. Also, since the gradient of f_i(x_i) can be calculated as ∇f_i(x_i) = M = (1, 3, 4)^T, i = 1, 2, 3, it is clearly bounded on the equality constraint set. Thus, all assumptions in this paper are satisfied.

Fig. 2 Trajectories (x_{i,1}, x_{i,2}, x_{i,3}) of the states of continuous-time algorithm (17)
Simulation results are presented in Figs. 2 and 3, which show the trajectories (x_{i,1}, x_{i,2}, x_{i,3})^T of the states with random initial values x_i^(0). The colored area in Fig. 2 is the feasible region of distributed optimization problem (33), and the above-mentioned initial points are marked in black. Besides, the red star marks an optimal solution of distributed optimization problem (33). It is easy to see that the state solutions of continuous-time algorithm (17) enter the equality constraint set in fixed time and stay in it thereafter. After that, the trajectories of all three agents converge to the equilibrium point (−0.3, −2.5, −1.233)^T. The transient behaviors of the three agents' solutions of continuous-time algorithm (17) are also presented in Fig. 3.
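As a quick sanity check on the reported numbers (taking A = (1, −2, 3) and b = 1 as stated in the example), one can verify that the consensus point (−0.3, −2.5, −1.233)^T satisfies the equality constraint up to the rounding of the third component:

```python
# Feasibility check for Example 1: the reported consensus point should
# satisfy A x = b with A = (1, -2, 3) and b = 1 up to rounding.
import numpy as np

A = np.array([1.0, -2.0, 3.0])
b = 1.0
x_star = np.array([-0.3, -2.5, -1.233])

residual = abs(A @ x_star - b)
print("equality-constraint residual: %.4f" % residual)  # about 0.001
assert residual < 1e-2
```

The residual of roughly 0.001 is consistent with x_{i,3} being reported to three decimal places, and it confirms that the matrix A = (1, −2, 3) (rather than coefficients (1, −2, 1)) matches the reported equilibrium.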

Remark 6
In [9], Jia et al. proposed a generalized neural network for solving distributed optimization problems with inequality constraints only. Thus, that model cannot be directly applied to distributed optimization problem (33) in Example 1. On the other hand, continuous-time algorithm (17) is also suitable for the distributed optimization problems in [9], provided that there is no equality constraint in distributed optimization problem (2) or that the equality constraints are satisfied after the fixed time given in Theorem 2. Therefore, compared with the neural network in [9], continuous-time algorithm (17) is more flexible for solving different distributed optimization problems.
Example 2 Consider a six-agent network with the communication topology shown in Fig. 4. The distributed optimization problem is given as follows: (35) The local cost functions of distributed optimization problem (35) are defined with M_i = (1, 3, i)^T, which depends on each agent. Calculating the gradient of f_i(x_i), one derives ∇f_i(x_i) = M_i. Despite the variability of the local cost functions, the boundedness of their gradients over the equality constraint set can still be guaranteed with a limited number of agents. It can be seen that A = (1, 1, −2) is a row-full-rank matrix. Apparently, distributed optimization problem (35) meets the conditions assumed in this paper, for instance, the boundedness of the inequality constraint sets and the nonemptiness of the feasible region. The transient behaviors of the states of continuous-time algorithm (17) are presented in detail in Fig. 5, in which x_{i,j}(t) represents the trajectories of the states, starting from randomly chosen initial points.

Conclusions
In this paper, a projection-based continuous-time algorithm is proposed to solve distributed optimization problems with equality and inequality constraints over multi-agent systems. By the exact penalty method, the distributed optimization problem is reformulated into a new one without inequality constraints and consensus constraints. It is proved that, from any initial point, the states of the continuous-time algorithm enter the equality constraint set of the transformed distributed optimization problem in fixed time. The states of the continuous-time algorithm are further proved to converge to an optimal solution of the original distributed optimization problem. Compared with existing models and approaches, the continuous-time algorithm has the advantage of fewer state variables. Besides, the states of the continuous-time algorithm can find an optimal solution of the distributed optimization problem under mild assumptions. Since the penalty method has the weakness of requiring the calculation of penalty parameters, the application of the continuous-time algorithm may be limited for more complex distributed optimization problems. In our future work, we will consider constructing an algorithm independent of the penalty method.

Fig. 1 Interaction topology of the three-agent system in Example 1

Table 1 Comparison of algorithms for solving distributed optimization problems