Introduction

Distributed optimization problems over multi-agent systems have received much attention; the main principle in solving them is to optimize a global cost function through the cooperation of interrelated agents [31, 41, 45]. The cost function of a distributed optimization problem is generally the summation of local cost functions, each of which is known only to an individual agent. In particular, to protect the privacy of communication, agents in the multi-agent system share particular information only with their neighbors. Distributed optimization problems arise in traffic balancing [41], multi-objective optimization [45], and collaborative control, such as formation maintenance among multiple vehicles [31]. Considering the limitations of communication bandwidth and communication range in many applications, it becomes necessary to study distributed neurodynamic approaches that require only partial information exchange between neighbors. In fact, problems in the engineering field can usually be abstracted into mathematical models and then solved by corresponding neurodynamic approaches, such as [2, 8, 11, 13, 27, 35].

In recent years, various neurodynamic approaches for solving distributed optimization problems have been published (see [11, 13,14,15, 27,28,29, 36, 46]). By means of collaborative control between agents, these neurodynamic approaches not only protect privacy but also alleviate server overload. For unconstrained distributed optimization problems, the authors of [36] constructed a neurodynamic approach under the assumption that the optimal solution set is compact, and the authors of [11] proposed a neurodynamic approach based on a strongly connected directed communication graph to weaken this assumption. Since unconstrained optimization problems are rare in practice, scholars pay more attention to distributed optimization problems with constraints [13, 28, 29, 46]. The authors of [46] constructed a distributed gradient neurodynamic approach for the distributed optimization problem over a multi-agent system in which each agent has its own private constraint function, while the authors of [29] proposed a continuous-time projection neurodynamic approach in which all agents share the same constraint set. It should be noted that the neurodynamic approach in [46] requires the local cost functions to be twice continuously differentiable. To relax this differentiability hypothesis, the authors of [27] proposed a subgradient neurodynamic approach for nonsmooth distributed optimization problems, which was later developed into a subgradient projection neurodynamic approach in [28] to solve distributed optimization problems with consensus constraints. Inspired by the works above, the authors of [13] proposed a subgradient neurodynamic approach for solving the nonsmooth distributed optimization problem. In addition, the authors of [23] presented a primal-dual projection neurodynamic approach for nonsmooth distributed optimization problems under local bounded constraints.
Because instantaneous communication is only an idealization, the authors of [21] took the time delay in information exchange into account and designed a subgradient projection neurodynamic approach. Furthermore, by combining primal-dual methods for searching saddle points with projection operators for enforcing set constraints, a continuous-time neurodynamic approach for nonsmooth constrained convex optimization was designed in [49].

All neurodynamic approaches mentioned above require the agents of the multi-agent system to communicate continuously. Therefore, to avoid the high energy consumption caused by frequent communication, and in keeping with the fact that each node is usually equipped with a limited amount of energy, a large number of neurodynamic approaches with event-triggered mechanisms have emerged (see [7, 16, 25, 33, 51]). Among them, the authors of [51] proposed an event-triggered neurodynamic approach to solve distributed optimization problems with equality constraints and quadratic local objective functions, and the authors of [33] designed a neurodynamic approach for second-order multi-agent systems with event-triggered and time-triggered communication. Furthermore, to cover a more general case, the authors of [25] designed a gradient neurodynamic approach with an event-triggered mechanism to solve distributed convex problems under set constraints.

Discrete-time neurodynamic approaches are often more practical, both in computer implementation and in applications. A variety of discrete-time neurodynamic approaches have therefore been proposed for distributed optimization problems with different constraints. With the help of properties of the projection operator, the authors of [27] proposed a discrete-time neurodynamic approach for constrained nonsmooth optimization problems based on the general framework for parallel and distributed computation proposed in [34]. On the basis of [27], the authors of [21] proposed a subgradient neurodynamic approach for the same nonsmooth optimization problem. However, the neurodynamic approach in [21] requires the constraint set to be compact. To weaken this assumption, the authors of [37] introduced a discrete-time neurodynamic approach under switching topologies for distributed optimization problems with strongly convex local cost functions. Recently, the authors of [24] presented a discrete-time neurodynamic approach with a fixed step size and analyzed its convergence rate. Later, the authors of [38] proposed a subgradient neurodynamic approach with various step sizes. The authors of [50] further extended smooth local cost functions to sums of smooth convex functions and nonsmooth \(L_{1}\)-norm functions.

However, most of the above-mentioned neurodynamic approaches rely on restrictive assumptions: for example, that the local cost functions of the considered distributed optimization problem are smooth, or that the constraint set has a simple structure or is even absent. To overcome these limitations, we propose neurodynamic approaches to solve the nonsmooth distributed optimization problem under inequality and set constraints. The detailed contributions are as follows.

  1.

    Unlike existing neurodynamic approaches in which the Laplacian matrices of the communication graph are used to enforce consensus constraints (see [13, 17, 46]), one of the highlights of this paper is the use of the exact penalty method to handle the consensus constraints, which reduces the dimension of the solution space.

  2.

    The continuous-time neurodynamic approach proposed herein has a stronger convergence property than the approaches in [6, 18], because it ensures that the state solutions converge to an optimal point of the distributed optimization problem rather than merely to the optimal solution set. Furthermore, the proposed continuous-time neurodynamic approach can solve the nonsmooth distributed optimization problem under inequality and set constraints, and it can be extended to solve the distributed optimization problems in [11, 12, 22, 42].

  3.

    Compared with the neurodynamic approaches under continuous communication in [11, 13, 29, 46], among others, the event-triggered neurodynamic approach designed in this paper saves communication energy between nodes and reduces the number of controller updates, thus avoiding unnecessary consumption of network resources and improving bandwidth utilization. Moreover, the event-triggered neurodynamic approach can solve distributed optimization problems under inequality and set constraints, and hence handles more general problems than [25, 33, 51].

  4.

    A discrete-time variable step-size neurodynamic approach is designed for nonsmooth convex distributed optimization problems with inequality and set constraints. As far as we know, the existing discrete-time neurodynamic approaches in [32, 45, 50] apply only to problems with differentiable local cost functions or affine inequality constraints, so the proposed discrete-time neurodynamic approach can solve more general problems. In addition, discrete-time neurodynamic approaches are easier to implement and apply than continuous-time ones.

This article is organized as follows. “Preliminaries” lists some notation and basic knowledge about graph theory, convex analysis, etc. In “Problem description and equivalent form”, the distributed optimization problem with inequality and set constraints is equivalently transformed into a problem with only set constraints. In “Main results”, a continuous-time distributed neurodynamic approach and an event-triggered neurodynamic approach are discussed, and the convergence of the two neurodynamic approaches is proved; moreover, a discrete-time neurodynamic approach is presented and its convergence is analyzed. “Simulations and applications” illustrates the effectiveness and performance of the neurodynamic approaches through numerical examples. The last section concludes the paper and outlines future research directions.

Preliminaries

First of all, here are some terminologies and symbols that will be utilized. \(\mathbb {R}^{n}\) is the set of n-dimensional real vectors, \(\mathbb {R}^{m\times n}\) is the set of \(m\times n\) real matrices, and \(\mathbb {N}\) is the set of natural numbers. \(x^{\mathrm {T}}\) and \(\Vert x\Vert =\sqrt{x^{\mathrm {T}}x}\) denote the transpose and norm of \(x\in \mathbb {R}^{n}\), respectively. The open ball centered at \(x_{0}{\in \mathbb {R}^{n}}\) is denoted by \(\mathrm {\varvec{B}}(x_{0},r)=\{x\in \mathbb {R}^{n}:\Vert x-x_{0}\Vert <r\}\) with \(r>0\). Furthermore, \(\mathrm {col}\{x,y\}=(x^{\mathrm {T}},y^{\mathrm {T}})^{\mathrm {T}}\in \mathbb {R}^{n+m}\) for \(x\in \mathbb {R}^{n}\) and \(y\in \mathbb {R}^{m}\). \(\mathrm {1}_{n}\) is the n-dimensional vector whose components are all ones, and \(\mathrm {0}_{n}\) is the n-dimensional vector whose components are all zeros. The sign \(\otimes \) stands for the Kronecker product. For two sets \(A,B\subseteq \mathbb {R}^{n}\), \(\mathrm {int}(A)\) and \(\mathrm {bd}(A)\) denote the interior and the boundary of the set A, \(A + B=\{x+y:x\in A,y\in B\}\), and \(A+x_{0}=\{x+x_{0}:x\in A\}\).

Next, some basic definitions from graph theory are as follows. Let \(\mathcal {G} =(\mathcal {V},\mathcal {E})\) be the communication topology of a multi-agent system consisting of N agents, where \(\mathcal {V}=\{1,2,\ldots ,N\}\) and \(\mathcal {E}\subseteq \mathcal {V}\times \mathcal {V}\) are the node set and edge set, respectively. If \((i,j)\in \mathcal {E}\), node j is a neighbor of node i. If \(\mathrm {A} =(a_{ij})_{N\times N}\) satisfies \(a_{ij}\ge 0\), with \(a_{ij}>0\) exactly when \((i,j)\in \mathcal {E}\), then \(\mathrm {A}\) is a weighted adjacency matrix of \(\mathcal {G}\). If there are distinct nodes \(i_{1},\ldots ,i_{k}\in \mathcal {V}\) such that \(\mathcal {L}=\{(i, i_{1}),(i_{1}, i_{2}),\ldots ,(i_{k}, j)\}\subseteq \mathcal {E}\), then i and j are connected and \(\mathcal {L}\) is a path between nodes i and j. Furthermore, the communication topology \(\mathcal {G}\) is called undirected if its weighted adjacency matrix is symmetric, and an undirected graph \(\mathcal {G}\) is called connected if there exists a path between any pair of nodes.
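The connectedness condition above can be checked directly from a weighted adjacency matrix. Below is a minimal sketch (our own illustration, not part of the paper's development) that runs a breadth-first search over a hypothetical 3-node graph:

```python
from collections import deque

def is_connected(A):
    """Check connectivity of an undirected graph given its symmetric weighted
    adjacency matrix A (a_ij > 0 exactly when (i, j) is an edge)."""
    n = len(A)
    seen, queue = {0}, deque([0])
    while queue:                      # breadth-first search from node 0
        i = queue.popleft()
        for j in range(n):
            if A[i][j] > 0 and j not in seen:
                seen.add(j)
                queue.append(j)
    return len(seen) == n             # connected iff every node is reachable

# Path graph 1-2-3 is connected; deleting edge (2, 3) disconnects node 3.
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
split = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
print(is_connected(path), is_connected(split))  # True False
```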

Subsequently, several necessary definitions and propositions of convex analysis and projection operators are given.

Definition 1

[10] If f(x) is convex on \(\mathbb {R}^{n}\), then the subdifferential of f(x) at \(x_{0}\in \mathbb {R}^{n}\) is

$$\begin{aligned} \partial f(x_{0})=\{\xi \in \mathbb {R}^{n}:f(x)-f(x_{0})\ge \big \langle \xi ,x-x_{0}\big \rangle ,\ \forall x \in \mathbb {R}^{n}\}. \end{aligned}$$

Proposition 1

[10] If f(x) is convex on \(\mathbb {R}^{n}\), then \(\partial f(x)\) is nonempty, convex, compact, and upper semicontinuous.

Definition 2

[10] The differentiable function f(x) is u-strongly convex on \(\mathbb {R}^{n}\) \((u>0)\) if, for any \(x,y\in \mathbb {R}^{n}\),

$$\begin{aligned} f(y)\ge f(x)+\nabla f(x)^{\mathrm {T}}(y-x)+\frac{u}{2}\Vert y-x\Vert ^{2}. \end{aligned}$$

Let \(C\subseteq \mathbb {R}^n\) be a nonempty closed convex set. The projection operator onto C is the function \(\mathcal {P}_{C}:\mathbb {R}^n\rightarrow C\) defined by

$$\begin{aligned} \mathcal {P}_{C}(x)=\arg \min \limits _{y\in C}\Vert x-y\Vert . \end{aligned}$$

Proposition 2

[27] For the projection operator \(\mathcal {P}_{C}\), it follows that:

  1.

    \(\big \langle \mathcal {P}_{C}(x)-x,y-\mathcal {P}_{C}(x)\big \rangle \ge 0\) for all \(x\in \mathbb {R}^n\) and \(y\in C\);

  2.

    \(\Vert \mathcal {P}_{C}(x)-\mathcal {P}_{C}(y)\Vert \le \Vert x-y\Vert ,\ \forall x,y\in \mathbb {R}^n\).
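Both properties of Proposition 2 are easy to check numerically for a concrete set. The sketch below (our own illustration) uses the box \(C=[-1,1]^{5}\), whose Euclidean projection is a componentwise clip; the small tolerances only absorb floating-point error:

```python
import numpy as np

def project_box(x, lo, hi):
    """Euclidean projection onto the box C = [lo, hi]^n (componentwise clip)."""
    return np.clip(x, lo, hi)

rng = np.random.default_rng(0)
x = 3.0 * rng.normal(size=5)              # an arbitrary point, possibly outside C
z = 3.0 * rng.normal(size=5)
y_in_C = rng.uniform(-1.0, 1.0, size=5)   # a point already inside C = [-1, 1]^5
px = project_box(x, -1.0, 1.0)

# Property 1: <P_C(x) - x, y - P_C(x)> >= 0 for every y in C.
print(np.dot(px - x, y_in_C - px) >= -1e-9)                  # True

# Property 2 (nonexpansiveness): ||P_C(x) - P_C(z)|| <= ||x - z||.
print(np.linalg.norm(px - project_box(z, -1.0, 1.0))
      <= np.linalg.norm(x - z) + 1e-9)                       # True
```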

Definition 3

Suppose that \(C\subseteq \mathbb {R}^{n}\) is a nonempty closed convex set. The normal cone of C at x, denoted by \(\varvec{\mathrm {N}}_{C}(x)\), is defined as

$$\begin{aligned} \varvec{\mathrm {N}}_{C}(x)=\{v\in \mathbb {R}^{n}:\langle v,y-x\rangle \le 0,\ \forall y\in C\}. \end{aligned}$$

Proposition 3

[10] Suppose that \(C,C_{1},C_{2}\subseteq \mathbb {R}^{n}\) are nonempty closed convex sets.

  1.

    If f(x) is locally Lipschitz and attains a minimum over C at \(x\in C\), then \(0\in \partial f(x)+\varvec{\mathrm {N}}_{C}(x);\)

  2.

    If \(0\in \mathrm {int}(C_{1}-C_{2})\), then for any \(x\in C_{1}\cap C_{2}\), \(\varvec{\mathrm {N}}_{C_{1}\cap C_{2}}(x)=\varvec{\mathrm {N}}_{C_{1}}(x) +\varvec{\mathrm {N}}_{C_{2}}(x).\)

Problem description and equivalent form

In this section, we consider the following constrained optimization problem:

$$\begin{aligned} \begin{aligned} \text{ min }~~~&f(x)=\sum \limits _{i = 1}^N f_{i}(x) \\ \text{ s.t. }~~~&g_{i}(x)\le 0,\ \ i=1,2,\ldots ,N \\&x\in \varOmega , \end{aligned} \end{aligned}$$
(1)

where \(x\in \mathbb {R}^{n}\) is the decision vector, \(f_{i}(x):\mathbb {R}^{n}\rightarrow \mathbb {R}\) is convex but not necessarily smooth, and the constraint functions \(g_{i}(x):\mathbb {R}^{n}\rightarrow \mathbb {R}^{m_{i}}\) \((i=1,2,\ldots ,N)\) are convex. \(\varOmega \subseteq \mathbb {R}^{n}\) is a bounded closed convex set. Without loss of generality, assume that the optimal solution set of optimization problem (1) is nonempty.

Remark 1

Optimization problems of this kind have been discussed in [11,12,13, 17, 22, 42, 46], among others, since they arise in various engineering and control areas. Because the optimization problem discussed here is subject to both inequality and set constraints, optimization problem (1) has a wider range of applications than the problems discussed in [12, 22] and [42].

Let \(m=\sum _{i=1}^N m_{i}\), denote \(\bar{g}(x)=\mathrm {col}\{g_{1}(x),g_{2}(x),\ldots ,g_{N}(x)\}:\mathbb {R}^{n}\rightarrow \mathbb {R}^{m}\), and define

$$\begin{aligned} \varXi =\{x\in \mathbb {R}^{n}:\bar{g}(x)\le 0\}, \end{aligned}$$

where \(\bar{g}(x)\le 0\) means each of its components \(\bar{g}_{l}(x)\le 0,\ l\in \{1,2,\ldots ,m\}\). To simplify the constraints in optimization problem (1), some necessary assumptions are given below.

Assumption 1

  1.

    There exists a point \(\hat{x}\in \mathbb {R}^{n}\), such that \(\hat{x}\in \mathrm {int}(\varXi )\cap \varOmega \).

  2.

    The set \(\varXi =\{x\in \mathbb {R}^{n}:\bar{g}(x)\le 0\}\) is bounded, \(\mathrm {i.e.}\), there exists \(M>0\), such that \(\varXi \subseteq \mathrm {\varvec{B}}(\hat{x},M)\).

Assumption 1 consists of common preconditions that have been used in [26, 48]. It is easy to see that the feasible region \(\varXi \cap \varOmega \) of optimization problem (1) is a bounded set, which implies that there exists a positive number L such that

$$\begin{aligned} \Vert \xi \Vert \le L\ \mathrm {and}\ \Vert \eta \Vert \le L, \end{aligned}$$
(2)

for any \(\xi \in \partial f(x)\), \(\eta \in \partial \bar{g}(x)\), and \(x\in \varXi \cap \varOmega \). Besides, define the function

$$\begin{aligned} D(x)=\sum _{l=1}^m\max \{0,\bar{g}_{l}(x)\}=\sum _{i=1}^{N}\sum _{j=1}^{m_{i}}\max \{0,g_{ij}(x)\}, \end{aligned}$$

where \(\bar{g}_{l}(x)\) is the lth component of \(\bar{g}(x)\) and \(g_{ij}(x)\) is the jth component of \(g_{i}(x)\). Due to the convexity of \(g_{i}(x)\), D(x) is convex. In particular, the closed form of \(\partial D(x)\) can be calculated [43] as

$$\begin{aligned} \partial D(x)=\left\{ \begin{array}{l@{\quad }l} \sum \limits _{l\in I^{0}(x)}[0,1]\partial \bar{g}_{l}(x), &{} x\in \mathrm {bd}(\varXi ), \\ \{0\}, &{} x\in \mathrm {int}(\varXi ), \\ \sum \limits _{l\in I^{0}(x)}[0,1]\partial \bar{g}_{l}(x)+\sum \limits _{l\in I^{+}(x)}\partial \bar{g}_{l}(x), &{} x\notin \varXi , \end{array} \right. \end{aligned}$$
(3)

where \(I=\{1,2,\ldots ,m\}\), \(I^{0}(x)=\{l\in I:\bar{g}_{l}(x)=0\}\) and \(I^{+}(x)=\{l\in I:\bar{g}_{l}(x)>0\}\). Let

$$\begin{aligned} \hat{g}=\min \{-\bar{g}_{l}(\hat{x}):l\in I\}. \end{aligned}$$
(4)
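The penalty D(x) and one element of its subdifferential (3) can be computed directly once the scalar constraint functions and their (sub)gradients are available. The sketch below is our own illustration with two hypothetical affine constraints \(g_{1}(x)=x_{1}-1\le 0\) and \(g_{2}(x)=-x_{2}\le 0\); on active constraints any weight in [0, 1] is valid, and we simply pick 1:

```python
import numpy as np

def D(x, g_list):
    """Penalty D(x) = sum_l max{0, g_l(x)} over all scalar constraints g_l."""
    return sum(max(0.0, g(x)) for g in g_list)

def D_subgrad(x, g_list, g_grads, tol=1e-10):
    """One valid element of the subdifferential (3): weight 1 on violated
    (g_l > 0) and active (g_l = 0) constraints, weight 0 otherwise."""
    v = np.zeros_like(x, dtype=float)
    for g, grad in zip(g_list, g_grads):
        if g(x) > -tol:          # l in I^+(x) or I^0(x)
            v += grad(x)
    return v

g_list = [lambda x: x[0] - 1.0, lambda x: -x[1]]
g_grads = [lambda x: np.array([1.0, 0.0]), lambda x: np.array([0.0, -1.0])]

x = np.array([2.0, -0.5])                 # violates both constraints
print(D(x, g_list))                       # 1.5
print(D_subgrad(x, g_list, g_grads))      # [ 1. -1.]
```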

Inspired by the exact penalty method in [20] and [44], we establish the following equivalent transformation of optimization problem (1).

Theorem 1

Under Assumption 1, \(x^{*}\in \mathbb {R}^{n}\) is an optimal solution of optimization problem (1) if and only if \(x^{*}\) is an optimal solution of the following optimization problem:

$$\begin{aligned} \begin{aligned} \text{ min }~~~&f(x)+\sigma D(x) \\ \text{ s.t. }~~~&x\in \varOmega , \end{aligned} \end{aligned}$$
(5)

where the penalty parameter satisfies \(\displaystyle \sigma > LM/\hat{g}\), and \(M,\ L,\ \hat{g}\) come from Assumption 1, (2), and (4), respectively.

Proof

Let \(x^{*}\) be an optimal solution of optimization problem (1). By Proposition 3, there exists \(\xi ^{*}\in \partial f(x^{*})\) such that \(-\xi ^{*}\in \varvec{\mathrm {N}}_{\varXi \cap \varOmega }(x^{*})\). Based on Assumption 1, \(\varvec{\mathrm {N}}_{\varXi \cap \varOmega }(x^{*})=\varvec{\mathrm {N}}_{\varXi }(x^{*})+\varvec{\mathrm {N}}_{\varOmega }(x^{*})\). The proof is divided into the following cases.

Case 1: \(x^{*}\in \mathrm {int}(\varXi )\cap \varOmega \). In this case, \(\varvec{\mathrm {N}}_{\varXi }(x^{*})=\{0\}\). Hence, by the definition of \(\partial D(x)\) in (3), we have

$$\begin{aligned} -\xi ^{*}\in \varvec{\mathrm {N}}_{\varOmega }(x^{*})=\sigma \partial D(x^{*})+\varvec{\mathrm {N}}_{\varOmega }(x^{*}), \end{aligned}$$
(6)

which means that \(0\in \partial f(x^{*})+\sigma \partial D(x^{*})+\varvec{\mathrm {N}}_{\varOmega }(x^{*})\). By convexity, \(x^{*}\) is then an optimal solution of optimization problem (5).

Case 2: \(x^{*}\in \mathrm {bd}(\varXi )\cap \varOmega \). In this case, we have

$$\begin{aligned} \partial D({x^{*}})=\sum \limits _{l\in I^{0}(x^{*})}[0,1]\partial \bar{g}_{l}(x^{*}). \end{aligned}$$

In view of \(\varvec{\mathrm {N}}_{\varXi }(x^{*})=\bigcup _{\theta \ge 0}\theta \partial D(x^{*})\) [10], it follows that \(\varvec{\mathrm {N}}_{\varXi }(x^{*})=\sum _{l\in I^{0}(x^{*})}[0,+\infty )\partial \bar{g}_{l}(x^{*}).\) Then

$$\begin{aligned} -\xi ^{*}\in \sum \limits _{l\in I^{0}(x^{*})}[0,+\infty )\partial \bar{g}_{l}(x^{*})+\varvec{\mathrm {N}}_{\varOmega }(x^{*}). \end{aligned}$$
(7)

Since \(x^{*}\in \mathrm {bd}(\varXi )\cap \varOmega \), \(\Vert \xi ^{*}\Vert \le L\) for any \(\xi ^{*}\in \partial f(x^{*})\). From (7), there exist \(\sigma _{l}\in [0,+\infty )\), \(\eta _{l}^{*}\in \partial \bar{g}_{l}(x^{*})\), and \(\gamma ^{*}\in \varvec{\mathrm {N}}_{\varOmega }(x^{*})\) such that

$$\begin{aligned} -\xi ^{*}=\sum \limits _{l\in I^{0}(x^{*})}\sigma _{l}\eta _{l}^{*}+\gamma ^{*}. \end{aligned}$$

We claim that \(\displaystyle \sigma _{l}\le LM/\hat{g}\) for all \(l\in I^{0}(x^{*})\). If not, there is at least one \(\tilde{l}\in I^{0}(x^{*})\) such that \(\displaystyle \sigma _{\tilde{l}}> LM/\hat{g}\). Then, by the convexity of \(\bar{g}_{l}(x)\) and \(\gamma ^{*}\in \varvec{\mathrm {N}}_{\varOmega }(x^{*})\), we have

$$\begin{aligned} \big \langle -\xi ^{*},x^{*}-\hat{x}\big \rangle= & {} \sum \limits _{l\in I^{0}(x^{*})}\sigma _{l}\big \langle \eta _{l}^{*},x^{*}-\hat{x}\big \rangle +\big \langle \gamma ^{*},x^{*} -\hat{x}\big \rangle \\\ge & {} \sum \limits _{l\in I^{0}(x^{*})}\sigma _{l}(\bar{g}_{l}(x^{*})-\bar{g}_{l}(\hat{x}))\ge \sigma _{\tilde{l}}\hat{g}>LM, \end{aligned}$$

which implies \(\Vert \xi ^{*}\Vert > L\) by Assumption 1 (since \(\Vert x^{*}-\hat{x}\Vert \le M\)). This contradicts \(\Vert \xi ^{*}\Vert \le L\). Therefore, when \(\displaystyle \sigma \ge LM/\hat{g}\),

$$\begin{aligned} \sum \limits _{l\in I^{0}(x^{*})}\sigma _{l}\eta _{l}^{*}+\gamma ^{*}\in \sigma \partial D(x^{*})+\varvec{\mathrm {N}}_{\varOmega }(x^{*}). \end{aligned}$$
(8)

Therefore, \(0\in \partial f(x^{*})+\sigma \partial D(x^{*})+\varvec{\mathrm {N}}_{\varOmega }(x^{*})\) for all \(\displaystyle \sigma \ge LM/\hat{g}\). Since \(f(x)+\sigma D(x)\) is a convex function, \(x^{*}\) is an optimal solution of optimization problem (5).

Conversely, suppose that \(x^{*}\) is an optimal solution of optimization problem (5); we show that \(x^{*}\) is an optimal solution of optimization problem (1). Let \(\tilde{x}\) be an optimal solution of (1) and set \(\displaystyle \sigma _{0}= LM/\hat{g}\). Then, arguing as in (6)–(8), it turns out that \(\tilde{x}\) is also an optimal solution of

$$\begin{aligned} \begin{aligned} \text{ min }~~~&f(x)+\sigma _{0}D(x) \\ \text{ s.t. }~~~&x\in \varOmega . \end{aligned} \end{aligned}$$
(9)

Therefore, since \(x^*\in \varOmega \), we have

$$\begin{aligned} f(\tilde{x})+\sigma _{0}D(\tilde{x})\le f(x^{*})+\sigma _{0}D(x^{*}). \end{aligned}$$
(10)

In addition, since \(x^{*}\) is an optimal solution of optimization problem (5),

$$\begin{aligned} f(x^{*})+\sigma D(x^{*})\le f(\tilde{x})+\sigma D(\tilde{x}) \end{aligned}$$
(11)

for \(\sigma >\sigma _{0}\). Combining (10) and (11), we get

$$\begin{aligned} (\sigma -\sigma _{0})D(x^{*})\le (\sigma -\sigma _{0})D(\tilde{x}). \end{aligned}$$

From the definition of D(x), it holds that \(x\in \varXi \) if and only if \(D(x)=0\). Since \(\tilde{x}\) is feasible, \(D(\tilde{x})=0\), and hence \(D(x^{*})=0\), which means \(x^{*}\in \varXi \). As a result, \(f(x^{*})=f(x^{*})+\sigma D(x^{*})=\min _{x\in \varXi \cap \varOmega }\{f(x)+\sigma D(x)\}\le f(x)\) for all \(x\in \varXi \cap \varOmega \). Thus, \(x^{*}\) is an optimal solution of optimization problem (1). \(\square \)
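Theorem 1 can be illustrated on a one-dimensional toy instance of our own choosing (the f, g, and \(\varOmega\) below are hypothetical, not from the paper): once \(\sigma\) exceeds the threshold, a grid search finds the same minimizer for the constrained and the penalized problems.

```python
import numpy as np

# Toy instance: f(x) = (x - 2)^2, g(x) = x - 1 <= 0, Omega = [-3, 3].
f = lambda x: (x - 2.0) ** 2
D = lambda x: np.maximum(0.0, x - 1.0)    # exact penalty for g(x) <= 0

xs = np.linspace(-3.0, 3.0, 600_001)      # fine grid over Omega
sigma = 5.0                               # any value above the threshold works here

feasible = xs[xs <= 1.0]
x_con = feasible[np.argmin(f(feasible))]          # constrained minimizer
x_pen = xs[np.argmin(f(xs) + sigma * D(xs))]      # penalized minimizer over Omega

print(round(float(x_con), 4), round(float(x_pen), 4))  # 1.0 1.0
```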

Now, consider a multi-agent system composed of N agents with communication topology \(\mathcal {G} =(\mathcal {V},\mathcal {E})\) and \(\mathcal {V}=\{1,2,\ldots ,N\}\). For \(i\in \mathcal {V}\), let \(x_{i}\) be the decision variable of agent i and \(\mathcal {N}_{i} = \{j\in \mathcal {V}:(i, j)\in \mathcal {E}\}\) the set of neighbors of agent i. The following lemma is essential to the equivalent transformation of optimization problem (5).

Assumption 2

The communication graph between agents is undirected and connected.

Lemma 1

[20] The optimization problem (5) can be written as a distributed optimization problem with communication graph \(\mathcal {G}\)

$$\begin{aligned} \begin{aligned} \text{ min }~~~&p(\varvec{x})\triangleq \sum \limits _{i=1}^N f_{i}(x_{i})+\sigma \sum \limits _{i=1}^N D_{i}(x_{i}) \\ \text{ s.t. }~~~&\varvec{x}\in \varvec{\varOmega }\\&x_{i}=x_{j},\ i,j=1,2,\ldots ,N, \end{aligned} \end{aligned}$$
(12)

where \(\varvec{x}=\mathrm {col}\{x_{1},x_{2},\ldots ,x_{N}\}\in \mathbb {R}^{Nn}\), \(\varvec{\varOmega }=\underbrace{\varOmega \times \varOmega \times \ldots \times \varOmega }_{N}\) and \(D_{i}(x_{i})=\sum \nolimits _{j=1}^{m_{i}}\max \{0,g_{ij}(x_{i})\}\).

Remark 2

Problem (12) is a typical distributed optimization problem, which aims to find the optimal solution of the global cost function through cooperation between agents. Solving distributed optimization problems has aroused great interest among scholars due to their privacy-protecting, intelligent, and flexible characteristics; see, e.g., [13, 17, 46]. These papers enforce the consensus constraints by introducing the Laplacian matrices of the communication graphs, whereas we equivalently transform the optimization problem with consensus constraints by the exact penalty method, as shown in the next theorem, which reduces the dimension of the solution space.

Under Assumption 1, it is obvious from (2) that there is \(L>0\), such that for any \(\gamma \in \partial \bar{h}(\varvec{x})\), \(\Vert \gamma \Vert \le (1+\sigma )L.\)

Therefore, based on the following theorem, neurodynamic approaches for solving the distributed optimization problem are proposed in the next section.

Theorem 2

[39] Under Assumptions 1 and 2 as well as \(\sigma >\max \{LM/\hat{g},NL+\sqrt{NL(NL+2)}\}\), \(x^{*}\) is an optimal solution of optimization problem (1) if and only if \(\varvec{x}^{*}=\mathrm {col}\{x^{*},x^{*},\ldots ,x^{*}\}\) is an optimal solution of

$$\begin{aligned} \begin{aligned} \text{ min }~~~&h(\varvec{x})\triangleq \sum \limits _{i=1}^N f_{i}(x_{i})+\sigma \sum \limits _{i=1}^N D_{i}(x_{i})\\&\quad \quad \qquad +\frac{\sigma ^{2}}{2}\sum \limits _{i=1}^N\sum \limits _{j\in \mathcal {N}_{i}}\Vert x_{i}-x_{j}\Vert \\ \text{ s.t. }~~~&\varvec{x}\in \varvec{\varOmega }. \end{aligned} \end{aligned}$$
(13)
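To make the role of the consensus penalty in (13) concrete, the sketch below evaluates \(h\) on a toy instance of our own choosing (two agents with scalar states, smooth quadratic \(f_i\), no inequality constraints so \(D_i=0\), and \(\sigma = 4\)); the penalty term vanishes exactly at consensus:

```python
sigma = 4.0
f = [lambda x: (x - 1.0) ** 2, lambda x: (x - 2.0) ** 2]   # local costs f_1, f_2

def h(x):
    """h(x) from (13) with D_i = 0 and neighbor sets N_1 = {2}, N_2 = {1}."""
    cost = sum(fi(xi) for fi, xi in zip(f, x))
    # The double sum in (13) counts each edge from both endpoints.
    consensus = 0.5 * sigma**2 * (abs(x[0] - x[1]) + abs(x[1] - x[0]))
    return cost + consensus

print(h([1.5, 1.5]))   # consensus states: only the local costs remain -> 0.5
print(h([1.0, 2.0]))   # disagreement is penalized: 0 + 0 + 16.0 -> 16.0
```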

Main results

In this section, a continuous-time neurodynamic approach, an event-triggered neurodynamic approach, and a discrete-time neurodynamic approach are proposed to solve the distributed optimization problem with inequality and set constraints.

Continuous-time neurodynamic approach

In this subsection, we propose the following projection subgradient neurodynamic approach for the nonsmooth convex distributed optimization problem (13):

$$\begin{aligned} \dot{\varvec{x}}(t)\in -\varvec{x}(t)+\mathcal {P}_{\varvec{\varOmega }}\Big (\varvec{x}(t)-\partial h(\varvec{x}(t))\Big ), \end{aligned}$$
(14)

where \(h(\varvec{x})\) is defined in (13). Note that

$$\begin{aligned}&\partial f(\varvec{x})=\mathrm {col}\{\partial f_{1}(x_{1}),\partial f_{2}(x_{2}),\ldots ,\partial f_{N}(x_{N})\},\\&\partial D(\varvec{x})=\mathrm {col}\{\partial D_{1}(x_{1}),\partial D_{2}(x_{2}),\ldots ,\partial D_{N}(x_{N})\},\\&\mathcal {P}_{\varvec{\varOmega }}(\varvec{x})=\mathrm {col}\{\mathcal {P}_{\varOmega }(x_{1}),\mathcal {P}_{\varOmega }(x_{2}),\ldots ,\mathcal {P}_{\varOmega }(x_{N})\}. \end{aligned}$$

Then, the neurodynamic approach for agent \(i\in \mathcal {V}\) can be rewritten as

$$\begin{aligned} \dot{x}_{i}(t)\in&-x_{i}(t)+\mathcal {P}_{\varOmega }\Big (x_{i}(t)-\partial f_{i}(x_{i}(t))-\sigma \partial D_{i}(x_{i}(t))\nonumber \\&\quad -\sigma ^{2}\sum \limits _{j\in \mathcal {N}_{i}}\partial \Vert x_{i}(t)-x_{j}(t)\Vert \Big ). \end{aligned}$$
(15)

Remark 3

In detail, the projection operator \(\mathcal {P}_{\varOmega }\) in (15) ensures that the state solution stays in \(\varOmega \), the term \(-\partial f_{i}(x_{i}(t))-\sigma \partial D_{i}(x_{i}(t))\) drives the state solution toward an optimal solution, and \(-\sigma ^{2}\sum _{j\in \mathcal {N}_{i}}\partial \Vert x_{i}(t)-x_{j}(t)\Vert \) makes the state solutions reach consensus. To explain neurodynamic approach (15) more clearly, its circuit diagram is shown in Fig. 1.

In fact, the subgradients of \(f_i\) and \(g_i\) \((i\in \mathcal {V})\) are represented by piecewise functions.

Actually, the combination of the projection method and the subgradient method is very common in solving distributed optimization problems; see, e.g., [24, 30, 48, 49].

Fig. 1

Circuit diagram of neurodynamic approach (15)

Theorem 3

For any initial value \(\varvec{x}_{0}\in \varvec{\varOmega }\), there exists a local solution \(\varvec{x}(t)\) of neurodynamic approach (14) on \([0,T)\), \(T>0\), and

$$\begin{aligned} \varvec{x}(t)\in \varvec{\varOmega },\ t\in [0,T). \end{aligned}$$

Proof

According to Propositions 1 and 2 and Assumption 1, it can be concluded that \(h(\varvec{x})\) is bounded from below on \(\mathbb {R}^{Nn}\) and Lipschitz on any bounded subset of \(\varvec{\varOmega }\). Thus, by Theorem 5.2 in [5], there exists a local solution \(\varvec{x}(t)\) of neurodynamic approach (14) on [0, T) for any \(\varvec{x}_{0}\in \varvec{\varOmega }\), where [0, T) is the maximal existence interval of \(\varvec{x}(t)\).

By the definition of differential inclusion, there is \(\gamma (t)\in \partial h(\varvec{x}(t))\) for \(\mathrm {a.e.}\ t\in [0,T)\), such that

$$\begin{aligned} \dot{\varvec{x}}(t)=-\varvec{x}(t)+\mathcal {P}_{\varvec{\varOmega }}\Big (\varvec{x}(t)-\gamma (t)\Big ), \end{aligned}$$

where \(h(\varvec{x})\) is defined in distributed optimization problem (13). Let \(\varvec{\beta }(t)=\mathcal {P}_{\varvec{\varOmega }}\big (\varvec{x}(t)-\gamma (t)\big )\), then \(\varvec{\beta }(t)\in \mathcal {P}_{\varvec{\varOmega }}\big (\varvec{x}(t)-\partial h(\varvec{x}(t))\big )\) for \(\mathrm {a.e.}\ t\in [0,T)\) and

$$\begin{aligned} \dot{\varvec{x}}(t)+\varvec{x}(t)=\varvec{\beta }(t), \ \mathrm {a.e.}\ t\in [0,T). \end{aligned}$$
(16)

Inspired by the proof of Lemma 2.4 in [5], integrating (16) on [0, t], \(t\in [0,T)\), yields

$$\begin{aligned} \varvec{x}(t)= & {} \mathrm {e}^{-t}\varvec{x}(0)+\mathrm {e}^{-t}\int _{0}^{t}\varvec{\beta }(s)\mathrm {e}^{s}\mathrm {d}s\nonumber \\= & {} \mathrm {e}^{-t}\varvec{x}_{0}+(1-\mathrm {e}^{-t})\int _{0}^{t}\varvec{\beta }(s)\frac{\mathrm {e}^{s}}{\mathrm {e}^{t}-1}\mathrm {d}s. \end{aligned}$$
(17)

Since \(\displaystyle \int _{0}^{t}\frac{\mathrm {e}^{s}}{\mathrm {e}^{t}-1}\mathrm {d}s=1\) and \(\varvec{x}_{0}\in \varvec{\varOmega }\), the right-hand side of (17) is a convex combination of \(\varvec{x}_{0}\) and points \(\varvec{\beta }(s)\in \varvec{\varOmega }\); hence \(\varvec{x}(t)\in \varvec{\varOmega }\) for \(t\in [0,T)\) by the convexity and closedness of \(\varvec{\varOmega }\). \(\square \)

Theorem 4

Under Assumptions 1 and 2, if \(\displaystyle \sigma >\max \{LM/\hat{g},NL+\sqrt{NL(NL+2)}\}\), the state solution \(\varvec{x}(t)\) of neurodynamic approach (14) with initial value \(\varvec{x}(0)=\varvec{x}_{0}\) exists globally and converges to an optimal solution of distributed optimization problem (13); that is, \(\lim _{t\rightarrow \infty }\varvec{x}(t)=\varvec{x}^{*}=\mathrm {col}\{x_{1}^{*},x_{2}^{*},\ldots ,x_{N}^{*}\}\), where \(x_{i}^{*}=x_{j}^{*}\ (i,j\in \mathcal {V})\) is an optimal solution of optimization problem (1).

Proof

Let \(\varvec{x}^{*}=\mathrm {col}\{x_{1}^{*},x_{2}^{*},\dots ,x_{N}^{*}\}\) be an optimal solution of distributed optimization problem (13), and \(x_{i}(t)\) \((i\in \mathcal {V})\) be the state solution from initial value \(x_{i}(0)\) of neurodynamic approach (14). Then, according to the definition of differential inclusion, there exist measurable functions \(\xi _{i}(t)\in \partial f_{i}(x_{i}(t))\), \(\eta _{i}(t)\in \partial D_{i}(x_{i}(t))\) and \(\zeta _{i}(t)\in \sum _{j\in \mathcal {N}_{i}}\partial \Vert x_{i}(t)-x_{j}(t)\Vert \), such that

$$\begin{aligned} \dot{x}_{i}(t)=-x_{i}(t)+\mathcal {P}_{\varOmega }\big (x_{i}(t)-\xi _{i}(t)-\sigma \eta _{i}(t)-\sigma ^{2}\zeta _{i}(t)\big ), \end{aligned}$$

for \(\mathrm {a.e.}\ t\in [0,T)\).

Construct a function as follows:

$$\begin{aligned} \begin{aligned} H(\varvec{x})=&\frac{1}{2}\sum \limits _{i=1}^N \Vert x_{i}-x_{i}^{*}\Vert ^2+h(\varvec{x})-h(\varvec{x}^*). \end{aligned} \end{aligned}$$
(18)

The derivative of \(H(\varvec{x}(t))\) with respect to time t is

$$\begin{aligned}&\frac{\mathrm {d}H(\varvec{x}(t))}{\mathrm {d}t}=\sum \limits _{i=1}^N \big \langle \xi _{i}(t)+\sigma \eta _{i}(t)\nonumber \\&\qquad +\sigma ^{2}\zeta _{i}(t)+x_{i}(t)-x_{i}^{*},\dot{x}_{i}(t)\big \rangle \nonumber \\&\quad =\sum \limits _{i=1}^N\big \langle \xi _{i}(t)+\sigma \eta _{i}(t)+\sigma ^{2}\zeta _{i}(t)+x_{i}(t)-x_{i}^{*},-x_{i}(t)\nonumber \\&\qquad +\mathcal {P}_{\varOmega }\big (x_{i}(t)-\xi _{i}(t)-\sigma \eta _{i}(t)-\sigma ^{2}\zeta _{i}(t)\big )\big \rangle . \end{aligned}$$
(19)

It can be seen from Proposition 2 that

$$\begin{aligned} \Big \langle&x_{i}(t)-\xi _{i}(t)-\sigma \eta _{i}(t)-\sigma ^{2}\zeta _{i}(t)-\mathcal {P}_{\varOmega }\\&\big (x_{i}(t)-\xi _{i}(t)-\sigma \eta _{i}(t)-\sigma ^{2}\zeta _{i}(t)\big ),\\&\mathcal {P}_{\varOmega }\big (x_{i}(t)-\xi _{i}(t)-\sigma \eta _{i}(t)-\sigma ^{2}\zeta _{i}(t)\big )\\&\quad -x_{i}(t)+x_{i}(t)-x_{i}^{*}\Big \rangle \ge 0, \end{aligned}$$

that is

$$\begin{aligned}&-\Vert x_{i}(t)-\mathcal {P}_{\varOmega }\big (x_{i}(t)-\xi _{i}(t)-\sigma \eta _{i}(t)-\sigma ^{2}\zeta _{i}(t)\big )\Vert ^2\\&\quad -\,\big \langle \xi _{i}(t)+\sigma \eta _{i}(t)+\sigma ^{2}\zeta _{i}(t),\mathcal {P}_{\varOmega }\big (x_{i}(t)\\&\quad -\,\xi _{i}(t)-\sigma \eta _{i}(t)-\sigma ^{2}\zeta _{i}(t)\big )-x_{i}(t)\big \rangle \\&\quad +\,\big \langle x_{i}(t)-\mathcal {P}_{\varOmega }\big (x_{i}(t)-\xi _{i}(t)-\sigma \eta _{i}(t)\\&\quad -\,\sigma ^{2}\zeta _{i}(t)\big ),x_{i}(t)-x_{i}^{*}\big \rangle \\&\quad -\,\big \langle \xi _{i}(t)+\sigma \eta _{i}(t)+\sigma ^{2}\zeta _{i}(t),x_{i}(t)-x_{i}^{*}\big \rangle \ge 0, \end{aligned}$$

which implies that

$$\begin{aligned}&\big \langle \xi _{i}(t)+\sigma \eta _{i}(t)+\sigma ^{2}\zeta _{i}(t)+x_{i}(t)-x_{i}^{*},\mathcal {P}_{\varOmega }\nonumber \\&\quad \big (x_{i}(t)-\xi _{i}(t)-\sigma \eta _{i}(t)-\sigma ^{2}\zeta _{i}(t)\big )-x_{i}(t)\big \rangle \nonumber \\&\quad \quad \le -\Vert x_{i}(t)-\mathcal {P}_{\varOmega }\big (x_{i}(t)-\xi _{i}(t)-\sigma \eta _{i}(t)-\sigma ^{2}\zeta _{i}(t)\big )\Vert ^2\nonumber \\&\qquad \quad -\big \langle \xi _{i}(t)+\sigma \eta _{i}(t)+\sigma ^{2}\zeta _{i}(t),x_{i}(t)-x_{i}^{*}\big \rangle . \end{aligned}$$
(20)

Therefore, we have

$$\begin{aligned} \frac{\mathrm {d}H(\varvec{x}(t))}{\mathrm {d}t}\le & {} -\sum \limits _{i=1}^N\Vert x_{i}(t)-\mathcal {P}_{\varOmega }\big (x_{i}(t)-\xi _{i}(t)\\&-\sigma \eta _{i}(t)-\sigma ^{2}\zeta _{i}(t)\big )\Vert ^2\\&-\sum \limits _{i=1}^N\big \langle \xi _{i}(t)+\sigma \eta _{i}(t)+\sigma ^{2}\zeta _{i}(t),x_{i}(t)-x_{i}^{*}\big \rangle . \end{aligned}$$

According to Definition 1, we have \(\sum _{i=1}^{N}\big \langle \xi _{i}(t)+\sigma \eta _{i}(t)+\sigma ^{2}\zeta _{i}(t),x_{i}(t)-x_{i}^{*}\big \rangle \ge 0\), so

$$\begin{aligned} \frac{\mathrm {d}H(\varvec{x}(t))}{\mathrm {d}t}\le & {} -\sum \limits _{i=1}^N\Vert x_{i}(t)-\mathcal {P}_{\varOmega }\big (x_{i}(t)-\xi _{i}(t) \nonumber \\&\quad -\sigma \eta _{i}(t)-\sigma ^{2}\zeta _{i}(t)\big )\Vert ^2\le 0. \end{aligned}$$
(21)

From \(\varvec{x}(t)\in \varvec{\varOmega }\) and (21), it can be obtained that \(0\le H(\varvec{x}(t))\le H(\varvec{x}(0))\) for \(t\ge 0\); hence, \(\varvec{x}(t)\) is bounded. Then, by Theorem 3 and the extension theorem of solutions in [1], \(\varvec{x}(t)\) exists globally, that is

$$\begin{aligned} \varvec{x}(t)\in \varvec{\varOmega },\ t\in [0,+\infty ). \end{aligned}$$

Define another function

$$\begin{aligned} G(\varvec{x})=\inf \left\{ \sum \limits _{i=1}^N\Vert x_{i}-\mathcal {P}_{\varOmega }\big (x_{i}-\xi _{i}-\sigma \eta _{i}-\sigma ^{2}\zeta _{i}\big )\Vert ^2:\ \xi _{i}\in \partial f_{i}(x_{i}),\ \eta _{i}\in \partial D_{i}(x_{i}),\ \zeta _{i}\in \sum \limits _{j\in \mathcal {N}_{i}}\partial \Vert x_{i}-x_{j}\Vert \right\} . \end{aligned}$$

From (21), we have

$$\begin{aligned} \frac{\mathrm {d}H(\varvec{x}(t))}{\mathrm {d}t}\le -G(\varvec{x}(t))\le 0. \end{aligned}$$
(22)

Since \(H(\varvec{x}(t))\ge 0\) is nonincreasing by (22) and bounded below, there exists \(H_{0}\ge 0\) such that

$$\begin{aligned} \lim \limits _{t\rightarrow +\infty }H(\varvec{x}(t))=H_{0}. \end{aligned}$$

Integrating (22) on [0, t] yields

$$\begin{aligned} H(\varvec{x}(t))-H(\varvec{x}(0))\le -\int _{0}^{t}G(\varvec{x}(s))\mathrm {d}s. \end{aligned}$$

Letting \(t\rightarrow +\infty \), we obtain

$$\begin{aligned} \int _{0}^{+\infty }G(\varvec{x}(s))\mathrm {d}s\le H(\varvec{x}(0))-H_{0}<+\infty . \end{aligned}$$

Therefore, since \(G(\varvec{x}(\cdot ))\) is integrable on \([0,+\infty )\), there exists a sequence \(t_{k}\rightarrow +\infty \), such that

$$\begin{aligned} \lim \limits _{k\rightarrow +\infty }G(\varvec{x}(t_{k}))=0. \end{aligned}$$
(23)

Due to 2) in Assumption 1 and the boundedness of \(x_{i}(t)\), \(\{x_{i}(t_{k})\}\) has a convergent subsequence \(\{x_{i}(t_{l})\}\) satisfying

$$\begin{aligned} \lim \limits _{l\rightarrow +\infty }x_{i}(t_{l})=\hat{x}_{i} \end{aligned}$$

and \(\hat{x}_{i}\in \varOmega \), since \(x_{i}(t_{l})\in \varOmega \) and \(\varOmega \) is closed. Next, we prove that \(\hat{\varvec{x}}=\mathrm {col}\{\hat{x}_{1},\hat{x}_{2},\ldots ,\hat{x}_{N}\}\) is an optimal solution of problem (13). From (23), there exist \(\xi _{i_{l}}\in \partial f_{i}(x_{i}(t_{l}))\), \(\eta _{i_{l}}\in \partial D_{i}(x_{i}(t_{l}))\) and \(\zeta _{i_{l}}\in \sum _{j\in \mathcal {N}_{i}}\partial \Vert x_{i}(t_{l})-x_{j}(t_{l})\Vert \), such that

$$\begin{aligned} \lim \limits _{l\rightarrow +\infty }\sum \limits _{i=1}^N\Vert x_{i}(t_{l})-\mathcal {P}_{\varOmega }\big (x_{i}(t_{l})-\xi _{i_{l}}-\sigma \eta _{i_{l}}-\sigma ^{2}\zeta _{i_{l}}\big )\Vert ^2=0.\nonumber \\ \end{aligned}$$
(24)

Then, by the projection property of \(\mathcal {P}_{\varOmega }(\cdot )\) in Proposition 2, it follows that

$$\begin{aligned}&\big \langle x_{i}(t_{l})-\xi _{i_{l}}-\sigma \eta _{i_{l}}-\sigma ^{2}\zeta _{i_{l}}-\mathcal {P}_{\varOmega }\\&\quad \big (x_{i}(t_{l})-\xi _{i_{l}}-\sigma \eta _{i_{l}}-\sigma ^{2}\zeta _{i_{l}}\big ),\mathcal {P}_{\varOmega }\big (x_{i}(t_{l})-\xi _{i_{l}}\\&\quad -\sigma \eta _{i_{l}}-\sigma ^{2}\zeta _{i_{l}}\big )-y_{i}\big \rangle \ge 0 \end{aligned}$$

for all \(y_{i}\in \varOmega \), and

$$\begin{aligned}&\big \langle \xi _{i_{l}}+\sigma \eta _{i_{l}}+\sigma ^{2}\zeta _{i_{l}},\mathcal {P}_{\varOmega }\big (x_{i}(t_{l})-\xi _{i_{l}}-\sigma \eta _{i_{l}}-\sigma ^{2}\zeta _{i_{l}}\big )-y_{i}\big \rangle \\&\quad \le \big \langle x_{i}(t_{l})-\mathcal {P}_{\varOmega }\big (x_{i}(t_{l})-\xi _{i_{l}}-\sigma \eta _{i_{l}}-\sigma ^{2}\zeta _{i_{l}}\big ),\\&\quad \mathcal {P}_{\varOmega }\big (x_{i}(t_{l})-\xi _{i_{l}}-\sigma \eta _{i_{l}}-\sigma ^{2}\zeta _{i_{l}}\big )-y_{i}\big \rangle \\&\quad =\big \langle x_{i}(t_{l})-\mathcal {P}_{\varOmega }\big (x_{i}(t_{l})-\xi _{i_{l}}-\sigma \eta _{i_{l}}-\sigma ^{2}\zeta _{i_{l}}\big ),\\&\quad \mathcal {P}_{\varOmega }\big (x_{i}(t_{l})-\xi _{i_{l}}-\sigma \eta _{i_{l}}-\sigma ^{2}\zeta _{i_{l}}\big )-x_{i}(t_{l})+x_{i}(t_{l})-y_{i}\big \rangle \\&\quad \le \big \langle x_{i}(t_{l})-\mathcal {P}_{\varOmega }\big (x_{i}(t_{l})-\xi _{i_{l}}-\sigma \eta _{i_{l}}-\sigma ^{2}\zeta _{i_{l}}\big ),x_{i}(t_{l})-y_{i}\big \rangle \\&\quad \le \Vert x_{i}(t_{l})-\mathcal {P}_{\varOmega }\big (x_{i}(t_{l})-\xi _{i_{l}}-\sigma \eta _{i_{l}}-\sigma ^{2}\zeta _{i_{l}}\big )\Vert \Vert x_{i}(t_{l})-y_{i}\Vert . \end{aligned}$$

Letting \(l\rightarrow +\infty \) and combining with (24), we obtain

$$\begin{aligned}&\limsup \limits _{l\rightarrow +\infty }\Big \langle \xi _{i_{l}}+\sigma \eta _{i_{l}}+\sigma ^{2}\zeta _{i_{l}},\mathcal {P}_{\varOmega }\big (x_{i}(t_{l})-\xi _{i_{l}}\\&\quad -\sigma \eta _{i_{l}}-\sigma ^{2}\zeta _{i_{l}}\big )-y_{i}\Big \rangle \le 0 \end{aligned}$$

for any \(y_{i}\in \varOmega \). By the upper semicontinuity of the subdifferential mappings, there exist \(\hat{\xi }_{i}\in \partial f_{i}(\hat{x}_{i})\), \(\hat{\eta }_{i}\in \partial D_{i}(\hat{x}_{i})\) and \(\hat{\zeta }_{i}\in \sum _{j\in \mathcal {N}_{i}}\partial \Vert \hat{x}_{i}-\hat{x}_{j}\Vert \), such that

$$\begin{aligned} \big \langle \hat{\xi }_{i}+\sigma \hat{\eta }_{i}+\sigma ^{2}\hat{\zeta }_{i},\hat{x}_{i}-y_{i}\big \rangle \le 0,\ \ \forall y_{i}\in \varOmega . \end{aligned}$$

Thus, \(-\hat{\xi }_{i}-\sigma \hat{\eta }_{i}-\sigma ^{2}\hat{\zeta }_{i}\in \varvec{\mathrm {N}}_{\varOmega }(\hat{x}_{i})\) by Definition 3. Therefore, \(\hat{\varvec{x}}=\mathrm {col}\{\hat{x}_{1},\hat{x}_{2},\ldots ,\hat{x}_{N}\}\) is an optimal solution to distributed optimization problem (13).

In the last step, we prove \(x_{i}(t)\rightarrow \hat{x}_{i}\) as \(t\rightarrow +\infty \). Let

$$\begin{aligned} \hat{H}(t,\varvec{x}(t))=\frac{1}{2}\sum \limits _{i=1}^N\Vert x_{i}(t)-\hat{x}_{i}\Vert ^2+{h(\varvec{x}(t))}-h(\hat{\varvec{x}}). \end{aligned}$$

Proceeding as before, we derive that

$$\begin{aligned} \frac{\mathrm {d}\hat{H}(t,\varvec{x}(t))}{\mathrm {d}t}=-\sum \limits _{i=1}^N\Vert \dot{x}_{i}(t)\Vert ^2\le 0 \end{aligned}$$

for \(\mathrm {a.e.}\) \(t\in [0,+\infty )\). Together with the fact that \(\hat{H}(t,\varvec{x}(t))\ge 0\), the limit \(\lim \nolimits _{t\rightarrow +\infty }\hat{H}(t,\varvec{x}(t))\) exists. Therefore

$$\begin{aligned} \lim \limits _{t\rightarrow +\infty }\hat{H}(t,\varvec{x}(t))= \lim \limits _{l\rightarrow +\infty }\hat{H}(t_{l},\varvec{x}(t_{l}))=0. \end{aligned}$$
(25)

Since \(\displaystyle \frac{1}{2}\sum \limits _{i=1}^N \Vert x_{i}(t)-\hat{x}_{i}\Vert ^2\le \hat{H}(t,\varvec{x}(t))\), it follows that \(\displaystyle \lim \limits _{t\rightarrow +\infty }\frac{1}{2}\sum \limits _{i=1}^N \Vert x_{i}(t)-\hat{x}_{i}\Vert ^2=0\). That is

$$\begin{aligned} \lim \limits _{t\rightarrow +\infty }x_{i}(t)=\hat{x}_{i}. \end{aligned}$$

By means of Theorem 2 and \(\displaystyle \sigma >\max \{LM/\hat{g},NL+\sqrt{NL(NL+2)}\}\), it follows that \(\displaystyle \sigma > LM/\hat{g}\) and \(\sigma ^{2}>2(1+\sigma )NL\); thus, \(x_{i}(t)\) converges to the optimal solution of optimization problem (1). \(\square \)
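As a side illustration, the continuous-time dynamics (14) can be discretized by the forward Euler method and simulated. The sketch below is a toy instance under illustrative assumptions, not the setup of the paper: scalar agents on a ring graph, nonsmooth local costs \(f_{i}(x)=|x-a_{i}|\), no local constraint terms \(D_{i}\), a large box \(\varOmega \), and a value of \(\sigma \) that is not tuned to the bound of the theorem.

```python
import numpy as np

# Forward-Euler sketch of neurodynamic approach (14) on a toy instance (all
# problem data here are illustrative assumptions, not from the paper):
# N scalar agents on a ring graph, f_i(x) = |x - a_i| (nonsmooth), no local
# inequality constraints (terms D_i dropped), Omega = [-50, 50].
np.random.seed(0)
N, sigma, dt, steps = 4, 3.0, 1e-2, 4000
a = np.array([1.0, 2.0, 3.0, 4.0])
neighbors = {i: [(i - 1) % N, (i + 1) % N] for i in range(N)}

proj = lambda v: np.clip(v, -50.0, 50.0)   # projection onto the box Omega
sgn = np.sign                              # a subgradient selection for |.|

x = np.random.uniform(-5.0, 5.0, N)        # x(0) in Omega
for _ in range(steps):
    x_new = x.copy()
    for i in range(N):
        xi = sgn(x[i] - a[i])                                # xi_i in df_i
        zeta = sum(sgn(x[i] - x[j]) for j in neighbors[i])   # zeta_i
        x_new[i] = x[i] + dt * (-x[i] + proj(x[i] - xi - sigma**2 * zeta))
    x = x_new

# The states (approximately) reach consensus near a minimizer of
# sum_i |x - a_i|, i.e. near a median of the a_i, which lies in [2, 3] here.
print(x)
```

Because the right-hand side is discontinuous, the Euler iterates chatter in a small band around the consensus value; shrinking `dt` shrinks that band.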

Remark 4

The continuous-time neurodynamic approach proposed in this subsection can be used to solve nonsmooth distributed optimization problems with inequality and set constraints, and its state solution converges to an optimal solution. Thus, it has a stronger convergence property than the approaches in [6] and [18], whose state solutions only converge to the optimal solution set. In addition, the proposed continuous-time neurodynamic approach can be extended to solve the distributed optimization problems in [9, 11, 13, 17, 46]. See Table 1 for details.

Table 1 Comparison with some references

Event-triggered neurodynamic approach

It must be pointed out that, in the process of solving large-scale optimization problems, the information transmission between agents may consume considerable energy. Adding an event-triggered mechanism can therefore greatly reduce energy consumption and save communication costs, as studied in [16, 25, 51]. Hence, in this subsection, a distributed event-triggered neurodynamic approach is presented for solving distributed optimization problems, and the corresponding event-triggered condition is designed by the Lyapunov approach. Before introducing the event-triggered mechanism, the following theorem is needed.

Theorem 5

Assume that p(x) is a strongly convex function and \(x^{*}\) is the optimal solution of the following optimization problem:

$$\begin{aligned} \begin{aligned} \mathrm {\text{ min }}~~~&p(x) \\ \mathrm {\text{ s.t. }}~~~&\varphi (x)=0,\ x\in \varOmega , \end{aligned} \end{aligned}$$
(26)

where \(\varOmega \subseteq \mathbb {R}^{n}\) is a compact set. Let \(x^{[\nu ]}\) be the optimal solution of

$$\begin{aligned} \begin{aligned} \text{ min }~~~&G^{[\nu ]}(x)=p(x)+\frac{\nu }{2}\Vert \varphi (x)\Vert ^2\\ \text{ s.t. }~~~&x\in \varOmega ; \end{aligned} \end{aligned}$$
(27)

then, \(\lim \limits _{\nu \rightarrow +\infty }x^{[\nu ]}=x^{*}\).

Proof

According to the penalty method mentioned in [4], if \(x^{*}\) is the optimal solution of the constrained optimization problem (26) and \(y^{[\nu ]}\) is the optimal solution of the following unconstrained optimization problem:

$$\begin{aligned} \text{ min }~~~ F^{[\nu ]}(y)=p(y)+\frac{\nu }{2}\Vert \varphi (y)\Vert ^2+\frac{\alpha }{2}\Vert y-x^{*}\Vert ^2 \end{aligned}$$
(28)

for any \(\nu \in \mathbb {N}\) and \(\alpha >0\), then \(\lim _{\nu \rightarrow +\infty }y^{[\nu ]}=x^{*}\).

Since \(x^{[\nu ]}\) is the optimal solution of optimization problem (27), from the definitions of \(G^{[\nu ]}(x)\), \(F^{[\nu ]}(y)\) and \(x^{[\nu ]}\), we have

$$\begin{aligned} G^{[\nu ]}(x^{[\nu ]})\le G^{[\nu ]}(y^{[\nu ]})\le F^{[\nu ]}(y^{[\nu ]})\le F^{[\nu ]}(x^{*})=p(x^{*}). \end{aligned}$$

Namely, \(p(x^{[\nu ]})+\frac{\nu }{2}\Vert \varphi (x^{[\nu ]})\Vert ^2\le p(x^{*})\).

Since \(\varOmega \) is a compact set, \(p(x^{[\nu ]})\) is bounded on \(\varOmega \). Thus, \(\lim _{\nu \rightarrow {+\infty }}\Vert \varphi (x^{[\nu ]})\Vert ^2=0\), which implies \(\varphi (\bar{x})=0\) for any limit point \(\bar{x}\) of \(\{x^{[\nu ]}\}\). Letting \(\nu \rightarrow {+\infty }\), we obtain \(p(\bar{x})\le p(x^{*})\); since \(\bar{x}\) is feasible for (26), \(p(\bar{x})=p(x^{*})\). From the strong convexity of p(x), it is easy to get that \(\lim _{\nu \rightarrow {+\infty }}x^{[\nu ]}=\bar{x}=x^{*}\). \(\square \)
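The penalty argument of Theorem 5 can be observed numerically. The sketch below uses an illustrative instance that is not from the paper: \(p(x)=x^{2}\) (strongly convex), \(\varphi (x)=x-1\) and \(\varOmega =[-2,2]\), for which the penalized minimizer has the closed form \(\nu /(\nu +2)\rightarrow 1=x^{*}\).

```python
import numpy as np

# Numerical illustration of Theorem 5 (quadratic penalty method) on a toy
# instance: p(x) = x^2, phi(x) = x - 1, Omega = [-2, 2].  The penalized
# objective G_nu(x) = x^2 + (nu/2)(x - 1)^2 is minimized over Omega by grid
# search; its exact minimizer nu/(nu + 2) lies in Omega and tends to the
# unique feasible (hence optimal) point x* = 1.
def x_nu(nu):
    grid = np.linspace(-2.0, 2.0, 400001)        # fine grid over Omega
    G = grid**2 + 0.5 * nu * (grid - 1.0)**2     # penalized objective
    return grid[np.argmin(G)]

x_star = 1.0
for nu in [1, 10, 100, 1000]:
    print(nu, x_nu(nu), abs(x_nu(nu) - x_star))  # error is 2/(nu + 2) = O(1/nu)
```

The printed errors decay like \(O(1/\nu )\), matching the qualitative statement \(x^{[\nu ]}\rightarrow x^{*}\).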

According to Theorem 5, we construct a new distributed optimization problem

$$\begin{aligned} \begin{aligned} \text{ min }~~~ H^{[\nu ]}(\varvec{x})&=\sum \limits _{i=1}^N f_{i}(x_{i})+\nu \sum \limits _{i=1}^{N}G_{i}^2(x_{i})\\&\quad +\nu \sum \limits _{i=1}^{N}\sum \limits _{j\in \mathcal {N}_{i}}\Vert x_{i}-x_{j}\Vert ^2\\&\text{ s.t. }~~~ \varvec{x}\in \varvec{\varOmega }, \end{aligned} \end{aligned}$$
(29)

where \(f_{i}(\cdot ),\ G_{i}(\cdot )\ (i\in \mathcal {V})\) and \(\varvec{\varOmega }\) are from distributed optimization problem (12). Here, we further assume that \(f_{i}(\cdot )\) is differentiable and strongly convex, so \(H^{[\nu ]}(\cdot )\) is a differentiable and strongly convex function for any \(\nu \in \mathbb {N}\). Thus, distributed optimization problem (29) has a unique optimal solution, denoted by \(\varvec{x}^{[\nu ]}\). Now, a neurodynamic approach with an event-triggered mechanism is proposed as follows:

$$\begin{aligned} \dot{x}_{i}(t)= & {} -x_{i}(t)+\mathcal {P}_{\varOmega }\big (x_{i}(t)\nonumber \\&-\beta _i\nabla H^{[\nu ]}_{i}(x_{i}(t_{k}^{(i)}))\big ),\ \ t\in [t_{k}^{(i)},t_{k+1}^{(i)}),\ i\in \mathcal {V},\nonumber \\ \end{aligned}$$
(30)

where \(t_{k}^{(i)}\in [0,{+\infty })\ (k\in \mathbb {N})\) is the kth triggering instant, \(\beta _i\) is a positive parameter, and \(H^{[\nu ]}_{i}(x_{i})=f_{i}(x_{i})+\nu G_{i}^2(x_{i})+\nu \sum _{j\in \mathcal {N}_{i}}\Vert x_{i}-x_{j}\Vert ^2\).

Remark 5

In fact, an event-triggered algorithm consists of two parts: a neurodynamic algorithm and an event-triggering mechanism. The event-triggered algorithm in this paper is inspired by the published literature on event-triggering mechanisms in [7, 16, 25, 33, 51]. It is worth noting that the core of event-triggered algorithms is that data updating and communication happen only at triggering instants.

Zeno behavior refers to the phenomenon in which the system triggers infinitely many events within a finite time. It is well known that Zeno behavior is undesirable for control implementation, since physical devices cannot sample infinitely fast. Therefore, it is essential to determine whether Zeno behavior occurs in an event-triggered neurodynamic approach.

Definition 4

[19] Under an event-triggered neurodynamic approach, agent \(i\ (i\in \mathcal {V})\) is said to be Zeno if

$$\begin{aligned} \lim \limits _{k\rightarrow {+\infty }}t^{(i)}_{k}=\sum \limits _{k=0}^{{+\infty }}(t^{(i)}_{k+1}-t^{(i)}_{k})=T_0, \end{aligned}$$

where \(T_0\) is a finite constant, and Zeno-free otherwise. Moreover, the event-triggered neurodynamic approach is said to be Zeno if there is an agent to be Zeno, and Zeno-free otherwise.

Remark 6

A crucial reason for the popularity of event-triggered neurodynamic approaches is that they can reduce unnecessary interaction and, at the same time, save limited network resources. More specifically, information interaction occurs only when the triggering condition is met. Inspired by [47], an event-triggering condition can be designed based on Lyapunov functions to ensure convergence of the neurodynamic approach on the one hand and prevent Zeno behavior on the other.

Assumption 3

The gradient \(\nabla H^{[\nu ]}_{i}(\cdot )\) is \(P^{[\nu ]}_{i}\)-Lipschitz with \(P^{[\nu ]}_{i}>0\) on \(\varOmega \), that is

$$\begin{aligned} \Vert \nabla H^{[\nu ]}_{i}(x)-\nabla H^{[\nu ]}_{i}(y)\Vert \le P^{[\nu ]}_{i}\Vert x-y\Vert ,\ \forall x,y \in \varOmega . \end{aligned}$$

Therefore, if the initial value \(x_{i}(0)\in \varOmega \), then \(x_{i}(t)\in \varOmega \) for all \(t\ge 0\). Let \(\varvec{x}^{*}=\mathrm {col}\{x_{1}^{*},x_{2}^{*},\ldots ,x_{N}^{*}\}\) and \(\varvec{x}^{[\nu ]}=\mathrm {col}\{x_{1}^{[\nu ]},x_{2}^{[\nu ]},\ldots ,x_{N}^{[\nu ]}\}\) be the optimal solutions of distributed optimization problems (13) and (29), respectively, and let \(e_{i}(t)=x_{i}(t)-x_{i}(t_{k}^{(i)})\) be the measurement error of agent \(i\in \mathcal {V}\). Then, we specify the event-triggered condition for storing or updating the sampled data:

$$\begin{aligned} t_{k+1}^{(i)}= & {} \inf \limits _{t>0}\left\{ t>t_{k}^{(i)}\Big |{q_{i}(t)} \right. \nonumber \\= & {} \left. {\beta _{i}P^{[\nu ]}_{i}}\Vert e_{i}(t)\Vert ^2-\varvec{\zeta }_{i}^{\mathrm {T}}\varPi _{i}\varvec{\zeta }_{i}\ge 0\right\} , \end{aligned}$$
(31)

where \(\varvec{\zeta }_{i}=\mathrm {col}\big \{\mathcal {P}_{\varOmega }\big (x_{i}(t)-\beta _i\nabla H^{[\nu ]}_{i}(x_{i}(t_{k}^{(i)}))\big ),x_{i}(t)\big \}\), \(\varPi _{i}=\left( \begin{array}{cc} 1-\frac{\beta _{i}P_{i}^{[\nu ]}}{2} &{} -\frac{3}{2}\\ -\frac{3}{2} &{} 2-\frac{\beta _{i}{P_{i}^{[\nu ]}}}{2}\\ \end{array} \right) \otimes I_{n}.\) Define \({\tilde{y}_{i}(t)}=x_{i}(t)-\beta _i\nabla H^{[\nu ]}_{i}(x_{i}(t_{k}^{(i)}))\) to facilitate the following description.
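For concreteness, the triggering rule (31) for a single agent can be evaluated as follows. This is a minimal sketch in which \(\varOmega \) is taken as a box and all numerical values are illustrative assumptions; `beta` and `P` stand for \(\beta _{i}\) and \(P_{i}^{[\nu ]}\), `x_t` for \(x_{i}(t)\), `x_k` for the last sampled state \(x_{i}(t_{k}^{(i)})\), and `grad_k` for \(\nabla H^{[\nu ]}_{i}(x_{i}(t_{k}^{(i)}))\).

```python
import numpy as np

# Sketch of evaluating event-triggered condition (31) for one agent; the box
# Omega = [-1, 1]^n and all numbers below are illustrative assumptions.
def proj_box(v, lo=-1.0, hi=1.0):
    return np.clip(v, lo, hi)                   # projection onto Omega

def should_trigger(x_t, x_k, grad_k, beta, P):
    e = x_t - x_k                               # measurement error e_i(t)
    z1 = proj_box(x_t - beta * grad_k)          # first block of zeta_i
    z2 = x_t                                    # second block of zeta_i
    s = beta * P / 2.0
    # zeta_i^T Pi_i zeta_i with Pi_i = [[1 - s, -3/2], [-3/2, 2 - s]] (x) I_n
    quad = (1 - s) * (z1 @ z1) - 3.0 * (z1 @ z2) + (2 - s) * (z2 @ z2)
    q = beta * P * (e @ e) - quad               # q_i(t) in (31)
    return bool(q >= 0.0)                       # sample again when q_i(t) >= 0

x_k = np.array([0.2, -0.1])
grad_k = np.array([0.5, 0.3])
print(should_trigger(np.array([0.3, 0.0]), x_k, grad_k, beta=1.0, P=7.0))
```

Between events the agent integrates (30) with the frozen gradient \(\nabla H^{[\nu ]}_{i}(x_{i}(t_{k}^{(i)}))\) and re-samples as soon as `should_trigger` returns `True`.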

Theorem 6

If Assumptions 1–3 hold and \(\varvec{x}_{0}\in \varvec{\varOmega }\), then the state solution \(\varvec{x}(t)\) of event-triggered neurodynamic approach (30) converges to \(\varvec{x}^{[\nu ]}\) when the parameters \(\beta _{i}\) and \(P_{i}^{[\nu ]}\) satisfy \(\beta _{i}P_{i}^{[\nu ]}>3+\sqrt{10}\) for each agent \(i\in \mathcal {V}\). Furthermore, the event-triggered neurodynamic approach (30) is Zeno-free.

Proof

Consider a Lyapunov function

$$\begin{aligned} V^{[\nu ]}(\varvec{x})=\frac{1}{2}{\Vert \varvec{x}\Vert ^2}+ H^{[\nu ]}(\varvec{x})-H^{[\nu ]}(\varvec{x}^{[\nu ]}), \end{aligned}$$

where \(H^{[\nu ]}(\cdot )\) is defined in (29). Since \(\varvec{x}^{[\nu ]}\) is the optimal solution of distributed optimization problem (29), \(V^{[\nu ]}(\varvec{x})\ge 0\) for \(\varvec{x}\in \varvec{\varOmega }\). Therefore, for \(t\in [t_{k}^{(i)} , t_{k+1}^{(i)})\), taking the time derivative of \(V^{[\nu ]}(\varvec{x}(t))\), we have

$$\begin{aligned} \dot{V}^{[\nu ]}(\varvec{x}(t))=\sum \limits _{i=1}^N \big \langle x_{i}(t),\dot{x}_{i}(t)\big \rangle +\sum \limits _{i=1}^N\big \langle \nabla H^{[\nu ]}_{i}(x_{i}(t)),\dot{x}_{i}(t)\big \rangle .\nonumber \\ \end{aligned}$$
(32)

Since

$$\begin{aligned}&\langle x_{i}(t),\dot{x}_{i}(t)\rangle +\langle \nabla H^{[\nu ]}_{i}(x_{i}(t)),\dot{x}_{i}(t)\rangle \\&\quad =\langle x_{i}(t),-x_{i}(t)+\mathcal {P}_{\varOmega }({\tilde{y}_{i}(t)})\rangle +\langle \nabla H^{[\nu ]}_{i}(x_{i}(t)) ,\\&\qquad -x_{i}(t)+\mathcal {P}_{\varOmega }({\tilde{y}_{i}(t)})\rangle \\&\quad =\langle -x_{i}(t)+\beta _i\nabla H^{[\nu ]}_{i}(x_{i}(t_{k}^{(i)})),-x_{i}(t)+\mathcal {P}_{\varOmega }({\tilde{y}_{i}(t)})\rangle \\&\qquad +2\langle x_{i}(t),-x_{i}(t)+\mathcal {P}_{\varOmega }({\tilde{y}_{i}(t)})\rangle \\&\qquad +\beta _i\langle \nabla H^{[\nu ]}_{i}(x_{i}(t))-\nabla H^{[\nu ]}_{i}(x_{i}(t_{k}^{(i)})),\\&\qquad -x_{i}(t)+\mathcal {P}_{\varOmega }({\tilde{y}_{i}(t)})\rangle \\&\quad =\langle \mathcal {P}_{\varOmega }({\tilde{y}_{i}(t)})-{\tilde{y}_{i}(t)},\\&\qquad -x_{i}(t)+\mathcal {P}_{\varOmega }({\tilde{y}_{i}(t)})\rangle -\langle \mathcal {P}_{\varOmega }({\tilde{y}_{i}(t)}),\\&\qquad -x_{i}(t)+\mathcal {P}_{\varOmega }({\tilde{y}_{i}(t)})\rangle \\&\qquad +\beta _i\langle \nabla H^{[\nu ]}_{i}(x_{i}(t))-\nabla H^{[\nu ]}_{i}(x_{i}(t_{k}^{(i)})),-x_{i}(t)\\&\qquad +\mathcal {P}_{\varOmega }({\tilde{y}_{i}(t)})\rangle +2\langle x_{i}(t),-x_{i}(t)+\mathcal {P}_{\varOmega }({\tilde{y}_{i}(t)})\rangle . \end{aligned}$$

According to 1) in Proposition 2, it follows that

$$\begin{aligned} \langle \mathcal {P}_{\varOmega }({\tilde{y}_{i}(t)})-{\tilde{y}_{i}(t)},-x_{i}(t)+\mathcal {P}_{\varOmega }({\tilde{y}_{i}(t)})\rangle \le 0. \end{aligned}$$

Furthermore, based on Assumption 3 and the Cauchy–Schwarz inequality, we have

$$\begin{aligned}&\langle x_{i}(t),\dot{x}_{i}(t)\rangle +\langle \nabla H^{[\nu ]}_{i}(x_{i}(t)),\dot{x}_{i}(t)\rangle \\&\quad \le -\Vert \mathcal {P}_{\varOmega }({\tilde{y}_{i}(t)})\Vert ^2-2\Vert x_{i}(t)\Vert ^2+3\langle x_{i}(t),\mathcal {P}_{\varOmega }({\tilde{y}_{i}(t)})\rangle \\&\qquad +\beta _{i}P_{i}^{[\nu ]}\Vert e_{i}(t)\Vert \big (\Vert x_{i}(t)\Vert +\Vert \mathcal {P}_{\varOmega }({\tilde{y}_{i}(t)})\Vert \big )\\&\quad \le \left( -1+\frac{\beta _{i}P_{i}^{[\nu ]}}{2}\right) \Vert \mathcal {P}_{\varOmega }({\tilde{y}_{i}(t)})\Vert ^2+ \left( -2+\frac{\beta _{i}P_{i}^{[\nu ]}}{2}\right) \\&\qquad \times \Vert x_{i}(t)\Vert ^2+3\langle x_{i}(t),\mathcal {P}_{\varOmega }({\tilde{y}_{i}(t)})\rangle +\beta _{i}P_{i}^{[\nu ]}\Vert e_{i}(t)\Vert ^2\\&\quad =-\varvec{\zeta }_{i}^{\mathrm {T}}\varPi _{i}\varvec{\zeta }_{i}+\beta _{i}P_{i}^{[\nu ]}\Vert e_{i}(t)\Vert ^2. \end{aligned}$$

Therefore, \(\dot{V}^{[\nu ]}(\varvec{x}(t))<0\) when \(t\in [t_{k}^{(i)},t_{k+1}^{(i)})\) and \(\beta _{i}P_{i}^{[\nu ]}>3+\sqrt{10}\). Then, the Lyapunov stability theorem and the strong convexity of \(H^{[\nu ]}(\cdot )\) imply that \(\varvec{x}(t)\) converges to \(\varvec{x}^{[\nu ]}\), the unique optimal solution of (29).

Next, since \(e_{i}(t_{k}^{(i)})=0\) and \(\dot{e}_{i}(t)=\dot{x}_{i}(t)=-x_{i}(t)+\mathcal {P}_{\varOmega }\big (x_{i}(t)-{\beta _{i}}\nabla H^{[\nu ]}_{i}(x_{i}(t_{k}^{(i)}))\big )\), for \(t\in [t_{k}^{(i)},t_{k+1}^{(i)})\) we have

$$\begin{aligned} \Vert e_{i}(t)\Vert \le \int _{t_{k}^{(i)}}^{t}\Vert \dot{e_{i}}(s)\Vert \mathrm {d}s\le \bar{M}_{i}(t)(t-t_{k}^{(i)}), \end{aligned}$$

where \(\bar{M}_{i}(t)=\max _{t_{k}^{(i)}\le s\le t}\Vert -x_{i}(s)+\mathcal {P}_{\varOmega }\big (x_{i}(s)-{\beta _{i}}\nabla H^{[\nu ]}_{i}(x_{i}(t_{k}^{(i)}))\big )\Vert \). From event-triggering condition (31), we can get that the next event will not be triggered before \({q_{i}(t)}=0\). Thus

$$\begin{aligned} \Vert e_{i}(t_{k+1}^{(i)-})\Vert =\sqrt{\varvec{\zeta }_{i}^{\mathrm {T}}\varPi _{i}\varvec{\zeta }_{i}}\le \bar{M}_{i}(t_{k+1}^{(i)-})(t_{k+1}^{(i)-}-t_{k}^{(i)}), \end{aligned}$$

where \(t_{k+1}^{(i)-}\) denotes the left limit of \(t_{k+1}^{(i)}\). Then, \(t_{k+1}^{(i)-}-t_{k}^{(i)}\ge \frac{\sqrt{\varvec{\zeta }_{i}^{\mathrm {T}}\varPi _{i}\varvec{\zeta }_{i}}}{\bar{M}_{i}(t^{(i)-}_{k+1})}>0\), so every inter-event interval is strictly positive and \(\lim \nolimits _{k\rightarrow {+\infty }}t_{k}^{(i)}={+\infty }\), which implies that event-triggered neurodynamic approach (30) is Zeno-free. \(\square \)

From Theorems 5 and 6, the following corollary can be drawn.

Corollary 1

If Assumptions 1–3 hold and \(\varvec{x}_{0}\in \varvec{\varOmega }\), then the state solution \(\varvec{x}(t)\) of event-triggered neurodynamic approach (30) converges to the optimal solution \(\varvec{x}^{*}\) of (2) when \({\nu }\) is large enough and the parameters \(\beta _{i}\) and \(P_{i}^{[\nu ]}\) satisfy \(\beta _{i}P_{i}^{[\nu ]}>3+\sqrt{10}\) for each agent \(i\in \mathcal {V}\). Furthermore, the event-triggered neurodynamic approach (30) is Zeno-free.

Remark 7

The event-triggering mechanism is outstanding in alleviating bandwidth pressure and saving energy, because agents do not need to communicate all the time as required in [11, 13, 29, 46], but exchange their local information only intermittently. In addition, compared with the works in [25, 33, 51], the event-triggered neurodynamic approach (30) in this paper can solve distributed optimization problems under inequality and set constraints.

Discrete-time neurodynamic approach

Since solving the distributed optimization problem relies on information exchange between agents in the multi-agent system, and the communication between agents is essentially a discrete process in actual operation, studies on discrete-time neurodynamic approaches have also received widespread attention. In this subsection, the discrete-time neurodynamic approach in [3] is extended to solve nonsmooth convex distributed optimization problem (13) under the assumptions mentioned in Sect. 4.1.

Approach I: Discrete-time neurodynamic approach.

Step 1

Take an initial point \(\varvec{x}_{0}\in \varvec{\varOmega }\), \(k=0\);

Step 2

\(\varvec{x}_{k}\in \varvec{\varOmega }\), choose \(\ell _{k}>0\), \(\varsigma _{k}\in \partial h(\varvec{x}_{k})\), \(\varsigma _{k}\ne \varvec{0}\);

Step 3

Compute

$$\begin{aligned} \tau _{k}&=\mathop {\arg \min }\limits _{w+\varvec{x}_{k}\in \varvec{\varOmega }}\left\{ \frac{1}{2}\Vert w\Vert ^2+\frac{\ell _{k}}{\Vert \varsigma _{k}\Vert }\varsigma _{k}^{\mathrm {T}}w\right\} =\mathcal {P}_{\varvec{\varOmega }}\left( \varvec{x}_{k}-\frac{\ell _{k}}{\Vert \varsigma _{k}\Vert }\varsigma _{k}\right) -\varvec{x}_{k}; \end{aligned}$$

Step 4

If \(\tau _{k}=0\), terminate; otherwise,

$$\begin{aligned} \varvec{x}_{k+1}=\varvec{x}_{k}+\tau _{k}=\mathcal {P}_{\varvec{\varOmega }}\left( \varvec{x}_{k}-\frac{\ell _{k}}{\Vert \varsigma _{k}\Vert }\varsigma _{k}\right) . \end{aligned}$$

Return to Step 2 with k replaced by \(k+1\).

where

$$\begin{aligned} \sum \limits _{k=1}^{{+\infty }} \ell _{k} = {+\infty }\quad \text {and}\quad \sum \limits _{k=1}^{{+\infty }} \ell _{k}^{2} < {+\infty }, \end{aligned}$$
(33)

and \(h(\varvec{x})\) is defined in distributed optimization problem (13). Let \(\{\varvec{x}_{k}\}\) and \(\{\tau _{k}\}\) be the sequences generated by Approach \(\mathrm {I}\); then \(\{\varvec{x}_{k}\}\subseteq \varvec{\varOmega }\). If \(\partial h(\varvec{x}_{k})=\{\varvec{0}\}\), then \(\varvec{x}_{k}\in \mathop {\arg \min }_{\varvec{x}\in \mathbb {R}^{Nn}}h(\varvec{x})\). Since \(\varvec{x}_{k}\in \varvec{\varOmega }\), \(\varvec{x}_{k}\) is an optimal solution of distributed optimization problem (13). Let \(\mathcal {O}\) be the optimal solution set of distributed optimization problem (13); it is obvious that if \(\tau _{k}=0\), then \(\varvec{x}_{k}\in \mathcal {O}\).
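Approach \(\mathrm {I}\) can be sketched in a few lines. The toy run below uses an illustrative objective \(h(\varvec{x})=\Vert \varvec{x}-a\Vert _{1}\) over a box (stand-ins for the \(h\) and \(\varvec{\varOmega }\) of problem (13), not taken from the paper), with \(\ell _{k}=1/(k+1)\), which satisfies (33).

```python
import numpy as np

# Sketch of Approach I (projected subgradient method with normalized,
# diminishing step sizes) on a toy objective h(x) = ||x - a||_1 over the box
# Omega = [-1, 1]^n; h, Omega and a are illustrative stand-ins only.
proj = lambda v: np.clip(v, -1.0, 1.0)          # P_Omega for the box
subgrad = lambda v, a: np.sign(v - a)           # a subgradient of ||. - a||_1

a = np.array([2.0, 0.5, -3.0])
x = proj(np.zeros(3))                           # Step 1: x_0 in Omega
for k in range(2000):                           # Steps 2-4
    g = subgrad(x, a)                           # varsigma_k in dh(x_k)
    if np.linalg.norm(g) == 0:                  # 0 in dh(x_k): x_k already optimal
        break
    lk = 1.0 / (k + 1)                          # step sizes satisfying rule (33)
    x = proj(x - lk * g / np.linalg.norm(g))    # x_{k+1} = P_Omega(x_k - l_k g/||g||)

# The iterates approach the box-constrained minimizer P_Omega(a) = [1, 0.5, -1].
print(x)
```

The normalized step \(\ell _{k}/\Vert \varsigma _{k}\Vert \) caps each move at length \(\ell _{k}\) (Proposition 4), so the late iterates oscillate only within a band of width about \(\ell _{k}\) around the minimizer.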

Remark 8

Approach \(\mathrm {I}\) can be viewed as the discretization of continuous-time neurodynamic approach (14) in the time dimension. Discrete-time neurodynamic approaches turn out to be more suitable for generators equipped with embedded digital microprocessors, which is why they are easy to apply in practice. Besides, the step-size rule (33) has been applied to other neurodynamic approaches in [28] and [38].

Proposition 4

Under Assumptions 1 and 2, \(\Vert \tau _{k}\Vert \le \ell _{k}\) and \(\Vert \varvec{x}_{k+1}-\varvec{x}_{k}\Vert \le \ell _{k}\).

Proof

It is clear that the conclusion holds if \(\tau _{k}=0\). Now, we suppose \(\tau _{k}\ne 0\). In this case, consider the convex function

$$\begin{aligned} \varPhi _{\varvec{x}_{k}}(w)=\frac{1}{2}\Vert w\Vert ^2+\frac{\ell _{k}}{\Vert \varsigma _{k}\Vert }\varsigma _{k}^{\mathrm {T}}w,\ \varsigma _{k}\in \partial h(\varvec{x}_{k}). \end{aligned}$$

Let

$$\begin{aligned} u_{k}=\tau _{k}+\frac{\ell _{k}}{\Vert \varsigma _{k}\Vert }\varsigma _{k}, \end{aligned}$$
(34)

then \(u_{k}\in \partial \varPhi _{\varvec{x}_{k}}(\tau _{k})\). Since \(\tau _{k}\) is the minimum point of \(\varPhi _{\varvec{x}_{k}}(w)\) on \(\varvec{\varOmega }-\varvec{x}_{k}\), we have

$$\begin{aligned} \big \langle u_{k},z-\tau _{k}\big \rangle \ge 0, \end{aligned}$$
(35)

for all \(z\in \varvec{\varOmega }-\varvec{x}_{k}\). Then, \(\varvec{x}_{k}\in \varvec{\varOmega }\) implies that \(0\in \varvec{\varOmega }-\varvec{x}_{k}\). Take \(z=0\), so

$$\begin{aligned} \big \langle u_{k},\tau _{k}\big \rangle \le 0. \end{aligned}$$

By substituting (34) into the inequality \(\big \langle u_{k},\tau _{k}\big \rangle \le 0\), we can derive

$$\begin{aligned} \Vert \tau _{k}\Vert ^{2}+\frac{\ell _{k}}{\Vert \varsigma _{k}\Vert }\varsigma _{k}^{\mathrm {T}}\tau _{k}\le 0, \end{aligned}$$
(36)

that is \(\displaystyle \Vert \tau _{k}\Vert ^{2}\le -\frac{\ell _{k}}{\Vert \varsigma _{k}\Vert }\varsigma _{k}^{\mathrm {T}}\tau _{k}.\) Therefore

$$\begin{aligned} \Vert \tau _{k}\Vert ^2\le \frac{\ell _{k}}{\Vert \varsigma _{k}\Vert }|\varsigma _{k}^{\mathrm {T}}\tau _{k}|\le \frac{\ell _{k}}{\Vert \varsigma _{k}\Vert }\Vert \varsigma _{k}\Vert \Vert \tau _{k}\Vert =\ell _{k}\Vert \tau _{k}\Vert . \end{aligned}$$
(37)

The result can be obtained directly, since \(\Vert \tau _{k}\Vert \ne 0\). \(\square \)
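The bound \(\Vert \tau _{k}\Vert \le \ell _{k}\) of Proposition 4 can also be seen as a consequence of the nonexpansiveness of the projection, since \(\tau _{k}=\mathcal {P}_{\varvec{\varOmega }}(\varvec{x}_{k}-\frac{\ell _{k}}{\Vert \varsigma _{k}\Vert }\varsigma _{k})-\mathcal {P}_{\varvec{\varOmega }}(\varvec{x}_{k})\). A quick randomized check, with \(\varvec{\varOmega }\) taken illustratively as a box:

```python
import numpy as np

# Randomized check of Proposition 4 (illustrative Omega = [-1, 1]^n): for the
# normalized step tau = P_Omega(x - (l/||s||) s) - x with x in Omega, one
# always has ||tau|| <= l, because the projection onto a closed convex set is
# nonexpansive and x = P_Omega(x).
rng = np.random.default_rng(1)
proj = lambda v: np.clip(v, -1.0, 1.0)     # projection onto Omega = [-1, 1]^n

for _ in range(1000):
    x = proj(rng.normal(size=5))           # x_k in Omega
    s = rng.normal(size=5)                 # a nonzero subgradient varsigma_k
    l = rng.uniform(0.01, 2.0)             # step size l_k
    tau = proj(x - l * s / np.linalg.norm(s)) - x
    assert np.linalg.norm(tau) <= l + 1e-12

print("Proposition 4 holds on 1000 random instances")
```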

Proposition 5

If Assumptions 1 and 2 hold, then for any \(\varvec{x}\in \varvec{\varOmega }\)

$$\begin{aligned} 3\ell _{k}^2+\Vert \varvec{x}-\varvec{x}_{k}\Vert ^2-\Vert \varvec{x}-\varvec{x}_{k+1}\Vert ^2\ge 2\frac{\ell _{k}}{\Vert \varsigma _{k}\Vert }(h(\varvec{x}_{k})-h(\varvec{x})). \end{aligned}$$

Proof

From Proposition 4, we have

$$\begin{aligned} \ell _{k}^2+&\Vert \varvec{x}-\varvec{x}_{k}\Vert ^2-\Vert \varvec{x}-\varvec{x}_{k+1}\Vert ^{2}\nonumber \\&\quad \ge \Vert \varvec{x}_{k+1}-\varvec{x}_{k}\Vert ^2+\Vert \varvec{x}-\varvec{x}_{k}\Vert ^2-\Vert \varvec{x}-\varvec{x}_{k+1}\Vert ^2\nonumber \\&\quad =2\big \langle \varvec{x}-\varvec{x}_{k},\varvec{x}_{k+1}-\varvec{x}_{k}\big \rangle \nonumber \\&\quad =2\big \langle \tau _{k}, \varvec{x}-\varvec{x}_{k}\big \rangle .\nonumber \\ \end{aligned}$$
(38)

By (34) and (35), we have \(\displaystyle u_{k}=\tau _{k}+\frac{\ell _{k}}{\Vert \varsigma _{k}\Vert }\varsigma _{k}\) and \(\big \langle u_{k},\varvec{x}-\varvec{x}_{k}-\tau _{k}\big \rangle \ge 0\) for all \(\varvec{x}\in \varvec{\varOmega }\). Then

$$\begin{aligned}&\big \langle \tau _{k}, \varvec{x}-\varvec{x}_{k}\big \rangle \nonumber \\&\quad =\big \langle u_{k},\varvec{x}-\varvec{x}_{k}\big \rangle +\frac{\ell _{k}}{\Vert \varsigma _{k}\Vert }\big \langle \varsigma _{k},\varvec{x}_{k}-\varvec{x}\big \rangle \nonumber \\&\quad \ge \big \langle u_{k},\tau _{k}\big \rangle +\frac{\ell _{k}}{\Vert \varsigma _{k}\Vert }\big \langle \varsigma _{k},\varvec{x}_{k}-\varvec{x}\big \rangle \nonumber \\&\quad =\big \langle \tau _{k}+\frac{\ell _{k}}{\Vert \varsigma _{k}\Vert }\varsigma _{k},\tau _{k}\big \rangle +\frac{\ell _{k}}{\Vert \varsigma _{k}\Vert }\big \langle \varsigma _{k},\varvec{x}_{k}-\varvec{x}\big \rangle \nonumber \\&\quad =\Vert \tau _{k}\Vert ^2+\frac{\ell _{k}}{\Vert \varsigma _{k}\Vert }\big \langle \varsigma _{k},\tau _{k}\big \rangle +\frac{\ell _{k}}{\Vert \varsigma _{k}\Vert }\big \langle \varsigma _{k},\varvec{x}_{k}-\varvec{x}\big \rangle \nonumber \\&\quad \ge \frac{\ell _{k}}{\Vert \varsigma _{k}\Vert }\big \langle \varsigma _{k},\tau _{k}\big \rangle +\frac{\ell _{k}}{\Vert \varsigma _{k}\Vert }\big \langle \varsigma _{k},\varvec{x}_{k}-\varvec{x}\big \rangle . \end{aligned}$$
(39)

Combining the above inequalities, we obtain

$$\begin{aligned}&\ell _{k}^2+\Vert \varvec{x}-\varvec{x}_{k}\Vert ^2-\Vert \varvec{x}-\varvec{x}_{k+1}\Vert ^2\nonumber \\&\quad \ge 2\frac{\ell _{k}}{\Vert \varsigma _{k}\Vert }\big \langle \varsigma _{k},\tau _{k}\big \rangle +2\frac{\ell _{k}}{\Vert \varsigma _{k}\Vert }\big \langle \varsigma _{k},\varvec{x}_{k}-\varvec{x}\big \rangle \nonumber \\&\quad \ge -2\frac{\ell _{k}}{\Vert \varsigma _{k}\Vert }\Vert \varsigma _{k}\Vert \Vert \tau _{k}\Vert +2\frac{\ell _{k}}{\Vert \varsigma _{k}\Vert }\big \langle \varsigma _{k},\varvec{x}_{k}-\varvec{x}\big \rangle \nonumber \\&\quad =-2\ell _{k}\Vert \tau _{k}\Vert +2\frac{\ell _{k}}{\Vert \varsigma _{k}\Vert }\big \langle \varsigma _{k},\varvec{x}_{k}-\varvec{x}\big \rangle \nonumber \\&\quad \ge -2\ell _{k}^2+2\frac{\ell _{k}}{\Vert \varsigma _{k}\Vert }\big \langle \varsigma _{k},\varvec{x}_{k}-\varvec{x}\big \rangle . \end{aligned}$$
(40)

By the convexity of \(h(\varvec{x})\) and \(\varsigma _{k}\in \partial h(\varvec{x}_{k})\), we have \(\big \langle \varsigma _{k},\varvec{x}_{k}-\varvec{x}\big \rangle \ge h(\varvec{x}_{k})-h(\varvec{x})\). Hence, the conclusion holds. \(\square \)
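Proposition 5 can be spot-checked numerically on the same kind of toy data (illustrative \(h(\varvec{x})=\Vert \varvec{x}-a\Vert _{1}\), box \(\varvec{\varOmega }\) and point \(a\), not from the paper):

```python
import numpy as np

# Randomized spot-check of the inequality of Proposition 5 for the toy
# objective h(x) = ||x - a||_1 over Omega = [-1, 1]^n (illustrative data).
rng = np.random.default_rng(7)
proj = lambda v: np.clip(v, -1.0, 1.0)
a = np.array([2.0, 0.5, -3.0])
h = lambda v: np.abs(v - a).sum()

for _ in range(500):
    xk = proj(rng.normal(size=3))               # x_k in Omega
    x = proj(rng.normal(size=3))                # comparison point x in Omega
    s = np.sign(xk - a)                         # varsigma_k in dh(x_k), nonzero here
    l = rng.uniform(0.01, 1.0)                  # step size l_k
    xk1 = proj(xk - l * s / np.linalg.norm(s))  # x_{k+1}
    lhs = 3 * l**2 + np.dot(x - xk, x - xk) - np.dot(x - xk1, x - xk1)
    rhs = 2 * l / np.linalg.norm(s) * (h(xk) - h(x))
    assert lhs >= rhs - 1e-10                   # Proposition 5

print("Proposition 5 holds on 500 random instances")
```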

Lemma 2

If Assumptions 1 and 2 hold, then \(\{\varvec{x}_{k}\}\) is bounded.

Proof

According to Proposition 5, for \(\varvec{x}^{*}\in \mathcal {O}\), we have \(\displaystyle 3\ell _{k}^2+\Vert \varvec{x}^{*}-\varvec{x}_{k}\Vert ^2-\Vert \varvec{x}^{*}-\varvec{x}_{k+1}\Vert ^2\ge 2\frac{\ell _{k}}{\Vert \varsigma _{k}\Vert }(h(\varvec{x}_{k})-h(\varvec{x}^{*}))\ge 0\). Therefore, it can be obtained that

$$\begin{aligned} \Vert \varvec{x}^{*}-\varvec{x}_{k+1}\Vert ^{2}\le & {} \Vert \varvec{x}^{*}-\varvec{x}_{k}\Vert ^{2}+3\ell _{k}^{2}\nonumber \\\le & {} \Vert \varvec{x}^{*}-\varvec{x}_{k-1}\Vert ^{2}+3\ell _{k-1}^{2}+3\ell _{k}^{2}\nonumber \\\le & {} \cdots \le \Vert \varvec{x}^{*}-\varvec{x}_{0}\Vert ^{2}+3\sum \limits _{l=0}^{k}\ell _{l}^{2}. \end{aligned}$$
(41)

Since \(\sum _{k=0}^{{+\infty }}\ell _{k}^{2}<{+\infty }\), it follows that \(\{\varvec{x}_{k}\}\) is bounded. \(\square \)

Theorem 7

Under Assumptions 1 and 2, there exists a cluster point of \(\{\varvec{x}_{k}\}\), noted as \(\varvec{x}^{*}\), and \(\varvec{x}^{*}\in \mathcal {O}\).

Proof

From Proposition 5, it is obvious that for any \(\varvec{x}\in \varvec{\varOmega }\)

$$\begin{aligned}&2\frac{\ell _{k}}{\Vert \varsigma _{k}\Vert }(h(\varvec{x}_{k})-h(\varvec{x})) \nonumber \\&\quad \le \Vert \varvec{x}_{k}-\varvec{x}\Vert ^2-\Vert \varvec{x}_{k+1}-\varvec{x}\Vert ^2+3\ell _{k}^2,\ \ k\in \mathbb {N}. \end{aligned}$$
(42)

Since \(\{\varvec{x}_{k}\}\) is bounded, \(\{\varsigma _{k}\}\) is also bounded. Without loss of generality, suppose that \(\Vert \varsigma _{k}\Vert \le K,\ k\in \mathbb {N}\), and let \(\gamma _{k}=\gamma _{k}(\varvec{x})=h(\varvec{x}_{k})-h(\varvec{x})\) for any \(\varvec{x}\in \varvec{\varOmega }\). Then

$$\begin{aligned} 2\frac{\ell _{k}}{K}\gamma _{k}\le \Vert \varvec{x}_{k}-\varvec{x}\Vert ^2-\Vert \varvec{x}_{k+1}-\varvec{x}\Vert ^2+3\ell _{k}^2. \end{aligned}$$
(43)

It is easy to calculate that

$$\begin{aligned} \frac{2}{K}\sum \limits _{k=0}^m \ell _{k}\gamma _{k}\le & {} \sum \limits _{k=0}^m\big (\Vert \varvec{x}_{k}-\varvec{x}\Vert ^2-\Vert \varvec{x}_{k+1}-\varvec{x}\Vert ^2+3\ell _{k}^2\big )\nonumber \\= & {} \Vert \varvec{x}_{0}-\varvec{x}\Vert ^2-\Vert \varvec{x}_{m+1}-\varvec{x}\Vert ^2+3\sum \limits _{k=0}^m\ell _{k}^2\nonumber \\\le & {} \Vert \varvec{x}_{0}-\varvec{x}\Vert ^2+3\sum \limits _{k=0}^m\ell _{k}^2. \end{aligned}$$
(44)

Letting \(m\rightarrow +\infty \), we have

$$\begin{aligned} \sum \limits _{k=0}^{{+\infty }} \ell _{k}\gamma _{k}<{+\infty }. \end{aligned}$$
(45)

Therefore, there exists a subsequence \(\{\gamma _{i_{k}}\}\) such that \(\lim _{k\rightarrow {+\infty }}\gamma _{i_{k}}\le 0\). Otherwise, there would exist \(\rho >0\) and \(\tilde{k}\ge 0\) such that \(\gamma _{k}\ge \rho \) for all \(k\ge \tilde{k}\), so that

$$\begin{aligned} {+\infty }>\sum \limits _{k=0}^{{+\infty }}\ell _{k}\gamma _{k}\ge \rho \sum \limits _{k=\tilde{k}}^{{+\infty }}\ell _{k}, \end{aligned}$$

which contradicts \(\sum _{k=\tilde{k}}^{{+\infty }}\ell _{k}={+\infty }\). Therefore, the bounded sequence \(\{\varvec{x}_{i_{k}}\}\) has a subsequence converging to a point, denoted by \(\varvec{x}^{*}\). Since \(\varvec{\varOmega }\) is a closed set, \(\varvec{x}^{*}\in \varvec{\varOmega }\).

Next we prove \(\varvec{x}^{*}\in \mathcal {O}\). Suppose, on the contrary, that \(\varvec{x}^{*}\notin \mathcal {O}\); then there exists \(\hat{\varvec{x}}\in \varvec{\varOmega }\) such that \(h(\hat{\varvec{x}})<h(\varvec{x}^{*})\). Taking \(\varvec{x}=\hat{\varvec{x}}\) in the result above, \(\lim _{k\rightarrow {+\infty }}\gamma _{i_{k}}\le 0\) yields

$$\begin{aligned} \lim \limits _{k\rightarrow {+\infty }}h(\varvec{x}_{i_{k}})-h(\hat{\varvec{x}})=h(\varvec{x}^{*})-h(\hat{\varvec{x}})\le 0, \end{aligned}$$
(46)

that is, \(h(\hat{\varvec{x}})\ge h(\varvec{x}^{*})\), which contradicts \(h(\hat{\varvec{x}})<h(\varvec{x}^{*})\). \(\square \)

Theorem 8

Under Assumptions 1 and 2, all cluster points of \(\{\varvec{x}_{k}\}\) are optimal solutions to distributed optimization problem (13).

Proof

According to the proof of Theorem 7, it suffices to prove that for any \(\varvec{x}\in \varvec{\varOmega }\), all cluster points of \(\gamma _{k}=h(\varvec{x}_{k})-h(\varvec{x})\) are nonpositive. Since \(\varvec{\varOmega }\) is a bounded set and \(\varsigma _{k}\in \partial h(\varvec{x}_{k})\), there is a \(K>0\) such that \(\Vert \varsigma _{k}\Vert \le K\). Then, for \(k\in \mathbb {N}\),

$$\begin{aligned} \gamma _{k}-\gamma _{k+1}= & {} h(\varvec{x}_{k})-h(\varvec{x})\nonumber \\&\quad -\big (h(\varvec{x}_{k+1})-h(\varvec{x})\big )= h(\varvec{x}_{k})-h(\varvec{x}_{k+1})\nonumber \\\le & {} \varsigma _{k}^{\mathrm {T}}(\varvec{x}_{k}-\varvec{x}_{k+1})\nonumber \\\le & {} \Vert \varsigma _{k}\Vert \Vert \varvec{x}_{k+1}-\varvec{x}_{k}\Vert \le K\ell _{k}. \end{aligned}$$
(47)

By Theorem 7, there is a subsequence \(\{\gamma _{i_{k}}\}\) such that \(\lim _{k\rightarrow {+\infty }}\gamma _{i_{k}}\le 0\). Suppose, to the contrary, that there exist \(\delta >0\) and another subsequence \(\{\gamma _{l_{k}}\}\) such that \(\gamma _{l_{k}}\ge \delta \). Then we consider the subsequence \(\gamma _{j_{k}}\), where \(j_{k}\ (k\in \mathbb {N})\) is defined as follows:

$$\begin{aligned} \begin{aligned}&j_{0}=\min \{m\ge 0:\gamma _{m}\ge \delta \},\\&j_{2k+1}=\min \left\{ m\ge j_{2k}:\gamma _{m}\le \frac{\delta }{2}\right\} ,\\&\ j_{2k+2}=\min \{m\ge j_{2k+1}:\gamma _{m}\ge \delta \}. \end{aligned} \end{aligned}$$

Obviously, \(\gamma _{j_{k}}\) is well defined. Next, observe that

$$\begin{aligned} \left\{ \begin{aligned} \gamma _{m}\ge \delta ,&\ \ \ j_{2k}\le m\le j_{2k+1}-1\\ \gamma _{m}\le \frac{\delta }{2},&\ \ \ j_{2k+1}\le m\le j_{2k+2}-1 \end{aligned}\right. \end{aligned}$$

and

$$\begin{aligned} \gamma _{j_{2k}}-\gamma _{j_{2k+1}}\ge \frac{\delta }{2},\ \ \forall k\in \mathbb {Z}_{\ge 0}. \end{aligned}$$

By (45) and (47), it can be derived that

$$\begin{aligned} {+\infty }> & {} \sum \limits _{k=0}^{{+\infty }}\ell _{k}\gamma _{k}=\sum \limits _{\{k:\gamma _{k}>0\}}\ell _{k}\gamma _{k}\nonumber \\&\quad +\sum \limits _{\{k:\gamma _{k}\le 0\}}\ell _{k}\gamma _{k}\ge \sum \limits _{k=0}^{{+\infty }}\sum \limits _{m=j_{2k}}^{j_{2k+1}-1}\ell _{m}\gamma _{m}+\sum \limits _{\{k:\gamma _{k}\le 0\}}\ell _{k}\gamma _{k}\nonumber \\\ge & {} \delta \sum \limits _{k=0}^{{+\infty }}\sum \limits _{m=j_{2k}}^{j_{2k+1}-1}\ell _{m}+\sum \limits _{\{k:\gamma _{k}\le 0\}}\ell _{k}\gamma _{k}\nonumber \\\ge & {} \frac{\delta }{K}\sum \limits _{k=0}^{{+\infty }}\sum \limits _{m=j_{2k}}^{j_{2k+1}-1}(\gamma _{m}-\gamma _{m+1})+\sum \limits _{\{k:\gamma _{k}\le 0\}}\ell _{k}\gamma _{k}\nonumber \\= & {} \frac{\delta }{K}\sum \limits _{k=0}^{{+\infty }}(\gamma _{j_{2k}}-\gamma _{j_{2k+1}})+\sum \limits _{\{k:\gamma _{k}\le 0\}}\ell _{k}\gamma _{k}. \end{aligned}$$
(48)

Because \(\lim \nolimits _{k\rightarrow {+\infty }}\gamma _{i_{k}}\le 0\), the set \(\{k:\gamma _{k}\le 0\}\) is infinite. Since \(\sum _{k=0}^{{+\infty }}\ell _{k}\gamma _{k}<{+\infty }\), there is \(\bar{k}>0\) such that

$$\begin{aligned} \ell _{k}\gamma _{k}\ge -\frac{\delta ^2}{4K}, \end{aligned}$$
(49)

for \(k\ge \bar{k}\). Furthermore, define

$$\begin{aligned} \bar{S}=\sum \limits _{\{k<\bar{k}:\gamma _{k}\le 0\}}\left( \gamma _{j_{2k}}-\gamma _{j_{2k+1}}+\frac{K}{\delta }\ell _{k}\gamma _{k}\right) . \end{aligned}$$

Obviously, \(\bar{S}\) is finite. Based on (49), we obtain

$$\begin{aligned}&\frac{\delta }{K}\sum \limits _{k=0}^{{+\infty }}(\gamma _{j_{2k}}-\gamma _{j_{2k+1}})+\sum \limits _{\{k:\gamma _{k}\le 0\}}\ell _{k}\gamma _{k}\ge \frac{\delta }{K}\nonumber \\&\qquad \sum \limits _{\{k:\gamma _{k}\le 0\}}(\gamma _{j_{2k}}-\gamma _{j_{2k+1}})+\sum \limits _{\{k:\gamma _{k}\le 0\}}\ell _{k}\gamma _{k}\nonumber \\&\quad =\frac{\delta }{K}\sum \limits _{\{k:\gamma _{k}\le 0\}}\left( \gamma _{j_{2k}}-\gamma _{j_{2k+1}}+\frac{K}{\delta }\ell _{k}\gamma _{k}\right) \ge \frac{\delta }{K}\bar{S}\nonumber \\&\qquad +\frac{\delta }{K}\sum \limits _{\{k\ge \bar{k}:\gamma _{k}\le 0\}}\left( \frac{\delta }{2}-\frac{\delta }{4}\right) ={+\infty }, \end{aligned}$$
(50)

which contradicts (48). To sum up, all cluster points of \(\{\varvec{x}_{k}\}\) belong to \(\mathcal {O}\). \(\square \)

Corollary 2

Assume \(\{\varvec{x}_{k}\}\) is generated by Approach \(\mathrm {I}\), then \(\{\varvec{x}_{k}\}\) converges to the optimal solution set of distributed optimization problem (13). Furthermore, if there is \(\sigma >\max \{LM/\hat{g},NL+\sqrt{NL(NL+2)}\}\), then \(\{x_{i}(k)\}\ (i\in \mathcal {V})\) converges to the optimal solution set of optimization problem (1).

Remark 9

The variable step-size neurodynamic approach designed as Approach \(\mathrm {I}\) can solve nonsmooth distributed optimization problems with convex inequality and set constraints. Compared with the existing discrete-time neurodynamic approaches in [32, 45] and [50], which can only solve smooth distributed optimization problems with equality constraints or affine inequality constraints, Approach \(\mathrm {I}\) is able to handle nonsmooth distributed optimization problems with inequality and set constraints. Additionally, Approach \(\mathrm {I}\) converges without the assumption that the local cost functions are strongly convex.

Simulations and applications

In this section, the continuous-time and discrete-time neurodynamic approaches in this paper are applied to a numerical example. Moreover, an ill-conditioned least absolute deviation problem and a load sharing problem are formulated and solved. These experiments verify the feasibility of the proposed neurodynamic approaches.

Numerical simulations

Example 1

Consider a system of 20 agents interacting over an undirected ring communication graph to collaboratively solve the following optimization problem:

$$\begin{aligned} \begin{aligned} \mathrm {min}~~~&f(\varvec{x})=\sum \limits _{i=1}^{20}|ix_{i1}-x_{i2}|\\ \mathrm {s.t.}~~~&~x_{i1}+ix_{i2}\le 2;\ x_{i1}\ge 0;\ x_{i1}^2-x_{i2}\le 4;\ \Vert x_{i}\Vert \le 1,\\ \end{aligned}\nonumber \\ \end{aligned}$$
(51)

where \(x_i=(x_{i1},x_{i2})^\mathrm {T}\) for \(i=1,2,\ldots ,20\) and \(\varvec{x}=\mathrm {col}\{x_{1},x_{2},\ldots ,x_{20}\}\in \mathbb {R}^{40}\). Obviously, problem (51) is a nonsmooth convex distributed optimization problem with inequality constraint set \(\varXi _{i}=\{x\in \mathbb {R}^{2}:x_{1}+ix_{2}\le 2,\ x_{1}\ge 0,\ x_{1}^2-x_{2}\le 4\}\) and set constraint \(\varOmega _{i}=\overline{\mathrm {\varvec{B}}(\varvec{0},1)}\).
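As a quick sanity check on the constraint sets in (51), the hypothetical helper below (our own naming, not part of the paper's approach) tests whether a point lies in \(\varXi _{i}\cap \varOmega _{i}\) for agent \(i\):

```python
import math

def feasible(i, x, tol=1e-9):
    """Check membership of x = (x1, x2) in Xi_i ∩ Omega_i for agent i in (51).

    Xi_i:    x1 + i*x2 <= 2,  x1 >= 0,  x1**2 - x2 <= 4
    Omega_i: the closed unit ball centered at the origin.
    """
    x1, x2 = x
    return (x1 + i * x2 <= 2 + tol
            and x1 >= -tol
            and x1**2 - x2 <= 4 + tol
            and math.hypot(x1, x2) <= 1 + tol)
```

For instance, \((1,0)^{\mathrm {T}}\) (the Slater point used below) and the origin are feasible for every agent, while \((2,2)^{\mathrm {T}}\) is not.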

(I)   Continuous-time neurodynamic approach (14) for (51)

Fig. 2

Convergence behavior of neurodynamic approach (14)

It is easy to verify that the inequality constraint set \(\varXi _{i}\) is bounded and that \(\hat{x}_{i}=(1,0)^{\mathrm {T}}\in \mathrm {int}(\varXi _{i})\cap \varOmega \), so Assumption 1 is satisfied. Furthermore, for any \(x_{i}\in \varXi _{i}\cap \varOmega \),

$$\begin{aligned} \partial f_{i}(x_{i})=\left\{ \begin{array}{ll} (i,-1)^{\mathrm {T}},&{} ix_{i1}-x_{i2}>0 \\ {[-1,1]}(i,-1)^{\mathrm {T}}, &{} ix_{i1}-x_{i2}=0 \\ (-i,1)^{\mathrm {T}}, &{} ix_{i1}-x_{i2}<0 \end{array}\right. \end{aligned}$$

and \(\nabla g_{1}(x)=\left( 1,i\right) ^{\mathrm {T}}\), \(\nabla g_{2}(x)=\left( -1,0\right) ^{\mathrm {T}}\), \(\nabla g_{3}(x)=\left( 2x_{1},-1\right) ^{\mathrm {T}}\). Thus, the parameters are chosen as \(M=10\) and \(L=300\). Applying the continuous-time neurodynamic approach (14) from an initial point, the state solution converges to an optimal solution \(x^{*}=[0,0]^{\mathrm {T}}\) of optimization problem (1), as shown in Fig. 2a; the evolution of the global cost function is shown in Fig. 2b. Therefore, the effectiveness of neurodynamic approach (14) is verified.

(II)   Approach \(\mathrm {I}\) for (51)

Applying Approach \(\mathrm {I}\) with the same initial value as in (I), take \(\sigma =121000\) and \(\alpha _{k}=1/(k+5000)\). After 5000 iterations, the convergence of the iteration sequence and the value of the global objective function \(f(\varvec{x})\) can be seen in Fig. 3, and the result coincides with that in (I).
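For illustration, the discrete-time update can be sketched in a centralized, single-agent form. The snippet below assumes the normalized projected subgradient step \(\varvec{x}_{k+1}=P_{\varOmega }(\varvec{x}_{k}-\ell _{k}\varsigma _{k}/\Vert \varsigma _{k}\Vert )\) analyzed in the proofs above, applied to one agent's cost \(|ix_{1}-x_{2}|\) over the unit ball; it deliberately omits the consensus and inequality-constraint dynamics of the full Approach \(\mathrm {I}\), so it is a sketch under stated assumptions, not the approach itself.

```python
import math

def proj_unit_ball(x):
    """Projection onto the closed unit ball Omega = {x : ||x|| <= 1}."""
    n = math.hypot(*x)
    return x if n <= 1.0 else (x[0] / n, x[1] / n)

def subgrad(i, x):
    """A subgradient of f_i(x) = |i*x1 - x2| (zero is a valid choice at a kink)."""
    s = i * x[0] - x[1]
    if s > 0:
        return (float(i), -1.0)
    if s < 0:
        return (float(-i), 1.0)
    return (0.0, 0.0)  # 0 lies in [-1,1]*(i,-1)^T when i*x1 - x2 = 0

def run(i=1, x0=(0.5, -0.5), iters=2000):
    """Normalized projected subgradient iteration with diminishing steps."""
    x, best = x0, abs(i * x0[0] - x0[1])
    for k in range(iters):
        g = subgrad(i, x)
        ng = math.hypot(*g)
        if ng == 0.0:          # already at a minimizer of f_i
            return x, 0.0
        step = 1.0 / (k + 10)  # l_k: square-summable but not summable
        x = proj_unit_ball((x[0] - step * g[0] / ng,
                            x[1] - step * g[1] / ng))
        best = min(best, abs(i * x[0] - x[1]))
    return x, best
```

With the diminishing step rule, the best recorded cost approaches the minimum value 0 while the iterates stay inside the unit ball.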

Fig. 3

Convergence behavior of approach \(\mathrm {I}\)

Fig. 4

Communication topology of the multi-agent system in Example 1

(III)   A higher dimensional simulation for (51)

Now consider a multi-agent system with \(N=100\) agents interacting over the undirected communication graph shown in Fig. 4 to collaboratively solve optimization problem (51). The 100 agents are divided into five layers; the 20 agents in each layer are connected in a ring, and adjacent layers are connected through the agents with coordinates \((x,y,z)=(1,0,z)\). Hence, this communication graph is undirected and connected. We apply the continuous-time neurodynamic approach (14) and Approach \(\mathrm {I}\), respectively. Figure 5 shows the convergence of both algorithms and illustrates their effectiveness for higher-dimensional distributed optimization problems.

In Example 1, the dimension of the solution obtained by the proposed neurodynamic approach (14) is 40. The approaches in [40] and [52] can also be implemented to solve problem (51), but the dimensions of their solutions are as high as 100 and 140, respectively. In addition, the approach in [51] is only suitable for distributed optimization problems with affine inequalities, so it cannot solve problem (51); moreover, for optimization problems of the same dimension, the solution dimension generated by the approach in [51] is up to 160, four times that obtained by (14). This gap becomes more pronounced for problems with larger dimensional constraints. Thus, neurodynamic approach (14) can greatly reduce the computational load and CPU consumption.

Fig. 5

The trajectories of 100 agents

Least absolute deviation problem

Example 2

The system in [50] and Approach \(\mathrm {I}\) proposed in this paper are used to solve the following ill-conditioned least absolute deviation problem:

$$\begin{aligned} {\hbox {min}} ~~~ \sum \limits _{i=1}^N \Vert Dx-c\Vert _{1} \end{aligned}$$
(52)

where \(D=\left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c} 1.1 &{} 4 &{} 7 &{} -3.1 &{} 5.5 \\ 2 &{} 5.1 &{} 8 &{} -4 &{} 6.9 \\ 3.1 &{} 6 &{} 8.9 &{} -5 &{} 8.1 \end{array} \right) ^{\mathrm {T}}\) and \(c=(-2,1.1, 4.2,-0.5,2.1)^{\mathrm {T}}\).

Both the system in [50] and Approach \(\mathrm {I}\) are able to solve nonsmooth distributed convex optimization problems. As mentioned in [50], the condition number of the matrix \(D\) is about 200, so it is difficult to solve problem (52) using traditional optimization algorithms. Here, we consider a multi-agent network consisting of five nodes, each of which is assigned an objective function containing \(D\) and \(c\). The communication links between agents are shown in Fig. 6. Then, from the same initial point, both algorithms produce iteration sequences that converge to the optimal solution \((2,1,-2)^{\mathrm {T}}\); the results are given in Fig. 7, which shows that one advantage of the algorithm in this paper is its much faster convergence.
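The reported optimum can be checked directly: at \(x^{*}=(2,1,-2)^{\mathrm {T}}\) the first three residuals of \(Dx-c\) vanish, consistent with an \(\ell _1\) fit determining three parameters. The plain-Python snippet below (our own check, not the distributed iteration itself) evaluates the residuals and one agent's objective:

```python
# Data of problem (52); D is the transpose of the 3x5 matrix in the text,
# so its rows are the five data points.
D = [(1.1, 2.0, 3.1),
     (4.0, 5.1, 6.0),
     (7.0, 8.0, 8.9),
     (-3.1, -4.0, -5.0),
     (5.5, 6.9, 8.1)]
c = [-2.0, 1.1, 4.2, -0.5, 2.1]

x_star = (2.0, 1.0, -2.0)  # optimal solution reported in the text

# Residual vector r = D @ x_star - c and the l1 objective of one agent.
r = [sum(d * x for d, x in zip(row, x_star)) - ci for row, ci in zip(D, c)]
obj = sum(abs(v) for v in r)
```

The remaining residuals are \(0.3\) and \(-0.4\), so each agent's objective value at the optimum is \(0.7\).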

Fig. 6

Communication topology of the multi-agent system in Example 2 and the interaction of the buses in Example 3

Fig. 7

Trajectory of the iteration sequence \(\{\varvec{x}_{k}\}\)

Fig. 8

Trajectories of the state solution \(v_{i}(t)\)

Load sharing problem

In the field of power systems, the load sharing problem is frequently considered: find the optimal generation allocation to share the load under generation and transmission capacity constraints. Consider a transmission network \(\mathcal {G}(\mathcal {V},\mathcal {E})\), where \(\mathcal {V}=\{1,2,\ldots ,N\}\) is the set of buses and \(\mathcal {E}=\{1,\ldots ,d\}\) is the set of transmission lines. \(\mathcal {G}\) is connected, and its incidence matrix \(D=(D_{ij})\in \mathbb {R}^{N\times d}\) is defined by

$$\begin{aligned} D_{ij}=\left\{ \begin{array}{lll} 1, &{} \mathrm{line}\ j\ \mathrm{enters}\ \mathrm{the}\ \mathrm{bus}\ i\\ -1, &{} \mathrm{line}\ j\ \mathrm{departs}\ \mathrm{from}\ \mathrm{the}\ \mathrm{bus}\ i\\ 0, &{} \mathrm{otherwise}. \end{array} \right. \end{aligned}$$
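For illustration, the incidence matrix can be assembled from a list of directed lines; the helper below is our own sketch (the function name and calling convention are ours), and the example reproduces the five-bus ring used later in Example 3.

```python
def incidence_matrix(n_buses, lines):
    """Build D in R^{N x d}: D[i][j] = +1 if line j enters bus i+1,
    -1 if line j departs from bus i+1, and 0 otherwise.
    `lines` is a list of (from_bus, to_bus) pairs with 1-based bus labels."""
    D = [[0] * len(lines) for _ in range(n_buses)]
    for j, (frm, to) in enumerate(lines):
        D[frm - 1][j] = -1   # line j departs from bus `frm`
        D[to - 1][j] = 1     # line j enters bus `to`
    return D

# Five-bus ring: line j runs from bus j to bus j+1 (mod 5).
D_ring = incidence_matrix(5, [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)])
```

Each column of an incidence matrix sums to zero, since every line departs from exactly one bus and enters exactly one bus.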

Following [46], the load sharing problem can be abstracted into the following mathematical model:

$$\begin{aligned} \begin{aligned} \text{ min } ~~~&{\sum _{i\in \mathcal {V}}f_{i} (p_{i}^\mathrm{gene})}\\ \text{ s.t. }~~~&~~\varvec{p}^\mathrm{gene}-D\varvec{\nu }-\varvec{p}^\mathrm{load}=\varvec{0}_{N} \\~~~~~~~~~~~~~~~&~~ \underline{p}_{i}^\mathrm{gene}\le p_{i}^\mathrm{gene}\le \overline{p}_{i}^\mathrm{gene}, \forall i\in \mathcal {V} \\~~~~~~~~~~~~~~~&~~ \underline{\nu }_{j}\le \nu _{j}\le \overline{\nu }_{j}, \forall j\in \mathcal {E}, \end{aligned} \end{aligned}$$
(53)

where \(\varvec{p}^\mathrm{load}\in \mathbb {R}^{N}\) denotes the constant load demands at the buses and \(f_{i}\) is the local cost function at bus \(i\). The constraint \(\underline{p}_{i}^\mathrm{gene}\le p_{i}^\mathrm{gene}\le \overline{p}_{i}^\mathrm{gene}\) is the generation capacity constraint at bus \(i\), and \(\underline{\nu }_{j}\le \nu _{j}\le \overline{\nu }_{j}\) is the power flow constraint on line \(j\). Here, \(\varvec{p}^\mathrm{load}=[p_{1}^\mathrm{load},p_{2}^\mathrm{load},\ldots ,p_{N}^\mathrm{load}]^{\mathrm {T}}\) and \(\varvec{\nu }=[\nu _{1},\nu _{2},\ldots ,\nu _{d}]^{\mathrm {T}}\), and the upper and lower bounds in the constraints are given constants.

Equivalently, the problem (53) can be reformulated as the following distributed optimization problem:

$$\begin{aligned} \text{ min } ~~~&\sum \limits _{i\in \mathcal {V}}f_{i}(p^\mathrm{gene}_{i})= \sum \limits _{i\in \mathcal {V}}f_{i}\left( \sum \limits _{j=1}^{d}D_{ij}\nu _{j}+p_{i}^\mathrm{load}\right) \nonumber \\ \text{ s.t. }~~~&~~ \underline{p}_{i}^\mathrm{gene}\le \sum \limits _{j=1}^{d}D_{ij}\nu _{j}+p_{i}^\mathrm{load}\le \overline{p}_{i}^\mathrm{gene}, \ i\in \mathcal {V}\nonumber \\ ~~~~~~~~~~~~~~~&~~ \underline{\nu }_{j}\le \nu _{j}\le \overline{\nu }_{j}, \ j\in \mathcal {E}. \end{aligned}$$
(54)

Example 3

To solve the load sharing problem (54), we consider a five-bus and five-line system whose bus interactions are shown in Fig. 6, \(\mathrm {i.e.}\), \(N=d=5\). Assume that \(\varvec{p}^\mathrm{load}=[3,1,4,2,3]^{\mathrm {T}}\), \(\overline{\varvec{p}}^\mathrm{gene}=[4,3,1,2,5]^{\mathrm {T}}\), \(\underline{\varvec{p}}^\mathrm{gene}=[0,0,0,0,0]^{\mathrm {T}}\), \(\overline{\varvec{\nu }}=[4,5,7,3,1]^{\mathrm {T}}\), \(\underline{\varvec{\nu }}=[-4,-3,-6,-5,-10]^{\mathrm {T}}\), and

$$\begin{aligned} f_{i}= & {} 2(p^\mathrm{gene}_{i})^2+2p^\mathrm{gene}_{i},\ \ \ i=1,2,\ldots ,5\\ D= & {} \left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c} -1 &{} 0 &{} 0 &{} 0 &{} 1 \\ 1 &{} -1 &{} 0 &{} 0 &{} 0 \\ 0 &{} 1 &{} -1 &{} 0 &{} 0 \\ 0 &{} 0 &{} 1 &{} -1 &{} 0 \\ 0 &{} 0 &{} 0 &{} 1 &{} -1 \end{array} \right) . \end{aligned}$$

After numerical simulation, we obtain the final result \(\varvec{\nu }=[3.3453, 2.9926, 0.9915, 1.9883, 3.6823]^{\mathrm {T}}\), \(\varvec{p}^\mathrm{gene}=[0.6488, -1.3438, 1.6647, 1.6764, 0.9941]^{\mathrm {T}}\). The convergence process is shown in Fig. 8, and the obtained allocation minimizes the cost of load sharing.
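To make the reformulated objective of (54) concrete, the sketch below (helper name is ours) evaluates \(\sum _{i}f_{i}(p_{i}^\mathrm{gene})\) for a given flow vector \(\varvec{\nu }\) using the data of Example 3. The point \(\varvec{\nu }=\varvec{0}\) is used purely for illustration (it need not satisfy the generation bounds): there, \(\varvec{p}^\mathrm{gene}\) reduces to \(\varvec{p}^\mathrm{load}\).

```python
def gen_cost(nu, D, p_load):
    """Evaluate the objective of (54): sum_i f_i(p_i) with
    f_i(p) = 2*p**2 + 2*p and p_i = sum_j D[i][j]*nu[j] + p_load[i]."""
    p = [sum(Dij * nj for Dij, nj in zip(row, nu)) + pl
         for row, pl in zip(D, p_load)]
    return sum(2 * pi**2 + 2 * pi for pi in p), p

# Data of Example 3: the ring incidence matrix and the load demands.
D_ex3 = [[-1, 0, 0, 0, 1],
         [1, -1, 0, 0, 0],
         [0, 1, -1, 0, 0],
         [0, 0, 1, -1, 0],
         [0, 0, 0, 1, -1]]
p_load = [3, 1, 4, 2, 3]

cost0, p0 = gen_cost([0, 0, 0, 0, 0], D_ex3, p_load)
```

At \(\varvec{\nu }=\varvec{0}\), \(p^\mathrm{gene}=p^\mathrm{load}=[3,1,4,2,3]^{\mathrm {T}}\) and the cost is \(2(9+1+16+4+9)+2(3+1+4+2+3)=104\).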

Conclusion

In this paper, we proposed three neurodynamic approaches for solving distributed optimization problems with inequality and set constraints. First, a continuous-time neurodynamic approach was proposed that does not use the Laplacian matrix of the communication topology, and its state solution was proved to converge to an optimal solution of the nonsmooth distributed optimization problem under several mild assumptions. Then, an event-triggered neurodynamic approach was designed to reduce the communication burden, and it was verified that Zeno behavior does not occur. Furthermore, we proposed a discrete-time neurodynamic approach and proved that the iteration sequence converges to the optimal solution set of the nonsmooth distributed optimization problem. Finally, numerical examples were presented to demonstrate the effectiveness and advantages of the proposed neurodynamic approaches. As future work, we plan to extend these neurodynamic approaches to distributed optimization problems over directed or time-varying communication graphs.