1 Introduction

In recent years, the scale of data in many real-life applications has increased enormously. Traditional centralized computing faces many challenges and is sometimes entirely infeasible for large-scale problems. As a result, distributed algorithms over multi-agent systems have received much attention from researchers in diverse areas, including the consensus problem [1–4], resource allocation (RA) [5, 6], multi-unmanned aerial vehicle (MUAV) control [7], and distributed target tracking [8]. Distributed algorithms are usually associated with a network of agents where each agent has limited computation and communication ability. The agents are required to cooperatively achieve a global objective by using their local observations and the information transmitted from their neighbors. Compared with centralized approaches, distributed algorithms have the advantages of robustness against network link failures, privacy protection, and reduced communication and computation costs.

One important branch of distributed algorithms is the distributed optimization problem, which seeks the minimizer of a global function written as a sum of the local functions of the agents. In particular, distributed optimization for time-invariant cost functions has become a mature discipline with many results; see [9–11] and the references therein. On the other hand, optimization problems with time-varying cost functions have attracted much attention due to their appearance in various applications, for example, signal processing [12] and online optimization [13, 14]. The main challenge of time-varying optimization lies in the fact that the minimizer of the time-varying cost function changes with time. Since traditional optimization algorithms can only move the estimates towards the minimizer of the cost function at the current time, they cannot track the movement of the minimizer in a dynamic environment. To cope with this issue, two different strategies have been developed. The first one is the running method [15, 16], where the algorithms sample the time-varying cost function at a fixed frequency and perform traditional optimization on the sampled function between sampling instants. The second one is the prediction-correction method [17, 18], where the algorithms only optimize the cost function of the current time at each step, but additional information on the dynamics of the moving minimizer is required, such as the second derivative of the cost function [18] or an additional constraint on the dynamics of the minimizer [14]. While we consider a different problem in this paper, the method we utilize is similar to the second one.

It can be noticed that an optimization problem can often be transformed into a root-seeking problem, since the optimization of a differentiable convex function is equivalent to seeking the root of its gradient, and for the case where the gradients are unavailable we can use finite differences to estimate the gradient [19]. It is therefore natural to consider the distributed root-tracking problem. Distributed stochastic approximation for time-invariant regression functions has been studied by many researchers as a solution to the distributed root-seeking problem [20–22]. Inspired by the work on dynamic stochastic approximation [23, 24], in this paper we propose a distributed stochastic approximation algorithm for tracking the changing root of a sum of time-varying regression functions over a network. Each agent aims to track the changing root of the global function, but it can only access a noise-corrupted local observation and the information transmitted from its neighbors. In addition, the noise-corrupted dynamics of the root of the global regression function is assumed to be known to all agents.

In this paper, the distributed root-tracking problem for time-varying regression functions is considered. First, motivated by the truncation technique given in [22], a distributed stochastic approximation algorithm with expanding truncations is introduced. The key difference is that the observation of the local function in this algorithm is noise-corrupted, while exact gradient information is often required in the optimization algorithms mentioned above. Second, under the assumption that the noise-corrupted dynamics of the global root is known to all agents, the convergence conditions of the algorithm are introduced. Third, it is proved that the estimates generated by the distributed algorithm achieve both consensus and convergence with probability one. Finally, we apply the algorithm to a distributed target tracking problem, and a numerical example is given to demonstrate its performance.

The rest of the paper is organized as follows. The problem formulation and the distributed stochastic approximation algorithm are given in Section 2. The convergence conditions and results are presented in Section 3. To help the proof of the convergence result, two auxiliary sequences are defined and analysed in Section 4. The proof of the main result is given in Section 5. In Section 6, a distributed target tracking problem is solved by the algorithm and the numerical example is demonstrated. Some concluding remarks are addressed in Section 7.

2 Problem formulation and distributed root-tracking algorithm

2.1 Problem formulation

Consider a network system consisting of N agents. The interaction relationship among agents is described by a time-varying digraph \(\mathcal {G}(k)=\left \{\mathcal {V},\mathcal {E}(k)\right \}\), where k is the time index, \(\mathcal {V}=\{1,\dots,N\}\) is the agent set, and \(\mathcal {E}(k)\subset \mathcal {V}\times \mathcal {V}\) is the edge set. By \((i,j)\in \mathcal {E}(k)\) we mean that agent j can receive information from agent i at time k. Assume \((i,i)\in \mathcal {E}(k)\) for all \(k=1,2,\dots \) Denote the neighbor set of agent i at time k by \(N_{i}(k)=\left \{j\in \mathcal {V}:(j,i)\in \mathcal {E}(k)\right \}\). The adjacency matrix associated with the graph is denoted by \(W(k)=\left [w_{ij}(k)\right ]_{i,j=1}^{N}\), where wij(k)>0 if and only if \((j,i)\in \mathcal {E}(k)\), and wij(k)=0 otherwise.

A time-independent digraph \(\mathcal {G}=\{\mathcal {V},\mathcal {E}\}\) is called strongly connected if for any \(i,j\in \mathcal {V}\), there exists a directed path from i to j. A directed path is a sequence of edges \(\left (i,i_{1}\right),\left (i_{1},i_{2}\right),\dots,\left (i_{p-1},j\right)\) in the digraph with distinct agents \(i_{k}\in \mathcal {V},~0\le k\le p-1\), where p is called the length of this path. A nonnegative matrix A is called doubly stochastic if \(A\mathbf{1}=\mathbf{1}\) and \(\mathbf{1}^{T}A=\mathbf{1}^{T}\), where \(\mathbf{1}\) denotes the vector of all ones.

The time-varying global regression function is given by

$$\begin{array}{*{20}l} f_{k}(\cdot)=\frac{1}{N}\sum_{i=1}^{N}f_{i,k}(\cdot), \end{array} $$
(1)

where \(f_{i,k}(\cdot):\mathbb {R}^{l}\to \mathbb {R}^{l}\) is the local function associated with agent i. Denote by θk the root of the sum function fk(·) at time k, i.e., \(f_{k}(\theta _{k})=0,~ k=1,2,\dots \)

Further, assume that the dynamics of the root θk is governed by

$$\begin{array}{*{20}l} \theta_{k+1}=g_{k}(\theta_{k})+\xi_{k+1},~k\geq0, \end{array} $$
(2)

where the function \(g_{k}(\cdot):\mathbb {R}^{l}\to \mathbb {R}^{l}\) is known to all agents, and {ξk} is the sequence of dynamic noises. As can be seen in Section 6, this assumption is reasonable in some real-life application problems and has been studied before in [14, 25, 26].

For each agent i, the distributed root-tracking problem is to track the dynamic root of the time-varying global function by using its noise-corrupted observation of the local function fi,k(·), the dynamic information of the root gk(·), and the information obtained from its neighbors.

2.2 Algorithm

We now introduce the distributed root-tracking algorithm as follows:

$$\begin{array}{*{20}l} x^{\prime}_{i,k+1}&=\left\{\sum_{j\in N_{i}(k)}w_{ij}(k)g_{k}\left(x_{j,k}\right)+a_{k}O_{i,k+1}\right\}\mathbb{I}_{\left[\sigma_{i,k}=\hat{\sigma}_{i,k}\right]}\\ &+h_{k}(x^{*})\mathbb{I}_{\left[\sigma_{i,k}<\hat{\sigma}_{i,k}\right]}, \end{array} $$
(3)
$$\begin{array}{*{20}l} x_{i,k+1}&=x^{\prime}_{i,k+1}\mathbb{I}_{\left[||x^{\prime}_{i,k+1}-h_{k}(x^{*})||\le M_{\hat{\sigma}_{i,k}}\right]} \\ &+h_{k}(x^{*})\mathbb{I}_{\left[||x^{\prime}_{i,k+1}-h_{k}(x^{*})||> M_{\hat{\sigma}_{i,k}}\right]}, \end{array} $$
(4)
$$\begin{array}{*{20}l} \sigma_{i,k+1}&=\hat{\sigma}_{i,k}+\mathbb{I}_{\left[||x^{\prime}_{i,k+1}-h_{k}(x^{*})||> M_{\hat{\sigma}_{i,k}}\right]}, \end{array} $$
(5)
$$\begin{array}{*{20}l} \sigma_{i,0}&=0,~\hat{\sigma}_{i,k}=\max_{j\in N_{i}(k)}\sigma_{j,k}, \end{array} $$
(6)
$$\begin{array}{*{20}l} O_{i,k+1}&=f_{i,k+1}\left(g_{k}(x_{i,k})\right)+\epsilon_{i,k+1}, \end{array} $$
(7)

where 1) \(x_{i,k}\in \mathbb {R}^{l}\) is the estimate of θk given by agent i at time k, 2) Oi,k+1 defined by (7) is the local observation of agent i, 3) {ak}k≥0 is the sequence of step-sizes used by all agents, 4) x∗ is a fixed vector in \(\mathbb {R}^{l}\) known to all agents, 5) {Mk}k≥0 is a sequence of positive numbers increasingly diverging to infinity with M0≥||x∗||, 6) σi,k is the truncation number of agent i up to time k, and 7) hk(·) is a function defined as follows:

$$ {}h_{1}(x)=g_{1}(x),\quad h_{k}(x)=g_{k}\left(h_{k-1}(x)\right),\quad \text{for}~k=2,3,\dots. $$
(8)

Let us explain the algorithm. 1) For agent i, xi,k is the estimate of θk. Since the dynamics of {θk} is governed by (2), in order to make sure that the estimates track the dynamic root, the update at time k+1 utilizes gk(xi,k) instead of xi,k, as shown in (3). 2) For agent i, truncation happens when one of the following cases holds: a) \(\sigma _{i,k}<\hat {\sigma }_{i,k}\), which means that at least one neighbor has a truncation number larger than that of agent i; b) \(||x^{\prime }_{i,k+1}-h_{k}(x^{*})||> M_{\hat {\sigma }_{i,k}}\), which means that the distance between the intermediate value \(x^{\prime }_{i,k+1}\) and hk(x∗) exceeds the truncation bound. When truncation happens, the estimate xi,k is pulled back to hk−1(x∗). 3) It can be seen that truncations may not happen at the same time for different agents in the network. So for agent i, the update (5) makes sure that the truncation number of agent i is not smaller than the largest truncation number of its neighbors, i.e., \(\hat {\sigma }_{i,k}\). As will be shown in Lemma 4, this technique guarantees that the difference between the truncation numbers of different agents is bounded, which helps the algorithm converge. 4) The truncation mechanism makes sure that the estimates xi,k do not drift too far away from hk−1(x∗). As will be shown in Lemma 1, the distance between {hk−1(x∗)} and the dynamic root {θk} is bounded, so this truncation condition is a reasonable choice.

Remark 1

In our previous work [27], we proposed a distributed root-tracking algorithm without the expanding truncation. To make sure the algorithm converges, we assumed in [27] that the dynamic root {θk} and the estimates {xi,k} of all agents are bounded sequences. With the introduction of the expanding truncation mechanism, this assumption is removed in this paper.

3 Assumptions and convergence result

3.1 Assumptions

Let us list the assumptions to be used in the paper.

  • A1. \(a_{k}>0,~a_{k}\to 0,~\sum _{k=1}^{\infty }a_{k}=\infty \).

  • A2. There exists a continuously differentiable function \(v(\cdot):\mathbb {R}^{l}\to \mathbb {R}\) such that v(x)≠0 for any x≠0, v(0)=0, and for any \(0<r_{1}<r_{2}<\infty \)

    $$\begin{array}{*{20}l} \sup_{k}\sup_{r_{1}\le ||x-\theta_{k}||\le r_{2}}f_{k}^{T}(x)v_{x}(x-\theta_{k})<-a, \end{array} $$

    where a is a positive constant possibly depending on r1,r2. A constant r>η exists such that

    $$\begin{array}{*{20}l} \sup_{||y||\le\eta}v(y)<\sup_{||x||=r}v(x), \end{array} $$

    where η is an unknown constant specified later in Lemma 1.

  • A3. The class of functions {fi,k(·)}k≥0 is equi-continuous for i=1,…,N, i.e., for any fixed i and any ε>0, there exists δ>0 such that

    $$ ||f_{i,k}(x)-f_{i,k}(y)||\le\epsilon \quad\forall k,\quad \text{whenever}\quad ||x-y||\le\delta, $$

    where δ only depends on ε. Furthermore, for any c>0, there exists a constant α(c) such that ||fi,k(θk+ν)||<α(c) for any ν with \(||\nu ||\le c\), any \(i\in \mathcal {V}\), and \(k=1,2,\dots \)

  • A4. a) The adjacency matrices W(k), k≥0, are doubly stochastic;

    b) There exists a constant 0<κ<1 such that

    $$\begin{array}{*{20}l} w_{ij}(k)\ge\kappa\quad\forall j\in N_{i}(k)\quad \forall i\in\mathcal{V}\quad\forall k\ge 0, \end{array} $$

    c) The digraph \(\mathcal {G}_{\infty }=\{\mathcal {V},\mathcal {E}_{\infty }\}\) is strongly connected, where

    $$\begin{array}{*{20}l} {}\mathcal{E}_{\infty}=\{(j,i):(j,i)\in\mathcal{E}(k)\text{ for infinitely many indices }k\}, \end{array} $$

    d) There exists a positive integer B such that

    $$\begin{array}{*{20}l} (j,i)\in\mathcal{E}(k)\cup\mathcal{E}(k+1)\cup\cdots\cup\mathcal{E}(k+B-1) \end{array} $$

    for all \((j,i)\in \mathcal {E}_{\infty }\) and any k≥0.

  • A5. For any \(i\in \mathcal {V}\), the noise sequence {εi,k+1}k≥0 is such that

    $${\lim}_{T\to 0}\limsup_{k\to\infty}\frac{1}{T}||\sum_{m=n_{k}}^{m(n_{k},t_{k})}a_{m}\epsilon_{i,m+1}||=0,\quad\forall t_{k}\in[0,T],$$

    where \(m(k,T)\triangleq \max \left \{m:\sum _{i=k}^{m}a_{i}\le T\right \}\) and {nk} denotes the indices of any convergent subsequence \(\left \{x_{i,n_{k}}-\theta _{n_{k}}\right \}\).

  • A6. \(g_{k}(\cdot):\mathbb {R}^{l}\to \mathbb {R}^{l}\) is equi-continuous with respect to k and is such that \(||d_{k}(x)||\le \gamma _{k}||x-\theta _{k}||~\forall x,~k=1,2,\dots \), where

    $$d_{k}(x)\triangleq g_{k}(x)-g_{k}(\theta_{k})-(x-\theta_{k}),$$

    γk=o(ak), and \(\sum _{k=1}^{\infty }\gamma _{k}<\infty \).

  • ||ξk||=o(ak) and \(\sum _{k=1}^{\infty }||\xi _{k}||<\infty \).

A1 and A2 are the standard assumptions for stochastic approximation. A3 implies the local boundedness of the functions fi,k(·). Notice that the upper bound α(c) in A3 should be uniform with respect to k.

A4 describes the information exchange among agents. We refer to [9] for a detailed explanation. Set \(\Phi (k,k+1)\triangleq \mathbf {I}_{N}\) and

$$\Phi(k,s)\triangleq W(k)\cdots W(s)\quad\forall k\ge s.$$

By Proposition 1 in [9] it follows that there exist constants c>0 and 0<ρ<1 such that

$$\begin{array}{*{20}l} ||\Phi(k,s)-\frac{1}{N}\mathbf{1}\mathbf{1}^{T}||\le c\rho^{k-s+1}\quad\forall k\ge s. \end{array} $$
(9)

Notice that in A5, the noise condition is required only along the indices of any convergent subsequence \(\left \{x_{i,n_{k}}-\theta _{n_{k}}\right \}\). As will be seen in the next section, this makes the convergence analysis much easier compared with requiring the noise condition to hold along the whole sequence.

In A6, dk(x) measures the difference between the estimation error xi,k−θk and the prediction error gk(xi,k)−gk(θk). This assumption implies that the dynamics of the root, i.e., gk(·), tends to a linear (translation-like) map as time k goes to infinity. For example, if the dynamics of the changing root is gk(x)=x+c, then A6 holds with γk=0.

3.2 Main result

Set \(\text {col}\left \{x_{1},\dots,x_{m}\right \}\triangleq \left (x_{1}^{T},\dots,x_{m}^{T}\right)^{T}\), and define

$$\begin{array}{*{20}l} &X_{k}\triangleq \text{col}\left\{x_{1,k},\dots,x_{N,k}\right\},\\ &\Theta_{k}\triangleq \mathbf{1}\otimes \theta_{k} \in\mathbb{R}^{Nl},\\ &\epsilon_{k}\triangleq \text{col}\left\{\epsilon_{1,k},\dots,\epsilon_{N,k}\right\},\\ &\Xi_{k}\triangleq \mathbf{1}\otimes \xi_{k} \in\mathbb{R}^{Nl},\\ &G_{k}(X_{k})\triangleq \text{col}\left\{g_{k}\left(x_{1,k}\right),\dots,g_{k}\left(x_{N,k}\right)\right\},\\ &F_{k+1}(X_{k})\triangleq \text{col}\left\{f_{1,k+1}\left(x_{1,k}\right),\dots,f_{N,k+1}\left(x_{N,k}\right)\right\},\\ &D_{k}(X_{k})\triangleq \text{col}\left\{d_{k}\left(x_{1,k}\right),\dots,d_{k}\left(x_{N,k}\right)\right\}. \end{array} $$

Further, we denote the disagreement vector of Xk by \(X_{\bot,k}\triangleq D_{\bot }X_{k}\) with \(D_{\bot }\triangleq \left (\mathbf {I}_{N}-\frac {\mathbf {1}\mathbf {1}^{T}}{N}\right)\otimes \mathbf {I}_{l}\). Define \(x_{k}\triangleq \frac {1}{N}\sum _{i=1}^{N}x_{i,k}\), the average of all agents’ estimates at time k. Define \(\Delta _{i,k}\triangleq x_{i,k}-\theta _{k}, \Lambda _{k}\triangleq X_{k}-\Theta _{k}\), and \(\Delta _{k}\triangleq x_{k}-\theta _{k}\).

Theorem 1

Let {xi,k} be the estimates produced by (3)–(7) with arbitrary initial values xi,0. Assume A1–A4 and A6 hold. If, for a fixed sample ω, A5 holds for all agents and A7 holds, then for this ω the following assertions take place:

i) There exists a positive integer k0 depending on ω such that

$$ x_{i,k+1}=\sum_{j\in N_{i}(k)}w_{ij}(k)g_{k}\left(x_{j,k}\right)+a_{k}O_{i,k+1}, $$
(10)

or in the compact form

$$ X_{k+1}=(W(k)\otimes\mathbf{I}_{l})G_{k}(X_{k})+a_{k}\left(F_{k+1}\left(G_{k}(X_{k})\right)+\epsilon_{k+1}\right) $$
(11)

for any kk0;

ii)

$$\begin{array}{*{20}l} {\lim}_{k\to\infty}X_{\bot,k}=\mathbf{0},\quad {\lim}_{k\to\infty} \Lambda_{k}=\mathbf{0}. \end{array} $$

Theorem 1 i) shows that the truncations cease after a finite number of steps. This implies that the difference between the estimate xi,k and hk−1(x∗) is bounded, which is desirable, as shown in Lemma 1. Before we move on to the proof of Theorem 1, we first show that the truncation mechanism is reasonable.

Lemma 1

If A6 and A7 hold, the sequence {hk(x)−θk+1} is bounded for any x.

Proof

By A6 and A7 from (2) it follows that

$$\begin{array}{*{20}l} &||h_{k}(x^{*})-\theta_{k+1}||=||g_{k}\left(h_{k-1}(x^{*})\right)-g_{k}(\theta_{k})-\xi_{k+1}||\\ &=||d_{k}\left(h_{k-1}(x^{*})\right)+h_{k-1}(x^{*})-\theta_{k}-\xi_{k+1}||\\ &\le(1+\gamma_{k})||h_{k-1}(x^{*})-\theta_{k}||+||\xi_{k+1}||\\ &\le\prod_{i=1}^{k}\left(1+\gamma_{i}\right)||g_{1}(x^{*})\,-\,\theta_{1}||\,+\,\sum_{i=1}^{k}\prod_{j=i+1}^{k}\left(1+\gamma_{j}\right)||\xi_{i+1}||\\ &\le\prod_{i=1}^{\infty}(1+\gamma_{i})||g_{1}(x^{*})-\theta_{1}||\,+\,\sum_{i=1}^{\infty}\prod_{j=i+1}^{\infty}\left(1+\gamma_{j}\right)||\xi_{i+1}||\\ &\triangleq\eta, \end{array} $$

where \(\prod _{i=1}^{\infty }(1+\gamma _{i})<\infty \) is implied by \(\sum _{j=1}^{\infty }\gamma _{j}<\infty \). □

As we mentioned in Section 2, Lemma 1 shows that the distance between {hk−1(x∗)} and {θk} is bounded. Since we hope the estimates {xi,k} generated by the algorithm (3)–(7) track the root {θk}, the truncation mechanism \(\mathbb {I}_{\left [||x^{\prime }_{i,k+1}-h_{k}(x^{*})||\le M_{\hat {\sigma }_{i,k}}\right ]}\) is intuitively reasonable.

4 Auxiliary sequences

The next two sections of this paper focus on the proof of Theorem 1. But prior to analyzing {xi,k}, we need to introduce two auxiliary sequences \(\left \{\tilde {x}_{i,k}\right \}\) and \(\left \{\tilde {\epsilon }_{i,k}\right \}\) for each agent \(i\in \mathcal {V}\). The motivation for constructing these two sequences comes from the nature of the distributed algorithm with expanding truncations. Recall the convergence analysis of the stochastic approximation algorithm with expanding truncations (SAAWET) [28]. The key step is to show that the truncations cease after a finite number of steps; therefore, the boundedness of the estimates is established. If the number of truncations increased unboundedly, then the estimate would be pulled back to x∗ infinitely many times. This would produce a convergent subsequence of the estimate sequence, and a contradiction could then be shown by analysing the estimates along this subsequence, which proves the boundedness of the estimates.

Although the problem in this paper differs from the one in [28] since the regression function here is time-varying, we use the same approach to prove the boundedness of the estimates. Notice the distributed algorithm with expanding truncations (3)–(7). When \({\lim }_{k\to \infty }\sigma _{i,k}=\infty ~\forall i\in \mathcal {V}\), the estimate xi,k is pulled back to hk−1(x∗) infinitely many times. By Lemma 1, we know that ||hk−1(x∗)−θk|| is bounded. So {Δi,k} contains a convergent subsequence. However, {Δk} may still not contain any convergent subsequence. This is because truncations may occur at different times for different \(i\in \mathcal {V}\). Therefore, the analysis approach used for SAAWET cannot be directly applied to the algorithm (3)–(7).

To overcome this difficulty, we introduce the auxiliary sequences \(\left \{\tilde {x}_{i,k}\right \}\) and \(\left \{\tilde {\epsilon }_{i,k}\right \}\). As will be shown, the auxiliary sequences \(\left \{\tilde {x}_{i,k}\right \}\) satisfy the recursions (19)–(21), for which the number of truncations at time k is the same for all agents, and the estimates \(\tilde {x}_{i,k}\) of all the agents are pulled back to hk−1(x∗) when σk>σk−1. The auxiliary noises \(\left \{\tilde {\epsilon }_{i,k}\right \}\) satisfy a condition similar to A5. These facts make the analysis of (19)–(21) feasible.

It is shown below that the important feature of the auxiliary sequences is that \(\left \{\tilde {x}_{i,k}\right \}\) and {xi,k} coincide after a finite number of steps, which means that the convergence of these two sequences is equivalent.

Denote by \(\tau _{i,m}\triangleq \inf \left \{k:\sigma _{i,k}=m\right \}\) the smallest time when the truncation number of agent i reaches m, by \(\tau _{m}\triangleq \min _{i\in \mathcal {V}}\tau _{i,m}\) the smallest time when at least one of the agents has its truncation number reach m, and by

$$\begin{array}{*{20}l} \sigma_{k}\triangleq\max_{i\in\mathcal{V}}\sigma_{i,k} \end{array} $$
(12)

the largest truncation number among all agents at time k. Set \(\tilde {\tau }_{i,m}\triangleq \tau _{i,m}\land \tau _{m+1}\), where ab= min{a,b}.

For any \(i\in \mathcal {V}\), define the auxiliary sequences \(\left \{\tilde {x}_{i,k}\right \}_{k\ge 0}\) and \(\left \{\tilde {\epsilon }_{i,k}\right \}_{k\ge 0}\) as follows:

$$\begin{array}{*{20}l} \tilde{x}_{i,k}\triangleq h_{k-1}(x^{*}),\quad\tilde{\epsilon}_{i,k+1}\triangleq -f_{i,k+1}\left(h_{k}(x^{*})\right),\\ \forall k:\tau_{m}\le k<\tilde{\tau}_{i,m}, \end{array} $$
(13)
$$\begin{array}{*{20}l} \tilde{x}_{i,k}\triangleq x_{i,k},\quad\tilde{\epsilon}_{i,k+1}\triangleq \epsilon_{i,k+1},\\ \forall k:\tilde{\tau}_{i,m}\le k<\tau_{m+1}, \end{array} $$
(14)

where m is an integer.

Note that for the considered sample ω, for each integer k≥0 there exists a unique integer m≥0 such that τm≤k<τm+1. By definition \(\tilde {\tau }_{i,m}\le \tau _{m+1}~\forall i\in \mathcal {V}\). So \(\left \{\tilde {x}_{i,k}\right \}_{k\ge 0}\) and \(\left \{\tilde {\epsilon }_{i,k}\right \}_{k\ge 0}\) are uniquely determined by the sequences {xi,k}k≥0 and {εi,k}k≥0.

Lemma 2

For k∈[τm,τm+1), the following assertions hold:

$$\begin{array}{*{20}l} \text{i)}~\tilde{x}_{i,k}&=h_{k-1}(x^{*}),~\tilde{\epsilon}_{i,k+1} \\ &=-f_{i,k+1}\left(h_{k}(x^{*})\right),~\text{if}~\sigma_{i,k}<m; \end{array} $$
(15)
$$\begin{array}{*{20}l} &\text{ii)}~\tilde{x}_{i,k}=x_{i,k},~\tilde{\epsilon}_{i,k+1}=\epsilon_{i,k+1},~\text{if}~\sigma_{i,k}=m; \end{array} $$
(16)
$$\begin{array}{*{20}l} &\text{iii)}~\tilde{x}_{j,k}=h_{k-1}(x^{*}),~\text{if}~\sigma_{j,k-1}<m; \end{array} $$
(17)
$$\begin{array}{*{20}l} &\text{iv)}~\tilde{x}_{j,k+1}=h_{k}(x^{*}),~\forall j\in\mathcal{V},~\text{if}~\sigma_{k+1}=m+1. \end{array} $$
(18)

Proof

i) Since σi,k<m, by the definition of τi,m and the fact that k∈[τm,τm+1), we know τi,m>k. Thus, \(\tilde {\tau }_{i,m}=\tau _{i,m}\land \tau _{m+1}>k\). Hence, we conclude (15) from (13).

ii) Since σi,k=m, by definition we have τi,mk. Hence \(\tilde {\tau }_{i,m}=\tau _{i,m}\land \tau _{m+1}=\tau _{i,m}\le k\). So by (14) we conclude (16).

iii) By τm≤k<τm+1 we know σj,k≤m. We consider two cases: σj,k=m and σj,k<m. 1) For σj,k=m, since σj,k−1<m, we know that truncation happens at time k for agent j. Truncation happens only when one of the following cases holds: a) \(\sigma _{j,k-1}<\hat {\sigma }_{j,k-1}\). For this case, by (3) we have \(x^{\prime }_{j,k}=h_{k-1}(x^{*})\), hence by (4) we have xj,k=hk−1(x∗); b) \(\|x^{\prime }_{j,k}-h_{k-1}(x^{*})\|> M_{\hat {\sigma }_{j,k-1}}\). For this case, by (4) we have xj,k=hk−1(x∗). In conclusion, when σj,k=m holds, we have xj,k=hk−1(x∗). Furthermore, from (16) it follows that \(\tilde {x}_{j,k}=x_{j,k}\) if σj,k=m. So we have \(\tilde {x}_{j,k}=h_{k-1}(x^{*})\). 2) For σj,k<m, from (15) we have \(\tilde {x}_{j,k}=h_{k-1}(x^{*})\).

iv) From k∈[τm,τm+1) we know σk=m. Hence from σk+1=m+1 by definition we have τm+1=k+1, and k+1∈[τm+1,τm+2). By σk=m we see \(\sigma _{j,k}<m+1~\forall j\in \mathcal {V}\). Then we derive (18) from (17). □

Lemma 3

The auxiliary sequences \(\{\tilde {x}_{i,k}\},~\{\tilde {\epsilon }_{i,k}\}\) defined by (13)(14) satisfy the following recursions:

$$\begin{array}{*{20}l} \hat{x}_{i,k+1}&=\sum_{j\in N_{i}(k)}w_{ij}(k)g_{k}(\tilde{x}_{j,k}) \\ & +a_{k}\left(f_{i,k+1}\left(g_{k}\left(\tilde{x}_{i,k}\right)\right)+\tilde{\epsilon}_{i,k+1}\right) \end{array} $$
(19)
$$\begin{array}{*{20}l} \tilde{x}_{i,k+1}&=\hat{x}_{i,k+1}\mathbb{I}_{\left[||\hat{x}_{j,k+1}-h_{k}(x^{*})||\le M_{\sigma_{k}}~\forall j\in\mathcal{V}\right]}\\ &+h_{k}(x^{*})\mathbb{I}_{\left[\exists j\in\mathcal{V}:~||\hat{x}_{j,k+1}-h_{k}(x^{*})||>M_{\sigma_{k}}\right]} \end{array} $$
(20)
$$\begin{array}{*{20}l} \sigma_{k+1}&=\sigma_{k}+\mathbb{I}_{\left[\exists j\in\mathcal{V}:~||\hat{x}_{j,k+1}-h_{k}(x^{*})||>M_{\sigma_{k}}\right]},~\sigma_{0}=0. \end{array} $$
(21)

Proof

We prove this by induction.

First we prove (19)–(21) for k=0. Since 0∈[τ0,τ1) and \(\sigma _{i,0}=0~\forall i\in \mathcal {V}\), by (16) we have \(\tilde {x}_{i,0}=x_{i,0},~\tilde {\epsilon }_{i,1}=\epsilon _{i,1}~\forall i\in \mathcal {V}\). Then by \(\hat {\sigma }_{i,0}=\sigma _{i,0}=0~\forall i\in \mathcal {V}\), from (3) and (19) we see

$$\begin{array}{*{20}l} \hat{x}_{i,1}=x^{\prime}_{i,1},~\forall i\in\mathcal{V}. \end{array} $$
(22)

Now we prove that \(\tilde {x}_{i,1}\) and σ1 generated by (19)–(21) are consistent with the definition (12)(13)(14). We consider two cases:

i) There is no truncation at time k=1, i.e., \(\sigma _{i,1}=0~\forall i\in \mathcal {V}\). Since σi,0=0, by (5) we know that \(||x_{i,1}^{\prime }-h_{0}(x^{*})||\le M_{0}\). Then we have \(x_{i,1}=x^{\prime }_{i,1}\) by (4), and \(\tilde {x}_{i,1}=\hat {x}_{i,1}, \sigma _{1}=0\) by (20)(21). Combining these with (22) shows that \(\tilde {x}_{i,1}=x_{i,1}~\forall i\in \mathcal {V}\), which is consistent with (14) since \(\tilde {\tau }_{i,0}\le 1<\tau _{1}\). By (12) we see \(\sigma _{1}=\max _{i\in \mathcal {V}}\sigma _{i,1}=0\), which is consistent with the one derived from (21).

ii) There is a truncation at k=1 for some agent i0, i.e., \(\sigma _{i_{0},1}=1\). Then by (4)(5) we have \(x_{i_{0},1}=h_{0}(x^{*}),~||x_{i_{0},1}^{\prime }-h_{0}(x^{*})||>M_{0}\). Hence \(||\hat {x}_{i_{0},1}-h_{0}(x^{*})||>M_{0}\) by (22). From (20)(21) we have \(\tilde {x}_{i,1}=h_{0}(x^{*})~\forall i\in \mathcal {V}\) and σ1=1. By (12) from \(\sigma _{i_{0},1}=1\) we also derive σ1=1. Since 0∈[τ0,τ1) and σ1=1, by (18) we have \(\tilde {x}_{i,1}=h_{0}(x^{*})~\forall i\in \mathcal {V}\). Thus, \(\tilde {x}_{i,1}\) and σ1 defined by (13)(14)(12) are consistent with those generated by (19)–(21).

In summary, the lemma is proved for k=0.

Now, by induction, assume (19)–(21) hold for \(k=0,1,\dots,p\). At the fixed sample ω, for the given integer p there exists a unique integer m such that τm≤p<τm+1. We aim to show that (19)–(21) hold for k=p+1. Before this, we first express \(\hat {x}_{i,p+1}~\forall i\in \mathcal {V}\) produced by (19) in the following two cases:

Case 1: σi,p<m. Since p∈[τm,τm+1), by (15) we see

$$\begin{array}{*{20}l} \tilde{x}_{i,p}=h_{p-1}(x^{*}),\quad\tilde{\epsilon}_{i,p+1}=-f_{i,p+1}\left(h_{p}(x^{*})\right). \end{array} $$
(23)

From σi,p<m it follows that σj,p−1<m ∀j∈Ni(p) by (5). Then by (17) we have \(\tilde {x}_{j,p}=h_{p-1}(x^{*})~\forall j\in N_{i}(p)\), which combined with (19) and (23) shows

$$\begin{array}{*{20}l} \hat{x}_{i,p+1}=h_{p}(x^{*})\quad \forall i:\sigma_{i,p}<m \end{array} $$
(24)

Case 2: σi,p=m. By τmp<τm+1 we have \(\sigma _{j,p}\le m~\forall j\in \mathcal {V}\) and hence by (6) we derive

$$\begin{array}{*{20}l} \hat{\sigma}_{i,p}=m,\quad\forall i:\sigma_{i,p}=m \end{array} $$
(25)

Then by (3)

$$\begin{array}{*{20}l} x^{\prime}_{i,p+1}&=\sum_{j\in N_{i}(p)}w_{ij}(p)g_{p}(x_{j,p})\\ &+a_{p}\Big(f_{i,p+1}\big(g_{p}(x_{i,p})\big)+\epsilon_{i,p+1}\Big). \end{array} $$
(26)

From σi,p=m and p∈[τm,τm+1), by (16) it can be shown that

$$\begin{array}{*{20}l} \tilde{x}_{i,p}=x_{i,p},\quad\tilde{\epsilon}_{i,p+1}=\epsilon_{i,p+1}. \end{array} $$
(27)

Substituting (27) into (19), by (26) we know that

$$\begin{array}{*{20}l} \hat{x}_{i,p+1}=x^{\prime}_{i,p+1}\quad\forall i:\sigma_{i,p}=m. \end{array} $$
(28)

So we have expressed \(\hat {x}_{i,p+1}~\forall i\in \mathcal {V}\) produced by (19) for the two cases above.

Since τmp<τm+1, we have σp<m+1 and hence σp+1m+1. From τmp it follows that σp=m and σp+1m, hence mσp+1m+1.

Now we show that \(\tilde {x}_{i,p+1}\) and σp+1 generated by (19)–(21) are consistent with their definitions (12)(13)(14). We prove this for two cases σp+1=m+1 and σp+1=m.

Case 1: σp+1=m+1. We first show

$$\begin{array}{*{20}l} \sigma_{i,p+1}\le m,~\text{if}~\sigma_{i,p}<m \end{array} $$
(29)

for the following two cases: 1) σi,p<m and σj,p<mjNi(p). For this case by (6) we have \(\hat {\sigma }_{i,p}<m\), and hence \(\sigma _{i,p+1}\le \hat {\sigma }_{i,p}+1\le m\) by (5). 2) σi,p<m and σj,p=m for some jNi(p). For this case we derive \(\hat {\sigma }_{i,p}=m, x^{\prime }_{i,p+1}=h_{p}(x^{*})\) by (3)(6). So, by (5) we derive \(\sigma _{i,p+1}=\hat {\sigma }_{i,p}=m\). Thus, σi,p+1m when σi,p<m. Hence (29) holds. Furthermore, this means that

$$\begin{array}{*{20}l} \sigma_{i,p+1}=m+1~\text{only if}~\sigma_{i,p}=m. \end{array} $$
(30)

Since we are considering the case where σp+1=m+1, by definition we know that there exists some agent \(i_{0}\in \mathcal {V}\) such that \(\sigma _{i_{0},p+1}=m+1\). Then \(\sigma _{i_{0},p}=m\) by (30), and hence \(\hat {\sigma }_{i_{0},p}=m\) from (25). Then from \(\sigma _{i_{0},p+1}=m+1\) by (5) we know that \(||x^{\prime }_{i_{0},p+1}-h_{p}(x^{*})||>M_{m}\). So from (20)(21) we derive \(\tilde {x}_{i,p+1}=h_{p}(x^{*})~\forall i\in \mathcal {V}\) and σp+1=m+1, which is consistent with the σp+1 defined by (12). Since σp+1=m+1 and p∈[τm,τm+1), by (18) we see that \(\tilde {x}_{i,p+1}=h_{p}(x^{*})~\forall i\in \mathcal {V}\), which is consistent with that generated by (19)–(21).

Case 2: σp+1=m. In this case \(\sigma _{i,p+1}\le m~\forall i\in \mathcal {V}\). By (25), from (4)(5) we see that

$$\begin{array}{*{20}l} {}||x^{\prime}_{i,p+1}-h_{p}(x^{*})||\le M_{m},~x_{i,p+1}=x^{\prime}_{i,p+1}~\forall~i:\sigma_{i,p}=m. \end{array} $$
(31)

So, by (28) we derive

$$\begin{array}{*{20}l} ||\hat{x}_{i,p+1}-h_{p}(x^{*})||\le M_{m},~\forall i:\sigma_{i,p}=m. \end{array} $$
(32)

From (24) we have \(||\hat {x}_{i,p+1}-h_{p}(x^{*})||=0\le M_{m}~\forall i:\sigma _{i,p}<m\), which together with (32) yields \(||\hat {x}_{i,p+1}-h_{p}(x^{*})||\le M_{m}~\forall i\in \mathcal {V}\). Then from (20) it follows that

$$\begin{array}{*{20}l} \tilde{x}_{i,p+1}=\hat{x}_{i,p+1}~\forall i\in\mathcal{V},\quad \sigma_{p+1}=m, \end{array} $$
(33)

which means that σp+1 is consistent with that defined by (12).

It remains to show that \(\tilde {x}_{i,p+1}\) generated by (19)–(21) is consistent with that defined by (13)(14). We consider two cases: 1) σi,p=m. For this case, by (28), (31), and (33) we see \(\tilde {x}_{i,p+1}=x_{i,p+1}~\forall i:\sigma _{i,p}=m\). By σp+1=m we see p+1∈[τm,τm+1), and hence \(\tilde {x}_{i,p+1}=x_{i,p+1}\) by (16). So the consistency assertion holds for any i with σi,p=m. 2) σi,p<m. For this case, from σp+1=m we see p+1∈[τm,τm+1), and hence by σi,p<m and (17) we know that \(\tilde {x}_{i,p+1}\) defined by (13)(14) is equal to hp(x∗). By (24)(33) we derive \(\tilde {x}_{i,p+1}=h_{p}(x^{*})\). So the consistency assertion holds for i with σi,p<m too.

In summary, \(\tilde {x}_{i,p+1}\) and σp+1 generated by (19)–(21) are consistent with their definitions (12)(13)(14). So the induction is complete. □

Lemma 4

Assume A4 holds. Then

i)

$$\begin{array}{*{20}l} \sigma_{j,k+Bd_{i,j}}\ge\sigma_{i,k}\quad\forall j\in\mathcal{V}~\forall k>0, \end{array} $$
(34)

where di,j is the length of the shortest directed path from i to j in \(\mathcal {G}_{\infty }\), and B is the positive integer given in A4 d).

ii)

$$\begin{array}{*{20}l} \tilde{\tau}_{j,m}\le\tau_{m}+BD\quad\forall j\in\mathcal{V}~\text{for}~m\ge 1, \end{array} $$
(35)

where \(D\triangleq \max _{i,j\in \mathcal {V}}d_{i,j}\).

Proof

i) Since \(\mathcal {G}_{\infty }\) is strongly connected by A4 c), for any \(j\in \mathcal {V}\) there exists a sequence of nodes \(i_{1},i_{2},\dots,i_{d_{i,j}-1}\) such that \((i,i_{1})\in \mathcal {E}_{\infty },(i_{1},i_{2})\in \mathcal {E}_{\infty },\dots,\left (i_{d_{i,j}-1},j\right)\in \mathcal {E}_{\infty }\).

Noticing that \((i,i_{1})\in \mathcal {E}_{\infty }\), by A4 d) we have

$$\begin{array}{*{20}l} (i,i_{1})\in\mathcal{E}(k)\cup\mathcal{E}(k+1)\cup\dots\cup\mathcal{E}(k+B-1). \end{array} $$

Therefore, there exists a positive integer \(k^{\prime }\in [k,k+B-1]\) such that \((i,i_{1})\in \mathcal {E}(k^{\prime })\). So \(i\in N_{i_{1}}(k^{\prime })\), and hence by (6) and (5) we have

$$\begin{array}{*{20}l} \sigma_{i_{1},k+B}\ge\sigma_{i_{1},k^{\prime}+1}\ge\hat{\sigma}_{i_{1},k^{\prime}}\ge\sigma_{i,k^{\prime}}\ge\sigma_{i,k}. \end{array} $$

Repeating this procedure, we obtain \(\sigma _{i_{2},k+2B}\ge \sigma _{i_{1},k+B}\ge \sigma _{i,k}\), and finally we arrive at (34).

ii) For some m≥1, let τm=k1. Then there exists an agent i such that τi,m=k1. By (34) we have \(\sigma _{j,k_{1}+Bd_{i,j}}\ge \sigma _{i,k_{1}}=m~\forall j\in \mathcal {V}\).

For the case where \(\sigma _{j,k_{1}+Bd_{i,j}}=m~\forall j\in \mathcal {V}\), we have \(\tau _{j,m}\le k_{1}+Bd_{i,j}~\forall j\in \mathcal {V}\). Noticing τm=k1, by the definition of \(\tilde {\tau }_{j,m}\) we obtain (35):

$$\begin{array}{*{20}l} \tilde{\tau}_{j,m}\le\tau_{j,m}\le\tau_{m}+Bd_{i,j}\le\tau_{m}+BD\quad j\in\mathcal{V}. \end{array} $$

For the case where \(\sigma _{j,k_{1}+Bd_{i,j}}>m\) for some \(j\in \mathcal {V}\), we have τm+1k1+Bdi,j for some \(j\in \mathcal {V}\), and hence τm+1τm+BD. Again, we obtain (35):

$$\begin{array}{*{20}l} \tilde{\tau}_{j,m}\le\tau_{m+1}\le\tau_{m}+BD\quad j\in\mathcal{V}. \end{array} $$

□

Corollary 1

If \(\sigma _{k}\xrightarrow [k\to \infty ]\infty \), then \({\lim }_{k\to \infty }\sigma _{i,k}=\infty ~\forall i\in \mathcal {V}\).

This corollary can be easily obtained from (34).

Lemma 5

Assume that A1, A3, A6, and A7 hold, and that A5 holds at the sample path ω under consideration. Then for this ω

$$\begin{array}{*{20}l} {\lim}_{T\to 0}&\limsup_{k\to\infty}\frac{1}{T}||\sum_{s=n_{k}}^{m(n_{k},t_{k})\land(\tau_{\sigma_{n_{k}}+1}-1)}a_{s}\tilde{\epsilon}_{s+1}||=0\\ &\forall t_{k}\in[0,T] \end{array} $$
(36)

along indices {nk} whenever \(\left \{\widetilde {\Lambda }_{n_{k}}\right \}\) converges at ω, where \(\tilde {\epsilon }_{k}\triangleq col\left \{\tilde {\epsilon }_{1,k},\dots,\tilde {\epsilon }_{N,k}\right \}, \tilde {X}_{k}\triangleq col\left \{\tilde {x}_{1,k},\dots,\tilde {x}_{N,k}\right \}\), and \(\widetilde {\Lambda }_{k}\triangleq \tilde {X}_{k}-\Theta _{k}\).

Proof

It suffices to show

$$\begin{array}{*{20}l} {\lim}_{T\to 0}&\limsup_{k\to\infty}\frac{1}{T}||\sum_{s=n_{k}}^{m\left(n_{k},t_{k}\right)\land\left(\tau_{\sigma_{n_{k}}+1}-1\right)}a_{s}\tilde{\epsilon}_{i,s+1}||=0\\ &\forall t_{k}\in[0,T] \end{array} $$
(37)

along indices {nk} whenever \(\left \{\tilde {x}_{i,n_{k}}-\theta _{n_{k}}\right \}\) converges at the sample ω where A5 holds for agent i.

We consider two cases:

Case 1: \({\lim }_{k\to \infty }\sigma _{k}=\sigma <\infty \). From the definition we obtain

$$\begin{array}{*{20}l} \tau_{\sigma+1}=\infty~\text{when }{\lim}_{k\to\infty}\sigma_{k}=\sigma. \end{array} $$
(38)

By (35) we have \(\tilde {\tau }_{i,\sigma }\le \tau _{\sigma }+BD\), hence by (14)(38)

$$\begin{array}{*{20}l} \tilde{x}_{i,k}=x_{i,k},~\tilde{\epsilon}_{i,k+1}=\epsilon_{i,k+1}\quad\forall k\ge\tau_{\sigma}+BD. \end{array} $$
(39)

So,

$$\begin{array}{*{20}l} ||\sum_{s=k}^{m(k,t)\land(\tau_{\sigma_{k}+1}-1)}a_{s}\tilde{\epsilon}_{i,s+1}||=||\sum_{s=k}^{m(k,t)}a_{s}\epsilon_{i,s+1}|| \end{array} $$

for any t>0 and any sufficiently large k. Then by A5 we conclude (37).

Case 2: \({\lim }_{k\to \infty }\sigma _{k}=\infty \). In this case we prove (37) for three separate cases:

i): \(\tilde {\tau }_{i,\sigma _{n_{p}}}\le n_{p}\). For this case, \(\left [n_{p},\tau _{\sigma _{n_{p}}+1}\right)\subset \left [\tau _{i,\sigma _{n_{p}}},\tau _{\sigma _{n_{p}}+1}\right)\). So by (14) we have

$$\begin{array}{*{20}l} \tilde{x}_{i,s}=x_{i,s},~\tilde{\epsilon}_{i,s+1}=\epsilon_{i,s+1}\quad\forall n_{p}\le s\le\tau_{\sigma_{n_{p}}+1}. \end{array} $$
(40)

Thus, for any tp∈[0,T]

$$\begin{array}{*{20}l} &||\sum_{s=n_{p}}^{m\left(n_{p},t_{p}\right)\land\left(\tau_{\sigma_{n_{p}}+1}-1\right)}a_{s}\tilde{\epsilon}_{i,s+1}||\\ &=||\sum_{s=n_{p}}^{m\left(n_{p},t_{p}\right)\land\left(\tau_{\sigma_{n_{p}}+1}-1\right)}a_{s}\epsilon_{i,s+1}||. \end{array} $$
(41)

Notice that by (40), \(\tilde {x}_{i,n_{p}}=x_{i,n_{p}}\), and the indices {np} are taken such that \(\{\tilde {x}_{i,n_{p}}-\theta _{n_{p}}\}\) is a convergent subsequence. So \(\{x_{i,n_{p}}-\theta _{n_{p}}\}\) is a convergent subsequence as well. It can be seen that \(\sum _{s=n_{p}}^{m(n_{p},t_{p})\land (\tau _{\sigma _{n_{p}}+1}-1)}a_{s}\le \sum _{s=n_{p}}^{m(n_{p},t_{p})}a_{s}\le t_{p}\le T\). Hence, from (41) and A5 we conclude (37).

ii): \(\tilde {\tau }_{i,\sigma _{n_{p}}}>n_{p}\) and \(\tilde {\tau }_{i,\sigma _{n_{p}}}=\tau _{\sigma _{n_{p}}+1}\). By the definitions of τk and σk we have \(\tau _{\sigma _{k}}\le k\), and hence \(\tau _{\sigma _{n_{p}}}\le n_{p}\). Then \([n_{p},\tau _{\sigma _{n_{p}}+1})\subset [\tau _{\sigma _{n_{p}}},\tilde {\tau }_{i,\sigma _{n_{p}}})\), and hence by (13) we have

$$\begin{array}{*{20}l} \tilde{x}_{i,s}=h_{s-1}(x^{*}),~\tilde{\epsilon}_{i,s+1}=-f_{i,s+1}(h_{s}(x^{*}))\\ \forall s:n_{p}\le s<\tau_{\sigma_{n_{p}}+1}. \end{array} $$
(42)

From \(\tilde {\tau }_{i,\sigma _{n_{p}}}=\tau _{\sigma _{n_{p}}+1}\) by (35) we know that \(\tau _{\sigma _{n_{p}}+1}\le \tau _{\sigma _{n_{p}}}+BD\le n_{p}+BD\). Then for any tp∈[0,T], utilizing A1, A3 and Lemma 1 we have

$$\begin{array}{*{20}l} &||\sum_{s=n_{p}}^{m\left(n_{p},t_{p}\right)\land\left(\tau_{\sigma_{n_{p}}+1}-1\right)}a_{s}\tilde{\epsilon}_{i,s+1}|| \\ & \le\sum_{s=n_{p}}^{n_{p}+BD}a_{s}||f_{i,s+1}\left(h_{s}(x^{*})\right)|| \\ &=\sum_{s=n_{p}}^{n_{p}+BD}a_{s}||f_{i,s+1}\left(\theta_{s+1}+h_{s}(x^{*})-\theta_{s+1}\right)||\\ &\le\sum_{s=n_{p}}^{n_{p}+BD}a_{s}\alpha(\eta) \\ &\le BD\cdot a_{n_{p}}\cdot\alpha(\eta)\xrightarrow[p\to\infty]{}0, \end{array} $$
(43)

and hence (37) holds for this case.

iii): \(\tilde {\tau }_{i,\sigma _{n_{p}}}>n_{p}\) and \(\tilde {\tau }_{i,\sigma _{n_{p}}}<\tau _{\sigma _{n_{p}}+1}\). For this case from definition we know that \(\tilde {\tau }_{i,\sigma _{n_{p}}}=\tau _{i,\sigma _{n_{p}}}\). So by (35) we have \(\tau _{i,\sigma _{n_{p}}}\le \tau _{\sigma _{n_{p}}}+BD\). Noticing \(\tau _{\sigma _{n_{p}}}\le n_{p}\), we conclude that

$$\begin{array}{*{20}l} \tau_{\sigma_{n_{p}}}\le n_{p}<\tilde{\tau}_{i,\sigma_{n_{p}}}=\tau_{i,\sigma_{n_{p}}}\le n_{p}+BD \end{array} $$
(44)

So, \([n_{p},\tau _{i,\sigma _{n_{p}}})\subset [\tau _{\sigma _{n_{p}}},\tilde {\tau }_{i,\sigma _{n_{p}}})\). From this and \(\tilde {\tau }_{i,\sigma _{n_{p}}}=\tau _{i,\sigma _{n_{p}}}\), by (13)(14) we derive

$$\begin{array}{*{20}l} {}\tilde{x}_{i,s}=h_{s-1}(x^{*}),~\tilde{\epsilon}_{i,s+1}=-f_{i,s+1}\left(h_{s}(x^{*})\right),~\forall n_{p}\le s<\tau_{i,\sigma_{n_{p}}},\\ \tilde{x}_{i,s}=x_{i,s},~\tilde{\epsilon}_{i,s+1}=\epsilon_{i,s+1},~\forall \tau_{i,\sigma_{n_{p}}}\le s<\tau_{\sigma_{n_{p}}+1}. \end{array} $$

Consequently, for any tp∈[0,T]

$$\begin{array}{*{20}l} &||\sum_{s=n_{p}}^{m(n_{p},t_{p})\land\left(\tau_{\sigma_{n_{p}}+1}-1\right)}a_{s}\tilde{\epsilon}_{i,s+1}||\\ &\le||\sum_{s=n_{p}}^{m(n_{p},t_{p})\land\left(\tau_{\sigma_{n_{p}}+1}-1\right)}a_{s}f_{i,s+1}\left(h_{s}(x^{*})\right)\mathbb{I}_{[n_{p}\le s<\tau_{i,\sigma_{n_{p}}}]}||\\ &+||\sum_{s=\tau_{i,\sigma_{n_{p}}}}^{m\left(n_{p},t_{p}\right)\land\left(\tau_{\sigma_{n_{p}}+1}-1\right)}a_{s}\epsilon_{i,s+1}||. \end{array} $$
(45)

Analyze the first term on the right-hand side of (45):

$$\begin{array}{*{20}l} &||\sum_{s=n_{p}}^{m(n_{p},t_{p})\land(\tau_{\sigma_{n_{p}}+1}-1)}a_{s}f_{i,s+1}\big(h_{s}(x^{*})\big)\mathbb{I}_{[n_{p}\le s<\tau_{i,\sigma_{n_{p}}}]}||\\ &\le\sum_{s=n_{p}}^{\tau_{i,\sigma_{n_{p}}}}a_{s}||f_{i,s+1}\big(h_{s}(x^{*})\big)||\\ &\le\sum_{s=n_{p}}^{n_{p}+BD}a_{s}\alpha(\eta)\xrightarrow[p\to\infty]{}0. \end{array} $$

From the definition of τi,k, the truncation number of agent i at time \(\tau _{i,\sigma _{n_{p}}}\) is \(\sigma _{n_{p}}\), while it is smaller than \(\sigma _{n_{p}}\) at time \(\tau _{i,\sigma _{n_{p}}}-1\). So by the algorithm (3)–(6) we know \(x_{i,\tau _{i,\sigma _{n_{p}}}}=h_{\tau _{i,\sigma _{n_{p}}}-1}(x^{*})\). Now consider the second term on the right-hand side of (45). If we can show that \(\{h_{\tau _{i,\sigma _{n_{p}}}-1}(x^{*})-\theta _{\tau _{i,\sigma _{n_{p}}}}\}\) is convergent, then, combining this with the fact that \(\sum _{s=\tau _{i,\sigma _{n_{p}}}}^{m(n_{p},t_{p})\land \tau _{\sigma _{n_{p}}+1}-1}a_{s}\le \sum _{s=n_{p}}^{m(n_{p},t_{p})}a_{s}\le t_{p}\), from A5 we can conclude that the second term on the right-hand side of (45) tends to zero as p→∞.

We show that {hk(x∗)−θk+1} is a convergent sequence by proving that it is a Cauchy sequence. For two integers j>i>0, we see

$$\begin{array}{*{20}l} &||h_{j}(x^{*})-\theta_{j+1}-h_{i}(x^{*})+\theta_{i+1}||\\ &\le||h_{j}(x^{*})-\theta_{j+1}-h_{j-1}(x^{*})+\theta_{j}\\ &+h_{j-1}(x^{*})-\theta_{j}-h_{j-2}(x^{*})+\theta_{j-1}+\cdots+h_{i+1}(x^{*})\\&-\theta_{i+2}\\ &-h_{i}(x^{*})+\theta_{i+1}||\\ &=||g_{j}\big(h_{j-1}(x^{*})\big)\,-\,g_{j}(\theta_{j})-h_{j-1}(x^{*})+\theta_{j}-\xi_{j+1}+\cdots||\\ &\le\sum_{l=i+1}^{j}||d_{l}(h_{l-1}(x^{*}))||+\sum_{l=i+2}^{j+1}||\xi_{l}||\\ &\le\sum_{l=i+1}^{j}\gamma_{l}\cdot\eta+\sum_{l=i+2}^{j+1}||\xi_{l}||, \end{array} $$

where the last inequality comes from A6 and Lemma 1. By A6 and A7 we know that \(\sum _{k=1}^{\infty }\gamma _{k}<\infty \) and \(\sum _{k=1}^{\infty }||\xi _{k}||<\infty \). Thus, for any ε>0, we can find a sufficiently large N0>0 such that ||hj(x∗)−θj+1−hi(x∗)+θi+1||<ε for all i,j>N0, which means that {hk(x∗)−θk+1} is a Cauchy sequence. Hence, (37) holds for case iii) as well.

Since one of the cases i), ii), iii) must take place when \({\lim }_{k\to \infty }\sigma _{k}=\infty \), we conclude that (37) holds in Case 2.

Combining Case 1 and Case 2, we conclude (36). □

Corollary 2

In (38) we showed that τσ+1=∞ when \({\lim }_{k\to \infty }\sigma _{k}=\sigma <\infty \). So, if \({\lim }_{k\to \infty }\sigma _{k}=\sigma <\infty \), by (20) we know that \(\{\tilde {x}_{i,k}\}\) and \(\{x_{i,k}\}\), as well as \(\{\tilde {\epsilon }_{i,k}\}\) and {εi,k}, coincide after a finite number of steps.

5 Proof of the main result

Define:

$$\begin{array}{*{20}l} \Psi(k,s)\triangleq&\big[D_{\bot}(W(k)\otimes\mathbf{I}_{l})\big]\big[D_{\bot}(W(k-1)\otimes\mathbf{I}_{l})\big]\cdots\\ &\big[D_{\bot}(W(s)\otimes\mathbf{I}_{l})\big]\quad\forall k\ge s, \end{array} $$

and

$$\quad \Psi(k-1,k)\triangleq\mathbf{I}_{Nl}.$$

Since the W(k) are doubly stochastic, by the property of the Kronecker product \((A\otimes B)(C\otimes D)=(AC)\otimes (BD)\) we know that for any k≥s−1

$$\begin{array}{*{20}l} &\Psi(k,s)=\big(\Phi(k,s)-\frac{1}{N}\mathbf{1}\mathbf{1}^{T}\big)\otimes\mathbf{I}_{l}, \end{array} $$
(46)
$$\begin{array}{*{20}l} &\Psi(k,s)D_{\bot}=\big(\Phi(k,s)-\frac{1}{N}\mathbf{1}\mathbf{1}^{T}\big)\otimes\mathbf{I}_{l}. \end{array} $$
(47)

The following lemma characterizes the closeness of the auxiliary sequence \(\{\widetilde {\Lambda }_{k}\}_{k\geq 1}\) along its convergent subsequence \(\{\widetilde {\Lambda }_{n_{k}}\}\).

Lemma 6

Assume A1, A3, A4, and A6 hold. Further, for a fixed sample ω, assume A5 and A7 hold for all the agents. Let \(\{\widetilde {\Lambda }_{n_{k}}\}\) be a convergent subsequence of \(\{\widetilde {\Lambda }_{k}\}\) for ω, say \(\widetilde {\Lambda }_{n_{k}}\xrightarrow [k\to \infty ]{}\widetilde {\Lambda }\). Then for this ω there is a T>0 such that for all sufficiently large k and any Tk∈[0,T]

$$\begin{array}{*{20}l} {}\tilde{X}_{m+1}\,=\,(W(m)\!\otimes\!\mathbf{I}_{l})G_{m}(\tilde{X}_{m})\,+\,a_{m}\Big(F_{m+1}\big(G_{m}(\tilde{X}_{m})\big)+\tilde{\epsilon}_{m+1}\Big) \end{array} $$
(48)

for any \(m=n_{k},\dots,m(n_{k},T_{k})\), and

$$\begin{array}{*{20}l} ||\widetilde{\Lambda}_{m+1}-\widetilde{\Lambda}_{n_{k}}||\le c_{1}T_{k}+M_{0}^{\prime} \end{array} $$
(49)
$$\begin{array}{*{20}l} ||\widetilde{\Delta}_{m+1}-\widetilde{\Delta}_{n_{k}}||\le c_{2}T_{k},\quad \forall n_{k}\le m\le m(n_{k},T_{k}), \end{array} $$
(50)

where \(\tilde {x}_{k}\triangleq \frac {1}{N}\sum _{k=1}^{N}\tilde {x}_{i,k}, \widetilde {\Delta }_{k}\triangleq \tilde {x}_{k}-\theta _{k}\), and \(\widetilde {\Delta }_{i,k}\triangleq \tilde {x}_{i,k}-\theta _{k}\).

Proof

Consider a fixed sample path ω where A5 and A7 hold.

Let \(C>||\widetilde {\Lambda }||\). There exists an integer kC>0 such that

$$\begin{array}{*{20}l} ||\widetilde{\Lambda}_{n_{k}}||\le C,~\gamma_{k}<a_{k},\\ ||\xi_{k+1}||<a_{k},~a_{k}<1,\quad\forall k\ge k_{C} \end{array} $$
(51)

From Lemma 5 we know that there exist constants T1>0 and k0>kC such that

$$\begin{array}{*{20}l} ||\sum_{s=n_{k}}^{m(n_{k},t_{k})\land(\tau_{\sigma_{n_{k}}+1}-1)}a_{s}\tilde{\epsilon}_{s+1}||\le T_{0}\\ \forall t_{k}\in[0,T_{0}],~\forall T_{0}\in[0,T_{1}],~\forall k\ge k_{0}. \end{array} $$
(52)

Define

$$\begin{array}{*{20}l} M_{0}^{\prime}\triangleq C(c\rho+2)+1, \end{array} $$
(53)
$$\begin{array}{*{20}l} c_{1}\triangleq\sqrt{N}\cdot c_{2}+2+\frac{c(1+\rho)}{1-\rho}, \end{array} $$
(54)
$$\begin{array}{*{20}l} c_{2}\triangleq M_{0}^{\prime}+C+2+\alpha(2M_{0}^{\prime}+2C+3)+\frac{1}{\sqrt{N}}, \end{array} $$
(55)

where c and ρ are given by (9). Select T such that

$$\begin{array}{*{20}l} 0<T\le T_{1},~c_{1}T<1. \end{array} $$
(56)

For any kk0 and any Tk∈[0,T] define

$$\begin{array}{*{20}l} &s_{k}\triangleq\sup\{s\ge n_{k}:||\widetilde{\Lambda}_{j}-\widetilde{\Lambda}_{n_{k}}||\\ &\le c_{1}T_{k}+M_{0}^{\prime}\quad\forall n_{k}\le j\le s\} \end{array} $$
(57)

So from (51) and (56) it follows that

$$\begin{array}{*{20}l} ||\widetilde{\Lambda}_{j}||\le M^{\prime}_{0}+C+1,\quad\forall n_{k}\le j\le s_{k}. \end{array} $$
(58)

We intend to prove sk>m(nk,Tk). Assume the converse, i.e., that for sufficiently large k≥k0 and any Tk∈[0,T]

$$\begin{array}{*{20}l} s_{k}\le m(n_{k},T_{k}). \end{array} $$
(59)

We first show that there exists a positive integer k1>k0 such that for any kk1

$$\begin{array}{*{20}l} s_{k}<\tau_{\sigma_{n_{k}}+1},~\forall k\ge k_{1},~\forall T_{k}\in[0,T]. \end{array} $$
(60)

We prove (60) for two cases: \({\lim }_{k\to \infty }\sigma _{k}=\infty \) and \({\lim }_{k\to \infty }\sigma _{k}=\sigma <\infty \).

i) \({\lim }_{k\to \infty }\sigma _{k}=\infty \): From (58) we know that \(||\tilde {x}_{i,n_{k}}-\theta _{n_{k}}||\le M^{\prime }_{0}+C+1~\forall i\in \mathcal {V}\). First, we prove that for sufficiently large k, truncation does not happen at time nk+1. For any \(i\in \mathcal {V}\), we consider the following two cases:

a) \(\tilde {x}_{i,n_{k}}\) and \(\tilde {\epsilon }_{i,n_{k}+1}\) take value as (13): From (19) we have

$$\begin{array}{*{20}l} {}\hat{x}_{i,n_{k}+1}=&\sum_{j\in N_{i}(n_{k})}w_{ij}(n_{k})g_{n_{k}}(\tilde{x}_{j,n_{k}})\\ =&\sum_{j\in N_{i}(n_{k})}w_{ij}(n_{k})\Big(g_{n_{k}}(\tilde{x}_{j,n_{k}})\,-\,g_{n_{k}}(\theta_{n_{k}})\,-\,(\tilde{x}_{j,n_{k}}\,-\,\theta_{n_{k}})\Big)\\ &+\sum_{j\in N_{i}(n_{k})}w_{ij}(n_{k})\Big(\theta_{n_{k}+1}-\xi_{n_{k}+1}+(\tilde{x}_{j,n_{k}}-\theta_{n_{k}})\Big). \end{array} $$

Since A4 indicates that W(nk) is doubly stochastic, by A6, (51) and direct calculation we have the following inequalities

$$\begin{array}{*{20}l} \|\hat{x}_{i,n_{k}+1}-\theta_{n_{k}+1}\|\le&(\gamma_{n_{k}}+1)(M^{\prime}_{0}+C+1)+\|\xi_{n_{k}+1}\|\\ \le&2M^{\prime}_{0}+2C+3, \end{array} $$

and hence by Lemma 1 we know \(||\hat {x}_{i,n_{k}+1}-h_{n_{k}}(x^{*})||\le \eta +2M^{\prime }_{0}+2C+3\).

b) \(\tilde {x}_{i,n_{k}}\) and \(\tilde {\epsilon }_{i,n_{k}+1}\) take value as (14): From (19) we have

$$\begin{array}{*{20}l} {}\hat{x}_{i,n_{k}+1}\!=&\sum_{j\in N_{i}(n_{k})}w_{ij}(n_{k})\Big(g_{n_{k}}(\tilde{x}_{j,n_{k}})\,-\,g_{n_{k}}(\theta_{n_{k}})\,-\,(\tilde{x}_{j,n_{k}}\,-\,\theta_{n_{k}})\Big)\\ &+\sum_{j\in N_{i}(n_{k})}w_{ij}(n_{k})\Big(\theta_{n_{k}+1}-\xi_{n_{k}+1}+(\tilde{x}_{j,n_{k}}-\theta_{n_{k}})\Big)\\ &+a_{n_{k}}f_{i,n_{k}+1}\left(\theta_{n_{k}+1}+g_{n_{k}}(\tilde{x}_{i,n_{k}})-g_{n_{k}}(\theta_{n_{k}})\right.\\ &\quad-\left.(\tilde{x}_{i,n_{k}}-\theta_{n_{k}})+(\tilde{x}_{i,n_{k}}-\theta_{n_{k}})-\xi_{n_{k}+1}\right)\\ &+a_{n_{k}}\epsilon_{i,n_{k}+1}. \end{array} $$

By A5 we know that \(a_{n_{k}}||\epsilon _{i,n_{k}+1}||<1\) for sufficiently large k. Then, by A3, A4, A6, and (51), we have the following inequalities

$$\begin{array}{*{20}l} {}\|\hat{x}_{i,n_{k}+1}-\theta_{n_{k}+1}\|\le&2M^{\prime}_{0}+2C+3+a_{n_{k}}\alpha(2M^{\prime}_{0}+2C+3)\\&+a_{n_{k}}\|\epsilon_{i,n_{k}+1}\|\\ \le&2M^{\prime}_{0}+2C+4+\alpha(2M^{\prime}_{0}+2C+3)\\&\triangleq M_{1}, \end{array} $$

and hence by Lemma 1 we know \(||\hat {x}_{i,n_{k}+1}-h_{n_{k}}(x^{*})||\le \eta +M_{1}\).

So we have shown that when \(||\widetilde {\Lambda }_{n_{k}}||\le M^{\prime }_{0}+C+1\), we have \(||\hat {x}_{i,n_{k}+1}-h_{n_{k}}(x^{*})||\le \eta +M_{1}\). Since {Mk} is a sequence of positive numbers increasingly diverging to infinity, there exists a positive integer k1>k0 such that \(M_{\sigma _{n_{k}}}>\eta +M_{1}\) for all k≥k1. Thus, truncation does not happen at time nk+1.

Notice that (58) holds for all j with nk≤j≤sk. So, similarly to the proof above, we can show that truncation does not happen at times \(n_{k}+1,\dots,s_{k}+1\). Then we conclude \(s_{k}<\tau _{\sigma _{n_{k}}+1}\).

ii) \({\lim }_{k\to \infty }\sigma _{k}=\sigma <\infty \): For this case there exists a positive integer k1>k0 such that \(\sigma _{n_{k}}=\sigma \) for all k≥k1, and hence \(\tau _{\sigma _{n_{k}}+1}=\infty \) by definition. Then \(m(n_{k},T_{k})<\tau _{\sigma _{n_{k}}+1}\), hence by (59) we know \(s_{k}<\tau _{\sigma _{n_{k}}+1}\). So (60) is proven.

By (56) we see Tk∈[0,T]⊂[0,T1]. Then from (52) we know that for sufficiently large k>k1 and any Tk∈[0,T]

$$\begin{array}{*{20}l} ||\sum_{s=n_{k}}^{m(n_{k},t_{k})\land(\tau_{\sigma_{n_{k}}+1}-1)}a_{s}\tilde{\epsilon}_{s+1}||\le T_{k}\quad\forall t_{k}\in[0,T_{k}]. \end{array} $$
(61)

By setting \(t_{k}=\sum _{m=n_{k}}^{s}a_{m}\) for some s∈[nk,sk], from (59) we see \(\sum _{m=n_{k}}^{s}a_{m}\le \sum _{m=n_{k}}^{s_{k}}a_{m}\le T_{k}\). Noticing m(nk,tk)=s, from (60) we derive \(m(n_{k},t_{k})\land (\tau _{\sigma _{n_{k}}+1}-1)=s\). So by (61) we know that

$$\begin{array}{*{20}l} ||\sum_{m=n_{k}}^{s}a_{m}\tilde{\epsilon}_{m+1}||\le T_{k}\quad\forall s:n_{k}\le s\le s_{k} \end{array} $$
(62)

for sufficiently large kk1 and any Tk∈[0,T].

Now we consider the following recursive algorithm starting from nk:

$$ \begin{aligned} Z_{m+1}=(W(m)\otimes\mathbf{I}_{l})G_{m}(Z_{m})+a_{m}\Big(F_{m+1}\big(G_{m}(Z_{m})\big)+\tilde{\epsilon}_{m+1}\Big),\\ Z_{n_{k}}=\tilde{X}_{n_{k}}. \end{aligned} $$
(63)

where \(Z_{k}\triangleq \text {col}\{z_{1,k},\dots,z_{N,k}\}\). By (60) we know that (48) holds for \(m=n_{k},\dots,s_{k}-1\), for all k≥k1 and Tk∈[0,T]. Then we derive

$$\begin{array}{*{20}l} Z_{m}=\tilde{X}_{m}\quad\forall m:n_{k}\le m\le s_{k} \end{array} $$
(64)

Set \(z_{k}=\frac {\mathbf {1}^{T}\otimes \mathbf {I}_{l}}{N}Z_{k}, \widehat {\Delta }_{i,k}\triangleq z_{i,k}-\theta _{k}, \widehat {\Delta }_{k}\triangleq z_{k}-\theta _{k}\), and \(\widehat {\Lambda }_{k}\triangleq Z_{k}-\Theta _{k}\). Multiplying both sides of (63) by \(\frac {1}{N}(\mathbf {1}^{T}\otimes \mathbf {I}_{l})\), from \(\mathbf {1}^{T}W(m)=\mathbf {1}^{T}\) and \((A\otimes B)(C\otimes D)=(AC)\otimes (BD)\) we derive

$$\begin{array}{*{20}l} {}z_{s+1}=\frac{1}{N}\sum_{i=1}^{N}g_{s}(z_{i,s})+\frac{\mathbf{1}^{T}\otimes\mathbf{I}_{l}}{N}a_{s}\Big(F_{s+1}\big(G_{s}(Z_{s})\big)+\tilde{\epsilon}_{s+1}\Big) \end{array} $$

and hence

$$\begin{array}{*{20}l} &z_{s+1}=\frac{1}{N}\sum_{i=1}^{N}g_{s}(z_{i,s})-\frac{1}{N}\sum_{i=1}^{N}g_{s}(\theta_{s})-\xi_{s+1}+\theta_{s+1}\\ &+\frac{\mathbf{1}^{T}\otimes\mathbf{I}_{l}}{N}a_{s}F_{s+1}\big(G_{s}(Z_{s})-G_{s}(\Theta_{s})-\Xi_{s+1}+\Theta_{s+1}\big)\\ &+\frac{\mathbf{1}^{T}\otimes\mathbf{I}_{l}}{N}a_{s}\tilde{\epsilon}_{s+1}\\ &=\frac{1}{N}\sum_{i=1}^{N}d_{s}(z_{i,s})+\widehat{\Delta}_{s}-\xi_{s+1}+\theta_{s+1}\\ &+\frac{\mathbf{1}^{T}\otimes\mathbf{I}_{l}}{N}a_{s}F_{s+1}\big(\Theta_{s+1}+D_{s}(Z_{s})+\widehat{\Lambda}_{s}-\Xi_{s+1}\big)\\ &+\frac{\mathbf{1}^{T}\otimes\mathbf{I}_{l}}{N}a_{s}\tilde{\epsilon}_{s+1}. \end{array} $$

So

$$\begin{array}{*{20}l} &||\widehat{\Delta}_{s+1}-\widehat{\Delta}_{n_{k}}||\le||\sum_{j=n_{k}}^{s}\frac{1}{N}\sum_{i=1}^{N}d_{j}(z_{i,j})||+||\sum_{j=n_{k}}^{s}\xi_{j+1}||\\ &+\big|\big|\frac{\mathbf{1}^{T}\otimes\mathbf{I}_{l}}{N}\sum_{j=n_{k}}^{s}a_{j}F_{j+1}\big(\Theta_{j+1}+D_{j}(Z_{j})+\widehat{\Lambda}_{j}-\Xi_{j+1}\big)\big|\big|\\ &+||\frac{\mathbf{1}^{T}\otimes\mathbf{I}_{l}}{N}\sum_{j=n_{k}}^{s}a_{j}\tilde{\epsilon}_{j+1}||\\ &\le\frac{1}{N}\sum_{j=n_{k}}^{s}\sum_{i=1}^{N}\gamma_{j}||z_{i,j}-\theta_{j}||+||\sum_{j=n_{k}}^{s}\xi_{j+1}||\\ &+\frac{1}{N}\sum_{j=n_{k}}^{s}a_{j}\sum_{i=1}^{N}\big|\big|f_{i,j+1}\big(\theta_{j+1}+d_{j}(z_{i,j})+\widehat{\Delta}_{i,j}-\xi_{j+1}\big)\big|\big|\\ &+\frac{1}{\sqrt{N}}||\sum_{j=n_{k}}^{s}a_{j}\tilde{\epsilon}_{j+1}||\\ &\le\sum_{j=n_{k}}^{s}\gamma_{j}\cdot(M_{0}^{\prime}+C+1)+||\sum_{j=n_{k}}^{s}\xi_{j+1}||\\ &+\sum_{j=n_{k}}^{s}a_{j}\alpha\Big((1+\gamma_{j})(M^{\prime}_{0}+C+1)+||\xi_{j+1}||\Big)\\ &+\frac{1}{\sqrt{N}}T_{k}\\ &\le\Big(M^{\prime}_{0}+C+1+1+\alpha(2M^{\prime}_{0}+2C+3)+\frac{1}{\sqrt{N}}\Big)T_{k}\\ &= c_{2}T_{k},\qquad \forall s:n_{k}\le s\le s_{k}, \end{array} $$
(65)

where the second inequality comes from A6, the third inequality comes from (58), A3, and A5, the fourth inequality comes from (51) and (59), and the last inequality comes from (59).

Denote by \(Z_{\perp,s}=D_{\perp }Z_{s}\) the disagreement vector of Zs. Multiplying both sides of (63) by \(D_{\perp }\) we have

$$\begin{array}{*{20}l} &Z_{\perp,m+1}=D_{\perp}(W(m)\otimes\mathbf{I}_{l})G_{m}(Z_{m})+\\ &a_{m}D_{\perp}\Big(F_{m+1}\big(G_{m}(Z_{m})\big)+\tilde{\epsilon}_{m+1}\Big) \end{array} $$

Notice that \(D_{\perp }(W(m)\otimes \mathbf {I}_{l})=D_{\perp }(W(m)\otimes \mathbf {I}_{l})D_{\perp }\), so we have

$$\begin{array}{*{20}l} {}Z_{\perp,m+1}=&D_{\perp}(W(m)\otimes\mathbf{I}_{l})D_{\perp}Z_{m}\\&+D_{\perp}(W(m)\otimes\mathbf{I}_{l})\big(G_{m}(Z_{m})-Z_{m}\big)\\ &+a_{m}D_{\perp}\Big(F_{m+1}\big(G_{m}\big(Z_{m})\big)+\tilde{\epsilon}_{m+1}\Big) \end{array} $$

By definition we know that \(D_{\perp }G_{m}(\Theta _{m})=D_{\perp }\Theta _{m}=\mathbf {0}\), hence we have

$$\begin{array}{*{20}l} &Z_{\perp,m+1}=D_{\perp}(W(m)\otimes\mathbf{I}_{l})Z_{\perp,m}\\ &+D_{\perp}(W(m)\otimes\mathbf{I}_{l})\big(G_{m}(Z_{m})-G_{m}(\Theta_{m})-Z_{m}+\Theta_{m}\big)\\ &+a_{m}D_{\perp}\Big(F_{m+1}\big(G_{m}(Z_{m})\big)+\tilde{\epsilon}_{m+1}\Big)\\ &=D_{\perp}(W(m)\otimes\mathbf{I}_{l})Z_{\perp,m}+D_{\perp}(W(m)\otimes\mathbf{I}_{l})D_{m}(Z_{m})\\ &+a_{m}D_{\perp}\Big(F_{m+1}\big(G_{m}(Z_{m})\big)+\tilde{\epsilon}_{m+1}\Big). \end{array} $$

So inductively

$$\begin{array}{*{20}l} &Z_{\perp,s+1}=\Psi(s,n_{k})Z_{n_{k}}+\sum_{m=n_{k}}^{s}\Psi(s,m)D_{m}(Z_{m})\\ &+\sum_{m=n_{k}}^{s}\Psi(s,m+1)D_{\perp}a_{m}F_{m+1}\big(G_{m}(Z_{m})\big)\\ &+\sum_{m=n_{k}}^{s}\Psi(s,m+1)D_{\perp}a_{m}\tilde{\epsilon}_{m+1}, \end{array} $$

by (46)(47) we have

$$\begin{array}{*{20}l} {}&Z_{\perp,s+1}=\Big[(\Phi(s,n_{k})-\frac{1}{N}\mathbf{1}\mathbf{1}^{T})\otimes\mathbf{I}_{l}\Big](Z_{n_{k}}-\Theta_{n_{k}})\\ {}&+\sum_{m=n_{k}}^{s}\Big[(\Phi(s,m)-\frac{1}{N}\mathbf{1}\mathbf{1}^{T})\otimes\mathbf{I}_{l}\Big]D_{m}(Z_{m})\\ {}&+\sum_{m=n_{k}}^{s}a_{m}\Big[(\Phi(s,m+1)-\frac{1}{N}\mathbf{1}\mathbf{1}^{T})\otimes\mathbf{I}_{l}\Big]F_{m+1}\big(G_{m}(Z_{m})\big)\\ {}&+\sum_{m=n_{k}}^{s}a_{m}\Big[(\Phi(s,m+1)-\frac{1}{N}\mathbf{1}\mathbf{1}^{T})\otimes\mathbf{I}_{l}\Big]\tilde{\epsilon}_{m+1}. \end{array} $$
(66)

From (9), (51), (58), (64), A3, and A6 we can derive

$$\begin{array}{*{20}l} {}&||Z_{\perp,s+1}||\le Cc\rho^{s+1-n_{k}}+\sum_{m=n_{k}}^{s}a_{m}||Z_{m}-\Theta_{m}||c\rho^{s+1-m}\\ {}&+\sum_{m=n_{k}}^{s}a_{m}\alpha(2M^{\prime}_{0}+2C+3)c\rho^{s-m+2}\\ {}&+\Big|\Big|\sum_{m=n_{k}}^{s}a_{m}\Big[\big(\Phi(s,m+1)-\frac{1}{N}\mathbf{1}\mathbf{1}^{T}\big)\otimes\mathbf{I}_{l}\Big]\tilde{\epsilon}_{m+1}\Big|\Big|. \end{array} $$
(67)

Set \(\Gamma _{n}\triangleq \sum _{m=1}^{n}a_{m}\tilde {\epsilon }_{m+1}\). By (62) we know that \(||\Gamma _{s}-\Gamma _{n_{k}-1}||\le T_{k}\) for all \(s:n_{k}\le s\le s_{k}\). Notice

$$\begin{array}{*{20}l} &\sum_{m=n_{k}}^{s}a_{m}\big(\Phi(s,m+1)\otimes\mathbf{I}_{l}\big)\tilde{\epsilon}_{m+1}\\ &=\sum_{m=n_{k}}^{s}\big(\Phi(s,m+1)\otimes\mathbf{I}_{l}\big)(\Gamma_{m}-\Gamma_{m-1})\\ &=\sum_{m=n_{k}}^{s}\big(\Phi(s,m+1)\otimes\mathbf{I}_{l}\big)(\Gamma_{m}-\Gamma_{n_{k}-1})\\ &-\sum_{m=n_{k}}^{s}\big(\Phi(s,m+1)\otimes\mathbf{I}_{l}\big)(\Gamma_{m-1}-\Gamma_{n_{k}-1}). \end{array} $$

So, summing by parts with (9) we have

$$\begin{array}{*{20}l} &||\sum_{m=n_{k}}^{s}a_{m}(\Phi(s,m+1)\otimes\mathbf{I}_{l})\tilde{\epsilon}_{m+1}||\\ &\le||\Gamma_{s}-\Gamma_{n_{k}-1}||\\ &+\sum_{m=n_{k}}^{s-1}||\Phi(s,m+1)-\Phi(s,m+2)||\cdot||\Gamma_{m}-\Gamma_{n_{k}-1}||\\ &\le T_{k}+\sum_{m=n_{k}}^{s-1}(c\rho^{s-m}+c\rho^{s-m-1})\cdot T_{k}\\ &\le T_{k}+\frac{c(\rho+1)}{1-\rho}T_{k}, \end{array} $$

which, combined with (62), yields

$$\begin{array}{*{20}l} &\Big|\Big|\sum_{m=n_{k}}^{s}a_{m}\Big[\big(\Phi(s,m+1)-\frac{1}{N}\mathbf{1}\mathbf{1}^{T}\big)\otimes\mathbf{I}_{l}\Big]\tilde{\epsilon}_{m+1}\Big|\Big|\\ &\le \Big(2+\frac{c(\rho+1)}{1-\rho}\Big)T_{k}\quad\forall s:n_{k}\le s\le s_{k} \end{array} $$
(68)

for sufficiently large \(k\ge k_{1}\) and any Tk∈[0,T].
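
For the reader's convenience, the summation by parts used above is the standard discrete Abel transform: for matrices B<sub>m</sub> and vectors u<sub>m</sub>,

$$\begin{array}{*{20}l} \sum_{m=p}^{q}B_{m}(u_{m}-u_{m-1})=B_{q}u_{q}-B_{p}u_{p-1}-\sum_{m=p}^{q-1}(B_{m+1}-B_{m})u_{m}, \end{array} $$

applied here with \(B_{m}=\Phi(s,m+1)\otimes\mathbf{I}_{l}\) and \(u_{m}=\Gamma_{m}-\Gamma_{n_{k}-1}\); since \(u_{n_{k}-1}=0\), the boundary term \(B_{p}u_{p-1}\) vanishes and only \(\Gamma_{s}-\Gamma_{n_{k}-1}\) and the increments of Φ remain.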

Notice that \(\sum _{m=n_{k}}^{s}a_{m}\rho ^{s-m}\le \frac {1}{1-\rho }\sup _{m\ge n_{k}}a_{m}\), and that \(\sup _{m\ge n_{k}}a_{m}\xrightarrow [k\to \infty ]{}0\) since \(a_{m}\xrightarrow [m\to \infty ]{}0\). Combining this with (67) and (68), we have

$$\begin{array}{*{20}l} {}&||Z_{\perp,s+1}||\le Cc\rho+(M_{0}^{\prime}+C+1)C\frac{1}{1-\rho}\sup_{m\ge n_{k}}a_{m}\\ {}&+\alpha(2M_{0}^{\prime}+2C+3)C\frac{1}{1-\rho}\sup_{m\ge n_{k}}a_{m}\,+\,\Big(2+\frac{c(\rho+1)}{1-\rho}\Big)T_{k}\\ {}&\le Cc\rho+1+\Big(2+\frac{c(\rho+1)}{1-\rho}\Big)T_{k}\quad\forall s:n_{k}\le s\le s_{k} \end{array} $$
(69)

for sufficiently large \(k\ge k_{1}\) and any Tk∈[0,T]. Notice that \(\widehat {\Lambda }_{s}=Z_{\perp,s}+(\mathbf {1}\otimes \mathbf {I}_{l})\widehat {\Delta }_{s}\) and that \(||(\mathbf{1}\otimes\mathbf{I}_{l})v||=\sqrt{N}||v||\) for any \(v\in\mathbb{R}^{l}\). We derive

$$\begin{array}{*{20}l} &||\widehat{\Lambda}_{s+1}-\widehat{\Lambda}_{n_{k}}||=||(\mathbf{1}\otimes\mathbf{I}_{l})\widehat{\Delta}_{s+1}+Z_{\perp,s+1}-\\ &Z_{\perp,n_{k}}-(\mathbf{1}\otimes\mathbf{I}_{l})\widehat{\Delta}_{n_{k}}||\\ &\le||Z_{\perp,s+1}||+||Z_{\perp,n_{k}}||+\sqrt{N}||\widehat{\Delta}_{s+1}-\widehat{\Delta}_{n_{k}}||. \end{array} $$

Since \(||Z_{\perp,n_{k}}||\le 2||\widehat {\Lambda }_{n_{k}}||\le 2C\), from (65) and (69) it follows that for sufficiently large \(k\ge k_{1}\) and any Tk∈[0,T]

$$\begin{array}{*{20}l} &||\widehat{\Lambda}_{s+1}-\widehat{\Lambda}_{n_{k}}||\\ &\le Cc\rho+1+\Big(2+\frac{c(\rho+1)}{1-\rho}\Big)T_{k}+2C+\sqrt{N}c_{2}T_{k}\\ &\le C(c\rho+2)+1+(2+\frac{c(\rho+1)}{1-\rho}+c_{2}\sqrt{N})T_{k}\\ &=m^{\prime}_{0}+c_{1}T_{k}. \end{array} $$
(70)

Therefore, from (56) and (51) we know that for sufficiently large \(k\ge k_{1}\) and any Tk∈[0,T]

$$\begin{array}{*{20}l} ||\widehat{\Lambda}_{s_{k}+1}||\le||\widehat{\Lambda}_{n_{k}}||+m_{0}^{\prime}+c_{1}T_{k}\le M_{0}^{\prime}+1+C. \end{array} $$
(71)

Now we look back at the recursive algorithm (19) and rewrite it in compact form as follows:

$$\begin{array}{*{20}l} \widehat{X}_{s_{k}+1}&=[W(s_{k})\otimes\mathbf{I}_{l}]G_{s_{k}}(\widetilde{X}_{s_{k}})\\&\quad+a_{s_{k}}\Big(F_{s_{k}+1}\big(G_{s_{k}}(\widetilde{X}_{s_{k}})\big)+\tilde{\epsilon}_{s_{k}+1}\Big), \end{array} $$

where \(\widehat {X}_{k}\triangleq \text {col}\{\hat {x}_{1,k},\dots,\hat {x}_{N,k}\}\). Then by (63) and (64), \(\widehat {X}_{s_{k}+1}=Z_{s_{k}+1}\). So by (71) it follows that

$$\begin{array}{*{20}l} ||\widehat{X}_{s_{k}+1}-\Theta_{s_{k}+1}||\le M^{\prime}_{0}+1+C. \end{array} $$
(72)

We now show

$$\begin{array}{*{20}l} \tilde{X}_{s_{k}+1}=\widehat{X}_{s_{k}+1},~s_{k}+1<\tau_{\sigma_{k}+1} \end{array} $$
(73)

for sufficiently large \(k\ge k_{1}\) and any Tk∈[0,T]. We consider the following two cases: \({\lim }_{k\to \infty }\sigma _{k}=\infty \) and \({\lim }_{k\to \infty }\sigma _{k}=\sigma <\infty \).

i) \({\lim }_{k\to \infty }\sigma _{k}=\infty \): Notice \(M_{\sigma _{k}}>\eta +M_{0}^{\prime }+1+C\) when \(k\ge k_{1}\). By (20) and (21) we know that \(\tilde {X}_{s_{k}+1}=\widehat {X}_{s_{k}+1}\) and \(\sigma _{s_{k}+1}=\sigma _{s_{k}}\). So \(s_{k}+1<\tau _{\sigma _{n_{k}}+1}\) by (60).

ii) \({\lim }_{k\to \infty }\sigma _{k}=\sigma <\infty \): For this case \(\tau _{\sigma _{n_{k}}+1}=\infty \) for all \(k\ge k_{1}\). By (60) we see \(s_{k}+1<\tau _{\sigma _{n_{k}}+1}\). Then by \(\sigma _{n_{k}}=\sigma \) we conclude \(\sigma _{s_{k}+1}=\sigma _{s_{k}}=\sigma \), and hence by (20) we derive \(\tilde {X}_{s_{k}+1}=\widehat {X}_{s_{k}+1}\). Thus (73) holds.

From (73) we know that (48) holds for m=sk for sufficiently large \(k\ge k_{1}\) and any Tk∈[0,T]. From \(\widehat {X}_{s_{k}+1}=Z_{s_{k}+1}\) and (73) we see \(\tilde {X}_{s_{k}+1}=Z_{s_{k}+1}\). It follows that for sufficiently large \(k\ge k_{1}\) and any Tk∈[0,T]

$$\begin{array}{*{20}l} ||\widetilde{\Lambda}_{s_{k}+1}-\widetilde{\Lambda}_{n_{k}}||\le M_{0}^{\prime}+c_{1}T_{k}, \end{array} $$

which contradicts the definition of sk. Thus (59) does not hold, so sk>m(nk,Tk) and hence (49) holds.

Since sk>m(nk,Tk), we know \(\{\widetilde {\Lambda }_{s}:n_{k}\le s\le m(n_{k},T_{k})+1\}\) is bounded. Similarly to the proof of (60), it can be shown that \(m(n_{k},T_{k})<\tau _{\sigma _{k}+1}\). So (48) holds for \(m=n_{k},\dots,m(n_{k},T_{k})\). Similarly to (65) we can prove

$$\begin{array}{*{20}l} ||\widetilde{\Delta}_{s+1}-\widetilde{\Delta}_{n_{k}}||\le c_{2}T \end{array} $$

for sufficiently large k and any Tk∈[0,T]. Hence, (50) holds.

In conclusion, the proof of Lemma 6 is complete. □

By multiplying both sides of (48) with \(\frac {1}{N}(\mathbf {1}^{T}\otimes \mathbf {I}_{l})\), we have

$$\begin{array}{*{20}l} &\tilde{x}_{m+1}=\frac{1}{N}\sum_{i=1}^{N}g_{m}(\tilde{x}_{i,m})+a_{m}\frac{1}{N}\sum_{i=1}^{N}f_{i,m+1}\big(g_{m}(\tilde{x}_{i,m})\big)\\ &+a_{m}\frac{1}{N}\sum_{i=1}^{N}\tilde{\epsilon}_{i,m+1}\\ &=g_{m}(\tilde{x}_{m})+a_{m}f_{m+1}\big(g_{m}(\tilde{x}_{m})\big)\\ &+a_{m}\frac{1}{N}\frac{1}{a_{m}}\sum_{i=1}^{N}\Big(g_{m}(\tilde{x}_{i,m})-g_{m}(\tilde{x}_{m})\Big)\\ &+a_{m}\frac{1}{N}\sum_{i=1}^{N}\Big(f_{i,m+1}\big(g_{m}(\tilde{x}_{i,m})\big)-f_{i,m+1}\big(g_{m}(\tilde{x}_{m})\big)\Big)\\ &+a_{m}\frac{1}{N}\sum_{i=1}^{N}\tilde{\epsilon}_{i,m+1}. \end{array} $$
(74)

Setting

$$\begin{array}{*{20}l} \zeta^{(1)}_{k+1}=\frac{1}{N}\frac{1}{a_{k}}\sum_{i=1}^{N}\Big(g_{k}(\tilde{x}_{i,k})-g_{k}(\tilde{x}_{k})\Big),\\ \zeta^{(2)}_{k+1}=\frac{1}{N}\sum_{i=1}^{N}\Big(f_{i,k+1}\big(g_{k}(\tilde{x}_{i,k})\big)-f_{i,k+1}\big(g_{k}(\tilde{x}_{k})\big)\Big),\\ \zeta^{(3)}_{k+1}=\frac{1}{N}\sum_{i=1}^{N}\tilde{\epsilon}_{i,k+1},\\ \zeta_{k+1}=\zeta^{(1)}_{k+1}+\zeta^{(2)}_{k+1}+\zeta^{(3)}_{k+1} \end{array} $$

we can rewrite (74) as

$$\begin{array}{*{20}l} \tilde{x}_{m+1}=g_{m}(\tilde{x}_{m})+a_{m}f_{m+1}\big(g_{m}(\tilde{x}_{m})\big)+a_{m}\zeta_{m+1}. \end{array} $$
(75)

The following lemma gives the noise property of the sequence {ζk+1}.

Lemma 7

Assume that all the conditions in Lemma 6 hold and that \(\{\widetilde {\Lambda }_{n_{k}}\}\) is a convergent subsequence with limit \(\widetilde {\Lambda }\) at the considered sample path ω. Then for this ω

$$\begin{array}{*{20}l} {\lim}_{T\to0}\limsup_{k\to\infty}\frac{1}{T}||\sum_{s=n_{k}}^{m(n_{k},T_{k})}a_{s}\zeta_{s+1}||=0\quad T_{k}\in[0,T]. \end{array} $$
(76)

Proof

In the proof of Lemma 6 it has been pointed out that there exists a T∈(0,1) such that \(m(n_{k},T)<\tau _{\sigma _{n_{k}}+1}\) for sufficiently large k. So \(||\sum _{s=n_{k}}^{m(n_{k},T_{k})\land (\tau _{\sigma _{n_{k}}+1}-1)}a_{s}\tilde {\epsilon }_{s+1}||=||\sum _{s=n_{k}}^{m(n_{k},T_{k})}a_{s}\tilde {\epsilon }_{s+1}||\). Thus, by Lemma 5 we can immediately derive that

$$\begin{array}{*{20}l} {\lim}_{T\to0}\limsup_{k\to\infty}\frac{1}{T}||\sum_{s=n_{k}}^{m(n_{k},T_{k})}a_{s}\zeta^{(3)}_{s+1}||=0\quad T_{k}\in[0,T]. \end{array} $$

Now we need to show that \(\zeta ^{(i)}_{k+1}\) also satisfies the property above for i=1,2. First we consider \(\zeta ^{(1)}_{k+1}\). We see that

$$\begin{array}{*{20}l} &||\sum_{s=n_{k}}^{m(n_{k},T_{k})}a_{s}\zeta^{(1)}_{s+1}||\\ &=||\sum_{s=n_{k}}^{m(n_{k},T_{k})}\frac{1}{N}\sum_{i=1}^{N}\big(g_{s}(\tilde{x}_{i,s})-g_{s}(\tilde{x}_{s})\big)||\\ &=||\frac{1}{N}\sum_{s=n_{k}}^{m(n_{k},T_{k})}\sum_{i=1}^{N}\big(g_{s}(\tilde{x}_{i,s})-g_{s}(\theta_{s})-g_{s}(\tilde{x}_{s})+g_{s}(\theta_{s})\big)||\\ &=\frac{1}{N}||\sum_{s=n_{k}}^{m(n_{k},T_{k})}\sum_{i=1}^{N}\big(d_{s}(\tilde{x}_{i,s})+\widetilde{\Delta}_{i,s}-d_{s}(\tilde{x}_{s})-\widetilde{\Delta}_{s}\big)||\\ &=\frac{1}{N}||\sum_{s=n_{k}}^{m(n_{k},T_{k})}\sum_{i=1}^{N}\big(d_{s}(\tilde{x}_{i,s})-d_{s}(\tilde{x}_{s})\big)||\\ &\le\frac{1}{N}\sum_{s=n_{k}}^{m(n_{k},T_{k})}\sum_{i=1}^{N}||d_{s}(\tilde{x}_{i,s})||+\sum_{s=n_{k}}^{m(n_{k},T_{k})}||d_{s}(\tilde{x}_{s})||\\ &\le\frac{1}{N}\sum_{s=n_{k}}^{m(n_{k},T_{k})}\sum_{i=1}^{N}\gamma_{s}||\widetilde{\Delta}_{i,s}||+\sum_{s=n_{k}}^{m(n_{k},T_{k})}\gamma_{s}||\widetilde{\Delta}_{s}||, \end{array} $$
(77)

where the last inequality comes from A6.

Since \({\lim }_{k\to \infty }\widetilde {\Lambda }_{n_{k}}=\widetilde {\Lambda }\), by setting \(\widetilde {\Delta }\triangleq \frac {\mathbf {1}^{T}\otimes \mathbf {I}_{l}}{N}\widetilde {\Lambda }\) we see that \({\lim }_{k\to \infty }\widetilde {\Delta }_{n_{k}}=\widetilde {\Delta }\). So by (49) and (50) we conclude that \(\{||\widetilde {\Delta }_{i,s}||,s:n_{k}\le s\le m(n_{k},T_{k})\}\) and \(\{||\widetilde {\Delta }_{s}||,s:n_{k}\le s\le m(n_{k},T_{k})\}\) are bounded. Without loss of generality we denote the common bound of these two sequences by A. Then from (77) with A6 it follows that

$$\begin{array}{*{20}l} ||\sum_{s=n_{k}}^{m(n_{k},T_{k})}a_{s}\zeta^{(1)}_{s+1}||\le 2A\sum_{s=n_{k}}^{m(n_{k},T_{k})}\gamma_{s}\xrightarrow[k\to\infty]{}0. \end{array} $$

So we conclude

$$\begin{array}{*{20}l} {\lim}_{T\to0}\limsup_{k\to\infty}\frac{1}{T}||\sum_{s=n_{k}}^{m(n_{k},T_{k})}a_{s}\zeta^{(1)}_{s+1}||=0\quad T_{k}\in[0,T]. \end{array} $$

Finally, we consider the case i=2. Notice

$$\begin{array}{*{20}l} &||\sum_{s=n_{k}}^{m(n_{k},T_{k})}a_{s}\zeta^{(2)}_{s+1}||\\ &=\Big|\Big|\sum_{s=n_{k}}^{m(n_{k},T_{k})}\frac{1}{N}\sum_{i=1}^{N}a_{s}\Big(f_{i,s+1}\big(g_{s}(\tilde{x}_{i,s})\big)-f_{i,s+1}\big(g_{s}(\tilde{x}_{s})\big)\Big)\Big|\Big|. \end{array} $$

Similarly to the proof of Lemma 6, there exist constants c3,c4,c5>0 such that for sufficiently large k

$$\begin{array}{*{20}l} ||\tilde{X}_{\perp,s+1}||\le c_{3}\rho^{s+1-n_{k}}+c_{4}\sup_{m\ge n_{k}}a_{m}+c_{5}T \end{array} $$
(78)

holds for all \(s:n_{k}\le s\le m(n_{k},T)\). From A1, A6, and A7 we can assume that for sufficiently large k

$$\begin{array}{*{20}l} a_{k}<1,~\gamma_{k}\le a_{k},~ ||\xi_{k+1}||\le a_{k}. \end{array} $$
(79)

Since 0<ρ<1, there exists a positive integer m′ such that \(\rho ^{m^{\prime }}<T\). Then \(\sum _{m=n_{k}}^{n_{k}+m^{\prime }}a_{m}\xrightarrow [k\to \infty ]{}0\). Thus, we have nk+m′<m(nk,T) for sufficiently large k. So from (78) it follows that

$$\begin{array}{*{20}l} &||\tilde{X}_{\perp,s+1}||\le \text{o}(1)+(c_{3}+c_{5})T\\ &\forall s:n_{k}+m^{\prime}\le s\le m(n_{k},T). \end{array} $$
(80)

where o(1)→0 as k. Hence for nk+m<m(nk,T) and sufficiently large k

$$\begin{array}{*{20}l} ||\tilde{x}_{i,s}-\tilde{x}_{s}||\le\text{o}(1)+\delta(T) \end{array} $$

where δ(T)→0 as T→0.

So

$$\begin{array}{*{20}l} &||\sum_{s=n_{k}}^{m(n_{k},T_{k})}a_{s}\zeta^{(2)}_{s+1}||\\ &\le\Big|\Big|\sum_{s=n_{k}}^{n_{k}+m^{\prime}}\frac{1}{N}\sum_{i=1}^{N}a_{s}\Big(f_{i,s+1}\big(g_{s}(\tilde{x}_{i,s})\big)-f_{i,s+1}\big(g_{s}(\tilde{x}_{s})\big)\Big)\Big|\Big|\\ &+\Big|\Big|\sum_{s=n_{k}+m^{\prime}}^{m(n_{k},T)}\frac{1}{N}\sum_{i=1}^{N}a_{s}\Big(f_{i,s+1}\big(g_{s}(\tilde{x}_{i,s})\big)-f_{i,s+1}\big(g_{s}(\tilde{x}_{s})\big)\Big)\Big|\Big|\\ &\le\sum_{s=n_{k}}^{n_{k}+m^{\prime}}\frac{1}{N}\sum_{i=1}^{N}a_{s}\Big(\Big|\Big|f_{i,s+1}\big(g_{s}(\tilde{x}_{i,s})\big)\Big|\Big|+\Big|\Big|f_{i,s+1}\big(g_{s}(\tilde{x}_{s})\big)\Big|\Big|\Big)\\ &+\sum_{s=n_{k}+m^{\prime}}^{m(n_{k},T)}a_{s}\big(\text{o}(1)+\delta(T)\big)\\ &\le\sum_{s=n_{k}}^{n_{k}+m^{\prime}}a_{s}\alpha(2A+1)+T\big(\text{o}(1)+\delta(T)\big)\\ &\le\alpha(2A+1)\cdot m^{\prime}\sup_{m\ge n_{k}}a_{m}+T\big(\text{o}(1)+\delta(T)\big). \end{array} $$

And hence \(\limsup _{k\to \infty }\frac {1}{T}||\sum _{s=n_{k}}^{m(n_{k},T_{k})}a_{s}\zeta ^{(2)}_{s+1}||\le \delta (T)\), which implies that

$$\begin{array}{*{20}l} {\lim}_{T\to0}\limsup_{k\to\infty}\frac{1}{T}||\sum_{s=n_{k}}^{m(n_{k},T_{k})}a_{s}\zeta^{(2)}_{s+1}||=0\quad T_{k}\in[0,T]. \end{array} $$

This completes the proof. □

Lemma 8

Assume A1–A4 and A6 hold, and that A5 and A7 hold at the sample path ω under consideration. Then any nonempty interval [δ1,δ2] with 0∉[δ1,δ2] cannot be crossed by \(\{v(\widetilde {\Delta }_{m_{k}}),\dots,v(\widetilde {\Delta }_{l_{k}})\}\) infinitely many times with \(\{\widetilde {\Lambda }_{m_{k}}\}\) bounded. By "[δ1,δ2] being crossed by \(\{v(\widetilde {\Delta }_{m_{k}}),\dots,v(\widetilde {\Delta }_{l_{k}})\}\)" we mean that \(v(\widetilde {\Delta }_{m_{k}})\le \delta _{1}\), \(v(\widetilde {\Delta }_{l_{k}})\ge \delta _{2}\), and \(\delta _{1}<v(\widetilde {\Delta }_{s})<\delta _{2}\) for all \(s:m_{k}<s<l_{k}\).

Proof

Assume the converse: for some interval [δ1,δ2] with 0∉[δ1,δ2], there are infinitely many crossings \(\{v(\widetilde {\Delta }_{m_{k}}),\dots,v(\widetilde {\Delta }_{l_{k}})\}\) with \(\{\widetilde {\Lambda }_{m_{k}}\}\) bounded.

By the boundedness of \(\{\widetilde {\Lambda }_{m_{k}}\}\) we can extract a convergent subsequence, still denoted by \(\{\widetilde {\Lambda }_{m_{k}}\}\), with limit \({\lim }_{k\to \infty }\widetilde {\Lambda }_{m_{k}}=\widetilde {\Lambda }\). So \({\lim }_{k\to \infty }\widetilde {\Delta }_{m_{k}}=\widetilde {\Delta }\) with \(\widetilde {\Delta }\triangleq \frac {\mathbf {1}^{T}\otimes \mathbf {I}_{l}}{N}\widetilde {\Lambda }\). By (50), letting T→0 we have \(||\widetilde {\Delta }_{m_{k}+1}-\widetilde {\Delta }_{m_{k}}||\le c_{2}T\to 0\). Since by the definition of crossing \(v(\widetilde {\Delta }_{m_{k}})\le \delta _{1}<v(\widetilde {\Delta }_{m_{k}+1})\), we obtain

$$\begin{array}{*{20}l} v(\widetilde{\Delta}_{m_{k}})\xrightarrow[k\to\infty,T\to0]{}\delta_{1}=v(\widetilde{\Delta})>0. \end{array} $$
(81)

So by the assumption v(x)=0⇔x=0 we know there exists a constant β>0 such that \(||\widetilde {\Delta }||>\beta \). Hence by (50) we conclude

$$\begin{array}{*{20}l} ||\widetilde{\Delta}_{j}||>\frac{\beta}{2},\quad j:m_{k}\le j\le m(m_{k},T)+1 \end{array} $$
(82)

for sufficiently small T>0 and large k.

Let \(\widetilde {\Delta }_{k}\) be a vector in between \(\widetilde {\Delta }_{m_{k}}\) and \(\widetilde {\Delta }_{m(m_{k},T)}\). From (50) it follows that \(||\widetilde {\Delta }_{k}||\le c_{2}T+||\widetilde {\Delta }||+1\) for sufficiently large k. We consider the following Taylor expansion:

$$\begin{array}{*{20}l} {}&v(\widetilde{\Delta}_{m(m_{k},T)})-v(\widetilde{\Delta}_{m_{k}})\\ {}&=v_{x}^{T}(\widetilde{\Delta}_{k})\sum_{j=m_{k}}^{m(m_{k},T)-1}\Big(d_{j}(\tilde{x}_{j})-\xi_{j+1}+a_{j}\big(f_{j+1}(g_{j}(\tilde{x}_{j}))+\zeta_{j+1}\big)\Big)\\ {}&=v_{x}^{T}(\widetilde{\Delta}_{k})\Big\{\sum_{j=m_{k}}^{m(m_{k},T)-1}\big(d_{j}(\tilde{x}_{j})-\xi_{j+1}\big)+\sum_{j=m_{k}}^{m(m_{k},T)-1}a_{j}\zeta_{j+1}\Big\}\\ {}&+\sum_{j=m_{k}}^{m(m_{k},T)-1}a_{j}\big(v_{x}^{T}(\widetilde{\Delta}_{k})-v_{x}^{T}(g_{j}(\tilde{x}_{j})-\theta_{j+1})\big)f_{j+1}\big(g_{j}(\tilde{x}_{j})\big)\\ {}&+\sum_{j=m_{k}}^{m(m_{k},T)-1}a_{j}v_{x}^{T}(g_{j}(\tilde{x}_{j})-\theta_{j+1})f_{j+1}\big(g_{j}(\tilde{x}_{j})\big). \end{array} $$
(83)

Similarly to (51), for sufficiently large k, by A3, A6, and Lemma 7 there exists a constant c′>0 such that

$$\begin{array}{*{20}l} &\Big|\Big|a_{j}\Big(f_{j+1}\big(g_{j}(\tilde{x}_{j})\big)+\zeta_{j+1}\Big)\Big|\Big|\\ &\le\Big|\Big|a_{j}f_{j+1}\big(\theta_{j+1}+d_{j}(\tilde{x}_{j})-\xi_{j+1}+\widetilde{\Delta}_{j}\big)\Big|\Big|\\ &+\Big|\Big|\sum_{l=m_{k}}^{j}a_{l}\zeta_{l+1}-\sum_{l=m_{k}}^{j-1}a_{l}\zeta_{l+1}\Big|\Big|\\ &\le a_{j}\alpha\Big(2||\widetilde{\Delta}||+1\Big)+2c^{\prime}T<\frac{\beta}{4} \end{array} $$
(84)

for sufficiently large k and sufficiently small T, where j:mkjm(mk,T)−1. It follows that

$$\begin{array}{*{20}l} {}||g_{j}(\tilde{x}_{j})-\theta_{j+1}||&=||\widetilde{\Delta}_{j+1}+g_{j}(\tilde{x}_{j})-\tilde{x}_{j+1}||\\ &\ge\frac{\beta}{2}-\frac{\beta}{4}=\frac{\beta}{4}, \end{array} $$
(85)
$$\begin{array}{*{20}l} {}||g_{j}(\tilde{x}_{j})-\theta_{j+1}||&=||\tilde{x}_{j+1}-\theta_{j+1}-a_{j}\{f_{j+1}(g_{j}(\tilde{x}_{j}))+\zeta_{j+1}\}||\\ &<\frac{\beta}{4}+||\widetilde{\Delta}_{j+1}||\le\frac{\beta}{4}+c_{2}T+||\widetilde{\Delta}||+1. \end{array} $$
(86)

Identifying r1 and r2 in A2 with \(\frac {\beta }{4}\) and \(\frac {\beta }{4}+c_{2}T+||\widetilde {\Delta }||+1\), respectively, we can find a>0 such that for all \(j:m_{k}\le j\le m(m_{k},T)\)

$$\begin{array}{*{20}l} v_{x}^{T}(g_{j}(\tilde{x}_{j})-\theta_{j+1})f_{j+1}(g_{j}(\tilde{x}_{j}))<-a. \end{array} $$
(87)

Noticing that for all \(j:m_{k}\le j\le m(m_{k},T)\), \(||d_{j}(\tilde {x}_{j})||\le \gamma _{j}||\widetilde {\Delta }_{j}||\le \gamma _{j}(c_{2}T+||\widetilde {\Delta }||+1)\), by A6 and A7 we have

$$\begin{array}{*{20}l} {\lim}_{k\to\infty}\sum_{j=m_{k}}^{m(m_{k},T)-1}(d_{j}(\tilde{x}_{j})-\xi_{j+1})=0. \end{array} $$
(88)

By Lemma 7 it follows

$$\begin{array}{*{20}l} \limsup_{k\to\infty}||\sum_{j=m_{k}}^{m(m_{k},T)-1}a_{j}\zeta_{j+1}||\le\delta(T). \end{array} $$
(89)

Notice that for all \(j:m_{k}\le j\le m(m_{k},T)\)

$$\begin{array}{*{20}l} {}&||\widetilde{\Delta}_{k}-(g_{j}(\tilde{x}_{j})-\theta_{j+1})||\\ {}&\le||\widetilde{\Delta}_{k}-\widetilde{\Delta}_{m_{k}}||+||\widetilde{\Delta}_{j}-\widetilde{\Delta}_{m_{k}}||+||g_{j}(\tilde{x}_{j})-\theta_{j+1}-\widetilde{\Delta}_{j}||\\ {}&\le 2c_{1}T+||d_{j}(\tilde{x}_{j})-\xi_{j+1}||\\ {}&\le 2c_{1}T+\gamma_{j}(c_{2}T+||\widetilde{\Delta}||+1)+||\xi_{j+1}||\xrightarrow[k\to\infty,T\to 0]{}0. \end{array} $$
(90)

So by the continuity of vx(·) we know

$$\begin{array}{*{20}l} v_{x}^{T}\Big(\widetilde{\Delta}_{k}\Big)-v_{x}^{T}\Big(g_{j}(\tilde{x}_{j})-\theta_{j+1}\Big)\xrightarrow[k\to\infty,T\to 0]{}0. \end{array} $$
(91)

From A3, A6, and (51), as already used above, we have

$$\begin{array}{*{20}l} \Big|\Big|f_{j+1}\Big(g_{j}(\tilde{x}_{j})\Big)\Big|\Big|\le\alpha\Big(2||\widetilde{\Delta}||+1\Big). \end{array} $$
(92)

So, from (88), (89), and (92) we can conclude that the first and second terms of (83) are o(T) as k→∞ and T→0. Combining this with (87), it follows from (83) that for sufficiently large k and sufficiently small T

$$\begin{array}{*{20}l} v(\widetilde{\Delta}_{m(m_{k},T)})-v(\widetilde{\Delta}_{m_{k}})\le-\frac{a}{2}T. \end{array} $$
(93)

Letting k→∞, we have

$$\begin{array}{*{20}l} \limsup_{k\to\infty}v(\widetilde{\Delta}_{m(m_{k},T)})\le\delta_{1}-\frac{a}{2}T. \end{array} $$
(94)

Notice that by Lemma 6 we have

$$\begin{array}{*{20}l} {\lim}_{T\to0}\max_{m_{k}\le m\le m(m_{k},T)}\big|v(\widetilde{\Delta}_{m})-v(\widetilde{\Delta}_{m_{k}})\big|=0, \end{array} $$

which implies that m(mk,T)<lk for sufficiently small T. Therefore, \(v(\widetilde {\Delta }_{m(m_{k},T)})\in [\delta _{1},\delta _{2}]\), which contradicts (94). So the converse assumption is not true, and the proof is completed. □

Lemma 9

Assume all the assumptions required by Lemma 8 hold. Then there exists a positive integer σ such that

$$\begin{array}{*{20}l} {\lim}_{k\to\infty}\sigma_{k}=\sigma<\infty. \end{array} $$
(95)

Proof

Assume the converse:

$$\begin{array}{*{20}l} {\lim}_{k\to\infty}\sigma_{k}=\infty. \end{array} $$
(96)

Then there exists a sequence of integers {nk}k≥0 such that \(\sigma _{n_{k}}=k\) and \(\sigma _{n_{k}-1}=k-1\). By the algorithm (3)–(7) we know that \(\tilde {x}_{i,n_{k}}=h_{n_{k}-1}(x^{*})~\forall i\in \mathcal {V}\). Therefore, from Lemma 1 we know that \(\{\widetilde {\Lambda }_{n_{k}}\}\) is a bounded sequence and hence contains a convergent subsequence. For the sake of convenience, we denote the convergent subsequence still by \(\{\widetilde {\Lambda }_{n_{k}}\}\) with limit \(\widetilde {\Lambda }\).

Since {Mk}k≥0 is a sequence of positive numbers increasingly diverging to infinity, there exists a positive integer k0 such that

$$\begin{array}{*{20}l} M_{k}\ge 2\sqrt{N}r+2+M_{1}^{\prime}~~\forall k\ge k_{0} \end{array} $$
(97)

where r is given in A2 and

$$\begin{array}{*{20}l} M_{1}^{\prime}=2+(2\sqrt{N}r+2)(c\rho+2). \end{array} $$
(98)

Now we show that, under the converse assumption, \(\{\tilde {\Delta }_{n_{k}}\}\) exits the ball with radius r infinitely many times. Define

$$\begin{array}{*{20}l} &m_{k}\triangleq\inf\{s>n_{k}:\|\widetilde{\Lambda}_{s}\|\ge 2\sqrt{N}r+2+M_{1}^{\prime}\}, \end{array} $$
(99)
$$\begin{array}{*{20}l} &l_{k}\triangleq\sup\{s<m_{k}:\|\widetilde{\Lambda}_{s}\|\le 2\sqrt{N}r+2\}. \end{array} $$
(100)

Noticing that \(\|\widetilde {\Lambda }_{n_{k}}\|\le \sqrt {N}\eta \) by Lemma 1 and that r>η from A2, we derive \(\|\widetilde {\Lambda }_{n_{k}}\|<\sqrt {N}r\). Hence from (99) and (100) we have nk<lk<mk. By the definition of lk we know that \(\{\widetilde {\Lambda }_{l_{k}}\}\) is bounded, so there exists a convergent subsequence denoted still by \(\{\widetilde {\Lambda }_{l_{k}}\}\).

By Lemma 6 there exist constants \(M_{0}^{\prime }>0\) defined by (53) with \(C=2\sqrt {N}r+2, c_{1}>0\) defined by (54), c2>0 defined by (55), and 0<T<1 with c1T<1 such that

$$\begin{array}{*{20}l} \|\widetilde{\Lambda}_{m+1}-\widetilde{\Lambda}_{l_{k}}\|\le c_{1}T+M_{0}^{\prime}~~\forall m:l_{k}\le m\le m(l_{k},T) \end{array} $$

for sufficiently large \(k\ge k_{0}\). Then for sufficiently large \(k\ge k_{0}\) we have

$$\begin{array}{*{20}l} \|\widetilde{\Lambda}_{m+1}\|\le&\|\widetilde{\Lambda}_{l_{k}}\|+c_{1}T+M_{0}^{\prime}\\ \le&2\sqrt{N}r+2+1+1+(2\sqrt{N}r+2)(c\rho+2)\\ =&2\sqrt{N}r+2+M_{1}^{\prime}~~\forall m:l_{k}\le m\le m(l_{k},T). \end{array} $$
(101)

Then \(m(l_{k},T)<n_{k+1}\) for sufficiently large \(k\ge k_{0}\) by (97) and the definition of nk.

From (101) and the definition (99) of mk, we conclude \(m(l_{k},T)+1<m_{k}\) for sufficiently large \(k\ge k_{0}\). Then by (99) and (100) we know that for sufficiently large \(k\ge k_{0}\)

$$\begin{array}{*{20}l} 2\sqrt{N}r+2<\|\widetilde{\Lambda}_{m+1}\|\le2\sqrt{N}r+2+M_{1}^{\prime} \end{array} $$
(102)

holds for m:lkmm(lk,T).

Since 0<ρ<1, there exists a positive integer m0 such that \(4c\rho ^{m_{0}}<1\). Then \(\sum _{m=l_{k}}^{l_{k}+m_{0}}a_{m}\xrightarrow [k\to \infty ]{}0\) by A1, and hence \(l_{k}+m_{0}<m(l_{k},T)<n_{k+1}\) for sufficiently large \(k\ge k_{0}\). So, from (102) it can be seen that for sufficiently large \(k\ge k_{0}\) we have

$$\begin{array}{*{20}l} \|\widetilde{\Lambda}_{l_{k}+m_{0}}\|>2\sqrt{N}r+2 \end{array} $$
(103)

Noticing that \(\{\widetilde {\Lambda }_{m+1}:l_{k}\le m\le m(l_{k},T)\}\) is bounded, similarly to (67) and (69) we know that for sufficiently large \(k\ge k_{0}\)

$$\begin{array}{*{20}l} &\|\tilde{X}_{\bot,m+1}\|\le(2\sqrt{N}r+2)c\rho^{m+1-l_{k}}\\&\quad+(M_{0}^{\prime}+C+1)C\frac{1}{1-\rho}\sup_{m\ge n_{k}}a_{m}\\ &+\alpha(2M_{0}^{\prime}+2C+3)C\frac{1}{1-\rho}\sup_{m\ge n_{k}}a_{m}\\&\quad+\Big(2+\frac{c(\rho+1)}{1-\rho}\Big)T. \end{array} $$

Since \(4c\rho ^{m_{0}}<1\), \(c_{1}T<1\), and \(a_{k}\xrightarrow [k\to \infty ]{}0\), it follows that

$$\begin{array}{*{20}l} &\|\tilde{X}_{\bot,l_{k}+m_{0}}\|\le\frac{1}{2}(\sqrt{N}r+1)+\frac{1}{2}+1=\frac{\sqrt{N}r}{2}+2 \end{array} $$
(104)

for sufficiently large \(k\ge k_{0}\). By noticing \((\mathbf {1}\otimes \mathbf {I}_{l})\tilde {\Delta }_{l_{k}+m_{0}}=\widetilde {\Lambda }_{l_{k}+m_{0}}-\tilde {X}_{\bot,l_{k}+m_{0}}\), from (103) and (104) we conclude that

$$\begin{array}{*{20}l} &\sqrt{N}\|\tilde{\Delta}_{l_{k}+m_{0}}\|=\|\widetilde{\Lambda}_{l_{k}+m_{0}}-\tilde{X}_{\bot,l_{k}+m_{0}}\|\\ &\ge\|\widetilde{\Lambda}_{l_{k}+m_{0}}\|-\|\tilde{X}_{\bot,l_{k}+m_{0}}\|>\frac{3}{2}\sqrt{N}r. \end{array} $$
(105)

Therefore, \(\|\tilde {\Delta }_{l_{k}+m_{0}}\|>r\). This proves that \(\{\tilde {\Delta }_{n_{k}}\}\) exits the ball with radius r infinitely many times.

Since \(\{\widetilde {\Lambda }_{n_{k}}\}\) is convergent, we know that \(\{\widetilde {\Delta }_{n_{k}}\}\) is convergent, and from Lemma 1 it follows that \(||\widetilde {\Delta }_{n_{k}}||\le \eta \). Moreover, as shown above, \(\{\tilde {\Delta }_{n_{k}}\}\) exits the ball with radius r infinitely many times. Therefore, there exists an interval \([\delta _{1},\delta _{2}]\subset (\sup _{||y||\le \eta }v(y),\inf _{||x||=r}v(x))\) with 0∉[δ1,δ2] such that for any k there is a sequence \(\tilde {\Delta }_{s_{k}},\dots,\tilde {\Delta }_{t_{k}}\) with \(n_{k}\le s_{k}\), \(v(\tilde {\Delta }_{s_{k}})\le \delta _{1}\), \(\delta _{1}<v(\tilde {\Delta }_{j})<\delta _{2}\) for all \(j:s_{k}<j<t_{k}\), and \(v(\tilde {\Delta }_{t_{k}})\ge \delta _{2}\). In other words, the values of v(·) at the sequences \(\{\tilde {\Delta }_{s_{k}},\dots,\tilde {\Delta }_{t_{k}}\}\) cross the interval [δ1,δ2] infinitely many times with \(\{\widetilde {\Lambda }_{s_{k}}\}\) bounded. This contradicts Lemma 8, so the converse assumption is false and \({\lim }_{k\to \infty }\sigma _{k}<\infty \) indeed holds. □

Lemma 10

Assume all the assumptions required by Lemma 9 hold. Then

$$\begin{array}{*{20}l} {\lim}_{k\to\infty}\sigma_{i,k}=\sigma<\infty\quad\forall i\in\mathcal{V}. \end{array} $$
(106)

Proof

From Lemma 9 it follows that

$$\begin{array}{*{20}l} \sigma_{i,k}\le\sigma\quad\forall i\in\mathcal{V}. \end{array} $$

By Lemma 4 we know \(\tilde {\tau }_{i,\sigma }=\tau _{i,\sigma }\le BD+\tau _{\sigma }\). So, by definition we know that \(\sigma _{i,k}\ge \sigma \) for all \(k\ge BD+\tau _{\sigma }\).

In conclusion, we have \(\sigma _{i,k}=\sigma \quad \forall k\ge BD+\tau _{\sigma }\quad \forall i\in \mathcal {V}\). The proof is completed. □

By the definition of the auxiliary sequence \(\{\tilde {x}_{i,k}\}\), Lemma 10 indicates that \(\{\tilde {x}_{i,k}\}\) and {xi,k} coincide after a finite number of steps.

Proof of Theorem 1: By (95) and (106), there exists a positive integer σ depending on ω such that

$$\begin{array}{*{20}l} \hat{\sigma}_{i,k}=\sigma_{i,k}=\sigma\quad\forall k\ge k_{0}\triangleq BD+\tau_{\sigma}\quad\forall i\in\mathcal{V}, \end{array} $$
(107)

and hence by (3)

$$\begin{array}{*{20}l} {}x^{\prime}_{i,k+1}\,=\,\sum_{j\in N_{i}(k)}w_{ij}(k)g_{j}(x_{j,k})+a_{k}O_{i,k+1}\quad\forall k\ge k_{0}\quad\forall i\in\mathcal{V}, \end{array} $$
(108)

by (5) \(||x^{\prime }_{i,k+1}-h_{k}(x^{*})||\le M_{\sigma }\) and by (4) \(x_{i,k+1}=x^{\prime }_{i,k+1}\) for any kk0 and any \(i\in \mathcal {V}\). So, we have proved the assertion i).

Multiplying (11) by D⊥ from the left, we derive

$$\begin{array}{*{20}l} {}X_{\bot,k+1}&=D_{\bot}(W(k)\otimes\mathbf{I}_{l})G_{k}(X_{k})\\ &\quad+a_{k}D_{\bot}\Big(F_{k+1}\big(G_{k}(X_{k})\big)+\epsilon_{k+1}\Big)\\ &=D_{\bot}(W(k)\otimes\mathbf{I}_{l})D_{\bot}X_{k}\\ &\quad+D_{\bot}\big(W(k)\otimes\mathbf{I}_{l}\big)\big(G_{k}(X_{k})-X_{k}\big)\\ &\quad+a_{k}D_{\bot}\Big(F_{k+1}\big(G_{k}(X_{k})\big)+\epsilon_{k+1}\Big)\\ &=D_{\bot}(W(k)\otimes\mathbf{I}_{l})X_{\bot,k}\\ &\quad+D_{\bot}\big(W(k)\otimes\mathbf{I}_{l}\big)\big(G_{k}(X_{k})-G_{k}(\Theta_{k})-X_{k}\,+\,\Theta_{k}\big)\\ &\quad+a_{k}D_{\bot}\big(F_{k+1}(G_{k}(X_{k}))+\epsilon_{k+1}\big)\\ &=D_{\bot}(W(k)\otimes\mathbf{I}_{l})X_{\bot,k}\\ &\quad+D_{\bot}(W(k)\otimes\mathbf{I}_{l})D_{k}(X_{k})\\ &\quad+a_{k}D_{\bot}\big(F_{k+1}(G_{k}(X_{k}))+\epsilon_{k+1}\big), \end{array} $$

where the second equality comes from the fact that D⊥(W(k)⊗Il)=D⊥(W(k)⊗Il)D⊥ and the third from D⊥Gk(Θk)=D⊥Θk=0. So, for any k≥k0, by induction we have

$$\begin{array}{*{20}l} {}X_{\bot,k+1}&=\Psi(k,k_{0})(X_{k_{0}}-\Theta_{k_{0}})+\sum_{m=k_{0}}^{k}\Psi(k,m)D_{m}(X_{m})\\ {}&+\sum_{m=k_{0}}^{k}a_{m}\Psi(k,m\,+\,1)D_{\bot}\big(F_{m+1}(G_{m}(X_{m}))\,+\,\epsilon_{m+1}\big). \end{array} $$

From Lemma 10 we know the number of truncations is finite. So, \(\{x_{i,k}-h_{k-1}(x^{*})\}_{k\ge 1}\) is bounded. Furthermore, Lemma 1 shows that \(\{h_{k-1}(x^{*})-\theta _{k}\}_{k\ge 1}\) is bounded as well. Therefore, \(\{x_{i,k}-\theta _{k}\}_{k\ge 1}\) is bounded. By assumptions A6 and A7 we can take sufficiently large k1>k0 such that \(||\xi _{k}||\le 1\) and \(\gamma _{k}\le 1\) for all \(k\ge k_{1}\). So, for sufficiently large k, there exist constants \(c_{6},c_{7},c_{8},c^{\prime }_{8},c_{9}>0\) such that

$$\begin{array}{*{20}l} ||X_{\bot,k+1}||&\le c_{6}\rho^{k+1-k_{0}}+c_{7}\sum_{m=k_{0}}^{k}\gamma_{m}\rho^{k+1-m}\\ &+c_{8}\sum_{m=k_{0}}^{k}\alpha(2||\Lambda_{m}||+1)a_{m}\rho^{k-m+2}\\ &+c_{9}\sum_{m=k_{0}}^{k}a_{m}\rho^{k-m+2}||\epsilon_{m+1}||\\ &\le c_{6}\rho^{k+1-k_{0}}+c_{7}\sum_{m=k_{0}}^{k}\gamma_{m}\rho^{k+1-m}\\ &+c^{\prime}_{8}\sum_{m=k_{0}}^{k}a_{m}\rho^{k-m+2}\\&+c_{9}\sum_{m=k_{0}}^{k}a_{m}\rho^{k-m+2}||\epsilon_{m+1}||. \end{array} $$
(109)

Notice that for any ε>0, there exists an integer k2>k1 such that γk<ε for all k≥k2. We can derive

$$\begin{array}{*{20}l} {}\sum_{m=0}^{k}\gamma_{m}\rho^{k-m+1}&=\sum_{m=0}^{k_{2}}\gamma_{m}\rho^{k-m+1}+\sum_{m=k_{2}+1}^{k}\gamma_{m}\rho^{k-m+1}\\ &\le\rho^{k-k_{2}+1}\sum_{m=0}^{k_{2}}\gamma_{m}\,+\,\epsilon\frac{1}{1-\rho}\xrightarrow[k\to\infty,\epsilon\to0]{}0. \end{array} $$

Therefore, the second and third terms on the right-hand side of (109) tend to zero as k→∞. Similarly, the last term of (109) also tends to zero since \({\lim }_{k\to \infty }a_{k}\epsilon _{k+1}=0\). The first term of (109) tends to zero as k→∞ as well since 0<ρ<1. So, we conclude that

$$\begin{array}{*{20}l} X_{\bot,k}\xrightarrow[k\to\infty]{}0. \end{array} $$

We now show the convergence of \(\{v(\widetilde {\Delta }_{k})\}\). Since

$$\begin{array}{*{20}l} v_{1}\triangleq\liminf_{k\to\infty}v(\widetilde{\Delta}_{k})\le\limsup_{k\to\infty}v(\widetilde{\Delta}_{k})\triangleq v_{2}, \end{array} $$

we aim to prove v1=v2. Assume the converse: v1<v2. Then there exists an interval \([\delta _{1},\delta _{2}]\subset (v_{1},v_{2})\) with 0∉[δ1,δ2] such that \(v(\widetilde {\Delta }_{k})\) crosses the interval [δ1,δ2] infinitely many times. By Lemma 9 and (13)–(14) we know \(\widetilde {\Delta }_{k}\) is bounded, so \(\widetilde {\Lambda }_{k}\) is bounded as well. This contradicts Lemma 8. Therefore, \(\{v(\widetilde {\Delta }_{k})\}\) is convergent.

Finally, we show \(\widetilde {\Delta }_{k}\xrightarrow [k\to \infty ]{}0\). Assume the converse. Then there exists a convergent subsequence \(\{\widetilde {\Delta }_{n_{k}}\}\) with limit \(\widetilde {\Delta }\neq 0\). Take β>0 such that \(||\widetilde {\Delta }||>\beta \). From Lemma 6 we know for sufficiently large k and sufficiently small T we have

$$\begin{array}{*{20}l} ||\widetilde{\Delta}_{j}||>\frac{\beta}{2},\quad n_{k}\le j\le m(n_{k},T). \end{array} $$

Similar to the proof of Lemma 8, by Taylor’s expansion there exists a>0 such that

$$\begin{array}{*{20}l} v(\widetilde{\Delta}_{m(n_{k},T)})-v(\widetilde{\Delta}_{n_{k}})<-\frac{a}{2}T. \end{array} $$
(110)

Since \(\{v(\widetilde {\Delta }_{k})\}\) is convergent and by definition \(m(n_{k},T)\xrightarrow [k\to \infty ]{}\infty \), letting k→∞ on both sides of (110) we derive \(0\le -\frac {a}{2}T\), which is impossible. So \(\widetilde {\Delta }_{k}\xrightarrow [k\to \infty ]{}0\). By Lemma 10 we know that after a finite number of steps \(\widetilde {\Delta }_{k}=\Delta _{k}\), and hence \(\Delta _{k}\xrightarrow [k\to \infty ]{}0\). Combining \(\Delta _{k}\xrightarrow [k\to \infty ]{}0\) with \(X_{\bot,k}\xrightarrow [k\to \infty ]{}0\), we conclude \(\Lambda _{k}\xrightarrow [k\to \infty ]{}0\). □

6 Numerical simulation

In this section, we apply the distributed algorithm to a distributed tracking problem and demonstrate its performance. Consider a maneuvering target in the 2-D plane. The state of the target θk at each time k consists of four components \(\theta _{k}=[\theta _{k}^{1},\theta _{k}^{2},\theta _{k}^{3},\theta _{k}^{4}]^{T}\): the horizontal position, horizontal velocity, vertical position, and vertical velocity, respectively. The dynamic model of the target is chosen to be a nearly constant velocity model [29], which means that the motion of the target is governed by:

$$\begin{array}{*{20}l} \theta_{k+1}=A\theta_{k}+\xi_{k+1}, \end{array} $$
(111)

where \(\xi _{k+1}\in \mathbb {R}^{4}\) is noise, and A is defined as

$$\begin{array}{*{20}l} A=\mathbf{I}_{2}\otimes\left(\begin{array}{cc} 1 & T\\ 0 & 1\\ \end{array}\right) \end{array} $$
(112)

with T being the sampling interval. It can be seen that when ξk+1=0 the target follows a constant velocity movement. The goal of the network is to track this target by estimating the state θk.
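
For illustration, the target model (111)–(112) is straightforward to simulate; in the following sketch the initial state, horizon, and noise sequence are hypothetical placeholders, not values prescribed by the experiment.

```python
import numpy as np

def ncv_transition(T):
    """State-transition matrix A of (112): I_2 kron [[1, T], [0, 1]]."""
    return np.kron(np.eye(2), np.array([[1.0, T], [0.0, 1.0]]))

def simulate_target(theta0, T, num_steps, noise):
    """Iterate the target dynamics (111): theta_{k+1} = A theta_k + xi_{k+1}.

    `noise(k)` returns the 4-dimensional noise xi_k; with identically
    zero noise the target moves with constant velocity.
    """
    A = ncv_transition(T)
    traj = [np.asarray(theta0, dtype=float)]
    for k in range(num_steps):
        traj.append(A @ traj[-1] + noise(k + 1))
    return np.stack(traj)  # shape (num_steps + 1, 4)

# Noise-free example: exact constant-velocity motion.
traj = simulate_target([0.0, 1.0, 0.0, 0.5], T=0.1, num_steps=5,
                       noise=lambda k: np.zeros(4))
```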

Consider a sensor network \(\mathcal {G}=(\mathcal {V},\mathcal {E})\) with \(\mathcal {V}=\{1,\cdots,N\},~N=20,\) and \(\mathcal {E}=G(N,p_{N})\) being the Poisson random graph with design parameter 0≤pN≤1. We choose pN=0.25. Denote by Ni the neighbor set of agent i and by ni the cardinality of Ni. Set \(W(k)=[w_{ij}]_{i,j=1}^{N}~\forall k\ge 1\) with \(w_{ij}=\frac {1}{n_{i}}\) if agent j is in the set Ni. All agents aim to track the target state θk cooperatively. We assume that each agent can observe, with noise, only one component of the target state; a code sketch of this setup is given after (114). In mathematical terms, the local function for agent i is defined as:

$$\begin{array}{*{20}l} f_{i,k}(x)\triangleq e_{k_{i}}\theta_{k}-e_{k_{i}}x, \end{array} $$
(113)

where ek is a 4-dimensional square diagonal matrix whose kth diagonal element is 1 and all other elements are 0, i.e.,

$$\begin{array}{*{20}l} e_{1}\triangleq \text{diag}(1,0,0,0)\\ e_{2}\triangleq \text{diag}(0,1,0,0)\\ e_{3}\triangleq \text{diag}(0,0,1,0)\\ e_{4}\triangleq \text{diag}(0,0,0,1). \end{array} $$

The selection of ki will be explained later. Since the state θk is unknown to the agents, each agent can only get a noise-corrupted observation of this local function instead of the exact value. The global function can be written as:

$$\begin{array}{*{20}l} f_{k}(x)\triangleq\frac{1}{N}\sum_{i=1}^{N}\big(e_{k_{i}}\theta_{k}-e_{k_{i}}x\big). \end{array} $$
(114)

It can be seen that while each agent can only estimate one component of θk with its own local function, θk is the unique root of the global function fk(x).
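
A minimal sketch of this setup follows. The random seed, the self-loop convention for the neighbor sets \(N_{i}\), and the helper names are choices of this sketch rather than of the paper; the component-selection rule for \(k_{i}\) is the one described in the next paragraph.

```python
import numpy as np

rng = np.random.default_rng(0)
N, p_N = 20, 0.25

# Poisson (Erdos-Renyi-type) random graph G(N, p_N): each undirected
# edge appears independently with probability p_N; self-loops are added
# so that every agent also uses its own observation (an assumption of
# this sketch).
upper = np.triu(rng.random((N, N)) < p_N, k=1)
adj = upper | upper.T | np.eye(N, dtype=bool)

# Row-stochastic weights: w_ij = 1/n_i for j in N_i, 0 otherwise.
W = adj / adj.sum(axis=1, keepdims=True)

# Component observed by agent i (agents are 1-based):
# k_i = i mod 4, replaced by 4 when i mod 4 == 0.
k_sel = np.array([i % 4 if i % 4 != 0 else 4 for i in range(1, N + 1)])

def f_local(i, x, theta):
    """Local regression function (113): e_{k_i} theta - e_{k_i} x."""
    e = np.zeros((4, 4))
    e[k_sel[i] - 1, k_sel[i] - 1] = 1.0
    return e @ (theta - x)

def f_global(x, theta):
    """Global function (114): the average of the local functions."""
    return np.mean([f_local(i, x, theta) for i in range(N)], axis=0)

theta = np.array([1.0, 2.0, 3.0, 4.0])
assert np.allclose(f_global(theta, theta), 0.0)  # theta is the root
```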

For our experiment, we take \(\xi _{k}\triangleq \frac {1}{k^{2}}v_{k}\), where {vk} is a sequence of i.i.d. random variables uniformly distributed over [−1,1], and the step sizes \(a_{k}\triangleq \frac {20}{k}\). We let the sampling interval be \(T\triangleq 0.1\) s, the truncation bounds \(M_{k}\triangleq k+80\), and \(x^{*}\triangleq [1,1,1,1]^{T}\). The initial value xi,0 of each agent is drawn from the uniform distribution over [−2,2]. The observation noise εi,k is white Gaussian noise. As for the selection of ki: for agent i, if i mod 4≠0, then \(k_{i}\triangleq i\mod 4\); if i mod 4=0, then \(k_{i}\triangleq 4\).
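
To make the experiment concrete, a stripped-down tracking loop in the spirit of the compact form (108) might look as follows; it reuses `ncv_transition` and `f_local` from the sketches above. This is only a sketch under explicit assumptions: the known root dynamics is taken to be \(g_{k}(x)=Ax\), observations are evaluated at the propagated estimates, the noise level 0.1 is illustrative, and the expanding-truncation mechanism of (3)–(7) with bounds \(M_{k}=k+80\) is omitted.

```python
a = lambda k: 20.0 / k                # step sizes a_k = 20 / k
A = ncv_transition(T=0.1)             # assumed known dynamics g_k(x) = A x

X = rng.uniform(-2.0, 2.0, size=(N, 4))   # initial estimates x_{i,0}
theta = np.zeros(4)                       # initial target state

for k in range(1, 2001):
    # Target moves according to (111) with xi_k = v_k / k^2.
    theta = A @ theta + rng.uniform(-1.0, 1.0, 4) / k**2
    G = X @ A.T                       # each agent propagates: g_k(x_{i,k})
    # Noisy local observations taken at the propagated estimates.
    obs = np.stack([f_local(i, G[i], theta) for i in range(N)])
    obs += 0.1 * rng.standard_normal((N, 4))
    # Neighbor averaging of propagated estimates plus local correction.
    X = W @ G + a(k) * obs

print(np.abs(X.mean(axis=0) - theta).max())  # tracking error of the average
```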

Denote by \(\{x_{i,k}\}_{k\ge 1},i\in \mathcal {V}\) the estimates given by (3)–(7) and by \(x_{k}=\frac {1}{N}\sum _{i=1}^{N}x_{i,k}\) the average of the estimates \(x_{i,k},~i\in \mathcal {V}\). In Fig. 1, the dashed lines denote the state of the moving target and the solid lines the averaged estimates of the entries \(\{\theta _{k}^{j},~j=1,\cdots,4\}_{k\geq 1}\) of {θk}k≥1. From the figure we can see that the estimates track the moving target successfully.

Fig. 1 Average estimation sequences of \(\theta _{k}^{j},~j=1,\cdots,4\)

7 Conclusion

The distributed root-tracking problem for a sum of time-varying regression functions over a network is considered in this paper. It is assumed that noise-corrupted information on the dynamics of the roots is known to all agents in the network. Each agent updates its estimate by using its local observations, the dynamic information of the global root, and the information received from its neighbors. A distributed stochastic approximation algorithm is proposed, and the consensus and convergence of the estimates are established.

For future research, it is of interest to relax the conditions on the dynamic information of the global roots and to establish convergence of the algorithm over unbalanced networks.

8 Notations

Table 1